12/21/2021 14:34:56 - INFO - codeparrot_training - Distributed environment: TPU Num processes: 8 Process index: 0 Local process index: 0 Device: xla:1 Use FP16 precision: False
12/21/2021 14:34:56 - WARNING - huggingface_hub.repository - Revision `robust-sun-12` does not exist. Created and checked out branch `robust-sun-12`.
12/21/2021 14:34:56 - WARNING - huggingface_hub.repository -
12/21/2021 14:35:08 - WARNING - datasets.builder - Using custom data configuration lvwerra___codeparrot-clean-train-a1efdd1059bd841d
12/21/2021 14:35:09 - WARNING - datasets.builder - Using custom data configuration lvwerra___codeparrot-clean-valid-a800eb55c299abc0
12/21/2021 14:35:51 - INFO - codeparrot_training - Step 0: {'lr': 0.0, 'samples': 512, 'steps': 0, 'batch_loss/train': 1.0193529492244124}
12/21/2021 14:37:13 - INFO - codeparrot_training - Step 1: {'lr': 2.8571428571428573e-06, 'samples': 1024, 'steps': 1, 'batch_loss/train': 0.9981147311627865}
12/21/2021 14:38:39 - INFO - codeparrot_training - Step 2: {'lr': 5.7142857142857145e-06, 'samples': 1536, 'steps': 2, 'batch_loss/train': 0.955647824332118}
12/21/2021 14:38:52 - INFO - codeparrot_training - Step 3: {'lr': 8.571428571428573e-06, 'samples': 2048, 'steps': 3, 'batch_loss/train': 1.0883512431755662}
12/21/2021 14:39:03 - INFO - codeparrot_training - Step 4: {'lr': 1.1428571428571429e-05, 'samples': 2560, 'steps': 4, 'batch_loss/train': 0.8130962830036879}
12/21/2021 14:39:13 - INFO - codeparrot_training - Step 5: {'lr': 1.4285714285714285e-05, 'samples': 3072, 'steps': 5, 'batch_loss/train': 1.0086954981088638}
12/21/2021 14:39:25 - INFO - codeparrot_training - Step 6: {'lr': 1.7142857142857145e-05, 'samples': 3584, 'steps': 6, 'batch_loss/train': 1.0232350481674075}
12/21/2021 14:39:36 - INFO - codeparrot_training - Step 7: {'lr': 2e-05, 'samples': 4096, 'steps': 7, 'batch_loss/train': 0.9230576828122139}
12/21/2021 14:39:47 - INFO - codeparrot_training - Step 8: {'lr': 2.2857142857142858e-05, 'samples': 4608, 'steps': 8, 'batch_loss/train': 0.9473168570548296}
12/21/2021 14:40:00 - INFO - codeparrot_training - Step 9: {'lr': 2.5714285714285714e-05, 'samples': 5120, 'steps': 9, 'batch_loss/train': 0.9869899312034249}
12/21/2021 14:40:10 - INFO - codeparrot_training - Step 10: {'lr': 2.857142857142857e-05, 'samples': 5632, 'steps': 10, 'batch_loss/train': 0.8964740806259215}
12/21/2021 14:40:21 - INFO - codeparrot_training - Step 11: {'lr': 3.142857142857143e-05, 'samples': 6144, 'steps': 11, 'batch_loss/train': 1.0702612651512027}
12/21/2021 14:40:31 - INFO - codeparrot_training - Step 12: {'lr': 3.428571428571429e-05, 'samples': 6656, 'steps': 12, 'batch_loss/train': 1.0014494936913252}
12/21/2021 14:40:43 - INFO - codeparrot_training - Step 13: {'lr': 3.7142857142857143e-05, 'samples': 7168, 'steps': 13, 'batch_loss/train': 0.918277096003294}
12/21/2021 14:40:54 - INFO - codeparrot_training - Step 14: {'lr': 4e-05, 'samples': 7680, 'steps': 14, 'batch_loss/train': 1.0102925430983305}
12/21/2021 14:41:05 - INFO - codeparrot_training - Step 15: {'lr': 4.2857142857142856e-05, 'samples': 8192, 'steps': 15, 'batch_loss/train': 0.9099750043824315}
12/21/2021 14:41:17 - INFO - codeparrot_training - Step 16: {'lr': 4.5714285714285716e-05, 'samples': 8704, 'steps': 16, 'batch_loss/train': 1.0381581708788872}
12/21/2021 14:41:28 - INFO - codeparrot_training - Step 17: {'lr': 4.857142857142857e-05, 'samples': 9216, 'steps': 17, 'batch_loss/train': 0.912262661382556}
12/21/2021 14:41:38 - INFO - codeparrot_training - Step 18: {'lr': 5.142857142857143e-05, 'samples': 9728, 'steps': 18, 'batch_loss/train': 1.0805493760854006}
12/21/2021 14:41:51 - INFO - codeparrot_training - Step 19: {'lr': 5.428571428571429e-05, 'samples': 10240, 'steps': 19, 'batch_loss/train': 0.9622202254831791}
12/21/2021 14:42:02 - INFO - codeparrot_training - Step 20: {'lr': 5.714285714285714e-05, 'samples': 10752, 'steps': 20, 'batch_loss/train': 1.0766527820378542}
12/21/2021 14:42:12 - INFO - codeparrot_training - Step 21: {'lr': 6e-05, 'samples': 11264, 'steps': 21, 'batch_loss/train': 1.3213243894279003}
12/21/2021 14:42:25 - INFO - codeparrot_training - Step 22: {'lr': 6.285714285714286e-05, 'samples': 11776, 'steps': 22, 'batch_loss/train': 1.001508022658527}
12/21/2021 14:42:35 - INFO - codeparrot_training - Step 23: {'lr': 6.571428571428571e-05, 'samples': 12288, 'steps': 23, 'batch_loss/train': 1.0677247643470764}
12/21/2021 14:42:46 - INFO - codeparrot_training - Step 24: {'lr': 6.857142857142858e-05, 'samples': 12800, 'steps': 24, 'batch_loss/train': 0.9136228444986045}
12/21/2021 14:42:56 - INFO - codeparrot_training - Step 25: {'lr': 7.142857142857142e-05, 'samples': 13312, 'steps': 25, 'batch_loss/train': 1.1633764002472162}
12/21/2021 14:43:09 - INFO - codeparrot_training - Step 26: {'lr': 7.428571428571429e-05, 'samples': 13824, 'steps': 26, 'batch_loss/train': 0.9266835525631905}
12/21/2021 14:43:19 - INFO - codeparrot_training - Step 27: {'lr': 7.714285714285714e-05, 'samples': 14336, 'steps': 27, 'batch_loss/train': 0.9802188752219081}
12/21/2021 14:43:30 - INFO - codeparrot_training - Step 28: {'lr': 8e-05, 'samples': 14848, 'steps': 28, 'batch_loss/train': 1.007140651345253}
12/21/2021 14:43:42 - INFO - codeparrot_training - Step 29: {'lr': 8.285714285714286e-05, 'samples': 15360, 'steps': 29, 'batch_loss/train': 0.8645388996228576}
12/21/2021 14:43:53 - INFO - codeparrot_training - Step 30: {'lr': 8.571428571428571e-05, 'samples': 15872, 'steps': 30, 'batch_loss/train': 0.9906511381268501}
12/21/2021 14:44:04 - INFO - codeparrot_training - Step 31: {'lr': 8.857142857142857e-05, 'samples': 16384, 'steps': 31, 'batch_loss/train': 0.8584465784952044}
12/21/2021 14:44:15 - INFO - codeparrot_training - Step 32: {'lr': 9.142857142857143e-05, 'samples': 16896, 'steps': 32, 'batch_loss/train': 0.9680130295455456}
12/21/2021 14:44:26 - INFO - codeparrot_training - Step 33: {'lr': 9.42857142857143e-05, 'samples': 17408, 'steps': 33, 'batch_loss/train': 1.008946900255978}
12/21/2021 14:44:37 - INFO - codeparrot_training - Step 34: {'lr': 9.714285714285714e-05, 'samples': 17920, 'steps': 34, 'batch_loss/train': 1.020456189289689}
12/21/2021 14:44:47 - INFO - codeparrot_training - Step 35: {'lr': 0.0001, 'samples': 18432, 'steps': 35, 'batch_loss/train': 0.9887907225638628}
12/21/2021 14:44:59 - INFO - codeparrot_training - Step 36: {'lr': 0.00010285714285714286, 'samples': 18944, 'steps': 36, 'batch_loss/train': 1.0221164394170046}
12/21/2021 14:45:10 - INFO - codeparrot_training - Step 37: {'lr': 0.00010571428571428572, 'samples': 19456, 'steps': 37, 'batch_loss/train': 1.0078937038779259}
12/21/2021 14:45:20 - INFO - codeparrot_training - Step 38: {'lr': 0.00010857142857142858, 'samples': 19968, 'steps': 38, 'batch_loss/train': 0.9564523370936513}
12/21/2021 14:45:32 - INFO - codeparrot_training - Step 39: {'lr': 0.00011142857142857143, 'samples': 20480, 'steps': 39, 'batch_loss/train': 0.9587605223059654}
12/21/2021 14:45:43 - INFO - codeparrot_training - Step 40: {'lr': 0.00011428571428571428, 'samples': 20992, 'steps': 40, 'batch_loss/train': 0.963530576787889}
12/21/2021 14:45:53 - INFO - codeparrot_training - Step 41: {'lr': 0.00011714285714285715, 'samples': 21504, 'steps': 41, 'batch_loss/train': 0.9486037753522396}
12/21/2021 14:46:06 - INFO - codeparrot_training - Step 42: {'lr': 0.00012, 'samples': 22016, 'steps': 42, 'batch_loss/train': 1.0729197282344103}
12/21/2021 14:46:17 - INFO - codeparrot_training - Step 43: {'lr': 0.00012285714285714287, 'samples': 22528, 'steps': 43, 'batch_loss/train': 0.9991202894598246}
12/21/2021 14:46:27 - INFO - codeparrot_training - Step 44: {'lr': 0.00012571428571428572, 'samples': 23040, 'steps': 44, 'batch_loss/train': 0.9529557069763541}
12/21/2021 14:46:38 - INFO - codeparrot_training - Step 45: {'lr': 0.00012857142857142855, 'samples': 23552, 'steps': 45, 'batch_loss/train': 0.9314988972619176}
12/21/2021 14:46:50 - INFO - codeparrot_training - Step 46: {'lr': 0.00013142857142857143, 'samples': 24064, 'steps': 46, 'batch_loss/train': 0.8918952587991953}
12/21/2021 14:47:01 - INFO - codeparrot_training - Step 47: {'lr': 0.00013428571428571428, 'samples': 24576, 'steps': 47, 'batch_loss/train': 0.9235455105081201}
12/21/2021 14:47:11 - INFO - codeparrot_training - Step 48: {'lr': 0.00013714285714285716, 'samples': 25088, 'steps': 48, 'batch_loss/train': 0.9857792295515537}
12/21/2021 14:47:24 - INFO - codeparrot_training - Step 49: {'lr': 0.00014000000000000001, 'samples': 25600, 'steps': 49, 'batch_loss/train': 0.9763009026646614}
12/21/2021 14:47:35 - INFO - codeparrot_training - Step 50: {'lr': 0.00014285714285714284, 'samples': 26112, 'steps': 50, 'batch_loss/train': 0.9829016849398613}
12/21/2021 14:47:45 - INFO - codeparrot_training - Step 51: {'lr': 0.00014571428571428572, 'samples': 26624, 'steps': 51, 'batch_loss/train': 1.001112732104957}
12/21/2021 14:47:57 - INFO - codeparrot_training - Step 52: {'lr': 0.00014857142857142857, 'samples': 27136, 'steps': 52, 'batch_loss/train': 1.321533925831318}
12/21/2021 14:48:08 - INFO - codeparrot_training - Step 53: {'lr': 0.00015142857142857145, 'samples': 27648, 'steps': 53, 'batch_loss/train': 1.3060968089848757}
12/21/2021 14:48:19 - INFO - codeparrot_training - Step 54: {'lr': 0.00015428571428571428, 'samples': 28160, 'steps': 54, 'batch_loss/train': 0.9784271735697985}
12/21/2021 14:48:29 - INFO - codeparrot_training - Step 55: {'lr': 0.00015714285714285713, 'samples': 28672, 'steps': 55, 'batch_loss/train': 0.9450377775356174}
12/21/2021 14:48:41 - INFO - codeparrot_training - Step 56: {'lr': 0.00016, 'samples': 29184, 'steps': 56, 'batch_loss/train': 0.9179345397278666}
12/21/2021 14:48:52 - INFO - codeparrot_training - Step 57: {'lr': 0.00016285714285714287, 'samples': 29696, 'steps': 57, 'batch_loss/train': 0.9568109698593616}
12/21/2021 14:49:02 - INFO - codeparrot_training - Step 58: {'lr': 0.00016571428571428572, 'samples': 30208, 'steps': 58, 'batch_loss/train': 0.9738078229129314}
12/21/2021 14:49:15 - INFO - codeparrot_training - Step 59: {'lr': 0.00016857142857142857, 'samples': 30720, 'steps': 59, 'batch_loss/train': 1.0150575507432222}
12/21/2021 14:49:26 - INFO - codeparrot_training - Step 60: {'lr': 0.00017142857142857143, 'samples': 31232, 'steps': 60, 'batch_loss/train': 0.9956141086295247}
12/21/2021 14:49:36 - INFO - codeparrot_training - Step 61: {'lr': 0.0001742857142857143, 'samples': 31744, 'steps': 61, 'batch_loss/train': 0.9621291160583496}
12/21/2021 14:49:48 - INFO - codeparrot_training - Step 62: {'lr': 0.00017714285714285713, 'samples': 32256, 'steps': 62, 'batch_loss/train': 0.9962088353931904}
12/21/2021 14:49:59 - INFO - codeparrot_training - Step 63: {'lr': 0.00017999999999999998, 'samples': 32768, 'steps': 63, 'batch_loss/train': 0.9740501055493951}
12/21/2021 14:50:10 - INFO - codeparrot_training - Step 64: {'lr': 0.00018285714285714286, 'samples': 33280, 'steps': 64, 'batch_loss/train': 0.9131401311606169}
12/21/2021 14:50:20 - INFO - codeparrot_training - Step 65: {'lr': 0.00018571428571428572, 'samples': 33792, 'steps': 65, 'batch_loss/train': 1.007351204752922}
12/21/2021 14:50:32 - INFO - codeparrot_training - Step 66: {'lr': 0.0001885714285714286, 'samples': 34304, 'steps': 66, 'batch_loss/train': 1.0498820468783379}
12/21/2021 14:50:43 - INFO - codeparrot_training - Step 67: {'lr': 0.00019142857142857142, 'samples': 34816, 'steps': 67, 'batch_loss/train': 0.982292203232646}
12/21/2021 14:50:53 - INFO - codeparrot_training - Step 68: {'lr': 0.00019428571428571428, 'samples': 35328, 'steps': 68, 'batch_loss/train': 1.0195065662264824}
12/21/2021 15:51:06 - INFO - codeparrot_training - Step 69: {'lr': 0.00019714285714285716, 'samples': 35840, 'steps': 69, 'batch_loss/train': 1.001670149154961}
12/21/2021 14:51:16 - INFO - codeparrot_training - Step 70: {'lr': 0.0002, 'samples': 36352, 'steps': 70, 'batch_loss/train': 1.0421318169683218}
12/21/2021 14:51:27 - INFO - codeparrot_training - Step 71: {'lr': 0.00020285714285714286, 'samples': 36864, 'steps': 71, 'batch_loss/train': 1.0117254294455051}
12/21/2021 14:51:39 - INFO - codeparrot_training - Step 72: {'lr': 0.00020571428571428572, 'samples': 37376, 'steps': 72, 'batch_loss/train': 0.9900305140763521}
12/21/2021 14:51:50 - INFO - codeparrot_training - Step 73: {'lr': 0.00020857142857142857, 'samples': 37888, 'steps': 73, 'batch_loss/train': 0.9446136141195893}
12/21/2021 14:52:00 - INFO - codeparrot_training - Step 74: {'lr': 0.00021142857142857145, 'samples': 38400, 'steps': 74, 'batch_loss/train': 1.0178908160887659}
12/21/2021 14:52:14 - INFO - codeparrot_training - Step 75: {'lr': 0.00021428571428571427, 'samples': 38912, 'steps': 75, 'batch_loss/train': 0.9789360649883747}
12/21/2021 14:52:24 - INFO - codeparrot_training - Step 76: {'lr': 0.00021714285714285715, 'samples': 39424, 'steps': 76, 'batch_loss/train': 1.102635620161891}
12/21/2021 14:52:35 - INFO - codeparrot_training - Step 77: {'lr': 0.00022, 'samples': 39936, 'steps': 77, 'batch_loss/train': 1.0151197109371424}
12/21/2021 14:52:45 - INFO - codeparrot_training - Step 78: {'lr': 0.00022285714285714286, 'samples': 40448, 'steps': 78, 'batch_loss/train': 1.7697611907497048}
12/21/2021 14:52:58 - INFO - codeparrot_training - Step 79: {'lr': 0.00022571428571428571, 'samples': 40960, 'steps': 79, 'batch_loss/train': 0.9738621157594025}
12/21/2021 14:53:08 - INFO - codeparrot_training - Step 80: {'lr': 0.00022857142857142857, 'samples': 41472, 'steps': 80, 'batch_loss/train': 1.020509110763669}
12/21/2021 14:53:19 - INFO - codeparrot_training - Step 81: {'lr': 0.00023142857142857142, 'samples': 41984, 'steps': 81, 'batch_loss/train': 1.0166559685021639}
12/21/2021 14:53:31 - INFO - codeparrot_training - Step 82: {'lr': 0.0002342857142857143, 'samples': 42496, 'steps': 82, 'batch_loss/train': 1.2909208964556456}
12/21/2021 14:53:41 - INFO - codeparrot_training - Step 83: {'lr': 0.00023714285714285715, 'samples': 43008, 'steps': 83, 'batch_loss/train': 0.9644039301201701}
12/21/2021 14:53:52 - INFO - codeparrot_training - Step 84: {'lr': 0.00024, 'samples': 43520, 'steps': 84, 'batch_loss/train': 1.0707816677168012}
12/21/2021 14:54:04 - INFO - codeparrot_training - Step 85: {'lr': 0.00024285714285714286, 'samples': 44032, 'steps': 85, 'batch_loss/train': 1.1155021954327822}
12/21/2021 14:54:15 - INFO - codeparrot_training - Step 86: {'lr': 0.00024571428571428574, 'samples': 44544, 'steps': 86, 'batch_loss/train': 0.9685124810785055}
12/21/2021 14:54:25 - INFO - codeparrot_training - Step 87: {'lr': 0.00024857142857142857, 'samples': 45056, 'steps': 87, 'batch_loss/train': 1.0195770990103483}
12/21/2021 14:54:36 - INFO - codeparrot_training - Step 88: {'lr': 0.00025142857142857145, 'samples': 45568, 'steps': 88, 'batch_loss/train': 0.9976689526811242}
12/21/2021 14:54:49 - INFO - codeparrot_training - Step 89: {'lr': 0.00025428571428571427, 'samples': 46080, 'steps': 89, 'batch_loss/train': 1.0429263738915324}
12/21/2021 14:55:00 - INFO - codeparrot_training - Step 90: {'lr': 0.0002571428571428571, 'samples': 46592, 'steps': 90, 'batch_loss/train': 0.9543298054486513}
12/21/2021 14:55:10 - INFO - codeparrot_training - Step 91: {'lr': 0.00026000000000000003, 'samples': 47104, 'steps': 91, 'batch_loss/train': 0.9885049089789391}
12/21/2021 14:55:22 - INFO - codeparrot_training - Step 92: {'lr': 0.00026285714285714286, 'samples': 47616, 'steps': 92, 'batch_loss/train': 0.7912878496572375}
12/21/2021 14:55:33 - INFO - codeparrot_training - Step 93: {'lr': 0.00026571428571428574, 'samples': 48128, 'steps': 93, 'batch_loss/train': 0.9988777888938785}
12/21/2021 14:55:43 - INFO - codeparrot_training - Step 94: {'lr': 0.00026857142857142856, 'samples': 48640, 'steps': 94, 'batch_loss/train': 0.9855301873758435}
12/21/2021 14:55:56 - INFO - codeparrot_training - Step 95: {'lr': 0.0002714285714285714, 'samples': 49152, 'steps': 95, 'batch_loss/train': 0.9779925588518381}
12/21/2021 14:56:07 - INFO - codeparrot_training - Step 96: {'lr': 0.0002742857142857143, 'samples': 49664, 'steps': 96, 'batch_loss/train': 1.0134917721152306}
12/21/2021 14:56:17 - INFO - codeparrot_training - Step 97: {'lr': 0.00027714285714285715, 'samples': 50176, 'steps': 97, 'batch_loss/train': 0.9322703145444393}
12/21/2021 14:56:28 - INFO - codeparrot_training - Step 98: {'lr': 0.00028000000000000003, 'samples': 50688, 'steps': 98, 'batch_loss/train': 1.0615130085498095}
12/21/2021 14:56:40 - INFO - codeparrot_training - Step 99: {'lr': 0.00028285714285714286, 'samples': 51200, 'steps': 99, 'batch_loss/train': 0.944296664558351}
12/21/2021 14:56:51 - INFO - codeparrot_training - Step 100: {'lr': 0.0002857142857142857, 'samples': 51712, 'steps': 100, 'batch_loss/train': 0.9070937177166343}
12/21/2021 14:57:01 - INFO - codeparrot_training - Step 101: {'lr': 0.0002885714285714286, 'samples': 52224, 'steps': 101, 'batch_loss/train': 1.016639574430883}
12/21/2021 14:57:13 - INFO - codeparrot_training - Step 102: {'lr': 0.00029142857142857144, 'samples': 52736, 'steps': 102, 'batch_loss/train': 0.9732032003812492}
12/21/2021 14:57:24 - INFO - codeparrot_training - Step 103: {'lr': 0.00029428571428571427, 'samples': 53248, 'steps': 103, 'batch_loss/train': 1.0322233149781823}
12/21/2021 14:57:34 - INFO - codeparrot_training - Step 104: {'lr': 0.00029714285714285715, 'samples': 53760, 'steps': 104, 'batch_loss/train': 0.9859379809349775}
12/21/2021 14:57:47 - INFO - codeparrot_training - Step 105: {'lr': 0.0003, 'samples': 54272, 'steps': 105, 'batch_loss/train': 0.8461151635274291}
12/21/2021 14:57:58 - INFO - codeparrot_training - Step 106: {'lr': 0.0003028571428571429, 'samples': 54784, 'steps': 106, 'batch_loss/train': 1.1073736632242799}
12/21/2021 14:58:08 - INFO - codeparrot_training - Step 107: {'lr': 0.00030571428571428573, 'samples': 55296, 'steps': 107, 'batch_loss/train': 0.9310879334807396}
12/21/2021 14:58:20 - INFO - codeparrot_training - Step 108: {'lr': 0.00030857142857142856, 'samples': 55808, 'steps': 108, 'batch_loss/train': 1.0085503365844488}
12/21/2021 14:58:31 - INFO - codeparrot_training - Step 109: {'lr': 0.00031142857142857144, 'samples': 56320, 'steps': 109, 'batch_loss/train': 1.0120310503989458}
12/21/2021 14:58:42 - INFO - codeparrot_training - Step 110: {'lr': 0.00031428571428571427, 'samples': 56832, 'steps': 110, 'batch_loss/train': 1.012949876487255}
12/21/2021 14:58:52 - INFO - codeparrot_training - Step 111: {'lr': 0.00031714285714285715, 'samples': 57344, 'steps': 111, 'batch_loss/train': 0.9427720690146089}
12/21/2021 14:59:04 - INFO - codeparrot_training - Step 112: {'lr': 0.00032, 'samples': 57856, 'steps': 112, 'batch_loss/train': 0.9724619053304195}
12/21/2021 14:59:15 - INFO - codeparrot_training - Step 113: {'lr': 0.00032285714285714285, 'samples': 58368, 'steps': 113, 'batch_loss/train': 1.0471261069178581}
12/21/2021 14:59:25 - INFO - codeparrot_training - Step 114: {'lr': 0.00032571428571428573, 'samples': 58880, 'steps': 114, 'batch_loss/train': 0.9464971609413624}
12/21/2021 14:59:37 - INFO - codeparrot_training - Step 115: {'lr': 0.00032857142857142856, 'samples': 59392, 'steps': 115, 'batch_loss/train': 1.0060633458197117}
12/21/2021 14:59:48 - INFO - codeparrot_training - Step 116: {'lr': 0.00033142857142857144, 'samples': 59904, 'steps': 116, 'batch_loss/train': 1.039223862811923}
12/21/2021 14:59:58 - INFO - codeparrot_training - Step 117: {'lr': 0.0003342857142857143, 'samples': 60416, 'steps': 117, 'batch_loss/train': 0.9779045283794403}
12/21/2021 15:00:11 - INFO - codeparrot_training - Step 118: {'lr': 0.00033714285714285714, 'samples': 60928, 'steps': 118, 'batch_loss/train': 0.9705766662955284}
12/21/2021 15:00:22 - INFO - codeparrot_training - Step 119: {'lr': 0.00034, 'samples': 61440, 'steps': 119, 'batch_loss/train': 0.9785442650318146}
12/21/2021 15:00:32 - INFO - codeparrot_training - Step 120: {'lr': 0.00034285714285714285, 'samples': 61952, 'steps': 120, 'batch_loss/train': 0.9353706063702703}
12/21/2021 15:00:43 - INFO - codeparrot_training - Step 121: {'lr': 0.00034571428571428573, 'samples': 62464, 'steps': 121, 'batch_loss/train': 0.9075854606926441}
12/21/2021 15:00:55 - INFO - codeparrot_training - Step 122: {'lr': 0.0003485714285714286, 'samples': 62976, 'steps': 122, 'batch_loss/train': 1.002996614202857}
12/21/2021 15:01:05 - INFO - codeparrot_training - Step 123: {'lr': 0.00035142857142857144, 'samples': 63488, 'steps': 123, 'batch_loss/train': 0.8917915541678667}
12/21/2021 15:01:16 - INFO - codeparrot_training - Step 124: {'lr': 0.00035428571428571426, 'samples': 64000, 'steps': 124, 'batch_loss/train': 1.020295524969697}
12/21/2021 15:01:28 - INFO - codeparrot_training - Step 125: {'lr': 0.00035714285714285714, 'samples': 64512, 'steps': 125, 'batch_loss/train': 1.069016046822071}
12/21/2021 15:01:39 - INFO - codeparrot_training - Step 126: {'lr': 0.00035999999999999997, 'samples': 65024, 'steps': 126, 'batch_loss/train': 0.8671705303713679}
12/21/2021 15:01:49 - INFO - codeparrot_training - Step 127: {'lr': 0.0003628571428571429, 'samples': 65536, 'steps': 127, 'batch_loss/train': 1.2122967820614576}
12/21/2021 15:02:03 - INFO - codeparrot_training - Step 128: {'lr': 0.00036571428571428573, 'samples': 66048, 'steps': 128, 'batch_loss/train': 0.9741223966702819}
12/21/2021 15:02:13 - INFO - codeparrot_training - Step 129: {'lr': 0.00036857142857142855, 'samples': 66560, 'steps': 129, 'batch_loss/train': 0.9305755216628313}
12/21/2021 15:02:24 - INFO - codeparrot_training - Step 130: {'lr': 0.00037142857142857143, 'samples': 67072, 'steps': 130, 'batch_loss/train': 1.5241616610437632}
12/21/2021 15:02:35 - INFO - codeparrot_training - Step 131: {'lr': 0.00037428571428571426, 'samples': 67584, 'steps': 131, 'batch_loss/train': 0.9661178383976221}
12/21/2021 15:02:46 - INFO - codeparrot_training - Step 132: {'lr': 0.0003771428571428572, 'samples': 68096, 'steps': 132, 'batch_loss/train': 1.0141914580017328}
12/21/2021 15:02:57 - INFO - codeparrot_training - Step 133: {'lr': 0.00038, 'samples': 68608, 'steps': 133, 'batch_loss/train': 0.9654294233769178}
12/21/2021 15:03:08 - INFO - codeparrot_training - Step 134: {'lr': 0.00038285714285714285, 'samples': 69120, 'steps': 134, 'batch_loss/train': 0.948192348703742}
12/21/2021 15:03:20 - INFO - codeparrot_training - Step 135: {'lr': 0.0003857142857142857, 'samples': 69632, 'steps': 135, 'batch_loss/train': 1.0017092926427722}
12/21/2021 15:03:31 - INFO - codeparrot_training - Step 136: {'lr': 0.00038857142857142855, 'samples': 70144, 'steps': 136, 'batch_loss/train': 0.9743279106914997}
12/21/2021 15:03:42 - INFO - codeparrot_training - Step 137: {'lr': 0.00039142857142857143, 'samples': 70656, 'steps': 137, 'batch_loss/train': 0.8830758472904563}
12/21/2021 15:03:53 - INFO - codeparrot_training - Step 138: {'lr': 0.0003942857142857143, 'samples': 71168, 'steps': 138, 'batch_loss/train': 1.062146995216608}
12/21/2021 15:04:04 - INFO - codeparrot_training - Step 139: {'lr': 0.00039714285714285714, 'samples': 71680, 'steps': 139, 'batch_loss/train': 1.0399387907236814}
12/21/2021 15:04:15 - INFO - codeparrot_training - Step 140: {'lr': 0.0004, 'samples': 72192, 'steps': 140, 'batch_loss/train': 1.0582102639600635}
12/21/2021 15:04:25 - INFO - codeparrot_training - Step 141: {'lr': 0.00040285714285714285, 'samples': 72704, 'steps': 141, 'batch_loss/train': 1.0336521156132221}
12/21/2021 15:04:37 - INFO - codeparrot_training - Step 142: {'lr': 0.0004057142857142857, 'samples': 73216, 'steps': 142, 'batch_loss/train': 0.8318402171134949}
12/21/2021 15:04:48 - INFO - codeparrot_training - Step 143: {'lr': 0.0004085714285714286, 'samples': 73728, 'steps': 143, 'batch_loss/train': 0.8727374253794551}
12/21/2021 15:04:59 - INFO - codeparrot_training - Step 144: {'lr': 0.00041142857142857143, 'samples': 74240, 'steps': 144, 'batch_loss/train': 1.0500088389962912}
12/21/2021 15:05:11 - INFO - codeparrot_training - Step 145: {'lr': 0.0004142857142857143, 'samples': 74752, 'steps': 145, 'batch_loss/train': 1.0643755737692118}
12/21/2021 15:05:22 - INFO - codeparrot_training - Step 146: {'lr': 0.00041714285714285714, 'samples': 75264, 'steps': 146, 'batch_loss/train': 0.9461820563301444}
12/21/2021 15:05:32 - INFO - codeparrot_training - Step 147: {'lr': 0.00042, 'samples': 75776, 'steps': 147, 'batch_loss/train': 1.0224799066781998}
12/21/2021 15:05:44 - INFO - codeparrot_training - Step 148: {'lr': 0.0004228571428571429, 'samples': 76288, 'steps': 148, 'batch_loss/train': 0.8724416689947248}
12/21/2021 15:05:55 - INFO - codeparrot_training - Step 149: {'lr': 0.0004257142857142857, 'samples': 76800, 'steps': 149, 'batch_loss/train': 0.9072890188544989}
12/21/2021 15:06:06 - INFO - codeparrot_training - Step 150: {'lr': 0.00042857142857142855, 'samples': 77312, 'steps': 150, 'batch_loss/train': 0.9915821412578225}
12/21/2021 15:06:16 - INFO - codeparrot_training - Step 151: {'lr': 0.00043142857142857143, 'samples': 77824, 'steps': 151, 'batch_loss/train': 1.0291715785861015}
12/21/2021 15:06:29 - INFO - codeparrot_training - Step 152: {'lr': 0.0004342857142857143, 'samples': 78336, 'steps': 152, 'batch_loss/train': 1.022611951455474}
12/21/2021 15:06:39 - INFO - codeparrot_training - Step 153: {'lr': 0.0004371428571428572, 'samples': 78848, 'steps': 153, 'batch_loss/train': 1.0028812251985073}
12/21/2021 15:06:50 - INFO - codeparrot_training - Step 154: {'lr': 0.00044, 'samples': 79360, 'steps': 154, 'batch_loss/train': 1.1939127957448363}
12/21/2021 15:07:02 - INFO - codeparrot_training - Step 155: {'lr': 0.00044285714285714284, 'samples': 79872, 'steps': 155, 'batch_loss/train': 0.9952369630336761}
12/21/2021 15:07:13 - INFO - codeparrot_training - Step 156: {'lr': 0.0004457142857142857, 'samples': 80384, 'steps': 156, 'batch_loss/train': 1.0121058207005262}
12/21/2021 15:07:23 - INFO - codeparrot_training - Step 157: {'lr': 0.0004485714285714286, 'samples': 80896, 'steps': 157, 'batch_loss/train': 0.98527164850384}
12/21/2021 15:07:35 - INFO - codeparrot_training - Step 158: {'lr': 0.00045142857142857143, 'samples': 81408, 'steps': 158, 'batch_loss/train': 0.9504385013133287}
12/21/2021 15:07:46 - INFO - codeparrot_training - Step 159: {'lr': 0.0004542857142857143, 'samples': 81920, 'steps': 159, 'batch_loss/train': 0.9669047053903341}
12/21/2021 15:07:56 - INFO - codeparrot_training - Step 160: {'lr': 0.00045714285714285713, 'samples': 82432, 'steps': 160, 'batch_loss/train': 1.0060318671166897}
12/21/2021 15:08:07 - INFO - codeparrot_training - Step 161: {'lr': 0.00046, 'samples': 82944, 'steps': 161, 'batch_loss/train': 1.0761142205446959}
12/21/2021 15:08:19 - INFO - codeparrot_training - Step 162: {'lr': 0.00046285714285714284, 'samples': 83456, 'steps': 162, 'batch_loss/train': 0.9274779781699181}
12/21/2021 15:08:30 - INFO - codeparrot_training - Step 163: {'lr': 0.0004657142857142857, 'samples': 83968, 'steps': 163, 'batch_loss/train': 1.0799931325018406}
12/21/2021 15:08:40 - INFO - codeparrot_training - Step 164: {'lr': 0.0004685714285714286, 'samples': 84480, 'steps': 164, 'batch_loss/train': 0.8160594068467617}
12/21/2021 15:08:54 - INFO - codeparrot_training - Step 165: {'lr': 0.0004714285714285714, 'samples': 84992, 'steps': 165, 'batch_loss/train': 0.9737982749938965}
12/21/2021 15:09:05 - INFO - codeparrot_training - Step 166: {'lr': 0.0004742857142857143, 'samples': 85504, 'steps': 166, 'batch_loss/train': 0.9546704925596714}
12/21/2021 15:09:15 - INFO - codeparrot_training - Step 167: {'lr': 0.00047714285714285713, 'samples': 86016, 'steps': 167, 'batch_loss/train': 0.9489555452018976}
12/21/2021 15:09:27 - INFO - codeparrot_training - Step 168: {'lr': 0.00048, 'samples': 86528, 'steps': 168, 'batch_loss/train': 0.7765848562121391}
12/21/2021 15:09:38 - INFO - codeparrot_training - Step 169: {'lr': 0.0004828571428571429, 'samples': 87040, 'steps': 169, 'batch_loss/train': 1.0046934774145484}
12/21/2021 15:09:48 - INFO - codeparrot_training - Step 170: {'lr': 0.0004857142857142857, 'samples': 87552, 'steps': 170, 'batch_loss/train': 0.8790218532085419}
12/21/2021 15:09:59 - INFO - codeparrot_training - Step 171: {'lr': 0.0004885714285714286, 'samples': 88064, 'steps': 171, 'batch_loss/train': 0.9702072646468878}
12/21/2021 15:10:11 - INFO - codeparrot_training - Step 172: {'lr': 0.0004914285714285715, 'samples': 88576, 'steps': 172, 'batch_loss/train': 0.9177612056955695}
12/21/2021 15:10:21 - INFO - codeparrot_training - Step 173: {'lr': 0.0004942857142857143, 'samples': 89088, 'steps': 173, 'batch_loss/train': 1.0101577769964933}
12/21/2021 15:10:32 - INFO - codeparrot_training - Step 174: {'lr': 0.0004971428571428571, 'samples': 89600, 'steps': 174, 'batch_loss/train': 0.9843772668391466}
12/21/2021 15:10:45 - INFO - codeparrot_training - Step 175: {'lr': 0.0005, 'samples': 90112, 'steps': 175, 'batch_loss/train': 1.0309578496962786}
12/21/2021 15:10:55 - INFO - codeparrot_training - Step 176: {'lr': 0.0004999999995030472, 'samples': 90624, 'steps': 176, 'batch_loss/train': 0.9139007637277246}
12/21/2021 15:11:06 - INFO - codeparrot_training - Step 177: {'lr': 0.0004999999980121888, 'samples': 91136, 'steps': 177, 'batch_loss/train': 1.0263297399505973}
12/21/2021 15:11:18 - INFO - codeparrot_training - Step 178: {'lr': 0.0004999999955274248, 'samples': 91648, 'steps': 178, 'batch_loss/train': 0.9592798547819257}
12/21/2021 15:11:29 - INFO - codeparrot_training - Step 179: {'lr': 0.0004999999920487552, 'samples': 92160, 'steps': 179, 'batch_loss/train': 0.8388644908554852}
12/21/2021 15:11:39 - INFO - codeparrot_training - Step 180: {'lr': 0.00049999998757618, 'samples': 92672, 'steps': 180, 'batch_loss/train': 0.9975963737815619}
12/21/2021 15:11:50 - INFO - codeparrot_training - Step 181: {'lr': 0.0004999999821096993, 'samples': 93184, 'steps': 181, 'batch_loss/train': 0.9878550455905497}
12/21/2021 15:12:03 - INFO - codeparrot_training - Step 182: {'lr': 0.0004999999756493131, 'samples': 93696, 'steps': 182, 'batch_loss/train': 1.0588420629501343}
12/21/2021 15:12:13 - INFO - codeparrot_training - Step 183: {'lr': 0.0004999999681950213, 'samples': 94208, 'steps': 183, 'batch_loss/train': 0.996025518514216}
12/21/2021 15:12:24 - INFO - codeparrot_training - Step 184: {'lr': 0.0004999999597468241, 'samples': 94720, 'steps': 184, 'batch_loss/train': 0.8855784628540277}
12/21/2021 15:12:36 - INFO - codeparrot_training - Step 185: {'lr': 0.0004999999503047214, 'samples': 95232, 'steps': 185, 'batch_loss/train': 0.9955014921724796}
12/21/2021 15:12:47 - INFO - codeparrot_training - Step 186: {'lr': 0.0004999999398687133, 'samples': 95744, 'steps': 186, 'batch_loss/train': 0.9876936078071594}
12/21/2021 15:12:57 - INFO - codeparrot_training - Step 187: {'lr': 0.0004999999284387999, 'samples': 96256, 'steps': 187, 'batch_loss/train': 0.9578468287363648}
12/21/2021 15:13:09 - INFO - codeparrot_training - Step 188: {'lr': 0.0004999999160149812, 'samples': 96768, 'steps': 188, 'batch_loss/train': 1.3733998239040375}
12/21/2021 15:13:20 - INFO - codeparrot_training - Step 189: {'lr': 0.0004999999025972571, 'samples': 97280, 'steps': 189, 'batch_loss/train': 0.8602615557610989}
12/21/2021 15:13:30 - INFO - codeparrot_training - Step 190: {'lr': 0.0004999998881856279, 'samples': 97792, 'steps': 190, 'batch_loss/train': 0.9249651916325092}
12/21/2021 15:13:41 - INFO - codeparrot_training - Step 191: {'lr': 0.0004999998727800935, 'samples': 98304, 'steps': 191, 'batch_loss/train': 1.0908228866755962}
12/21/2021 15:13:53 - INFO - codeparrot_training - Step 192: {'lr': 0.000499999856380654, 'samples': 98816, 'steps': 192, 'batch_loss/train': 0.7210814044810832}
12/21/2021 15:14:04 - INFO - codeparrot_training - Step 193: {'lr': 0.0004999998389873094, 'samples': 99328, 'steps': 193, 'batch_loss/train': 1.003062218427658}
12/21/2021 15:14:15 - INFO - codeparrot_training - Step 194: {'lr': 0.0004999998206000599, 'samples': 99840, 'steps': 194, 'batch_loss/train': 0.9711842276155949}
12/21/2021 15:14:27 - INFO - codeparrot_training - Step 195: {'lr': 0.0004999998012189056, 'samples': 100352, 'steps': 195, 'batch_loss/train': 1.0285911057144403}
12/21/2021 15:14:38 - INFO - codeparrot_training - Step 196: {'lr': 0.0004999997808438464, 'samples': 100864, 'steps': 196, 'batch_loss/train': 0.8522499427199364}
12/21/2021 15:14:49 - INFO - codeparrot_training - Step 197: {'lr': 0.0004999997594748824, 'samples': 101376, 'steps': 197, 'batch_loss/train': 1.0216512009501457}
12/21/2021 15:15:00 - INFO - codeparrot_training - Step 198: {'lr': 0.0004999997371120139, 'samples': 101888, 'steps': 198, 'batch_loss/train': 0.9640177562832832}
12/21/2021 15:15:11 - INFO - codeparrot_training - Step 199: {'lr': 0.0004999997137552407, 'samples': 102400, 'steps': 199, 'batch_loss/train': 1.0707423686981201}
12/21/2021 15:15:22 - INFO - codeparrot_training - Step 200: {'lr': 0.000499999689404563, 'samples': 102912, 'steps': 200, 'batch_loss/train': 1.0855901474133134}
12/21/2021 15:15:32 - INFO - codeparrot_training - Step 201: {'lr': 0.000499999664059981, 'samples': 103424, 'steps': 201, 'batch_loss/train': 0.9749338757246733}
12/21/2021 15:15:44 - INFO - codeparrot_training - Step 202: {'lr': 0.0004999996377214949, 'samples': 103936, 'steps': 202, 'batch_loss/train': 0.9779012911021709}
12/21/2021 15:15:55 - INFO - codeparrot_training - Step 203: {'lr': 0.0004999996103891045, 'samples': 104448, 'steps': 203, 'batch_loss/train': 0.9549416210502386}
12/21/2021 15:16:05 - INFO - codeparrot_training - Step 204: {'lr': 0.00049999958206281, 'samples': 104960, 'steps': 204, 'batch_loss/train': 0.9440921135246754}
12/21/2021 15:16:18 - INFO - codeparrot_training - Step 205: {'lr': 0.0004999995527426116, 'samples': 105472, 'steps': 205, 'batch_loss/train': 0.9560529226437211}
12/21/2021 15:16:29 - INFO - codeparrot_training - Step 206: {'lr': 0.0004999995224285093, 'samples': 105984, 'steps': 206, 'batch_loss/train': 0.9873572085052729}
12/21/2021 15:16:39 - INFO - codeparrot_training - Step 207: {'lr': 0.0004999994911205034, 'samples': 106496, 'steps': 207, 'batch_loss/train': 1.0310314800590277}
12/21/2021 15:16:51 - INFO - codeparrot_training - Step 208: {'lr': 0.0004999994588185939, 'samples': 107008, 'steps': 208, 'batch_loss/train': 0.9547451809048653}
12/21/2021 15:17:02 - INFO - codeparrot_training - Step 209: {'lr': 0.0004999994255227809, 'samples': 107520, 'steps': 209, 'batch_loss/train': 0.9218645989894867}
12/21/2021 15:17:13 - INFO - codeparrot_training - Step 210: {'lr': 0.0004999993912330646, 'samples': 108032, 'steps': 210, 'batch_loss/train': 0.94939138693735}
12/21/2021 15:17:25 - INFO - codeparrot_training - Step 211: {'lr': 0.0004999993559494452, 'samples': 108544, 'steps': 211, 'batch_loss/train': 0.9946675049141049}
12/21/2021 15:17:36 - INFO - codeparrot_training - Step 212: {'lr': 0.0004999993196719226, 'samples': 109056, 'steps': 212, 'batch_loss/train': 0.9801458483561873}
12/21/2021 15:17:47 - INFO - codeparrot_training - Step 213: {'lr': 0.0004999992824004972, 'samples': 109568, 'steps': 213, 'batch_loss/train': 0.9083622833713889}
12/21/2021 15:17:57 - INFO - codeparrot_training - Step 214: {'lr': 0.0004999992441351691, 'samples': 110080, 'steps': 214, 'batch_loss/train': 0.8982020476832986}
12/21/2021 15:18:09 - INFO - codeparrot_training - Step 215: {'lr': 0.0004999992048759383, 'samples': 110592, 'steps': 215, 'batch_loss/train': 0.9808334819972515}
12/21/2021 15:18:20 - INFO - codeparrot_training - Step 216: {'lr': 0.0004999991646228051, 'samples': 111104, 'steps': 216, 'batch_loss/train': 0.9249397749081254}
12/21/2021 15:18:31 - INFO - codeparrot_training - Step 217: {'lr': 0.0004999991233757696, 'samples': 111616, 'steps': 217, 'batch_loss/train': 0.8354386785067618}
12/21/2021 15:18:43 - INFO - codeparrot_training - Step 218: {'lr': 0.000499999081134832, 'samples': 112128, 'steps': 218, 'batch_loss/train': 0.8332201889716089}
12/21/2021 15:18:53 - INFO - codeparrot_training - Step 219: {'lr': 0.0004999990378999925, 'samples': 112640, 'steps': 219, 'batch_loss/train': 0.9582055304199457}
12/21/2021 15:19:04 - INFO - codeparrot_training - Step 220: {'lr': 0.0004999989936712512, 'samples': 113152, 'steps': 220, 'batch_loss/train': 1.078530428931117}
12/21/2021 15:19:16 - INFO - codeparrot_training - Step 221: {'lr': 0.0004999989484486082, 'samples': 113664, 'steps': 221, 'batch_loss/train': 1.0509033929556608}
12/21/2021 15:19:27 - INFO - codeparrot_training - Step 222: {'lr': 0.0004999989022320638, 'samples': 114176, 'steps': 222, 'batch_loss/train': 1.0024668099358678}
12/21/2021 15:19:37 - INFO - codeparrot_training - Step 223: {'lr': 0.0004999988550216182, 'samples': 114688, 'steps': 223, 'batch_loss/train': 0.9155860748142004}
12/21/2021 15:19:48 - INFO - codeparrot_training - Step 224: {'lr': 0.0004999988068172715, 'samples': 115200, 'steps': 224, 'batch_loss/train': 1.0199661303777248}
12/21/2021 15:20:00 - INFO - codeparrot_training - Step 225: {'lr': 0.000499998757619024, 'samples': 115712, 'steps': 225, 'batch_loss/train': 0.9875146131962538}
12/21/2021 15:20:11 - INFO - codeparrot_training - Step 226: {'lr': 0.0004999987074268758, 'samples': 116224, 'steps': 226, 'batch_loss/train': 0.9970517549663782}
12/21/2021 15:20:21 - INFO - codeparrot_training - Step 227: {'lr': 0.0004999986562408272, 'samples': 116736, 'steps': 227, 'batch_loss/train': 0.9353131148964167}
12/21/2021 15:20:33 - INFO - codeparrot_training - Step 228: {'lr': 0.0004999986040608782, 'samples': 117248, 'steps': 228, 'batch_loss/train': 0.9688568478450179}
12/21/2021 15:20:44 - INFO - codeparrot_training - Step 229: {'lr': 0.0004999985508870293, 'samples': 117760, 'steps': 229, 'batch_loss/train': 0.982265992090106}
12/21/2021 15:20:55 - INFO - codeparrot_training - Step 230: {'lr': 0.0004999984967192806, 'samples': 118272, 'steps': 230, 'batch_loss/train': 0.9376679211854935}
12/21/2021 15:21:07 - INFO -
codeparrot_training - Step 231: {'lr': 0.0004999984415576321, 'samples': 118784, 'steps': 231, 'batch_loss/train': 0.955386396497488} 12/21/2021 15:21:17 - INFO - codeparrot_training - Step 232: {'lr': 0.0004999983854020844, 'samples': 119296, 'steps': 232, 'batch_loss/train': 1.0152222216129303} 12/21/2021 15:21:28 - INFO - codeparrot_training - Step 233: {'lr': 0.0004999983282526373, 'samples': 119808, 'steps': 233, 'batch_loss/train': 1.0677617341279984} 12/21/2021 15:21:38 - INFO - codeparrot_training - Step 234: {'lr': 0.0004999982701092912, 'samples': 120320, 'steps': 234, 'batch_loss/train': 0.9254444502294064} 12/21/2021 15:21:51 - INFO - codeparrot_training - Step 235: {'lr': 0.0004999982109720466, 'samples': 120832, 'steps': 235, 'batch_loss/train': 0.9195733573287725} 12/21/2021 15:22:02 - INFO - codeparrot_training - Step 236: {'lr': 0.0004999981508409034, 'samples': 121344, 'steps': 236, 'batch_loss/train': 1.5990784456953406} 12/21/2021 15:22:12 - INFO - codeparrot_training - Step 237: {'lr': 0.0004999980897158619, 'samples': 121856, 'steps': 237, 'batch_loss/train': 0.964858983643353} 12/21/2021 15:22:25 - INFO - codeparrot_training - Step 238: {'lr': 0.0004999980275969224, 'samples': 122368, 'steps': 238, 'batch_loss/train': 0.9365309840068221} 12/21/2021 15:22:35 - INFO - codeparrot_training - Step 239: {'lr': 0.0004999979644840852, 'samples': 122880, 'steps': 239, 'batch_loss/train': 1.0039056641981006} 12/21/2021 15:22:46 - INFO - codeparrot_training - Step 240: {'lr': 0.0004999979003773504, 'samples': 123392, 'steps': 240, 'batch_loss/train': 0.993403622880578} 12/21/2021 15:22:56 - INFO - codeparrot_training - Step 241: {'lr': 0.0004999978352767185, 'samples': 123904, 'steps': 241, 'batch_loss/train': 0.9287360450252891} 12/21/2021 15:23:09 - INFO - codeparrot_training - Step 242: {'lr': 0.0004999977691821896, 'samples': 124416, 'steps': 242, 'batch_loss/train': 0.8893203837797046} 12/21/2021 15:23:19 - INFO - codeparrot_training - Step 243: 
{'lr': 0.0004999977020937638, 'samples': 124928, 'steps': 243, 'batch_loss/train': 0.9247511001303792} 12/21/2021 15:23:30 - INFO - codeparrot_training - Step 244: {'lr': 0.0004999976340114416, 'samples': 125440, 'steps': 244, 'batch_loss/train': 0.9693126305937767} 12/21/2021 15:23:42 - INFO - codeparrot_training - Step 245: {'lr': 0.0004999975649352233, 'samples': 125952, 'steps': 245, 'batch_loss/train': 0.8935371916741133} 12/21/2021 15:23:53 - INFO - codeparrot_training - Step 246: {'lr': 0.0004999974948651089, 'samples': 126464, 'steps': 246, 'batch_loss/train': 0.9815532034263015} 12/21/2021 15:24:03 - INFO - codeparrot_training - Step 247: {'lr': 0.000499997423801099, 'samples': 126976, 'steps': 247, 'batch_loss/train': 2.1787368413060904} 12/21/2021 15:24:15 - INFO - codeparrot_training - Step 248: {'lr': 0.0004999973517431937, 'samples': 127488, 'steps': 248, 'batch_loss/train': 0.9655200140550733} 12/21/2021 15:24:26 - INFO - codeparrot_training - Step 249: {'lr': 0.0004999972786913934, 'samples': 128000, 'steps': 249, 'batch_loss/train': 1.0043728100135922} 12/21/2021 15:24:36 - INFO - codeparrot_training - Step 250: {'lr': 0.0004999972046456981, 'samples': 128512, 'steps': 250, 'batch_loss/train': 0.9132165648043156} 12/21/2021 15:24:49 - INFO - codeparrot_training - Step 251: {'lr': 0.0004999971296061085, 'samples': 129024, 'steps': 251, 'batch_loss/train': 1.0024230564013124} 12/21/2021 15:25:00 - INFO - codeparrot_training - Step 252: {'lr': 0.0004999970535726246, 'samples': 129536, 'steps': 252, 'batch_loss/train': 0.9403008110821247} 12/21/2021 15:25:10 - INFO - codeparrot_training - Step 253: {'lr': 0.0004999969765452468, 'samples': 130048, 'steps': 253, 'batch_loss/train': 0.9630314344540238} 12/21/2021 15:25:21 - INFO - codeparrot_training - Step 254: {'lr': 0.0004999968985239755, 'samples': 130560, 'steps': 254, 'batch_loss/train': 1.0133339166641235} 12/21/2021 15:25:33 - INFO - codeparrot_training - Step 255: {'lr': 0.0004999968195088109, 
'samples': 131072, 'steps': 255, 'batch_loss/train': 0.9787106551229954} 12/21/2021 15:25:43 - INFO - codeparrot_training - Step 256: {'lr': 0.0004999967394997534, 'samples': 131584, 'steps': 256, 'batch_loss/train': 0.9070568662136793} 12/21/2021 15:25:54 - INFO - codeparrot_training - Step 257: {'lr': 0.0004999966584968033, 'samples': 132096, 'steps': 257, 'batch_loss/train': 0.8569030165672302} 12/21/2021 15:26:06 - INFO - codeparrot_training - Step 258: {'lr': 0.0004999965764999607, 'samples': 132608, 'steps': 258, 'batch_loss/train': 0.9390520844608545} 12/21/2021 15:26:17 - INFO - codeparrot_training - Step 259: {'lr': 0.0004999964935092262, 'samples': 133120, 'steps': 259, 'batch_loss/train': 0.9352331347763538} 12/21/2021 15:26:27 - INFO - codeparrot_training - Step 260: {'lr': 0.0004999964095245999, 'samples': 133632, 'steps': 260, 'batch_loss/train': 1.0299120405688882} 12/21/2021 15:26:40 - INFO - codeparrot_training - Step 261: {'lr': 0.0004999963245460824, 'samples': 134144, 'steps': 261, 'batch_loss/train': 0.9368714848533273} 12/21/2021 15:26:50 - INFO - codeparrot_training - Step 262: {'lr': 0.0004999962385736739, 'samples': 134656, 'steps': 262, 'batch_loss/train': 0.8667335538193583} 12/21/2021 15:27:01 - INFO - codeparrot_training - Step 263: {'lr': 0.0004999961516073748, 'samples': 135168, 'steps': 263, 'batch_loss/train': 1.0675992406904697} 12/21/2021 15:27:12 - INFO - codeparrot_training - Step 264: {'lr': 0.0004999960636471853, 'samples': 135680, 'steps': 264, 'batch_loss/train': 0.9339342024177313} 12/21/2021 15:27:24 - INFO - codeparrot_training - Step 265: {'lr': 0.000499995974693106, 'samples': 136192, 'steps': 265, 'batch_loss/train': 0.9612384159117937} 12/21/2021 15:27:34 - INFO - codeparrot_training - Step 266: {'lr': 0.0004999958847451369, 'samples': 136704, 'steps': 266, 'batch_loss/train': 1.006809951737523} 12/21/2021 15:27:45 - INFO - codeparrot_training - Step 267: {'lr': 0.0004999957938032787, 'samples': 137216, 'steps': 267, 
'batch_loss/train': 0.9956084182485938} 12/21/2021 15:27:57 - INFO - codeparrot_training - Step 268: {'lr': 0.0004999957018675316, 'samples': 137728, 'steps': 268, 'batch_loss/train': 0.9225286403670907} 12/21/2021 15:28:07 - INFO - codeparrot_training - Step 269: {'lr': 0.0004999956089378959, 'samples': 138240, 'steps': 269, 'batch_loss/train': 1.0786495171487331} 12/21/2021 15:28:18 - INFO - codeparrot_training - Step 270: {'lr': 0.0004999955150143721, 'samples': 138752, 'steps': 270, 'batch_loss/train': 0.9791294615715742} 12/21/2021 15:28:30 - INFO - codeparrot_training - Step 271: {'lr': 0.0004999954200969605, 'samples': 139264, 'steps': 271, 'batch_loss/train': 0.9864300889894366} 12/21/2021 15:28:41 - INFO - codeparrot_training - Step 272: {'lr': 0.0004999953241856616, 'samples': 139776, 'steps': 272, 'batch_loss/train': 0.9177341545000672} 12/21/2021 15:28:51 - INFO - codeparrot_training - Step 273: {'lr': 0.0004999952272804756, 'samples': 140288, 'steps': 273, 'batch_loss/train': 0.946137722581625} 12/21/2021 15:29:05 - INFO - codeparrot_training - Step 274: {'lr': 0.000499995129381403, 'samples': 140800, 'steps': 274, 'batch_loss/train': 0.9582210006192327} 12/21/2021 15:29:15 - INFO - codeparrot_training - Step 275: {'lr': 0.0004999950304884442, 'samples': 141312, 'steps': 275, 'batch_loss/train': 0.9708927199244499} 12/21/2021 15:29:26 - INFO - codeparrot_training - Step 276: {'lr': 0.0004999949306015995, 'samples': 141824, 'steps': 276, 'batch_loss/train': 1.87380766030401} 12/21/2021 15:29:36 - INFO - codeparrot_training - Step 277: {'lr': 0.0004999948297208693, 'samples': 142336, 'steps': 277, 'batch_loss/train': 1.1308678593486547} 12/21/2021 15:29:48 - INFO - codeparrot_training - Step 278: {'lr': 0.0004999947278462541, 'samples': 142848, 'steps': 278, 'batch_loss/train': 1.020630268380046} 12/21/2021 15:29:59 - INFO - codeparrot_training - Step 279: {'lr': 0.0004999946249777543, 'samples': 143360, 'steps': 279, 'batch_loss/train': 
0.9665484940633178} 12/21/2021 15:30:09 - INFO - codeparrot_training - Step 280: {'lr': 0.0004999945211153703, 'samples': 143872, 'steps': 280, 'batch_loss/train': 1.0033400291576982} 12/21/2021 15:30:20 - INFO - codeparrot_training - Step 281: {'lr': 0.0004999944162591024, 'samples': 144384, 'steps': 281, 'batch_loss/train': 0.9401888307183981} 12/21/2021 15:30:33 - INFO - codeparrot_training - Step 282: {'lr': 0.0004999943104089511, 'samples': 144896, 'steps': 282, 'batch_loss/train': 1.0101582873612642} 12/21/2021 15:30:43 - INFO - codeparrot_training - Step 283: {'lr': 0.0004999942035649168, 'samples': 145408, 'steps': 283, 'batch_loss/train': 1.0264825131744146} 12/21/2021 15:30:54 - INFO - codeparrot_training - Step 284: {'lr': 0.0004999940957269999, 'samples': 145920, 'steps': 284, 'batch_loss/train': 0.9054904184304178} 12/21/2021 15:31:06 - INFO - codeparrot_training - Step 285: {'lr': 0.0004999939868952009, 'samples': 146432, 'steps': 285, 'batch_loss/train': 0.8634658250957727} 12/21/2021 15:31:16 - INFO - codeparrot_training - Step 286: {'lr': 0.0004999938770695203, 'samples': 146944, 'steps': 286, 'batch_loss/train': 1.0623836005106568} 12/21/2021 15:31:27 - INFO - codeparrot_training - Step 287: {'lr': 0.0004999937662499585, 'samples': 147456, 'steps': 287, 'batch_loss/train': 0.9658253453671932} 12/21/2021 15:31:39 - INFO - codeparrot_training - Step 288: {'lr': 0.0004999936544365156, 'samples': 147968, 'steps': 288, 'batch_loss/train': 0.9038360472768545} 12/21/2021 15:31:49 - INFO - codeparrot_training - Step 289: {'lr': 0.0004999935416291926, 'samples': 148480, 'steps': 289, 'batch_loss/train': 0.9555488238111138} 12/21/2021 15:32:00 - INFO - codeparrot_training - Step 290: {'lr': 0.0004999934278279894, 'samples': 148992, 'steps': 290, 'batch_loss/train': 0.9170712037011981} 12/21/2021 15:32:13 - INFO - codeparrot_training - Step 291: {'lr': 0.0004999933130329069, 'samples': 149504, 'steps': 291, 'batch_loss/train': 0.8491080598905683} 12/21/2021 
15:32:23 - INFO - codeparrot_training - Step 292: {'lr': 0.0004999931972439454, 'samples': 150016, 'steps': 292, 'batch_loss/train': 0.9852352160960436} 12/21/2021 15:32:34 - INFO - codeparrot_training - Step 293: {'lr': 0.0004999930804611052, 'samples': 150528, 'steps': 293, 'batch_loss/train': 0.8676561275497079} 12/21/2021 15:32:45 - INFO - codeparrot_training - Step 294: {'lr': 0.0004999929626843871, 'samples': 151040, 'steps': 294, 'batch_loss/train': 0.957035600207746} 12/21/2021 15:32:57 - INFO - codeparrot_training - Step 295: {'lr': 0.0004999928439137913, 'samples': 151552, 'steps': 295, 'batch_loss/train': 0.8217380112037063} 12/21/2021 15:33:07 - INFO - codeparrot_training - Step 296: {'lr': 0.0004999927241493183, 'samples': 152064, 'steps': 296, 'batch_loss/train': 0.9880850934423506} 12/21/2021 15:33:18 - INFO - codeparrot_training - Step 297: {'lr': 0.0004999926033909686, 'samples': 152576, 'steps': 297, 'batch_loss/train': 0.975646959617734} 12/21/2021 15:33:31 - INFO - codeparrot_training - Step 298: {'lr': 0.0004999924816387428, 'samples': 153088, 'steps': 298, 'batch_loss/train': 0.9518024520948529} 12/21/2021 15:33:41 - INFO - codeparrot_training - Step 299: {'lr': 0.0004999923588926412, 'samples': 153600, 'steps': 299, 'batch_loss/train': 0.9249488320201635} 12/21/2021 15:33:52 - INFO - codeparrot_training - Step 300: {'lr': 0.0004999922351526644, 'samples': 154112, 'steps': 300, 'batch_loss/train': 1.021987003274262} 12/21/2021 15:34:04 - INFO - codeparrot_training - Step 301: {'lr': 0.0004999921104188129, 'samples': 154624, 'steps': 301, 'batch_loss/train': 0.944774815812707} 12/21/2021 15:34:15 - INFO - codeparrot_training - Step 302: {'lr': 0.0004999919846910871, 'samples': 155136, 'steps': 302, 'batch_loss/train': 1.0029222639277577} 12/21/2021 15:34:25 - INFO - codeparrot_training - Step 303: {'lr': 0.0004999918579694876, 'samples': 155648, 'steps': 303, 'batch_loss/train': 0.8526419843547046} 12/21/2021 15:34:36 - INFO - 
codeparrot_training - Step 304: {'lr': 0.0004999917302540149, 'samples': 156160, 'steps': 304, 'batch_loss/train': 0.9772113729268312} 12/21/2021 15:34:48 - INFO - codeparrot_training - Step 305: {'lr': 0.0004999916015446694, 'samples': 156672, 'steps': 305, 'batch_loss/train': 0.9436007365584373} 12/21/2021 15:34:58 - INFO - codeparrot_training - Step 306: {'lr': 0.0004999914718414517, 'samples': 157184, 'steps': 306, 'batch_loss/train': 1.0183395985513926} 12/21/2021 15:35:09 - INFO - codeparrot_training - Step 307: {'lr': 0.0004999913411443623, 'samples': 157696, 'steps': 307, 'batch_loss/train': 0.945149693172425} 12/21/2021 15:35:21 - INFO - codeparrot_training - Step 308: {'lr': 0.0004999912094534019, 'samples': 158208, 'steps': 308, 'batch_loss/train': 1.0343824811279774} 12/21/2021 15:35:32 - INFO - codeparrot_training - Step 309: {'lr': 0.0004999910767685705, 'samples': 158720, 'steps': 309, 'batch_loss/train': 0.8817191179841757} 12/21/2021 15:35:42 - INFO - codeparrot_training - Step 310: {'lr': 0.0004999909430898693, 'samples': 159232, 'steps': 310, 'batch_loss/train': 1.0754667315632105} 12/21/2021 15:35:55 - INFO - codeparrot_training - Step 311: {'lr': 0.0004999908084172982, 'samples': 159744, 'steps': 311, 'batch_loss/train': 0.8908740505576134} 12/21/2021 15:36:05 - INFO - codeparrot_training - Step 312: {'lr': 0.0004999906727508583, 'samples': 160256, 'steps': 312, 'batch_loss/train': 1.0570627879351377} 12/21/2021 15:36:16 - INFO - codeparrot_training - Step 313: {'lr': 0.0004999905360905497, 'samples': 160768, 'steps': 313, 'batch_loss/train': 0.9630824876949191} 12/21/2021 15:36:27 - INFO - codeparrot_training - Step 314: {'lr': 0.0004999903984363733, 'samples': 161280, 'steps': 314, 'batch_loss/train': 0.9628634955734015} 12/21/2021 15:36:39 - INFO - codeparrot_training - Step 315: {'lr': 0.0004999902597883294, 'samples': 161792, 'steps': 315, 'batch_loss/train': 1.0564095564186573} 12/21/2021 15:36:49 - INFO - codeparrot_training - Step 316: 
{'lr': 0.0004999901201464185, 'samples': 162304, 'steps': 316, 'batch_loss/train': 0.9426975306123495} 12/21/2021 15:37:00 - INFO - codeparrot_training - Step 317: {'lr': 0.0004999899795106414, 'samples': 162816, 'steps': 317, 'batch_loss/train': 0.9392783408984542} 12/21/2021 15:37:12 - INFO - codeparrot_training - Step 318: {'lr': 0.0004999898378809985, 'samples': 163328, 'steps': 318, 'batch_loss/train': 0.9344945745542645} 12/21/2021 15:37:22 - INFO - codeparrot_training - Step 319: {'lr': 0.0004999896952574904, 'samples': 163840, 'steps': 319, 'batch_loss/train': 0.9809899972751737} 12/21/2021 15:37:33 - INFO - codeparrot_training - Step 320: {'lr': 0.0004999895516401176, 'samples': 164352, 'steps': 320, 'batch_loss/train': 1.0894929263740778} 12/21/2021 15:37:46 - INFO - codeparrot_training - Step 321: {'lr': 0.0004999894070288808, 'samples': 164864, 'steps': 321, 'batch_loss/train': 0.9599240012466908} 12/21/2021 15:37:56 - INFO - codeparrot_training - Step 322: {'lr': 0.0004999892614237806, 'samples': 165376, 'steps': 322, 'batch_loss/train': 0.8795394534245133} 12/21/2021 15:38:07 - INFO - codeparrot_training - Step 323: {'lr': 0.0004999891148248174, 'samples': 165888, 'steps': 323, 'batch_loss/train': 0.9449589308351278} 12/21/2021 15:38:17 - INFO - codeparrot_training - Step 324: {'lr': 0.0004999889672319918, 'samples': 166400, 'steps': 324, 'batch_loss/train': 0.9080932587385178} 12/21/2021 15:38:30 - INFO - codeparrot_training - Step 325: {'lr': 0.0004999888186453046, 'samples': 166912, 'steps': 325, 'batch_loss/train': 1.0157170034945011} 12/21/2021 15:38:40 - INFO - codeparrot_training - Step 326: {'lr': 0.0004999886690647561, 'samples': 167424, 'steps': 326, 'batch_loss/train': 0.934957368299365} 12/21/2021 15:38:51 - INFO - codeparrot_training - Step 327: {'lr': 0.000499988518490347, 'samples': 167936, 'steps': 327, 'batch_loss/train': 0.8197665601037443} 12/21/2021 15:39:03 - INFO - codeparrot_training - Step 328: {'lr': 0.0004999883669220782, 
'samples': 168448, 'steps': 328, 'batch_loss/train': 0.9284519851207733} 12/21/2021 15:39:14 - INFO - codeparrot_training - Step 329: {'lr': 0.0004999882143599499, 'samples': 168960, 'steps': 329, 'batch_loss/train': 0.9782567238435149} 12/21/2021 15:39:25 - INFO - codeparrot_training - Step 330: {'lr': 0.0004999880608039628, 'samples': 169472, 'steps': 330, 'batch_loss/train': 1.0394312385469675} 12/21/2021 15:39:36 - INFO - codeparrot_training - Step 331: {'lr': 0.0004999879062541177, 'samples': 169984, 'steps': 331, 'batch_loss/train': 0.9363724086433649} 12/21/2021 15:39:47 - INFO - codeparrot_training - Step 332: {'lr': 0.000499987750710415, 'samples': 170496, 'steps': 332, 'batch_loss/train': 0.9781676828861237} 12/21/2021 15:39:58 - INFO - codeparrot_training - Step 333: {'lr': 0.0004999875941728554, 'samples': 171008, 'steps': 333, 'batch_loss/train': 0.9209995167329907} 12/21/2021 15:40:08 - INFO - codeparrot_training - Step 334: {'lr': 0.0004999874366414395, 'samples': 171520, 'steps': 334, 'batch_loss/train': 1.0243413960561156} 12/21/2021 15:40:20 - INFO - codeparrot_training - Step 335: {'lr': 0.0004999872781161679, 'samples': 172032, 'steps': 335, 'batch_loss/train': 1.0678168572485447} 12/21/2021 15:40:31 - INFO - codeparrot_training - Step 336: {'lr': 0.0004999871185970413, 'samples': 172544, 'steps': 336, 'batch_loss/train': 0.9590579112991691} 12/21/2021 15:40:41 - INFO - codeparrot_training - Step 337: {'lr': 0.0004999869580840603, 'samples': 173056, 'steps': 337, 'batch_loss/train': 0.9584408421069384} 12/21/2021 15:40:55 - INFO - codeparrot_training - Step 338: {'lr': 0.0004999867965772256, 'samples': 173568, 'steps': 338, 'batch_loss/train': 1.0288824839517474} 12/21/2021 15:41:05 - INFO - codeparrot_training - Step 339: {'lr': 0.0004999866340765378, 'samples': 174080, 'steps': 339, 'batch_loss/train': 1.0880566388368607} 12/21/2021 15:41:16 - INFO - codeparrot_training - Step 340: {'lr': 0.0004999864705819975, 'samples': 174592, 'steps': 340, 
'batch_loss/train': 1.0811644373461604} 12/21/2021 15:41:28 - INFO - codeparrot_training - Step 341: {'lr': 0.0004999863060936054, 'samples': 175104, 'steps': 341, 'batch_loss/train': 1.671540541574359} 12/21/2021 15:41:38 - INFO - codeparrot_training - Step 342: {'lr': 0.0004999861406113622, 'samples': 175616, 'steps': 342, 'batch_loss/train': 0.9352710619568825} 12/21/2021 15:41:49 - INFO - codeparrot_training - Step 343: {'lr': 0.0004999859741352683, 'samples': 176128, 'steps': 343, 'batch_loss/train': 1.0104450918734074} 12/21/2021 15:41:59 - INFO - codeparrot_training - Step 344: {'lr': 0.0004999858066653247, 'samples': 176640, 'steps': 344, 'batch_loss/train': 1.02015068102628} 12/21/2021 15:42:12 - INFO - codeparrot_training - Step 345: {'lr': 0.0004999856382015319, 'samples': 177152, 'steps': 345, 'batch_loss/train': 0.9521654164418578} 12/21/2021 15:42:23 - INFO - codeparrot_training - Step 346: {'lr': 0.0004999854687438906, 'samples': 177664, 'steps': 346, 'batch_loss/train': 0.9196355575695634} 12/21/2021 15:42:33 - INFO - codeparrot_training - Step 347: {'lr': 0.0004999852982924015, 'samples': 178176, 'steps': 347, 'batch_loss/train': 0.9492135313339531} 12/21/2021 15:42:45 - INFO - codeparrot_training - Step 348: {'lr': 0.0004999851268470652, 'samples': 178688, 'steps': 348, 'batch_loss/train': 1.071784246712923} 12/21/2021 15:42:56 - INFO - codeparrot_training - Step 349: {'lr': 0.0004999849544078823, 'samples': 179200, 'steps': 349, 'batch_loss/train': 1.1324431374669075} 12/21/2021 15:43:07 - INFO - codeparrot_training - Step 350: {'lr': 0.0004999847809748539, 'samples': 179712, 'steps': 350, 'batch_loss/train': 0.7901751389726996} 12/21/2021 15:43:19 - INFO - codeparrot_training - Step 351: {'lr': 0.0004999846065479802, 'samples': 180224, 'steps': 351, 'batch_loss/train': 0.9234656607732177} 12/21/2021 15:43:29 - INFO - codeparrot_training - Step 352: {'lr': 0.0004999844311272621, 'samples': 180736, 'steps': 352, 'batch_loss/train': 
0.9884148836135864} 12/21/2021 15:43:40 - INFO - codeparrot_training - Step 353: {'lr': 0.0004999842547127003, 'samples': 181248, 'steps': 353, 'batch_loss/train': 1.0030706701800227} 12/21/2021 15:43:50 - INFO - codeparrot_training - Step 354: {'lr': 0.0004999840773042956, 'samples': 181760, 'steps': 354, 'batch_loss/train': 0.930622486397624} 12/21/2021 15:44:02 - INFO - codeparrot_training - Step 355: {'lr': 0.0004999838989020486, 'samples': 182272, 'steps': 355, 'batch_loss/train': 0.9351348020136356} 12/21/2021 15:44:13 - INFO - codeparrot_training - Step 356: {'lr': 0.0004999837195059599, 'samples': 182784, 'steps': 356, 'batch_loss/train': 1.0733826979994774} 12/21/2021 15:44:24 - INFO - codeparrot_training - Step 357: {'lr': 0.0004999835391160303, 'samples': 183296, 'steps': 357, 'batch_loss/train': 1.010494620539248} 12/21/2021 15:44:36 - INFO - codeparrot_training - Step 358: {'lr': 0.0004999833577322607, 'samples': 183808, 'steps': 358, 'batch_loss/train': 0.9469421803951263} 12/21/2021 15:44:47 - INFO - codeparrot_training - Step 359: {'lr': 0.0004999831753546514, 'samples': 184320, 'steps': 359, 'batch_loss/train': 0.9487142357975245} 12/21/2021 15:44:58 - INFO - codeparrot_training - Step 360: {'lr': 0.0004999829919832036, 'samples': 184832, 'steps': 360, 'batch_loss/train': 0.9500178750604391} 12/21/2021 15:45:10 - INFO - codeparrot_training - Step 361: {'lr': 0.0004999828076179178, 'samples': 185344, 'steps': 361, 'batch_loss/train': 0.9616537699475884} 12/21/2021 15:45:20 - INFO - codeparrot_training - Step 362: {'lr': 0.0004999826222587947, 'samples': 185856, 'steps': 362, 'batch_loss/train': 0.9094876050949097} 12/21/2021 15:45:31 - INFO - codeparrot_training - Step 363: {'lr': 0.000499982435905835, 'samples': 186368, 'steps': 363, 'batch_loss/train': 1.0322210416197777} 12/21/2021 15:45:43 - INFO - codeparrot_training - Step 364: {'lr': 0.0004999822485590396, 'samples': 186880, 'steps': 364, 'batch_loss/train': 1.1324112638831139} 12/21/2021 
15:45:53 - INFO - codeparrot_training - Step 365: {'lr': 0.0004999820602184092, 'samples': 187392, 'steps': 365, 'batch_loss/train': 0.9534476650878787} 12/21/2021 15:46:04 - INFO - codeparrot_training - Step 366: {'lr': 0.0004999818708839445, 'samples': 187904, 'steps': 366, 'batch_loss/train': 1.0070430897176266} 12/21/2021 15:46:15 - INFO - codeparrot_training - Step 367: {'lr': 0.0004999816805556463, 'samples': 188416, 'steps': 367, 'batch_loss/train': 0.8387978570535779} 12/21/2021 15:46:27 - INFO - codeparrot_training - Step 368: {'lr': 0.0004999814892335153, 'samples': 188928, 'steps': 368, 'batch_loss/train': 0.9353106217458844} 12/21/2021 15:46:38 - INFO - codeparrot_training - Step 369: {'lr': 0.0004999812969175522, 'samples': 189440, 'steps': 369, 'batch_loss/train': 0.9284348818473518} 12/21/2021 15:46:49 - INFO - codeparrot_training - Step 370: {'lr': 0.0004999811036077579, 'samples': 189952, 'steps': 370, 'batch_loss/train': 0.9561633160337806} 12/21/2021 15:47:01 - INFO - codeparrot_training - Step 371: {'lr': 0.0004999809093041332, 'samples': 190464, 'steps': 371, 'batch_loss/train': 0.9692120300605893} 12/21/2021 15:47:11 - INFO - codeparrot_training - Step 372: {'lr': 0.0004999807140066788, 'samples': 190976, 'steps': 372, 'batch_loss/train': 1.0437386203557253} 12/21/2021 15:47:22 - INFO - codeparrot_training - Step 373: {'lr': 0.0004999805177153954, 'samples': 191488, 'steps': 373, 'batch_loss/train': 1.045035165734589} 12/21/2021 15:47:34 - INFO - codeparrot_training - Step 374: {'lr': 0.0004999803204302839, 'samples': 192000, 'steps': 374, 'batch_loss/train': 0.8727764482609928} 12/21/2021 15:47:44 - INFO - codeparrot_training - Step 375: {'lr': 0.0004999801221513449, 'samples': 192512, 'steps': 375, 'batch_loss/train': 0.8536123684607446} 12/21/2021 15:47:55 - INFO - codeparrot_training - Step 376: {'lr': 0.0004999799228785796, 'samples': 193024, 'steps': 376, 'batch_loss/train': 0.9042056417092681} 12/21/2021 15:48:05 - INFO - 
codeparrot_training - Step 377: {'lr': 0.0004999797226119882, 'samples': 193536, 'steps': 377, 'batch_loss/train': 0.9008130421862006} 12/21/2021 15:48:18 - INFO - codeparrot_training - Step 378: {'lr': 0.0004999795213515719, 'samples': 194048, 'steps': 378, 'batch_loss/train': 0.9194910684600472} 12/21/2021 15:48:29 - INFO - codeparrot_training - Step 379: {'lr': 0.0004999793190973316, 'samples': 194560, 'steps': 379, 'batch_loss/train': 0.9500436671078205} 12/21/2021 15:48:39 - INFO - codeparrot_training - Step 380: {'lr': 0.0004999791158492678, 'samples': 195072, 'steps': 380, 'batch_loss/train': 0.9273672616109252} 12/21/2021 15:48:51 - INFO - codeparrot_training - Step 381: {'lr': 0.0004999789116073814, 'samples': 195584, 'steps': 381, 'batch_loss/train': 1.0207349183037877} 12/21/2021 15:49:02 - INFO - codeparrot_training - Step 382: {'lr': 0.0004999787063716732, 'samples': 196096, 'steps': 382, 'batch_loss/train': 0.9625328816473484} 12/21/2021 15:49:13 - INFO - codeparrot_training - Step 383: {'lr': 0.000499978500142144, 'samples': 196608, 'steps': 383, 'batch_loss/train': 0.7469302010722458} 12/21/2021 15:49:25 - INFO - codeparrot_training - Step 384: {'lr': 0.0004999782929187948, 'samples': 197120, 'steps': 384, 'batch_loss/train': 0.9214018918573856} 12/21/2021 15:49:36 - INFO - codeparrot_training - Step 385: {'lr': 0.0004999780847016263, 'samples': 197632, 'steps': 385, 'batch_loss/train': 1.0566344875842333} 12/21/2021 15:49:46 - INFO - codeparrot_training - Step 386: {'lr': 0.0004999778754906394, 'samples': 198144, 'steps': 386, 'batch_loss/train': 0.9904032200574875} 12/21/2021 15:49:58 - INFO - codeparrot_training - Step 387: {'lr': 0.0004999776652858346, 'samples': 198656, 'steps': 387, 'batch_loss/train': 1.0329768899828196} 12/21/2021 15:50:09 - INFO - codeparrot_training - Step 388: {'lr': 0.0004999774540872132, 'samples': 199168, 'steps': 388, 'batch_loss/train': 0.9588639475405216} 12/21/2021 15:50:20 - INFO - codeparrot_training - Step 389: 
{'lr': 0.0004999772418947758, 'samples': 199680, 'steps': 389, 'batch_loss/train': 0.9446934647858143}
12/21/2021 15:50:30 - INFO - codeparrot_training - Step 390: {'lr': 0.0004999770287085232, 'samples': 200192, 'steps': 390, 'batch_loss/train': 0.9255288196727633}
12/21/2021 15:50:42 - INFO - codeparrot_training - Step 391: {'lr': 0.0004999768145284564, 'samples': 200704, 'steps': 391, 'batch_loss/train': 0.9534507757052779}
12/21/2021 15:50:53 - INFO - codeparrot_training - Step 392: {'lr': 0.000499976599354576, 'samples': 201216, 'steps': 392, 'batch_loss/train': 1.0267576836049557}
12/21/2021 15:51:03 - INFO - codeparrot_training - Step 393: {'lr': 0.0004999763831868832, 'samples': 201728, 'steps': 393, 'batch_loss/train': 0.9819985963404179}
12/21/2021 15:51:16 - INFO - codeparrot_training - Step 394: {'lr': 0.0004999761660253786, 'samples': 202240, 'steps': 394, 'batch_loss/train': 0.9275947390124202}
12/21/2021 15:51:26 - INFO - codeparrot_training - Step 395: {'lr': 0.0004999759478700632, 'samples': 202752, 'steps': 395, 'batch_loss/train': 0.9359030965715647}
12/21/2021 15:51:37 - INFO - codeparrot_training - Step 396: {'lr': 0.0004999757287209379, 'samples': 203264, 'steps': 396, 'batch_loss/train': 0.9917447231709957}
12/21/2021 15:51:50 - INFO - codeparrot_training - Step 397: {'lr': 0.0004999755085780033, 'samples': 203776, 'steps': 397, 'batch_loss/train': 0.8854732243344188}
12/21/2021 15:52:00 - INFO - codeparrot_training - Step 398: {'lr': 0.0004999752874412606, 'samples': 204288, 'steps': 398, 'batch_loss/train': 0.9453186276368797}
12/21/2021 15:52:11 - INFO - codeparrot_training - Step 399: {'lr': 0.0004999750653107106, 'samples': 204800, 'steps': 399, 'batch_loss/train': 0.9650594405829906}
12/21/2021 15:52:21 - INFO - codeparrot_training - Step 400: {'lr': 0.000499974842186354, 'samples': 205312, 'steps': 400, 'batch_loss/train': 0.9882339080795646}
12/21/2021 15:52:33 - INFO - codeparrot_training - Step 401: {'lr': 0.0004999746180681918, 'samples': 205824, 'steps': 401, 'batch_loss/train': 0.8767112297937274}
12/21/2021 15:52:44 - INFO - codeparrot_training - Step 402: {'lr': 0.000499974392956225, 'samples': 206336, 'steps': 402, 'batch_loss/train': 0.9569975165650249}
12/21/2021 15:52:54 - INFO - codeparrot_training - Step 403: {'lr': 0.0004999741668504544, 'samples': 206848, 'steps': 403, 'batch_loss/train': 0.7826286354102194}
12/21/2021 15:53:06 - INFO - codeparrot_training - Step 404: {'lr': 0.0004999739397508808, 'samples': 207360, 'steps': 404, 'batch_loss/train': 0.9592352164909244}
12/21/2021 15:53:17 - INFO - codeparrot_training - Step 405: {'lr': 0.0004999737116575053, 'samples': 207872, 'steps': 405, 'batch_loss/train': 1.034396498464048}
12/21/2021 15:53:28 - INFO - codeparrot_training - Step 406: {'lr': 0.0004999734825703287, 'samples': 208384, 'steps': 406, 'batch_loss/train': 1.0085913049988449}
12/21/2021 15:53:40 - INFO - codeparrot_training - Step 407: {'lr': 0.0004999732524893518, 'samples': 208896, 'steps': 407, 'batch_loss/train': 0.961815488524735}
12/21/2021 15:53:51 - INFO - codeparrot_training - Step 408: {'lr': 0.0004999730214145758, 'samples': 209408, 'steps': 408, 'batch_loss/train': 0.9491942655295134}
12/21/2021 15:54:01 - INFO - codeparrot_training - Step 409: {'lr': 0.0004999727893460013, 'samples': 209920, 'steps': 409, 'batch_loss/train': 0.973499272018671}
12/21/2021 15:54:12 - INFO - codeparrot_training - Step 410: {'lr': 0.0004999725562836295, 'samples': 210432, 'steps': 410, 'batch_loss/train': 0.9724812339991331}
12/21/2021 15:54:24 - INFO - codeparrot_training - Step 411: {'lr': 0.0004999723222274612, 'samples': 210944, 'steps': 411, 'batch_loss/train': 0.8766064941883087}
12/21/2021 15:54:35 - INFO - codeparrot_training - Step 412: {'lr': 0.0004999720871774972, 'samples': 211456, 'steps': 412, 'batch_loss/train': 0.7345240158028901}
12/21/2021 15:54:46 - INFO - codeparrot_training - Step 413: {'lr': 0.0004999718511337387, 'samples': 211968, 'steps': 413, 'batch_loss/train': 0.9546373225748539}
12/21/2021 15:54:58 - INFO - codeparrot_training - Step 414: {'lr': 0.0004999716140961864, 'samples': 212480, 'steps': 414, 'batch_loss/train': 1.066631537862122}
12/21/2021 15:55:09 - INFO - codeparrot_training - Step 415: {'lr': 0.0004999713760648415, 'samples': 212992, 'steps': 415, 'batch_loss/train': 0.929376820102334}
12/21/2021 15:55:19 - INFO - codeparrot_training - Step 416: {'lr': 0.0004999711370397047, 'samples': 213504, 'steps': 416, 'batch_loss/train': 0.9020446855574846}
12/21/2021 15:55:31 - INFO - codeparrot_training - Step 417: {'lr': 0.0004999708970207771, 'samples': 214016, 'steps': 417, 'batch_loss/train': 0.9906241353601217}
12/21/2021 15:55:42 - INFO - codeparrot_training - Step 418: {'lr': 0.0004999706560080595, 'samples': 214528, 'steps': 418, 'batch_loss/train': 0.9769453471526504}
12/21/2021 15:55:53 - INFO - codeparrot_training - Step 419: {'lr': 0.000499970414001553, 'samples': 215040, 'steps': 419, 'batch_loss/train': 0.9733418663963675}
12/21/2021 15:56:04 - INFO - codeparrot_training - Step 420: {'lr': 0.0004999701710012585, 'samples': 215552, 'steps': 420, 'batch_loss/train': 0.9755980726331472}
12/21/2021 15:56:15 - INFO - codeparrot_training - Step 421: {'lr': 0.000499969927007177, 'samples': 216064, 'steps': 421, 'batch_loss/train': 0.8809562092646956}
12/21/2021 15:56:26 - INFO - codeparrot_training - Step 422: {'lr': 0.0004999696820193095, 'samples': 216576, 'steps': 422, 'batch_loss/train': 0.9990965034812689}
12/21/2021 15:56:36 - INFO - codeparrot_training - Step 423: {'lr': 0.0004999694360376569, 'samples': 217088, 'steps': 423, 'batch_loss/train': 0.986396006308496}
12/21/2021 15:56:49 - INFO - codeparrot_training - Step 424: {'lr': 0.0004999691890622202, 'samples': 217600, 'steps': 424, 'batch_loss/train': 1.0693664755672216}
12/21/2021 15:56:59 - INFO - codeparrot_training - Step 425: {'lr': 0.0004999689410930003, 'samples': 218112, 'steps': 425, 'batch_loss/train': 0.8571261004544795}
12/21/2021 15:57:10 - INFO - codeparrot_training - Step 426: {'lr': 0.0004999686921299984, 'samples': 218624, 'steps': 426, 'batch_loss/train': 0.8347470150329173}
12/21/2021 15:57:22 - INFO - codeparrot_training - Step 427: {'lr': 0.0004999684421732153, 'samples': 219136, 'steps': 427, 'batch_loss/train': 0.8844667943194509}
12/21/2021 15:57:33 - INFO - codeparrot_training - Step 428: {'lr': 0.0004999681912226521, 'samples': 219648, 'steps': 428, 'batch_loss/train': 0.9169071968644857}
12/21/2021 15:57:43 - INFO - codeparrot_training - Step 429: {'lr': 0.0004999679392783098, 'samples': 220160, 'steps': 429, 'batch_loss/train': 1.0903414757922292}
12/21/2021 15:57:55 - INFO - codeparrot_training - Step 430: {'lr': 0.0004999676863401893, 'samples': 220672, 'steps': 430, 'batch_loss/train': 0.9996788166463375}
12/21/2021 15:58:06 - INFO - codeparrot_training - Step 431: {'lr': 0.0004999674324082916, 'samples': 221184, 'steps': 431, 'batch_loss/train': 0.9633755311369896}
12/21/2021 15:58:16 - INFO - codeparrot_training - Step 432: {'lr': 0.0004999671774826179, 'samples': 221696, 'steps': 432, 'batch_loss/train': 0.8803270729258657}
12/21/2021 15:58:27 - INFO - codeparrot_training - Step 433: {'lr': 0.000499966921563169, 'samples': 222208, 'steps': 433, 'batch_loss/train': 1.026193905621767}
12/21/2021 15:58:41 - INFO - codeparrot_training - Step 434: {'lr': 0.000499966664649946, 'samples': 222720, 'steps': 434, 'batch_loss/train': 0.9324929639697075}
12/21/2021 15:58:51 - INFO - codeparrot_training - Step 435: {'lr': 0.0004999664067429499, 'samples': 223232, 'steps': 435, 'batch_loss/train': 1.0224649542942643}
12/21/2021 15:59:02 - INFO - codeparrot_training - Step 436: {'lr': 0.0004999661478421819, 'samples': 223744, 'steps': 436, 'batch_loss/train': 0.8969925502315164}
12/21/2021 15:59:14 - INFO - codeparrot_training - Step 437: {'lr': 0.0004999658879476427, 'samples': 224256, 'steps': 437, 'batch_loss/train': 0.9263608362525702}
12/21/2021 15:59:24 - INFO - codeparrot_training - Step 438: {'lr': 0.0004999656270593336, 'samples': 224768, 'steps': 438, 'batch_loss/train': 0.9655403601936996}
12/21/2021 15:59:35 - INFO - codeparrot_training - Step 439: {'lr': 0.0004999653651772555, 'samples': 225280, 'steps': 439, 'batch_loss/train': 0.9194264309480786}
12/21/2021 15:59:47 - INFO - codeparrot_training - Step 440: {'lr': 0.0004999651023014095, 'samples': 225792, 'steps': 440, 'batch_loss/train': 0.9931161850690842}
12/21/2021 15:59:57 - INFO - codeparrot_training - Step 441: {'lr': 0.0004999648384317967, 'samples': 226304, 'steps': 441, 'batch_loss/train': 0.9761939644813538}
12/21/2021 16:00:08 - INFO - codeparrot_training - Step 442: {'lr': 0.0004999645735684181, 'samples': 226816, 'steps': 442, 'batch_loss/train': 0.9765765778720379}
12/21/2021 16:00:19 - INFO - codeparrot_training - Step 443: {'lr': 0.0004999643077112747, 'samples': 227328, 'steps': 443, 'batch_loss/train': 0.9984427690505981}
12/21/2021 16:00:32 - INFO - codeparrot_training - Step 444: {'lr': 0.0004999640408603676, 'samples': 227840, 'steps': 444, 'batch_loss/train': 0.8500730954110622}
12/21/2021 16:00:42 - INFO - codeparrot_training - Step 445: {'lr': 0.0004999637730156979, 'samples': 228352, 'steps': 445, 'batch_loss/train': 0.8858804395422339}
12/21/2021 16:00:53 - INFO - codeparrot_training - Step 446: {'lr': 0.0004999635041772665, 'samples': 228864, 'steps': 446, 'batch_loss/train': 0.8956717094406486}
12/21/2021 16:01:05 - INFO - codeparrot_training - Step 447: {'lr': 0.0004999632343450747, 'samples': 229376, 'steps': 447, 'batch_loss/train': 1.0046755797229707}
12/21/2021 16:01:16 - INFO - codeparrot_training - Step 448: {'lr': 0.0004999629635191234, 'samples': 229888, 'steps': 448, 'batch_loss/train': 0.7434851163998246}
12/21/2021 16:01:26 - INFO - codeparrot_training - Step 449: {'lr': 0.0004999626916994137, 'samples': 230400, 'steps': 449, 'batch_loss/train': 0.9022434931248426}
12/21/2021 16:01:38 - INFO - codeparrot_training - Step 450: {'lr': 0.0004999624188859468, 'samples': 230912, 'steps': 450, 'batch_loss/train': 0.9132275562733412}
12/21/2021 16:01:49 - INFO - codeparrot_training - Step 451: {'lr': 0.0004999621450787237, 'samples': 231424, 'steps': 451, 'batch_loss/train': 0.8767648488283157}
12/21/2021 16:02:00 - INFO - codeparrot_training - Step 452: {'lr': 0.0004999618702777455, 'samples': 231936, 'steps': 452, 'batch_loss/train': 0.8904498647898436}
12/21/2021 16:02:12 - INFO - codeparrot_training - Step 453: {'lr': 0.0004999615944830132, 'samples': 232448, 'steps': 453, 'batch_loss/train': 0.9733011778444052}
12/21/2021 16:02:23 - INFO - codeparrot_training - Step 454: {'lr': 0.0004999613176945282, 'samples': 232960, 'steps': 454, 'batch_loss/train': 0.8590325242839754}
12/21/2021 16:02:34 - INFO - codeparrot_training - Step 455: {'lr': 0.0004999610399122912, 'samples': 233472, 'steps': 455, 'batch_loss/train': 0.840726169757545}
12/21/2021 16:02:44 - INFO - codeparrot_training - Step 456: {'lr': 0.0004999607611363035, 'samples': 233984, 'steps': 456, 'batch_loss/train': 0.9300893796607852}
12/21/2021 16:02:56 - INFO - codeparrot_training - Step 457: {'lr': 0.0004999604813665662, 'samples': 234496, 'steps': 457, 'batch_loss/train': 0.9475970091298223}
12/21/2021 16:03:07 - INFO - codeparrot_training - Step 458: {'lr': 0.0004999602006030804, 'samples': 235008, 'steps': 458, 'batch_loss/train': 1.0036976113915443}
12/21/2021 16:03:17 - INFO - codeparrot_training - Step 459: {'lr': 0.0004999599188458472, 'samples': 235520, 'steps': 459, 'batch_loss/train': 0.9122916385531425}
12/21/2021 16:03:30 - INFO - codeparrot_training - Step 460: {'lr': 0.0004999596360948679, 'samples': 236032, 'steps': 460, 'batch_loss/train': 0.9647639291360974}
12/21/2021 16:03:41 - INFO - codeparrot_training - Step 461: {'lr': 0.0004999593523501433, 'samples': 236544, 'steps': 461, 'batch_loss/train': 0.9536460340023041}
12/21/2021 16:03:51 - INFO - codeparrot_training - Step 462: {'lr': 0.0004999590676116747, 'samples': 237056, 'steps': 462, 'batch_loss/train': 0.8009715909138322}
12/21/2021 16:04:03 - INFO - codeparrot_training - Step 463: {'lr': 0.0004999587818794633, 'samples': 237568, 'steps': 463, 'batch_loss/train': 0.9024169323965907}
12/21/2021 16:04:14 - INFO - codeparrot_training - Step 464: {'lr': 0.0004999584951535102, 'samples': 238080, 'steps': 464, 'batch_loss/train': 0.8396811774000525}
12/21/2021 16:04:24 - INFO - codeparrot_training - Step 465: {'lr': 0.0004999582074338163, 'samples': 238592, 'steps': 465, 'batch_loss/train': 0.928785752505064}
12/21/2021 16:04:37 - INFO - codeparrot_training - Step 466: {'lr': 0.0004999579187203832, 'samples': 239104, 'steps': 466, 'batch_loss/train': 0.9804435735568404}
12/21/2021 16:04:47 - INFO - codeparrot_training - Step 467: {'lr': 0.0004999576290132117, 'samples': 239616, 'steps': 467, 'batch_loss/train': 0.9512661406770349}
12/21/2021 16:04:58 - INFO - codeparrot_training - Step 468: {'lr': 0.0004999573383123031, 'samples': 240128, 'steps': 468, 'batch_loss/train': 0.9538242537528276}
12/21/2021 16:05:08 - INFO - codeparrot_training - Step 469: {'lr': 0.0004999570466176584, 'samples': 240640, 'steps': 469, 'batch_loss/train': 0.9624655842781067}
12/21/2021 16:05:21 - INFO - codeparrot_training - Step 470: {'lr': 0.0004999567539292789, 'samples': 241152, 'steps': 470, 'batch_loss/train': 0.9739622380584478}
12/21/2021 16:05:32 - INFO - codeparrot_training - Step 471: {'lr': 0.0004999564602471657, 'samples': 241664, 'steps': 471, 'batch_loss/train': 0.9446419794112444}
12/21/2021 16:05:42 - INFO - codeparrot_training - Step 472: {'lr': 0.00049995616557132, 'samples': 242176, 'steps': 472, 'batch_loss/train': 0.9030982083640993}
12/21/2021 16:05:54 - INFO - codeparrot_training - Step 473: {'lr': 0.000499955869901743, 'samples': 242688, 'steps': 473, 'batch_loss/train': 0.8914791769348085}
12/21/2021 16:06:05 - INFO - codeparrot_training - Step 474: {'lr': 0.0004999555732384358, 'samples': 243200, 'steps': 474, 'batch_loss/train': 0.9041023524478078}
12/21/2021 16:06:16 - INFO - codeparrot_training - Step 475: {'lr': 0.0004999552755813997, 'samples': 243712, 'steps': 475, 'batch_loss/train': 0.9548308560624719}
12/21/2021 16:06:28 - INFO - codeparrot_training - Step 476: {'lr': 0.0004999549769306357, 'samples': 244224, 'steps': 476, 'batch_loss/train': 0.9872032226994634}
12/21/2021 16:06:38 - INFO - codeparrot_training - Step 477: {'lr': 0.0004999546772861452, 'samples': 244736, 'steps': 477, 'batch_loss/train': 0.9012071953620762}
12/21/2021 16:06:49 - INFO - codeparrot_training - Step 478: {'lr': 0.0004999543766479291, 'samples': 245248, 'steps': 478, 'batch_loss/train': 0.9515836574137211}
12/21/2021 16:06:59 - INFO - codeparrot_training - Step 479: {'lr': 0.000499954075015989, 'samples': 245760, 'steps': 479, 'batch_loss/train': 0.780715158674866}
12/21/2021 16:07:11 - INFO - codeparrot_training - Step 480: {'lr': 0.0004999537723903258, 'samples': 246272, 'steps': 480, 'batch_loss/train': 0.8905156045220792}
12/21/2021 16:07:22 - INFO - codeparrot_training - Step 481: {'lr': 0.0004999534687709407, 'samples': 246784, 'steps': 481, 'batch_loss/train': 0.8681614296510816}
12/21/2021 16:07:32 - INFO - codeparrot_training - Step 482: {'lr': 0.0004999531641578351, 'samples': 247296, 'steps': 482, 'batch_loss/train': 0.929211019538343}
12/21/2021 16:07:46 - INFO - codeparrot_training - Step 483: {'lr': 0.00049995285855101, 'samples': 247808, 'steps': 483, 'batch_loss/train': 0.9394317781552672}
12/21/2021 16:07:56 - INFO - codeparrot_training - Step 484: {'lr': 0.0004999525519504668, 'samples': 248320, 'steps': 484, 'batch_loss/train': 0.955608868971467}
12/21/2021 16:08:07 - INFO - codeparrot_training - Step 485: {'lr': 0.0004999522443562066, 'samples': 248832, 'steps': 485, 'batch_loss/train': 0.8707458833232522}
12/21/2021 16:08:17 - INFO - codeparrot_training - Step 486: {'lr': 0.0004999519357682306, 'samples': 249344, 'steps': 486, 'batch_loss/train': 1.1621080995246302}
12/21/2021 16:08:29 - INFO - codeparrot_training - Step 487: {'lr': 0.0004999516261865401, 'samples': 249856, 'steps': 487, 'batch_loss/train': 0.8691662596538663}
12/21/2021 16:08:40 - INFO - codeparrot_training - Step 488: {'lr': 0.0004999513156111364, 'samples': 250368, 'steps': 488, 'batch_loss/train': 0.9347028806805611}
12/21/2021 16:08:51 - INFO - codeparrot_training - Step 489: {'lr': 0.0004999510040420206, 'samples': 250880, 'steps': 489, 'batch_loss/train': 0.9338290086016059}
12/21/2021 16:09:03 - INFO - codeparrot_training - Step 490: {'lr': 0.000499950691479194, 'samples': 251392, 'steps': 490, 'batch_loss/train': 1.0669192932546139}
12/21/2021 16:09:14 - INFO - codeparrot_training - Step 491: {'lr': 0.0004999503779226578, 'samples': 251904, 'steps': 491, 'batch_loss/train': 1.0738368043676019}
12/21/2021 16:09:25 - INFO - codeparrot_training - Step 492: {'lr': 0.0004999500633724132, 'samples': 252416, 'steps': 492, 'batch_loss/train': 0.9119481407105923}
12/21/2021 16:09:37 - INFO - codeparrot_training - Step 493: {'lr': 0.0004999497478284617, 'samples': 252928, 'steps': 493, 'batch_loss/train': 0.7464928608387709}
12/21/2021 16:09:47 - INFO - codeparrot_training - Step 494: {'lr': 0.0004999494312908043, 'samples': 253440, 'steps': 494, 'batch_loss/train': 0.7216577949002385}
12/21/2021 16:09:58 - INFO - codeparrot_training - Step 495: {'lr': 0.0004999491137594423, 'samples': 253952, 'steps': 495, 'batch_loss/train': 0.9382506683468819}
12/21/2021 16:10:10 - INFO - codeparrot_training - Step 496: {'lr': 0.000499948795234377, 'samples': 254464, 'steps': 496, 'batch_loss/train': 1.0057146502658725}
12/21/2021 16:10:21 - INFO - codeparrot_training - Step 497: {'lr': 0.0004999484757156097, 'samples': 254976, 'steps': 497, 'batch_loss/train': 0.9702515061944723}
12/21/2021 16:10:31 - INFO - codeparrot_training - Step 498: {'lr': 0.0004999481552031415, 'samples': 255488, 'steps': 498, 'batch_loss/train': 0.8413978372700512}
12/21/2021 16:10:42 - INFO - codeparrot_training - Step 499: {'lr': 0.000499947833696974, 'samples': 256000, 'steps': 499, 'batch_loss/train': 1.068810211494565}
12/21/2021 16:10:54 - INFO - codeparrot_training - Step 500: {'lr': 0.0004999475111971082, 'samples': 256512, 'steps': 500, 'batch_loss/train': 0.9169547948986292}
12/21/2021 16:11:05 - INFO - codeparrot_training - Step 501: {'lr': 0.0004999471877035456, 'samples': 257024, 'steps': 501, 'batch_loss/train': 0.9324342245236039}
12/21/2021 16:11:15 - INFO - codeparrot_training - Step 502: {'lr': 0.0004999468632162872, 'samples': 257536, 'steps': 502, 'batch_loss/train': 0.9523064000532031}
12/21/2021 16:11:27 - INFO - codeparrot_training - Step 503: {'lr': 0.0004999465377353345, 'samples': 258048, 'steps': 503, 'batch_loss/train': 0.9037008536979556}
12/21/2021 16:11:38 - INFO - codeparrot_training - Step 504: {'lr': 0.0004999462112606888, 'samples': 258560, 'steps': 504, 'batch_loss/train': 0.8526471545919776}
12/21/2021 16:11:48 - INFO - codeparrot_training - Step 505: {'lr': 0.0004999458837923513, 'samples': 259072, 'steps': 505, 'batch_loss/train': 0.9257241990417242}
12/21/2021 16:12:01 - INFO - codeparrot_training - Step 506: {'lr': 0.0004999455553303233, 'samples': 259584, 'steps': 506, 'batch_loss/train': 0.907796804793179}
12/21/2021 16:12:11 - INFO - codeparrot_training - Step 507: {'lr': 0.0004999452258746061, 'samples': 260096, 'steps': 507, 'batch_loss/train': 0.9255656283348799}
12/21/2021 16:12:22 - INFO - codeparrot_training - Step 508: {'lr': 0.0004999448954252012, 'samples': 260608, 'steps': 508, 'batch_loss/train': 0.8958309143781662}
12/21/2021 16:12:34 - INFO - codeparrot_training - Step 509: {'lr': 0.0004999445639821097, 'samples': 261120, 'steps': 509, 'batch_loss/train': 0.8369890097528696}
12/21/2021 16:12:45 - INFO - codeparrot_training - Step 510: {'lr': 0.0004999442315453329, 'samples': 261632, 'steps': 510, 'batch_loss/train': 0.9495102781802416}
12/21/2021 16:12:55 - INFO - codeparrot_training - Step 511: {'lr': 0.0004999438981148724, 'samples': 262144, 'steps': 511, 'batch_loss/train': 0.9727047346532345}
12/21/2021 16:13:06 - INFO - codeparrot_training - Step 512: {'lr': 0.0004999435636907293, 'samples': 262656, 'steps': 512, 'batch_loss/train': 0.9792565805837512}
12/21/2021 16:13:19 - INFO - codeparrot_training - Step 513: {'lr': 0.0004999432282729048, 'samples': 263168, 'steps': 513, 'batch_loss/train': 0.7626077516470104}
12/21/2021 16:13:29 - INFO - codeparrot_training - Step 514: {'lr': 0.0004999428918614006, 'samples': 263680, 'steps': 514, 'batch_loss/train': 0.8711663307622075}
12/21/2021 16:13:40 - INFO - codeparrot_training - Step 515: {'lr': 0.0004999425544562178, 'samples': 264192, 'steps': 515, 'batch_loss/train': 0.9464005529880524}
12/21/2021 16:13:52 - INFO - codeparrot_training - Step 516: {'lr': 0.0004999422160573575, 'samples': 264704, 'steps': 516, 'batch_loss/train': 0.8990733688697219}
12/21/2021 16:14:03 - INFO - codeparrot_training - Step 517: {'lr': 0.0004999418766648216, 'samples': 265216, 'steps': 517, 'batch_loss/train': 0.968270679935813}
12/21/2021 16:14:13 - INFO - codeparrot_training - Step 518: {'lr': 0.0004999415362786112, 'samples': 265728, 'steps': 518, 'batch_loss/train': 1.0139030292630196}
12/21/2021 16:14:26 - INFO - codeparrot_training - Step 519: {'lr': 0.0004999411948987275, 'samples': 266240, 'steps': 519, 'batch_loss/train': 0.8389423368498683}
12/21/2021 16:14:37 - INFO - codeparrot_training - Step 520: {'lr': 0.000499940852525172, 'samples': 266752, 'steps': 520, 'batch_loss/train': 0.9152657091617584}
12/21/2021 16:14:47 - INFO - codeparrot_training - Step 521: {'lr': 0.0004999405091579461, 'samples': 267264, 'steps': 521, 'batch_loss/train': 0.718719408614561}
12/21/2021 16:14:58 - INFO - codeparrot_training - Step 522: {'lr': 0.000499940164797051, 'samples': 267776, 'steps': 522, 'batch_loss/train': 0.8578516519628465}
12/21/2021 16:15:10 - INFO - codeparrot_training - Step 523: {'lr': 0.0004999398194424884, 'samples': 268288, 'steps': 523, 'batch_loss/train': 0.9500483646988869}
12/21/2021 16:15:20 - INFO - codeparrot_training - Step 524: {'lr': 0.0004999394730942592, 'samples': 268800, 'steps': 524, 'batch_loss/train': 0.880490711890161}
12/21/2021 16:15:31 - INFO - codeparrot_training - Step 525: {'lr': 0.0004999391257523653, 'samples': 269312, 'steps': 525, 'batch_loss/train': 0.9496993944048882}
12/21/2021 16:15:43 - INFO - codeparrot_training - Step 526: {'lr': 0.0004999387774168076, 'samples': 269824, 'steps': 526, 'batch_loss/train': 0.9183698203414679}
12/21/2021 16:15:53 - INFO - codeparrot_training - Step 527: {'lr': 0.0004999384280875877, 'samples': 270336, 'steps': 527, 'batch_loss/train': 0.8676091292873025}
12/21/2021 16:16:04 - INFO - codeparrot_training - Step 528: {'lr': 0.000499938077764707, 'samples': 270848, 'steps': 528, 'batch_loss/train': 0.8968275524675846}
12/21/2021 16:16:17 - INFO - codeparrot_training - Step 529: {'lr': 0.0004999377264481669, 'samples': 271360, 'steps': 529, 'batch_loss/train': 0.8513631895184517}
12/21/2021 16:16:27 - INFO - codeparrot_training - Step 530: {'lr': 0.0004999373741379688, 'samples': 271872, 'steps': 530, 'batch_loss/train': 0.8707464477047324}
12/21/2021 16:16:38 - INFO - codeparrot_training - Step 531: {'lr': 0.000499937020834114, 'samples': 272384, 'steps': 531, 'batch_loss/train': 0.9676802353933454}
12/21/2021 16:16:50 - INFO - codeparrot_training - Step 532: {'lr': 0.0004999366665366041, 'samples': 272896, 'steps': 532, 'batch_loss/train': 1.0208979127928615}
12/21/2021 16:17:01 - INFO - codeparrot_training - Step 533: {'lr': 0.0004999363112454402, 'samples': 273408, 'steps': 533, 'batch_loss/train': 0.7659446746110916}
12/21/2021 16:17:11 - INFO - codeparrot_training - Step 534: {'lr': 0.0004999359549606241, 'samples': 273920, 'steps': 534, 'batch_loss/train': 0.9105313969776034}
12/21/2021 16:17:22 - INFO - codeparrot_training - Step 535: {'lr': 0.0004999355976821569, 'samples': 274432, 'steps': 535, 'batch_loss/train': 0.9344321079552174}
12/21/2021 16:17:34 - INFO - codeparrot_training - Step 536: {'lr': 0.0004999352394100401, 'samples': 274944, 'steps': 536, 'batch_loss/train': 0.8789194077253342}
12/21/2021 16:17:44 - INFO - codeparrot_training - Step 537: {'lr': 0.0004999348801442753, 'samples': 275456, 'steps': 537, 'batch_loss/train': 0.7410235283896327}
12/21/2021 16:17:55 - INFO - codeparrot_training - Step 538: {'lr': 0.0004999345198848637, 'samples': 275968, 'steps': 538, 'batch_loss/train': 0.8595439791679382}
12/21/2021 16:18:08 - INFO - codeparrot_training - Step 539: {'lr': 0.0004999341586318067, 'samples': 276480, 'steps': 539, 'batch_loss/train': 0.8188903618138283}
12/21/2021 16:18:18 - INFO - codeparrot_training - Step 540: {'lr': 0.0004999337963851061, 'samples': 276992, 'steps': 540, 'batch_loss/train': 0.9862461071461439}
12/21/2021 16:18:29 - INFO - codeparrot_training - Step 541: {'lr': 0.000499933433144763, 'samples': 277504, 'steps': 541, 'batch_loss/train': 0.8946229419670999}
12/21/2021 16:18:41 - INFO - codeparrot_training - Step 542: {'lr': 0.0004999330689107789, 'samples': 278016, 'steps': 542, 'batch_loss/train': 0.9362280331552029}
12/21/2021 16:18:52 - INFO - codeparrot_training - Step 543: {'lr': 0.0004999327036831554, 'samples': 278528, 'steps': 543, 'batch_loss/train': 0.9131284700706601}
12/21/2021 16:19:02 - INFO - codeparrot_training - Step 544: {'lr': 0.0004999323374618937, 'samples': 279040, 'steps': 544, 'batch_loss/train': 0.9678662391379476}
12/21/2021 16:19:14 - INFO - codeparrot_training - Step 545: {'lr': 0.0004999319702469954, 'samples': 279552, 'steps': 545, 'batch_loss/train': 0.9269123962149024}
12/21/2021 16:19:25 - INFO - codeparrot_training - Step 546: {'lr': 0.0004999316020384621, 'samples': 280064, 'steps': 546, 'batch_loss/train': 0.9190409397706389}
12/21/2021 16:19:35 - INFO - codeparrot_training - Step 547: {'lr': 0.000499931232836295, 'samples': 280576, 'steps': 547, 'batch_loss/train': 0.8725959439761937}
12/21/2021 16:19:46 - INFO - codeparrot_training - Step 548: {'lr': 0.0004999308626404957, 'samples': 281088, 'steps': 548, 'batch_loss/train': 0.7994477478787303}
12/21/2021 16:19:58 - INFO - codeparrot_training - Step 549: {'lr': 0.0004999304914510657, 'samples': 281600, 'steps': 549, 'batch_loss/train': 0.9431182509288192}
12/21/2021 16:20:09 - INFO - codeparrot_training - Step 550: {'lr': 0.0004999301192680064, 'samples': 282112, 'steps': 550, 'batch_loss/train': 0.8486450358759612}
12/21/2021 16:20:20 - INFO - codeparrot_training - Step 551: {'lr': 0.0004999297460913193, 'samples': 282624, 'steps': 551, 'batch_loss/train': 0.9616946149617434}
12/21/2021 16:20:32 - INFO - codeparrot_training - Step 552: {'lr': 0.000499929371921006, 'samples': 283136, 'steps': 552, 'batch_loss/train': 0.9738977933302522}
12/21/2021 16:20:43 - INFO - codeparrot_training - Step 553: {'lr': 0.0004999289967570678, 'samples': 283648, 'steps': 553, 'batch_loss/train': 1.0289804823696613}
12/21/2021 16:20:53 - INFO - codeparrot_training - Step 554: {'lr': 0.0004999286205995063, 'samples': 284160, 'steps': 554, 'batch_loss/train': 0.796769333537668}
12/21/2021 16:21:04 - INFO - codeparrot_training - Step 555: {'lr': 0.000499928243448323, 'samples': 284672, 'steps': 555, 'batch_loss/train': 0.8846205677837133}
12/21/2021 16:21:16 - INFO - codeparrot_training - Step 556: {'lr': 0.0004999278653035193, 'samples': 285184, 'steps': 556, 'batch_loss/train': 0.8715219246223569}
12/21/2021 16:21:27 - INFO - codeparrot_training - Step 557: {'lr': 0.0004999274861650967, 'samples': 285696, 'steps': 557, 'batch_loss/train': 0.8064535167068243}
12/21/2021 16:21:37 - INFO - codeparrot_training - Step 558: {'lr': 0.0004999271060330571, 'samples': 286208, 'steps': 558, 'batch_loss/train': 0.8831348391249776}
12/21/2021 16:21:50 - INFO - codeparrot_training - Step 559: {'lr': 0.0004999267249074014, 'samples': 286720, 'steps': 559, 'batch_loss/train': 0.7371269129216671}
12/21/2021 16:22:01 - INFO - codeparrot_training - Step 560: {'lr': 0.0004999263427881316, 'samples': 287232, 'steps': 560, 'batch_loss/train': 1.0548465140163898}
12/21/2021 16:22:12 - INFO - codeparrot_training - Step 561: {'lr': 0.0004999259596752488, 'samples': 287744, 'steps': 561, 'batch_loss/train': 1.193105038255453}
12/21/2021 16:22:24 - INFO - codeparrot_training - Step 562: {'lr': 0.000499925575568755, 'samples': 288256, 'steps': 562, 'batch_loss/train': 0.9636553274467587}
12/21/2021 16:22:34 - INFO - codeparrot_training - Step 563: {'lr': 0.0004999251904686513, 'samples': 288768, 'steps': 563, 'batch_loss/train': 0.7096619489602745}
12/21/2021 16:22:45 - INFO - codeparrot_training - Step 564: {'lr': 0.0004999248043749395, 'samples': 289280, 'steps': 564, 'batch_loss/train': 1.0251792203634977}
12/21/2021 16:22:55 - INFO - codeparrot_training - Step 565: {'lr': 0.000499924417287621, 'samples': 289792, 'steps': 565, 'batch_loss/train': 1.1077187359333038}
12/21/2021 16:23:07 - INFO - codeparrot_training - Step 566: {'lr': 0.0004999240292066974, 'samples': 290304, 'steps': 566, 'batch_loss/train': 0.880513193551451}
12/21/2021 16:23:18 - INFO - codeparrot_training - Step 567: {'lr': 0.0004999236401321702, 'samples': 290816, 'steps': 567, 'batch_loss/train': 0.9123738692142069}
12/21/2021 16:23:29 - INFO - codeparrot_training - Step 568: {'lr': 0.0004999232500640411, 'samples': 291328, 'steps': 568, 'batch_loss/train': 0.9084219019860029}
12/21/2021 16:23:41 - INFO - codeparrot_training - Step 569: {'lr': 0.0004999228590023114, 'samples': 291840, 'steps': 569, 'batch_loss/train': 0.8991077225655317}
12/21/2021 16:23:52 - INFO - codeparrot_training - Step 570: {'lr': 0.0004999224669469829, 'samples': 292352, 'steps': 570, 'batch_loss/train': 0.9874973967671394}
12/21/2021 16:24:03 - INFO - codeparrot_training - Step 571: {'lr': 0.0004999220738980569, 'samples': 292864, 'steps': 571, 'batch_loss/train': 0.9690563306212425}
12/21/2021 16:24:15 - INFO - codeparrot_training - Step 572: {'lr': 0.0004999216798555351, 'samples': 293376, 'steps': 572, 'batch_loss/train': 0.8383851731196046}
12/21/2021 16:24:25 - INFO - codeparrot_training - Step 573: {'lr': 0.0004999212848194192, 'samples': 293888, 'steps': 573, 'batch_loss/train': 0.970521092414856}
12/21/2021 16:24:36 - INFO - codeparrot_training - Step 574: {'lr': 0.0004999208887897105, 'samples': 294400, 'steps': 574, 'batch_loss/train': 0.9525731345638633}
12/21/2021 16:24:48 - INFO - codeparrot_training - Step 575: {'lr': 0.0004999204917664108, 'samples': 294912, 'steps': 575, 'batch_loss/train': 0.8828167701140046}
12/21/2021 16:24:59 - INFO - codeparrot_training - Step 576: {'lr': 0.0004999200937495215, 'samples': 295424, 'steps': 576, 'batch_loss/train': 0.968753369525075}
12/21/2021 16:25:09 - INFO - codeparrot_training - Step 577: {'lr': 0.0004999196947390444, 'samples': 295936, 'steps': 577, 'batch_loss/train': 1.0379687119275331}
12/21/2021 16:25:20 - INFO - codeparrot_training - Step 578: {'lr': 0.0004999192947349809, 'samples': 296448, 'steps': 578, 'batch_loss/train': 0.9050431232899427}
12/21/2021 16:25:32 - INFO - codeparrot_training - Step 579: {'lr': 0.0004999188937373326, 'samples': 296960, 'steps': 579, 'batch_loss/train': 0.9736380837857723}
12/21/2021 16:25:42 - INFO - codeparrot_training - Step 580: {'lr': 0.0004999184917461012, 'samples': 297472, 'steps': 580, 'batch_loss/train': 0.8154596420936286}
12/21/2021 16:25:53 - INFO - codeparrot_training - Step 581: {'lr': 0.0004999180887612883, 'samples': 297984, 'steps': 581, 'batch_loss/train': 0.9418940469622612}
12/21/2021 16:26:06 - INFO - codeparrot_training - Step 582: {'lr': 0.0004999176847828953, 'samples': 298496, 'steps': 582, 'batch_loss/train': 0.9892596285790205}
12/21/2021 16:26:16 - INFO - codeparrot_training - Step 583: {'lr': 0.0004999172798109241, 'samples': 299008, 'steps': 583, 'batch_loss/train': 0.8385149957612157}
12/21/2021 16:26:27 - INFO - codeparrot_training - Step 584: {'lr': 0.0004999168738453761, 'samples': 299520, 'steps': 584, 'batch_loss/train': 0.8724001785740256}
12/21/2021 16:26:39 - INFO - codeparrot_training - Step 585: {'lr': 0.000499916466886253, 'samples': 300032, 'steps': 585, 'batch_loss/train': 0.9139909027144313}
12/21/2021 16:26:49 - INFO - codeparrot_training - Step 586: {'lr': 0.0004999160589335563, 'samples': 300544, 'steps': 586, 'batch_loss/train': 0.9542697379365563}
12/21/2021 16:27:00 - INFO - codeparrot_training - Step 587: {'lr': 0.0004999156499872878, 'samples': 301056, 'steps': 587, 'batch_loss/train': 0.9293676074594259}
12/21/2021 16:27:10 - INFO - codeparrot_training - Step 588: {'lr': 0.000499915240047449, 'samples': 301568, 'steps': 588, 'batch_loss/train': 0.8713159929029644}
12/21/2021 16:27:23 - INFO - codeparrot_training - Step 589: {'lr': 0.0004999148291140416, 'samples': 302080, 'steps': 589, 'batch_loss/train': 0.9792732698842883}
12/21/2021 16:27:34 - INFO - codeparrot_training - Step 590: {'lr': 0.0004999144171870672, 'samples': 302592, 'steps': 590, 'batch_loss/train': 0.799872491043061}
12/21/2021 16:27:44 - INFO - codeparrot_training - Step 591: {'lr': 0.0004999140042665274, 'samples': 303104, 'steps': 591, 'batch_loss/train': 0.9258495345711708}
12/21/2021 16:27:56 - INFO - codeparrot_training - Step 592: {'lr': 0.0004999135903524239, 'samples': 303616, 'steps': 592, 'batch_loss/train': 0.9421892203390598}
12/21/2021 16:28:07 - INFO - codeparrot_training - Step 593: {'lr': 0.0004999131754447584, 'samples': 304128, 'steps': 593, 'batch_loss/train': 0.869837237522006}
12/21/2021 16:28:18 - INFO - codeparrot_training - Step 594: {'lr': 0.0004999127595435324, 'samples': 304640, 'steps': 594, 'batch_loss/train': 0.97826305963099}
12/21/2021 16:28:30 - INFO - codeparrot_training - Step 595: {'lr': 0.0004999123426487476, 'samples': 305152, 'steps': 595, 'batch_loss/train': 0.9203468547202647}
12/21/2021 16:28:40 - INFO - codeparrot_training - Step 596: {'lr': 0.0004999119247604059, 'samples': 305664, 'steps': 596, 'batch_loss/train': 0.8747538402676582}
12/21/2021 16:28:51 - INFO - codeparrot_training - Step 597: {'lr': 0.0004999115058785085, 'samples': 306176, 'steps': 597, 'batch_loss/train': 1.036929794587195}
12/21/2021 16:29:03 - INFO - codeparrot_training - Step 598: {'lr': 0.0004999110860030573, 'samples': 306688, 'steps': 598, 'batch_loss/train': 0.9580811411142349}
12/21/2021 16:29:14 - INFO - codeparrot_training - Step 599: {'lr': 0.0004999106651340542, 'samples': 307200, 'steps': 599, 'batch_loss/train': 0.9031574176624417}
12/21/2021 16:29:25 - INFO - codeparrot_training - Step 600: {'lr': 0.0004999102432715005, 'samples': 307712, 'steps': 600, 'batch_loss/train': 0.9655727837234735}
12/21/2021 16:29:35 - INFO - codeparrot_training - Step 601: {'lr': 0.0004999098204153981, 'samples': 308224, 'steps': 601, 'batch_loss/train': 0.8549080542288721}
12/21/2021 16:29:47 - INFO - codeparrot_training - Step 602: {'lr': 0.0004999093965657487, 'samples': 308736, 'steps': 602, 'batch_loss/train': 1.1503985403105617}
12/21/2021 16:29:58 - INFO - codeparrot_training - Step 603: {'lr': 0.0004999089717225536, 'samples': 309248, 'steps': 603, 'batch_loss/train': 1.0146469259634614}
12/21/2021 16:30:09 - INFO - codeparrot_training - Step 604: {'lr': 0.000499908545885815, 'samples': 309760, 'steps': 604, 'batch_loss/train': 0.9436314627528191}
12/21/2021 16:30:22 - INFO - codeparrot_training - Step 605: {'lr': 0.0004999081190555345, 'samples': 310272, 'steps': 605, 'batch_loss/train': 0.7581614810042083}
12/21/2021 16:30:32 - INFO - codeparrot_training - Step 606: {'lr': 0.0004999076912317135, 'samples': 310784, 'steps': 606, 'batch_loss/train': 0.9093598071485758}
12/21/2021 16:30:43 - INFO - codeparrot_training - Step 607: {'lr': 0.0004999072624143539, 'samples': 311296, 'steps': 607, 'batch_loss/train': 0.7014100013766438}
12/21/2021 16:30:55 - INFO - codeparrot_training - Step 608: {'lr': 0.0004999068326034575, 'samples': 311808, 'steps': 608, 'batch_loss/train': 0.8505488268565387}
12/21/2021 16:31:05 - INFO - codeparrot_training - Step 609: {'lr': 0.0004999064017990258, 'samples': 312320, 'steps': 609, 'batch_loss/train': 1.0037020947784185}
12/21/2021 16:31:16 - INFO - codeparrot_training - Step 610: {'lr': 0.0004999059700010606, 'samples': 312832, 'steps': 610, 'batch_loss/train': 0.9056342476978898}
12/21/2021 16:31:26 - INFO - codeparrot_training - Step 611: {'lr': 0.0004999055372095636, 'samples': 313344, 'steps': 611, 'batch_loss/train': 0.8517302870750427}
12/21/2021 16:31:38 - INFO - codeparrot_training - Step 612: {'lr': 0.0004999051034245367, 'samples': 313856, 'steps': 612, 'batch_loss/train': 0.9454834964126348}
12/21/2021 16:31:49 - INFO - codeparrot_training - Step 613: {'lr': 0.0004999046686459813, 'samples': 314368, 'steps': 613, 'batch_loss/train': 0.9660000558942556}
12/21/2021 16:32:00 - INFO - codeparrot_training - Step 614: {'lr': 0.0004999042328738993, 'samples': 314880, 'steps': 614, 'batch_loss/train': 0.9627989996224642}
12/21/2021 16:32:12 - INFO - codeparrot_training - Step 615: {'lr': 0.0004999037961082926, 'samples': 315392, 'steps': 615, 'batch_loss/train': 0.7407698503229767}
12/21/2021 16:32:22 - INFO - codeparrot_training - Step 616: {'lr': 0.0004999033583491626, 'samples': 315904, 'steps': 616, 'batch_loss/train': 0.9574173782020807}
12/21/2021 16:32:33 - INFO - codeparrot_training - Step 617: {'lr': 0.0004999029195965112, 'samples': 316416, 'steps': 617, 'batch_loss/train': 0.953805978409946}
12/21/2021 16:32:46 - INFO - codeparrot_training - Step 618: {'lr': 0.0004999024798503403, 'samples': 316928, 'steps': 618, 'batch_loss/train': 0.9668229855597019}
12/21/2021 16:32:56 - INFO - codeparrot_training - Step 619: {'lr': 0.0004999020391106514, 'samples': 317440, 'steps': 619, 'batch_loss/train': 0.9141043045092374}
12/21/2021 16:33:07 - INFO - codeparrot_training - Step 620: {'lr': 0.0004999015973774463, 'samples': 317952, 'steps': 620, 'batch_loss/train': 0.8565683905035257}
12/21/2021 16:33:17 - INFO - codeparrot_training - Step 621: {'lr': 0.000499901154650727, 'samples': 318464, 'steps': 621, 'batch_loss/train': 0.9203378558158875}
12/21/2021 16:33:30 - INFO - codeparrot_training - Step 622: {'lr': 0.0004999007109304949, 'samples': 318976, 'steps': 622, 'batch_loss/train': 0.8640285274013877}
12/21/2021 16:33:40 - INFO - codeparrot_training - Step 623: {'lr': 0.000499900266216752, 'samples': 319488, 'steps': 623, 'batch_loss/train': 0.7352262572385371}
12/21/2021 16:33:51 - INFO - codeparrot_training - Step 624: {'lr': 0.0004998998205094999, 'samples': 320000, 'steps': 624, 'batch_loss/train': 1.1667565982788801}
12/21/2021 16:34:03 - INFO - codeparrot_training - Step 625: {'lr': 0.0004998993738087406, 'samples': 320512, 'steps': 625, 'batch_loss/train': 0.8875412936322391}
12/21/2021 16:34:13 - INFO - codeparrot_training - Step 626: {'lr': 0.0004998989261144757, 'samples': 321024, 'steps': 626, 'batch_loss/train': 0.8262747582048178}
12/21/2021 16:34:24 - INFO - codeparrot_training - Step 627: {'lr': 0.000499898477426707, 'samples': 321536, 'steps': 627, 'batch_loss/train': 0.8741793856024742}
12/21/2021 16:34:37 - INFO - codeparrot_training - Step 628: {'lr': 0.0004998980277454364, 'samples': 322048, 'steps': 628, 'batch_loss/train': 0.8630499104037881}
12/21/2021 16:34:47 - INFO - codeparrot_training - Step 629: {'lr': 0.0004998975770706655, 'samples': 322560, 'steps': 629, 'batch_loss/train': 0.927836736664176}
12/21/2021 16:34:58 - INFO - codeparrot_training - Step 630: {'lr': 0.0004998971254023962, 'samples': 323072, 'steps': 630, 'batch_loss/train': 0.8090596257243305}
12/21/2021 16:35:10 - INFO - codeparrot_training - Step 631: {'lr': 0.0004998966727406303, 'samples': 323584, 'steps': 631, 'batch_loss/train': 0.9767442140728235}
12/21/2021 16:35:21 - INFO - codeparrot_training - Step 632: {'lr': 0.0004998962190853696, 'samples': 324096, 'steps': 632, 'batch_loss/train': 0.9817150719463825}
12/21/2021 16:35:32 - INFO - codeparrot_training - Step 633: {'lr': 0.0004998957644366159, 'samples': 324608, 'steps': 633, 'batch_loss/train': 0.8378247471991926}
12/21/2021 16:35:42 - INFO - codeparrot_training - Step 634: {'lr': 0.0004998953087943709, 'samples': 325120, 'steps': 634, 'batch_loss/train': 0.9648652654141188}
12/21/2021 16:35:55 - INFO - codeparrot_training - Step 635: {'lr': 0.0004998948521586367, 'samples': 325632, 'steps': 635, 'batch_loss/train': 0.945153484120965}
12/21/2021 16:36:05 - INFO - codeparrot_training - Step 636: {'lr': 0.0004998943945294147, 'samples': 326144, 'steps': 636, 'batch_loss/train': 0.8803400984033942}
12/21/2021 16:36:16 - INFO - codeparrot_training - Step 637: {'lr': 0.0004998939359067072, 'samples': 326656, 'steps': 637, 'batch_loss/train': 1.0595063250511885}
12/21/2021 16:36:28 - INFO - codeparrot_training - Step 638: {'lr': 0.0004998934762905157, 'samples': 327168, 'steps': 638, 'batch_loss/train': 0.8174590254202485}
12/21/2021 16:36:39 - INFO - codeparrot_training - Step 639: {'lr': 0.0004998930156808419, 'samples': 327680, 'steps': 639, 'batch_loss/train': 0.876562980003655}
12/21/2021 16:36:49 - INFO - codeparrot_training - Step 640: {'lr': 0.000499892554077688, 'samples': 328192, 'steps': 640, 'batch_loss/train': 1.021360730752349}
12/21/2021 16:37:01 - INFO - codeparrot_training - Step 641: {'lr': 0.0004998920914810556, 'samples': 328704, 'steps': 641, 'batch_loss/train': 0.9346235450357199}
12/21/2021 16:37:12 - INFO - codeparrot_training - Step 642: {'lr': 0.0004998916278909466, 'samples': 329216, 'steps': 642, 'batch_loss/train': 0.9330318197607994}
12/21/2021 16:37:22 - INFO - codeparrot_training - Step 643: {'lr': 0.0004998911633073629, 'samples': 329728, 'steps': 643, 'batch_loss/train': 0.9473899034783244}
12/21/2021 16:37:35 - INFO - codeparrot_training - Step 644: {'lr': 0.0004998906977303062, 'samples': 330240, 'steps': 644, 'batch_loss/train': 
0.7694513439200819} 12/21/2021 16:37:45 - INFO - codeparrot_training - Step 645: {'lr': 0.0004998902311597785, 'samples': 330752, 'steps': 645, 'batch_loss/train': 0.5681499605998397} 12/21/2021 16:37:56 - INFO - codeparrot_training - Step 646: {'lr': 0.0004998897635957815, 'samples': 331264, 'steps': 646, 'batch_loss/train': 0.9241367857903242} 12/21/2021 16:38:06 - INFO - codeparrot_training - Step 647: {'lr': 0.0004998892950383172, 'samples': 331776, 'steps': 647, 'batch_loss/train': 1.0140241999179125} 12/21/2021 16:38:19 - INFO - codeparrot_training - Step 648: {'lr': 0.0004998888254873875, 'samples': 332288, 'steps': 648, 'batch_loss/train': 0.8737476663663983} 12/21/2021 16:38:30 - INFO - codeparrot_training - Step 649: {'lr': 0.0004998883549429942, 'samples': 332800, 'steps': 649, 'batch_loss/train': 0.9440994933247566} 12/21/2021 16:38:40 - INFO - codeparrot_training - Step 650: {'lr': 0.000499887883405139, 'samples': 333312, 'steps': 650, 'batch_loss/train': 0.9624589420855045} 12/21/2021 16:38:52 - INFO - codeparrot_training - Step 651: {'lr': 0.0004998874108738241, 'samples': 333824, 'steps': 651, 'batch_loss/train': 1.0394103089347482} 12/21/2021 16:39:03 - INFO - codeparrot_training - Step 652: {'lr': 0.0004998869373490511, 'samples': 334336, 'steps': 652, 'batch_loss/train': 0.8162920204922557} 12/21/2021 16:39:13 - INFO - codeparrot_training - Step 653: {'lr': 0.000499886462830822, 'samples': 334848, 'steps': 653, 'batch_loss/train': 0.8337281758431345} 12/21/2021 16:39:26 - INFO - codeparrot_training - Step 654: {'lr': 0.0004998859873191387, 'samples': 335360, 'steps': 654, 'batch_loss/train': 0.8260816955007613} 12/21/2021 16:39:36 - INFO - codeparrot_training - Step 655: {'lr': 0.0004998855108140031, 'samples': 335872, 'steps': 655, 'batch_loss/train': 0.8891378277912736} 12/21/2021 16:39:47 - INFO - codeparrot_training - Step 656: {'lr': 0.000499885033315417, 'samples': 336384, 'steps': 656, 'batch_loss/train': 0.9109824746847153} 12/21/2021 
16:40:00 - INFO - codeparrot_training - Step 657: {'lr': 0.0004998845548233825, 'samples': 336896, 'steps': 657, 'batch_loss/train': 0.9319252073764801} 12/21/2021 16:40:10 - INFO - codeparrot_training - Step 658: {'lr': 0.0004998840753379011, 'samples': 337408, 'steps': 658, 'batch_loss/train': 0.8713351637125015} 12/21/2021 16:40:21 - INFO - codeparrot_training - Step 659: {'lr': 0.0004998835948589751, 'samples': 337920, 'steps': 659, 'batch_loss/train': 0.9683993626385927} 12/21/2021 16:40:31 - INFO - codeparrot_training - Step 660: {'lr': 0.0004998831133866064, 'samples': 338432, 'steps': 660, 'batch_loss/train': 0.865633636713028} 12/21/2021 16:40:43 - INFO - codeparrot_training - Step 661: {'lr': 0.0004998826309207966, 'samples': 338944, 'steps': 661, 'batch_loss/train': 0.9237379627302289} 12/21/2021 16:40:54 - INFO - codeparrot_training - Step 662: {'lr': 0.0004998821474615481, 'samples': 339456, 'steps': 662, 'batch_loss/train': 0.863347053527832} 12/21/2021 16:41:05 - INFO - codeparrot_training - Step 663: {'lr': 0.0004998816630088622, 'samples': 339968, 'steps': 663, 'batch_loss/train': 0.9137359028682113} 12/21/2021 16:41:17 - INFO - codeparrot_training - Step 664: {'lr': 0.0004998811775627413, 'samples': 340480, 'steps': 664, 'batch_loss/train': 0.8101784214377403} 12/21/2021 16:41:28 - INFO - codeparrot_training - Step 665: {'lr': 0.0004998806911231873, 'samples': 340992, 'steps': 665, 'batch_loss/train': 0.8837747406214476} 12/21/2021 16:41:38 - INFO - codeparrot_training - Step 666: {'lr': 0.0004998802036902018, 'samples': 341504, 'steps': 666, 'batch_loss/train': 0.9349890789017081} 12/21/2021 16:41:50 - INFO - codeparrot_training - Step 667: {'lr': 0.0004998797152637872, 'samples': 342016, 'steps': 667, 'batch_loss/train': 1.0027073482051492} 12/21/2021 16:42:01 - INFO - codeparrot_training - Step 668: {'lr': 0.000499879225843945, 'samples': 342528, 'steps': 668, 'batch_loss/train': 0.9386633480899036} 12/21/2021 16:42:12 - INFO - 
codeparrot_training - Step 669: {'lr': 0.0004998787354306774, 'samples': 343040, 'steps': 669, 'batch_loss/train': 0.8936235718429089} 12/21/2021 16:42:22 - INFO - codeparrot_training - Step 670: {'lr': 0.0004998782440239863, 'samples': 343552, 'steps': 670, 'batch_loss/train': 0.9008663929998875} 12/21/2021 16:42:35 - INFO - codeparrot_training - Step 671: {'lr': 0.0004998777516238737, 'samples': 344064, 'steps': 671, 'batch_loss/train': 0.9066405333578587} 12/21/2021 16:42:45 - INFO - codeparrot_training - Step 672: {'lr': 0.0004998772582303414, 'samples': 344576, 'steps': 672, 'batch_loss/train': 0.8753687320277095} 12/21/2021 16:42:56 - INFO - codeparrot_training - Step 673: {'lr': 0.0004998767638433917, 'samples': 345088, 'steps': 673, 'batch_loss/train': 0.8501032739877701} 12/21/2021 16:43:09 - INFO - codeparrot_training - Step 674: {'lr': 0.0004998762684630261, 'samples': 345600, 'steps': 674, 'batch_loss/train': 0.9156850469298661} 12/21/2021 16:43:19 - INFO - codeparrot_training - Step 675: {'lr': 0.0004998757720892469, 'samples': 346112, 'steps': 675, 'batch_loss/train': 0.8992579788900912} 12/21/2021 16:43:30 - INFO - codeparrot_training - Step 676: {'lr': 0.000499875274722056, 'samples': 346624, 'steps': 676, 'batch_loss/train': 0.8720930330455303} 12/21/2021 16:43:42 - INFO - codeparrot_training - Step 677: {'lr': 0.0004998747763614554, 'samples': 347136, 'steps': 677, 'batch_loss/train': 0.9037777651101351} 12/21/2021 16:43:52 - INFO - codeparrot_training - Step 678: {'lr': 0.0004998742770074469, 'samples': 347648, 'steps': 678, 'batch_loss/train': 0.9344880515709519} 12/21/2021 16:44:03 - INFO - codeparrot_training - Step 679: {'lr': 0.0004998737766600327, 'samples': 348160, 'steps': 679, 'batch_loss/train': 0.8562336768954992} 12/21/2021 16:44:15 - INFO - codeparrot_training - Step 680: {'lr': 0.0004998732753192148, 'samples': 348672, 'steps': 680, 'batch_loss/train': 0.8585494672879577} 12/21/2021 16:44:26 - INFO - codeparrot_training - Step 681: 
{'lr': 0.0004998727729849949, 'samples': 349184, 'steps': 681, 'batch_loss/train': 0.9700227463617921} 12/21/2021 16:44:36 - INFO - codeparrot_training - Step 682: {'lr': 0.0004998722696573753, 'samples': 349696, 'steps': 682, 'batch_loss/train': 0.9024811657145619} 12/21/2021 16:44:47 - INFO - codeparrot_training - Step 683: {'lr': 0.000499871765336358, 'samples': 350208, 'steps': 683, 'batch_loss/train': 0.9197928688954562} 12/21/2021 16:44:59 - INFO - codeparrot_training - Step 684: {'lr': 0.0004998712600219449, 'samples': 350720, 'steps': 684, 'batch_loss/train': 0.8372736421879381} 12/21/2021 16:45:10 - INFO - codeparrot_training - Step 685: {'lr': 0.0004998707537141378, 'samples': 351232, 'steps': 685, 'batch_loss/train': 0.7522589818108827} 12/21/2021 16:45:20 - INFO - codeparrot_training - Step 686: {'lr': 0.0004998702464129391, 'samples': 351744, 'steps': 686, 'batch_loss/train': 0.852756405249238} 12/21/2021 16:45:33 - INFO - codeparrot_training - Step 687: {'lr': 0.0004998697381183506, 'samples': 352256, 'steps': 687, 'batch_loss/train': 0.9529910506680608} 12/21/2021 16:45:43 - INFO - codeparrot_training - Step 688: {'lr': 0.0004998692288303745, 'samples': 352768, 'steps': 688, 'batch_loss/train': 0.9314002264291048} 12/21/2021 16:45:54 - INFO - codeparrot_training - Step 689: {'lr': 0.0004998687185490125, 'samples': 353280, 'steps': 689, 'batch_loss/train': 0.8984973533079028} 12/21/2021 16:46:06 - INFO - codeparrot_training - Step 690: {'lr': 0.000499868207274267, 'samples': 353792, 'steps': 690, 'batch_loss/train': 0.9233706695958972} 12/21/2021 16:46:17 - INFO - codeparrot_training - Step 691: {'lr': 0.0004998676950061397, 'samples': 354304, 'steps': 691, 'batch_loss/train': 0.8453902406617999} 12/21/2021 16:46:27 - INFO - codeparrot_training - Step 692: {'lr': 0.0004998671817446329, 'samples': 354816, 'steps': 692, 'batch_loss/train': 0.8967659650370479} 12/21/2021 16:46:38 - INFO - codeparrot_training - Step 693: {'lr': 0.0004998666674897484, 
'samples': 355328, 'steps': 693, 'batch_loss/train': 0.9203964900225401} 12/21/2021 16:46:50 - INFO - codeparrot_training - Step 694: {'lr': 0.0004998661522414885, 'samples': 355840, 'steps': 694, 'batch_loss/train': 0.9685180312953889} 12/21/2021 16:47:00 - INFO - codeparrot_training - Step 695: {'lr': 0.0004998656359998552, 'samples': 356352, 'steps': 695, 'batch_loss/train': 0.782053689006716} 12/21/2021 16:47:11 - INFO - codeparrot_training - Step 696: {'lr': 0.0004998651187648503, 'samples': 356864, 'steps': 696, 'batch_loss/train': 0.9232744155451655} 12/21/2021 16:47:24 - INFO - codeparrot_training - Step 697: {'lr': 0.000499864600536476, 'samples': 357376, 'steps': 697, 'batch_loss/train': 0.9550192523747683} 12/21/2021 16:47:35 - INFO - codeparrot_training - Step 698: {'lr': 0.0004998640813147345, 'samples': 357888, 'steps': 698, 'batch_loss/train': 0.9095878661610186} 12/21/2021 16:47:45 - INFO - codeparrot_training - Step 699: {'lr': 0.0004998635610996278, 'samples': 358400, 'steps': 699, 'batch_loss/train': 0.9093206040561199} 12/21/2021 16:47:57 - INFO - codeparrot_training - Step 700: {'lr': 0.0004998630398911579, 'samples': 358912, 'steps': 700, 'batch_loss/train': 0.7932508641388267} 12/21/2021 16:48:08 - INFO - codeparrot_training - Step 701: {'lr': 0.0004998625176893269, 'samples': 359424, 'steps': 701, 'batch_loss/train': 0.8625519368797541} 12/21/2021 16:48:18 - INFO - codeparrot_training - Step 702: {'lr': 0.0004998619944941369, 'samples': 359936, 'steps': 702, 'batch_loss/train': 0.9058898026123643} 12/21/2021 16:48:29 - INFO - codeparrot_training - Step 703: {'lr': 0.0004998614703055898, 'samples': 360448, 'steps': 703, 'batch_loss/train': 0.7993348529562354} 12/21/2021 16:48:42 - INFO - codeparrot_training - Step 704: {'lr': 0.0004998609451236879, 'samples': 360960, 'steps': 704, 'batch_loss/train': 0.9983088783919811} 12/21/2021 16:48:52 - INFO - codeparrot_training - Step 705: {'lr': 0.0004998604189484333, 'samples': 361472, 'steps': 705, 
'batch_loss/train': 1.0443515917286277} 12/21/2021 16:49:03 - INFO - codeparrot_training - Step 706: {'lr': 0.000499859891779828, 'samples': 361984, 'steps': 706, 'batch_loss/train': 0.9746590838767588} 12/21/2021 16:49:15 - INFO - codeparrot_training - Step 707: {'lr': 0.000499859363617874, 'samples': 362496, 'steps': 707, 'batch_loss/train': 0.9542826134711504} 12/21/2021 16:49:25 - INFO - codeparrot_training - Step 708: {'lr': 0.0004998588344625736, 'samples': 363008, 'steps': 708, 'batch_loss/train': 0.8941388195380569} 12/21/2021 16:49:36 - INFO - codeparrot_training - Step 709: {'lr': 0.0004998583043139289, 'samples': 363520, 'steps': 709, 'batch_loss/train': 0.9298544200137258} 12/21/2021 16:49:48 - INFO - codeparrot_training - Step 710: {'lr': 0.0004998577731719419, 'samples': 364032, 'steps': 710, 'batch_loss/train': 0.9288242040202022} 12/21/2021 16:49:59 - INFO - codeparrot_training - Step 711: {'lr': 0.0004998572410366146, 'samples': 364544, 'steps': 711, 'batch_loss/train': 0.9556368067860603} 12/21/2021 16:50:09 - INFO - codeparrot_training - Step 712: {'lr': 0.0004998567079079494, 'samples': 365056, 'steps': 712, 'batch_loss/train': 0.9007089296355844} 12/21/2021 16:50:22 - INFO - codeparrot_training - Step 713: {'lr': 0.0004998561737859482, 'samples': 365568, 'steps': 713, 'batch_loss/train': 1.0700763603672385} 12/21/2021 16:50:33 - INFO - codeparrot_training - Step 714: {'lr': 0.0004998556386706133, 'samples': 366080, 'steps': 714, 'batch_loss/train': 0.9096572594717145} 12/21/2021 16:50:43 - INFO - codeparrot_training - Step 715: {'lr': 0.0004998551025619466, 'samples': 366592, 'steps': 715, 'batch_loss/train': 0.9411030448973179} 12/21/2021 16:50:54 - INFO - codeparrot_training - Step 716: {'lr': 0.0004998545654599503, 'samples': 367104, 'steps': 716, 'batch_loss/train': 0.9920748313888907} 12/21/2021 16:51:06 - INFO - codeparrot_training - Step 717: {'lr': 0.0004998540273646267, 'samples': 367616, 'steps': 717, 'batch_loss/train': 
1.0416659973561764} 12/21/2021 16:51:17 - INFO - codeparrot_training - Step 718: {'lr': 0.0004998534882759778, 'samples': 368128, 'steps': 718, 'batch_loss/train': 0.7783045200631022} 12/21/2021 16:51:28 - INFO - codeparrot_training - Step 719: {'lr': 0.0004998529481940058, 'samples': 368640, 'steps': 719, 'batch_loss/train': 0.9713968969881535} 12/21/2021 16:51:40 - INFO - codeparrot_training - Step 720: {'lr': 0.0004998524071187127, 'samples': 369152, 'steps': 720, 'batch_loss/train': 0.9542250568047166} 12/21/2021 16:51:50 - INFO - codeparrot_training - Step 721: {'lr': 0.0004998518650501009, 'samples': 369664, 'steps': 721, 'batch_loss/train': 0.918422156944871} 12/21/2021 16:52:01 - INFO - codeparrot_training - Step 722: {'lr': 0.0004998513219881724, 'samples': 370176, 'steps': 722, 'batch_loss/train': 0.8979240972548723} 12/21/2021 16:52:13 - INFO - codeparrot_training - Step 723: {'lr': 0.0004998507779329294, 'samples': 370688, 'steps': 723, 'batch_loss/train': 1.0303898230195045} 12/21/2021 16:52:23 - INFO - codeparrot_training - Step 724: {'lr': 0.000499850232884374, 'samples': 371200, 'steps': 724, 'batch_loss/train': 0.9922699043527246} 12/21/2021 16:52:34 - INFO - codeparrot_training - Step 725: {'lr': 0.0004998496868425085, 'samples': 371712, 'steps': 725, 'batch_loss/train': 0.9277333309873939} 12/21/2021 16:52:47 - INFO - codeparrot_training - Step 726: {'lr': 0.0004998491398073348, 'samples': 372224, 'steps': 726, 'batch_loss/train': 0.9918480785563588} 12/21/2021 16:52:57 - INFO - codeparrot_training - Step 727: {'lr': 0.0004998485917788554, 'samples': 372736, 'steps': 727, 'batch_loss/train': 0.8040990834124386} 12/21/2021 16:53:08 - INFO - codeparrot_training - Step 728: {'lr': 0.0004998480427570723, 'samples': 373248, 'steps': 728, 'batch_loss/train': 0.8798563666641712} 12/21/2021 16:53:18 - INFO - codeparrot_training - Step 729: {'lr': 0.0004998474927419878, 'samples': 373760, 'steps': 729, 'batch_loss/train': 0.9049311252310872} 12/21/2021 
16:53:30 - INFO - codeparrot_training - Step 730: {'lr': 0.0004998469417336039, 'samples': 374272, 'steps': 730, 'batch_loss/train': 0.8904352216050029} 12/21/2021 16:53:41 - INFO - codeparrot_training - Step 731: {'lr': 0.000499846389731923, 'samples': 374784, 'steps': 731, 'batch_loss/train': 0.9411286730319262} 12/21/2021 16:53:51 - INFO - codeparrot_training - Step 732: {'lr': 0.000499845836736947, 'samples': 375296, 'steps': 732, 'batch_loss/train': 0.9338088165968657} 12/21/2021 16:54:04 - INFO - codeparrot_training - Step 733: {'lr': 0.0004998452827486785, 'samples': 375808, 'steps': 733, 'batch_loss/train': 0.9239707002416253} 12/21/2021 16:54:15 - INFO - codeparrot_training - Step 734: {'lr': 0.0004998447277671194, 'samples': 376320, 'steps': 734, 'batch_loss/train': 0.9076833575963974} 12/21/2021 16:54:25 - INFO - codeparrot_training - Step 735: {'lr': 0.0004998441717922721, 'samples': 376832, 'steps': 735, 'batch_loss/train': 0.9181579956784844} 12/21/2021 16:54:37 - INFO - codeparrot_training - Step 736: {'lr': 0.0004998436148241386, 'samples': 377344, 'steps': 736, 'batch_loss/train': 0.9060037350282073} 12/21/2021 16:54:48 - INFO - codeparrot_training - Step 737: {'lr': 0.0004998430568627213, 'samples': 377856, 'steps': 737, 'batch_loss/train': 1.014872153289616} 12/21/2021 16:54:58 - INFO - codeparrot_training - Step 738: {'lr': 0.0004998424979080222, 'samples': 378368, 'steps': 738, 'batch_loss/train': 0.9603051524609327} 12/21/2021 16:55:10 - INFO - codeparrot_training - Step 739: {'lr': 0.0004998419379600438, 'samples': 378880, 'steps': 739, 'batch_loss/train': 0.8300898065790534} 12/21/2021 16:55:21 - INFO - codeparrot_training - Step 740: {'lr': 0.0004998413770187882, 'samples': 379392, 'steps': 740, 'batch_loss/train': 0.942758864723146} 12/21/2021 16:55:32 - INFO - codeparrot_training - Step 741: {'lr': 0.0004998408150842575, 'samples': 379904, 'steps': 741, 'batch_loss/train': 0.9354134553577751} 12/21/2021 16:55:42 - INFO - 
codeparrot_training - Step 742: {'lr': 0.0004998402521564541, 'samples': 380416, 'steps': 742, 'batch_loss/train': 0.8622790761291981} 12/21/2021 16:55:55 - INFO - codeparrot_training - Step 743: {'lr': 0.0004998396882353803, 'samples': 380928, 'steps': 743, 'batch_loss/train': 0.8875884525477886} 12/21/2021 16:56:06 - INFO - codeparrot_training - Step 744: {'lr': 0.0004998391233210381, 'samples': 381440, 'steps': 744, 'batch_loss/train': 0.8584947464987636} 12/21/2021 16:56:17 - INFO - codeparrot_training - Step 745: {'lr': 0.00049983855741343, 'samples': 381952, 'steps': 745, 'batch_loss/train': 0.8631401276215911} 12/21/2021 16:56:29 - INFO - codeparrot_training - Step 746: {'lr': 0.0004998379905125581, 'samples': 382464, 'steps': 746, 'batch_loss/train': 0.978673106059432} 12/21/2021 16:56:39 - INFO - codeparrot_training - Step 747: {'lr': 0.0004998374226184246, 'samples': 382976, 'steps': 747, 'batch_loss/train': 1.0031325398012996} 12/21/2021 16:56:50 - INFO - codeparrot_training - Step 748: {'lr': 0.000499836853731032, 'samples': 383488, 'steps': 748, 'batch_loss/train': 0.9518019109964371} 12/21/2021 16:57:02 - INFO - codeparrot_training - Step 749: {'lr': 0.0004998362838503822, 'samples': 384000, 'steps': 749, 'batch_loss/train': 0.926420415751636} 12/21/2021 16:57:13 - INFO - codeparrot_training - Step 750: {'lr': 0.0004998357129764778, 'samples': 384512, 'steps': 750, 'batch_loss/train': 0.9860595315694809} 12/21/2021 16:57:23 - INFO - codeparrot_training - Step 751: {'lr': 0.0004998351411093209, 'samples': 385024, 'steps': 751, 'batch_loss/train': 0.8680475722067058} 12/21/2021 16:57:34 - INFO - codeparrot_training - Step 752: {'lr': 0.0004998345682489139, 'samples': 385536, 'steps': 752, 'batch_loss/train': 0.930605205707252} 12/21/2021 16:57:47 - INFO - codeparrot_training - Step 753: {'lr': 0.0004998339943952588, 'samples': 386048, 'steps': 753, 'batch_loss/train': 0.8897510543465614} 12/21/2021 16:57:57 - INFO - codeparrot_training - Step 754: 
{'lr': 0.0004998334195483583, 'samples': 386560, 'steps': 754, 'batch_loss/train': 0.9025165792554617} 12/21/2021 16:58:08 - INFO - codeparrot_training - Step 755: {'lr': 0.0004998328437082143, 'samples': 387072, 'steps': 755, 'batch_loss/train': 0.9224835326895118} 12/21/2021 16:58:20 - INFO - codeparrot_training - Step 756: {'lr': 0.0004998322668748293, 'samples': 387584, 'steps': 756, 'batch_loss/train': 0.8922624634578824} 12/21/2021 16:58:30 - INFO - codeparrot_training - Step 757: {'lr': 0.0004998316890482055, 'samples': 388096, 'steps': 757, 'batch_loss/train': 0.9402753226459026} 12/21/2021 16:58:41 - INFO - codeparrot_training - Step 758: {'lr': 0.0004998311102283453, 'samples': 388608, 'steps': 758, 'batch_loss/train': 0.9860587837174535} 12/21/2021 16:58:53 - INFO - codeparrot_training - Step 759: {'lr': 0.0004998305304152508, 'samples': 389120, 'steps': 759, 'batch_loss/train': 0.965982579626143} 12/21/2021 16:59:04 - INFO - codeparrot_training - Step 760: {'lr': 0.0004998299496089246, 'samples': 389632, 'steps': 760, 'batch_loss/train': 0.8267340925522149} 12/21/2021 16:59:14 - INFO - codeparrot_training - Step 761: {'lr': 0.0004998293678093688, 'samples': 390144, 'steps': 761, 'batch_loss/train': 0.9654532987624407} 12/21/2021 16:59:27 - INFO - codeparrot_training - Step 762: {'lr': 0.0004998287850165858, 'samples': 390656, 'steps': 762, 'batch_loss/train': 0.9248868841677904} 12/21/2021 16:59:38 - INFO - codeparrot_training - Step 763: {'lr': 0.0004998282012305778, 'samples': 391168, 'steps': 763, 'batch_loss/train': 0.7699675001204014} 12/21/2021 16:59:49 - INFO - codeparrot_training - Step 764: {'lr': 0.0004998276164513473, 'samples': 391680, 'steps': 764, 'batch_loss/train': 0.998703227378428} 12/21/2021 16:59:59 - INFO - codeparrot_training - Step 765: {'lr': 0.0004998270306788965, 'samples': 392192, 'steps': 765, 'batch_loss/train': 1.2129826387390494} 12/21/2021 17:00:11 - INFO - codeparrot_training - Step 766: {'lr': 0.0004998264439132278, 
'samples': 392704, 'steps': 766, 'batch_loss/train': 1.055231023579836} 12/21/2021 17:00:22 - INFO - codeparrot_training - Step 767: {'lr': 0.0004998258561543433, 'samples': 393216, 'steps': 767, 'batch_loss/train': 0.9193539707921445} 12/21/2021 17:00:32 - INFO - codeparrot_training - Step 768: {'lr': 0.0004998252674022457, 'samples': 393728, 'steps': 768, 'batch_loss/train': 0.9410045798867941} 12/21/2021 17:00:44 - INFO - codeparrot_training - Step 769: {'lr': 0.0004998246776569371, 'samples': 394240, 'steps': 769, 'batch_loss/train': 0.8575128288939595} 12/21/2021 17:00:55 - INFO - codeparrot_training - Step 770: {'lr': 0.00049982408691842, 'samples': 394752, 'steps': 770, 'batch_loss/train': 0.8406329331919551} 12/21/2021 17:01:05 - INFO - codeparrot_training - Step 771: {'lr': 0.0004998234951866966, 'samples': 395264, 'steps': 771, 'batch_loss/train': 0.8926150659099221} 12/21/2021 17:01:16 - INFO - codeparrot_training - Step 772: {'lr': 0.0004998229024617693, 'samples': 395776, 'steps': 772, 'batch_loss/train': 0.8969995081424713} 12/21/2021 17:01:29 - INFO - codeparrot_training - Step 773: {'lr': 0.0004998223087436404, 'samples': 396288, 'steps': 773, 'batch_loss/train': 0.7905642939731479} 12/21/2021 17:01:39 - INFO - codeparrot_training - Step 774: {'lr': 0.0004998217140323125, 'samples': 396800, 'steps': 774, 'batch_loss/train': 0.9869577521458268} 12/21/2021 17:01:50 - INFO - codeparrot_training - Step 775: {'lr': 0.0004998211183277877, 'samples': 397312, 'steps': 775, 'batch_loss/train': 0.7224799026735127} 12/21/2021 17:02:02 - INFO - codeparrot_training - Step 776: {'lr': 0.0004998205216300685, 'samples': 397824, 'steps': 776, 'batch_loss/train': 0.956611798144877} 12/21/2021 17:02:13 - INFO - codeparrot_training - Step 777: {'lr': 0.000499819923939157, 'samples': 398336, 'steps': 777, 'batch_loss/train': 0.6940632057376206} 12/21/2021 17:02:23 - INFO - codeparrot_training - Step 778: {'lr': 0.0004998193252550562, 'samples': 398848, 'steps': 778, 
'batch_loss/train': 0.9514288548380136} 12/21/2021 17:02:36 - INFO - codeparrot_training - Step 779: {'lr': 0.0004998187255777678, 'samples': 399360, 'steps': 779, 'batch_loss/train': 0.8291643848642707} 12/21/2021 17:02:46 - INFO - codeparrot_training - Step 780: {'lr': 0.0004998181249072946, 'samples': 399872, 'steps': 780, 'batch_loss/train': 0.9225154826417565} 12/21/2021 17:02:57 - INFO - codeparrot_training - Step 781: {'lr': 0.0004998175232436388, 'samples': 400384, 'steps': 781, 'batch_loss/train': 0.7809789045713842} 12/21/2021 17:03:10 - INFO - codeparrot_training - Step 782: {'lr': 0.0004998169205868029, 'samples': 400896, 'steps': 782, 'batch_loss/train': 0.9135488187894225} 12/21/2021 17:03:20 - INFO - codeparrot_training - Step 783: {'lr': 0.0004998163169367894, 'samples': 401408, 'steps': 783, 'batch_loss/train': 0.9129585232585669} 12/21/2021 17:03:31 - INFO - codeparrot_training - Step 784: {'lr': 0.0004998157122936002, 'samples': 401920, 'steps': 784, 'batch_loss/train': 0.8906822032295167} 12/21/2021 17:03:41 - INFO - codeparrot_training - Step 785: {'lr': 0.0004998151066572383, 'samples': 402432, 'steps': 785, 'batch_loss/train': 0.9094138201326132} 12/21/2021 17:03:53 - INFO - codeparrot_training - Step 786: {'lr': 0.0004998145000277058, 'samples': 402944, 'steps': 786, 'batch_loss/train': 1.1266619339585304} 12/21/2021 17:04:04 - INFO - codeparrot_training - Step 787: {'lr': 0.0004998138924050051, 'samples': 403456, 'steps': 787, 'batch_loss/train': 0.9350923728197813} 12/21/2021 17:04:15 - INFO - codeparrot_training - Step 788: {'lr': 0.0004998132837891389, 'samples': 403968, 'steps': 788, 'batch_loss/train': 0.8893251535482705} 12/21/2021 17:04:27 - INFO - codeparrot_training - Step 789: {'lr': 0.0004998126741801092, 'samples': 404480, 'steps': 789, 'batch_loss/train': 0.9448728957213461} 12/21/2021 17:04:38 - INFO - codeparrot_training - Step 790: {'lr': 0.0004998120635779187, 'samples': 404992, 'steps': 790, 'batch_loss/train': 
0.9978616088628769} 12/21/2021 17:04:48 - INFO - codeparrot_training - Step 791: {'lr': 0.0004998114519825698, 'samples': 405504, 'steps': 791, 'batch_loss/train': 0.8847496602684259} 12/21/2021 17:05:00 - INFO - codeparrot_training - Step 792: {'lr': 0.0004998108393940649, 'samples': 406016, 'steps': 792, 'batch_loss/train': 0.9300745222717524} 12/21/2021 17:05:11 - INFO - codeparrot_training - Step 793: {'lr': 0.0004998102258124065, 'samples': 406528, 'steps': 793, 'batch_loss/train': 0.9128897180780768} 12/21/2021 17:05:22 - INFO - codeparrot_training - Step 794: {'lr': 0.0004998096112375968, 'samples': 407040, 'steps': 794, 'batch_loss/train': 1.0378374606370926} 12/21/2021 17:05:34 - INFO - codeparrot_training - Step 795: {'lr': 0.0004998089956696384, 'samples': 407552, 'steps': 795, 'batch_loss/train': 0.9944586558267474} 12/21/2021 17:05:44 - INFO - codeparrot_training - Step 796: {'lr': 0.000499808379108534, 'samples': 408064, 'steps': 796, 'batch_loss/train': 0.9192905947566032} 12/21/2021 17:05:55 - INFO - codeparrot_training - Step 797: {'lr': 0.0004998077615542856, 'samples': 408576, 'steps': 797, 'batch_loss/train': 0.9454828109592199} 12/21/2021 17:06:05 - INFO - codeparrot_training - Step 798: {'lr': 0.0004998071430068959, 'samples': 409088, 'steps': 798, 'batch_loss/train': 0.8841877561062574} 12/21/2021 17:06:17 - INFO - codeparrot_training - Step 799: {'lr': 0.0004998065234663672, 'samples': 409600, 'steps': 799, 'batch_loss/train': 0.8881682259961963} 12/21/2021 17:06:28 - INFO - codeparrot_training - Step 800: {'lr': 0.0004998059029327023, 'samples': 410112, 'steps': 800, 'batch_loss/train': 0.9492154698818922} 12/21/2021 17:06:38 - INFO - codeparrot_training - Step 801: {'lr': 0.0004998052814059033, 'samples': 410624, 'steps': 801, 'batch_loss/train': 0.9140373067930341} 12/21/2021 17:06:51 - INFO - codeparrot_training - Step 802: {'lr': 0.000499804658885973, 'samples': 411136, 'steps': 802, 'batch_loss/train': 0.9074042178690434} 12/21/2021 
17:07:02 - INFO - codeparrot_training - Step 803: {'lr': 0.0004998040353729135, 'samples': 411648, 'steps': 803, 'batch_loss/train': 0.889681194908917} 12/21/2021 17:07:13 - INFO - codeparrot_training - Step 804: {'lr': 0.0004998034108667276, 'samples': 412160, 'steps': 804, 'batch_loss/train': 0.9234187286347151} 12/21/2021 17:07:25 - INFO - codeparrot_training - Step 805: {'lr': 0.0004998027853674176, 'samples': 412672, 'steps': 805, 'batch_loss/train': 0.9126674877479672} 12/21/2021 17:07:35 - INFO - codeparrot_training - Step 806: {'lr': 0.0004998021588749861, 'samples': 413184, 'steps': 806, 'batch_loss/train': 1.1862736027687788} 12/21/2021 17:07:46 - INFO - codeparrot_training - Step 807: {'lr': 0.0004998015313894355, 'samples': 413696, 'steps': 807, 'batch_loss/train': 1.037706938572228} 12/21/2021 17:07:56 - INFO - codeparrot_training - Step 808: {'lr': 0.0004998009029107684, 'samples': 414208, 'steps': 808, 'batch_loss/train': 0.9012275338172913} 12/21/2021 17:08:08 - INFO - codeparrot_training - Step 809: {'lr': 0.0004998002734389872, 'samples': 414720, 'steps': 809, 'batch_loss/train': 0.8174186530523002} 12/21/2021 17:08:19 - INFO - codeparrot_training - Step 810: {'lr': 0.0004997996429740943, 'samples': 415232, 'steps': 810, 'batch_loss/train': 0.9302441347390413} 12/21/2021 17:08:30 - INFO - codeparrot_training - Step 811: {'lr': 0.0004997990115160925, 'samples': 415744, 'steps': 811, 'batch_loss/train': 0.9354924131184816} 12/21/2021 17:08:42 - INFO - codeparrot_training - Step 812: {'lr': 0.0004997983790649841, 'samples': 416256, 'steps': 812, 'batch_loss/train': 0.882003903388977} 12/21/2021 17:08:53 - INFO - codeparrot_training - Step 813: {'lr': 0.0004997977456207717, 'samples': 416768, 'steps': 813, 'batch_loss/train': 0.9220817005261779} 12/21/2021 17:09:04 - INFO - codeparrot_training - Step 814: {'lr': 0.0004997971111834577, 'samples': 417280, 'steps': 814, 'batch_loss/train': 0.9384055119007826} 12/21/2021 17:09:16 - INFO - 
codeparrot_training - Step 815: {'lr': 0.0004997964757530448, 'samples': 417792, 'steps': 815, 'batch_loss/train': 1.0483489474281669} 12/21/2021 17:09:26 - INFO - codeparrot_training - Step 816: {'lr': 0.0004997958393295354, 'samples': 418304, 'steps': 816, 'batch_loss/train': 1.0413115546107292} 12/21/2021 17:09:37 - INFO - codeparrot_training - Step 817: {'lr': 0.000499795201912932, 'samples': 418816, 'steps': 817, 'batch_loss/train': 0.8990647792816162} 12/21/2021 17:09:47 - INFO - codeparrot_training - Step 818: {'lr': 0.0004997945635032373, 'samples': 419328, 'steps': 818, 'batch_loss/train': 0.8802832514047623} 12/21/2021 17:10:00 - INFO - codeparrot_training - Step 819: {'lr': 0.0004997939241004536, 'samples': 419840, 'steps': 819, 'batch_loss/train': 0.9417443200945854} 12/21/2021 17:10:11 - INFO - codeparrot_training - Step 820: {'lr': 0.0004997932837045837, 'samples': 420352, 'steps': 820, 'batch_loss/train': 0.8611459489911795} 12/21/2021 17:10:21 - INFO - codeparrot_training - Step 821: {'lr': 0.00049979264231563, 'samples': 420864, 'steps': 821, 'batch_loss/train': 0.9343234058469534} 12/21/2021 17:10:33 - INFO - codeparrot_training - Step 822: {'lr': 0.000499791999933595, 'samples': 421376, 'steps': 822, 'batch_loss/train': 1.006358064711094} 12/21/2021 17:10:44 - INFO - codeparrot_training - Step 823: {'lr': 0.0004997913565584814, 'samples': 421888, 'steps': 823, 'batch_loss/train': 1.0345821371302009} 12/21/2021 17:10:54 - INFO - codeparrot_training - Step 824: {'lr': 0.0004997907121902917, 'samples': 422400, 'steps': 824, 'batch_loss/train': 0.8690118379890919} 12/21/2021 17:11:07 - INFO - codeparrot_training - Step 825: {'lr': 0.0004997900668290284, 'samples': 422912, 'steps': 825, 'batch_loss/train': 0.6611096791457385} 12/21/2021 17:11:17 - INFO - codeparrot_training - Step 826: {'lr': 0.0004997894204746941, 'samples': 423424, 'steps': 826, 'batch_loss/train': 0.8851624899543822} 12/21/2021 17:11:28 - INFO - codeparrot_training - Step 827: 
{'lr': 0.0004997887731272914, 'samples': 423936, 'steps': 827, 'batch_loss/train': 0.9128889078274369} 12/21/2021 17:11:40 - INFO - codeparrot_training - Step 828: {'lr': 0.0004997881247868228, 'samples': 424448, 'steps': 828, 'batch_loss/train': 0.7799935028888285} 12/21/2021 17:11:51 - INFO - codeparrot_training - Step 829: {'lr': 0.000499787475453291, 'samples': 424960, 'steps': 829, 'batch_loss/train': 0.9238556064665318} 12/21/2021 17:12:02 - INFO - codeparrot_training - Step 830: {'lr': 0.0004997868251266985, 'samples': 425472, 'steps': 830, 'batch_loss/train': 0.9082990828901529} 12/21/2021 17:12:12 - INFO - codeparrot_training - Step 831: {'lr': 0.000499786173807048, 'samples': 425984, 'steps': 831, 'batch_loss/train': 0.822741243056953} 12/21/2021 17:12:24 - INFO - codeparrot_training - Step 832: {'lr': 0.0004997855214943418, 'samples': 426496, 'steps': 832, 'batch_loss/train': 0.8597711115144193} 12/21/2021 17:12:35 - INFO - codeparrot_training - Step 833: {'lr': 0.0004997848681885828, 'samples': 427008, 'steps': 833, 'batch_loss/train': 0.9635952524840832} 12/21/2021 17:12:45 - INFO - codeparrot_training - Step 834: {'lr': 0.0004997842138897735, 'samples': 427520, 'steps': 834, 'batch_loss/train': 0.9419303433969617} 12/21/2021 17:12:57 - INFO - codeparrot_training - Step 835: {'lr': 0.0004997835585979164, 'samples': 428032, 'steps': 835, 'batch_loss/train': 0.9472830363083631} 12/21/2021 17:13:08 - INFO - codeparrot_training - Step 836: {'lr': 0.0004997829023130142, 'samples': 428544, 'steps': 836, 'batch_loss/train': 0.7724728945177048} 12/21/2021 17:13:19 - INFO - codeparrot_training - Step 837: {'lr': 0.0004997822450350695, 'samples': 429056, 'steps': 837, 'batch_loss/train': 0.8159578728955239} 12/21/2021 17:13:31 - INFO - codeparrot_training - Step 838: {'lr': 0.0004997815867640849, 'samples': 429568, 'steps': 838, 'batch_loss/train': 0.7640949334017932} 12/21/2021 17:13:41 - INFO - codeparrot_training - Step 839: {'lr': 0.0004997809275000631, 
'samples': 430080, 'steps': 839, 'batch_loss/train': 0.9598758053034544} 12/21/2021 17:13:52 - INFO - codeparrot_training - Step 840: {'lr': 0.0004997802672430065, 'samples': 430592, 'steps': 840, 'batch_loss/train': 0.9455451741814613} 12/21/2021 17:14:05 - INFO - codeparrot_training - Step 841: {'lr': 0.000499779605992918, 'samples': 431104, 'steps': 841, 'batch_loss/train': 0.942421811632812} 12/21/2021 17:14:15 - INFO - codeparrot_training - Step 842: {'lr': 0.0004997789437498001, 'samples': 431616, 'steps': 842, 'batch_loss/train': 0.9513060636818409} 12/21/2021 17:14:26 - INFO - codeparrot_training - Step 843: {'lr': 0.0004997782805136554, 'samples': 432128, 'steps': 843, 'batch_loss/train': 1.0295728594064713} 12/21/2021 17:14:36 - INFO - codeparrot_training - Step 844: {'lr': 0.0004997776162844865, 'samples': 432640, 'steps': 844, 'batch_loss/train': 0.9750628145411611} 12/21/2021 17:14:49 - INFO - codeparrot_training - Step 845: {'lr': 0.0004997769510622961, 'samples': 433152, 'steps': 845, 'batch_loss/train': 0.7850676458328962} 12/21/2021 17:14:59 - INFO - codeparrot_training - Step 846: {'lr': 0.0004997762848470871, 'samples': 433664, 'steps': 846, 'batch_loss/train': 0.8023728863336146} 12/21/2021 17:15:10 - INFO - codeparrot_training - Step 847: {'lr': 0.0004997756176388615, 'samples': 434176, 'steps': 847, 'batch_loss/train': 0.984395582228899} 12/21/2021 17:15:22 - INFO - codeparrot_training - Step 848: {'lr': 0.0004997749494376227, 'samples': 434688, 'steps': 848, 'batch_loss/train': 0.9751801751554012} 12/21/2021 17:15:33 - INFO - codeparrot_training - Step 849: {'lr': 0.0004997742802433729, 'samples': 435200, 'steps': 849, 'batch_loss/train': 1.072255440056324} 12/21/2021 17:15:43 - INFO - codeparrot_training - Step 850: {'lr': 0.0004997736100561149, 'samples': 435712, 'steps': 850, 'batch_loss/train': 0.7036756607703865} 12/21/2021 17:15:56 - INFO - codeparrot_training - Step 851: {'lr': 0.0004997729388758513, 'samples': 436224, 'steps': 851, 
'batch_loss/train': 0.7454155432060361} 12/21/2021 17:16:06 - INFO - codeparrot_training - Step 852: {'lr': 0.0004997722667025848, 'samples': 436736, 'steps': 852, 'batch_loss/train': 0.9254516665823758} 12/21/2021 17:16:17 - INFO - codeparrot_training - Step 853: {'lr': 0.0004997715935363181, 'samples': 437248, 'steps': 853, 'batch_loss/train': 1.0428108582273126} 12/21/2021 17:16:28 - INFO - codeparrot_training - Step 854: {'lr': 0.0004997709193770538, 'samples': 437760, 'steps': 854, 'batch_loss/train': 0.8779808301478624} 12/21/2021 17:16:40 - INFO - codeparrot_training - Step 855: {'lr': 0.0004997702442247947, 'samples': 438272, 'steps': 855, 'batch_loss/train': 0.9166898503899574} 12/21/2021 17:16:50 - INFO - codeparrot_training - Step 856: {'lr': 0.0004997695680795434, 'samples': 438784, 'steps': 856, 'batch_loss/train': 0.930611576884985} 12/21/2021 17:17:01 - INFO - codeparrot_training - Step 857: {'lr': 0.0004997688909413027, 'samples': 439296, 'steps': 857, 'batch_loss/train': 0.9523218180984259} 12/21/2021 17:17:13 - INFO - codeparrot_training - Step 858: {'lr': 0.000499768212810075, 'samples': 439808, 'steps': 858, 'batch_loss/train': 0.9139770325273275} 12/21/2021 17:17:24 - INFO - codeparrot_training - Step 859: {'lr': 0.0004997675336858633, 'samples': 440320, 'steps': 859, 'batch_loss/train': 0.8533319418784231} 12/21/2021 17:17:35 - INFO - codeparrot_training - Step 860: {'lr': 0.0004997668535686702, 'samples': 440832, 'steps': 860, 'batch_loss/train': 0.9894550524186343} 12/21/2021 17:17:48 - INFO - codeparrot_training - Step 861: {'lr': 0.0004997661724584984, 'samples': 441344, 'steps': 861, 'batch_loss/train': 1.8830384816974401} 12/21/2021 17:17:59 - INFO - codeparrot_training - Step 862: {'lr': 0.0004997654903553506, 'samples': 441856, 'steps': 862, 'batch_loss/train': 0.901781047694385} 12/21/2021 17:18:09 - INFO - codeparrot_training - Step 863: {'lr': 0.0004997648072592295, 'samples': 442368, 'steps': 863, 'batch_loss/train': 
1.52697267010808} 12/21/2021 17:18:20 - INFO - codeparrot_training - Step 864: {'lr': 0.0004997641231701379, 'samples': 442880, 'steps': 864, 'batch_loss/train': 0.9294169563800097} 12/21/2021 17:18:32 - INFO - codeparrot_training - Step 865: {'lr': 0.0004997634380880784, 'samples': 443392, 'steps': 865, 'batch_loss/train': 1.5136930644512177} 12/21/2021 17:18:43 - INFO - codeparrot_training - Step 866: {'lr': 0.0004997627520130538, 'samples': 443904, 'steps': 866, 'batch_loss/train': 1.069660285487771} 12/21/2021 17:18:53 - INFO - codeparrot_training - Step 867: {'lr': 0.0004997620649450667, 'samples': 444416, 'steps': 867, 'batch_loss/train': 0.8720626179128885} 12/21/2021 17:19:06 - INFO - codeparrot_training - Step 868: {'lr': 0.0004997613768841201, 'samples': 444928, 'steps': 868, 'batch_loss/train': 0.884731070138514} 12/21/2021 17:19:16 - INFO - codeparrot_training - Step 869: {'lr': 0.0004997606878302164, 'samples': 445440, 'steps': 869, 'batch_loss/train': 0.9463777532801032} 12/21/2021 17:19:27 - INFO - codeparrot_training - Step 870: {'lr': 0.0004997599977833587, 'samples': 445952, 'steps': 870, 'batch_loss/train': 1.0486275823786855} 12/21/2021 17:19:39 - INFO - codeparrot_training - Step 871: {'lr': 0.0004997593067435494, 'samples': 446464, 'steps': 871, 'batch_loss/train': 0.9842685889452696} 12/21/2021 17:19:50 - INFO - codeparrot_training - Step 872: {'lr': 0.0004997586147107914, 'samples': 446976, 'steps': 872, 'batch_loss/train': 0.8689216449856758} 12/21/2021 17:20:01 - INFO - codeparrot_training - Step 873: {'lr': 0.0004997579216850875, 'samples': 447488, 'steps': 873, 'batch_loss/train': 0.9537707678973675} 12/21/2021 17:20:11 - INFO - codeparrot_training - Step 874: {'lr': 0.0004997572276664405, 'samples': 448000, 'steps': 874, 'batch_loss/train': 1.264841839671135} 12/21/2021 17:20:23 - INFO - codeparrot_training - Step 875: {'lr': 0.000499756532654853, 'samples': 448512, 'steps': 875, 'batch_loss/train': 0.9504139134660363} 12/21/2021 
17:20:34 - INFO - codeparrot_training - Step 876: {'lr': 0.0004997558366503277, 'samples': 449024, 'steps': 876, 'batch_loss/train': 0.9323732294142246} 12/21/2021 17:20:45 - INFO - codeparrot_training - Step 877: {'lr': 0.0004997551396528677, 'samples': 449536, 'steps': 877, 'batch_loss/train': 1.0832994310185313} 12/21/2021 17:20:57 - INFO - codeparrot_training - Step 878: {'lr': 0.0004997544416624755, 'samples': 450048, 'steps': 878, 'batch_loss/train': 0.9073361391201615} 12/21/2021 17:21:08 - INFO - codeparrot_training - Step 879: {'lr': 0.0004997537426791539, 'samples': 450560, 'steps': 879, 'batch_loss/train': 0.8538142582401633} 12/21/2021 17:21:18 - INFO - codeparrot_training - Step 880: {'lr': 0.0004997530427029058, 'samples': 451072, 'steps': 880, 'batch_loss/train': 0.9367448491975665} 12/21/2021 17:21:30 - INFO - codeparrot_training - Step 881: {'lr': 0.0004997523417337339, 'samples': 451584, 'steps': 881, 'batch_loss/train': 0.9488799460232258} 12/21/2021 17:21:41 - INFO - codeparrot_training - Step 882: {'lr': 0.0004997516397716408, 'samples': 452096, 'steps': 882, 'batch_loss/train': 0.8153837828431278} 12/21/2021 17:21:52 - INFO - codeparrot_training - Step 883: {'lr': 0.0004997509368166297, 'samples': 452608, 'steps': 883, 'batch_loss/train': 0.903922101482749} 12/21/2021 17:22:02 - INFO - codeparrot_training - Step 884: {'lr': 0.0004997502328687031, 'samples': 453120, 'steps': 884, 'batch_loss/train': 1.0400319676846266} 12/21/2021 17:22:14 - INFO - codeparrot_training - Step 885: {'lr': 0.0004997495279278639, 'samples': 453632, 'steps': 885, 'batch_loss/train': 0.9313750742003322} 12/21/2021 17:22:25 - INFO - codeparrot_training - Step 886: {'lr': 0.0004997488219941149, 'samples': 454144, 'steps': 886, 'batch_loss/train': 0.8759532701224089} 12/21/2021 17:22:35 - INFO - codeparrot_training - Step 887: {'lr': 0.0004997481150674589, 'samples': 454656, 'steps': 887, 'batch_loss/train': 0.952106449753046} 12/21/2021 17:22:49 - INFO - 
codeparrot_training - Step 888: {'lr': 0.0004997474071478986, 'samples': 455168, 'steps': 888, 'batch_loss/train': 0.908934413921088} 12/21/2021 17:23:00 - INFO - codeparrot_training - Step 889: {'lr': 0.0004997466982354369, 'samples': 455680, 'steps': 889, 'batch_loss/train': 0.9488861933350563} 12/21/2021 17:23:10 - INFO - codeparrot_training - Step 890: {'lr': 0.0004997459883300766, 'samples': 456192, 'steps': 890, 'batch_loss/train': 1.2181328516453505} 12/21/2021 17:23:23 - INFO - codeparrot_training - Step 891: {'lr': 0.0004997452774318208, 'samples': 456704, 'steps': 891, 'batch_loss/train': 0.9329738602973521} 12/21/2021 17:23:33 - INFO - codeparrot_training - Step 892: {'lr': 0.0004997445655406718, 'samples': 457216, 'steps': 892, 'batch_loss/train': 0.9559244178235531} 12/21/2021 17:23:44 - INFO - codeparrot_training - Step 893: {'lr': 0.0004997438526566329, 'samples': 457728, 'steps': 893, 'batch_loss/train': 0.9444565968587995} 12/21/2021 17:23:56 - INFO - codeparrot_training - Step 894: {'lr': 0.0004997431387797067, 'samples': 458240, 'steps': 894, 'batch_loss/train': 0.9750812072306871} 12/21/2021 17:24:06 - INFO - codeparrot_training - Step 895: {'lr': 0.000499742423909896, 'samples': 458752, 'steps': 895, 'batch_loss/train': 0.9508182452991605} 12/21/2021 17:24:17 - INFO - codeparrot_training - Step 896: {'lr': 0.0004997417080472037, 'samples': 459264, 'steps': 896, 'batch_loss/train': 0.9594501657411456} 12/21/2021 17:24:27 - INFO - codeparrot_training - Step 897: {'lr': 0.0004997409911916327, 'samples': 459776, 'steps': 897, 'batch_loss/train': 0.9807299934327602} 12/21/2021 17:24:40 - INFO - codeparrot_training - Step 898: {'lr': 0.0004997402733431859, 'samples': 460288, 'steps': 898, 'batch_loss/train': 0.9721807986497879} 12/21/2021 17:24:51 - INFO - codeparrot_training - Step 899: {'lr': 0.000499739554501866, 'samples': 460800, 'steps': 899, 'batch_loss/train': 0.920583606697619} 12/21/2021 17:25:01 - INFO - codeparrot_training - Step 900: 
{'lr': 0.000499738834667676, 'samples': 461312, 'steps': 900, 'batch_loss/train': 0.9912521066144109} 12/21/2021 17:25:13 - INFO - codeparrot_training - Step 901: {'lr': 0.0004997381138406186, 'samples': 461824, 'steps': 901, 'batch_loss/train': 0.9627675246447325} 12/21/2021 17:25:24 - INFO - codeparrot_training - Step 902: {'lr': 0.0004997373920206968, 'samples': 462336, 'steps': 902, 'batch_loss/train': 0.9098580265417695} 12/21/2021 17:25:35 - INFO - codeparrot_training - Step 903: {'lr': 0.0004997366692079133, 'samples': 462848, 'steps': 903, 'batch_loss/train': 0.903676787391305} 12/21/2021 17:25:47 - INFO - codeparrot_training - Step 904: {'lr': 0.0004997359454022714, 'samples': 463360, 'steps': 904, 'batch_loss/train': 0.9083635397255421} 12/21/2021 17:25:58 - INFO - codeparrot_training - Step 905: {'lr': 0.0004997352206037734, 'samples': 463872, 'steps': 905, 'batch_loss/train': 0.853295112028718} 12/21/2021 17:26:09 - INFO - codeparrot_training - Step 906: {'lr': 0.0004997344948124226, 'samples': 464384, 'steps': 906, 'batch_loss/train': 1.2307321727275848} 12/21/2021 17:26:19 - INFO - codeparrot_training - Step 907: {'lr': 0.0004997337680282217, 'samples': 464896, 'steps': 907, 'batch_loss/train': 0.9160974379628897} 12/21/2021 17:26:31 - INFO - codeparrot_training - Step 908: {'lr': 0.0004997330402511736, 'samples': 465408, 'steps': 908, 'batch_loss/train': 1.1366801457479596} 12/21/2021 17:26:42 - INFO - codeparrot_training - Step 909: {'lr': 0.0004997323114812813, 'samples': 465920, 'steps': 909, 'batch_loss/train': 0.9090405153110623} 12/21/2021 17:26:52 - INFO - codeparrot_training - Step 910: {'lr': 0.0004997315817185476, 'samples': 466432, 'steps': 910, 'batch_loss/train': 1.0287680439651012} 12/21/2021 17:27:04 - INFO - codeparrot_training - Step 911: {'lr': 0.0004997308509629754, 'samples': 466944, 'steps': 911, 'batch_loss/train': 0.9808224998414516} 12/21/2021 17:27:15 - INFO - codeparrot_training - Step 912: {'lr': 0.0004997301192145677, 
'samples': 467456, 'steps': 912, 'batch_loss/train': 0.8896083422005177} 12/21/2021 17:27:26 - INFO - codeparrot_training - Step 913: {'lr': 0.0004997293864733273, 'samples': 467968, 'steps': 913, 'batch_loss/train': 0.8836987032555044} 12/21/2021 17:27:38 - INFO - codeparrot_training - Step 914: {'lr': 0.000499728652739257, 'samples': 468480, 'steps': 914, 'batch_loss/train': 1.1301283570937812} 12/21/2021 17:27:49 - INFO - codeparrot_training - Step 915: {'lr': 0.0004997279180123601, 'samples': 468992, 'steps': 915, 'batch_loss/train': 0.8975952491164207} 12/21/2021 17:27:59 - INFO - codeparrot_training - Step 916: {'lr': 0.0004997271822926391, 'samples': 469504, 'steps': 916, 'batch_loss/train': 0.8759130109101534} 12/21/2021 17:28:10 - INFO - codeparrot_training - Step 917: {'lr': 0.0004997264455800974, 'samples': 470016, 'steps': 917, 'batch_loss/train': 0.9435654133558273} 12/21/2021 17:28:22 - INFO - codeparrot_training - Step 918: {'lr': 0.0004997257078747374, 'samples': 470528, 'steps': 918, 'batch_loss/train': 0.978057200089097} 12/21/2021 17:28:33 - INFO - codeparrot_training - Step 919: {'lr': 0.0004997249691765622, 'samples': 471040, 'steps': 919, 'batch_loss/train': 0.8530536172911525} 12/21/2021 17:28:43 - INFO - codeparrot_training - Step 920: {'lr': 0.0004997242294855751, 'samples': 471552, 'steps': 920, 'batch_loss/train': 0.8414009558036923} 12/21/2021 17:28:56 - INFO - codeparrot_training - Step 921: {'lr': 0.0004997234888017786, 'samples': 472064, 'steps': 921, 'batch_loss/train': 0.9589731092564762} 12/21/2021 17:29:07 - INFO - codeparrot_training - Step 922: {'lr': 0.0004997227471251758, 'samples': 472576, 'steps': 922, 'batch_loss/train': 0.8600756211671978} 12/21/2021 17:29:17 - INFO - codeparrot_training - Step 923: {'lr': 0.0004997220044557696, 'samples': 473088, 'steps': 923, 'batch_loss/train': 0.8780028335750103} 12/21/2021 17:29:29 - INFO - codeparrot_training - Step 924: {'lr': 0.0004997212607935631, 'samples': 473600, 'steps': 924, 
'batch_loss/train': 0.8099083518609405} 12/21/2021 17:29:40 - INFO - codeparrot_training - Step 925: {'lr': 0.0004997205161385591, 'samples': 474112, 'steps': 925, 'batch_loss/train': 0.8519788309931755} 12/21/2021 17:29:50 - INFO - codeparrot_training - Step 926: {'lr': 0.0004997197704907607, 'samples': 474624, 'steps': 926, 'batch_loss/train': 0.8584920326247811} 12/21/2021 17:30:01 - INFO - codeparrot_training - Step 927: {'lr': 0.0004997190238501708, 'samples': 475136, 'steps': 927, 'batch_loss/train': 0.8563522147014737} 12/21/2021 17:30:13 - INFO - codeparrot_training - Step 928: {'lr': 0.0004997182762167922, 'samples': 475648, 'steps': 928, 'batch_loss/train': 0.9796495949849486} 12/21/2021 17:30:24 - INFO - codeparrot_training - Step 929: {'lr': 0.0004997175275906281, 'samples': 476160, 'steps': 929, 'batch_loss/train': 0.9414922697469592} 12/21/2021 17:30:34 - INFO - codeparrot_training - Step 930: {'lr': 0.0004997167779716814, 'samples': 476672, 'steps': 930, 'batch_loss/train': 0.9104800587520003} 12/21/2021 17:30:46 - INFO - codeparrot_training - Step 931: {'lr': 0.0004997160273599551, 'samples': 477184, 'steps': 931, 'batch_loss/train': 0.9091139989905059} 12/21/2021 17:30:57 - INFO - codeparrot_training - Step 932: {'lr': 0.0004997152757554522, 'samples': 477696, 'steps': 932, 'batch_loss/train': 0.9514064388349652} 12/21/2021 17:31:07 - INFO - codeparrot_training - Step 933: {'lr': 0.0004997145231581756, 'samples': 478208, 'steps': 933, 'batch_loss/train': 0.922889971639961} 12/21/2021 17:31:20 - INFO - codeparrot_training - Step 934: {'lr': 0.0004997137695681283, 'samples': 478720, 'steps': 934, 'batch_loss/train': 0.9761518789455295} 12/21/2021 17:31:31 - INFO - codeparrot_training - Step 935: {'lr': 0.0004997130149853134, 'samples': 479232, 'steps': 935, 'batch_loss/train': 0.7854829244315624} 12/21/2021 17:31:41 - INFO - codeparrot_training - Step 936: {'lr': 0.0004997122594097338, 'samples': 479744, 'steps': 936, 'batch_loss/train': 
1.0057788996491581} 12/21/2021 17:31:52 - INFO - codeparrot_training - Step 937: {'lr': 0.0004997115028413926, 'samples': 480256, 'steps': 937, 'batch_loss/train': 1.1213493403047323} 12/21/2021 17:32:04 - INFO - codeparrot_training - Step 938: {'lr': 0.0004997107452802927, 'samples': 480768, 'steps': 938, 'batch_loss/train': 0.7906994260847569} 12/21/2021 17:32:15 - INFO - codeparrot_training - Step 939: {'lr': 0.0004997099867264372, 'samples': 481280, 'steps': 939, 'batch_loss/train': 0.959058933891356} 12/21/2021 17:32:25 - INFO - codeparrot_training - Step 940: {'lr': 0.0004997092271798291, 'samples': 481792, 'steps': 940, 'batch_loss/train': 0.8127758912742138} 12/21/2021 17:32:37 - INFO - codeparrot_training - Step 941: {'lr': 0.0004997084666404713, 'samples': 482304, 'steps': 941, 'batch_loss/train': 0.9607830392196774} 12/21/2021 17:32:48 - INFO - codeparrot_training - Step 942: {'lr': 0.0004997077051083669, 'samples': 482816, 'steps': 942, 'batch_loss/train': 0.9080006247386336} 12/21/2021 17:32:58 - INFO - codeparrot_training - Step 943: {'lr': 0.0004997069425835191, 'samples': 483328, 'steps': 943, 'batch_loss/train': 0.9566507814452052} 12/21/2021 17:33:11 - INFO - codeparrot_training - Step 944: {'lr': 0.0004997061790659307, 'samples': 483840, 'steps': 944, 'batch_loss/train': 1.0351978112012148} 12/21/2021 17:33:22 - INFO - codeparrot_training - Step 945: {'lr': 0.0004997054145556048, 'samples': 484352, 'steps': 945, 'batch_loss/train': 0.9912108490243554} 12/21/2021 17:33:32 - INFO - codeparrot_training - Step 946: {'lr': 0.0004997046490525446, 'samples': 484864, 'steps': 946, 'batch_loss/train': 0.9437259715050459} 12/21/2021 17:33:45 - INFO - codeparrot_training - Step 947: {'lr': 0.0004997038825567528, 'samples': 485376, 'steps': 947, 'batch_loss/train': 0.9389331797137856} 12/21/2021 17:33:55 - INFO - codeparrot_training - Step 948: {'lr': 0.0004997031150682328, 'samples': 485888, 'steps': 948, 'batch_loss/train': 0.7905911942943931} 12/21/2021 
17:34:06 - INFO - codeparrot_training - Step 949: {'lr': 0.0004997023465869874, 'samples': 486400, 'steps': 949, 'batch_loss/train': 0.9152175458148122} 12/21/2021 17:34:16 - INFO - codeparrot_training - Step 950: {'lr': 0.0004997015771130198, 'samples': 486912, 'steps': 950, 'batch_loss/train': 1.284138262271881} 12/21/2021 17:34:29 - INFO - codeparrot_training - Step 951: {'lr': 0.000499700806646333, 'samples': 487424, 'steps': 951, 'batch_loss/train': 0.8678724225610495} 12/21/2021 17:34:40 - INFO - codeparrot_training - Step 952: {'lr': 0.00049970003518693, 'samples': 487936, 'steps': 952, 'batch_loss/train': 0.8930850867182016} 12/21/2021 17:34:50 - INFO - codeparrot_training - Step 953: {'lr': 0.0004996992627348141, 'samples': 488448, 'steps': 953, 'batch_loss/train': 0.9073094232007861} 12/21/2021 17:35:02 - INFO - codeparrot_training - Step 954: {'lr': 0.0004996984892899882, 'samples': 488960, 'steps': 954, 'batch_loss/train': 1.0472514182329178} 12/21/2021 17:35:13 - INFO - codeparrot_training - Step 955: {'lr': 0.0004996977148524553, 'samples': 489472, 'steps': 955, 'batch_loss/train': 0.9209820367395878} 12/21/2021 17:35:23 - INFO - codeparrot_training - Step 956: {'lr': 0.0004996969394222186, 'samples': 489984, 'steps': 956, 'batch_loss/train': 0.8026992455124855} 12/21/2021 17:35:35 - INFO - codeparrot_training - Step 957: {'lr': 0.0004996961629992811, 'samples': 490496, 'steps': 957, 'batch_loss/train': 0.9692586734890938} 12/21/2021 17:35:46 - INFO - codeparrot_training - Step 958: {'lr': 0.000499695385583646, 'samples': 491008, 'steps': 958, 'batch_loss/train': 0.9362279893830419} 12/21/2021 17:35:57 - INFO - codeparrot_training - Step 959: {'lr': 0.0004996946071753164, 'samples': 491520, 'steps': 959, 'batch_loss/train': 0.9494533007964492} 12/21/2021 17:36:07 - INFO - codeparrot_training - Step 960: {'lr': 0.0004996938277742951, 'samples': 492032, 'steps': 960, 'batch_loss/train': 0.8683849526569247} 12/21/2021 17:36:20 - INFO - 
codeparrot_training - Step 961: {'lr': 0.0004996930473805856, 'samples': 492544, 'steps': 961, 'batch_loss/train': 1.2063287291675806} 12/21/2021 17:36:30 - INFO - codeparrot_training - Step 962: {'lr': 0.0004996922659941908, 'samples': 493056, 'steps': 962, 'batch_loss/train': 0.8595601376146078} 12/21/2021 17:36:41 - INFO - codeparrot_training - Step 963: {'lr': 0.0004996914836151138, 'samples': 493568, 'steps': 963, 'batch_loss/train': 0.9041547570377588} 12/21/2021 17:36:53 - INFO - codeparrot_training - Step 964: {'lr': 0.0004996907002433578, 'samples': 494080, 'steps': 964, 'batch_loss/train': 0.8873012363910675} 12/21/2021 17:37:04 - INFO - codeparrot_training - Step 965: {'lr': 0.0004996899158789258, 'samples': 494592, 'steps': 965, 'batch_loss/train': 0.8391993469558656} 12/21/2021 17:37:14 - INFO - codeparrot_training - Step 966: {'lr': 0.000499689130521821, 'samples': 495104, 'steps': 966, 'batch_loss/train': 0.9376294063404202} 12/21/2021 17:37:26 - INFO - codeparrot_training - Step 967: {'lr': 0.0004996883441720464, 'samples': 495616, 'steps': 967, 'batch_loss/train': 1.32773732021451} 12/21/2021 17:37:37 - INFO - codeparrot_training - Step 968: {'lr': 0.0004996875568296053, 'samples': 496128, 'steps': 968, 'batch_loss/train': 0.9905015528202057} 12/21/2021 17:37:47 - INFO - codeparrot_training - Step 969: {'lr': 0.0004996867684945008, 'samples': 496640, 'steps': 969, 'batch_loss/train': 0.8725618738681078} 12/21/2021 17:37:58 - INFO - codeparrot_training - Step 970: {'lr': 0.0004996859791667359, 'samples': 497152, 'steps': 970, 'batch_loss/train': 1.0036595575511456} 12/21/2021 17:38:10 - INFO - codeparrot_training - Step 971: {'lr': 0.0004996851888463138, 'samples': 497664, 'steps': 971, 'batch_loss/train': 0.8835762604139745} 12/21/2021 17:38:21 - INFO - codeparrot_training - Step 972: {'lr': 0.0004996843975332378, 'samples': 498176, 'steps': 972, 'batch_loss/train': 1.055220203474164} 12/21/2021 17:38:31 - INFO - codeparrot_training - Step 973: 
{'lr': 0.0004996836052275109, 'samples': 498688, 'steps': 973, 'batch_loss/train': 0.9634850195143372} 12/21/2021 17:38:44 - INFO - codeparrot_training - Step 974: {'lr': 0.0004996828119291361, 'samples': 499200, 'steps': 974, 'batch_loss/train': 0.8381894044578075} 12/21/2021 17:38:55 - INFO - codeparrot_training - Step 975: {'lr': 0.0004996820176381169, 'samples': 499712, 'steps': 975, 'batch_loss/train': 0.9537183176726103} 12/21/2021 17:39:06 - INFO - codeparrot_training - Step 976: {'lr': 0.0004996812223544562, 'samples': 500224, 'steps': 976, 'batch_loss/train': 0.934873765334487} 12/21/2021 17:39:18 - INFO - codeparrot_training - Step 977: {'lr': 0.0004996804260781572, 'samples': 500736, 'steps': 977, 'batch_loss/train': 0.8153394656255841} 12/21/2021 17:39:28 - INFO - codeparrot_training - Step 978: {'lr': 0.0004996796288092232, 'samples': 501248, 'steps': 978, 'batch_loss/train': 0.9296532776206732} 12/21/2021 17:39:39 - INFO - codeparrot_training - Step 979: {'lr': 0.0004996788305476572, 'samples': 501760, 'steps': 979, 'batch_loss/train': 0.9139515142887831} 12/21/2021 17:39:51 - INFO - codeparrot_training - Step 980: {'lr': 0.0004996780312934623, 'samples': 502272, 'steps': 980, 'batch_loss/train': 0.9188459692522883} 12/21/2021 17:40:02 - INFO - codeparrot_training - Step 981: {'lr': 0.0004996772310466421, 'samples': 502784, 'steps': 981, 'batch_loss/train': 0.8332904735580087} 12/21/2021 17:40:13 - INFO - codeparrot_training - Step 982: {'lr': 0.0004996764298071993, 'samples': 503296, 'steps': 982, 'batch_loss/train': 0.9159372635185719} 12/21/2021 17:40:23 - INFO - codeparrot_training - Step 983: {'lr': 0.0004996756275751374, 'samples': 503808, 'steps': 983, 'batch_loss/train': 0.8658452806994319} 12/21/2021 17:40:35 - INFO - codeparrot_training - Step 984: {'lr': 0.0004996748243504594, 'samples': 504320, 'steps': 984, 'batch_loss/train': 1.006726504303515} 12/21/2021 17:40:46 - INFO - codeparrot_training - Step 985: {'lr': 0.0004996740201331687, 
'samples': 504832, 'steps': 985, 'batch_loss/train': 0.9890728034079075} 12/21/2021 17:40:56 - INFO - codeparrot_training - Step 986: {'lr': 0.0004996732149232683, 'samples': 505344, 'steps': 986, 'batch_loss/train': 0.8526571863330901} 12/21/2021 17:41:08 - INFO - codeparrot_training - Step 987: {'lr': 0.0004996724087207614, 'samples': 505856, 'steps': 987, 'batch_loss/train': 0.9746766141615808} 12/21/2021 17:41:19 - INFO - codeparrot_training - Step 988: {'lr': 0.0004996716015256514, 'samples': 506368, 'steps': 988, 'batch_loss/train': 0.8539458839222789} 12/21/2021 17:41:30 - INFO - codeparrot_training - Step 989: {'lr': 0.0004996707933379413, 'samples': 506880, 'steps': 989, 'batch_loss/train': 0.9388692257925868} 12/21/2021 17:41:42 - INFO - codeparrot_training - Step 990: {'lr': 0.0004996699841576343, 'samples': 507392, 'steps': 990, 'batch_loss/train': 0.803133706562221} 12/21/2021 17:41:53 - INFO - codeparrot_training - Step 991: {'lr': 0.0004996691739847338, 'samples': 507904, 'steps': 991, 'batch_loss/train': 1.0021716505289078} 12/21/2021 17:42:03 - INFO - codeparrot_training - Step 992: {'lr': 0.000499668362819243, 'samples': 508416, 'steps': 992, 'batch_loss/train': 1.1028096228837967} 12/21/2021 17:42:14 - INFO - codeparrot_training - Step 993: {'lr': 0.0004996675506611651, 'samples': 508928, 'steps': 993, 'batch_loss/train': 0.9497033399529755} 12/21/2021 17:42:26 - INFO - codeparrot_training - Step 994: {'lr': 0.0004996667375105031, 'samples': 509440, 'steps': 994, 'batch_loss/train': 0.9420621516183019} 12/21/2021 17:42:37 - INFO - codeparrot_training - Step 995: {'lr': 0.0004996659233672606, 'samples': 509952, 'steps': 995, 'batch_loss/train': 0.9178862543776631} 12/21/2021 17:42:47 - INFO - codeparrot_training - Step 996: {'lr': 0.0004996651082314407, 'samples': 510464, 'steps': 996, 'batch_loss/train': 0.8658295627683401} 12/21/2021 17:42:59 - INFO - codeparrot_training - Step 997: {'lr': 0.0004996642921030464, 'samples': 510976, 'steps': 997, 
'batch_loss/train': 0.8900093277916312} 12/21/2021 17:43:10 - INFO - codeparrot_training - Step 998: {'lr': 0.0004996634749820812, 'samples': 511488, 'steps': 998, 'batch_loss/train': 0.8054971592500806} 12/21/2021 17:43:20 - INFO - codeparrot_training - Step 999: {'lr': 0.0004996626568685484, 'samples': 512000, 'steps': 999, 'batch_loss/train': 0.9153449330478907} 12/21/2021 17:43:33 - INFO - codeparrot_training - Step 1000: {'lr': 0.000499661837762451, 'samples': 512512, 'steps': 1000, 'batch_loss/train': 0.9770119711756706} 12/21/2021 17:43:44 - INFO - codeparrot_training - Step 1001: {'lr': 0.0004996610176637925, 'samples': 513024, 'steps': 1001, 'batch_loss/train': 0.8659257451072335} 12/21/2021 17:43:54 - INFO - codeparrot_training - Step 1002: {'lr': 0.000499660196572576, 'samples': 513536, 'steps': 1002, 'batch_loss/train': 0.8665200937539339} 12/21/2021 17:44:05 - INFO - codeparrot_training - Step 1003: {'lr': 0.0004996593744888049, 'samples': 514048, 'steps': 1003, 'batch_loss/train': 0.8351971795782447} 12/21/2021 17:44:18 - INFO - codeparrot_training - Step 1004: {'lr': 0.0004996585514124823, 'samples': 514560, 'steps': 1004, 'batch_loss/train': 0.9125053212046623} 12/21/2021 17:44:28 - INFO - codeparrot_training - Step 1005: {'lr': 0.0004996577273436117, 'samples': 515072, 'steps': 1005, 'batch_loss/train': 1.001401899382472} 12/21/2021 17:44:39 - INFO - codeparrot_training - Step 1006: {'lr': 0.0004996569022821961, 'samples': 515584, 'steps': 1006, 'batch_loss/train': 1.61145652923733} 12/21/2021 17:44:51 - INFO - codeparrot_training - Step 1007: {'lr': 0.000499656076228239, 'samples': 516096, 'steps': 1007, 'batch_loss/train': 0.8782493341714144} 12/21/2021 17:45:01 - INFO - codeparrot_training - Step 1008: {'lr': 0.0004996552491817437, 'samples': 516608, 'steps': 1008, 'batch_loss/train': 0.9275859594345093} 12/21/2021 17:45:12 - INFO - codeparrot_training - Step 1009: {'lr': 0.0004996544211427132, 'samples': 517120, 'steps': 1009, 
'batch_loss/train': 0.8456953288987279} 12/21/2021 17:45:24 - INFO - codeparrot_training - Step 1010: {'lr': 0.0004996535921111511, 'samples': 517632, 'steps': 1010, 'batch_loss/train': 0.9477060651406646} 12/21/2021 17:45:34 - INFO - codeparrot_training - Step 1011: {'lr': 0.0004996527620870606, 'samples': 518144, 'steps': 1011, 'batch_loss/train': 1.0580871552228928} 12/21/2021 17:45:45 - INFO - codeparrot_training - Step 1012: {'lr': 0.000499651931070445, 'samples': 518656, 'steps': 1012, 'batch_loss/train': 1.0251055853441358} 12/21/2021 17:45:58 - INFO - codeparrot_training - Step 1013: {'lr': 0.0004996510990613075, 'samples': 519168, 'steps': 1013, 'batch_loss/train': 0.9154602931812406} 12/21/2021 17:46:08 - INFO - codeparrot_training - Step 1014: {'lr': 0.0004996502660596516, 'samples': 519680, 'steps': 1014, 'batch_loss/train': 0.9666747972369194} 12/21/2021 17:46:19 - INFO - codeparrot_training - Step 1015: {'lr': 0.0004996494320654804, 'samples': 520192, 'steps': 1015, 'batch_loss/train': 0.8140917466953397} 12/21/2021 17:46:30 - INFO - codeparrot_training - Step 1016: {'lr': 0.0004996485970787973, 'samples': 520704, 'steps': 1016, 'batch_loss/train': 0.8170314254239202} 12/21/2021 17:46:42 - INFO - codeparrot_training - Step 1017: {'lr': 0.0004996477610996057, 'samples': 521216, 'steps': 1017, 'batch_loss/train': 0.9008469814434648} 12/21/2021 17:46:52 - INFO - codeparrot_training - Step 1018: {'lr': 0.0004996469241279089, 'samples': 521728, 'steps': 1018, 'batch_loss/train': 0.9004081226885319} 12/21/2021 17:47:03 - INFO - codeparrot_training - Step 1019: {'lr': 0.0004996460861637101, 'samples': 522240, 'steps': 1019, 'batch_loss/train': 0.8875172259286046} 12/21/2021 17:47:16 - INFO - codeparrot_training - Step 1020: {'lr': 0.0004996452472070128, 'samples': 522752, 'steps': 1020, 'batch_loss/train': 0.9761928748339415} 12/21/2021 17:47:26 - INFO - codeparrot_training - Step 1021: {'lr': 0.0004996444072578203, 'samples': 523264, 'steps': 1021, 
'batch_loss/train': 1.049674330279231} 12/21/2021 17:47:37 - INFO - codeparrot_training - Step 1022: {'lr': 0.0004996435663161358, 'samples': 523776, 'steps': 1022, 'batch_loss/train': 0.8628543829545379} 12/21/2021 17:47:49 - INFO - codeparrot_training - Step 1023: {'lr': 0.0004996427243819627, 'samples': 524288, 'steps': 1023, 'batch_loss/train': 0.9352536166552454} 12/21/2021 17:47:59 - INFO - codeparrot_training - Step 1024: {'lr': 0.0004996418814553046, 'samples': 524800, 'steps': 1024, 'batch_loss/train': 0.9838276915252209} 12/21/2021 17:48:10 - INFO - codeparrot_training - Step 1025: {'lr': 0.0004996410375361645, 'samples': 525312, 'steps': 1025, 'batch_loss/train': 0.9914461933076382} 12/21/2021 17:48:21 - INFO - codeparrot_training - Step 1026: {'lr': 0.000499640192624546, 'samples': 525824, 'steps': 1026, 'batch_loss/train': 0.9910847740247846} 12/21/2021 17:48:33 - INFO - codeparrot_training - Step 1027: {'lr': 0.0004996393467204522, 'samples': 526336, 'steps': 1027, 'batch_loss/train': 0.9384469399228692} 12/21/2021 17:48:43 - INFO - codeparrot_training - Step 1028: {'lr': 0.0004996384998238867, 'samples': 526848, 'steps': 1028, 'batch_loss/train': 0.888360857963562} 12/21/2021 17:48:54 - INFO - codeparrot_training - Step 1029: {'lr': 0.0004996376519348527, 'samples': 527360, 'steps': 1029, 'batch_loss/train': 0.8908344563096762} 12/21/2021 17:49:06 - INFO - codeparrot_training - Step 1030: {'lr': 0.0004996368030533538, 'samples': 527872, 'steps': 1030, 'batch_loss/train': 0.9832698026439175} 12/21/2021 17:49:17 - INFO - codeparrot_training - Step 1031: {'lr': 0.0004996359531793931, 'samples': 528384, 'steps': 1031, 'batch_loss/train': 1.0083358958363533} 12/21/2021 17:49:28 - INFO - codeparrot_training - Step 1032: {'lr': 0.0004996351023129742, 'samples': 528896, 'steps': 1032, 'batch_loss/train': 0.8692763391882181} 12/21/2021 17:49:40 - INFO - codeparrot_training - Step 1033: {'lr': 0.0004996342504541004, 'samples': 529408, 'steps': 1033, 
'batch_loss/train': 0.8394672274589539} 12/21/2021 17:49:50 - INFO - codeparrot_training - Step 1034: {'lr': 0.0004996333976027749, 'samples': 529920, 'steps': 1034, 'batch_loss/train': 0.846665873657912} 12/21/2021 17:50:01 - INFO - codeparrot_training - Step 1035: {'lr': 0.0004996325437590014, 'samples': 530432, 'steps': 1035, 'batch_loss/train': 0.9387017646804452} 12/21/2021 17:50:12 - INFO - codeparrot_training - Step 1036: {'lr': 0.0004996316889227832, 'samples': 530944, 'steps': 1036, 'batch_loss/train': 0.907019299454987} 12/21/2021 17:50:24 - INFO - codeparrot_training - Step 1037: {'lr': 0.0004996308330941236, 'samples': 531456, 'steps': 1037, 'batch_loss/train': 0.8694336637854576} 12/21/2021 17:50:35 - INFO - codeparrot_training - Step 1038: {'lr': 0.0004996299762730261, 'samples': 531968, 'steps': 1038, 'batch_loss/train': 0.9105223082005978} 12/21/2021 17:50:45 - INFO - codeparrot_training - Step 1039: {'lr': 0.0004996291184594941, 'samples': 532480, 'steps': 1039, 'batch_loss/train': 0.935341777279973} 12/21/2021 17:50:57 - INFO - codeparrot_training - Step 1040: {'lr': 0.0004996282596535309, 'samples': 532992, 'steps': 1040, 'batch_loss/train': 0.9475723467767239} 12/21/2021 17:51:08 - INFO - codeparrot_training - Step 1041: {'lr': 0.0004996273998551401, 'samples': 533504, 'steps': 1041, 'batch_loss/train': 0.8299167403019965} 12/21/2021 17:51:19 - INFO - codeparrot_training - Step 1042: {'lr': 0.000499626539064325, 'samples': 534016, 'steps': 1042, 'batch_loss/train': 0.8731899745762348} 12/21/2021 17:51:31 - INFO - codeparrot_training - Step 1043: {'lr': 0.0004996256772810889, 'samples': 534528, 'steps': 1043, 'batch_loss/train': 1.0501099117100239} 12/21/2021 17:51:41 - INFO - codeparrot_training - Step 1044: {'lr': 0.0004996248145054355, 'samples': 535040, 'steps': 1044, 'batch_loss/train': 0.8032674202695489} 12/21/2021 17:51:52 - INFO - codeparrot_training - Step 1045: {'lr': 0.000499623950737368, 'samples': 535552, 'steps': 1045, 
'batch_loss/train': 0.9964918354526162} 12/21/2021 17:52:02 - INFO - codeparrot_training - Step 1046: {'lr': 0.00049962308597689, 'samples': 536064, 'steps': 1046, 'batch_loss/train': 0.8173797582276165} 12/21/2021 17:52:14 - INFO - codeparrot_training - Step 1047: {'lr': 0.0004996222202240049, 'samples': 536576, 'steps': 1047, 'batch_loss/train': 0.9599733343347907} 12/21/2021 17:52:25 - INFO - codeparrot_training - Step 1048: {'lr': 0.0004996213534787159, 'samples': 537088, 'steps': 1048, 'batch_loss/train': 0.8657295554876328} 12/21/2021 17:52:36 - INFO - codeparrot_training - Step 1049: {'lr': 0.0004996204857410268, 'samples': 537600, 'steps': 1049, 'batch_loss/train': 0.9591828566044569} 12/21/2021 17:52:49 - INFO - codeparrot_training - Step 1050: {'lr': 0.0004996196170109409, 'samples': 538112, 'steps': 1050, 'batch_loss/train': 0.9471572078764439} 12/21/2021 17:52:59 - INFO - codeparrot_training - Step 1051: {'lr': 0.0004996187472884616, 'samples': 538624, 'steps': 1051, 'batch_loss/train': 0.64387660427019} 12/21/2021 17:53:10 - INFO - codeparrot_training - Step 1052: {'lr': 0.0004996178765735925, 'samples': 539136, 'steps': 1052, 'batch_loss/train': 0.8945763395167887} 12/21/2021 17:53:22 - INFO - codeparrot_training - Step 1053: {'lr': 0.000499617004866337, 'samples': 539648, 'steps': 1053, 'batch_loss/train': 0.9317202400416136} 12/21/2021 17:53:32 - INFO - codeparrot_training - Step 1054: {'lr': 0.0004996161321666984, 'samples': 540160, 'steps': 1054, 'batch_loss/train': 0.8093121713027358} 12/21/2021 17:53:43 - INFO - codeparrot_training - Step 1055: {'lr': 0.0004996152584746805, 'samples': 540672, 'steps': 1055, 'batch_loss/train': 1.0307663017883897} 12/21/2021 17:53:55 - INFO - codeparrot_training - Step 1056: {'lr': 0.0004996143837902864, 'samples': 541184, 'steps': 1056, 'batch_loss/train': 0.9378761565312743} 12/21/2021 17:54:06 - INFO - codeparrot_training - Step 1057: {'lr': 0.0004996135081135199, 'samples': 541696, 'steps': 1057, 
'batch_loss/train': 0.7767104115337133} 12/21/2021 17:54:16 - INFO - codeparrot_training - Step 1058: {'lr': 0.0004996126314443842, 'samples': 542208, 'steps': 1058, 'batch_loss/train': 0.9620478171855211} 12/21/2021 17:54:27 - INFO - codeparrot_training - Step 1059: {'lr': 0.0004996117537828831, 'samples': 542720, 'steps': 1059, 'batch_loss/train': 0.802382935769856} 12/21/2021 17:54:40 - INFO - codeparrot_training - Step 1060: {'lr': 0.0004996108751290198, 'samples': 543232, 'steps': 1060, 'batch_loss/train': 0.8702808013185859} 12/21/2021 17:54:50 - INFO - codeparrot_training - Step 1061: {'lr': 0.0004996099954827978, 'samples': 543744, 'steps': 1061, 'batch_loss/train': 0.9301826488226652} 12/21/2021 17:55:01 - INFO - codeparrot_training - Step 1062: {'lr': 0.0004996091148442209, 'samples': 544256, 'steps': 1062, 'batch_loss/train': 0.8478460044134408} 12/21/2021 17:55:13 - INFO - codeparrot_training - Step 1063: {'lr': 0.0004996082332132923, 'samples': 544768, 'steps': 1063, 'batch_loss/train': 0.8586355419829488} 12/21/2021 17:55:24 - INFO - codeparrot_training - Step 1064: {'lr': 0.0004996073505900157, 'samples': 545280, 'steps': 1064, 'batch_loss/train': 0.832656716927886} 12/21/2021 17:55:34 - INFO - codeparrot_training - Step 1065: {'lr': 0.0004996064669743945, 'samples': 545792, 'steps': 1065, 'batch_loss/train': 0.9472284931689501} 12/21/2021 17:55:47 - INFO - codeparrot_training - Step 1066: {'lr': 0.0004996055823664322, 'samples': 546304, 'steps': 1066, 'batch_loss/train': 0.8303265804424882} 12/21/2021 17:55:57 - INFO - codeparrot_training - Step 1067: {'lr': 0.0004996046967661324, 'samples': 546816, 'steps': 1067, 'batch_loss/train': 0.9168522106483579} 12/21/2021 17:56:08 - INFO - codeparrot_training - Step 1068: {'lr': 0.0004996038101734984, 'samples': 547328, 'steps': 1068, 'batch_loss/train': 0.8720788052305579} 12/21/2021 17:56:20 - INFO - codeparrot_training - Step 1069: {'lr': 0.0004996029225885341, 'samples': 547840, 'steps': 1069, 
'batch_loss/train': 0.9396663988009095} 12/21/2021 17:56:31 - INFO - codeparrot_training - Step 1070: {'lr': 0.0004996020340112427, 'samples': 548352, 'steps': 1070, 'batch_loss/train': 0.8944519702345133} 12/21/2021 17:56:42 - INFO - codeparrot_training - Step 1071: {'lr': 0.0004996011444416279, 'samples': 548864, 'steps': 1071, 'batch_loss/train': 0.8981921337544918} 12/21/2021 17:56:52 - INFO - codeparrot_training - Step 1072: {'lr': 0.0004996002538796933, 'samples': 549376, 'steps': 1072, 'batch_loss/train': 0.8412441415712237} 12/21/2021 17:57:04 - INFO - codeparrot_training - Step 1073: {'lr': 0.0004995993623254422, 'samples': 549888, 'steps': 1073, 'batch_loss/train': 0.8462172476574779} 12/21/2021 17:57:15 - INFO - codeparrot_training - Step 1074: {'lr': 0.0004995984697788784, 'samples': 550400, 'steps': 1074, 'batch_loss/train': 0.9077871618792415} 12/21/2021 17:57:26 - INFO - codeparrot_training - Step 1075: {'lr': 0.0004995975762400052, 'samples': 550912, 'steps': 1075, 'batch_loss/train': 0.7821190003305674} 12/21/2021 17:57:38 - INFO - codeparrot_training - Step 1076: {'lr': 0.0004995966817088265, 'samples': 551424, 'steps': 1076, 'batch_loss/train': 0.9895831868052483} 12/21/2021 17:57:49 - INFO - codeparrot_training - Step 1077: {'lr': 0.0004995957861853454, 'samples': 551936, 'steps': 1077, 'batch_loss/train': 0.851031650789082} 12/21/2021 17:57:59 - INFO - codeparrot_training - Step 1078: {'lr': 0.0004995948896695658, 'samples': 552448, 'steps': 1078, 'batch_loss/train': 0.8771517518907785} 12/21/2021 17:58:11 - INFO - codeparrot_training - Step 1079: {'lr': 0.0004995939921614911, 'samples': 552960, 'steps': 1079, 'batch_loss/train': 0.9658382544294} 12/21/2021 17:58:22 - INFO - codeparrot_training - Step 1080: {'lr': 0.000499593093661125, 'samples': 553472, 'steps': 1080, 'batch_loss/train': 0.8046531332656741} 12/21/2021 17:58:33 - INFO - codeparrot_training - Step 1081: {'lr': 0.0004995921941684709, 'samples': 553984, 'steps': 1081, 
'batch_loss/train': 0.9565259134396911} 12/21/2021 17:58:43 - INFO - codeparrot_training - Step 1082: {'lr': 0.0004995912936835326, 'samples': 554496, 'steps': 1082, 'batch_loss/train': 0.7734261483419687} 12/21/2021 17:58:55 - INFO - codeparrot_training - Step 1083: {'lr': 0.0004995903922063135, 'samples': 555008, 'steps': 1083, 'batch_loss/train': 0.9082761164754629} 12/21/2021 17:59:06 - INFO - codeparrot_training - Step 1084: {'lr': 0.0004995894897368173, 'samples': 555520, 'steps': 1084, 'batch_loss/train': 0.9238338135182858} 12/21/2021 17:59:17 - INFO - codeparrot_training - Step 1085: {'lr': 0.0004995885862750474, 'samples': 556032, 'steps': 1085, 'batch_loss/train': 0.8493210365995765} 12/21/2021 17:59:29 - INFO - codeparrot_training - Step 1086: {'lr': 0.0004995876818210077, 'samples': 556544, 'steps': 1086, 'batch_loss/train': 0.9383063027635217} 12/21/2021 17:59:39 - INFO - codeparrot_training - Step 1087: {'lr': 0.0004995867763747015, 'samples': 557056, 'steps': 1087, 'batch_loss/train': 0.8494196771644056} 12/21/2021 17:59:50 - INFO - codeparrot_training - Step 1088: {'lr': 0.0004995858699361324, 'samples': 557568, 'steps': 1088, 'batch_loss/train': 0.8711388576775789} 12/21/2021 18:00:03 - INFO - codeparrot_training - Step 1089: {'lr': 0.0004995849625053044, 'samples': 558080, 'steps': 1089, 'batch_loss/train': 0.7956964373588562} 12/21/2021 18:00:14 - INFO - codeparrot_training - Step 1090: {'lr': 0.0004995840540822207, 'samples': 558592, 'steps': 1090, 'batch_loss/train': 0.6550719826482236} 12/21/2021 18:00:24 - INFO - codeparrot_training - Step 1091: {'lr': 0.000499583144666885, 'samples': 559104, 'steps': 1091, 'batch_loss/train': 0.8832875224761665} 12/21/2021 18:00:35 - INFO - codeparrot_training - Step 1092: {'lr': 0.0004995822342593009, 'samples': 559616, 'steps': 1092, 'batch_loss/train': 0.9499610029160976} 12/21/2021 18:00:47 - INFO - codeparrot_training - Step 1093: {'lr': 0.0004995813228594723, 'samples': 560128, 'steps': 1093, 
'batch_loss/train': 0.9427025811746716} 12/21/2021 18:00:58 - INFO - codeparrot_training - Step 1094: {'lr': 0.0004995804104674024, 'samples': 560640, 'steps': 1094, 'batch_loss/train': 0.9450205909088254} 12/21/2021 18:01:08 - INFO - codeparrot_training - Step 1095: {'lr': 0.0004995794970830951, 'samples': 561152, 'steps': 1095, 'batch_loss/train': 0.7671683279331774} 12/21/2021 18:01:21 - INFO - codeparrot_training - Step 1096: {'lr': 0.0004995785827065539, 'samples': 561664, 'steps': 1096, 'batch_loss/train': 0.8087856397032738} 12/21/2021 18:01:32 - INFO - codeparrot_training - Step 1097: {'lr': 0.0004995776673377825, 'samples': 562176, 'steps': 1097, 'batch_loss/train': 0.8797249929048121} 12/21/2021 18:01:42 - INFO - codeparrot_training - Step 1098: {'lr': 0.0004995767509767846, 'samples': 562688, 'steps': 1098, 'batch_loss/train': 0.9031044449657202} 12/21/2021 18:01:55 - INFO - codeparrot_training - Step 1099: {'lr': 0.0004995758336235638, 'samples': 563200, 'steps': 1099, 'batch_loss/train': 0.9182086293585598} 12/21/2021 18:02:06 - INFO - codeparrot_training - Step 1100: {'lr': 0.0004995749152781236, 'samples': 563712, 'steps': 1100, 'batch_loss/train': 0.8950230367481709} 12/21/2021 18:02:17 - INFO - codeparrot_training - Step 1101: {'lr': 0.0004995739959404678, 'samples': 564224, 'steps': 1101, 'batch_loss/train': 0.9475566386245191} 12/21/2021 18:02:27 - INFO - codeparrot_training - Step 1102: {'lr': 0.0004995730756106, 'samples': 564736, 'steps': 1102, 'batch_loss/train': 1.1874813102185726} 12/21/2021 18:02:39 - INFO - codeparrot_training - Step 1103: {'lr': 0.000499572154288524, 'samples': 565248, 'steps': 1103, 'batch_loss/train': 1.7612678539007902} 12/21/2021 18:02:50 - INFO - codeparrot_training - Step 1104: {'lr': 0.0004995712319742433, 'samples': 565760, 'steps': 1104, 'batch_loss/train': 1.2077126186341047} 12/21/2021 18:03:00 - INFO - codeparrot_training - Step 1105: {'lr': 0.0004995703086677617, 'samples': 566272, 'steps': 1105, 
'batch_loss/train': 0.9749624468386173} 12/21/2021 18:03:13 - INFO - codeparrot_training - Step 1106: {'lr': 0.0004995693843690826, 'samples': 566784, 'steps': 1106, 'batch_loss/train': 0.9623480718582869} 12/21/2021 18:03:24 - INFO - codeparrot_training - Step 1107: {'lr': 0.0004995684590782101, 'samples': 567296, 'steps': 1107, 'batch_loss/train': 0.9023960018530488} 12/21/2021 18:03:34 - INFO - codeparrot_training - Step 1108: {'lr': 0.0004995675327951474, 'samples': 567808, 'steps': 1108, 'batch_loss/train': 0.8568100268021226} 12/21/2021 18:03:46 - INFO - codeparrot_training - Step 1109: {'lr': 0.0004995666055198986, 'samples': 568320, 'steps': 1109, 'batch_loss/train': 0.9364044480025768} 12/21/2021 18:03:57 - INFO - codeparrot_training - Step 1110: {'lr': 0.000499565677252467, 'samples': 568832, 'steps': 1110, 'batch_loss/train': 0.8646228360012174} 12/21/2021 18:04:07 - INFO - codeparrot_training - Step 1111: {'lr': 0.0004995647479928566, 'samples': 569344, 'steps': 1111, 'batch_loss/train': 0.9255408979952335} 12/21/2021 18:04:18 - INFO - codeparrot_training - Step 1112: {'lr': 0.0004995638177410711, 'samples': 569856, 'steps': 1112, 'batch_loss/train': 1.0486212018877268} 12/21/2021 18:04:30 - INFO - codeparrot_training - Step 1113: {'lr': 0.000499562886497114, 'samples': 570368, 'steps': 1113, 'batch_loss/train': 1.0237348061054945} 12/21/2021 18:04:41 - INFO - codeparrot_training - Step 1114: {'lr': 0.0004995619542609892, 'samples': 570880, 'steps': 1114, 'batch_loss/train': 0.8184514958411455} 12/21/2021 18:04:52 - INFO - codeparrot_training - Step 1115: {'lr': 0.0004995610210327001, 'samples': 571392, 'steps': 1115, 'batch_loss/train': 0.865839023143053} 12/21/2021 18:05:04 - INFO - codeparrot_training - Step 1116: {'lr': 0.0004995600868122508, 'samples': 571904, 'steps': 1116, 'batch_loss/train': 0.8920089188031852} 12/21/2021 18:05:15 - INFO - codeparrot_training - Step 1117: {'lr': 0.0004995591515996447, 'samples': 572416, 'steps': 1117, 
'batch_loss/train': 0.9018277155701071} 12/21/2021 18:05:25 - INFO - codeparrot_training - Step 1118: {'lr': 0.0004995582153948857, 'samples': 572928, 'steps': 1118, 'batch_loss/train': 1.4468217734247446} 12/21/2021 18:05:37 - INFO - codeparrot_training - Step 1119: {'lr': 0.0004995572781979774, 'samples': 573440, 'steps': 1119, 'batch_loss/train': 0.9325792817398906} 12/21/2021 18:05:48 - INFO - codeparrot_training - Step 1120: {'lr': 0.0004995563400089238, 'samples': 573952, 'steps': 1120, 'batch_loss/train': 0.9161987453699112} 12/21/2021 18:05:59 - INFO - codeparrot_training - Step 1121: {'lr': 0.0004995554008277283, 'samples': 574464, 'steps': 1121, 'batch_loss/train': 0.9810406556352973} 12/21/2021 18:06:09 - INFO - codeparrot_training - Step 1122: {'lr': 0.0004995544606543947, 'samples': 574976, 'steps': 1122, 'batch_loss/train': 0.9154063882306218} 12/21/2021 18:06:21 - INFO - codeparrot_training - Step 1123: {'lr': 0.0004995535194889267, 'samples': 575488, 'steps': 1123, 'batch_loss/train': 0.7840563806239516} 12/21/2021 18:06:32 - INFO - codeparrot_training - Step 1124: {'lr': 0.0004995525773313283, 'samples': 576000, 'steps': 1124, 'batch_loss/train': 0.9391268035396934} 12/21/2021 18:06:42 - INFO - codeparrot_training - Step 1125: {'lr': 0.000499551634181603, 'samples': 576512, 'steps': 1125, 'batch_loss/train': 0.9383440567180514} 12/21/2021 18:06:55 - INFO - codeparrot_training - Step 1126: {'lr': 0.0004995506900397547, 'samples': 577024, 'steps': 1126, 'batch_loss/train': 0.9530966784805059} 12/21/2021 18:07:06 - INFO - codeparrot_training - Step 1127: {'lr': 0.0004995497449057871, 'samples': 577536, 'steps': 1127, 'batch_loss/train': 0.988777719438076} 12/21/2021 18:07:16 - INFO - codeparrot_training - Step 1128: {'lr': 0.0004995487987797038, 'samples': 578048, 'steps': 1128, 'batch_loss/train': 0.8927262420766056} 12/21/2021 18:07:29 - INFO - codeparrot_training - Step 1129: {'lr': 0.0004995478516615088, 'samples': 578560, 'steps': 1129, 
'batch_loss/train': 0.9784006746485829} 12/21/2021 18:07:39 - INFO - codeparrot_training - Step 1130: {'lr': 0.0004995469035512058, 'samples': 579072, 'steps': 1130, 'batch_loss/train': 0.866100194863975} 12/21/2021 18:07:50 - INFO - codeparrot_training - Step 1131: {'lr': 0.0004995459544487984, 'samples': 579584, 'steps': 1131, 'batch_loss/train': 0.7437573652714491} 12/21/2021 18:08:00 - INFO - codeparrot_training - Step 1132: {'lr': 0.0004995450043542906, 'samples': 580096, 'steps': 1132, 'batch_loss/train': 0.9188604271039367} 12/21/2021 18:08:12 - INFO - codeparrot_training - Step 1133: {'lr': 0.0004995440532676861, 'samples': 580608, 'steps': 1133, 'batch_loss/train': 0.9555202294141054} 12/21/2021 18:08:23 - INFO - codeparrot_training - Step 1134: {'lr': 0.0004995431011889887, 'samples': 581120, 'steps': 1134, 'batch_loss/train': 0.9115732107311487} 12/21/2021 18:08:34 - INFO - codeparrot_training - Step 1135: {'lr': 0.0004995421481182021, 'samples': 581632, 'steps': 1135, 'batch_loss/train': 0.8363559860736132} 12/21/2021 18:08:46 - INFO - codeparrot_training - Step 1136: {'lr': 0.0004995411940553301, 'samples': 582144, 'steps': 1136, 'batch_loss/train': 0.8803338319994509} 12/21/2021 18:08:57 - INFO - codeparrot_training - Step 1137: {'lr': 0.0004995402390003767, 'samples': 582656, 'steps': 1137, 'batch_loss/train': 0.9504870409145951} 12/21/2021 18:09:08 - INFO - codeparrot_training - Step 1138: {'lr': 0.0004995392829533453, 'samples': 583168, 'steps': 1138, 'batch_loss/train': 0.8740635262802243} 12/21/2021 18:09:20 - INFO - codeparrot_training - Step 1139: {'lr': 0.0004995383259142401, 'samples': 583680, 'steps': 1139, 'batch_loss/train': 0.9069474171847105} 12/21/2021 18:09:30 - INFO - codeparrot_training - Step 1140: {'lr': 0.0004995373678830647, 'samples': 584192, 'steps': 1140, 'batch_loss/train': 0.6965680932626128} 12/21/2021 18:09:41 - INFO - codeparrot_training - Step 1141: {'lr': 0.000499536408859823, 'samples': 584704, 'steps': 1141, 
'batch_loss/train': 0.8678837665356696} 12/21/2021 18:09:53 - INFO - codeparrot_training - Step 1142: {'lr': 0.0004995354488445186, 'samples': 585216, 'steps': 1142, 'batch_loss/train': 0.9365042299032211} 12/21/2021 18:10:04 - INFO - codeparrot_training - Step 1143: {'lr': 0.0004995344878371556, 'samples': 585728, 'steps': 1143, 'batch_loss/train': 0.952761891297996} 12/21/2021 18:10:14 - INFO - codeparrot_training - Step 1144: {'lr': 0.0004995335258377377, 'samples': 586240, 'steps': 1144, 'batch_loss/train': 0.9256991683505476} 12/21/2021 18:10:25 - INFO - codeparrot_training - Step 1145: {'lr': 0.0004995325628462687, 'samples': 586752, 'steps': 1145, 'batch_loss/train': 0.9593670899048448} 12/21/2021 18:10:38 - INFO - codeparrot_training - Step 1146: {'lr': 0.0004995315988627525, 'samples': 587264, 'steps': 1146, 'batch_loss/train': 0.8608019361272454} 12/21/2021 18:10:48 - INFO - codeparrot_training - Step 1147: {'lr': 0.0004995306338871928, 'samples': 587776, 'steps': 1147, 'batch_loss/train': 0.9922945331782103} 12/21/2021 18:10:59 - INFO - codeparrot_training - Step 1148: {'lr': 0.0004995296679195935, 'samples': 588288, 'steps': 1148, 'batch_loss/train': 0.8341850200667977} 12/21/2021 18:11:11 - INFO - codeparrot_training - Step 1149: {'lr': 0.0004995287009599585, 'samples': 588800, 'steps': 1149, 'batch_loss/train': 0.8272012053057551} 12/21/2021 18:11:21 - INFO - codeparrot_training - Step 1150: {'lr': 0.0004995277330082916, 'samples': 589312, 'steps': 1150, 'batch_loss/train': 0.8838570918887854} 12/21/2021 18:11:32 - INFO - codeparrot_training - Step 1151: {'lr': 0.0004995267640645967, 'samples': 589824, 'steps': 1151, 'batch_loss/train': 0.8952997587621212} 12/21/2021 18:11:45 - INFO - codeparrot_training - Step 1152: {'lr': 0.0004995257941288776, 'samples': 590336, 'steps': 1152, 'batch_loss/train': 0.8195746010169387} 12/21/2021 18:11:56 - INFO - codeparrot_training - Step 1153: {'lr': 0.0004995248232011381, 'samples': 590848, 'steps': 1153, 
'batch_loss/train': 0.9236949887126684} 12/21/2021 18:12:06 - INFO - codeparrot_training - Step 1154: {'lr': 0.0004995238512813821, 'samples': 591360, 'steps': 1154, 'batch_loss/train': 0.9074621677864343} 12/21/2021 18:12:17 - INFO - codeparrot_training - Step 1155: {'lr': 0.0004995228783696136, 'samples': 591872, 'steps': 1155, 'batch_loss/train': 0.8963299728929996} 12/21/2021 18:12:30 - INFO - codeparrot_training - Step 1156: {'lr': 0.0004995219044658363, 'samples': 592384, 'steps': 1156, 'batch_loss/train': 0.8924400648102164} 12/21/2021 18:12:40 - INFO - codeparrot_training - Step 1157: {'lr': 0.000499520929570054, 'samples': 592896, 'steps': 1157, 'batch_loss/train': 0.9086917554959655} 12/21/2021 18:12:51 - INFO - codeparrot_training - Step 1158: {'lr': 0.0004995199536822709, 'samples': 593408, 'steps': 1158, 'batch_loss/train': 1.4081650339066982} 12/21/2021 18:13:03 - INFO - codeparrot_training - Step 1159: {'lr': 0.0004995189768024905, 'samples': 593920, 'steps': 1159, 'batch_loss/train': 0.9024025881662965} 12/21/2021 18:13:13 - INFO - codeparrot_training - Step 1160: {'lr': 0.0004995179989307169, 'samples': 594432, 'steps': 1160, 'batch_loss/train': 0.8904345347546041} 12/21/2021 18:13:24 - INFO - codeparrot_training - Step 1161: {'lr': 0.0004995170200669541, 'samples': 594944, 'steps': 1161, 'batch_loss/train': 0.7745180642232299} 12/21/2021 18:13:35 - INFO - codeparrot_training - Step 1162: {'lr': 0.0004995160402112057, 'samples': 595456, 'steps': 1162, 'batch_loss/train': 1.1850290074944496} 12/21/2021 18:13:47 - INFO - codeparrot_training - Step 1163: {'lr': 0.0004995150593634758, 'samples': 595968, 'steps': 1163, 'batch_loss/train': 0.950513401068747} 12/21/2021 18:13:57 - INFO - codeparrot_training - Step 1164: {'lr': 0.0004995140775237682, 'samples': 596480, 'steps': 1164, 'batch_loss/train': 0.9199875956401229} 12/21/2021 18:14:08 - INFO - codeparrot_training - Step 1165: {'lr': 0.0004995130946920868, 'samples': 596992, 'steps': 1165, 
'batch_loss/train': 1.0683848885819316} 12/21/2021 18:14:21 - INFO - codeparrot_training - Step 1166: {'lr': 0.0004995121108684357, 'samples': 597504, 'steps': 1166, 'batch_loss/train': 0.9540066784247756} 12/21/2021 18:14:31 - INFO - codeparrot_training - Step 1167: {'lr': 0.0004995111260528185, 'samples': 598016, 'steps': 1167, 'batch_loss/train': 0.930673093535006} 12/21/2021 18:14:42 - INFO - codeparrot_training - Step 1168: {'lr': 0.0004995101402452394, 'samples': 598528, 'steps': 1168, 'batch_loss/train': 0.9216750035993755} 12/21/2021 18:14:54 - INFO - codeparrot_training - Step 1169: {'lr': 0.000499509153445702, 'samples': 599040, 'steps': 1169, 'batch_loss/train': 0.8992116237059236} 12/21/2021 18:15:05 - INFO - codeparrot_training - Step 1170: {'lr': 0.0004995081656542106, 'samples': 599552, 'steps': 1170, 'batch_loss/train': 0.9409580049104989} 12/21/2021 18:15:15 - INFO - codeparrot_training - Step 1171: {'lr': 0.0004995071768707688, 'samples': 600064, 'steps': 1171, 'batch_loss/train': 0.8488235464319587} 12/21/2021 18:15:28 - INFO - codeparrot_training - Step 1172: {'lr': 0.0004995061870953809, 'samples': 600576, 'steps': 1172, 'batch_loss/train': 0.966129869222641} 12/21/2021 18:15:38 - INFO - codeparrot_training - Step 1173: {'lr': 0.0004995051963280504, 'samples': 601088, 'steps': 1173, 'batch_loss/train': 0.9478419600054622} 12/21/2021 18:15:49 - INFO - codeparrot_training - Step 1174: {'lr': 0.0004995042045687815, 'samples': 601600, 'steps': 1174, 'batch_loss/train': 0.7674035511445254} 12/21/2021 18:15:59 - INFO - codeparrot_training - Step 1175: {'lr': 0.0004995032118175782, 'samples': 602112, 'steps': 1175, 'batch_loss/train': 0.8185501310508698} 12/21/2021 18:16:12 - INFO - codeparrot_training - Step 1176: {'lr': 0.0004995022180744442, 'samples': 602624, 'steps': 1176, 'batch_loss/train': 0.9788629738613963} 12/21/2021 18:16:23 - INFO - codeparrot_training - Step 1177: {'lr': 0.0004995012233393836, 'samples': 603136, 'steps': 1177, 
'batch_loss/train': 0.9133735606446862} 12/21/2021 18:16:34 - INFO - codeparrot_training - Step 1178: {'lr': 0.0004995002276124004, 'samples': 603648, 'steps': 1178, 'batch_loss/train': 1.0913109579123557} 12/21/2021 18:16:46 - INFO - codeparrot_training - Step 1179: {'lr': 0.0004994992308934985, 'samples': 604160, 'steps': 1179, 'batch_loss/train': 0.9067457905039191} 12/21/2021 18:16:56 - INFO - codeparrot_training - Step 1180: {'lr': 0.0004994982331826818, 'samples': 604672, 'steps': 1180, 'batch_loss/train': 0.9443781683221459} 12/21/2021 18:17:07 - INFO - codeparrot_training - Step 1181: {'lr': 0.0004994972344799544, 'samples': 605184, 'steps': 1181, 'batch_loss/train': 1.003391232341528} 12/21/2021 18:17:20 - INFO - codeparrot_training - Step 1182: {'lr': 0.0004994962347853203, 'samples': 605696, 'steps': 1182, 'batch_loss/train': 0.9118214081972837} 12/21/2021 18:17:30 - INFO - codeparrot_training - Step 1183: {'lr': 0.0004994952340987832, 'samples': 606208, 'steps': 1183, 'batch_loss/train': 0.9671361288055778} 12/21/2021 18:17:41 - INFO - codeparrot_training - Step 1184: {'lr': 0.0004994942324203474, 'samples': 606720, 'steps': 1184, 'batch_loss/train': 0.9741847687400877} 12/21/2021 18:17:51 - INFO - codeparrot_training - Step 1185: {'lr': 0.0004994932297500166, 'samples': 607232, 'steps': 1185, 'batch_loss/train': 0.9206735617481172} 12/21/2021 18:18:03 - INFO - codeparrot_training - Step 1186: {'lr': 0.000499492226087795, 'samples': 607744, 'steps': 1186, 'batch_loss/train': 0.896538395434618} 12/21/2021 18:18:14 - INFO - codeparrot_training - Step 1187: {'lr': 0.0004994912214336865, 'samples': 608256, 'steps': 1187, 'batch_loss/train': 0.951688070781529} 12/21/2021 18:18:25 - INFO - codeparrot_training - Step 1188: {'lr': 0.000499490215787695, 'samples': 608768, 'steps': 1188, 'batch_loss/train': 0.9373837700113654} 12/21/2021 18:18:37 - INFO - codeparrot_training - Step 1189: {'lr': 0.0004994892091498247, 'samples': 609280, 'steps': 1189, 
'batch_loss/train': 0.9094412056729198} 12/21/2021 18:18:47 - INFO - codeparrot_training - Step 1190: {'lr': 0.0004994882015200796, 'samples': 609792, 'steps': 1190, 'batch_loss/train': 0.9100059298798442} 12/21/2021 18:18:58 - INFO - codeparrot_training - Step 1191: {'lr': 0.0004994871928984635, 'samples': 610304, 'steps': 1191, 'batch_loss/train': 0.8352670585736632} 12/21/2021 18:19:10 - INFO - codeparrot_training - Step 1192: {'lr': 0.0004994861832849806, 'samples': 610816, 'steps': 1192, 'batch_loss/train': 0.8741358816623688} 12/21/2021 18:19:21 - INFO - codeparrot_training - Step 1193: {'lr': 0.0004994851726796347, 'samples': 611328, 'steps': 1193, 'batch_loss/train': 0.9233765387907624} 12/21/2021 18:19:32 - INFO - codeparrot_training - Step 1194: {'lr': 0.0004994841610824301, 'samples': 611840, 'steps': 1194, 'batch_loss/train': 0.9368011187762022} 12/21/2021 18:19:44 - INFO - codeparrot_training - Step 1195: {'lr': 0.0004994831484933706, 'samples': 612352, 'steps': 1195, 'batch_loss/train': 0.815311367623508} 12/21/2021 18:19:54 - INFO - codeparrot_training - Step 1196: {'lr': 0.0004994821349124603, 'samples': 612864, 'steps': 1196, 'batch_loss/train': 0.8949263100512326} 12/21/2021 18:20:05 - INFO - codeparrot_training - Step 1197: {'lr': 0.0004994811203397032, 'samples': 613376, 'steps': 1197, 'batch_loss/train': 0.8489316711202264} 12/21/2021 18:20:16 - INFO - codeparrot_training - Step 1198: {'lr': 0.0004994801047751035, 'samples': 613888, 'steps': 1198, 'batch_loss/train': 0.9408408664166927} 12/21/2021 18:20:28 - INFO - codeparrot_training - Step 1199: {'lr': 0.0004994790882186651, 'samples': 614400, 'steps': 1199, 'batch_loss/train': 0.9314808757044375} 12/21/2021 18:20:39 - INFO - codeparrot_training - Step 1200: {'lr': 0.000499478070670392, 'samples': 614912, 'steps': 1200, 'batch_loss/train': 0.9746698988601565} 12/21/2021 18:20:49 - INFO - codeparrot_training - Step 1201: {'lr': 0.0004994770521302882, 'samples': 615424, 'steps': 1201, 
'batch_loss/train': 0.9159172605723143} 12/21/2021 18:21:01 - INFO - codeparrot_training - Step 1202: {'lr': 0.0004994760325983579, 'samples': 615936, 'steps': 1202, 'batch_loss/train': 0.8153248308226466} 12/21/2021 18:21:12 - INFO - codeparrot_training - Step 1203: {'lr': 0.0004994750120746051, 'samples': 616448, 'steps': 1203, 'batch_loss/train': 0.8492526998743415} 12/21/2021 18:21:22 - INFO - codeparrot_training - Step 1204: {'lr': 0.0004994739905590338, 'samples': 616960, 'steps': 1204, 'batch_loss/train': 0.8187123462557793} 12/21/2021 18:21:35 - INFO - codeparrot_training - Step 1205: {'lr': 0.0004994729680516482, 'samples': 617472, 'steps': 1205, 'batch_loss/train': 1.0016724057495594} 12/21/2021 18:21:46 - INFO - codeparrot_training - Step 1206: {'lr': 0.0004994719445524522, 'samples': 617984, 'steps': 1206, 'batch_loss/train': 0.8118661541957408} 12/21/2021 18:21:56 - INFO - codeparrot_training - Step 1207: {'lr': 0.00049947092006145, 'samples': 618496, 'steps': 1207, 'batch_loss/train': 1.0311933532357216} 12/21/2021 18:22:07 - INFO - codeparrot_training - Step 1208: {'lr': 0.0004994698945786456, 'samples': 619008, 'steps': 1208, 'batch_loss/train': 0.8503660308197141} 12/21/2021 18:22:19 - INFO - codeparrot_training - Step 1209: {'lr': 0.0004994688681040432, 'samples': 619520, 'steps': 1209, 'batch_loss/train': 0.9504197668284178} 12/21/2021 18:22:30 - INFO - codeparrot_training - Step 1210: {'lr': 0.0004994678406376466, 'samples': 620032, 'steps': 1210, 'batch_loss/train': 0.9647932406514883} 12/21/2021 18:22:40 - INFO - codeparrot_training - Step 1211: {'lr': 0.0004994668121794601, 'samples': 620544, 'steps': 1211, 'batch_loss/train': 0.8855375917628407} 12/21/2021 18:22:53 - INFO - codeparrot_training - Step 1212: {'lr': 0.0004994657827294878, 'samples': 621056, 'steps': 1212, 'batch_loss/train': 1.114992587827146} 12/21/2021 18:23:04 - INFO - codeparrot_training - Step 1213: {'lr': 0.0004994647522877337, 'samples': 621568, 'steps': 1213, 
'batch_loss/train': 1.0224395813420415}
12/21/2021 18:23:14 - INFO - codeparrot_training - Step 1214: {'lr': 0.000499463720854202, 'samples': 622080, 'steps': 1214, 'batch_loss/train': 0.9078174885362387}
12/21/2021 18:23:27 - INFO - codeparrot_training - Step 1215: {'lr': 0.0004994626884288967, 'samples': 622592, 'steps': 1215, 'batch_loss/train': 0.9695952069014311}
12/21/2021 18:23:37 - INFO - codeparrot_training - Step 1216: {'lr': 0.0004994616550118219, 'samples': 623104, 'steps': 1216, 'batch_loss/train': 0.8072706330567598}
12/21/2021 18:23:48 - INFO - codeparrot_training - Step 1217: {'lr': 0.0004994606206029818, 'samples': 623616, 'steps': 1217, 'batch_loss/train': 0.8016591249033809}
12/21/2021 18:23:58 - INFO - codeparrot_training - Step 1218: {'lr': 0.0004994595852023804, 'samples': 624128, 'steps': 1218, 'batch_loss/train': 0.988537983968854}
12/21/2021 18:24:11 - INFO - codeparrot_training - Step 1219: {'lr': 0.000499458548810022, 'samples': 624640, 'steps': 1219, 'batch_loss/train': 0.8402985902503133}
12/21/2021 18:24:21 - INFO - codeparrot_training - Step 1220: {'lr': 0.0004994575114259105, 'samples': 625152, 'steps': 1220, 'batch_loss/train': 0.9254513545893133}
12/21/2021 18:24:32 - INFO - codeparrot_training - Step 1221: {'lr': 0.0004994564730500502, 'samples': 625664, 'steps': 1221, 'batch_loss/train': 0.9947165064513683}
12/21/2021 18:24:44 - INFO - codeparrot_training - Step 1222: {'lr': 0.0004994554336824452, 'samples': 626176, 'steps': 1222, 'batch_loss/train': 0.8540261844173074}
12/21/2021 18:24:55 - INFO - codeparrot_training - Step 1223: {'lr': 0.0004994543933230995, 'samples': 626688, 'steps': 1223, 'batch_loss/train': 0.8317353008314967}
12/21/2021 18:25:06 - INFO - codeparrot_training - Step 1224: {'lr': 0.0004994533519720172, 'samples': 627200, 'steps': 1224, 'batch_loss/train': 0.9294225857593119}
12/21/2021 18:25:18 - INFO - codeparrot_training - Step 1225: {'lr': 0.0004994523096292028, 'samples': 627712, 'steps': 1225, 'batch_loss/train': 0.9803953859955072}
12/21/2021 18:25:28 - INFO - codeparrot_training - Step 1226: {'lr': 0.0004994512662946601, 'samples': 628224, 'steps': 1226, 'batch_loss/train': 1.0673902332782745}
12/21/2021 18:25:39 - INFO - codeparrot_training - Step 1227: {'lr': 0.0004994502219683933, 'samples': 628736, 'steps': 1227, 'batch_loss/train': 0.9396049245260656}
12/21/2021 18:25:50 - INFO - codeparrot_training - Step 1228: {'lr': 0.0004994491766504067, 'samples': 629248, 'steps': 1228, 'batch_loss/train': 0.9172460688278079}
12/21/2021 18:26:02 - INFO - codeparrot_training - Step 1229: {'lr': 0.0004994481303407044, 'samples': 629760, 'steps': 1229, 'batch_loss/train': 0.8158911308273673}
12/21/2021 18:26:12 - INFO - codeparrot_training - Step 1230: {'lr': 0.0004994470830392905, 'samples': 630272, 'steps': 1230, 'batch_loss/train': 0.8511423089075834}
12/21/2021 18:26:23 - INFO - codeparrot_training - Step 1231: {'lr': 0.000499446034746169, 'samples': 630784, 'steps': 1231, 'batch_loss/train': 0.9460949646309018}
12/21/2021 18:26:35 - INFO - codeparrot_training - Step 1232: {'lr': 0.0004994449854613444, 'samples': 631296, 'steps': 1232, 'batch_loss/train': 0.8951304238289595}
12/21/2021 18:26:46 - INFO - codeparrot_training - Step 1233: {'lr': 0.0004994439351848208, 'samples': 631808, 'steps': 1233, 'batch_loss/train': 0.7241121765691787}
12/21/2021 18:26:56 - INFO - codeparrot_training - Step 1234: {'lr': 0.0004994428839166021, 'samples': 632320, 'steps': 1234, 'batch_loss/train': 0.8898308789357543}
12/21/2021 18:27:09 - INFO - codeparrot_training - Step 1235: {'lr': 0.0004994418316566928, 'samples': 632832, 'steps': 1235, 'batch_loss/train': 0.8237797874026}
12/21/2021 18:27:20 - INFO - codeparrot_training - Step 1236: {'lr': 0.000499440778405097, 'samples': 633344, 'steps': 1236, 'batch_loss/train': 0.8611332550644875}
12/21/2021 18:27:30 - INFO - codeparrot_training - Step 1237: {'lr': 0.0004994397241618188, 'samples': 633856, 'steps': 1237, 'batch_loss/train': 0.8826666735112667}
12/21/2021 18:27:41 - INFO - codeparrot_training - Step 1238: {'lr': 0.0004994386689268624, 'samples': 634368, 'steps': 1238, 'batch_loss/train': 0.8119998555630445}
12/21/2021 18:27:53 - INFO - codeparrot_training - Step 1239: {'lr': 0.0004994376127002322, 'samples': 634880, 'steps': 1239, 'batch_loss/train': 0.6617459214758128}
12/21/2021 18:28:04 - INFO - codeparrot_training - Step 1240: {'lr': 0.000499436555481932, 'samples': 635392, 'steps': 1240, 'batch_loss/train': 1.023212667554617}
12/21/2021 18:28:15 - INFO - codeparrot_training - Step 1241: {'lr': 0.0004994354972719663, 'samples': 635904, 'steps': 1241, 'batch_loss/train': 0.938759570941329}
12/21/2021 18:28:27 - INFO - codeparrot_training - Step 1242: {'lr': 0.0004994344380703394, 'samples': 636416, 'steps': 1242, 'batch_loss/train': 0.870722085237503}
12/21/2021 18:28:38 - INFO - codeparrot_training - Step 1243: {'lr': 0.0004994333778770552, 'samples': 636928, 'steps': 1243, 'batch_loss/train': 0.9078935999423265}
12/21/2021 18:28:48 - INFO - codeparrot_training - Step 1244: {'lr': 0.0004994323166921182, 'samples': 637440, 'steps': 1244, 'batch_loss/train': 0.8422485021874309}
12/21/2021 18:29:00 - INFO - codeparrot_training - Step 1245: {'lr': 0.0004994312545155324, 'samples': 637952, 'steps': 1245, 'batch_loss/train': 0.8362310873344541}
12/21/2021 18:29:11 - INFO - codeparrot_training - Step 1246: {'lr': 0.0004994301913473022, 'samples': 638464, 'steps': 1246, 'batch_loss/train': 0.9576555611565709}
12/21/2021 18:29:22 - INFO - codeparrot_training - Step 1247: {'lr': 0.0004994291271874316, 'samples': 638976, 'steps': 1247, 'batch_loss/train': 0.946130913682282}
12/21/2021 18:29:34 - INFO - codeparrot_training - Step 1248: {'lr': 0.000499428062035925, 'samples': 639488, 'steps': 1248, 'batch_loss/train': 0.9039501752704382}
12/21/2021 18:29:44 - INFO - codeparrot_training - Step 1249: {'lr': 0.0004994269958927867, 'samples': 640000, 'steps': 1249, 'batch_loss/train': 0.848575550597161}
12/21/2021 18:29:55 - INFO - codeparrot_training - Step 1250: {'lr': 0.0004994259287580209, 'samples': 640512, 'steps': 1250, 'batch_loss/train': 0.9087131032720208}
12/21/2021 18:30:06 - INFO - codeparrot_training - Step 1251: {'lr': 0.0004994248606316317, 'samples': 641024, 'steps': 1251, 'batch_loss/train': 0.8138246252201498}
12/21/2021 18:30:18 - INFO - codeparrot_training - Step 1252: {'lr': 0.0004994237915136233, 'samples': 641536, 'steps': 1252, 'batch_loss/train': 0.8964862320572138}
12/21/2021 18:30:29 - INFO - codeparrot_training - Step 1253: {'lr': 0.0004994227214040002, 'samples': 642048, 'steps': 1253, 'batch_loss/train': 0.8127140523865819}
12/21/2021 18:30:40 - INFO - codeparrot_training - Step 1254: {'lr': 0.0004994216503027665, 'samples': 642560, 'steps': 1254, 'batch_loss/train': 0.9757550451904535}
12/21/2021 18:30:52 - INFO - codeparrot_training - Step 1255: {'lr': 0.0004994205782099265, 'samples': 643072, 'steps': 1255, 'batch_loss/train': 0.8581873029470444}
12/21/2021 18:31:02 - INFO - codeparrot_training - Step 1256: {'lr': 0.0004994195051254846, 'samples': 643584, 'steps': 1256, 'batch_loss/train': 0.879802443087101}
12/21/2021 18:31:13 - INFO - codeparrot_training - Step 1257: {'lr': 0.0004994184310494448, 'samples': 644096, 'steps': 1257, 'batch_loss/train': 0.8712422819808125}
12/21/2021 18:31:25 - INFO - codeparrot_training - Step 1258: {'lr': 0.0004994173559818116, 'samples': 644608, 'steps': 1258, 'batch_loss/train': 0.8286894992925227}
12/21/2021 18:31:35 - INFO - codeparrot_training - Step 1259: {'lr': 0.0004994162799225889, 'samples': 645120, 'steps': 1259, 'batch_loss/train': 0.928883945569396}
12/21/2021 18:31:46 - INFO - codeparrot_training - Step 1260: {'lr': 0.0004994152028717816, 'samples': 645632, 'steps': 1260, 'batch_loss/train': 0.9047780660912395}
12/21/2021 18:31:57 - INFO - codeparrot_training - Step 1261: {'lr': 0.0004994141248293935, 'samples': 646144, 'steps': 1261, 'batch_loss/train': 1.0263357982039452}
12/21/2021 18:32:10 - INFO - codeparrot_training - Step 1262: {'lr': 0.0004994130457954288, 'samples': 646656, 'steps': 1262, 'batch_loss/train': 0.8393691079691052}
12/21/2021 18:32:20 - INFO - codeparrot_training - Step 1263: {'lr': 0.0004994119657698922, 'samples': 647168, 'steps': 1263, 'batch_loss/train': 0.9209253257140517}
12/21/2021 18:32:31 - INFO - codeparrot_training - Step 1264: {'lr': 0.0004994108847527878, 'samples': 647680, 'steps': 1264, 'batch_loss/train': 0.9038002192974091}
12/21/2021 18:32:43 - INFO - codeparrot_training - Step 1265: {'lr': 0.0004994098027441199, 'samples': 648192, 'steps': 1265, 'batch_loss/train': 0.9246189179830253}
12/21/2021 18:32:53 - INFO - codeparrot_training - Step 1266: {'lr': 0.0004994087197438928, 'samples': 648704, 'steps': 1266, 'batch_loss/train': 0.9447633372619748}
12/21/2021 18:33:04 - INFO - codeparrot_training - Step 1267: {'lr': 0.0004994076357521108, 'samples': 649216, 'steps': 1267, 'batch_loss/train': 0.9187744194641709}
12/21/2021 18:33:17 - INFO - codeparrot_training - Step 1268: {'lr': 0.0004994065507687781, 'samples': 649728, 'steps': 1268, 'batch_loss/train': 0.9471127837896347}
12/21/2021 18:33:27 - INFO - codeparrot_training - Step 1269: {'lr': 0.0004994054647938993, 'samples': 650240, 'steps': 1269, 'batch_loss/train': 1.3510896479710937}
12/21/2021 18:33:38 - INFO - codeparrot_training - Step 1270: {'lr': 0.0004994043778274783, 'samples': 650752, 'steps': 1270, 'batch_loss/train': 0.9727164045907557}
12/21/2021 18:33:49 - INFO - codeparrot_training - Step 1271: {'lr': 0.0004994032898695199, 'samples': 651264, 'steps': 1271, 'batch_loss/train': 0.8685201350599527}
12/21/2021 18:34:02 - INFO - codeparrot_training - Step 1272: {'lr': 0.0004994022009200281, 'samples': 651776, 'steps': 1272, 'batch_loss/train': 0.9125380613841116}
12/21/2021 18:34:12 - INFO - codeparrot_training - Step 1273: {'lr': 0.0004994011109790075, 'samples': 652288, 'steps': 1273, 'batch_loss/train': 0.9562150659039617}
12/21/2021 18:34:23 - INFO - codeparrot_training - Step 1274: {'lr': 0.000499400020046462, 'samples': 652800, 'steps': 1274, 'batch_loss/train': 0.9103519981727004}
12/21/2021 18:34:35 - INFO - codeparrot_training - Step 1275: {'lr': 0.0004993989281223963, 'samples': 653312, 'steps': 1275, 'batch_loss/train': 0.9870065823197365}
12/21/2021 18:34:45 - INFO - codeparrot_training - Step 1276: {'lr': 0.0004993978352068146, 'samples': 653824, 'steps': 1276, 'batch_loss/train': 0.960326406173408}
12/21/2021 18:34:56 - INFO - codeparrot_training - Step 1277: {'lr': 0.0004993967412997212, 'samples': 654336, 'steps': 1277, 'batch_loss/train': 1.0036738468334079}
12/21/2021 18:35:08 - INFO - codeparrot_training - Step 1278: {'lr': 0.0004993956464011207, 'samples': 654848, 'steps': 1278, 'batch_loss/train': 1.1145060686394572}
12/21/2021 18:35:19 - INFO - codeparrot_training - Step 1279: {'lr': 0.0004993945505110171, 'samples': 655360, 'steps': 1279, 'batch_loss/train': 0.76822858909145}
12/21/2021 18:35:29 - INFO - codeparrot_training - Step 1280: {'lr': 0.0004993934536294151, 'samples': 655872, 'steps': 1280, 'batch_loss/train': 0.8462537098675966}
12/21/2021 18:35:40 - INFO - codeparrot_training - Step 1281: {'lr': 0.0004993923557563188, 'samples': 656384, 'steps': 1281, 'batch_loss/train': 0.8669967195019126}
12/21/2021 18:35:53 - INFO - codeparrot_training - Step 1282: {'lr': 0.0004993912568917327, 'samples': 656896, 'steps': 1282, 'batch_loss/train': 0.9611996412277222}
12/21/2021 18:36:04 - INFO - codeparrot_training - Step 1283: {'lr': 0.000499390157035661, 'samples': 657408, 'steps': 1283, 'batch_loss/train': 0.9204590367153287}
12/21/2021 18:36:14 - INFO - codeparrot_training - Step 1284: {'lr': 0.0004993890561881084, 'samples': 657920, 'steps': 1284, 'batch_loss/train': 0.8793917093425989}
12/21/2021 18:36:26 - INFO - codeparrot_training - Step 1285: {'lr': 0.0004993879543490789, 'samples': 658432, 'steps': 1285, 'batch_loss/train': 0.9312426913529634}
12/21/2021 18:36:37 - INFO - codeparrot_training - Step 1286: {'lr': 0.0004993868515185772, 'samples': 658944, 'steps': 1286, 'batch_loss/train': 0.9649236835539341}
12/21/2021 18:36:47 - INFO - codeparrot_training - Step 1287: {'lr': 0.0004993857476966075, 'samples': 659456, 'steps': 1287, 'batch_loss/train': 0.9192196615040302}
12/21/2021 18:36:59 - INFO - codeparrot_training - Step 1288: {'lr': 0.0004993846428831742, 'samples': 659968, 'steps': 1288, 'batch_loss/train': 0.867634005844593}
12/21/2021 18:37:10 - INFO - codeparrot_training - Step 1289: {'lr': 0.0004993835370782817, 'samples': 660480, 'steps': 1289, 'batch_loss/train': 0.9262999817728996}
12/21/2021 18:37:20 - INFO - codeparrot_training - Step 1290: {'lr': 0.0004993824302819346, 'samples': 660992, 'steps': 1290, 'batch_loss/train': 0.8916918057948351}
12/21/2021 18:37:31 - INFO - codeparrot_training - Step 1291: {'lr': 0.0004993813224941369, 'samples': 661504, 'steps': 1291, 'batch_loss/train': 0.9068458424881101}
12/21/2021 18:37:44 - INFO - codeparrot_training - Step 1292: {'lr': 0.0004993802137148933, 'samples': 662016, 'steps': 1292, 'batch_loss/train': 0.7487506093457341}
12/21/2021 18:37:54 - INFO - codeparrot_training - Step 1293: {'lr': 0.0004993791039442082, 'samples': 662528, 'steps': 1293, 'batch_loss/train': 0.8621078282594681}
12/21/2021 18:38:05 - INFO - codeparrot_training - Step 1294: {'lr': 0.0004993779931820859, 'samples': 663040, 'steps': 1294, 'batch_loss/train': 0.9061572141945362}
12/21/2021 18:38:17 - INFO - codeparrot_training - Step 1295: {'lr': 0.0004993768814285308, 'samples': 663552, 'steps': 1295, 'batch_loss/train': 0.8956511998549104}
12/21/2021 18:38:28 - INFO - codeparrot_training - Step 1296: {'lr': 0.0004993757686835474, 'samples': 664064, 'steps': 1296, 'batch_loss/train': 0.8759795539081097}
12/21/2021 18:38:38 - INFO - codeparrot_training - Step 1297: {'lr': 0.0004993746549471402, 'samples': 664576, 'steps': 1297, 'batch_loss/train': 0.9617621544748545}
12/21/2021 18:38:51 - INFO - codeparrot_training - Step 1298: {'lr': 0.0004993735402193134, 'samples': 665088, 'steps': 1298, 'batch_loss/train': 0.957000813446939}
12/21/2021 18:39:02 - INFO - codeparrot_training - Step 1299: {'lr': 0.0004993724245000717, 'samples': 665600, 'steps': 1299, 'batch_loss/train': 1.0371154267340899}
12/21/2021 18:39:13 - INFO - codeparrot_training - Step 1300: {'lr': 0.0004993713077894194, 'samples': 666112, 'steps': 1300, 'batch_loss/train': 0.8866588901728392}
12/21/2021 18:39:23 - INFO - codeparrot_training - Step 1301: {'lr': 0.0004993701900873607, 'samples': 666624, 'steps': 1301, 'batch_loss/train': 0.7850203523412347}
12/21/2021 18:39:35 - INFO - codeparrot_training - Step 1302: {'lr': 0.0004993690713939005, 'samples': 667136, 'steps': 1302, 'batch_loss/train': 1.0228252205997705}
12/21/2021 18:39:46 - INFO - codeparrot_training - Step 1303: {'lr': 0.000499367951709043, 'samples': 667648, 'steps': 1303, 'batch_loss/train': 1.0434366362169385}
12/21/2021 18:39:56 - INFO - codeparrot_training - Step 1304: {'lr': 0.0004993668310327926, 'samples': 668160, 'steps': 1304, 'batch_loss/train': 0.7902837623842061}
12/21/2021 18:40:09 - INFO - codeparrot_training - Step 1305: {'lr': 0.000499365709365154, 'samples': 668672, 'steps': 1305, 'batch_loss/train': 0.9913491755723953}
12/21/2021 18:40:20 - INFO - codeparrot_training - Step 1306: {'lr': 0.0004993645867061312, 'samples': 669184, 'steps': 1306, 'batch_loss/train': 0.9016605205833912}
12/21/2021 18:40:30 - INFO - codeparrot_training - Step 1307: {'lr': 0.0004993634630557292, 'samples': 669696, 'steps': 1307, 'batch_loss/train': 0.5655952508095652}
12/21/2021 18:40:42 - INFO - codeparrot_training - Step 1308: {'lr': 0.0004993623384139523, 'samples': 670208, 'steps': 1308, 'batch_loss/train': 0.9662716723978519}
12/21/2021 18:40:53 - INFO - codeparrot_training - Step 1309: {'lr': 0.0004993612127808047, 'samples': 670720, 'steps': 1309, 'batch_loss/train': 0.9132321067154408}
12/21/2021 18:41:04 - INFO - codeparrot_training - Step 1310: {'lr': 0.0004993600861562912, 'samples': 671232, 'steps': 1310, 'batch_loss/train': 0.8492658697068691}
12/21/2021 18:41:14 - INFO - codeparrot_training - Step 1311: {'lr': 0.0004993589585404161, 'samples': 671744, 'steps': 1311, 'batch_loss/train': 0.8645156929269433}
12/21/2021 18:41:27 - INFO - codeparrot_training - Step 1312: {'lr': 0.0004993578299331839, 'samples': 672256, 'steps': 1312, 'batch_loss/train': 0.9388946956023574}
12/21/2021 18:41:38 - INFO - codeparrot_training - Step 1313: {'lr': 0.0004993567003345992, 'samples': 672768, 'steps': 1313, 'batch_loss/train': 0.9403657596558332}
12/21/2021 18:41:48 - INFO - codeparrot_training - Step 1314: {'lr': 0.0004993555697446663, 'samples': 673280, 'steps': 1314, 'batch_loss/train': 0.9188530156388879}
12/21/2021 18:42:01 - INFO - codeparrot_training - Step 1315: {'lr': 0.0004993544381633899, 'samples': 673792, 'steps': 1315, 'batch_loss/train': 0.710785408038646}
12/21/2021 18:42:11 - INFO - codeparrot_training - Step 1316: {'lr': 0.0004993533055907744, 'samples': 674304, 'steps': 1316, 'batch_loss/train': 0.8243555780500174}
12/21/2021 18:42:22 - INFO - codeparrot_training - Step 1317: {'lr': 0.0004993521720268243, 'samples': 674816, 'steps': 1317, 'batch_loss/train': 0.9317512549459934}
12/21/2021 18:42:34 - INFO - codeparrot_training - Step 1318: {'lr': 0.0004993510374715442, 'samples': 675328, 'steps': 1318, 'batch_loss/train': 0.9920937879942358}
12/21/2021 18:42:45 - INFO - codeparrot_training - Step 1319: {'lr': 0.0004993499019249383, 'samples': 675840, 'steps': 1319, 'batch_loss/train': 0.9533022390678525}
12/21/2021 18:42:55 - INFO - codeparrot_training - Step 1320: {'lr': 0.0004993487653870115, 'samples': 676352, 'steps': 1320, 'batch_loss/train': 0.8998299855738878}
12/21/2021 18:43:06 - INFO - codeparrot_training - Step 1321: {'lr': 0.0004993476278577682, 'samples': 676864, 'steps': 1321, 'batch_loss/train': 0.8547343546524644}
12/21/2021 18:43:19 - INFO - codeparrot_training - Step 1322: {'lr': 0.0004993464893372128, 'samples': 677376, 'steps': 1322, 'batch_loss/train': 1.0036962730810046}
12/21/2021 18:43:29 - INFO - codeparrot_training - Step 1323: {'lr': 0.00049934534982535, 'samples': 677888, 'steps': 1323, 'batch_loss/train': 0.9214518086519092}
12/21/2021 18:43:40 - INFO - codeparrot_training - Step 1324: {'lr': 0.0004993442093221841, 'samples': 678400, 'steps': 1324, 'batch_loss/train': 0.8624354796484113}
12/21/2021 18:43:52 - INFO - codeparrot_training - Step 1325: {'lr': 0.0004993430678277199, 'samples': 678912, 'steps': 1325, 'batch_loss/train': 1.0128678511828184}
12/21/2021 18:44:03 - INFO - codeparrot_training - Step 1326: {'lr': 0.0004993419253419617, 'samples': 679424, 'steps': 1326, 'batch_loss/train': 0.8773164730519056}
12/21/2021 18:44:13 - INFO - codeparrot_training - Step 1327: {'lr': 0.0004993407818649142, 'samples': 679936, 'steps': 1327, 'batch_loss/train': 0.8841583263128996}
12/21/2021 18:44:25 - INFO - codeparrot_training - Step 1328: {'lr': 0.0004993396373965819, 'samples': 680448, 'steps': 1328, 'batch_loss/train': 0.9331040475517511}
12/21/2021 18:44:36 - INFO - codeparrot_training - Step 1329: {'lr': 0.0004993384919369693, 'samples': 680960, 'steps': 1329, 'batch_loss/train': 0.8925506938248873}
12/21/2021 18:44:46 - INFO - codeparrot_training - Step 1330: {'lr': 0.0004993373454860811, 'samples': 681472, 'steps': 1330, 'batch_loss/train': 0.8381318263709545}
12/21/2021 18:44:59 - INFO - codeparrot_training - Step 1331: {'lr': 0.0004993361980439217, 'samples': 681984, 'steps': 1331, 'batch_loss/train': 0.9490224411711097}
12/21/2021 18:45:10 - INFO - codeparrot_training - Step 1332: {'lr': 0.0004993350496104958, 'samples': 682496, 'steps': 1332, 'batch_loss/train': 0.8712426014244556}
12/21/2021 18:45:20 - INFO - codeparrot_training - Step 1333: {'lr': 0.0004993339001858079, 'samples': 683008, 'steps': 1333, 'batch_loss/train': 1.007031557150185}
12/21/2021 18:45:31 - INFO - codeparrot_training - Step 1334: {'lr': 0.0004993327497698624, 'samples': 683520, 'steps': 1334, 'batch_loss/train': 0.8620072351768613}
12/21/2021 18:45:44 - INFO - codeparrot_training - Step 1335: {'lr': 0.0004993315983626641, 'samples': 684032, 'steps': 1335, 'batch_loss/train': 0.8929694406688213}
12/21/2021 18:45:55 - INFO - codeparrot_training - Step 1336: {'lr': 0.0004993304459642175, 'samples': 684544, 'steps': 1336, 'batch_loss/train': 0.9880138784646988}
12/21/2021 18:46:05 - INFO - codeparrot_training - Step 1337: {'lr': 0.0004993292925745272, 'samples': 685056, 'steps': 1337, 'batch_loss/train': 0.935458067804575}
12/21/2021 18:46:17 - INFO - codeparrot_training - Step 1338: {'lr': 0.0004993281381935978, 'samples': 685568, 'steps': 1338, 'batch_loss/train': 0.9177853753790259}
12/21/2021 18:46:28 - INFO - codeparrot_training - Step 1339: {'lr': 0.0004993269828214339, 'samples': 686080, 'steps': 1339, 'batch_loss/train': 1.0212460467591882}
12/21/2021 18:46:39 - INFO - codeparrot_training - Step 1340: {'lr': 0.0004993258264580399, 'samples': 686592, 'steps': 1340, 'batch_loss/train': 0.8958797240629792}
12/21/2021 18:46:52 - INFO - codeparrot_training - Step 1341: {'lr': 0.0004993246691034208, 'samples': 687104, 'steps': 1341, 'batch_loss/train': 0.8619717983528972}
12/21/2021 18:47:02 - INFO - codeparrot_training - Step 1342: {'lr': 0.0004993235107575807, 'samples': 687616, 'steps': 1342, 'batch_loss/train': 0.9788844976574183}
12/21/2021 18:47:13 - INFO - codeparrot_training - Step 1343: {'lr': 0.0004993223514205247, 'samples': 688128, 'steps': 1343, 'batch_loss/train': 0.9293665299192071}
12/21/2021 18:47:23 - INFO - codeparrot_training - Step 1344: {'lr': 0.0004993211910922571, 'samples': 688640, 'steps': 1344, 'batch_loss/train': 0.9605931024998426}
12/21/2021 18:47:35 - INFO - codeparrot_training - Step 1345: {'lr': 0.0004993200297727825, 'samples': 689152, 'steps': 1345, 'batch_loss/train': 0.8961412999778986}
12/21/2021 18:47:46 - INFO - codeparrot_training - Step 1346: {'lr': 0.0004993188674621056, 'samples': 689664, 'steps': 1346, 'batch_loss/train': 0.9617785345762968}
12/21/2021 18:47:56 - INFO - codeparrot_training - Step 1347: {'lr': 0.0004993177041602311, 'samples': 690176, 'steps': 1347, 'batch_loss/train': 0.9072169419378042}
12/21/2021 18:48:09 - INFO - codeparrot_training - Step 1348: {'lr': 0.0004993165398671635, 'samples': 690688, 'steps': 1348, 'batch_loss/train': 0.7944323308765888}
12/21/2021 18:48:19 - INFO - codeparrot_training - Step 1349: {'lr': 0.0004993153745829075, 'samples': 691200, 'steps': 1349, 'batch_loss/train': 0.8820071071386337}
12/21/2021 18:48:30 - INFO - codeparrot_training - Step 1350: {'lr': 0.0004993142083074677, 'samples': 691712, 'steps': 1350, 'batch_loss/train': 0.9674626467749476}
12/21/2021 18:48:42 - INFO - codeparrot_training - Step 1351: {'lr': 0.0004993130410408487, 'samples': 692224, 'steps': 1351, 'batch_loss/train': 0.8883714079856873}
12/21/2021 18:48:53 - INFO - codeparrot_training - Step 1352: {'lr': 0.0004993118727830553, 'samples': 692736, 'steps': 1352, 'batch_loss/train': 0.8666070178151131}
12/21/2021 18:49:04 - INFO - codeparrot_training - Step 1353: {'lr': 0.0004993107035340919, 'samples': 693248, 'steps': 1353, 'batch_loss/train': 0.8847145279869437}
12/21/2021 18:49:14 - INFO - codeparrot_training - Step 1354: {'lr': 0.0004993095332939633, 'samples': 693760, 'steps': 1354, 'batch_loss/train': 0.9575709011405706}
12/21/2021 18:49:26 - INFO - codeparrot_training - Step 1355: {'lr': 0.0004993083620626742, 'samples': 694272, 'steps': 1355, 'batch_loss/train': 0.9073086497373879}
12/21/2021 18:49:37 - INFO - codeparrot_training - Step 1356: {'lr': 0.0004993071898402292, 'samples': 694784, 'steps': 1356, 'batch_loss/train': 0.9253665707074106}
12/21/2021 18:49:47 - INFO - codeparrot_training - Step 1357: {'lr': 0.0004993060166266329, 'samples': 695296, 'steps': 1357, 'batch_loss/train': 0.8852570233866572}
12/21/2021 18:50:00 - INFO - codeparrot_training - Step 1358: {'lr': 0.00049930484242189, 'samples': 695808, 'steps': 1358, 'batch_loss/train': 0.8214883715845644}
12/21/2021 18:50:11 - INFO - codeparrot_training - Step 1359: {'lr': 0.0004993036672260052, 'samples': 696320, 'steps': 1359, 'batch_loss/train': 0.9004403837025166}
12/21/2021 18:50:22 - INFO - codeparrot_training - Step 1360: {'lr': 0.0004993024910389831, 'samples': 696832, 'steps': 1360, 'batch_loss/train': 0.9973794119432569}
12/21/2021 18:50:34 - INFO - codeparrot_training - Step 1361: {'lr': 0.0004993013138608284, 'samples': 697344, 'steps': 1361, 'batch_loss/train': 0.9549466390162706}
12/21/2021 18:50:44 - INFO - codeparrot_training - Step 1362: {'lr': 0.0004993001356915459, 'samples': 697856, 'steps': 1362, 'batch_loss/train': 0.8603733228519559}
12/21/2021 18:50:55 - INFO - codeparrot_training - Step 1363: {'lr': 0.0004992989565311402, 'samples': 698368, 'steps': 1363, 'batch_loss/train': 0.9401235659606755}
12/21/2021 18:51:07 - INFO - codeparrot_training - Step 1364: {'lr': 0.0004992977763796161, 'samples': 698880, 'steps': 1364, 'batch_loss/train': 0.8158649448305368}
12/21/2021 18:51:17 - INFO - codeparrot_training - Step 1365: {'lr': 0.000499296595236978, 'samples': 699392, 'steps': 1365, 'batch_loss/train': 1.001479273661971}
12/21/2021 18:51:28 - INFO - codeparrot_training - Step 1366: {'lr': 0.0004992954131032307, 'samples': 699904, 'steps': 1366, 'batch_loss/train': 0.9982989002019167}
12/21/2021 18:51:39 - INFO - codeparrot_training - Step 1367: {'lr': 0.0004992942299783792, 'samples': 700416, 'steps': 1367, 'batch_loss/train': 0.9400274325162172}
12/21/2021 18:51:51 - INFO - codeparrot_training - Step 1368: {'lr': 0.0004992930458624277, 'samples': 700928, 'steps': 1368, 'batch_loss/train': 0.8175384434871376}
12/21/2021 18:52:02 - INFO - codeparrot_training - Step 1369: {'lr': 0.0004992918607553814, 'samples': 701440, 'steps': 1369, 'batch_loss/train': 0.8123994609341025}
12/21/2021 18:52:13 - INFO - codeparrot_training - Step 1370: {'lr': 0.0004992906746572447, 'samples': 701952, 'steps': 1370, 'batch_loss/train': 0.8737145038321614}
12/21/2021 18:52:25 - INFO - codeparrot_training - Step 1371: {'lr': 0.0004992894875680224, 'samples': 702464, 'steps': 1371, 'batch_loss/train': 0.8616317110136151}
12/21/2021 18:52:35 - INFO - codeparrot_training - Step 1372: {'lr': 0.0004992882994877192, 'samples': 702976, 'steps': 1372, 'batch_loss/train': 0.8232845973689109}
12/21/2021 18:52:46 - INFO - codeparrot_training - Step 1373: {'lr': 0.00049928711041634, 'samples': 703488, 'steps': 1373, 'batch_loss/train': 0.8644114304333925}
12/21/2021 18:52:58 - INFO - codeparrot_training - Step 1374: {'lr': 0.0004992859203538892, 'samples': 704000, 'steps': 1374, 'batch_loss/train': 0.8904834846034646}
12/21/2021 18:53:08 - INFO - codeparrot_training - Step 1375: {'lr': 0.0004992847293003717, 'samples': 704512, 'steps': 1375, 'batch_loss/train': 0.8894833009690046}
12/21/2021 18:53:19 - INFO - codeparrot_training - Step 1376: {'lr': 0.0004992835372557924, 'samples': 705024, 'steps': 1376, 'batch_loss/train': 0.8786670342087746}
12/21/2021 18:53:29 - INFO - codeparrot_training - Step 1377: {'lr': 0.0004992823442201558, 'samples': 705536, 'steps': 1377, 'batch_loss/train': 0.8387476159259677}
12/21/2021 18:53:42 - INFO - codeparrot_training - Step 1378: {'lr': 0.0004992811501934667, 'samples': 706048, 'steps': 1378, 'batch_loss/train': 0.975234052632004}
12/21/2021 18:53:52 - INFO - codeparrot_training - Step 1379: {'lr': 0.0004992799551757299, 'samples': 706560, 'steps': 1379, 'batch_loss/train': 0.890184338670224}
12/21/2021 18:54:03 - INFO - codeparrot_training - Step 1380: {'lr': 0.0004992787591669501, 'samples': 707072, 'steps': 1380, 'batch_loss/train': 0.8847593450918794}
12/21/2021 18:54:16 - INFO - codeparrot_training - Step 1381: {'lr': 0.0004992775621671322, 'samples': 707584, 'steps': 1381, 'batch_loss/train': 0.8043151115998626}
12/21/2021 18:54:27 - INFO - codeparrot_training - Step 1382: {'lr': 0.0004992763641762806, 'samples': 708096, 'steps': 1382, 'batch_loss/train': 1.0100176334381104}
12/21/2021 18:54:37 - INFO - codeparrot_training - Step 1383: {'lr': 0.0004992751651944005, 'samples': 708608, 'steps': 1383, 'batch_loss/train': 0.8148020063526928}
12/21/2021 18:54:49 - INFO - codeparrot_training - Step 1384: {'lr': 0.0004992739652214964, 'samples': 709120, 'steps': 1384, 'batch_loss/train': 0.8589685750193894}
12/21/2021 18:55:00 - INFO - codeparrot_training - Step 1385: {'lr': 0.0004992727642575731, 'samples': 709632, 'steps': 1385, 'batch_loss/train': 0.8756378502584994}
12/21/2021 18:55:11 - INFO - codeparrot_training - Step 1386: {'lr': 0.0004992715623026355, 'samples': 710144, 'steps': 1386, 'batch_loss/train': 0.9711198946461082}
12/21/2021 18:55:21 - INFO - codeparrot_training - Step 1387: {'lr': 0.0004992703593566882, 'samples': 710656, 'steps': 1387, 'batch_loss/train': 0.7297414396889508}
12/21/2021 18:55:33 - INFO - codeparrot_training - Step 1388: {'lr': 0.0004992691554197361, 'samples': 711168, 'steps': 1388, 'batch_loss/train': 0.9059382630512118}
12/21/2021 18:55:44 - INFO - codeparrot_training - Step 1389: {'lr': 0.000499267950491784, 'samples': 711680, 'steps': 1389, 'batch_loss/train': 1.543355810455978}
12/21/2021 18:55:54 - INFO - codeparrot_training - Step 1390: {'lr': 0.0004992667445728366, 'samples': 712192, 'steps': 1390, 'batch_loss/train': 0.9728460945188999}
12/21/2021 18:56:07 - INFO - codeparrot_training - Step 1391: {'lr': 0.0004992655376628987, 'samples': 712704, 'steps': 1391, 'batch_loss/train': 0.9879138451069593}
12/21/2021 18:56:18 - INFO - codeparrot_training - Step 1392: {'lr': 0.0004992643297619752, 'samples': 713216, 'steps': 1392, 'batch_loss/train': 0.9045278923586011}
12/21/2021 18:56:28 - INFO - codeparrot_training - Step 1393: {'lr': 0.0004992631208700709, 'samples': 713728, 'steps': 1393, 'batch_loss/train': 0.9775938503444195}
12/21/2021 18:56:40 - INFO - codeparrot_training - Step 1394: {'lr': 0.0004992619109871904, 'samples': 714240, 'steps': 1394, 'batch_loss/train': 0.8892487827688456}
12/21/2021 18:56:51 - INFO - codeparrot_training - Step 1395: {'lr': 0.0004992607001133388, 'samples': 714752, 'steps': 1395, 'batch_loss/train': 0.8694000644609332}
12/21/2021 18:57:02 - INFO - codeparrot_training - Step 1396: {'lr': 0.0004992594882485208, 'samples': 715264, 'steps': 1396, 'batch_loss/train': 0.8429547948762774}
12/21/2021 18:57:12 - INFO - codeparrot_training - Step 1397: {'lr': 0.0004992582753927411, 'samples': 715776, 'steps': 1397, 'batch_loss/train': 0.9060738850384951}
12/21/2021 18:57:26 - INFO - codeparrot_training - Step 1398: {'lr': 0.0004992570615460045, 'samples': 716288, 'steps': 1398, 'batch_loss/train': 1.281076810322702}
12/21/2021 18:57:36 - INFO - codeparrot_training - Step 1399: {'lr': 0.0004992558467083162, 'samples': 716800, 'steps': 1399, 'batch_loss/train': 0.9588902518153191}
12/21/2021 18:57:47 - INFO - codeparrot_training - Step 1400: {'lr': 0.0004992546308796805, 'samples': 717312, 'steps': 1400, 'batch_loss/train': 0.7826875653117895}
12/21/2021 18:57:59 - INFO - codeparrot_training - Step 1401: {'lr': 0.0004992534140601027, 'samples': 717824, 'steps': 1401, 'batch_loss/train': 0.9038854958489537}
12/21/2021 18:58:10 - INFO - codeparrot_training - Step 1402: {'lr': 0.0004992521962495874, 'samples': 718336, 'steps': 1402, 'batch_loss/train': 0.9273305330425501}
12/21/2021 18:58:20 - INFO - codeparrot_training - Step 1403: {'lr': 0.0004992509774481394, 'samples': 718848, 'steps': 1403, 'batch_loss/train': 1.0787429315969348}
12/21/2021 18:58:32 - INFO - codeparrot_training - Step 1404: {'lr': 0.0004992497576557637, 'samples': 719360, 'steps': 1404, 'batch_loss/train': 0.9390718201175332}
12/21/2021 18:58:43 - INFO - codeparrot_training - Step 1405: {'lr': 0.0004992485368724651, 'samples': 719872, 'steps': 1405, 'batch_loss/train': 1.0478160828351974}
12/21/2021 18:58:54 - INFO - codeparrot_training - Step 1406: {'lr': 0.0004992473150982484, 'samples': 720384, 'steps': 1406, 'batch_loss/train': 1.003836389631033}
12/21/2021 18:59:04 - INFO - codeparrot_training - Step 1407: {'lr': 0.0004992460923331184, 'samples': 720896, 'steps': 1407, 'batch_loss/train': 0.9099589590914547}
12/21/2021 18:59:17 - INFO - codeparrot_training - Step 1408: {'lr': 0.0004992448685770802, 'samples': 721408, 'steps': 1408, 'batch_loss/train': 0.8307887779083103}
12/21/2021 18:59:28 - INFO - codeparrot_training - Step 1409: {'lr': 0.0004992436438301384, 'samples': 721920, 'steps': 1409, 'batch_loss/train': 0.7110651433467865}
12/21/2021 18:59:39 - INFO - codeparrot_training - Step 1410: {'lr': 0.000499242418092298, 'samples': 722432, 'steps': 1410, 'batch_loss/train': 0.7149399314075708}
12/21/2021 18:59:51 - INFO - codeparrot_training - Step 1411: {'lr': 0.000499241191363564, 'samples': 722944, 'steps': 1411, 'batch_loss/train': 0.8380734208039939}
12/21/2021 19:00:01 - INFO - codeparrot_training - Step 1412: {'lr': 0.000499239963643941, 'samples': 723456, 'steps': 1412, 'batch_loss/train': 0.8435431718826294}
12/21/2021 19:00:12 - INFO - codeparrot_training - Step 1413: {'lr': 0.0004992387349334339, 'samples': 723968, 'steps': 1413, 'batch_loss/train': 0.8829627665691078}
12/21/2021 19:00:24 - INFO - codeparrot_training - Step 1414: {'lr': 0.000499237505232048, 'samples': 724480, 'steps': 1414, 'batch_loss/train': 0.7680662181228399}
12/21/2021 19:00:34 - INFO - codeparrot_training - Step 1415: {'lr': 0.0004992362745397876, 'samples': 724992, 'steps': 1415, 'batch_loss/train': 0.8271518922410905}
12/21/2021 19:00:45 - INFO - codeparrot_training - Step 1416: {'lr': 0.000499235042856658, 'samples': 725504, 'steps': 1416, 'batch_loss/train': 0.9793215957470238}
12/21/2021 19:00:56 - INFO - codeparrot_training - Step 1417: {'lr': 0.000499233810182664, 'samples': 726016, 'steps': 1417, 'batch_loss/train': 0.9254060639068484}
12/21/2021 19:01:08 - INFO - codeparrot_training - Step 1418: {'lr': 0.0004992325765178104, 'samples': 726528, 'steps': 1418, 'batch_loss/train': 0.8961984254419804}
12/21/2021 19:01:19 - INFO - codeparrot_training - Step 1419: {'lr': 0.0004992313418621023, 'samples': 727040, 'steps': 1419, 'batch_loss/train': 0.8967116167768836}
12/21/2021 19:01:29 - INFO - codeparrot_training - Step 1420: {'lr': 0.0004992301062155443, 'samples': 727552, 'steps': 1420, 'batch_loss/train': 0.8577445903792977}
12/21/2021 19:01:42 - INFO - codeparrot_training - Step 1421: {'lr': 0.0004992288695781417, 'samples': 728064, 'steps': 1421, 'batch_loss/train': 0.9045203067362309}
12/21/2021 19:01:53 - INFO - codeparrot_training - Step 1422: {'lr': 0.0004992276319498992, 'samples': 728576, 'steps': 1422, 'batch_loss/train': 0.8740531024523079}
12/21/2021 19:02:03 - INFO - codeparrot_training - Step 1423: {'lr': 0.0004992263933308217, 'samples': 729088, 'steps': 1423, 'batch_loss/train': 0.8829802665859461}
12/21/2021 19:02:15 - INFO - codeparrot_training - Step 1424: {'lr': 0.000499225153720914, 'samples': 729600, 'steps': 1424, 'batch_loss/train': 0.9505403861403465}
12/21/2021 19:02:26 - INFO - codeparrot_training - Step 1425: {'lr': 0.0004992239131201815, 'samples': 730112, 'steps': 1425, 'batch_loss/train': 0.7963016289286315}
12/21/2021 19:02:36 - INFO - codeparrot_training - Step 1426: {'lr': 0.0004992226715286285, 'samples': 730624, 'steps': 1426, 'batch_loss/train': 0.9058397859334946}
12/21/2021 19:02:47 - INFO - codeparrot_training - Step 1427: {'lr': 0.0004992214289462606, 'samples': 731136, 'steps': 1427, 'batch_loss/train': 0.9820694969967008}
12/21/2021 19:03:00 - INFO - codeparrot_training - Step 1428: {'lr': 0.0004992201853730821, 'samples': 731648, 'steps': 1428, 'batch_loss/train': 0.8871501497924328}
12/21/2021 19:03:11 - INFO - codeparrot_training - Step 1429: {'lr': 0.0004992189408090984, 'samples': 732160, 'steps': 1429, 'batch_loss/train': 0.9270755974575877}
12/21/2021 19:03:21 - INFO - codeparrot_training - Step 1430: {'lr': 0.0004992176952543142, 'samples': 732672, 'steps': 1430, 'batch_loss/train': 0.8956932174041867}
12/21/2021 19:03:33 - INFO - codeparrot_training - Step 1431: {'lr': 0.0004992164487087347, 'samples': 733184, 'steps': 1431, 'batch_loss/train': 0.9295681174844503}
12/21/2021 19:03:44 - INFO - codeparrot_training - Step 1432: {'lr': 0.0004992152011723646, 'samples': 733696, 'steps': 1432, 'batch_loss/train': 0.8500270834192634}
12/21/2021 19:03:54 - INFO - codeparrot_training - Step 1433: {'lr': 0.000499213952645209, 'samples': 734208, 'steps': 1433, 'batch_loss/train': 0.9154115151613951}
12/21/2021 19:04:06 - INFO - codeparrot_training - Step 1434: {'lr': 0.0004992127031272727, 'samples': 734720, 'steps': 1434, 'batch_loss/train': 0.7609016797505319}
12/21/2021 19:04:17 - INFO - codeparrot_training - Step 1435: {'lr': 0.000499211452618561, 'samples': 735232, 'steps': 1435, 'batch_loss/train': 0.9242110941559076}
12/21/2021 19:04:28 - INFO - codeparrot_training - Step 1436: {'lr': 0.0004992102011190785, 'samples': 735744, 'steps': 1436, 'batch_loss/train': 0.8660112479701638}
12/21/2021 19:04:38 - INFO - codeparrot_training - Step 1437: {'lr': 0.0004992089486288303, 'samples': 736256, 'steps': 1437, 'batch_loss/train': 0.8736496847122908}
12/21/2021 19:04:51 - INFO - codeparrot_training - Step 1438: {'lr': 0.0004992076951478216, 'samples': 736768, 'steps': 1438, 'batch_loss/train': 0.9058330673724413}
12/21/2021 19:05:02 - INFO - codeparrot_training - Step 1439: {'lr': 0.0004992064406760571, 'samples': 737280, 'steps': 1439, 'batch_loss/train': 0.7826402503997087}
12/21/2021 19:05:13 - INFO - codeparrot_training - Step 1440: {'lr': 0.000499205185213542, 'samples': 737792, 'steps': 1440, 'batch_loss/train': 0.9729894632473588}
12/21/2021 19:05:25 - INFO - codeparrot_training - Step 1441: {'lr': 0.000499203928760281, 'samples': 738304, 'steps': 1441, 'batch_loss/train': 0.7506841602735221}
12/21/2021 19:05:35 - INFO - codeparrot_training - Step 1442: {'lr': 0.0004992026713162793, 'samples': 738816, 'steps': 1442, 'batch_loss/train': 0.8113713748753071}
12/21/2021 19:05:46 - INFO - codeparrot_training - Step 1443: {'lr': 0.000499201412881542, 'samples': 739328, 'steps': 1443, 'batch_loss/train': 0.8861641632393003}
12/21/2021 19:05:59 - INFO - codeparrot_training - Step 1444: {'lr': 0.0004992001534560738, 'samples': 739840, 'steps': 1444, 'batch_loss/train': 0.9161312039941549}
12/21/2021 19:06:09 - INFO - codeparrot_training - Step 1445: {'lr': 0.0004991988930398801, 'samples': 740352, 'steps': 1445, 'batch_loss/train': 0.8818469177931547}
12/21/2021 19:06:20 - INFO - codeparrot_training - Step 1446: {'lr': 0.0004991976316329655, 'samples': 740864, 'steps': 1446, 'batch_loss/train': 0.9179386273026466}
12/21/2021 19:06:30 - INFO - codeparrot_training - Step 1447: {'lr': 0.0004991963692353353, 'samples': 741376, 'steps': 1447, 'batch_loss/train': 0.9280525110661983}
12/21/2021 19:06:42 - INFO - codeparrot_training - Step 1448: {'lr': 0.0004991951058469945, 'samples': 741888, 'steps': 1448, 'batch_loss/train': 0.9176689023151994}
12/21/2021 19:06:53 - INFO - codeparrot_training - Step 1449: {'lr': 0.0004991938414679479, 'samples': 742400, 'steps': 1449, 'batch_loss/train': 0.9053338039666414}
12/21/2021 19:07:04 - INFO - codeparrot_training - Step 1450: {'lr': 0.0004991925760982008, 'samples': 742912, 'steps': 1450, 'batch_loss/train': 0.8886358682066202}
12/21/2021 19:07:16 - INFO - codeparrot_training - Step 1451: {'lr': 0.0004991913097377581, 'samples': 743424, 'steps': 1451, 'batch_loss/train': 0.8458385253325105}
12/21/2021 19:07:27 - INFO - codeparrot_training - Step 1452: {'lr': 0.0004991900423866246, 'samples': 743936, 'steps': 1452, 'batch_loss/train': 0.8727362100034952}
12/21/2021 19:07:37 - INFO - codeparrot_training - Step 1453: {'lr': 0.0004991887740448059, 'samples': 744448, 'steps': 1453, 'batch_loss/train': 0.8100819559767842}
12/21/2021 19:07:49 - INFO - codeparrot_training - Step 1454: {'lr': 0.0004991875047123066, 'samples': 744960, 'steps': 1454, 'batch_loss/train': 0.9352717623114586}
12/21/2021 19:08:00 - INFO - codeparrot_training - Step 1455: {'lr': 0.0004991862343891319, 'samples': 745472, 'steps': 1455, 'batch_loss/train': 1.0207194704562426}
12/21/2021 19:08:11 - INFO - codeparrot_training - Step 1456: {'lr': 0.0004991849630752868, 'samples': 745984, 'steps': 1456, 'batch_loss/train': 0.9129997259005904}
12/21/2021 19:08:21 - INFO - codeparrot_training - Step 1457: {'lr': 0.0004991836907707765, 'samples': 746496, 'steps': 1457, 'batch_loss/train': 0.8793914546258748}
12/21/2021 19:08:34 - INFO - codeparrot_training - Step 1458: {'lr': 0.0004991824174756057, 'samples': 747008, 'steps': 1458, 'batch_loss/train': 0.8392095658928156}
12/21/2021 19:08:45 - INFO - codeparrot_training - Step 1459: {'lr': 0.0004991811431897798, 'samples': 747520, 'steps': 1459, 'batch_loss/train': 0.9201991688460112}
12/21/2021 19:08:55 - INFO - codeparrot_training - Step 1460: {'lr': 0.0004991798679133037, 'samples': 748032, 'steps': 1460, 'batch_loss/train': 0.9655222529545426}
12/21/2021 19:09:07 - INFO - codeparrot_training - Step 1461: {'lr': 0.0004991785916461826, 'samples': 748544, 'steps': 1461, 'batch_loss/train': 0.8345540221780539}
12/21/2021 19:09:18 - INFO - codeparrot_training - Step 1462: {'lr': 0.0004991773143884216, 'samples': 749056, 'steps': 1462, 'batch_loss/train': 0.9990052655339241}
12/21/2021 19:09:28 - INFO - codeparrot_training - Step 1463: {'lr': 0.0004991760361400256, 'samples': 749568, 'steps': 1463, 'batch_loss/train': 0.957309759221971}
12/21/2021 19:09:40 - INFO - codeparrot_training - Step 1464: {'lr': 0.0004991747569009997, 'samples': 750080, 'steps': 1464, 'batch_loss/train': 0.9088741270825267}
12/21/2021 19:09:51 - INFO - codeparrot_training - Step 1465: {'lr': 0.0004991734766713491, 'samples': 750592, 'steps': 1465,
'batch_loss/train': 0.8389442404732108} 12/21/2021 19:10:02 - INFO - codeparrot_training - Step 1466: {'lr': 0.0004991721954510788, 'samples': 751104, 'steps': 1466, 'batch_loss/train': 0.889693749602884} 12/21/2021 19:10:12 - INFO - codeparrot_training - Step 1467: {'lr': 0.000499170913240194, 'samples': 751616, 'steps': 1467, 'batch_loss/train': 0.9019071599468589} 12/21/2021 19:10:25 - INFO - codeparrot_training - Step 1468: {'lr': 0.0004991696300386998, 'samples': 752128, 'steps': 1468, 'batch_loss/train': 0.9565586755052209} 12/21/2021 19:10:36 - INFO - codeparrot_training - Step 1469: {'lr': 0.0004991683458466011, 'samples': 752640, 'steps': 1469, 'batch_loss/train': 1.0009773131459951} 12/21/2021 19:10:46 - INFO - codeparrot_training - Step 1470: {'lr': 0.0004991670606639032, 'samples': 753152, 'steps': 1470, 'batch_loss/train': 0.7662123013287783} 12/21/2021 19:10:59 - INFO - codeparrot_training - Step 1471: {'lr': 0.0004991657744906112, 'samples': 753664, 'steps': 1471, 'batch_loss/train': 0.6708750370889902} 12/21/2021 19:11:09 - INFO - codeparrot_training - Step 1472: {'lr': 0.00049916448732673, 'samples': 754176, 'steps': 1472, 'batch_loss/train': 0.8803734052926302} 12/21/2021 19:11:20 - INFO - codeparrot_training - Step 1473: {'lr': 0.000499163199172265, 'samples': 754688, 'steps': 1473, 'batch_loss/train': 0.9705691980198026} 12/21/2021 19:11:33 - INFO - codeparrot_training - Step 1474: {'lr': 0.0004991619100272211, 'samples': 755200, 'steps': 1474, 'batch_loss/train': 0.9967543762177229} 12/21/2021 19:11:43 - INFO - codeparrot_training - Step 1475: {'lr': 0.0004991606198916037, 'samples': 755712, 'steps': 1475, 'batch_loss/train': 0.8363066669553518} 12/21/2021 19:11:54 - INFO - codeparrot_training - Step 1476: {'lr': 0.0004991593287654177, 'samples': 756224, 'steps': 1476, 'batch_loss/train': 0.8749741897918284} 12/21/2021 19:12:04 - INFO - codeparrot_training - Step 1477: {'lr': 0.0004991580366486682, 'samples': 756736, 'steps': 1477, 
'batch_loss/train': 0.895608157850802} 12/21/2021 19:12:17 - INFO - codeparrot_training - Step 1478: {'lr': 0.0004991567435413604, 'samples': 757248, 'steps': 1478, 'batch_loss/train': 0.9263411909341812} 12/21/2021 19:12:27 - INFO - codeparrot_training - Step 1479: {'lr': 0.0004991554494434996, 'samples': 757760, 'steps': 1479, 'batch_loss/train': 0.7372247721068561} 12/21/2021 19:12:38 - INFO - codeparrot_training - Step 1480: {'lr': 0.0004991541543550907, 'samples': 758272, 'steps': 1480, 'batch_loss/train': 0.8418326985556632} 12/21/2021 19:12:50 - INFO - codeparrot_training - Step 1481: {'lr': 0.0004991528582761391, 'samples': 758784, 'steps': 1481, 'batch_loss/train': 1.209502063691616} 12/21/2021 19:13:01 - INFO - codeparrot_training - Step 1482: {'lr': 0.0004991515612066496, 'samples': 759296, 'steps': 1482, 'batch_loss/train': 1.087610300630331} 12/21/2021 19:13:12 - INFO - codeparrot_training - Step 1483: {'lr': 0.0004991502631466277, 'samples': 759808, 'steps': 1483, 'batch_loss/train': 0.8514039069414139} 12/21/2021 19:13:24 - INFO - codeparrot_training - Step 1484: {'lr': 0.0004991489640960785, 'samples': 760320, 'steps': 1484, 'batch_loss/train': 0.9606312597170472} 12/21/2021 19:13:35 - INFO - codeparrot_training - Step 1485: {'lr': 0.000499147664055007, 'samples': 760832, 'steps': 1485, 'batch_loss/train': 0.9134987881407142} 12/21/2021 19:13:46 - INFO - codeparrot_training - Step 1486: {'lr': 0.0004991463630234184, 'samples': 761344, 'steps': 1486, 'batch_loss/train': 0.860897995531559} 12/21/2021 19:13:56 - INFO - codeparrot_training - Step 1487: {'lr': 0.000499145061001318, 'samples': 761856, 'steps': 1487, 'batch_loss/train': 0.8509105974808335} 12/21/2021 19:14:08 - INFO - codeparrot_training - Step 1488: {'lr': 0.0004991437579887109, 'samples': 762368, 'steps': 1488, 'batch_loss/train': 0.8100118604488671} 12/21/2021 19:14:19 - INFO - codeparrot_training - Step 1489: {'lr': 0.0004991424539856024, 'samples': 762880, 'steps': 1489, 
'batch_loss/train': 0.9670460168272257} 12/21/2021 19:14:30 - INFO - codeparrot_training - Step 1490: {'lr': 0.0004991411489919975, 'samples': 763392, 'steps': 1490, 'batch_loss/train': 0.9150240542367101} 12/21/2021 19:14:42 - INFO - codeparrot_training - Step 1491: {'lr': 0.0004991398430079013, 'samples': 763904, 'steps': 1491, 'batch_loss/train': 0.8878517271950841} 12/21/2021 19:14:52 - INFO - codeparrot_training - Step 1492: {'lr': 0.0004991385360333194, 'samples': 764416, 'steps': 1492, 'batch_loss/train': 0.9301278153434396} 12/21/2021 19:15:03 - INFO - codeparrot_training - Step 1493: {'lr': 0.0004991372280682566, 'samples': 764928, 'steps': 1493, 'batch_loss/train': 0.8494605850428343} 12/21/2021 19:15:16 - INFO - codeparrot_training - Step 1494: {'lr': 0.0004991359191127184, 'samples': 765440, 'steps': 1494, 'batch_loss/train': 1.1543506849557161} 12/21/2021 19:15:26 - INFO - codeparrot_training - Step 1495: {'lr': 0.0004991346091667097, 'samples': 765952, 'steps': 1495, 'batch_loss/train': 0.8929167026653886} 12/21/2021 19:15:37 - INFO - codeparrot_training - Step 1496: {'lr': 0.000499133298230236, 'samples': 766464, 'steps': 1496, 'batch_loss/train': 0.8789736004546285} 12/21/2021 19:15:48 - INFO - codeparrot_training - Step 1497: {'lr': 0.0004991319863033023, 'samples': 766976, 'steps': 1497, 'batch_loss/train': 0.8890458294190466} 12/21/2021 19:16:00 - INFO - codeparrot_training - Step 1498: {'lr': 0.0004991306733859138, 'samples': 767488, 'steps': 1498, 'batch_loss/train': 0.8582348180934787} 12/21/2021 19:16:10 - INFO - codeparrot_training - Step 1499: {'lr': 0.000499129359478076, 'samples': 768000, 'steps': 1499, 'batch_loss/train': 0.9286510422825813} 12/21/2021 19:16:21 - INFO - codeparrot_training - Step 1500: {'lr': 0.0004991280445797938, 'samples': 768512, 'steps': 1500, 'batch_loss/train': 0.9479373442009091} 12/21/2021 19:16:33 - INFO - codeparrot_training - Step 1501: {'lr': 0.0004991267286910726, 'samples': 769024, 'steps': 1501, 
'batch_loss/train': 1.016344478353858} 12/21/2021 19:16:44 - INFO - codeparrot_training - Step 1502: {'lr': 0.0004991254118119176, 'samples': 769536, 'steps': 1502, 'batch_loss/train': 0.8211573557928205} 12/21/2021 19:16:54 - INFO - codeparrot_training - Step 1503: {'lr': 0.000499124093942334, 'samples': 770048, 'steps': 1503, 'batch_loss/train': 1.0534568205475807} 12/21/2021 19:17:07 - INFO - codeparrot_training - Step 1504: {'lr': 0.000499122775082327, 'samples': 770560, 'steps': 1504, 'batch_loss/train': 0.878528674133122} 12/21/2021 19:17:17 - INFO - codeparrot_training - Step 1505: {'lr': 0.0004991214552319021, 'samples': 771072, 'steps': 1505, 'batch_loss/train': 0.8986593801528215} 12/21/2021 19:17:28 - INFO - codeparrot_training - Step 1506: {'lr': 0.0004991201343910643, 'samples': 771584, 'steps': 1506, 'batch_loss/train': 1.2879918431863189} 12/21/2021 19:17:38 - INFO - codeparrot_training - Step 1507: {'lr': 0.0004991188125598188, 'samples': 772096, 'steps': 1507, 'batch_loss/train': 1.0066184606403112} 12/21/2021 19:17:51 - INFO - codeparrot_training - Step 1508: {'lr': 0.000499117489738171, 'samples': 772608, 'steps': 1508, 'batch_loss/train': 0.9410562459379435} 12/21/2021 19:18:02 - INFO - codeparrot_training - Step 1509: {'lr': 0.0004991161659261262, 'samples': 773120, 'steps': 1509, 'batch_loss/train': 0.9238882078789175} 12/21/2021 19:18:12 - INFO - codeparrot_training - Step 1510: {'lr': 0.0004991148411236895, 'samples': 773632, 'steps': 1510, 'batch_loss/train': 0.79955212213099} 12/21/2021 19:18:24 - INFO - codeparrot_training - Step 1511: {'lr': 0.0004991135153308663, 'samples': 774144, 'steps': 1511, 'batch_loss/train': 0.9009072091430426} 12/21/2021 19:18:35 - INFO - codeparrot_training - Step 1512: {'lr': 0.0004991121885476619, 'samples': 774656, 'steps': 1512, 'batch_loss/train': 0.7902439776808023} 12/21/2021 19:18:46 - INFO - codeparrot_training - Step 1513: {'lr': 0.0004991108607740814, 'samples': 775168, 'steps': 1513, 
'batch_loss/train': 0.9307892825454473} 12/21/2021 19:18:58 - INFO - codeparrot_training - Step 1514: {'lr': 0.0004991095320101303, 'samples': 775680, 'steps': 1514, 'batch_loss/train': 0.8853988570626825} 12/21/2021 19:19:09 - INFO - codeparrot_training - Step 1515: {'lr': 0.0004991082022558135, 'samples': 776192, 'steps': 1515, 'batch_loss/train': 0.9643477988429368} 12/21/2021 19:19:20 - INFO - codeparrot_training - Step 1516: {'lr': 0.0004991068715111367, 'samples': 776704, 'steps': 1516, 'batch_loss/train': 0.9457984799519181} 12/21/2021 19:19:30 - INFO - codeparrot_training - Step 1517: {'lr': 0.0004991055397761051, 'samples': 777216, 'steps': 1517, 'batch_loss/train': 0.8710153824649751} 12/21/2021 19:19:42 - INFO - codeparrot_training - Step 1518: {'lr': 0.0004991042070507239, 'samples': 777728, 'steps': 1518, 'batch_loss/train': 0.9458189494907856} 12/21/2021 19:19:53 - INFO - codeparrot_training - Step 1519: {'lr': 0.0004991028733349984, 'samples': 778240, 'steps': 1519, 'batch_loss/train': 0.8499181373044848} 12/21/2021 19:20:04 - INFO - codeparrot_training - Step 1520: {'lr': 0.0004991015386289339, 'samples': 778752, 'steps': 1520, 'batch_loss/train': 0.9264786168932915} 12/21/2021 19:20:16 - INFO - codeparrot_training - Step 1521: {'lr': 0.0004991002029325357, 'samples': 779264, 'steps': 1521, 'batch_loss/train': 0.8807628750801086} 12/21/2021 19:20:26 - INFO - codeparrot_training - Step 1522: {'lr': 0.0004990988662458094, 'samples': 779776, 'steps': 1522, 'batch_loss/train': 0.9004290448501706} 12/21/2021 19:20:37 - INFO - codeparrot_training - Step 1523: {'lr': 0.0004990975285687598, 'samples': 780288, 'steps': 1523, 'batch_loss/train': 0.9843099880963564} 12/21/2021 19:20:50 - INFO - codeparrot_training - Step 1524: {'lr': 0.0004990961899013927, 'samples': 780800, 'steps': 1524, 'batch_loss/train': 0.8712683441117406} 12/21/2021 19:21:01 - INFO - codeparrot_training - Step 1525: {'lr': 0.0004990948502437131, 'samples': 781312, 'steps': 1525, 
'batch_loss/train': 0.7869816906750202} 12/21/2021 19:21:11 - INFO - codeparrot_training - Step 1526: {'lr': 0.0004990935095957263, 'samples': 781824, 'steps': 1526, 'batch_loss/train': 0.7774976945947856} 12/21/2021 19:21:22 - INFO - codeparrot_training - Step 1527: {'lr': 0.0004990921679574379, 'samples': 782336, 'steps': 1527, 'batch_loss/train': 0.9547456745058298} 12/21/2021 19:21:34 - INFO - codeparrot_training - Step 1528: {'lr': 0.0004990908253288531, 'samples': 782848, 'steps': 1528, 'batch_loss/train': 0.8119464088231325} 12/21/2021 19:21:44 - INFO - codeparrot_training - Step 1529: {'lr': 0.0004990894817099771, 'samples': 783360, 'steps': 1529, 'batch_loss/train': 0.9305125717073679} 12/21/2021 19:21:55 - INFO - codeparrot_training - Step 1530: {'lr': 0.0004990881371008155, 'samples': 783872, 'steps': 1530, 'batch_loss/train': 0.8607888668775558} 12/21/2021 19:22:07 - INFO - codeparrot_training - Step 1531: {'lr': 0.0004990867915013735, 'samples': 784384, 'steps': 1531, 'batch_loss/train': 0.9708341555669904} 12/21/2021 19:22:18 - INFO - codeparrot_training - Step 1532: {'lr': 0.0004990854449116565, 'samples': 784896, 'steps': 1532, 'batch_loss/train': 0.8977903034538031} 12/21/2021 19:22:28 - INFO - codeparrot_training - Step 1533: {'lr': 0.0004990840973316697, 'samples': 785408, 'steps': 1533, 'batch_loss/train': 0.871687388047576} 12/21/2021 19:22:41 - INFO - codeparrot_training - Step 1534: {'lr': 0.0004990827487614187, 'samples': 785920, 'steps': 1534, 'batch_loss/train': 1.1145816231146455} 12/21/2021 19:22:51 - INFO - codeparrot_training - Step 1535: {'lr': 0.0004990813992009086, 'samples': 786432, 'steps': 1535, 'batch_loss/train': 0.8210630025714636} 12/21/2021 19:23:02 - INFO - codeparrot_training - Step 1536: {'lr': 0.0004990800486501452, 'samples': 786944, 'steps': 1536, 'batch_loss/train': 0.8837874531745911} 12/21/2021 19:23:12 - INFO - codeparrot_training - Step 1537: {'lr': 0.0004990786971091334, 'samples': 787456, 'steps': 1537, 
'batch_loss/train': 0.8106287140399218} 12/21/2021 19:23:25 - INFO - codeparrot_training - Step 1538: {'lr': 0.0004990773445778786, 'samples': 787968, 'steps': 1538, 'batch_loss/train': 0.8841955773532391} 12/21/2021 19:23:36 - INFO - codeparrot_training - Step 1539: {'lr': 0.0004990759910563865, 'samples': 788480, 'steps': 1539, 'batch_loss/train': 0.8380987984128296} 12/21/2021 19:23:46 - INFO - codeparrot_training - Step 1540: {'lr': 0.0004990746365446622, 'samples': 788992, 'steps': 1540, 'batch_loss/train': 0.6680723326280713} 12/21/2021 19:23:59 - INFO - codeparrot_training - Step 1541: {'lr': 0.0004990732810427113, 'samples': 789504, 'steps': 1541, 'batch_loss/train': 0.9859416512772441} 12/21/2021 19:24:09 - INFO - codeparrot_training - Step 1542: {'lr': 0.0004990719245505391, 'samples': 790016, 'steps': 1542, 'batch_loss/train': 0.958910996094346} 12/21/2021 19:24:20 - INFO - codeparrot_training - Step 1543: {'lr': 0.0004990705670681508, 'samples': 790528, 'steps': 1543, 'batch_loss/train': 0.8678007861599326} 12/21/2021 19:24:33 - INFO - codeparrot_training - Step 1544: {'lr': 0.0004990692085955522, 'samples': 791040, 'steps': 1544, 'batch_loss/train': 0.8476616442203522} 12/21/2021 19:24:44 - INFO - codeparrot_training - Step 1545: {'lr': 0.0004990678491327482, 'samples': 791552, 'steps': 1545, 'batch_loss/train': 0.8259910992346704} 12/21/2021 19:24:54 - INFO - codeparrot_training - Step 1546: {'lr': 0.0004990664886797447, 'samples': 792064, 'steps': 1546, 'batch_loss/train': 0.8504134751856327} 12/21/2021 19:25:05 - INFO - codeparrot_training - Step 1547: {'lr': 0.0004990651272365469, 'samples': 792576, 'steps': 1547, 'batch_loss/train': 0.8602686319500208} 12/21/2021 19:25:17 - INFO - codeparrot_training - Step 1548: {'lr': 0.00049906376480316, 'samples': 793088, 'steps': 1548, 'batch_loss/train': 1.0521958786994219} 12/21/2021 19:25:28 - INFO - codeparrot_training - Step 1549: {'lr': 0.0004990624013795898, 'samples': 793600, 'steps': 1549, 
'batch_loss/train': 0.9907089453190565} 12/21/2021 19:25:38 - INFO - codeparrot_training - Step 1550: {'lr': 0.0004990610369658414, 'samples': 794112, 'steps': 1550, 'batch_loss/train': 0.946219832636416} 12/21/2021 19:25:51 - INFO - codeparrot_training - Step 1551: {'lr': 0.0004990596715619205, 'samples': 794624, 'steps': 1551, 'batch_loss/train': 0.868032711558044} 12/21/2021 19:26:02 - INFO - codeparrot_training - Step 1552: {'lr': 0.0004990583051678322, 'samples': 795136, 'steps': 1552, 'batch_loss/train': 0.884167302865535} 12/21/2021 19:26:12 - INFO - codeparrot_training - Step 1553: {'lr': 0.0004990569377835823, 'samples': 795648, 'steps': 1553, 'batch_loss/train': 0.6480185776017606} 12/21/2021 19:26:23 - INFO - codeparrot_training - Step 1554: {'lr': 0.0004990555694091759, 'samples': 796160, 'steps': 1554, 'batch_loss/train': 0.8735680351965129} 12/21/2021 19:26:36 - INFO - codeparrot_training - Step 1555: {'lr': 0.0004990542000446188, 'samples': 796672, 'steps': 1555, 'batch_loss/train': 0.9366705920547247} 12/21/2021 19:26:46 - INFO - codeparrot_training - Step 1556: {'lr': 0.0004990528296899161, 'samples': 797184, 'steps': 1556, 'batch_loss/train': 0.8278948925435543} 12/21/2021 19:26:57 - INFO - codeparrot_training - Step 1557: {'lr': 0.0004990514583450734, 'samples': 797696, 'steps': 1557, 'batch_loss/train': 0.9266274953261018} 12/21/2021 19:27:09 - INFO - codeparrot_training - Step 1558: {'lr': 0.0004990500860100961, 'samples': 798208, 'steps': 1558, 'batch_loss/train': 0.9197458419948816} 12/21/2021 19:27:20 - INFO - codeparrot_training - Step 1559: {'lr': 0.0004990487126849898, 'samples': 798720, 'steps': 1559, 'batch_loss/train': 0.8157045301049948} 12/21/2021 19:27:30 - INFO - codeparrot_training - Step 1560: {'lr': 0.0004990473383697597, 'samples': 799232, 'steps': 1560, 'batch_loss/train': 0.9552362952381372} 12/21/2021 19:27:43 - INFO - codeparrot_training - Step 1561: {'lr': 0.0004990459630644116, 'samples': 799744, 'steps': 1561, 
'batch_loss/train': 0.8295984999276698} 12/21/2021 19:27:54 - INFO - codeparrot_training - Step 1562: {'lr': 0.0004990445867689506, 'samples': 800256, 'steps': 1562, 'batch_loss/train': 0.9468330757226795} 12/21/2021 19:28:04 - INFO - codeparrot_training - Step 1563: {'lr': 0.0004990432094833826, 'samples': 800768, 'steps': 1563, 'batch_loss/train': 0.9141863500699401} 12/21/2021 19:28:15 - INFO - codeparrot_training - Step 1564: {'lr': 0.0004990418312077127, 'samples': 801280, 'steps': 1564, 'batch_loss/train': 0.8741412414237857} 12/21/2021 19:28:27 - INFO - codeparrot_training - Step 1565: {'lr': 0.0004990404519419464, 'samples': 801792, 'steps': 1565, 'batch_loss/train': 1.0191202405840158} 12/21/2021 19:28:38 - INFO - codeparrot_training - Step 1566: {'lr': 0.0004990390716860894, 'samples': 802304, 'steps': 1566, 'batch_loss/train': 0.8509348733350635} 12/21/2021 19:28:48 - INFO - codeparrot_training - Step 1567: {'lr': 0.0004990376904401471, 'samples': 802816, 'steps': 1567, 'batch_loss/train': 0.9865949116647243} 12/21/2021 19:29:01 - INFO - codeparrot_training - Step 1568: {'lr': 0.000499036308204125, 'samples': 803328, 'steps': 1568, 'batch_loss/train': 0.8997511016204953} 12/21/2021 19:29:11 - INFO - codeparrot_training - Step 1569: {'lr': 0.0004990349249780285, 'samples': 803840, 'steps': 1569, 'batch_loss/train': 0.9291855967603624} 12/21/2021 19:29:22 - INFO - codeparrot_training - Step 1570: {'lr': 0.0004990335407618632, 'samples': 804352, 'steps': 1570, 'batch_loss/train': 0.8769066073000431} 12/21/2021 19:29:34 - INFO - codeparrot_training - Step 1571: {'lr': 0.0004990321555556346, 'samples': 804864, 'steps': 1571, 'batch_loss/train': 0.9875875562429428} 12/21/2021 19:29:44 - INFO - codeparrot_training - Step 1572: {'lr': 0.0004990307693593482, 'samples': 805376, 'steps': 1572, 'batch_loss/train': 0.8935456285253167} 12/21/2021 19:29:55 - INFO - codeparrot_training - Step 1573: {'lr': 0.0004990293821730094, 'samples': 805888, 'steps': 1573, 
'batch_loss/train': 0.9274824447929859} 12/21/2021 19:30:05 - INFO - codeparrot_training - Step 1574: {'lr': 0.0004990279939966239, 'samples': 806400, 'steps': 1574, 'batch_loss/train': 0.9043461563996971} 12/21/2021 19:30:19 - INFO - codeparrot_training - Step 1575: {'lr': 0.0004990266048301971, 'samples': 806912, 'steps': 1575, 'batch_loss/train': 0.8452258533798158} 12/21/2021 19:30:29 - INFO - codeparrot_training - Step 1576: {'lr': 0.0004990252146737345, 'samples': 807424, 'steps': 1576, 'batch_loss/train': 1.0736794285476208} 12/21/2021 19:30:40 - INFO - codeparrot_training - Step 1577: {'lr': 0.0004990238235272418, 'samples': 807936, 'steps': 1577, 'batch_loss/train': 0.9411089336499572} 12/21/2021 19:30:52 - INFO - codeparrot_training - Step 1578: {'lr': 0.0004990224313907243, 'samples': 808448, 'steps': 1578, 'batch_loss/train': 0.7255055927671492} 12/21/2021 19:31:02 - INFO - codeparrot_training - Step 1579: {'lr': 0.0004990210382641875, 'samples': 808960, 'steps': 1579, 'batch_loss/train': 0.9548053853213787} 12/21/2021 19:31:13 - INFO - codeparrot_training - Step 1580: {'lr': 0.0004990196441476373, 'samples': 809472, 'steps': 1580, 'batch_loss/train': 0.8864284581504762} 12/21/2021 19:31:25 - INFO - codeparrot_training - Step 1581: {'lr': 0.0004990182490410789, 'samples': 809984, 'steps': 1581, 'batch_loss/train': 1.0247873272746801} 12/21/2021 19:31:36 - INFO - codeparrot_training - Step 1582: {'lr': 0.000499016852944518, 'samples': 810496, 'steps': 1582, 'batch_loss/train': 0.8404687298461795} 12/21/2021 19:31:46 - INFO - codeparrot_training - Step 1583: {'lr': 0.0004990154558579602, 'samples': 811008, 'steps': 1583, 'batch_loss/train': 0.935194200836122} 12/21/2021 19:31:57 - INFO - codeparrot_training - Step 1584: {'lr': 0.0004990140577814109, 'samples': 811520, 'steps': 1584, 'batch_loss/train': 1.0090470649302006} 12/21/2021 19:32:10 - INFO - codeparrot_training - Step 1585: {'lr': 0.0004990126587148758, 'samples': 812032, 'steps': 1585, 
'batch_loss/train': 0.9786989237181842} 12/21/2021 19:32:20 - INFO - codeparrot_training - Step 1586: {'lr': 0.0004990112586583603, 'samples': 812544, 'steps': 1586, 'batch_loss/train': 0.8996252031065524} 12/21/2021 19:32:31 - INFO - codeparrot_training - Step 1587: {'lr': 0.0004990098576118701, 'samples': 813056, 'steps': 1587, 'batch_loss/train': 0.9940212331712246} 12/21/2021 19:32:43 - INFO - codeparrot_training - Step 1588: {'lr': 0.0004990084555754107, 'samples': 813568, 'steps': 1588, 'batch_loss/train': 0.9426584369502962} 12/21/2021 19:32:54 - INFO - codeparrot_training - Step 1589: {'lr': 0.0004990070525489877, 'samples': 814080, 'steps': 1589, 'batch_loss/train': 0.9588494468480349} 12/21/2021 19:33:04 - INFO - codeparrot_training - Step 1590: {'lr': 0.0004990056485326067, 'samples': 814592, 'steps': 1590, 'batch_loss/train': 0.900996298994869} 12/21/2021 19:33:17 - INFO - codeparrot_training - Step 1591: {'lr': 0.0004990042435262733, 'samples': 815104, 'steps': 1591, 'batch_loss/train': 0.8317434769123793} 12/21/2021 19:33:28 - INFO - codeparrot_training - Step 1592: {'lr': 0.000499002837529993, 'samples': 815616, 'steps': 1592, 'batch_loss/train': 0.8489330234006047} 12/21/2021 19:33:38 - INFO - codeparrot_training - Step 1593: {'lr': 0.0004990014305437714, 'samples': 816128, 'steps': 1593, 'batch_loss/train': 0.854260669555515} 12/21/2021 19:33:49 - INFO - codeparrot_training - Step 1594: {'lr': 0.0004990000225676142, 'samples': 816640, 'steps': 1594, 'batch_loss/train': 0.834496327675879} 12/21/2021 19:34:01 - INFO - codeparrot_training - Step 1595: {'lr': 0.000498998613601527, 'samples': 817152, 'steps': 1595, 'batch_loss/train': 0.929279656149447} 12/21/2021 19:34:12 - INFO - codeparrot_training - Step 1596: {'lr': 0.0004989972036455151, 'samples': 817664, 'steps': 1596, 'batch_loss/train': 0.8766897656023502} 12/21/2021 19:34:22 - INFO - codeparrot_training - Step 1597: {'lr': 0.0004989957926995846, 'samples': 818176, 'steps': 1597, 
'batch_loss/train': 0.8777226358652115} 12/21/2021 19:34:34 - INFO - codeparrot_training - Step 1598: {'lr': 0.0004989943807637407, 'samples': 818688, 'steps': 1598, 'batch_loss/train': 0.8732022782787681} 12/21/2021 19:34:45 - INFO - codeparrot_training - Step 1599: {'lr': 0.0004989929678379892, 'samples': 819200, 'steps': 1599, 'batch_loss/train': 0.8743901252746582} 12/21/2021 19:34:56 - INFO - codeparrot_training - Step 1600: {'lr': 0.0004989915539223357, 'samples': 819712, 'steps': 1600, 'batch_loss/train': 0.8758493410423398} 12/21/2021 19:35:08 - INFO - codeparrot_training - Step 1601: {'lr': 0.0004989901390167857, 'samples': 820224, 'steps': 1601, 'batch_loss/train': 0.9535415591672063} 12/21/2021 19:35:19 - INFO - codeparrot_training - Step 1602: {'lr': 0.0004989887231213449, 'samples': 820736, 'steps': 1602, 'batch_loss/train': 0.9203783189877868} 12/21/2021 19:35:30 - INFO - codeparrot_training - Step 1603: {'lr': 0.0004989873062360191, 'samples': 821248, 'steps': 1603, 'batch_loss/train': 0.8980770474299788} 12/21/2021 19:35:42 - INFO - codeparrot_training - Step 1604: {'lr': 0.0004989858883608137, 'samples': 821760, 'steps': 1604, 'batch_loss/train': 0.9070390276610851} 12/21/2021 19:35:53 - INFO - codeparrot_training - Step 1605: {'lr': 0.0004989844694957343, 'samples': 822272, 'steps': 1605, 'batch_loss/train': 0.8865944016724825} 12/21/2021 19:36:03 - INFO - codeparrot_training - Step 1606: {'lr': 0.0004989830496407869, 'samples': 822784, 'steps': 1606, 'batch_loss/train': 0.8270519720390439} 12/21/2021 19:36:14 - INFO - codeparrot_training - Step 1607: {'lr': 0.0004989816287959768, 'samples': 823296, 'steps': 1607, 'batch_loss/train': 0.9952144976705313} 12/21/2021 19:36:26 - INFO - codeparrot_training - Step 1608: {'lr': 0.0004989802069613097, 'samples': 823808, 'steps': 1608, 'batch_loss/train': 0.8741616876795888} 12/21/2021 19:36:37 - INFO - codeparrot_training - Step 1609: {'lr': 0.0004989787841367914, 'samples': 824320, 'steps': 1609, 
'batch_loss/train': 0.9387342985719442} 12/21/2021 19:36:47 - INFO - codeparrot_training - Step 1610: {'lr': 0.0004989773603224274, 'samples': 824832, 'steps': 1610, 'batch_loss/train': 0.9511869661509991} 12/21/2021 19:36:59 - INFO - codeparrot_training - Step 1611: {'lr': 0.0004989759355182235, 'samples': 825344, 'steps': 1611, 'batch_loss/train': 1.0106961466372013} 12/21/2021 19:37:10 - INFO - codeparrot_training - Step 1612: {'lr': 0.0004989745097241853, 'samples': 825856, 'steps': 1612, 'batch_loss/train': 0.8387086261063814} 12/21/2021 19:37:21 - INFO - codeparrot_training - Step 1613: {'lr': 0.0004989730829403185, 'samples': 826368, 'steps': 1613, 'batch_loss/train': 0.8310669106431305} 12/21/2021 19:37:33 - INFO - codeparrot_training - Step 1614: {'lr': 0.0004989716551666285, 'samples': 826880, 'steps': 1614, 'batch_loss/train': 0.9509173333644867} 12/21/2021 19:37:44 - INFO - codeparrot_training - Step 1615: {'lr': 0.0004989702264031213, 'samples': 827392, 'steps': 1615, 'batch_loss/train': 0.9248645920306444} 12/21/2021 19:37:55 - INFO - codeparrot_training - Step 1616: {'lr': 0.0004989687966498026, 'samples': 827904, 'steps': 1616, 'batch_loss/train': 0.833437523804605} 12/21/2021 19:38:05 - INFO - codeparrot_training - Step 1617: {'lr': 0.0004989673659066779, 'samples': 828416, 'steps': 1617, 'batch_loss/train': 0.9388104099780321} 12/21/2021 19:38:18 - INFO - codeparrot_training - Step 1618: {'lr': 0.000498965934173753, 'samples': 828928, 'steps': 1618, 'batch_loss/train': 0.9078132705762982} 12/21/2021 19:38:28 - INFO - codeparrot_training - Step 1619: {'lr': 0.0004989645014510334, 'samples': 829440, 'steps': 1619, 'batch_loss/train': 0.8750639786012471} 12/21/2021 19:38:39 - INFO - codeparrot_training - Step 1620: {'lr': 0.0004989630677385251, 'samples': 829952, 'steps': 1620, 'batch_loss/train': 0.8077618693932891} 12/21/2021 19:38:51 - INFO - codeparrot_training - Step 1621: {'lr': 0.0004989616330362337, 'samples': 830464, 'steps': 1621, 
'batch_loss/train': 0.8755777953192592} 12/21/2021 19:39:02 - INFO - codeparrot_training - Step 1622: {'lr': 0.0004989601973441647, 'samples': 830976, 'steps': 1622, 'batch_loss/train': 0.880197555758059} 12/21/2021 19:39:13 - INFO - codeparrot_training - Step 1623: {'lr': 0.000498958760662324, 'samples': 831488, 'steps': 1623, 'batch_loss/train': 0.9659086624160409} 12/21/2021 19:39:25 - INFO - codeparrot_training - Step 1624: {'lr': 0.0004989573229907172, 'samples': 832000, 'steps': 1624, 'batch_loss/train': 0.9231716627255082} 12/21/2021 19:39:36 - INFO - codeparrot_training - Step 1625: {'lr': 0.0004989558843293503, 'samples': 832512, 'steps': 1625, 'batch_loss/train': 1.0634433487430215} 12/21/2021 19:39:47 - INFO - codeparrot_training - Step 1626: {'lr': 0.0004989544446782287, 'samples': 833024, 'steps': 1626, 'batch_loss/train': 0.8936382224783301} 12/21/2021 19:39:57 - INFO - codeparrot_training - Step 1627: {'lr': 0.0004989530040373582, 'samples': 833536, 'steps': 1627, 'batch_loss/train': 0.9219766827300191} 12/21/2021 19:40:09 - INFO - codeparrot_training - Step 1628: {'lr': 0.0004989515624067445, 'samples': 834048, 'steps': 1628, 'batch_loss/train': 0.8224241929128766} 12/21/2021 19:40:20 - INFO - codeparrot_training - Step 1629: {'lr': 0.0004989501197863934, 'samples': 834560, 'steps': 1629, 'batch_loss/train': 0.8785842685028911} 12/21/2021 19:40:31 - INFO - codeparrot_training - Step 1630: {'lr': 0.0004989486761763108, 'samples': 835072, 'steps': 1630, 'batch_loss/train': 0.932359257247299} 12/21/2021 19:40:43 - INFO - codeparrot_training - Step 1631: {'lr': 0.0004989472315765021, 'samples': 835584, 'steps': 1631, 'batch_loss/train': 1.027493633562699} 12/21/2021 19:40:54 - INFO - codeparrot_training - Step 1632: {'lr': 0.0004989457859869733, 'samples': 836096, 'steps': 1632, 'batch_loss/train': 0.8971542920917273} 12/21/2021 19:41:05 - INFO - codeparrot_training - Step 1633: {'lr': 0.00049894433940773, 'samples': 836608, 'steps': 1633, 
'batch_loss/train': 0.9591789881233126} 12/21/2021 19:41:15 - INFO - codeparrot_training - Step 1634: {'lr': 0.000498942891838778, 'samples': 837120, 'steps': 1634, 'batch_loss/train': 0.8635334779974073} 12/21/2021 19:41:27 - INFO - codeparrot_training - Step 1635: {'lr': 0.000498941443280123, 'samples': 837632, 'steps': 1635, 'batch_loss/train': 0.8572245980612934} 12/21/2021 19:41:38 - INFO - codeparrot_training - Step 1636: {'lr': 0.0004989399937317709, 'samples': 838144, 'steps': 1636, 'batch_loss/train': 0.8242830978706479} 12/21/2021 19:41:49 - INFO - codeparrot_training - Step 1637: {'lr': 0.0004989385431937274, 'samples': 838656, 'steps': 1637, 'batch_loss/train': 0.8978930844459683} 12/21/2021 19:42:01 - INFO - codeparrot_training - Step 1638: {'lr': 0.0004989370916659982, 'samples': 839168, 'steps': 1638, 'batch_loss/train': 0.8647003355436027} 12/21/2021 19:42:11 - INFO - codeparrot_training - Step 1639: {'lr': 0.0004989356391485892, 'samples': 839680, 'steps': 1639, 'batch_loss/train': 0.794282587012276} 12/21/2021 19:42:22 - INFO - codeparrot_training - Step 1640: {'lr': 0.000498934185641506, 'samples': 840192, 'steps': 1640, 'batch_loss/train': 0.9602486668154597} 12/21/2021 19:42:35 - INFO - codeparrot_training - Step 1641: {'lr': 0.0004989327311447544, 'samples': 840704, 'steps': 1641, 'batch_loss/train': 0.8559561893343925} 12/21/2021 19:42:45 - INFO - codeparrot_training - Step 1642: {'lr': 0.0004989312756583404, 'samples': 841216, 'steps': 1642, 'batch_loss/train': 0.9191740900278091} 12/21/2021 19:42:56 - INFO - codeparrot_training - Step 1643: {'lr': 0.0004989298191822696, 'samples': 841728, 'steps': 1643, 'batch_loss/train': 0.8940439550206065} 12/21/2021 19:43:06 - INFO - codeparrot_training - Step 1644: {'lr': 0.0004989283617165479, 'samples': 842240, 'steps': 1644, 'batch_loss/train': 0.9084221813827753} 12/21/2021 19:43:19 - INFO - codeparrot_training - Step 1645: {'lr': 0.0004989269032611809, 'samples': 842752, 'steps': 1645, 
'batch_loss/train': 0.8558106198906898} 12/21/2021 19:43:29 - INFO - codeparrot_training - Step 1646: {'lr': 0.0004989254438161746, 'samples': 843264, 'steps': 1646, 'batch_loss/train': 0.9146951215807348} 12/21/2021 19:43:40 - INFO - codeparrot_training - Step 1647: {'lr': 0.0004989239833815347, 'samples': 843776, 'steps': 1647, 'batch_loss/train': 0.9020987655967474} 12/21/2021 19:43:52 - INFO - codeparrot_training - Step 1648: {'lr': 0.000498922521957267, 'samples': 844288, 'steps': 1648, 'batch_loss/train': 1.034890677779913} 12/21/2021 19:44:03 - INFO - codeparrot_training - Step 1649: {'lr': 0.0004989210595433774, 'samples': 844800, 'steps': 1649, 'batch_loss/train': 0.8290247670374811} 12/21/2021 19:44:14 - INFO - codeparrot_training - Step 1650: {'lr': 0.0004989195961398716, 'samples': 845312, 'steps': 1650, 'batch_loss/train': 0.9488262673839927} 12/21/2021 19:44:26 - INFO - codeparrot_training - Step 1651: {'lr': 0.0004989181317467555, 'samples': 845824, 'steps': 1651, 'batch_loss/train': 1.0439008735120296} 12/21/2021 19:44:36 - INFO - codeparrot_training - Step 1652: {'lr': 0.0004989166663640349, 'samples': 846336, 'steps': 1652, 'batch_loss/train': 0.9392561241984367} 12/21/2021 19:44:47 - INFO - codeparrot_training - Step 1653: {'lr': 0.0004989151999917156, 'samples': 846848, 'steps': 1653, 'batch_loss/train': 1.0095143702346832} 12/21/2021 19:44:58 - INFO - codeparrot_training - Step 1654: {'lr': 0.0004989137326298035, 'samples': 847360, 'steps': 1654, 'batch_loss/train': 0.8812794545665383} 12/21/2021 19:45:10 - INFO - codeparrot_training - Step 1655: {'lr': 0.0004989122642783044, 'samples': 847872, 'steps': 1655, 'batch_loss/train': 0.9923330219462514} 12/21/2021 19:45:21 - INFO - codeparrot_training - Step 1656: {'lr': 0.000498910794937224, 'samples': 848384, 'steps': 1656, 'batch_loss/train': 1.0809170519933105} 12/21/2021 19:45:31 - INFO - codeparrot_training - Step 1657: {'lr': 0.0004989093246065684, 'samples': 848896, 'steps': 1657, 
'batch_loss/train': 0.9620170630514622} 12/21/2021 19:45:44 - INFO - codeparrot_training - Step 1658: {'lr': 0.0004989078532863432, 'samples': 849408, 'steps': 1658, 'batch_loss/train': 1.0742413885891438} 12/21/2021 19:45:54 - INFO - codeparrot_training - Step 1659: {'lr': 0.0004989063809765545, 'samples': 849920, 'steps': 1659, 'batch_loss/train': 0.8559922957792878} 12/21/2021 19:46:05 - INFO - codeparrot_training - Step 1660: {'lr': 0.0004989049076772079, 'samples': 850432, 'steps': 1660, 'batch_loss/train': 0.843426818959415} 12/21/2021 19:46:17 - INFO - codeparrot_training - Step 1661: {'lr': 0.0004989034333883094, 'samples': 850944, 'steps': 1661, 'batch_loss/train': 0.8882105117663741} 12/21/2021 19:46:28 - INFO - codeparrot_training - Step 1662: {'lr': 0.0004989019581098648, 'samples': 851456, 'steps': 1662, 'batch_loss/train': 0.8527037426829338} 12/21/2021 19:46:39 - INFO - codeparrot_training - Step 1663: {'lr': 0.00049890048184188, 'samples': 851968, 'steps': 1663, 'batch_loss/train': 0.992137317545712} 12/21/2021 19:46:51 - INFO - codeparrot_training - Step 1664: {'lr': 0.0004988990045843609, 'samples': 852480, 'steps': 1664, 'batch_loss/train': 0.8860413506627083} 12/21/2021 19:47:01 - INFO - codeparrot_training - Step 1665: {'lr': 0.0004988975263373134, 'samples': 852992, 'steps': 1665, 'batch_loss/train': 0.8462892472743988} 12/21/2021 19:47:12 - INFO - codeparrot_training - Step 1666: {'lr': 0.0004988960471007432, 'samples': 853504, 'steps': 1666, 'batch_loss/train': 0.9574611871503294} 12/21/2021 19:47:23 - INFO - codeparrot_training - Step 1667: {'lr': 0.0004988945668746563, 'samples': 854016, 'steps': 1667, 'batch_loss/train': 0.9716064305976033} 12/21/2021 19:47:35 - INFO - codeparrot_training - Step 1668: {'lr': 0.0004988930856590587, 'samples': 854528, 'steps': 1668, 'batch_loss/train': 0.9574818797409534} 12/21/2021 19:47:45 - INFO - codeparrot_training - Step 1669: {'lr': 0.000498891603453956, 'samples': 855040, 'steps': 1669, 
'batch_loss/train': 0.8710432769730687} 12/21/2021 19:47:56 - INFO - codeparrot_training - Step 1670: {'lr': 0.0004988901202593544, 'samples': 855552, 'steps': 1670, 'batch_loss/train': 0.9373943191021681} 12/21/2021 19:48:09 - INFO - codeparrot_training - Step 1671: {'lr': 0.0004988886360752596, 'samples': 856064, 'steps': 1671, 'batch_loss/train': 0.9859117092564702} 12/21/2021 19:48:19 - INFO - codeparrot_training - Step 1672: {'lr': 0.0004988871509016776, 'samples': 856576, 'steps': 1672, 'batch_loss/train': 0.8840123564004898} 12/21/2021 19:48:30 - INFO - codeparrot_training - Step 1673: {'lr': 0.0004988856647386142, 'samples': 857088, 'steps': 1673, 'batch_loss/train': 0.9437036588788033} 12/21/2021 19:48:42 - INFO - codeparrot_training - Step 1674: {'lr': 0.0004988841775860754, 'samples': 857600, 'steps': 1674, 'batch_loss/train': 1.0447020791471004} 12/21/2021 19:48:52 - INFO - codeparrot_training - Step 1675: {'lr': 0.0004988826894440671, 'samples': 858112, 'steps': 1675, 'batch_loss/train': 1.3184889201074839} 12/21/2021 19:49:03 - INFO - codeparrot_training - Step 1676: {'lr': 0.0004988812003125952, 'samples': 858624, 'steps': 1676, 'batch_loss/train': 0.9408456021919847} 12/21/2021 19:49:14 - INFO - codeparrot_training - Step 1677: {'lr': 0.0004988797101916656, 'samples': 859136, 'steps': 1677, 'batch_loss/train': 0.8162884362973273} 12/21/2021 19:49:27 - INFO - codeparrot_training - Step 1678: {'lr': 0.0004988782190812843, 'samples': 859648, 'steps': 1678, 'batch_loss/train': 0.9030062723904848} 12/21/2021 19:49:37 - INFO - codeparrot_training - Step 1679: {'lr': 0.0004988767269814571, 'samples': 860160, 'steps': 1679, 'batch_loss/train': 0.88985241856426} 12/21/2021 19:49:48 - INFO - codeparrot_training - Step 1680: {'lr': 0.00049887523389219, 'samples': 860672, 'steps': 1680, 'batch_loss/train': 0.8859491515904665} 12/21/2021 19:50:00 - INFO - codeparrot_training - Step 1681: {'lr': 0.000498873739813489, 'samples': 861184, 'steps': 1681, 
'batch_loss/train': 0.9164216630160809} 12/21/2021 19:50:11 - INFO - codeparrot_training - Step 1682: {'lr': 0.00049887224474536, 'samples': 861696, 'steps': 1682, 'batch_loss/train': 1.0542579526081681} 12/21/2021 19:50:21 - INFO - codeparrot_training - Step 1683: {'lr': 0.0004988707486878089, 'samples': 862208, 'steps': 1683, 'batch_loss/train': 0.9508535517379642} 12/21/2021 19:50:32 - INFO - codeparrot_training - Step 1684: {'lr': 0.0004988692516408416, 'samples': 862720, 'steps': 1684, 'batch_loss/train': 0.8899596398696303} 12/21/2021 19:50:44 - INFO - codeparrot_training - Step 1685: {'lr': 0.0004988677536044643, 'samples': 863232, 'steps': 1685, 'batch_loss/train': 0.9636764544993639} 12/21/2021 19:50:55 - INFO - codeparrot_training - Step 1686: {'lr': 0.0004988662545786826, 'samples': 863744, 'steps': 1686, 'batch_loss/train': 0.8935754783451557} 12/21/2021 19:51:05 - INFO - codeparrot_training - Step 1687: {'lr': 0.0004988647545635027, 'samples': 864256, 'steps': 1687, 'batch_loss/train': 0.8424288369715214} 12/21/2021 19:51:17 - INFO - codeparrot_training - Step 1688: {'lr': 0.0004988632535589306, 'samples': 864768, 'steps': 1688, 'batch_loss/train': 1.0158801395446062} 12/21/2021 19:51:28 - INFO - codeparrot_training - Step 1689: {'lr': 0.000498861751564972, 'samples': 865280, 'steps': 1689, 'batch_loss/train': 0.9659739769995213} 12/21/2021 19:51:38 - INFO - codeparrot_training - Step 1690: {'lr': 0.0004988602485816332, 'samples': 865792, 'steps': 1690, 'batch_loss/train': 0.9077348466962576} 12/21/2021 19:51:51 - INFO - codeparrot_training - Step 1691: {'lr': 0.0004988587446089199, 'samples': 866304, 'steps': 1691, 'batch_loss/train': 0.8862872291356325} 12/21/2021 19:52:02 - INFO - codeparrot_training - Step 1692: {'lr': 0.0004988572396468383, 'samples': 866816, 'steps': 1692, 'batch_loss/train': 0.8973912801593542} 12/21/2021 19:52:13 - INFO - codeparrot_training - Step 1693: {'lr': 0.0004988557336953941, 'samples': 867328, 'steps': 1693, 
'batch_loss/train': 0.8899939404800534} 12/21/2021 19:52:23 - INFO - codeparrot_training - Step 1694: {'lr': 0.0004988542267545936, 'samples': 867840, 'steps': 1694, 'batch_loss/train': 0.9546453291550279} 12/21/2021 19:52:35 - INFO - codeparrot_training - Step 1695: {'lr': 0.0004988527188244427, 'samples': 868352, 'steps': 1695, 'batch_loss/train': 1.0924792597070336} 12/21/2021 19:52:46 - INFO - codeparrot_training - Step 1696: {'lr': 0.0004988512099049473, 'samples': 868864, 'steps': 1696, 'batch_loss/train': 0.7072330936789513} 12/21/2021 19:52:57 - INFO - codeparrot_training - Step 1697: {'lr': 0.0004988496999961134, 'samples': 869376, 'steps': 1697, 'batch_loss/train': 0.9214607588946819} 12/21/2021 19:53:09 - INFO - codeparrot_training - Step 1698: {'lr': 0.0004988481890979471, 'samples': 869888, 'steps': 1698, 'batch_loss/train': 0.9209415949881077} 12/21/2021 19:53:19 - INFO - codeparrot_training - Step 1699: {'lr': 0.0004988466772104544, 'samples': 870400, 'steps': 1699, 'batch_loss/train': 0.9891600273549557} 12/21/2021 19:53:30 - INFO - codeparrot_training - Step 1700: {'lr': 0.0004988451643336412, 'samples': 870912, 'steps': 1700, 'batch_loss/train': 1.0022788783535361} 12/21/2021 19:53:43 - INFO - codeparrot_training - Step 1701: {'lr': 0.0004988436504675136, 'samples': 871424, 'steps': 1701, 'batch_loss/train': 0.9143160805106163} 12/21/2021 19:53:53 - INFO - codeparrot_training - Step 1702: {'lr': 0.0004988421356120777, 'samples': 871936, 'steps': 1702, 'batch_loss/train': 1.230787705630064} 12/21/2021 19:54:04 - INFO - codeparrot_training - Step 1703: {'lr': 0.0004988406197673394, 'samples': 872448, 'steps': 1703, 'batch_loss/train': 0.8632575068622828} 12/21/2021 19:54:16 - INFO - codeparrot_training - Step 1704: {'lr': 0.0004988391029333047, 'samples': 872960, 'steps': 1704, 'batch_loss/train': 0.8298462247475982} 12/21/2021 19:54:27 - INFO - codeparrot_training - Step 1705: {'lr': 0.0004988375851099797, 'samples': 873472, 'steps': 1705, 
'batch_loss/train': 0.9333933675661683} 12/21/2021 19:54:37 - INFO - codeparrot_training - Step 1706: {'lr': 0.0004988360662973704, 'samples': 873984, 'steps': 1706, 'batch_loss/train': 0.8585763769224286} 12/21/2021 19:54:48 - INFO - codeparrot_training - Step 1707: {'lr': 0.0004988345464954828, 'samples': 874496, 'steps': 1707, 'batch_loss/train': 0.8557607689872384} 12/21/2021 19:55:01 - INFO - codeparrot_training - Step 1708: {'lr': 0.0004988330257043231, 'samples': 875008, 'steps': 1708, 'batch_loss/train': 0.9054092988371849} 12/21/2021 19:55:11 - INFO - codeparrot_training - Step 1709: {'lr': 0.0004988315039238972, 'samples': 875520, 'steps': 1709, 'batch_loss/train': 0.9229631670750678} 12/21/2021 19:55:22 - INFO - codeparrot_training - Step 1710: {'lr': 0.0004988299811542113, 'samples': 876032, 'steps': 1710, 'batch_loss/train': 0.9498990289866924} 12/21/2021 19:55:34 - INFO - codeparrot_training - Step 1711: {'lr': 0.0004988284573952712, 'samples': 876544, 'steps': 1711, 'batch_loss/train': 0.9152682507410645} 12/21/2021 19:55:45 - INFO - codeparrot_training - Step 1712: {'lr': 0.0004988269326470832, 'samples': 877056, 'steps': 1712, 'batch_loss/train': 0.7151929382234812} 12/21/2021 19:55:55 - INFO - codeparrot_training - Step 1713: {'lr': 0.0004988254069096533, 'samples': 877568, 'steps': 1713, 'batch_loss/train': 0.9512526281177998} 12/21/2021 19:56:07 - INFO - codeparrot_training - Step 1714: {'lr': 0.0004988238801829874, 'samples': 878080, 'steps': 1714, 'batch_loss/train': 0.9871786087751389} 12/21/2021 19:56:18 - INFO - codeparrot_training - Step 1715: {'lr': 0.0004988223524670918, 'samples': 878592, 'steps': 1715, 'batch_loss/train': 1.059871178586036} 12/21/2021 19:56:29 - INFO - codeparrot_training - Step 1716: {'lr': 0.0004988208237619725, 'samples': 879104, 'steps': 1716, 'batch_loss/train': 0.9195260615088046} 12/21/2021 19:56:39 - INFO - codeparrot_training - Step 1717: {'lr': 0.0004988192940676355, 'samples': 879616, 'steps': 1717, 
'batch_loss/train': 1.01906623179093} 12/21/2021 19:56:52 - INFO - codeparrot_training - Step 1718: {'lr': 0.0004988177633840869, 'samples': 880128, 'steps': 1718, 'batch_loss/train': 0.9443487338721752} 12/21/2021 19:57:03 - INFO - codeparrot_training - Step 1719: {'lr': 0.0004988162317113329, 'samples': 880640, 'steps': 1719, 'batch_loss/train': 0.9177821390330791} 12/21/2021 19:57:13 - INFO - codeparrot_training - Step 1720: {'lr': 0.0004988146990493795, 'samples': 881152, 'steps': 1720, 'batch_loss/train': 0.8549966989085078} 12/21/2021 19:57:25 - INFO - codeparrot_training - Step 1721: {'lr': 0.0004988131653982327, 'samples': 881664, 'steps': 1721, 'batch_loss/train': 0.8796768421307206} 12/21/2021 19:57:36 - INFO - codeparrot_training - Step 1722: {'lr': 0.0004988116307578989, 'samples': 882176, 'steps': 1722, 'batch_loss/train': 0.8261880306527019} 12/21/2021 19:57:46 - INFO - codeparrot_training - Step 1723: {'lr': 0.0004988100951283838, 'samples': 882688, 'steps': 1723, 'batch_loss/train': 0.9156139930710196} 12/21/2021 19:57:58 - INFO - codeparrot_training - Step 1724: {'lr': 0.0004988085585096939, 'samples': 883200, 'steps': 1724, 'batch_loss/train': 0.9225135380402207} 12/21/2021 19:58:09 - INFO - codeparrot_training - Step 1725: {'lr': 0.000498807020901835, 'samples': 883712, 'steps': 1725, 'batch_loss/train': 0.9821897577494383} 12/21/2021 19:58:20 - INFO - codeparrot_training - Step 1726: {'lr': 0.0004988054823048133, 'samples': 884224, 'steps': 1726, 'batch_loss/train': 0.827798436395824} 12/21/2021 19:58:30 - INFO - codeparrot_training - Step 1727: {'lr': 0.0004988039427186351, 'samples': 884736, 'steps': 1727, 'batch_loss/train': 0.9724058229476213} 12/21/2021 19:58:42 - INFO - codeparrot_training - Step 1728: {'lr': 0.0004988024021433063, 'samples': 885248, 'steps': 1728, 'batch_loss/train': 0.91817457228899} 12/21/2021 19:58:53 - INFO - codeparrot_training - Step 1729: {'lr': 0.000498800860578833, 'samples': 885760, 'steps': 1729, 
'batch_loss/train': 0.9184758132323623} 12/21/2021 19:59:03 - INFO - codeparrot_training - Step 1730: {'lr': 0.0004987993180252215, 'samples': 886272, 'steps': 1730, 'batch_loss/train': 0.949419878423214} 12/21/2021 19:59:16 - INFO - codeparrot_training - Step 1731: {'lr': 0.0004987977744824779, 'samples': 886784, 'steps': 1731, 'batch_loss/train': 0.9081051032990217} 12/21/2021 19:59:27 - INFO - codeparrot_training - Step 1732: {'lr': 0.0004987962299506082, 'samples': 887296, 'steps': 1732, 'batch_loss/train': 0.9061238206923008} 12/21/2021 19:59:37 - INFO - codeparrot_training - Step 1733: {'lr': 0.0004987946844296187, 'samples': 887808, 'steps': 1733, 'batch_loss/train': 0.9276872025802732} 12/21/2021 19:59:49 - INFO - codeparrot_training - Step 1734: {'lr': 0.0004987931379195154, 'samples': 888320, 'steps': 1734, 'batch_loss/train': 0.9237720146775246} 12/21/2021 20:00:00 - INFO - codeparrot_training - Step 1735: {'lr': 0.0004987915904203045, 'samples': 888832, 'steps': 1735, 'batch_loss/train': 0.9263345254585147} 12/21/2021 20:00:11 - INFO - codeparrot_training - Step 1736: {'lr': 0.0004987900419319923, 'samples': 889344, 'steps': 1736, 'batch_loss/train': 0.8594641145318747} 12/21/2021 20:00:21 - INFO - codeparrot_training - Step 1737: {'lr': 0.0004987884924545847, 'samples': 889856, 'steps': 1737, 'batch_loss/train': 0.8268601922318339} 12/21/2021 20:00:33 - INFO - codeparrot_training - Step 1738: {'lr': 0.0004987869419880881, 'samples': 890368, 'steps': 1738, 'batch_loss/train': 0.9198745731264353} 12/21/2021 20:00:44 - INFO - codeparrot_training - Step 1739: {'lr': 0.0004987853905325085, 'samples': 890880, 'steps': 1739, 'batch_loss/train': 0.9156588902696967} 12/21/2021 20:00:54 - INFO - codeparrot_training - Step 1740: {'lr': 0.0004987838380878522, 'samples': 891392, 'steps': 1740, 'batch_loss/train': 0.8364510666579008} 12/21/2021 20:01:07 - INFO - codeparrot_training - Step 1741: {'lr': 0.0004987822846541253, 'samples': 891904, 'steps': 1741, 
'batch_loss/train': 0.8473976843524724} 12/21/2021 20:01:18 - INFO - codeparrot_training - Step 1742: {'lr': 0.0004987807302313338, 'samples': 892416, 'steps': 1742, 'batch_loss/train': 0.9717363230884075} 12/21/2021 20:01:29 - INFO - codeparrot_training - Step 1743: {'lr': 0.0004987791748194842, 'samples': 892928, 'steps': 1743, 'batch_loss/train': 0.8751451941207051} 12/21/2021 20:01:41 - INFO - codeparrot_training - Step 1744: {'lr': 0.0004987776184185825, 'samples': 893440, 'steps': 1744, 'batch_loss/train': 0.8930283952504396} 12/21/2021 20:01:51 - INFO - codeparrot_training - Step 1745: {'lr': 0.000498776061028635, 'samples': 893952, 'steps': 1745, 'batch_loss/train': 0.837039184756577} 12/21/2021 20:02:02 - INFO - codeparrot_training - Step 1746: {'lr': 0.0004987745026496477, 'samples': 894464, 'steps': 1746, 'batch_loss/train': 0.8130146125331521} 12/21/2021 20:02:13 - INFO - codeparrot_training - Step 1747: {'lr': 0.0004987729432816269, 'samples': 894976, 'steps': 1747, 'batch_loss/train': 1.0292198001407087} 12/21/2021 20:02:25 - INFO - codeparrot_training - Step 1748: {'lr': 0.0004987713829245789, 'samples': 895488, 'steps': 1748, 'batch_loss/train': 1.024717700202018} 12/21/2021 20:02:36 - INFO - codeparrot_training - Step 1749: {'lr': 0.0004987698215785098, 'samples': 896000, 'steps': 1749, 'batch_loss/train': 0.8937940374016762} 12/21/2021 20:02:47 - INFO - codeparrot_training - Step 1750: {'lr': 0.0004987682592434258, 'samples': 896512, 'steps': 1750, 'batch_loss/train': 0.7694371391553432} 12/21/2021 20:02:59 - INFO - codeparrot_training - Step 1751: {'lr': 0.0004987666959193331, 'samples': 897024, 'steps': 1751, 'batch_loss/train': 0.9619645616039634} 12/21/2021 20:03:10 - INFO - codeparrot_training - Step 1752: {'lr': 0.000498765131606238, 'samples': 897536, 'steps': 1752, 'batch_loss/train': 0.6888892548158765} 12/21/2021 20:03:20 - INFO - codeparrot_training - Step 1753: {'lr': 0.0004987635663041466, 'samples': 898048, 'steps': 1753, 
'batch_loss/train': 0.8204534322721884} 12/21/2021 20:03:32 - INFO - codeparrot_training - Step 1754: {'lr': 0.0004987620000130653, 'samples': 898560, 'steps': 1754, 'batch_loss/train': 1.2819089908152819} 12/21/2021 20:03:43 - INFO - codeparrot_training - Step 1755: {'lr': 0.0004987604327330002, 'samples': 899072, 'steps': 1755, 'batch_loss/train': 0.8924822164699435} 12/21/2021 20:03:53 - INFO - codeparrot_training - Step 1756: {'lr': 0.0004987588644639574, 'samples': 899584, 'steps': 1756, 'batch_loss/train': 0.8796547055244446} 12/21/2021 20:04:04 - INFO - codeparrot_training - Step 1757: {'lr': 0.0004987572952059435, 'samples': 900096, 'steps': 1757, 'batch_loss/train': 0.8518597921356559} 12/21/2021 20:04:17 - INFO - codeparrot_training - Step 1758: {'lr': 0.0004987557249589644, 'samples': 900608, 'steps': 1758, 'batch_loss/train': 0.9156473223119974} 12/21/2021 20:04:28 - INFO - codeparrot_training - Step 1759: {'lr': 0.0004987541537230265, 'samples': 901120, 'steps': 1759, 'batch_loss/train': 0.8706931555643678} 12/21/2021 20:04:38 - INFO - codeparrot_training - Step 1760: {'lr': 0.000498752581498136, 'samples': 901632, 'steps': 1760, 'batch_loss/train': 0.7410661680623889} 12/21/2021 20:04:50 - INFO - codeparrot_training - Step 1761: {'lr': 0.0004987510082842991, 'samples': 902144, 'steps': 1761, 'batch_loss/train': 0.844179111532867} 12/21/2021 20:05:01 - INFO - codeparrot_training - Step 1762: {'lr': 0.0004987494340815221, 'samples': 902656, 'steps': 1762, 'batch_loss/train': 0.8779253354296088} 12/21/2021 20:05:11 - INFO - codeparrot_training - Step 1763: {'lr': 0.0004987478588898113, 'samples': 903168, 'steps': 1763, 'batch_loss/train': 0.8999108620919287} 12/21/2021 20:05:25 - INFO - codeparrot_training - Step 1764: {'lr': 0.000498746282709173, 'samples': 903680, 'steps': 1764, 'batch_loss/train': 0.8654523342847824} 12/21/2021 20:05:36 - INFO - codeparrot_training - Step 1765: {'lr': 0.0004987447055396134, 'samples': 904192, 'steps': 1765, 
'batch_loss/train': 0.9194268696010113} 12/21/2021 20:05:46 - INFO - codeparrot_training - Step 1766: {'lr': 0.0004987431273811388, 'samples': 904704, 'steps': 1766, 'batch_loss/train': 0.8822370981797576} 12/21/2021 20:05:57 - INFO - codeparrot_training - Step 1767: {'lr': 0.0004987415482337554, 'samples': 905216, 'steps': 1767, 'batch_loss/train': 0.8363524139858782} 12/21/2021 20:06:09 - INFO - codeparrot_training - Step 1768: {'lr': 0.0004987399680974695, 'samples': 905728, 'steps': 1768, 'batch_loss/train': 0.910089218756184} 12/21/2021 20:06:20 - INFO - codeparrot_training - Step 1769: {'lr': 0.0004987383869722874, 'samples': 906240, 'steps': 1769, 'batch_loss/train': 1.5741720795631409} 12/21/2021 20:06:30 - INFO - codeparrot_training - Step 1770: {'lr': 0.0004987368048582154, 'samples': 906752, 'steps': 1770, 'batch_loss/train': 0.8993078609928489} 12/21/2021 20:06:42 - INFO - codeparrot_training - Step 1771: {'lr': 0.0004987352217552598, 'samples': 907264, 'steps': 1771, 'batch_loss/train': 0.9222843889147043} 12/21/2021 20:06:53 - INFO - codeparrot_training - Step 1772: {'lr': 0.000498733637663427, 'samples': 907776, 'steps': 1772, 'batch_loss/train': 0.8354229419492185} 12/21/2021 20:07:04 - INFO - codeparrot_training - Step 1773: {'lr': 0.000498732052582723, 'samples': 908288, 'steps': 1773, 'batch_loss/train': 0.9040659051388502} 12/21/2021 20:07:16 - INFO - codeparrot_training - Step 1774: {'lr': 0.0004987304665131542, 'samples': 908800, 'steps': 1774, 'batch_loss/train': 0.9690363425761461} 12/21/2021 20:07:27 - INFO - codeparrot_training - Step 1775: {'lr': 0.0004987288794547272, 'samples': 909312, 'steps': 1775, 'batch_loss/train': 0.89034179225564} 12/21/2021 20:07:38 - INFO - codeparrot_training - Step 1776: {'lr': 0.000498727291407448, 'samples': 909824, 'steps': 1776, 'batch_loss/train': 0.9229135345667601} 12/21/2021 20:07:48 - INFO - codeparrot_training - Step 1777: {'lr': 0.000498725702371323, 'samples': 910336, 'steps': 1777, 
'batch_loss/train': 0.8124324735254049} 12/21/2021 20:08:00 - INFO - codeparrot_training - Step 1778: {'lr': 0.0004987241123463585, 'samples': 910848, 'steps': 1778, 'batch_loss/train': 0.8551133973523974} 12/21/2021 20:08:11 - INFO - codeparrot_training - Step 1779: {'lr': 0.0004987225213325609, 'samples': 911360, 'steps': 1779, 'batch_loss/train': 0.879351211944595} 12/21/2021 20:08:21 - INFO - codeparrot_training - Step 1780: {'lr': 0.0004987209293299365, 'samples': 911872, 'steps': 1780, 'batch_loss/train': 0.9372209943830967} 12/21/2021 20:08:34 - INFO - codeparrot_training - Step 1781: {'lr': 0.0004987193363384916, 'samples': 912384, 'steps': 1781, 'batch_loss/train': 0.9684248957782984} 12/21/2021 20:08:45 - INFO - codeparrot_training - Step 1782: {'lr': 0.0004987177423582324, 'samples': 912896, 'steps': 1782, 'batch_loss/train': 0.8334794277325273} 12/21/2021 20:08:55 - INFO - codeparrot_training - Step 1783: {'lr': 0.0004987161473891655, 'samples': 913408, 'steps': 1783, 'batch_loss/train': 0.9344437965191901} 12/21/2021 20:09:08 - INFO - codeparrot_training - Step 1784: {'lr': 0.000498714551431297, 'samples': 913920, 'steps': 1784, 'batch_loss/train': 0.8825885057449341} 12/21/2021 20:09:18 - INFO - codeparrot_training - Step 1785: {'lr': 0.0004987129544846334, 'samples': 914432, 'steps': 1785, 'batch_loss/train': 0.8822074355557561} 12/21/2021 20:09:29 - INFO - codeparrot_training - Step 1786: {'lr': 0.000498711356549181, 'samples': 914944, 'steps': 1786, 'batch_loss/train': 0.8603976527228951} 12/21/2021 20:09:40 - INFO - codeparrot_training - Step 1787: {'lr': 0.0004987097576249462, 'samples': 915456, 'steps': 1787, 'batch_loss/train': 0.8634001165628433} 12/21/2021 20:09:52 - INFO - codeparrot_training - Step 1788: {'lr': 0.0004987081577119353, 'samples': 915968, 'steps': 1788, 'batch_loss/train': 0.959977783029899} 12/21/2021 20:10:02 - INFO - codeparrot_training - Step 1789: {'lr': 0.0004987065568101546, 'samples': 916480, 'steps': 1789, 
'batch_loss/train': 0.902426291257143} 12/21/2021 20:10:13 - INFO - codeparrot_training - Step 1790: {'lr': 0.0004987049549196106, 'samples': 916992, 'steps': 1790, 'batch_loss/train': 0.9163969978690147} 12/21/2021 20:10:25 - INFO - codeparrot_training - Step 1791: {'lr': 0.0004987033520403096, 'samples': 917504, 'steps': 1791, 'batch_loss/train': 0.8881054315716028} 12/21/2021 20:10:36 - INFO - codeparrot_training - Step 1792: {'lr': 0.000498701748172258, 'samples': 918016, 'steps': 1792, 'batch_loss/train': 0.9543562810868025} 12/21/2021 20:10:47 - INFO - codeparrot_training - Step 1793: {'lr': 0.0004987001433154622, 'samples': 918528, 'steps': 1793, 'batch_loss/train': 0.9367486266419291} 12/21/2021 20:10:59 - INFO - codeparrot_training - Step 1794: {'lr': 0.0004986985374699285, 'samples': 919040, 'steps': 1794, 'batch_loss/train': 0.9013417465612292} 12/21/2021 20:11:09 - INFO - codeparrot_training - Step 1795: {'lr': 0.0004986969306356632, 'samples': 919552, 'steps': 1795, 'batch_loss/train': 0.8312400672584772} 12/21/2021 20:11:20 - INFO - codeparrot_training - Step 1796: {'lr': 0.0004986953228126729, 'samples': 920064, 'steps': 1796, 'batch_loss/train': 0.9092927593737841} 12/21/2021 20:11:31 - INFO - codeparrot_training - Step 1797: {'lr': 0.000498693714000964, 'samples': 920576, 'steps': 1797, 'batch_loss/train': 0.9284424344077706} 12/21/2021 20:11:43 - INFO - codeparrot_training - Step 1798: {'lr': 0.0004986921042005425, 'samples': 921088, 'steps': 1798, 'batch_loss/train': 0.7865123264491558} 12/21/2021 20:11:53 - INFO - codeparrot_training - Step 1799: {'lr': 0.0004986904934114153, 'samples': 921600, 'steps': 1799, 'batch_loss/train': 0.9214104975108057} 12/21/2021 20:12:04 - INFO - codeparrot_training - Step 1800: {'lr': 0.0004986888816335887, 'samples': 922112, 'steps': 1800, 'batch_loss/train': 0.9327549645677209} 12/21/2021 20:12:16 - INFO - codeparrot_training - Step 1801: {'lr': 0.0004986872688670688, 'samples': 922624, 'steps': 1801, 
'batch_loss/train': 0.9589235652238131} 12/21/2021 20:12:26 - INFO - codeparrot_training - Step 1802: {'lr': 0.0004986856551118623, 'samples': 923136, 'steps': 1802, 'batch_loss/train': 0.9533125841990113} 12/21/2021 20:12:37 - INFO - codeparrot_training - Step 1803: {'lr': 0.0004986840403679757, 'samples': 923648, 'steps': 1803, 'batch_loss/train': 0.8772525256499648} 12/21/2021 20:12:50 - INFO - codeparrot_training - Step 1804: {'lr': 0.000498682424635415, 'samples': 924160, 'steps': 1804, 'batch_loss/train': 0.9441607277840376} 12/21/2021 20:13:00 - INFO - codeparrot_training - Step 1805: {'lr': 0.000498680807914187, 'samples': 924672, 'steps': 1805, 'batch_loss/train': 0.8214109698310494} 12/21/2021 20:13:11 - INFO - codeparrot_training - Step 1806: {'lr': 0.000498679190204298, 'samples': 925184, 'steps': 1806, 'batch_loss/train': 0.966174628585577} 12/21/2021 20:13:23 - INFO - codeparrot_training - Step 1807: {'lr': 0.0004986775715057544, 'samples': 925696, 'steps': 1807, 'batch_loss/train': 0.8874991219490767} 12/21/2021 20:13:34 - INFO - codeparrot_training - Step 1808: {'lr': 0.0004986759518185627, 'samples': 926208, 'steps': 1808, 'batch_loss/train': 0.950497136451304} 12/21/2021 20:13:44 - INFO - codeparrot_training - Step 1809: {'lr': 0.0004986743311427294, 'samples': 926720, 'steps': 1809, 'batch_loss/train': 0.8621352333575487} 12/21/2021 20:13:55 - INFO - codeparrot_training - Step 1810: {'lr': 0.0004986727094782608, 'samples': 927232, 'steps': 1810, 'batch_loss/train': 0.9612332861870527} 12/21/2021 20:14:07 - INFO - codeparrot_training - Step 1811: {'lr': 0.0004986710868251633, 'samples': 927744, 'steps': 1811, 'batch_loss/train': 0.897529099136591} 12/21/2021 20:14:18 - INFO - codeparrot_training - Step 1812: {'lr': 0.0004986694631834435, 'samples': 928256, 'steps': 1812, 'batch_loss/train': 0.7724511902779341} 12/21/2021 20:14:29 - INFO - codeparrot_training - Step 1813: {'lr': 0.0004986678385531079, 'samples': 928768, 'steps': 1813, 
'batch_loss/train': 0.968490356579423} 12/21/2021 20:14:41 - INFO - codeparrot_training - Step 1814: {'lr': 0.0004986662129341629, 'samples': 929280, 'steps': 1814, 'batch_loss/train': 0.9682283839210868} 12/21/2021 20:14:51 - INFO - codeparrot_training - Step 1815: {'lr': 0.0004986645863266148, 'samples': 929792, 'steps': 1815, 'batch_loss/train': 0.8703804188407958} 12/21/2021 20:15:02 - INFO - codeparrot_training - Step 1816: {'lr': 0.0004986629587304702, 'samples': 930304, 'steps': 1816, 'batch_loss/train': 0.8031561346724629} 12/21/2021 20:15:14 - INFO - codeparrot_training - Step 1817: {'lr': 0.0004986613301457356, 'samples': 930816, 'steps': 1817, 'batch_loss/train': 0.9431584663689137} 12/21/2021 20:15:24 - INFO - codeparrot_training - Step 1818: {'lr': 0.0004986597005724174, 'samples': 931328, 'steps': 1818, 'batch_loss/train': 0.828075626399368} 12/21/2021 20:15:35 - INFO - codeparrot_training - Step 1819: {'lr': 0.0004986580700105221, 'samples': 931840, 'steps': 1819, 'batch_loss/train': 0.8400483876466751} 12/21/2021 20:15:46 - INFO - codeparrot_training - Step 1820: {'lr': 0.0004986564384600563, 'samples': 932352, 'steps': 1820, 'batch_loss/train': 0.8771297363564372} 12/21/2021 20:15:58 - INFO - codeparrot_training - Step 1821: {'lr': 0.0004986548059210264, 'samples': 932864, 'steps': 1821, 'batch_loss/train': 0.814026640728116} 12/21/2021 20:16:09 - INFO - codeparrot_training - Step 1822: {'lr': 0.0004986531723934388, 'samples': 933376, 'steps': 1822, 'batch_loss/train': 0.9399106362834573} 12/21/2021 20:16:20 - INFO - codeparrot_training - Step 1823: {'lr': 0.0004986515378773001, 'samples': 933888, 'steps': 1823, 'batch_loss/train': 0.9374174429103732} 12/21/2021 20:16:32 - INFO - codeparrot_training - Step 1824: {'lr': 0.0004986499023726167, 'samples': 934400, 'steps': 1824, 'batch_loss/train': 0.8607952566817403} 12/21/2021 20:16:42 - INFO - codeparrot_training - Step 1825: {'lr': 0.0004986482658793953, 'samples': 934912, 'steps': 1825, 
'batch_loss/train': 0.8238134458661079} 12/21/2021 20:16:53 - INFO - codeparrot_training - Step 1826: {'lr': 0.0004986466283976423, 'samples': 935424, 'steps': 1826, 'batch_loss/train': 0.9723093062639236} 12/21/2021 20:17:06 - INFO - codeparrot_training - Step 1827: {'lr': 0.0004986449899273641, 'samples': 935936, 'steps': 1827, 'batch_loss/train': 0.9604789754375815} 12/21/2021 20:17:16 - INFO - codeparrot_training - Step 1828: {'lr': 0.0004986433504685674, 'samples': 936448, 'steps': 1828, 'batch_loss/train': 0.812616216018796} 12/21/2021 20:17:27 - INFO - codeparrot_training - Step 1829: {'lr': 0.0004986417100212584, 'samples': 936960, 'steps': 1829, 'batch_loss/train': 0.9892189031234011} 12/21/2021 20:17:39 - INFO - codeparrot_training - Step 1830: {'lr': 0.0004986400685854441, 'samples': 937472, 'steps': 1830, 'batch_loss/train': 0.8710268647409976} 12/21/2021 20:17:49 - INFO - codeparrot_training - Step 1831: {'lr': 0.0004986384261611306, 'samples': 937984, 'steps': 1831, 'batch_loss/train': 0.9007011516951025} 12/21/2021 20:18:00 - INFO - codeparrot_training - Step 1832: {'lr': 0.0004986367827483247, 'samples': 938496, 'steps': 1832, 'batch_loss/train': 0.8797266194596887} 12/21/2021 20:18:11 - INFO - codeparrot_training - Step 1833: {'lr': 0.0004986351383470328, 'samples': 939008, 'steps': 1833, 'batch_loss/train': 0.9119500303640962} 12/21/2021 20:18:23 - INFO - codeparrot_training - Step 1834: {'lr': 0.0004986334929572614, 'samples': 939520, 'steps': 1834, 'batch_loss/train': 0.8960771700367332} 12/21/2021 20:18:34 - INFO - codeparrot_training - Step 1835: {'lr': 0.0004986318465790173, 'samples': 940032, 'steps': 1835, 'batch_loss/train': 0.7886189089622349} 12/21/2021 20:18:44 - INFO - codeparrot_training - Step 1836: {'lr': 0.0004986301992123067, 'samples': 940544, 'steps': 1836, 'batch_loss/train': 1.0327209755778313} 12/21/2021 20:18:58 - INFO - codeparrot_training - Step 1837: {'lr': 0.0004986285508571364, 'samples': 941056, 'steps': 1837, 
'batch_loss/train': 0.8866083342581987} 12/21/2021 20:19:09 - INFO - codeparrot_training - Step 1838: {'lr': 0.0004986269015135128, 'samples': 941568, 'steps': 1838, 'batch_loss/train': 0.9049155563116074} 12/21/2021 20:19:19 - INFO - codeparrot_training - Step 1839: {'lr': 0.0004986252511814424, 'samples': 942080, 'steps': 1839, 'batch_loss/train': 0.8353464880492538} 12/21/2021 20:19:31 - INFO - codeparrot_training - Step 1840: {'lr': 0.0004986235998609321, 'samples': 942592, 'steps': 1840, 'batch_loss/train': 0.9126728833653033} 12/21/2021 20:19:42 - INFO - codeparrot_training - Step 1841: {'lr': 0.0004986219475519881, 'samples': 943104, 'steps': 1841, 'batch_loss/train': 0.7925252560526133} 12/21/2021 20:19:53 - INFO - codeparrot_training - Step 1842: {'lr': 0.0004986202942546172, 'samples': 943616, 'steps': 1842, 'batch_loss/train': 0.8110767262987792} 12/21/2021 20:20:03 - INFO - codeparrot_training - Step 1843: {'lr': 0.0004986186399688258, 'samples': 944128, 'steps': 1843, 'batch_loss/train': 0.8096279329620302} 12/21/2021 20:20:15 - INFO - codeparrot_training - Step 1844: {'lr': 0.0004986169846946207, 'samples': 944640, 'steps': 1844, 'batch_loss/train': 0.9027986219152808} 12/21/2021 20:20:26 - INFO - codeparrot_training - Step 1845: {'lr': 0.0004986153284320081, 'samples': 945152, 'steps': 1845, 'batch_loss/train': 0.8371636793017387} 12/21/2021 20:20:36 - INFO - codeparrot_training - Step 1846: {'lr': 0.0004986136711809951, 'samples': 945664, 'steps': 1846, 'batch_loss/train': 0.8917500502429903} 12/21/2021 20:20:49 - INFO - codeparrot_training - Step 1847: {'lr': 0.0004986120129415879, 'samples': 946176, 'steps': 1847, 'batch_loss/train': 0.8751227331813425} 12/21/2021 20:20:59 - INFO - codeparrot_training - Step 1848: {'lr': 0.0004986103537137932, 'samples': 946688, 'steps': 1848, 'batch_loss/train': 0.7125408542342484} 12/21/2021 20:21:10 - INFO - codeparrot_training - Step 1849: {'lr': 0.0004986086934976176, 'samples': 947200, 'steps': 1849, 
'batch_loss/train': 0.9307743078097701} 12/21/2021 20:21:22 - INFO - codeparrot_training - Step 1850: {'lr': 0.0004986070322930677, 'samples': 947712, 'steps': 1850, 'batch_loss/train': 0.8707185450475663} 12/21/2021 20:21:33 - INFO - codeparrot_training - Step 1851: {'lr': 0.0004986053701001501, 'samples': 948224, 'steps': 1851, 'batch_loss/train': 0.9580605085939169} 12/21/2021 20:21:44 - INFO - codeparrot_training - Step 1852: {'lr': 0.0004986037069188714, 'samples': 948736, 'steps': 1852, 'batch_loss/train': 0.7944429162889719} 12/21/2021 20:21:54 - INFO - codeparrot_training - Step 1853: {'lr': 0.0004986020427492384, 'samples': 949248, 'steps': 1853, 'batch_loss/train': 0.8779871286824346} 12/21/2021 20:22:06 - INFO - codeparrot_training - Step 1854: {'lr': 0.0004986003775912574, 'samples': 949760, 'steps': 1854, 'batch_loss/train': 0.8946093013510108} 12/21/2021 20:22:17 - INFO - codeparrot_training - Step 1855: {'lr': 0.0004985987114449351, 'samples': 950272, 'steps': 1855, 'batch_loss/train': 0.9508311804383993} 12/21/2021 20:22:28 - INFO - codeparrot_training - Step 1856: {'lr': 0.0004985970443102782, 'samples': 950784, 'steps': 1856, 'batch_loss/train': 0.9317941963672638} 12/21/2021 20:22:40 - INFO - codeparrot_training - Step 1857: {'lr': 0.0004985953761872936, 'samples': 951296, 'steps': 1857, 'batch_loss/train': 0.9445256488397717} 12/21/2021 20:22:51 - INFO - codeparrot_training - Step 1858: {'lr': 0.0004985937070759874, 'samples': 951808, 'steps': 1858, 'batch_loss/train': 0.9254755415022373} 12/21/2021 20:23:02 - INFO - codeparrot_training - Step 1859: {'lr': 0.0004985920369763665, 'samples': 952320, 'steps': 1859, 'batch_loss/train': 0.9439691929146647} 12/21/2021 20:23:14 - INFO - codeparrot_training - Step 1860: {'lr': 0.0004985903658884375, 'samples': 952832, 'steps': 1860, 'batch_loss/train': 0.968684159219265} 12/21/2021 20:23:25 - INFO - codeparrot_training - Step 1861: {'lr': 0.0004985886938122071, 'samples': 953344, 'steps': 1861, 
'batch_loss/train': 0.7654836247675121} 12/21/2021 20:23:35 - INFO - codeparrot_training - Step 1862: {'lr': 0.000498587020747682, 'samples': 953856, 'steps': 1862, 'batch_loss/train': 1.0087153273634613} 12/21/2021 20:23:46 - INFO - codeparrot_training - Step 1863: {'lr': 0.0004985853466948687, 'samples': 954368, 'steps': 1863, 'batch_loss/train': 0.7112129926681519} 12/21/2021 20:23:58 - INFO - codeparrot_training - Step 1864: {'lr': 0.000498583671653774, 'samples': 954880, 'steps': 1864, 'batch_loss/train': 0.8184665851294994} 12/21/2021 20:24:09 - INFO - codeparrot_training - Step 1865: {'lr': 0.0004985819956244044, 'samples': 955392, 'steps': 1865, 'batch_loss/train': 0.8702767519280314} 12/21/2021 20:24:19 - INFO - codeparrot_training - Step 1866: {'lr': 0.0004985803186067667, 'samples': 955904, 'steps': 1866, 'batch_loss/train': 0.9370418824255466} 12/21/2021 20:24:32 - INFO - codeparrot_training - Step 1867: {'lr': 0.0004985786406008674, 'samples': 956416, 'steps': 1867, 'batch_loss/train': 0.984340375289321} 12/21/2021 20:24:43 - INFO - codeparrot_training - Step 1868: {'lr': 0.0004985769616067135, 'samples': 956928, 'steps': 1868, 'batch_loss/train': 0.9193143034353852} 12/21/2021 20:24:53 - INFO - codeparrot_training - Step 1869: {'lr': 0.0004985752816243112, 'samples': 957440, 'steps': 1869, 'batch_loss/train': 0.8370444281026721} 12/21/2021 20:25:05 - INFO - codeparrot_training - Step 1870: {'lr': 0.0004985736006536676, 'samples': 957952, 'steps': 1870, 'batch_loss/train': 0.9120601594913751} 12/21/2021 20:25:16 - INFO - codeparrot_training - Step 1871: {'lr': 0.0004985719186947891, 'samples': 958464, 'steps': 1871, 'batch_loss/train': 0.9577167658135295} 12/21/2021 20:25:27 - INFO - codeparrot_training - Step 1872: {'lr': 0.0004985702357476827, 'samples': 958976, 'steps': 1872, 'batch_loss/train': 0.8961206395179033} 12/21/2021 20:25:37 - INFO - codeparrot_training - Step 1873: {'lr': 0.0004985685518123547, 'samples': 959488, 'steps': 1873, 
'batch_loss/train': 0.8794773784466088} 12/21/2021 20:25:49 - INFO - codeparrot_training - Step 1874: {'lr': 0.0004985668668888121, 'samples': 960000, 'steps': 1874, 'batch_loss/train': 0.8279963359236717} 12/21/2021 20:26:00 - INFO - codeparrot_training - Step 1875: {'lr': 0.0004985651809770615, 'samples': 960512, 'steps': 1875, 'batch_loss/train': 0.925343323033303} 12/21/2021 20:26:10 - INFO - codeparrot_training - Step 1876: {'lr': 0.0004985634940771096, 'samples': 961024, 'steps': 1876, 'batch_loss/train': 0.8192994790151715} 12/21/2021 20:26:22 - INFO - codeparrot_training - Step 1877: {'lr': 0.000498561806188963, 'samples': 961536, 'steps': 1877, 'batch_loss/train': 0.8512080786749721} 12/21/2021 20:26:33 - INFO - codeparrot_training - Step 1878: {'lr': 0.0004985601173126286, 'samples': 962048, 'steps': 1878, 'batch_loss/train': 0.8746908996254206} 12/21/2021 20:26:43 - INFO - codeparrot_training - Step 1879: {'lr': 0.0004985584274481131, 'samples': 962560, 'steps': 1879, 'batch_loss/train': 0.9004705036059022} 12/21/2021 20:26:56 - INFO - codeparrot_training - Step 1880: {'lr': 0.000498556736595423, 'samples': 963072, 'steps': 1880, 'batch_loss/train': 0.8611906156875193} 12/21/2021 20:27:07 - INFO - codeparrot_training - Step 1881: {'lr': 0.0004985550447545652, 'samples': 963584, 'steps': 1881, 'batch_loss/train': 0.7130138901993632} 12/21/2021 20:27:18 - INFO - codeparrot_training - Step 1882: {'lr': 0.0004985533519255464, 'samples': 964096, 'steps': 1882, 'batch_loss/train': 0.834296409972012} 12/21/2021 20:27:28 - INFO - codeparrot_training - Step 1883: {'lr': 0.0004985516581083734, 'samples': 964608, 'steps': 1883, 'batch_loss/train': 0.8909173300489783} 12/21/2021 20:27:40 - INFO - codeparrot_training - Step 1884: {'lr': 0.0004985499633030528, 'samples': 965120, 'steps': 1884, 'batch_loss/train': 0.9050975497812033} 12/21/2021 20:27:51 - INFO - codeparrot_training - Step 1885: {'lr': 0.0004985482675095913, 'samples': 965632, 'steps': 1885, 
'batch_loss/train': 0.9018258606083691} 12/21/2021 20:28:01 - INFO - codeparrot_training - Step 1886: {'lr': 0.0004985465707279959, 'samples': 966144, 'steps': 1886, 'batch_loss/train': 0.890422020573169} 12/21/2021 20:28:13 - INFO - codeparrot_training - Step 1887: {'lr': 0.0004985448729582731, 'samples': 966656, 'steps': 1887, 'batch_loss/train': 0.853865951532498} 12/21/2021 20:28:24 - INFO - codeparrot_training - Step 1888: {'lr': 0.0004985431742004297, 'samples': 967168, 'steps': 1888, 'batch_loss/train': 0.8792003877460957} 12/21/2021 20:28:34 - INFO - codeparrot_training - Step 1889: {'lr': 0.0004985414744544726, 'samples': 967680, 'steps': 1889, 'batch_loss/train': 0.9136353489011526} 12/21/2021 20:28:47 - INFO - codeparrot_training - Step 1890: {'lr': 0.0004985397737204084, 'samples': 968192, 'steps': 1890, 'batch_loss/train': 0.8181290803477168} 12/21/2021 20:28:58 - INFO - codeparrot_training - Step 1891: {'lr': 0.0004985380719982438, 'samples': 968704, 'steps': 1891, 'batch_loss/train': 0.8627249887213111} 12/21/2021 20:29:09 - INFO - codeparrot_training - Step 1892: {'lr': 0.0004985363692879858, 'samples': 969216, 'steps': 1892, 'batch_loss/train': 0.9257152266800404} 12/21/2021 20:29:19 - INFO - codeparrot_training - Step 1893: {'lr': 0.000498534665589641, 'samples': 969728, 'steps': 1893, 'batch_loss/train': 0.8799448469653726} 12/21/2021 20:29:31 - INFO - codeparrot_training - Step 1894: {'lr': 0.0004985329609032161, 'samples': 970240, 'steps': 1894, 'batch_loss/train': 0.8503386178053916} 12/21/2021 20:29:42 - INFO - codeparrot_training - Step 1895: {'lr': 0.0004985312552287181, 'samples': 970752, 'steps': 1895, 'batch_loss/train': 0.6699882475659251} 12/21/2021 20:29:53 - INFO - codeparrot_training - Step 1896: {'lr': 0.0004985295485661537, 'samples': 971264, 'steps': 1896, 'batch_loss/train': 0.9669402996078134} 12/21/2021 20:30:05 - INFO - codeparrot_training - Step 1897: {'lr': 0.0004985278409155296, 'samples': 971776, 'steps': 1897, 
'batch_loss/train': 0.8014138168655336} 12/21/2021 20:30:16 - INFO - codeparrot_training - Step 1898: {'lr': 0.0004985261322768527, 'samples': 972288, 'steps': 1898, 'batch_loss/train': 0.9111372977495193} 12/21/2021 20:30:27 - INFO - codeparrot_training - Step 1899: {'lr': 0.0004985244226501296, 'samples': 972800, 'steps': 1899, 'batch_loss/train': 0.8931278483942151} 12/21/2021 20:30:37 - INFO - codeparrot_training - Step 1900: {'lr': 0.0004985227120353673, 'samples': 973312, 'steps': 1900, 'batch_loss/train': 0.8450008472427726} 12/21/2021 20:30:49 - INFO - codeparrot_training - Step 1901: {'lr': 0.0004985210004325726, 'samples': 973824, 'steps': 1901, 'batch_loss/train': 0.9265907276421785} 12/21/2021 20:31:00 - INFO - codeparrot_training - Step 1902: {'lr': 0.0004985192878417523, 'samples': 974336, 'steps': 1902, 'batch_loss/train': 0.8288377718999982} 12/21/2021 20:31:11 - INFO - codeparrot_training - Step 1903: {'lr': 0.000498517574262913, 'samples': 974848, 'steps': 1903, 'batch_loss/train': 0.9185442980378866} 12/21/2021 20:31:23 - INFO - codeparrot_training - Step 1904: {'lr': 0.0004985158596960617, 'samples': 975360, 'steps': 1904, 'batch_loss/train': 0.8217168105766177} 12/21/2021 20:31:33 - INFO - codeparrot_training - Step 1905: {'lr': 0.0004985141441412052, 'samples': 975872, 'steps': 1905, 'batch_loss/train': 0.9199291882105172} 12/21/2021 20:31:44 - INFO - codeparrot_training - Step 1906: {'lr': 0.0004985124275983504, 'samples': 976384, 'steps': 1906, 'batch_loss/train': 0.8420824324712157} 12/21/2021 20:31:57 - INFO - codeparrot_training - Step 1907: {'lr': 0.0004985107100675038, 'samples': 976896, 'steps': 1907, 'batch_loss/train': 0.9212402869015932} 12/21/2021 20:32:07 - INFO - codeparrot_training - Step 1908: {'lr': 0.0004985089915486726, 'samples': 977408, 'steps': 1908, 'batch_loss/train': 0.7811398338526487} 12/21/2021 20:32:18 - INFO - codeparrot_training - Step 1909: {'lr': 0.0004985072720418635, 'samples': 977920, 'steps': 1909, 
'batch_loss/train': 0.8198994658887386} 12/21/2021 20:32:28 - INFO - codeparrot_training - Step 1910: {'lr': 0.0004985055515470832, 'samples': 978432, 'steps': 1910, 'batch_loss/train': 0.9118938446044922} 12/21/2021 20:32:41 - INFO - codeparrot_training - Step 1911: {'lr': 0.0004985038300643387, 'samples': 978944, 'steps': 1911, 'batch_loss/train': 0.7598953689448535} 12/21/2021 20:32:51 - INFO - codeparrot_training - Step 1912: {'lr': 0.0004985021075936369, 'samples': 979456, 'steps': 1912, 'batch_loss/train': 0.8355837122071534} 12/21/2021 20:33:02 - INFO - codeparrot_training - Step 1913: {'lr': 0.0004985003841349845, 'samples': 979968, 'steps': 1913, 'batch_loss/train': 0.765285518951714} 12/21/2021 20:33:14 - INFO - codeparrot_training - Step 1914: {'lr': 0.0004984986596883883, 'samples': 980480, 'steps': 1914, 'batch_loss/train': 0.8225007951259613} 12/21/2021 20:33:25 - INFO - codeparrot_training - Step 1915: {'lr': 0.0004984969342538553, 'samples': 980992, 'steps': 1915, 'batch_loss/train': 0.9137710812501609} 12/21/2021 20:33:35 - INFO - codeparrot_training - Step 1916: {'lr': 0.0004984952078313925, 'samples': 981504, 'steps': 1916, 'batch_loss/train': 0.8579232050105929} 12/21/2021 20:33:48 - INFO - codeparrot_training - Step 1917: {'lr': 0.0004984934804210064, 'samples': 982016, 'steps': 1917, 'batch_loss/train': 0.8911875681951642} 12/21/2021 20:33:59 - INFO - codeparrot_training - Step 1918: {'lr': 0.0004984917520227041, 'samples': 982528, 'steps': 1918, 'batch_loss/train': 0.972049449570477} 12/21/2021 20:34:09 - INFO - codeparrot_training - Step 1919: {'lr': 0.0004984900226364924, 'samples': 983040, 'steps': 1919, 'batch_loss/train': 0.8034211080521345} 12/21/2021 20:34:20 - INFO - codeparrot_training - Step 1920: {'lr': 0.0004984882922623782, 'samples': 983552, 'steps': 1920, 'batch_loss/train': 0.7752224360592663} 12/21/2021 20:34:32 - INFO - codeparrot_training - Step 1921: {'lr': 0.0004984865609003684, 'samples': 984064, 'steps': 1921, 
'batch_loss/train': 0.9371872218325734} 12/21/2021 20:34:43 - INFO - codeparrot_training - Step 1922: {'lr': 0.0004984848285504699, 'samples': 984576, 'steps': 1922, 'batch_loss/train': 0.8593639386817813} 12/21/2021 20:34:53 - INFO - codeparrot_training - Step 1923: {'lr': 0.0004984830952126894, 'samples': 985088, 'steps': 1923, 'batch_loss/train': 0.883861318230629} 12/21/2021 20:35:05 - INFO - codeparrot_training - Step 1924: {'lr': 0.0004984813608870341, 'samples': 985600, 'steps': 1924, 'batch_loss/train': 1.0461214780807495} 12/21/2021 20:35:16 - INFO - codeparrot_training - Step 1925: {'lr': 0.0004984796255735107, 'samples': 986112, 'steps': 1925, 'batch_loss/train': 1.0278077712282538} 12/21/2021 20:35:27 - INFO - codeparrot_training - Step 1926: {'lr': 0.000498477889272126, 'samples': 986624, 'steps': 1926, 'batch_loss/train': 0.8411048408597708} 12/21/2021 20:35:40 - INFO - codeparrot_training - Step 1927: {'lr': 0.0004984761519828871, 'samples': 987136, 'steps': 1927, 'batch_loss/train': 0.8074838230386376} 12/21/2021 20:35:50 - INFO - codeparrot_training - Step 1928: {'lr': 0.0004984744137058009, 'samples': 987648, 'steps': 1928, 'batch_loss/train': 1.0221177656203508} 12/21/2021 20:36:01 - INFO - codeparrot_training - Step 1929: {'lr': 0.0004984726744408742, 'samples': 988160, 'steps': 1929, 'batch_loss/train': 0.636359327705577} 12/21/2021 20:36:14 - INFO - codeparrot_training - Step 1930: {'lr': 0.000498470934188114, 'samples': 988672, 'steps': 1930, 'batch_loss/train': 0.8164314767345786} 12/21/2021 20:36:24 - INFO - codeparrot_training - Step 1931: {'lr': 0.0004984691929475272, 'samples': 989184, 'steps': 1931, 'batch_loss/train': 0.9007533332332969} 12/21/2021 20:36:35 - INFO - codeparrot_training - Step 1932: {'lr': 0.0004984674507191206, 'samples': 989696, 'steps': 1932, 'batch_loss/train': 0.815044350689277} 12/21/2021 20:36:45 - INFO - codeparrot_training - Step 1933: {'lr': 0.0004984657075029013, 'samples': 990208, 'steps': 1933, 
'batch_loss/train': 1.0739290239289403} 12/21/2021 20:36:57 - INFO - codeparrot_training - Step 1934: {'lr': 0.0004984639632988762, 'samples': 990720, 'steps': 1934, 'batch_loss/train': 1.1658818731084466} 12/21/2021 20:37:08 - INFO - codeparrot_training - Step 1935: {'lr': 0.0004984622181070521, 'samples': 991232, 'steps': 1935, 'batch_loss/train': 0.8970695147290826} 12/21/2021 20:37:19 - INFO - codeparrot_training - Step 1936: {'lr': 0.0004984604719274362, 'samples': 991744, 'steps': 1936, 'batch_loss/train': 0.8768092561513186} 12/21/2021 20:37:32 - INFO - codeparrot_training - Step 1937: {'lr': 0.0004984587247600352, 'samples': 992256, 'steps': 1937, 'batch_loss/train': 0.9532602233812213} 12/21/2021 20:37:43 - INFO - codeparrot_training - Step 1938: {'lr': 0.0004984569766048561, 'samples': 992768, 'steps': 1938, 'batch_loss/train': 0.9120695497840643} 12/21/2021 20:37:54 - INFO - codeparrot_training - Step 1939: {'lr': 0.0004984552274619058, 'samples': 993280, 'steps': 1939, 'batch_loss/train': 0.908918671309948} 12/21/2021 20:38:04 - INFO - codeparrot_training - Step 1940: {'lr': 0.0004984534773311915, 'samples': 993792, 'steps': 1940, 'batch_loss/train': 0.8521115388721228} 12/21/2021 20:38:16 - INFO - codeparrot_training - Step 1941: {'lr': 0.0004984517262127199, 'samples': 994304, 'steps': 1941, 'batch_loss/train': 0.8717880635522306} 12/21/2021 20:38:27 - INFO - codeparrot_training - Step 1942: {'lr': 0.000498449974106498, 'samples': 994816, 'steps': 1942, 'batch_loss/train': 0.8815956693142653} 12/21/2021 20:38:38 - INFO - codeparrot_training - Step 1943: {'lr': 0.0004984482210125329, 'samples': 995328, 'steps': 1943, 'batch_loss/train': 0.9211061354726553} 12/21/2021 20:38:51 - INFO - codeparrot_training - Step 1944: {'lr': 0.0004984464669308314, 'samples': 995840, 'steps': 1944, 'batch_loss/train': 0.8871599705889821} 12/21/2021 20:39:01 - INFO - codeparrot_training - Step 1945: {'lr': 0.0004984447118614006, 'samples': 996352, 'steps': 1945, 
'batch_loss/train': 0.7164979332592338} 12/21/2021 20:39:12 - INFO - codeparrot_training - Step 1946: {'lr': 0.0004984429558042475, 'samples': 996864, 'steps': 1946, 'batch_loss/train': 0.8750196183100343} 12/21/2021 20:39:24 - INFO - codeparrot_training - Step 1947: {'lr': 0.0004984411987593789, 'samples': 997376, 'steps': 1947, 'batch_loss/train': 0.9554291544482112} 12/21/2021 20:39:35 - INFO - codeparrot_training - Step 1948: {'lr': 0.000498439440726802, 'samples': 997888, 'steps': 1948, 'batch_loss/train': 0.9645392382517457} 12/21/2021 20:39:45 - INFO - codeparrot_training - Step 1949: {'lr': 0.0004984376817065236, 'samples': 998400, 'steps': 1949, 'batch_loss/train': 0.9548986069858074} 12/21/2021 20:39:56 - INFO - codeparrot_training - Step 1950: {'lr': 0.0004984359216985508, 'samples': 998912, 'steps': 1950, 'batch_loss/train': 0.9310208605602384} 12/21/2021 20:40:08 - INFO - codeparrot_training - Step 1951: {'lr': 0.0004984341607028906, 'samples': 999424, 'steps': 1951, 'batch_loss/train': 0.8403540216386318} 12/21/2021 20:40:18 - INFO - codeparrot_training - Step 1952: {'lr': 0.00049843239871955, 'samples': 999936, 'steps': 1952, 'batch_loss/train': 0.9184751519933343} 12/21/2021 20:40:29 - INFO - codeparrot_training - Step 1953: {'lr': 0.0004984306357485361, 'samples': 1000448, 'steps': 1953, 'batch_loss/train': 0.9114741757512093} 12/21/2021 20:40:41 - INFO - codeparrot_training - Step 1954: {'lr': 0.0004984288717898557, 'samples': 1000960, 'steps': 1954, 'batch_loss/train': 0.9521170016378164} 12/21/2021 20:40:52 - INFO - codeparrot_training - Step 1955: {'lr': 0.0004984271068435158, 'samples': 1001472, 'steps': 1955, 'batch_loss/train': 0.8104671039618552} 12/21/2021 20:41:02 - INFO - codeparrot_training - Step 1956: {'lr': 0.0004984253409095236, 'samples': 1001984, 'steps': 1956, 'batch_loss/train': 0.9920618869364262} 12/21/2021 20:41:15 - INFO - codeparrot_training - Step 1957: {'lr': 0.0004984235739878861, 'samples': 1002496, 'steps': 1957, 
'batch_loss/train': 0.8919129464775324} 12/21/2021 20:41:26 - INFO - codeparrot_training - Step 1958: {'lr': 0.0004984218060786102, 'samples': 1003008, 'steps': 1958, 'batch_loss/train': 0.6220389024820179} 12/21/2021 20:41:36 - INFO - codeparrot_training - Step 1959: {'lr': 0.000498420037181703, 'samples': 1003520, 'steps': 1959, 'batch_loss/train': 0.9105502516031265} 12/21/2021 20:41:49 - INFO - codeparrot_training - Step 1960: {'lr': 0.0004984182672971715, 'samples': 1004032, 'steps': 1960, 'batch_loss/train': 0.8652134397998452} 12/21/2021 20:41:59 - INFO - codeparrot_training - Step 1961: {'lr': 0.000498416496425023, 'samples': 1004544, 'steps': 1961, 'batch_loss/train': 0.9193787034600973} 12/21/2021 20:42:10 - INFO - codeparrot_training - Step 1962: {'lr': 0.000498414724565264, 'samples': 1005056, 'steps': 1962, 'batch_loss/train': 0.8855801778845489} 12/21/2021 20:42:20 - INFO - codeparrot_training - Step 1963: {'lr': 0.000498412951717902, 'samples': 1005568, 'steps': 1963, 'batch_loss/train': 0.922630394808948} 12/21/2021 20:42:32 - INFO - codeparrot_training - Step 1964: {'lr': 0.0004984111778829439, 'samples': 1006080, 'steps': 1964, 'batch_loss/train': 0.9501363150775433} 12/21/2021 20:42:43 - INFO - codeparrot_training - Step 1965: {'lr': 0.0004984094030603967, 'samples': 1006592, 'steps': 1965, 'batch_loss/train': 0.9640448782593012} 12/21/2021 20:42:54 - INFO - codeparrot_training - Step 1966: {'lr': 0.0004984076272502675, 'samples': 1007104, 'steps': 1966, 'batch_loss/train': 0.900334213860333} 12/21/2021 20:43:07 - INFO - codeparrot_training - Step 1967: {'lr': 0.0004984058504525635, 'samples': 1007616, 'steps': 1967, 'batch_loss/train': 0.9432421568781137} 12/21/2021 20:43:17 - INFO - codeparrot_training - Step 1968: {'lr': 0.0004984040726672916, 'samples': 1008128, 'steps': 1968, 'batch_loss/train': 0.900830393191427} 12/21/2021 20:43:28 - INFO - codeparrot_training - Step 1969: {'lr': 0.0004984022938944588, 'samples': 1008640, 'steps': 1969, 
'batch_loss/train': 0.8101817322894931} 12/21/2021 20:43:40 - INFO - codeparrot_training - Step 1970: {'lr': 0.0004984005141340724, 'samples': 1009152, 'steps': 1970, 'batch_loss/train': 0.8904825272038579} 12/21/2021 20:43:51 - INFO - codeparrot_training - Step 1971: {'lr': 0.0004983987333861392, 'samples': 1009664, 'steps': 1971, 'batch_loss/train': 0.8357994500547647} 12/21/2021 20:44:01 - INFO - codeparrot_training - Step 1972: {'lr': 0.0004983969516506666, 'samples': 1010176, 'steps': 1972, 'batch_loss/train': 0.90746954921633} 12/21/2021 20:44:12 - INFO - codeparrot_training - Step 1973: {'lr': 0.0004983951689276614, 'samples': 1010688, 'steps': 1973, 'batch_loss/train': 0.8991688769310713} 12/21/2021 20:44:25 - INFO - codeparrot_training - Step 1974: {'lr': 0.0004983933852171307, 'samples': 1011200, 'steps': 1974, 'batch_loss/train': 0.8147042547352612} 12/21/2021 20:44:35 - INFO - codeparrot_training - Step 1975: {'lr': 0.0004983916005190819, 'samples': 1011712, 'steps': 1975, 'batch_loss/train': 0.9409529753029346} 12/21/2021 20:44:46 - INFO - codeparrot_training - Step 1976: {'lr': 0.0004983898148335217, 'samples': 1012224, 'steps': 1976, 'batch_loss/train': 0.7380706034600735} 12/21/2021 20:44:58 - INFO - codeparrot_training - Step 1977: {'lr': 0.0004983880281604576, 'samples': 1012736, 'steps': 1977, 'batch_loss/train': 0.8662133812904358} 12/21/2021 20:45:09 - INFO - codeparrot_training - Step 1978: {'lr': 0.0004983862404998963, 'samples': 1013248, 'steps': 1978, 'batch_loss/train': 0.8539702156558633} 12/21/2021 20:45:19 - INFO - codeparrot_training - Step 1979: {'lr': 0.0004983844518518452, 'samples': 1013760, 'steps': 1979, 'batch_loss/train': 0.6292981938458979} 12/21/2021 20:45:30 - INFO - codeparrot_training - Step 1980: {'lr': 0.0004983826622163112, 'samples': 1014272, 'steps': 1980, 'batch_loss/train': 0.7358023351989686} 12/21/2021 20:45:42 - INFO - codeparrot_training - Step 1981: {'lr': 0.0004983808715933016, 'samples': 1014784, 'steps': 
1981, 'batch_loss/train': 0.883704699575901} 12/21/2021 20:45:53 - INFO - codeparrot_training - Step 1982: {'lr': 0.0004983790799828234, 'samples': 1015296, 'steps': 1982, 'batch_loss/train': 0.8585545821115375} 12/21/2021 20:46:03 - INFO - codeparrot_training - Step 1983: {'lr': 0.0004983772873848838, 'samples': 1015808, 'steps': 1983, 'batch_loss/train': 0.9777622818946838} 12/21/2021 20:46:16 - INFO - codeparrot_training - Step 1984: {'lr': 0.0004983754937994899, 'samples': 1016320, 'steps': 1984, 'batch_loss/train': 0.8509199172258377} 12/21/2021 20:46:26 - INFO - codeparrot_training - Step 1985: {'lr': 0.0004983736992266488, 'samples': 1016832, 'steps': 1985, 'batch_loss/train': 0.8992130886763334} 12/21/2021 20:46:37 - INFO - codeparrot_training - Step 1986: {'lr': 0.0004983719036663676, 'samples': 1017344, 'steps': 1986, 'batch_loss/train': 0.87557462323457} 12/21/2021 20:46:50 - INFO - codeparrot_training - Step 1987: {'lr': 0.0004983701071186535, 'samples': 1017856, 'steps': 1987, 'batch_loss/train': 0.8397209839895368} 12/21/2021 20:47:00 - INFO - codeparrot_training - Step 1988: {'lr': 0.0004983683095835138, 'samples': 1018368, 'steps': 1988, 'batch_loss/train': 0.885900791734457} 12/21/2021 20:47:11 - INFO - codeparrot_training - Step 1989: {'lr': 0.0004983665110609553, 'samples': 1018880, 'steps': 1989, 'batch_loss/train': 0.9881757171824574} 12/21/2021 20:47:22 - INFO - codeparrot_training - Step 1990: {'lr': 0.0004983647115509854, 'samples': 1019392, 'steps': 1990, 'batch_loss/train': 0.9471376352012157} 12/21/2021 20:47:34 - INFO - codeparrot_training - Step 1991: {'lr': 0.0004983629110536112, 'samples': 1019904, 'steps': 1991, 'batch_loss/train': 0.8785703368484974} 12/21/2021 20:47:44 - INFO - codeparrot_training - Step 1992: {'lr': 0.0004983611095688397, 'samples': 1020416, 'steps': 1992, 'batch_loss/train': 0.8402291759848595} 12/21/2021 20:47:55 - INFO - codeparrot_training - Step 1993: {'lr': 0.0004983593070966784, 'samples': 1020928, 'steps': 
1993, 'batch_loss/train': 0.9637845447286963} 12/21/2021 20:48:07 - INFO - codeparrot_training - Step 1994: {'lr': 0.0004983575036371342, 'samples': 1021440, 'steps': 1994, 'batch_loss/train': 0.8597122263163328} 12/21/2021 20:48:17 - INFO - codeparrot_training - Step 1995: {'lr': 0.0004983556991902143, 'samples': 1021952, 'steps': 1995, 'batch_loss/train': 0.9756135120987892} 12/21/2021 20:48:28 - INFO - codeparrot_training - Step 1996: {'lr': 0.000498353893755926, 'samples': 1022464, 'steps': 1996, 'batch_loss/train': 0.9825044861063361} 12/21/2021 20:48:41 - INFO - codeparrot_training - Step 1997: {'lr': 0.0004983520873342763, 'samples': 1022976, 'steps': 1997, 'batch_loss/train': 0.9247800167649984} 12/21/2021 20:48:51 - INFO - codeparrot_training - Step 1998: {'lr': 0.0004983502799252725, 'samples': 1023488, 'steps': 1998, 'batch_loss/train': 0.936079922132194} 12/21/2021 20:49:02 - INFO - codeparrot_training - Step 1999: {'lr': 0.0004983484715289218, 'samples': 1024000, 'steps': 1999, 'batch_loss/train': 0.971809234470129} 12/21/2021 20:49:14 - INFO - codeparrot_training - Step 2000: {'lr': 0.0004983466621452313, 'samples': 1024512, 'steps': 2000, 'batch_loss/train': 0.9302699360996485} 12/21/2021 20:49:25 - INFO - codeparrot_training - Step 2001: {'lr': 0.0004983448517742083, 'samples': 1025024, 'steps': 2001, 'batch_loss/train': 0.9567683283239603} 12/21/2021 20:49:35 - INFO - codeparrot_training - Step 2002: {'lr': 0.0004983430404158599, 'samples': 1025536, 'steps': 2002, 'batch_loss/train': 0.9115695282816887} 12/21/2021 20:49:46 - INFO - codeparrot_training - Step 2003: {'lr': 0.0004983412280701933, 'samples': 1026048, 'steps': 2003, 'batch_loss/train': 0.7873107944615185} 12/21/2021 20:49:59 - INFO - codeparrot_training - Step 2004: {'lr': 0.0004983394147372159, 'samples': 1026560, 'steps': 2004, 'batch_loss/train': 0.925415784586221} 12/21/2021 20:50:09 - INFO - codeparrot_training - Step 2005: {'lr': 0.0004983376004169347, 'samples': 1027072, 'steps': 
2005, 'batch_loss/train': 0.9583002189174294} 12/21/2021 20:50:20 - INFO - codeparrot_training - Step 2006: {'lr': 0.0004983357851093568, 'samples': 1027584, 'steps': 2006, 'batch_loss/train': 0.8649103962816298} 12/21/2021 20:50:32 - INFO - codeparrot_training - Step 2007: {'lr': 0.0004983339688144898, 'samples': 1028096, 'steps': 2007, 'batch_loss/train': 0.9426075303927064} 12/21/2021 20:50:43 - INFO - codeparrot_training - Step 2008: {'lr': 0.0004983321515323406, 'samples': 1028608, 'steps': 2008, 'batch_loss/train': 0.8747092611156404} 12/21/2021 20:50:53 - INFO - codeparrot_training - Step 2009: {'lr': 0.0004983303332629166, 'samples': 1029120, 'steps': 2009, 'batch_loss/train': 0.8771300781518221} 12/21/2021 20:51:05 - INFO - codeparrot_training - Step 2010: {'lr': 0.0004983285140062249, 'samples': 1029632, 'steps': 2010, 'batch_loss/train': 0.9037513546645641} 12/21/2021 20:51:16 - INFO - codeparrot_training - Step 2011: {'lr': 0.0004983266937622728, 'samples': 1030144, 'steps': 2011, 'batch_loss/train': 0.8476507430896163} 12/21/2021 20:51:27 - INFO - codeparrot_training - Step 2012: {'lr': 0.0004983248725310675, 'samples': 1030656, 'steps': 2012, 'batch_loss/train': 0.8998219296336174} 12/21/2021 20:51:37 - INFO - codeparrot_training - Step 2013: {'lr': 0.0004983230503126163, 'samples': 1031168, 'steps': 2013, 'batch_loss/train': 1.7594540459103882} 12/21/2021 20:51:50 - INFO - codeparrot_training - Step 2014: {'lr': 0.0004983212271069265, 'samples': 1031680, 'steps': 2014, 'batch_loss/train': 0.9030841011554003} 12/21/2021 20:52:01 - INFO - codeparrot_training - Step 2015: {'lr': 0.0004983194029140051, 'samples': 1032192, 'steps': 2015, 'batch_loss/train': 0.6301599130965769} 12/21/2021 20:52:11 - INFO - codeparrot_training - Step 2016: {'lr': 0.0004983175777338596, 'samples': 1032704, 'steps': 2016, 'batch_loss/train': 0.9767075311392546} 12/21/2021 20:52:24 - INFO - codeparrot_training - Step 2017: {'lr': 0.0004983157515664972, 'samples': 1033216, 
'steps': 2017, 'batch_loss/train': 0.9529700479470193} 12/21/2021 20:52:35 - INFO - codeparrot_training - Step 2018: {'lr': 0.0004983139244119252, 'samples': 1033728, 'steps': 2018, 'batch_loss/train': 0.9072929713875055} 12/21/2021 20:52:45 - INFO - codeparrot_training - Step 2019: {'lr': 0.0004983120962701506, 'samples': 1034240, 'steps': 2019, 'batch_loss/train': 0.9229670595377684} 12/21/2021 20:52:57 - INFO - codeparrot_training - Step 2020: {'lr': 0.000498310267141181, 'samples': 1034752, 'steps': 2020, 'batch_loss/train': 0.8694041995331645} 12/21/2021 20:53:08 - INFO - codeparrot_training - Step 2021: {'lr': 0.0004983084370250235, 'samples': 1035264, 'steps': 2021, 'batch_loss/train': 0.8456221492961049} 12/21/2021 20:53:19 - INFO - codeparrot_training - Step 2022: {'lr': 0.0004983066059216854, 'samples': 1035776, 'steps': 2022, 'batch_loss/train': 0.9045664104633033} 12/21/2021 20:53:29 - INFO - codeparrot_training - Step 2023: {'lr': 0.000498304773831174, 'samples': 1036288, 'steps': 2023, 'batch_loss/train': 0.8795490153133869} 12/21/2021 20:53:41 - INFO - codeparrot_training - Step 2024: {'lr': 0.0004983029407534966, 'samples': 1036800, 'steps': 2024, 'batch_loss/train': 0.9291595285758376} 12/21/2021 20:53:52 - INFO - codeparrot_training - Step 2025: {'lr': 0.0004983011066886604, 'samples': 1037312, 'steps': 2025, 'batch_loss/train': 0.8593326075933874} 12/21/2021 20:54:02 - INFO - codeparrot_training - Step 2026: {'lr': 0.0004982992716366729, 'samples': 1037824, 'steps': 2026, 'batch_loss/train': 0.8937574117444456} 12/21/2021 20:54:15 - INFO - codeparrot_training - Step 2027: {'lr': 0.0004982974355975412, 'samples': 1038336, 'steps': 2027, 'batch_loss/train': 0.8459899229928851} 12/21/2021 20:54:26 - INFO - codeparrot_training - Step 2028: {'lr': 0.0004982955985712725, 'samples': 1038848, 'steps': 2028, 'batch_loss/train': 0.9798802491277456} 12/21/2021 20:54:36 - INFO - codeparrot_training - Step 2029: {'lr': 0.0004982937605578743, 'samples': 
1039360, 'steps': 2029, 'batch_loss/train': 0.8368792654946446} 12/21/2021 20:54:49 - INFO - codeparrot_training - Step 2030: {'lr': 0.0004982919215573541, 'samples': 1039872, 'steps': 2030, 'batch_loss/train': 0.9758359370753169} 12/21/2021 20:55:00 - INFO - codeparrot_training - Step 2031: {'lr': 0.0004982900815697186, 'samples': 1040384, 'steps': 2031, 'batch_loss/train': 0.9173170011490583} 12/21/2021 20:55:11 - INFO - codeparrot_training - Step 2032: {'lr': 0.0004982882405949757, 'samples': 1040896, 'steps': 2032, 'batch_loss/train': 1.336600348353386} 12/21/2021 20:55:21 - INFO - codeparrot_training - Step 2033: {'lr': 0.0004982863986331325, 'samples': 1041408, 'steps': 2033, 'batch_loss/train': 0.9601623322814703} 12/21/2021 20:55:33 - INFO - codeparrot_training - Step 2034: {'lr': 0.0004982845556841963, 'samples': 1041920, 'steps': 2034, 'batch_loss/train': 0.9148767059668899} 12/21/2021 20:55:44 - INFO - codeparrot_training - Step 2035: {'lr': 0.0004982827117481746, 'samples': 1042432, 'steps': 2035, 'batch_loss/train': 0.9845279511064291} 12/21/2021 20:55:54 - INFO - codeparrot_training - Step 2036: {'lr': 0.0004982808668250743, 'samples': 1042944, 'steps': 2036, 'batch_loss/train': 0.8697410952299833} 12/21/2021 20:56:07 - INFO - codeparrot_training - Step 2037: {'lr': 0.0004982790209149032, 'samples': 1043456, 'steps': 2037, 'batch_loss/train': 0.8704917645081878} 12/21/2021 20:56:18 - INFO - codeparrot_training - Step 2038: {'lr': 0.0004982771740176684, 'samples': 1043968, 'steps': 2038, 'batch_loss/train': 0.9036972601898015} 12/21/2021 20:56:28 - INFO - codeparrot_training - Step 2039: {'lr': 0.0004982753261333773, 'samples': 1044480, 'steps': 2039, 'batch_loss/train': 0.8113594288006425} 12/21/2021 20:56:40 - INFO - codeparrot_training - Step 2040: {'lr': 0.0004982734772620373, 'samples': 1044992, 'steps': 2040, 'batch_loss/train': 1.017271576449275} 12/21/2021 20:56:51 - INFO - codeparrot_training - Step 2041: {'lr': 0.0004982716274036557, 
'samples': 1045504, 'steps': 2041, 'batch_loss/train': 1.009530832991004} 12/21/2021 20:57:02 - INFO - codeparrot_training - Step 2042: {'lr': 0.0004982697765582399, 'samples': 1046016, 'steps': 2042, 'batch_loss/train': 0.8396618878468871} 12/21/2021 20:57:14 - INFO - codeparrot_training - Step 2043: {'lr': 0.0004982679247257971, 'samples': 1046528, 'steps': 2043, 'batch_loss/train': 0.8713187663815916} 12/21/2021 20:57:25 - INFO - codeparrot_training - Step 2044: {'lr': 0.0004982660719063349, 'samples': 1047040, 'steps': 2044, 'batch_loss/train': 0.9142915960401297} 12/21/2021 20:57:36 - INFO - codeparrot_training - Step 2045: {'lr': 0.0004982642180998603, 'samples': 1047552, 'steps': 2045, 'batch_loss/train': 0.9489435544237494} 12/21/2021 20:57:46 - INFO - codeparrot_training - Step 2046: {'lr': 0.0004982623633063811, 'samples': 1048064, 'steps': 2046, 'batch_loss/train': 0.9301741113886237} 12/21/2021 20:57:58 - INFO - codeparrot_training - Step 2047: {'lr': 0.0004982605075259045, 'samples': 1048576, 'steps': 2047, 'batch_loss/train': 0.8671368854120374} 12/21/2021 20:58:09 - INFO - codeparrot_training - Step 2048: {'lr': 0.0004982586507584378, 'samples': 1049088, 'steps': 2048, 'batch_loss/train': 0.9748756121844053} 12/21/2021 20:58:20 - INFO - codeparrot_training - Step 2049: {'lr': 0.0004982567930039884, 'samples': 1049600, 'steps': 2049, 'batch_loss/train': 0.8773713186383247} 12/21/2021 20:58:32 - INFO - codeparrot_training - Step 2050: {'lr': 0.0004982549342625639, 'samples': 1050112, 'steps': 2050, 'batch_loss/train': 0.8680571410804987} 12/21/2021 20:58:43 - INFO - codeparrot_training - Step 2051: {'lr': 0.0004982530745341713, 'samples': 1050624, 'steps': 2051, 'batch_loss/train': 0.9838526826351881} 12/21/2021 20:58:53 - INFO - codeparrot_training - Step 2052: {'lr': 0.0004982512138188183, 'samples': 1051136, 'steps': 2052, 'batch_loss/train': 0.9161370927467942} 12/21/2021 20:59:06 - INFO - codeparrot_training - Step 2053: {'lr': 
0.0004982493521165123, 'samples': 1051648, 'steps': 2053, 'batch_loss/train': 0.7827617154689506} 12/21/2021 20:59:17 - INFO - codeparrot_training - Step 2054: {'lr': 0.0004982474894272605, 'samples': 1052160, 'steps': 2054, 'batch_loss/train': 0.7820278378203511} 12/21/2021 20:59:27 - INFO - codeparrot_training - Step 2055: {'lr': 0.0004982456257510705, 'samples': 1052672, 'steps': 2055, 'batch_loss/train': 0.961200850084424} 12/21/2021 20:59:38 - INFO - codeparrot_training - Step 2056: {'lr': 0.0004982437610879496, 'samples': 1053184, 'steps': 2056, 'batch_loss/train': 1.027608166448772} 12/21/2021 20:59:50 - INFO - codeparrot_training - Step 2057: {'lr': 0.0004982418954379052, 'samples': 1053696, 'steps': 2057, 'batch_loss/train': 0.7977152357343584} 12/21/2021 21:00:01 - INFO - codeparrot_training - Step 2058: {'lr': 0.0004982400288009448, 'samples': 1054208, 'steps': 2058, 'batch_loss/train': 0.8413022616878152} 12/21/2021 21:00:11 - INFO - codeparrot_training - Step 2059: {'lr': 0.0004982381611770756, 'samples': 1054720, 'steps': 2059, 'batch_loss/train': 0.7534038992598653} 12/21/2021 21:00:23 - INFO - codeparrot_training - Step 2060: {'lr': 0.0004982362925663054, 'samples': 1055232, 'steps': 2060, 'batch_loss/train': 0.8310652254149318} 12/21/2021 21:00:34 - INFO - codeparrot_training - Step 2061: {'lr': 0.0004982344229686414, 'samples': 1055744, 'steps': 2061, 'batch_loss/train': 0.8373401000862941} 12/21/2021 21:00:44 - INFO - codeparrot_training - Step 2062: {'lr': 0.0004982325523840911, 'samples': 1056256, 'steps': 2062, 'batch_loss/train': 0.8499380508437753} 12/21/2021 21:00:56 - INFO - codeparrot_training - Step 2063: {'lr': 0.0004982306808126618, 'samples': 1056768, 'steps': 2063, 'batch_loss/train': 0.8572957813739777} 12/21/2021 21:01:07 - INFO - codeparrot_training - Step 2064: {'lr': 0.0004982288082543611, 'samples': 1057280, 'steps': 2064, 'batch_loss/train': 0.8479521656408906} 12/21/2021 21:01:18 - INFO - codeparrot_training - Step 2065: 
{'lr': 0.0004982269347091964, 'samples': 1057792, 'steps': 2065, 'batch_loss/train': 0.8383498266339302} 12/21/2021 21:01:28 - INFO - codeparrot_training - Step 2066: {'lr': 0.000498225060177175, 'samples': 1058304, 'steps': 2066, 'batch_loss/train': 0.9711756845936179} 12/21/2021 21:01:41 - INFO - codeparrot_training - Step 2067: {'lr': 0.0004982231846583046, 'samples': 1058816, 'steps': 2067, 'batch_loss/train': 0.9383909879252315} 12/21/2021 21:01:52 - INFO - codeparrot_training - Step 2068: {'lr': 0.0004982213081525926, 'samples': 1059328, 'steps': 2068, 'batch_loss/train': 0.8447313709184527} 12/21/2021 21:02:02 - INFO - codeparrot_training - Step 2069: {'lr': 0.0004982194306600464, 'samples': 1059840, 'steps': 2069, 'batch_loss/train': 0.8892351621761918} 12/21/2021 21:02:15 - INFO - codeparrot_training - Step 2070: {'lr': 0.0004982175521806734, 'samples': 1060352, 'steps': 2070, 'batch_loss/train': 0.8545324122533202} 12/21/2021 21:02:25 - INFO - codeparrot_training - Step 2071: {'lr': 0.0004982156727144811, 'samples': 1060864, 'steps': 2071, 'batch_loss/train': 0.7637093365192413} 12/21/2021 21:02:36 - INFO - codeparrot_training - Step 2072: {'lr': 0.0004982137922614771, 'samples': 1061376, 'steps': 2072, 'batch_loss/train': 0.8623336441814899} 12/21/2021 21:02:48 - INFO - codeparrot_training - Step 2073: {'lr': 0.0004982119108216688, 'samples': 1061888, 'steps': 2073, 'batch_loss/train': 0.9540804685093462} 12/21/2021 21:02:59 - INFO - codeparrot_training - Step 2074: {'lr': 0.0004982100283950636, 'samples': 1062400, 'steps': 2074, 'batch_loss/train': 0.82081974146422} 12/21/2021 21:03:10 - INFO - codeparrot_training - Step 2075: {'lr': 0.0004982081449816689, 'samples': 1062912, 'steps': 2075, 'batch_loss/train': 0.8770499657839537} 12/21/2021 21:03:20 - INFO - codeparrot_training - Step 2076: {'lr': 0.0004982062605814925, 'samples': 1063424, 'steps': 2076, 'batch_loss/train': 0.9273561807349324} 12/21/2021 21:03:32 - INFO - codeparrot_training - Step 
2077: {'lr': 0.0004982043751945418, 'samples': 1063936, 'steps': 2077, 'batch_loss/train': 0.7912288075312972} 12/21/2021 21:03:43 - INFO - codeparrot_training - Step 2078: {'lr': 0.0004982024888208241, 'samples': 1064448, 'steps': 2078, 'batch_loss/train': 0.8962944250088185} 12/21/2021 21:03:54 - INFO - codeparrot_training - Step 2079: {'lr': 0.0004982006014603471, 'samples': 1064960, 'steps': 2079, 'batch_loss/train': 0.9481277707964182} 12/21/2021 21:04:06 - INFO - codeparrot_training - Step 2080: {'lr': 0.0004981987131131182, 'samples': 1065472, 'steps': 2080, 'batch_loss/train': 0.9640060588717461} 12/21/2021 21:04:16 - INFO - codeparrot_training - Step 2081: {'lr': 0.0004981968237791448, 'samples': 1065984, 'steps': 2081, 'batch_loss/train': 0.9303702700417489} 12/21/2021 21:04:27 - INFO - codeparrot_training - Step 2082: {'lr': 0.0004981949334584347, 'samples': 1066496, 'steps': 2082, 'batch_loss/train': 0.8874542722478509} 12/21/2021 21:04:40 - INFO - codeparrot_training - Step 2083: {'lr': 0.0004981930421509952, 'samples': 1067008, 'steps': 2083, 'batch_loss/train': 1.031493254005909} 12/21/2021 21:04:50 - INFO - codeparrot_training - Step 2084: {'lr': 0.0004981911498568339, 'samples': 1067520, 'steps': 2084, 'batch_loss/train': 0.6493448503315449} 12/21/2021 21:05:01 - INFO - codeparrot_training - Step 2085: {'lr': 0.0004981892565759583, 'samples': 1068032, 'steps': 2085, 'batch_loss/train': 0.8720532795414329} 12/21/2021 21:05:12 - INFO - codeparrot_training - Step 2086: {'lr': 0.0004981873623083759, 'samples': 1068544, 'steps': 2086, 'batch_loss/train': 0.8928144611418247} 12/21/2021 21:05:24 - INFO - codeparrot_training - Step 2087: {'lr': 0.0004981854670540942, 'samples': 1069056, 'steps': 2087, 'batch_loss/train': 0.9243437945842743} 12/21/2021 21:05:35 - INFO - codeparrot_training - Step 2088: {'lr': 0.0004981835708131211, 'samples': 1069568, 'steps': 2088, 'batch_loss/train': 0.9362732470035553} 12/21/2021 21:05:45 - INFO - codeparrot_training - 
Step 2089: {'lr': 0.0004981816735854635, 'samples': 1070080, 'steps': 2089, 'batch_loss/train': 0.9847526662051678} 12/21/2021 21:05:58 - INFO - codeparrot_training - Step 2090: {'lr': 0.0004981797753711295, 'samples': 1070592, 'steps': 2090, 'batch_loss/train': 0.8986259873490781} 12/21/2021 21:06:09 - INFO - codeparrot_training - Step 2091: {'lr': 0.0004981778761701262, 'samples': 1071104, 'steps': 2091, 'batch_loss/train': 1.6072808797471225} 12/21/2021 21:06:19 - INFO - codeparrot_training - Step 2092: {'lr': 0.0004981759759824615, 'samples': 1071616, 'steps': 2092, 'batch_loss/train': 0.8402136759832501} 12/21/2021 21:06:31 - INFO - codeparrot_training - Step 2093: {'lr': 0.0004981740748081429, 'samples': 1072128, 'steps': 2093, 'batch_loss/train': 0.8786074016243219} 12/21/2021 21:06:42 - INFO - codeparrot_training - Step 2094: {'lr': 0.0004981721726471778, 'samples': 1072640, 'steps': 2094, 'batch_loss/train': 0.9028868023306131} 12/21/2021 21:06:52 - INFO - codeparrot_training - Step 2095: {'lr': 0.0004981702694995737, 'samples': 1073152, 'steps': 2095, 'batch_loss/train': 0.8658848702907562} 12/21/2021 21:07:03 - INFO - codeparrot_training - Step 2096: {'lr': 0.0004981683653653385, 'samples': 1073664, 'steps': 2096, 'batch_loss/train': 0.8809843941126019} 12/21/2021 21:07:15 - INFO - codeparrot_training - Step 2097: {'lr': 0.0004981664602444795, 'samples': 1074176, 'steps': 2097, 'batch_loss/train': 0.9681063313037157} 12/21/2021 21:07:26 - INFO - codeparrot_training - Step 2098: {'lr': 0.0004981645541370045, 'samples': 1074688, 'steps': 2098, 'batch_loss/train': 0.7830138020217419} 12/21/2021 21:07:36 - INFO - codeparrot_training - Step 2099: {'lr': 0.0004981626470429208, 'samples': 1075200, 'steps': 2099, 'batch_loss/train': 0.8380732387304306} 12/21/2021 21:07:48 - INFO - codeparrot_training - Step 2100: {'lr': 0.0004981607389622361, 'samples': 1075712, 'steps': 2100, 'batch_loss/train': 0.8200987172313035} 12/21/2021 21:07:59 - INFO - 
codeparrot_training - Step 2101: {'lr': 0.000498158829894958, 'samples': 1076224, 'steps': 2101, 'batch_loss/train': 0.90417237021029} 12/21/2021 21:08:09 - INFO - codeparrot_training - Step 2102: {'lr': 0.0004981569198410943, 'samples': 1076736, 'steps': 2102, 'batch_loss/train': 0.9305787873454392} 12/21/2021 21:08:22 - INFO - codeparrot_training - Step 2103: {'lr': 0.0004981550088006523, 'samples': 1077248, 'steps': 2103, 'batch_loss/train': 0.8563022955786437} 12/21/2021 21:08:33 - INFO - codeparrot_training - Step 2104: {'lr': 0.0004981530967736395, 'samples': 1077760, 'steps': 2104, 'batch_loss/train': 0.8638444542884827} 12/21/2021 21:08:43 - INFO - codeparrot_training - Step 2105: {'lr': 0.0004981511837600639, 'samples': 1078272, 'steps': 2105, 'batch_loss/train': 0.8885208987630904} 12/21/2021 21:08:54 - INFO - codeparrot_training - Step 2106: {'lr': 0.0004981492697599327, 'samples': 1078784, 'steps': 2106, 'batch_loss/train': 1.0104186870157719} 12/21/2021 21:09:06 - INFO - codeparrot_training - Step 2107: {'lr': 0.0004981473547732539, 'samples': 1079296, 'steps': 2107, 'batch_loss/train': 0.9351612981408834} 12/21/2021 21:09:17 - INFO - codeparrot_training - Step 2108: {'lr': 0.0004981454388000348, 'samples': 1079808, 'steps': 2108, 'batch_loss/train': 0.8659628313034773} 12/21/2021 21:09:27 - INFO - codeparrot_training - Step 2109: {'lr': 0.0004981435218402832, 'samples': 1080320, 'steps': 2109, 'batch_loss/train': 0.8570291493088007} 12/21/2021 21:09:39 - INFO - codeparrot_training - Step 2110: {'lr': 0.0004981416038940065, 'samples': 1080832, 'steps': 2110, 'batch_loss/train': 1.0245922897593118} 12/21/2021 21:09:50 - INFO - codeparrot_training - Step 2111: {'lr': 0.0004981396849612126, 'samples': 1081344, 'steps': 2111, 'batch_loss/train': 0.8535224534571171} 12/21/2021 21:10:00 - INFO - codeparrot_training - Step 2112: {'lr': 0.000498137765041909, 'samples': 1081856, 'steps': 2112, 'batch_loss/train': 0.838034956716001} 12/21/2021 21:10:13 - INFO - 
codeparrot_training - Step 2113: {'lr': 0.0004981358441361033, 'samples': 1082368, 'steps': 2113, 'batch_loss/train': 0.8196658273227513} 12/21/2021 21:10:24 - INFO - codeparrot_training - Step 2114: {'lr': 0.0004981339222438031, 'samples': 1082880, 'steps': 2114, 'batch_loss/train': 0.9101518481038511} 12/21/2021 21:10:35 - INFO - codeparrot_training - Step 2115: {'lr': 0.0004981319993650163, 'samples': 1083392, 'steps': 2115, 'batch_loss/train': 0.8587559210136533} 12/21/2021 21:10:45 - INFO - codeparrot_training - Step 2116: {'lr': 0.0004981300754997503, 'samples': 1083904, 'steps': 2116, 'batch_loss/train': 0.8440177515149117} 12/21/2021 21:10:57 - INFO - codeparrot_training - Step 2117: {'lr': 0.0004981281506480128, 'samples': 1084416, 'steps': 2117, 'batch_loss/train': 0.8158174208365381} 12/21/2021 21:11:08 - INFO - codeparrot_training - Step 2118: {'lr': 0.0004981262248098114, 'samples': 1084928, 'steps': 2118, 'batch_loss/train': 0.8434099815785885} 12/21/2021 21:11:18 - INFO - codeparrot_training - Step 2119: {'lr': 0.0004981242979851539, 'samples': 1085440, 'steps': 2119, 'batch_loss/train': 0.9283712981268764} 12/21/2021 21:11:31 - INFO - codeparrot_training - Step 2120: {'lr': 0.0004981223701740478, 'samples': 1085952, 'steps': 2120, 'batch_loss/train': 0.9390012174844742} 12/21/2021 21:11:42 - INFO - codeparrot_training - Step 2121: {'lr': 0.0004981204413765009, 'samples': 1086464, 'steps': 2121, 'batch_loss/train': 0.8318948755040765} 12/21/2021 21:11:52 - INFO - codeparrot_training - Step 2122: {'lr': 0.0004981185115925208, 'samples': 1086976, 'steps': 2122, 'batch_loss/train': 1.4338542288169265} 12/21/2021 21:12:05 - INFO - codeparrot_training - Step 2123: {'lr': 0.0004981165808221152, 'samples': 1087488, 'steps': 2123, 'batch_loss/train': 0.909333495888859} 12/21/2021 21:12:16 - INFO - codeparrot_training - Step 2124: {'lr': 0.0004981146490652918, 'samples': 1088000, 'steps': 2124, 'batch_loss/train': 0.9454401088878512} 12/21/2021 21:12:26 - 
INFO - codeparrot_training - Step 2125: {'lr': 0.0004981127163220582, 'samples': 1088512, 'steps': 2125, 'batch_loss/train': 0.9084541611373425} 12/21/2021 21:12:37 - INFO - codeparrot_training - Step 2126: {'lr': 0.000498110782592422, 'samples': 1089024, 'steps': 2126, 'batch_loss/train': 0.9888563510030508} 12/21/2021 21:12:49 - INFO - codeparrot_training - Step 2127: {'lr': 0.0004981088478763912, 'samples': 1089536, 'steps': 2127, 'batch_loss/train': 1.0494901379570365} 12/21/2021 21:13:00 - INFO - codeparrot_training - Step 2128: {'lr': 0.0004981069121739732, 'samples': 1090048, 'steps': 2128, 'batch_loss/train': 1.0097657274454832} 12/21/2021 21:13:11 - INFO - codeparrot_training - Step 2129: {'lr': 0.0004981049754851759, 'samples': 1090560, 'steps': 2129, 'batch_loss/train': 0.8863750156015158} 12/21/2021 21:13:24 - INFO - codeparrot_training - Step 2130: {'lr': 0.0004981030378100068, 'samples': 1091072, 'steps': 2130, 'batch_loss/train': 0.8474823962897062} 12/21/2021 21:13:34 - INFO - codeparrot_training - Step 2131: {'lr': 0.0004981010991484736, 'samples': 1091584, 'steps': 2131, 'batch_loss/train': 0.8327016048133373} 12/21/2021 21:13:45 - INFO - codeparrot_training - Step 2132: {'lr': 0.0004980991595005843, 'samples': 1092096, 'steps': 2132, 'batch_loss/train': 0.8102053515613079} 12/21/2021 21:13:57 - INFO - codeparrot_training - Step 2133: {'lr': 0.0004980972188663464, 'samples': 1092608, 'steps': 2133, 'batch_loss/train': 0.8455256223678589} 12/21/2021 21:14:08 - INFO - codeparrot_training - Step 2134: {'lr': 0.0004980952772457676, 'samples': 1093120, 'steps': 2134, 'batch_loss/train': 0.8444462269544601} 12/21/2021 21:14:18 - INFO - codeparrot_training - Step 2135: {'lr': 0.0004980933346388556, 'samples': 1093632, 'steps': 2135, 'batch_loss/train': 0.9169659428298473} 12/21/2021 21:14:29 - INFO - codeparrot_training - Step 2136: {'lr': 0.0004980913910456183, 'samples': 1094144, 'steps': 2136, 'batch_loss/train': 1.0533856302499771} 12/21/2021 
21:14:41 - INFO - codeparrot_training - Step 2137: {'lr': 0.0004980894464660632, 'samples': 1094656, 'steps': 2137, 'batch_loss/train': 1.2273599151521921} 12/21/2021 21:14:51 - INFO - codeparrot_training - Step 2138: {'lr': 0.0004980875009001982, 'samples': 1095168, 'steps': 2138, 'batch_loss/train': 0.8861374240368605} 12/21/2021 21:15:02 - INFO - codeparrot_training - Step 2139: {'lr': 0.0004980855543480309, 'samples': 1095680, 'steps': 2139, 'batch_loss/train': 0.8363719885237515} 12/21/2021 21:15:14 - INFO - codeparrot_training - Step 2140: {'lr': 0.0004980836068095691, 'samples': 1096192, 'steps': 2140, 'batch_loss/train': 0.866899729706347} 12/21/2021 21:15:25 - INFO - codeparrot_training - Step 2141: {'lr': 0.0004980816582848206, 'samples': 1096704, 'steps': 2141, 'batch_loss/train': 0.9049460580572486} 12/21/2021 21:15:35 - INFO - codeparrot_training - Step 2142: {'lr': 0.0004980797087737931, 'samples': 1097216, 'steps': 2142, 'batch_loss/train': 0.9276563785970211} 12/21/2021 21:15:48 - INFO - codeparrot_training - Step 2143: {'lr': 0.0004980777582764944, 'samples': 1097728, 'steps': 2143, 'batch_loss/train': 0.8835952868685126} 12/21/2021 21:15:59 - INFO - codeparrot_training - Step 2144: {'lr': 0.0004980758067929321, 'samples': 1098240, 'steps': 2144, 'batch_loss/train': 0.9366036951541901} 12/21/2021 21:16:09 - INFO - codeparrot_training - Step 2145: {'lr': 0.0004980738543231141, 'samples': 1098752, 'steps': 2145, 'batch_loss/train': 0.8915330907329917} 12/21/2021 21:16:20 - INFO - codeparrot_training - Step 2146: {'lr': 0.0004980719008670482, 'samples': 1099264, 'steps': 2146, 'batch_loss/train': 0.9525922108441591} 12/21/2021 21:16:32 - INFO - codeparrot_training - Step 2147: {'lr': 0.0004980699464247419, 'samples': 1099776, 'steps': 2147, 'batch_loss/train': 0.9620562195777893} 12/21/2021 21:16:43 - INFO - codeparrot_training - Step 2148: {'lr': 0.0004980679909962033, 'samples': 1100288, 'steps': 2148, 'batch_loss/train': 0.9537472035735846} 
12/21/2021 21:16:53 - INFO - codeparrot_training - Step 2149: {'lr': 0.0004980660345814401, 'samples': 1100800, 'steps': 2149, 'batch_loss/train': 0.8254425078630447} 12/21/2021 21:17:06 - INFO - codeparrot_training - Step 2150: {'lr': 0.0004980640771804598, 'samples': 1101312, 'steps': 2150, 'batch_loss/train': 0.9531893031671643} 12/21/2021 21:17:16 - INFO - codeparrot_training - Step 2151: {'lr': 0.0004980621187932706, 'samples': 1101824, 'steps': 2151, 'batch_loss/train': 1.106052802875638} 12/21/2021 21:17:27 - INFO - codeparrot_training - Step 2152: {'lr': 0.00049806015941988, 'samples': 1102336, 'steps': 2152, 'batch_loss/train': 1.0102463271468878} 12/21/2021 21:17:40 - INFO - codeparrot_training - Step 2153: {'lr': 0.0004980581990602959, 'samples': 1102848, 'steps': 2153, 'batch_loss/train': 0.9261414585635066} 12/21/2021 21:17:50 - INFO - codeparrot_training - Step 2154: {'lr': 0.000498056237714526, 'samples': 1103360, 'steps': 2154, 'batch_loss/train': 0.9095185939222574} 12/21/2021 21:18:01 - INFO - codeparrot_training - Step 2155: {'lr': 0.0004980542753825782, 'samples': 1103872, 'steps': 2155, 'batch_loss/train': 0.8923492953181267} 12/21/2021 21:18:13 - INFO - codeparrot_training - Step 2156: {'lr': 0.0004980523120644603, 'samples': 1104384, 'steps': 2156, 'batch_loss/train': 0.9582636393606663} 12/21/2021 21:18:24 - INFO - codeparrot_training - Step 2157: {'lr': 0.00049805034776018, 'samples': 1104896, 'steps': 2157, 'batch_loss/train': 0.8866456095129251} 12/21/2021 21:18:34 - INFO - codeparrot_training - Step 2158: {'lr': 0.0004980483824697453, 'samples': 1105408, 'steps': 2158, 'batch_loss/train': 1.0278130862861872} 12/21/2021 21:18:45 - INFO - codeparrot_training - Step 2159: {'lr': 0.0004980464161931637, 'samples': 1105920, 'steps': 2159, 'batch_loss/train': 0.943147636950016} 12/21/2021 21:18:57 - INFO - codeparrot_training - Step 2160: {'lr': 0.0004980444489304434, 'samples': 1106432, 'steps': 2160, 'batch_loss/train': 0.8256605723872781} 
12/21/2021 21:19:08 - INFO - codeparrot_training - Step 2161: {'lr': 0.0004980424806815918, 'samples': 1106944, 'steps': 2161, 'batch_loss/train': 1.0392022356390953} 12/21/2021 21:19:18 - INFO - codeparrot_training - Step 2162: {'lr': 0.0004980405114466172, 'samples': 1107456, 'steps': 2162, 'batch_loss/train': 0.8696748856455088} 12/21/2021 21:19:31 - INFO - codeparrot_training - Step 2163: {'lr': 0.0004980385412255271, 'samples': 1107968, 'steps': 2163, 'batch_loss/train': 0.9622868206351995} 12/21/2021 21:19:41 - INFO - codeparrot_training - Step 2164: {'lr': 0.0004980365700183294, 'samples': 1108480, 'steps': 2164, 'batch_loss/train': 0.8946727151051164} 12/21/2021 21:19:52 - INFO - codeparrot_training - Step 2165: {'lr': 0.000498034597825032, 'samples': 1108992, 'steps': 2165, 'batch_loss/train': 0.9010253064334393} 12/21/2021 21:20:04 - INFO - codeparrot_training - Step 2166: {'lr': 0.0004980326246456425, 'samples': 1109504, 'steps': 2166, 'batch_loss/train': 0.9960449505597353} 12/21/2021 21:20:15 - INFO - codeparrot_training - Step 2167: {'lr': 0.0004980306504801691, 'samples': 1110016, 'steps': 2167, 'batch_loss/train': 0.9025410525500774} 12/21/2021 21:20:26 - INFO - codeparrot_training - Step 2168: {'lr': 0.0004980286753286195, 'samples': 1110528, 'steps': 2168, 'batch_loss/train': 0.8581621199846268} 12/21/2021 21:20:36 - INFO - codeparrot_training - Step 2169: {'lr': 0.0004980266991910015, 'samples': 1111040, 'steps': 2169, 'batch_loss/train': 1.4633424987550825} 12/21/2021 21:20:49 - INFO - codeparrot_training - Step 2170: {'lr': 0.0004980247220673229, 'samples': 1111552, 'steps': 2170, 'batch_loss/train': 0.9213416995480657} 12/21/2021 21:20:59 - INFO - codeparrot_training - Step 2171: {'lr': 0.0004980227439575918, 'samples': 1112064, 'steps': 2171, 'batch_loss/train': 0.8921311385929585} 12/21/2021 21:21:10 - INFO - codeparrot_training - Step 2172: {'lr': 0.0004980207648618159, 'samples': 1112576, 'steps': 2172, 'batch_loss/train': 
0.8760981447994709} 12/21/2021 21:21:22 - INFO - codeparrot_training - Step 2173: {'lr': 0.0004980187847800031, 'samples': 1113088, 'steps': 2173, 'batch_loss/train': 0.8544832719489932} 12/21/2021 21:21:33 - INFO - codeparrot_training - Step 2174: {'lr': 0.0004980168037121612, 'samples': 1113600, 'steps': 2174, 'batch_loss/train': 0.9261558644939214} 12/21/2021 21:21:43 - INFO - codeparrot_training - Step 2175: {'lr': 0.0004980148216582981, 'samples': 1114112, 'steps': 2175, 'batch_loss/train': 0.9252141518518329} 12/21/2021 21:21:56 - INFO - codeparrot_training - Step 2176: {'lr': 0.0004980128386184218, 'samples': 1114624, 'steps': 2176, 'batch_loss/train': 0.8932766157668084} 12/21/2021 21:22:07 - INFO - codeparrot_training - Step 2177: {'lr': 0.0004980108545925401, 'samples': 1115136, 'steps': 2177, 'batch_loss/train': 0.8832616172730923} 12/21/2021 21:22:17 - INFO - codeparrot_training - Step 2178: {'lr': 0.0004980088695806608, 'samples': 1115648, 'steps': 2178, 'batch_loss/train': 0.8894687267020345} 12/21/2021 21:22:28 - INFO - codeparrot_training - Step 2179: {'lr': 0.000498006883582792, 'samples': 1116160, 'steps': 2179, 'batch_loss/train': 0.9565798686817288} 12/21/2021 21:22:41 - INFO - codeparrot_training - Step 2180: {'lr': 0.0004980048965989413, 'samples': 1116672, 'steps': 2180, 'batch_loss/train': 0.8366931579075754} 12/21/2021 21:22:51 - INFO - codeparrot_training - Step 2181: {'lr': 0.0004980029086291169, 'samples': 1117184, 'steps': 2181, 'batch_loss/train': 0.784460118971765} 12/21/2021 21:23:02 - INFO - codeparrot_training - Step 2182: {'lr': 0.0004980009196733266, 'samples': 1117696, 'steps': 2182, 'batch_loss/train': 0.8811569255776703} 12/21/2021 21:23:14 - INFO - codeparrot_training - Step 2183: {'lr': 0.0004979989297315782, 'samples': 1118208, 'steps': 2183, 'batch_loss/train': 0.887143530882895} 12/21/2021 21:23:25 - INFO - codeparrot_training - Step 2184: {'lr': 0.0004979969388038798, 'samples': 1118720, 'steps': 2184, 
'batch_loss/train': 0.8376783616840839} 12/21/2021 21:23:36 - INFO - codeparrot_training - Step 2185: {'lr': 0.0004979949468902392, 'samples': 1119232, 'steps': 2185, 'batch_loss/train': 1.4935764158144593} 12/21/2021 21:23:47 - INFO - codeparrot_training - Step 2186: {'lr': 0.0004979929539906643, 'samples': 1119744, 'steps': 2186, 'batch_loss/train': 0.9285049010068178} 12/21/2021 21:23:58 - INFO - codeparrot_training - Step 2187: {'lr': 0.0004979909601051629, 'samples': 1120256, 'steps': 2187, 'batch_loss/train': 0.8790778340771794} 12/21/2021 21:24:09 - INFO - codeparrot_training - Step 2188: {'lr': 0.0004979889652337433, 'samples': 1120768, 'steps': 2188, 'batch_loss/train': 0.9266867991536856} 12/21/2021 21:24:19 - INFO - codeparrot_training - Step 2189: {'lr': 0.0004979869693764131, 'samples': 1121280, 'steps': 2189, 'batch_loss/train': 0.8623130759224296} 12/21/2021 21:24:33 - INFO - codeparrot_training - Step 2190: {'lr': 0.0004979849725331804, 'samples': 1121792, 'steps': 2190, 'batch_loss/train': 0.917578674852848} 12/21/2021 21:24:44 - INFO - codeparrot_training - Step 2191: {'lr': 0.000497982974704053, 'samples': 1122304, 'steps': 2191, 'batch_loss/train': 0.800558096729219} 12/21/2021 21:24:54 - INFO - codeparrot_training - Step 2192: {'lr': 0.000497980975889039, 'samples': 1122816, 'steps': 2192, 'batch_loss/train': 0.9526922944933176} 12/21/2021 21:25:06 - INFO - codeparrot_training - Step 2193: {'lr': 0.0004979789760881463, 'samples': 1123328, 'steps': 2193, 'batch_loss/train': 0.89458985067904} 12/21/2021 21:25:17 - INFO - codeparrot_training - Step 2194: {'lr': 0.0004979769753013828, 'samples': 1123840, 'steps': 2194, 'batch_loss/train': 0.9124226216226816} 12/21/2021 21:25:27 - INFO - codeparrot_training - Step 2195: {'lr': 0.0004979749735287564, 'samples': 1124352, 'steps': 2195, 'batch_loss/train': 0.857542373239994} 12/21/2021 21:25:41 - INFO - codeparrot_training - Step 2196: {'lr': 0.0004979729707702752, 'samples': 1124864, 'steps': 2196, 
'batch_loss/train': 0.8491443232633173} 12/21/2021 21:25:51 - INFO - codeparrot_training - Step 2197: {'lr': 0.0004979709670259471, 'samples': 1125376, 'steps': 2197, 'batch_loss/train': 1.846842444036156} 12/21/2021 21:26:02 - INFO - codeparrot_training - Step 2198: {'lr': 0.0004979689622957802, 'samples': 1125888, 'steps': 2198, 'batch_loss/train': 0.6682246718555689} 12/21/2021 21:26:12 - INFO - codeparrot_training - Step 2199: {'lr': 0.000497966956579782, 'samples': 1126400, 'steps': 2199, 'batch_loss/train': 0.9120249319821596} 12/21/2021 21:26:25 - INFO - codeparrot_training - Step 2200: {'lr': 0.0004979649498779612, 'samples': 1126912, 'steps': 2200, 'batch_loss/train': 1.71440254105255} 12/21/2021 21:26:35 - INFO - codeparrot_training - Step 2201: {'lr': 0.0004979629421903251, 'samples': 1127424, 'steps': 2201, 'batch_loss/train': 0.9875760222785175} 12/21/2021 21:26:46 - INFO - codeparrot_training - Step 2202: {'lr': 0.000497960933516882, 'samples': 1127936, 'steps': 2202, 'batch_loss/train': 0.7724146912805736} 12/21/2021 21:26:58 - INFO - codeparrot_training - Step 2203: {'lr': 0.00049795892385764, 'samples': 1128448, 'steps': 2203, 'batch_loss/train': 0.8396572517231107} 12/21/2021 21:27:08 - INFO - codeparrot_training - Step 2204: {'lr': 0.0004979569132126069, 'samples': 1128960, 'steps': 2204, 'batch_loss/train': 0.8988684583455324} 12/21/2021 21:27:19 - INFO - codeparrot_training - Step 2205: {'lr': 0.0004979549015817907, 'samples': 1129472, 'steps': 2205, 'batch_loss/train': 0.8194378204643726} 12/21/2021 21:27:32 - INFO - codeparrot_training - Step 2206: {'lr': 0.0004979528889651994, 'samples': 1129984, 'steps': 2206, 'batch_loss/train': 0.9286805298179388} 12/21/2021 21:27:43 - INFO - codeparrot_training - Step 2207: {'lr': 0.0004979508753628411, 'samples': 1130496, 'steps': 2207, 'batch_loss/train': 0.7505495395162143} 12/21/2021 21:27:53 - INFO - codeparrot_training - Step 2208: {'lr': 0.0004979488607747238, 'samples': 1131008, 'steps': 2208, 
'batch_loss/train': 0.8061871505342424} 12/21/2021 21:28:04 - INFO - codeparrot_training - Step 2209: {'lr': 0.0004979468452008554, 'samples': 1131520, 'steps': 2209, 'batch_loss/train': 0.7535290666855872} 12/21/2021 21:28:16 - INFO - codeparrot_training - Step 2210: {'lr': 0.0004979448286412438, 'samples': 1132032, 'steps': 2210, 'batch_loss/train': 1.0124948769807816} 12/21/2021 21:28:27 - INFO - codeparrot_training - Step 2211: {'lr': 0.0004979428110958974, 'samples': 1132544, 'steps': 2211, 'batch_loss/train': 0.8996817008592188} 12/21/2021 21:28:37 - INFO - codeparrot_training - Step 2212: {'lr': 0.0004979407925648239, 'samples': 1133056, 'steps': 2212, 'batch_loss/train': 0.9879712192341685} 12/21/2021 21:28:49 - INFO - codeparrot_training - Step 2213: {'lr': 0.0004979387730480314, 'samples': 1133568, 'steps': 2213, 'batch_loss/train': 0.8593467073515058} 12/21/2021 21:29:00 - INFO - codeparrot_training - Step 2214: {'lr': 0.000497936752545528, 'samples': 1134080, 'steps': 2214, 'batch_loss/train': 0.9214249616488814} 12/21/2021 21:29:10 - INFO - codeparrot_training - Step 2215: {'lr': 0.0004979347310573217, 'samples': 1134592, 'steps': 2215, 'batch_loss/train': 0.814708998426795} 12/21/2021 21:29:23 - INFO - codeparrot_training - Step 2216: {'lr': 0.0004979327085834205, 'samples': 1135104, 'steps': 2216, 'batch_loss/train': 0.8611375791952014} 12/21/2021 21:29:34 - INFO - codeparrot_training - Step 2217: {'lr': 0.0004979306851238325, 'samples': 1135616, 'steps': 2217, 'batch_loss/train': 0.8968394612893462} 12/21/2021 21:29:44 - INFO - codeparrot_training - Step 2218: {'lr': 0.0004979286606785657, 'samples': 1136128, 'steps': 2218, 'batch_loss/train': 0.9267248502001166} 12/21/2021 21:29:55 - INFO - codeparrot_training - Step 2219: {'lr': 0.0004979266352476281, 'samples': 1136640, 'steps': 2219, 'batch_loss/train': 0.9810216510668397} 12/21/2021 21:30:07 - INFO - codeparrot_training - Step 2220: {'lr': 0.0004979246088310279, 'samples': 1137152, 'steps': 
2220, 'batch_loss/train': 0.8797378921881318} 12/21/2021 21:30:17 - INFO - codeparrot_training - Step 2221: {'lr': 0.0004979225814287731, 'samples': 1137664, 'steps': 2221, 'batch_loss/train': 0.8825861206278205} 12/21/2021 21:30:28 - INFO - codeparrot_training - Step 2222: {'lr': 0.0004979205530408716, 'samples': 1138176, 'steps': 2222, 'batch_loss/train': 0.8596117487177253} 12/21/2021 21:30:40 - INFO - codeparrot_training - Step 2223: {'lr': 0.0004979185236673317, 'samples': 1138688, 'steps': 2223, 'batch_loss/train': 0.7715544956736267} 12/21/2021 21:30:51 - INFO - codeparrot_training - Step 2224: {'lr': 0.0004979164933081613, 'samples': 1139200, 'steps': 2224, 'batch_loss/train': 0.6525041048880666} 12/21/2021 21:31:02 - INFO - codeparrot_training - Step 2225: {'lr': 0.0004979144619633685, 'samples': 1139712, 'steps': 2225, 'batch_loss/train': 0.6971809483366087} 12/21/2021 21:31:15 - INFO - codeparrot_training - Step 2226: {'lr': 0.0004979124296329615, 'samples': 1140224, 'steps': 2226, 'batch_loss/train': 1.041741375811398} 12/21/2021 21:31:25 - INFO - codeparrot_training - Step 2227: {'lr': 0.0004979103963169482, 'samples': 1140736, 'steps': 2227, 'batch_loss/train': 0.709704970009625} 12/21/2021 21:31:36 - INFO - codeparrot_training - Step 2228: {'lr': 0.0004979083620153368, 'samples': 1141248, 'steps': 2228, 'batch_loss/train': 0.9767834870144725} 12/21/2021 21:31:46 - INFO - codeparrot_training - Step 2229: {'lr': 0.0004979063267281354, 'samples': 1141760, 'steps': 2229, 'batch_loss/train': 0.859210027847439} 12/21/2021 21:31:58 - INFO - codeparrot_training - Step 2230: {'lr': 0.000497904290455352, 'samples': 1142272, 'steps': 2230, 'batch_loss/train': 0.8877223068848252} 12/21/2021 21:32:09 - INFO - codeparrot_training - Step 2231: {'lr': 0.0004979022531969948, 'samples': 1142784, 'steps': 2231, 'batch_loss/train': 0.9415205167606473} 12/21/2021 21:32:20 - INFO - codeparrot_training - Step 2232: {'lr': 0.0004979002149530718, 'samples': 1143296, 'steps': 
2232, 'batch_loss/train': 0.8467596904374659}
12/21/2021 21:32:32 - INFO - codeparrot_training - Step 2233: {'lr': 0.0004978981757235912, 'samples': 1143808, 'steps': 2233, 'batch_loss/train': 0.8203006573021412}
12/21/2021 21:32:42 - INFO - codeparrot_training - Step 2234: {'lr': 0.000497896135508561, 'samples': 1144320, 'steps': 2234, 'batch_loss/train': 0.9310867032036185}
12/21/2021 21:32:53 - INFO - codeparrot_training - Step 2235: {'lr': 0.0004978940943079894, 'samples': 1144832, 'steps': 2235, 'batch_loss/train': 0.8537202458828688}
12/21/2021 21:33:06 - INFO - codeparrot_training - Step 2236: {'lr': 0.0004978920521218845, 'samples': 1145344, 'steps': 2236, 'batch_loss/train': 0.8510296512395144}
12/21/2021 21:33:16 - INFO - codeparrot_training - Step 2237: {'lr': 0.0004978900089502543, 'samples': 1145856, 'steps': 2237, 'batch_loss/train': 0.7684935056604445}
12/21/2021 21:33:27 - INFO - codeparrot_training - Step 2238: {'lr': 0.0004978879647931072, 'samples': 1146368, 'steps': 2238, 'batch_loss/train': 0.8555542253889143}
12/21/2021 21:33:38 - INFO - codeparrot_training - Step 2239: {'lr': 0.000497885919650451, 'samples': 1146880, 'steps': 2239, 'batch_loss/train': 0.8755789836868644}
12/21/2021 21:33:50 - INFO - codeparrot_training - Step 2240: {'lr': 0.0004978838735222941, 'samples': 1147392, 'steps': 2240, 'batch_loss/train': 0.7839141879230738}
12/21/2021 21:34:00 - INFO - codeparrot_training - Step 2241: {'lr': 0.0004978818264086444, 'samples': 1147904, 'steps': 2241, 'batch_loss/train': 0.8618870414793491}
12/21/2021 21:34:11 - INFO - codeparrot_training - Step 2242: {'lr': 0.0004978797783095102, 'samples': 1148416, 'steps': 2242, 'batch_loss/train': 0.8104640068486333}
12/21/2021 21:34:23 - INFO - codeparrot_training - Step 2243: {'lr': 0.0004978777292248996, 'samples': 1148928, 'steps': 2243, 'batch_loss/train': 0.8643834972754121}
12/21/2021 21:34:34 - INFO - codeparrot_training - Step 2244: {'lr': 0.0004978756791548207, 'samples': 1149440, 'steps': 2244, 'batch_loss/train': 0.8470911365002394}
12/21/2021 21:34:45 - INFO - codeparrot_training - Step 2245: {'lr': 0.0004978736280992817, 'samples': 1149952, 'steps': 2245, 'batch_loss/train': 0.8103189882822335}
12/21/2021 21:34:55 - INFO - codeparrot_training - Step 2246: {'lr': 0.0004978715760582908, 'samples': 1150464, 'steps': 2246, 'batch_loss/train': 0.8458332307636738}
12/21/2021 21:35:07 - INFO - codeparrot_training - Step 2247: {'lr': 0.0004978695230318561, 'samples': 1150976, 'steps': 2247, 'batch_loss/train': 0.8244990287348628}
12/21/2021 21:35:18 - INFO - codeparrot_training - Step 2248: {'lr': 0.0004978674690199858, 'samples': 1151488, 'steps': 2248, 'batch_loss/train': 0.9097336670383811}
12/21/2021 21:35:28 - INFO - codeparrot_training - Step 2249: {'lr': 0.0004978654140226879, 'samples': 1152000, 'steps': 2249, 'batch_loss/train': 0.9904982983134687}
12/21/2021 21:35:40 - INFO - codeparrot_training - Step 2250: {'lr': 0.0004978633580399708, 'samples': 1152512, 'steps': 2250, 'batch_loss/train': 0.8670511143282056}
12/21/2021 21:35:51 - INFO - codeparrot_training - Step 2251: {'lr': 0.0004978613010718426, 'samples': 1153024, 'steps': 2251, 'batch_loss/train': 0.8320633759722114}
12/21/2021 21:36:02 - INFO - codeparrot_training - Step 2252: {'lr': 0.0004978592431183114, 'samples': 1153536, 'steps': 2252, 'batch_loss/train': 0.8419060765299946}
12/21/2021 21:36:14 - INFO - codeparrot_training - Step 2253: {'lr': 0.0004978571841793855, 'samples': 1154048, 'steps': 2253, 'batch_loss/train': 0.902762464247644}
12/21/2021 21:36:25 - INFO - codeparrot_training - Step 2254: {'lr': 0.000497855124255073, 'samples': 1154560, 'steps': 2254, 'batch_loss/train': 0.8537281826138496}
12/21/2021 21:36:35 - INFO - codeparrot_training - Step 2255: {'lr': 0.0004978530633453821, 'samples': 1155072, 'steps': 2255, 'batch_loss/train': 0.9316979879513383}
12/21/2021 21:36:47 - INFO - codeparrot_training - Step 2256: {'lr': 0.000497851001450321, 'samples': 1155584, 'steps': 2256, 'batch_loss/train': 0.8397816433571279}
12/21/2021 21:36:58 - INFO - codeparrot_training - Step 2257: {'lr': 0.000497848938569898, 'samples': 1156096, 'steps': 2257, 'batch_loss/train': 0.9209824232384562}
12/21/2021 21:37:09 - INFO - codeparrot_training - Step 2258: {'lr': 0.000497846874704121, 'samples': 1156608, 'steps': 2258, 'batch_loss/train': 0.9315807013772428}
12/21/2021 21:37:19 - INFO - codeparrot_training - Step 2259: {'lr': 0.0004978448098529986, 'samples': 1157120, 'steps': 2259, 'batch_loss/train': 0.8902814593166113}
12/21/2021 21:37:31 - INFO - codeparrot_training - Step 2260: {'lr': 0.0004978427440165388, 'samples': 1157632, 'steps': 2260, 'batch_loss/train': 0.8946724468842149}
12/21/2021 21:37:42 - INFO - codeparrot_training - Step 2261: {'lr': 0.0004978406771947496, 'samples': 1158144, 'steps': 2261, 'batch_loss/train': 0.8967864457517862}
12/21/2021 21:37:52 - INFO - codeparrot_training - Step 2262: {'lr': 0.0004978386093876396, 'samples': 1158656, 'steps': 2262, 'batch_loss/train': 0.883538038469851}
12/21/2021 21:38:04 - INFO - codeparrot_training - Step 2263: {'lr': 0.000497836540595217, 'samples': 1159168, 'steps': 2263, 'batch_loss/train': 0.9213018584996462}
12/21/2021 21:38:15 - INFO - codeparrot_training - Step 2264: {'lr': 0.0004978344708174898, 'samples': 1159680, 'steps': 2264, 'batch_loss/train': 0.9337521586567163}
12/21/2021 21:38:26 - INFO - codeparrot_training - Step 2265: {'lr': 0.0004978324000544663, 'samples': 1160192, 'steps': 2265, 'batch_loss/train': 0.856205290183425}
12/21/2021 21:38:39 - INFO - codeparrot_training - Step 2266: {'lr': 0.0004978303283061547, 'samples': 1160704, 'steps': 2266, 'batch_loss/train': 0.8010878721252084}
12/21/2021 21:38:50 - INFO - codeparrot_training - Step 2267: {'lr': 0.0004978282555725634, 'samples': 1161216, 'steps': 2267, 'batch_loss/train': 2.1426543900743127}
12/21/2021 21:39:00 - INFO - codeparrot_training - Step 2268: {'lr': 0.0004978261818537005, 'samples': 1161728, 'steps': 2268, 'batch_loss/train': 0.9220291841775179}
12/21/2021 21:39:11 - INFO - codeparrot_training - Step 2269: {'lr': 0.0004978241071495743, 'samples': 1162240, 'steps': 2269, 'batch_loss/train': 0.8712067613378167}
12/21/2021 21:39:23 - INFO - codeparrot_training - Step 2270: {'lr': 0.000497822031460193, 'samples': 1162752, 'steps': 2270, 'batch_loss/train': 0.8895259560085833}
12/21/2021 21:39:34 - INFO - codeparrot_training - Step 2271: {'lr': 0.0004978199547855649, 'samples': 1163264, 'steps': 2271, 'batch_loss/train': 1.0714044999331236}
12/21/2021 21:39:44 - INFO - codeparrot_training - Step 2272: {'lr': 0.0004978178771256981, 'samples': 1163776, 'steps': 2272, 'batch_loss/train': 0.9201099090278149}
12/21/2021 21:39:56 - INFO - codeparrot_training - Step 2273: {'lr': 0.0004978157984806011, 'samples': 1164288, 'steps': 2273, 'batch_loss/train': 0.7963182115927339}
12/21/2021 21:40:07 - INFO - codeparrot_training - Step 2274: {'lr': 0.0004978137188502821, 'samples': 1164800, 'steps': 2274, 'batch_loss/train': 0.8720248723402619}
12/21/2021 21:40:18 - INFO - codeparrot_training - Step 2275: {'lr': 0.0004978116382347493, 'samples': 1165312, 'steps': 2275, 'batch_loss/train': 0.8976620621979237}
12/21/2021 21:40:30 - INFO - codeparrot_training - Step 2276: {'lr': 0.0004978095566340109, 'samples': 1165824, 'steps': 2276, 'batch_loss/train': 0.8699776418507099}
12/21/2021 21:40:41 - INFO - codeparrot_training - Step 2277: {'lr': 0.0004978074740480754, 'samples': 1166336, 'steps': 2277, 'batch_loss/train': 0.8808933347463608}
12/21/2021 21:40:52 - INFO - codeparrot_training - Step 2278: {'lr': 0.0004978053904769508, 'samples': 1166848, 'steps': 2278, 'batch_loss/train': 0.9219242474064231}
12/21/2021 21:41:02 - INFO - codeparrot_training - Step 2279: {'lr': 0.0004978033059206457, 'samples': 1167360, 'steps': 2279, 'batch_loss/train': 0.815322594717145}
12/21/2021 21:41:14 - INFO - codeparrot_training - Step 2280: {'lr': 0.0004978012203791681, 'samples': 1167872, 'steps': 2280, 'batch_loss/train': 0.8727032486349344}
12/21/2021 21:41:25 - INFO - codeparrot_training - Step 2281: {'lr': 0.0004977991338525266, 'samples': 1168384, 'steps': 2281, 'batch_loss/train': 0.8825947418808937}
12/21/2021 21:41:35 - INFO - codeparrot_training - Step 2282: {'lr': 0.0004977970463407291, 'samples': 1168896, 'steps': 2282, 'batch_loss/train': 0.9200711483135819}
12/21/2021 21:41:48 - INFO - codeparrot_training - Step 2283: {'lr': 0.0004977949578437842, 'samples': 1169408, 'steps': 2283, 'batch_loss/train': 0.9378671711310744}
12/21/2021 21:41:59 - INFO - codeparrot_training - Step 2284: {'lr': 0.0004977928683617, 'samples': 1169920, 'steps': 2284, 'batch_loss/train': 1.332573532126844}
12/21/2021 21:42:09 - INFO - codeparrot_training - Step 2285: {'lr': 0.0004977907778944851, 'samples': 1170432, 'steps': 2285, 'batch_loss/train': 1.3997020409442484}
12/21/2021 21:42:21 - INFO - codeparrot_training - Step 2286: {'lr': 0.0004977886864421475, 'samples': 1170944, 'steps': 2286, 'batch_loss/train': 0.8744787750765681}
12/21/2021 21:42:32 - INFO - codeparrot_training - Step 2287: {'lr': 0.0004977865940046958, 'samples': 1171456, 'steps': 2287, 'batch_loss/train': 0.8960451623424888}
12/21/2021 21:42:42 - INFO - codeparrot_training - Step 2288: {'lr': 0.000497784500582138, 'samples': 1171968, 'steps': 2288, 'batch_loss/train': 0.837315577082336}
12/21/2021 21:42:53 - INFO - codeparrot_training - Step 2289: {'lr': 0.0004977824061744826, 'samples': 1172480, 'steps': 2289, 'batch_loss/train': 0.8370909532532096}
12/21/2021 21:43:05 - INFO - codeparrot_training - Step 2290: {'lr': 0.0004977803107817379, 'samples': 1172992, 'steps': 2290, 'batch_loss/train': 0.8417616076767445}
12/21/2021 21:43:16 - INFO - codeparrot_training - Step 2291: {'lr': 0.0004977782144039123, 'samples': 1173504, 'steps': 2291, 'batch_loss/train': 0.8052168441936374}
12/21/2021 21:43:26 - INFO - codeparrot_training - Step 2292: {'lr': 0.0004977761170410142, 'samples': 1174016, 'steps': 2292, 'batch_loss/train': 0.8740883832797408}
12/21/2021 21:43:39 - INFO - codeparrot_training - Step 2293: {'lr': 0.0004977740186930515, 'samples': 1174528, 'steps': 2293, 'batch_loss/train': 0.8501850543543696}
12/21/2021 21:43:50 - INFO - codeparrot_training - Step 2294: {'lr': 0.000497771919360033, 'samples': 1175040, 'steps': 2294, 'batch_loss/train': 0.7779655768536031}
12/21/2021 21:44:00 - INFO - codeparrot_training - Step 2295: {'lr': 0.0004977698190419671, 'samples': 1175552, 'steps': 2295, 'batch_loss/train': 0.8492563883773983}
12/21/2021 21:44:12 - INFO - codeparrot_training - Step 2296: {'lr': 0.0004977677177388617, 'samples': 1176064, 'steps': 2296, 'batch_loss/train': 0.9158772830851376}
12/21/2021 21:44:23 - INFO - codeparrot_training - Step 2297: {'lr': 0.0004977656154507255, 'samples': 1176576, 'steps': 2297, 'batch_loss/train': 0.8607986383140087}
12/21/2021 21:44:34 - INFO - codeparrot_training - Step 2298: {'lr': 0.0004977635121775668, 'samples': 1177088, 'steps': 2298, 'batch_loss/train': 0.9793339476455003}
12/21/2021 21:44:44 - INFO - codeparrot_training - Step 2299: {'lr': 0.0004977614079193939, 'samples': 1177600, 'steps': 2299, 'batch_loss/train': 0.789015160407871}
12/21/2021 21:44:57 - INFO - codeparrot_training - Step 2300: {'lr': 0.0004977593026762152, 'samples': 1178112, 'steps': 2300, 'batch_loss/train': 0.947744095697999}
12/21/2021 21:45:08 - INFO - codeparrot_training - Step 2301: {'lr': 0.0004977571964480391, 'samples': 1178624, 'steps': 2301, 'batch_loss/train': 0.8814868815243244}
12/21/2021 21:45:18 - INFO - codeparrot_training - Step 2302: {'lr': 0.0004977550892348738, 'samples': 1179136, 'steps': 2302, 'batch_loss/train': 0.9349094778299332}
12/21/2021 21:45:30 - INFO - codeparrot_training - Step 2303: {'lr': 0.0004977529810367279, 'samples': 1179648, 'steps': 2303, 'batch_loss/train': 0.8994383979588747}
12/21/2021 21:45:41 - INFO - codeparrot_training - Step 2304: {'lr': 0.0004977508718536097, 'samples': 1180160, 'steps': 2304, 'batch_loss/train': 0.9291807231493294}
12/21/2021 21:45:51 - INFO - codeparrot_training - Step 2305: {'lr': 0.0004977487616855276, 'samples': 1180672, 'steps': 2305, 'batch_loss/train': 0.8446083217859268}
12/21/2021 21:46:03 - INFO - codeparrot_training - Step 2306: {'lr': 0.0004977466505324899, 'samples': 1181184, 'steps': 2306, 'batch_loss/train': 0.9765346995554864}
12/21/2021 21:46:14 - INFO - codeparrot_training - Step 2307: {'lr': 0.0004977445383945051, 'samples': 1181696, 'steps': 2307, 'batch_loss/train': 0.8346361583098769}
12/21/2021 21:46:24 - INFO - codeparrot_training - Step 2308: {'lr': 0.0004977424252715816, 'samples': 1182208, 'steps': 2308, 'batch_loss/train': 0.9940981734544039}
12/21/2021 21:46:37 - INFO - codeparrot_training - Step 2309: {'lr': 0.0004977403111637278, 'samples': 1182720, 'steps': 2309, 'batch_loss/train': 0.823179142549634}
12/21/2021 21:46:47 - INFO - codeparrot_training - Step 2310: {'lr': 0.000497738196070952, 'samples': 1183232, 'steps': 2310, 'batch_loss/train': 0.9323650067672133}
12/21/2021 21:46:58 - INFO - codeparrot_training - Step 2311: {'lr': 0.0004977360799932627, 'samples': 1183744, 'steps': 2311, 'batch_loss/train': 1.1666291635483503}
12/21/2021 21:47:09 - INFO - codeparrot_training - Step 2312: {'lr': 0.0004977339629306682, 'samples': 1184256, 'steps': 2312, 'batch_loss/train': 0.8491664617322385}
12/21/2021 21:47:21 - INFO - codeparrot_training - Step 2313: {'lr': 0.000497731844883177, 'samples': 1184768, 'steps': 2313, 'batch_loss/train': 0.7647358803078532}
12/21/2021 21:47:32 - INFO - codeparrot_training - Step 2314: {'lr': 0.0004977297258507977, 'samples': 1185280, 'steps': 2314, 'batch_loss/train': 0.8725790660828352}
12/21/2021 21:47:43 - INFO - codeparrot_training - Step 2315: {'lr': 0.0004977276058335384, 'samples': 1185792, 'steps': 2315, 'batch_loss/train': 0.8600598387420177}
12/21/2021 21:47:55 - INFO - codeparrot_training - Step 2316: {'lr': 0.0004977254848314077, 'samples': 1186304, 'steps': 2316, 'batch_loss/train': 0.8896271474659443}
12/21/2021 21:48:05 - INFO - codeparrot_training - Step 2317: {'lr': 0.0004977233628444141, 'samples': 1186816, 'steps': 2317, 'batch_loss/train': 0.8423408018425107}
12/21/2021 21:48:16 - INFO - codeparrot_training - Step 2318: {'lr': 0.0004977212398725658, 'samples': 1187328, 'steps': 2318, 'batch_loss/train': 0.8389368960633874}
12/21/2021 21:48:26 - INFO - codeparrot_training - Step 2319: {'lr': 0.0004977191159158715, 'samples': 1187840, 'steps': 2319, 'batch_loss/train': 0.8435713793151081}
12/21/2021 21:48:39 - INFO - codeparrot_training - Step 2320: {'lr': 0.0004977169909743395, 'samples': 1188352, 'steps': 2320, 'batch_loss/train': 0.8436617329716682}
12/21/2021 21:48:49 - INFO - codeparrot_training - Step 2321: {'lr': 0.0004977148650479782, 'samples': 1188864, 'steps': 2321, 'batch_loss/train': 0.8810571497306228}
12/21/2021 21:49:00 - INFO - codeparrot_training - Step 2322: {'lr': 0.0004977127381367962, 'samples': 1189376, 'steps': 2322, 'batch_loss/train': 0.6939816528465599}
12/21/2021 21:49:13 - INFO - codeparrot_training - Step 2323: {'lr': 0.0004977106102408019, 'samples': 1189888, 'steps': 2323, 'batch_loss/train': 0.7772838626988232}
12/21/2021 21:49:23 - INFO - codeparrot_training - Step 2324: {'lr': 0.0004977084813600037, 'samples': 1190400, 'steps': 2324, 'batch_loss/train': 0.8892768668010831}
12/21/2021 21:49:34 - INFO - codeparrot_training - Step 2325: {'lr': 0.0004977063514944102, 'samples': 1190912, 'steps': 2325, 'batch_loss/train': 0.8198740109801292}
12/21/2021 21:49:46 - INFO - codeparrot_training - Step 2326: {'lr': 0.0004977042206440297, 'samples': 1191424, 'steps': 2326, 'batch_loss/train': 1.145355949178338}
12/21/2021 21:49:57 - INFO - codeparrot_training - Step 2327: {'lr': 0.0004977020888088708, 'samples': 1191936, 'steps': 2327, 'batch_loss/train': 1.0620467979460955}
12/21/2021 21:50:07 - INFO - codeparrot_training - Step 2328: {'lr': 0.0004976999559889419, 'samples': 1192448, 'steps': 2328, 'batch_loss/train': 0.9247380411252379}
12/21/2021 21:50:18 - INFO - codeparrot_training - Step 2329: {'lr': 0.0004976978221842514, 'samples': 1192960, 'steps': 2329, 'batch_loss/train': 0.7741743242368102}
12/21/2021 21:50:31 - INFO - codeparrot_training - Step 2330: {'lr': 0.000497695687394808, 'samples': 1193472, 'steps': 2330, 'batch_loss/train': 0.8631991744041443}
12/21/2021 21:50:41 - INFO - codeparrot_training - Step 2331: {'lr': 0.0004976935516206201, 'samples': 1193984, 'steps': 2331, 'batch_loss/train': 0.8145364029332995}
12/21/2021 21:50:52 - INFO - codeparrot_training - Step 2332: {'lr': 0.0004976914148616961, 'samples': 1194496, 'steps': 2332, 'batch_loss/train': 0.9249777430668473}
12/21/2021 21:51:04 - INFO - codeparrot_training - Step 2333: {'lr': 0.0004976892771180446, 'samples': 1195008, 'steps': 2333, 'batch_loss/train': 0.9417273616418242}
12/21/2021 21:51:14 - INFO - codeparrot_training - Step 2334: {'lr': 0.000497687138389674, 'samples': 1195520, 'steps': 2334, 'batch_loss/train': 0.8532694061286747}
12/21/2021 21:51:25 - INFO - codeparrot_training - Step 2335: {'lr': 0.0004976849986765929, 'samples': 1196032, 'steps': 2335, 'batch_loss/train': 0.871632250957191}
12/21/2021 21:51:37 - INFO - codeparrot_training - Step 2336: {'lr': 0.0004976828579788098, 'samples': 1196544, 'steps': 2336, 'batch_loss/train': 0.91069849813357}
12/21/2021 21:51:48 - INFO - codeparrot_training - Step 2337: {'lr': 0.000497680716296333, 'samples': 1197056, 'steps': 2337, 'batch_loss/train': 0.8804760938510299}
12/21/2021 21:51:58 - INFO - codeparrot_training - Step 2338: {'lr': 0.0004976785736291714, 'samples': 1197568, 'steps': 2338, 'batch_loss/train': 0.9022408267483115}
12/21/2021 21:52:11 - INFO - codeparrot_training - Step 2339: {'lr': 0.0004976764299773332, 'samples': 1198080, 'steps': 2339, 'batch_loss/train': 0.9523665346205235}
12/21/2021 21:52:22 - INFO - codeparrot_training - Step 2340: {'lr': 0.0004976742853408272, 'samples': 1198592, 'steps': 2340, 'batch_loss/train': 0.934666195884347}
12/21/2021 21:52:32 - INFO - codeparrot_training - Step 2341: {'lr': 0.0004976721397196615, 'samples': 1199104, 'steps': 2341, 'batch_loss/train': 0.8503328179940581}
12/21/2021 21:52:43 - INFO - codeparrot_training - Step 2342: {'lr': 0.0004976699931138451, 'samples': 1199616, 'steps': 2342, 'batch_loss/train': 0.9063902031630278}
12/21/2021 21:52:55 - INFO - codeparrot_training - Step 2343: {'lr': 0.0004976678455233861, 'samples': 1200128, 'steps': 2343, 'batch_loss/train': 0.9257733291015029}
12/21/2021 21:53:06 - INFO - codeparrot_training - Step 2344: {'lr': 0.0004976656969482936, 'samples': 1200640, 'steps': 2344, 'batch_loss/train': 0.923200418241322}
12/21/2021 21:53:16 - INFO - codeparrot_training - Step 2345: {'lr': 0.0004976635473885755, 'samples': 1201152, 'steps': 2345, 'batch_loss/train': 0.9614932350814342}
12/21/2021 21:53:29 - INFO - codeparrot_training - Step 2346: {'lr': 0.0004976613968442408, 'samples': 1201664, 'steps': 2346, 'batch_loss/train': 1.7225259644910693}
12/21/2021 21:53:39 - INFO - codeparrot_training - Step 2347: {'lr': 0.0004976592453152978, 'samples': 1202176, 'steps': 2347, 'batch_loss/train': 0.874820230063051}
12/21/2021 21:53:50 - INFO - codeparrot_training - Step 2348: {'lr': 0.0004976570928017553, 'samples': 1202688, 'steps': 2348, 'batch_loss/train': 0.9124313183128834}
12/21/2021 21:54:02 - INFO - codeparrot_training - Step 2349: {'lr': 0.0004976549393036216, 'samples': 1203200, 'steps': 2349, 'batch_loss/train': 0.9434279901906848}
12/21/2021 21:54:12 - INFO - codeparrot_training - Step 2350: {'lr': 0.0004976527848209053, 'samples': 1203712, 'steps': 2350, 'batch_loss/train': 0.8958991011604667}
12/21/2021 21:54:23 - INFO - codeparrot_training - Step 2351: {'lr': 0.0004976506293536153, 'samples': 1204224, 'steps': 2351, 'batch_loss/train': 0.9379781316965818}
12/21/2021 21:54:34 - INFO - codeparrot_training - Step 2352: {'lr': 0.0004976484729017597, 'samples': 1204736, 'steps': 2352, 'batch_loss/train': 0.9066098197363317}
12/21/2021 21:54:46 - INFO - codeparrot_training - Step 2353: {'lr': 0.0004976463154653473, 'samples': 1205248, 'steps': 2353, 'batch_loss/train': 0.9837380452081561}
12/21/2021 21:54:57 - INFO - codeparrot_training - Step 2354: {'lr': 0.0004976441570443867, 'samples': 1205760, 'steps': 2354, 'batch_loss/train': 0.9686786141246557}
12/21/2021 21:55:07 - INFO - codeparrot_training - Step 2355: {'lr': 0.0004976419976388864, 'samples': 1206272, 'steps': 2355, 'batch_loss/train': 0.9933761283755302}
12/21/2021 21:55:20 - INFO - codeparrot_training - Step 2356: {'lr': 0.000497639837248855, 'samples': 1206784, 'steps': 2356, 'batch_loss/train': 0.9119098447263241}
12/21/2021 21:55:30 - INFO - codeparrot_training - Step 2357: {'lr': 0.0004976376758743012, 'samples': 1207296, 'steps': 2357, 'batch_loss/train': 0.9667276339605451}
12/21/2021 21:55:41 - INFO - codeparrot_training - Step 2358: {'lr': 0.0004976355135152334, 'samples': 1207808, 'steps': 2358, 'batch_loss/train': 0.9433298241347075}
12/21/2021 21:55:53 - INFO - codeparrot_training - Step 2359: {'lr': 0.0004976333501716605, 'samples': 1208320, 'steps': 2359, 'batch_loss/train': 0.8725891700014472}
12/21/2021 21:56:03 - INFO - codeparrot_training - Step 2360: {'lr': 0.0004976311858435906, 'samples': 1208832, 'steps': 2360, 'batch_loss/train': 0.9047106020152569}
12/21/2021 21:56:14 - INFO - codeparrot_training - Step 2361: {'lr': 0.0004976290205310329, 'samples': 1209344, 'steps': 2361, 'batch_loss/train': 0.9422816019505262}
12/21/2021 21:56:24 - INFO - codeparrot_training - Step 2362: {'lr': 0.0004976268542339955, 'samples': 1209856, 'steps': 2362, 'batch_loss/train': 0.8470016540959477}
12/21/2021 21:56:37 - INFO - codeparrot_training - Step 2363: {'lr': 0.0004976246869524873, 'samples': 1210368, 'steps': 2363, 'batch_loss/train': 0.970789322629571}
12/21/2021 21:56:48 - INFO - codeparrot_training - Step 2364: {'lr': 0.0004976225186865169, 'samples': 1210880, 'steps': 2364, 'batch_loss/train': 1.028847231529653}
12/21/2021 21:56:59 - INFO - codeparrot_training - Step 2365: {'lr': 0.0004976203494360929, 'samples': 1211392, 'steps': 2365, 'batch_loss/train': 0.8853037673979998}
12/21/2021 21:57:11 - INFO - codeparrot_training - Step 2366: {'lr': 0.0004976181792012238, 'samples': 1211904, 'steps': 2366, 'batch_loss/train': 0.8265939480625093}
12/21/2021 21:57:21 - INFO - codeparrot_training - Step 2367: {'lr': 0.0004976160079819183, 'samples': 1212416, 'steps': 2367, 'batch_loss/train': 0.8612535633146763}
12/21/2021 21:57:32 - INFO - codeparrot_training - Step 2368: {'lr': 0.000497613835778185, 'samples': 1212928, 'steps': 2368, 'batch_loss/train': 0.9739093212410808}
12/21/2021 21:57:44 - INFO - codeparrot_training - Step 2369: {'lr': 0.0004976116625900327, 'samples': 1213440, 'steps': 2369, 'batch_loss/train': 0.86505445279181}
12/21/2021 21:57:55 - INFO - codeparrot_training - Step 2370: {'lr': 0.0004976094884174699, 'samples': 1213952, 'steps': 2370, 'batch_loss/train': 0.9953653095290065}
12/21/2021 21:58:06 - INFO - codeparrot_training - Step 2371: {'lr': 0.0004976073132605053, 'samples': 1214464, 'steps': 2371, 'batch_loss/train': 0.9579923544079065}
12/21/2021 21:58:18 - INFO - codeparrot_training - Step 2372: {'lr': 0.0004976051371191474, 'samples': 1214976, 'steps': 2372, 'batch_loss/train': 0.8554048575460911}
12/21/2021 21:58:28 - INFO - codeparrot_training - Step 2373: {'lr': 0.0004976029599934051, 'samples': 1215488, 'steps': 2373, 'batch_loss/train': 0.9077753610908985}
12/21/2021 21:58:39 - INFO - codeparrot_training - Step 2374: {'lr': 0.0004976007818832868, 'samples': 1216000, 'steps': 2374, 'batch_loss/train': 1.0730564780533314}
12/21/2021 21:58:49 - INFO - codeparrot_training - Step 2375: {'lr': 0.0004975986027888013, 'samples': 1216512, 'steps': 2375, 'batch_loss/train': 0.8591835740953684}
12/21/2021 21:59:01 - INFO - codeparrot_training - Step 2376: {'lr': 0.0004975964227099573, 'samples': 1217024, 'steps': 2376, 'batch_loss/train': 1.1030665198341012}
12/21/2021 21:59:12 - INFO - codeparrot_training - Step 2377: {'lr': 0.0004975942416467633, 'samples': 1217536, 'steps': 2377, 'batch_loss/train': 0.8398629724979401}
12/21/2021 21:59:23 - INFO - codeparrot_training - Step 2378: {'lr': 0.0004975920595992283, 'samples': 1218048, 'steps': 2378, 'batch_loss/train': 0.8421083074063063}
12/21/2021 21:59:36 - INFO - codeparrot_training - Step 2379: {'lr': 0.0004975898765673607, 'samples': 1218560, 'steps': 2379, 'batch_loss/train': 1.0005822032690048}
12/21/2021 21:59:46 - INFO - codeparrot_training - Step 2380: {'lr': 0.000497587692551169, 'samples': 1219072, 'steps': 2380, 'batch_loss/train': 0.8523722635582089}
12/21/2021 21:59:57 - INFO - codeparrot_training - Step 2381: {'lr': 0.0004975855075506623, 'samples': 1219584, 'steps': 2381, 'batch_loss/train': 0.9162467382848263}
12/21/2021 22:00:09 - INFO - codeparrot_training - Step 2382: {'lr': 0.0004975833215658492, 'samples': 1220096, 'steps': 2382, 'batch_loss/train': 0.8199595594778657}
12/21/2021 22:00:19 - INFO - codeparrot_training - Step 2383: {'lr': 0.0004975811345967381, 'samples': 1220608, 'steps': 2383, 'batch_loss/train': 0.9168350305408239}
12/21/2021 22:00:30 - INFO - codeparrot_training - Step 2384: {'lr': 0.0004975789466433381, 'samples': 1221120, 'steps': 2384, 'batch_loss/train': 1.214301634579897}
12/21/2021 22:00:43 - INFO - codeparrot_training - Step 2385: {'lr': 0.0004975767577056576, 'samples': 1221632, 'steps': 2385, 'batch_loss/train': 0.938509707339108}
12/21/2021 22:00:53 - INFO - codeparrot_training - Step 2386: {'lr': 0.0004975745677837053, 'samples': 1222144, 'steps': 2386, 'batch_loss/train': 0.9724141815677285}
12/21/2021 22:01:04 - INFO - codeparrot_training - Step 2387: {'lr': 0.0004975723768774901, 'samples': 1222656, 'steps': 2387, 'batch_loss/train': 0.9731059758923948}
12/21/2021 22:01:15 - INFO - codeparrot_training - Step 2388: {'lr': 0.0004975701849870205, 'samples': 1223168, 'steps': 2388, 'batch_loss/train': 0.5016653165221214}
12/21/2021 22:01:27 - INFO - codeparrot_training - Step 2389: {'lr': 0.0004975679921123055, 'samples': 1223680, 'steps': 2389, 'batch_loss/train': 0.8466060645878315}
12/21/2021 22:01:37 - INFO - codeparrot_training - Step 2390: {'lr': 0.0004975657982533536, 'samples': 1224192, 'steps': 2390, 'batch_loss/train': 0.9465504665859044}
12/21/2021 22:01:48 - INFO - codeparrot_training - Step 2391: {'lr': 0.0004975636034101734, 'samples': 1224704, 'steps': 2391, 'batch_loss/train': 0.898351626470685}
12/21/2021 22:02:00 - INFO - codeparrot_training - Step 2392: {'lr': 0.0004975614075827739, 'samples': 1225216, 'steps': 2392, 'batch_loss/train': 0.9409959949553013}
12/21/2021 22:02:11 - INFO - codeparrot_training - Step 2393: {'lr': 0.0004975592107711637, 'samples': 1225728, 'steps': 2393, 'batch_loss/train': 0.796946465736255}
12/21/2021 22:02:22 - INFO - codeparrot_training - Step 2394: {'lr': 0.0004975570129753516, 'samples': 1226240, 'steps': 2394, 'batch_loss/train': 0.8744842549785972}
12/21/2021 22:02:33 - INFO - codeparrot_training - Step 2395: {'lr': 0.0004975548141953462, 'samples': 1226752, 'steps': 2395, 'batch_loss/train': 0.8209635470993817}
12/21/2021 22:02:44 - INFO - codeparrot_training - Step 2396: {'lr': 0.0004975526144311564, 'samples': 1227264, 'steps': 2396, 'batch_loss/train': 0.83765724953264}
12/21/2021 22:02:55 - INFO - codeparrot_training - Step 2397: {'lr': 0.000497550413682791, 'samples': 1227776, 'steps': 2397, 'batch_loss/train': 0.8945213290862739}
12/21/2021 22:03:05 - INFO - codeparrot_training - Step 2398: {'lr': 0.0004975482119502585, 'samples': 1228288, 'steps': 2398, 'batch_loss/train': 0.8881947649642825}
12/21/2021 22:03:18 - INFO - codeparrot_training - Step 2399: {'lr': 0.0004975460092335678, 'samples': 1228800, 'steps': 2399, 'batch_loss/train': 0.8556665410287678}
12/21/2021 22:03:29 - INFO - codeparrot_training - Step 2400: {'lr': 0.0004975438055327275, 'samples': 1229312, 'steps': 2400, 'batch_loss/train': 0.9395778954494745}
12/21/2021 22:03:39 - INFO - codeparrot_training - Step 2401: {'lr': 0.0004975416008477467, 'samples': 1229824, 'steps': 2401, 'batch_loss/train': 0.8943656301125884}
12/21/2021 22:03:52 - INFO - codeparrot_training - Step 2402: {'lr': 0.0004975393951786338, 'samples': 1230336, 'steps': 2402, 'batch_loss/train': 0.9031441528350115}
12/21/2021 22:04:02 - INFO - codeparrot_training - Step 2403: {'lr': 0.0004975371885253978, 'samples': 1230848, 'steps': 2403, 'batch_loss/train': 1.5882976193679497}
12/21/2021 22:04:13 - INFO - codeparrot_training - Step 2404: {'lr': 0.0004975349808880474, 'samples': 1231360, 'steps': 2404, 'batch_loss/train': 0.8681585155427456}
12/21/2021 22:04:25 - INFO - codeparrot_training - Step 2405: {'lr': 0.0004975327722665915, 'samples': 1231872, 'steps': 2405, 'batch_loss/train': 0.9288921132683754}
12/21/2021 22:04:36 - INFO - codeparrot_training - Step 2406: {'lr': 0.0004975305626610386, 'samples': 1232384, 'steps': 2406, 'batch_loss/train': 0.9370471481233835}
12/21/2021 22:04:46 - INFO - codeparrot_training - Step 2407: {'lr': 0.0004975283520713977, 'samples': 1232896, 'steps': 2407, 'batch_loss/train': 0.803179731592536}
12/21/2021 22:04:57 - INFO - codeparrot_training - Step 2408: {'lr': 0.0004975261404976775, 'samples': 1233408, 'steps': 2408, 'batch_loss/train': 0.798239984549582}
12/21/2021 22:05:10 - INFO - codeparrot_training - Step 2409: {'lr': 0.0004975239279398869, 'samples': 1233920, 'steps': 2409, 'batch_loss/train': 0.8998807547613978}
12/21/2021 22:05:21 - INFO - codeparrot_training - Step 2410: {'lr': 0.0004975217143980346, 'samples': 1234432, 'steps': 2410, 'batch_loss/train': 2.1426363266073167}
12/21/2021 22:05:31 - INFO - codeparrot_training - Step 2411: {'lr': 0.0004975194998721293, 'samples': 1234944, 'steps': 2411, 'batch_loss/train': 0.9118229290470481}
12/21/2021 22:05:43 - INFO - codeparrot_training - Step 2412: {'lr': 0.0004975172843621801, 'samples': 1235456, 'steps': 2412, 'batch_loss/train': 0.8632950251922011}
12/21/2021 22:05:54 - INFO - codeparrot_training - Step 2413: {'lr': 0.0004975150678681956, 'samples': 1235968, 'steps': 2413, 'batch_loss/train': 0.8642621673643589}
12/21/2021 22:06:04 - INFO - codeparrot_training - Step 2414: {'lr': 0.0004975128503901846, 'samples': 1236480, 'steps': 2414, 'batch_loss/train': 0.8957793498411775}
12/21/2021 22:06:17 - INFO - codeparrot_training - Step 2415: {'lr': 0.000497510631928156, 'samples': 1236992, 'steps': 2415, 'batch_loss/train': 0.8772434862330556}
12/21/2021 22:06:28 - INFO - codeparrot_training - Step 2416: {'lr': 0.0004975084124821186, 'samples': 1237504, 'steps': 2416, 'batch_loss/train': 0.8770761080086231}
12/21/2021 22:06:38 - INFO - codeparrot_training - Step 2417: {'lr': 0.0004975061920520812, 'samples': 1238016, 'steps': 2417, 'batch_loss/train': 0.6964186881668866}
12/21/2021 22:06:49 - INFO - codeparrot_training - Step 2418: {'lr': 0.0004975039706380525, 'samples': 1238528, 'steps': 2418, 'batch_loss/train': 0.8637498514726758}
12/21/2021 22:07:01 - INFO - codeparrot_training - Step 2419: {'lr': 0.0004975017482400417, 'samples': 1239040, 'steps': 2419, 'batch_loss/train': 0.9701014682650566}
12/21/2021 22:07:12 - INFO - codeparrot_training - Step 2420: {'lr': 0.0004974995248580572, 'samples': 1239552, 'steps': 2420, 'batch_loss/train': 0.8720846213400364}
12/21/2021 22:07:22 - INFO - codeparrot_training - Step 2421: {'lr': 0.0004974973004921081, 'samples': 1240064, 'steps': 2421, 'batch_loss/train': 1.0774633758701384}
12/21/2021 22:07:34 - INFO - codeparrot_training - Step 2422: {'lr': 0.0004974950751422033, 'samples': 1240576, 'steps': 2422, 'batch_loss/train': 0.9713053675368428}
12/21/2021 22:07:45 - INFO - codeparrot_training - Step 2423: {'lr': 0.0004974928488083514, 'samples': 1241088, 'steps': 2423, 'batch_loss/train': 0.8834156133234501}
12/21/2021 22:07:55 - INFO - codeparrot_training - Step 2424: {'lr': 0.0004974906214905614, 'samples': 1241600, 'steps': 2424, 'batch_loss/train': 0.9456081883981824}
12/21/2021 22:08:08 - INFO - codeparrot_training - Step 2425: {'lr': 0.0004974883931888422, 'samples': 1242112, 'steps': 2425, 'batch_loss/train': 0.9954968951642513}
12/21/2021 22:08:19 - INFO - codeparrot_training - Step 2426: {'lr': 0.0004974861639032025, 'samples': 1242624, 'steps': 2426, 'batch_loss/train': 0.8597612073644996}
12/21/2021 22:08:29 - INFO - codeparrot_training - Step 2427: {'lr': 0.0004974839336336514, 'samples': 1243136, 'steps': 2427, 'batch_loss/train': 0.9161617700010538}
12/21/2021 22:08:41 - INFO - codeparrot_training - Step 2428: {'lr': 0.0004974817023801975, 'samples': 1243648, 'steps': 2428, 'batch_loss/train': 0.8924769866280258}
12/21/2021 22:08:52 - INFO - codeparrot_training - Step 2429: {'lr': 0.0004974794701428498, 'samples': 1244160, 'steps': 2429, 'batch_loss/train': 0.7896713707596064}
12/21/2021 22:09:02 - INFO - codeparrot_training - Step 2430: {'lr': 0.0004974772369216173, 'samples': 1244672, 'steps': 2430, 'batch_loss/train': 0.9270444745197892}
12/21/2021 22:09:13 - INFO - codeparrot_training - Step 2431: {'lr': 0.0004974750027165086, 'samples': 1245184, 'steps': 2431, 'batch_loss/train': 0.8613794194534421}
12/21/2021 22:09:25 - INFO - codeparrot_training - Step 2432: {'lr': 0.0004974727675275327, 'samples': 1245696, 'steps': 2432, 'batch_loss/train': 0.9009307059459388}
12/21/2021 22:09:36 - INFO - codeparrot_training - Step 2433: {'lr': 0.0004974705313546987, 'samples': 1246208, 'steps': 2433, 'batch_loss/train': 0.9892600793391466}
12/21/2021 22:09:46 - INFO - codeparrot_training - Step 2434: {'lr': 0.0004974682941980153, 'samples': 1246720, 'steps': 2434, 'batch_loss/train': 0.8625464253127575}
12/21/2021 22:09:58 - INFO - codeparrot_training - Step 2435: {'lr': 0.0004974660560574913, 'samples': 1247232, 'steps': 2435, 'batch_loss/train': 0.8426018957979977}
12/21/2021 22:10:09 - INFO - codeparrot_training - Step 2436: {'lr': 0.0004974638169331357, 'samples': 1247744, 'steps': 2436, 'batch_loss/train': 0.7655221875756979}
12/21/2021 22:10:19 - INFO - codeparrot_training - Step 2437: {'lr': 0.0004974615768249574, 'samples': 1248256, 'steps': 2437, 'batch_loss/train': 0.899914741050452}
12/21/2021 22:10:32 - INFO - codeparrot_training - Step 2438: {'lr': 0.0004974593357329654, 'samples': 1248768, 'steps': 2438, 'batch_loss/train': 0.9506952306255698}
12/21/2021 22:10:43 - INFO - codeparrot_training - Step 2439: {'lr': 0.0004974570936571684, 'samples': 1249280, 'steps': 2439, 'batch_loss/train': 0.9471357576549053}
12/21/2021 22:10:53 - INFO - codeparrot_training - Step 2440: {'lr': 0.0004974548505975755, 'samples': 1249792, 'steps': 2440, 'batch_loss/train': 0.8299966296181083}
12/21/2021 22:11:04 - INFO - codeparrot_training - Step 2441: {'lr': 0.0004974526065541955, 'samples': 1250304, 'steps': 2441, 'batch_loss/train': 0.9677605060860515}
12/21/2021 22:11:16 - INFO - codeparrot_training - Step 2442: {'lr': 0.0004974503615270374, 'samples': 1250816, 'steps': 2442, 'batch_loss/train': 0.8658693616744131}
12/21/2021 22:11:27 - INFO - codeparrot_training - Step 2443: {'lr': 0.0004974481155161102, 'samples': 1251328, 'steps': 2443, 'batch_loss/train': 0.8328185183927417}
12/21/2021 22:11:37 - INFO - codeparrot_training - Step 2444: {'lr': 0.0004974458685214226, 'samples': 1251840, 'steps': 2444, 'batch_loss/train': 0.8201797772198915}
12/21/2021 22:11:49 - INFO - codeparrot_training - Step 2445: {'lr': 0.0004974436205429836, 'samples': 1252352, 'steps': 2445, 'batch_loss/train': 0.9158852323889732}
12/21/2021 22:12:00 - INFO - codeparrot_training - Step 2446: {'lr': 0.0004974413715808023, 'samples': 1252864, 'steps': 2446, 'batch_loss/train': 0.9735498046502471}
12/21/2021 22:12:10 - INFO - codeparrot_training - Step 2447: {'lr': 0.0004974391216348876, 'samples': 1253376, 'steps': 2447, 'batch_loss/train': 0.8345221932977438}
12/21/2021 22:12:23 - INFO - codeparrot_training - Step 2448: {'lr': 0.0004974368707052482, 'samples': 1253888, 'steps': 2448, 'batch_loss/train': 0.8090018695220351}
12/21/2021 22:12:34 - INFO - codeparrot_training - Step 2449: {'lr': 0.0004974346187918934, 'samples': 1254400, 'steps': 2449, 'batch_loss/train': 0.861745860427618}
12/21/2021 22:12:44 - INFO - codeparrot_training - Step 2450: {'lr': 0.0004974323658948319, 'samples': 1254912, 'steps': 2450, 'batch_loss/train': 0.8884500926360488}
12/21/2021 22:12:56 - INFO - codeparrot_training - Step 2451: {'lr': 0.0004974301120140727, 'samples': 1255424, 'steps': 2451, 'batch_loss/train': 0.8980726227164268}
12/21/2021 22:13:07 - INFO - codeparrot_training - Step 2452: {'lr': 0.0004974278571496249, 'samples': 1255936, 'steps': 2452, 'batch_loss/train': 0.806457516271621}
12/21/2021 22:13:18 - INFO - codeparrot_training - Step 2453: {'lr': 0.0004974256013014973, 'samples': 1256448, 'steps': 2453, 'batch_loss/train': 0.894386775791645}
12/21/2021 22:13:28 - INFO - codeparrot_training - Step 2454: {'lr': 0.000497423344469699, 'samples': 1256960, 'steps': 2454, 'batch_loss/train': 0.8365414766594768}
12/21/2021 22:13:41 - INFO - codeparrot_training - Step 2455: {'lr': 0.0004974210866542389, 'samples': 1257472, 'steps': 2455, 'batch_loss/train': 0.8316435103770345}
12/21/2021 22:13:51 - INFO - codeparrot_training - Step 2456: {'lr': 0.0004974188278551258, 'samples': 1257984, 'steps': 2456, 'batch_loss/train': 0.9044487760402262}
12/21/2021 22:14:02 - INFO - codeparrot_training - Step 2457: {'lr': 0.0004974165680723691, 'samples': 1258496, 'steps': 2457, 'batch_loss/train': 0.8661789642646909}
12/21/2021 22:14:14 - INFO - codeparrot_training - Step 2458: {'lr': 0.0004974143073059775, 'samples': 1259008, 'steps': 2458, 'batch_loss/train': 0.9507492007687688}
12/21/2021 22:14:25 - INFO - codeparrot_training - Step 2459: {'lr': 0.00049741204555596, 'samples': 1259520, 'steps': 2459, 'batch_loss/train': 0.8234295197762549}
12/21/2021 22:14:35 - INFO - codeparrot_training - Step 2460: {'lr': 0.0004974097828223255, 'samples': 1260032, 'steps': 2460, 'batch_loss/train': 0.880888644605875}
12/21/2021 22:14:47 - INFO - codeparrot_training - Step 2461: {'lr': 0.0004974075191050833, 'samples': 1260544, 'steps': 2461, 'batch_loss/train': 0.8481785664334893}
12/21/2021 22:14:58 - INFO - codeparrot_training - Step 2462: {'lr': 0.0004974052544042422, 'samples': 1261056, 'steps': 2462, 'batch_loss/train': 0.8406291655264795}
12/21/2021 22:15:08 - INFO - codeparrot_training - Step 2463: {'lr': 0.0004974029887198111, 'samples': 1261568, 'steps': 2463, 'batch_loss/train': 1.137168486136943}
12/21/2021 22:15:19 - INFO - codeparrot_training - Step 2464: {'lr': 0.0004974007220517992, 'samples': 1262080, 'steps': 2464, 'batch_loss/train': 0.9874716633930802}
12/21/2021 22:15:32 - INFO - codeparrot_training - Step 2465: {'lr': 0.0004973984544002155, 'samples': 1262592, 'steps': 2465, 'batch_loss/train': 0.86197071056813}
12/21/2021 22:15:42 - INFO - codeparrot_training - Step 2466: {'lr': 0.0004973961857650688, 'samples': 1263104, 'steps': 2466, 'batch_loss/train': 1.0518849920481443}
12/21/2021 22:15:53 - INFO - codeparrot_training - Step 2467: {'lr': 0.0004973939161463685, 'samples': 1263616, 'steps': 2467, 'batch_loss/train': 0.8167329262942076}
12/21/2021 22:16:05 - INFO - codeparrot_training - Step 2468: {'lr': 0.0004973916455441233, 'samples': 1264128, 'steps': 2468, 'batch_loss/train': 1.0348054729402065}
12/21/2021 22:16:16 - INFO - codeparrot_training - Step 2469: {'lr': 0.0004973893739583423, 'samples': 1264640, 'steps': 2469, 'batch_loss/train': 0.8945004306733608}
12/21/2021 22:16:27 - INFO - codeparrot_training - Step 2470: {'lr': 0.0004973871013890345, 'samples': 1265152, 'steps': 2470, 'batch_loss/train': 0.9106945530511439}
12/21/2021 22:16:39 - INFO - codeparrot_training - Step 2471: {'lr': 0.0004973848278362091, 'samples': 1265664, 'steps': 2471, 'batch_loss/train': 0.8765766080468893}
12/21/2021 22:16:50 - INFO - codeparrot_training - Step 2472: {'lr': 0.000497382553299875, 'samples': 1266176, 'steps': 2472, 'batch_loss/train': 0.9193477304652333} 12/21/2021 22:17:00 - INFO - codeparrot_training - Step 2473: {'lr': 0.0004973802777800413, 'samples': 1266688, 'steps': 2473, 'batch_loss/train': 0.8447763184085488} 12/21/2021 22:17:11 - INFO - codeparrot_training - Step 2474: {'lr': 0.000497378001276717, 'samples': 1267200, 'steps': 2474, 'batch_loss/train': 0.8332170657813549} 12/21/2021 22:17:23 - INFO - codeparrot_training - Step 2475: {'lr': 0.0004973757237899112, 'samples': 1267712, 'steps': 2475, 'batch_loss/train': 0.892075321637094} 12/21/2021 22:17:34 - INFO - codeparrot_training - Step 2476: {'lr': 0.0004973734453196329, 'samples': 1268224, 'steps': 2476, 'batch_loss/train': 0.8991949721239507} 12/21/2021 22:17:45 - INFO - codeparrot_training - Step 2477: {'lr': 0.0004973711658658913, 'samples': 1268736, 'steps': 2477, 'batch_loss/train': 0.8673014836385846} 12/21/2021 22:17:57 - INFO - codeparrot_training - Step 2478: {'lr': 0.0004973688854286952, 'samples': 1269248, 'steps': 2478, 'batch_loss/train': 0.9110845597460866} 12/21/2021 22:18:08 - INFO - codeparrot_training - Step 2479: {'lr': 0.0004973666040080539, 'samples': 1269760, 'steps': 2479, 'batch_loss/train': 0.9549682098440826} 12/21/2021 22:18:18 - INFO - codeparrot_training - Step 2480: {'lr': 0.0004973643216039762, 'samples': 1270272, 'steps': 2480, 'batch_loss/train': 0.9013823578134179} 12/21/2021 22:18:30 - INFO - codeparrot_training - Step 2481: {'lr': 0.0004973620382164717, 'samples': 1270784, 'steps': 2481, 'batch_loss/train': 0.5444711202289909} 12/21/2021 22:18:41 - INFO - codeparrot_training - Step 2482: {'lr': 0.0004973597538455489, 'samples': 1271296, 'steps': 2482, 'batch_loss/train': 1.0777513198554516} 12/21/2021 22:18:52 - INFO - codeparrot_training - Step 2483: {'lr': 0.0004973574684912171, 'samples': 1271808, 'steps': 2483, 'batch_loss/train': 
0.8471165262162685} 12/21/2021 22:19:02 - INFO - codeparrot_training - Step 2484: {'lr': 0.0004973551821534854, 'samples': 1272320, 'steps': 2484, 'batch_loss/train': 0.8948148735798895} 12/21/2021 22:19:16 - INFO - codeparrot_training - Step 2485: {'lr': 0.000497352894832363, 'samples': 1272832, 'steps': 2485, 'batch_loss/train': 0.9314783862791955} 12/21/2021 22:19:26 - INFO - codeparrot_training - Step 2486: {'lr': 0.0004973506065278589, 'samples': 1273344, 'steps': 2486, 'batch_loss/train': 0.7975423445459455} 12/21/2021 22:19:37 - INFO - codeparrot_training - Step 2487: {'lr': 0.0004973483172399821, 'samples': 1273856, 'steps': 2487, 'batch_loss/train': 0.6478187893517315} 12/21/2021 22:19:49 - INFO - codeparrot_training - Step 2488: {'lr': 0.0004973460269687418, 'samples': 1274368, 'steps': 2488, 'batch_loss/train': 0.8588375961408019} 12/21/2021 22:20:00 - INFO - codeparrot_training - Step 2489: {'lr': 0.0004973437357141471, 'samples': 1274880, 'steps': 2489, 'batch_loss/train': 0.6346221710555255} 12/21/2021 22:20:10 - INFO - codeparrot_training - Step 2490: {'lr': 0.0004973414434762071, 'samples': 1275392, 'steps': 2490, 'batch_loss/train': 0.9561953926458955} 12/21/2021 22:20:22 - INFO - codeparrot_training - Step 2491: {'lr': 0.000497339150254931, 'samples': 1275904, 'steps': 2491, 'batch_loss/train': 0.8932080608792603} 12/21/2021 22:20:33 - INFO - codeparrot_training - Step 2492: {'lr': 0.0004973368560503277, 'samples': 1276416, 'steps': 2492, 'batch_loss/train': 0.8231331426650286} 12/21/2021 22:20:43 - INFO - codeparrot_training - Step 2493: {'lr': 0.0004973345608624064, 'samples': 1276928, 'steps': 2493, 'batch_loss/train': 0.682543438160792} 12/21/2021 22:20:54 - INFO - codeparrot_training - Step 2494: {'lr': 0.0004973322646911764, 'samples': 1277440, 'steps': 2494, 'batch_loss/train': 0.8630196452140808} 12/21/2021 22:21:07 - INFO - codeparrot_training - Step 2495: {'lr': 0.0004973299675366467, 'samples': 1277952, 'steps': 2495, 
'batch_loss/train': 1.0371509697288275} 12/21/2021 22:21:17 - INFO - codeparrot_training - Step 2496: {'lr': 0.0004973276693988265, 'samples': 1278464, 'steps': 2496, 'batch_loss/train': 1.103387150913477} 12/21/2021 22:21:28 - INFO - codeparrot_training - Step 2497: {'lr': 0.0004973253702777248, 'samples': 1278976, 'steps': 2497, 'batch_loss/train': 0.9223515233024955} 12/21/2021 22:21:40 - INFO - codeparrot_training - Step 2498: {'lr': 0.0004973230701733507, 'samples': 1279488, 'steps': 2498, 'batch_loss/train': 0.8907973039895296} 12/21/2021 22:21:51 - INFO - codeparrot_training - Step 2499: {'lr': 0.0004973207690857135, 'samples': 1280000, 'steps': 2499, 'batch_loss/train': 0.8420358770526946} 12/21/2021 22:21:51 - INFO - codeparrot_training - Evaluating and saving model checkpoint 12/21/2021 22:49:57 - INFO - codeparrot_training - Batch size: torch.Size([0, 2048]) 12/21/2021 22:49:57 - INFO - codeparrot_training - Step 2500: {'loss/eval': 0.9031597971916199, 'perplexity': 2.4673871994018555} 12/21/2021 22:50:27 - INFO - codeparrot_training - Step 2500: {'lr': 0.0004973184670148224, 'samples': 1280512, 'steps': 2500, 'batch_loss/train': 0.9076937567442656} 12/21/2021 22:50:39 - INFO - codeparrot_training - Step 2501: {'lr': 0.0004973161639606864, 'samples': 1281024, 'steps': 2501, 'batch_loss/train': 0.969430792145431} 12/21/2021 22:50:50 - INFO - codeparrot_training - Step 2502: {'lr': 0.0004973138599233148, 'samples': 1281536, 'steps': 2502, 'batch_loss/train': 0.8416600208729506} 12/21/2021 22:51:01 - INFO - codeparrot_training - Step 2503: {'lr': 0.0004973115549027166, 'samples': 1282048, 'steps': 2503, 'batch_loss/train': 0.8591052221599966} 12/21/2021 22:51:11 - INFO - codeparrot_training - Step 2504: {'lr': 0.0004973092488989011, 'samples': 1282560, 'steps': 2504, 'batch_loss/train': 0.9008149318397045} 12/21/2021 22:51:23 - INFO - codeparrot_training - Step 2505: {'lr': 0.0004973069419118773, 'samples': 1283072, 'steps': 2505, 'batch_loss/train': 
0.8336633089929819} 12/21/2021 22:51:34 - INFO - codeparrot_training - Step 2506: {'lr': 0.0004973046339416545, 'samples': 1283584, 'steps': 2506, 'batch_loss/train': 0.9070305284112692} 12/21/2021 22:51:44 - INFO - codeparrot_training - Step 2507: {'lr': 0.000497302324988242, 'samples': 1284096, 'steps': 2507, 'batch_loss/train': 0.9228264223784208} 12/21/2021 22:51:57 - INFO - codeparrot_training - Step 2508: {'lr': 0.0004973000150516487, 'samples': 1284608, 'steps': 2508, 'batch_loss/train': 0.9444270497187972} 12/21/2021 22:52:08 - INFO - codeparrot_training - Step 2509: {'lr': 0.0004972977041318839, 'samples': 1285120, 'steps': 2509, 'batch_loss/train': 0.953637545928359} 12/21/2021 22:52:19 - INFO - codeparrot_training - Step 2510: {'lr': 0.0004972953922289569, 'samples': 1285632, 'steps': 2510, 'batch_loss/train': 0.8730979631654918} 12/21/2021 22:52:31 - INFO - codeparrot_training - Step 2511: {'lr': 0.0004972930793428768, 'samples': 1286144, 'steps': 2511, 'batch_loss/train': 0.8727558553218842} 12/21/2021 22:52:41 - INFO - codeparrot_training - Step 2512: {'lr': 0.0004972907654736527, 'samples': 1286656, 'steps': 2512, 'batch_loss/train': 0.877959605306387} 12/21/2021 22:52:52 - INFO - codeparrot_training - Step 2513: {'lr': 0.0004972884506212939, 'samples': 1287168, 'steps': 2513, 'batch_loss/train': 0.8885626718401909} 12/21/2021 22:53:02 - INFO - codeparrot_training - Step 2514: {'lr': 0.0004972861347858097, 'samples': 1287680, 'steps': 2514, 'batch_loss/train': 0.7683573863469064} 12/21/2021 22:53:15 - INFO - codeparrot_training - Step 2515: {'lr': 0.0004972838179672091, 'samples': 1288192, 'steps': 2515, 'batch_loss/train': 0.9208315955474973} 12/21/2021 22:53:26 - INFO - codeparrot_training - Step 2516: {'lr': 0.0004972815001655014, 'samples': 1288704, 'steps': 2516, 'batch_loss/train': 1.0412279767915606} 12/21/2021 22:53:37 - INFO - codeparrot_training - Step 2517: {'lr': 0.0004972791813806958, 'samples': 1289216, 'steps': 2517, 
'batch_loss/train': 0.7875987719744444} 12/21/2021 22:53:49 - INFO - codeparrot_training - Step 2518: {'lr': 0.0004972768616128016, 'samples': 1289728, 'steps': 2518, 'batch_loss/train': 0.8762602936476469} 12/21/2021 22:53:59 - INFO - codeparrot_training - Step 2519: {'lr': 0.000497274540861828, 'samples': 1290240, 'steps': 2519, 'batch_loss/train': 0.8094407469034195} 12/21/2021 22:54:10 - INFO - codeparrot_training - Step 2520: {'lr': 0.0004972722191277842, 'samples': 1290752, 'steps': 2520, 'batch_loss/train': 0.8472680025734007} 12/21/2021 22:54:22 - INFO - codeparrot_training - Step 2521: {'lr': 0.0004972698964106794, 'samples': 1291264, 'steps': 2521, 'batch_loss/train': 0.8058119593188167} 12/21/2021 22:54:33 - INFO - codeparrot_training - Step 2522: {'lr': 0.0004972675727105228, 'samples': 1291776, 'steps': 2522, 'batch_loss/train': 0.8879459099844098} 12/21/2021 22:54:43 - INFO - codeparrot_training - Step 2523: {'lr': 0.0004972652480273237, 'samples': 1292288, 'steps': 2523, 'batch_loss/train': 1.0259497240185738} 12/21/2021 22:54:56 - INFO - codeparrot_training - Step 2524: {'lr': 0.0004972629223610914, 'samples': 1292800, 'steps': 2524, 'batch_loss/train': 0.8909566830843687} 12/21/2021 22:55:07 - INFO - codeparrot_training - Step 2525: {'lr': 0.000497260595711835, 'samples': 1293312, 'steps': 2525, 'batch_loss/train': 0.8714152667671442} 12/21/2021 22:55:17 - INFO - codeparrot_training - Step 2526: {'lr': 0.000497258268079564, 'samples': 1293824, 'steps': 2526, 'batch_loss/train': 0.8264178214594722} 12/21/2021 22:55:28 - INFO - codeparrot_training - Step 2527: {'lr': 0.0004972559394642873, 'samples': 1294336, 'steps': 2527, 'batch_loss/train': 0.890899671241641} 12/21/2021 22:55:40 - INFO - codeparrot_training - Step 2528: {'lr': 0.0004972536098660144, 'samples': 1294848, 'steps': 2528, 'batch_loss/train': 0.8058766731992364} 12/21/2021 22:55:50 - INFO - codeparrot_training - Step 2529: {'lr': 0.0004972512792847546, 'samples': 1295360, 'steps': 2529, 
'batch_loss/train': 0.8343138564378023} 12/21/2021 22:56:01 - INFO - codeparrot_training - Step 2530: {'lr': 0.000497248947720517, 'samples': 1295872, 'steps': 2530, 'batch_loss/train': 0.976877330802381} 12/21/2021 22:56:14 - INFO - codeparrot_training - Step 2531: {'lr': 0.000497246615173311, 'samples': 1296384, 'steps': 2531, 'batch_loss/train': 0.9488952672109008} 12/21/2021 22:56:24 - INFO - codeparrot_training - Step 2532: {'lr': 0.0004972442816431456, 'samples': 1296896, 'steps': 2532, 'batch_loss/train': 0.8364651361480355} 12/21/2021 22:56:35 - INFO - codeparrot_training - Step 2533: {'lr': 0.0004972419471300305, 'samples': 1297408, 'steps': 2533, 'batch_loss/train': 0.8487222855910659} 12/21/2021 22:56:47 - INFO - codeparrot_training - Step 2534: {'lr': 0.0004972396116339747, 'samples': 1297920, 'steps': 2534, 'batch_loss/train': 0.8443860290572047} 12/21/2021 22:56:58 - INFO - codeparrot_training - Step 2535: {'lr': 0.0004972372751549876, 'samples': 1298432, 'steps': 2535, 'batch_loss/train': 0.9987309090793133} 12/21/2021 22:57:08 - INFO - codeparrot_training - Step 2536: {'lr': 0.0004972349376930784, 'samples': 1298944, 'steps': 2536, 'batch_loss/train': 0.8589098758529872} 12/21/2021 22:57:19 - INFO - codeparrot_training - Step 2537: {'lr': 0.0004972325992482565, 'samples': 1299456, 'steps': 2537, 'batch_loss/train': 0.949855612590909} 12/21/2021 22:57:31 - INFO - codeparrot_training - Step 2538: {'lr': 0.0004972302598205311, 'samples': 1299968, 'steps': 2538, 'batch_loss/train': 0.9207930257543921} 12/21/2021 22:57:42 - INFO - codeparrot_training - Step 2539: {'lr': 0.0004972279194099115, 'samples': 1300480, 'steps': 2539, 'batch_loss/train': 0.8668918842449784} 12/21/2021 22:57:52 - INFO - codeparrot_training - Step 2540: {'lr': 0.0004972255780164071, 'samples': 1300992, 'steps': 2540, 'batch_loss/train': 0.9496146347373724} 12/21/2021 22:58:05 - INFO - codeparrot_training - Step 2541: {'lr': 0.0004972232356400271, 'samples': 1301504, 'steps': 2541, 
'batch_loss/train': 0.9162016650661826} 12/21/2021 22:58:15 - INFO - codeparrot_training - Step 2542: {'lr': 0.000497220892280781, 'samples': 1302016, 'steps': 2542, 'batch_loss/train': 0.9554236540570855} 12/21/2021 22:58:26 - INFO - codeparrot_training - Step 2543: {'lr': 0.0004972185479386778, 'samples': 1302528, 'steps': 2543, 'batch_loss/train': 0.8487241314724088} 12/21/2021 22:58:39 - INFO - codeparrot_training - Step 2544: {'lr': 0.0004972162026137271, 'samples': 1303040, 'steps': 2544, 'batch_loss/train': 0.9302274729125202} 12/21/2021 22:58:49 - INFO - codeparrot_training - Step 2545: {'lr': 0.0004972138563059381, 'samples': 1303552, 'steps': 2545, 'batch_loss/train': 0.8458808241412044} 12/21/2021 22:59:00 - INFO - codeparrot_training - Step 2546: {'lr': 0.0004972115090153201, 'samples': 1304064, 'steps': 2546, 'batch_loss/train': 0.8888266906142235} 12/21/2021 22:59:12 - INFO - codeparrot_training - Step 2547: {'lr': 0.0004972091607418826, 'samples': 1304576, 'steps': 2547, 'batch_loss/train': 0.7463920973241329} 12/21/2021 22:59:23 - INFO - codeparrot_training - Step 2548: {'lr': 0.0004972068114856347, 'samples': 1305088, 'steps': 2548, 'batch_loss/train': 0.9357333313673735} 12/21/2021 22:59:33 - INFO - codeparrot_training - Step 2549: {'lr': 0.000497204461246586, 'samples': 1305600, 'steps': 2549, 'batch_loss/train': 0.8408659463748336} 12/21/2021 22:59:44 - INFO - codeparrot_training - Step 2550: {'lr': 0.0004972021100247456, 'samples': 1306112, 'steps': 2550, 'batch_loss/train': 0.9263641079887748} 12/21/2021 22:59:56 - INFO - codeparrot_training - Step 2551: {'lr': 0.000497199757820123, 'samples': 1306624, 'steps': 2551, 'batch_loss/train': 1.0148287108168006} 12/21/2021 23:00:07 - INFO - codeparrot_training - Step 2552: {'lr': 0.0004971974046327274, 'samples': 1307136, 'steps': 2552, 'batch_loss/train': 0.9012897098436952} 12/21/2021 23:00:17 - INFO - codeparrot_training - Step 2553: {'lr': 0.0004971950504625683, 'samples': 1307648, 'steps': 
2553, 'batch_loss/train': 0.891429235227406} 12/21/2021 23:00:30 - INFO - codeparrot_training - Step 2554: {'lr': 0.000497192695309655, 'samples': 1308160, 'steps': 2554, 'batch_loss/train': 0.831726127769798} 12/21/2021 23:00:41 - INFO - codeparrot_training - Step 2555: {'lr': 0.0004971903391739969, 'samples': 1308672, 'steps': 2555, 'batch_loss/train': 0.9400324746966362} 12/21/2021 23:00:52 - INFO - codeparrot_training - Step 2556: {'lr': 0.0004971879820556035, 'samples': 1309184, 'steps': 2556, 'batch_loss/train': 0.6422781564760953} 12/21/2021 23:01:02 - INFO - codeparrot_training - Step 2557: {'lr': 0.0004971856239544838, 'samples': 1309696, 'steps': 2557, 'batch_loss/train': 0.8568375967442989} 12/21/2021 23:01:14 - INFO - codeparrot_training - Step 2558: {'lr': 0.0004971832648706476, 'samples': 1310208, 'steps': 2558, 'batch_loss/train': 0.7511599408462644} 12/21/2021 23:01:25 - INFO - codeparrot_training - Step 2559: {'lr': 0.0004971809048041039, 'samples': 1310720, 'steps': 2559, 'batch_loss/train': 0.8719435341190547} 12/21/2021 23:01:36 - INFO - codeparrot_training - Step 2560: {'lr': 0.0004971785437548622, 'samples': 1311232, 'steps': 2560, 'batch_loss/train': 0.8061729660257697} 12/21/2021 23:01:49 - INFO - codeparrot_training - Step 2561: {'lr': 0.0004971761817229321, 'samples': 1311744, 'steps': 2561, 'batch_loss/train': 0.8520532930269837} 12/21/2021 23:01:59 - INFO - codeparrot_training - Step 2562: {'lr': 0.0004971738187083228, 'samples': 1312256, 'steps': 2562, 'batch_loss/train': 0.8292180690914392} 12/21/2021 23:02:10 - INFO - codeparrot_training - Step 2563: {'lr': 0.0004971714547110437, 'samples': 1312768, 'steps': 2563, 'batch_loss/train': 0.8803146705031395} 12/21/2021 23:02:22 - INFO - codeparrot_training - Step 2564: {'lr': 0.0004971690897311042, 'samples': 1313280, 'steps': 2564, 'batch_loss/train': 0.8979310672730207} 12/21/2021 23:02:33 - INFO - codeparrot_training - Step 2565: {'lr': 0.0004971667237685138, 'samples': 1313792, 
'steps': 2565, 'batch_loss/train': 0.895390348508954} 12/21/2021 23:02:43 - INFO - codeparrot_training - Step 2566: {'lr': 0.0004971643568232818, 'samples': 1314304, 'steps': 2566, 'batch_loss/train': 0.8766223592683673} 12/21/2021 23:02:54 - INFO - codeparrot_training - Step 2567: {'lr': 0.0004971619888954176, 'samples': 1314816, 'steps': 2567, 'batch_loss/train': 0.7394297434948385} 12/21/2021 23:03:06 - INFO - codeparrot_training - Step 2568: {'lr': 0.0004971596199849306, 'samples': 1315328, 'steps': 2568, 'batch_loss/train': 0.8700663484632969} 12/21/2021 23:03:16 - INFO - codeparrot_training - Step 2569: {'lr': 0.0004971572500918304, 'samples': 1315840, 'steps': 2569, 'batch_loss/train': 0.8787068529054523} 12/21/2021 23:03:27 - INFO - codeparrot_training - Step 2570: {'lr': 0.0004971548792161263, 'samples': 1316352, 'steps': 2570, 'batch_loss/train': 0.8988667586818337} 12/21/2021 23:03:40 - INFO - codeparrot_training - Step 2571: {'lr': 0.0004971525073578277, 'samples': 1316864, 'steps': 2571, 'batch_loss/train': 0.848110687918961} 12/21/2021 23:03:50 - INFO - codeparrot_training - Step 2572: {'lr': 0.000497150134516944, 'samples': 1317376, 'steps': 2572, 'batch_loss/train': 0.8512084872927517} 12/21/2021 23:04:01 - INFO - codeparrot_training - Step 2573: {'lr': 0.0004971477606934847, 'samples': 1317888, 'steps': 2573, 'batch_loss/train': 0.9369232738390565} 12/21/2021 23:04:13 - INFO - codeparrot_training - Step 2574: {'lr': 0.0004971453858874593, 'samples': 1318400, 'steps': 2574, 'batch_loss/train': 0.8359151147305965} 12/21/2021 23:04:24 - INFO - codeparrot_training - Step 2575: {'lr': 0.000497143010098877, 'samples': 1318912, 'steps': 2575, 'batch_loss/train': 0.8020782079547644} 12/21/2021 23:04:34 - INFO - codeparrot_training - Step 2576: {'lr': 0.0004971406333277476, 'samples': 1319424, 'steps': 2576, 'batch_loss/train': 0.9667276088148355} 12/21/2021 23:04:47 - INFO - codeparrot_training - Step 2577: {'lr': 0.0004971382555740803, 'samples': 1319936, 
'steps': 2577, 'batch_loss/train': 0.8350053374888375} 12/21/2021 23:04:57 - INFO - codeparrot_training - Step 2578: {'lr': 0.0004971358768378845, 'samples': 1320448, 'steps': 2578, 'batch_loss/train': 0.838725158944726} 12/21/2021 23:05:08 - INFO - codeparrot_training - Step 2579: {'lr': 0.0004971334971191699, 'samples': 1320960, 'steps': 2579, 'batch_loss/train': 0.9367148522287607} 12/21/2021 23:05:19 - INFO - codeparrot_training - Step 2580: {'lr': 0.0004971311164179458, 'samples': 1321472, 'steps': 2580, 'batch_loss/train': 0.9532340820878744} 12/21/2021 23:05:31 - INFO - codeparrot_training - Step 2581: {'lr': 0.0004971287347342217, 'samples': 1321984, 'steps': 2581, 'batch_loss/train': 0.9852427393198013} 12/21/2021 23:05:42 - INFO - codeparrot_training - Step 2582: {'lr': 0.000497126352068007, 'samples': 1322496, 'steps': 2582, 'batch_loss/train': 0.8499576719477773} 12/21/2021 23:05:52 - INFO - codeparrot_training - Step 2583: {'lr': 0.0004971239684193112, 'samples': 1323008, 'steps': 2583, 'batch_loss/train': 0.8626180618302897} 12/21/2021 23:06:05 - INFO - codeparrot_training - Step 2584: {'lr': 0.000497121583788144, 'samples': 1323520, 'steps': 2584, 'batch_loss/train': 1.097781540825963} 12/21/2021 23:06:15 - INFO - codeparrot_training - Step 2585: {'lr': 0.0004971191981745145, 'samples': 1324032, 'steps': 2585, 'batch_loss/train': 0.8322471072897315} 12/21/2021 23:06:26 - INFO - codeparrot_training - Step 2586: {'lr': 0.0004971168115784324, 'samples': 1324544, 'steps': 2586, 'batch_loss/train': 1.0926132048480213} 12/21/2021 23:06:38 - INFO - codeparrot_training - Step 2587: {'lr': 0.0004971144239999073, 'samples': 1325056, 'steps': 2587, 'batch_loss/train': 0.8683037227019668} 12/21/2021 23:06:49 - INFO - codeparrot_training - Step 2588: {'lr': 0.0004971120354389484, 'samples': 1325568, 'steps': 2588, 'batch_loss/train': 0.8266562519129366} 12/21/2021 23:06:59 - INFO - codeparrot_training - Step 2589: {'lr': 0.0004971096458955655, 'samples': 1326080, 
'steps': 2589, 'batch_loss/train': 0.8135174978524446} 12/21/2021 23:07:10 - INFO - codeparrot_training - Step 2590: {'lr': 0.000497107255369768, 'samples': 1326592, 'steps': 2590, 'batch_loss/train': 0.8550618444569409} 12/21/2021 23:07:23 - INFO - codeparrot_training - Step 2591: {'lr': 0.0004971048638615651, 'samples': 1327104, 'steps': 2591, 'batch_loss/train': 0.8892005924135447} 12/21/2021 23:07:34 - INFO - codeparrot_training - Step 2592: {'lr': 0.0004971024713709668, 'samples': 1327616, 'steps': 2592, 'batch_loss/train': 0.815569675527513} 12/21/2021 23:07:44 - INFO - codeparrot_training - Step 2593: {'lr': 0.0004971000778979822, 'samples': 1328128, 'steps': 2593, 'batch_loss/train': 0.8779277447611094} 12/21/2021 23:07:56 - INFO - codeparrot_training - Step 2594: {'lr': 0.000497097683442621, 'samples': 1328640, 'steps': 2594, 'batch_loss/train': 0.6797101623378694} 12/21/2021 23:08:07 - INFO - codeparrot_training - Step 2595: {'lr': 0.0004970952880048929, 'samples': 1329152, 'steps': 2595, 'batch_loss/train': 0.9521017652004957} 12/21/2021 23:08:17 - INFO - codeparrot_training - Step 2596: {'lr': 0.000497092891584807, 'samples': 1329664, 'steps': 2596, 'batch_loss/train': 0.8435386172495782} 12/21/2021 23:08:29 - INFO - codeparrot_training - Step 2597: {'lr': 0.0004970904941823732, 'samples': 1330176, 'steps': 2597, 'batch_loss/train': 0.8982730489224195} 12/21/2021 23:08:40 - INFO - codeparrot_training - Step 2598: {'lr': 0.000497088095797601, 'samples': 1330688, 'steps': 2598, 'batch_loss/train': 0.8791788518428802} 12/21/2021 23:08:51 - INFO - codeparrot_training - Step 2599: {'lr': 0.0004970856964304996, 'samples': 1331200, 'steps': 2599, 'batch_loss/train': 0.8541745794937015} 12/21/2021 23:09:01 - INFO - codeparrot_training - Step 2600: {'lr': 0.000497083296081079, 'samples': 1331712, 'steps': 2600, 'batch_loss/train': 0.8430203646421432} 12/21/2021 23:09:14 - INFO - codeparrot_training - Step 2601: {'lr': 0.0004970808947493483, 'samples': 1332224, 
'steps': 2601, 'batch_loss/train': 0.7953534601256251} 12/21/2021 23:09:25 - INFO - codeparrot_training - Step 2602: {'lr': 0.0004970784924353173, 'samples': 1332736, 'steps': 2602, 'batch_loss/train': 1.0203884337097406} 12/21/2021 23:09:35 - INFO - codeparrot_training - Step 2603: {'lr': 0.0004970760891389956, 'samples': 1333248, 'steps': 2603, 'batch_loss/train': 0.8180944984778762} 12/21/2021 23:09:48 - INFO - codeparrot_training - Step 2604: {'lr': 0.0004970736848603926, 'samples': 1333760, 'steps': 2604, 'batch_loss/train': 0.9121229574084282} 12/21/2021 23:09:58 - INFO - codeparrot_training - Step 2605: {'lr': 0.0004970712795995179, 'samples': 1334272, 'steps': 2605, 'batch_loss/train': 0.8513930896297097} 12/21/2021 23:10:09 - INFO - codeparrot_training - Step 2606: {'lr': 0.0004970688733563812, 'samples': 1334784, 'steps': 2606, 'batch_loss/train': 0.9241869505494833} 12/21/2021 23:10:20 - INFO - codeparrot_training - Step 2607: {'lr': 0.0004970664661309918, 'samples': 1335296, 'steps': 2607, 'batch_loss/train': 0.788687966298312} 12/21/2021 23:10:32 - INFO - codeparrot_training - Step 2608: {'lr': 0.0004970640579233594, 'samples': 1335808, 'steps': 2608, 'batch_loss/train': 0.871890461537987} 12/21/2021 23:10:43 - INFO - codeparrot_training - Step 2609: {'lr': 0.0004970616487334937, 'samples': 1336320, 'steps': 2609, 'batch_loss/train': 0.8284319383092225} 12/21/2021 23:10:53 - INFO - codeparrot_training - Step 2610: {'lr': 0.0004970592385614041, 'samples': 1336832, 'steps': 2610, 'batch_loss/train': 0.8617114580702037} 12/21/2021 23:11:06 - INFO - codeparrot_training - Step 2611: {'lr': 0.0004970568274071004, 'samples': 1337344, 'steps': 2611, 'batch_loss/train': 0.9049709932878613} 12/21/2021 23:11:16 - INFO - codeparrot_training - Step 2612: {'lr': 0.0004970544152705919, 'samples': 1337856, 'steps': 2612, 'batch_loss/train': 0.832271090708673} 12/21/2021 23:11:27 - INFO - codeparrot_training - Step 2613: {'lr': 0.0004970520021518884, 'samples': 
1338368, 'steps': 2613, 'batch_loss/train': 0.8715925868600607} 12/21/2021 23:11:39 - INFO - codeparrot_training - Step 2614: {'lr': 0.0004970495880509993, 'samples': 1338880, 'steps': 2614, 'batch_loss/train': 0.978884952608496} 12/21/2021 23:11:50 - INFO - codeparrot_training - Step 2615: {'lr': 0.0004970471729679345, 'samples': 1339392, 'steps': 2615, 'batch_loss/train': 0.8630227679386735} 12/21/2021 23:12:00 - INFO - codeparrot_training - Step 2616: {'lr': 0.0004970447569027033, 'samples': 1339904, 'steps': 2616, 'batch_loss/train': 0.9812512015923858} 12/21/2021 23:12:12 - INFO - codeparrot_training - Step 2617: {'lr': 0.0004970423398553154, 'samples': 1340416, 'steps': 2617, 'batch_loss/train': 0.8890673791756853} 12/21/2021 23:12:23 - INFO - codeparrot_training - Step 2618: {'lr': 0.0004970399218257805, 'samples': 1340928, 'steps': 2618, 'batch_loss/train': 0.8740392103791237} 12/21/2021 23:12:33 - INFO - codeparrot_training - Step 2619: {'lr': 0.0004970375028141081, 'samples': 1341440, 'steps': 2619, 'batch_loss/train': 0.9780873656272888} 12/21/2021 23:12:44 - INFO - codeparrot_training - Step 2620: {'lr': 0.000497035082820308, 'samples': 1341952, 'steps': 2620, 'batch_loss/train': 0.8473696820437908} 12/21/2021 23:12:57 - INFO - codeparrot_training - Step 2621: {'lr': 0.0004970326618443895, 'samples': 1342464, 'steps': 2621, 'batch_loss/train': 0.9232274442911148} 12/21/2021 23:13:07 - INFO - codeparrot_training - Step 2622: {'lr': 0.0004970302398863625, 'samples': 1342976, 'steps': 2622, 'batch_loss/train': 0.8577353181317449} 12/21/2021 23:13:18 - INFO - codeparrot_training - Step 2623: {'lr': 0.0004970278169462366, 'samples': 1343488, 'steps': 2623, 'batch_loss/train': 0.888273099437356} 12/21/2021 23:13:32 - INFO - codeparrot_training - Step 2624: {'lr': 0.0004970253930240213, 'samples': 1344000, 'steps': 2624, 'batch_loss/train': 0.8390135131776333} 12/21/2021 23:13:42 - INFO - codeparrot_training - Step 2625: {'lr': 0.0004970229681197264, 
'samples': 1344512, 'steps': 2625, 'batch_loss/train': 0.949209351092577} 12/21/2021 23:13:53 - INFO - codeparrot_training - Step 2626: {'lr': 0.0004970205422333613, 'samples': 1345024, 'steps': 2626, 'batch_loss/train': 0.7938872186932713} 12/21/2021 23:14:05 - INFO - codeparrot_training - Step 2627: {'lr': 0.0004970181153649359, 'samples': 1345536, 'steps': 2627, 'batch_loss/train': 1.011822747066617} 12/21/2021 23:14:16 - INFO - codeparrot_training - Step 2628: {'lr': 0.0004970156875144597, 'samples': 1346048, 'steps': 2628, 'batch_loss/train': 0.7088752770796418} 12/21/2021 23:14:26 - INFO - codeparrot_training - Step 2629: {'lr': 0.0004970132586819424, 'samples': 1346560, 'steps': 2629, 'batch_loss/train': 0.9272313341498375} 12/21/2021 23:14:38 - INFO - codeparrot_training - Step 2630: {'lr': 0.0004970108288673937, 'samples': 1347072, 'steps': 2630, 'batch_loss/train': 0.8952867351472378} 12/21/2021 23:14:49 - INFO - codeparrot_training - Step 2631: {'lr': 0.0004970083980708232, 'samples': 1347584, 'steps': 2631, 'batch_loss/train': 0.7968609081581235} 12/21/2021 23:15:00 - INFO - codeparrot_training - Step 2632: {'lr': 0.0004970059662922406, 'samples': 1348096, 'steps': 2632, 'batch_loss/train': 0.8547662980854511} 12/21/2021 23:15:10 - INFO - codeparrot_training - Step 2633: {'lr': 0.0004970035335316555, 'samples': 1348608, 'steps': 2633, 'batch_loss/train': 0.9292575269937515} 12/21/2021 23:15:23 - INFO - codeparrot_training - Step 2634: {'lr': 0.0004970010997890776, 'samples': 1349120, 'steps': 2634, 'batch_loss/train': 0.9456824958324432} 12/21/2021 23:15:34 - INFO - codeparrot_training - Step 2635: {'lr': 0.0004969986650645166, 'samples': 1349632, 'steps': 2635, 'batch_loss/train': 0.8396728993393481} 12/21/2021 23:15:45 - INFO - codeparrot_training - Step 2636: {'lr': 0.0004969962293579821, 'samples': 1350144, 'steps': 2636, 'batch_loss/train': 0.8499658210203052} 12/21/2021 23:15:57 - INFO - codeparrot_training - Step 2637: {'lr': 
0.0004969937926694839, 'samples': 1350656, 'steps': 2637, 'batch_loss/train': 0.8127735247835517} 12/21/2021 23:16:07 - INFO - codeparrot_training - Step 2638: {'lr': 0.0004969913549990317, 'samples': 1351168, 'steps': 2638, 'batch_loss/train': 0.8123251209035516} 12/21/2021 23:16:18 - INFO - codeparrot_training - Step 2639: {'lr': 0.0004969889163466351, 'samples': 1351680, 'steps': 2639, 'batch_loss/train': 0.9272226132452488} 12/21/2021 23:16:30 - INFO - codeparrot_training - Step 2640: {'lr': 0.0004969864767123039, 'samples': 1352192, 'steps': 2640, 'batch_loss/train': 0.9494982548058033} 12/21/2021 23:16:41 - INFO - codeparrot_training - Step 2641: {'lr': 0.0004969840360960477, 'samples': 1352704, 'steps': 2641, 'batch_loss/train': 0.8643590963911265} 12/21/2021 23:16:52 - INFO - codeparrot_training - Step 2642: {'lr': 0.0004969815944978761, 'samples': 1353216, 'steps': 2642, 'batch_loss/train': 0.9265452940016985} 12/21/2021 23:17:03 - INFO - codeparrot_training - Step 2643: {'lr': 0.0004969791519177991, 'samples': 1353728, 'steps': 2643, 'batch_loss/train': 0.8339579561725259} 12/21/2021 23:17:15 - INFO - codeparrot_training - Step 2644: {'lr': 0.0004969767083558262, 'samples': 1354240, 'steps': 2644, 'batch_loss/train': 0.9131715372204781} 12/21/2021 23:17:25 - INFO - codeparrot_training - Step 2645: {'lr': 0.0004969742638119671, 'samples': 1354752, 'steps': 2645, 'batch_loss/train': 0.8549335254356265} 12/21/2021 23:17:36 - INFO - codeparrot_training - Step 2646: {'lr': 0.0004969718182862317, 'samples': 1355264, 'steps': 2646, 'batch_loss/train': 0.8345813578926027} 12/21/2021 23:17:48 - INFO - codeparrot_training - Step 2647: {'lr': 0.0004969693717786295, 'samples': 1355776, 'steps': 2647, 'batch_loss/train': 0.881142670288682} 12/21/2021 23:17:58 - INFO - codeparrot_training - Step 2648: {'lr': 0.0004969669242891703, 'samples': 1356288, 'steps': 2648, 'batch_loss/train': 0.8633270543068647} 12/21/2021 23:18:09 - INFO - codeparrot_training - Step 2649: 
{'lr': 0.0004969644758178639, 'samples': 1356800, 'steps': 2649, 'batch_loss/train': 0.8655777284875512} 12/21/2021 23:18:22 - INFO - codeparrot_training - Step 2650: {'lr': 0.0004969620263647201, 'samples': 1357312, 'steps': 2650, 'batch_loss/train': 0.8849942591041327} 12/21/2021 23:18:33 - INFO - codeparrot_training - Step 2651: {'lr': 0.0004969595759297484, 'samples': 1357824, 'steps': 2651, 'batch_loss/train': 0.8167379070073366} 12/21/2021 23:18:43 - INFO - codeparrot_training - Step 2652: {'lr': 0.0004969571245129587, 'samples': 1358336, 'steps': 2652, 'batch_loss/train': 0.756469989195466} 12/21/2021 23:18:55 - INFO - codeparrot_training - Step 2653: {'lr': 0.0004969546721143608, 'samples': 1358848, 'steps': 2653, 'batch_loss/train': 0.8799095032736659} 12/21/2021 23:19:06 - INFO - codeparrot_training - Step 2654: {'lr': 0.0004969522187339643, 'samples': 1359360, 'steps': 2654, 'batch_loss/train': 0.8422023504972458} 12/21/2021 23:19:16 - INFO - codeparrot_training - Step 2655: {'lr': 0.000496949764371779, 'samples': 1359872, 'steps': 2655, 'batch_loss/train': 0.8282237350940704} 12/21/2021 23:19:27 - INFO - codeparrot_training - Step 2656: {'lr': 0.0004969473090278147, 'samples': 1360384, 'steps': 2656, 'batch_loss/train': 0.9151206817477942} 12/21/2021 23:19:39 - INFO - codeparrot_training - Step 2657: {'lr': 0.0004969448527020811, 'samples': 1360896, 'steps': 2657, 'batch_loss/train': 0.832263364456594} 12/21/2021 23:19:50 - INFO - codeparrot_training - Step 2658: {'lr': 0.0004969423953945881, 'samples': 1361408, 'steps': 2658, 'batch_loss/train': 0.8922028904780746} 12/21/2021 23:20:00 - INFO - codeparrot_training - Step 2659: {'lr': 0.0004969399371053453, 'samples': 1361920, 'steps': 2659, 'batch_loss/train': 0.6436726925894618} 12/21/2021 23:20:13 - INFO - codeparrot_training - Step 2660: {'lr': 0.0004969374778343626, 'samples': 1362432, 'steps': 2660, 'batch_loss/train': 0.8898045094683766} 12/21/2021 23:20:23 - INFO - codeparrot_training - Step 
2661: {'lr': 0.0004969350175816497, 'samples': 1362944, 'steps': 2661, 'batch_loss/train': 0.7435582936741412} 12/21/2021 23:20:34 - INFO - codeparrot_training - Step 2662: {'lr': 0.0004969325563472165, 'samples': 1363456, 'steps': 2662, 'batch_loss/train': 0.7861685073003173} 12/21/2021 23:20:47 - INFO - codeparrot_training - Step 2663: {'lr': 0.0004969300941310726, 'samples': 1363968, 'steps': 2663, 'batch_loss/train': 0.8314899448305368} 12/21/2021 23:20:57 - INFO - codeparrot_training - Step 2664: {'lr': 0.000496927630933228, 'samples': 1364480, 'steps': 2664, 'batch_loss/train': 0.8617793517187238} 12/21/2021 23:21:08 - INFO - codeparrot_training - Step 2665: {'lr': 0.0004969251667536922, 'samples': 1364992, 'steps': 2665, 'batch_loss/train': 1.206850937800482} 12/21/2021 23:21:19 - INFO - codeparrot_training - Step 2666: {'lr': 0.0004969227015924752, 'samples': 1365504, 'steps': 2666, 'batch_loss/train': 0.8822238245047629} 12/21/2021 23:21:31 - INFO - codeparrot_training - Step 2667: {'lr': 0.0004969202354495869, 'samples': 1366016, 'steps': 2667, 'batch_loss/train': 0.9446771945804358} 12/21/2021 23:21:41 - INFO - codeparrot_training - Step 2668: {'lr': 0.0004969177683250369, 'samples': 1366528, 'steps': 2668, 'batch_loss/train': 0.9044107217341661} 12/21/2021 23:21:52 - INFO - codeparrot_training - Step 2669: {'lr': 0.000496915300218835, 'samples': 1367040, 'steps': 2669, 'batch_loss/train': 1.0803780667483807} 12/21/2021 23:22:05 - INFO - codeparrot_training - Step 2670: {'lr': 0.0004969128311309912, 'samples': 1367552, 'steps': 2670, 'batch_loss/train': 0.854048520559445} 12/21/2021 23:22:15 - INFO - codeparrot_training - Step 2671: {'lr': 0.0004969103610615153, 'samples': 1368064, 'steps': 2671, 'batch_loss/train': 0.9085577204823494} 12/21/2021 23:22:26 - INFO - codeparrot_training - Step 2672: {'lr': 0.0004969078900104168, 'samples': 1368576, 'steps': 2672, 'batch_loss/train': 0.8956662062555552} 12/21/2021 23:22:37 - INFO - codeparrot_training - Step 
2673: {'lr': 0.0004969054179777059, 'samples': 1369088, 'steps': 2673, 'batch_loss/train': 0.8469512620940804} 12/21/2021 23:22:49 - INFO - codeparrot_training - Step 2674: {'lr': 0.0004969029449633924, 'samples': 1369600, 'steps': 2674, 'batch_loss/train': 0.9123724950477481} 12/21/2021 23:22:59 - INFO - codeparrot_training - Step 2675: {'lr': 0.0004969004709674858, 'samples': 1370112, 'steps': 2675, 'batch_loss/train': 0.8991649281233549} 12/21/2021 23:23:10 - INFO - codeparrot_training - Step 2676: {'lr': 0.0004968979959899963, 'samples': 1370624, 'steps': 2676, 'batch_loss/train': 0.9092682749032974} 12/21/2021 23:23:22 - INFO - codeparrot_training - Step 2677: {'lr': 0.0004968955200309335, 'samples': 1371136, 'steps': 2677, 'batch_loss/train': 0.9461671775206923} 12/21/2021 23:23:33 - INFO - codeparrot_training - Step 2678: {'lr': 0.0004968930430903075, 'samples': 1371648, 'steps': 2678, 'batch_loss/train': 0.8685863772407174} 12/21/2021 23:23:43 - INFO - codeparrot_training - Step 2679: {'lr': 0.0004968905651681278, 'samples': 1372160, 'steps': 2679, 'batch_loss/train': 1.840269047766924} 12/21/2021 23:23:56 - INFO - codeparrot_training - Step 2680: {'lr': 0.0004968880862644045, 'samples': 1372672, 'steps': 2680, 'batch_loss/train': 0.9768112525343895} 12/21/2021 23:24:07 - INFO - codeparrot_training - Step 2681: {'lr': 0.0004968856063791474, 'samples': 1373184, 'steps': 2681, 'batch_loss/train': 0.8325344668701291} 12/21/2021 23:24:17 - INFO - codeparrot_training - Step 2682: {'lr': 0.0004968831255123663, 'samples': 1373696, 'steps': 2682, 'batch_loss/train': 0.8481564680114388} 12/21/2021 23:24:28 - INFO - codeparrot_training - Step 2683: {'lr': 0.0004968806436640712, 'samples': 1374208, 'steps': 2683, 'batch_loss/train': 0.8954875534400344} 12/21/2021 23:24:40 - INFO - codeparrot_training - Step 2684: {'lr': 0.0004968781608342718, 'samples': 1374720, 'steps': 2684, 'batch_loss/train': 0.9011481814086437} 12/21/2021 23:24:50 - INFO - codeparrot_training - 
Step 2685: {'lr': 0.0004968756770229781, 'samples': 1375232, 'steps': 2685, 'batch_loss/train': 0.8864008858799934} 12/21/2021 23:25:01 - INFO - codeparrot_training - Step 2686: {'lr': 0.0004968731922301999, 'samples': 1375744, 'steps': 2686, 'batch_loss/train': 0.9456286551430821} 12/21/2021 23:25:13 - INFO - codeparrot_training - Step 2687: {'lr': 0.0004968707064559471, 'samples': 1376256, 'steps': 2687, 'batch_loss/train': 0.879870175383985} 12/21/2021 23:25:24 - INFO - codeparrot_training - Step 2688: {'lr': 0.0004968682197002295, 'samples': 1376768, 'steps': 2688, 'batch_loss/train': 0.8257661666721106} 12/21/2021 23:25:34 - INFO - codeparrot_training - Step 2689: {'lr': 0.0004968657319630572, 'samples': 1377280, 'steps': 2689, 'batch_loss/train': 0.9626553459092975} 12/21/2021 23:25:47 - INFO - codeparrot_training - Step 2690: {'lr': 0.00049686324324444, 'samples': 1377792, 'steps': 2690, 'batch_loss/train': 0.8306416813284159} 12/21/2021 23:25:57 - INFO - codeparrot_training - Step 2691: {'lr': 0.0004968607535443876, 'samples': 1378304, 'steps': 2691, 'batch_loss/train': 0.926137916278094} 12/21/2021 23:26:08 - INFO - codeparrot_training - Step 2692: {'lr': 0.0004968582628629102, 'samples': 1378816, 'steps': 2692, 'batch_loss/train': 0.8321814527735114} 12/21/2021 23:26:20 - INFO - codeparrot_training - Step 2693: {'lr': 0.0004968557712000174, 'samples': 1379328, 'steps': 2693, 'batch_loss/train': 0.8305544788017869} 12/21/2021 23:26:31 - INFO - codeparrot_training - Step 2694: {'lr': 0.0004968532785557195, 'samples': 1379840, 'steps': 2694, 'batch_loss/train': 0.8677771054208279} 12/21/2021 23:26:41 - INFO - codeparrot_training - Step 2695: {'lr': 0.000496850784930026, 'samples': 1380352, 'steps': 2695, 'batch_loss/train': 1.101633214391768} 12/21/2021 23:26:52 - INFO - codeparrot_training - Step 2696: {'lr': 0.000496848290322947, 'samples': 1380864, 'steps': 2696, 'batch_loss/train': 0.8915745439007878} 12/21/2021 23:27:05 - INFO - codeparrot_training - 
Step 2697: {'lr': 0.0004968457947344924, 'samples': 1381376, 'steps': 2697, 'batch_loss/train': 1.067648135125637} 12/21/2021 23:27:15 - INFO - codeparrot_training - Step 2698: {'lr': 0.0004968432981646723, 'samples': 1381888, 'steps': 2698, 'batch_loss/train': 0.8309534005820751} 12/21/2021 23:27:26 - INFO - codeparrot_training - Step 2699: {'lr': 0.0004968408006134962, 'samples': 1382400, 'steps': 2699, 'batch_loss/train': 0.8716224925592542} 12/21/2021 23:27:38 - INFO - codeparrot_training - Step 2700: {'lr': 0.0004968383020809744, 'samples': 1382912, 'steps': 2700, 'batch_loss/train': 0.793915051035583} 12/21/2021 23:27:49 - INFO - codeparrot_training - Step 2701: {'lr': 0.0004968358025671167, 'samples': 1383424, 'steps': 2701, 'batch_loss/train': 0.9377364721149206} 12/21/2021 23:27:59 - INFO - codeparrot_training - Step 2702: {'lr': 0.000496833302071933, 'samples': 1383936, 'steps': 2702, 'batch_loss/train': 0.8213134435936809} 12/21/2021 23:28:11 - INFO - codeparrot_training - Step 2703: {'lr': 0.0004968308005954333, 'samples': 1384448, 'steps': 2703, 'batch_loss/train': 0.9814783623442054} 12/21/2021 23:28:22 - INFO - codeparrot_training - Step 2704: {'lr': 0.0004968282981376276, 'samples': 1384960, 'steps': 2704, 'batch_loss/train': 0.8629418667405844} 12/21/2021 23:28:32 - INFO - codeparrot_training - Step 2705: {'lr': 0.0004968257946985257, 'samples': 1385472, 'steps': 2705, 'batch_loss/train': 0.8386291197966784} 12/21/2021 23:28:45 - INFO - codeparrot_training - Step 2706: {'lr': 0.0004968232902781378, 'samples': 1385984, 'steps': 2706, 'batch_loss/train': 0.8490266920998693} 12/21/2021 23:28:56 - INFO - codeparrot_training - Step 2707: {'lr': 0.0004968207848764735, 'samples': 1386496, 'steps': 2707, 'batch_loss/train': 0.6435901797376573} 12/21/2021 23:29:06 - INFO - codeparrot_training - Step 2708: {'lr': 0.0004968182784935431, 'samples': 1387008, 'steps': 2708, 'batch_loss/train': 0.9010505508631468} 12/21/2021 23:29:17 - INFO - codeparrot_training 
- Step 2709: {'lr': 0.0004968157711293564, 'samples': 1387520, 'steps': 2709, 'batch_loss/train': 0.7711103688925505} 12/21/2021 23:29:30 - INFO - codeparrot_training - Step 2710: {'lr': 0.0004968132627839233, 'samples': 1388032, 'steps': 2710, 'batch_loss/train': 0.9176118718460202} 12/21/2021 23:29:40 - INFO - codeparrot_training - Step 2711: {'lr': 0.0004968107534572539, 'samples': 1388544, 'steps': 2711, 'batch_loss/train': 0.8944595092907548} 12/21/2021 23:29:51 - INFO - codeparrot_training - Step 2712: {'lr': 0.0004968082431493582, 'samples': 1389056, 'steps': 2712, 'batch_loss/train': 0.7738998178392649} 12/21/2021 23:30:03 - INFO - codeparrot_training - Step 2713: {'lr': 0.000496805731860246, 'samples': 1389568, 'steps': 2713, 'batch_loss/train': 0.8671526871621609} 12/21/2021 23:30:13 - INFO - codeparrot_training - Step 2714: {'lr': 0.0004968032195899275, 'samples': 1390080, 'steps': 2714, 'batch_loss/train': 0.8038806186523288} 12/21/2021 23:30:24 - INFO - codeparrot_training - Step 2715: {'lr': 0.0004968007063384126, 'samples': 1390592, 'steps': 2715, 'batch_loss/train': 0.8965359330177307} 12/21/2021 23:30:36 - INFO - codeparrot_training - Step 2716: {'lr': 0.0004967981921057112, 'samples': 1391104, 'steps': 2716, 'batch_loss/train': 0.8254398126155138} 12/21/2021 23:30:47 - INFO - codeparrot_training - Step 2717: {'lr': 0.0004967956768918335, 'samples': 1391616, 'steps': 2717, 'batch_loss/train': 0.9515547091141343} 12/21/2021 23:30:58 - INFO - codeparrot_training - Step 2718: {'lr': 0.0004967931606967892, 'samples': 1392128, 'steps': 2718, 'batch_loss/train': 1.0667793406173587} 12/21/2021 23:31:08 - INFO - codeparrot_training - Step 2719: {'lr': 0.0004967906435205886, 'samples': 1392640, 'steps': 2719, 'batch_loss/train': 0.8525957148522139} 12/21/2021 23:31:21 - INFO - codeparrot_training - Step 2720: {'lr': 0.0004967881253632416, 'samples': 1393152, 'steps': 2720, 'batch_loss/train': 0.8748413920402527} 12/21/2021 23:31:31 - INFO - 
codeparrot_training - Step 2721: {'lr': 0.0004967856062247581, 'samples': 1393664, 'steps': 2721, 'batch_loss/train': 0.8947004042565823} 12/21/2021 23:31:42 - INFO - codeparrot_training - Step 2722: {'lr': 0.0004967830861051482, 'samples': 1394176, 'steps': 2722, 'batch_loss/train': 0.8371201138943434} 12/21/2021 23:31:54 - INFO - codeparrot_training - Step 2723: {'lr': 0.000496780565004422, 'samples': 1394688, 'steps': 2723, 'batch_loss/train': 0.8371166586875916} 12/21/2021 23:32:05 - INFO - codeparrot_training - Step 2724: {'lr': 0.0004967780429225894, 'samples': 1395200, 'steps': 2724, 'batch_loss/train': 0.7864578813314438} 12/21/2021 23:32:15 - INFO - codeparrot_training - Step 2725: {'lr': 0.0004967755198596606, 'samples': 1395712, 'steps': 2725, 'batch_loss/train': 0.9323317701928318} 12/21/2021 23:32:29 - INFO - codeparrot_training - Step 2726: {'lr': 0.0004967729958156455, 'samples': 1396224, 'steps': 2726, 'batch_loss/train': 0.8089192556217313} 12/21/2021 23:32:40 - INFO - codeparrot_training - Step 2727: {'lr': 0.000496770470790554, 'samples': 1396736, 'steps': 2727, 'batch_loss/train': 1.2252685157582164} 12/21/2021 23:32:50 - INFO - codeparrot_training - Step 2728: {'lr': 0.0004967679447843964, 'samples': 1397248, 'steps': 2728, 'batch_loss/train': 2.0998979611322284} 12/21/2021 23:33:01 - INFO - codeparrot_training - Step 2729: {'lr': 0.0004967654177971826, 'samples': 1397760, 'steps': 2729, 'batch_loss/train': 0.8694474725052714} 12/21/2021 23:33:13 - INFO - codeparrot_training - Step 2730: {'lr': 0.0004967628898289227, 'samples': 1398272, 'steps': 2730, 'batch_loss/train': 0.9039225094020367} 12/21/2021 23:33:24 - INFO - codeparrot_training - Step 2731: {'lr': 0.0004967603608796266, 'samples': 1398784, 'steps': 2731, 'batch_loss/train': 0.8972703805193305} 12/21/2021 23:33:34 - INFO - codeparrot_training - Step 2732: {'lr': 0.0004967578309493046, 'samples': 1399296, 'steps': 2732, 'batch_loss/train': 0.8592028999701142} 12/21/2021 23:33:46 - INFO 
- codeparrot_training - Step 2733: {'lr': 0.0004967553000379667, 'samples': 1399808, 'steps': 2733, 'batch_loss/train': 0.9384011151269078} 12/21/2021 23:33:57 - INFO - codeparrot_training - Step 2734: {'lr': 0.0004967527681456228, 'samples': 1400320, 'steps': 2734, 'batch_loss/train': 1.017911254428327} 12/21/2021 23:34:08 - INFO - codeparrot_training - Step 2735: {'lr': 0.000496750235272283, 'samples': 1400832, 'steps': 2735, 'batch_loss/train': 0.8836188307031989} 12/21/2021 23:34:20 - INFO - codeparrot_training - Step 2736: {'lr': 0.0004967477014179575, 'samples': 1401344, 'steps': 2736, 'batch_loss/train': 0.8850441165268421} 12/21/2021 23:34:31 - INFO - codeparrot_training - Step 2737: {'lr': 0.0004967451665826563, 'samples': 1401856, 'steps': 2737, 'batch_loss/train': 0.7825722214765847} 12/21/2021 23:34:42 - INFO - codeparrot_training - Step 2738: {'lr': 0.0004967426307663895, 'samples': 1402368, 'steps': 2738, 'batch_loss/train': 0.9299429561942816} 12/21/2021 23:34:54 - INFO - codeparrot_training - Step 2739: {'lr': 0.0004967400939691672, 'samples': 1402880, 'steps': 2739, 'batch_loss/train': 0.8927700808271766} 12/21/2021 23:35:04 - INFO - codeparrot_training - Step 2740: {'lr': 0.0004967375561909994, 'samples': 1403392, 'steps': 2740, 'batch_loss/train': 0.9150375127792358} 12/21/2021 23:35:15 - INFO - codeparrot_training - Step 2741: {'lr': 0.0004967350174318963, 'samples': 1403904, 'steps': 2741, 'batch_loss/train': 0.8059806269593537} 12/21/2021 23:35:25 - INFO - codeparrot_training - Step 2742: {'lr': 0.0004967324776918679, 'samples': 1404416, 'steps': 2742, 'batch_loss/train': 0.7748554176650941} 12/21/2021 23:35:38 - INFO - codeparrot_training - Step 2743: {'lr': 0.0004967299369709242, 'samples': 1404928, 'steps': 2743, 'batch_loss/train': 0.8595873732119799} 12/21/2021 23:35:49 - INFO - codeparrot_training - Step 2744: {'lr': 0.0004967273952690756, 'samples': 1405440, 'steps': 2744, 'batch_loss/train': 0.953446839004755} 12/21/2021 23:36:00 - 
INFO - codeparrot_training - Step 2745: {'lr': 0.0004967248525863321, 'samples': 1405952, 'steps': 2745, 'batch_loss/train': 0.9198706010356545} 12/21/2021 23:36:12 - INFO - codeparrot_training - Step 2746: {'lr': 0.0004967223089227035, 'samples': 1406464, 'steps': 2746, 'batch_loss/train': 0.8563555241562426} 12/21/2021 23:36:22 - INFO - codeparrot_training - Step 2747: {'lr': 0.0004967197642782004, 'samples': 1406976, 'steps': 2747, 'batch_loss/train': 0.8798938356339931} 12/21/2021 23:36:33 - INFO - codeparrot_training - Step 2748: {'lr': 0.0004967172186528325, 'samples': 1407488, 'steps': 2748, 'batch_loss/train': 0.8090404313988984} 12/21/2021 23:36:45 - INFO - codeparrot_training - Step 2749: {'lr': 0.0004967146720466103, 'samples': 1408000, 'steps': 2749, 'batch_loss/train': 1.0644350359216332} 12/21/2021 23:36:56 - INFO - codeparrot_training - Step 2750: {'lr': 0.0004967121244595435, 'samples': 1408512, 'steps': 2750, 'batch_loss/train': 0.8782786931842566} 12/21/2021 23:37:06 - INFO - codeparrot_training - Step 2751: {'lr': 0.0004967095758916426, 'samples': 1409024, 'steps': 2751, 'batch_loss/train': 0.8194076817017049} 12/21/2021 23:37:17 - INFO - codeparrot_training - Step 2752: {'lr': 0.0004967070263429174, 'samples': 1409536, 'steps': 2752, 'batch_loss/train': 0.8952427394688129} 12/21/2021 23:37:29 - INFO - codeparrot_training - Step 2753: {'lr': 0.0004967044758133784, 'samples': 1410048, 'steps': 2753, 'batch_loss/train': 0.984594602137804} 12/21/2021 23:37:40 - INFO - codeparrot_training - Step 2754: {'lr': 0.0004967019243030355, 'samples': 1410560, 'steps': 2754, 'batch_loss/train': 0.8314443789422512} 12/21/2021 23:37:50 - INFO - codeparrot_training - Step 2755: {'lr': 0.0004966993718118988, 'samples': 1411072, 'steps': 2755, 'batch_loss/train': 0.8793294029310346} 12/21/2021 23:38:03 - INFO - codeparrot_training - Step 2756: {'lr': 0.0004966968183399787, 'samples': 1411584, 'steps': 2756, 'batch_loss/train': 0.8368006106466055} 12/21/2021 
23:38:14 - INFO - codeparrot_training - Step 2757: {'lr': 0.0004966942638872851, 'samples': 1412096, 'steps': 2757, 'batch_loss/train': 0.884452068246901} 12/21/2021 23:38:24 - INFO - codeparrot_training - Step 2758: {'lr': 0.0004966917084538283, 'samples': 1412608, 'steps': 2758, 'batch_loss/train': 0.958603729493916} 12/21/2021 23:38:36 - INFO - codeparrot_training - Step 2759: {'lr': 0.0004966891520396184, 'samples': 1413120, 'steps': 2759, 'batch_loss/train': 0.8930850634351373} 12/21/2021 23:38:47 - INFO - codeparrot_training - Step 2760: {'lr': 0.0004966865946446656, 'samples': 1413632, 'steps': 2760, 'batch_loss/train': 0.8794597964733839} 12/21/2021 23:38:57 - INFO - codeparrot_training - Step 2761: {'lr': 0.0004966840362689801, 'samples': 1414144, 'steps': 2761, 'batch_loss/train': 0.8583412365987897} 12/21/2021 23:39:08 - INFO - codeparrot_training - Step 2762: {'lr': 0.0004966814769125718, 'samples': 1414656, 'steps': 2762, 'batch_loss/train': 0.9090088084340096} 12/21/2021 23:39:20 - INFO - codeparrot_training - Step 2763: {'lr': 0.0004966789165754512, 'samples': 1415168, 'steps': 2763, 'batch_loss/train': 0.8443227065727115} 12/21/2021 23:39:31 - INFO - codeparrot_training - Step 2764: {'lr': 0.0004966763552576285, 'samples': 1415680, 'steps': 2764, 'batch_loss/train': 0.719876506831497} 12/21/2021 23:39:41 - INFO - codeparrot_training - Step 2765: {'lr': 0.0004966737929591136, 'samples': 1416192, 'steps': 2765, 'batch_loss/train': 0.76170210307464} 12/21/2021 23:39:54 - INFO - codeparrot_training - Step 2766: {'lr': 0.000496671229679917, 'samples': 1416704, 'steps': 2766, 'batch_loss/train': 0.9373163082636893} 12/21/2021 23:40:05 - INFO - codeparrot_training - Step 2767: {'lr': 0.0004966686654200486, 'samples': 1417216, 'steps': 2767, 'batch_loss/train': 0.8672814543824643} 12/21/2021 23:40:15 - INFO - codeparrot_training - Step 2768: {'lr': 0.0004966661001795188, 'samples': 1417728, 'steps': 2768, 'batch_loss/train': 0.9686710042878985} 12/21/2021 
23:40:27 - INFO - codeparrot_training - Step 2769: {'lr': 0.0004966635339583377, 'samples': 1418240, 'steps': 2769, 'batch_loss/train': 0.8428207878023386} 12/21/2021 23:40:38 - INFO - codeparrot_training - Step 2770: {'lr': 0.0004966609667565154, 'samples': 1418752, 'steps': 2770, 'batch_loss/train': 0.7251744972309098} 12/21/2021 23:40:49 - INFO - codeparrot_training - Step 2771: {'lr': 0.0004966583985740624, 'samples': 1419264, 'steps': 2771, 'batch_loss/train': 0.8712121639400721} 12/21/2021 23:40:59 - INFO - codeparrot_training - Step 2772: {'lr': 0.0004966558294109886, 'samples': 1419776, 'steps': 2772, 'batch_loss/train': 0.852354547358118} 12/21/2021 23:41:11 - INFO - codeparrot_training - Step 2773: {'lr': 0.0004966532592673045, 'samples': 1420288, 'steps': 2773, 'batch_loss/train': 0.7654701012652367} 12/21/2021 23:41:22 - INFO - codeparrot_training - Step 2774: {'lr': 0.0004966506881430201, 'samples': 1420800, 'steps': 2774, 'batch_loss/train': 0.8304709908552468} 12/21/2021 23:41:33 - INFO - codeparrot_training - Step 2775: {'lr': 0.0004966481160381456, 'samples': 1421312, 'steps': 2775, 'batch_loss/train': 0.8790553957223892} 12/21/2021 23:41:45 - INFO - codeparrot_training - Step 2776: {'lr': 0.0004966455429526915, 'samples': 1421824, 'steps': 2776, 'batch_loss/train': 0.7828576392494142} 12/21/2021 23:41:56 - INFO - codeparrot_training - Step 2777: {'lr': 0.0004966429688866677, 'samples': 1422336, 'steps': 2777, 'batch_loss/train': 0.8534541856497526} 12/21/2021 23:42:07 - INFO - codeparrot_training - Step 2778: {'lr': 0.0004966403938400847, 'samples': 1422848, 'steps': 2778, 'batch_loss/train': 0.7504670768976212} 12/21/2021 23:42:19 - INFO - codeparrot_training - Step 2779: {'lr': 0.0004966378178129526, 'samples': 1423360, 'steps': 2779, 'batch_loss/train': 0.8033681497909129} 12/21/2021 23:42:30 - INFO - codeparrot_training - Step 2780: {'lr': 0.0004966352408052817, 'samples': 1423872, 'steps': 2780, 'batch_loss/train': 0.8164720870554447} 
12/21/2021 23:42:40 - INFO - codeparrot_training - Step 2781: {'lr': 0.000496632662817082, 'samples': 1424384, 'steps': 2781, 'batch_loss/train': 0.8724935334175825} 12/21/2021 23:42:52 - INFO - codeparrot_training - Step 2782: {'lr': 0.0004966300838483642, 'samples': 1424896, 'steps': 2782, 'batch_loss/train': 0.8145610201172531} 12/21/2021 23:43:03 - INFO - codeparrot_training - Step 2783: {'lr': 0.0004966275038991382, 'samples': 1425408, 'steps': 2783, 'batch_loss/train': 0.8904623882845044} 12/21/2021 23:43:14 - INFO - codeparrot_training - Step 2784: {'lr': 0.0004966249229694144, 'samples': 1425920, 'steps': 2784, 'batch_loss/train': 0.8781835101544857} 12/21/2021 23:43:24 - INFO - codeparrot_training - Step 2785: {'lr': 0.000496622341059203, 'samples': 1426432, 'steps': 2785, 'batch_loss/train': 0.8815981792286038} 12/21/2021 23:43:37 - INFO - codeparrot_training - Step 2786: {'lr': 0.0004966197581685143, 'samples': 1426944, 'steps': 2786, 'batch_loss/train': 0.8855038192123175} 12/21/2021 23:43:48 - INFO - codeparrot_training - Step 2787: {'lr': 0.0004966171742973585, 'samples': 1427456, 'steps': 2787, 'batch_loss/train': 0.8718706956133246} 12/21/2021 23:43:58 - INFO - codeparrot_training - Step 2788: {'lr': 0.000496614589445746, 'samples': 1427968, 'steps': 2788, 'batch_loss/train': 0.9071602299809456} 12/21/2021 23:44:10 - INFO - codeparrot_training - Step 2789: {'lr': 0.000496612003613687, 'samples': 1428480, 'steps': 2789, 'batch_loss/train': 0.8452225644141436} 12/21/2021 23:44:21 - INFO - codeparrot_training - Step 2790: {'lr': 0.0004966094168011918, 'samples': 1428992, 'steps': 2790, 'batch_loss/train': 0.8973483853042126} 12/21/2021 23:44:32 - INFO - codeparrot_training - Step 2791: {'lr': 0.0004966068290082708, 'samples': 1429504, 'steps': 2791, 'batch_loss/train': 0.7372503913938999} 12/21/2021 23:44:44 - INFO - codeparrot_training - Step 2792: {'lr': 0.000496604240234934, 'samples': 1430016, 'steps': 2792, 'batch_loss/train': 0.8342458037659526} 
12/21/2021 23:44:54 - INFO - codeparrot_training - Step 2793: {'lr': 0.0004966016504811919, 'samples': 1430528, 'steps': 2793, 'batch_loss/train': 0.8195690996944904} 12/21/2021 23:45:05 - INFO - codeparrot_training - Step 2794: {'lr': 0.0004965990597470548, 'samples': 1431040, 'steps': 2794, 'batch_loss/train': 0.8199907764792442} 12/21/2021 23:45:16 - INFO - codeparrot_training - Step 2795: {'lr': 0.0004965964680325328, 'samples': 1431552, 'steps': 2795, 'batch_loss/train': 0.8460550783202052} 12/21/2021 23:45:28 - INFO - codeparrot_training - Step 2796: {'lr': 0.0004965938753376365, 'samples': 1432064, 'steps': 2796, 'batch_loss/train': 0.9620615616440773} 12/21/2021 23:45:39 - INFO - codeparrot_training - Step 2797: {'lr': 0.000496591281662376, 'samples': 1432576, 'steps': 2797, 'batch_loss/train': 0.8256391501054168} 12/21/2021 23:45:50 - INFO - codeparrot_training - Step 2798: {'lr': 0.0004965886870067617, 'samples': 1433088, 'steps': 2798, 'batch_loss/train': 0.7301017369609326} 12/21/2021 23:46:02 - INFO - codeparrot_training - Step 2799: {'lr': 0.0004965860913708039, 'samples': 1433600, 'steps': 2799, 'batch_loss/train': 0.8490456156432629} 12/21/2021 23:46:12 - INFO - codeparrot_training - Step 2800: {'lr': 0.0004965834947545128, 'samples': 1434112, 'steps': 2800, 'batch_loss/train': 0.9143759487196803} 12/21/2021 23:46:23 - INFO - codeparrot_training - Step 2801: {'lr': 0.000496580897157899, 'samples': 1434624, 'steps': 2801, 'batch_loss/train': 0.8197619970887899} 12/21/2021 23:46:36 - INFO - codeparrot_training - Step 2802: {'lr': 0.0004965782985809726, 'samples': 1435136, 'steps': 2802, 'batch_loss/train': 0.8713466441258788} 12/21/2021 23:46:46 - INFO - codeparrot_training - Step 2803: {'lr': 0.0004965756990237439, 'samples': 1435648, 'steps': 2803, 'batch_loss/train': 0.9801749223843217} 12/21/2021 23:46:57 - INFO - codeparrot_training - Step 2804: {'lr': 0.0004965730984862233, 'samples': 1436160, 'steps': 2804, 'batch_loss/train': 
0.8587969169020653} 12/21/2021 23:47:07 - INFO - codeparrot_training - Step 2805: {'lr': 0.0004965704969684213, 'samples': 1436672, 'steps': 2805, 'batch_loss/train': 0.8538382276892662} 12/21/2021 23:47:19 - INFO - codeparrot_training - Step 2806: {'lr': 0.0004965678944703481, 'samples': 1437184, 'steps': 2806, 'batch_loss/train': 0.9124665325507522} 12/21/2021 23:47:30 - INFO - codeparrot_training - Step 2807: {'lr': 0.0004965652909920139, 'samples': 1437696, 'steps': 2807, 'batch_loss/train': 0.8945768615230918} 12/21/2021 23:47:41 - INFO - codeparrot_training - Step 2808: {'lr': 0.0004965626865334294, 'samples': 1438208, 'steps': 2808, 'batch_loss/train': 0.9108815314248204} 12/21/2021 23:47:53 - INFO - codeparrot_training - Step 2809: {'lr': 0.0004965600810946047, 'samples': 1438720, 'steps': 2809, 'batch_loss/train': 0.9295633239671588} 12/21/2021 23:48:03 - INFO - codeparrot_training - Step 2810: {'lr': 0.0004965574746755502, 'samples': 1439232, 'steps': 2810, 'batch_loss/train': 0.9038051944226027} 12/21/2021 23:48:14 - INFO - codeparrot_training - Step 2811: {'lr': 0.0004965548672762761, 'samples': 1439744, 'steps': 2811, 'batch_loss/train': 0.8813595036044717} 12/21/2021 23:48:27 - INFO - codeparrot_training - Step 2812: {'lr': 0.0004965522588967933, 'samples': 1440256, 'steps': 2812, 'batch_loss/train': 0.8693674448877573} 12/21/2021 23:48:37 - INFO - codeparrot_training - Step 2813: {'lr': 0.0004965496495371115, 'samples': 1440768, 'steps': 2813, 'batch_loss/train': 0.854512601159513} 12/21/2021 23:48:48 - INFO - codeparrot_training - Step 2814: {'lr': 0.0004965470391972416, 'samples': 1441280, 'steps': 2814, 'batch_loss/train': 0.8857122138142586} 12/21/2021 23:49:00 - INFO - codeparrot_training - Step 2815: {'lr': 0.0004965444278771936, 'samples': 1441792, 'steps': 2815, 'batch_loss/train': 0.9771657353267074} 12/21/2021 23:49:11 - INFO - codeparrot_training - Step 2816: {'lr': 0.0004965418155769781, 'samples': 1442304, 'steps': 2816, 
'batch_loss/train': 0.8739115702919662} 12/21/2021 23:49:21 - INFO - codeparrot_training - Step 2817: {'lr': 0.0004965392022966054, 'samples': 1442816, 'steps': 2817, 'batch_loss/train': 0.916544396430254} 12/21/2021 23:49:32 - INFO - codeparrot_training - Step 2818: {'lr': 0.0004965365880360861, 'samples': 1443328, 'steps': 2818, 'batch_loss/train': 0.8176784424576908} 12/21/2021 23:49:44 - INFO - codeparrot_training - Step 2819: {'lr': 0.0004965339727954302, 'samples': 1443840, 'steps': 2819, 'batch_loss/train': 0.7750601591542363} 12/21/2021 23:49:55 - INFO - codeparrot_training - Step 2820: {'lr': 0.0004965313565746484, 'samples': 1444352, 'steps': 2820, 'batch_loss/train': 0.9488355049397796} 12/21/2021 23:50:05 - INFO - codeparrot_training - Step 2821: {'lr': 0.000496528739373751, 'samples': 1444864, 'steps': 2821, 'batch_loss/train': 0.7964668245986104} 12/21/2021 23:50:17 - INFO - codeparrot_training - Step 2822: {'lr': 0.0004965261211927484, 'samples': 1445376, 'steps': 2822, 'batch_loss/train': 0.9484735019505024} 12/21/2021 23:50:28 - INFO - codeparrot_training - Step 2823: {'lr': 0.0004965235020316511, 'samples': 1445888, 'steps': 2823, 'batch_loss/train': 0.7780307899229228} 12/21/2021 23:50:39 - INFO - codeparrot_training - Step 2824: {'lr': 0.0004965208818904694, 'samples': 1446400, 'steps': 2824, 'batch_loss/train': 0.7593808164820075} 12/21/2021 23:50:52 - INFO - codeparrot_training - Step 2825: {'lr': 0.0004965182607692137, 'samples': 1446912, 'steps': 2825, 'batch_loss/train': 1.2515212474390864} 12/21/2021 23:51:02 - INFO - codeparrot_training - Step 2826: {'lr': 0.0004965156386678945, 'samples': 1447424, 'steps': 2826, 'batch_loss/train': 0.8807951798662543} 12/21/2021 23:51:13 - INFO - codeparrot_training - Step 2827: {'lr': 0.0004965130155865223, 'samples': 1447936, 'steps': 2827, 'batch_loss/train': 0.8640672229230404} 12/21/2021 23:51:23 - INFO - codeparrot_training - Step 2828: {'lr': 0.0004965103915251073, 'samples': 1448448, 'steps': 
2828, 'batch_loss/train': 0.9924935344606638} 12/21/2021 23:51:36 - INFO - codeparrot_training - Step 2829: {'lr': 0.0004965077664836602, 'samples': 1448960, 'steps': 2829, 'batch_loss/train': 0.8473209850490093} 12/21/2021 23:51:46 - INFO - codeparrot_training - Step 2830: {'lr': 0.0004965051404621911, 'samples': 1449472, 'steps': 2830, 'batch_loss/train': 0.8569857375696301} 12/21/2021 23:51:57 - INFO - codeparrot_training - Step 2831: {'lr': 0.0004965025134607107, 'samples': 1449984, 'steps': 2831, 'batch_loss/train': 0.8988811438903213} 12/21/2021 23:52:09 - INFO - codeparrot_training - Step 2832: {'lr': 0.0004964998854792295, 'samples': 1450496, 'steps': 2832, 'batch_loss/train': 0.84225017670542} 12/21/2021 23:52:20 - INFO - codeparrot_training - Step 2833: {'lr': 0.0004964972565177578, 'samples': 1451008, 'steps': 2833, 'batch_loss/train': 0.9077969556674361} 12/21/2021 23:52:30 - INFO - codeparrot_training - Step 2834: {'lr': 0.000496494626576306, 'samples': 1451520, 'steps': 2834, 'batch_loss/train': 0.9489975296892226} 12/21/2021 23:52:43 - INFO - codeparrot_training - Step 2835: {'lr': 0.0004964919956548847, 'samples': 1452032, 'steps': 2835, 'batch_loss/train': 0.8240236695855856} 12/21/2021 23:52:54 - INFO - codeparrot_training - Step 2836: {'lr': 0.0004964893637535043, 'samples': 1452544, 'steps': 2836, 'batch_loss/train': 0.9654202740639448} 12/21/2021 23:53:04 - INFO - codeparrot_training - Step 2837: {'lr': 0.0004964867308721753, 'samples': 1453056, 'steps': 2837, 'batch_loss/train': 0.852312111761421} 12/21/2021 23:53:15 - INFO - codeparrot_training - Step 2838: {'lr': 0.000496484097010908, 'samples': 1453568, 'steps': 2838, 'batch_loss/train': 0.8109756391495466} 12/21/2021 23:53:27 - INFO - codeparrot_training - Step 2839: {'lr': 0.000496481462169713, 'samples': 1454080, 'steps': 2839, 'batch_loss/train': 0.8806477729231119} 12/21/2021 23:53:38 - INFO - codeparrot_training - Step 2840: {'lr': 0.0004964788263486008, 'samples': 1454592, 'steps': 
2840, 'batch_loss/train': 0.8800964802503586} 12/21/2021 23:53:48 - INFO - codeparrot_training - Step 2841: {'lr': 0.000496476189547582, 'samples': 1455104, 'steps': 2841, 'batch_loss/train': 0.9086245694197714} 12/21/2021 23:54:01 - INFO - codeparrot_training - Step 2842: {'lr': 0.0004964735517666668, 'samples': 1455616, 'steps': 2842, 'batch_loss/train': 0.8611039528623223} 12/21/2021 23:54:12 - INFO - codeparrot_training - Step 2843: {'lr': 0.0004964709130058658, 'samples': 1456128, 'steps': 2843, 'batch_loss/train': 0.885093062184751} 12/21/2021 23:54:22 - INFO - codeparrot_training - Step 2844: {'lr': 0.0004964682732651896, 'samples': 1456640, 'steps': 2844, 'batch_loss/train': 0.9418956665322185} 12/21/2021 23:54:34 - INFO - codeparrot_training - Step 2845: {'lr': 0.0004964656325446486, 'samples': 1457152, 'steps': 2845, 'batch_loss/train': 0.8936934070661664} 12/21/2021 23:54:45 - INFO - codeparrot_training - Step 2846: {'lr': 0.0004964629908442532, 'samples': 1457664, 'steps': 2846, 'batch_loss/train': 0.8727347794920206} 12/21/2021 23:54:55 - INFO - codeparrot_training - Step 2847: {'lr': 0.0004964603481640142, 'samples': 1458176, 'steps': 2847, 'batch_loss/train': 0.9256553370505571} 12/21/2021 23:55:07 - INFO - codeparrot_training - Step 2848: {'lr': 0.0004964577045039418, 'samples': 1458688, 'steps': 2848, 'batch_loss/train': 0.9007095694541931} 12/21/2021 23:55:18 - INFO - codeparrot_training - Step 2849: {'lr': 0.0004964550598640466, 'samples': 1459200, 'steps': 2849, 'batch_loss/train': 0.9398393807932734} 12/21/2021 23:55:29 - INFO - codeparrot_training - Step 2850: {'lr': 0.0004964524142443392, 'samples': 1459712, 'steps': 2850, 'batch_loss/train': 0.8468107334338129} 12/21/2021 23:55:39 - INFO - codeparrot_training - Step 2851: {'lr': 0.0004964497676448301, 'samples': 1460224, 'steps': 2851, 'batch_loss/train': 0.9058442609384656} 12/21/2021 23:55:52 - INFO - codeparrot_training - Step 2852: {'lr': 0.0004964471200655296, 'samples': 1460736, 
'steps': 2852, 'batch_loss/train': 0.8936695717275143} 12/21/2021 23:56:03 - INFO - codeparrot_training - Step 2853: {'lr': 0.0004964444715064485, 'samples': 1461248, 'steps': 2853, 'batch_loss/train': 0.8698366414755583} 12/21/2021 23:56:13 - INFO - codeparrot_training - Step 2854: {'lr': 0.0004964418219675973, 'samples': 1461760, 'steps': 2854, 'batch_loss/train': 0.8841481376439333} 12/21/2021 23:56:26 - INFO - codeparrot_training - Step 2855: {'lr': 0.0004964391714489864, 'samples': 1462272, 'steps': 2855, 'batch_loss/train': 0.9602138102054596} 12/21/2021 23:56:37 - INFO - codeparrot_training - Step 2856: {'lr': 0.0004964365199506264, 'samples': 1462784, 'steps': 2856, 'batch_loss/train': 1.2038272619247437} 12/21/2021 23:56:47 - INFO - codeparrot_training - Step 2857: {'lr': 0.000496433867472528, 'samples': 1463296, 'steps': 2857, 'batch_loss/train': 1.0564644867554307} 12/21/2021 23:56:58 - INFO - codeparrot_training - Step 2858: {'lr': 0.0004964312140147013, 'samples': 1463808, 'steps': 2858, 'batch_loss/train': 0.9190641529858112} 12/21/2021 23:57:10 - INFO - codeparrot_training - Step 2859: {'lr': 0.0004964285595771573, 'samples': 1464320, 'steps': 2859, 'batch_loss/train': 0.8868136648088694} 12/21/2021 23:57:21 - INFO - codeparrot_training - Step 2860: {'lr': 0.0004964259041599066, 'samples': 1464832, 'steps': 2860, 'batch_loss/train': 0.8466164115816355} 12/21/2021 23:57:31 - INFO - codeparrot_training - Step 2861: {'lr': 0.0004964232477629593, 'samples': 1465344, 'steps': 2861, 'batch_loss/train': 0.7719985288567841} 12/21/2021 23:57:45 - INFO - codeparrot_training - Step 2862: {'lr': 0.0004964205903863262, 'samples': 1465856, 'steps': 2862, 'batch_loss/train': 0.915548182092607} 12/21/2021 23:57:55 - INFO - codeparrot_training - Step 2863: {'lr': 0.0004964179320300179, 'samples': 1466368, 'steps': 2863, 'batch_loss/train': 0.8909932738170028} 12/21/2021 23:58:06 - INFO - codeparrot_training - Step 2864: {'lr': 0.000496415272694045, 'samples': 
1466880, 'steps': 2864, 'batch_loss/train': 0.765384775120765} 12/21/2021 23:58:18 - INFO - codeparrot_training - Step 2865: {'lr': 0.000496412612378418, 'samples': 1467392, 'steps': 2865, 'batch_loss/train': 0.8376013990491629} 12/21/2021 23:58:29 - INFO - codeparrot_training - Step 2866: {'lr': 0.0004964099510831475, 'samples': 1467904, 'steps': 2866, 'batch_loss/train': 0.8705869084224105} 12/21/2021 23:58:39 - INFO - codeparrot_training - Step 2867: {'lr': 0.0004964072888082442, 'samples': 1468416, 'steps': 2867, 'batch_loss/train': 0.9318092977628112} 12/21/2021 23:58:50 - INFO - codeparrot_training - Step 2868: {'lr': 0.0004964046255537184, 'samples': 1468928, 'steps': 2868, 'batch_loss/train': 0.8997740289196372} 12/21/2021 23:59:02 - INFO - codeparrot_training - Step 2869: {'lr': 0.000496401961319581, 'samples': 1469440, 'steps': 2869, 'batch_loss/train': 1.031172463670373} 12/21/2021 23:59:13 - INFO - codeparrot_training - Step 2870: {'lr': 0.0004963992961058422, 'samples': 1469952, 'steps': 2870, 'batch_loss/train': 0.8672622283920646} 12/21/2021 23:59:23 - INFO - codeparrot_training - Step 2871: {'lr': 0.000496396629912513, 'samples': 1470464, 'steps': 2871, 'batch_loss/train': 1.049331195652485} 12/21/2021 23:59:36 - INFO - codeparrot_training - Step 2872: {'lr': 0.0004963939627396038, 'samples': 1470976, 'steps': 2872, 'batch_loss/train': 1.3430225565098226} 12/21/2021 23:59:47 - INFO - codeparrot_training - Step 2873: {'lr': 0.0004963912945871252, 'samples': 1471488, 'steps': 2873, 'batch_loss/train': 0.8312508948147297} 12/21/2021 23:59:57 - INFO - codeparrot_training - Step 2874: {'lr': 0.0004963886254550878, 'samples': 1472000, 'steps': 2874, 'batch_loss/train': 0.8993829181417823} 12/22/2021 00:00:09 - INFO - codeparrot_training - Step 2875: {'lr': 0.0004963859553435025, 'samples': 1472512, 'steps': 2875, 'batch_loss/train': 0.8733553243800998} 12/22/2021 00:00:20 - INFO - codeparrot_training - Step 2876: {'lr': 0.0004963832842523794, 'samples': 
1473024, 'steps': 2876, 'batch_loss/train': 0.9229561807587743} 12/22/2021 00:00:31 - INFO - codeparrot_training - Step 2877: {'lr': 0.0004963806121817296, 'samples': 1473536, 'steps': 2877, 'batch_loss/train': 0.8684650948271155} 12/22/2021 00:00:41 - INFO - codeparrot_training - Step 2878: {'lr': 0.0004963779391315634, 'samples': 1474048, 'steps': 2878, 'batch_loss/train': 0.929153410717845} 12/22/2021 00:00:54 - INFO - codeparrot_training - Step 2879: {'lr': 0.0004963752651018916, 'samples': 1474560, 'steps': 2879, 'batch_loss/train': 0.9285211749374866} 12/22/2021 00:01:04 - INFO - codeparrot_training - Step 2880: {'lr': 0.0004963725900927247, 'samples': 1475072, 'steps': 2880, 'batch_loss/train': 0.9373218063265085} 12/22/2021 00:01:15 - INFO - codeparrot_training - Step 2881: {'lr': 0.0004963699141040734, 'samples': 1475584, 'steps': 2881, 'batch_loss/train': 0.7236965104239061} 12/22/2021 00:01:28 - INFO - codeparrot_training - Step 2882: {'lr': 0.0004963672371359484, 'samples': 1476096, 'steps': 2882, 'batch_loss/train': 0.9335309453308582} 12/22/2021 00:01:38 - INFO - codeparrot_training - Step 2883: {'lr': 0.0004963645591883603, 'samples': 1476608, 'steps': 2883, 'batch_loss/train': 0.9433216033503413} 12/22/2021 00:01:49 - INFO - codeparrot_training - Step 2884: {'lr': 0.0004963618802613197, 'samples': 1477120, 'steps': 2884, 'batch_loss/train': 1.0087015088647604} 12/22/2021 00:02:01 - INFO - codeparrot_training - Step 2885: {'lr': 0.0004963592003548374, 'samples': 1477632, 'steps': 2885, 'batch_loss/train': 0.6383123500272632} 12/22/2021 00:02:12 - INFO - codeparrot_training - Step 2886: {'lr': 0.0004963565194689238, 'samples': 1478144, 'steps': 2886, 'batch_loss/train': 0.8915530694648623} 12/22/2021 00:02:22 - INFO - codeparrot_training - Step 2887: {'lr': 0.0004963538376035897, 'samples': 1478656, 'steps': 2887, 'batch_loss/train': 0.8105394979938865} 12/22/2021 00:02:34 - INFO - codeparrot_training - Step 2888: {'lr': 0.0004963511547588458, 
'samples': 1479168, 'steps': 2888, 'batch_loss/train': 0.9146529901772738} 12/22/2021 00:02:45 - INFO - codeparrot_training - Step 2889: {'lr': 0.0004963484709347028, 'samples': 1479680, 'steps': 2889, 'batch_loss/train': 0.9707105793058872} 12/22/2021 00:02:56 - INFO - codeparrot_training - Step 2890: {'lr': 0.0004963457861311713, 'samples': 1480192, 'steps': 2890, 'batch_loss/train': 0.7517307370435447} 12/22/2021 00:03:06 - INFO - codeparrot_training - Step 2891: {'lr': 0.0004963431003482618, 'samples': 1480704, 'steps': 2891, 'batch_loss/train': 0.8765983311459422} 12/22/2021 00:03:19 - INFO - codeparrot_training - Step 2892: {'lr': 0.0004963404135859853, 'samples': 1481216, 'steps': 2892, 'batch_loss/train': 0.7453456418588758} 12/22/2021 00:03:30 - INFO - codeparrot_training - Step 2893: {'lr': 0.0004963377258443522, 'samples': 1481728, 'steps': 2893, 'batch_loss/train': 0.9584305575117469} 12/22/2021 00:03:40 - INFO - codeparrot_training - Step 2894: {'lr': 0.0004963350371233735, 'samples': 1482240, 'steps': 2894, 'batch_loss/train': 0.8540538772940636} 12/22/2021 00:03:53 - INFO - codeparrot_training - Step 2895: {'lr': 0.0004963323474230596, 'samples': 1482752, 'steps': 2895, 'batch_loss/train': 0.8443974177353084} 12/22/2021 00:04:03 - INFO - codeparrot_training - Step 2896: {'lr': 0.0004963296567434213, 'samples': 1483264, 'steps': 2896, 'batch_loss/train': 0.800471568480134} 12/22/2021 00:04:14 - INFO - codeparrot_training - Step 2897: {'lr': 0.0004963269650844692, 'samples': 1483776, 'steps': 2897, 'batch_loss/train': 0.8654970473144203} 12/22/2021 00:04:27 - INFO - codeparrot_training - Step 2898: {'lr': 0.0004963242724462143, 'samples': 1484288, 'steps': 2898, 'batch_loss/train': 0.8327738051302731} 12/22/2021 00:04:37 - INFO - codeparrot_training - Step 2899: {'lr': 0.0004963215788286669, 'samples': 1484800, 'steps': 2899, 'batch_loss/train': 0.8541070362553} 12/22/2021 00:04:48 - INFO - codeparrot_training - Step 2900: {'lr': 0.0004963188842318381, 
'samples': 1485312, 'steps': 2900, 'batch_loss/train': 0.8433470204472542} 12/22/2021 00:04:59 - INFO - codeparrot_training - Step 2901: {'lr': 0.0004963161886557383, 'samples': 1485824, 'steps': 2901, 'batch_loss/train': 0.8325515834148973} 12/22/2021 00:05:11 - INFO - codeparrot_training - Step 2902: {'lr': 0.0004963134921003783, 'samples': 1486336, 'steps': 2902, 'batch_loss/train': 0.9418292930349708} 12/22/2021 00:05:21 - INFO - codeparrot_training - Step 2903: {'lr': 0.0004963107945657689, 'samples': 1486848, 'steps': 2903, 'batch_loss/train': 0.7670798851177096} 12/22/2021 00:05:32 - INFO - codeparrot_training - Step 2904: {'lr': 0.0004963080960519209, 'samples': 1487360, 'steps': 2904, 'batch_loss/train': 0.8767623296007514} 12/22/2021 00:05:44 - INFO - codeparrot_training - Step 2905: {'lr': 0.0004963053965588447, 'samples': 1487872, 'steps': 2905, 'batch_loss/train': 0.9281723168678582} 12/22/2021 00:05:54 - INFO - codeparrot_training - Step 2906: {'lr': 0.0004963026960865515, 'samples': 1488384, 'steps': 2906, 'batch_loss/train': 0.8888087114319205} 12/22/2021 00:06:05 - INFO - codeparrot_training - Step 2907: {'lr': 0.0004962999946350516, 'samples': 1488896, 'steps': 2907, 'batch_loss/train': 0.8760175798088312} 12/22/2021 00:06:17 - INFO - codeparrot_training - Step 2908: {'lr': 0.0004962972922043559, 'samples': 1489408, 'steps': 2908, 'batch_loss/train': 0.9299073615111411} 12/22/2021 00:06:28 - INFO - codeparrot_training - Step 2909: {'lr': 0.0004962945887944753, 'samples': 1489920, 'steps': 2909, 'batch_loss/train': 0.7910862183198333} 12/22/2021 00:06:39 - INFO - codeparrot_training - Step 2910: {'lr': 0.0004962918844054204, 'samples': 1490432, 'steps': 2910, 'batch_loss/train': 0.872055065818131} 12/22/2021 00:06:49 - INFO - codeparrot_training - Step 2911: {'lr': 0.0004962891790372019, 'samples': 1490944, 'steps': 2911, 'batch_loss/train': 0.8696145731955767} 12/22/2021 00:07:02 - INFO - codeparrot_training - Step 2912: {'lr': 
0.0004962864726898306, 'samples': 1491456, 'steps': 2912, 'batch_loss/train': 0.8794516469351947} 12/22/2021 00:07:13 - INFO - codeparrot_training - Step 2913: {'lr': 0.0004962837653633174, 'samples': 1491968, 'steps': 2913, 'batch_loss/train': 0.7723189720418304} 12/22/2021 00:07:23 - INFO - codeparrot_training - Step 2914: {'lr': 0.0004962810570576727, 'samples': 1492480, 'steps': 2914, 'batch_loss/train': 0.9134014174342155} 12/22/2021 00:07:35 - INFO - codeparrot_training - Step 2915: {'lr': 0.0004962783477729076, 'samples': 1492992, 'steps': 2915, 'batch_loss/train': 0.8580280332826078} 12/22/2021 00:07:46 - INFO - codeparrot_training - Step 2916: {'lr': 0.0004962756375090329, 'samples': 1493504, 'steps': 2916, 'batch_loss/train': 0.832851410843432} 12/22/2021 00:07:57 - INFO - codeparrot_training - Step 2917: {'lr': 0.0004962729262660593, 'samples': 1494016, 'steps': 2917, 'batch_loss/train': 1.0168451061472297} 12/22/2021 00:08:09 - INFO - codeparrot_training - Step 2918: {'lr': 0.0004962702140439973, 'samples': 1494528, 'steps': 2918, 'batch_loss/train': 0.852037301287055} 12/22/2021 00:08:19 - INFO - codeparrot_training - Step 2919: {'lr': 0.0004962675008428581, 'samples': 1495040, 'steps': 2919, 'batch_loss/train': 0.8455076892860234} 12/22/2021 00:08:30 - INFO - codeparrot_training - Step 2920: {'lr': 0.0004962647866626523, 'samples': 1495552, 'steps': 2920, 'batch_loss/train': 0.871443840675056} 12/22/2021 00:08:42 - INFO - codeparrot_training - Step 2921: {'lr': 0.0004962620715033906, 'samples': 1496064, 'steps': 2921, 'batch_loss/train': 0.7829725155606866} 12/22/2021 00:08:53 - INFO - codeparrot_training - Step 2922: {'lr': 0.0004962593553650839, 'samples': 1496576, 'steps': 2922, 'batch_loss/train': 0.8813907615840435} 12/22/2021 00:09:04 - INFO - codeparrot_training - Step 2923: {'lr': 0.000496256638247743, 'samples': 1497088, 'steps': 2923, 'batch_loss/train': 0.8756352197378874} 12/22/2021 00:09:14 - INFO - codeparrot_training - Step 2924: {'lr': 
0.0004962539201513787, 'samples': 1497600, 'steps': 2924, 'batch_loss/train': 0.8332637073472142} 12/22/2021 00:09:27 - INFO - codeparrot_training - Step 2925: {'lr': 0.0004962512010760018, 'samples': 1498112, 'steps': 2925, 'batch_loss/train': 0.8550588954240084} 12/22/2021 00:09:38 - INFO - codeparrot_training - Step 2926: {'lr': 0.0004962484810216232, 'samples': 1498624, 'steps': 2926, 'batch_loss/train': 0.8066789982840419} 12/22/2021 00:09:48 - INFO - codeparrot_training - Step 2927: {'lr': 0.0004962457599882535, 'samples': 1499136, 'steps': 2927, 'batch_loss/train': 0.7762155579403043} 12/22/2021 00:10:01 - INFO - codeparrot_training - Step 2928: {'lr': 0.0004962430379759036, 'samples': 1499648, 'steps': 2928, 'batch_loss/train': 0.8517991006374359} 12/22/2021 00:10:12 - INFO - codeparrot_training - Step 2929: {'lr': 0.0004962403149845844, 'samples': 1500160, 'steps': 2929, 'batch_loss/train': 0.786177865229547} 12/22/2021 00:10:22 - INFO - codeparrot_training - Step 2930: {'lr': 0.0004962375910143068, 'samples': 1500672, 'steps': 2930, 'batch_loss/train': 0.7200079159811139} 12/22/2021 00:10:34 - INFO - codeparrot_training - Step 2931: {'lr': 0.0004962348660650814, 'samples': 1501184, 'steps': 2931, 'batch_loss/train': 0.8562792930752039} 12/22/2021 00:10:45 - INFO - codeparrot_training - Step 2932: {'lr': 0.0004962321401369192, 'samples': 1501696, 'steps': 2932, 'batch_loss/train': 0.8072061110287905} 12/22/2021 00:10:56 - INFO - codeparrot_training - Step 2933: {'lr': 0.0004962294132298309, 'samples': 1502208, 'steps': 2933, 'batch_loss/train': 0.8849640721455216} 12/22/2021 00:11:06 - INFO - codeparrot_training - Step 2934: {'lr': 0.0004962266853438275, 'samples': 1502720, 'steps': 2934, 'batch_loss/train': 0.9049284048378468} 12/22/2021 00:11:18 - INFO - codeparrot_training - Step 2935: {'lr': 0.0004962239564789197, 'samples': 1503232, 'steps': 2935, 'batch_loss/train': 0.9346189349889755} 12/22/2021 00:11:29 - INFO - codeparrot_training - Step 2936: 
{'lr': 0.0004962212266351185, 'samples': 1503744, 'steps': 2936, 'batch_loss/train': 1.0462322980165482} 12/22/2021 00:11:40 - INFO - codeparrot_training - Step 2937: {'lr': 0.0004962184958124346, 'samples': 1504256, 'steps': 2937, 'batch_loss/train': 0.88956314092502} 12/22/2021 00:11:52 - INFO - codeparrot_training - Step 2938: {'lr': 0.000496215764010879, 'samples': 1504768, 'steps': 2938, 'batch_loss/train': 0.8482174808159471} 12/22/2021 00:12:03 - INFO - codeparrot_training - Step 2939: {'lr': 0.0004962130312304625, 'samples': 1505280, 'steps': 2939, 'batch_loss/train': 0.8539789877831936} 12/22/2021 00:12:14 - INFO - codeparrot_training - Step 2940: {'lr': 0.0004962102974711959, 'samples': 1505792, 'steps': 2940, 'batch_loss/train': 0.8022037263726816} 12/22/2021 00:12:26 - INFO - codeparrot_training - Step 2941: {'lr': 0.0004962075627330901, 'samples': 1506304, 'steps': 2941, 'batch_loss/train': 0.9581134002655745} 12/22/2021 00:12:37 - INFO - codeparrot_training - Step 2942: {'lr': 0.0004962048270161561, 'samples': 1506816, 'steps': 2942, 'batch_loss/train': 0.8410080634057522} 12/22/2021 00:12:48 - INFO - codeparrot_training - Step 2943: {'lr': 0.0004962020903204045, 'samples': 1507328, 'steps': 2943, 'batch_loss/train': 0.8268013931810856} 12/22/2021 00:12:58 - INFO - codeparrot_training - Step 2944: {'lr': 0.0004961993526458465, 'samples': 1507840, 'steps': 2944, 'batch_loss/train': 0.7475924706086516} 12/22/2021 00:13:10 - INFO - codeparrot_training - Step 2945: {'lr': 0.0004961966139924929, 'samples': 1508352, 'steps': 2945, 'batch_loss/train': 0.8386268308386207} 12/22/2021 00:13:21 - INFO - codeparrot_training - Step 2946: {'lr': 0.0004961938743603543, 'samples': 1508864, 'steps': 2946, 'batch_loss/train': 0.8953671613708138} 12/22/2021 00:13:32 - INFO - codeparrot_training - Step 2947: {'lr': 0.000496191133749442, 'samples': 1509376, 'steps': 2947, 'batch_loss/train': 0.9012109395116568} 12/22/2021 00:13:44 - INFO - codeparrot_training - Step 2948: 
{'lr': 0.0004961883921597666, 'samples': 1509888, 'steps': 2948, 'batch_loss/train': 0.9005736131221056} 12/22/2021 00:13:54 - INFO - codeparrot_training - Step 2949: {'lr': 0.0004961856495913392, 'samples': 1510400, 'steps': 2949, 'batch_loss/train': 1.0169416833668947} 12/22/2021 00:14:05 - INFO - codeparrot_training - Step 2950: {'lr': 0.0004961829060441705, 'samples': 1510912, 'steps': 2950, 'batch_loss/train': 1.0916631098371} 12/22/2021 00:14:18 - INFO - codeparrot_training - Step 2951: {'lr': 0.0004961801615182716, 'samples': 1511424, 'steps': 2951, 'batch_loss/train': 0.7530182292684913} 12/22/2021 00:14:29 - INFO - codeparrot_training - Step 2952: {'lr': 0.0004961774160136533, 'samples': 1511936, 'steps': 2952, 'batch_loss/train': 0.8072453094646335} 12/22/2021 00:14:39 - INFO - codeparrot_training - Step 2953: {'lr': 0.0004961746695303265, 'samples': 1512448, 'steps': 2953, 'batch_loss/train': 0.8104291008785367} 12/22/2021 00:14:50 - INFO - codeparrot_training - Step 2954: {'lr': 0.0004961719220683024, 'samples': 1512960, 'steps': 2954, 'batch_loss/train': 0.8152489559724927} 12/22/2021 00:15:02 - INFO - codeparrot_training - Step 2955: {'lr': 0.0004961691736275914, 'samples': 1513472, 'steps': 2955, 'batch_loss/train': 0.8046854101121426} 12/22/2021 00:15:13 - INFO - codeparrot_training - Step 2956: {'lr': 0.0004961664242082048, 'samples': 1513984, 'steps': 2956, 'batch_loss/train': 0.8412078423425555} 12/22/2021 00:15:23 - INFO - codeparrot_training - Step 2957: {'lr': 0.0004961636738101535, 'samples': 1514496, 'steps': 2957, 'batch_loss/train': 0.9023262448608875} 12/22/2021 00:15:36 - INFO - codeparrot_training - Step 2958: {'lr': 0.0004961609224334484, 'samples': 1515008, 'steps': 2958, 'batch_loss/train': 0.8240511417388916} 12/22/2021 00:15:47 - INFO - codeparrot_training - Step 2959: {'lr': 0.0004961581700781003, 'samples': 1515520, 'steps': 2959, 'batch_loss/train': 0.9120802539400756} 12/22/2021 00:15:57 - INFO - codeparrot_training - Step 
2960: {'lr': 0.0004961554167441203, 'samples': 1516032, 'steps': 2960, 'batch_loss/train': 0.8870995547622442} 12/22/2021 00:16:09 - INFO - codeparrot_training - Step 2961: {'lr': 0.0004961526624315194, 'samples': 1516544, 'steps': 2961, 'batch_loss/train': 0.9367961045354605} 12/22/2021 00:16:20 - INFO - codeparrot_training - Step 2962: {'lr': 0.0004961499071403084, 'samples': 1517056, 'steps': 2962, 'batch_loss/train': 0.8567962287925184} 12/22/2021 00:16:31 - INFO - codeparrot_training - Step 2963: {'lr': 0.0004961471508704983, 'samples': 1517568, 'steps': 2963, 'batch_loss/train': 0.960740964859724} 12/22/2021 00:16:43 - INFO - codeparrot_training - Step 2964: {'lr': 0.0004961443936221001, 'samples': 1518080, 'steps': 2964, 'batch_loss/train': 0.8646677490323782} 12/22/2021 00:16:53 - INFO - codeparrot_training - Step 2965: {'lr': 0.0004961416353951247, 'samples': 1518592, 'steps': 2965, 'batch_loss/train': 0.9682387188076973} 12/22/2021 00:17:04 - INFO - codeparrot_training - Step 2966: {'lr': 0.0004961388761895832, 'samples': 1519104, 'steps': 2966, 'batch_loss/train': 0.9413503743708134} 12/22/2021 00:17:14 - INFO - codeparrot_training - Step 2967: {'lr': 0.0004961361160054864, 'samples': 1519616, 'steps': 2967, 'batch_loss/train': 0.9703543996438384} 12/22/2021 00:17:27 - INFO - codeparrot_training - Step 2968: {'lr': 0.0004961333548428453, 'samples': 1520128, 'steps': 2968, 'batch_loss/train': 0.8507348503917456} 12/22/2021 00:17:38 - INFO - codeparrot_training - Step 2969: {'lr': 0.0004961305927016708, 'samples': 1520640, 'steps': 2969, 'batch_loss/train': 0.8701075608842075} 12/22/2021 00:17:48 - INFO - codeparrot_training - Step 2970: {'lr': 0.0004961278295819743, 'samples': 1521152, 'steps': 2970, 'batch_loss/train': 0.8559769433923066} 12/22/2021 00:18:01 - INFO - codeparrot_training - Step 2971: {'lr': 0.0004961250654837662, 'samples': 1521664, 'steps': 2971, 'batch_loss/train': 0.8931840006262064} 12/22/2021 00:18:11 - INFO - codeparrot_training - 
Step 2972: {'lr': 0.0004961223004070579, 'samples': 1522176, 'steps': 2972, 'batch_loss/train': 0.8695220556110144} 12/22/2021 00:18:22 - INFO - codeparrot_training - Step 2973: {'lr': 0.0004961195343518603, 'samples': 1522688, 'steps': 2973, 'batch_loss/train': 0.9247833495028317} 12/22/2021 00:18:35 - INFO - codeparrot_training - Step 2974: {'lr': 0.0004961167673181844, 'samples': 1523200, 'steps': 2974, 'batch_loss/train': 0.754037200473249} 12/22/2021 00:18:45 - INFO - codeparrot_training - Step 2975: {'lr': 0.000496113999306041, 'samples': 1523712, 'steps': 2975, 'batch_loss/train': 0.9668109491467476} 12/22/2021 00:18:56 - INFO - codeparrot_training - Step 2976: {'lr': 0.0004961112303154413, 'samples': 1524224, 'steps': 2976, 'batch_loss/train': 0.8115405943244696} 12/22/2021 00:19:06 - INFO - codeparrot_training - Step 2977: {'lr': 0.0004961084603463963, 'samples': 1524736, 'steps': 2977, 'batch_loss/train': 0.5788128701969981} 12/22/2021 00:19:19 - INFO - codeparrot_training - Step 2978: {'lr': 0.0004961056893989171, 'samples': 1525248, 'steps': 2978, 'batch_loss/train': 0.7695569433271885} 12/22/2021 00:19:30 - INFO - codeparrot_training - Step 2979: {'lr': 0.0004961029174730146, 'samples': 1525760, 'steps': 2979, 'batch_loss/train': 0.8036586173111573} 12/22/2021 00:19:40 - INFO - codeparrot_training - Step 2980: {'lr': 0.0004961001445686996, 'samples': 1526272, 'steps': 2980, 'batch_loss/train': 0.7599600348621607} 12/22/2021 00:19:53 - INFO - codeparrot_training - Step 2981: {'lr': 0.0004960973706859836, 'samples': 1526784, 'steps': 2981, 'batch_loss/train': 0.8249022495001554} 12/22/2021 00:20:03 - INFO - codeparrot_training - Step 2982: {'lr': 0.0004960945958248773, 'samples': 1527296, 'steps': 2982, 'batch_loss/train': 0.9340034928172827} 12/22/2021 00:20:14 - INFO - codeparrot_training - Step 2983: {'lr': 0.0004960918199853917, 'samples': 1527808, 'steps': 2983, 'batch_loss/train': 0.9259119182825089} 12/22/2021 00:20:26 - INFO - codeparrot_training 
- Step 2984: {'lr': 0.000496089043167538, 'samples': 1528320, 'steps': 2984, 'batch_loss/train': 0.965371448546648} 12/22/2021 00:20:37 - INFO - codeparrot_training - Step 2985: {'lr': 0.0004960862653713273, 'samples': 1528832, 'steps': 2985, 'batch_loss/train': 0.963838517665863} 12/22/2021 00:20:47 - INFO - codeparrot_training - Step 2986: {'lr': 0.0004960834865967704, 'samples': 1529344, 'steps': 2986, 'batch_loss/train': 0.861060225404799} 12/22/2021 00:20:58 - INFO - codeparrot_training - Step 2987: {'lr': 0.0004960807068438785, 'samples': 1529856, 'steps': 2987, 'batch_loss/train': 0.9171348242089152} 12/22/2021 00:21:10 - INFO - codeparrot_training - Step 2988: {'lr': 0.0004960779261126626, 'samples': 1530368, 'steps': 2988, 'batch_loss/train': 0.7605819683521986} 12/22/2021 00:21:21 - INFO - codeparrot_training - Step 2989: {'lr': 0.0004960751444031339, 'samples': 1530880, 'steps': 2989, 'batch_loss/train': 0.7838842070195824} 12/22/2021 00:21:31 - INFO - codeparrot_training - Step 2990: {'lr': 0.0004960723617153032, 'samples': 1531392, 'steps': 2990, 'batch_loss/train': 0.8764848709106445} 12/22/2021 00:21:44 - INFO - codeparrot_training - Step 2991: {'lr': 0.0004960695780491818, 'samples': 1531904, 'steps': 2991, 'batch_loss/train': 0.8022947888821363} 12/22/2021 00:21:55 - INFO - codeparrot_training - Step 2992: {'lr': 0.0004960667934047807, 'samples': 1532416, 'steps': 2992, 'batch_loss/train': 0.8532500602304935} 12/22/2021 00:22:05 - INFO - codeparrot_training - Step 2993: {'lr': 0.0004960640077821108, 'samples': 1532928, 'steps': 2993, 'batch_loss/train': 0.8128378102555871} 12/22/2021 00:22:17 - INFO - codeparrot_training - Step 2994: {'lr': 0.0004960612211811834, 'samples': 1533440, 'steps': 2994, 'batch_loss/train': 0.8611951814964414} 12/22/2021 00:22:28 - INFO - codeparrot_training - Step 2995: {'lr': 0.0004960584336020095, 'samples': 1533952, 'steps': 2995, 'batch_loss/train': 0.8225822672247887} 12/22/2021 00:22:38 - INFO - codeparrot_training 
- Step 2996: {'lr': 0.0004960556450446001, 'samples': 1534464, 'steps': 2996, 'batch_loss/train': 0.8130850829184055} 12/22/2021 00:22:49 - INFO - codeparrot_training - Step 2997: {'lr': 0.0004960528555089665, 'samples': 1534976, 'steps': 2997, 'batch_loss/train': 0.8801628071814775} 12/22/2021 00:23:02 - INFO - codeparrot_training - Step 2998: {'lr': 0.0004960500649951195, 'samples': 1535488, 'steps': 2998, 'batch_loss/train': 0.9043007805012167} 12/22/2021 00:23:12 - INFO - codeparrot_training - Step 2999: {'lr': 0.0004960472735030705, 'samples': 1536000, 'steps': 2999, 'batch_loss/train': 1.0239335969090462} 12/22/2021 00:23:23 - INFO - codeparrot_training - Step 3000: {'lr': 0.0004960444810328303, 'samples': 1536512, 'steps': 3000, 'batch_loss/train': 0.845787356607616} 12/22/2021 00:23:35 - INFO - codeparrot_training - Step 3001: {'lr': 0.0004960416875844102, 'samples': 1537024, 'steps': 3001, 'batch_loss/train': 0.8958561411127448} 12/22/2021 00:23:46 - INFO - codeparrot_training - Step 3002: {'lr': 0.0004960388931578212, 'samples': 1537536, 'steps': 3002, 'batch_loss/train': 0.8389314515516162} 12/22/2021 00:23:57 - INFO - codeparrot_training - Step 3003: {'lr': 0.0004960360977530746, 'samples': 1538048, 'steps': 3003, 'batch_loss/train': 0.7584452927112579} 12/22/2021 00:24:09 - INFO - codeparrot_training - Step 3004: {'lr': 0.0004960333013701813, 'samples': 1538560, 'steps': 3004, 'batch_loss/train': 0.8678699880838394} 12/22/2021 00:24:19 - INFO - codeparrot_training - Step 3005: {'lr': 0.0004960305040091525, 'samples': 1539072, 'steps': 3005, 'batch_loss/train': 0.7375816283747554} 12/22/2021 00:24:30 - INFO - codeparrot_training - Step 3006: {'lr': 0.0004960277056699993, 'samples': 1539584, 'steps': 3006, 'batch_loss/train': 0.9170281309634447} 12/22/2021 00:24:43 - INFO - codeparrot_training - Step 3007: {'lr': 0.0004960249063527328, 'samples': 1540096, 'steps': 3007, 'batch_loss/train': 0.8138867886736989} 12/22/2021 00:24:54 - INFO - 
codeparrot_training - Step 3008: {'lr': 0.0004960221060573642, 'samples': 1540608, 'steps': 3008, 'batch_loss/train': 0.8875448501203209} 12/22/2021 00:25:04 - INFO - codeparrot_training - Step 3009: {'lr': 0.0004960193047839047, 'samples': 1541120, 'steps': 3009, 'batch_loss/train': 1.2428184300661087} 12/22/2021 00:25:15 - INFO - codeparrot_training - Step 3010: {'lr': 0.0004960165025323652, 'samples': 1541632, 'steps': 3010, 'batch_loss/train': 1.5285909390076995} 12/22/2021 00:25:27 - INFO - codeparrot_training - Step 3011: {'lr': 0.000496013699302757, 'samples': 1542144, 'steps': 3011, 'batch_loss/train': 0.926459682174027} 12/22/2021 00:25:38 - INFO - codeparrot_training - Step 3012: {'lr': 0.0004960108950950913, 'samples': 1542656, 'steps': 3012, 'batch_loss/train': 0.7894781604409218} 12/22/2021 00:25:48 - INFO - codeparrot_training - Step 3013: {'lr': 0.000496008089909379, 'samples': 1543168, 'steps': 3013, 'batch_loss/train': 0.9074496598914266} 12/22/2021 00:26:01 - INFO - codeparrot_training - Step 3014: {'lr': 0.0004960052837456317, 'samples': 1543680, 'steps': 3014, 'batch_loss/train': 0.8952338602393866} 12/22/2021 00:26:12 - INFO - codeparrot_training - Step 3015: {'lr': 0.0004960024766038601, 'samples': 1544192, 'steps': 3015, 'batch_loss/train': 0.7996892146766186} 12/22/2021 00:26:23 - INFO - codeparrot_training - Step 3016: {'lr': 0.0004959996684840755, 'samples': 1544704, 'steps': 3016, 'batch_loss/train': 0.9057547110132873} 12/22/2021 00:26:35 - INFO - codeparrot_training - Step 3017: {'lr': 0.0004959968593862892, 'samples': 1545216, 'steps': 3017, 'batch_loss/train': 0.9462001109495759} 12/22/2021 00:26:45 - INFO - codeparrot_training - Step 3018: {'lr': 0.0004959940493105123, 'samples': 1545728, 'steps': 3018, 'batch_loss/train': 0.8366453787311912} 12/22/2021 00:26:56 - INFO - codeparrot_training - Step 3019: {'lr': 0.0004959912382567559, 'samples': 1546240, 'steps': 3019, 'batch_loss/train': 0.8614417221397161} 12/22/2021 00:27:07 - INFO 
- codeparrot_training - Step 3020: {'lr': 0.0004959884262250311, 'samples': 1546752, 'steps': 3020, 'batch_loss/train': 0.8534379992634058} 12/22/2021 00:27:19 - INFO - codeparrot_training - Step 3021: {'lr': 0.0004959856132153493, 'samples': 1547264, 'steps': 3021, 'batch_loss/train': 0.7620975030586123} 12/22/2021 00:27:29 - INFO - codeparrot_training - Step 3022: {'lr': 0.0004959827992277216, 'samples': 1547776, 'steps': 3022, 'batch_loss/train': 0.9459830988198519} 12/22/2021 00:27:40 - INFO - codeparrot_training - Step 3023: {'lr': 0.0004959799842621592, 'samples': 1548288, 'steps': 3023, 'batch_loss/train': 0.8370558796450496} 12/22/2021 00:27:52 - INFO - codeparrot_training - Step 3024: {'lr': 0.0004959771683186732, 'samples': 1548800, 'steps': 3024, 'batch_loss/train': 0.8837002348154783} 12/22/2021 00:28:02 - INFO - codeparrot_training - Step 3025: {'lr': 0.0004959743513972748, 'samples': 1549312, 'steps': 3025, 'batch_loss/train': 0.884586906991899} 12/22/2021 00:28:13 - INFO - codeparrot_training - Step 3026: {'lr': 0.0004959715334979754, 'samples': 1549824, 'steps': 3026, 'batch_loss/train': 0.8589917328208685} 12/22/2021 00:28:26 - INFO - codeparrot_training - Step 3027: {'lr': 0.000495968714620786, 'samples': 1550336, 'steps': 3027, 'batch_loss/train': 0.8917285269126296} 12/22/2021 00:28:36 - INFO - codeparrot_training - Step 3028: {'lr': 0.0004959658947657178, 'samples': 1550848, 'steps': 3028, 'batch_loss/train': 0.8786073978990316} 12/22/2021 00:28:47 - INFO - codeparrot_training - Step 3029: {'lr': 0.0004959630739327822, 'samples': 1551360, 'steps': 3029, 'batch_loss/train': 0.8593123937025666} 12/22/2021 00:28:58 - INFO - codeparrot_training - Step 3030: {'lr': 0.0004959602521219901, 'samples': 1551872, 'steps': 3030, 'batch_loss/train': 0.8191675832495093} 12/22/2021 00:29:10 - INFO - codeparrot_training - Step 3031: {'lr': 0.000495957429333353, 'samples': 1552384, 'steps': 3031, 'batch_loss/train': 1.0185594484210014} 12/22/2021 00:29:20 - 
INFO - codeparrot_training - Step 3032: {'lr': 0.0004959546055668822, 'samples': 1552896, 'steps': 3032, 'batch_loss/train': 0.8848305458668619} 12/22/2021 00:29:31 - INFO - codeparrot_training - Step 3033: {'lr': 0.0004959517808225886, 'samples': 1553408, 'steps': 3033, 'batch_loss/train': 0.8473475202918053} 12/22/2021 00:29:43 - INFO - codeparrot_training - Step 3034: {'lr': 0.0004959489551004836, 'samples': 1553920, 'steps': 3034, 'batch_loss/train': 0.8368791860993952} 12/22/2021 00:29:54 - INFO - codeparrot_training - Step 3035: {'lr': 0.0004959461284005784, 'samples': 1554432, 'steps': 3035, 'batch_loss/train': 0.8726472007110715} 12/22/2021 00:30:04 - INFO - codeparrot_training - Step 3036: {'lr': 0.0004959433007228843, 'samples': 1554944, 'steps': 3036, 'batch_loss/train': 0.877885733731091} 12/22/2021 00:30:17 - INFO - codeparrot_training - Step 3037: {'lr': 0.0004959404720674124, 'samples': 1555456, 'steps': 3037, 'batch_loss/train': 0.8191741090267897} 12/22/2021 00:30:28 - INFO - codeparrot_training - Step 3038: {'lr': 0.0004959376424341741, 'samples': 1555968, 'steps': 3038, 'batch_loss/train': 0.8256903504952788} 12/22/2021 00:30:38 - INFO - codeparrot_training - Step 3039: {'lr': 0.0004959348118231807, 'samples': 1556480, 'steps': 3039, 'batch_loss/train': 0.7605213592760265} 12/22/2021 00:30:50 - INFO - codeparrot_training - Step 3040: {'lr': 0.0004959319802344433, 'samples': 1556992, 'steps': 3040, 'batch_loss/train': 0.9237731462344527} 12/22/2021 00:31:01 - INFO - codeparrot_training - Step 3041: {'lr': 0.0004959291476679731, 'samples': 1557504, 'steps': 3041, 'batch_loss/train': 0.8381185550242662} 12/22/2021 00:31:12 - INFO - codeparrot_training - Step 3042: {'lr': 0.0004959263141237816, 'samples': 1558016, 'steps': 3042, 'batch_loss/train': 0.9697933858260512} 12/22/2021 00:31:22 - INFO - codeparrot_training - Step 3043: {'lr': 0.0004959234796018799, 'samples': 1558528, 'steps': 3043, 'batch_loss/train': 0.9182802028954029} 12/22/2021 
00:31:35 - INFO - codeparrot_training - Step 3044: {'lr': 0.0004959206441022792, 'samples': 1559040, 'steps': 3044, 'batch_loss/train': 0.8922312278300524} 12/22/2021 00:31:46 - INFO - codeparrot_training - Step 3045: {'lr': 0.000495917807624991, 'samples': 1559552, 'steps': 3045, 'batch_loss/train': 0.7025610520504415} 12/22/2021 00:31:57 - INFO - codeparrot_training - Step 3046: {'lr': 0.0004959149701700264, 'samples': 1560064, 'steps': 3046, 'batch_loss/train': 0.8047926286235452} 12/22/2021 00:32:09 - INFO - codeparrot_training - Step 3047: {'lr': 0.0004959121317373968, 'samples': 1560576, 'steps': 3047, 'batch_loss/train': 0.8515224056318402} 12/22/2021 00:32:19 - INFO - codeparrot_training - Step 3048: {'lr': 0.0004959092923271134, 'samples': 1561088, 'steps': 3048, 'batch_loss/train': 0.8055978587362915} 12/22/2021 00:32:30 - INFO - codeparrot_training - Step 3049: {'lr': 0.0004959064519391874, 'samples': 1561600, 'steps': 3049, 'batch_loss/train': 0.8483535428531468} 12/22/2021 00:32:43 - INFO - codeparrot_training - Step 3050: {'lr': 0.0004959036105736303, 'samples': 1562112, 'steps': 3050, 'batch_loss/train': 0.88635613466613} 12/22/2021 00:32:54 - INFO - codeparrot_training - Step 3051: {'lr': 0.0004959007682304532, 'samples': 1562624, 'steps': 3051, 'batch_loss/train': 0.7347767143510282} 12/22/2021 00:33:04 - INFO - codeparrot_training - Step 3052: {'lr': 0.0004958979249096675, 'samples': 1563136, 'steps': 3052, 'batch_loss/train': 0.8436926156282425} 12/22/2021 00:33:15 - INFO - codeparrot_training - Step 3053: {'lr': 0.0004958950806112847, 'samples': 1563648, 'steps': 3053, 'batch_loss/train': 1.223364321514964} 12/22/2021 00:33:27 - INFO - codeparrot_training - Step 3054: {'lr': 0.0004958922353353156, 'samples': 1564160, 'steps': 3054, 'batch_loss/train': 0.9730399204418063} 12/22/2021 00:33:38 - INFO - codeparrot_training - Step 3055: {'lr': 0.000495889389081772, 'samples': 1564672, 'steps': 3055, 'batch_loss/train': 0.7957408530637622} 12/22/2021 
00:33:49 - INFO - codeparrot_training - Step 3056: {'lr': 0.0004958865418506651, 'samples': 1565184, 'steps': 3056, 'batch_loss/train': 0.8656440395861864} 12/22/2021 00:34:03 - INFO - codeparrot_training - Step 3057: {'lr': 0.000495883693642006, 'samples': 1565696, 'steps': 3057, 'batch_loss/train': 0.9476745035499334} 12/22/2021 00:34:13 - INFO - codeparrot_training - Step 3058: {'lr': 0.0004958808444558063, 'samples': 1566208, 'steps': 3058, 'batch_loss/train': 0.9414535304531455} 12/22/2021 00:34:24 - INFO - codeparrot_training - Step 3059: {'lr': 0.0004958779942920771, 'samples': 1566720, 'steps': 3059, 'batch_loss/train': 0.9855457870289683} 12/22/2021 00:34:34 - INFO - codeparrot_training - Step 3060: {'lr': 0.0004958751431508298, 'samples': 1567232, 'steps': 3060, 'batch_loss/train': 0.804049733094871} 12/22/2021 00:34:46 - INFO - codeparrot_training - Step 3061: {'lr': 0.0004958722910320759, 'samples': 1567744, 'steps': 3061, 'batch_loss/train': 0.8185342783108354} 12/22/2021 00:34:57 - INFO - codeparrot_training - Step 3062: {'lr': 0.0004958694379358265, 'samples': 1568256, 'steps': 3062, 'batch_loss/train': 0.848387093283236} 12/22/2021 00:35:08 - INFO - codeparrot_training - Step 3063: {'lr': 0.0004958665838620932, 'samples': 1568768, 'steps': 3063, 'batch_loss/train': 0.9590487708337605} 12/22/2021 00:35:20 - INFO - codeparrot_training - Step 3064: {'lr': 0.000495863728810887, 'samples': 1569280, 'steps': 3064, 'batch_loss/train': 0.7944198246113956} 12/22/2021 00:35:30 - INFO - codeparrot_training - Step 3065: {'lr': 0.0004958608727822196, 'samples': 1569792, 'steps': 3065, 'batch_loss/train': 0.8014315916225314} 12/22/2021 00:35:41 - INFO - codeparrot_training - Step 3066: {'lr': 0.0004958580157761022, 'samples': 1570304, 'steps': 3066, 'batch_loss/train': 0.7802999187260866} 12/22/2021 00:35:54 - INFO - codeparrot_training - Step 3067: {'lr': 0.0004958551577925461, 'samples': 1570816, 'steps': 3067, 'batch_loss/train': 0.8856266178190708} 12/22/2021 
00:36:04 - INFO - codeparrot_training - Step 3068: {'lr': 0.0004958522988315627, 'samples': 1571328, 'steps': 3068, 'batch_loss/train': 0.749654735904187} 12/22/2021 00:36:15 - INFO - codeparrot_training - Step 3069: {'lr': 0.0004958494388931635, 'samples': 1571840, 'steps': 3069, 'batch_loss/train': 0.784135281573981} 12/22/2021 00:36:27 - INFO - codeparrot_training - Step 3070: {'lr': 0.0004958465779773596, 'samples': 1572352, 'steps': 3070, 'batch_loss/train': 0.899507686495781} 12/22/2021 00:36:38 - INFO - codeparrot_training - Step 3071: {'lr': 0.0004958437160841627, 'samples': 1572864, 'steps': 3071, 'batch_loss/train': 0.8792612408287823} 12/22/2021 00:36:48 - INFO - codeparrot_training - Step 3072: {'lr': 0.0004958408532135839, 'samples': 1573376, 'steps': 3072, 'batch_loss/train': 0.8825881537050009} 12/22/2021 00:36:59 - INFO - codeparrot_training - Step 3073: {'lr': 0.0004958379893656347, 'samples': 1573888, 'steps': 3073, 'batch_loss/train': 0.5988505707355216} 12/22/2021 00:37:12 - INFO - codeparrot_training - Step 3074: {'lr': 0.0004958351245403265, 'samples': 1574400, 'steps': 3074, 'batch_loss/train': 0.8611243199557066} 12/22/2021 00:37:23 - INFO - codeparrot_training - Step 3075: {'lr': 0.0004958322587376707, 'samples': 1574912, 'steps': 3075, 'batch_loss/train': 0.8500094041228294} 12/22/2021 00:37:33 - INFO - codeparrot_training - Step 3076: {'lr': 0.0004958293919576786, 'samples': 1575424, 'steps': 3076, 'batch_loss/train': 0.8364934409037232} 12/22/2021 00:37:45 - INFO - codeparrot_training - Step 3077: {'lr': 0.0004958265242003617, 'samples': 1575936, 'steps': 3077, 'batch_loss/train': 0.8479018164798617} 12/22/2021 00:37:56 - INFO - codeparrot_training - Step 3078: {'lr': 0.0004958236554657313, 'samples': 1576448, 'steps': 3078, 'batch_loss/train': 0.8614955022931099} 12/22/2021 00:38:07 - INFO - codeparrot_training - Step 3079: {'lr': 0.000495820785753799, 'samples': 1576960, 'steps': 3079, 'batch_loss/train': 0.8291443753987551} 12/22/2021 
00:38:20 - INFO - codeparrot_training - Step 3080: {'lr': 0.0004958179150645759, 'samples': 1577472, 'steps': 3080, 'batch_loss/train': 0.8062189901247621} 12/22/2021 00:38:30 - INFO - codeparrot_training - Step 3081: {'lr': 0.0004958150433980738, 'samples': 1577984, 'steps': 3081, 'batch_loss/train': 1.0027417140081525} 12/22/2021 00:38:41 - INFO - codeparrot_training - Step 3082: {'lr': 0.0004958121707543037, 'samples': 1578496, 'steps': 3082, 'batch_loss/train': 0.8559663305059075} 12/22/2021 00:38:51 - INFO - codeparrot_training - Step 3083: {'lr': 0.0004958092971332773, 'samples': 1579008, 'steps': 3083, 'batch_loss/train': 0.9047391004860401} 12/22/2021 00:39:03 - INFO - codeparrot_training - Step 3084: {'lr': 0.0004958064225350061, 'samples': 1579520, 'steps': 3084, 'batch_loss/train': 0.8246345072984695} 12/22/2021 00:39:14 - INFO - codeparrot_training - Step 3085: {'lr': 0.0004958035469595012, 'samples': 1580032, 'steps': 3085, 'batch_loss/train': 0.7823211150243878} 12/22/2021 00:39:25 - INFO - codeparrot_training - Step 3086: {'lr': 0.0004958006704067743, 'samples': 1580544, 'steps': 3086, 'batch_loss/train': 0.8490783916786313} 12/22/2021 00:39:37 - INFO - codeparrot_training - Step 3087: {'lr': 0.0004957977928768366, 'samples': 1581056, 'steps': 3087, 'batch_loss/train': 0.9028349558357149} 12/22/2021 00:39:48 - INFO - codeparrot_training - Step 3088: {'lr': 0.0004957949143696998, 'samples': 1581568, 'steps': 3088, 'batch_loss/train': 0.8764702859334648} 12/22/2021 00:39:58 - INFO - codeparrot_training - Step 3089: {'lr': 0.0004957920348853752, 'samples': 1582080, 'steps': 3089, 'batch_loss/train': 0.7288846196606755} 12/22/2021 00:40:10 - INFO - codeparrot_training - Step 3090: {'lr': 0.0004957891544238742, 'samples': 1582592, 'steps': 3090, 'batch_loss/train': 0.8401413289830089} 12/22/2021 00:40:21 - INFO - codeparrot_training - Step 3091: {'lr': 0.0004957862729852086, 'samples': 1583104, 'steps': 3091, 'batch_loss/train': 0.9385048374533653} 
12/22/2021 00:40:31 - INFO - codeparrot_training - Step 3092: {'lr': 0.0004957833905693893, 'samples': 1583616, 'steps': 3092, 'batch_loss/train': 0.8192966913338751} 12/22/2021 00:40:42 - INFO - codeparrot_training - Step 3093: {'lr': 0.000495780507176428, 'samples': 1584128, 'steps': 3093, 'batch_loss/train': 0.8873361181467772} 12/22/2021 00:40:54 - INFO - codeparrot_training - Step 3094: {'lr': 0.0004957776228063364, 'samples': 1584640, 'steps': 3094, 'batch_loss/train': 0.8343557557091117} 12/22/2021 00:41:05 - INFO - codeparrot_training - Step 3095: {'lr': 0.0004957747374591258, 'samples': 1585152, 'steps': 3095, 'batch_loss/train': 0.8645121119916439} 12/22/2021 00:41:16 - INFO - codeparrot_training - Step 3096: {'lr': 0.0004957718511348076, 'samples': 1585664, 'steps': 3096, 'batch_loss/train': 0.9243402648717165} 12/22/2021 00:41:29 - INFO - codeparrot_training - Step 3097: {'lr': 0.0004957689638333933, 'samples': 1586176, 'steps': 3097, 'batch_loss/train': 0.8603623956441879} 12/22/2021 00:41:39 - INFO - codeparrot_training - Step 3098: {'lr': 0.0004957660755548943, 'samples': 1586688, 'steps': 3098, 'batch_loss/train': 0.9453988280147314} 12/22/2021 00:41:50 - INFO - codeparrot_training - Step 3099: {'lr': 0.0004957631862993223, 'samples': 1587200, 'steps': 3099, 'batch_loss/train': 0.8779096202924848} 12/22/2021 00:42:02 - INFO - codeparrot_training - Step 3100: {'lr': 0.0004957602960666886, 'samples': 1587712, 'steps': 3100, 'batch_loss/train': 0.851197361946106} 12/22/2021 00:42:13 - INFO - codeparrot_training - Step 3101: {'lr': 0.0004957574048570048, 'samples': 1588224, 'steps': 3101, 'batch_loss/train': 0.8761541591957211} 12/22/2021 00:42:23 - INFO - codeparrot_training - Step 3102: {'lr': 0.0004957545126702824, 'samples': 1588736, 'steps': 3102, 'batch_loss/train': 0.8260644427500665} 12/22/2021 00:42:34 - INFO - codeparrot_training - Step 3103: {'lr': 0.0004957516195065327, 'samples': 1589248, 'steps': 3103, 'batch_loss/train': 
0.8532250113785267} 12/22/2021 00:42:47 - INFO - codeparrot_training - Step 3104: {'lr': 0.0004957487253657674, 'samples': 1589760, 'steps': 3104, 'batch_loss/train': 0.9954732721671462} 12/22/2021 00:42:57 - INFO - codeparrot_training - Step 3105: {'lr': 0.0004957458302479981, 'samples': 1590272, 'steps': 3105, 'batch_loss/train': 0.8254889557138085} 12/22/2021 00:43:08 - INFO - codeparrot_training - Step 3106: {'lr': 0.000495742934153236, 'samples': 1590784, 'steps': 3106, 'batch_loss/train': 0.8195729069411755} 12/22/2021 00:43:20 - INFO - codeparrot_training - Step 3107: {'lr': 0.0004957400370814929, 'samples': 1591296, 'steps': 3107, 'batch_loss/train': 0.8999533872120082} 12/22/2021 00:43:30 - INFO - codeparrot_training - Step 3108: {'lr': 0.0004957371390327802, 'samples': 1591808, 'steps': 3108, 'batch_loss/train': 0.8271447294391692} 12/22/2021 00:43:41 - INFO - codeparrot_training - Step 3109: {'lr': 0.0004957342400071093, 'samples': 1592320, 'steps': 3109, 'batch_loss/train': 1.0253043249249458} 12/22/2021 00:43:53 - INFO - codeparrot_training - Step 3110: {'lr': 0.0004957313400044919, 'samples': 1592832, 'steps': 3110, 'batch_loss/train': 0.8345599882304668} 12/22/2021 00:44:04 - INFO - codeparrot_training - Step 3111: {'lr': 0.0004957284390249396, 'samples': 1593344, 'steps': 3111, 'batch_loss/train': 0.8806321583688259} 12/22/2021 00:44:14 - INFO - codeparrot_training - Step 3112: {'lr': 0.0004957255370684637, 'samples': 1593856, 'steps': 3112, 'batch_loss/train': 0.9513812442310154} 12/22/2021 00:44:25 - INFO - codeparrot_training - Step 3113: {'lr': 0.0004957226341350758, 'samples': 1594368, 'steps': 3113, 'batch_loss/train': 0.9449809119105339} 12/22/2021 00:44:38 - INFO - codeparrot_training - Step 3114: {'lr': 0.0004957197302247876, 'samples': 1594880, 'steps': 3114, 'batch_loss/train': 1.0237303017638624} 12/22/2021 00:44:48 - INFO - codeparrot_training - Step 3115: {'lr': 0.0004957168253376104, 'samples': 1595392, 'steps': 3115, 
'batch_loss/train': 0.9327936503104866} 12/22/2021 00:44:59 - INFO - codeparrot_training - Step 3116: {'lr': 0.000495713919473556, 'samples': 1595904, 'steps': 3116, 'batch_loss/train': 0.8291188208386302} 12/22/2021 00:45:11 - INFO - codeparrot_training - Step 3117: {'lr': 0.0004957110126326358, 'samples': 1596416, 'steps': 3117, 'batch_loss/train': 0.894895819015801} 12/22/2021 00:45:22 - INFO - codeparrot_training - Step 3118: {'lr': 0.0004957081048148614, 'samples': 1596928, 'steps': 3118, 'batch_loss/train': 0.9036348462104797} 12/22/2021 00:45:32 - INFO - codeparrot_training - Step 3119: {'lr': 0.0004957051960202443, 'samples': 1597440, 'steps': 3119, 'batch_loss/train': 0.913657296448946} 12/22/2021 00:45:45 - INFO - codeparrot_training - Step 3120: {'lr': 0.0004957022862487963, 'samples': 1597952, 'steps': 3120, 'batch_loss/train': 0.9887194253969938} 12/22/2021 00:45:55 - INFO - codeparrot_training - Step 3121: {'lr': 0.0004956993755005285, 'samples': 1598464, 'steps': 3121, 'batch_loss/train': 0.8462177379988134} 12/22/2021 00:46:06 - INFO - codeparrot_training - Step 3122: {'lr': 0.0004956964637754529, 'samples': 1598976, 'steps': 3122, 'batch_loss/train': 0.8642888115718961} 12/22/2021 00:46:18 - INFO - codeparrot_training - Step 3123: {'lr': 0.0004956935510735809, 'samples': 1599488, 'steps': 3123, 'batch_loss/train': 0.947129082866013} 12/22/2021 00:46:29 - INFO - codeparrot_training - Step 3124: {'lr': 0.0004956906373949242, 'samples': 1600000, 'steps': 3124, 'batch_loss/train': 0.8488472178578377} 12/22/2021 00:46:39 - INFO - codeparrot_training - Step 3125: {'lr': 0.0004956877227394942, 'samples': 1600512, 'steps': 3125, 'batch_loss/train': 0.8660394959151745} 12/22/2021 00:46:50 - INFO - codeparrot_training - Step 3126: {'lr': 0.0004956848071073026, 'samples': 1601024, 'steps': 3126, 'batch_loss/train': 0.937318030744791} 12/22/2021 00:47:02 - INFO - codeparrot_training - Step 3127: {'lr': 0.000495681890498361, 'samples': 1601536, 'steps': 3127, 
'batch_loss/train': 0.9434640668332577} 12/22/2021 00:47:13 - INFO - codeparrot_training - Step 3128: {'lr': 0.0004956789729126809, 'samples': 1602048, 'steps': 3128, 'batch_loss/train': 0.9394240211695433} 12/22/2021 00:47:23 - INFO - codeparrot_training - Step 3129: {'lr': 0.0004956760543502741, 'samples': 1602560, 'steps': 3129, 'batch_loss/train': 0.9542834351304919} 12/22/2021 00:47:35 - INFO - codeparrot_training - Step 3130: {'lr': 0.0004956731348111519, 'samples': 1603072, 'steps': 3130, 'batch_loss/train': 0.932606253772974} 12/22/2021 00:47:46 - INFO - codeparrot_training - Step 3131: {'lr': 0.0004956702142953262, 'samples': 1603584, 'steps': 3131, 'batch_loss/train': 0.8273211540654302} 12/22/2021 00:47:56 - INFO - codeparrot_training - Step 3132: {'lr': 0.0004956672928028086, 'samples': 1604096, 'steps': 3132, 'batch_loss/train': 0.7796893659979105} 12/22/2021 00:48:09 - INFO - codeparrot_training - Step 3133: {'lr': 0.0004956643703336103, 'samples': 1604608, 'steps': 3133, 'batch_loss/train': 0.9116971455514431} 12/22/2021 00:48:20 - INFO - codeparrot_training - Step 3134: {'lr': 0.0004956614468877435, 'samples': 1605120, 'steps': 3134, 'batch_loss/train': 1.1251118397340178} 12/22/2021 00:48:31 - INFO - codeparrot_training - Step 3135: {'lr': 0.0004956585224652194, 'samples': 1605632, 'steps': 3135, 'batch_loss/train': 0.6840076111257076} 12/22/2021 00:48:41 - INFO - codeparrot_training - Step 3136: {'lr': 0.0004956555970660498, 'samples': 1606144, 'steps': 3136, 'batch_loss/train': 0.8798742722719908} 12/22/2021 00:48:53 - INFO - codeparrot_training - Step 3137: {'lr': 0.0004956526706902464, 'samples': 1606656, 'steps': 3137, 'batch_loss/train': 0.9323478145524859} 12/22/2021 00:49:04 - INFO - codeparrot_training - Step 3138: {'lr': 0.0004956497433378206, 'samples': 1607168, 'steps': 3138, 'batch_loss/train': 0.8929630997590721} 12/22/2021 00:49:15 - INFO - codeparrot_training - Step 3139: {'lr': 0.0004956468150087842, 'samples': 1607680, 'steps': 
3139, 'batch_loss/train': 0.8258349057286978} 12/22/2021 00:49:27 - INFO - codeparrot_training - Step 3140: {'lr': 0.0004956438857031488, 'samples': 1608192, 'steps': 3140, 'batch_loss/train': 0.8580984491854906} 12/22/2021 00:49:37 - INFO - codeparrot_training - Step 3141: {'lr': 0.0004956409554209262, 'samples': 1608704, 'steps': 3141, 'batch_loss/train': 0.708816030761227} 12/22/2021 00:49:48 - INFO - codeparrot_training - Step 3142: {'lr': 0.0004956380241621278, 'samples': 1609216, 'steps': 3142, 'batch_loss/train': 0.8920195316895843} 12/22/2021 00:50:01 - INFO - codeparrot_training - Step 3143: {'lr': 0.0004956350919267653, 'samples': 1609728, 'steps': 3143, 'batch_loss/train': 0.8528318586759269} 12/22/2021 00:50:12 - INFO - codeparrot_training - Step 3144: {'lr': 0.0004956321587148505, 'samples': 1610240, 'steps': 3144, 'batch_loss/train': 0.879759943112731} 12/22/2021 00:50:22 - INFO - codeparrot_training - Step 3145: {'lr': 0.000495629224526395, 'samples': 1610752, 'steps': 3145, 'batch_loss/train': 0.7971174586564302} 12/22/2021 00:50:33 - INFO - codeparrot_training - Step 3146: {'lr': 0.0004956262893614104, 'samples': 1611264, 'steps': 3146, 'batch_loss/train': 0.6952160979853943} 12/22/2021 00:50:45 - INFO - codeparrot_training - Step 3147: {'lr': 0.0004956233532199084, 'samples': 1611776, 'steps': 3147, 'batch_loss/train': 0.8164999010041356} 12/22/2021 00:50:55 - INFO - codeparrot_training - Step 3148: {'lr': 0.0004956204161019008, 'samples': 1612288, 'steps': 3148, 'batch_loss/train': 0.8433505753055215} 12/22/2021 00:51:06 - INFO - codeparrot_training - Step 3149: {'lr': 0.0004956174780073991, 'samples': 1612800, 'steps': 3149, 'batch_loss/train': 0.8666272582486272} 12/22/2021 00:51:19 - INFO - codeparrot_training - Step 3150: {'lr': 0.000495614538936415, 'samples': 1613312, 'steps': 3150, 'batch_loss/train': 1.1471748743206263} 12/22/2021 00:51:30 - INFO - codeparrot_training - Step 3151: {'lr': 0.0004956115988889603, 'samples': 1613824, 'steps': 
3151, 'batch_loss/train': 0.9273092318326235} 12/22/2021 00:51:40 - INFO - codeparrot_training - Step 3152: {'lr': 0.0004956086578650466, 'samples': 1614336, 'steps': 3152, 'batch_loss/train': 0.8708318891003728} 12/22/2021 00:51:52 - INFO - codeparrot_training - Step 3153: {'lr': 0.0004956057158646858, 'samples': 1614848, 'steps': 3153, 'batch_loss/train': 0.9029769892804325} 12/22/2021 00:52:03 - INFO - codeparrot_training - Step 3154: {'lr': 0.0004956027728878891, 'samples': 1615360, 'steps': 3154, 'batch_loss/train': 0.6747975691687316} 12/22/2021 00:52:14 - INFO - codeparrot_training - Step 3155: {'lr': 0.0004955998289346687, 'samples': 1615872, 'steps': 3155, 'batch_loss/train': 0.7828514622524381} 12/22/2021 00:52:24 - INFO - codeparrot_training - Step 3156: {'lr': 0.000495596884005036, 'samples': 1616384, 'steps': 3156, 'batch_loss/train': 0.8458865177817643} 12/22/2021 00:52:37 - INFO - codeparrot_training - Step 3157: {'lr': 0.0004955939380990029, 'samples': 1616896, 'steps': 3157, 'batch_loss/train': 0.8477370543405414} 12/22/2021 00:52:47 - INFO - codeparrot_training - Step 3158: {'lr': 0.000495590991216581, 'samples': 1617408, 'steps': 3158, 'batch_loss/train': 0.9276319891214371} 12/22/2021 00:52:58 - INFO - codeparrot_training - Step 3159: {'lr': 0.0004955880433577821, 'samples': 1617920, 'steps': 3159, 'batch_loss/train': 0.8668582211248577} 12/22/2021 00:53:11 - INFO - codeparrot_training - Step 3160: {'lr': 0.0004955850945226179, 'samples': 1618432, 'steps': 3160, 'batch_loss/train': 0.8766002906486392} 12/22/2021 00:53:21 - INFO - codeparrot_training - Step 3161: {'lr': 0.0004955821447110999, 'samples': 1618944, 'steps': 3161, 'batch_loss/train': 0.8407662138342857} 12/22/2021 00:53:32 - INFO - codeparrot_training - Step 3162: {'lr': 0.0004955791939232402, 'samples': 1619456, 'steps': 3162, 'batch_loss/train': 0.9558550100773573} 12/22/2021 00:53:44 - INFO - codeparrot_training - Step 3163: {'lr': 0.0004955762421590504, 'samples': 1619968, 
'steps': 3163, 'batch_loss/train': 0.8579746903851628} 12/22/2021 00:53:55 - INFO - codeparrot_training - Step 3164: {'lr': 0.0004955732894185421, 'samples': 1620480, 'steps': 3164, 'batch_loss/train': 0.8738546902313828} 12/22/2021 00:54:05 - INFO - codeparrot_training - Step 3165: {'lr': 0.0004955703357017271, 'samples': 1620992, 'steps': 3165, 'batch_loss/train': 0.8202944239601493} 12/22/2021 00:54:16 - INFO - codeparrot_training - Step 3166: {'lr': 0.0004955673810086172, 'samples': 1621504, 'steps': 3166, 'batch_loss/train': 0.9381919177249074} 12/22/2021 00:54:28 - INFO - codeparrot_training - Step 3167: {'lr': 0.0004955644253392241, 'samples': 1622016, 'steps': 3167, 'batch_loss/train': 0.8512092921882868} 12/22/2021 00:54:38 - INFO - codeparrot_training - Step 3168: {'lr': 0.0004955614686935596, 'samples': 1622528, 'steps': 3168, 'batch_loss/train': 0.7866478050127625} 12/22/2021 00:54:49 - INFO - codeparrot_training - Step 3169: {'lr': 0.0004955585110716353, 'samples': 1623040, 'steps': 3169, 'batch_loss/train': 0.8045298187062144} 12/22/2021 00:55:01 - INFO - codeparrot_training - Step 3170: {'lr': 0.0004955555524734632, 'samples': 1623552, 'steps': 3170, 'batch_loss/train': 0.9548365105874836} 12/22/2021 00:55:12 - INFO - codeparrot_training - Step 3171: {'lr': 0.000495552592899055, 'samples': 1624064, 'steps': 3171, 'batch_loss/train': 0.8311224840581417} 12/22/2021 00:55:22 - INFO - codeparrot_training - Step 3172: {'lr': 0.0004955496323484223, 'samples': 1624576, 'steps': 3172, 'batch_loss/train': 0.9350264267995954} 12/22/2021 00:55:35 - INFO - codeparrot_training - Step 3173: {'lr': 0.000495546670821577, 'samples': 1625088, 'steps': 3173, 'batch_loss/train': 0.8304752875119448} 12/22/2021 00:55:46 - INFO - codeparrot_training - Step 3174: {'lr': 0.0004955437083185308, 'samples': 1625600, 'steps': 3174, 'batch_loss/train': 0.9179609259590507} 12/22/2021 00:55:56 - INFO - codeparrot_training - Step 3175: {'lr': 0.0004955407448392955, 'samples': 
1626112, 'steps': 3175, 'batch_loss/train': 0.8328950349241495} 12/22/2021 00:56:08 - INFO - codeparrot_training - Step 3176: {'lr': 0.000495537780383883, 'samples': 1626624, 'steps': 3176, 'batch_loss/train': 0.9142196793109179} 12/22/2021 00:56:19 - INFO - codeparrot_training - Step 3177: {'lr': 0.0004955348149523051, 'samples': 1627136, 'steps': 3177, 'batch_loss/train': 0.892035293392837} 12/22/2021 00:56:30 - INFO - codeparrot_training - Step 3178: {'lr': 0.0004955318485445734, 'samples': 1627648, 'steps': 3178, 'batch_loss/train': 0.9108480010181665} 12/22/2021 00:56:40 - INFO - codeparrot_training - Step 3179: {'lr': 0.0004955288811606997, 'samples': 1628160, 'steps': 3179, 'batch_loss/train': 0.899305091239512} 12/22/2021 00:56:52 - INFO - codeparrot_training - Step 3180: {'lr': 0.000495525912800696, 'samples': 1628672, 'steps': 3180, 'batch_loss/train': 0.8560072919353843} 12/22/2021 00:57:03 - INFO - codeparrot_training - Step 3181: {'lr': 0.0004955229434645738, 'samples': 1629184, 'steps': 3181, 'batch_loss/train': 0.8620474515482783} 12/22/2021 00:57:14 - INFO - codeparrot_training - Step 3182: {'lr': 0.0004955199731523453, 'samples': 1629696, 'steps': 3182, 'batch_loss/train': 0.8335727546364069} 12/22/2021 00:57:26 - INFO - codeparrot_training - Step 3183: {'lr': 0.000495517001864022, 'samples': 1630208, 'steps': 3183, 'batch_loss/train': 0.7653261162340641} 12/22/2021 00:57:37 - INFO - codeparrot_training - Step 3184: {'lr': 0.0004955140295996157, 'samples': 1630720, 'steps': 3184, 'batch_loss/train': 0.8150523398071527} 12/22/2021 00:57:48 - INFO - codeparrot_training - Step 3185: {'lr': 0.0004955110563591384, 'samples': 1631232, 'steps': 3185, 'batch_loss/train': 0.9723459994420409} 12/22/2021 00:58:00 - INFO - codeparrot_training - Step 3186: {'lr': 0.0004955080821426019, 'samples': 1631744, 'steps': 3186, 'batch_loss/train': 0.8903674529865384} 12/22/2021 00:58:10 - INFO - codeparrot_training - Step 3187: {'lr': 0.0004955051069500179, 'samples': 
1632256, 'steps': 3187, 'batch_loss/train': 0.9380496283993125} 12/22/2021 00:58:21 - INFO - codeparrot_training - Step 3188: {'lr': 0.0004955021307813983, 'samples': 1632768, 'steps': 3188, 'batch_loss/train': 0.8543719546869397} 12/22/2021 00:58:31 - INFO - codeparrot_training - Step 3189: {'lr': 0.000495499153636755, 'samples': 1633280, 'steps': 3189, 'batch_loss/train': 0.8932953625917435} 12/22/2021 00:58:44 - INFO - codeparrot_training - Step 3190: {'lr': 0.0004954961755160996, 'samples': 1633792, 'steps': 3190, 'batch_loss/train': 0.8710927246138453} 12/22/2021 00:58:55 - INFO - codeparrot_training - Step 3191: {'lr': 0.0004954931964194442, 'samples': 1634304, 'steps': 3191, 'batch_loss/train': 0.8828936778008938} 12/22/2021 00:59:05 - INFO - codeparrot_training - Step 3192: {'lr': 0.0004954902163468006, 'samples': 1634816, 'steps': 3192, 'batch_loss/train': 0.9056357750669122} 12/22/2021 00:59:17 - INFO - codeparrot_training - Step 3193: {'lr': 0.0004954872352981805, 'samples': 1635328, 'steps': 3193, 'batch_loss/train': 0.8068346404470503} 12/22/2021 00:59:28 - INFO - codeparrot_training - Step 3194: {'lr': 0.0004954842532735959, 'samples': 1635840, 'steps': 3194, 'batch_loss/train': 0.8595417821779847} 12/22/2021 00:59:39 - INFO - codeparrot_training - Step 3195: {'lr': 0.0004954812702730586, 'samples': 1636352, 'steps': 3195, 'batch_loss/train': 0.852472176309675} 12/22/2021 00:59:51 - INFO - codeparrot_training - Step 3196: {'lr': 0.0004954782862965805, 'samples': 1636864, 'steps': 3196, 'batch_loss/train': 0.8074733950197697} 12/22/2021 01:00:01 - INFO - codeparrot_training - Step 3197: {'lr': 0.0004954753013441732, 'samples': 1637376, 'steps': 3197, 'batch_loss/train': 0.8092804551124573} 12/22/2021 01:00:12 - INFO - codeparrot_training - Step 3198: {'lr': 0.000495472315415849, 'samples': 1637888, 'steps': 3198, 'batch_loss/train': 0.8897005319595337} 12/22/2021 01:00:22 - INFO - codeparrot_training - Step 3199: {'lr': 0.0004954693285116196, 
'samples': 1638400, 'steps': 3199, 'batch_loss/train': 0.9762557428330183} 12/22/2021 01:00:35 - INFO - codeparrot_training - Step 3200: {'lr': 0.0004954663406314967, 'samples': 1638912, 'steps': 3200, 'batch_loss/train': 0.8925016354769468} 12/22/2021 01:00:46 - INFO - codeparrot_training - Step 3201: {'lr': 0.0004954633517754924, 'samples': 1639424, 'steps': 3201, 'batch_loss/train': 0.9250837806612253} 12/22/2021 01:00:56 - INFO - codeparrot_training - Step 3202: {'lr': 0.0004954603619436184, 'samples': 1639936, 'steps': 3202, 'batch_loss/train': 0.940974947065115} 12/22/2021 01:01:08 - INFO - codeparrot_training - Step 3203: {'lr': 0.0004954573711358868, 'samples': 1640448, 'steps': 3203, 'batch_loss/train': 0.7833749856799841} 12/22/2021 01:01:19 - INFO - codeparrot_training - Step 3204: {'lr': 0.0004954543793523093, 'samples': 1640960, 'steps': 3204, 'batch_loss/train': 0.6777229853905737} 12/22/2021 01:01:30 - INFO - codeparrot_training - Step 3205: {'lr': 0.0004954513865928979, 'samples': 1641472, 'steps': 3205, 'batch_loss/train': 0.8670973004773259} 12/22/2021 01:01:42 - INFO - codeparrot_training - Step 3206: {'lr': 0.0004954483928576645, 'samples': 1641984, 'steps': 3206, 'batch_loss/train': 0.8695839610882103} 12/22/2021 01:01:52 - INFO - codeparrot_training - Step 3207: {'lr': 0.0004954453981466209, 'samples': 1642496, 'steps': 3207, 'batch_loss/train': 0.9005298353731632} 12/22/2021 01:02:03 - INFO - codeparrot_training - Step 3208: {'lr': 0.0004954424024597791, 'samples': 1643008, 'steps': 3208, 'batch_loss/train': 0.8060500612482429} 12/22/2021 01:02:13 - INFO - codeparrot_training - Step 3209: {'lr': 0.000495439405797151, 'samples': 1643520, 'steps': 3209, 'batch_loss/train': 0.8038836251944304} 12/22/2021 01:02:25 - INFO - codeparrot_training - Step 3210: {'lr': 0.0004954364081587484, 'samples': 1644032, 'steps': 3210, 'batch_loss/train': 0.8096551196649671} 12/22/2021 01:02:36 - INFO - codeparrot_training - Step 3211: {'lr': 
0.0004954334095445833, 'samples': 1644544, 'steps': 3211, 'batch_loss/train': 0.8127144649624825} 12/22/2021 01:02:47 - INFO - codeparrot_training - Step 3212: {'lr': 0.0004954304099546678, 'samples': 1645056, 'steps': 3212, 'batch_loss/train': 0.9686018507927656} 12/22/2021 01:03:00 - INFO - codeparrot_training - Step 3213: {'lr': 0.0004954274093890137, 'samples': 1645568, 'steps': 3213, 'batch_loss/train': 0.7230539321899414} 12/22/2021 01:03:10 - INFO - codeparrot_training - Step 3214: {'lr': 0.0004954244078476327, 'samples': 1646080, 'steps': 3214, 'batch_loss/train': 0.8607652503997087} 12/22/2021 01:03:21 - INFO - codeparrot_training - Step 3215: {'lr': 0.0004954214053305369, 'samples': 1646592, 'steps': 3215, 'batch_loss/train': 0.8517503682523966} 12/22/2021 01:03:33 - INFO - codeparrot_training - Step 3216: {'lr': 0.0004954184018377385, 'samples': 1647104, 'steps': 3216, 'batch_loss/train': 0.9722456829622388} 12/22/2021 01:03:43 - INFO - codeparrot_training - Step 3217: {'lr': 0.000495415397369249, 'samples': 1647616, 'steps': 3217, 'batch_loss/train': 0.9117504945024848} 12/22/2021 01:03:54 - INFO - codeparrot_training - Step 3218: {'lr': 0.0004954123919250806, 'samples': 1648128, 'steps': 3218, 'batch_loss/train': 0.8294601608067751} 12/22/2021 01:04:05 - INFO - codeparrot_training - Step 3219: {'lr': 0.0004954093855052452, 'samples': 1648640, 'steps': 3219, 'batch_loss/train': 0.8942053578794003} 12/22/2021 01:04:18 - INFO - codeparrot_training - Step 3220: {'lr': 0.0004954063781097548, 'samples': 1649152, 'steps': 3220, 'batch_loss/train': 0.7798873065039515} 12/22/2021 01:04:29 - INFO - codeparrot_training - Step 3221: {'lr': 0.0004954033697386212, 'samples': 1649664, 'steps': 3221, 'batch_loss/train': 1.4414020013064146} 12/22/2021 01:04:39 - INFO - codeparrot_training - Step 3222: {'lr': 0.0004954003603918566, 'samples': 1650176, 'steps': 3222, 'batch_loss/train': 1.008320851251483} 12/22/2021 01:04:52 - INFO - codeparrot_training - Step 3223: 
{'lr': 0.0004953973500694726, 'samples': 1650688, 'steps': 3223, 'batch_loss/train': 0.7174558758269995} 12/22/2021 01:05:02 - INFO - codeparrot_training - Step 3224: {'lr': 0.0004953943387714816, 'samples': 1651200, 'steps': 3224, 'batch_loss/train': 0.7632359374547377} 12/22/2021 01:05:13 - INFO - codeparrot_training - Step 3225: {'lr': 0.0004953913264978953, 'samples': 1651712, 'steps': 3225, 'batch_loss/train': 0.7568937296746299} 12/22/2021 01:05:25 - INFO - codeparrot_training - Step 3226: {'lr': 0.0004953883132487257, 'samples': 1652224, 'steps': 3226, 'batch_loss/train': 0.8296248037368059} 12/22/2021 01:05:36 - INFO - codeparrot_training - Step 3227: {'lr': 0.0004953852990239849, 'samples': 1652736, 'steps': 3227, 'batch_loss/train': 1.025816055946052} 12/22/2021 01:05:46 - INFO - codeparrot_training - Step 3228: {'lr': 0.0004953822838236848, 'samples': 1653248, 'steps': 3228, 'batch_loss/train': 0.8397544547915459} 12/22/2021 01:05:57 - INFO - codeparrot_training - Step 3229: {'lr': 0.0004953792676478373, 'samples': 1653760, 'steps': 3229, 'batch_loss/train': 0.9426050530746579} 12/22/2021 01:06:10 - INFO - codeparrot_training - Step 3230: {'lr': 0.0004953762504964545, 'samples': 1654272, 'steps': 3230, 'batch_loss/train': 0.8729694799985737} 12/22/2021 01:06:20 - INFO - codeparrot_training - Step 3231: {'lr': 0.0004953732323695484, 'samples': 1654784, 'steps': 3231, 'batch_loss/train': 1.198142415145412} 12/22/2021 01:06:31 - INFO - codeparrot_training - Step 3232: {'lr': 0.000495370213267131, 'samples': 1655296, 'steps': 3232, 'batch_loss/train': 1.3036359138786793} 12/22/2021 01:06:43 - INFO - codeparrot_training - Step 3233: {'lr': 0.0004953671931892143, 'samples': 1655808, 'steps': 3233, 'batch_loss/train': 0.8891704445704818} 12/22/2021 01:06:54 - INFO - codeparrot_training - Step 3234: {'lr': 0.0004953641721358104, 'samples': 1656320, 'steps': 3234, 'batch_loss/train': 0.8976704366505146} 12/22/2021 01:07:05 - INFO - codeparrot_training - Step 
3235: {'lr': 0.0004953611501069309, 'samples': 1656832, 'steps': 3235, 'batch_loss/train': 0.8161741960793734} 12/22/2021 01:07:15 - INFO - codeparrot_training - Step 3236: {'lr': 0.0004953581271025883, 'samples': 1657344, 'steps': 3236, 'batch_loss/train': 0.7266395227052271} 12/22/2021 01:07:28 - INFO - codeparrot_training - Step 3237: {'lr': 0.0004953551031227944, 'samples': 1657856, 'steps': 3237, 'batch_loss/train': 0.8324987096711993} 12/22/2021 01:07:38 - INFO - codeparrot_training - Step 3238: {'lr': 0.0004953520781675613, 'samples': 1658368, 'steps': 3238, 'batch_loss/train': 0.880282562226057} 12/22/2021 01:07:49 - INFO - codeparrot_training - Step 3239: {'lr': 0.0004953490522369009, 'samples': 1658880, 'steps': 3239, 'batch_loss/train': 0.895488828420639} 12/22/2021 01:08:01 - INFO - codeparrot_training - Step 3240: {'lr': 0.0004953460253308254, 'samples': 1659392, 'steps': 3240, 'batch_loss/train': 0.998632880859077} 12/22/2021 01:08:12 - INFO - codeparrot_training - Step 3241: {'lr': 0.0004953429974493466, 'samples': 1659904, 'steps': 3241, 'batch_loss/train': 0.9156677424907684} 12/22/2021 01:08:22 - INFO - codeparrot_training - Step 3242: {'lr': 0.0004953399685924769, 'samples': 1660416, 'steps': 3242, 'batch_loss/train': 0.8671466996893287} 12/22/2021 01:08:34 - INFO - codeparrot_training - Step 3243: {'lr': 0.000495336938760228, 'samples': 1660928, 'steps': 3243, 'batch_loss/train': 0.800285778939724} 12/22/2021 01:08:45 - INFO - codeparrot_training - Step 3244: {'lr': 0.000495333907952612, 'samples': 1661440, 'steps': 3244, 'batch_loss/train': 0.815495758317411} 12/22/2021 01:08:55 - INFO - codeparrot_training - Step 3245: {'lr': 0.0004953308761696411, 'samples': 1661952, 'steps': 3245, 'batch_loss/train': 0.9219830073416233} 12/22/2021 01:09:07 - INFO - codeparrot_training - Step 3246: {'lr': 0.0004953278434113272, 'samples': 1662464, 'steps': 3246, 'batch_loss/train': 0.8454957623034716} 12/22/2021 01:09:18 - INFO - codeparrot_training - Step 
3247: {'lr': 0.0004953248096776824, 'samples': 1662976, 'steps': 3247, 'batch_loss/train': 0.8319892371073365} 12/22/2021 01:09:29 - INFO - codeparrot_training - Step 3248: {'lr': 0.0004953217749687189, 'samples': 1663488, 'steps': 3248, 'batch_loss/train': 0.8952219849452376} 12/22/2021 01:09:39 - INFO - codeparrot_training - Step 3249: {'lr': 0.0004953187392844486, 'samples': 1664000, 'steps': 3249, 'batch_loss/train': 0.8613059027120471} 12/22/2021 01:09:52 - INFO - codeparrot_training - Step 3250: {'lr': 0.0004953157026248836, 'samples': 1664512, 'steps': 3250, 'batch_loss/train': 0.9219053657725453} 12/22/2021 01:10:03 - INFO - codeparrot_training - Step 3251: {'lr': 0.000495312664990036, 'samples': 1665024, 'steps': 3251, 'batch_loss/train': 0.9366052150726318} 12/22/2021 01:10:13 - INFO - codeparrot_training - Step 3252: {'lr': 0.000495309626379918, 'samples': 1665536, 'steps': 3252, 'batch_loss/train': 0.811802089214325} 12/22/2021 01:10:26 - INFO - codeparrot_training - Step 3253: {'lr': 0.0004953065867945414, 'samples': 1666048, 'steps': 3253, 'batch_loss/train': 0.8236198360100389} 12/22/2021 01:10:36 - INFO - codeparrot_training - Step 3254: {'lr': 0.0004953035462339184, 'samples': 1666560, 'steps': 3254, 'batch_loss/train': 0.9514868892729282} 12/22/2021 01:10:47 - INFO - codeparrot_training - Step 3255: {'lr': 0.0004953005046980613, 'samples': 1667072, 'steps': 3255, 'batch_loss/train': 0.7828476364957169} 12/22/2021 01:10:59 - INFO - codeparrot_training - Step 3256: {'lr': 0.000495297462186982, 'samples': 1667584, 'steps': 3256, 'batch_loss/train': 0.7857782924547791} 12/22/2021 01:11:10 - INFO - codeparrot_training - Step 3257: {'lr': 0.0004952944187006925, 'samples': 1668096, 'steps': 3257, 'batch_loss/train': 0.8527895710431039} 12/22/2021 01:11:20 - INFO - codeparrot_training - Step 3258: {'lr': 0.000495291374239205, 'samples': 1668608, 'steps': 3258, 'batch_loss/train': 0.8169447779655457} 12/22/2021 01:11:31 - INFO - codeparrot_training - Step 
3259: {'lr': 0.0004952883288025316, 'samples': 1669120, 'steps': 3259, 'batch_loss/train': 0.993160692974925}
12/22/2021 01:11:44 - INFO - codeparrot_training - Step 3260: {'lr': 0.0004952852823906844, 'samples': 1669632, 'steps': 3260, 'batch_loss/train': 0.8077951131854206}
12/22/2021 01:11:54 - INFO - codeparrot_training - Step 3261: {'lr': 0.0004952822350036757, 'samples': 1670144, 'steps': 3261, 'batch_loss/train': 0.8177419197745621}
12/22/2021 01:12:05 - INFO - codeparrot_training - Step 3262: {'lr': 0.0004952791866415174, 'samples': 1670656, 'steps': 3262, 'batch_loss/train': 0.8868386670947075}
12/22/2021 01:12:17 - INFO - codeparrot_training - Step 3263: {'lr': 0.0004952761373042216, 'samples': 1671168, 'steps': 3263, 'batch_loss/train': 0.7827025903388858}
12/22/2021 01:12:28 - INFO - codeparrot_training - Step 3264: {'lr': 0.0004952730869918005, 'samples': 1671680, 'steps': 3264, 'batch_loss/train': 0.8612278525251895}
12/22/2021 01:12:38 - INFO - codeparrot_training - Step 3265: {'lr': 0.0004952700357042662, 'samples': 1672192, 'steps': 3265, 'batch_loss/train': 0.8218933547614142}
12/22/2021 01:12:51 - INFO - codeparrot_training - Step 3266: {'lr': 0.0004952669834416309, 'samples': 1672704, 'steps': 3266, 'batch_loss/train': 0.820244312286377}
12/22/2021 01:13:02 - INFO - codeparrot_training - Step 3267: {'lr': 0.0004952639302039067, 'samples': 1673216, 'steps': 3267, 'batch_loss/train': 0.8839958850294352}
12/22/2021 01:13:12 - INFO - codeparrot_training - Step 3268: {'lr': 0.0004952608759911057, 'samples': 1673728, 'steps': 3268, 'batch_loss/train': 0.9420266337692738}
12/22/2021 01:13:23 - INFO - codeparrot_training - Step 3269: {'lr': 0.00049525782080324, 'samples': 1674240, 'steps': 3269, 'batch_loss/train': 0.9000209402292967}
12/22/2021 01:13:35 - INFO - codeparrot_training - Step 3270: {'lr': 0.0004952547646403219, 'samples': 1674752, 'steps': 3270, 'batch_loss/train': 0.9081681515090168}
12/22/2021 01:13:46 - INFO - codeparrot_training - Step 3271: {'lr': 0.0004952517075023634, 'samples': 1675264, 'steps': 3271, 'batch_loss/train': 0.8225785752292722}
12/22/2021 01:13:56 - INFO - codeparrot_training - Step 3272: {'lr': 0.0004952486493893768, 'samples': 1675776, 'steps': 3272, 'batch_loss/train': 0.9075093939900398}
12/22/2021 01:14:08 - INFO - codeparrot_training - Step 3273: {'lr': 0.000495245590301374, 'samples': 1676288, 'steps': 3273, 'batch_loss/train': 0.8571320325136185}
12/22/2021 01:14:19 - INFO - codeparrot_training - Step 3274: {'lr': 0.0004952425302383675, 'samples': 1676800, 'steps': 3274, 'batch_loss/train': 0.742012869566679}
12/22/2021 01:14:30 - INFO - codeparrot_training - Step 3275: {'lr': 0.0004952394692003692, 'samples': 1677312, 'steps': 3275, 'batch_loss/train': 0.8040602137334645}
12/22/2021 01:14:42 - INFO - codeparrot_training - Step 3276: {'lr': 0.0004952364071873915, 'samples': 1677824, 'steps': 3276, 'batch_loss/train': 0.8436339679174125}
12/22/2021 01:14:52 - INFO - codeparrot_training - Step 3277: {'lr': 0.0004952333441994464, 'samples': 1678336, 'steps': 3277, 'batch_loss/train': 0.883572404505685}
12/22/2021 01:15:03 - INFO - codeparrot_training - Step 3278: {'lr': 0.000495230280236546, 'samples': 1678848, 'steps': 3278, 'batch_loss/train': 0.8778392914682627}
12/22/2021 01:15:16 - INFO - codeparrot_training - Step 3279: {'lr': 0.0004952272152987026, 'samples': 1679360, 'steps': 3279, 'batch_loss/train': 0.8123840838670731}
12/22/2021 01:15:26 - INFO - codeparrot_training - Step 3280: {'lr': 0.0004952241493859285, 'samples': 1679872, 'steps': 3280, 'batch_loss/train': 0.7877422538585961}
12/22/2021 01:15:37 - INFO - codeparrot_training - Step 3281: {'lr': 0.0004952210824982357, 'samples': 1680384, 'steps': 3281, 'batch_loss/train': 0.9263921580277383}
12/22/2021 01:15:48 - INFO - codeparrot_training - Step 3282: {'lr': 0.0004952180146356365, 'samples': 1680896, 'steps': 3282, 'batch_loss/train': 1.0066408598795533}
12/22/2021 01:16:00 - INFO - codeparrot_training - Step 3283: {'lr': 0.0004952149457981432, 'samples': 1681408, 'steps': 3283, 'batch_loss/train': 0.7316201319918036}
12/22/2021 01:16:10 - INFO - codeparrot_training - Step 3284: {'lr': 0.0004952118759857677, 'samples': 1681920, 'steps': 3284, 'batch_loss/train': 0.8549640281125903}
12/22/2021 01:16:21 - INFO - codeparrot_training - Step 3285: {'lr': 0.0004952088051985224, 'samples': 1682432, 'steps': 3285, 'batch_loss/train': 0.827948659658432}
12/22/2021 01:16:33 - INFO - codeparrot_training - Step 3286: {'lr': 0.0004952057334364194, 'samples': 1682944, 'steps': 3286, 'batch_loss/train': 0.8094743508845568}
12/22/2021 01:16:43 - INFO - codeparrot_training - Step 3287: {'lr': 0.0004952026606994711, 'samples': 1683456, 'steps': 3287, 'batch_loss/train': 0.8417792990803719}
12/22/2021 01:16:54 - INFO - codeparrot_training - Step 3288: {'lr': 0.0004951995869876897, 'samples': 1683968, 'steps': 3288, 'batch_loss/train': 0.9162263749167323}
12/22/2021 01:17:07 - INFO - codeparrot_training - Step 3289: {'lr': 0.0004951965123010871, 'samples': 1684480, 'steps': 3289, 'batch_loss/train': 0.8527079345658422}
12/22/2021 01:17:17 - INFO - codeparrot_training - Step 3290: {'lr': 0.0004951934366396759, 'samples': 1684992, 'steps': 3290, 'batch_loss/train': 0.848004000261426}
12/22/2021 01:17:28 - INFO - codeparrot_training - Step 3291: {'lr': 0.0004951903600034681, 'samples': 1685504, 'steps': 3291, 'batch_loss/train': 0.8483575042337179}
12/22/2021 01:17:38 - INFO - codeparrot_training - Step 3292: {'lr': 0.0004951872823924761, 'samples': 1686016, 'steps': 3292, 'batch_loss/train': 0.9069962371140718}
12/22/2021 01:17:51 - INFO - codeparrot_training - Step 3293: {'lr': 0.0004951842038067119, 'samples': 1686528, 'steps': 3293, 'batch_loss/train': 0.8679569992236793}
12/22/2021 01:18:01 - INFO - codeparrot_training - Step 3294: {'lr': 0.000495181124246188, 'samples': 1687040, 'steps': 3294, 'batch_loss/train': 0.9147716956213117}
12/22/2021 01:18:12 - INFO - codeparrot_training - Step 3295: {'lr': 0.0004951780437109165, 'samples': 1687552, 'steps': 3295, 'batch_loss/train': 0.9453265117481351}
12/22/2021 01:18:25 - INFO - codeparrot_training - Step 3296: {'lr': 0.0004951749622009096, 'samples': 1688064, 'steps': 3296, 'batch_loss/train': 0.8698658663779497}
12/22/2021 01:18:35 - INFO - codeparrot_training - Step 3297: {'lr': 0.0004951718797161798, 'samples': 1688576, 'steps': 3297, 'batch_loss/train': 1.720781633630395}
12/22/2021 01:18:46 - INFO - codeparrot_training - Step 3298: {'lr': 0.000495168796256739, 'samples': 1689088, 'steps': 3298, 'batch_loss/train': 1.0057665100321174}
12/22/2021 01:18:58 - INFO - codeparrot_training - Step 3299: {'lr': 0.0004951657118225998, 'samples': 1689600, 'steps': 3299, 'batch_loss/train': 0.9691520966589451}
12/22/2021 01:19:09 - INFO - codeparrot_training - Step 3300: {'lr': 0.0004951626264137741, 'samples': 1690112, 'steps': 3300, 'batch_loss/train': 0.9822781444527209}
12/22/2021 01:19:19 - INFO - codeparrot_training - Step 3301: {'lr': 0.0004951595400302745, 'samples': 1690624, 'steps': 3301, 'batch_loss/train': 0.9080630950629711}
12/22/2021 01:19:30 - INFO - codeparrot_training - Step 3302: {'lr': 0.0004951564526721131, 'samples': 1691136, 'steps': 3302, 'batch_loss/train': 0.8833744712173939}
12/22/2021 01:19:42 - INFO - codeparrot_training - Step 3303: {'lr': 0.0004951533643393022, 'samples': 1691648, 'steps': 3303, 'batch_loss/train': 0.8311827890574932}
12/22/2021 01:19:53 - INFO - codeparrot_training - Step 3304: {'lr': 0.0004951502750318541, 'samples': 1692160, 'steps': 3304, 'batch_loss/train': 0.8668560776859522}
12/22/2021 01:20:03 - INFO - codeparrot_training - Step 3305: {'lr': 0.0004951471847497812, 'samples': 1692672, 'steps': 3305, 'batch_loss/train': 0.8658565212972462}
12/22/2021 01:20:16 - INFO - codeparrot_training - Step 3306: {'lr': 0.0004951440934930955, 'samples': 1693184, 'steps': 3306, 'batch_loss/train': 0.9992596823722124}
12/22/2021 01:20:27 - INFO - codeparrot_training - Step 3307: {'lr': 0.0004951410012618095, 'samples': 1693696, 'steps': 3307, 'batch_loss/train': 0.9248998183757067}
12/22/2021 01:20:37 - INFO - codeparrot_training - Step 3308: {'lr': 0.0004951379080559354, 'samples': 1694208, 'steps': 3308, 'batch_loss/train': 0.9645476881414652}
12/22/2021 01:20:49 - INFO - codeparrot_training - Step 3309: {'lr': 0.0004951348138754856, 'samples': 1694720, 'steps': 3309, 'batch_loss/train': 0.8214915383141488}
12/22/2021 01:21:00 - INFO - codeparrot_training - Step 3310: {'lr': 0.0004951317187204724, 'samples': 1695232, 'steps': 3310, 'batch_loss/train': 0.9077181573957205}
12/22/2021 01:21:11 - INFO - codeparrot_training - Step 3311: {'lr': 0.0004951286225909078, 'samples': 1695744, 'steps': 3311, 'batch_loss/train': 0.8042204430093989}
12/22/2021 01:21:21 - INFO - codeparrot_training - Step 3312: {'lr': 0.0004951255254868046, 'samples': 1696256, 'steps': 3312, 'batch_loss/train': 0.7171985658351332}
12/22/2021 01:21:33 - INFO - codeparrot_training - Step 3313: {'lr': 0.0004951224274081748, 'samples': 1696768, 'steps': 3313, 'batch_loss/train': 0.913719198666513}
12/22/2021 01:21:44 - INFO - codeparrot_training - Step 3314: {'lr': 0.0004951193283550307, 'samples': 1697280, 'steps': 3314, 'batch_loss/train': 0.7739215805195272}
12/22/2021 01:21:55 - INFO - codeparrot_training - Step 3315: {'lr': 0.0004951162283273848, 'samples': 1697792, 'steps': 3315, 'batch_loss/train': 0.8415600573644042}
12/22/2021 01:22:08 - INFO - codeparrot_training - Step 3316: {'lr': 0.0004951131273252492, 'samples': 1698304, 'steps': 3316, 'batch_loss/train': 0.8914443589746952}
12/22/2021 01:22:18 - INFO - codeparrot_training - Step 3317: {'lr': 0.0004951100253486365, 'samples': 1698816, 'steps': 3317, 'batch_loss/train': 0.8173208148218691}
12/22/2021 01:22:29 - INFO - codeparrot_training - Step 3318: {'lr': 0.0004951069223975589, 'samples': 1699328, 'steps': 3318, 'batch_loss/train': 0.9111671587452292}
12/22/2021 01:22:41 - INFO - codeparrot_training - Step 3319: {'lr': 0.0004951038184720286, 'samples': 1699840, 'steps': 3319, 'batch_loss/train': 0.8911869488656521}
12/22/2021 01:22:52 - INFO - codeparrot_training - Step 3320: {'lr': 0.0004951007135720582, 'samples': 1700352, 'steps': 3320, 'batch_loss/train': 0.9127333955839276}
12/22/2021 01:23:02 - INFO - codeparrot_training - Step 3321: {'lr': 0.0004950976076976598, 'samples': 1700864, 'steps': 3321, 'batch_loss/train': 0.8426312864758074}
12/22/2021 01:23:14 - INFO - codeparrot_training - Step 3322: {'lr': 0.0004950945008488459, 'samples': 1701376, 'steps': 3322, 'batch_loss/train': 0.8474989314563572}
12/22/2021 01:23:25 - INFO - codeparrot_training - Step 3323: {'lr': 0.0004950913930256288, 'samples': 1701888, 'steps': 3323, 'batch_loss/train': 0.7600602176971734}
12/22/2021 01:23:36 - INFO - codeparrot_training - Step 3324: {'lr': 0.0004950882842280209, 'samples': 1702400, 'steps': 3324, 'batch_loss/train': 0.8494994239881635}
12/22/2021 01:23:46 - INFO - codeparrot_training - Step 3325: {'lr': 0.0004950851744560345, 'samples': 1702912, 'steps': 3325, 'batch_loss/train': 0.8507796199992299}
12/22/2021 01:23:59 - INFO - codeparrot_training - Step 3326: {'lr': 0.0004950820637096819, 'samples': 1703424, 'steps': 3326, 'batch_loss/train': 0.8060475625097752}
12/22/2021 01:24:10 - INFO - codeparrot_training - Step 3327: {'lr': 0.0004950789519889756, 'samples': 1703936, 'steps': 3327, 'batch_loss/train': 0.7596516348421574}
12/22/2021 01:24:20 - INFO - codeparrot_training - Step 3328: {'lr': 0.0004950758392939281, 'samples': 1704448, 'steps': 3328, 'batch_loss/train': 0.8100775023922324}
12/22/2021 01:24:32 - INFO - codeparrot_training - Step 3329: {'lr': 0.0004950727256245514, 'samples': 1704960, 'steps': 3329, 'batch_loss/train': 0.8798440617974848}
12/22/2021 01:24:43 - INFO - codeparrot_training - Step 3330: {'lr': 0.0004950696109808581, 'samples': 1705472, 'steps': 3330, 'batch_loss/train': 0.9333367650397122}
12/22/2021 01:24:54 - INFO - codeparrot_training - Step 3331: {'lr': 0.0004950664953628606, 'samples': 1705984, 'steps': 3331, 'batch_loss/train': 0.8974853558465838}
12/22/2021 01:25:06 - INFO - codeparrot_training - Step 3332: {'lr': 0.0004950633787705712, 'samples': 1706496, 'steps': 3332, 'batch_loss/train': 0.9606946017593145}
12/22/2021 01:25:16 - INFO - codeparrot_training - Step 3333: {'lr': 0.0004950602612040024, 'samples': 1707008, 'steps': 3333, 'batch_loss/train': 0.7826344245113432}
12/22/2021 01:25:27 - INFO - codeparrot_training - Step 3334: {'lr': 0.0004950571426631666, 'samples': 1707520, 'steps': 3334, 'batch_loss/train': 0.7861535400152206}
12/22/2021 01:25:38 - INFO - codeparrot_training - Step 3335: {'lr': 0.000495054023148076, 'samples': 1708032, 'steps': 3335, 'batch_loss/train': 0.8086537765339017}
12/22/2021 01:25:50 - INFO - codeparrot_training - Step 3336: {'lr': 0.0004950509026587432, 'samples': 1708544, 'steps': 3336, 'batch_loss/train': 0.8678730884566903}
12/22/2021 01:26:01 - INFO - codeparrot_training - Step 3337: {'lr': 0.0004950477811951805, 'samples': 1709056, 'steps': 3337, 'batch_loss/train': 0.7900496572256088}
12/22/2021 01:26:11 - INFO - codeparrot_training - Step 3338: {'lr': 0.0004950446587574003, 'samples': 1709568, 'steps': 3338, 'batch_loss/train': 0.8214613827876747}
12/22/2021 01:26:24 - INFO - codeparrot_training - Step 3339: {'lr': 0.0004950415353454152, 'samples': 1710080, 'steps': 3339, 'batch_loss/train': 0.8419863726012409}
12/22/2021 01:26:35 - INFO - codeparrot_training - Step 3340: {'lr': 0.0004950384109592374, 'samples': 1710592, 'steps': 3340, 'batch_loss/train': 0.7351262625306845}
12/22/2021 01:26:45 - INFO - codeparrot_training - Step 3341: {'lr': 0.0004950352855988794, 'samples': 1711104, 'steps': 3341, 'batch_loss/train': 0.823459388804622}
12/22/2021 01:26:57 - INFO - codeparrot_training - Step 3342: {'lr': 0.0004950321592643536, 'samples': 1711616, 'steps': 3342, 'batch_loss/train': 0.8006290141493082}
12/22/2021 01:27:08 - INFO - codeparrot_training - Step 3343: {'lr': 0.0004950290319556726, 'samples': 1712128, 'steps': 3343, 'batch_loss/train': 0.9133477415889502}
12/22/2021 01:27:18 - INFO - codeparrot_training - Step 3344: {'lr': 0.0004950259036728485, 'samples': 1712640, 'steps': 3344, 'batch_loss/train': 0.8628484271466732}
12/22/2021 01:27:29 - INFO - codeparrot_training - Step 3345: {'lr': 0.0004950227744158941, 'samples': 1713152, 'steps': 3345, 'batch_loss/train': 0.832843164447695}
12/22/2021 01:27:42 - INFO - codeparrot_training - Step 3346: {'lr': 0.0004950196441848215, 'samples': 1713664, 'steps': 3346, 'batch_loss/train': 0.8937053717672825}
12/22/2021 01:27:52 - INFO - codeparrot_training - Step 3347: {'lr': 0.0004950165129796434, 'samples': 1714176, 'steps': 3347, 'batch_loss/train': 0.9066405529156327}
12/22/2021 01:28:03 - INFO - codeparrot_training - Step 3348: {'lr': 0.0004950133808003721, 'samples': 1714688, 'steps': 3348, 'batch_loss/train': 0.8267844850197434}
12/22/2021 01:28:15 - INFO - codeparrot_training - Step 3349: {'lr': 0.0004950102476470202, 'samples': 1715200, 'steps': 3349, 'batch_loss/train': 0.8786013047210872}
12/22/2021 01:28:26 - INFO - codeparrot_training - Step 3350: {'lr': 0.0004950071135196001, 'samples': 1715712, 'steps': 3350, 'batch_loss/train': 0.8179828943684697}
12/22/2021 01:28:36 - INFO - codeparrot_training - Step 3351: {'lr': 0.000495003978418124, 'samples': 1716224, 'steps': 3351, 'batch_loss/train': 0.8873338755220175}
12/22/2021 01:28:48 - INFO - codeparrot_training - Step 3352: {'lr': 0.0004950008423426047, 'samples': 1716736, 'steps': 3352, 'batch_loss/train': 0.8418361209332943}
12/22/2021 01:28:59 - INFO - codeparrot_training - Step 3353: {'lr': 0.0004949977052930547, 'samples': 1717248, 'steps': 3353, 'batch_loss/train': 0.8470705589279532}
12/22/2021 01:29:10 - INFO - codeparrot_training - Step 3354: {'lr': 0.0004949945672694862, 'samples': 1717760, 'steps': 3354, 'batch_loss/train': 0.9693102296441793}
12/22/2021 01:29:20 - INFO - codeparrot_training - Step 3355: {'lr': 0.0004949914282719118, 'samples': 1718272, 'steps': 3355, 'batch_loss/train': 0.8353178454563022}
12/22/2021 01:29:33 - INFO - codeparrot_training - Step 3356: {'lr': 0.000494988288300344, 'samples': 1718784, 'steps': 3356, 'batch_loss/train': 1.6879682741127908}
12/22/2021 01:29:43 - INFO - codeparrot_training - Step 3357: {'lr': 0.0004949851473547953, 'samples': 1719296, 'steps': 3357, 'batch_loss/train': 0.8219684339128435}
12/22/2021 01:29:54 - INFO - codeparrot_training - Step 3358: {'lr': 0.0004949820054352782, 'samples': 1719808, 'steps': 3358, 'batch_loss/train': 0.886695166118443}
12/22/2021 01:30:07 - INFO - codeparrot_training - Step 3359: {'lr': 0.0004949788625418051, 'samples': 1720320, 'steps': 3359, 'batch_loss/train': 0.8939358908683062}
12/22/2021 01:30:17 - INFO - codeparrot_training - Step 3360: {'lr': 0.0004949757186743886, 'samples': 1720832, 'steps': 3360, 'batch_loss/train': 0.8564875368028879}
12/22/2021 01:30:28 - INFO - codeparrot_training - Step 3361: {'lr': 0.0004949725738330411, 'samples': 1721344, 'steps': 3361, 'batch_loss/train': 0.8089627851732075}
12/22/2021 01:30:40 - INFO - codeparrot_training - Step 3362: {'lr': 0.000494969428017775, 'samples': 1721856, 'steps': 3362, 'batch_loss/train': 0.8794811833649874}
12/22/2021 01:30:51 - INFO - codeparrot_training - Step 3363: {'lr': 0.0004949662812286031, 'samples': 1722368, 'steps': 3363, 'batch_loss/train': 0.8640848947688937}
12/22/2021 01:31:01 - INFO - codeparrot_training - Step 3364: {'lr': 0.0004949631334655377, 'samples': 1722880, 'steps': 3364, 'batch_loss/train': 0.9145896658301353}
12/22/2021 01:31:12 - INFO - codeparrot_training - Step 3365: {'lr': 0.0004949599847285915, 'samples': 1723392, 'steps': 3365, 'batch_loss/train': 0.7617625597631559}
12/22/2021 01:31:25 - INFO - codeparrot_training - Step 3366: {'lr': 0.0004949568350177768, 'samples': 1723904, 'steps': 3366, 'batch_loss/train': 0.9003536608070135}
12/22/2021 01:31:36 - INFO - codeparrot_training - Step 3367: {'lr': 0.0004949536843331063, 'samples': 1724416, 'steps': 3367, 'batch_loss/train': 0.8585276575759053}
12/22/2021 01:31:47 - INFO - codeparrot_training - Step 3368: {'lr': 0.0004949505326745923, 'samples': 1724928, 'steps': 3368, 'batch_loss/train': 0.7112729009240866}
12/22/2021 01:31:59 - INFO - codeparrot_training - Step 3369: {'lr': 0.0004949473800422476, 'samples': 1725440, 'steps': 3369, 'batch_loss/train': 0.7274879766628146}
12/22/2021 01:32:09 - INFO - codeparrot_training - Step 3370: {'lr': 0.0004949442264360845, 'samples': 1725952, 'steps': 3370, 'batch_loss/train': 0.9315635552629828}
12/22/2021 01:32:20 - INFO - codeparrot_training - Step 3371: {'lr': 0.0004949410718561157, 'samples': 1726464, 'steps': 3371, 'batch_loss/train': 0.7658983878791332}
12/22/2021 01:32:32 - INFO - codeparrot_training - Step 3372: {'lr': 0.0004949379163023536, 'samples': 1726976, 'steps': 3372, 'batch_loss/train': 0.9245859263464808}
12/22/2021 01:32:43 - INFO - codeparrot_training - Step 3373: {'lr': 0.0004949347597748109, 'samples': 1727488, 'steps': 3373, 'batch_loss/train': 0.9659429499879479}
12/22/2021 01:32:53 - INFO - codeparrot_training - Step 3374: {'lr': 0.0004949316022735001, 'samples': 1728000, 'steps': 3374, 'batch_loss/train': 0.9042179335374385}
12/22/2021 01:33:04 - INFO - codeparrot_training - Step 3375: {'lr': 0.0004949284437984337, 'samples': 1728512, 'steps': 3375, 'batch_loss/train': 0.830235549248755}
12/22/2021 01:33:17 - INFO - codeparrot_training - Step 3376: {'lr': 0.0004949252843496244, 'samples': 1729024, 'steps': 3376, 'batch_loss/train': 0.7931954357773066}
12/22/2021 01:33:27 - INFO - codeparrot_training - Step 3377: {'lr': 0.0004949221239270845, 'samples': 1729536, 'steps': 3377, 'batch_loss/train': 0.8651972180232406}
12/22/2021 01:33:38 - INFO - codeparrot_training - Step 3378: {'lr': 0.0004949189625308268, 'samples': 1730048, 'steps': 3378, 'batch_loss/train': 0.8280920311808586}
12/22/2021 01:33:50 - INFO - codeparrot_training - Step 3379: {'lr': 0.0004949158001608636, 'samples': 1730560, 'steps': 3379, 'batch_loss/train': 0.8807862750254571}
12/22/2021 01:34:01 - INFO - codeparrot_training - Step 3380: {'lr': 0.0004949126368172079, 'samples': 1731072, 'steps': 3380, 'batch_loss/train': 0.9043366210535169}
12/22/2021 01:34:11 - INFO - codeparrot_training - Step 3381: {'lr': 0.000494909472499872, 'samples': 1731584, 'steps': 3381, 'batch_loss/train': 0.797097334638238}
12/22/2021 01:34:24 - INFO - codeparrot_training - Step 3382: {'lr': 0.0004949063072088684, 'samples': 1732096, 'steps': 3382, 'batch_loss/train': 0.8598443325608969}
12/22/2021 01:34:35 - INFO - codeparrot_training - Step 3383: {'lr': 0.0004949031409442099, 'samples': 1732608, 'steps': 3383, 'batch_loss/train': 0.8089424767531455}
12/22/2021 01:34:45 - INFO - codeparrot_training - Step 3384: {'lr': 0.0004948999737059089, 'samples': 1733120, 'steps': 3384, 'batch_loss/train': 0.9327204441651702}
12/22/2021 01:34:56 - INFO - codeparrot_training - Step 3385: {'lr': 0.000494896805493978, 'samples': 1733632, 'steps': 3385, 'batch_loss/train': 0.927832999266684}
12/22/2021 01:35:08 - INFO - codeparrot_training - Step 3386: {'lr': 0.00049489363630843, 'samples': 1734144, 'steps': 3386, 'batch_loss/train': 0.8636948452331126}
12/22/2021 01:35:19 - INFO - codeparrot_training - Step 3387: {'lr': 0.0004948904661492774, 'samples': 1734656, 'steps': 3387, 'batch_loss/train': 0.9479699833318591}
12/22/2021 01:35:29 - INFO - codeparrot_training - Step 3388: {'lr': 0.0004948872950165326, 'samples': 1735168, 'steps': 3388, 'batch_loss/train': 0.7856004713103175}
12/22/2021 01:35:41 - INFO - codeparrot_training - Step 3389: {'lr': 0.0004948841229102086, 'samples': 1735680, 'steps': 3389, 'batch_loss/train': 0.8151787873357534}
12/22/2021 01:35:52 - INFO - codeparrot_training - Step 3390: {'lr': 0.0004948809498303176, 'samples': 1736192, 'steps': 3390, 'batch_loss/train': 0.8436945173889399}
12/22/2021 01:36:03 - INFO - codeparrot_training - Step 3391: {'lr': 0.0004948777757768725, 'samples': 1736704, 'steps': 3391, 'batch_loss/train': 0.8598989369347692}
12/22/2021 01:36:15 - INFO - codeparrot_training - Step 3392: {'lr': 0.0004948746007498859, 'samples': 1737216, 'steps': 3392, 'batch_loss/train': 0.9182702000252903}
12/22/2021 01:36:25 - INFO - codeparrot_training - Step 3393: {'lr': 0.0004948714247493703, 'samples': 1737728, 'steps': 3393, 'batch_loss/train': 0.8036752948537469}
12/22/2021 01:36:36 - INFO - codeparrot_training - Step 3394: {'lr': 0.0004948682477753383, 'samples': 1738240, 'steps': 3394, 'batch_loss/train': 0.8835043162107468}
12/22/2021 01:36:47 - INFO - codeparrot_training - Step 3395: {'lr': 0.0004948650698278027, 'samples': 1738752, 'steps': 3395, 'batch_loss/train': 0.9211112689226866}
12/22/2021 01:37:00 - INFO - codeparrot_training - Step 3396: {'lr': 0.000494861890906776, 'samples': 1739264, 'steps': 3396, 'batch_loss/train': 0.610370030393824}
12/22/2021 01:37:10 - INFO - codeparrot_training - Step 3397: {'lr': 0.0004948587110122708, 'samples': 1739776, 'steps': 3397, 'batch_loss/train': 0.8418231131508946}
12/22/2021 01:37:21 - INFO - codeparrot_training - Step 3398: {'lr': 0.0004948555301442999, 'samples': 1740288, 'steps': 3398, 'batch_loss/train': 0.8670580238103867}
12/22/2021 01:37:33 - INFO - codeparrot_training - Step 3399: {'lr': 0.0004948523483028759, 'samples': 1740800, 'steps': 3399, 'batch_loss/train': 0.8117843121290207}
12/22/2021 01:37:44 - INFO - codeparrot_training - Step 3400: {'lr': 0.0004948491654880113, 'samples': 1741312, 'steps': 3400, 'batch_loss/train': 0.7641113931313157}
12/22/2021 01:37:54 - INFO - codeparrot_training - Step 3401: {'lr': 0.000494845981699719, 'samples': 1741824, 'steps': 3401, 'batch_loss/train': 0.748710936284624}
12/22/2021 01:38:06 - INFO - codeparrot_training - Step 3402: {'lr': 0.0004948427969380115, 'samples': 1742336, 'steps': 3402, 'batch_loss/train': 0.9423190783709288}
12/22/2021 01:38:17 - INFO - codeparrot_training - Step 3403: {'lr': 0.0004948396112029014, 'samples': 1742848, 'steps': 3403, 'batch_loss/train': 0.8834946695715189}
12/22/2021 01:38:28 - INFO - codeparrot_training - Step 3404: {'lr': 0.0004948364244944015, 'samples': 1743360, 'steps': 3404, 'batch_loss/train': 0.8975165830925107}
12/22/2021 01:38:40 - INFO - codeparrot_training - Step 3405: {'lr': 0.0004948332368125244, 'samples': 1743872, 'steps': 3405, 'batch_loss/train': 0.8442022111266851}
12/22/2021 01:38:51 - INFO - codeparrot_training - Step 3406: {'lr': 0.0004948300481572829, 'samples': 1744384, 'steps': 3406, 'batch_loss/train': 1.59714859072119}
12/22/2021 01:39:02 - INFO - codeparrot_training - Step 3407: {'lr': 0.0004948268585286895, 'samples': 1744896, 'steps': 3407, 'batch_loss/train': 0.5950370144564658}
12/22/2021 01:39:12 - INFO - codeparrot_training - Step 3408: {'lr': 0.0004948236679267569, 'samples': 1745408, 'steps': 3408, 'batch_loss/train': 0.7973870933055878}
12/22/2021 01:39:24 - INFO - codeparrot_training - Step 3409: {'lr': 0.0004948204763514979, 'samples': 1745920, 'steps': 3409, 'batch_loss/train': 0.7612559555564076}
12/22/2021 01:39:35 - INFO - codeparrot_training - Step 3410: {'lr': 0.0004948172838029251, 'samples': 1746432, 'steps': 3410, 'batch_loss/train': 0.9103408735245466}
12/22/2021 01:39:46 - INFO - codeparrot_training - Step 3411: {'lr': 0.0004948140902810513, 'samples': 1746944, 'steps': 3411, 'batch_loss/train': 0.8794205598533154}
12/22/2021 01:39:58 - INFO - codeparrot_training - Step 3412: {'lr': 0.000494810895785889, 'samples': 1747456, 'steps': 3412, 'batch_loss/train': 0.8798856721259654}
12/22/2021 01:40:09 - INFO - codeparrot_training - Step 3413: {'lr': 0.000494807700317451, 'samples': 1747968, 'steps': 3413, 'batch_loss/train': 0.8439251352101564}
12/22/2021 01:40:19 - INFO - codeparrot_training - Step 3414: {'lr': 0.0004948045038757502, 'samples': 1748480, 'steps': 3414, 'batch_loss/train': 0.865317358635366}
12/22/2021 01:40:32 - INFO - codeparrot_training - Step 3415: {'lr': 0.000494801306460799, 'samples': 1748992, 'steps': 3415, 'batch_loss/train': 0.7499922765418887}
12/22/2021 01:40:43 - INFO - codeparrot_training - Step 3416: {'lr': 0.0004947981080726102, 'samples': 1749504, 'steps': 3416, 'batch_loss/train': 0.8782188044860959}
12/22/2021 01:40:53 - INFO - codeparrot_training - Step 3417: {'lr': 0.0004947949087111967, 'samples': 1750016, 'steps': 3417, 'batch_loss/train': 0.8633600249886513}
12/22/2021 01:41:04 - INFO - codeparrot_training - Step 3418: {'lr': 0.0004947917083765709, 'samples': 1750528, 'steps': 3418, 'batch_loss/train': 0.8563424134626985}
12/22/2021 01:41:16 - INFO - codeparrot_training - Step 3419: {'lr': 0.0004947885070687457, 'samples': 1751040, 'steps': 3419, 'batch_loss/train': 0.8643941758200526}
12/22/2021 01:41:27 - INFO - codeparrot_training - Step 3420: {'lr': 0.000494785304787734, 'samples': 1751552, 'steps': 3420, 'batch_loss/train': 0.8091750051826239}
12/22/2021 01:41:37 - INFO - codeparrot_training - Step 3421: {'lr': 0.0004947821015335483, 'samples': 1752064, 'steps': 3421, 'batch_loss/train': 0.9128811741247773}
12/22/2021 01:41:50 - INFO - codeparrot_training - Step 3422: {'lr': 0.0004947788973062013, 'samples': 1752576, 'steps': 3422, 'batch_loss/train': 0.9143862789496779}
12/22/2021 01:42:01 - INFO - codeparrot_training - Step 3423: {'lr': 0.0004947756921057059, 'samples': 1753088, 'steps': 3423, 'batch_loss/train': 0.9567630970850587}
12/22/2021 01:42:11 - INFO - codeparrot_training - Step 3424: {'lr': 0.0004947724859320747, 'samples': 1753600, 'steps': 3424, 'batch_loss/train': 0.865407002158463}
12/22/2021 01:42:23 - INFO - codeparrot_training - Step 3425: {'lr': 0.0004947692787853206, 'samples': 1754112, 'steps': 3425, 'batch_loss/train': 0.9021863555535674}
12/22/2021 01:42:34 - INFO - codeparrot_training - Step 3426: {'lr': 0.0004947660706654562, 'samples': 1754624, 'steps': 3426, 'batch_loss/train': 0.8894019816070795}
12/22/2021 01:42:45 - INFO - codeparrot_training - Step 3427: {'lr': 0.0004947628615724944, 'samples': 1755136, 'steps': 3427, 'batch_loss/train': 0.8438635151833296}
12/22/2021 01:42:55 - INFO - codeparrot_training - Step 3428: {'lr': 0.0004947596515064479, 'samples': 1755648, 'steps': 3428, 'batch_loss/train': 0.7859973236918449}
12/22/2021 01:43:08 - INFO - codeparrot_training - Step 3429: {'lr': 0.0004947564404673293, 'samples': 1756160, 'steps': 3429, 'batch_loss/train': 0.9809649474918842}
12/22/2021 01:43:18 - INFO - codeparrot_training - Step 3430: {'lr': 0.0004947532284551516, 'samples': 1756672, 'steps': 3430, 'batch_loss/train': 0.7754708230495453}
12/22/2021 01:43:29 - INFO - codeparrot_training - Step 3431: {'lr': 0.0004947500154699275, 'samples': 1757184, 'steps': 3431, 'batch_loss/train': 0.8452319046482444}
12/22/2021 01:43:41 - INFO - codeparrot_training - Step 3432: {'lr': 0.0004947468015116698, 'samples': 1757696, 'steps': 3432, 'batch_loss/train': 0.8546511121094227}
12/22/2021 01:43:51 - INFO - codeparrot_training - Step 3433: {'lr': 0.0004947435865803911, 'samples': 1758208, 'steps': 3433, 'batch_loss/train': 0.8639962105080485}
12/22/2021 01:44:02 - INFO - codeparrot_training - Step 3434: {'lr': 0.0004947403706761043, 'samples': 1758720, 'steps': 3434, 'batch_loss/train': 0.9540189248509705}
12/22/2021 01:44:15 - INFO - codeparrot_training - Step 3435: {'lr': 0.0004947371537988223, 'samples': 1759232, 'steps': 3435, 'batch_loss/train': 0.9038183055818081}
12/22/2021 01:44:25 - INFO - codeparrot_training - Step 3436: {'lr': 0.0004947339359485577, 'samples': 1759744, 'steps': 3436, 'batch_loss/train': 0.7885990324430168}
12/22/2021 01:44:36 - INFO - codeparrot_training - Step 3437: {'lr': 0.0004947307171253235, 'samples': 1760256, 'steps': 3437, 'batch_loss/train': 0.8729413943365216}
12/22/2021 01:44:48 - INFO - codeparrot_training - Step 3438: {'lr': 0.0004947274973291323, 'samples': 1760768, 'steps': 3438, 'batch_loss/train': 0.8978551314212382}
12/22/2021 01:44:59 - INFO - codeparrot_training - Step 3439: {'lr': 0.000494724276559997, 'samples': 1761280, 'steps': 3439, 'batch_loss/train': 0.8837389647960663}
12/22/2021 01:45:09 - INFO - codeparrot_training - Step 3440: {'lr': 0.0004947210548179303, 'samples': 1761792, 'steps': 3440, 'batch_loss/train': 0.8333333535119891}
12/22/2021 01:45:20 - INFO - codeparrot_training - Step 3441: {'lr': 0.0004947178321029452, 'samples': 1762304, 'steps': 3441, 'batch_loss/train': 0.7598433429375291}
12/22/2021 01:45:32 - INFO - codeparrot_training - Step 3442: {'lr': 0.0004947146084150544, 'samples': 1762816, 'steps': 3442, 'batch_loss/train': 0.92936583282426}
12/22/2021 01:45:42 - INFO - codeparrot_training - Step 3443: {'lr': 0.0004947113837542707, 'samples': 1763328, 'steps': 3443, 'batch_loss/train': 0.8570017539896071}
12/22/2021 01:45:53 - INFO - codeparrot_training - Step 3444: {'lr': 0.0004947081581206068, 'samples': 1763840, 'steps': 3444, 'batch_loss/train': 0.735582840628922}
12/22/2021 01:46:06 - INFO - codeparrot_training - Step 3445: {'lr': 0.0004947049315140758, 'samples': 1764352, 'steps': 3445, 'batch_loss/train': 0.8245374895632267}
12/22/2021 01:46:16 - INFO - codeparrot_training - Step 3446: {'lr': 0.0004947017039346904, 'samples': 1764864, 'steps': 3446, 'batch_loss/train': 0.7546464260667562}
12/22/2021 01:46:27 - INFO - codeparrot_training - Step 3447: {'lr': 0.0004946984753824633, 'samples': 1765376, 'steps': 3447, 'batch_loss/train': 0.8289693417027593}
12/22/2021 01:46:39 - INFO - codeparrot_training - Step 3448: {'lr': 0.0004946952458574075, 'samples': 1765888, 'steps': 3448, 'batch_loss/train': 0.8947907940018922}
12/22/2021 01:46:50 - INFO - codeparrot_training - Step 3449: {'lr': 0.0004946920153595359, 'samples': 1766400, 'steps': 3449, 'batch_loss/train': 0.8471428910270333}
12/22/2021 01:47:00 - INFO - codeparrot_training - Step 3450: {'lr': 0.0004946887838888612, 'samples': 1766912, 'steps': 3450, 'batch_loss/train': 0.9065291732549667}
12/22/2021 01:47:11 - INFO - codeparrot_training - Step 3451: {'lr': 0.0004946855514453962, 'samples': 1767424, 'steps': 3451, 'batch_loss/train': 0.8795671816915274}
12/22/2021 01:47:24 - INFO - codeparrot_training - Step 3452: {'lr': 0.0004946823180291539, 'samples': 1767936, 'steps': 3452, 'batch_loss/train': 0.8072973852977157}
12/22/2021 01:47:34 - INFO - codeparrot_training - Step 3453: {'lr': 0.000494679083640147, 'samples': 1768448, 'steps': 3453, 'batch_loss/train': 0.7879467578604817}
12/22/2021 01:47:45 - INFO - codeparrot_training - Step 3454: {'lr': 0.0004946758482783886, 'samples': 1768960, 'steps': 3454, 'batch_loss/train': 0.9275901149958372}
12/22/2021 01:47:57 - INFO - codeparrot_training - Step 3455: {'lr': 0.0004946726119438913, 'samples': 1769472, 'steps': 3455, 'batch_loss/train': 0.7924302471801639}
12/22/2021 01:48:08 - INFO - codeparrot_training - Step 3456: {'lr': 0.0004946693746366681, 'samples': 1769984, 'steps': 3456, 'batch_loss/train': 0.8475770372897387}
12/22/2021 01:48:18 - INFO - codeparrot_training - Step 3457: {'lr': 0.000494666136356732, 'samples': 1770496, 'steps': 3457, 'batch_loss/train': 0.9338865922763944}
12/22/2021 01:48:30 - INFO - codeparrot_training - Step 3458: {'lr': 0.0004946628971040956, 'samples': 1771008, 'steps': 3458, 'batch_loss/train': 0.7618992370553315}
12/22/2021 01:48:41 - INFO - codeparrot_training - Step 3459: {'lr': 0.0004946596568787719, 'samples': 1771520, 'steps': 3459, 'batch_loss/train': 0.8476750357076526}
12/22/2021 01:48:52 - INFO - codeparrot_training - Step 3460: {'lr': 0.0004946564156807738, 'samples': 1772032, 'steps': 3460, 'batch_loss/train': 0.8062706240452826}
12/22/2021 01:49:02 - INFO - codeparrot_training - Step 3461: {'lr': 0.0004946531735101141, 'samples': 1772544, 'steps': 3461, 'batch_loss/train': 0.9660002440214157}
12/22/2021 01:49:15 - INFO - codeparrot_training - Step 3462: {'lr': 0.0004946499303668058, 'samples': 1773056, 'steps': 3462, 'batch_loss/train': 0.9085809281095862}
12/22/2021 01:49:26 - INFO - codeparrot_training - Step 3463: {'lr': 0.0004946466862508619, 'samples': 1773568, 'steps': 3463, 'batch_loss/train': 0.857090204488486}
12/22/2021 01:49:36 - INFO - codeparrot_training - Step 3464: {'lr': 0.000494643441162295, 'samples': 1774080, 'steps': 3464, 'batch_loss/train': 0.943649560213089}
12/22/2021 01:49:48 - INFO - codeparrot_training - Step 3465: {'lr': 0.0004946401951011182, 'samples': 1774592, 'steps': 3465, 'batch_loss/train': 0.6680805434007198}
12/22/2021 01:49:59 - INFO - codeparrot_training - Step 3466: {'lr': 0.0004946369480673443, 'samples': 1775104, 'steps': 3466, 'batch_loss/train': 0.8659596741199493}
12/22/2021 01:50:10 - INFO - codeparrot_training - Step 3467: {'lr': 0.0004946337000609863, 'samples': 1775616, 'steps': 3467, 'batch_loss/train': 0.8596974331885576}
12/22/2021 01:50:22 - INFO - codeparrot_training - Step 3468: {'lr': 0.000494630451082057, 'samples': 1776128, 'steps': 3468, 'batch_loss/train': 0.8380794748663902}
12/22/2021 01:50:33 - INFO - codeparrot_training - Step 3469: {'lr': 0.0004946272011305694, 'samples': 1776640, 'steps': 3469, 'batch_loss/train': 0.5856748193036765}
12/22/2021 01:50:44 - INFO - codeparrot_training - Step 3470: {'lr': 0.0004946239502065366, 'samples': 1777152, 'steps': 3470, 'batch_loss/train': 0.7223432927858084}
12/22/2021 01:50:54 - INFO - codeparrot_training - Step 3471: {'lr': 0.0004946206983099712, 'samples': 1777664, 'steps': 3471, 'batch_loss/train': 0.8448981121182442}
12/22/2021 01:51:06 - INFO - codeparrot_training - Step 3472: {'lr': 0.0004946174454408863, 'samples': 1778176, 'steps': 3472, 'batch_loss/train': 1.04445660719648}
12/22/2021 01:51:17 - INFO - codeparrot_training - Step 3473: {'lr': 0.0004946141915992948, 'samples': 1778688, 'steps': 3473, 'batch_loss/train': 0.9172955797985196}
12/22/2021 01:51:27 - INFO - codeparrot_training - Step 3474: {'lr': 0.0004946109367852097, 'samples': 1779200, 'steps': 3474, 'batch_loss/train': 1.0339185260236263}
12/22/2021 01:51:41 - INFO - codeparrot_training - Step 3475: {'lr': 0.0004946076809986438, 'samples': 1779712, 'steps': 3475, 'batch_loss/train': 0.8509270157665014}
12/22/2021 01:51:51 - INFO - codeparrot_training - Step 3476: {'lr': 0.0004946044242396101, 'samples': 1780224, 'steps': 3476, 'batch_loss/train': 1.1049997832160443}
12/22/2021 01:52:02 - INFO - codeparrot_training - Step 3477: {'lr': 0.0004946011665081216, 'samples': 1780736, 'steps': 3477, 'batch_loss/train': 0.9175719609484076}
12/22/2021 01:52:14 - INFO - codeparrot_training - Step 3478: {'lr': 0.0004945979078041912, 'samples': 1781248, 'steps': 3478, 'batch_loss/train': 0.9563420200720429}
12/22/2021 01:52:25 - INFO - codeparrot_training - Step 3479: {'lr': 0.0004945946481278319, 'samples': 1781760, 'steps': 3479, 'batch_loss/train': 0.8762904442846775}
12/22/2021 01:52:35 - INFO - codeparrot_training - Step 3480: {'lr': 0.0004945913874790566, 'samples': 1782272, 'steps': 3480, 'batch_loss/train': 0.8808010555803776}
12/22/2021 01:52:46 - INFO - codeparrot_training - Step 3481: {'lr': 0.0004945881258578783, 'samples': 1782784, 'steps': 3481, 'batch_loss/train': 0.8639015620574355}
12/22/2021 01:52:58 - INFO - codeparrot_training - Step 3482: {'lr': 0.00049458486326431, 'samples': 1783296, 'steps': 3482, 'batch_loss/train': 0.822933750692755}
12/22/2021 01:53:09 - INFO - codeparrot_training - Step 3483: {'lr': 0.0004945815996983647, 'samples': 1783808, 'steps': 3483, 'batch_loss/train': 0.8456965144723654}
12/22/2021 01:53:20 - INFO - codeparrot_training - Step 3484: {'lr': 0.0004945783351600552, 'samples': 1784320, 'steps': 3484, 'batch_loss/train': 0.8831368796527386}
12/22/2021 01:53:32 - INFO - codeparrot_training - Step 3485: {'lr': 0.0004945750696493946, 'samples': 1784832, 'steps': 3485, 'batch_loss/train': 0.9064144417643547}
12/22/2021 01:53:42 - INFO - codeparrot_training - Step 3486: {'lr': 0.000494571803166396, 'samples': 1785344, 'steps':
3486, 'batch_loss/train': 0.8073588674888015} 12/22/2021 01:53:53 - INFO - codeparrot_training - Step 3487: {'lr': 0.0004945685357110721, 'samples': 1785856, 'steps': 3487, 'batch_loss/train': 0.9321060562506318} 12/22/2021 01:54:05 - INFO - codeparrot_training - Step 3488: {'lr': 0.0004945652672834361, 'samples': 1786368, 'steps': 3488, 'batch_loss/train': 0.8714982904493809} 12/22/2021 01:54:16 - INFO - codeparrot_training - Step 3489: {'lr': 0.0004945619978835009, 'samples': 1786880, 'steps': 3489, 'batch_loss/train': 0.8427813844755292} 12/22/2021 01:54:26 - INFO - codeparrot_training - Step 3490: {'lr': 0.0004945587275112796, 'samples': 1787392, 'steps': 3490, 'batch_loss/train': 0.8943967996165156} 12/22/2021 01:54:39 - INFO - codeparrot_training - Step 3491: {'lr': 0.000494555456166785, 'samples': 1787904, 'steps': 3491, 'batch_loss/train': 0.9104488557204604} 12/22/2021 01:54:49 - INFO - codeparrot_training - Step 3492: {'lr': 0.0004945521838500305, 'samples': 1788416, 'steps': 3492, 'batch_loss/train': 0.8866565618664026} 12/22/2021 01:55:00 - INFO - codeparrot_training - Step 3493: {'lr': 0.0004945489105610287, 'samples': 1788928, 'steps': 3493, 'batch_loss/train': 0.8301026145927608} 12/22/2021 01:55:11 - INFO - codeparrot_training - Step 3494: {'lr': 0.0004945456362997928, 'samples': 1789440, 'steps': 3494, 'batch_loss/train': 0.864351318217814} 12/22/2021 01:55:24 - INFO - codeparrot_training - Step 3495: {'lr': 0.0004945423610663357, 'samples': 1789952, 'steps': 3495, 'batch_loss/train': 0.7974714837037027} 12/22/2021 01:55:35 - INFO - codeparrot_training - Step 3496: {'lr': 0.0004945390848606707, 'samples': 1790464, 'steps': 3496, 'batch_loss/train': 0.9848687676712871} 12/22/2021 01:55:45 - INFO - codeparrot_training - Step 3497: {'lr': 0.0004945358076828104, 'samples': 1790976, 'steps': 3497, 'batch_loss/train': 0.9001434613019228} 12/22/2021 01:55:57 - INFO - codeparrot_training - Step 3498: {'lr': 0.0004945325295327683, 'samples': 1791488, 
'steps': 3498, 'batch_loss/train': 0.9471545265987515} 12/22/2021 01:56:08 - INFO - codeparrot_training - Step 3499: {'lr': 0.000494529250410557, 'samples': 1792000, 'steps': 3499, 'batch_loss/train': 0.8704524664208293} 12/22/2021 01:56:19 - INFO - codeparrot_training - Step 3500: {'lr': 0.0004945259703161898, 'samples': 1792512, 'steps': 3500, 'batch_loss/train': 0.8676300197839737} 12/22/2021 01:56:31 - INFO - codeparrot_training - Step 3501: {'lr': 0.0004945226892496797, 'samples': 1793024, 'steps': 3501, 'batch_loss/train': 0.8452836577780545} 12/22/2021 01:56:41 - INFO - codeparrot_training - Step 3502: {'lr': 0.0004945194072110397, 'samples': 1793536, 'steps': 3502, 'batch_loss/train': 0.8925215350463986} 12/22/2021 01:56:52 - INFO - codeparrot_training - Step 3503: {'lr': 0.0004945161242002829, 'samples': 1794048, 'steps': 3503, 'batch_loss/train': 0.8552897032350302} 12/22/2021 01:57:02 - INFO - codeparrot_training - Step 3504: {'lr': 0.0004945128402174222, 'samples': 1794560, 'steps': 3504, 'batch_loss/train': 1.0117326541803777} 12/22/2021 01:57:15 - INFO - codeparrot_training - Step 3505: {'lr': 0.000494509555262471, 'samples': 1795072, 'steps': 3505, 'batch_loss/train': 0.883851895108819} 12/22/2021 01:57:26 - INFO - codeparrot_training - Step 3506: {'lr': 0.0004945062693354419, 'samples': 1795584, 'steps': 3506, 'batch_loss/train': 0.9263391690328717} 12/22/2021 01:57:36 - INFO - codeparrot_training - Step 3507: {'lr': 0.0004945029824363483, 'samples': 1796096, 'steps': 3507, 'batch_loss/train': 0.8215256440453231} 12/22/2021 01:57:49 - INFO - codeparrot_training - Step 3508: {'lr': 0.0004944996945652031, 'samples': 1796608, 'steps': 3508, 'batch_loss/train': 0.7772551644593477} 12/22/2021 01:57:59 - INFO - codeparrot_training - Step 3509: {'lr': 0.0004944964057220194, 'samples': 1797120, 'steps': 3509, 'batch_loss/train': 0.8318264717236161} 12/22/2021 01:58:10 - INFO - codeparrot_training - Step 3510: {'lr': 0.0004944931159068104, 'samples': 
1797632, 'steps': 3510, 'batch_loss/train': 0.9086710996925831} 12/22/2021 01:58:22 - INFO - codeparrot_training - Step 3511: {'lr': 0.0004944898251195892, 'samples': 1798144, 'steps': 3511, 'batch_loss/train': 0.9272334445267916} 12/22/2021 01:58:32 - INFO - codeparrot_training - Step 3512: {'lr': 0.0004944865333603686, 'samples': 1798656, 'steps': 3512, 'batch_loss/train': 0.9310260312631726} 12/22/2021 01:58:43 - INFO - codeparrot_training - Step 3513: {'lr': 0.0004944832406291619, 'samples': 1799168, 'steps': 3513, 'batch_loss/train': 0.9598990902304649} 12/22/2021 01:58:56 - INFO - codeparrot_training - Step 3514: {'lr': 0.0004944799469259821, 'samples': 1799680, 'steps': 3514, 'batch_loss/train': 0.8060433105565608} 12/22/2021 01:59:06 - INFO - codeparrot_training - Step 3515: {'lr': 0.0004944766522508424, 'samples': 1800192, 'steps': 3515, 'batch_loss/train': 0.8312797234393656} 12/22/2021 01:59:17 - INFO - codeparrot_training - Step 3516: {'lr': 0.0004944733566037559, 'samples': 1800704, 'steps': 3516, 'batch_loss/train': 0.8710194202139974} 12/22/2021 01:59:27 - INFO - codeparrot_training - Step 3517: {'lr': 0.0004944700599847355, 'samples': 1801216, 'steps': 3517, 'batch_loss/train': 0.8407049275701866} 12/22/2021 01:59:40 - INFO - codeparrot_training - Step 3518: {'lr': 0.0004944667623937945, 'samples': 1801728, 'steps': 3518, 'batch_loss/train': 2.8580757966265082} 12/22/2021 01:59:51 - INFO - codeparrot_training - Step 3519: {'lr': 0.0004944634638309461, 'samples': 1802240, 'steps': 3519, 'batch_loss/train': 0.9616761561483145} 12/22/2021 02:00:01 - INFO - codeparrot_training - Step 3520: {'lr': 0.0004944601642962031, 'samples': 1802752, 'steps': 3520, 'batch_loss/train': 0.9264923417940736} 12/22/2021 02:00:14 - INFO - codeparrot_training - Step 3521: {'lr': 0.0004944568637895789, 'samples': 1803264, 'steps': 3521, 'batch_loss/train': 0.969231927767396} 12/22/2021 02:00:25 - INFO - codeparrot_training - Step 3522: {'lr': 0.0004944535623110864, 
'samples': 1803776, 'steps': 3522, 'batch_loss/train': 0.9023872278630733} 12/22/2021 02:00:35 - INFO - codeparrot_training - Step 3523: {'lr': 0.0004944502598607388, 'samples': 1804288, 'steps': 3523, 'batch_loss/train': 0.8764679087325931} 12/22/2021 02:00:46 - INFO - codeparrot_training - Step 3524: {'lr': 0.0004944469564385495, 'samples': 1804800, 'steps': 3524, 'batch_loss/train': 0.8167861038818955} 12/22/2021 02:00:58 - INFO - codeparrot_training - Step 3525: {'lr': 0.0004944436520445312, 'samples': 1805312, 'steps': 3525, 'batch_loss/train': 0.9607346630655229} 12/22/2021 02:01:09 - INFO - codeparrot_training - Step 3526: {'lr': 0.0004944403466786973, 'samples': 1805824, 'steps': 3526, 'batch_loss/train': 0.9265015041455626} 12/22/2021 02:01:19 - INFO - codeparrot_training - Step 3527: {'lr': 0.0004944370403410609, 'samples': 1806336, 'steps': 3527, 'batch_loss/train': 0.8194585470482707} 12/22/2021 02:01:31 - INFO - codeparrot_training - Step 3528: {'lr': 0.0004944337330316352, 'samples': 1806848, 'steps': 3528, 'batch_loss/train': 0.8380950358696282} 12/22/2021 02:01:42 - INFO - codeparrot_training - Step 3529: {'lr': 0.0004944304247504332, 'samples': 1807360, 'steps': 3529, 'batch_loss/train': 0.8413641154766083} 12/22/2021 02:01:52 - INFO - codeparrot_training - Step 3530: {'lr': 0.000494427115497468, 'samples': 1807872, 'steps': 3530, 'batch_loss/train': 0.9342019772157073} 12/22/2021 02:02:05 - INFO - codeparrot_training - Step 3531: {'lr': 0.000494423805272753, 'samples': 1808384, 'steps': 3531, 'batch_loss/train': 2.0119240302592516} 12/22/2021 02:02:16 - INFO - codeparrot_training - Step 3532: {'lr': 0.0004944204940763013, 'samples': 1808896, 'steps': 3532, 'batch_loss/train': 0.816680911462754} 12/22/2021 02:02:27 - INFO - codeparrot_training - Step 3533: {'lr': 0.0004944171819081259, 'samples': 1809408, 'steps': 3533, 'batch_loss/train': 0.9404733330011368} 12/22/2021 02:02:37 - INFO - codeparrot_training - Step 3534: {'lr': 
0.0004944138687682402, 'samples': 1809920, 'steps': 3534, 'batch_loss/train': 0.834939862601459} 12/22/2021 02:02:49 - INFO - codeparrot_training - Step 3535: {'lr': 0.0004944105546566571, 'samples': 1810432, 'steps': 3535, 'batch_loss/train': 0.9229628471657634} 12/22/2021 02:03:00 - INFO - codeparrot_training - Step 3536: {'lr': 0.00049440723957339, 'samples': 1810944, 'steps': 3536, 'batch_loss/train': 0.912520335521549} 12/22/2021 02:03:11 - INFO - codeparrot_training - Step 3537: {'lr': 0.0004944039235184518, 'samples': 1811456, 'steps': 3537, 'batch_loss/train': 0.8837254727259278} 12/22/2021 02:03:23 - INFO - codeparrot_training - Step 3538: {'lr': 0.0004944006064918562, 'samples': 1811968, 'steps': 3538, 'batch_loss/train': 0.9516239296644926} 12/22/2021 02:03:34 - INFO - codeparrot_training - Step 3539: {'lr': 0.0004943972884936159, 'samples': 1812480, 'steps': 3539, 'batch_loss/train': 0.9229928832501173} 12/22/2021 02:03:44 - INFO - codeparrot_training - Step 3540: {'lr': 0.0004943939695237443, 'samples': 1812992, 'steps': 3540, 'batch_loss/train': 0.8192246803082526} 12/22/2021 02:03:57 - INFO - codeparrot_training - Step 3541: {'lr': 0.0004943906495822545, 'samples': 1813504, 'steps': 3541, 'batch_loss/train': 0.8373703854158521} 12/22/2021 02:04:07 - INFO - codeparrot_training - Step 3542: {'lr': 0.0004943873286691598, 'samples': 1814016, 'steps': 3542, 'batch_loss/train': 0.8835948491469026} 12/22/2021 02:04:18 - INFO - codeparrot_training - Step 3543: {'lr': 0.0004943840067844733, 'samples': 1814528, 'steps': 3543, 'batch_loss/train': 0.5543482950888574} 12/22/2021 02:04:31 - INFO - codeparrot_training - Step 3544: {'lr': 0.0004943806839282084, 'samples': 1815040, 'steps': 3544, 'batch_loss/train': 0.973328361287713} 12/22/2021 02:04:41 - INFO - codeparrot_training - Step 3545: {'lr': 0.0004943773601003781, 'samples': 1815552, 'steps': 3545, 'batch_loss/train': 0.8416095515713096} 12/22/2021 02:04:52 - INFO - codeparrot_training - Step 3546: {'lr': 
0.0004943740353009956, 'samples': 1816064, 'steps': 3546, 'batch_loss/train': 0.8184488546103239} 12/22/2021 02:05:03 - INFO - codeparrot_training - Step 3547: {'lr': 0.0004943707095300744, 'samples': 1816576, 'steps': 3547, 'batch_loss/train': 0.8637743787840009} 12/22/2021 02:05:15 - INFO - codeparrot_training - Step 3548: {'lr': 0.0004943673827876273, 'samples': 1817088, 'steps': 3548, 'batch_loss/train': 0.815297300927341} 12/22/2021 02:05:25 - INFO - codeparrot_training - Step 3549: {'lr': 0.000494364055073668, 'samples': 1817600, 'steps': 3549, 'batch_loss/train': 0.8656611237674952} 12/22/2021 02:05:36 - INFO - codeparrot_training - Step 3550: {'lr': 0.0004943607263882094, 'samples': 1818112, 'steps': 3550, 'batch_loss/train': 0.8853835673071444} 12/22/2021 02:05:49 - INFO - codeparrot_training - Step 3551: {'lr': 0.0004943573967312648, 'samples': 1818624, 'steps': 3551, 'batch_loss/train': 0.9265786493197083} 12/22/2021 02:05:59 - INFO - codeparrot_training - Step 3552: {'lr': 0.0004943540661028474, 'samples': 1819136, 'steps': 3552, 'batch_loss/train': 0.8892348720692098} 12/22/2021 02:06:10 - INFO - codeparrot_training - Step 3553: {'lr': 0.0004943507345029706, 'samples': 1819648, 'steps': 3553, 'batch_loss/train': 0.9021750576794147} 12/22/2021 02:06:22 - INFO - codeparrot_training - Step 3554: {'lr': 0.0004943474019316474, 'samples': 1820160, 'steps': 3554, 'batch_loss/train': 0.7938539409078658} 12/22/2021 02:06:32 - INFO - codeparrot_training - Step 3555: {'lr': 0.0004943440683888912, 'samples': 1820672, 'steps': 3555, 'batch_loss/train': 0.7603551298379898} 12/22/2021 02:06:43 - INFO - codeparrot_training - Step 3556: {'lr': 0.0004943407338747154, 'samples': 1821184, 'steps': 3556, 'batch_loss/train': 0.9094089400023222} 12/22/2021 02:06:54 - INFO - codeparrot_training - Step 3557: {'lr': 0.000494337398389133, 'samples': 1821696, 'steps': 3557, 'batch_loss/train': 0.8615685030817986} 12/22/2021 02:07:06 - INFO - codeparrot_training - Step 3558: 
{'lr': 0.0004943340619321573, 'samples': 1822208, 'steps': 3558, 'batch_loss/train': 0.880504909902811} 12/22/2021 02:07:16 - INFO - codeparrot_training - Step 3559: {'lr': 0.0004943307245038018, 'samples': 1822720, 'steps': 3559, 'batch_loss/train': 0.7891916492953897} 12/22/2021 02:07:27 - INFO - codeparrot_training - Step 3560: {'lr': 0.0004943273861040795, 'samples': 1823232, 'steps': 3560, 'batch_loss/train': 0.8722427310422063} 12/22/2021 02:07:40 - INFO - codeparrot_training - Step 3561: {'lr': 0.0004943240467330038, 'samples': 1823744, 'steps': 3561, 'batch_loss/train': 1.3392586950212717} 12/22/2021 02:07:50 - INFO - codeparrot_training - Step 3562: {'lr': 0.0004943207063905879, 'samples': 1824256, 'steps': 3562, 'batch_loss/train': 0.8518108995631337} 12/22/2021 02:08:01 - INFO - codeparrot_training - Step 3563: {'lr': 0.0004943173650768451, 'samples': 1824768, 'steps': 3563, 'batch_loss/train': 0.8585761766880751} 12/22/2021 02:08:12 - INFO - codeparrot_training - Step 3564: {'lr': 0.0004943140227917888, 'samples': 1825280, 'steps': 3564, 'batch_loss/train': 0.8329691016115248} 12/22/2021 02:08:24 - INFO - codeparrot_training - Step 3565: {'lr': 0.000494310679535432, 'samples': 1825792, 'steps': 3565, 'batch_loss/train': 0.824758957605809} 12/22/2021 02:08:34 - INFO - codeparrot_training - Step 3566: {'lr': 0.0004943073353077884, 'samples': 1826304, 'steps': 3566, 'batch_loss/train': 0.8035524319857359} 12/22/2021 02:08:45 - INFO - codeparrot_training - Step 3567: {'lr': 0.0004943039901088711, 'samples': 1826816, 'steps': 3567, 'batch_loss/train': 0.9234896302223206} 12/22/2021 02:08:58 - INFO - codeparrot_training - Step 3568: {'lr': 0.0004943006439386932, 'samples': 1827328, 'steps': 3568, 'batch_loss/train': 0.8337323293089867} 12/22/2021 02:09:08 - INFO - codeparrot_training - Step 3569: {'lr': 0.0004942972967972683, 'samples': 1827840, 'steps': 3569, 'batch_loss/train': 0.8427014593034983} 12/22/2021 02:09:19 - INFO - codeparrot_training - Step 
3570: {'lr': 0.0004942939486846095, 'samples': 1828352, 'steps': 3570, 'batch_loss/train': 0.9111682642251253} 12/22/2021 02:09:32 - INFO - codeparrot_training - Step 3571: {'lr': 0.0004942905996007303, 'samples': 1828864, 'steps': 3571, 'batch_loss/train': 0.9075486836954951} 12/22/2021 02:09:43 - INFO - codeparrot_training - Step 3572: {'lr': 0.000494287249545644, 'samples': 1829376, 'steps': 3572, 'batch_loss/train': 0.8511207697447389} 12/22/2021 02:09:53 - INFO - codeparrot_training - Step 3573: {'lr': 0.0004942838985193636, 'samples': 1829888, 'steps': 3573, 'batch_loss/train': 0.9741732208058238} 12/22/2021 02:10:04 - INFO - codeparrot_training - Step 3574: {'lr': 0.0004942805465219028, 'samples': 1830400, 'steps': 3574, 'batch_loss/train': 1.2674470040947199} 12/22/2021 02:10:16 - INFO - codeparrot_training - Step 3575: {'lr': 0.0004942771935532748, 'samples': 1830912, 'steps': 3575, 'batch_loss/train': 0.9817301537841558} 12/22/2021 02:10:26 - INFO - codeparrot_training - Step 3576: {'lr': 0.0004942738396134928, 'samples': 1831424, 'steps': 3576, 'batch_loss/train': 0.8651961451396346} 12/22/2021 02:10:37 - INFO - codeparrot_training - Step 3577: {'lr': 0.0004942704847025704, 'samples': 1831936, 'steps': 3577, 'batch_loss/train': 0.8070804169401526} 12/22/2021 02:10:49 - INFO - codeparrot_training - Step 3578: {'lr': 0.0004942671288205207, 'samples': 1832448, 'steps': 3578, 'batch_loss/train': 0.8615930192172527} 12/22/2021 02:11:00 - INFO - codeparrot_training - Step 3579: {'lr': 0.0004942637719673571, 'samples': 1832960, 'steps': 3579, 'batch_loss/train': 0.8399009457789361} 12/22/2021 02:11:10 - INFO - codeparrot_training - Step 3580: {'lr': 0.0004942604141430929, 'samples': 1833472, 'steps': 3580, 'batch_loss/train': 0.8758146241307259} 12/22/2021 02:11:23 - INFO - codeparrot_training - Step 3581: {'lr': 0.0004942570553477417, 'samples': 1833984, 'steps': 3581, 'batch_loss/train': 0.768500764388591} 12/22/2021 02:11:34 - INFO - codeparrot_training - 
Step 3582: {'lr': 0.0004942536955813166, 'samples': 1834496, 'steps': 3582, 'batch_loss/train': 0.7934064352884889} 12/22/2021 02:11:44 - INFO - codeparrot_training - Step 3583: {'lr': 0.000494250334843831, 'samples': 1835008, 'steps': 3583, 'batch_loss/train': 0.8626810284331441} 12/22/2021 02:11:55 - INFO - codeparrot_training - Step 3584: {'lr': 0.0004942469731352984, 'samples': 1835520, 'steps': 3584, 'batch_loss/train': 0.9239878803491592} 12/22/2021 02:12:07 - INFO - codeparrot_training - Step 3585: {'lr': 0.000494243610455732, 'samples': 1836032, 'steps': 3585, 'batch_loss/train': 0.7879227800294757} 12/22/2021 02:12:18 - INFO - codeparrot_training - Step 3586: {'lr': 0.0004942402468051452, 'samples': 1836544, 'steps': 3586, 'batch_loss/train': 0.9153551158960909} 12/22/2021 02:12:28 - INFO - codeparrot_training - Step 3587: {'lr': 0.0004942368821835515, 'samples': 1837056, 'steps': 3587, 'batch_loss/train': 0.8310556160286069} 12/22/2021 02:12:40 - INFO - codeparrot_training - Step 3588: {'lr': 0.0004942335165909641, 'samples': 1837568, 'steps': 3588, 'batch_loss/train': 0.8327822322025895} 12/22/2021 02:12:51 - INFO - codeparrot_training - Step 3589: {'lr': 0.0004942301500273963, 'samples': 1838080, 'steps': 3589, 'batch_loss/train': 0.7075921571813524} 12/22/2021 02:13:02 - INFO - codeparrot_training - Step 3590: {'lr': 0.0004942267824928619, 'samples': 1838592, 'steps': 3590, 'batch_loss/train': 0.9668259071186185} 12/22/2021 02:13:14 - INFO - codeparrot_training - Step 3591: {'lr': 0.0004942234139873739, 'samples': 1839104, 'steps': 3591, 'batch_loss/train': 0.8692309316247702} 12/22/2021 02:13:25 - INFO - codeparrot_training - Step 3592: {'lr': 0.0004942200445109458, 'samples': 1839616, 'steps': 3592, 'batch_loss/train': 0.9152122503146529} 12/22/2021 02:13:35 - INFO - codeparrot_training - Step 3593: {'lr': 0.0004942166740635911, 'samples': 1840128, 'steps': 3593, 'batch_loss/train': 0.9167777337133884} 12/22/2021 02:13:46 - INFO - codeparrot_training 
- Step 3594: {'lr': 0.0004942133026453232, 'samples': 1840640, 'steps': 3594, 'batch_loss/train': 0.8875272190198302} 12/22/2021 02:13:58 - INFO - codeparrot_training - Step 3595: {'lr': 0.0004942099302561552, 'samples': 1841152, 'steps': 3595, 'batch_loss/train': 0.8295080633834004} 12/22/2021 02:14:09 - INFO - codeparrot_training - Step 3596: {'lr': 0.0004942065568961008, 'samples': 1841664, 'steps': 3596, 'batch_loss/train': 0.8035096656531096} 12/22/2021 02:14:19 - INFO - codeparrot_training - Step 3597: {'lr': 0.0004942031825651734, 'samples': 1842176, 'steps': 3597, 'batch_loss/train': 0.9165981682017446} 12/22/2021 02:14:32 - INFO - codeparrot_training - Step 3598: {'lr': 0.0004941998072633863, 'samples': 1842688, 'steps': 3598, 'batch_loss/train': 1.048768994398415} 12/22/2021 02:14:43 - INFO - codeparrot_training - Step 3599: {'lr': 0.000494196430990753, 'samples': 1843200, 'steps': 3599, 'batch_loss/train': 1.2040011070203036} 12/22/2021 02:14:53 - INFO - codeparrot_training - Step 3600: {'lr': 0.0004941930537472869, 'samples': 1843712, 'steps': 3600, 'batch_loss/train': 0.807585927657783} 12/22/2021 02:15:05 - INFO - codeparrot_training - Step 3601: {'lr': 0.0004941896755330014, 'samples': 1844224, 'steps': 3601, 'batch_loss/train': 0.7648380836471915} 12/22/2021 02:15:16 - INFO - codeparrot_training - Step 3602: {'lr': 0.0004941862963479099, 'samples': 1844736, 'steps': 3602, 'batch_loss/train': 1.0362963839434087} 12/22/2021 02:15:27 - INFO - codeparrot_training - Step 3603: {'lr': 0.000494182916192026, 'samples': 1845248, 'steps': 3603, 'batch_loss/train': 1.0919600976631045} 12/22/2021 02:15:39 - INFO - codeparrot_training - Step 3604: {'lr': 0.000494179535065363, 'samples': 1845760, 'steps': 3604, 'batch_loss/train': 0.9153896253556013} 12/22/2021 02:15:49 - INFO - codeparrot_training - Step 3605: {'lr': 0.0004941761529679343, 'samples': 1846272, 'steps': 3605, 'batch_loss/train': 1.2690646974369884} 12/22/2021 02:16:00 - INFO - codeparrot_training 
- Step 3606: {'lr': 0.0004941727698997535, 'samples': 1846784, 'steps': 3606, 'batch_loss/train': 0.8726218547672033} 12/22/2021 02:16:10 - INFO - codeparrot_training - Step 3607: {'lr': 0.0004941693858608339, 'samples': 1847296, 'steps': 3607, 'batch_loss/train': 0.8166847731918097} 12/22/2021 02:16:24 - INFO - codeparrot_training - Step 3608: {'lr': 0.000494166000851189, 'samples': 1847808, 'steps': 3608, 'batch_loss/train': 0.8843117239885032} 12/22/2021 02:16:34 - INFO - codeparrot_training - Step 3609: {'lr': 0.0004941626148708323, 'samples': 1848320, 'steps': 3609, 'batch_loss/train': 0.8140062503516674} 12/22/2021 02:16:45 - INFO - codeparrot_training - Step 3610: {'lr': 0.0004941592279197772, 'samples': 1848832, 'steps': 3610, 'batch_loss/train': 0.8856231700628996} 12/22/2021 02:16:57 - INFO - codeparrot_training - Step 3611: {'lr': 0.0004941558399980373, 'samples': 1849344, 'steps': 3611, 'batch_loss/train': 1.2615128380712122} 12/22/2021 02:17:08 - INFO - codeparrot_training - Step 3612: {'lr': 0.0004941524511056259, 'samples': 1849856, 'steps': 3612, 'batch_loss/train': 0.9730823596473783} 12/22/2021 02:17:18 - INFO - codeparrot_training - Step 3613: {'lr': 0.0004941490612425565, 'samples': 1850368, 'steps': 3613, 'batch_loss/train': 0.8517878665588796} 12/22/2021 02:17:29 - INFO - codeparrot_training - Step 3614: {'lr': 0.0004941456704088427, 'samples': 1850880, 'steps': 3614, 'batch_loss/train': 0.8801688766106963} 12/22/2021 02:17:42 - INFO - codeparrot_training - Step 3615: {'lr': 0.0004941422786044979, 'samples': 1851392, 'steps': 3615, 'batch_loss/train': 0.8270635297521949} 12/22/2021 02:17:52 - INFO - codeparrot_training - Step 3616: {'lr': 0.0004941388858295356, 'samples': 1851904, 'steps': 3616, 'batch_loss/train': 0.8101574634201825} 12/22/2021 02:18:03 - INFO - codeparrot_training - Step 3617: {'lr': 0.0004941354920839692, 'samples': 1852416, 'steps': 3617, 'batch_loss/train': 0.868084330111742} 12/22/2021 02:18:15 - INFO - 
codeparrot_training - Step 3618: {'lr': 0.0004941320973678123, 'samples': 1852928, 'steps': 3618, 'batch_loss/train': 0.77992626093328} 12/22/2021 02:18:26 - INFO - codeparrot_training - Step 3619: {'lr': 0.0004941287016810784, 'samples': 1853440, 'steps': 3619, 'batch_loss/train': 1.1729822349734604} 12/22/2021 02:18:36 - INFO - codeparrot_training - Step 3620: {'lr': 0.0004941253050237809, 'samples': 1853952, 'steps': 3620, 'batch_loss/train': 0.841286982409656} 12/22/2021 02:18:49 - INFO - codeparrot_training - Step 3621: {'lr': 0.0004941219073959334, 'samples': 1854464, 'steps': 3621, 'batch_loss/train': 0.8932104390114546} 12/22/2021 02:19:00 - INFO - codeparrot_training - Step 3622: {'lr': 0.0004941185087975493, 'samples': 1854976, 'steps': 3622, 'batch_loss/train': 0.8765964093618095} 12/22/2021 02:19:10 - INFO - codeparrot_training - Step 3623: {'lr': 0.0004941151092286422, 'samples': 1855488, 'steps': 3623, 'batch_loss/train': 0.7725192091893405} 12/22/2021 02:19:22 - INFO - codeparrot_training - Step 3624: {'lr': 0.0004941117086892258, 'samples': 1856000, 'steps': 3624, 'batch_loss/train': 0.7387925083748996} 12/22/2021 02:19:33 - INFO - codeparrot_training - Step 3625: {'lr': 0.0004941083071793132, 'samples': 1856512, 'steps': 3625, 'batch_loss/train': 0.7698519108816981} 12/22/2021 02:19:44 - INFO - codeparrot_training - Step 3626: {'lr': 0.0004941049046989182, 'samples': 1857024, 'steps': 3626, 'batch_loss/train': 0.7716005090624094} 12/22/2021 02:19:54 - INFO - codeparrot_training - Step 3627: {'lr': 0.0004941015012480543, 'samples': 1857536, 'steps': 3627, 'batch_loss/train': 0.930221838876605} 12/22/2021 02:20:06 - INFO - codeparrot_training - Step 3628: {'lr': 0.0004940980968267349, 'samples': 1858048, 'steps': 3628, 'batch_loss/train': 0.8367686718702316} 12/22/2021 02:20:17 - INFO - codeparrot_training - Step 3629: {'lr': 0.0004940946914349738, 'samples': 1858560, 'steps': 3629, 'batch_loss/train': 0.8119125356897712} 12/22/2021 02:20:27 - INFO - 
codeparrot_training - Step 3630: {'lr': 0.0004940912850727843, 'samples': 1859072, 'steps': 3630, 'batch_loss/train': 0.867166300304234} 12/22/2021 02:20:40 - INFO - codeparrot_training - Step 3631: {'lr': 0.00049408787774018, 'samples': 1859584, 'steps': 3631, 'batch_loss/train': 0.8659879714250565} 12/22/2021 02:20:51 - INFO - codeparrot_training - Step 3632: {'lr': 0.0004940844694371743, 'samples': 1860096, 'steps': 3632, 'batch_loss/train': 0.8980640228837729} 12/22/2021 02:21:01 - INFO - codeparrot_training - Step 3633: {'lr': 0.0004940810601637811, 'samples': 1860608, 'steps': 3633, 'batch_loss/train': 0.8157307635992765} 12/22/2021 02:21:13 - INFO - codeparrot_training - Step 3634: {'lr': 0.0004940776499200138, 'samples': 1861120, 'steps': 3634, 'batch_loss/train': 0.8793178303167224} 12/22/2021 02:21:24 - INFO - codeparrot_training - Step 3635: {'lr': 0.0004940742387058858, 'samples': 1861632, 'steps': 3635, 'batch_loss/train': 1.2891863882541656} 12/22/2021 02:21:35 - INFO - codeparrot_training - Step 3636: {'lr': 0.0004940708265214109, 'samples': 1862144, 'steps': 3636, 'batch_loss/train': 0.8910515252500772} 12/22/2021 02:21:45 - INFO - codeparrot_training - Step 3637: {'lr': 0.0004940674133666025, 'samples': 1862656, 'steps': 3637, 'batch_loss/train': 0.9146782271564007} 12/22/2021 02:21:58 - INFO - codeparrot_training - Step 3638: {'lr': 0.0004940639992414743, 'samples': 1863168, 'steps': 3638, 'batch_loss/train': 0.9318164084106684} 12/22/2021 02:22:08 - INFO - codeparrot_training - Step 3639: {'lr': 0.0004940605841460396, 'samples': 1863680, 'steps': 3639, 'batch_loss/train': 0.8962002722546458} 12/22/2021 02:22:19 - INFO - codeparrot_training - Step 3640: {'lr': 0.0004940571680803124, 'samples': 1864192, 'steps': 3640, 'batch_loss/train': 0.7705502943135798} 12/22/2021 02:22:31 - INFO - codeparrot_training - Step 3641: {'lr': 0.0004940537510443059, 'samples': 1864704, 'steps': 3641, 'batch_loss/train': 0.9080872964113951} 12/22/2021 02:22:42 - INFO 
- codeparrot_training - Step 3642: {'lr': 0.0004940503330380339, 'samples': 1865216, 'steps': 3642, 'batch_loss/train': 0.9034329671412706} 12/22/2021 02:22:52 - INFO - codeparrot_training - Step 3643: {'lr': 0.0004940469140615099, 'samples': 1865728, 'steps': 3643, 'batch_loss/train': 0.9537591021507978} 12/22/2021 02:23:05 - INFO - codeparrot_training - Step 3644: {'lr': 0.0004940434941147475, 'samples': 1866240, 'steps': 3644, 'batch_loss/train': 0.7651811763644218} 12/22/2021 02:23:15 - INFO - codeparrot_training - Step 3645: {'lr': 0.0004940400731977603, 'samples': 1866752, 'steps': 3645, 'batch_loss/train': 0.9020968480035663} 12/22/2021 02:23:26 - INFO - codeparrot_training - Step 3646: {'lr': 0.000494036651310562, 'samples': 1867264, 'steps': 3646, 'batch_loss/train': 0.8872583769261837} 12/22/2021 02:23:36 - INFO - codeparrot_training - Step 3647: {'lr': 0.0004940332284531661, 'samples': 1867776, 'steps': 3647, 'batch_loss/train': 1.2779811657965183} 12/22/2021 02:23:49 - INFO - codeparrot_training - Step 3648: {'lr': 0.0004940298046255861, 'samples': 1868288, 'steps': 3648, 'batch_loss/train': 0.877135087735951} 12/22/2021 02:24:00 - INFO - codeparrot_training - Step 3649: {'lr': 0.0004940263798278359, 'samples': 1868800, 'steps': 3649, 'batch_loss/train': 0.9365897988900542} 12/22/2021 02:24:10 - INFO - codeparrot_training - Step 3650: {'lr': 0.0004940229540599288, 'samples': 1869312, 'steps': 3650, 'batch_loss/train': 0.8053650725632906} 12/22/2021 02:24:23 - INFO - codeparrot_training - Step 3651: {'lr': 0.0004940195273218787, 'samples': 1869824, 'steps': 3651, 'batch_loss/train': 0.9111508382484317} 12/22/2021 02:24:33 - INFO - codeparrot_training - Step 3652: {'lr': 0.0004940160996136991, 'samples': 1870336, 'steps': 3652, 'batch_loss/train': 0.8475642576813698} 12/22/2021 02:24:44 - INFO - codeparrot_training - Step 3653: {'lr': 0.0004940126709354035, 'samples': 1870848, 'steps': 3653, 'batch_loss/train': 1.0278633758425713} 12/22/2021 02:24:57 - 
INFO - codeparrot_training - Step 3654: {'lr': 0.0004940092412870058, 'samples': 1871360, 'steps': 3654, 'batch_loss/train': 0.8620964987203479} 12/22/2021 02:25:07 - INFO - codeparrot_training - Step 3655: {'lr': 0.0004940058106685195, 'samples': 1871872, 'steps': 3655, 'batch_loss/train': 0.8140120171010494} 12/22/2021 02:25:18 - INFO - codeparrot_training - Step 3656: {'lr': 0.0004940023790799582, 'samples': 1872384, 'steps': 3656, 'batch_loss/train': 0.6418354441411793} 12/22/2021 02:25:29 - INFO - codeparrot_training - Step 3657: {'lr': 0.0004939989465213354, 'samples': 1872896, 'steps': 3657, 'batch_loss/train': 1.7412631381303072} 12/22/2021 02:25:41 - INFO - codeparrot_training - Step 3658: {'lr': 0.0004939955129926651, 'samples': 1873408, 'steps': 3658, 'batch_loss/train': 0.8861029157415032} 12/22/2021 02:25:51 - INFO - codeparrot_training - Step 3659: {'lr': 0.0004939920784939607, 'samples': 1873920, 'steps': 3659, 'batch_loss/train': 0.9573064064607024} 12/22/2021 02:26:02 - INFO - codeparrot_training - Step 3660: {'lr': 0.0004939886430252359, 'samples': 1874432, 'steps': 3660, 'batch_loss/train': 0.9369873618707061} 12/22/2021 02:26:14 - INFO - codeparrot_training - Step 3661: {'lr': 0.0004939852065865045, 'samples': 1874944, 'steps': 3661, 'batch_loss/train': 0.9306574715301394} 12/22/2021 02:26:25 - INFO - codeparrot_training - Step 3662: {'lr': 0.00049398176917778, 'samples': 1875456, 'steps': 3662, 'batch_loss/train': 0.8630824033170938} 12/22/2021 02:26:35 - INFO - codeparrot_training - Step 3663: {'lr': 0.000493978330799076, 'samples': 1875968, 'steps': 3663, 'batch_loss/train': 0.8511675326153636} 12/22/2021 02:26:47 - INFO - codeparrot_training - Step 3664: {'lr': 0.0004939748914504064, 'samples': 1876480, 'steps': 3664, 'batch_loss/train': 0.9501930475234985} 12/22/2021 02:26:58 - INFO - codeparrot_training - Step 3665: {'lr': 0.0004939714511317848, 'samples': 1876992, 'steps': 3665, 'batch_loss/train': 0.9063807829516008} 12/22/2021 02:27:08 
- INFO - codeparrot_training - Step 3666: {'lr': 0.0004939680098432247, 'samples': 1877504, 'steps': 3666, 'batch_loss/train': 0.8147152392193675} 12/22/2021 02:27:21 - INFO - codeparrot_training - Step 3667: {'lr': 0.00049396456758474, 'samples': 1878016, 'steps': 3667, 'batch_loss/train': 0.9330359287559986} 12/22/2021 02:27:32 - INFO - codeparrot_training - Step 3668: {'lr': 0.0004939611243563442, 'samples': 1878528, 'steps': 3668, 'batch_loss/train': 0.9242592337541282} 12/22/2021 02:27:42 - INFO - codeparrot_training - Step 3669: {'lr': 0.0004939576801580511, 'samples': 1879040, 'steps': 3669, 'batch_loss/train': 0.8686613440513611} 12/22/2021 02:27:53 - INFO - codeparrot_training - Step 3670: {'lr': 0.0004939542349898745, 'samples': 1879552, 'steps': 3670, 'batch_loss/train': 0.8887386941350996} 12/22/2021 02:28:05 - INFO - codeparrot_training - Step 3671: {'lr': 0.0004939507888518279, 'samples': 1880064, 'steps': 3671, 'batch_loss/train': 0.8994757956825197} 12/22/2021 02:28:16 - INFO - codeparrot_training - Step 3672: {'lr': 0.000493947341743925, 'samples': 1880576, 'steps': 3672, 'batch_loss/train': 0.8503441517241299} 12/22/2021 02:28:26 - INFO - codeparrot_training - Step 3673: {'lr': 0.0004939438936661796, 'samples': 1881088, 'steps': 3673, 'batch_loss/train': 0.9394881743937731} 12/22/2021 02:28:38 - INFO - codeparrot_training - Step 3674: {'lr': 0.0004939404446186055, 'samples': 1881600, 'steps': 3674, 'batch_loss/train': 0.8352928550448269} 12/22/2021 02:28:49 - INFO - codeparrot_training - Step 3675: {'lr': 0.0004939369946012162, 'samples': 1882112, 'steps': 3675, 'batch_loss/train': 0.9431041590869427} 12/22/2021 02:28:59 - INFO - codeparrot_training - Step 3676: {'lr': 0.0004939335436140255, 'samples': 1882624, 'steps': 3676, 'batch_loss/train': 1.0028809700161219} 12/22/2021 02:29:12 - INFO - codeparrot_training - Step 3677: {'lr': 0.0004939300916570472, 'samples': 1883136, 'steps': 3677, 'batch_loss/train': 0.8926482358947396} 12/22/2021 
02:29:23 - INFO - codeparrot_training - Step 3678: {'lr': 0.000493926638730295, 'samples': 1883648, 'steps': 3678, 'batch_loss/train': 0.9580854177474976} 12/22/2021 02:29:33 - INFO - codeparrot_training - Step 3679: {'lr': 0.0004939231848337826, 'samples': 1884160, 'steps': 3679, 'batch_loss/train': 0.8879381760489196} 12/22/2021 02:29:46 - INFO - codeparrot_training - Step 3680: {'lr': 0.0004939197299675237, 'samples': 1884672, 'steps': 3680, 'batch_loss/train': 0.7806490198709071} 12/22/2021 02:29:56 - INFO - codeparrot_training - Step 3681: {'lr': 0.000493916274131532, 'samples': 1885184, 'steps': 3681, 'batch_loss/train': 0.8014894425868988} 12/22/2021 02:30:07 - INFO - codeparrot_training - Step 3682: {'lr': 0.0004939128173258213, 'samples': 1885696, 'steps': 3682, 'batch_loss/train': 0.7911846657516435} 12/22/2021 02:30:17 - INFO - codeparrot_training - Step 3683: {'lr': 0.0004939093595504052, 'samples': 1886208, 'steps': 3683, 'batch_loss/train': 0.7792215831577778} 12/22/2021 02:30:30 - INFO - codeparrot_training - Step 3684: {'lr': 0.0004939059008052979, 'samples': 1886720, 'steps': 3684, 'batch_loss/train': 0.9123597536236048} 12/22/2021 02:30:41 - INFO - codeparrot_training - Step 3685: {'lr': 0.0004939024410905126, 'samples': 1887232, 'steps': 3685, 'batch_loss/train': 0.9013401214033365} 12/22/2021 02:30:51 - INFO - codeparrot_training - Step 3686: {'lr': 0.0004938989804060633, 'samples': 1887744, 'steps': 3686, 'batch_loss/train': 0.9371725060045719} 12/22/2021 02:31:03 - INFO - codeparrot_training - Step 3687: {'lr': 0.0004938955187519639, 'samples': 1888256, 'steps': 3687, 'batch_loss/train': 0.7906889840960503} 12/22/2021 02:31:14 - INFO - codeparrot_training - Step 3688: {'lr': 0.0004938920561282278, 'samples': 1888768, 'steps': 3688, 'batch_loss/train': 0.9133713995106518} 12/22/2021 02:31:25 - INFO - codeparrot_training - Step 3689: {'lr': 0.0004938885925348691, 'samples': 1889280, 'steps': 3689, 'batch_loss/train': 0.8408324522897601} 
12/22/2021 02:31:37 - INFO - codeparrot_training - Step 3690: {'lr': 0.0004938851279719014, 'samples': 1889792, 'steps': 3690, 'batch_loss/train': 0.902257933979854} 12/22/2021 02:31:47 - INFO - codeparrot_training - Step 3691: {'lr': 0.0004938816624393386, 'samples': 1890304, 'steps': 3691, 'batch_loss/train': 0.7715837322175503} 12/22/2021 02:31:58 - INFO - codeparrot_training - Step 3692: {'lr': 0.0004938781959371943, 'samples': 1890816, 'steps': 3692, 'batch_loss/train': 0.8506901357322931} 12/22/2021 02:32:08 - INFO - codeparrot_training - Step 3693: {'lr': 0.0004938747284654824, 'samples': 1891328, 'steps': 3693, 'batch_loss/train': 0.875853581354022} 12/22/2021 02:32:21 - INFO - codeparrot_training - Step 3694: {'lr': 0.0004938712600242167, 'samples': 1891840, 'steps': 3694, 'batch_loss/train': 0.8136497223749757} 12/22/2021 02:32:32 - INFO - codeparrot_training - Step 3695: {'lr': 0.0004938677906134109, 'samples': 1892352, 'steps': 3695, 'batch_loss/train': 0.922045623883605} 12/22/2021 02:32:43 - INFO - codeparrot_training - Step 3696: {'lr': 0.0004938643202330789, 'samples': 1892864, 'steps': 3696, 'batch_loss/train': 1.1267789769917727} 12/22/2021 02:32:55 - INFO - codeparrot_training - Step 3697: {'lr': 0.0004938608488832344, 'samples': 1893376, 'steps': 3697, 'batch_loss/train': 0.944523161277175} 12/22/2021 02:33:06 - INFO - codeparrot_training - Step 3698: {'lr': 0.0004938573765638913, 'samples': 1893888, 'steps': 3698, 'batch_loss/train': 0.9346029055304825} 12/22/2021 02:33:16 - INFO - codeparrot_training - Step 3699: {'lr': 0.0004938539032750633, 'samples': 1894400, 'steps': 3699, 'batch_loss/train': 0.7726988680660725} 12/22/2021 02:33:27 - INFO - codeparrot_training - Step 3700: {'lr': 0.0004938504290167643, 'samples': 1894912, 'steps': 3700, 'batch_loss/train': 0.8603280680254102} 12/22/2021 02:33:39 - INFO - codeparrot_training - Step 3701: {'lr': 0.000493846953789008, 'samples': 1895424, 'steps': 3701, 'batch_loss/train': 0.8613020270131528} 
12/22/2021 02:33:49 - INFO - codeparrot_training - Step 3702: {'lr': 0.0004938434775918082, 'samples': 1895936, 'steps': 3702, 'batch_loss/train': 0.9788078777492046} 12/22/2021 02:34:00 - INFO - codeparrot_training - Step 3703: {'lr': 0.000493840000425179, 'samples': 1896448, 'steps': 3703, 'batch_loss/train': 0.9022299209609628} 12/22/2021 02:34:12 - INFO - codeparrot_training - Step 3704: {'lr': 0.0004938365222891339, 'samples': 1896960, 'steps': 3704, 'batch_loss/train': 0.7969147749245167} 12/22/2021 02:34:23 - INFO - codeparrot_training - Step 3705: {'lr': 0.0004938330431836869, 'samples': 1897472, 'steps': 3705, 'batch_loss/train': 0.925512027926743} 12/22/2021 02:34:34 - INFO - codeparrot_training - Step 3706: {'lr': 0.0004938295631088519, 'samples': 1897984, 'steps': 3706, 'batch_loss/train': 0.8370374981313944} 12/22/2021 02:34:47 - INFO - codeparrot_training - Step 3707: {'lr': 0.0004938260820646426, 'samples': 1898496, 'steps': 3707, 'batch_loss/train': 1.066465531475842} 12/22/2021 02:34:57 - INFO - codeparrot_training - Step 3708: {'lr': 0.0004938226000510726, 'samples': 1899008, 'steps': 3708, 'batch_loss/train': 0.867115480825305} 12/22/2021 02:35:08 - INFO - codeparrot_training - Step 3709: {'lr': 0.0004938191170681562, 'samples': 1899520, 'steps': 3709, 'batch_loss/train': 0.9091820036992431} 12/22/2021 02:35:19 - INFO - codeparrot_training - Step 3710: {'lr': 0.0004938156331159071, 'samples': 1900032, 'steps': 3710, 'batch_loss/train': 0.9045236324891448} 12/22/2021 02:35:31 - INFO - codeparrot_training - Step 3711: {'lr': 0.000493812148194339, 'samples': 1900544, 'steps': 3711, 'batch_loss/train': 0.9062775727361441} 12/22/2021 02:35:41 - INFO - codeparrot_training - Step 3712: {'lr': 0.000493808662303466, 'samples': 1901056, 'steps': 3712, 'batch_loss/train': 0.9554779622703791} 12/22/2021 02:35:52 - INFO - codeparrot_training - Step 3713: {'lr': 0.0004938051754433016, 'samples': 1901568, 'steps': 3713, 'batch_loss/train': 0.8990859221667051} 
12/22/2021 02:36:05 - INFO - codeparrot_training - Step 3714: {'lr': 0.00049380168761386, 'samples': 1902080, 'steps': 3714, 'batch_loss/train': 0.7776436768472195} 12/22/2021 02:36:15 - INFO - codeparrot_training - Step 3715: {'lr': 0.000493798198815155, 'samples': 1902592, 'steps': 3715, 'batch_loss/train': 0.8665605848655105} 12/22/2021 02:36:26 - INFO - codeparrot_training - Step 3716: {'lr': 0.0004937947090472004, 'samples': 1903104, 'steps': 3716, 'batch_loss/train': 0.7853714087978005} 12/22/2021 02:36:38 - INFO - codeparrot_training - Step 3717: {'lr': 0.0004937912183100101, 'samples': 1903616, 'steps': 3717, 'batch_loss/train': 0.7539542373269796} 12/22/2021 02:36:49 - INFO - codeparrot_training - Step 3718: {'lr': 0.0004937877266035978, 'samples': 1904128, 'steps': 3718, 'batch_loss/train': 0.8744692080654204} 12/22/2021 02:36:59 - INFO - codeparrot_training - Step 3719: {'lr': 0.0004937842339279777, 'samples': 1904640, 'steps': 3719, 'batch_loss/train': 0.8569204686209559} 12/22/2021 02:37:11 - INFO - codeparrot_training - Step 3720: {'lr': 0.0004937807402831635, 'samples': 1905152, 'steps': 3720, 'batch_loss/train': 0.7682564994320273} 12/22/2021 02:37:22 - INFO - codeparrot_training - Step 3721: {'lr': 0.0004937772456691692, 'samples': 1905664, 'steps': 3721, 'batch_loss/train': 1.0924650533124804} 12/22/2021 02:37:33 - INFO - codeparrot_training - Step 3722: {'lr': 0.0004937737500860085, 'samples': 1906176, 'steps': 3722, 'batch_loss/train': 0.9254699042066932} 12/22/2021 02:37:43 - INFO - codeparrot_training - Step 3723: {'lr': 0.0004937702535336955, 'samples': 1906688, 'steps': 3723, 'batch_loss/train': 0.9108530124649405} 12/22/2021 02:37:57 - INFO - codeparrot_training - Step 3724: {'lr': 0.000493766756012244, 'samples': 1907200, 'steps': 3724, 'batch_loss/train': 0.9177951514720917} 12/22/2021 02:38:07 - INFO - codeparrot_training - Step 3725: {'lr': 0.0004937632575216679, 'samples': 1907712, 'steps': 3725, 'batch_loss/train': 1.58736028149724} 
12/22/2021 02:38:18 - INFO - codeparrot_training - Step 3726: {'lr': 0.0004937597580619811, 'samples': 1908224, 'steps': 3726, 'batch_loss/train': 0.6940076700411737} 12/22/2021 02:38:30 - INFO - codeparrot_training - Step 3727: {'lr': 0.0004937562576331976, 'samples': 1908736, 'steps': 3727, 'batch_loss/train': 0.93151596467942} 12/22/2021 02:38:41 - INFO - codeparrot_training - Step 3728: {'lr': 0.0004937527562353313, 'samples': 1909248, 'steps': 3728, 'batch_loss/train': 0.8667444619350135} 12/22/2021 02:38:51 - INFO - codeparrot_training - Step 3729: {'lr': 0.0004937492538683958, 'samples': 1909760, 'steps': 3729, 'batch_loss/train': 0.8251191028393805} 12/22/2021 02:39:02 - INFO - codeparrot_training - Step 3730: {'lr': 0.0004937457505324056, 'samples': 1910272, 'steps': 3730, 'batch_loss/train': 2.5965562192723155} 12/22/2021 02:39:14 - INFO - codeparrot_training - Step 3731: {'lr': 0.0004937422462273742, 'samples': 1910784, 'steps': 3731, 'batch_loss/train': 0.8418670655228198} 12/22/2021 02:39:25 - INFO - codeparrot_training - Step 3732: {'lr': 0.0004937387409533157, 'samples': 1911296, 'steps': 3732, 'batch_loss/train': 0.8680355180986226} 12/22/2021 02:39:35 - INFO - codeparrot_training - Step 3733: {'lr': 0.0004937352347102441, 'samples': 1911808, 'steps': 3733, 'batch_loss/train': 0.7991327196359634} 12/22/2021 02:39:48 - INFO - codeparrot_training - Step 3734: {'lr': 0.0004937317274981731, 'samples': 1912320, 'steps': 3734, 'batch_loss/train': 0.9300766452215612} 12/22/2021 02:39:58 - INFO - codeparrot_training - Step 3735: {'lr': 0.0004937282193171169, 'samples': 1912832, 'steps': 3735, 'batch_loss/train': 0.875653357245028} 12/22/2021 02:40:09 - INFO - codeparrot_training - Step 3736: {'lr': 0.0004937247101670892, 'samples': 1913344, 'steps': 3736, 'batch_loss/train': 1.0622287299484015} 12/22/2021 02:40:21 - INFO - codeparrot_training - Step 3737: {'lr': 0.0004937212000481041, 'samples': 1913856, 'steps': 3737, 'batch_loss/train': 
0.8742702966555953} 12/22/2021 02:40:32 - INFO - codeparrot_training - Step 3738: {'lr': 0.0004937176889601755, 'samples': 1914368, 'steps': 3738, 'batch_loss/train': 0.86161985155195} 12/22/2021 02:40:42 - INFO - codeparrot_training - Step 3739: {'lr': 0.0004937141769033175, 'samples': 1914880, 'steps': 3739, 'batch_loss/train': 0.9091787878423929} 12/22/2021 02:40:55 - INFO - codeparrot_training - Step 3740: {'lr': 0.0004937106638775439, 'samples': 1915392, 'steps': 3740, 'batch_loss/train': 0.8842210378497839} 12/22/2021 02:41:06 - INFO - codeparrot_training - Step 3741: {'lr': 0.0004937071498828688, 'samples': 1915904, 'steps': 3741, 'batch_loss/train': 0.8322340531740338} 12/22/2021 02:41:16 - INFO - codeparrot_training - Step 3742: {'lr': 0.000493703634919306, 'samples': 1916416, 'steps': 3742, 'batch_loss/train': 0.8923069657757878} 12/22/2021 02:41:27 - INFO - codeparrot_training - Step 3743: {'lr': 0.0004937001189868696, 'samples': 1916928, 'steps': 3743, 'batch_loss/train': 0.9037461522966623} 12/22/2021 02:41:39 - INFO - codeparrot_training - Step 3744: {'lr': 0.0004936966020855735, 'samples': 1917440, 'steps': 3744, 'batch_loss/train': 0.8962100022472441} 12/22/2021 02:41:49 - INFO - codeparrot_training - Step 3745: {'lr': 0.0004936930842154318, 'samples': 1917952, 'steps': 3745, 'batch_loss/train': 0.8037969917058945} 12/22/2021 02:42:00 - INFO - codeparrot_training - Step 3746: {'lr': 0.0004936895653764584, 'samples': 1918464, 'steps': 3746, 'batch_loss/train': 0.8700243709608912} 12/22/2021 02:42:12 - INFO - codeparrot_training - Step 3747: {'lr': 0.0004936860455686672, 'samples': 1918976, 'steps': 3747, 'batch_loss/train': 0.8389458516612649} 12/22/2021 02:42:23 - INFO - codeparrot_training - Step 3748: {'lr': 0.0004936825247920724, 'samples': 1919488, 'steps': 3748, 'batch_loss/train': 0.8706636801362038} 12/22/2021 02:42:34 - INFO - codeparrot_training - Step 3749: {'lr': 0.0004936790030466879, 'samples': 1920000, 'steps': 3749, 
'batch_loss/train': 1.7911450904794037} 12/22/2021 02:42:46 - INFO - codeparrot_training - Step 3750: {'lr': 0.0004936754803325276, 'samples': 1920512, 'steps': 3750, 'batch_loss/train': 0.8968791607767344} 12/22/2021 02:42:56 - INFO - codeparrot_training - Step 3751: {'lr': 0.0004936719566496058, 'samples': 1921024, 'steps': 3751, 'batch_loss/train': 0.9759563449770212} 12/22/2021 02:43:07 - INFO - codeparrot_training - Step 3752: {'lr': 0.0004936684319979361, 'samples': 1921536, 'steps': 3752, 'batch_loss/train': 0.8528996258974075} 12/22/2021 02:43:18 - INFO - codeparrot_training - Step 3753: {'lr': 0.0004936649063775327, 'samples': 1922048, 'steps': 3753, 'batch_loss/train': 0.8533864188939333} 12/22/2021 02:43:30 - INFO - codeparrot_training - Step 3754: {'lr': 0.0004936613797884098, 'samples': 1922560, 'steps': 3754, 'batch_loss/train': 0.9278583582490683} 12/22/2021 02:43:41 - INFO - codeparrot_training - Step 3755: {'lr': 0.0004936578522305811, 'samples': 1923072, 'steps': 3755, 'batch_loss/train': 0.9637786388630047} 12/22/2021 02:43:52 - INFO - codeparrot_training - Step 3756: {'lr': 0.0004936543237040609, 'samples': 1923584, 'steps': 3756, 'batch_loss/train': 0.9153960263356566} 12/22/2021 02:44:04 - INFO - codeparrot_training - Step 3757: {'lr': 0.0004936507942088631, 'samples': 1924096, 'steps': 3757, 'batch_loss/train': 0.7233433681540191} 12/22/2021 02:44:15 - INFO - codeparrot_training - Step 3758: {'lr': 0.0004936472637450017, 'samples': 1924608, 'steps': 3758, 'batch_loss/train': 0.8644535755738616} 12/22/2021 02:44:25 - INFO - codeparrot_training - Step 3759: {'lr': 0.0004936437323124908, 'samples': 1925120, 'steps': 3759, 'batch_loss/train': 0.8718647081404924} 12/22/2021 02:44:38 - INFO - codeparrot_training - Step 3760: {'lr': 0.0004936401999113444, 'samples': 1925632, 'steps': 3760, 'batch_loss/train': 0.8403368499130011} 12/22/2021 02:44:48 - INFO - codeparrot_training - Step 3761: {'lr': 0.0004936366665415766, 'samples': 1926144, 'steps': 
3761, 'batch_loss/train': 1.8116241469979286} 12/22/2021 02:44:59 - INFO - codeparrot_training - Step 3762: {'lr': 0.0004936331322032012, 'samples': 1926656, 'steps': 3762, 'batch_loss/train': 0.9391850866377354} 12/22/2021 02:45:10 - INFO - codeparrot_training - Step 3763: {'lr': 0.0004936295968962327, 'samples': 1927168, 'steps': 3763, 'batch_loss/train': 1.042867235839367} 12/22/2021 02:45:22 - INFO - codeparrot_training - Step 3764: {'lr': 0.0004936260606206848, 'samples': 1927680, 'steps': 3764, 'batch_loss/train': 0.9154640473425388} 12/22/2021 02:45:33 - INFO - codeparrot_training - Step 3765: {'lr': 0.0004936225233765718, 'samples': 1928192, 'steps': 3765, 'batch_loss/train': 0.948992533609271} 12/22/2021 02:45:44 - INFO - codeparrot_training - Step 3766: {'lr': 0.0004936189851639076, 'samples': 1928704, 'steps': 3766, 'batch_loss/train': 0.9629520233720541} 12/22/2021 02:45:56 - INFO - codeparrot_training - Step 3767: {'lr': 0.0004936154459827062, 'samples': 1929216, 'steps': 3767, 'batch_loss/train': 0.9275340298190713} 12/22/2021 02:46:06 - INFO - codeparrot_training - Step 3768: {'lr': 0.0004936119058329818, 'samples': 1929728, 'steps': 3768, 'batch_loss/train': 0.8784381435252726} 12/22/2021 02:46:17 - INFO - codeparrot_training - Step 3769: {'lr': 0.0004936083647147485, 'samples': 1930240, 'steps': 3769, 'batch_loss/train': 1.014723513275385} 12/22/2021 02:46:30 - INFO - codeparrot_training - Step 3770: {'lr': 0.0004936048226280203, 'samples': 1930752, 'steps': 3770, 'batch_loss/train': 0.8563129166141152} 12/22/2021 02:46:40 - INFO - codeparrot_training - Step 3771: {'lr': 0.0004936012795728113, 'samples': 1931264, 'steps': 3771, 'batch_loss/train': 0.8295387299731374} 12/22/2021 02:46:51 - INFO - codeparrot_training - Step 3772: {'lr': 0.0004935977355491355, 'samples': 1931776, 'steps': 3772, 'batch_loss/train': 0.8888448006473482} 12/22/2021 02:47:02 - INFO - codeparrot_training - Step 3773: {'lr': 0.0004935941905570073, 'samples': 1932288, 
'steps': 3773, 'batch_loss/train': 0.8409133991226554} 12/22/2021 02:47:14 - INFO - codeparrot_training - Step 3774: {'lr': 0.0004935906445964405, 'samples': 1932800, 'steps': 3774, 'batch_loss/train': 0.84512999933213} 12/22/2021 02:47:24 - INFO - codeparrot_training - Step 3775: {'lr': 0.0004935870976674492, 'samples': 1933312, 'steps': 3775, 'batch_loss/train': 0.8995063658803701} 12/22/2021 02:47:35 - INFO - codeparrot_training - Step 3776: {'lr': 0.0004935835497700476, 'samples': 1933824, 'steps': 3776, 'batch_loss/train': 0.923853381536901} 12/22/2021 02:47:47 - INFO - codeparrot_training - Step 3777: {'lr': 0.0004935800009042497, 'samples': 1934336, 'steps': 3777, 'batch_loss/train': 0.9007323170080781} 12/22/2021 02:47:58 - INFO - codeparrot_training - Step 3778: {'lr': 0.0004935764510700698, 'samples': 1934848, 'steps': 3778, 'batch_loss/train': 0.8202476426959038} 12/22/2021 02:48:09 - INFO - codeparrot_training - Step 3779: {'lr': 0.0004935729002675218, 'samples': 1935360, 'steps': 3779, 'batch_loss/train': 1.1605665972456336} 12/22/2021 02:48:22 - INFO - codeparrot_training - Step 3780: {'lr': 0.00049356934849662, 'samples': 1935872, 'steps': 3780, 'batch_loss/train': 0.7942460384219885} 12/22/2021 02:48:33 - INFO - codeparrot_training - Step 3781: {'lr': 0.0004935657957573785, 'samples': 1936384, 'steps': 3781, 'batch_loss/train': 1.1023618336766958} 12/22/2021 02:48:43 - INFO - codeparrot_training - Step 3782: {'lr': 0.0004935622420498113, 'samples': 1936896, 'steps': 3782, 'batch_loss/train': 1.6150321764871478} 12/22/2021 02:48:54 - INFO - codeparrot_training - Step 3783: {'lr': 0.0004935586873739326, 'samples': 1937408, 'steps': 3783, 'batch_loss/train': 1.1446838593110442} 12/22/2021 02:49:06 - INFO - codeparrot_training - Step 3784: {'lr': 0.0004935551317297566, 'samples': 1937920, 'steps': 3784, 'batch_loss/train': 0.9644232727587223} 12/22/2021 02:49:17 - INFO - codeparrot_training - Step 3785: {'lr': 0.0004935515751172972, 'samples': 1938432, 
'steps': 3785, 'batch_loss/train': 0.8307030852884054} 12/22/2021 02:49:27 - INFO - codeparrot_training - Step 3786: {'lr': 0.0004935480175365688, 'samples': 1938944, 'steps': 3786, 'batch_loss/train': 0.93092046584934} 12/22/2021 02:49:38 - INFO - codeparrot_training - Step 3787: {'lr': 0.0004935444589875853, 'samples': 1939456, 'steps': 3787, 'batch_loss/train': 0.9085046174004674} 12/22/2021 02:49:50 - INFO - codeparrot_training - Step 3788: {'lr': 0.0004935408994703611, 'samples': 1939968, 'steps': 3788, 'batch_loss/train': 0.9460109192878008} 12/22/2021 02:50:00 - INFO - codeparrot_training - Step 3789: {'lr': 0.0004935373389849103, 'samples': 1940480, 'steps': 3789, 'batch_loss/train': 0.793442026595585} 12/22/2021 02:50:11 - INFO - codeparrot_training - Step 3790: {'lr': 0.0004935337775312468, 'samples': 1940992, 'steps': 3790, 'batch_loss/train': 0.8823752286843956} 12/22/2021 02:50:23 - INFO - codeparrot_training - Step 3791: {'lr': 0.0004935302151093852, 'samples': 1941504, 'steps': 3791, 'batch_loss/train': 0.8942114543169737} 12/22/2021 02:50:34 - INFO - codeparrot_training - Step 3792: {'lr': 0.0004935266517193393, 'samples': 1942016, 'steps': 3792, 'batch_loss/train': 0.9843899719417095} 12/22/2021 02:50:44 - INFO - codeparrot_training - Step 3793: {'lr': 0.0004935230873611234, 'samples': 1942528, 'steps': 3793, 'batch_loss/train': 0.8916125278919935} 12/22/2021 02:50:57 - INFO - codeparrot_training - Step 3794: {'lr': 0.0004935195220347517, 'samples': 1943040, 'steps': 3794, 'batch_loss/train': 0.9275600267574191} 12/22/2021 02:51:08 - INFO - codeparrot_training - Step 3795: {'lr': 0.0004935159557402382, 'samples': 1943552, 'steps': 3795, 'batch_loss/train': 0.9085673121735454} 12/22/2021 02:51:18 - INFO - codeparrot_training - Step 3796: {'lr': 0.0004935123884775973, 'samples': 1944064, 'steps': 3796, 'batch_loss/train': 0.7847347231581807} 12/22/2021 02:51:30 - INFO - codeparrot_training - Step 3797: {'lr': 0.000493508820246843, 'samples': 1944576, 
'steps': 3797, 'batch_loss/train': 0.8392064590007067} 12/22/2021 02:51:41 - INFO - codeparrot_training - Step 3798: {'lr': 0.0004935052510479897, 'samples': 1945088, 'steps': 3798, 'batch_loss/train': 0.9343324014917016} 12/22/2021 02:51:52 - INFO - codeparrot_training - Step 3799: {'lr': 0.0004935016808810514, 'samples': 1945600, 'steps': 3799, 'batch_loss/train': 0.8433773340657353} 12/22/2021 02:52:02 - INFO - codeparrot_training - Step 3800: {'lr': 0.0004934981097460424, 'samples': 1946112, 'steps': 3800, 'batch_loss/train': 0.8842285927385092} 12/22/2021 02:52:15 - INFO - codeparrot_training - Step 3801: {'lr': 0.0004934945376429768, 'samples': 1946624, 'steps': 3801, 'batch_loss/train': 0.8370447773486376} 12/22/2021 02:52:26 - INFO - codeparrot_training - Step 3802: {'lr': 0.000493490964571869, 'samples': 1947136, 'steps': 3802, 'batch_loss/train': 0.8043763479217887} 12/22/2021 02:52:36 - INFO - codeparrot_training - Step 3803: {'lr': 0.0004934873905327329, 'samples': 1947648, 'steps': 3803, 'batch_loss/train': 0.8584928940981627} 12/22/2021 02:52:48 - INFO - codeparrot_training - Step 3804: {'lr': 0.000493483815525583, 'samples': 1948160, 'steps': 3804, 'batch_loss/train': 0.858178704045713} 12/22/2021 02:52:59 - INFO - codeparrot_training - Step 3805: {'lr': 0.0004934802395504332, 'samples': 1948672, 'steps': 3805, 'batch_loss/train': 0.8357215044088662} 12/22/2021 02:53:10 - INFO - codeparrot_training - Step 3806: {'lr': 0.0004934766626072981, 'samples': 1949184, 'steps': 3806, 'batch_loss/train': 0.8188387360423803} 12/22/2021 02:53:22 - INFO - codeparrot_training - Step 3807: {'lr': 0.0004934730846961916, 'samples': 1949696, 'steps': 3807, 'batch_loss/train': 0.9045049669221044} 12/22/2021 02:53:32 - INFO - codeparrot_training - Step 3808: {'lr': 0.0004934695058171282, 'samples': 1950208, 'steps': 3808, 'batch_loss/train': 0.9567530667409301} 12/22/2021 02:53:43 - INFO - codeparrot_training - Step 3809: {'lr': 0.0004934659259701218, 'samples': 
1950720, 'steps': 3809, 'batch_loss/train': 0.9193551195785403} 12/22/2021 02:53:54 - INFO - codeparrot_training - Step 3810: {'lr': 0.0004934623451551868, 'samples': 1951232, 'steps': 3810, 'batch_loss/train': 0.8828546367585659} 12/22/2021 02:54:06 - INFO - codeparrot_training - Step 3811: {'lr': 0.0004934587633723376, 'samples': 1951744, 'steps': 3811, 'batch_loss/train': 0.8745678209234029} 12/22/2021 02:54:17 - INFO - codeparrot_training - Step 3812: {'lr': 0.0004934551806215881, 'samples': 1952256, 'steps': 3812, 'batch_loss/train': 0.8780061267316341} 12/22/2021 02:54:28 - INFO - codeparrot_training - Step 3813: {'lr': 0.0004934515969029528, 'samples': 1952768, 'steps': 3813, 'batch_loss/train': 0.8838810017332435} 12/22/2021 02:54:40 - INFO - codeparrot_training - Step 3814: {'lr': 0.000493448012216446, 'samples': 1953280, 'steps': 3814, 'batch_loss/train': 0.9207546710968018} 12/22/2021 02:54:50 - INFO - codeparrot_training - Step 3815: {'lr': 0.0004934444265620817, 'samples': 1953792, 'steps': 3815, 'batch_loss/train': 0.9235510937869549} 12/22/2021 02:55:01 - INFO - codeparrot_training - Step 3816: {'lr': 0.0004934408399398743, 'samples': 1954304, 'steps': 3816, 'batch_loss/train': 0.835369573906064} 12/22/2021 02:55:14 - INFO - codeparrot_training - Step 3817: {'lr': 0.0004934372523498381, 'samples': 1954816, 'steps': 3817, 'batch_loss/train': 0.8079075608402491} 12/22/2021 02:55:24 - INFO - codeparrot_training - Step 3818: {'lr': 0.0004934336637919871, 'samples': 1955328, 'steps': 3818, 'batch_loss/train': 0.9456892469897866} 12/22/2021 02:55:35 - INFO - codeparrot_training - Step 3819: {'lr': 0.000493430074266336, 'samples': 1955840, 'steps': 3819, 'batch_loss/train': 0.87785336188972} 12/22/2021 02:55:45 - INFO - codeparrot_training - Step 3820: {'lr': 0.0004934264837728987, 'samples': 1956352, 'steps': 3820, 'batch_loss/train': 0.8012208454310894} 12/22/2021 02:55:58 - INFO - codeparrot_training - Step 3821: {'lr': 0.0004934228923116897, 'samples': 
1956864, 'steps': 3821, 'batch_loss/train': 0.739445025101304} 12/22/2021 02:56:08 - INFO - codeparrot_training - Step 3822: {'lr': 0.0004934192998827232, 'samples': 1957376, 'steps': 3822, 'batch_loss/train': 0.8686580862849951} 12/22/2021 02:56:19 - INFO - codeparrot_training - Step 3823: {'lr': 0.0004934157064860134, 'samples': 1957888, 'steps': 3823, 'batch_loss/train': 0.9037773348391056} 12/22/2021 02:56:31 - INFO - codeparrot_training - Step 3824: {'lr': 0.0004934121121215748, 'samples': 1958400, 'steps': 3824, 'batch_loss/train': 0.8799220873042941} 12/22/2021 02:56:42 - INFO - codeparrot_training - Step 3825: {'lr': 0.0004934085167894213, 'samples': 1958912, 'steps': 3825, 'batch_loss/train': 0.8213279251940548} 12/22/2021 02:56:52 - INFO - codeparrot_training - Step 3826: {'lr': 0.0004934049204895676, 'samples': 1959424, 'steps': 3826, 'batch_loss/train': 1.6960221035405993} 12/22/2021 02:57:04 - INFO - codeparrot_training - Step 3827: {'lr': 0.0004934013232220279, 'samples': 1959936, 'steps': 3827, 'batch_loss/train': 0.7514405031688511} 12/22/2021 02:57:15 - INFO - codeparrot_training - Step 3828: {'lr': 0.0004933977249868164, 'samples': 1960448, 'steps': 3828, 'batch_loss/train': 0.89906468661502} 12/22/2021 02:57:26 - INFO - codeparrot_training - Step 3829: {'lr': 0.0004933941257839475, 'samples': 1960960, 'steps': 3829, 'batch_loss/train': 0.92047721426934} 12/22/2021 02:57:36 - INFO - codeparrot_training - Step 3830: {'lr': 0.0004933905256134354, 'samples': 1961472, 'steps': 3830, 'batch_loss/train': 0.8562086718156934} 12/22/2021 02:57:49 - INFO - codeparrot_training - Step 3831: {'lr': 0.0004933869244752943, 'samples': 1961984, 'steps': 3831, 'batch_loss/train': 0.7722583990544081} 12/22/2021 02:58:00 - INFO - codeparrot_training - Step 3832: {'lr': 0.000493383322369539, 'samples': 1962496, 'steps': 3832, 'batch_loss/train': 0.8859816547483206} 12/22/2021 02:58:10 - INFO - codeparrot_training - Step 3833: {'lr': 0.0004933797192961833, 'samples': 
1963008, 'steps': 3833, 'batch_loss/train': 0.7720382558181882} 12/22/2021 02:58:22 - INFO - codeparrot_training - Step 3834: {'lr': 0.0004933761152552418, 'samples': 1963520, 'steps': 3834, 'batch_loss/train': 0.9004795998334885} 12/22/2021 02:58:33 - INFO - codeparrot_training - Step 3835: {'lr': 0.0004933725102467287, 'samples': 1964032, 'steps': 3835, 'batch_loss/train': 0.9689303785562515} 12/22/2021 02:58:43 - INFO - codeparrot_training - Step 3836: {'lr': 0.0004933689042706585, 'samples': 1964544, 'steps': 3836, 'batch_loss/train': 0.9114434849470854} 12/22/2021 02:58:55 - INFO - codeparrot_training - Step 3837: {'lr': 0.0004933652973270453, 'samples': 1965056, 'steps': 3837, 'batch_loss/train': 0.8811247888952494} 12/22/2021 02:59:06 - INFO - codeparrot_training - Step 3838: {'lr': 0.0004933616894159037, 'samples': 1965568, 'steps': 3838, 'batch_loss/train': 1.076737537048757} 12/22/2021 02:59:17 - INFO - codeparrot_training - Step 3839: {'lr': 0.0004933580805372478, 'samples': 1966080, 'steps': 3839, 'batch_loss/train': 0.8767387922853231} 12/22/2021 02:59:29 - INFO - codeparrot_training - Step 3840: {'lr': 0.0004933544706910922, 'samples': 1966592, 'steps': 3840, 'batch_loss/train': 0.7454911158420146} 12/22/2021 02:59:40 - INFO - codeparrot_training - Step 3841: {'lr': 0.000493350859877451, 'samples': 1967104, 'steps': 3841, 'batch_loss/train': 0.7605250785127282} 12/22/2021 02:59:51 - INFO - codeparrot_training - Step 3842: {'lr': 0.0004933472480963387, 'samples': 1967616, 'steps': 3842, 'batch_loss/train': 0.8632157635875046} 12/22/2021 03:00:01 - INFO - codeparrot_training - Step 3843: {'lr': 0.0004933436353477697, 'samples': 1968128, 'steps': 3843, 'batch_loss/train': 0.9379818886518478} 12/22/2021 03:00:13 - INFO - codeparrot_training - Step 3844: {'lr': 0.0004933400216317581, 'samples': 1968640, 'steps': 3844, 'batch_loss/train': 0.8442303249612451} 12/22/2021 03:00:24 - INFO - codeparrot_training - Step 3845: {'lr': 0.0004933364069483185, 
'samples': 1969152, 'steps': 3845, 'batch_loss/train': 0.7561356239020824} 12/22/2021 03:00:34 - INFO - codeparrot_training - Step 3846: {'lr': 0.0004933327912974653, 'samples': 1969664, 'steps': 3846, 'batch_loss/train': 0.8705980135127902} 12/22/2021 03:00:47 - INFO - codeparrot_training - Step 3847: {'lr': 0.0004933291746792129, 'samples': 1970176, 'steps': 3847, 'batch_loss/train': 0.8357559628784657} 12/22/2021 03:00:58 - INFO - codeparrot_training - Step 3848: {'lr': 0.0004933255570935755, 'samples': 1970688, 'steps': 3848, 'batch_loss/train': 0.7942328061908484} 12/22/2021 03:01:09 - INFO - codeparrot_training - Step 3849: {'lr': 0.0004933219385405674, 'samples': 1971200, 'steps': 3849, 'batch_loss/train': 1.105294725857675} 12/22/2021 03:01:21 - INFO - codeparrot_training - Step 3850: {'lr': 0.0004933183190202033, 'samples': 1971712, 'steps': 3850, 'batch_loss/train': 0.7367704003117979} 12/22/2021 03:01:31 - INFO - codeparrot_training - Step 3851: {'lr': 0.0004933146985324975, 'samples': 1972224, 'steps': 3851, 'batch_loss/train': 0.8952564429491758} 12/22/2021 03:01:42 - INFO - codeparrot_training - Step 3852: {'lr': 0.0004933110770774642, 'samples': 1972736, 'steps': 3852, 'batch_loss/train': 0.8151126080192626} 12/22/2021 03:01:52 - INFO - codeparrot_training - Step 3853: {'lr': 0.0004933074546551179, 'samples': 1973248, 'steps': 3853, 'batch_loss/train': 0.8732803547754884} 12/22/2021 03:02:05 - INFO - codeparrot_training - Step 3854: {'lr': 0.0004933038312654732, 'samples': 1973760, 'steps': 3854, 'batch_loss/train': 1.0026607951149344} 12/22/2021 03:02:15 - INFO - codeparrot_training - Step 3855: {'lr': 0.0004933002069085441, 'samples': 1974272, 'steps': 3855, 'batch_loss/train': 0.7407977811526507} 12/22/2021 03:02:26 - INFO - codeparrot_training - Step 3856: {'lr': 0.0004932965815843454, 'samples': 1974784, 'steps': 3856, 'batch_loss/train': 0.7021969703491777} 12/22/2021 03:02:39 - INFO - codeparrot_training - Step 3857: {'lr': 
0.0004932929552928914, 'samples': 1975296, 'steps': 3857, 'batch_loss/train': 0.8418228719383478} 12/22/2021 03:02:49 - INFO - codeparrot_training - Step 3858: {'lr': 0.0004932893280341964, 'samples': 1975808, 'steps': 3858, 'batch_loss/train': 0.8094428973272443} 12/22/2021 03:03:00 - INFO - codeparrot_training - Step 3859: {'lr': 0.0004932856998082749, 'samples': 1976320, 'steps': 3859, 'batch_loss/train': 0.8810163140296936} 12/22/2021 03:03:12 - INFO - codeparrot_training - Step 3860: {'lr': 0.0004932820706151414, 'samples': 1976832, 'steps': 3860, 'batch_loss/train': 0.8627619971521199} 12/22/2021 03:03:23 - INFO - codeparrot_training - Step 3861: {'lr': 0.0004932784404548101, 'samples': 1977344, 'steps': 3861, 'batch_loss/train': 0.9168316442519426} 12/22/2021 03:03:33 - INFO - codeparrot_training - Step 3862: {'lr': 0.0004932748093272956, 'samples': 1977856, 'steps': 3862, 'batch_loss/train': 0.7957651475444436} 12/22/2021 03:03:44 - INFO - codeparrot_training - Step 3863: {'lr': 0.0004932711772326125, 'samples': 1978368, 'steps': 3863, 'batch_loss/train': 0.8357260311022401} 12/22/2021 03:03:56 - INFO - codeparrot_training - Step 3864: {'lr': 0.0004932675441707749, 'samples': 1978880, 'steps': 3864, 'batch_loss/train': 0.8389697624370456} 12/22/2021 03:04:07 - INFO - codeparrot_training - Step 3865: {'lr': 0.0004932639101417974, 'samples': 1979392, 'steps': 3865, 'batch_loss/train': 0.8550601061433554} 12/22/2021 03:04:17 - INFO - codeparrot_training - Step 3866: {'lr': 0.0004932602751456945, 'samples': 1979904, 'steps': 3866, 'batch_loss/train': 0.7959549902006984} 12/22/2021 03:04:29 - INFO - codeparrot_training - Step 3867: {'lr': 0.0004932566391824806, 'samples': 1980416, 'steps': 3867, 'batch_loss/train': 0.7526218174025416} 12/22/2021 03:04:40 - INFO - codeparrot_training - Step 3868: {'lr': 0.00049325300225217, 'samples': 1980928, 'steps': 3868, 'batch_loss/train': 0.8627567188814282} 12/22/2021 03:04:51 - INFO - codeparrot_training - Step 3869: 
{'lr': 0.0004932493643547774, 'samples': 1981440, 'steps': 3869, 'batch_loss/train': 0.8175106435082853} 12/22/2021 03:05:03 - INFO - codeparrot_training - Step 3870: {'lr': 0.0004932457254903172, 'samples': 1981952, 'steps': 3870, 'batch_loss/train': 0.8299692878499627} 12/22/2021 03:05:14 - INFO - codeparrot_training - Step 3871: {'lr': 0.0004932420856588039, 'samples': 1982464, 'steps': 3871, 'batch_loss/train': 0.9673319458961487} 12/22/2021 03:05:24 - INFO - codeparrot_training - Step 3872: {'lr': 0.0004932384448602518, 'samples': 1982976, 'steps': 3872, 'batch_loss/train': 0.8853581510484219} 12/22/2021 03:05:36 - INFO - codeparrot_training - Step 3873: {'lr': 0.0004932348030946756, 'samples': 1983488, 'steps': 3873, 'batch_loss/train': 0.8844238193705678} 12/22/2021 03:05:47 - INFO - codeparrot_training - Step 3874: {'lr': 0.0004932311603620896, 'samples': 1984000, 'steps': 3874, 'batch_loss/train': 0.9274575170129538} 12/22/2021 03:05:58 - INFO - codeparrot_training - Step 3875: {'lr': 0.0004932275166625084, 'samples': 1984512, 'steps': 3875, 'batch_loss/train': 0.9914575247094035} 12/22/2021 03:06:08 - INFO - codeparrot_training - Step 3876: {'lr': 0.0004932238719959464, 'samples': 1985024, 'steps': 3876, 'batch_loss/train': 0.8374196030199528} 12/22/2021 03:06:20 - INFO - codeparrot_training - Step 3877: {'lr': 0.000493220226362418, 'samples': 1985536, 'steps': 3877, 'batch_loss/train': 0.810539124533534} 12/22/2021 03:06:31 - INFO - codeparrot_training - Step 3878: {'lr': 0.000493216579761938, 'samples': 1986048, 'steps': 3878, 'batch_loss/train': 0.8527323370799422} 12/22/2021 03:06:42 - INFO - codeparrot_training - Step 3879: {'lr': 0.0004932129321945206, 'samples': 1986560, 'steps': 3879, 'batch_loss/train': 0.8350735884159803} 12/22/2021 03:06:54 - INFO - codeparrot_training - Step 3880: {'lr': 0.0004932092836601805, 'samples': 1987072, 'steps': 3880, 'batch_loss/train': 0.9590949066914618} 12/22/2021 03:07:05 - INFO - codeparrot_training - Step 
3881: {'lr': 0.000493205634158932, 'samples': 1987584, 'steps': 3881, 'batch_loss/train': 0.9650401920080185} 12/22/2021 03:07:16 - INFO - codeparrot_training - Step 3882: {'lr': 0.0004932019836907898, 'samples': 1988096, 'steps': 3882, 'batch_loss/train': 0.8384674405679107} 12/22/2021 03:07:28 - INFO - codeparrot_training - Step 3883: {'lr': 0.0004931983322557684, 'samples': 1988608, 'steps': 3883, 'batch_loss/train': 0.9073997475206852} 12/22/2021 03:07:38 - INFO - codeparrot_training - Step 3884: {'lr': 0.0004931946798538823, 'samples': 1989120, 'steps': 3884, 'batch_loss/train': 1.008957359008491} 12/22/2021 03:07:49 - INFO - codeparrot_training - Step 3885: {'lr': 0.0004931910264851458, 'samples': 1989632, 'steps': 3885, 'batch_loss/train': 0.8374830596148968} 12/22/2021 03:08:00 - INFO - codeparrot_training - Step 3886: {'lr': 0.0004931873721495737, 'samples': 1990144, 'steps': 3886, 'batch_loss/train': 0.9547140579670668} 12/22/2021 03:08:12 - INFO - codeparrot_training - Step 3887: {'lr': 0.0004931837168471805, 'samples': 1990656, 'steps': 3887, 'batch_loss/train': 0.8437651824206114} 12/22/2021 03:08:23 - INFO - codeparrot_training - Step 3888: {'lr': 0.0004931800605779805, 'samples': 1991168, 'steps': 3888, 'batch_loss/train': 0.8827533386647701} 12/22/2021 03:08:34 - INFO - codeparrot_training - Step 3889: {'lr': 0.0004931764033419885, 'samples': 1991680, 'steps': 3889, 'batch_loss/train': 0.8302919287234545} 12/22/2021 03:08:46 - INFO - codeparrot_training - Step 3890: {'lr': 0.0004931727451392189, 'samples': 1992192, 'steps': 3890, 'batch_loss/train': 1.562019070610404} 12/22/2021 03:08:56 - INFO - codeparrot_training - Step 3891: {'lr': 0.0004931690859696863, 'samples': 1992704, 'steps': 3891, 'batch_loss/train': 0.6735600819811225} 12/22/2021 03:09:07 - INFO - codeparrot_training - Step 3892: {'lr': 0.0004931654258334053, 'samples': 1993216, 'steps': 3892, 'batch_loss/train': 0.8193907630629838} 12/22/2021 03:09:19 - INFO - codeparrot_training - 
Step 3893: {'lr': 0.0004931617647303903, 'samples': 1993728, 'steps': 3893, 'batch_loss/train': 0.9119421700015664} 12/22/2021 03:09:30 - INFO - codeparrot_training - Step 3894: {'lr': 0.000493158102660656, 'samples': 1994240, 'steps': 3894, 'batch_loss/train': 0.8828867739066482} 12/22/2021 03:09:40 - INFO - codeparrot_training - Step 3895: {'lr': 0.0004931544396242168, 'samples': 1994752, 'steps': 3895, 'batch_loss/train': 0.9571973392739892} 12/22/2021 03:09:53 - INFO - codeparrot_training - Step 3896: {'lr': 0.0004931507756210875, 'samples': 1995264, 'steps': 3896, 'batch_loss/train': 0.7557554412633181} 12/22/2021 03:10:04 - INFO - codeparrot_training - Step 3897: {'lr': 0.0004931471106512824, 'samples': 1995776, 'steps': 3897, 'batch_loss/train': 0.966940370388329} 12/22/2021 03:10:14 - INFO - codeparrot_training - Step 3898: {'lr': 0.0004931434447148162, 'samples': 1996288, 'steps': 3898, 'batch_loss/train': 0.7110615766141564} 12/22/2021 03:10:25 - INFO - codeparrot_training - Step 3899: {'lr': 0.0004931397778117036, 'samples': 1996800, 'steps': 3899, 'batch_loss/train': 0.9368090350180864} 12/22/2021 03:10:38 - INFO - codeparrot_training - Step 3900: {'lr': 0.0004931361099419589, 'samples': 1997312, 'steps': 3900, 'batch_loss/train': 0.9381023864261806} 12/22/2021 03:10:48 - INFO - codeparrot_training - Step 3901: {'lr': 0.0004931324411055969, 'samples': 1997824, 'steps': 3901, 'batch_loss/train': 0.8043716205283999} 12/22/2021 03:10:59 - INFO - codeparrot_training - Step 3902: {'lr': 0.0004931287713026321, 'samples': 1998336, 'steps': 3902, 'batch_loss/train': 0.9005447374656796} 12/22/2021 03:11:11 - INFO - codeparrot_training - Step 3903: {'lr': 0.0004931251005330791, 'samples': 1998848, 'steps': 3903, 'batch_loss/train': 0.8559758923947811} 12/22/2021 03:11:22 - INFO - codeparrot_training - Step 3904: {'lr': 0.0004931214287969525, 'samples': 1999360, 'steps': 3904, 'batch_loss/train': 0.8582718628458679} 12/22/2021 03:11:32 - INFO - codeparrot_training 
- Step 3905: {'lr': 0.000493117756094267, 'samples': 1999872, 'steps': 3905, 'batch_loss/train': 0.7367847251007333} 12/22/2021 03:11:44 - INFO - codeparrot_training - Step 3906: {'lr': 0.0004931140824250371, 'samples': 2000384, 'steps': 3906, 'batch_loss/train': 0.9151057740673423} 12/22/2021 03:11:55 - INFO - codeparrot_training - Step 3907: {'lr': 0.0004931104077892772, 'samples': 2000896, 'steps': 3907, 'batch_loss/train': 0.918741153087467} 12/22/2021 03:12:05 - INFO - codeparrot_training - Step 3908: {'lr': 0.0004931067321870023, 'samples': 2001408, 'steps': 3908, 'batch_loss/train': 0.8395575573667884} 12/22/2021 03:12:16 - INFO - codeparrot_training - Step 3909: {'lr': 0.0004931030556182268, 'samples': 2001920, 'steps': 3909, 'batch_loss/train': 0.8448283183388412} 12/22/2021 03:12:29 - INFO - codeparrot_training - Step 3910: {'lr': 0.0004930993780829652, 'samples': 2002432, 'steps': 3910, 'batch_loss/train': 0.8309317911043763} 12/22/2021 03:12:39 - INFO - codeparrot_training - Step 3911: {'lr': 0.0004930956995812324, 'samples': 2002944, 'steps': 3911, 'batch_loss/train': 0.9689611354842782} 12/22/2021 03:12:50 - INFO - codeparrot_training - Step 3912: {'lr': 0.0004930920201130428, 'samples': 2003456, 'steps': 3912, 'batch_loss/train': 0.80058483639732} 12/22/2021 03:13:02 - INFO - codeparrot_training - Step 3913: {'lr': 0.0004930883396784113, 'samples': 2003968, 'steps': 3913, 'batch_loss/train': 0.973320534452796} 12/22/2021 03:13:13 - INFO - codeparrot_training - Step 3914: {'lr': 0.0004930846582773522, 'samples': 2004480, 'steps': 3914, 'batch_loss/train': 0.7821514308452606} 12/22/2021 03:13:23 - INFO - codeparrot_training - Step 3915: {'lr': 0.0004930809759098803, 'samples': 2004992, 'steps': 3915, 'batch_loss/train': 0.8534503588452935} 12/22/2021 03:13:36 - INFO - codeparrot_training - Step 3916: {'lr': 0.0004930772925760102, 'samples': 2005504, 'steps': 3916, 'batch_loss/train': 0.8088550022803247} 12/22/2021 03:13:47 - INFO - codeparrot_training 
- Step 3917: {'lr': 0.0004930736082757566, 'samples': 2006016, 'steps': 3917, 'batch_loss/train': 0.9357772294897586} 12/22/2021 03:13:57 - INFO - codeparrot_training - Step 3918: {'lr': 0.0004930699230091341, 'samples': 2006528, 'steps': 3918, 'batch_loss/train': 0.8147646440193057} 12/22/2021 03:14:09 - INFO - codeparrot_training - Step 3919: {'lr': 0.0004930662367761574, 'samples': 2007040, 'steps': 3919, 'batch_loss/train': 0.8136955020017922} 12/22/2021 03:14:20 - INFO - codeparrot_training - Step 3920: {'lr': 0.000493062549576841, 'samples': 2007552, 'steps': 3920, 'batch_loss/train': 0.8642320623621345} 12/22/2021 03:14:30 - INFO - codeparrot_training - Step 3921: {'lr': 0.0004930588614111999, 'samples': 2008064, 'steps': 3921, 'batch_loss/train': 0.9008946660906076} 12/22/2021 03:14:41 - INFO - codeparrot_training - Step 3922: {'lr': 0.0004930551722792483, 'samples': 2008576, 'steps': 3922, 'batch_loss/train': 0.8513338938355446} 12/22/2021 03:14:53 - INFO - codeparrot_training - Step 3923: {'lr': 0.0004930514821810013, 'samples': 2009088, 'steps': 3923, 'batch_loss/train': 0.763062552548945} 12/22/2021 03:15:04 - INFO - codeparrot_training - Step 3924: {'lr': 0.0004930477911164733, 'samples': 2009600, 'steps': 3924, 'batch_loss/train': 0.822375706396997} 12/22/2021 03:15:14 - INFO - codeparrot_training - Step 3925: {'lr': 0.000493044099085679, 'samples': 2010112, 'steps': 3925, 'batch_loss/train': 0.7947790520265698} 12/22/2021 03:15:27 - INFO - codeparrot_training - Step 3926: {'lr': 0.0004930404060886332, 'samples': 2010624, 'steps': 3926, 'batch_loss/train': 0.850260779261589} 12/22/2021 03:15:38 - INFO - codeparrot_training - Step 3927: {'lr': 0.0004930367121253505, 'samples': 2011136, 'steps': 3927, 'batch_loss/train': 0.882644783705473} 12/22/2021 03:15:48 - INFO - codeparrot_training - Step 3928: {'lr': 0.0004930330171958457, 'samples': 2011648, 'steps': 3928, 'batch_loss/train': 0.8367344061844051} 12/22/2021 03:16:00 - INFO - codeparrot_training - 
Step 3929: {'lr': 0.0004930293213001333, 'samples': 2012160, 'steps': 3929, 'batch_loss/train': 0.7689497340470552} 12/22/2021 03:16:11 - INFO - codeparrot_training - Step 3930: {'lr': 0.000493025624438228, 'samples': 2012672, 'steps': 3930, 'batch_loss/train': 0.8501249547116458} 12/22/2021 03:16:22 - INFO - codeparrot_training - Step 3931: {'lr': 0.0004930219266101447, 'samples': 2013184, 'steps': 3931, 'batch_loss/train': 0.8033738858066499} 12/22/2021 03:16:32 - INFO - codeparrot_training - Step 3932: {'lr': 0.000493018227815898, 'samples': 2013696, 'steps': 3932, 'batch_loss/train': 0.9191227182745934} 12/22/2021 03:16:44 - INFO - codeparrot_training - Step 3933: {'lr': 0.0004930145280555025, 'samples': 2014208, 'steps': 3933, 'batch_loss/train': 0.8806504644453526} 12/22/2021 03:16:55 - INFO - codeparrot_training - Step 3934: {'lr': 0.000493010827328973, 'samples': 2014720, 'steps': 3934, 'batch_loss/train': 0.6737387403845787} 12/22/2021 03:17:05 - INFO - codeparrot_training - Step 3935: {'lr': 0.0004930071256363242, 'samples': 2015232, 'steps': 3935, 'batch_loss/train': 0.8183085494674742} 12/22/2021 03:17:19 - INFO - codeparrot_training - Step 3936: {'lr': 0.0004930034229775709, 'samples': 2015744, 'steps': 3936, 'batch_loss/train': 0.8656334890983999} 12/22/2021 03:17:30 - INFO - codeparrot_training - Step 3937: {'lr': 0.0004929997193527277, 'samples': 2016256, 'steps': 3937, 'batch_loss/train': 0.8002016507089138} 12/22/2021 03:17:40 - INFO - codeparrot_training - Step 3938: {'lr': 0.0004929960147618093, 'samples': 2016768, 'steps': 3938, 'batch_loss/train': 0.9789000083692372} 12/22/2021 03:17:52 - INFO - codeparrot_training - Step 3939: {'lr': 0.0004929923092048306, 'samples': 2017280, 'steps': 3939, 'batch_loss/train': 1.2592190178111196} 12/22/2021 03:18:03 - INFO - codeparrot_training - Step 3940: {'lr': 0.0004929886026818062, 'samples': 2017792, 'steps': 3940, 'batch_loss/train': 0.8581275949254632} 12/22/2021 03:18:14 - INFO - codeparrot_training 
- Step 3941: {'lr': 0.0004929848951927509, 'samples': 2018304, 'steps': 3941, 'batch_loss/train': 0.9207782819867134} 12/22/2021 03:18:26 - INFO - codeparrot_training - Step 3942: {'lr': 0.0004929811867376793, 'samples': 2018816, 'steps': 3942, 'batch_loss/train': 0.7627676008269191} 12/22/2021 03:18:37 - INFO - codeparrot_training - Step 3943: {'lr': 0.0004929774773166064, 'samples': 2019328, 'steps': 3943, 'batch_loss/train': 0.7873056810349226} 12/22/2021 03:18:48 - INFO - codeparrot_training - Step 3944: {'lr': 0.0004929737669295466, 'samples': 2019840, 'steps': 3944, 'batch_loss/train': 0.7888134242966771} 12/22/2021 03:18:58 - INFO - codeparrot_training - Step 3945: {'lr': 0.000492970055576515, 'samples': 2020352, 'steps': 3945, 'batch_loss/train': 0.7963713677600026} 12/22/2021 03:19:10 - INFO - codeparrot_training - Step 3946: {'lr': 0.0004929663432575262, 'samples': 2020864, 'steps': 3946, 'batch_loss/train': 0.9198372159153223} 12/22/2021 03:19:21 - INFO - codeparrot_training - Step 3947: {'lr': 0.0004929626299725949, 'samples': 2021376, 'steps': 3947, 'batch_loss/train': 0.9403694430366158} 12/22/2021 03:19:32 - INFO - codeparrot_training - Step 3948: {'lr': 0.0004929589157217359, 'samples': 2021888, 'steps': 3948, 'batch_loss/train': 0.9057676559314132} 12/22/2021 03:19:44 - INFO - codeparrot_training - Step 3949: {'lr': 0.000492955200504964, 'samples': 2022400, 'steps': 3949, 'batch_loss/train': 0.7494741673581302} 12/22/2021 03:19:55 - INFO - codeparrot_training - Step 3950: {'lr': 0.000492951484322294, 'samples': 2022912, 'steps': 3950, 'batch_loss/train': 0.863868709653616} 12/22/2021 03:20:05 - INFO - codeparrot_training - Step 3951: {'lr': 0.0004929477671737406, 'samples': 2023424, 'steps': 3951, 'batch_loss/train': 0.9050410622730851} 12/22/2021 03:20:18 - INFO - codeparrot_training - Step 3952: {'lr': 0.0004929440490593187, 'samples': 2023936, 'steps': 3952, 'batch_loss/train': 0.7452170795295388} 12/22/2021 03:20:28 - INFO - codeparrot_training 
- Step 3953: {'lr': 0.0004929403299790429, 'samples': 2024448, 'steps': 3953, 'batch_loss/train': 0.8952322481200099} 12/22/2021 03:20:39 - INFO - codeparrot_training - Step 3954: {'lr': 0.0004929366099329281, 'samples': 2024960, 'steps': 3954, 'batch_loss/train': 0.9086220040917397} 12/22/2021 03:20:50 - INFO - codeparrot_training - Step 3955: {'lr': 0.0004929328889209891, 'samples': 2025472, 'steps': 3955, 'batch_loss/train': 0.8106886129826307} 12/22/2021 03:21:02 - INFO - codeparrot_training - Step 3956: {'lr': 0.0004929291669432406, 'samples': 2025984, 'steps': 3956, 'batch_loss/train': 0.8356507141143084} 12/22/2021 03:21:13 - INFO - codeparrot_training - Step 3957: {'lr': 0.0004929254439996974, 'samples': 2026496, 'steps': 3957, 'batch_loss/train': 0.5621949810301885} 12/22/2021 03:21:24 - INFO - codeparrot_training - Step 3958: {'lr': 0.0004929217200903745, 'samples': 2027008, 'steps': 3958, 'batch_loss/train': 0.96055016387254} 12/22/2021 03:21:37 - INFO - codeparrot_training - Step 3959: {'lr': 0.0004929179952152865, 'samples': 2027520, 'steps': 3959, 'batch_loss/train': 0.8803301360458136} 12/22/2021 03:21:47 - INFO - codeparrot_training - Step 3960: {'lr': 0.0004929142693744483, 'samples': 2028032, 'steps': 3960, 'batch_loss/train': 0.8416177555918694} 12/22/2021 03:21:58 - INFO - codeparrot_training - Step 3961: {'lr': 0.0004929105425678747, 'samples': 2028544, 'steps': 3961, 'batch_loss/train': 0.816517666913569} 12/22/2021 03:22:10 - INFO - codeparrot_training - Step 3962: {'lr': 0.0004929068147955804, 'samples': 2029056, 'steps': 3962, 'batch_loss/train': 0.6996951932087541} 12/22/2021 03:22:20 - INFO - codeparrot_training - Step 3963: {'lr': 0.0004929030860575805, 'samples': 2029568, 'steps': 3963, 'batch_loss/train': 0.8543358547613025} 12/22/2021 03:22:31 - INFO - codeparrot_training - Step 3964: {'lr': 0.0004928993563538895, 'samples': 2030080, 'steps': 3964, 'batch_loss/train': 0.976449036039412} 12/22/2021 03:22:42 - INFO - codeparrot_training 
- Step 3965: {'lr': 0.0004928956256845223, 'samples': 2030592, 'steps': 3965, 'batch_loss/train': 0.7406814554706216} 12/22/2021 03:22:54 - INFO - codeparrot_training - Step 3966: {'lr': 0.000492891894049494, 'samples': 2031104, 'steps': 3966, 'batch_loss/train': 0.7277922327630222} 12/22/2021 03:23:04 - INFO - codeparrot_training - Step 3967: {'lr': 0.0004928881614488192, 'samples': 2031616, 'steps': 3967, 'batch_loss/train': 0.8887613015249372} 12/22/2021 03:23:15 - INFO - codeparrot_training - Step 3968: {'lr': 0.0004928844278825128, 'samples': 2032128, 'steps': 3968, 'batch_loss/train': 0.8376848599873483} 12/22/2021 03:23:27 - INFO - codeparrot_training - Step 3969: {'lr': 0.0004928806933505895, 'samples': 2032640, 'steps': 3969, 'batch_loss/train': 0.8786351103335619} 12/22/2021 03:23:38 - INFO - codeparrot_training - Step 3970: {'lr': 0.0004928769578530644, 'samples': 2033152, 'steps': 3970, 'batch_loss/train': 0.7829575063660741} 12/22/2021 03:23:48 - INFO - codeparrot_training - Step 3971: {'lr': 0.0004928732213899521, 'samples': 2033664, 'steps': 3971, 'batch_loss/train': 0.7181234874296933} 12/22/2021 03:24:01 - INFO - codeparrot_training - Step 3972: {'lr': 0.0004928694839612676, 'samples': 2034176, 'steps': 3972, 'batch_loss/train': 0.8288478744216263} 12/22/2021 03:24:12 - INFO - codeparrot_training - Step 3973: {'lr': 0.0004928657455670258, 'samples': 2034688, 'steps': 3973, 'batch_loss/train': 0.8714688718318939} 12/22/2021 03:24:22 - INFO - codeparrot_training - Step 3974: {'lr': 0.0004928620062072415, 'samples': 2035200, 'steps': 3974, 'batch_loss/train': 0.8959438819438219} 12/22/2021 03:24:33 - INFO - codeparrot_training - Step 3975: {'lr': 0.0004928582658819295, 'samples': 2035712, 'steps': 3975, 'batch_loss/train': 0.8504805788397789} 12/22/2021 03:24:45 - INFO - codeparrot_training - Step 3976: {'lr': 0.0004928545245911047, 'samples': 2036224, 'steps': 3976, 'batch_loss/train': 0.8421950861811638} 12/22/2021 03:24:55 - INFO - 
codeparrot_training - Step 3977: {'lr': 0.0004928507823347822, 'samples': 2036736, 'steps': 3977, 'batch_loss/train': 0.8208606177940965} 12/22/2021 03:25:06 - INFO - codeparrot_training - Step 3978: {'lr': 0.0004928470391129766, 'samples': 2037248, 'steps': 3978, 'batch_loss/train': 0.8018357967957854} 12/22/2021 03:25:18 - INFO - codeparrot_training - Step 3979: {'lr': 0.0004928432949257028, 'samples': 2037760, 'steps': 3979, 'batch_loss/train': 0.8892128756269813} 12/22/2021 03:25:29 - INFO - codeparrot_training - Step 3980: {'lr': 0.0004928395497729757, 'samples': 2038272, 'steps': 3980, 'batch_loss/train': 0.8581522945314646} 12/22/2021 03:25:39 - INFO - codeparrot_training - Step 3981: {'lr': 0.0004928358036548104, 'samples': 2038784, 'steps': 3981, 'batch_loss/train': 0.8016374306753278} 12/22/2021 03:25:53 - INFO - codeparrot_training - Step 3982: {'lr': 0.0004928320565712216, 'samples': 2039296, 'steps': 3982, 'batch_loss/train': 0.9006948256865144} 12/22/2021 03:26:03 - INFO - codeparrot_training - Step 3983: {'lr': 0.0004928283085222241, 'samples': 2039808, 'steps': 3983, 'batch_loss/train': 0.7625440126284957} 12/22/2021 03:26:14 - INFO - codeparrot_training - Step 3984: {'lr': 0.0004928245595078331, 'samples': 2040320, 'steps': 3984, 'batch_loss/train': 0.8810455435886979} 12/22/2021 03:26:26 - INFO - codeparrot_training - Step 3985: {'lr': 0.0004928208095280633, 'samples': 2040832, 'steps': 3985, 'batch_loss/train': 0.6100566554814577} 12/22/2021 03:26:37 - INFO - codeparrot_training - Step 3986: {'lr': 0.0004928170585829296, 'samples': 2041344, 'steps': 3986, 'batch_loss/train': 0.7510919773485512} 12/22/2021 03:26:47 - INFO - codeparrot_training - Step 3987: {'lr': 0.000492813306672447, 'samples': 2041856, 'steps': 3987, 'batch_loss/train': 0.7714018076658249} 12/22/2021 03:26:58 - INFO - codeparrot_training - Step 3988: {'lr': 0.0004928095537966304, 'samples': 2042368, 'steps': 3988, 'batch_loss/train': 0.7487285176757723} 12/22/2021 03:27:10 - 
INFO - codeparrot_training - Step 3989: {'lr': 0.0004928057999554946, 'samples': 2042880, 'steps': 3989, 'batch_loss/train': 0.8488930584862828} 12/22/2021 03:27:20 - INFO - codeparrot_training - Step 3990: {'lr': 0.0004928020451490547, 'samples': 2043392, 'steps': 3990, 'batch_loss/train': 0.9103718847036362} 12/22/2021 03:27:31 - INFO - codeparrot_training - Step 3991: {'lr': 0.0004927982893773255, 'samples': 2043904, 'steps': 3991, 'batch_loss/train': 0.7831796370446682} 12/22/2021 03:27:44 - INFO - codeparrot_training - Step 3992: {'lr': 0.000492794532640322, 'samples': 2044416, 'steps': 3992, 'batch_loss/train': 0.8909170497208834} 12/22/2021 03:27:54 - INFO - codeparrot_training - Step 3993: {'lr': 0.0004927907749380591, 'samples': 2044928, 'steps': 3993, 'batch_loss/train': 0.8144533315207809} 12/22/2021 03:28:05 - INFO - codeparrot_training - Step 3994: {'lr': 0.0004927870162705518, 'samples': 2045440, 'steps': 3994, 'batch_loss/train': 0.794631714001298} 12/22/2021 03:28:17 - INFO - codeparrot_training - Step 3995: {'lr': 0.000492783256637815, 'samples': 2045952, 'steps': 3995, 'batch_loss/train': 0.908253900706768} 12/22/2021 03:28:28 - INFO - codeparrot_training - Step 3996: {'lr': 0.0004927794960398635, 'samples': 2046464, 'steps': 3996, 'batch_loss/train': 0.7966419719159603} 12/22/2021 03:28:38 - INFO - codeparrot_training - Step 3997: {'lr': 0.0004927757344767125, 'samples': 2046976, 'steps': 3997, 'batch_loss/train': 0.8370355293154716} 12/22/2021 03:28:49 - INFO - codeparrot_training - Step 3998: {'lr': 0.0004927719719483768, 'samples': 2047488, 'steps': 3998, 'batch_loss/train': 0.7639365149661899} 12/22/2021 03:29:02 - INFO - codeparrot_training - Step 3999: {'lr': 0.0004927682084548715, 'samples': 2048000, 'steps': 3999, 'batch_loss/train': 0.7497921297326684} 12/22/2021 03:29:12 - INFO - codeparrot_training - Step 4000: {'lr': 0.0004927644439962115, 'samples': 2048512, 'steps': 4000, 'batch_loss/train': 1.939749263226986} 12/22/2021 03:29:23 - 
INFO - codeparrot_training - Step 4001: {'lr': 0.0004927606785724116, 'samples': 2049024, 'steps': 4001, 'batch_loss/train': 0.9224952477961779} 12/22/2021 03:29:35 - INFO - codeparrot_training - Step 4002: {'lr': 0.000492756912183487, 'samples': 2049536, 'steps': 4002, 'batch_loss/train': 0.8272404531016946} 12/22/2021 03:29:46 - INFO - codeparrot_training - Step 4003: {'lr': 0.0004927531448294524, 'samples': 2050048, 'steps': 4003, 'batch_loss/train': 0.7317931945435703} 12/22/2021 03:29:56 - INFO - codeparrot_training - Step 4004: {'lr': 0.0004927493765103231, 'samples': 2050560, 'steps': 4004, 'batch_loss/train': 0.8750862628221512} 12/22/2021 03:30:08 - INFO - codeparrot_training - Step 4005: {'lr': 0.000492745607226114, 'samples': 2051072, 'steps': 4005, 'batch_loss/train': 0.8990721963346004} 12/22/2021 03:30:19 - INFO - codeparrot_training - Step 4006: {'lr': 0.0004927418369768399, 'samples': 2051584, 'steps': 4006, 'batch_loss/train': 0.9704464334063232} 12/22/2021 03:30:30 - INFO - codeparrot_training - Step 4007: {'lr': 0.000492738065762516, 'samples': 2052096, 'steps': 4007, 'batch_loss/train': 0.8910357942804694} 12/22/2021 03:30:40 - INFO - codeparrot_training - Step 4008: {'lr': 0.0004927342935831572, 'samples': 2052608, 'steps': 4008, 'batch_loss/train': 0.8607935458421707} 12/22/2021 03:30:53 - INFO - codeparrot_training - Step 4009: {'lr': 0.0004927305204387785, 'samples': 2053120, 'steps': 4009, 'batch_loss/train': 0.8567226156592369} 12/22/2021 03:31:03 - INFO - codeparrot_training - Step 4010: {'lr': 0.0004927267463293948, 'samples': 2053632, 'steps': 4010, 'batch_loss/train': 0.7735480377450585} 12/22/2021 03:31:14 - INFO - codeparrot_training - Step 4011: {'lr': 0.0004927229712550213, 'samples': 2054144, 'steps': 4011, 'batch_loss/train': 0.8865181477740407} 12/22/2021 03:31:26 - INFO - codeparrot_training - Step 4012: {'lr': 0.0004927191952156729, 'samples': 2054656, 'steps': 4012, 'batch_loss/train': 0.8445950420573354} 12/22/2021 03:31:37 
- INFO - codeparrot_training - Step 4013: {'lr': 0.0004927154182113646, 'samples': 2055168, 'steps': 4013, 'batch_loss/train': 0.9298778753727674} 12/22/2021 03:31:47 - INFO - codeparrot_training - Step 4014: {'lr': 0.0004927116402421116, 'samples': 2055680, 'steps': 4014, 'batch_loss/train': 0.8360939025878906} 12/22/2021 03:31:59 - INFO - codeparrot_training - Step 4015: {'lr': 0.0004927078613079285, 'samples': 2056192, 'steps': 4015, 'batch_loss/train': 0.8786779902875423} 12/22/2021 03:32:10 - INFO - codeparrot_training - Step 4016: {'lr': 0.0004927040814088307, 'samples': 2056704, 'steps': 4016, 'batch_loss/train': 0.7898402875289321} 12/22/2021 03:32:21 - INFO - codeparrot_training - Step 4017: {'lr': 0.0004927003005448332, 'samples': 2057216, 'steps': 4017, 'batch_loss/train': 0.8690209090709686} 12/22/2021 03:32:33 - INFO - codeparrot_training - Step 4018: {'lr': 0.0004926965187159509, 'samples': 2057728, 'steps': 4018, 'batch_loss/train': 0.8544766263803467} 12/22/2021 03:32:43 - INFO - codeparrot_training - Step 4019: {'lr': 0.0004926927359221988, 'samples': 2058240, 'steps': 4019, 'batch_loss/train': 0.8264490514993668} 12/22/2021 03:32:54 - INFO - codeparrot_training - Step 4020: {'lr': 0.0004926889521635921, 'samples': 2058752, 'steps': 4020, 'batch_loss/train': 0.877487201243639} 12/22/2021 03:33:05 - INFO - codeparrot_training - Step 4021: {'lr': 0.0004926851674401457, 'samples': 2059264, 'steps': 4021, 'batch_loss/train': 0.9886155249550939} 12/22/2021 03:33:17 - INFO - codeparrot_training - Step 4022: {'lr': 0.0004926813817518748, 'samples': 2059776, 'steps': 4022, 'batch_loss/train': 0.96681207139045} 12/22/2021 03:33:28 - INFO - codeparrot_training - Step 4023: {'lr': 0.0004926775950987943, 'samples': 2060288, 'steps': 4023, 'batch_loss/train': 0.8597860326990485} 12/22/2021 03:33:39 - INFO - codeparrot_training - Step 4024: {'lr': 0.0004926738074809193, 'samples': 2060800, 'steps': 4024, 'batch_loss/train': 0.8022804544307292} 12/22/2021 
03:33:51 - INFO - codeparrot_training - Step 4025: {'lr': 0.0004926700188982649, 'samples': 2061312, 'steps': 4025, 'batch_loss/train': 1.12282660510391} 12/22/2021 03:34:01 - INFO - codeparrot_training - Step 4026: {'lr': 0.000492666229350846, 'samples': 2061824, 'steps': 4026, 'batch_loss/train': 0.8577085128054023} 12/22/2021 03:34:12 - INFO - codeparrot_training - Step 4027: {'lr': 0.000492662438838678, 'samples': 2062336, 'steps': 4027, 'batch_loss/train': 0.813668274320662} 12/22/2021 03:34:25 - INFO - codeparrot_training - Step 4028: {'lr': 0.0004926586473617757, 'samples': 2062848, 'steps': 4028, 'batch_loss/train': 0.8566002333536744} 12/22/2021 03:34:35 - INFO - codeparrot_training - Step 4029: {'lr': 0.0004926548549201542, 'samples': 2063360, 'steps': 4029, 'batch_loss/train': 0.9147388422861695} 12/22/2021 03:34:46 - INFO - codeparrot_training - Step 4030: {'lr': 0.0004926510615138286, 'samples': 2063872, 'steps': 4030, 'batch_loss/train': 0.9181901644915342} 12/22/2021 03:34:56 - INFO - codeparrot_training - Step 4031: {'lr': 0.0004926472671428142, 'samples': 2064384, 'steps': 4031, 'batch_loss/train': 0.7982976287603378} 12/22/2021 03:35:09 - INFO - codeparrot_training - Step 4032: {'lr': 0.0004926434718071257, 'samples': 2064896, 'steps': 4032, 'batch_loss/train': 0.9180630436167121} 12/22/2021 03:35:19 - INFO - codeparrot_training - Step 4033: {'lr': 0.0004926396755067785, 'samples': 2065408, 'steps': 4033, 'batch_loss/train': 0.8358725449070334} 12/22/2021 03:35:30 - INFO - codeparrot_training - Step 4034: {'lr': 0.0004926358782417874, 'samples': 2065920, 'steps': 4034, 'batch_loss/train': 0.8483314928598702} 12/22/2021 03:35:42 - INFO - codeparrot_training - Step 4035: {'lr': 0.0004926320800121678, 'samples': 2066432, 'steps': 4035, 'batch_loss/train': 0.7135164469946176} 12/22/2021 03:35:52 - INFO - codeparrot_training - Step 4036: {'lr': 0.0004926282808179345, 'samples': 2066944, 'steps': 4036, 'batch_loss/train': 0.7975619584321976} 12/22/2021 
03:36:03 - INFO - codeparrot_training - Step 4037: {'lr': 0.0004926244806591029, 'samples': 2067456, 'steps': 4037, 'batch_loss/train': 0.920528088696301} 12/22/2021 03:36:16 - INFO - codeparrot_training - Step 4038: {'lr': 0.000492620679535688, 'samples': 2067968, 'steps': 4038, 'batch_loss/train': 0.8891797438263893} 12/22/2021 03:36:27 - INFO - codeparrot_training - Step 4039: {'lr': 0.0004926168774477047, 'samples': 2068480, 'steps': 4039, 'batch_loss/train': 0.8778839907608926} 12/22/2021 03:36:37 - INFO - codeparrot_training - Step 4040: {'lr': 0.0004926130743951686, 'samples': 2068992, 'steps': 4040, 'batch_loss/train': 0.7715625325217843} 12/22/2021 03:36:48 - INFO - codeparrot_training - Step 4041: {'lr': 0.0004926092703780943, 'samples': 2069504, 'steps': 4041, 'batch_loss/train': 0.8303506067022681} 12/22/2021 03:37:00 - INFO - codeparrot_training - Step 4042: {'lr': 0.0004926054653964972, 'samples': 2070016, 'steps': 4042, 'batch_loss/train': 0.9952564276754856} 12/22/2021 03:37:11 - INFO - codeparrot_training - Step 4043: {'lr': 0.0004926016594503925, 'samples': 2070528, 'steps': 4043, 'batch_loss/train': 0.8283628281205893} 12/22/2021 03:37:21 - INFO - codeparrot_training - Step 4044: {'lr': 0.000492597852539795, 'samples': 2071040, 'steps': 4044, 'batch_loss/train': 0.8273870786651969} 12/22/2021 03:37:34 - INFO - codeparrot_training - Step 4045: {'lr': 0.0004925940446647201, 'samples': 2071552, 'steps': 4045, 'batch_loss/train': 0.8455392615869641} 12/22/2021 03:37:45 - INFO - codeparrot_training - Step 4046: {'lr': 0.000492590235825183, 'samples': 2072064, 'steps': 4046, 'batch_loss/train': 1.0025887973606586} 12/22/2021 03:37:55 - INFO - codeparrot_training - Step 4047: {'lr': 0.0004925864260211986, 'samples': 2072576, 'steps': 4047, 'batch_loss/train': 0.9002260910347104} 12/22/2021 03:38:07 - INFO - codeparrot_training - Step 4048: {'lr': 0.0004925826152527822, 'samples': 2073088, 'steps': 4048, 'batch_loss/train': 0.8416243176907301} 12/22/2021 
03:38:18 - INFO - codeparrot_training - Step 4049: {'lr': 0.0004925788035199488, 'samples': 2073600, 'steps': 4049, 'batch_loss/train': 0.93688546679914} 12/22/2021 03:38:28 - INFO - codeparrot_training - Step 4050: {'lr': 0.0004925749908227139, 'samples': 2074112, 'steps': 4050, 'batch_loss/train': 0.8470156388357282} 12/22/2021 03:38:40 - INFO - codeparrot_training - Step 4051: {'lr': 0.0004925711771610924, 'samples': 2074624, 'steps': 4051, 'batch_loss/train': 0.8448299306910485} 12/22/2021 03:38:51 - INFO - codeparrot_training - Step 4052: {'lr': 0.0004925673625350993, 'samples': 2075136, 'steps': 4052, 'batch_loss/train': 0.842994038015604} 12/22/2021 03:39:02 - INFO - codeparrot_training - Step 4053: {'lr': 0.0004925635469447501, 'samples': 2075648, 'steps': 4053, 'batch_loss/train': 0.8194690030068159} 12/22/2021 03:39:12 - INFO - codeparrot_training - Step 4054: {'lr': 0.0004925597303900598, 'samples': 2076160, 'steps': 4054, 'batch_loss/train': 0.917898066341877} 12/22/2021 03:39:24 - INFO - codeparrot_training - Step 4055: {'lr': 0.0004925559128710436, 'samples': 2076672, 'steps': 4055, 'batch_loss/train': 0.8899425034178421} 12/22/2021 03:39:35 - INFO - codeparrot_training - Step 4056: {'lr': 0.0004925520943877167, 'samples': 2077184, 'steps': 4056, 'batch_loss/train': 0.8386385953053832} 12/22/2021 03:39:46 - INFO - codeparrot_training - Step 4057: {'lr': 0.0004925482749400942, 'samples': 2077696, 'steps': 4057, 'batch_loss/train': 0.8371183546259999} 12/22/2021 03:39:59 - INFO - codeparrot_training - Step 4058: {'lr': 0.0004925444545281914, 'samples': 2078208, 'steps': 4058, 'batch_loss/train': 0.8975976807996631} 12/22/2021 03:40:09 - INFO - codeparrot_training - Step 4059: {'lr': 0.0004925406331520234, 'samples': 2078720, 'steps': 4059, 'batch_loss/train': 0.9045816473662853} 12/22/2021 03:40:20 - INFO - codeparrot_training - Step 4060: {'lr': 0.0004925368108116054, 'samples': 2079232, 'steps': 4060, 'batch_loss/train': 0.8102682735770941} 12/22/2021 
03:40:32 - INFO - codeparrot_training - Step 4061: {'lr': 0.0004925329875069526, 'samples': 2079744, 'steps': 4061, 'batch_loss/train': 0.8013631673529744} 12/22/2021 03:40:43 - INFO - codeparrot_training - Step 4062: {'lr': 0.0004925291632380803, 'samples': 2080256, 'steps': 4062, 'batch_loss/train': 0.8012921512126923} 12/22/2021 03:40:53 - INFO - codeparrot_training - Step 4063: {'lr': 0.0004925253380050035, 'samples': 2080768, 'steps': 4063, 'batch_loss/train': 0.8662208784371614} 12/22/2021 03:41:04 - INFO - codeparrot_training - Step 4064: {'lr': 0.0004925215118077377, 'samples': 2081280, 'steps': 4064, 'batch_loss/train': 0.887549351900816} 12/22/2021 03:41:16 - INFO - codeparrot_training - Step 4065: {'lr': 0.0004925176846462978, 'samples': 2081792, 'steps': 4065, 'batch_loss/train': 0.8047728394158185} 12/22/2021 03:41:26 - INFO - codeparrot_training - Step 4066: {'lr': 0.0004925138565206992, 'samples': 2082304, 'steps': 4066, 'batch_loss/train': 0.9118258021771908} 12/22/2021 03:41:37 - INFO - codeparrot_training - Step 4067: {'lr': 0.0004925100274309571, 'samples': 2082816, 'steps': 4067, 'batch_loss/train': 0.7800125600770116} 12/22/2021 03:41:49 - INFO - codeparrot_training - Step 4068: {'lr': 0.0004925061973770867, 'samples': 2083328, 'steps': 4068, 'batch_loss/train': 0.842938145622611} 12/22/2021 03:42:00 - INFO - codeparrot_training - Step 4069: {'lr': 0.0004925023663591031, 'samples': 2083840, 'steps': 4069, 'batch_loss/train': 0.8663719743490219} 12/22/2021 03:42:11 - INFO - codeparrot_training - Step 4070: {'lr': 0.0004924985343770219, 'samples': 2084352, 'steps': 4070, 'batch_loss/train': 0.8080026539973915} 12/22/2021 03:42:23 - INFO - codeparrot_training - Step 4071: {'lr': 0.0004924947014308578, 'samples': 2084864, 'steps': 4071, 'batch_loss/train': 0.8576187444850802} 12/22/2021 03:42:33 - INFO - codeparrot_training - Step 4072: {'lr': 0.0004924908675206266, 'samples': 2085376, 'steps': 4072, 'batch_loss/train': 0.8730600103735924} 
12/22/2021 03:42:44 - INFO - codeparrot_training - Step 4073: {'lr': 0.0004924870326463431, 'samples': 2085888, 'steps': 4073, 'batch_loss/train': 0.7766828294843435} 12/22/2021 03:42:57 - INFO - codeparrot_training - Step 4074: {'lr': 0.0004924831968080228, 'samples': 2086400, 'steps': 4074, 'batch_loss/train': 0.8401000639423728} 12/22/2021 03:43:07 - INFO - codeparrot_training - Step 4075: {'lr': 0.0004924793600056809, 'samples': 2086912, 'steps': 4075, 'batch_loss/train': 0.8160884510725737} 12/22/2021 03:43:18 - INFO - codeparrot_training - Step 4076: {'lr': 0.0004924755222393325, 'samples': 2087424, 'steps': 4076, 'batch_loss/train': 0.9883578100707382} 12/22/2021 03:43:29 - INFO - codeparrot_training - Step 4077: {'lr': 0.0004924716835089931, 'samples': 2087936, 'steps': 4077, 'batch_loss/train': 0.9357633530162275} 12/22/2021 03:43:41 - INFO - codeparrot_training - Step 4078: {'lr': 0.0004924678438146776, 'samples': 2088448, 'steps': 4078, 'batch_loss/train': 0.8292823480442166} 12/22/2021 03:43:51 - INFO - codeparrot_training - Step 4079: {'lr': 0.0004924640031564019, 'samples': 2088960, 'steps': 4079, 'batch_loss/train': 0.8427103636786342} 12/22/2021 03:44:02 - INFO - codeparrot_training - Step 4080: {'lr': 0.0004924601615341806, 'samples': 2089472, 'steps': 4080, 'batch_loss/train': 0.8121208697557449} 12/22/2021 03:44:14 - INFO - codeparrot_training - Step 4081: {'lr': 0.0004924563189480293, 'samples': 2089984, 'steps': 4081, 'batch_loss/train': 0.8540441486984491} 12/22/2021 03:44:25 - INFO - codeparrot_training - Step 4082: {'lr': 0.0004924524753979632, 'samples': 2090496, 'steps': 4082, 'batch_loss/train': 0.8646829561330378} 12/22/2021 03:44:36 - INFO - codeparrot_training - Step 4083: {'lr': 0.0004924486308839976, 'samples': 2091008, 'steps': 4083, 'batch_loss/train': 0.8439098326489329} 12/22/2021 03:44:46 - INFO - codeparrot_training - Step 4084: {'lr': 0.0004924447854061479, 'samples': 2091520, 'steps': 4084, 'batch_loss/train': 
0.8474629046395421} 12/22/2021 03:44:59 - INFO - codeparrot_training - Step 4085: {'lr': 0.0004924409389644292, 'samples': 2092032, 'steps': 4085, 'batch_loss/train': 0.840415452606976} 12/22/2021 03:45:09 - INFO - codeparrot_training - Step 4086: {'lr': 0.0004924370915588568, 'samples': 2092544, 'steps': 4086, 'batch_loss/train': 0.7762196827679873} 12/22/2021 03:45:20 - INFO - codeparrot_training - Step 4087: {'lr': 0.0004924332431894461, 'samples': 2093056, 'steps': 4087, 'batch_loss/train': 0.8470400171354413} 12/22/2021 03:45:32 - INFO - codeparrot_training - Step 4088: {'lr': 0.0004924293938562125, 'samples': 2093568, 'steps': 4088, 'batch_loss/train': 0.8784585501998663} 12/22/2021 03:45:43 - INFO - codeparrot_training - Step 4089: {'lr': 0.000492425543559171, 'samples': 2094080, 'steps': 4089, 'batch_loss/train': 0.9186237771064043} 12/22/2021 03:45:53 - INFO - codeparrot_training - Step 4090: {'lr': 0.0004924216922983372, 'samples': 2094592, 'steps': 4090, 'batch_loss/train': 0.8462586454115808} 12/22/2021 03:46:05 - INFO - codeparrot_training - Step 4091: {'lr': 0.0004924178400737263, 'samples': 2095104, 'steps': 4091, 'batch_loss/train': 0.7473504939116538} 12/22/2021 03:46:16 - INFO - codeparrot_training - Step 4092: {'lr': 0.0004924139868853534, 'samples': 2095616, 'steps': 4092, 'batch_loss/train': 0.8914831876754761} 12/22/2021 03:46:27 - INFO - codeparrot_training - Step 4093: {'lr': 0.0004924101327332342, 'samples': 2096128, 'steps': 4093, 'batch_loss/train': 0.918848617002368} 12/22/2021 03:46:38 - INFO - codeparrot_training - Step 4094: {'lr': 0.0004924062776173838, 'samples': 2096640, 'steps': 4094, 'batch_loss/train': 0.913783541880548} 12/22/2021 03:46:49 - INFO - codeparrot_training - Step 4095: {'lr': 0.0004924024215378175, 'samples': 2097152, 'steps': 4095, 'batch_loss/train': 0.7991795456036925} 12/22/2021 03:47:00 - INFO - codeparrot_training - Step 4096: {'lr': 0.0004923985644945508, 'samples': 2097664, 'steps': 4096, 'batch_loss/train': 
0.7815361036919057} 12/22/2021 03:47:10 - INFO - codeparrot_training - Step 4097: {'lr': 0.000492394706487599, 'samples': 2098176, 'steps': 4097, 'batch_loss/train': 0.801373035646975} 12/22/2021 03:47:23 - INFO - codeparrot_training - Step 4098: {'lr': 0.0004923908475169772, 'samples': 2098688, 'steps': 4098, 'batch_loss/train': 0.9192048946861178} 12/22/2021 03:47:34 - INFO - codeparrot_training - Step 4099: {'lr': 0.000492386987582701, 'samples': 2099200, 'steps': 4099, 'batch_loss/train': 0.8191734198480844} 12/22/2021 03:47:45 - INFO - codeparrot_training - Step 4100: {'lr': 0.0004923831266847856, 'samples': 2099712, 'steps': 4100, 'batch_loss/train': 0.8017001561820507} 12/22/2021 03:47:57 - INFO - codeparrot_training - Step 4101: {'lr': 0.0004923792648232465, 'samples': 2100224, 'steps': 4101, 'batch_loss/train': 0.877462980337441} 12/22/2021 03:48:08 - INFO - codeparrot_training - Step 4102: {'lr': 0.0004923754019980988, 'samples': 2100736, 'steps': 4102, 'batch_loss/train': 0.7437187423929572} 12/22/2021 03:48:18 - INFO - codeparrot_training - Step 4103: {'lr': 0.0004923715382093582, 'samples': 2101248, 'steps': 4103, 'batch_loss/train': 0.519745439873077} 12/22/2021 03:48:31 - INFO - codeparrot_training - Step 4104: {'lr': 0.0004923676734570398, 'samples': 2101760, 'steps': 4104, 'batch_loss/train': 0.9059451539069414} 12/22/2021 03:48:41 - INFO - codeparrot_training - Step 4105: {'lr': 0.0004923638077411591, 'samples': 2102272, 'steps': 4105, 'batch_loss/train': 0.770162015222013} 12/22/2021 03:48:52 - INFO - codeparrot_training - Step 4106: {'lr': 0.0004923599410617313, 'samples': 2102784, 'steps': 4106, 'batch_loss/train': 0.7957731019705534} 12/22/2021 03:49:03 - INFO - codeparrot_training - Step 4107: {'lr': 0.0004923560734187719, 'samples': 2103296, 'steps': 4107, 'batch_loss/train': 1.098386156372726} 12/22/2021 03:49:15 - INFO - codeparrot_training - Step 4108: {'lr': 0.0004923522048122964, 'samples': 2103808, 'steps': 4108, 'batch_loss/train': 
0.8004101002588868} 12/22/2021 03:49:25 - INFO - codeparrot_training - Step 4109: {'lr': 0.00049234833524232, 'samples': 2104320, 'steps': 4109, 'batch_loss/train': 0.927650012075901} 12/22/2021 03:49:36 - INFO - codeparrot_training - Step 4110: {'lr': 0.0004923444647088581, 'samples': 2104832, 'steps': 4110, 'batch_loss/train': 0.8178929818095639} 12/22/2021 03:49:48 - INFO - codeparrot_training - Step 4111: {'lr': 0.000492340593211926, 'samples': 2105344, 'steps': 4111, 'batch_loss/train': 0.9088439680635929} 12/22/2021 03:49:59 - INFO - codeparrot_training - Step 4112: {'lr': 0.0004923367207515393, 'samples': 2105856, 'steps': 4112, 'batch_loss/train': 0.7563310931436718} 12/22/2021 03:50:09 - INFO - codeparrot_training - Step 4113: {'lr': 0.0004923328473277134, 'samples': 2106368, 'steps': 4113, 'batch_loss/train': 0.8010037818457931} 12/22/2021 03:50:22 - INFO - codeparrot_training - Step 4114: {'lr': 0.0004923289729404634, 'samples': 2106880, 'steps': 4114, 'batch_loss/train': 0.8588320389389992} 12/22/2021 03:50:33 - INFO - codeparrot_training - Step 4115: {'lr': 0.000492325097589805, 'samples': 2107392, 'steps': 4115, 'batch_loss/train': 0.8981123650446534} 12/22/2021 03:50:43 - INFO - codeparrot_training - Step 4116: {'lr': 0.0004923212212757537, 'samples': 2107904, 'steps': 4116, 'batch_loss/train': 0.7968896902166307} 12/22/2021 03:50:55 - INFO - codeparrot_training - Step 4117: {'lr': 0.0004923173439983244, 'samples': 2108416, 'steps': 4117, 'batch_loss/train': 0.8182796519249678} 12/22/2021 03:51:06 - INFO - codeparrot_training - Step 4118: {'lr': 0.000492313465757533, 'samples': 2108928, 'steps': 4118, 'batch_loss/train': 0.8806311342632398} 12/22/2021 03:51:17 - INFO - codeparrot_training - Step 4119: {'lr': 0.0004923095865533947, 'samples': 2109440, 'steps': 4119, 'batch_loss/train': 0.840994376456365} 12/22/2021 03:51:27 - INFO - codeparrot_training - Step 4120: {'lr': 0.0004923057063859251, 'samples': 2109952, 'steps': 4120, 'batch_loss/train': 
0.8425967460498214} 12/22/2021 03:51:39 - INFO - codeparrot_training - Step 4121: {'lr': 0.0004923018252551393, 'samples': 2110464, 'steps': 4121, 'batch_loss/train': 0.920306664891541} 12/22/2021 03:51:50 - INFO - codeparrot_training - Step 4122: {'lr': 0.0004922979431610531, 'samples': 2110976, 'steps': 4122, 'batch_loss/train': 0.8342994092963636} 12/22/2021 03:52:01 - INFO - codeparrot_training - Step 4123: {'lr': 0.0004922940601036817, 'samples': 2111488, 'steps': 4123, 'batch_loss/train': 0.8120628613978624} 12/22/2021 03:52:13 - INFO - codeparrot_training - Step 4124: {'lr': 0.0004922901760830406, 'samples': 2112000, 'steps': 4124, 'batch_loss/train': 0.874008446931839} 12/22/2021 03:52:24 - INFO - codeparrot_training - Step 4125: {'lr': 0.0004922862910991452, 'samples': 2112512, 'steps': 4125, 'batch_loss/train': 0.8032994526438415} 12/22/2021 03:52:34 - INFO - codeparrot_training - Step 4126: {'lr': 0.0004922824051520111, 'samples': 2113024, 'steps': 4126, 'batch_loss/train': 0.8865904361009598} 12/22/2021 03:52:46 - INFO - codeparrot_training - Step 4127: {'lr': 0.0004922785182416535, 'samples': 2113536, 'steps': 4127, 'batch_loss/train': 0.7786675738170743} 12/22/2021 03:52:57 - INFO - codeparrot_training - Step 4128: {'lr': 0.0004922746303680881, 'samples': 2114048, 'steps': 4128, 'batch_loss/train': 0.9352441132068634} 12/22/2021 03:53:08 - INFO - codeparrot_training - Step 4129: {'lr': 0.0004922707415313303, 'samples': 2114560, 'steps': 4129, 'batch_loss/train': 0.8524200143292546} 12/22/2021 03:53:18 - INFO - codeparrot_training - Step 4130: {'lr': 0.0004922668517313955, 'samples': 2115072, 'steps': 4130, 'batch_loss/train': 0.8909002440050244} 12/22/2021 03:53:31 - INFO - codeparrot_training - Step 4131: {'lr': 0.0004922629609682991, 'samples': 2115584, 'steps': 4131, 'batch_loss/train': 0.8615256734192371} 12/22/2021 03:53:41 - INFO - codeparrot_training - Step 4132: {'lr': 0.0004922590692420566, 'samples': 2116096, 'steps': 4132, 
'batch_loss/train': 0.7947524776682258} 12/22/2021 03:53:52 - INFO - codeparrot_training - Step 4133: {'lr': 0.0004922551765526836, 'samples': 2116608, 'steps': 4133, 'batch_loss/train': 0.8576065422967076} 12/22/2021 03:54:05 - INFO - codeparrot_training - Step 4134: {'lr': 0.0004922512829001955, 'samples': 2117120, 'steps': 4134, 'batch_loss/train': 0.6755987210199237} 12/22/2021 03:54:15 - INFO - codeparrot_training - Step 4135: {'lr': 0.0004922473882846077, 'samples': 2117632, 'steps': 4135, 'batch_loss/train': 0.9504460757598281} 12/22/2021 03:54:26 - INFO - codeparrot_training - Step 4136: {'lr': 0.0004922434927059358, 'samples': 2118144, 'steps': 4136, 'batch_loss/train': 0.9165382497012615} 12/22/2021 03:54:38 - INFO - codeparrot_training - Step 4137: {'lr': 0.0004922395961641953, 'samples': 2118656, 'steps': 4137, 'batch_loss/train': 0.9348579905927181} 12/22/2021 03:54:49 - INFO - codeparrot_training - Step 4138: {'lr': 0.0004922356986594014, 'samples': 2119168, 'steps': 4138, 'batch_loss/train': 0.8500338159501553} 12/22/2021 03:54:59 - INFO - codeparrot_training - Step 4139: {'lr': 0.00049223180019157, 'samples': 2119680, 'steps': 4139, 'batch_loss/train': 0.826232872903347} 12/22/2021 03:55:10 - INFO - codeparrot_training - Step 4140: {'lr': 0.0004922279007607165, 'samples': 2120192, 'steps': 4140, 'batch_loss/train': 0.7446947055868804} 12/22/2021 03:55:22 - INFO - codeparrot_training - Step 4141: {'lr': 0.0004922240003668562, 'samples': 2120704, 'steps': 4141, 'batch_loss/train': 0.5670101901050657} 12/22/2021 03:55:33 - INFO - codeparrot_training - Step 4142: {'lr': 0.0004922200990100048, 'samples': 2121216, 'steps': 4142, 'batch_loss/train': 0.9021270740777254} 12/22/2021 03:55:43 - INFO - codeparrot_training - Step 4143: {'lr': 0.0004922161966901777, 'samples': 2121728, 'steps': 4143, 'batch_loss/train': 0.9323629504069686} 12/22/2021 03:55:56 - INFO - codeparrot_training - Step 4144: {'lr': 0.0004922122934073904, 'samples': 2122240, 'steps': 
4144, 'batch_loss/train': 0.8500166293233633} 12/22/2021 03:56:07 - INFO - codeparrot_training - Step 4145: {'lr': 0.0004922083891616586, 'samples': 2122752, 'steps': 4145, 'batch_loss/train': 0.8094088472425938} 12/22/2021 03:56:17 - INFO - codeparrot_training - Step 4146: {'lr': 0.0004922044839529977, 'samples': 2123264, 'steps': 4146, 'batch_loss/train': 0.787008099257946} 12/22/2021 03:56:29 - INFO - codeparrot_training - Step 4147: {'lr': 0.0004922005777814231, 'samples': 2123776, 'steps': 4147, 'batch_loss/train': 0.8856616225093603} 12/22/2021 03:56:40 - INFO - codeparrot_training - Step 4148: {'lr': 0.0004921966706469505, 'samples': 2124288, 'steps': 4148, 'batch_loss/train': 0.8817527913488448} 12/22/2021 03:56:51 - INFO - codeparrot_training - Step 4149: {'lr': 0.0004921927625495954, 'samples': 2124800, 'steps': 4149, 'batch_loss/train': 0.8503183955326676} 12/22/2021 03:57:03 - INFO - codeparrot_training - Step 4150: {'lr': 0.0004921888534893733, 'samples': 2125312, 'steps': 4150, 'batch_loss/train': 0.8844315791502595} 12/22/2021 03:57:13 - INFO - codeparrot_training - Step 4151: {'lr': 0.0004921849434662999, 'samples': 2125824, 'steps': 4151, 'batch_loss/train': 0.8079833211377263} 12/22/2021 03:57:24 - INFO - codeparrot_training - Step 4152: {'lr': 0.0004921810324803905, 'samples': 2126336, 'steps': 4152, 'batch_loss/train': 0.5724159111268818} 12/22/2021 03:57:35 - INFO - codeparrot_training - Step 4153: {'lr': 0.0004921771205316607, 'samples': 2126848, 'steps': 4153, 'batch_loss/train': 0.6515989694744349} 12/22/2021 03:57:47 - INFO - codeparrot_training - Step 4154: {'lr': 0.0004921732076201263, 'samples': 2127360, 'steps': 4154, 'batch_loss/train': 0.9112244881689548} 12/22/2021 03:57:58 - INFO - codeparrot_training - Step 4155: {'lr': 0.0004921692937458025, 'samples': 2127872, 'steps': 4155, 'batch_loss/train': 0.8497237404808402} 12/22/2021 03:58:08 - INFO - codeparrot_training - Step 4156: {'lr': 0.0004921653789087051, 'samples': 2128384, 
'steps': 4156, 'batch_loss/train': 0.9024626226164401} 12/22/2021 03:58:21 - INFO - codeparrot_training - Step 4157: {'lr': 0.0004921614631088495, 'samples': 2128896, 'steps': 4157, 'batch_loss/train': 0.8117280481383204} 12/22/2021 03:58:31 - INFO - codeparrot_training - Step 4158: {'lr': 0.0004921575463462516, 'samples': 2129408, 'steps': 4158, 'batch_loss/train': 0.7531013395637274} 12/22/2021 03:58:42 - INFO - codeparrot_training - Step 4159: {'lr': 0.0004921536286209265, 'samples': 2129920, 'steps': 4159, 'batch_loss/train': 0.8092886880040169} 12/22/2021 03:58:53 - INFO - codeparrot_training - Step 4160: {'lr': 0.00049214970993289, 'samples': 2130432, 'steps': 4160, 'batch_loss/train': 0.8370175296440721} 12/22/2021 03:59:05 - INFO - codeparrot_training - Step 4161: {'lr': 0.0004921457902821578, 'samples': 2130944, 'steps': 4161, 'batch_loss/train': 0.8419553088024259} 12/22/2021 03:59:16 - INFO - codeparrot_training - Step 4162: {'lr': 0.0004921418696687453, 'samples': 2131456, 'steps': 4162, 'batch_loss/train': 0.930520792491734} 12/22/2021 03:59:27 - INFO - codeparrot_training - Step 4163: {'lr': 0.0004921379480926682, 'samples': 2131968, 'steps': 4163, 'batch_loss/train': 0.8297982150688767} 12/22/2021 03:59:39 - INFO - codeparrot_training - Step 4164: {'lr': 0.0004921340255539421, 'samples': 2132480, 'steps': 4164, 'batch_loss/train': 0.7703292658552527} 12/22/2021 03:59:49 - INFO - codeparrot_training - Step 4165: {'lr': 0.0004921301020525823, 'samples': 2132992, 'steps': 4165, 'batch_loss/train': 0.841952009126544} 12/22/2021 04:00:00 - INFO - codeparrot_training - Step 4166: {'lr': 0.0004921261775886049, 'samples': 2133504, 'steps': 4166, 'batch_loss/train': 0.7798296669498086} 12/22/2021 04:00:12 - INFO - codeparrot_training - Step 4167: {'lr': 0.0004921222521620251, 'samples': 2134016, 'steps': 4167, 'batch_loss/train': 1.006100881844759} 12/22/2021 04:00:23 - INFO - codeparrot_training - Step 4168: {'lr': 0.0004921183257728587, 'samples': 2134528, 
'steps': 4168, 'batch_loss/train': 0.8528145980089903} 12/22/2021 04:00:33 - INFO - codeparrot_training - Step 4169: {'lr': 0.0004921143984211213, 'samples': 2135040, 'steps': 4169, 'batch_loss/train': 0.8339375359937549} 12/22/2021 04:00:46 - INFO - codeparrot_training - Step 4170: {'lr': 0.0004921104701068284, 'samples': 2135552, 'steps': 4170, 'batch_loss/train': 0.7577645003329962} 12/22/2021 04:00:57 - INFO - codeparrot_training - Step 4171: {'lr': 0.0004921065408299958, 'samples': 2136064, 'steps': 4171, 'batch_loss/train': 1.2949805464595556} 12/22/2021 04:01:07 - INFO - codeparrot_training - Step 4172: {'lr': 0.0004921026105906388, 'samples': 2136576, 'steps': 4172, 'batch_loss/train': 0.9009168511256576} 12/22/2021 04:01:18 - INFO - codeparrot_training - Step 4173: {'lr': 0.0004920986793887734, 'samples': 2137088, 'steps': 4173, 'batch_loss/train': 0.8433825736865401} 12/22/2021 04:01:30 - INFO - codeparrot_training - Step 4174: {'lr': 0.000492094747224415, 'samples': 2137600, 'steps': 4174, 'batch_loss/train': 0.9215331058949232} 12/22/2021 04:01:41 - INFO - codeparrot_training - Step 4175: {'lr': 0.0004920908140975793, 'samples': 2138112, 'steps': 4175, 'batch_loss/train': 0.8705638372339308} 12/22/2021 04:01:52 - INFO - codeparrot_training - Step 4176: {'lr': 0.0004920868800082819, 'samples': 2138624, 'steps': 4176, 'batch_loss/train': 0.9390399809926748} 12/22/2021 04:02:04 - INFO - codeparrot_training - Step 4177: {'lr': 0.0004920829449565386, 'samples': 2139136, 'steps': 4177, 'batch_loss/train': 0.8569097490981221} 12/22/2021 04:02:15 - INFO - codeparrot_training - Step 4178: {'lr': 0.0004920790089423648, 'samples': 2139648, 'steps': 4178, 'batch_loss/train': 0.9303377270698547} 12/22/2021 04:02:25 - INFO - codeparrot_training - Step 4179: {'lr': 0.0004920750719657762, 'samples': 2140160, 'steps': 4179, 'batch_loss/train': 0.8319423738867044} 12/22/2021 04:02:37 - INFO - codeparrot_training - Step 4180: {'lr': 0.0004920711340267888, 'samples': 
2140672, 'steps': 4180, 'batch_loss/train': 0.8132032584398985} 12/22/2021 04:02:48 - INFO - codeparrot_training - Step 4181: {'lr': 0.0004920671951254178, 'samples': 2141184, 'steps': 4181, 'batch_loss/train': 0.9055496519431472} 12/22/2021 04:02:58 - INFO - codeparrot_training - Step 4182: {'lr': 0.0004920632552616789, 'samples': 2141696, 'steps': 4182, 'batch_loss/train': 0.9279581867158413} 12/22/2021 04:03:09 - INFO - codeparrot_training - Step 4183: {'lr': 0.0004920593144355881, 'samples': 2142208, 'steps': 4183, 'batch_loss/train': 0.8589329849928617} 12/22/2021 04:03:22 - INFO - codeparrot_training - Step 4184: {'lr': 0.0004920553726471608, 'samples': 2142720, 'steps': 4184, 'batch_loss/train': 0.8395436815917492} 12/22/2021 04:03:32 - INFO - codeparrot_training - Step 4185: {'lr': 0.0004920514298964127, 'samples': 2143232, 'steps': 4185, 'batch_loss/train': 0.7087234170176089} 12/22/2021 04:03:43 - INFO - codeparrot_training - Step 4186: {'lr': 0.0004920474861833595, 'samples': 2143744, 'steps': 4186, 'batch_loss/train': 0.9538164502009749} 12/22/2021 04:03:55 - INFO - codeparrot_training - Step 4187: {'lr': 0.000492043541508017, 'samples': 2144256, 'steps': 4187, 'batch_loss/train': 0.9009403511881828} 12/22/2021 04:04:06 - INFO - codeparrot_training - Step 4188: {'lr': 0.0004920395958704007, 'samples': 2144768, 'steps': 4188, 'batch_loss/train': 0.9118561772629619} 12/22/2021 04:04:16 - INFO - codeparrot_training - Step 4189: {'lr': 0.0004920356492705263, 'samples': 2145280, 'steps': 4189, 'batch_loss/train': 0.8730563372373581} 12/22/2021 04:04:29 - INFO - codeparrot_training - Step 4190: {'lr': 0.0004920317017084096, 'samples': 2145792, 'steps': 4190, 'batch_loss/train': 0.9488238608464599} 12/22/2021 04:04:40 - INFO - codeparrot_training - Step 4191: {'lr': 0.0004920277531840662, 'samples': 2146304, 'steps': 4191, 'batch_loss/train': 0.9781081806868315} 12/22/2021 04:04:50 - INFO - codeparrot_training - Step 4192: {'lr': 0.0004920238036975119, 
'samples': 2146816, 'steps': 4192, 'batch_loss/train': 0.8862302796915174} 12/22/2021 04:05:01 - INFO - codeparrot_training - Step 4193: {'lr': 0.0004920198532487623, 'samples': 2147328, 'steps': 4193, 'batch_loss/train': 0.9819248989224434} 12/22/2021 04:05:13 - INFO - codeparrot_training - Step 4194: {'lr': 0.0004920159018378332, 'samples': 2147840, 'steps': 4194, 'batch_loss/train': 1.1086815905291587} 12/22/2021 04:05:24 - INFO - codeparrot_training - Step 4195: {'lr': 0.0004920119494647402, 'samples': 2148352, 'steps': 4195, 'batch_loss/train': 0.922237922437489} 12/22/2021 04:05:35 - INFO - codeparrot_training - Step 4196: {'lr': 0.0004920079961294991, 'samples': 2148864, 'steps': 4196, 'batch_loss/train': 0.8239907678216696} 12/22/2021 04:05:47 - INFO - codeparrot_training - Step 4197: {'lr': 0.0004920040418321256, 'samples': 2149376, 'steps': 4197, 'batch_loss/train': 0.8792415484786034} 12/22/2021 04:05:58 - INFO - codeparrot_training - Step 4198: {'lr': 0.0004920000865726354, 'samples': 2149888, 'steps': 4198, 'batch_loss/train': 0.8275346867740154} 12/22/2021 04:06:08 - INFO - codeparrot_training - Step 4199: {'lr': 0.0004919961303510442, 'samples': 2150400, 'steps': 4199, 'batch_loss/train': 0.7876558126881719} 12/22/2021 04:06:20 - INFO - codeparrot_training - Step 4200: {'lr': 0.0004919921731673677, 'samples': 2150912, 'steps': 4200, 'batch_loss/train': 0.8251504953950644} 12/22/2021 04:06:31 - INFO - codeparrot_training - Step 4201: {'lr': 0.0004919882150216219, 'samples': 2151424, 'steps': 4201, 'batch_loss/train': 0.8830639310181141} 12/22/2021 04:06:41 - INFO - codeparrot_training - Step 4202: {'lr': 0.0004919842559138221, 'samples': 2151936, 'steps': 4202, 'batch_loss/train': 0.8418839145451784} 12/22/2021 04:06:52 - INFO - codeparrot_training - Step 4203: {'lr': 0.0004919802958439844, 'samples': 2152448, 'steps': 4203, 'batch_loss/train': 0.8450416726991534} 12/22/2021 04:07:05 - INFO - codeparrot_training - Step 4204: {'lr': 
0.0004919763348121244, 'samples': 2152960, 'steps': 4204, 'batch_loss/train': 0.7476040422916412} 12/22/2021 04:07:15 - INFO - codeparrot_training - Step 4205: {'lr': 0.0004919723728182578, 'samples': 2153472, 'steps': 4205, 'batch_loss/train': 0.9225826757028699} 12/22/2021 04:07:26 - INFO - codeparrot_training - Step 4206: {'lr': 0.0004919684098624005, 'samples': 2153984, 'steps': 4206, 'batch_loss/train': 0.8907818743027747} 12/22/2021 04:07:38 - INFO - codeparrot_training - Step 4207: {'lr': 0.0004919644459445682, 'samples': 2154496, 'steps': 4207, 'batch_loss/train': 0.8607776360586286} 12/22/2021 04:07:49 - INFO - codeparrot_training - Step 4208: {'lr': 0.0004919604810647764, 'samples': 2155008, 'steps': 4208, 'batch_loss/train': 0.8636884298175573} 12/22/2021 04:07:59 - INFO - codeparrot_training - Step 4209: {'lr': 0.0004919565152230412, 'samples': 2155520, 'steps': 4209, 'batch_loss/train': 0.8663858054205775} 12/22/2021 04:08:11 - INFO - codeparrot_training - Step 4210: {'lr': 0.0004919525484193783, 'samples': 2156032, 'steps': 4210, 'batch_loss/train': 0.83458595816046} 12/22/2021 04:08:22 - INFO - codeparrot_training - Step 4211: {'lr': 0.0004919485806538034, 'samples': 2156544, 'steps': 4211, 'batch_loss/train': 0.875915807671845} 12/22/2021 04:08:33 - INFO - codeparrot_training - Step 4212: {'lr': 0.0004919446119263323, 'samples': 2157056, 'steps': 4212, 'batch_loss/train': 0.631615602876991} 12/22/2021 04:08:45 - INFO - codeparrot_training - Step 4213: {'lr': 0.0004919406422369809, 'samples': 2157568, 'steps': 4213, 'batch_loss/train': 0.8464700803160667} 12/22/2021 04:08:56 - INFO - codeparrot_training - Step 4214: {'lr': 0.0004919366715857648, 'samples': 2158080, 'steps': 4214, 'batch_loss/train': 0.7966765407472849} 12/22/2021 04:09:06 - INFO - codeparrot_training - Step 4215: {'lr': 0.0004919326999726996, 'samples': 2158592, 'steps': 4215, 'batch_loss/train': 0.8686229945160449} 12/22/2021 04:09:17 - INFO - codeparrot_training - Step 4216: {'lr': 
0.0004919287273978016, 'samples': 2159104, 'steps': 4216, 'batch_loss/train': 0.8444241369143128} 12/22/2021 04:09:29 - INFO - codeparrot_training - Step 4217: {'lr': 0.0004919247538610862, 'samples': 2159616, 'steps': 4217, 'batch_loss/train': 0.7683601351454854} 12/22/2021 04:09:40 - INFO - codeparrot_training - Step 4218: {'lr': 0.0004919207793625695, 'samples': 2160128, 'steps': 4218, 'batch_loss/train': 0.8407889101654291} 12/22/2021 04:09:50 - INFO - codeparrot_training - Step 4219: {'lr': 0.0004919168039022669, 'samples': 2160640, 'steps': 4219, 'batch_loss/train': 0.8537206966429949} 12/22/2021 04:10:03 - INFO - codeparrot_training - Step 4220: {'lr': 0.0004919128274801946, 'samples': 2161152, 'steps': 4220, 'batch_loss/train': 0.9095764169469476} 12/22/2021 04:10:14 - INFO - codeparrot_training - Step 4221: {'lr': 0.0004919088500963682, 'samples': 2161664, 'steps': 4221, 'batch_loss/train': 0.8450834965333343} 12/22/2021 04:10:24 - INFO - codeparrot_training - Step 4222: {'lr': 0.0004919048717508034, 'samples': 2162176, 'steps': 4222, 'batch_loss/train': 0.5983525580959395} 12/22/2021 04:10:37 - INFO - codeparrot_training - Step 4223: {'lr': 0.0004919008924435164, 'samples': 2162688, 'steps': 4223, 'batch_loss/train': 0.7912319777533412} 12/22/2021 04:10:47 - INFO - codeparrot_training - Step 4224: {'lr': 0.0004918969121745227, 'samples': 2163200, 'steps': 4224, 'batch_loss/train': 0.732310350984335} 12/22/2021 04:10:58 - INFO - codeparrot_training - Step 4225: {'lr': 0.0004918929309438381, 'samples': 2163712, 'steps': 4225, 'batch_loss/train': 0.8868085164576769} 12/22/2021 04:11:09 - INFO - codeparrot_training - Step 4226: {'lr': 0.0004918889487514788, 'samples': 2164224, 'steps': 4226, 'batch_loss/train': 0.8994815619662404} 12/22/2021 04:11:21 - INFO - codeparrot_training - Step 4227: {'lr': 0.0004918849655974601, 'samples': 2164736, 'steps': 4227, 'batch_loss/train': 0.7697261534631252} 12/22/2021 04:11:31 - INFO - codeparrot_training - Step 4228: 
{'lr': 0.0004918809814817983, 'samples': 2165248, 'steps': 4228, 'batch_loss/train': 0.7518858457915485} 12/22/2021 04:11:42 - INFO - codeparrot_training - Step 4229: {'lr': 0.000491876996404509, 'samples': 2165760, 'steps': 4229, 'batch_loss/train': 0.858812041580677} 12/22/2021 04:11:55 - INFO - codeparrot_training - Step 4230: {'lr': 0.0004918730103656081, 'samples': 2166272, 'steps': 4230, 'batch_loss/train': 0.8146057026460767} 12/22/2021 04:12:05 - INFO - codeparrot_training - Step 4231: {'lr': 0.0004918690233651113, 'samples': 2166784, 'steps': 4231, 'batch_loss/train': 0.8729232931509614} 12/22/2021 04:12:16 - INFO - codeparrot_training - Step 4232: {'lr': 0.0004918650354030347, 'samples': 2167296, 'steps': 4232, 'batch_loss/train': 0.6907808653777465} 12/22/2021 04:12:28 - INFO - codeparrot_training - Step 4233: {'lr': 0.000491861046479394, 'samples': 2167808, 'steps': 4233, 'batch_loss/train': 0.9290086850523949} 12/22/2021 04:12:39 - INFO - codeparrot_training - Step 4234: {'lr': 0.0004918570565942052, 'samples': 2168320, 'steps': 4234, 'batch_loss/train': 0.9269060548394918} 12/22/2021 04:12:49 - INFO - codeparrot_training - Step 4235: {'lr': 0.0004918530657474841, 'samples': 2168832, 'steps': 4235, 'batch_loss/train': 0.8176009701564908} 12/22/2021 04:13:01 - INFO - codeparrot_training - Step 4236: {'lr': 0.0004918490739392464, 'samples': 2169344, 'steps': 4236, 'batch_loss/train': 0.8197090420871973} 12/22/2021 04:13:12 - INFO - codeparrot_training - Step 4237: {'lr': 0.000491845081169508, 'samples': 2169856, 'steps': 4237, 'batch_loss/train': 0.6932108988985419} 12/22/2021 04:13:23 - INFO - codeparrot_training - Step 4238: {'lr': 0.0004918410874382852, 'samples': 2170368, 'steps': 4238, 'batch_loss/train': 0.8045588014647365} 12/22/2021 04:13:33 - INFO - codeparrot_training - Step 4239: {'lr': 0.0004918370927455933, 'samples': 2170880, 'steps': 4239, 'batch_loss/train': 0.8588037146255374} 12/22/2021 04:13:45 - INFO - codeparrot_training - Step 4240: 
{'lr': 0.0004918330970914485, 'samples': 2171392, 'steps': 4240, 'batch_loss/train': 0.9035855205729604} 12/22/2021 04:13:56 - INFO - codeparrot_training - Step 4241: {'lr': 0.0004918291004758666, 'samples': 2171904, 'steps': 4241, 'batch_loss/train': 0.8279299810528755} 12/22/2021 04:14:07 - INFO - codeparrot_training - Step 4242: {'lr': 0.0004918251028988635, 'samples': 2172416, 'steps': 4242, 'batch_loss/train': 0.8381521226838231} 12/22/2021 04:14:19 - INFO - codeparrot_training - Step 4243: {'lr': 0.0004918211043604551, 'samples': 2172928, 'steps': 4243, 'batch_loss/train': 0.8244550214149058} 12/22/2021 04:14:30 - INFO - codeparrot_training - Step 4244: {'lr': 0.0004918171048606573, 'samples': 2173440, 'steps': 4244, 'batch_loss/train': 0.8142349328845739} 12/22/2021 04:14:41 - INFO - codeparrot_training - Step 4245: {'lr': 0.0004918131043994861, 'samples': 2173952, 'steps': 4245, 'batch_loss/train': 0.8831957075744867} 12/22/2021 04:14:53 - INFO - codeparrot_training - Step 4246: {'lr': 0.0004918091029769572, 'samples': 2174464, 'steps': 4246, 'batch_loss/train': 0.7677228793036193} 12/22/2021 04:15:03 - INFO - codeparrot_training - Step 4247: {'lr': 0.0004918051005930866, 'samples': 2174976, 'steps': 4247, 'batch_loss/train': 0.9148524203337729} 12/22/2021 04:15:14 - INFO - codeparrot_training - Step 4248: {'lr': 0.0004918010972478903, 'samples': 2175488, 'steps': 4248, 'batch_loss/train': 0.7921742289327085} 12/22/2021 04:15:25 - INFO - codeparrot_training - Step 4249: {'lr': 0.0004917970929413841, 'samples': 2176000, 'steps': 4249, 'batch_loss/train': 0.9120081048458815} 12/22/2021 04:15:37 - INFO - codeparrot_training - Step 4250: {'lr': 0.0004917930876735839, 'samples': 2176512, 'steps': 4250, 'batch_loss/train': 0.8244171927217394} 12/22/2021 04:15:47 - INFO - codeparrot_training - Step 4251: {'lr': 0.0004917890814445057, 'samples': 2177024, 'steps': 4251, 'batch_loss/train': 0.7848007027059793} 12/22/2021 04:15:58 - INFO - codeparrot_training - Step 
4252: {'lr': 0.0004917850742541655, 'samples': 2177536, 'steps': 4252, 'batch_loss/train': 0.9193526171147823} 12/22/2021 04:16:11 - INFO - codeparrot_training - Step 4253: {'lr': 0.0004917810661025791, 'samples': 2178048, 'steps': 4253, 'batch_loss/train': 0.8568741017952561} 12/22/2021 04:16:21 - INFO - codeparrot_training - Step 4254: {'lr': 0.0004917770569897624, 'samples': 2178560, 'steps': 4254, 'batch_loss/train': 0.8943828670307994} 12/22/2021 04:16:32 - INFO - codeparrot_training - Step 4255: {'lr': 0.0004917730469157315, 'samples': 2179072, 'steps': 4255, 'batch_loss/train': 0.810413773637265} 12/22/2021 04:16:44 - INFO - codeparrot_training - Step 4256: {'lr': 0.0004917690358805022, 'samples': 2179584, 'steps': 4256, 'batch_loss/train': 0.9113845699466765} 12/22/2021 04:16:55 - INFO - codeparrot_training - Step 4257: {'lr': 0.0004917650238840906, 'samples': 2180096, 'steps': 4257, 'batch_loss/train': 0.8623527302406728} 12/22/2021 04:17:06 - INFO - codeparrot_training - Step 4258: {'lr': 0.0004917610109265125, 'samples': 2180608, 'steps': 4258, 'batch_loss/train': 0.8641026802361012} 12/22/2021 04:17:16 - INFO - codeparrot_training - Step 4259: {'lr': 0.0004917569970077838, 'samples': 2181120, 'steps': 4259, 'batch_loss/train': 0.8389777424745262} 12/22/2021 04:17:29 - INFO - codeparrot_training - Step 4260: {'lr': 0.0004917529821279207, 'samples': 2181632, 'steps': 4260, 'batch_loss/train': 0.8519561602734029} 12/22/2021 04:17:39 - INFO - codeparrot_training - Step 4261: {'lr': 0.0004917489662869389, 'samples': 2182144, 'steps': 4261, 'batch_loss/train': 0.9113058946095407} 12/22/2021 04:17:50 - INFO - codeparrot_training - Step 4262: {'lr': 0.0004917449494848546, 'samples': 2182656, 'steps': 4262, 'batch_loss/train': 0.7968486715108156} 12/22/2021 04:18:02 - INFO - codeparrot_training - Step 4263: {'lr': 0.0004917409317216836, 'samples': 2183168, 'steps': 4263, 'batch_loss/train': 0.8811958925798535} 12/22/2021 04:18:13 - INFO - codeparrot_training - 
Step 4264: {'lr': 0.000491736912997442, 'samples': 2183680, 'steps': 4264, 'batch_loss/train': 0.8660278837196529} 12/22/2021 04:18:23 - INFO - codeparrot_training - Step 4265: {'lr': 0.0004917328933121457, 'samples': 2184192, 'steps': 4265, 'batch_loss/train': 0.8070195340551436} 12/22/2021 04:18:36 - INFO - codeparrot_training - Step 4266: {'lr': 0.0004917288726658106, 'samples': 2184704, 'steps': 4266, 'batch_loss/train': 0.7929613990709186} 12/22/2021 04:18:46 - INFO - codeparrot_training - Step 4267: {'lr': 0.0004917248510584528, 'samples': 2185216, 'steps': 4267, 'batch_loss/train': 0.904638612177223} 12/22/2021 04:18:57 - INFO - codeparrot_training - Step 4268: {'lr': 0.0004917208284900883, 'samples': 2185728, 'steps': 4268, 'batch_loss/train': 0.9198421272449195} 12/22/2021 04:19:07 - INFO - codeparrot_training - Step 4269: {'lr': 0.0004917168049607331, 'samples': 2186240, 'steps': 4269, 'batch_loss/train': 0.82564309053123} 12/22/2021 04:19:20 - INFO - codeparrot_training - Step 4270: {'lr': 0.0004917127804704031, 'samples': 2186752, 'steps': 4270, 'batch_loss/train': 0.9663230860605836} 12/22/2021 04:19:31 - INFO - codeparrot_training - Step 4271: {'lr': 0.0004917087550191143, 'samples': 2187264, 'steps': 4271, 'batch_loss/train': 0.9898975621908903} 12/22/2021 04:19:41 - INFO - codeparrot_training - Step 4272: {'lr': 0.0004917047286068829, 'samples': 2187776, 'steps': 4272, 'batch_loss/train': 0.8915459495037794} 12/22/2021 04:19:54 - INFO - codeparrot_training - Step 4273: {'lr': 0.0004917007012337246, 'samples': 2188288, 'steps': 4273, 'batch_loss/train': 0.8515634760260582} 12/22/2021 04:20:04 - INFO - codeparrot_training - Step 4274: {'lr': 0.0004916966728996557, 'samples': 2188800, 'steps': 4274, 'batch_loss/train': 0.8997660977765918} 12/22/2021 04:20:15 - INFO - codeparrot_training - Step 4275: {'lr': 0.0004916926436046921, 'samples': 2189312, 'steps': 4275, 'batch_loss/train': 0.911219228990376} 12/22/2021 04:20:27 - INFO - codeparrot_training - 
Step 4276: {'lr': 0.0004916886133488496, 'samples': 2189824, 'steps': 4276, 'batch_loss/train': 0.9021259834989905} 12/22/2021 04:20:38 - INFO - codeparrot_training - Step 4277: {'lr': 0.0004916845821321446, 'samples': 2190336, 'steps': 4277, 'batch_loss/train': 0.9018486775457859} 12/22/2021 04:20:48 - INFO - codeparrot_training - Step 4278: {'lr': 0.000491680549954593, 'samples': 2190848, 'steps': 4278, 'batch_loss/train': 0.8041638755239546} 12/22/2021 04:20:59 - INFO - codeparrot_training - Step 4279: {'lr': 0.0004916765168162107, 'samples': 2191360, 'steps': 4279, 'batch_loss/train': 0.8360296441242099} 12/22/2021 04:21:11 - INFO - codeparrot_training - Step 4280: {'lr': 0.0004916724827170138, 'samples': 2191872, 'steps': 4280, 'batch_loss/train': 0.8398772007785738} 12/22/2021 04:21:21 - INFO - codeparrot_training - Step 4281: {'lr': 0.0004916684476570185, 'samples': 2192384, 'steps': 4281, 'batch_loss/train': 0.7536044353619218} 12/22/2021 04:21:32 - INFO - codeparrot_training - Step 4282: {'lr': 0.0004916644116362406, 'samples': 2192896, 'steps': 4282, 'batch_loss/train': 0.8800013419240713} 12/22/2021 04:21:45 - INFO - codeparrot_training - Step 4283: {'lr': 0.0004916603746546963, 'samples': 2193408, 'steps': 4283, 'batch_loss/train': 0.943959946744144} 12/22/2021 04:21:56 - INFO - codeparrot_training - Step 4284: {'lr': 0.0004916563367124015, 'samples': 2193920, 'steps': 4284, 'batch_loss/train': 1.0458155032247305} 12/22/2021 04:22:06 - INFO - codeparrot_training - Step 4285: {'lr': 0.0004916522978093724, 'samples': 2194432, 'steps': 4285, 'batch_loss/train': 0.8263000678271055} 12/22/2021 04:22:18 - INFO - codeparrot_training - Step 4286: {'lr': 0.0004916482579456251, 'samples': 2194944, 'steps': 4286, 'batch_loss/train': 0.6502634531352669} 12/22/2021 04:22:29 - INFO - codeparrot_training - Step 4287: {'lr': 0.0004916442171211754, 'samples': 2195456, 'steps': 4287, 'batch_loss/train': 0.7300446379231289} 12/22/2021 04:22:39 - INFO - codeparrot_training 
- Step 4288: {'lr': 0.0004916401753360396, 'samples': 2195968, 'steps': 4288, 'batch_loss/train': 0.7999651785939932} 12/22/2021 04:22:52 - INFO - codeparrot_training - Step 4289: {'lr': 0.0004916361325902338, 'samples': 2196480, 'steps': 4289, 'batch_loss/train': 0.8447217689827085} 12/22/2021 04:23:03 - INFO - codeparrot_training - Step 4290: {'lr': 0.0004916320888837739, 'samples': 2196992, 'steps': 4290, 'batch_loss/train': 1.4963058098219335} 12/22/2021 04:23:14 - INFO - codeparrot_training - Step 4291: {'lr': 0.0004916280442166759, 'samples': 2197504, 'steps': 4291, 'batch_loss/train': 0.8934101993218064} 12/22/2021 04:23:24 - INFO - codeparrot_training - Step 4292: {'lr': 0.0004916239985889563, 'samples': 2198016, 'steps': 4292, 'batch_loss/train': 0.8289038199000061} 12/22/2021 04:23:37 - INFO - codeparrot_training - Step 4293: {'lr': 0.0004916199520006308, 'samples': 2198528, 'steps': 4293, 'batch_loss/train': 0.6786834923550487} 12/22/2021 04:23:47 - INFO - codeparrot_training - Step 4294: {'lr': 0.0004916159044517156, 'samples': 2199040, 'steps': 4294, 'batch_loss/train': 0.7881987635046244} 12/22/2021 04:23:58 - INFO - codeparrot_training - Step 4295: {'lr': 0.0004916118559422267, 'samples': 2199552, 'steps': 4295, 'batch_loss/train': 0.841077177785337} 12/22/2021 04:24:10 - INFO - codeparrot_training - Step 4296: {'lr': 0.0004916078064721804, 'samples': 2200064, 'steps': 4296, 'batch_loss/train': 0.8503197869285941} 12/22/2021 04:24:20 - INFO - codeparrot_training - Step 4297: {'lr': 0.0004916037560415927, 'samples': 2200576, 'steps': 4297, 'batch_loss/train': 0.8673441684804857} 12/22/2021 04:24:31 - INFO - codeparrot_training - Step 4298: {'lr': 0.0004915997046504797, 'samples': 2201088, 'steps': 4298, 'batch_loss/train': 0.7526493291370571} 12/22/2021 04:24:44 - INFO - codeparrot_training - Step 4299: {'lr': 0.0004915956522988575, 'samples': 2201600, 'steps': 4299, 'batch_loss/train': 0.8334763366729021} 12/22/2021 04:24:54 - INFO - 
codeparrot_training - Step 4300: {'lr': 0.0004915915989867421, 'samples': 2202112, 'steps': 4300, 'batch_loss/train': 0.836633768863976} 12/22/2021 04:25:05 - INFO - codeparrot_training - Step 4301: {'lr': 0.0004915875447141498, 'samples': 2202624, 'steps': 4301, 'batch_loss/train': 0.8552018594928086} 12/22/2021 04:25:16 - INFO - codeparrot_training - Step 4302: {'lr': 0.0004915834894810967, 'samples': 2203136, 'steps': 4302, 'batch_loss/train': 0.819814101792872} 12/22/2021 04:25:28 - INFO - codeparrot_training - Step 4303: {'lr': 0.0004915794332875987, 'samples': 2203648, 'steps': 4303, 'batch_loss/train': 0.6780428306665272} 12/22/2021 04:25:38 - INFO - codeparrot_training - Step 4304: {'lr': 0.0004915753761336722, 'samples': 2204160, 'steps': 4304, 'batch_loss/train': 0.7966506266966462} 12/22/2021 04:25:49 - INFO - codeparrot_training - Step 4305: {'lr': 0.0004915713180193332, 'samples': 2204672, 'steps': 4305, 'batch_loss/train': 0.9035773146897554} 12/22/2021 04:26:01 - INFO - codeparrot_training - Step 4306: {'lr': 0.0004915672589445979, 'samples': 2205184, 'steps': 4306, 'batch_loss/train': 0.7459883964620531} 12/22/2021 04:26:12 - INFO - codeparrot_training - Step 4307: {'lr': 0.0004915631989094824, 'samples': 2205696, 'steps': 4307, 'batch_loss/train': 0.928390633314848} 12/22/2021 04:26:23 - INFO - codeparrot_training - Step 4308: {'lr': 0.0004915591379140027, 'samples': 2206208, 'steps': 4308, 'batch_loss/train': 0.8931056968867779} 12/22/2021 04:26:35 - INFO - codeparrot_training - Step 4309: {'lr': 0.0004915550759581753, 'samples': 2206720, 'steps': 4309, 'batch_loss/train': 0.8859048169106245} 12/22/2021 04:26:45 - INFO - codeparrot_training - Step 4310: {'lr': 0.0004915510130420158, 'samples': 2207232, 'steps': 4310, 'batch_loss/train': 0.8397634271532297} 12/22/2021 04:26:56 - INFO - codeparrot_training - Step 4311: {'lr': 0.000491546949165541, 'samples': 2207744, 'steps': 4311, 'batch_loss/train': 0.7807764271274209} 12/22/2021 04:27:07 - INFO - 
codeparrot_training - Step 4312: {'lr': 0.0004915428843287666, 'samples': 2208256, 'steps': 4312, 'batch_loss/train': 0.8280781302601099} 12/22/2021 04:27:19 - INFO - codeparrot_training - Step 4313: {'lr': 0.0004915388185317089, 'samples': 2208768, 'steps': 4313, 'batch_loss/train': 0.8607384227216244} 12/22/2021 04:27:30 - INFO - codeparrot_training - Step 4314: {'lr': 0.000491534751774384, 'samples': 2209280, 'steps': 4314, 'batch_loss/train': 0.8272448405623436} 12/22/2021 04:27:40 - INFO - codeparrot_training - Step 4315: {'lr': 0.0004915306840568083, 'samples': 2209792, 'steps': 4315, 'batch_loss/train': 0.871538856998086} 12/22/2021 04:27:52 - INFO - codeparrot_training - Step 4316: {'lr': 0.0004915266153789976, 'samples': 2210304, 'steps': 4316, 'batch_loss/train': 0.8684214428067207} 12/22/2021 04:28:03 - INFO - codeparrot_training - Step 4317: {'lr': 0.0004915225457409684, 'samples': 2210816, 'steps': 4317, 'batch_loss/train': 0.8546538418158889} 12/22/2021 04:28:13 - INFO - codeparrot_training - Step 4318: {'lr': 0.0004915184751427367, 'samples': 2211328, 'steps': 4318, 'batch_loss/train': 0.7153084180317819} 12/22/2021 04:28:26 - INFO - codeparrot_training - Step 4319: {'lr': 0.0004915144035843187, 'samples': 2211840, 'steps': 4319, 'batch_loss/train': 1.0290447762236} 12/22/2021 04:28:37 - INFO - codeparrot_training - Step 4320: {'lr': 0.0004915103310657307, 'samples': 2212352, 'steps': 4320, 'batch_loss/train': 0.9303810987621546} 12/22/2021 04:28:47 - INFO - codeparrot_training - Step 4321: {'lr': 0.0004915062575869888, 'samples': 2212864, 'steps': 4321, 'batch_loss/train': 0.8502654423937201} 12/22/2021 04:28:59 - INFO - codeparrot_training - Step 4322: {'lr': 0.0004915021831481092, 'samples': 2213376, 'steps': 4322, 'batch_loss/train': 0.9260749395471066} 12/22/2021 04:29:10 - INFO - codeparrot_training - Step 4323: {'lr': 0.0004914981077491081, 'samples': 2213888, 'steps': 4323, 'batch_loss/train': 0.8449598010629416} 12/22/2021 04:29:20 - INFO - 
codeparrot_training - Step 4324: {'lr': 0.0004914940313900017, 'samples': 2214400, 'steps': 4324, 'batch_loss/train': 0.905383724719286} 12/22/2021 04:29:31 - INFO - codeparrot_training - Step 4325: {'lr': 0.0004914899540708062, 'samples': 2214912, 'steps': 4325, 'batch_loss/train': 0.7623560610227287} 12/22/2021 04:29:43 - INFO - codeparrot_training - Step 4326: {'lr': 0.0004914858757915379, 'samples': 2215424, 'steps': 4326, 'batch_loss/train': 0.8833630029112101} 12/22/2021 04:29:54 - INFO - codeparrot_training - Step 4327: {'lr': 0.000491481796552213, 'samples': 2215936, 'steps': 4327, 'batch_loss/train': 0.7579985377378762} 12/22/2021 04:30:04 - INFO - codeparrot_training - Step 4328: {'lr': 0.0004914777163528474, 'samples': 2216448, 'steps': 4328, 'batch_loss/train': 0.8453649329021573} 12/22/2021 04:30:17 - INFO - codeparrot_training - Step 4329: {'lr': 0.0004914736351934578, 'samples': 2216960, 'steps': 4329, 'batch_loss/train': 0.8972917823120952} 12/22/2021 04:30:28 - INFO - codeparrot_training - Step 4330: {'lr': 0.0004914695530740601, 'samples': 2217472, 'steps': 4330, 'batch_loss/train': 0.8705263687297702} 12/22/2021 04:30:38 - INFO - codeparrot_training - Step 4331: {'lr': 0.0004914654699946706, 'samples': 2217984, 'steps': 4331, 'batch_loss/train': 0.8144289206247777} 12/22/2021 04:30:51 - INFO - codeparrot_training - Step 4332: {'lr': 0.0004914613859553057, 'samples': 2218496, 'steps': 4332, 'batch_loss/train': 0.8889299537986517} 12/22/2021 04:31:01 - INFO - codeparrot_training - Step 4333: {'lr': 0.0004914573009559814, 'samples': 2219008, 'steps': 4333, 'batch_loss/train': 0.834584609605372} 12/22/2021 04:31:12 - INFO - codeparrot_training - Step 4334: {'lr': 0.0004914532149967141, 'samples': 2219520, 'steps': 4334, 'batch_loss/train': 0.9567328579723835} 12/22/2021 04:31:22 - INFO - codeparrot_training - Step 4335: {'lr': 0.00049144912807752, 'samples': 2220032, 'steps': 4335, 'batch_loss/train': 0.8596513941884041} 12/22/2021 04:31:35 - INFO - 
codeparrot_training - Step 4336: {'lr': 0.0004914450401984152, 'samples': 2220544, 'steps': 4336, 'batch_loss/train': 0.8836327902972698} 12/22/2021 04:31:46 - INFO - codeparrot_training - Step 4337: {'lr': 0.0004914409513594163, 'samples': 2221056, 'steps': 4337, 'batch_loss/train': 0.8168367631733418} 12/22/2021 04:31:56 - INFO - codeparrot_training - Step 4338: {'lr': 0.0004914368615605391, 'samples': 2221568, 'steps': 4338, 'batch_loss/train': 0.8262974533718079} 12/22/2021 04:32:08 - INFO - codeparrot_training - Step 4339: {'lr': 0.0004914327708018003, 'samples': 2222080, 'steps': 4339, 'batch_loss/train': 0.8271156130358577} 12/22/2021 04:32:19 - INFO - codeparrot_training - Step 4340: {'lr': 0.0004914286790832159, 'samples': 2222592, 'steps': 4340, 'batch_loss/train': 0.7762467744760215} 12/22/2021 04:32:30 - INFO - codeparrot_training - Step 4341: {'lr': 0.0004914245864048022, 'samples': 2223104, 'steps': 4341, 'batch_loss/train': 0.7999018141999841} 12/22/2021 04:32:42 - INFO - codeparrot_training - Step 4342: {'lr': 0.0004914204927665755, 'samples': 2223616, 'steps': 4342, 'batch_loss/train': 0.8941382877528667} 12/22/2021 04:32:52 - INFO - codeparrot_training - Step 4343: {'lr': 0.0004914163981685521, 'samples': 2224128, 'steps': 4343, 'batch_loss/train': 0.8529383409768343} 12/22/2021 04:33:03 - INFO - codeparrot_training - Step 4344: {'lr': 0.0004914123026107483, 'samples': 2224640, 'steps': 4344, 'batch_loss/train': 0.8716491600498557} 12/22/2021 04:33:13 - INFO - codeparrot_training - Step 4345: {'lr': 0.0004914082060931803, 'samples': 2225152, 'steps': 4345, 'batch_loss/train': 0.8688375055789948} 12/22/2021 04:33:27 - INFO - codeparrot_training - Step 4346: {'lr': 0.0004914041086158643, 'samples': 2225664, 'steps': 4346, 'batch_loss/train': 0.7994059952907264} 12/22/2021 04:33:37 - INFO - codeparrot_training - Step 4347: {'lr': 0.0004914000101788169, 'samples': 2226176, 'steps': 4347, 'batch_loss/train': 0.8570217899978161} 12/22/2021 04:33:48 - 
INFO - codeparrot_training - Step 4348: {'lr': 0.0004913959107820541, 'samples': 2226688, 'steps': 4348, 'batch_loss/train': 0.759948680177331} 12/22/2021 04:34:00 - INFO - codeparrot_training - Step 4349: {'lr': 0.0004913918104255924, 'samples': 2227200, 'steps': 4349, 'batch_loss/train': 0.6070269639603794} 12/22/2021 04:34:11 - INFO - codeparrot_training - Step 4350: {'lr': 0.000491387709109448, 'samples': 2227712, 'steps': 4350, 'batch_loss/train': 0.7912876233458519} 12/22/2021 04:34:21 - INFO - codeparrot_training - Step 4351: {'lr': 0.0004913836068336371, 'samples': 2228224, 'steps': 4351, 'batch_loss/train': 0.7168423091061413} 12/22/2021 04:34:33 - INFO - codeparrot_training - Step 4352: {'lr': 0.0004913795035981762, 'samples': 2228736, 'steps': 4352, 'batch_loss/train': 0.8378981999121606} 12/22/2021 04:34:44 - INFO - codeparrot_training - Step 4353: {'lr': 0.0004913753994030815, 'samples': 2229248, 'steps': 4353, 'batch_loss/train': 0.8789275642484426} 12/22/2021 04:34:54 - INFO - codeparrot_training - Step 4354: {'lr': 0.0004913712942483695, 'samples': 2229760, 'steps': 4354, 'batch_loss/train': 0.9983461257070303} 12/22/2021 04:35:05 - INFO - codeparrot_training - Step 4355: {'lr': 0.0004913671881340562, 'samples': 2230272, 'steps': 4355, 'batch_loss/train': 0.8305387832224369} 12/22/2021 04:35:17 - INFO - codeparrot_training - Step 4356: {'lr': 0.0004913630810601581, 'samples': 2230784, 'steps': 4356, 'batch_loss/train': 0.8214098606258631} 12/22/2021 04:35:28 - INFO - codeparrot_training - Step 4357: {'lr': 0.0004913589730266917, 'samples': 2231296, 'steps': 4357, 'batch_loss/train': 0.8066215198487043} 12/22/2021 04:35:38 - INFO - codeparrot_training - Step 4358: {'lr': 0.0004913548640336729, 'samples': 2231808, 'steps': 4358, 'batch_loss/train': 0.8230480141937733} 12/22/2021 04:35:51 - INFO - codeparrot_training - Step 4359: {'lr': 0.0004913507540811185, 'samples': 2232320, 'steps': 4359, 'batch_loss/train': 0.7813420053571463} 12/22/2021 04:36:02 
- INFO - codeparrot_training - Step 4360: {'lr': 0.0004913466431690445, 'samples': 2232832, 'steps': 4360, 'batch_loss/train': 0.8544958839192986} 12/22/2021 04:36:12 - INFO - codeparrot_training - Step 4361: {'lr': 0.0004913425312974675, 'samples': 2233344, 'steps': 4361, 'batch_loss/train': 0.8760593868792057} 12/22/2021 04:36:24 - INFO - codeparrot_training - Step 4362: {'lr': 0.0004913384184664036, 'samples': 2233856, 'steps': 4362, 'batch_loss/train': 0.8553951736539602} 12/22/2021 04:36:35 - INFO - codeparrot_training - Step 4363: {'lr': 0.0004913343046758693, 'samples': 2234368, 'steps': 4363, 'batch_loss/train': 0.8247763961553574} 12/22/2021 04:36:46 - INFO - codeparrot_training - Step 4364: {'lr': 0.000491330189925881, 'samples': 2234880, 'steps': 4364, 'batch_loss/train': 0.824323145672679} 12/22/2021 04:36:58 - INFO - codeparrot_training - Step 4365: {'lr': 0.000491326074216455, 'samples': 2235392, 'steps': 4365, 'batch_loss/train': 0.7620609644800425} 12/22/2021 04:37:09 - INFO - codeparrot_training - Step 4366: {'lr': 0.0004913219575476076, 'samples': 2235904, 'steps': 4366, 'batch_loss/train': 0.8685972932726145} 12/22/2021 04:37:19 - INFO - codeparrot_training - Step 4367: {'lr': 0.0004913178399193551, 'samples': 2236416, 'steps': 4367, 'batch_loss/train': 0.8301494945771992} 12/22/2021 04:37:30 - INFO - codeparrot_training - Step 4368: {'lr': 0.0004913137213317141, 'samples': 2236928, 'steps': 4368, 'batch_loss/train': 0.8394230350386351} 12/22/2021 04:37:42 - INFO - codeparrot_training - Step 4369: {'lr': 0.0004913096017847008, 'samples': 2237440, 'steps': 4369, 'batch_loss/train': 0.7621159784030169} 12/22/2021 04:37:53 - INFO - codeparrot_training - Step 4370: {'lr': 0.0004913054812783318, 'samples': 2237952, 'steps': 4370, 'batch_loss/train': 0.8487726273015141} 12/22/2021 04:38:03 - INFO - codeparrot_training - Step 4371: {'lr': 0.0004913013598126231, 'samples': 2238464, 'steps': 4371, 'batch_loss/train': 0.800506578758359} 12/22/2021 04:38:15 
- INFO - codeparrot_training - Step 4372: {'lr': 0.0004912972373875915, 'samples': 2238976, 'steps': 4372, 'batch_loss/train': 0.865969062782824} 12/22/2021 04:38:26 - INFO - codeparrot_training - Step 4373: {'lr': 0.0004912931140032532, 'samples': 2239488, 'steps': 4373, 'batch_loss/train': 0.9593783812597394} 12/22/2021 04:38:37 - INFO - codeparrot_training - Step 4374: {'lr': 0.0004912889896596244, 'samples': 2240000, 'steps': 4374, 'batch_loss/train': 0.8752002874389291} 12/22/2021 04:38:50 - INFO - codeparrot_training - Step 4375: {'lr': 0.0004912848643567219, 'samples': 2240512, 'steps': 4375, 'batch_loss/train': 0.9686876833438873} 12/22/2021 04:39:01 - INFO - codeparrot_training - Step 4376: {'lr': 0.0004912807380945617, 'samples': 2241024, 'steps': 4376, 'batch_loss/train': 0.8089068531990051} 12/22/2021 04:39:11 - INFO - codeparrot_training - Step 4377: {'lr': 0.0004912766108731605, 'samples': 2241536, 'steps': 4377, 'batch_loss/train': 0.8065771916881204} 12/22/2021 04:39:22 - INFO - codeparrot_training - Step 4378: {'lr': 0.0004912724826925346, 'samples': 2242048, 'steps': 4378, 'batch_loss/train': 0.7044837472494692} 12/22/2021 04:39:34 - INFO - codeparrot_training - Step 4379: {'lr': 0.0004912683535527002, 'samples': 2242560, 'steps': 4379, 'batch_loss/train': 0.8578854100778699} 12/22/2021 04:39:45 - INFO - codeparrot_training - Step 4380: {'lr': 0.0004912642234536741, 'samples': 2243072, 'steps': 4380, 'batch_loss/train': 0.814509361051023} 12/22/2021 04:39:55 - INFO - codeparrot_training - Step 4381: {'lr': 0.0004912600923954727, 'samples': 2243584, 'steps': 4381, 'batch_loss/train': 0.8083323931787163} 12/22/2021 04:40:07 - INFO - codeparrot_training - Step 4382: {'lr': 0.000491255960378112, 'samples': 2244096, 'steps': 4382, 'batch_loss/train': 0.8257894795387983} 12/22/2021 04:40:18 - INFO - codeparrot_training - Step 4383: {'lr': 0.0004912518274016089, 'samples': 2244608, 'steps': 4383, 'batch_loss/train': 0.8662349423393607} 12/22/2021 
04:40:28 - INFO - codeparrot_training - Step 4384: {'lr': 0.0004912476934659795, 'samples': 2245120, 'steps': 4384, 'batch_loss/train': 0.8560129594989121} 12/22/2021 04:40:42 - INFO - codeparrot_training - Step 4385: {'lr': 0.0004912435585712404, 'samples': 2245632, 'steps': 4385, 'batch_loss/train': 0.8162845754995942} 12/22/2021 04:40:53 - INFO - codeparrot_training - Step 4386: {'lr': 0.0004912394227174079, 'samples': 2246144, 'steps': 4386, 'batch_loss/train': 0.7768817562609911} 12/22/2021 04:41:04 - INFO - codeparrot_training - Step 4387: {'lr': 0.0004912352859044988, 'samples': 2246656, 'steps': 4387, 'batch_loss/train': 0.8936093067750335} 12/22/2021 04:41:14 - INFO - codeparrot_training - Step 4388: {'lr': 0.0004912311481325291, 'samples': 2247168, 'steps': 4388, 'batch_loss/train': 1.1603546403348446} 12/22/2021 04:41:26 - INFO - codeparrot_training - Step 4389: {'lr': 0.0004912270094015155, 'samples': 2247680, 'steps': 4389, 'batch_loss/train': 1.0338610066100955} 12/22/2021 04:41:37 - INFO - codeparrot_training - Step 4390: {'lr': 0.0004912228697114743, 'samples': 2248192, 'steps': 4390, 'batch_loss/train': 0.8225042428821325} 12/22/2021 04:41:48 - INFO - codeparrot_training - Step 4391: {'lr': 0.0004912187290624221, 'samples': 2248704, 'steps': 4391, 'batch_loss/train': 0.8637838016729802} 12/22/2021 04:42:00 - INFO - codeparrot_training - Step 4392: {'lr': 0.0004912145874543753, 'samples': 2249216, 'steps': 4392, 'batch_loss/train': 0.9040963547304273} 12/22/2021 04:42:10 - INFO - codeparrot_training - Step 4393: {'lr': 0.0004912104448873504, 'samples': 2249728, 'steps': 4393, 'batch_loss/train': 0.7512270987499505} 12/22/2021 04:42:21 - INFO - codeparrot_training - Step 4394: {'lr': 0.0004912063013613639, 'samples': 2250240, 'steps': 4394, 'batch_loss/train': 0.6600972732994705} 12/22/2021 04:42:35 - INFO - codeparrot_training - Step 4395: {'lr': 0.0004912021568764322, 'samples': 2250752, 'steps': 4395, 'batch_loss/train': 0.8731367462314665} 
12/22/2021 04:42:46 - INFO - codeparrot_training - Step 4396: {'lr': 0.0004911980114325718, 'samples': 2251264, 'steps': 4396, 'batch_loss/train': 0.7806921643204987} 12/22/2021 04:42:56 - INFO - codeparrot_training - Step 4397: {'lr': 0.0004911938650297991, 'samples': 2251776, 'steps': 4397, 'batch_loss/train': 0.9399443846195936} 12/22/2021 04:43:07 - INFO - codeparrot_training - Step 4398: {'lr': 0.0004911897176681307, 'samples': 2252288, 'steps': 4398, 'batch_loss/train': 0.8369409227743745} 12/22/2021 04:43:19 - INFO - codeparrot_training - Step 4399: {'lr': 0.0004911855693475832, 'samples': 2252800, 'steps': 4399, 'batch_loss/train': 0.8093470465391874} 12/22/2021 04:43:29 - INFO - codeparrot_training - Step 4400: {'lr': 0.0004911814200681728, 'samples': 2253312, 'steps': 4400, 'batch_loss/train': 0.8003094922751188} 12/22/2021 04:43:40 - INFO - codeparrot_training - Step 4401: {'lr': 0.0004911772698299161, 'samples': 2253824, 'steps': 4401, 'batch_loss/train': 0.7825276516377926} 12/22/2021 04:43:54 - INFO - codeparrot_training - Step 4402: {'lr': 0.0004911731186328297, 'samples': 2254336, 'steps': 4402, 'batch_loss/train': 0.8577448213472962} 12/22/2021 04:44:04 - INFO - codeparrot_training - Step 4403: {'lr': 0.00049116896647693, 'samples': 2254848, 'steps': 4403, 'batch_loss/train': 0.8877853509038687} 12/22/2021 04:44:15 - INFO - codeparrot_training - Step 4404: {'lr': 0.0004911648133622337, 'samples': 2255360, 'steps': 4404, 'batch_loss/train': 0.8982436810620129} 12/22/2021 04:44:27 - INFO - codeparrot_training - Step 4405: {'lr': 0.000491160659288757, 'samples': 2255872, 'steps': 4405, 'batch_loss/train': 0.7663659891113639} 12/22/2021 04:44:38 - INFO - codeparrot_training - Step 4406: {'lr': 0.0004911565042565166, 'samples': 2256384, 'steps': 4406, 'batch_loss/train': 0.8552565285935998} 12/22/2021 04:44:49 - INFO - codeparrot_training - Step 4407: {'lr': 0.0004911523482655291, 'samples': 2256896, 'steps': 4407, 'batch_loss/train': 
0.8909098757430911} 12/22/2021 04:45:01 - INFO - codeparrot_training - Step 4408: {'lr': 0.0004911481913158108, 'samples': 2257408, 'steps': 4408, 'batch_loss/train': 0.8994851433672011} 12/22/2021 04:45:11 - INFO - codeparrot_training - Step 4409: {'lr': 0.0004911440334073784, 'samples': 2257920, 'steps': 4409, 'batch_loss/train': 0.9619122911244631} 12/22/2021 04:45:22 - INFO - codeparrot_training - Step 4410: {'lr': 0.0004911398745402484, 'samples': 2258432, 'steps': 4410, 'batch_loss/train': 0.8882929319515824} 12/22/2021 04:45:33 - INFO - codeparrot_training - Step 4411: {'lr': 0.0004911357147144373, 'samples': 2258944, 'steps': 4411, 'batch_loss/train': 0.8700050804764032} 12/22/2021 04:45:46 - INFO - codeparrot_training - Step 4412: {'lr': 0.0004911315539299616, 'samples': 2259456, 'steps': 4412, 'batch_loss/train': 0.8474257923662663} 12/22/2021 04:45:57 - INFO - codeparrot_training - Step 4413: {'lr': 0.0004911273921868379, 'samples': 2259968, 'steps': 4413, 'batch_loss/train': 1.0131636802107096} 12/22/2021 04:46:08 - INFO - codeparrot_training - Step 4414: {'lr': 0.0004911232294850828, 'samples': 2260480, 'steps': 4414, 'batch_loss/train': 0.8477210476994514} 12/22/2021 04:46:20 - INFO - codeparrot_training - Step 4415: {'lr': 0.0004911190658247127, 'samples': 2260992, 'steps': 4415, 'batch_loss/train': 0.842244123108685} 12/22/2021 04:46:30 - INFO - codeparrot_training - Step 4416: {'lr': 0.0004911149012057443, 'samples': 2261504, 'steps': 4416, 'batch_loss/train': 0.8421527352184057} 12/22/2021 04:46:41 - INFO - codeparrot_training - Step 4417: {'lr': 0.0004911107356281941, 'samples': 2262016, 'steps': 4417, 'batch_loss/train': 0.8741696048527956} 12/22/2021 04:46:53 - INFO - codeparrot_training - Step 4418: {'lr': 0.0004911065690920785, 'samples': 2262528, 'steps': 4418, 'batch_loss/train': 0.8470067149028182} 12/22/2021 04:47:04 - INFO - codeparrot_training - Step 4419: {'lr': 0.0004911024015974144, 'samples': 2263040, 'steps': 4419, 
'batch_loss/train': 0.7909521497786045} 12/22/2021 04:47:15 - INFO - codeparrot_training - Step 4420: {'lr': 0.0004910982331442182, 'samples': 2263552, 'steps': 4420, 'batch_loss/train': 0.8368228925392032} 12/22/2021 04:47:25 - INFO - codeparrot_training - Step 4421: {'lr': 0.0004910940637325063, 'samples': 2264064, 'steps': 4421, 'batch_loss/train': 1.0501950066536665} 12/22/2021 04:47:39 - INFO - codeparrot_training - Step 4422: {'lr': 0.0004910898933622956, 'samples': 2264576, 'steps': 4422, 'batch_loss/train': 0.6129891681484878} 12/22/2021 04:47:50 - INFO - codeparrot_training - Step 4423: {'lr': 0.0004910857220336024, 'samples': 2265088, 'steps': 4423, 'batch_loss/train': 0.8376237573102117} 12/22/2021 04:48:00 - INFO - codeparrot_training - Step 4424: {'lr': 0.0004910815497464434, 'samples': 2265600, 'steps': 4424, 'batch_loss/train': 0.8774800039827824} 12/22/2021 04:48:12 - INFO - codeparrot_training - Step 4425: {'lr': 0.0004910773765008353, 'samples': 2266112, 'steps': 4425, 'batch_loss/train': 0.9039821373298764} 12/22/2021 04:48:23 - INFO - codeparrot_training - Step 4426: {'lr': 0.0004910732022967945, 'samples': 2266624, 'steps': 4426, 'batch_loss/train': 0.7905682311393321} 12/22/2021 04:48:34 - INFO - codeparrot_training - Step 4427: {'lr': 0.0004910690271343376, 'samples': 2267136, 'steps': 4427, 'batch_loss/train': 0.8590633049607277} 12/22/2021 04:48:44 - INFO - codeparrot_training - Step 4428: {'lr': 0.0004910648510134814, 'samples': 2267648, 'steps': 4428, 'batch_loss/train': 0.7075971263693646} 12/22/2021 04:48:56 - INFO - codeparrot_training - Step 4429: {'lr': 0.0004910606739342424, 'samples': 2268160, 'steps': 4429, 'batch_loss/train': 0.9245136110112071} 12/22/2021 04:49:07 - INFO - codeparrot_training - Step 4430: {'lr': 0.000491056495896637, 'samples': 2268672, 'steps': 4430, 'batch_loss/train': 0.8254746170714498} 12/22/2021 04:49:18 - INFO - codeparrot_training - Step 4431: {'lr': 0.0004910523169006822, 'samples': 2269184, 'steps': 
4431, 'batch_loss/train': 0.8274789536371827} 12/22/2021 04:49:32 - INFO - codeparrot_training - Step 4432: {'lr': 0.0004910481369463942, 'samples': 2269696, 'steps': 4432, 'batch_loss/train': 0.8326917092781514} 12/22/2021 04:49:42 - INFO - codeparrot_training - Step 4433: {'lr': 0.00049104395603379, 'samples': 2270208, 'steps': 4433, 'batch_loss/train': 0.9098015194758773} 12/22/2021 04:49:53 - INFO - codeparrot_training - Step 4434: {'lr': 0.000491039774162886, 'samples': 2270720, 'steps': 4434, 'batch_loss/train': 0.8259419649839401} 12/22/2021 04:50:05 - INFO - codeparrot_training - Step 4435: {'lr': 0.0004910355913336988, 'samples': 2271232, 'steps': 4435, 'batch_loss/train': 0.8773518428206444} 12/22/2021 04:50:16 - INFO - codeparrot_training - Step 4436: {'lr': 0.0004910314075462452, 'samples': 2271744, 'steps': 4436, 'batch_loss/train': 0.735341764986515} 12/22/2021 04:50:26 - INFO - codeparrot_training - Step 4437: {'lr': 0.0004910272228005417, 'samples': 2272256, 'steps': 4437, 'batch_loss/train': 0.800392840988934} 12/22/2021 04:50:38 - INFO - codeparrot_training - Step 4438: {'lr': 0.0004910230370966049, 'samples': 2272768, 'steps': 4438, 'batch_loss/train': 0.9243511683307588} 12/22/2021 04:50:49 - INFO - codeparrot_training - Step 4439: {'lr': 0.0004910188504344516, 'samples': 2273280, 'steps': 4439, 'batch_loss/train': 0.7621354758739471} 12/22/2021 04:51:00 - INFO - codeparrot_training - Step 4440: {'lr': 0.0004910146628140983, 'samples': 2273792, 'steps': 4440, 'batch_loss/train': 0.7615276854485273} 12/22/2021 04:51:10 - INFO - codeparrot_training - Step 4441: {'lr': 0.0004910104742355617, 'samples': 2274304, 'steps': 4441, 'batch_loss/train': 0.8453859235160053} 12/22/2021 04:51:24 - INFO - codeparrot_training - Step 4442: {'lr': 0.0004910062846988584, 'samples': 2274816, 'steps': 4442, 'batch_loss/train': 0.7246930431574583} 12/22/2021 04:51:35 - INFO - codeparrot_training - Step 4443: {'lr': 0.0004910020942040053, 'samples': 2275328, 'steps': 
4443, 'batch_loss/train': 0.8424735520966351} 12/22/2021 04:51:45 - INFO - codeparrot_training - Step 4444: {'lr': 0.0004909979027510188, 'samples': 2275840, 'steps': 4444, 'batch_loss/train': 0.7691285111941397} 12/22/2021 04:51:57 - INFO - codeparrot_training - Step 4445: {'lr': 0.0004909937103399156, 'samples': 2276352, 'steps': 4445, 'batch_loss/train': 0.9281305223703384} 12/22/2021 04:52:08 - INFO - codeparrot_training - Step 4446: {'lr': 0.0004909895169707124, 'samples': 2276864, 'steps': 4446, 'batch_loss/train': 0.9803616432473063} 12/22/2021 04:52:19 - INFO - codeparrot_training - Step 4447: {'lr': 0.0004909853226434259, 'samples': 2277376, 'steps': 4447, 'batch_loss/train': 0.8627274129539728} 12/22/2021 04:52:31 - INFO - codeparrot_training - Step 4448: {'lr': 0.0004909811273580727, 'samples': 2277888, 'steps': 4448, 'batch_loss/train': 0.738302931189537} 12/22/2021 04:52:41 - INFO - codeparrot_training - Step 4449: {'lr': 0.0004909769311146696, 'samples': 2278400, 'steps': 4449, 'batch_loss/train': 0.9979618107900023} 12/22/2021 04:52:52 - INFO - codeparrot_training - Step 4450: {'lr': 0.0004909727339132331, 'samples': 2278912, 'steps': 4450, 'batch_loss/train': 0.8984070168808103} 12/22/2021 04:53:03 - INFO - codeparrot_training - Step 4451: {'lr': 0.00049096853575378, 'samples': 2279424, 'steps': 4451, 'batch_loss/train': 0.9306482020765543} 12/22/2021 04:53:16 - INFO - codeparrot_training - Step 4452: {'lr': 0.0004909643366363271, 'samples': 2279936, 'steps': 4452, 'batch_loss/train': 0.8965840209275484} 12/22/2021 04:53:27 - INFO - codeparrot_training - Step 4453: {'lr': 0.000490960136560891, 'samples': 2280448, 'steps': 4453, 'batch_loss/train': 0.8636475945822895} 12/22/2021 04:53:38 - INFO - codeparrot_training - Step 4454: {'lr': 0.0004909559355274883, 'samples': 2280960, 'steps': 4454, 'batch_loss/train': 0.8603773410432041} 12/22/2021 04:53:50 - INFO - codeparrot_training - Step 4455: {'lr': 0.0004909517335361358, 'samples': 2281472, 'steps': 
4455, 'batch_loss/train': 0.8115797969512641} 12/22/2021 04:54:00 - INFO - codeparrot_training - Step 4456: {'lr': 0.0004909475305868501, 'samples': 2281984, 'steps': 4456, 'batch_loss/train': 0.9128471277654171} 12/22/2021 04:54:11 - INFO - codeparrot_training - Step 4457: {'lr': 0.0004909433266796482, 'samples': 2282496, 'steps': 4457, 'batch_loss/train': 0.7981330892071128} 12/22/2021 04:54:23 - INFO - codeparrot_training - Step 4458: {'lr': 0.0004909391218145464, 'samples': 2283008, 'steps': 4458, 'batch_loss/train': 0.920263109728694} 12/22/2021 04:54:34 - INFO - codeparrot_training - Step 4459: {'lr': 0.0004909349159915617, 'samples': 2283520, 'steps': 4459, 'batch_loss/train': 0.9429194652475417} 12/22/2021 04:54:44 - INFO - codeparrot_training - Step 4460: {'lr': 0.0004909307092107108, 'samples': 2284032, 'steps': 4460, 'batch_loss/train': 0.8637862885370851} 12/22/2021 04:54:58 - INFO - codeparrot_training - Step 4461: {'lr': 0.0004909265014720103, 'samples': 2284544, 'steps': 4461, 'batch_loss/train': 0.7900959574617445} 12/22/2021 04:55:09 - INFO - codeparrot_training - Step 4462: {'lr': 0.000490922292775477, 'samples': 2285056, 'steps': 4462, 'batch_loss/train': 0.9256550362333655} 12/22/2021 04:55:19 - INFO - codeparrot_training - Step 4463: {'lr': 0.0004909180831211277, 'samples': 2285568, 'steps': 4463, 'batch_loss/train': 0.710028724744916} 12/22/2021 04:55:30 - INFO - codeparrot_training - Step 4464: {'lr': 0.0004909138725089791, 'samples': 2286080, 'steps': 4464, 'batch_loss/train': 0.9054735414683819} 12/22/2021 04:55:42 - INFO - codeparrot_training - Step 4465: {'lr': 0.0004909096609390478, 'samples': 2286592, 'steps': 4465, 'batch_loss/train': 0.8041528251487762} 12/22/2021 04:55:53 - INFO - codeparrot_training - Step 4466: {'lr': 0.0004909054484113507, 'samples': 2287104, 'steps': 4466, 'batch_loss/train': 0.8063451442867517} 12/22/2021 04:56:04 - INFO - codeparrot_training - Step 4467: {'lr': 0.0004909012349259045, 'samples': 2287616, 
'steps': 4467, 'batch_loss/train': 0.7996640615165234}
12/22/2021 04:56:16 - INFO - codeparrot_training - Step 4468: {'lr': 0.000490897020482726, 'samples': 2288128, 'steps': 4468, 'batch_loss/train': 0.8639720734208822}
12/22/2021 04:56:26 - INFO - codeparrot_training - Step 4469: {'lr': 0.0004908928050818319, 'samples': 2288640, 'steps': 4469, 'batch_loss/train': 0.9351622872054577}
12/22/2021 04:56:37 - INFO - codeparrot_training - Step 4470: {'lr': 0.0004908885887232389, 'samples': 2289152, 'steps': 4470, 'batch_loss/train': 0.8561831591650844}
12/22/2021 04:56:51 - INFO - codeparrot_training - Step 4471: {'lr': 0.0004908843714069638, 'samples': 2289664, 'steps': 4471, 'batch_loss/train': 0.8860693546012044}
12/22/2021 04:57:02 - INFO - codeparrot_training - Step 4472: {'lr': 0.0004908801531330235, 'samples': 2290176, 'steps': 4472, 'batch_loss/train': 0.8034948045387864}
12/22/2021 04:57:13 - INFO - codeparrot_training - Step 4473: {'lr': 0.0004908759339014346, 'samples': 2290688, 'steps': 4473, 'batch_loss/train': 0.8965884735807776}
12/22/2021 04:57:23 - INFO - codeparrot_training - Step 4474: {'lr': 0.0004908717137122139, 'samples': 2291200, 'steps': 4474, 'batch_loss/train': 0.9972191522829235}
12/22/2021 04:57:35 - INFO - codeparrot_training - Step 4475: {'lr': 0.0004908674925653783, 'samples': 2291712, 'steps': 4475, 'batch_loss/train': 1.1217596540227532}
12/22/2021 04:57:46 - INFO - codeparrot_training - Step 4476: {'lr': 0.0004908632704609445, 'samples': 2292224, 'steps': 4476, 'batch_loss/train': 0.8806676249951124}
12/22/2021 04:57:57 - INFO - codeparrot_training - Step 4477: {'lr': 0.0004908590473989291, 'samples': 2292736, 'steps': 4477, 'batch_loss/train': 0.8897327240556479}
12/22/2021 04:58:11 - INFO - codeparrot_training - Step 4478: {'lr': 0.0004908548233793493, 'samples': 2293248, 'steps': 4478, 'batch_loss/train': 0.8133748946711421}
12/22/2021 04:58:21 - INFO - codeparrot_training - Step 4479: {'lr': 0.0004908505984022216, 'samples': 2293760, 'steps': 4479, 'batch_loss/train': 0.9163514608517289}
12/22/2021 04:58:32 - INFO - codeparrot_training - Step 4480: {'lr': 0.0004908463724675628, 'samples': 2294272, 'steps': 4480, 'batch_loss/train': 0.8814560770988464}
12/22/2021 04:58:44 - INFO - codeparrot_training - Step 4481: {'lr': 0.0004908421455753899, 'samples': 2294784, 'steps': 4481, 'batch_loss/train': 0.8898887718096375}
12/22/2021 04:58:55 - INFO - codeparrot_training - Step 4482: {'lr': 0.0004908379177257194, 'samples': 2295296, 'steps': 4482, 'batch_loss/train': 0.8804589658975601}
12/22/2021 04:59:05 - INFO - codeparrot_training - Step 4483: {'lr': 0.0004908336889185684, 'samples': 2295808, 'steps': 4483, 'batch_loss/train': 0.8639893550425768}
12/22/2021 04:59:16 - INFO - codeparrot_training - Step 4484: {'lr': 0.0004908294591539535, 'samples': 2296320, 'steps': 4484, 'batch_loss/train': 0.7909381031058729}
12/22/2021 04:59:28 - INFO - codeparrot_training - Step 4485: {'lr': 0.0004908252284318916, 'samples': 2296832, 'steps': 4485, 'batch_loss/train': 0.8721143761649728}
12/22/2021 04:59:38 - INFO - codeparrot_training - Step 4486: {'lr': 0.0004908209967523996, 'samples': 2297344, 'steps': 4486, 'batch_loss/train': 0.8771800005342811}
12/22/2021 04:59:49 - INFO - codeparrot_training - Step 4487: {'lr': 0.0004908167641154942, 'samples': 2297856, 'steps': 4487, 'batch_loss/train': 0.7540180124342442}
12/22/2021 05:00:01 - INFO - codeparrot_training - Step 4488: {'lr': 0.0004908125305211923, 'samples': 2298368, 'steps': 4488, 'batch_loss/train': 0.7781521873548627}
12/22/2021 05:00:12 - INFO - codeparrot_training - Step 4489: {'lr': 0.0004908082959695107, 'samples': 2298880, 'steps': 4489, 'batch_loss/train': 0.8174245655536652}
12/22/2021 05:00:23 - INFO - codeparrot_training - Step 4490: {'lr': 0.0004908040604604662, 'samples': 2299392, 'steps': 4490, 'batch_loss/train': 0.8766718963161111}
12/22/2021 05:00:37 - INFO - codeparrot_training - Step 4491: {'lr': 0.0004907998239940758, 'samples': 2299904, 'steps': 4491, 'batch_loss/train': 0.8892779489979148}
12/22/2021 05:00:47 - INFO - codeparrot_training - Step 4492: {'lr': 0.000490795586570356, 'samples': 2300416, 'steps': 4492, 'batch_loss/train': 0.8672011904418468}
12/22/2021 05:00:58 - INFO - codeparrot_training - Step 4493: {'lr': 0.000490791348189324, 'samples': 2300928, 'steps': 4493, 'batch_loss/train': 0.8793912557885051}
12/22/2021 05:01:09 - INFO - codeparrot_training - Step 4494: {'lr': 0.0004907871088509966, 'samples': 2301440, 'steps': 4494, 'batch_loss/train': 0.5006250035949051}
12/22/2021 05:01:21 - INFO - codeparrot_training - Step 4495: {'lr': 0.0004907828685553904, 'samples': 2301952, 'steps': 4495, 'batch_loss/train': 0.8914337689056993}
12/22/2021 05:01:31 - INFO - codeparrot_training - Step 4496: {'lr': 0.0004907786273025226, 'samples': 2302464, 'steps': 4496, 'batch_loss/train': 0.7561483760364354}
12/22/2021 05:01:42 - INFO - codeparrot_training - Step 4497: {'lr': 0.0004907743850924098, 'samples': 2302976, 'steps': 4497, 'batch_loss/train': 0.8203543098643422}
12/22/2021 05:01:54 - INFO - codeparrot_training - Step 4498: {'lr': 0.000490770141925069, 'samples': 2303488, 'steps': 4498, 'batch_loss/train': 0.8177482038736343}
12/22/2021 05:02:05 - INFO - codeparrot_training - Step 4499: {'lr': 0.000490765897800517, 'samples': 2304000, 'steps': 4499, 'batch_loss/train': 0.972559662302956}
12/22/2021 05:02:16 - INFO - codeparrot_training - Step 4500: {'lr': 0.0004907616527187707, 'samples': 2304512, 'steps': 4500, 'batch_loss/train': 0.9467790741473436}
12/22/2021 05:02:29 - INFO - codeparrot_training - Step 4501: {'lr': 0.000490757406679847, 'samples': 2305024, 'steps': 4501, 'batch_loss/train': 0.8586147530004382}
12/22/2021 05:02:40 - INFO - codeparrot_training - Step 4502: {'lr': 0.0004907531596837628, 'samples': 2305536, 'steps': 4502, 'batch_loss/train': 0.9199285414069891}
12/22/2021 05:02:51 - INFO - codeparrot_training - Step 4503: {'lr': 0.0004907489117305349, 'samples': 2306048, 'steps': 4503, 'batch_loss/train': 0.9137954544275999}
12/22/2021 05:03:01 - INFO - codeparrot_training - Step 4504: {'lr': 0.0004907446628201801, 'samples': 2306560, 'steps': 4504, 'batch_loss/train': 0.9680771278217435}
12/22/2021 05:03:13 - INFO - codeparrot_training - Step 4505: {'lr': 0.0004907404129527156, 'samples': 2307072, 'steps': 4505, 'batch_loss/train': 0.919852661434561}
12/22/2021 05:03:24 - INFO - codeparrot_training - Step 4506: {'lr': 0.000490736162128158, 'samples': 2307584, 'steps': 4506, 'batch_loss/train': 0.9246813748031855}
12/22/2021 05:03:34 - INFO - codeparrot_training - Step 4507: {'lr': 0.0004907319103465244, 'samples': 2308096, 'steps': 4507, 'batch_loss/train': 0.9155094735324383}
12/22/2021 05:03:48 - INFO - codeparrot_training - Step 4508: {'lr': 0.0004907276576078315, 'samples': 2308608, 'steps': 4508, 'batch_loss/train': 0.9555933214724064}
12/22/2021 05:03:59 - INFO - codeparrot_training - Step 4509: {'lr': 0.0004907234039120965, 'samples': 2309120, 'steps': 4509, 'batch_loss/train': 0.9638576582074165}
12/22/2021 05:04:10 - INFO - codeparrot_training - Step 4510: {'lr': 0.000490719149259336, 'samples': 2309632, 'steps': 4510, 'batch_loss/train': 0.8549946490675211}
12/22/2021 05:04:21 - INFO - codeparrot_training - Step 4511: {'lr': 0.0004907148936495672, 'samples': 2310144, 'steps': 4511, 'batch_loss/train': 0.8434853479266167}
12/22/2021 05:04:32 - INFO - codeparrot_training - Step 4512: {'lr': 0.0004907106370828068, 'samples': 2310656, 'steps': 4512, 'batch_loss/train': 0.8765092082321644}
12/22/2021 05:04:43 - INFO - codeparrot_training - Step 4513: {'lr': 0.0004907063795590717, 'samples': 2311168, 'steps': 4513, 'batch_loss/train': 0.9490054761990905}
12/22/2021 05:04:53 - INFO - codeparrot_training - Step 4514: {'lr': 0.0004907021210783791, 'samples': 2311680, 'steps': 4514, 'batch_loss/train': 0.6650582491420209}
12/22/2021 05:05:05 - INFO - codeparrot_training - Step 4515: {'lr': 0.0004906978616407456, 'samples': 2312192, 'steps': 4515, 'batch_loss/train': 0.9259910238906741}
12/22/2021 05:05:16 - INFO - codeparrot_training - Step 4516: {'lr': 0.0004906936012461883, 'samples': 2312704, 'steps': 4516, 'batch_loss/train': 0.8779153395444155}
12/22/2021 05:05:27 - INFO - codeparrot_training - Step 4517: {'lr': 0.0004906893398947242, 'samples': 2313216, 'steps': 4517, 'batch_loss/train': 0.8220060290768743}
12/22/2021 05:05:39 - INFO - codeparrot_training - Step 4518: {'lr': 0.0004906850775863701, 'samples': 2313728, 'steps': 4518, 'batch_loss/train': 0.895450628362596}
12/22/2021 05:05:50 - INFO - codeparrot_training - Step 4519: {'lr': 0.0004906808143211431, 'samples': 2314240, 'steps': 4519, 'batch_loss/train': 0.8014659222681075}
12/22/2021 05:06:01 - INFO - codeparrot_training - Step 4520: {'lr': 0.00049067655009906, 'samples': 2314752, 'steps': 4520, 'batch_loss/train': 0.7161218263208866}
12/22/2021 05:06:13 - INFO - codeparrot_training - Step 4521: {'lr': 0.0004906722849201378, 'samples': 2315264, 'steps': 4521, 'batch_loss/train': 0.7995186583139002}
12/22/2021 05:06:24 - INFO - codeparrot_training - Step 4522: {'lr': 0.0004906680187843935, 'samples': 2315776, 'steps': 4522, 'batch_loss/train': 0.8536771470680833}
12/22/2021 05:06:34 - INFO - codeparrot_training - Step 4523: {'lr': 0.000490663751691844, 'samples': 2316288, 'steps': 4523, 'batch_loss/train': 0.8579108947888017}
12/22/2021 05:06:46 - INFO - codeparrot_training - Step 4524: {'lr': 0.0004906594836425062, 'samples': 2316800, 'steps': 4524, 'batch_loss/train': 0.8683698941022158}
12/22/2021 05:06:57 - INFO - codeparrot_training - Step 4525: {'lr': 0.0004906552146363973, 'samples': 2317312, 'steps': 4525, 'batch_loss/train': 0.6334838517941535}
12/22/2021 05:07:08 - INFO - codeparrot_training - Step 4526: {'lr': 0.0004906509446735341, 'samples': 2317824, 'steps': 4526, 'batch_loss/train': 0.9095436092466116}
12/22/2021 05:07:18 - INFO - codeparrot_training - Step 4527: {'lr': 0.0004906466737539337, 'samples': 2318336, 'steps': 4527, 'batch_loss/train': 0.8403153244871646}
12/22/2021 05:07:30 - INFO - codeparrot_training - Step 4528: {'lr': 0.0004906424018776129, 'samples': 2318848, 'steps': 4528, 'batch_loss/train': 0.7996029583737254}
12/22/2021 05:07:41 - INFO - codeparrot_training - Step 4529: {'lr': 0.0004906381290445888, 'samples': 2319360, 'steps': 4529, 'batch_loss/train': 0.9649016633629799}
12/22/2021 05:07:52 - INFO - codeparrot_training - Step 4530: {'lr': 0.0004906338552548783, 'samples': 2319872, 'steps': 4530, 'batch_loss/train': 0.735960575286299}
12/22/2021 05:08:04 - INFO - codeparrot_training - Step 4531: {'lr': 0.0004906295805084985, 'samples': 2320384, 'steps': 4531, 'batch_loss/train': 0.8512612003833055}
12/22/2021 05:08:15 - INFO - codeparrot_training - Step 4532: {'lr': 0.0004906253048054663, 'samples': 2320896, 'steps': 4532, 'batch_loss/train': 0.7424246231094003}
12/22/2021 05:08:26 - INFO - codeparrot_training - Step 4533: {'lr': 0.0004906210281457987, 'samples': 2321408, 'steps': 4533, 'batch_loss/train': 0.862918977625668}
12/22/2021 05:08:36 - INFO - codeparrot_training - Step 4534: {'lr': 0.0004906167505295129, 'samples': 2321920, 'steps': 4534, 'batch_loss/train': 1.1041178461164236}
12/22/2021 05:08:48 - INFO - codeparrot_training - Step 4535: {'lr': 0.0004906124719566258, 'samples': 2322432, 'steps': 4535, 'batch_loss/train': 0.955032299272716}
12/22/2021 05:08:59 - INFO - codeparrot_training - Step 4536: {'lr': 0.0004906081924271542, 'samples': 2322944, 'steps': 4536, 'batch_loss/train': 0.8470449068117887}
12/22/2021 05:09:09 - INFO - codeparrot_training - Step 4537: {'lr': 0.0004906039119411152, 'samples': 2323456, 'steps': 4537, 'batch_loss/train': 0.9465688238851726}
12/22/2021 05:09:22 - INFO - codeparrot_training - Step 4538: {'lr': 0.0004905996304985261, 'samples': 2323968, 'steps': 4538, 'batch_loss/train': 0.9312055530026555}
12/22/2021 05:09:32 - INFO - codeparrot_training - Step 4539: {'lr': 0.0004905953480994036, 'samples': 2324480, 'steps': 4539, 'batch_loss/train': 0.8200385058298707}
12/22/2021 05:09:43 - INFO - codeparrot_training - Step 4540: {'lr': 0.0004905910647437649, 'samples': 2324992, 'steps': 4540, 'batch_loss/train': 0.6266604211414233}
12/22/2021 05:09:56 - INFO - codeparrot_training - Step 4541: {'lr': 0.0004905867804316269, 'samples': 2325504, 'steps': 4541, 'batch_loss/train': 0.9382193014025688}
12/22/2021 05:10:07 - INFO - codeparrot_training - Step 4542: {'lr': 0.0004905824951630068, 'samples': 2326016, 'steps': 4542, 'batch_loss/train': 0.8781315619125962}
12/22/2021 05:10:17 - INFO - codeparrot_training - Step 4543: {'lr': 0.0004905782089379215, 'samples': 2326528, 'steps': 4543, 'batch_loss/train': 0.939152397448197}
12/22/2021 05:10:28 - INFO - codeparrot_training - Step 4544: {'lr': 0.000490573921756388, 'samples': 2327040, 'steps': 4544, 'batch_loss/train': 1.0296334475278854}
12/22/2021 05:10:40 - INFO - codeparrot_training - Step 4545: {'lr': 0.0004905696336184235, 'samples': 2327552, 'steps': 4545, 'batch_loss/train': 0.8271511709317565}
12/22/2021 05:10:51 - INFO - codeparrot_training - Step 4546: {'lr': 0.000490565344524045, 'samples': 2328064, 'steps': 4546, 'batch_loss/train': 0.8622129997238517}
12/22/2021 05:11:01 - INFO - codeparrot_training - Step 4547: {'lr': 0.0004905610544732695, 'samples': 2328576, 'steps': 4547, 'batch_loss/train': 0.9305679742246866}
12/22/2021 05:11:14 - INFO - codeparrot_training - Step 4548: {'lr': 0.000490556763466114, 'samples': 2329088, 'steps': 4548, 'batch_loss/train': 0.85017781984061}
12/22/2021 05:11:25 - INFO - codeparrot_training - Step 4549: {'lr': 0.0004905524715025958, 'samples': 2329600, 'steps': 4549, 'batch_loss/train': 0.8331185262650251}
12/22/2021 05:11:35 - INFO - codeparrot_training - Step 4550: {'lr': 0.0004905481785827317, 'samples': 2330112, 'steps': 4550, 'batch_loss/train': 0.8713033171370625}
12/22/2021 05:11:47 - INFO - codeparrot_training - Step 4551: {'lr': 0.0004905438847065388, 'samples': 2330624, 'steps': 4551, 'batch_loss/train': 0.8570882566273212}
12/22/2021 05:11:58 - INFO - codeparrot_training - Step 4552: {'lr': 0.0004905395898740344, 'samples': 2331136, 'steps': 4552, 'batch_loss/train': 0.8495499640703201}
12/22/2021 05:12:08 - INFO - codeparrot_training - Step 4553: {'lr': 0.0004905352940852353, 'samples': 2331648, 'steps': 4553, 'batch_loss/train': 0.8811652674339712}
12/22/2021 05:12:19 - INFO - codeparrot_training - Step 4554: {'lr': 0.0004905309973401587, 'samples': 2332160, 'steps': 4554, 'batch_loss/train': 0.7779306762386113}
12/22/2021 05:12:31 - INFO - codeparrot_training - Step 4555: {'lr': 0.0004905266996388217, 'samples': 2332672, 'steps': 4555, 'batch_loss/train': 0.8681117771193385}
12/22/2021 05:12:42 - INFO - codeparrot_training - Step 4556: {'lr': 0.0004905224009812413, 'samples': 2333184, 'steps': 4556, 'batch_loss/train': 0.8347870726138353}
12/22/2021 05:12:52 - INFO - codeparrot_training - Step 4557: {'lr': 0.0004905181013674347, 'samples': 2333696, 'steps': 4557, 'batch_loss/train': 0.8342241114005446}
12/22/2021 05:13:05 - INFO - codeparrot_training - Step 4558: {'lr': 0.0004905138007974191, 'samples': 2334208, 'steps': 4558, 'batch_loss/train': 0.8152538002468646}
12/22/2021 05:13:16 - INFO - codeparrot_training - Step 4559: {'lr': 0.0004905094992712113, 'samples': 2334720, 'steps': 4559, 'batch_loss/train': 0.8247867645695806}
12/22/2021 05:13:26 - INFO - codeparrot_training - Step 4560: {'lr': 0.0004905051967888285, 'samples': 2335232, 'steps': 4560, 'batch_loss/train': 0.9078517146408558}
12/22/2021 05:13:38 - INFO - codeparrot_training - Step 4561: {'lr': 0.0004905008933502879, 'samples': 2335744, 'steps': 4561, 'batch_loss/train': 0.7458389364182949}
12/22/2021 05:13:49 - INFO - codeparrot_training - Step 4562: {'lr': 0.0004904965889556066, 'samples': 2336256, 'steps': 4562, 'batch_loss/train': 0.9573906604200602}
12/22/2021 05:14:00 - INFO - codeparrot_training - Step 4563: {'lr': 0.0004904922836048016, 'samples': 2336768, 'steps': 4563, 'batch_loss/train': 0.8746203677728772}
12/22/2021 05:14:12 - INFO - codeparrot_training - Step 4564: {'lr': 0.00049048797729789, 'samples': 2337280, 'steps': 4564, 'batch_loss/train': 0.7793659372255206}
12/22/2021 05:14:23 - INFO - codeparrot_training - Step 4565: {'lr': 0.0004904836700348891, 'samples': 2337792, 'steps': 4565, 'batch_loss/train': 0.890867137350142}
12/22/2021 05:14:33 - INFO - codeparrot_training - Step 4566: {'lr': 0.000490479361815816, 'samples': 2338304, 'steps': 4566, 'batch_loss/train': 0.8718393961898983}
12/22/2021 05:14:44 - INFO - codeparrot_training - Step 4567: {'lr': 0.0004904750526406877, 'samples': 2338816, 'steps': 4567, 'batch_loss/train': 0.9218784379772842}
12/22/2021 05:14:56 - INFO - codeparrot_training - Step 4568: {'lr': 0.0004904707425095214, 'samples': 2339328, 'steps': 4568, 'batch_loss/train': 0.8522556377574801}
12/22/2021 05:15:06 - INFO - codeparrot_training - Step 4569: {'lr': 0.0004904664314223342, 'samples': 2339840, 'steps': 4569, 'batch_loss/train': 0.8754020165652037}
12/22/2021 05:15:17 - INFO - codeparrot_training - Step 4570: {'lr': 0.0004904621193791433, 'samples': 2340352, 'steps': 4570, 'batch_loss/train': 0.8524485467933118}
12/22/2021 05:15:30 - INFO - codeparrot_training - Step 4571: {'lr': 0.0004904578063799659, 'samples': 2340864, 'steps': 4571, 'batch_loss/train': 0.8596793340984732}
12/22/2021 05:15:40 - INFO - codeparrot_training - Step 4572: {'lr': 0.0004904534924248189, 'samples': 2341376, 'steps': 4572, 'batch_loss/train': 0.8204932883381844}
12/22/2021 05:15:51 - INFO - codeparrot_training - Step 4573: {'lr': 0.0004904491775137197, 'samples': 2341888, 'steps': 4573, 'batch_loss/train': 0.7975587798282504}
12/22/2021 05:16:03 - INFO - codeparrot_training - Step 4574: {'lr': 0.0004904448616466852, 'samples': 2342400, 'steps': 4574, 'batch_loss/train': 0.8540769312530756}
12/22/2021 05:16:14 - INFO - codeparrot_training - Step 4575: {'lr': 0.0004904405448237329, 'samples': 2342912, 'steps': 4575, 'batch_loss/train': 0.7354127075523138}
12/22/2021 05:16:24 - INFO - codeparrot_training - Step 4576: {'lr': 0.0004904362270448798, 'samples': 2343424, 'steps': 4576, 'batch_loss/train': 0.8587034177035093}
12/22/2021 05:16:35 - INFO - codeparrot_training - Step 4577: {'lr': 0.0004904319083101429, 'samples': 2343936, 'steps': 4577, 'batch_loss/train': 0.8105435948818922}
12/22/2021 05:16:48 - INFO - codeparrot_training - Step 4578: {'lr': 0.0004904275886195396, 'samples': 2344448, 'steps': 4578, 'batch_loss/train': 0.8528214311227202}
12/22/2021 05:16:58 - INFO - codeparrot_training - Step 4579: {'lr': 0.0004904232679730869, 'samples': 2344960, 'steps': 4579, 'batch_loss/train': 0.8448275746777654}
12/22/2021 05:17:09 - INFO - codeparrot_training - Step 4580: {'lr': 0.0004904189463708022, 'samples': 2345472, 'steps': 4580, 'batch_loss/train': 0.7996860770508647}
12/22/2021 05:17:21 - INFO - codeparrot_training - Step 4581: {'lr': 0.0004904146238127024, 'samples': 2345984, 'steps': 4581, 'batch_loss/train': 0.7189640321303159}
12/22/2021 05:17:32 - INFO - codeparrot_training - Step 4582: {'lr': 0.000490410300298805, 'samples': 2346496, 'steps': 4582, 'batch_loss/train': 0.9284991305321455}
12/22/2021 05:17:42 - INFO - codeparrot_training - Step 4583: {'lr': 0.0004904059758291269, 'samples': 2347008, 'steps': 4583, 'batch_loss/train': 0.824628762784414}
12/22/2021 05:17:54 - INFO - codeparrot_training - Step 4584: {'lr': 0.0004904016504036855, 'samples': 2347520, 'steps': 4584, 'batch_loss/train': 0.8455335032194853}
12/22/2021 05:18:05 - INFO - codeparrot_training - Step 4585: {'lr': 0.0004903973240224978, 'samples': 2348032, 'steps': 4585, 'batch_loss/train': 0.7312934116926044}
12/22/2021 05:18:16 - INFO - codeparrot_training - Step 4586: {'lr': 0.0004903929966855812, 'samples': 2348544, 'steps': 4586, 'batch_loss/train': 0.9430673681199551}
12/22/2021 05:18:26 - INFO - codeparrot_training - Step 4587: {'lr': 0.0004903886683929528, 'samples': 2349056, 'steps': 4587, 'batch_loss/train': 0.8869965933263302}
12/22/2021 05:18:39 - INFO - codeparrot_training - Step 4588: {'lr': 0.0004903843391446298, 'samples': 2349568, 'steps': 4588, 'batch_loss/train': 0.9145477656275034}
12/22/2021 05:18:50 - INFO - codeparrot_training - Step 4589: {'lr': 0.0004903800089406294, 'samples': 2350080, 'steps': 4589, 'batch_loss/train': 0.8991410974413157}
12/22/2021 05:19:00 - INFO - codeparrot_training - Step 4590: {'lr': 0.000490375677780969, 'samples': 2350592, 'steps': 4590, 'batch_loss/train': 0.9018108081072569}
12/22/2021 05:19:12 - INFO - codeparrot_training - Step 4591: {'lr': 0.0004903713456656654, 'samples': 2351104, 'steps': 4591, 'batch_loss/train': 0.8571840059012175}
12/22/2021 05:19:23 - INFO - codeparrot_training - Step 4592: {'lr': 0.0004903670125947364, 'samples': 2351616, 'steps': 4592, 'batch_loss/train': 0.9608303466811776}
12/22/2021 05:19:34 - INFO - codeparrot_training - Step 4593: {'lr': 0.0004903626785681987, 'samples': 2352128, 'steps': 4593, 'batch_loss/train': 0.7886392744258046}
12/22/2021 05:19:47 - INFO - codeparrot_training - Step 4594: {'lr': 0.0004903583435860697, 'samples': 2352640, 'steps': 4594, 'batch_loss/train': 0.8474338166415691}
12/22/2021 05:19:57 - INFO - codeparrot_training - Step 4595: {'lr': 0.0004903540076483669, 'samples': 2353152, 'steps': 4595, 'batch_loss/train': 0.883853274397552}
12/22/2021 05:20:08 - INFO - codeparrot_training - Step 4596: {'lr': 0.0004903496707551072, 'samples': 2353664, 'steps': 4596, 'batch_loss/train': 0.9092966383323073}
12/22/2021 05:20:18 - INFO - codeparrot_training - Step 4597: {'lr': 0.000490345332906308, 'samples': 2354176, 'steps': 4597, 'batch_loss/train': 0.8070572186261415}
12/22/2021 05:20:31 - INFO - codeparrot_training - Step 4598: {'lr': 0.0004903409941019864, 'samples': 2354688, 'steps': 4598, 'batch_loss/train': 0.9133243560791016}
12/22/2021 05:20:41 - INFO - codeparrot_training - Step 4599: {'lr': 0.00049033665434216, 'samples': 2355200, 'steps': 4599, 'batch_loss/train': 0.8495410340838134}
12/22/2021 05:20:52 - INFO - codeparrot_training - Step 4600: {'lr': 0.0004903323136268455, 'samples': 2355712, 'steps': 4600, 'batch_loss/train': 1.0855380753055215}
12/22/2021 05:21:04 - INFO - codeparrot_training - Step 4601: {'lr': 0.0004903279719560607, 'samples': 2356224, 'steps': 4601, 'batch_loss/train': 0.8661505524069071}
12/22/2021 05:21:15 - INFO - codeparrot_training - Step 4602: {'lr': 0.0004903236293298225, 'samples': 2356736, 'steps': 4602, 'batch_loss/train': 0.6889974935911596}
12/22/2021 05:21:25 - INFO - codeparrot_training - Step 4603: {'lr': 0.0004903192857481484, 'samples': 2357248, 'steps': 4603, 'batch_loss/train': 0.8174928883090615}
12/22/2021 05:21:37 - INFO - codeparrot_training - Step 4604: {'lr': 0.0004903149412110554, 'samples': 2357760, 'steps': 4604, 'batch_loss/train': 0.7811952559277415}
12/22/2021 05:21:48 - INFO - codeparrot_training - Step 4605: {'lr': 0.0004903105957185611, 'samples': 2358272, 'steps': 4605, 'batch_loss/train': 0.8003590672742575}
12/22/2021 05:21:59 - INFO - codeparrot_training - Step 4606: {'lr': 0.0004903062492706825, 'samples': 2358784, 'steps': 4606, 'batch_loss/train': 0.9205117197707295}
12/22/2021 05:22:11 - INFO - codeparrot_training - Step 4607: {'lr': 0.000490301901867437, 'samples': 2359296, 'steps': 4607, 'batch_loss/train': 0.9959717895835638}
12/22/2021 05:22:22 - INFO - codeparrot_training - Step 4608: {'lr': 0.000490297553508842, 'samples': 2359808, 'steps': 4608, 'batch_loss/train': 0.873446230776608}
12/22/2021 05:22:33 - INFO - codeparrot_training - Step 4609: {'lr': 0.0004902932041949145, 'samples': 2360320, 'steps': 4609, 'batch_loss/train': 1.0240419576875865}
12/22/2021 05:22:43 - INFO - codeparrot_training - Step 4610: {'lr': 0.000490288853925672, 'samples': 2360832, 'steps': 4610, 'batch_loss/train': 0.9231995670124888}
12/22/2021 05:22:55 - INFO - codeparrot_training - Step 4611: {'lr': 0.0004902845027011318, 'samples': 2361344, 'steps': 4611, 'batch_loss/train': 0.8554347716271877}
12/22/2021 05:23:06 - INFO - codeparrot_training - Step 4612: {'lr': 0.0004902801505213111, 'samples': 2361856, 'steps': 4612, 'batch_loss/train': 0.869145855307579}
12/22/2021 05:23:17 - INFO - codeparrot_training - Step 4613: {'lr': 0.0004902757973862272, 'samples': 2362368, 'steps': 4613, 'batch_loss/train': 0.9361643120646477}
12/22/2021 05:23:29 - INFO - codeparrot_training - Step 4614: {'lr': 0.0004902714432958975, 'samples': 2362880, 'steps': 4614, 'batch_loss/train': 0.8538794470950961}
12/22/2021 05:23:39 - INFO - codeparrot_training - Step 4615: {'lr': 0.0004902670882503393, 'samples': 2363392, 'steps': 4615, 'batch_loss/train': 0.9098050566390157}
12/22/2021 05:23:50 - INFO - codeparrot_training - Step 4616: {'lr': 0.0004902627322495699, 'samples': 2363904, 'steps': 4616, 'batch_loss/train': 0.899462531786412}
12/22/2021 05:24:03 - INFO - codeparrot_training - Step 4617: {'lr': 0.0004902583752936066, 'samples': 2364416, 'steps': 4617, 'batch_loss/train': 0.8209617389366031}
12/22/2021 05:24:13 - INFO - codeparrot_training - Step 4618: {'lr': 0.0004902540173824665, 'samples': 2364928, 'steps': 4618, 'batch_loss/train': 0.8263073544949293}
12/22/2021 05:24:24 - INFO - codeparrot_training - Step 4619: {'lr': 0.0004902496585161673, 'samples': 2365440, 'steps': 4619, 'batch_loss/train': 1.0453309882432222}
12/22/2021 05:24:35 - INFO - codeparrot_training - Step 4620: {'lr': 0.0004902452986947262, 'samples': 2365952, 'steps': 4620, 'batch_loss/train': 2.1080478746443987}
12/22/2021 05:24:47 - INFO - codeparrot_training - Step 4621: {'lr': 0.0004902409379181605, 'samples': 2366464, 'steps': 4621, 'batch_loss/train': 0.9085658937692642}
12/22/2021 05:24:57 - INFO - codeparrot_training - Step 4622: {'lr': 0.0004902365761864875, 'samples': 2366976, 'steps': 4622, 'batch_loss/train': 0.8818166870623827}
12/22/2021 05:25:08 - INFO - codeparrot_training - Step 4623: {'lr': 0.0004902322134997247, 'samples': 2367488, 'steps': 4623, 'batch_loss/train': 0.794861315516755}
12/22/2021 05:25:21 - INFO - codeparrot_training - Step 4624: {'lr': 0.0004902278498578891, 'samples': 2368000, 'steps': 4624, 'batch_loss/train': 0.6933790950570256}
12/22/2021 05:25:32 - INFO - codeparrot_training - Step 4625: {'lr': 0.0004902234852609985, 'samples': 2368512, 'steps': 4625, 'batch_loss/train': 1.360093167051673}
12/22/2021 05:25:43 - INFO - codeparrot_training - Step 4626: {'lr': 0.00049021911970907, 'samples': 2369024, 'steps': 4626, 'batch_loss/train': 1.0006458293646574}
12/22/2021 05:25:53 - INFO - codeparrot_training - Step 4627: {'lr': 0.0004902147532021208, 'samples': 2369536, 'steps': 4627, 'batch_loss/train': 0.6165992827154696}
12/22/2021 05:26:05 - INFO - codeparrot_training - Step 4628: {'lr': 0.0004902103857401686, 'samples': 2370048, 'steps': 4628, 'batch_loss/train': 0.9190306598320603}
12/22/2021 05:26:16 - INFO - codeparrot_training - Step 4629: {'lr': 0.0004902060173232306, 'samples': 2370560, 'steps': 4629, 'batch_loss/train': 0.8391031995415688}
12/22/2021 05:26:27 - INFO - codeparrot_training - Step 4630: {'lr': 0.0004902016479513241, 'samples': 2371072, 'steps': 4630, 'batch_loss/train': 0.8331723753362894}
12/22/2021 05:26:39 - INFO - codeparrot_training - Step 4631: {'lr': 0.0004901972776244667, 'samples': 2371584, 'steps': 4631, 'batch_loss/train': 0.9789747772738338}
12/22/2021 05:26:49 - INFO - codeparrot_training - Step 4632: {'lr': 0.0004901929063426755, 'samples': 2372096, 'steps': 4632, 'batch_loss/train': 0.8740890370681882}
12/22/2021 05:27:00 - INFO - codeparrot_training - Step 4633: {'lr': 0.000490188534105968, 'samples': 2372608, 'steps': 4633, 'batch_loss/train': 0.9647713704034686}
12/22/2021 05:27:10 - INFO - codeparrot_training - Step 4634: {'lr': 0.0004901841609143615, 'samples': 2373120, 'steps': 4634, 'batch_loss/train': 0.9329115336295217}
12/22/2021 05:27:23 - INFO - codeparrot_training - Step 4635: {'lr': 0.0004901797867678736, 'samples': 2373632, 'steps': 4635, 'batch_loss/train': 0.7030652347020805}
12/22/2021 05:27:34 - INFO - codeparrot_training - Step 4636: {'lr': 0.0004901754116665214, 'samples': 2374144, 'steps': 4636, 'batch_loss/train': 0.9464276684448123}
12/22/2021 05:27:45 - INFO - codeparrot_training - Step 4637: {'lr': 0.0004901710356103227, 'samples': 2374656, 'steps': 4637, 'batch_loss/train': 0.7658823998644948}
12/22/2021 05:27:57 - INFO - codeparrot_training - Step 4638: {'lr': 0.0004901666585992945, 'samples': 2375168, 'steps': 4638, 'batch_loss/train': 0.8512355457060039}
12/22/2021 05:28:07 - INFO - codeparrot_training - Step 4639: {'lr': 0.0004901622806334543, 'samples': 2375680, 'steps': 4639, 'batch_loss/train': 0.9532221604604274}
12/22/2021 05:28:18 - INFO - codeparrot_training - Step 4640: {'lr': 0.0004901579017128196, 'samples': 2376192, 'steps': 4640, 'batch_loss/train': 0.8292650077491999}
12/22/2021 05:28:30 - INFO - codeparrot_training - Step 4641: {'lr': 0.0004901535218374078, 'samples': 2376704, 'steps': 4641, 'batch_loss/train': 0.8650107309222221}
12/22/2021 05:28:41 - INFO - codeparrot_training - Step 4642: {'lr': 0.0004901491410072363, 'samples': 2377216, 'steps': 4642, 'batch_loss/train': 0.8228635438717902}
12/22/2021 05:28:51 - INFO - codeparrot_training - Step 4643: {'lr': 0.0004901447592223224, 'samples': 2377728, 'steps': 4643, 'batch_loss/train': 0.7810203351546079}
12/22/2021 05:29:04 - INFO - codeparrot_training - Step 4644: {'lr': 0.0004901403764826836, 'samples': 2378240, 'steps': 4644, 'batch_loss/train': 0.7865223467815667}
12/22/2021 05:29:15 - INFO - codeparrot_training - Step 4645: {'lr': 0.0004901359927883374, 'samples': 2378752, 'steps': 4645, 'batch_loss/train': 0.8611668609082699}
12/22/2021 05:29:25 - INFO - codeparrot_training - Step 4646: {'lr': 0.0004901316081393012, 'samples': 2379264, 'steps': 4646, 'batch_loss/train': 0.9094913341104984}
12/22/2021 05:29:36 - INFO - codeparrot_training - Step 4647: {'lr': 0.0004901272225355923, 'samples': 2379776, 'steps': 4647, 'batch_loss/train': 0.839872170239687}
12/22/2021 05:29:48 - INFO - codeparrot_training - Step 4648: {'lr': 0.0004901228359772283, 'samples': 2380288, 'steps': 4648, 'batch_loss/train': 0.8667188506806269}
12/22/2021 05:29:59 - INFO - codeparrot_training - Step 4649: {'lr': 0.0004901184484642266, 'samples': 2380800, 'steps': 4649, 'batch_loss/train': 0.9003892475739121}
12/22/2021 05:30:09 - INFO - codeparrot_training - Step 4650: {'lr': 0.0004901140599966045, 'samples': 2381312, 'steps': 4650, 'batch_loss/train': 0.9060362004674971}
12/22/2021 05:30:21 - INFO - codeparrot_training - Step 4651: {'lr': 0.0004901096705743796, 'samples': 2381824, 'steps': 4651, 'batch_loss/train': 0.8143169092945755}
12/22/2021 05:30:32 - INFO - codeparrot_training - Step 4652: {'lr': 0.0004901052801975694, 'samples': 2382336, 'steps': 4652, 'batch_loss/train': 0.8696380525361747}
12/22/2021 05:30:43 - INFO - codeparrot_training - Step 4653: {'lr': 0.0004901008888661912, 'samples': 2382848, 'steps': 4653, 'batch_loss/train': 0.8638692544773221}
12/22/2021 05:30:53 - INFO - codeparrot_training - Step 4654: {'lr': 0.0004900964965802626, 'samples': 2383360, 'steps': 4654, 'batch_loss/train': 0.8818233939819038}
12/22/2021 05:31:06 - INFO - codeparrot_training - Step 4655: {'lr': 0.0004900921033398008, 'samples': 2383872, 'steps': 4655, 'batch_loss/train': 0.8776472159661353}
12/22/2021 05:31:17 - INFO - codeparrot_training - Step 4656: {'lr': 0.0004900877091448235, 'samples': 2384384, 'steps': 4656, 'batch_loss/train': 0.7758060521446168}
12/22/2021 05:31:27 - INFO - codeparrot_training - Step 4657: {'lr': 0.0004900833139953484, 'samples': 2384896, 'steps': 4657, 'batch_loss/train': 0.9206443731673062}
12/22/2021 05:31:40 - INFO - codeparrot_training - Step 4658: {'lr': 0.0004900789178913924, 'samples': 2385408, 'steps': 4658, 'batch_loss/train': 0.7933828285895288}
12/22/2021 05:31:50 - INFO - codeparrot_training - Step 4659: {'lr': 0.0004900745208329733, 'samples': 2385920, 'steps': 4659, 'batch_loss/train': 0.8929068697616458}
12/22/2021 05:32:01 - INFO - codeparrot_training - Step 4660: {'lr': 0.0004900701228201087, 'samples': 2386432, 'steps': 4660, 'batch_loss/train': 0.9193641571328044}
12/22/2021 05:32:13 - INFO - codeparrot_training - Step 4661: {'lr': 0.0004900657238528158, 'samples': 2386944, 'steps': 4661, 'batch_loss/train': 1.526666210964322}
12/22/2021 05:32:24 - INFO - codeparrot_training - Step 4662: {'lr': 0.0004900613239311123, 'samples': 2387456, 'steps': 4662, 'batch_loss/train': 0.8656842103227973}
12/22/2021 05:32:34 - INFO - codeparrot_training - Step 4663: {'lr': 0.0004900569230550156, 'samples': 2387968, 'steps': 4663, 'batch_loss/train': 0.8517012582160532}
12/22/2021 05:32:45 - INFO - codeparrot_training - Step 4664: {'lr': 0.0004900525212245433, 'samples': 2388480, 'steps': 4664, 'batch_loss/train': 1.277628567069769}
12/22/2021 05:32:58 - INFO - codeparrot_training - Step 4665: {'lr': 0.0004900481184397127, 'samples': 2388992, 'steps': 4665, 'batch_loss/train': 0.9328899951651692}
12/22/2021 05:33:09 - INFO - codeparrot_training - Step 4666: {'lr': 0.0004900437147005414, 'samples': 2389504, 'steps': 4666, 'batch_loss/train': 0.8792198719456792}
12/22/2021 05:33:19 - INFO - codeparrot_training - Step 4667: {'lr': 0.0004900393100070469, 'samples': 2390016, 'steps': 4667, 'batch_loss/train': 0.9007315933704376}
12/22/2021 05:33:31 - INFO - codeparrot_training - Step 4668: {'lr': 0.0004900349043592469, 'samples': 2390528, 'steps': 4668, 'batch_loss/train': 0.8137525953352451}
12/22/2021 05:33:42 - INFO - codeparrot_training - Step 4669: {'lr': 0.0004900304977571586, 'samples': 2391040, 'steps': 4669, 'batch_loss/train': 0.8157358483877033}
12/22/2021 05:33:52 - INFO - codeparrot_training - Step 4670: {'lr': 0.0004900260902007998, 'samples': 2391552, 'steps': 4670, 'batch_loss/train': 0.8861128818243742}
12/22/2021 05:34:04 - INFO - codeparrot_training - Step 4671: {'lr': 0.0004900216816901878, 'samples': 2392064, 'steps': 4671, 'batch_loss/train': 0.8891404131427407}
12/22/2021 05:34:15 - INFO - codeparrot_training - Step 4672: {'lr': 0.0004900172722253403, 'samples': 2392576, 'steps': 4672, 'batch_loss/train': 0.8815219579264522}
12/22/2021 05:34:26 - INFO - codeparrot_training - Step 4673: {'lr': 0.0004900128618062747, 'samples': 2393088, 'steps': 4673, 'batch_loss/train': 0.9583655903115869}
12/22/2021 05:34:36 - INFO - codeparrot_training - Step 4674: {'lr': 0.0004900084504330086, 'samples': 2393600, 'steps': 4674, 'batch_loss/train': 0.8290217723697424}
12/22/2021 05:34:49 - INFO - codeparrot_training - Step 4675: {'lr': 0.0004900040381055594, 'samples': 2394112, 'steps': 4675, 'batch_loss/train': 0.7887717741541564}
12/22/2021 05:34:59 - INFO - codeparrot_training - Step 4676: {'lr': 0.000489999624823945, 'samples': 2394624, 'steps': 4676, 'batch_loss/train': 0.8688865583389997}
12/22/2021 05:35:10 - INFO - codeparrot_training - Step 4677: {'lr': 0.0004899952105881826, 'samples': 2395136, 'steps': 4677, 'batch_loss/train': 0.9816875448450446}
12/22/2021 05:35:22 - INFO - codeparrot_training - Step 4678: {'lr': 0.0004899907953982899, 'samples': 2395648, 'steps': 4678, 'batch_loss/train': 1.2295084111392498}
12/22/2021 05:35:33 - INFO - codeparrot_training - Step 4679: {'lr': 0.0004899863792542843, 'samples': 2396160, 'steps': 4679, 'batch_loss/train': 0.9045986756682396}
12/22/2021 05:35:43 - INFO - codeparrot_training - Step 4680: {'lr': 0.0004899819621561836, 'samples': 2396672, 'steps': 4680, 'batch_loss/train': 0.9453831277787685}
12/22/2021 05:35:56 - INFO - codeparrot_training - Step 4681: {'lr': 0.0004899775441040052, 'samples': 2397184, 'steps': 4681, 'batch_loss/train': 0.8209117446094751}
12/22/2021 05:36:07 - INFO - codeparrot_training - Step 4682: {'lr': 0.0004899731250977667, 'samples': 2397696, 'steps': 4682, 'batch_loss/train': 0.7099014499690384}
12/22/2021 05:36:17 - INFO - codeparrot_training - Step 4683: {'lr': 0.0004899687051374856, 'samples': 2398208, 'steps': 4683, 'batch_loss/train': 0.8359908619895577}
12/22/2021 05:36:28 - INFO - codeparrot_training - Step 4684: {'lr': 0.0004899642842231797, 'samples': 2398720, 'steps': 4684, 'batch_loss/train': 0.8578286489937454}
12/22/2021 05:36:40 - INFO - codeparrot_training - Step 4685: {'lr': 0.0004899598623548662, 'samples': 2399232, 'steps': 4685, 'batch_loss/train': 0.9778510965406895}
12/22/2021 05:36:51 - INFO - codeparrot_training - Step 4686: {'lr': 0.000489955439532563, 'samples': 2399744, 'steps': 4686, 'batch_loss/train': 1.667997750453651}
12/22/2021 05:37:01 - INFO - codeparrot_training - Step 4687: {'lr': 0.0004899510157562875, 'samples': 2400256, 'steps': 4687, 'batch_loss/train': 0.8966680308803916}
12/22/2021 05:37:13 - INFO - codeparrot_training - Step 4688: {'lr': 0.0004899465910260575, 'samples': 2400768, 'steps': 4688, 'batch_loss/train': 0.907324779778719}
12/22/2021 05:37:24 - INFO - codeparrot_training - Step 4689: {'lr': 0.0004899421653418904, 'samples': 2401280, 'steps': 4689, 'batch_loss/train': 0.8261666349135339}
12/22/2021 05:37:35 - INFO - codeparrot_training - Step 4690: {'lr': 0.0004899377387038037, 'samples': 2401792, 'steps': 4690, 'batch_loss/train': 0.9321572971530259}
12/22/2021 05:37:47 - INFO - codeparrot_training - Step 4691: {'lr': 0.0004899333111118154, 'samples': 2402304, 'steps': 4691, 'batch_loss/train': 0.8877207012847066}
12/22/2021 05:37:57 - INFO - codeparrot_training - Step 4692: {'lr': 0.0004899288825659427, 'samples': 2402816, 'steps': 4692, 'batch_loss/train': 0.849187099840492}
12/22/2021 05:38:08 - INFO - codeparrot_training - Step 4693: {'lr': 0.0004899244530662033, 'samples': 2403328, 'steps': 4693, 'batch_loss/train': 0.8010710030794144}
12/22/2021 05:38:21 - INFO - codeparrot_training - Step 4694: {'lr': 0.000489920022612615, 'samples': 2403840, 'steps': 4694, 'batch_loss/train': 0.9494202854111791}
12/22/2021 05:38:32 - INFO - codeparrot_training - Step 4695: {'lr': 0.0004899155912051952, 'samples': 2404352, 'steps': 4695, 'batch_loss/train': 0.9274922190234065}
12/22/2021 05:38:42 - INFO - codeparrot_training - Step 4696: {'lr': 0.0004899111588439617, 'samples': 2404864, 'steps': 4696, 'batch_loss/train': 0.938635416328907}
12/22/2021 05:38:53 - INFO - codeparrot_training - Step 4697: {'lr': 0.000489906725528932, 'samples': 2405376, 'steps': 4697, 'batch_loss/train': 0.8738009287044406}
12/22/2021 05:39:05 - INFO - codeparrot_training - Step 4698: {'lr': 0.0004899022912601236, 'samples': 2405888, 'steps': 4698, 'batch_loss/train': 0.870822736993432}
12/22/2021 05:39:15 - INFO - codeparrot_training - Step 4699: {'lr': 0.0004898978560375544, 'samples': 2406400, 'steps': 4699, 'batch_loss/train': 0.9104921109974384}
12/22/2021 05:39:26 - INFO - codeparrot_training - Step 4700: {'lr': 0.0004898934198612419, 'samples': 2406912, 'steps': 4700, 'batch_loss/train': 0.8699377765879035}
12/22/2021 05:39:39 - INFO - codeparrot_training - Step 4701: {'lr': 0.0004898889827312037, 'samples': 2407424, 'steps': 4701, 'batch_loss/train': 0.9113212209194899}
12/22/2021 05:39:49 - INFO - codeparrot_training - Step 4702: {'lr': 0.0004898845446474575, 'samples': 2407936, 'steps': 4702, 'batch_loss/train': 0.849408102221787}
12/22/2021 05:40:00 - INFO - codeparrot_training - Step 4703: {'lr': 0.000489880105610021, 'samples': 2408448, 'steps': 4703, 'batch_loss/train': 0.8941238056868315}
12/22/2021 05:40:12 - INFO - codeparrot_training - Step 4704: {'lr': 0.0004898756656189116, 'samples': 2408960, 'steps': 4704, 'batch_loss/train': 1.0079377628862858}
12/22/2021 05:40:23 - INFO - codeparrot_training - Step 4705: {'lr': 0.0004898712246741472, 'samples': 2409472, 'steps': 4705, 'batch_loss/train': 0.880921171978116}
12/22/2021 05:40:33 - INFO - codeparrot_training - Step 4706: {'lr': 0.0004898667827757455, 'samples': 2409984, 'steps': 4706, 'batch_loss/train': 0.8649303708225489}
12/22/2021 05:40:44 - INFO - codeparrot_training - Step 4707: {'lr': 0.0004898623399237239, 'samples': 2410496, 'steps': 4707, 'batch_loss/train': 0.8292835699394345}
12/22/2021 05:40:57 - INFO - codeparrot_training - Step 4708: {'lr': 0.0004898578961181003, 'samples': 2411008, 'steps': 4708, 'batch_loss/train': 0.8948356937617064}
12/22/2021 05:41:07 - INFO - codeparrot_training - Step 4709: {'lr': 0.0004898534513588922, 'samples': 2411520, 'steps': 4709, 'batch_loss/train': 0.9203171655535698}
12/22/2021 05:41:18 - INFO - codeparrot_training - Step 4710: {'lr': 0.0004898490056461174, 'samples': 2412032, 'steps': 4710, 'batch_loss/train': 1.0332338837906718}
12/22/2021 05:41:30 - INFO - codeparrot_training - Step 4711: {'lr': 0.0004898445589797935, 'samples': 2412544, 'steps': 4711, 'batch_loss/train': 1.1074281483888626}
12/22/2021 05:41:41 - INFO - codeparrot_training - Step 4712: {'lr': 0.0004898401113599382, 'samples': 2413056, 'steps': 4712, 'batch_loss/train': 0.9200476091355085}
12/22/2021 05:41:52 - INFO - codeparrot_training - Step 4713: {'lr': 0.0004898356627865691, 'samples': 2413568, 'steps': 4713, 'batch_loss/train': 0.8686068197712302}
12/22/2021 05:42:03 - INFO - codeparrot_training - Step 4714: {'lr': 0.000489831213259704, 'samples': 2414080, 'steps': 4714, 'batch_loss/train': 0.9382328558713198}
12/22/2021 05:42:15 - INFO - codeparrot_training - Step 4715: {'lr': 0.0004898267627793606, 'samples': 2414592, 'steps': 4715, 'batch_loss/train': 0.8927031913772225}
12/22/2021 05:42:25 - INFO - codeparrot_training - Step 4716: {'lr': 0.0004898223113455565, 'samples': 2415104, 'steps': 4716, 'batch_loss/train': 0.8882098281756043}
12/22/2021 05:42:36 - INFO - codeparrot_training - Step 4717: {'lr': 0.0004898178589583096, 'samples': 2415616, 'steps': 4717, 'batch_loss/train': 0.8870672178454697}
12/22/2021 05:42:48 - INFO - codeparrot_training - Step 4718: {'lr': 0.0004898134056176372, 'samples': 2416128, 'steps': 4718, 'batch_loss/train': 0.8273202423006296}
12/22/2021 05:42:59 - INFO - codeparrot_training - Step 4719: {'lr': 0.0004898089513235574, 'samples': 2416640, 'steps': 4719, 'batch_loss/train': 0.8490454284474254} 12/22/2021 05:43:09 - INFO - codeparrot_training - Step 4720: {'lr': 0.0004898044960760877, 'samples': 2417152, 'steps': 4720, 'batch_loss/train': 0.8419190244749188} 12/22/2021 05:43:22 - INFO - codeparrot_training - Step 4721: {'lr': 0.0004898000398752459, 'samples': 2417664, 'steps': 4721, 'batch_loss/train': 0.7603944423608482} 12/22/2021 05:43:33 - INFO - codeparrot_training - Step 4722: {'lr': 0.0004897955827210497, 'samples': 2418176, 'steps': 4722, 'batch_loss/train': 0.9887412246316671} 12/22/2021 05:43:43 - INFO - codeparrot_training - Step 4723: {'lr': 0.0004897911246135167, 'samples': 2418688, 'steps': 4723, 'batch_loss/train': 1.029698956757784} 12/22/2021 05:43:55 - INFO - codeparrot_training - Step 4724: {'lr': 0.0004897866655526649, 'samples': 2419200, 'steps': 4724, 'batch_loss/train': 0.8818455683067441} 12/22/2021 05:44:06 - INFO - codeparrot_training - Step 4725: {'lr': 0.0004897822055385117, 'samples': 2419712, 'steps': 4725, 'batch_loss/train': 0.8259754227474332} 12/22/2021 05:44:17 - INFO - codeparrot_training - Step 4726: {'lr': 0.0004897777445710751, 'samples': 2420224, 'steps': 4726, 'batch_loss/train': 0.7811438087373972} 12/22/2021 05:44:27 - INFO - codeparrot_training - Step 4727: {'lr': 0.0004897732826503727, 'samples': 2420736, 'steps': 4727, 'batch_loss/train': 0.8952704854309559} 12/22/2021 05:44:40 - INFO - codeparrot_training - Step 4728: {'lr': 0.0004897688197764222, 'samples': 2421248, 'steps': 4728, 'batch_loss/train': 0.8343784064054489} 12/22/2021 05:44:50 - INFO - codeparrot_training - Step 4729: {'lr': 0.0004897643559492415, 'samples': 2421760, 'steps': 4729, 'batch_loss/train': 0.7600196897983551} 12/22/2021 05:45:01 - INFO - codeparrot_training - Step 4730: {'lr': 0.0004897598911688482, 'samples': 2422272, 'steps': 4730, 'batch_loss/train': 
0.8858982659876347} 12/22/2021 05:45:13 - INFO - codeparrot_training - Step 4731: {'lr': 0.0004897554254352601, 'samples': 2422784, 'steps': 4731, 'batch_loss/train': 0.8740650126710534} 12/22/2021 05:45:23 - INFO - codeparrot_training - Step 4732: {'lr': 0.0004897509587484949, 'samples': 2423296, 'steps': 4732, 'batch_loss/train': 1.6503102369606495} 12/22/2021 05:45:34 - INFO - codeparrot_training - Step 4733: {'lr': 0.0004897464911085706, 'samples': 2423808, 'steps': 4733, 'batch_loss/train': 0.8254860602319241} 12/22/2021 05:45:47 - INFO - codeparrot_training - Step 4734: {'lr': 0.0004897420225155046, 'samples': 2424320, 'steps': 4734, 'batch_loss/train': 0.8738939571194351} 12/22/2021 05:45:58 - INFO - codeparrot_training - Step 4735: {'lr': 0.0004897375529693148, 'samples': 2424832, 'steps': 4735, 'batch_loss/train': 0.7476444293279201} 12/22/2021 05:46:08 - INFO - codeparrot_training - Step 4736: {'lr': 0.0004897330824700192, 'samples': 2425344, 'steps': 4736, 'batch_loss/train': 0.8056006943807006} 12/22/2021 05:46:19 - INFO - codeparrot_training - Step 4737: {'lr': 0.0004897286110176353, 'samples': 2425856, 'steps': 4737, 'batch_loss/train': 0.8416024437174201} 12/22/2021 05:46:31 - INFO - codeparrot_training - Step 4738: {'lr': 0.000489724138612181, 'samples': 2426368, 'steps': 4738, 'batch_loss/train': 0.864123678766191} 12/22/2021 05:46:42 - INFO - codeparrot_training - Step 4739: {'lr': 0.000489719665253674, 'samples': 2426880, 'steps': 4739, 'batch_loss/train': 1.0573223326355219} 12/22/2021 05:46:52 - INFO - codeparrot_training - Step 4740: {'lr': 0.0004897151909421321, 'samples': 2427392, 'steps': 4740, 'batch_loss/train': 0.7552709971787408} 12/22/2021 05:47:05 - INFO - codeparrot_training - Step 4741: {'lr': 0.0004897107156775731, 'samples': 2427904, 'steps': 4741, 'batch_loss/train': 0.8612934835255146} 12/22/2021 05:47:16 - INFO - codeparrot_training - Step 4742: {'lr': 0.0004897062394600148, 'samples': 2428416, 'steps': 4742, 
'batch_loss/train': 0.9006940638646483} 12/22/2021 05:47:26 - INFO - codeparrot_training - Step 4743: {'lr': 0.0004897017622894751, 'samples': 2428928, 'steps': 4743, 'batch_loss/train': 0.8731550453230739} 12/22/2021 05:47:38 - INFO - codeparrot_training - Step 4744: {'lr': 0.0004896972841659716, 'samples': 2429440, 'steps': 4744, 'batch_loss/train': 0.877603180706501} 12/22/2021 05:47:49 - INFO - codeparrot_training - Step 4745: {'lr': 0.0004896928050895222, 'samples': 2429952, 'steps': 4745, 'batch_loss/train': 0.888151309452951} 12/22/2021 05:48:00 - INFO - codeparrot_training - Step 4746: {'lr': 0.0004896883250601449, 'samples': 2430464, 'steps': 4746, 'batch_loss/train': 0.797690718434751} 12/22/2021 05:48:10 - INFO - codeparrot_training - Step 4747: {'lr': 0.0004896838440778571, 'samples': 2430976, 'steps': 4747, 'batch_loss/train': 0.836217406205833} 12/22/2021 05:48:22 - INFO - codeparrot_training - Step 4748: {'lr': 0.000489679362142677, 'samples': 2431488, 'steps': 4748, 'batch_loss/train': 0.852365336380899} 12/22/2021 05:48:33 - INFO - codeparrot_training - Step 4749: {'lr': 0.0004896748792546221, 'samples': 2432000, 'steps': 4749, 'batch_loss/train': 0.882318802177906} 12/22/2021 05:48:44 - INFO - codeparrot_training - Step 4750: {'lr': 0.0004896703954137103, 'samples': 2432512, 'steps': 4750, 'batch_loss/train': 0.8632469540461898} 12/22/2021 05:48:56 - INFO - codeparrot_training - Step 4751: {'lr': 0.0004896659106199598, 'samples': 2433024, 'steps': 4751, 'batch_loss/train': 0.8827535177115351} 12/22/2021 05:49:07 - INFO - codeparrot_training - Step 4752: {'lr': 0.000489661424873388, 'samples': 2433536, 'steps': 4752, 'batch_loss/train': 0.823852134635672} 12/22/2021 05:49:17 - INFO - codeparrot_training - Step 4753: {'lr': 0.0004896569381740127, 'samples': 2434048, 'steps': 4753, 'batch_loss/train': 0.836799361044541} 12/22/2021 05:49:29 - INFO - codeparrot_training - Step 4754: {'lr': 0.0004896524505218522, 'samples': 2434560, 'steps': 4754, 
'batch_loss/train': 0.8468020083382726} 12/22/2021 05:49:40 - INFO - codeparrot_training - Step 4755: {'lr': 0.0004896479619169239, 'samples': 2435072, 'steps': 4755, 'batch_loss/train': 0.916930666193366} 12/22/2021 05:49:51 - INFO - codeparrot_training - Step 4756: {'lr': 0.0004896434723592458, 'samples': 2435584, 'steps': 4756, 'batch_loss/train': 0.84939696919173} 12/22/2021 05:50:03 - INFO - codeparrot_training - Step 4757: {'lr': 0.0004896389818488357, 'samples': 2436096, 'steps': 4757, 'batch_loss/train': 0.8416456556878984} 12/22/2021 05:50:14 - INFO - codeparrot_training - Step 4758: {'lr': 0.0004896344903857115, 'samples': 2436608, 'steps': 4758, 'batch_loss/train': 0.8899242291226983} 12/22/2021 05:50:24 - INFO - codeparrot_training - Step 4759: {'lr': 0.0004896299979698911, 'samples': 2437120, 'steps': 4759, 'batch_loss/train': 1.2607581317424774} 12/22/2021 05:50:35 - INFO - codeparrot_training - Step 4760: {'lr': 0.0004896255046013923, 'samples': 2437632, 'steps': 4760, 'batch_loss/train': 0.8955657221376896} 12/22/2021 05:50:47 - INFO - codeparrot_training - Step 4761: {'lr': 0.000489621010280233, 'samples': 2438144, 'steps': 4761, 'batch_loss/train': 0.8781210165470839} 12/22/2021 05:50:58 - INFO - codeparrot_training - Step 4762: {'lr': 0.000489616515006431, 'samples': 2438656, 'steps': 4762, 'batch_loss/train': 0.856322611682117} 12/22/2021 05:51:08 - INFO - codeparrot_training - Step 4763: {'lr': 0.0004896120187800042, 'samples': 2439168, 'steps': 4763, 'batch_loss/train': 0.8876886460930109} 12/22/2021 05:51:20 - INFO - codeparrot_training - Step 4764: {'lr': 0.0004896075216009704, 'samples': 2439680, 'steps': 4764, 'batch_loss/train': 0.7832733672112226} 12/22/2021 05:51:31 - INFO - codeparrot_training - Step 4765: {'lr': 0.0004896030234693477, 'samples': 2440192, 'steps': 4765, 'batch_loss/train': 0.8554117809981108} 12/22/2021 05:51:42 - INFO - codeparrot_training - Step 4766: {'lr': 0.0004895985243851538, 'samples': 2440704, 'steps': 4766, 
'batch_loss/train': 0.8214857801795006} 12/22/2021 05:51:54 - INFO - codeparrot_training - Step 4767: {'lr': 0.0004895940243484067, 'samples': 2441216, 'steps': 4767, 'batch_loss/train': 0.6619344763457775} 12/22/2021 05:52:04 - INFO - codeparrot_training - Step 4768: {'lr': 0.0004895895233591242, 'samples': 2441728, 'steps': 4768, 'batch_loss/train': 0.851491560228169} 12/22/2021 05:52:15 - INFO - codeparrot_training - Step 4769: {'lr': 0.0004895850214173241, 'samples': 2442240, 'steps': 4769, 'batch_loss/train': 0.9806192582473159} 12/22/2021 05:52:25 - INFO - codeparrot_training - Step 4770: {'lr': 0.0004895805185230245, 'samples': 2442752, 'steps': 4770, 'batch_loss/train': 0.9466247437521815} 12/22/2021 05:52:38 - INFO - codeparrot_training - Step 4771: {'lr': 0.0004895760146762433, 'samples': 2443264, 'steps': 4771, 'batch_loss/train': 0.9641015958040953} 12/22/2021 05:52:49 - INFO - codeparrot_training - Step 4772: {'lr': 0.0004895715098769983, 'samples': 2443776, 'steps': 4772, 'batch_loss/train': 0.8265406205318868} 12/22/2021 05:53:00 - INFO - codeparrot_training - Step 4773: {'lr': 0.0004895670041253074, 'samples': 2444288, 'steps': 4773, 'batch_loss/train': 0.8828124809078872} 12/22/2021 05:53:12 - INFO - codeparrot_training - Step 4774: {'lr': 0.0004895624974211886, 'samples': 2444800, 'steps': 4774, 'batch_loss/train': 0.7308535580523312} 12/22/2021 05:53:23 - INFO - codeparrot_training - Step 4775: {'lr': 0.0004895579897646596, 'samples': 2445312, 'steps': 4775, 'batch_loss/train': 0.8371193977072835} 12/22/2021 05:53:33 - INFO - codeparrot_training - Step 4776: {'lr': 0.0004895534811557386, 'samples': 2445824, 'steps': 4776, 'batch_loss/train': 0.9098690608516335} 12/22/2021 05:53:44 - INFO - codeparrot_training - Step 4777: {'lr': 0.0004895489715944435, 'samples': 2446336, 'steps': 4777, 'batch_loss/train': 0.8766644150018692} 12/22/2021 05:53:57 - INFO - codeparrot_training - Step 4778: {'lr': 0.000489544461080792, 'samples': 2446848, 'steps': 
4778, 'batch_loss/train': 0.8682454703375697} 12/22/2021 05:54:07 - INFO - codeparrot_training - Step 4779: {'lr': 0.0004895399496148021, 'samples': 2447360, 'steps': 4779, 'batch_loss/train': 0.9028678452596068} 12/22/2021 05:54:18 - INFO - codeparrot_training - Step 4780: {'lr': 0.000489535437196492, 'samples': 2447872, 'steps': 4780, 'batch_loss/train': 1.4302107356488705} 12/22/2021 05:54:31 - INFO - codeparrot_training - Step 4781: {'lr': 0.0004895309238258794, 'samples': 2448384, 'steps': 4781, 'batch_loss/train': 0.9242234386038035} 12/22/2021 05:54:41 - INFO - codeparrot_training - Step 4782: {'lr': 0.0004895264095029823, 'samples': 2448896, 'steps': 4782, 'batch_loss/train': 1.0215579215437174} 12/22/2021 05:54:52 - INFO - codeparrot_training - Step 4783: {'lr': 0.0004895218942278186, 'samples': 2449408, 'steps': 4783, 'batch_loss/train': 0.8611331060528755} 12/22/2021 05:55:03 - INFO - codeparrot_training - Step 4784: {'lr': 0.0004895173780004063, 'samples': 2449920, 'steps': 4784, 'batch_loss/train': 0.8653955650515854} 12/22/2021 05:55:15 - INFO - codeparrot_training - Step 4785: {'lr': 0.0004895128608207633, 'samples': 2450432, 'steps': 4785, 'batch_loss/train': 0.9540949985384941} 12/22/2021 05:55:26 - INFO - codeparrot_training - Step 4786: {'lr': 0.0004895083426889077, 'samples': 2450944, 'steps': 4786, 'batch_loss/train': 1.0111564695835114} 12/22/2021 05:55:36 - INFO - codeparrot_training - Step 4787: {'lr': 0.0004895038236048572, 'samples': 2451456, 'steps': 4787, 'batch_loss/train': 0.890831345692277} 12/22/2021 05:55:49 - INFO - codeparrot_training - Step 4788: {'lr': 0.00048949930356863, 'samples': 2451968, 'steps': 4788, 'batch_loss/train': 0.8243619576096535} 12/22/2021 05:56:00 - INFO - codeparrot_training - Step 4789: {'lr': 0.000489494782580244, 'samples': 2452480, 'steps': 4789, 'batch_loss/train': 0.872888557612896} 12/22/2021 05:56:10 - INFO - codeparrot_training - Step 4790: {'lr': 0.0004894902606397173, 'samples': 2452992, 'steps': 
4790, 'batch_loss/train': 0.8059537326917052} 12/22/2021 05:56:22 - INFO - codeparrot_training - Step 4791: {'lr': 0.0004894857377470677, 'samples': 2453504, 'steps': 4791, 'batch_loss/train': 0.804004130885005} 12/22/2021 05:56:33 - INFO - codeparrot_training - Step 4792: {'lr': 0.0004894812139023131, 'samples': 2454016, 'steps': 4792, 'batch_loss/train': 0.8758954703807831} 12/22/2021 05:56:43 - INFO - codeparrot_training - Step 4793: {'lr': 0.0004894766891054718, 'samples': 2454528, 'steps': 4793, 'batch_loss/train': 0.9155638227239251} 12/22/2021 05:56:55 - INFO - codeparrot_training - Step 4794: {'lr': 0.0004894721633565614, 'samples': 2455040, 'steps': 4794, 'batch_loss/train': 0.870713890530169} 12/22/2021 05:57:06 - INFO - codeparrot_training - Step 4795: {'lr': 0.0004894676366556002, 'samples': 2455552, 'steps': 4795, 'batch_loss/train': 0.828155203955248} 12/22/2021 05:57:17 - INFO - codeparrot_training - Step 4796: {'lr': 0.0004894631090026062, 'samples': 2456064, 'steps': 4796, 'batch_loss/train': 0.9764314910862595} 12/22/2021 05:57:27 - INFO - codeparrot_training - Step 4797: {'lr': 0.0004894585803975972, 'samples': 2456576, 'steps': 4797, 'batch_loss/train': 0.9002700485289097} 12/22/2021 05:57:40 - INFO - codeparrot_training - Step 4798: {'lr': 0.0004894540508405912, 'samples': 2457088, 'steps': 4798, 'batch_loss/train': 0.8783863089047372} 12/22/2021 05:57:51 - INFO - codeparrot_training - Step 4799: {'lr': 0.0004894495203316065, 'samples': 2457600, 'steps': 4799, 'batch_loss/train': 1.1464869873598218} 12/22/2021 05:58:02 - INFO - codeparrot_training - Step 4800: {'lr': 0.0004894449888706608, 'samples': 2458112, 'steps': 4800, 'batch_loss/train': 0.9252524385228753} 12/22/2021 05:58:14 - INFO - codeparrot_training - Step 4801: {'lr': 0.0004894404564577722, 'samples': 2458624, 'steps': 4801, 'batch_loss/train': 0.9626491079106927} 12/22/2021 05:58:25 - INFO - codeparrot_training - Step 4802: {'lr': 0.0004894359230929589, 'samples': 2459136, 
'steps': 4802, 'batch_loss/train': 0.9331808965653181} 12/22/2021 05:58:35 - INFO - codeparrot_training - Step 4803: {'lr': 0.0004894313887762388, 'samples': 2459648, 'steps': 4803, 'batch_loss/train': 1.0017683347687125} 12/22/2021 05:58:46 - INFO - codeparrot_training - Step 4804: {'lr': 0.0004894268535076298, 'samples': 2460160, 'steps': 4804, 'batch_loss/train': 0.8879474103450775} 12/22/2021 05:58:58 - INFO - codeparrot_training - Step 4805: {'lr': 0.00048942231728715, 'samples': 2460672, 'steps': 4805, 'batch_loss/train': 0.9078800035640597} 12/22/2021 05:59:08 - INFO - codeparrot_training - Step 4806: {'lr': 0.0004894177801148175, 'samples': 2461184, 'steps': 4806, 'batch_loss/train': 1.4339326620101929} 12/22/2021 05:59:19 - INFO - codeparrot_training - Step 4807: {'lr': 0.0004894132419906502, 'samples': 2461696, 'steps': 4807, 'batch_loss/train': 0.922285582870245} 12/22/2021 05:59:31 - INFO - codeparrot_training - Step 4808: {'lr': 0.0004894087029146665, 'samples': 2462208, 'steps': 4808, 'batch_loss/train': 0.7596188454190269} 12/22/2021 05:59:42 - INFO - codeparrot_training - Step 4809: {'lr': 0.000489404162886884, 'samples': 2462720, 'steps': 4809, 'batch_loss/train': 0.8629690539091825} 12/22/2021 05:59:52 - INFO - codeparrot_training - Step 4810: {'lr': 0.0004893996219073211, 'samples': 2463232, 'steps': 4810, 'batch_loss/train': 0.8326163615565747} 12/22/2021 06:00:06 - INFO - codeparrot_training - Step 4811: {'lr': 0.0004893950799759956, 'samples': 2463744, 'steps': 4811, 'batch_loss/train': 0.8858857275918126} 12/22/2021 06:00:17 - INFO - codeparrot_training - Step 4812: {'lr': 0.0004893905370929256, 'samples': 2464256, 'steps': 4812, 'batch_loss/train': 0.8645848063752055} 12/22/2021 06:00:28 - INFO - codeparrot_training - Step 4813: {'lr': 0.0004893859932581292, 'samples': 2464768, 'steps': 4813, 'batch_loss/train': 0.8285825131461024} 12/22/2021 06:00:40 - INFO - codeparrot_training - Step 4814: {'lr': 0.0004893814484716246, 'samples': 2465280, 
'steps': 4814, 'batch_loss/train': 0.8250753311440349} 12/22/2021 06:00:50 - INFO - codeparrot_training - Step 4815: {'lr': 0.0004893769027334297, 'samples': 2465792, 'steps': 4815, 'batch_loss/train': 0.8010852250736207} 12/22/2021 06:01:01 - INFO - codeparrot_training - Step 4816: {'lr': 0.0004893723560435625, 'samples': 2466304, 'steps': 4816, 'batch_loss/train': 0.8299886742606759} 12/22/2021 06:01:12 - INFO - codeparrot_training - Step 4817: {'lr': 0.0004893678084020414, 'samples': 2466816, 'steps': 4817, 'batch_loss/train': 0.8326175892725587} 12/22/2021 06:01:26 - INFO - codeparrot_training - Step 4818: {'lr': 0.0004893632598088842, 'samples': 2467328, 'steps': 4818, 'batch_loss/train': 0.7935682255774736} 12/22/2021 06:01:36 - INFO - codeparrot_training - Step 4819: {'lr': 0.0004893587102641091, 'samples': 2467840, 'steps': 4819, 'batch_loss/train': 0.8955145124346018} 12/22/2021 06:01:47 - INFO - codeparrot_training - Step 4820: {'lr': 0.000489354159767734, 'samples': 2468352, 'steps': 4820, 'batch_loss/train': 0.8664062730967999} 12/22/2021 06:01:59 - INFO - codeparrot_training - Step 4821: {'lr': 0.0004893496083197773, 'samples': 2468864, 'steps': 4821, 'batch_loss/train': 0.8961376706138253} 12/22/2021 06:02:09 - INFO - codeparrot_training - Step 4822: {'lr': 0.0004893450559202568, 'samples': 2469376, 'steps': 4822, 'batch_loss/train': 0.9192093135789037} 12/22/2021 06:02:20 - INFO - codeparrot_training - Step 4823: {'lr': 0.0004893405025691908, 'samples': 2469888, 'steps': 4823, 'batch_loss/train': 0.8529096208512783} 12/22/2021 06:02:34 - INFO - codeparrot_training - Step 4824: {'lr': 0.0004893359482665973, 'samples': 2470400, 'steps': 4824, 'batch_loss/train': 0.8378188833594322} 12/22/2021 06:02:45 - INFO - codeparrot_training - Step 4825: {'lr': 0.0004893313930124945, 'samples': 2470912, 'steps': 4825, 'batch_loss/train': 0.8634252110496163} 12/22/2021 06:02:55 - INFO - codeparrot_training - Step 4826: {'lr': 0.0004893268368069004, 'samples': 
2471424, 'steps': 4826, 'batch_loss/train': 0.8089898083126172} 12/22/2021 06:03:07 - INFO - codeparrot_training - Step 4827: {'lr': 0.0004893222796498332, 'samples': 2471936, 'steps': 4827, 'batch_loss/train': 0.839616765268147} 12/22/2021 06:03:18 - INFO - codeparrot_training - Step 4828: {'lr': 0.0004893177215413109, 'samples': 2472448, 'steps': 4828, 'batch_loss/train': 0.8707883968017995} 12/22/2021 06:03:29 - INFO - codeparrot_training - Step 4829: {'lr': 0.0004893131624813518, 'samples': 2472960, 'steps': 4829, 'batch_loss/train': 0.8318832016084343} 12/22/2021 06:03:39 - INFO - codeparrot_training - Step 4830: {'lr': 0.0004893086024699739, 'samples': 2473472, 'steps': 4830, 'batch_loss/train': 0.8539703758433461} 12/22/2021 06:03:51 - INFO - codeparrot_training - Step 4831: {'lr': 0.0004893040415071953, 'samples': 2473984, 'steps': 4831, 'batch_loss/train': 0.7329705939628184} 12/22/2021 06:04:02 - INFO - codeparrot_training - Step 4832: {'lr': 0.0004892994795930342, 'samples': 2474496, 'steps': 4832, 'batch_loss/train': 0.8651347095146775} 12/22/2021 06:04:13 - INFO - codeparrot_training - Step 4833: {'lr': 0.0004892949167275088, 'samples': 2475008, 'steps': 4833, 'batch_loss/train': 0.8311507338657975} 12/22/2021 06:04:26 - INFO - codeparrot_training - Step 4834: {'lr': 0.0004892903529106372, 'samples': 2475520, 'steps': 4834, 'batch_loss/train': 0.792690203525126} 12/22/2021 06:04:37 - INFO - codeparrot_training - Step 4835: {'lr': 0.0004892857881424374, 'samples': 2476032, 'steps': 4835, 'batch_loss/train': 0.8359676720574498} 12/22/2021 06:04:48 - INFO - codeparrot_training - Step 4836: {'lr': 0.0004892812224229277, 'samples': 2476544, 'steps': 4836, 'batch_loss/train': 0.9099986478686333} 12/22/2021 06:05:00 - INFO - codeparrot_training - Step 4837: {'lr': 0.0004892766557521262, 'samples': 2477056, 'steps': 4837, 'batch_loss/train': 1.0203783866018057} 12/22/2021 06:05:10 - INFO - codeparrot_training - Step 4838: {'lr': 0.000489272088130051, 
'samples': 2477568, 'steps': 4838, 'batch_loss/train': 0.8512644972652197} 12/22/2021 06:05:21 - INFO - codeparrot_training - Step 4839: {'lr': 0.0004892675195567204, 'samples': 2478080, 'steps': 4839, 'batch_loss/train': 0.8396337293088436} 12/22/2021 06:05:32 - INFO - codeparrot_training - Step 4840: {'lr': 0.0004892629500321525, 'samples': 2478592, 'steps': 4840, 'batch_loss/train': 0.7681135924067348} 12/22/2021 06:05:44 - INFO - codeparrot_training - Step 4841: {'lr': 0.0004892583795563654, 'samples': 2479104, 'steps': 4841, 'batch_loss/train': 0.765113758854568} 12/22/2021 06:05:54 - INFO - codeparrot_training - Step 4842: {'lr': 0.0004892538081293774, 'samples': 2479616, 'steps': 4842, 'batch_loss/train': 0.6849686030764133} 12/22/2021 06:06:05 - INFO - codeparrot_training - Step 4843: {'lr': 0.0004892492357512065, 'samples': 2480128, 'steps': 4843, 'batch_loss/train': 0.8720608539879322} 12/22/2021 06:06:17 - INFO - codeparrot_training - Step 4844: {'lr': 0.000489244662421871, 'samples': 2480640, 'steps': 4844, 'batch_loss/train': 0.8614648319780827} 12/22/2021 06:06:28 - INFO - codeparrot_training - Step 4845: {'lr': 0.0004892400881413891, 'samples': 2481152, 'steps': 4845, 'batch_loss/train': 0.8915302027016878} 12/22/2021 06:06:39 - INFO - codeparrot_training - Step 4846: {'lr': 0.000489235512909779, 'samples': 2481664, 'steps': 4846, 'batch_loss/train': 0.7794659232022241} 12/22/2021 06:06:52 - INFO - codeparrot_training - Step 4847: {'lr': 0.0004892309367270587, 'samples': 2482176, 'steps': 4847, 'batch_loss/train': 0.7623191657476127} 12/22/2021 06:07:03 - INFO - codeparrot_training - Step 4848: {'lr': 0.0004892263595932466, 'samples': 2482688, 'steps': 4848, 'batch_loss/train': 0.8345967289060354} 12/22/2021 06:07:14 - INFO - codeparrot_training - Step 4849: {'lr': 0.0004892217815083608, 'samples': 2483200, 'steps': 4849, 'batch_loss/train': 0.8729309644550085} 12/22/2021 06:07:24 - INFO - codeparrot_training - Step 4850: {'lr': 
0.0004892172024724195, 'samples': 2483712, 'steps': 4850, 'batch_loss/train': 0.8546615736559033} 12/22/2021 06:07:37 - INFO - codeparrot_training - Step 4851: {'lr': 0.000489212622485441, 'samples': 2484224, 'steps': 4851, 'batch_loss/train': 0.5937001995043829} 12/22/2021 06:07:47 - INFO - codeparrot_training - Step 4852: {'lr': 0.0004892080415474433, 'samples': 2484736, 'steps': 4852, 'batch_loss/train': 0.8600810393691063} 12/22/2021 06:07:58 - INFO - codeparrot_training - Step 4853: {'lr': 0.0004892034596584449, 'samples': 2485248, 'steps': 4853, 'batch_loss/train': 0.8777376469224691} 12/22/2021 06:08:10 - INFO - codeparrot_training - Step 4854: {'lr': 0.0004891988768184638, 'samples': 2485760, 'steps': 4854, 'batch_loss/train': 1.0641091577708721} 12/22/2021 06:08:21 - INFO - codeparrot_training - Step 4855: {'lr': 0.0004891942930275182, 'samples': 2486272, 'steps': 4855, 'batch_loss/train': 0.8806906817480922} 12/22/2021 06:08:31 - INFO - codeparrot_training - Step 4856: {'lr': 0.0004891897082856266, 'samples': 2486784, 'steps': 4856, 'batch_loss/train': 1.3511718660593033} 12/22/2021 06:08:45 - INFO - codeparrot_training - Step 4857: {'lr': 0.0004891851225928069, 'samples': 2487296, 'steps': 4857, 'batch_loss/train': 0.8081442788243294} 12/22/2021 06:08:56 - INFO - codeparrot_training - Step 4858: {'lr': 0.0004891805359490775, 'samples': 2487808, 'steps': 4858, 'batch_loss/train': 0.8510625176131725} 12/22/2021 06:09:06 - INFO - codeparrot_training - Step 4859: {'lr': 0.0004891759483544566, 'samples': 2488320, 'steps': 4859, 'batch_loss/train': 0.8231836222112179} 12/22/2021 06:09:18 - INFO - codeparrot_training - Step 4860: {'lr': 0.0004891713598089624, 'samples': 2488832, 'steps': 4860, 'batch_loss/train': 0.7920321598649025} 12/22/2021 06:09:29 - INFO - codeparrot_training - Step 4861: {'lr': 0.0004891667703126133, 'samples': 2489344, 'steps': 4861, 'batch_loss/train': 0.8507118560373783} 12/22/2021 06:09:40 - INFO - codeparrot_training - Step 4862: 
{'lr': 0.0004891621798654273, 'samples': 2489856, 'steps': 4862, 'batch_loss/train': 0.8092920454218984} 12/22/2021 06:09:50 - INFO - codeparrot_training - Step 4863: {'lr': 0.0004891575884674227, 'samples': 2490368, 'steps': 4863, 'batch_loss/train': 0.8336324701085687} 12/22/2021 06:10:05 - INFO - codeparrot_training - Step 4864: {'lr': 0.000489152996118618, 'samples': 2490880, 'steps': 4864, 'batch_loss/train': 0.8532057483680546} 12/22/2021 06:10:15 - INFO - codeparrot_training - Step 4865: {'lr': 0.0004891484028190313, 'samples': 2491392, 'steps': 4865, 'batch_loss/train': 0.8638418577611446} 12/22/2021 06:10:26 - INFO - codeparrot_training - Step 4866: {'lr': 0.0004891438085686807, 'samples': 2491904, 'steps': 4866, 'batch_loss/train': 0.8087279424071312} 12/22/2021 06:10:38 - INFO - codeparrot_training - Step 4867: {'lr': 0.0004891392133675848, 'samples': 2492416, 'steps': 4867, 'batch_loss/train': 0.9047911781817675} 12/22/2021 06:10:49 - INFO - codeparrot_training - Step 4868: {'lr': 0.0004891346172157615, 'samples': 2492928, 'steps': 4868, 'batch_loss/train': 0.8179152216762304} 12/22/2021 06:10:59 - INFO - codeparrot_training - Step 4869: {'lr': 0.0004891300201132294, 'samples': 2493440, 'steps': 4869, 'batch_loss/train': 0.7388020101934671} 12/22/2021 06:11:11 - INFO - codeparrot_training - Step 4870: {'lr': 0.0004891254220600065, 'samples': 2493952, 'steps': 4870, 'batch_loss/train': 0.8287974065169692} 12/22/2021 06:11:22 - INFO - codeparrot_training - Step 4871: {'lr': 0.0004891208230561112, 'samples': 2494464, 'steps': 4871, 'batch_loss/train': 1.0391075005754828} 12/22/2021 06:11:32 - INFO - codeparrot_training - Step 4872: {'lr': 0.000489116223101562, 'samples': 2494976, 'steps': 4872, 'batch_loss/train': 0.8974265828728676} 12/22/2021 06:11:46 - INFO - codeparrot_training - Step 4873: {'lr': 0.0004891116221963769, 'samples': 2495488, 'steps': 4873, 'batch_loss/train': 0.8618247546255589} 12/22/2021 06:11:57 - INFO - codeparrot_training - Step 
4874: {'lr': 0.0004891070203405742, 'samples': 2496000, 'steps': 4874, 'batch_loss/train': 0.9354149866849184} 12/22/2021 06:12:07 - INFO - codeparrot_training - Step 4875: {'lr': 0.0004891024175341723, 'samples': 2496512, 'steps': 4875, 'batch_loss/train': 0.8726594569161534} 12/22/2021 06:12:18 - INFO - codeparrot_training - Step 4876: {'lr': 0.0004890978137771894, 'samples': 2497024, 'steps': 4876, 'batch_loss/train': 0.9168099798262119} 12/22/2021 06:12:30 - INFO - codeparrot_training - Step 4877: {'lr': 0.0004890932090696441, 'samples': 2497536, 'steps': 4877, 'batch_loss/train': 0.6704823895124719} 12/22/2021 06:12:41 - INFO - codeparrot_training - Step 4878: {'lr': 0.0004890886034115543, 'samples': 2498048, 'steps': 4878, 'batch_loss/train': 0.9247748255729675} 12/22/2021 06:12:51 - INFO - codeparrot_training - Step 4879: {'lr': 0.0004890839968029384, 'samples': 2498560, 'steps': 4879, 'batch_loss/train': 0.917789913713932} 12/22/2021 06:13:05 - INFO - codeparrot_training - Step 4880: {'lr': 0.000489079389243815, 'samples': 2499072, 'steps': 4880, 'batch_loss/train': 0.9062646627426147} 12/22/2021 06:13:16 - INFO - codeparrot_training - Step 4881: {'lr': 0.000489074780734202, 'samples': 2499584, 'steps': 4881, 'batch_loss/train': 0.8129159649834037} 12/22/2021 06:13:26 - INFO - codeparrot_training - Step 4882: {'lr': 0.0004890701712741182, 'samples': 2500096, 'steps': 4882, 'batch_loss/train': 0.7994407210499048} 12/22/2021 06:13:38 - INFO - codeparrot_training - Step 4883: {'lr': 0.0004890655608635816, 'samples': 2500608, 'steps': 4883, 'batch_loss/train': 0.8542783046141267} 12/22/2021 06:13:49 - INFO - codeparrot_training - Step 4884: {'lr': 0.0004890609495026105, 'samples': 2501120, 'steps': 4884, 'batch_loss/train': 0.940254146233201} 12/22/2021 06:14:00 - INFO - codeparrot_training - Step 4885: {'lr': 0.0004890563371912234, 'samples': 2501632, 'steps': 4885, 'batch_loss/train': 0.8525793347507715} 12/22/2021 06:14:12 - INFO - codeparrot_training - Step 
4886: {'lr': 0.0004890517239294385, 'samples': 2502144, 'steps': 4886, 'batch_loss/train': 0.8124392926692963} 12/22/2021 06:14:22 - INFO - codeparrot_training - Step 4887: {'lr': 0.0004890471097172743, 'samples': 2502656, 'steps': 4887, 'batch_loss/train': 0.8305366178974509} 12/22/2021 06:14:33 - INFO - codeparrot_training - Step 4888: {'lr': 0.000489042494554749, 'samples': 2503168, 'steps': 4888, 'batch_loss/train': 0.9007789017632604} 12/22/2021 06:14:43 - INFO - codeparrot_training - Step 4889: {'lr': 0.000489037878441881, 'samples': 2503680, 'steps': 4889, 'batch_loss/train': 0.7769419602118433} 12/22/2021 06:14:55 - INFO - codeparrot_training - Step 4890: {'lr': 0.0004890332613786886, 'samples': 2504192, 'steps': 4890, 'batch_loss/train': 0.8656888613477349} 12/22/2021 06:15:06 - INFO - codeparrot_training - Step 4891: {'lr': 0.0004890286433651904, 'samples': 2504704, 'steps': 4891, 'batch_loss/train': 0.8363634869456291} 12/22/2021 06:15:17 - INFO - codeparrot_training - Step 4892: {'lr': 0.0004890240244014044, 'samples': 2505216, 'steps': 4892, 'batch_loss/train': 0.7856450523249805} 12/22/2021 06:15:30 - INFO - codeparrot_training - Step 4893: {'lr': 0.0004890194044873491, 'samples': 2505728, 'steps': 4893, 'batch_loss/train': 0.7224272296298295} 12/22/2021 06:15:41 - INFO - codeparrot_training - Step 4894: {'lr': 0.000489014783623043, 'samples': 2506240, 'steps': 4894, 'batch_loss/train': 0.8056339360773563} 12/22/2021 06:15:52 - INFO - codeparrot_training - Step 4895: {'lr': 0.0004890101618085043, 'samples': 2506752, 'steps': 4895, 'batch_loss/train': 0.8929878631606698} 12/22/2021 06:16:04 - INFO - codeparrot_training - Step 4896: {'lr': 0.0004890055390437515, 'samples': 2507264, 'steps': 4896, 'batch_loss/train': 0.8101188603322953} 12/22/2021 06:16:14 - INFO - codeparrot_training - Step 4897: {'lr': 0.000489000915328803, 'samples': 2507776, 'steps': 4897, 'batch_loss/train': 0.7918894882313907} 12/22/2021 06:16:25 - INFO - codeparrot_training - Step 
4898: {'lr': 0.0004889962906636769, 'samples': 2508288, 'steps': 4898, 'batch_loss/train': 0.8021686980500817} 12/22/2021 06:16:35 - INFO - codeparrot_training - Step 4899: {'lr': 0.0004889916650483919, 'samples': 2508800, 'steps': 4899, 'batch_loss/train': 0.8358808420598507} 12/22/2021 06:16:48 - INFO - codeparrot_training - Step 4900: {'lr': 0.0004889870384829664, 'samples': 2509312, 'steps': 4900, 'batch_loss/train': 0.8794752787798643} 12/22/2021 06:16:58 - INFO - codeparrot_training - Step 4901: {'lr': 0.0004889824109674185, 'samples': 2509824, 'steps': 4901, 'batch_loss/train': 0.8325082408264279} 12/22/2021 06:17:09 - INFO - codeparrot_training - Step 4902: {'lr': 0.0004889777825017668, 'samples': 2510336, 'steps': 4902, 'batch_loss/train': 0.9181616585701704} 12/22/2021 06:17:23 - INFO - codeparrot_training - Step 4903: {'lr': 0.0004889731530860297, 'samples': 2510848, 'steps': 4903, 'batch_loss/train': 0.8401491753757} 12/22/2021 06:17:33 - INFO - codeparrot_training - Step 4904: {'lr': 0.0004889685227202255, 'samples': 2511360, 'steps': 4904, 'batch_loss/train': 0.783416991122067} 12/22/2021 06:17:44 - INFO - codeparrot_training - Step 4905: {'lr': 0.0004889638914043728, 'samples': 2511872, 'steps': 4905, 'batch_loss/train': 0.8908946961164474} 12/22/2021 06:17:56 - INFO - codeparrot_training - Step 4906: {'lr': 0.0004889592591384898, 'samples': 2512384, 'steps': 4906, 'batch_loss/train': 0.986887626349926} 12/22/2021 06:18:07 - INFO - codeparrot_training - Step 4907: {'lr': 0.0004889546259225951, 'samples': 2512896, 'steps': 4907, 'batch_loss/train': 0.883234680572059} 12/22/2021 06:18:18 - INFO - codeparrot_training - Step 4908: {'lr': 0.000488949991756707, 'samples': 2513408, 'steps': 4908, 'batch_loss/train': 1.7456344235688448} 12/22/2021 06:18:28 - INFO - codeparrot_training - Step 4909: {'lr': 0.000488945356640844, 'samples': 2513920, 'steps': 4909, 'batch_loss/train': 0.8983127484098077} 12/22/2021 06:18:42 - INFO - codeparrot_training - Step 
4910: {'lr': 0.0004889407205750243, 'samples': 2514432, 'steps': 4910, 'batch_loss/train': 0.8459378206171095} 12/22/2021 06:18:53 - INFO - codeparrot_training - Step 4911: {'lr': 0.0004889360835592667, 'samples': 2514944, 'steps': 4911, 'batch_loss/train': 0.7960054343566298} 12/22/2021 06:19:03 - INFO - codeparrot_training - Step 4912: {'lr': 0.0004889314455935894, 'samples': 2515456, 'steps': 4912, 'batch_loss/train': 0.7130440566688776} 12/22/2021 06:19:15 - INFO - codeparrot_training - Step 4913: {'lr': 0.0004889268066780109, 'samples': 2515968, 'steps': 4913, 'batch_loss/train': 0.7974855378270149} 12/22/2021 06:19:26 - INFO - codeparrot_training - Step 4914: {'lr': 0.0004889221668125496, 'samples': 2516480, 'steps': 4914, 'batch_loss/train': 0.9555186918005347} 12/22/2021 06:19:37 - INFO - codeparrot_training - Step 4915: {'lr': 0.000488917525997224, 'samples': 2516992, 'steps': 4915, 'batch_loss/train': 0.8706473726779222} 12/22/2021 06:19:49 - INFO - codeparrot_training - Step 4916: {'lr': 0.0004889128842320526, 'samples': 2517504, 'steps': 4916, 'batch_loss/train': 0.8868661848828197} 12/22/2021 06:19:59 - INFO - codeparrot_training - Step 4917: {'lr': 0.0004889082415170537, 'samples': 2518016, 'steps': 4917, 'batch_loss/train': 0.836289532482624} 12/22/2021 06:20:10 - INFO - codeparrot_training - Step 4918: {'lr': 0.0004889035978522458, 'samples': 2518528, 'steps': 4918, 'batch_loss/train': 0.8319826498627663} 12/22/2021 06:20:21 - INFO - codeparrot_training - Step 4919: {'lr': 0.0004888989532376474, 'samples': 2519040, 'steps': 4919, 'batch_loss/train': 0.8569427160546184} 12/22/2021 06:20:34 - INFO - codeparrot_training - Step 4920: {'lr': 0.0004888943076732771, 'samples': 2519552, 'steps': 4920, 'batch_loss/train': 0.7693746904260479} 12/22/2021 06:20:45 - INFO - codeparrot_training - Step 4921: {'lr': 0.0004888896611591531, 'samples': 2520064, 'steps': 4921, 'batch_loss/train': 0.7883839998394251} 12/22/2021 06:20:56 - INFO - codeparrot_training - 
Step 4922: {'lr': 0.0004888850136952941, 'samples': 2520576, 'steps': 4922, 'batch_loss/train': 0.8241212562425062} 12/22/2021 06:21:08 - INFO - codeparrot_training - Step 4923: {'lr': 0.0004888803652817184, 'samples': 2521088, 'steps': 4923, 'batch_loss/train': 0.7836155788972974} 12/22/2021 06:21:19 - INFO - codeparrot_training - Step 4924: {'lr': 0.0004888757159184447, 'samples': 2521600, 'steps': 4924, 'batch_loss/train': 0.8874483983963728} 12/22/2021 06:21:29 - INFO - codeparrot_training - Step 4925: {'lr': 0.0004888710656054912, 'samples': 2522112, 'steps': 4925, 'batch_loss/train': 0.5586250246269628} 12/22/2021 06:21:41 - INFO - codeparrot_training - Step 4926: {'lr': 0.0004888664143428767, 'samples': 2522624, 'steps': 4926, 'batch_loss/train': 0.9078589169075713} 12/22/2021 06:21:52 - INFO - codeparrot_training - Step 4927: {'lr': 0.0004888617621306194, 'samples': 2523136, 'steps': 4927, 'batch_loss/train': 0.8767602816224098} 12/22/2021 06:22:02 - INFO - codeparrot_training - Step 4928: {'lr': 0.0004888571089687379, 'samples': 2523648, 'steps': 4928, 'batch_loss/train': 1.0531121343374252} 12/22/2021 06:22:14 - INFO - codeparrot_training - Step 4929: {'lr': 0.0004888524548572508, 'samples': 2524160, 'steps': 4929, 'batch_loss/train': 0.8383704135194421} 12/22/2021 06:22:25 - INFO - codeparrot_training - Step 4930: {'lr': 0.0004888477997961765, 'samples': 2524672, 'steps': 4930, 'batch_loss/train': 0.8181331967934966} 12/22/2021 06:22:36 - INFO - codeparrot_training - Step 4931: {'lr': 0.0004888431437855335, 'samples': 2525184, 'steps': 4931, 'batch_loss/train': 0.829911857843399} 12/22/2021 06:22:46 - INFO - codeparrot_training - Step 4932: {'lr': 0.0004888384868253404, 'samples': 2525696, 'steps': 4932, 'batch_loss/train': 0.7890358632430434} 12/22/2021 06:23:00 - INFO - codeparrot_training - Step 4933: {'lr': 0.0004888338289156157, 'samples': 2526208, 'steps': 4933, 'batch_loss/train': 0.9955997270881198} 12/22/2021 06:23:11 - INFO - 
codeparrot_training - Step 4934: {'lr': 0.0004888291700563778, 'samples': 2526720, 'steps': 4934, 'batch_loss/train': 1.0420809127390385} 12/22/2021 06:23:22 - INFO - codeparrot_training - Step 4935: {'lr': 0.0004888245102476453, 'samples': 2527232, 'steps': 4935, 'batch_loss/train': 0.909764098469168} 12/22/2021 06:23:34 - INFO - codeparrot_training - Step 4936: {'lr': 0.0004888198494894367, 'samples': 2527744, 'steps': 4936, 'batch_loss/train': 0.8257213095203042} 12/22/2021 06:23:44 - INFO - codeparrot_training - Step 4937: {'lr': 0.0004888151877817706, 'samples': 2528256, 'steps': 4937, 'batch_loss/train': 0.9138863971456885} 12/22/2021 06:23:55 - INFO - codeparrot_training - Step 4938: {'lr': 0.0004888105251246655, 'samples': 2528768, 'steps': 4938, 'batch_loss/train': 0.7877942927880213} 12/22/2021 06:24:09 - INFO - codeparrot_training - Step 4939: {'lr': 0.0004888058615181398, 'samples': 2529280, 'steps': 4939, 'batch_loss/train': 0.7478074862156063} 12/22/2021 06:24:19 - INFO - codeparrot_training - Step 4940: {'lr': 0.0004888011969622124, 'samples': 2529792, 'steps': 4940, 'batch_loss/train': 0.8402346549555659} 12/22/2021 06:24:30 - INFO - codeparrot_training - Step 4941: {'lr': 0.0004887965314569014, 'samples': 2530304, 'steps': 4941, 'batch_loss/train': 0.9378176229074597} 12/22/2021 06:24:41 - INFO - codeparrot_training - Step 4942: {'lr': 0.0004887918650022257, 'samples': 2530816, 'steps': 4942, 'batch_loss/train': 0.7866222767625004} 12/22/2021 06:24:53 - INFO - codeparrot_training - Step 4943: {'lr': 0.0004887871975982036, 'samples': 2531328, 'steps': 4943, 'batch_loss/train': 0.8699145391583443} 12/22/2021 06:25:03 - INFO - codeparrot_training - Step 4944: {'lr': 0.0004887825292448538, 'samples': 2531840, 'steps': 4944, 'batch_loss/train': 0.8386723245494068} 12/22/2021 06:25:14 - INFO - codeparrot_training - Step 4945: {'lr': 0.0004887778599421948, 'samples': 2532352, 'steps': 4945, 'batch_loss/train': 0.7481547992210835} 12/22/2021 06:25:26 - 
INFO - codeparrot_training - Step 4946: {'lr': 0.0004887731896902452, 'samples': 2532864, 'steps': 4946, 'batch_loss/train': 0.8914304384961724} 12/22/2021 06:25:37 - INFO - codeparrot_training - Step 4947: {'lr': 0.0004887685184890236, 'samples': 2533376, 'steps': 4947, 'batch_loss/train': 0.8275982507038862} 12/22/2021 06:25:47 - INFO - codeparrot_training - Step 4948: {'lr': 0.0004887638463385485, 'samples': 2533888, 'steps': 4948, 'batch_loss/train': 0.8020043754950166} 12/22/2021 06:26:00 - INFO - codeparrot_training - Step 4949: {'lr': 0.0004887591732388386, 'samples': 2534400, 'steps': 4949, 'batch_loss/train': 0.7773739714175463} 12/22/2021 06:26:11 - INFO - codeparrot_training - Step 4950: {'lr': 0.0004887544991899123, 'samples': 2534912, 'steps': 4950, 'batch_loss/train': 0.9138218834996223} 12/22/2021 06:26:21 - INFO - codeparrot_training - Step 4951: {'lr': 0.0004887498241917882, 'samples': 2535424, 'steps': 4951, 'batch_loss/train': 0.8527651708573103} 12/22/2021 06:26:34 - INFO - codeparrot_training - Step 4952: {'lr': 0.0004887451482444849, 'samples': 2535936, 'steps': 4952, 'batch_loss/train': 0.8449178719893098} 12/22/2021 06:26:44 - INFO - codeparrot_training - Step 4953: {'lr': 0.0004887404713480211, 'samples': 2536448, 'steps': 4953, 'batch_loss/train': 0.8603670578449965} 12/22/2021 06:26:55 - INFO - codeparrot_training - Step 4954: {'lr': 0.0004887357935024154, 'samples': 2536960, 'steps': 4954, 'batch_loss/train': 0.9232867108657956} 12/22/2021 06:27:06 - INFO - codeparrot_training - Step 4955: {'lr': 0.0004887311147076862, 'samples': 2537472, 'steps': 4955, 'batch_loss/train': 0.9573373245075345} 12/22/2021 06:27:18 - INFO - codeparrot_training - Step 4956: {'lr': 0.0004887264349638524, 'samples': 2537984, 'steps': 4956, 'batch_loss/train': 0.8092666286975145} 12/22/2021 06:27:28 - INFO - codeparrot_training - Step 4957: {'lr': 0.0004887217542709324, 'samples': 2538496, 'steps': 4957, 'batch_loss/train': 0.8307648794725537} 12/22/2021 
06:27:39 - INFO - codeparrot_training - Step 4958: {'lr': 0.0004887170726289447, 'samples': 2539008, 'steps': 4958, 'batch_loss/train': 0.8899243976920843} 12/22/2021 06:27:52 - INFO - codeparrot_training - Step 4959: {'lr': 0.0004887123900379082, 'samples': 2539520, 'steps': 4959, 'batch_loss/train': 0.882855542935431} 12/22/2021 06:28:02 - INFO - codeparrot_training - Step 4960: {'lr': 0.0004887077064978412, 'samples': 2540032, 'steps': 4960, 'batch_loss/train': 0.701606247574091} 12/22/2021 06:28:13 - INFO - codeparrot_training - Step 4961: {'lr': 0.0004887030220087627, 'samples': 2540544, 'steps': 4961, 'batch_loss/train': 2.0212436942383647} 12/22/2021 06:28:25 - INFO - codeparrot_training - Step 4962: {'lr': 0.0004886983365706911, 'samples': 2541056, 'steps': 4962, 'batch_loss/train': 0.8411136418581009} 12/22/2021 06:28:36 - INFO - codeparrot_training - Step 4963: {'lr': 0.0004886936501836449, 'samples': 2541568, 'steps': 4963, 'batch_loss/train': 0.8441598974168301} 12/22/2021 06:28:46 - INFO - codeparrot_training - Step 4964: {'lr': 0.000488688962847643, 'samples': 2542080, 'steps': 4964, 'batch_loss/train': 0.8455916093662381} 12/22/2021 06:28:57 - INFO - codeparrot_training - Step 4965: {'lr': 0.0004886842745627038, 'samples': 2542592, 'steps': 4965, 'batch_loss/train': 0.8889451390132308} 12/22/2021 06:29:09 - INFO - codeparrot_training - Step 4966: {'lr': 0.000488679585328846, 'samples': 2543104, 'steps': 4966, 'batch_loss/train': 0.7929069604724646} 12/22/2021 06:29:20 - INFO - codeparrot_training - Step 4967: {'lr': 0.0004886748951460885, 'samples': 2543616, 'steps': 4967, 'batch_loss/train': 1.001233596354723} 12/22/2021 06:29:31 - INFO - codeparrot_training - Step 4968: {'lr': 0.0004886702040144496, 'samples': 2544128, 'steps': 4968, 'batch_loss/train': 0.8585275579243898} 12/22/2021 06:29:43 - INFO - codeparrot_training - Step 4969: {'lr': 0.0004886655119339481, 'samples': 2544640, 'steps': 4969, 'batch_loss/train': 1.034215236082673} 12/22/2021 
06:29:54 - INFO - codeparrot_training - Step 4970: {'lr': 0.0004886608189046027, 'samples': 2545152, 'steps': 4970, 'batch_loss/train': 0.8377849506214261} 12/22/2021 06:30:04 - INFO - codeparrot_training - Step 4971: {'lr': 0.0004886561249264319, 'samples': 2545664, 'steps': 4971, 'batch_loss/train': 0.8523126216605306} 12/22/2021 06:30:17 - INFO - codeparrot_training - Step 4972: {'lr': 0.0004886514299994545, 'samples': 2546176, 'steps': 4972, 'batch_loss/train': 0.9025827292352915} 12/22/2021 06:30:27 - INFO - codeparrot_training - Step 4973: {'lr': 0.0004886467341236892, 'samples': 2546688, 'steps': 4973, 'batch_loss/train': 0.8945728838443756} 12/22/2021 06:30:38 - INFO - codeparrot_training - Step 4974: {'lr': 0.0004886420372991545, 'samples': 2547200, 'steps': 4974, 'batch_loss/train': 0.89814661629498} 12/22/2021 06:30:48 - INFO - codeparrot_training - Step 4975: {'lr': 0.0004886373395258692, 'samples': 2547712, 'steps': 4975, 'batch_loss/train': 0.6792180235497653} 12/22/2021 06:31:01 - INFO - codeparrot_training - Step 4976: {'lr': 0.000488632640803852, 'samples': 2548224, 'steps': 4976, 'batch_loss/train': 0.8670194745063782} 12/22/2021 06:31:12 - INFO - codeparrot_training - Step 4977: {'lr': 0.0004886279411331215, 'samples': 2548736, 'steps': 4977, 'batch_loss/train': 0.8967490899376571} 12/22/2021 06:31:22 - INFO - codeparrot_training - Step 4978: {'lr': 0.0004886232405136963, 'samples': 2549248, 'steps': 4978, 'batch_loss/train': 0.5250988856423646} 12/22/2021 06:31:35 - INFO - codeparrot_training - Step 4979: {'lr': 0.0004886185389455953, 'samples': 2549760, 'steps': 4979, 'batch_loss/train': 0.9665238726884127} 12/22/2021 06:31:46 - INFO - codeparrot_training - Step 4980: {'lr': 0.000488613836428837, 'samples': 2550272, 'steps': 4980, 'batch_loss/train': 0.9862922234460711} 12/22/2021 06:31:56 - INFO - codeparrot_training - Step 4981: {'lr': 0.0004886091329634403, 'samples': 2550784, 'steps': 4981, 'batch_loss/train': 1.2031910424120724} 12/22/2021 
06:32:08 - INFO - codeparrot_training - Step 4982: {'lr': 0.0004886044285494237, 'samples': 2551296, 'steps': 4982, 'batch_loss/train': 0.7784961732104421} 12/22/2021 06:32:19 - INFO - codeparrot_training - Step 4983: {'lr': 0.0004885997231868061, 'samples': 2551808, 'steps': 4983, 'batch_loss/train': 0.6730921814450994} 12/22/2021 06:32:30 - INFO - codeparrot_training - Step 4984: {'lr': 0.0004885950168756059, 'samples': 2552320, 'steps': 4984, 'batch_loss/train': 0.8643502220511436} 12/22/2021 06:32:42 - INFO - codeparrot_training - Step 4985: {'lr': 0.0004885903096158422, 'samples': 2552832, 'steps': 4985, 'batch_loss/train': 0.8439259333536029} 12/22/2021 06:32:53 - INFO - codeparrot_training - Step 4986: {'lr': 0.0004885856014075334, 'samples': 2553344, 'steps': 4986, 'batch_loss/train': 0.7322236965410411} 12/22/2021 06:33:04 - INFO - codeparrot_training - Step 4987: {'lr': 0.0004885808922506983, 'samples': 2553856, 'steps': 4987, 'batch_loss/train': 0.8568874904885888} 12/22/2021 06:33:14 - INFO - codeparrot_training - Step 4988: {'lr': 0.0004885761821453557, 'samples': 2554368, 'steps': 4988, 'batch_loss/train': 0.8341573365032673} 12/22/2021 06:33:27 - INFO - codeparrot_training - Step 4989: {'lr': 0.0004885714710915243, 'samples': 2554880, 'steps': 4989, 'batch_loss/train': 0.6961626239353791} 12/22/2021 06:33:37 - INFO - codeparrot_training - Step 4990: {'lr': 0.0004885667590892229, 'samples': 2555392, 'steps': 4990, 'batch_loss/train': 0.8997619859874249} 12/22/2021 06:33:48 - INFO - codeparrot_training - Step 4991: {'lr': 0.00048856204613847, 'samples': 2555904, 'steps': 4991, 'batch_loss/train': 0.778249517083168} 12/22/2021 06:34:00 - INFO - codeparrot_training - Step 4992: {'lr': 0.0004885573322392845, 'samples': 2556416, 'steps': 4992, 'batch_loss/train': 0.7451556499581784} 12/22/2021 06:34:10 - INFO - codeparrot_training - Step 4993: {'lr': 0.0004885526173916853, 'samples': 2556928, 'steps': 4993, 'batch_loss/train': 0.7986269295215607} 
12/22/2021 06:34:21 - INFO - codeparrot_training - Step 4994: {'lr': 0.0004885479015956908, 'samples': 2557440, 'steps': 4994, 'batch_loss/train': 0.895805636420846}
12/22/2021 06:34:34 - INFO - codeparrot_training - Step 4995: {'lr': 0.0004885431848513198, 'samples': 2557952, 'steps': 4995, 'batch_loss/train': 0.8972878735512495}
12/22/2021 06:34:44 - INFO - codeparrot_training - Step 4996: {'lr': 0.0004885384671585914, 'samples': 2558464, 'steps': 4996, 'batch_loss/train': 0.634955593268387}
12/22/2021 06:34:55 - INFO - codeparrot_training - Step 4997: {'lr': 0.000488533748517524, 'samples': 2558976, 'steps': 4997, 'batch_loss/train': 0.6228164002532139}
12/22/2021 06:35:06 - INFO - codeparrot_training - Step 4998: {'lr': 0.0004885290289281366, 'samples': 2559488, 'steps': 4998, 'batch_loss/train': 0.6954452490899712}
12/22/2021 06:35:18 - INFO - codeparrot_training - Step 4999: {'lr': 0.0004885243083904477, 'samples': 2560000, 'steps': 4999, 'batch_loss/train': 0.7173171901376918}
12/22/2021 06:35:18 - INFO - codeparrot_training - Evaluating and saving model checkpoint
12/22/2021 07:03:01 - INFO - codeparrot_training - Batch size: torch.Size([0, 2048])
12/22/2021 07:03:01 - INFO - codeparrot_training - Step 5000: {'loss/eval': 0.8700061440467834, 'perplexity': 2.386925458908081}
12/22/2021 07:03:18 - WARNING - huggingface_hub.repository - Several commits (2) will be pushed upstream.
12/22/2021 07:03:31 - INFO - codeparrot_training - Step 5000: {'lr': 0.0004885195869044763, 'samples': 2560512, 'steps': 5000, 'batch_loss/train': 0.8320622127503157} 12/22/2021 07:03:42 - INFO - codeparrot_training - Step 5001: {'lr': 0.0004885148644702411, 'samples': 2561024, 'steps': 5001, 'batch_loss/train': 0.8805693546310067} 12/22/2021 07:03:54 - INFO - codeparrot_training - Step 5002: {'lr': 0.0004885101410877608, 'samples': 2561536, 'steps': 5002, 'batch_loss/train': 0.9241110319271684} 12/22/2021 07:04:05 - INFO - codeparrot_training - Step 5003: {'lr': 0.0004885054167570542, 'samples': 2562048, 'steps': 5003, 'batch_loss/train': 0.7342068292200565} 12/22/2021 07:04:16 - INFO - codeparrot_training - Step 5004: {'lr': 0.0004885006914781403, 'samples': 2562560, 'steps': 5004, 'batch_loss/train': 0.7681306814774871} 12/22/2021 07:04:28 - INFO - codeparrot_training - Step 5005: {'lr': 0.0004884959652510374, 'samples': 2563072, 'steps': 5005, 'batch_loss/train': 0.7770472045522183} 12/22/2021 07:04:39 - INFO - codeparrot_training - Step 5006: {'lr': 0.0004884912380757649, 'samples': 2563584, 'steps': 5006, 'batch_loss/train': 0.849177117459476} 12/22/2021 07:04:49 - INFO - codeparrot_training - Step 5007: {'lr': 0.0004884865099523411, 'samples': 2564096, 'steps': 5007, 'batch_loss/train': 0.7809845199808478} 12/22/2021 07:05:00 - INFO - codeparrot_training - Step 5008: {'lr': 0.0004884817808807851, 'samples': 2564608, 'steps': 5008, 'batch_loss/train': 0.8179520606063306} 12/22/2021 07:05:13 - INFO - codeparrot_training - Step 5009: {'lr': 0.0004884770508611155, 'samples': 2565120, 'steps': 5009, 'batch_loss/train': 0.8757626805454493} 12/22/2021 07:05:23 - INFO - codeparrot_training - Step 5010: {'lr': 0.0004884723198933513, 'samples': 2565632, 'steps': 5010, 'batch_loss/train': 0.9554009977728128} 12/22/2021 07:05:34 - INFO - codeparrot_training - Step 5011: {'lr': 0.0004884675879775111, 'samples': 2566144, 'steps': 5011, 'batch_loss/train': 
0.7739547211676836} 12/22/2021 07:05:46 - INFO - codeparrot_training - Step 5012: {'lr': 0.0004884628551136139, 'samples': 2566656, 'steps': 5012, 'batch_loss/train': 0.7124896933091804} 12/22/2021 07:05:57 - INFO - codeparrot_training - Step 5013: {'lr': 0.0004884581213016782, 'samples': 2567168, 'steps': 5013, 'batch_loss/train': 0.7614744026213884} 12/22/2021 07:06:07 - INFO - codeparrot_training - Step 5014: {'lr': 0.0004884533865417233, 'samples': 2567680, 'steps': 5014, 'batch_loss/train': 0.7651010500267148} 12/22/2021 07:06:19 - INFO - codeparrot_training - Step 5015: {'lr': 0.0004884486508337677, 'samples': 2568192, 'steps': 5015, 'batch_loss/train': 0.8497518375515938} 12/22/2021 07:06:30 - INFO - codeparrot_training - Step 5016: {'lr': 0.0004884439141778303, 'samples': 2568704, 'steps': 5016, 'batch_loss/train': 0.87134431861341} 12/22/2021 07:06:40 - INFO - codeparrot_training - Step 5017: {'lr': 0.0004884391765739299, 'samples': 2569216, 'steps': 5017, 'batch_loss/train': 0.9229918606579304} 12/22/2021 07:06:53 - INFO - codeparrot_training - Step 5018: {'lr': 0.0004884344380220853, 'samples': 2569728, 'steps': 5018, 'batch_loss/train': 0.8600880149751902} 12/22/2021 07:07:04 - INFO - codeparrot_training - Step 5019: {'lr': 0.0004884296985223156, 'samples': 2570240, 'steps': 5019, 'batch_loss/train': 0.7120422977022827} 12/22/2021 07:07:14 - INFO - codeparrot_training - Step 5020: {'lr': 0.0004884249580746393, 'samples': 2570752, 'steps': 5020, 'batch_loss/train': 0.9101304952055216} 12/22/2021 07:07:25 - INFO - codeparrot_training - Step 5021: {'lr': 0.0004884202166790754, 'samples': 2571264, 'steps': 5021, 'batch_loss/train': 0.871868209913373} 12/22/2021 07:07:37 - INFO - codeparrot_training - Step 5022: {'lr': 0.0004884154743356428, 'samples': 2571776, 'steps': 5022, 'batch_loss/train': 0.8733372502028942} 12/22/2021 07:07:48 - INFO - codeparrot_training - Step 5023: {'lr': 0.0004884107310443602, 'samples': 2572288, 'steps': 5023, 
'batch_loss/train': 0.8899572817608714} 12/22/2021 07:07:58 - INFO - codeparrot_training - Step 5024: {'lr': 0.0004884059868052466, 'samples': 2572800, 'steps': 5024, 'batch_loss/train': 0.8540454669855535} 12/22/2021 07:08:11 - INFO - codeparrot_training - Step 5025: {'lr': 0.0004884012416183209, 'samples': 2573312, 'steps': 5025, 'batch_loss/train': 1.5122183803468943} 12/22/2021 07:08:22 - INFO - codeparrot_training - Step 5026: {'lr': 0.0004883964954836017, 'samples': 2573824, 'steps': 5026, 'batch_loss/train': 0.8045603232458234} 12/22/2021 07:08:32 - INFO - codeparrot_training - Step 5027: {'lr': 0.0004883917484011083, 'samples': 2574336, 'steps': 5027, 'batch_loss/train': 0.9128123736009002} 12/22/2021 07:08:44 - INFO - codeparrot_training - Step 5028: {'lr': 0.0004883870003708591, 'samples': 2574848, 'steps': 5028, 'batch_loss/train': 0.8985810521990061} 12/22/2021 07:08:55 - INFO - codeparrot_training - Step 5029: {'lr': 0.0004883822513928733, 'samples': 2575360, 'steps': 5029, 'batch_loss/train': 0.8880992052145302} 12/22/2021 07:09:06 - INFO - codeparrot_training - Step 5030: {'lr': 0.0004883775014671697, 'samples': 2575872, 'steps': 5030, 'batch_loss/train': 0.7543832990340889} 12/22/2021 07:09:16 - INFO - codeparrot_training - Step 5031: {'lr': 0.0004883727505937671, 'samples': 2576384, 'steps': 5031, 'batch_loss/train': 0.8083597607910633} 12/22/2021 07:09:28 - INFO - codeparrot_training - Step 5032: {'lr': 0.0004883679987726844, 'samples': 2576896, 'steps': 5032, 'batch_loss/train': 0.7212779433466494} 12/22/2021 07:09:39 - INFO - codeparrot_training - Step 5033: {'lr': 0.0004883632460039406, 'samples': 2577408, 'steps': 5033, 'batch_loss/train': 0.7584801316261292} 12/22/2021 07:09:50 - INFO - codeparrot_training - Step 5034: {'lr': 0.0004883584922875546, 'samples': 2577920, 'steps': 5034, 'batch_loss/train': 0.8040855978615582} 12/22/2021 07:10:02 - INFO - codeparrot_training - Step 5035: {'lr': 0.0004883537376235452, 'samples': 2578432, 'steps': 
5035, 'batch_loss/train': 0.8081605602055788} 12/22/2021 07:10:13 - INFO - codeparrot_training - Step 5036: {'lr': 0.0004883489820119313, 'samples': 2578944, 'steps': 5036, 'batch_loss/train': 0.8174084788188338} 12/22/2021 07:10:24 - INFO - codeparrot_training - Step 5037: {'lr': 0.0004883442254527319, 'samples': 2579456, 'steps': 5037, 'batch_loss/train': 0.8375936807133257} 12/22/2021 07:10:36 - INFO - codeparrot_training - Step 5038: {'lr': 0.0004883394679459658, 'samples': 2579968, 'steps': 5038, 'batch_loss/train': 0.8922272576019168} 12/22/2021 07:10:46 - INFO - codeparrot_training - Step 5039: {'lr': 0.000488334709491652, 'samples': 2580480, 'steps': 5039, 'batch_loss/train': 0.8475206391885877} 12/22/2021 07:10:57 - INFO - codeparrot_training - Step 5040: {'lr': 0.0004883299500898094, 'samples': 2580992, 'steps': 5040, 'batch_loss/train': 0.7866382766515017} 12/22/2021 07:11:08 - INFO - codeparrot_training - Step 5041: {'lr': 0.0004883251897404569, 'samples': 2581504, 'steps': 5041, 'batch_loss/train': 0.8242761408910155} 12/22/2021 07:11:20 - INFO - codeparrot_training - Step 5042: {'lr': 0.0004883204284436134, 'samples': 2582016, 'steps': 5042, 'batch_loss/train': 0.8427498396486044} 12/22/2021 07:11:31 - INFO - codeparrot_training - Step 5043: {'lr': 0.0004883156661992979, 'samples': 2582528, 'steps': 5043, 'batch_loss/train': 0.7681382885202765} 12/22/2021 07:11:41 - INFO - codeparrot_training - Step 5044: {'lr': 0.0004883109030075293, 'samples': 2583040, 'steps': 5044, 'batch_loss/train': 0.761404390912503} 12/22/2021 07:11:53 - INFO - codeparrot_training - Step 5045: {'lr': 0.0004883061388683264, 'samples': 2583552, 'steps': 5045, 'batch_loss/train': 0.8296514190733433} 12/22/2021 07:12:04 - INFO - codeparrot_training - Step 5046: {'lr': 0.0004883013737817085, 'samples': 2584064, 'steps': 5046, 'batch_loss/train': 0.813614132697694} 12/22/2021 07:12:15 - INFO - codeparrot_training - Step 5047: {'lr': 0.0004882966077476941, 'samples': 2584576, 
'steps': 5047, 'batch_loss/train': 0.8238914450630546} 12/22/2021 07:12:27 - INFO - codeparrot_training - Step 5048: {'lr': 0.0004882918407663025, 'samples': 2585088, 'steps': 5048, 'batch_loss/train': 0.8934039529412985} 12/22/2021 07:12:38 - INFO - codeparrot_training - Step 5049: {'lr': 0.0004882870728375525, 'samples': 2585600, 'steps': 5049, 'batch_loss/train': 0.8865980440750718} 12/22/2021 07:12:49 - INFO - codeparrot_training - Step 5050: {'lr': 0.000488282303961463, 'samples': 2586112, 'steps': 5050, 'batch_loss/train': 0.8109507393091917} 12/22/2021 07:13:01 - INFO - codeparrot_training - Step 5051: {'lr': 0.0004882775341380531, 'samples': 2586624, 'steps': 5051, 'batch_loss/train': 0.7727855751290917} 12/22/2021 07:13:11 - INFO - codeparrot_training - Step 5052: {'lr': 0.0004882727633673417, 'samples': 2587136, 'steps': 5052, 'batch_loss/train': 0.7365996188018471} 12/22/2021 07:13:22 - INFO - codeparrot_training - Step 5053: {'lr': 0.0004882679916493478, 'samples': 2587648, 'steps': 5053, 'batch_loss/train': 0.923710910603404} 12/22/2021 07:13:33 - INFO - codeparrot_training - Step 5054: {'lr': 0.0004882632189840902, 'samples': 2588160, 'steps': 5054, 'batch_loss/train': 0.8788424925878644} 12/22/2021 07:13:46 - INFO - codeparrot_training - Step 5055: {'lr': 0.00048825844537158807, 'samples': 2588672, 'steps': 5055, 'batch_loss/train': 0.5772469540825114} 12/22/2021 07:13:56 - INFO - codeparrot_training - Step 5056: {'lr': 0.00048825367081186034, 'samples': 2589184, 'steps': 5056, 'batch_loss/train': 0.7502389438450336} 12/22/2021 07:14:07 - INFO - codeparrot_training - Step 5057: {'lr': 0.000488248895304926, 'samples': 2589696, 'steps': 5057, 'batch_loss/train': 0.8012906024232507} 12/22/2021 07:14:19 - INFO - codeparrot_training - Step 5058: {'lr': 0.00048824411885080396, 'samples': 2590208, 'steps': 5058, 'batch_loss/train': 0.9026337964460254} 12/22/2021 07:14:30 - INFO - codeparrot_training - Step 5059: {'lr': 0.00048823934144951326, 'samples': 
2590720, 'steps': 5059, 'batch_loss/train': 0.7986275171861053} 12/22/2021 07:14:40 - INFO - codeparrot_training - Step 5060: {'lr': 0.000488234563101073, 'samples': 2591232, 'steps': 5060, 'batch_loss/train': 0.729488730430603} 12/22/2021 07:14:53 - INFO - codeparrot_training - Step 5061: {'lr': 0.00048822978380550186, 'samples': 2591744, 'steps': 5061, 'batch_loss/train': 0.7329579290235415} 12/22/2021 07:15:03 - INFO - codeparrot_training - Step 5062: {'lr': 0.00048822500356281916, 'samples': 2592256, 'steps': 5062, 'batch_loss/train': 0.7397046044934541} 12/22/2021 07:15:14 - INFO - codeparrot_training - Step 5063: {'lr': 0.00048822022237304385, 'samples': 2592768, 'steps': 5063, 'batch_loss/train': 0.8124210952082649} 12/22/2021 07:15:27 - INFO - codeparrot_training - Step 5064: {'lr': 0.0004882154402361949, 'samples': 2593280, 'steps': 5064, 'batch_loss/train': 0.8574423836544156} 12/22/2021 07:15:37 - INFO - codeparrot_training - Step 5065: {'lr': 0.0004882106571522912, 'samples': 2593792, 'steps': 5065, 'batch_loss/train': 0.734961814712733} 12/22/2021 07:15:48 - INFO - codeparrot_training - Step 5066: {'lr': 0.00048820587312135193, 'samples': 2594304, 'steps': 5066, 'batch_loss/train': 0.7706595063209534} 12/22/2021 07:15:59 - INFO - codeparrot_training - Step 5067: {'lr': 0.000488201088143396, 'samples': 2594816, 'steps': 5067, 'batch_loss/train': 0.9204591987654567} 12/22/2021 07:16:11 - INFO - codeparrot_training - Step 5068: {'lr': 0.0004881963022184426, 'samples': 2595328, 'steps': 5068, 'batch_loss/train': 0.8115908158943057} 12/22/2021 07:16:21 - INFO - codeparrot_training - Step 5069: {'lr': 0.0004881915153465106, 'samples': 2595840, 'steps': 5069, 'batch_loss/train': 0.8354516974650323} 12/22/2021 07:16:32 - INFO - codeparrot_training - Step 5070: {'lr': 0.000488186727527619, 'samples': 2596352, 'steps': 5070, 'batch_loss/train': 0.9027641238644719} 12/22/2021 07:16:44 - INFO - codeparrot_training - Step 5071: {'lr': 0.00048818193876178697, 
'samples': 2596864, 'steps': 5071, 'batch_loss/train': 0.8526964816264808} 12/22/2021 07:16:55 - INFO - codeparrot_training - Step 5072: {'lr': 0.00048817714904903344, 'samples': 2597376, 'steps': 5072, 'batch_loss/train': 0.814683657605201} 12/22/2021 07:17:05 - INFO - codeparrot_training - Step 5073: {'lr': 0.00048817235838937757, 'samples': 2597888, 'steps': 5073, 'batch_loss/train': 0.9269695272669196} 12/22/2021 07:17:18 - INFO - codeparrot_training - Step 5074: {'lr': 0.00048816756678283824, 'samples': 2598400, 'steps': 5074, 'batch_loss/train': 0.8341900110244751} 12/22/2021 07:17:29 - INFO - codeparrot_training - Step 5075: {'lr': 0.0004881627742294347, 'samples': 2598912, 'steps': 5075, 'batch_loss/train': 0.8081537466496229} 12/22/2021 07:17:39 - INFO - codeparrot_training - Step 5076: {'lr': 0.0004881579807291858, 'samples': 2599424, 'steps': 5076, 'batch_loss/train': 0.8625033758580685} 12/22/2021 07:17:52 - INFO - codeparrot_training - Step 5077: {'lr': 0.00048815318628211074, 'samples': 2599936, 'steps': 5077, 'batch_loss/train': 0.833066706545651} 12/22/2021 07:18:02 - INFO - codeparrot_training - Step 5078: {'lr': 0.0004881483908882285, 'samples': 2600448, 'steps': 5078, 'batch_loss/train': 0.8349724127911031} 12/22/2021 07:18:13 - INFO - codeparrot_training - Step 5079: {'lr': 0.00048814359454755823, 'samples': 2600960, 'steps': 5079, 'batch_loss/train': 0.8388864588923752} 12/22/2021 07:18:23 - INFO - codeparrot_training - Step 5080: {'lr': 0.0004881387972601189, 'samples': 2601472, 'steps': 5080, 'batch_loss/train': 0.7548414124175906} 12/22/2021 07:18:36 - INFO - codeparrot_training - Step 5081: {'lr': 0.0004881339990259297, 'samples': 2601984, 'steps': 5081, 'batch_loss/train': 0.8260689387097955} 12/22/2021 07:18:46 - INFO - codeparrot_training - Step 5082: {'lr': 0.0004881291998450096, 'samples': 2602496, 'steps': 5082, 'batch_loss/train': 0.8602105574682355} 12/22/2021 07:18:57 - INFO - codeparrot_training - Step 5083: {'lr': 
0.0004881243997173777, 'samples': 2603008, 'steps': 5083, 'batch_loss/train': 0.8235787763260305}
12/22/2021 07:19:10 - INFO - codeparrot_training - Step 5084: {'lr': 0.0004881195986430531, 'samples': 2603520, 'steps': 5084, 'batch_loss/train': 0.9158705808222294}
12/22/2021 07:19:20 - INFO - codeparrot_training - Step 5085: {'lr': 0.0004881147966220549, 'samples': 2604032, 'steps': 5085, 'batch_loss/train': 0.923566329292953}
12/22/2021 07:19:31 - INFO - codeparrot_training - Step 5086: {'lr': 0.0004881099936544021, 'samples': 2604544, 'steps': 5086, 'batch_loss/train': 0.7876188373193145}
12/22/2021 07:19:43 - INFO - codeparrot_training - Step 5087: {'lr': 0.000488105189740114, 'samples': 2605056, 'steps': 5087, 'batch_loss/train': 0.9078691601753235}
12/22/2021 07:19:54 - INFO - codeparrot_training - Step 5088: {'lr': 0.0004881003848792095, 'samples': 2605568, 'steps': 5088, 'batch_loss/train': 0.8131893342360854}
12/22/2021 07:20:04 - INFO - codeparrot_training - Step 5089: {'lr': 0.00048809557907170776, 'samples': 2606080, 'steps': 5089, 'batch_loss/train': 0.7458519868087023}
12/22/2021 07:20:15 - INFO - codeparrot_training - Step 5090: {'lr': 0.0004880907723176279, 'samples': 2606592, 'steps': 5090, 'batch_loss/train': 0.8540478236973286}
12/22/2021 07:20:27 - INFO - codeparrot_training - Step 5091: {'lr': 0.00048808596461698907, 'samples': 2607104, 'steps': 5091, 'batch_loss/train': 0.8699125368148088}
12/22/2021 07:20:38 - INFO - codeparrot_training - Step 5092: {'lr': 0.00048808115596981033, 'samples': 2607616, 'steps': 5092, 'batch_loss/train': 0.8240590878995135}
12/22/2021 07:20:48 - INFO - codeparrot_training - Step 5093: {'lr': 0.00048807634637611084, 'samples': 2608128, 'steps': 5093, 'batch_loss/train': 0.8215751945972443}
12/22/2021 07:21:01 - INFO - codeparrot_training - Step 5094: {'lr': 0.00048807153583590963, 'samples': 2608640, 'steps': 5094, 'batch_loss/train': 0.8679645885713398}
12/22/2021 07:21:12 - INFO - codeparrot_training - Step 5095: {'lr': 0.00048806672434922594, 'samples': 2609152, 'steps': 5095, 'batch_loss/train': 0.9064486846327782}
12/22/2021 07:21:22 - INFO - codeparrot_training - Step 5096: {'lr': 0.00048806191191607876, 'samples': 2609664, 'steps': 5096, 'batch_loss/train': 0.7940569492056966}
12/22/2021 07:21:34 - INFO - codeparrot_training - Step 5097: {'lr': 0.00048805709853648737, 'samples': 2610176, 'steps': 5097, 'batch_loss/train': 0.8515434018336236}
12/22/2021 07:21:45 - INFO - codeparrot_training - Step 5098: {'lr': 0.00048805228421047085, 'samples': 2610688, 'steps': 5098, 'batch_loss/train': 0.8491352340206504}
12/22/2021 07:21:56 - INFO - codeparrot_training - Step 5099: {'lr': 0.0004880474689380483, 'samples': 2611200, 'steps': 5099, 'batch_loss/train': 0.8125575222074986}
12/22/2021 07:22:08 - INFO - codeparrot_training - Step 5100: {'lr': 0.0004880426527192389, 'samples': 2611712, 'steps': 5100, 'batch_loss/train': 0.8250413299538195}
12/22/2021 07:22:19 - INFO - codeparrot_training - Step 5101: {'lr': 0.00048803783555406184, 'samples': 2612224, 'steps': 5101, 'batch_loss/train': 0.8158526755869389}
12/22/2021 07:22:29 - INFO - codeparrot_training - Step 5102: {'lr': 0.0004880330174425362, 'samples': 2612736, 'steps': 5102, 'batch_loss/train': 0.7864664131775498}
12/22/2021 07:22:40 - INFO - codeparrot_training - Step 5103: {'lr': 0.00048802819838468113, 'samples': 2613248, 'steps': 5103, 'batch_loss/train': 0.7860177331604064}
12/22/2021 07:22:52 - INFO - codeparrot_training - Step 5104: {'lr': 0.00048802337838051586, 'samples': 2613760, 'steps': 5104, 'batch_loss/train': 0.7723338128998876}
12/22/2021 07:23:03 - INFO - codeparrot_training - Step 5105: {'lr': 0.0004880185574300595, 'samples': 2614272, 'steps': 5105, 'batch_loss/train': 0.8864371068775654}
12/22/2021 07:23:14 - INFO - codeparrot_training - Step 5106: {'lr': 0.0004880137355333312, 'samples': 2614784, 'steps': 5106, 'batch_loss/train': 0.8259484227746725}
12/22/2021 07:23:26 - INFO - codeparrot_training - Step 5107: {'lr': 0.00048800891269035017, 'samples': 2615296, 'steps': 5107, 'batch_loss/train': 0.7503736224025488}
12/22/2021 07:23:36 - INFO - codeparrot_training - Step 5108: {'lr': 0.0004880040889011356, 'samples': 2615808, 'steps': 5108, 'batch_loss/train': 0.796417617239058}
12/22/2021 07:23:47 - INFO - codeparrot_training - Step 5109: {'lr': 0.0004879992641657066, 'samples': 2616320, 'steps': 5109, 'batch_loss/train': 0.839892765507102}
12/22/2021 07:24:00 - INFO - codeparrot_training - Step 5110: {'lr': 0.00048799443848408243, 'samples': 2616832, 'steps': 5110, 'batch_loss/train': 0.6519882655702531}
12/22/2021 07:24:10 - INFO - codeparrot_training - Step 5111: {'lr': 0.0004879896118562822, 'samples': 2617344, 'steps': 5111, 'batch_loss/train': 0.7405995714943856}
12/22/2021 07:24:21 - INFO - codeparrot_training - Step 5112: {'lr': 0.0004879847842823252, 'samples': 2617856, 'steps': 5112, 'batch_loss/train': 0.9239078853279352}
12/22/2021 07:24:33 - INFO - codeparrot_training - Step 5113: {'lr': 0.0004879799557622305, 'samples': 2618368, 'steps': 5113, 'batch_loss/train': 0.8596064625307918}
12/22/2021 07:24:44 - INFO - codeparrot_training - Step 5114: {'lr': 0.00048797512629601735, 'samples': 2618880, 'steps': 5114, 'batch_loss/train': 0.7944960305467248}
12/22/2021 07:24:54 - INFO - codeparrot_training - Step 5115: {'lr': 0.000487970295883705, 'samples': 2619392, 'steps': 5115, 'batch_loss/train': 0.8262302987277508}
12/22/2021 07:25:05 - INFO - codeparrot_training - Step 5116: {'lr': 0.00048796546452531256, 'samples': 2619904, 'steps': 5116, 'batch_loss/train': 0.876097368542105}
12/22/2021 07:25:17 - INFO - codeparrot_training - Step 5117: {'lr': 0.0004879606322208593, 'samples': 2620416, 'steps': 5117, 'batch_loss/train': 0.8387753623537719}
12/22/2021 07:25:27 - INFO - codeparrot_training - Step 5118: {'lr': 0.0004879557989703644, 'samples': 2620928, 'steps': 5118, 'batch_loss/train': 0.8020581370219588}
12/22/2021 07:25:38 - INFO - codeparrot_training - Step 5119: {'lr': 0.00048795096477384714, 'samples': 2621440, 'steps': 5119, 'batch_loss/train': 0.8092505186796188}
12/22/2021 07:25:50 - INFO - codeparrot_training - Step 5120: {'lr': 0.0004879461296313267, 'samples': 2621952, 'steps': 5120, 'batch_loss/train': 0.7977716131135821}
12/22/2021 07:26:01 - INFO - codeparrot_training - Step 5121: {'lr': 0.0004879412935428222, 'samples': 2622464, 'steps': 5121, 'batch_loss/train': 0.8285151366144419}
12/22/2021 07:26:11 - INFO - codeparrot_training - Step 5122: {'lr': 0.00048793645650835305, 'samples': 2622976, 'steps': 5122, 'batch_loss/train': 0.6300239127594978}
12/22/2021 07:26:24 - INFO - codeparrot_training - Step 5123: {'lr': 0.00048793161852793833, 'samples': 2623488, 'steps': 5123, 'batch_loss/train': 0.9078920201864094}
12/22/2021 07:26:35 - INFO - codeparrot_training - Step 5124: {'lr': 0.0004879267796015974, 'samples': 2624000, 'steps': 5124, 'batch_loss/train': 0.7642887248657644}
12/22/2021 07:26:45 - INFO - codeparrot_training - Step 5125: {'lr': 0.00048792193972934934, 'samples': 2624512, 'steps': 5125, 'batch_loss/train': 0.8558997400105}
12/22/2021 07:26:57 - INFO - codeparrot_training - Step 5126: {'lr': 0.0004879170989112135, 'samples': 2625024, 'steps': 5126, 'batch_loss/train': 0.7889987037051469}
12/22/2021 07:27:08 - INFO - codeparrot_training - Step 5127: {'lr': 0.00048791225714720914, 'samples': 2625536, 'steps': 5127, 'batch_loss/train': 0.9400976072065532}
12/22/2021 07:27:19 - INFO - codeparrot_training - Step 5128: {'lr': 0.0004879074144373555, 'samples': 2626048, 'steps': 5128, 'batch_loss/train': 0.9022550489753485}
12/22/2021 07:27:29 - INFO - codeparrot_training - Step 5129: {'lr': 0.0004879025707816718, 'samples': 2626560, 'steps': 5129, 'batch_loss/train': 0.8805124843493104}
12/22/2021 07:27:42 - INFO - codeparrot_training - Step 5130: {'lr': 0.0004878977261801773, 'samples': 2627072, 'steps': 5130, 'batch_loss/train': 0.7635718630626798}
12/22/2021 07:27:52 - INFO - codeparrot_training - Step 5131: {'lr': 0.0004878928806328913, 'samples': 2627584, 'steps': 5131, 'batch_loss/train': 0.8855533329769969}
12/22/2021 07:28:03 - INFO - codeparrot_training - Step 5132: {'lr': 0.00048788803413983296, 'samples': 2628096, 'steps': 5132, 'batch_loss/train': 0.8357096980325878}
12/22/2021 07:28:15 - INFO - codeparrot_training - Step 5133: {'lr': 0.00048788318670102166, 'samples': 2628608, 'steps': 5133, 'batch_loss/train': 0.8420067806728184}
12/22/2021 07:28:26 - INFO - codeparrot_training - Step 5134: {'lr': 0.00048787833831647663, 'samples': 2629120, 'steps': 5134, 'batch_loss/train': 0.8075422085821629}
12/22/2021 07:28:37 - INFO - codeparrot_training - Step 5135: {'lr': 0.00048787348898621713, 'samples': 2629632, 'steps': 5135, 'batch_loss/train': 0.7994310911744833}
12/22/2021 07:28:49 - INFO - codeparrot_training - Step 5136: {'lr': 0.0004878686387102625, 'samples': 2630144, 'steps': 5136, 'batch_loss/train': 1.0079103857278824}
12/22/2021 07:28:59 - INFO - codeparrot_training - Step 5137: {'lr': 0.0004878637874886319, 'samples': 2630656, 'steps': 5137, 'batch_loss/train': 0.5567522118799388}
12/22/2021 07:29:10 - INFO - codeparrot_training - Step 5138: {'lr': 0.0004878589353213447, 'samples': 2631168, 'steps': 5138, 'batch_loss/train': 0.8272718489170074}
12/22/2021 07:29:21 - INFO - codeparrot_training - Step 5139: {'lr': 0.0004878540822084203, 'samples': 2631680, 'steps': 5139, 'batch_loss/train': 1.0862758927978575}
12/22/2021 07:29:33 - INFO - codeparrot_training - Step 5140: {'lr': 0.0004878492281498778, 'samples': 2632192, 'steps': 5140, 'batch_loss/train': 1.2662646835669875}
12/22/2021 07:29:44 - INFO - codeparrot_training - Step 5141: {'lr': 0.0004878443731457366, 'samples': 2632704, 'steps': 5141, 'batch_loss/train': 0.879656913690269}
12/22/2021 07:29:55 - INFO - codeparrot_training - Step 5142: {'lr': 0.00048783951719601594, 'samples': 2633216, 'steps': 5142, 'batch_loss/train': 0.8494847221300006}
12/22/2021 07:30:07 - INFO - codeparrot_training - Step 5143: {'lr': 0.0004878346603007352, 'samples': 2633728, 'steps': 5143, 'batch_loss/train': 0.7303058311226778}
12/22/2021 07:30:17 - INFO - codeparrot_training - Step 5144: {'lr': 0.00048782980245991363, 'samples': 2634240, 'steps': 5144, 'batch_loss/train': 0.6992500469787046}
12/22/2021 07:30:28 - INFO - codeparrot_training - Step 5145: {'lr': 0.00048782494367357053, 'samples': 2634752, 'steps': 5145, 'batch_loss/train': 1.7016164693050086}
12/22/2021 07:30:40 - INFO - codeparrot_training - Step 5146: {'lr': 0.0004878200839417254, 'samples': 2635264, 'steps': 5146, 'batch_loss/train': 0.887586587574333}
12/22/2021 07:30:51 - INFO - codeparrot_training - Step 5147: {'lr': 0.00048781522326439726, 'samples': 2635776, 'steps': 5147, 'batch_loss/train': 0.6353576665278524}
12/22/2021 07:31:01 - INFO - codeparrot_training - Step 5148: {'lr': 0.00048781036164160564, 'samples': 2636288, 'steps': 5148, 'batch_loss/train': 0.796504438854754}
12/22/2021 07:31:14 - INFO - codeparrot_training - Step 5149: {'lr': 0.00048780549907336977, 'samples': 2636800, 'steps': 5149, 'batch_loss/train': 0.8766056816093624}
12/22/2021 07:31:25 - INFO - codeparrot_training - Step 5150: {'lr': 0.0004878006355597091, 'samples': 2637312, 'steps': 5150, 'batch_loss/train': 0.8060974571853876}
12/22/2021 07:31:35 - INFO - codeparrot_training - Step 5151: {'lr': 0.0004877957711006429, 'samples': 2637824, 'steps': 5151, 'batch_loss/train': 0.797766949981451}
12/22/2021 07:31:46 - INFO - codeparrot_training - Step 5152: {'lr': 0.00048779090569619045, 'samples': 2638336, 'steps': 5152, 'batch_loss/train': 0.8672151481732726}
12/22/2021 07:31:58 - INFO - codeparrot_training - Step 5153: {'lr': 0.00048778603934637125, 'samples': 2638848, 'steps': 5153, 'batch_loss/train': 0.8151112599298358}
12/22/2021 07:32:09 - INFO - codeparrot_training - Step 5154: {'lr': 0.0004877811720512044, 'samples': 2639360, 'steps': 5154, 'batch_loss/train': 0.8036110382527113}
12/22/2021 07:32:20 - INFO - codeparrot_training - Step 5155: {'lr': 0.0004877763038107095, 'samples': 2639872, 'steps': 5155, 'batch_loss/train': 0.730876925168559}
12/22/2021 07:32:32 - INFO - codeparrot_training - Step 5156: {'lr': 0.0004877714346249057, 'samples': 2640384, 'steps': 5156, 'batch_loss/train': 0.8656086195260286}
12/22/2021 07:32:43 - INFO - codeparrot_training - Step 5157: {'lr': 0.0004877665644938125, 'samples': 2640896, 'steps': 5157, 'batch_loss/train': 0.5909633635892533}
12/22/2021 07:32:53 - INFO - codeparrot_training - Step 5158: {'lr': 0.00048776169341744924, 'samples': 2641408, 'steps': 5158, 'batch_loss/train': 0.839778233319521}
12/22/2021 07:33:06 - INFO - codeparrot_training - Step 5159: {'lr': 0.0004877568213958352, 'samples': 2641920, 'steps': 5159, 'batch_loss/train': 0.8973023425787687}
12/22/2021 07:33:17 - INFO - codeparrot_training - Step 5160: {'lr': 0.00048775194842898983, 'samples': 2642432, 'steps': 5160, 'batch_loss/train': 0.9385179420933127}
12/22/2021 07:33:28 - INFO - codeparrot_training - Step 5161: {'lr': 0.00048774707451693257, 'samples': 2642944, 'steps': 5161, 'batch_loss/train': 1.0820924900472164}
12/22/2021 07:33:40 - INFO - codeparrot_training - Step 5162: {'lr': 0.0004877421996596826, 'samples': 2643456, 'steps': 5162, 'batch_loss/train': 0.8256354429759085}
12/22/2021 07:33:50 - INFO - codeparrot_training - Step 5163: {'lr': 0.0004877373238572595, 'samples': 2643968, 'steps': 5163, 'batch_loss/train': 0.9022978371940553}
12/22/2021 07:34:01 - INFO - codeparrot_training - Step 5164: {'lr': 0.0004877324471096825, 'samples': 2644480, 'steps': 5164, 'batch_loss/train': 0.7393261989345774}
12/22/2021 07:34:12 - INFO - codeparrot_training - Step 5165: {'lr': 0.0004877275694169711, 'samples': 2644992, 'steps': 5165, 'batch_loss/train': 0.799501902423799}
12/22/2021 07:34:24 - INFO - codeparrot_training - Step 5166: {'lr': 0.00048772269077914466, 'samples': 2645504, 'steps': 5166, 'batch_loss/train': 1.6017461405135691}
12/22/2021 07:34:34 - INFO - codeparrot_training - Step 5167: {'lr': 0.0004877178111962225, 'samples': 2646016, 'steps': 5167, 'batch_loss/train': 0.8642596527934074}
12/22/2021 07:34:45 - INFO - codeparrot_training - Step 5168: {'lr': 0.0004877129306682241, 'samples': 2646528, 'steps': 5168, 'batch_loss/train': 0.9378862408921123}
12/22/2021 07:34:58 - INFO - codeparrot_training - Step 5169: {'lr': 0.0004877080491951689, 'samples': 2647040, 'steps': 5169, 'batch_loss/train': 0.7520132805220783}
12/22/2021 07:35:09 - INFO - codeparrot_training - Step 5170: {'lr': 0.0004877031667770762, 'samples': 2647552, 'steps': 5170, 'batch_loss/train': 0.8114856285974383}
12/22/2021 07:35:19 - INFO - codeparrot_training - Step 5171: {'lr': 0.0004876982834139654, 'samples': 2648064, 'steps': 5171, 'batch_loss/train': 0.6848253235220909}
12/22/2021 07:35:31 - INFO - codeparrot_training - Step 5172: {'lr': 0.0004876933991058561, 'samples': 2648576, 'steps': 5172, 'batch_loss/train': 0.8601394454017282}
12/22/2021 07:35:42 - INFO - codeparrot_training - Step 5173: {'lr': 0.00048768851385276753, 'samples': 2649088, 'steps': 5173, 'batch_loss/train': 0.8531266744248569}
12/22/2021 07:35:53 - INFO - codeparrot_training - Step 5174: {'lr': 0.0004876836276547192, 'samples': 2649600, 'steps': 5174, 'batch_loss/train': 0.8611481385305524}
12/22/2021 07:36:03 - INFO - codeparrot_training - Step 5175: {'lr': 0.0004876787405117305, 'samples': 2650112, 'steps': 5175, 'batch_loss/train': 0.7979636676609516}
12/22/2021 07:36:15 - INFO - codeparrot_training - Step 5176: {'lr': 0.0004876738524238209, 'samples': 2650624, 'steps': 5176, 'batch_loss/train': 0.8422134951688349}
12/22/2021 07:36:26 - INFO - codeparrot_training - Step 5177: {'lr': 0.00048766896339100973, 'samples': 2651136, 'steps': 5177, 'batch_loss/train': 0.899076409637928}
12/22/2021 07:36:36 - INFO - codeparrot_training - Step 5178: {'lr': 0.0004876640734133165, 'samples': 2651648, 'steps': 5178, 'batch_loss/train': 0.7860659803263843}
12/22/2021 07:36:49 - INFO - codeparrot_training - Step 5179: {'lr': 0.0004876591824907608, 'samples': 2652160, 'steps': 5179, 'batch_loss/train': 0.8796239215880632}
12/22/2021 07:37:00 - INFO - codeparrot_training - Step 5180: {'lr': 0.00048765429062336176, 'samples': 2652672, 'steps': 5180, 'batch_loss/train': 0.8244584454223514}
12/22/2021 07:37:10 - INFO - codeparrot_training - Step 5181: {'lr': 0.0004876493978111391, 'samples': 2653184, 'steps': 5181, 'batch_loss/train': 0.8448367095552385}
12/22/2021 07:37:22 - INFO - codeparrot_training - Step 5182: {'lr': 0.0004876445040541121, 'samples': 2653696, 'steps': 5182, 'batch_loss/train': 0.793601281940937}
12/22/2021 07:37:33 - INFO - codeparrot_training - Step 5183: {'lr': 0.00048763960935230033, 'samples': 2654208, 'steps': 5183, 'batch_loss/train': 0.9436844829469919}
12/22/2021 07:37:44 - INFO - codeparrot_training - Step 5184: {'lr': 0.0004876347137057232, 'samples': 2654720, 'steps': 5184, 'batch_loss/train': 0.723298512166366}
12/22/2021 07:37:56 - INFO - codeparrot_training - Step 5185: {'lr': 0.0004876298171144001, 'samples': 2655232, 'steps': 5185, 'batch_loss/train': 0.8599538998678327}
12/22/2021 07:38:07 - INFO - codeparrot_training - Step 5186: {'lr': 0.0004876249195783507, 'samples': 2655744, 'steps': 5186, 'batch_loss/train': 0.794318325817585}
12/22/2021 07:38:17 - INFO - codeparrot_training - Step 5187: {'lr': 0.0004876200210975943, 'samples': 2656256, 'steps': 5187, 'batch_loss/train': 0.5681743751047179}
12/22/2021 07:38:28 - INFO - codeparrot_training - Step 5188: {'lr': 0.0004876151216721504, 'samples': 2656768, 'steps': 5188, 'batch_loss/train': 0.8262056731618941}
12/22/2021 07:38:41 - INFO - codeparrot_training - Step 5189: {'lr': 0.00048761022130203846, 'samples': 2657280, 'steps': 5189, 'batch_loss/train': 0.8211936857551336}
12/22/2021 07:38:51 - INFO - codeparrot_training - Step 5190: {'lr': 0.00048760531998727805, 'samples': 2657792, 'steps': 5190, 'batch_loss/train': 0.805610129609704}
12/22/2021 07:39:02 - INFO - codeparrot_training - Step 5191: {'lr': 0.00048760041772788856, 'samples': 2658304, 'steps': 5191, 'batch_loss/train': 0.9093351420015097}
12/22/2021 07:39:14 - INFO - codeparrot_training - Step 5192: {'lr': 0.00048759551452388954, 'samples': 2658816, 'steps': 5192, 'batch_loss/train': 0.836968925781548}
12/22/2021 07:39:24 - INFO - codeparrot_training - Step 5193: {'lr': 0.0004875906103753005, 'samples': 2659328, 'steps': 5193, 'batch_loss/train': 0.8154324032366276}
12/22/2021 07:39:35 - INFO - codeparrot_training - Step 5194: {'lr': 0.00048758570528214086, 'samples': 2659840, 'steps': 5194, 'batch_loss/train': 0.8593331333249807}
12/22/2021 07:39:48 - INFO - codeparrot_training - Step 5195: {'lr': 0.00048758079924443023, 'samples': 2660352, 'steps': 5195, 'batch_loss/train': 0.859629150480032}
12/22/2021 07:39:58 - INFO - codeparrot_training - Step 5196: {'lr': 0.000487575892262188, 'samples': 2660864, 'steps': 5196, 'batch_loss/train': 0.8120974572375417}
12/22/2021 07:40:09 - INFO - codeparrot_training - Step 5197: {'lr': 0.0004875709843354337, 'samples': 2661376, 'steps': 5197, 'batch_loss/train': 0.807296521961689}
12/22/2021 07:40:20 - INFO - codeparrot_training - Step 5198: {'lr': 0.00048756607546418686, 'samples': 2661888, 'steps': 5198, 'batch_loss/train': 0.801524709444493}
12/22/2021 07:40:32 - INFO - codeparrot_training - Step 5199: {'lr': 0.0004875611656484671, 'samples': 2662400, 'steps': 5199, 'batch_loss/train': 0.8545715119689703}
12/22/2021 07:40:42 - INFO - codeparrot_training - Step 5200: {'lr': 0.0004875562548882938, 'samples': 2662912, 'steps': 5200, 'batch_loss/train': 0.760861698538065}
12/22/2021 07:40:53 - INFO - codeparrot_training - Step 5201: {'lr': 0.0004875513431836865, 'samples': 2663424, 'steps': 5201, 'batch_loss/train': 0.8220115816220641}
12/22/2021 07:41:05 - INFO - codeparrot_training - Step 5202: {'lr': 0.0004875464305346647, 'samples': 2663936, 'steps': 5202, 'batch_loss/train': 0.7761526093818247}
12/22/2021 07:41:16 - INFO - codeparrot_training - Step 5203: {'lr': 0.00048754151694124814, 'samples': 2664448, 'steps': 5203, 'batch_loss/train': 0.7758071117568761}
12/22/2021 07:41:26 - INFO - codeparrot_training - Step 5204: {'lr': 0.0004875366024034561, 'samples': 2664960, 'steps': 5204, 'batch_loss/train': 0.806422880385071}
12/22/2021 07:41:38 - INFO - codeparrot_training - Step 5205: {'lr': 0.0004875316869213082, 'samples': 2665472, 'steps': 5205, 'batch_loss/train': 0.8189944115001708}
12/22/2021 07:41:49 - INFO - codeparrot_training - Step 5206: {'lr': 0.0004875267704948241, 'samples': 2665984, 'steps': 5206, 'batch_loss/train': 0.943042210303247}
12/22/2021 07:42:00 - INFO - codeparrot_training - Step 5207: {'lr': 0.0004875218531240232, 'samples': 2666496, 'steps': 5207, 'batch_loss/train': 0.721203843713738}
12/22/2021 07:42:12 - INFO - codeparrot_training - Step 5208: {'lr': 0.00048751693480892505, 'samples': 2667008, 'steps': 5208, 'batch_loss/train': 0.7948101174551994}
12/22/2021 07:42:23 - INFO - codeparrot_training - Step 5209: {'lr': 0.0004875120155495493, 'samples': 2667520, 'steps': 5209, 'batch_loss/train': 0.8439151747152209}
12/22/2021 07:42:34 - INFO - codeparrot_training - Step 5210: {'lr': 0.00048750709534591544, 'samples': 2668032, 'steps': 5210, 'batch_loss/train': 0.8746386673301458}
12/22/2021 07:42:44 - INFO - codeparrot_training - Step 5211: {'lr': 0.00048750217419804304, 'samples': 2668544, 'steps': 5211, 'batch_loss/train': 0.8431794699281454}
12/22/2021 07:42:56 - INFO - codeparrot_training - Step 5212: {'lr': 0.0004874972521059517, 'samples': 2669056, 'steps': 5212, 'batch_loss/train': 0.8157262355089188}
12/22/2021 07:43:07 - INFO - codeparrot_training - Step 5213: {'lr': 0.00048749232906966097, 'samples': 2669568, 'steps': 5213, 'batch_loss/train': 0.7748117167502642}
12/22/2021 07:43:18 - INFO - codeparrot_training - Step 5214: {'lr': 0.00048748740508919037, 'samples': 2670080, 'steps': 5214, 'batch_loss/train': 0.8403342058882117}
12/22/2021 07:43:30 - INFO - codeparrot_training - Step 5215: {'lr': 0.00048748248016455955, 'samples': 2670592, 'steps': 5215, 'batch_loss/train': 0.6738692428916693}
12/22/2021 07:43:40 - INFO - codeparrot_training - Step 5216: {'lr': 0.000487477554295788, 'samples': 2671104, 'steps': 5216, 'batch_loss/train': 0.8722157217562199}
12/22/2021 07:43:51 - INFO - codeparrot_training - Step 5217: {'lr': 0.0004874726274828954, 'samples': 2671616, 'steps': 5217, 'batch_loss/train': 0.8459791917121038}
12/22/2021 07:44:04 - INFO - codeparrot_training - Step 5218: {'lr': 0.0004874676997259013, 'samples': 2672128, 'steps': 5218, 'batch_loss/train': 0.7706700381822884}
12/22/2021 07:44:14 - INFO - codeparrot_training - Step 5219: {'lr': 0.00048746277102482534, 'samples': 2672640, 'steps': 5219, 'batch_loss/train': 0.8822456859052181}
12/22/2021 07:44:25 - INFO - codeparrot_training - Step 5220: {'lr': 0.000487457841379687, 'samples': 2673152, 'steps': 5220, 'batch_loss/train': 0.7929918197914958}
12/22/2021 07:44:37 - INFO - codeparrot_training - Step 5221: {'lr': 0.00048745291079050604, 'samples': 2673664, 'steps': 5221, 'batch_loss/train': 1.2833760408684611}
12/22/2021 07:44:48 - INFO - codeparrot_training - Step 5222: {'lr': 0.00048744797925730183, 'samples': 2674176, 'steps': 5222, 'batch_loss/train': 0.7879995191469789}
12/22/2021 07:44:58 - INFO - codeparrot_training - Step 5223: {'lr': 0.00048744304678009423, 'samples': 2674688, 'steps': 5223, 'batch_loss/train': 0.7913250522688031}
12/22/2021 07:45:09 - INFO - codeparrot_training - Step 5224: {'lr': 0.00048743811335890265, 'samples': 2675200, 'steps': 5224, 'batch_loss/train': 0.8812306122854352}
12/22/2021 07:45:22 - INFO - codeparrot_training - Step 5225: {'lr': 0.0004874331789937468, 'samples': 2675712, 'steps': 5225, 'batch_loss/train': 1.5481226909905672}
12/22/2021 07:45:32 - INFO - codeparrot_training - Step 5226: {'lr': 0.0004874282436846463, 'samples': 2676224, 'steps': 5226, 'batch_loss/train': 0.7046557241119444}
12/22/2021 07:45:43 - INFO - codeparrot_training - Step 5227: {'lr': 0.0004874233074316207, 'samples': 2676736, 'steps': 5227, 'batch_loss/train': 0.7271522888913751}
12/22/2021 07:45:55 - INFO - codeparrot_training - Step 5228: {'lr': 0.00048741837023468974, 'samples': 2677248, 'steps': 5228, 'batch_loss/train': 0.8375287316739559}
12/22/2021 07:46:06 - INFO - codeparrot_training - Step 5229: {'lr': 0.00048741343209387296, 'samples': 2677760, 'steps': 5229, 'batch_loss/train': 0.8268868364393711}
12/22/2021 07:46:16 - INFO - codeparrot_training - Step 5230: {'lr': 0.00048740849300919, 'samples': 2678272, 'steps': 5230, 'batch_loss/train': 0.8035427476279438}
12/22/2021 07:46:28 - INFO - codeparrot_training - Step 5231: {'lr': 0.0004874035529806605, 'samples': 2678784, 'steps': 5231, 'batch_loss/train': 0.8765542563050985}
12/22/2021 07:46:39 - INFO - codeparrot_training - Step 5232: {'lr': 0.0004873986120083042, 'samples': 2679296, 'steps': 5232, 'batch_loss/train': 0.8559035463258624}
12/22/2021 07:46:50 - INFO - codeparrot_training - Step 5233: {'lr': 0.00048739367009214066, 'samples': 2679808, 'steps': 5233, 'batch_loss/train': 0.7410897477529943}
12/22/2021 07:47:00 - INFO - codeparrot_training - Step 5234: {'lr': 0.0004873887272321894, 'samples': 2680320, 'steps': 5234, 'batch_loss/train': 0.7711266847327352}
12/22/2021 07:47:12 - INFO - codeparrot_training - Step 5235: {'lr': 0.00048738378342847035, 'samples': 2680832, 'steps': 5235, 'batch_loss/train': 0.784931045724079}
12/22/2021 07:47:23 - INFO - codeparrot_training - Step 5236: {'lr': 0.00048737883868100286, 'samples': 2681344, 'steps': 5236, 'batch_loss/train': 0.824943920597434}
12/22/2021 07:47:34 - INFO - codeparrot_training - Step 5237: {'lr': 0.0004873738929898068, 'samples': 2681856, 'steps': 5237, 'batch_loss/train': 0.8424162706360221}
12/22/2021 07:47:46 - INFO - codeparrot_training - Step 5238: {'lr': 0.0004873689463549018, 'samples': 2682368, 'steps': 5238, 'batch_loss/train': 0.7969976183958352}
12/22/2021 07:47:57 - INFO - codeparrot_training - Step 5239: {'lr': 0.0004873639987763074, 'samples': 2682880, 'steps': 5239, 'batch_loss/train': 0.8095313832163811}
12/22/2021 07:48:08 - INFO - codeparrot_training - Step 5240: {'lr': 0.0004873590502540435, 'samples': 2683392, 'steps': 5240, 'batch_loss/train': 0.7326445318758488}
12/22/2021 07:48:20 - INFO - codeparrot_training - Step 5241: {'lr': 0.0004873541007881296, 'samples': 2683904, 'steps': 5241, 'batch_loss/train': 0.7423460907302797}
12/22/2021 07:48:31 - INFO - codeparrot_training - Step 5242: {'lr': 0.0004873491503785854, 'samples': 2684416, 'steps': 5242, 'batch_loss/train': 0.8121972130611539}
12/22/2021 07:48:41 - INFO - codeparrot_training - Step 5243: {'lr': 0.0004873441990254306, 'samples': 2684928, 'steps': 5243, 'batch_loss/train': 0.7329010413959622}
12/22/2021 07:48:52 - INFO - codeparrot_training - Step 5244: {'lr': 0.0004873392467286848, 'samples': 2685440, 'steps': 5244, 'batch_loss/train': 0.7803105320781469}
12/22/2021 07:49:04 - INFO - codeparrot_training - Step 5245: {'lr': 0.0004873342934883679, 'samples': 2685952, 'steps': 5245, 'batch_loss/train': 0.8564492175355554}
12/22/2021 07:49:14 - INFO - codeparrot_training - Step 5246: {'lr': 0.0004873293393044994, 'samples': 2686464, 'steps': 5246, 'batch_loss/train': 0.9236041205003858}
12/22/2021 07:49:25 - INFO - codeparrot_training - Step 5247: {'lr': 0.000487324384177099, 'samples': 2686976, 'steps': 5247, 'batch_loss/train': 0.768800477613695}
12/22/2021 07:49:38 - INFO - codeparrot_training - Step 5248: {'lr': 0.00048731942810618657, 'samples': 2687488, 'steps': 5248, 'batch_loss/train': 0.8833275465294719}
12/22/2021 07:49:48 - INFO - codeparrot_training - Step 5249: {'lr': 0.00048731447109178166, 'samples': 2688000, 'steps': 5249, 'batch_loss/train': 0.8922822782769799}
12/22/2021 07:49:59 - INFO - codeparrot_training - Step 5250: {'lr': 0.000487309513133904, 'samples': 2688512, 'steps': 5250, 'batch_loss/train': 0.803537632280495}
12/22/2021 07:50:11 - INFO - codeparrot_training - Step 5251: {'lr': 0.00048730455423257334, 'samples': 2689024, 'steps': 5251, 'batch_loss/train': 0.9186808774247766}
12/22/2021 07:50:22 - INFO - codeparrot_training - Step 5252: {'lr': 0.00048729959438780933, 'samples': 2689536, 'steps': 5252, 'batch_loss/train': 0.607413825346157}
12/22/2021 07:50:32 - INFO - codeparrot_training - Step 5253: {'lr': 0.00048729463359963184, 'samples': 2690048, 'steps': 5253, 'batch_loss/train': 0.838462249841541}
12/22/2021 07:50:44 - INFO - codeparrot_training - Step 5254: {'lr': 0.0004872896718680604, 'samples': 2690560, 'steps': 5254, 'batch_loss/train': 0.8639535903930664}
12/22/2021 07:50:55 - INFO - codeparrot_training - Step 5255: {'lr': 0.00048728470919311487, 'samples': 2691072, 'steps': 5255, 'batch_loss/train': 0.7542773224413395}
12/22/2021 07:51:06 - INFO - codeparrot_training - Step 5256: {'lr': 0.0004872797455748149, 'samples': 2691584, 'steps': 5256, 'batch_loss/train': 0.8872460201382637}
12/22/2021 07:51:16 - INFO - codeparrot_training - Step 5257: {'lr': 0.00048727478101318023, 'samples': 2692096, 'steps': 5257, 'batch_loss/train': 0.8411627793684602}
12/22/2021 07:51:31 - INFO - codeparrot_training - Step 5258: {'lr': 0.0004872698155082307, 'samples': 2692608, 'steps': 5258, 'batch_loss/train': 0.796847939491272}
12/22/2021 07:51:41 - INFO - codeparrot_training - Step 5259: {'lr': 0.0004872648490599859, 'samples': 2693120, 'steps': 5259, 'batch_loss/train': 0.8187226597219706}
12/22/2021 07:51:52 - INFO - codeparrot_training - Step 5260: {'lr': 0.0004872598816684657, 'samples': 2693632, 'steps': 5260, 'batch_loss/train': 0.7629484017379582}
12/22/2021 07:52:04 - INFO - codeparrot_training - Step 5261: {'lr': 0.00048725491333368977, 'samples': 2694144, 'steps': 5261, 'batch_loss/train': 0.8822535173967481}
12/22/2021 07:52:15 - INFO - codeparrot_training - Step 5262: {'lr': 0.00048724994405567785, 'samples': 2694656, 'steps': 5262, 'batch_loss/train': 0.7132278555072844}
12/22/2021 07:52:25 - INFO - codeparrot_training - Step 5263: {'lr': 0.0004872449738344498, 'samples': 2695168, 'steps': 5263, 'batch_loss/train': 0.8701714221388102}
12/22/2021 07:52:38 - INFO - codeparrot_training - Step 5264: {'lr': 0.0004872400026700253, 'samples': 2695680, 'steps': 5264, 'batch_loss/train': 0.86508484557271}
12/22/2021 07:52:48 - INFO - codeparrot_training - Step 5265: {'lr': 0.0004872350305624241, 'samples': 2696192, 'steps': 5265, 'batch_loss/train': 0.9200203688815236}
12/22/2021 07:52:59 - INFO - codeparrot_training - Step 5266: {'lr': 0.000487230057511666, 'samples': 2696704, 'steps': 5266, 'batch_loss/train': 0.8444128409028053}
12/22/2021 07:53:09 - INFO - codeparrot_training - Step 5267: {'lr': 0.00048722508351777075, 'samples': 2697216, 'steps': 5267, 'batch_loss/train': 0.8363498440012336}
12/22/2021 07:53:23 - INFO - codeparrot_training - Step 5268: {'lr': 0.00048722010858075817, 'samples': 2697728, 'steps': 5268, 'batch_loss/train': 0.9612704701721668}
12/22/2021 07:53:34 - INFO - codeparrot_training - Step 5269: {'lr': 0.000487215132700648, 'samples': 2698240, 'steps': 5269, 'batch_loss/train': 0.7607409763149917}
12/22/2021 07:53:45 - INFO - codeparrot_training - Step 5270: {'lr': 0.00048721015587746005, 'samples': 2698752, 'steps': 5270, 'batch_loss/train': 0.8636591613758355}
12/22/2021 07:53:57 - INFO - codeparrot_training - Step 5271: {'lr': 0.0004872051781112141, 'samples': 2699264, 'steps': 5271, 'batch_loss/train': 0.7693062666803598}
12/22/2021 07:54:08 - INFO - codeparrot_training - Step 5272: {'lr': 0.00048720019940192983, 'samples': 2699776, 'steps': 5272, 'batch_loss/train': 0.9272532053291798}
12/22/2021 07:54:18 - INFO - codeparrot_training - Step 5273: {'lr': 0.00048719521974962724, 'samples': 2700288, 'steps': 5273, 'batch_loss/train': 0.6500369349960238}
12/22/2021 07:54:31 - INFO - codeparrot_training - Step 5274: {'lr': 0.000487190239154326, 'samples': 2700800, 'steps': 5274, 'batch_loss/train': 1.0594573095440865}
12/22/2021 07:54:41 - INFO - codeparrot_training - Step 5275: {'lr': 0.00048718525761604585, 'samples': 2701312, 'steps': 5275, 'batch_loss/train': 0.9343610694631934}
12/22/2021 07:54:52 - INFO - codeparrot_training - Step 5276: {'lr': 0.0004871802751348068, 'samples': 2701824, 'steps': 5276, 'batch_loss/train': 0.865500376559794}
12/22/2021 07:55:02 - INFO - codeparrot_training - Step 5277: {'lr': 0.0004871752917106285, 'samples': 2702336, 'steps': 5277, 'batch_loss/train': 1.086283485405147}
12/22/2021 07:55:16 - INFO - codeparrot_training - Step 5278: {'lr': 0.00048717030734353066, 'samples': 2702848, 'steps': 5278, 'batch_loss/train': 0.8511466458439827}
12/22/2021 07:55:27 - INFO - codeparrot_training - Step 5279: {'lr': 0.0004871653220335334, 'samples': 2703360, 'steps': 5279, 'batch_loss/train': 0.8828693181276321}
12/22/2021 07:55:37 - INFO - codeparrot_training - Step 5280: {'lr': 0.0004871603357806562, 'samples': 2703872, 'steps': 5280, 'batch_loss/train': 0.8884804360568523}
12/22/2021 07:55:50 - INFO - codeparrot_training - Step 5281: {'lr': 0.00048715534858491914, 'samples': 2704384, 'steps': 5281, 'batch_loss/train': 0.8657029722817242}
12/22/2021 07:56:01 - INFO - codeparrot_training - Step 5282: {'lr': 0.00048715036044634204, 'samples': 2704896, 'steps': 5282, 'batch_loss/train': 0.7236345265991986}
12/22/2021 07:56:11 - INFO - codeparrot_training - Step 5283: {'lr': 0.00048714537136494457, 'samples': 2705408, 'steps': 5283, 'batch_loss/train': 0.4883810649625957}
12/22/2021 07:56:22 - INFO - codeparrot_training - Step 5284: {'lr': 0.0004871403813407467, 'samples': 2705920, 'steps': 5284, 'batch_loss/train': 0.6887710539158434}
12/22/2021 07:56:36 - INFO - codeparrot_training - Step 5285: {'lr': 0.0004871353903737681, 'samples': 2706432, 'steps': 5285, 'batch_loss/train': 0.8846836667507887}
12/22/2021 07:56:46 - INFO - codeparrot_training - Step 5286: {'lr': 0.00048713039846402887, 'samples': 2706944, 'steps': 5286, 'batch_loss/train': 0.8082766351290047}
12/22/2021 07:56:57 - INFO - codeparrot_training - Step 5287: {'lr': 0.0004871254056115486, 'samples': 2707456, 'steps': 5287, 'batch_loss/train': 0.8930182540789247}
12/22/2021 07:57:09 - INFO - codeparrot_training - Step 5288: {'lr': 0.00048712041181634726, 'samples': 2707968, 'steps': 5288, 'batch_loss/train': 0.7312342235818505}
12/22/2021 07:57:20 - INFO - codeparrot_training - Step 5289: {'lr': 0.00048711541707844474, 'samples': 2708480, 'steps': 5289, 'batch_loss/train': 0.8380979089997709}
12/22/2021 07:57:30 - INFO - codeparrot_training - Step 5290: {'lr': 0.0004871104213978609, 'samples': 2708992, 'steps': 5290, 'batch_loss/train': 0.8206794848665595}
12/22/2021 07:57:42 - INFO - codeparrot_training - Step 5291: {'lr': 0.00048710542477461543, 'samples': 2709504, 'steps': 5291, 'batch_loss/train': 0.7680468941107392}
12/22/2021 07:57:53 - INFO - codeparrot_training - Step 5292: {'lr': 0.0004871004272087284, 'samples': 2710016, 'steps': 5292, 'batch_loss/train': 0.859201954677701}
12/22/2021 07:58:04 - INFO - codeparrot_training - Step 5293: {'lr': 0.00048709542870021964, 'samples': 2710528, 'steps': 5293, 'batch_loss/train': 0.7079500074032694}
12/22/2021 07:58:17 - INFO - codeparrot_training - Step 5294: {'lr': 0.00048709042924910883, 'samples': 2711040, 'steps': 5294, 'batch_loss/train': 0.8386176759377122}
12/22/2021 07:58:28 - INFO - codeparrot_training - Step 5295: {'lr': 0.00048708542885541613, 'samples': 2711552, 'steps': 5295, 'batch_loss/train': 1.0090811103582382}
12/22/2021 07:58:39 - INFO - codeparrot_training - Step 5296: {'lr': 0.00048708042751916127, 'samples': 2712064, 'steps': 5296, 'batch_loss/train': 0.9535026391968131}
12/22/2021 07:58:49 - INFO - codeparrot_training - Step 5297: {'lr': 0.0004870754252403641, 'samples': 2712576, 'steps': 5297, 'batch_loss/train': 0.7747516231611371}
12/22/2021 07:59:01 - INFO - codeparrot_training - Step 5298: {'lr': 0.0004870704220190446, 'samples': 2713088, 'steps': 5298, 'batch_loss/train': 0.8180511770769954}
12/22/2021 07:59:12 - INFO - codeparrot_training - Step 5299: {'lr': 0.00048706541785522255, 'samples': 2713600, 'steps': 5299, 'batch_loss/train': 0.8611202919855714}
12/22/2021 07:59:23 - INFO - codeparrot_training - Step 5300: {'lr': 0.000487060412748918, 'samples': 2714112, 'steps': 5300, 'batch_loss/train': 0.7761143548414111}
12/22/2021 07:59:35 - INFO - codeparrot_training - Step 5301: {'lr': 0.0004870554067001507, 'samples': 2714624, 'steps': 5301, 'batch_loss/train': 1.1953103533014655}
12/22/2021 07:59:46 - INFO - codeparrot_training - Step 5302: {'lr': 0.0004870503997089406, 'samples': 2715136, 'steps': 5302, 'batch_loss/train': 0.8552465634420514}
12/22/2021 07:59:56 - INFO - codeparrot_training - Step 5303: {'lr': 0.0004870453917753076, 'samples': 2715648, 'steps': 5303, 'batch_loss/train': 0.9261169051751494}
12/22/2021 08:00:08 - INFO - codeparrot_training - Step 5304: {'lr': 0.0004870403828992717, 'samples': 2716160, 'steps': 5304, 'batch_loss/train': 0.9070681045996025}
12/22/2021 08:00:19 - INFO - codeparrot_training - Step 5305: {'lr': 0.0004870353730808527, 'samples': 2716672, 'steps': 5305, 'batch_loss/train': 0.8493547993712127}
12/22/2021 08:00:30 - INFO - codeparrot_training - Step 5306: {'lr': 0.0004870303623200706, 'samples': 2717184, 'steps': 5306, 'batch_loss/train': 0.8591873263940215}
12/22/2021 08:00:40 - INFO - codeparrot_training - Step 5307: {'lr': 0.0004870253506169452, 'samples': 2717696, 'steps': 5307, 'batch_loss/train': 0.8740036757662892}
12/22/2021 08:00:54 - INFO - codeparrot_training - Step 5308: {'lr': 0.0004870203379714966, 'samples': 2718208, 'steps': 5308, 'batch_loss/train': 0.909379712305963}
12/22/2021 08:01:05 - INFO - codeparrot_training - Step 5309: {'lr': 0.0004870153243837445, 'samples': 2718720, 'steps': 5309, 'batch_loss/train': 0.8056545159779489}
12/22/2021 08:01:15 - INFO - codeparrot_training - Step 5310: {'lr': 0.0004870103098537091, 'samples': 2719232, 'steps': 5310, 'batch_loss/train': 0.8286101408302784}
12/22/2021 08:01:27 - INFO - codeparrot_training - Step 5311: {'lr': 0.0004870052943814102, 'samples': 2719744, 'steps': 5311, 'batch_loss/train': 0.8965840730816126}
12/22/2021 08:01:38 - INFO - codeparrot_training - Step 5312: {'lr': 0.00048700027796686764, 'samples': 2720256, 'steps': 5312, 'batch_loss/train': 0.7777359802275896}
12/22/2021 08:01:49 - INFO - codeparrot_training - Step 5313: {'lr': 0.0004869952606101015, 'samples': 2720768, 'steps': 5313, 'batch_loss/train': 0.6548780525336042}
12/22/2021 08:02:01 - INFO - codeparrot_training - Step 5314: {'lr': 0.00048699024231113163, 'samples': 2721280, 'steps': 5314, 'batch_loss/train': 0.8332604710012674}
12/22/2021 08:02:11 - INFO - codeparrot_training - Step 5315: {'lr': 0.0004869852230699782, 'samples': 2721792, 'steps': 5315, 'batch_loss/train': 0.8203113721683621}
12/22/2021 08:02:22 - INFO - codeparrot_training - Step 5316: {'lr': 0.0004869802028866609, 'samples': 2722304, 'steps': 5316, 'batch_loss/train': 0.7779252398759127}
12/22/2021 08:02:36 - INFO - codeparrot_training - Step 5317: {'lr': 0.0004869751817611998, 'samples': 2722816, 'steps': 5317, 'batch_loss/train': 0.8235271638259292}
12/22/2021 08:02:47 - INFO - codeparrot_training - Step 5318: {'lr': 0.0004869701596936148, 'samples': 2723328, 'steps': 5318, 'batch_loss/train': 0.6622180547565222}
12/22/2021 08:02:57 - INFO - codeparrot_training - Step 5319: {'lr': 0.00048696513668392607, 'samples': 2723840, 'steps': 5319, 'batch_loss/train': 0.8454562127590179}
12/22/2021 08:03:08 - INFO - codeparrot_training - Step 5320: {'lr': 0.0004869601127321533, 'samples': 2724352, 'steps': 5320, 'batch_loss/train': 0.8333263252861798}
12/22/2021 08:03:20 - INFO - codeparrot_training - Step 5321: {'lr': 0.00048695508783831666, 'samples': 2724864, 'steps': 5321, 'batch_loss/train':
0.8435657932423055} 12/22/2021 08:03:30 - INFO - codeparrot_training - Step 5322: {'lr': 0.000486950062002436, 'samples': 2725376, 'steps': 5322, 'batch_loss/train': 0.7693412629887462} 12/22/2021 08:03:41 - INFO - codeparrot_training - Step 5323: {'lr': 0.0004869450352245314, 'samples': 2725888, 'steps': 5323, 'batch_loss/train': 0.8418661709874868} 12/22/2021 08:03:55 - INFO - codeparrot_training - Step 5324: {'lr': 0.0004869400075046228, 'samples': 2726400, 'steps': 5324, 'batch_loss/train': 0.8485212791711092} 12/22/2021 08:04:05 - INFO - codeparrot_training - Step 5325: {'lr': 0.0004869349788427302, 'samples': 2726912, 'steps': 5325, 'batch_loss/train': 0.9515488911420107} 12/22/2021 08:04:16 - INFO - codeparrot_training - Step 5326: {'lr': 0.0004869299492388736, 'samples': 2727424, 'steps': 5326, 'batch_loss/train': 1.4110697405412793} 12/22/2021 08:04:28 - INFO - codeparrot_training - Step 5327: {'lr': 0.00048692491869307297, 'samples': 2727936, 'steps': 5327, 'batch_loss/train': 0.8275197430048138} 12/22/2021 08:04:39 - INFO - codeparrot_training - Step 5328: {'lr': 0.0004869198872053483, 'samples': 2728448, 'steps': 5328, 'batch_loss/train': 0.8590838573873043} 12/22/2021 08:04:49 - INFO - codeparrot_training - Step 5329: {'lr': 0.0004869148547757197, 'samples': 2728960, 'steps': 5329, 'batch_loss/train': 0.8174284631386399} 12/22/2021 08:05:00 - INFO - codeparrot_training - Step 5330: {'lr': 0.000486909821404207, 'samples': 2729472, 'steps': 5330, 'batch_loss/train': 0.9091595662757754} 12/22/2021 08:05:12 - INFO - codeparrot_training - Step 5331: {'lr': 0.00048690478709083036, 'samples': 2729984, 'steps': 5331, 'batch_loss/train': 0.8419328229501843} 12/22/2021 08:05:23 - INFO - codeparrot_training - Step 5332: {'lr': 0.0004868997518356098, 'samples': 2730496, 'steps': 5332, 'batch_loss/train': 0.8265373650938272} 12/22/2021 08:05:33 - INFO - codeparrot_training - Step 5333: {'lr': 0.00048689471563856515, 'samples': 2731008, 'steps': 5333, 
'batch_loss/train': 0.7613476142287254} 12/22/2021 08:05:45 - INFO - codeparrot_training - Step 5334: {'lr': 0.00048688967849971654, 'samples': 2731520, 'steps': 5334, 'batch_loss/train': 0.8533398732542992} 12/22/2021 08:05:56 - INFO - codeparrot_training - Step 5335: {'lr': 0.00048688464041908407, 'samples': 2732032, 'steps': 5335, 'batch_loss/train': 0.6896511446684599} 12/22/2021 08:06:07 - INFO - codeparrot_training - Step 5336: {'lr': 0.0004868796013966877, 'samples': 2732544, 'steps': 5336, 'batch_loss/train': 0.8523937845602632} 12/22/2021 08:06:21 - INFO - codeparrot_training - Step 5337: {'lr': 0.0004868745614325475, 'samples': 2733056, 'steps': 5337, 'batch_loss/train': 0.8573332093656063} 12/22/2021 08:06:31 - INFO - codeparrot_training - Step 5338: {'lr': 0.0004868695205266835, 'samples': 2733568, 'steps': 5338, 'batch_loss/train': 0.9000754016451538} 12/22/2021 08:06:42 - INFO - codeparrot_training - Step 5339: {'lr': 0.00048686447867911576, 'samples': 2734080, 'steps': 5339, 'batch_loss/train': 0.7300323385279626} 12/22/2021 08:06:54 - INFO - codeparrot_training - Step 5340: {'lr': 0.00048685943588986426, 'samples': 2734592, 'steps': 5340, 'batch_loss/train': 0.926966248312965} 12/22/2021 08:07:05 - INFO - codeparrot_training - Step 5341: {'lr': 0.000486854392158949, 'samples': 2735104, 'steps': 5341, 'batch_loss/train': 0.7915992354974151} 12/22/2021 08:07:15 - INFO - codeparrot_training - Step 5342: {'lr': 0.0004868493474863902, 'samples': 2735616, 'steps': 5342, 'batch_loss/train': 0.8985977210104465} 12/22/2021 08:07:26 - INFO - codeparrot_training - Step 5343: {'lr': 0.00048684430187220773, 'samples': 2736128, 'steps': 5343, 'batch_loss/train': 0.8423207169398665} 12/22/2021 08:07:38 - INFO - codeparrot_training - Step 5344: {'lr': 0.0004868392553164218, 'samples': 2736640, 'steps': 5344, 'batch_loss/train': 0.8829184444621205} 12/22/2021 08:07:49 - INFO - codeparrot_training - Step 5345: {'lr': 0.0004868342078190524, 'samples': 2737152, 
'steps': 5345, 'batch_loss/train': 0.8981734262779355} 12/22/2021 08:07:59 - INFO - codeparrot_training - Step 5346: {'lr': 0.00048682915938011965, 'samples': 2737664, 'steps': 5346, 'batch_loss/train': 0.85230353474617} 12/22/2021 08:08:13 - INFO - codeparrot_training - Step 5347: {'lr': 0.00048682410999964363, 'samples': 2738176, 'steps': 5347, 'batch_loss/train': 0.7802719555329531} 12/22/2021 08:08:24 - INFO - codeparrot_training - Step 5348: {'lr': 0.00048681905967764427, 'samples': 2738688, 'steps': 5348, 'batch_loss/train': 0.7942353133112192} 12/22/2021 08:08:34 - INFO - codeparrot_training - Step 5349: {'lr': 0.0004868140084141418, 'samples': 2739200, 'steps': 5349, 'batch_loss/train': 0.7234954256564379} 12/22/2021 08:08:46 - INFO - codeparrot_training - Step 5350: {'lr': 0.0004868089562091562, 'samples': 2739712, 'steps': 5350, 'batch_loss/train': 0.8563688034191728} 12/22/2021 08:08:57 - INFO - codeparrot_training - Step 5351: {'lr': 0.00048680390306270774, 'samples': 2740224, 'steps': 5351, 'batch_loss/train': 0.8720232667401433} 12/22/2021 08:09:08 - INFO - codeparrot_training - Step 5352: {'lr': 0.00048679884897481627, 'samples': 2740736, 'steps': 5352, 'batch_loss/train': 0.7912257169373333} 12/22/2021 08:09:18 - INFO - codeparrot_training - Step 5353: {'lr': 0.00048679379394550197, 'samples': 2741248, 'steps': 5353, 'batch_loss/train': 0.8271538987755775} 12/22/2021 08:09:32 - INFO - codeparrot_training - Step 5354: {'lr': 0.00048678873797478495, 'samples': 2741760, 'steps': 5354, 'batch_loss/train': 0.766699098283425} 12/22/2021 08:09:43 - INFO - codeparrot_training - Step 5355: {'lr': 0.0004867836810626853, 'samples': 2742272, 'steps': 5355, 'batch_loss/train': 0.9164185179397464} 12/22/2021 08:09:53 - INFO - codeparrot_training - Step 5356: {'lr': 0.00048677862320922326, 'samples': 2742784, 'steps': 5356, 'batch_loss/train': 0.9305220134556293} 12/22/2021 08:10:05 - INFO - codeparrot_training - Step 5357: {'lr': 0.00048677356441441876, 
'samples': 2743296, 'steps': 5357, 'batch_loss/train': 0.9198910053819418} 12/22/2021 08:10:16 - INFO - codeparrot_training - Step 5358: {'lr': 0.00048676850467829194, 'samples': 2743808, 'steps': 5358, 'batch_loss/train': 0.8297822717577219} 12/22/2021 08:10:26 - INFO - codeparrot_training - Step 5359: {'lr': 0.00048676344400086296, 'samples': 2744320, 'steps': 5359, 'batch_loss/train': 0.8481898424215615} 12/22/2021 08:10:39 - INFO - codeparrot_training - Step 5360: {'lr': 0.000486758382382152, 'samples': 2744832, 'steps': 5360, 'batch_loss/train': 0.8599391933530569} 12/22/2021 08:10:49 - INFO - codeparrot_training - Step 5361: {'lr': 0.000486753319822179, 'samples': 2745344, 'steps': 5361, 'batch_loss/train': 0.8595310607925057} 12/22/2021 08:11:00 - INFO - codeparrot_training - Step 5362: {'lr': 0.0004867482563209642, 'samples': 2745856, 'steps': 5362, 'batch_loss/train': 0.887111347168684} 12/22/2021 08:11:10 - INFO - codeparrot_training - Step 5363: {'lr': 0.00048674319187852775, 'samples': 2746368, 'steps': 5363, 'batch_loss/train': 0.8953358540311456} 12/22/2021 08:11:24 - INFO - codeparrot_training - Step 5364: {'lr': 0.00048673812649488977, 'samples': 2746880, 'steps': 5364, 'batch_loss/train': 0.8114157309755683} 12/22/2021 08:11:35 - INFO - codeparrot_training - Step 5365: {'lr': 0.0004867330601700704, 'samples': 2747392, 'steps': 5365, 'batch_loss/train': 0.8810829212889075} 12/22/2021 08:11:45 - INFO - codeparrot_training - Step 5366: {'lr': 0.00048672799290408973, 'samples': 2747904, 'steps': 5366, 'batch_loss/train': 0.8111021085642278} 12/22/2021 08:11:58 - INFO - codeparrot_training - Step 5367: {'lr': 0.000486722924696968, 'samples': 2748416, 'steps': 5367, 'batch_loss/train': 1.3379595344886184} 12/22/2021 08:12:09 - INFO - codeparrot_training - Step 5368: {'lr': 0.0004867178555487253, 'samples': 2748928, 'steps': 5368, 'batch_loss/train': 0.807134760543704} 12/22/2021 08:12:19 - INFO - codeparrot_training - Step 5369: {'lr': 
0.00048671278545938177, 'samples': 2749440, 'steps': 5369, 'batch_loss/train': 0.7863868772983551} 12/22/2021 08:12:31 - INFO - codeparrot_training - Step 5370: {'lr': 0.0004867077144289576, 'samples': 2749952, 'steps': 5370, 'batch_loss/train': 0.5890834031160921} 12/22/2021 08:12:42 - INFO - codeparrot_training - Step 5371: {'lr': 0.00048670264245747284, 'samples': 2750464, 'steps': 5371, 'batch_loss/train': 0.7857716935686767} 12/22/2021 08:12:53 - INFO - codeparrot_training - Step 5372: {'lr': 0.0004866975695449478, 'samples': 2750976, 'steps': 5372, 'batch_loss/train': 0.7279616794548929} 12/22/2021 08:13:03 - INFO - codeparrot_training - Step 5373: {'lr': 0.00048669249569140263, 'samples': 2751488, 'steps': 5373, 'batch_loss/train': 0.8582889940589666} 12/22/2021 08:13:15 - INFO - codeparrot_training - Step 5374: {'lr': 0.0004866874208968575, 'samples': 2752000, 'steps': 5374, 'batch_loss/train': 0.8245993714081123} 12/22/2021 08:13:26 - INFO - codeparrot_training - Step 5375: {'lr': 0.00048668234516133246, 'samples': 2752512, 'steps': 5375, 'batch_loss/train': 0.8776011243462563} 12/22/2021 08:13:36 - INFO - codeparrot_training - Step 5376: {'lr': 0.00048667726848484783, 'samples': 2753024, 'steps': 5376, 'batch_loss/train': 0.8505949787795544} 12/22/2021 08:13:50 - INFO - codeparrot_training - Step 5377: {'lr': 0.00048667219086742376, 'samples': 2753536, 'steps': 5377, 'batch_loss/train': 0.8241427717730403} 12/22/2021 08:14:01 - INFO - codeparrot_training - Step 5378: {'lr': 0.0004866671123090805, 'samples': 2754048, 'steps': 5378, 'batch_loss/train': 0.764074451290071} 12/22/2021 08:14:11 - INFO - codeparrot_training - Step 5379: {'lr': 0.00048666203280983803, 'samples': 2754560, 'steps': 5379, 'batch_loss/train': 0.8475730940699577} 12/22/2021 08:14:24 - INFO - codeparrot_training - Step 5380: {'lr': 0.0004866569523697167, 'samples': 2755072, 'steps': 5380, 'batch_loss/train': 0.8100205641239882} 12/22/2021 08:14:34 - INFO - codeparrot_training - Step 
5381: {'lr': 0.0004866518709887367, 'samples': 2755584, 'steps': 5381, 'batch_loss/train': 0.8661933029070497} 12/22/2021 08:14:45 - INFO - codeparrot_training - Step 5382: {'lr': 0.00048664678866691826, 'samples': 2756096, 'steps': 5382, 'batch_loss/train': 0.8661498269066215} 12/22/2021 08:14:58 - INFO - codeparrot_training - Step 5383: {'lr': 0.0004866417054042815, 'samples': 2756608, 'steps': 5383, 'batch_loss/train': 0.8656429685652256} 12/22/2021 08:15:09 - INFO - codeparrot_training - Step 5384: {'lr': 0.00048663662120084666, 'samples': 2757120, 'steps': 5384, 'batch_loss/train': 0.8110362393781543} 12/22/2021 08:15:20 - INFO - codeparrot_training - Step 5385: {'lr': 0.00048663153605663404, 'samples': 2757632, 'steps': 5385, 'batch_loss/train': 0.8056907337158918} 12/22/2021 08:15:30 - INFO - codeparrot_training - Step 5386: {'lr': 0.0004866264499716637, 'samples': 2758144, 'steps': 5386, 'batch_loss/train': 0.8449963349848986} 12/22/2021 08:15:42 - INFO - codeparrot_training - Step 5387: {'lr': 0.0004866213629459559, 'samples': 2758656, 'steps': 5387, 'batch_loss/train': 0.8078215820714831} 12/22/2021 08:15:53 - INFO - codeparrot_training - Step 5388: {'lr': 0.00048661627497953105, 'samples': 2759168, 'steps': 5388, 'batch_loss/train': 0.8574061123654246} 12/22/2021 08:16:04 - INFO - codeparrot_training - Step 5389: {'lr': 0.0004866111860724092, 'samples': 2759680, 'steps': 5389, 'batch_loss/train': 0.8478884622454643} 12/22/2021 08:16:16 - INFO - codeparrot_training - Step 5390: {'lr': 0.00048660609622461047, 'samples': 2760192, 'steps': 5390, 'batch_loss/train': 0.8426821371540427} 12/22/2021 08:16:26 - INFO - codeparrot_training - Step 5391: {'lr': 0.00048660100543615535, 'samples': 2760704, 'steps': 5391, 'batch_loss/train': 0.8082428704947233} 12/22/2021 08:16:37 - INFO - codeparrot_training - Step 5392: {'lr': 0.00048659591370706404, 'samples': 2761216, 'steps': 5392, 'batch_loss/train': 0.8586778985336423} 12/22/2021 08:16:51 - INFO - 
codeparrot_training - Step 5393: {'lr': 0.00048659082103735655, 'samples': 2761728, 'steps': 5393, 'batch_loss/train': 0.7584928190335631} 12/22/2021 08:17:02 - INFO - codeparrot_training - Step 5394: {'lr': 0.0004865857274270534, 'samples': 2762240, 'steps': 5394, 'batch_loss/train': 0.841634992044419} 12/22/2021 08:17:12 - INFO - codeparrot_training - Step 5395: {'lr': 0.0004865806328761747, 'samples': 2762752, 'steps': 5395, 'batch_loss/train': 0.8200758574530482} 12/22/2021 08:17:23 - INFO - codeparrot_training - Step 5396: {'lr': 0.00048657553738474076, 'samples': 2763264, 'steps': 5396, 'batch_loss/train': 0.902098439168185} 12/22/2021 08:17:35 - INFO - codeparrot_training - Step 5397: {'lr': 0.0004865704409527718, 'samples': 2763776, 'steps': 5397, 'batch_loss/train': 0.8456369554623961} 12/22/2021 08:17:46 - INFO - codeparrot_training - Step 5398: {'lr': 0.00048656534358028803, 'samples': 2764288, 'steps': 5398, 'batch_loss/train': 0.8254197924397886} 12/22/2021 08:17:56 - INFO - codeparrot_training - Step 5399: {'lr': 0.00048656024526730986, 'samples': 2764800, 'steps': 5399, 'batch_loss/train': 0.6844951461534947} 12/22/2021 08:18:10 - INFO - codeparrot_training - Step 5400: {'lr': 0.0004865551460138574, 'samples': 2765312, 'steps': 5400, 'batch_loss/train': 0.8150859698653221} 12/22/2021 08:18:21 - INFO - codeparrot_training - Step 5401: {'lr': 0.0004865500458199511, 'samples': 2765824, 'steps': 5401, 'batch_loss/train': 0.8164421934634447} 12/22/2021 08:18:32 - INFO - codeparrot_training - Step 5402: {'lr': 0.00048654494468561104, 'samples': 2766336, 'steps': 5402, 'batch_loss/train': 0.6840644188923761} 12/22/2021 08:18:42 - INFO - codeparrot_training - Step 5403: {'lr': 0.00048653984261085763, 'samples': 2766848, 'steps': 5403, 'batch_loss/train': 0.7717261365614831} 12/22/2021 08:18:54 - INFO - codeparrot_training - Step 5404: {'lr': 0.00048653473959571117, 'samples': 2767360, 'steps': 5404, 'batch_loss/train': 0.8686932092532516} 12/22/2021 08:19:05 
- INFO - codeparrot_training - Step 5405: {'lr': 0.0004865296356401918, 'samples': 2767872, 'steps': 5405, 'batch_loss/train': 0.791649273596704} 12/22/2021 08:19:15 - INFO - codeparrot_training - Step 5406: {'lr': 0.00048652453074432, 'samples': 2768384, 'steps': 5406, 'batch_loss/train': 0.7832256256369874} 12/22/2021 08:19:27 - INFO - codeparrot_training - Step 5407: {'lr': 0.00048651942490811594, 'samples': 2768896, 'steps': 5407, 'batch_loss/train': 0.7889483869075775} 12/22/2021 08:19:38 - INFO - codeparrot_training - Step 5408: {'lr': 0.00048651431813159996, 'samples': 2769408, 'steps': 5408, 'batch_loss/train': 0.9390760762616992} 12/22/2021 08:19:49 - INFO - codeparrot_training - Step 5409: {'lr': 0.00048650921041479235, 'samples': 2769920, 'steps': 5409, 'batch_loss/train': 0.856032507494092} 12/22/2021 08:20:01 - INFO - codeparrot_training - Step 5410: {'lr': 0.0004865041017577134, 'samples': 2770432, 'steps': 5410, 'batch_loss/train': 0.740288688451983} 12/22/2021 08:20:11 - INFO - codeparrot_training - Step 5411: {'lr': 0.0004864989921603834, 'samples': 2770944, 'steps': 5411, 'batch_loss/train': 0.788552907994017} 12/22/2021 08:20:22 - INFO - codeparrot_training - Step 5412: {'lr': 0.0004864938816228228, 'samples': 2771456, 'steps': 5412, 'batch_loss/train': 0.6608107767533511} 12/22/2021 08:20:36 - INFO - codeparrot_training - Step 5413: {'lr': 0.00048648877014505177, 'samples': 2771968, 'steps': 5413, 'batch_loss/train': 0.8632551534101367} 12/22/2021 08:20:47 - INFO - codeparrot_training - Step 5414: {'lr': 0.0004864836577270907, 'samples': 2772480, 'steps': 5414, 'batch_loss/train': 0.8254227760480717} 12/22/2021 08:20:58 - INFO - codeparrot_training - Step 5415: {'lr': 0.0004864785443689599, 'samples': 2772992, 'steps': 5415, 'batch_loss/train': 0.8796333838254213} 12/22/2021 08:21:08 - INFO - codeparrot_training - Step 5416: {'lr': 0.00048647343007067966, 'samples': 2773504, 'steps': 5416, 'batch_loss/train': 0.8944463916122913} 12/22/2021 
08:21:20 - INFO - codeparrot_training - Step 5417: {'lr': 0.00048646831483227043, 'samples': 2774016, 'steps': 5417, 'batch_loss/train': 0.8296287814155221} 12/22/2021 08:21:31 - INFO - codeparrot_training - Step 5418: {'lr': 0.00048646319865375243, 'samples': 2774528, 'steps': 5418, 'batch_loss/train': 0.8127078963443637} 12/22/2021 08:21:41 - INFO - codeparrot_training - Step 5419: {'lr': 0.00048645808153514604, 'samples': 2775040, 'steps': 5419, 'batch_loss/train': 0.8374172844924033} 12/22/2021 08:21:54 - INFO - codeparrot_training - Step 5420: {'lr': 0.00048645296347647163, 'samples': 2775552, 'steps': 5420, 'batch_loss/train': 0.7223873801995069} 12/22/2021 08:22:04 - INFO - codeparrot_training - Step 5421: {'lr': 0.00048644784447774943, 'samples': 2776064, 'steps': 5421, 'batch_loss/train': 1.1930944295600057} 12/22/2021 08:22:15 - INFO - codeparrot_training - Step 5422: {'lr': 0.00048644272453899994, 'samples': 2776576, 'steps': 5422, 'batch_loss/train': 0.8105516377836466} 12/22/2021 08:22:29 - INFO - codeparrot_training - Step 5423: {'lr': 0.0004864376036602435, 'samples': 2777088, 'steps': 5423, 'batch_loss/train': 0.7809997578151524} 12/22/2021 08:22:39 - INFO - codeparrot_training - Step 5424: {'lr': 0.00048643248184150037, 'samples': 2777600, 'steps': 5424, 'batch_loss/train': 0.8929062588140368} 12/22/2021 08:22:50 - INFO - codeparrot_training - Step 5425: {'lr': 0.0004864273590827909, 'samples': 2778112, 'steps': 5425, 'batch_loss/train': 0.8504737373441458} 12/22/2021 08:23:01 - INFO - codeparrot_training - Step 5426: {'lr': 0.0004864222353841356, 'samples': 2778624, 'steps': 5426, 'batch_loss/train': 0.8825518134981394} 12/22/2021 08:23:13 - INFO - codeparrot_training - Step 5427: {'lr': 0.0004864171107455547, 'samples': 2779136, 'steps': 5427, 'batch_loss/train': 0.931457182392478} 12/22/2021 08:23:23 - INFO - codeparrot_training - Step 5428: {'lr': 0.00048641198516706876, 'samples': 2779648, 'steps': 5428, 'batch_loss/train': 0.8248101242352277} 
12/22/2021 08:23:34 - INFO - codeparrot_training - Step 5429: {'lr': 0.0004864068586486979, 'samples': 2780160, 'steps': 5429, 'batch_loss/train': 0.9532039500772953} 12/22/2021 08:23:46 - INFO - codeparrot_training - Step 5430: {'lr': 0.0004864017311904627, 'samples': 2780672, 'steps': 5430, 'batch_loss/train': 0.8915260694921017} 12/22/2021 08:23:57 - INFO - codeparrot_training - Step 5431: {'lr': 0.0004863966027923835, 'samples': 2781184, 'steps': 5431, 'batch_loss/train': 0.8563177362084389} 12/22/2021 08:24:07 - INFO - codeparrot_training - Step 5432: {'lr': 0.00048639147345448067, 'samples': 2781696, 'steps': 5432, 'batch_loss/train': 0.8620434231124818} 12/22/2021 08:24:21 - INFO - codeparrot_training - Step 5433: {'lr': 0.00048638634317677454, 'samples': 2782208, 'steps': 5433, 'batch_loss/train': 0.9308632919564843} 12/22/2021 08:24:32 - INFO - codeparrot_training - Step 5434: {'lr': 0.00048638121195928555, 'samples': 2782720, 'steps': 5434, 'batch_loss/train': 0.8874993035569787} 12/22/2021 08:24:42 - INFO - codeparrot_training - Step 5435: {'lr': 0.00048637607980203413, 'samples': 2783232, 'steps': 5435, 'batch_loss/train': 1.0053558647632599} 12/22/2021 08:24:53 - INFO - codeparrot_training - Step 5436: {'lr': 0.0004863709467050407, 'samples': 2783744, 'steps': 5436, 'batch_loss/train': 0.7641539350152016} 12/22/2021 08:25:05 - INFO - codeparrot_training - Step 5437: {'lr': 0.0004863658126683257, 'samples': 2784256, 'steps': 5437, 'batch_loss/train': 0.8203153358772397} 12/22/2021 08:25:16 - INFO - codeparrot_training - Step 5438: {'lr': 0.0004863606776919094, 'samples': 2784768, 'steps': 5438, 'batch_loss/train': 0.7715433151461184} 12/22/2021 08:25:26 - INFO - codeparrot_training - Step 5439: {'lr': 0.0004863555417758123, 'samples': 2785280, 'steps': 5439, 'batch_loss/train': 0.7690318329259753} 12/22/2021 08:25:38 - INFO - codeparrot_training - Step 5440: {'lr': 0.00048635040492005477, 'samples': 2785792, 'steps': 5440, 'batch_loss/train': 
0.7505073081701994} 12/22/2021 08:25:49 - INFO - codeparrot_training - Step 5441: {'lr': 0.0004863452671246573, 'samples': 2786304, 'steps': 5441, 'batch_loss/train': 0.8178627425804734} 12/22/2021 08:26:00 - INFO - codeparrot_training - Step 5442: {'lr': 0.00048634012838964037, 'samples': 2786816, 'steps': 5442, 'batch_loss/train': 0.8415256766602397} 12/22/2021 08:26:14 - INFO - codeparrot_training - Step 5443: {'lr': 0.00048633498871502424, 'samples': 2787328, 'steps': 5443, 'batch_loss/train': 0.7322190003469586} 12/22/2021 08:26:25 - INFO - codeparrot_training - Step 5444: {'lr': 0.0004863298481008295, 'samples': 2787840, 'steps': 5444, 'batch_loss/train': 0.7392522641457617} 12/22/2021 08:26:35 - INFO - codeparrot_training - Step 5445: {'lr': 0.0004863247065470764, 'samples': 2788352, 'steps': 5445, 'batch_loss/train': 0.836065006442368} 12/22/2021 08:26:46 - INFO - codeparrot_training - Step 5446: {'lr': 0.0004863195640537856, 'samples': 2788864, 'steps': 5446, 'batch_loss/train': 0.8202513381838799} 12/22/2021 08:26:58 - INFO - codeparrot_training - Step 5447: {'lr': 0.0004863144206209774, 'samples': 2789376, 'steps': 5447, 'batch_loss/train': 0.7306854762136936} 12/22/2021 08:27:09 - INFO - codeparrot_training - Step 5448: {'lr': 0.00048630927624867225, 'samples': 2789888, 'steps': 5448, 'batch_loss/train': 0.8668657802045345} 12/22/2021 08:27:19 - INFO - codeparrot_training - Step 5449: {'lr': 0.00048630413093689074, 'samples': 2790400, 'steps': 5449, 'batch_loss/train': 0.8263691421598196} 12/22/2021 08:27:31 - INFO - codeparrot_training - Step 5450: {'lr': 0.0004862989846856532, 'samples': 2790912, 'steps': 5450, 'batch_loss/train': 0.7736410656943917} 12/22/2021 08:27:42 - INFO - codeparrot_training - Step 5451: {'lr': 0.0004862938374949801, 'samples': 2791424, 'steps': 5451, 'batch_loss/train': 0.8824033187702298} 12/22/2021 08:27:53 - INFO - codeparrot_training - Step 5452: {'lr': 0.00048628868936489197, 'samples': 2791936, 'steps': 5452, 
'batch_loss/train': 0.8896137643605471} 12/22/2021 08:28:03 - INFO - codeparrot_training - Step 5453: {'lr': 0.00048628354029540917, 'samples': 2792448, 'steps': 5453, 'batch_loss/train': 0.8847996527329087} 12/22/2021 08:28:17 - INFO - codeparrot_training - Step 5454: {'lr': 0.0004862783902865523, 'samples': 2792960, 'steps': 5454, 'batch_loss/train': 0.8549400242045522} 12/22/2021 08:28:28 - INFO - codeparrot_training - Step 5455: {'lr': 0.0004862732393383417, 'samples': 2793472, 'steps': 5455, 'batch_loss/train': 0.8231601417064667} 12/22/2021 08:28:38 - INFO - codeparrot_training - Step 5456: {'lr': 0.0004862680874507979, 'samples': 2793984, 'steps': 5456, 'batch_loss/train': 0.7357166856527328} 12/22/2021 08:28:51 - INFO - codeparrot_training - Step 5457: {'lr': 0.00048626293462394145, 'samples': 2794496, 'steps': 5457, 'batch_loss/train': 0.8218143479898572} 12/22/2021 08:29:01 - INFO - codeparrot_training - Step 5458: {'lr': 0.0004862577808577928, 'samples': 2795008, 'steps': 5458, 'batch_loss/train': 0.8306350344792008} 12/22/2021 08:29:12 - INFO - codeparrot_training - Step 5459: {'lr': 0.0004862526261523723, 'samples': 2795520, 'steps': 5459, 'batch_loss/train': 0.8759316941723228} 12/22/2021 08:29:26 - INFO - codeparrot_training - Step 5460: {'lr': 0.0004862474705077007, 'samples': 2796032, 'steps': 5460, 'batch_loss/train': 0.944841225631535} 12/22/2021 08:29:36 - INFO - codeparrot_training - Step 5461: {'lr': 0.00048624231392379826, 'samples': 2796544, 'steps': 5461, 'batch_loss/train': 0.8781329151242971} 12/22/2021 08:29:47 - INFO - codeparrot_training - Step 5462: {'lr': 0.0004862371564006857, 'samples': 2797056, 'steps': 5462, 'batch_loss/train': 0.9548628926277161} 12/22/2021 08:29:59 - INFO - codeparrot_training - Step 5463: {'lr': 0.0004862319979383833, 'samples': 2797568, 'steps': 5463, 'batch_loss/train': 0.8955955505371094} 12/22/2021 08:30:09 - INFO - codeparrot_training - Step 5464: {'lr': 0.0004862268385369117, 'samples': 2798080, 'steps': 
5464, 'batch_loss/train': 0.8564695592503995} 12/22/2021 08:30:20 - INFO - codeparrot_training - Step 5465: {'lr': 0.0004862216781962914, 'samples': 2798592, 'steps': 5465, 'batch_loss/train': 0.7719645164906979} 12/22/2021 08:30:31 - INFO - codeparrot_training - Step 5466: {'lr': 0.0004862165169165429, 'samples': 2799104, 'steps': 5466, 'batch_loss/train': 0.8832343462854624} 12/22/2021 08:30:43 - INFO - codeparrot_training - Step 5467: {'lr': 0.00048621135469768676, 'samples': 2799616, 'steps': 5467, 'batch_loss/train': 0.8554864525794983} 12/22/2021 08:30:54 - INFO - codeparrot_training - Step 5468: {'lr': 0.0004862061915397433, 'samples': 2800128, 'steps': 5468, 'batch_loss/train': 0.9034695578739047} 12/22/2021 08:31:04 - INFO - codeparrot_training - Step 5469: {'lr': 0.00048620102744273343, 'samples': 2800640, 'steps': 5469, 'batch_loss/train': 0.8604144458658993} 12/22/2021 08:31:18 - INFO - codeparrot_training - Step 5470: {'lr': 0.00048619586240667735, 'samples': 2801152, 'steps': 5470, 'batch_loss/train': 0.8508328655734658} 12/22/2021 08:31:29 - INFO - codeparrot_training - Step 5471: {'lr': 0.00048619069643159567, 'samples': 2801664, 'steps': 5471, 'batch_loss/train': 0.9132173033431172} 12/22/2021 08:31:39 - INFO - codeparrot_training - Step 5472: {'lr': 0.00048618552951750903, 'samples': 2802176, 'steps': 5472, 'batch_loss/train': 0.8118261465569958} 12/22/2021 08:31:51 - INFO - codeparrot_training - Step 5473: {'lr': 0.0004861803616644379, 'samples': 2802688, 'steps': 5473, 'batch_loss/train': 0.8300081435590982} 12/22/2021 08:32:02 - INFO - codeparrot_training - Step 5474: {'lr': 0.0004861751928724028, 'samples': 2803200, 'steps': 5474, 'batch_loss/train': 0.8535118065774441} 12/22/2021 08:32:12 - INFO - codeparrot_training - Step 5475: {'lr': 0.00048617002314142434, 'samples': 2803712, 'steps': 5475, 'batch_loss/train': 0.8305682372301817} 12/22/2021 08:32:23 - INFO - codeparrot_training - Step 5476: {'lr': 0.000486164852471523, 'samples': 2804224, 
'steps': 5476, 'batch_loss/train': 0.9005085118114948}
12/22/2021 08:32:35 - INFO - codeparrot_training - Step 5477: {'lr': 0.00048615968086271945, 'samples': 2804736, 'steps': 5477, 'batch_loss/train': 0.8177941907197237}
12/22/2021 08:32:46 - INFO - codeparrot_training - Step 5478: {'lr': 0.0004861545083150341, 'samples': 2805248, 'steps': 5478, 'batch_loss/train': 0.7560230307281017}
12/22/2021 08:32:56 - INFO - codeparrot_training - Step 5479: {'lr': 0.00048614933482848765, 'samples': 2805760, 'steps': 5479, 'batch_loss/train': 0.8650189600884914}
12/22/2021 08:33:08 - INFO - codeparrot_training - Step 5480: {'lr': 0.00048614416040310056, 'samples': 2806272, 'steps': 5480, 'batch_loss/train': 0.8101830091327429}
12/22/2021 08:33:19 - INFO - codeparrot_training - Step 5481: {'lr': 0.0004861389850388935, 'samples': 2806784, 'steps': 5481, 'batch_loss/train': 0.7876493893563747}
12/22/2021 08:33:29 - INFO - codeparrot_training - Step 5482: {'lr': 0.000486133808735887, 'samples': 2807296, 'steps': 5482, 'batch_loss/train': 0.945357458665967}
12/22/2021 08:33:44 - INFO - codeparrot_training - Step 5483: {'lr': 0.0004861286314941016, 'samples': 2807808, 'steps': 5483, 'batch_loss/train': 0.8553092898800969}
12/22/2021 08:33:54 - INFO - codeparrot_training - Step 5484: {'lr': 0.0004861234533135579, 'samples': 2808320, 'steps': 5484, 'batch_loss/train': 0.8756231754086912}
12/22/2021 08:34:05 - INFO - codeparrot_training - Step 5485: {'lr': 0.00048611827419427656, 'samples': 2808832, 'steps': 5485, 'batch_loss/train': 0.9196724286302924}
12/22/2021 08:34:17 - INFO - codeparrot_training - Step 5486: {'lr': 0.00048611309413627813, 'samples': 2809344, 'steps': 5486, 'batch_loss/train': 0.8268642099574208}
12/22/2021 08:34:28 - INFO - codeparrot_training - Step 5487: {'lr': 0.00048610791313958315, 'samples': 2809856, 'steps': 5487, 'batch_loss/train': 0.7990129478275776}
12/22/2021 08:34:38 - INFO - codeparrot_training - Step 5488: {'lr': 0.0004861027312042122, 'samples': 2810368, 'steps': 5488, 'batch_loss/train': 0.8513962710276246}
12/22/2021 08:34:49 - INFO - codeparrot_training - Step 5489: {'lr': 0.00048609754833018606, 'samples': 2810880, 'steps': 5489, 'batch_loss/train': 0.810037394054234}
12/22/2021 08:35:01 - INFO - codeparrot_training - Step 5490: {'lr': 0.00048609236451752514, 'samples': 2811392, 'steps': 5490, 'batch_loss/train': 0.821965045761317}
12/22/2021 08:35:12 - INFO - codeparrot_training - Step 5491: {'lr': 0.0004860871797662502, 'samples': 2811904, 'steps': 5491, 'batch_loss/train': 0.9400608399882913}
12/22/2021 08:35:23 - INFO - codeparrot_training - Step 5492: {'lr': 0.00048608199407638173, 'samples': 2812416, 'steps': 5492, 'batch_loss/train': 0.5733422947814688}
12/22/2021 08:35:36 - INFO - codeparrot_training - Step 5493: {'lr': 0.0004860768074479404, 'samples': 2812928, 'steps': 5493, 'batch_loss/train': 0.8332412457093596}
12/22/2021 08:35:47 - INFO - codeparrot_training - Step 5494: {'lr': 0.00048607161988094673, 'samples': 2813440, 'steps': 5494, 'batch_loss/train': 0.8601884329691529}
12/22/2021 08:35:58 - INFO - codeparrot_training - Step 5495: {'lr': 0.00048606643137542155, 'samples': 2813952, 'steps': 5495, 'batch_loss/train': 0.8469513636082411}
12/22/2021 08:36:10 - INFO - codeparrot_training - Step 5496: {'lr': 0.00048606124193138533, 'samples': 2814464, 'steps': 5496, 'batch_loss/train': 0.8157674581743777}
12/22/2021 08:36:20 - INFO - codeparrot_training - Step 5497: {'lr': 0.0004860560515488588, 'samples': 2814976, 'steps': 5497, 'batch_loss/train': 0.9015816813334823}
12/22/2021 08:36:31 - INFO - codeparrot_training - Step 5498: {'lr': 0.0004860508602278625, 'samples': 2815488, 'steps': 5498, 'batch_loss/train': 0.8810752388089895}
12/22/2021 08:36:41 - INFO - codeparrot_training - Step 5499: {'lr': 0.00048604566796841707, 'samples': 2816000, 'steps': 5499, 'batch_loss/train': 0.6996796983294189}
12/22/2021 08:37:02 - INFO - codeparrot_training - Step 5500: {'lr': 0.0004860404747705433, 'samples': 2816512, 'steps': 5500, 'batch_loss/train': 0.8504255227744579}
12/22/2021 08:37:12 - INFO - codeparrot_training - Step 5501: {'lr': 0.0004860352806342616, 'samples': 2817024, 'steps': 5501, 'batch_loss/train': 1.8899233965203166}
12/22/2021 08:37:23 - INFO - codeparrot_training - Step 5502: {'lr': 0.00048603008555959276, 'samples': 2817536, 'steps': 5502, 'batch_loss/train': 0.829337484203279}
12/22/2021 08:37:35 - INFO - codeparrot_training - Step 5503: {'lr': 0.00048602488954655745, 'samples': 2818048, 'steps': 5503, 'batch_loss/train': 0.8734086146578193}
12/22/2021 08:37:46 - INFO - codeparrot_training - Step 5504: {'lr': 0.00048601969259517633, 'samples': 2818560, 'steps': 5504, 'batch_loss/train': 0.8523861300200224}
12/22/2021 08:37:56 - INFO - codeparrot_training - Step 5505: {'lr': 0.00048601449470547, 'samples': 2819072, 'steps': 5505, 'batch_loss/train': 0.8567594583146274}
12/22/2021 08:38:08 - INFO - codeparrot_training - Step 5506: {'lr': 0.00048600929587745917, 'samples': 2819584, 'steps': 5506, 'batch_loss/train': 0.9190341029316187}
12/22/2021 08:38:19 - INFO - codeparrot_training - Step 5507: {'lr': 0.0004860040961111645, 'samples': 2820096, 'steps': 5507, 'batch_loss/train': 0.8600553348660469}
12/22/2021 08:38:30 - INFO - codeparrot_training - Step 5508: {'lr': 0.0004859988954066067, 'samples': 2820608, 'steps': 5508, 'batch_loss/train': 0.994999036192894}
12/22/2021 08:38:40 - INFO - codeparrot_training - Step 5509: {'lr': 0.0004859936937638063, 'samples': 2821120, 'steps': 5509, 'batch_loss/train': 0.8654903010465205}
12/22/2021 08:38:54 - INFO - codeparrot_training - Step 5510: {'lr': 0.0004859884911827841, 'samples': 2821632, 'steps': 5510, 'batch_loss/train': 0.8817246756516397}
12/22/2021 08:39:05 - INFO - codeparrot_training - Step 5511: {'lr': 0.00048598328766356083, 'samples': 2822144, 'steps': 5511, 'batch_loss/train': 0.8367508109658957}
12/22/2021 08:39:15 - INFO - codeparrot_training - Step 5512: {'lr': 0.00048597808320615705, 'samples': 2822656, 'steps': 5512, 'batch_loss/train': 0.8063434725627303}
12/22/2021 08:39:27 - INFO - codeparrot_training - Step 5513: {'lr': 0.0004859728778105935, 'samples': 2823168, 'steps': 5513, 'batch_loss/train': 0.8825604449957609}
12/22/2021 08:39:38 - INFO - codeparrot_training - Step 5514: {'lr': 0.000485967671476891, 'samples': 2823680, 'steps': 5514, 'batch_loss/train': 0.8664020858705044}
12/22/2021 08:39:49 - INFO - codeparrot_training - Step 5515: {'lr': 0.00048596246420507005, 'samples': 2824192, 'steps': 5515, 'batch_loss/train': 0.7894297251477838}
12/22/2021 08:40:01 - INFO - codeparrot_training - Step 5516: {'lr': 0.00048595725599515137, 'samples': 2824704, 'steps': 5516, 'batch_loss/train': 0.8341816905885935}
12/22/2021 08:40:11 - INFO - codeparrot_training - Step 5517: {'lr': 0.0004859520468471559, 'samples': 2825216, 'steps': 5517, 'batch_loss/train': 0.8168144059600309}
12/22/2021 08:40:22 - INFO - codeparrot_training - Step 5518: {'lr': 0.00048594683676110405, 'samples': 2825728, 'steps': 5518, 'batch_loss/train': 0.8812808161601424}
12/22/2021 08:40:34 - INFO - codeparrot_training - Step 5519: {'lr': 0.0004859416257370167, 'samples': 2826240, 'steps': 5519, 'batch_loss/train': 0.8031903663650155}
12/22/2021 08:40:45 - INFO - codeparrot_training - Step 5520: {'lr': 0.0004859364137749146, 'samples': 2826752, 'steps': 5520, 'batch_loss/train': 0.6449619494378567}
12/22/2021 08:40:55 - INFO - codeparrot_training - Step 5521: {'lr': 0.00048593120087481835, 'samples': 2827264, 'steps': 5521, 'batch_loss/train': 0.8255366031080484}
12/22/2021 08:41:06 - INFO - codeparrot_training - Step 5522: {'lr': 0.00048592598703674873, 'samples': 2827776, 'steps': 5522, 'batch_loss/train': 0.8023286764509976}
12/22/2021 08:41:20 - INFO - codeparrot_training - Step 5523: {'lr': 0.00048592077226072643, 'samples': 2828288, 'steps': 5523, 'batch_loss/train': 0.8097060475265607}
12/22/2021 08:41:30 - INFO - codeparrot_training - Step 5524: {'lr': 0.00048591555654677233, 'samples': 2828800, 'steps': 5524, 'batch_loss/train': 0.8140312377363443}
12/22/2021 08:41:41 - INFO - codeparrot_training - Step 5525: {'lr': 0.00048591033989490697, 'samples': 2829312, 'steps': 5525, 'batch_loss/train': 0.8378475261852145}
12/22/2021 08:41:53 - INFO - codeparrot_training - Step 5526: {'lr': 0.00048590512230515116, 'samples': 2829824, 'steps': 5526, 'batch_loss/train': 0.9162900210358202}
12/22/2021 08:42:03 - INFO - codeparrot_training - Step 5527: {'lr': 0.00048589990377752567, 'samples': 2830336, 'steps': 5527, 'batch_loss/train': 0.9057242274284363}
12/22/2021 08:42:14 - INFO - codeparrot_training - Step 5528: {'lr': 0.0004858946843120513, 'samples': 2830848, 'steps': 5528, 'batch_loss/train': 0.9070913046598434}
12/22/2021 08:42:28 - INFO - codeparrot_training - Step 5529: {'lr': 0.00048588946390874864, 'samples': 2831360, 'steps': 5529, 'batch_loss/train': 0.870794995687902}
12/22/2021 08:42:38 - INFO - codeparrot_training - Step 5530: {'lr': 0.0004858842425676386, 'samples': 2831872, 'steps': 5530, 'batch_loss/train': 0.7901262287050486}
12/22/2021 08:42:49 - INFO - codeparrot_training - Step 5531: {'lr': 0.00048587902028874187, 'samples': 2832384, 'steps': 5531, 'batch_loss/train': 0.8715042271651328}
12/22/2021 08:43:00 - INFO - codeparrot_training - Step 5532: {'lr': 0.00048587379707207924, 'samples': 2832896, 'steps': 5532, 'batch_loss/train': 0.846978253684938}
12/22/2021 08:43:12 - INFO - codeparrot_training - Step 5533: {'lr': 0.0004858685729176714, 'samples': 2833408, 'steps': 5533, 'batch_loss/train': 0.9900704133324325}
12/22/2021 08:43:22 - INFO - codeparrot_training - Step 5534: {'lr': 0.0004858633478255392, 'samples': 2833920, 'steps': 5534, 'batch_loss/train': 0.8722764728590846}
12/22/2021 08:43:33 - INFO - codeparrot_training - Step 5535: {'lr': 0.00048585812179570344, 'samples': 2834432, 'steps': 5535, 'batch_loss/train': 0.9359913216903806}
12/22/2021 08:43:45 - INFO - codeparrot_training - Step 5536: {'lr': 0.0004858528948281848, 'samples': 2834944, 'steps': 5536, 'batch_loss/train': 0.8713368540629745}
12/22/2021 08:43:56 - INFO - codeparrot_training - Step 5537: {'lr': 0.0004858476669230041, 'samples': 2835456, 'steps': 5537, 'batch_loss/train': 0.7785305029246956}
12/22/2021 08:44:06 - INFO - codeparrot_training - Step 5538: {'lr': 0.00048584243808018216, 'samples': 2835968, 'steps': 5538, 'batch_loss/train': 0.8410163826774806}
12/22/2021 08:44:20 - INFO - codeparrot_training - Step 5539: {'lr': 0.00048583720829973963, 'samples': 2836480, 'steps': 5539, 'batch_loss/train': 0.7603269438259304}
12/22/2021 08:44:31 - INFO - codeparrot_training - Step 5540: {'lr': 0.0004858319775816975, 'samples': 2836992, 'steps': 5540, 'batch_loss/train': 0.8596139345318079}
12/22/2021 08:44:41 - INFO - codeparrot_training - Step 5541: {'lr': 0.0004858267459260765, 'samples': 2837504, 'steps': 5541, 'batch_loss/train': 0.8627270655706525}
12/22/2021 08:44:53 - INFO - codeparrot_training - Step 5542: {'lr': 0.00048582151333289734, 'samples': 2838016, 'steps': 5542, 'batch_loss/train': 0.8910313434898853}
12/22/2021 08:45:04 - INFO - codeparrot_training - Step 5543: {'lr': 0.0004858162798021809, 'samples': 2838528, 'steps': 5543, 'batch_loss/train': 0.7853014236316085}
12/22/2021 08:45:15 - INFO - codeparrot_training - Step 5544: {'lr': 0.000485811045333948, 'samples': 2839040, 'steps': 5544, 'batch_loss/train': 0.8146105892956257}
12/22/2021 08:45:25 - INFO - codeparrot_training - Step 5545: {'lr': 0.00048580580992821945, 'samples': 2839552, 'steps': 5545, 'batch_loss/train': 0.8464458831585944}
12/22/2021 08:45:39 - INFO - codeparrot_training - Step 5546: {'lr': 0.00048580057358501595, 'samples': 2840064, 'steps': 5546, 'batch_loss/train': 0.82735098246485}
12/22/2021 08:45:50 - INFO - codeparrot_training - Step 5547: {'lr': 0.00048579533630435846, 'samples': 2840576, 'steps': 5547, 'batch_loss/train': 0.7756375437602401}
12/22/2021 08:46:01 - INFO - codeparrot_training - Step 5548: {'lr': 0.00048579009808626773, 'samples': 2841088, 'steps': 5548, 'batch_loss/train': 1.1358589185401797}
12/22/2021 08:46:13 - INFO - codeparrot_training - Step 5549: {'lr': 0.00048578485893076463, 'samples': 2841600, 'steps': 5549, 'batch_loss/train': 0.8364254701882601}
12/22/2021 08:46:23 - INFO - codeparrot_training - Step 5550: {'lr': 0.00048577961883786993, 'samples': 2842112, 'steps': 5550, 'batch_loss/train': 0.9473873672541231}
12/22/2021 08:46:34 - INFO - codeparrot_training - Step 5551: {'lr': 0.0004857743778076045, 'samples': 2842624, 'steps': 5551, 'batch_loss/train': 0.8768483297899365}
12/22/2021 08:46:46 - INFO - codeparrot_training - Step 5552: {'lr': 0.00048576913583998915, 'samples': 2843136, 'steps': 5552, 'batch_loss/train': 0.8606969127431512}
12/22/2021 08:46:57 - INFO - codeparrot_training - Step 5553: {'lr': 0.00048576389293504476, 'samples': 2843648, 'steps': 5553, 'batch_loss/train': 0.8788987365551293}
12/22/2021 08:47:07 - INFO - codeparrot_training - Step 5554: {'lr': 0.0004857586490927921, 'samples': 2844160, 'steps': 5554, 'batch_loss/train': 0.8559087114408612}
12/22/2021 08:47:18 - INFO - codeparrot_training - Step 5555: {'lr': 0.0004857534043132521, 'samples': 2844672, 'steps': 5555, 'batch_loss/train': 0.8014709106646478}
12/22/2021 08:47:30 - INFO - codeparrot_training - Step 5556: {'lr': 0.0004857481585964456, 'samples': 2845184, 'steps': 5556, 'batch_loss/train': 0.7358614474069327}
12/22/2021 08:47:41 - INFO - codeparrot_training - Step 5557: {'lr': 0.0004857429119423935, 'samples': 2845696, 'steps': 5557, 'batch_loss/train': 0.8169999374076724}
12/22/2021 08:47:51 - INFO - codeparrot_training - Step 5558: {'lr': 0.00048573766435111647, 'samples': 2846208, 'steps': 5558, 'batch_loss/train': 0.8463625125586987}
12/22/2021 08:48:05 - INFO - codeparrot_training - Step 5559: {'lr': 0.0004857324158226356, 'samples': 2846720, 'steps': 5559, 'batch_loss/train': 0.8952081922907382}
12/22/2021 08:48:16 - INFO - codeparrot_training - Step 5560: {'lr': 0.0004857271663569716, 'samples': 2847232, 'steps': 5560, 'batch_loss/train': 1.0927448188886046}
12/22/2021 08:48:26 - INFO - codeparrot_training - Step 5561: {'lr': 0.00048572191595414537, 'samples': 2847744, 'steps': 5561, 'batch_loss/train': 0.7739031529054046}
12/22/2021 08:48:39 - INFO - codeparrot_training - Step 5562: {'lr': 0.00048571666461417784, 'samples': 2848256, 'steps': 5562, 'batch_loss/train': 0.8702233172953129}
12/22/2021 08:48:49 - INFO - codeparrot_training - Step 5563: {'lr': 0.00048571141233708987, 'samples': 2848768, 'steps': 5563, 'batch_loss/train': 0.9490427523851395}
12/22/2021 08:49:00 - INFO - codeparrot_training - Step 5564: {'lr': 0.0004857061591229023, 'samples': 2849280, 'steps': 5564, 'batch_loss/train': 0.6757154051447287}
12/22/2021 08:49:10 - INFO - codeparrot_training - Step 5565: {'lr': 0.0004857009049716361, 'samples': 2849792, 'steps': 5565, 'batch_loss/train': 0.9381251344457269}
12/22/2021 08:49:23 - INFO - codeparrot_training - Step 5566: {'lr': 0.00048569564988331206, 'samples': 2850304, 'steps': 5566, 'batch_loss/train': 0.9770109355449677}
12/22/2021 08:49:33 - INFO - codeparrot_training - Step 5567: {'lr': 0.00048569039385795113, 'samples': 2850816, 'steps': 5567, 'batch_loss/train': 0.8914863709360361}
12/22/2021 08:49:44 - INFO - codeparrot_training - Step 5568: {'lr': 0.0004856851368955741, 'samples': 2851328, 'steps': 5568, 'batch_loss/train': 0.870058459462598}
12/22/2021 08:49:57 - INFO - codeparrot_training - Step 5569: {'lr': 0.00048567987899620203, 'samples': 2851840, 'steps': 5569, 'batch_loss/train': 0.8764712382107973}
12/22/2021 08:50:08 - INFO - codeparrot_training - Step 5570: {'lr': 0.00048567462015985577, 'samples': 2852352, 'steps': 5570, 'batch_loss/train': 0.8958473792299628}
12/22/2021 08:50:18 - INFO - codeparrot_training - Step 5571: {'lr': 0.00048566936038655614, 'samples': 2852864, 'steps': 5571, 'batch_loss/train': 0.8231508862227201}
12/22/2021 08:50:32 - INFO - codeparrot_training - Step 5572: {'lr': 0.0004856640996763241, 'samples': 2853376, 'steps': 5572, 'batch_loss/train': 0.9206551676616073}
12/22/2021 08:50:42 - INFO - codeparrot_training - Step 5573: {'lr': 0.00048565883802918064, 'samples': 2853888, 'steps': 5573, 'batch_loss/train': 1.2869197204709053}
12/22/2021 08:50:53 - INFO - codeparrot_training - Step 5574: {'lr': 0.00048565357544514667, 'samples': 2854400, 'steps': 5574, 'batch_loss/train': 0.9124233983457088}
12/22/2021 08:51:03 - INFO - codeparrot_training - Step 5575: {'lr': 0.000485648311924243, 'samples': 2854912, 'steps': 5575, 'batch_loss/train': 0.8390075993956998}
12/22/2021 08:51:17 - INFO - codeparrot_training - Step 5576: {'lr': 0.0004856430474664906, 'samples': 2855424, 'steps': 5576, 'batch_loss/train': 0.8218794935382903}
12/22/2021 08:51:28 - INFO - codeparrot_training - Step 5577: {'lr': 0.00048563778207191046, 'samples': 2855936, 'steps': 5577, 'batch_loss/train': 0.768242622143589}
12/22/2021 08:51:38 - INFO - codeparrot_training - Step 5578: {'lr': 0.0004856325157405235, 'samples': 2856448, 'steps': 5578, 'batch_loss/train': 0.6984410103177652}
12/22/2021 08:51:50 - INFO - codeparrot_training - Step 5579: {'lr': 0.0004856272484723506, 'samples': 2856960, 'steps': 5579, 'batch_loss/train': 0.7410258282907307}
12/22/2021 08:52:01 - INFO - codeparrot_training - Step 5580: {'lr': 0.00048562198026741264, 'samples': 2857472, 'steps': 5580, 'batch_loss/train': 0.9054269585758448}
12/22/2021 08:52:12 - INFO - codeparrot_training - Step 5581: {'lr': 0.0004856167111257308, 'samples': 2857984, 'steps': 5581, 'batch_loss/train': 0.8236196301877499}
12/22/2021 08:52:24 - INFO - codeparrot_training - Step 5582: {'lr': 0.0004856114410473258, 'samples': 2858496, 'steps': 5582, 'batch_loss/train': 0.8328793570399284}
12/22/2021 08:52:34 - INFO - codeparrot_training - Step 5583: {'lr': 0.0004856061700322187, 'samples': 2859008, 'steps': 5583, 'batch_loss/train': 0.9219427118077874}
12/22/2021 08:52:45 - INFO - codeparrot_training - Step 5584: {'lr': 0.0004856008980804304, 'samples': 2859520, 'steps': 5584, 'batch_loss/train': 0.7870924239978194}
12/22/2021 08:52:56 - INFO - codeparrot_training - Step 5585: {'lr': 0.00048559562519198186, 'samples': 2860032, 'steps': 5585, 'batch_loss/train': 0.8786496752873063}
12/22/2021 08:53:10 - INFO - codeparrot_training - Step 5586: {'lr': 0.0004855903513668942, 'samples': 2860544, 'steps': 5586, 'batch_loss/train': 0.8508237898349762}
12/22/2021 08:53:20 - INFO - codeparrot_training - Step 5587: {'lr': 0.00048558507660518814, 'samples': 2861056, 'steps': 5587, 'batch_loss/train': 0.8349921582266688}
12/22/2021 08:53:31 - INFO - codeparrot_training - Step 5588: {'lr': 0.0004855798009068848, 'samples': 2861568, 'steps': 5588, 'batch_loss/train': 0.8318051715032198}
12/22/2021 08:53:43 - INFO - codeparrot_training - Step 5589: {'lr': 0.0004855745242720051, 'samples': 2862080, 'steps': 5589, 'batch_loss/train': 0.9037135103717446}
12/22/2021 08:53:54 - INFO - codeparrot_training - Step 5590: {'lr': 0.0004855692467005701, 'samples': 2862592, 'steps': 5590, 'batch_loss/train': 0.8527002623304725}
12/22/2021 08:54:05 - INFO - codeparrot_training - Step 5591: {'lr': 0.0004855639681926007, 'samples': 2863104, 'steps': 5591, 'batch_loss/train': 0.8715710649266839}
12/22/2021 08:54:17 - INFO - codeparrot_training - Step 5592: {'lr': 0.0004855586887481179, 'samples': 2863616, 'steps': 5592, 'batch_loss/train': 0.7197220651432872}
12/22/2021 08:54:27 - INFO - codeparrot_training - Step 5593: {'lr': 0.0004855534083671427, 'samples': 2864128, 'steps': 5593, 'batch_loss/train': 0.9055227981880307}
12/22/2021 08:54:38 - INFO - codeparrot_training - Step 5594: {'lr': 0.0004855481270496961, 'samples': 2864640, 'steps': 5594, 'batch_loss/train': 0.7854229635559022}
12/22/2021 08:54:52 - INFO - codeparrot_training - Step 5595: {'lr': 0.000485542844795799, 'samples': 2865152, 'steps': 5595, 'batch_loss/train': 0.8651060927659273}
12/22/2021 08:55:02 - INFO - codeparrot_training - Step 5596: {'lr': 0.0004855375616054726, 'samples': 2865664, 'steps': 5596, 'batch_loss/train': 0.9209417384117842}
12/22/2021 08:55:13 - INFO - codeparrot_training - Step 5597: {'lr': 0.00048553227747873774, 'samples': 2866176, 'steps': 5597, 'batch_loss/train': 0.8273551925085485}
12/22/2021 08:55:23 - INFO - codeparrot_training - Step 5598: {'lr': 0.00048552699241561547, 'samples': 2866688, 'steps': 5598, 'batch_loss/train': 0.9791857954114676}
12/22/2021 08:55:36 - INFO - codeparrot_training - Step 5599: {'lr': 0.00048552170641612686, 'samples': 2867200, 'steps': 5599, 'batch_loss/train': 0.8660226156935096}
12/22/2021 08:55:46 - INFO - codeparrot_training - Step 5600: {'lr': 0.00048551641948029286, 'samples': 2867712, 'steps': 5600, 'batch_loss/train': 0.8952138824388385}
12/22/2021 08:55:57 - INFO - codeparrot_training - Step 5601: {'lr': 0.00048551113160813443, 'samples': 2868224, 'steps': 5601, 'batch_loss/train': 0.7802688446827233}
12/22/2021 08:56:09 - INFO - codeparrot_training - Step 5602: {'lr': 0.0004855058427996727, 'samples': 2868736, 'steps': 5602, 'batch_loss/train': 0.84813292324543}
12/22/2021 08:56:20 - INFO - codeparrot_training - Step 5603: {'lr': 0.0004855005530549287, 'samples': 2869248, 'steps': 5603, 'batch_loss/train': 0.7951269829645753}
12/22/2021 08:56:30 - INFO - codeparrot_training - Step 5604: {'lr': 0.0004854952623739235, 'samples': 2869760, 'steps': 5604, 'batch_loss/train': 0.6476303532253951}
12/22/2021 08:56:42 - INFO - codeparrot_training - Step 5605: {'lr': 0.00048548997075667787, 'samples': 2870272, 'steps': 5605, 'batch_loss/train': 0.8705633049830794}
12/22/2021 08:56:53 - INFO - codeparrot_training - Step 5606: {'lr': 0.00048548467820321316, 'samples': 2870784, 'steps': 5606, 'batch_loss/train': 0.8453765604645014}
12/22/2021 08:57:04 - INFO - codeparrot_training - Step 5607: {'lr': 0.0004854793847135502, 'samples': 2871296, 'steps': 5607, 'batch_loss/train': 0.7808441187953576}
12/22/2021 08:57:14 - INFO - codeparrot_training - Step 5608: {'lr': 0.00048547409028771026, 'samples': 2871808, 'steps': 5608, 'batch_loss/train': 0.7732529072090983}
12/22/2021 08:57:28 - INFO - codeparrot_training - Step 5609: {'lr': 0.00048546879492571414, 'samples': 2872320, 'steps': 5609, 'batch_loss/train': 0.9469293244183064}
12/22/2021 08:57:39 - INFO - codeparrot_training - Step 5610: {'lr': 0.000485463498627583, 'samples': 2872832, 'steps': 5610, 'batch_loss/train': 0.8450426245108247}
12/22/2021 08:57:50 - INFO - codeparrot_training - Step 5611: {'lr': 0.00048545820139333796, 'samples': 2873344, 'steps': 5611, 'batch_loss/train': 0.8929669661447406}
12/22/2021 08:58:02 - INFO - codeparrot_training - Step 5612: {'lr': 0.00048545290322299996, 'samples': 2873856, 'steps': 5612, 'batch_loss/train': 0.8080568788573146}
12/22/2021 08:58:12 - INFO - codeparrot_training - Step 5613: {'lr': 0.00048544760411659016, 'samples': 2874368, 'steps': 5613, 'batch_loss/train': 0.8007675828412175}
12/22/2021 08:58:23 - INFO - codeparrot_training - Step 5614: {'lr': 0.00048544230407412953, 'samples': 2874880, 'steps': 5614, 'batch_loss/train': 0.8090135948732495}
12/22/2021 08:58:37 - INFO - codeparrot_training - Step 5615: {'lr': 0.0004854370030956392, 'samples': 2875392, 'steps': 5615, 'batch_loss/train': 0.8960241777822375}
12/22/2021 08:58:48 - INFO - codeparrot_training - Step 5616: {'lr': 0.0004854317011811403, 'samples': 2875904, 'steps': 5616, 'batch_loss/train': 0.8461141786538064}
12/22/2021 08:58:58 - INFO - codeparrot_training - Step 5617: {'lr': 0.00048542639833065385, 'samples': 2876416, 'steps': 5617, 'batch_loss/train': 0.7576275798492134}
12/22/2021 08:59:10 - INFO - codeparrot_training - Step 5618: {'lr': 0.0004854210945442009, 'samples': 2876928, 'steps': 5618, 'batch_loss/train': 0.8202845184132457}
12/22/2021 08:59:21 - INFO - codeparrot_training - Step 5619: {'lr': 0.0004854157898218026, 'samples': 2877440, 'steps': 5619, 'batch_loss/train': 0.8549157893285155}
12/22/2021 08:59:32 - INFO - codeparrot_training - Step 5620: {'lr': 0.00048541048416348, 'samples': 2877952, 'steps': 5620, 'batch_loss/train': 0.9295331304892898}
12/22/2021 08:59:42 - INFO - codeparrot_training - Step 5621: {'lr': 0.0004854051775692542, 'samples': 2878464, 'steps': 5621, 'batch_loss/train': 0.8786707744002342}
12/22/2021 08:59:54 - INFO - codeparrot_training - Step 5622: {'lr': 0.0004853998700391463, 'samples': 2878976, 'steps': 5622, 'batch_loss/train': 0.7456266935914755}
12/22/2021 09:00:05 - INFO - codeparrot_training - Step 5623: {'lr': 0.0004853945615731774, 'samples': 2879488, 'steps': 5623, 'batch_loss/train': 0.8320846566930413}
12/22/2021 09:00:15 - INFO - codeparrot_training - Step 5624: {'lr': 0.00048538925217136855, 'samples': 2880000, 'steps': 5624, 'batch_loss/train': 0.830588411539793}
12/22/2021 09:00:29 - INFO - codeparrot_training - Step 5625: {'lr': 0.000485383941833741, 'samples': 2880512, 'steps': 5625, 'batch_loss/train': 0.8330432586371899}
12/22/2021 09:00:40 - INFO - codeparrot_training - Step 5626: {'lr': 0.00048537863056031573, 'samples': 2881024, 'steps': 5626, 'batch_loss/train': 0.7704596528783441}
12/22/2021 09:00:51 - INFO - codeparrot_training - Step 5627: {'lr': 0.00048537331835111385, 'samples': 2881536, 'steps': 5627, 'batch_loss/train': 0.7445763302966952}
12/22/2021 09:01:03 - INFO - codeparrot_training - Step 5628: {'lr': 0.00048536800520615665, 'samples': 2882048, 'steps': 5628, 'batch_loss/train': 0.8866435810923576}
12/22/2021 09:01:13 - INFO - codeparrot_training - Step 5629: {'lr': 0.00048536269112546505, 'samples': 2882560, 'steps': 5629, 'batch_loss/train': 0.8341096239164472}
12/22/2021 09:01:24 - INFO - codeparrot_training - Step 5630: {'lr': 0.0004853573761090603, 'samples': 2883072, 'steps': 5630, 'batch_loss/train': 0.8282598536461592}
12/22/2021 09:01:34 - INFO - codeparrot_training - Step 5631: {'lr': 0.00048535206015696343, 'samples': 2883584, 'steps': 5631, 'batch_loss/train': 0.8537438176572323}
12/22/2021 09:01:47 - INFO - codeparrot_training - Step 5632: {'lr': 0.0004853467432691957, 'samples': 2884096, 'steps': 5632, 'batch_loss/train': 0.923003301024437}
12/22/2021 09:01:57 - INFO - codeparrot_training - Step 5633: {'lr': 0.00048534142544577814, 'samples': 2884608, 'steps': 5633, 'batch_loss/train': 0.8490701038390398}
12/22/2021 09:02:08 - INFO - codeparrot_training - Step 5634: {'lr': 0.00048533610668673193, 'samples': 2885120, 'steps': 5634, 'batch_loss/train': 0.9004959352314472}
12/22/2021 09:02:22 - INFO - codeparrot_training - Step 5635: {'lr': 0.00048533078699207824, 'samples': 2885632, 'steps': 5635, 'batch_loss/train': 0.7819297071546316}
12/22/2021 09:02:32 - INFO - codeparrot_training - Step 5636: {'lr': 0.00048532546636183825, 'samples': 2886144, 'steps': 5636, 'batch_loss/train': 0.9381141848862171}
12/22/2021 09:02:43 - INFO - codeparrot_training - Step 5637: {'lr': 0.000485320144796033, 'samples': 2886656, 'steps': 5637, 'batch_loss/train': 0.8372248605883215}
12/22/2021 09:02:55 - INFO - codeparrot_training - Step 5638: {'lr': 0.0004853148222946837, 'samples': 2887168, 'steps': 5638, 'batch_loss/train': 0.8524871589615941}
12/22/2021 09:03:06 - INFO - codeparrot_training - Step 5639: {'lr': 0.0004853094988578115, 'samples': 2887680, 'steps': 5639, 'batch_loss/train': 0.8288182783871889}
12/22/2021 09:03:16 - INFO - codeparrot_training - Step 5640: {'lr': 0.00048530417448543764, 'samples': 2888192, 'steps': 5640, 'batch_loss/train': 0.752754190005362}
12/22/2021 09:03:28 - INFO - codeparrot_training - Step 5641: {'lr': 0.0004852988491775832, 'samples': 2888704, 'steps': 5641, 'batch_loss/train': 0.7358274096623063}
12/22/2021 09:03:39 - INFO - codeparrot_training - Step 5642: {'lr': 0.0004852935229342694, 'samples': 2889216, 'steps': 5642, 'batch_loss/train': 0.8638804042711854}
12/22/2021 09:03:50 - INFO - codeparrot_training - Step 5643: {'lr': 0.0004852881957555174, 'samples': 2889728, 'steps': 5643, 'batch_loss/train': 0.8110417518764734}
12/22/2021 09:04:00 - INFO - codeparrot_training - Step 5644: {'lr': 0.00048528286764134833, 'samples': 2890240, 'steps': 5644, 'batch_loss/train': 0.6515327373053879}
12/22/2021 09:04:14 - INFO - codeparrot_training - Step 5645: {'lr': 0.00048527753859178347, 'samples': 2890752, 'steps': 5645, 'batch_loss/train': 0.8502593049779534}
12/22/2021 09:04:25 - INFO - codeparrot_training - Step 5646: {'lr': 0.0004852722086068439, 'samples': 2891264, 'steps': 5646, 'batch_loss/train': 0.9542822507210076}
12/22/2021 09:04:36 - INFO - codeparrot_training - Step 5647: {'lr': 0.00048526687768655086, 'samples': 2891776, 'steps': 5647, 'batch_loss/train': 0.7401347253471613}
12/22/2021 09:04:48 - INFO - codeparrot_training - Step 5648: {'lr': 0.0004852615458309256, 'samples': 2892288, 'steps': 5648, 'batch_loss/train': 0.9341551875695586}
12/22/2021 09:04:59 - INFO - codeparrot_training - Step 5649: {'lr': 0.00048525621303998923, 'samples': 2892800, 'steps': 5649, 'batch_loss/train': 0.8810380985960364}
12/22/2021 09:05:09 - INFO - codeparrot_training - Step 5650: {'lr': 0.000485250879313763, 'samples': 2893312, 'steps': 5650, 'batch_loss/train': 0.6895835986360908}
12/22/2021 09:05:20 - INFO - codeparrot_training - Step 5651: {'lr': 0.0004852455446522681, 'samples': 2893824, 'steps': 5651, 'batch_loss/train': 0.7786202603019774}
12/22/2021 09:05:32 - INFO - codeparrot_training - Step 5652: {'lr': 0.0004852402090555258, 'samples': 2894336, 'steps': 5652, 'batch_loss/train': 0.8557143236976117}
12/22/2021 09:05:42 - INFO - codeparrot_training - Step 5653: {'lr': 0.0004852348725235571, 'samples': 2894848, 'steps': 5653, 'batch_loss/train': 0.7833751454018056}
12/22/2021 09:05:53 - INFO - codeparrot_training - Step 5654: {'lr': 0.0004852295350563836, 'samples': 2895360, 'steps': 5654, 'batch_loss/train': 0.8460045491810888}
12/22/2021 09:06:07 - INFO - codeparrot_training - Step 5655: {'lr': 0.00048522419665402605, 'samples': 2895872, 'steps': 5655, 'batch_loss/train': 0.8101686034351587}
12/22/2021 09:06:18 - INFO - codeparrot_training - Step 5656: {'lr': 0.0004852188573165061, 'samples': 2896384, 'steps': 5656, 'batch_loss/train': 0.771273156395182}
12/22/2021 09:06:28 - INFO - codeparrot_training - Step 5657: {'lr': 0.00048521351704384466, 'samples': 2896896, 'steps': 5657, 'batch_loss/train': 0.8007986098527908}
12/22/2021 09:06:41 - INFO - codeparrot_training - Step 5658: {'lr': 0.00048520817583606315, 'samples': 2897408, 'steps': 5658, 'batch_loss/train': 0.8284246879629791}
12/22/2021 09:06:52 - INFO - codeparrot_training - Step 5659: {'lr': 0.0004852028336931827, 'samples': 2897920, 'steps': 5659, 'batch_loss/train': 1.3159173969179392}
12/22/2021 09:07:02 - INFO - codeparrot_training - Step 5660: {'lr': 0.00048519749061522454, 'samples': 2898432, 'steps': 5660, 'batch_loss/train': 0.8948812177404761}
12/22/2021 09:07:13 - INFO - codeparrot_training - Step 5661: {'lr': 0.0004851921466022101, 'samples': 2898944, 'steps': 5661, 'batch_loss/train': 0.915553679689765}
12/22/2021 09:07:27 - INFO - codeparrot_training - Step 5662: {'lr': 0.0004851868016541604, 'samples': 2899456, 'steps': 5662, 'batch_loss/train': 0.6808586689876392}
12/22/2021 09:07:37 - INFO - codeparrot_training - Step 5663: {'lr': 0.00048518145577109686, 'samples': 2899968, 'steps': 5663, 'batch_loss/train': 0.8245192030444741}
12/22/2021 09:07:48 - INFO - codeparrot_training - Step 5664: {'lr': 0.0004851761089530406, 'samples': 2900480, 'steps': 5664, 'batch_loss/train': 0.8475692011415958}
12/22/2021 09:08:00 - INFO - codeparrot_training - Step 5665: {'lr': 0.0004851707612000129, 'samples': 2900992, 'steps': 5665, 'batch_loss/train': 0.8313924353569746}
12/22/2021 09:08:11 - INFO - codeparrot_training - Step 5666: {'lr': 0.00048516541251203513, 'samples': 2901504, 'steps': 5666, 'batch_loss/train': 0.7609308627434075}
12/22/2021 09:08:21 - INFO - codeparrot_training - Step 5667: {'lr': 0.0004851600628891284, 'samples': 2902016, 'steps': 5667, 'batch_loss/train': 0.9095109631307423}
12/22/2021 09:08:33 - INFO - codeparrot_training - Step 5668: {'lr': 0.00048515471233131406, 'samples': 2902528, 'steps': 5668, 'batch_loss/train': 0.8473911797627807}
12/22/2021 09:08:44 - INFO - codeparrot_training - Step 5669: {'lr': 0.00048514936083861337, 'samples': 2903040, 'steps': 5669, 'batch_loss/train': 0.7933294787071645}
12/22/2021 09:08:54 - INFO - codeparrot_training - Step 5670: {'lr': 0.00048514400841104764, 'samples': 2903552, 'steps': 5670, 'batch_loss/train': 0.840327893383801}
12/22/2021 09:09:05 - INFO - codeparrot_training - Step 5671: {'lr': 0.0004851386550486382, 'samples': 2904064, 'steps': 5671, 'batch_loss/train': 0.9986354792490602}
12/22/2021 09:09:17 - INFO - codeparrot_training - Step 5672: {'lr': 0.00048513330075140615, 'samples': 2904576, 'steps': 5672, 'batch_loss/train': 0.6958081829361618}
12/22/2021 09:09:28 - INFO - codeparrot_training - Step 5673: {'lr': 0.0004851279455193729, 'samples': 2905088, 'steps': 5673, 'batch_loss/train': 0.8573737805709243}
12/22/2021 09:09:39 - INFO - codeparrot_training - Step 5674: {'lr': 0.0004851225893525597, 'samples': 2905600, 'steps': 5674, 'batch_loss/train': 0.9663674524053931}
12/22/2021 09:09:52 - INFO - codeparrot_training - Step 5675: {'lr': 0.00048511723225098783, 'samples': 2906112, 'steps': 5675, 'batch_loss/train': 0.8396362457424402}
12/22/2021 09:10:03 - INFO - codeparrot_training - Step 5676: {'lr': 0.0004851118742146787, 'samples': 2906624, 'steps': 5676, 'batch_loss/train': 0.8602829105220735}
12/22/2021 09:10:13 - INFO - codeparrot_training - Step 5677: {'lr': 0.00048510651524365353, 'samples': 2907136, 'steps': 5677, 'batch_loss/train': 0.9151310995221138}
12/22/2021 09:10:26 - INFO - codeparrot_training - Step 5678: {'lr': 0.0004851011553379336, 'samples': 2907648, 'steps': 5678, 'batch_loss/train': 0.9108059927821159}
12/22/2021 09:10:36 - INFO - codeparrot_training - Step 5679: {'lr': 0.00048509579449754024, 'samples': 2908160, 'steps': 5679, 'batch_loss/train': 0.8114614123478532}
12/22/2021 09:10:47 - INFO - codeparrot_training - Step 5680: {'lr': 0.00048509043272249475, 'samples': 2908672, 'steps': 5680, 'batch_loss/train': 0.9146654922515154}
12/22/2021 09:10:57 - INFO - codeparrot_training - Step 5681: {'lr': 0.0004850850700128185, 'samples': 2909184, 'steps': 5681, 'batch_loss/train': 0.8479226436465979}
12/22/2021 09:11:10 - INFO - codeparrot_training - Step 5682: {'lr': 0.0004850797063685328, 'samples': 2909696, 'steps': 5682, 'batch_loss/train': 0.7706514657475054}
12/22/2021 09:11:20 - INFO - codeparrot_training - Step 5683: {'lr': 0.0004850743417896589, 'samples': 2910208, 'steps': 5683, 'batch_loss/train': 0.8841035012155771}
12/22/2021 09:11:31 - INFO - codeparrot_training - Step 5684: {'lr': 0.00048506897627621825, 'samples': 2910720, 'steps': 5684, 'batch_loss/train': 0.8735576020553708}
12/22/2021 09:11:45 - INFO - codeparrot_training - Step 5685: {'lr': 0.00048506360982823206, 'samples': 2911232, 'steps': 5685, 'batch_loss/train': 0.9123560599982738}
12/22/2021 09:11:55 - INFO - codeparrot_training - Step 5686: {'lr': 0.0004850582424457217, 'samples': 2911744, 'steps': 5686, 'batch_loss/train': 0.8719707326963544}
12/22/2021 09:12:06 - INFO - codeparrot_training - Step 5687: {'lr': 0.0004850528741287085, 'samples': 2912256, 'steps': 5687, 'batch_loss/train': 0.7116758991032839}
12/22/2021 09:12:19 - INFO - codeparrot_training - Step 5688: {'lr': 0.0004850475048772139, 'samples': 2912768, 'steps': 5688, 'batch_loss/train': 0.8443560642190278}
12/22/2021 09:12:29 - INFO - codeparrot_training - Step 5689: {'lr': 0.0004850421346912591, 'samples': 2913280, 'steps': 5689, 'batch_loss/train': 0.4805158629314974}
12/22/2021 09:12:40 - INFO - codeparrot_training - Step 5690: {'lr': 0.00048503676357086555, 'samples': 2913792, 'steps': 5690, 'batch_loss/train': 0.9052162892185152}
12/22/2021 09:12:51 - INFO - codeparrot_training - Step 5691: {'lr': 0.0004850313915160546, 'samples': 2914304, 'steps': 5691, 'batch_loss/train': 1.4162810402922332}
12/22/2021 09:13:04 - INFO - codeparrot_training - Step 5692: {'lr': 0.0004850260185268476, 'samples': 2914816, 'steps': 5692, 'batch_loss/train': 0.7795289093628526}
12/22/2021 09:13:15 - INFO - codeparrot_training - Step 5693: {'lr': 0.0004850206446032658, 'samples': 2915328, 'steps': 5693, 'batch_loss/train': 0.7869266183115542}
12/22/2021 09:13:25 - INFO - codeparrot_training - Step 5694: {'lr': 0.0004850152697453307, 'samples': 2915840, 'steps': 5694, 'batch_loss/train': 0.7863834223244339}
12/22/2021 09:13:37 - INFO - codeparrot_training - Step 5695: {'lr': 0.0004850098939530636, 'samples': 2916352, 'steps': 5695, 'batch_loss/train': 0.9556503891944885}
12/22/2021 09:13:48 - INFO - codeparrot_training - Step 5696: {'lr': 0.00048500451722648596, 'samples': 2916864, 'steps': 5696, 'batch_loss/train': 0.9351096488535404}
12/22/2021 09:13:59 - INFO - codeparrot_training - Step 5697: {'lr': 0.0004849991395656191, 'samples': 2917376, 'steps': 5697, 'batch_loss/train': 0.8924522008746862}
12/22/2021 09:14:12 - INFO - codeparrot_training - Step 5698: {'lr': 0.0004849937609704843, 'samples': 2917888, 'steps': 5698, 'batch_loss/train': 0.8973119854927063}
12/22/2021 09:14:23 - INFO - codeparrot_training - Step 5699: {'lr': 0.00048498838144110315, 'samples': 2918400, 'steps': 5699, 'batch_loss/train': 0.8396797212772071}
12/22/2021 09:14:34 - INFO - codeparrot_training - Step 5700: {'lr': 0.00048498300097749687, 'samples': 2918912, 'steps': 5700, 'batch_loss/train': 0.8692030804231763}
12/22/2021 09:14:44 - INFO - codeparrot_training - Step 5701: {'lr': 0.0004849776195796869, 'samples': 2919424, 'steps': 5701, 'batch_loss/train': 0.7290507066063583}
12/22/2021 09:14:56 - INFO - codeparrot_training - Step 5702: {'lr': 0.0004849722372476947, 'samples': 2919936, 'steps': 5702, 'batch_loss/train': 0.8137485506013036}
12/22/2021 09:15:07 - INFO - codeparrot_training - Step 5703: {'lr': 0.0004849668539815415, 'samples': 2920448, 'steps': 5703, 'batch_loss/train': 0.8743936214596033}
12/22/2021 09:15:17 - INFO - codeparrot_training - Step 5704: {'lr': 0.0004849614697812489, 'samples': 2920960, 'steps': 5704, 'batch_loss/train': 0.8104946985840797}
12/22/2021 09:15:30 - INFO - codeparrot_training - Step 5705: {'lr': 0.00048495608464683813, 'samples': 2921472, 'steps': 5705, 'batch_loss/train': 0.7998543409630656}
12/22/2021 09:15:40 - INFO - codeparrot_training - Step 5706: {'lr': 0.00048495069857833076, 'samples': 2921984, 'steps': 5706, 'batch_loss/train': 0.8813775787129998}
12/22/2021 09:15:51 - INFO - codeparrot_training - Step 5707: {'lr': 0.0004849453115757481, 'samples': 2922496, 'steps': 5707, 'batch_loss/train': 0.8049501562491059}
12/22/2021 09:16:04 - INFO - codeparrot_training - Step 5708: {'lr': 0.0004849399236391115, 'samples': 2923008, 'steps': 5708, 'batch_loss/train': 0.8510736282914877}
12/22/2021 09:16:15 - INFO - codeparrot_training - Step 5709: {'lr': 0.00048493453476844265, 'samples': 2923520, 'steps': 5709, 'batch_loss/train': 0.8923796946182847}
12/22/2021 09:16:25 - INFO - codeparrot_training - Step 5710: {'lr': 0.0004849291449637626, 'samples': 2924032, 'steps': 5710, 'batch_loss/train': 0.8285945504903793}
12/22/2021 09:16:36 - INFO - codeparrot_training - Step 5711: {'lr': 0.00048492375422509314, 'samples': 2924544, 'steps': 5711, 'batch_loss/train': 0.8685862235724926}
12/22/2021 09:16:48 - INFO - codeparrot_training - Step 5712: {'lr': 0.0004849183625524555, 'samples': 2925056, 'steps': 5712, 'batch_loss/train': 0.8544315239414573}
12/22/2021 09:16:59 - INFO - codeparrot_training - Step 5713: {'lr': 0.00048491296994587107, 'samples': 2925568, 'steps': 5713, 'batch_loss/train': 0.8719198005273938}
12/22/2021 09:17:09 - INFO - codeparrot_training - Step 5714: {'lr': 0.00048490757640536145, 'samples': 2926080, 'steps': 5714, 'batch_loss/train': 0.5411126994295046}
12/22/2021 09:17:21 - INFO - codeparrot_training - Step 5715: {'lr': 0.00048490218193094806, 'samples': 2926592, 'steps': 5715, 'batch_loss/train': 0.7986998707056046}
12/22/2021 09:17:32 - INFO - codeparrot_training - Step 5716: {'lr': 0.00048489678652265214, 'samples': 2927104, 'steps': 5716, 'batch_loss/train': 0.8243153020739555}
12/22/2021 09:17:43 - INFO - codeparrot_training - Step 5717: {'lr': 0.0004848913901804954, 'samples': 2927616, 'steps': 5717, 'batch_loss/train': 0.856998291797936}
12/22/2021 09:17:57 - INFO - codeparrot_training - Step 5718: {'lr': 0.00048488599290449913, 'samples': 2928128, 'steps': 5718, 'batch_loss/train': 0.8344067432917655}
12/22/2021 09:18:07 - INFO - codeparrot_training - Step 5719: {'lr': 0.00048488059469468486, 'samples': 2928640, 'steps': 5719, 'batch_loss/train': 0.7989489464089274}
12/22/2021 09:18:18 - INFO - codeparrot_training - Step 5720: {'lr': 0.00048487519555107404, 'samples': 2929152, 'steps': 5720, 'batch_loss/train': 0.8580202455632389}
12/22/2021 09:18:29 - INFO - codeparrot_training - Step 5721: {'lr': 0.00048486979547368814, 'samples': 2929664, 'steps': 5721, 'batch_loss/train': 0.8357704030349851}
12/22/2021 09:18:41 - INFO - codeparrot_training - Step 5722: {'lr': 0.0004848643944625486, 'samples': 2930176, 'steps': 5722, 'batch_loss/train': 0.9067406225949526}
12/22/2021 09:18:51 - INFO - codeparrot_training - Step 5723: {'lr': 0.0004848589925176769, 'samples': 2930688, 'steps': 5723, 'batch_loss/train': 0.8254811326041818}
12/22/2021 09:19:02 - INFO - codeparrot_training - Step 5724: {'lr': 0.00048485358963909454, 'samples': 2931200, 'steps': 5724, 'batch_loss/train': 0.8203696710988879}
12/22/2021 09:19:15 - INFO - codeparrot_training - Step 5725: {'lr': 0.000484848185826823, 'samples': 2931712, 'steps': 5725, 'batch_loss/train': 0.8669303255155683}
12/22/2021 09:19:26 - INFO - codeparrot_training - Step 5726: {'lr': 0.0004848427810808837, 'samples': 2932224, 'steps': 5726,
'batch_loss/train': 0.8935562931001186} 12/22/2021 09:19:36 - INFO - codeparrot_training - Step 5727: {'lr': 0.0004848373754012982, 'samples': 2932736, 'steps': 5727, 'batch_loss/train': 0.8311355630867183} 12/22/2021 09:19:48 - INFO - codeparrot_training - Step 5728: {'lr': 0.000484831968788088, 'samples': 2933248, 'steps': 5728, 'batch_loss/train': 0.8106460468843579} 12/22/2021 09:19:59 - INFO - codeparrot_training - Step 5729: {'lr': 0.0004848265612412745, 'samples': 2933760, 'steps': 5729, 'batch_loss/train': 0.6537963908631355} 12/22/2021 09:20:09 - INFO - codeparrot_training - Step 5730: {'lr': 0.0004848211527608793, 'samples': 2934272, 'steps': 5730, 'batch_loss/train': 0.8708720253780484} 12/22/2021 09:20:22 - INFO - codeparrot_training - Step 5731: {'lr': 0.0004848157433469239, 'samples': 2934784, 'steps': 5731, 'batch_loss/train': 0.8374509173445404} 12/22/2021 09:20:32 - INFO - codeparrot_training - Step 5732: {'lr': 0.0004848103329994297, 'samples': 2935296, 'steps': 5732, 'batch_loss/train': 0.8305446365848184} 12/22/2021 09:20:43 - INFO - codeparrot_training - Step 5733: {'lr': 0.00048480492171841825, 'samples': 2935808, 'steps': 5733, 'batch_loss/train': 0.8553833039477468} 12/22/2021 09:20:53 - INFO - codeparrot_training - Step 5734: {'lr': 0.0004847995095039112, 'samples': 2936320, 'steps': 5734, 'batch_loss/train': 0.7489721644669771} 12/22/2021 09:21:06 - INFO - codeparrot_training - Step 5735: {'lr': 0.00048479409635592983, 'samples': 2936832, 'steps': 5735, 'batch_loss/train': 0.8157742517068982} 12/22/2021 09:21:17 - INFO - codeparrot_training - Step 5736: {'lr': 0.00048478868227449585, 'samples': 2937344, 'steps': 5736, 'batch_loss/train': 0.859787899069488} 12/22/2021 09:21:27 - INFO - codeparrot_training - Step 5737: {'lr': 0.00048478326725963074, 'samples': 2937856, 'steps': 5737, 'batch_loss/train': 0.9430506937205791} 12/22/2021 09:21:41 - INFO - codeparrot_training - Step 5738: {'lr': 0.000484777851311356, 'samples': 2938368, 'steps': 
5738, 'batch_loss/train': 0.7850810028612614} 12/22/2021 09:21:52 - INFO - codeparrot_training - Step 5739: {'lr': 0.00048477243442969313, 'samples': 2938880, 'steps': 5739, 'batch_loss/train': 0.8597177788615227} 12/22/2021 09:22:03 - INFO - codeparrot_training - Step 5740: {'lr': 0.0004847670166146638, 'samples': 2939392, 'steps': 5740, 'batch_loss/train': 0.7401916095986962} 12/22/2021 09:22:15 - INFO - codeparrot_training - Step 5741: {'lr': 0.00048476159786628936, 'samples': 2939904, 'steps': 5741, 'batch_loss/train': 0.6857287962920964} 12/22/2021 09:22:25 - INFO - codeparrot_training - Step 5742: {'lr': 0.0004847561781845915, 'samples': 2940416, 'steps': 5742, 'batch_loss/train': 0.9006999144330621} 12/22/2021 09:22:36 - INFO - codeparrot_training - Step 5743: {'lr': 0.0004847507575695917, 'samples': 2940928, 'steps': 5743, 'batch_loss/train': 0.832065581344068} 12/22/2021 09:22:47 - INFO - codeparrot_training - Step 5744: {'lr': 0.00048474533602131155, 'samples': 2941440, 'steps': 5744, 'batch_loss/train': 0.8263417799025774} 12/22/2021 09:22:59 - INFO - codeparrot_training - Step 5745: {'lr': 0.0004847399135397725, 'samples': 2941952, 'steps': 5745, 'batch_loss/train': 0.8461204562336206} 12/22/2021 09:23:09 - INFO - codeparrot_training - Step 5746: {'lr': 0.00048473449012499626, 'samples': 2942464, 'steps': 5746, 'batch_loss/train': 0.8284491747617722} 12/22/2021 09:23:20 - INFO - codeparrot_training - Step 5747: {'lr': 0.0004847290657770044, 'samples': 2942976, 'steps': 5747, 'batch_loss/train': 0.8076611296273768} 12/22/2021 09:23:34 - INFO - codeparrot_training - Step 5748: {'lr': 0.0004847236404958182, 'samples': 2943488, 'steps': 5748, 'batch_loss/train': 0.7651047296822071} 12/22/2021 09:23:44 - INFO - codeparrot_training - Step 5749: {'lr': 0.0004847182142814596, 'samples': 2944000, 'steps': 5749, 'batch_loss/train': 0.7770491372793913} 12/22/2021 09:23:55 - INFO - codeparrot_training - Step 5750: {'lr': 0.00048471278713394997, 'samples': 2944512, 
'steps': 5750, 'batch_loss/train': 2.058103452436626} 12/22/2021 09:24:07 - INFO - codeparrot_training - Step 5751: {'lr': 0.00048470735905331087, 'samples': 2945024, 'steps': 5751, 'batch_loss/train': 0.9872956993058324} 12/22/2021 09:24:18 - INFO - codeparrot_training - Step 5752: {'lr': 0.00048470193003956395, 'samples': 2945536, 'steps': 5752, 'batch_loss/train': 0.7920241877436638} 12/22/2021 09:24:28 - INFO - codeparrot_training - Step 5753: {'lr': 0.0004846965000927308, 'samples': 2946048, 'steps': 5753, 'batch_loss/train': 0.742504145950079} 12/22/2021 09:24:39 - INFO - codeparrot_training - Step 5754: {'lr': 0.000484691069212833, 'samples': 2946560, 'steps': 5754, 'batch_loss/train': 0.784639110788703} 12/22/2021 09:24:53 - INFO - codeparrot_training - Step 5755: {'lr': 0.00048468563739989214, 'samples': 2947072, 'steps': 5755, 'batch_loss/train': 0.8742161002010107} 12/22/2021 09:25:04 - INFO - codeparrot_training - Step 5756: {'lr': 0.0004846802046539297, 'samples': 2947584, 'steps': 5756, 'batch_loss/train': 0.8450655769556761} 12/22/2021 09:25:15 - INFO - codeparrot_training - Step 5757: {'lr': 0.0004846747709749675, 'samples': 2948096, 'steps': 5757, 'batch_loss/train': 0.8543236181139946} 12/22/2021 09:25:27 - INFO - codeparrot_training - Step 5758: {'lr': 0.0004846693363630269, 'samples': 2948608, 'steps': 5758, 'batch_loss/train': 0.8461536141112447} 12/22/2021 09:25:37 - INFO - codeparrot_training - Step 5759: {'lr': 0.0004846639008181297, 'samples': 2949120, 'steps': 5759, 'batch_loss/train': 0.6764679113402963} 12/22/2021 09:25:48 - INFO - codeparrot_training - Step 5760: {'lr': 0.0004846584643402975, 'samples': 2949632, 'steps': 5760, 'batch_loss/train': 0.8283503563143313} 12/22/2021 09:26:00 - INFO - codeparrot_training - Step 5761: {'lr': 0.0004846530269295517, 'samples': 2950144, 'steps': 5761, 'batch_loss/train': 0.907743931747973} 12/22/2021 09:26:11 - INFO - codeparrot_training - Step 5762: {'lr': 0.00048464758858591416, 'samples': 
2950656, 'steps': 5762, 'batch_loss/train': 0.8069822359830141} 12/22/2021 09:26:22 - INFO - codeparrot_training - Step 5763: {'lr': 0.00048464214930940644, 'samples': 2951168, 'steps': 5763, 'batch_loss/train': 0.8635082086548209} 12/22/2021 09:26:32 - INFO - codeparrot_training - Step 5764: {'lr': 0.00048463670910005004, 'samples': 2951680, 'steps': 5764, 'batch_loss/train': 0.8141178023070097} 12/22/2021 09:26:45 - INFO - codeparrot_training - Step 5765: {'lr': 0.00048463126795786674, 'samples': 2952192, 'steps': 5765, 'batch_loss/train': 0.8168561840429902} 12/22/2021 09:26:56 - INFO - codeparrot_training - Step 5766: {'lr': 0.00048462582588287807, 'samples': 2952704, 'steps': 5766, 'batch_loss/train': 0.8032137593254447} 12/22/2021 09:27:07 - INFO - codeparrot_training - Step 5767: {'lr': 0.0004846203828751057, 'samples': 2953216, 'steps': 5767, 'batch_loss/train': 0.8515425072982907} 12/22/2021 09:27:19 - INFO - codeparrot_training - Step 5768: {'lr': 0.00048461493893457133, 'samples': 2953728, 'steps': 5768, 'batch_loss/train': 0.8261868804693222} 12/22/2021 09:27:29 - INFO - codeparrot_training - Step 5769: {'lr': 0.0004846094940612965, 'samples': 2954240, 'steps': 5769, 'batch_loss/train': 0.8067722423002124} 12/22/2021 09:27:40 - INFO - codeparrot_training - Step 5770: {'lr': 0.00048460404825530294, 'samples': 2954752, 'steps': 5770, 'batch_loss/train': 0.895632790401578} 12/22/2021 09:27:52 - INFO - codeparrot_training - Step 5771: {'lr': 0.0004845986015166123, 'samples': 2955264, 'steps': 5771, 'batch_loss/train': 0.8511456800624728} 12/22/2021 09:28:03 - INFO - codeparrot_training - Step 5772: {'lr': 0.0004845931538452462, 'samples': 2955776, 'steps': 5772, 'batch_loss/train': 0.7734547811560333} 12/22/2021 09:28:14 - INFO - codeparrot_training - Step 5773: {'lr': 0.0004845877052412262, 'samples': 2956288, 'steps': 5773, 'batch_loss/train': 0.735999945551157} 12/22/2021 09:28:26 - INFO - codeparrot_training - Step 5774: {'lr': 0.0004845822557045741, 
'samples': 2956800, 'steps': 5774, 'batch_loss/train': 0.912411673925817} 12/22/2021 09:28:36 - INFO - codeparrot_training - Step 5775: {'lr': 0.00048457680523531155, 'samples': 2957312, 'steps': 5775, 'batch_loss/train': 0.8265768582932651} 12/22/2021 09:28:47 - INFO - codeparrot_training - Step 5776: {'lr': 0.0004845713538334602, 'samples': 2957824, 'steps': 5776, 'batch_loss/train': 0.7902720733545721} 12/22/2021 09:28:57 - INFO - codeparrot_training - Step 5777: {'lr': 0.00048456590149904166, 'samples': 2958336, 'steps': 5777, 'batch_loss/train': 0.8356708297505975} 12/22/2021 09:29:11 - INFO - codeparrot_training - Step 5778: {'lr': 0.0004845604482320778, 'samples': 2958848, 'steps': 5778, 'batch_loss/train': 0.8453921573236585} 12/22/2021 09:29:21 - INFO - codeparrot_training - Step 5779: {'lr': 0.00048455499403259007, 'samples': 2959360, 'steps': 5779, 'batch_loss/train': 0.8335894802585244} 12/22/2021 09:29:32 - INFO - codeparrot_training - Step 5780: {'lr': 0.0004845495389006002, 'samples': 2959872, 'steps': 5780, 'batch_loss/train': 0.8217864017933607} 12/22/2021 09:29:44 - INFO - codeparrot_training - Step 5781: {'lr': 0.00048454408283613, 'samples': 2960384, 'steps': 5781, 'batch_loss/train': 0.847214637324214} 12/22/2021 09:29:55 - INFO - codeparrot_training - Step 5782: {'lr': 0.00048453862583920106, 'samples': 2960896, 'steps': 5782, 'batch_loss/train': 0.8678084928542376} 12/22/2021 09:30:05 - INFO - codeparrot_training - Step 5783: {'lr': 0.00048453316790983517, 'samples': 2961408, 'steps': 5783, 'batch_loss/train': 0.9213716816157103} 12/22/2021 09:30:19 - INFO - codeparrot_training - Step 5784: {'lr': 0.0004845277090480539, 'samples': 2961920, 'steps': 5784, 'batch_loss/train': 0.7597740460187197} 12/22/2021 09:30:29 - INFO - codeparrot_training - Step 5785: {'lr': 0.000484522249253879, 'samples': 2962432, 'steps': 5785, 'batch_loss/train': 0.8663237881846726} 12/22/2021 09:30:40 - INFO - codeparrot_training - Step 5786: {'lr': 
0.0004845167885273322, 'samples': 2962944, 'steps': 5786, 'batch_loss/train': 0.9096400681883097} 12/22/2021 09:30:50 - INFO - codeparrot_training - Step 5787: {'lr': 0.0004845113268684351, 'samples': 2963456, 'steps': 5787, 'batch_loss/train': 0.7790908981114626} 12/22/2021 09:31:03 - INFO - codeparrot_training - Step 5788: {'lr': 0.0004845058642772097, 'samples': 2963968, 'steps': 5788, 'batch_loss/train': 0.8633778709918261} 12/22/2021 09:31:13 - INFO - codeparrot_training - Step 5789: {'lr': 0.0004845004007536774, 'samples': 2964480, 'steps': 5789, 'batch_loss/train': 0.8235282788518816} 12/22/2021 09:31:24 - INFO - codeparrot_training - Step 5790: {'lr': 0.0004844949362978601, 'samples': 2964992, 'steps': 5790, 'batch_loss/train': 0.5986658283509314} 12/22/2021 09:31:36 - INFO - codeparrot_training - Step 5791: {'lr': 0.00048448947090977944, 'samples': 2965504, 'steps': 5791, 'batch_loss/train': 0.8269715213682503} 12/22/2021 09:31:47 - INFO - codeparrot_training - Step 5792: {'lr': 0.0004844840045894573, 'samples': 2966016, 'steps': 5792, 'batch_loss/train': 0.9422792186960578} 12/22/2021 09:31:57 - INFO - codeparrot_training - Step 5793: {'lr': 0.00048447853733691514, 'samples': 2966528, 'steps': 5793, 'batch_loss/train': 0.736701971385628} 12/22/2021 09:32:10 - INFO - codeparrot_training - Step 5794: {'lr': 0.00048447306915217495, 'samples': 2967040, 'steps': 5794, 'batch_loss/train': 0.7794853420928121} 12/22/2021 09:32:21 - INFO - codeparrot_training - Step 5795: {'lr': 0.0004844676000352584, 'samples': 2967552, 'steps': 5795, 'batch_loss/train': 0.9199546193704009} 12/22/2021 09:32:31 - INFO - codeparrot_training - Step 5796: {'lr': 0.0004844621299861871, 'samples': 2968064, 'steps': 5796, 'batch_loss/train': 0.7864136239513755} 12/22/2021 09:32:42 - INFO - codeparrot_training - Step 5797: {'lr': 0.00048445665900498293, 'samples': 2968576, 'steps': 5797, 'batch_loss/train': 0.8148917173966765} 12/22/2021 09:32:54 - INFO - codeparrot_training - Step 5798: 
{'lr': 0.0004844511870916676, 'samples': 2969088, 'steps': 5798, 'batch_loss/train': 0.8773173801600933} 12/22/2021 09:33:05 - INFO - codeparrot_training - Step 5799: {'lr': 0.00048444571424626295, 'samples': 2969600, 'steps': 5799, 'batch_loss/train': 0.8532738485373557} 12/22/2021 09:33:15 - INFO - codeparrot_training - Step 5800: {'lr': 0.0004844402404687906, 'samples': 2970112, 'steps': 5800, 'batch_loss/train': 0.7716006608679891} 12/22/2021 09:33:29 - INFO - codeparrot_training - Step 5801: {'lr': 0.00048443476575927235, 'samples': 2970624, 'steps': 5801, 'batch_loss/train': 0.8113830601796508} 12/22/2021 09:33:39 - INFO - codeparrot_training - Step 5802: {'lr': 0.00048442929011773006, 'samples': 2971136, 'steps': 5802, 'batch_loss/train': 0.8714562226086855} 12/22/2021 09:33:50 - INFO - codeparrot_training - Step 5803: {'lr': 0.0004844238135441854, 'samples': 2971648, 'steps': 5803, 'batch_loss/train': 0.8062011187430471} 12/22/2021 09:34:02 - INFO - codeparrot_training - Step 5804: {'lr': 0.00048441833603866014, 'samples': 2972160, 'steps': 5804, 'batch_loss/train': 0.8767431192100048} 12/22/2021 09:34:13 - INFO - codeparrot_training - Step 5805: {'lr': 0.00048441285760117615, 'samples': 2972672, 'steps': 5805, 'batch_loss/train': 0.8417985294945538} 12/22/2021 09:34:23 - INFO - codeparrot_training - Step 5806: {'lr': 0.0004844073782317552, 'samples': 2973184, 'steps': 5806, 'batch_loss/train': 0.9514559404924512} 12/22/2021 09:34:35 - INFO - codeparrot_training - Step 5807: {'lr': 0.0004844018979304189, 'samples': 2973696, 'steps': 5807, 'batch_loss/train': 0.8098769118078053} 12/22/2021 09:34:46 - INFO - codeparrot_training - Step 5808: {'lr': 0.0004843964166971892, 'samples': 2974208, 'steps': 5808, 'batch_loss/train': 0.8870953880250454} 12/22/2021 09:34:57 - INFO - codeparrot_training - Step 5809: {'lr': 0.0004843909345320878, 'samples': 2974720, 'steps': 5809, 'batch_loss/train': 0.8315235492773354} 12/22/2021 09:35:07 - INFO - codeparrot_training - 
Step 5810: {'lr': 0.00048438545143513666, 'samples': 2975232, 'steps': 5810, 'batch_loss/train': 1.0708468342199922} 12/22/2021 09:35:20 - INFO - codeparrot_training - Step 5811: {'lr': 0.0004843799674063574, 'samples': 2975744, 'steps': 5811, 'batch_loss/train': 0.8131828126497567} 12/22/2021 09:35:30 - INFO - codeparrot_training - Step 5812: {'lr': 0.00048437448244577184, 'samples': 2976256, 'steps': 5812, 'batch_loss/train': 0.8251016112044454} 12/22/2021 09:35:41 - INFO - codeparrot_training - Step 5813: {'lr': 0.0004843689965534019, 'samples': 2976768, 'steps': 5813, 'batch_loss/train': 0.8231647964566946} 12/22/2021 09:35:54 - INFO - codeparrot_training - Step 5814: {'lr': 0.00048436350972926934, 'samples': 2977280, 'steps': 5814, 'batch_loss/train': 0.8141170172020793} 12/22/2021 09:36:05 - INFO - codeparrot_training - Step 5815: {'lr': 0.000484358021973396, 'samples': 2977792, 'steps': 5815, 'batch_loss/train': 0.9086918383836746} 12/22/2021 09:36:16 - INFO - codeparrot_training - Step 5816: {'lr': 0.00048435253328580355, 'samples': 2978304, 'steps': 5816, 'batch_loss/train': 0.8028171919286251} 12/22/2021 09:36:28 - INFO - codeparrot_training - Step 5817: {'lr': 0.0004843470436665139, 'samples': 2978816, 'steps': 5817, 'batch_loss/train': 0.7231872845441103} 12/22/2021 09:36:38 - INFO - codeparrot_training - Step 5818: {'lr': 0.00048434155311554897, 'samples': 2979328, 'steps': 5818, 'batch_loss/train': 0.8162254225462675} 12/22/2021 09:36:49 - INFO - codeparrot_training - Step 5819: {'lr': 0.0004843360616329305, 'samples': 2979840, 'steps': 5819, 'batch_loss/train': 0.7773890299722552} 12/22/2021 09:36:59 - INFO - codeparrot_training - Step 5820: {'lr': 0.0004843305692186804, 'samples': 2980352, 'steps': 5820, 'batch_loss/train': 0.8290798161178827} 12/22/2021 09:37:12 - INFO - codeparrot_training - Step 5821: {'lr': 0.00048432507587282026, 'samples': 2980864, 'steps': 5821, 'batch_loss/train': 0.8891586442478001} 12/22/2021 09:37:22 - INFO - 
codeparrot_training - Step 5822: {'lr': 0.00048431958159537216, 'samples': 2981376, 'steps': 5822, 'batch_loss/train': 0.9150996766984463} 12/22/2021 09:37:33 - INFO - codeparrot_training - Step 5823: {'lr': 0.00048431408638635793, 'samples': 2981888, 'steps': 5823, 'batch_loss/train': 0.7514825817197561} 12/22/2021 09:37:46 - INFO - codeparrot_training - Step 5824: {'lr': 0.0004843085902457993, 'samples': 2982400, 'steps': 5824, 'batch_loss/train': 0.8678661650046706} 12/22/2021 09:37:57 - INFO - codeparrot_training - Step 5825: {'lr': 0.00048430309317371825, 'samples': 2982912, 'steps': 5825, 'batch_loss/train': 0.8567056888714433} 12/22/2021 09:38:07 - INFO - codeparrot_training - Step 5826: {'lr': 0.0004842975951701366, 'samples': 2983424, 'steps': 5826, 'batch_loss/train': 0.8461593522224575} 12/22/2021 09:38:19 - INFO - codeparrot_training - Step 5827: {'lr': 0.0004842920962350761, 'samples': 2983936, 'steps': 5827, 'batch_loss/train': 0.7972874755505472} 12/22/2021 09:38:30 - INFO - codeparrot_training - Step 5828: {'lr': 0.0004842865963685587, 'samples': 2984448, 'steps': 5828, 'batch_loss/train': 0.8208236801438034} 12/22/2021 09:38:41 - INFO - codeparrot_training - Step 5829: {'lr': 0.0004842810955706063, 'samples': 2984960, 'steps': 5829, 'batch_loss/train': 0.8133365479297936} 12/22/2021 09:38:54 - INFO - codeparrot_training - Step 5830: {'lr': 0.00048427559384124066, 'samples': 2985472, 'steps': 5830, 'batch_loss/train': 0.9233568566851318} 12/22/2021 09:39:05 - INFO - codeparrot_training - Step 5831: {'lr': 0.0004842700911804838, 'samples': 2985984, 'steps': 5831, 'batch_loss/train': 0.7753650601953268} 12/22/2021 09:39:15 - INFO - codeparrot_training - Step 5832: {'lr': 0.00048426458758835746, 'samples': 2986496, 'steps': 5832, 'batch_loss/train': 0.8603991328272969} 12/22/2021 09:39:26 - INFO - codeparrot_training - Step 5833: {'lr': 0.0004842590830648836, 'samples': 2987008, 'steps': 5833, 'batch_loss/train': 0.6577828519511968} 12/22/2021 09:39:38 
- INFO - codeparrot_training - Step 5834: {'lr': 0.00048425357761008403, 'samples': 2987520, 'steps': 5834, 'batch_loss/train': 0.8098188969306648} 12/22/2021 09:39:49 - INFO - codeparrot_training - Step 5835: {'lr': 0.0004842480712239807, 'samples': 2988032, 'steps': 5835, 'batch_loss/train': 0.8631446734070778} 12/22/2021 09:39:59 - INFO - codeparrot_training - Step 5836: {'lr': 0.00048424256390659553, 'samples': 2988544, 'steps': 5836, 'batch_loss/train': 0.8023289376869798} 12/22/2021 09:40:11 - INFO - codeparrot_training - Step 5837: {'lr': 0.0004842370556579504, 'samples': 2989056, 'steps': 5837, 'batch_loss/train': 0.6902215828886256} 12/22/2021 09:40:22 - INFO - codeparrot_training - Step 5838: {'lr': 0.0004842315464780671, 'samples': 2989568, 'steps': 5838, 'batch_loss/train': 0.9504642831161618} 12/22/2021 09:40:32 - INFO - codeparrot_training - Step 5839: {'lr': 0.00048422603636696763, 'samples': 2990080, 'steps': 5839, 'batch_loss/train': 0.9598584026098251} 12/22/2021 09:40:46 - INFO - codeparrot_training - Step 5840: {'lr': 0.00048422052532467395, 'samples': 2990592, 'steps': 5840, 'batch_loss/train': 0.9839219432324171} 12/22/2021 09:40:57 - INFO - codeparrot_training - Step 5841: {'lr': 0.00048421501335120775, 'samples': 2991104, 'steps': 5841, 'batch_loss/train': 3.831133171916008} 12/22/2021 09:41:07 - INFO - codeparrot_training - Step 5842: {'lr': 0.0004842095004465912, 'samples': 2991616, 'steps': 5842, 'batch_loss/train': 0.8490051701664925} 12/22/2021 09:41:18 - INFO - codeparrot_training - Step 5843: {'lr': 0.00048420398661084613, 'samples': 2992128, 'steps': 5843, 'batch_loss/train': 0.8639615988358855} 12/22/2021 09:41:30 - INFO - codeparrot_training - Step 5844: {'lr': 0.00048419847184399446, 'samples': 2992640, 'steps': 5844, 'batch_loss/train': 0.8941837782040238} 12/22/2021 09:41:41 - INFO - codeparrot_training - Step 5845: {'lr': 0.00048419295614605804, 'samples': 2993152, 'steps': 5845, 'batch_loss/train': 0.9357319008558989} 
12/22/2021 09:41:52 - INFO - codeparrot_training - Step 5846: {'lr': 0.0004841874395170588, 'samples': 2993664, 'steps': 5846, 'batch_loss/train': 0.8511571874842048} 12/22/2021 09:42:04 - INFO - codeparrot_training - Step 5847: {'lr': 0.00048418192195701884, 'samples': 2994176, 'steps': 5847, 'batch_loss/train': 0.68590397387743} 12/22/2021 09:42:14 - INFO - codeparrot_training - Step 5848: {'lr': 0.0004841764034659599, 'samples': 2994688, 'steps': 5848, 'batch_loss/train': 0.6014314832864329} 12/22/2021 09:42:25 - INFO - codeparrot_training - Step 5849: {'lr': 0.000484170884043904, 'samples': 2995200, 'steps': 5849, 'batch_loss/train': 0.7494340289849788} 12/22/2021 09:42:37 - INFO - codeparrot_training - Step 5850: {'lr': 0.00048416536369087315, 'samples': 2995712, 'steps': 5850, 'batch_loss/train': 0.8237294759601355} 12/22/2021 09:42:48 - INFO - codeparrot_training - Step 5851: {'lr': 0.00048415984240688916, 'samples': 2996224, 'steps': 5851, 'batch_loss/train': 0.8461961438879371} 12/22/2021 09:42:58 - INFO - codeparrot_training - Step 5852: {'lr': 0.00048415432019197416, 'samples': 2996736, 'steps': 5852, 'batch_loss/train': 0.8338897544890642} 12/22/2021 09:43:09 - INFO - codeparrot_training - Step 5853: {'lr': 0.0004841487970461499, 'samples': 2997248, 'steps': 5853, 'batch_loss/train': 0.8589390581473708} 12/22/2021 09:43:22 - INFO - codeparrot_training - Step 5854: {'lr': 0.0004841432729694384, 'samples': 2997760, 'steps': 5854, 'batch_loss/train': 0.8307117680087686} 12/22/2021 09:43:33 - INFO - codeparrot_training - Step 5855: {'lr': 0.0004841377479618617, 'samples': 2998272, 'steps': 5855, 'batch_loss/train': 0.8404018171131611} 12/22/2021 09:43:43 - INFO - codeparrot_training - Step 5856: {'lr': 0.00048413222202344175, 'samples': 2998784, 'steps': 5856, 'batch_loss/train': 0.8609887701459229} 12/22/2021 09:43:56 - INFO - codeparrot_training - Step 5857: {'lr': 0.0004841266951542005, 'samples': 2999296, 'steps': 5857, 'batch_loss/train': 
0.7263403343968093} 12/22/2021 09:44:06 - INFO - codeparrot_training - Step 5858: {'lr': 0.0004841211673541599, 'samples': 2999808, 'steps': 5858, 'batch_loss/train': 0.7040828599128872} 12/22/2021 09:44:17 - INFO - codeparrot_training - Step 5859: {'lr': 0.00048411563862334195, 'samples': 3000320, 'steps': 5859, 'batch_loss/train': 0.6270570630440488} 12/22/2021 09:44:29 - INFO - codeparrot_training - Step 5860: {'lr': 0.00048411010896176853, 'samples': 3000832, 'steps': 5860, 'batch_loss/train': 0.6955433897674084} 12/22/2021 09:44:40 - INFO - codeparrot_training - Step 5861: {'lr': 0.0004841045783694618, 'samples': 3001344, 'steps': 5861, 'batch_loss/train': 0.808807848021388} 12/22/2021 09:44:50 - INFO - codeparrot_training - Step 5862: {'lr': 0.0004840990468464437, 'samples': 3001856, 'steps': 5862, 'batch_loss/train': 0.7061990009387955} 12/22/2021 09:45:05 - INFO - codeparrot_training - Step 5863: {'lr': 0.0004840935143927361, 'samples': 3002368, 'steps': 5863, 'batch_loss/train': 0.9085164414718747} 12/22/2021 09:45:15 - INFO - codeparrot_training - Step 5864: {'lr': 0.00048408798100836104, 'samples': 3002880, 'steps': 5864, 'batch_loss/train': 0.7655909861205146} 12/22/2021 09:45:26 - INFO - codeparrot_training - Step 5865: {'lr': 0.0004840824466933407, 'samples': 3003392, 'steps': 5865, 'batch_loss/train': 0.8518238186370581} 12/22/2021 09:45:36 - INFO - codeparrot_training - Step 5866: {'lr': 0.00048407691144769684, 'samples': 3003904, 'steps': 5866, 'batch_loss/train': 0.7934396299533546} 12/22/2021 09:45:49 - INFO - codeparrot_training - Step 5867: {'lr': 0.00048407137527145154, 'samples': 3004416, 'steps': 5867, 'batch_loss/train': 0.6358429545070976} 12/22/2021 09:45:59 - INFO - codeparrot_training - Step 5868: {'lr': 0.0004840658381646269, 'samples': 3004928, 'steps': 5868, 'batch_loss/train': 0.8703310526907444} 12/22/2021 09:46:10 - INFO - codeparrot_training - Step 5869: {'lr': 0.0004840603001272448, 'samples': 3005440, 'steps': 5869, 
'batch_loss/train': 0.7231445447541773} 12/22/2021 09:46:23 - INFO - codeparrot_training - Step 5870: {'lr': 0.00048405476115932734, 'samples': 3005952, 'steps': 5870, 'batch_loss/train': 0.771531994920224} 12/22/2021 09:46:34 - INFO - codeparrot_training - Step 5871: {'lr': 0.00048404922126089655, 'samples': 3006464, 'steps': 5871, 'batch_loss/train': 0.9143048841506243} 12/22/2021 09:46:44 - INFO - codeparrot_training - Step 5872: {'lr': 0.00048404368043197446, 'samples': 3006976, 'steps': 5872, 'batch_loss/train': 0.7528669135645032} 12/22/2021 09:46:56 - INFO - codeparrot_training - Step 5873: {'lr': 0.0004840381386725831, 'samples': 3007488, 'steps': 5873, 'batch_loss/train': 0.8075909288600087} 12/22/2021 09:47:07 - INFO - codeparrot_training - Step 5874: {'lr': 0.00048403259598274435, 'samples': 3008000, 'steps': 5874, 'batch_loss/train': 0.8734674528241158} 12/22/2021 09:47:17 - INFO - codeparrot_training - Step 5875: {'lr': 0.0004840270523624804, 'samples': 3008512, 'steps': 5875, 'batch_loss/train': 0.7785025069024414} 12/22/2021 09:47:28 - INFO - codeparrot_training - Step 5876: {'lr': 0.00048402150781181333, 'samples': 3009024, 'steps': 5876, 'batch_loss/train': 0.8032307270914316} 12/22/2021 09:47:40 - INFO - codeparrot_training - Step 5877: {'lr': 0.00048401596233076506, 'samples': 3009536, 'steps': 5877, 'batch_loss/train': 0.8054124116897583} 12/22/2021 09:47:51 - INFO - codeparrot_training - Step 5878: {'lr': 0.00048401041591935765, 'samples': 3010048, 'steps': 5878, 'batch_loss/train': 0.8507997374981642} 12/22/2021 09:48:01 - INFO - codeparrot_training - Step 5879: {'lr': 0.00048400486857761316, 'samples': 3010560, 'steps': 5879, 'batch_loss/train': 0.9045336726121604} 12/22/2021 09:48:16 - INFO - codeparrot_training - Step 5880: {'lr': 0.00048399932030555377, 'samples': 3011072, 'steps': 5880, 'batch_loss/train': 0.8760382598266006} 12/22/2021 09:48:27 - INFO - codeparrot_training - Step 5881: {'lr': 0.0004839937711032014, 'samples': 3011584, 
'steps': 5881, 'batch_loss/train': 0.8138262685388327}
12/22/2021 09:48:37 - INFO - codeparrot_training - Step 5882: {'lr': 0.00048398822097057813, 'samples': 3012096, 'steps': 5882, 'batch_loss/train': 0.8430622931336984}
12/22/2021 09:48:50 - INFO - codeparrot_training - Step 5883: {'lr': 0.000483982669907706, 'samples': 3012608, 'steps': 5883, 'batch_loss/train': 0.5771192706888542}
12/22/2021 09:49:00 - INFO - codeparrot_training - Step 5884: {'lr': 0.0004839771179146072, 'samples': 3013120, 'steps': 5884, 'batch_loss/train': 0.8773356098681688}
12/22/2021 09:49:11 - INFO - codeparrot_training - Step 5885: {'lr': 0.0004839715649913037, 'samples': 3013632, 'steps': 5885, 'batch_loss/train': 0.8950758641585708}
12/22/2021 09:49:22 - INFO - codeparrot_training - Step 5886: {'lr': 0.0004839660111378176, 'samples': 3014144, 'steps': 5886, 'batch_loss/train': 1.091680724406615}
12/22/2021 09:49:34 - INFO - codeparrot_training - Step 5887: {'lr': 0.000483960456354171, 'samples': 3014656, 'steps': 5887, 'batch_loss/train': 0.8492540861479938}
12/22/2021 09:49:44 - INFO - codeparrot_training - Step 5888: {'lr': 0.00048395490064038593, 'samples': 3015168, 'steps': 5888, 'batch_loss/train': 0.7882869164459407}
12/22/2021 09:49:55 - INFO - codeparrot_training - Step 5889: {'lr': 0.00048394934399648463, 'samples': 3015680, 'steps': 5889, 'batch_loss/train': 0.7614826115313917}
12/22/2021 09:50:07 - INFO - codeparrot_training - Step 5890: {'lr': 0.00048394378642248895, 'samples': 3016192, 'steps': 5890, 'batch_loss/train': 0.8381769126281142}
12/22/2021 09:50:18 - INFO - codeparrot_training - Step 5891: {'lr': 0.0004839382279184211, 'samples': 3016704, 'steps': 5891, 'batch_loss/train': 0.8298911135643721}
12/22/2021 09:50:29 - INFO - codeparrot_training - Step 5892: {'lr': 0.00048393266848430325, 'samples': 3017216, 'steps': 5892, 'batch_loss/train': 1.0016037970781326}
12/22/2021 09:50:42 - INFO - codeparrot_training - Step 5893: {'lr': 0.0004839271081201574, 'samples': 3017728, 'steps': 5893, 'batch_loss/train': 0.7112064799293876}
12/22/2021 09:50:53 - INFO - codeparrot_training - Step 5894: {'lr': 0.0004839215468260057, 'samples': 3018240, 'steps': 5894, 'batch_loss/train': 0.7616677894257009}
12/22/2021 09:51:03 - INFO - codeparrot_training - Step 5895: {'lr': 0.00048391598460187027, 'samples': 3018752, 'steps': 5895, 'batch_loss/train': 0.8827156694605947}
12/22/2021 09:51:14 - INFO - codeparrot_training - Step 5896: {'lr': 0.0004839104214477732, 'samples': 3019264, 'steps': 5896, 'batch_loss/train': 0.7687343675643206}
12/22/2021 09:51:26 - INFO - codeparrot_training - Step 5897: {'lr': 0.00048390485736373665, 'samples': 3019776, 'steps': 5897, 'batch_loss/train': 0.853515150025487}
12/22/2021 09:51:37 - INFO - codeparrot_training - Step 5898: {'lr': 0.00048389929234978266, 'samples': 3020288, 'steps': 5898, 'batch_loss/train': 0.6540393326431513}
12/22/2021 09:51:47 - INFO - codeparrot_training - Step 5899: {'lr': 0.00048389372640593345, 'samples': 3020800, 'steps': 5899, 'batch_loss/train': 0.882825237698853}
12/22/2021 09:52:01 - INFO - codeparrot_training - Step 5900: {'lr': 0.000483888159532211, 'samples': 3021312, 'steps': 5900, 'batch_loss/train': 0.7429541610181332}
12/22/2021 09:52:12 - INFO - codeparrot_training - Step 5901: {'lr': 0.00048388259172863765, 'samples': 3021824, 'steps': 5901, 'batch_loss/train': 0.8718411782756448}
12/22/2021 09:52:22 - INFO - codeparrot_training - Step 5902: {'lr': 0.0004838770229952354, 'samples': 3022336, 'steps': 5902, 'batch_loss/train': 0.7832218511030078}
12/22/2021 09:52:34 - INFO - codeparrot_training - Step 5903: {'lr': 0.0004838714533320263, 'samples': 3022848, 'steps': 5903, 'batch_loss/train': 0.9433184871450067}
12/22/2021 09:52:45 - INFO - codeparrot_training - Step 5904: {'lr': 0.00048386588273903274, 'samples': 3023360, 'steps': 5904, 'batch_loss/train': 0.881486451253295}
12/22/2021 09:52:56 - INFO - codeparrot_training - Step 5905: {'lr': 0.0004838603112162767, 'samples': 3023872, 'steps': 5905, 'batch_loss/train': 0.7785580870695412}
12/22/2021 09:53:06 - INFO - codeparrot_training - Step 5906: {'lr': 0.0004838547387637804, 'samples': 3024384, 'steps': 5906, 'batch_loss/train': 0.8854863308370113}
12/22/2021 09:53:19 - INFO - codeparrot_training - Step 5907: {'lr': 0.00048384916538156587, 'samples': 3024896, 'steps': 5907, 'batch_loss/train': 0.8531829677522182}
12/22/2021 09:53:29 - INFO - codeparrot_training - Step 5908: {'lr': 0.0004838435910696555, 'samples': 3025408, 'steps': 5908, 'batch_loss/train': 0.7389262232463807}
12/22/2021 09:53:40 - INFO - codeparrot_training - Step 5909: {'lr': 0.00048383801582807117, 'samples': 3025920, 'steps': 5909, 'batch_loss/train': 0.6211895048618317}
12/22/2021 09:53:54 - INFO - codeparrot_training - Step 5910: {'lr': 0.0004838324396568352, 'samples': 3026432, 'steps': 5910, 'batch_loss/train': 0.9211449190042913}
12/22/2021 09:54:05 - INFO - codeparrot_training - Step 5911: {'lr': 0.00048382686255596976, 'samples': 3026944, 'steps': 5911, 'batch_loss/train': 0.9205114021897316}
12/22/2021 09:54:15 - INFO - codeparrot_training - Step 5912: {'lr': 0.000483821284525497, 'samples': 3027456, 'steps': 5912, 'batch_loss/train': 0.7310941219329834}
12/22/2021 09:54:27 - INFO - codeparrot_training - Step 5913: {'lr': 0.0004838157055654392, 'samples': 3027968, 'steps': 5913, 'batch_loss/train': 0.8412494603544474}
12/22/2021 09:54:38 - INFO - codeparrot_training - Step 5914: {'lr': 0.00048381012567581827, 'samples': 3028480, 'steps': 5914, 'batch_loss/train': 0.8154981648549438}
12/22/2021 09:54:49 - INFO - codeparrot_training - Step 5915: {'lr': 0.0004838045448566567, 'samples': 3028992, 'steps': 5915, 'batch_loss/train': 0.7502658050507307}
12/22/2021 09:54:59 - INFO - codeparrot_training - Step 5916: {'lr': 0.0004837989631079764, 'samples': 3029504, 'steps': 5916, 'batch_loss/train': 0.8909061932936311}
12/22/2021 09:55:11 - INFO - codeparrot_training - Step 5917: {'lr': 0.00048379338042979985, 'samples': 3030016, 'steps': 5917, 'batch_loss/train': 0.8226528270170093}
12/22/2021 09:55:22 - INFO - codeparrot_training - Step 5918: {'lr': 0.000483787796822149, 'samples': 3030528, 'steps': 5918, 'batch_loss/train': 0.8504959079436958}
12/22/2021 09:55:33 - INFO - codeparrot_training - Step 5919: {'lr': 0.00048378221228504624, 'samples': 3031040, 'steps': 5919, 'batch_loss/train': 0.8756532929837704}
12/22/2021 09:55:45 - INFO - codeparrot_training - Step 5920: {'lr': 0.00048377662681851365, 'samples': 3031552, 'steps': 5920, 'batch_loss/train': 0.8155528614297509}
12/22/2021 09:55:55 - INFO - codeparrot_training - Step 5921: {'lr': 0.00048377104042257336, 'samples': 3032064, 'steps': 5921, 'batch_loss/train': 0.8168036947026849}
12/22/2021 09:56:06 - INFO - codeparrot_training - Step 5922: {'lr': 0.0004837654530972478, 'samples': 3032576, 'steps': 5922, 'batch_loss/train': 0.7628154968842864}
12/22/2021 09:56:19 - INFO - codeparrot_training - Step 5923: {'lr': 0.000483759864842559, 'samples': 3033088, 'steps': 5923, 'batch_loss/train': 0.7936536027118564}
12/22/2021 09:56:30 - INFO - codeparrot_training - Step 5924: {'lr': 0.00048375427565852924, 'samples': 3033600, 'steps': 5924, 'batch_loss/train': 0.8171494910493493}
12/22/2021 09:56:40 - INFO - codeparrot_training - Step 5925: {'lr': 0.00048374868554518083, 'samples': 3034112, 'steps': 5925, 'batch_loss/train': 0.8813756331801414}
12/22/2021 09:56:52 - INFO - codeparrot_training - Step 5926: {'lr': 0.0004837430945025358, 'samples': 3034624, 'steps': 5926, 'batch_loss/train': 0.7907022135332227}
12/22/2021 09:57:03 - INFO - codeparrot_training - Step 5927: {'lr': 0.00048373750253061655, 'samples': 3035136, 'steps': 5927, 'batch_loss/train': 0.8060454223304987}
12/22/2021 09:57:13 - INFO - codeparrot_training - Step 5928: {'lr': 0.00048373190962944517, 'samples': 3035648, 'steps': 5928, 'batch_loss/train': 0.9420889932662249}
12/22/2021 09:57:24 - INFO - codeparrot_training - Step 5929: {'lr': 0.000483726315799044, 'samples': 3036160, 'steps': 5929, 'batch_loss/train': 0.6842918854672462}
12/22/2021 09:57:39 - INFO - codeparrot_training - Step 5930: {'lr': 0.00048372072103943534, 'samples': 3036672, 'steps': 5930, 'batch_loss/train': 0.8285544002428651}
12/22/2021 09:57:49 - INFO - codeparrot_training - Step 5931: {'lr': 0.00048371512535064123, 'samples': 3037184, 'steps': 5931, 'batch_loss/train': 1.350781373679638}
12/22/2021 09:58:00 - INFO - codeparrot_training - Step 5932: {'lr': 0.00048370952873268407, 'samples': 3037696, 'steps': 5932, 'batch_loss/train': 0.822087038308382}
12/22/2021 09:58:12 - INFO - codeparrot_training - Step 5933: {'lr': 0.0004837039311855861, 'samples': 3038208, 'steps': 5933, 'batch_loss/train': 0.7288822869304568}
12/22/2021 09:58:23 - INFO - codeparrot_training - Step 5934: {'lr': 0.00048369833270936946, 'samples': 3038720, 'steps': 5934, 'batch_loss/train': 0.8101111073046923}
12/22/2021 09:58:33 - INFO - codeparrot_training - Step 5935: {'lr': 0.0004836927333040566, 'samples': 3039232, 'steps': 5935, 'batch_loss/train': 0.7872900003567338}
12/22/2021 09:58:44 - INFO - codeparrot_training - Step 5936: {'lr': 0.0004836871329696696, 'samples': 3039744, 'steps': 5936, 'batch_loss/train': 0.6723930449225008}
12/22/2021 09:58:56 - INFO - codeparrot_training - Step 5937: {'lr': 0.0004836815317062308, 'samples': 3040256, 'steps': 5937, 'batch_loss/train': 0.8234181962907314}
12/22/2021 09:59:07 - INFO - codeparrot_training - Step 5938: {'lr': 0.0004836759295137625, 'samples': 3040768, 'steps': 5938, 'batch_loss/train': 0.7110526002943516}
12/22/2021 09:59:17 - INFO - codeparrot_training - Step 5939: {'lr': 0.0004836703263922869, 'samples': 3041280, 'steps': 5939, 'batch_loss/train': 0.8396198702976108}
12/22/2021 09:59:30 - INFO - codeparrot_training - Step 5940: {'lr': 0.00048366472234182635, 'samples': 3041792, 'steps': 5940, 'batch_loss/train': 0.8294582888484001}
12/22/2021 09:59:41 - INFO - codeparrot_training - Step 5941: {'lr': 0.00048365911736240305, 'samples': 3042304, 'steps': 5941, 'batch_loss/train': 0.8984924592077732}
12/22/2021 09:59:52 - INFO - codeparrot_training - Step 5942: {'lr': 0.00048365351145403934, 'samples': 3042816, 'steps': 5942, 'batch_loss/train': 0.9247235115617514}
12/22/2021 10:00:04 - INFO - codeparrot_training - Step 5943: {'lr': 0.00048364790461675745, 'samples': 3043328, 'steps': 5943, 'batch_loss/train': 1.0266372421756387}
12/22/2021 10:00:14 - INFO - codeparrot_training - Step 5944: {'lr': 0.0004836422968505798, 'samples': 3043840, 'steps': 5944, 'batch_loss/train': 0.9457192495465279}
12/22/2021 10:00:25 - INFO - codeparrot_training - Step 5945: {'lr': 0.0004836366881555285, 'samples': 3044352, 'steps': 5945, 'batch_loss/train': 0.8229340557008982}
12/22/2021 10:00:37 - INFO - codeparrot_training - Step 5946: {'lr': 0.000483631078531626, 'samples': 3044864, 'steps': 5946, 'batch_loss/train': 0.8029974717646837}
12/22/2021 10:00:48 - INFO - codeparrot_training - Step 5947: {'lr': 0.00048362546797889463, 'samples': 3045376, 'steps': 5947, 'batch_loss/train': 0.891373130492866}
12/22/2021 10:00:58 - INFO - codeparrot_training - Step 5948: {'lr': 0.0004836198564973565, 'samples': 3045888, 'steps': 5948, 'batch_loss/train': 1.0341093987226486}
12/22/2021 10:01:09 - INFO - codeparrot_training - Step 5949: {'lr': 0.00048361424408703405, 'samples': 3046400, 'steps': 5949, 'batch_loss/train': 0.9183723945170641}
12/22/2021 10:01:23 - INFO - codeparrot_training - Step 5950: {'lr': 0.0004836086307479496, 'samples': 3046912, 'steps': 5950, 'batch_loss/train': 0.8186028152704239}
12/22/2021 10:01:33 - INFO - codeparrot_training - Step 5951: {'lr': 0.00048360301648012546, 'samples': 3047424, 'steps': 5951, 'batch_loss/train': 0.8199203498661518}
12/22/2021 10:01:44 - INFO - codeparrot_training - Step 5952: {'lr': 0.0004835974012835839, 'samples': 3047936, 'steps': 5952, 'batch_loss/train': 0.8987015360035002}
12/22/2021 10:01:56 - INFO - codeparrot_training - Step 5953: {'lr': 0.0004835917851583472, 'samples': 3048448, 'steps': 5953, 'batch_loss/train': 0.7406127359718084}
12/22/2021 10:02:06 - INFO - codeparrot_training - Step 5954: {'lr': 0.0004835861681044379, 'samples': 3048960, 'steps': 5954, 'batch_loss/train': 0.8859782572835684}
12/22/2021 10:02:17 - INFO - codeparrot_training - Step 5955: {'lr': 0.0004835805501218781, 'samples': 3049472, 'steps': 5955, 'batch_loss/train': 0.7909465008415282}
12/22/2021 10:02:29 - INFO - codeparrot_training - Step 5956: {'lr': 0.0004835749312106903, 'samples': 3049984, 'steps': 5956, 'batch_loss/train': 0.8599804739933461}
12/22/2021 10:02:40 - INFO - codeparrot_training - Step 5957: {'lr': 0.0004835693113708968, 'samples': 3050496, 'steps': 5957, 'batch_loss/train': 0.8603528128005564}
12/22/2021 10:02:50 - INFO - codeparrot_training - Step 5958: {'lr': 0.0004835636906025198, 'samples': 3051008, 'steps': 5958, 'batch_loss/train': 0.810064472258091}
12/22/2021 10:03:01 - INFO - codeparrot_training - Step 5959: {'lr': 0.00048355806890558174, 'samples': 3051520, 'steps': 5959, 'batch_loss/train': 0.9314755415543914}
12/22/2021 10:03:14 - INFO - codeparrot_training - Step 5960: {'lr': 0.0004835524462801051, 'samples': 3052032, 'steps': 5960, 'batch_loss/train': 0.8168728679884225}
12/22/2021 10:03:25 - INFO - codeparrot_training - Step 5961: {'lr': 0.00048354682272611207, 'samples': 3052544, 'steps': 5961, 'batch_loss/train': 0.8092447738163173}
12/22/2021 10:03:35 - INFO - codeparrot_training - Step 5962: {'lr': 0.000483541198243625, 'samples': 3053056, 'steps': 5962, 'batch_loss/train': 0.7883809776976705}
12/22/2021 10:03:47 - INFO - codeparrot_training - Step 5963: {'lr': 0.00048353557283266634, 'samples': 3053568, 'steps': 5963, 'batch_loss/train': 0.8591432794928551}
12/22/2021 10:03:58 - INFO - codeparrot_training - Step 5964: {'lr': 0.0004835299464932585, 'samples': 3054080, 'steps': 5964, 'batch_loss/train': 0.8734520780853927}
12/22/2021 10:04:08 - INFO - codeparrot_training - Step 5965: {'lr': 0.00048352431922542375, 'samples': 3054592, 'steps': 5965, 'batch_loss/train': 0.8179803118109703}
12/22/2021 10:04:21 - INFO - codeparrot_training - Step 5966: {'lr': 0.0004835186910291844, 'samples': 3055104, 'steps': 5966, 'batch_loss/train': 0.8361913422122598}
12/22/2021 10:04:31 - INFO - codeparrot_training - Step 5967: {'lr': 0.0004835130619045629, 'samples': 3055616, 'steps': 5967, 'batch_loss/train': 0.9816371649503708}
12/22/2021 10:04:42 - INFO - codeparrot_training - Step 5968: {'lr': 0.0004835074318515817, 'samples': 3056128, 'steps': 5968, 'batch_loss/train': 0.7054812214337289}
12/22/2021 10:04:56 - INFO - codeparrot_training - Step 5969: {'lr': 0.0004835018008702631, 'samples': 3056640, 'steps': 5969, 'batch_loss/train': 0.8319119149819016}
12/22/2021 10:05:06 - INFO - codeparrot_training - Step 5970: {'lr': 0.00048349616896062954, 'samples': 3057152, 'steps': 5970, 'batch_loss/train': 0.8358773646177724}
12/22/2021 10:05:17 - INFO - codeparrot_training - Step 5971: {'lr': 0.00048349053612270335, 'samples': 3057664, 'steps': 5971, 'batch_loss/train': 0.8492595572024584}
12/22/2021 10:05:28 - INFO - codeparrot_training - Step 5972: {'lr': 0.000483484902356507, 'samples': 3058176, 'steps': 5972, 'batch_loss/train': 0.8949209712445736}
12/22/2021 10:05:40 - INFO - codeparrot_training - Step 5973: {'lr': 0.00048347926766206274, 'samples': 3058688, 'steps': 5973, 'batch_loss/train': 0.8692952524870634}
12/22/2021 10:05:51 - INFO - codeparrot_training - Step 5974: {'lr': 0.0004834736320393931, 'samples': 3059200, 'steps': 5974, 'batch_loss/train': 0.8533979346975684}
12/22/2021 10:06:01 - INFO - codeparrot_training - Step 5975: {'lr': 0.0004834679954885205, 'samples': 3059712, 'steps': 5975, 'batch_loss/train': 0.7765102908015251}
12/22/2021 10:06:13 - INFO - codeparrot_training - Step 5976: {'lr': 0.0004834623580094673, 'samples': 3060224, 'steps': 5976, 'batch_loss/train': 0.8673041213769466}
12/22/2021 10:06:24 - INFO - codeparrot_training - Step 5977: {'lr': 0.0004834567196022559, 'samples': 3060736, 'steps': 5977, 'batch_loss/train': 0.695502161514014}
12/22/2021 10:06:34 - INFO - codeparrot_training - Step 5978: {'lr': 0.0004834510802669087, 'samples': 3061248, 'steps': 5978, 'batch_loss/train': 0.682962829247117}
12/22/2021 10:06:48 - INFO - codeparrot_training - Step 5979: {'lr': 0.0004834454400034482, 'samples': 3061760, 'steps': 5979, 'batch_loss/train': 0.763389864936471}
12/22/2021 10:06:59 - INFO - codeparrot_training - Step 5980: {'lr': 0.00048343979881189673, 'samples': 3062272, 'steps': 5980, 'batch_loss/train': 0.7892996752634645}
12/22/2021 10:07:10 - INFO - codeparrot_training - Step 5981: {'lr': 0.0004834341566922768, 'samples': 3062784, 'steps': 5981, 'batch_loss/train': 0.8325672983191907}
12/22/2021 10:07:20 - INFO - codeparrot_training - Step 5982: {'lr': 0.00048342851364461083, 'samples': 3063296, 'steps': 5982, 'batch_loss/train': 0.923472149297595}
12/22/2021 10:07:32 - INFO - codeparrot_training - Step 5983: {'lr': 0.0004834228696689212, 'samples': 3063808, 'steps': 5983, 'batch_loss/train': 0.854379981290549}
12/22/2021 10:07:43 - INFO - codeparrot_training - Step 5984: {'lr': 0.00048341722476523034, 'samples': 3064320, 'steps': 5984, 'batch_loss/train': 0.8148784511722624}
12/22/2021 10:07:54 - INFO - codeparrot_training - Step 5985: {'lr': 0.0004834115789335608, 'samples': 3064832, 'steps': 5985, 'batch_loss/train': 0.7594530750066042}
12/22/2021 10:08:08 - INFO - codeparrot_training - Step 5986: {'lr': 0.0004834059321739349, 'samples': 3065344, 'steps': 5986, 'batch_loss/train': 0.887644262984395}
12/22/2021 10:08:18 - INFO - codeparrot_training - Step 5987: {'lr': 0.0004834002844863752, 'samples': 3065856, 'steps': 5987, 'batch_loss/train': 0.8889184212312102}
12/22/2021 10:08:29 - INFO - codeparrot_training - Step 5988: {'lr': 0.00048339463587090405, 'samples': 3066368, 'steps': 5988, 'batch_loss/train': 0.8543855417519808}
12/22/2021 10:08:41 - INFO - codeparrot_training - Step 5989: {'lr': 0.0004833889863275439, 'samples': 3066880, 'steps': 5989, 'batch_loss/train': 0.7945295190438628}
12/22/2021 10:08:51 - INFO - codeparrot_training - Step 5990: {'lr': 0.00048338333585631736, 'samples': 3067392, 'steps': 5990, 'batch_loss/train': 0.8607330319937319}
12/22/2021 10:09:02 - INFO - codeparrot_training - Step 5991: {'lr': 0.00048337768445724674, 'samples': 3067904, 'steps': 5991, 'batch_loss/train': 0.7976834760047495}
12/22/2021 10:09:14 - INFO - codeparrot_training - Step 5992: {'lr': 0.0004833720321303546, 'samples': 3068416, 'steps': 5992, 'batch_loss/train': 0.825092388316989}
12/22/2021 10:09:25 - INFO - codeparrot_training - Step 5993: {'lr': 0.00048336637887566337, 'samples': 3068928, 'steps': 5993, 'batch_loss/train': 0.7295876163989305}
12/22/2021 10:09:36 - INFO - codeparrot_training - Step 5994: {'lr': 0.00048336072469319554, 'samples': 3069440, 'steps': 5994, 'batch_loss/train': 0.8239531256258488}
12/22/2021 10:09:46 - INFO - codeparrot_training - Step 5995: {'lr': 0.0004833550695829736, 'samples': 3069952, 'steps': 5995, 'batch_loss/train': 0.6442985790781677}
12/22/2021 10:09:58 - INFO - codeparrot_training - Step 5996: {'lr': 0.00048334941354501995, 'samples': 3070464, 'steps': 5996, 'batch_loss/train': 0.8044915068894625}
12/22/2021 10:10:09 - INFO - codeparrot_training - Step 5997: {'lr': 0.00048334375657935717, 'samples': 3070976, 'steps': 5997, 'batch_loss/train': 0.8315109722316265}
12/22/2021 10:10:20 - INFO - codeparrot_training - Step 5998: {'lr': 0.00048333809868600776, 'samples': 3071488, 'steps': 5998, 'batch_loss/train': 0.7408097791485488}
12/22/2021 10:10:33 - INFO - codeparrot_training - Step 5999: {'lr': 0.00048333243986499415, 'samples': 3072000, 'steps': 5999, 'batch_loss/train': 0.8584732888266444}
12/22/2021 10:10:44 - INFO - codeparrot_training - Step 6000: {'lr': 0.00048332678011633884, 'samples': 3072512, 'steps': 6000, 'batch_loss/train': 0.8094491111114621}
12/22/2021 10:10:55 - INFO - codeparrot_training - Step 6001: {'lr': 0.0004833211194400644, 'samples': 3073024, 'steps': 6001, 'batch_loss/train': 0.9047154700383544}
12/22/2021 10:11:07 - INFO - codeparrot_training - Step 6002: {'lr': 0.0004833154578361932, 'samples': 3073536, 'steps': 6002, 'batch_loss/train': 0.8293280685320497}
12/22/2021 10:11:17 - INFO - codeparrot_training - Step 6003: {'lr': 0.00048330979530474797, 'samples': 3074048, 'steps': 6003, 'batch_loss/train': 0.8818184928968549}
12/22/2021 10:11:28 - INFO - codeparrot_training - Step 6004: {'lr': 0.000483304131845751, 'samples': 3074560, 'steps': 6004, 'batch_loss/train': 0.9051821082830429}
12/22/2021 10:11:38 - INFO - codeparrot_training - Step 6005: {'lr': 0.00048329846745922487, 'samples': 3075072, 'steps': 6005, 'batch_loss/train': 0.7615556349046528}
12/22/2021 10:11:50 - INFO - codeparrot_training - Step 6006: {'lr': 0.00048329280214519225, 'samples': 3075584, 'steps': 6006, 'batch_loss/train': 0.7854309938848019}
12/22/2021 10:12:01 - INFO - codeparrot_training - Step 6007: {'lr': 0.0004832871359036754, 'samples': 3076096, 'steps': 6007, 'batch_loss/train': 0.8054884932935238}
12/22/2021 10:12:12 - INFO - codeparrot_training - Step 6008: {'lr': 0.000483281468734697, 'samples': 3076608, 'steps': 6008, 'batch_loss/train': 0.8579882502090186}
12/22/2021 10:12:26 - INFO - codeparrot_training - Step 6009: {'lr': 0.0004832758006382796, 'samples': 3077120, 'steps': 6009, 'batch_loss/train': 0.8544599683955312}
12/22/2021 10:12:36 - INFO - codeparrot_training - Step 6010: {'lr': 0.00048327013161444574, 'samples': 3077632, 'steps': 6010, 'batch_loss/train': 0.90238300524652}
12/22/2021 10:12:47 - INFO - codeparrot_training - Step 6011: {'lr': 0.00048326446166321787, 'samples': 3078144, 'steps': 6011, 'batch_loss/train': 0.6684546712785959}
12/22/2021 10:12:59 - INFO - codeparrot_training - Step 6012: {'lr': 0.00048325879078461853, 'samples': 3078656, 'steps': 6012, 'batch_loss/train': 0.8298867950215936}
12/22/2021 10:13:10 - INFO - codeparrot_training - Step 6013: {'lr': 0.0004832531189786704, 'samples': 3079168, 'steps': 6013, 'batch_loss/train': 0.7599729401990771}
12/22/2021 10:13:20 - INFO - codeparrot_training - Step 6014: {'lr': 0.00048324744624539585, 'samples': 3079680, 'steps': 6014, 'batch_loss/train': 0.8672980554401875}
12/22/2021 10:13:34 - INFO - codeparrot_training - Step 6015: {'lr': 0.0004832417725848176, 'samples': 3080192, 'steps': 6015, 'batch_loss/train': 0.9016238474287093}
12/22/2021 10:13:44 - INFO - codeparrot_training - Step 6016: {'lr': 0.0004832360979969581, 'samples': 3080704, 'steps': 6016, 'batch_loss/train': 0.9532428318634629}
12/22/2021 10:13:55 - INFO - codeparrot_training - Step 6017: {'lr': 0.00048323042248184, 'samples': 3081216, 'steps': 6017, 'batch_loss/train': 0.7756922319531441}
12/22/2021 10:14:06 - INFO - codeparrot_training - Step 6018: {'lr': 0.00048322474603948576, 'samples': 3081728, 'steps': 6018, 'batch_loss/train': 0.8543115872889757}
12/22/2021 10:14:18 - INFO - codeparrot_training - Step 6019: {'lr': 0.00048321906866991796, 'samples': 3082240, 'steps': 6019, 'batch_loss/train': 0.7713001957163215}
12/22/2021 10:14:28 - INFO - codeparrot_training - Step 6020: {'lr': 0.0004832133903731593, 'samples': 3082752, 'steps': 6020, 'batch_loss/train': 0.8614855883643031}
12/22/2021 10:14:39 - INFO - codeparrot_training - Step 6021: {'lr': 0.0004832077111492321, 'samples': 3083264, 'steps': 6021, 'batch_loss/train': 0.7899050954729319}
12/22/2021 10:14:51 - INFO - codeparrot_training - Step 6022: {'lr': 0.0004832020309981592, 'samples': 3083776, 'steps': 6022, 'batch_loss/train': 0.9106174111366272}
12/22/2021 10:15:02 - INFO - codeparrot_training - Step 6023: {'lr': 0.0004831963499199631, 'samples': 3084288, 'steps': 6023, 'batch_loss/train': 0.5537806492066011}
12/22/2021 10:15:12 - INFO - codeparrot_training - Step 6024: {'lr': 0.00048319066791466627, 'samples': 3084800, 'steps': 6024, 'batch_loss/train': 0.866264695301652}
12/22/2021 10:15:25 - INFO - codeparrot_training - Step 6025: {'lr': 0.0004831849849822915, 'samples': 3085312, 'steps': 6025, 'batch_loss/train': 0.8546557305380702}
12/22/2021 10:15:36 - INFO - codeparrot_training - Step 6026: {'lr': 0.00048317930112286116, 'samples': 3085824, 'steps': 6026, 'batch_loss/train': 0.7652636775746942}
12/22/2021 10:15:47 - INFO - codeparrot_training - Step 6027: {'lr': 0.00048317361633639803, 'samples': 3086336, 'steps': 6027, 'batch_loss/train': 0.9007238065823913}
12/22/2021 10:15:57 - INFO - codeparrot_training - Step 6028: {'lr': 0.00048316793062292465, 'samples': 3086848, 'steps': 6028, 'batch_loss/train': 0.8440606305375695}
12/22/2021 10:16:10 - INFO - codeparrot_training - Step 6029: {'lr': 0.00048316224398246366, 'samples': 3087360, 'steps': 6029, 'batch_loss/train': 0.8009464326314628}
12/22/2021 10:16:20 - INFO - codeparrot_training - Step 6030: {'lr': 0.0004831565564150375, 'samples': 3087872, 'steps': 6030, 'batch_loss/train': 0.7590063959360123}
12/22/2021 10:16:31 - INFO - codeparrot_training - Step 6031: {'lr': 0.00048315086792066904, 'samples': 3088384, 'steps': 6031, 'batch_loss/train': 0.9664768110960722}
12/22/2021 10:16:43 - INFO - codeparrot_training - Step 6032: {'lr': 0.00048314517849938073, 'samples': 3088896, 'steps': 6032, 'batch_loss/train': 1.012310624588281}
12/22/2021 10:16:54 - INFO - codeparrot_training - Step 6033: {'lr': 0.0004831394881511952, 'samples': 3089408, 'steps': 6033, 'batch_loss/train': 0.7594665559008718}
12/22/2021 10:17:04 - INFO - codeparrot_training - Step 6034: {'lr': 0.00048313379687613513, 'samples': 3089920, 'steps': 6034, 'batch_loss/train': 0.8304921863600612}
12/22/2021 10:17:16 - INFO - codeparrot_training - Step 6035: {'lr': 0.000483128104674223, 'samples': 3090432, 'steps': 6035, 'batch_loss/train': 0.8705488815903664}
12/22/2021 10:17:27 - INFO - codeparrot_training - Step 6036: {'lr': 0.00048312241154548167, 'samples': 3090944, 'steps': 6036, 'batch_loss/train': 0.9322573319077492}
12/22/2021 10:17:38 - INFO - codeparrot_training - Step 6037: {'lr': 0.00048311671748993364, 'samples': 3091456, 'steps': 6037, 'batch_loss/train': 0.8418340850621462}
12/22/2021 10:17:51 - INFO - codeparrot_training - Step 6038: {'lr': 0.0004831110225076015, 'samples': 3091968, 'steps': 6038, 'batch_loss/train': 0.7947523072361946}
12/22/2021 10:18:01 - INFO - codeparrot_training - Step 6039: {'lr': 0.000483105326598508, 'samples': 3092480, 'steps': 6039, 'batch_loss/train': 1.2716509997844696}
12/22/2021 10:18:12 - INFO - codeparrot_training - Step 6040: {'lr': 0.00048309962976267573, 'samples': 3092992, 'steps': 6040, 'batch_loss/train': 0.8648385060951114}
12/22/2021 10:18:23 - INFO - codeparrot_training - Step 6041: {'lr': 0.00048309393200012733, 'samples': 3093504, 'steps': 6041, 'batch_loss/train': 0.799593155272305}
12/22/2021 10:18:35 - INFO - codeparrot_training - Step 6042: {'lr': 0.00048308823331088554, 'samples': 3094016, 'steps': 6042, 'batch_loss/train': 0.7702000555582345}
12/22/2021 10:18:46 - INFO - codeparrot_training - Step 6043: {'lr': 0.0004830825336949729, 'samples': 3094528, 'steps': 6043, 'batch_loss/train': 0.7401592072565109}
12/22/2021 10:18:56 - INFO - codeparrot_training - Step 6044: {'lr': 0.00048307683315241206, 'samples': 3095040, 'steps': 6044, 'batch_loss/train': 0.5851552889216691}
12/22/2021 10:19:09 - INFO - codeparrot_training - Step 6045: {'lr': 0.0004830711316832258, 'samples': 3095552, 'steps': 6045, 'batch_loss/train': 0.8572166168596596}
12/22/2021 10:19:19 - INFO - codeparrot_training - Step 6046: {'lr': 0.0004830654292874367, 'samples': 3096064, 'steps': 6046, 'batch_loss/train': 0.7190213343128562}
12/22/2021 10:19:30 - INFO - codeparrot_training - Step 6047: {'lr': 0.00048305972596506745, 'samples': 3096576, 'steps': 6047, 'batch_loss/train': 0.8773108618333936}
12/22/2021 10:19:43 - INFO - codeparrot_training - Step 6048: {'lr': 0.0004830540217161408, 'samples': 3097088, 'steps': 6048, 'batch_loss/train': 0.9008601559326053}
12/22/2021 10:19:54 - INFO - codeparrot_training - Step 6049: {'lr': 0.0004830483165406793, 'samples': 3097600, 'steps': 6049, 'batch_loss/train': 0.9033520398661494}
12/22/2021 10:20:05 - INFO - codeparrot_training - Step 6050: {'lr': 0.0004830426104387057, 'samples': 3098112, 'steps': 6050, 'batch_loss/train': 0.8550374014303088}
12/22/2021 10:20:17 - INFO - codeparrot_training - Step 6051: {'lr': 0.00048303690341024267, 'samples': 3098624, 'steps': 6051, 'batch_loss/train': 0.8152256272733212}
12/22/2021 10:20:27 - INFO - codeparrot_training - Step 6052: {'lr': 0.0004830311954553129, 'samples': 3099136, 'steps': 6052, 'batch_loss/train': 0.867822952568531}
12/22/2021 10:20:38 - INFO - codeparrot_training - Step 6053: {'lr': 0.00048302548657393906, 'samples': 3099648, 'steps': 6053, 'batch_loss/train': 0.8661920083686709}
12/22/2021 10:20:49 - INFO - codeparrot_training - Step 6054: {'lr': 0.00048301977676614387, 'samples': 3100160, 'steps': 6054, 'batch_loss/train': 0.8413096978329122}
12/22/2021 10:21:03 - INFO - codeparrot_training - Step 6055: {'lr': 0.00048301406603195007, 'samples': 3100672, 'steps': 6055, 'batch_loss/train': 0.8569995509460568}
12/22/2021 10:21:13 - INFO - codeparrot_training - Step 6056: {'lr': 0.0004830083543713804, 'samples': 3101184, 'steps': 6056, 'batch_loss/train': 0.6997008174657822}
12/22/2021 10:21:24 - INFO - codeparrot_training - Step 6057: {'lr': 0.00048300264178445735, 'samples': 3101696, 'steps': 6057, 'batch_loss/train': 0.847994439303875}
12/22/2021 10:21:36 - INFO - codeparrot_training - Step 6058: {'lr': 0.00048299692827120385, 'samples': 3102208, 'steps': 6058, 'batch_loss/train': 0.7741380371153355}
12/22/2021 10:21:47 - INFO - codeparrot_training - Step 6059: {'lr': 0.0004829912138316425, 'samples': 3102720, 'steps': 6059, 'batch_loss/train': 0.8519069971516728}
12/22/2021 10:21:57 - INFO - codeparrot_training - Step 6060: {'lr': 0.0004829854984657961, 'samples': 3103232, 'steps': 6060, 'batch_loss/train': 0.8240454364567995}
12/22/2021 10:22:09 - INFO - codeparrot_training - Step 6061: {'lr': 0.00048297978217368734, 'samples': 3103744, 'steps': 6061, 'batch_loss/train': 0.8441201290115714}
12/22/2021 10:22:20 - INFO - codeparrot_training - Step 6062: {'lr': 0.0004829740649553389, 'samples': 3104256, 'steps': 6062, 'batch_loss/train': 0.8645347636193037}
12/22/2021 10:22:31 - INFO - codeparrot_training - Step 6063: {'lr': 0.00048296834681077353, 'samples': 3104768, 'steps': 6063, 'batch_loss/train': 0.8250171262770891}
12/22/2021 10:22:41 - INFO - codeparrot_training - Step 6064: {'lr': 0.00048296262774001397, 'samples': 3105280, 'steps': 6064, 'batch_loss/train': 0.8208346366882324}
12/22/2021 10:22:55 - INFO - codeparrot_training - Step 6065: {'lr': 0.000482956907743083, 'samples': 3105792, 'steps': 6065, 'batch_loss/train': 0.868305959738791}
12/22/2021 10:23:06 - INFO - codeparrot_training - Step 6066: {'lr': 0.0004829511868200034, 'samples': 3106304, 'steps': 6066, 'batch_loss/train': 0.7898139022290707}
12/22/2021 10:23:16 - INFO - codeparrot_training - Step 6067: {'lr': 0.00048294546497079774, 'samples': 3106816, 'steps': 6067, 'batch_loss/train': 0.776778556406498}
12/22/2021 10:23:29 - INFO - codeparrot_training - Step 6068: {'lr': 0.00048293974219548887, 'samples': 3107328, 'steps': 6068, 'batch_loss/train': 0.7410378465428948}
12/22/2021 10:23:39 - INFO - codeparrot_training - Step 6069: {'lr': 0.00048293401849409956, 'samples': 3107840, 'steps': 6069, 'batch_loss/train': 0.8661900823935866}
12/22/2021 10:23:50 - INFO - codeparrot_training - Step 6070: {'lr': 0.00048292829386665256, 'samples': 3108352, 'steps': 6070, 'batch_loss/train': 0.8814140520989895}
12/22/2021 10:24:02 - INFO - codeparrot_training - Step 6071: {'lr': 0.0004829225683131706, 'samples': 3108864, 'steps': 6071, 'batch_loss/train': 0.8004583036527038}
12/22/2021 10:24:13 - INFO - codeparrot_training - Step 6072: {'lr': 0.00048291684183367645, 'samples': 3109376, 'steps': 6072, 'batch_loss/train': 0.9274955838918686}
12/22/2021 10:24:23 - INFO - codeparrot_training - Step 6073: {'lr': 0.0004829111144281929, 'samples': 3109888, 'steps': 6073, 'batch_loss/train': 0.9793562777340412}
12/22/2021 10:24:35 - INFO - codeparrot_training - Step 6074: {'lr': 0.0004829053860967427, 'samples': 3110400, 'steps': 6074, 'batch_loss/train': 0.819565256126225}
12/22/2021 10:24:46 - INFO - codeparrot_training - Step 6075: {'lr': 0.00048289965683934855, 'samples': 3110912, 'steps': 6075, 'batch_loss/train': 0.7870323853567243}
12/22/2021 10:24:56 - INFO - codeparrot_training - Step 6076: {'lr': 0.0004828939266560334, 'samples': 3111424, 'steps': 6076, 'batch_loss/train': 0.792230986058712}
12/22/2021 10:25:07 - INFO - codeparrot_training - Step 6077: {'lr': 0.00048288819554681995, 'samples': 3111936, 'steps': 6077, 'batch_loss/train': 0.7648584796115756}
12/22/2021 10:25:20 - INFO - codeparrot_training - Step 6078: {'lr': 0.0004828824635117309, 'samples': 3112448, 'steps': 6078, 'batch_loss/train': 0.8354447549208999}
12/22/2021 10:25:31 - INFO - codeparrot_training - Step 6079: {'lr': 0.00048287673055078916, 'samples': 3112960, 'steps': 6079, 'batch_loss/train': 0.8216803031973541}
12/22/2021 10:25:41 - INFO - codeparrot_training - Step 6080: {'lr': 0.0004828709966640175, 'samples': 3113472, 'steps': 6080, 'batch_loss/train': 0.8607541560195386}
12/22/2021 10:25:54 - INFO - codeparrot_training - Step 6081: {'lr': 0.00048286526185143856, 'samples': 3113984, 'steps': 6081, 'batch_loss/train': 0.8315807012841105}
12/22/2021 10:26:05 - INFO - codeparrot_training - Step 6082: {'lr': 0.00048285952611307534, 'samples': 3114496, 'steps': 6082, 'batch_loss/train': 0.7403650041669607}
12/22/2021 10:26:16 - INFO - codeparrot_training - Step 6083: {'lr': 0.00048285378944895054, 'samples': 3115008, 'steps': 6083, 'batch_loss/train': 0.6826661266386509}
12/22/2021 10:26:29 - INFO - codeparrot_training - Step 6084: {'lr': 0.00048284805185908703, 'samples': 3115520, 'steps': 6084, 'batch_loss/train': 0.48206330835819244}
12/22/2021 10:26:39 - INFO - codeparrot_training - Step 6085: {'lr': 0.0004828423133435076, 'samples': 3116032, 'steps': 6085, 'batch_loss/train': 0.6418164690257981}
12/22/2021 10:26:50 - INFO - codeparrot_training - Step 6086: {'lr': 0.00048283657390223503, 'samples': 3116544, 'steps': 6086, 'batch_loss/train': 0.9058454670011997}
12/22/2021 10:27:00 - INFO - codeparrot_training - Step 6087: {'lr': 0.0004828308335352921, 'samples': 3117056, 'steps': 6087, 'batch_loss/train': 0.8804152929224074}
12/22/2021 10:27:13 - INFO - codeparrot_training - Step 6088: {'lr': 0.00048282509224270174, 'samples': 3117568, 'steps': 6088, 'batch_loss/train': 0.8334405818022788}
12/22/2021 10:27:24 - INFO - codeparrot_training - Step 6089: {'lr': 0.0004828193500244867, 'samples': 3118080, 'steps': 6089, 'batch_loss/train': 0.8696992620825768}
12/22/2021 10:27:34 - INFO - codeparrot_training - Step 6090: {'lr': 0.0004828136068806699, 'samples': 3118592, 'steps': 6090, 'batch_loss/train': 1.09666408598423}
12/22/2021 10:27:46 - INFO - codeparrot_training - Step 6091: {'lr': 0.0004828078628112741, 'samples': 3119104, 'steps': 6091, 'batch_loss/train': 0.843933230265975}
12/22/2021 10:27:57 - INFO - codeparrot_training - Step 6092: {'lr': 0.00048280211781632213, 'samples': 3119616, 'steps': 6092, 'batch_loss/train': 0.8668200339889154}
12/22/2021 10:28:08 - INFO - codeparrot_training - Step 6093: {'lr': 0.0004827963718958368, 'samples': 3120128, 'steps': 6093, 'batch_loss/train': 0.8014393826015294}
12/22/2021 10:28:18 - INFO - codeparrot_training - Step 6094: {'lr': 0.00048279062504984107, 'samples': 3120640, 'steps': 6094, 'batch_loss/train': 0.8605478946119547}
12/22/2021 10:28:32 - INFO - codeparrot_training - Step 6095: {'lr': 0.00048278487727835764, 'samples': 3121152, 'steps': 6095, 'batch_loss/train': 1.2093638544902205}
12/22/2021 10:28:43 - INFO - codeparrot_training - Step 6096: {'lr': 0.00048277912858140943, 'samples': 3121664, 'steps': 6096, 'batch_loss/train': 0.8166631190106273}
12/22/2021 10:28:53 - INFO - codeparrot_training - Step 6097: {'lr': 0.0004827733789590193, 'samples': 3122176, 'steps': 6097, 'batch_loss/train': 0.781952740624547}
12/22/2021 10:29:05 - INFO - codeparrot_training - Step 6098: {'lr': 0.00048276762841121016, 'samples': 3122688, 'steps': 6098, 'batch_loss/train': 0.845244855619967}
12/22/2021 10:29:16 - INFO - codeparrot_training - Step 6099: {'lr': 0.00048276187693800475, 'samples': 3123200, 'steps': 6099, 'batch_loss/train': 0.9255031272768974}
12/22/2021 10:29:27 - INFO - codeparrot_training - Step 6100: {'lr': 0.00048275612453942606, 'samples': 3123712, 'steps': 6100, 'batch_loss/train': 0.8861522311344743}
12/22/2021 10:29:39 - INFO - codeparrot_training - Step 6101: {'lr': 0.0004827503712154969, 'samples': 3124224, 'steps': 6101, 'batch_loss/train': 0.8556830957531929}
12/22/2021 10:29:49 - INFO - codeparrot_training - Step 6102: {'lr': 0.00048274461696624015, 'samples': 3124736, 'steps': 6102, 'batch_loss/train': 0.9109441787004471}
12/22/2021 10:30:00 - INFO - codeparrot_training - Step 6103: {'lr': 0.0004827388617916787, 'samples': 3125248, 'steps': 6103, 'batch_loss/train': 0.9380465499125421}
12/22/2021 10:30:12 - INFO - codeparrot_training - Step 6104: {'lr': 0.0004827331056918354, 'samples': 3125760, 'steps': 6104, 'batch_loss/train': 0.82921729888767}
12/22/2021 10:30:23 - INFO - codeparrot_training - Step 6105: {'lr': 0.0004827273486667331, 'samples': 3126272, 'steps': 6105, 'batch_loss/train': 0.8408355982974172}
12/22/2021 10:30:33 - INFO - codeparrot_training - Step 6106: {'lr': 0.0004827215907163948, 'samples': 3126784, 'steps': 6106, 'batch_loss/train': 0.8863145038485527}
12/22/2021 10:30:44 - INFO - codeparrot_training - Step 6107: {'lr': 0.0004827158318408432, 'samples': 3127296, 'steps': 6107, 'batch_loss/train': 0.86663752887398}
12/22/2021 10:30:58 - INFO - codeparrot_training - Step 6108: {'lr': 0.0004827100720401015, 'samples': 3127808, 'steps': 6108, 'batch_loss/train': 0.8241856023669243}
12/22/2021 10:31:08 - INFO - codeparrot_training - Step 6109: {'lr': 0.00048270431131419234, 'samples': 3128320, 'steps': 6109, 'batch_loss/train': 0.7723400918766856}
12/22/2021 10:31:19 - INFO - codeparrot_training - Step 6110: {'lr': 0.0004826985496631386, 'samples': 3128832, 'steps': 6110, 'batch_loss/train': 0.82594243157655}
12/22/2021 10:31:31 - INFO - codeparrot_training - Step 6111: {'lr': 0.00048269278708696354, 'samples': 3129344, 'steps': 6111, 'batch_loss/train': 0.863815194927156}
12/22/2021 10:31:42 - INFO - codeparrot_training - Step 6112: {'lr': 0.0004826870235856896, 'samples': 3129856, 'steps': 6112, 'batch_loss/train': 0.7829693965613842}
12/22/2021 10:31:52 - INFO - codeparrot_training - Step 6113: {'lr': 0.00048268125915934003, 'samples': 3130368, 'steps': 6113, 'batch_loss/train': 0.9192438069730997}
12/22/2021 10:32:05 - INFO - codeparrot_training - Step 6114: {'lr': 0.00048267549380793753, 'samples': 3130880, 'steps': 6114, 'batch_loss/train': 0.8007999863475561}
12/22/2021 10:32:16 - INFO - codeparrot_training - Step 6115: {'lr': 0.00048266972753150517, 'samples': 3131392, 'steps': 6115, 'batch_loss/train': 0.9347799448296428}
12/22/2021 10:32:27 - INFO - codeparrot_training - Step 6116: {'lr': 0.0004826639603300659, 'samples': 3131904, 'steps': 6116, 'batch_loss/train': 0.7788191528525203}
12/22/2021 10:32:37 - INFO - codeparrot_training - Step 6117: {'lr': 0.0004826581922036425, 'samples': 3132416, 'steps': 6117, 'batch_loss/train': 0.8814527411013842}
12/22/2021 10:32:50 - INFO - codeparrot_training - Step 6118: {'lr': 0.000482652423152258, 'samples': 3132928, 'steps': 6118, 'batch_loss/train': 0.8068274101242423}
12/22/2021 10:33:00 - INFO - codeparrot_training - Step 6119: {'lr': 0.00048264665317593525, 'samples': 3133440, 'steps': 6119, 'batch_loss/train': 0.6756918448954821}
12/22/2021 10:33:11 - INFO - codeparrot_training - Step 6120: {'lr': 0.0004826408822746973, 'samples': 3133952, 'steps': 6120, 'batch_loss/train': 0.9093992281705141}
12/22/2021 10:33:23 - INFO - codeparrot_training - Step 6121: {'lr': 0.00048263511044856704, 'samples': 3134464, 'steps': 6121, 'batch_loss/train': 0.9022008283063769}
12/22/2021 10:33:34 - INFO - codeparrot_training - Step 6122: {'lr': 0.00048262933769756744, 'samples': 3134976, 'steps': 6122, 'batch_loss/train': 0.8611963037401438}
12/22/2021 10:33:44 - INFO - codeparrot_training - Step 6123: {'lr': 0.0004826235640217214, 'samples': 3135488, 'steps': 6123, 'batch_loss/train': 1.2079878523945808}
12/22/2021 10:33:58 - INFO - codeparrot_training - Step 6124: {'lr': 0.00048261778942105193, 'samples': 3136000, 'steps': 6124, 'batch_loss/train': 0.8971126230899245}
12/22/2021 10:34:08 - INFO - codeparrot_training - Step 6125: {'lr': 0.00048261201389558194, 'samples': 3136512, 'steps': 6125, 'batch_loss/train': 0.6345100633334368}
12/22/2021 10:34:19 - INFO - codeparrot_training - Step 6126: {'lr': 0.00048260623744533445, 'samples': 3137024, 'steps': 6126, 'batch_loss/train': 0.8713331799954176}
12/22/2021 10:34:31 - INFO - codeparrot_training - Step 6127: {'lr': 0.0004826004600703324, 'samples': 3137536, 'steps': 6127, 'batch_loss/train': 0.7658971734344959}
12/22/2021 10:34:42 - INFO - codeparrot_training - Step 6128: {'lr': 0.00048259468177059867, 'samples': 3138048, 'steps': 6128, 'batch_loss/train': 0.8645640444010496}
12/22/2021 10:34:52 - INFO - codeparrot_training - Step 6129: {'lr': 0.0004825889025461564, 'samples': 3138560, 'steps': 6129, 'batch_loss/train': 1.0712089342996478}
12/22/2021 10:35:03 - INFO - codeparrot_training - Step 6130: {'lr': 0.0004825831223970284, 'samples': 3139072, 'steps': 6130, 'batch_loss/train': 0.812475211918354}
12/22/2021 10:35:17 - INFO - codeparrot_training - Step 6131: {'lr': 0.0004825773413232377, 'samples': 3139584, 'steps': 6131, 'batch_loss/train': 0.8142317654564977}
12/22/2021 10:35:27 - INFO - codeparrot_training - Step 6132: {'lr': 0.0004825715593248074, 'samples': 3140096, 'steps': 6132, 'batch_loss/train': 0.8313628803007305} 12/22/2021 10:35:38 - INFO - codeparrot_training - Step 6133: {'lr': 0.00048256577640176033, 'samples': 3140608, 'steps': 6133, 'batch_loss/train': 0.7156811682507396} 12/22/2021 10:35:50 - INFO - codeparrot_training - Step 6134: {'lr': 0.0004825599925541196, 'samples': 3141120, 'steps': 6134, 'batch_loss/train': 0.7873505186289549} 12/22/2021 10:36:01 - INFO - codeparrot_training - Step 6135: {'lr': 0.0004825542077819081, 'samples': 3141632, 'steps': 6135, 'batch_loss/train': 0.8297998365014791} 12/22/2021 10:36:11 - INFO - codeparrot_training - Step 6136: {'lr': 0.0004825484220851489, 'samples': 3142144, 'steps': 6136, 'batch_loss/train': 0.9041793337091804} 12/22/2021 10:36:22 - INFO - codeparrot_training - Step 6137: {'lr': 0.000482542635463865, 'samples': 3142656, 'steps': 6137, 'batch_loss/train': 0.681054437533021} 12/22/2021 10:36:34 - INFO - codeparrot_training - Step 6138: {'lr': 0.0004825368479180794, 'samples': 3143168, 'steps': 6138, 'batch_loss/train': 0.797631923109293} 12/22/2021 10:36:45 - INFO - codeparrot_training - Step 6139: {'lr': 0.00048253105944781507, 'samples': 3143680, 'steps': 6139, 'batch_loss/train': 0.8304380951449275} 12/22/2021 10:36:55 - INFO - codeparrot_training - Step 6140: {'lr': 0.000482525270053095, 'samples': 3144192, 'steps': 6140, 'batch_loss/train': 0.721484289271757} 12/22/2021 10:37:08 - INFO - codeparrot_training - Step 6141: {'lr': 0.0004825194797339423, 'samples': 3144704, 'steps': 6141, 'batch_loss/train': 0.8252095070201904} 12/22/2021 10:37:18 - INFO - codeparrot_training - Step 6142: {'lr': 0.00048251368849037995, 'samples': 3145216, 'steps': 6142, 'batch_loss/train': 0.7849601516500115} 12/22/2021 10:37:29 - INFO - codeparrot_training - Step 6143: {'lr': 0.00048250789632243096, 'samples': 3145728, 'steps': 6143, 'batch_loss/train': 
0.829605589620769} 12/22/2021 10:37:42 - INFO - codeparrot_training - Step 6144: {'lr': 0.00048250210323011835, 'samples': 3146240, 'steps': 6144, 'batch_loss/train': 0.8049623928964138} 12/22/2021 10:37:53 - INFO - codeparrot_training - Step 6145: {'lr': 0.0004824963092134652, 'samples': 3146752, 'steps': 6145, 'batch_loss/train': 0.8284043530002236} 12/22/2021 10:38:03 - INFO - codeparrot_training - Step 6146: {'lr': 0.00048249051427249446, 'samples': 3147264, 'steps': 6146, 'batch_loss/train': 0.8568012397736311} 12/22/2021 10:38:17 - INFO - codeparrot_training - Step 6147: {'lr': 0.0004824847184072293, 'samples': 3147776, 'steps': 6147, 'batch_loss/train': 0.926252530887723} 12/22/2021 10:38:28 - INFO - codeparrot_training - Step 6148: {'lr': 0.00048247892161769264, 'samples': 3148288, 'steps': 6148, 'batch_loss/train': 0.9896812802180648} 12/22/2021 10:38:39 - INFO - codeparrot_training - Step 6149: {'lr': 0.0004824731239039075, 'samples': 3148800, 'steps': 6149, 'batch_loss/train': 0.8639711779542267} 12/22/2021 10:38:49 - INFO - codeparrot_training - Step 6150: {'lr': 0.0004824673252658971, 'samples': 3149312, 'steps': 6150, 'batch_loss/train': 0.7937702429480851} 12/22/2021 10:39:01 - INFO - codeparrot_training - Step 6151: {'lr': 0.0004824615257036844, 'samples': 3149824, 'steps': 6151, 'batch_loss/train': 0.7937206160277128} 12/22/2021 10:39:12 - INFO - codeparrot_training - Step 6152: {'lr': 0.0004824557252172924, 'samples': 3150336, 'steps': 6152, 'batch_loss/train': 0.8252721629105508} 12/22/2021 10:39:23 - INFO - codeparrot_training - Step 6153: {'lr': 0.0004824499238067442, 'samples': 3150848, 'steps': 6153, 'batch_loss/train': 0.8453938104212284} 12/22/2021 10:39:35 - INFO - codeparrot_training - Step 6154: {'lr': 0.00048244412147206283, 'samples': 3151360, 'steps': 6154, 'batch_loss/train': 0.8462059719022363} 12/22/2021 10:39:45 - INFO - codeparrot_training - Step 6155: {'lr': 0.0004824383182132715, 'samples': 3151872, 'steps': 6155, 
'batch_loss/train': 0.764123153174296} 12/22/2021 10:39:56 - INFO - codeparrot_training - Step 6156: {'lr': 0.0004824325140303931, 'samples': 3152384, 'steps': 6156, 'batch_loss/train': 0.8195342924445868} 12/22/2021 10:40:10 - INFO - codeparrot_training - Step 6157: {'lr': 0.00048242670892345086, 'samples': 3152896, 'steps': 6157, 'batch_loss/train': 0.8387013773899525} 12/22/2021 10:40:20 - INFO - codeparrot_training - Step 6158: {'lr': 0.0004824209028924678, 'samples': 3153408, 'steps': 6158, 'batch_loss/train': 0.8926189560443163} 12/22/2021 10:40:31 - INFO - codeparrot_training - Step 6159: {'lr': 0.0004824150959374669, 'samples': 3153920, 'steps': 6159, 'batch_loss/train': 0.9011288098990917} 12/22/2021 10:40:42 - INFO - codeparrot_training - Step 6160: {'lr': 0.00048240928805847133, 'samples': 3154432, 'steps': 6160, 'batch_loss/train': 0.908566078171134} 12/22/2021 10:40:54 - INFO - codeparrot_training - Step 6161: {'lr': 0.0004824034792555042, 'samples': 3154944, 'steps': 6161, 'batch_loss/train': 0.7327114557847381} 12/22/2021 10:41:04 - INFO - codeparrot_training - Step 6162: {'lr': 0.0004823976695285887, 'samples': 3155456, 'steps': 6162, 'batch_loss/train': 0.7840760466642678} 12/22/2021 10:41:15 - INFO - codeparrot_training - Step 6163: {'lr': 0.0004823918588777477, 'samples': 3155968, 'steps': 6163, 'batch_loss/train': 0.777124305954203} 12/22/2021 10:41:29 - INFO - codeparrot_training - Step 6164: {'lr': 0.00048238604730300446, 'samples': 3156480, 'steps': 6164, 'batch_loss/train': 0.916400333866477} 12/22/2021 10:41:40 - INFO - codeparrot_training - Step 6165: {'lr': 0.00048238023480438196, 'samples': 3156992, 'steps': 6165, 'batch_loss/train': 0.8951224153861403} 12/22/2021 10:41:50 - INFO - codeparrot_training - Step 6166: {'lr': 0.00048237442138190353, 'samples': 3157504, 'steps': 6166, 'batch_loss/train': 0.8766753100790083} 12/22/2021 10:42:02 - INFO - codeparrot_training - Step 6167: {'lr': 0.00048236860703559204, 'samples': 3158016, 'steps': 
6167, 'batch_loss/train': 0.8582933666184545} 12/22/2021 10:42:13 - INFO - codeparrot_training - Step 6168: {'lr': 0.00048236279176547075, 'samples': 3158528, 'steps': 6168, 'batch_loss/train': 0.8370394725352526} 12/22/2021 10:42:24 - INFO - codeparrot_training - Step 6169: {'lr': 0.00048235697557156274, 'samples': 3159040, 'steps': 6169, 'batch_loss/train': 1.0079500935971737} 12/22/2021 10:42:36 - INFO - codeparrot_training - Step 6170: {'lr': 0.0004823511584538911, 'samples': 3159552, 'steps': 6170, 'batch_loss/train': 0.7329636681824923} 12/22/2021 10:42:47 - INFO - codeparrot_training - Step 6171: {'lr': 0.00048234534041247904, 'samples': 3160064, 'steps': 6171, 'batch_loss/train': 0.9025739654898643} 12/22/2021 10:42:57 - INFO - codeparrot_training - Step 6172: {'lr': 0.0004823395214473496, 'samples': 3160576, 'steps': 6172, 'batch_loss/train': 0.7210675240494311} 12/22/2021 10:43:08 - INFO - codeparrot_training - Step 6173: {'lr': 0.00048233370155852595, 'samples': 3161088, 'steps': 6173, 'batch_loss/train': 1.4897926719859242} 12/22/2021 10:43:22 - INFO - codeparrot_training - Step 6174: {'lr': 0.0004823278807460312, 'samples': 3161600, 'steps': 6174, 'batch_loss/train': 0.8137546773068607} 12/22/2021 10:43:32 - INFO - codeparrot_training - Step 6175: {'lr': 0.0004823220590098886, 'samples': 3162112, 'steps': 6175, 'batch_loss/train': 0.8662508558481932} 12/22/2021 10:43:43 - INFO - codeparrot_training - Step 6176: {'lr': 0.0004823162363501211, 'samples': 3162624, 'steps': 6176, 'batch_loss/train': 1.0347999725490808} 12/22/2021 10:43:55 - INFO - codeparrot_training - Step 6177: {'lr': 0.00048231041276675205, 'samples': 3163136, 'steps': 6177, 'batch_loss/train': 0.695096772396937} 12/22/2021 10:44:06 - INFO - codeparrot_training - Step 6178: {'lr': 0.0004823045882598045, 'samples': 3163648, 'steps': 6178, 'batch_loss/train': 0.808139962144196} 12/22/2021 10:44:16 - INFO - codeparrot_training - Step 6179: {'lr': 0.0004822987628293016, 'samples': 3164160, 
'steps': 6179, 'batch_loss/train': 0.9238378200680017} 12/22/2021 10:44:30 - INFO - codeparrot_training - Step 6180: {'lr': 0.0004822929364752665, 'samples': 3164672, 'steps': 6180, 'batch_loss/train': 0.7017159684328362} 12/22/2021 10:44:41 - INFO - codeparrot_training - Step 6181: {'lr': 0.00048228710919772245, 'samples': 3165184, 'steps': 6181, 'batch_loss/train': 0.7658808752894402} 12/22/2021 10:44:52 - INFO - codeparrot_training - Step 6182: {'lr': 0.00048228128099669253, 'samples': 3165696, 'steps': 6182, 'batch_loss/train': 0.9078575121238828} 12/22/2021 10:45:02 - INFO - codeparrot_training - Step 6183: {'lr': 0.00048227545187220003, 'samples': 3166208, 'steps': 6183, 'batch_loss/train': 0.8238260596990585} 12/22/2021 10:45:14 - INFO - codeparrot_training - Step 6184: {'lr': 0.0004822696218242679, 'samples': 3166720, 'steps': 6184, 'batch_loss/train': 0.8794694822281599} 12/22/2021 10:45:25 - INFO - codeparrot_training - Step 6185: {'lr': 0.0004822637908529196, 'samples': 3167232, 'steps': 6185, 'batch_loss/train': 0.9209959656000137} 12/22/2021 10:45:35 - INFO - codeparrot_training - Step 6186: {'lr': 0.0004822579589581781, 'samples': 3167744, 'steps': 6186, 'batch_loss/train': 0.8536698343232274} 12/22/2021 10:45:48 - INFO - codeparrot_training - Step 6187: {'lr': 0.0004822521261400666, 'samples': 3168256, 'steps': 6187, 'batch_loss/train': 0.7772055622190237} 12/22/2021 10:45:58 - INFO - codeparrot_training - Step 6188: {'lr': 0.0004822462923986084, 'samples': 3168768, 'steps': 6188, 'batch_loss/train': 0.8250536862760782} 12/22/2021 10:46:09 - INFO - codeparrot_training - Step 6189: {'lr': 0.00048224045773382664, 'samples': 3169280, 'steps': 6189, 'batch_loss/train': 0.8737171199172735} 12/22/2021 10:46:21 - INFO - codeparrot_training - Step 6190: {'lr': 0.00048223462214574454, 'samples': 3169792, 'steps': 6190, 'batch_loss/train': 0.6697337997611612} 12/22/2021 10:46:32 - INFO - codeparrot_training - Step 6191: {'lr': 0.0004822287856343852, 'samples': 
3170304, 'steps': 6191, 'batch_loss/train': 0.7944043790921569} 12/22/2021 10:46:42 - INFO - codeparrot_training - Step 6192: {'lr': 0.00048222294819977193, 'samples': 3170816, 'steps': 6192, 'batch_loss/train': 0.8519969824701548} 12/22/2021 10:46:56 - INFO - codeparrot_training - Step 6193: {'lr': 0.0004822171098419279, 'samples': 3171328, 'steps': 6193, 'batch_loss/train': 0.7893543168902397} 12/22/2021 10:47:07 - INFO - codeparrot_training - Step 6194: {'lr': 0.00048221127056087624, 'samples': 3171840, 'steps': 6194, 'batch_loss/train': 0.9574283566325903} 12/22/2021 10:47:17 - INFO - codeparrot_training - Step 6195: {'lr': 0.00048220543035664033, 'samples': 3172352, 'steps': 6195, 'batch_loss/train': 0.9220648761838675} 12/22/2021 10:47:28 - INFO - codeparrot_training - Step 6196: {'lr': 0.00048219958922924336, 'samples': 3172864, 'steps': 6196, 'batch_loss/train': 0.862685726955533} 12/22/2021 10:47:40 - INFO - codeparrot_training - Step 6197: {'lr': 0.0004821937471787084, 'samples': 3173376, 'steps': 6197, 'batch_loss/train': 0.8235587044619024} 12/22/2021 10:47:51 - INFO - codeparrot_training - Step 6198: {'lr': 0.00048218790420505883, 'samples': 3173888, 'steps': 6198, 'batch_loss/train': 0.8211920047178864} 12/22/2021 10:48:01 - INFO - codeparrot_training - Step 6199: {'lr': 0.0004821820603083178, 'samples': 3174400, 'steps': 6199, 'batch_loss/train': 0.8924268907867372} 12/22/2021 10:48:13 - INFO - codeparrot_training - Step 6200: {'lr': 0.00048217621548850856, 'samples': 3174912, 'steps': 6200, 'batch_loss/train': 0.798533650347963} 12/22/2021 10:48:24 - INFO - codeparrot_training - Step 6201: {'lr': 0.00048217036974565437, 'samples': 3175424, 'steps': 6201, 'batch_loss/train': 0.7801126018166542} 12/22/2021 10:48:34 - INFO - codeparrot_training - Step 6202: {'lr': 0.0004821645230797784, 'samples': 3175936, 'steps': 6202, 'batch_loss/train': 0.7502914387732744} 12/22/2021 10:48:48 - INFO - codeparrot_training - Step 6203: {'lr': 0.00048215867549090404, 
'samples': 3176448, 'steps': 6203, 'batch_loss/train': 0.8958838889375329} 12/22/2021 10:48:59 - INFO - codeparrot_training - Step 6204: {'lr': 0.00048215282697905437, 'samples': 3176960, 'steps': 6204, 'batch_loss/train': 0.8702020198106766} 12/22/2021 10:49:09 - INFO - codeparrot_training - Step 6205: {'lr': 0.00048214697754425277, 'samples': 3177472, 'steps': 6205, 'batch_loss/train': 0.7831249451264739} 12/22/2021 10:49:20 - INFO - codeparrot_training - Step 6206: {'lr': 0.0004821411271865225, 'samples': 3177984, 'steps': 6206, 'batch_loss/train': 0.8271563528105617} 12/22/2021 10:49:32 - INFO - codeparrot_training - Step 6207: {'lr': 0.00048213527590588664, 'samples': 3178496, 'steps': 6207, 'batch_loss/train': 0.82398437615484} 12/22/2021 10:49:43 - INFO - codeparrot_training - Step 6208: {'lr': 0.0004821294237023686, 'samples': 3179008, 'steps': 6208, 'batch_loss/train': 0.754022684879601} 12/22/2021 10:49:54 - INFO - codeparrot_training - Step 6209: {'lr': 0.00048212357057599166, 'samples': 3179520, 'steps': 6209, 'batch_loss/train': 0.8553044400177896} 12/22/2021 10:50:08 - INFO - codeparrot_training - Step 6210: {'lr': 0.00048211771652677905, 'samples': 3180032, 'steps': 6210, 'batch_loss/train': 0.7455617259256542} 12/22/2021 10:50:18 - INFO - codeparrot_training - Step 6211: {'lr': 0.00048211186155475394, 'samples': 3180544, 'steps': 6211, 'batch_loss/train': 0.7630011013243347} 12/22/2021 10:50:29 - INFO - codeparrot_training - Step 6212: {'lr': 0.0004821060056599398, 'samples': 3181056, 'steps': 6212, 'batch_loss/train': 0.8476255112327635} 12/22/2021 10:50:41 - INFO - codeparrot_training - Step 6213: {'lr': 0.0004821001488423599, 'samples': 3181568, 'steps': 6213, 'batch_loss/train': 0.7884888611733913} 12/22/2021 10:50:52 - INFO - codeparrot_training - Step 6214: {'lr': 0.0004820942911020373, 'samples': 3182080, 'steps': 6214, 'batch_loss/train': 0.8410338396206498} 12/22/2021 10:51:02 - INFO - codeparrot_training - Step 6215: {'lr': 
0.00048208843243899546, 'samples': 3182592, 'steps': 6215, 'batch_loss/train': 0.8788142083212733} 12/22/2021 10:51:13 - INFO - codeparrot_training - Step 6216: {'lr': 0.0004820825728532577, 'samples': 3183104, 'steps': 6216, 'batch_loss/train': 0.6178542757406831} 12/22/2021 10:51:25 - INFO - codeparrot_training - Step 6217: {'lr': 0.00048207671234484724, 'samples': 3183616, 'steps': 6217, 'batch_loss/train': 0.8077007364481688} 12/22/2021 10:51:36 - INFO - codeparrot_training - Step 6218: {'lr': 0.0004820708509137874, 'samples': 3184128, 'steps': 6218, 'batch_loss/train': 0.8579256036318839} 12/22/2021 10:51:46 - INFO - codeparrot_training - Step 6219: {'lr': 0.0004820649885601015, 'samples': 3184640, 'steps': 6219, 'batch_loss/train': 0.7268237147945911} 12/22/2021 10:51:59 - INFO - codeparrot_training - Step 6220: {'lr': 0.0004820591252838128, 'samples': 3185152, 'steps': 6220, 'batch_loss/train': 0.8000594805926085} 12/22/2021 10:52:09 - INFO - codeparrot_training - Step 6221: {'lr': 0.00048205326108494465, 'samples': 3185664, 'steps': 6221, 'batch_loss/train': 0.7399692726321518} 12/22/2021 10:52:20 - INFO - codeparrot_training - Step 6222: {'lr': 0.00048204739596352037, 'samples': 3186176, 'steps': 6222, 'batch_loss/train': 0.9044714083429426} 12/22/2021 10:52:32 - INFO - codeparrot_training - Step 6223: {'lr': 0.00048204152991956326, 'samples': 3186688, 'steps': 6223, 'batch_loss/train': 0.7707517584785819} 12/22/2021 10:52:42 - INFO - codeparrot_training - Step 6224: {'lr': 0.0004820356629530966, 'samples': 3187200, 'steps': 6224, 'batch_loss/train': 0.8799088634550571} 12/22/2021 10:52:53 - INFO - codeparrot_training - Step 6225: {'lr': 0.0004820297950641438, 'samples': 3187712, 'steps': 6225, 'batch_loss/train': 0.8514682576060295} 12/22/2021 10:53:07 - INFO - codeparrot_training - Step 6226: {'lr': 0.00048202392625272815, 'samples': 3188224, 'steps': 6226, 'batch_loss/train': 0.8901290036737919} 12/22/2021 10:53:18 - INFO - codeparrot_training - Step 
6227: {'lr': 0.000482018056518873, 'samples': 3188736, 'steps': 6227, 'batch_loss/train': 1.0200080787763} 12/22/2021 10:53:28 - INFO - codeparrot_training - Step 6228: {'lr': 0.0004820121858626016, 'samples': 3189248, 'steps': 6228, 'batch_loss/train': 0.7801902769133449} 12/22/2021 10:53:39 - INFO - codeparrot_training - Step 6229: {'lr': 0.00048200631428393735, 'samples': 3189760, 'steps': 6229, 'batch_loss/train': 0.806672066450119} 12/22/2021 10:53:51 - INFO - codeparrot_training - Step 6230: {'lr': 0.00048200044178290367, 'samples': 3190272, 'steps': 6230, 'batch_loss/train': 0.8267153450287879} 12/22/2021 10:54:01 - INFO - codeparrot_training - Step 6231: {'lr': 0.00048199456835952374, 'samples': 3190784, 'steps': 6231, 'batch_loss/train': 0.8254925524815917} 12/22/2021 10:54:12 - INFO - codeparrot_training - Step 6232: {'lr': 0.00048198869401382106, 'samples': 3191296, 'steps': 6232, 'batch_loss/train': 0.8378573218360543} 12/22/2021 10:54:26 - INFO - codeparrot_training - Step 6233: {'lr': 0.00048198281874581897, 'samples': 3191808, 'steps': 6233, 'batch_loss/train': 0.8187043741345406} 12/22/2021 10:54:36 - INFO - codeparrot_training - Step 6234: {'lr': 0.0004819769425555407, 'samples': 3192320, 'steps': 6234, 'batch_loss/train': 0.8689020173624158} 12/22/2021 10:54:47 - INFO - codeparrot_training - Step 6235: {'lr': 0.0004819710654430097, 'samples': 3192832, 'steps': 6235, 'batch_loss/train': 0.8082722118124366} 12/22/2021 10:54:59 - INFO - codeparrot_training - Step 6236: {'lr': 0.0004819651874082494, 'samples': 3193344, 'steps': 6236, 'batch_loss/train': 0.7812382252886891} 12/22/2021 10:55:10 - INFO - codeparrot_training - Step 6237: {'lr': 0.0004819593084512831, 'samples': 3193856, 'steps': 6237, 'batch_loss/train': 0.7162841283716261} 12/22/2021 10:55:21 - INFO - codeparrot_training - Step 6238: {'lr': 0.0004819534285721341, 'samples': 3194368, 'steps': 6238, 'batch_loss/train': 0.8458056058734655} 12/22/2021 10:55:31 - INFO - codeparrot_training - 
Step 6239: {'lr': 0.00048194754777082597, 'samples': 3194880, 'steps': 6239, 'batch_loss/train': 0.9926552148535848} 12/22/2021 10:55:43 - INFO - codeparrot_training - Step 6240: {'lr': 0.00048194166604738185, 'samples': 3195392, 'steps': 6240, 'batch_loss/train': 0.916782648069784} 12/22/2021 10:55:54 - INFO - codeparrot_training - Step 6241: {'lr': 0.0004819357834018253, 'samples': 3195904, 'steps': 6241, 'batch_loss/train': 0.8261843698564917} 12/22/2021 10:56:05 - INFO - codeparrot_training - Step 6242: {'lr': 0.00048192989983417963, 'samples': 3196416, 'steps': 6242, 'batch_loss/train': 0.8385788220912218} 12/22/2021 10:56:19 - INFO - codeparrot_training - Step 6243: {'lr': 0.00048192401534446827, 'samples': 3196928, 'steps': 6243, 'batch_loss/train': 0.8284058473072946} 12/22/2021 10:56:29 - INFO - codeparrot_training - Step 6244: {'lr': 0.00048191812993271456, 'samples': 3197440, 'steps': 6244, 'batch_loss/train': 0.8261693781241775} 12/22/2021 10:56:40 - INFO - codeparrot_training - Step 6245: {'lr': 0.00048191224359894194, 'samples': 3197952, 'steps': 6245, 'batch_loss/train': 0.8863339172676206} 12/22/2021 10:56:52 - INFO - codeparrot_training - Step 6246: {'lr': 0.0004819063563431738, 'samples': 3198464, 'steps': 6246, 'batch_loss/train': 0.7776326145976782} 12/22/2021 10:57:03 - INFO - codeparrot_training - Step 6247: {'lr': 0.00048190046816543367, 'samples': 3198976, 'steps': 6247, 'batch_loss/train': 0.8694247305393219} 12/22/2021 10:57:14 - INFO - codeparrot_training - Step 6248: {'lr': 0.00048189457906574475, 'samples': 3199488, 'steps': 6248, 'batch_loss/train': 0.4843153493711725} 12/22/2021 10:57:28 - INFO - codeparrot_training - Step 6249: {'lr': 0.00048188868904413054, 'samples': 3200000, 'steps': 6249, 'batch_loss/train': 0.8070887690410018} 12/22/2021 10:57:38 - INFO - codeparrot_training - Step 6250: {'lr': 0.0004818827981006144, 'samples': 3200512, 'steps': 6250, 'batch_loss/train': 0.9668920468539} 12/22/2021 10:57:49 - INFO - 
codeparrot_training - Step 6251: {'lr': 0.0004818769062352199, 'samples': 3201024, 'steps': 6251, 'batch_loss/train': 2.324610139708966} 12/22/2021 10:57:59 - INFO - codeparrot_training - Step 6252: {'lr': 0.00048187101344797037, 'samples': 3201536, 'steps': 6252, 'batch_loss/train': 0.8256525909528136} 12/22/2021 10:58:11 - INFO - codeparrot_training - Step 6253: {'lr': 0.0004818651197388892, 'samples': 3202048, 'steps': 6253, 'batch_loss/train': 0.9016305156983435} 12/22/2021 10:58:22 - INFO - codeparrot_training - Step 6254: {'lr': 0.00048185922510799985, 'samples': 3202560, 'steps': 6254, 'batch_loss/train': 0.8192547345533967} 12/22/2021 10:58:33 - INFO - codeparrot_training - Step 6255: {'lr': 0.0004818533295553258, 'samples': 3203072, 'steps': 6255, 'batch_loss/train': 0.6969763850793242} 12/22/2021 10:58:45 - INFO - codeparrot_training - Step 6256: {'lr': 0.00048184743308089047, 'samples': 3203584, 'steps': 6256, 'batch_loss/train': 0.8755266619846225} 12/22/2021 10:58:55 - INFO - codeparrot_training - Step 6257: {'lr': 0.0004818415356847173, 'samples': 3204096, 'steps': 6257, 'batch_loss/train': 0.9974287506192923} 12/22/2021 10:59:06 - INFO - codeparrot_training - Step 6258: {'lr': 0.0004818356373668297, 'samples': 3204608, 'steps': 6258, 'batch_loss/train': 0.799183395691216} 12/22/2021 10:59:17 - INFO - codeparrot_training - Step 6259: {'lr': 0.00048182973812725113, 'samples': 3205120, 'steps': 6259, 'batch_loss/train': 1.0236657978966832} 12/22/2021 10:59:29 - INFO - codeparrot_training - Step 6260: {'lr': 0.00048182383796600505, 'samples': 3205632, 'steps': 6260, 'batch_loss/train': 0.8315721731632948} 12/22/2021 10:59:39 - INFO - codeparrot_training - Step 6261: {'lr': 0.0004818179368831149, 'samples': 3206144, 'steps': 6261, 'batch_loss/train': 0.8023822866380215} 12/22/2021 10:59:50 - INFO - codeparrot_training - Step 6262: {'lr': 0.00048181203487860426, 'samples': 3206656, 'steps': 6262, 'batch_loss/train': 0.8170980038121343} 12/22/2021 11:00:04 
- INFO - codeparrot_training - Step 6263: {'lr': 0.00048180613195249644, 'samples': 3207168, 'steps': 6263, 'batch_loss/train': 0.85078015178442} 12/22/2021 11:00:14 - INFO - codeparrot_training - Step 6264: {'lr': 0.00048180022810481493, 'samples': 3207680, 'steps': 6264, 'batch_loss/train': 0.7356258537620306} 12/22/2021 11:00:25 - INFO - codeparrot_training - Step 6265: {'lr': 0.0004817943233355833, 'samples': 3208192, 'steps': 6265, 'batch_loss/train': 0.8783565713092685} 12/22/2021 11:00:37 - INFO - codeparrot_training - Step 6266: {'lr': 0.00048178841764482495, 'samples': 3208704, 'steps': 6266, 'batch_loss/train': 0.7720146425999701} 12/22/2021 11:00:48 - INFO - codeparrot_training - Step 6267: {'lr': 0.0004817825110325633, 'samples': 3209216, 'steps': 6267, 'batch_loss/train': 0.9111083466559649} 12/22/2021 11:00:58 - INFO - codeparrot_training - Step 6268: {'lr': 0.00048177660349882205, 'samples': 3209728, 'steps': 6268, 'batch_loss/train': 0.8402035944163799} 12/22/2021 11:01:10 - INFO - codeparrot_training - Step 6269: {'lr': 0.0004817706950436244, 'samples': 3210240, 'steps': 6269, 'batch_loss/train': 0.8830206552520394} 12/22/2021 11:01:21 - INFO - codeparrot_training - Step 6270: {'lr': 0.00048176478566699406, 'samples': 3210752, 'steps': 6270, 'batch_loss/train': 0.8081178888678551} 12/22/2021 11:01:32 - INFO - codeparrot_training - Step 6271: {'lr': 0.0004817588753689544, 'samples': 3211264, 'steps': 6271, 'batch_loss/train': 0.7023015934973955} 12/22/2021 11:01:42 - INFO - codeparrot_training - Step 6272: {'lr': 0.00048175296414952894, 'samples': 3211776, 'steps': 6272, 'batch_loss/train': 0.7758098756894469} 12/22/2021 11:01:57 - INFO - codeparrot_training - Step 6273: {'lr': 0.0004817470520087413, 'samples': 3212288, 'steps': 6273, 'batch_loss/train': 0.7842437773942947} 12/22/2021 11:02:08 - INFO - codeparrot_training - Step 6274: {'lr': 0.0004817411389466148, 'samples': 3212800, 'steps': 6274, 'batch_loss/train': 0.826840084977448} 12/22/2021 
11:02:18 - INFO - codeparrot_training - Step 6275: {'lr': 0.00048173522496317304, 'samples': 3213312, 'steps': 6275, 'batch_loss/train': 0.869357218965888} 12/22/2021 11:02:30 - INFO - codeparrot_training - Step 6276: {'lr': 0.00048172931005843956, 'samples': 3213824, 'steps': 6276, 'batch_loss/train': 0.8520398875698447} 12/22/2021 11:02:41 - INFO - codeparrot_training - Step 6277: {'lr': 0.00048172339423243786, 'samples': 3214336, 'steps': 6277, 'batch_loss/train': 0.7585135591216385} 12/22/2021 11:02:52 - INFO - codeparrot_training - Step 6278: {'lr': 0.00048171747748519136, 'samples': 3214848, 'steps': 6278, 'batch_loss/train': 0.8294745953753591} 12/22/2021 11:03:06 - INFO - codeparrot_training - Step 6279: {'lr': 0.00048171155981672375, 'samples': 3215360, 'steps': 6279, 'batch_loss/train': 0.9128307066857815} 12/22/2021 11:03:17 - INFO - codeparrot_training - Step 6280: {'lr': 0.0004817056412270584, 'samples': 3215872, 'steps': 6280, 'batch_loss/train': 0.8601364688947797} 12/22/2021 11:03:27 - INFO - codeparrot_training - Step 6281: {'lr': 0.00048169972171621894, 'samples': 3216384, 'steps': 6281, 'batch_loss/train': 0.9978404967114329} 12/22/2021 11:03:38 - INFO - codeparrot_training - Step 6282: {'lr': 0.0004816938012842289, 'samples': 3216896, 'steps': 6282, 'batch_loss/train': 0.877877720631659} 12/22/2021 11:03:50 - INFO - codeparrot_training - Step 6283: {'lr': 0.00048168787993111176, 'samples': 3217408, 'steps': 6283, 'batch_loss/train': 0.8568945648148656} 12/22/2021 11:04:01 - INFO - codeparrot_training - Step 6284: {'lr': 0.00048168195765689104, 'samples': 3217920, 'steps': 6284, 'batch_loss/train': 0.8414593166671693} 12/22/2021 11:04:11 - INFO - codeparrot_training - Step 6285: {'lr': 0.0004816760344615904, 'samples': 3218432, 'steps': 6285, 'batch_loss/train': 1.6383819011971354} 12/22/2021 11:04:24 - INFO - codeparrot_training - Step 6286: {'lr': 0.00048167011034523335, 'samples': 3218944, 'steps': 6286, 'batch_loss/train': 0.8118842984549701} 
12/22/2021 11:04:35 - INFO - codeparrot_training - Step 6287: {'lr': 0.0004816641853078433, 'samples': 3219456, 'steps': 6287, 'batch_loss/train': 1.1980202561244369} 12/22/2021 11:04:45 - INFO - codeparrot_training - Step 6288: {'lr': 0.0004816582593494441, 'samples': 3219968, 'steps': 6288, 'batch_loss/train': 0.7652892032638192} 12/22/2021 11:04:59 - INFO - codeparrot_training - Step 6289: {'lr': 0.000481652332470059, 'samples': 3220480, 'steps': 6289, 'batch_loss/train': 0.8366132332012057} 12/22/2021 11:05:10 - INFO - codeparrot_training - Step 6290: {'lr': 0.00048164640466971166, 'samples': 3220992, 'steps': 6290, 'batch_loss/train': 0.880854731425643} 12/22/2021 11:05:20 - INFO - codeparrot_training - Step 6291: {'lr': 0.0004816404759484258, 'samples': 3221504, 'steps': 6291, 'batch_loss/train': 0.7060826970264316} 12/22/2021 11:05:31 - INFO - codeparrot_training - Step 6292: {'lr': 0.0004816345463062248, 'samples': 3222016, 'steps': 6292, 'batch_loss/train': 0.8123225492890924} 12/22/2021 11:05:43 - INFO - codeparrot_training - Step 6293: {'lr': 0.0004816286157431323, 'samples': 3222528, 'steps': 6293, 'batch_loss/train': 0.8068345892243087} 12/22/2021 11:05:54 - INFO - codeparrot_training - Step 6294: {'lr': 0.00048162268425917197, 'samples': 3223040, 'steps': 6294, 'batch_loss/train': 0.9130487088114023} 12/22/2021 11:06:04 - INFO - codeparrot_training - Step 6295: {'lr': 0.0004816167518543672, 'samples': 3223552, 'steps': 6295, 'batch_loss/train': 0.7621258818544447} 12/22/2021 11:06:16 - INFO - codeparrot_training - Step 6296: {'lr': 0.00048161081852874174, 'samples': 3224064, 'steps': 6296, 'batch_loss/train': 1.0520569700747728} 12/22/2021 11:06:27 - INFO - codeparrot_training - Step 6297: {'lr': 0.0004816048842823191, 'samples': 3224576, 'steps': 6297, 'batch_loss/train': 0.7888739798218012} 12/22/2021 11:06:37 - INFO - codeparrot_training - Step 6298: {'lr': 0.0004815989491151229, 'samples': 3225088, 'steps': 6298, 'batch_loss/train': 
0.8394889561459422} 12/22/2021 11:06:51 - INFO - codeparrot_training - Step 6299: {'lr': 0.00048159301302717666, 'samples': 3225600, 'steps': 6299, 'batch_loss/train': 0.7947842474095523} 12/22/2021 11:07:02 - INFO - codeparrot_training - Step 6300: {'lr': 0.00048158707601850416, 'samples': 3226112, 'steps': 6300, 'batch_loss/train': 0.8509305231273174} 12/22/2021 11:07:12 - INFO - codeparrot_training - Step 6301: {'lr': 0.00048158113808912886, 'samples': 3226624, 'steps': 6301, 'batch_loss/train': 0.9077911712229252} 12/22/2021 11:07:24 - INFO - codeparrot_training - Step 6302: {'lr': 0.00048157519923907433, 'samples': 3227136, 'steps': 6302, 'batch_loss/train': 0.8661785218864679} 12/22/2021 11:07:35 - INFO - codeparrot_training - Step 6303: {'lr': 0.00048156925946836427, 'samples': 3227648, 'steps': 6303, 'batch_loss/train': 0.7971649994142354} 12/22/2021 11:07:46 - INFO - codeparrot_training - Step 6304: {'lr': 0.0004815633187770223, 'samples': 3228160, 'steps': 6304, 'batch_loss/train': 0.8905881717801094} 12/22/2021 11:07:56 - INFO - codeparrot_training - Step 6305: {'lr': 0.000481557377165072, 'samples': 3228672, 'steps': 6305, 'batch_loss/train': 0.8251508101820946} 12/22/2021 11:08:08 - INFO - codeparrot_training - Step 6306: {'lr': 0.000481551434632537, 'samples': 3229184, 'steps': 6306, 'batch_loss/train': 0.8350191600620747} 12/22/2021 11:08:19 - INFO - codeparrot_training - Step 6307: {'lr': 0.0004815454911794409, 'samples': 3229696, 'steps': 6307, 'batch_loss/train': 0.9215751895681024} 12/22/2021 11:08:30 - INFO - codeparrot_training - Step 6308: {'lr': 0.00048153954680580735, 'samples': 3230208, 'steps': 6308, 'batch_loss/train': 0.8314933129586279} 12/22/2021 11:08:43 - INFO - codeparrot_training - Step 6309: {'lr': 0.00048153360151166006, 'samples': 3230720, 'steps': 6309, 'batch_loss/train': 0.8458782150410116} 12/22/2021 11:08:54 - INFO - codeparrot_training - Step 6310: {'lr': 0.0004815276552970226, 'samples': 3231232, 'steps': 6310, 
'batch_loss/train': 0.7224395776866004} 12/22/2021 11:09:05 - INFO - codeparrot_training - Step 6311: {'lr': 0.00048152170816191854, 'samples': 3231744, 'steps': 6311, 'batch_loss/train': 0.8288334524258971} 12/22/2021 11:09:17 - INFO - codeparrot_training - Step 6312: {'lr': 0.0004815157601063715, 'samples': 3232256, 'steps': 6312, 'batch_loss/train': 0.7257224209606647} 12/22/2021 11:09:27 - INFO - codeparrot_training - Step 6313: {'lr': 0.0004815098111304054, 'samples': 3232768, 'steps': 6313, 'batch_loss/train': 0.929454724304378} 12/22/2021 11:09:38 - INFO - codeparrot_training - Step 6314: {'lr': 0.0004815038612340436, 'samples': 3233280, 'steps': 6314, 'batch_loss/train': 0.7410261817276478} 12/22/2021 11:09:49 - INFO - codeparrot_training - Step 6315: {'lr': 0.0004814979104173099, 'samples': 3233792, 'steps': 6315, 'batch_loss/train': 0.7717260383069515} 12/22/2021 11:10:01 - INFO - codeparrot_training - Step 6316: {'lr': 0.0004814919586802279, 'samples': 3234304, 'steps': 6316, 'batch_loss/train': 0.8172218604013324} 12/22/2021 11:10:11 - INFO - codeparrot_training - Step 6317: {'lr': 0.0004814860060228213, 'samples': 3234816, 'steps': 6317, 'batch_loss/train': 0.7340401005931199} 12/22/2021 11:10:22 - INFO - codeparrot_training - Step 6318: {'lr': 0.00048148005244511374, 'samples': 3235328, 'steps': 6318, 'batch_loss/train': 0.7877460373565555} 12/22/2021 11:10:37 - INFO - codeparrot_training - Step 6319: {'lr': 0.00048147409794712883, 'samples': 3235840, 'steps': 6319, 'batch_loss/train': 0.9664824586361647} 12/22/2021 11:10:48 - INFO - codeparrot_training - Step 6320: {'lr': 0.00048146814252889036, 'samples': 3236352, 'steps': 6320, 'batch_loss/train': 0.8303072373382747} 12/22/2021 11:10:58 - INFO - codeparrot_training - Step 6321: {'lr': 0.00048146218619042196, 'samples': 3236864, 'steps': 6321, 'batch_loss/train': 0.9004777390509844} 12/22/2021 11:11:10 - INFO - codeparrot_training - Step 6322: {'lr': 0.0004814562289317473, 'samples': 3237376, 
'steps': 6322, 'batch_loss/train': 0.8400474200025201} 12/22/2021 11:11:21 - INFO - codeparrot_training - Step 6323: {'lr': 0.0004814502707528901, 'samples': 3237888, 'steps': 6323, 'batch_loss/train': 0.6379442418692634} 12/22/2021 11:11:32 - INFO - codeparrot_training - Step 6324: {'lr': 0.00048144431165387397, 'samples': 3238400, 'steps': 6324, 'batch_loss/train': 0.8638065494596958} 12/22/2021 11:11:43 - INFO - codeparrot_training - Step 6325: {'lr': 0.00048143835163472264, 'samples': 3238912, 'steps': 6325, 'batch_loss/train': 0.8872394040226936} 12/22/2021 11:11:54 - INFO - codeparrot_training - Step 6326: {'lr': 0.0004814323906954599, 'samples': 3239424, 'steps': 6326, 'batch_loss/train': 0.7746141124516726} 12/22/2021 11:12:05 - INFO - codeparrot_training - Step 6327: {'lr': 0.0004814264288361093, 'samples': 3239936, 'steps': 6327, 'batch_loss/train': 0.7626793272793293} 12/22/2021 11:12:15 - INFO - codeparrot_training - Step 6328: {'lr': 0.0004814204660566945, 'samples': 3240448, 'steps': 6328, 'batch_loss/train': 0.8409227570518851} 12/22/2021 11:12:29 - INFO - codeparrot_training - Step 6329: {'lr': 0.00048141450235723946, 'samples': 3240960, 'steps': 6329, 'batch_loss/train': 0.6924685739213601} 12/22/2021 11:12:40 - INFO - codeparrot_training - Step 6330: {'lr': 0.0004814085377377677, 'samples': 3241472, 'steps': 6330, 'batch_loss/train': 0.79420280829072} 12/22/2021 11:12:50 - INFO - codeparrot_training - Step 6331: {'lr': 0.0004814025721983029, 'samples': 3241984, 'steps': 6331, 'batch_loss/train': 0.8947003833018243} 12/22/2021 11:13:02 - INFO - codeparrot_training - Step 6332: {'lr': 0.0004813966057388689, 'samples': 3242496, 'steps': 6332, 'batch_loss/train': 0.8220061911270022} 12/22/2021 11:13:13 - INFO - codeparrot_training - Step 6333: {'lr': 0.0004813906383594894, 'samples': 3243008, 'steps': 6333, 'batch_loss/train': 0.8019225541502237} 12/22/2021 11:13:24 - INFO - codeparrot_training - Step 6334: {'lr': 0.00048138467006018806, 'samples': 
3243520, 'steps': 6334, 'batch_loss/train': 0.7395367072895169} 12/22/2021 11:13:37 - INFO - codeparrot_training - Step 6335: {'lr': 0.0004813787008409886, 'samples': 3244032, 'steps': 6335, 'batch_loss/train': 0.8569433847442269} 12/22/2021 11:13:48 - INFO - codeparrot_training - Step 6336: {'lr': 0.00048137273070191487, 'samples': 3244544, 'steps': 6336, 'batch_loss/train': 0.8286504046991467} 12/22/2021 11:13:59 - INFO - codeparrot_training - Step 6337: {'lr': 0.0004813667596429905, 'samples': 3245056, 'steps': 6337, 'batch_loss/train': 0.8943970929831266} 12/22/2021 11:14:09 - INFO - codeparrot_training - Step 6338: {'lr': 0.0004813607876642392, 'samples': 3245568, 'steps': 6338, 'batch_loss/train': 0.7645148544106632} 12/22/2021 11:14:22 - INFO - codeparrot_training - Step 6339: {'lr': 0.00048135481476568474, 'samples': 3246080, 'steps': 6339, 'batch_loss/train': 0.7588249752297997} 12/22/2021 11:14:32 - INFO - codeparrot_training - Step 6340: {'lr': 0.00048134884094735097, 'samples': 3246592, 'steps': 6340, 'batch_loss/train': 0.8846748899668455} 12/22/2021 11:14:43 - INFO - codeparrot_training - Step 6341: {'lr': 0.0004813428662092616, 'samples': 3247104, 'steps': 6341, 'batch_loss/train': 0.9205680824816227} 12/22/2021 11:14:55 - INFO - codeparrot_training - Step 6342: {'lr': 0.0004813368905514403, 'samples': 3247616, 'steps': 6342, 'batch_loss/train': 0.8336727935820818} 12/22/2021 11:15:06 - INFO - codeparrot_training - Step 6343: {'lr': 0.0004813309139739108, 'samples': 3248128, 'steps': 6343, 'batch_loss/train': 0.8788163335993886} 12/22/2021 11:15:16 - INFO - codeparrot_training - Step 6344: {'lr': 0.00048132493647669707, 'samples': 3248640, 'steps': 6344, 'batch_loss/train': 0.7963222777470946} 12/22/2021 11:15:28 - INFO - codeparrot_training - Step 6345: {'lr': 0.0004813189580598226, 'samples': 3249152, 'steps': 6345, 'batch_loss/train': 0.7872040402144194} 12/22/2021 11:15:39 - INFO - codeparrot_training - Step 6346: {'lr': 0.00048131297872331145, 
'samples': 3249664, 'steps': 6346, 'batch_loss/train': 0.7863682424649596} 12/22/2021 11:15:50 - INFO - codeparrot_training - Step 6347: {'lr': 0.0004813069984671871, 'samples': 3250176, 'steps': 6347, 'batch_loss/train': 0.8357561975717545} 12/22/2021 11:16:00 - INFO - codeparrot_training - Step 6348: {'lr': 0.00048130101729147354, 'samples': 3250688, 'steps': 6348, 'batch_loss/train': 0.8532714881002903} 12/22/2021 11:16:14 - INFO - codeparrot_training - Step 6349: {'lr': 0.00048129503519619445, 'samples': 3251200, 'steps': 6349, 'batch_loss/train': 0.600561773404479} 12/22/2021 11:16:25 - INFO - codeparrot_training - Step 6350: {'lr': 0.0004812890521813736, 'samples': 3251712, 'steps': 6350, 'batch_loss/train': 0.8386447317898273} 12/22/2021 11:16:35 - INFO - codeparrot_training - Step 6351: {'lr': 0.0004812830682470348, 'samples': 3252224, 'steps': 6351, 'batch_loss/train': 1.0444960566237569} 12/22/2021 11:16:47 - INFO - codeparrot_training - Step 6352: {'lr': 0.00048127708339320196, 'samples': 3252736, 'steps': 6352, 'batch_loss/train': 0.7836023173294961} 12/22/2021 11:16:58 - INFO - codeparrot_training - Step 6353: {'lr': 0.0004812710976198987, 'samples': 3253248, 'steps': 6353, 'batch_loss/train': 0.8444328410550952} 12/22/2021 11:17:09 - INFO - codeparrot_training - Step 6354: {'lr': 0.00048126511092714885, 'samples': 3253760, 'steps': 6354, 'batch_loss/train': 0.8374726593028754} 12/22/2021 11:17:21 - INFO - codeparrot_training - Step 6355: {'lr': 0.0004812591233149762, 'samples': 3254272, 'steps': 6355, 'batch_loss/train': 0.8426442723721266} 12/22/2021 11:17:32 - INFO - codeparrot_training - Step 6356: {'lr': 0.0004812531347834047, 'samples': 3254784, 'steps': 6356, 'batch_loss/train': 0.8692095493897796} 12/22/2021 11:17:42 - INFO - codeparrot_training - Step 6357: {'lr': 0.00048124714533245806, 'samples': 3255296, 'steps': 6357, 'batch_loss/train': 0.8562375940382481} 12/22/2021 11:17:56 - INFO - codeparrot_training - Step 6358: {'lr': 
0.00048124115496216006, 'samples': 3255808, 'steps': 6358, 'batch_loss/train': 0.7783602867275476} 12/22/2021 11:18:07 - INFO - codeparrot_training - Step 6359: {'lr': 0.00048123516367253445, 'samples': 3256320, 'steps': 6359, 'batch_loss/train': 1.5080239465460181} 12/22/2021 11:18:18 - INFO - codeparrot_training - Step 6360: {'lr': 0.0004812291714636052, 'samples': 3256832, 'steps': 6360, 'batch_loss/train': 0.9551394488662481} 12/22/2021 11:18:28 - INFO - codeparrot_training - Step 6361: {'lr': 0.00048122317833539606, 'samples': 3257344, 'steps': 6361, 'batch_loss/train': 0.8175530200824142} 12/22/2021 11:18:40 - INFO - codeparrot_training - Step 6362: {'lr': 0.00048121718428793093, 'samples': 3257856, 'steps': 6362, 'batch_loss/train': 0.8868716498836875} 12/22/2021 11:18:51 - INFO - codeparrot_training - Step 6363: {'lr': 0.00048121118932123355, 'samples': 3258368, 'steps': 6363, 'batch_loss/train': 0.6693473884370178} 12/22/2021 11:19:02 - INFO - codeparrot_training - Step 6364: {'lr': 0.00048120519343532777, 'samples': 3258880, 'steps': 6364, 'batch_loss/train': 0.6160304528311826} 12/22/2021 11:19:16 - INFO - codeparrot_training - Step 6365: {'lr': 0.0004811991966302374, 'samples': 3259392, 'steps': 6365, 'batch_loss/train': 0.7698003549594432} 12/22/2021 11:19:26 - INFO - codeparrot_training - Step 6366: {'lr': 0.0004811931989059864, 'samples': 3259904, 'steps': 6366, 'batch_loss/train': 0.8369901459664106} 12/22/2021 11:19:37 - INFO - codeparrot_training - Step 6367: {'lr': 0.0004811872002625985, 'samples': 3260416, 'steps': 6367, 'batch_loss/train': 0.8414588803425431} 12/22/2021 11:19:49 - INFO - codeparrot_training - Step 6368: {'lr': 0.00048118120070009756, 'samples': 3260928, 'steps': 6368, 'batch_loss/train': 1.013897211290896} 12/22/2021 11:19:59 - INFO - codeparrot_training - Step 6369: {'lr': 0.00048117520021850754, 'samples': 3261440, 'steps': 6369, 'batch_loss/train': 0.7799563007429242} 12/22/2021 11:20:10 - INFO - codeparrot_training - Step 
6370: {'lr': 0.00048116919881785214, 'samples': 3261952, 'steps': 6370, 'batch_loss/train': 0.8087970349006355} 12/22/2021 11:20:21 - INFO - codeparrot_training - Step 6371: {'lr': 0.0004811631964981553, 'samples': 3262464, 'steps': 6371, 'batch_loss/train': 0.920294675976038} 12/22/2021 11:20:33 - INFO - codeparrot_training - Step 6372: {'lr': 0.00048115719325944085, 'samples': 3262976, 'steps': 6372, 'batch_loss/train': 0.824061369523406} 12/22/2021 11:20:44 - INFO - codeparrot_training - Step 6373: {'lr': 0.00048115118910173273, 'samples': 3263488, 'steps': 6373, 'batch_loss/train': 0.9356520166620612} 12/22/2021 11:20:54 - INFO - codeparrot_training - Step 6374: {'lr': 0.00048114518402505473, 'samples': 3264000, 'steps': 6374, 'batch_loss/train': 0.6132482655812055} 12/22/2021 11:21:08 - INFO - codeparrot_training - Step 6375: {'lr': 0.0004811391780294308, 'samples': 3264512, 'steps': 6375, 'batch_loss/train': 0.8215221166610718} 12/22/2021 11:21:19 - INFO - codeparrot_training - Step 6376: {'lr': 0.00048113317111488476, 'samples': 3265024, 'steps': 6376, 'batch_loss/train': 0.8347632344812155} 12/22/2021 11:21:30 - INFO - codeparrot_training - Step 6377: {'lr': 0.0004811271632814405, 'samples': 3265536, 'steps': 6377, 'batch_loss/train': 0.7977524502202868} 12/22/2021 11:21:42 - INFO - codeparrot_training - Step 6378: {'lr': 0.00048112115452912186, 'samples': 3266048, 'steps': 6378, 'batch_loss/train': 0.6763107907027006} 12/22/2021 11:21:52 - INFO - codeparrot_training - Step 6379: {'lr': 0.00048111514485795284, 'samples': 3266560, 'steps': 6379, 'batch_loss/train': 0.8790541822090745} 12/22/2021 11:22:03 - INFO - codeparrot_training - Step 6380: {'lr': 0.00048110913426795723, 'samples': 3267072, 'steps': 6380, 'batch_loss/train': 0.849862058646977} 12/22/2021 11:22:15 - INFO - codeparrot_training - Step 6381: {'lr': 0.00048110312275915894, 'samples': 3267584, 'steps': 6381, 'batch_loss/train': 0.8250335045158863} 12/22/2021 11:22:26 - INFO - 
codeparrot_training - Step 6382: {'lr': 0.00048109711033158197, 'samples': 3268096, 'steps': 6382, 'batch_loss/train': 0.7931120125576854} 12/22/2021 11:22:36 - INFO - codeparrot_training - Step 6383: {'lr': 0.00048109109698525013, 'samples': 3268608, 'steps': 6383, 'batch_loss/train': 0.8295251922681928} 12/22/2021 11:22:47 - INFO - codeparrot_training - Step 6384: {'lr': 0.0004810850827201873, 'samples': 3269120, 'steps': 6384, 'batch_loss/train': 0.8227516058832407} 12/22/2021 11:22:59 - INFO - codeparrot_training - Step 6385: {'lr': 0.00048107906753641746, 'samples': 3269632, 'steps': 6385, 'batch_loss/train': 0.8734818575903773} 12/22/2021 11:23:09 - INFO - codeparrot_training - Step 6386: {'lr': 0.0004810730514339645, 'samples': 3270144, 'steps': 6386, 'batch_loss/train': 0.7935062437318265} 12/22/2021 11:23:20 - INFO - codeparrot_training - Step 6387: {'lr': 0.0004810670344128524, 'samples': 3270656, 'steps': 6387, 'batch_loss/train': 0.7663248288445175} 12/22/2021 11:23:34 - INFO - codeparrot_training - Step 6388: {'lr': 0.0004810610164731049, 'samples': 3271168, 'steps': 6388, 'batch_loss/train': 0.9173416618723422} 12/22/2021 11:23:45 - INFO - codeparrot_training - Step 6389: {'lr': 0.00048105499761474615, 'samples': 3271680, 'steps': 6389, 'batch_loss/train': 0.984837283147499} 12/22/2021 11:23:55 - INFO - codeparrot_training - Step 6390: {'lr': 0.00048104897783779986, 'samples': 3272192, 'steps': 6390, 'batch_loss/train': 0.8556765504181385} 12/22/2021 11:24:07 - INFO - codeparrot_training - Step 6391: {'lr': 0.00048104295714229017, 'samples': 3272704, 'steps': 6391, 'batch_loss/train': 0.8395042773336172} 12/22/2021 11:24:18 - INFO - codeparrot_training - Step 6392: {'lr': 0.00048103693552824093, 'samples': 3273216, 'steps': 6392, 'batch_loss/train': 0.7851774152368307} 12/22/2021 11:24:28 - INFO - codeparrot_training - Step 6393: {'lr': 0.000481030912995676, 'samples': 3273728, 'steps': 6393, 'batch_loss/train': 0.7907074890099466} 12/22/2021 11:24:39 
- INFO - codeparrot_training - Step 6394: {'lr': 0.0004810248895446194, 'samples': 3274240, 'steps': 6394, 'batch_loss/train': 0.8826291216537356} 12/22/2021 11:24:53 - INFO - codeparrot_training - Step 6395: {'lr': 0.0004810188651750951, 'samples': 3274752, 'steps': 6395, 'batch_loss/train': 0.7759314887225628} 12/22/2021 11:25:04 - INFO - codeparrot_training - Step 6396: {'lr': 0.000481012839887127, 'samples': 3275264, 'steps': 6396, 'batch_loss/train': 0.8602457121014595} 12/22/2021 11:25:14 - INFO - codeparrot_training - Step 6397: {'lr': 0.0004810068136807391, 'samples': 3275776, 'steps': 6397, 'batch_loss/train': 0.7955685588531196} 12/22/2021 11:25:26 - INFO - codeparrot_training - Step 6398: {'lr': 0.0004810007865559554, 'samples': 3276288, 'steps': 6398, 'batch_loss/train': 0.7695747525431216} 12/22/2021 11:25:37 - INFO - codeparrot_training - Step 6399: {'lr': 0.0004809947585127996, 'samples': 3276800, 'steps': 6399, 'batch_loss/train': 0.801992752822116} 12/22/2021 11:25:47 - INFO - codeparrot_training - Step 6400: {'lr': 0.000480988729551296, 'samples': 3277312, 'steps': 6400, 'batch_loss/train': 0.9111617733724415} 12/22/2021 11:26:00 - INFO - codeparrot_training - Step 6401: {'lr': 0.00048098269967146846, 'samples': 3277824, 'steps': 6401, 'batch_loss/train': 0.8567503960803151} 12/22/2021 11:26:10 - INFO - codeparrot_training - Step 6402: {'lr': 0.00048097666887334087, 'samples': 3278336, 'steps': 6402, 'batch_loss/train': 0.729989011422731} 12/22/2021 11:26:21 - INFO - codeparrot_training - Step 6403: {'lr': 0.00048097063715693723, 'samples': 3278848, 'steps': 6403, 'batch_loss/train': 0.8392686266452074} 12/22/2021 11:26:31 - INFO - codeparrot_training - Step 6404: {'lr': 0.0004809646045222816, 'samples': 3279360, 'steps': 6404, 'batch_loss/train': 0.8193266000598669} 12/22/2021 11:26:45 - INFO - codeparrot_training - Step 6405: {'lr': 0.00048095857096939787, 'samples': 3279872, 'steps': 6405, 'batch_loss/train': 0.7120424583554268} 12/22/2021 
11:26:56 - INFO - codeparrot_training - Step 6406: {'lr': 0.00048095253649831006, 'samples': 3280384, 'steps': 6406, 'batch_loss/train': 0.8166516842320561} 12/22/2021 11:27:06 - INFO - codeparrot_training - Step 6407: {'lr': 0.0004809465011090423, 'samples': 3280896, 'steps': 6407, 'batch_loss/train': 0.830906287766993} 12/22/2021 11:27:18 - INFO - codeparrot_training - Step 6408: {'lr': 0.00048094046480161827, 'samples': 3281408, 'steps': 6408, 'batch_loss/train': 0.8442448116838932} 12/22/2021 11:27:29 - INFO - codeparrot_training - Step 6409: {'lr': 0.0004809344275760622, 'samples': 3281920, 'steps': 6409, 'batch_loss/train': 0.736570886336267} 12/22/2021 11:27:40 - INFO - codeparrot_training - Step 6410: {'lr': 0.00048092838943239815, 'samples': 3282432, 'steps': 6410, 'batch_loss/train': 1.0026547089219093} 12/22/2021 11:27:55 - INFO - codeparrot_training - Step 6411: {'lr': 0.0004809223503706499, 'samples': 3282944, 'steps': 6411, 'batch_loss/train': 0.737940764054656} 12/22/2021 11:28:06 - INFO - codeparrot_training - Step 6412: {'lr': 0.0004809163103908416, 'samples': 3283456, 'steps': 6412, 'batch_loss/train': 0.854516290826723} 12/22/2021 11:28:16 - INFO - codeparrot_training - Step 6413: {'lr': 0.00048091026949299734, 'samples': 3283968, 'steps': 6413, 'batch_loss/train': 1.0386575618758798} 12/22/2021 11:28:27 - INFO - codeparrot_training - Step 6414: {'lr': 0.0004809042276771409, 'samples': 3284480, 'steps': 6414, 'batch_loss/train': 1.191975059453398} 12/22/2021 11:28:39 - INFO - codeparrot_training - Step 6415: {'lr': 0.0004808981849432965, 'samples': 3284992, 'steps': 6415, 'batch_loss/train': 0.8967691799625754} 12/22/2021 11:28:50 - INFO - codeparrot_training - Step 6416: {'lr': 0.00048089214129148813, 'samples': 3285504, 'steps': 6416, 'batch_loss/train': 0.8462617853656411} 12/22/2021 11:29:00 - INFO - codeparrot_training - Step 6417: {'lr': 0.00048088609672173975, 'samples': 3286016, 'steps': 6417, 'batch_loss/train': 0.9287387188524008} 
12/22/2021 11:29:12 - INFO - codeparrot_training - Step 6418: {'lr': 0.0004808800512340754, 'samples': 3286528, 'steps': 6418, 'batch_loss/train': 0.949709439650178} 12/22/2021 11:29:23 - INFO - codeparrot_training - Step 6419: {'lr': 0.00048087400482851915, 'samples': 3287040, 'steps': 6419, 'batch_loss/train': 0.8377832798287272} 12/22/2021 11:29:34 - INFO - codeparrot_training - Step 6420: {'lr': 0.0004808679575050951, 'samples': 3287552, 'steps': 6420, 'batch_loss/train': 0.9450042331591249} 12/22/2021 11:29:46 - INFO - codeparrot_training - Step 6421: {'lr': 0.00048086190926382717, 'samples': 3288064, 'steps': 6421, 'batch_loss/train': 0.8168274089694023} 12/22/2021 11:29:57 - INFO - codeparrot_training - Step 6422: {'lr': 0.00048085586010473947, 'samples': 3288576, 'steps': 6422, 'batch_loss/train': 0.7908781904261559} 12/22/2021 11:30:07 - INFO - codeparrot_training - Step 6423: {'lr': 0.00048084981002785603, 'samples': 3289088, 'steps': 6423, 'batch_loss/train': 0.8014107793569565} 12/22/2021 11:30:18 - INFO - codeparrot_training - Step 6424: {'lr': 0.00048084375903320093, 'samples': 3289600, 'steps': 6424, 'batch_loss/train': 1.041065444238484} 12/22/2021 11:30:32 - INFO - codeparrot_training - Step 6425: {'lr': 0.00048083770712079824, 'samples': 3290112, 'steps': 6425, 'batch_loss/train': 1.7314960518851876} 12/22/2021 11:30:42 - INFO - codeparrot_training - Step 6426: {'lr': 0.00048083165429067186, 'samples': 3290624, 'steps': 6426, 'batch_loss/train': 0.9398281620815396} 12/22/2021 11:30:53 - INFO - codeparrot_training - Step 6427: {'lr': 0.00048082560054284613, 'samples': 3291136, 'steps': 6427, 'batch_loss/train': 0.8512114062905312} 12/22/2021 11:31:05 - INFO - codeparrot_training - Step 6428: {'lr': 0.0004808195458773449, 'samples': 3291648, 'steps': 6428, 'batch_loss/train': 0.8078408967703581} 12/22/2021 11:31:15 - INFO - codeparrot_training - Step 6429: {'lr': 0.0004808134902941923, 'samples': 3292160, 'steps': 6429, 'batch_loss/train': 
0.7935840534046292} 12/22/2021 11:31:26 - INFO - codeparrot_training - Step 6430: {'lr': 0.0004808074337934124, 'samples': 3292672, 'steps': 6430, 'batch_loss/train': 0.7293504017870873} 12/22/2021 11:31:39 - INFO - codeparrot_training - Step 6431: {'lr': 0.00048080137637502933, 'samples': 3293184, 'steps': 6431, 'batch_loss/train': 0.9263877347111702} 12/22/2021 11:31:49 - INFO - codeparrot_training - Step 6432: {'lr': 0.00048079531803906715, 'samples': 3293696, 'steps': 6432, 'batch_loss/train': 0.8643355281092227} 12/22/2021 11:32:00 - INFO - codeparrot_training - Step 6433: {'lr': 0.0004807892587855499, 'samples': 3294208, 'steps': 6433, 'batch_loss/train': 1.1887617523316294} 12/22/2021 11:32:11 - INFO - codeparrot_training - Step 6434: {'lr': 0.0004807831986145017, 'samples': 3294720, 'steps': 6434, 'batch_loss/train': 0.9154257830232382} 12/22/2021 11:32:25 - INFO - codeparrot_training - Step 6435: {'lr': 0.0004807771375259466, 'samples': 3295232, 'steps': 6435, 'batch_loss/train': 0.8099397569894791} 12/22/2021 11:32:35 - INFO - codeparrot_training - Step 6436: {'lr': 0.00048077107551990886, 'samples': 3295744, 'steps': 6436, 'batch_loss/train': 0.8992217374034226} 12/22/2021 11:32:46 - INFO - codeparrot_training - Step 6437: {'lr': 0.0004807650125964124, 'samples': 3296256, 'steps': 6437, 'batch_loss/train': 0.8874814594164491} 12/22/2021 11:32:58 - INFO - codeparrot_training - Step 6438: {'lr': 0.0004807589487554813, 'samples': 3296768, 'steps': 6438, 'batch_loss/train': 0.710952871479094} 12/22/2021 11:33:09 - INFO - codeparrot_training - Step 6439: {'lr': 0.00048075288399713985, 'samples': 3297280, 'steps': 6439, 'batch_loss/train': 0.7915699491277337} 12/22/2021 11:33:19 - INFO - codeparrot_training - Step 6440: {'lr': 0.00048074681832141204, 'samples': 3297792, 'steps': 6440, 'batch_loss/train': 0.9513332010246813} 12/22/2021 11:33:33 - INFO - codeparrot_training - Step 6441: {'lr': 0.000480740751728322, 'samples': 3298304, 'steps': 6441, 
'batch_loss/train': 0.8810980068519711} 12/22/2021 11:33:44 - INFO - codeparrot_training - Step 6442: {'lr': 0.00048073468421789385, 'samples': 3298816, 'steps': 6442, 'batch_loss/train': 0.6947481140960008} 12/22/2021 11:33:54 - INFO - codeparrot_training - Step 6443: {'lr': 0.0004807286157901517, 'samples': 3299328, 'steps': 6443, 'batch_loss/train': 0.939699275419116} 12/22/2021 11:34:05 - INFO - codeparrot_training - Step 6444: {'lr': 0.0004807225464451197, 'samples': 3299840, 'steps': 6444, 'batch_loss/train': 0.8521342906169593} 12/22/2021 11:34:17 - INFO - codeparrot_training - Step 6445: {'lr': 0.00048071647618282197, 'samples': 3300352, 'steps': 6445, 'batch_loss/train': 0.8773769773542881} 12/22/2021 11:34:28 - INFO - codeparrot_training - Step 6446: {'lr': 0.0004807104050032827, 'samples': 3300864, 'steps': 6446, 'batch_loss/train': 1.3210913115181029} 12/22/2021 11:34:38 - INFO - codeparrot_training - Step 6447: {'lr': 0.00048070433290652595, 'samples': 3301376, 'steps': 6447, 'batch_loss/train': 0.8203734736889601} 12/22/2021 11:34:50 - INFO - codeparrot_training - Step 6448: {'lr': 0.0004806982598925759, 'samples': 3301888, 'steps': 6448, 'batch_loss/train': 0.8045608047395945} 12/22/2021 11:35:01 - INFO - codeparrot_training - Step 6449: {'lr': 0.00048069218596145664, 'samples': 3302400, 'steps': 6449, 'batch_loss/train': 0.9275511098094285} 12/22/2021 11:35:12 - INFO - codeparrot_training - Step 6450: {'lr': 0.0004806861111131924, 'samples': 3302912, 'steps': 6450, 'batch_loss/train': 0.8663354134187102} 12/22/2021 11:35:26 - INFO - codeparrot_training - Step 6451: {'lr': 0.00048068003534780724, 'samples': 3303424, 'steps': 6451, 'batch_loss/train': 0.842506691638846} 12/22/2021 11:35:37 - INFO - codeparrot_training - Step 6452: {'lr': 0.0004806739586653254, 'samples': 3303936, 'steps': 6452, 'batch_loss/train': 0.9034820222295821} 12/22/2021 11:35:47 - INFO - codeparrot_training - Step 6453: {'lr': 0.0004806678810657709, 'samples': 3304448, 
'steps': 6453, 'batch_loss/train': 1.1094394656829536} 12/22/2021 11:35:58 - INFO - codeparrot_training - Step 6454: {'lr': 0.00048066180254916803, 'samples': 3304960, 'steps': 6454, 'batch_loss/train': 0.754434494767338} 12/22/2021 11:36:10 - INFO - codeparrot_training - Step 6455: {'lr': 0.000480655723115541, 'samples': 3305472, 'steps': 6455, 'batch_loss/train': 0.8572454159148037} 12/22/2021 11:36:20 - INFO - codeparrot_training - Step 6456: {'lr': 0.00048064964276491387, 'samples': 3305984, 'steps': 6456, 'batch_loss/train': 0.8310453025624156} 12/22/2021 11:36:31 - INFO - codeparrot_training - Step 6457: {'lr': 0.0004806435614973109, 'samples': 3306496, 'steps': 6457, 'batch_loss/train': 0.9077599395532161} 12/22/2021 11:36:43 - INFO - codeparrot_training - Step 6458: {'lr': 0.00048063747931275616, 'samples': 3307008, 'steps': 6458, 'batch_loss/train': 0.8949022265151143} 12/22/2021 11:36:54 - INFO - codeparrot_training - Step 6459: {'lr': 0.0004806313962112739, 'samples': 3307520, 'steps': 6459, 'batch_loss/train': 0.8142871167510748} 12/22/2021 11:37:04 - INFO - codeparrot_training - Step 6460: {'lr': 0.0004806253121928883, 'samples': 3308032, 'steps': 6460, 'batch_loss/train': 0.8640155638568103} 12/22/2021 11:37:16 - INFO - codeparrot_training - Step 6461: {'lr': 0.00048061922725762353, 'samples': 3308544, 'steps': 6461, 'batch_loss/train': 0.8304147713351995} 12/22/2021 11:37:27 - INFO - codeparrot_training - Step 6462: {'lr': 0.00048061314140550375, 'samples': 3309056, 'steps': 6462, 'batch_loss/train': 0.8765138159506023} 12/22/2021 11:37:38 - INFO - codeparrot_training - Step 6463: {'lr': 0.00048060705463655323, 'samples': 3309568, 'steps': 6463, 'batch_loss/train': 0.8909217268228531} 12/22/2021 11:37:48 - INFO - codeparrot_training - Step 6464: {'lr': 0.00048060096695079615, 'samples': 3310080, 'steps': 6464, 'batch_loss/train': 0.8155405977740884} 12/22/2021 11:38:02 - INFO - codeparrot_training - Step 6465: {'lr': 0.0004805948783482567, 'samples': 
3310592, 'steps': 6465, 'batch_loss/train': 0.7656483971513808} 12/22/2021 11:38:13 - INFO - codeparrot_training - Step 6466: {'lr': 0.0004805887888289591, 'samples': 3311104, 'steps': 6466, 'batch_loss/train': 0.8281451184302568} 12/22/2021 11:38:23 - INFO - codeparrot_training - Step 6467: {'lr': 0.00048058269839292743, 'samples': 3311616, 'steps': 6467, 'batch_loss/train': 0.812498196028173} 12/22/2021 11:38:35 - INFO - codeparrot_training - Step 6468: {'lr': 0.0004805766070401861, 'samples': 3312128, 'steps': 6468, 'batch_loss/train': 0.940979931736365} 12/22/2021 11:38:46 - INFO - codeparrot_training - Step 6469: {'lr': 0.0004805705147707593, 'samples': 3312640, 'steps': 6469, 'batch_loss/train': 0.8287421744316816} 12/22/2021 11:38:56 - INFO - codeparrot_training - Step 6470: {'lr': 0.0004805644215846712, 'samples': 3313152, 'steps': 6470, 'batch_loss/train': 0.8809888353571296} 12/22/2021 11:39:08 - INFO - codeparrot_training - Step 6471: {'lr': 0.0004805583274819459, 'samples': 3313664, 'steps': 6471, 'batch_loss/train': 0.8752128848573193} 12/22/2021 11:39:19 - INFO - codeparrot_training - Step 6472: {'lr': 0.00048055223246260784, 'samples': 3314176, 'steps': 6472, 'batch_loss/train': 0.8695345111191273} 12/22/2021 11:39:30 - INFO - codeparrot_training - Step 6473: {'lr': 0.00048054613652668106, 'samples': 3314688, 'steps': 6473, 'batch_loss/train': 0.7737562730908394} 12/22/2021 11:39:43 - INFO - codeparrot_training - Step 6474: {'lr': 0.00048054003967419, 'samples': 3315200, 'steps': 6474, 'batch_loss/train': 0.8891008757054806} 12/22/2021 11:39:54 - INFO - codeparrot_training - Step 6475: {'lr': 0.0004805339419051587, 'samples': 3315712, 'steps': 6475, 'batch_loss/train': 0.691631811670959} 12/22/2021 11:40:05 - INFO - codeparrot_training - Step 6476: {'lr': 0.0004805278432196115, 'samples': 3316224, 'steps': 6476, 'batch_loss/train': 0.8451125416904688} 12/22/2021 11:40:15 - INFO - codeparrot_training - Step 6477: {'lr': 0.0004805217436175727, 
'samples': 3316736, 'steps': 6477, 'batch_loss/train': 0.8534210156649351} 12/22/2021 11:40:28 - INFO - codeparrot_training - Step 6478: {'lr': 0.0004805156430990665, 'samples': 3317248, 'steps': 6478, 'batch_loss/train': 1.5348565550521016} 12/22/2021 11:40:38 - INFO - codeparrot_training - Step 6479: {'lr': 0.0004805095416641171, 'samples': 3317760, 'steps': 6479, 'batch_loss/train': 0.8652492072433233} 12/22/2021 11:40:49 - INFO - codeparrot_training - Step 6480: {'lr': 0.00048050343931274874, 'samples': 3318272, 'steps': 6480, 'batch_loss/train': 0.9211316918954253} 12/22/2021 11:41:03 - INFO - codeparrot_training - Step 6481: {'lr': 0.0004804973360449858, 'samples': 3318784, 'steps': 6481, 'batch_loss/train': 0.8093918738886714} 12/22/2021 11:41:13 - INFO - codeparrot_training - Step 6482: {'lr': 0.0004804912318608525, 'samples': 3319296, 'steps': 6482, 'batch_loss/train': 0.8256771950982511} 12/22/2021 11:41:24 - INFO - codeparrot_training - Step 6483: {'lr': 0.0004804851267603731, 'samples': 3319808, 'steps': 6483, 'batch_loss/train': 0.9147145040333271} 12/22/2021 11:41:36 - INFO - codeparrot_training - Step 6484: {'lr': 0.00048047902074357186, 'samples': 3320320, 'steps': 6484, 'batch_loss/train': 0.9223741246387362} 12/22/2021 11:41:46 - INFO - codeparrot_training - Step 6485: {'lr': 0.000480472913810473, 'samples': 3320832, 'steps': 6485, 'batch_loss/train': 0.7470155353657901} 12/22/2021 11:41:57 - INFO - codeparrot_training - Step 6486: {'lr': 0.00048046680596110095, 'samples': 3321344, 'steps': 6486, 'batch_loss/train': 0.8867442645132542} 12/22/2021 11:42:08 - INFO - codeparrot_training - Step 6487: {'lr': 0.0004804606971954798, 'samples': 3321856, 'steps': 6487, 'batch_loss/train': 0.7981154061853886} 12/22/2021 11:42:20 - INFO - codeparrot_training - Step 6488: {'lr': 0.00048045458751363405, 'samples': 3322368, 'steps': 6488, 'batch_loss/train': 0.8039563093334436} 12/22/2021 11:42:30 - INFO - codeparrot_training - Step 6489: {'lr': 
0.00048044847691558777, 'samples': 3322880, 'steps': 6489, 'batch_loss/train': 0.6954171873512678}
12/22/2021 11:42:41 - INFO - codeparrot_training - Step 6490: {'lr': 0.00048044236540136545, 'samples': 3323392, 'steps': 6490, 'batch_loss/train': 0.8815414477139711}
12/22/2021 11:42:55 - INFO - codeparrot_training - Step 6491: {'lr': 0.0004804362529709912, 'samples': 3323904, 'steps': 6491, 'batch_loss/train': 0.8939166646450758}
12/22/2021 11:43:05 - INFO - codeparrot_training - Step 6492: {'lr': 0.00048043013962448944, 'samples': 3324416, 'steps': 6492, 'batch_loss/train': 0.8664937689900398}
12/22/2021 11:43:16 - INFO - codeparrot_training - Step 6493: {'lr': 0.0004804240253618845, 'samples': 3324928, 'steps': 6493, 'batch_loss/train': 0.6983002633205615}
12/22/2021 11:43:28 - INFO - codeparrot_training - Step 6494: {'lr': 0.0004804179101832006, 'samples': 3325440, 'steps': 6494, 'batch_loss/train': 0.7801956471521407}
12/22/2021 11:43:39 - INFO - codeparrot_training - Step 6495: {'lr': 0.0004804117940884621, 'samples': 3325952, 'steps': 6495, 'batch_loss/train': 0.9744748114608228}
12/22/2021 11:43:49 - INFO - codeparrot_training - Step 6496: {'lr': 0.00048040567707769333, 'samples': 3326464, 'steps': 6496, 'batch_loss/train': 0.8125344337895513}
12/22/2021 11:44:01 - INFO - codeparrot_training - Step 6497: {'lr': 0.0004803995591509186, 'samples': 3326976, 'steps': 6497, 'batch_loss/train': 0.8739111879840493}
12/22/2021 11:44:12 - INFO - codeparrot_training - Step 6498: {'lr': 0.0004803934403081622, 'samples': 3327488, 'steps': 6498, 'batch_loss/train': 0.8547037122771144}
12/22/2021 11:44:23 - INFO - codeparrot_training - Step 6499: {'lr': 0.00048038732054944846, 'samples': 3328000, 'steps': 6499, 'batch_loss/train': 0.8258694857358932}
12/22/2021 11:44:33 - INFO - codeparrot_training - Step 6500: {'lr': 0.0004803811998748017, 'samples': 3328512, 'steps': 6500, 'batch_loss/train': 0.8658650810830295}
12/22/2021 11:44:45 - INFO - codeparrot_training - Step 6501: {'lr': 0.00048037507828424633, 'samples': 3329024, 'steps': 6501, 'batch_loss/train': 0.6806757079903036}
12/22/2021 11:44:56 - INFO - codeparrot_training - Step 6502: {'lr': 0.00048036895577780657, 'samples': 3329536, 'steps': 6502, 'batch_loss/train': 0.9445935003459454}
12/22/2021 11:45:07 - INFO - codeparrot_training - Step 6503: {'lr': 0.0004803628323555069, 'samples': 3330048, 'steps': 6503, 'batch_loss/train': 0.730150890070945}
12/22/2021 11:45:30 - INFO - codeparrot_training - Step 6504: {'lr': 0.00048035670801737154, 'samples': 3330560, 'steps': 6504, 'batch_loss/train': 0.7756949549075216}
12/22/2021 11:45:41 - INFO - codeparrot_training - Step 6505: {'lr': 0.0004803505827634249, 'samples': 3331072, 'steps': 6505, 'batch_loss/train': 0.8070660424418747}
12/22/2021 11:45:51 - INFO - codeparrot_training - Step 6506: {'lr': 0.00048034445659369136, 'samples': 3331584, 'steps': 6506, 'batch_loss/train': 0.8223155168816447}
12/22/2021 11:46:03 - INFO - codeparrot_training - Step 6507: {'lr': 0.0004803383295081952, 'samples': 3332096, 'steps': 6507, 'batch_loss/train': 0.8399602533318102}
12/22/2021 11:46:14 - INFO - codeparrot_training - Step 6508: {'lr': 0.00048033220150696084, 'samples': 3332608, 'steps': 6508, 'batch_loss/train': 0.7926609208807349}
12/22/2021 11:46:25 - INFO - codeparrot_training - Step 6509: {'lr': 0.0004803260725900126, 'samples': 3333120, 'steps': 6509, 'batch_loss/train': 0.8272367748431861}
12/22/2021 11:46:35 - INFO - codeparrot_training - Step 6510: {'lr': 0.0004803199427573748, 'samples': 3333632, 'steps': 6510, 'batch_loss/train': 0.8581687700934708}
12/22/2021 11:46:49 - INFO - codeparrot_training - Step 6511: {'lr': 0.00048031381200907197, 'samples': 3334144, 'steps': 6511, 'batch_loss/train': 0.817698169965297}
12/22/2021 11:47:00 - INFO - codeparrot_training - Step 6512: {'lr': 0.00048030768034512835, 'samples': 3334656, 'steps': 6512, 'batch_loss/train': 0.787359724752605}
12/22/2021 11:47:11 - INFO - codeparrot_training - Step 6513: {'lr': 0.00048030154776556834, 'samples': 3335168, 'steps': 6513, 'batch_loss/train': 0.8447153028100729}
12/22/2021 11:47:23 - INFO - codeparrot_training - Step 6514: {'lr': 0.00048029541427041636, 'samples': 3335680, 'steps': 6514, 'batch_loss/train': 0.8180902861058712}
12/22/2021 11:47:33 - INFO - codeparrot_training - Step 6515: {'lr': 0.00048028927985969673, 'samples': 3336192, 'steps': 6515, 'batch_loss/train': 0.7913793965708464}
12/22/2021 11:47:44 - INFO - codeparrot_training - Step 6516: {'lr': 0.0004802831445334339, 'samples': 3336704, 'steps': 6516, 'batch_loss/train': 0.8811459476128221}
12/22/2021 11:47:56 - INFO - codeparrot_training - Step 6517: {'lr': 0.00048027700829165223, 'samples': 3337216, 'steps': 6517, 'batch_loss/train': 0.8038213429972529}
12/22/2021 11:48:07 - INFO - codeparrot_training - Step 6518: {'lr': 0.00048027087113437615, 'samples': 3337728, 'steps': 6518, 'batch_loss/train': 0.8222640715539455}
12/22/2021 11:48:17 - INFO - codeparrot_training - Step 6519: {'lr': 0.00048026473306163, 'samples': 3338240, 'steps': 6519, 'batch_loss/train': 0.7915791375562549}
12/22/2021 11:48:31 - INFO - codeparrot_training - Step 6520: {'lr': 0.00048025859407343824, 'samples': 3338752, 'steps': 6520, 'batch_loss/train': 0.8638124130666256}
12/22/2021 11:48:42 - INFO - codeparrot_training - Step 6521: {'lr': 0.0004802524541698252, 'samples': 3339264, 'steps': 6521, 'batch_loss/train': 0.7145763521548361}
12/22/2021 11:48:53 - INFO - codeparrot_training - Step 6522: {'lr': 0.00048024631335081544, 'samples': 3339776, 'steps': 6522, 'batch_loss/train': 0.7321809669956565}
12/22/2021 11:49:03 - INFO - codeparrot_training - Step 6523: {'lr': 0.0004802401716164332, 'samples': 3340288, 'steps': 6523, 'batch_loss/train': 0.8292872477322817}
12/22/2021 11:49:15 - INFO - codeparrot_training - Step 6524: {'lr': 0.00048023402896670296, 'samples': 3340800, 'steps': 6524, 'batch_loss/train': 0.8134365966543555}
12/22/2021 11:49:26 - INFO - codeparrot_training - Step 6525: {'lr': 0.00048022788540164925, 'samples': 3341312, 'steps': 6525, 'batch_loss/train': 0.8986651105806231}
12/22/2021 11:49:36 - INFO - codeparrot_training - Step 6526: {'lr': 0.00048022174092129634, 'samples': 3341824, 'steps': 6526, 'batch_loss/train': 0.6957268866244704}
12/22/2021 11:49:50 - INFO - codeparrot_training - Step 6527: {'lr': 0.00048021559552566875, 'samples': 3342336, 'steps': 6527, 'batch_loss/train': 0.9089712561108172}
12/22/2021 11:50:01 - INFO - codeparrot_training - Step 6528: {'lr': 0.0004802094492147908, 'samples': 3342848, 'steps': 6528, 'batch_loss/train': 0.8203020971268415}
12/22/2021 11:50:11 - INFO - codeparrot_training - Step 6529: {'lr': 0.00048020330198868705, 'samples': 3343360, 'steps': 6529, 'batch_loss/train': 0.9162176912650466}
12/22/2021 11:50:23 - INFO - codeparrot_training - Step 6530: {'lr': 0.0004801971538473819, 'samples': 3343872, 'steps': 6530, 'batch_loss/train': 0.8452585618942976}
12/22/2021 11:50:34 - INFO - codeparrot_training - Step 6531: {'lr': 0.0004801910047908997, 'samples': 3344384, 'steps': 6531, 'batch_loss/train': 0.8752691810950637}
12/22/2021 11:50:45 - INFO - codeparrot_training - Step 6532: {'lr': 0.0004801848548192651, 'samples': 3344896, 'steps': 6532, 'batch_loss/train': 0.8147289613261819}
12/22/2021 11:50:57 - INFO - codeparrot_training - Step 6533: {'lr': 0.0004801787039325024, 'samples': 3345408, 'steps': 6533, 'batch_loss/train': 0.8568208920769393}
12/22/2021 11:51:07 - INFO - codeparrot_training - Step 6534: {'lr': 0.00048017255213063603, 'samples': 3345920, 'steps': 6534, 'batch_loss/train': 0.8164173532277346}
12/22/2021 11:51:18 - INFO - codeparrot_training - Step 6535: {'lr': 0.00048016639941369057, 'samples': 3346432, 'steps': 6535, 'batch_loss/train': 0.7560137324035168}
12/22/2021 11:51:29 - INFO - codeparrot_training - Step 6536: {'lr': 0.00048016024578169045, 'samples': 3346944, 'steps': 6536, 'batch_loss/train': 0.8831102792173624}
12/22/2021 11:51:41 - INFO - codeparrot_training - Step 6537: {'lr': 0.00048015409123466, 'samples': 3347456, 'steps': 6537, 'batch_loss/train': 0.9259641813114285}
12/22/2021 11:51:51 - INFO - codeparrot_training - Step 6538: {'lr': 0.0004801479357726238, 'samples': 3347968, 'steps': 6538, 'batch_loss/train': 0.8023346997797489}
12/22/2021 11:52:02 - INFO - codeparrot_training - Step 6539: {'lr': 0.00048014177939560633, 'samples': 3348480, 'steps': 6539, 'batch_loss/train': 0.9371925867162645}
12/22/2021 11:52:16 - INFO - codeparrot_training - Step 6540: {'lr': 0.00048013562210363203, 'samples': 3348992, 'steps': 6540, 'batch_loss/train': 0.8310823012143373}
12/22/2021 11:52:26 - INFO - codeparrot_training - Step 6541: {'lr': 0.0004801294638967254, 'samples': 3349504, 'steps': 6541, 'batch_loss/train': 0.6563290029298514}
12/22/2021 11:52:37 - INFO - codeparrot_training - Step 6542: {'lr': 0.0004801233047749109, 'samples': 3350016, 'steps': 6542, 'batch_loss/train': 0.7974485498853028}
12/22/2021 11:52:49 - INFO - codeparrot_training - Step 6543: {'lr': 0.0004801171447382131, 'samples': 3350528, 'steps': 6543, 'batch_loss/train': 0.6015849245595746}
12/22/2021 11:53:00 - INFO - codeparrot_training - Step 6544: {'lr': 0.0004801109837866564, 'samples': 3351040, 'steps': 6544, 'batch_loss/train': 0.8408672786317766}
12/22/2021 11:53:10 - INFO - codeparrot_training - Step 6545: {'lr': 0.00048010482192026527, 'samples': 3351552, 'steps': 6545, 'batch_loss/train': 0.6802356601692736}
12/22/2021 11:53:21 - INFO - codeparrot_training - Step 6546: {'lr': 0.0004800986591390643, 'samples': 3352064, 'steps': 6546, 'batch_loss/train': 0.762760323472321}
12/22/2021 11:53:33 - INFO - codeparrot_training - Step 6547: {'lr': 0.0004800924954430779, 'samples': 3352576, 'steps': 6547, 'batch_loss/train': 0.845622206106782}
12/22/2021 11:53:44 - INFO - codeparrot_training - Step 6548: {'lr': 0.00048008633083233067, 'samples': 3353088, 'steps': 6548, 'batch_loss/train': 0.8358886810019612}
12/22/2021 11:53:54 - INFO - codeparrot_training - Step 6549: {'lr': 0.000480080165306847, 'samples': 3353600, 'steps': 6549, 'batch_loss/train': 0.7836395660415292}
12/22/2021 11:54:08 - INFO - codeparrot_training - Step 6550: {'lr': 0.00048007399886665147, 'samples': 3354112, 'steps': 6550, 'batch_loss/train': 0.7281373364385217}
12/22/2021 11:54:19 - INFO - codeparrot_training - Step 6551: {'lr': 0.00048006783151176866, 'samples': 3354624, 'steps': 6551, 'batch_loss/train': 0.8716560620814562}
12/22/2021 11:54:29 - INFO - codeparrot_training - Step 6552: {'lr': 0.000480061663242223, 'samples': 3355136, 'steps': 6552, 'batch_loss/train': 0.7980993064120412}
12/22/2021 11:54:42 - INFO - codeparrot_training - Step 6553: {'lr': 0.00048005549405803896, 'samples': 3355648, 'steps': 6553, 'batch_loss/train': 0.8059456087648869}
12/22/2021 11:54:52 - INFO - codeparrot_training - Step 6554: {'lr': 0.0004800493239592412, 'samples': 3356160, 'steps': 6554, 'batch_loss/train': 0.8407421838492155}
12/22/2021 11:55:03 - INFO - codeparrot_training - Step 6555: {'lr': 0.0004800431529458542, 'samples': 3356672, 'steps': 6555, 'batch_loss/train': 0.8103168671950698}
12/22/2021 11:55:13 - INFO - codeparrot_training - Step 6556: {'lr': 0.0004800369810179025, 'samples': 3357184, 'steps': 6556, 'batch_loss/train': 0.824755716137588}
12/22/2021 11:55:26 - INFO - codeparrot_training - Step 6557: {'lr': 0.00048003080817541054, 'samples': 3357696, 'steps': 6557, 'batch_loss/train': 0.6954219910549}
12/22/2021 11:55:36 - INFO - codeparrot_training - Step 6558: {'lr': 0.0004800246344184031, 'samples': 3358208, 'steps': 6558, 'batch_loss/train': 0.8877788013778627}
12/22/2021 11:55:47 - INFO - codeparrot_training - Step 6559: {'lr': 0.0004800184597469044, 'samples': 3358720, 'steps': 6559, 'batch_loss/train': 0.871194769628346}
12/22/2021 11:56:01 - INFO - codeparrot_training - Step 6560: {'lr': 0.00048001228416093923, 'samples': 3359232, 'steps': 6560, 'batch_loss/train': 0.872986956499517}
12/22/2021 11:56:11 - INFO - codeparrot_training - Step 6561: {'lr': 0.000480006107660532, 'samples': 3359744, 'steps': 6561, 'batch_loss/train': 0.754713820759207}
12/22/2021 11:56:22 - INFO - codeparrot_training - Step 6562: {'lr': 0.00047999993024570745, 'samples': 3360256, 'steps': 6562, 'batch_loss/train': 0.9015206377953291}
12/22/2021 11:56:34 - INFO - codeparrot_training - Step 6563: {'lr': 0.00047999375191648997, 'samples': 3360768, 'steps': 6563, 'batch_loss/train': 0.789495540317148}
12/22/2021 11:56:45 - INFO - codeparrot_training - Step 6564: {'lr': 0.0004799875726729042, 'samples': 3361280, 'steps': 6564, 'batch_loss/train': 0.7958039408549666}
12/22/2021 11:56:55 - INFO - codeparrot_training - Step 6565: {'lr': 0.00047998139251497455, 'samples': 3361792, 'steps': 6565, 'batch_loss/train': 0.8723706295713782}
12/22/2021 11:57:09 - INFO - codeparrot_training - Step 6566: {'lr': 0.00047997521144272585, 'samples': 3362304, 'steps': 6566, 'batch_loss/train': 0.8396282428875566}
12/22/2021 11:57:20 - INFO - codeparrot_training - Step 6567: {'lr': 0.0004799690294561825, 'samples': 3362816, 'steps': 6567, 'batch_loss/train': 0.7987316637299955}
12/22/2021 11:57:30 - INFO - codeparrot_training - Step 6568: {'lr': 0.00047996284655536916, 'samples': 3363328, 'steps': 6568, 'batch_loss/train': 0.9186257533729076}
12/22/2021 11:57:41 - INFO - codeparrot_training - Step 6569: {'lr': 0.00047995666274031035, 'samples': 3363840, 'steps': 6569, 'batch_loss/train': 0.8583105588331819}
12/22/2021 11:57:53 - INFO - codeparrot_training - Step 6570: {'lr': 0.00047995047801103064, 'samples': 3364352, 'steps': 6570, 'batch_loss/train': 0.7516571786254644}
12/22/2021 11:58:04 - INFO - codeparrot_training - Step 6571: {'lr': 0.00047994429236755475, 'samples': 3364864, 'steps': 6571, 'batch_loss/train': 0.8175930362194777}
12/22/2021 11:58:14 - INFO - codeparrot_training - Step 6572: {'lr': 0.0004799381058099071, 'samples': 3365376, 'steps': 6572, 'batch_loss/train': 0.6972866177093238}
12/22/2021 11:58:26 - INFO - codeparrot_training - Step 6573: {'lr': 0.0004799319183381124, 'samples': 3365888, 'steps': 6573, 'batch_loss/train': 0.7378065015655011}
12/22/2021 11:58:37 - INFO - codeparrot_training - Step 6574: {'lr': 0.0004799257299521952, 'samples': 3366400, 'steps': 6574, 'batch_loss/train': 0.8091222560033202}
12/22/2021 11:58:47 - INFO - codeparrot_training - Step 6575: {'lr': 0.00047991954065218013, 'samples': 3366912, 'steps': 6575, 'batch_loss/train': 0.8325409078970551}
12/22/2021 11:59:00 - INFO - codeparrot_training - Step 6576: {'lr': 0.0004799133504380917, 'samples': 3367424, 'steps': 6576, 'batch_loss/train': 0.8740324405953288}
12/22/2021 11:59:10 - INFO - codeparrot_training - Step 6577: {'lr': 0.00047990715930995467, 'samples': 3367936, 'steps': 6577, 'batch_loss/train': 0.845385548658669}
12/22/2021 11:59:21 - INFO - codeparrot_training - Step 6578: {'lr': 0.0004799009672677936, 'samples': 3368448, 'steps': 6578, 'batch_loss/train': 0.788119069300592}
12/22/2021 11:59:31 - INFO - codeparrot_training - Step 6579: {'lr': 0.000479894774311633, 'samples': 3368960, 'steps': 6579, 'batch_loss/train': 0.8799793850630522}
12/22/2021 11:59:45 - INFO - codeparrot_training - Step 6580: {'lr': 0.0004798885804414977, 'samples': 3369472, 'steps': 6580, 'batch_loss/train': 0.7844729535281658}
12/22/2021 11:59:56 - INFO - codeparrot_training - Step 6581: {'lr': 0.00047988238565741214, 'samples': 3369984, 'steps': 6581, 'batch_loss/train': 0.7408735044300556}
12/22/2021 12:00:06 - INFO - codeparrot_training - Step 6582: {'lr': 0.00047987618995940103, 'samples': 3370496, 'steps': 6582, 'batch_loss/train': 0.8681331118568778}
12/22/2021 12:00:19 - INFO - codeparrot_training - Step 6583: {'lr': 0.00047986999334748896, 'samples': 3371008, 'steps': 6583, 'batch_loss/train': 0.7157994201406837}
12/22/2021 12:00:29 - INFO - codeparrot_training - Step 6584: {'lr': 0.00047986379582170066, 'samples': 3371520, 'steps': 6584, 'batch_loss/train': 0.8424687068909407}
12/22/2021 12:00:40 - INFO - codeparrot_training - Step 6585: {'lr': 0.00047985759738206063, 'samples': 3372032, 'steps': 6585, 'batch_loss/train': 0.8462845655158162}
12/22/2021 12:00:52 - INFO - codeparrot_training - Step 6586: {'lr': 0.00047985139802859364, 'samples': 3372544, 'steps': 6586, 'batch_loss/train': 0.8108883909881115}
12/22/2021 12:01:02 - INFO - codeparrot_training - Step 6587: {'lr': 0.00047984519776132425, 'samples': 3373056, 'steps': 6587, 'batch_loss/train': 0.7954647438600659}
12/22/2021 12:01:13 - INFO - codeparrot_training - Step 6588: {'lr': 0.00047983899658027713, 'samples': 3373568, 'steps': 6588, 'batch_loss/train': 0.8991196639835835}
12/22/2021 12:01:27 - INFO - codeparrot_training - Step 6589: {'lr': 0.00047983279448547694, 'samples': 3374080, 'steps': 6589, 'batch_loss/train': 0.9235098687931895}
12/22/2021 12:01:37 - INFO - codeparrot_training - Step 6590: {'lr': 0.00047982659147694835, 'samples': 3374592, 'steps': 6590, 'batch_loss/train': 0.8802334545180202}
12/22/2021 12:01:48 - INFO - codeparrot_training - Step 6591: {'lr': 0.000479820387554716, 'samples': 3375104, 'steps': 6591, 'batch_loss/train': 0.8683155281469226}
12/22/2021 12:01:58 - INFO - codeparrot_training - Step 6592: {'lr': 0.0004798141827188046, 'samples': 3375616, 'steps': 6592, 'batch_loss/train': 0.8185287341475487}
12/22/2021 12:02:11 - INFO - codeparrot_training - Step 6593: {'lr': 0.00047980797696923875, 'samples': 3376128, 'steps': 6593, 'batch_loss/train': 0.8720998056232929}
12/22/2021 12:02:21 - INFO - codeparrot_training - Step 6594: {'lr': 0.0004798017703060432, 'samples': 3376640, 'steps': 6594, 'batch_loss/train': 0.8552739312872291}
12/22/2021 12:02:32 - INFO - codeparrot_training - Step 6595: {'lr': 0.0004797955627292425, 'samples': 3377152, 'steps': 6595, 'batch_loss/train': 0.7520619858405553}
12/22/2021 12:02:47 - INFO - codeparrot_training - Step 6596: {'lr': 0.0004797893542388615, 'samples': 3377664, 'steps': 6596, 'batch_loss/train': 0.8423955966718495}
12/22/2021 12:02:57 - INFO - codeparrot_training - Step 6597: {'lr': 0.0004797831448349248, 'samples': 3378176, 'steps': 6597, 'batch_loss/train': 0.9407830238342285}
12/22/2021 12:03:08 - INFO - codeparrot_training - Step 6598: {'lr': 0.000479776934517457, 'samples': 3378688, 'steps': 6598, 'batch_loss/train': 0.8279910013079643}
12/22/2021 12:03:20 - INFO - codeparrot_training - Step 6599: {'lr': 0.0004797707232864829, 'samples': 3379200, 'steps': 6599, 'batch_loss/train': 0.7905045347288251}
12/22/2021 12:03:30 - INFO - codeparrot_training - Step 6600: {'lr': 0.0004797645111420272, 'samples': 3379712, 'steps': 6600, 'batch_loss/train': 0.7537998054176569}
12/22/2021 12:03:41 - INFO - codeparrot_training - Step 6601: {'lr': 0.0004797582980841145, 'samples': 3380224, 'steps': 6601, 'batch_loss/train': 0.7428387608379126}
12/22/2021 12:03:52 - INFO - codeparrot_training - Step 6602: {'lr': 0.0004797520841127696, 'samples': 3380736, 'steps': 6602, 'batch_loss/train': 0.7383167487569153}
12/22/2021 12:04:04 - INFO - codeparrot_training - Step 6603: {'lr': 0.0004797458692280171, 'samples': 3381248, 'steps': 6603, 'batch_loss/train': 0.7983119413256645}
12/22/2021 12:04:14 - INFO - codeparrot_training - Step 6604: {'lr': 0.00047973965342988184, 'samples': 3381760, 'steps': 6604, 'batch_loss/train': 0.8428991194814444}
12/22/2021 12:04:25 - INFO - codeparrot_training - Step 6605: {'lr': 0.0004797334367183885, 'samples': 3382272, 'steps': 6605, 'batch_loss/train': 0.7371601425111294}
12/22/2021 12:04:39 - INFO - codeparrot_training - Step 6606: {'lr': 0.00047972721909356165, 'samples': 3382784, 'steps': 6606, 'batch_loss/train': 0.7980085076997057}
12/22/2021 12:04:50 - INFO - codeparrot_training - Step 6607: {'lr': 0.0004797210005554262, 'samples': 3383296, 'steps': 6607, 'batch_loss/train': 0.8121307883411646}
12/22/2021 12:05:00 - INFO - codeparrot_training - Step 6608: {'lr': 0.0004797147811040068, 'samples': 3383808, 'steps': 6608, 'batch_loss/train': 0.8516832836903632}
12/22/2021 12:05:12 - INFO - codeparrot_training - Step 6609: {'lr': 0.0004797085607393281, 'samples': 3384320, 'steps': 6609, 'batch_loss/train': 1.0008364543318748}
12/22/2021 12:05:23 - INFO - codeparrot_training - Step 6610: {'lr': 0.00047970233946141496, 'samples': 3384832, 'steps': 6610, 'batch_loss/train': 0.8285148376598954}
12/22/2021 12:05:34 - INFO - codeparrot_training - Step 6611: {'lr': 0.000479696117270292, 'samples': 3385344, 'steps': 6611, 'batch_loss/train': 0.9523425670340657}
12/22/2021 12:05:44 - INFO - codeparrot_training - Step 6612: {'lr': 0.00047968989416598403, 'samples': 3385856, 'steps': 6612, 'batch_loss/train': 0.8643765710294247}
12/22/2021 12:05:57 - INFO - codeparrot_training - Step 6613: {'lr': 0.0004796836701485158, 'samples': 3386368, 'steps': 6613, 'batch_loss/train': 0.833330177469179}
12/22/2021 12:06:08 - INFO - codeparrot_training - Step 6614: {'lr': 0.00047967744521791193, 'samples': 3386880, 'steps': 6614, 'batch_loss/train': 0.8151707267388701}
12/22/2021 12:06:18 - INFO - codeparrot_training - Step 6615: {'lr': 0.00047967121937419734, 'samples': 3387392, 'steps': 6615, 'batch_loss/train': 0.7556912427535281}
12/22/2021 12:06:30 - INFO - codeparrot_training - Step 6616: {'lr': 0.00047966499261739673, 'samples': 3387904, 'steps': 6616, 'batch_loss/train': 0.9394516916945577}
12/22/2021 12:06:41 - INFO - codeparrot_training - Step 6617: {'lr': 0.0004796587649475347, 'samples': 3388416, 'steps': 6617, 'batch_loss/train': 0.8631193097680807}
12/22/2021 12:06:52 - INFO - codeparrot_training - Step 6618: {'lr': 0.00047965253636463625, 'samples': 3388928, 'steps': 6618, 'batch_loss/train': 0.9158620685338974}
12/22/2021 12:07:04 - INFO - codeparrot_training - Step 6619: {'lr': 0.00047964630686872604, 'samples': 3389440, 'steps': 6619, 'batch_loss/train': 0.8316696649417281}
12/22/2021 12:07:14 - INFO - codeparrot_training - Step 6620: {'lr': 0.0004796400764598288, 'samples': 3389952, 'steps': 6620, 'batch_loss/train': 0.7768376637250185}
12/22/2021 12:07:25 - INFO - codeparrot_training - Step 6621: {'lr': 0.00047963384513796924, 'samples': 3390464, 'steps': 6621, 'batch_loss/train': 0.8940996006131172}
12/22/2021 12:07:39 - INFO - codeparrot_training - Step 6622: {'lr': 0.0004796276129031723, 'samples': 3390976, 'steps': 6622, 'batch_loss/train': 0.8576382268220186}
12/22/2021 12:07:50 - INFO - codeparrot_training - Step 6623: {'lr': 0.0004796213797554626, 'samples': 3391488, 'steps': 6623, 'batch_loss/train': 0.8086680417181924}
12/22/2021 12:08:00 - INFO - codeparrot_training - Step 6624: {'lr': 0.0004796151456948651, 'samples': 3392000, 'steps': 6624, 'batch_loss/train': 0.9125053081661463}
12/22/2021 12:08:12 - INFO - codeparrot_training - Step 6625: {'lr': 0.00047960891072140445, 'samples': 3392512, 'steps': 6625, 'batch_loss/train': 0.9070598557591438}
12/22/2021 12:08:23 - INFO - codeparrot_training - Step 6626: {'lr': 0.0004796026748351054, 'samples': 3393024, 'steps': 6626, 'batch_loss/train': 0.9092798996716738}
12/22/2021 12:08:34 - INFO - codeparrot_training - Step 6627: {'lr': 0.0004795964380359929, 'samples': 3393536, 'steps': 6627, 'batch_loss/train': 0.9292844156734645}
12/22/2021 12:08:44 - INFO - codeparrot_training - Step 6628: {'lr': 0.00047959020032409163, 'samples': 3394048, 'steps': 6628, 'batch_loss/train': 0.8573224525898695}
12/22/2021 12:08:56 - INFO - codeparrot_training - Step 6629: {'lr': 0.00047958396169942637, 'samples': 3394560, 'steps': 6629, 'batch_loss/train': 0.8801091006025672}
12/22/2021 12:09:07 - INFO - codeparrot_training - Step 6630: {'lr': 0.000479577722162022, 'samples': 3395072, 'steps': 6630, 'batch_loss/train': 0.5348422068054788}
12/22/2021 12:09:18 - INFO - codeparrot_training - Step 6631: {'lr': 0.0004795714817119033, 'samples': 3395584, 'steps': 6631, 'batch_loss/train': 0.7387833877000958}
12/22/2021 12:09:30 - INFO - codeparrot_training - Step 6632: {'lr': 0.00047956524034909506, 'samples': 3396096, 'steps': 6632, 'batch_loss/train': 0.7473134063184261}
12/22/2021 12:09:40 - INFO - codeparrot_training - Step 6633: {'lr': 0.0004795589980736221, 'samples': 3396608, 'steps': 6633, 'batch_loss/train': 0.8767468724399805}
12/22/2021 12:09:51 - INFO - codeparrot_training - Step 6634: {'lr': 0.0004795527548855092, 'samples': 3397120, 'steps': 6634, 'batch_loss/train': 0.8057214226573706}
12/22/2021 12:10:05 - INFO - codeparrot_training - Step 6635: {'lr': 0.0004795465107847813, 'samples': 3397632, 'steps': 6635, 'batch_loss/train': 0.6897730445489287}
12/22/2021 12:10:15 - INFO - codeparrot_training - Step 6636: {'lr': 0.00047954026577146314, 'samples': 3398144, 'steps': 6636, 'batch_loss/train': 0.8867252208292484}
12/22/2021 12:10:26 - INFO - codeparrot_training - Step 6637: {'lr': 0.00047953401984557955, 'samples': 3398656, 'steps': 6637, 'batch_loss/train': 0.8981460349168628}
12/22/2021 12:10:36 - INFO - codeparrot_training - Step 6638: {'lr': 0.00047952777300715535, 'samples': 3399168, 'steps': 6638, 'batch_loss/train': 0.8120677488623187}
12/22/2021 12:10:49 - INFO - codeparrot_training - Step 6639: {'lr': 0.0004795215252562153, 'samples': 3399680, 'steps': 6639, 'batch_loss/train': 0.8384194606915116}
12/22/2021 12:10:59 - INFO - codeparrot_training - Step 6640: {'lr': 0.00047951527659278445, 'samples': 3400192, 'steps': 6640, 'batch_loss/train': 0.8689128672704101}
12/22/2021 12:11:10 - INFO - codeparrot_training - Step 6641: {'lr': 0.00047950902701688746, 'samples': 3400704, 'steps': 6641, 'batch_loss/train': 0.8730633165687323}
12/22/2021 12:11:24 - INFO - codeparrot_training - Step 6642: {'lr': 0.0004795027765285493, 'samples': 3401216, 'steps': 6642, 'batch_loss/train': 0.838439071783796}
12/22/2021 12:11:34 - INFO - codeparrot_training - Step 6643: {'lr': 0.00047949652512779476, 'samples': 3401728, 'steps': 6643, 'batch_loss/train': 0.8151087630540133}
12/22/2021 12:11:45 - INFO - codeparrot_training - Step 6644: {'lr': 0.00047949027281464864, 'samples': 3402240, 'steps': 6644, 'batch_loss/train': 0.7506071841344237}
12/22/2021 12:11:57 - INFO - codeparrot_training - Step 6645: {'lr': 0.0004794840195891358, 'samples': 3402752, 'steps': 6645, 'batch_loss/train': 0.7499136226251721}
12/22/2021 12:12:08 - INFO - codeparrot_training - Step 6646: {'lr': 0.00047947776545128126, 'samples': 3403264, 'steps': 6646, 'batch_loss/train': 0.8802944244816899}
12/22/2021 12:12:18 - INFO - codeparrot_training - Step 6647: {'lr': 0.0004794715104011097, 'samples': 3403776, 'steps': 6647, 'batch_loss/train': 0.845320382155478}
12/22/2021 12:12:29 - INFO - codeparrot_training - Step 6648: {'lr': 0.0004794652544386461, 'samples': 3404288, 'steps': 6648, 'batch_loss/train': 0.7090343572199345}
12/22/2021 12:12:41 - INFO - codeparrot_training - Step 6649: {'lr': 0.00047945899756391525, 'samples': 3404800, 'steps': 6649, 'batch_loss/train': 0.8955608708783984}
12/22/2021 12:12:52 - INFO - codeparrot_training - Step 6650: {'lr': 0.0004794527397769421, 'samples': 3405312, 'steps': 6650, 'batch_loss/train': 0.815421260194853}
12/22/2021 12:13:02 - INFO - codeparrot_training - Step 6651: {'lr': 0.00047944648107775144, 'samples': 3405824, 'steps': 6651, 'batch_loss/train': 0.787367717246525}
12/22/2021 12:13:16 - INFO - codeparrot_training - Step 6652: {'lr': 0.0004794402214663682, 'samples': 3406336, 'steps': 6652, 'batch_loss/train': 0.8822357328608632}
12/22/2021 12:13:27 - INFO - codeparrot_training - Step 6653: {'lr': 0.00047943396094281737, 'samples': 3406848, 'steps': 6653, 'batch_loss/train': 0.8667261423543096}
12/22/2021 12:13:37 - INFO - codeparrot_training - Step 6654: {'lr': 0.0004794276995071237, 'samples': 3407360, 'steps': 6654, 'batch_loss/train': 0.8963458030484617}
12/22/2021 12:13:50 - INFO - codeparrot_training - Step 6655: {'lr': 0.0004794214371593121, 'samples': 3407872, 'steps': 6655, 'batch_loss/train': 0.7649100758135319}
12/22/2021 12:14:00 - INFO - codeparrot_training - Step 6656: {'lr': 0.00047941517389940753, 'samples': 3408384, 'steps': 6656, 'batch_loss/train': 0.7401067912578583}
12/22/2021 12:14:11 - INFO - codeparrot_training - Step 6657: {'lr': 0.00047940890972743484, 'samples': 3408896, 'steps': 6657, 'batch_loss/train': 0.7931829812005162}
12/22/2021 12:14:23 - INFO - codeparrot_training - Step 6658: {'lr': 0.00047940264464341897, 'samples': 3409408, 'steps': 6658, 'batch_loss/train': 0.7058939817361534}
12/22/2021 12:14:34 - INFO - codeparrot_training - Step 6659: {'lr': 0.0004793963786473847, 'samples': 3409920, 'steps': 6659, 'batch_loss/train': 0.7958652912639081}
12/22/2021 12:14:44 - INFO - codeparrot_training - Step 6660: {'lr': 0.0004793901117393572, 'samples': 3410432, 'steps': 6660, 'batch_loss/train': 0.7744211694225669}
12/22/2021 12:14:55 - INFO - codeparrot_training - Step 6661: {'lr': 0.0004793838439193611, 'samples': 3410944, 'steps': 6661, 'batch_loss/train': 0.8063936731778085}
12/22/2021 12:15:09 - INFO - codeparrot_training - Step 6662: {'lr': 0.0004793775751874215, 'samples': 3411456, 'steps': 6662, 'batch_loss/train': 1.110158664174378}
12/22/2021 12:15:19 - INFO - codeparrot_training - Step 6663: {'lr': 0.0004793713055435633, 'samples': 3411968, 'steps': 6663, 'batch_loss/train': 0.8803553627803922}
12/22/2021 12:15:30 - INFO - codeparrot_training - Step 6664: {'lr': 0.00047936503498781136, 'samples': 3412480, 'steps': 6664, 'batch_loss/train': 0.8245450705289841}
12/22/2021 12:15:42 - INFO - codeparrot_training - Step 6665: {'lr': 0.0004793587635201906, 'samples': 3412992, 'steps': 6665, 'batch_loss/train': 0.8441961393691599}
12/22/2021 12:15:53 - INFO - codeparrot_training - Step 6666: {'lr': 0.0004793524911407261, 'samples': 3413504, 'steps': 6666, 'batch_loss/train': 0.9181681238114834}
12/22/2021 12:16:03 - INFO - codeparrot_training - Step 6667: {'lr': 0.00047934621784944266, 'samples': 3414016, 'steps': 6667, 'batch_loss/train': 0.7748170434497297}
12/22/2021 12:16:17 - INFO - codeparrot_training - Step 6668: {'lr': 0.0004793399436463652, 'samples': 3414528, 'steps': 6668, 'batch_loss/train': 0.7913466775789857}
12/22/2021 12:16:27 - INFO - codeparrot_training - Step 6669: {'lr': 0.00047933366853151874, 'samples': 3415040, 'steps': 6669, 'batch_loss/train': 0.8912379620596766}
12/22/2021 12:16:38 - INFO - codeparrot_training - Step 6670: {'lr': 0.00047932739250492823, 'samples': 3415552, 'steps': 6670, 'batch_loss/train': 0.869252453558147}
12/22/2021 12:16:49 - INFO - codeparrot_training - Step 6671: {'lr': 0.0004793211155666186, 'samples': 3416064, 'steps': 6671, 'batch_loss/train': 0.8327503064647317}
12/22/2021 12:17:01 - INFO - codeparrot_training - Step 6672: {'lr': 0.00047931483771661477, 'samples': 3416576, 'steps': 6672, 'batch_loss/train': 0.9308084156364202}
12/22/2021 12:17:11 - INFO - codeparrot_training - Step 6673: {'lr': 0.0004793085589549418, 'samples': 3417088, 'steps': 6673, 'batch_loss/train': 0.8171251607127488}
12/22/2021 12:17:22 - INFO - codeparrot_training - Step 6674: {'lr': 0.0004793022792816245, 'samples': 3417600, 'steps': 6674, 'batch_loss/train': 0.8893222352489829}
12/22/2021 12:17:34 - INFO - codeparrot_training - Step 6675: {'lr': 0.000479295998696688, 'samples': 3418112, 'steps': 6675, 'batch_loss/train': 0.9709283309057355}
12/22/2021 12:17:45 - INFO - codeparrot_training - Step 6676: {'lr': 0.0004792897172001571, 'samples': 3418624, 'steps': 6676, 'batch_loss/train': 0.8744691722095013}
12/22/2021 12:17:55 - INFO - codeparrot_training - Step 6677: {'lr': 0.00047928343479205696, 'samples': 3419136, 'steps': 6677, 'batch_loss/train': 0.9372304994612932}
12/22/2021 12:18:07 - INFO - codeparrot_training - Step 6678: {'lr': 0.00047927715147241237, 'samples': 3419648, 'steps': 6678, 'batch_loss/train': 0.8878528540953994}
12/22/2021 12:18:18 - INFO - codeparrot_training - Step 6679: {'lr': 0.00047927086724124844, 'samples': 3420160, 'steps': 6679, 'batch_loss/train': 0.7772411885671318}
12/22/2021 12:18:29 - INFO - codeparrot_training - Step 6680: {'lr': 0.00047926458209859017, 'samples': 3420672, 'steps': 6680, 'batch_loss/train': 1.633176039904356}
12/22/2021 12:18:39 - INFO - codeparrot_training - Step 6681: {'lr': 0.00047925829604446237, 'samples': 3421184, 'steps': 6681, 'batch_loss/train': 0.7953078071586788}
12/22/2021 12:18:53 - INFO - codeparrot_training - Step 6682: {'lr': 0.0004792520090788902, 'samples': 3421696, 'steps': 6682, 'batch_loss/train': 0.9507204899564385}
12/22/2021 12:19:04 - INFO - codeparrot_training - Step 6683: {'lr': 0.0004792457212018986, 'samples': 3422208, 'steps': 6683, 'batch_loss/train': 0.8730300273746252}
12/22/2021 12:19:15 - INFO - codeparrot_training - Step 6684: {'lr': 0.0004792394324135126, 'samples': 3422720, 'steps': 6684, 'batch_loss/train': 0.8116583544760942}
12/22/2021 12:19:27 - INFO - codeparrot_training - Step 6685: {'lr': 0.0004792331427137572, 'samples': 3423232, 'steps': 6685, 'batch_loss/train': 0.7985582710243762}
12/22/2021 12:19:37 - INFO - codeparrot_training - Step 6686: {'lr': 0.0004792268521026573, 'samples': 3423744, 'steps': 6686, 'batch_loss/train': 0.7691714437678456}
12/22/2021 12:19:48 - INFO - codeparrot_training - Step 6687: {'lr': 0.000479220560580238, 'samples': 3424256, 'steps': 6687, 'batch_loss/train': 0.8441241255495697}
12/22/2021 12:20:00 - INFO - codeparrot_training - Step 6688: {'lr': 0.00047921426814652435, 'samples': 3424768, 'steps': 6688, 'batch_loss/train': 0.8459569751285017}
12/22/2021 12:20:11 - INFO - codeparrot_training - Step 6689: {'lr': 0.0004792079748015413, 'samples': 3425280, 'steps': 6689, 'batch_loss/train': 0.8890618747100234}
12/22/2021 12:20:21 - INFO - codeparrot_training - Step 6690: {'lr': 0.0004792016805453139, 'samples': 3425792, 'steps': 6690, 'batch_loss/train': 0.8945514913648367}
12/22/2021 12:20:32 - INFO - codeparrot_training - Step 6691: {'lr': 0.0004791953853778671, 'samples': 3426304, 'steps': 6691, 'batch_loss/train': 0.8475206852890551}
12/22/2021 12:20:46 - INFO - codeparrot_training - Step 6692: {'lr': 0.000479189089299226, 'samples': 3426816, 'steps': 6692, 'batch_loss/train': 0.9143191818147898}
12/22/2021 12:20:56 - INFO - codeparrot_training - Step 6693: {'lr': 0.00047918279230941565, 'samples': 3427328, 'steps': 6693, 'batch_loss/train': 0.8764141201972961}
12/22/2021 12:21:07 - INFO - codeparrot_training - Step 6694: {'lr': 0.00047917649440846105, 'samples': 3427840, 'steps': 6694, 'batch_loss/train': 0.698559440264944}
12/22/2021 12:21:19 - INFO - codeparrot_training - Step 6695: {'lr': 0.00047917019559638724, 'samples': 3428352, 'steps': 6695, 'batch_loss/train': 0.8499699989333749}
12/22/2021 12:21:30 - INFO - codeparrot_training - Step 6696: {'lr': 0.0004791638958732193, 'samples': 3428864, 'steps': 6696, 'batch_loss/train': 0.8389962243381888}
12/22/2021 12:21:40 - INFO - codeparrot_training - Step 6697: {'lr': 0.00047915759523898215, 'samples': 3429376, 'steps': 6697, 'batch_loss/train': 0.8225814942270517}
12/22/2021 12:21:54 - INFO - codeparrot_training - Step 6698: {'lr': 0.000479151293693701, 'samples': 3429888, 'steps': 6698, 'batch_loss/train': 0.7634328946005553}
12/22/2021 12:22:05 - INFO - codeparrot_training - Step 6699: {'lr': 0.00047914499123740084, 'samples': 3430400, 'steps': 6699, 'batch_loss/train': 0.8184032525168732}
12/22/2021 12:22:15 - INFO - codeparrot_training - Step 6700: {'lr': 0.00047913868787010673, 'samples': 3430912, 'steps': 6700, 'batch_loss/train': 0.9617364453151822}
12/22/2021 12:22:26 - INFO - codeparrot_training - Step 6701: {'lr': 0.0004791323835918436, 'samples': 3431424, 'steps': 6701, 'batch_loss/train': 0.8266657744534314}
12/22/2021 12:22:38 - INFO - codeparrot_training - Step 6702: {'lr': 0.0004791260784026368, 'samples': 3431936, 'steps': 6702, 'batch_loss/train': 0.8688592547550797}
12/22/2021 12:22:49 - INFO - codeparrot_training - Step 6703: {'lr': 0.0004791197723025111, 'samples': 3432448, 'steps': 6703, 'batch_loss/train': 0.808355092536658}
12/22/2021 12:22:59 - INFO - codeparrot_training - Step 6704: {'lr': 0.0004791134652914917, 'samples': 3432960, 'steps': 6704, 'batch_loss/train': 0.7593507673591375}
12/22/2021 12:23:11 - INFO - codeparrot_training - Step 6705: {'lr': 0.0004791071573696038, 'samples': 3433472, 'steps': 6705, 'batch_loss/train': 0.7329377420246601}
12/22/2021 12:23:22 - INFO - codeparrot_training - Step 6706: {'lr': 0.0004791008485368722, 'samples': 3433984, 'steps': 6706, 'batch_loss/train': 0.9120721314102411}
12/22/2021 12:23:33 - INFO - codeparrot_training - Step 6707: {'lr': 0.00047909453879332214, 'samples': 3434496, 'steps': 6707, 'batch_loss/train': 0.7169462903402746}
12/22/2021 12:23:46 - INFO - codeparrot_training - Step 6708: {'lr': 0.0004790882281389788, 'samples': 3435008, 'steps': 6708, 'batch_loss/train': 1.5870802593417466}
12/22/2021 12:23:57 - INFO - codeparrot_training - Step 6709: {'lr': 0.00047908191657386706, 'samples': 3435520, 'steps': 6709, 'batch_loss/train': 0.8075016057118773}
12/22/2021 12:24:08 - INFO - codeparrot_training - Step 6710: {'lr': 0.00047907560409801215, 'samples': 3436032, 'steps': 6710, 'batch_loss/train': 0.8019827511161566}
12/22/2021 12:24:18 - INFO - codeparrot_training - Step 6711: {'lr': 0.0004790692907114392, 'samples': 3436544, 'steps': 6711, 'batch_loss/train': 0.8039377955719829}
12/22/2021 12:24:31 - INFO - codeparrot_training - Step 6712: {'lr': 0.0004790629764141732, 'samples': 3437056, 'steps': 6712, 'batch_loss/train': 0.9075028225779533}
12/22/2021 12:24:41 - INFO - codeparrot_training - Step 6713: {'lr': 0.00047905666120623923, 'samples': 3437568, 'steps': 6713, 'batch_loss/train': 0.8866673132870346}
12/22/2021 12:24:52 - INFO - codeparrot_training - Step 6714: {'lr': 0.00047905034508766256, 'samples': 3438080, 'steps': 6714, 'batch_loss/train': 0.8336760341189802}
12/22/2021 12:25:04 - INFO - codeparrot_training - Step 6715: {'lr': 0.0004790440280584681, 'samples': 3438592, 'steps': 6715, 'batch_loss/train': 0.8323878622613847}
12/22/2021 12:25:15 - INFO - codeparrot_training - Step 6716: {'lr': 0.0004790377101186811, 'samples': 3439104, 'steps': 6716, 'batch_loss/train': 0.8008340196684003}
12/22/2021 12:25:25 - INFO - codeparrot_training - Step 6717: {'lr': 0.00047903139126832675, 'samples': 3439616, 'steps': 6717, 'batch_loss/train': 0.900247959420085}
12/22/2021 12:25:37 - INFO - codeparrot_training - Step 6718: {'lr': 0.00047902507150743, 'samples': 3440128, 'steps': 6718, 'batch_loss/train': 0.7199329206487164}
12/22/2021 12:25:48 - INFO - codeparrot_training - Step 6719: {'lr': 0.00047901875083601603, 'samples': 3440640, 'steps': 6719, 'batch_loss/train': 0.8597286595031619}
12/22/2021 12:25:58 - INFO - codeparrot_training - Step 6720: {'lr': 0.00047901242925411003, 'samples': 3441152, 'steps': 6720, 'batch_loss/train': 0.7717309249565005}
12/22/2021 12:26:09 - INFO - codeparrot_training - Step 6721: {'lr': 0.000479006106761737, 'samples': 3441664, 'steps': 6721, 'batch_loss/train': 0.7905892003327608}
12/22/2021 12:26:23 - INFO - codeparrot_training - Step 6722: {'lr': 0.00047899978335892224, 'samples': 3442176, 'steps': 6722, 'batch_loss/train': 0.8358592325821519}
12/22/2021 12:26:34 - INFO - codeparrot_training - Step 6723: {'lr': 0.00047899345904569077, 'samples': 3442688, 'steps': 6723, 'batch_loss/train': 0.7606317410245538}
12/22/2021 12:26:44 - INFO - codeparrot_training - Step 6724: {'lr': 0.0004789871338220677, 'samples': 3443200, 'steps': 6724, 'batch_loss/train': 0.7699350656475872}
12/22/2021 12:26:56 - INFO - codeparrot_training - Step 6725: {'lr': 0.00047898080768807836, 'samples': 3443712, 'steps': 6725, 'batch_loss/train': 0.8539029667153955}
12/22/2021 12:27:07 - INFO - codeparrot_training - Step 6726: {'lr': 0.0004789744806437477, 'samples': 3444224, 'steps': 6726, 'batch_loss/train': 0.6918845674954355}
12/22/2021 12:27:17 - INFO - codeparrot_training - Step 6727: {'lr': 0.0004789681526891011, 'samples': 3444736, 'steps': 6727, 'batch_loss/train': 0.8756737941876054}
12/22/2021 12:27:31 - INFO - codeparrot_training - Step 6728: {'lr': 0.0004789618238241634, 'samples': 3445248, 'steps': 6728, 'batch_loss/train': 0.8100194856524467}
12/22/2021 12:27:42 - INFO - codeparrot_training - Step 6729: {'lr': 0.0004789554940489601, 'samples': 3445760, 'steps': 6729, 'batch_loss/train': 1.0379635589197278}
12/22/2021 12:27:52 - INFO - codeparrot_training - Step 6730: {'lr': 0.0004789491633635161, 'samples': 3446272, 'steps': 6730, 'batch_loss/train': 0.8277869899757206}
12/22/2021 12:28:05 - INFO - codeparrot_training - Step 6731: {'lr': 0.00047894283176785674, 'samples': 3446784, 'steps': 6731, 'batch_loss/train': 0.8573962403461337}
12/22/2021 12:28:15 - INFO - codeparrot_training - Step 6732: {'lr': 0.0004789364992620071, 'samples': 3447296, 'steps': 6732, 'batch_loss/train': 1.6077063742559403}
12/22/2021 12:28:26 - INFO - codeparrot_training - Step 6733: {'lr': 0.00047893016584599235, 'samples': 3447808, 'steps': 6733, 'batch_loss/train': 0.7629845291376114}
12/22/2021 12:28:37 - INFO - codeparrot_training - Step 6734: {'lr': 0.00047892383151983776, 'samples': 3448320, 'steps': 6734, 'batch_loss/train': 0.7715491368435323}
12/22/2021 12:28:49 - INFO - codeparrot_training - Step 6735: {'lr': 0.0004789174962835684, 'samples': 3448832, 'steps': 6735, 'batch_loss/train': 0.8773106057196856}
12/22/2021 12:29:00 - INFO - codeparrot_training - Step 6736: {'lr': 0.0004789111601372096, 'samples': 3449344, 'steps': 6736, 'batch_loss/train': 0.9426577761769295}
12/22/2021 12:29:10 - INFO - codeparrot_training - Step 6737: {'lr': 0.0004789048230807863, 'samples': 3449856, 'steps': 6737, 'batch_loss/train': 0.8530695559456944}
12/22/2021 12:29:24 - INFO - codeparrot_training - Step 6738: {'lr': 0.000478898485114324, 'samples': 3450368, 'steps': 6738, 'batch_loss/train': 0.6180729759507813}
12/22/2021 12:29:35 - INFO - codeparrot_training - Step 6739: {'lr': 0.0004788921462378477,
'samples': 3450880, 'steps': 6739, 'batch_loss/train': 0.8964958526194096} 12/22/2021 12:29:46 - INFO - codeparrot_training - Step 6740: {'lr': 0.0004788858064513827, 'samples': 3451392, 'steps': 6740, 'batch_loss/train': 0.7252804897725582} 12/22/2021 12:29:58 - INFO - codeparrot_training - Step 6741: {'lr': 0.00047887946575495406, 'samples': 3451904, 'steps': 6741, 'batch_loss/train': 1.0338693531230092} 12/22/2021 12:30:08 - INFO - codeparrot_training - Step 6742: {'lr': 0.00047887312414858713, 'samples': 3452416, 'steps': 6742, 'batch_loss/train': 0.8662549993023276} 12/22/2021 12:30:19 - INFO - codeparrot_training - Step 6743: {'lr': 0.0004788667816323071, 'samples': 3452928, 'steps': 6743, 'batch_loss/train': 1.008405588567257} 12/22/2021 12:30:29 - INFO - codeparrot_training - Step 6744: {'lr': 0.00047886043820613913, 'samples': 3453440, 'steps': 6744, 'batch_loss/train': 0.8503370061516762} 12/22/2021 12:30:42 - INFO - codeparrot_training - Step 6745: {'lr': 0.0004788540938701085, 'samples': 3453952, 'steps': 6745, 'batch_loss/train': 0.9318075561895967} 12/22/2021 12:30:52 - INFO - codeparrot_training - Step 6746: {'lr': 0.0004788477486242404, 'samples': 3454464, 'steps': 6746, 'batch_loss/train': 0.8709158878773451} 12/22/2021 12:31:03 - INFO - codeparrot_training - Step 6747: {'lr': 0.00047884140246856014, 'samples': 3454976, 'steps': 6747, 'batch_loss/train': 0.9081109669059515} 12/22/2021 12:31:17 - INFO - codeparrot_training - Step 6748: {'lr': 0.0004788350554030927, 'samples': 3455488, 'steps': 6748, 'batch_loss/train': 0.9437629217281938} 12/22/2021 12:31:27 - INFO - codeparrot_training - Step 6749: {'lr': 0.0004788287074278636, 'samples': 3456000, 'steps': 6749, 'batch_loss/train': 0.8704372067004442} 12/22/2021 12:31:38 - INFO - codeparrot_training - Step 6750: {'lr': 0.0004788223585428979, 'samples': 3456512, 'steps': 6750, 'batch_loss/train': 0.8326106816530228} 12/22/2021 12:31:50 - INFO - codeparrot_training - Step 6751: {'lr': 
0.00047881600874822097, 'samples': 3457024, 'steps': 6751, 'batch_loss/train': 0.8833223273977637} 12/22/2021 12:32:00 - INFO - codeparrot_training - Step 6752: {'lr': 0.00047880965804385797, 'samples': 3457536, 'steps': 6752, 'batch_loss/train': 0.912467454560101} 12/22/2021 12:32:11 - INFO - codeparrot_training - Step 6753: {'lr': 0.0004788033064298342, 'samples': 3458048, 'steps': 6753, 'batch_loss/train': 0.8174044024199247} 12/22/2021 12:32:22 - INFO - codeparrot_training - Step 6754: {'lr': 0.00047879695390617473, 'samples': 3458560, 'steps': 6754, 'batch_loss/train': 0.8094499707221985} 12/22/2021 12:32:34 - INFO - codeparrot_training - Step 6755: {'lr': 0.00047879060047290506, 'samples': 3459072, 'steps': 6755, 'batch_loss/train': 0.9030839670449495} 12/22/2021 12:32:45 - INFO - codeparrot_training - Step 6756: {'lr': 0.0004787842461300503, 'samples': 3459584, 'steps': 6756, 'batch_loss/train': 0.9659704491496086} 12/22/2021 12:32:55 - INFO - codeparrot_training - Step 6757: {'lr': 0.0004787778908776359, 'samples': 3460096, 'steps': 6757, 'batch_loss/train': 0.8172312956303358} 12/22/2021 12:33:09 - INFO - codeparrot_training - Step 6758: {'lr': 0.0004787715347156869, 'samples': 3460608, 'steps': 6758, 'batch_loss/train': 0.899753600358963} 12/22/2021 12:33:20 - INFO - codeparrot_training - Step 6759: {'lr': 0.0004787651776442287, 'samples': 3461120, 'steps': 6759, 'batch_loss/train': 0.9523409362882376} 12/22/2021 12:33:30 - INFO - codeparrot_training - Step 6760: {'lr': 0.00047875881966328644, 'samples': 3461632, 'steps': 6760, 'batch_loss/train': 1.0059047173708677} 12/22/2021 12:33:43 - INFO - codeparrot_training - Step 6761: {'lr': 0.00047875246077288554, 'samples': 3462144, 'steps': 6761, 'batch_loss/train': 0.9257979704998434} 12/22/2021 12:33:53 - INFO - codeparrot_training - Step 6762: {'lr': 0.0004787461009730512, 'samples': 3462656, 'steps': 6762, 'batch_loss/train': 0.9077563462778926} 12/22/2021 12:34:04 - INFO - codeparrot_training - Step 
6763: {'lr': 0.00047873974026380875, 'samples': 3463168, 'steps': 6763, 'batch_loss/train': 0.8598671574145555} 12/22/2021 12:34:14 - INFO - codeparrot_training - Step 6764: {'lr': 0.0004787333786451835, 'samples': 3463680, 'steps': 6764, 'batch_loss/train': 0.8090655636042356} 12/22/2021 12:34:27 - INFO - codeparrot_training - Step 6765: {'lr': 0.0004787270161172007, 'samples': 3464192, 'steps': 6765, 'batch_loss/train': 0.9125288762152195} 12/22/2021 12:34:37 - INFO - codeparrot_training - Step 6766: {'lr': 0.00047872065267988564, 'samples': 3464704, 'steps': 6766, 'batch_loss/train': 0.8347770124673843} 12/22/2021 12:34:48 - INFO - codeparrot_training - Step 6767: {'lr': 0.0004787142883332636, 'samples': 3465216, 'steps': 6767, 'batch_loss/train': 0.8921569529920816} 12/22/2021 12:35:02 - INFO - codeparrot_training - Step 6768: {'lr': 0.00047870792307735987, 'samples': 3465728, 'steps': 6768, 'batch_loss/train': 0.9868333488702774} 12/22/2021 12:35:13 - INFO - codeparrot_training - Step 6769: {'lr': 0.00047870155691219984, 'samples': 3466240, 'steps': 6769, 'batch_loss/train': 0.7568963142111897} 12/22/2021 12:35:23 - INFO - codeparrot_training - Step 6770: {'lr': 0.0004786951898378088, 'samples': 3466752, 'steps': 6770, 'batch_loss/train': 0.7259195246733725} 12/22/2021 12:35:35 - INFO - codeparrot_training - Step 6771: {'lr': 0.000478688821854212, 'samples': 3467264, 'steps': 6771, 'batch_loss/train': 0.9847712982445955} 12/22/2021 12:35:46 - INFO - codeparrot_training - Step 6772: {'lr': 0.0004786824529614349, 'samples': 3467776, 'steps': 6772, 'batch_loss/train': 0.9273947365581989} 12/22/2021 12:35:57 - INFO - codeparrot_training - Step 6773: {'lr': 0.0004786760831595025, 'samples': 3468288, 'steps': 6773, 'batch_loss/train': 0.8448646059259772} 12/22/2021 12:36:08 - INFO - codeparrot_training - Step 6774: {'lr': 0.0004786697124484405, 'samples': 3468800, 'steps': 6774, 'batch_loss/train': 0.8269837684929371} 12/22/2021 12:36:19 - INFO - codeparrot_training 
- Step 6775: {'lr': 0.000478663340828274, 'samples': 3469312, 'steps': 6775, 'batch_loss/train': 0.9079471873119473} 12/22/2021 12:36:30 - INFO - codeparrot_training - Step 6776: {'lr': 0.0004786569682990285, 'samples': 3469824, 'steps': 6776, 'batch_loss/train': 0.8673191703855991} 12/22/2021 12:36:40 - INFO - codeparrot_training - Step 6777: {'lr': 0.00047865059486072915, 'samples': 3470336, 'steps': 6777, 'batch_loss/train': 0.8739445433020592} 12/22/2021 12:36:54 - INFO - codeparrot_training - Step 6778: {'lr': 0.00047864422051340134, 'samples': 3470848, 'steps': 6778, 'batch_loss/train': 0.8673689127899706} 12/22/2021 12:37:05 - INFO - codeparrot_training - Step 6779: {'lr': 0.0004786378452570704, 'samples': 3471360, 'steps': 6779, 'batch_loss/train': 0.9037138964049518} 12/22/2021 12:37:15 - INFO - codeparrot_training - Step 6780: {'lr': 0.0004786314690917619, 'samples': 3471872, 'steps': 6780, 'batch_loss/train': 0.8629695298150182} 12/22/2021 12:37:27 - INFO - codeparrot_training - Step 6781: {'lr': 0.00047862509201750087, 'samples': 3472384, 'steps': 6781, 'batch_loss/train': 0.7719604629091918} 12/22/2021 12:37:38 - INFO - codeparrot_training - Step 6782: {'lr': 0.00047861871403431284, 'samples': 3472896, 'steps': 6782, 'batch_loss/train': 0.8904079450294375} 12/22/2021 12:37:49 - INFO - codeparrot_training - Step 6783: {'lr': 0.0004786123351422231, 'samples': 3473408, 'steps': 6783, 'batch_loss/train': 0.8858865424990654} 12/22/2021 12:38:02 - INFO - codeparrot_training - Step 6784: {'lr': 0.00047860595534125706, 'samples': 3473920, 'steps': 6784, 'batch_loss/train': 0.9018488926813006} 12/22/2021 12:38:13 - INFO - codeparrot_training - Step 6785: {'lr': 0.0004785995746314401, 'samples': 3474432, 'steps': 6785, 'batch_loss/train': 0.8918699407950044} 12/22/2021 12:38:24 - INFO - codeparrot_training - Step 6786: {'lr': 0.00047859319301279745, 'samples': 3474944, 'steps': 6786, 'batch_loss/train': 0.8482865579426289} 12/22/2021 12:38:34 - INFO - 
codeparrot_training - Step 6787: {'lr': 0.00047858681048535464, 'samples': 3475456, 'steps': 6787, 'batch_loss/train': 0.8505675001069903} 12/22/2021 12:38:46 - INFO - codeparrot_training - Step 6788: {'lr': 0.000478580427049137, 'samples': 3475968, 'steps': 6788, 'batch_loss/train': 0.9734480148181319} 12/22/2021 12:38:57 - INFO - codeparrot_training - Step 6789: {'lr': 0.00047857404270416985, 'samples': 3476480, 'steps': 6789, 'batch_loss/train': 0.8811354478821158} 12/22/2021 12:39:07 - INFO - codeparrot_training - Step 6790: {'lr': 0.0004785676574504786, 'samples': 3476992, 'steps': 6790, 'batch_loss/train': 0.8438646430149674} 12/22/2021 12:39:20 - INFO - codeparrot_training - Step 6791: {'lr': 0.0004785612712880887, 'samples': 3477504, 'steps': 6791, 'batch_loss/train': 0.8702675909735262} 12/22/2021 12:39:30 - INFO - codeparrot_training - Step 6792: {'lr': 0.0004785548842170254, 'samples': 3478016, 'steps': 6792, 'batch_loss/train': 0.8794072507880628} 12/22/2021 12:39:41 - INFO - codeparrot_training - Step 6793: {'lr': 0.0004785484962373143, 'samples': 3478528, 'steps': 6793, 'batch_loss/train': 1.069552551023662} 12/22/2021 12:39:53 - INFO - codeparrot_training - Step 6794: {'lr': 0.0004785421073489806, 'samples': 3479040, 'steps': 6794, 'batch_loss/train': 0.7598264482803643} 12/22/2021 12:40:04 - INFO - codeparrot_training - Step 6795: {'lr': 0.00047853571755204977, 'samples': 3479552, 'steps': 6795, 'batch_loss/train': 0.8483000164851546} 12/22/2021 12:40:14 - INFO - codeparrot_training - Step 6796: {'lr': 0.0004785293268465473, 'samples': 3480064, 'steps': 6796, 'batch_loss/train': 0.8726807981729507} 12/22/2021 12:40:25 - INFO - codeparrot_training - Step 6797: {'lr': 0.0004785229352324984, 'samples': 3480576, 'steps': 6797, 'batch_loss/train': 0.8084081190172583} 12/22/2021 12:40:39 - INFO - codeparrot_training - Step 6798: {'lr': 0.0004785165427099286, 'samples': 3481088, 'steps': 6798, 'batch_loss/train': 0.8220905065536499} 12/22/2021 12:40:49 - 
INFO - codeparrot_training - Step 6799: {'lr': 0.0004785101492788634, 'samples': 3481600, 'steps': 6799, 'batch_loss/train': 0.8991244351491332} 12/22/2021 12:41:00 - INFO - codeparrot_training - Step 6800: {'lr': 0.0004785037549393281, 'samples': 3482112, 'steps': 6800, 'batch_loss/train': 0.7527687414549291} 12/22/2021 12:41:12 - INFO - codeparrot_training - Step 6801: {'lr': 0.0004784973596913481, 'samples': 3482624, 'steps': 6801, 'batch_loss/train': 0.9073523161932826} 12/22/2021 12:41:23 - INFO - codeparrot_training - Step 6802: {'lr': 0.0004784909635349489, 'samples': 3483136, 'steps': 6802, 'batch_loss/train': 0.8742046039551497} 12/22/2021 12:41:33 - INFO - codeparrot_training - Step 6803: {'lr': 0.000478484566470156, 'samples': 3483648, 'steps': 6803, 'batch_loss/train': 0.8283093455247581} 12/22/2021 12:41:45 - INFO - codeparrot_training - Step 6804: {'lr': 0.0004784781684969947, 'samples': 3484160, 'steps': 6804, 'batch_loss/train': 0.9167087860405445} 12/22/2021 12:41:56 - INFO - codeparrot_training - Step 6805: {'lr': 0.0004784717696154903, 'samples': 3484672, 'steps': 6805, 'batch_loss/train': 0.8255998436361551} 12/22/2021 12:42:06 - INFO - codeparrot_training - Step 6806: {'lr': 0.0004784653698256686, 'samples': 3485184, 'steps': 6806, 'batch_loss/train': 0.8232054822146893} 12/22/2021 12:42:17 - INFO - codeparrot_training - Step 6807: {'lr': 0.00047845896912755476, 'samples': 3485696, 'steps': 6807, 'batch_loss/train': 0.7462190766818821} 12/22/2021 12:42:31 - INFO - codeparrot_training - Step 6808: {'lr': 0.0004784525675211744, 'samples': 3486208, 'steps': 6808, 'batch_loss/train': 0.9006503792479634} 12/22/2021 12:42:41 - INFO - codeparrot_training - Step 6809: {'lr': 0.0004784461650065528, 'samples': 3486720, 'steps': 6809, 'batch_loss/train': 0.8999133361503482} 12/22/2021 12:42:52 - INFO - codeparrot_training - Step 6810: {'lr': 0.0004784397615837156, 'samples': 3487232, 'steps': 6810, 'batch_loss/train': 0.9470609021373093} 12/22/2021 
12:43:04 - INFO - codeparrot_training - Step 6811: {'lr': 0.0004784333572526881, 'samples': 3487744, 'steps': 6811, 'batch_loss/train': 0.8540064776316285} 12/22/2021 12:43:15 - INFO - codeparrot_training - Step 6812: {'lr': 0.0004784269520134958, 'samples': 3488256, 'steps': 6812, 'batch_loss/train': 0.7187073780223727} 12/22/2021 12:43:26 - INFO - codeparrot_training - Step 6813: {'lr': 0.00047842054586616424, 'samples': 3488768, 'steps': 6813, 'batch_loss/train': 0.9083376061171293} 12/22/2021 12:43:39 - INFO - codeparrot_training - Step 6814: {'lr': 0.0004784141388107188, 'samples': 3489280, 'steps': 6814, 'batch_loss/train': 0.767266808077693} 12/22/2021 12:43:50 - INFO - codeparrot_training - Step 6815: {'lr': 0.00047840773084718505, 'samples': 3489792, 'steps': 6815, 'batch_loss/train': 0.8944163536652923} 12/22/2021 12:44:01 - INFO - codeparrot_training - Step 6816: {'lr': 0.0004784013219755884, 'samples': 3490304, 'steps': 6816, 'batch_loss/train': 0.8880995051003993} 12/22/2021 12:44:11 - INFO - codeparrot_training - Step 6817: {'lr': 0.00047839491219595434, 'samples': 3490816, 'steps': 6817, 'batch_loss/train': 0.829765213187784} 12/22/2021 12:44:23 - INFO - codeparrot_training - Step 6818: {'lr': 0.00047838850150830824, 'samples': 3491328, 'steps': 6818, 'batch_loss/train': 0.7931210750248283} 12/22/2021 12:44:34 - INFO - codeparrot_training - Step 6819: {'lr': 0.0004783820899126758, 'samples': 3491840, 'steps': 6819, 'batch_loss/train': 0.8459783792495728} 12/22/2021 12:44:44 - INFO - codeparrot_training - Step 6820: {'lr': 0.0004783756774090824, 'samples': 3492352, 'steps': 6820, 'batch_loss/train': 0.832150319358334} 12/22/2021 12:44:56 - INFO - codeparrot_training - Step 6821: {'lr': 0.0004783692639975535, 'samples': 3492864, 'steps': 6821, 'batch_loss/train': 0.9144419440999627} 12/22/2021 12:45:07 - INFO - codeparrot_training - Step 6822: {'lr': 0.0004783628496781146, 'samples': 3493376, 'steps': 6822, 'batch_loss/train': 0.781462557381019} 
12/22/2021 12:45:18 - INFO - codeparrot_training - Step 6823: {'lr': 0.0004783564344507913, 'samples': 3493888, 'steps': 6823, 'batch_loss/train': 0.8328437209129333} 12/22/2021 12:45:32 - INFO - codeparrot_training - Step 6824: {'lr': 0.000478350018315609, 'samples': 3494400, 'steps': 6824, 'batch_loss/train': 0.9502471489831805} 12/22/2021 12:45:42 - INFO - codeparrot_training - Step 6825: {'lr': 0.0004783436012725933, 'samples': 3494912, 'steps': 6825, 'batch_loss/train': 0.811186883598566} 12/22/2021 12:45:53 - INFO - codeparrot_training - Step 6826: {'lr': 0.00047833718332176967, 'samples': 3495424, 'steps': 6826, 'batch_loss/train': 0.7944247666746378} 12/22/2021 12:46:05 - INFO - codeparrot_training - Step 6827: {'lr': 0.0004783307644631635, 'samples': 3495936, 'steps': 6827, 'batch_loss/train': 0.8447435788111761} 12/22/2021 12:46:16 - INFO - codeparrot_training - Step 6828: {'lr': 0.00047832434469680053, 'samples': 3496448, 'steps': 6828, 'batch_loss/train': 0.8374567185528576} 12/22/2021 12:46:27 - INFO - codeparrot_training - Step 6829: {'lr': 0.0004783179240227062, 'samples': 3496960, 'steps': 6829, 'batch_loss/train': 0.8238410851918161} 12/22/2021 12:46:37 - INFO - codeparrot_training - Step 6830: {'lr': 0.0004783115024409059, 'samples': 3497472, 'steps': 6830, 'batch_loss/train': 0.8881312198936939} 12/22/2021 12:46:49 - INFO - codeparrot_training - Step 6831: {'lr': 0.00047830507995142536, 'samples': 3497984, 'steps': 6831, 'batch_loss/train': 1.1219901368021965} 12/22/2021 12:47:00 - INFO - codeparrot_training - Step 6832: {'lr': 0.00047829865655429004, 'samples': 3498496, 'steps': 6832, 'batch_loss/train': 0.7638358152471483} 12/22/2021 12:47:10 - INFO - codeparrot_training - Step 6833: {'lr': 0.00047829223224952545, 'samples': 3499008, 'steps': 6833, 'batch_loss/train': 0.8332435339689255} 12/22/2021 12:47:22 - INFO - codeparrot_training - Step 6834: {'lr': 0.00047828580703715705, 'samples': 3499520, 'steps': 6834, 'batch_loss/train': 
0.8568749697878957} 12/22/2021 12:47:33 - INFO - codeparrot_training - Step 6835: {'lr': 0.00047827938091721055, 'samples': 3500032, 'steps': 6835, 'batch_loss/train': 0.8256972376257181} 12/22/2021 12:47:44 - INFO - codeparrot_training - Step 6836: {'lr': 0.0004782729538897115, 'samples': 3500544, 'steps': 6836, 'batch_loss/train': 0.8176514552906156} 12/22/2021 12:47:57 - INFO - codeparrot_training - Step 6837: {'lr': 0.00047826652595468527, 'samples': 3501056, 'steps': 6837, 'batch_loss/train': 0.77076306194067} 12/22/2021 12:48:08 - INFO - codeparrot_training - Step 6838: {'lr': 0.00047826009711215755, 'samples': 3501568, 'steps': 6838, 'batch_loss/train': 0.7518743656110018} 12/22/2021 12:48:19 - INFO - codeparrot_training - Step 6839: {'lr': 0.0004782536673621539, 'samples': 3502080, 'steps': 6839, 'batch_loss/train': 0.8574678869917989} 12/22/2021 12:48:29 - INFO - codeparrot_training - Step 6840: {'lr': 0.00047824723670469984, 'samples': 3502592, 'steps': 6840, 'batch_loss/train': 0.8435002551414073} 12/22/2021 12:48:42 - INFO - codeparrot_training - Step 6841: {'lr': 0.0004782408051398209, 'samples': 3503104, 'steps': 6841, 'batch_loss/train': 0.8968148212879896} 12/22/2021 12:48:52 - INFO - codeparrot_training - Step 6842: {'lr': 0.0004782343726675428, 'samples': 3503616, 'steps': 6842, 'batch_loss/train': 0.7899433327838778} 12/22/2021 12:49:03 - INFO - codeparrot_training - Step 6843: {'lr': 0.0004782279392878909, 'samples': 3504128, 'steps': 6843, 'batch_loss/train': 0.878983429633081} 12/22/2021 12:49:17 - INFO - codeparrot_training - Step 6844: {'lr': 0.000478221505000891, 'samples': 3504640, 'steps': 6844, 'batch_loss/train': 0.8645762475207448} 12/22/2021 12:49:28 - INFO - codeparrot_training - Step 6845: {'lr': 0.0004782150698065685, 'samples': 3505152, 'steps': 6845, 'batch_loss/train': 1.2482087342068553} 12/22/2021 12:49:38 - INFO - codeparrot_training - Step 6846: {'lr': 0.00047820863370494907, 'samples': 3505664, 'steps': 6846, 
'batch_loss/train': 0.7830397039651871} 12/22/2021 12:49:50 - INFO - codeparrot_training - Step 6847: {'lr': 0.00047820219669605826, 'samples': 3506176, 'steps': 6847, 'batch_loss/train': 0.8829594617709517} 12/22/2021 12:50:01 - INFO - codeparrot_training - Step 6848: {'lr': 0.0004781957587799217, 'samples': 3506688, 'steps': 6848, 'batch_loss/train': 0.876767803914845} 12/22/2021 12:50:12 - INFO - codeparrot_training - Step 6849: {'lr': 0.00047818931995656494, 'samples': 3507200, 'steps': 6849, 'batch_loss/train': 0.749486580491066} 12/22/2021 12:50:22 - INFO - codeparrot_training - Step 6850: {'lr': 0.0004781828802260136, 'samples': 3507712, 'steps': 6850, 'batch_loss/train': 0.726338536478579} 12/22/2021 12:50:34 - INFO - codeparrot_training - Step 6851: {'lr': 0.00047817643958829336, 'samples': 3508224, 'steps': 6851, 'batch_loss/train': 0.8483903603628278} 12/22/2021 12:50:45 - INFO - codeparrot_training - Step 6852: {'lr': 0.00047816999804342977, 'samples': 3508736, 'steps': 6852, 'batch_loss/train': 0.9002201855182648} 12/22/2021 12:50:55 - INFO - codeparrot_training - Step 6853: {'lr': 0.0004781635555914483, 'samples': 3509248, 'steps': 6853, 'batch_loss/train': 0.8465242171660066} 12/22/2021 12:51:09 - INFO - codeparrot_training - Step 6854: {'lr': 0.0004781571122323748, 'samples': 3509760, 'steps': 6854, 'batch_loss/train': 0.8029172965325415} 12/22/2021 12:51:20 - INFO - codeparrot_training - Step 6855: {'lr': 0.0004781506679662346, 'samples': 3510272, 'steps': 6855, 'batch_loss/train': 0.8994726347737014} 12/22/2021 12:51:30 - INFO - codeparrot_training - Step 6856: {'lr': 0.0004781442227930537, 'samples': 3510784, 'steps': 6856, 'batch_loss/train': 0.8384826984256506} 12/22/2021 12:51:43 - INFO - codeparrot_training - Step 6857: {'lr': 0.0004781377767128574, 'samples': 3511296, 'steps': 6857, 'batch_loss/train': 0.8050114456564188} 12/22/2021 12:51:53 - INFO - codeparrot_training - Step 6858: {'lr': 0.00047813132972567143, 'samples': 3511808, 'steps': 
6858, 'batch_loss/train': 0.7892208928242326} 12/22/2021 12:52:04 - INFO - codeparrot_training - Step 6859: {'lr': 0.00047812488183152145, 'samples': 3512320, 'steps': 6859, 'batch_loss/train': 1.0068320371210575} 12/22/2021 12:52:14 - INFO - codeparrot_training - Step 6860: {'lr': 0.0004781184330304331, 'samples': 3512832, 'steps': 6860, 'batch_loss/train': 0.8050445280969143} 12/22/2021 12:52:28 - INFO - codeparrot_training - Step 6861: {'lr': 0.000478111983322432, 'samples': 3513344, 'steps': 6861, 'batch_loss/train': 0.8679223814979196} 12/22/2021 12:52:39 - INFO - codeparrot_training - Step 6862: {'lr': 0.00047810553270754376, 'samples': 3513856, 'steps': 6862, 'batch_loss/train': 0.8103510569781065} 12/22/2021 12:52:50 - INFO - codeparrot_training - Step 6863: {'lr': 0.00047809908118579404, 'samples': 3514368, 'steps': 6863, 'batch_loss/train': 0.8362954752519727} 12/22/2021 12:53:02 - INFO - codeparrot_training - Step 6864: {'lr': 0.00047809262875720863, 'samples': 3514880, 'steps': 6864, 'batch_loss/train': 0.835279525257647} 12/22/2021 12:53:12 - INFO - codeparrot_training - Step 6865: {'lr': 0.0004780861754218129, 'samples': 3515392, 'steps': 6865, 'batch_loss/train': 0.8000361165031791} 12/22/2021 12:53:23 - INFO - codeparrot_training - Step 6866: {'lr': 0.0004780797211796327, 'samples': 3515904, 'steps': 6866, 'batch_loss/train': 0.8739816853776574} 12/22/2021 12:53:35 - INFO - codeparrot_training - Step 6867: {'lr': 0.0004780732660306936, 'samples': 3516416, 'steps': 6867, 'batch_loss/train': 0.8294966695830226} 12/22/2021 12:53:46 - INFO - codeparrot_training - Step 6868: {'lr': 0.0004780668099750214, 'samples': 3516928, 'steps': 6868, 'batch_loss/train': 1.4203750640153885} 12/22/2021 12:53:56 - INFO - codeparrot_training - Step 6869: {'lr': 0.0004780603530126416, 'samples': 3517440, 'steps': 6869, 'batch_loss/train': 0.8517472061794251} 12/22/2021 12:54:08 - INFO - codeparrot_training - Step 6870: {'lr': 0.00047805389514358, 'samples': 3517952, 
'steps': 6870, 'batch_loss/train': 0.8811293905600905} 12/22/2021 12:54:19 - INFO - codeparrot_training - Step 6871: {'lr': 0.00047804743636786225, 'samples': 3518464, 'steps': 6871, 'batch_loss/train': 0.7287796433083713} 12/22/2021 12:54:30 - INFO - codeparrot_training - Step 6872: {'lr': 0.00047804097668551393, 'samples': 3518976, 'steps': 6872, 'batch_loss/train': 0.7869997941888869} 12/22/2021 12:54:40 - INFO - codeparrot_training - Step 6873: {'lr': 0.00047803451609656084, 'samples': 3519488, 'steps': 6873, 'batch_loss/train': 0.9009672608226538} 12/22/2021 12:54:54 - INFO - codeparrot_training - Step 6874: {'lr': 0.0004780280546010286, 'samples': 3520000, 'steps': 6874, 'batch_loss/train': 0.857836025999859} 12/22/2021 12:55:05 - INFO - codeparrot_training - Step 6875: {'lr': 0.0004780215921989429, 'samples': 3520512, 'steps': 6875, 'batch_loss/train': 0.8585587115958333} 12/22/2021 12:55:15 - INFO - codeparrot_training - Step 6876: {'lr': 0.0004780151288903295, 'samples': 3521024, 'steps': 6876, 'batch_loss/train': 0.9169222228229046} 12/22/2021 12:55:27 - INFO - codeparrot_training - Step 6877: {'lr': 0.000478008664675214, 'samples': 3521536, 'steps': 6877, 'batch_loss/train': 0.7921930681914091} 12/22/2021 12:55:38 - INFO - codeparrot_training - Step 6878: {'lr': 0.0004780021995536221, 'samples': 3522048, 'steps': 6878, 'batch_loss/train': 0.7619738313369453} 12/22/2021 12:55:49 - INFO - codeparrot_training - Step 6879: {'lr': 0.0004779957335255796, 'samples': 3522560, 'steps': 6879, 'batch_loss/train': 0.7854180801659822} 12/22/2021 12:56:01 - INFO - codeparrot_training - Step 6880: {'lr': 0.0004779892665911122, 'samples': 3523072, 'steps': 6880, 'batch_loss/train': 0.7979280715808272} 12/22/2021 12:56:11 - INFO - codeparrot_training - Step 6881: {'lr': 0.0004779827987502455, 'samples': 3523584, 'steps': 6881, 'batch_loss/train': 0.8335308711975813} 12/22/2021 12:56:22 - INFO - codeparrot_training - Step 6882: {'lr': 0.00047797633000300525, 'samples': 
3524096, 'steps': 6882, 'batch_loss/train': 0.831101187504828} 12/22/2021 12:56:33 - INFO - codeparrot_training - Step 6883: {'lr': 0.00047796986034941727, 'samples': 3524608, 'steps': 6883, 'batch_loss/train': 0.7825077744200826} 12/22/2021 12:56:47 - INFO - codeparrot_training - Step 6884: {'lr': 0.00047796338978950707, 'samples': 3525120, 'steps': 6884, 'batch_loss/train': 0.8178739652503282} 12/22/2021 12:56:57 - INFO - codeparrot_training - Step 6885: {'lr': 0.0004779569183233006, 'samples': 3525632, 'steps': 6885, 'batch_loss/train': 0.909658077172935} 12/22/2021 12:57:08 - INFO - codeparrot_training - Step 6886: {'lr': 0.0004779504459508235, 'samples': 3526144, 'steps': 6886, 'batch_loss/train': 0.7192649858770892} 12/22/2021 12:57:20 - INFO - codeparrot_training - Step 6887: {'lr': 0.0004779439726721015, 'samples': 3526656, 'steps': 6887, 'batch_loss/train': 0.84517622878775} 12/22/2021 12:57:30 - INFO - codeparrot_training - Step 6888: {'lr': 0.0004779374984871603, 'samples': 3527168, 'steps': 6888, 'batch_loss/train': 0.8670377805829048} 12/22/2021 12:57:41 - INFO - codeparrot_training - Step 6889: {'lr': 0.0004779310233960257, 'samples': 3527680, 'steps': 6889, 'batch_loss/train': 0.9248344874940813} 12/22/2021 12:57:53 - INFO - codeparrot_training - Step 6890: {'lr': 0.0004779245473987234, 'samples': 3528192, 'steps': 6890, 'batch_loss/train': 0.7958310176618397} 12/22/2021 12:58:04 - INFO - codeparrot_training - Step 6891: {'lr': 0.0004779180704952792, 'samples': 3528704, 'steps': 6891, 'batch_loss/train': 0.916795710567385} 12/22/2021 12:58:14 - INFO - codeparrot_training - Step 6892: {'lr': 0.0004779115926857188, 'samples': 3529216, 'steps': 6892, 'batch_loss/train': 0.894349536858499} 12/22/2021 12:58:29 - INFO - codeparrot_training - Step 6893: {'lr': 0.0004779051139700679, 'samples': 3529728, 'steps': 6893, 'batch_loss/train': 0.8364543206989765} 12/22/2021 12:58:39 - INFO - codeparrot_training - Step 6894: {'lr': 0.00047789863434835237, 
'samples': 3530240, 'steps': 6894, 'batch_loss/train': 0.7834193306043744} 12/22/2021 12:58:50 - INFO - codeparrot_training - Step 6895: {'lr': 0.0004778921538205979, 'samples': 3530752, 'steps': 6895, 'batch_loss/train': 0.7633658368140459} 12/22/2021 12:59:01 - INFO - codeparrot_training - Step 6896: {'lr': 0.0004778856723868303, 'samples': 3531264, 'steps': 6896, 'batch_loss/train': 0.9297445239499211} 12/22/2021 12:59:13 - INFO - codeparrot_training - Step 6897: {'lr': 0.0004778791900470753, 'samples': 3531776, 'steps': 6897, 'batch_loss/train': 0.714201295748353} 12/22/2021 12:59:23 - INFO - codeparrot_training - Step 6898: {'lr': 0.00047787270680135866, 'samples': 3532288, 'steps': 6898, 'batch_loss/train': 0.8604094740003347} 12/22/2021 12:59:34 - INFO - codeparrot_training - Step 6899: {'lr': 0.00047786622264970626, 'samples': 3532800, 'steps': 6899, 'batch_loss/train': 0.8246651338413358} 12/22/2021 12:59:48 - INFO - codeparrot_training - Step 6900: {'lr': 0.0004778597375921437, 'samples': 3533312, 'steps': 6900, 'batch_loss/train': 0.8737885421141982} 12/22/2021 12:59:58 - INFO - codeparrot_training - Step 6901: {'lr': 0.00047785325162869686, 'samples': 3533824, 'steps': 6901, 'batch_loss/train': 0.8625508421100676} 12/22/2021 13:00:09 - INFO - codeparrot_training - Step 6902: {'lr': 0.0004778467647593916, 'samples': 3534336, 'steps': 6902, 'batch_loss/train': 0.9055325676454231} 12/22/2021 13:00:21 - INFO - codeparrot_training - Step 6903: {'lr': 0.00047784027698425356, 'samples': 3534848, 'steps': 6903, 'batch_loss/train': 0.8702748436480761} 12/22/2021 13:00:32 - INFO - codeparrot_training - Step 6904: {'lr': 0.0004778337883033087, 'samples': 3535360, 'steps': 6904, 'batch_loss/train': 0.9536830177530646} 12/22/2021 13:00:42 - INFO - codeparrot_training - Step 6905: {'lr': 0.0004778272987165827, 'samples': 3535872, 'steps': 6905, 'batch_loss/train': 0.848059271927923} 12/22/2021 13:00:53 - INFO - codeparrot_training - Step 6906: {'lr': 
0.00047782080822410136, 'samples': 3536384, 'steps': 6906, 'batch_loss/train': 0.9043547417968512}
12/22/2021 13:01:05 - INFO - codeparrot_training - Step 6907: {'lr': 0.0004778143168258905, 'samples': 3536896, 'steps': 6907, 'batch_loss/train': 0.7529145423322916}
12/22/2021 13:01:16 - INFO - codeparrot_training - Step 6908: {'lr': 0.0004778078245219759, 'samples': 3537408, 'steps': 6908, 'batch_loss/train': 0.8631441686302423}
12/22/2021 13:01:26 - INFO - codeparrot_training - Step 6909: {'lr': 0.0004778013313123835, 'samples': 3537920, 'steps': 6909, 'batch_loss/train': 0.8303271364420652}
12/22/2021 13:01:39 - INFO - codeparrot_training - Step 6910: {'lr': 0.0004777948371971389, 'samples': 3538432, 'steps': 6910, 'batch_loss/train': 0.8993386342190206}
12/22/2021 13:01:49 - INFO - codeparrot_training - Step 6911: {'lr': 0.00047778834217626816, 'samples': 3538944, 'steps': 6911, 'batch_loss/train': 0.850620181299746}
12/22/2021 13:02:00 - INFO - codeparrot_training - Step 6912: {'lr': 0.00047778184624979695, 'samples': 3539456, 'steps': 6912, 'batch_loss/train': 0.8448124947026372}
12/22/2021 13:02:14 - INFO - codeparrot_training - Step 6913: {'lr': 0.000477775349417751, 'samples': 3539968, 'steps': 6913, 'batch_loss/train': 0.7056853544199839}
12/22/2021 13:02:24 - INFO - codeparrot_training - Step 6914: {'lr': 0.0004777688516801564, 'samples': 3540480, 'steps': 6914, 'batch_loss/train': 0.8726887544617057}
12/22/2021 13:02:35 - INFO - codeparrot_training - Step 6915: {'lr': 0.00047776235303703885, 'samples': 3540992, 'steps': 6915, 'batch_loss/train': 0.8460023198276758}
12/22/2021 13:02:46 - INFO - codeparrot_training - Step 6916: {'lr': 0.0004777558534884242, 'samples': 3541504, 'steps': 6916, 'batch_loss/train': 0.8598597021773458}
12/22/2021 13:02:58 - INFO - codeparrot_training - Step 6917: {'lr': 0.0004777493530343382, 'samples': 3542016, 'steps': 6917, 'batch_loss/train': 0.8056365749798715}
12/22/2021 13:03:08 - INFO - codeparrot_training - Step 6918: {'lr': 0.0004777428516748068, 'samples': 3542528, 'steps': 6918, 'batch_loss/train': 0.8678967915475368}
12/22/2021 13:03:19 - INFO - codeparrot_training - Step 6919: {'lr': 0.0004777363494098558, 'samples': 3543040, 'steps': 6919, 'batch_loss/train': 0.8219510619528592}
12/22/2021 13:03:31 - INFO - codeparrot_training - Step 6920: {'lr': 0.0004777298462395111, 'samples': 3543552, 'steps': 6920, 'batch_loss/train': 0.8018168660346419}
12/22/2021 13:03:42 - INFO - codeparrot_training - Step 6921: {'lr': 0.00047772334216379843, 'samples': 3544064, 'steps': 6921, 'batch_loss/train': 0.7397546265274286}
12/22/2021 13:03:52 - INFO - codeparrot_training - Step 6922: {'lr': 0.00047771683718274386, 'samples': 3544576, 'steps': 6922, 'batch_loss/train': 0.7603208683431149}
12/22/2021 13:04:06 - INFO - codeparrot_training - Step 6923: {'lr': 0.00047771033129637304, 'samples': 3545088, 'steps': 6923, 'batch_loss/train': 0.7948522195219994}
12/22/2021 13:04:17 - INFO - codeparrot_training - Step 6924: {'lr': 0.00047770382450471196, 'samples': 3545600, 'steps': 6924, 'batch_loss/train': 0.8138161359820515}
12/22/2021 13:04:28 - INFO - codeparrot_training - Step 6925: {'lr': 0.00047769731680778646, 'samples': 3546112, 'steps': 6925, 'batch_loss/train': 0.740538137848489}
12/22/2021 13:04:40 - INFO - codeparrot_training - Step 6926: {'lr': 0.0004776908082056224, 'samples': 3546624, 'steps': 6926, 'batch_loss/train': 0.8234712369740009}
12/22/2021 13:04:50 - INFO - codeparrot_training - Step 6927: {'lr': 0.0004776842986982456, 'samples': 3547136, 'steps': 6927, 'batch_loss/train': 0.8844731170684099}
12/22/2021 13:05:01 - INFO - codeparrot_training - Step 6928: {'lr': 0.00047767778828568204, 'samples': 3547648, 'steps': 6928, 'batch_loss/train': 0.8779600341804326}
12/22/2021 13:05:11 - INFO - codeparrot_training - Step 6929: {'lr': 0.00047767127696795756, 'samples': 3548160, 'steps': 6929, 'batch_loss/train': 0.7759677013382316}
12/22/2021 13:05:25 - INFO - codeparrot_training - Step 6930: {'lr': 0.00047766476474509807, 'samples': 3548672, 'steps': 6930, 'batch_loss/train': 0.8414387134835124}
12/22/2021 13:05:36 - INFO - codeparrot_training - Step 6931: {'lr': 0.00047765825161712937, 'samples': 3549184, 'steps': 6931, 'batch_loss/train': 0.7702608546242118}
12/22/2021 13:05:47 - INFO - codeparrot_training - Step 6932: {'lr': 0.0004776517375840775, 'samples': 3549696, 'steps': 6932, 'batch_loss/train': 0.8195980321615934}
12/22/2021 13:05:59 - INFO - codeparrot_training - Step 6933: {'lr': 0.00047764522264596824, 'samples': 3550208, 'steps': 6933, 'batch_loss/train': 0.8208807362243533}
12/22/2021 13:06:09 - INFO - codeparrot_training - Step 6934: {'lr': 0.00047763870680282754, 'samples': 3550720, 'steps': 6934, 'batch_loss/train': 0.7730083698406816}
12/22/2021 13:06:20 - INFO - codeparrot_training - Step 6935: {'lr': 0.00047763219005468127, 'samples': 3551232, 'steps': 6935, 'batch_loss/train': 0.7822374842362478}
12/22/2021 13:06:32 - INFO - codeparrot_training - Step 6936: {'lr': 0.0004776256724015554, 'samples': 3551744, 'steps': 6936, 'batch_loss/train': 0.8692762600257993}
12/22/2021 13:06:43 - INFO - codeparrot_training - Step 6937: {'lr': 0.00047761915384347586, 'samples': 3552256, 'steps': 6937, 'batch_loss/train': 0.884889226523228}
12/22/2021 13:06:53 - INFO - codeparrot_training - Step 6938: {'lr': 0.00047761263438046844, 'samples': 3552768, 'steps': 6938, 'batch_loss/train': 0.8018298828974366}
12/22/2021 13:07:04 - INFO - codeparrot_training - Step 6939: {'lr': 0.0004776061140125591, 'samples': 3553280, 'steps': 6939, 'batch_loss/train': 0.909806203097105}
12/22/2021 13:07:18 - INFO - codeparrot_training - Step 6940: {'lr': 0.0004775995927397738, 'samples': 3553792, 'steps': 6940, 'batch_loss/train': 0.7289066608063877}
12/22/2021 13:07:28 - INFO - codeparrot_training - Step 6941: {'lr': 0.0004775930705621385, 'samples': 3554304, 'steps': 6941, 'batch_loss/train': 0.6880506570450962}
12/22/2021 13:07:39 - INFO - codeparrot_training - Step 6942: {'lr': 0.0004775865474796791, 'samples': 3554816, 'steps': 6942, 'batch_loss/train': 0.7912719463929534}
12/22/2021 13:07:51 - INFO - codeparrot_training - Step 6943: {'lr': 0.00047758002349242146, 'samples': 3555328, 'steps': 6943, 'batch_loss/train': 0.8608554899692535}
12/22/2021 13:08:02 - INFO - codeparrot_training - Step 6944: {'lr': 0.0004775734986003916, 'samples': 3555840, 'steps': 6944, 'batch_loss/train': 0.7973303105682135}
12/22/2021 13:08:12 - INFO - codeparrot_training - Step 6945: {'lr': 0.0004775669728036155, 'samples': 3556352, 'steps': 6945, 'batch_loss/train': 0.8844816903583705}
12/22/2021 13:08:24 - INFO - codeparrot_training - Step 6946: {'lr': 0.000477560446102119, 'samples': 3556864, 'steps': 6946, 'batch_loss/train': 0.7660764493048191}
12/22/2021 13:08:35 - INFO - codeparrot_training - Step 6947: {'lr': 0.000477553918495928, 'samples': 3557376, 'steps': 6947, 'batch_loss/train': 0.825427430216223}
12/22/2021 13:08:46 - INFO - codeparrot_training - Step 6948: {'lr': 0.00047754738998506864, 'samples': 3557888, 'steps': 6948, 'batch_loss/train': 0.7870476427488029}
12/22/2021 13:09:00 - INFO - codeparrot_training - Step 6949: {'lr': 0.0004775408605695668, 'samples': 3558400, 'steps': 6949, 'batch_loss/train': 0.7669964379165322}
12/22/2021 13:09:10 - INFO - codeparrot_training - Step 6950: {'lr': 0.0004775343302494484, 'samples': 3558912, 'steps': 6950, 'batch_loss/train': 0.7821305366232991}
12/22/2021 13:09:21 - INFO - codeparrot_training - Step 6951: {'lr': 0.00047752779902473933, 'samples': 3559424, 'steps': 6951, 'batch_loss/train': 0.8496185140684247}
12/22/2021 13:09:31 - INFO - codeparrot_training - Step 6952: {'lr': 0.0004775212668954657, 'samples': 3559936, 'steps': 6952, 'batch_loss/train': 0.8116379089187831}
12/22/2021 13:09:43 - INFO - codeparrot_training - Step 6953: {'lr': 0.0004775147338616534, 'samples': 3560448, 'steps': 6953, 'batch_loss/train': 0.7432887773029506}
12/22/2021 13:09:54 - INFO - codeparrot_training - Step 6954: {'lr': 0.00047750819992332846, 'samples': 3560960, 'steps': 6954, 'batch_loss/train': 0.6057755742222071}
12/22/2021 13:10:05 - INFO - codeparrot_training - Step 6955: {'lr': 0.0004775016650805168, 'samples': 3561472, 'steps': 6955, 'batch_loss/train': 0.6737189101986587}
12/22/2021 13:10:17 - INFO - codeparrot_training - Step 6956: {'lr': 0.00047749512933324445, 'samples': 3561984, 'steps': 6956, 'batch_loss/train': 0.8439257694408298}
12/22/2021 13:10:27 - INFO - codeparrot_training - Step 6957: {'lr': 0.00047748859268153735, 'samples': 3562496, 'steps': 6957, 'batch_loss/train': 0.8456597821787}
12/22/2021 13:10:38 - INFO - codeparrot_training - Step 6958: {'lr': 0.0004774820551254214, 'samples': 3563008, 'steps': 6958, 'batch_loss/train': 0.7706015752628446}
12/22/2021 13:10:49 - INFO - codeparrot_training - Step 6959: {'lr': 0.00047747551666492286, 'samples': 3563520, 'steps': 6959, 'batch_loss/train': 0.9762260876595974}
12/22/2021 13:11:02 - INFO - codeparrot_training - Step 6960: {'lr': 0.00047746897730006747, 'samples': 3564032, 'steps': 6960, 'batch_loss/train': 1.016547854989767}
12/22/2021 13:11:13 - INFO - codeparrot_training - Step 6961: {'lr': 0.0004774624370308813, 'samples': 3564544, 'steps': 6961, 'batch_loss/train': 0.7744967319304124}
12/22/2021 13:11:24 - INFO - codeparrot_training - Step 6962: {'lr': 0.0004774558958573904, 'samples': 3565056, 'steps': 6962, 'batch_loss/train': 0.8448918936774135}
12/22/2021 13:11:36 - INFO - codeparrot_training - Step 6963: {'lr': 0.00047744935377962074, 'samples': 3565568, 'steps': 6963, 'batch_loss/train': 0.9700009664520621}
12/22/2021 13:11:47 - INFO - codeparrot_training - Step 6964: {'lr': 0.00047744281079759835, 'samples': 3566080, 'steps': 6964, 'batch_loss/train': 0.839099545031786}
12/22/2021 13:11:57 - INFO - codeparrot_training - Step 6965: {'lr': 0.00047743626691134917, 'samples': 3566592, 'steps': 6965, 'batch_loss/train': 0.7385329082608223}
12/22/2021 13:12:09 - INFO - codeparrot_training - Step 6966: {'lr': 0.0004774297221208993, 'samples': 3567104, 'steps': 6966, 'batch_loss/train': 0.6754113277420402}
12/22/2021 13:12:20 - INFO - codeparrot_training - Step 6967: {'lr': 0.0004774231764262747, 'samples': 3567616, 'steps': 6967, 'batch_loss/train': 0.9213206134736538}
12/22/2021 13:12:31 - INFO - codeparrot_training - Step 6968: {'lr': 0.0004774166298275015, 'samples': 3568128, 'steps': 6968, 'batch_loss/train': 0.8607096262276173}
12/22/2021 13:12:41 - INFO - codeparrot_training - Step 6969: {'lr': 0.00047741008232460557, 'samples': 3568640, 'steps': 6969, 'batch_loss/train': 0.8601622032001615}
12/22/2021 13:12:55 - INFO - codeparrot_training - Step 6970: {'lr': 0.000477403533917613, 'samples': 3569152, 'steps': 6970, 'batch_loss/train': 0.7472593327984214}
12/22/2021 13:13:06 - INFO - codeparrot_training - Step 6971: {'lr': 0.00047739698460654993, 'samples': 3569664, 'steps': 6971, 'batch_loss/train': 0.9062351025640965}
12/22/2021 13:13:17 - INFO - codeparrot_training - Step 6972: {'lr': 0.0004773904343914423, 'samples': 3570176, 'steps': 6972, 'batch_loss/train': 0.7874990431591868}
12/22/2021 13:13:29 - INFO - codeparrot_training - Step 6973: {'lr': 0.00047738388327231613, 'samples': 3570688, 'steps': 6973, 'batch_loss/train': 0.5162037026602775}
12/22/2021 13:13:40 - INFO - codeparrot_training - Step 6974: {'lr': 0.0004773773312491975, 'samples': 3571200, 'steps': 6974, 'batch_loss/train': 0.7300380822271109}
12/22/2021 13:13:50 - INFO - codeparrot_training - Step 6975: {'lr': 0.0004773707783221125, 'samples': 3571712, 'steps': 6975, 'batch_loss/train': 0.8640876268036664}
12/22/2021 13:14:01 - INFO - codeparrot_training - Step 6976: {'lr': 0.00047736422449108705, 'samples': 3572224, 'steps': 6976, 'batch_loss/train': 0.8174218004569411}
12/22/2021 13:14:13 - INFO - codeparrot_training - Step 6977: {'lr': 0.0004773576697561474, 'samples': 3572736, 'steps': 6977, 'batch_loss/train': 0.9530540592968464}
12/22/2021 13:14:24 - INFO - codeparrot_training - Step 6978: {'lr': 0.00047735111411731937, 'samples': 3573248, 'steps': 6978, 'batch_loss/train': 0.8727387501858175}
12/22/2021 13:14:34 - INFO - codeparrot_training - Step 6979: {'lr': 0.00047734455757462925, 'samples': 3573760, 'steps': 6979, 'batch_loss/train': 0.79849180043675}
12/22/2021 13:14:46 - INFO - codeparrot_training - Step 6980: {'lr': 0.00047733800012810304, 'samples': 3574272, 'steps': 6980, 'batch_loss/train': 0.9248971473425627}
12/22/2021 13:14:57 - INFO - codeparrot_training - Step 6981: {'lr': 0.00047733144177776666, 'samples': 3574784, 'steps': 6981, 'batch_loss/train': 0.893150856718421}
12/22/2021 13:15:07 - INFO - codeparrot_training - Step 6982: {'lr': 0.00047732488252364634, 'samples': 3575296, 'steps': 6982, 'batch_loss/train': 0.8430629437789321}
12/22/2021 13:15:22 - INFO - codeparrot_training - Step 6983: {'lr': 0.0004773183223657682, 'samples': 3575808, 'steps': 6983, 'batch_loss/train': 0.9203350022435188}
12/22/2021 13:15:32 - INFO - codeparrot_training - Step 6984: {'lr': 0.00047731176130415823, 'samples': 3576320, 'steps': 6984, 'batch_loss/train': 0.8170223375782371}
12/22/2021 13:15:43 - INFO - codeparrot_training - Step 6985: {'lr': 0.00047730519933884244, 'samples': 3576832, 'steps': 6985, 'batch_loss/train': 1.1627113306894898}
12/22/2021 13:15:54 - INFO - codeparrot_training - Step 6986: {'lr': 0.00047729863646984706, 'samples': 3577344, 'steps': 6986, 'batch_loss/train': 0.9021399349439889}
12/22/2021 13:16:06 - INFO - codeparrot_training - Step 6987: {'lr': 0.0004772920726971981, 'samples': 3577856, 'steps': 6987, 'batch_loss/train': 0.7414624670054764}
12/22/2021 13:16:16 - INFO - codeparrot_training - Step 6988: {'lr': 0.00047728550802092175, 'samples': 3578368, 'steps': 6988, 'batch_loss/train': 0.8636791361495852}
12/22/2021 13:16:27 - INFO - codeparrot_training - Step 6989: {'lr': 0.00047727894244104397, 'samples': 3578880, 'steps': 6989, 'batch_loss/train': 0.8804600974544883}
12/22/2021 13:16:41 - INFO - codeparrot_training - Step 6990: {'lr': 0.000477272375957591, 'samples': 3579392, 'steps': 6990, 'batch_loss/train': 0.8248263928107917}
12/22/2021 13:16:51 - INFO - codeparrot_training - Step 6991: {'lr': 0.0004772658085705888, 'samples': 3579904, 'steps': 6991, 'batch_loss/train': 0.888526227325201}
12/22/2021 13:17:02 - INFO - codeparrot_training - Step 6992: {'lr': 0.0004772592402800636, 'samples': 3580416, 'steps': 6992, 'batch_loss/train': 0.8689745874144137}
12/22/2021 13:17:14 - INFO - codeparrot_training - Step 6993: {'lr': 0.00047725267108604145, 'samples': 3580928, 'steps': 6993, 'batch_loss/train': 0.7819505874067545}
12/22/2021 13:17:25 - INFO - codeparrot_training - Step 6994: {'lr': 0.0004772461009885485, 'samples': 3581440, 'steps': 6994, 'batch_loss/train': 0.920429764315486}
12/22/2021 13:17:35 - INFO - codeparrot_training - Step 6995: {'lr': 0.0004772395299876109, 'samples': 3581952, 'steps': 6995, 'batch_loss/train': 0.809286629781127}
12/22/2021 13:17:46 - INFO - codeparrot_training - Step 6996: {'lr': 0.0004772329580832546, 'samples': 3582464, 'steps': 6996, 'batch_loss/train': 0.87276516482234}
12/22/2021 13:17:58 - INFO - codeparrot_training - Step 6997: {'lr': 0.00047722638527550597, 'samples': 3582976, 'steps': 6997, 'batch_loss/train': 0.8098124042153358}
12/22/2021 13:18:09 - INFO - codeparrot_training - Step 6998: {'lr': 0.00047721981156439097, 'samples': 3583488, 'steps': 6998, 'batch_loss/train': 0.7616244433447719}
12/22/2021 13:18:19 - INFO - codeparrot_training - Step 6999: {'lr': 0.0004772132369499359, 'samples': 3584000, 'steps': 6999, 'batch_loss/train': 0.9632371235638857}
12/22/2021 13:18:33 - INFO - codeparrot_training - Step 7000: {'lr': 0.00047720666143216667, 'samples': 3584512, 'steps': 7000, 'batch_loss/train': 0.8480719504877925}
12/22/2021 13:18:44 - INFO - codeparrot_training - Step 7001: {'lr': 0.00047720008501110964, 'samples': 3585024, 'steps': 7001, 'batch_loss/train': 0.8087753960862756}
12/22/2021 13:18:54 - INFO - codeparrot_training - Step 7002: {'lr': 0.0004771935076867908, 'samples': 3585536, 'steps': 7002, 'batch_loss/train': 0.8434031791985035}
12/22/2021 13:19:07 - INFO - codeparrot_training - Step 7003: {'lr': 0.00047718692945923635, 'samples': 3586048, 'steps': 7003, 'batch_loss/train': 0.8066940014250576}
12/22/2021 13:19:17 - INFO - codeparrot_training - Step 7004: {'lr': 0.0004771803503284725, 'samples': 3586560, 'steps': 7004, 'batch_loss/train': 0.8591175116598606}
12/22/2021 13:19:28 - INFO - codeparrot_training - Step 7005: {'lr': 0.0004771737702945253, 'samples': 3587072, 'steps': 7005, 'batch_loss/train': 0.7318648416548967}
12/22/2021 13:19:38 - INFO - codeparrot_training - Step 7006: {'lr': 0.000477167189357421, 'samples': 3587584, 'steps': 7006, 'batch_loss/train': 0.7920144647359848}
12/22/2021 13:20:00 - INFO - codeparrot_training - Step 7007: {'lr': 0.0004771606075171857, 'samples': 3588096, 'steps': 7007, 'batch_loss/train': 0.8123489310964942}
12/22/2021 13:20:11 - INFO - codeparrot_training - Step 7008: {'lr': 0.00047715402477384566, 'samples': 3588608, 'steps': 7008, 'batch_loss/train': 1.0712685333564878}
12/22/2021 13:20:22 - INFO - codeparrot_training - Step 7009: {'lr': 0.00047714744112742694, 'samples': 3589120, 'steps': 7009, 'batch_loss/train': 0.9029171662405133}
12/22/2021 13:20:34 - INFO - codeparrot_training - Step 7010: {'lr': 0.0004771408565779558, 'samples': 3589632, 'steps': 7010, 'batch_loss/train': 0.7341439817100763}
12/22/2021 13:20:45 - INFO - codeparrot_training - Step 7011: {'lr': 0.0004771342711254584, 'samples': 3590144, 'steps': 7011, 'batch_loss/train': 0.7521173427812755}
12/22/2021 13:20:55 - INFO - codeparrot_training - Step 7012: {'lr': 0.00047712768476996083, 'samples': 3590656, 'steps': 7012, 'batch_loss/train': 0.7966316761448979}
12/22/2021 13:21:08 - INFO - codeparrot_training - Step 7013: {'lr': 0.00047712109751148934, 'samples': 3591168, 'steps': 7013, 'batch_loss/train': 0.821582411415875}
12/22/2021 13:21:19 - INFO - codeparrot_training - Step 7014: {'lr': 0.00047711450935007015, 'samples': 3591680, 'steps': 7014, 'batch_loss/train': 0.9432555558159947}
12/22/2021 13:21:29 - INFO - codeparrot_training - Step 7015: {'lr': 0.0004771079202857294, 'samples': 3592192, 'steps': 7015, 'batch_loss/train': 0.8388312924653292}
12/22/2021 13:21:40 - INFO - codeparrot_training - Step 7016: {'lr': 0.0004771013303184933, 'samples': 3592704, 'steps': 7016, 'batch_loss/train': 1.3167557008564472}
12/22/2021 13:21:52 - INFO - codeparrot_training - Step 7017: {'lr': 0.00047709473944838816, 'samples': 3593216, 'steps': 7017, 'batch_loss/train': 0.9328236470464617}
12/22/2021 13:22:03 - INFO - codeparrot_training - Step 7018: {'lr': 0.00047708814767544004, 'samples': 3593728, 'steps': 7018, 'batch_loss/train': 0.864845375996083}
12/22/2021 13:22:13 - INFO - codeparrot_training - Step 7019: {'lr': 0.0004770815549996752, 'samples': 3594240, 'steps': 7019, 'batch_loss/train': 0.7420060265576467}
12/22/2021 13:22:27 - INFO - codeparrot_training - Step 7020: {'lr': 0.00047707496142111977, 'samples': 3594752, 'steps': 7020, 'batch_loss/train': 0.8492612624540925}
12/22/2021 13:22:38 - INFO - codeparrot_training - Step 7021: {'lr': 0.00047706836693980003, 'samples': 3595264, 'steps': 7021, 'batch_loss/train': 0.8985769636929035}
12/22/2021 13:22:49 - INFO - codeparrot_training - Step 7022: {'lr': 0.0004770617715557423, 'samples': 3595776, 'steps': 7022, 'batch_loss/train': 0.7920430679805577}
12/22/2021 13:22:59 - INFO - codeparrot_training - Step 7023: {'lr': 0.00047705517526897267, 'samples': 3596288, 'steps': 7023, 'batch_loss/train': 0.8662904505617917}
12/22/2021 13:23:11 - INFO - codeparrot_training - Step 7024: {'lr': 0.00047704857807951735, 'samples': 3596800, 'steps': 7024, 'batch_loss/train': 0.7840550448745489}
12/22/2021 13:23:22 - INFO - codeparrot_training - Step 7025: {'lr': 0.00047704197998740263, 'samples': 3597312, 'steps': 7025, 'batch_loss/train': 0.7082425514236093}
12/22/2021 13:23:32 - INFO - codeparrot_training - Step 7026: {'lr': 0.00047703538099265475, 'samples': 3597824, 'steps': 7026, 'batch_loss/train': 0.7559889508411288}
12/22/2021 13:23:44 - INFO - codeparrot_training - Step 7027: {'lr': 0.0004770287810952999, 'samples': 3598336, 'steps': 7027, 'batch_loss/train': 0.8111560088582337}
12/22/2021 13:23:55 - INFO - codeparrot_training - Step 7028: {'lr': 0.0004770221802953644, 'samples': 3598848, 'steps': 7028, 'batch_loss/train': 0.733147791819647}
12/22/2021 13:24:06 - INFO - codeparrot_training - Step 7029: {'lr': 0.00047701557859287436, 'samples': 3599360, 'steps': 7029, 'batch_loss/train': 0.8258112077601254}
12/22/2021 13:24:20 - INFO - codeparrot_training - Step 7030: {'lr': 0.0004770089759878561, 'samples': 3599872, 'steps': 7030, 'batch_loss/train': 0.8322807636577636}
12/22/2021 13:24:30 - INFO - codeparrot_training - Step 7031: {'lr': 0.0004770023724803358, 'samples': 3600384, 'steps': 7031, 'batch_loss/train': 0.8060726318508387}
12/22/2021 13:24:41 - INFO - codeparrot_training - Step 7032: {'lr': 0.00047699576807033994, 'samples': 3600896, 'steps': 7032, 'batch_loss/train': 0.897120657376945}
12/22/2021 13:24:52 - INFO - codeparrot_training - Step 7033: {'lr': 0.0004769891627578945, 'samples': 3601408, 'steps': 7033, 'batch_loss/train': 0.767275545746088}
12/22/2021 13:25:04 - INFO - codeparrot_training - Step 7034: {'lr': 0.0004769825565430259, 'samples': 3601920, 'steps': 7034, 'batch_loss/train': 0.7763953669928014}
12/22/2021 13:25:14 - INFO - codeparrot_training - Step 7035: {'lr': 0.0004769759494257604, 'samples': 3602432, 'steps': 7035, 'batch_loss/train': 0.8087208885699511}
12/22/2021 13:25:25 - INFO - codeparrot_training - Step 7036: {'lr': 0.0004769693414061241, 'samples': 3602944, 'steps': 7036, 'batch_loss/train': 0.8206933187320828}
12/22/2021 13:25:39 - INFO - codeparrot_training - Step 7037: {'lr': 0.0004769627324841436, 'samples': 3603456, 'steps': 7037, 'batch_loss/train': 1.003490949049592}
12/22/2021 13:25:49 - INFO - codeparrot_training - Step 7038: {'lr': 0.0004769561226598448, 'samples': 3603968, 'steps': 7038, 'batch_loss/train': 0.8465441679582}
12/22/2021 13:26:00 - INFO - codeparrot_training - Step 7039: {'lr': 0.0004769495119332542, 'samples': 3604480, 'steps': 7039, 'batch_loss/train': 0.8399830055423081}
12/22/2021 13:26:12 - INFO - codeparrot_training - Step 7040: {'lr': 0.00047694290030439804, 'samples': 3604992, 'steps': 7040, 'batch_loss/train': 0.79094418277964}
12/22/2021 13:26:23 - INFO - codeparrot_training - Step 7041: {'lr': 0.00047693628777330254, 'samples': 3605504, 'steps': 7041, 'batch_loss/train': 0.7672763536684215}
12/22/2021 13:26:33 - INFO - codeparrot_training - Step 7042: {'lr': 0.0004769296743399941, 'samples': 3606016, 'steps': 7042, 'batch_loss/train': 0.8213579086586833}
12/22/2021 13:26:45 - INFO - codeparrot_training - Step 7043: {'lr': 0.000476923060004499, 'samples': 3606528, 'steps': 7043, 'batch_loss/train': 0.8888020953163505}
12/22/2021 13:26:56 - INFO - codeparrot_training - Step 7044: {'lr': 0.0004769164447668435, 'samples': 3607040, 'steps': 7044, 'batch_loss/train': 0.8189800798427314}
12/22/2021 13:27:07 - INFO - codeparrot_training - Step 7045: {'lr': 0.0004769098286270538, 'samples': 3607552, 'steps': 7045, 'batch_loss/train': 0.7725118407979608}
12/22/2021 13:27:17 - INFO - codeparrot_training - Step 7046: {'lr': 0.0004769032115851564, 'samples': 3608064, 'steps': 7046, 'batch_loss/train': 0.7922098655253649}
12/22/2021 13:27:31 - INFO - codeparrot_training - Step 7047: {'lr': 0.00047689659364117743, 'samples': 3608576, 'steps': 7047, 'batch_loss/train': 0.8290647752583027}
12/22/2021 13:27:42 - INFO - codeparrot_training - Step 7048: {'lr': 0.00047688997479514335, 'samples': 3609088, 'steps': 7048, 'batch_loss/train': 0.8600551020354033}
12/22/2021 13:27:53 - INFO - codeparrot_training - Step 7049: {'lr': 0.0004768833550470804, 'samples': 3609600, 'steps': 7049, 'batch_loss/train': 0.8629018338397145}
12/22/2021 13:28:05 - INFO - codeparrot_training - Step 7050: {'lr': 0.0004768767343970148, 'samples': 3610112, 'steps': 7050, 'batch_loss/train': 0.7905005598440766}
12/22/2021 13:28:15 - INFO - codeparrot_training - Step 7051: {'lr': 0.00047687011284497306, 'samples': 3610624, 'steps': 7051, 'batch_loss/train': 0.8341086134314537}
12/22/2021 13:28:26 - INFO - codeparrot_training - Step 7052: {'lr': 0.0004768634903909813, 'samples': 3611136, 'steps': 7052, 'batch_loss/train': 0.8044734504073858}
12/22/2021 13:28:39 - INFO - codeparrot_training - Step 7053: {'lr': 0.0004768568670350661, 'samples': 3611648, 'steps': 7053, 'batch_loss/train': 0.8528203056193888}
12/22/2021 13:28:50 - INFO - codeparrot_training - Step 7054: {'lr': 0.00047685024277725356, 'samples': 3612160, 'steps': 7054, 'batch_loss/train': 0.7774304173653945}
12/22/2021 13:29:00 - INFO - codeparrot_training - Step 7055: {'lr': 0.00047684361761757017, 'samples': 3612672, 'steps': 7055, 'batch_loss/train': 0.8917577397078276}
12/22/2021 13:29:11 - INFO - codeparrot_training - Step 7056: {'lr': 0.0004768369915560422, 'samples': 3613184, 'steps': 7056, 'batch_loss/train': 0.8755016513168812}
12/22/2021 13:29:23 - INFO - codeparrot_training - Step 7057: {'lr': 0.000476830364592696, 'samples': 3613696, 'steps': 7057, 'batch_loss/train': 0.8073353637009859}
12/22/2021 13:29:34 - INFO - codeparrot_training - Step 7058: {'lr': 0.00047682373672755784, 'samples': 3614208, 'steps': 7058, 'batch_loss/train': 0.7559750126674771}
12/22/2021 13:29:44 - INFO - codeparrot_training - Step 7059: {'lr': 0.00047681710796065424, 'samples': 3614720, 'steps': 7059, 'batch_loss/train': 0.8330523520708084}
12/22/2021 13:29:57 - INFO - codeparrot_training - Step 7060: {'lr': 0.00047681047829201145, 'samples': 3615232, 'steps': 7060, 'batch_loss/train': 0.8139301047194749}
12/22/2021 13:30:08 - INFO - codeparrot_training - Step 7061: {'lr': 0.0004768038477216559, 'samples': 3615744, 'steps': 7061, 'batch_loss/train': 0.8427695950958878}
12/22/2021 13:30:18 - INFO - codeparrot_training - Step 7062: {'lr': 0.00047679721624961375, 'samples': 3616256, 'steps': 7062, 'batch_loss/train': 0.793532314710319}
12/22/2021 13:30:29 - INFO - codeparrot_training - Step 7063: {'lr': 0.0004767905838759116, 'samples': 3616768, 'steps': 7063, 'batch_loss/train': 0.8130671232938766}
12/22/2021 13:30:41 - INFO - codeparrot_training - Step 7064: {'lr': 0.0004767839506005757, 'samples': 3617280, 'steps': 7064, 'batch_loss/train': 0.8693484053947031}
12/22/2021 13:30:51 - INFO - codeparrot_training - Step 7065: {'lr': 0.0004767773164236324, 'samples': 3617792, 'steps': 7065, 'batch_loss/train': 0.8516583032906055}
12/22/2021 13:31:02 - INFO - codeparrot_training - Step 7066: {'lr': 0.00047677068134510816, 'samples': 3618304, 'steps': 7066, 'batch_loss/train': 0.9370281286537647}
12/22/2021 13:31:16 - INFO - codeparrot_training - Step 7067: {'lr': 0.00047676404536502924, 'samples': 3618816, 'steps': 7067, 'batch_loss/train': 0.8550166366621852}
12/22/2021 13:31:26 - INFO - codeparrot_training - Step 7068: {'lr': 0.0004767574084834222, 'samples': 3619328, 'steps': 7068, 'batch_loss/train': 0.8063106518238783}
12/22/2021 13:31:37 - INFO - codeparrot_training - Step 7069: {'lr': 0.0004767507707003133, 'samples': 3619840, 'steps': 7069, 'batch_loss/train': 0.8237781170755625}
12/22/2021 13:31:49 - INFO - codeparrot_training - Step 7070: {'lr': 0.000476744132015729, 'samples': 3620352, 'steps': 7070, 'batch_loss/train': 0.902932426892221}
12/22/2021 13:32:00 - INFO - codeparrot_training - Step 7071: {'lr': 0.00047673749242969557, 'samples': 3620864, 'steps': 7071, 'batch_loss/train': 0.7826687768101692}
12/22/2021 13:32:11 - INFO - codeparrot_training - Step 7072: {'lr': 0.0004767308519422395, 'samples': 3621376, 'steps': 7072, 'batch_loss/train': 0.8336544372141361}
12/22/2021 13:32:21 - INFO - codeparrot_training - Step 7073: {'lr': 0.00047672421055338716, 'samples': 3621888, 'steps': 7073, 'batch_loss/train': 0.6662880186922848}
12/22/2021 13:32:34 - INFO - codeparrot_training - Step 7074: {'lr': 0.00047671756826316497, 'samples': 3622400, 'steps': 7074, 'batch_loss/train': 0.7840192019939423}
12/22/2021 13:32:45 - INFO - codeparrot_training - Step 7075: {'lr': 0.0004767109250715994, 'samples': 3622912, 'steps': 7075, 'batch_loss/train': 0.9108171267434955}
12/22/2021 13:32:56 - INFO - codeparrot_training - Step 7076: {'lr': 0.00047670428097871663, 'samples': 3623424, 'steps': 7076, 'batch_loss/train': 0.7256877054460347}
12/22/2021 13:33:09 - INFO - codeparrot_training - Step 7077: {'lr': 0.0004766976359845434, 'samples': 3623936, 'steps': 7077, 'batch_loss/train': 0.48092768620699644}
12/22/2021 13:33:20 - INFO - codeparrot_training - Step 7078: {'lr': 0.00047669099008910596, 'samples': 3624448, 'steps': 7078, 'batch_loss/train': 0.8403696184977889}
12/22/2021 13:33:30 - INFO - codeparrot_training - Step 7079: {'lr': 0.0004766843432924307, 'samples': 3624960, 'steps': 7079, 'batch_loss/train': 0.920107388868928}
12/22/2021 13:33:41 - INFO - codeparrot_training - Step 7080: {'lr': 0.00047667769559454404, 'samples': 3625472, 'steps': 7080, 'batch_loss/train': 0.8327355794608593}
12/22/2021 13:33:53 - INFO - codeparrot_training - Step 7081: {'lr': 0.00047667104699547247, 'samples': 3625984, 'steps': 7081, 'batch_loss/train': 0.9096828131005168}
12/22/2021 13:34:04 - INFO - codeparrot_training - Step 7082: {'lr': 0.00047666439749524247, 'samples': 3626496, 'steps': 7082, 'batch_loss/train': 0.8056904695695266}
12/22/2021 13:34:14 - INFO - codeparrot_training - Step 7083: {'lr': 0.0004766577470938803, 'samples': 3627008, 'steps': 7083, 'batch_loss/train': 0.8387500159442425}
12/22/2021 13:34:29 - INFO - codeparrot_training - Step 7084: {'lr': 0.00047665109579141264, 'samples': 3627520, 'steps': 7084, 'batch_loss/train': 0.8809711365029216}
12/22/2021 13:34:39 - INFO - codeparrot_training - Step 7085: {'lr': 0.00047664444358786574, 'samples': 3628032, 'steps': 7085, 'batch_loss/train': 0.8100851215422153}
12/22/2021 13:34:50 - INFO - codeparrot_training - Step 7086: {'lr': 0.00047663779048326605, 'samples': 3628544, 'steps': 7086, 'batch_loss/train': 1.1396552305668592}
12/22/2021 13:35:01 - INFO - codeparrot_training - Step 7087: {'lr': 0.00047663113647764015, 'samples': 3629056, 'steps': 7087, 'batch_loss/train': 0.854038143530488}
12/22/2021 13:35:13 - INFO - codeparrot_training - Step 7088: {'lr': 0.00047662448157101444, 'samples': 3629568, 'steps': 7088, 'batch_loss/train': 0.9839891539886594}
12/22/2021 13:35:23 - INFO - codeparrot_training - Step 7089: {'lr': 0.0004766178257634153, 'samples': 3630080, 'steps': 7089, 'batch_loss/train': 0.8230416048318148}
12/22/2021 13:35:34 - INFO - codeparrot_training - Step 7090: {'lr': 0.00047661116905486933, 'samples': 3630592, 'steps': 7090, 'batch_loss/train': 0.8274489536415786}
12/22/2021 13:35:46 - INFO - codeparrot_training - Step 7091: {'lr': 0.00047660451144540284, 'samples': 3631104, 'steps': 7091, 'batch_loss/train': 0.8366961516439915}
12/22/2021 13:35:57 - INFO - codeparrot_training - Step 7092: {'lr': 0.0004765978529350424, 'samples': 3631616, 'steps': 7092, 'batch_loss/train': 0.9617505818605423}
12/22/2021 13:36:07 - INFO - codeparrot_training - Step 7093: {'lr': 0.00047659119352381455, 'samples': 3632128, 'steps': 7093, 'batch_loss/train': 0.8827696447260678}
12/22/2021 13:36:21 - INFO - codeparrot_training - Step 7094: {'lr': 0.00047658453321174554, 'samples': 3632640, 'steps': 7094, 'batch_loss/train': 0.8289554044604301}
12/22/2021 13:36:32 - INFO - codeparrot_training - Step 7095: {'lr': 0.00047657787199886206, 'samples': 3633152, 'steps': 7095, 'batch_loss/train': 0.9103460758924484}
12/22/2021 13:36:42 - INFO - codeparrot_training - Step 7096: {'lr': 0.0004765712098851904, 'samples': 3633664, 'steps': 7096, 'batch_loss/train': 0.8728581639006734}
12/22/2021 13:36:54 - INFO - codeparrot_training - Step 7097: {'lr': 0.0004765645468707573, 'samples': 3634176, 'steps': 7097, 'batch_loss/train': 0.9022845355793834}
12/22/2021 13:37:05 - INFO - codeparrot_training - Step 7098: {'lr': 0.000476557882955589, 'samples': 3634688, 'steps': 7098, 'batch_loss/train': 0.8465368514880538}
12/22/2021 13:37:16 - INFO - codeparrot_training - Step 7099: {'lr': 0.0004765512181397122, 'samples': 3635200, 'steps': 7099, 'batch_loss/train': 0.8484116578474641}
12/22/2021 13:37:26 - INFO - codeparrot_training - Step 7100: {'lr': 0.00047654455242315325, 'samples': 3635712, 'steps': 7100, 'batch_loss/train': 0.9018256920389831}
12/22/2021 13:37:38 - INFO - codeparrot_training - Step 7101: {'lr': 0.0004765378858059387, 'samples': 3636224, 'steps': 7101, 'batch_loss/train': 0.8866120120510459}
12/22/2021 13:37:49 - INFO - codeparrot_training - Step 7102: {'lr': 0.0004765312182880951, 'samples': 3636736, 'steps': 7102, 'batch_loss/train': 0.8454292928799987}
12/22/2021 13:37:59 - INFO - codeparrot_training - Step 7103: {'lr': 0.0004765245498696488, 'samples': 3637248, 'steps': 7103, 'batch_loss/train': 0.8855830949032679}
12/22/2021 13:38:12 - INFO - codeparrot_training - Step 7104: {'lr': 0.0004765178805506265, 'samples': 3637760, 'steps': 7104, 'batch_loss/train': 0.8894941918551922}
12/22/2021 13:38:22 - INFO - codeparrot_training - Step 7105: {'lr': 0.0004765112103310546, 'samples': 3638272, 'steps': 7105, 'batch_loss/train': 1.015096073038876}
12/22/2021 13:38:33 - INFO - codeparrot_training - Step 7106: {'lr': 0.0004765045392109597, 'samples': 3638784, 'steps': 7106, 'batch_loss/train': 0.8128266828134656}
12/22/2021 13:38:46 - INFO - codeparrot_training - Step 7107: {'lr': 0.0004764978671903683, 'samples': 3639296, 'steps': 7107, 'batch_loss/train': 0.9309484660625458}
12/22/2021 13:38:57 - INFO - codeparrot_training - Step 7108: {'lr': 0.00047649119426930686, 'samples': 3639808, 'steps': 7108, 'batch_loss/train': 0.8738606367260218}
12/22/2021 13:39:07 - INFO - codeparrot_training - Step 7109: {'lr': 0.00047648452044780193, 'samples': 3640320, 'steps': 7109, 'batch_loss/train': 0.8142558755353093}
12/22/2021 13:39:18 - INFO - codeparrot_training - Step 7110: {'lr': 0.00047647784572588006, 'samples': 3640832, 'steps': 7110, 'batch_loss/train': 0.9466039258986712}
12/22/2021 13:39:30 - INFO - codeparrot_training - Step 7111: {'lr': 0.00047647117010356785, 'samples': 3641344, 'steps': 7111, 'batch_loss/train': 0.7973197605460882}
12/22/2021 13:39:41 - INFO - codeparrot_training - Step 7112: {'lr': 0.0004764644935808917, 'samples': 3641856, 'steps': 7112, 'batch_loss/train': 0.8898186720907688}
12/22/2021 13:39:51 - INFO - codeparrot_training - Step 7113: {'lr': 0.0004764578161578783, 'samples': 3642368, 'steps': 7113, 'batch_loss/train': 0.763585589826107}
12/22/2021 13:40:05 - INFO - codeparrot_training - Step 7114: {'lr': 0.0004764511378345541, 'samples': 3642880, 'steps': 7114, 'batch_loss/train': 0.8785209606867284}
12/22/2021 13:40:15 - INFO - codeparrot_training - Step 7115: {'lr': 0.0004764444586109457, 'samples': 3643392, 'steps': 7115, 'batch_loss/train': 0.6884577474556863}
12/22/2021 13:40:26 - INFO - codeparrot_training - Step 7116: {'lr': 0.00047643777848707967, 'samples': 3643904, 'steps': 7116, 'batch_loss/train': 0.9689983483403921}
12/22/2021 13:40:38 - INFO - codeparrot_training - Step 7117: {'lr': 0.00047643109746298243, 'samples': 3644416, 'steps': 7117, 'batch_loss/train': 0.8694139407598414}
12/22/2021 13:40:49 - INFO - codeparrot_training - Step 7118: {'lr': 0.0004764244155386807, 'samples': 3644928, 'steps': 7118, 'batch_loss/train': 0.8653029389679432}
12/22/2021 13:40:59 - INFO - codeparrot_training - Step 7119: {'lr': 0.000476417732714201, 'samples': 3645440, 'steps': 7119, 'batch_loss/train': 0.7074047073256224}
12/22/2021 13:41:10 - INFO - codeparrot_training - Step 7120: {'lr': 0.0004764110489895699, 'samples': 3645952, 'steps': 7120, 'batch_loss/train': 0.8879531035199761}
12/22/2021 13:41:22 - INFO - codeparrot_training - Step 7121: {'lr': 0.0004764043643648139, 'samples': 3646464, 'steps': 7121, 'batch_loss/train': 0.8819845207035542}
12/22/2021 13:41:33 - INFO - codeparrot_training - Step 7122: {'lr': 0.00047639767883995965, 'samples': 3646976, 'steps': 7122, 'batch_loss/train': 0.8544998448342085}
12/22/2021 13:41:43 - INFO - codeparrot_training - Step 7123: {'lr': 0.00047639099241503376, 'samples': 3647488, 'steps': 7123, 'batch_loss/train': 0.873342078179121}
12/22/2021 13:41:57 - INFO - codeparrot_training - Step 7124: {'lr': 0.0004763843050900627, 'samples': 3648000, 'steps': 7124, 'batch_loss/train': 0.7866378715261817}
12/22/2021 13:42:07 - INFO - codeparrot_training - Step 7125: {'lr': 0.0004763776168650732, 'samples': 3648512, 'steps': 7125, 'batch_loss/train': 0.5596708323573694}
12/22/2021 13:42:18 - INFO - codeparrot_training - Step 7126: {'lr': 0.0004763709277400917, 'samples': 3649024, 'steps': 7126, 'batch_loss/train': 0.8784467459190637}
12/22/2021 13:42:30 - INFO - codeparrot_training - Step 7127: {'lr': 0.00047636423771514494, 'samples': 3649536, 'steps': 7127, 'batch_loss/train': 0.8426903672516346}
12/22/2021 13:42:41 - INFO - codeparrot_training - Step 7128: {'lr': 0.0004763575467902594, 'samples': 3650048, 'steps': 7128, 'batch_loss/train': 0.8814342524856329}
12/22/2021 13:42:51 - INFO - codeparrot_training - Step 7129: {'lr': 0.0004763508549654617, 'samples': 3650560, 'steps': 7129, 'batch_loss/train': 0.8071393212303519}
12/22/2021 13:43:02 - INFO - codeparrot_training - Step 7130: {'lr': 0.0004763441622407785, 'samples': 3651072, 'steps': 7130, 'batch_loss/train': 0.9341746838763356}
12/22/2021 13:43:14 - INFO - codeparrot_training - Step 7131: {'lr': 0.0004763374686162364, 'samples': 3651584, 'steps': 7131, 'batch_loss/train': 0.7261282964609563}
12/22/2021 13:43:25 - INFO - codeparrot_training - Step 7132: {'lr': 0.00047633077409186186, 'samples': 3652096, 'steps': 7132, 'batch_loss/train': 0.9295733198523521}
12/22/2021 13:43:35 - INFO - codeparrot_training - Step 7133: {'lr': 0.00047632407866768177, 'samples': 3652608, 'steps': 7133, 'batch_loss/train': 0.6995535967871547}
12/22/2021 13:43:49 - INFO - codeparrot_training - Step 7134: {'lr': 0.0004763173823437226, 'samples': 3653120, 'steps': 7134, 'batch_loss/train': 0.8426833958365023}
12/22/2021 13:44:00 - INFO - codeparrot_training - Step 7135: {'lr': 0.0004763106851200109, 'samples': 3653632, 'steps': 7135, 'batch_loss/train': 0.9551356434822083}
12/22/2021 13:44:10 - INFO - codeparrot_training - Step 7136: {'lr': 0.00047630398699657336, 'samples': 3654144, 'steps': 7136, 'batch_loss/train': 0.8156138542108238}
12/22/2021 13:44:23 - INFO - codeparrot_training - Step 7137: {'lr': 0.00047629728797343664, 'samples': 3654656, 'steps': 7137, 'batch_loss/train': 0.8925898605957627}
12/22/2021 13:44:33 - INFO - codeparrot_training - Step 7138: {'lr': 0.0004762905880506273, 'samples': 3655168, 'steps': 7138, 'batch_loss/train': 0.8705204986035824}
12/22/2021 13:44:44 - INFO - codeparrot_training - Step 7139: {'lr': 0.00047628388722817204, 'samples': 3655680, 'steps': 7139, 'batch_loss/train': 0.7514646993950009}
12/22/2021 13:44:55 - INFO - codeparrot_training - Step 7140: {'lr': 0.00047627718550609756, 'samples': 3656192, 'steps': 7140, 'batch_loss/train': 0.7150139478035271}
12/22/2021 13:45:07 - INFO - codeparrot_training - Step 7141: {'lr': 0.0004762704828844304, 'samples': 3656704, 'steps': 7141, 'batch_loss/train': 0.8762044282630086}
12/22/2021 13:45:17 - INFO - codeparrot_training - Step 7142: {'lr': 0.00047626377936319723, 'samples': 3657216, 'steps': 7142, 'batch_loss/train': 0.8232256481423974}
12/22/2021 13:45:28 - INFO - codeparrot_training - Step 7143: {'lr': 0.0004762570749424247, 'samples': 3657728, 'steps': 7143, 'batch_loss/train': 0.8155152429826558}
12/22/2021 13:45:40 - INFO - codeparrot_training - Step 7144: {'lr': 0.00047625036962213954,
'samples': 3658240, 'steps': 7144, 'batch_loss/train': 0.9215041087009013} 12/22/2021 13:45:51 - INFO - codeparrot_training - Step 7145: {'lr': 0.00047624366340236825, 'samples': 3658752, 'steps': 7145, 'batch_loss/train': 0.9513242393732071} 12/22/2021 13:46:01 - INFO - codeparrot_training - Step 7146: {'lr': 0.0004762369562831376, 'samples': 3659264, 'steps': 7146, 'batch_loss/train': 0.8776032170280814} 12/22/2021 13:46:14 - INFO - codeparrot_training - Step 7147: {'lr': 0.00047623024826447426, 'samples': 3659776, 'steps': 7147, 'batch_loss/train': 0.8841526741161942} 12/22/2021 13:46:25 - INFO - codeparrot_training - Step 7148: {'lr': 0.00047622353934640484, 'samples': 3660288, 'steps': 7148, 'batch_loss/train': 0.8526532286778092} 12/22/2021 13:46:36 - INFO - codeparrot_training - Step 7149: {'lr': 0.00047621682952895606, 'samples': 3660800, 'steps': 7149, 'batch_loss/train': 0.8978823609650135} 12/22/2021 13:46:48 - INFO - codeparrot_training - Step 7150: {'lr': 0.0004762101188121546, 'samples': 3661312, 'steps': 7150, 'batch_loss/train': 0.8467078460380435} 12/22/2021 13:46:58 - INFO - codeparrot_training - Step 7151: {'lr': 0.0004762034071960271, 'samples': 3661824, 'steps': 7151, 'batch_loss/train': 0.9056877875700593} 12/22/2021 13:47:09 - INFO - codeparrot_training - Step 7152: {'lr': 0.0004761966946806002, 'samples': 3662336, 'steps': 7152, 'batch_loss/train': 0.8530835499987006} 12/22/2021 13:47:20 - INFO - codeparrot_training - Step 7153: {'lr': 0.00047618998126590075, 'samples': 3662848, 'steps': 7153, 'batch_loss/train': 0.8692097533494234} 12/22/2021 13:47:33 - INFO - codeparrot_training - Step 7154: {'lr': 0.00047618326695195536, 'samples': 3663360, 'steps': 7154, 'batch_loss/train': 0.7967000403441489} 12/22/2021 13:47:43 - INFO - codeparrot_training - Step 7155: {'lr': 0.0004761765517387906, 'samples': 3663872, 'steps': 7155, 'batch_loss/train': 0.8301424905657768} 12/22/2021 13:47:54 - INFO - codeparrot_training - Step 7156: {'lr': 
0.00047616983562643337, 'samples': 3664384, 'steps': 7156, 'batch_loss/train': 0.8035162361338735} 12/22/2021 13:48:06 - INFO - codeparrot_training - Step 7157: {'lr': 0.0004761631186149102, 'samples': 3664896, 'steps': 7157, 'batch_loss/train': 0.8300117538310587} 12/22/2021 13:48:17 - INFO - codeparrot_training - Step 7158: {'lr': 0.00047615640070424787, 'samples': 3665408, 'steps': 7158, 'batch_loss/train': 0.8256598338484764} 12/22/2021 13:48:27 - INFO - codeparrot_training - Step 7159: {'lr': 0.0004761496818944731, 'samples': 3665920, 'steps': 7159, 'batch_loss/train': 0.8499274770729244} 12/22/2021 13:48:39 - INFO - codeparrot_training - Step 7160: {'lr': 0.0004761429621856126, 'samples': 3666432, 'steps': 7160, 'batch_loss/train': 0.9071199833415449} 12/22/2021 13:48:50 - INFO - codeparrot_training - Step 7161: {'lr': 0.00047613624157769307, 'samples': 3666944, 'steps': 7161, 'batch_loss/train': 0.9202789552509785} 12/22/2021 13:49:00 - INFO - codeparrot_training - Step 7162: {'lr': 0.00047612952007074124, 'samples': 3667456, 'steps': 7162, 'batch_loss/train': 0.7640180857852101} 12/22/2021 13:49:11 - INFO - codeparrot_training - Step 7163: {'lr': 0.00047612279766478377, 'samples': 3667968, 'steps': 7163, 'batch_loss/train': 0.8672564388252795} 12/22/2021 13:49:25 - INFO - codeparrot_training - Step 7164: {'lr': 0.0004761160743598475, 'samples': 3668480, 'steps': 7164, 'batch_loss/train': 0.8322676233947277} 12/22/2021 13:49:36 - INFO - codeparrot_training - Step 7165: {'lr': 0.00047610935015595903, 'samples': 3668992, 'steps': 7165, 'batch_loss/train': 0.899617588147521} 12/22/2021 13:49:46 - INFO - codeparrot_training - Step 7166: {'lr': 0.0004761026250531452, 'samples': 3669504, 'steps': 7166, 'batch_loss/train': 0.7859832406975329} 12/22/2021 13:49:58 - INFO - codeparrot_training - Step 7167: {'lr': 0.0004760958990514327, 'samples': 3670016, 'steps': 7167, 'batch_loss/train': 0.8024688176810741} 12/22/2021 13:50:09 - INFO - codeparrot_training - Step 
7168: {'lr': 0.0004760891721508483, 'samples': 3670528, 'steps': 7168, 'batch_loss/train': 0.8472095895558596} 12/22/2021 13:50:20 - INFO - codeparrot_training - Step 7169: {'lr': 0.0004760824443514187, 'samples': 3671040, 'steps': 7169, 'batch_loss/train': 0.8130195699632168} 12/22/2021 13:50:33 - INFO - codeparrot_training - Step 7170: {'lr': 0.00047607571565317063, 'samples': 3671552, 'steps': 7170, 'batch_loss/train': 0.9301624801009893} 12/22/2021 13:50:43 - INFO - codeparrot_training - Step 7171: {'lr': 0.0004760689860561309, 'samples': 3672064, 'steps': 7171, 'batch_loss/train': 0.7552131889387965} 12/22/2021 13:50:54 - INFO - codeparrot_training - Step 7172: {'lr': 0.0004760622555603263, 'samples': 3672576, 'steps': 7172, 'batch_loss/train': 0.7973873526789248} 12/22/2021 13:51:06 - INFO - codeparrot_training - Step 7173: {'lr': 0.00047605552416578345, 'samples': 3673088, 'steps': 7173, 'batch_loss/train': 0.8572099935263395} 12/22/2021 13:51:17 - INFO - codeparrot_training - Step 7174: {'lr': 0.00047604879187252927, 'samples': 3673600, 'steps': 7174, 'batch_loss/train': 0.8838508315384388} 12/22/2021 13:51:27 - INFO - codeparrot_training - Step 7175: {'lr': 0.00047604205868059037, 'samples': 3674112, 'steps': 7175, 'batch_loss/train': 1.1636341847479343} 12/22/2021 13:51:38 - INFO - codeparrot_training - Step 7176: {'lr': 0.00047603532458999364, 'samples': 3674624, 'steps': 7176, 'batch_loss/train': 0.806734505109489} 12/22/2021 13:51:50 - INFO - codeparrot_training - Step 7177: {'lr': 0.00047602858960076576, 'samples': 3675136, 'steps': 7177, 'batch_loss/train': 0.817106974311173} 12/22/2021 13:52:01 - INFO - codeparrot_training - Step 7178: {'lr': 0.0004760218537129336, 'samples': 3675648, 'steps': 7178, 'batch_loss/train': 1.1709199361503124} 12/22/2021 13:52:11 - INFO - codeparrot_training - Step 7179: {'lr': 0.0004760151169265239, 'samples': 3676160, 'steps': 7179, 'batch_loss/train': 0.8282949514687061} 12/22/2021 13:52:23 - INFO - 
codeparrot_training - Step 7180: {'lr': 0.00047600837924156343, 'samples': 3676672, 'steps': 7180, 'batch_loss/train': 0.8497569053433836} 12/22/2021 13:52:34 - INFO - codeparrot_training - Step 7181: {'lr': 0.0004760016406580789, 'samples': 3677184, 'steps': 7181, 'batch_loss/train': 0.8561178096570075} 12/22/2021 13:52:45 - INFO - codeparrot_training - Step 7182: {'lr': 0.00047599490117609727, 'samples': 3677696, 'steps': 7182, 'batch_loss/train': 0.8688687579706311} 12/22/2021 13:52:55 - INFO - codeparrot_training - Step 7183: {'lr': 0.00047598816079564524, 'samples': 3678208, 'steps': 7183, 'batch_loss/train': 0.6737896907143295} 12/22/2021 13:53:09 - INFO - codeparrot_training - Step 7184: {'lr': 0.00047598141951674957, 'samples': 3678720, 'steps': 7184, 'batch_loss/train': 0.7697523627430201} 12/22/2021 13:53:20 - INFO - codeparrot_training - Step 7185: {'lr': 0.0004759746773394372, 'samples': 3679232, 'steps': 7185, 'batch_loss/train': 0.8609926851931959} 12/22/2021 13:53:30 - INFO - codeparrot_training - Step 7186: {'lr': 0.0004759679342637347, 'samples': 3679744, 'steps': 7186, 'batch_loss/train': 0.8776262318715453} 12/22/2021 13:53:43 - INFO - codeparrot_training - Step 7187: {'lr': 0.0004759611902896691, 'samples': 3680256, 'steps': 7187, 'batch_loss/train': 0.9041659873910248} 12/22/2021 13:53:53 - INFO - codeparrot_training - Step 7188: {'lr': 0.0004759544454172671, 'samples': 3680768, 'steps': 7188, 'batch_loss/train': 0.8263935847207904} 12/22/2021 13:54:04 - INFO - codeparrot_training - Step 7189: {'lr': 0.0004759476996465556, 'samples': 3681280, 'steps': 7189, 'batch_loss/train': 0.8066223778296262} 12/22/2021 13:54:16 - INFO - codeparrot_training - Step 7190: {'lr': 0.00047594095297756123, 'samples': 3681792, 'steps': 7190, 'batch_loss/train': 0.6268079220317304} 12/22/2021 13:54:27 - INFO - codeparrot_training - Step 7191: {'lr': 0.00047593420541031106, 'samples': 3682304, 'steps': 7191, 'batch_loss/train': 0.8422328000888228} 12/22/2021 
13:54:37 - INFO - codeparrot_training - Step 7192: {'lr': 0.00047592745694483176, 'samples': 3682816, 'steps': 7192, 'batch_loss/train': 0.8913843433838338} 12/22/2021 13:54:48 - INFO - codeparrot_training - Step 7193: {'lr': 0.0004759207075811502, 'samples': 3683328, 'steps': 7193, 'batch_loss/train': 1.0177974391262978} 12/22/2021 13:55:02 - INFO - codeparrot_training - Step 7194: {'lr': 0.00047591395731929324, 'samples': 3683840, 'steps': 7194, 'batch_loss/train': 0.7983999503776431} 12/22/2021 13:55:12 - INFO - codeparrot_training - Step 7195: {'lr': 0.00047590720615928763, 'samples': 3684352, 'steps': 7195, 'batch_loss/train': 0.8320920960977674} 12/22/2021 13:55:23 - INFO - codeparrot_training - Step 7196: {'lr': 0.0004759004541011603, 'samples': 3684864, 'steps': 7196, 'batch_loss/train': 0.720728438347578} 12/22/2021 13:55:35 - INFO - codeparrot_training - Step 7197: {'lr': 0.000475893701144938, 'samples': 3685376, 'steps': 7197, 'batch_loss/train': 0.7895612854044884} 12/22/2021 13:55:46 - INFO - codeparrot_training - Step 7198: {'lr': 0.00047588694729064774, 'samples': 3685888, 'steps': 7198, 'batch_loss/train': 0.8332423334941268} 12/22/2021 13:55:56 - INFO - codeparrot_training - Step 7199: {'lr': 0.00047588019253831627, 'samples': 3686400, 'steps': 7199, 'batch_loss/train': 1.0302943456918001} 12/22/2021 13:56:09 - INFO - codeparrot_training - Step 7200: {'lr': 0.0004758734368879704, 'samples': 3686912, 'steps': 7200, 'batch_loss/train': 0.8752056434750557} 12/22/2021 13:56:20 - INFO - codeparrot_training - Step 7201: {'lr': 0.000475866680339637, 'samples': 3687424, 'steps': 7201, 'batch_loss/train': 0.8939204821363091} 12/22/2021 13:56:31 - INFO - codeparrot_training - Step 7202: {'lr': 0.000475859922893343, 'samples': 3687936, 'steps': 7202, 'batch_loss/train': 0.9083641981706023} 12/22/2021 13:56:43 - INFO - codeparrot_training - Step 7203: {'lr': 0.0004758531645491152, 'samples': 3688448, 'steps': 7203, 'batch_loss/train': 0.9196971915662289} 
12/22/2021 13:56:53 - INFO - codeparrot_training - Step 7204: {'lr': 0.00047584640530698056, 'samples': 3688960, 'steps': 7204, 'batch_loss/train': 0.8545288881286979} 12/22/2021 13:57:04 - INFO - codeparrot_training - Step 7205: {'lr': 0.00047583964516696584, 'samples': 3689472, 'steps': 7205, 'batch_loss/train': 0.8733822526410222} 12/22/2021 13:57:14 - INFO - codeparrot_training - Step 7206: {'lr': 0.000475832884129098, 'samples': 3689984, 'steps': 7206, 'batch_loss/train': 0.7418668419122696} 12/22/2021 13:57:27 - INFO - codeparrot_training - Step 7207: {'lr': 0.0004758261221934038, 'samples': 3690496, 'steps': 7207, 'batch_loss/train': 0.7897809995338321} 12/22/2021 13:57:37 - INFO - codeparrot_training - Step 7208: {'lr': 0.00047581935935991026, 'samples': 3691008, 'steps': 7208, 'batch_loss/train': 0.8349346546456218} 12/22/2021 13:57:48 - INFO - codeparrot_training - Step 7209: {'lr': 0.0004758125956286442, 'samples': 3691520, 'steps': 7209, 'batch_loss/train': 0.8566568158566952} 12/22/2021 13:58:01 - INFO - codeparrot_training - Step 7210: {'lr': 0.00047580583099963254, 'samples': 3692032, 'steps': 7210, 'batch_loss/train': 0.8693674805108458} 12/22/2021 13:58:12 - INFO - codeparrot_training - Step 7211: {'lr': 0.00047579906547290217, 'samples': 3692544, 'steps': 7211, 'batch_loss/train': 0.816472856560722} 12/22/2021 13:58:22 - INFO - codeparrot_training - Step 7212: {'lr': 0.00047579229904848, 'samples': 3693056, 'steps': 7212, 'batch_loss/train': 0.8935066759586334} 12/22/2021 13:58:35 - INFO - codeparrot_training - Step 7213: {'lr': 0.00047578553172639284, 'samples': 3693568, 'steps': 7213, 'batch_loss/train': 0.7791763776913285} 12/22/2021 13:58:45 - INFO - codeparrot_training - Step 7214: {'lr': 0.00047577876350666766, 'samples': 3694080, 'steps': 7214, 'batch_loss/train': 0.8998512243852019} 12/22/2021 13:58:56 - INFO - codeparrot_training - Step 7215: {'lr': 0.0004757719943893314, 'samples': 3694592, 'steps': 7215, 'batch_loss/train': 
0.7877727746963501} 12/22/2021 13:59:06 - INFO - codeparrot_training - Step 7216: {'lr': 0.00047576522437441093, 'samples': 3695104, 'steps': 7216, 'batch_loss/train': 0.7974215799476951} 12/22/2021 13:59:18 - INFO - codeparrot_training - Step 7217: {'lr': 0.00047575845346193316, 'samples': 3695616, 'steps': 7217, 'batch_loss/train': 0.8940124697983265} 12/22/2021 13:59:29 - INFO - codeparrot_training - Step 7218: {'lr': 0.00047575168165192507, 'samples': 3696128, 'steps': 7218, 'batch_loss/train': 0.7451733839698136} 12/22/2021 13:59:40 - INFO - codeparrot_training - Step 7219: {'lr': 0.00047574490894441345, 'samples': 3696640, 'steps': 7219, 'batch_loss/train': 0.8235886669717729} 12/22/2021 13:59:52 - INFO - codeparrot_training - Step 7220: {'lr': 0.00047573813533942534, 'samples': 3697152, 'steps': 7220, 'batch_loss/train': 0.9084638329222798} 12/22/2021 14:00:02 - INFO - codeparrot_training - Step 7221: {'lr': 0.00047573136083698764, 'samples': 3697664, 'steps': 7221, 'batch_loss/train': 0.8646487221121788} 12/22/2021 14:00:13 - INFO - codeparrot_training - Step 7222: {'lr': 0.0004757245854371274, 'samples': 3698176, 'steps': 7222, 'batch_loss/train': 0.8107417831197381} 12/22/2021 14:00:26 - INFO - codeparrot_training - Step 7223: {'lr': 0.0004757178091398713, 'samples': 3698688, 'steps': 7223, 'batch_loss/train': 0.773349282098934} 12/22/2021 14:00:37 - INFO - codeparrot_training - Step 7224: {'lr': 0.00047571103194524644, 'samples': 3699200, 'steps': 7224, 'batch_loss/train': 0.915868241339922} 12/22/2021 14:00:48 - INFO - codeparrot_training - Step 7225: {'lr': 0.00047570425385327973, 'samples': 3699712, 'steps': 7225, 'batch_loss/train': 0.8707627262920141} 12/22/2021 14:01:00 - INFO - codeparrot_training - Step 7226: {'lr': 0.0004756974748639982, 'samples': 3700224, 'steps': 7226, 'batch_loss/train': 0.8313724000472575} 12/22/2021 14:01:10 - INFO - codeparrot_training - Step 7227: {'lr': 0.00047569069497742865, 'samples': 3700736, 'steps': 7227, 
'batch_loss/train': 0.782561413012445} 12/22/2021 14:01:21 - INFO - codeparrot_training - Step 7228: {'lr': 0.0004756839141935982, 'samples': 3701248, 'steps': 7228, 'batch_loss/train': 0.8654409828595817} 12/22/2021 14:01:32 - INFO - codeparrot_training - Step 7229: {'lr': 0.00047567713251253364, 'samples': 3701760, 'steps': 7229, 'batch_loss/train': 0.8508286429569125} 12/22/2021 14:01:45 - INFO - codeparrot_training - Step 7230: {'lr': 0.00047567034993426205, 'samples': 3702272, 'steps': 7230, 'batch_loss/train': 0.8268684381619096} 12/22/2021 14:01:55 - INFO - codeparrot_training - Step 7231: {'lr': 0.0004756635664588104, 'samples': 3702784, 'steps': 7231, 'batch_loss/train': 0.8219552598893642} 12/22/2021 14:02:06 - INFO - codeparrot_training - Step 7232: {'lr': 0.00047565678208620556, 'samples': 3703296, 'steps': 7232, 'batch_loss/train': 0.8471865374594927} 12/22/2021 14:02:18 - INFO - codeparrot_training - Step 7233: {'lr': 0.0004756499968164746, 'samples': 3703808, 'steps': 7233, 'batch_loss/train': 0.8243901138193905} 12/22/2021 14:02:29 - INFO - codeparrot_training - Step 7234: {'lr': 0.00047564321064964436, 'samples': 3704320, 'steps': 7234, 'batch_loss/train': 0.8178769573569298} 12/22/2021 14:02:39 - INFO - codeparrot_training - Step 7235: {'lr': 0.000475636423585742, 'samples': 3704832, 'steps': 7235, 'batch_loss/train': 0.8621544828638434} 12/22/2021 14:02:51 - INFO - codeparrot_training - Step 7236: {'lr': 0.0004756296356247944, 'samples': 3705344, 'steps': 7236, 'batch_loss/train': 0.898038114188239} 12/22/2021 14:03:02 - INFO - codeparrot_training - Step 7237: {'lr': 0.00047562284676682856, 'samples': 3705856, 'steps': 7237, 'batch_loss/train': 0.9339466001838446} 12/22/2021 14:03:13 - INFO - codeparrot_training - Step 7238: {'lr': 0.0004756160570118715, 'samples': 3706368, 'steps': 7238, 'batch_loss/train': 0.8767225882038474} 12/22/2021 14:03:23 - INFO - codeparrot_training - Step 7239: {'lr': 0.00047560926635995015, 'samples': 3706880, 
'steps': 7239, 'batch_loss/train': 0.808199352119118} 12/22/2021 14:03:37 - INFO - codeparrot_training - Step 7240: {'lr': 0.00047560247481109153, 'samples': 3707392, 'steps': 7240, 'batch_loss/train': 0.8527782214805484} 12/22/2021 14:03:48 - INFO - codeparrot_training - Step 7241: {'lr': 0.0004755956823653227, 'samples': 3707904, 'steps': 7241, 'batch_loss/train': 0.7423536172136664} 12/22/2021 14:03:59 - INFO - codeparrot_training - Step 7242: {'lr': 0.0004755888890226706, 'samples': 3708416, 'steps': 7242, 'batch_loss/train': 0.8189097311114892} 12/22/2021 14:04:11 - INFO - codeparrot_training - Step 7243: {'lr': 0.0004755820947831623, 'samples': 3708928, 'steps': 7243, 'batch_loss/train': 0.891308082267642} 12/22/2021 14:04:21 - INFO - codeparrot_training - Step 7244: {'lr': 0.00047557529964682466, 'samples': 3709440, 'steps': 7244, 'batch_loss/train': 0.8724024007096887} 12/22/2021 14:04:32 - INFO - codeparrot_training - Step 7245: {'lr': 0.0004755685036136849, 'samples': 3709952, 'steps': 7245, 'batch_loss/train': 0.83649533521384} 12/22/2021 14:04:44 - INFO - codeparrot_training - Step 7246: {'lr': 0.0004755617066837699, 'samples': 3710464, 'steps': 7246, 'batch_loss/train': 0.8449483197182417} 12/22/2021 14:04:54 - INFO - codeparrot_training - Step 7247: {'lr': 0.0004755549088571067, 'samples': 3710976, 'steps': 7247, 'batch_loss/train': 0.818956330884248} 12/22/2021 14:05:05 - INFO - codeparrot_training - Step 7248: {'lr': 0.0004755481101337224, 'samples': 3711488, 'steps': 7248, 'batch_loss/train': 0.8541111988015473} 12/22/2021 14:05:16 - INFO - codeparrot_training - Step 7249: {'lr': 0.0004755413105136439, 'samples': 3712000, 'steps': 7249, 'batch_loss/train': 0.8703787988051772} 12/22/2021 14:05:29 - INFO - codeparrot_training - Step 7250: {'lr': 0.0004755345099968984, 'samples': 3712512, 'steps': 7250, 'batch_loss/train': 0.8964364239946008} 12/22/2021 14:05:39 - INFO - codeparrot_training - Step 7251: {'lr': 0.0004755277085835128, 'samples': 
3713024, 'steps': 7251, 'batch_loss/train': 0.9202803550288081} 12/22/2021 14:05:50 - INFO - codeparrot_training - Step 7252: {'lr': 0.0004755209062735142, 'samples': 3713536, 'steps': 7252, 'batch_loss/train': 0.8580260453745723} 12/22/2021 14:06:02 - INFO - codeparrot_training - Step 7253: {'lr': 0.0004755141030669297, 'samples': 3714048, 'steps': 7253, 'batch_loss/train': 1.0161980334669352} 12/22/2021 14:06:13 - INFO - codeparrot_training - Step 7254: {'lr': 0.0004755072989637862, 'samples': 3714560, 'steps': 7254, 'batch_loss/train': 0.8145532924681902} 12/22/2021 14:06:24 - INFO - codeparrot_training - Step 7255: {'lr': 0.0004755004939641108, 'samples': 3715072, 'steps': 7255, 'batch_loss/train': 0.7729354975745082} 12/22/2021 14:06:36 - INFO - codeparrot_training - Step 7256: {'lr': 0.0004754936880679306, 'samples': 3715584, 'steps': 7256, 'batch_loss/train': 0.8015628028661013} 12/22/2021 14:06:46 - INFO - codeparrot_training - Step 7257: {'lr': 0.00047548688127527265, 'samples': 3716096, 'steps': 7257, 'batch_loss/train': 0.7374908569618128} 12/22/2021 14:06:57 - INFO - codeparrot_training - Step 7258: {'lr': 0.00047548007358616404, 'samples': 3716608, 'steps': 7258, 'batch_loss/train': 0.864990116097033} 12/22/2021 14:07:10 - INFO - codeparrot_training - Step 7259: {'lr': 0.0004754732650006317, 'samples': 3717120, 'steps': 7259, 'batch_loss/train': 0.9060262087732553} 12/22/2021 14:07:21 - INFO - codeparrot_training - Step 7260: {'lr': 0.0004754664555187029, 'samples': 3717632, 'steps': 7260, 'batch_loss/train': 0.7976559810340405} 12/22/2021 14:07:31 - INFO - codeparrot_training - Step 7261: {'lr': 0.0004754596451404045, 'samples': 3718144, 'steps': 7261, 'batch_loss/train': 0.8249120553955436} 12/22/2021 14:07:42 - INFO - codeparrot_training - Step 7262: {'lr': 0.0004754528338657638, 'samples': 3718656, 'steps': 7262, 'batch_loss/train': 0.7587319249287248} 12/22/2021 14:07:54 - INFO - codeparrot_training - Step 7263: {'lr': 0.00047544602169480765, 
'samples': 3719168, 'steps': 7263, 'batch_loss/train': 0.8825690867379308} 12/22/2021 14:08:05 - INFO - codeparrot_training - Step 7264: {'lr': 0.00047543920862756333, 'samples': 3719680, 'steps': 7264, 'batch_loss/train': 0.8389409137889743} 12/22/2021 14:08:15 - INFO - codeparrot_training - Step 7265: {'lr': 0.00047543239466405786, 'samples': 3720192, 'steps': 7265, 'batch_loss/train': 0.7154602042865008} 12/22/2021 14:08:27 - INFO - codeparrot_training - Step 7266: {'lr': 0.00047542557980431824, 'samples': 3720704, 'steps': 7266, 'batch_loss/train': 0.8463249001652002} 12/22/2021 14:08:38 - INFO - codeparrot_training - Step 7267: {'lr': 0.0004754187640483716, 'samples': 3721216, 'steps': 7267, 'batch_loss/train': 0.7686636040452868} 12/22/2021 14:08:49 - INFO - codeparrot_training - Step 7268: {'lr': 0.00047541194739624513, 'samples': 3721728, 'steps': 7268, 'batch_loss/train': 0.8412314071319997} 12/22/2021 14:09:02 - INFO - codeparrot_training - Step 7269: {'lr': 0.00047540512984796585, 'samples': 3722240, 'steps': 7269, 'batch_loss/train': 0.7618517249356955} 12/22/2021 14:09:12 - INFO - codeparrot_training - Step 7270: {'lr': 0.0004753983114035609, 'samples': 3722752, 'steps': 7270, 'batch_loss/train': 0.8769476199522614} 12/22/2021 14:09:23 - INFO - codeparrot_training - Step 7271: {'lr': 0.0004753914920630573, 'samples': 3723264, 'steps': 7271, 'batch_loss/train': 0.8673306880518794} 12/22/2021 14:09:34 - INFO - codeparrot_training - Step 7272: {'lr': 0.00047538467182648233, 'samples': 3723776, 'steps': 7272, 'batch_loss/train': 0.8537070974707603} 12/22/2021 14:09:46 - INFO - codeparrot_training - Step 7273: {'lr': 0.000475377850693863, 'samples': 3724288, 'steps': 7273, 'batch_loss/train': 0.7653378773247823} 12/22/2021 14:09:56 - INFO - codeparrot_training - Step 7274: {'lr': 0.0004753710286652264, 'samples': 3724800, 'steps': 7274, 'batch_loss/train': 0.8580685001797974} 12/22/2021 14:10:07 - INFO - codeparrot_training - Step 7275: {'lr': 
0.00047536420574059967, 'samples': 3725312, 'steps': 7275, 'batch_loss/train': 0.8287601182237267} 12/22/2021 14:10:19 - INFO - codeparrot_training - Step 7276: {'lr': 0.00047535738192001, 'samples': 3725824, 'steps': 7276, 'batch_loss/train': 0.77036471106112} 12/22/2021 14:10:29 - INFO - codeparrot_training - Step 7277: {'lr': 0.0004753505572034844, 'samples': 3726336, 'steps': 7277, 'batch_loss/train': 0.8929137103259563} 12/22/2021 14:10:40 - INFO - codeparrot_training - Step 7278: {'lr': 0.00047534373159105014, 'samples': 3726848, 'steps': 7278, 'batch_loss/train': 0.8272282779216766} 12/22/2021 14:10:54 - INFO - codeparrot_training - Step 7279: {'lr': 0.00047533690508273427, 'samples': 3727360, 'steps': 7279, 'batch_loss/train': 0.8259509499184787} 12/22/2021 14:11:04 - INFO - codeparrot_training - Step 7280: {'lr': 0.0004753300776785639, 'samples': 3727872, 'steps': 7280, 'batch_loss/train': 0.8707565059885383} 12/22/2021 14:11:15 - INFO - codeparrot_training - Step 7281: {'lr': 0.0004753232493785663, 'samples': 3728384, 'steps': 7281, 'batch_loss/train': 0.9143045609816909} 12/22/2021 14:11:26 - INFO - codeparrot_training - Step 7282: {'lr': 0.00047531642018276845, 'samples': 3728896, 'steps': 7282, 'batch_loss/train': 0.8391990819945931} 12/22/2021 14:11:38 - INFO - codeparrot_training - Step 7283: {'lr': 0.0004753095900911977, 'samples': 3729408, 'steps': 7283, 'batch_loss/train': 0.7861812897026539} 12/22/2021 14:11:48 - INFO - codeparrot_training - Step 7284: {'lr': 0.000475302759103881, 'samples': 3729920, 'steps': 7284, 'batch_loss/train': 0.9138794178143144} 12/22/2021 14:11:59 - INFO - codeparrot_training - Step 7285: {'lr': 0.0004752959272208457, 'samples': 3730432, 'steps': 7285, 'batch_loss/train': 0.7715226095169783} 12/22/2021 14:12:13 - INFO - codeparrot_training - Step 7286: {'lr': 0.0004752890944421188, 'samples': 3730944, 'steps': 7286, 'batch_loss/train': 0.7720172945410013} 12/22/2021 14:12:23 - INFO - codeparrot_training - Step 7287: 
{'lr': 0.0004752822607677275, 'samples': 3731456, 'steps': 7287, 'batch_loss/train': 0.8565966736059636} 12/22/2021 14:12:34 - INFO - codeparrot_training - Step 7288: {'lr': 0.00047527542619769903, 'samples': 3731968, 'steps': 7288, 'batch_loss/train': 0.8464145383331925} 12/22/2021 14:12:46 - INFO - codeparrot_training - Step 7289: {'lr': 0.00047526859073206053, 'samples': 3732480, 'steps': 7289, 'batch_loss/train': 1.0650475332513452} 12/22/2021 14:12:57 - INFO - codeparrot_training - Step 7290: {'lr': 0.0004752617543708391, 'samples': 3732992, 'steps': 7290, 'batch_loss/train': 0.777961740270257} 12/22/2021 14:13:07 - INFO - codeparrot_training - Step 7291: {'lr': 0.0004752549171140621, 'samples': 3733504, 'steps': 7291, 'batch_loss/train': 0.8421843480318785} 12/22/2021 14:13:18 - INFO - codeparrot_training - Step 7292: {'lr': 0.00047524807896175646, 'samples': 3734016, 'steps': 7292, 'batch_loss/train': 0.8182287416420877} 12/22/2021 14:13:30 - INFO - codeparrot_training - Step 7293: {'lr': 0.00047524123991394965, 'samples': 3734528, 'steps': 7293, 'batch_loss/train': 0.858135711401701} 12/22/2021 14:13:41 - INFO - codeparrot_training - Step 7294: {'lr': 0.0004752343999706686, 'samples': 3735040, 'steps': 7294, 'batch_loss/train': 0.8647234113886952} 12/22/2021 14:13:51 - INFO - codeparrot_training - Step 7295: {'lr': 0.0004752275591319408, 'samples': 3735552, 'steps': 7295, 'batch_loss/train': 0.7661357158794999} 12/22/2021 14:14:03 - INFO - codeparrot_training - Step 7296: {'lr': 0.0004752207173977932, 'samples': 3736064, 'steps': 7296, 'batch_loss/train': 0.8787804208695889} 12/22/2021 14:14:14 - INFO - codeparrot_training - Step 7297: {'lr': 0.00047521387476825294, 'samples': 3736576, 'steps': 7297, 'batch_loss/train': 0.8476814543828368} 12/22/2021 14:14:25 - INFO - codeparrot_training - Step 7298: {'lr': 0.00047520703124334743, 'samples': 3737088, 'steps': 7298, 'batch_loss/train': 0.8745168652385473} 12/22/2021 14:14:38 - INFO - codeparrot_training - 
Step 7299: {'lr': 0.00047520018682310387, 'samples': 3737600, 'steps': 7299, 'batch_loss/train': 0.8031691955402493} 12/22/2021 14:14:49 - INFO - codeparrot_training - Step 7300: {'lr': 0.0004751933415075493, 'samples': 3738112, 'steps': 7300, 'batch_loss/train': 0.8777457950636744} 12/22/2021 14:15:00 - INFO - codeparrot_training - Step 7301: {'lr': 0.00047518649529671103, 'samples': 3738624, 'steps': 7301, 'batch_loss/train': 0.9138424536213279} 12/22/2021 14:15:12 - INFO - codeparrot_training - Step 7302: {'lr': 0.0004751796481906163, 'samples': 3739136, 'steps': 7302, 'batch_loss/train': 0.7425100998952985} 12/22/2021 14:15:22 - INFO - codeparrot_training - Step 7303: {'lr': 0.0004751728001892924, 'samples': 3739648, 'steps': 7303, 'batch_loss/train': 0.8787292838096619} 12/22/2021 14:15:33 - INFO - codeparrot_training - Step 7304: {'lr': 0.0004751659512927664, 'samples': 3740160, 'steps': 7304, 'batch_loss/train': 0.8672501463443041} 12/22/2021 14:15:43 - INFO - codeparrot_training - Step 7305: {'lr': 0.00047515910150106554, 'samples': 3740672, 'steps': 7305, 'batch_loss/train': 0.8070503729395568} 12/22/2021 14:15:56 - INFO - codeparrot_training - Step 7306: {'lr': 0.0004751522508142172, 'samples': 3741184, 'steps': 7306, 'batch_loss/train': 0.8152629649266601} 12/22/2021 14:16:06 - INFO - codeparrot_training - Step 7307: {'lr': 0.00047514539923224856, 'samples': 3741696, 'steps': 7307, 'batch_loss/train': 0.7316143690841272} 12/22/2021 14:16:17 - INFO - codeparrot_training - Step 7308: {'lr': 0.0004751385467551867, 'samples': 3742208, 'steps': 7308, 'batch_loss/train': 1.0337708946317434} 12/22/2021 14:16:31 - INFO - codeparrot_training - Step 7309: {'lr': 0.0004751316933830591, 'samples': 3742720, 'steps': 7309, 'batch_loss/train': 0.9764470467343926} 12/22/2021 14:16:41 - INFO - codeparrot_training - Step 7310: {'lr': 0.00047512483911589276, 'samples': 3743232, 'steps': 7310, 'batch_loss/train': 0.8037869795225561} 12/22/2021 14:16:52 - INFO - 
codeparrot_training - Step 7311: {'lr': 0.0004751179839537152, 'samples': 3743744, 'steps': 7311, 'batch_loss/train': 0.8458557976409793} 12/22/2021 14:17:03 - INFO - codeparrot_training - Step 7312: {'lr': 0.00047511112789655344, 'samples': 3744256, 'steps': 7312, 'batch_loss/train': 0.8221633965149522} 12/22/2021 14:17:15 - INFO - codeparrot_training - Step 7313: {'lr': 0.00047510427094443487, 'samples': 3744768, 'steps': 7313, 'batch_loss/train': 0.7359274821355939} 12/22/2021 14:17:25 - INFO - codeparrot_training - Step 7314: {'lr': 0.00047509741309738676, 'samples': 3745280, 'steps': 7314, 'batch_loss/train': 0.8606559401378036} 12/22/2021 14:17:36 - INFO - codeparrot_training - Step 7315: {'lr': 0.00047509055435543624, 'samples': 3745792, 'steps': 7315, 'batch_loss/train': 0.8427449278533459} 12/22/2021 14:17:50 - INFO - codeparrot_training - Step 7316: {'lr': 0.00047508369471861066, 'samples': 3746304, 'steps': 7316, 'batch_loss/train': 0.8358900398015976} 12/22/2021 14:18:00 - INFO - codeparrot_training - Step 7317: {'lr': 0.00047507683418693736, 'samples': 3746816, 'steps': 7317, 'batch_loss/train': 0.7986450335010886} 12/22/2021 14:18:11 - INFO - codeparrot_training - Step 7318: {'lr': 0.0004750699727604436, 'samples': 3747328, 'steps': 7318, 'batch_loss/train': 0.8277010843157768} 12/22/2021 14:18:23 - INFO - codeparrot_training - Step 7319: {'lr': 0.00047506311043915653, 'samples': 3747840, 'steps': 7319, 'batch_loss/train': 0.8245133715681732} 12/22/2021 14:18:34 - INFO - codeparrot_training - Step 7320: {'lr': 0.00047505624722310355, 'samples': 3748352, 'steps': 7320, 'batch_loss/train': 0.7986253560520709} 12/22/2021 14:18:44 - INFO - codeparrot_training - Step 7321: {'lr': 0.0004750493831123119, 'samples': 3748864, 'steps': 7321, 'batch_loss/train': 0.8823458717670292} 12/22/2021 14:18:56 - INFO - codeparrot_training - Step 7322: {'lr': 0.0004750425181068089, 'samples': 3749376, 'steps': 7322, 'batch_loss/train': 0.8331428952515125} 12/22/2021 
14:19:07 - INFO - codeparrot_training - Step 7323: {'lr': 0.0004750356522066218, 'samples': 3749888, 'steps': 7323, 'batch_loss/train': 0.832701834384352} 12/22/2021 14:19:18 - INFO - codeparrot_training - Step 7324: {'lr': 0.00047502878541177783, 'samples': 3750400, 'steps': 7324, 'batch_loss/train': 0.8752533346414566} 12/22/2021 14:19:28 - INFO - codeparrot_training - Step 7325: {'lr': 0.0004750219177223045, 'samples': 3750912, 'steps': 7325, 'batch_loss/train': 0.9461708581075072} 12/22/2021 14:19:42 - INFO - codeparrot_training - Step 7326: {'lr': 0.000475015049138229, 'samples': 3751424, 'steps': 7326, 'batch_loss/train': 0.8052229657769203} 12/22/2021 14:19:52 - INFO - codeparrot_training - Step 7327: {'lr': 0.0004750081796595785, 'samples': 3751936, 'steps': 7327, 'batch_loss/train': 0.6072865868918598} 12/22/2021 14:20:03 - INFO - codeparrot_training - Step 7328: {'lr': 0.00047500130928638054, 'samples': 3752448, 'steps': 7328, 'batch_loss/train': 0.6469452453311533} 12/22/2021 14:20:15 - INFO - codeparrot_training - Step 7329: {'lr': 0.0004749944380186623, 'samples': 3752960, 'steps': 7329, 'batch_loss/train': 0.8324315100908279} 12/22/2021 14:20:26 - INFO - codeparrot_training - Step 7330: {'lr': 0.0004749875658564511, 'samples': 3753472, 'steps': 7330, 'batch_loss/train': 0.7058352703461424} 12/22/2021 14:20:36 - INFO - codeparrot_training - Step 7331: {'lr': 0.0004749806927997744, 'samples': 3753984, 'steps': 7331, 'batch_loss/train': 0.8426335954573005} 12/22/2021 14:20:47 - INFO - codeparrot_training - Step 7332: {'lr': 0.0004749738188486593, 'samples': 3754496, 'steps': 7332, 'batch_loss/train': 0.8496779669076204} 12/22/2021 14:20:59 - INFO - codeparrot_training - Step 7333: {'lr': 0.0004749669440031333, 'samples': 3755008, 'steps': 7333, 'batch_loss/train': 0.8916905494406819} 12/22/2021 14:21:09 - INFO - codeparrot_training - Step 7334: {'lr': 0.0004749600682632237, 'samples': 3755520, 'steps': 7334, 'batch_loss/train': 0.7987063052132726} 
12/22/2021 14:21:20 - INFO - codeparrot_training - Step 7335: {'lr': 0.00047495319162895773, 'samples': 3756032, 'steps': 7335, 'batch_loss/train': 0.8837719485163689} 12/22/2021 14:21:33 - INFO - codeparrot_training - Step 7336: {'lr': 0.0004749463141003628, 'samples': 3756544, 'steps': 7336, 'batch_loss/train': 0.7935267239809036} 12/22/2021 14:21:44 - INFO - codeparrot_training - Step 7337: {'lr': 0.00047493943567746634, 'samples': 3757056, 'steps': 7337, 'batch_loss/train': 0.8605890749022365} 12/22/2021 14:21:54 - INFO - codeparrot_training - Step 7338: {'lr': 0.00047493255636029563, 'samples': 3757568, 'steps': 7338, 'batch_loss/train': 0.7768248594366014} 12/22/2021 14:22:07 - INFO - codeparrot_training - Step 7339: {'lr': 0.00047492567614887795, 'samples': 3758080, 'steps': 7339, 'batch_loss/train': 0.8847869262099266} 12/22/2021 14:22:17 - INFO - codeparrot_training - Step 7340: {'lr': 0.0004749187950432407, 'samples': 3758592, 'steps': 7340, 'batch_loss/train': 0.8593337612692267} 12/22/2021 14:22:28 - INFO - codeparrot_training - Step 7341: {'lr': 0.0004749119130434113, 'samples': 3759104, 'steps': 7341, 'batch_loss/train': 0.8143094964325428} 12/22/2021 14:22:40 - INFO - codeparrot_training - Step 7342: {'lr': 0.000474905030149417, 'samples': 3759616, 'steps': 7342, 'batch_loss/train': 0.7403398538008332} 12/22/2021 14:22:50 - INFO - codeparrot_training - Step 7343: {'lr': 0.00047489814636128525, 'samples': 3760128, 'steps': 7343, 'batch_loss/train': 0.7841997528448701} 12/22/2021 14:23:01 - INFO - codeparrot_training - Step 7344: {'lr': 0.0004748912616790434, 'samples': 3760640, 'steps': 7344, 'batch_loss/train': 0.8242482244968414} 12/22/2021 14:23:11 - INFO - codeparrot_training - Step 7345: {'lr': 0.0004748843761027188, 'samples': 3761152, 'steps': 7345, 'batch_loss/train': 0.9011033102869987} 12/22/2021 14:23:25 - INFO - codeparrot_training - Step 7346: {'lr': 0.0004748774896323388, 'samples': 3761664, 'steps': 7346, 'batch_loss/train': 
0.8169178538955748} 12/22/2021 14:23:35 - INFO - codeparrot_training - Step 7347: {'lr': 0.0004748706022679309, 'samples': 3762176, 'steps': 7347, 'batch_loss/train': 0.8976025218144059} 12/22/2021 14:23:46 - INFO - codeparrot_training - Step 7348: {'lr': 0.0004748637140095223, 'samples': 3762688, 'steps': 7348, 'batch_loss/train': 0.8578886864706874} 12/22/2021 14:23:58 - INFO - codeparrot_training - Step 7349: {'lr': 0.00047485682485714055, 'samples': 3763200, 'steps': 7349, 'batch_loss/train': 0.7228887806413695} 12/22/2021 14:24:08 - INFO - codeparrot_training - Step 7350: {'lr': 0.0004748499348108129, 'samples': 3763712, 'steps': 7350, 'batch_loss/train': 0.7342450558207929} 12/22/2021 14:24:19 - INFO - codeparrot_training - Step 7351: {'lr': 0.00047484304387056687, 'samples': 3764224, 'steps': 7351, 'batch_loss/train': 0.9424785701557994} 12/22/2021 14:24:31 - INFO - codeparrot_training - Step 7352: {'lr': 0.0004748361520364297, 'samples': 3764736, 'steps': 7352, 'batch_loss/train': 0.8246470401063561} 12/22/2021 14:24:42 - INFO - codeparrot_training - Step 7353: {'lr': 0.000474829259308429, 'samples': 3765248, 'steps': 7353, 'batch_loss/train': 0.7998347068205476} 12/22/2021 14:24:52 - INFO - codeparrot_training - Step 7354: {'lr': 0.00047482236568659197, 'samples': 3765760, 'steps': 7354, 'batch_loss/train': 0.8452332271263003} 12/22/2021 14:25:03 - INFO - codeparrot_training - Step 7355: {'lr': 0.00047481547117094613, 'samples': 3766272, 'steps': 7355, 'batch_loss/train': 0.7935043191537261} 12/22/2021 14:25:16 - INFO - codeparrot_training - Step 7356: {'lr': 0.00047480857576151893, 'samples': 3766784, 'steps': 7356, 'batch_loss/train': 0.8574633188545704} 12/22/2021 14:25:27 - INFO - codeparrot_training - Step 7357: {'lr': 0.00047480167945833764, 'samples': 3767296, 'steps': 7357, 'batch_loss/train': 0.876712522469461} 12/22/2021 14:25:37 - INFO - codeparrot_training - Step 7358: {'lr': 0.0004747947822614298, 'samples': 3767808, 'steps': 7358, 
'batch_loss/train': 0.9357470273971558} 12/22/2021 14:25:50 - INFO - codeparrot_training - Step 7359: {'lr': 0.00047478788417082277, 'samples': 3768320, 'steps': 7359, 'batch_loss/train': 0.8579050712287426} 12/22/2021 14:26:00 - INFO - codeparrot_training - Step 7360: {'lr': 0.00047478098518654395, 'samples': 3768832, 'steps': 7360, 'batch_loss/train': 0.8267919765785336} 12/22/2021 14:26:11 - INFO - codeparrot_training - Step 7361: {'lr': 0.0004747740853086208, 'samples': 3769344, 'steps': 7361, 'batch_loss/train': 0.7983421320095658} 12/22/2021 14:26:25 - INFO - codeparrot_training - Step 7362: {'lr': 0.0004747671845370808, 'samples': 3769856, 'steps': 7362, 'batch_loss/train': 0.8404528824612498} 12/22/2021 14:26:35 - INFO - codeparrot_training - Step 7363: {'lr': 0.0004747602828719514, 'samples': 3770368, 'steps': 7363, 'batch_loss/train': 0.8308074064552784} 12/22/2021 14:26:46 - INFO - codeparrot_training - Step 7364: {'lr': 0.0004747533803132599, 'samples': 3770880, 'steps': 7364, 'batch_loss/train': 0.865470970980823} 12/22/2021 14:26:56 - INFO - codeparrot_training - Step 7365: {'lr': 0.0004747464768610338, 'samples': 3771392, 'steps': 7365, 'batch_loss/train': 0.9339829264208674} 12/22/2021 14:27:08 - INFO - codeparrot_training - Step 7366: {'lr': 0.00047473957251530064, 'samples': 3771904, 'steps': 7366, 'batch_loss/train': 0.8636457631364465} 12/22/2021 14:27:19 - INFO - codeparrot_training - Step 7367: {'lr': 0.00047473266727608775, 'samples': 3772416, 'steps': 7367, 'batch_loss/train': 0.918923465302214} 12/22/2021 14:27:30 - INFO - codeparrot_training - Step 7368: {'lr': 0.00047472576114342266, 'samples': 3772928, 'steps': 7368, 'batch_loss/train': 0.7070311307907104} 12/22/2021 14:27:42 - INFO - codeparrot_training - Step 7369: {'lr': 0.0004747188541173328, 'samples': 3773440, 'steps': 7369, 'batch_loss/train': 0.9583571357652545} 12/22/2021 14:27:53 - INFO - codeparrot_training - Step 7370: {'lr': 0.0004747119461978456, 'samples': 3773952, 
'steps': 7370, 'batch_loss/train': 0.9555854629725218} 12/22/2021 14:28:03 - INFO - codeparrot_training - Step 7371: {'lr': 0.00047470503738498857, 'samples': 3774464, 'steps': 7371, 'batch_loss/train': 0.8455285239033401} 12/22/2021 14:28:16 - INFO - codeparrot_training - Step 7372: {'lr': 0.00047469812767878916, 'samples': 3774976, 'steps': 7372, 'batch_loss/train': 0.8480750499293208} 12/22/2021 14:28:26 - INFO - codeparrot_training - Step 7373: {'lr': 0.0004746912170792748, 'samples': 3775488, 'steps': 7373, 'batch_loss/train': 0.8993919417262077} 12/22/2021 14:28:37 - INFO - codeparrot_training - Step 7374: {'lr': 0.000474684305586473, 'samples': 3776000, 'steps': 7374, 'batch_loss/train': 0.9257248099893332} 12/22/2021 14:28:47 - INFO - codeparrot_training - Step 7375: {'lr': 0.0004746773932004113, 'samples': 3776512, 'steps': 7375, 'batch_loss/train': 0.8882329789921641} 12/22/2021 14:29:01 - INFO - codeparrot_training - Step 7376: {'lr': 0.0004746704799211171, 'samples': 3777024, 'steps': 7376, 'batch_loss/train': 0.770767706213519} 12/22/2021 14:29:12 - INFO - codeparrot_training - Step 7377: {'lr': 0.00047466356574861793, 'samples': 3777536, 'steps': 7377, 'batch_loss/train': 0.8706606673076749} 12/22/2021 14:29:23 - INFO - codeparrot_training - Step 7378: {'lr': 0.0004746566506829412, 'samples': 3778048, 'steps': 7378, 'batch_loss/train': 0.8142686365172267} 12/22/2021 14:29:35 - INFO - codeparrot_training - Step 7379: {'lr': 0.00047464973472411445, 'samples': 3778560, 'steps': 7379, 'batch_loss/train': 0.9446112017612904} 12/22/2021 14:29:45 - INFO - codeparrot_training - Step 7380: {'lr': 0.0004746428178721652, 'samples': 3779072, 'steps': 7380, 'batch_loss/train': 0.809739391785115} 12/22/2021 14:29:56 - INFO - codeparrot_training - Step 7381: {'lr': 0.0004746359001271209, 'samples': 3779584, 'steps': 7381, 'batch_loss/train': 0.800771385896951} 12/22/2021 14:30:08 - INFO - codeparrot_training - Step 7382: {'lr': 0.00047462898148900915, 'samples': 
3780096, 'steps': 7382, 'batch_loss/train': 0.8009304003790021} 12/22/2021 14:30:19 - INFO - codeparrot_training - Step 7383: {'lr': 0.0004746220619578573, 'samples': 3780608, 'steps': 7383, 'batch_loss/train': 0.7572012721793726} 12/22/2021 14:30:29 - INFO - codeparrot_training - Step 7384: {'lr': 0.00047461514153369303, 'samples': 3781120, 'steps': 7384, 'batch_loss/train': 0.732654144638218} 12/22/2021 14:30:43 - INFO - codeparrot_training - Step 7385: {'lr': 0.0004746082202165437, 'samples': 3781632, 'steps': 7385, 'batch_loss/train': 0.8392102941870689} 12/22/2021 14:30:54 - INFO - codeparrot_training - Step 7386: {'lr': 0.0004746012980064369, 'samples': 3782144, 'steps': 7386, 'batch_loss/train': 0.835987625643611} 12/22/2021 14:31:04 - INFO - codeparrot_training - Step 7387: {'lr': 0.00047459437490340016, 'samples': 3782656, 'steps': 7387, 'batch_loss/train': 0.8273229943588376} 12/22/2021 14:31:15 - INFO - codeparrot_training - Step 7388: {'lr': 0.000474587450907461, 'samples': 3783168, 'steps': 7388, 'batch_loss/train': 0.8584577534347773} 12/22/2021 14:31:27 - INFO - codeparrot_training - Step 7389: {'lr': 0.00047458052601864693, 'samples': 3783680, 'steps': 7389, 'batch_loss/train': 0.9345967918634415} 12/22/2021 14:31:38 - INFO - codeparrot_training - Step 7390: {'lr': 0.0004745736002369855, 'samples': 3784192, 'steps': 7390, 'batch_loss/train': 0.8642106894403696} 12/22/2021 14:31:48 - INFO - codeparrot_training - Step 7391: {'lr': 0.00047456667356250416, 'samples': 3784704, 'steps': 7391, 'batch_loss/train': 0.7381571726873517} 12/22/2021 14:32:00 - INFO - codeparrot_training - Step 7392: {'lr': 0.0004745597459952306, 'samples': 3785216, 'steps': 7392, 'batch_loss/train': 0.718398030847311} 12/22/2021 14:32:11 - INFO - codeparrot_training - Step 7393: {'lr': 0.0004745528175351923, 'samples': 3785728, 'steps': 7393, 'batch_loss/train': 0.8575572711415589} 12/22/2021 14:32:22 - INFO - codeparrot_training - Step 7394: {'lr': 0.0004745458881824167, 
'samples': 3786240, 'steps': 7394, 'batch_loss/train': 0.7807603580877185} 12/22/2021 14:32:35 - INFO - codeparrot_training - Step 7395: {'lr': 0.0004745389579369315, 'samples': 3786752, 'steps': 7395, 'batch_loss/train': 0.9380607563070953} 12/22/2021 14:32:46 - INFO - codeparrot_training - Step 7396: {'lr': 0.00047453202679876417, 'samples': 3787264, 'steps': 7396, 'batch_loss/train': 0.922282581217587} 12/22/2021 14:32:57 - INFO - codeparrot_training - Step 7397: {'lr': 0.00047452509476794225, 'samples': 3787776, 'steps': 7397, 'batch_loss/train': 0.8495742538943887} 12/22/2021 14:33:09 - INFO - codeparrot_training - Step 7398: {'lr': 0.00047451816184449337, 'samples': 3788288, 'steps': 7398, 'batch_loss/train': 0.7391049093566835} 12/22/2021 14:33:19 - INFO - codeparrot_training - Step 7399: {'lr': 0.00047451122802844503, 'samples': 3788800, 'steps': 7399, 'batch_loss/train': 0.7844569908920676} 12/22/2021 14:33:30 - INFO - codeparrot_training - Step 7400: {'lr': 0.0004745042933198248, 'samples': 3789312, 'steps': 7400, 'batch_loss/train': 0.8803804824128747} 12/22/2021 14:33:41 - INFO - codeparrot_training - Step 7401: {'lr': 0.0004744973577186603, 'samples': 3789824, 'steps': 7401, 'batch_loss/train': 0.8013826506212354} 12/22/2021 14:33:54 - INFO - codeparrot_training - Step 7402: {'lr': 0.0004744904212249791, 'samples': 3790336, 'steps': 7402, 'batch_loss/train': 0.8033814243972301} 12/22/2021 14:34:05 - INFO - codeparrot_training - Step 7403: {'lr': 0.0004744834838388087, 'samples': 3790848, 'steps': 7403, 'batch_loss/train': 1.1823236541822553} 12/22/2021 14:34:16 - INFO - codeparrot_training - Step 7404: {'lr': 0.0004744765455601767, 'samples': 3791360, 'steps': 7404, 'batch_loss/train': 0.8227425282821059} 12/22/2021 14:34:28 - INFO - codeparrot_training - Step 7405: {'lr': 0.00047446960638911084, 'samples': 3791872, 'steps': 7405, 'batch_loss/train': 0.6979093336267397} 12/22/2021 14:34:38 - INFO - codeparrot_training - Step 7406: {'lr': 
0.00047446266632563853, 'samples': 3792384, 'steps': 7406, 'batch_loss/train': 0.8098440109752119} 12/22/2021 14:34:49 - INFO - codeparrot_training - Step 7407: {'lr': 0.00047445572536978733, 'samples': 3792896, 'steps': 7407, 'batch_loss/train': 0.8501004707068205} 12/22/2021 14:35:01 - INFO - codeparrot_training - Step 7408: {'lr': 0.000474448783521585, 'samples': 3793408, 'steps': 7408, 'batch_loss/train': 0.8973354054614902} 12/22/2021 14:35:12 - INFO - codeparrot_training - Step 7409: {'lr': 0.000474441840781059, 'samples': 3793920, 'steps': 7409, 'batch_loss/train': 0.7981917532160878} 12/22/2021 14:35:22 - INFO - codeparrot_training - Step 7410: {'lr': 0.0004744348971482371, 'samples': 3794432, 'steps': 7410, 'batch_loss/train': 0.7433338053524494} 12/22/2021 14:35:33 - INFO - codeparrot_training - Step 7411: {'lr': 0.00047442795262314663, 'samples': 3794944, 'steps': 7411, 'batch_loss/train': 0.7766850795596838} 12/22/2021 14:35:45 - INFO - codeparrot_training - Step 7412: {'lr': 0.0004744210072058154, 'samples': 3795456, 'steps': 7412, 'batch_loss/train': 0.8328683711588383} 12/22/2021 14:35:56 - INFO - codeparrot_training - Step 7413: {'lr': 0.00047441406089627103, 'samples': 3795968, 'steps': 7413, 'batch_loss/train': 0.8308446411974728} 12/22/2021 14:36:06 - INFO - codeparrot_training - Step 7414: {'lr': 0.0004744071136945411, 'samples': 3796480, 'steps': 7414, 'batch_loss/train': 0.9815456382930279} 12/22/2021 14:36:20 - INFO - codeparrot_training - Step 7415: {'lr': 0.00047440016560065324, 'samples': 3796992, 'steps': 7415, 'batch_loss/train': 0.7600307064130902} 12/22/2021 14:36:31 - INFO - codeparrot_training - Step 7416: {'lr': 0.00047439321661463496, 'samples': 3797504, 'steps': 7416, 'batch_loss/train': 0.827815230935812} 12/22/2021 14:36:42 - INFO - codeparrot_training - Step 7417: {'lr': 0.0004743862667365141, 'samples': 3798016, 'steps': 7417, 'batch_loss/train': 0.7805816670879722} 12/22/2021 14:36:54 - INFO - codeparrot_training - Step 7418: 
{'lr': 0.0004743793159663181, 'samples': 3798528, 'steps': 7418, 'batch_loss/train': 0.8639273596927524} 12/22/2021 14:37:04 - INFO - codeparrot_training - Step 7419: {'lr': 0.0004743723643040746, 'samples': 3799040, 'steps': 7419, 'batch_loss/train': 0.8185944091528654} 12/22/2021 14:37:15 - INFO - codeparrot_training - Step 7420: {'lr': 0.0004743654117498114, 'samples': 3799552, 'steps': 7420, 'batch_loss/train': 0.7425479656085372} 12/22/2021 14:37:26 - INFO - codeparrot_training - Step 7421: {'lr': 0.00047435845830355594, 'samples': 3800064, 'steps': 7421, 'batch_loss/train': 0.815330651588738} 12/22/2021 14:37:38 - INFO - codeparrot_training - Step 7422: {'lr': 0.0004743515039653361, 'samples': 3800576, 'steps': 7422, 'batch_loss/train': 0.8166523650288582} 12/22/2021 14:37:48 - INFO - codeparrot_training - Step 7423: {'lr': 0.0004743445487351793, 'samples': 3801088, 'steps': 7423, 'batch_loss/train': 0.7766743777319789} 12/22/2021 14:37:59 - INFO - codeparrot_training - Step 7424: {'lr': 0.0004743375926131133, 'samples': 3801600, 'steps': 7424, 'batch_loss/train': 0.8753244765102863} 12/22/2021 14:38:13 - INFO - codeparrot_training - Step 7425: {'lr': 0.0004743306355991657, 'samples': 3802112, 'steps': 7425, 'batch_loss/train': 0.8776278144214302} 12/22/2021 14:38:23 - INFO - codeparrot_training - Step 7426: {'lr': 0.0004743236776933643, 'samples': 3802624, 'steps': 7426, 'batch_loss/train': 0.8964219652116299} 12/22/2021 14:38:34 - INFO - codeparrot_training - Step 7427: {'lr': 0.0004743167188957366, 'samples': 3803136, 'steps': 7427, 'batch_loss/train': 0.8881268026307225} 12/22/2021 14:38:46 - INFO - codeparrot_training - Step 7428: {'lr': 0.00047430975920631025, 'samples': 3803648, 'steps': 7428, 'batch_loss/train': 0.9480778984725475} 12/22/2021 14:38:57 - INFO - codeparrot_training - Step 7429: {'lr': 0.00047430279862511305, 'samples': 3804160, 'steps': 7429, 'batch_loss/train': 0.8916678931564093} 12/22/2021 14:39:07 - INFO - codeparrot_training - Step 
7430: {'lr': 0.0004742958371521727, 'samples': 3804672, 'steps': 7430, 'batch_loss/train': 0.762379227206111} 12/22/2021 14:39:18 - INFO - codeparrot_training - Step 7431: {'lr': 0.0004742888747875167, 'samples': 3805184, 'steps': 7431, 'batch_loss/train': 0.8785108840093017} 12/22/2021 14:39:32 - INFO - codeparrot_training - Step 7432: {'lr': 0.0004742819115311728, 'samples': 3805696, 'steps': 7432, 'batch_loss/train': 0.7628775632474571} 12/22/2021 14:39:42 - INFO - codeparrot_training - Step 7433: {'lr': 0.0004742749473831688, 'samples': 3806208, 'steps': 7433, 'batch_loss/train': 0.9266701908782125} 12/22/2021 14:39:53 - INFO - codeparrot_training - Step 7434: {'lr': 0.0004742679823435322, 'samples': 3806720, 'steps': 7434, 'batch_loss/train': 0.8551060175523162} 12/22/2021 14:40:05 - INFO - codeparrot_training - Step 7435: {'lr': 0.00047426101641229087, 'samples': 3807232, 'steps': 7435, 'batch_loss/train': 0.9052185695618391} 12/22/2021 14:40:16 - INFO - codeparrot_training - Step 7436: {'lr': 0.00047425404958947236, 'samples': 3807744, 'steps': 7436, 'batch_loss/train': 0.7752332086674869} 12/22/2021 14:40:26 - INFO - codeparrot_training - Step 7437: {'lr': 0.0004742470818751045, 'samples': 3808256, 'steps': 7437, 'batch_loss/train': 0.9379218658432364} 12/22/2021 14:40:38 - INFO - codeparrot_training - Step 7438: {'lr': 0.00047424011326921477, 'samples': 3808768, 'steps': 7438, 'batch_loss/train': 0.863781095482409} 12/22/2021 14:40:49 - INFO - codeparrot_training - Step 7439: {'lr': 0.00047423314377183115, 'samples': 3809280, 'steps': 7439, 'batch_loss/train': 0.6601737914606929} 12/22/2021 14:41:00 - INFO - codeparrot_training - Step 7440: {'lr': 0.0004742261733829811, 'samples': 3809792, 'steps': 7440, 'batch_loss/train': 0.8810279844328761} 12/22/2021 14:41:10 - INFO - codeparrot_training - Step 7441: {'lr': 0.00047421920210269255, 'samples': 3810304, 'steps': 7441, 'batch_loss/train': 0.8705039350315928} 12/22/2021 14:41:24 - INFO - codeparrot_training 
- Step 7442: {'lr': 0.00047421222993099303, 'samples': 3810816, 'steps': 7442, 'batch_loss/train': 0.8566071717068553} 12/22/2021 14:41:35 - INFO - codeparrot_training - Step 7443: {'lr': 0.0004742052568679104, 'samples': 3811328, 'steps': 7443, 'batch_loss/train': 0.91909109801054} 12/22/2021 14:41:46 - INFO - codeparrot_training - Step 7444: {'lr': 0.00047419828291347233, 'samples': 3811840, 'steps': 7444, 'batch_loss/train': 0.690821957658045} 12/22/2021 14:41:58 - INFO - codeparrot_training - Step 7445: {'lr': 0.0004741913080677065, 'samples': 3812352, 'steps': 7445, 'batch_loss/train': 0.9864846877753735} 12/22/2021 14:42:08 - INFO - codeparrot_training - Step 7446: {'lr': 0.0004741843323306407, 'samples': 3812864, 'steps': 7446, 'batch_loss/train': 0.9224138082936406} 12/22/2021 14:42:19 - INFO - codeparrot_training - Step 7447: {'lr': 0.0004741773557023027, 'samples': 3813376, 'steps': 7447, 'batch_loss/train': 0.9420093391090631} 12/22/2021 14:42:31 - INFO - codeparrot_training - Step 7448: {'lr': 0.0004741703781827201, 'samples': 3813888, 'steps': 7448, 'batch_loss/train': 0.8076666127890348} 12/22/2021 14:42:42 - INFO - codeparrot_training - Step 7449: {'lr': 0.0004741633997719208, 'samples': 3814400, 'steps': 7449, 'batch_loss/train': 0.6548409853130579} 12/22/2021 14:42:52 - INFO - codeparrot_training - Step 7450: {'lr': 0.00047415642046993236, 'samples': 3814912, 'steps': 7450, 'batch_loss/train': 0.8299872381612659} 12/22/2021 14:43:03 - INFO - codeparrot_training - Step 7451: {'lr': 0.00047414944027678274, 'samples': 3815424, 'steps': 7451, 'batch_loss/train': 0.884418273344636} 12/22/2021 14:43:15 - INFO - codeparrot_training - Step 7452: {'lr': 0.0004741424591924995, 'samples': 3815936, 'steps': 7452, 'batch_loss/train': 0.8156142570078373} 12/22/2021 14:43:26 - INFO - codeparrot_training - Step 7453: {'lr': 0.0004741354772171105, 'samples': 3816448, 'steps': 7453, 'batch_loss/train': 0.9541308113839477} 12/22/2021 14:43:36 - INFO - 
codeparrot_training - Step 7454: {'lr': 0.00047412849435064347, 'samples': 3816960, 'steps': 7454, 'batch_loss/train': 0.9603907028213143} 12/22/2021 14:43:50 - INFO - codeparrot_training - Step 7455: {'lr': 0.00047412151059312614, 'samples': 3817472, 'steps': 7455, 'batch_loss/train': 0.884848166257143} 12/22/2021 14:44:01 - INFO - codeparrot_training - Step 7456: {'lr': 0.00047411452594458637, 'samples': 3817984, 'steps': 7456, 'batch_loss/train': 0.7579919565469027} 12/22/2021 14:44:11 - INFO - codeparrot_training - Step 7457: {'lr': 0.0004741075404050519, 'samples': 3818496, 'steps': 7457, 'batch_loss/train': 0.8435037974268198} 12/22/2021 14:44:23 - INFO - codeparrot_training - Step 7458: {'lr': 0.0004741005539745504, 'samples': 3819008, 'steps': 7458, 'batch_loss/train': 0.6862323060631752} 12/22/2021 14:44:34 - INFO - codeparrot_training - Step 7459: {'lr': 0.0004740935666531097, 'samples': 3819520, 'steps': 7459, 'batch_loss/train': 1.1161625641398132} 12/22/2021 14:44:45 - INFO - codeparrot_training - Step 7460: {'lr': 0.00047408657844075767, 'samples': 3820032, 'steps': 7460, 'batch_loss/train': 0.9285465097054839} 12/22/2021 14:44:55 - INFO - codeparrot_training - Step 7461: {'lr': 0.00047407958933752195, 'samples': 3820544, 'steps': 7461, 'batch_loss/train': 0.7508604233153164} 12/22/2021 14:45:07 - INFO - codeparrot_training - Step 7462: {'lr': 0.0004740725993434304, 'samples': 3821056, 'steps': 7462, 'batch_loss/train': 0.8651629975065589} 12/22/2021 14:45:18 - INFO - codeparrot_training - Step 7463: {'lr': 0.00047406560845851076, 'samples': 3821568, 'steps': 7463, 'batch_loss/train': 0.902547157369554} 12/22/2021 14:45:28 - INFO - codeparrot_training - Step 7464: {'lr': 0.00047405861668279095, 'samples': 3822080, 'steps': 7464, 'batch_loss/train': 0.7927557015791535} 12/22/2021 14:45:43 - INFO - codeparrot_training - Step 7465: {'lr': 0.00047405162401629867, 'samples': 3822592, 'steps': 7465, 'batch_loss/train': 0.7981174644082785} 12/22/2021 
14:45:54 - INFO - codeparrot_training - Step 7466: {'lr': 0.00047404463045906166, 'samples': 3823104, 'steps': 7466, 'batch_loss/train': 0.857662440277636} 12/22/2021 14:46:04 - INFO - codeparrot_training - Step 7467: {'lr': 0.00047403763601110784, 'samples': 3823616, 'steps': 7467, 'batch_loss/train': 0.7587837565224618} 12/22/2021 14:46:17 - INFO - codeparrot_training - Step 7468: {'lr': 0.00047403064067246496, 'samples': 3824128, 'steps': 7468, 'batch_loss/train': 0.7543335319496691} 12/22/2021 14:46:28 - INFO - codeparrot_training - Step 7469: {'lr': 0.0004740236444431609, 'samples': 3824640, 'steps': 7469, 'batch_loss/train': 0.7953319605439901} 12/22/2021 14:46:38 - INFO - codeparrot_training - Step 7470: {'lr': 0.00047401664732322335, 'samples': 3825152, 'steps': 7470, 'batch_loss/train': 0.8335494983475655} 12/22/2021 14:46:49 - INFO - codeparrot_training - Step 7471: {'lr': 0.0004740096493126802, 'samples': 3825664, 'steps': 7471, 'batch_loss/train': 2.186828937381506} 12/22/2021 14:47:03 - INFO - codeparrot_training - Step 7472: {'lr': 0.0004740026504115593, 'samples': 3826176, 'steps': 7472, 'batch_loss/train': 1.2198208216577768} 12/22/2021 14:47:13 - INFO - codeparrot_training - Step 7473: {'lr': 0.0004739956506198885, 'samples': 3826688, 'steps': 7473, 'batch_loss/train': 0.8216623961925507} 12/22/2021 14:47:24 - INFO - codeparrot_training - Step 7474: {'lr': 0.00047398864993769546, 'samples': 3827200, 'steps': 7474, 'batch_loss/train': 0.8781544880475849} 12/22/2021 14:47:34 - INFO - codeparrot_training - Step 7475: {'lr': 0.00047398164836500817, 'samples': 3827712, 'steps': 7475, 'batch_loss/train': 0.8307502309326082} 12/22/2021 14:47:47 - INFO - codeparrot_training - Step 7476: {'lr': 0.0004739746459018544, 'samples': 3828224, 'steps': 7476, 'batch_loss/train': 0.8641582820564508} 12/22/2021 14:47:57 - INFO - codeparrot_training - Step 7477: {'lr': 0.00047396764254826204, 'samples': 3828736, 'steps': 7477, 'batch_loss/train': 0.8112185792997479} 
12/22/2021 14:48:08 - INFO - codeparrot_training - Step 7478: {'lr': 0.00047396063830425884, 'samples': 3829248, 'steps': 7478, 'batch_loss/train': 0.8739717658609152} 12/22/2021 14:48:20 - INFO - codeparrot_training - Step 7479: {'lr': 0.00047395363316987275, 'samples': 3829760, 'steps': 7479, 'batch_loss/train': 0.926356989890337} 12/22/2021 14:48:31 - INFO - codeparrot_training - Step 7480: {'lr': 0.0004739466271451316, 'samples': 3830272, 'steps': 7480, 'batch_loss/train': 0.8382454235106707} 12/22/2021 14:48:41 - INFO - codeparrot_training - Step 7481: {'lr': 0.0004739396202300632, 'samples': 3830784, 'steps': 7481, 'batch_loss/train': 0.7024494024226442} 12/22/2021 14:48:55 - INFO - codeparrot_training - Step 7482: {'lr': 0.0004739326124246954, 'samples': 3831296, 'steps': 7482, 'batch_loss/train': 0.8681825376115739} 12/22/2021 14:49:06 - INFO - codeparrot_training - Step 7483: {'lr': 0.00047392560372905615, 'samples': 3831808, 'steps': 7483, 'batch_loss/train': 0.894986261613667} 12/22/2021 14:49:16 - INFO - codeparrot_training - Step 7484: {'lr': 0.0004739185941431732, 'samples': 3832320, 'steps': 7484, 'batch_loss/train': 0.9226340996101499} 12/22/2021 14:49:28 - INFO - codeparrot_training - Step 7485: {'lr': 0.00047391158366707445, 'samples': 3832832, 'steps': 7485, 'batch_loss/train': 0.844125549774617} 12/22/2021 14:49:39 - INFO - codeparrot_training - Step 7486: {'lr': 0.0004739045723007878, 'samples': 3833344, 'steps': 7486, 'batch_loss/train': 0.8316954541951418} 12/22/2021 14:49:50 - INFO - codeparrot_training - Step 7487: {'lr': 0.0004738975600443411, 'samples': 3833856, 'steps': 7487, 'batch_loss/train': 0.8183707599528134} 12/22/2021 14:50:00 - INFO - codeparrot_training - Step 7488: {'lr': 0.00047389054689776235, 'samples': 3834368, 'steps': 7488, 'batch_loss/train': 0.9174208622425795} 12/22/2021 14:50:15 - INFO - codeparrot_training - Step 7489: {'lr': 0.00047388353286107924, 'samples': 3834880, 'steps': 7489, 'batch_loss/train': 
0.884535538032651}
12/22/2021 14:50:25 - INFO - codeparrot_training - Step 7490: {'lr': 0.00047387651793431976, 'samples': 3835392, 'steps': 7490, 'batch_loss/train': 0.751802250277251}
12/22/2021 14:50:36 - INFO - codeparrot_training - Step 7491: {'lr': 0.00047386950211751166, 'samples': 3835904, 'steps': 7491, 'batch_loss/train': 0.8378577632829547}
12/22/2021 14:50:48 - INFO - codeparrot_training - Step 7492: {'lr': 0.0004738624854106831, 'samples': 3836416, 'steps': 7492, 'batch_loss/train': 1.1982022465672344}
12/22/2021 14:50:58 - INFO - codeparrot_training - Step 7493: {'lr': 0.0004738554678138618, 'samples': 3836928, 'steps': 7493, 'batch_loss/train': 0.8160443096421659}
12/22/2021 14:51:09 - INFO - codeparrot_training - Step 7494: {'lr': 0.0004738484493270756, 'samples': 3837440, 'steps': 7494, 'batch_loss/train': 0.8627600027248263}
12/22/2021 14:51:20 - INFO - codeparrot_training - Step 7495: {'lr': 0.0004738414299503526, 'samples': 3837952, 'steps': 7495, 'batch_loss/train': 0.8406823175027966}
12/22/2021 14:51:32 - INFO - codeparrot_training - Step 7496: {'lr': 0.0004738344096837205, 'samples': 3838464, 'steps': 7496, 'batch_loss/train': 0.8508276538923383}
12/22/2021 14:51:43 - INFO - codeparrot_training - Step 7497: {'lr': 0.0004738273885272074, 'samples': 3838976, 'steps': 7497, 'batch_loss/train': 0.7707238211296499}
12/22/2021 14:51:53 - INFO - codeparrot_training - Step 7498: {'lr': 0.000473820366480841, 'samples': 3839488, 'steps': 7498, 'batch_loss/train': 0.8493719166144729}
12/22/2021 14:52:07 - INFO - codeparrot_training - Step 7499: {'lr': 0.00047381334354464945, 'samples': 3840000, 'steps': 7499, 'batch_loss/train': 0.9593390175141394}
12/22/2021 14:52:07 - INFO - codeparrot_training - Evaluating and saving model checkpoint
12/23/2021 13:47:06 - INFO - codeparrot_training - Distributed environment: TPU Num processes: 8 Process index: 0 Local process index: 0 Device: xla:1 Use FP16 precision: False
12/23/2021 13:47:06 - WARNING - huggingface_hub.repository - Revision `usual-cosmos-13` does not exist. Created and checked out branch `usual-cosmos-13`.
12/23/2021 13:47:06 - WARNING - huggingface_hub.repository -
12/23/2021 13:47:19 - WARNING - datasets.builder - Using custom data configuration lvwerra___codeparrot-clean-train-a1efdd1059bd841d
12/23/2021 13:47:20 - WARNING - datasets.builder - Using custom data configuration lvwerra___codeparrot-clean-valid-a800eb55c299abc0
12/23/2021 13:50:01 - INFO - codeparrot_training - Distributed environment: TPU Num processes: 8 Process index: 0 Local process index: 0 Device: xla:1 Use FP16 precision: False
12/23/2021 13:50:01 - WARNING - huggingface_hub.repository - Revision `peach-smoke-14` does not exist. Created and checked out branch `peach-smoke-14`.
12/23/2021 13:50:01 - WARNING - huggingface_hub.repository -
12/23/2021 13:50:07 - WARNING - datasets.builder - Using custom data configuration lvwerra___codeparrot-clean-train-a1efdd1059bd841d
12/23/2021 13:50:08 - WARNING - datasets.builder - Using custom data configuration lvwerra___codeparrot-clean-valid-a800eb55c299abc0
12/23/2021 15:28:29 - INFO - codeparrot_training - Step 5001: {'lr': 0.0004885148644702411, 'samples': 2561024, 'steps': 5001, 'batch_loss/train': 1.5026091635227203}
12/23/2021 15:29:47 - INFO - codeparrot_training - Step 5002: {'lr': 0.0004885101410877608, 'samples': 2561536, 'steps': 5002, 'batch_loss/train': 1.1865872200578451}
12/23/2021 15:31:11 - INFO - codeparrot_training - Step 5003: {'lr': 0.0004885054167570542, 'samples': 2562048, 'steps': 5003, 'batch_loss/train': 0.94274543505162}
12/23/2021 15:31:21 - INFO - codeparrot_training - Step 5004: {'lr': 0.0004885006914781403, 'samples': 2562560, 'steps': 5004, 'batch_loss/train': 0.9444990511983633}
12/23/2021 15:31:33 - INFO - codeparrot_training - Step 5005: {'lr': 0.0004884959652510374, 'samples': 2563072, 'steps': 5005, 'batch_loss/train': 0.937822412699461}
12/23/2021 15:31:44 - INFO - codeparrot_training - Step 5006:
{'lr': 0.0004884912380757649, 'samples': 2563584, 'steps': 5006, 'batch_loss/train': 1.0078136511147022} 12/23/2021 15:31:55 - INFO - codeparrot_training - Step 5007: {'lr': 0.0004884865099523411, 'samples': 2564096, 'steps': 5007, 'batch_loss/train': 0.9146937504410744} 12/23/2021 15:32:05 - INFO - codeparrot_training - Step 5008: {'lr': 0.0004884817808807851, 'samples': 2564608, 'steps': 5008, 'batch_loss/train': 0.9582442473620176} 12/23/2021 15:32:18 - INFO - codeparrot_training - Step 5009: {'lr': 0.0004884770508611155, 'samples': 2565120, 'steps': 5009, 'batch_loss/train': 1.0170180341228843} 12/23/2021 15:32:29 - INFO - codeparrot_training - Step 5010: {'lr': 0.0004884723198933513, 'samples': 2565632, 'steps': 5010, 'batch_loss/train': 1.0913577722385526} 12/23/2021 15:32:39 - INFO - codeparrot_training - Step 5011: {'lr': 0.0004884675879775111, 'samples': 2566144, 'steps': 5011, 'batch_loss/train': 0.8828882509842515} 12/23/2021 15:32:51 - INFO - codeparrot_training - Step 5012: {'lr': 0.0004884628551136139, 'samples': 2566656, 'steps': 5012, 'batch_loss/train': 0.8071097414940596} 12/23/2021 15:33:02 - INFO - codeparrot_training - Step 5013: {'lr': 0.0004884581213016782, 'samples': 2567168, 'steps': 5013, 'batch_loss/train': 0.8648702101781964} 12/23/2021 15:33:13 - INFO - codeparrot_training - Step 5014: {'lr': 0.0004884533865417233, 'samples': 2567680, 'steps': 5014, 'batch_loss/train': 0.8629378015175462} 12/23/2021 15:33:25 - INFO - codeparrot_training - Step 5015: {'lr': 0.0004884486508337677, 'samples': 2568192, 'steps': 5015, 'batch_loss/train': 0.9607390277087688} 12/23/2021 15:33:35 - INFO - codeparrot_training - Step 5016: {'lr': 0.0004884439141778303, 'samples': 2568704, 'steps': 5016, 'batch_loss/train': 0.9778957339003682} 12/23/2021 15:33:46 - INFO - codeparrot_training - Step 5017: {'lr': 0.0004884391765739299, 'samples': 2569216, 'steps': 5017, 'batch_loss/train': 1.0375934960320592} 12/23/2021 15:33:59 - INFO - codeparrot_training - Step 
5018: {'lr': 0.0004884344380220853, 'samples': 2569728, 'steps': 5018, 'batch_loss/train': 0.9520214977674186} 12/23/2021 15:34:09 - INFO - codeparrot_training - Step 5019: {'lr': 0.0004884296985223156, 'samples': 2570240, 'steps': 5019, 'batch_loss/train': 0.7895133283454925} 12/23/2021 15:34:20 - INFO - codeparrot_training - Step 5020: {'lr': 0.0004884249580746393, 'samples': 2570752, 'steps': 5020, 'batch_loss/train': 0.9972262429073453} 12/23/2021 15:34:30 - INFO - codeparrot_training - Step 5021: {'lr': 0.0004884202166790754, 'samples': 2571264, 'steps': 5021, 'batch_loss/train': 0.9639650443568826} 12/23/2021 15:34:42 - INFO - codeparrot_training - Step 5022: {'lr': 0.0004884154743356428, 'samples': 2571776, 'steps': 5022, 'batch_loss/train': 0.9555566008202732} 12/23/2021 15:34:53 - INFO - codeparrot_training - Step 5023: {'lr': 0.0004884107310443602, 'samples': 2572288, 'steps': 5023, 'batch_loss/train': 0.9700555251911283} 12/23/2021 15:35:04 - INFO - codeparrot_training - Step 5024: {'lr': 0.0004884059868052466, 'samples': 2572800, 'steps': 5024, 'batch_loss/train': 0.9354477424640208} 12/23/2021 15:35:17 - INFO - codeparrot_training - Step 5025: {'lr': 0.0004884012416183209, 'samples': 2573312, 'steps': 5025, 'batch_loss/train': 1.6362912580370903} 12/23/2021 15:35:28 - INFO - codeparrot_training - Step 5026: {'lr': 0.0004883964954836017, 'samples': 2573824, 'steps': 5026, 'batch_loss/train': 0.8716003783047199} 12/23/2021 15:35:38 - INFO - codeparrot_training - Step 5027: {'lr': 0.0004883917484011083, 'samples': 2574336, 'steps': 5027, 'batch_loss/train': 0.983111934736371} 12/23/2021 15:35:51 - INFO - codeparrot_training - Step 5028: {'lr': 0.0004883870003708591, 'samples': 2574848, 'steps': 5028, 'batch_loss/train': 0.972165203653276} 12/23/2021 15:36:01 - INFO - codeparrot_training - Step 5029: {'lr': 0.0004883822513928733, 'samples': 2575360, 'steps': 5029, 'batch_loss/train': 0.9556087236851454} 12/23/2021 15:36:12 - INFO - codeparrot_training - 
Step 5030: {'lr': 0.0004883775014671697, 'samples': 2575872, 'steps': 5030, 'batch_loss/train': 0.8103720187209547} 12/23/2021 15:36:23 - INFO - codeparrot_training - Step 5031: {'lr': 0.0004883727505937671, 'samples': 2576384, 'steps': 5031, 'batch_loss/train': 0.8694045953452587} 12/23/2021 15:36:35 - INFO - codeparrot_training - Step 5032: {'lr': 0.0004883679987726844, 'samples': 2576896, 'steps': 5032, 'batch_loss/train': 0.7771210488863289} 12/23/2021 15:36:45 - INFO - codeparrot_training - Step 5033: {'lr': 0.0004883632460039406, 'samples': 2577408, 'steps': 5033, 'batch_loss/train': 0.8132210718467832} 12/23/2021 15:36:56 - INFO - codeparrot_training - Step 5034: {'lr': 0.0004883584922875546, 'samples': 2577920, 'steps': 5034, 'batch_loss/train': 0.8654826935380697} 12/23/2021 15:37:09 - INFO - codeparrot_training - Step 5035: {'lr': 0.0004883537376235452, 'samples': 2578432, 'steps': 5035, 'batch_loss/train': 0.8686805320903659} 12/23/2021 15:37:19 - INFO - codeparrot_training - Step 5036: {'lr': 0.0004883489820119313, 'samples': 2578944, 'steps': 5036, 'batch_loss/train': 0.8766453023999929} 12/23/2021 15:37:30 - INFO - codeparrot_training - Step 5037: {'lr': 0.0004883442254527319, 'samples': 2579456, 'steps': 5037, 'batch_loss/train': 0.8979486878961325} 12/23/2021 15:37:42 - INFO - codeparrot_training - Step 5038: {'lr': 0.0004883394679459658, 'samples': 2579968, 'steps': 5038, 'batch_loss/train': 0.9537549670785666} 12/23/2021 15:37:52 - INFO - codeparrot_training - Step 5039: {'lr': 0.000488334709491652, 'samples': 2580480, 'steps': 5039, 'batch_loss/train': 0.9081472549587488} 12/23/2021 15:38:03 - INFO - codeparrot_training - Step 5040: {'lr': 0.0004883299500898094, 'samples': 2580992, 'steps': 5040, 'batch_loss/train': 0.8420660514384508} 12/23/2021 15:38:14 - INFO - codeparrot_training - Step 5041: {'lr': 0.0004883251897404569, 'samples': 2581504, 'steps': 5041, 'batch_loss/train': 0.8815709548071027} 12/23/2021 15:38:26 - INFO - 
codeparrot_training - Step 5042: {'lr': 0.0004883204284436134, 'samples': 2582016, 'steps': 5042, 'batch_loss/train': 0.9119223123416305} 12/23/2021 15:38:36 - INFO - codeparrot_training - Step 5043: {'lr': 0.0004883156661992979, 'samples': 2582528, 'steps': 5043, 'batch_loss/train': 0.8186895484104753} 12/23/2021 15:38:47 - INFO - codeparrot_training - Step 5044: {'lr': 0.0004883109030075293, 'samples': 2583040, 'steps': 5044, 'batch_loss/train': 0.8131888685747981} 12/23/2021 15:38:59 - INFO - codeparrot_training - Step 5045: {'lr': 0.0004883061388683264, 'samples': 2583552, 'steps': 5045, 'batch_loss/train': 0.8855598121881485} 12/23/2021 15:39:10 - INFO - codeparrot_training - Step 5046: {'lr': 0.0004883013737817085, 'samples': 2584064, 'steps': 5046, 'batch_loss/train': 0.8686516782036051} 12/23/2021 15:39:20 - INFO - codeparrot_training - Step 5047: {'lr': 0.0004882966077476941, 'samples': 2584576, 'steps': 5047, 'batch_loss/train': 0.8775115087628365} 12/23/2021 15:39:33 - INFO - codeparrot_training - Step 5048: {'lr': 0.0004882918407663025, 'samples': 2585088, 'steps': 5048, 'batch_loss/train': 0.9477125005796552} 12/23/2021 15:39:44 - INFO - codeparrot_training - Step 5049: {'lr': 0.0004882870728375525, 'samples': 2585600, 'steps': 5049, 'batch_loss/train': 0.9402937367558479} 12/23/2021 15:39:54 - INFO - codeparrot_training - Step 5050: {'lr': 0.000488282303961463, 'samples': 2586112, 'steps': 5050, 'batch_loss/train': 0.8628735924139619} 12/23/2021 15:40:06 - INFO - codeparrot_training - Step 5051: {'lr': 0.0004882775341380531, 'samples': 2586624, 'steps': 5051, 'batch_loss/train': 0.818708369974047} 12/23/2021 15:40:17 - INFO - codeparrot_training - Step 5052: {'lr': 0.0004882727633673417, 'samples': 2587136, 'steps': 5052, 'batch_loss/train': 0.7817599646514282} 12/23/2021 15:40:28 - INFO - codeparrot_training - Step 5053: {'lr': 0.0004882679916493478, 'samples': 2587648, 'steps': 5053, 'batch_loss/train': 0.9761519134044647} 12/23/2021 15:40:38 - INFO 
- codeparrot_training - Step 5054: {'lr': 0.0004882632189840902, 'samples': 2588160, 'steps': 5054, 'batch_loss/train': 0.9316751328296959} 12/23/2021 15:40:51 - INFO - codeparrot_training - Step 5055: {'lr': 0.00048825844537158807, 'samples': 2588672, 'steps': 5055, 'batch_loss/train': 0.6149575223680586} 12/23/2021 15:41:02 - INFO - codeparrot_training - Step 5056: {'lr': 0.00048825367081186034, 'samples': 2589184, 'steps': 5056, 'batch_loss/train': 0.7953188382089138} 12/23/2021 15:41:12 - INFO - codeparrot_training - Step 5057: {'lr': 0.000488248895304926, 'samples': 2589696, 'steps': 5057, 'batch_loss/train': 0.8482408225536346} 12/23/2021 15:41:25 - INFO - codeparrot_training - Step 5058: {'lr': 0.00048824411885080396, 'samples': 2590208, 'steps': 5058, 'batch_loss/train': 0.9526168508455157} 12/23/2021 15:41:35 - INFO - codeparrot_training - Step 5059: {'lr': 0.00048823934144951326, 'samples': 2590720, 'steps': 5059, 'batch_loss/train': 0.843859089538455} 12/23/2021 15:41:46 - INFO - codeparrot_training - Step 5060: {'lr': 0.000488234563101073, 'samples': 2591232, 'steps': 5060, 'batch_loss/train': 0.7656138380989432} 12/23/2021 15:41:58 - INFO - codeparrot_training - Step 5061: {'lr': 0.00048822978380550186, 'samples': 2591744, 'steps': 5061, 'batch_loss/train': 0.7750613241223618} 12/23/2021 15:42:09 - INFO - codeparrot_training - Step 5062: {'lr': 0.00048822500356281916, 'samples': 2592256, 'steps': 5062, 'batch_loss/train': 0.7809433155925944} 12/23/2021 15:42:19 - INFO - codeparrot_training - Step 5063: {'lr': 0.00048822022237304385, 'samples': 2592768, 'steps': 5063, 'batch_loss/train': 0.855440502637066} 12/23/2021 15:42:32 - INFO - codeparrot_training - Step 5064: {'lr': 0.0004882154402361949, 'samples': 2593280, 'steps': 5064, 'batch_loss/train': 0.9073802577331662} 12/23/2021 15:42:43 - INFO - codeparrot_training - Step 5065: {'lr': 0.0004882106571522912, 'samples': 2593792, 'steps': 5065, 'batch_loss/train': 0.7754645613022149} 12/23/2021 15:42:53 
- INFO - codeparrot_training - Step 5066: {'lr': 0.00048820587312135193, 'samples': 2594304, 'steps': 5066, 'batch_loss/train': 0.8155799154192209} 12/23/2021 15:43:04 - INFO - codeparrot_training - Step 5067: {'lr': 0.000488201088143396, 'samples': 2594816, 'steps': 5067, 'batch_loss/train': 0.9725690968334675} 12/23/2021 15:43:16 - INFO - codeparrot_training - Step 5068: {'lr': 0.0004881963022184426, 'samples': 2595328, 'steps': 5068, 'batch_loss/train': 0.8566752867773175} 12/23/2021 15:43:27 - INFO - codeparrot_training - Step 5069: {'lr': 0.0004881915153465106, 'samples': 2595840, 'steps': 5069, 'batch_loss/train': 0.8825627989135683} 12/23/2021 15:43:37 - INFO - codeparrot_training - Step 5070: {'lr': 0.000488186727527619, 'samples': 2596352, 'steps': 5070, 'batch_loss/train': 0.9566769255325198} 12/23/2021 15:43:49 - INFO - codeparrot_training - Step 5071: {'lr': 0.00048818193876178697, 'samples': 2596864, 'steps': 5071, 'batch_loss/train': 0.8984131230972707} 12/23/2021 15:44:00 - INFO - codeparrot_training - Step 5072: {'lr': 0.00048817714904903344, 'samples': 2597376, 'steps': 5072, 'batch_loss/train': 0.8612307119183242} 12/23/2021 15:44:10 - INFO - codeparrot_training - Step 5073: {'lr': 0.00048817235838937757, 'samples': 2597888, 'steps': 5073, 'batch_loss/train': 0.9760541040450335} 12/23/2021 15:44:23 - INFO - codeparrot_training - Step 5074: {'lr': 0.00048816756678283824, 'samples': 2598400, 'steps': 5074, 'batch_loss/train': 0.87648830935359} 12/23/2021 15:44:34 - INFO - codeparrot_training - Step 5075: {'lr': 0.0004881627742294347, 'samples': 2598912, 'steps': 5075, 'batch_loss/train': 0.8510495238006115} 12/23/2021 15:44:44 - INFO - codeparrot_training - Step 5076: {'lr': 0.0004881579807291858, 'samples': 2599424, 'steps': 5076, 'batch_loss/train': 0.9107937561348081} 12/23/2021 15:44:56 - INFO - codeparrot_training - Step 5077: {'lr': 0.00048815318628211074, 'samples': 2599936, 'steps': 5077, 'batch_loss/train': 0.8791267406195402} 12/23/2021 
15:45:07 - INFO - codeparrot_training - Step 5078: {'lr': 0.0004881483908882285, 'samples': 2600448, 'steps': 5078, 'batch_loss/train': 0.8774698209017515} 12/23/2021 15:45:18 - INFO - codeparrot_training - Step 5079: {'lr': 0.00048814359454755823, 'samples': 2600960, 'steps': 5079, 'batch_loss/train': 0.8795020994730294} 12/23/2021 15:45:28 - INFO - codeparrot_training - Step 5080: {'lr': 0.0004881387972601189, 'samples': 2601472, 'steps': 5080, 'batch_loss/train': 0.7977597746066749} 12/23/2021 15:45:41 - INFO - codeparrot_training - Step 5081: {'lr': 0.0004881339990259297, 'samples': 2601984, 'steps': 5081, 'batch_loss/train': 0.8704285835847259} 12/23/2021 15:45:51 - INFO - codeparrot_training - Step 5082: {'lr': 0.0004881291998450096, 'samples': 2602496, 'steps': 5082, 'batch_loss/train': 0.9047094183042645} 12/23/2021 15:46:02 - INFO - codeparrot_training - Step 5083: {'lr': 0.0004881243997173777, 'samples': 2603008, 'steps': 5083, 'batch_loss/train': 0.8659959118813276} 12/23/2021 15:46:15 - INFO - codeparrot_training - Step 5084: {'lr': 0.0004881195986430531, 'samples': 2603520, 'steps': 5084, 'batch_loss/train': 0.9663840988650918} 12/23/2021 15:46:25 - INFO - codeparrot_training - Step 5085: {'lr': 0.0004881147966220549, 'samples': 2604032, 'steps': 5085, 'batch_loss/train': 0.973696562461555} 12/23/2021 15:46:36 - INFO - codeparrot_training - Step 5086: {'lr': 0.0004881099936544021, 'samples': 2604544, 'steps': 5086, 'batch_loss/train': 0.8294914495199919} 12/23/2021 15:46:48 - INFO - codeparrot_training - Step 5087: {'lr': 0.000488105189740114, 'samples': 2605056, 'steps': 5087, 'batch_loss/train': 0.9522217009216547} 12/23/2021 15:46:58 - INFO - codeparrot_training - Step 5088: {'lr': 0.0004881003848792095, 'samples': 2605568, 'steps': 5088, 'batch_loss/train': 0.8545662397518754} 12/23/2021 15:47:09 - INFO - codeparrot_training - Step 5089: {'lr': 0.00048809557907170776, 'samples': 2606080, 'steps': 5089, 'batch_loss/train': 0.7805071684997529} 
12/23/2021 15:47:20 - INFO - codeparrot_training - Step 5090: {'lr': 0.0004880907723176279, 'samples': 2606592, 'steps': 5090, 'batch_loss/train': 0.8963091094046831} 12/23/2021 15:47:31 - INFO - codeparrot_training - Step 5091: {'lr': 0.00048808596461698907, 'samples': 2607104, 'steps': 5091, 'batch_loss/train': 0.9134775875136256} 12/23/2021 15:47:42 - INFO - codeparrot_training - Step 5092: {'lr': 0.00048808115596981033, 'samples': 2607616, 'steps': 5092, 'batch_loss/train': 0.8666937302332371} 12/23/2021 15:47:53 - INFO - codeparrot_training - Step 5093: {'lr': 0.00048807634637611084, 'samples': 2608128, 'steps': 5093, 'batch_loss/train': 0.8661548839882016} 12/23/2021 15:48:05 - INFO - codeparrot_training - Step 5094: {'lr': 0.00048807153583590963, 'samples': 2608640, 'steps': 5094, 'batch_loss/train': 0.9108794671483338} 12/23/2021 15:48:16 - INFO - codeparrot_training - Step 5095: {'lr': 0.00048806672434922594, 'samples': 2609152, 'steps': 5095, 'batch_loss/train': 0.9526095055043697} 12/23/2021 15:48:27 - INFO - codeparrot_training - Step 5096: {'lr': 0.00048806191191607876, 'samples': 2609664, 'steps': 5096, 'batch_loss/train': 0.8360253497958183} 12/23/2021 15:48:39 - INFO - codeparrot_training - Step 5097: {'lr': 0.00048805709853648737, 'samples': 2610176, 'steps': 5097, 'batch_loss/train': 0.8948234813287854} 12/23/2021 15:48:49 - INFO - codeparrot_training - Step 5098: {'lr': 0.00048805228421047085, 'samples': 2610688, 'steps': 5098, 'batch_loss/train': 0.8901030803099275} 12/23/2021 15:49:00 - INFO - codeparrot_training - Step 5099: {'lr': 0.0004880474689380483, 'samples': 2611200, 'steps': 5099, 'batch_loss/train': 0.854511615820229} 12/23/2021 15:49:12 - INFO - codeparrot_training - Step 5100: {'lr': 0.0004880426527192389, 'samples': 2611712, 'steps': 5100, 'batch_loss/train': 0.8664661478251219} 12/23/2021 15:49:22 - INFO - codeparrot_training - Step 5101: {'lr': 0.00048803783555406184, 'samples': 2612224, 'steps': 5101, 'batch_loss/train': 
0.859831496141851} 12/23/2021 15:49:33 - INFO - codeparrot_training - Step 5102: {'lr': 0.0004880330174425362, 'samples': 2612736, 'steps': 5102, 'batch_loss/train': 0.8280322453938425} 12/23/2021 15:49:44 - INFO - codeparrot_training - Step 5103: {'lr': 0.00048802819838468113, 'samples': 2613248, 'steps': 5103, 'batch_loss/train': 0.8251943215727806} 12/23/2021 15:49:56 - INFO - codeparrot_training - Step 5104: {'lr': 0.00048802337838051586, 'samples': 2613760, 'steps': 5104, 'batch_loss/train': 0.8112712358124554} 12/23/2021 15:50:07 - INFO - codeparrot_training - Step 5105: {'lr': 0.0004880185574300595, 'samples': 2614272, 'steps': 5105, 'batch_loss/train': 0.9277708940207958} 12/23/2021 15:50:18 - INFO - codeparrot_training - Step 5106: {'lr': 0.0004880137355333312, 'samples': 2614784, 'steps': 5106, 'batch_loss/train': 0.8675226448103786} 12/23/2021 15:50:30 - INFO - codeparrot_training - Step 5107: {'lr': 0.00048800891269035017, 'samples': 2615296, 'steps': 5107, 'batch_loss/train': 0.7895541829057038} 12/23/2021 15:50:40 - INFO - codeparrot_training - Step 5108: {'lr': 0.0004880040889011356, 'samples': 2615808, 'steps': 5108, 'batch_loss/train': 0.8319482561200857} 12/23/2021 15:50:51 - INFO - codeparrot_training - Step 5109: {'lr': 0.0004879992641657066, 'samples': 2616320, 'steps': 5109, 'batch_loss/train': 0.8801541225984693} 12/23/2021 15:51:04 - INFO - codeparrot_training - Step 5110: {'lr': 0.00048799443848408243, 'samples': 2616832, 'steps': 5110, 'batch_loss/train': 0.6825656982837245} 12/23/2021 15:51:14 - INFO - codeparrot_training - Step 5111: {'lr': 0.0004879896118562822, 'samples': 2617344, 'steps': 5111, 'batch_loss/train': 0.7790704960934818} 12/23/2021 15:51:25 - INFO - codeparrot_training - Step 5112: {'lr': 0.0004879847842823252, 'samples': 2617856, 'steps': 5112, 'batch_loss/train': 0.966109124943614} 12/23/2021 15:51:37 - INFO - codeparrot_training - Step 5113: {'lr': 0.0004879799557622305, 'samples': 2618368, 'steps': 5113, 
'batch_loss/train': 0.901219037361443} 12/23/2021 15:51:47 - INFO - codeparrot_training - Step 5114: {'lr': 0.00048797512629601735, 'samples': 2618880, 'steps': 5114, 'batch_loss/train': 0.8336077006533742} 12/23/2021 15:51:58 - INFO - codeparrot_training - Step 5115: {'lr': 0.000487970295883705, 'samples': 2619392, 'steps': 5115, 'batch_loss/train': 0.8637348031625152} 12/23/2021 15:52:08 - INFO - codeparrot_training - Step 5116: {'lr': 0.00048796546452531256, 'samples': 2619904, 'steps': 5116, 'batch_loss/train': 0.9165717167779803} 12/23/2021 15:52:20 - INFO - codeparrot_training - Step 5117: {'lr': 0.0004879606322208593, 'samples': 2620416, 'steps': 5117, 'batch_loss/train': 0.8781238525407389} 12/23/2021 15:52:31 - INFO - codeparrot_training - Step 5118: {'lr': 0.0004879557989703644, 'samples': 2620928, 'steps': 5118, 'batch_loss/train': 0.8413781682029366} 12/23/2021 15:52:42 - INFO - codeparrot_training - Step 5119: {'lr': 0.00048795096477384714, 'samples': 2621440, 'steps': 5119, 'batch_loss/train': 0.8467745902016759} 12/23/2021 15:52:54 - INFO - codeparrot_training - Step 5120: {'lr': 0.0004879461296313267, 'samples': 2621952, 'steps': 5120, 'batch_loss/train': 0.83587427996099} 12/23/2021 15:53:05 - INFO - codeparrot_training - Step 5121: {'lr': 0.0004879412935428222, 'samples': 2622464, 'steps': 5121, 'batch_loss/train': 0.870086413808167} 12/23/2021 15:53:15 - INFO - codeparrot_training - Step 5122: {'lr': 0.00048793645650835305, 'samples': 2622976, 'steps': 5122, 'batch_loss/train': 0.6617107341298833} 12/23/2021 15:53:28 - INFO - codeparrot_training - Step 5123: {'lr': 0.00048793161852793833, 'samples': 2623488, 'steps': 5123, 'batch_loss/train': 0.9495914459694177} 12/23/2021 15:53:39 - INFO - codeparrot_training - Step 5124: {'lr': 0.0004879267796015974, 'samples': 2624000, 'steps': 5124, 'batch_loss/train': 0.7968164412304759} 12/23/2021 15:53:49 - INFO - codeparrot_training - Step 5125: {'lr': 0.00048792193972934934, 'samples': 2624512, 'steps': 
5125, 'batch_loss/train': 0.8950514532625675} 12/23/2021 15:54:01 - INFO - codeparrot_training - Step 5126: {'lr': 0.0004879170989112135, 'samples': 2625024, 'steps': 5126, 'batch_loss/train': 0.8249944759299979} 12/23/2021 15:54:12 - INFO - codeparrot_training - Step 5127: {'lr': 0.00048791225714720914, 'samples': 2625536, 'steps': 5127, 'batch_loss/train': 0.985559078399092} 12/23/2021 15:54:23 - INFO - codeparrot_training - Step 5128: {'lr': 0.0004879074144373555, 'samples': 2626048, 'steps': 5128, 'batch_loss/train': 0.9424641523510218} 12/23/2021 15:54:33 - INFO - codeparrot_training - Step 5129: {'lr': 0.0004879025707816718, 'samples': 2626560, 'steps': 5129, 'batch_loss/train': 0.9210411766543984} 12/23/2021 15:54:45 - INFO - codeparrot_training - Step 5130: {'lr': 0.0004878977261801773, 'samples': 2627072, 'steps': 5130, 'batch_loss/train': 0.7952145347371697} 12/23/2021 15:54:56 - INFO - codeparrot_training - Step 5131: {'lr': 0.0004878928806328913, 'samples': 2627584, 'steps': 5131, 'batch_loss/train': 0.9276646012440324} 12/23/2021 15:55:06 - INFO - codeparrot_training - Step 5132: {'lr': 0.00048788803413983296, 'samples': 2628096, 'steps': 5132, 'batch_loss/train': 0.8722852931823581} 12/23/2021 15:55:19 - INFO - codeparrot_training - Step 5133: {'lr': 0.00048788318670102166, 'samples': 2628608, 'steps': 5133, 'batch_loss/train': 0.8821050124242902} 12/23/2021 15:55:30 - INFO - codeparrot_training - Step 5134: {'lr': 0.00048787833831647663, 'samples': 2629120, 'steps': 5134, 'batch_loss/train': 0.8455578377470374} 12/23/2021 15:55:40 - INFO - codeparrot_training - Step 5135: {'lr': 0.00048787348898621713, 'samples': 2629632, 'steps': 5135, 'batch_loss/train': 0.8344406075775623} 12/23/2021 15:55:52 - INFO - codeparrot_training - Step 5136: {'lr': 0.0004878686387102625, 'samples': 2630144, 'steps': 5136, 'batch_loss/train': 1.0528412610292435} 12/23/2021 15:56:03 - INFO - codeparrot_training - Step 5137: {'lr': 0.0004878637874886319, 'samples': 2630656, 
'steps': 5137, 'batch_loss/train': 0.5806331451749429} 12/23/2021 15:56:13 - INFO - codeparrot_training - Step 5138: {'lr': 0.0004878589353213447, 'samples': 2631168, 'steps': 5138, 'batch_loss/train': 0.8641628930345178} 12/23/2021 15:56:24 - INFO - codeparrot_training - Step 5139: {'lr': 0.0004878540822084203, 'samples': 2631680, 'steps': 5139, 'batch_loss/train': 1.1372787691652775} 12/23/2021 15:56:38 - INFO - codeparrot_training - Step 5140: {'lr': 0.0004878492281498778, 'samples': 2632192, 'steps': 5140, 'batch_loss/train': 1.3216730738058686} 12/23/2021 15:56:48 - INFO - codeparrot_training - Step 5141: {'lr': 0.0004878443731457366, 'samples': 2632704, 'steps': 5141, 'batch_loss/train': 0.9198079258203506} 12/23/2021 15:56:59 - INFO - codeparrot_training - Step 5142: {'lr': 0.00048783951719601594, 'samples': 2633216, 'steps': 5142, 'batch_loss/train': 0.8868131330236793} 12/23/2021 15:57:11 - INFO - codeparrot_training - Step 5143: {'lr': 0.0004878346603007352, 'samples': 2633728, 'steps': 5143, 'batch_loss/train': 0.7653667835984379} 12/23/2021 15:57:22 - INFO - codeparrot_training - Step 5144: {'lr': 0.00048782980245991363, 'samples': 2634240, 'steps': 5144, 'batch_loss/train': 0.728238670155406} 12/23/2021 15:57:32 - INFO - codeparrot_training - Step 5145: {'lr': 0.00048782494367357053, 'samples': 2634752, 'steps': 5145, 'batch_loss/train': 1.771315807942301} 12/23/2021 15:57:45 - INFO - codeparrot_training - Step 5146: {'lr': 0.0004878200839417254, 'samples': 2635264, 'steps': 5146, 'batch_loss/train': 0.9261872041970491} 12/23/2021 15:57:55 - INFO - codeparrot_training - Step 5147: {'lr': 0.00048781522326439726, 'samples': 2635776, 'steps': 5147, 'batch_loss/train': 0.6632390524027869} 12/23/2021 15:58:06 - INFO - codeparrot_training - Step 5148: {'lr': 0.00048781036164160564, 'samples': 2636288, 'steps': 5148, 'batch_loss/train': 0.8320887368172407} 12/23/2021 15:58:18 - INFO - codeparrot_training - Step 5149: {'lr': 0.00048780549907336977, 'samples': 
2636800, 'steps': 5149, 'batch_loss/train': 0.9134296369738877} 12/23/2021 15:58:29 - INFO - codeparrot_training - Step 5150: {'lr': 0.0004878006355597091, 'samples': 2637312, 'steps': 5150, 'batch_loss/train': 0.8401218093931675} 12/23/2021 15:58:40 - INFO - codeparrot_training - Step 5151: {'lr': 0.0004877957711006429, 'samples': 2637824, 'steps': 5151, 'batch_loss/train': 0.834359779022634} 12/23/2021 15:58:50 - INFO - codeparrot_training - Step 5152: {'lr': 0.00048779090569619045, 'samples': 2638336, 'steps': 5152, 'batch_loss/train': 0.9028393849730492} 12/23/2021 15:59:02 - INFO - codeparrot_training - Step 5153: {'lr': 0.00048778603934637125, 'samples': 2638848, 'steps': 5153, 'batch_loss/train': 0.8520970428362489} 12/23/2021 15:59:13 - INFO - codeparrot_training - Step 5154: {'lr': 0.0004877811720512044, 'samples': 2639360, 'steps': 5154, 'batch_loss/train': 0.8374455678276718} 12/23/2021 15:59:24 - INFO - codeparrot_training - Step 5155: {'lr': 0.0004877763038107095, 'samples': 2639872, 'steps': 5155, 'batch_loss/train': 0.7676745208445936} 12/23/2021 15:59:36 - INFO - codeparrot_training - Step 5156: {'lr': 0.0004877714346249057, 'samples': 2640384, 'steps': 5156, 'batch_loss/train': 0.902866561897099} 12/23/2021 15:59:46 - INFO - codeparrot_training - Step 5157: {'lr': 0.0004877665644938125, 'samples': 2640896, 'steps': 5157, 'batch_loss/train': 0.6161817027023062} 12/23/2021 15:59:57 - INFO - codeparrot_training - Step 5158: {'lr': 0.00048776169341744924, 'samples': 2641408, 'steps': 5158, 'batch_loss/train': 0.8791153328493237} 12/23/2021 16:00:10 - INFO - codeparrot_training - Step 5159: {'lr': 0.0004877568213958352, 'samples': 2641920, 'steps': 5159, 'batch_loss/train': 0.9370149979367852} 12/23/2021 16:00:20 - INFO - codeparrot_training - Step 5160: {'lr': 0.00048775194842898983, 'samples': 2642432, 'steps': 5160, 'batch_loss/train': 0.9828830366022885} 12/23/2021 16:00:31 - INFO - codeparrot_training - Step 5161: {'lr': 0.00048774707451693257, 
'samples': 2642944, 'steps': 5161, 'batch_loss/train': 1.1311150202527642} 12/23/2021 16:00:43 - INFO - codeparrot_training - Step 5162: {'lr': 0.0004877421996596826, 'samples': 2643456, 'steps': 5162, 'batch_loss/train': 0.8588080340996385} 12/23/2021 16:00:54 - INFO - codeparrot_training - Step 5163: {'lr': 0.0004877373238572595, 'samples': 2643968, 'steps': 5163, 'batch_loss/train': 0.938912445679307} 12/23/2021 16:01:05 - INFO - codeparrot_training - Step 5164: {'lr': 0.0004877324471096825, 'samples': 2644480, 'steps': 5164, 'batch_loss/train': 0.7706322496524081} 12/23/2021 16:01:15 - INFO - codeparrot_training - Step 5165: {'lr': 0.0004877275694169711, 'samples': 2644992, 'steps': 5165, 'batch_loss/train': 0.8351135021075606} 12/23/2021 16:01:27 - INFO - codeparrot_training - Step 5166: {'lr': 0.00048772269077914466, 'samples': 2645504, 'steps': 5166, 'batch_loss/train': 1.6811448927037418} 12/23/2021 16:01:38 - INFO - codeparrot_training - Step 5167: {'lr': 0.0004877178111962225, 'samples': 2646016, 'steps': 5167, 'batch_loss/train': 0.9077048264443874} 12/23/2021 16:01:48 - INFO - codeparrot_training - Step 5168: {'lr': 0.0004877129306682241, 'samples': 2646528, 'steps': 5168, 'batch_loss/train': 0.984713482670486} 12/23/2021 16:02:01 - INFO - codeparrot_training - Step 5169: {'lr': 0.0004877080491951689, 'samples': 2647040, 'steps': 5169, 'batch_loss/train': 0.7918211440555751} 12/23/2021 16:02:12 - INFO - codeparrot_training - Step 5170: {'lr': 0.0004877031667770762, 'samples': 2647552, 'steps': 5170, 'batch_loss/train': 0.8449555281549692} 12/23/2021 16:02:22 - INFO - codeparrot_training - Step 5171: {'lr': 0.0004876982834139654, 'samples': 2648064, 'steps': 5171, 'batch_loss/train': 0.7173471720889211} 12/23/2021 16:02:34 - INFO - codeparrot_training - Step 5172: {'lr': 0.0004876933991058561, 'samples': 2648576, 'steps': 5172, 'batch_loss/train': 0.9032726273871958} 12/23/2021 16:02:45 - INFO - codeparrot_training - Step 5173: {'lr': 
0.00048768851385276753, 'samples': 2649088, 'steps': 5173, 'batch_loss/train': 0.8930482282303274} 12/23/2021 16:02:56 - INFO - codeparrot_training - Step 5174: {'lr': 0.0004876836276547192, 'samples': 2649600, 'steps': 5174, 'batch_loss/train': 0.8985779462382197} 12/23/2021 16:03:06 - INFO - codeparrot_training - Step 5175: {'lr': 0.0004876787405117305, 'samples': 2650112, 'steps': 5175, 'batch_loss/train': 0.8352443668991327} 12/23/2021 16:03:18 - INFO - codeparrot_training - Step 5176: {'lr': 0.0004876738524238209, 'samples': 2650624, 'steps': 5176, 'batch_loss/train': 0.8773021437227726} 12/23/2021 16:03:29 - INFO - codeparrot_training - Step 5177: {'lr': 0.00048766896339100973, 'samples': 2651136, 'steps': 5177, 'batch_loss/train': 0.9391407128423452} 12/23/2021 16:03:39 - INFO - codeparrot_training - Step 5178: {'lr': 0.0004876640734133165, 'samples': 2651648, 'steps': 5178, 'batch_loss/train': 0.819476536475122} 12/23/2021 16:03:52 - INFO - codeparrot_training - Step 5179: {'lr': 0.0004876591824907608, 'samples': 2652160, 'steps': 5179, 'batch_loss/train': 0.9158071130514145} 12/23/2021 16:04:03 - INFO - codeparrot_training - Step 5180: {'lr': 0.00048765429062336176, 'samples': 2652672, 'steps': 5180, 'batch_loss/train': 0.8574013109318912} 12/23/2021 16:04:14 - INFO - codeparrot_training - Step 5181: {'lr': 0.0004876493978111391, 'samples': 2653184, 'steps': 5181, 'batch_loss/train': 0.883578764507547} 12/23/2021 16:04:26 - INFO - codeparrot_training - Step 5182: {'lr': 0.0004876445040541121, 'samples': 2653696, 'steps': 5182, 'batch_loss/train': 0.826340064406395} 12/23/2021 16:04:36 - INFO - codeparrot_training - Step 5183: {'lr': 0.00048763960935230033, 'samples': 2654208, 'steps': 5183, 'batch_loss/train': 0.9827705025672913} 12/23/2021 16:04:47 - INFO - codeparrot_training - Step 5184: {'lr': 0.0004876347137057232, 'samples': 2654720, 'steps': 5184, 'batch_loss/train': 0.7474596172105521} 12/23/2021 16:04:59 - INFO - codeparrot_training - Step 5185: 
{'lr': 0.0004876298171144001, 'samples': 2655232, 'steps': 5185, 'batch_loss/train': 0.8981895484030247} 12/23/2021 16:05:10 - INFO - codeparrot_training - Step 5186: {'lr': 0.0004876249195783507, 'samples': 2655744, 'steps': 5186, 'batch_loss/train': 0.8266614563763142} 12/23/2021 16:05:20 - INFO - codeparrot_training - Step 5187: {'lr': 0.0004876200210975943, 'samples': 2656256, 'steps': 5187, 'batch_loss/train': 0.5903100800933316} 12/23/2021 16:05:31 - INFO - codeparrot_training - Step 5188: {'lr': 0.0004876151216721504, 'samples': 2656768, 'steps': 5188, 'batch_loss/train': 0.8587072733789682} 12/23/2021 16:05:44 - INFO - codeparrot_training - Step 5189: {'lr': 0.00048761022130203846, 'samples': 2657280, 'steps': 5189, 'batch_loss/train': 0.8489085100591183} 12/23/2021 16:05:54 - INFO - codeparrot_training - Step 5190: {'lr': 0.00048760531998727805, 'samples': 2657792, 'steps': 5190, 'batch_loss/train': 0.8367858109995723} 12/23/2021 16:06:05 - INFO - codeparrot_training - Step 5191: {'lr': 0.00048760041772788856, 'samples': 2658304, 'steps': 5191, 'batch_loss/train': 0.9495682287961245} 12/23/2021 16:06:17 - INFO - codeparrot_training - Step 5192: {'lr': 0.00048759551452388954, 'samples': 2658816, 'steps': 5192, 'batch_loss/train': 0.869043055921793} 12/23/2021 16:06:27 - INFO - codeparrot_training - Step 5193: {'lr': 0.0004875906103753005, 'samples': 2659328, 'steps': 5193, 'batch_loss/train': 0.8486891235224903} 12/23/2021 16:06:38 - INFO - codeparrot_training - Step 5194: {'lr': 0.00048758570528214086, 'samples': 2659840, 'steps': 5194, 'batch_loss/train': 0.8906372943893075} 12/23/2021 16:06:51 - INFO - codeparrot_training - Step 5195: {'lr': 0.00048758079924443023, 'samples': 2660352, 'steps': 5195, 'batch_loss/train': 0.888426311314106} 12/23/2021 16:07:01 - INFO - codeparrot_training - Step 5196: {'lr': 0.000487575892262188, 'samples': 2660864, 'steps': 5196, 'batch_loss/train': 0.8461703220382333} 12/23/2021 16:07:12 - INFO - codeparrot_training - 
Step 5197: {'lr': 0.0004875709843354337, 'samples': 2661376, 'steps': 5197, 'batch_loss/train': 0.8402104526758194} 12/23/2021 16:07:23 - INFO - codeparrot_training - Step 5198: {'lr': 0.00048756607546418686, 'samples': 2661888, 'steps': 5198, 'batch_loss/train': 0.8352627651765943} 12/23/2021 16:07:35 - INFO - codeparrot_training - Step 5199: {'lr': 0.0004875611656484671, 'samples': 2662400, 'steps': 5199, 'batch_loss/train': 0.8864085986278951} 12/23/2021 16:07:45 - INFO - codeparrot_training - Step 5200: {'lr': 0.0004875562548882938, 'samples': 2662912, 'steps': 5200, 'batch_loss/train': 0.7915018824860454} 12/23/2021 16:07:56 - INFO - codeparrot_training - Step 5201: {'lr': 0.0004875513431836865, 'samples': 2663424, 'steps': 5201, 'batch_loss/train': 0.8539647553116083} 12/23/2021 16:08:08 - INFO - codeparrot_training - Step 5202: {'lr': 0.0004875464305346647, 'samples': 2663936, 'steps': 5202, 'batch_loss/train': 0.8072041100822389} 12/23/2021 16:08:18 - INFO - codeparrot_training - Step 5203: {'lr': 0.00048754151694124814, 'samples': 2664448, 'steps': 5203, 'batch_loss/train': 0.8064050707034767} 12/23/2021 16:08:29 - INFO - codeparrot_training - Step 5204: {'lr': 0.0004875366024034561, 'samples': 2664960, 'steps': 5204, 'batch_loss/train': 0.8362100617960095} 12/23/2021 16:08:41 - INFO - codeparrot_training - Step 5205: {'lr': 0.0004875316869213082, 'samples': 2665472, 'steps': 5205, 'batch_loss/train': 0.8530515935271978} 12/23/2021 16:08:52 - INFO - codeparrot_training - Step 5206: {'lr': 0.0004875267704948241, 'samples': 2665984, 'steps': 5206, 'batch_loss/train': 0.9809999521821737} 12/23/2021 16:09:02 - INFO - codeparrot_training - Step 5207: {'lr': 0.0004875218531240232, 'samples': 2666496, 'steps': 5207, 'batch_loss/train': 0.7495585037395358} 12/23/2021 16:09:15 - INFO - codeparrot_training - Step 5208: {'lr': 0.00048751693480892505, 'samples': 2667008, 'steps': 5208, 'batch_loss/train': 0.823962717782706} 12/23/2021 16:09:26 - INFO - 
codeparrot_training - Step 5209: {'lr': 0.0004875120155495493, 'samples': 2667520, 'steps': 5209, 'batch_loss/train': 0.8770559905096889} 12/23/2021 16:09:36 - INFO - codeparrot_training - Step 5210: {'lr': 0.00048750709534591544, 'samples': 2668032, 'steps': 5210, 'batch_loss/train': 0.9038038300350308} 12/23/2021 16:09:47 - INFO - codeparrot_training - Step 5211: {'lr': 0.00048750217419804304, 'samples': 2668544, 'steps': 5211, 'batch_loss/train': 0.8750148843973875} 12/23/2021 16:09:59 - INFO - codeparrot_training - Step 5212: {'lr': 0.0004874972521059517, 'samples': 2669056, 'steps': 5212, 'batch_loss/train': 0.8449505330063403} 12/23/2021 16:10:10 - INFO - codeparrot_training - Step 5213: {'lr': 0.00048749232906966097, 'samples': 2669568, 'steps': 5213, 'batch_loss/train': 0.8029340831562877} 12/23/2021 16:10:20 - INFO - codeparrot_training - Step 5214: {'lr': 0.00048748740508919037, 'samples': 2670080, 'steps': 5214, 'batch_loss/train': 0.8714582156389952} 12/23/2021 16:10:32 - INFO - codeparrot_training - Step 5215: {'lr': 0.00048748248016455955, 'samples': 2670592, 'steps': 5215, 'batch_loss/train': 0.7042688215151429} 12/23/2021 16:10:43 - INFO - codeparrot_training - Step 5216: {'lr': 0.000487477554295788, 'samples': 2671104, 'steps': 5216, 'batch_loss/train': 0.9037635866552591} 12/23/2021 16:10:53 - INFO - codeparrot_training - Step 5217: {'lr': 0.0004874726274828954, 'samples': 2671616, 'steps': 5217, 'batch_loss/train': 0.8789307737024501} 12/23/2021 16:11:06 - INFO - codeparrot_training - Step 5218: {'lr': 0.0004874676997259013, 'samples': 2672128, 'steps': 5218, 'batch_loss/train': 0.8022622941061854} 12/23/2021 16:11:17 - INFO - codeparrot_training - Step 5219: {'lr': 0.00048746277102482534, 'samples': 2672640, 'steps': 5219, 'batch_loss/train': 0.9200519239529967} 12/23/2021 16:11:27 - INFO - codeparrot_training - Step 5220: {'lr': 0.000487457841379687, 'samples': 2673152, 'steps': 5220, 'batch_loss/train': 0.8257863046601415} 12/23/2021 16:11:39 
- INFO - codeparrot_training - Step 5221: {'lr': 0.00048745291079050604, 'samples': 2673664, 'steps': 5221, 'batch_loss/train': 1.340913669206202} 12/23/2021 16:11:50 - INFO - codeparrot_training - Step 5222: {'lr': 0.00048744797925730183, 'samples': 2674176, 'steps': 5222, 'batch_loss/train': 0.8200142844580114} 12/23/2021 16:12:01 - INFO - codeparrot_training - Step 5223: {'lr': 0.00048744304678009423, 'samples': 2674688, 'steps': 5223, 'batch_loss/train': 0.8258342994377017} 12/23/2021 16:12:11 - INFO - codeparrot_training - Step 5224: {'lr': 0.00048743811335890265, 'samples': 2675200, 'steps': 5224, 'batch_loss/train': 0.9213867979124188} 12/23/2021 16:12:24 - INFO - codeparrot_training - Step 5225: {'lr': 0.0004874331789937468, 'samples': 2675712, 'steps': 5225, 'batch_loss/train': 1.6109188366681337} 12/23/2021 16:12:35 - INFO - codeparrot_training - Step 5226: {'lr': 0.0004874282436846463, 'samples': 2676224, 'steps': 5226, 'batch_loss/train': 0.731355095282197} 12/23/2021 16:12:45 - INFO - codeparrot_training - Step 5227: {'lr': 0.0004874233074316207, 'samples': 2676736, 'steps': 5227, 'batch_loss/train': 0.7569799171760678} 12/23/2021 16:12:57 - INFO - codeparrot_training - Step 5228: {'lr': 0.00048741837023468974, 'samples': 2677248, 'steps': 5228, 'batch_loss/train': 0.8711100616492331} 12/23/2021 16:13:08 - INFO - codeparrot_training - Step 5229: {'lr': 0.00048741343209387296, 'samples': 2677760, 'steps': 5229, 'batch_loss/train': 0.8613164818380028} 12/23/2021 16:13:18 - INFO - codeparrot_training - Step 5230: {'lr': 0.00048740849300919, 'samples': 2678272, 'steps': 5230, 'batch_loss/train': 0.8373868661001325} 12/23/2021 16:13:31 - INFO - codeparrot_training - Step 5231: {'lr': 0.0004874035529806605, 'samples': 2678784, 'steps': 5231, 'batch_loss/train': 0.9138460620306432} 12/23/2021 16:13:41 - INFO - codeparrot_training - Step 5232: {'lr': 0.0004873986120083042, 'samples': 2679296, 'steps': 5232, 'batch_loss/train': 0.8925014492124319} 12/23/2021 
16:13:52 - INFO - codeparrot_training - Step 5233: {'lr': 0.00048739367009214066, 'samples': 2679808, 'steps': 5233, 'batch_loss/train': 0.769422170240432} 12/23/2021 16:14:02 - INFO - codeparrot_training - Step 5234: {'lr': 0.0004873887272321894, 'samples': 2680320, 'steps': 5234, 'batch_loss/train': 0.7988339466974139} 12/23/2021 16:14:15 - INFO - codeparrot_training - Step 5235: {'lr': 0.00048738378342847035, 'samples': 2680832, 'steps': 5235, 'batch_loss/train': 0.8136687772348523} 12/23/2021 16:14:25 - INFO - codeparrot_training - Step 5236: {'lr': 0.00048737883868100286, 'samples': 2681344, 'steps': 5236, 'batch_loss/train': 0.8610944785177708} 12/23/2021 16:14:36 - INFO - codeparrot_training - Step 5237: {'lr': 0.0004873738929898068, 'samples': 2681856, 'steps': 5237, 'batch_loss/train': 0.8753723716363311} 12/23/2021 16:14:49 - INFO - codeparrot_training - Step 5238: {'lr': 0.0004873689463549018, 'samples': 2682368, 'steps': 5238, 'batch_loss/train': 0.830129380337894} 12/23/2021 16:14:59 - INFO - codeparrot_training - Step 5239: {'lr': 0.0004873639987763074, 'samples': 2682880, 'steps': 5239, 'batch_loss/train': 0.8405273575335741} 12/23/2021 16:15:10 - INFO - codeparrot_training - Step 5240: {'lr': 0.0004873590502540435, 'samples': 2683392, 'steps': 5240, 'batch_loss/train': 0.7600652379915118} 12/23/2021 16:15:22 - INFO - codeparrot_training - Step 5241: {'lr': 0.0004873541007881296, 'samples': 2683904, 'steps': 5241, 'batch_loss/train': 0.7708083838224411} 12/23/2021 16:15:33 - INFO - codeparrot_training - Step 5242: {'lr': 0.0004873491503785854, 'samples': 2684416, 'steps': 5242, 'batch_loss/train': 0.8405386693775654} 12/23/2021 16:15:43 - INFO - codeparrot_training - Step 5243: {'lr': 0.0004873441990254306, 'samples': 2684928, 'steps': 5243, 'batch_loss/train': 0.7605385719798505} 12/23/2021 16:15:54 - INFO - codeparrot_training - Step 5244: {'lr': 0.0004873392467286848, 'samples': 2685440, 'steps': 5244, 'batch_loss/train': 0.8143158247694373} 
12/23/2021 16:16:06 - INFO - codeparrot_training - Step 5245: {'lr': 0.0004873342934883679, 'samples': 2685952, 'steps': 5245, 'batch_loss/train': 0.8865085495635867} 12/23/2021 16:16:17 - INFO - codeparrot_training - Step 5246: {'lr': 0.0004873293393044994, 'samples': 2686464, 'steps': 5246, 'batch_loss/train': 0.9576758984476328} 12/23/2021 16:16:27 - INFO - codeparrot_training - Step 5247: {'lr': 0.000487324384177099, 'samples': 2686976, 'steps': 5247, 'batch_loss/train': 0.8006468894891441} 12/23/2021 16:16:40 - INFO - codeparrot_training - Step 5248: {'lr': 0.00048731942810618657, 'samples': 2687488, 'steps': 5248, 'batch_loss/train': 0.913567965850234} 12/23/2021 16:16:51 - INFO - codeparrot_training - Step 5249: {'lr': 0.00048731447109178166, 'samples': 2688000, 'steps': 5249, 'batch_loss/train': 0.9248799169436097} 12/23/2021 16:17:01 - INFO - codeparrot_training - Step 5250: {'lr': 0.000487309513133904, 'samples': 2688512, 'steps': 5250, 'batch_loss/train': 0.8359837067546323} 12/23/2021 16:17:13 - INFO - codeparrot_training - Step 5251: {'lr': 0.00048730455423257334, 'samples': 2689024, 'steps': 5251, 'batch_loss/train': 0.9543391084298491} 12/23/2021 16:17:24 - INFO - codeparrot_training - Step 5252: {'lr': 0.00048729959438780933, 'samples': 2689536, 'steps': 5252, 'batch_loss/train': 0.6332688733236864} 12/23/2021 16:17:34 - INFO - codeparrot_training - Step 5253: {'lr': 0.00048729463359963184, 'samples': 2690048, 'steps': 5253, 'batch_loss/train': 0.8698622076772153} 12/23/2021 16:17:47 - INFO - codeparrot_training - Step 5254: {'lr': 0.0004872896718680604, 'samples': 2690560, 'steps': 5254, 'batch_loss/train': 0.8987905085086823} 12/23/2021 16:17:57 - INFO - codeparrot_training - Step 5255: {'lr': 0.00048728470919311487, 'samples': 2691072, 'steps': 5255, 'batch_loss/train': 0.7815705854445696} 12/23/2021 16:18:08 - INFO - codeparrot_training - Step 5256: {'lr': 0.0004872797455748149, 'samples': 2691584, 'steps': 5256, 'batch_loss/train': 
0.9251176929101348} 12/23/2021 16:18:18 - INFO - codeparrot_training - Step 5257: {'lr': 0.00048727478101318023, 'samples': 2692096, 'steps': 5257, 'batch_loss/train': 0.8720349576324224} 12/23/2021 16:18:32 - INFO - codeparrot_training - Step 5258: {'lr': 0.0004872698155082307, 'samples': 2692608, 'steps': 5258, 'batch_loss/train': 0.8240997971734032} 12/23/2021 16:18:43 - INFO - codeparrot_training - Step 5259: {'lr': 0.0004872648490599859, 'samples': 2693120, 'steps': 5259, 'batch_loss/train': 0.8457135390490294} 12/23/2021 16:18:53 - INFO - codeparrot_training - Step 5260: {'lr': 0.0004872598816684657, 'samples': 2693632, 'steps': 5260, 'batch_loss/train': 0.7920829213690013} 12/23/2021 16:19:06 - INFO - codeparrot_training - Step 5261: {'lr': 0.00048725491333368977, 'samples': 2694144, 'steps': 5261, 'batch_loss/train': 0.9129653256386518} 12/23/2021 16:19:16 - INFO - codeparrot_training - Step 5262: {'lr': 0.00048724994405567785, 'samples': 2694656, 'steps': 5262, 'batch_loss/train': 0.7395861707627773} 12/23/2021 16:19:27 - INFO - codeparrot_training - Step 5263: {'lr': 0.0004872449738344498, 'samples': 2695168, 'steps': 5263, 'batch_loss/train': 0.8994851233437657} 12/23/2021 16:19:39 - INFO - codeparrot_training - Step 5264: {'lr': 0.0004872400026700253, 'samples': 2695680, 'steps': 5264, 'batch_loss/train': 0.8946950063109398} 12/23/2021 16:19:49 - INFO - codeparrot_training - Step 5265: {'lr': 0.0004872350305624241, 'samples': 2696192, 'steps': 5265, 'batch_loss/train': 0.9522080784663558} 12/23/2021 16:20:00 - INFO - codeparrot_training - Step 5266: {'lr': 0.000487230057511666, 'samples': 2696704, 'steps': 5266, 'batch_loss/train': 0.8744368813931942} 12/23/2021 16:20:11 - INFO - codeparrot_training - Step 5267: {'lr': 0.00048722508351777075, 'samples': 2697216, 'steps': 5267, 'batch_loss/train': 0.8643522392958403} 12/23/2021 16:20:25 - INFO - codeparrot_training - Step 5268: {'lr': 0.00048722010858075817, 'samples': 2697728, 'steps': 5268, 
'batch_loss/train': 1.0017239544540644} 12/23/2021 16:20:35 - INFO - codeparrot_training - Step 5269: {'lr': 0.000487215132700648, 'samples': 2698240, 'steps': 5269, 'batch_loss/train': 0.788708905223757} 12/23/2021 16:20:46 - INFO - codeparrot_training - Step 5270: {'lr': 0.00048721015587746005, 'samples': 2698752, 'steps': 5270, 'batch_loss/train': 0.894611913478002} 12/23/2021 16:20:58 - INFO - codeparrot_training - Step 5271: {'lr': 0.0004872051781112141, 'samples': 2699264, 'steps': 5271, 'batch_loss/train': 0.7946038041263819} 12/23/2021 16:21:08 - INFO - codeparrot_training - Step 5272: {'lr': 0.00048720019940192983, 'samples': 2699776, 'steps': 5272, 'batch_loss/train': 0.9595844657160342} 12/23/2021 16:21:19 - INFO - codeparrot_training - Step 5273: {'lr': 0.00048719521974962724, 'samples': 2700288, 'steps': 5273, 'batch_loss/train': 0.6735741018783301} 12/23/2021 16:21:31 - INFO - codeparrot_training - Step 5274: {'lr': 0.000487190239154326, 'samples': 2700800, 'steps': 5274, 'batch_loss/train': 1.0999576188623905} 12/23/2021 16:21:42 - INFO - codeparrot_training - Step 5275: {'lr': 0.00048718525761604585, 'samples': 2701312, 'steps': 5275, 'batch_loss/train': 0.9683049526065588} 12/23/2021 16:21:53 - INFO - codeparrot_training - Step 5276: {'lr': 0.0004871802751348068, 'samples': 2701824, 'steps': 5276, 'batch_loss/train': 0.8910955227911472} 12/23/2021 16:22:03 - INFO - codeparrot_training - Step 5277: {'lr': 0.0004871752917106285, 'samples': 2702336, 'steps': 5277, 'batch_loss/train': 1.1133567597717047} 12/23/2021 16:22:16 - INFO - codeparrot_training - Step 5278: {'lr': 0.00048717030734353066, 'samples': 2702848, 'steps': 5278, 'batch_loss/train': 0.8769415374845266} 12/23/2021 16:22:27 - INFO - codeparrot_training - Step 5279: {'lr': 0.0004871653220335334, 'samples': 2703360, 'steps': 5279, 'batch_loss/train': 0.9150624638423324} 12/23/2021 16:22:38 - INFO - codeparrot_training - Step 5280: {'lr': 0.0004871603357806562, 'samples': 2703872, 'steps': 
5280, 'batch_loss/train': 0.919238462112844} 12/23/2021 16:22:50 - INFO - codeparrot_training - Step 5281: {'lr': 0.00048715534858491914, 'samples': 2704384, 'steps': 5281, 'batch_loss/train': 0.898308320902288} 12/23/2021 16:23:01 - INFO - codeparrot_training - Step 5282: {'lr': 0.00048715036044634204, 'samples': 2704896, 'steps': 5282, 'batch_loss/train': 0.7492508511058986} 12/23/2021 16:23:11 - INFO - codeparrot_training - Step 5283: {'lr': 0.00048714537136494457, 'samples': 2705408, 'steps': 5283, 'batch_loss/train': 0.5116488956846297} 12/23/2021 16:23:22 - INFO - codeparrot_training - Step 5284: {'lr': 0.0004871403813407467, 'samples': 2705920, 'steps': 5284, 'batch_loss/train': 0.7132409383775666} 12/23/2021 16:23:35 - INFO - codeparrot_training - Step 5285: {'lr': 0.0004871353903737681, 'samples': 2706432, 'steps': 5285, 'batch_loss/train': 0.9177498687058687} 12/23/2021 16:23:46 - INFO - codeparrot_training - Step 5286: {'lr': 0.00048713039846402887, 'samples': 2706944, 'steps': 5286, 'batch_loss/train': 0.8387041185051203} 12/23/2021 16:23:57 - INFO - codeparrot_training - Step 5287: {'lr': 0.0004871254056115486, 'samples': 2707456, 'steps': 5287, 'batch_loss/train': 0.923771602101624} 12/23/2021 16:24:09 - INFO - codeparrot_training - Step 5288: {'lr': 0.00048712041181634726, 'samples': 2707968, 'steps': 5288, 'batch_loss/train': 0.758305405266583} 12/23/2021 16:24:19 - INFO - codeparrot_training - Step 5289: {'lr': 0.00048711541707844474, 'samples': 2708480, 'steps': 5289, 'batch_loss/train': 0.8637887629447505} 12/23/2021 16:24:30 - INFO - codeparrot_training - Step 5290: {'lr': 0.0004871104213978609, 'samples': 2708992, 'steps': 5290, 'batch_loss/train': 0.8502021729946136} 12/23/2021 16:24:42 - INFO - codeparrot_training - Step 5291: {'lr': 0.00048710542477461543, 'samples': 2709504, 'steps': 5291, 'batch_loss/train': 0.7943046921864152} 12/23/2021 16:24:52 - INFO - codeparrot_training - Step 5292: {'lr': 0.0004871004272087284, 'samples': 2710016, 
'steps': 5292, 'batch_loss/train': 0.8912041252478957} 12/23/2021 16:25:03 - INFO - codeparrot_training - Step 5293: {'lr': 0.00048709542870021964, 'samples': 2710528, 'steps': 5293, 'batch_loss/train': 0.7331510013900697} 12/23/2021 16:25:16 - INFO - codeparrot_training - Step 5294: {'lr': 0.00048709042924910883, 'samples': 2711040, 'steps': 5294, 'batch_loss/train': 0.8696537343785167} 12/23/2021 16:25:27 - INFO - codeparrot_training - Step 5295: {'lr': 0.00048708542885541613, 'samples': 2711552, 'steps': 5295, 'batch_loss/train': 1.044104853644967} 12/23/2021 16:25:37 - INFO - codeparrot_training - Step 5296: {'lr': 0.00048708042751916127, 'samples': 2712064, 'steps': 5296, 'batch_loss/train': 0.9864169042557478} 12/23/2021 16:25:48 - INFO - codeparrot_training - Step 5297: {'lr': 0.0004870754252403641, 'samples': 2712576, 'steps': 5297, 'batch_loss/train': 0.8001178693957627} 12/23/2021 16:26:00 - INFO - codeparrot_training - Step 5298: {'lr': 0.0004870704220190446, 'samples': 2713088, 'steps': 5298, 'batch_loss/train': 0.8459446951746941} 12/23/2021 16:26:11 - INFO - codeparrot_training - Step 5299: {'lr': 0.00048706541785522255, 'samples': 2713600, 'steps': 5299, 'batch_loss/train': 0.8911434598267078} 12/23/2021 16:26:21 - INFO - codeparrot_training - Step 5300: {'lr': 0.000487060412748918, 'samples': 2714112, 'steps': 5300, 'batch_loss/train': 0.8026726664975286} 12/23/2021 16:26:34 - INFO - codeparrot_training - Step 5301: {'lr': 0.0004870554067001507, 'samples': 2714624, 'steps': 5301, 'batch_loss/train': 1.2333370428532362} 12/23/2021 16:26:45 - INFO - codeparrot_training - Step 5302: {'lr': 0.0004870503997089406, 'samples': 2715136, 'steps': 5302, 'batch_loss/train': 0.8824695469811559} 12/23/2021 16:26:55 - INFO - codeparrot_training - Step 5303: {'lr': 0.0004870453917753076, 'samples': 2715648, 'steps': 5303, 'batch_loss/train': 0.9643040746450424} 12/23/2021 16:27:07 - INFO - codeparrot_training - Step 5304: {'lr': 0.0004870403828992717, 'samples': 
2716160, 'steps': 5304, 'batch_loss/train': 0.9376390869729221} 12/23/2021 16:27:18 - INFO - codeparrot_training - Step 5305: {'lr': 0.0004870353730808527, 'samples': 2716672, 'steps': 5305, 'batch_loss/train': 0.8772654398344457} 12/23/2021 16:27:28 - INFO - codeparrot_training - Step 5306: {'lr': 0.0004870303623200706, 'samples': 2717184, 'steps': 5306, 'batch_loss/train': 0.8874456090852618} 12/23/2021 16:27:39 - INFO - codeparrot_training - Step 5307: {'lr': 0.0004870253506169452, 'samples': 2717696, 'steps': 5307, 'batch_loss/train': 0.9026808366179466} 12/23/2021 16:27:52 - INFO - codeparrot_training - Step 5308: {'lr': 0.0004870203379714966, 'samples': 2718208, 'steps': 5308, 'batch_loss/train': 0.9405182553455234} 12/23/2021 16:28:03 - INFO - codeparrot_training - Step 5309: {'lr': 0.0004870153243837445, 'samples': 2718720, 'steps': 5309, 'batch_loss/train': 0.8305644830688834} 12/23/2021 16:28:13 - INFO - codeparrot_training - Step 5310: {'lr': 0.0004870103098537091, 'samples': 2719232, 'steps': 5310, 'batch_loss/train': 0.8557098507881165} 12/23/2021 16:28:25 - INFO - codeparrot_training - Step 5311: {'lr': 0.0004870052943814102, 'samples': 2719744, 'steps': 5311, 'batch_loss/train': 0.9230048917233944} 12/23/2021 16:28:36 - INFO - codeparrot_training - Step 5312: {'lr': 0.00048700027796686764, 'samples': 2720256, 'steps': 5312, 'batch_loss/train': 0.8038409533910453} 12/23/2021 16:28:47 - INFO - codeparrot_training - Step 5313: {'lr': 0.0004869952606101015, 'samples': 2720768, 'steps': 5313, 'batch_loss/train': 0.6772401641355827} 12/23/2021 16:28:59 - INFO - codeparrot_training - Step 5314: {'lr': 0.00048699024231113163, 'samples': 2721280, 'steps': 5314, 'batch_loss/train': 0.8588543841615319} 12/23/2021 16:29:09 - INFO - codeparrot_training - Step 5315: {'lr': 0.0004869852230699782, 'samples': 2721792, 'steps': 5315, 'batch_loss/train': 0.8439810443669558} 12/23/2021 16:29:20 - INFO - codeparrot_training - Step 5316: {'lr': 0.0004869802028866609, 
'samples': 2722304, 'steps': 5316, 'batch_loss/train': 0.8028134508058429} 12/23/2021 16:29:33 - INFO - codeparrot_training - Step 5317: {'lr': 0.0004869751817611998, 'samples': 2722816, 'steps': 5317, 'batch_loss/train': 0.8499654736369848} 12/23/2021 16:29:44 - INFO - codeparrot_training - Step 5318: {'lr': 0.0004869701596936148, 'samples': 2723328, 'steps': 5318, 'batch_loss/train': 0.6854632245376706} 12/23/2021 16:29:54 - INFO - codeparrot_training - Step 5319: {'lr': 0.00048696513668392607, 'samples': 2723840, 'steps': 5319, 'batch_loss/train': 0.8734716540202498} 12/23/2021 16:30:05 - INFO - codeparrot_training - Step 5320: {'lr': 0.0004869601127321533, 'samples': 2724352, 'steps': 5320, 'batch_loss/train': 0.8590220487676561} 12/23/2021 16:30:17 - INFO - codeparrot_training - Step 5321: {'lr': 0.00048695508783831666, 'samples': 2724864, 'steps': 5321, 'batch_loss/train': 0.8713660016655922} 12/23/2021 16:30:27 - INFO - codeparrot_training - Step 5322: {'lr': 0.000486950062002436, 'samples': 2725376, 'steps': 5322, 'batch_loss/train': 0.7933802455663681} 12/23/2021 16:30:38 - INFO - codeparrot_training - Step 5323: {'lr': 0.0004869450352245314, 'samples': 2725888, 'steps': 5323, 'batch_loss/train': 0.8674264429137111} 12/23/2021 16:30:51 - INFO - codeparrot_training - Step 5324: {'lr': 0.0004869400075046228, 'samples': 2726400, 'steps': 5324, 'batch_loss/train': 0.8762605171650648} 12/23/2021 16:31:02 - INFO - codeparrot_training - Step 5325: {'lr': 0.0004869349788427302, 'samples': 2726912, 'steps': 5325, 'batch_loss/train': 0.983810419216752} 12/23/2021 16:31:12 - INFO - codeparrot_training - Step 5326: {'lr': 0.0004869299492388736, 'samples': 2727424, 'steps': 5326, 'batch_loss/train': 1.4499065298587084} 12/23/2021 16:31:24 - INFO - codeparrot_training - Step 5327: {'lr': 0.00048692491869307297, 'samples': 2727936, 'steps': 5327, 'batch_loss/train': 0.8516068027820438} 12/23/2021 16:31:35 - INFO - codeparrot_training - Step 5328: {'lr': 
0.0004869198872053483, 'samples': 2728448, 'steps': 5328, 'batch_loss/train': 0.8848169082775712} 12/23/2021 16:31:45 - INFO - codeparrot_training - Step 5329: {'lr': 0.0004869148547757197, 'samples': 2728960, 'steps': 5329, 'batch_loss/train': 0.8424395131878555} 12/23/2021 16:31:56 - INFO - codeparrot_training - Step 5330: {'lr': 0.000486909821404207, 'samples': 2729472, 'steps': 5330, 'batch_loss/train': 0.9383692881092429} 12/23/2021 16:32:08 - INFO - codeparrot_training - Step 5331: {'lr': 0.00048690478709083036, 'samples': 2729984, 'steps': 5331, 'batch_loss/train': 0.8686944637447596} 12/23/2021 16:32:19 - INFO - codeparrot_training - Step 5332: {'lr': 0.0004868997518356098, 'samples': 2730496, 'steps': 5332, 'batch_loss/train': 0.8535953741520643} 12/23/2021 16:32:29 - INFO - codeparrot_training - Step 5333: {'lr': 0.00048689471563856515, 'samples': 2731008, 'steps': 5333, 'batch_loss/train': 0.7828864809125662} 12/23/2021 16:32:42 - INFO - codeparrot_training - Step 5334: {'lr': 0.00048688967849971654, 'samples': 2731520, 'steps': 5334, 'batch_loss/train': 0.8806298803538084} 12/23/2021 16:32:52 - INFO - codeparrot_training - Step 5335: {'lr': 0.00048688464041908407, 'samples': 2732032, 'steps': 5335, 'batch_loss/train': 0.7133628227747977} 12/23/2021 16:33:03 - INFO - codeparrot_training - Step 5336: {'lr': 0.0004868796013966877, 'samples': 2732544, 'steps': 5336, 'batch_loss/train': 0.8815459599718451} 12/23/2021 16:33:16 - INFO - codeparrot_training - Step 5337: {'lr': 0.0004868745614325475, 'samples': 2733056, 'steps': 5337, 'batch_loss/train': 0.8822910580784082} 12/23/2021 16:33:27 - INFO - codeparrot_training - Step 5338: {'lr': 0.0004868695205266835, 'samples': 2733568, 'steps': 5338, 'batch_loss/train': 0.9289957279106602} 12/23/2021 16:33:37 - INFO - codeparrot_training - Step 5339: {'lr': 0.00048686447867911576, 'samples': 2734080, 'steps': 5339, 'batch_loss/train': 0.7540243096882477} 12/23/2021 16:33:49 - INFO - codeparrot_training - Step 
5340: {'lr': 0.00048685943588986426, 'samples': 2734592, 'steps': 5340, 'batch_loss/train': 0.9539390387944877} 12/23/2021 16:34:00 - INFO - codeparrot_training - Step 5341: {'lr': 0.000486854392158949, 'samples': 2735104, 'steps': 5341, 'batch_loss/train': 0.8150996305048466} 12/23/2021 16:34:10 - INFO - codeparrot_training - Step 5342: {'lr': 0.0004868493474863902, 'samples': 2735616, 'steps': 5342, 'batch_loss/train': 0.9241571687161922} 12/23/2021 16:34:21 - INFO - codeparrot_training - Step 5343: {'lr': 0.00048684430187220773, 'samples': 2736128, 'steps': 5343, 'batch_loss/train': 0.870395075995475} 12/23/2021 16:34:33 - INFO - codeparrot_training - Step 5344: {'lr': 0.0004868392553164218, 'samples': 2736640, 'steps': 5344, 'batch_loss/train': 0.9121706038713455} 12/23/2021 16:34:43 - INFO - codeparrot_training - Step 5345: {'lr': 0.0004868342078190524, 'samples': 2737152, 'steps': 5345, 'batch_loss/train': 0.9251508666202426} 12/23/2021 16:34:54 - INFO - codeparrot_training - Step 5346: {'lr': 0.00048682915938011965, 'samples': 2737664, 'steps': 5346, 'batch_loss/train': 0.8784493701532483} 12/23/2021 16:35:07 - INFO - codeparrot_training - Step 5347: {'lr': 0.00048682410999964363, 'samples': 2738176, 'steps': 5347, 'batch_loss/train': 0.8046607514843345} 12/23/2021 16:35:18 - INFO - codeparrot_training - Step 5348: {'lr': 0.00048681905967764427, 'samples': 2738688, 'steps': 5348, 'batch_loss/train': 0.8216858748346567} 12/23/2021 16:35:28 - INFO - codeparrot_training - Step 5349: {'lr': 0.0004868140084141418, 'samples': 2739200, 'steps': 5349, 'batch_loss/train': 0.7445935884024948} 12/23/2021 16:35:40 - INFO - codeparrot_training - Step 5350: {'lr': 0.0004868089562091562, 'samples': 2739712, 'steps': 5350, 'batch_loss/train': 0.8821705225855112} 12/23/2021 16:35:51 - INFO - codeparrot_training - Step 5351: {'lr': 0.00048680390306270774, 'samples': 2740224, 'steps': 5351, 'batch_loss/train': 0.9015653925016522} 12/23/2021 16:36:01 - INFO - 
codeparrot_training - Step 5352: {'lr': 0.00048679884897481627, 'samples': 2740736, 'steps': 5352, 'batch_loss/train': 0.816592394374311} 12/23/2021 16:36:12 - INFO - codeparrot_training - Step 5353: {'lr': 0.00048679379394550197, 'samples': 2741248, 'steps': 5353, 'batch_loss/train': 0.8538243742659688} 12/23/2021 16:36:25 - INFO - codeparrot_training - Step 5354: {'lr': 0.00048678873797478495, 'samples': 2741760, 'steps': 5354, 'batch_loss/train': 0.7908783396705985} 12/23/2021 16:36:36 - INFO - codeparrot_training - Step 5355: {'lr': 0.0004867836810626853, 'samples': 2742272, 'steps': 5355, 'batch_loss/train': 0.9454667512327433} 12/23/2021 16:36:46 - INFO - codeparrot_training - Step 5356: {'lr': 0.00048677862320922326, 'samples': 2742784, 'steps': 5356, 'batch_loss/train': 0.9546548193320632} 12/23/2021 16:36:58 - INFO - codeparrot_training - Step 5357: {'lr': 0.00048677356441441876, 'samples': 2743296, 'steps': 5357, 'batch_loss/train': 0.9479739340022206} 12/23/2021 16:37:09 - INFO - codeparrot_training - Step 5358: {'lr': 0.00048676850467829194, 'samples': 2743808, 'steps': 5358, 'batch_loss/train': 0.8547979686409235} 12/23/2021 16:37:20 - INFO - codeparrot_training - Step 5359: {'lr': 0.00048676344400086296, 'samples': 2744320, 'steps': 5359, 'batch_loss/train': 0.8723630853928626} 12/23/2021 16:37:32 - INFO - codeparrot_training - Step 5360: {'lr': 0.000486758382382152, 'samples': 2744832, 'steps': 5360, 'batch_loss/train': 0.8847291991114616} 12/23/2021 16:37:42 - INFO - codeparrot_training - Step 5361: {'lr': 0.000486753319822179, 'samples': 2745344, 'steps': 5361, 'batch_loss/train': 0.8829949535429478} 12/23/2021 16:37:53 - INFO - codeparrot_training - Step 5362: {'lr': 0.0004867482563209642, 'samples': 2745856, 'steps': 5362, 'batch_loss/train': 0.9154328349977732} 12/23/2021 16:38:03 - INFO - codeparrot_training - Step 5363: {'lr': 0.00048674319187852775, 'samples': 2746368, 'steps': 5363, 'batch_loss/train': 0.9211330087855458} 12/23/2021 16:38:17 
- INFO - codeparrot_training - Step 5364: {'lr': 0.00048673812649488977, 'samples': 2746880, 'steps': 5364, 'batch_loss/train': 0.8377491421997547} 12/23/2021 16:38:28 - INFO - codeparrot_training - Step 5365: {'lr': 0.0004867330601700704, 'samples': 2747392, 'steps': 5365, 'batch_loss/train': 0.910118673928082} 12/23/2021 16:38:38 - INFO - codeparrot_training - Step 5366: {'lr': 0.00048672799290408973, 'samples': 2747904, 'steps': 5366, 'batch_loss/train': 0.8357130810618401} 12/23/2021 16:38:51 - INFO - codeparrot_training - Step 5367: {'lr': 0.000486722924696968, 'samples': 2748416, 'steps': 5367, 'batch_loss/train': 1.381870148703456} 12/23/2021 16:39:01 - INFO - codeparrot_training - Step 5368: {'lr': 0.0004867178555487253, 'samples': 2748928, 'steps': 5368, 'batch_loss/train': 0.8311923081055284} 12/23/2021 16:39:12 - INFO - codeparrot_training - Step 5369: {'lr': 0.00048671278545938177, 'samples': 2749440, 'steps': 5369, 'batch_loss/train': 0.8073940454050899} 12/23/2021 16:39:25 - INFO - codeparrot_training - Step 5370: {'lr': 0.0004867077144289576, 'samples': 2749952, 'steps': 5370, 'batch_loss/train': 0.6093897209502757} 12/23/2021 16:39:35 - INFO - codeparrot_training - Step 5371: {'lr': 0.00048670264245747284, 'samples': 2750464, 'steps': 5371, 'batch_loss/train': 0.8094451702199876} 12/23/2021 16:39:46 - INFO - codeparrot_training - Step 5372: {'lr': 0.0004866975695449478, 'samples': 2750976, 'steps': 5372, 'batch_loss/train': 0.7492595096118748} 12/23/2021 16:39:56 - INFO - codeparrot_training - Step 5373: {'lr': 0.00048669249569140263, 'samples': 2751488, 'steps': 5373, 'batch_loss/train': 0.8816837603226304} 12/23/2021 16:40:08 - INFO - codeparrot_training - Step 5374: {'lr': 0.0004866874208968575, 'samples': 2752000, 'steps': 5374, 'batch_loss/train': 0.8495887336321175} 12/23/2021 16:40:19 - INFO - codeparrot_training - Step 5375: {'lr': 0.00048668234516133246, 'samples': 2752512, 'steps': 5375, 'batch_loss/train': 0.9044250212609768} 12/23/2021 
16:40:29 - INFO - codeparrot_training - Step 5376: {'lr': 0.00048667726848484783, 'samples': 2753024, 'steps': 5376, 'batch_loss/train': 0.874465717934072} 12/23/2021 16:40:43 - INFO - codeparrot_training - Step 5377: {'lr': 0.00048667219086742376, 'samples': 2753536, 'steps': 5377, 'batch_loss/train': 0.8486524485051632} 12/23/2021 16:40:54 - INFO - codeparrot_training - Step 5378: {'lr': 0.0004866671123090805, 'samples': 2754048, 'steps': 5378, 'batch_loss/train': 0.7860046112909913} 12/23/2021 16:41:04 - INFO - codeparrot_training - Step 5379: {'lr': 0.00048666203280983803, 'samples': 2754560, 'steps': 5379, 'batch_loss/train': 0.8727233214303851} 12/23/2021 16:41:16 - INFO - codeparrot_training - Step 5380: {'lr': 0.0004866569523697167, 'samples': 2755072, 'steps': 5380, 'batch_loss/train': 0.8357019661925733} 12/23/2021 16:41:27 - INFO - codeparrot_training - Step 5381: {'lr': 0.0004866518709887367, 'samples': 2755584, 'steps': 5381, 'batch_loss/train': 0.8919831872917712} 12/23/2021 16:41:37 - INFO - codeparrot_training - Step 5382: {'lr': 0.00048664678866691826, 'samples': 2756096, 'steps': 5382, 'batch_loss/train': 0.891495100222528} 12/23/2021 16:41:50 - INFO - codeparrot_training - Step 5383: {'lr': 0.0004866417054042815, 'samples': 2756608, 'steps': 5383, 'batch_loss/train': 0.8914495501667261} 12/23/2021 16:42:01 - INFO - codeparrot_training - Step 5384: {'lr': 0.00048663662120084666, 'samples': 2757120, 'steps': 5384, 'batch_loss/train': 0.8325051832944155} 12/23/2021 16:42:12 - INFO - codeparrot_training - Step 5385: {'lr': 0.00048663153605663404, 'samples': 2757632, 'steps': 5385, 'batch_loss/train': 0.8300542468205094} 12/23/2021 16:42:22 - INFO - codeparrot_training - Step 5386: {'lr': 0.0004866264499716637, 'samples': 2758144, 'steps': 5386, 'batch_loss/train': 0.8701753905043006} 12/23/2021 16:42:34 - INFO - codeparrot_training - Step 5387: {'lr': 0.0004866213629459559, 'samples': 2758656, 'steps': 5387, 'batch_loss/train': 0.8301691859960556} 
12/23/2021 16:42:45 - INFO - codeparrot_training - Step 5388: {'lr': 0.00048661627497953105, 'samples': 2759168, 'steps': 5388, 'batch_loss/train': 0.8837770805694163} 12/23/2021 16:42:55 - INFO - codeparrot_training - Step 5389: {'lr': 0.0004866111860724092, 'samples': 2759680, 'steps': 5389, 'batch_loss/train': 0.8745271973311901} 12/23/2021 16:43:07 - INFO - codeparrot_training - Step 5390: {'lr': 0.00048660609622461047, 'samples': 2760192, 'steps': 5390, 'batch_loss/train': 0.867006110958755} 12/23/2021 16:43:18 - INFO - codeparrot_training - Step 5391: {'lr': 0.00048660100543615535, 'samples': 2760704, 'steps': 5391, 'batch_loss/train': 0.8336112545803189} 12/23/2021 16:43:29 - INFO - codeparrot_training - Step 5392: {'lr': 0.00048659591370706404, 'samples': 2761216, 'steps': 5392, 'batch_loss/train': 0.883025829680264} 12/23/2021 16:43:42 - INFO - codeparrot_training - Step 5393: {'lr': 0.00048659082103735655, 'samples': 2761728, 'steps': 5393, 'batch_loss/train': 0.7802399541251361} 12/23/2021 16:43:53 - INFO - codeparrot_training - Step 5394: {'lr': 0.0004865857274270534, 'samples': 2762240, 'steps': 5394, 'batch_loss/train': 0.8684125831350684} 12/23/2021 16:44:03 - INFO - codeparrot_training - Step 5395: {'lr': 0.0004865806328761747, 'samples': 2762752, 'steps': 5395, 'batch_loss/train': 0.844222305342555} 12/23/2021 16:44:14 - INFO - codeparrot_training - Step 5396: {'lr': 0.00048657553738474076, 'samples': 2763264, 'steps': 5396, 'batch_loss/train': 0.9278980467934161} 12/23/2021 16:44:26 - INFO - codeparrot_training - Step 5397: {'lr': 0.0004865704409527718, 'samples': 2763776, 'steps': 5397, 'batch_loss/train': 0.8894483372569084} 12/23/2021 16:44:37 - INFO - codeparrot_training - Step 5398: {'lr': 0.00048656534358028803, 'samples': 2764288, 'steps': 5398, 'batch_loss/train': 0.8530466109514236} 12/23/2021 16:44:47 - INFO - codeparrot_training - Step 5399: {'lr': 0.00048656024526730986, 'samples': 2764800, 'steps': 5399, 'batch_loss/train': 
0.7062849182402715}
12/23/2021 16:45:01 - INFO - codeparrot_training - Step 5400: {'lr': 0.0004865551460138574, 'samples': 2765312, 'steps': 5400, 'batch_loss/train': 0.839220954105258}
12/23/2021 16:45:11 - INFO - codeparrot_training - Step 5401: {'lr': 0.0004865500458199511, 'samples': 2765824, 'steps': 5401, 'batch_loss/train': 0.8407015525735915}
12/23/2021 16:45:22 - INFO - codeparrot_training - Step 5402: {'lr': 0.00048654494468561104, 'samples': 2766336, 'steps': 5402, 'batch_loss/train': 0.7057989879976958}
12/23/2021 16:45:32 - INFO - codeparrot_training - Step 5403: {'lr': 0.00048653984261085763, 'samples': 2766848, 'steps': 5403, 'batch_loss/train': 0.7950535393320024}
12/23/2021 16:45:44 - INFO - codeparrot_training - Step 5404: {'lr': 0.00048653473959571117, 'samples': 2767360, 'steps': 5404, 'batch_loss/train': 0.8963530790060759}
12/23/2021 16:45:55 - INFO - codeparrot_training - Step 5405: {'lr': 0.0004865296356401918, 'samples': 2767872, 'steps': 5405, 'batch_loss/train': 0.8151414263993502}
12/23/2021 16:46:06 - INFO - codeparrot_training - Step 5406: {'lr': 0.00048652453074432, 'samples': 2768384, 'steps': 5406, 'batch_loss/train': 0.8070954130962491}
12/23/2021 16:46:18 - INFO - codeparrot_training - Step 5407: {'lr': 0.00048651942490811594, 'samples': 2768896, 'steps': 5407, 'batch_loss/train': 0.8110774857923388}
12/23/2021 16:46:28 - INFO - codeparrot_training - Step 5408: {'lr': 0.00048651431813159996, 'samples': 2769408, 'steps': 5408, 'batch_loss/train': 0.9703455241397023}
12/23/2021 16:46:39 - INFO - codeparrot_training - Step 5409: {'lr': 0.00048650921041479235, 'samples': 2769920, 'steps': 5409, 'batch_loss/train': 0.8798171691596508}
12/23/2021 16:46:51 - INFO - codeparrot_training - Step 5410: {'lr': 0.0004865041017577134, 'samples': 2770432, 'steps': 5410, 'batch_loss/train': 0.7630250910297036}
12/23/2021 16:47:02 - INFO - codeparrot_training - Step 5411: {'lr': 0.0004864989921603834, 'samples': 2770944, 'steps': 5411, 'batch_loss/train': 0.812802659580484}
12/23/2021 16:47:12 - INFO - codeparrot_training - Step 5412: {'lr': 0.0004864938816228228, 'samples': 2771456, 'steps': 5412, 'batch_loss/train': 0.6832397856051102}
12/23/2021 16:47:25 - INFO - codeparrot_training - Step 5413: {'lr': 0.00048648877014505177, 'samples': 2771968, 'steps': 5413, 'batch_loss/train': 0.8876504842191935}
12/23/2021 16:47:36 - INFO - codeparrot_training - Step 5414: {'lr': 0.0004864836577270907, 'samples': 2772480, 'steps': 5414, 'batch_loss/train': 0.8502401326550171}
12/23/2021 16:47:47 - INFO - codeparrot_training - Step 5415: {'lr': 0.0004864785443689599, 'samples': 2772992, 'steps': 5415, 'batch_loss/train': 0.9055342599749565}
12/23/2021 16:47:57 - INFO - codeparrot_training - Step 5416: {'lr': 0.00048647343007067966, 'samples': 2773504, 'steps': 5416, 'batch_loss/train': 0.9210329754278064}
12/23/2021 16:48:09 - INFO - codeparrot_training - Step 5417: {'lr': 0.00048646831483227043, 'samples': 2774016, 'steps': 5417, 'batch_loss/train': 0.8579631838947535}
12/23/2021 16:48:20 - INFO - codeparrot_training - Step 5418: {'lr': 0.00048646319865375243, 'samples': 2774528, 'steps': 5418, 'batch_loss/train': 0.8380297459661961}
12/23/2021 16:48:30 - INFO - codeparrot_training - Step 5419: {'lr': 0.00048645808153514604, 'samples': 2775040, 'steps': 5419, 'batch_loss/train': 0.8629515366628766}
12/23/2021 16:48:43 - INFO - codeparrot_training - Step 5420: {'lr': 0.00048645296347647163, 'samples': 2775552, 'steps': 5420, 'batch_loss/train': 0.7522060924675316}
12/23/2021 16:48:53 - INFO - codeparrot_training - Step 5421: {'lr': 0.00048644784447774943, 'samples': 2776064, 'steps': 5421, 'batch_loss/train': 1.2231357749551535}
12/23/2021 16:49:04 - INFO - codeparrot_training - Step 5422: {'lr': 0.00048644272453899994, 'samples': 2776576, 'steps': 5422, 'batch_loss/train': 0.8381822789087892}
12/23/2021 16:49:17 - INFO - codeparrot_training - Step 5423: {'lr': 0.0004864376036602435, 'samples': 2777088, 'steps': 5423, 'batch_loss/train': 0.8051809086464345}
12/23/2021 16:49:28 - INFO - codeparrot_training - Step 5424: {'lr': 0.00048643248184150037, 'samples': 2777600, 'steps': 5424, 'batch_loss/train': 0.9188210293650627}
12/23/2021 16:49:38 - INFO - codeparrot_training - Step 5425: {'lr': 0.0004864273590827909, 'samples': 2778112, 'steps': 5425, 'batch_loss/train': 0.8744485639035702}
12/23/2021 16:49:49 - INFO - codeparrot_training - Step 5426: {'lr': 0.0004864222353841356, 'samples': 2778624, 'steps': 5426, 'batch_loss/train': 0.9106654766947031}
12/23/2021 16:50:01 - INFO - codeparrot_training - Step 5427: {'lr': 0.0004864171107455547, 'samples': 2779136, 'steps': 5427, 'batch_loss/train': 0.958432037383318}
12/23/2021 16:50:12 - INFO - codeparrot_training - Step 5428: {'lr': 0.00048641198516706876, 'samples': 2779648, 'steps': 5428, 'batch_loss/train': 0.8499084641225636}
12/23/2021 16:50:22 - INFO - codeparrot_training - Step 5429: {'lr': 0.0004864068586486979, 'samples': 2780160, 'steps': 5429, 'batch_loss/train': 0.9809244927018881}
12/23/2021 16:50:35 - INFO - codeparrot_training - Step 5430: {'lr': 0.0004864017311904627, 'samples': 2780672, 'steps': 5430, 'batch_loss/train': 0.9190212339162827}
12/23/2021 16:50:45 - INFO - codeparrot_training - Step 5431: {'lr': 0.0004863966027923835, 'samples': 2781184, 'steps': 5431, 'batch_loss/train': 0.879918996244669}
12/23/2021 16:50:56 - INFO - codeparrot_training - Step 5432: {'lr': 0.00048639147345448067, 'samples': 2781696, 'steps': 5432, 'batch_loss/train': 0.8900058441795409}
12/23/2021 16:51:09 - INFO - codeparrot_training - Step 5433: {'lr': 0.00048638634317677454, 'samples': 2782208, 'steps': 5433, 'batch_loss/train': 0.9628579961135983}
12/23/2021 16:51:19 - INFO - codeparrot_training - Step 5434: {'lr': 0.00048638121195928555, 'samples': 2782720, 'steps': 5434, 'batch_loss/train': 0.9142745798453689}
12/23/2021 16:51:30 - INFO - codeparrot_training - Step 5435: {'lr': 0.00048637607980203413, 'samples': 2783232, 'steps': 5435, 'batch_loss/train': 1.0297857634723186}
12/23/2021 16:51:41 - INFO - codeparrot_training - Step 5436: {'lr': 0.0004863709467050407, 'samples': 2783744, 'steps': 5436, 'batch_loss/train': 0.785022190771997}
12/23/2021 16:51:53 - INFO - codeparrot_training - Step 5437: {'lr': 0.0004863658126683257, 'samples': 2784256, 'steps': 5437, 'batch_loss/train': 0.847307794727385}
12/23/2021 16:52:03 - INFO - codeparrot_training - Step 5438: {'lr': 0.0004863606776919094, 'samples': 2784768, 'steps': 5438, 'batch_loss/train': 0.7921666195616126}
12/23/2021 16:52:14 - INFO - codeparrot_training - Step 5439: {'lr': 0.0004863555417758123, 'samples': 2785280, 'steps': 5439, 'batch_loss/train': 0.7908816831186414}
12/23/2021 16:52:26 - INFO - codeparrot_training - Step 5440: {'lr': 0.00048635040492005477, 'samples': 2785792, 'steps': 5440, 'batch_loss/train': 0.7705679815262556}
12/23/2021 16:52:37 - INFO - codeparrot_training - Step 5441: {'lr': 0.0004863452671246573, 'samples': 2786304, 'steps': 5441, 'batch_loss/train': 0.8456028830260038}
12/23/2021 16:52:47 - INFO - codeparrot_training - Step 5442: {'lr': 0.00048634012838964037, 'samples': 2786816, 'steps': 5442, 'batch_loss/train': 0.8654043255373836}
12/23/2021 16:53:01 - INFO - codeparrot_training - Step 5443: {'lr': 0.00048633498871502424, 'samples': 2787328, 'steps': 5443, 'batch_loss/train': 0.7514548627659678}
12/23/2021 16:53:12 - INFO - codeparrot_training - Step 5444: {'lr': 0.0004863298481008295, 'samples': 2787840, 'steps': 5444, 'batch_loss/train': 0.763471735175699}
12/23/2021 16:53:22 - INFO - codeparrot_training - Step 5445: {'lr': 0.0004863247065470764, 'samples': 2788352, 'steps': 5445, 'batch_loss/train': 0.8640974806621671}
12/23/2021 16:53:33 - INFO - codeparrot_training - Step 5446: {'lr': 0.0004863195640537856, 'samples': 2788864, 'steps': 5446, 'batch_loss/train': 0.8433566894382238}
12/23/2021 16:53:45 - INFO - codeparrot_training - Step 5447: {'lr': 0.0004863144206209774, 'samples': 2789376, 'steps': 5447, 'batch_loss/train': 0.7550083799287677}
12/23/2021 16:53:55 - INFO - codeparrot_training - Step 5448: {'lr': 0.00048630927624867225, 'samples': 2789888, 'steps': 5448, 'batch_loss/train': 0.8935807971283793}
12/23/2021 16:54:06 - INFO - codeparrot_training - Step 5449: {'lr': 0.00048630413093689074, 'samples': 2790400, 'steps': 5449, 'batch_loss/train': 0.852018719073385}
12/23/2021 16:54:18 - INFO - codeparrot_training - Step 5450: {'lr': 0.0004862989846856532, 'samples': 2790912, 'steps': 5450, 'batch_loss/train': 0.7957274955697358}
12/23/2021 16:54:29 - INFO - codeparrot_training - Step 5451: {'lr': 0.0004862938374949801, 'samples': 2791424, 'steps': 5451, 'batch_loss/train': 0.9076069789007306}
12/23/2021 16:54:39 - INFO - codeparrot_training - Step 5452: {'lr': 0.00048628868936489197, 'samples': 2791936, 'steps': 5452, 'batch_loss/train': 0.9166996702551842}
12/23/2021 16:54:50 - INFO - codeparrot_training - Step 5453: {'lr': 0.00048628354029540917, 'samples': 2792448, 'steps': 5453, 'batch_loss/train': 0.908805008046329}
12/23/2021 16:55:03 - INFO - codeparrot_training - Step 5454: {'lr': 0.0004862783902865523, 'samples': 2792960, 'steps': 5454, 'batch_loss/train': 0.8838750598952174}
12/23/2021 16:55:14 - INFO - codeparrot_training - Step 5455: {'lr': 0.0004862732393383417, 'samples': 2793472, 'steps': 5455, 'batch_loss/train': 0.8492619628086686}
12/23/2021 16:55:24 - INFO - codeparrot_training - Step 5456: {'lr': 0.0004862680874507979, 'samples': 2793984, 'steps': 5456, 'batch_loss/train': 0.7578277804423124}
12/23/2021 16:55:36 - INFO - codeparrot_training - Step 5457: {'lr': 0.00048626293462394145, 'samples': 2794496, 'steps': 5457, 'batch_loss/train': 0.8438782002776861}
12/23/2021 16:55:47 - INFO - codeparrot_training - Step 5458: {'lr': 0.0004862577808577928, 'samples': 2795008, 'steps': 5458, 'batch_loss/train': 0.8565883748233318}
12/23/2021 16:55:58 - INFO - codeparrot_training - Step 5459: {'lr': 0.0004862526261523723, 'samples': 2795520, 'steps': 5459, 'batch_loss/train': 0.9030020087957382}
12/23/2021 16:56:11 - INFO - codeparrot_training - Step 5460: {'lr': 0.0004862474705077007, 'samples': 2796032, 'steps': 5460, 'batch_loss/train': 0.9739807085134089}
12/23/2021 16:56:21 - INFO - codeparrot_training - Step 5461: {'lr': 0.00048624231392379826, 'samples': 2796544, 'steps': 5461, 'batch_loss/train': 0.9053991818800569}
12/23/2021 16:56:32 - INFO - codeparrot_training - Step 5462: {'lr': 0.0004862371564006857, 'samples': 2797056, 'steps': 5462, 'batch_loss/train': 0.983168913051486}
12/23/2021 16:56:44 - INFO - codeparrot_training - Step 5463: {'lr': 0.0004862319979383833, 'samples': 2797568, 'steps': 5463, 'batch_loss/train': 0.9225147999823093}
12/23/2021 16:56:54 - INFO - codeparrot_training - Step 5464: {'lr': 0.0004862268385369117, 'samples': 2798080, 'steps': 5464, 'batch_loss/train': 0.8815447636879981}
12/23/2021 16:57:05 - INFO - codeparrot_training - Step 5465: {'lr': 0.0004862216781962914, 'samples': 2798592, 'steps': 5465, 'batch_loss/train': 0.7933923387899995}
12/23/2021 16:57:16 - INFO - codeparrot_training - Step 5466: {'lr': 0.0004862165169165429, 'samples': 2799104, 'steps': 5466, 'batch_loss/train': 0.9087414192035794}
12/23/2021 16:57:28 - INFO - codeparrot_training - Step 5467: {'lr': 0.00048621135469768676, 'samples': 2799616, 'steps': 5467, 'batch_loss/train': 0.8818594589829445}
12/23/2021 16:57:39 - INFO - codeparrot_training - Step 5468: {'lr': 0.0004862061915397433, 'samples': 2800128, 'steps': 5468, 'batch_loss/train': 0.9297546176239848}
12/23/2021 16:57:49 - INFO - codeparrot_training - Step 5469: {'lr': 0.00048620102744273343, 'samples': 2800640, 'steps': 5469, 'batch_loss/train': 0.8880697530694306}
12/23/2021 16:58:02 - INFO - codeparrot_training - Step 5470: {'lr': 0.00048619586240667735, 'samples': 2801152, 'steps': 5470, 'batch_loss/train': 0.8764357380568981}
12/23/2021 16:58:13 - INFO - codeparrot_training - Step 5471: {'lr': 0.00048619069643159567, 'samples': 2801664, 'steps': 5471, 'batch_loss/train': 0.9405564125627279}
12/23/2021 16:58:24 - INFO - codeparrot_training - Step 5472: {'lr': 0.00048618552951750903, 'samples': 2802176, 'steps': 5472, 'batch_loss/train': 0.8350269668735564}
12/23/2021 16:58:36 - INFO - codeparrot_training - Step 5473: {'lr': 0.0004861803616644379, 'samples': 2802688, 'steps': 5473, 'batch_loss/train': 0.8527972493320704}
12/23/2021 16:58:46 - INFO - codeparrot_training - Step 5474: {'lr': 0.0004861751928724028, 'samples': 2803200, 'steps': 5474, 'batch_loss/train': 0.8787162788212299}
12/23/2021 16:58:57 - INFO - codeparrot_training - Step 5475: {'lr': 0.00048617002314142434, 'samples': 2803712, 'steps': 5475, 'batch_loss/train': 0.8552487371489406}
12/23/2021 16:59:08 - INFO - codeparrot_training - Step 5476: {'lr': 0.000486164852471523, 'samples': 2804224, 'steps': 5476, 'batch_loss/train': 0.9269188800826669}
12/23/2021 16:59:20 - INFO - codeparrot_training - Step 5477: {'lr': 0.00048615968086271945, 'samples': 2804736, 'steps': 5477, 'batch_loss/train': 0.8380448473617435}
12/23/2021 16:59:30 - INFO - codeparrot_training - Step 5478: {'lr': 0.0004861545083150341, 'samples': 2805248, 'steps': 5478, 'batch_loss/train': 0.7759979721158743}
12/23/2021 16:59:41 - INFO - codeparrot_training - Step 5479: {'lr': 0.00048614933482848765, 'samples': 2805760, 'steps': 5479, 'batch_loss/train': 0.8907184517011046}
12/23/2021 16:59:53 - INFO - codeparrot_training - Step 5480: {'lr': 0.00048614416040310056, 'samples': 2806272, 'steps': 5480, 'batch_loss/train': 0.8324821889400482}
12/23/2021 17:00:03 - INFO - codeparrot_training - Step 5481: {'lr': 0.0004861389850388935, 'samples': 2806784, 'steps': 5481, 'batch_loss/train': 0.8123818365857005}
12/23/2021 17:00:14 - INFO - codeparrot_training - Step 5482: {'lr': 0.000486133808735887, 'samples': 2807296, 'steps': 5482, 'batch_loss/train': 0.9717820845544338}
12/23/2021 17:00:27 - INFO - codeparrot_training - Step 5483: {'lr': 0.0004861286314941016, 'samples': 2807808, 'steps': 5483, 'batch_loss/train': 0.8779890909790993}
12/23/2021 17:00:38 - INFO - codeparrot_training - Step 5484: {'lr': 0.0004861234533135579, 'samples': 2808320, 'steps': 5484, 'batch_loss/train': 0.8981337123550475}
12/23/2021 17:00:49 - INFO - codeparrot_training - Step 5485: {'lr': 0.00048611827419427656, 'samples': 2808832, 'steps': 5485, 'batch_loss/train': 0.9462939174845815}
12/23/2021 17:01:01 - INFO - codeparrot_training - Step 5486: {'lr': 0.00048611309413627813, 'samples': 2809344, 'steps': 5486, 'batch_loss/train': 0.8503827666863799}
12/23/2021 17:01:11 - INFO - codeparrot_training - Step 5487: {'lr': 0.00048610791313958315, 'samples': 2809856, 'steps': 5487, 'batch_loss/train': 0.8227664027363062}
12/23/2021 17:01:22 - INFO - codeparrot_training - Step 5488: {'lr': 0.0004861027312042122, 'samples': 2810368, 'steps': 5488, 'batch_loss/train': 0.8755974061787128}
12/23/2021 17:01:33 - INFO - codeparrot_training - Step 5489: {'lr': 0.00048609754833018606, 'samples': 2810880, 'steps': 5489, 'batch_loss/train': 0.8333402005955577}
12/23/2021 17:01:45 - INFO - codeparrot_training - Step 5490: {'lr': 0.00048609236451752514, 'samples': 2811392, 'steps': 5490, 'batch_loss/train': 0.8436583462171257}
12/23/2021 17:01:56 - INFO - codeparrot_training - Step 5491: {'lr': 0.0004860871797662502, 'samples': 2811904, 'steps': 5491, 'batch_loss/train': 0.9671539906412363}
12/23/2021 17:02:06 - INFO - codeparrot_training - Step 5492: {'lr': 0.00048608199407638173, 'samples': 2812416, 'steps': 5492, 'batch_loss/train': 0.593134640250355}
12/23/2021 17:02:19 - INFO - codeparrot_training - Step 5493: {'lr': 0.0004860768074479404, 'samples': 2812928, 'steps': 5493, 'batch_loss/train': 0.8570374399423599}
12/23/2021 17:02:30 - INFO - codeparrot_training - Step 5494: {'lr': 0.00048607161988094673, 'samples': 2813440, 'steps': 5494, 'batch_loss/train': 0.8861780175939202}
12/23/2021 17:02:40 - INFO - codeparrot_training - Step 5495: {'lr': 0.00048606643137542155, 'samples': 2813952, 'steps': 5495, 'batch_loss/train': 0.8693936923518777}
12/23/2021 17:02:53 - INFO - codeparrot_training - Step 5496: {'lr': 0.00048606124193138533, 'samples': 2814464, 'steps': 5496, 'batch_loss/train': 0.8396357074379921}
12/23/2021 17:03:03 - INFO - codeparrot_training - Step 5497: {'lr': 0.0004860560515488588, 'samples': 2814976, 'steps': 5497, 'batch_loss/train': 0.9266094658523798}
12/23/2021 17:03:14 - INFO - codeparrot_training - Step 5498: {'lr': 0.0004860508602278625, 'samples': 2815488, 'steps': 5498, 'batch_loss/train': 0.9052289724349976}
12/23/2021 17:03:24 - INFO - codeparrot_training - Step 5499: {'lr': 0.00048604566796841707, 'samples': 2816000, 'steps': 5499, 'batch_loss/train': 0.7189815999008715}
12/23/2021 17:03:38 - INFO - codeparrot_training - Step 5500: {'lr': 0.0004860404747705433, 'samples': 2816512, 'steps': 5500, 'batch_loss/train': 0.8741805134341121}
12/23/2021 17:03:48 - INFO - codeparrot_training - Step 5501: {'lr': 0.0004860352806342616, 'samples': 2817024, 'steps': 5501, 'batch_loss/train': 1.8579393615946174}
12/23/2021 17:03:59 - INFO - codeparrot_training - Step 5502: {'lr': 0.00048603008555959276, 'samples': 2817536, 'steps': 5502, 'batch_loss/train': 0.8429241038393229}
12/23/2021 17:04:11 - INFO - codeparrot_training - Step 5503: {'lr': 0.00048602488954655745, 'samples': 2818048, 'steps': 5503, 'batch_loss/train': 0.8866462865844369}
12/23/2021 17:04:22 - INFO - codeparrot_training - Step 5504: {'lr': 0.00048601969259517633, 'samples': 2818560, 'steps': 5504, 'batch_loss/train': 0.8725893534719944}
12/23/2021 17:04:32 - INFO - codeparrot_training - Step 5505: {'lr': 0.00048601449470547, 'samples': 2819072, 'steps': 5505, 'batch_loss/train': 0.8743888004682958}
12/23/2021 17:04:44 - INFO - codeparrot_training - Step 5506: {'lr': 0.00048600929587745917, 'samples': 2819584, 'steps': 5506, 'batch_loss/train': 0.9318708442151546}
12/23/2021 17:04:55 - INFO - codeparrot_training - Step 5507: {'lr': 0.0004860040961111645, 'samples': 2820096, 'steps': 5507, 'batch_loss/train': 0.8723258678801358}
12/23/2021 17:05:05 - INFO - codeparrot_training - Step 5508: {'lr': 0.0004859988954066067, 'samples': 2820608, 'steps': 5508, 'batch_loss/train': 1.011260747909546}
12/23/2021 17:05:16 - INFO - codeparrot_training - Step 5509: {'lr': 0.0004859936937638063, 'samples': 2821120, 'steps': 5509, 'batch_loss/train': 0.8856195937842131}
12/23/2021 17:05:29 - INFO - codeparrot_training - Step 5510: {'lr': 0.0004859884911827841, 'samples': 2821632, 'steps': 5510, 'batch_loss/train': 0.9002292226068676}
12/23/2021 17:05:40 - INFO - codeparrot_training - Step 5511: {'lr': 0.00048598328766356083, 'samples': 2822144, 'steps': 5511, 'batch_loss/train': 0.8228402028325945}
12/23/2021 17:05:50 - INFO - codeparrot_training - Step 5512: {'lr': 0.00048597808320615705, 'samples': 2822656, 'steps': 5512, 'batch_loss/train': 0.8222105987370014}
12/23/2021 17:06:03 - INFO - codeparrot_training - Step 5513: {'lr': 0.0004859728778105935, 'samples': 2823168, 'steps': 5513, 'batch_loss/train': 0.8975968901067972}
12/23/2021 17:06:13 - INFO - codeparrot_training - Step 5514: {'lr': 0.000485967671476891, 'samples': 2823680, 'steps': 5514, 'batch_loss/train': 0.8824088722467422}
12/23/2021 17:06:24 - INFO - codeparrot_training - Step 5515: {'lr': 0.00048596246420507005, 'samples': 2824192, 'steps': 5515, 'batch_loss/train': 0.8070039702579379}
12/23/2021 17:06:36 - INFO - codeparrot_training - Step 5516: {'lr': 0.00048595725599515137, 'samples': 2824704, 'steps': 5516, 'batch_loss/train': 0.8514657085761428}
12/23/2021 17:06:46 - INFO - codeparrot_training - Step 5517: {'lr': 0.0004859520468471559, 'samples': 2825216, 'steps': 5517, 'batch_loss/train': 0.8330312790349126}
12/23/2021 17:06:57 - INFO - codeparrot_training - Step 5518: {'lr': 0.00048594683676110405, 'samples': 2825728, 'steps': 5518, 'batch_loss/train': 0.8987758485600352}
12/23/2021 17:07:09 - INFO - codeparrot_training - Step 5519: {'lr': 0.0004859416257370167, 'samples': 2826240, 'steps': 5519, 'batch_loss/train': 0.819471298949793}
12/23/2021 17:07:20 - INFO - codeparrot_training - Step 5520: {'lr': 0.0004859364137749146, 'samples': 2826752, 'steps': 5520, 'batch_loss/train': 0.6587581299245358}
12/23/2021 17:07:30 - INFO - codeparrot_training - Step 5521: {'lr': 0.00048593120087481835, 'samples': 2827264, 'steps': 5521, 'batch_loss/train': 0.8435986842960119}
12/23/2021 17:07:41 - INFO - codeparrot_training - Step 5522: {'lr': 0.00048592598703674873, 'samples': 2827776, 'steps': 5522, 'batch_loss/train': 0.8170756348408759}
12/23/2021 17:07:54 - INFO - codeparrot_training - Step 5523: {'lr': 0.00048592077226072643, 'samples': 2828288, 'steps': 5523, 'batch_loss/train': 0.8291243680287153}
12/23/2021 17:08:04 - INFO - codeparrot_training - Step 5524: {'lr': 0.00048591555654677233, 'samples': 2828800, 'steps': 5524, 'batch_loss/train': 0.8320842143148184}
12/23/2021 17:08:15 - INFO - codeparrot_training - Step 5525: {'lr': 0.00048591033989490697, 'samples': 2829312, 'steps': 5525, 'batch_loss/train': 0.8551817871630192}
12/23/2021 17:08:27 - INFO - codeparrot_training - Step 5526: {'lr': 0.00048590512230515116, 'samples': 2829824, 'steps': 5526, 'batch_loss/train': 0.9437891258858144}
12/23/2021 17:08:38 - INFO - codeparrot_training - Step 5527: {'lr': 0.00048589990377752567, 'samples': 2830336, 'steps': 5527, 'batch_loss/train': 0.925607274286449}
12/23/2021 17:08:48 - INFO - codeparrot_training - Step 5528: {'lr': 0.0004858946843120513, 'samples': 2830848, 'steps': 5528, 'batch_loss/train': 0.9275646768510342}
12/23/2021 17:09:01 - INFO - codeparrot_training - Step 5529: {'lr': 0.00048588946390874864, 'samples': 2831360, 'steps': 5529, 'batch_loss/train': 0.8918994190171361}
12/23/2021 17:09:12 - INFO - codeparrot_training - Step 5530: {'lr': 0.0004858842425676386, 'samples': 2831872, 'steps': 5530, 'batch_loss/train': 0.8088418273255229}
12/23/2021 17:09:23 - INFO - codeparrot_training - Step 5531: {'lr': 0.00048587902028874187, 'samples': 2832384, 'steps': 5531, 'batch_loss/train': 0.8946389541961253}
12/23/2021 17:09:33 - INFO - codeparrot_training - Step 5532: {'lr': 0.00048587379707207924, 'samples': 2832896, 'steps': 5532, 'batch_loss/train': 0.8687834292650223}
12/23/2021 17:09:45 - INFO - codeparrot_training - Step 5533: {'lr': 0.0004858685729176714, 'samples': 2833408, 'steps': 5533, 'batch_loss/train': 1.0128828743472695}
12/23/2021 17:09:56 - INFO - codeparrot_training - Step 5534: {'lr': 0.0004858633478255392, 'samples': 2833920, 'steps': 5534, 'batch_loss/train': 0.892938076518476}
12/23/2021 17:10:06 - INFO - codeparrot_training - Step 5535: {'lr': 0.00048585812179570344, 'samples': 2834432, 'steps': 5535, 'batch_loss/train': 0.9575237291865051}
12/23/2021 17:10:18 - INFO - codeparrot_training - Step 5536: {'lr': 0.0004858528948281848, 'samples': 2834944, 'steps': 5536, 'batch_loss/train': 0.8917054496705532}
12/23/2021 17:10:29 - INFO - codeparrot_training - Step 5537: {'lr': 0.0004858476669230041, 'samples': 2835456, 'steps': 5537, 'batch_loss/train': 0.7965695176972076}
12/23/2021 17:10:39 - INFO - codeparrot_training - Step 5538: {'lr': 0.00048584243808018216, 'samples': 2835968, 'steps': 5538, 'batch_loss/train': 0.8610226305900142}
12/23/2021 17:10:52 - INFO - codeparrot_training - Step 5539: {'lr': 0.00048583720829973963, 'samples': 2836480, 'steps': 5539, 'batch_loss/train': 0.7794702122919261}
12/23/2021 17:11:03 - INFO - codeparrot_training - Step 5540: {'lr': 0.0004858319775816975, 'samples': 2836992, 'steps': 5540, 'batch_loss/train': 0.8852645913138986}
12/23/2021 17:11:14 - INFO - codeparrot_training - Step 5541: {'lr': 0.0004858267459260765, 'samples': 2837504, 'steps': 5541, 'batch_loss/train': 0.884495141915977}
12/23/2021 17:11:26 - INFO - codeparrot_training - Step 5542: {'lr': 0.00048582151333289734, 'samples': 2838016, 'steps': 5542, 'batch_loss/train': 0.9159451965242624}
12/23/2021 17:11:36 - INFO - codeparrot_training - Step 5543: {'lr': 0.0004858162798021809, 'samples': 2838528, 'steps': 5543, 'batch_loss/train': 0.8079344977159053}
12/23/2021 17:11:47 - INFO - codeparrot_training - Step 5544: {'lr': 0.000485811045333948, 'samples': 2839040, 'steps': 5544, 'batch_loss/train': 0.834439953789115}
12/23/2021 17:11:57 - INFO - codeparrot_training - Step 5545: {'lr': 0.00048580580992821945, 'samples': 2839552, 'steps': 5545, 'batch_loss/train': 0.8672593992669135}
12/23/2021 17:12:12 - INFO - codeparrot_training - Step 5546: {'lr': 0.00048580057358501595, 'samples': 2840064, 'steps': 5546, 'batch_loss/train': 0.8461138848215342}
12/23/2021 17:12:22 - INFO - codeparrot_training - Step 5547: {'lr': 0.00048579533630435846, 'samples': 2840576, 'steps': 5547, 'batch_loss/train': 0.7957617281936109}
12/23/2021 17:12:33 - INFO - codeparrot_training - Step 5548: {'lr': 0.00048579009808626773, 'samples': 2841088, 'steps': 5548, 'batch_loss/train': 1.1710157385095954}
12/23/2021 17:12:45 - INFO - codeparrot_training - Step 5549: {'lr': 0.00048578485893076463, 'samples': 2841600, 'steps': 5549, 'batch_loss/train': 0.8558984054252505}
12/23/2021 17:12:56 - INFO - codeparrot_training - Step 5550: {'lr': 0.00048577961883786993, 'samples': 2842112, 'steps': 5550, 'batch_loss/train': 0.973009247565642}
12/23/2021 17:13:06 - INFO - codeparrot_training - Step 5551: {'lr': 0.0004857743778076045, 'samples': 2842624, 'steps': 5551, 'batch_loss/train': 0.900227690115571}
12/23/2021 17:13:18 - INFO - codeparrot_training - Step 5552: {'lr': 0.00048576913583998915, 'samples': 2843136, 'steps': 5552, 'batch_loss/train': 0.8795542884618044}
12/23/2021 17:13:29 - INFO - codeparrot_training - Step 5553: {'lr': 0.00048576389293504476, 'samples': 2843648, 'steps': 5553, 'batch_loss/train': 0.9020623269025236}
12/23/2021 17:13:40 - INFO - codeparrot_training - Step 5554: {'lr': 0.0004857586490927921, 'samples': 2844160, 'steps': 5554, 'batch_loss/train': 0.8789241891354322}
12/23/2021 17:13:50 - INFO - codeparrot_training - Step 5555: {'lr': 0.0004857534043132521, 'samples': 2844672, 'steps': 5555, 'batch_loss/train': 0.8254860430024564}
12/23/2021 17:14:02 - INFO - codeparrot_training - Step 5556: {'lr': 0.0004857481585964456, 'samples': 2845184, 'steps': 5556, 'batch_loss/train': 0.7534028613008559}
12/23/2021 17:14:13 - INFO - codeparrot_training - Step 5557: {'lr': 0.0004857429119423935, 'samples': 2845696, 'steps': 5557, 'batch_loss/train': 0.8373604333028197}
12/23/2021 17:14:24 - INFO - codeparrot_training - Step 5558: {'lr': 0.00048573766435111647, 'samples': 2846208, 'steps': 5558, 'batch_loss/train': 0.866861255839467}
12/23/2021 17:14:37 - INFO - codeparrot_training - Step 5559: {'lr': 0.0004857324158226356, 'samples': 2846720, 'steps': 5559, 'batch_loss/train': 0.9144597514532506}
12/23/2021 17:14:47 - INFO - codeparrot_training - Step 5560: {'lr': 0.0004857271663569716, 'samples': 2847232, 'steps': 5560, 'batch_loss/train': 1.1073125498369336}
12/23/2021 17:14:58 - INFO - codeparrot_training - Step 5561: {'lr': 0.00048572191595414537, 'samples': 2847744, 'steps': 5561, 'batch_loss/train': 0.7938453843817115}
12/23/2021 17:15:10 - INFO - codeparrot_training - Step 5562: {'lr': 0.00048571666461417784, 'samples': 2848256, 'steps': 5562, 'batch_loss/train': 0.8941897638142109}
12/23/2021 17:15:21 - INFO - codeparrot_training - Step 5563: {'lr': 0.00048571141233708987, 'samples': 2848768, 'steps': 5563, 'batch_loss/train': 0.9739172831177711}
12/23/2021 17:15:31 - INFO - codeparrot_training - Step 5564: {'lr': 0.0004857061591229023, 'samples': 2849280, 'steps': 5564, 'batch_loss/train': 0.6936892472440377}
12/23/2021 17:15:42 - INFO - codeparrot_training - Step 5565: {'lr': 0.0004857009049716361, 'samples': 2849792, 'steps': 5565, 'batch_loss/train': 0.9611944258213043}
12/23/2021 17:15:54 - INFO - codeparrot_training - Step 5566: {'lr': 0.00048569564988331206, 'samples': 2850304, 'steps': 5566, 'batch_loss/train': 0.9998290464282036}
12/23/2021 17:16:04 - INFO - codeparrot_training - Step 5567: {'lr': 0.00048569039385795113, 'samples': 2850816, 'steps': 5567, 'batch_loss/train': 0.9147938899695873}
12/23/2021 17:16:15 - INFO - codeparrot_training - Step 5568: {'lr': 0.0004856851368955741, 'samples': 2851328, 'steps': 5568, 'batch_loss/train': 0.8900518424343318}
12/23/2021 17:16:29 - INFO - codeparrot_training - Step 5569: {'lr': 0.00048567987899620203, 'samples': 2851840, 'steps': 5569, 'batch_loss/train': 0.8980349758639932}
12/23/2021 17:16:40 - INFO - codeparrot_training - Step 5570: {'lr': 0.00048567462015985577, 'samples': 2852352, 'steps': 5570, 'batch_loss/train': 0.9185434896498919}
12/23/2021 17:16:50 - INFO - codeparrot_training - Step 5571: {'lr': 0.00048566936038655614, 'samples': 2852864, 'steps': 5571, 'batch_loss/train': 0.8423673161305487}
12/23/2021 17:17:04 - INFO - codeparrot_training - Step 5572: {'lr': 0.0004856640996763241, 'samples': 2853376, 'steps': 5572, 'batch_loss/train': 0.9444960216060281}
12/23/2021 17:17:14 - INFO - codeparrot_training - Step 5573: {'lr': 0.00048565883802918064, 'samples': 2853888, 'steps': 5573, 'batch_loss/train': 1.2567194746807218}
12/23/2021 17:17:25 - INFO - codeparrot_training - Step 5574: {'lr': 0.00048565357544514667, 'samples': 2854400, 'steps': 5574, 'batch_loss/train': 0.9332261020317674}
12/23/2021 17:17:35 - INFO - codeparrot_training - Step 5575: {'lr': 0.000485648311924243, 'samples': 2854912, 'steps': 5575, 'batch_loss/train': 0.859251391608268}
12/23/2021 17:17:49 - INFO - codeparrot_training - Step 5576: {'lr': 0.0004856430474664906, 'samples': 2855424, 'steps': 5576, 'batch_loss/train': 0.8427300970070064}
12/23/2021 17:18:00 - INFO - codeparrot_training - Step 5577: {'lr': 0.00048563778207191046, 'samples': 2855936, 'steps': 5577, 'batch_loss/train': 0.786781070753932}
12/23/2021 17:18:10 - INFO - codeparrot_training - Step 5578: {'lr': 0.0004856325157405235, 'samples': 2856448, 'steps': 5578, 'batch_loss/train': 0.7156032832572237}
12/23/2021 17:18:22 - INFO - codeparrot_training - Step 5579: {'lr': 0.0004856272484723506, 'samples': 2856960, 'steps': 5579, 'batch_loss/train': 0.7617356129921973}
12/23/2021 17:18:33 - INFO - codeparrot_training - Step 5580: {'lr': 0.00048562198026741264, 'samples': 2857472, 'steps': 5580, 'batch_loss/train': 0.9278225831221789}
12/23/2021 17:18:43 - INFO - codeparrot_training - Step 5581: {'lr': 0.0004856167111257308, 'samples': 2857984, 'steps': 5581, 'batch_loss/train': 0.8413434326648712}
12/23/2021 17:18:55 - INFO - codeparrot_training - Step 5582: {'lr': 0.0004856114410473258, 'samples': 2858496, 'steps': 5582, 'batch_loss/train': 0.852060136385262}
12/23/2021 17:19:06 - INFO - codeparrot_training - Step 5583: {'lr': 0.0004856061700322187, 'samples': 2859008, 'steps': 5583, 'batch_loss/train': 0.9400273836217821}
12/23/2021 17:19:17 - INFO - codeparrot_training - Step 5584: {'lr': 0.0004856008980804304, 'samples': 2859520, 'steps': 5584, 'batch_loss/train': 0.8032504841685295}
12/23/2021 17:19:27 - INFO - codeparrot_training - Step 5585: {'lr': 0.00048559562519198186, 'samples': 2860032, 'steps': 5585, 'batch_loss/train': 0.8981916531920433}
12/23/2021 17:19:40 - INFO - codeparrot_training - Step 5586: {'lr': 0.0004855903513668942, 'samples': 2860544, 'steps': 5586, 'batch_loss/train': 0.8717922540381551}
12/23/2021 17:19:51 - INFO - codeparrot_training - Step 5587: {'lr': 0.00048558507660518814, 'samples': 2861056, 'steps': 5587, 'batch_loss/train': 0.8536131763830781}
12/23/2021 17:20:02 - INFO - codeparrot_training - Step 5588: {'lr': 0.0004855798009068848, 'samples': 2861568, 'steps': 5588, 'batch_loss/train': 0.851682068197988}
12/23/2021 17:20:14 - INFO - codeparrot_training - Step 5589: {'lr': 0.0004855745242720051, 'samples': 2862080, 'steps': 5589, 'batch_loss/train': 0.9316445691511035}
12/23/2021 17:20:24 - INFO - codeparrot_training - Step 5590: {'lr': 0.0004855692467005701, 'samples': 2862592, 'steps': 5590, 'batch_loss/train': 0.8758534016087651}
12/23/2021 17:20:35 - INFO - codeparrot_training - Step 5591: {'lr': 0.0004855639681926007, 'samples': 2863104, 'steps': 5591, 'batch_loss/train': 0.8922669235616922}
12/23/2021 17:20:47 - INFO - codeparrot_training - Step 5592: {'lr': 0.0004855586887481179, 'samples': 2863616, 'steps': 5592, 'batch_loss/train': 0.7365989559330046}
12/23/2021 17:20:58 - INFO - codeparrot_training - Step 5593: {'lr': 0.0004855534083671427, 'samples': 2864128, 'steps': 5593, 'batch_loss/train': 0.9325183751061559}
12/23/2021 17:21:08 - INFO - codeparrot_training - Step 5594: {'lr': 0.0004855481270496961, 'samples': 2864640, 'steps': 5594, 'batch_loss/train': 0.8058261582627892}
12/23/2021 17:21:21 - INFO - codeparrot_training - Step 5595: {'lr': 0.000485542844795799, 'samples': 2865152, 'steps': 5595, 'batch_loss/train': 0.8849473455920815}
12/23/2021 17:21:32 - INFO - codeparrot_training - Step 5596: {'lr': 0.0004855375616054726, 'samples': 2865664, 'steps': 5596, 'batch_loss/train': 0.9457293432205915}
12/23/2021 17:21:43 - INFO - codeparrot_training - Step 5597: {'lr': 0.00048553227747873774, 'samples': 2866176, 'steps': 5597, 'batch_loss/train': 0.8494891335722059}
12/23/2021 17:21:53 - INFO - codeparrot_training - Step 5598: {'lr': 0.00048552699241561547, 'samples': 2866688, 'steps': 5598, 'batch_loss/train': 0.9992296798154712}
12/23/2021 17:22:05 - INFO - codeparrot_training - Step 5599: {'lr': 0.00048552170641612686, 'samples': 2867200, 'steps': 5599, 'batch_loss/train': 0.8889033505693078}
12/23/2021 17:22:16 - INFO - codeparrot_training - Step 5600: {'lr': 0.00048551641948029286, 'samples': 2867712, 'steps': 5600, 'batch_loss/train': 0.9205373004078865}
12/23/2021 17:22:26 - INFO - codeparrot_training - Step 5601: {'lr': 0.00048551113160813443, 'samples': 2868224, 'steps': 5601, 'batch_loss/train': 0.8006526557728648}
12/23/2021 17:22:39 - INFO - codeparrot_training - Step 5602: {'lr': 0.0004855058427996727, 'samples': 2868736, 'steps': 5602, 'batch_loss/train': 0.8697733879089355}
12/23/2021 17:22:49 - INFO - codeparrot_training - Step 5603: {'lr': 0.0004855005530549287, 'samples': 2869248, 'steps': 5603, 'batch_loss/train': 0.8155545084737241}
12/23/2021 17:23:00 - INFO - codeparrot_training - Step 5604: {'lr': 0.0004854952623739235, 'samples': 2869760, 'steps': 5604, 'batch_loss/train': 0.664637322188355}
12/23/2021 17:23:12 - INFO - codeparrot_training - Step 5605: {'lr': 0.00048548997075667787, 'samples': 2870272, 'steps': 5605, 'batch_loss/train': 0.8923755511641502}
12/23/2021 17:23:22 - INFO - codeparrot_training - Step 5606: {'lr': 0.00048548467820321316, 'samples': 2870784, 'steps': 5606, 'batch_loss/train': 0.8696077847853303}
12/23/2021 17:23:33 - INFO - codeparrot_training - Step 5607: {'lr': 0.0004854793847135502, 'samples': 2871296, 'steps': 5607, 'batch_loss/train': 0.8030636941548437}
12/23/2021 17:23:44 - INFO - codeparrot_training - Step 5608: {'lr': 0.00048547409028771026, 'samples': 2871808, 'steps': 5608, 'batch_loss/train': 0.7932616770267487}
12/23/2021 17:23:57 - INFO - codeparrot_training - Step 5609: {'lr': 0.00048546879492571414, 'samples': 2872320, 'steps': 5609, 'batch_loss/train': 0.9706066912040114}
12/23/2021 17:24:07 - INFO - codeparrot_training - Step 5610: {'lr': 0.000485463498627583, 'samples': 2872832, 'steps': 5610, 'batch_loss/train': 0.8683832297101617}
12/23/2021 17:24:18 - INFO - codeparrot_training - Step 5611: {'lr': 0.00048545820139333796, 'samples': 2873344, 'steps': 5611, 'batch_loss/train': 0.9176500802859664}
12/23/2021 17:24:30 - INFO - codeparrot_training - Step 5612: {'lr': 0.00048545290322299996, 'samples': 2873856, 'steps': 5612, 'batch_loss/train': 0.8297794726677239}
12/23/2021 17:24:41 - INFO - codeparrot_training - Step 5613: {'lr': 0.00048544760411659016, 'samples': 2874368, 'steps': 5613, 'batch_loss/train': 0.8236626544967294}
12/23/2021 17:24:51 - INFO - codeparrot_training - Step 5614: {'lr': 0.00048544230407412953, 'samples': 2874880, 'steps': 5614, 'batch_loss/train': 0.8302073460072279}
12/23/2021 17:25:04 - INFO - codeparrot_training - Step 5615: {'lr': 0.0004854370030956392, 'samples': 2875392, 'steps': 5615, 'batch_loss/train': 0.920360348187387}
12/23/2021 17:25:15 - INFO - codeparrot_training - Step 5616: {'lr': 0.0004854317011811403, 'samples': 2875904, 'steps': 5616, 'batch_loss/train': 0.8687647208571434}
12/23/2021 17:25:25 - INFO - codeparrot_training - Step 5617: {'lr': 0.00048542639833065385, 'samples': 2876416, 'steps': 5617, 'batch_loss/train': 0.7807512022554874}
12/23/2021 17:25:37 - INFO - codeparrot_training - Step 5618: {'lr': 0.0004854210945442009, 'samples': 2876928, 'steps': 5618, 'batch_loss/train': 0.8431373322382569}
12/23/2021 17:25:48 - INFO - codeparrot_training - Step 5619: {'lr': 0.0004854157898218026, 'samples': 2877440, 'steps': 5619, 'batch_loss/train': 0.8802944794297218}
12/23/2021 17:25:59 - INFO - codeparrot_training - Step 5620: {'lr': 0.00048541048416348, 'samples': 2877952, 'steps': 5620, 'batch_loss/train': 0.9604625422507524}
12/23/2021 17:26:09 - INFO - codeparrot_training - Step 5621: {'lr': 0.0004854051775692542, 'samples': 2878464, 'steps': 5621, 'batch_loss/train': 0.8999402206391096}
12/23/2021 17:26:21 - INFO - codeparrot_training - Step 5622: {'lr': 0.0004853998700391463, 'samples': 2878976, 'steps': 5622, 'batch_loss/train': 0.7686800982337445}
12/23/2021 17:26:32 - INFO - codeparrot_training - Step 5623: {'lr': 0.0004853945615731774, 'samples': 2879488, 'steps': 5623, 'batch_loss/train': 0.8545526005327702}
12/23/2021 17:26:42 - INFO - codeparrot_training - Step 5624: {'lr': 0.00048538925217136855, 'samples': 2880000, 'steps': 5624, 'batch_loss/train': 0.8539441218599677}
12/23/2021 17:26:56 - INFO - codeparrot_training - Step 5625: {'lr': 0.000485383941833741, 'samples': 2880512, 'steps': 5625, 'batch_loss/train': 0.8568756757304072}
12/23/2021 17:27:06 - INFO - codeparrot_training - Step 5626: {'lr': 0.00048537863056031573, 'samples': 2881024, 'steps': 5626, 'batch_loss/train': 0.792392217554152}
12/23/2021 17:27:17 - INFO - codeparrot_training - Step 5627: {'lr': 0.00048537331835111385, 'samples': 2881536, 'steps': 5627, 'batch_loss/train': 0.7625353699550033}
12/23/2021 17:27:29 - INFO - codeparrot_training - Step 5628: {'lr': 0.00048536800520615665, 'samples': 2882048, 'steps': 5628, 'batch_loss/train': 0.9085093894973397}
12/23/2021 17:27:39 - INFO - codeparrot_training - Step 5629: {'lr': 0.00048536269112546505, 'samples': 2882560, 'steps': 5629, 'batch_loss/train': 0.8547924719750881}
12/23/2021 17:27:50 - INFO - codeparrot_training - Step 5630: {'lr': 0.0004853573761090603, 'samples': 2883072, 'steps': 5630, 'batch_loss/train': 0.8494716705754399}
12/23/2021 17:28:01 - INFO - codeparrot_training - Step 5631: {'lr': 0.00048535206015696343, 'samples': 2883584, 'steps': 5631, 'batch_loss/train': 0.877021761611104}
12/23/2021 17:28:13 - INFO - codeparrot_training - Step 5632: {'lr': 0.0004853467432691957, 'samples': 2884096, 'steps': 5632, 'batch_loss/train': 0.947802297770977}
12/23/2021 17:28:23 - INFO - codeparrot_training - Step 5633: {'lr': 0.00048534142544577814, 'samples': 2884608, 'steps': 5633, 'batch_loss/train': 0.8717921879142523}
12/23/2021 17:28:34 - INFO - codeparrot_training - Step 5634: {'lr': 0.00048533610668673193, 'samples': 2885120, 'steps': 5634, 'batch_loss/train': 0.9233399308286607}
12/23/2021 17:28:47 - INFO - codeparrot_training - Step 5635: {'lr': 0.00048533078699207824, 'samples': 2885632, 'steps': 5635, 'batch_loss/train': 0.8021317915990949}
12/23/2021 17:28:58 - INFO - codeparrot_training - Step 5636: {'lr': 0.00048532546636183825, 'samples': 2886144, 'steps': 5636, 'batch_loss/train': 0.9622331839054823}
12/23/2021 17:29:08 - INFO - codeparrot_training - Step 5637: {'lr': 0.000485320144796033, 'samples': 2886656, 'steps': 5637, 'batch_loss/train': 0.8594139166816603}
12/23/2021 17:29:21 - INFO - codeparrot_training - Step 5638: {'lr': 0.0004853148222946837, 'samples': 2887168, 'steps': 5638, 'batch_loss/train': 0.8732613250613213}
12/23/2021 17:29:31 - INFO - codeparrot_training - Step 5639: {'lr': 0.0004853094988578115, 'samples': 2887680, 'steps': 5639, 'batch_loss/train': 0.8512482673395425}
12/23/2021 17:29:42 - INFO - codeparrot_training - Step 5640: {'lr': 0.00048530417448543764, 'samples': 2888192, 'steps': 5640, 'batch_loss/train': 0.773575053550303}
12/23/2021 17:29:54 - INFO - codeparrot_training - Step 5641: {'lr': 0.0004852988491775832, 'samples': 2888704, 'steps': 5641, 'batch_loss/train': 0.7572517129592597}
12/23/2021 17:30:04 - INFO - codeparrot_training - Step 5642: {'lr': 0.0004852935229342694, 'samples': 2889216, 'steps': 5642, 'batch_loss/train': 0.8855923749506474}
12/23/2021 17:30:15 - INFO - codeparrot_training - Step 5643: {'lr': 0.0004852881957555174, 'samples': 2889728, 'steps': 5643, 'batch_loss/train': 0.8325124559924006}
12/23/2021 17:30:26 - INFO - codeparrot_training - Step 5644: {'lr': 0.00048528286764134833, 'samples': 2890240, 'steps': 5644, 'batch_loss/train': 0.6692821539472789}
12/23/2021 17:30:39 - INFO - codeparrot_training - Step 5645: {'lr': 0.00048527753859178347, 'samples': 2890752, 'steps': 5645, 'batch_loss/train': 0.8705287212505937}
12/23/2021 17:30:49 - INFO - codeparrot_training - Step 5646: {'lr': 0.0004852722086068439, 'samples': 2891264, 'steps': 5646, 'batch_loss/train': 0.9769562250003219}
12/23/2021 17:31:00 - INFO - codeparrot_training - Step 5647: {'lr': 0.00048526687768655086, 'samples': 2891776, 'steps': 5647, 'batch_loss/train': 0.7590757622383535}
12/23/2021 17:31:12 - INFO - codeparrot_training - Step 5648: {'lr': 0.0004852615458309256, 'samples': 2892288, 'steps': 5648, 'batch_loss/train': 0.947014482691884}
12/23/2021 17:31:23 - INFO - codeparrot_training - Step 5649: {'lr': 0.00048525621303998923, 'samples': 2892800, 'steps': 5649, 'batch_loss/train': 0.9017032599076629}
12/23/2021 17:31:33 -
INFO - codeparrot_training - Step 5650: {'lr': 0.000485250879313763, 'samples': 2893312, 'steps': 5650, 'batch_loss/train': 0.706723022274673} 12/23/2021 17:31:44 - INFO - codeparrot_training - Step 5651: {'lr': 0.0004852455446522681, 'samples': 2893824, 'steps': 5651, 'batch_loss/train': 0.7976221679709852} 12/23/2021 17:31:56 - INFO - codeparrot_training - Step 5652: {'lr': 0.0004852402090555258, 'samples': 2894336, 'steps': 5652, 'batch_loss/train': 0.8768794368952513} 12/23/2021 17:32:07 - INFO - codeparrot_training - Step 5653: {'lr': 0.0004852348725235571, 'samples': 2894848, 'steps': 5653, 'batch_loss/train': 0.8027380462735891} 12/23/2021 17:32:17 - INFO - codeparrot_training - Step 5654: {'lr': 0.0004852295350563836, 'samples': 2895360, 'steps': 5654, 'batch_loss/train': 0.8682599854655564} 12/23/2021 17:32:30 - INFO - codeparrot_training - Step 5655: {'lr': 0.00048522419665402605, 'samples': 2895872, 'steps': 5655, 'batch_loss/train': 0.8331684852018952} 12/23/2021 17:32:41 - INFO - codeparrot_training - Step 5656: {'lr': 0.0004852188573165061, 'samples': 2896384, 'steps': 5656, 'batch_loss/train': 0.7908188076689839} 12/23/2021 17:32:52 - INFO - codeparrot_training - Step 5657: {'lr': 0.00048521351704384466, 'samples': 2896896, 'steps': 5657, 'batch_loss/train': 0.8206956987269223} 12/23/2021 17:33:04 - INFO - codeparrot_training - Step 5658: {'lr': 0.00048520817583606315, 'samples': 2897408, 'steps': 5658, 'batch_loss/train': 0.8490797565318644} 12/23/2021 17:33:15 - INFO - codeparrot_training - Step 5659: {'lr': 0.0004852028336931827, 'samples': 2897920, 'steps': 5659, 'batch_loss/train': 1.2442417982965708} 12/23/2021 17:33:25 - INFO - codeparrot_training - Step 5660: {'lr': 0.00048519749061522454, 'samples': 2898432, 'steps': 5660, 'batch_loss/train': 0.8903687437996268} 12/23/2021 17:33:36 - INFO - codeparrot_training - Step 5661: {'lr': 0.0004851921466022101, 'samples': 2898944, 'steps': 5661, 'batch_loss/train': 0.9303020723164082} 12/23/2021 
17:33:49 - INFO - codeparrot_training - Step 5662: {'lr': 0.0004851868016541604, 'samples': 2899456, 'steps': 5662, 'batch_loss/train': 0.6762329021003097} 12/23/2021 17:34:00 - INFO - codeparrot_training - Step 5663: {'lr': 0.00048518145577109686, 'samples': 2899968, 'steps': 5663, 'batch_loss/train': 0.8446913557127118} 12/23/2021 17:34:10 - INFO - codeparrot_training - Step 5664: {'lr': 0.0004851761089530406, 'samples': 2900480, 'steps': 5664, 'batch_loss/train': 0.8422333234921098} 12/23/2021 17:34:22 - INFO - codeparrot_training - Step 5665: {'lr': 0.0004851707612000129, 'samples': 2900992, 'steps': 5665, 'batch_loss/train': 0.8478340557776392} 12/23/2021 17:34:33 - INFO - codeparrot_training - Step 5666: {'lr': 0.00048516541251203513, 'samples': 2901504, 'steps': 5666, 'batch_loss/train': 0.7539586089551449} 12/23/2021 17:34:44 - INFO - codeparrot_training - Step 5667: {'lr': 0.0004851600628891284, 'samples': 2902016, 'steps': 5667, 'batch_loss/train': 0.9275765102356672} 12/23/2021 17:34:56 - INFO - codeparrot_training - Step 5668: {'lr': 0.00048515471233131406, 'samples': 2902528, 'steps': 5668, 'batch_loss/train': 0.8563982918858528} 12/23/2021 17:35:06 - INFO - codeparrot_training - Step 5669: {'lr': 0.00048514936083861337, 'samples': 2903040, 'steps': 5669, 'batch_loss/train': 0.8024991089478135} 12/23/2021 17:35:17 - INFO - codeparrot_training - Step 5670: {'lr': 0.00048514400841104764, 'samples': 2903552, 'steps': 5670, 'batch_loss/train': 0.8589194258674979} 12/23/2021 17:35:28 - INFO - codeparrot_training - Step 5671: {'lr': 0.0004851386550486382, 'samples': 2904064, 'steps': 5671, 'batch_loss/train': 1.0097386883571744} 12/23/2021 17:35:40 - INFO - codeparrot_training - Step 5672: {'lr': 0.00048513330075140615, 'samples': 2904576, 'steps': 5672, 'batch_loss/train': 0.7166034178808331} 12/23/2021 17:35:50 - INFO - codeparrot_training - Step 5673: {'lr': 0.0004851279455193729, 'samples': 2905088, 'steps': 5673, 'batch_loss/train': 0.8665933618322015} 
12/23/2021 17:36:01 - INFO - codeparrot_training - Step 5674: {'lr': 0.0004851225893525597, 'samples': 2905600, 'steps': 5674, 'batch_loss/train': 0.9810452135279775} 12/23/2021 17:36:14 - INFO - codeparrot_training - Step 5675: {'lr': 0.00048511723225098783, 'samples': 2906112, 'steps': 5675, 'batch_loss/train': 0.8630236145108938} 12/23/2021 17:36:25 - INFO - codeparrot_training - Step 5676: {'lr': 0.0004851118742146787, 'samples': 2906624, 'steps': 5676, 'batch_loss/train': 0.8741940199397504} 12/23/2021 17:36:35 - INFO - codeparrot_training - Step 5677: {'lr': 0.00048510651524365353, 'samples': 2907136, 'steps': 5677, 'batch_loss/train': 0.9369575306773186} 12/23/2021 17:36:47 - INFO - codeparrot_training - Step 5678: {'lr': 0.0004851011553379336, 'samples': 2907648, 'steps': 5678, 'batch_loss/train': 0.9262470761314034} 12/23/2021 17:36:58 - INFO - codeparrot_training - Step 5679: {'lr': 0.00048509579449754024, 'samples': 2908160, 'steps': 5679, 'batch_loss/train': 0.8292013071477413} 12/23/2021 17:37:08 - INFO - codeparrot_training - Step 5680: {'lr': 0.00048509043272249475, 'samples': 2908672, 'steps': 5680, 'batch_loss/train': 0.9375350645277649} 12/23/2021 17:37:19 - INFO - codeparrot_training - Step 5681: {'lr': 0.0004850850700128185, 'samples': 2909184, 'steps': 5681, 'batch_loss/train': 0.8607015833258629} 12/23/2021 17:37:31 - INFO - codeparrot_training - Step 5682: {'lr': 0.0004850797063685328, 'samples': 2909696, 'steps': 5682, 'batch_loss/train': 0.7911530383862555} 12/23/2021 17:37:42 - INFO - codeparrot_training - Step 5683: {'lr': 0.0004850743417896589, 'samples': 2910208, 'steps': 5683, 'batch_loss/train': 0.9002758674323559} 12/23/2021 17:37:52 - INFO - codeparrot_training - Step 5684: {'lr': 0.00048506897627621825, 'samples': 2910720, 'steps': 5684, 'batch_loss/train': 0.8946467079222202} 12/23/2021 17:38:05 - INFO - codeparrot_training - Step 5685: {'lr': 0.00048506360982823206, 'samples': 2911232, 'steps': 5685, 'batch_loss/train': 
0.9298830665647984} 12/23/2021 17:38:16 - INFO - codeparrot_training - Step 5686: {'lr': 0.0004850582424457217, 'samples': 2911744, 'steps': 5686, 'batch_loss/train': 0.8906488846987486} 12/23/2021 17:38:27 - INFO - codeparrot_training - Step 5687: {'lr': 0.0004850528741287085, 'samples': 2912256, 'steps': 5687, 'batch_loss/train': 0.728834193199873} 12/23/2021 17:38:40 - INFO - codeparrot_training - Step 5688: {'lr': 0.0004850475048772139, 'samples': 2912768, 'steps': 5688, 'batch_loss/train': 0.8610173701308668} 12/23/2021 17:38:50 - INFO - codeparrot_training - Step 5689: {'lr': 0.0004850421346912591, 'samples': 2913280, 'steps': 5689, 'batch_loss/train': 0.490997152402997} 12/23/2021 17:39:01 - INFO - codeparrot_training - Step 5690: {'lr': 0.00048503676357086555, 'samples': 2913792, 'steps': 5690, 'batch_loss/train': 0.9332099710591137} 12/23/2021 17:39:11 - INFO - codeparrot_training - Step 5691: {'lr': 0.0004850313915160546, 'samples': 2914304, 'steps': 5691, 'batch_loss/train': 1.530093121342361} 12/23/2021 17:39:25 - INFO - codeparrot_training - Step 5692: {'lr': 0.0004850260185268476, 'samples': 2914816, 'steps': 5692, 'batch_loss/train': 0.7996871937066317} 12/23/2021 17:39:35 - INFO - codeparrot_training - Step 5693: {'lr': 0.0004850206446032658, 'samples': 2915328, 'steps': 5693, 'batch_loss/train': 0.80346107063815} 12/23/2021 17:39:46 - INFO - codeparrot_training - Step 5694: {'lr': 0.0004850152697453307, 'samples': 2915840, 'steps': 5694, 'batch_loss/train': 0.8043468984542415} 12/23/2021 17:39:58 - INFO - codeparrot_training - Step 5695: {'lr': 0.0004850098939530636, 'samples': 2916352, 'steps': 5695, 'batch_loss/train': 0.977509181946516} 12/23/2021 17:40:09 - INFO - codeparrot_training - Step 5696: {'lr': 0.00048500451722648596, 'samples': 2916864, 'steps': 5696, 'batch_loss/train': 0.9594697924330831} 12/23/2021 17:40:19 - INFO - codeparrot_training - Step 5697: {'lr': 0.0004849991395656191, 'samples': 2917376, 'steps': 5697, 'batch_loss/train': 
0.9153572823852301} 12/23/2021 17:40:34 - INFO - codeparrot_training - Step 5698: {'lr': 0.0004849937609704843, 'samples': 2917888, 'steps': 5698, 'batch_loss/train': 0.921175523661077} 12/23/2021 17:40:44 - INFO - codeparrot_training - Step 5699: {'lr': 0.00048498838144110315, 'samples': 2918400, 'steps': 5699, 'batch_loss/train': 0.8596014480572194} 12/23/2021 17:40:55 - INFO - codeparrot_training - Step 5700: {'lr': 0.00048498300097749687, 'samples': 2918912, 'steps': 5700, 'batch_loss/train': 0.8912921277806163} 12/23/2021 17:41:05 - INFO - codeparrot_training - Step 5701: {'lr': 0.0004849776195796869, 'samples': 2919424, 'steps': 5701, 'batch_loss/train': 0.7450002958066761} 12/23/2021 17:41:17 - INFO - codeparrot_training - Step 5702: {'lr': 0.0004849722372476947, 'samples': 2919936, 'steps': 5702, 'batch_loss/train': 0.8302383599802852} 12/23/2021 17:41:28 - INFO - codeparrot_training - Step 5703: {'lr': 0.0004849668539815415, 'samples': 2920448, 'steps': 5703, 'batch_loss/train': 0.8983387500047684} 12/23/2021 17:41:39 - INFO - codeparrot_training - Step 5704: {'lr': 0.0004849614697812489, 'samples': 2920960, 'steps': 5704, 'batch_loss/train': 0.8286627158522606} 12/23/2021 17:41:51 - INFO - codeparrot_training - Step 5705: {'lr': 0.00048495608464683813, 'samples': 2921472, 'steps': 5705, 'batch_loss/train': 0.8180871140211821} 12/23/2021 17:42:01 - INFO - codeparrot_training - Step 5706: {'lr': 0.00048495069857833076, 'samples': 2921984, 'steps': 5706, 'batch_loss/train': 0.9020958906039596} 12/23/2021 17:42:12 - INFO - codeparrot_training - Step 5707: {'lr': 0.0004849453115757481, 'samples': 2922496, 'steps': 5707, 'batch_loss/train': 0.8186493096873164} 12/23/2021 17:42:26 - INFO - codeparrot_training - Step 5708: {'lr': 0.0004849399236391115, 'samples': 2923008, 'steps': 5708, 'batch_loss/train': 0.8741459120064974} 12/23/2021 17:42:36 - INFO - codeparrot_training - Step 5709: {'lr': 0.00048493453476844265, 'samples': 2923520, 'steps': 5709, 
'batch_loss/train': 0.9111541910097003} 12/23/2021 17:42:47 - INFO - codeparrot_training - Step 5710: {'lr': 0.0004849291449637626, 'samples': 2924032, 'steps': 5710, 'batch_loss/train': 0.8488597963005304} 12/23/2021 17:42:58 - INFO - codeparrot_training - Step 5711: {'lr': 0.00048492375422509314, 'samples': 2924544, 'steps': 5711, 'batch_loss/train': 0.8909217910841107} 12/23/2021 17:43:10 - INFO - codeparrot_training - Step 5712: {'lr': 0.0004849183625524555, 'samples': 2925056, 'steps': 5712, 'batch_loss/train': 0.8746017012745142} 12/23/2021 17:43:21 - INFO - codeparrot_training - Step 5713: {'lr': 0.00048491296994587107, 'samples': 2925568, 'steps': 5713, 'batch_loss/train': 0.8949397504329681} 12/23/2021 17:43:31 - INFO - codeparrot_training - Step 5714: {'lr': 0.00048490757640536145, 'samples': 2926080, 'steps': 5714, 'batch_loss/train': 0.5550980230327696} 12/23/2021 17:43:43 - INFO - codeparrot_training - Step 5715: {'lr': 0.00048490218193094806, 'samples': 2926592, 'steps': 5715, 'batch_loss/train': 0.8172649592161179} 12/23/2021 17:43:54 - INFO - codeparrot_training - Step 5716: {'lr': 0.00048489678652265214, 'samples': 2927104, 'steps': 5716, 'batch_loss/train': 0.8432203438133001} 12/23/2021 17:44:04 - INFO - codeparrot_training - Step 5717: {'lr': 0.0004848913901804954, 'samples': 2927616, 'steps': 5717, 'batch_loss/train': 0.878210287541151} 12/23/2021 17:44:18 - INFO - codeparrot_training - Step 5718: {'lr': 0.00048488599290449913, 'samples': 2928128, 'steps': 5718, 'batch_loss/train': 0.8559604017063975} 12/23/2021 17:44:28 - INFO - codeparrot_training - Step 5719: {'lr': 0.00048488059469468486, 'samples': 2928640, 'steps': 5719, 'batch_loss/train': 0.8142470642924309} 12/23/2021 17:44:39 - INFO - codeparrot_training - Step 5720: {'lr': 0.00048487519555107404, 'samples': 2929152, 'steps': 5720, 'batch_loss/train': 0.8788981977850199} 12/23/2021 17:44:49 - INFO - codeparrot_training - Step 5721: {'lr': 0.00048486979547368814, 'samples': 2929664, 
'steps': 5721, 'batch_loss/train': 0.853825356811285} 12/23/2021 17:45:01 - INFO - codeparrot_training - Step 5722: {'lr': 0.0004848643944625486, 'samples': 2930176, 'steps': 5722, 'batch_loss/train': 0.9267551824450493} 12/23/2021 17:45:12 - INFO - codeparrot_training - Step 5723: {'lr': 0.0004848589925176769, 'samples': 2930688, 'steps': 5723, 'batch_loss/train': 0.8458045292645693} 12/23/2021 17:45:23 - INFO - codeparrot_training - Step 5724: {'lr': 0.00048485358963909454, 'samples': 2931200, 'steps': 5724, 'batch_loss/train': 0.8388996729627252} 12/23/2021 17:45:36 - INFO - codeparrot_training - Step 5725: {'lr': 0.000484848185826823, 'samples': 2931712, 'steps': 5725, 'batch_loss/train': 0.887157610617578} 12/23/2021 17:45:47 - INFO - codeparrot_training - Step 5726: {'lr': 0.0004848427810808837, 'samples': 2932224, 'steps': 5726, 'batch_loss/train': 0.9172366484999657} 12/23/2021 17:45:58 - INFO - codeparrot_training - Step 5727: {'lr': 0.0004848373754012982, 'samples': 2932736, 'steps': 5727, 'batch_loss/train': 0.8495680810883641} 12/23/2021 17:46:10 - INFO - codeparrot_training - Step 5728: {'lr': 0.000484831968788088, 'samples': 2933248, 'steps': 5728, 'batch_loss/train': 0.8288836134597659} 12/23/2021 17:46:20 - INFO - codeparrot_training - Step 5729: {'lr': 0.0004848265612412745, 'samples': 2933760, 'steps': 5729, 'batch_loss/train': 0.6694669981952757} 12/23/2021 17:46:31 - INFO - codeparrot_training - Step 5730: {'lr': 0.0004848211527608793, 'samples': 2934272, 'steps': 5730, 'batch_loss/train': 0.89100047852844} 12/23/2021 17:46:43 - INFO - codeparrot_training - Step 5731: {'lr': 0.0004848157433469239, 'samples': 2934784, 'steps': 5731, 'batch_loss/train': 0.8551372173242271} 12/23/2021 17:46:53 - INFO - codeparrot_training - Step 5732: {'lr': 0.0004848103329994297, 'samples': 2935296, 'steps': 5732, 'batch_loss/train': 0.8487052330747247} 12/23/2021 17:47:04 - INFO - codeparrot_training - Step 5733: {'lr': 0.00048480492171841825, 'samples': 2935808, 
'steps': 5733, 'batch_loss/train': 0.8736532917246222} 12/23/2021 17:47:15 - INFO - codeparrot_training - Step 5734: {'lr': 0.0004847995095039112, 'samples': 2936320, 'steps': 5734, 'batch_loss/train': 0.7677104147151113} 12/23/2021 17:47:27 - INFO - codeparrot_training - Step 5735: {'lr': 0.00048479409635592983, 'samples': 2936832, 'steps': 5735, 'batch_loss/train': 0.833510048687458} 12/23/2021 17:47:37 - INFO - codeparrot_training - Step 5736: {'lr': 0.00048478868227449585, 'samples': 2937344, 'steps': 5736, 'batch_loss/train': 0.8808179637417197} 12/23/2021 17:47:48 - INFO - codeparrot_training - Step 5737: {'lr': 0.00048478326725963074, 'samples': 2937856, 'steps': 5737, 'batch_loss/train': 0.9663164839148521} 12/23/2021 17:48:02 - INFO - codeparrot_training - Step 5738: {'lr': 0.000484777851311356, 'samples': 2938368, 'steps': 5738, 'batch_loss/train': 0.8004668047651649} 12/23/2021 17:48:12 - INFO - codeparrot_training - Step 5739: {'lr': 0.00048477243442969313, 'samples': 2938880, 'steps': 5739, 'batch_loss/train': 0.8829970816150308} 12/23/2021 17:48:23 - INFO - codeparrot_training - Step 5740: {'lr': 0.0004847670166146638, 'samples': 2939392, 'steps': 5740, 'batch_loss/train': 0.7511804001405835} 12/23/2021 17:48:35 - INFO - codeparrot_training - Step 5741: {'lr': 0.00048476159786628936, 'samples': 2939904, 'steps': 5741, 'batch_loss/train': 0.6992355352267623} 12/23/2021 17:48:46 - INFO - codeparrot_training - Step 5742: {'lr': 0.0004847561781845915, 'samples': 2940416, 'steps': 5742, 'batch_loss/train': 0.9154954534024} 12/23/2021 17:48:56 - INFO - codeparrot_training - Step 5743: {'lr': 0.0004847507575695917, 'samples': 2940928, 'steps': 5743, 'batch_loss/train': 0.8479732144623995} 12/23/2021 17:49:07 - INFO - codeparrot_training - Step 5744: {'lr': 0.00048474533602131155, 'samples': 2941440, 'steps': 5744, 'batch_loss/train': 0.8439125348813832} 12/23/2021 17:49:19 - INFO - codeparrot_training - Step 5745: {'lr': 0.0004847399135397725, 'samples': 
2941952, 'steps': 5745, 'batch_loss/train': 0.8634787406772375} 12/23/2021 17:49:30 - INFO - codeparrot_training - Step 5746: {'lr': 0.00048473449012499626, 'samples': 2942464, 'steps': 5746, 'batch_loss/train': 0.8475349908694625} 12/23/2021 17:49:40 - INFO - codeparrot_training - Step 5747: {'lr': 0.0004847290657770044, 'samples': 2942976, 'steps': 5747, 'batch_loss/train': 0.8270867080427706} 12/23/2021 17:49:54 - INFO - codeparrot_training - Step 5748: {'lr': 0.0004847236404958182, 'samples': 2943488, 'steps': 5748, 'batch_loss/train': 0.7840717742219567} 12/23/2021 17:50:05 - INFO - codeparrot_training - Step 5749: {'lr': 0.0004847182142814596, 'samples': 2944000, 'steps': 5749, 'batch_loss/train': 0.7996553983539343} 12/23/2021 17:50:15 - INFO - codeparrot_training - Step 5750: {'lr': 0.00048471278713394997, 'samples': 2944512, 'steps': 5750, 'batch_loss/train': 2.1070206351578236} 12/23/2021 17:50:27 - INFO - codeparrot_training - Step 5751: {'lr': 0.00048470735905331087, 'samples': 2945024, 'steps': 5751, 'batch_loss/train': 0.9943737210705876} 12/23/2021 17:50:38 - INFO - codeparrot_training - Step 5752: {'lr': 0.00048470193003956395, 'samples': 2945536, 'steps': 5752, 'batch_loss/train': 0.8111037453636527} 12/23/2021 17:50:48 - INFO - codeparrot_training - Step 5753: {'lr': 0.0004846965000927308, 'samples': 2946048, 'steps': 5753, 'batch_loss/train': 0.7603561477735639} 12/23/2021 17:50:59 - INFO - codeparrot_training - Step 5754: {'lr': 0.000484691069212833, 'samples': 2946560, 'steps': 5754, 'batch_loss/train': 0.8025361662730575} 12/23/2021 17:51:12 - INFO - codeparrot_training - Step 5755: {'lr': 0.00048468563739989214, 'samples': 2947072, 'steps': 5755, 'batch_loss/train': 0.8946326747536659} 12/23/2021 17:51:23 - INFO - codeparrot_training - Step 5756: {'lr': 0.0004846802046539297, 'samples': 2947584, 'steps': 5756, 'batch_loss/train': 0.8663171147927642} 12/23/2021 17:51:33 - INFO - codeparrot_training - Step 5757: {'lr': 0.0004846747709749675, 
'samples': 2948096, 'steps': 5757, 'batch_loss/train': 0.8758942447602749} 12/23/2021 17:51:45 - INFO - codeparrot_training - Step 5758: {'lr': 0.0004846693363630269, 'samples': 2948608, 'steps': 5758, 'batch_loss/train': 0.8662596447393298} 12/23/2021 17:51:56 - INFO - codeparrot_training - Step 5759: {'lr': 0.0004846639008181297, 'samples': 2949120, 'steps': 5759, 'batch_loss/train': 0.6931184604763985} 12/23/2021 17:52:06 - INFO - codeparrot_training - Step 5760: {'lr': 0.0004846584643402975, 'samples': 2949632, 'steps': 5760, 'batch_loss/train': 0.8460764815099537} 12/23/2021 17:52:19 - INFO - codeparrot_training - Step 5761: {'lr': 0.0004846530269295517, 'samples': 2950144, 'steps': 5761, 'batch_loss/train': 0.9372547529637814} 12/23/2021 17:52:29 - INFO - codeparrot_training - Step 5762: {'lr': 0.00048464758858591416, 'samples': 2950656, 'steps': 5762, 'batch_loss/train': 0.8251945469528437} 12/23/2021 17:52:40 - INFO - codeparrot_training - Step 5763: {'lr': 0.00048464214930940644, 'samples': 2951168, 'steps': 5763, 'batch_loss/train': 0.8841389818117023} 12/23/2021 17:52:50 - INFO - codeparrot_training - Step 5764: {'lr': 0.00048463670910005004, 'samples': 2951680, 'steps': 5764, 'batch_loss/train': 0.8269897382706404} 12/23/2021 17:53:04 - INFO - codeparrot_training - Step 5765: {'lr': 0.00048463126795786674, 'samples': 2952192, 'steps': 5765, 'batch_loss/train': 0.8345305370166898} 12/23/2021 17:53:15 - INFO - codeparrot_training - Step 5766: {'lr': 0.00048462582588287807, 'samples': 2952704, 'steps': 5766, 'batch_loss/train': 0.8199377274140716} 12/23/2021 17:53:25 - INFO - codeparrot_training - Step 5767: {'lr': 0.0004846203828751057, 'samples': 2953216, 'steps': 5767, 'batch_loss/train': 0.8685622564516962} 12/23/2021 17:53:38 - INFO - codeparrot_training - Step 5768: {'lr': 0.00048461493893457133, 'samples': 2953728, 'steps': 5768, 'batch_loss/train': 0.8400259651243687} 12/23/2021 17:53:48 - INFO - codeparrot_training - Step 5769: {'lr': 
0.0004846094940612965, 'samples': 2954240, 'steps': 5769, 'batch_loss/train': 0.8218321073800325} 12/23/2021 17:53:59 - INFO - codeparrot_training - Step 5770: {'lr': 0.00048460404825530294, 'samples': 2954752, 'steps': 5770, 'batch_loss/train': 0.9200469302013516} 12/23/2021 17:54:11 - INFO - codeparrot_training - Step 5771: {'lr': 0.0004845986015166123, 'samples': 2955264, 'steps': 5771, 'batch_loss/train': 0.8687302637845278} 12/23/2021 17:54:22 - INFO - codeparrot_training - Step 5772: {'lr': 0.0004845931538452462, 'samples': 2955776, 'steps': 5772, 'batch_loss/train': 0.7901577027514577} 12/23/2021 17:54:32 - INFO - codeparrot_training - Step 5773: {'lr': 0.0004845877052412262, 'samples': 2956288, 'steps': 5773, 'batch_loss/train': 0.751711543649435} 12/23/2021 17:54:44 - INFO - codeparrot_training - Step 5774: {'lr': 0.0004845822557045741, 'samples': 2956800, 'steps': 5774, 'batch_loss/train': 0.9358236128464341} 12/23/2021 17:54:55 - INFO - codeparrot_training - Step 5775: {'lr': 0.00048457680523531155, 'samples': 2957312, 'steps': 5775, 'batch_loss/train': 0.8457398852333426} 12/23/2021 17:55:05 - INFO - codeparrot_training - Step 5776: {'lr': 0.0004845713538334602, 'samples': 2957824, 'steps': 5776, 'batch_loss/train': 0.8078504307195544} 12/23/2021 17:55:16 - INFO - codeparrot_training - Step 5777: {'lr': 0.00048456590149904166, 'samples': 2958336, 'steps': 5777, 'batch_loss/train': 0.8546206876635551} 12/23/2021 17:55:30 - INFO - codeparrot_training - Step 5778: {'lr': 0.0004845604482320778, 'samples': 2958848, 'steps': 5778, 'batch_loss/train': 0.865051562897861} 12/23/2021 17:55:41 - INFO - codeparrot_training - Step 5779: {'lr': 0.00048455499403259007, 'samples': 2959360, 'steps': 5779, 'batch_loss/train': 0.8522650999948382} 12/23/2021 17:55:51 - INFO - codeparrot_training - Step 5780: {'lr': 0.0004845495389006002, 'samples': 2959872, 'steps': 5780, 'batch_loss/train': 0.8406418971717358} 12/23/2021 17:56:03 - INFO - codeparrot_training - Step 5781: 
{'lr': 0.00048454408283613, 'samples': 2960384, 'steps': 5781, 'batch_loss/train': 0.8683620989322662} 12/23/2021 17:56:14 - INFO - codeparrot_training - Step 5782: {'lr': 0.00048453862583920106, 'samples': 2960896, 'steps': 5782, 'batch_loss/train': 0.8906156211160123} 12/23/2021 17:56:24 - INFO - codeparrot_training - Step 5783: {'lr': 0.00048453316790983517, 'samples': 2961408, 'steps': 5783, 'batch_loss/train': 0.9435799550265074} 12/23/2021 17:56:38 - INFO - codeparrot_training - Step 5784: {'lr': 0.0004845277090480539, 'samples': 2961920, 'steps': 5784, 'batch_loss/train': 0.7738277893513441} 12/23/2021 17:56:49 - INFO - codeparrot_training - Step 5785: {'lr': 0.000484522249253879, 'samples': 2962432, 'steps': 5785, 'batch_loss/train': 0.8862414290197194} 12/23/2021 17:56:59 - INFO - codeparrot_training - Step 5786: {'lr': 0.0004845167885273322, 'samples': 2962944, 'steps': 5786, 'batch_loss/train': 0.9296642746776342} 12/23/2021 17:57:10 - INFO - codeparrot_training - Step 5787: {'lr': 0.0004845113268684351, 'samples': 2963456, 'steps': 5787, 'batch_loss/train': 0.7956486102193594} 12/23/2021 17:57:22 - INFO - codeparrot_training - Step 5788: {'lr': 0.0004845058642772097, 'samples': 2963968, 'steps': 5788, 'batch_loss/train': 0.8849366893991828} 12/23/2021 17:57:33 - INFO - codeparrot_training - Step 5789: {'lr': 0.0004845004007536774, 'samples': 2964480, 'steps': 5789, 'batch_loss/train': 0.8426930643618107} 12/23/2021 17:57:43 - INFO - codeparrot_training - Step 5790: {'lr': 0.0004844949362978601, 'samples': 2964992, 'steps': 5790, 'batch_loss/train': 0.6134842461906374} 12/23/2021 17:57:55 - INFO - codeparrot_training - Step 5791: {'lr': 0.00048448947090977944, 'samples': 2965504, 'steps': 5791, 'batch_loss/train': 0.844102589180693} 12/23/2021 17:58:06 - INFO - codeparrot_training - Step 5792: {'lr': 0.0004844840045894573, 'samples': 2966016, 'steps': 5792, 'batch_loss/train': 0.962216068059206} 12/23/2021 17:58:17 - INFO - codeparrot_training - Step 
5793: {'lr': 0.00048447853733691514, 'samples': 2966528, 'steps': 5793, 'batch_loss/train': 0.7510251980274916} 12/23/2021 17:58:31 - INFO - codeparrot_training - Step 5794: {'lr': 0.00048447306915217495, 'samples': 2967040, 'steps': 5794, 'batch_loss/train': 0.7968386858701706} 12/23/2021 17:58:41 - INFO - codeparrot_training - Step 5795: {'lr': 0.0004844676000352584, 'samples': 2967552, 'steps': 5795, 'batch_loss/train': 0.9431507186964154} 12/23/2021 17:58:52 - INFO - codeparrot_training - Step 5796: {'lr': 0.0004844621299861871, 'samples': 2968064, 'steps': 5796, 'batch_loss/train': 0.8042679475620389} 12/23/2021 17:59:02 - INFO - codeparrot_training - Step 5797: {'lr': 0.00048445665900498293, 'samples': 2968576, 'steps': 5797, 'batch_loss/train': 0.8323586732149124} 12/23/2021 17:59:14 - INFO - codeparrot_training - Step 5798: {'lr': 0.0004844511870916676, 'samples': 2969088, 'steps': 5798, 'batch_loss/train': 0.8965103412047029} 12/23/2021 17:59:25 - INFO - codeparrot_training - Step 5799: {'lr': 0.00048444571424626295, 'samples': 2969600, 'steps': 5799, 'batch_loss/train': 0.8718155124224722} 12/23/2021 17:59:35 - INFO - codeparrot_training - Step 5800: {'lr': 0.0004844402404687906, 'samples': 2970112, 'steps': 5800, 'batch_loss/train': 0.7887903396040201} 12/23/2021 17:59:49 - INFO - codeparrot_training - Step 5801: {'lr': 0.00048443476575927235, 'samples': 2970624, 'steps': 5801, 'batch_loss/train': 0.8299019364640117} 12/23/2021 18:00:00 - INFO - codeparrot_training - Step 5802: {'lr': 0.00048442929011773006, 'samples': 2971136, 'steps': 5802, 'batch_loss/train': 0.8887974796816707} 12/23/2021 18:00:10 - INFO - codeparrot_training - Step 5803: {'lr': 0.0004844238135441854, 'samples': 2971648, 'steps': 5803, 'batch_loss/train': 0.8258345681242645} 12/23/2021 18:00:22 - INFO - codeparrot_training - Step 5804: {'lr': 0.00048441833603866014, 'samples': 2972160, 'steps': 5804, 'batch_loss/train': 0.8969822460785508} 12/23/2021 18:00:33 - INFO - 
codeparrot_training - Step 5805: {'lr': 0.00048441285760117615, 'samples': 2972672, 'steps': 5805, 'batch_loss/train': 0.8619542452506721} 12/23/2021 18:00:44 - INFO - codeparrot_training - Step 5806: {'lr': 0.0004844073782317552, 'samples': 2973184, 'steps': 5806, 'batch_loss/train': 0.9738603178411722} 12/23/2021 18:00:56 - INFO - codeparrot_training - Step 5807: {'lr': 0.0004844018979304189, 'samples': 2973696, 'steps': 5807, 'batch_loss/train': 0.8271306455135345} 12/23/2021 18:01:06 - INFO - codeparrot_training - Step 5808: {'lr': 0.0004843964166971892, 'samples': 2974208, 'steps': 5808, 'batch_loss/train': 0.9077390963211656} 12/23/2021 18:01:17 - INFO - codeparrot_training - Step 5809: {'lr': 0.0004843909345320878, 'samples': 2974720, 'steps': 5809, 'batch_loss/train': 0.8486774093471467} 12/23/2021 18:01:27 - INFO - codeparrot_training - Step 5810: {'lr': 0.00048438545143513666, 'samples': 2975232, 'steps': 5810, 'batch_loss/train': 1.0926573490723968} 12/23/2021 18:01:40 - INFO - codeparrot_training - Step 5811: {'lr': 0.0004843799674063574, 'samples': 2975744, 'steps': 5811, 'batch_loss/train': 0.8336267080157995} 12/23/2021 18:01:50 - INFO - codeparrot_training - Step 5812: {'lr': 0.00048437448244577184, 'samples': 2976256, 'steps': 5812, 'batch_loss/train': 0.8439084365963936} 12/23/2021 18:02:01 - INFO - codeparrot_training - Step 5813: {'lr': 0.0004843689965534019, 'samples': 2976768, 'steps': 5813, 'batch_loss/train': 0.8429314438253641} 12/23/2021 18:02:15 - INFO - codeparrot_training - Step 5814: {'lr': 0.00048436350972926934, 'samples': 2977280, 'steps': 5814, 'batch_loss/train': 0.8312474051490426} 12/23/2021 18:02:26 - INFO - codeparrot_training - Step 5815: {'lr': 0.000484358021973396, 'samples': 2977792, 'steps': 5815, 'batch_loss/train': 0.9268621187657118} 12/23/2021 18:02:36 - INFO - codeparrot_training - Step 5816: {'lr': 0.00048435253328580355, 'samples': 2978304, 'steps': 5816, 'batch_loss/train': 0.8216818887740374} 12/23/2021 18:02:48 
- INFO - codeparrot_training - Step 5817: {'lr': 0.0004843470436665139, 'samples': 2978816, 'steps': 5817, 'batch_loss/train': 0.7420337898656726}
12/23/2021 18:02:59 - INFO - codeparrot_training - Step 5818: {'lr': 0.00048434155311554897, 'samples': 2979328, 'steps': 5818, 'batch_loss/train': 0.8367470325902104}
12/23/2021 18:03:09 - INFO - codeparrot_training - Step 5819: {'lr': 0.0004843360616329305, 'samples': 2979840, 'steps': 5819, 'batch_loss/train': 0.7958736615255475}
12/23/2021 18:03:20 - INFO - codeparrot_training - Step 5820: {'lr': 0.0004843305692186804, 'samples': 2980352, 'steps': 5820, 'batch_loss/train': 0.8476417800411582}
12/23/2021 18:03:32 - INFO - codeparrot_training - Step 5821: {'lr': 0.00048432507587282026, 'samples': 2980864, 'steps': 5821, 'batch_loss/train': 0.9120648400858045}
12/23/2021 18:03:43 - INFO - codeparrot_training - Step 5822: {'lr': 0.00048431958159537216, 'samples': 2981376, 'steps': 5822, 'batch_loss/train': 0.9358613155782223}
12/23/2021 18:03:53 - INFO - codeparrot_training - Step 5823: {'lr': 0.00048431408638635793, 'samples': 2981888, 'steps': 5823, 'batch_loss/train': 0.7689091609790921}
12/23/2021 18:04:07 - INFO - codeparrot_training - Step 5824: {'lr': 0.0004843085902457993, 'samples': 2982400, 'steps': 5824, 'batch_loss/train': 0.891480048187077}
12/23/2021 18:04:17 - INFO - codeparrot_training - Step 5825: {'lr': 0.00048430309317371825, 'samples': 2982912, 'steps': 5825, 'batch_loss/train': 0.8775230934843421}
12/23/2021 18:04:28 - INFO - codeparrot_training - Step 5826: {'lr': 0.0004842975951701366, 'samples': 2983424, 'steps': 5826, 'batch_loss/train': 0.8653912174049765}
12/23/2021 18:04:40 - INFO - codeparrot_training - Step 5827: {'lr': 0.0004842920962350761, 'samples': 2983936, 'steps': 5827, 'batch_loss/train': 0.8171444504987448}
12/23/2021 18:04:51 - INFO - codeparrot_training - Step 5828: {'lr': 0.0004842865963685587, 'samples': 2984448, 'steps': 5828, 'batch_loss/train': 0.8398151290602982}
12/23/2021 18:05:01 - INFO - codeparrot_training - Step 5829: {'lr': 0.0004842810955706063, 'samples': 2984960, 'steps': 5829, 'batch_loss/train': 0.8316138354130089}
12/23/2021 18:05:15 - INFO - codeparrot_training - Step 5830: {'lr': 0.00048427559384124066, 'samples': 2985472, 'steps': 5830, 'batch_loss/train': 0.9397130000870675}
12/23/2021 18:05:26 - INFO - codeparrot_training - Step 5831: {'lr': 0.0004842700911804838, 'samples': 2985984, 'steps': 5831, 'batch_loss/train': 0.7919710800051689}
12/23/2021 18:05:36 - INFO - codeparrot_training - Step 5832: {'lr': 0.00048426458758835746, 'samples': 2986496, 'steps': 5832, 'batch_loss/train': 0.880448171403259}
12/23/2021 18:05:47 - INFO - codeparrot_training - Step 5833: {'lr': 0.0004842590830648836, 'samples': 2987008, 'steps': 5833, 'batch_loss/train': 0.6716298459796235}
12/23/2021 18:05:59 - INFO - codeparrot_training - Step 5834: {'lr': 0.00048425357761008403, 'samples': 2987520, 'steps': 5834, 'batch_loss/train': 0.8312158337794244}
12/23/2021 18:06:09 - INFO - codeparrot_training - Step 5835: {'lr': 0.0004842480712239807, 'samples': 2988032, 'steps': 5835, 'batch_loss/train': 0.8833014294505119}
12/23/2021 18:06:20 - INFO - codeparrot_training - Step 5836: {'lr': 0.00048424256390659553, 'samples': 2988544, 'steps': 5836, 'batch_loss/train': 0.8235817458480597}
12/23/2021 18:06:32 - INFO - codeparrot_training - Step 5837: {'lr': 0.0004842370556579504, 'samples': 2989056, 'steps': 5837, 'batch_loss/train': 0.7049318433273584}
12/23/2021 18:06:42 - INFO - codeparrot_training - Step 5838: {'lr': 0.0004842315464780671, 'samples': 2989568, 'steps': 5838, 'batch_loss/train': 0.9805558985099196}
12/23/2021 18:06:53 - INFO - codeparrot_training - Step 5839: {'lr': 0.00048422603636696763, 'samples': 2990080, 'steps': 5839, 'batch_loss/train': 0.9872951870784163}
12/23/2021 18:07:08 - INFO - codeparrot_training - Step 5840: {'lr': 0.00048422052532467395, 'samples': 2990592, 'steps': 5840, 'batch_loss/train': 1.0052135940641165}
12/23/2021 18:07:19 - INFO - codeparrot_training - Step 5841: {'lr': 0.00048421501335120775, 'samples': 2991104, 'steps': 5841, 'batch_loss/train': 3.9218246303498745}
12/23/2021 18:07:29 - INFO - codeparrot_training - Step 5842: {'lr': 0.0004842095004465912, 'samples': 2991616, 'steps': 5842, 'batch_loss/train': 0.8735242439433932}
12/23/2021 18:07:40 - INFO - codeparrot_training - Step 5843: {'lr': 0.00048420398661084613, 'samples': 2992128, 'steps': 5843, 'batch_loss/train': 0.886126521974802}
12/23/2021 18:07:52 - INFO - codeparrot_training - Step 5844: {'lr': 0.00048419847184399446, 'samples': 2992640, 'steps': 5844, 'batch_loss/train': 0.9197756946086884}
12/23/2021 18:08:02 - INFO - codeparrot_training - Step 5845: {'lr': 0.00048419295614605804, 'samples': 2993152, 'steps': 5845, 'batch_loss/train': 0.9648728920146823}
12/23/2021 18:08:13 - INFO - codeparrot_training - Step 5846: {'lr': 0.0004841874395170588, 'samples': 2993664, 'steps': 5846, 'batch_loss/train': 0.875336735509336}
12/23/2021 18:08:25 - INFO - codeparrot_training - Step 5847: {'lr': 0.00048418192195701884, 'samples': 2994176, 'steps': 5847, 'batch_loss/train': 0.7077372325584292}
12/23/2021 18:08:36 - INFO - codeparrot_training - Step 5848: {'lr': 0.0004841764034659599, 'samples': 2994688, 'steps': 5848, 'batch_loss/train': 0.6185529751237482}
12/23/2021 18:08:46 - INFO - codeparrot_training - Step 5849: {'lr': 0.000484170884043904, 'samples': 2995200, 'steps': 5849, 'batch_loss/train': 0.7704075709916651}
12/23/2021 18:08:59 - INFO - codeparrot_training - Step 5850: {'lr': 0.00048416536369087315, 'samples': 2995712, 'steps': 5850, 'batch_loss/train': 0.8455570274963975}
12/23/2021 18:09:09 - INFO - codeparrot_training - Step 5851: {'lr': 0.00048415984240688916, 'samples': 2996224, 'steps': 5851, 'batch_loss/train': 0.8689518561586738}
12/23/2021 18:09:20 - INFO - codeparrot_training - Step 5852: {'lr': 0.00048415432019197416, 'samples': 2996736, 'steps': 5852, 'batch_loss/train': 0.8580337492749095}
12/23/2021 18:09:31 - INFO - codeparrot_training - Step 5853: {'lr': 0.0004841487970461499, 'samples': 2997248, 'steps': 5853, 'batch_loss/train': 0.8797200359404087}
12/23/2021 18:09:44 - INFO - codeparrot_training - Step 5854: {'lr': 0.0004841432729694384, 'samples': 2997760, 'steps': 5854, 'batch_loss/train': 0.8501955345273018}
12/23/2021 18:09:55 - INFO - codeparrot_training - Step 5855: {'lr': 0.0004841377479618617, 'samples': 2998272, 'steps': 5855, 'batch_loss/train': 0.8610369497910142}
12/23/2021 18:10:05 - INFO - codeparrot_training - Step 5856: {'lr': 0.00048413222202344175, 'samples': 2998784, 'steps': 5856, 'batch_loss/train': 0.8796071424148977}
12/23/2021 18:10:18 - INFO - codeparrot_training - Step 5857: {'lr': 0.0004841266951542005, 'samples': 2999296, 'steps': 5857, 'batch_loss/train': 0.73899145424366}
12/23/2021 18:10:28 - INFO - codeparrot_training - Step 5858: {'lr': 0.0004841211673541599, 'samples': 2999808, 'steps': 5858, 'batch_loss/train': 0.7185219149105251}
12/23/2021 18:10:39 - INFO - codeparrot_training - Step 5859: {'lr': 0.00048411563862334195, 'samples': 3000320, 'steps': 5859, 'batch_loss/train': 0.6403381248237565}
12/23/2021 18:10:51 - INFO - codeparrot_training - Step 5860: {'lr': 0.00048411010896176853, 'samples': 3000832, 'steps': 5860, 'batch_loss/train': 0.7100799726322293}
12/23/2021 18:11:02 - INFO - codeparrot_training - Step 5861: {'lr': 0.0004841045783694618, 'samples': 3001344, 'steps': 5861, 'batch_loss/train': 0.8244436588138342}
12/23/2021 18:11:12 - INFO - codeparrot_training - Step 5862: {'lr': 0.0004840990468464437, 'samples': 3001856, 'steps': 5862, 'batch_loss/train': 0.7207987507572398}
12/23/2021 18:11:26 - INFO - codeparrot_training - Step 5863: {'lr': 0.0004840935143927361, 'samples': 3002368, 'steps': 5863, 'batch_loss/train': 0.926836606580764}
12/23/2021 18:11:36 - INFO - codeparrot_training - Step 5864: {'lr': 0.00048408798100836104, 'samples': 3002880, 'steps': 5864, 'batch_loss/train': 0.7815721521619707}
12/23/2021 18:11:47 - INFO - codeparrot_training - Step 5865: {'lr': 0.0004840824466933407, 'samples': 3003392, 'steps': 5865, 'batch_loss/train': 0.868167910259217}
12/23/2021 18:11:58 - INFO - codeparrot_training - Step 5866: {'lr': 0.00048407691144769684, 'samples': 3003904, 'steps': 5866, 'batch_loss/train': 0.810678671579808}
12/23/2021 18:12:10 - INFO - codeparrot_training - Step 5867: {'lr': 0.00048407137527145154, 'samples': 3004416, 'steps': 5867, 'batch_loss/train': 0.648551924386993}
12/23/2021 18:12:20 - INFO - codeparrot_training - Step 5868: {'lr': 0.0004840658381646269, 'samples': 3004928, 'steps': 5868, 'batch_loss/train': 0.8872129004448652}
12/23/2021 18:12:31 - INFO - codeparrot_training - Step 5869: {'lr': 0.0004840603001272448, 'samples': 3005440, 'steps': 5869, 'batch_loss/train': 0.7358945580199361}
12/23/2021 18:12:45 - INFO - codeparrot_training - Step 5870: {'lr': 0.00048405476115932734, 'samples': 3005952, 'steps': 5870, 'batch_loss/train': 0.7914250721223652}
12/23/2021 18:12:55 - INFO - codeparrot_training - Step 5871: {'lr': 0.00048404922126089655, 'samples': 3006464, 'steps': 5871, 'batch_loss/train': 0.9335980229079723}
12/23/2021 18:13:06 - INFO - codeparrot_training - Step 5872: {'lr': 0.00048404368043197446, 'samples': 3006976, 'steps': 5872, 'batch_loss/train': 0.7655609147623181}
12/23/2021 18:13:18 - INFO - codeparrot_training - Step 5873: {'lr': 0.0004840381386725831, 'samples': 3007488, 'steps': 5873, 'batch_loss/train': 0.8234609849750996}
12/23/2021 18:13:28 - INFO - codeparrot_training - Step 5874: {'lr': 0.00048403259598274435, 'samples': 3008000, 'steps': 5874, 'batch_loss/train': 0.8895758548751473}
12/23/2021 18:13:39 - INFO - codeparrot_training - Step 5875: {'lr': 0.0004840270523624804, 'samples': 3008512, 'steps': 5875, 'batch_loss/train': 0.795543879037723}
12/23/2021 18:13:49 - INFO - codeparrot_training - Step 5876: {'lr': 0.00048402150781181333, 'samples': 3009024, 'steps': 5876, 'batch_loss/train': 0.8219110975041986}
12/23/2021 18:14:02 - INFO - codeparrot_training - Step 5877: {'lr': 0.00048401596233076506, 'samples': 3009536, 'steps': 5877, 'batch_loss/train': 0.8186393054202199}
12/23/2021 18:14:12 - INFO - codeparrot_training - Step 5878: {'lr': 0.00048401041591935765, 'samples': 3010048, 'steps': 5878, 'batch_loss/train': 0.8694507135078311}
12/23/2021 18:14:23 - INFO - codeparrot_training - Step 5879: {'lr': 0.00048400486857761316, 'samples': 3010560, 'steps': 5879, 'batch_loss/train': 0.9220739644952118}
12/23/2021 18:14:36 - INFO - codeparrot_training - Step 5880: {'lr': 0.00048399932030555377, 'samples': 3011072, 'steps': 5880, 'batch_loss/train': 0.8920849887654185}
12/23/2021 18:14:47 - INFO - codeparrot_training - Step 5881: {'lr': 0.0004839937711032014, 'samples': 3011584, 'steps': 5881, 'batch_loss/train': 0.8300386248156428}
12/23/2021 18:14:57 - INFO - codeparrot_training - Step 5882: {'lr': 0.00048398822097057813, 'samples': 3012096, 'steps': 5882, 'batch_loss/train': 0.8608750870334916}
12/23/2021 18:15:10 - INFO - codeparrot_training - Step 5883: {'lr': 0.000483982669907706, 'samples': 3012608, 'steps': 5883, 'batch_loss/train': 0.5857427216833457}
12/23/2021 18:15:21 - INFO - codeparrot_training - Step 5884: {'lr': 0.0004839771179146072, 'samples': 3013120, 'steps': 5884, 'batch_loss/train': 0.8927373439073563}
12/23/2021 18:15:31 - INFO - codeparrot_training - Step 5885: {'lr': 0.0004839715649913037, 'samples': 3013632, 'steps': 5885, 'batch_loss/train': 0.9143444001674652}
12/23/2021 18:15:42 - INFO - codeparrot_training - Step 5886: {'lr': 0.0004839660111378176, 'samples': 3014144, 'steps': 5886, 'batch_loss/train': 1.088699326151982}
12/23/2021 18:15:54 - INFO - codeparrot_training - Step 5887: {'lr': 0.000483960456354171, 'samples': 3014656, 'steps': 5887, 'batch_loss/train': 0.8673433605581522}
12/23/2021 18:16:04 - INFO - codeparrot_training - Step 5888: {'lr': 0.00048395490064038593, 'samples': 3015168, 'steps': 5888, 'batch_loss/train': 0.802117079962045}
12/23/2021 18:16:15 - INFO - codeparrot_training - Step 5889: {'lr': 0.00048394934399648463, 'samples': 3015680, 'steps': 5889, 'batch_loss/train': 0.7741905804723501}
12/23/2021 18:16:27 - INFO - codeparrot_training - Step 5890: {'lr': 0.00048394378642248895, 'samples': 3016192, 'steps': 5890, 'batch_loss/train': 0.8539569918066263}
12/23/2021 18:16:38 - INFO - codeparrot_training - Step 5891: {'lr': 0.0004839382279184211, 'samples': 3016704, 'steps': 5891, 'batch_loss/train': 0.8418521154671907}
12/23/2021 18:16:48 - INFO - codeparrot_training - Step 5892: {'lr': 0.00048393266848430325, 'samples': 3017216, 'steps': 5892, 'batch_loss/train': 1.0218718964606524}
12/23/2021 18:17:02 - INFO - codeparrot_training - Step 5893: {'lr': 0.0004839271081201574, 'samples': 3017728, 'steps': 5893, 'batch_loss/train': 0.7163914463017136}
12/23/2021 18:17:13 - INFO - codeparrot_training - Step 5894: {'lr': 0.0004839215468260057, 'samples': 3018240, 'steps': 5894, 'batch_loss/train': 0.7738532070070505}
12/23/2021 18:17:23 - INFO - codeparrot_training - Step 5895: {'lr': 0.00048391598460187027, 'samples': 3018752, 'steps': 5895, 'batch_loss/train': 0.8983491808176041}
12/23/2021 18:17:34 - INFO - codeparrot_training - Step 5896: {'lr': 0.0004839104214477732, 'samples': 3019264, 'steps': 5896, 'batch_loss/train': 0.7808998366817832}
12/23/2021 18:17:46 - INFO - codeparrot_training - Step 5897: {'lr': 0.00048390485736373665, 'samples': 3019776, 'steps': 5897, 'batch_loss/train': 0.8678877027705312}
12/23/2021 18:17:57 - INFO - codeparrot_training - Step 5898: {'lr': 0.00048389929234978266, 'samples': 3020288, 'steps': 5898, 'batch_loss/train': 0.6519565869821236}
12/23/2021 18:18:07 - INFO - codeparrot_training - Step 5899: {'lr': 0.00048389372640593345, 'samples': 3020800, 'steps': 5899, 'batch_loss/train': 0.8964795849751681}
12/23/2021 18:18:21 - INFO - codeparrot_training - Step 5900: {'lr': 0.000483888159532211, 'samples': 3021312, 'steps': 5900, 'batch_loss/train': 0.7515653483569622}
12/23/2021 18:18:31 - INFO - codeparrot_training - Step 5901: {'lr': 0.00048388259172863765, 'samples': 3021824, 'steps': 5901, 'batch_loss/train': 0.8825102504342794}
12/23/2021 18:18:42 - INFO - codeparrot_training - Step 5902: {'lr': 0.0004838770229952354, 'samples': 3022336, 'steps': 5902, 'batch_loss/train': 0.7940139025449753}
12/23/2021 18:18:54 - INFO - codeparrot_training - Step 5903: {'lr': 0.0004838714533320263, 'samples': 3022848, 'steps': 5903, 'batch_loss/train': 0.9552051154896617}
12/23/2021 18:19:04 - INFO - codeparrot_training - Step 5904: {'lr': 0.00048386588273903274, 'samples': 3023360, 'steps': 5904, 'batch_loss/train': 0.8947020824998617}
12/23/2021 18:19:15 - INFO - codeparrot_training - Step 5905: {'lr': 0.0004838603112162767, 'samples': 3023872, 'steps': 5905, 'batch_loss/train': 0.7900797035545111}
12/23/2021 18:19:26 - INFO - codeparrot_training - Step 5906: {'lr': 0.0004838547387637804, 'samples': 3024384, 'steps': 5906, 'batch_loss/train': 0.8998640021309257}
12/23/2021 18:19:38 - INFO - codeparrot_training - Step 5907: {'lr': 0.00048384916538156587, 'samples': 3024896, 'steps': 5907, 'batch_loss/train': 0.8650648687034845}
12/23/2021 18:19:48 - INFO - codeparrot_training - Step 5908: {'lr': 0.0004838435910696555, 'samples': 3025408, 'steps': 5908, 'batch_loss/train': 0.7423968308139592}
12/23/2021 18:19:59 - INFO - codeparrot_training - Step 5909: {'lr': 0.00048383801582807117, 'samples': 3025920, 'steps': 5909, 'batch_loss/train': 0.6298225386999547}
12/23/2021 18:20:12 - INFO - codeparrot_training - Step 5910: {'lr': 0.0004838324396568352, 'samples': 3026432, 'steps': 5910, 'batch_loss/train': 0.9363216264173388}
12/23/2021 18:20:23 - INFO - codeparrot_training - Step 5911: {'lr': 0.00048382686255596976, 'samples': 3026944, 'steps': 5911, 'batch_loss/train': 0.935461837798357}
12/23/2021 18:20:33 - INFO - codeparrot_training - Step 5912: {'lr': 0.000483821284525497, 'samples': 3027456, 'steps': 5912, 'batch_loss/train': 0.7423773258924484}
12/23/2021 18:20:45 - INFO - codeparrot_training - Step 5913: {'lr': 0.0004838157055654392, 'samples': 3027968, 'steps': 5913, 'batch_loss/train': 0.8543748846277595}
12/23/2021 18:20:56 - INFO - codeparrot_training - Step 5914: {'lr': 0.00048381012567581827, 'samples': 3028480, 'steps': 5914, 'batch_loss/train': 0.8287508133798838}
12/23/2021 18:21:07 - INFO - codeparrot_training - Step 5915: {'lr': 0.0004838045448566567, 'samples': 3028992, 'steps': 5915, 'batch_loss/train': 0.7635943321511149}
12/23/2021 18:21:17 - INFO - codeparrot_training - Step 5916: {'lr': 0.0004837989631079764, 'samples': 3029504, 'steps': 5916, 'batch_loss/train': 0.9069278221577406}
12/23/2021 18:21:29 - INFO - codeparrot_training - Step 5917: {'lr': 0.00048379338042979985, 'samples': 3030016, 'steps': 5917, 'batch_loss/train': 0.8368039270862937}
12/23/2021 18:21:40 - INFO - codeparrot_training - Step 5918: {'lr': 0.000483787796822149, 'samples': 3030528, 'steps': 5918, 'batch_loss/train': 0.8643777701072395}
12/23/2021 18:21:50 - INFO - codeparrot_training - Step 5919: {'lr': 0.00048378221228504624, 'samples': 3031040, 'steps': 5919, 'batch_loss/train': 0.890282591804862}
12/23/2021 18:22:02 - INFO - codeparrot_training - Step 5920: {'lr': 0.00048377662681851365, 'samples': 3031552, 'steps': 5920, 'batch_loss/train': 0.8312801364809275}
12/23/2021 18:22:13 - INFO - codeparrot_training - Step 5921: {'lr': 0.00048377104042257336, 'samples': 3032064, 'steps': 5921, 'batch_loss/train': 0.8321523237973452}
12/23/2021 18:22:23 - INFO - codeparrot_training - Step 5922: {'lr': 0.0004837654530972478, 'samples': 3032576, 'steps': 5922, 'batch_loss/train': 0.7769948198692873}
12/23/2021 18:22:37 - INFO - codeparrot_training - Step 5923: {'lr': 0.000483759864842559, 'samples': 3033088, 'steps': 5923, 'batch_loss/train': 0.8089629737660289}
12/23/2021 18:22:48 - INFO - codeparrot_training - Step 5924: {'lr': 0.00048375427565852924, 'samples': 3033600, 'steps': 5924, 'batch_loss/train': 0.8336866507306695}
12/23/2021 18:22:58 - INFO - codeparrot_training - Step 5925: {'lr': 0.00048374868554518083, 'samples': 3034112, 'steps': 5925, 'batch_loss/train': 0.8991731656715274}
12/23/2021 18:23:10 - INFO - codeparrot_training - Step 5926: {'lr': 0.0004837430945025358, 'samples': 3034624, 'steps': 5926, 'batch_loss/train': 0.8068174128420651}
12/23/2021 18:23:21 - INFO - codeparrot_training - Step 5927: {'lr': 0.00048373750253061655, 'samples': 3035136, 'steps': 5927, 'batch_loss/train': 0.8222529571503401}
12/23/2021 18:23:32 - INFO - codeparrot_training - Step 5928: {'lr': 0.00048373190962944517, 'samples': 3035648, 'steps': 5928, 'batch_loss/train': 0.9597984892316163}
12/23/2021 18:23:42 - INFO - codeparrot_training - Step 5929: {'lr': 0.000483726315799044, 'samples': 3036160, 'steps': 5929, 'batch_loss/train': 0.696131584700197}
12/23/2021 18:23:56 - INFO - codeparrot_training - Step 5930: {'lr': 0.00048372072103943534, 'samples': 3036672, 'steps': 5930, 'batch_loss/train': 0.8417520001530647}
12/23/2021 18:24:07 - INFO - codeparrot_training - Step 5931: {'lr': 0.00048371512535064123, 'samples': 3037184, 'steps': 5931, 'batch_loss/train': 1.3293260484933853}
12/23/2021 18:24:17 - INFO - codeparrot_training - Step 5932: {'lr': 0.00048370952873268407, 'samples': 3037696, 'steps': 5932, 'batch_loss/train': 0.8406922807916999}
12/23/2021 18:24:29 - INFO - codeparrot_training - Step 5933: {'lr': 0.0004837039311855861, 'samples': 3038208, 'steps': 5933, 'batch_loss/train': 0.7413743166252971}
12/23/2021 18:24:40 - INFO - codeparrot_training - Step 5934: {'lr': 0.00048369833270936946, 'samples': 3038720, 'steps': 5934, 'batch_loss/train': 0.8279759557917714}
12/23/2021 18:24:51 - INFO - codeparrot_training - Step 5935: {'lr': 0.0004836927333040566, 'samples': 3039232, 'steps': 5935, 'batch_loss/train': 0.8022311292588711}
12/23/2021 18:25:01 - INFO - codeparrot_training - Step 5936: {'lr': 0.0004836871329696696, 'samples': 3039744, 'steps': 5936, 'batch_loss/train': 0.6844540894962847}
12/23/2021 18:25:13 - INFO - codeparrot_training - Step 5937: {'lr': 0.0004836815317062308, 'samples': 3040256, 'steps': 5937, 'batch_loss/train': 0.8421481447294354}
12/23/2021 18:25:24 - INFO - codeparrot_training - Step 5938: {'lr': 0.0004836759295137625, 'samples': 3040768, 'steps': 5938, 'batch_loss/train': 0.727612632792443}
12/23/2021 18:25:35 - INFO - codeparrot_training - Step 5939: {'lr': 0.0004836703263922869, 'samples': 3041280, 'steps': 5939, 'batch_loss/train': 0.8579008253291249}
12/23/2021 18:25:48 - INFO - codeparrot_training - Step 5940: {'lr': 0.00048366472234182635, 'samples': 3041792, 'steps': 5940, 'batch_loss/train': 0.8456213781610131}
12/23/2021 18:25:59 - INFO - codeparrot_training - Step 5941: {'lr': 0.00048365911736240305, 'samples': 3042304, 'steps': 5941, 'batch_loss/train': 0.9164325152523816}
12/23/2021 18:26:10 - INFO - codeparrot_training - Step 5942: {'lr': 0.00048365351145403934, 'samples': 3042816, 'steps': 5942, 'batch_loss/train': 0.9445583280175924}
12/23/2021 18:26:22 - INFO - codeparrot_training - Step 5943: {'lr': 0.00048364790461675745, 'samples': 3043328, 'steps': 5943, 'batch_loss/train': 1.048178169876337}
12/23/2021 18:26:32 - INFO - codeparrot_training - Step 5944: {'lr': 0.0004836422968505798, 'samples': 3043840, 'steps': 5944, 'batch_loss/train': 0.9658558396622539}
12/23/2021 18:26:43 - INFO - codeparrot_training - Step 5945: {'lr': 0.0004836366881555285, 'samples': 3044352, 'steps': 5945, 'batch_loss/train': 0.8396646352484822}
12/23/2021 18:26:55 - INFO - codeparrot_training - Step 5946: {'lr': 0.000483631078531626, 'samples': 3044864, 'steps': 5946, 'batch_loss/train': 0.8205954618752003}
12/23/2021 18:27:05 - INFO - codeparrot_training - Step 5947: {'lr': 0.00048362546797889463, 'samples': 3045376, 'steps': 5947, 'batch_loss/train': 0.911721358075738}
12/23/2021 18:27:16 - INFO - codeparrot_training - Step 5948: {'lr': 0.0004836198564973565, 'samples': 3045888, 'steps': 5948, 'batch_loss/train': 1.050477771088481}
12/23/2021 18:27:27 - INFO - codeparrot_training - Step 5949: {'lr': 0.00048361424408703405, 'samples': 3046400, 'steps': 5949, 'batch_loss/train': 0.9376128036528826}
12/23/2021 18:27:40 - INFO - codeparrot_training - Step 5950: {'lr': 0.0004836086307479496, 'samples': 3046912, 'steps': 5950, 'batch_loss/train': 0.8373533766716719}
12/23/2021 18:27:50 - INFO - codeparrot_training - Step 5951: {'lr': 0.00048360301648012546, 'samples': 3047424, 'steps': 5951, 'batch_loss/train': 0.839091713540256}
12/23/2021 18:28:01 - INFO - codeparrot_training - Step 5952: {'lr': 0.0004835974012835839, 'samples': 3047936, 'steps': 5952, 'batch_loss/train': 0.9065551529638469}
12/23/2021 18:28:13 - INFO - codeparrot_training - Step 5953: {'lr': 0.0004835917851583472, 'samples': 3048448, 'steps': 5953, 'batch_loss/train': 0.7569553982466459}
12/23/2021 18:28:23 - INFO - codeparrot_training - Step 5954: {'lr': 0.0004835861681044379, 'samples': 3048960, 'steps': 5954, 'batch_loss/train': 0.9055330436676741}
12/23/2021 18:28:34 - INFO - codeparrot_training - Step 5955: {'lr': 0.0004835805501218781, 'samples': 3049472, 'steps': 5955, 'batch_loss/train': 0.8092591767199337}
12/23/2021 18:28:46 - INFO - codeparrot_training - Step 5956: {'lr': 0.0004835749312106903, 'samples': 3049984, 'steps': 5956, 'batch_loss/train': 0.8787575841415673}
12/23/2021 18:28:57 - INFO - codeparrot_training - Step 5957: {'lr': 0.0004835693113708968, 'samples': 3050496, 'steps': 5957, 'batch_loss/train': 0.879173609893769}
12/23/2021 18:29:07 - INFO - codeparrot_training - Step 5958: {'lr': 0.0004835636906025198, 'samples': 3051008, 'steps': 5958, 'batch_loss/train': 0.8264527887804434}
12/23/2021 18:29:18 - INFO - codeparrot_training - Step 5959: {'lr': 0.00048355806890558174, 'samples': 3051520, 'steps': 5959, 'batch_loss/train': 0.9518850147724152}
12/23/2021 18:29:31 - INFO - codeparrot_training - Step 5960: {'lr': 0.0004835524462801051, 'samples': 3052032, 'steps': 5960, 'batch_loss/train': 0.8356580315157771}
12/23/2021 18:29:42 - INFO - codeparrot_training - Step 5961: {'lr': 0.00048354682272611207, 'samples': 3052544, 'steps': 5961, 'batch_loss/train': 0.826870153658092}
12/23/2021 18:29:53 - INFO - codeparrot_training - Step 5962: {'lr': 0.000483541198243625, 'samples': 3053056, 'steps': 5962, 'batch_loss/train': 0.8062297524884343}
12/23/2021 18:30:05 - INFO - codeparrot_training - Step 5963: {'lr': 0.00048353557283266634, 'samples': 3053568, 'steps': 5963, 'batch_loss/train': 0.8778781611472368}
12/23/2021 18:30:15 - INFO - codeparrot_training - Step 5964: {'lr': 0.0004835299464932585, 'samples': 3054080, 'steps': 5964, 'batch_loss/train': 0.8918313686735928}
12/23/2021 18:30:26 - INFO - codeparrot_training - Step 5965: {'lr': 0.00048352431922542375, 'samples': 3054592, 'steps': 5965, 'batch_loss/train': 0.8362298272550106}
12/23/2021 18:30:38 - INFO - codeparrot_training - Step 5966: {'lr': 0.0004835186910291844, 'samples': 3055104, 'steps': 5966, 'batch_loss/train': 0.8554074121639132}
12/23/2021 18:30:48 - INFO - codeparrot_training - Step 5967: {'lr': 0.0004835130619045629, 'samples': 3055616, 'steps': 5967, 'batch_loss/train': 1.0016941986978054}
12/23/2021 18:30:59 - INFO - codeparrot_training - Step 5968: {'lr': 0.0004835074318515817, 'samples': 3056128, 'steps': 5968, 'batch_loss/train': 0.7218209733255208}
12/23/2021 18:31:12 - INFO - codeparrot_training - Step 5969: {'lr': 0.0004835018008702631, 'samples': 3056640, 'steps': 5969, 'batch_loss/train': 0.8479747083038092}
12/23/2021 18:31:23 - INFO - codeparrot_training - Step 5970: {'lr': 0.00048349616896062954, 'samples': 3057152, 'steps': 5970, 'batch_loss/train': 0.8528989123879}
12/23/2021 18:31:33 - INFO - codeparrot_training - Step 5971: {'lr': 0.00048349053612270335, 'samples': 3057664, 'steps': 5971, 'batch_loss/train': 0.8675800366327167}
12/23/2021 18:31:44 - INFO - codeparrot_training - Step 5972: {'lr': 0.000483484902356507, 'samples': 3058176, 'steps': 5972, 'batch_loss/train': 0.9167897319421172}
12/23/2021 18:31:56 - INFO - codeparrot_training - Step 5973: {'lr': 0.00048347926766206274, 'samples': 3058688, 'steps': 5973, 'batch_loss/train': 0.8887089462950826}
12/23/2021 18:32:07 - INFO - codeparrot_training - Step 5974: {'lr': 0.0004834736320393931, 'samples': 3059200, 'steps': 5974, 'batch_loss/train': 0.8711890717968345}
12/23/2021 18:32:17 - INFO - codeparrot_training - Step 5975: {'lr': 0.0004834679954885205, 'samples': 3059712, 'steps': 5975, 'batch_loss/train': 0.7936429511755705}
12/23/2021 18:32:29 - INFO - codeparrot_training - Step 5976: {'lr': 0.0004834623580094673, 'samples': 3060224, 'steps': 5976, 'batch_loss/train': 0.8866534028202295}
12/23/2021 18:32:40 - INFO - codeparrot_training - Step 5977: {'lr': 0.0004834567196022559, 'samples': 3060736, 'steps': 5977, 'batch_loss/train': 0.7090451908297837}
12/23/2021 18:32:50 - INFO - codeparrot_training - Step 5978: {'lr': 0.0004834510802669087, 'samples': 3061248, 'steps': 5978, 'batch_loss/train': 0.6976567292585969}
12/23/2021 18:33:03 - INFO - codeparrot_training - Step 5979: {'lr': 0.0004834454400034482, 'samples': 3061760, 'steps': 5979, 'batch_loss/train': 0.7815812593325973}
12/23/2021 18:33:14 - INFO - codeparrot_training - Step 5980: {'lr': 0.00048343979881189673, 'samples': 3062272, 'steps': 5980, 'batch_loss/train': 0.8052142797969282}
12/23/2021 18:33:25 - INFO - codeparrot_training - Step 5981: {'lr': 0.0004834341566922768, 'samples': 3062784, 'steps': 5981, 'batch_loss/train': 0.850577131845057}
12/23/2021 18:33:35 - INFO - codeparrot_training - Step 5982: {'lr': 0.00048342851364461083, 'samples': 3063296, 'steps': 5982, 'batch_loss/train': 0.9440705347806215}
12/23/2021 18:33:47 - INFO - codeparrot_training - Step 5983: {'lr': 0.0004834228696689212, 'samples': 3063808, 'steps': 5983, 'batch_loss/train': 0.8709281287156045}
12/23/2021 18:33:58 - INFO - codeparrot_training - Step 5984: {'lr': 0.00048341722476523034, 'samples': 3064320, 'steps': 5984, 'batch_loss/train': 0.8309761034324765}
12/23/2021 18:34:09 - INFO - codeparrot_training - Step 5985: {'lr': 0.0004834115789335608, 'samples': 3064832, 'steps': 5985, 'batch_loss/train': 0.7751219645142555}
12/23/2021 18:34:22 - INFO - codeparrot_training - Step 5986: {'lr': 0.0004834059321739349, 'samples': 3065344, 'steps': 5986, 'batch_loss/train': 0.9069859487935901}
12/23/2021 18:34:33 - INFO - codeparrot_training - Step 5987: {'lr': 0.0004834002844863752, 'samples': 3065856, 'steps': 5987, 'batch_loss/train': 0.9089980474673212}
12/23/2021 18:34:43 - INFO - codeparrot_training - Step 5988: {'lr': 0.00048339463587090405, 'samples': 3066368, 'steps': 5988, 'batch_loss/train': 0.8730621905997396}
12/23/2021 18:34:55 - INFO - codeparrot_training - Step 5989: {'lr': 0.0004833889863275439, 'samples': 3066880, 'steps': 5989, 'batch_loss/train': 0.8102513086050749}
12/23/2021 18:35:06 - INFO - codeparrot_training - Step 5990: {'lr': 0.00048338333585631736, 'samples': 3067392, 'steps': 5990, 'batch_loss/train': 0.8804748326074332}
12/23/2021 18:35:16 - INFO - codeparrot_training - Step 5991: {'lr': 0.00048337768445724674, 'samples': 3067904, 'steps': 5991, 'batch_loss/train': 0.8145453683100641}
12/23/2021 18:35:29 - INFO - codeparrot_training - Step 5992: {'lr': 0.0004833720321303546, 'samples': 3068416, 'steps': 5992, 'batch_loss/train': 0.8424277566373348}
12/23/2021 18:35:39 - INFO - codeparrot_training - Step 5993: {'lr': 0.00048336637887566337, 'samples': 3068928, 'steps': 5993, 'batch_loss/train': 0.7449813117273152}
12/23/2021 18:35:50 - INFO - codeparrot_training - Step 5994: {'lr': 0.00048336072469319554, 'samples': 3069440, 'steps': 5994, 'batch_loss/train': 0.841082124505192}
12/23/2021 18:36:01 - INFO - codeparrot_training - Step 5995: {'lr': 0.0004833550695829736, 'samples': 3069952, 'steps': 5995, 'batch_loss/train': 0.6587314223870635}
12/23/2021 18:36:13 - INFO - codeparrot_training - Step 5996: {'lr': 0.00048334941354501995, 'samples': 3070464, 'steps': 5996, 'batch_loss/train': 0.8242392148822546}
12/23/2021 18:36:23 - INFO - codeparrot_training - Step 5997: {'lr': 0.00048334375657935717, 'samples': 3070976, 'steps': 5997, 'batch_loss/train': 0.8484153663739562}
12/23/2021 18:36:34 - INFO - codeparrot_training - Step 5998: {'lr': 0.00048333809868600776, 'samples': 3071488, 'steps': 5998, 'batch_loss/train': 0.7519930629059672}
12/23/2021 18:36:47 - INFO - codeparrot_training - Step 5999: {'lr': 0.00048333243986499415, 'samples': 3072000, 'steps': 5999, 'batch_loss/train': 0.8782752612605691}
12/23/2021 18:36:57 - INFO - codeparrot_training - Step 6000: {'lr': 0.00048332678011633884, 'samples': 3072512, 'steps': 6000, 'batch_loss/train': 0.8264210755005479}
12/23/2021 18:37:08 - INFO - codeparrot_training - Step 6001: {'lr': 0.0004833211194400644, 'samples': 3073024, 'steps': 6001, 'batch_loss/train': 0.9253470338881016}
12/23/2021 18:37:20 - INFO - codeparrot_training - Step 6002: {'lr': 0.0004833154578361932, 'samples': 3073536, 'steps': 6002, 'batch_loss/train': 0.8473845296539366}
12/23/2021 18:37:31 - INFO - codeparrot_training - Step 6003: {'lr': 0.00048330979530474797, 'samples': 3074048, 'steps': 6003, 'batch_loss/train': 0.9023696146905422}
12/23/2021 18:37:41 - INFO - codeparrot_training - Step 6004: {'lr': 0.000483304131845751, 'samples': 3074560, 'steps': 6004, 'batch_loss/train': 0.9237347561866045}
12/23/2021 18:37:52 - INFO - codeparrot_training - Step 6005: {'lr': 0.00048329846745922487, 'samples': 3075072, 'steps': 6005, 'batch_loss/train': 0.7790967528708279}
12/23/2021 18:38:04 - INFO - codeparrot_training - Step 6006: {'lr': 0.00048329280214519225, 'samples': 3075584, 'steps': 6006, 'batch_loss/train': 0.8019183361902833}
12/23/2021 18:38:14 - INFO - codeparrot_training - Step 6007: {'lr': 0.0004832871359036754, 'samples': 3076096, 'steps': 6007, 'batch_loss/train': 0.8217471595853567}
12/23/2021 18:38:25 - INFO - codeparrot_training - Step 6008: {'lr': 0.000483281468734697, 'samples': 3076608, 'steps': 6008, 'batch_loss/train': 0.877291375072673}
12/23/2021 18:38:38 - INFO - codeparrot_training - Step 6009: {'lr': 0.0004832758006382796, 'samples': 3077120, 'steps': 6009, 'batch_loss/train': 0.8736789496615529}
12/23/2021 18:38:49 - INFO - codeparrot_training - Step 6010: {'lr': 0.00048327013161444574, 'samples': 3077632, 'steps': 6010, 'batch_loss/train': 0.9237018302083015}
12/23/2021 18:38:59 - INFO - codeparrot_training - Step 6011: {'lr': 0.00048326446166321787, 'samples': 3078144, 'steps': 6011, 'batch_loss/train': 0.699217552319169}
12/23/2021 18:39:11 - INFO - codeparrot_training - Step 6012: {'lr': 0.00048325879078461853, 'samples': 3078656, 'steps': 6012, 'batch_loss/train': 0.8516709730029106}
12/23/2021 18:39:22 - INFO - codeparrot_training - Step 6013: {'lr': 0.0004832531189786704, 'samples': 3079168, 'steps': 6013, 'batch_loss/train': 0.7765497546643019}
12/23/2021 18:39:33 - INFO - codeparrot_training - Step 6014: {'lr': 0.00048324744624539585, 'samples': 3079680, 'steps': 6014, 'batch_loss/train': 0.8897068416699767}
12/23/2021 18:39:47 - INFO - codeparrot_training - Step 6015: {'lr': 0.0004832417725848176, 'samples': 3080192, 'steps': 6015, 'batch_loss/train': 0.9273698022589087}
12/23/2021 18:39:58 - INFO - codeparrot_training - Step 6016: {'lr': 0.0004832360979969581, 'samples': 3080704, 'steps': 6016, 'batch_loss/train': 0.9752066088840365}
12/23/2021 18:40:08 - INFO - codeparrot_training - Step 6017: {'lr': 0.00048323042248184, 'samples': 3081216, 'steps': 6017, 'batch_loss/train': 0.7916126633062959}
12/23/2021 18:40:19 - INFO - codeparrot_training - Step 6018: {'lr': 0.00048322474603948576, 'samples': 3081728, 'steps': 6018, 'batch_loss/train': 0.8713384144939482}
12/23/2021 18:40:31 - INFO - codeparrot_training - Step 6019: {'lr': 0.00048321906866991796, 'samples': 3082240, 'steps': 6019, 'batch_loss/train': 0.7927553588524461}
12/23/2021 18:40:41 - INFO - codeparrot_training - Step 6020: {'lr': 0.0004832133903731593, 'samples': 3082752, 'steps': 6020, 'batch_loss/train': 0.8820639187470078}
12/23/2021 18:40:52 - INFO - codeparrot_training - Step 6021: {'lr': 0.0004832077111492321, 'samples': 3083264, 'steps': 6021, 'batch_loss/train': 0.8043414205312729}
12/23/2021 18:41:04 - INFO - codeparrot_training - Step 6022: {'lr': 0.0004832020309981592, 'samples': 3083776, 'steps': 6022, 'batch_loss/train': 0.9307437846437097}
12/23/2021 18:41:15 - INFO - codeparrot_training - Step 6023: {'lr': 0.0004831963499199631, 'samples': 3084288, 'steps': 6023, 'batch_loss/train': 0.5668099818285555}
12/23/2021 18:41:25 - INFO - codeparrot_training - Step 6024: {'lr': 0.00048319066791466627, 'samples': 3084800, 'steps': 6024, 'batch_loss/train': 0.8870122330263257}
12/23/2021 18:41:39 - INFO - codeparrot_training - Step 6025: {'lr': 0.0004831849849822915, 'samples': 3085312, 'steps': 6025, 'batch_loss/train': 0.8729191739112139}
12/23/2021 18:41:50 - INFO - codeparrot_training - Step 6026: {'lr': 0.00048317930112286116, 'samples': 3085824, 'steps': 6026, 'batch_loss/train': 0.7843036269769073}
12/23/2021 18:42:00 - INFO - codeparrot_training - Step 6027: {'lr': 0.00048317361633639803, 'samples': 3086336, 'steps': 6027, 'batch_loss/train': 0.9210655381903052}
12/23/2021 18:42:11 - INFO - codeparrot_training - Step 6028: {'lr': 0.00048316793062292465, 'samples': 3086848, 'steps': 6028, 'batch_loss/train': 0.8638545237481594}
12/23/2021 18:42:23 - INFO - codeparrot_training - Step 6029: {'lr': 0.00048316224398246366, 'samples': 3087360, 'steps': 6029, 'batch_loss/train': 0.8182992837391794}
12/23/2021 18:42:34 - INFO - codeparrot_training - Step 6030: {'lr': 0.0004831565564150375, 'samples': 3087872, 'steps': 6030, 'batch_loss/train': 0.7776999892666936}
12/23/2021 18:42:44 - INFO - codeparrot_training - Step 6031: {'lr': 0.00048315086792066904, 'samples': 3088384, 'steps': 6031, 'batch_loss/train': 0.9893842488527298}
12/23/2021 18:42:56 - INFO - codeparrot_training - Step 6032: {'lr': 0.00048314517849938073, 'samples': 3088896, 'steps': 6032, 'batch_loss/train': 1.040649883914739}
12/23/2021 18:43:07 - INFO - codeparrot_training - Step 6033: {'lr': 0.0004831394881511952, 'samples': 3089408, 'steps': 6033, 'batch_loss/train': 0.7766550513915718}
12/23/2021 18:43:17 - INFO - codeparrot_training - Step 6034: {'lr': 0.00048313379687613513, 'samples': 3089920, 'steps': 6034, 'batch_loss/train': 0.8496537888422608}
12/23/2021 18:43:30 - INFO - codeparrot_training - Step 6035: {'lr': 0.000483128104674223, 'samples': 3090432, 'steps': 6035, 'batch_loss/train': 0.8899904228746891}
12/23/2021 18:43:40 - INFO - codeparrot_training - Step 6036: {'lr': 0.00048312241154548167, 'samples': 3090944, 'steps': 6036, 'batch_loss/train': 0.9509349009022117}
12/23/2021 18:43:51 - INFO - codeparrot_training - Step 6037: {'lr': 0.00048311671748993364, 'samples': 3091456, 'steps': 6037, 'batch_loss/train': 0.8607864184305072}
12/23/2021 18:44:05 - INFO - codeparrot_training - Step 6038: {'lr': 0.0004831110225076015, 'samples': 3091968, 'steps': 6038, 'batch_loss/train': 0.811508153565228}
12/23/2021 18:44:15 - INFO - codeparrot_training - Step 6039: {'lr': 0.000483105326598508, 'samples': 3092480, 'steps': 6039, 'batch_loss/train': 1.3041536659002304}
12/23/2021 18:44:26 - INFO - codeparrot_training - Step 6040: {'lr': 0.00048309962976267573, 'samples': 3092992, 'steps': 6040, 'batch_loss/train': 0.8852015174925327}
12/23/2021 18:44:37 - INFO - codeparrot_training - Step 6041: {'lr': 0.00048309393200012733, 'samples': 3093504, 'steps': 6041, 'batch_loss/train': 0.8187704272568226}
12/23/2021 18:44:49 - INFO - codeparrot_training - Step 6042: {'lr': 0.00048308823331088554, 'samples': 3094016, 'steps': 6042, 'batch_loss/train': 0.7864492959342897}
12/23/2021 18:45:00 - INFO - codeparrot_training - Step 6043: {'lr': 0.0004830825336949729, 'samples': 3094528, 'steps': 6043, 'batch_loss/train': 0.7567867940524593}
12/23/2021 18:45:10 - INFO - codeparrot_training - Step 6044: {'lr': 0.00048307683315241206, 'samples': 3095040, 'steps': 6044, 'batch_loss/train': 0.5998000521212816}
12/23/2021 18:45:22 - INFO - codeparrot_training - Step 6045: {'lr': 0.0004830711316832258, 'samples': 3095552, 'steps': 6045, 'batch_loss/train': 0.8776691320817918}
12/23/2021 18:45:33 - INFO - codeparrot_training - Step 6046: {'lr': 0.0004830654292874367, 'samples': 3096064, 'steps': 6046, 'batch_loss/train': 0.7344814697280526}
12/23/2021 18:45:43 - INFO - codeparrot_training - Step 6047: {'lr': 0.00048305972596506745, 'samples': 3096576, 'steps': 6047, 'batch_loss/train': 0.8988330420106649}
12/23/2021 18:45:57 - INFO - codeparrot_training - Step 6048: {'lr': 0.0004830540217161408, 'samples': 3097088, 'steps': 6048, 'batch_loss/train': 0.9210238931700587}
12/23/2021 18:46:08 - INFO - codeparrot_training - Step 6049: {'lr': 0.0004830483165406793, 'samples': 3097600, 'steps': 6049, 'batch_loss/train': 0.9225013954564929}
12/23/2021 18:46:18 - INFO - codeparrot_training - Step 6050: {'lr': 0.0004830426104387057, 'samples': 3098112, 'steps': 6050, 'batch_loss/train': 0.8733643908053637}
12/23/2021 18:46:30 - INFO - codeparrot_training - Step 6051: {'lr': 0.00048303690341024267, 'samples': 3098624, 'steps': 6051, 'batch_loss/train': 0.8344813836738467}
12/23/2021 18:46:41 - INFO - codeparrot_training - Step 6052: {'lr': 0.0004830311954553129, 'samples': 3099136, 'steps': 6052, 'batch_loss/train': 0.8877637218683958}
12/23/2021 18:46:51 - INFO - codeparrot_training - Step 6053: {'lr': 0.00048302548657393906, 'samples': 3099648, 'steps': 6053, 'batch_loss/train': 0.8854377190582454}
12/23/2021 18:47:02 - INFO - codeparrot_training - Step 6054: {'lr': 0.00048301977676614387, 'samples': 3100160, 'steps': 6054, 'batch_loss/train': 0.8647752702236176}
12/23/2021 18:47:16 - INFO - codeparrot_training - Step 6055: {'lr':
0.00048301406603195007, 'samples': 3100672, 'steps': 6055, 'batch_loss/train': 0.8764647338539362} 12/23/2021 18:47:26 - INFO - codeparrot_training - Step 6056: {'lr': 0.0004830083543713804, 'samples': 3101184, 'steps': 6056, 'batch_loss/train': 0.7129918681457639} 12/23/2021 18:47:37 - INFO - codeparrot_training - Step 6057: {'lr': 0.00048300264178445735, 'samples': 3101696, 'steps': 6057, 'batch_loss/train': 0.8662923723459244} 12/23/2021 18:47:49 - INFO - codeparrot_training - Step 6058: {'lr': 0.00048299692827120385, 'samples': 3102208, 'steps': 6058, 'batch_loss/train': 0.791578525211662} 12/23/2021 18:47:59 - INFO - codeparrot_training - Step 6059: {'lr': 0.0004829912138316425, 'samples': 3102720, 'steps': 6059, 'batch_loss/train': 0.8699720883741975} 12/23/2021 18:48:10 - INFO - codeparrot_training - Step 6060: {'lr': 0.0004829854984657961, 'samples': 3103232, 'steps': 6060, 'batch_loss/train': 0.8421018337830901} 12/23/2021 18:48:22 - INFO - codeparrot_training - Step 6061: {'lr': 0.00048297978217368734, 'samples': 3103744, 'steps': 6061, 'batch_loss/train': 0.8643220970407128} 12/23/2021 18:48:32 - INFO - codeparrot_training - Step 6062: {'lr': 0.0004829740649553389, 'samples': 3104256, 'steps': 6062, 'batch_loss/train': 0.8845199160277843} 12/23/2021 18:48:43 - INFO - codeparrot_training - Step 6063: {'lr': 0.00048296834681077353, 'samples': 3104768, 'steps': 6063, 'batch_loss/train': 0.8419940415769815} 12/23/2021 18:48:54 - INFO - codeparrot_training - Step 6064: {'lr': 0.00048296262774001397, 'samples': 3105280, 'steps': 6064, 'batch_loss/train': 0.839435251429677} 12/23/2021 18:49:07 - INFO - codeparrot_training - Step 6065: {'lr': 0.000482956907743083, 'samples': 3105792, 'steps': 6065, 'batch_loss/train': 0.888203009031713} 12/23/2021 18:49:18 - INFO - codeparrot_training - Step 6066: {'lr': 0.0004829511868200034, 'samples': 3106304, 'steps': 6066, 'batch_loss/train': 0.8083533113822341} 12/23/2021 18:49:28 - INFO - codeparrot_training - Step 6067: 
{'lr': 0.00048294546497079774, 'samples': 3106816, 'steps': 6067, 'batch_loss/train': 0.7899560732766986} 12/23/2021 18:49:40 - INFO - codeparrot_training - Step 6068: {'lr': 0.00048293974219548887, 'samples': 3107328, 'steps': 6068, 'batch_loss/train': 0.7578352666459978} 12/23/2021 18:49:51 - INFO - codeparrot_training - Step 6069: {'lr': 0.00048293401849409956, 'samples': 3107840, 'steps': 6069, 'batch_loss/train': 0.8833907786756754} 12/23/2021 18:50:01 - INFO - codeparrot_training - Step 6070: {'lr': 0.00048292829386665256, 'samples': 3108352, 'steps': 6070, 'batch_loss/train': 0.9024988077580929} 12/23/2021 18:50:13 - INFO - codeparrot_training - Step 6071: {'lr': 0.0004829225683131706, 'samples': 3108864, 'steps': 6071, 'batch_loss/train': 0.8155947374179959} 12/23/2021 18:50:24 - INFO - codeparrot_training - Step 6072: {'lr': 0.00048291684183367645, 'samples': 3109376, 'steps': 6072, 'batch_loss/train': 0.9466386893764138} 12/23/2021 18:50:35 - INFO - codeparrot_training - Step 6073: {'lr': 0.0004829111144281929, 'samples': 3109888, 'steps': 6073, 'batch_loss/train': 0.9996325615793467} 12/23/2021 18:50:46 - INFO - codeparrot_training - Step 6074: {'lr': 0.0004829053860967427, 'samples': 3110400, 'steps': 6074, 'batch_loss/train': 0.8356164740398526} 12/23/2021 18:50:57 - INFO - codeparrot_training - Step 6075: {'lr': 0.00048289965683934855, 'samples': 3110912, 'steps': 6075, 'batch_loss/train': 0.8024642560631037} 12/23/2021 18:51:08 - INFO - codeparrot_training - Step 6076: {'lr': 0.0004828939266560334, 'samples': 3111424, 'steps': 6076, 'batch_loss/train': 0.8086444847285748} 12/23/2021 18:51:18 - INFO - codeparrot_training - Step 6077: {'lr': 0.00048288819554681995, 'samples': 3111936, 'steps': 6077, 'batch_loss/train': 0.7802216736599803} 12/23/2021 18:51:32 - INFO - codeparrot_training - Step 6078: {'lr': 0.0004828824635117309, 'samples': 3112448, 'steps': 6078, 'batch_loss/train': 0.8514965893700719} 12/23/2021 18:51:43 - INFO - codeparrot_training - 
Step 6079: {'lr': 0.00048287673055078916, 'samples': 3112960, 'steps': 6079, 'batch_loss/train': 0.8371637957170606} 12/23/2021 18:51:53 - INFO - codeparrot_training - Step 6080: {'lr': 0.0004828709966640175, 'samples': 3113472, 'steps': 6080, 'batch_loss/train': 0.8800743115134537} 12/23/2021 18:52:06 - INFO - codeparrot_training - Step 6081: {'lr': 0.00048286526185143856, 'samples': 3113984, 'steps': 6081, 'batch_loss/train': 0.8473503924906254} 12/23/2021 18:52:16 - INFO - codeparrot_training - Step 6082: {'lr': 0.00048285952611307534, 'samples': 3114496, 'steps': 6082, 'batch_loss/train': 0.757990549900569} 12/23/2021 18:52:27 - INFO - codeparrot_training - Step 6083: {'lr': 0.00048285378944895054, 'samples': 3115008, 'steps': 6083, 'batch_loss/train': 0.6985275216866285} 12/23/2021 18:52:41 - INFO - codeparrot_training - Step 6084: {'lr': 0.00048284805185908703, 'samples': 3115520, 'steps': 6084, 'batch_loss/train': 0.4915030673146248} 12/23/2021 18:52:51 - INFO - codeparrot_training - Step 6085: {'lr': 0.0004828423133435076, 'samples': 3116032, 'steps': 6085, 'batch_loss/train': 0.6555395472096279} 12/23/2021 18:53:02 - INFO - codeparrot_training - Step 6086: {'lr': 0.00048283657390223503, 'samples': 3116544, 'steps': 6086, 'batch_loss/train': 0.9301524758338928} 12/23/2021 18:53:13 - INFO - codeparrot_training - Step 6087: {'lr': 0.0004828308335352921, 'samples': 3117056, 'steps': 6087, 'batch_loss/train': 0.9040381808299571} 12/23/2021 18:53:25 - INFO - codeparrot_training - Step 6088: {'lr': 0.00048282509224270174, 'samples': 3117568, 'steps': 6088, 'batch_loss/train': 0.8505504108034074} 12/23/2021 18:53:36 - INFO - codeparrot_training - Step 6089: {'lr': 0.0004828193500244867, 'samples': 3118080, 'steps': 6089, 'batch_loss/train': 0.8892074674367905} 12/23/2021 18:53:47 - INFO - codeparrot_training - Step 6090: {'lr': 0.0004828136068806699, 'samples': 3118592, 'steps': 6090, 'batch_loss/train': 1.108687499538064} 12/23/2021 18:53:59 - INFO - 
codeparrot_training - Step 6091: {'lr': 0.0004828078628112741, 'samples': 3119104, 'steps': 6091, 'batch_loss/train': 0.8654631478711963} 12/23/2021 18:54:09 - INFO - codeparrot_training - Step 6092: {'lr': 0.00048280211781632213, 'samples': 3119616, 'steps': 6092, 'batch_loss/train': 0.8887252797139809} 12/23/2021 18:54:20 - INFO - codeparrot_training - Step 6093: {'lr': 0.0004827963718958368, 'samples': 3120128, 'steps': 6093, 'batch_loss/train': 0.8193308799527586} 12/23/2021 18:54:30 - INFO - codeparrot_training - Step 6094: {'lr': 0.00048279062504984107, 'samples': 3120640, 'steps': 6094, 'batch_loss/train': 0.8812017049640417} 12/23/2021 18:54:43 - INFO - codeparrot_training - Step 6095: {'lr': 0.00048278487727835764, 'samples': 3121152, 'steps': 6095, 'batch_loss/train': 1.2335653752088547} 12/23/2021 18:54:54 - INFO - codeparrot_training - Step 6096: {'lr': 0.00048277912858140943, 'samples': 3121664, 'steps': 6096, 'batch_loss/train': 0.8355279108509421} 12/23/2021 18:55:05 - INFO - codeparrot_training - Step 6097: {'lr': 0.0004827733789590193, 'samples': 3122176, 'steps': 6097, 'batch_loss/train': 0.8016488822177052} 12/23/2021 18:55:17 - INFO - codeparrot_training - Step 6098: {'lr': 0.00048276762841121016, 'samples': 3122688, 'steps': 6098, 'batch_loss/train': 0.8629789846017957} 12/23/2021 18:55:27 - INFO - codeparrot_training - Step 6099: {'lr': 0.00048276187693800475, 'samples': 3123200, 'steps': 6099, 'batch_loss/train': 0.9461807813495398} 12/23/2021 18:55:38 - INFO - codeparrot_training - Step 6100: {'lr': 0.00048275612453942606, 'samples': 3123712, 'steps': 6100, 'batch_loss/train': 0.9055876852944493} 12/23/2021 18:55:50 - INFO - codeparrot_training - Step 6101: {'lr': 0.0004827503712154969, 'samples': 3124224, 'steps': 6101, 'batch_loss/train': 0.8732046624645591} 12/23/2021 18:56:00 - INFO - codeparrot_training - Step 6102: {'lr': 0.00048274461696624015, 'samples': 3124736, 'steps': 6102, 'batch_loss/train': 0.9314009435474873} 12/23/2021 
18:56:11 - INFO - codeparrot_training - Step 6103: {'lr': 0.0004827388617916787, 'samples': 3125248, 'steps': 6103, 'batch_loss/train': 0.9585136533714831} 12/23/2021 18:56:23 - INFO - codeparrot_training - Step 6104: {'lr': 0.0004827331056918354, 'samples': 3125760, 'steps': 6104, 'batch_loss/train': 0.8485535969957709} 12/23/2021 18:56:33 - INFO - codeparrot_training - Step 6105: {'lr': 0.0004827273486667331, 'samples': 3126272, 'steps': 6105, 'batch_loss/train': 0.8591582207009196} 12/23/2021 18:56:44 - INFO - codeparrot_training - Step 6106: {'lr': 0.0004827215907163948, 'samples': 3126784, 'steps': 6106, 'batch_loss/train': 0.9051511958241463} 12/23/2021 18:56:55 - INFO - codeparrot_training - Step 6107: {'lr': 0.0004827158318408432, 'samples': 3127296, 'steps': 6107, 'batch_loss/train': 0.8840127671137452} 12/23/2021 18:57:08 - INFO - codeparrot_training - Step 6108: {'lr': 0.0004827100720401015, 'samples': 3127808, 'steps': 6108, 'batch_loss/train': 0.8397765811532736} 12/23/2021 18:57:18 - INFO - codeparrot_training - Step 6109: {'lr': 0.00048270431131419234, 'samples': 3128320, 'steps': 6109, 'batch_loss/train': 0.7868944401852787} 12/23/2021 18:57:29 - INFO - codeparrot_training - Step 6110: {'lr': 0.0004826985496631386, 'samples': 3128832, 'steps': 6110, 'batch_loss/train': 0.8444827385246754} 12/23/2021 18:57:41 - INFO - codeparrot_training - Step 6111: {'lr': 0.00048269278708696354, 'samples': 3129344, 'steps': 6111, 'batch_loss/train': 0.8832928659394383} 12/23/2021 18:57:52 - INFO - codeparrot_training - Step 6112: {'lr': 0.0004826870235856896, 'samples': 3129856, 'steps': 6112, 'batch_loss/train': 0.8001764630898833} 12/23/2021 18:58:02 - INFO - codeparrot_training - Step 6113: {'lr': 0.00048268125915934003, 'samples': 3130368, 'steps': 6113, 'batch_loss/train': 0.9402580093592405} 12/23/2021 18:58:16 - INFO - codeparrot_training - Step 6114: {'lr': 0.00048267549380793753, 'samples': 3130880, 'steps': 6114, 'batch_loss/train': 0.8214766839519143} 
12/23/2021 18:58:27 - INFO - codeparrot_training - Step 6115: {'lr': 0.00048266972753150517, 'samples': 3131392, 'steps': 6115, 'batch_loss/train': 0.950416692532599} 12/23/2021 18:58:37 - INFO - codeparrot_training - Step 6116: {'lr': 0.0004826639603300659, 'samples': 3131904, 'steps': 6116, 'batch_loss/train': 0.7947676749899983} 12/23/2021 18:58:48 - INFO - codeparrot_training - Step 6117: {'lr': 0.0004826581922036425, 'samples': 3132416, 'steps': 6117, 'batch_loss/train': 0.900772487744689} 12/23/2021 18:59:00 - INFO - codeparrot_training - Step 6118: {'lr': 0.000482652423152258, 'samples': 3132928, 'steps': 6118, 'batch_loss/train': 0.8240331513807178} 12/23/2021 18:59:11 - INFO - codeparrot_training - Step 6119: {'lr': 0.00048264665317593525, 'samples': 3133440, 'steps': 6119, 'batch_loss/train': 0.6882699513807893} 12/23/2021 18:59:21 - INFO - codeparrot_training - Step 6120: {'lr': 0.0004826408822746973, 'samples': 3133952, 'steps': 6120, 'batch_loss/train': 0.9271773518994451} 12/23/2021 18:59:33 - INFO - codeparrot_training - Step 6121: {'lr': 0.00048263511044856704, 'samples': 3134464, 'steps': 6121, 'batch_loss/train': 0.921166536398232} 12/23/2021 18:59:44 - INFO - codeparrot_training - Step 6122: {'lr': 0.00048262933769756744, 'samples': 3134976, 'steps': 6122, 'batch_loss/train': 0.8769117183983326} 12/23/2021 18:59:55 - INFO - codeparrot_training - Step 6123: {'lr': 0.0004826235640217214, 'samples': 3135488, 'steps': 6123, 'batch_loss/train': 1.2360077584162354} 12/23/2021 19:00:09 - INFO - codeparrot_training - Step 6124: {'lr': 0.00048261778942105193, 'samples': 3136000, 'steps': 6124, 'batch_loss/train': 0.9147136111278087} 12/23/2021 19:00:19 - INFO - codeparrot_training - Step 6125: {'lr': 0.00048261201389558194, 'samples': 3136512, 'steps': 6125, 'batch_loss/train': 0.6462871782714501} 12/23/2021 19:00:30 - INFO - codeparrot_training - Step 6126: {'lr': 0.00048260623744533445, 'samples': 3137024, 'steps': 6126, 'batch_loss/train': 
0.891575706191361} 12/23/2021 19:00:42 - INFO - codeparrot_training - Step 6127: {'lr': 0.0004826004600703324, 'samples': 3137536, 'steps': 6127, 'batch_loss/train': 0.7820113841444254} 12/23/2021 19:00:52 - INFO - codeparrot_training - Step 6128: {'lr': 0.00048259468177059867, 'samples': 3138048, 'steps': 6128, 'batch_loss/train': 0.8805674090981483} 12/23/2021 19:01:03 - INFO - codeparrot_training - Step 6129: {'lr': 0.0004825889025461564, 'samples': 3138560, 'steps': 6129, 'batch_loss/train': 1.1016242085024714} 12/23/2021 19:01:14 - INFO - codeparrot_training - Step 6130: {'lr': 0.0004825831223970284, 'samples': 3139072, 'steps': 6130, 'batch_loss/train': 0.8293384676799178} 12/23/2021 19:01:27 - INFO - codeparrot_training - Step 6131: {'lr': 0.0004825773413232377, 'samples': 3139584, 'steps': 6131, 'batch_loss/train': 0.8244412890635431} 12/23/2021 19:01:37 - INFO - codeparrot_training - Step 6132: {'lr': 0.0004825715593248074, 'samples': 3140096, 'steps': 6132, 'batch_loss/train': 0.8453193414025009} 12/23/2021 19:01:48 - INFO - codeparrot_training - Step 6133: {'lr': 0.00048256577640176033, 'samples': 3140608, 'steps': 6133, 'batch_loss/train': 0.7297395975328982} 12/23/2021 19:02:00 - INFO - codeparrot_training - Step 6134: {'lr': 0.0004825599925541196, 'samples': 3141120, 'steps': 6134, 'batch_loss/train': 0.8003038046881557} 12/23/2021 19:02:11 - INFO - codeparrot_training - Step 6135: {'lr': 0.0004825542077819081, 'samples': 3141632, 'steps': 6135, 'batch_loss/train': 0.8453643801622093} 12/23/2021 19:02:21 - INFO - codeparrot_training - Step 6136: {'lr': 0.0004825484220851489, 'samples': 3142144, 'steps': 6136, 'batch_loss/train': 0.9247198291122913} 12/23/2021 19:02:32 - INFO - codeparrot_training - Step 6137: {'lr': 0.000482542635463865, 'samples': 3142656, 'steps': 6137, 'batch_loss/train': 0.693823928013444} 12/23/2021 19:02:44 - INFO - codeparrot_training - Step 6138: {'lr': 0.0004825368479180794, 'samples': 3143168, 'steps': 6138, 
'batch_loss/train': 0.8098215004429221} 12/23/2021 19:02:55 - INFO - codeparrot_training - Step 6139: {'lr': 0.00048253105944781507, 'samples': 3143680, 'steps': 6139, 'batch_loss/train': 0.8460358530282974} 12/23/2021 19:03:05 - INFO - codeparrot_training - Step 6140: {'lr': 0.000482525270053095, 'samples': 3144192, 'steps': 6140, 'batch_loss/train': 0.7398361524101347} 12/23/2021 19:03:17 - INFO - codeparrot_training - Step 6141: {'lr': 0.0004825194797339423, 'samples': 3144704, 'steps': 6141, 'batch_loss/train': 0.8418246213113889} 12/23/2021 19:03:28 - INFO - codeparrot_training - Step 6142: {'lr': 0.00048251368849037995, 'samples': 3145216, 'steps': 6142, 'batch_loss/train': 0.7992138676345348} 12/23/2021 19:03:38 - INFO - codeparrot_training - Step 6143: {'lr': 0.00048250789632243096, 'samples': 3145728, 'steps': 6143, 'batch_loss/train': 0.8477866742759943} 12/23/2021 19:03:52 - INFO - codeparrot_training - Step 6144: {'lr': 0.00048250210323011835, 'samples': 3146240, 'steps': 6144, 'batch_loss/train': 0.8197430316358805} 12/23/2021 19:04:02 - INFO - codeparrot_training - Step 6145: {'lr': 0.0004824963092134652, 'samples': 3146752, 'steps': 6145, 'batch_loss/train': 0.8445977009832859} 12/23/2021 19:04:13 - INFO - codeparrot_training - Step 6146: {'lr': 0.00048249051427249446, 'samples': 3147264, 'steps': 6146, 'batch_loss/train': 0.8739229273051023} 12/23/2021 19:04:26 - INFO - codeparrot_training - Step 6147: {'lr': 0.0004824847184072293, 'samples': 3147776, 'steps': 6147, 'batch_loss/train': 0.9471801407635212} 12/23/2021 19:04:37 - INFO - codeparrot_training - Step 6148: {'lr': 0.00048247892161769264, 'samples': 3148288, 'steps': 6148, 'batch_loss/train': 1.0071319108828902} 12/23/2021 19:04:48 - INFO - codeparrot_training - Step 6149: {'lr': 0.0004824731239039075, 'samples': 3148800, 'steps': 6149, 'batch_loss/train': 0.8772745300084352} 12/23/2021 19:04:58 - INFO - codeparrot_training - Step 6150: {'lr': 0.0004824673252658971, 'samples': 3149312, 
'steps': 6150, 'batch_loss/train': 0.8093972927890718} 12/23/2021 19:05:10 - INFO - codeparrot_training - Step 6151: {'lr': 0.0004824615257036844, 'samples': 3149824, 'steps': 6151, 'batch_loss/train': 0.8091893712989986} 12/23/2021 19:05:21 - INFO - codeparrot_training - Step 6152: {'lr': 0.0004824557252172924, 'samples': 3150336, 'steps': 6152, 'batch_loss/train': 0.8403333653695881} 12/23/2021 19:05:31 - INFO - codeparrot_training - Step 6153: {'lr': 0.0004824499238067442, 'samples': 3150848, 'steps': 6153, 'batch_loss/train': 0.8612623130902648} 12/23/2021 19:05:43 - INFO - codeparrot_training - Step 6154: {'lr': 0.00048244412147206283, 'samples': 3151360, 'steps': 6154, 'batch_loss/train': 0.8586272064130753} 12/23/2021 19:05:54 - INFO - codeparrot_training - Step 6155: {'lr': 0.0004824383182132715, 'samples': 3151872, 'steps': 6155, 'batch_loss/train': 0.7778337458148599} 12/23/2021 19:06:04 - INFO - codeparrot_training - Step 6156: {'lr': 0.0004824325140303931, 'samples': 3152384, 'steps': 6156, 'batch_loss/train': 0.8357659606263041} 12/23/2021 19:06:17 - INFO - codeparrot_training - Step 6157: {'lr': 0.00048242670892345086, 'samples': 3152896, 'steps': 6157, 'batch_loss/train': 0.8548043286427855} 12/23/2021 19:06:28 - INFO - codeparrot_training - Step 6158: {'lr': 0.0004824209028924678, 'samples': 3153408, 'steps': 6158, 'batch_loss/train': 0.910292711108923} 12/23/2021 19:06:38 - INFO - codeparrot_training - Step 6159: {'lr': 0.0004824150959374669, 'samples': 3153920, 'steps': 6159, 'batch_loss/train': 0.9178377948701382} 12/23/2021 19:06:49 - INFO - codeparrot_training - Step 6160: {'lr': 0.00048240928805847133, 'samples': 3154432, 'steps': 6160, 'batch_loss/train': 0.9256278686225414} 12/23/2021 19:07:01 - INFO - codeparrot_training - Step 6161: {'lr': 0.0004824034792555042, 'samples': 3154944, 'steps': 6161, 'batch_loss/train': 0.7469811677001417} 12/23/2021 19:07:12 - INFO - codeparrot_training - Step 6162: {'lr': 0.0004823976695285887, 'samples': 
3155456, 'steps': 6162, 'batch_loss/train': 0.7979245013557374} 12/23/2021 19:07:22 - INFO - codeparrot_training - Step 6163: {'lr': 0.0004823918588777477, 'samples': 3155968, 'steps': 6163, 'batch_loss/train': 0.7927864962257445} 12/23/2021 19:07:35 - INFO - codeparrot_training - Step 6164: {'lr': 0.00048238604730300446, 'samples': 3156480, 'steps': 6164, 'batch_loss/train': 0.9374984344467521} 12/23/2021 19:07:46 - INFO - codeparrot_training - Step 6165: {'lr': 0.00048238023480438196, 'samples': 3156992, 'steps': 6165, 'batch_loss/train': 0.9147194372490048} 12/23/2021 19:07:56 - INFO - codeparrot_training - Step 6166: {'lr': 0.00048237442138190353, 'samples': 3157504, 'steps': 6166, 'batch_loss/train': 0.8917863848619163} 12/23/2021 19:08:09 - INFO - codeparrot_training - Step 6167: {'lr': 0.00048236860703559204, 'samples': 3158016, 'steps': 6167, 'batch_loss/train': 0.8740701312199235} 12/23/2021 19:08:19 - INFO - codeparrot_training - Step 6168: {'lr': 0.00048236279176547075, 'samples': 3158528, 'steps': 6168, 'batch_loss/train': 0.8546557240188122} 12/23/2021 19:08:30 - INFO - codeparrot_training - Step 6169: {'lr': 0.00048235697557156274, 'samples': 3159040, 'steps': 6169, 'batch_loss/train': 1.0232348684221506} 12/23/2021 19:08:42 - INFO - codeparrot_training - Step 6170: {'lr': 0.0004823511584538911, 'samples': 3159552, 'steps': 6170, 'batch_loss/train': 0.7492199530825019} 12/23/2021 19:08:53 - INFO - codeparrot_training - Step 6171: {'lr': 0.00048234534041247904, 'samples': 3160064, 'steps': 6171, 'batch_loss/train': 0.9214504533447325} 12/23/2021 19:09:03 - INFO - codeparrot_training - Step 6172: {'lr': 0.0004823395214473496, 'samples': 3160576, 'steps': 6172, 'batch_loss/train': 0.7399221435189247} 12/23/2021 19:09:14 - INFO - codeparrot_training - Step 6173: {'lr': 0.00048233370155852595, 'samples': 3161088, 'steps': 6173, 'batch_loss/train': 1.5138627802953124} 12/23/2021 19:09:28 - INFO - codeparrot_training - Step 6174: {'lr': 
0.0004823278807460312, 'samples': 3161600, 'steps': 6174, 'batch_loss/train': 0.8278568810783327} 12/23/2021 19:09:38 - INFO - codeparrot_training - Step 6175: {'lr': 0.0004823220590098886, 'samples': 3162112, 'steps': 6175, 'batch_loss/train': 0.8829247537069023} 12/23/2021 19:09:49 - INFO - codeparrot_training - Step 6176: {'lr': 0.0004823162363501211, 'samples': 3162624, 'steps': 6176, 'batch_loss/train': 1.058266320032999} 12/23/2021 19:10:01 - INFO - codeparrot_training - Step 6177: {'lr': 0.00048231041276675205, 'samples': 3163136, 'steps': 6177, 'batch_loss/train': 0.708885476924479} 12/23/2021 19:10:11 - INFO - codeparrot_training - Step 6178: {'lr': 0.0004823045882598045, 'samples': 3163648, 'steps': 6178, 'batch_loss/train': 0.8257036246359348} 12/23/2021 19:10:22 - INFO - codeparrot_training - Step 6179: {'lr': 0.0004822987628293016, 'samples': 3164160, 'steps': 6179, 'batch_loss/train': 0.9502774062566459} 12/23/2021 19:10:35 - INFO - codeparrot_training - Step 6180: {'lr': 0.0004822929364752665, 'samples': 3164672, 'steps': 6180, 'batch_loss/train': 0.7205281648784876} 12/23/2021 19:10:46 - INFO - codeparrot_training - Step 6181: {'lr': 0.00048228710919772245, 'samples': 3165184, 'steps': 6181, 'batch_loss/train': 0.785894088447094} 12/23/2021 19:10:56 - INFO - codeparrot_training - Step 6182: {'lr': 0.00048228128099669253, 'samples': 3165696, 'steps': 6182, 'batch_loss/train': 0.9273154772818089} 12/23/2021 19:11:07 - INFO - codeparrot_training - Step 6183: {'lr': 0.00048227545187220003, 'samples': 3166208, 'steps': 6183, 'batch_loss/train': 0.848409584723413} 12/23/2021 19:11:19 - INFO - codeparrot_training - Step 6184: {'lr': 0.0004822696218242679, 'samples': 3166720, 'steps': 6184, 'batch_loss/train': 0.9004194997251034} 12/23/2021 19:11:29 - INFO - codeparrot_training - Step 6185: {'lr': 0.0004822637908529196, 'samples': 3167232, 'steps': 6185, 'batch_loss/train': 0.9522846639156342} 12/23/2021 19:11:40 - INFO - codeparrot_training - Step 6186: 
{'lr': 0.0004822579589581781, 'samples': 3167744, 'steps': 6186, 'batch_loss/train': 0.8746916735544801} 12/23/2021 19:11:52 - INFO - codeparrot_training - Step 6187: {'lr': 0.0004822521261400666, 'samples': 3168256, 'steps': 6187, 'batch_loss/train': 0.8010245035402477} 12/23/2021 19:12:03 - INFO - codeparrot_training - Step 6188: {'lr': 0.0004822462923986084, 'samples': 3168768, 'steps': 6188, 'batch_loss/train': 0.8500561127439141} 12/23/2021 19:12:13 - INFO - codeparrot_training - Step 6189: {'lr': 0.00048224045773382664, 'samples': 3169280, 'steps': 6189, 'batch_loss/train': 0.8987848963588476} 12/23/2021 19:12:25 - INFO - codeparrot_training - Step 6190: {'lr': 0.00048223462214574454, 'samples': 3169792, 'steps': 6190, 'batch_loss/train': 0.6886999923735857} 12/23/2021 19:12:36 - INFO - codeparrot_training - Step 6191: {'lr': 0.0004822287856343852, 'samples': 3170304, 'steps': 6191, 'batch_loss/train': 0.8146905489265919} 12/23/2021 19:12:46 - INFO - codeparrot_training - Step 6192: {'lr': 0.00048222294819977193, 'samples': 3170816, 'steps': 6192, 'batch_loss/train': 0.8736959975212812} 12/23/2021 19:13:00 - INFO - codeparrot_training - Step 6193: {'lr': 0.0004822171098419279, 'samples': 3171328, 'steps': 6193, 'batch_loss/train': 0.8078755624592304} 12/23/2021 19:13:10 - INFO - codeparrot_training - Step 6194: {'lr': 0.00048221127056087624, 'samples': 3171840, 'steps': 6194, 'batch_loss/train': 0.9787458833307028} 12/23/2021 19:13:21 - INFO - codeparrot_training - Step 6195: {'lr': 0.00048220543035664033, 'samples': 3172352, 'steps': 6195, 'batch_loss/train': 0.9517631372436881} 12/23/2021 19:13:31 - INFO - codeparrot_training - Step 6196: {'lr': 0.00048219958922924336, 'samples': 3172864, 'steps': 6196, 'batch_loss/train': 0.8827500715851784} 12/23/2021 19:13:43 - INFO - codeparrot_training - Step 6197: {'lr': 0.0004821937471787084, 'samples': 3173376, 'steps': 6197, 'batch_loss/train': 0.8455467880703509} 12/23/2021 19:13:54 - INFO - codeparrot_training - 
Step 6198: {'lr': 0.00048218790420505883, 'samples': 3173888, 'steps': 6198, 'batch_loss/train': 0.845529867336154} 12/23/2021 19:14:05 - INFO - codeparrot_training - Step 6199: {'lr': 0.0004821820603083178, 'samples': 3174400, 'steps': 6199, 'batch_loss/train': 0.9171948381699622} 12/23/2021 19:14:17 - INFO - codeparrot_training - Step 6200: {'lr': 0.00048217621548850856, 'samples': 3174912, 'steps': 6200, 'batch_loss/train': 0.8182698942255229} 12/23/2021 19:14:27 - INFO - codeparrot_training - Step 6201: {'lr': 0.00048217036974565437, 'samples': 3175424, 'steps': 6201, 'batch_loss/train': 0.7989611849188805} 12/23/2021 19:14:38 - INFO - codeparrot_training - Step 6202: {'lr': 0.0004821645230797784, 'samples': 3175936, 'steps': 6202, 'batch_loss/train': 0.76821602974087} 12/23/2021 19:14:51 - INFO - codeparrot_training - Step 6203: {'lr': 0.00048215867549090404, 'samples': 3176448, 'steps': 6203, 'batch_loss/train': 0.9126338651403785} 12/23/2021 19:15:01 - INFO - codeparrot_training - Step 6204: {'lr': 0.00048215282697905437, 'samples': 3176960, 'steps': 6204, 'batch_loss/train': 0.8932211957871914} 12/23/2021 19:15:12 - INFO - codeparrot_training - Step 6205: {'lr': 0.00048214697754425277, 'samples': 3177472, 'steps': 6205, 'batch_loss/train': 0.801686423830688} 12/23/2021 19:15:23 - INFO - codeparrot_training - Step 6206: {'lr': 0.0004821411271865225, 'samples': 3177984, 'steps': 6206, 'batch_loss/train': 0.8449810985475779} 12/23/2021 19:15:35 - INFO - codeparrot_training - Step 6207: {'lr': 0.00048213527590588664, 'samples': 3178496, 'steps': 6207, 'batch_loss/train': 0.8430665256455541} 12/23/2021 19:15:45 - INFO - codeparrot_training - Step 6208: {'lr': 0.0004821294237023686, 'samples': 3179008, 'steps': 6208, 'batch_loss/train': 0.7703283014707267} 12/23/2021 19:15:56 - INFO - codeparrot_training - Step 6209: {'lr': 0.00048212357057599166, 'samples': 3179520, 'steps': 6209, 'batch_loss/train': 0.8773196246474981} 12/23/2021 19:16:09 - INFO - 
codeparrot_training - Step 6210: {'lr': 0.00048211771652677905, 'samples': 3180032, 'steps': 6210, 'batch_loss/train': 0.7646572799421847} 12/23/2021 19:16:20 - INFO - codeparrot_training - Step 6211: {'lr': 0.00048211186155475394, 'samples': 3180544, 'steps': 6211, 'batch_loss/train': 0.7790250760735944} 12/23/2021 19:16:30 - INFO - codeparrot_training - Step 6212: {'lr': 0.0004821060056599398, 'samples': 3181056, 'steps': 6212, 'batch_loss/train': 0.8670127233490348} 12/23/2021 19:16:43 - INFO - codeparrot_training - Step 6213: {'lr': 0.0004821001488423599, 'samples': 3181568, 'steps': 6213, 'batch_loss/train': 0.8039958495646715} 12/23/2021 19:16:53 - INFO - codeparrot_training - Step 6214: {'lr': 0.0004820942911020373, 'samples': 3182080, 'steps': 6214, 'batch_loss/train': 0.8576817661523819} 12/23/2021 19:17:04 - INFO - codeparrot_training - Step 6215: {'lr': 0.00048208843243899546, 'samples': 3182592, 'steps': 6215, 'batch_loss/train': 0.8986051958054304} 12/23/2021 19:17:14 - INFO - codeparrot_training - Step 6216: {'lr': 0.0004820825728532577, 'samples': 3183104, 'steps': 6216, 'batch_loss/train': 0.6305037504062057} 12/23/2021 19:17:26 - INFO - codeparrot_training - Step 6217: {'lr': 0.00048207671234484724, 'samples': 3183616, 'steps': 6217, 'batch_loss/train': 0.8251875410787761} 12/23/2021 19:17:37 - INFO - codeparrot_training - Step 6218: {'lr': 0.0004820708509137874, 'samples': 3184128, 'steps': 6218, 'batch_loss/train': 0.8745332164689898} 12/23/2021 19:17:48 - INFO - codeparrot_training - Step 6219: {'lr': 0.0004820649885601015, 'samples': 3184640, 'steps': 6219, 'batch_loss/train': 0.7438611974939704} 12/23/2021 19:18:00 - INFO - codeparrot_training - Step 6220: {'lr': 0.0004820591252838128, 'samples': 3185152, 'steps': 6220, 'batch_loss/train': 0.8181492444127798} 12/23/2021 19:18:10 - INFO - codeparrot_training - Step 6221: {'lr': 0.00048205326108494465, 'samples': 3185664, 'steps': 6221, 'batch_loss/train': 0.7528134528547525} 12/23/2021 19:18:21 
- INFO - codeparrot_training - Step 6222: {'lr': 0.00048204739596352037, 'samples': 3186176, 'steps': 6222, 'batch_loss/train': 0.9202347565442324} 12/23/2021 19:18:33 - INFO - codeparrot_training - Step 6223: {'lr': 0.00048204152991956326, 'samples': 3186688, 'steps': 6223, 'batch_loss/train': 0.7890992667526007} 12/23/2021 19:18:43 - INFO - codeparrot_training - Step 6224: {'lr': 0.0004820356629530966, 'samples': 3187200, 'steps': 6224, 'batch_loss/train': 0.8971185488626361} 12/23/2021 19:18:54 - INFO - codeparrot_training - Step 6225: {'lr': 0.0004820297950641438, 'samples': 3187712, 'steps': 6225, 'batch_loss/train': 0.8682784158736467} 12/23/2021 19:19:07 - INFO - codeparrot_training - Step 6226: {'lr': 0.00048202392625272815, 'samples': 3188224, 'steps': 6226, 'batch_loss/train': 0.9077216796576977} 12/23/2021 19:19:18 - INFO - codeparrot_training - Step 6227: {'lr': 0.000482018056518873, 'samples': 3188736, 'steps': 6227, 'batch_loss/train': 1.0404776018112898} 12/23/2021 19:19:28 - INFO - codeparrot_training - Step 6228: {'lr': 0.0004820121858626016, 'samples': 3189248, 'steps': 6228, 'batch_loss/train': 0.7974925003945827} 12/23/2021 19:19:39 - INFO - codeparrot_training - Step 6229: {'lr': 0.00048200631428393735, 'samples': 3189760, 'steps': 6229, 'batch_loss/train': 0.8182296128943563} 12/23/2021 19:19:51 - INFO - codeparrot_training - Step 6230: {'lr': 0.00048200044178290367, 'samples': 3190272, 'steps': 6230, 'batch_loss/train': 0.8441724951844662} 12/23/2021 19:20:01 - INFO - codeparrot_training - Step 6231: {'lr': 0.00048199456835952374, 'samples': 3190784, 'steps': 6231, 'batch_loss/train': 0.8412117380648851} 12/23/2021 19:20:12 - INFO - codeparrot_training - Step 6232: {'lr': 0.00048198869401382106, 'samples': 3191296, 'steps': 6232, 'batch_loss/train': 0.8549791490659118} 12/23/2021 19:20:25 - INFO - codeparrot_training - Step 6233: {'lr': 0.00048198281874581897, 'samples': 3191808, 'steps': 6233, 'batch_loss/train': 0.8353495132178068} 
12/23/2021 19:20:36 - INFO - codeparrot_training - Step 6234: {'lr': 0.0004819769425555407, 'samples': 3192320, 'steps': 6234, 'batch_loss/train': 0.8882966991513968} 12/23/2021 19:20:46 - INFO - codeparrot_training - Step 6235: {'lr': 0.0004819710654430097, 'samples': 3192832, 'steps': 6235, 'batch_loss/train': 0.8255678373388946} 12/23/2021 19:20:59 - INFO - codeparrot_training - Step 6236: {'lr': 0.0004819651874082494, 'samples': 3193344, 'steps': 6236, 'batch_loss/train': 0.7989011937752366} 12/23/2021 19:21:09 - INFO - codeparrot_training - Step 6237: {'lr': 0.0004819593084512831, 'samples': 3193856, 'steps': 6237, 'batch_loss/train': 0.7307513374835253} 12/23/2021 19:21:20 - INFO - codeparrot_training - Step 6238: {'lr': 0.0004819534285721341, 'samples': 3194368, 'steps': 6238, 'batch_loss/train': 0.8647325607016683} 12/23/2021 19:21:30 - INFO - codeparrot_training - Step 6239: {'lr': 0.00048194754777082597, 'samples': 3194880, 'steps': 6239, 'batch_loss/train': 1.0070551447570324} 12/23/2021 19:21:42 - INFO - codeparrot_training - Step 6240: {'lr': 0.00048194166604738185, 'samples': 3195392, 'steps': 6240, 'batch_loss/train': 0.9316873487550765} 12/23/2021 19:21:53 - INFO - codeparrot_training - Step 6241: {'lr': 0.0004819357834018253, 'samples': 3195904, 'steps': 6241, 'batch_loss/train': 0.8423882224597037} 12/23/2021 19:22:04 - INFO - codeparrot_training - Step 6242: {'lr': 0.00048192989983417963, 'samples': 3196416, 'steps': 6242, 'batch_loss/train': 0.8540492197498679} 12/23/2021 19:22:17 - INFO - codeparrot_training - Step 6243: {'lr': 0.00048192401534446827, 'samples': 3196928, 'steps': 6243, 'batch_loss/train': 0.8456971254199743} 12/23/2021 19:22:28 - INFO - codeparrot_training - Step 6244: {'lr': 0.00048191812993271456, 'samples': 3197440, 'steps': 6244, 'batch_loss/train': 0.842575398273766} 12/23/2021 19:22:38 - INFO - codeparrot_training - Step 6245: {'lr': 0.00048191224359894194, 'samples': 3197952, 'steps': 6245, 'batch_loss/train': 
0.906162385828793} 12/23/2021 19:22:51 - INFO - codeparrot_training - Step 6246: {'lr': 0.0004819063563431738, 'samples': 3198464, 'steps': 6246, 'batch_loss/train': 0.7923900242894888} 12/23/2021 19:23:01 - INFO - codeparrot_training - Step 6247: {'lr': 0.00048190046816543367, 'samples': 3198976, 'steps': 6247, 'batch_loss/train': 0.8844696078449488} 12/23/2021 19:23:12 - INFO - codeparrot_training - Step 6248: {'lr': 0.00048189457906574475, 'samples': 3199488, 'steps': 6248, 'batch_loss/train': 0.4925309788668528} 12/23/2021 19:23:25 - INFO - codeparrot_training - Step 6249: {'lr': 0.00048188868904413054, 'samples': 3200000, 'steps': 6249, 'batch_loss/train': 0.8247741032391787} 12/23/2021 19:23:36 - INFO - codeparrot_training - Step 6250: {'lr': 0.0004818827981006144, 'samples': 3200512, 'steps': 6250, 'batch_loss/train': 0.9985879638697952} 12/23/2021 19:23:46 - INFO - codeparrot_training - Step 6251: {'lr': 0.0004818769062352199, 'samples': 3201024, 'steps': 6251, 'batch_loss/train': 2.35749838873744} 12/23/2021 19:23:57 - INFO - codeparrot_training - Step 6252: {'lr': 0.00048187101344797037, 'samples': 3201536, 'steps': 6252, 'batch_loss/train': 0.8339326102286577} 12/23/2021 19:24:09 - INFO - codeparrot_training - Step 6253: {'lr': 0.0004818651197388892, 'samples': 3202048, 'steps': 6253, 'batch_loss/train': 0.922753281891346} 12/23/2021 19:24:20 - INFO - codeparrot_training - Step 6254: {'lr': 0.00048185922510799985, 'samples': 3202560, 'steps': 6254, 'batch_loss/train': 0.8321730103343725} 12/23/2021 19:24:30 - INFO - codeparrot_training - Step 6255: {'lr': 0.0004818533295553258, 'samples': 3203072, 'steps': 6255, 'batch_loss/train': 0.7096857642754912} 12/23/2021 19:24:42 - INFO - codeparrot_training - Step 6256: {'lr': 0.00048184743308089047, 'samples': 3203584, 'steps': 6256, 'batch_loss/train': 0.8968213042244315} 12/23/2021 19:24:53 - INFO - codeparrot_training - Step 6257: {'lr': 0.0004818415356847173, 'samples': 3204096, 'steps': 6257, 
'batch_loss/train': 1.0182666124310344} 12/23/2021 19:25:03 - INFO - codeparrot_training - Step 6258: {'lr': 0.0004818356373668297, 'samples': 3204608, 'steps': 6258, 'batch_loss/train': 0.8146989271044731} 12/23/2021 19:25:14 - INFO - codeparrot_training - Step 6259: {'lr': 0.00048182973812725113, 'samples': 3205120, 'steps': 6259, 'batch_loss/train': 1.046090287156403} 12/23/2021 19:25:26 - INFO - codeparrot_training - Step 6260: {'lr': 0.00048182383796600505, 'samples': 3205632, 'steps': 6260, 'batch_loss/train': 0.8464795826002955} 12/23/2021 19:25:37 - INFO - codeparrot_training - Step 6261: {'lr': 0.0004818179368831149, 'samples': 3206144, 'steps': 6261, 'batch_loss/train': 0.8145375489257276} 12/23/2021 19:25:47 - INFO - codeparrot_training - Step 6262: {'lr': 0.00048181203487860426, 'samples': 3206656, 'steps': 6262, 'batch_loss/train': 0.8262369148433208} 12/23/2021 19:26:01 - INFO - codeparrot_training - Step 6263: {'lr': 0.00048180613195249644, 'samples': 3207168, 'steps': 6263, 'batch_loss/train': 0.8696820484474301} 12/23/2021 19:26:11 - INFO - codeparrot_training - Step 6264: {'lr': 0.00048180022810481493, 'samples': 3207680, 'steps': 6264, 'batch_loss/train': 0.7509583665523678} 12/23/2021 19:26:22 - INFO - codeparrot_training - Step 6265: {'lr': 0.0004817943233355833, 'samples': 3208192, 'steps': 6265, 'batch_loss/train': 0.894938675686717} 12/23/2021 19:26:34 - INFO - codeparrot_training - Step 6266: {'lr': 0.00048178841764482495, 'samples': 3208704, 'steps': 6266, 'batch_loss/train': 0.7860732432454824} 12/23/2021 19:26:44 - INFO - codeparrot_training - Step 6267: {'lr': 0.0004817825110325633, 'samples': 3209216, 'steps': 6267, 'batch_loss/train': 0.9254091447219253} 12/23/2021 19:26:55 - INFO - codeparrot_training - Step 6268: {'lr': 0.00048177660349882205, 'samples': 3209728, 'steps': 6268, 'batch_loss/train': 0.854687936604023} 12/23/2021 19:27:07 - INFO - codeparrot_training - Step 6269: {'lr': 0.0004817706950436244, 'samples': 3210240, 
'steps': 6269, 'batch_loss/train': 0.898751325905323} 12/23/2021 19:27:18 - INFO - codeparrot_training - Step 6270: {'lr': 0.00048176478566699406, 'samples': 3210752, 'steps': 6270, 'batch_loss/train': 0.8186372700147331} 12/23/2021 19:27:28 - INFO - codeparrot_training - Step 6271: {'lr': 0.0004817588753689544, 'samples': 3211264, 'steps': 6271, 'batch_loss/train': 0.7120343195274472} 12/23/2021 19:27:39 - INFO - codeparrot_training - Step 6272: {'lr': 0.00048175296414952894, 'samples': 3211776, 'steps': 6272, 'batch_loss/train': 0.7893921909853816} 12/23/2021 19:27:52 - INFO - codeparrot_training - Step 6273: {'lr': 0.0004817470520087413, 'samples': 3212288, 'steps': 6273, 'batch_loss/train': 0.8019486330449581} 12/23/2021 19:28:03 - INFO - codeparrot_training - Step 6274: {'lr': 0.0004817411389466148, 'samples': 3212800, 'steps': 6274, 'batch_loss/train': 0.8419052688404918} 12/23/2021 19:28:13 - INFO - codeparrot_training - Step 6275: {'lr': 0.00048173522496317304, 'samples': 3213312, 'steps': 6275, 'batch_loss/train': 0.8861635765060782} 12/23/2021 19:28:25 - INFO - codeparrot_training - Step 6276: {'lr': 0.00048172931005843956, 'samples': 3213824, 'steps': 6276, 'batch_loss/train': 0.865821173414588} 12/23/2021 19:28:36 - INFO - codeparrot_training - Step 6277: {'lr': 0.00048172339423243786, 'samples': 3214336, 'steps': 6277, 'batch_loss/train': 0.7708904966711998} 12/23/2021 19:28:47 - INFO - codeparrot_training - Step 6278: {'lr': 0.00048171747748519136, 'samples': 3214848, 'steps': 6278, 'batch_loss/train': 0.8452870985493064} 12/23/2021 19:29:00 - INFO - codeparrot_training - Step 6279: {'lr': 0.00048171155981672375, 'samples': 3215360, 'steps': 6279, 'batch_loss/train': 0.9331196639686823} 12/23/2021 19:29:10 - INFO - codeparrot_training - Step 6280: {'lr': 0.0004817056412270584, 'samples': 3215872, 'steps': 6280, 'batch_loss/train': 0.8746827887371182} 12/23/2021 19:29:21 - INFO - codeparrot_training - Step 6281: {'lr': 0.00048169972171621894, 
'samples': 3216384, 'steps': 6281, 'batch_loss/train': 0.9949339097365737} 12/23/2021 19:29:32 - INFO - codeparrot_training - Step 6282: {'lr': 0.0004816938012842289, 'samples': 3216896, 'steps': 6282, 'batch_loss/train': 0.8937361128628254} 12/23/2021 19:29:44 - INFO - codeparrot_training - Step 6283: {'lr': 0.00048168787993111176, 'samples': 3217408, 'steps': 6283, 'batch_loss/train': 0.8732508295215666} 12/23/2021 19:29:54 - INFO - codeparrot_training - Step 6284: {'lr': 0.00048168195765689104, 'samples': 3217920, 'steps': 6284, 'batch_loss/train': 0.8593590329401195} 12/23/2021 19:30:05 - INFO - codeparrot_training - Step 6285: {'lr': 0.0004816760344615904, 'samples': 3218432, 'steps': 6285, 'batch_loss/train': 1.6732439696788788} 12/23/2021 19:30:17 - INFO - codeparrot_training - Step 6286: {'lr': 0.00048167011034523335, 'samples': 3218944, 'steps': 6286, 'batch_loss/train': 0.8306072591803968} 12/23/2021 19:30:28 - INFO - codeparrot_training - Step 6287: {'lr': 0.0004816641853078433, 'samples': 3219456, 'steps': 6287, 'batch_loss/train': 1.2401721561327577} 12/23/2021 19:30:39 - INFO - codeparrot_training - Step 6288: {'lr': 0.0004816582593494441, 'samples': 3219968, 'steps': 6288, 'batch_loss/train': 0.7845822980161756} 12/23/2021 19:30:53 - INFO - codeparrot_training - Step 6289: {'lr': 0.000481652332470059, 'samples': 3220480, 'steps': 6289, 'batch_loss/train': 0.8633116045966744} 12/23/2021 19:31:03 - INFO - codeparrot_training - Step 6290: {'lr': 0.00048164640466971166, 'samples': 3220992, 'steps': 6290, 'batch_loss/train': 0.9027934828773141} 12/23/2021 19:31:14 - INFO - codeparrot_training - Step 6291: {'lr': 0.0004816404759484258, 'samples': 3221504, 'steps': 6291, 'batch_loss/train': 0.7266103513538837} 12/23/2021 19:31:24 - INFO - codeparrot_training - Step 6292: {'lr': 0.0004816345463062248, 'samples': 3222016, 'steps': 6292, 'batch_loss/train': 0.8378871174063534} 12/23/2021 19:31:37 - INFO - codeparrot_training - Step 6293: {'lr': 
0.0004816286157431323, 'samples': 3222528, 'steps': 6293, 'batch_loss/train': 0.8284159526228905} 12/23/2021 19:31:47 - INFO - codeparrot_training - Step 6294: {'lr': 0.00048162268425917197, 'samples': 3223040, 'steps': 6294, 'batch_loss/train': 0.9377811551094055} 12/23/2021 19:31:58 - INFO - codeparrot_training - Step 6295: {'lr': 0.0004816167518543672, 'samples': 3223552, 'steps': 6295, 'batch_loss/train': 0.7822399965953082} 12/23/2021 19:32:10 - INFO - codeparrot_training - Step 6296: {'lr': 0.00048161081852874174, 'samples': 3224064, 'steps': 6296, 'batch_loss/train': 1.0336824841797352} 12/23/2021 19:32:20 - INFO - codeparrot_training - Step 6297: {'lr': 0.0004816048842823191, 'samples': 3224576, 'steps': 6297, 'batch_loss/train': 0.8062356067821383} 12/23/2021 19:32:31 - INFO - codeparrot_training - Step 6298: {'lr': 0.0004815989491151229, 'samples': 3225088, 'steps': 6298, 'batch_loss/train': 0.8574892738834023} 12/23/2021 19:32:44 - INFO - codeparrot_training - Step 6299: {'lr': 0.00048159301302717666, 'samples': 3225600, 'steps': 6299, 'batch_loss/train': 0.8124414128251374} 12/23/2021 19:32:54 - INFO - codeparrot_training - Step 6300: {'lr': 0.00048158707601850416, 'samples': 3226112, 'steps': 6300, 'batch_loss/train': 0.8690347336232662} 12/23/2021 19:33:05 - INFO - codeparrot_training - Step 6301: {'lr': 0.00048158113808912886, 'samples': 3226624, 'steps': 6301, 'batch_loss/train': 0.9247024366632104} 12/23/2021 19:33:17 - INFO - codeparrot_training - Step 6302: {'lr': 0.00048157519923907433, 'samples': 3227136, 'steps': 6302, 'batch_loss/train': 0.881863534450531} 12/23/2021 19:33:28 - INFO - codeparrot_training - Step 6303: {'lr': 0.00048156925946836427, 'samples': 3227648, 'steps': 6303, 'batch_loss/train': 0.8179659713059664} 12/23/2021 19:33:38 - INFO - codeparrot_training - Step 6304: {'lr': 0.0004815633187770223, 'samples': 3228160, 'steps': 6304, 'batch_loss/train': 0.9069101735949516} 12/23/2021 19:33:49 - INFO - codeparrot_training - Step 
6305: {'lr': 0.000481557377165072, 'samples': 3228672, 'steps': 6305, 'batch_loss/train': 0.8446366367861629} 12/23/2021 19:34:01 - INFO - codeparrot_training - Step 6306: {'lr': 0.000481551434632537, 'samples': 3229184, 'steps': 6306, 'batch_loss/train': 0.8532740147784352} 12/23/2021 19:34:11 - INFO - codeparrot_training - Step 6307: {'lr': 0.0004815454911794409, 'samples': 3229696, 'steps': 6307, 'batch_loss/train': 0.9394666934385896} 12/23/2021 19:34:22 - INFO - codeparrot_training - Step 6308: {'lr': 0.00048153954680580735, 'samples': 3230208, 'steps': 6308, 'batch_loss/train': 0.8495687246322632} 12/23/2021 19:34:35 - INFO - codeparrot_training - Step 6309: {'lr': 0.00048153360151166006, 'samples': 3230720, 'steps': 6309, 'batch_loss/train': 0.8641701713204384} 12/23/2021 19:34:46 - INFO - codeparrot_training - Step 6310: {'lr': 0.0004815276552970226, 'samples': 3231232, 'steps': 6310, 'batch_loss/train': 0.7383868074975908} 12/23/2021 19:34:56 - INFO - codeparrot_training - Step 6311: {'lr': 0.00048152170816191854, 'samples': 3231744, 'steps': 6311, 'batch_loss/train': 0.8469790294766426} 12/23/2021 19:35:08 - INFO - codeparrot_training - Step 6312: {'lr': 0.0004815157601063715, 'samples': 3232256, 'steps': 6312, 'batch_loss/train': 0.7395428996533155} 12/23/2021 19:35:19 - INFO - codeparrot_training - Step 6313: {'lr': 0.0004815098111304054, 'samples': 3232768, 'steps': 6313, 'batch_loss/train': 0.9501498807221651} 12/23/2021 19:35:30 - INFO - codeparrot_training - Step 6314: {'lr': 0.0004815038612340436, 'samples': 3233280, 'steps': 6314, 'batch_loss/train': 0.7551151113584638} 12/23/2021 19:35:40 - INFO - codeparrot_training - Step 6315: {'lr': 0.0004814979104173099, 'samples': 3233792, 'steps': 6315, 'batch_loss/train': 0.7866106517612934} 12/23/2021 19:35:52 - INFO - codeparrot_training - Step 6316: {'lr': 0.0004814919586802279, 'samples': 3234304, 'steps': 6316, 'batch_loss/train': 0.8321128427051008} 12/23/2021 19:36:03 - INFO - codeparrot_training - 
Step 6317: {'lr': 0.0004814860060228213, 'samples': 3234816, 'steps': 6317, 'batch_loss/train': 0.7506366181187332} 12/23/2021 19:36:14 - INFO - codeparrot_training - Step 6318: {'lr': 0.00048148005244511374, 'samples': 3235328, 'steps': 6318, 'batch_loss/train': 0.8028223887085915} 12/23/2021 19:36:27 - INFO - codeparrot_training - Step 6319: {'lr': 0.00048147409794712883, 'samples': 3235840, 'steps': 6319, 'batch_loss/train': 0.9844986703246832} 12/23/2021 19:36:38 - INFO - codeparrot_training - Step 6320: {'lr': 0.00048146814252889036, 'samples': 3236352, 'steps': 6320, 'batch_loss/train': 0.8473936170339584} 12/23/2021 19:36:49 - INFO - codeparrot_training - Step 6321: {'lr': 0.00048146218619042196, 'samples': 3236864, 'steps': 6321, 'batch_loss/train': 0.9189420435577631} 12/23/2021 19:37:01 - INFO - codeparrot_training - Step 6322: {'lr': 0.0004814562289317473, 'samples': 3237376, 'steps': 6322, 'batch_loss/train': 0.8597549072001129} 12/23/2021 19:37:11 - INFO - codeparrot_training - Step 6323: {'lr': 0.0004814502707528901, 'samples': 3237888, 'steps': 6323, 'batch_loss/train': 0.6611430912744254} 12/23/2021 19:37:22 - INFO - codeparrot_training - Step 6324: {'lr': 0.00048144431165387397, 'samples': 3238400, 'steps': 6324, 'batch_loss/train': 0.8822643496096134} 12/23/2021 19:37:34 - INFO - codeparrot_training - Step 6325: {'lr': 0.00048143835163472264, 'samples': 3238912, 'steps': 6325, 'batch_loss/train': 0.9055756200104952} 12/23/2021 19:37:45 - INFO - codeparrot_training - Step 6326: {'lr': 0.0004814323906954599, 'samples': 3239424, 'steps': 6326, 'batch_loss/train': 0.7927223173901439} 12/23/2021 19:37:55 - INFO - codeparrot_training - Step 6327: {'lr': 0.0004814264288361093, 'samples': 3239936, 'steps': 6327, 'batch_loss/train': 0.777548614423722} 12/23/2021 19:38:06 - INFO - codeparrot_training - Step 6328: {'lr': 0.0004814204660566945, 'samples': 3240448, 'steps': 6328, 'batch_loss/train': 0.8562621530145407} 12/23/2021 19:38:19 - INFO - 
codeparrot_training - Step 6329: {'lr': 0.00048141450235723946, 'samples': 3240960, 'steps': 6329, 'batch_loss/train': 0.7070676367729902} 12/23/2021 19:38:30 - INFO - codeparrot_training - Step 6330: {'lr': 0.0004814085377377677, 'samples': 3241472, 'steps': 6330, 'batch_loss/train': 0.8096684413030744} 12/23/2021 19:38:40 - INFO - codeparrot_training - Step 6331: {'lr': 0.0004814025721983029, 'samples': 3241984, 'steps': 6331, 'batch_loss/train': 0.9125736569985747} 12/23/2021 19:38:53 - INFO - codeparrot_training - Step 6332: {'lr': 0.0004813966057388689, 'samples': 3242496, 'steps': 6332, 'batch_loss/train': 0.8385059824213386} 12/23/2021 19:39:03 - INFO - codeparrot_training - Step 6333: {'lr': 0.0004813906383594894, 'samples': 3243008, 'steps': 6333, 'batch_loss/train': 0.8188949963077903} 12/23/2021 19:39:14 - INFO - codeparrot_training - Step 6334: {'lr': 0.00048138467006018806, 'samples': 3243520, 'steps': 6334, 'batch_loss/train': 0.7544340868480504} 12/23/2021 19:39:27 - INFO - codeparrot_training - Step 6335: {'lr': 0.0004813787008409886, 'samples': 3244032, 'steps': 6335, 'batch_loss/train': 0.8732138620689511} 12/23/2021 19:39:37 - INFO - codeparrot_training - Step 6336: {'lr': 0.00048137273070191487, 'samples': 3244544, 'steps': 6336, 'batch_loss/train': 0.8448209017515182} 12/23/2021 19:39:48 - INFO - codeparrot_training - Step 6337: {'lr': 0.0004813667596429905, 'samples': 3245056, 'steps': 6337, 'batch_loss/train': 0.911784770898521} 12/23/2021 19:39:59 - INFO - codeparrot_training - Step 6338: {'lr': 0.0004813607876642392, 'samples': 3245568, 'steps': 6338, 'batch_loss/train': 0.7801407835213467} 12/23/2021 19:40:11 - INFO - codeparrot_training - Step 6339: {'lr': 0.00048135481476568474, 'samples': 3246080, 'steps': 6339, 'batch_loss/train': 0.7718469509854913} 12/23/2021 19:40:22 - INFO - codeparrot_training - Step 6340: {'lr': 0.00048134884094735097, 'samples': 3246592, 'steps': 6340, 'batch_loss/train': 0.9036385612562299} 12/23/2021 19:40:32 
- INFO - codeparrot_training - Step 6341: {'lr': 0.0004813428662092616, 'samples': 3247104, 'steps': 6341, 'batch_loss/train': 0.9393473770469427} 12/23/2021 19:40:44 - INFO - codeparrot_training - Step 6342: {'lr': 0.0004813368905514403, 'samples': 3247616, 'steps': 6342, 'batch_loss/train': 0.8473982876166701} 12/23/2021 19:40:55 - INFO - codeparrot_training - Step 6343: {'lr': 0.0004813309139739108, 'samples': 3248128, 'steps': 6343, 'batch_loss/train': 0.9001483237370849} 12/23/2021 19:41:05 - INFO - codeparrot_training - Step 6344: {'lr': 0.00048132493647669707, 'samples': 3248640, 'steps': 6344, 'batch_loss/train': 0.8117022826336324} 12/23/2021 19:41:18 - INFO - codeparrot_training - Step 6345: {'lr': 0.0004813189580598226, 'samples': 3249152, 'steps': 6345, 'batch_loss/train': 0.8036365965381265} 12/23/2021 19:41:28 - INFO - codeparrot_training - Step 6346: {'lr': 0.00048131297872331145, 'samples': 3249664, 'steps': 6346, 'batch_loss/train': 0.8008499867282808} 12/23/2021 19:41:39 - INFO - codeparrot_training - Step 6347: {'lr': 0.0004813069984671871, 'samples': 3250176, 'steps': 6347, 'batch_loss/train': 0.8498573061078787} 12/23/2021 19:41:49 - INFO - codeparrot_training - Step 6348: {'lr': 0.00048130101729147354, 'samples': 3250688, 'steps': 6348, 'batch_loss/train': 0.8696333570405841} 12/23/2021 19:42:03 - INFO - codeparrot_training - Step 6349: {'lr': 0.00048129503519619445, 'samples': 3251200, 'steps': 6349, 'batch_loss/train': 0.6134173478931189} 12/23/2021 19:42:14 - INFO - codeparrot_training - Step 6350: {'lr': 0.0004812890521813736, 'samples': 3251712, 'steps': 6350, 'batch_loss/train': 0.856201246380806} 12/23/2021 19:42:24 - INFO - codeparrot_training - Step 6351: {'lr': 0.0004812830682470348, 'samples': 3252224, 'steps': 6351, 'batch_loss/train': 1.0737401600927114} 12/23/2021 19:42:36 - INFO - codeparrot_training - Step 6352: {'lr': 0.00048127708339320196, 'samples': 3252736, 'steps': 6352, 'batch_loss/train': 0.7982528447173536} 12/23/2021 
19:42:47 - INFO - codeparrot_training - Step 6353: {'lr': 0.0004812710976198987, 'samples': 3253248, 'steps': 6353, 'batch_loss/train': 0.8610861953347921} 12/23/2021 19:42:57 - INFO - codeparrot_training - Step 6354: {'lr': 0.00048126511092714885, 'samples': 3253760, 'steps': 6354, 'batch_loss/train': 0.8516823714599013} 12/23/2021 19:43:09 - INFO - codeparrot_training - Step 6355: {'lr': 0.0004812591233149762, 'samples': 3254272, 'steps': 6355, 'batch_loss/train': 0.8582667214795947} 12/23/2021 19:43:20 - INFO - codeparrot_training - Step 6356: {'lr': 0.0004812531347834047, 'samples': 3254784, 'steps': 6356, 'batch_loss/train': 0.8837211281061172} 12/23/2021 19:43:31 - INFO - codeparrot_training - Step 6357: {'lr': 0.00048124714533245806, 'samples': 3255296, 'steps': 6357, 'batch_loss/train': 0.8715837579220533} 12/23/2021 19:43:44 - INFO - codeparrot_training - Step 6358: {'lr': 0.00048124115496216006, 'samples': 3255808, 'steps': 6358, 'batch_loss/train': 0.7893350338563323} 12/23/2021 19:43:55 - INFO - codeparrot_training - Step 6359: {'lr': 0.00048123516367253445, 'samples': 3256320, 'steps': 6359, 'batch_loss/train': 1.5292960852384567} 12/23/2021 19:44:05 - INFO - codeparrot_training - Step 6360: {'lr': 0.0004812291714636052, 'samples': 3256832, 'steps': 6360, 'batch_loss/train': 0.9796842550858855} 12/23/2021 19:44:16 - INFO - codeparrot_training - Step 6361: {'lr': 0.00048122317833539606, 'samples': 3257344, 'steps': 6361, 'batch_loss/train': 0.8317240439355373} 12/23/2021 19:44:28 - INFO - codeparrot_training - Step 6362: {'lr': 0.00048121718428793093, 'samples': 3257856, 'steps': 6362, 'batch_loss/train': 0.903604437597096} 12/23/2021 19:44:39 - INFO - codeparrot_training - Step 6363: {'lr': 0.00048121118932123355, 'samples': 3258368, 'steps': 6363, 'batch_loss/train': 0.6846163356676698} 12/23/2021 19:44:49 - INFO - codeparrot_training - Step 6364: {'lr': 0.00048120519343532777, 'samples': 3258880, 'steps': 6364, 'batch_loss/train': 0.6292945020832121} 
12/23/2021 19:45:03 - INFO - codeparrot_training - Step 6365: {'lr': 0.0004811991966302374, 'samples': 3259392, 'steps': 6365, 'batch_loss/train': 0.7843211852014065} 12/23/2021 19:45:13 - INFO - codeparrot_training - Step 6366: {'lr': 0.0004811931989059864, 'samples': 3259904, 'steps': 6366, 'batch_loss/train': 0.8552224021404982} 12/23/2021 19:45:24 - INFO - codeparrot_training - Step 6367: {'lr': 0.0004811872002625985, 'samples': 3260416, 'steps': 6367, 'batch_loss/train': 0.8568976307287812} 12/23/2021 19:45:36 - INFO - codeparrot_training - Step 6368: {'lr': 0.00048118120070009756, 'samples': 3260928, 'steps': 6368, 'batch_loss/train': 1.033564972691238} 12/23/2021 19:45:46 - INFO - codeparrot_training - Step 6369: {'lr': 0.00048117520021850754, 'samples': 3261440, 'steps': 6369, 'batch_loss/train': 0.7953883782029152} 12/23/2021 19:45:57 - INFO - codeparrot_training - Step 6370: {'lr': 0.00048116919881785214, 'samples': 3261952, 'steps': 6370, 'batch_loss/train': 0.8229173682630062} 12/23/2021 19:46:07 - INFO - codeparrot_training - Step 6371: {'lr': 0.0004811631964981553, 'samples': 3262464, 'steps': 6371, 'batch_loss/train': 0.9379030726850033} 12/23/2021 19:46:19 - INFO - codeparrot_training - Step 6372: {'lr': 0.00048115719325944085, 'samples': 3262976, 'steps': 6372, 'batch_loss/train': 0.8385634506121278} 12/23/2021 19:46:30 - INFO - codeparrot_training - Step 6373: {'lr': 0.00048115118910173273, 'samples': 3263488, 'steps': 6373, 'batch_loss/train': 0.9522645147517323} 12/23/2021 19:46:41 - INFO - codeparrot_training - Step 6374: {'lr': 0.00048114518402505473, 'samples': 3264000, 'steps': 6374, 'batch_loss/train': 0.626209256472066} 12/23/2021 19:46:54 - INFO - codeparrot_training - Step 6375: {'lr': 0.0004811391780294308, 'samples': 3264512, 'steps': 6375, 'batch_loss/train': 0.8362092841416597} 12/23/2021 19:47:05 - INFO - codeparrot_training - Step 6376: {'lr': 0.00048113317111488476, 'samples': 3265024, 'steps': 6376, 'batch_loss/train': 
0.8504860168322921} 12/23/2021 19:47:15 - INFO - codeparrot_training - Step 6377: {'lr': 0.0004811271632814405, 'samples': 3265536, 'steps': 6377, 'batch_loss/train': 0.8129426334053278} 12/23/2021 19:47:27 - INFO - codeparrot_training - Step 6378: {'lr': 0.00048112115452912186, 'samples': 3266048, 'steps': 6378, 'batch_loss/train': 0.6898706387728453} 12/23/2021 19:47:38 - INFO - codeparrot_training - Step 6379: {'lr': 0.00048111514485795284, 'samples': 3266560, 'steps': 6379, 'batch_loss/train': 0.8930934621021152} 12/23/2021 19:47:49 - INFO - codeparrot_training - Step 6380: {'lr': 0.00048110913426795723, 'samples': 3267072, 'steps': 6380, 'batch_loss/train': 0.8650483749806881} 12/23/2021 19:48:01 - INFO - codeparrot_training - Step 6381: {'lr': 0.00048110312275915894, 'samples': 3267584, 'steps': 6381, 'batch_loss/train': 0.8397275903262198} 12/23/2021 19:48:12 - INFO - codeparrot_training - Step 6382: {'lr': 0.00048109711033158197, 'samples': 3268096, 'steps': 6382, 'batch_loss/train': 0.8087675860151649} 12/23/2021 19:48:22 - INFO - codeparrot_training - Step 6383: {'lr': 0.00048109109698525013, 'samples': 3268608, 'steps': 6383, 'batch_loss/train': 0.8483970323577523} 12/23/2021 19:48:33 - INFO - codeparrot_training - Step 6384: {'lr': 0.0004810850827201873, 'samples': 3269120, 'steps': 6384, 'batch_loss/train': 0.8381582163274288} 12/23/2021 19:48:45 - INFO - codeparrot_training - Step 6385: {'lr': 0.00048107906753641746, 'samples': 3269632, 'steps': 6385, 'batch_loss/train': 0.8905774354934692} 12/23/2021 19:48:55 - INFO - codeparrot_training - Step 6386: {'lr': 0.0004810730514339645, 'samples': 3270144, 'steps': 6386, 'batch_loss/train': 0.8074898542836308} 12/23/2021 19:49:06 - INFO - codeparrot_training - Step 6387: {'lr': 0.0004810670344128524, 'samples': 3270656, 'steps': 6387, 'batch_loss/train': 0.7808846835978329} 12/23/2021 19:49:19 - INFO - codeparrot_training - Step 6388: {'lr': 0.0004810610164731049, 'samples': 3271168, 'steps': 6388, 
'batch_loss/train': 0.9271870595403016} 12/23/2021 19:49:30 - INFO - codeparrot_training - Step 6389: {'lr': 0.00048105499761474615, 'samples': 3271680, 'steps': 6389, 'batch_loss/train': 0.9958838375750929} 12/23/2021 19:49:40 - INFO - codeparrot_training - Step 6390: {'lr': 0.00048104897783779986, 'samples': 3272192, 'steps': 6390, 'batch_loss/train': 0.8702499251812696} 12/23/2021 19:49:52 - INFO - codeparrot_training - Step 6391: {'lr': 0.00048104295714229017, 'samples': 3272704, 'steps': 6391, 'batch_loss/train': 0.853831073269248} 12/23/2021 19:50:03 - INFO - codeparrot_training - Step 6392: {'lr': 0.00048103693552824093, 'samples': 3273216, 'steps': 6392, 'batch_loss/train': 0.795758131891489} 12/23/2021 19:50:14 - INFO - codeparrot_training - Step 6393: {'lr': 0.000481030912995676, 'samples': 3273728, 'steps': 6393, 'batch_loss/train': 0.804118241649121} 12/23/2021 19:50:24 - INFO - codeparrot_training - Step 6394: {'lr': 0.0004810248895446194, 'samples': 3274240, 'steps': 6394, 'batch_loss/train': 0.8954105116426945} 12/23/2021 19:50:37 - INFO - codeparrot_training - Step 6395: {'lr': 0.0004810188651750951, 'samples': 3274752, 'steps': 6395, 'batch_loss/train': 0.7883667070418596} 12/23/2021 19:50:48 - INFO - codeparrot_training - Step 6396: {'lr': 0.000481012839887127, 'samples': 3275264, 'steps': 6396, 'batch_loss/train': 0.8737477893009782} 12/23/2021 19:50:59 - INFO - codeparrot_training - Step 6397: {'lr': 0.0004810068136807391, 'samples': 3275776, 'steps': 6397, 'batch_loss/train': 0.8072757786139846} 12/23/2021 19:51:11 - INFO - codeparrot_training - Step 6398: {'lr': 0.0004810007865559554, 'samples': 3276288, 'steps': 6398, 'batch_loss/train': 0.7836789852008224} 12/23/2021 19:51:22 - INFO - codeparrot_training - Step 6399: {'lr': 0.0004809947585127996, 'samples': 3276800, 'steps': 6399, 'batch_loss/train': 0.8123418122995645} 12/23/2021 19:51:32 - INFO - codeparrot_training - Step 6400: {'lr': 0.000480988729551296, 'samples': 3277312, 'steps': 
6400, 'batch_loss/train': 0.9266685266047716} 12/23/2021 19:51:44 - INFO - codeparrot_training - Step 6401: {'lr': 0.00048098269967146846, 'samples': 3277824, 'steps': 6401, 'batch_loss/train': 0.8726921994239092} 12/23/2021 19:51:55 - INFO - codeparrot_training - Step 6402: {'lr': 0.00048097666887334087, 'samples': 3278336, 'steps': 6402, 'batch_loss/train': 0.7403618663083762} 12/23/2021 19:52:06 - INFO - codeparrot_training - Step 6403: {'lr': 0.00048097063715693723, 'samples': 3278848, 'steps': 6403, 'batch_loss/train': 0.85459856223315} 12/23/2021 19:52:16 - INFO - codeparrot_training - Step 6404: {'lr': 0.0004809646045222816, 'samples': 3279360, 'steps': 6404, 'batch_loss/train': 0.8324901051819324} 12/23/2021 19:52:30 - INFO - codeparrot_training - Step 6405: {'lr': 0.00048095857096939787, 'samples': 3279872, 'steps': 6405, 'batch_loss/train': 0.7237504990771413} 12/23/2021 19:52:41 - INFO - codeparrot_training - Step 6406: {'lr': 0.00048095253649831006, 'samples': 3280384, 'steps': 6406, 'batch_loss/train': 0.8302901855204254} 12/23/2021 19:52:51 - INFO - codeparrot_training - Step 6407: {'lr': 0.0004809465011090423, 'samples': 3280896, 'steps': 6407, 'batch_loss/train': 0.8410794641822577} 12/23/2021 19:53:03 - INFO - codeparrot_training - Step 6408: {'lr': 0.00048094046480161827, 'samples': 3281408, 'steps': 6408, 'batch_loss/train': 0.8613712405785918} 12/23/2021 19:53:14 - INFO - codeparrot_training - Step 6409: {'lr': 0.0004809344275760622, 'samples': 3281920, 'steps': 6409, 'batch_loss/train': 0.7484542764723301} 12/23/2021 19:53:25 - INFO - codeparrot_training - Step 6410: {'lr': 0.00048092838943239815, 'samples': 3282432, 'steps': 6410, 'batch_loss/train': 1.0185464266687632} 12/23/2021 19:53:39 - INFO - codeparrot_training - Step 6411: {'lr': 0.0004809223503706499, 'samples': 3282944, 'steps': 6411, 'batch_loss/train': 0.7482278631068766} 12/23/2021 19:53:50 - INFO - codeparrot_training - Step 6412: {'lr': 0.0004809163103908416, 'samples': 3283456, 
'steps': 6412, 'batch_loss/train': 0.8691043246071786} 12/23/2021 19:54:01 - INFO - codeparrot_training - Step 6413: {'lr': 0.00048091026949299734, 'samples': 3283968, 'steps': 6413, 'batch_loss/train': 1.037929973565042} 12/23/2021 19:54:11 - INFO - codeparrot_training - Step 6414: {'lr': 0.0004809042276771409, 'samples': 3284480, 'steps': 6414, 'batch_loss/train': 1.1975951215717942} 12/23/2021 19:54:24 - INFO - codeparrot_training - Step 6415: {'lr': 0.0004808981849432965, 'samples': 3284992, 'steps': 6415, 'batch_loss/train': 0.9099480397999287} 12/23/2021 19:54:34 - INFO - codeparrot_training - Step 6416: {'lr': 0.00048089214129148813, 'samples': 3285504, 'steps': 6416, 'batch_loss/train': 0.8592549653258175} 12/23/2021 19:54:45 - INFO - codeparrot_training - Step 6417: {'lr': 0.00048088609672173975, 'samples': 3286016, 'steps': 6417, 'batch_loss/train': 0.9314098749309778} 12/23/2021 19:54:57 - INFO - codeparrot_training - Step 6418: {'lr': 0.0004808800512340754, 'samples': 3286528, 'steps': 6418, 'batch_loss/train': 0.9605526095256209} 12/23/2021 19:55:08 - INFO - codeparrot_training - Step 6419: {'lr': 0.00048087400482851915, 'samples': 3287040, 'steps': 6419, 'batch_loss/train': 0.851240792311728} 12/23/2021 19:55:18 - INFO - codeparrot_training - Step 6420: {'lr': 0.0004808679575050951, 'samples': 3287552, 'steps': 6420, 'batch_loss/train': 0.9608183884993196} 12/23/2021 19:55:30 - INFO - codeparrot_training - Step 6421: {'lr': 0.00048086190926382717, 'samples': 3288064, 'steps': 6421, 'batch_loss/train': 0.838118314743042} 12/23/2021 19:55:41 - INFO - codeparrot_training - Step 6422: {'lr': 0.00048085586010473947, 'samples': 3288576, 'steps': 6422, 'batch_loss/train': 0.8038072532508522} 12/23/2021 19:55:52 - INFO - codeparrot_training - Step 6423: {'lr': 0.00048084981002785603, 'samples': 3289088, 'steps': 6423, 'batch_loss/train': 0.8141986755654216} 12/23/2021 19:56:02 - INFO - codeparrot_training - Step 6424: {'lr': 0.00048084375903320093, 'samples': 
3289600, 'steps': 6424, 'batch_loss/train': 1.060533693060279} 12/23/2021 19:56:16 - INFO - codeparrot_training - Step 6425: {'lr': 0.00048083770712079824, 'samples': 3290112, 'steps': 6425, 'batch_loss/train': 1.7615102054551244} 12/23/2021 19:56:26 - INFO - codeparrot_training - Step 6426: {'lr': 0.00048083165429067186, 'samples': 3290624, 'steps': 6426, 'batch_loss/train': 0.9591216673143208} 12/23/2021 19:56:37 - INFO - codeparrot_training - Step 6427: {'lr': 0.00048082560054284613, 'samples': 3291136, 'steps': 6427, 'batch_loss/train': 0.8680540444329381} 12/23/2021 19:56:49 - INFO - codeparrot_training - Step 6428: {'lr': 0.0004808195458773449, 'samples': 3291648, 'steps': 6428, 'batch_loss/train': 0.8220078824087977} 12/23/2021 19:56:59 - INFO - codeparrot_training - Step 6429: {'lr': 0.0004808134902941923, 'samples': 3292160, 'steps': 6429, 'batch_loss/train': 0.8071011849679053} 12/23/2021 19:57:10 - INFO - codeparrot_training - Step 6430: {'lr': 0.0004808074337934124, 'samples': 3292672, 'steps': 6430, 'batch_loss/train': 0.7421411306131631} 12/23/2021 19:57:23 - INFO - codeparrot_training - Step 6431: {'lr': 0.00048080137637502933, 'samples': 3293184, 'steps': 6431, 'batch_loss/train': 0.9479916235432029} 12/23/2021 19:57:34 - INFO - codeparrot_training - Step 6432: {'lr': 0.00048079531803906715, 'samples': 3293696, 'steps': 6432, 'batch_loss/train': 0.8733512340113521} 12/23/2021 19:57:44 - INFO - codeparrot_training - Step 6433: {'lr': 0.0004807892587855499, 'samples': 3294208, 'steps': 6433, 'batch_loss/train': 1.2029640022665262} 12/23/2021 19:57:55 - INFO - codeparrot_training - Step 6434: {'lr': 0.0004807831986145017, 'samples': 3294720, 'steps': 6434, 'batch_loss/train': 0.9388163797557354} 12/23/2021 19:58:08 - INFO - codeparrot_training - Step 6435: {'lr': 0.0004807771375259466, 'samples': 3295232, 'steps': 6435, 'batch_loss/train': 0.8262115097604692} 12/23/2021 19:58:19 - INFO - codeparrot_training - Step 6436: {'lr': 0.00048077107551990886, 
'samples': 3295744, 'steps': 6436, 'batch_loss/train': 0.9198131035082042} 12/23/2021 19:58:29 - INFO - codeparrot_training - Step 6437: {'lr': 0.0004807650125964124, 'samples': 3296256, 'steps': 6437, 'batch_loss/train': 0.9057351322844625} 12/23/2021 19:58:41 - INFO - codeparrot_training - Step 6438: {'lr': 0.0004807589487554813, 'samples': 3296768, 'steps': 6438, 'batch_loss/train': 0.7224897118285298} 12/23/2021 19:58:52 - INFO - codeparrot_training - Step 6439: {'lr': 0.00048075288399713985, 'samples': 3297280, 'steps': 6439, 'batch_loss/train': 0.8056897735223174} 12/23/2021 19:59:03 - INFO - codeparrot_training - Step 6440: {'lr': 0.00048074681832141204, 'samples': 3297792, 'steps': 6440, 'batch_loss/train': 0.9764278046786785} 12/23/2021 19:59:16 - INFO - codeparrot_training - Step 6441: {'lr': 0.000480740751728322, 'samples': 3298304, 'steps': 6441, 'batch_loss/train': 0.8987559946253896} 12/23/2021 19:59:26 - INFO - codeparrot_training - Step 6442: {'lr': 0.00048073468421789385, 'samples': 3298816, 'steps': 6442, 'batch_loss/train': 0.7083995998837054} 12/23/2021 19:59:37 - INFO - codeparrot_training - Step 6443: {'lr': 0.0004807286157901517, 'samples': 3299328, 'steps': 6443, 'batch_loss/train': 0.961805148050189} 12/23/2021 19:59:47 - INFO - codeparrot_training - Step 6444: {'lr': 0.0004807225464451197, 'samples': 3299840, 'steps': 6444, 'batch_loss/train': 0.8678195807151496} 12/23/2021 20:00:00 - INFO - codeparrot_training - Step 6445: {'lr': 0.00048071647618282197, 'samples': 3300352, 'steps': 6445, 'batch_loss/train': 0.8939205892384052} 12/23/2021 20:00:10 - INFO - codeparrot_training - Step 6446: {'lr': 0.0004807104050032827, 'samples': 3300864, 'steps': 6446, 'batch_loss/train': 1.3441665079444647} 12/23/2021 20:00:21 - INFO - codeparrot_training - Step 6447: {'lr': 0.00048070433290652595, 'samples': 3301376, 'steps': 6447, 'batch_loss/train': 0.8329836772754788} 12/23/2021 20:00:33 - INFO - codeparrot_training - Step 6448: {'lr': 
0.0004806982598925759, 'samples': 3301888, 'steps': 6448, 'batch_loss/train': 0.8244279325008392} 12/23/2021 20:00:43 - INFO - codeparrot_training - Step 6449: {'lr': 0.00048069218596145664, 'samples': 3302400, 'steps': 6449, 'batch_loss/train': 0.9494886007159948} 12/23/2021 20:00:54 - INFO - codeparrot_training - Step 6450: {'lr': 0.0004806861111131924, 'samples': 3302912, 'steps': 6450, 'batch_loss/train': 0.8874971494078636} 12/23/2021 20:01:07 - INFO - codeparrot_training - Step 6451: {'lr': 0.00048068003534780724, 'samples': 3303424, 'steps': 6451, 'batch_loss/train': 0.8556239377940074} 12/23/2021 20:01:18 - INFO - codeparrot_training - Step 6452: {'lr': 0.0004806739586653254, 'samples': 3303936, 'steps': 6452, 'batch_loss/train': 0.9241299037821591} 12/23/2021 20:01:29 - INFO - codeparrot_training - Step 6453: {'lr': 0.0004806678810657709, 'samples': 3304448, 'steps': 6453, 'batch_loss/train': 1.1470076469704509} 12/23/2021 20:01:39 - INFO - codeparrot_training - Step 6454: {'lr': 0.00048066180254916803, 'samples': 3304960, 'steps': 6454, 'batch_loss/train': 0.766250197775662} 12/23/2021 20:01:51 - INFO - codeparrot_training - Step 6455: {'lr': 0.000480655723115541, 'samples': 3305472, 'steps': 6455, 'batch_loss/train': 0.8772042475175112} 12/23/2021 20:02:02 - INFO - codeparrot_training - Step 6456: {'lr': 0.00048064964276491387, 'samples': 3305984, 'steps': 6456, 'batch_loss/train': 0.8490691867191344} 12/23/2021 20:02:13 - INFO - codeparrot_training - Step 6457: {'lr': 0.0004806435614973109, 'samples': 3306496, 'steps': 6457, 'batch_loss/train': 0.9350399544928223} 12/23/2021 20:02:25 - INFO - codeparrot_training - Step 6458: {'lr': 0.00048063747931275616, 'samples': 3307008, 'steps': 6458, 'batch_loss/train': 0.9145379476249218} 12/23/2021 20:02:35 - INFO - codeparrot_training - Step 6459: {'lr': 0.0004806313962112739, 'samples': 3307520, 'steps': 6459, 'batch_loss/train': 0.8328520357608795} 12/23/2021 20:02:46 - INFO - codeparrot_training - Step 6460: 
{'lr': 0.0004806253121928883, 'samples': 3308032, 'steps': 6460, 'batch_loss/train': 0.8851549779064953} 12/23/2021 20:02:58 - INFO - codeparrot_training - Step 6461: {'lr': 0.00048061922725762353, 'samples': 3308544, 'steps': 6461, 'batch_loss/train': 0.8435468666721135} 12/23/2021 20:03:09 - INFO - codeparrot_training - Step 6462: {'lr': 0.00048061314140550375, 'samples': 3309056, 'steps': 6462, 'batch_loss/train': 0.895835685543716} 12/23/2021 20:03:19 - INFO - codeparrot_training - Step 6463: {'lr': 0.00048060705463655323, 'samples': 3309568, 'steps': 6463, 'batch_loss/train': 0.9102049637585878} 12/23/2021 20:03:30 - INFO - codeparrot_training - Step 6464: {'lr': 0.00048060096695079615, 'samples': 3310080, 'steps': 6464, 'batch_loss/train': 0.8298476664349437} 12/23/2021 20:03:43 - INFO - codeparrot_training - Step 6465: {'lr': 0.0004805948783482567, 'samples': 3310592, 'steps': 6465, 'batch_loss/train': 0.7796417763456702} 12/23/2021 20:03:53 - INFO - codeparrot_training - Step 6466: {'lr': 0.0004805887888289591, 'samples': 3311104, 'steps': 6466, 'batch_loss/train': 0.8460877658799291} 12/23/2021 20:04:04 - INFO - codeparrot_training - Step 6467: {'lr': 0.00048058269839292743, 'samples': 3311616, 'steps': 6467, 'batch_loss/train': 0.8286208184435964} 12/23/2021 20:04:16 - INFO - codeparrot_training - Step 6468: {'lr': 0.0004805766070401861, 'samples': 3312128, 'steps': 6468, 'batch_loss/train': 0.9626922030001879} 12/23/2021 20:04:27 - INFO - codeparrot_training - Step 6469: {'lr': 0.0004805705147707593, 'samples': 3312640, 'steps': 6469, 'batch_loss/train': 0.8472865018993616} 12/23/2021 20:04:37 - INFO - codeparrot_training - Step 6470: {'lr': 0.0004805644215846712, 'samples': 3313152, 'steps': 6470, 'batch_loss/train': 0.8984522046521306} 12/23/2021 20:04:49 - INFO - codeparrot_training - Step 6471: {'lr': 0.0004805583274819459, 'samples': 3313664, 'steps': 6471, 'batch_loss/train': 0.8957743154605851} 12/23/2021 20:05:00 - INFO - codeparrot_training - 
Step 6472: {'lr': 0.00048055223246260784, 'samples': 3314176, 'steps': 6472, 'batch_loss/train': 0.8871749863028526} 12/23/2021 20:05:10 - INFO - codeparrot_training - Step 6473: {'lr': 0.00048054613652668106, 'samples': 3314688, 'steps': 6473, 'batch_loss/train': 0.7900642659515142} 12/23/2021 20:05:24 - INFO - codeparrot_training - Step 6474: {'lr': 0.00048054003967419, 'samples': 3315200, 'steps': 6474, 'batch_loss/train': 0.906851259060204} 12/23/2021 20:05:34 - INFO - codeparrot_training - Step 6475: {'lr': 0.0004805339419051587, 'samples': 3315712, 'steps': 6475, 'batch_loss/train': 0.7048701220192015} 12/23/2021 20:05:45 - INFO - codeparrot_training - Step 6476: {'lr': 0.0004805278432196115, 'samples': 3316224, 'steps': 6476, 'batch_loss/train': 0.8619322348386049} 12/23/2021 20:05:55 - INFO - codeparrot_training - Step 6477: {'lr': 0.0004805217436175727, 'samples': 3316736, 'steps': 6477, 'batch_loss/train': 0.8694862369447947} 12/23/2021 20:06:08 - INFO - codeparrot_training - Step 6478: {'lr': 0.0004805156430990665, 'samples': 3317248, 'steps': 6478, 'batch_loss/train': 1.5792609071359038} 12/23/2021 20:06:18 - INFO - codeparrot_training - Step 6479: {'lr': 0.0004805095416641171, 'samples': 3317760, 'steps': 6479, 'batch_loss/train': 0.8769527189433575} 12/23/2021 20:06:29 - INFO - codeparrot_training - Step 6480: {'lr': 0.00048050343931274874, 'samples': 3318272, 'steps': 6480, 'batch_loss/train': 0.9341939566656947} 12/23/2021 20:06:42 - INFO - codeparrot_training - Step 6481: {'lr': 0.0004804973360449858, 'samples': 3318784, 'steps': 6481, 'batch_loss/train': 0.821267444640398} 12/23/2021 20:06:53 - INFO - codeparrot_training - Step 6482: {'lr': 0.0004804912318608525, 'samples': 3319296, 'steps': 6482, 'batch_loss/train': 0.840239395853132} 12/23/2021 20:07:03 - INFO - codeparrot_training - Step 6483: {'lr': 0.0004804851267603731, 'samples': 3319808, 'steps': 6483, 'batch_loss/train': 0.9293851079419255} 12/23/2021 20:07:15 - INFO - codeparrot_training 
- Step 6484: {'lr': 0.00048047902074357186, 'samples': 3320320, 'steps': 6484, 'batch_loss/train': 0.9276780383661389} 12/23/2021 20:07:26 - INFO - codeparrot_training - Step 6485: {'lr': 0.000480472913810473, 'samples': 3320832, 'steps': 6485, 'batch_loss/train': 0.7623352063819766} 12/23/2021 20:07:36 - INFO - codeparrot_training - Step 6486: {'lr': 0.00048046680596110095, 'samples': 3321344, 'steps': 6486, 'batch_loss/train': 0.90085812471807} 12/23/2021 20:07:47 - INFO - codeparrot_training - Step 6487: {'lr': 0.0004804606971954798, 'samples': 3321856, 'steps': 6487, 'batch_loss/train': 0.806686520576477} 12/23/2021 20:07:59 - INFO - codeparrot_training - Step 6488: {'lr': 0.00048045458751363405, 'samples': 3322368, 'steps': 6488, 'batch_loss/train': 0.8182236813008785} 12/23/2021 20:08:10 - INFO - codeparrot_training - Step 6489: {'lr': 0.00048044847691558777, 'samples': 3322880, 'steps': 6489, 'batch_loss/train': 0.706020301906392} 12/23/2021 20:08:20 - INFO - codeparrot_training - Step 6490: {'lr': 0.00048044236540136545, 'samples': 3323392, 'steps': 6490, 'batch_loss/train': 0.8961585154756904} 12/23/2021 20:08:33 - INFO - codeparrot_training - Step 6491: {'lr': 0.0004804362529709912, 'samples': 3323904, 'steps': 6491, 'batch_loss/train': 0.908109886571765} 12/23/2021 20:08:44 - INFO - codeparrot_training - Step 6492: {'lr': 0.00048043013962448944, 'samples': 3324416, 'steps': 6492, 'batch_loss/train': 0.8798798006027937} 12/23/2021 20:08:55 - INFO - codeparrot_training - Step 6493: {'lr': 0.0004804240253618845, 'samples': 3324928, 'steps': 6493, 'batch_loss/train': 0.7110456082737073} 12/23/2021 20:09:07 - INFO - codeparrot_training - Step 6494: {'lr': 0.0004804179101832006, 'samples': 3325440, 'steps': 6494, 'batch_loss/train': 0.7935129031538963} 12/23/2021 20:09:17 - INFO - codeparrot_training - Step 6495: {'lr': 0.0004804117940884621, 'samples': 3325952, 'steps': 6495, 'batch_loss/train': 0.9848827430978417} 12/23/2021 20:09:28 - INFO - 
codeparrot_training - Step 6496: {'lr': 0.00048040567707769333, 'samples': 3326464, 'steps': 6496, 'batch_loss/train': 0.8262265999801457} 12/23/2021 20:09:40 - INFO - codeparrot_training - Step 6497: {'lr': 0.0004803995591509186, 'samples': 3326976, 'steps': 6497, 'batch_loss/train': 0.8847430376335979} 12/23/2021 20:09:50 - INFO - codeparrot_training - Step 6498: {'lr': 0.0004803934403081622, 'samples': 3327488, 'steps': 6498, 'batch_loss/train': 0.8682459238916636} 12/23/2021 20:10:01 - INFO - codeparrot_training - Step 6499: {'lr': 0.00048038732054944846, 'samples': 3328000, 'steps': 6499, 'batch_loss/train': 0.8376254271715879} 12/23/2021 20:10:12 - INFO - codeparrot_training - Step 6500: {'lr': 0.0004803811998748017, 'samples': 3328512, 'steps': 6500, 'batch_loss/train': 0.8831671602092683} 12/23/2021 20:10:24 - INFO - codeparrot_training - Step 6501: {'lr': 0.00048037507828424633, 'samples': 3329024, 'steps': 6501, 'batch_loss/train': 0.6934801011811942} 12/23/2021 20:10:34 - INFO - codeparrot_training - Step 6502: {'lr': 0.00048036895577780657, 'samples': 3329536, 'steps': 6502, 'batch_loss/train': 0.9600462084636092} 12/23/2021 20:10:45 - INFO - codeparrot_training - Step 6503: {'lr': 0.0004803628323555069, 'samples': 3330048, 'steps': 6503, 'batch_loss/train': 0.7408625164534897} 12/23/2021 20:10:58 - INFO - codeparrot_training - Step 6504: {'lr': 0.00048035670801737154, 'samples': 3330560, 'steps': 6504, 'batch_loss/train': 0.7881877408362925} 12/23/2021 20:11:09 - INFO - codeparrot_training - Step 6505: {'lr': 0.0004803505827634249, 'samples': 3331072, 'steps': 6505, 'batch_loss/train': 0.8209440105129033} 12/23/2021 20:11:19 - INFO - codeparrot_training - Step 6506: {'lr': 0.00048034445659369136, 'samples': 3331584, 'steps': 6506, 'batch_loss/train': 0.8336921203881502} 12/23/2021 20:11:31 - INFO - codeparrot_training - Step 6507: {'lr': 0.0004803383295081952, 'samples': 3332096, 'steps': 6507, 'batch_loss/train': 0.8553354847244918} 12/23/2021 
20:11:42 - INFO - codeparrot_training - Step 6508: {'lr': 0.00048033220150696084, 'samples': 3332608, 'steps': 6508, 'batch_loss/train': 0.8041099812835455} 12/23/2021 20:11:53 - INFO - codeparrot_training - Step 6509: {'lr': 0.0004803260725900126, 'samples': 3333120, 'steps': 6509, 'batch_loss/train': 0.8419149741530418} 12/23/2021 20:12:03 - INFO - codeparrot_training - Step 6510: {'lr': 0.0004803199427573748, 'samples': 3333632, 'steps': 6510, 'batch_loss/train': 0.8743484714068472} 12/23/2021 20:12:17 - INFO - codeparrot_training - Step 6511: {'lr': 0.00048031381200907197, 'samples': 3334144, 'steps': 6511, 'batch_loss/train': 0.8302447162568569} 12/23/2021 20:12:27 - INFO - codeparrot_training - Step 6512: {'lr': 0.00048030768034512835, 'samples': 3334656, 'steps': 6512, 'batch_loss/train': 0.800785016734153} 12/23/2021 20:12:38 - INFO - codeparrot_training - Step 6513: {'lr': 0.00048030154776556834, 'samples': 3335168, 'steps': 6513, 'batch_loss/train': 0.8584978580474854} 12/23/2021 20:12:50 - INFO - codeparrot_training - Step 6514: {'lr': 0.00048029541427041636, 'samples': 3335680, 'steps': 6514, 'batch_loss/train': 0.8323653806000948} 12/23/2021 20:13:01 - INFO - codeparrot_training - Step 6515: {'lr': 0.00048028927985969673, 'samples': 3336192, 'steps': 6515, 'batch_loss/train': 0.8048906493932009} 12/23/2021 20:13:11 - INFO - codeparrot_training - Step 6516: {'lr': 0.0004802831445334339, 'samples': 3336704, 'steps': 6516, 'batch_loss/train': 0.9010712541639805} 12/23/2021 20:13:23 - INFO - codeparrot_training - Step 6517: {'lr': 0.00048027700829165223, 'samples': 3337216, 'steps': 6517, 'batch_loss/train': 0.8179630199447274} 12/23/2021 20:13:34 - INFO - codeparrot_training - Step 6518: {'lr': 0.00048027087113437615, 'samples': 3337728, 'steps': 6518, 'batch_loss/train': 0.8379490403458476} 12/23/2021 20:13:44 - INFO - codeparrot_training - Step 6519: {'lr': 0.00048026473306163, 'samples': 3338240, 'steps': 6519, 'batch_loss/train': 0.8069698652252555} 
12/23/2021 20:13:58 - INFO - codeparrot_training - Step 6520: {'lr': 0.00048025859407343824, 'samples': 3338752, 'steps': 6520, 'batch_loss/train': 0.8801242681220174} 12/23/2021 20:14:09 - INFO - codeparrot_training - Step 6521: {'lr': 0.0004802524541698252, 'samples': 3339264, 'steps': 6521, 'batch_loss/train': 0.73272276786156} 12/23/2021 20:14:20 - INFO - codeparrot_training - Step 6522: {'lr': 0.00048024631335081544, 'samples': 3339776, 'steps': 6522, 'batch_loss/train': 0.744210752658546} 12/23/2021 20:14:30 - INFO - codeparrot_training - Step 6523: {'lr': 0.0004802401716164332, 'samples': 3340288, 'steps': 6523, 'batch_loss/train': 0.8441927218809724} 12/23/2021 20:14:43 - INFO - codeparrot_training - Step 6524: {'lr': 0.00048023402896670296, 'samples': 3340800, 'steps': 6524, 'batch_loss/train': 0.8280469486489892} 12/23/2021 20:14:53 - INFO - codeparrot_training - Step 6525: {'lr': 0.00048022788540164925, 'samples': 3341312, 'steps': 6525, 'batch_loss/train': 0.9143739575520158} 12/23/2021 20:15:04 - INFO - codeparrot_training - Step 6526: {'lr': 0.00048022174092129634, 'samples': 3341824, 'steps': 6526, 'batch_loss/train': 0.7089239874621853} 12/23/2021 20:15:17 - INFO - codeparrot_training - Step 6527: {'lr': 0.00048021559552566875, 'samples': 3342336, 'steps': 6527, 'batch_loss/train': 0.9276117165572941} 12/23/2021 20:15:28 - INFO - codeparrot_training - Step 6528: {'lr': 0.0004802094492147908, 'samples': 3342848, 'steps': 6528, 'batch_loss/train': 0.8338212631642818} 12/23/2021 20:15:38 - INFO - codeparrot_training - Step 6529: {'lr': 0.00048020330198868705, 'samples': 3343360, 'steps': 6529, 'batch_loss/train': 0.9346329420804977} 12/23/2021 20:15:50 - INFO - codeparrot_training - Step 6530: {'lr': 0.0004801971538473819, 'samples': 3343872, 'steps': 6530, 'batch_loss/train': 0.8599731419235468} 12/23/2021 20:16:01 - INFO - codeparrot_training - Step 6531: {'lr': 0.0004801910047908997, 'samples': 3344384, 'steps': 6531, 'batch_loss/train': 
0.8909636307507753} 12/23/2021 20:16:11 - INFO - codeparrot_training - Step 6532: {'lr': 0.0004801848548192651, 'samples': 3344896, 'steps': 6532, 'batch_loss/train': 0.8314579529687762} 12/23/2021 20:16:23 - INFO - codeparrot_training - Step 6533: {'lr': 0.0004801787039325024, 'samples': 3345408, 'steps': 6533, 'batch_loss/train': 0.8733090781606734} 12/23/2021 20:16:34 - INFO - codeparrot_training - Step 6534: {'lr': 0.00048017255213063603, 'samples': 3345920, 'steps': 6534, 'batch_loss/train': 0.8336505237966776} 12/23/2021 20:16:45 - INFO - codeparrot_training - Step 6535: {'lr': 0.00048016639941369057, 'samples': 3346432, 'steps': 6535, 'batch_loss/train': 0.767832423793152} 12/23/2021 20:16:55 - INFO - codeparrot_training - Step 6536: {'lr': 0.00048016024578169045, 'samples': 3346944, 'steps': 6536, 'batch_loss/train': 0.9003973612561822} 12/23/2021 20:17:07 - INFO - codeparrot_training - Step 6537: {'lr': 0.00048015409123466, 'samples': 3347456, 'steps': 6537, 'batch_loss/train': 0.940321097150445} 12/23/2021 20:17:18 - INFO - codeparrot_training - Step 6538: {'lr': 0.0004801479357726238, 'samples': 3347968, 'steps': 6538, 'batch_loss/train': 0.8167865313589573} 12/23/2021 20:17:29 - INFO - codeparrot_training - Step 6539: {'lr': 0.00048014177939560633, 'samples': 3348480, 'steps': 6539, 'batch_loss/train': 0.9539633998647332} 12/23/2021 20:17:42 - INFO - codeparrot_training - Step 6540: {'lr': 0.00048013562210363203, 'samples': 3348992, 'steps': 6540, 'batch_loss/train': 0.8450488159433007} 12/23/2021 20:17:53 - INFO - codeparrot_training - Step 6541: {'lr': 0.0004801294638967254, 'samples': 3349504, 'steps': 6541, 'batch_loss/train': 0.6666249760892242} 12/23/2021 20:18:03 - INFO - codeparrot_training - Step 6542: {'lr': 0.0004801233047749109, 'samples': 3350016, 'steps': 6542, 'batch_loss/train': 0.8120054844766855} 12/23/2021 20:18:15 - INFO - codeparrot_training - Step 6543: {'lr': 0.0004801171447382131, 'samples': 3350528, 'steps': 6543, 
'batch_loss/train': 0.6139815181377344} 12/23/2021 20:18:26 - INFO - codeparrot_training - Step 6544: {'lr': 0.0004801109837866564, 'samples': 3351040, 'steps': 6544, 'batch_loss/train': 0.8561935196630657} 12/23/2021 20:18:36 - INFO - codeparrot_training - Step 6545: {'lr': 0.00048010482192026527, 'samples': 3351552, 'steps': 6545, 'batch_loss/train': 0.6940255202935077} 12/23/2021 20:18:47 - INFO - codeparrot_training - Step 6546: {'lr': 0.0004800986591390643, 'samples': 3352064, 'steps': 6546, 'batch_loss/train': 0.7759471326135099} 12/23/2021 20:18:59 - INFO - codeparrot_training - Step 6547: {'lr': 0.0004800924954430779, 'samples': 3352576, 'steps': 6547, 'batch_loss/train': 0.860518092289567} 12/23/2021 20:19:10 - INFO - codeparrot_training - Step 6548: {'lr': 0.00048008633083233067, 'samples': 3353088, 'steps': 6548, 'batch_loss/train': 0.8524567922577262} 12/23/2021 20:19:20 - INFO - codeparrot_training - Step 6549: {'lr': 0.000480080165306847, 'samples': 3353600, 'steps': 6549, 'batch_loss/train': 0.7964440425857902} 12/23/2021 20:19:34 - INFO - codeparrot_training - Step 6550: {'lr': 0.00048007399886665147, 'samples': 3354112, 'steps': 6550, 'batch_loss/train': 0.7383102020248771} 12/23/2021 20:19:44 - INFO - codeparrot_training - Step 6551: {'lr': 0.00048006783151176866, 'samples': 3354624, 'steps': 6551, 'batch_loss/train': 0.8843648051843047} 12/23/2021 20:19:55 - INFO - codeparrot_training - Step 6552: {'lr': 0.000480061663242223, 'samples': 3355136, 'steps': 6552, 'batch_loss/train': 0.8106037890538573} 12/23/2021 20:20:07 - INFO - codeparrot_training - Step 6553: {'lr': 0.00048005549405803896, 'samples': 3355648, 'steps': 6553, 'batch_loss/train': 0.8211616163607687} 12/23/2021 20:20:18 - INFO - codeparrot_training - Step 6554: {'lr': 0.0004800493239592412, 'samples': 3356160, 'steps': 6554, 'batch_loss/train': 0.8528864551335573} 12/23/2021 20:20:28 - INFO - codeparrot_training - Step 6555: {'lr': 0.0004800431529458542, 'samples': 3356672, 'steps': 
6555, 'batch_loss/train': 0.8219797797501087} 12/23/2021 20:20:39 - INFO - codeparrot_training - Step 6556: {'lr': 0.0004800369810179025, 'samples': 3357184, 'steps': 6556, 'batch_loss/train': 0.8364271381869912} 12/23/2021 20:20:51 - INFO - codeparrot_training - Step 6557: {'lr': 0.00048003080817541054, 'samples': 3357696, 'steps': 6557, 'batch_loss/train': 0.7052570275263861} 12/23/2021 20:21:02 - INFO - codeparrot_training - Step 6558: {'lr': 0.0004800246344184031, 'samples': 3358208, 'steps': 6558, 'batch_loss/train': 0.9026262436527759} 12/23/2021 20:21:12 - INFO - codeparrot_training - Step 6559: {'lr': 0.0004800184597469044, 'samples': 3358720, 'steps': 6559, 'batch_loss/train': 0.885263055562973} 12/23/2021 20:21:26 - INFO - codeparrot_training - Step 6560: {'lr': 0.00048001228416093923, 'samples': 3359232, 'steps': 6560, 'batch_loss/train': 0.8901649378240108} 12/23/2021 20:21:36 - INFO - codeparrot_training - Step 6561: {'lr': 0.000480006107660532, 'samples': 3359744, 'steps': 6561, 'batch_loss/train': 0.7668513956014067} 12/23/2021 20:21:47 - INFO - codeparrot_training - Step 6562: {'lr': 0.00047999993024570745, 'samples': 3360256, 'steps': 6562, 'batch_loss/train': 0.9234152613207698} 12/23/2021 20:21:59 - INFO - codeparrot_training - Step 6563: {'lr': 0.00047999375191648997, 'samples': 3360768, 'steps': 6563, 'batch_loss/train': 0.8032295210286975} 12/23/2021 20:22:10 - INFO - codeparrot_training - Step 6564: {'lr': 0.0004799875726729042, 'samples': 3361280, 'steps': 6564, 'batch_loss/train': 0.8076360421255231} 12/23/2021 20:22:20 - INFO - codeparrot_training - Step 6565: {'lr': 0.00047998139251497455, 'samples': 3361792, 'steps': 6565, 'batch_loss/train': 0.8898304784670472} 12/23/2021 20:22:33 - INFO - codeparrot_training - Step 6566: {'lr': 0.00047997521144272585, 'samples': 3362304, 'steps': 6566, 'batch_loss/train': 0.8550694966688752} 12/23/2021 20:22:44 - INFO - codeparrot_training - Step 6567: {'lr': 0.0004799690294561825, 'samples': 3362816, 
'steps': 6567, 'batch_loss/train': 0.8138840948231518} 12/23/2021 20:22:55 - INFO - codeparrot_training - Step 6568: {'lr': 0.00047996284655536916, 'samples': 3363328, 'steps': 6568, 'batch_loss/train': 0.9386024735867977} 12/23/2021 20:23:05 - INFO - codeparrot_training - Step 6569: {'lr': 0.00047995666274031035, 'samples': 3363840, 'steps': 6569, 'batch_loss/train': 0.8730572648346424} 12/23/2021 20:23:18 - INFO - codeparrot_training - Step 6570: {'lr': 0.00047995047801103064, 'samples': 3364352, 'steps': 6570, 'batch_loss/train': 0.7664410825818777} 12/23/2021 20:23:28 - INFO - codeparrot_training - Step 6571: {'lr': 0.00047994429236755475, 'samples': 3364864, 'steps': 6571, 'batch_loss/train': 0.8345467457547784} 12/23/2021 20:23:39 - INFO - codeparrot_training - Step 6572: {'lr': 0.0004799381058099071, 'samples': 3365376, 'steps': 6572, 'batch_loss/train': 0.7103086861316115} 12/23/2021 20:23:51 - INFO - codeparrot_training - Step 6573: {'lr': 0.0004799319183381124, 'samples': 3365888, 'steps': 6573, 'batch_loss/train': 0.7529295559506863} 12/23/2021 20:24:02 - INFO - codeparrot_training - Step 6574: {'lr': 0.0004799257299521952, 'samples': 3366400, 'steps': 6574, 'batch_loss/train': 0.8248808747157454} 12/23/2021 20:24:12 - INFO - codeparrot_training - Step 6575: {'lr': 0.00047991954065218013, 'samples': 3366912, 'steps': 6575, 'batch_loss/train': 0.8490822650492191} 12/23/2021 20:24:24 - INFO - codeparrot_training - Step 6576: {'lr': 0.0004799133504380917, 'samples': 3367424, 'steps': 6576, 'batch_loss/train': 0.8851858330890536} 12/23/2021 20:24:35 - INFO - codeparrot_training - Step 6577: {'lr': 0.00047990715930995467, 'samples': 3367936, 'steps': 6577, 'batch_loss/train': 0.8618127414956689} 12/23/2021 20:24:46 - INFO - codeparrot_training - Step 6578: {'lr': 0.0004799009672677936, 'samples': 3368448, 'steps': 6578, 'batch_loss/train': 0.8040322391316295} 12/23/2021 20:24:56 - INFO - codeparrot_training - Step 6579: {'lr': 0.000479894774311633, 'samples': 
3368960, 'steps': 6579, 'batch_loss/train': 0.896579273045063} 12/23/2021 20:25:10 - INFO - codeparrot_training - Step 6580: {'lr': 0.0004798885804414977, 'samples': 3369472, 'steps': 6580, 'batch_loss/train': 0.7979774139821529} 12/23/2021 20:25:21 - INFO - codeparrot_training - Step 6581: {'lr': 0.00047988238565741214, 'samples': 3369984, 'steps': 6581, 'batch_loss/train': 0.7537045758217573} 12/23/2021 20:25:31 - INFO - codeparrot_training - Step 6582: {'lr': 0.00047987618995940103, 'samples': 3370496, 'steps': 6582, 'batch_loss/train': 0.8839086559601128} 12/23/2021 20:25:43 - INFO - codeparrot_training - Step 6583: {'lr': 0.00047986999334748896, 'samples': 3371008, 'steps': 6583, 'batch_loss/train': 0.7279384811408818} 12/23/2021 20:25:54 - INFO - codeparrot_training - Step 6584: {'lr': 0.00047986379582170066, 'samples': 3371520, 'steps': 6584, 'batch_loss/train': 0.856566933915019} 12/23/2021 20:26:05 - INFO - codeparrot_training - Step 6585: {'lr': 0.00047985759738206063, 'samples': 3372032, 'steps': 6585, 'batch_loss/train': 0.8632339239120483} 12/23/2021 20:26:17 - INFO - codeparrot_training - Step 6586: {'lr': 0.00047985139802859364, 'samples': 3372544, 'steps': 6586, 'batch_loss/train': 0.8242329321801662} 12/23/2021 20:26:27 - INFO - codeparrot_training - Step 6587: {'lr': 0.00047984519776132425, 'samples': 3373056, 'steps': 6587, 'batch_loss/train': 0.81181943975389} 12/23/2021 20:26:38 - INFO - codeparrot_training - Step 6588: {'lr': 0.00047983899658027713, 'samples': 3373568, 'steps': 6588, 'batch_loss/train': 0.9180567217990756} 12/23/2021 20:26:51 - INFO - codeparrot_training - Step 6589: {'lr': 0.00047983279448547694, 'samples': 3374080, 'steps': 6589, 'batch_loss/train': 0.9406764656305313} 12/23/2021 20:27:02 - INFO - codeparrot_training - Step 6590: {'lr': 0.00047982659147694835, 'samples': 3374592, 'steps': 6590, 'batch_loss/train': 0.8980073183774948} 12/23/2021 20:27:12 - INFO - codeparrot_training - Step 6591: {'lr': 0.000479820387554716, 
'samples': 3375104, 'steps': 6591, 'batch_loss/train': 0.8851090921089053} 12/23/2021 20:27:23 - INFO - codeparrot_training - Step 6592: {'lr': 0.0004798141827188046, 'samples': 3375616, 'steps': 6592, 'batch_loss/train': 0.833463029935956} 12/23/2021 20:27:35 - INFO - codeparrot_training - Step 6593: {'lr': 0.00047980797696923875, 'samples': 3376128, 'steps': 6593, 'batch_loss/train': 0.888979397714138} 12/23/2021 20:27:45 - INFO - codeparrot_training - Step 6594: {'lr': 0.0004798017703060432, 'samples': 3376640, 'steps': 6594, 'batch_loss/train': 0.871117239817977} 12/23/2021 20:27:56 - INFO - codeparrot_training - Step 6595: {'lr': 0.0004797955627292425, 'samples': 3377152, 'steps': 6595, 'batch_loss/train': 0.7689546751789749} 12/23/2021 20:28:10 - INFO - codeparrot_training - Step 6596: {'lr': 0.0004797893542388615, 'samples': 3377664, 'steps': 6596, 'batch_loss/train': 0.8592088455334306} 12/23/2021 20:28:20 - INFO - codeparrot_training - Step 6597: {'lr': 0.0004797831448349248, 'samples': 3378176, 'steps': 6597, 'batch_loss/train': 0.9597657714039087} 12/23/2021 20:28:31 - INFO - codeparrot_training - Step 6598: {'lr': 0.000479776934517457, 'samples': 3378688, 'steps': 6598, 'batch_loss/train': 0.8436875697225332} 12/23/2021 20:28:43 - INFO - codeparrot_training - Step 6599: {'lr': 0.0004797707232864829, 'samples': 3379200, 'steps': 6599, 'batch_loss/train': 0.805052412673831} 12/23/2021 20:28:53 - INFO - codeparrot_training - Step 6600: {'lr': 0.0004797645111420272, 'samples': 3379712, 'steps': 6600, 'batch_loss/train': 0.7700484776869416} 12/23/2021 20:29:04 - INFO - codeparrot_training - Step 6601: {'lr': 0.0004797582980841145, 'samples': 3380224, 'steps': 6601, 'batch_loss/train': 0.7565264399163425} 12/23/2021 20:29:14 - INFO - codeparrot_training - Step 6602: {'lr': 0.0004797520841127696, 'samples': 3380736, 'steps': 6602, 'batch_loss/train': 0.7521341005340219} 12/23/2021 20:29:27 - INFO - codeparrot_training - Step 6603: {'lr': 0.0004797458692280171, 
'samples': 3381248, 'steps': 6603, 'batch_loss/train': 0.8128621224313974} 12/23/2021 20:29:37 - INFO - codeparrot_training - Step 6604: {'lr': 0.00047973965342988184, 'samples': 3381760, 'steps': 6604, 'batch_loss/train': 0.85841272957623} 12/23/2021 20:29:48 - INFO - codeparrot_training - Step 6605: {'lr': 0.0004797334367183885, 'samples': 3382272, 'steps': 6605, 'batch_loss/train': 0.7509722039103508} 12/23/2021 20:30:01 - INFO - codeparrot_training - Step 6606: {'lr': 0.00047972721909356165, 'samples': 3382784, 'steps': 6606, 'batch_loss/train': 0.8125764739234} 12/23/2021 20:30:12 - INFO - codeparrot_training - Step 6607: {'lr': 0.0004797210005554262, 'samples': 3383296, 'steps': 6607, 'batch_loss/train': 0.828918632119894} 12/23/2021 20:30:22 - INFO - codeparrot_training - Step 6608: {'lr': 0.0004797147811040068, 'samples': 3383808, 'steps': 6608, 'batch_loss/train': 0.8678867779672146} 12/23/2021 20:30:34 - INFO - codeparrot_training - Step 6609: {'lr': 0.0004797085607393281, 'samples': 3384320, 'steps': 6609, 'batch_loss/train': 1.0170717069413513} 12/23/2021 20:30:45 - INFO - codeparrot_training - Step 6610: {'lr': 0.00047970233946141496, 'samples': 3384832, 'steps': 6610, 'batch_loss/train': 0.8456258634105325} 12/23/2021 20:30:56 - INFO - codeparrot_training - Step 6611: {'lr': 0.000479696117270292, 'samples': 3385344, 'steps': 6611, 'batch_loss/train': 0.9679913921281695} 12/23/2021 20:31:06 - INFO - codeparrot_training - Step 6612: {'lr': 0.00047968989416598403, 'samples': 3385856, 'steps': 6612, 'batch_loss/train': 0.8808631477877498} 12/23/2021 20:31:20 - INFO - codeparrot_training - Step 6613: {'lr': 0.0004796836701485158, 'samples': 3386368, 'steps': 6613, 'batch_loss/train': 0.8501312183216214} 12/23/2021 20:31:31 - INFO - codeparrot_training - Step 6614: {'lr': 0.00047967744521791193, 'samples': 3386880, 'steps': 6614, 'batch_loss/train': 0.8296098364517093} 12/23/2021 20:31:41 - INFO - codeparrot_training - Step 6615: {'lr': 
0.00047967121937419734, 'samples': 3387392, 'steps': 6615, 'batch_loss/train': 0.769038790720515} 12/23/2021 20:31:53 - INFO - codeparrot_training - Step 6616: {'lr': 0.00047966499261739673, 'samples': 3387904, 'steps': 6616, 'batch_loss/train': 0.9692985601723194} 12/23/2021 20:32:04 - INFO - codeparrot_training - Step 6617: {'lr': 0.0004796587649475347, 'samples': 3388416, 'steps': 6617, 'batch_loss/train': 0.8823113003745675} 12/23/2021 20:32:15 - INFO - codeparrot_training - Step 6618: {'lr': 0.00047965253636463625, 'samples': 3388928, 'steps': 6618, 'batch_loss/train': 0.9336572419852018} 12/23/2021 20:32:27 - INFO - codeparrot_training - Step 6619: {'lr': 0.00047964630686872604, 'samples': 3389440, 'steps': 6619, 'batch_loss/train': 0.8475145776756108} 12/23/2021 20:32:38 - INFO - codeparrot_training - Step 6620: {'lr': 0.0004796400764598288, 'samples': 3389952, 'steps': 6620, 'batch_loss/train': 0.7922201426699758} 12/23/2021 20:32:48 - INFO - codeparrot_training - Step 6621: {'lr': 0.00047963384513796924, 'samples': 3390464, 'steps': 6621, 'batch_loss/train': 0.9108885796740651} 12/23/2021 20:33:01 - INFO - codeparrot_training - Step 6622: {'lr': 0.0004796276129031723, 'samples': 3390976, 'steps': 6622, 'batch_loss/train': 0.8728324109688401} 12/23/2021 20:33:12 - INFO - codeparrot_training - Step 6623: {'lr': 0.0004796213797554626, 'samples': 3391488, 'steps': 6623, 'batch_loss/train': 0.824275222257711} 12/23/2021 20:33:23 - INFO - codeparrot_training - Step 6624: {'lr': 0.0004796151456948651, 'samples': 3392000, 'steps': 6624, 'batch_loss/train': 0.9303242638707161} 12/23/2021 20:33:35 - INFO - codeparrot_training - Step 6625: {'lr': 0.00047960891072140445, 'samples': 3392512, 'steps': 6625, 'batch_loss/train': 0.9236344583332539} 12/23/2021 20:33:45 - INFO - codeparrot_training - Step 6626: {'lr': 0.0004796026748351054, 'samples': 3393024, 'steps': 6626, 'batch_loss/train': 0.926085464656353} 12/23/2021 20:33:56 - INFO - codeparrot_training - Step 6627: 
{'lr': 0.0004795964380359929, 'samples': 3393536, 'steps': 6627, 'batch_loss/train': 0.946737531106919} 12/23/2021 20:34:07 - INFO - codeparrot_training - Step 6628: {'lr': 0.00047959020032409163, 'samples': 3394048, 'steps': 6628, 'batch_loss/train': 0.8760807253420353} 12/23/2021 20:34:19 - INFO - codeparrot_training - Step 6629: {'lr': 0.00047958396169942637, 'samples': 3394560, 'steps': 6629, 'batch_loss/train': 0.896924139931798} 12/23/2021 20:34:30 - INFO - codeparrot_training - Step 6630: {'lr': 0.000479577722162022, 'samples': 3395072, 'steps': 6630, 'batch_loss/train': 0.5449301741900854} 12/23/2021 20:34:40 - INFO - codeparrot_training - Step 6631: {'lr': 0.0004795714817119033, 'samples': 3395584, 'steps': 6631, 'batch_loss/train': 0.7550488223787397} 12/23/2021 20:34:53 - INFO - codeparrot_training - Step 6632: {'lr': 0.00047956524034909506, 'samples': 3396096, 'steps': 6632, 'batch_loss/train': 0.763871056959033} 12/23/2021 20:35:03 - INFO - codeparrot_training - Step 6633: {'lr': 0.0004795589980736221, 'samples': 3396608, 'steps': 6633, 'batch_loss/train': 0.8935642391443253} 12/23/2021 20:35:14 - INFO - codeparrot_training - Step 6634: {'lr': 0.0004795527548855092, 'samples': 3397120, 'steps': 6634, 'batch_loss/train': 0.8226279886439443} 12/23/2021 20:35:27 - INFO - codeparrot_training - Step 6635: {'lr': 0.0004795465107847813, 'samples': 3397632, 'steps': 6635, 'batch_loss/train': 0.7025929940864444} 12/23/2021 20:35:38 - INFO - codeparrot_training - Step 6636: {'lr': 0.00047954026577146314, 'samples': 3398144, 'steps': 6636, 'batch_loss/train': 0.9043339593335986} 12/23/2021 20:35:48 - INFO - codeparrot_training - Step 6637: {'lr': 0.00047953401984557955, 'samples': 3398656, 'steps': 6637, 'batch_loss/train': 0.9101201212033629} 12/23/2021 20:35:59 - INFO - codeparrot_training - Step 6638: {'lr': 0.00047952777300715535, 'samples': 3399168, 'steps': 6638, 'batch_loss/train': 0.8260428783250973} 12/23/2021 20:36:11 - INFO - codeparrot_training - Step 
6639: {'lr': 0.0004795215252562153, 'samples': 3399680, 'steps': 6639, 'batch_loss/train': 0.8529995121061802} 12/23/2021 20:36:21 - INFO - codeparrot_training - Step 6640: {'lr': 0.00047951527659278445, 'samples': 3400192, 'steps': 6640, 'batch_loss/train': 0.8848206102848053} 12/23/2021 20:36:32 - INFO - codeparrot_training - Step 6641: {'lr': 0.00047950902701688746, 'samples': 3400704, 'steps': 6641, 'batch_loss/train': 0.8876137929037213} 12/23/2021 20:36:45 - INFO - codeparrot_training - Step 6642: {'lr': 0.0004795027765285493, 'samples': 3401216, 'steps': 6642, 'batch_loss/train': 0.8540183550212532} 12/23/2021 20:36:56 - INFO - codeparrot_training - Step 6643: {'lr': 0.00047949652512779476, 'samples': 3401728, 'steps': 6643, 'batch_loss/train': 0.8294060833286494} 12/23/2021 20:37:07 - INFO - codeparrot_training - Step 6644: {'lr': 0.00047949027281464864, 'samples': 3402240, 'steps': 6644, 'batch_loss/train': 0.7701177913695574} 12/23/2021 20:37:19 - INFO - codeparrot_training - Step 6645: {'lr': 0.0004794840195891358, 'samples': 3402752, 'steps': 6645, 'batch_loss/train': 0.7635275437496603} 12/23/2021 20:37:29 - INFO - codeparrot_training - Step 6646: {'lr': 0.00047947776545128126, 'samples': 3403264, 'steps': 6646, 'batch_loss/train': 0.8984386101365089} 12/23/2021 20:37:40 - INFO - codeparrot_training - Step 6647: {'lr': 0.0004794715104011097, 'samples': 3403776, 'steps': 6647, 'batch_loss/train': 0.8621068382635713} 12/23/2021 20:37:51 - INFO - codeparrot_training - Step 6648: {'lr': 0.0004794652544386461, 'samples': 3404288, 'steps': 6648, 'batch_loss/train': 0.7232420991640538} 12/23/2021 20:38:03 - INFO - codeparrot_training - Step 6649: {'lr': 0.00047945899756391525, 'samples': 3404800, 'steps': 6649, 'batch_loss/train': 0.9124392122030258} 12/23/2021 20:38:13 - INFO - codeparrot_training - Step 6650: {'lr': 0.0004794527397769421, 'samples': 3405312, 'steps': 6650, 'batch_loss/train': 0.8317047453019768} 12/23/2021 20:38:24 - INFO - 
codeparrot_training - Step 6651: {'lr': 0.00047944648107775144, 'samples': 3405824, 'steps': 6651, 'batch_loss/train': 0.8011546034831554} 12/23/2021 20:38:38 - INFO - codeparrot_training - Step 6652: {'lr': 0.0004794402214663682, 'samples': 3406336, 'steps': 6652, 'batch_loss/train': 0.9000873919576406} 12/23/2021 20:38:48 - INFO - codeparrot_training - Step 6653: {'lr': 0.00047943396094281737, 'samples': 3406848, 'steps': 6653, 'batch_loss/train': 0.8845077259466052} 12/23/2021 20:38:59 - INFO - codeparrot_training - Step 6654: {'lr': 0.0004794276995071237, 'samples': 3407360, 'steps': 6654, 'batch_loss/train': 0.9161198073998094} 12/23/2021 20:39:11 - INFO - codeparrot_training - Step 6655: {'lr': 0.0004794214371593121, 'samples': 3407872, 'steps': 6655, 'batch_loss/train': 0.7794443755410612} 12/23/2021 20:39:22 - INFO - codeparrot_training - Step 6656: {'lr': 0.00047941517389940753, 'samples': 3408384, 'steps': 6656, 'batch_loss/train': 0.7517812326550484} 12/23/2021 20:39:32 - INFO - codeparrot_training - Step 6657: {'lr': 0.00047940890972743484, 'samples': 3408896, 'steps': 6657, 'batch_loss/train': 0.8103777937358245} 12/23/2021 20:39:44 - INFO - codeparrot_training - Step 6658: {'lr': 0.00047940264464341897, 'samples': 3409408, 'steps': 6658, 'batch_loss/train': 0.7194143342785537} 12/23/2021 20:39:55 - INFO - codeparrot_training - Step 6659: {'lr': 0.0004793963786473847, 'samples': 3409920, 'steps': 6659, 'batch_loss/train': 0.8112652800045907} 12/23/2021 20:40:05 - INFO - codeparrot_training - Step 6660: {'lr': 0.0004793901117393572, 'samples': 3410432, 'steps': 6660, 'batch_loss/train': 0.7887753872200847} 12/23/2021 20:40:16 - INFO - codeparrot_training - Step 6661: {'lr': 0.0004793838439193611, 'samples': 3410944, 'steps': 6661, 'batch_loss/train': 0.8202629266306758} 12/23/2021 20:40:30 - INFO - codeparrot_training - Step 6662: {'lr': 0.0004793775751874215, 'samples': 3411456, 'steps': 6662, 'batch_loss/train': 1.1202405570074916} 12/23/2021 20:40:40 
- INFO - codeparrot_training - Step 6663: {'lr': 0.0004793713055435633, 'samples': 3411968, 'steps': 6663, 'batch_loss/train': 0.8954945714212954} 12/23/2021 20:40:51 - INFO - codeparrot_training - Step 6664: {'lr': 0.00047936503498781136, 'samples': 3412480, 'steps': 6664, 'batch_loss/train': 0.8334438726305962} 12/23/2021 20:41:03 - INFO - codeparrot_training - Step 6665: {'lr': 0.0004793587635201906, 'samples': 3412992, 'steps': 6665, 'batch_loss/train': 0.8577756038866937} 12/23/2021 20:41:14 - INFO - codeparrot_training - Step 6666: {'lr': 0.0004793524911407261, 'samples': 3413504, 'steps': 6666, 'batch_loss/train': 0.9338244497776031} 12/23/2021 20:41:24 - INFO - codeparrot_training - Step 6667: {'lr': 0.00047934621784944266, 'samples': 3414016, 'steps': 6667, 'batch_loss/train': 0.7857520310208201} 12/23/2021 20:41:38 - INFO - codeparrot_training - Step 6668: {'lr': 0.0004793399436463652, 'samples': 3414528, 'steps': 6668, 'batch_loss/train': 0.8022098513320088} 12/23/2021 20:41:49 - INFO - codeparrot_training - Step 6669: {'lr': 0.00047933366853151874, 'samples': 3415040, 'steps': 6669, 'batch_loss/train': 0.9003030983731151} 12/23/2021 20:41:59 - INFO - codeparrot_training - Step 6670: {'lr': 0.00047932739250492823, 'samples': 3415552, 'steps': 6670, 'batch_loss/train': 0.8780073677189648} 12/23/2021 20:42:10 - INFO - codeparrot_training - Step 6671: {'lr': 0.0004793211155666186, 'samples': 3416064, 'steps': 6671, 'batch_loss/train': 0.8394334698095918} 12/23/2021 20:42:22 - INFO - codeparrot_training - Step 6672: {'lr': 0.00047931483771661477, 'samples': 3416576, 'steps': 6672, 'batch_loss/train': 0.9474397907033563} 12/23/2021 20:42:32 - INFO - codeparrot_training - Step 6673: {'lr': 0.0004793085589549418, 'samples': 3417088, 'steps': 6673, 'batch_loss/train': 0.8269268833100796} 12/23/2021 20:42:43 - INFO - codeparrot_training - Step 6674: {'lr': 0.0004793022792816245, 'samples': 3417600, 'steps': 6674, 'batch_loss/train': 0.8975230623036623} 12/23/2021 
20:42:55 - INFO - codeparrot_training - Step 6675: {'lr': 0.000479295998696688, 'samples': 3418112, 'steps': 6675, 'batch_loss/train': 0.9879155559465289} 12/23/2021 20:43:06 - INFO - codeparrot_training - Step 6676: {'lr': 0.0004792897172001571, 'samples': 3418624, 'steps': 6676, 'batch_loss/train': 0.8829214414581656} 12/23/2021 20:43:16 - INFO - codeparrot_training - Step 6677: {'lr': 0.00047928343479205696, 'samples': 3419136, 'steps': 6677, 'batch_loss/train': 0.9424074152484536} 12/23/2021 20:43:28 - INFO - codeparrot_training - Step 6678: {'lr': 0.00047927715147241237, 'samples': 3419648, 'steps': 6678, 'batch_loss/train': 0.9066180679947138} 12/23/2021 20:43:39 - INFO - codeparrot_training - Step 6679: {'lr': 0.00047927086724124844, 'samples': 3420160, 'steps': 6679, 'batch_loss/train': 0.7890382120385766} 12/23/2021 20:43:49 - INFO - codeparrot_training - Step 6680: {'lr': 0.00047926458209859017, 'samples': 3420672, 'steps': 6680, 'batch_loss/train': 1.655387505888939} 12/23/2021 20:44:00 - INFO - codeparrot_training - Step 6681: {'lr': 0.00047925829604446237, 'samples': 3421184, 'steps': 6681, 'batch_loss/train': 0.809243296040222} 12/23/2021 20:44:14 - INFO - codeparrot_training - Step 6682: {'lr': 0.0004792520090788902, 'samples': 3421696, 'steps': 6682, 'batch_loss/train': 0.9528067009523511} 12/23/2021 20:44:25 - INFO - codeparrot_training - Step 6683: {'lr': 0.0004792457212018986, 'samples': 3422208, 'steps': 6683, 'batch_loss/train': 0.8920184532180429} 12/23/2021 20:44:35 - INFO - codeparrot_training - Step 6684: {'lr': 0.0004792394324135126, 'samples': 3422720, 'steps': 6684, 'batch_loss/train': 0.8260881472378969} 12/23/2021 20:44:47 - INFO - codeparrot_training - Step 6685: {'lr': 0.0004792331427137572, 'samples': 3423232, 'steps': 6685, 'batch_loss/train': 0.8085690503939986} 12/23/2021 20:44:58 - INFO - codeparrot_training - Step 6686: {'lr': 0.0004792268521026573, 'samples': 3423744, 'steps': 6686, 'batch_loss/train': 0.78311303909868} 
12/23/2021 20:45:08 - INFO - codeparrot_training - Step 6687: {'lr': 0.000479220560580238, 'samples': 3424256, 'steps': 6687, 'batch_loss/train': 0.859293001703918} 12/23/2021 20:45:20 - INFO - codeparrot_training - Step 6688: {'lr': 0.00047921426814652435, 'samples': 3424768, 'steps': 6688, 'batch_loss/train': 0.8593164384365082} 12/23/2021 20:45:31 - INFO - codeparrot_training - Step 6689: {'lr': 0.0004792079748015413, 'samples': 3425280, 'steps': 6689, 'batch_loss/train': 0.9048238396644592} 12/23/2021 20:45:42 - INFO - codeparrot_training - Step 6690: {'lr': 0.0004792016805453139, 'samples': 3425792, 'steps': 6690, 'batch_loss/train': 0.9100741157308221} 12/23/2021 20:45:52 - INFO - codeparrot_training - Step 6691: {'lr': 0.0004791953853778671, 'samples': 3426304, 'steps': 6691, 'batch_loss/train': 0.8635801672935486} 12/23/2021 20:46:06 - INFO - codeparrot_training - Step 6692: {'lr': 0.000479189089299226, 'samples': 3426816, 'steps': 6692, 'batch_loss/train': 0.9278983566910028} 12/23/2021 20:46:16 - INFO - codeparrot_training - Step 6693: {'lr': 0.00047918279230941565, 'samples': 3427328, 'steps': 6693, 'batch_loss/train': 0.8910356042906642} 12/23/2021 20:46:27 - INFO - codeparrot_training - Step 6694: {'lr': 0.00047917649440846105, 'samples': 3427840, 'steps': 6694, 'batch_loss/train': 0.7119110734201968} 12/23/2021 20:46:39 - INFO - codeparrot_training - Step 6695: {'lr': 0.00047917019559638724, 'samples': 3428352, 'steps': 6695, 'batch_loss/train': 0.8642592700198293} 12/23/2021 20:46:50 - INFO - codeparrot_training - Step 6696: {'lr': 0.0004791638958732193, 'samples': 3428864, 'steps': 6696, 'batch_loss/train': 0.8524750911165029} 12/23/2021 20:47:00 - INFO - codeparrot_training - Step 6697: {'lr': 0.00047915759523898215, 'samples': 3429376, 'steps': 6697, 'batch_loss/train': 0.835584718734026} 12/23/2021 20:47:13 - INFO - codeparrot_training - Step 6698: {'lr': 0.000479151293693701, 'samples': 3429888, 'steps': 6698, 'batch_loss/train': 
0.7797718772199005} 12/23/2021 20:47:24 - INFO - codeparrot_training - Step 6699: {'lr': 0.00047914499123740084, 'samples': 3430400, 'steps': 6699, 'batch_loss/train': 0.8321781340055168} 12/23/2021 20:47:35 - INFO - codeparrot_training - Step 6700: {'lr': 0.00047913868787010673, 'samples': 3430912, 'steps': 6700, 'batch_loss/train': 0.982667782343924} 12/23/2021 20:47:45 - INFO - codeparrot_training - Step 6701: {'lr': 0.0004791323835918436, 'samples': 3431424, 'steps': 6701, 'batch_loss/train': 0.8421556460671127} 12/23/2021 20:47:57 - INFO - codeparrot_training - Step 6702: {'lr': 0.0004791260784026368, 'samples': 3431936, 'steps': 6702, 'batch_loss/train': 0.8852901887148619} 12/23/2021 20:48:08 - INFO - codeparrot_training - Step 6703: {'lr': 0.0004791197723025111, 'samples': 3432448, 'steps': 6703, 'batch_loss/train': 0.8241078401915729} 12/23/2021 20:48:19 - INFO - codeparrot_training - Step 6704: {'lr': 0.0004791134652914917, 'samples': 3432960, 'steps': 6704, 'batch_loss/train': 0.7714668652042747} 12/23/2021 20:48:31 - INFO - codeparrot_training - Step 6705: {'lr': 0.0004791071573696038, 'samples': 3433472, 'steps': 6705, 'batch_loss/train': 0.7445695642381907} 12/23/2021 20:48:42 - INFO - codeparrot_training - Step 6706: {'lr': 0.0004791008485368722, 'samples': 3433984, 'steps': 6706, 'batch_loss/train': 0.9269942231476307} 12/23/2021 20:48:52 - INFO - codeparrot_training - Step 6707: {'lr': 0.00047909453879332214, 'samples': 3434496, 'steps': 6707, 'batch_loss/train': 0.730744605883956} 12/23/2021 20:49:05 - INFO - codeparrot_training - Step 6708: {'lr': 0.0004790882281389788, 'samples': 3435008, 'steps': 6708, 'batch_loss/train': 1.6285644238814712} 12/23/2021 20:49:16 - INFO - codeparrot_training - Step 6709: {'lr': 0.00047908191657386706, 'samples': 3435520, 'steps': 6709, 'batch_loss/train': 0.8003596235066652} 12/23/2021 20:49:26 - INFO - codeparrot_training - Step 6710: {'lr': 0.00047907560409801215, 'samples': 3436032, 'steps': 6710, 
'batch_loss/train': 0.8086223844438791} 12/23/2021 20:49:37 - INFO - codeparrot_training - Step 6711: {'lr': 0.0004790692907114392, 'samples': 3436544, 'steps': 6711, 'batch_loss/train': 0.8057316560298204} 12/23/2021 20:49:49 - INFO - codeparrot_training - Step 6712: {'lr': 0.0004790629764141732, 'samples': 3437056, 'steps': 6712, 'batch_loss/train': 0.9071357706561685} 12/23/2021 20:50:00 - INFO - codeparrot_training - Step 6713: {'lr': 0.00047905666120623923, 'samples': 3437568, 'steps': 6713, 'batch_loss/train': 0.8905982195865363} 12/23/2021 20:50:11 - INFO - codeparrot_training - Step 6714: {'lr': 0.00047905034508766256, 'samples': 3438080, 'steps': 6714, 'batch_loss/train': 0.8337859492748976} 12/23/2021 20:50:23 - INFO - codeparrot_training - Step 6715: {'lr': 0.0004790440280584681, 'samples': 3438592, 'steps': 6715, 'batch_loss/train': 0.8412412768229842} 12/23/2021 20:50:33 - INFO - codeparrot_training - Step 6716: {'lr': 0.0004790377101186811, 'samples': 3439104, 'steps': 6716, 'batch_loss/train': 0.8075630851089954} 12/23/2021 20:50:44 - INFO - codeparrot_training - Step 6717: {'lr': 0.00047903139126832675, 'samples': 3439616, 'steps': 6717, 'batch_loss/train': 0.9009091155603528} 12/23/2021 20:50:56 - INFO - codeparrot_training - Step 6718: {'lr': 0.00047902507150743, 'samples': 3440128, 'steps': 6718, 'batch_loss/train': 0.7275892862817273} 12/23/2021 20:51:07 - INFO - codeparrot_training - Step 6719: {'lr': 0.00047901875083601603, 'samples': 3440640, 'steps': 6719, 'batch_loss/train': 0.8669413924217224} 12/23/2021 20:51:17 - INFO - codeparrot_training - Step 6720: {'lr': 0.00047901242925411003, 'samples': 3441152, 'steps': 6720, 'batch_loss/train': 0.7736031590029597} 12/23/2021 20:51:28 - INFO - codeparrot_training - Step 6721: {'lr': 0.000479006106761737, 'samples': 3441664, 'steps': 6721, 'batch_loss/train': 0.7984777381643653} 12/23/2021 20:51:41 - INFO - codeparrot_training - Step 6722: {'lr': 0.00047899978335892224, 'samples': 3442176, 
'steps': 6722, 'batch_loss/train': 0.8448220891878009} 12/23/2021 20:51:52 - INFO - codeparrot_training - Step 6723: {'lr': 0.00047899345904569077, 'samples': 3442688, 'steps': 6723, 'batch_loss/train': 0.7651079595088959} 12/23/2021 20:52:02 - INFO - codeparrot_training - Step 6724: {'lr': 0.0004789871338220677, 'samples': 3443200, 'steps': 6724, 'batch_loss/train': 0.7777837186586112} 12/23/2021 20:52:14 - INFO - codeparrot_training - Step 6725: {'lr': 0.00047898080768807836, 'samples': 3443712, 'steps': 6725, 'batch_loss/train': 0.8633901653811336} 12/23/2021 20:52:25 - INFO - codeparrot_training - Step 6726: {'lr': 0.0004789744806437477, 'samples': 3444224, 'steps': 6726, 'batch_loss/train': 0.7009223294444382} 12/23/2021 20:52:35 - INFO - codeparrot_training - Step 6727: {'lr': 0.0004789681526891011, 'samples': 3444736, 'steps': 6727, 'batch_loss/train': 0.8816190073266625} 12/23/2021 20:52:49 - INFO - codeparrot_training - Step 6728: {'lr': 0.0004789618238241634, 'samples': 3445248, 'steps': 6728, 'batch_loss/train': 0.8198448829352856} 12/23/2021 20:53:00 - INFO - codeparrot_training - Step 6729: {'lr': 0.0004789554940489601, 'samples': 3445760, 'steps': 6729, 'batch_loss/train': 1.0520649617537856} 12/23/2021 20:53:10 - INFO - codeparrot_training - Step 6730: {'lr': 0.0004789491633635161, 'samples': 3446272, 'steps': 6730, 'batch_loss/train': 0.8346819328144193} 12/23/2021 20:53:23 - INFO - codeparrot_training - Step 6731: {'lr': 0.00047894283176785674, 'samples': 3446784, 'steps': 6731, 'batch_loss/train': 0.8742049774155021} 12/23/2021 20:53:33 - INFO - codeparrot_training - Step 6732: {'lr': 0.0004789364992620071, 'samples': 3447296, 'steps': 6732, 'batch_loss/train': 1.6327752680517733} 12/23/2021 20:53:44 - INFO - codeparrot_training - Step 6733: {'lr': 0.00047893016584599235, 'samples': 3447808, 'steps': 6733, 'batch_loss/train': 0.7721715164370835} 12/23/2021 20:53:55 - INFO - codeparrot_training - Step 6734: {'lr': 0.00047892383151983776, 'samples': 
3448320, 'steps': 6734, 'batch_loss/train': 0.7850738032720983} 12/23/2021 20:54:07 - INFO - codeparrot_training - Step 6735: {'lr': 0.0004789174962835684, 'samples': 3448832, 'steps': 6735, 'batch_loss/train': 0.8947492763400078} 12/23/2021 20:54:18 - INFO - codeparrot_training - Step 6736: {'lr': 0.0004789111601372096, 'samples': 3449344, 'steps': 6736, 'batch_loss/train': 0.9636008236557245} 12/23/2021 20:54:28 - INFO - codeparrot_training - Step 6737: {'lr': 0.0004789048230807863, 'samples': 3449856, 'steps': 6737, 'batch_loss/train': 0.8673527650535107} 12/23/2021 20:54:41 - INFO - codeparrot_training - Step 6738: {'lr': 0.000478898485114324, 'samples': 3450368, 'steps': 6738, 'batch_loss/train': 0.6305848130141385} 12/23/2021 20:54:52 - INFO - codeparrot_training - Step 6739: {'lr': 0.0004788921462378477, 'samples': 3450880, 'steps': 6739, 'batch_loss/train': 0.9109242632985115} 12/23/2021 20:55:02 - INFO - codeparrot_training - Step 6740: {'lr': 0.0004788858064513827, 'samples': 3451392, 'steps': 6740, 'batch_loss/train': 0.7358981855213642} 12/23/2021 20:55:15 - INFO - codeparrot_training - Step 6741: {'lr': 0.00047887946575495406, 'samples': 3451904, 'steps': 6741, 'batch_loss/train': 1.0021455520763993} 12/23/2021 20:55:25 - INFO - codeparrot_training - Step 6742: {'lr': 0.00047887312414858713, 'samples': 3452416, 'steps': 6742, 'batch_loss/train': 0.8687023911625147} 12/23/2021 20:55:36 - INFO - codeparrot_training - Step 6743: {'lr': 0.0004788667816323071, 'samples': 3452928, 'steps': 6743, 'batch_loss/train': 1.0248452108353376} 12/23/2021 20:55:46 - INFO - codeparrot_training - Step 6744: {'lr': 0.00047886043820613913, 'samples': 3453440, 'steps': 6744, 'batch_loss/train': 0.8421798916533589} 12/23/2021 20:55:58 - INFO - codeparrot_training - Step 6745: {'lr': 0.0004788540938701085, 'samples': 3453952, 'steps': 6745, 'batch_loss/train': 0.9436667738482356} 12/23/2021 20:56:09 - INFO - codeparrot_training - Step 6746: {'lr': 0.0004788477486242404, 
'samples': 3454464, 'steps': 6746, 'batch_loss/train': 0.86714340839535} 12/23/2021 20:56:20 - INFO - codeparrot_training - Step 6747: {'lr': 0.00047884140246856014, 'samples': 3454976, 'steps': 6747, 'batch_loss/train': 0.9236221835017204} 12/23/2021 20:56:33 - INFO - codeparrot_training - Step 6748: {'lr': 0.0004788350554030927, 'samples': 3455488, 'steps': 6748, 'batch_loss/train': 0.9481103112921119} 12/23/2021 20:56:43 - INFO - codeparrot_training - Step 6749: {'lr': 0.0004788287074278636, 'samples': 3456000, 'steps': 6749, 'batch_loss/train': 0.8813913967460394} 12/23/2021 20:56:54 - INFO - codeparrot_training - Step 6750: {'lr': 0.0004788223585428979, 'samples': 3456512, 'steps': 6750, 'batch_loss/train': 0.8395000696182251} 12/23/2021 20:57:06 - INFO - codeparrot_training - Step 6751: {'lr': 0.00047881600874822097, 'samples': 3457024, 'steps': 6751, 'batch_loss/train': 0.8946070903912187} 12/23/2021 20:57:17 - INFO - codeparrot_training - Step 6752: {'lr': 0.00047880965804385797, 'samples': 3457536, 'steps': 6752, 'batch_loss/train': 0.9072881620377302} 12/23/2021 20:57:27 - INFO - codeparrot_training - Step 6753: {'lr': 0.0004788033064298342, 'samples': 3458048, 'steps': 6753, 'batch_loss/train': 0.8117232285439968} 12/23/2021 20:57:38 - INFO - codeparrot_training - Step 6754: {'lr': 0.00047879695390617473, 'samples': 3458560, 'steps': 6754, 'batch_loss/train': 0.8055873294360936} 12/23/2021 20:57:51 - INFO - codeparrot_training - Step 6755: {'lr': 0.00047879060047290506, 'samples': 3459072, 'steps': 6755, 'batch_loss/train': 0.8941704761236906} 12/23/2021 20:58:01 - INFO - codeparrot_training - Step 6756: {'lr': 0.0004787842461300503, 'samples': 3459584, 'steps': 6756, 'batch_loss/train': 0.9083206569775939} 12/23/2021 20:58:12 - INFO - codeparrot_training - Step 6757: {'lr': 0.0004787778908776359, 'samples': 3460096, 'steps': 6757, 'batch_loss/train': 0.7625836357474327} 12/23/2021 20:58:25 - INFO - codeparrot_training - Step 6758: {'lr': 
0.0004787715347156869, 'samples': 3460608, 'steps': 6758, 'batch_loss/train': 0.8206991897895932} 12/23/2021 20:58:36 - INFO - codeparrot_training - Step 6759: {'lr': 0.0004787651776442287, 'samples': 3461120, 'steps': 6759, 'batch_loss/train': 0.9092725683003664} 12/23/2021 20:58:47 - INFO - codeparrot_training - Step 6760: {'lr': 0.00047875881966328644, 'samples': 3461632, 'steps': 6760, 'batch_loss/train': 0.9716774942353368} 12/23/2021 20:58:58 - INFO - codeparrot_training - Step 6761: {'lr': 0.00047875246077288554, 'samples': 3462144, 'steps': 6761, 'batch_loss/train': 0.8823424505535513} 12/23/2021 20:59:09 - INFO - codeparrot_training - Step 6762: {'lr': 0.0004787461009730512, 'samples': 3462656, 'steps': 6762, 'batch_loss/train': 0.8596898671239614} 12/23/2021 20:59:20 - INFO - codeparrot_training - Step 6763: {'lr': 0.00047873974026380875, 'samples': 3463168, 'steps': 6763, 'batch_loss/train': 0.7984058540314436} 12/23/2021 20:59:30 - INFO - codeparrot_training - Step 6764: {'lr': 0.0004787333786451835, 'samples': 3463680, 'steps': 6764, 'batch_loss/train': 0.7721211379393935} 12/23/2021 20:59:42 - INFO - codeparrot_training - Step 6765: {'lr': 0.0004787270161172007, 'samples': 3464192, 'steps': 6765, 'batch_loss/train': 0.8740882845595479} 12/23/2021 20:59:53 - INFO - codeparrot_training - Step 6766: {'lr': 0.00047872065267988564, 'samples': 3464704, 'steps': 6766, 'batch_loss/train': 0.8015094147995114} 12/23/2021 21:00:04 - INFO - codeparrot_training - Step 6767: {'lr': 0.0004787142883332636, 'samples': 3465216, 'steps': 6767, 'batch_loss/train': 0.8679456375539303} 12/23/2021 21:00:17 - INFO - codeparrot_training - Step 6768: {'lr': 0.00047870792307735987, 'samples': 3465728, 'steps': 6768, 'batch_loss/train': 0.9637270905077457} 12/23/2021 21:00:28 - INFO - codeparrot_training - Step 6769: {'lr': 0.00047870155691219984, 'samples': 3466240, 'steps': 6769, 'batch_loss/train': 0.7353827222250402} 12/23/2021 21:00:38 - INFO - codeparrot_training - Step 
6770: {'lr': 0.0004786951898378088, 'samples': 3466752, 'steps': 6770, 'batch_loss/train': 0.7076111203059554} 12/23/2021 21:00:50 - INFO - codeparrot_training - Step 6771: {'lr': 0.000478688821854212, 'samples': 3467264, 'steps': 6771, 'batch_loss/train': 0.9769663270562887} 12/23/2021 21:01:01 - INFO - codeparrot_training - Step 6772: {'lr': 0.0004786824529614349, 'samples': 3467776, 'steps': 6772, 'batch_loss/train': 0.9124030778184533} 12/23/2021 21:01:12 - INFO - codeparrot_training - Step 6773: {'lr': 0.0004786760831595025, 'samples': 3468288, 'steps': 6773, 'batch_loss/train': 0.8351344226393849} 12/23/2021 21:01:24 - INFO - codeparrot_training - Step 6774: {'lr': 0.0004786697124484405, 'samples': 3468800, 'steps': 6774, 'batch_loss/train': 0.8159699346870184} 12/23/2021 21:01:34 - INFO - codeparrot_training - Step 6775: {'lr': 0.000478663340828274, 'samples': 3469312, 'steps': 6775, 'batch_loss/train': 0.9002412529662251} 12/23/2021 21:01:45 - INFO - codeparrot_training - Step 6776: {'lr': 0.0004786569682990285, 'samples': 3469824, 'steps': 6776, 'batch_loss/train': 0.8576286644674838} 12/23/2021 21:01:55 - INFO - codeparrot_training - Step 6777: {'lr': 0.00047865059486072915, 'samples': 3470336, 'steps': 6777, 'batch_loss/train': 0.8671825360506773} 12/23/2021 21:02:09 - INFO - codeparrot_training - Step 6778: {'lr': 0.00047864422051340134, 'samples': 3470848, 'steps': 6778, 'batch_loss/train': 0.8635653387755156} 12/23/2021 21:02:19 - INFO - codeparrot_training - Step 6779: {'lr': 0.0004786378452570704, 'samples': 3471360, 'steps': 6779, 'batch_loss/train': 0.8978708039503545} 12/23/2021 21:02:30 - INFO - codeparrot_training - Step 6780: {'lr': 0.0004786314690917619, 'samples': 3471872, 'steps': 6780, 'batch_loss/train': 0.8652812903746963} 12/23/2021 21:02:42 - INFO - codeparrot_training - Step 6781: {'lr': 0.00047862509201750087, 'samples': 3472384, 'steps': 6781, 'batch_loss/train': 0.7703868071548641} 12/23/2021 21:02:52 - INFO - codeparrot_training - 
Step 6782: {'lr': 0.00047861871403431284, 'samples': 3472896, 'steps': 6782, 'batch_loss/train': 0.8925214176997542} 12/23/2021 21:03:03 - INFO - codeparrot_training - Step 6783: {'lr': 0.0004786123351422231, 'samples': 3473408, 'steps': 6783, 'batch_loss/train': 0.888842673972249} 12/23/2021 21:03:16 - INFO - codeparrot_training - Step 6784: {'lr': 0.00047860595534125706, 'samples': 3473920, 'steps': 6784, 'batch_loss/train': 0.9059709962457418} 12/23/2021 21:03:27 - INFO - codeparrot_training - Step 6785: {'lr': 0.0004785995746314401, 'samples': 3474432, 'steps': 6785, 'batch_loss/train': 0.8947339355945587} 12/23/2021 21:03:37 - INFO - codeparrot_training - Step 6786: {'lr': 0.00047859319301279745, 'samples': 3474944, 'steps': 6786, 'batch_loss/train': 0.854473166167736} 12/23/2021 21:03:48 - INFO - codeparrot_training - Step 6787: {'lr': 0.00047858681048535464, 'samples': 3475456, 'steps': 6787, 'batch_loss/train': 0.8533919500187039} 12/23/2021 21:04:00 - INFO - codeparrot_training - Step 6788: {'lr': 0.000478580427049137, 'samples': 3475968, 'steps': 6788, 'batch_loss/train': 0.9810123527422547} 12/23/2021 21:04:11 - INFO - codeparrot_training - Step 6789: {'lr': 0.00047857404270416985, 'samples': 3476480, 'steps': 6789, 'batch_loss/train': 0.8880990985780954} 12/23/2021 21:04:21 - INFO - codeparrot_training - Step 6790: {'lr': 0.0004785676574504786, 'samples': 3476992, 'steps': 6790, 'batch_loss/train': 0.8497190391644835} 12/23/2021 21:04:33 - INFO - codeparrot_training - Step 6791: {'lr': 0.0004785612712880887, 'samples': 3477504, 'steps': 6791, 'batch_loss/train': 0.8764082421548665} 12/23/2021 21:04:44 - INFO - codeparrot_training - Step 6792: {'lr': 0.0004785548842170254, 'samples': 3478016, 'steps': 6792, 'batch_loss/train': 0.8846822972409427} 12/23/2021 21:04:55 - INFO - codeparrot_training - Step 6793: {'lr': 0.0004785484962373143, 'samples': 3478528, 'steps': 6793, 'batch_loss/train': 1.060983944684267} 12/23/2021 21:05:07 - INFO - 
codeparrot_training - Step 6794: {'lr': 0.0004785421073489806, 'samples': 3479040, 'steps': 6794, 'batch_loss/train': 0.7613816782832146} 12/23/2021 21:05:17 - INFO - codeparrot_training - Step 6795: {'lr': 0.00047853571755204977, 'samples': 3479552, 'steps': 6795, 'batch_loss/train': 0.8572881268337369} 12/23/2021 21:05:28 - INFO - codeparrot_training - Step 6796: {'lr': 0.0004785293268465473, 'samples': 3480064, 'steps': 6796, 'batch_loss/train': 0.880180150270462} 12/23/2021 21:05:39 - INFO - codeparrot_training - Step 6797: {'lr': 0.0004785229352324984, 'samples': 3480576, 'steps': 6797, 'batch_loss/train': 0.8172860054764897} 12/23/2021 21:05:52 - INFO - codeparrot_training - Step 6798: {'lr': 0.0004785165427099286, 'samples': 3481088, 'steps': 6798, 'batch_loss/train': 0.8280820790678263} 12/23/2021 21:06:03 - INFO - codeparrot_training - Step 6799: {'lr': 0.0004785101492788634, 'samples': 3481600, 'steps': 6799, 'batch_loss/train': 0.907324830070138} 12/23/2021 21:06:13 - INFO - codeparrot_training - Step 6800: {'lr': 0.0004785037549393281, 'samples': 3482112, 'steps': 6800, 'batch_loss/train': 0.7586327223107219} 12/23/2021 21:06:25 - INFO - codeparrot_training - Step 6801: {'lr': 0.0004784973596913481, 'samples': 3482624, 'steps': 6801, 'batch_loss/train': 0.9165416769683361} 12/23/2021 21:06:36 - INFO - codeparrot_training - Step 6802: {'lr': 0.0004784909635349489, 'samples': 3483136, 'steps': 6802, 'batch_loss/train': 0.884535956196487} 12/23/2021 21:06:46 - INFO - codeparrot_training - Step 6803: {'lr': 0.000478484566470156, 'samples': 3483648, 'steps': 6803, 'batch_loss/train': 0.8315873481333256} 12/23/2021 21:06:58 - INFO - codeparrot_training - Step 6804: {'lr': 0.0004784781684969947, 'samples': 3484160, 'steps': 6804, 'batch_loss/train': 0.9270602986216545} 12/23/2021 21:07:09 - INFO - codeparrot_training - Step 6805: {'lr': 0.0004784717696154903, 'samples': 3484672, 'steps': 6805, 'batch_loss/train': 0.8351595769636333} 12/23/2021 21:07:19 - INFO 
- codeparrot_training - Step 6806: {'lr': 0.0004784653698256686, 'samples': 3485184, 'steps': 6806, 'batch_loss/train': 0.8327388130128384} 12/23/2021 21:07:30 - INFO - codeparrot_training - Step 6807: {'lr': 0.00047845896912755476, 'samples': 3485696, 'steps': 6807, 'batch_loss/train': 0.7512462181039155} 12/23/2021 21:07:43 - INFO - codeparrot_training - Step 6808: {'lr': 0.0004784525675211744, 'samples': 3486208, 'steps': 6808, 'batch_loss/train': 0.9116589166224003} 12/23/2021 21:07:54 - INFO - codeparrot_training - Step 6809: {'lr': 0.0004784461650065528, 'samples': 3486720, 'steps': 6809, 'batch_loss/train': 0.9086630614474416} 12/23/2021 21:08:04 - INFO - codeparrot_training - Step 6810: {'lr': 0.0004784397615837156, 'samples': 3487232, 'steps': 6810, 'batch_loss/train': 0.9504419374279678} 12/23/2021 21:08:17 - INFO - codeparrot_training - Step 6811: {'lr': 0.0004784333572526881, 'samples': 3487744, 'steps': 6811, 'batch_loss/train': 0.8582804100587964} 12/23/2021 21:08:27 - INFO - codeparrot_training - Step 6812: {'lr': 0.0004784269520134958, 'samples': 3488256, 'steps': 6812, 'batch_loss/train': 0.7246524328365922} 12/23/2021 21:08:38 - INFO - codeparrot_training - Step 6813: {'lr': 0.00047842054586616424, 'samples': 3488768, 'steps': 6813, 'batch_loss/train': 0.9155134812463075} 12/23/2021 21:08:51 - INFO - codeparrot_training - Step 6814: {'lr': 0.0004784141388107188, 'samples': 3489280, 'steps': 6814, 'batch_loss/train': 0.7727695470675826} 12/23/2021 21:09:02 - INFO - codeparrot_training - Step 6815: {'lr': 0.00047840773084718505, 'samples': 3489792, 'steps': 6815, 'batch_loss/train': 0.9048065398819745} 12/23/2021 21:09:12 - INFO - codeparrot_training - Step 6816: {'lr': 0.0004784013219755884, 'samples': 3490304, 'steps': 6816, 'batch_loss/train': 0.8989137066528201} 12/23/2021 21:09:23 - INFO - codeparrot_training - Step 6817: {'lr': 0.00047839491219595434, 'samples': 3490816, 'steps': 6817, 'batch_loss/train': 0.8360137981362641} 12/23/2021 
21:09:35 - INFO - codeparrot_training - Step 6818: {'lr': 0.00047838850150830824, 'samples': 3491328, 'steps': 6818, 'batch_loss/train': 0.7997170654125512} 12/23/2021 21:09:45 - INFO - codeparrot_training - Step 6819: {'lr': 0.0004783820899126758, 'samples': 3491840, 'steps': 6819, 'batch_loss/train': 0.8546549053862691} 12/23/2021 21:09:56 - INFO - codeparrot_training - Step 6820: {'lr': 0.0004783756774090824, 'samples': 3492352, 'steps': 6820, 'batch_loss/train': 0.8413745132274926} 12/23/2021 21:10:08 - INFO - codeparrot_training - Step 6821: {'lr': 0.0004783692639975535, 'samples': 3492864, 'steps': 6821, 'batch_loss/train': 0.9242864809930325} 12/23/2021 21:10:19 - INFO - codeparrot_training - Step 6822: {'lr': 0.0004783628496781146, 'samples': 3493376, 'steps': 6822, 'batch_loss/train': 0.7881619688123465} 12/23/2021 21:10:29 - INFO - codeparrot_training - Step 6823: {'lr': 0.0004783564344507913, 'samples': 3493888, 'steps': 6823, 'batch_loss/train': 0.8427679110318422} 12/23/2021 21:10:42 - INFO - codeparrot_training - Step 6824: {'lr': 0.000478350018315609, 'samples': 3494400, 'steps': 6824, 'batch_loss/train': 0.9608896728605032} 12/23/2021 21:10:53 - INFO - codeparrot_training - Step 6825: {'lr': 0.0004783436012725933, 'samples': 3494912, 'steps': 6825, 'batch_loss/train': 0.8207084964960814} 12/23/2021 21:11:04 - INFO - codeparrot_training - Step 6826: {'lr': 0.00047833718332176967, 'samples': 3495424, 'steps': 6826, 'batch_loss/train': 0.8036062577739358} 12/23/2021 21:11:16 - INFO - codeparrot_training - Step 6827: {'lr': 0.0004783307644631635, 'samples': 3495936, 'steps': 6827, 'batch_loss/train': 0.8489548121578991} 12/23/2021 21:11:27 - INFO - codeparrot_training - Step 6828: {'lr': 0.00047832434469680053, 'samples': 3496448, 'steps': 6828, 'batch_loss/train': 0.8474846272729337} 12/23/2021 21:11:37 - INFO - codeparrot_training - Step 6829: {'lr': 0.0004783179240227062, 'samples': 3496960, 'steps': 6829, 'batch_loss/train': 0.8332756608724594} 
12/23/2021 21:11:48 - INFO - codeparrot_training - Step 6830: {'lr': 0.0004783115024409059, 'samples': 3497472, 'steps': 6830, 'batch_loss/train': 0.9006317872554064} 12/23/2021 21:12:00 - INFO - codeparrot_training - Step 6831: {'lr': 0.00047830507995142536, 'samples': 3497984, 'steps': 6831, 'batch_loss/train': 1.1406070683151484} 12/23/2021 21:12:10 - INFO - codeparrot_training - Step 6832: {'lr': 0.00047829865655429004, 'samples': 3498496, 'steps': 6832, 'batch_loss/train': 0.7750192121602595} 12/23/2021 21:12:21 - INFO - codeparrot_training - Step 6833: {'lr': 0.00047829223224952545, 'samples': 3499008, 'steps': 6833, 'batch_loss/train': 0.8441748600453138} 12/23/2021 21:12:33 - INFO - codeparrot_training - Step 6834: {'lr': 0.00047828580703715705, 'samples': 3499520, 'steps': 6834, 'batch_loss/train': 0.8680101870559156} 12/23/2021 21:12:44 - INFO - codeparrot_training - Step 6835: {'lr': 0.00047827938091721055, 'samples': 3500032, 'steps': 6835, 'batch_loss/train': 0.8360354145988822} 12/23/2021 21:12:54 - INFO - codeparrot_training - Step 6836: {'lr': 0.0004782729538897115, 'samples': 3500544, 'steps': 6836, 'batch_loss/train': 0.8310494599863887} 12/23/2021 21:13:07 - INFO - codeparrot_training - Step 6837: {'lr': 0.00047826652595468527, 'samples': 3501056, 'steps': 6837, 'batch_loss/train': 0.7814680333249271} 12/23/2021 21:13:18 - INFO - codeparrot_training - Step 6838: {'lr': 0.00047826009711215755, 'samples': 3501568, 'steps': 6838, 'batch_loss/train': 0.7618309729732573} 12/23/2021 21:13:28 - INFO - codeparrot_training - Step 6839: {'lr': 0.0004782536673621539, 'samples': 3502080, 'steps': 6839, 'batch_loss/train': 0.8725486984476447} 12/23/2021 21:13:39 - INFO - codeparrot_training - Step 6840: {'lr': 0.00047824723670469984, 'samples': 3502592, 'steps': 6840, 'batch_loss/train': 0.8565727192908525} 12/23/2021 21:13:51 - INFO - codeparrot_training - Step 6841: {'lr': 0.0004782408051398209, 'samples': 3503104, 'steps': 6841, 'batch_loss/train': 
0.910105497110635} 12/23/2021 21:14:02 - INFO - codeparrot_training - Step 6842: {'lr': 0.0004782343726675428, 'samples': 3503616, 'steps': 6842, 'batch_loss/train': 0.8011477636173368} 12/23/2021 21:14:12 - INFO - codeparrot_training - Step 6843: {'lr': 0.0004782279392878909, 'samples': 3504128, 'steps': 6843, 'batch_loss/train': 0.8945971010252833} 12/23/2021 21:14:26 - INFO - codeparrot_training - Step 6844: {'lr': 0.000478221505000891, 'samples': 3504640, 'steps': 6844, 'batch_loss/train': 0.8794456906616688} 12/23/2021 21:14:37 - INFO - codeparrot_training - Step 6845: {'lr': 0.0004782150698065685, 'samples': 3505152, 'steps': 6845, 'batch_loss/train': 1.2586653204634786} 12/23/2021 21:14:47 - INFO - codeparrot_training - Step 6846: {'lr': 0.00047820863370494907, 'samples': 3505664, 'steps': 6846, 'batch_loss/train': 0.7950726374983788} 12/23/2021 21:14:59 - INFO - codeparrot_training - Step 6847: {'lr': 0.00047820219669605826, 'samples': 3506176, 'steps': 6847, 'batch_loss/train': 0.896015883423388} 12/23/2021 21:15:10 - INFO - codeparrot_training - Step 6848: {'lr': 0.0004781957587799217, 'samples': 3506688, 'steps': 6848, 'batch_loss/train': 0.8924163747578859} 12/23/2021 21:15:21 - INFO - codeparrot_training - Step 6849: {'lr': 0.00047818931995656494, 'samples': 3507200, 'steps': 6849, 'batch_loss/train': 0.7634015390649438} 12/23/2021 21:15:31 - INFO - codeparrot_training - Step 6850: {'lr': 0.0004781828802260136, 'samples': 3507712, 'steps': 6850, 'batch_loss/train': 0.7579085901379585} 12/23/2021 21:15:43 - INFO - codeparrot_training - Step 6851: {'lr': 0.00047817643958829336, 'samples': 3508224, 'steps': 6851, 'batch_loss/train': 0.8636228758841753} 12/23/2021 21:15:54 - INFO - codeparrot_training - Step 6852: {'lr': 0.00047816999804342977, 'samples': 3508736, 'steps': 6852, 'batch_loss/train': 0.9168087225407362} 12/23/2021 21:16:04 - INFO - codeparrot_training - Step 6853: {'lr': 0.0004781635555914483, 'samples': 3509248, 'steps': 6853, 
'batch_loss/train': 0.8644823590293527} 12/23/2021 21:16:18 - INFO - codeparrot_training - Step 6854: {'lr': 0.0004781571122323748, 'samples': 3509760, 'steps': 6854, 'batch_loss/train': 0.8179903533309698} 12/23/2021 21:16:28 - INFO - codeparrot_training - Step 6855: {'lr': 0.0004781506679662346, 'samples': 3510272, 'steps': 6855, 'batch_loss/train': 0.9181501027196646} 12/23/2021 21:16:39 - INFO - codeparrot_training - Step 6856: {'lr': 0.0004781442227930537, 'samples': 3510784, 'steps': 6856, 'batch_loss/train': 0.8546944269910455} 12/23/2021 21:16:51 - INFO - codeparrot_training - Step 6857: {'lr': 0.0004781377767128574, 'samples': 3511296, 'steps': 6857, 'batch_loss/train': 0.8204962192103267} 12/23/2021 21:17:01 - INFO - codeparrot_training - Step 6858: {'lr': 0.00047813132972567143, 'samples': 3511808, 'steps': 6858, 'batch_loss/train': 0.8043594378978014} 12/23/2021 21:17:12 - INFO - codeparrot_training - Step 6859: {'lr': 0.00047812488183152145, 'samples': 3512320, 'steps': 6859, 'batch_loss/train': 1.026729678735137} 12/23/2021 21:17:23 - INFO - codeparrot_training - Step 6860: {'lr': 0.0004781184330304331, 'samples': 3512832, 'steps': 6860, 'batch_loss/train': 0.8216695245355368} 12/23/2021 21:17:36 - INFO - codeparrot_training - Step 6861: {'lr': 0.000478111983322432, 'samples': 3513344, 'steps': 6861, 'batch_loss/train': 0.8799852337688208} 12/23/2021 21:17:46 - INFO - codeparrot_training - Step 6862: {'lr': 0.00047810553270754376, 'samples': 3513856, 'steps': 6862, 'batch_loss/train': 0.8247129013761878} 12/23/2021 21:17:57 - INFO - codeparrot_training - Step 6863: {'lr': 0.00047809908118579404, 'samples': 3514368, 'steps': 6863, 'batch_loss/train': 0.8491076836362481} 12/23/2021 21:18:09 - INFO - codeparrot_training - Step 6864: {'lr': 0.00047809262875720863, 'samples': 3514880, 'steps': 6864, 'batch_loss/train': 0.8476924812421203} 12/23/2021 21:18:19 - INFO - codeparrot_training - Step 6865: {'lr': 0.0004780861754218129, 'samples': 3515392, 
'steps': 6865, 'batch_loss/train': 0.8137873122468591} 12/23/2021 21:18:30 - INFO - codeparrot_training - Step 6866: {'lr': 0.0004780797211796327, 'samples': 3515904, 'steps': 6866, 'batch_loss/train': 0.8919210112653673} 12/23/2021 21:18:42 - INFO - codeparrot_training - Step 6867: {'lr': 0.0004780732660306936, 'samples': 3516416, 'steps': 6867, 'batch_loss/train': 0.844324897043407} 12/23/2021 21:18:53 - INFO - codeparrot_training - Step 6868: {'lr': 0.0004780668099750214, 'samples': 3516928, 'steps': 6868, 'batch_loss/train': 1.444854022236541} 12/23/2021 21:19:03 - INFO - codeparrot_training - Step 6869: {'lr': 0.0004780603530126416, 'samples': 3517440, 'steps': 6869, 'batch_loss/train': 0.8653070954605937} 12/23/2021 21:19:15 - INFO - codeparrot_training - Step 6870: {'lr': 0.00047805389514358, 'samples': 3517952, 'steps': 6870, 'batch_loss/train': 0.8950120368972421} 12/23/2021 21:19:26 - INFO - codeparrot_training - Step 6871: {'lr': 0.00047804743636786225, 'samples': 3518464, 'steps': 6871, 'batch_loss/train': 0.7418174126651138} 12/23/2021 21:19:37 - INFO - codeparrot_training - Step 6872: {'lr': 0.00047804097668551393, 'samples': 3518976, 'steps': 6872, 'batch_loss/train': 0.800217701587826} 12/23/2021 21:19:47 - INFO - codeparrot_training - Step 6873: {'lr': 0.00047803451609656084, 'samples': 3519488, 'steps': 6873, 'batch_loss/train': 0.9161938698962331} 12/23/2021 21:20:01 - INFO - codeparrot_training - Step 6874: {'lr': 0.0004780280546010286, 'samples': 3520000, 'steps': 6874, 'batch_loss/train': 0.8701260765083134} 12/23/2021 21:20:12 - INFO - codeparrot_training - Step 6875: {'lr': 0.0004780215921989429, 'samples': 3520512, 'steps': 6875, 'batch_loss/train': 0.8756373533979058} 12/23/2021 21:20:22 - INFO - codeparrot_training - Step 6876: {'lr': 0.0004780151288903295, 'samples': 3521024, 'steps': 6876, 'batch_loss/train': 0.9333141464740038} 12/23/2021 21:20:34 - INFO - codeparrot_training - Step 6877: {'lr': 0.000478008664675214, 'samples': 
3521536, 'steps': 6877, 'batch_loss/train': 0.8050722442567348} 12/23/2021 21:20:45 - INFO - codeparrot_training - Step 6878: {'lr': 0.0004780021995536221, 'samples': 3522048, 'steps': 6878, 'batch_loss/train': 0.773861629422754} 12/23/2021 21:20:55 - INFO - codeparrot_training - Step 6879: {'lr': 0.0004779957335255796, 'samples': 3522560, 'steps': 6879, 'batch_loss/train': 0.7964892415329814} 12/23/2021 21:21:07 - INFO - codeparrot_training - Step 6880: {'lr': 0.0004779892665911122, 'samples': 3523072, 'steps': 6880, 'batch_loss/train': 0.8098608902655542} 12/23/2021 21:21:18 - INFO - codeparrot_training - Step 6881: {'lr': 0.0004779827987502455, 'samples': 3523584, 'steps': 6881, 'batch_loss/train': 0.8478281069546938} 12/23/2021 21:21:29 - INFO - codeparrot_training - Step 6882: {'lr': 0.00047797633000300525, 'samples': 3524096, 'steps': 6882, 'batch_loss/train': 0.8418682394549251} 12/23/2021 21:21:39 - INFO - codeparrot_training - Step 6883: {'lr': 0.00047796986034941727, 'samples': 3524608, 'steps': 6883, 'batch_loss/train': 0.795907424762845} 12/23/2021 21:21:52 - INFO - codeparrot_training - Step 6884: {'lr': 0.00047796338978950707, 'samples': 3525120, 'steps': 6884, 'batch_loss/train': 0.8303869580850005} 12/23/2021 21:22:03 - INFO - codeparrot_training - Step 6885: {'lr': 0.0004779569183233006, 'samples': 3525632, 'steps': 6885, 'batch_loss/train': 0.9259656025096774} 12/23/2021 21:22:14 - INFO - codeparrot_training - Step 6886: {'lr': 0.0004779504459508235, 'samples': 3526144, 'steps': 6886, 'batch_loss/train': 0.7305252196965739} 12/23/2021 21:22:26 - INFO - codeparrot_training - Step 6887: {'lr': 0.0004779439726721015, 'samples': 3526656, 'steps': 6887, 'batch_loss/train': 0.8586296564899385} 12/23/2021 21:22:36 - INFO - codeparrot_training - Step 6888: {'lr': 0.0004779374984871603, 'samples': 3527168, 'steps': 6888, 'batch_loss/train': 0.8818472079001367} 12/23/2021 21:22:47 - INFO - codeparrot_training - Step 6889: {'lr': 0.0004779310233960257, 
'samples': 3527680, 'steps': 6889, 'batch_loss/train': 0.9410274475812912} 12/23/2021 21:22:59 - INFO - codeparrot_training - Step 6890: {'lr': 0.0004779245473987234, 'samples': 3528192, 'steps': 6890, 'batch_loss/train': 0.8087455299682915} 12/23/2021 21:23:09 - INFO - codeparrot_training - Step 6891: {'lr': 0.0004779180704952792, 'samples': 3528704, 'steps': 6891, 'batch_loss/train': 0.9342662808485329} 12/23/2021 21:23:20 - INFO - codeparrot_training - Step 6892: {'lr': 0.0004779115926857188, 'samples': 3529216, 'steps': 6892, 'batch_loss/train': 0.9096384616568685} 12/23/2021 21:23:33 - INFO - codeparrot_training - Step 6893: {'lr': 0.0004779051139700679, 'samples': 3529728, 'steps': 6893, 'batch_loss/train': 0.850265197455883} 12/23/2021 21:23:44 - INFO - codeparrot_training - Step 6894: {'lr': 0.00047789863434835237, 'samples': 3530240, 'steps': 6894, 'batch_loss/train': 0.7936879615299404} 12/23/2021 21:23:55 - INFO - codeparrot_training - Step 6895: {'lr': 0.0004778921538205979, 'samples': 3530752, 'steps': 6895, 'batch_loss/train': 0.7766846921294928} 12/23/2021 21:24:05 - INFO - codeparrot_training - Step 6896: {'lr': 0.0004778856723868303, 'samples': 3531264, 'steps': 6896, 'batch_loss/train': 0.9429766647517681} 12/23/2021 21:24:17 - INFO - codeparrot_training - Step 6897: {'lr': 0.0004778791900470753, 'samples': 3531776, 'steps': 6897, 'batch_loss/train': 0.7262771585956216} 12/23/2021 21:24:28 - INFO - codeparrot_training - Step 6898: {'lr': 0.00047787270680135866, 'samples': 3532288, 'steps': 6898, 'batch_loss/train': 0.8736706208437681} 12/23/2021 21:24:38 - INFO - codeparrot_training - Step 6899: {'lr': 0.00047786622264970626, 'samples': 3532800, 'steps': 6899, 'batch_loss/train': 0.8382629537954926} 12/23/2021 21:24:52 - INFO - codeparrot_training - Step 6900: {'lr': 0.0004778597375921437, 'samples': 3533312, 'steps': 6900, 'batch_loss/train': 0.887857224792242} 12/23/2021 21:25:02 - INFO - codeparrot_training - Step 6901: {'lr': 
0.00047785325162869686, 'samples': 3533824, 'steps': 6901, 'batch_loss/train': 0.8788446914404631} 12/23/2021 21:25:13 - INFO - codeparrot_training - Step 6902: {'lr': 0.0004778467647593916, 'samples': 3534336, 'steps': 6902, 'batch_loss/train': 0.9211486061103642} 12/23/2021 21:25:25 - INFO - codeparrot_training - Step 6903: {'lr': 0.00047784027698425356, 'samples': 3534848, 'steps': 6903, 'batch_loss/train': 0.8855870626866817} 12/23/2021 21:25:36 - INFO - codeparrot_training - Step 6904: {'lr': 0.0004778337883033087, 'samples': 3535360, 'steps': 6904, 'batch_loss/train': 0.9640071485191584} 12/23/2021 21:25:46 - INFO - codeparrot_training - Step 6905: {'lr': 0.0004778272987165827, 'samples': 3535872, 'steps': 6905, 'batch_loss/train': 0.8632942088879645} 12/23/2021 21:25:57 - INFO - codeparrot_training - Step 6906: {'lr': 0.00047782080822410136, 'samples': 3536384, 'steps': 6906, 'batch_loss/train': 0.916967460885644} 12/23/2021 21:26:09 - INFO - codeparrot_training - Step 6907: {'lr': 0.0004778143168258905, 'samples': 3536896, 'steps': 6907, 'batch_loss/train': 0.7660887343809009} 12/23/2021 21:26:20 - INFO - codeparrot_training - Step 6908: {'lr': 0.0004778078245219759, 'samples': 3537408, 'steps': 6908, 'batch_loss/train': 0.8793529821559787} 12/23/2021 21:26:30 - INFO - codeparrot_training - Step 6909: {'lr': 0.0004778013313123835, 'samples': 3537920, 'steps': 6909, 'batch_loss/train': 0.8436913387849927} 12/23/2021 21:26:43 - INFO - codeparrot_training - Step 6910: {'lr': 0.0004777948371971389, 'samples': 3538432, 'steps': 6910, 'batch_loss/train': 0.910773865878582} 12/23/2021 21:26:53 - INFO - codeparrot_training - Step 6911: {'lr': 0.00047778834217626816, 'samples': 3538944, 'steps': 6911, 'batch_loss/train': 0.8641600804403424} 12/23/2021 21:27:04 - INFO - codeparrot_training - Step 6912: {'lr': 0.00047778184624979695, 'samples': 3539456, 'steps': 6912, 'batch_loss/train': 0.8594919866882265} 12/23/2021 21:27:17 - INFO - codeparrot_training - Step 6913: 
{'lr': 0.000477775349417751, 'samples': 3539968, 'steps': 6913, 'batch_loss/train': 0.7160270513268188} 12/23/2021 21:27:27 - INFO - codeparrot_training - Step 6914: {'lr': 0.0004777688516801564, 'samples': 3540480, 'steps': 6914, 'batch_loss/train': 0.8863203013315797} 12/23/2021 21:27:38 - INFO - codeparrot_training - Step 6915: {'lr': 0.00047776235303703885, 'samples': 3540992, 'steps': 6915, 'batch_loss/train': 0.8603785252198577} 12/23/2021 21:27:49 - INFO - codeparrot_training - Step 6916: {'lr': 0.0004777558534884242, 'samples': 3541504, 'steps': 6916, 'batch_loss/train': 0.873324510641396} 12/23/2021 21:28:01 - INFO - codeparrot_training - Step 6917: {'lr': 0.0004777493530343382, 'samples': 3542016, 'steps': 6917, 'batch_loss/train': 0.8190066455863416} 12/23/2021 21:28:11 - INFO - codeparrot_training - Step 6918: {'lr': 0.0004777428516748068, 'samples': 3542528, 'steps': 6918, 'batch_loss/train': 0.877871772274375} 12/23/2021 21:28:22 - INFO - codeparrot_training - Step 6919: {'lr': 0.0004777363494098558, 'samples': 3543040, 'steps': 6919, 'batch_loss/train': 0.8326048064045608} 12/23/2021 21:28:34 - INFO - codeparrot_training - Step 6920: {'lr': 0.0004777298462395111, 'samples': 3543552, 'steps': 6920, 'batch_loss/train': 0.8132760592270643} 12/23/2021 21:28:45 - INFO - codeparrot_training - Step 6921: {'lr': 0.00047772334216379843, 'samples': 3544064, 'steps': 6921, 'batch_loss/train': 0.7483230878133327} 12/23/2021 21:28:55 - INFO - codeparrot_training - Step 6922: {'lr': 0.00047771683718274386, 'samples': 3544576, 'steps': 6922, 'batch_loss/train': 0.7688084384426475} 12/23/2021 21:29:08 - INFO - codeparrot_training - Step 6923: {'lr': 0.00047771033129637304, 'samples': 3545088, 'steps': 6923, 'batch_loss/train': 0.804033050313592} 12/23/2021 21:29:19 - INFO - codeparrot_training - Step 6924: {'lr': 0.00047770382450471196, 'samples': 3545600, 'steps': 6924, 'batch_loss/train': 0.824066323461011} 12/23/2021 21:29:30 - INFO - codeparrot_training - Step 
6925: {'lr': 0.00047769731680778646, 'samples': 3546112, 'steps': 6925, 'batch_loss/train': 0.7487220182083547} 12/23/2021 21:29:42 - INFO - codeparrot_training - Step 6926: {'lr': 0.0004776908082056224, 'samples': 3546624, 'steps': 6926, 'batch_loss/train': 0.8334716958925128} 12/23/2021 21:29:52 - INFO - codeparrot_training - Step 6927: {'lr': 0.0004776842986982456, 'samples': 3547136, 'steps': 6927, 'batch_loss/train': 0.8945160759612918} 12/23/2021 21:30:03 - INFO - codeparrot_training - Step 6928: {'lr': 0.00047767778828568204, 'samples': 3547648, 'steps': 6928, 'batch_loss/train': 0.887838161084801} 12/23/2021 21:30:13 - INFO - codeparrot_training - Step 6929: {'lr': 0.00047767127696795756, 'samples': 3548160, 'steps': 6929, 'batch_loss/train': 0.7850989541038871} 12/23/2021 21:30:27 - INFO - codeparrot_training - Step 6930: {'lr': 0.00047766476474509807, 'samples': 3548672, 'steps': 6930, 'batch_loss/train': 0.8520954325795174} 12/23/2021 21:30:37 - INFO - codeparrot_training - Step 6931: {'lr': 0.00047765825161712937, 'samples': 3549184, 'steps': 6931, 'batch_loss/train': 0.7806204040534794} 12/23/2021 21:30:48 - INFO - codeparrot_training - Step 6932: {'lr': 0.0004776517375840775, 'samples': 3549696, 'steps': 6932, 'batch_loss/train': 0.8299152031540871} 12/23/2021 21:31:00 - INFO - codeparrot_training - Step 6933: {'lr': 0.00047764522264596824, 'samples': 3550208, 'steps': 6933, 'batch_loss/train': 0.8329054163768888} 12/23/2021 21:31:10 - INFO - codeparrot_training - Step 6934: {'lr': 0.00047763870680282754, 'samples': 3550720, 'steps': 6934, 'batch_loss/train': 0.7841111859306693} 12/23/2021 21:31:21 - INFO - codeparrot_training - Step 6935: {'lr': 0.00047763219005468127, 'samples': 3551232, 'steps': 6935, 'batch_loss/train': 0.7907864074222744} 12/23/2021 21:31:33 - INFO - codeparrot_training - Step 6936: {'lr': 0.0004776256724015554, 'samples': 3551744, 'steps': 6936, 'batch_loss/train': 0.8823614921420813} 12/23/2021 21:31:44 - INFO - 
codeparrot_training - Step 6937: {'lr': 0.00047761915384347586, 'samples': 3552256, 'steps': 6937, 'batch_loss/train': 0.8969839821802452} 12/23/2021 21:31:54 - INFO - codeparrot_training - Step 6938: {'lr': 0.00047761263438046844, 'samples': 3552768, 'steps': 6938, 'batch_loss/train': 0.8140365155413747} 12/23/2021 21:32:05 - INFO - codeparrot_training - Step 6939: {'lr': 0.0004776061140125591, 'samples': 3553280, 'steps': 6939, 'batch_loss/train': 0.921349816955626} 12/23/2021 21:32:18 - INFO - codeparrot_training - Step 6940: {'lr': 0.0004775995927397738, 'samples': 3553792, 'steps': 6940, 'batch_loss/train': 0.7398234708234668} 12/23/2021 21:32:29 - INFO - codeparrot_training - Step 6941: {'lr': 0.0004775930705621385, 'samples': 3554304, 'steps': 6941, 'batch_loss/train': 0.6969168032519519} 12/23/2021 21:32:40 - INFO - codeparrot_training - Step 6942: {'lr': 0.0004775865474796791, 'samples': 3554816, 'steps': 6942, 'batch_loss/train': 0.8029583496972919} 12/23/2021 21:32:52 - INFO - codeparrot_training - Step 6943: {'lr': 0.00047758002349242146, 'samples': 3555328, 'steps': 6943, 'batch_loss/train': 0.873280230909586} 12/23/2021 21:33:02 - INFO - codeparrot_training - Step 6944: {'lr': 0.0004775734986003916, 'samples': 3555840, 'steps': 6944, 'batch_loss/train': 0.8082992830313742} 12/23/2021 21:33:13 - INFO - codeparrot_training - Step 6945: {'lr': 0.0004775669728036155, 'samples': 3556352, 'steps': 6945, 'batch_loss/train': 0.9000324052758515} 12/23/2021 21:33:25 - INFO - codeparrot_training - Step 6946: {'lr': 0.000477560446102119, 'samples': 3556864, 'steps': 6946, 'batch_loss/train': 0.7769851819612086} 12/23/2021 21:33:36 - INFO - codeparrot_training - Step 6947: {'lr': 0.000477553918495928, 'samples': 3557376, 'steps': 6947, 'batch_loss/train': 0.8370891949161887} 12/23/2021 21:33:46 - INFO - codeparrot_training - Step 6948: {'lr': 0.00047754738998506864, 'samples': 3557888, 'steps': 6948, 'batch_loss/train': 0.7971552750095725} 12/23/2021 21:33:59 - 
INFO - codeparrot_training - Step 6949: {'lr': 0.0004775408605695668, 'samples': 3558400, 'steps': 6949, 'batch_loss/train': 0.7799787297844887} 12/23/2021 21:34:10 - INFO - codeparrot_training - Step 6950: {'lr': 0.0004775343302494484, 'samples': 3558912, 'steps': 6950, 'batch_loss/train': 0.7929456057026982} 12/23/2021 21:34:20 - INFO - codeparrot_training - Step 6951: {'lr': 0.00047752779902473933, 'samples': 3559424, 'steps': 6951, 'batch_loss/train': 0.8636689987033606} 12/23/2021 21:34:31 - INFO - codeparrot_training - Step 6952: {'lr': 0.0004775212668954657, 'samples': 3559936, 'steps': 6952, 'batch_loss/train': 0.8237831378355622} 12/23/2021 21:34:43 - INFO - codeparrot_training - Step 6953: {'lr': 0.0004775147338616534, 'samples': 3560448, 'steps': 6953, 'batch_loss/train': 0.7537208134308457} 12/23/2021 21:34:54 - INFO - codeparrot_training - Step 6954: {'lr': 0.00047750819992332846, 'samples': 3560960, 'steps': 6954, 'batch_loss/train': 0.6149218869395554} 12/23/2021 21:35:04 - INFO - codeparrot_training - Step 6955: {'lr': 0.0004775016650805168, 'samples': 3561472, 'steps': 6955, 'batch_loss/train': 0.6823960365727544} 12/23/2021 21:35:16 - INFO - codeparrot_training - Step 6956: {'lr': 0.00047749512933324445, 'samples': 3561984, 'steps': 6956, 'batch_loss/train': 0.8558639781549573} 12/23/2021 21:35:27 - INFO - codeparrot_training - Step 6957: {'lr': 0.00047748859268153735, 'samples': 3562496, 'steps': 6957, 'batch_loss/train': 0.8584126867353916} 12/23/2021 21:35:37 - INFO - codeparrot_training - Step 6958: {'lr': 0.0004774820551254214, 'samples': 3563008, 'steps': 6958, 'batch_loss/train': 0.7832226548343897} 12/23/2021 21:35:48 - INFO - codeparrot_training - Step 6959: {'lr': 0.00047747551666492286, 'samples': 3563520, 'steps': 6959, 'batch_loss/train': 0.9911059439182281} 12/23/2021 21:36:01 - INFO - codeparrot_training - Step 6960: {'lr': 0.00047746897730006747, 'samples': 3564032, 'steps': 6960, 'batch_loss/train': 1.0335938390344381} 12/23/2021 
21:36:12 - INFO - codeparrot_training - Step 6961: {'lr': 0.0004774624370308813, 'samples': 3564544, 'steps': 6961, 'batch_loss/train': 0.7864361939718947} 12/23/2021 21:36:22 - INFO - codeparrot_training - Step 6962: {'lr': 0.0004774558958573904, 'samples': 3565056, 'steps': 6962, 'batch_loss/train': 0.8587045464664698} 12/23/2021 21:36:34 - INFO - codeparrot_training - Step 6963: {'lr': 0.00047744935377962074, 'samples': 3565568, 'steps': 6963, 'batch_loss/train': 0.9856368647888303} 12/23/2021 21:36:45 - INFO - codeparrot_training - Step 6964: {'lr': 0.00047744281079759835, 'samples': 3566080, 'steps': 6964, 'batch_loss/train': 0.8497234676033258} 12/23/2021 21:36:56 - INFO - codeparrot_training - Step 6965: {'lr': 0.00047743626691134917, 'samples': 3566592, 'steps': 6965, 'batch_loss/train': 0.7528545372188091} 12/23/2021 21:37:08 - INFO - codeparrot_training - Step 6966: {'lr': 0.0004774297221208993, 'samples': 3567104, 'steps': 6966, 'batch_loss/train': 0.6819566730409861} 12/23/2021 21:37:18 - INFO - codeparrot_training - Step 6967: {'lr': 0.0004774231764262747, 'samples': 3567616, 'steps': 6967, 'batch_loss/train': 0.9301007464528084} 12/23/2021 21:37:29 - INFO - codeparrot_training - Step 6968: {'lr': 0.0004774166298275015, 'samples': 3568128, 'steps': 6968, 'batch_loss/train': 0.8708732137456536} 12/23/2021 21:37:39 - INFO - codeparrot_training - Step 6969: {'lr': 0.00047741008232460557, 'samples': 3568640, 'steps': 6969, 'batch_loss/train': 0.8726643212139606} 12/23/2021 21:37:53 - INFO - codeparrot_training - Step 6970: {'lr': 0.000477403533917613, 'samples': 3569152, 'steps': 6970, 'batch_loss/train': 0.7577850488014519} 12/23/2021 21:38:04 - INFO - codeparrot_training - Step 6971: {'lr': 0.00047739698460654993, 'samples': 3569664, 'steps': 6971, 'batch_loss/train': 0.9218739485368133} 12/23/2021 21:38:14 - INFO - codeparrot_training - Step 6972: {'lr': 0.0004773904343914423, 'samples': 3570176, 'steps': 6972, 'batch_loss/train': 0.7965298276394606} 
12/23/2021 21:38:26 - INFO - codeparrot_training - Step 6973: {'lr': 0.00047738388327231613, 'samples': 3570688, 'steps': 6973, 'batch_loss/train': 0.5234253676608205} 12/23/2021 21:38:37 - INFO - codeparrot_training - Step 6974: {'lr': 0.0004773773312491975, 'samples': 3571200, 'steps': 6974, 'batch_loss/train': 0.7423238670453429} 12/23/2021 21:38:48 - INFO - codeparrot_training - Step 6975: {'lr': 0.0004773707783221125, 'samples': 3571712, 'steps': 6975, 'batch_loss/train': 0.8767646825872362} 12/23/2021 21:38:58 - INFO - codeparrot_training - Step 6976: {'lr': 0.00047736422449108705, 'samples': 3572224, 'steps': 6976, 'batch_loss/train': 0.8286002082750201} 12/23/2021 21:39:10 - INFO - codeparrot_training - Step 6977: {'lr': 0.0004773576697561474, 'samples': 3572736, 'steps': 6977, 'batch_loss/train': 0.9640879780054092} 12/23/2021 21:39:21 - INFO - codeparrot_training - Step 6978: {'lr': 0.00047735111411731937, 'samples': 3573248, 'steps': 6978, 'batch_loss/train': 0.8873682417906821} 12/23/2021 21:39:32 - INFO - codeparrot_training - Step 6979: {'lr': 0.00047734455757462925, 'samples': 3573760, 'steps': 6979, 'batch_loss/train': 0.8106065536849201} 12/23/2021 21:39:44 - INFO - codeparrot_training - Step 6980: {'lr': 0.00047733800012810304, 'samples': 3574272, 'steps': 6980, 'batch_loss/train': 0.9439310887828469} 12/23/2021 21:39:54 - INFO - codeparrot_training - Step 6981: {'lr': 0.00047733144177776666, 'samples': 3574784, 'steps': 6981, 'batch_loss/train': 0.9083540197461843} 12/23/2021 21:40:05 - INFO - codeparrot_training - Step 6982: {'lr': 0.00047732488252364634, 'samples': 3575296, 'steps': 6982, 'batch_loss/train': 0.857195420190692} 12/23/2021 21:40:18 - INFO - codeparrot_training - Step 6983: {'lr': 0.0004773183223657682, 'samples': 3575808, 'steps': 6983, 'batch_loss/train': 0.9363523237407207} 12/23/2021 21:40:29 - INFO - codeparrot_training - Step 6984: {'lr': 0.00047731176130415823, 'samples': 3576320, 'steps': 6984, 'batch_loss/train': 
0.831555612385273} 12/23/2021 21:40:39 - INFO - codeparrot_training - Step 6985: {'lr': 0.00047730519933884244, 'samples': 3576832, 'steps': 6985, 'batch_loss/train': 1.1711747404187918} 12/23/2021 21:40:50 - INFO - codeparrot_training - Step 6986: {'lr': 0.00047729863646984706, 'samples': 3577344, 'steps': 6986, 'batch_loss/train': 0.9158955286256969} 12/23/2021 21:41:02 - INFO - codeparrot_training - Step 6987: {'lr': 0.0004772920726971981, 'samples': 3577856, 'steps': 6987, 'batch_loss/train': 0.7504816681612283} 12/23/2021 21:41:13 - INFO - codeparrot_training - Step 6988: {'lr': 0.00047728550802092175, 'samples': 3578368, 'steps': 6988, 'batch_loss/train': 0.8788252212107182} 12/23/2021 21:41:23 - INFO - codeparrot_training - Step 6989: {'lr': 0.00047727894244104397, 'samples': 3578880, 'steps': 6989, 'batch_loss/train': 0.8937845444306731} 12/23/2021 21:41:36 - INFO - codeparrot_training - Step 6990: {'lr': 0.000477272375957591, 'samples': 3579392, 'steps': 6990, 'batch_loss/train': 0.8343823002651334} 12/23/2021 21:41:47 - INFO - codeparrot_training - Step 6991: {'lr': 0.0004772658085705888, 'samples': 3579904, 'steps': 6991, 'batch_loss/train': 0.9020528383553028} 12/23/2021 21:41:57 - INFO - codeparrot_training - Step 6992: {'lr': 0.0004772592402800636, 'samples': 3580416, 'steps': 6992, 'batch_loss/train': 0.8797792373225093} 12/23/2021 21:42:09 - INFO - codeparrot_training - Step 6993: {'lr': 0.00047725267108604145, 'samples': 3580928, 'steps': 6993, 'batch_loss/train': 0.7949721990153193} 12/23/2021 21:42:20 - INFO - codeparrot_training - Step 6994: {'lr': 0.0004772461009885485, 'samples': 3581440, 'steps': 6994, 'batch_loss/train': 0.9329234268516302} 12/23/2021 21:42:30 - INFO - codeparrot_training - Step 6995: {'lr': 0.0004772395299876109, 'samples': 3581952, 'steps': 6995, 'batch_loss/train': 0.8202592865563929} 12/23/2021 21:42:41 - INFO - codeparrot_training - Step 6996: {'lr': 0.0004772329580832546, 'samples': 3582464, 'steps': 6996, 
'batch_loss/train': 0.8843750730156898} 12/23/2021 21:42:53 - INFO - codeparrot_training - Step 6997: {'lr': 0.00047722638527550597, 'samples': 3582976, 'steps': 6997, 'batch_loss/train': 0.8224238567054272} 12/23/2021 21:43:04 - INFO - codeparrot_training - Step 6998: {'lr': 0.00047721981156439097, 'samples': 3583488, 'steps': 6998, 'batch_loss/train': 0.7706106407567859} 12/23/2021 21:43:14 - INFO - codeparrot_training - Step 6999: {'lr': 0.0004772132369499359, 'samples': 3584000, 'steps': 6999, 'batch_loss/train': 0.9770363448187709} 12/23/2021 21:43:28 - INFO - codeparrot_training - Step 7000: {'lr': 0.00047720666143216667, 'samples': 3584512, 'steps': 7000, 'batch_loss/train': 0.8627406032755971} 12/23/2021 21:43:38 - INFO - codeparrot_training - Step 7001: {'lr': 0.00047720008501110964, 'samples': 3585024, 'steps': 7001, 'batch_loss/train': 0.8209858741611242} 12/23/2021 21:43:49 - INFO - codeparrot_training - Step 7002: {'lr': 0.0004771935076867908, 'samples': 3585536, 'steps': 7002, 'batch_loss/train': 0.8554926086217165} 12/23/2021 21:44:01 - INFO - codeparrot_training - Step 7003: {'lr': 0.00047718692945923635, 'samples': 3586048, 'steps': 7003, 'batch_loss/train': 0.8171973074786365} 12/23/2021 21:44:12 - INFO - codeparrot_training - Step 7004: {'lr': 0.0004771803503284725, 'samples': 3586560, 'steps': 7004, 'batch_loss/train': 0.8709346745163202} 12/23/2021 21:44:22 - INFO - codeparrot_training - Step 7005: {'lr': 0.0004771737702945253, 'samples': 3587072, 'steps': 7005, 'batch_loss/train': 0.7434025150723755} 12/23/2021 21:44:33 - INFO - codeparrot_training - Step 7006: {'lr': 0.000477167189357421, 'samples': 3587584, 'steps': 7006, 'batch_loss/train': 0.8018251610919833} 12/23/2021 21:44:46 - INFO - codeparrot_training - Step 7007: {'lr': 0.0004771606075171857, 'samples': 3588096, 'steps': 7007, 'batch_loss/train': 0.8249971307814121} 12/23/2021 21:44:57 - INFO - codeparrot_training - Step 7008: {'lr': 0.00047715402477384566, 'samples': 3588608, 
'steps': 7008, 'batch_loss/train': 1.0606078249402344} 12/23/2021 21:45:07 - INFO - codeparrot_training - Step 7009: {'lr': 0.00047714744112742694, 'samples': 3589120, 'steps': 7009, 'batch_loss/train': 0.9145002886652946} 12/23/2021 21:45:20 - INFO - codeparrot_training - Step 7010: {'lr': 0.0004771408565779558, 'samples': 3589632, 'steps': 7010, 'batch_loss/train': 0.7489120806567371} 12/23/2021 21:45:30 - INFO - codeparrot_training - Step 7011: {'lr': 0.0004771342711254584, 'samples': 3590144, 'steps': 7011, 'batch_loss/train': 0.7681552087888122} 12/23/2021 21:45:41 - INFO - codeparrot_training - Step 7012: {'lr': 0.00047712768476996083, 'samples': 3590656, 'steps': 7012, 'batch_loss/train': 0.8108293209224939} 12/23/2021 21:45:54 - INFO - codeparrot_training - Step 7013: {'lr': 0.00047712109751148934, 'samples': 3591168, 'steps': 7013, 'batch_loss/train': 0.833675223402679} 12/23/2021 21:46:05 - INFO - codeparrot_training - Step 7014: {'lr': 0.00047711450935007015, 'samples': 3591680, 'steps': 7014, 'batch_loss/train': 0.9611263908445835} 12/23/2021 21:46:15 - INFO - codeparrot_training - Step 7015: {'lr': 0.0004771079202857294, 'samples': 3592192, 'steps': 7015, 'batch_loss/train': 0.8538799770176411} 12/23/2021 21:46:26 - INFO - codeparrot_training - Step 7016: {'lr': 0.0004771013303184933, 'samples': 3592704, 'steps': 7016, 'batch_loss/train': 1.3463701736181974} 12/23/2021 21:46:38 - INFO - codeparrot_training - Step 7017: {'lr': 0.00047709473944838816, 'samples': 3593216, 'steps': 7017, 'batch_loss/train': 0.9446089905686677} 12/23/2021 21:46:48 - INFO - codeparrot_training - Step 7018: {'lr': 0.00047708814767544004, 'samples': 3593728, 'steps': 7018, 'batch_loss/train': 0.8777445056475699} 12/23/2021 21:46:59 - INFO - codeparrot_training - Step 7019: {'lr': 0.0004770815549996752, 'samples': 3594240, 'steps': 7019, 'batch_loss/train': 0.7572078624507412} 12/23/2021 21:47:12 - INFO - codeparrot_training - Step 7020: {'lr': 0.00047707496142111977, 
'samples': 3594752, 'steps': 7020, 'batch_loss/train': 0.8707859227433801} 12/23/2021 21:47:23 - INFO - codeparrot_training - Step 7021: {'lr': 0.00047706836693980003, 'samples': 3595264, 'steps': 7021, 'batch_loss/train': 0.9137544594705105} 12/23/2021 21:47:33 - INFO - codeparrot_training - Step 7022: {'lr': 0.0004770617715557423, 'samples': 3595776, 'steps': 7022, 'batch_loss/train': 0.8091956032440066} 12/23/2021 21:47:44 - INFO - codeparrot_training - Step 7023: {'lr': 0.00047705517526897267, 'samples': 3596288, 'steps': 7023, 'batch_loss/train': 0.8850630563683808} 12/23/2021 21:47:56 - INFO - codeparrot_training - Step 7024: {'lr': 0.00047704857807951735, 'samples': 3596800, 'steps': 7024, 'batch_loss/train': 0.7992569357156754} 12/23/2021 21:48:06 - INFO - codeparrot_training - Step 7025: {'lr': 0.00047704197998740263, 'samples': 3597312, 'steps': 7025, 'batch_loss/train': 0.7226830534636974} 12/23/2021 21:48:17 - INFO - codeparrot_training - Step 7026: {'lr': 0.00047703538099265475, 'samples': 3597824, 'steps': 7026, 'batch_loss/train': 0.7723088413476944} 12/23/2021 21:48:29 - INFO - codeparrot_training - Step 7027: {'lr': 0.0004770287810952999, 'samples': 3598336, 'steps': 7027, 'batch_loss/train': 0.8270145659334958} 12/23/2021 21:48:39 - INFO - codeparrot_training - Step 7028: {'lr': 0.0004770221802953644, 'samples': 3598848, 'steps': 7028, 'batch_loss/train': 0.7483779487665743} 12/23/2021 21:48:50 - INFO - codeparrot_training - Step 7029: {'lr': 0.00047701557859287436, 'samples': 3599360, 'steps': 7029, 'batch_loss/train': 0.8419028399512172} 12/23/2021 21:49:03 - INFO - codeparrot_training - Step 7030: {'lr': 0.0004770089759878561, 'samples': 3599872, 'steps': 7030, 'batch_loss/train': 0.8473116201348603} 12/23/2021 21:49:14 - INFO - codeparrot_training - Step 7031: {'lr': 0.0004770023724803358, 'samples': 3600384, 'steps': 7031, 'batch_loss/train': 0.8210955592803657} 12/23/2021 21:49:25 - INFO - codeparrot_training - Step 7032: {'lr': 
0.00047699576807033994, 'samples': 3600896, 'steps': 7032, 'batch_loss/train': 0.9150357120670378} 12/23/2021 21:49:35 - INFO - codeparrot_training - Step 7033: {'lr': 0.0004769891627578945, 'samples': 3601408, 'steps': 7033, 'batch_loss/train': 0.7812175392173231} 12/23/2021 21:49:47 - INFO - codeparrot_training - Step 7034: {'lr': 0.0004769825565430259, 'samples': 3601920, 'steps': 7034, 'batch_loss/train': 0.7894929046742618} 12/23/2021 21:49:58 - INFO - codeparrot_training - Step 7035: {'lr': 0.0004769759494257604, 'samples': 3602432, 'steps': 7035, 'batch_loss/train': 0.821730013936758} 12/23/2021 21:50:08 - INFO - codeparrot_training - Step 7036: {'lr': 0.0004769693414061241, 'samples': 3602944, 'steps': 7036, 'batch_loss/train': 0.8367009349167347} 12/23/2021 21:50:22 - INFO - codeparrot_training - Step 7037: {'lr': 0.0004769627324841436, 'samples': 3603456, 'steps': 7037, 'batch_loss/train': 1.021661525592208} 12/23/2021 21:50:32 - INFO - codeparrot_training - Step 7038: {'lr': 0.0004769561226598448, 'samples': 3603968, 'steps': 7038, 'batch_loss/train': 0.8633162369951606} 12/23/2021 21:50:43 - INFO - codeparrot_training - Step 7039: {'lr': 0.0004769495119332542, 'samples': 3604480, 'steps': 7039, 'batch_loss/train': 0.8557260381057858} 12/23/2021 21:50:55 - INFO - codeparrot_training - Step 7040: {'lr': 0.00047694290030439804, 'samples': 3604992, 'steps': 7040, 'batch_loss/train': 0.8030759415123612} 12/23/2021 21:51:05 - INFO - codeparrot_training - Step 7041: {'lr': 0.00047693628777330254, 'samples': 3605504, 'steps': 7041, 'batch_loss/train': 0.7809805022552609} 12/23/2021 21:51:16 - INFO - codeparrot_training - Step 7042: {'lr': 0.0004769296743399941, 'samples': 3606016, 'steps': 7042, 'batch_loss/train': 0.8357144435867667} 12/23/2021 21:51:28 - INFO - codeparrot_training - Step 7043: {'lr': 0.000476923060004499, 'samples': 3606528, 'steps': 7043, 'batch_loss/train': 0.9061266211792827} 12/23/2021 21:51:39 - INFO - codeparrot_training - Step 7044: 
{'lr': 0.0004769164447668435, 'samples': 3607040, 'steps': 7044, 'batch_loss/train': 0.8316645794548094} 12/23/2021 21:51:49 - INFO - codeparrot_training - Step 7045: {'lr': 0.0004769098286270538, 'samples': 3607552, 'steps': 7045, 'batch_loss/train': 0.7856781715527177} 12/23/2021 21:52:00 - INFO - codeparrot_training - Step 7046: {'lr': 0.0004769032115851564, 'samples': 3608064, 'steps': 7046, 'batch_loss/train': 0.8048768024891615} 12/23/2021 21:52:13 - INFO - codeparrot_training - Step 7047: {'lr': 0.00047689659364117743, 'samples': 3608576, 'steps': 7047, 'batch_loss/train': 0.8409268818795681} 12/23/2021 21:52:23 - INFO - codeparrot_training - Step 7048: {'lr': 0.00047688997479514335, 'samples': 3609088, 'steps': 7048, 'batch_loss/train': 0.8756671724840999} 12/23/2021 21:52:34 - INFO - codeparrot_training - Step 7049: {'lr': 0.0004768833550470804, 'samples': 3609600, 'steps': 7049, 'batch_loss/train': 0.8765383241698146} 12/23/2021 21:52:46 - INFO - codeparrot_training - Step 7050: {'lr': 0.0004768767343970148, 'samples': 3610112, 'steps': 7050, 'batch_loss/train': 0.8027260899543762} 12/23/2021 21:52:56 - INFO - codeparrot_training - Step 7051: {'lr': 0.00047687011284497306, 'samples': 3610624, 'steps': 7051, 'batch_loss/train': 0.8468901868909597} 12/23/2021 21:53:07 - INFO - codeparrot_training - Step 7052: {'lr': 0.0004768634903909813, 'samples': 3611136, 'steps': 7052, 'batch_loss/train': 0.8151417942717671} 12/23/2021 21:53:21 - INFO - codeparrot_training - Step 7053: {'lr': 0.0004768568670350661, 'samples': 3611648, 'steps': 7053, 'batch_loss/train': 0.8688656468875706} 12/23/2021 21:53:31 - INFO - codeparrot_training - Step 7054: {'lr': 0.00047685024277725356, 'samples': 3612160, 'steps': 7054, 'batch_loss/train': 0.7908484635408968} 12/23/2021 21:53:42 - INFO - codeparrot_training - Step 7055: {'lr': 0.00047684361761757017, 'samples': 3612672, 'steps': 7055, 'batch_loss/train': 0.9079984035342932} 12/23/2021 21:53:53 - INFO - codeparrot_training - 
Step 7056: {'lr': 0.0004768369915560422, 'samples': 3613184, 'steps': 7056, 'batch_loss/train': 0.8892711522057652} 12/23/2021 21:54:05 - INFO - codeparrot_training - Step 7057: {'lr': 0.000476830364592696, 'samples': 3613696, 'steps': 7057, 'batch_loss/train': 0.8204667172394693} 12/23/2021 21:54:15 - INFO - codeparrot_training - Step 7058: {'lr': 0.00047682373672755784, 'samples': 3614208, 'steps': 7058, 'batch_loss/train': 0.7685574372299016} 12/23/2021 21:54:26 - INFO - codeparrot_training - Step 7059: {'lr': 0.00047681710796065424, 'samples': 3614720, 'steps': 7059, 'batch_loss/train': 0.8492756919004023} 12/23/2021 21:54:38 - INFO - codeparrot_training - Step 7060: {'lr': 0.00047681047829201145, 'samples': 3615232, 'steps': 7060, 'batch_loss/train': 0.8275305279530585} 12/23/2021 21:54:49 - INFO - codeparrot_training - Step 7061: {'lr': 0.0004768038477216559, 'samples': 3615744, 'steps': 7061, 'batch_loss/train': 0.8560773991048336} 12/23/2021 21:55:00 - INFO - codeparrot_training - Step 7062: {'lr': 0.00047679721624961375, 'samples': 3616256, 'steps': 7062, 'batch_loss/train': 0.8078741421923041} 12/23/2021 21:55:10 - INFO - codeparrot_training - Step 7063: {'lr': 0.0004767905838759116, 'samples': 3616768, 'steps': 7063, 'batch_loss/train': 0.859865877777338} 12/23/2021 21:55:22 - INFO - codeparrot_training - Step 7064: {'lr': 0.0004767839506005757, 'samples': 3617280, 'steps': 7064, 'batch_loss/train': 0.9333210922777653} 12/23/2021 21:55:33 - INFO - codeparrot_training - Step 7065: {'lr': 0.0004767773164236324, 'samples': 3617792, 'steps': 7065, 'batch_loss/train': 0.8756831269711256} 12/23/2021 21:55:43 - INFO - codeparrot_training - Step 7066: {'lr': 0.00047677068134510816, 'samples': 3618304, 'steps': 7066, 'batch_loss/train': 0.9581021247431636} 12/23/2021 21:55:57 - INFO - codeparrot_training - Step 7067: {'lr': 0.00047676404536502924, 'samples': 3618816, 'steps': 7067, 'batch_loss/train': 0.8781484963838011} 12/23/2021 21:56:07 - INFO - 
codeparrot_training - Step 7068: {'lr': 0.0004767574084834222, 'samples': 3619328, 'steps': 7068, 'batch_loss/train': 0.8242837665602565} 12/23/2021 21:56:18 - INFO - codeparrot_training - Step 7069: {'lr': 0.0004767507707003133, 'samples': 3619840, 'steps': 7069, 'batch_loss/train': 0.8424425749108195} 12/23/2021 21:56:30 - INFO - codeparrot_training - Step 7070: {'lr': 0.000476744132015729, 'samples': 3620352, 'steps': 7070, 'batch_loss/train': 0.9247558452188969} 12/23/2021 21:56:41 - INFO - codeparrot_training - Step 7071: {'lr': 0.00047673749242969557, 'samples': 3620864, 'steps': 7071, 'batch_loss/train': 0.7969437669962645} 12/23/2021 21:56:52 - INFO - codeparrot_training - Step 7072: {'lr': 0.0004767308519422395, 'samples': 3621376, 'steps': 7072, 'batch_loss/train': 0.8529840242117643} 12/23/2021 21:57:02 - INFO - codeparrot_training - Step 7073: {'lr': 0.00047672421055338716, 'samples': 3621888, 'steps': 7073, 'batch_loss/train': 0.6819092989899218} 12/23/2021 21:57:15 - INFO - codeparrot_training - Step 7074: {'lr': 0.00047671756826316497, 'samples': 3622400, 'steps': 7074, 'batch_loss/train': 0.8025955725461245} 12/23/2021 21:57:26 - INFO - codeparrot_training - Step 7075: {'lr': 0.0004767109250715994, 'samples': 3622912, 'steps': 7075, 'batch_loss/train': 0.9294234002009034} 12/23/2021 21:57:36 - INFO - codeparrot_training - Step 7076: {'lr': 0.00047670428097871663, 'samples': 3623424, 'steps': 7076, 'batch_loss/train': 0.7387437426950783} 12/23/2021 21:57:50 - INFO - codeparrot_training - Step 7077: {'lr': 0.0004766976359845434, 'samples': 3623936, 'steps': 7077, 'batch_loss/train': 0.5355094410479069} 12/23/2021 21:58:01 - INFO - codeparrot_training - Step 7078: {'lr': 0.00047669099008910596, 'samples': 3624448, 'steps': 7078, 'batch_loss/train': 0.8686794368550181} 12/23/2021 21:58:12 - INFO - codeparrot_training - Step 7079: {'lr': 0.0004766843432924307, 'samples': 3624960, 'steps': 7079, 'batch_loss/train': 0.9436307149007916} 12/23/2021 21:58:22 
- INFO - codeparrot_training - Step 7080: {'lr': 0.00047667769559454404, 'samples': 3625472, 'steps': 7080, 'batch_loss/train': 0.8577733309939504} 12/23/2021 21:58:34 - INFO - codeparrot_training - Step 7081: {'lr': 0.00047667104699547247, 'samples': 3625984, 'steps': 7081, 'batch_loss/train': 0.9396647168323398} 12/23/2021 21:58:45 - INFO - codeparrot_training - Step 7082: {'lr': 0.00047666439749524247, 'samples': 3626496, 'steps': 7082, 'batch_loss/train': 0.8304682312300429} 12/23/2021 21:58:55 - INFO - codeparrot_training - Step 7083: {'lr': 0.0004766577470938803, 'samples': 3627008, 'steps': 7083, 'batch_loss/train': 0.8625599388033152} 12/23/2021 21:59:09 - INFO - codeparrot_training - Step 7084: {'lr': 0.00047665109579141264, 'samples': 3627520, 'steps': 7084, 'batch_loss/train': 0.9053611978888512} 12/23/2021 21:59:20 - INFO - codeparrot_training - Step 7085: {'lr': 0.00047664444358786574, 'samples': 3628032, 'steps': 7085, 'batch_loss/train': 0.8335259025916457} 12/23/2021 21:59:30 - INFO - codeparrot_training - Step 7086: {'lr': 0.00047663779048326605, 'samples': 3628544, 'steps': 7086, 'batch_loss/train': 1.16911512427032} 12/23/2021 21:59:41 - INFO - codeparrot_training - Step 7087: {'lr': 0.00047663113647764015, 'samples': 3629056, 'steps': 7087, 'batch_loss/train': 0.8802017355337739} 12/23/2021 21:59:53 - INFO - codeparrot_training - Step 7088: {'lr': 0.00047662448157101444, 'samples': 3629568, 'steps': 7088, 'batch_loss/train': 1.0103995138779283} 12/23/2021 22:00:03 - INFO - codeparrot_training - Step 7089: {'lr': 0.0004766178257634153, 'samples': 3630080, 'steps': 7089, 'batch_loss/train': 0.8452618420124054} 12/23/2021 22:00:14 - INFO - codeparrot_training - Step 7090: {'lr': 0.00047661116905486933, 'samples': 3630592, 'steps': 7090, 'batch_loss/train': 0.8492038112599403} 12/23/2021 22:00:26 - INFO - codeparrot_training - Step 7091: {'lr': 0.00047660451144540284, 'samples': 3631104, 'steps': 7091, 'batch_loss/train': 0.8572692908346653} 
12/23/2021 22:00:37 - INFO - codeparrot_training - Step 7092: {'lr': 0.0004765978529350424, 'samples': 3631616, 'steps': 7092, 'batch_loss/train': 0.9819591864943504} 12/23/2021 22:00:47 - INFO - codeparrot_training - Step 7093: {'lr': 0.00047659119352381455, 'samples': 3632128, 'steps': 7093, 'batch_loss/train': 0.9078832312952727} 12/23/2021 22:01:00 - INFO - codeparrot_training - Step 7094: {'lr': 0.00047658453321174554, 'samples': 3632640, 'steps': 7094, 'batch_loss/train': 0.8477963795885444} 12/23/2021 22:01:11 - INFO - codeparrot_training - Step 7095: {'lr': 0.00047657787199886206, 'samples': 3633152, 'steps': 7095, 'batch_loss/train': 0.9292211914435029} 12/23/2021 22:01:21 - INFO - codeparrot_training - Step 7096: {'lr': 0.0004765712098851904, 'samples': 3633664, 'steps': 7096, 'batch_loss/train': 0.8953883415088058} 12/23/2021 22:01:33 - INFO - codeparrot_training - Step 7097: {'lr': 0.0004765645468707573, 'samples': 3634176, 'steps': 7097, 'batch_loss/train': 0.9221909549087286} 12/23/2021 22:01:44 - INFO - codeparrot_training - Step 7098: {'lr': 0.000476557882955589, 'samples': 3634688, 'steps': 7098, 'batch_loss/train': 0.8659280855208635} 12/23/2021 22:01:55 - INFO - codeparrot_training - Step 7099: {'lr': 0.0004765512181397122, 'samples': 3635200, 'steps': 7099, 'batch_loss/train': 0.8665328733623028} 12/23/2021 22:02:05 - INFO - codeparrot_training - Step 7100: {'lr': 0.00047654455242315325, 'samples': 3635712, 'steps': 7100, 'batch_loss/train': 0.9241405609063804} 12/23/2021 22:02:17 - INFO - codeparrot_training - Step 7101: {'lr': 0.0004765378858059387, 'samples': 3636224, 'steps': 7101, 'batch_loss/train': 0.9064558884128928} 12/23/2021 22:02:28 - INFO - codeparrot_training - Step 7102: {'lr': 0.0004765312182880951, 'samples': 3636736, 'steps': 7102, 'batch_loss/train': 0.8637956436723471} 12/23/2021 22:02:38 - INFO - codeparrot_training - Step 7103: {'lr': 0.0004765245498696488, 'samples': 3637248, 'steps': 7103, 'batch_loss/train': 
0.9026485257782042} 12/23/2021 22:02:50 - INFO - codeparrot_training - Step 7104: {'lr': 0.0004765178805506265, 'samples': 3637760, 'steps': 7104, 'batch_loss/train': 0.9055257830768824} 12/23/2021 22:03:01 - INFO - codeparrot_training - Step 7105: {'lr': 0.0004765112103310546, 'samples': 3638272, 'steps': 7105, 'batch_loss/train': 1.0557616939768195} 12/23/2021 22:03:12 - INFO - codeparrot_training - Step 7106: {'lr': 0.0004765045392109597, 'samples': 3638784, 'steps': 7106, 'batch_loss/train': 0.8269264586269855} 12/23/2021 22:03:25 - INFO - codeparrot_training - Step 7107: {'lr': 0.0004764978671903683, 'samples': 3639296, 'steps': 7107, 'batch_loss/train': 0.9484722539782524} 12/23/2021 22:03:36 - INFO - codeparrot_training - Step 7108: {'lr': 0.00047649119426930686, 'samples': 3639808, 'steps': 7108, 'batch_loss/train': 0.890125778503716} 12/23/2021 22:03:47 - INFO - codeparrot_training - Step 7109: {'lr': 0.00047648452044780193, 'samples': 3640320, 'steps': 7109, 'batch_loss/train': 0.8303958363831043} 12/23/2021 22:03:57 - INFO - codeparrot_training - Step 7110: {'lr': 0.00047647784572588006, 'samples': 3640832, 'steps': 7110, 'batch_loss/train': 0.9781421292573214} 12/23/2021 22:04:09 - INFO - codeparrot_training - Step 7111: {'lr': 0.00047647117010356785, 'samples': 3641344, 'steps': 7111, 'batch_loss/train': 0.8151944568380713} 12/23/2021 22:04:20 - INFO - codeparrot_training - Step 7112: {'lr': 0.0004764644935808917, 'samples': 3641856, 'steps': 7112, 'batch_loss/train': 0.9074508361518383} 12/23/2021 22:04:31 - INFO - codeparrot_training - Step 7113: {'lr': 0.0004764578161578783, 'samples': 3642368, 'steps': 7113, 'batch_loss/train': 0.7770725693553686} 12/23/2021 22:04:45 - INFO - codeparrot_training - Step 7114: {'lr': 0.0004764511378345541, 'samples': 3642880, 'steps': 7114, 'batch_loss/train': 0.8988014319911599} 12/23/2021 22:04:55 - INFO - codeparrot_training - Step 7115: {'lr': 0.0004764444586109457, 'samples': 3643392, 'steps': 7115, 
'batch_loss/train': 0.7124287937767804} 12/23/2021 22:05:06 - INFO - codeparrot_training - Step 7116: {'lr': 0.00047643777848707967, 'samples': 3643904, 'steps': 7116, 'batch_loss/train': 1.0039802081882954} 12/23/2021 22:05:18 - INFO - codeparrot_training - Step 7117: {'lr': 0.00047643109746298243, 'samples': 3644416, 'steps': 7117, 'batch_loss/train': 0.8921778431395069} 12/23/2021 22:05:28 - INFO - codeparrot_training - Step 7118: {'lr': 0.0004764244155386807, 'samples': 3644928, 'steps': 7118, 'batch_loss/train': 0.8831130862236023} 12/23/2021 22:05:39 - INFO - codeparrot_training - Step 7119: {'lr': 0.000476417732714201, 'samples': 3645440, 'steps': 7119, 'batch_loss/train': 0.7204559100791812} 12/23/2021 22:05:50 - INFO - codeparrot_training - Step 7120: {'lr': 0.0004764110489895699, 'samples': 3645952, 'steps': 7120, 'batch_loss/train': 0.9098513480275869} 12/23/2021 22:06:02 - INFO - codeparrot_training - Step 7121: {'lr': 0.0004764043643648139, 'samples': 3646464, 'steps': 7121, 'batch_loss/train': 0.8997217696160078} 12/23/2021 22:06:12 - INFO - codeparrot_training - Step 7122: {'lr': 0.00047639767883995965, 'samples': 3646976, 'steps': 7122, 'batch_loss/train': 0.8732679598033428} 12/23/2021 22:06:23 - INFO - codeparrot_training - Step 7123: {'lr': 0.00047639099241503376, 'samples': 3647488, 'steps': 7123, 'batch_loss/train': 0.8913336666300893} 12/23/2021 22:06:36 - INFO - codeparrot_training - Step 7124: {'lr': 0.0004763843050900627, 'samples': 3648000, 'steps': 7124, 'batch_loss/train': 0.8023087093606591} 12/23/2021 22:06:47 - INFO - codeparrot_training - Step 7125: {'lr': 0.0004763776168650732, 'samples': 3648512, 'steps': 7125, 'batch_loss/train': 0.5725774568272755} 12/23/2021 22:06:57 - INFO - codeparrot_training - Step 7126: {'lr': 0.0004763709277400917, 'samples': 3649024, 'steps': 7126, 'batch_loss/train': 0.8983312845230103} 12/23/2021 22:07:09 - INFO - codeparrot_training - Step 7127: {'lr': 0.00047636423771514494, 'samples': 3649536, 
'steps': 7127, 'batch_loss/train': 0.8601238057017326} 12/23/2021 22:07:20 - INFO - codeparrot_training - Step 7128: {'lr': 0.0004763575467902594, 'samples': 3650048, 'steps': 7128, 'batch_loss/train': 0.9003271451219916} 12/23/2021 22:07:31 - INFO - codeparrot_training - Step 7129: {'lr': 0.0004763508549654617, 'samples': 3650560, 'steps': 7129, 'batch_loss/train': 0.8236901350319386} 12/23/2021 22:07:41 - INFO - codeparrot_training - Step 7130: {'lr': 0.0004763441622407785, 'samples': 3651072, 'steps': 7130, 'batch_loss/train': 0.9429804398678243} 12/23/2021 22:07:53 - INFO - codeparrot_training - Step 7131: {'lr': 0.0004763374686162364, 'samples': 3651584, 'steps': 7131, 'batch_loss/train': 0.7423534032423049} 12/23/2021 22:08:04 - INFO - codeparrot_training - Step 7132: {'lr': 0.00047633077409186186, 'samples': 3652096, 'steps': 7132, 'batch_loss/train': 0.9489136971533298} 12/23/2021 22:08:15 - INFO - codeparrot_training - Step 7133: {'lr': 0.00047632407866768177, 'samples': 3652608, 'steps': 7133, 'batch_loss/train': 0.7162669689860195} 12/23/2021 22:08:28 - INFO - codeparrot_training - Step 7134: {'lr': 0.0004763173823437226, 'samples': 3653120, 'steps': 7134, 'batch_loss/train': 0.8643149416893721} 12/23/2021 22:08:38 - INFO - codeparrot_training - Step 7135: {'lr': 0.0004763106851200109, 'samples': 3653632, 'steps': 7135, 'batch_loss/train': 0.9786082170903683} 12/23/2021 22:08:49 - INFO - codeparrot_training - Step 7136: {'lr': 0.00047630398699657336, 'samples': 3654144, 'steps': 7136, 'batch_loss/train': 0.830652441829443} 12/23/2021 22:09:01 - INFO - codeparrot_training - Step 7137: {'lr': 0.00047629728797343664, 'samples': 3654656, 'steps': 7137, 'batch_loss/train': 0.9153794758021832} 12/23/2021 22:09:11 - INFO - codeparrot_training - Step 7138: {'lr': 0.0004762905880506273, 'samples': 3655168, 'steps': 7138, 'batch_loss/train': 0.8882739930413663} 12/23/2021 22:09:22 - INFO - codeparrot_training - Step 7139: {'lr': 0.00047628388722817204, 'samples': 
3655680, 'steps': 7139, 'batch_loss/train': 0.7658907491713762} 12/23/2021 22:09:33 - INFO - codeparrot_training - Step 7140: {'lr': 0.00047627718550609756, 'samples': 3656192, 'steps': 7140, 'batch_loss/train': 0.732025052420795} 12/23/2021 22:09:45 - INFO - codeparrot_training - Step 7141: {'lr': 0.0004762704828844304, 'samples': 3656704, 'steps': 7141, 'batch_loss/train': 0.894548729993403} 12/23/2021 22:09:55 - INFO - codeparrot_training - Step 7142: {'lr': 0.00047626377936319723, 'samples': 3657216, 'steps': 7142, 'batch_loss/train': 0.8301494484767318} 12/23/2021 22:10:06 - INFO - codeparrot_training - Step 7143: {'lr': 0.0004762570749424247, 'samples': 3657728, 'steps': 7143, 'batch_loss/train': 0.8316847011446953} 12/23/2021 22:10:18 - INFO - codeparrot_training - Step 7144: {'lr': 0.00047625036962213954, 'samples': 3658240, 'steps': 7144, 'batch_loss/train': 0.9406375009566545} 12/23/2021 22:10:28 - INFO - codeparrot_training - Step 7145: {'lr': 0.00047624366340236825, 'samples': 3658752, 'steps': 7145, 'batch_loss/train': 0.9711662102490664} 12/23/2021 22:10:39 - INFO - codeparrot_training - Step 7146: {'lr': 0.0004762369562831376, 'samples': 3659264, 'steps': 7146, 'batch_loss/train': 0.8923220029100776} 12/23/2021 22:10:53 - INFO - codeparrot_training - Step 7147: {'lr': 0.00047623024826447426, 'samples': 3659776, 'steps': 7147, 'batch_loss/train': 0.8995286580175161} 12/23/2021 22:11:03 - INFO - codeparrot_training - Step 7148: {'lr': 0.00047622353934640484, 'samples': 3660288, 'steps': 7148, 'batch_loss/train': 0.8688778700307012} 12/23/2021 22:11:14 - INFO - codeparrot_training - Step 7149: {'lr': 0.00047621682952895606, 'samples': 3660800, 'steps': 7149, 'batch_loss/train': 0.9132317891344428} 12/23/2021 22:11:26 - INFO - codeparrot_training - Step 7150: {'lr': 0.0004762101188121546, 'samples': 3661312, 'steps': 7150, 'batch_loss/train': 0.8631434831768274} 12/23/2021 22:11:37 - INFO - codeparrot_training - Step 7151: {'lr': 0.0004762034071960271, 
'samples': 3661824, 'steps': 7151, 'batch_loss/train': 0.9237594418227673} 12/23/2021 22:11:47 - INFO - codeparrot_training - Step 7152: {'lr': 0.0004761966946806002, 'samples': 3662336, 'steps': 7152, 'batch_loss/train': 0.8708985168486834} 12/23/2021 22:11:58 - INFO - codeparrot_training - Step 7153: {'lr': 0.00047618998126590075, 'samples': 3662848, 'steps': 7153, 'batch_loss/train': 0.8869612235575914} 12/23/2021 22:12:12 - INFO - codeparrot_training - Step 7154: {'lr': 0.00047618326695195536, 'samples': 3663360, 'steps': 7154, 'batch_loss/train': 0.8105840273201466} 12/23/2021 22:12:22 - INFO - codeparrot_training - Step 7155: {'lr': 0.0004761765517387906, 'samples': 3663872, 'steps': 7155, 'batch_loss/train': 0.8454761682078242} 12/23/2021 22:12:33 - INFO - codeparrot_training - Step 7156: {'lr': 0.00047616983562643337, 'samples': 3664384, 'steps': 7156, 'batch_loss/train': 0.8192092105746269} 12/23/2021 22:12:45 - INFO - codeparrot_training - Step 7157: {'lr': 0.0004761631186149102, 'samples': 3664896, 'steps': 7157, 'batch_loss/train': 0.8437828877940774} 12/23/2021 22:12:56 - INFO - codeparrot_training - Step 7158: {'lr': 0.00047615640070424787, 'samples': 3665408, 'steps': 7158, 'batch_loss/train': 0.8436084389686584} 12/23/2021 22:13:06 - INFO - codeparrot_training - Step 7159: {'lr': 0.0004761496818944731, 'samples': 3665920, 'steps': 7159, 'batch_loss/train': 0.8661539917811751} 12/23/2021 22:13:18 - INFO - codeparrot_training - Step 7160: {'lr': 0.0004761429621856126, 'samples': 3666432, 'steps': 7160, 'batch_loss/train': 0.9259837463032454} 12/23/2021 22:13:29 - INFO - codeparrot_training - Step 7161: {'lr': 0.00047613624157769307, 'samples': 3666944, 'steps': 7161, 'batch_loss/train': 0.9372528344392776} 12/23/2021 22:13:39 - INFO - codeparrot_training - Step 7162: {'lr': 0.00047612952007074124, 'samples': 3667456, 'steps': 7162, 'batch_loss/train': 0.7784742321819067} 12/23/2021 22:13:50 - INFO - codeparrot_training - Step 7163: {'lr': 
0.00047612279766478377, 'samples': 3667968, 'steps': 7163, 'batch_loss/train': 0.8833864014595747} 12/23/2021 22:14:03 - INFO - codeparrot_training - Step 7164: {'lr': 0.0004761160743598475, 'samples': 3668480, 'steps': 7164, 'batch_loss/train': 0.8476778566837311} 12/23/2021 22:14:14 - INFO - codeparrot_training - Step 7165: {'lr': 0.00047610935015595903, 'samples': 3668992, 'steps': 7165, 'batch_loss/train': 0.9163713799789548} 12/23/2021 22:14:24 - INFO - codeparrot_training - Step 7166: {'lr': 0.0004761026250531452, 'samples': 3669504, 'steps': 7166, 'batch_loss/train': 0.7988437680760399} 12/23/2021 22:14:36 - INFO - codeparrot_training - Step 7167: {'lr': 0.0004760958990514327, 'samples': 3670016, 'steps': 7167, 'batch_loss/train': 0.8144973344169557} 12/23/2021 22:14:47 - INFO - codeparrot_training - Step 7168: {'lr': 0.0004760891721508483, 'samples': 3670528, 'steps': 7168, 'batch_loss/train': 0.8629741882905364} 12/23/2021 22:14:57 - INFO - codeparrot_training - Step 7169: {'lr': 0.0004760824443514187, 'samples': 3671040, 'steps': 7169, 'batch_loss/train': 0.8266983684152365} 12/23/2021 22:15:11 - INFO - codeparrot_training - Step 7170: {'lr': 0.00047607571565317063, 'samples': 3671552, 'steps': 7170, 'batch_loss/train': 0.9463236704468727} 12/23/2021 22:15:22 - INFO - codeparrot_training - Step 7171: {'lr': 0.0004760689860561309, 'samples': 3672064, 'steps': 7171, 'batch_loss/train': 0.7686031833291054} 12/23/2021 22:15:32 - INFO - codeparrot_training - Step 7172: {'lr': 0.0004760622555603263, 'samples': 3672576, 'steps': 7172, 'batch_loss/train': 0.8118474070215598} 12/23/2021 22:15:44 - INFO - codeparrot_training - Step 7173: {'lr': 0.00047605552416578345, 'samples': 3673088, 'steps': 7173, 'batch_loss/train': 0.8701425045728683} 12/23/2021 22:15:55 - INFO - codeparrot_training - Step 7174: {'lr': 0.00047604879187252927, 'samples': 3673600, 'steps': 7174, 'batch_loss/train': 0.8993316092528403} 12/23/2021 22:16:06 - INFO - codeparrot_training - Step 
7175: {'lr': 0.00047604205868059037, 'samples': 3674112, 'steps': 7175, 'batch_loss/train': 1.172151891514659} 12/23/2021 22:16:16 - INFO - codeparrot_training - Step 7176: {'lr': 0.00047603532458999364, 'samples': 3674624, 'steps': 7176, 'batch_loss/train': 0.8207385744899511} 12/23/2021 22:16:28 - INFO - codeparrot_training - Step 7177: {'lr': 0.00047602858960076576, 'samples': 3675136, 'steps': 7177, 'batch_loss/train': 0.8318614037707448} 12/23/2021 22:16:39 - INFO - codeparrot_training - Step 7178: {'lr': 0.0004760218537129336, 'samples': 3675648, 'steps': 7178, 'batch_loss/train': 1.1730604637414217} 12/23/2021 22:16:50 - INFO - codeparrot_training - Step 7179: {'lr': 0.0004760151169265239, 'samples': 3676160, 'steps': 7179, 'batch_loss/train': 0.8400692827999592} 12/23/2021 22:17:02 - INFO - codeparrot_training - Step 7180: {'lr': 0.00047600837924156343, 'samples': 3676672, 'steps': 7180, 'batch_loss/train': 0.8645874678622931} 12/23/2021 22:17:12 - INFO - codeparrot_training - Step 7181: {'lr': 0.0004760016406580789, 'samples': 3677184, 'steps': 7181, 'batch_loss/train': 0.8704931819811463} 12/23/2021 22:17:23 - INFO - codeparrot_training - Step 7182: {'lr': 0.00047599490117609727, 'samples': 3677696, 'steps': 7182, 'batch_loss/train': 0.8840293912217021} 12/23/2021 22:17:33 - INFO - codeparrot_training - Step 7183: {'lr': 0.00047598816079564524, 'samples': 3678208, 'steps': 7183, 'batch_loss/train': 0.6847890492063016} 12/23/2021 22:17:47 - INFO - codeparrot_training - Step 7184: {'lr': 0.00047598141951674957, 'samples': 3678720, 'steps': 7184, 'batch_loss/train': 0.7825206872075796} 12/23/2021 22:17:57 - INFO - codeparrot_training - Step 7185: {'lr': 0.0004759746773394372, 'samples': 3679232, 'steps': 7185, 'batch_loss/train': 0.874156286008656} 12/23/2021 22:18:08 - INFO - codeparrot_training - Step 7186: {'lr': 0.0004759679342637347, 'samples': 3679744, 'steps': 7186, 'batch_loss/train': 0.8914982099086046} 12/23/2021 22:18:20 - INFO - 
codeparrot_training - Step 7187: {'lr': 0.0004759611902896691, 'samples': 3680256, 'steps': 7187, 'batch_loss/train': 0.9215790024027228} 12/23/2021 22:18:30 - INFO - codeparrot_training - Step 7188: {'lr': 0.0004759544454172671, 'samples': 3680768, 'steps': 7188, 'batch_loss/train': 0.8413446652702987} 12/23/2021 22:18:41 - INFO - codeparrot_training - Step 7189: {'lr': 0.0004759476996465556, 'samples': 3681280, 'steps': 7189, 'batch_loss/train': 0.8209305707132444} 12/23/2021 22:18:53 - INFO - codeparrot_training - Step 7190: {'lr': 0.00047594095297756123, 'samples': 3681792, 'steps': 7190, 'batch_loss/train': 0.6359039228409529} 12/23/2021 22:19:04 - INFO - codeparrot_training - Step 7191: {'lr': 0.00047593420541031106, 'samples': 3682304, 'steps': 7191, 'batch_loss/train': 0.8565322319045663} 12/23/2021 22:19:14 - INFO - codeparrot_training - Step 7192: {'lr': 0.00047592745694483176, 'samples': 3682816, 'steps': 7192, 'batch_loss/train': 0.9060135236941278} 12/23/2021 22:19:25 - INFO - codeparrot_training - Step 7193: {'lr': 0.0004759207075811502, 'samples': 3683328, 'steps': 7193, 'batch_loss/train': 1.0113025628961623} 12/23/2021 22:19:38 - INFO - codeparrot_training - Step 7194: {'lr': 0.00047591395731929324, 'samples': 3683840, 'steps': 7194, 'batch_loss/train': 0.8120838310569525} 12/23/2021 22:19:49 - INFO - codeparrot_training - Step 7195: {'lr': 0.00047590720615928763, 'samples': 3684352, 'steps': 7195, 'batch_loss/train': 0.8515205564908683} 12/23/2021 22:19:59 - INFO - codeparrot_training - Step 7196: {'lr': 0.0004759004541011603, 'samples': 3684864, 'steps': 7196, 'batch_loss/train': 0.7362040920415893} 12/23/2021 22:20:11 - INFO - codeparrot_training - Step 7197: {'lr': 0.000475893701144938, 'samples': 3685376, 'steps': 7197, 'batch_loss/train': 0.8056928256992251} 12/23/2021 22:20:22 - INFO - codeparrot_training - Step 7198: {'lr': 0.00047588694729064774, 'samples': 3685888, 'steps': 7198, 'batch_loss/train': 0.8515034643933177} 12/23/2021 22:20:32 
- INFO - codeparrot_training - Step 7199: {'lr': 0.00047588019253831627, 'samples': 3686400, 'steps': 7199, 'batch_loss/train': 1.0531422086060047} 12/23/2021 22:20:46 - INFO - codeparrot_training - Step 7200: {'lr': 0.0004758734368879704, 'samples': 3686912, 'steps': 7200, 'batch_loss/train': 0.8940094448626041} 12/23/2021 22:20:57 - INFO - codeparrot_training - Step 7201: {'lr': 0.000475866680339637, 'samples': 3687424, 'steps': 7201, 'batch_loss/train': 0.9129518223926425} 12/23/2021 22:21:07 - INFO - codeparrot_training - Step 7202: {'lr': 0.000475859922893343, 'samples': 3687936, 'steps': 7202, 'batch_loss/train': 0.9280796656385064} 12/23/2021 22:21:19 - INFO - codeparrot_training - Step 7203: {'lr': 0.0004758531645491152, 'samples': 3688448, 'steps': 7203, 'batch_loss/train': 0.938473570626229} 12/23/2021 22:21:30 - INFO - codeparrot_training - Step 7204: {'lr': 0.00047584640530698056, 'samples': 3688960, 'steps': 7204, 'batch_loss/train': 0.8720258586108685} 12/23/2021 22:21:41 - INFO - codeparrot_training - Step 7205: {'lr': 0.00047583964516696584, 'samples': 3689472, 'steps': 7205, 'batch_loss/train': 0.8892642483115196} 12/23/2021 22:21:51 - INFO - codeparrot_training - Step 7206: {'lr': 0.000475832884129098, 'samples': 3689984, 'steps': 7206, 'batch_loss/train': 0.7553557874634862} 12/23/2021 22:22:03 - INFO - codeparrot_training - Step 7207: {'lr': 0.0004758261221934038, 'samples': 3690496, 'steps': 7207, 'batch_loss/train': 0.8047151919454336} 12/23/2021 22:22:14 - INFO - codeparrot_training - Step 7208: {'lr': 0.00047581935935991026, 'samples': 3691008, 'steps': 7208, 'batch_loss/train': 0.8520024251192808} 12/23/2021 22:22:24 - INFO - codeparrot_training - Step 7209: {'lr': 0.0004758125956286442, 'samples': 3691520, 'steps': 7209, 'batch_loss/train': 0.8737692488357425} 12/23/2021 22:22:38 - INFO - codeparrot_training - Step 7210: {'lr': 0.00047580583099963254, 'samples': 3692032, 'steps': 7210, 'batch_loss/train': 0.8845615470781922} 12/23/2021 
22:22:49 - INFO - codeparrot_training - Step 7211: {'lr': 0.00047579906547290217, 'samples': 3692544, 'steps': 7211, 'batch_loss/train': 0.8318544521462172} 12/23/2021 22:23:00 - INFO - codeparrot_training - Step 7212: {'lr': 0.00047579229904848, 'samples': 3693056, 'steps': 7212, 'batch_loss/train': 0.9091790029779077} 12/23/2021 22:23:12 - INFO - codeparrot_training - Step 7213: {'lr': 0.00047578553172639284, 'samples': 3693568, 'steps': 7213, 'batch_loss/train': 0.7906980393454432} 12/23/2021 22:23:22 - INFO - codeparrot_training - Step 7214: {'lr': 0.00047577876350666766, 'samples': 3694080, 'steps': 7214, 'batch_loss/train': 0.9186754077672958} 12/23/2021 22:23:33 - INFO - codeparrot_training - Step 7215: {'lr': 0.0004757719943893314, 'samples': 3694592, 'steps': 7215, 'batch_loss/train': 0.799704628996551} 12/23/2021 22:23:43 - INFO - codeparrot_training - Step 7216: {'lr': 0.00047576522437441093, 'samples': 3695104, 'steps': 7216, 'batch_loss/train': 0.8131472829263657} 12/23/2021 22:23:55 - INFO - codeparrot_training - Step 7217: {'lr': 0.00047575845346193316, 'samples': 3695616, 'steps': 7217, 'batch_loss/train': 0.9083469267934561} 12/23/2021 22:24:06 - INFO - codeparrot_training - Step 7218: {'lr': 0.00047575168165192507, 'samples': 3696128, 'steps': 7218, 'batch_loss/train': 0.7571154278703034} 12/23/2021 22:24:17 - INFO - codeparrot_training - Step 7219: {'lr': 0.00047574490894441345, 'samples': 3696640, 'steps': 7219, 'batch_loss/train': 0.8386464649811387} 12/23/2021 22:24:29 - INFO - codeparrot_training - Step 7220: {'lr': 0.00047573813533942534, 'samples': 3697152, 'steps': 7220, 'batch_loss/train': 0.9241423960775137} 12/23/2021 22:24:39 - INFO - codeparrot_training - Step 7221: {'lr': 0.00047573136083698764, 'samples': 3697664, 'steps': 7221, 'batch_loss/train': 0.8815926527604461} 12/23/2021 22:24:50 - INFO - codeparrot_training - Step 7222: {'lr': 0.0004757245854371274, 'samples': 3698176, 'steps': 7222, 'batch_loss/train': 0.826761425472796} 
12/23/2021 22:25:04 - INFO - codeparrot_training - Step 7223: {'lr': 0.0004757178091398713, 'samples': 3698688, 'steps': 7223, 'batch_loss/train': 0.7886532364645973}
12/23/2021 22:25:14 - INFO - codeparrot_training - Step 7224: {'lr': 0.00047571103194524644, 'samples': 3699200, 'steps': 7224, 'batch_loss/train': 0.9342003371566534}
12/23/2021 22:25:25 - INFO - codeparrot_training - Step 7225: {'lr': 0.00047570425385327973, 'samples': 3699712, 'steps': 7225, 'batch_loss/train': 0.8868701467290521}
12/23/2021 22:25:37 - INFO - codeparrot_training - Step 7226: {'lr': 0.0004756974748639982, 'samples': 3700224, 'steps': 7226, 'batch_loss/train': 0.8471612092107534}
12/23/2021 22:25:48 - INFO - codeparrot_training - Step 7227: {'lr': 0.00047569069497742865, 'samples': 3700736, 'steps': 7227, 'batch_loss/train': 0.7988603012636304}
12/23/2021 22:25:58 - INFO - codeparrot_training - Step 7228: {'lr': 0.0004756839141935982, 'samples': 3701248, 'steps': 7228, 'batch_loss/train': 0.8808815320953727}
12/23/2021 22:26:09 - INFO - codeparrot_training - Step 7229: {'lr': 0.00047567713251253364, 'samples': 3701760, 'steps': 7229, 'batch_loss/train': 0.8638949180021882}
12/23/2021 22:26:23 - INFO - codeparrot_training - Step 7230: {'lr': 0.00047567034993426205, 'samples': 3702272, 'steps': 7230, 'batch_loss/train': 0.8415845250710845}
12/23/2021 22:26:33 - INFO - codeparrot_training - Step 7231: {'lr': 0.0004756635664588104, 'samples': 3702784, 'steps': 7231, 'batch_loss/train': 0.8353385077789426}
12/23/2021 22:26:44 - INFO - codeparrot_training - Step 7232: {'lr': 0.00047565678208620556, 'samples': 3703296, 'steps': 7232, 'batch_loss/train': 0.8616458000615239}
12/23/2021 22:26:56 - INFO - codeparrot_training - Step 7233: {'lr': 0.0004756499968164746, 'samples': 3703808, 'steps': 7233, 'batch_loss/train': 0.8378886980935931}
12/23/2021 22:27:06 - INFO - codeparrot_training - Step 7234: {'lr': 0.00047564321064964436, 'samples': 3704320, 'steps': 7234, 'batch_loss/train': 0.8354263212531805}
12/23/2021 22:27:17 - INFO - codeparrot_training - Step 7235: {'lr': 0.000475636423585742, 'samples': 3704832, 'steps': 7235, 'batch_loss/train': 0.8735621180385351}
12/23/2021 22:27:29 - INFO - codeparrot_training - Step 7236: {'lr': 0.0004756296356247944, 'samples': 3705344, 'steps': 7236, 'batch_loss/train': 0.9141302180942148}
12/23/2021 22:27:40 - INFO - codeparrot_training - Step 7237: {'lr': 0.00047562284676682856, 'samples': 3705856, 'steps': 7237, 'batch_loss/train': 0.9488301025703549}
12/23/2021 22:27:50 - INFO - codeparrot_training - Step 7238: {'lr': 0.0004756160570118715, 'samples': 3706368, 'steps': 7238, 'batch_loss/train': 0.8907468193210661}
12/23/2021 22:28:01 - INFO - codeparrot_training - Step 7239: {'lr': 0.00047560926635995015, 'samples': 3706880, 'steps': 7239, 'batch_loss/train': 0.8251990715507418}
12/23/2021 22:28:14 - INFO - codeparrot_training - Step 7240: {'lr': 0.00047560247481109153, 'samples': 3707392, 'steps': 7240, 'batch_loss/train': 0.867845562286675}
12/23/2021 22:28:25 - INFO - codeparrot_training - Step 7241: {'lr': 0.0004755956823653227, 'samples': 3707904, 'steps': 7241, 'batch_loss/train': 0.7553672262001783}
12/23/2021 22:28:35 - INFO - codeparrot_training - Step 7242: {'lr': 0.0004755888890226706, 'samples': 3708416, 'steps': 7242, 'batch_loss/train': 0.8331583528779447}
12/23/2021 22:28:47 - INFO - codeparrot_training - Step 7243: {'lr': 0.0004755820947831623, 'samples': 3708928, 'steps': 7243, 'batch_loss/train': 0.9076865054666996}
12/23/2021 22:28:58 - INFO - codeparrot_training - Step 7244: {'lr': 0.00047557529964682466, 'samples': 3709440, 'steps': 7244, 'batch_loss/train': 0.8868251238018274}
12/23/2021 22:29:09 - INFO - codeparrot_training - Step 7245: {'lr': 0.0004755685036136849, 'samples': 3709952, 'steps': 7245, 'batch_loss/train': 0.8523754719644785}
12/23/2021 22:29:20 - INFO - codeparrot_training - Step 7246: {'lr': 0.0004755617066837699, 'samples': 3710464, 'steps': 7246, 'batch_loss/train': 0.8600508235394955}
12/23/2021 22:29:31 - INFO - codeparrot_training - Step 7247: {'lr': 0.0004755549088571067, 'samples': 3710976, 'steps': 7247, 'batch_loss/train': 0.8314569815993309}
12/23/2021 22:29:42 - INFO - codeparrot_training - Step 7248: {'lr': 0.0004755481101337224, 'samples': 3711488, 'steps': 7248, 'batch_loss/train': 0.8691320922225714}
12/23/2021 22:29:52 - INFO - codeparrot_training - Step 7249: {'lr': 0.0004755413105136439, 'samples': 3712000, 'steps': 7249, 'batch_loss/train': 0.8862196244299412}
12/23/2021 22:30:06 - INFO - codeparrot_training - Step 7250: {'lr': 0.0004755345099968984, 'samples': 3712512, 'steps': 7250, 'batch_loss/train': 0.9124447191134095}
12/23/2021 22:30:17 - INFO - codeparrot_training - Step 7251: {'lr': 0.0004755277085835128, 'samples': 3713024, 'steps': 7251, 'batch_loss/train': 0.9372757142409682}
12/23/2021 22:30:27 - INFO - codeparrot_training - Step 7252: {'lr': 0.0004755209062735142, 'samples': 3713536, 'steps': 7252, 'batch_loss/train': 0.8732603499665856}
12/23/2021 22:30:39 - INFO - codeparrot_training - Step 7253: {'lr': 0.0004755141030669297, 'samples': 3714048, 'steps': 7253, 'batch_loss/train': 1.0275812204927206}
12/23/2021 22:30:50 - INFO - codeparrot_training - Step 7254: {'lr': 0.0004755072989637862, 'samples': 3714560, 'steps': 7254, 'batch_loss/train': 0.827024769037962}
12/23/2021 22:31:01 - INFO - codeparrot_training - Step 7255: {'lr': 0.0004755004939641108, 'samples': 3715072, 'steps': 7255, 'batch_loss/train': 0.7855765447020531}
12/23/2021 22:31:13 - INFO - codeparrot_training - Step 7256: {'lr': 0.0004754936880679306, 'samples': 3715584, 'steps': 7256, 'batch_loss/train': 0.8148875897750258}
12/23/2021 22:31:23 - INFO - codeparrot_training - Step 7257: {'lr': 0.00047548688127527265, 'samples': 3716096, 'steps': 7257, 'batch_loss/train': 0.7509629909764044}
12/23/2021 22:31:34 - INFO - codeparrot_training - Step 7258: {'lr': 0.00047548007358616404, 'samples': 3716608, 'steps': 7258, 'batch_loss/train': 0.8803666736930609}
12/23/2021 22:31:48 - INFO - codeparrot_training - Step 7259: {'lr': 0.0004754732650006317, 'samples': 3717120, 'steps': 7259, 'batch_loss/train': 0.9197321804240346}
12/23/2021 22:31:58 - INFO - codeparrot_training - Step 7260: {'lr': 0.0004754664555187029, 'samples': 3717632, 'steps': 7260, 'batch_loss/train': 0.8115100637078285}
12/23/2021 22:32:09 - INFO - codeparrot_training - Step 7261: {'lr': 0.0004754596451404045, 'samples': 3718144, 'steps': 7261, 'batch_loss/train': 0.8382074618712068}
12/23/2021 22:32:20 - INFO - codeparrot_training - Step 7262: {'lr': 0.0004754528338657638, 'samples': 3718656, 'steps': 7262, 'batch_loss/train': 0.7726742569357157}
12/23/2021 22:32:32 - INFO - codeparrot_training - Step 7263: {'lr': 0.00047544602169480765, 'samples': 3719168, 'steps': 7263, 'batch_loss/train': 0.8978134477511048}
12/23/2021 22:32:42 - INFO - codeparrot_training - Step 7264: {'lr': 0.00047543920862756333, 'samples': 3719680, 'steps': 7264, 'batch_loss/train': 0.853446327149868}
12/23/2021 22:32:53 - INFO - codeparrot_training - Step 7265: {'lr': 0.00047543239466405786, 'samples': 3720192, 'steps': 7265, 'batch_loss/train': 0.7265933325979859}
12/23/2021 22:33:05 - INFO - codeparrot_training - Step 7266: {'lr': 0.00047542557980431824, 'samples': 3720704, 'steps': 7266, 'batch_loss/train': 0.8611368429847062}
12/23/2021 22:33:15 - INFO - codeparrot_training - Step 7267: {'lr': 0.0004754187640483716, 'samples': 3721216, 'steps': 7267, 'batch_loss/train': 0.7807685635052621}
12/23/2021 22:33:26 - INFO - codeparrot_training - Step 7268: {'lr': 0.00047541194739624513, 'samples': 3721728, 'steps': 7268, 'batch_loss/train': 0.8692255239002407}
12/23/2021 22:33:40 - INFO - codeparrot_training - Step 7269: {'lr': 0.00047540512984796585, 'samples': 3722240, 'steps': 7269, 'batch_loss/train': 0.7750208945944905}
12/23/2021 22:33:50 - INFO - codeparrot_training - Step 7270: {'lr': 0.0004753983114035609, 'samples': 3722752, 'steps': 7270, 'batch_loss/train': 0.8954297536984086}
12/23/2021 22:34:01 - INFO - codeparrot_training - Step 7271: {'lr': 0.0004753914920630573, 'samples': 3723264, 'steps': 7271, 'batch_loss/train': 0.873485129326582}
12/23/2021 22:34:12 - INFO - codeparrot_training - Step 7272: {'lr': 0.00047538467182648233, 'samples': 3723776, 'steps': 7272, 'batch_loss/train': 0.8681098688393831}
12/23/2021 22:34:24 - INFO - codeparrot_training - Step 7273: {'lr': 0.000475377850693863, 'samples': 3724288, 'steps': 7273, 'batch_loss/train': 0.7788551856065169}
12/23/2021 22:34:34 - INFO - codeparrot_training - Step 7274: {'lr': 0.0004753710286652264, 'samples': 3724800, 'steps': 7274, 'batch_loss/train': 0.8729011369869113}
12/23/2021 22:34:45 - INFO - codeparrot_training - Step 7275: {'lr': 0.00047536420574059967, 'samples': 3725312, 'steps': 7275, 'batch_loss/train': 0.8426068043336272}
12/23/2021 22:34:57 - INFO - codeparrot_training - Step 7276: {'lr': 0.00047535738192001, 'samples': 3725824, 'steps': 7276, 'batch_loss/train': 0.784079541452229}
12/23/2021 22:35:07 - INFO - codeparrot_training - Step 7277: {'lr': 0.0004753505572034844, 'samples': 3726336, 'steps': 7277, 'batch_loss/train': 0.9100342178717256}
12/23/2021 22:35:18 - INFO - codeparrot_training - Step 7278: {'lr': 0.00047534373159105014, 'samples': 3726848, 'steps': 7278, 'batch_loss/train': 0.8450402459129691}
12/23/2021 22:35:32 - INFO - codeparrot_training - Step 7279: {'lr': 0.00047533690508273427, 'samples': 3727360, 'steps': 7279, 'batch_loss/train': 0.8396212072111666}
12/23/2021 22:35:42 - INFO - codeparrot_training - Step 7280: {'lr': 0.0004753300776785639, 'samples': 3727872, 'steps': 7280, 'batch_loss/train': 0.8835293436422944}
12/23/2021 22:35:53 - INFO - codeparrot_training - Step 7281: {'lr': 0.0004753232493785663, 'samples': 3728384, 'steps': 7281, 'batch_loss/train': 0.9276621453464031}
12/23/2021 22:36:03 - INFO - codeparrot_training - Step 7282: {'lr': 0.00047531642018276845, 'samples': 3728896, 'steps': 7282, 'batch_loss/train': 0.8553156089037657}
12/23/2021 22:36:15 - INFO - codeparrot_training - Step 7283: {'lr': 0.0004753095900911977, 'samples': 3729408, 'steps': 7283, 'batch_loss/train': 0.798710091970861}
12/23/2021 22:36:26 - INFO - codeparrot_training - Step 7284: {'lr': 0.000475302759103881, 'samples': 3729920, 'steps': 7284, 'batch_loss/train': 0.9327971581369638}
12/23/2021 22:36:37 - INFO - codeparrot_training - Step 7285: {'lr': 0.0004752959272208457, 'samples': 3730432, 'steps': 7285, 'batch_loss/train': 0.772613943554461}
12/23/2021 22:36:51 - INFO - codeparrot_training - Step 7286: {'lr': 0.0004752890944421188, 'samples': 3730944, 'steps': 7286, 'batch_loss/train': 0.7839570548385382}
12/23/2021 22:37:01 - INFO - codeparrot_training - Step 7287: {'lr': 0.0004752822607677275, 'samples': 3731456, 'steps': 7287, 'batch_loss/train': 0.8701863288879395}
12/23/2021 22:37:12 - INFO - codeparrot_training - Step 7288: {'lr': 0.00047527542619769903, 'samples': 3731968, 'steps': 7288, 'batch_loss/train': 0.863541450118646}
12/23/2021 22:37:24 - INFO - codeparrot_training - Step 7289: {'lr': 0.00047526859073206053, 'samples': 3732480, 'steps': 7289, 'batch_loss/train': 1.0777589045464993}
12/23/2021 22:37:34 - INFO - codeparrot_training - Step 7290: {'lr': 0.0004752617543708391, 'samples': 3732992, 'steps': 7290, 'batch_loss/train': 0.7905243104323745}
12/23/2021 22:37:45 - INFO - codeparrot_training - Step 7291: {'lr': 0.0004752549171140621, 'samples': 3733504, 'steps': 7291, 'batch_loss/train': 0.855791718699038}
12/23/2021 22:37:56 - INFO - codeparrot_training - Step 7292: {'lr': 0.00047524807896175646, 'samples': 3734016, 'steps': 7292, 'batch_loss/train': 0.8326421505771577}
12/23/2021 22:38:08 - INFO - codeparrot_training - Step 7293: {'lr': 0.00047524123991394965, 'samples': 3734528, 'steps': 7293, 'batch_loss/train': 0.8702390557155013}
12/23/2021 22:38:18 - INFO - codeparrot_training - Step 7294: {'lr': 0.0004752343999706686, 'samples': 3735040, 'steps': 7294, 'batch_loss/train': 0.8785776263102889}
12/23/2021 22:38:29 - INFO - codeparrot_training - Step 7295: {'lr': 0.0004752275591319408, 'samples': 3735552, 'steps': 7295, 'batch_loss/train': 0.7786138607189059}
12/23/2021 22:38:41 - INFO - codeparrot_training - Step 7296: {'lr': 0.0004752207173977932, 'samples': 3736064, 'steps': 7296, 'batch_loss/train': 0.8927205875515938}
12/23/2021 22:38:51 - INFO - codeparrot_training - Step 7297: {'lr': 0.00047521387476825294, 'samples': 3736576, 'steps': 7297, 'batch_loss/train': 0.8611820712685585}
12/23/2021 22:39:02 - INFO - codeparrot_training - Step 7298: {'lr': 0.00047520703124334743, 'samples': 3737088, 'steps': 7298, 'batch_loss/train': 0.8873681016266346}
12/23/2021 22:39:15 - INFO - codeparrot_training - Step 7299: {'lr': 0.00047520018682310387, 'samples': 3737600, 'steps': 7299, 'batch_loss/train': 0.815628195181489}
12/23/2021 22:39:26 - INFO - codeparrot_training - Step 7300: {'lr': 0.0004751933415075493, 'samples': 3738112, 'steps': 7300, 'batch_loss/train': 0.8931346135213971}
12/23/2021 22:39:36 - INFO - codeparrot_training - Step 7301: {'lr': 0.00047518649529671103, 'samples': 3738624, 'steps': 7301, 'batch_loss/train': 0.9324156614020467}
12/23/2021 22:39:48 - INFO - codeparrot_training - Step 7302: {'lr': 0.0004751796481906163, 'samples': 3739136, 'steps': 7302, 'batch_loss/train': 0.7534533790312707}
12/23/2021 22:39:59 - INFO - codeparrot_training - Step 7303: {'lr': 0.0004751728001892924, 'samples': 3739648, 'steps': 7303, 'batch_loss/train': 0.8913266519084573}
12/23/2021 22:40:09 - INFO - codeparrot_training - Step 7304: {'lr': 0.0004751659512927664, 'samples': 3740160, 'steps': 7304, 'batch_loss/train': 0.8806086229160428}
12/23/2021 22:40:20 - INFO - codeparrot_training - Step 7305: {'lr': 0.00047515910150106554, 'samples': 3740672, 'steps': 7305, 'batch_loss/train': 0.8239913210272789}
12/23/2021 22:40:32 - INFO - codeparrot_training - Step 7306: {'lr': 0.0004751522508142172, 'samples': 3741184, 'steps': 7306, 'batch_loss/train': 0.8270023395307362}
12/23/2021 22:40:42 - INFO - codeparrot_training - Step 7307: {'lr': 0.00047514539923224856, 'samples': 3741696, 'steps': 7307, 'batch_loss/train': 0.7447771745501086}
12/23/2021 22:40:53 - INFO - codeparrot_training - Step 7308: {'lr': 0.0004751385467551867, 'samples': 3742208, 'steps': 7308, 'batch_loss/train': 1.0470271725207567}
12/23/2021 22:41:07 - INFO - codeparrot_training - Step 7309: {'lr': 0.0004751316933830591, 'samples': 3742720, 'steps': 7309, 'batch_loss/train': 0.9937335019931197}
12/23/2021 22:41:18 - INFO - codeparrot_training - Step 7310: {'lr': 0.00047512483911589276, 'samples': 3743232, 'steps': 7310, 'batch_loss/train': 0.8160195662640035}
12/23/2021 22:41:28 - INFO - codeparrot_training - Step 7311: {'lr': 0.0004751179839537152, 'samples': 3743744, 'steps': 7311, 'batch_loss/train': 0.8612446011975408}
12/23/2021 22:41:39 - INFO - codeparrot_training - Step 7312: {'lr': 0.00047511112789655344, 'samples': 3744256, 'steps': 7312, 'batch_loss/train': 0.8350908271968365}
12/23/2021 22:41:51 - INFO - codeparrot_training - Step 7313: {'lr': 0.00047510427094443487, 'samples': 3744768, 'steps': 7313, 'batch_loss/train': 0.749801623634994}
12/23/2021 22:42:01 - INFO - codeparrot_training - Step 7314: {'lr': 0.00047509741309738676, 'samples': 3745280, 'steps': 7314, 'batch_loss/train': 0.8744569020345807}
12/23/2021 22:42:12 - INFO - codeparrot_training - Step 7315: {'lr': 0.00047509055435543624, 'samples': 3745792, 'steps': 7315, 'batch_loss/train': 0.8569806087762117}
12/23/2021 22:42:26 - INFO - codeparrot_training - Step 7316: {'lr': 0.00047508369471861066, 'samples': 3746304, 'steps': 7316, 'batch_loss/train': 0.8492662720382214}
12/23/2021 22:42:36 - INFO - codeparrot_training - Step 7317: {'lr': 0.00047507683418693736, 'samples': 3746816, 'steps': 7317, 'batch_loss/train': 0.8124135341495275}
12/23/2021 22:42:47 - INFO - codeparrot_training - Step 7318: {'lr': 0.0004750699727604436, 'samples': 3747328, 'steps': 7318, 'batch_loss/train': 0.8426740779541433}
12/23/2021 22:42:59 - INFO - codeparrot_training - Step 7319: {'lr': 0.00047506311043915653, 'samples': 3747840, 'steps': 7319, 'batch_loss/train': 0.837029417976737}
12/23/2021 22:43:09 - INFO - codeparrot_training - Step 7320: {'lr': 0.00047505624722310355, 'samples': 3748352, 'steps': 7320, 'batch_loss/train': 0.8152125794440508}
12/23/2021 22:43:20 - INFO - codeparrot_training - Step 7321: {'lr': 0.0004750493831123119, 'samples': 3748864, 'steps': 7321, 'batch_loss/train': 0.8973736839834601}
12/23/2021 22:43:32 - INFO - codeparrot_training - Step 7322: {'lr': 0.0004750425181068089, 'samples': 3749376, 'steps': 7322, 'batch_loss/train': 0.8462751721963286}
12/23/2021 22:43:43 - INFO - codeparrot_training - Step 7323: {'lr': 0.0004750356522066218, 'samples': 3749888, 'steps': 7323, 'batch_loss/train': 0.8495282789226621}
12/23/2021 22:43:53 - INFO - codeparrot_training - Step 7324: {'lr': 0.00047502878541177783, 'samples': 3750400, 'steps': 7324, 'batch_loss/train': 0.8889597551897168}
12/23/2021 22:44:04 - INFO - codeparrot_training - Step 7325: {'lr': 0.0004750219177223045, 'samples': 3750912, 'steps': 7325, 'batch_loss/train': 0.9577281167730689}
12/23/2021 22:44:18 - INFO - codeparrot_training - Step 7326: {'lr': 0.000475015049138229, 'samples': 3751424, 'steps': 7326, 'batch_loss/train': 0.8170597348362207}
12/23/2021 22:44:29 - INFO - codeparrot_training - Step 7327: {'lr': 0.0004750081796595785, 'samples': 3751936, 'steps': 7327, 'batch_loss/train': 0.6163395744515583}
12/23/2021 22:44:39 - INFO - codeparrot_training - Step 7328: {'lr': 0.00047500130928638054, 'samples': 3752448, 'steps': 7328, 'batch_loss/train': 0.6613188716582954}
12/23/2021 22:44:51 - INFO - codeparrot_training - Step 7329: {'lr': 0.0004749944380186623, 'samples': 3752960, 'steps': 7329, 'batch_loss/train': 0.8462913623079658}
12/23/2021 22:45:02 - INFO - codeparrot_training - Step 7330: {'lr': 0.0004749875658564511, 'samples': 3753472, 'steps': 7330, 'batch_loss/train': 0.7175891302758828}
12/23/2021 22:45:12 - INFO - codeparrot_training - Step 7331: {'lr': 0.0004749806927997744, 'samples': 3753984, 'steps': 7331, 'batch_loss/train': 0.85547827440314}
12/23/2021 22:45:23 - INFO - codeparrot_training - Step 7332: {'lr': 0.0004749738188486593, 'samples': 3754496, 'steps': 7332, 'batch_loss/train': 0.861702342517674}
12/23/2021 22:45:35 - INFO - codeparrot_training - Step 7333: {'lr': 0.0004749669440031333, 'samples': 3755008, 'steps': 7333, 'batch_loss/train': 0.9068438038229942}
12/23/2021 22:45:46 - INFO - codeparrot_training - Step 7334: {'lr': 0.0004749600682632237, 'samples': 3755520, 'steps': 7334, 'batch_loss/train': 0.8125765137374401}
12/23/2021 22:45:56 - INFO - codeparrot_training - Step 7335: {'lr': 0.00047495319162895773, 'samples': 3756032, 'steps': 7335, 'batch_loss/train': 0.8958035344257951}
12/23/2021 22:46:10 - INFO - codeparrot_training - Step 7336: {'lr': 0.0004749463141003628, 'samples': 3756544, 'steps': 7336, 'batch_loss/train': 0.8079745285212994}
12/23/2021 22:46:21 - INFO - codeparrot_training - Step 7337: {'lr': 0.00047493943567746634, 'samples': 3757056, 'steps': 7337, 'batch_loss/train': 0.8728948291391134}
12/23/2021 22:46:31 - INFO - codeparrot_training - Step 7338: {'lr': 0.00047493255636029563, 'samples': 3757568, 'steps': 7338, 'batch_loss/train': 0.7914179407525808}
12/23/2021 22:46:44 - INFO - codeparrot_training - Step 7339: {'lr': 0.00047492567614887795, 'samples': 3758080, 'steps': 7339, 'batch_loss/train': 0.8991232896223664}
12/23/2021 22:46:54 - INFO - codeparrot_training - Step 7340: {'lr': 0.0004749187950432407, 'samples': 3758592, 'steps': 7340, 'batch_loss/train': 0.8722152232658118}
12/23/2021 22:47:05 - INFO - codeparrot_training - Step 7341: {'lr': 0.0004749119130434113, 'samples': 3759104, 'steps': 7341, 'batch_loss/train': 0.8289101738482714}
12/23/2021 22:47:17 - INFO - codeparrot_training - Step 7342: {'lr': 0.000474905030149417, 'samples': 3759616, 'steps': 7342, 'batch_loss/train': 0.7520562279969454}
12/23/2021 22:47:28 - INFO - codeparrot_training - Step 7343: {'lr': 0.00047489814636128525, 'samples': 3760128, 'steps': 7343, 'batch_loss/train': 0.7980201272293925}
12/23/2021 22:47:38 - INFO - codeparrot_training - Step 7344: {'lr': 0.0004748912616790434, 'samples': 3760640, 'steps': 7344, 'batch_loss/train': 0.8371398281306028}
12/23/2021 22:47:49 - INFO - codeparrot_training - Step 7345: {'lr': 0.0004748843761027188, 'samples': 3761152, 'steps': 7345, 'batch_loss/train': 0.9171497728675604}
12/23/2021 22:48:03 - INFO - codeparrot_training - Step 7346: {'lr': 0.0004748774896323388, 'samples': 3761664, 'steps': 7346, 'batch_loss/train': 0.8294218447990716}
12/23/2021 22:48:13 - INFO - codeparrot_training - Step 7347: {'lr': 0.0004748706022679309, 'samples': 3762176, 'steps': 7347, 'batch_loss/train': 0.9093356309458613}
12/23/2021 22:48:24 - INFO - codeparrot_training - Step 7348: {'lr': 0.0004748637140095223, 'samples': 3762688, 'steps': 7348, 'batch_loss/train': 0.8720782767049968}
12/23/2021 22:48:36 - INFO - codeparrot_training - Step 7349: {'lr': 0.00047485682485714055, 'samples': 3763200, 'steps': 7349, 'batch_loss/train': 0.7351636462844908}
12/23/2021 22:48:46 - INFO - codeparrot_training - Step 7350: {'lr': 0.0004748499348108129, 'samples': 3763712, 'steps': 7350, 'batch_loss/train': 0.7433497526217252}
12/23/2021 22:48:57 - INFO - codeparrot_training - Step 7351: {'lr': 0.00047484304387056687, 'samples': 3764224, 'steps': 7351, 'batch_loss/train': 0.9673277325928211}
12/23/2021 22:49:09 - INFO - codeparrot_training - Step 7352: {'lr': 0.0004748361520364297, 'samples': 3764736, 'steps': 7352, 'batch_loss/train': 0.8353835688903928}
12/23/2021 22:49:20 - INFO - codeparrot_training - Step 7353: {'lr': 0.000474829259308429, 'samples': 3765248, 'steps': 7353, 'batch_loss/train': 0.813460536301136}
12/23/2021 22:49:30 - INFO - codeparrot_training - Step 7354: {'lr': 0.00047482236568659197, 'samples': 3765760, 'steps': 7354, 'batch_loss/train': 0.8580814236775041}
12/23/2021 22:49:41 - INFO - codeparrot_training - Step 7355: {'lr': 0.00047481547117094613, 'samples': 3766272, 'steps': 7355, 'batch_loss/train': 0.8070556549355388}
12/23/2021 22:49:55 - INFO - codeparrot_training - Step 7356: {'lr': 0.00047480857576151893, 'samples': 3766784, 'steps': 7356, 'batch_loss/train': 0.8729828046634793}
12/23/2021 22:50:05 - INFO - codeparrot_training - Step 7357: {'lr': 0.00047480167945833764, 'samples': 3767296, 'steps': 7357, 'batch_loss/train': 0.8907235842198133}
12/23/2021 22:50:16 - INFO - codeparrot_training - Step 7358: {'lr': 0.0004747947822614298, 'samples': 3767808, 'steps': 7358, 'batch_loss/train': 0.9513157475739717}
12/23/2021 22:50:28 - INFO - codeparrot_training - Step 7359: {'lr': 0.00047478788417082277, 'samples': 3768320, 'steps': 7359, 'batch_loss/train': 0.873375347815454}
12/23/2021 22:50:39 - INFO - codeparrot_training - Step 7360: {'lr': 0.00047478098518654395, 'samples': 3768832, 'steps': 7360, 'batch_loss/train': 0.8406261480413377}
12/23/2021 22:50:49 - INFO - codeparrot_training - Step 7361: {'lr': 0.0004747740853086208, 'samples': 3769344, 'steps': 7361, 'batch_loss/train': 0.8101363596506417}
12/23/2021 22:51:02 - INFO - codeparrot_training - Step 7362: {'lr': 0.0004747671845370808, 'samples': 3769856, 'steps': 7362, 'batch_loss/train': 0.8539978386834264}
12/23/2021 22:51:13 - INFO - codeparrot_training - Step 7363: {'lr': 0.0004747602828719514, 'samples': 3770368, 'steps': 7363, 'batch_loss/train': 0.8443191330879927}
12/23/2021 22:51:23 - INFO - codeparrot_training - Step 7364: {'lr': 0.0004747533803132599, 'samples': 3770880, 'steps': 7364, 'batch_loss/train': 0.8802364291623235}
12/23/2021 22:51:34 - INFO - codeparrot_training - Step 7365: {'lr': 0.0004747464768610338, 'samples': 3771392, 'steps': 7365, 'batch_loss/train': 0.9542759573087096}
12/23/2021 22:51:46 - INFO - codeparrot_training - Step 7366: {'lr': 0.00047473957251530064, 'samples': 3771904, 'steps': 7366, 'batch_loss/train': 0.8764359252527356}
12/23/2021 22:51:57 - INFO - codeparrot_training - Step 7367: {'lr': 0.00047473266727608775, 'samples': 3772416, 'steps': 7367, 'batch_loss/train': 0.9336402269545943}
12/23/2021 22:52:07 - INFO - codeparrot_training - Step 7368: {'lr': 0.00047472576114342266, 'samples': 3772928, 'steps': 7368, 'batch_loss/train': 0.7198241604492068}
12/23/2021 22:52:20 - INFO - codeparrot_training - Step 7369: {'lr': 0.0004747188541173328, 'samples': 3773440, 'steps': 7369, 'batch_loss/train': 0.970033866353333}
12/23/2021 22:52:30 - INFO - codeparrot_training - Step 7370: {'lr': 0.0004747119461978456, 'samples': 3773952, 'steps': 7370, 'batch_loss/train': 0.9685698198154569}
12/23/2021 22:52:41 - INFO - codeparrot_training - Step 7371: {'lr': 0.00047470503738498857, 'samples': 3774464, 'steps': 7371, 'batch_loss/train': 0.8583505055867136}
12/23/2021 22:52:53 - INFO - codeparrot_training - Step 7372: {'lr': 0.00047469812767878916, 'samples': 3774976, 'steps': 7372, 'batch_loss/train': 0.862472883425653}
12/23/2021 22:53:04 - INFO - codeparrot_training - Step 7373: {'lr': 0.0004746912170792748, 'samples': 3775488, 'steps': 7373, 'batch_loss/train': 0.91508549451828}
12/23/2021 22:53:14 - INFO - codeparrot_training - Step 7374: {'lr': 0.000474684305586473, 'samples': 3776000, 'steps': 7374, 'batch_loss/train': 0.9404789693653584}
12/23/2021 22:53:25 - INFO - codeparrot_training - Step 7375: {'lr': 0.0004746773932004113, 'samples': 3776512, 'steps': 7375, 'batch_loss/train': 0.9061777601018548}
12/23/2021 22:53:38 - INFO - codeparrot_training - Step 7376: {'lr': 0.0004746704799211171, 'samples': 3777024, 'steps': 7376, 'batch_loss/train': 0.7811345458030701}
12/23/2021 22:53:48 - INFO - codeparrot_training - Step 7377: {'lr': 0.00047466356574861793, 'samples': 3777536, 'steps': 7377, 'batch_loss/train': 0.8822394805029035}
12/23/2021 22:53:59 - INFO - codeparrot_training - Step 7378: {'lr': 0.0004746566506829412, 'samples': 3778048, 'steps': 7378, 'batch_loss/train': 0.8272259030491114}
12/23/2021 22:54:11 - INFO - codeparrot_training - Step 7379: {'lr': 0.00047464973472411445, 'samples': 3778560, 'steps': 7379, 'batch_loss/train': 0.9604299340862781}
12/23/2021 22:54:22 - INFO - codeparrot_training - Step 7380: {'lr': 0.0004746428178721652, 'samples': 3779072, 'steps': 7380, 'batch_loss/train': 0.8207512549124658}
12/23/2021 22:54:33 - INFO - codeparrot_training - Step 7381: {'lr': 0.0004746359001271209, 'samples': 3779584, 'steps': 7381, 'batch_loss/train': 0.8157329778186977}
12/23/2021 22:54:45 - INFO - codeparrot_training - Step 7382: {'lr': 0.00047462898148900915, 'samples': 3780096, 'steps': 7382, 'batch_loss/train': 0.8143916169647127}
12/23/2021 22:54:55 - INFO - codeparrot_training - Step 7383: {'lr': 0.0004746220619578573, 'samples': 3780608, 'steps': 7383, 'batch_loss/train': 0.7699949784437194}
12/23/2021 22:55:06 - INFO - codeparrot_training - Step 7384: {'lr': 0.00047461514153369303, 'samples': 3781120, 'steps': 7384, 'batch_loss/train': 0.743561731884256}
12/23/2021 22:55:19 - INFO - codeparrot_training - Step 7385: {'lr': 0.0004746082202165437, 'samples': 3781632, 'steps': 7385, 'batch_loss/train': 0.8535814071074128}
12/23/2021 22:55:30 - INFO - codeparrot_training - Step 7386: {'lr': 0.0004746012980064369, 'samples': 3782144, 'steps': 7386, 'batch_loss/train': 0.8486644122749567}
12/23/2021 22:55:41 - INFO - codeparrot_training - Step 7387: {'lr': 0.00047459437490340016, 'samples': 3782656, 'steps': 7387, 'batch_loss/train': 0.8407515708822757}
12/23/2021 22:55:51 - INFO - codeparrot_training - Step 7388: {'lr': 0.000474587450907461, 'samples': 3783168, 'steps': 7388, 'batch_loss/train': 0.8715252494439483}
12/23/2021 22:56:04 - INFO - codeparrot_training - Step 7389: {'lr': 0.00047458052601864693, 'samples': 3783680, 'steps': 7389, 'batch_loss/train': 0.9482160545885563}
12/23/2021 22:56:14 - INFO - codeparrot_training - Step 7390: {'lr': 0.0004745736002369855, 'samples': 3784192, 'steps': 7390, 'batch_loss/train': 0.8773426171392202}
12/23/2021 22:56:25 - INFO - codeparrot_training - Step 7391: {'lr': 0.00047456667356250416, 'samples': 3784704, 'steps': 7391, 'batch_loss/train': 0.7511476268991828}
12/23/2021 22:56:37 - INFO - codeparrot_training - Step 7392: {'lr': 0.0004745597459952306, 'samples': 3785216, 'steps': 7392, 'batch_loss/train': 0.7284593405202031}
12/23/2021 22:56:47 - INFO - codeparrot_training - Step 7393: {'lr': 0.0004745528175351923, 'samples': 3785728, 'steps': 7393, 'batch_loss/train': 0.8732496672309935}
12/23/2021 22:56:58 - INFO - codeparrot_training - Step 7394: {'lr': 0.0004745458881824167, 'samples': 3786240, 'steps': 7394, 'batch_loss/train': 0.7921279240399599}
12/23/2021 22:57:11 - INFO - codeparrot_training - Step 7395: {'lr': 0.0004745389579369315, 'samples': 3786752, 'steps': 7395, 'batch_loss/train': 0.9539952492341399}
12/23/2021 22:57:22 - INFO - codeparrot_training - Step 7396: {'lr': 0.00047453202679876417, 'samples': 3787264, 'steps': 7396, 'batch_loss/train': 0.9378670882433653}
12/23/2021 22:57:33 - INFO - codeparrot_training - Step 7397: {'lr': 0.00047452509476794225, 'samples': 3787776, 'steps': 7397, 'batch_loss/train': 0.8647222593426704}
12/23/2021 22:57:45 - INFO - codeparrot_training - Step 7398: {'lr': 0.00047451816184449337, 'samples': 3788288, 'steps': 7398, 'batch_loss/train': 0.7493531173095107}
12/23/2021 22:57:55 - INFO - codeparrot_training - Step 7399: {'lr': 0.00047451122802844503, 'samples': 3788800, 'steps': 7399, 'batch_loss/train': 0.7978656971827149}
12/23/2021 22:58:06 - INFO - codeparrot_training - Step 7400: {'lr': 0.0004745042933198248, 'samples': 3789312, 'steps': 7400, 'batch_loss/train': 0.8925178768113256}
12/23/2021 22:58:17 - INFO - codeparrot_training - Step 7401: {'lr': 0.0004744973577186603, 'samples': 3789824, 'steps': 7401, 'batch_loss/train': 0.8168132922146469}
12/23/2021 22:58:30 - INFO - codeparrot_training - Step 7402: {'lr': 0.0004744904212249791, 'samples': 3790336, 'steps': 7402, 'batch_loss/train': 0.8160746006760746}
12/23/2021 22:58:41 - INFO - codeparrot_training - Step 7403: {'lr': 0.0004744834838388087, 'samples': 3790848, 'steps': 7403, 'batch_loss/train': 1.204260278493166}
12/23/2021 22:58:51 - INFO - codeparrot_training - Step 7404: {'lr': 0.0004744765455601767, 'samples': 3791360, 'steps': 7404, 'batch_loss/train': 0.8363184873014688}
12/23/2021 22:59:03 - INFO - codeparrot_training - Step 7405: {'lr': 0.00047446960638911084, 'samples': 3791872, 'steps': 7405, 'batch_loss/train': 0.711411414318718}
12/23/2021 22:59:14 - INFO - codeparrot_training - Step 7406: {'lr': 0.00047446266632563853, 'samples': 3792384, 'steps': 7406, 'batch_loss/train': 0.8252337276935577}
12/23/2021 22:59:24 - INFO - codeparrot_training - Step 7407: {'lr': 0.00047445572536978733, 'samples': 3792896, 'steps': 7407, 'batch_loss/train': 0.8637401573359966}
12/23/2021 22:59:36 - INFO - codeparrot_training - Step 7408: {'lr': 0.000474448783521585, 'samples': 3793408, 'steps': 7408, 'batch_loss/train': 0.9115225546993315}
12/23/2021 22:59:47 - INFO - codeparrot_training - Step 7409: {'lr': 0.000474441840781059, 'samples': 3793920, 'steps': 7409, 'batch_loss/train': 0.8117505181580782}
12/23/2021 22:59:58 - INFO - codeparrot_training - Step 7410: {'lr': 0.0004744348971482371, 'samples': 3794432, 'steps': 7410, 'batch_loss/train': 0.754987636115402}
12/23/2021 23:00:08 - INFO - codeparrot_training - Step 7411: {'lr': 0.00047442795262314663, 'samples': 3794944, 'steps': 7411, 'batch_loss/train': 0.7899832371622324}
12/23/2021 23:00:20 - INFO - codeparrot_training - Step 7412: {'lr': 0.0004744210072058154, 'samples': 3795456, 'steps': 7412, 'batch_loss/train': 0.84541339520365}
12/23/2021 23:00:31 - INFO - codeparrot_training - Step 7413: {'lr': 0.00047441406089627103, 'samples': 3795968, 'steps': 7413, 'batch_loss/train': 0.8436379255726933}
12/23/2021 23:00:42 - INFO - codeparrot_training - Step 7414: {'lr': 0.0004744071136945411, 'samples': 3796480, 'steps': 7414, 'batch_loss/train': 0.9956261776387691}
12/23/2021 23:00:55 - INFO - codeparrot_training - Step 7415: {'lr': 0.00047440016560065324, 'samples': 3796992, 'steps': 7415, 'batch_loss/train': 0.7731343805789948}
12/23/2021 23:01:05 - INFO - codeparrot_training - Step 7416: {'lr': 0.00047439321661463496, 'samples': 3797504, 'steps': 7416, 'batch_loss/train': 0.8430437203496695}
12/23/2021 23:01:16 - INFO - codeparrot_training - Step 7417: {'lr': 0.0004743862667365141, 'samples': 3798016, 'steps': 7417, 'batch_loss/train': 0.792008189484477}
12/23/2021 23:01:28 - INFO - codeparrot_training - Step 7418: {'lr': 0.0004743793159663181, 'samples': 3798528, 'steps': 7418, 'batch_loss/train': 0.8757421653717756}
12/23/2021 23:01:38 - INFO - codeparrot_training - Step 7419: {'lr': 0.0004743723643040746, 'samples': 3799040, 'steps': 7419, 'batch_loss/train': 0.829989917576313}
12/23/2021 23:01:49 - INFO - codeparrot_training - Step 7420: {'lr': 0.0004743654117498114, 'samples': 3799552, 'steps': 7420, 'batch_loss/train': 0.755330110900104}
12/23/2021 23:02:00 - INFO - codeparrot_training - Step 7421: {'lr': 0.00047435845830355594, 'samples': 3800064, 'steps': 7421, 'batch_loss/train': 0.8273093365132809}
12/23/2021 23:02:12 - INFO - codeparrot_training - Step 7422: {'lr': 0.0004743515039653361, 'samples': 3800576, 'steps': 7422, 'batch_loss/train': 0.829190026037395}
12/23/2021 23:02:22 - INFO - codeparrot_training - Step 7423: {'lr': 0.0004743445487351793, 'samples': 3801088, 'steps': 7423, 'batch_loss/train': 0.7878340044990182}
12/23/2021 23:02:33 - INFO - codeparrot_training - Step 7424: {'lr': 0.0004743375926131133, 'samples': 3801600, 'steps': 7424, 'batch_loss/train': 0.8892932906746864}
12/23/2021 23:02:46 - INFO - codeparrot_training - Step 7425: {'lr': 0.0004743306355991657, 'samples': 3802112, 'steps': 7425, 'batch_loss/train': 0.8961858318652958}
12/23/2021 23:02:57 - INFO - codeparrot_training - Step 7426: {'lr': 0.0004743236776933643, 'samples': 3802624, 'steps': 7426, 'batch_loss/train': 0.9107004450634122}
12/23/2021 23:03:08 - INFO - codeparrot_training - Step 7427: {'lr': 0.0004743167188957366, 'samples': 3803136, 'steps': 7427, 'batch_loss/train': 0.9011791367083788}
12/23/2021 23:03:20 - INFO - codeparrot_training - Step 7428: {'lr': 0.00047430975920631025, 'samples': 3803648, 'steps': 7428, 'batch_loss/train': 0.9647365035489202}
12/23/2021 23:03:31 - INFO - codeparrot_training - Step 7429: {'lr': 0.00047430279862511305, 'samples': 3804160, 'steps': 7429, 'batch_loss/train': 0.9068957790732384}
12/23/2021 23:03:41 - INFO - codeparrot_training - Step 7430: {'lr': 0.0004742958371521727, 'samples': 3804672, 'steps': 7430, 'batch_loss/train': 0.7712791804224253}
12/23/2021 23:03:52 - INFO - codeparrot_training - Step 7431: {'lr': 0.0004742888747875167, 'samples': 3805184, 'steps': 7431, 'batch_loss/train': 0.8925664955750108}
12/23/2021 23:04:05 - INFO - codeparrot_training - Step 7432: {'lr': 0.0004742819115311728, 'samples': 3805696, 'steps': 7432, 'batch_loss/train': 0.7760916077531874}
12/23/2021 23:04:15 - INFO - codeparrot_training - Step 7433: {'lr': 0.0004742749473831688, 'samples': 3806208, 'steps': 7433, 'batch_loss/train': 0.9407868925482035}
12/23/2021 23:04:26 - INFO - codeparrot_training - Step 7434: {'lr': 0.0004742679823435322, 'samples': 3806720, 'steps': 7434, 'batch_loss/train': 0.8700456977821887}
12/23/2021 23:04:38 - INFO - codeparrot_training - Step 7435: {'lr': 0.00047426101641229087, 'samples': 3807232, 'steps': 7435, 'batch_loss/train': 0.9327864646911621}
12/23/2021 23:04:49 - INFO - codeparrot_training - Step 7436: {'lr': 0.00047425404958947236, 'samples': 3807744, 'steps': 7436, 'batch_loss/train': 0.7882370674051344}
12/23/2021 23:04:59 - INFO - codeparrot_training - Step 7437: {'lr': 0.0004742470818751045, 'samples': 3808256, 'steps': 7437, 'batch_loss/train': 0.9563119374215603}
12/23/2021 23:05:11 - INFO - codeparrot_training - Step 7438: {'lr': 0.00047424011326921477, 'samples': 3808768, 'steps': 7438, 'batch_loss/train': 0.8778168261051178}
12/23/2021 23:05:22 - INFO - codeparrot_training - Step 7439: {'lr': 0.00047423314377183115, 'samples': 3809280, 'steps': 7439, 'batch_loss/train': 0.6723882332444191}
12/23/2021 23:05:32 - INFO - codeparrot_training - Step 7440: {'lr': 0.0004742261733829811, 'samples': 3809792, 'steps': 7440, 'batch_loss/train': 0.895679852925241}
12/23/2021 23:05:43 - INFO - codeparrot_training - Step 7441: {'lr': 0.00047421920210269255, 'samples': 3810304, 'steps': 7441, 'batch_loss/train': 0.8850605143234134}
12/23/2021 23:05:57 - INFO - codeparrot_training - Step 7442: {'lr': 0.00047421222993099303, 'samples': 3810816, 'steps': 7442, 'batch_loss/train': 0.8709240341559052}
12/23/2021 23:06:07 - INFO - codeparrot_training - Step 7443: {'lr': 0.0004742052568679104, 'samples': 3811328, 'steps': 7443, 'batch_loss/train': 0.9331055302172899}
12/23/2021 23:06:18 - INFO - codeparrot_training - Step 7444: {'lr': 0.00047419828291347233, 'samples': 3811840, 'steps': 7444, 'batch_loss/train': 0.7031475922558457}
12/23/2021 23:06:30 - INFO - codeparrot_training - Step 7445: {'lr': 0.0004741913080677065, 'samples': 3812352, 'steps': 7445, 'batch_loss/train': 0.999621968716383}
12/23/2021 23:06:40 - INFO - codeparrot_training - Step 7446: {'lr': 0.0004741843323306407, 'samples': 3812864, 'steps': 7446, 'batch_loss/train': 0.9387432718649507}
12/23/2021 23:06:51 - INFO - codeparrot_training - Step 7447: {'lr': 0.0004741773557023027, 'samples': 3813376, 'steps': 7447, 'batch_loss/train': 0.9573125680908561}
12/23/2021 23:07:03 - INFO - codeparrot_training - Step 7448: {'lr': 0.0004741703781827201, 'samples': 3813888, 'steps': 7448, 'batch_loss/train': 0.8253097645938396}
12/23/2021 23:07:14 - INFO - codeparrot_training - Step 7449: {'lr': 0.0004741633997719208, 'samples': 3814400, 'steps': 7449, 'batch_loss/train': 0.6635673111304641}
12/23/2021 23:07:24 - INFO - codeparrot_training - Step 7450: {'lr': 0.00047415642046993236, 'samples': 3814912, 'steps': 7450, 'batch_loss/train': 0.8421465288847685}
12/23/2021 23:07:35 - INFO - codeparrot_training - Step 7451: {'lr': 0.00047414944027678274, 'samples': 3815424, 'steps': 7451, 'batch_loss/train': 0.8995887152850628}
12/23/2021 23:07:47 - INFO - codeparrot_training - Step 7452: {'lr': 0.0004741424591924995, 'samples': 3815936, 'steps': 7452, 'batch_loss/train': 0.828726933337748}
12/23/2021 23:07:58 - INFO - codeparrot_training - Step 7453: {'lr': 0.0004741354772171105, 'samples': 3816448, 'steps': 7453, 'batch_loss/train': 0.9722810578532517}
12/23/2021 23:08:08 - INFO - codeparrot_training - Step 7454: {'lr': 0.00047412849435064347, 'samples': 3816960, 'steps': 7454, 'batch_loss/train': 0.9699595207348466}
12/23/2021 23:08:21 - INFO - codeparrot_training - Step 7455: {'lr': 0.00047412151059312614, 'samples': 3817472, 'steps': 7455, 'batch_loss/train': 0.9019744899123907}
12/23/2021 23:08:32 - INFO - codeparrot_training - Step 7456: {'lr': 0.00047411452594458637, 'samples': 3817984, 'steps': 7456, 'batch_loss/train': 0.7708200737833977}
12/23/2021 23:08:42 - INFO - codeparrot_training - Step 7457: {'lr': 0.0004741075404050519, 'samples': 3818496, 'steps': 7457, 'batch_loss/train': 0.8560521313920617}
12/23/2021 23:08:54 - INFO - codeparrot_training - Step 7458: {'lr': 0.0004741005539745504, 'samples': 3819008, 'steps': 7458, 'batch_loss/train': 0.7005306915380061}
12/23/2021 23:09:05 - INFO - codeparrot_training - Step 7459: {'lr': 0.0004740935666531097, 'samples': 3819520, 'steps': 7459, 'batch_loss/train': 1.1265014135278761}
12/23/2021 23:09:16 - INFO - codeparrot_training - Step 7460: {'lr': 0.00047408657844075767, 'samples': 3820032, 'steps': 7460, 'batch_loss/train': 0.9465006170794368}
12/23/2021 23:09:26 - INFO - codeparrot_training - Step 7461: {'lr': 0.00047407958933752195, 'samples': 3820544, 'steps': 7461, 'batch_loss/train': 0.7627403682563454}
12/23/2021 23:09:38 - INFO - codeparrot_training - Step 7462: {'lr': 0.0004740725993434304, 'samples': 3821056, 'steps': 7462, 'batch_loss/train': 0.8775930041447282}
12/23/2021 23:09:49 - INFO - codeparrot_training - Step 7463: {'lr': 0.00047406560845851076, 'samples': 3821568, 'steps': 7463, 'batch_loss/train': 0.9164081336930394}
12/23/2021 23:09:59 - INFO - codeparrot_training - Step 7464: {'lr': 0.00047405861668279095, 'samples': 3822080, 'steps': 7464, 'batch_loss/train': 0.8048156923614442}
12/23/2021 23:10:13 - INFO - codeparrot_training - Step 7465: {'lr': 0.00047405162401629867, 'samples': 3822592, 'steps': 7465, 'batch_loss/train': 0.8128450093790889}
12/23/2021 23:10:24 - INFO - codeparrot_training - Step 7466: {'lr': 0.00047404463045906166, 'samples': 3823104, 'steps': 7466, 'batch_loss/train': 0.8695209436118603}
12/23/2021 23:10:34 - INFO - codeparrot_training - Step 7467: {'lr': 0.00047403763601110784, 'samples': 3823616, 'steps': 7467, 'batch_loss/train': 0.7706090891733766}
12/23/2021 23:10:47 - INFO - codeparrot_training - Step 7468: {'lr': 0.00047403064067246496, 'samples': 3824128, 'steps': 7468, 'batch_loss/train': 0.7665291712619364}
12/23/2021 23:10:58 - INFO - codeparrot_training - Step 7469: {'lr': 0.0004740236444431609, 'samples': 3824640, 'steps': 7469, 'batch_loss/train': 0.8064218247309327}
12/23/2021 23:11:08 - INFO - codeparrot_training - Step 7470: {'lr': 0.00047401664732322335, 'samples': 3825152, 'steps': 7470, 'batch_loss/train': 0.842583785764873}
12/23/2021 23:11:19 - INFO - codeparrot_training - Step 7471: {'lr': 0.0004740096493126802, 'samples': 3825664, 'steps': 7471, 'batch_loss/train': 2.2097465582191944}
12/23/2021 23:11:32 - INFO - codeparrot_training - Step 7472: {'lr': 0.0004740026504115593, 'samples': 3826176, 'steps': 7472, 'batch_loss/train': 1.2286911169067025}
12/23/2021 23:11:43 - INFO - codeparrot_training - Step
7473: {'lr': 0.0004739956506198885, 'samples': 3826688, 'steps': 7473, 'batch_loss/train': 0.8320282474160194} 12/23/2021 23:11:53 - INFO - codeparrot_training - Step 7474: {'lr': 0.00047398864993769546, 'samples': 3827200, 'steps': 7474, 'batch_loss/train': 0.8919558590278029} 12/23/2021 23:12:04 - INFO - codeparrot_training - Step 7475: {'lr': 0.00047398164836500817, 'samples': 3827712, 'steps': 7475, 'batch_loss/train': 0.8428762156981975} 12/23/2021 23:12:16 - INFO - codeparrot_training - Step 7476: {'lr': 0.0004739746459018544, 'samples': 3828224, 'steps': 7476, 'batch_loss/train': 0.8776055378839374} 12/23/2021 23:12:26 - INFO - codeparrot_training - Step 7477: {'lr': 0.00047396764254826204, 'samples': 3828736, 'steps': 7477, 'batch_loss/train': 0.8241258775815368} 12/23/2021 23:12:37 - INFO - codeparrot_training - Step 7478: {'lr': 0.00047396063830425884, 'samples': 3829248, 'steps': 7478, 'batch_loss/train': 0.8864692654460669} 12/23/2021 23:12:49 - INFO - codeparrot_training - Step 7479: {'lr': 0.00047395363316987275, 'samples': 3829760, 'steps': 7479, 'batch_loss/train': 0.9412556127645075} 12/23/2021 23:13:00 - INFO - codeparrot_training - Step 7480: {'lr': 0.0004739466271451316, 'samples': 3830272, 'steps': 7480, 'batch_loss/train': 0.8506463719531894} 12/23/2021 23:13:10 - INFO - codeparrot_training - Step 7481: {'lr': 0.0004739396202300632, 'samples': 3830784, 'steps': 7481, 'batch_loss/train': 0.7088622117880732} 12/23/2021 23:13:24 - INFO - codeparrot_training - Step 7482: {'lr': 0.0004739326124246954, 'samples': 3831296, 'steps': 7482, 'batch_loss/train': 0.8813962955027819} 12/23/2021 23:13:34 - INFO - codeparrot_training - Step 7483: {'lr': 0.00047392560372905615, 'samples': 3831808, 'steps': 7483, 'batch_loss/train': 0.905711667612195} 12/23/2021 23:13:45 - INFO - codeparrot_training - Step 7484: {'lr': 0.0004739185941431732, 'samples': 3832320, 'steps': 7484, 'batch_loss/train': 0.9354615481570363} 12/23/2021 23:13:57 - INFO - 
codeparrot_training - Step 7485: {'lr': 0.00047391158366707445, 'samples': 3832832, 'steps': 7485, 'batch_loss/train': 0.8573918910697103} 12/23/2021 23:14:07 - INFO - codeparrot_training - Step 7486: {'lr': 0.0004739045723007878, 'samples': 3833344, 'steps': 7486, 'batch_loss/train': 0.8435942968353629} 12/23/2021 23:14:18 - INFO - codeparrot_training - Step 7487: {'lr': 0.0004738975600443411, 'samples': 3833856, 'steps': 7487, 'batch_loss/train': 0.8321838597767055} 12/23/2021 23:14:28 - INFO - codeparrot_training - Step 7488: {'lr': 0.00047389054689776235, 'samples': 3834368, 'steps': 7488, 'batch_loss/train': 0.9326350353658199} 12/23/2021 23:14:42 - INFO - codeparrot_training - Step 7489: {'lr': 0.00047388353286107924, 'samples': 3834880, 'steps': 7489, 'batch_loss/train': 0.8962859711609781} 12/23/2021 23:14:53 - INFO - codeparrot_training - Step 7490: {'lr': 0.00047387651793431976, 'samples': 3835392, 'steps': 7490, 'batch_loss/train': 0.7623732752399519} 12/23/2021 23:15:03 - INFO - codeparrot_training - Step 7491: {'lr': 0.00047386950211751166, 'samples': 3835904, 'steps': 7491, 'batch_loss/train': 0.8511980762705207} 12/23/2021 23:15:15 - INFO - codeparrot_training - Step 7492: {'lr': 0.0004738624854106831, 'samples': 3836416, 'steps': 7492, 'batch_loss/train': 1.2422330037225038} 12/23/2021 23:15:26 - INFO - codeparrot_training - Step 7493: {'lr': 0.0004738554678138618, 'samples': 3836928, 'steps': 7493, 'batch_loss/train': 0.8317286828532815} 12/23/2021 23:15:37 - INFO - codeparrot_training - Step 7494: {'lr': 0.0004738484493270756, 'samples': 3837440, 'steps': 7494, 'batch_loss/train': 0.8770252224057913} 12/23/2021 23:15:47 - INFO - codeparrot_training - Step 7495: {'lr': 0.0004738414299503526, 'samples': 3837952, 'steps': 7495, 'batch_loss/train': 0.8537774831056595} 12/23/2021 23:15:59 - INFO - codeparrot_training - Step 7496: {'lr': 0.0004738344096837205, 'samples': 3838464, 'steps': 7496, 'batch_loss/train': 0.859804212115705} 12/23/2021 23:16:10 
- INFO - codeparrot_training - Step 7497: {'lr': 0.0004738273885272074, 'samples': 3838976, 'steps': 7497, 'batch_loss/train': 0.7818895909003913} 12/23/2021 23:16:20 - INFO - codeparrot_training - Step 7498: {'lr': 0.000473820366480841, 'samples': 3839488, 'steps': 7498, 'batch_loss/train': 0.863894612994045} 12/23/2021 23:16:33 - INFO - codeparrot_training - Step 7499: {'lr': 0.00047381334354464945, 'samples': 3840000, 'steps': 7499, 'batch_loss/train': 0.9735000836662948} 12/23/2021 23:16:33 - INFO - codeparrot_training - Evaluating and saving model checkpoint 12/23/2021 23:20:08 - INFO - codeparrot_training - Step 7500: {'loss/eval': 0.8546667695045471, 'perplexity': 2.350590944290161} 12/23/2021 23:20:38 - INFO - codeparrot_training - Step 7500: {'lr': 0.00047380631971866046, 'samples': 3840512, 'steps': 7500, 'batch_loss/train': 0.837305162800476} 12/23/2021 23:20:49 - INFO - codeparrot_training - Step 7501: {'lr': 0.0004737992950029021, 'samples': 3841024, 'steps': 7501, 'batch_loss/train': 0.8969468860886991} 12/23/2021 23:21:01 - INFO - codeparrot_training - Step 7502: {'lr': 0.0004737922693974022, 'samples': 3841536, 'steps': 7502, 'batch_loss/train': 0.921319498680532} 12/23/2021 23:21:12 - INFO - codeparrot_training - Step 7503: {'lr': 0.00047378524290218876, 'samples': 3842048, 'steps': 7503, 'batch_loss/train': 0.852928156265989} 12/23/2021 23:21:22 - INFO - codeparrot_training - Step 7504: {'lr': 0.0004737782155172897, 'samples': 3842560, 'steps': 7504, 'batch_loss/train': 0.9662292310968041} 12/23/2021 23:21:33 - INFO - codeparrot_training - Step 7505: {'lr': 0.00047377118724273293, 'samples': 3843072, 'steps': 7505, 'batch_loss/train': 1.0105484728701413} 12/23/2021 23:21:47 - INFO - codeparrot_training - Step 7506: {'lr': 0.00047376415807854645, 'samples': 3843584, 'steps': 7506, 'batch_loss/train': 0.7760168900713325} 12/23/2021 23:21:58 - INFO - codeparrot_training - Step 7507: {'lr': 0.00047375712802475816, 'samples': 3844096, 'steps': 7507, 
'batch_loss/train': 0.8687637411057949} 12/23/2021 23:22:08 - INFO - codeparrot_training - Step 7508: {'lr': 0.00047375009708139603, 'samples': 3844608, 'steps': 7508, 'batch_loss/train': 0.8125312682241201} 12/23/2021 23:22:20 - INFO - codeparrot_training - Step 7509: {'lr': 0.000473743065248488, 'samples': 3845120, 'steps': 7509, 'batch_loss/train': 0.7870692368596792} 12/23/2021 23:22:31 - INFO - codeparrot_training - Step 7510: {'lr': 0.000473736032526062, 'samples': 3845632, 'steps': 7510, 'batch_loss/train': 0.8420952223241329} 12/23/2021 23:22:41 - INFO - codeparrot_training - Step 7511: {'lr': 0.000473728998914146, 'samples': 3846144, 'steps': 7511, 'batch_loss/train': 0.8033704916015267} 12/23/2021 23:22:54 - INFO - codeparrot_training - Step 7512: {'lr': 0.000473721964412768, 'samples': 3846656, 'steps': 7512, 'batch_loss/train': 0.8417484448291361} 12/23/2021 23:23:04 - INFO - codeparrot_training - Step 7513: {'lr': 0.0004737149290219559, 'samples': 3847168, 'steps': 7513, 'batch_loss/train': 0.801793310791254} 12/23/2021 23:23:15 - INFO - codeparrot_training - Step 7514: {'lr': 0.00047370789274173776, 'samples': 3847680, 'steps': 7514, 'batch_loss/train': 0.8540254663676023} 12/23/2021 23:23:27 - INFO - codeparrot_training - Step 7515: {'lr': 0.0004737008555721415, 'samples': 3848192, 'steps': 7515, 'batch_loss/train': 0.8614218793809414} 12/23/2021 23:23:38 - INFO - codeparrot_training - Step 7516: {'lr': 0.0004736938175131951, 'samples': 3848704, 'steps': 7516, 'batch_loss/train': 1.0077128866687417} 12/23/2021 23:23:48 - INFO - codeparrot_training - Step 7517: {'lr': 0.0004736867785649266, 'samples': 3849216, 'steps': 7517, 'batch_loss/train': 0.7877142000943422} 12/23/2021 23:23:59 - INFO - codeparrot_training - Step 7518: {'lr': 0.0004736797387273638, 'samples': 3849728, 'steps': 7518, 'batch_loss/train': 0.9100191667675972} 12/23/2021 23:24:13 - INFO - codeparrot_training - Step 7519: {'lr': 0.00047367269800053494, 'samples': 3850240, 'steps': 
7519, 'batch_loss/train': 0.8695456041023135} 12/23/2021 23:24:23 - INFO - codeparrot_training - Step 7520: {'lr': 0.00047366565638446784, 'samples': 3850752, 'steps': 7520, 'batch_loss/train': 0.7862599333748221} 12/23/2021 23:24:34 - INFO - codeparrot_training - Step 7521: {'lr': 0.00047365861387919055, 'samples': 3851264, 'steps': 7521, 'batch_loss/train': 0.7524722991511226} 12/23/2021 23:24:46 - INFO - codeparrot_training - Step 7522: {'lr': 0.0004736515704847311, 'samples': 3851776, 'steps': 7522, 'batch_loss/train': 0.8321187263354659} 12/23/2021 23:24:57 - INFO - codeparrot_training - Step 7523: {'lr': 0.00047364452620111735, 'samples': 3852288, 'steps': 7523, 'batch_loss/train': 0.816031382419169} 12/23/2021 23:25:07 - INFO - codeparrot_training - Step 7524: {'lr': 0.0004736374810283775, 'samples': 3852800, 'steps': 7524, 'batch_loss/train': 0.871781200170517} 12/23/2021 23:25:20 - INFO - codeparrot_training - Step 7525: {'lr': 0.00047363043496653946, 'samples': 3853312, 'steps': 7525, 'batch_loss/train': 0.8612668751738966} 12/23/2021 23:25:31 - INFO - codeparrot_training - Step 7526: {'lr': 0.0004736233880156312, 'samples': 3853824, 'steps': 7526, 'batch_loss/train': 0.6096243965439498} 12/23/2021 23:25:41 - INFO - codeparrot_training - Step 7527: {'lr': 0.00047361634017568076, 'samples': 3854336, 'steps': 7527, 'batch_loss/train': 2.4706889241933823} 12/23/2021 23:25:52 - INFO - codeparrot_training - Step 7528: {'lr': 0.00047360929144671626, 'samples': 3854848, 'steps': 7528, 'batch_loss/train': 0.9063254641368985} 12/23/2021 23:26:06 - INFO - codeparrot_training - Step 7529: {'lr': 0.00047360224182876567, 'samples': 3855360, 'steps': 7529, 'batch_loss/train': 0.9294341159984469} 12/23/2021 23:26:17 - INFO - codeparrot_training - Step 7530: {'lr': 0.0004735951913218569, 'samples': 3855872, 'steps': 7530, 'batch_loss/train': 0.8448813608847558} 12/23/2021 23:26:27 - INFO - codeparrot_training - Step 7531: {'lr': 0.00047358813992601814, 'samples': 
3856384, 'steps': 7531, 'batch_loss/train': 0.8344636438414454} 12/23/2021 23:26:39 - INFO - codeparrot_training - Step 7532: {'lr': 0.00047358108764127736, 'samples': 3856896, 'steps': 7532, 'batch_loss/train': 0.9150018966756761} 12/23/2021 23:26:50 - INFO - codeparrot_training - Step 7533: {'lr': 0.00047357403446766255, 'samples': 3857408, 'steps': 7533, 'batch_loss/train': 0.7857770635746419} 12/23/2021 23:27:00 - INFO - codeparrot_training - Step 7534: {'lr': 0.00047356698040520187, 'samples': 3857920, 'steps': 7534, 'batch_loss/train': 0.8009329224005342} 12/23/2021 23:27:11 - INFO - codeparrot_training - Step 7535: {'lr': 0.0004735599254539232, 'samples': 3858432, 'steps': 7535, 'batch_loss/train': 0.8050906429998577} 12/23/2021 23:27:23 - INFO - codeparrot_training - Step 7536: {'lr': 0.0004735528696138548, 'samples': 3858944, 'steps': 7536, 'batch_loss/train': 0.8857858432456851} 12/23/2021 23:27:34 - INFO - codeparrot_training - Step 7537: {'lr': 0.0004735458128850245, 'samples': 3859456, 'steps': 7537, 'batch_loss/train': 0.7364701011683792} 12/23/2021 23:27:44 - INFO - codeparrot_training - Step 7538: {'lr': 0.00047353875526746046, 'samples': 3859968, 'steps': 7538, 'batch_loss/train': 0.8632395754102618} 12/23/2021 23:27:58 - INFO - codeparrot_training - Step 7539: {'lr': 0.0004735316967611908, 'samples': 3860480, 'steps': 7539, 'batch_loss/train': 0.8104018392041326} 12/23/2021 23:28:09 - INFO - codeparrot_training - Step 7540: {'lr': 0.0004735246373662435, 'samples': 3860992, 'steps': 7540, 'batch_loss/train': 0.8644509334117174} 12/23/2021 23:28:20 - INFO - codeparrot_training - Step 7541: {'lr': 0.00047351757708264666, 'samples': 3861504, 'steps': 7541, 'batch_loss/train': 0.8298703003674746} 12/23/2021 23:28:32 - INFO - codeparrot_training - Step 7542: {'lr': 0.00047351051591042827, 'samples': 3862016, 'steps': 7542, 'batch_loss/train': 0.8080223377328366} 12/23/2021 23:28:43 - INFO - codeparrot_training - Step 7543: {'lr': 0.0004735034538496165, 
'samples': 3862528, 'steps': 7543, 'batch_loss/train': 0.7922411244362593} 12/23/2021 23:28:53 - INFO - codeparrot_training - Step 7544: {'lr': 0.0004734963909002393, 'samples': 3863040, 'steps': 7544, 'batch_loss/train': 0.8111010566353798} 12/23/2021 23:29:07 - INFO - codeparrot_training - Step 7545: {'lr': 0.00047348932706232493, 'samples': 3863552, 'steps': 7545, 'batch_loss/train': 0.8411112865433097} 12/23/2021 23:29:18 - INFO - codeparrot_training - Step 7546: {'lr': 0.00047348226233590143, 'samples': 3864064, 'steps': 7546, 'batch_loss/train': 0.7864159597083926} 12/23/2021 23:29:28 - INFO - codeparrot_training - Step 7547: {'lr': 0.0004734751967209967, 'samples': 3864576, 'steps': 7547, 'batch_loss/train': 0.7968324227258563} 12/23/2021 23:29:39 - INFO - codeparrot_training - Step 7548: {'lr': 0.0004734681302176391, 'samples': 3865088, 'steps': 7548, 'batch_loss/train': 0.8772008958039805} 12/23/2021 23:29:51 - INFO - codeparrot_training - Step 7549: {'lr': 0.00047346106282585655, 'samples': 3865600, 'steps': 7549, 'batch_loss/train': 0.8436633897945285} 12/23/2021 23:30:02 - INFO - codeparrot_training - Step 7550: {'lr': 0.00047345399454567713, 'samples': 3866112, 'steps': 7550, 'batch_loss/train': 0.8896505692973733} 12/23/2021 23:30:12 - INFO - codeparrot_training - Step 7551: {'lr': 0.000473446925377129, 'samples': 3866624, 'steps': 7551, 'batch_loss/train': 0.8236389096127823} 12/23/2021 23:30:24 - INFO - codeparrot_training - Step 7552: {'lr': 0.00047343985532024034, 'samples': 3867136, 'steps': 7552, 'batch_loss/train': 0.8079888187348843} 12/23/2021 23:30:35 - INFO - codeparrot_training - Step 7553: {'lr': 0.0004734327843750391, 'samples': 3867648, 'steps': 7553, 'batch_loss/train': 0.7543365461751819} 12/23/2021 23:30:45 - INFO - codeparrot_training - Step 7554: {'lr': 0.00047342571254155357, 'samples': 3868160, 'steps': 7554, 'batch_loss/train': 0.7698141492437571} 12/23/2021 23:30:58 - INFO - codeparrot_training - Step 7555: {'lr': 
0.0004734186398198117, 'samples': 3868672, 'steps': 7555, 'batch_loss/train': 0.8163442541845143} 12/23/2021 23:31:08 - INFO - codeparrot_training - Step 7556: {'lr': 0.00047341156620984173, 'samples': 3869184, 'steps': 7556, 'batch_loss/train': 0.7188698165118694} 12/23/2021 23:31:19 - INFO - codeparrot_training - Step 7557: {'lr': 0.00047340449171167173, 'samples': 3869696, 'steps': 7557, 'batch_loss/train': 0.7105955900624394} 12/23/2021 23:31:29 - INFO - codeparrot_training - Step 7558: {'lr': 0.0004733974163253298, 'samples': 3870208, 'steps': 7558, 'batch_loss/train': 0.8531088633462787} 12/23/2021 23:31:43 - INFO - codeparrot_training - Step 7559: {'lr': 0.00047339034005084406, 'samples': 3870720, 'steps': 7559, 'batch_loss/train': 0.8663130952045321} 12/23/2021 23:31:54 - INFO - codeparrot_training - Step 7560: {'lr': 0.0004733832628882427, 'samples': 3871232, 'steps': 7560, 'batch_loss/train': 0.819406398339197} 12/23/2021 23:32:04 - INFO - codeparrot_training - Step 7561: {'lr': 0.00047337618483755386, 'samples': 3871744, 'steps': 7561, 'batch_loss/train': 0.7795283887535334} 12/23/2021 23:32:17 - INFO - codeparrot_training - Step 7562: {'lr': 0.0004733691058988057, 'samples': 3872256, 'steps': 7562, 'batch_loss/train': 0.7378985336981714} 12/23/2021 23:32:27 - INFO - codeparrot_training - Step 7563: {'lr': 0.00047336202607202617, 'samples': 3872768, 'steps': 7563, 'batch_loss/train': 0.9024181608110666} 12/23/2021 23:32:38 - INFO - codeparrot_training - Step 7564: {'lr': 0.00047335494535724365, 'samples': 3873280, 'steps': 7564, 'batch_loss/train': 0.9060514103621244} 12/23/2021 23:32:50 - INFO - codeparrot_training - Step 7565: {'lr': 0.0004733478637544862, 'samples': 3873792, 'steps': 7565, 'batch_loss/train': 0.8315105652436614} 12/23/2021 23:33:00 - INFO - codeparrot_training - Step 7566: {'lr': 0.000473340781263782, 'samples': 3874304, 'steps': 7566, 'batch_loss/train': 0.8984022596850991} 12/23/2021 23:33:11 - INFO - codeparrot_training - Step 
7567: {'lr': 0.0004733336978851591, 'samples': 3874816, 'steps': 7567, 'batch_loss/train': 0.9097151160240173} 12/23/2021 23:33:25 - INFO - codeparrot_training - Step 7568: {'lr': 0.0004733266136186458, 'samples': 3875328, 'steps': 7568, 'batch_loss/train': 0.8743008747696877} 12/23/2021 23:33:35 - INFO - codeparrot_training - Step 7569: {'lr': 0.00047331952846427027, 'samples': 3875840, 'steps': 7569, 'batch_loss/train': 0.8652344569563866} 12/23/2021 23:33:46 - INFO - codeparrot_training - Step 7570: {'lr': 0.0004733124424220605, 'samples': 3876352, 'steps': 7570, 'batch_loss/train': 0.7604622654616833} 12/23/2021 23:33:56 - INFO - codeparrot_training - Step 7571: {'lr': 0.0004733053554920449, 'samples': 3876864, 'steps': 7571, 'batch_loss/train': 0.8829784812405705} 12/23/2021 23:34:09 - INFO - codeparrot_training - Step 7572: {'lr': 0.00047329826767425145, 'samples': 3877376, 'steps': 7572, 'batch_loss/train': 0.8520888835191727} 12/23/2021 23:34:19 - INFO - codeparrot_training - Step 7573: {'lr': 0.00047329117896870843, 'samples': 3877888, 'steps': 7573, 'batch_loss/train': 0.8750678435899317} 12/23/2021 23:34:30 - INFO - codeparrot_training - Step 7574: {'lr': 0.000473284089375444, 'samples': 3878400, 'steps': 7574, 'batch_loss/train': 0.7567882514558733} 12/23/2021 23:34:44 - INFO - codeparrot_training - Step 7575: {'lr': 0.0004732769988944863, 'samples': 3878912, 'steps': 7575, 'batch_loss/train': 0.8168737180531025} 12/23/2021 23:34:54 - INFO - codeparrot_training - Step 7576: {'lr': 0.0004732699075258635, 'samples': 3879424, 'steps': 7576, 'batch_loss/train': 0.7442332403734326} 12/23/2021 23:35:05 - INFO - codeparrot_training - Step 7577: {'lr': 0.0004732628152696039, 'samples': 3879936, 'steps': 7577, 'batch_loss/train': 0.8870409447699785} 12/23/2021 23:35:17 - INFO - codeparrot_training - Step 7578: {'lr': 0.00047325572212573574, 'samples': 3880448, 'steps': 7578, 'batch_loss/train': 0.7570527503266931} 12/23/2021 23:35:28 - INFO - codeparrot_training 
- Step 7579: {'lr': 0.00047324862809428704, 'samples': 3880960, 'steps': 7579, 'batch_loss/train': 1.5688214898109436} 12/23/2021 23:35:38 - INFO - codeparrot_training - Step 7580: {'lr': 0.0004732415331752861, 'samples': 3881472, 'steps': 7580, 'batch_loss/train': 0.8411909928545356} 12/23/2021 23:35:49 - INFO - codeparrot_training - Step 7581: {'lr': 0.00047323443736876117, 'samples': 3881984, 'steps': 7581, 'batch_loss/train': 0.8208200049120933} 12/23/2021 23:36:01 - INFO - codeparrot_training - Step 7582: {'lr': 0.0004732273406747403, 'samples': 3882496, 'steps': 7582, 'batch_loss/train': 0.9929943904280663} 12/23/2021 23:36:12 - INFO - codeparrot_training - Step 7583: {'lr': 0.0004732202430932519, 'samples': 3883008, 'steps': 7583, 'batch_loss/train': 0.7080137575976551} 12/23/2021 23:36:22 - INFO - codeparrot_training - Step 7584: {'lr': 0.00047321314462432407, 'samples': 3883520, 'steps': 7584, 'batch_loss/train': 0.8414992969483137} 12/23/2021 23:36:36 - INFO - codeparrot_training - Step 7585: {'lr': 0.00047320604526798505, 'samples': 3884032, 'steps': 7585, 'batch_loss/train': 0.8429036289453506} 12/23/2021 23:36:47 - INFO - codeparrot_training - Step 7586: {'lr': 0.0004731989450242631, 'samples': 3884544, 'steps': 7586, 'batch_loss/train': 0.9045059364289045} 12/23/2021 23:36:57 - INFO - codeparrot_training - Step 7587: {'lr': 0.00047319184389318637, 'samples': 3885056, 'steps': 7587, 'batch_loss/train': 0.805141981691122} 12/23/2021 23:37:10 - INFO - codeparrot_training - Step 7588: {'lr': 0.0004731847418747832, 'samples': 3885568, 'steps': 7588, 'batch_loss/train': 0.7472687957342714} 12/23/2021 23:37:20 - INFO - codeparrot_training - Step 7589: {'lr': 0.0004731776389690817, 'samples': 3886080, 'steps': 7589, 'batch_loss/train': 0.7919280016794801} 12/23/2021 23:37:31 - INFO - codeparrot_training - Step 7590: {'lr': 0.0004731705351761103, 'samples': 3886592, 'steps': 7590, 'batch_loss/train': 0.7508890591561794} 12/23/2021 23:37:41 - INFO - 
codeparrot_training - Step 7591: {'lr': 0.00047316343049589704, 'samples': 3887104, 'steps': 7591, 'batch_loss/train': 0.9181440444663167} 12/23/2021 23:37:53 - INFO - codeparrot_training - Step 7592: {'lr': 0.0004731563249284702, 'samples': 3887616, 'steps': 7592, 'batch_loss/train': 0.9439062755554914} 12/23/2021 23:38:04 - INFO - codeparrot_training - Step 7593: {'lr': 0.00047314921847385816, 'samples': 3888128, 'steps': 7593, 'batch_loss/train': 0.7906967783346772} 12/23/2021 23:38:15 - INFO - codeparrot_training - Step 7594: {'lr': 0.00047314211113208906, 'samples': 3888640, 'steps': 7594, 'batch_loss/train': 0.8860516035929322} 12/23/2021 23:38:27 - INFO - codeparrot_training - Step 7595: {'lr': 0.0004731350029031911, 'samples': 3889152, 'steps': 7595, 'batch_loss/train': 0.8407735042273998} 12/23/2021 23:38:38 - INFO - codeparrot_training - Step 7596: {'lr': 0.0004731278937871927, 'samples': 3889664, 'steps': 7596, 'batch_loss/train': 0.8798204818740487} 12/23/2021 23:38:48 - INFO - codeparrot_training - Step 7597: {'lr': 0.000473120783784122, 'samples': 3890176, 'steps': 7597, 'batch_loss/train': 0.8780159298330545} 12/23/2021 23:38:59 - INFO - codeparrot_training - Step 7598: {'lr': 0.0004731136728940074, 'samples': 3890688, 'steps': 7598, 'batch_loss/train': 0.6631543172989041} 12/23/2021 23:39:13 - INFO - codeparrot_training - Step 7599: {'lr': 0.000473106561116877, 'samples': 3891200, 'steps': 7599, 'batch_loss/train': 0.788127887994051} 12/23/2021 23:39:23 - INFO - codeparrot_training - Step 7600: {'lr': 0.00047309944845275913, 'samples': 3891712, 'steps': 7600, 'batch_loss/train': 0.8010712070390582} 12/23/2021 23:39:34 - INFO - codeparrot_training - Step 7601: {'lr': 0.0004730923349016821, 'samples': 3892224, 'steps': 7601, 'batch_loss/train': 1.0019652796909213} 12/23/2021 23:39:46 - INFO - codeparrot_training - Step 7602: {'lr': 0.0004730852204636743, 'samples': 3892736, 'steps': 7602, 'batch_loss/train': 0.7751108431257308} 12/23/2021 23:39:57 - 
INFO - codeparrot_training - Step 7603: {'lr': 0.00047307810513876384, 'samples': 3893248, 'steps': 7603, 'batch_loss/train': 0.8936565378680825} 12/23/2021 23:40:07 - INFO - codeparrot_training - Step 7604: {'lr': 0.00047307098892697896, 'samples': 3893760, 'steps': 7604, 'batch_loss/train': 0.7817695867270231} 12/23/2021 23:40:22 - INFO - codeparrot_training - Step 7605: {'lr': 0.0004730638718283482, 'samples': 3894272, 'steps': 7605, 'batch_loss/train': 0.7842894988134503} 12/23/2021 23:40:33 - INFO - codeparrot_training - Step 7606: {'lr': 0.0004730567538428996, 'samples': 3894784, 'steps': 7606, 'batch_loss/train': 1.1773369605652988} 12/23/2021 23:40:43 - INFO - codeparrot_training - Step 7607: {'lr': 0.0004730496349706617, 'samples': 3895296, 'steps': 7607, 'batch_loss/train': 1.2603550013154745} 12/23/2021 23:40:54 - INFO - codeparrot_training - Step 7608: {'lr': 0.0004730425152116625, 'samples': 3895808, 'steps': 7608, 'batch_loss/train': 0.9504274874925613} 12/23/2021 23:41:06 - INFO - codeparrot_training - Step 7609: {'lr': 0.0004730353945659306, 'samples': 3896320, 'steps': 7609, 'batch_loss/train': 0.8724611951038241} 12/23/2021 23:41:16 - INFO - codeparrot_training - Step 7610: {'lr': 0.00047302827303349415, 'samples': 3896832, 'steps': 7610, 'batch_loss/train': 0.8713728873990476} 12/23/2021 23:41:27 - INFO - codeparrot_training - Step 7611: {'lr': 0.0004730211506143815, 'samples': 3897344, 'steps': 7611, 'batch_loss/train': 0.9037752058357} 12/23/2021 23:41:39 - INFO - codeparrot_training - Step 7612: {'lr': 0.000473014027308621, 'samples': 3897856, 'steps': 7612, 'batch_loss/train': 0.8350676335394382} 12/23/2021 23:41:50 - INFO - codeparrot_training - Step 7613: {'lr': 0.00047300690311624085, 'samples': 3898368, 'steps': 7613, 'batch_loss/train': 0.8342877650866285} 12/23/2021 23:42:00 - INFO - codeparrot_training - Step 7614: {'lr': 0.00047299977803726953, 'samples': 3898880, 'steps': 7614, 'batch_loss/train': 0.883557484485209} 12/23/2021 
23:42:14 - INFO - codeparrot_training - Step 7615: {'lr': 0.00047299265207173524, 'samples': 3899392, 'steps': 7615, 'batch_loss/train': 0.8401628830470145} 12/23/2021 23:42:25 - INFO - codeparrot_training - Step 7616: {'lr': 0.0004729855252196664, 'samples': 3899904, 'steps': 7616, 'batch_loss/train': 0.8049987275153399} 12/23/2021 23:42:35 - INFO - codeparrot_training - Step 7617: {'lr': 0.00047297839748109126, 'samples': 3900416, 'steps': 7617, 'batch_loss/train': 0.8669584766030312} 12/23/2021 23:42:46 - INFO - codeparrot_training - Step 7618: {'lr': 0.0004729712688560383, 'samples': 3900928, 'steps': 7618, 'batch_loss/train': 0.8849230855703354} 12/23/2021 23:42:58 - INFO - codeparrot_training - Step 7619: {'lr': 0.00047296413934453573, 'samples': 3901440, 'steps': 7619, 'batch_loss/train': 0.899445588234812} 12/23/2021 23:43:09 - INFO - codeparrot_training - Step 7620: {'lr': 0.00047295700894661193, 'samples': 3901952, 'steps': 7620, 'batch_loss/train': 0.828957088291645} 12/23/2021 23:43:19 - INFO - codeparrot_training - Step 7621: {'lr': 0.00047294987766229523, 'samples': 3902464, 'steps': 7621, 'batch_loss/train': 0.880889886058867} 12/23/2021 23:43:33 - INFO - codeparrot_training - Step 7622: {'lr': 0.000472942745491614, 'samples': 3902976, 'steps': 7622, 'batch_loss/train': 0.8894320819526911} 12/23/2021 23:43:43 - INFO - codeparrot_training - Step 7623: {'lr': 0.0004729356124345966, 'samples': 3903488, 'steps': 7623, 'batch_loss/train': 0.6869190237484872} 12/23/2021 23:43:54 - INFO - codeparrot_training - Step 7624: {'lr': 0.00047292847849127135, 'samples': 3904000, 'steps': 7624, 'batch_loss/train': 0.8088714322075248} 12/23/2021 23:44:06 - INFO - codeparrot_training - Step 7625: {'lr': 0.0004729213436616667, 'samples': 3904512, 'steps': 7625, 'batch_loss/train': 0.8729889960959554} 12/23/2021 23:44:17 - INFO - codeparrot_training - Step 7626: {'lr': 0.0004729142079458109, 'samples': 3905024, 'steps': 7626, 'batch_loss/train': 0.9684164430946112} 
12/23/2021 23:44:27 - INFO - codeparrot_training - Step 7627: {'lr': 0.0004729070713437324, 'samples': 3905536, 'steps': 7627, 'batch_loss/train': 0.8760317987762392} 12/23/2021 23:44:38 - INFO - codeparrot_training - Step 7628: {'lr': 0.0004728999338554596, 'samples': 3906048, 'steps': 7628, 'batch_loss/train': 0.8615803485736251} 12/23/2021 23:44:50 - INFO - codeparrot_training - Step 7629: {'lr': 0.0004728927954810208, 'samples': 3906560, 'steps': 7629, 'batch_loss/train': 0.7827437296509743} 12/23/2021 23:45:01 - INFO - codeparrot_training - Step 7630: {'lr': 0.00047288565622044434, 'samples': 3907072, 'steps': 7630, 'batch_loss/train': 0.8271077545359731} 12/23/2021 23:45:12 - INFO - codeparrot_training - Step 7631: {'lr': 0.00047287851607375874, 'samples': 3907584, 'steps': 7631, 'batch_loss/train': 0.9330240720883012} 12/23/2021 23:45:24 - INFO - codeparrot_training - Step 7632: {'lr': 0.00047287137504099233, 'samples': 3908096, 'steps': 7632, 'batch_loss/train': 0.6260865456424654} 12/23/2021 23:45:34 - INFO - codeparrot_training - Step 7633: {'lr': 0.0004728642331221734, 'samples': 3908608, 'steps': 7633, 'batch_loss/train': 0.9684890564531088} 12/23/2021 23:45:45 - INFO - codeparrot_training - Step 7634: {'lr': 0.00047285709031733047, 'samples': 3909120, 'steps': 7634, 'batch_loss/train': 0.8718346785753965} 12/23/2021 23:45:59 - INFO - codeparrot_training - Step 7635: {'lr': 0.00047284994662649194, 'samples': 3909632, 'steps': 7635, 'batch_loss/train': 0.8093829844146967} 12/23/2021 23:46:09 - INFO - codeparrot_training - Step 7636: {'lr': 0.0004728428020496861, 'samples': 3910144, 'steps': 7636, 'batch_loss/train': 0.7755043897777796} 12/23/2021 23:46:20 - INFO - codeparrot_training - Step 7637: {'lr': 0.0004728356565869415, 'samples': 3910656, 'steps': 7637, 'batch_loss/train': 0.9084793515503407} 12/23/2021 23:46:31 - INFO - codeparrot_training - Step 7638: {'lr': 0.0004728285102382864, 'samples': 3911168, 'steps': 7638, 'batch_loss/train': 
0.975971445441246} 12/23/2021 23:46:43 - INFO - codeparrot_training - Step 7639: {'lr': 0.00047282136300374935, 'samples': 3911680, 'steps': 7639, 'batch_loss/train': 0.7168949282495305} 12/23/2021 23:46:53 - INFO - codeparrot_training - Step 7640: {'lr': 0.0004728142148833586, 'samples': 3912192, 'steps': 7640, 'batch_loss/train': 0.8356193136423826} 12/23/2021 23:47:04 - INFO - codeparrot_training - Step 7641: {'lr': 0.0004728070658771427, 'samples': 3912704, 'steps': 7641, 'batch_loss/train': 0.894687419757247} 12/23/2021 23:47:16 - INFO - codeparrot_training - Step 7642: {'lr': 0.00047279991598513007, 'samples': 3913216, 'steps': 7642, 'batch_loss/train': 0.8393876808695495} 12/23/2021 23:47:27 - INFO - codeparrot_training - Step 7643: {'lr': 0.0004727927652073491, 'samples': 3913728, 'steps': 7643, 'batch_loss/train': 0.7682778513990343} 12/23/2021 23:47:37 - INFO - codeparrot_training - Step 7644: {'lr': 0.0004727856135438282, 'samples': 3914240, 'steps': 7644, 'batch_loss/train': 0.8677961612120271} 12/23/2021 23:47:51 - INFO - codeparrot_training - Step 7645: {'lr': 0.00047277846099459583, 'samples': 3914752, 'steps': 7645, 'batch_loss/train': 0.6154057739768177} 12/23/2021 23:48:02 - INFO - codeparrot_training - Step 7646: {'lr': 0.0004727713075596804, 'samples': 3915264, 'steps': 7646, 'batch_loss/train': 0.7829095385968685} 12/23/2021 23:48:12 - INFO - codeparrot_training - Step 7647: {'lr': 0.0004727641532391104, 'samples': 3915776, 'steps': 7647, 'batch_loss/train': 0.8495571473613381} 12/23/2021 23:48:23 - INFO - codeparrot_training - Step 7648: {'lr': 0.0004727569980329142, 'samples': 3916288, 'steps': 7648, 'batch_loss/train': 0.8600629577413201} 12/23/2021 23:48:35 - INFO - codeparrot_training - Step 7649: {'lr': 0.00047274984194112024, 'samples': 3916800, 'steps': 7649, 'batch_loss/train': 0.954157036729157} 12/23/2021 23:48:46 - INFO - codeparrot_training - Step 7650: {'lr': 0.0004727426849637571, 'samples': 3917312, 'steps': 7650, 
'batch_loss/train': 0.9635961311869323} 12/23/2021 23:48:56 - INFO - codeparrot_training - Step 7651: {'lr': 0.0004727355271008531, 'samples': 3917824, 'steps': 7651, 'batch_loss/train': 0.9159078993834555} 12/23/2021 23:49:10 - INFO - codeparrot_training - Step 7652: {'lr': 0.00047272836835243683, 'samples': 3918336, 'steps': 7652, 'batch_loss/train': 0.8702196290250868} 12/23/2021 23:49:21 - INFO - codeparrot_training - Step 7653: {'lr': 0.0004727212087185366, 'samples': 3918848, 'steps': 7653, 'batch_loss/train': 0.7215779214166105} 12/23/2021 23:49:32 - INFO - codeparrot_training - Step 7654: {'lr': 0.0004727140481991809, 'samples': 3919360, 'steps': 7654, 'batch_loss/train': 0.8303556479513645} 12/23/2021 23:49:45 - INFO - codeparrot_training - Step 7655: {'lr': 0.0004727068867943983, 'samples': 3919872, 'steps': 7655, 'batch_loss/train': 0.8769673984497786} 12/23/2021 23:49:55 - INFO - codeparrot_training - Step 7656: {'lr': 0.0004726997245042172, 'samples': 3920384, 'steps': 7656, 'batch_loss/train': 0.8637979393824935} 12/23/2021 23:50:06 - INFO - codeparrot_training - Step 7657: {'lr': 0.00047269256132866604, 'samples': 3920896, 'steps': 7657, 'batch_loss/train': 0.7527638534083962} 12/23/2021 23:50:16 - INFO - codeparrot_training - Step 7658: {'lr': 0.0004726853972677734, 'samples': 3921408, 'steps': 7658, 'batch_loss/train': 0.8975489454169292} 12/23/2021 23:50:29 - INFO - codeparrot_training - Step 7659: {'lr': 0.0004726782323215677, 'samples': 3921920, 'steps': 7659, 'batch_loss/train': 0.9021179871633649} 12/23/2021 23:50:39 - INFO - codeparrot_training - Step 7660: {'lr': 0.0004726710664900774, 'samples': 3922432, 'steps': 7660, 'batch_loss/train': 0.7536958991549909} 12/23/2021 23:50:50 - INFO - codeparrot_training - Step 7661: {'lr': 0.000472663899773331, 'samples': 3922944, 'steps': 7661, 'batch_loss/train': 0.8242409788072109} 12/23/2021 23:51:02 - INFO - codeparrot_training - Step 7662: {'lr': 0.0004726567321713571, 'samples': 3923456, 'steps': 
7662, 'batch_loss/train': 0.8276894264854491} 12/23/2021 23:51:13 - INFO - codeparrot_training - Step 7663: {'lr': 0.00047264956368418397, 'samples': 3923968, 'steps': 7663, 'batch_loss/train': 0.7755635521607473} 12/23/2021 23:51:23 - INFO - codeparrot_training - Step 7664: {'lr': 0.0004726423943118403, 'samples': 3924480, 'steps': 7664, 'batch_loss/train': 0.8326340942876413} 12/23/2021 23:51:34 - INFO - codeparrot_training - Step 7665: {'lr': 0.00047263522405435455, 'samples': 3924992, 'steps': 7665, 'batch_loss/train': 0.745432885363698} 12/23/2021 23:51:47 - INFO - codeparrot_training - Step 7666: {'lr': 0.0004726280529117553, 'samples': 3925504, 'steps': 7666, 'batch_loss/train': 0.8656824892386794} 12/23/2021 23:51:58 - INFO - codeparrot_training - Step 7667: {'lr': 0.00047262088088407086, 'samples': 3926016, 'steps': 7667, 'batch_loss/train': 0.8190220948308706} 12/23/2021 23:52:09 - INFO - codeparrot_training - Step 7668: {'lr': 0.00047261370797132987, 'samples': 3926528, 'steps': 7668, 'batch_loss/train': 0.7366170706227422} 12/23/2021 23:52:21 - INFO - codeparrot_training - Step 7669: {'lr': 0.00047260653417356083, 'samples': 3927040, 'steps': 7669, 'batch_loss/train': 0.8459790628403425} 12/23/2021 23:52:32 - INFO - codeparrot_training - Step 7670: {'lr': 0.00047259935949079226, 'samples': 3927552, 'steps': 7670, 'batch_loss/train': 0.8114823636133224} 12/23/2021 23:52:42 - INFO - codeparrot_training - Step 7671: {'lr': 0.0004725921839230527, 'samples': 3928064, 'steps': 7671, 'batch_loss/train': 0.8926234019454569} 12/23/2021 23:52:54 - INFO - codeparrot_training - Step 7672: {'lr': 0.0004725850074703707, 'samples': 3928576, 'steps': 7672, 'batch_loss/train': 0.6974325663177297} 12/23/2021 23:53:05 - INFO - codeparrot_training - Step 7673: {'lr': 0.00047257783013277465, 'samples': 3929088, 'steps': 7673, 'batch_loss/train': 0.8286070330068469} 12/23/2021 23:53:15 - INFO - codeparrot_training - Step 7674: {'lr': 0.0004725706519102933, 'samples': 
3929600, 'steps': 7674, 'batch_loss/train': 0.8416361750569195} 12/23/2021 23:53:29 - INFO - codeparrot_training - Step 7675: {'lr': 0.0004725634728029551, 'samples': 3930112, 'steps': 7675, 'batch_loss/train': 0.8898220434784889} 12/23/2021 23:53:40 - INFO - codeparrot_training - Step 7676: {'lr': 0.0004725562928107885, 'samples': 3930624, 'steps': 7676, 'batch_loss/train': 0.8269109427928925} 12/23/2021 23:53:50 - INFO - codeparrot_training - Step 7677: {'lr': 0.00047254911193382213, 'samples': 3931136, 'steps': 7677, 'batch_loss/train': 0.8254661345854402} 12/23/2021 23:54:01 - INFO - codeparrot_training - Step 7678: {'lr': 0.00047254193017208447, 'samples': 3931648, 'steps': 7678, 'batch_loss/train': 0.794717532931827} 12/23/2021 23:54:13 - INFO - codeparrot_training - Step 7679: {'lr': 0.0004725347475256042, 'samples': 3932160, 'steps': 7679, 'batch_loss/train': 1.6690901215188205} 12/23/2021 23:54:24 - INFO - codeparrot_training - Step 7680: {'lr': 0.00047252756399440975, 'samples': 3932672, 'steps': 7680, 'batch_loss/train': 0.8781876275315881} 12/23/2021 23:54:35 - INFO - codeparrot_training - Step 7681: {'lr': 0.0004725203795785297, 'samples': 3933184, 'steps': 7681, 'batch_loss/train': 0.9397936202585697} 12/23/2021 23:54:48 - INFO - codeparrot_training - Step 7682: {'lr': 0.0004725131942779927, 'samples': 3933696, 'steps': 7682, 'batch_loss/train': 0.7925445607397705} 12/23/2021 23:54:59 - INFO - codeparrot_training - Step 7683: {'lr': 0.0004725060080928273, 'samples': 3934208, 'steps': 7683, 'batch_loss/train': 0.6424005797598511} 12/23/2021 23:55:10 - INFO - codeparrot_training - Step 7684: {'lr': 0.00047249882102306194, 'samples': 3934720, 'steps': 7684, 'batch_loss/train': 1.0318118017166853} 12/23/2021 23:55:22 - INFO - codeparrot_training - Step 7685: {'lr': 0.00047249163306872533, 'samples': 3935232, 'steps': 7685, 'batch_loss/train': 0.9952272297814488} 12/23/2021 23:55:33 - INFO - codeparrot_training - Step 7686: {'lr': 0.00047248444422984595, 
'samples': 3935744, 'steps': 7686, 'batch_loss/train': 0.8921092003583908} 12/23/2021 23:55:43 - INFO - codeparrot_training - Step 7687: {'lr': 0.00047247725450645244, 'samples': 3936256, 'steps': 7687, 'batch_loss/train': 0.5313380877487361} 12/23/2021 23:55:54 - INFO - codeparrot_training - Step 7688: {'lr': 0.0004724700638985734, 'samples': 3936768, 'steps': 7688, 'batch_loss/train': 0.8680154206231236} 12/23/2021 23:56:06 - INFO - codeparrot_training - Step 7689: {'lr': 0.0004724628724062374, 'samples': 3937280, 'steps': 7689, 'batch_loss/train': 0.8314472418278456} 12/23/2021 23:56:16 - INFO - codeparrot_training - Step 7690: {'lr': 0.000472455680029473, 'samples': 3937792, 'steps': 7690, 'batch_loss/train': 0.9037222182378173} 12/23/2021 23:56:27 - INFO - codeparrot_training - Step 7691: {'lr': 0.00047244848676830875, 'samples': 3938304, 'steps': 7691, 'batch_loss/train': 0.909139552153647} 12/23/2021 23:56:41 - INFO - codeparrot_training - Step 7692: {'lr': 0.00047244129262277344, 'samples': 3938816, 'steps': 7692, 'batch_loss/train': 0.8630758728832006} 12/23/2021 23:56:51 - INFO - codeparrot_training - Step 7693: {'lr': 0.00047243409759289545, 'samples': 3939328, 'steps': 7693, 'batch_loss/train': 0.8688067155890167} 12/23/2021 23:57:02 - INFO - codeparrot_training - Step 7694: {'lr': 0.0004724269016787035, 'samples': 3939840, 'steps': 7694, 'batch_loss/train': 0.9166862415149808} 12/23/2021 23:57:12 - INFO - codeparrot_training - Step 7695: {'lr': 0.00047241970488022615, 'samples': 3940352, 'steps': 7695, 'batch_loss/train': 0.9797491859644651} 12/23/2021 23:57:24 - INFO - codeparrot_training - Step 7696: {'lr': 0.0004724125071974921, 'samples': 3940864, 'steps': 7696, 'batch_loss/train': 0.9187369747087359} 12/23/2021 23:57:35 - INFO - codeparrot_training - Step 7697: {'lr': 0.0004724053086305298, 'samples': 3941376, 'steps': 7697, 'batch_loss/train': 0.8817533627152443} 12/23/2021 23:57:46 - INFO - codeparrot_training - Step 7698: {'lr': 
0.000472398109179368, 'samples': 3941888, 'steps': 7698, 'batch_loss/train': 0.8786478666588664} 12/23/2021 23:57:58 - INFO - codeparrot_training - Step 7699: {'lr': 0.0004723909088440353, 'samples': 3942400, 'steps': 7699, 'batch_loss/train': 0.773221236653626} 12/23/2021 23:58:09 - INFO - codeparrot_training - Step 7700: {'lr': 0.00047238370762456034, 'samples': 3942912, 'steps': 7700, 'batch_loss/train': 0.9052623175084591} 12/23/2021 23:58:19 - INFO - codeparrot_training - Step 7701: {'lr': 0.0004723765055209717, 'samples': 3943424, 'steps': 7701, 'batch_loss/train': 0.7952344566583633} 12/23/2021 23:58:31 - INFO - codeparrot_training - Step 7702: {'lr': 0.0004723693025332981, 'samples': 3943936, 'steps': 7702, 'batch_loss/train': 0.5259196342667565} 12/23/2021 23:58:42 - INFO - codeparrot_training - Step 7703: {'lr': 0.0004723620986615681, 'samples': 3944448, 'steps': 7703, 'batch_loss/train': 0.9186717364937067} 12/23/2021 23:58:53 - INFO - codeparrot_training - Step 7704: {'lr': 0.00047235489390581035, 'samples': 3944960, 'steps': 7704, 'batch_loss/train': 0.885705423541367} 12/23/2021 23:59:06 - INFO - codeparrot_training - Step 7705: {'lr': 0.00047234768826605344, 'samples': 3945472, 'steps': 7705, 'batch_loss/train': 0.8999420246109366} 12/23/2021 23:59:17 - INFO - codeparrot_training - Step 7706: {'lr': 0.0004723404817423261, 'samples': 3945984, 'steps': 7706, 'batch_loss/train': 0.9060766100883484} 12/23/2021 23:59:28 - INFO - codeparrot_training - Step 7707: {'lr': 0.000472333274334657, 'samples': 3946496, 'steps': 7707, 'batch_loss/train': 0.7567803165875375} 12/23/2021 23:59:38 - INFO - codeparrot_training - Step 7708: {'lr': 0.0004723260660430747, 'samples': 3947008, 'steps': 7708, 'batch_loss/train': 0.8277129018679261} 12/23/2021 23:59:50 - INFO - codeparrot_training - Step 7709: {'lr': 0.00047231885686760796, 'samples': 3947520, 'steps': 7709, 'batch_loss/train': 0.9162299819290638} 12/24/2021 00:00:01 - INFO - codeparrot_training - Step 7710: 
{'lr': 0.0004723116468082853, 'samples': 3948032, 'steps': 7710, 'batch_loss/train': 0.9194396883249283} 12/24/2021 00:00:11 - INFO - codeparrot_training - Step 7711: {'lr': 0.0004723044358651356, 'samples': 3948544, 'steps': 7711, 'batch_loss/train': 0.786254083737731} 12/24/2021 00:00:25 - INFO - codeparrot_training - Step 7712: {'lr': 0.0004722972240381873, 'samples': 3949056, 'steps': 7712, 'batch_loss/train': 0.9276328850537539} 12/24/2021 00:00:36 - INFO - codeparrot_training - Step 7713: {'lr': 0.00047229001132746923, 'samples': 3949568, 'steps': 7713, 'batch_loss/train': 0.8149184565991163} 12/24/2021 00:00:47 - INFO - codeparrot_training - Step 7714: {'lr': 0.00047228279773301, 'samples': 3950080, 'steps': 7714, 'batch_loss/train': 0.9748156266286969} 12/24/2021 00:00:59 - INFO - codeparrot_training - Step 7715: {'lr': 0.00047227558325483833, 'samples': 3950592, 'steps': 7715, 'batch_loss/train': 0.8585040839388967} 12/24/2021 00:01:09 - INFO - codeparrot_training - Step 7716: {'lr': 0.00047226836789298276, 'samples': 3951104, 'steps': 7716, 'batch_loss/train': 0.8117432724684477} 12/24/2021 00:01:20 - INFO - codeparrot_training - Step 7717: {'lr': 0.00047226115164747216, 'samples': 3951616, 'steps': 7717, 'batch_loss/train': 0.7998721012845635} 12/24/2021 00:01:30 - INFO - codeparrot_training - Step 7718: {'lr': 0.00047225393451833517, 'samples': 3952128, 'steps': 7718, 'batch_loss/train': 0.7919461075216532} 12/24/2021 00:01:42 - INFO - codeparrot_training - Step 7719: {'lr': 0.0004722467165056004, 'samples': 3952640, 'steps': 7719, 'batch_loss/train': 0.8188716222066432} 12/24/2021 00:01:53 - INFO - codeparrot_training - Step 7720: {'lr': 0.00047223949760929665, 'samples': 3953152, 'steps': 7720, 'batch_loss/train': 0.907781190238893} 12/24/2021 00:02:04 - INFO - codeparrot_training - Step 7721: {'lr': 0.00047223227782945253, 'samples': 3953664, 'steps': 7721, 'batch_loss/train': 0.7816192121244967} 12/24/2021 00:02:18 - INFO - codeparrot_training - 
Step 7722: {'lr': 0.00047222505716609683, 'samples': 3954176, 'steps': 7722, 'batch_loss/train': 0.8614498279057443} 12/24/2021 00:02:28 - INFO - codeparrot_training - Step 7723: {'lr': 0.0004722178356192581, 'samples': 3954688, 'steps': 7723, 'batch_loss/train': 0.7491670716553926} 12/24/2021 00:02:39 - INFO - codeparrot_training - Step 7724: {'lr': 0.0004722106131889653, 'samples': 3955200, 'steps': 7724, 'batch_loss/train': 0.835769618395716} 12/24/2021 00:02:51 - INFO - codeparrot_training - Step 7725: {'lr': 0.0004722033898752469, 'samples': 3955712, 'steps': 7725, 'batch_loss/train': 0.90083690546453} 12/24/2021 00:03:02 - INFO - codeparrot_training - Step 7726: {'lr': 0.00047219616567813183, 'samples': 3956224, 'steps': 7726, 'batch_loss/train': 0.8478480009362102} 12/24/2021 00:03:13 - INFO - codeparrot_training - Step 7727: {'lr': 0.0004721889405976486, 'samples': 3956736, 'steps': 7727, 'batch_loss/train': 0.8739876430481672} 12/24/2021 00:03:25 - INFO - codeparrot_training - Step 7728: {'lr': 0.0004721817146338261, 'samples': 3957248, 'steps': 7728, 'batch_loss/train': 0.8418154940009117} 12/24/2021 00:03:35 - INFO - codeparrot_training - Step 7729: {'lr': 0.00047217448778669305, 'samples': 3957760, 'steps': 7729, 'batch_loss/train': 1.0096426354721189} 12/24/2021 00:03:46 - INFO - codeparrot_training - Step 7730: {'lr': 0.00047216726005627795, 'samples': 3958272, 'steps': 7730, 'batch_loss/train': 0.8059029756113887} 12/24/2021 00:03:56 - INFO - codeparrot_training - Step 7731: {'lr': 0.00047216003144260986, 'samples': 3958784, 'steps': 7731, 'batch_loss/train': 0.6931096198968589} 12/24/2021 00:04:10 - INFO - codeparrot_training - Step 7732: {'lr': 0.0004721528019457173, 'samples': 3959296, 'steps': 7732, 'batch_loss/train': 0.8288763258606195} 12/24/2021 00:04:21 - INFO - codeparrot_training - Step 7733: {'lr': 0.0004721455715656292, 'samples': 3959808, 'steps': 7733, 'batch_loss/train': 0.8850666736252606} 12/24/2021 00:04:31 - INFO - 
codeparrot_training - Step 7734: {'lr': 0.000472138340302374, 'samples': 3960320, 'steps': 7734, 'batch_loss/train': 0.8584707845002413} 12/24/2021 00:04:43 - INFO - codeparrot_training - Step 7735: {'lr': 0.00047213110815598074, 'samples': 3960832, 'steps': 7735, 'batch_loss/train': 0.8660083431750536} 12/24/2021 00:04:54 - INFO - codeparrot_training - Step 7736: {'lr': 0.0004721238751264781, 'samples': 3961344, 'steps': 7736, 'batch_loss/train': 0.7676238059066236} 12/24/2021 00:05:05 - INFO - codeparrot_training - Step 7737: {'lr': 0.00047211664121389477, 'samples': 3961856, 'steps': 7737, 'batch_loss/train': 0.9209858234971762} 12/24/2021 00:05:17 - INFO - codeparrot_training - Step 7738: {'lr': 0.0004721094064182595, 'samples': 3962368, 'steps': 7738, 'batch_loss/train': 0.7654495295137167} 12/24/2021 00:05:28 - INFO - codeparrot_training - Step 7739: {'lr': 0.0004721021707396012, 'samples': 3962880, 'steps': 7739, 'batch_loss/train': 0.8349703219719231} 12/24/2021 00:05:38 - INFO - codeparrot_training - Step 7740: {'lr': 0.0004720949341779484, 'samples': 3963392, 'steps': 7740, 'batch_loss/train': 0.8567635677754879} 12/24/2021 00:05:49 - INFO - codeparrot_training - Step 7741: {'lr': 0.0004720876967333301, 'samples': 3963904, 'steps': 7741, 'batch_loss/train': 0.761165373492986} 12/24/2021 00:06:01 - INFO - codeparrot_training - Step 7742: {'lr': 0.00047208045840577487, 'samples': 3964416, 'steps': 7742, 'batch_loss/train': 0.8126173196360469} 12/24/2021 00:06:11 - INFO - codeparrot_training - Step 7743: {'lr': 0.00047207321919531163, 'samples': 3964928, 'steps': 7743, 'batch_loss/train': 0.9443879649043083} 12/24/2021 00:06:22 - INFO - codeparrot_training - Step 7744: {'lr': 0.0004720659791019692, 'samples': 3965440, 'steps': 7744, 'batch_loss/train': 0.8225607448257506} 12/24/2021 00:06:36 - INFO - codeparrot_training - Step 7745: {'lr': 0.0004720587381257763, 'samples': 3965952, 'steps': 7745, 'batch_loss/train': 0.875044864602387} 12/24/2021 00:06:47 - 
INFO - codeparrot_training - Step 7746: {'lr': 0.0004720514962667616, 'samples': 3966464, 'steps': 7746, 'batch_loss/train': 0.8167591271921992} 12/24/2021 00:06:57 - INFO - codeparrot_training - Step 7747: {'lr': 0.000472044253524954, 'samples': 3966976, 'steps': 7747, 'batch_loss/train': 0.8159863229375333} 12/24/2021 00:07:09 - INFO - codeparrot_training - Step 7748: {'lr': 0.00047203700990038235, 'samples': 3967488, 'steps': 7748, 'batch_loss/train': 0.8636232959106565} 12/24/2021 00:07:20 - INFO - codeparrot_training - Step 7749: {'lr': 0.0004720297653930753, 'samples': 3968000, 'steps': 7749, 'batch_loss/train': 0.7779574198648334} 12/24/2021 00:07:31 - INFO - codeparrot_training - Step 7750: {'lr': 0.00047202252000306177, 'samples': 3968512, 'steps': 7750, 'batch_loss/train': 0.7384155145846307} 12/24/2021 00:07:41 - INFO - codeparrot_training - Step 7751: {'lr': 0.00047201527373037056, 'samples': 3969024, 'steps': 7751, 'batch_loss/train': 0.9063460528850555} 12/24/2021 00:07:56 - INFO - codeparrot_training - Step 7752: {'lr': 0.0004720080265750305, 'samples': 3969536, 'steps': 7752, 'batch_loss/train': 1.0276569542475045} 12/24/2021 00:08:06 - INFO - codeparrot_training - Step 7753: {'lr': 0.0004720007785370703, 'samples': 3970048, 'steps': 7753, 'batch_loss/train': 0.7626035227440298} 12/24/2021 00:08:17 - INFO - codeparrot_training - Step 7754: {'lr': 0.00047199352961651886, 'samples': 3970560, 'steps': 7754, 'batch_loss/train': 0.8188887424767017} 12/24/2021 00:08:29 - INFO - codeparrot_training - Step 7755: {'lr': 0.0004719862798134049, 'samples': 3971072, 'steps': 7755, 'batch_loss/train': 2.6048846198245883} 12/24/2021 00:08:39 - INFO - codeparrot_training - Step 7756: {'lr': 0.00047197902912775736, 'samples': 3971584, 'steps': 7756, 'batch_loss/train': 0.8629233594983816} 12/24/2021 00:08:50 - INFO - codeparrot_training - Step 7757: {'lr': 0.00047197177755960504, 'samples': 3972096, 'steps': 7757, 'batch_loss/train': 0.7417661845684052} 12/24/2021 
00:09:02 - INFO - codeparrot_training - Step 7758: {'lr': 0.0004719645251089767, 'samples': 3972608, 'steps': 7758, 'batch_loss/train': 0.9073602259159088} 12/24/2021 00:09:13 - INFO - codeparrot_training - Step 7759: {'lr': 0.0004719572717759012, 'samples': 3973120, 'steps': 7759, 'batch_loss/train': 0.910034236498177} 12/24/2021 00:09:23 - INFO - codeparrot_training - Step 7760: {'lr': 0.00047195001756040745, 'samples': 3973632, 'steps': 7760, 'batch_loss/train': 0.7275279629975557} 12/24/2021 00:09:34 - INFO - codeparrot_training - Step 7761: {'lr': 0.0004719427624625242, 'samples': 3974144, 'steps': 7761, 'batch_loss/train': 0.7614836413413286} 12/24/2021 00:09:48 - INFO - codeparrot_training - Step 7762: {'lr': 0.00047193550648228036, 'samples': 3974656, 'steps': 7762, 'batch_loss/train': 0.8360555348917842} 12/24/2021 00:09:58 - INFO - codeparrot_training - Step 7763: {'lr': 0.0004719282496197047, 'samples': 3975168, 'steps': 7763, 'batch_loss/train': 0.9108322579413652} 12/24/2021 00:10:09 - INFO - codeparrot_training - Step 7764: {'lr': 0.00047192099187482616, 'samples': 3975680, 'steps': 7764, 'batch_loss/train': 0.8504771608859301} 12/24/2021 00:10:21 - INFO - codeparrot_training - Step 7765: {'lr': 0.0004719137332476735, 'samples': 3976192, 'steps': 7765, 'batch_loss/train': 0.831062126904726} 12/24/2021 00:10:32 - INFO - codeparrot_training - Step 7766: {'lr': 0.0004719064737382757, 'samples': 3976704, 'steps': 7766, 'batch_loss/train': 0.9392175609245896} 12/24/2021 00:10:42 - INFO - codeparrot_training - Step 7767: {'lr': 0.0004718992133466615, 'samples': 3977216, 'steps': 7767, 'batch_loss/train': 0.8817456793040037} 12/24/2021 00:10:56 - INFO - codeparrot_training - Step 7768: {'lr': 0.00047189195207285984, 'samples': 3977728, 'steps': 7768, 'batch_loss/train': 0.948285324499011} 12/24/2021 00:11:07 - INFO - codeparrot_training - Step 7769: {'lr': 0.0004718846899168995, 'samples': 3978240, 'steps': 7769, 'batch_loss/train': 0.8217953760176897} 
12/24/2021 00:11:17 - INFO - codeparrot_training - Step 7770: {'lr': 0.0004718774268788095, 'samples': 3978752, 'steps': 7770, 'batch_loss/train': 0.8044817689806223} 12/24/2021 00:11:28 - INFO - codeparrot_training - Step 7771: {'lr': 0.00047187016295861854, 'samples': 3979264, 'steps': 7771, 'batch_loss/train': 1.4760863175615668} 12/24/2021 00:11:40 - INFO - codeparrot_training - Step 7772: {'lr': 0.00047186289815635563, 'samples': 3979776, 'steps': 7772, 'batch_loss/train': 0.789566199760884} 12/24/2021 00:11:50 - INFO - codeparrot_training - Step 7773: {'lr': 0.00047185563247204964, 'samples': 3980288, 'steps': 7773, 'batch_loss/train': 0.873761317692697} 12/24/2021 00:12:01 - INFO - codeparrot_training - Step 7774: {'lr': 0.0004718483659057294, 'samples': 3980800, 'steps': 7774, 'batch_loss/train': 0.7915242463350296} 12/24/2021 00:12:13 - INFO - codeparrot_training - Step 7775: {'lr': 0.0004718410984574238, 'samples': 3981312, 'steps': 7775, 'batch_loss/train': 0.7570223016664386} 12/24/2021 00:12:24 - INFO - codeparrot_training - Step 7776: {'lr': 0.00047183383012716173, 'samples': 3981824, 'steps': 7776, 'batch_loss/train': 0.8801613755058497} 12/24/2021 00:12:34 - INFO - codeparrot_training - Step 7777: {'lr': 0.00047182656091497224, 'samples': 3982336, 'steps': 7777, 'batch_loss/train': 0.7636796608567238} 12/24/2021 00:12:46 - INFO - codeparrot_training - Step 7778: {'lr': 0.000471819290820884, 'samples': 3982848, 'steps': 7778, 'batch_loss/train': 0.8993507288396358} 12/24/2021 00:12:57 - INFO - codeparrot_training - Step 7779: {'lr': 0.00047181201984492605, 'samples': 3983360, 'steps': 7779, 'batch_loss/train': 0.7946531735360622} 12/24/2021 00:13:07 - INFO - codeparrot_training - Step 7780: {'lr': 0.00047180474798712724, 'samples': 3983872, 'steps': 7780, 'batch_loss/train': 0.8429595977067947} 12/24/2021 00:13:22 - INFO - codeparrot_training - Step 7781: {'lr': 0.0004717974752475165, 'samples': 3984384, 'steps': 7781, 'batch_loss/train': 
0.8921581418253481} 12/24/2021 00:13:32 - INFO - codeparrot_training - Step 7782: {'lr': 0.00047179020162612273, 'samples': 3984896, 'steps': 7782, 'batch_loss/train': 0.9537253230810165} 12/24/2021 00:13:43 - INFO - codeparrot_training - Step 7783: {'lr': 0.0004717829271229749, 'samples': 3985408, 'steps': 7783, 'batch_loss/train': 0.8773607690818608} 12/24/2021 00:13:53 - INFO - codeparrot_training - Step 7784: {'lr': 0.00047177565173810187, 'samples': 3985920, 'steps': 7784, 'batch_loss/train': 0.7833985823672265} 12/24/2021 00:14:06 - INFO - codeparrot_training - Step 7785: {'lr': 0.0004717683754715326, 'samples': 3986432, 'steps': 7785, 'batch_loss/train': 0.8524710964411497} 12/24/2021 00:14:16 - INFO - codeparrot_training - Step 7786: {'lr': 0.0004717610983232961, 'samples': 3986944, 'steps': 7786, 'batch_loss/train': 0.6369265546090901} 12/24/2021 00:14:27 - INFO - codeparrot_training - Step 7787: {'lr': 0.0004717538202934211, 'samples': 3987456, 'steps': 7787, 'batch_loss/train': 0.7142949365079403} 12/24/2021 00:14:39 - INFO - codeparrot_training - Step 7788: {'lr': 0.0004717465413819366, 'samples': 3987968, 'steps': 7788, 'batch_loss/train': 0.8508757473900914} 12/24/2021 00:14:50 - INFO - codeparrot_training - Step 7789: {'lr': 0.00047173926158887165, 'samples': 3988480, 'steps': 7789, 'batch_loss/train': 0.9159912820905447} 12/24/2021 00:15:00 - INFO - codeparrot_training - Step 7790: {'lr': 0.0004717319809142551, 'samples': 3988992, 'steps': 7790, 'batch_loss/train': 0.675512290908955} 12/24/2021 00:15:11 - INFO - codeparrot_training - Step 7791: {'lr': 0.00047172469935811593, 'samples': 3989504, 'steps': 7791, 'batch_loss/train': 0.8082887248601764} 12/24/2021 00:15:25 - INFO - codeparrot_training - Step 7792: {'lr': 0.0004717174169204831, 'samples': 3990016, 'steps': 7792, 'batch_loss/train': 0.799757779110223} 12/24/2021 00:15:35 - INFO - codeparrot_training - Step 7793: {'lr': 0.00047171013360138546, 'samples': 3990528, 'steps': 7793, 
'batch_loss/train': 0.9049045480787754} 12/24/2021 00:15:46 - INFO - codeparrot_training - Step 7794: {'lr': 0.0004717028494008521, 'samples': 3991040, 'steps': 7794, 'batch_loss/train': 0.8153358553536236} 12/24/2021 00:15:58 - INFO - codeparrot_training - Step 7795: {'lr': 0.0004716955643189119, 'samples': 3991552, 'steps': 7795, 'batch_loss/train': 0.9251018827781081} 12/24/2021 00:16:09 - INFO - codeparrot_training - Step 7796: {'lr': 0.0004716882783555939, 'samples': 3992064, 'steps': 7796, 'batch_loss/train': 0.8748686774633825} 12/24/2021 00:16:20 - INFO - codeparrot_training - Step 7797: {'lr': 0.00047168099151092695, 'samples': 3992576, 'steps': 7797, 'batch_loss/train': 0.8806435968726873} 12/24/2021 00:16:33 - INFO - codeparrot_training - Step 7798: {'lr': 0.00047167370378494014, 'samples': 3993088, 'steps': 7798, 'batch_loss/train': 0.8781249523162842} 12/24/2021 00:16:44 - INFO - codeparrot_training - Step 7799: {'lr': 0.0004716664151776623, 'samples': 3993600, 'steps': 7799, 'batch_loss/train': 0.758785480633378} 12/24/2021 00:16:55 - INFO - codeparrot_training - Step 7800: {'lr': 0.00047165912568912255, 'samples': 3994112, 'steps': 7800, 'batch_loss/train': 0.8108290818054229} 12/24/2021 00:17:05 - INFO - codeparrot_training - Step 7801: {'lr': 0.0004716518353193498, 'samples': 3994624, 'steps': 7801, 'batch_loss/train': 0.9647655243752524} 12/24/2021 00:17:17 - INFO - codeparrot_training - Step 7802: {'lr': 0.00047164454406837305, 'samples': 3995136, 'steps': 7802, 'batch_loss/train': 0.698604382108897} 12/24/2021 00:17:28 - INFO - codeparrot_training - Step 7803: {'lr': 0.0004716372519362213, 'samples': 3995648, 'steps': 7803, 'batch_loss/train': 0.740534158423543} 12/24/2021 00:17:38 - INFO - codeparrot_training - Step 7804: {'lr': 0.0004716299589229235, 'samples': 3996160, 'steps': 7804, 'batch_loss/train': 0.7729459688998759} 12/24/2021 00:17:50 - INFO - codeparrot_training - Step 7805: {'lr': 0.00047162266502850865, 'samples': 3996672, 'steps': 
7805, 'batch_loss/train': 1.723342492710799} 12/24/2021 00:18:01 - INFO - codeparrot_training - Step 7806: {'lr': 0.00047161537025300576, 'samples': 3997184, 'steps': 7806, 'batch_loss/train': 0.8372343294322491} 12/24/2021 00:18:12 - INFO - codeparrot_training - Step 7807: {'lr': 0.0004716080745964438, 'samples': 3997696, 'steps': 7807, 'batch_loss/train': 0.8402264155447483} 12/24/2021 00:18:25 - INFO - codeparrot_training - Step 7808: {'lr': 0.0004716007780588519, 'samples': 3998208, 'steps': 7808, 'batch_loss/train': 0.7986358720809221} 12/24/2021 00:18:36 - INFO - codeparrot_training - Step 7809: {'lr': 0.00047159348064025895, 'samples': 3998720, 'steps': 7809, 'batch_loss/train': 0.7867245005909353} 12/24/2021 00:18:46 - INFO - codeparrot_training - Step 7810: {'lr': 0.00047158618234069405, 'samples': 3999232, 'steps': 7810, 'batch_loss/train': 0.858529444783926} 12/24/2021 00:18:57 - INFO - codeparrot_training - Step 7811: {'lr': 0.00047157888316018605, 'samples': 3999744, 'steps': 7811, 'batch_loss/train': 0.7711707446724176} 12/24/2021 00:19:09 - INFO - codeparrot_training - Step 7812: {'lr': 0.00047157158309876415, 'samples': 4000256, 'steps': 7812, 'batch_loss/train': 0.8958355570212007} 12/24/2021 00:19:20 - INFO - codeparrot_training - Step 7813: {'lr': 0.0004715642821564573, 'samples': 4000768, 'steps': 7813, 'batch_loss/train': 0.6718881884589791} 12/24/2021 00:19:31 - INFO - codeparrot_training - Step 7814: {'lr': 0.00047155698033329445, 'samples': 4001280, 'steps': 7814, 'batch_loss/train': 0.7841298989951611} 12/24/2021 00:19:43 - INFO - codeparrot_training - Step 7815: {'lr': 0.0004715496776293048, 'samples': 4001792, 'steps': 7815, 'batch_loss/train': 0.8399813491851091} 12/24/2021 00:19:53 - INFO - codeparrot_training - Step 7816: {'lr': 0.0004715423740445172, 'samples': 4002304, 'steps': 7816, 'batch_loss/train': 0.9275010081473738} 12/24/2021 00:20:04 - INFO - codeparrot_training - Step 7817: {'lr': 0.0004715350695789608, 'samples': 4002816, 
'steps': 7817, 'batch_loss/train': 0.8834222359582782} 12/24/2021 00:20:16 - INFO - codeparrot_training - Step 7818: {'lr': 0.00047152776423266463, 'samples': 4003328, 'steps': 7818, 'batch_loss/train': 0.8491978542879224} 12/24/2021 00:20:26 - INFO - codeparrot_training - Step 7819: {'lr': 0.00047152045800565766, 'samples': 4003840, 'steps': 7819, 'batch_loss/train': 0.8480147952213883} 12/24/2021 00:20:37 - INFO - codeparrot_training - Step 7820: {'lr': 0.0004715131508979691, 'samples': 4004352, 'steps': 7820, 'batch_loss/train': 0.8517689881846309} 12/24/2021 00:20:51 - INFO - codeparrot_training - Step 7821: {'lr': 0.00047150584290962783, 'samples': 4004864, 'steps': 7821, 'batch_loss/train': 0.899832765571773} 12/24/2021 00:21:02 - INFO - codeparrot_training - Step 7822: {'lr': 0.0004714985340406629, 'samples': 4005376, 'steps': 7822, 'batch_loss/train': 0.8600282026454806} 12/24/2021 00:21:12 - INFO - codeparrot_training - Step 7823: {'lr': 0.00047149122429110353, 'samples': 4005888, 'steps': 7823, 'batch_loss/train': 0.8187662633135915} 12/24/2021 00:21:23 - INFO - codeparrot_training - Step 7824: {'lr': 0.00047148391366097866, 'samples': 4006400, 'steps': 7824, 'batch_loss/train': 0.8017391348257661} 12/24/2021 00:21:35 - INFO - codeparrot_training - Step 7825: {'lr': 0.00047147660215031737, 'samples': 4006912, 'steps': 7825, 'batch_loss/train': 0.8291042568162084} 12/24/2021 00:21:45 - INFO - codeparrot_training - Step 7826: {'lr': 0.0004714692897591487, 'samples': 4007424, 'steps': 7826, 'batch_loss/train': 0.8352732290513813} 12/24/2021 00:21:56 - INFO - codeparrot_training - Step 7827: {'lr': 0.00047146197648750176, 'samples': 4007936, 'steps': 7827, 'batch_loss/train': 0.8901313599199057} 12/24/2021 00:22:10 - INFO - codeparrot_training - Step 7828: {'lr': 0.0004714546623354057, 'samples': 4008448, 'steps': 7828, 'batch_loss/train': 0.791309610940516} 12/24/2021 00:22:21 - INFO - codeparrot_training - Step 7829: {'lr': 0.00047144734730288946, 
'samples': 4008960, 'steps': 7829, 'batch_loss/train': 0.9053302491083741}
12/24/2021 00:22:31 - INFO - codeparrot_training - Step 7830: {'lr': 0.0004714400313899822, 'samples': 4009472, 'steps': 7830, 'batch_loss/train': 0.8411408700048923}
12/24/2021 00:22:43 - INFO - codeparrot_training - Step 7831: {'lr': 0.000471432714596713, 'samples': 4009984, 'steps': 7831, 'batch_loss/train': 0.8484299993142486}
12/24/2021 00:22:54 - INFO - codeparrot_training - Step 7832: {'lr': 0.00047142539692311095, 'samples': 4010496, 'steps': 7832, 'batch_loss/train': 0.7927501327358186}
12/24/2021 00:23:04 - INFO - codeparrot_training - Step 7833: {'lr': 0.0004714180783692051, 'samples': 4011008, 'steps': 7833, 'batch_loss/train': 0.83873794041574}
12/24/2021 00:23:16 - INFO - codeparrot_training - Step 7834: {'lr': 0.00047141075893502454, 'samples': 4011520, 'steps': 7834, 'batch_loss/train': 0.9186900332570076}
12/24/2021 00:23:27 - INFO - codeparrot_training - Step 7835: {'lr': 0.0004714034386205984, 'samples': 4012032, 'steps': 7835, 'batch_loss/train': 0.8860266683623195}
12/24/2021 00:23:38 - INFO - codeparrot_training - Step 7836: {'lr': 0.0004713961174259559, 'samples': 4012544, 'steps': 7836, 'batch_loss/train': 0.7619354976341128}
12/24/2021 00:23:48 - INFO - codeparrot_training - Step 7837: {'lr': 0.0004713887953511259, 'samples': 4013056, 'steps': 7837, 'batch_loss/train': 0.904076767154038}
12/24/2021 00:24:02 - INFO - codeparrot_training - Step 7838: {'lr': 0.0004713814723961377, 'samples': 4013568, 'steps': 7838, 'batch_loss/train': 0.7829768741503358}
12/24/2021 00:24:13 - INFO - codeparrot_training - Step 7839: {'lr': 0.00047137414856102035, 'samples': 4014080, 'steps': 7839, 'batch_loss/train': 0.8220225425902754}
12/24/2021 00:24:24 - INFO - codeparrot_training - Step 7840: {'lr': 0.00047136682384580296, 'samples': 4014592, 'steps': 7840, 'batch_loss/train': 0.8136431153398007}
12/24/2021 00:24:36 - INFO - codeparrot_training - Step 7841: {'lr': 0.0004713594982505147, 'samples': 4015104, 'steps': 7841, 'batch_loss/train': 0.8565996321849525}
12/24/2021 00:24:46 - INFO - codeparrot_training - Step 7842: {'lr': 0.00047135217177518465, 'samples': 4015616, 'steps': 7842, 'batch_loss/train': 0.8158771386370063}
12/24/2021 00:24:57 - INFO - codeparrot_training - Step 7843: {'lr': 0.00047134484441984194, 'samples': 4016128, 'steps': 7843, 'batch_loss/train': 0.8124688304960728}
12/24/2021 00:25:09 - INFO - codeparrot_training - Step 7844: {'lr': 0.0004713375161845157, 'samples': 4016640, 'steps': 7844, 'batch_loss/train': 0.8044403535313904}
12/24/2021 00:25:19 - INFO - codeparrot_training - Step 7845: {'lr': 0.0004713301870692351, 'samples': 4017152, 'steps': 7845, 'batch_loss/train': 0.8642160687595606}
12/24/2021 00:25:30 - INFO - codeparrot_training - Step 7846: {'lr': 0.00047132285707402925, 'samples': 4017664, 'steps': 7846, 'batch_loss/train': 0.8093892699107528}
12/24/2021 00:25:41 - INFO - codeparrot_training - Step 7847: {'lr': 0.0004713155261989272, 'samples': 4018176, 'steps': 7847, 'batch_loss/train': 0.8914530896581709}
12/24/2021 00:25:54 - INFO - codeparrot_training - Step 7848: {'lr': 0.0004713081944439582, 'samples': 4018688, 'steps': 7848, 'batch_loss/train': 0.8345168698579073}
12/24/2021 00:26:05 - INFO - codeparrot_training - Step 7849: {'lr': 0.00047130086180915146, 'samples': 4019200, 'steps': 7849, 'batch_loss/train': 0.8679360616952181}
12/24/2021 00:26:16 - INFO - codeparrot_training - Step 7850: {'lr': 0.00047129352829453595, 'samples': 4019712, 'steps': 7850, 'batch_loss/train': 0.9166268287226558}
12/24/2021 00:26:28 - INFO - codeparrot_training - Step 7851: {'lr': 0.000471286193900141, 'samples': 4020224, 'steps': 7851, 'batch_loss/train': 0.8038725992664695}
12/24/2021 00:26:39 - INFO - codeparrot_training - Step 7852: {'lr': 0.0004712788586259957, 'samples': 4020736, 'steps': 7852, 'batch_loss/train': 0.8418117230758071}
12/24/2021 00:26:49 - INFO - codeparrot_training - Step 7853: {'lr': 0.00047127152247212925, 'samples': 4021248, 'steps': 7853, 'batch_loss/train': 0.7576478335540742}
12/24/2021 00:27:01 - INFO - codeparrot_training - Step 7854: {'lr': 0.00047126418543857074, 'samples': 4021760, 'steps': 7854, 'batch_loss/train': 0.7687327973544598}
12/24/2021 00:27:12 - INFO - codeparrot_training - Step 7855: {'lr': 0.00047125684752534936, 'samples': 4022272, 'steps': 7855, 'batch_loss/train': 0.8226547548547387}
12/24/2021 00:27:23 - INFO - codeparrot_training - Step 7856: {'lr': 0.00047124950873249426, 'samples': 4022784, 'steps': 7856, 'batch_loss/train': 0.8226862382143736}
12/24/2021 00:27:33 - INFO - codeparrot_training - Step 7857: {'lr': 0.0004712421690600347, 'samples': 4023296, 'steps': 7857, 'batch_loss/train': 0.7287807101383805}
12/24/2021 00:27:47 - INFO - codeparrot_training - Step 7858: {'lr': 0.0004712348285079998, 'samples': 4023808, 'steps': 7858, 'batch_loss/train': 0.887129352428019}
12/24/2021 00:27:57 - INFO - codeparrot_training - Step 7859: {'lr': 0.00047122748707641876, 'samples': 4024320, 'steps': 7859, 'batch_loss/train': 0.7911770846694708}
12/24/2021 00:28:08 - INFO - codeparrot_training - Step 7860: {'lr': 0.00047122014476532083, 'samples': 4024832, 'steps': 7860, 'batch_loss/train': 0.8217048104852438}
12/24/2021 00:28:20 - INFO - codeparrot_training - Step 7861: {'lr': 0.000471212801574735, 'samples': 4025344, 'steps': 7861, 'batch_loss/train': 0.8023664948996156}
12/24/2021 00:28:31 - INFO - codeparrot_training - Step 7862: {'lr': 0.0004712054575046907, 'samples': 4025856, 'steps': 7862, 'batch_loss/train': 0.8133634035475552}
12/24/2021 00:28:41 - INFO - codeparrot_training - Step 7863: {'lr': 0.000471198112555217, 'samples': 4026368, 'steps': 7863, 'batch_loss/train': 0.9196273148991168}
12/24/2021 00:28:53 - INFO - codeparrot_training - Step 7864: {'lr': 0.0004711907667263431, 'samples': 4026880, 'steps': 7864, 'batch_loss/train': 0.7996344286948442}
12/24/2021 00:29:04 - INFO - codeparrot_training - Step 7865: {'lr': 0.00047118342001809823, 'samples': 4027392, 'steps': 7865, 'batch_loss/train': 0.7790972888469696}
12/24/2021 00:29:15 - INFO - codeparrot_training - Step 7866: {'lr': 0.00047117607243051164, 'samples': 4027904, 'steps': 7866, 'batch_loss/train': 0.8765917802229524}
12/24/2021 00:29:29 - INFO - codeparrot_training - Step 7867: {'lr': 0.0004711687239636124, 'samples': 4028416, 'steps': 7867, 'batch_loss/train': 0.8256662383209914}
12/24/2021 00:29:39 - INFO - codeparrot_training - Step 7868: {'lr': 0.00047116137461742996, 'samples': 4028928, 'steps': 7868, 'batch_loss/train': 0.8900353563949466}
12/24/2021 00:29:50 - INFO - codeparrot_training - Step 7869: {'lr': 0.00047115402439199336, 'samples': 4029440, 'steps': 7869, 'batch_loss/train': 0.9696283489465714}
12/24/2021 00:30:01 - INFO - codeparrot_training - Step 7870: {'lr': 0.00047114667328733184, 'samples': 4029952, 'steps': 7870, 'batch_loss/train': 0.914596438407898}
12/24/2021 00:30:13 - INFO - codeparrot_training - Step 7871: {'lr': 0.00047113932130347463, 'samples': 4030464, 'steps': 7871, 'batch_loss/train': 0.7969304136931896}
12/24/2021 00:30:23 - INFO - codeparrot_training - Step 7872: {'lr': 0.00047113196844045105, 'samples': 4030976, 'steps': 7872, 'batch_loss/train': 0.8692623125389218}
12/24/2021 00:30:34 - INFO - codeparrot_training - Step 7873: {'lr': 0.00047112461469829016, 'samples': 4031488, 'steps': 7873, 'batch_loss/train': 0.9015486519783735}
12/24/2021 00:30:46 - INFO - codeparrot_training - Step 7874: {'lr': 0.0004711172600770214, 'samples': 4032000, 'steps': 7874, 'batch_loss/train': 0.8453183013480157}
12/24/2021 00:30:57 - INFO - codeparrot_training - Step 7875: {'lr': 0.0004711099045766738, 'samples': 4032512, 'steps': 7875, 'batch_loss/train': 0.8765365048311651}
12/24/2021 00:31:07 - INFO - codeparrot_training - Step 7876: {'lr': 0.00047110254819727673, 'samples': 4033024, 'steps': 7876, 'batch_loss/train': 0.8410856993868947}
12/24/2021 00:31:21 - INFO - codeparrot_training - Step 7877: {'lr': 0.00047109519093885944, 'samples': 4033536, 'steps': 7877, 'batch_loss/train': 0.9836047780700028}
12/24/2021 00:31:32 - INFO - codeparrot_training - Step 7878: {'lr': 0.0004710878328014512, 'samples': 4034048, 'steps': 7878, 'batch_loss/train': 0.8639131849631667}
12/24/2021 00:31:42 - INFO - codeparrot_training - Step 7879: {'lr': 0.0004710804737850811, 'samples': 4034560, 'steps': 7879, 'batch_loss/train': 0.9227107735350728}
12/24/2021 00:31:53 - INFO - codeparrot_training - Step 7880: {'lr': 0.00047107311388977865, 'samples': 4035072, 'steps': 7880, 'batch_loss/train': 0.7896317224949598}
12/24/2021 00:32:05 - INFO - codeparrot_training - Step 7881: {'lr': 0.0004710657531155729, 'samples': 4035584, 'steps': 7881, 'batch_loss/train': 0.8138365307822824}
12/24/2021 00:32:16 - INFO - codeparrot_training - Step 7882: {'lr': 0.0004710583914624932, 'samples': 4036096, 'steps': 7882, 'batch_loss/train': 0.8471657722257078}
12/24/2021 00:32:27 - INFO - codeparrot_training - Step 7883: {'lr': 0.00047105102893056877, 'samples': 4036608, 'steps': 7883, 'batch_loss/train': 0.84689231577795}
12/24/2021 00:32:40 - INFO - codeparrot_training - Step 7884: {'lr': 0.0004710436655198289, 'samples': 4037120, 'steps': 7884, 'batch_loss/train': 0.8129935944452882}
12/24/2021 00:32:51 - INFO - codeparrot_training - Step 7885: {'lr': 0.00047103630123030294, 'samples': 4037632, 'steps': 7885, 'batch_loss/train': 0.9182918788865209}
12/24/2021 00:33:02 - INFO - codeparrot_training - Step 7886: {'lr': 0.0004710289360620201, 'samples': 4038144, 'steps': 7886, 'batch_loss/train': 0.8774823267012835}
12/24/2021 00:33:13 - INFO - codeparrot_training - Step 7887: {'lr': 0.00047102157001500966, 'samples': 4038656, 'steps': 7887, 'batch_loss/train': 0.7937908750027418}
12/24/2021 00:33:24 - INFO - codeparrot_training - Step 7888: {'lr': 0.00047101420308930087, 'samples': 4039168, 'steps': 7888, 'batch_loss/train': 0.9192721494473517}
12/24/2021 00:33:35 - INFO - codeparrot_training - Step 7889: {'lr': 0.00047100683528492315, 'samples': 4039680, 'steps': 7889, 'batch_loss/train': 0.8148168222978711}
12/24/2021 00:33:45 - INFO - codeparrot_training - Step 7890: {'lr': 0.0004709994666019056, 'samples': 4040192, 'steps': 7890, 'batch_loss/train': 0.8230028804391623}
12/24/2021 00:33:57 - INFO - codeparrot_training - Step 7891: {'lr': 0.00047099209704027766, 'samples': 4040704, 'steps': 7891, 'batch_loss/train': 0.782289866823703}
12/24/2021 00:34:08 - INFO - codeparrot_training - Step 7892: {'lr': 0.00047098472660006866, 'samples': 4041216, 'steps': 7892, 'batch_loss/train': 0.7935131592676044}
12/24/2021 00:34:18 - INFO - codeparrot_training - Step 7893: {'lr': 0.0004709773552813077, 'samples': 4041728, 'steps': 7893, 'batch_loss/train': 0.831738110166043}
12/24/2021 00:34:31 - INFO - codeparrot_training - Step 7894: {'lr': 0.0004709699830840243, 'samples': 4042240, 'steps': 7894, 'batch_loss/train': 0.9365656785666943}
12/24/2021 00:34:42 - INFO - codeparrot_training - Step 7895: {'lr': 0.00047096261000824764, 'samples': 4042752, 'steps': 7895, 'batch_loss/train': 0.7960719394031912}
12/24/2021 00:34:52 - INFO - codeparrot_training - Step 7896: {'lr': 0.0004709552360540071, 'samples': 4043264, 'steps': 7896, 'batch_loss/train': 1.040725683560595}
12/24/2021 00:35:06 - INFO - codeparrot_training - Step 7897: {'lr': 0.0004709478612213319, 'samples': 4043776, 'steps': 7897, 'batch_loss/train': 0.7832752345129848}
12/24/2021 00:35:16 - INFO - codeparrot_training - Step 7898: {'lr': 0.00047094048551025156, 'samples': 4044288, 'steps': 7898, 'batch_loss/train': 0.7235112586058676}
12/24/2021 00:35:27 - INFO - codeparrot_training - Step 7899: {'lr': 0.00047093310892079517, 'samples': 4044800, 'steps': 7899, 'batch_loss/train': 1.9983975170180202}
12/24/2021 00:35:38 - INFO - codeparrot_training - Step 7900: {'lr': 0.0004709257314529922, 'samples': 4045312, 'steps': 7900, 'batch_loss/train': 0.943635415751487}
12/24/2021 00:35:50 - INFO - codeparrot_training - Step 7901: {'lr': 0.00047091835310687194, 'samples': 4045824, 'steps': 7901, 'batch_loss/train': 0.9284107498824596}
12/24/2021 00:36:00 - INFO - codeparrot_training - Step 7902: {'lr': 0.0004709109738824637, 'samples': 4046336, 'steps': 7902, 'batch_loss/train': 0.9029563022777438}
12/24/2021 00:36:11 - INFO - codeparrot_training - Step 7903: {'lr': 0.00047090359377979687, 'samples': 4046848, 'steps': 7903, 'batch_loss/train': 0.7691407594829798}
12/24/2021 00:36:25 - INFO - codeparrot_training - Step 7904: {'lr': 0.00047089621279890073, 'samples': 4047360, 'steps': 7904, 'batch_loss/train': 1.1343740159645677}
12/24/2021 00:36:36 - INFO - codeparrot_training - Step 7905: {'lr': 0.00047088883093980474, 'samples': 4047872, 'steps': 7905, 'batch_loss/train': 0.8774081682786345}
12/24/2021 00:36:47 - INFO - codeparrot_training - Step 7906: {'lr': 0.00047088144820253805, 'samples': 4048384, 'steps': 7906, 'batch_loss/train': 0.8011233564466238}
12/24/2021 00:36:59 - INFO - codeparrot_training - Step 7907: {'lr': 0.00047087406458713016, 'samples': 4048896, 'steps': 7907, 'batch_loss/train': 0.6516563321929425}
12/24/2021 00:37:09 - INFO - codeparrot_training - Step 7908: {'lr': 0.0004708666800936104, 'samples': 4049408, 'steps': 7908, 'batch_loss/train': 0.7620699936524034}
12/24/2021 00:37:20 - INFO - codeparrot_training - Step 7909: {'lr': 0.00047085929472200804, 'samples': 4049920, 'steps': 7909, 'batch_loss/train': 0.8293116865679622}
12/24/2021 00:37:30 - INFO - codeparrot_training - Step 7910: {'lr': 0.0004708519084723527, 'samples': 4050432, 'steps': 7910, 'batch_loss/train': 0.7664149897173047}
12/24/2021 00:37:44 - INFO - codeparrot_training - Step 7911: {'lr': 0.00047084452134467337, 'samples': 4050944, 'steps': 7911, 'batch_loss/train': 0.8986688312143087}
12/24/2021 00:37:55 - INFO - codeparrot_training - Step 7912: {'lr': 0.00047083713333899977, 'samples': 4051456, 'steps': 7912, 'batch_loss/train': 0.8514395784586668}
12/24/2021 00:38:05 - INFO - codeparrot_training - Step 7913: {'lr': 0.000470829744455361, 'samples': 4051968, 'steps': 7913, 'batch_loss/train': 0.8040567105636001}
12/24/2021 00:38:18 - INFO - codeparrot_training - Step 7914: {'lr': 0.0004708223546937866, 'samples': 4052480, 'steps': 7914, 'batch_loss/train': 0.8137658638879657}
12/24/2021 00:38:28 - INFO - codeparrot_training - Step 7915: {'lr': 0.0004708149640543059, 'samples': 4052992, 'steps': 7915, 'batch_loss/train': 0.8562276866286993}
12/24/2021 00:38:39 - INFO - codeparrot_training - Step 7916: {'lr': 0.00047080757253694825, 'samples': 4053504, 'steps': 7916, 'batch_loss/train': 0.8046953408047557}
12/24/2021 00:38:51 - INFO - codeparrot_training - Step 7917: {'lr': 0.0004708001801417431, 'samples': 4054016, 'steps': 7917, 'batch_loss/train': 0.8238313440233469}
12/24/2021 00:39:01 - INFO - codeparrot_training - Step 7918: {'lr': 0.00047079278686871975, 'samples': 4054528, 'steps': 7918, 'batch_loss/train': 0.7932347245514393}
12/24/2021 00:39:12 - INFO - codeparrot_training - Step 7919: {'lr': 0.00047078539271790766, 'samples': 4055040, 'steps': 7919, 'batch_loss/train': 0.9202991351485252}
12/24/2021 00:39:25 - INFO - codeparrot_training - Step 7920: {'lr': 0.0004707779976893363, 'samples': 4055552, 'steps': 7920, 'batch_loss/train': 0.9834686769172549}
12/24/2021 00:39:36 - INFO - codeparrot_training - Step 7921: {'lr': 0.0004707706017830349, 'samples': 4056064, 'steps': 7921, 'batch_loss/train': 0.8585477937012911}
12/24/2021 00:39:47 - INFO - codeparrot_training - Step 7922: {'lr': 0.0004707632049990329, 'samples': 4056576, 'steps': 7922, 'batch_loss/train': 0.8460835320875049}
12/24/2021 00:39:57 - INFO - codeparrot_training - Step 7923: {'lr': 0.0004707558073373598, 'samples': 4057088, 'steps': 7923, 'batch_loss/train': 0.7614115802571177}
12/24/2021 00:40:09 - INFO - codeparrot_training - Step 7924: {'lr': 0.00047074840879804506, 'samples': 4057600, 'steps': 7924, 'batch_loss/train': 0.841886518523097}
12/24/2021 00:40:20 - INFO - codeparrot_training - Step 7925: {'lr': 0.00047074100938111784, 'samples': 4058112, 'steps': 7925, 'batch_loss/train': 0.8042970588430762}
12/24/2021 00:40:31 - INFO - codeparrot_training - Step 7926: {'lr': 0.0004707336090866078, 'samples': 4058624, 'steps': 7926, 'batch_loss/train': 0.8280403558164835}
12/24/2021 00:40:43 - INFO - codeparrot_training - Step 7927: {'lr': 0.0004707262079145443, 'samples': 4059136, 'steps': 7927, 'batch_loss/train': 0.7844891524873674}
12/24/2021 00:40:54 - INFO - codeparrot_training - Step 7928: {'lr': 0.00047071880586495676, 'samples': 4059648, 'steps': 7928, 'batch_loss/train': 0.6324043869972229}
12/24/2021 00:41:04 - INFO - codeparrot_training - Step 7929: {'lr': 0.0004707114029378745, 'samples': 4060160, 'steps': 7929, 'batch_loss/train': 0.8430750593543053}
12/24/2021 00:41:18 - INFO - codeparrot_training - Step 7930: {'lr': 0.00047070399913332707, 'samples': 4060672, 'steps': 7930, 'batch_loss/train': 0.7829198418185115}
12/24/2021 00:41:29 - INFO - codeparrot_training - Step 7931: {'lr': 0.0004706965944513439, 'samples': 4061184, 'steps': 7931, 'batch_loss/train': 0.826528562232852}
12/24/2021 00:41:39 - INFO - codeparrot_training - Step 7932: {'lr': 0.0004706891888919543, 'samples': 4061696, 'steps': 7932, 'batch_loss/train': 0.7622095765545964}
12/24/2021 00:41:51 - INFO - codeparrot_training - Step 7933: {'lr': 0.00047068178245518796, 'samples': 4062208, 'steps': 7933, 'batch_loss/train': 0.9456988833844662}
12/24/2021 00:42:02 - INFO - codeparrot_training - Step 7934: {'lr': 0.00047067437514107416, 'samples': 4062720, 'steps': 7934, 'batch_loss/train': 0.8787456899881363}
12/24/2021 00:42:12 - INFO - codeparrot_training - Step 7935: {'lr': 0.0004706669669496423, 'samples': 4063232, 'steps': 7935, 'batch_loss/train': 0.7682603280991316}
12/24/2021 00:42:23 - INFO - codeparrot_training - Step 7936: {'lr': 0.0004706595578809219, 'samples': 4063744, 'steps': 7936, 'batch_loss/train': 0.7877903073094785}
12/24/2021 00:42:37 - INFO - codeparrot_training - Step 7937: {'lr': 0.0004706521479349424, 'samples': 4064256, 'steps': 7937, 'batch_loss/train': 0.8791144900023937}
12/24/2021 00:42:47 - INFO - codeparrot_training - Step 7938: {'lr': 0.0004706447371117334, 'samples': 4064768, 'steps': 7938, 'batch_loss/train': 0.8986308416351676}
12/24/2021 00:42:58 - INFO - codeparrot_training - Step 7939: {'lr': 0.00047063732541132407, 'samples': 4065280, 'steps': 7939, 'batch_loss/train': 0.803118702955544}
12/24/2021 00:43:10 - INFO - codeparrot_training - Step 7940: {'lr': 0.00047062991283374416, 'samples': 4065792, 'steps': 7940, 'batch_loss/train': 0.8163053411990404}
12/24/2021 00:43:21 - INFO - codeparrot_training - Step 7941: {'lr': 0.000470622499379023, 'samples': 4066304, 'steps': 7941, 'batch_loss/train': 0.8013449301943183}
12/24/2021 00:43:31 - INFO - codeparrot_training - Step 7942: {'lr': 0.00047061508504719007, 'samples': 4066816, 'steps': 7942, 'batch_loss/train': 0.7556127896532416}
12/24/2021 00:43:43 - INFO - codeparrot_training - Step 7943: {'lr': 0.00047060766983827486, 'samples': 4067328, 'steps': 7943, 'batch_loss/train': 0.7949040811508894}
12/24/2021 00:43:54 - INFO - codeparrot_training - Step 7944: {'lr': 0.0004706002537523069, 'samples': 4067840, 'steps': 7944, 'batch_loss/train': 0.845932430587709}
12/24/2021 00:44:04 - INFO - codeparrot_training - Step 7945: {'lr': 0.0004705928367893155, 'samples': 4068352, 'steps': 7945, 'batch_loss/train': 0.8019810351543128}
12/24/2021 00:44:18 - INFO - codeparrot_training - Step 7946: {'lr': 0.00047058541894933045, 'samples': 4068864, 'steps': 7946, 'batch_loss/train': 0.8156157424673438}
12/24/2021 00:44:29 - INFO - codeparrot_training - Step 7947: {'lr': 0.00047057800023238093, 'samples': 4069376, 'steps': 7947, 'batch_loss/train': 0.8120357859879732}
12/24/2021 00:44:39 - INFO - codeparrot_training - Step 7948: {'lr': 0.00047057058063849666, 'samples': 4069888, 'steps': 7948, 'batch_loss/train': 0.7720008324831724}
12/24/2021 00:44:50 - INFO - codeparrot_training - Step 7949: {'lr': 0.00047056316016770697, 'samples': 4070400, 'steps': 7949, 'batch_loss/train': 0.8221951620653272}
12/24/2021 00:45:02 - INFO - codeparrot_training - Step 7950: {'lr': 0.00047055573882004144, 'samples': 4070912, 'steps': 7950, 'batch_loss/train': 0.7618021252565086}
12/24/2021 00:45:13 - INFO - codeparrot_training - Step 7951: {'lr': 0.0004705483165955297, 'samples': 4071424, 'steps': 7951, 'batch_loss/train': 0.8461802881211042}
12/24/2021 00:45:23 - INFO - codeparrot_training - Step 7952: {'lr': 0.0004705408934942009, 'samples': 4071936, 'steps': 7952, 'batch_loss/train': 0.8238041298463941}
12/24/2021 00:45:35 - INFO - codeparrot_training - Step 7953: {'lr': 0.000470533469516085, 'samples': 4072448, 'steps': 7953, 'batch_loss/train': 0.8183460179716349}
12/24/2021 00:45:46 - INFO - codeparrot_training - Step 7954: {'lr': 0.00047052604466121117, 'samples': 4072960, 'steps': 7954, 'batch_loss/train': 0.9702090881764889}
12/24/2021 00:45:57 - INFO - codeparrot_training - Step 7955: {'lr': 0.0004705186189296091, 'samples': 4073472, 'steps': 7955, 'batch_loss/train': 0.8880026554688811}
12/24/2021 00:46:11 - INFO - codeparrot_training - Step 7956: {'lr': 0.00047051119232130825, 'samples': 4073984, 'steps': 7956, 'batch_loss/train': 0.8115739980712533}
12/24/2021 00:46:21 - INFO - codeparrot_training - Step 7957: {'lr': 0.0004705037648363382, 'samples': 4074496, 'steps': 7957, 'batch_loss/train': 0.8616741057485342}
12/24/2021 00:46:32 - INFO - codeparrot_training - Step 7958: {'lr': 0.00047049633647472833, 'samples': 4075008, 'steps': 7958, 'batch_loss/train': 0.7519678976386786}
12/24/2021 00:46:42 - INFO - codeparrot_training - Step 7959: {'lr': 0.0004704889072365084, 'samples': 4075520, 'steps': 7959, 'batch_loss/train': 1.0339457849040627}
12/24/2021 00:46:54 - INFO - codeparrot_training - Step 7960: {'lr': 0.0004704814771217077, 'samples': 4076032, 'steps': 7960, 'batch_loss/train': 0.9459651727229357}
12/24/2021 00:47:05 - INFO - codeparrot_training - Step 7961: {'lr': 0.00047047404613035607, 'samples': 4076544, 'steps': 7961, 'batch_loss/train': 0.7676090579479933}
12/24/2021 00:47:15 - INFO - codeparrot_training - Step 7962: {'lr': 0.0004704666142624828, 'samples': 4077056, 'steps': 7962, 'batch_loss/train': 0.7892724303528666}
12/24/2021 00:47:28 - INFO - codeparrot_training - Step 7963: {'lr': 0.0004704591815181175, 'samples': 4077568, 'steps': 7963, 'batch_loss/train': 0.818393488181755}
12/24/2021 00:47:38 - INFO - codeparrot_training - Step 7964: {'lr': 0.0004704517478972897, 'samples': 4078080, 'steps': 7964, 'batch_loss/train': 0.7953049272764474}
12/24/2021 00:47:49 - INFO - codeparrot_training - Step 7965: {'lr': 0.000470444313400029, 'samples': 4078592, 'steps': 7965, 'batch_loss/train': 0.8445411259308457}
12/24/2021 00:48:03 - INFO - codeparrot_training - Step 7966: {'lr': 0.000470436878026365, 'samples': 4079104, 'steps': 7966, 'batch_loss/train': 0.836354280821979}
12/24/2021 00:48:13 - INFO - codeparrot_training - Step 7967: {'lr': 0.0004704294417763272, 'samples': 4079616, 'steps': 7967, 'batch_loss/train': 0.6716313089709729}
12/24/2021 00:48:24 - INFO - codeparrot_training - Step 7968: {'lr': 0.00047042200464994516, 'samples': 4080128, 'steps': 7968, 'batch_loss/train': 0.8636500909924507}
12/24/2021 00:48:36 - INFO - codeparrot_training - Step 7969: {'lr': 0.0004704145666472485, 'samples': 4080640, 'steps': 7969, 'batch_loss/train': 0.7810054486617446}
12/24/2021 00:48:46 - INFO - codeparrot_training - Step 7970: {'lr': 0.00047040712776826667, 'samples': 4081152, 'steps': 7970, 'batch_loss/train': 0.8997897021472454}
12/24/2021 00:48:57 - INFO - codeparrot_training - Step 7971: {'lr': 0.00047039968801302934, 'samples': 4081664, 'steps': 7971, 'batch_loss/train': 0.7836417118087411}
12/24/2021 00:49:09 - INFO - codeparrot_training - Step 7972: {'lr': 0.0004703922473815661, 'samples': 4082176, 'steps': 7972, 'batch_loss/train': 0.8149064648896456}
12/24/2021 00:49:20 - INFO - codeparrot_training - Step 7973: {'lr': 0.0004703848058739065, 'samples': 4082688, 'steps': 7973, 'batch_loss/train': 0.7947974953567609}
12/24/2021 00:49:30 - INFO - codeparrot_training - Step 7974: {'lr': 0.0004703773634900801, 'samples': 4083200, 'steps': 7974, 'batch_loss/train': 0.8655062369070947}
12/24/2021 00:49:41 - INFO - codeparrot_training - Step 7975: {'lr': 0.00047036992023011655, 'samples': 4083712, 'steps': 7975, 'batch_loss/train': 0.8375948453322053}
12/24/2021 00:49:55 - INFO - codeparrot_training - Step 7976: {'lr': 0.00047036247609404537, 'samples': 4084224, 'steps': 7976, 'batch_loss/train': 0.8606544435024261}
12/24/2021 00:50:05 - INFO - codeparrot_training - Step 7977: {'lr': 0.0004703550310818962, 'samples': 4084736, 'steps': 7977, 'batch_loss/train': 0.811850672122091}
12/24/2021 00:50:16 - INFO - codeparrot_training - Step 7978: {'lr': 0.00047034758519369865, 'samples': 4085248, 'steps': 7978, 'batch_loss/train': 0.6446453138487414}
12/24/2021 00:50:28 - INFO - codeparrot_training - Step 7979: {'lr': 0.0004703401384294823, 'samples': 4085760, 'steps': 7979, 'batch_loss/train': 0.7818159982562065}
12/24/2021 00:50:38 - INFO - codeparrot_training - Step 7980: {'lr': 0.0004703326907892767, 'samples': 4086272, 'steps': 7980, 'batch_loss/train': 0.7535141073167324}
12/24/2021 00:50:49 - INFO - codeparrot_training - Step 7981: {'lr': 0.0004703252422731116, 'samples': 4086784, 'steps': 7981, 'batch_loss/train': 0.7777614826336503}
12/24/2021 00:51:03 - INFO - codeparrot_training - Step 7982: {'lr': 0.00047031779288101653, 'samples': 4087296, 'steps': 7982, 'batch_loss/train': 0.83891084138304}
12/24/2021 00:51:13 - INFO - codeparrot_training - Step 7983: {'lr': 0.00047031034261302095, 'samples': 4087808, 'steps': 7983, 'batch_loss/train': 0.8347124569118023}
12/24/2021 00:51:24 - INFO - codeparrot_training - Step 7984: {'lr': 0.00047030289146915476, 'samples': 4088320, 'steps': 7984, 'batch_loss/train': 0.9033448356203735}
12/24/2021 00:51:36 - INFO - codeparrot_training - Step 7985: {'lr': 0.0004702954394494474, 'samples': 4088832, 'steps': 7985, 'batch_loss/train': 0.8537045698612928}
12/24/2021 00:51:47 - INFO - codeparrot_training - Step 7986: {'lr': 0.00047028798655392867, 'samples': 4089344, 'steps': 7986, 'batch_loss/train': 0.8960854727774858}
12/24/2021 00:51:57 - INFO - codeparrot_training - Step 7987: {'lr': 0.000470280532782628, 'samples': 4089856, 'steps': 7987, 'batch_loss/train': 0.7184857842512429}
12/24/2021 00:52:08 - INFO - codeparrot_training - Step 7988: {'lr': 0.00047027307813557505, 'samples': 4090368, 'steps': 7988, 'batch_loss/train': 0.976050354540348}
12/24/2021 00:52:20 - INFO - codeparrot_training - Step 7989: {'lr': 0.00047026562261279957, 'samples': 4090880, 'steps': 7989, 'batch_loss/train': 0.7976343315094709}
12/24/2021 00:52:30 - INFO - codeparrot_training - Step 7990: {'lr': 0.0004702581662143312, 'samples': 4091392, 'steps': 7990, 'batch_loss/train': 0.858839837834239}
12/24/2021 00:52:41 - INFO - codeparrot_training - Step 7991: {'lr': 0.0004702507089401995, 'samples': 4091904, 'steps': 7991, 'batch_loss/train': 0.7704543611034751}
12/24/2021 00:52:55 - INFO - codeparrot_training - Step 7992: {'lr': 0.0004702432507904341, 'samples': 4092416, 'steps': 7992, 'batch_loss/train': 0.8648165045306087}
12/24/2021 00:53:05 - INFO - codeparrot_training - Step 7993: {'lr': 0.0004702357917650647, 'samples': 4092928, 'steps': 7993, 'batch_loss/train': 0.806750865187496}
12/24/2021 00:53:16 - INFO - codeparrot_training - Step 7994: {'lr': 0.000470228331864121, 'samples': 4093440, 'steps': 7994, 'batch_loss/train': 0.8187468550167978}
12/24/2021 00:53:28 - INFO - codeparrot_training - Step 7995: {'lr': 0.00047022087108763256, 'samples': 4093952, 'steps': 7995, 'batch_loss/train': 0.8498013578355312}
12/24/2021 00:53:39 - INFO - codeparrot_training - Step 7996: {'lr': 0.00047021340943562904, 'samples': 4094464, 'steps': 7996, 'batch_loss/train': 0.8343112552538514}
12/24/2021 00:53:49 - INFO - codeparrot_training - Step 7997: {'lr': 0.0004702059469081403, 'samples': 4094976, 'steps': 7997, 'batch_loss/train': 0.5878462381660938}
12/24/2021 00:54:01 - INFO - codeparrot_training - Step 7998: {'lr': 0.0004701984835051958, 'samples': 4095488, 'steps': 7998, 'batch_loss/train': 0.8611729049589485}
12/24/2021 00:54:12 - INFO - codeparrot_training - Step 7999: {'lr': 0.0004701910192268252, 'samples': 4096000, 'steps': 7999, 'batch_loss/train': 0.7634253771975636}
12/24/2021 00:54:23 - INFO - codeparrot_training - Step 8000: {'lr': 0.00047018355407305837, 'samples': 4096512, 'steps': 8000, 'batch_loss/train': 0.6532913218252361}
12/24/2021 00:54:33 - INFO - codeparrot_training - Step 8001: {'lr': 0.00047017608804392474, 'samples': 4097024, 'steps': 8001, 'batch_loss/train': 0.7867394266650081}
12/24/2021 00:54:45 - INFO - codeparrot_training - Step 8002: {'lr': 0.0004701686211394542, 'samples': 4097536, 'steps': 8002, 'batch_loss/train': 0.7070368439890444}
12/24/2021 00:54:56 - INFO - codeparrot_training - Step 8003: {'lr': 0.00047016115335967633, 'samples': 4098048, 'steps': 8003, 'batch_loss/train': 0.7923441408202052}
12/24/2021 00:55:06 - INFO - codeparrot_training - Step 8004: {'lr': 0.0004701536847046208, 'samples': 4098560, 'steps': 8004, 'batch_loss/train': 0.8245698409155011}
12/24/2021 00:55:20 - INFO - codeparrot_training - Step 8005: {'lr': 0.00047014621517431745, 'samples': 4099072, 'steps': 8005, 'batch_loss/train': 0.7870648056268692}
12/24/2021 00:55:31 - INFO - codeparrot_training - Step 8006: {'lr': 0.00047013874476879583, 'samples': 4099584, 'steps': 8006, 'batch_loss/train': 0.7886667773127556}
12/24/2021 00:55:41 - INFO - codeparrot_training - Step 8007: {'lr': 0.0004701312734880856, 'samples': 4100096, 'steps': 8007, 'batch_loss/train': 0.7575126299634576}
12/24/2021 00:55:53 - INFO - codeparrot_training - Step 8008: {'lr': 0.00047012380133221664, 'samples': 4100608, 'steps': 8008, 'batch_loss/train': 0.8034833404235542}
12/24/2021 00:56:04 - INFO - codeparrot_training - Step 8009: {'lr': 0.00047011632830121854, 'samples': 4101120, 'steps': 8009, 'batch_loss/train': 0.6872036221902817}
12/24/2021 00:56:15 - INFO - codeparrot_training - Step 8010: {'lr': 0.00047010885439512096, 'samples': 4101632, 'steps': 8010, 'batch_loss/train': 0.7270810866029933}
12/24/2021 00:56:27 - INFO - codeparrot_training - Step 8011: {'lr': 0.0004701013796139538, 'samples': 4102144, 'steps': 8011, 'batch_loss/train': 0.850548040587455}
12/24/2021 00:56:37 - INFO - codeparrot_training - Step 8012: {'lr': 0.0004700939039577466, 'samples': 4102656, 'steps': 8012, 'batch_loss/train': 0.7587267057970166}
12/24/2021 00:56:48 - INFO - codeparrot_training - Step 8013: {'lr': 0.00047008642742652916, 'samples': 4103168, 'steps': 8013, 'batch_loss/train': 0.796598644927144}
12/24/2021 00:56:58 - INFO - codeparrot_training - Step 8014: {'lr': 0.0004700789500203312, 'samples': 4103680, 'steps': 8014, 'batch_loss/train': 0.7723975051194429}
12/24/2021 00:57:12 - INFO - codeparrot_training - Step 8015: {'lr': 0.0004700714717391824, 'samples': 4104192, 'steps': 8015, 'batch_loss/train': 0.832493950612843}
12/24/2021 00:57:23 - INFO - codeparrot_training - Step 8016: {'lr': 0.0004700639925831125, 'samples': 4104704, 'steps': 8016, 'batch_loss/train': 0.8758928813040257}
12/24/2021 00:57:33 - INFO - codeparrot_training - Step 8017: {'lr': 0.00047005651255215133, 'samples': 4105216, 'steps': 8017, 'batch_loss/train': 0.6926622146274894}
12/24/2021 00:57:46 - INFO - codeparrot_training - Step 8018: {'lr': 0.0004700490316463285, 'samples': 4105728, 'steps': 8018, 'batch_loss/train': 0.8135226583108306}
12/24/2021 00:57:56 - INFO - codeparrot_training - Step 8019: {'lr': 0.0004700415498656738, 'samples': 4106240, 'steps': 8019, 'batch_loss/train': 0.8568847053684294}
12/24/2021 00:58:07 - INFO - codeparrot_training - Step 8020: {'lr': 0.000470034067210217, 'samples': 4106752, 'steps': 8020, 'batch_loss/train': 0.8986931014806032}
12/24/2021 00:58:20 - INFO - codeparrot_training - Step 8021: {'lr': 0.0004700265836799879, 'samples': 4107264, 'steps': 8021, 'batch_loss/train': 0.8598317699506879}
12/24/2021 00:58:31 - INFO - codeparrot_training - Step 8022: {'lr': 0.00047001909927501606, 'samples': 4107776, 'steps': 8022, 'batch_loss/train': 0.9015274960547686}
12/24/2021 00:58:42 - INFO - codeparrot_training - Step 8023: {'lr': 0.0004700116139953314, 'samples': 4108288, 'steps': 8023, 'batch_loss/train': 0.9032132402062416}
12/24/2021 00:58:54 - INFO - codeparrot_training - Step 8024: {'lr': 0.00047000412784096367, 'samples': 4108800, 'steps': 8024, 'batch_loss/train': 0.7894640387967229}
12/24/2021 00:59:04 - INFO - codeparrot_training - Step 8025: {'lr': 0.0004699966408119426, 'samples': 4109312, 'steps': 8025, 'batch_loss/train': 0.922550754621625}
12/24/2021 00:59:15 - INFO - codeparrot_training - Step 8026: {'lr': 0.000469989152908298, 'samples': 4109824, 'steps': 8026, 'batch_loss/train': 0.8747835848480463}
12/24/2021 00:59:25 - INFO - codeparrot_training - Step 8027: {'lr': 0.0004699816641300595, 'samples': 4110336, 'steps': 8027, 'batch_loss/train': 0.8285459466278553}
12/24/2021 00:59:37 - INFO - codeparrot_training - Step 8028: {'lr': 0.000469974174477257, 'samples': 4110848, 'steps': 8028, 'batch_loss/train': 0.8707410986535251}
12/24/2021 00:59:48 - INFO - codeparrot_training - Step 8029: {'lr': 0.00046996668394992024, 'samples': 4111360, 'steps': 8029, 'batch_loss/train': 0.819162726867944}
12/24/2021 00:59:59 - INFO - codeparrot_training - Step 8030: {'lr': 0.0004699591925480789, 'samples': 4111872, 'steps': 8030, 'batch_loss/train': 0.8475602492690086}
12/24/2021 01:00:12 - INFO - codeparrot_training - Step 8031: {'lr': 0.000469951700271763, 'samples': 4112384, 'steps': 8031, 'batch_loss/train': 0.7427960708737373}
12/24/2021 01:00:23 - INFO - codeparrot_training - Step 8032: {'lr': 0.0004699442071210022, 'samples': 4112896, 'steps': 8032, 'batch_loss/train': 0.7588359643705189}
12/24/2021 01:00:34 - INFO - codeparrot_training - Step 8033: {'lr': 0.0004699367130958262, 'samples': 4113408, 'steps': 8033, 'batch_loss/train': 0.7730286866426468}
12/24/2021 01:00:46 - INFO - codeparrot_training - Step 8034: {'lr': 0.00046992921819626494, 'samples': 4113920, 'steps': 8034, 'batch_loss/train': 0.8349565230309963}
12/24/2021 01:00:56 - INFO - codeparrot_training - Step 8035: {'lr': 0.0004699217224223481, 'samples': 4114432, 'steps': 8035, 'batch_loss/train': 0.8006269689649343}
12/24/2021 01:01:07 - INFO - codeparrot_training - Step 8036: {'lr': 0.00046991422577410556, 'samples': 4114944, 'steps': 8036, 'batch_loss/train': 0.7330894474871457}
12/24/2021 01:01:20 - INFO - codeparrot_training - Step 8037: {'lr': 0.0004699067282515671, 'samples': 4115456, 'steps': 8037, 'batch_loss/train': 0.7676893544849008}
12/24/2021 01:01:31 - INFO - codeparrot_training - Step 8038: {'lr': 0.00046989922985476247, 'samples': 4115968, 'steps': 8038, 'batch_loss/train': 0.7719701258465648}
12/24/2021 01:01:42 - INFO - codeparrot_training - Step 8039: {'lr': 0.00046989173058372156, 'samples': 4116480, 'steps': 8039, 'batch_loss/train': 0.7833002790575847}
12/24/2021 01:01:54 - INFO - codeparrot_training - Step 8040: {'lr': 0.0004698842304384742, 'samples': 4116992, 'steps': 8040, 'batch_loss/train': 0.8347583962604403}
12/24/2021 01:02:05 - INFO - codeparrot_training - Step 8041: {'lr': 0.00046987672941905016, 'samples': 4117504, 'steps': 8041, 'batch_loss/train': 0.8716572746634483}
12/24/2021 01:02:15 - INFO - codeparrot_training - Step 8042: {'lr': 0.0004698692275254792, 'samples': 4118016, 'steps': 8042, 'batch_loss/train': 0.8842378279659897}
12/24/2021 01:02:26 - INFO - codeparrot_training - Step 8043: {'lr': 0.0004698617247577912, 'samples': 4118528, 'steps': 8043, 'batch_loss/train': 0.8208570489659905}
12/24/2021 01:02:38 - INFO - codeparrot_training - Step 8044: {'lr': 0.00046985422111601603, 'samples': 4119040, 'steps': 8044, 'batch_loss/train': 0.8067697025835514}
12/24/2021 01:02:49 - INFO - codeparrot_training - Step 8045: {'lr': 0.00046984671660018354, 'samples': 4119552, 'steps': 8045, 'batch_loss/train': 0.8311382178217173}
12/24/2021 01:02:59 - INFO - codeparrot_training - Step 8046: {'lr': 0.0004698392112103235, 'samples': 4120064, 'steps': 8046, 'batch_loss/train': 0.8489502361044288}
12/24/2021 01:03:11 - INFO - codeparrot_training - Step 8047: {'lr': 0.0004698317049464657, 'samples': 4120576, 'steps': 8047, 'batch_loss/train': 0.797397696878761}
12/24/2021 01:03:22 - INFO - codeparrot_training - Step 8048: {'lr': 0.0004698241978086401, 'samples': 4121088, 'steps': 8048, 'batch_loss/train': 0.8244850179180503}
12/24/2021 01:03:32 - INFO - codeparrot_training - Step 8049: {'lr': 0.00046981668979687654, 'samples': 4121600, 'steps': 8049, 'batch_loss/train': 0.882295535877347}
12/24/2021 01:03:45 - INFO - codeparrot_training - Step 8050: {'lr': 0.0004698091809112047, 'samples': 4122112, 'steps': 8050, 'batch_loss/train': 0.7233349991147406}
12/24/2021 01:03:56 - INFO - codeparrot_training - Step 8051: {'lr': 0.0004698016711516546, 'samples': 4122624, 'steps': 8051, 'batch_loss/train': 1.0998199693858624}
12/24/2021 01:04:07 - INFO - codeparrot_training - Step 8052: {'lr': 0.00046979416051825607, 'samples': 4123136, 'steps': 8052, 'batch_loss/train': 0.8271635910496116}
12/24/2021 01:04:17 - INFO - codeparrot_training - Step 8053: {'lr': 0.00046978664901103895, 'samples': 4123648, 'steps': 8053, 'batch_loss/train': 0.8044015802443027}
12/24/2021 01:04:29 - INFO - codeparrot_training - Step 8054: {'lr': 0.0004697791366300331, 'samples': 4124160, 'steps': 8054, 'batch_loss/train': 0.8767689159139991}
12/24/2021 01:04:40 - INFO - codeparrot_training - Step 8055: {'lr': 0.0004697716233752684, 'samples': 4124672, 'steps': 8055, 'batch_loss/train': 0.8189774870406836}
12/24/2021 01:04:50 - INFO - codeparrot_training - Step 8056: {'lr': 0.0004697641092467747, 'samples': 4125184, 'steps': 8056, 'batch_loss/train': 0.7303338786587119}
12/24/2021 01:05:03 - INFO - codeparrot_training - Step 8057: {'lr': 0.0004697565942445818, 'samples': 4125696, 'steps': 8057, 'batch_loss/train': 0.8170178905129433}
12/24/2021 01:05:14 - INFO - codeparrot_training - Step 8058: {'lr': 0.0004697490783687198, 'samples': 4126208, 'steps': 8058, 'batch_loss/train': 0.8102014530450106}
12/24/2021 01:05:24 - INFO - codeparrot_training - Step 8059: {'lr': 0.0004697415616192184, 'samples': 4126720, 'steps': 8059, 'batch_loss/train': 0.7975074891000986}
12/24/2021 01:05:37 - INFO - codeparrot_training - Step 8060: {'lr': 0.00046973404399610746, 'samples': 4127232, 'steps': 8060, 'batch_loss/train': 0.6041890308260918}
12/24/2021 01:05:48 - INFO - codeparrot_training - Step 8061: {'lr': 0.000469726525499417, 'samples': 4127744, 'steps': 8061, 'batch_loss/train': 0.8435561815276742}
12/24/2021 01:05:58 - INFO - codeparrot_training - Step 8062: {'lr': 0.00046971900612917676, 'samples': 4128256, 'steps': 8062, 'batch_loss/train': 0.8775535207241774}
12/24/2021 01:06:10 - INFO - codeparrot_training - Step 8063: {'lr': 0.0004697114858854168, 'samples': 4128768, 'steps': 8063, 'batch_loss/train': 0.9689391758292913}
12/24/2021 01:06:21 - INFO - codeparrot_training - Step 8064: {'lr': 0.0004697039647681669, 'samples': 4129280, 'steps': 8064, 'batch_loss/train': 0.8148245196789503}
12/24/2021 01:06:32 - INFO - codeparrot_training - Step 8065: {'lr': 0.0004696964427774569, 'samples': 4129792, 'steps': 8065, 'batch_loss/train': 0.7526918337680399}
12/24/2021 01:06:42 - INFO - codeparrot_training - Step 8066: {'lr': 0.00046968891991331697, 'samples': 4130304, 'steps': 8066, 'batch_loss/train': 0.821310605853796}
12/24/2021 01:06:55 - INFO - codeparrot_training - Step 8067: {'lr': 0.0004696813961757767, 'samples': 4130816, 'steps': 8067, 'batch_loss/train': 0.786530408076942}
12/24/2021 01:07:05 - INFO - codeparrot_training - Step 8068: {'lr': 0.00046967387156486616, 'samples': 4131328, 'steps': 8068, 'batch_loss/train': 0.9037140607833862}
12/24/2021 01:07:16 - INFO - codeparrot_training - Step 8069: {'lr': 0.0004696663460806153, 'samples': 4131840, 'steps': 8069, 'batch_loss/train': 0.8924390751635656}
12/24/2021 01:07:28 - INFO - codeparrot_training - Step 8070: {'lr': 0.000469658819723054, 'samples': 4132352, 'steps': 8070, 'batch_loss/train': 0.7123636456672102}
12/24/2021 01:07:39 - INFO - codeparrot_training - Step 8071: {'lr': 0.00046965129249221217, 'samples': 4132864, 'steps': 8071, 'batch_loss/train': 0.7750352155417204}
12/24/2021 01:07:49 - INFO - codeparrot_training - Step 8072: {'lr': 0.00046964376438811975, 'samples': 4133376, 'steps': 8072, 'batch_loss/train': 0.8123497869819403}
12/24/2021 01:08:01 - INFO - codeparrot_training - Step 8073: {'lr': 0.0004696362354108066, 'samples': 4133888, 'steps': 8073, 'batch_loss/train': 0.798861131304875}
12/24/2021 01:08:12 - INFO - codeparrot_training - Step 8074: {'lr': 0.0004696287055603028, 'samples': 4134400, 'steps': 8074, 'batch_loss/train': 0.7573583712801337}
12/24/2021 01:08:23 - INFO - codeparrot_training - Step 8075: {'lr': 0.00046962117483663815, 'samples': 4134912, 'steps': 8075, 'batch_loss/train': 0.813305689021945}
12/24/2021 01:08:35 - INFO - codeparrot_training - Step 8076: {'lr': 0.0004696136432398426, 'samples': 4135424, 'steps': 8076, 'batch_loss/train': 0.724502821918577}
12/24/2021 01:08:46 - INFO - codeparrot_training - Step 8077: {'lr': 0.00046960611076994613, 'samples': 4135936, 'steps': 8077, 'batch_loss/train': 0.7972329342737794}
12/24/2021 01:08:57 - INFO - codeparrot_training - Step 8078: {'lr': 0.00046959857742697874, 'samples': 4136448, 'steps': 8078, 'batch_loss/train': 0.8798511363565922}
12/24/2021 01:09:07 - INFO - codeparrot_training - Step 8079: {'lr': 0.0004695910432109702, 'samples': 4136960, 'steps': 8079,
'batch_loss/train': 0.6733364519895986} 12/24/2021 01:09:19 - INFO - codeparrot_training - Step 8080: {'lr': 0.00046958350812195074, 'samples': 4137472, 'steps': 8080, 'batch_loss/train': 0.8148229783400893} 12/24/2021 01:09:30 - INFO - codeparrot_training - Step 8081: {'lr': 0.00046957597215995005, 'samples': 4137984, 'steps': 8081, 'batch_loss/train': 0.8306138683110476} 12/24/2021 01:09:41 - INFO - codeparrot_training - Step 8082: {'lr': 0.0004695684353249983, 'samples': 4138496, 'steps': 8082, 'batch_loss/train': 0.8291693953797221} 12/24/2021 01:09:53 - INFO - codeparrot_training - Step 8083: {'lr': 0.00046956089761712527, 'samples': 4139008, 'steps': 8083, 'batch_loss/train': 0.7460682378150523} 12/24/2021 01:10:04 - INFO - codeparrot_training - Step 8084: {'lr': 0.0004695533590363611, 'samples': 4139520, 'steps': 8084, 'batch_loss/train': 0.7020055982284248} 12/24/2021 01:10:14 - INFO - codeparrot_training - Step 8085: {'lr': 0.0004695458195827356, 'samples': 4140032, 'steps': 8085, 'batch_loss/train': 0.9789405213668942} 12/24/2021 01:10:27 - INFO - codeparrot_training - Step 8086: {'lr': 0.00046953827925627883, 'samples': 4140544, 'steps': 8086, 'batch_loss/train': 0.8161655054427683} 12/24/2021 01:10:38 - INFO - codeparrot_training - Step 8087: {'lr': 0.0004695307380570208, 'samples': 4141056, 'steps': 8087, 'batch_loss/train': 0.8348628452513367} 12/24/2021 01:10:48 - INFO - codeparrot_training - Step 8088: {'lr': 0.0004695231959849914, 'samples': 4141568, 'steps': 8088, 'batch_loss/train': 0.8022290477529168} 12/24/2021 01:11:00 - INFO - codeparrot_training - Step 8089: {'lr': 0.0004695156530402207, 'samples': 4142080, 'steps': 8089, 'batch_loss/train': 0.7829772885888815} 12/24/2021 01:11:11 - INFO - codeparrot_training - Step 8090: {'lr': 0.00046950810922273865, 'samples': 4142592, 'steps': 8090, 'batch_loss/train': 0.7271380871534348} 12/24/2021 01:11:22 - INFO - codeparrot_training - Step 8091: {'lr': 0.0004695005645325752, 'samples': 4143104, 
'steps': 8091, 'batch_loss/train': 0.8577045253477991} 12/24/2021 01:11:32 - INFO - codeparrot_training - Step 8092: {'lr': 0.00046949301896976036, 'samples': 4143616, 'steps': 8092, 'batch_loss/train': 0.8382544927299023} 12/24/2021 01:11:46 - INFO - codeparrot_training - Step 8093: {'lr': 0.0004694854725343243, 'samples': 4144128, 'steps': 8093, 'batch_loss/train': 0.8681939835078083} 12/24/2021 01:11:57 - INFO - codeparrot_training - Step 8094: {'lr': 0.0004694779252262967, 'samples': 4144640, 'steps': 8094, 'batch_loss/train': 0.929686357267201} 12/24/2021 01:12:07 - INFO - codeparrot_training - Step 8095: {'lr': 0.00046947037704570785, 'samples': 4145152, 'steps': 8095, 'batch_loss/train': 0.8660520045086741} 12/24/2021 01:12:19 - INFO - codeparrot_training - Step 8096: {'lr': 0.00046946282799258755, 'samples': 4145664, 'steps': 8096, 'batch_loss/train': 0.8905133074149489} 12/24/2021 01:12:30 - INFO - codeparrot_training - Step 8097: {'lr': 0.00046945527806696596, 'samples': 4146176, 'steps': 8097, 'batch_loss/train': 0.7801611954346299} 12/24/2021 01:12:40 - INFO - codeparrot_training - Step 8098: {'lr': 0.00046944772726887306, 'samples': 4146688, 'steps': 8098, 'batch_loss/train': 0.8567924937233329} 12/24/2021 01:12:52 - INFO - codeparrot_training - Step 8099: {'lr': 0.00046944017559833893, 'samples': 4147200, 'steps': 8099, 'batch_loss/train': 0.9743072185665369} 12/24/2021 01:13:03 - INFO - codeparrot_training - Step 8100: {'lr': 0.00046943262305539345, 'samples': 4147712, 'steps': 8100, 'batch_loss/train': 0.8672479670494795} 12/24/2021 01:13:14 - INFO - codeparrot_training - Step 8101: {'lr': 0.0004694250696400667, 'samples': 4148224, 'steps': 8101, 'batch_loss/train': 0.8658440839499235} 12/24/2021 01:13:26 - INFO - codeparrot_training - Step 8102: {'lr': 0.00046941751535238877, 'samples': 4148736, 'steps': 8102, 'batch_loss/train': 0.8771575633436441} 12/24/2021 01:13:37 - INFO - codeparrot_training - Step 8103: {'lr': 0.0004694099601923896, 
'samples': 4149248, 'steps': 8103, 'batch_loss/train': 0.596642991877161} 12/24/2021 01:13:47 - INFO - codeparrot_training - Step 8104: {'lr': 0.00046940240416009927, 'samples': 4149760, 'steps': 8104, 'batch_loss/train': 0.6681339335627854} 12/24/2021 01:13:58 - INFO - codeparrot_training - Step 8105: {'lr': 0.0004693948472555479, 'samples': 4150272, 'steps': 8105, 'batch_loss/train': 0.87276266887784} 12/24/2021 01:14:10 - INFO - codeparrot_training - Step 8106: {'lr': 0.0004693872894787654, 'samples': 4150784, 'steps': 8106, 'batch_loss/train': 0.8315197210758924} 12/24/2021 01:14:21 - INFO - codeparrot_training - Step 8107: {'lr': 0.0004693797308297818, 'samples': 4151296, 'steps': 8107, 'batch_loss/train': 0.7891461306717247} 12/24/2021 01:14:31 - INFO - codeparrot_training - Step 8108: {'lr': 0.0004693721713086273, 'samples': 4151808, 'steps': 8108, 'batch_loss/train': 0.9015859737992287} 12/24/2021 01:14:43 - INFO - codeparrot_training - Step 8109: {'lr': 0.00046936461091533194, 'samples': 4152320, 'steps': 8109, 'batch_loss/train': 0.8461839593946934} 12/24/2021 01:14:54 - INFO - codeparrot_training - Step 8110: {'lr': 0.00046935704964992566, 'samples': 4152832, 'steps': 8110, 'batch_loss/train': 0.7865279437974095} 12/24/2021 01:15:05 - INFO - codeparrot_training - Step 8111: {'lr': 0.00046934948751243855, 'samples': 4153344, 'steps': 8111, 'batch_loss/train': 0.7981072133406997} 12/24/2021 01:15:17 - INFO - codeparrot_training - Step 8112: {'lr': 0.00046934192450290074, 'samples': 4153856, 'steps': 8112, 'batch_loss/train': 1.0280579077079892} 12/24/2021 01:15:27 - INFO - codeparrot_training - Step 8113: {'lr': 0.00046933436062134227, 'samples': 4154368, 'steps': 8113, 'batch_loss/train': 0.8983858628198504} 12/24/2021 01:15:38 - INFO - codeparrot_training - Step 8114: {'lr': 0.00046932679586779316, 'samples': 4154880, 'steps': 8114, 'batch_loss/train': 0.7987134149298072} 12/24/2021 01:15:51 - INFO - codeparrot_training - Step 8115: {'lr': 
0.0004693192302422835, 'samples': 4155392, 'steps': 8115, 'batch_loss/train': 0.9090219950303435} 12/24/2021 01:16:01 - INFO - codeparrot_training - Step 8116: {'lr': 0.00046931166374484346, 'samples': 4155904, 'steps': 8116, 'batch_loss/train': 0.8089235397055745} 12/24/2021 01:16:12 - INFO - codeparrot_training - Step 8117: {'lr': 0.00046930409637550304, 'samples': 4156416, 'steps': 8117, 'batch_loss/train': 1.7404006756842136} 12/24/2021 01:16:22 - INFO - codeparrot_training - Step 8118: {'lr': 0.0004692965281342923, 'samples': 4156928, 'steps': 8118, 'batch_loss/train': 0.8925208123400807} 12/24/2021 01:16:34 - INFO - codeparrot_training - Step 8119: {'lr': 0.0004692889590212414, 'samples': 4157440, 'steps': 8119, 'batch_loss/train': 0.8086405415087938} 12/24/2021 01:16:45 - INFO - codeparrot_training - Step 8120: {'lr': 0.0004692813890363804, 'samples': 4157952, 'steps': 8120, 'batch_loss/train': 0.7786790884565562} 12/24/2021 01:16:56 - INFO - codeparrot_training - Step 8121: {'lr': 0.0004692738181797394, 'samples': 4158464, 'steps': 8121, 'batch_loss/train': 0.7792226406745613} 12/24/2021 01:17:08 - INFO - codeparrot_training - Step 8122: {'lr': 0.0004692662464513484, 'samples': 4158976, 'steps': 8122, 'batch_loss/train': 0.8250976549461484} 12/24/2021 01:17:19 - INFO - codeparrot_training - Step 8123: {'lr': 0.00046925867385123773, 'samples': 4159488, 'steps': 8123, 'batch_loss/train': 0.6857956415042281} 12/24/2021 01:17:30 - INFO - codeparrot_training - Step 8124: {'lr': 0.0004692511003794373, 'samples': 4160000, 'steps': 8124, 'batch_loss/train': 0.8419325118884444} 12/24/2021 01:17:42 - INFO - codeparrot_training - Step 8125: {'lr': 0.0004692435260359773, 'samples': 4160512, 'steps': 8125, 'batch_loss/train': 0.8589951368048787} 12/24/2021 01:17:52 - INFO - codeparrot_training - Step 8126: {'lr': 0.00046923595082088777, 'samples': 4161024, 'steps': 8126, 'batch_loss/train': 0.9157076422125101} 12/24/2021 01:18:03 - INFO - codeparrot_training - Step 
8127: {'lr': 0.00046922837473419894, 'samples': 4161536, 'steps': 8127, 'batch_loss/train': 0.7638069741660729} 12/24/2021 01:18:15 - INFO - codeparrot_training - Step 8128: {'lr': 0.00046922079777594086, 'samples': 4162048, 'steps': 8128, 'batch_loss/train': 0.8441748637706041} 12/24/2021 01:18:25 - INFO - codeparrot_training - Step 8129: {'lr': 0.0004692132199461436, 'samples': 4162560, 'steps': 8129, 'batch_loss/train': 0.796377394348383} 12/24/2021 01:18:36 - INFO - codeparrot_training - Step 8130: {'lr': 0.00046920564124483744, 'samples': 4163072, 'steps': 8130, 'batch_loss/train': 0.81351737678051} 12/24/2021 01:18:47 - INFO - codeparrot_training - Step 8131: {'lr': 0.0004691980616720524, 'samples': 4163584, 'steps': 8131, 'batch_loss/train': 0.695529482094571} 12/24/2021 01:19:00 - INFO - codeparrot_training - Step 8132: {'lr': 0.0004691904812278186, 'samples': 4164096, 'steps': 8132, 'batch_loss/train': 0.8866526037454605} 12/24/2021 01:19:10 - INFO - codeparrot_training - Step 8133: {'lr': 0.0004691828999121662, 'samples': 4164608, 'steps': 8133, 'batch_loss/train': 0.7438464602455497} 12/24/2021 01:19:21 - INFO - codeparrot_training - Step 8134: {'lr': 0.0004691753177251254, 'samples': 4165120, 'steps': 8134, 'batch_loss/train': 0.87485969113186} 12/24/2021 01:19:33 - INFO - codeparrot_training - Step 8135: {'lr': 0.00046916773466672626, 'samples': 4165632, 'steps': 8135, 'batch_loss/train': 0.7146099016536027} 12/24/2021 01:19:44 - INFO - codeparrot_training - Step 8136: {'lr': 0.000469160150736999, 'samples': 4166144, 'steps': 8136, 'batch_loss/train': 0.7755572898313403} 12/24/2021 01:19:54 - INFO - codeparrot_training - Step 8137: {'lr': 0.00046915256593597367, 'samples': 4166656, 'steps': 8137, 'batch_loss/train': 0.8063747454434633} 12/24/2021 01:20:07 - INFO - codeparrot_training - Step 8138: {'lr': 0.0004691449802636805, 'samples': 4167168, 'steps': 8138, 'batch_loss/train': 0.9776166696101427} 12/24/2021 01:20:17 - INFO - codeparrot_training - 
Step 8139: {'lr': 0.00046913739372014964, 'samples': 4167680, 'steps': 8139, 'batch_loss/train': 0.7754268357530236} 12/24/2021 01:20:28 - INFO - codeparrot_training - Step 8140: {'lr': 0.0004691298063054113, 'samples': 4168192, 'steps': 8140, 'batch_loss/train': 0.7826967723667622} 12/24/2021 01:20:39 - INFO - codeparrot_training - Step 8141: {'lr': 0.0004691222180194955, 'samples': 4168704, 'steps': 8141, 'batch_loss/train': 0.6505273751099594} 12/24/2021 01:20:51 - INFO - codeparrot_training - Step 8142: {'lr': 0.00046911462886243264, 'samples': 4169216, 'steps': 8142, 'batch_loss/train': 0.8143896469846368} 12/24/2021 01:21:01 - INFO - codeparrot_training - Step 8143: {'lr': 0.0004691070388342526, 'samples': 4169728, 'steps': 8143, 'batch_loss/train': 0.7522155330516398} 12/24/2021 01:21:12 - INFO - codeparrot_training - Step 8144: {'lr': 0.0004690994479349857, 'samples': 4170240, 'steps': 8144, 'batch_loss/train': 0.6979115400463343} 12/24/2021 01:21:24 - INFO - codeparrot_training - Step 8145: {'lr': 0.0004690918561646622, 'samples': 4170752, 'steps': 8145, 'batch_loss/train': 0.912427787669003} 12/24/2021 01:21:35 - INFO - codeparrot_training - Step 8146: {'lr': 0.00046908426352331226, 'samples': 4171264, 'steps': 8146, 'batch_loss/train': 0.8890111688524485} 12/24/2021 01:21:46 - INFO - codeparrot_training - Step 8147: {'lr': 0.0004690766700109659, 'samples': 4171776, 'steps': 8147, 'batch_loss/train': 0.7926251711323857} 12/24/2021 01:21:58 - INFO - codeparrot_training - Step 8148: {'lr': 0.00046906907562765346, 'samples': 4172288, 'steps': 8148, 'batch_loss/train': 0.7709416002035141} 12/24/2021 01:22:08 - INFO - codeparrot_training - Step 8149: {'lr': 0.0004690614803734051, 'samples': 4172800, 'steps': 8149, 'batch_loss/train': 0.7519162856042385} 12/24/2021 01:22:19 - INFO - codeparrot_training - Step 8150: {'lr': 0.000469053884248251, 'samples': 4173312, 'steps': 8150, 'batch_loss/train': 0.8368464056402445} 12/24/2021 01:22:29 - INFO - 
codeparrot_training - Step 8151: {'lr': 0.00046904628725222145, 'samples': 4173824, 'steps': 8151, 'batch_loss/train': 0.8483164813369513} 12/24/2021 01:22:42 - INFO - codeparrot_training - Step 8152: {'lr': 0.00046903868938534646, 'samples': 4174336, 'steps': 8152, 'batch_loss/train': 0.8010020651854575} 12/24/2021 01:22:53 - INFO - codeparrot_training - Step 8153: {'lr': 0.0004690310906476564, 'samples': 4174848, 'steps': 8153, 'batch_loss/train': 0.6689226890448481} 12/24/2021 01:23:04 - INFO - codeparrot_training - Step 8154: {'lr': 0.0004690234910391814, 'samples': 4175360, 'steps': 8154, 'batch_loss/train': 0.8489601132459939} 12/24/2021 01:23:15 - INFO - codeparrot_training - Step 8155: {'lr': 0.00046901589055995166, 'samples': 4175872, 'steps': 8155, 'batch_loss/train': 0.853793510235846} 12/24/2021 01:23:26 - INFO - codeparrot_training - Step 8156: {'lr': 0.0004690082892099975, 'samples': 4176384, 'steps': 8156, 'batch_loss/train': 0.8248452115803957} 12/24/2021 01:23:37 - INFO - codeparrot_training - Step 8157: {'lr': 0.00046900068698934905, 'samples': 4176896, 'steps': 8157, 'batch_loss/train': 0.7808233788236976} 12/24/2021 01:23:49 - INFO - codeparrot_training - Step 8158: {'lr': 0.00046899308389803654, 'samples': 4177408, 'steps': 8158, 'batch_loss/train': 0.8066940261051059} 12/24/2021 01:23:59 - INFO - codeparrot_training - Step 8159: {'lr': 0.00046898547993609027, 'samples': 4177920, 'steps': 8159, 'batch_loss/train': 0.556179987732321} 12/24/2021 01:24:10 - INFO - codeparrot_training - Step 8160: {'lr': 0.0004689778751035404, 'samples': 4178432, 'steps': 8160, 'batch_loss/train': 0.9054232984781265} 12/24/2021 01:24:23 - INFO - codeparrot_training - Step 8161: {'lr': 0.00046897026940041715, 'samples': 4178944, 'steps': 8161, 'batch_loss/train': 0.8387432573363185} 12/24/2021 01:24:34 - INFO - codeparrot_training - Step 8162: {'lr': 0.0004689626628267508, 'samples': 4179456, 'steps': 8162, 'batch_loss/train': 0.7217128407210112} 12/24/2021 01:24:45 
- INFO - codeparrot_training - Step 8163: {'lr': 0.0004689550553825716, 'samples': 4179968, 'steps': 8163, 'batch_loss/train': 0.8240422012750059} 12/24/2021 01:24:55 - INFO - codeparrot_training - Step 8164: {'lr': 0.00046894744706790976, 'samples': 4180480, 'steps': 8164, 'batch_loss/train': 0.9431132134050131} 12/24/2021 01:25:07 - INFO - codeparrot_training - Step 8165: {'lr': 0.0004689398378827956, 'samples': 4180992, 'steps': 8165, 'batch_loss/train': 1.063586950302124} 12/24/2021 01:25:18 - INFO - codeparrot_training - Step 8166: {'lr': 0.0004689322278272592, 'samples': 4181504, 'steps': 8166, 'batch_loss/train': 0.8514209218556061} 12/24/2021 01:25:29 - INFO - codeparrot_training - Step 8167: {'lr': 0.00046892461690133103, 'samples': 4182016, 'steps': 8167, 'batch_loss/train': 0.6442238371819258} 12/24/2021 01:25:41 - INFO - codeparrot_training - Step 8168: {'lr': 0.0004689170051050412, 'samples': 4182528, 'steps': 8168, 'batch_loss/train': 0.7466184481163509} 12/24/2021 01:25:51 - INFO - codeparrot_training - Step 8169: {'lr': 0.00046890939243842, 'samples': 4183040, 'steps': 8169, 'batch_loss/train': 0.8097984530031681} 12/24/2021 01:26:02 - INFO - codeparrot_training - Step 8170: {'lr': 0.00046890177890149775, 'samples': 4183552, 'steps': 8170, 'batch_loss/train': 0.7921217614784837} 12/24/2021 01:26:15 - INFO - codeparrot_training - Step 8171: {'lr': 0.00046889416449430465, 'samples': 4184064, 'steps': 8171, 'batch_loss/train': 0.6790140139637515} 12/24/2021 01:26:25 - INFO - codeparrot_training - Step 8172: {'lr': 0.000468886549216871, 'samples': 4184576, 'steps': 8172, 'batch_loss/train': 0.7778798210201785} 12/24/2021 01:26:36 - INFO - codeparrot_training - Step 8173: {'lr': 0.0004688789330692271, 'samples': 4185088, 'steps': 8173, 'batch_loss/train': 0.7170129456790164} 12/24/2021 01:26:46 - INFO - codeparrot_training - Step 8174: {'lr': 0.0004688713160514032, 'samples': 4185600, 'steps': 8174, 'batch_loss/train': 0.8320628767833114} 12/24/2021 
01:26:59 - INFO - codeparrot_training - Step 8175: {'lr': 0.0004688636981634296, 'samples': 4186112, 'steps': 8175, 'batch_loss/train': 0.9652504874393344} 12/24/2021 01:27:09 - INFO - codeparrot_training - Step 8176: {'lr': 0.0004688560794053366, 'samples': 4186624, 'steps': 8176, 'batch_loss/train': 0.8816708056256175} 12/24/2021 01:27:20 - INFO - codeparrot_training - Step 8177: {'lr': 0.0004688484597771544, 'samples': 4187136, 'steps': 8177, 'batch_loss/train': 0.8438224685378373} 12/24/2021 01:27:35 - INFO - codeparrot_training - Step 8178: {'lr': 0.0004688408392789134, 'samples': 4187648, 'steps': 8178, 'batch_loss/train': 0.7263777102343738} 12/24/2021 01:27:46 - INFO - codeparrot_training - Step 8179: {'lr': 0.0004688332179106439, 'samples': 4188160, 'steps': 8179, 'batch_loss/train': 0.8438614662736654} 12/24/2021 01:27:56 - INFO - codeparrot_training - Step 8180: {'lr': 0.00046882559567237603, 'samples': 4188672, 'steps': 8180, 'batch_loss/train': 0.7021734198206104} 12/24/2021 01:28:08 - INFO - codeparrot_training - Step 8181: {'lr': 0.00046881797256414034, 'samples': 4189184, 'steps': 8181, 'batch_loss/train': 0.7424788204953074} 12/24/2021 01:28:19 - INFO - codeparrot_training - Step 8182: {'lr': 0.00046881034858596696, 'samples': 4189696, 'steps': 8182, 'batch_loss/train': 0.9075746452435851} 12/24/2021 01:28:30 - INFO - codeparrot_training - Step 8183: {'lr': 0.00046880272373788623, 'samples': 4190208, 'steps': 8183, 'batch_loss/train': 0.8888679007068276} 12/24/2021 01:28:41 - INFO - codeparrot_training - Step 8184: {'lr': 0.0004687950980199285, 'samples': 4190720, 'steps': 8184, 'batch_loss/train': 0.6362574563827366} 12/24/2021 01:28:52 - INFO - codeparrot_training - Step 8185: {'lr': 0.0004687874714321241, 'samples': 4191232, 'steps': 8185, 'batch_loss/train': 0.8628199864178896} 12/24/2021 01:29:03 - INFO - codeparrot_training - Step 8186: {'lr': 0.00046877984397450335, 'samples': 4191744, 'steps': 8186, 'batch_loss/train': 0.7505935695953667} 
12/24/2021 01:29:13 - INFO - codeparrot_training - Step 8187: {'lr': 0.0004687722156470965, 'samples': 4192256, 'steps': 8187, 'batch_loss/train': 0.920768148265779} 12/24/2021 01:29:25 - INFO - codeparrot_training - Step 8188: {'lr': 0.0004687645864499339, 'samples': 4192768, 'steps': 8188, 'batch_loss/train': 0.8338042329996824} 12/24/2021 01:29:36 - INFO - codeparrot_training - Step 8189: {'lr': 0.0004687569563830459, 'samples': 4193280, 'steps': 8189, 'batch_loss/train': 0.8810265716165304} 12/24/2021 01:29:47 - INFO - codeparrot_training - Step 8190: {'lr': 0.000468749325446463, 'samples': 4193792, 'steps': 8190, 'batch_loss/train': 0.85054119839333} 12/24/2021 01:29:59 - INFO - codeparrot_training - Step 8191: {'lr': 0.00046874169364021526, 'samples': 4194304, 'steps': 8191, 'batch_loss/train': 0.8239214951172471} 12/24/2021 01:30:10 - INFO - codeparrot_training - Step 8192: {'lr': 0.00046873406096433314, 'samples': 4194816, 'steps': 8192, 'batch_loss/train': 0.8675544555298984} 12/24/2021 01:30:21 - INFO - codeparrot_training - Step 8193: {'lr': 0.000468726427418847, 'samples': 4195328, 'steps': 8193, 'batch_loss/train': 0.7657534535974264} 12/24/2021 01:30:33 - INFO - codeparrot_training - Step 8194: {'lr': 0.00046871879300378706, 'samples': 4195840, 'steps': 8194, 'batch_loss/train': 0.7784479409456253} 12/24/2021 01:30:43 - INFO - codeparrot_training - Step 8195: {'lr': 0.0004687111577191839, 'samples': 4196352, 'steps': 8195, 'batch_loss/train': 0.8703033113852143} 12/24/2021 01:30:54 - INFO - codeparrot_training - Step 8196: {'lr': 0.0004687035215650678, 'samples': 4196864, 'steps': 8196, 'batch_loss/train': 0.8602029634639621} 12/24/2021 01:31:07 - INFO - codeparrot_training - Step 8197: {'lr': 0.0004686958845414689, 'samples': 4197376, 'steps': 8197, 'batch_loss/train': 0.8053074080962688} 12/24/2021 01:31:17 - INFO - codeparrot_training - Step 8198: {'lr': 0.0004686882466484179, 'samples': 4197888, 'steps': 8198, 'batch_loss/train': 
0.8794714789837599} 12/24/2021 01:31:28 - INFO - codeparrot_training - Step 8199: {'lr': 0.000468680607885945, 'samples': 4198400, 'steps': 8199, 'batch_loss/train': 0.7917143320664763} 12/24/2021 01:31:38 - INFO - codeparrot_training - Step 8200: {'lr': 0.0004686729682540806, 'samples': 4198912, 'steps': 8200, 'batch_loss/train': 0.7544780182652175} 12/24/2021 01:31:51 - INFO - codeparrot_training - Step 8201: {'lr': 0.0004686653277528549, 'samples': 4199424, 'steps': 8201, 'batch_loss/train': 0.7602307691704482} 12/24/2021 01:32:01 - INFO - codeparrot_training - Step 8202: {'lr': 0.00046865768638229853, 'samples': 4199936, 'steps': 8202, 'batch_loss/train': 0.7594827646389604} 12/24/2021 01:32:12 - INFO - codeparrot_training - Step 8203: {'lr': 0.00046865004414244174, 'samples': 4200448, 'steps': 8203, 'batch_loss/train': 0.8113026898354292} 12/24/2021 01:32:24 - INFO - codeparrot_training - Step 8204: {'lr': 0.0004686424010333149, 'samples': 4200960, 'steps': 8204, 'batch_loss/train': 0.8644475303590298} 12/24/2021 01:32:34 - INFO - codeparrot_training - Step 8205: {'lr': 0.00046863475705494846, 'samples': 4201472, 'steps': 8205, 'batch_loss/train': 0.8037847159430385} 12/24/2021 01:32:45 - INFO - codeparrot_training - Step 8206: {'lr': 0.00046862711220737273, 'samples': 4201984, 'steps': 8206, 'batch_loss/train': 0.7806522632017732} 12/24/2021 01:32:58 - INFO - codeparrot_training - Step 8207: {'lr': 0.00046861946649061825, 'samples': 4202496, 'steps': 8207, 'batch_loss/train': 0.8723749304190278} 12/24/2021 01:33:08 - INFO - codeparrot_training - Step 8208: {'lr': 0.0004686118199047153, 'samples': 4203008, 'steps': 8208, 'batch_loss/train': 0.7906655189581215} 12/24/2021 01:33:19 - INFO - codeparrot_training - Step 8209: {'lr': 0.00046860417244969424, 'samples': 4203520, 'steps': 8209, 'batch_loss/train': 0.8495715791359544} 12/24/2021 01:33:31 - INFO - codeparrot_training - Step 8210: {'lr': 0.00046859652412558554, 'samples': 4204032, 'steps': 8210, 
'batch_loss/train': 0.776506993919611} 12/24/2021 01:33:42 - INFO - codeparrot_training - Step 8211: {'lr': 0.0004685888749324196, 'samples': 4204544, 'steps': 8211, 'batch_loss/train': 0.9115230115130544} 12/24/2021 01:33:52 - INFO - codeparrot_training - Step 8212: {'lr': 0.0004685812248702269, 'samples': 4205056, 'steps': 8212, 'batch_loss/train': 0.8600038532167673} 12/24/2021 01:34:03 - INFO - codeparrot_training - Step 8213: {'lr': 0.0004685735739390377, 'samples': 4205568, 'steps': 8213, 'batch_loss/train': 0.8500008648261428} 12/24/2021 01:34:15 - INFO - codeparrot_training - Step 8214: {'lr': 0.0004685659221388826, 'samples': 4206080, 'steps': 8214, 'batch_loss/train': 0.8791760895401239} 12/24/2021 01:34:25 - INFO - codeparrot_training - Step 8215: {'lr': 0.00046855826946979185, 'samples': 4206592, 'steps': 8215, 'batch_loss/train': 0.8729693130590022} 12/24/2021 01:34:36 - INFO - codeparrot_training - Step 8216: {'lr': 0.00046855061593179594, 'samples': 4207104, 'steps': 8216, 'batch_loss/train': 0.7529259889852256} 12/24/2021 01:34:49 - INFO - codeparrot_training - Step 8217: {'lr': 0.00046854296152492523, 'samples': 4207616, 'steps': 8217, 'batch_loss/train': 0.7645709300413728} 12/24/2021 01:34:59 - INFO - codeparrot_training - Step 8218: {'lr': 0.00046853530624921036, 'samples': 4208128, 'steps': 8218, 'batch_loss/train': 0.8051864821463823} 12/24/2021 01:35:10 - INFO - codeparrot_training - Step 8219: {'lr': 0.00046852765010468156, 'samples': 4208640, 'steps': 8219, 'batch_loss/train': 0.8045769138261676} 12/24/2021 01:35:22 - INFO - codeparrot_training - Step 8220: {'lr': 0.0004685199930913693, 'samples': 4209152, 'steps': 8220, 'batch_loss/train': 0.7964840633212589} 12/24/2021 01:35:32 - INFO - codeparrot_training - Step 8221: {'lr': 0.00046851233520930414, 'samples': 4209664, 'steps': 8221, 'batch_loss/train': 0.8967599207535386} 12/24/2021 01:35:43 - INFO - codeparrot_training - Step 8222: {'lr': 0.0004685046764585164, 'samples': 4210176, 
'steps': 8222, 'batch_loss/train': 0.6949585130205378} 12/24/2021 01:35:56 - INFO - codeparrot_training - Step 8223: {'lr': 0.0004684970168390366, 'samples': 4210688, 'steps': 8223, 'batch_loss/train': 0.8723384467884898} 12/24/2021 01:36:06 - INFO - codeparrot_training - Step 8224: {'lr': 0.0004684893563508952, 'samples': 4211200, 'steps': 8224, 'batch_loss/train': 0.8090352485887706} 12/24/2021 01:36:17 - INFO - codeparrot_training - Step 8225: {'lr': 0.0004684816949941225, 'samples': 4211712, 'steps': 8225, 'batch_loss/train': 0.7581093073822558} 12/24/2021 01:36:27 - INFO - codeparrot_training - Step 8226: {'lr': 0.0004684740327687491, 'samples': 4212224, 'steps': 8226, 'batch_loss/train': 0.748026586137712} 12/24/2021 01:36:40 - INFO - codeparrot_training - Step 8227: {'lr': 0.0004684663696748055, 'samples': 4212736, 'steps': 8227, 'batch_loss/train': 0.7005269390065223} 12/24/2021 01:36:50 - INFO - codeparrot_training - Step 8228: {'lr': 0.00046845870571232207, 'samples': 4213248, 'steps': 8228, 'batch_loss/train': 0.7830555029213428} 12/24/2021 01:37:01 - INFO - codeparrot_training - Step 8229: {'lr': 0.0004684510408813293, 'samples': 4213760, 'steps': 8229, 'batch_loss/train': 0.5851048125186935} 12/24/2021 01:37:13 - INFO - codeparrot_training - Step 8230: {'lr': 0.0004684433751818577, 'samples': 4214272, 'steps': 8230, 'batch_loss/train': 0.8372359797358513} 12/24/2021 01:37:24 - INFO - codeparrot_training - Step 8231: {'lr': 0.0004684357086139377, 'samples': 4214784, 'steps': 8231, 'batch_loss/train': 0.6667964695370756} 12/24/2021 01:37:34 - INFO - codeparrot_training - Step 8232: {'lr': 0.00046842804117759986, 'samples': 4215296, 'steps': 8232, 'batch_loss/train': 0.82414760440588} 12/24/2021 01:37:46 - INFO - codeparrot_training - Step 8233: {'lr': 0.00046842037287287455, 'samples': 4215808, 'steps': 8233, 'batch_loss/train': 0.5351813272573054} 12/24/2021 01:37:57 - INFO - codeparrot_training - Step 8234: {'lr': 0.00046841270369979226, 'samples': 
4216320, 'steps': 8234, 'batch_loss/train': 0.8512416295707226} 12/24/2021 01:38:07 - INFO - codeparrot_training - Step 8235: {'lr': 0.0004684050336583836, 'samples': 4216832, 'steps': 8235, 'batch_loss/train': 0.7996062459424138} 12/24/2021 01:38:20 - INFO - codeparrot_training - Step 8236: {'lr': 0.00046839736274867903, 'samples': 4217344, 'steps': 8236, 'batch_loss/train': 0.7601207755506039} 12/24/2021 01:38:31 - INFO - codeparrot_training - Step 8237: {'lr': 0.0004683896909707089, 'samples': 4217856, 'steps': 8237, 'batch_loss/train': 0.8505480284802616} 12/24/2021 01:38:41 - INFO - codeparrot_training - Step 8238: {'lr': 0.00046838201832450396, 'samples': 4218368, 'steps': 8238, 'batch_loss/train': 0.8038338535698131} 12/24/2021 01:38:52 - INFO - codeparrot_training - Step 8239: {'lr': 0.00046837434481009446, 'samples': 4218880, 'steps': 8239, 'batch_loss/train': 0.7755146059207618} 12/24/2021 01:39:04 - INFO - codeparrot_training - Step 8240: {'lr': 0.0004683666704275111, 'samples': 4219392, 'steps': 8240, 'batch_loss/train': 0.8540404019877315} 12/24/2021 01:39:15 - INFO - codeparrot_training - Step 8241: {'lr': 0.00046835899517678423, 'samples': 4219904, 'steps': 8241, 'batch_loss/train': 0.7279643039219081} 12/24/2021 01:39:25 - INFO - codeparrot_training - Step 8242: {'lr': 0.0004683513190579445, 'samples': 4220416, 'steps': 8242, 'batch_loss/train': 0.8025240986607969} 12/24/2021 01:39:37 - INFO - codeparrot_training - Step 8243: {'lr': 0.0004683436420710223, 'samples': 4220928, 'steps': 8243, 'batch_loss/train': 0.7725923780817538} 12/24/2021 01:39:48 - INFO - codeparrot_training - Step 8244: {'lr': 0.00046833596421604827, 'samples': 4221440, 'steps': 8244, 'batch_loss/train': 0.8567032795399427} 12/24/2021 01:39:58 - INFO - codeparrot_training - Step 8245: {'lr': 0.000468328285493053, 'samples': 4221952, 'steps': 8245, 'batch_loss/train': 0.8504454325884581} 12/24/2021 01:40:11 - INFO - codeparrot_training - Step 8246: {'lr': 0.00046832060590206673, 
'samples': 4222464, 'steps': 8246, 'batch_loss/train': 0.7642542929388583} 12/24/2021 01:40:22 - INFO - codeparrot_training - Step 8247: {'lr': 0.0004683129254431202, 'samples': 4222976, 'steps': 8247, 'batch_loss/train': 0.7327238684520125} 12/24/2021 01:40:32 - INFO - codeparrot_training - Step 8248: {'lr': 0.00046830524411624395, 'samples': 4223488, 'steps': 8248, 'batch_loss/train': 0.7751095282146707} 12/24/2021 01:40:44 - INFO - codeparrot_training - Step 8249: {'lr': 0.00046829756192146845, 'samples': 4224000, 'steps': 8249, 'batch_loss/train': 0.9061227524653077} 12/24/2021 01:40:55 - INFO - codeparrot_training - Step 8250: {'lr': 0.0004682898788588243, 'samples': 4224512, 'steps': 8250, 'batch_loss/train': 0.8472973080351949} 12/24/2021 01:41:06 - INFO - codeparrot_training - Step 8251: {'lr': 0.000468282194928342, 'samples': 4225024, 'steps': 8251, 'batch_loss/train': 0.8289303556084633} 12/24/2021 01:41:16 - INFO - codeparrot_training - Step 8252: {'lr': 0.0004682745101300521, 'samples': 4225536, 'steps': 8252, 'batch_loss/train': 0.8834818722680211} 12/24/2021 01:41:29 - INFO - codeparrot_training - Step 8253: {'lr': 0.00046826682446398515, 'samples': 4226048, 'steps': 8253, 'batch_loss/train': 0.7722612284123898} 12/24/2021 01:41:40 - INFO - codeparrot_training - Step 8254: {'lr': 0.0004682591379301718, 'samples': 4226560, 'steps': 8254, 'batch_loss/train': 0.8532570349052548} 12/24/2021 01:41:50 - INFO - codeparrot_training - Step 8255: {'lr': 0.0004682514505286425, 'samples': 4227072, 'steps': 8255, 'batch_loss/train': 0.8178761056624353} 12/24/2021 01:42:02 - INFO - codeparrot_training - Step 8256: {'lr': 0.0004682437622594278, 'samples': 4227584, 'steps': 8256, 'batch_loss/train': 0.8304248061031103} 12/24/2021 01:42:13 - INFO - codeparrot_training - Step 8257: {'lr': 0.0004682360731225583, 'samples': 4228096, 'steps': 8257, 'batch_loss/train': 0.7347996339667588} 12/24/2021 01:42:24 - INFO - codeparrot_training - Step 8258: {'lr': 
0.0004682283831180646, 'samples': 4228608, 'steps': 8258, 'batch_loss/train': 0.7756019253283739} 12/24/2021 01:42:36 - INFO - codeparrot_training - Step 8259: {'lr': 0.0004682206922459773, 'samples': 4229120, 'steps': 8259, 'batch_loss/train': 1.0324527681805193} 12/24/2021 01:42:47 - INFO - codeparrot_training - Step 8260: {'lr': 0.00046821300050632684, 'samples': 4229632, 'steps': 8260, 'batch_loss/train': 0.6877369369613007} 12/24/2021 01:42:57 - INFO - codeparrot_training - Step 8261: {'lr': 0.000468205307899144, 'samples': 4230144, 'steps': 8261, 'batch_loss/train': 0.8485681554302573} 12/24/2021 01:43:08 - INFO - codeparrot_training - Step 8262: {'lr': 0.00046819761442445907, 'samples': 4230656, 'steps': 8262, 'batch_loss/train': 0.7906777061289176} 12/24/2021 01:43:21 - INFO - codeparrot_training - Step 8263: {'lr': 0.0004681899200823029, 'samples': 4231168, 'steps': 8263, 'batch_loss/train': 0.8196446250076406} 12/24/2021 01:43:32 - INFO - codeparrot_training - Step 8264: {'lr': 0.00046818222487270605, 'samples': 4231680, 'steps': 8264, 'batch_loss/train': 0.7288494193926454} 12/24/2021 01:43:42 - INFO - codeparrot_training - Step 8265: {'lr': 0.000468174528795699, 'samples': 4232192, 'steps': 8265, 'batch_loss/train': 0.6614438090473413} 12/24/2021 01:43:54 - INFO - codeparrot_training - Step 8266: {'lr': 0.00046816683185131245, 'samples': 4232704, 'steps': 8266, 'batch_loss/train': 0.6629885563161224} 12/24/2021 01:44:05 - INFO - codeparrot_training - Step 8267: {'lr': 0.00046815913403957686, 'samples': 4233216, 'steps': 8267, 'batch_loss/train': 0.8077113749459386} 12/24/2021 01:44:15 - INFO - codeparrot_training - Step 8268: {'lr': 0.00046815143536052295, 'samples': 4233728, 'steps': 8268, 'batch_loss/train': 0.757072567124851} 12/24/2021 01:44:27 - INFO - codeparrot_training - Step 8269: {'lr': 0.0004681437358141814, 'samples': 4234240, 'steps': 8269, 'batch_loss/train': 0.7796398198697716} 12/24/2021 01:44:38 - INFO - codeparrot_training - Step 8270: 
{'lr': 0.00046813603540058266, 'samples': 4234752, 'steps': 8270, 'batch_loss/train': 0.8943899748846889} 12/24/2021 01:44:49 - INFO - codeparrot_training - Step 8271: {'lr': 0.0004681283341197574, 'samples': 4235264, 'steps': 8271, 'batch_loss/train': 0.8326428849250078} 12/24/2021 01:45:01 - INFO - codeparrot_training - Step 8272: {'lr': 0.00046812063197173626, 'samples': 4235776, 'steps': 8272, 'batch_loss/train': 0.842764999717474} 12/24/2021 01:45:11 - INFO - codeparrot_training - Step 8273: {'lr': 0.00046811292895654983, 'samples': 4236288, 'steps': 8273, 'batch_loss/train': 0.9163207598030567} 12/24/2021 01:45:22 - INFO - codeparrot_training - Step 8274: {'lr': 0.0004681052250742288, 'samples': 4236800, 'steps': 8274, 'batch_loss/train': 0.7161209497135133} 12/24/2021 01:45:32 - INFO - codeparrot_training - Step 8275: {'lr': 0.0004680975203248037, 'samples': 4237312, 'steps': 8275, 'batch_loss/train': 0.8757654139772058} 12/24/2021 01:45:45 - INFO - codeparrot_training - Step 8276: {'lr': 0.0004680898147083052, 'samples': 4237824, 'steps': 8276, 'batch_loss/train': 0.6205903519876301} 12/24/2021 01:45:56 - INFO - codeparrot_training - Step 8277: {'lr': 0.00046808210822476393, 'samples': 4238336, 'steps': 8277, 'batch_loss/train': 0.8811161583289504} 12/24/2021 01:46:06 - INFO - codeparrot_training - Step 8278: {'lr': 0.0004680744008742106, 'samples': 4238848, 'steps': 8278, 'batch_loss/train': 0.6982410959899426} 12/24/2021 01:46:18 - INFO - codeparrot_training - Step 8279: {'lr': 0.0004680666926566758, 'samples': 4239360, 'steps': 8279, 'batch_loss/train': 0.7621653063688427} 12/24/2021 01:46:29 - INFO - codeparrot_training - Step 8280: {'lr': 0.00046805898357219015, 'samples': 4239872, 'steps': 8280, 'batch_loss/train': 0.6719521815539338} 12/24/2021 01:46:40 - INFO - codeparrot_training - Step 8281: {'lr': 0.00046805127362078427, 'samples': 4240384, 'steps': 8281, 'batch_loss/train': 0.7733690738677979} 12/24/2021 01:46:52 - INFO - codeparrot_training - 
Step 8282: {'lr': 0.00046804356280248893, 'samples': 4240896, 'steps': 8282, 'batch_loss/train': 0.8124148044735193} 12/24/2021 01:47:03 - INFO - codeparrot_training - Step 8283: {'lr': 0.00046803585111733465, 'samples': 4241408, 'steps': 8283, 'batch_loss/train': 0.7963353088125587} 12/24/2021 01:47:14 - INFO - codeparrot_training - Step 8284: {'lr': 0.0004680281385653522, 'samples': 4241920, 'steps': 8284, 'batch_loss/train': 0.7721483362838626} 12/24/2021 01:47:24 - INFO - codeparrot_training - Step 8285: {'lr': 0.00046802042514657217, 'samples': 4242432, 'steps': 8285, 'batch_loss/train': 0.7922345643746667} 12/24/2021 01:47:36 - INFO - codeparrot_training - Step 8286: {'lr': 0.0004680127108610253, 'samples': 4242944, 'steps': 8286, 'batch_loss/train': 0.6771057518199086} 12/24/2021 01:47:47 - INFO - codeparrot_training - Step 8287: {'lr': 0.00046800499570874213, 'samples': 4243456, 'steps': 8287, 'batch_loss/train': 0.8786126105114818} 12/24/2021 01:47:57 - INFO - codeparrot_training - Step 8288: {'lr': 0.00046799727968975345, 'samples': 4243968, 'steps': 8288, 'batch_loss/train': 0.7761627645231783} 12/24/2021 01:48:10 - INFO - codeparrot_training - Step 8289: {'lr': 0.0004679895628040899, 'samples': 4244480, 'steps': 8289, 'batch_loss/train': 0.6889806401450187} 12/24/2021 01:48:20 - INFO - codeparrot_training - Step 8290: {'lr': 0.00046798184505178215, 'samples': 4244992, 'steps': 8290, 'batch_loss/train': 0.611185984686017} 12/24/2021 01:48:31 - INFO - codeparrot_training - Step 8291: {'lr': 0.00046797412643286083, 'samples': 4245504, 'steps': 8291, 'batch_loss/train': 0.8244266249239445} 12/24/2021 01:48:44 - INFO - codeparrot_training - Step 8292: {'lr': 0.00046796640694735683, 'samples': 4246016, 'steps': 8292, 'batch_loss/train': 0.8727259989827871} 12/24/2021 01:48:54 - INFO - codeparrot_training - Step 8293: {'lr': 0.0004679586865953006, 'samples': 4246528, 'steps': 8293, 'batch_loss/train': 0.7926787566393614} 12/24/2021 01:49:05 - INFO - 
codeparrot_training - Step 8294: {'lr': 0.00046795096537672297, 'samples': 4247040, 'steps': 8294, 'batch_loss/train': 0.8124474305659533} 12/24/2021 01:49:17 - INFO - codeparrot_training - Step 8295: {'lr': 0.00046794324329165463, 'samples': 4247552, 'steps': 8295, 'batch_loss/train': 0.7606903589330614} 12/24/2021 01:49:28 - INFO - codeparrot_training - Step 8296: {'lr': 0.00046793552034012613, 'samples': 4248064, 'steps': 8296, 'batch_loss/train': 0.7666258756071329} 12/24/2021 01:49:38 - INFO - codeparrot_training - Step 8297: {'lr': 0.0004679277965221684, 'samples': 4248576, 'steps': 8297, 'batch_loss/train': 0.8349676178768277} 12/24/2021 01:49:49 - INFO - codeparrot_training - Step 8298: {'lr': 0.000467920071837812, 'samples': 4249088, 'steps': 8298, 'batch_loss/train': 0.8475632378831506} 12/24/2021 01:50:01 - INFO - codeparrot_training - Step 8299: {'lr': 0.00046791234628708767, 'samples': 4249600, 'steps': 8299, 'batch_loss/train': 0.6283079388667829} 12/24/2021 01:50:12 - INFO - codeparrot_training - Step 8300: {'lr': 0.0004679046198700261, 'samples': 4250112, 'steps': 8300, 'batch_loss/train': 0.9154185829684138} 12/24/2021 01:50:22 - INFO - codeparrot_training - Step 8301: {'lr': 0.00046789689258665805, 'samples': 4250624, 'steps': 8301, 'batch_loss/train': 1.5702416538260877} 12/24/2021 01:50:35 - INFO - codeparrot_training - Step 8302: {'lr': 0.0004678891644370143, 'samples': 4251136, 'steps': 8302, 'batch_loss/train': 0.8406828455626965} 12/24/2021 01:50:46 - INFO - codeparrot_training - Step 8303: {'lr': 0.00046788143542112544, 'samples': 4251648, 'steps': 8303, 'batch_loss/train': 0.8898604763671756} 12/24/2021 01:50:56 - INFO - codeparrot_training - Step 8304: {'lr': 0.00046787370553902233, 'samples': 4252160, 'steps': 8304, 'batch_loss/train': 0.8127849963493645} 12/24/2021 01:51:08 - INFO - codeparrot_training - Step 8305: {'lr': 0.0004678659747907356, 'samples': 4252672, 'steps': 8305, 'batch_loss/train': 0.8039966970682144} 12/24/2021 
01:51:19 - INFO - codeparrot_training - Step 8306: {'lr': 0.0004678582431762959, 'samples': 4253184, 'steps': 8306, 'batch_loss/train': 0.9236230403184891} 12/24/2021 01:51:30 - INFO - codeparrot_training - Step 8307: {'lr': 0.0004678505106957342, 'samples': 4253696, 'steps': 8307, 'batch_loss/train': 0.8510080259293318} 12/24/2021 01:51:40 - INFO - codeparrot_training - Step 8308: {'lr': 0.0004678427773490811, 'samples': 4254208, 'steps': 8308, 'batch_loss/train': 0.8629073798656464} 12/24/2021 01:51:52 - INFO - codeparrot_training - Step 8309: {'lr': 0.00046783504313636736, 'samples': 4254720, 'steps': 8309, 'batch_loss/train': 0.8090100605040789} 12/24/2021 01:52:03 - INFO - codeparrot_training - Step 8310: {'lr': 0.0004678273080576238, 'samples': 4255232, 'steps': 8310, 'batch_loss/train': 0.7805741699412465} 12/24/2021 01:52:14 - INFO - codeparrot_training - Step 8311: {'lr': 0.00046781957211288106, 'samples': 4255744, 'steps': 8311, 'batch_loss/train': 0.7706043208017945} 12/24/2021 01:52:26 - INFO - codeparrot_training - Step 8312: {'lr': 0.00046781183530217, 'samples': 4256256, 'steps': 8312, 'batch_loss/train': 0.8132316339178942} 12/24/2021 01:52:37 - INFO - codeparrot_training - Step 8313: {'lr': 0.0004678040976255212, 'samples': 4256768, 'steps': 8313, 'batch_loss/train': 0.8047495016362518} 12/24/2021 01:52:48 - INFO - codeparrot_training - Step 8314: {'lr': 0.0004677963590829657, 'samples': 4257280, 'steps': 8314, 'batch_loss/train': 0.7717815429205075} 12/24/2021 01:52:58 - INFO - codeparrot_training - Step 8315: {'lr': 0.00046778861967453403, 'samples': 4257792, 'steps': 8315, 'batch_loss/train': 0.9234474347904325} 12/24/2021 01:53:10 - INFO - codeparrot_training - Step 8316: {'lr': 0.00046778087940025704, 'samples': 4258304, 'steps': 8316, 'batch_loss/train': 0.8127664597705007} 12/24/2021 01:53:21 - INFO - codeparrot_training - Step 8317: {'lr': 0.0004677731382601655, 'samples': 4258816, 'steps': 8317, 'batch_loss/train': 0.8288415507413447} 
12/24/2021 01:53:31 - INFO - codeparrot_training - Step 8318: {'lr': 0.00046776539625429026, 'samples': 4259328, 'steps': 8318, 'batch_loss/train': 0.8119003162719309} 12/24/2021 01:53:43 - INFO - codeparrot_training - Step 8319: {'lr': 0.00046775765338266195, 'samples': 4259840, 'steps': 8319, 'batch_loss/train': 0.761105552315712} 12/24/2021 01:53:54 - INFO - codeparrot_training - Step 8320: {'lr': 0.0004677499096453115, 'samples': 4260352, 'steps': 8320, 'batch_loss/train': 0.7585428459569812} 12/24/2021 01:54:05 - INFO - codeparrot_training - Step 8321: {'lr': 0.0004677421650422696, 'samples': 4260864, 'steps': 8321, 'batch_loss/train': 0.784552313387394} 12/24/2021 01:54:17 - INFO - codeparrot_training - Step 8322: {'lr': 0.00046773441957356706, 'samples': 4261376, 'steps': 8322, 'batch_loss/train': 0.8045941451564431} 12/24/2021 01:54:28 - INFO - codeparrot_training - Step 8323: {'lr': 0.00046772667323923467, 'samples': 4261888, 'steps': 8323, 'batch_loss/train': 0.8648697347380221} 12/24/2021 01:54:38 - INFO - codeparrot_training - Step 8324: {'lr': 0.00046771892603930323, 'samples': 4262400, 'steps': 8324, 'batch_loss/train': 0.732006648555398} 12/24/2021 01:54:50 - INFO - codeparrot_training - Step 8325: {'lr': 0.0004677111779738036, 'samples': 4262912, 'steps': 8325, 'batch_loss/train': 0.7998258573934436} 12/24/2021 01:55:01 - INFO - codeparrot_training - Step 8326: {'lr': 0.00046770342904276646, 'samples': 4263424, 'steps': 8326, 'batch_loss/train': 0.8622670993208885} 12/24/2021 01:55:11 - INFO - codeparrot_training - Step 8327: {'lr': 0.00046769567924622276, 'samples': 4263936, 'steps': 8327, 'batch_loss/train': 0.7320978054776788} 12/24/2021 01:55:22 - INFO - codeparrot_training - Step 8328: {'lr': 0.0004676879285842032, 'samples': 4264448, 'steps': 8328, 'batch_loss/train': 0.8526444560848176} 12/24/2021 01:55:34 - INFO - codeparrot_training - Step 8329: {'lr': 0.0004676801770567386, 'samples': 4264960, 'steps': 8329, 'batch_loss/train': 
0.8974574357271194} 12/24/2021 01:55:45 - INFO - codeparrot_training - Step 8330: {'lr': 0.0004676724246638598, 'samples': 4265472, 'steps': 8330, 'batch_loss/train': 0.7391330478712916} 12/24/2021 01:55:55 - INFO - codeparrot_training - Step 8331: {'lr': 0.00046766467140559774, 'samples': 4265984, 'steps': 8331, 'batch_loss/train': 0.6747906726086512} 12/24/2021 01:56:08 - INFO - codeparrot_training - Step 8332: {'lr': 0.00046765691728198306, 'samples': 4266496, 'steps': 8332, 'batch_loss/train': 0.8076325198635459} 12/24/2021 01:56:19 - INFO - codeparrot_training - Step 8333: {'lr': 0.00046764916229304667, 'samples': 4267008, 'steps': 8333, 'batch_loss/train': 0.853317707311362} 12/24/2021 01:56:29 - INFO - codeparrot_training - Step 8334: {'lr': 0.0004676414064388194, 'samples': 4267520, 'steps': 8334, 'batch_loss/train': 0.8935528621077538} 12/24/2021 01:56:41 - INFO - codeparrot_training - Step 8335: {'lr': 0.0004676336497193321, 'samples': 4268032, 'steps': 8335, 'batch_loss/train': 0.7758247815072536} 12/24/2021 01:56:52 - INFO - codeparrot_training - Step 8336: {'lr': 0.00046762589213461547, 'samples': 4268544, 'steps': 8336, 'batch_loss/train': 0.8135985191911459} 12/24/2021 01:57:02 - INFO - codeparrot_training - Step 8337: {'lr': 0.0004676181336847006, 'samples': 4269056, 'steps': 8337, 'batch_loss/train': 0.8651492884382606} 12/24/2021 01:57:14 - INFO - codeparrot_training - Step 8338: {'lr': 0.0004676103743696182, 'samples': 4269568, 'steps': 8338, 'batch_loss/train': 0.6593591163400561} 12/24/2021 01:57:25 - INFO - codeparrot_training - Step 8339: {'lr': 0.00046760261418939895, 'samples': 4270080, 'steps': 8339, 'batch_loss/train': 0.8051543217152357} 12/24/2021 01:57:36 - INFO - codeparrot_training - Step 8340: {'lr': 0.000467594853144074, 'samples': 4270592, 'steps': 8340, 'batch_loss/train': 0.8641693787649274} 12/24/2021 01:57:46 - INFO - codeparrot_training - Step 8341: {'lr': 0.00046758709123367404, 'samples': 4271104, 'steps': 8341, 
'batch_loss/train': 0.7137788804247975} 12/24/2021 01:58:01 - INFO - codeparrot_training - Step 8342: {'lr': 0.00046757932845823005, 'samples': 4271616, 'steps': 8342, 'batch_loss/train': 0.9735809080302715} 12/24/2021 01:58:12 - INFO - codeparrot_training - Step 8343: {'lr': 0.00046757156481777275, 'samples': 4272128, 'steps': 8343, 'batch_loss/train': 0.8998000891879201} 12/24/2021 01:58:22 - INFO - codeparrot_training - Step 8344: {'lr': 0.00046756380031233306, 'samples': 4272640, 'steps': 8344, 'batch_loss/train': 0.8848840771242976} 12/24/2021 01:58:34 - INFO - codeparrot_training - Step 8345: {'lr': 0.00046755603494194175, 'samples': 4273152, 'steps': 8345, 'batch_loss/train': 0.8958763498812914} 12/24/2021 01:58:45 - INFO - codeparrot_training - Step 8346: {'lr': 0.0004675482687066299, 'samples': 4273664, 'steps': 8346, 'batch_loss/train': 0.7232481809332967} 12/24/2021 01:58:55 - INFO - codeparrot_training - Step 8347: {'lr': 0.00046754050160642823, 'samples': 4274176, 'steps': 8347, 'batch_loss/train': 0.8475055322051048} 12/24/2021 01:59:08 - INFO - codeparrot_training - Step 8348: {'lr': 0.0004675327336413677, 'samples': 4274688, 'steps': 8348, 'batch_loss/train': 0.8057902920991182} 12/24/2021 01:59:18 - INFO - codeparrot_training - Step 8349: {'lr': 0.00046752496481147917, 'samples': 4275200, 'steps': 8349, 'batch_loss/train': 0.9389029070734978} 12/24/2021 01:59:29 - INFO - codeparrot_training - Step 8350: {'lr': 0.00046751719511679347, 'samples': 4275712, 'steps': 8350, 'batch_loss/train': 0.8075712332502007} 12/24/2021 01:59:43 - INFO - codeparrot_training - Step 8351: {'lr': 0.00046750942455734155, 'samples': 4276224, 'steps': 8351, 'batch_loss/train': 0.8445162605494261} 12/24/2021 01:59:53 - INFO - codeparrot_training - Step 8352: {'lr': 0.00046750165313315427, 'samples': 4276736, 'steps': 8352, 'batch_loss/train': 0.8591178031638265} 12/24/2021 02:00:04 - INFO - codeparrot_training - Step 8353: {'lr': 0.0004674938808442626, 'samples': 4277248, 
'steps': 8353, 'batch_loss/train': 0.857651749625802} 12/24/2021 02:00:15 - INFO - codeparrot_training - Step 8354: {'lr': 0.0004674861076906973, 'samples': 4277760, 'steps': 8354, 'batch_loss/train': 0.7920494759455323} 12/24/2021 02:00:27 - INFO - codeparrot_training - Step 8355: {'lr': 0.00046747833367248945, 'samples': 4278272, 'steps': 8355, 'batch_loss/train': 0.8049938697367907} 12/24/2021 02:00:37 - INFO - codeparrot_training - Step 8356: {'lr': 0.0004674705587896698, 'samples': 4278784, 'steps': 8356, 'batch_loss/train': 0.8870478319004178} 12/24/2021 02:00:48 - INFO - codeparrot_training - Step 8357: {'lr': 0.0004674627830422693, 'samples': 4279296, 'steps': 8357, 'batch_loss/train': 0.8517553536221385} 12/24/2021 02:01:02 - INFO - codeparrot_training - Step 8358: {'lr': 0.000467455006430319, 'samples': 4279808, 'steps': 8358, 'batch_loss/train': 0.8642418114468455} 12/24/2021 02:01:12 - INFO - codeparrot_training - Step 8359: {'lr': 0.00046744722895384954, 'samples': 4280320, 'steps': 8359, 'batch_loss/train': 0.8258759174495935} 12/24/2021 02:01:23 - INFO - codeparrot_training - Step 8360: {'lr': 0.0004674394506128921, 'samples': 4280832, 'steps': 8360, 'batch_loss/train': 0.7302857670001686} 12/24/2021 02:01:35 - INFO - codeparrot_training - Step 8361: {'lr': 0.0004674316714074775, 'samples': 4281344, 'steps': 8361, 'batch_loss/train': 0.7957896711304784} 12/24/2021 02:01:45 - INFO - codeparrot_training - Step 8362: {'lr': 0.00046742389133763663, 'samples': 4281856, 'steps': 8362, 'batch_loss/train': 0.7806524592451751} 12/24/2021 02:01:56 - INFO - codeparrot_training - Step 8363: {'lr': 0.0004674161104034005, 'samples': 4282368, 'steps': 8363, 'batch_loss/train': 0.8087122433353215} 12/24/2021 02:02:07 - INFO - codeparrot_training - Step 8364: {'lr': 0.00046740832860480005, 'samples': 4282880, 'steps': 8364, 'batch_loss/train': 0.7557349288836122} 12/24/2021 02:02:19 - INFO - codeparrot_training - Step 8365: {'lr': 0.0004674005459418661, 'samples': 
4283392, 'steps': 8365, 'batch_loss/train': 0.9202998103573918} 12/24/2021 02:02:29 - INFO - codeparrot_training - Step 8366: {'lr': 0.0004673927624146297, 'samples': 4283904, 'steps': 8366, 'batch_loss/train': 0.8172570893075317} 12/24/2021 02:02:40 - INFO - codeparrot_training - Step 8367: {'lr': 0.00046738497802312174, 'samples': 4284416, 'steps': 8367, 'batch_loss/train': 0.8072945857420564} 12/24/2021 02:02:54 - INFO - codeparrot_training - Step 8368: {'lr': 0.00046737719276737324, 'samples': 4284928, 'steps': 8368, 'batch_loss/train': 0.7964143145363778} 12/24/2021 02:03:04 - INFO - codeparrot_training - Step 8369: {'lr': 0.00046736940664741505, 'samples': 4285440, 'steps': 8369, 'batch_loss/train': 0.7890322268940508} 12/24/2021 02:03:15 - INFO - codeparrot_training - Step 8370: {'lr': 0.00046736161966327817, 'samples': 4285952, 'steps': 8370, 'batch_loss/train': 0.9232767149806023} 12/24/2021 02:03:27 - INFO - codeparrot_training - Step 8371: {'lr': 0.0004673538318149936, 'samples': 4286464, 'steps': 8371, 'batch_loss/train': 0.7945874724537134} 12/24/2021 02:03:38 - INFO - codeparrot_training - Step 8372: {'lr': 0.0004673460431025923, 'samples': 4286976, 'steps': 8372, 'batch_loss/train': 0.7434328077360988} 12/24/2021 02:03:48 - INFO - codeparrot_training - Step 8373: {'lr': 0.00046733825352610514, 'samples': 4287488, 'steps': 8373, 'batch_loss/train': 0.8238679603673518} 12/24/2021 02:03:59 - INFO - codeparrot_training - Step 8374: {'lr': 0.00046733046308556313, 'samples': 4288000, 'steps': 8374, 'batch_loss/train': 0.8485229024663568} 12/24/2021 02:04:11 - INFO - codeparrot_training - Step 8375: {'lr': 0.0004673226717809973, 'samples': 4288512, 'steps': 8375, 'batch_loss/train': 0.8504741722717881} 12/24/2021 02:04:21 - INFO - codeparrot_training - Step 8376: {'lr': 0.0004673148796124386, 'samples': 4289024, 'steps': 8376, 'batch_loss/train': 0.9526247521862388} 12/24/2021 02:04:32 - INFO - codeparrot_training - Step 8377: {'lr': 0.00046730708657991796, 
'samples': 4289536, 'steps': 8377, 'batch_loss/train': 0.8410970540717244} 12/24/2021 02:04:46 - INFO - codeparrot_training - Step 8378: {'lr': 0.0004672992926834664, 'samples': 4290048, 'steps': 8378, 'batch_loss/train': 0.7769662821665406} 12/24/2021 02:04:56 - INFO - codeparrot_training - Step 8379: {'lr': 0.0004672914979231149, 'samples': 4290560, 'steps': 8379, 'batch_loss/train': 0.8456376694375649} 12/24/2021 02:05:07 - INFO - codeparrot_training - Step 8380: {'lr': 0.00046728370229889436, 'samples': 4291072, 'steps': 8380, 'batch_loss/train': 0.906185888685286} 12/24/2021 02:05:19 - INFO - codeparrot_training - Step 8381: {'lr': 0.000467275905810836, 'samples': 4291584, 'steps': 8381, 'batch_loss/train': 0.7563752084970474} 12/24/2021 02:05:30 - INFO - codeparrot_training - Step 8382: {'lr': 0.00046726810845897066, 'samples': 4292096, 'steps': 8382, 'batch_loss/train': 0.8262589310761541} 12/24/2021 02:05:40 - INFO - codeparrot_training - Step 8383: {'lr': 0.00046726031024332934, 'samples': 4292608, 'steps': 8383, 'batch_loss/train': 0.89925875980407} 12/24/2021 02:05:52 - INFO - codeparrot_training - Step 8384: {'lr': 0.0004672525111639431, 'samples': 4293120, 'steps': 8384, 'batch_loss/train': 0.8564584287814796} 12/24/2021 02:06:03 - INFO - codeparrot_training - Step 8385: {'lr': 0.00046724471122084276, 'samples': 4293632, 'steps': 8385, 'batch_loss/train': 0.8026574654504657} 12/24/2021 02:06:14 - INFO - codeparrot_training - Step 8386: {'lr': 0.00046723691041405957, 'samples': 4294144, 'steps': 8386, 'batch_loss/train': 0.9215353704057634} 12/24/2021 02:06:24 - INFO - codeparrot_training - Step 8387: {'lr': 0.0004672291087436245, 'samples': 4294656, 'steps': 8387, 'batch_loss/train': 0.8192179440520704} 12/24/2021 02:06:38 - INFO - codeparrot_training - Step 8388: {'lr': 0.0004672213062095685, 'samples': 4295168, 'steps': 8388, 'batch_loss/train': 0.8506632172502577} 12/24/2021 02:06:49 - INFO - codeparrot_training - Step 8389: {'lr': 
0.00046721350281192254, 'samples': 4295680, 'steps': 8389, 'batch_loss/train': 0.8237409191206098} 12/24/2021 02:07:00 - INFO - codeparrot_training - Step 8390: {'lr': 0.00046720569855071776, 'samples': 4296192, 'steps': 8390, 'batch_loss/train': 0.7428384157828987} 12/24/2021 02:07:12 - INFO - codeparrot_training - Step 8391: {'lr': 0.00046719789342598514, 'samples': 4296704, 'steps': 8391, 'batch_loss/train': 0.959510924294591} 12/24/2021 02:07:22 - INFO - codeparrot_training - Step 8392: {'lr': 0.0004671900874377557, 'samples': 4297216, 'steps': 8392, 'batch_loss/train': 0.512741919781547} 12/24/2021 02:07:33 - INFO - codeparrot_training - Step 8393: {'lr': 0.0004671822805860604, 'samples': 4297728, 'steps': 8393, 'batch_loss/train': 0.9056978635489941} 12/24/2021 02:07:45 - INFO - codeparrot_training - Step 8394: {'lr': 0.00046717447287093045, 'samples': 4298240, 'steps': 8394, 'batch_loss/train': 0.8687518779188395} 12/24/2021 02:07:56 - INFO - codeparrot_training - Step 8395: {'lr': 0.0004671666642923968, 'samples': 4298752, 'steps': 8395, 'batch_loss/train': 0.794582842849195} 12/24/2021 02:08:06 - INFO - codeparrot_training - Step 8396: {'lr': 0.0004671588548504905, 'samples': 4299264, 'steps': 8396, 'batch_loss/train': 0.7979568606242537} 12/24/2021 02:08:20 - INFO - codeparrot_training - Step 8397: {'lr': 0.0004671510445452425, 'samples': 4299776, 'steps': 8397, 'batch_loss/train': 0.860458075068891} 12/24/2021 02:08:31 - INFO - codeparrot_training - Step 8398: {'lr': 0.000467143233376684, 'samples': 4300288, 'steps': 8398, 'batch_loss/train': 0.8049463098868728} 12/24/2021 02:08:41 - INFO - codeparrot_training - Step 8399: {'lr': 0.000467135421344846, 'samples': 4300800, 'steps': 8399, 'batch_loss/train': 0.8693463560193777} 12/24/2021 02:08:52 - INFO - codeparrot_training - Step 8400: {'lr': 0.00046712760844975956, 'samples': 4301312, 'steps': 8400, 'batch_loss/train': 0.8497330397367477} 12/24/2021 02:09:04 - INFO - codeparrot_training - Step 8401: 
{'lr': 0.0004671197946914557, 'samples': 4301824, 'steps': 8401, 'batch_loss/train': 0.8412010293686762} 12/24/2021 02:09:15 - INFO - codeparrot_training - Step 8402: {'lr': 0.00046711198006996554, 'samples': 4302336, 'steps': 8402, 'batch_loss/train': 0.8916558939963579} 12/24/2021 02:09:25 - INFO - codeparrot_training - Step 8403: {'lr': 0.0004671041645853201, 'samples': 4302848, 'steps': 8403, 'batch_loss/train': 0.9114719219505787} 12/24/2021 02:09:39 - INFO - codeparrot_training - Step 8404: {'lr': 0.00046709634823755046, 'samples': 4303360, 'steps': 8404, 'batch_loss/train': 0.8990684505552053} 12/24/2021 02:09:50 - INFO - codeparrot_training - Step 8405: {'lr': 0.0004670885310266878, 'samples': 4303872, 'steps': 8405, 'batch_loss/train': 0.6896335482597351} 12/24/2021 02:10:00 - INFO - codeparrot_training - Step 8406: {'lr': 0.00046708071295276305, 'samples': 4304384, 'steps': 8406, 'batch_loss/train': 0.8351870733313262} 12/24/2021 02:10:12 - INFO - codeparrot_training - Step 8407: {'lr': 0.0004670728940158073, 'samples': 4304896, 'steps': 8407, 'batch_loss/train': 0.8677353761158884} 12/24/2021 02:10:23 - INFO - codeparrot_training - Step 8408: {'lr': 0.0004670650742158518, 'samples': 4305408, 'steps': 8408, 'batch_loss/train': 0.8590118964202702} 12/24/2021 02:10:34 - INFO - codeparrot_training - Step 8409: {'lr': 0.0004670572535529275, 'samples': 4305920, 'steps': 8409, 'batch_loss/train': 0.857900591334328} 12/24/2021 02:10:44 - INFO - codeparrot_training - Step 8410: {'lr': 0.00046704943202706547, 'samples': 4306432, 'steps': 8410, 'batch_loss/train': 0.9221413042396307} 12/24/2021 02:10:56 - INFO - codeparrot_training - Step 8411: {'lr': 0.0004670416096382969, 'samples': 4306944, 'steps': 8411, 'batch_loss/train': 0.7749865800142288} 12/24/2021 02:11:07 - INFO - codeparrot_training - Step 8412: {'lr': 0.0004670337863866528, 'samples': 4307456, 'steps': 8412, 'batch_loss/train': 0.8278316487558186} 12/24/2021 02:11:17 - INFO - codeparrot_training - 
Step 8413: {'lr': 0.00046702596227216433, 'samples': 4307968, 'steps': 8413, 'batch_loss/train': 0.7740875808522105} 12/24/2021 02:11:31 - INFO - codeparrot_training - Step 8414: {'lr': 0.0004670181372948625, 'samples': 4308480, 'steps': 8414, 'batch_loss/train': 0.8074293858371675} 12/24/2021 02:11:42 - INFO - codeparrot_training - Step 8415: {'lr': 0.00046701031145477865, 'samples': 4308992, 'steps': 8415, 'batch_loss/train': 0.7771019143983722} 12/24/2021 02:11:52 - INFO - codeparrot_training - Step 8416: {'lr': 0.0004670024847519436, 'samples': 4309504, 'steps': 8416, 'batch_loss/train': 0.8534234762191772} 12/24/2021 02:12:04 - INFO - codeparrot_training - Step 8417: {'lr': 0.00046699465718638873, 'samples': 4310016, 'steps': 8417, 'batch_loss/train': 0.846841755323112} 12/24/2021 02:12:15 - INFO - codeparrot_training - Step 8418: {'lr': 0.000466986828758145, 'samples': 4310528, 'steps': 8418, 'batch_loss/train': 0.8076486205682158} 12/24/2021 02:12:26 - INFO - codeparrot_training - Step 8419: {'lr': 0.00046697899946724356, 'samples': 4311040, 'steps': 8419, 'batch_loss/train': 0.9314961712807417} 12/24/2021 02:12:36 - INFO - codeparrot_training - Step 8420: {'lr': 0.00046697116931371555, 'samples': 4311552, 'steps': 8420, 'batch_loss/train': 0.8110460070893168} 12/24/2021 02:12:48 - INFO - codeparrot_training - Step 8421: {'lr': 0.00046696333829759207, 'samples': 4312064, 'steps': 8421, 'batch_loss/train': 0.9813183727674186} 12/24/2021 02:12:59 - INFO - codeparrot_training - Step 8422: {'lr': 0.0004669555064189043, 'samples': 4312576, 'steps': 8422, 'batch_loss/train': 0.9246210614219308} 12/24/2021 02:13:10 - INFO - codeparrot_training - Step 8423: {'lr': 0.0004669476736776833, 'samples': 4313088, 'steps': 8423, 'batch_loss/train': 0.8389170346781611} 12/24/2021 02:13:22 - INFO - codeparrot_training - Step 8424: {'lr': 0.00046693984007396036, 'samples': 4313600, 'steps': 8424, 'batch_loss/train': 0.8680607667192817} 12/24/2021 02:13:32 - INFO - 
codeparrot_training - Step 8425: {'lr': 0.00046693200560776646, 'samples': 4314112, 'steps': 8425, 'batch_loss/train': 0.8737838882952929} 12/24/2021 02:13:43 - INFO - codeparrot_training - Step 8426: {'lr': 0.0004669241702791328, 'samples': 4314624, 'steps': 8426, 'batch_loss/train': 0.9722544876858592} 12/24/2021 02:13:57 - INFO - codeparrot_training - Step 8427: {'lr': 0.00046691633408809063, 'samples': 4315136, 'steps': 8427, 'batch_loss/train': 0.8925486542284489} 12/24/2021 02:14:08 - INFO - codeparrot_training - Step 8428: {'lr': 0.000466908497034671, 'samples': 4315648, 'steps': 8428, 'batch_loss/train': 0.8786573181860149} 12/24/2021 02:14:18 - INFO - codeparrot_training - Step 8429: {'lr': 0.000466900659118905, 'samples': 4316160, 'steps': 8429, 'batch_loss/train': 0.7433326267637312} 12/24/2021 02:14:29 - INFO - codeparrot_training - Step 8430: {'lr': 0.00046689282034082397, 'samples': 4316672, 'steps': 8430, 'batch_loss/train': 0.8945744037628174} 12/24/2021 02:14:41 - INFO - codeparrot_training - Step 8431: {'lr': 0.0004668849807004589, 'samples': 4317184, 'steps': 8431, 'batch_loss/train': 0.8515260573476553} 12/24/2021 02:14:51 - INFO - codeparrot_training - Step 8432: {'lr': 0.0004668771401978411, 'samples': 4317696, 'steps': 8432, 'batch_loss/train': 0.7911211596801877} 12/24/2021 02:15:02 - INFO - codeparrot_training - Step 8433: {'lr': 0.0004668692988330017, 'samples': 4318208, 'steps': 8433, 'batch_loss/train': 0.791931863874197} 12/24/2021 02:15:14 - INFO - codeparrot_training - Step 8434: {'lr': 0.00046686145660597184, 'samples': 4318720, 'steps': 8434, 'batch_loss/train': 0.8874476663768291} 12/24/2021 02:15:24 - INFO - codeparrot_training - Step 8435: {'lr': 0.00046685361351678266, 'samples': 4319232, 'steps': 8435, 'batch_loss/train': 0.867751038633287} 12/24/2021 02:15:35 - INFO - codeparrot_training - Step 8436: {'lr': 0.00046684576956546546, 'samples': 4319744, 'steps': 8436, 'batch_loss/train': 0.8969257585704327} 12/24/2021 02:15:49 - 
INFO - codeparrot_training - Step 8437: {'lr': 0.00046683792475205136, 'samples': 4320256, 'steps': 8437, 'batch_loss/train': 0.8624492567032576} 12/24/2021 02:16:00 - INFO - codeparrot_training - Step 8438: {'lr': 0.00046683007907657146, 'samples': 4320768, 'steps': 8438, 'batch_loss/train': 0.8141665587900206} 12/24/2021 02:16:10 - INFO - codeparrot_training - Step 8439: {'lr': 0.00046682223253905717, 'samples': 4321280, 'steps': 8439, 'batch_loss/train': 0.7851574146188796} 12/24/2021 02:16:22 - INFO - codeparrot_training - Step 8440: {'lr': 0.0004668143851395394, 'samples': 4321792, 'steps': 8440, 'batch_loss/train': 0.8674768405035138} 12/24/2021 02:16:33 - INFO - codeparrot_training - Step 8441: {'lr': 0.0004668065368780496, 'samples': 4322304, 'steps': 8441, 'batch_loss/train': 0.8951813029125333} 12/24/2021 02:16:43 - INFO - codeparrot_training - Step 8442: {'lr': 0.00046679868775461887, 'samples': 4322816, 'steps': 8442, 'batch_loss/train': 0.8139764405786991} 12/24/2021 02:16:54 - INFO - codeparrot_training - Step 8443: {'lr': 0.00046679083776927845, 'samples': 4323328, 'steps': 8443, 'batch_loss/train': 0.8373764806892723} 12/24/2021 02:17:08 - INFO - codeparrot_training - Step 8444: {'lr': 0.00046678298692205943, 'samples': 4323840, 'steps': 8444, 'batch_loss/train': 0.8712405376136303} 12/24/2021 02:17:18 - INFO - codeparrot_training - Step 8445: {'lr': 0.00046677513521299315, 'samples': 4324352, 'steps': 8445, 'batch_loss/train': 0.7762760873883963} 12/24/2021 02:17:29 - INFO - codeparrot_training - Step 8446: {'lr': 0.00046676728264211076, 'samples': 4324864, 'steps': 8446, 'batch_loss/train': 0.92607132345438} 12/24/2021 02:17:41 - INFO - codeparrot_training - Step 8447: {'lr': 0.0004667594292094435, 'samples': 4325376, 'steps': 8447, 'batch_loss/train': 0.9017896391451359} 12/24/2021 02:17:52 - INFO - codeparrot_training - Step 8448: {'lr': 0.0004667515749150226, 'samples': 4325888, 'steps': 8448, 'batch_loss/train': 0.9654919365420938} 12/24/2021 
02:18:02 - INFO - codeparrot_training - Step 8449: {'lr': 0.00046674371975887933, 'samples': 4326400, 'steps': 8449, 'batch_loss/train': 0.9282708019018173} 12/24/2021 02:18:14 - INFO - codeparrot_training - Step 8450: {'lr': 0.0004667358637410448, 'samples': 4326912, 'steps': 8450, 'batch_loss/train': 0.7987571097910404} 12/24/2021 02:18:25 - INFO - codeparrot_training - Step 8451: {'lr': 0.00046672800686155036, 'samples': 4327424, 'steps': 8451, 'batch_loss/train': 0.8635971359908581} 12/24/2021 02:18:35 - INFO - codeparrot_training - Step 8452: {'lr': 0.00046672014912042716, 'samples': 4327936, 'steps': 8452, 'batch_loss/train': 0.8087145574390888} 12/24/2021 02:18:46 - INFO - codeparrot_training - Step 8453: {'lr': 0.0004667122905177065, 'samples': 4328448, 'steps': 8453, 'batch_loss/train': 0.8258443307131529} 12/24/2021 02:19:00 - INFO - codeparrot_training - Step 8454: {'lr': 0.00046670443105341965, 'samples': 4328960, 'steps': 8454, 'batch_loss/train': 0.7731693871319294} 12/24/2021 02:19:10 - INFO - codeparrot_training - Step 8455: {'lr': 0.00046669657072759773, 'samples': 4329472, 'steps': 8455, 'batch_loss/train': 0.8535616081207991} 12/24/2021 02:19:21 - INFO - codeparrot_training - Step 8456: {'lr': 0.0004666887095402721, 'samples': 4329984, 'steps': 8456, 'batch_loss/train': 0.8608248601667583} 12/24/2021 02:19:33 - INFO - codeparrot_training - Step 8457: {'lr': 0.000466680847491474, 'samples': 4330496, 'steps': 8457, 'batch_loss/train': 0.837286464869976} 12/24/2021 02:19:44 - INFO - codeparrot_training - Step 8458: {'lr': 0.00046667298458123464, 'samples': 4331008, 'steps': 8458, 'batch_loss/train': 1.3205017494037747} 12/24/2021 02:19:54 - INFO - codeparrot_training - Step 8459: {'lr': 0.0004666651208095852, 'samples': 4331520, 'steps': 8459, 'batch_loss/train': 0.8163203238509595} 12/24/2021 02:20:06 - INFO - codeparrot_training - Step 8460: {'lr': 0.00046665725617655717, 'samples': 4332032, 'steps': 8460, 'batch_loss/train': 1.1567176394164562} 
12/24/2021 02:20:17 - INFO - codeparrot_training - Step 8461: {'lr': 0.0004666493906821817, 'samples': 4332544, 'steps': 8461, 'batch_loss/train': 0.7937553157098591} 12/24/2021 02:20:28 - INFO - codeparrot_training - Step 8462: {'lr': 0.0004666415243264901, 'samples': 4333056, 'steps': 8462, 'batch_loss/train': 0.9299788065254688} 12/24/2021 02:20:38 - INFO - codeparrot_training - Step 8463: {'lr': 0.0004666336571095135, 'samples': 4333568, 'steps': 8463, 'batch_loss/train': 0.905179699882865} 12/24/2021 02:20:50 - INFO - codeparrot_training - Step 8464: {'lr': 0.00046662578903128326, 'samples': 4334080, 'steps': 8464, 'batch_loss/train': 0.8655403777956963} 12/24/2021 02:21:01 - INFO - codeparrot_training - Step 8465: {'lr': 0.0004666179200918308, 'samples': 4334592, 'steps': 8465, 'batch_loss/train': 0.7981238756328821} 12/24/2021 02:21:11 - INFO - codeparrot_training - Step 8466: {'lr': 0.0004666100502911872, 'samples': 4335104, 'steps': 8466, 'batch_loss/train': 0.8852591929025948} 12/24/2021 02:21:25 - INFO - codeparrot_training - Step 8467: {'lr': 0.00046660217962938384, 'samples': 4335616, 'steps': 8467, 'batch_loss/train': 0.8605877375230193} 12/24/2021 02:21:36 - INFO - codeparrot_training - Step 8468: {'lr': 0.000466594308106452, 'samples': 4336128, 'steps': 8468, 'batch_loss/train': 0.7668606229126453} 12/24/2021 02:21:46 - INFO - codeparrot_training - Step 8469: {'lr': 0.000466586435722423, 'samples': 4336640, 'steps': 8469, 'batch_loss/train': 0.7777119041420519} 12/24/2021 02:21:58 - INFO - codeparrot_training - Step 8470: {'lr': 0.00046657856247732805, 'samples': 4337152, 'steps': 8470, 'batch_loss/train': 0.8256982816383243} 12/24/2021 02:22:09 - INFO - codeparrot_training - Step 8471: {'lr': 0.00046657068837119856, 'samples': 4337664, 'steps': 8471, 'batch_loss/train': 0.8773646838963032} 12/24/2021 02:22:20 - INFO - codeparrot_training - Step 8472: {'lr': 0.0004665628134040658, 'samples': 4338176, 'steps': 8472, 'batch_loss/train': 
0.936769078951329} 12/24/2021 02:22:34 - INFO - codeparrot_training - Step 8473: {'lr': 0.00046655493757596103, 'samples': 4338688, 'steps': 8473, 'batch_loss/train': 1.010562398005277} 12/24/2021 02:22:44 - INFO - codeparrot_training - Step 8474: {'lr': 0.00046654706088691556, 'samples': 4339200, 'steps': 8474, 'batch_loss/train': 0.8779322165064514} 12/24/2021 02:22:55 - INFO - codeparrot_training - Step 8475: {'lr': 0.0004665391833369608, 'samples': 4339712, 'steps': 8475, 'batch_loss/train': 0.8731918623670936} 12/24/2021 02:23:05 - INFO - codeparrot_training - Step 8476: {'lr': 0.000466531304926128, 'samples': 4340224, 'steps': 8476, 'batch_loss/train': 0.8343582972884178} 12/24/2021 02:23:18 - INFO - codeparrot_training - Step 8477: {'lr': 0.0004665234256544485, 'samples': 4340736, 'steps': 8477, 'batch_loss/train': 0.7538410015404224} 12/24/2021 02:23:28 - INFO - codeparrot_training - Step 8478: {'lr': 0.0004665155455219536, 'samples': 4341248, 'steps': 8478, 'batch_loss/train': 0.838030518963933} 12/24/2021 02:23:39 - INFO - codeparrot_training - Step 8479: {'lr': 0.00046650766452867467, 'samples': 4341760, 'steps': 8479, 'batch_loss/train': 0.7793589479988441} 12/24/2021 02:23:51 - INFO - codeparrot_training - Step 8480: {'lr': 0.00046649978267464297, 'samples': 4342272, 'steps': 8480, 'batch_loss/train': 0.8532404680736363} 12/24/2021 02:24:01 - INFO - codeparrot_training - Step 8481: {'lr': 0.0004664918999598899, 'samples': 4342784, 'steps': 8481, 'batch_loss/train': 0.8551729861646891} 12/24/2021 02:24:12 - INFO - codeparrot_training - Step 8482: {'lr': 0.00046648401638444683, 'samples': 4343296, 'steps': 8482, 'batch_loss/train': 0.8012634925544262} 12/24/2021 02:24:26 - INFO - codeparrot_training - Step 8483: {'lr': 0.00046647613194834495, 'samples': 4343808, 'steps': 8483, 'batch_loss/train': 0.8482999689877033} 12/24/2021 02:24:37 - INFO - codeparrot_training - Step 8484: {'lr': 0.0004664682466516158, 'samples': 4344320, 'steps': 8484, 
'batch_loss/train': 0.8831232967786491} 12/24/2021 02:24:47 - INFO - codeparrot_training - Step 8485: {'lr': 0.00046646036049429054, 'samples': 4344832, 'steps': 8485, 'batch_loss/train': 0.7699672933667898} 12/24/2021 02:24:58 - INFO - codeparrot_training - Step 8486: {'lr': 0.00046645247347640073, 'samples': 4345344, 'steps': 8486, 'batch_loss/train': 0.7869104342535138} 12/24/2021 02:25:10 - INFO - codeparrot_training - Step 8487: {'lr': 0.00046644458559797755, 'samples': 4345856, 'steps': 8487, 'batch_loss/train': 0.8949736831709743} 12/24/2021 02:25:20 - INFO - codeparrot_training - Step 8488: {'lr': 0.00046643669685905243, 'samples': 4346368, 'steps': 8488, 'batch_loss/train': 0.8716134019196033} 12/24/2021 02:25:31 - INFO - codeparrot_training - Step 8489: {'lr': 0.00046642880725965676, 'samples': 4346880, 'steps': 8489, 'batch_loss/train': 1.2138269431889057} 12/24/2021 02:25:45 - INFO - codeparrot_training - Step 8490: {'lr': 0.00046642091679982177, 'samples': 4347392, 'steps': 8490, 'batch_loss/train': 0.7966614374890924} 12/24/2021 02:25:55 - INFO - codeparrot_training - Step 8491: {'lr': 0.00046641302547957905, 'samples': 4347904, 'steps': 8491, 'batch_loss/train': 0.8855134202167392} 12/24/2021 02:26:06 - INFO - codeparrot_training - Step 8492: {'lr': 0.00046640513329895985, 'samples': 4348416, 'steps': 8492, 'batch_loss/train': 0.8342181509360671} 12/24/2021 02:26:18 - INFO - codeparrot_training - Step 8493: {'lr': 0.0004663972402579955, 'samples': 4348928, 'steps': 8493, 'batch_loss/train': 1.028784457128495} 12/24/2021 02:26:29 - INFO - codeparrot_training - Step 8494: {'lr': 0.00046638934635671745, 'samples': 4349440, 'steps': 8494, 'batch_loss/train': 0.7547754711122252} 12/24/2021 02:26:39 - INFO - codeparrot_training - Step 8495: {'lr': 0.00046638145159515703, 'samples': 4349952, 'steps': 8495, 'batch_loss/train': 0.9742001798003912} 12/24/2021 02:26:51 - INFO - codeparrot_training - Step 8496: {'lr': 0.00046637355597334566, 'samples': 4350464, 
'steps': 8496, 'batch_loss/train': 0.8989050607196987} 12/24/2021 02:27:02 - INFO - codeparrot_training - Step 8497: {'lr': 0.0004663656594913148, 'samples': 4350976, 'steps': 8497, 'batch_loss/train': 0.875559663400054} 12/24/2021 02:27:12 - INFO - codeparrot_training - Step 8498: {'lr': 0.00046635776214909577, 'samples': 4351488, 'steps': 8498, 'batch_loss/train': 0.879230797290802} 12/24/2021 02:27:23 - INFO - codeparrot_training - Step 8499: {'lr': 0.0004663498639467199, 'samples': 4352000, 'steps': 8499, 'batch_loss/train': 0.8058295091614127} 12/24/2021 02:27:35 - INFO - codeparrot_training - Step 8500: {'lr': 0.0004663419648842187, 'samples': 4352512, 'steps': 8500, 'batch_loss/train': 0.8302388349547982} 12/24/2021 02:27:46 - INFO - codeparrot_training - Step 8501: {'lr': 0.00046633406496162353, 'samples': 4353024, 'steps': 8501, 'batch_loss/train': 0.8376057748682797} 12/24/2021 02:27:56 - INFO - codeparrot_training - Step 8502: {'lr': 0.00046632616417896576, 'samples': 4353536, 'steps': 8502, 'batch_loss/train': 0.8039625841192901} 12/24/2021 02:28:09 - INFO - codeparrot_training - Step 8503: {'lr': 0.00046631826253627697, 'samples': 4354048, 'steps': 8503, 'batch_loss/train': 0.8628224213607609} 12/24/2021 02:28:20 - INFO - codeparrot_training - Step 8504: {'lr': 0.00046631036003358834, 'samples': 4354560, 'steps': 8504, 'batch_loss/train': 0.734711404889822} 12/24/2021 02:28:30 - INFO - codeparrot_training - Step 8505: {'lr': 0.0004663024566709314, 'samples': 4355072, 'steps': 8505, 'batch_loss/train': 0.9938475899398327} 12/24/2021 02:28:42 - INFO - codeparrot_training - Step 8506: {'lr': 0.00046629455244833764, 'samples': 4355584, 'steps': 8506, 'batch_loss/train': 0.8804847467690706} 12/24/2021 02:28:53 - INFO - codeparrot_training - Step 8507: {'lr': 0.0004662866473658383, 'samples': 4356096, 'steps': 8507, 'batch_loss/train': 0.8128717802464962} 12/24/2021 02:29:03 - INFO - codeparrot_training - Step 8508: {'lr': 0.000466278741423465, 'samples': 
4356608, 'steps': 8508, 'batch_loss/train': 0.8085114657878876} 12/24/2021 02:29:14 - INFO - codeparrot_training - Step 8509: {'lr': 0.0004662708346212491, 'samples': 4357120, 'steps': 8509, 'batch_loss/train': 0.923474551178515} 12/24/2021 02:29:26 - INFO - codeparrot_training - Step 8510: {'lr': 0.0004662629269592219, 'samples': 4357632, 'steps': 8510, 'batch_loss/train': 0.8840856414753944} 12/24/2021 02:29:37 - INFO - codeparrot_training - Step 8511: {'lr': 0.00046625501843741506, 'samples': 4358144, 'steps': 8511, 'batch_loss/train': 0.838523494079709} 12/24/2021 02:29:47 - INFO - codeparrot_training - Step 8512: {'lr': 0.00046624710905585987, 'samples': 4358656, 'steps': 8512, 'batch_loss/train': 0.8275016695261002} 12/24/2021 02:30:00 - INFO - codeparrot_training - Step 8513: {'lr': 0.00046623919881458786, 'samples': 4359168, 'steps': 8513, 'batch_loss/train': 0.7422022186219692} 12/24/2021 02:30:11 - INFO - codeparrot_training - Step 8514: {'lr': 0.00046623128771363043, 'samples': 4359680, 'steps': 8514, 'batch_loss/train': 0.7459238106384873} 12/24/2021 02:30:21 - INFO - codeparrot_training - Step 8515: {'lr': 0.00046622337575301907, 'samples': 4360192, 'steps': 8515, 'batch_loss/train': 0.7182511298451573} 12/24/2021 02:30:33 - INFO - codeparrot_training - Step 8516: {'lr': 0.0004662154629327853, 'samples': 4360704, 'steps': 8516, 'batch_loss/train': 0.8532736031338573} 12/24/2021 02:30:44 - INFO - codeparrot_training - Step 8517: {'lr': 0.00046620754925296034, 'samples': 4361216, 'steps': 8517, 'batch_loss/train': 0.8963733147829771} 12/24/2021 02:30:55 - INFO - codeparrot_training - Step 8518: {'lr': 0.00046619963471357585, 'samples': 4361728, 'steps': 8518, 'batch_loss/train': 0.8160949489101768} 12/24/2021 02:31:05 - INFO - codeparrot_training - Step 8519: {'lr': 0.0004661917193146632, 'samples': 4362240, 'steps': 8519, 'batch_loss/train': 0.7934007840231061} 12/24/2021 02:31:18 - INFO - codeparrot_training - Step 8520: {'lr': 0.000466183803056254, 
'samples': 4362752, 'steps': 8520, 'batch_loss/train': 0.8262057286920026} 12/24/2021 02:31:28 - INFO - codeparrot_training - Step 8521: {'lr': 0.00046617588593837956, 'samples': 4363264, 'steps': 8521, 'batch_loss/train': 1.4092133208177984} 12/24/2021 02:31:39 - INFO - codeparrot_training - Step 8522: {'lr': 0.0004661679679610714, 'samples': 4363776, 'steps': 8522, 'batch_loss/train': 0.8543143221177161} 12/24/2021 02:31:51 - INFO - codeparrot_training - Step 8523: {'lr': 0.0004661600491243611, 'samples': 4364288, 'steps': 8523, 'batch_loss/train': 0.8488532826304436} 12/24/2021 02:32:02 - INFO - codeparrot_training - Step 8524: {'lr': 0.00046615212942828, 'samples': 4364800, 'steps': 8524, 'batch_loss/train': 0.7709514051675797} 12/24/2021 02:32:12 - INFO - codeparrot_training - Step 8525: {'lr': 0.0004661442088728597, 'samples': 4365312, 'steps': 8525, 'batch_loss/train': 0.8372631622478366} 12/24/2021 02:32:25 - INFO - codeparrot_training - Step 8526: {'lr': 0.00046613628745813165, 'samples': 4365824, 'steps': 8526, 'batch_loss/train': 0.7707744790241122} 12/24/2021 02:32:35 - INFO - codeparrot_training - Step 8527: {'lr': 0.00046612836518412725, 'samples': 4366336, 'steps': 8527, 'batch_loss/train': 0.847327342024073} 12/24/2021 02:32:46 - INFO - codeparrot_training - Step 8528: {'lr': 0.0004661204420508781, 'samples': 4366848, 'steps': 8528, 'batch_loss/train': 0.9422778012230992} 12/24/2021 02:32:56 - INFO - codeparrot_training - Step 8529: {'lr': 0.0004661125180584157, 'samples': 4367360, 'steps': 8529, 'batch_loss/train': 0.8187868213281035} 12/24/2021 02:33:09 - INFO - codeparrot_training - Step 8530: {'lr': 0.0004661045932067716, 'samples': 4367872, 'steps': 8530, 'batch_loss/train': 0.8511241776868701} 12/24/2021 02:33:20 - INFO - codeparrot_training - Step 8531: {'lr': 0.0004660966674959771, 'samples': 4368384, 'steps': 8531, 'batch_loss/train': 0.771939629688859} 12/24/2021 02:33:30 - INFO - codeparrot_training - Step 8532: {'lr': 
0.0004660887409260639, 'samples': 4368896, 'steps': 8532, 'batch_loss/train': 0.8137279741931707} 12/24/2021 02:33:42 - INFO - codeparrot_training - Step 8533: {'lr': 0.0004660808134970635, 'samples': 4369408, 'steps': 8533, 'batch_loss/train': 0.8990078712813556} 12/24/2021 02:33:53 - INFO - codeparrot_training - Step 8534: {'lr': 0.0004660728852090074, 'samples': 4369920, 'steps': 8534, 'batch_loss/train': 1.068519682623446} 12/24/2021 02:34:03 - INFO - codeparrot_training - Step 8535: {'lr': 0.00046606495606192703, 'samples': 4370432, 'steps': 8535, 'batch_loss/train': 0.9455645121634007} 12/24/2021 02:34:16 - INFO - codeparrot_training - Step 8536: {'lr': 0.00046605702605585385, 'samples': 4370944, 'steps': 8536, 'batch_loss/train': 0.8536328384652734} 12/24/2021 02:34:27 - INFO - codeparrot_training - Step 8537: {'lr': 0.0004660490951908197, 'samples': 4371456, 'steps': 8537, 'batch_loss/train': 0.8909506937488914} 12/24/2021 02:34:38 - INFO - codeparrot_training - Step 8538: {'lr': 0.0004660411634668559, 'samples': 4371968, 'steps': 8538, 'batch_loss/train': 0.8592273006215692} 12/24/2021 02:34:49 - INFO - codeparrot_training - Step 8539: {'lr': 0.00046603323088399397, 'samples': 4372480, 'steps': 8539, 'batch_loss/train': 0.8331640972755849} 12/24/2021 02:35:00 - INFO - codeparrot_training - Step 8540: {'lr': 0.00046602529744226544, 'samples': 4372992, 'steps': 8540, 'batch_loss/train': 0.9001356866210699} 12/24/2021 02:35:11 - INFO - codeparrot_training - Step 8541: {'lr': 0.000466017363141702, 'samples': 4373504, 'steps': 8541, 'batch_loss/train': 0.9500935389660299} 12/24/2021 02:35:21 - INFO - codeparrot_training - Step 8542: {'lr': 0.00046600942798233504, 'samples': 4374016, 'steps': 8542, 'batch_loss/train': 0.8286371920257807} 12/24/2021 02:35:34 - INFO - codeparrot_training - Step 8543: {'lr': 0.0004660014919641962, 'samples': 4374528, 'steps': 8543, 'batch_loss/train': 0.8705573091283441} 12/24/2021 02:35:44 - INFO - codeparrot_training - Step 8544: 
{'lr': 0.00046599355508731687, 'samples': 4375040, 'steps': 8544, 'batch_loss/train': 0.7534353202208877} 12/24/2021 02:35:55 - INFO - codeparrot_training - Step 8545: {'lr': 0.0004659856173517288, 'samples': 4375552, 'steps': 8545, 'batch_loss/train': 0.8727860683575273} 12/24/2021 02:36:07 - INFO - codeparrot_training - Step 8546: {'lr': 0.00046597767875746344, 'samples': 4376064, 'steps': 8546, 'batch_loss/train': 0.9182995427399874} 12/24/2021 02:36:18 - INFO - codeparrot_training - Step 8547: {'lr': 0.0004659697393045524, 'samples': 4376576, 'steps': 8547, 'batch_loss/train': 0.7513903053477407} 12/24/2021 02:36:28 - INFO - codeparrot_training - Step 8548: {'lr': 0.00046596179899302726, 'samples': 4377088, 'steps': 8548, 'batch_loss/train': 0.6531970052747056} 12/24/2021 02:36:39 - INFO - codeparrot_training - Step 8549: {'lr': 0.0004659538578229195, 'samples': 4377600, 'steps': 8549, 'batch_loss/train': 0.8612705357372761} 12/24/2021 02:36:53 - INFO - codeparrot_training - Step 8550: {'lr': 0.00046594591579426074, 'samples': 4378112, 'steps': 8550, 'batch_loss/train': 0.8825352075509727} 12/24/2021 02:37:03 - INFO - codeparrot_training - Step 8551: {'lr': 0.00046593797290708257, 'samples': 4378624, 'steps': 8551, 'batch_loss/train': 0.7689426059368998} 12/24/2021 02:37:14 - INFO - codeparrot_training - Step 8552: {'lr': 0.0004659300291614166, 'samples': 4379136, 'steps': 8552, 'batch_loss/train': 0.8296660482883453} 12/24/2021 02:37:26 - INFO - codeparrot_training - Step 8553: {'lr': 0.0004659220845572943, 'samples': 4379648, 'steps': 8553, 'batch_loss/train': 0.9332419894635677} 12/24/2021 02:37:36 - INFO - codeparrot_training - Step 8554: {'lr': 0.00046591413909474733, 'samples': 4380160, 'steps': 8554, 'batch_loss/train': 0.8565714620053768} 12/24/2021 02:37:47 - INFO - codeparrot_training - Step 8555: {'lr': 0.0004659061927738073, 'samples': 4380672, 'steps': 8555, 'batch_loss/train': 0.8446054635569453} 12/24/2021 02:37:59 - INFO - codeparrot_training - 
Step 8556: {'lr': 0.0004658982455945058, 'samples': 4381184, 'steps': 8556, 'batch_loss/train': 0.8467556219547987} 12/24/2021 02:38:10 - INFO - codeparrot_training - Step 8557: {'lr': 0.00046589029755687427, 'samples': 4381696, 'steps': 8557, 'batch_loss/train': 0.9044797020033002} 12/24/2021 02:38:21 - INFO - codeparrot_training - Step 8558: {'lr': 0.00046588234866094455, 'samples': 4382208, 'steps': 8558, 'batch_loss/train': 0.84402069542557} 12/24/2021 02:38:33 - INFO - codeparrot_training - Step 8559: {'lr': 0.0004658743989067481, 'samples': 4382720, 'steps': 8559, 'batch_loss/train': 0.9894341425970197} 12/24/2021 02:38:44 - INFO - codeparrot_training - Step 8560: {'lr': 0.0004658664482943165, 'samples': 4383232, 'steps': 8560, 'batch_loss/train': 0.8878646595403552} 12/24/2021 02:38:54 - INFO - codeparrot_training - Step 8561: {'lr': 0.00046585849682368146, 'samples': 4383744, 'steps': 8561, 'batch_loss/train': 0.8092765472829342} 12/24/2021 02:39:05 - INFO - codeparrot_training - Step 8562: {'lr': 0.0004658505444948745, 'samples': 4384256, 'steps': 8562, 'batch_loss/train': 0.8367946520447731} 12/24/2021 02:39:17 - INFO - codeparrot_training - Step 8563: {'lr': 0.0004658425913079273, 'samples': 4384768, 'steps': 8563, 'batch_loss/train': 0.80741002317518} 12/24/2021 02:39:28 - INFO - codeparrot_training - Step 8564: {'lr': 0.0004658346372628715, 'samples': 4385280, 'steps': 8564, 'batch_loss/train': 0.8865814274176955} 12/24/2021 02:39:38 - INFO - codeparrot_training - Step 8565: {'lr': 0.00046582668235973867, 'samples': 4385792, 'steps': 8565, 'batch_loss/train': 0.972362207248807} 12/24/2021 02:39:51 - INFO - codeparrot_training - Step 8566: {'lr': 0.0004658187265985604, 'samples': 4386304, 'steps': 8566, 'batch_loss/train': 0.8127299384213984} 12/24/2021 02:40:02 - INFO - codeparrot_training - Step 8567: {'lr': 0.00046581076997936834, 'samples': 4386816, 'steps': 8567, 'batch_loss/train': 0.9277726244181395} 12/24/2021 02:40:12 - INFO - 
codeparrot_training - Step 8568: {'lr': 0.00046580281250219424, 'samples': 4387328, 'steps': 8568, 'batch_loss/train': 0.7258524252101779} 12/24/2021 02:40:24 - INFO - codeparrot_training - Step 8569: {'lr': 0.00046579485416706955, 'samples': 4387840, 'steps': 8569, 'batch_loss/train': 0.7181628623511642} 12/24/2021 02:40:35 - INFO - codeparrot_training - Step 8570: {'lr': 0.00046578689497402603, 'samples': 4388352, 'steps': 8570, 'batch_loss/train': 0.8587824042188004} 12/24/2021 02:40:45 - INFO - codeparrot_training - Step 8571: {'lr': 0.0004657789349230953, 'samples': 4388864, 'steps': 8571, 'batch_loss/train': 1.0024206340312958} 12/24/2021 02:40:56 - INFO - codeparrot_training - Step 8572: {'lr': 0.000465770974014309, 'samples': 4389376, 'steps': 8572, 'batch_loss/train': 0.9420608996879309} 12/24/2021 02:41:08 - INFO - codeparrot_training - Step 8573: {'lr': 0.0004657630122476988, 'samples': 4389888, 'steps': 8573, 'batch_loss/train': 0.6579992165789008} 12/24/2021 02:41:19 - INFO - codeparrot_training - Step 8574: {'lr': 0.0004657550496232963, 'samples': 4390400, 'steps': 8574, 'batch_loss/train': 0.8178366292268038} 12/24/2021 02:41:29 - INFO - codeparrot_training - Step 8575: {'lr': 0.0004657470861411332, 'samples': 4390912, 'steps': 8575, 'batch_loss/train': 0.7887967266142368} 12/24/2021 02:41:42 - INFO - codeparrot_training - Step 8576: {'lr': 0.0004657391218012412, 'samples': 4391424, 'steps': 8576, 'batch_loss/train': 0.790038213133812} 12/24/2021 02:41:53 - INFO - codeparrot_training - Step 8577: {'lr': 0.00046573115660365186, 'samples': 4391936, 'steps': 8577, 'batch_loss/train': 0.7974267741665244} 12/24/2021 02:42:03 - INFO - codeparrot_training - Step 8578: {'lr': 0.00046572319054839686, 'samples': 4392448, 'steps': 8578, 'batch_loss/train': 0.7952477196231484} 12/24/2021 02:42:16 - INFO - codeparrot_training - Step 8579: {'lr': 0.000465715223635508, 'samples': 4392960, 'steps': 8579, 'batch_loss/train': 0.7939690006896853} 12/24/2021 02:42:26 - 
INFO - codeparrot_training - Step 8580: {'lr': 0.0004657072558650168, 'samples': 4393472, 'steps': 8580, 'batch_loss/train': 0.8057986227795482} 12/24/2021 02:42:37 - INFO - codeparrot_training - Step 8581: {'lr': 0.000465699287236955, 'samples': 4393984, 'steps': 8581, 'batch_loss/train': 0.7934653731063008} 12/24/2021 02:42:48 - INFO - codeparrot_training - Step 8582: {'lr': 0.00046569131775135426, 'samples': 4394496, 'steps': 8582, 'batch_loss/train': 1.2074323557317257} 12/24/2021 02:43:00 - INFO - codeparrot_training - Step 8583: {'lr': 0.0004656833474082463, 'samples': 4395008, 'steps': 8583, 'batch_loss/train': 0.9235432557761669} 12/24/2021 02:43:10 - INFO - codeparrot_training - Step 8584: {'lr': 0.0004656753762076628, 'samples': 4395520, 'steps': 8584, 'batch_loss/train': 0.7416677130386233} 12/24/2021 02:43:21 - INFO - codeparrot_training - Step 8585: {'lr': 0.0004656674041496355, 'samples': 4396032, 'steps': 8585, 'batch_loss/train': 0.9833247857168317} 12/24/2021 02:43:33 - INFO - codeparrot_training - Step 8586: {'lr': 0.00046565943123419597, 'samples': 4396544, 'steps': 8586, 'batch_loss/train': 0.824498129542917} 12/24/2021 02:43:44 - INFO - codeparrot_training - Step 8587: {'lr': 0.00046565145746137594, 'samples': 4397056, 'steps': 8587, 'batch_loss/train': 0.7473431290127337} 12/24/2021 02:43:54 - INFO - codeparrot_training - Step 8588: {'lr': 0.00046564348283120716, 'samples': 4397568, 'steps': 8588, 'batch_loss/train': 0.8228597231209278} 12/24/2021 02:44:07 - INFO - codeparrot_training - Step 8589: {'lr': 0.00046563550734372134, 'samples': 4398080, 'steps': 8589, 'batch_loss/train': 0.941057906486094} 12/24/2021 02:44:17 - INFO - codeparrot_training - Step 8590: {'lr': 0.0004656275309989502, 'samples': 4398592, 'steps': 8590, 'batch_loss/train': 0.7318112837383524} 12/24/2021 02:44:28 - INFO - codeparrot_training - Step 8591: {'lr': 0.0004656195537969253, 'samples': 4399104, 'steps': 8591, 'batch_loss/train': 0.8868268104270101} 12/24/2021 
02:44:39 - INFO - codeparrot_training - Step 8592: {'lr': 0.00046561157573767856, 'samples': 4399616, 'steps': 8592, 'batch_loss/train': 0.776538981241174} 12/24/2021 02:44:51 - INFO - codeparrot_training - Step 8593: {'lr': 0.0004656035968212416, 'samples': 4400128, 'steps': 8593, 'batch_loss/train': 0.8762883832678199} 12/24/2021 02:45:01 - INFO - codeparrot_training - Step 8594: {'lr': 0.0004655956170476461, 'samples': 4400640, 'steps': 8594, 'batch_loss/train': 0.8519742917269468} 12/24/2021 02:45:12 - INFO - codeparrot_training - Step 8595: {'lr': 0.00046558763641692393, 'samples': 4401152, 'steps': 8595, 'batch_loss/train': 0.8555694734677672} 12/24/2021 02:45:24 - INFO - codeparrot_training - Step 8596: {'lr': 0.0004655796549291066, 'samples': 4401664, 'steps': 8596, 'batch_loss/train': 0.9261324894614518} 12/24/2021 02:45:34 - INFO - codeparrot_training - Step 8597: {'lr': 0.00046557167258422607, 'samples': 4402176, 'steps': 8597, 'batch_loss/train': 0.7776469085365534} 12/24/2021 02:45:45 - INFO - codeparrot_training - Step 8598: {'lr': 0.0004655636893823139, 'samples': 4402688, 'steps': 8598, 'batch_loss/train': 0.8507119175046682} 12/24/2021 02:45:58 - INFO - codeparrot_training - Step 8599: {'lr': 0.0004655557053234019, 'samples': 4403200, 'steps': 8599, 'batch_loss/train': 0.8931525712832808} 12/24/2021 02:46:08 - INFO - codeparrot_training - Step 8600: {'lr': 0.0004655477204075218, 'samples': 4403712, 'steps': 8600, 'batch_loss/train': 0.7923199813812971} 12/24/2021 02:46:19 - INFO - codeparrot_training - Step 8601: {'lr': 0.00046553973463470534, 'samples': 4404224, 'steps': 8601, 'batch_loss/train': 0.8950454024598002} 12/24/2021 02:46:29 - INFO - codeparrot_training - Step 8602: {'lr': 0.0004655317480049843, 'samples': 4404736, 'steps': 8602, 'batch_loss/train': 0.8189679943025112} 12/24/2021 02:46:42 - INFO - codeparrot_training - Step 8603: {'lr': 0.0004655237605183905, 'samples': 4405248, 'steps': 8603, 'batch_loss/train': 0.8111507277935743} 
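The records in this log follow one fixed shape: a timestamp, a level, the `codeparrot_training` logger name, then `Step N:` followed by a Python-dict literal of metrics (`lr`, `samples`, `steps`, `batch_loss/train`). A minimal sketch for turning a stretch of this raw log into structured data, using only the standard library; the `RECORD_RE` pattern and `parse_records` helper are illustrative assumptions, not part of the training script that produced this log:

```python
import ast
import re

# Matches one log record of the shape seen throughout this file:
#   MM/DD/YYYY HH:MM:SS - LEVEL - codeparrot_training - Step N: {...}
RECORD_RE = re.compile(
    r"(?P<ts>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) - (?P<level>\w+) - "
    r"codeparrot_training - Step (?P<step>\d+): (?P<metrics>\{.*?\})"
)

def parse_records(text):
    """Yield one dict per 'Step N: {...}' record found in the raw log text."""
    for m in RECORD_RE.finditer(text):
        # The metrics payload is a plain Python dict literal, so
        # ast.literal_eval parses it safely without eval().
        metrics = ast.literal_eval(m.group("metrics"))
        yield {"timestamp": m.group("ts"), "level": m.group("level"), **metrics}

# One record copied verbatim from the log above, as a usage example.
sample = (
    "12/24/2021 02:46:52 - INFO - codeparrot_training - Step 8604: "
    "{'lr': 0.00046551577217495544, 'samples': 4405760, 'steps': 8604, "
    "'batch_loss/train': 0.7662250837311149}"
)
records = list(parse_records(sample))
print(records[0]["steps"], records[0]["batch_loss/train"])
```

Because the metrics dict is valid Python literal syntax, `ast.literal_eval` recovers the original numeric types, which makes it straightforward to feed a whole log file into the parser and plot the learning-rate schedule or a smoothed training loss.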
12/24/2021 02:46:52 - INFO - codeparrot_training - Step 8604: {'lr': 0.00046551577217495544, 'samples': 4405760, 'steps': 8604, 'batch_loss/train': 0.7662250837311149} 12/24/2021 02:47:03 - INFO - codeparrot_training - Step 8605: {'lr': 0.0004655077829747112, 'samples': 4406272, 'steps': 8605, 'batch_loss/train': 0.8631500825285912} 12/24/2021 02:47:15 - INFO - codeparrot_training - Step 8606: {'lr': 0.00046549979291768927, 'samples': 4406784, 'steps': 8606, 'batch_loss/train': 0.6803206705953926} 12/24/2021 02:47:26 - INFO - codeparrot_training - Step 8607: {'lr': 0.0004654918020039216, 'samples': 4407296, 'steps': 8607, 'batch_loss/train': 0.8464823374524713} 12/24/2021 02:47:36 - INFO - codeparrot_training - Step 8608: {'lr': 0.0004654838102334399, 'samples': 4407808, 'steps': 8608, 'batch_loss/train': 0.8754106806591153} 12/24/2021 02:47:49 - INFO - codeparrot_training - Step 8609: {'lr': 0.0004654758176062759, 'samples': 4408320, 'steps': 8609, 'batch_loss/train': 0.763030709233135} 12/24/2021 02:47:59 - INFO - codeparrot_training - Step 8610: {'lr': 0.0004654678241224614, 'samples': 4408832, 'steps': 8610, 'batch_loss/train': 0.923748204484582} 12/24/2021 02:48:10 - INFO - codeparrot_training - Step 8611: {'lr': 0.0004654598297820282, 'samples': 4409344, 'steps': 8611, 'batch_loss/train': 0.8058794336393476} 12/24/2021 02:48:22 - INFO - codeparrot_training - Step 8612: {'lr': 0.00046545183458500806, 'samples': 4409856, 'steps': 8612, 'batch_loss/train': 0.8720189668238163} 12/24/2021 02:48:32 - INFO - codeparrot_training - Step 8613: {'lr': 0.0004654438385314328, 'samples': 4410368, 'steps': 8613, 'batch_loss/train': 0.8929697913117707} 12/24/2021 02:48:43 - INFO - codeparrot_training - Step 8614: {'lr': 0.0004654358416213342, 'samples': 4410880, 'steps': 8614, 'batch_loss/train': 0.7953045470640063} 12/24/2021 02:48:53 - INFO - codeparrot_training - Step 8615: {'lr': 0.00046542784385474403, 'samples': 4411392, 'steps': 8615, 'batch_loss/train': 
0.8484076529275626} 12/24/2021 02:49:05 - INFO - codeparrot_training - Step 8616: {'lr': 0.00046541984523169405, 'samples': 4411904, 'steps': 8616, 'batch_loss/train': 0.73436766769737} 12/24/2021 02:49:16 - INFO - codeparrot_training - Step 8617: {'lr': 0.0004654118457522162, 'samples': 4412416, 'steps': 8617, 'batch_loss/train': 0.8965546377003193} 12/24/2021 02:49:27 - INFO - codeparrot_training - Step 8618: {'lr': 0.00046540384541634205, 'samples': 4412928, 'steps': 8618, 'batch_loss/train': 0.8832701463252306} 12/24/2021 02:49:39 - INFO - codeparrot_training - Step 8619: {'lr': 0.0004653958442241036, 'samples': 4413440, 'steps': 8619, 'batch_loss/train': 0.6987185516627505} 12/24/2021 02:49:50 - INFO - codeparrot_training - Step 8620: {'lr': 0.00046538784217553265, 'samples': 4413952, 'steps': 8620, 'batch_loss/train': 0.8639224674552679} 12/24/2021 02:50:00 - INFO - codeparrot_training - Step 8621: {'lr': 0.000465379839270661, 'samples': 4414464, 'steps': 8621, 'batch_loss/train': 0.7975716888904572} 12/24/2021 02:50:12 - INFO - codeparrot_training - Step 8622: {'lr': 0.00046537183550952036, 'samples': 4414976, 'steps': 8622, 'batch_loss/train': 0.7730874689295888} 12/24/2021 02:50:23 - INFO - codeparrot_training - Step 8623: {'lr': 0.00046536383089214263, 'samples': 4415488, 'steps': 8623, 'batch_loss/train': 0.9651076523587108} 12/24/2021 02:50:34 - INFO - codeparrot_training - Step 8624: {'lr': 0.0004653558254185596, 'samples': 4416000, 'steps': 8624, 'batch_loss/train': 0.8175527937710285} 12/24/2021 02:50:44 - INFO - codeparrot_training - Step 8625: {'lr': 0.0004653478190888032, 'samples': 4416512, 'steps': 8625, 'batch_loss/train': 0.9754395354539156} 12/24/2021 02:50:56 - INFO - codeparrot_training - Step 8626: {'lr': 0.0004653398119029051, 'samples': 4417024, 'steps': 8626, 'batch_loss/train': 0.869021019898355} 12/24/2021 02:51:07 - INFO - codeparrot_training - Step 8627: {'lr': 0.00046533180386089726, 'samples': 4417536, 'steps': 8627, 
'batch_loss/train': 0.7637069579213858} 12/24/2021 02:51:17 - INFO - codeparrot_training - Step 8628: {'lr': 0.0004653237949628115, 'samples': 4418048, 'steps': 8628, 'batch_loss/train': 0.7000413775094785} 12/24/2021 02:51:30 - INFO - codeparrot_training - Step 8629: {'lr': 0.00046531578520867954, 'samples': 4418560, 'steps': 8629, 'batch_loss/train': 0.8364668739959598} 12/24/2021 02:51:41 - INFO - codeparrot_training - Step 8630: {'lr': 0.00046530777459853346, 'samples': 4419072, 'steps': 8630, 'batch_loss/train': 0.8272188361734152} 12/24/2021 02:51:51 - INFO - codeparrot_training - Step 8631: {'lr': 0.00046529976313240483, 'samples': 4419584, 'steps': 8631, 'batch_loss/train': 0.9468050263822079} 12/24/2021 02:52:03 - INFO - codeparrot_training - Step 8632: {'lr': 0.0004652917508103257, 'samples': 4420096, 'steps': 8632, 'batch_loss/train': 0.7800423973239958} 12/24/2021 02:52:14 - INFO - codeparrot_training - Step 8633: {'lr': 0.0004652837376323279, 'samples': 4420608, 'steps': 8633, 'batch_loss/train': 0.809062885120511} 12/24/2021 02:52:25 - INFO - codeparrot_training - Step 8634: {'lr': 0.0004652757235984432, 'samples': 4421120, 'steps': 8634, 'batch_loss/train': 0.7757048243656754} 12/24/2021 02:52:35 - INFO - codeparrot_training - Step 8635: {'lr': 0.0004652677087087035, 'samples': 4421632, 'steps': 8635, 'batch_loss/train': 0.8215544749982655} 12/24/2021 02:52:48 - INFO - codeparrot_training - Step 8636: {'lr': 0.0004652596929631407, 'samples': 4422144, 'steps': 8636, 'batch_loss/train': 0.8597385957837105} 12/24/2021 02:52:59 - INFO - codeparrot_training - Step 8637: {'lr': 0.0004652516763617866, 'samples': 4422656, 'steps': 8637, 'batch_loss/train': 0.8723340928554535} 12/24/2021 02:53:09 - INFO - codeparrot_training - Step 8638: {'lr': 0.0004652436589046731, 'samples': 4423168, 'steps': 8638, 'batch_loss/train': 0.812760604545474} 12/24/2021 02:53:21 - INFO - codeparrot_training - Step 8639: {'lr': 0.00046523564059183217, 'samples': 4423680, 'steps': 
8639, 'batch_loss/train': 0.8383815456181765} 12/24/2021 02:53:32 - INFO - codeparrot_training - Step 8640: {'lr': 0.00046522762142329555, 'samples': 4424192, 'steps': 8640, 'batch_loss/train': 0.8039330914616585} 12/24/2021 02:53:42 - INFO - codeparrot_training - Step 8641: {'lr': 0.00046521960139909513, 'samples': 4424704, 'steps': 8641, 'batch_loss/train': 0.727734754094854} 12/24/2021 02:53:55 - INFO - codeparrot_training - Step 8642: {'lr': 0.0004652115805192629, 'samples': 4425216, 'steps': 8642, 'batch_loss/train': 0.9293350148946047} 12/24/2021 02:54:05 - INFO - codeparrot_training - Step 8643: {'lr': 0.0004652035587838307, 'samples': 4425728, 'steps': 8643, 'batch_loss/train': 0.9468883033841848} 12/24/2021 02:54:16 - INFO - codeparrot_training - Step 8644: {'lr': 0.0004651955361928304, 'samples': 4426240, 'steps': 8644, 'batch_loss/train': 0.9021068150177598} 12/24/2021 02:54:29 - INFO - codeparrot_training - Step 8645: {'lr': 0.0004651875127462938, 'samples': 4426752, 'steps': 8645, 'batch_loss/train': 0.5972073296434246} 12/24/2021 02:54:39 - INFO - codeparrot_training - Step 8646: {'lr': 0.00046517948844425306, 'samples': 4427264, 'steps': 8646, 'batch_loss/train': 0.9679370056837797} 12/24/2021 02:54:50 - INFO - codeparrot_training - Step 8647: {'lr': 0.0004651714632867398, 'samples': 4427776, 'steps': 8647, 'batch_loss/train': 0.8662001164630055} 12/24/2021 02:55:01 - INFO - codeparrot_training - Step 8648: {'lr': 0.0004651634372737861, 'samples': 4428288, 'steps': 8648, 'batch_loss/train': 0.7836248921230435} 12/24/2021 02:55:13 - INFO - codeparrot_training - Step 8649: {'lr': 0.0004651554104054238, 'samples': 4428800, 'steps': 8649, 'batch_loss/train': 0.8520484007894993} 12/24/2021 02:55:23 - INFO - codeparrot_training - Step 8650: {'lr': 0.0004651473826816849, 'samples': 4429312, 'steps': 8650, 'batch_loss/train': 0.837251702323556} 12/24/2021 02:55:34 - INFO - codeparrot_training - Step 8651: {'lr': 0.00046513935410260113, 'samples': 4429824, 
'steps': 8651, 'batch_loss/train': 0.8662271220237017} 12/24/2021 02:55:46 - INFO - codeparrot_training - Step 8652: {'lr': 0.00046513132466820457, 'samples': 4430336, 'steps': 8652, 'batch_loss/train': 0.8098890706896782} 12/24/2021 02:55:57 - INFO - codeparrot_training - Step 8653: {'lr': 0.00046512329437852705, 'samples': 4430848, 'steps': 8653, 'batch_loss/train': 0.7699796194210649} 12/24/2021 02:56:07 - INFO - codeparrot_training - Step 8654: {'lr': 0.00046511526323360066, 'samples': 4431360, 'steps': 8654, 'batch_loss/train': 0.7022455840487964} 12/24/2021 02:56:19 - INFO - codeparrot_training - Step 8655: {'lr': 0.0004651072312334571, 'samples': 4431872, 'steps': 8655, 'batch_loss/train': 0.8181375041604042} 12/24/2021 02:56:30 - INFO - codeparrot_training - Step 8656: {'lr': 0.00046509919837812845, 'samples': 4432384, 'steps': 8656, 'batch_loss/train': 0.8539508618414402} 12/24/2021 02:56:41 - INFO - codeparrot_training - Step 8657: {'lr': 0.00046509116466764654, 'samples': 4432896, 'steps': 8657, 'batch_loss/train': 0.7245119409635663} 12/24/2021 02:56:53 - INFO - codeparrot_training - Step 8658: {'lr': 0.0004650831301020435, 'samples': 4433408, 'steps': 8658, 'batch_loss/train': 0.824013645760715} 12/24/2021 02:57:04 - INFO - codeparrot_training - Step 8659: {'lr': 0.000465075094681351, 'samples': 4433920, 'steps': 8659, 'batch_loss/train': 0.910290602594614} 12/24/2021 02:57:15 - INFO - codeparrot_training - Step 8660: {'lr': 0.00046506705840560126, 'samples': 4434432, 'steps': 8660, 'batch_loss/train': 0.914387805853039} 12/24/2021 02:57:25 - INFO - codeparrot_training - Step 8661: {'lr': 0.000465059021274826, 'samples': 4434944, 'steps': 8661, 'batch_loss/train': 0.7990896715782583} 12/24/2021 02:57:38 - INFO - codeparrot_training - Step 8662: {'lr': 0.0004650509832890574, 'samples': 4435456, 'steps': 8662, 'batch_loss/train': 0.8988459901884198} 12/24/2021 02:57:49 - INFO - codeparrot_training - Step 8663: {'lr': 0.0004650429444483272, 'samples': 
4435968, 'steps': 8663, 'batch_loss/train': 0.8589367931708694} 12/24/2021 02:57:59 - INFO - codeparrot_training - Step 8664: {'lr': 0.00046503490475266745, 'samples': 4436480, 'steps': 8664, 'batch_loss/train': 0.9011408844962716} 12/24/2021 02:58:11 - INFO - codeparrot_training - Step 8665: {'lr': 0.00046502686420211016, 'samples': 4436992, 'steps': 8665, 'batch_loss/train': 0.8162604039534926} 12/24/2021 02:58:22 - INFO - codeparrot_training - Step 8666: {'lr': 0.00046501882279668716, 'samples': 4437504, 'steps': 8666, 'batch_loss/train': 0.8558601667173207} 12/24/2021 02:58:32 - INFO - codeparrot_training - Step 8667: {'lr': 0.0004650107805364306, 'samples': 4438016, 'steps': 8667, 'batch_loss/train': 0.9983428595587611} 12/24/2021 02:58:43 - INFO - codeparrot_training - Step 8668: {'lr': 0.00046500273742137234, 'samples': 4438528, 'steps': 8668, 'batch_loss/train': 0.8265541102737188} 12/24/2021 02:58:56 - INFO - codeparrot_training - Step 8669: {'lr': 0.00046499469345154433, 'samples': 4439040, 'steps': 8669, 'batch_loss/train': 0.8028325936757028} 12/24/2021 02:59:06 - INFO - codeparrot_training - Step 8670: {'lr': 0.00046498664862697865, 'samples': 4439552, 'steps': 8670, 'batch_loss/train': 0.7517677897121757} 12/24/2021 02:59:17 - INFO - codeparrot_training - Step 8671: {'lr': 0.00046497860294770723, 'samples': 4440064, 'steps': 8671, 'batch_loss/train': 0.8317331410944462} 12/24/2021 02:59:29 - INFO - codeparrot_training - Step 8672: {'lr': 0.00046497055641376206, 'samples': 4440576, 'steps': 8672, 'batch_loss/train': 0.7377115953713655} 12/24/2021 02:59:40 - INFO - codeparrot_training - Step 8673: {'lr': 0.0004649625090251751, 'samples': 4441088, 'steps': 8673, 'batch_loss/train': 0.9139159973710775} 12/24/2021 02:59:50 - INFO - codeparrot_training - Step 8674: {'lr': 0.00046495446078197834, 'samples': 4441600, 'steps': 8674, 'batch_loss/train': 0.7735480926930904} 12/24/2021 03:00:04 - INFO - codeparrot_training - Step 8675: {'lr': 
0.00046494641168420386, 'samples': 4442112, 'steps': 8675, 'batch_loss/train': 0.7586936540901661} 12/24/2021 03:00:15 - INFO - codeparrot_training - Step 8676: {'lr': 0.0004649383617318836, 'samples': 4442624, 'steps': 8676, 'batch_loss/train': 0.8819182319566607} 12/24/2021 03:00:25 - INFO - codeparrot_training - Step 8677: {'lr': 0.00046493031092504955, 'samples': 4443136, 'steps': 8677, 'batch_loss/train': 0.7695819570217282} 12/24/2021 03:00:38 - INFO - codeparrot_training - Step 8678: {'lr': 0.0004649222592637338, 'samples': 4443648, 'steps': 8678, 'batch_loss/train': 0.7535993754863739} 12/24/2021 03:00:48 - INFO - codeparrot_training - Step 8679: {'lr': 0.0004649142067479683, 'samples': 4444160, 'steps': 8679, 'batch_loss/train': 0.7600894230417907} 12/24/2021 03:00:59 - INFO - codeparrot_training - Step 8680: {'lr': 0.000464906153377785, 'samples': 4444672, 'steps': 8680, 'batch_loss/train': 0.8683103807270527} 12/24/2021 03:01:10 - INFO - codeparrot_training - Step 8681: {'lr': 0.00046489809915321596, 'samples': 4445184, 'steps': 8681, 'batch_loss/train': 1.1592012953478843} 12/24/2021 03:01:22 - INFO - codeparrot_training - Step 8682: {'lr': 0.00046489004407429324, 'samples': 4445696, 'steps': 8682, 'batch_loss/train': 0.7992589850910008} 12/24/2021 03:01:33 - INFO - codeparrot_training - Step 8683: {'lr': 0.00046488198814104885, 'samples': 4446208, 'steps': 8683, 'batch_loss/train': 0.8719246645923704} 12/24/2021 03:01:44 - INFO - codeparrot_training - Step 8684: {'lr': 0.00046487393135351485, 'samples': 4446720, 'steps': 8684, 'batch_loss/train': 0.8466453887522221} 12/24/2021 03:01:56 - INFO - codeparrot_training - Step 8685: {'lr': 0.00046486587371172316, 'samples': 4447232, 'steps': 8685, 'batch_loss/train': 0.7760421792045236} 12/24/2021 03:02:07 - INFO - codeparrot_training - Step 8686: {'lr': 0.000464857815215706, 'samples': 4447744, 'steps': 8686, 'batch_loss/train': 1.2018705415539443} 12/24/2021 03:02:17 - INFO - codeparrot_training - Step 
8687: {'lr': 0.0004648497558654952, 'samples': 4448256, 'steps': 8687, 'batch_loss/train': 0.7951990012079477} 12/24/2021 03:02:29 - INFO - codeparrot_training - Step 8688: {'lr': 0.0004648416956611229, 'samples': 4448768, 'steps': 8688, 'batch_loss/train': 0.8266915110871196} 12/24/2021 03:02:40 - INFO - codeparrot_training - Step 8689: {'lr': 0.00046483363460262116, 'samples': 4449280, 'steps': 8689, 'batch_loss/train': 0.8281767293810844} 12/24/2021 03:02:51 - INFO - codeparrot_training - Step 8690: {'lr': 0.000464825572690022, 'samples': 4449792, 'steps': 8690, 'batch_loss/train': 0.9467020472511649} 12/24/2021 03:03:01 - INFO - codeparrot_training - Step 8691: {'lr': 0.00046481750992335746, 'samples': 4450304, 'steps': 8691, 'batch_loss/train': 0.7508738033939153} 12/24/2021 03:03:13 - INFO - codeparrot_training - Step 8692: {'lr': 0.00046480944630265965, 'samples': 4450816, 'steps': 8692, 'batch_loss/train': 0.8309072675183415} 12/24/2021 03:03:24 - INFO - codeparrot_training - Step 8693: {'lr': 0.0004648013818279606, 'samples': 4451328, 'steps': 8693, 'batch_loss/train': 0.969118001870811} 12/24/2021 03:03:34 - INFO - codeparrot_training - Step 8694: {'lr': 0.00046479331649929234, 'samples': 4451840, 'steps': 8694, 'batch_loss/train': 0.7879718043841422} 12/24/2021 03:03:47 - INFO - codeparrot_training - Step 8695: {'lr': 0.0004647852503166869, 'samples': 4452352, 'steps': 8695, 'batch_loss/train': 0.6372775116469711} 12/24/2021 03:03:58 - INFO - codeparrot_training - Step 8696: {'lr': 0.0004647771832801765, 'samples': 4452864, 'steps': 8696, 'batch_loss/train': 0.9037761930376291} 12/24/2021 03:04:09 - INFO - codeparrot_training - Step 8697: {'lr': 0.00046476911538979307, 'samples': 4453376, 'steps': 8697, 'batch_loss/train': 0.8758232113905251} 12/24/2021 03:04:21 - INFO - codeparrot_training - Step 8698: {'lr': 0.0004647610466455687, 'samples': 4453888, 'steps': 8698, 'batch_loss/train': 0.8326525036245584} 12/24/2021 03:04:31 - INFO - codeparrot_training 
- Step 8699: {'lr': 0.0004647529770475356, 'samples': 4454400, 'steps': 8699, 'batch_loss/train': 0.897921790368855} 12/24/2021 03:04:42 - INFO - codeparrot_training - Step 8700: {'lr': 0.00046474490659572564, 'samples': 4454912, 'steps': 8700, 'batch_loss/train': 0.6725973235443234} 12/24/2021 03:04:52 - INFO - codeparrot_training - Step 8701: {'lr': 0.00046473683529017106, 'samples': 4455424, 'steps': 8701, 'batch_loss/train': 0.7903540097177029} 12/24/2021 03:05:04 - INFO - codeparrot_training - Step 8702: {'lr': 0.00046472876313090393, 'samples': 4455936, 'steps': 8702, 'batch_loss/train': 0.8632072852924466} 12/24/2021 03:05:15 - INFO - codeparrot_training - Step 8703: {'lr': 0.00046472069011795633, 'samples': 4456448, 'steps': 8703, 'batch_loss/train': 0.9423088566400111} 12/24/2021 03:05:26 - INFO - codeparrot_training - Step 8704: {'lr': 0.0004647126162513603, 'samples': 4456960, 'steps': 8704, 'batch_loss/train': 0.8431726573035121} 12/24/2021 03:05:38 - INFO - codeparrot_training - Step 8705: {'lr': 0.0004647045415311479, 'samples': 4457472, 'steps': 8705, 'batch_loss/train': 0.9674253687262535} 12/24/2021 03:05:49 - INFO - codeparrot_training - Step 8706: {'lr': 0.0004646964659573515, 'samples': 4457984, 'steps': 8706, 'batch_loss/train': 0.7887566909193993} 12/24/2021 03:06:00 - INFO - codeparrot_training - Step 8707: {'lr': 0.0004646883895300029, 'samples': 4458496, 'steps': 8707, 'batch_loss/train': 0.8387957732193172} 12/24/2021 03:06:12 - INFO - codeparrot_training - Step 8708: {'lr': 0.0004646803122491343, 'samples': 4459008, 'steps': 8708, 'batch_loss/train': 0.8605884909629822} 12/24/2021 03:06:22 - INFO - codeparrot_training - Step 8709: {'lr': 0.00046467223411477787, 'samples': 4459520, 'steps': 8709, 'batch_loss/train': 0.8142798037733883} 12/24/2021 03:06:33 - INFO - codeparrot_training - Step 8710: {'lr': 0.0004646641551269657, 'samples': 4460032, 'steps': 8710, 'batch_loss/train': 0.8605701178312302} 12/24/2021 03:06:45 - INFO - 
codeparrot_training - Step 8711: {'lr': 0.00046465607528572994, 'samples': 4460544, 'steps': 8711, 'batch_loss/train': 0.9252370754256845} 12/24/2021 03:06:56 - INFO - codeparrot_training - Step 8712: {'lr': 0.0004646479945911026, 'samples': 4461056, 'steps': 8712, 'batch_loss/train': 0.9476114199496806} 12/24/2021 03:07:06 - INFO - codeparrot_training - Step 8713: {'lr': 0.0004646399130431159, 'samples': 4461568, 'steps': 8713, 'batch_loss/train': 0.8434821926057339} 12/24/2021 03:07:17 - INFO - codeparrot_training - Step 8714: {'lr': 0.00046463183064180205, 'samples': 4462080, 'steps': 8714, 'batch_loss/train': 0.9090263601392508} 12/24/2021 03:07:30 - INFO - codeparrot_training - Step 8715: {'lr': 0.000464623747387193, 'samples': 4462592, 'steps': 8715, 'batch_loss/train': 0.8553095528623089} 12/24/2021 03:07:40 - INFO - codeparrot_training - Step 8716: {'lr': 0.000464615663279321, 'samples': 4463104, 'steps': 8716, 'batch_loss/train': 0.9752221300732344} 12/24/2021 03:07:51 - INFO - codeparrot_training - Step 8717: {'lr': 0.0004646075783182181, 'samples': 4463616, 'steps': 8717, 'batch_loss/train': 0.8359724176116288} 12/24/2021 03:08:03 - INFO - codeparrot_training - Step 8718: {'lr': 0.00046459949250391656, 'samples': 4464128, 'steps': 8718, 'batch_loss/train': 1.3774912524968386} 12/24/2021 03:08:14 - INFO - codeparrot_training - Step 8719: {'lr': 0.0004645914058364484, 'samples': 4464640, 'steps': 8719, 'batch_loss/train': 1.2062365487217903} 12/24/2021 03:08:24 - INFO - codeparrot_training - Step 8720: {'lr': 0.00046458331831584587, 'samples': 4465152, 'steps': 8720, 'batch_loss/train': 1.1235332693904638} 12/24/2021 03:08:37 - INFO - codeparrot_training - Step 8721: {'lr': 0.0004645752299421411, 'samples': 4465664, 'steps': 8721, 'batch_loss/train': 1.817296864464879} 12/24/2021 03:08:48 - INFO - codeparrot_training - Step 8722: {'lr': 0.00046456714071536627, 'samples': 4466176, 'steps': 8722, 'batch_loss/train': 0.9044035542756319} 12/24/2021 03:08:58 - 
INFO - codeparrot_training - Step 8723: {'lr': 0.0004645590506355535, 'samples': 4466688, 'steps': 8723, 'batch_loss/train': 0.7795183141715825} 12/24/2021 03:09:09 - INFO - codeparrot_training - Step 8724: {'lr': 0.00046455095970273486, 'samples': 4467200, 'steps': 8724, 'batch_loss/train': 0.910062447655946} 12/24/2021 03:09:21 - INFO - codeparrot_training - Step 8725: {'lr': 0.0004645428679169427, 'samples': 4467712, 'steps': 8725, 'batch_loss/train': 0.9166821904946119} 12/24/2021 03:09:32 - INFO - codeparrot_training - Step 8726: {'lr': 0.0004645347752782091, 'samples': 4468224, 'steps': 8726, 'batch_loss/train': 0.9967179941013455} 12/24/2021 03:09:42 - INFO - codeparrot_training - Step 8727: {'lr': 0.00046452668178656627, 'samples': 4468736, 'steps': 8727, 'batch_loss/train': 0.7910769367590547} 12/24/2021 03:09:54 - INFO - codeparrot_training - Step 8728: {'lr': 0.0004645185874420463, 'samples': 4469248, 'steps': 8728, 'batch_loss/train': 0.8847488630563021} 12/24/2021 03:10:05 - INFO - codeparrot_training - Step 8729: {'lr': 0.00046451049224468144, 'samples': 4469760, 'steps': 8729, 'batch_loss/train': 0.971944629214704} 12/24/2021 03:10:16 - INFO - codeparrot_training - Step 8730: {'lr': 0.00046450239619450385, 'samples': 4470272, 'steps': 8730, 'batch_loss/train': 0.9255290059372783} 12/24/2021 03:10:28 - INFO - codeparrot_training - Step 8731: {'lr': 0.00046449429929154573, 'samples': 4470784, 'steps': 8731, 'batch_loss/train': 0.7932136543095112} 12/24/2021 03:10:39 - INFO - codeparrot_training - Step 8732: {'lr': 0.0004644862015358393, 'samples': 4471296, 'steps': 8732, 'batch_loss/train': 1.0544119561091065} 12/24/2021 03:10:50 - INFO - codeparrot_training - Step 8733: {'lr': 0.00046447810292741664, 'samples': 4471808, 'steps': 8733, 'batch_loss/train': 0.8588859485462308} 12/24/2021 03:11:00 - INFO - codeparrot_training - Step 8734: {'lr': 0.0004644700034663101, 'samples': 4472320, 'steps': 8734, 'batch_loss/train': 0.851445596665144} 12/24/2021 
03:11:13 - INFO - codeparrot_training - Step 8735: {'lr': 0.00046446190315255177, 'samples': 4472832, 'steps': 8735, 'batch_loss/train': 0.8246026216074824} 12/24/2021 03:11:23 - INFO - codeparrot_training - Step 8736: {'lr': 0.00046445380198617396, 'samples': 4473344, 'steps': 8736, 'batch_loss/train': 0.863140850327909} 12/24/2021 03:11:34 - INFO - codeparrot_training - Step 8737: {'lr': 0.0004644456999672088, 'samples': 4473856, 'steps': 8737, 'batch_loss/train': 0.8961755707859993} 12/24/2021 03:11:46 - INFO - codeparrot_training - Step 8738: {'lr': 0.0004644375970956884, 'samples': 4474368, 'steps': 8738, 'batch_loss/train': 1.0846617463976145} 12/24/2021 03:11:57 - INFO - codeparrot_training - Step 8739: {'lr': 0.0004644294933716451, 'samples': 4474880, 'steps': 8739, 'batch_loss/train': 0.8592134416103363} 12/24/2021 03:12:07 - INFO - codeparrot_training - Step 8740: {'lr': 0.0004644213887951112, 'samples': 4475392, 'steps': 8740, 'batch_loss/train': 0.9617668176069856} 12/24/2021 03:12:19 - INFO - codeparrot_training - Step 8741: {'lr': 0.00046441328336611876, 'samples': 4475904, 'steps': 8741, 'batch_loss/train': 0.8450288865715265} 12/24/2021 03:12:30 - INFO - codeparrot_training - Step 8742: {'lr': 0.0004644051770847001, 'samples': 4476416, 'steps': 8742, 'batch_loss/train': 0.9009667420759797} 12/24/2021 03:12:41 - INFO - codeparrot_training - Step 8743: {'lr': 0.0004643970699508874, 'samples': 4476928, 'steps': 8743, 'batch_loss/train': 0.839670523069799} 12/24/2021 03:12:51 - INFO - codeparrot_training - Step 8744: {'lr': 0.0004643889619647128, 'samples': 4477440, 'steps': 8744, 'batch_loss/train': 0.8904195073992014} 12/24/2021 03:13:04 - INFO - codeparrot_training - Step 8745: {'lr': 0.00046438085312620875, 'samples': 4477952, 'steps': 8745, 'batch_loss/train': 0.8898492874577641} 12/24/2021 03:13:15 - INFO - codeparrot_training - Step 8746: {'lr': 0.00046437274343540737, 'samples': 4478464, 'steps': 8746, 'batch_loss/train': 0.8675829898566008} 
12/24/2021 03:13:26 - INFO - codeparrot_training - Step 8747: {'lr': 0.0004643646328923409, 'samples': 4478976, 'steps': 8747, 'batch_loss/train': 0.8096792092546821} 12/24/2021 03:13:38 - INFO - codeparrot_training - Step 8748: {'lr': 0.00046435652149704154, 'samples': 4479488, 'steps': 8748, 'batch_loss/train': 1.59294862463139} 12/24/2021 03:13:48 - INFO - codeparrot_training - Step 8749: {'lr': 0.00046434840924954165, 'samples': 4480000, 'steps': 8749, 'batch_loss/train': 0.8984662555158138} 12/24/2021 03:13:59 - INFO - codeparrot_training - Step 8750: {'lr': 0.0004643402961498734, 'samples': 4480512, 'steps': 8750, 'batch_loss/train': 0.9195191441103816} 12/24/2021 03:14:09 - INFO - codeparrot_training - Step 8751: {'lr': 0.00046433218219806905, 'samples': 4481024, 'steps': 8751, 'batch_loss/train': 0.6760954447090626} 12/24/2021 03:14:22 - INFO - codeparrot_training - Step 8752: {'lr': 0.0004643240673941609, 'samples': 4481536, 'steps': 8752, 'batch_loss/train': 1.4803313156589866} 12/24/2021 03:14:33 - INFO - codeparrot_training - Step 8753: {'lr': 0.0004643159517381811, 'samples': 4482048, 'steps': 8753, 'batch_loss/train': 0.728548678336665} 12/24/2021 03:14:44 - INFO - codeparrot_training - Step 8754: {'lr': 0.00046430783523016206, 'samples': 4482560, 'steps': 8754, 'batch_loss/train': 0.7391390280099586} 12/24/2021 03:14:56 - INFO - codeparrot_training - Step 8755: {'lr': 0.00046429971787013605, 'samples': 4483072, 'steps': 8755, 'batch_loss/train': 0.9367787227965891} 12/24/2021 03:15:06 - INFO - codeparrot_training - Step 8756: {'lr': 0.0004642915996581352, 'samples': 4483584, 'steps': 8756, 'batch_loss/train': 0.8725891336798668} 12/24/2021 03:15:17 - INFO - codeparrot_training - Step 8757: {'lr': 0.0004642834805941919, 'samples': 4484096, 'steps': 8757, 'batch_loss/train': 0.9133130731061101} 12/24/2021 03:15:29 - INFO - codeparrot_training - Step 8758: {'lr': 0.0004642753606783383, 'samples': 4484608, 'steps': 8758, 'batch_loss/train': 
0.9418090875260532} 12/24/2021 03:15:39 - INFO - codeparrot_training - Step 8759: {'lr': 0.0004642672399106069, 'samples': 4485120, 'steps': 8759, 'batch_loss/train': 0.8654315411113203} 12/24/2021 03:15:50 - INFO - codeparrot_training - Step 8760: {'lr': 0.0004642591182910297, 'samples': 4485632, 'steps': 8760, 'batch_loss/train': 0.8405152950435877} 12/24/2021 03:16:01 - INFO - codeparrot_training - Step 8761: {'lr': 0.0004642509958196393, 'samples': 4486144, 'steps': 8761, 'batch_loss/train': 0.7481793803162873} 12/24/2021 03:16:13 - INFO - codeparrot_training - Step 8762: {'lr': 0.0004642428724964678, 'samples': 4486656, 'steps': 8762, 'batch_loss/train': 0.8329678564332426} 12/24/2021 03:16:24 - INFO - codeparrot_training - Step 8763: {'lr': 0.0004642347483215475, 'samples': 4487168, 'steps': 8763, 'batch_loss/train': 0.875232644379139} 12/24/2021 03:16:34 - INFO - codeparrot_training - Step 8764: {'lr': 0.0004642266232949108, 'samples': 4487680, 'steps': 8764, 'batch_loss/train': 1.0197290629148483} 12/24/2021 03:16:46 - INFO - codeparrot_training - Step 8765: {'lr': 0.0004642184974165898, 'samples': 4488192, 'steps': 8765, 'batch_loss/train': 0.787044167984277} 12/24/2021 03:16:57 - INFO - codeparrot_training - Step 8766: {'lr': 0.0004642103706866171, 'samples': 4488704, 'steps': 8766, 'batch_loss/train': 0.9022125396877527} 12/24/2021 03:17:08 - INFO - codeparrot_training - Step 8767: {'lr': 0.0004642022431050248, 'samples': 4489216, 'steps': 8767, 'batch_loss/train': 0.8500024667009711} 12/24/2021 03:17:20 - INFO - codeparrot_training - Step 8768: {'lr': 0.00046419411467184523, 'samples': 4489728, 'steps': 8768, 'batch_loss/train': 1.0670351954177022} 12/24/2021 03:17:31 - INFO - codeparrot_training - Step 8769: {'lr': 0.0004641859853871107, 'samples': 4490240, 'steps': 8769, 'batch_loss/train': 0.8741497539449483} 12/24/2021 03:17:41 - INFO - codeparrot_training - Step 8770: {'lr': 0.00046417785525085367, 'samples': 4490752, 'steps': 8770, 
'batch_loss/train': 0.7944830385968089} 12/24/2021 03:17:52 - INFO - codeparrot_training - Step 8771: {'lr': 0.0004641697242631062, 'samples': 4491264, 'steps': 8771, 'batch_loss/train': 0.8769136755727232} 12/24/2021 03:18:04 - INFO - codeparrot_training - Step 8772: {'lr': 0.0004641615924239009, 'samples': 4491776, 'steps': 8772, 'batch_loss/train': 0.9062377642840147} 12/24/2021 03:18:15 - INFO - codeparrot_training - Step 8773: {'lr': 0.0004641534597332699, 'samples': 4492288, 'steps': 8773, 'batch_loss/train': 0.8679818445816636} 12/24/2021 03:18:26 - INFO - codeparrot_training - Step 8774: {'lr': 0.00046414532619124574, 'samples': 4492800, 'steps': 8774, 'batch_loss/train': 0.7936266176402569} 12/24/2021 03:18:40 - INFO - codeparrot_training - Step 8775: {'lr': 0.0004641371917978605, 'samples': 4493312, 'steps': 8775, 'batch_loss/train': 0.8528555482625961} 12/24/2021 03:18:51 - INFO - codeparrot_training - Step 8776: {'lr': 0.0004641290565531467, 'samples': 4493824, 'steps': 8776, 'batch_loss/train': 0.8593984488397837} 12/24/2021 03:19:01 - INFO - codeparrot_training - Step 8777: {'lr': 0.0004641209204571366, 'samples': 4494336, 'steps': 8777, 'batch_loss/train': 0.6462399554438889} 12/24/2021 03:19:13 - INFO - codeparrot_training - Step 8778: {'lr': 0.00046411278350986254, 'samples': 4494848, 'steps': 8778, 'batch_loss/train': 0.9266832396388054} 12/24/2021 03:19:24 - INFO - codeparrot_training - Step 8779: {'lr': 0.0004641046457113569, 'samples': 4495360, 'steps': 8779, 'batch_loss/train': 0.7620758507400751} 12/24/2021 03:19:35 - INFO - codeparrot_training - Step 8780: {'lr': 0.00046409650706165205, 'samples': 4495872, 'steps': 8780, 'batch_loss/train': 0.8899734104052186} 12/24/2021 03:19:45 - INFO - codeparrot_training - Step 8781: {'lr': 0.0004640883675607804, 'samples': 4496384, 'steps': 8781, 'batch_loss/train': 0.9472109284251928} 12/24/2021 03:19:57 - INFO - codeparrot_training - Step 8782: {'lr': 0.0004640802272087741, 'samples': 4496896, 
'steps': 8782, 'batch_loss/train': 1.1906334517989308} 12/24/2021 03:20:08 - INFO - codeparrot_training - Step 8783: {'lr': 0.00046407208600566575, 'samples': 4497408, 'steps': 8783, 'batch_loss/train': 0.8494095215573907} 12/24/2021 03:20:18 - INFO - codeparrot_training - Step 8784: {'lr': 0.0004640639439514876, 'samples': 4497920, 'steps': 8784, 'batch_loss/train': 0.8072351976297796} 12/24/2021 03:20:32 - INFO - codeparrot_training - Step 8785: {'lr': 0.00046405580104627196, 'samples': 4498432, 'steps': 8785, 'batch_loss/train': 0.8638163404539227} 12/24/2021 03:20:43 - INFO - codeparrot_training - Step 8786: {'lr': 0.00046404765729005136, 'samples': 4498944, 'steps': 8786, 'batch_loss/train': 0.8772126974072307} 12/24/2021 03:20:54 - INFO - codeparrot_training - Step 8787: {'lr': 0.0004640395126828581, 'samples': 4499456, 'steps': 8787, 'batch_loss/train': 0.9148728786967695} 12/24/2021 03:21:06 - INFO - codeparrot_training - Step 8788: {'lr': 0.00046403136722472454, 'samples': 4499968, 'steps': 8788, 'batch_loss/train': 0.7689257590100169} 12/24/2021 03:21:16 - INFO - codeparrot_training - Step 8789: {'lr': 0.00046402322091568303, 'samples': 4500480, 'steps': 8789, 'batch_loss/train': 0.7843177895992994} 12/24/2021 03:21:27 - INFO - codeparrot_training - Step 8790: {'lr': 0.0004640150737557661, 'samples': 4500992, 'steps': 8790, 'batch_loss/train': 1.0614080233499408} 12/24/2021 03:21:37 - INFO - codeparrot_training - Step 8791: {'lr': 0.00046400692574500603, 'samples': 4501504, 'steps': 8791, 'batch_loss/train': 0.8858222151175141} 12/24/2021 03:21:51 - INFO - codeparrot_training - Step 8792: {'lr': 0.00046399877688343516, 'samples': 4502016, 'steps': 8792, 'batch_loss/train': 1.3182914275676012} 12/24/2021 03:22:02 - INFO - codeparrot_training - Step 8793: {'lr': 0.00046399062717108596, 'samples': 4502528, 'steps': 8793, 'batch_loss/train': 0.6181916686473414} 12/24/2021 03:22:12 - INFO - codeparrot_training - Step 8794: {'lr': 0.00046398247660799086, 
'samples': 4503040, 'steps': 8794, 'batch_loss/train': 0.7594907565508038} 12/24/2021 03:22:24 - INFO - codeparrot_training - Step 8795: {'lr': 0.00046397432519418226, 'samples': 4503552, 'steps': 8795, 'batch_loss/train': 0.9061881923116744} 12/24/2021 03:22:35 - INFO - codeparrot_training - Step 8796: {'lr': 0.00046396617292969244, 'samples': 4504064, 'steps': 8796, 'batch_loss/train': 0.806892616674304} 12/24/2021 03:22:46 - INFO - codeparrot_training - Step 8797: {'lr': 0.000463958019814554, 'samples': 4504576, 'steps': 8797, 'batch_loss/train': 1.5217128731310368} 12/24/2021 03:22:58 - INFO - codeparrot_training - Step 8798: {'lr': 0.00046394986584879925, 'samples': 4505088, 'steps': 8798, 'batch_loss/train': 0.9749059053137898} 12/24/2021 03:23:08 - INFO - codeparrot_training - Step 8799: {'lr': 0.00046394171103246055, 'samples': 4505600, 'steps': 8799, 'batch_loss/train': 0.8970692695584148} 12/24/2021 03:23:19 - INFO - codeparrot_training - Step 8800: {'lr': 0.00046393355536557046, 'samples': 4506112, 'steps': 8800, 'batch_loss/train': 0.7898524701595306} 12/24/2021 03:23:33 - INFO - codeparrot_training - Step 8801: {'lr': 0.0004639253988481613, 'samples': 4506624, 'steps': 8801, 'batch_loss/train': 0.8239128990098834} 12/24/2021 03:23:43 - INFO - codeparrot_training - Step 8802: {'lr': 0.0004639172414802655, 'samples': 4507136, 'steps': 8802, 'batch_loss/train': 0.9277879884466529} 12/24/2021 03:23:54 - INFO - codeparrot_training - Step 8803: {'lr': 0.0004639090832619156, 'samples': 4507648, 'steps': 8803, 'batch_loss/train': 0.8597593931481242} 12/24/2021 03:24:04 - INFO - codeparrot_training - Step 8804: {'lr': 0.0004639009241931439, 'samples': 4508160, 'steps': 8804, 'batch_loss/train': 0.8198388547170907} 12/24/2021 03:24:16 - INFO - codeparrot_training - Step 8805: {'lr': 0.0004638927642739829, 'samples': 4508672, 'steps': 8805, 'batch_loss/train': 0.8529989719390869} 12/24/2021 03:24:27 - INFO - codeparrot_training - Step 8806: {'lr': 
0.00046388460350446506, 'samples': 4509184, 'steps': 8806, 'batch_loss/train': 0.8047367678955197} 12/24/2021 03:24:38 - INFO - codeparrot_training - Step 8807: {'lr': 0.00046387644188462275, 'samples': 4509696, 'steps': 8807, 'batch_loss/train': 0.8485430041328073} 12/24/2021 03:24:50 - INFO - codeparrot_training - Step 8808: {'lr': 0.0004638682794144885, 'samples': 4510208, 'steps': 8808, 'batch_loss/train': 0.5780602286104113} 12/24/2021 03:25:00 - INFO - codeparrot_training - Step 8809: {'lr': 0.00046386011609409475, 'samples': 4510720, 'steps': 8809, 'batch_loss/train': 0.846978253684938} 12/24/2021 03:25:11 - INFO - codeparrot_training - Step 8810: {'lr': 0.00046385195192347384, 'samples': 4511232, 'steps': 8810, 'batch_loss/train': 0.7800920596346259} 12/24/2021 03:25:25 - INFO - codeparrot_training - Step 8811: {'lr': 0.0004638437869026584, 'samples': 4511744, 'steps': 8811, 'batch_loss/train': 0.7657267288304865} 12/24/2021 03:25:36 - INFO - codeparrot_training - Step 8812: {'lr': 0.00046383562103168073, 'samples': 4512256, 'steps': 8812, 'batch_loss/train': 0.8230949277058244} 12/24/2021 03:25:46 - INFO - codeparrot_training - Step 8813: {'lr': 0.00046382745431057336, 'samples': 4512768, 'steps': 8813, 'batch_loss/train': 0.9511232411023229} 12/24/2021 03:25:57 - INFO - codeparrot_training - Step 8814: {'lr': 0.0004638192867393689, 'samples': 4513280, 'steps': 8814, 'batch_loss/train': 0.8181953669991344} 12/24/2021 03:26:09 - INFO - codeparrot_training - Step 8815: {'lr': 0.0004638111183180996, 'samples': 4513792, 'steps': 8815, 'batch_loss/train': 0.8376346696168184} 12/24/2021 03:26:20 - INFO - codeparrot_training - Step 8816: {'lr': 0.000463802949046798, 'samples': 4514304, 'steps': 8816, 'batch_loss/train': 0.7536187642253935} 12/24/2021 03:26:30 - INFO - codeparrot_training - Step 8817: {'lr': 0.00046379477892549665, 'samples': 4514816, 'steps': 8817, 'batch_loss/train': 0.8962844517081976} 12/24/2021 03:26:42 - INFO - codeparrot_training - Step 
8818: {'lr': 0.000463786607954228, 'samples': 4515328, 'steps': 8818, 'batch_loss/train': 0.8157550846226513} 12/24/2021 03:26:53 - INFO - codeparrot_training - Step 8819: {'lr': 0.00046377843613302453, 'samples': 4515840, 'steps': 8819, 'batch_loss/train': 0.8816795134916902} 12/24/2021 03:27:04 - INFO - codeparrot_training - Step 8820: {'lr': 0.00046377026346191864, 'samples': 4516352, 'steps': 8820, 'batch_loss/train': 0.8357441844418645} 12/24/2021 03:27:16 - INFO - codeparrot_training - Step 8821: {'lr': 0.0004637620899409429, 'samples': 4516864, 'steps': 8821, 'batch_loss/train': 0.8494358663447201} 12/24/2021 03:27:26 - INFO - codeparrot_training - Step 8822: {'lr': 0.00046375391557012993, 'samples': 4517376, 'steps': 8822, 'batch_loss/train': 0.7649758821353316} 12/24/2021 03:27:37 - INFO - codeparrot_training - Step 8823: {'lr': 0.000463745740349512, 'samples': 4517888, 'steps': 8823, 'batch_loss/train': 0.8552669247146696} 12/24/2021 03:27:47 - INFO - codeparrot_training - Step 8824: {'lr': 0.00046373756427912173, 'samples': 4518400, 'steps': 8824, 'batch_loss/train': 0.7296106601133943} 12/24/2021 03:28:01 - INFO - codeparrot_training - Step 8825: {'lr': 0.00046372938735899163, 'samples': 4518912, 'steps': 8825, 'batch_loss/train': 0.9994352972134948} 12/24/2021 03:28:12 - INFO - codeparrot_training - Step 8826: {'lr': 0.0004637212095891541, 'samples': 4519424, 'steps': 8826, 'batch_loss/train': 0.7862973464652896} 12/24/2021 03:28:22 - INFO - codeparrot_training - Step 8827: {'lr': 0.00046371303096964176, 'samples': 4519936, 'steps': 8827, 'batch_loss/train': 0.794243048876524} 12/24/2021 03:28:35 - INFO - codeparrot_training - Step 8828: {'lr': 0.00046370485150048726, 'samples': 4520448, 'steps': 8828, 'batch_loss/train': 0.8749634139239788} 12/24/2021 03:28:45 - INFO - codeparrot_training - Step 8829: {'lr': 0.0004636966711817228, 'samples': 4520960, 'steps': 8829, 'batch_loss/train': 0.7743899207562208} 12/24/2021 03:28:56 - INFO - 
codeparrot_training - Step 8830: {'lr': 0.0004636884900133811, 'samples': 4521472, 'steps': 8830, 'batch_loss/train': 0.6223523358348757} 12/24/2021 03:29:10 - INFO - codeparrot_training - Step 8831: {'lr': 0.00046368030799549473, 'samples': 4521984, 'steps': 8831, 'batch_loss/train': 1.0111778769642115} 12/24/2021 03:29:20 - INFO - codeparrot_training - Step 8832: {'lr': 0.000463672125128096, 'samples': 4522496, 'steps': 8832, 'batch_loss/train': 0.742891127243638} 12/24/2021 03:29:31 - INFO - codeparrot_training - Step 8833: {'lr': 0.0004636639414112177, 'samples': 4523008, 'steps': 8833, 'batch_loss/train': 0.9521155529655516} 12/24/2021 03:29:41 - INFO - codeparrot_training - Step 8834: {'lr': 0.0004636557568448922, 'samples': 4523520, 'steps': 8834, 'batch_loss/train': 0.8030715314671397} 12/24/2021 03:29:54 - INFO - codeparrot_training - Step 8835: {'lr': 0.000463647571429152, 'samples': 4524032, 'steps': 8835, 'batch_loss/train': 1.0553337475284934} 12/24/2021 03:30:04 - INFO - codeparrot_training - Step 8836: {'lr': 0.00046363938516402986, 'samples': 4524544, 'steps': 8836, 'batch_loss/train': 0.7211488373577595} 12/24/2021 03:30:15 - INFO - codeparrot_training - Step 8837: {'lr': 0.0004636311980495582, 'samples': 4525056, 'steps': 8837, 'batch_loss/train': 0.9132841154932976} 12/24/2021 03:30:27 - INFO - codeparrot_training - Step 8838: {'lr': 0.00046362301008576943, 'samples': 4525568, 'steps': 8838, 'batch_loss/train': 0.8481631595641375} 12/24/2021 03:30:37 - INFO - codeparrot_training - Step 8839: {'lr': 0.00046361482127269633, 'samples': 4526080, 'steps': 8839, 'batch_loss/train': 0.8408632259815931} 12/24/2021 03:30:48 - INFO - codeparrot_training - Step 8840: {'lr': 0.0004636066316103713, 'samples': 4526592, 'steps': 8840, 'batch_loss/train': 0.7860144283622503} 12/24/2021 03:31:02 - INFO - codeparrot_training - Step 8841: {'lr': 0.000463598441098827, 'samples': 4527104, 'steps': 8841, 'batch_loss/train': 0.7993230810388923} 12/24/2021 03:31:12 - 
INFO - codeparrot_training - Step 8842: {'lr': 0.0004635902497380959, 'samples': 4527616, 'steps': 8842, 'batch_loss/train': 0.7816945483209565} 12/24/2021 03:31:23 - INFO - codeparrot_training - Step 8843: {'lr': 0.0004635820575282107, 'samples': 4528128, 'steps': 8843, 'batch_loss/train': 0.8531200774013996} 12/24/2021 03:31:34 - INFO - codeparrot_training - Step 8844: {'lr': 0.0004635738644692039, 'samples': 4528640, 'steps': 8844, 'batch_loss/train': 0.8151991218328476} 12/24/2021 03:31:46 - INFO - codeparrot_training - Step 8845: {'lr': 0.000463565670561108, 'samples': 4529152, 'steps': 8845, 'batch_loss/train': 0.776173073798418} 12/24/2021 03:31:56 - INFO - codeparrot_training - Step 8846: {'lr': 0.00046355747580395555, 'samples': 4529664, 'steps': 8846, 'batch_loss/train': 0.7742609686683863} 12/24/2021 03:32:07 - INFO - codeparrot_training - Step 8847: {'lr': 0.0004635492801977793, 'samples': 4530176, 'steps': 8847, 'batch_loss/train': 0.7707086142618209} 12/24/2021 03:32:19 - INFO - codeparrot_training - Step 8848: {'lr': 0.00046354108374261173, 'samples': 4530688, 'steps': 8848, 'batch_loss/train': 0.9290886214002967} 12/24/2021 03:32:30 - INFO - codeparrot_training - Step 8849: {'lr': 0.00046353288643848545, 'samples': 4531200, 'steps': 8849, 'batch_loss/train': 0.7190767627907917} 12/24/2021 03:32:40 - INFO - codeparrot_training - Step 8850: {'lr': 0.0004635246882854331, 'samples': 4531712, 'steps': 8850, 'batch_loss/train': 0.7695809205761179} 12/24/2021 03:32:52 - INFO - codeparrot_training - Step 8851: {'lr': 0.0004635164892834871, 'samples': 4532224, 'steps': 8851, 'batch_loss/train': 0.7628336800262332} 12/24/2021 03:33:03 - INFO - codeparrot_training - Step 8852: {'lr': 0.00046350828943268025, 'samples': 4532736, 'steps': 8852, 'batch_loss/train': 1.147463764064014} 12/24/2021 03:33:14 - INFO - codeparrot_training - Step 8853: {'lr': 0.00046350008873304504, 'samples': 4533248, 'steps': 8853, 'batch_loss/train': 0.8490530755370855} 12/24/2021 
03:33:27 - INFO - codeparrot_training - Step 8854: {'lr': 0.00046349188718461403, 'samples': 4533760, 'steps': 8854, 'batch_loss/train': 0.8453058982267976} 12/24/2021 03:33:38 - INFO - codeparrot_training - Step 8855: {'lr': 0.00046348368478741996, 'samples': 4534272, 'steps': 8855, 'batch_loss/train': 0.8560236068442464} 12/24/2021 03:33:49 - INFO - codeparrot_training - Step 8856: {'lr': 0.00046347548154149535, 'samples': 4534784, 'steps': 8856, 'batch_loss/train': 0.9043783778324723} 12/24/2021 03:33:59 - INFO - codeparrot_training - Step 8857: {'lr': 0.0004634672774468728, 'samples': 4535296, 'steps': 8857, 'batch_loss/train': 0.8274247944355011} 12/24/2021 03:34:11 - INFO - codeparrot_training - Step 8858: {'lr': 0.000463459072503585, 'samples': 4535808, 'steps': 8858, 'batch_loss/train': 0.8767741145566106} 12/24/2021 03:34:22 - INFO - codeparrot_training - Step 8859: {'lr': 0.0004634508667116645, 'samples': 4536320, 'steps': 8859, 'batch_loss/train': 0.9155483180657029} 12/24/2021 03:34:32 - INFO - codeparrot_training - Step 8860: {'lr': 0.0004634426600711439, 'samples': 4536832, 'steps': 8860, 'batch_loss/train': 0.8282479238696396} 12/24/2021 03:34:46 - INFO - codeparrot_training - Step 8861: {'lr': 0.00046343445258205596, 'samples': 4537344, 'steps': 8861, 'batch_loss/train': 0.8917607516050339} 12/24/2021 03:34:57 - INFO - codeparrot_training - Step 8862: {'lr': 0.0004634262442444332, 'samples': 4537856, 'steps': 8862, 'batch_loss/train': 0.8639433840289712} 12/24/2021 03:35:07 - INFO - codeparrot_training - Step 8863: {'lr': 0.00046341803505830825, 'samples': 4538368, 'steps': 8863, 'batch_loss/train': 0.8399404375813901} 12/24/2021 03:35:19 - INFO - codeparrot_training - Step 8864: {'lr': 0.0004634098250237138, 'samples': 4538880, 'steps': 8864, 'batch_loss/train': 0.718433934263885} 12/24/2021 03:35:30 - INFO - codeparrot_training - Step 8865: {'lr': 0.0004634016141406825, 'samples': 4539392, 'steps': 8865, 'batch_loss/train': 0.8633012110367417} 
12/24/2021 03:35:41 - INFO - codeparrot_training - Step 8866: {'lr': 0.00046339340240924686, 'samples': 4539904, 'steps': 8866, 'batch_loss/train': 0.8690292385872453} 12/24/2021 03:35:51 - INFO - codeparrot_training - Step 8867: {'lr': 0.0004633851898294398, 'samples': 4540416, 'steps': 8867, 'batch_loss/train': 0.8377538835629821} 12/24/2021 03:36:03 - INFO - codeparrot_training - Step 8868: {'lr': 0.00046337697640129363, 'samples': 4540928, 'steps': 8868, 'batch_loss/train': 0.793802690692246} 12/24/2021 03:36:14 - INFO - codeparrot_training - Step 8869: {'lr': 0.00046336876212484124, 'samples': 4541440, 'steps': 8869, 'batch_loss/train': 1.0002035219222307} 12/24/2021 03:36:24 - INFO - codeparrot_training - Step 8870: {'lr': 0.0004633605470001152, 'samples': 4541952, 'steps': 8870, 'batch_loss/train': 0.8166352650150657} 12/24/2021 03:36:38 - INFO - codeparrot_training - Step 8871: {'lr': 0.0004633523310271482, 'samples': 4542464, 'steps': 8871, 'batch_loss/train': 0.7757368157617748} 12/24/2021 03:36:49 - INFO - codeparrot_training - Step 8872: {'lr': 0.0004633441142059729, 'samples': 4542976, 'steps': 8872, 'batch_loss/train': 0.7990358397364616} 12/24/2021 03:37:00 - INFO - codeparrot_training - Step 8873: {'lr': 0.0004633358965366219, 'samples': 4543488, 'steps': 8873, 'batch_loss/train': 0.8355347928591073} 12/24/2021 03:37:12 - INFO - codeparrot_training - Step 8874: {'lr': 0.000463327678019128, 'samples': 4544000, 'steps': 8874, 'batch_loss/train': 0.7704093465581536} 12/24/2021 03:37:22 - INFO - codeparrot_training - Step 8875: {'lr': 0.00046331945865352376, 'samples': 4544512, 'steps': 8875, 'batch_loss/train': 0.8174531992990524} 12/24/2021 03:37:33 - INFO - codeparrot_training - Step 8876: {'lr': 0.0004633112384398419, 'samples': 4545024, 'steps': 8876, 'batch_loss/train': 0.8491127546876669} 12/24/2021 03:37:47 - INFO - codeparrot_training - Step 8877: {'lr': 0.0004633030173781151, 'samples': 4545536, 'steps': 8877, 'batch_loss/train': 
0.8177906285272911} 12/24/2021 03:37:58 - INFO - codeparrot_training - Step 8878: {'lr': 0.00046329479546837606, 'samples': 4546048, 'steps': 8878, 'batch_loss/train': 0.7958224462345243} 12/24/2021 03:38:09 - INFO - codeparrot_training - Step 8879: {'lr': 0.0004632865727106574, 'samples': 4546560, 'steps': 8879, 'batch_loss/train': 0.882681105285883} 12/24/2021 03:38:19 - INFO - codeparrot_training - Step 8880: {'lr': 0.0004632783491049919, 'samples': 4547072, 'steps': 8880, 'batch_loss/train': 0.8539415467530489} 12/24/2021 03:38:31 - INFO - codeparrot_training - Step 8881: {'lr': 0.0004632701246514123, 'samples': 4547584, 'steps': 8881, 'batch_loss/train': 1.1860233470797539} 12/24/2021 03:38:42 - INFO - codeparrot_training - Step 8882: {'lr': 0.00046326189934995103, 'samples': 4548096, 'steps': 8882, 'batch_loss/train': 0.8333166874945164} 12/24/2021 03:38:53 - INFO - codeparrot_training - Step 8883: {'lr': 0.0004632536732006412, 'samples': 4548608, 'steps': 8883, 'batch_loss/train': 0.8025902253575623} 12/24/2021 03:39:05 - INFO - codeparrot_training - Step 8884: {'lr': 0.00046324544620351504, 'samples': 4549120, 'steps': 8884, 'batch_loss/train': 0.8697914564982057} 12/24/2021 03:39:15 - INFO - codeparrot_training - Step 8885: {'lr': 0.00046323721835860563, 'samples': 4549632, 'steps': 8885, 'batch_loss/train': 0.8125324384309351} 12/24/2021 03:39:26 - INFO - codeparrot_training - Step 8886: {'lr': 0.00046322898966594554, 'samples': 4550144, 'steps': 8886, 'batch_loss/train': 0.7701281881891191} 12/24/2021 03:39:38 - INFO - codeparrot_training - Step 8887: {'lr': 0.00046322076012556755, 'samples': 4550656, 'steps': 8887, 'batch_loss/train': 0.8179369773715734} 12/24/2021 03:39:48 - INFO - codeparrot_training - Step 8888: {'lr': 0.00046321252973750425, 'samples': 4551168, 'steps': 8888, 'batch_loss/train': 0.8755275420844555} 12/24/2021 03:39:59 - INFO - codeparrot_training - Step 8889: {'lr': 0.0004632042985017885, 'samples': 4551680, 'steps': 8889, 
'batch_loss/train': 0.7677184098865837} 12/24/2021 03:40:10 - INFO - codeparrot_training - Step 8890: {'lr': 0.0004631960664184529, 'samples': 4552192, 'steps': 8890, 'batch_loss/train': 0.9062084285542369} 12/24/2021 03:40:23 - INFO - codeparrot_training - Step 8891: {'lr': 0.0004631878334875303, 'samples': 4552704, 'steps': 8891, 'batch_loss/train': 0.8485051393508911} 12/24/2021 03:40:34 - INFO - codeparrot_training - Step 8892: {'lr': 0.00046317959970905334, 'samples': 4553216, 'steps': 8892, 'batch_loss/train': 0.7954682092531584} 12/24/2021 03:40:45 - INFO - codeparrot_training - Step 8893: {'lr': 0.0004631713650830548, 'samples': 4553728, 'steps': 8893, 'batch_loss/train': 0.9395768390968442} 12/24/2021 03:40:58 - INFO - codeparrot_training - Step 8894: {'lr': 0.00046316312960956743, 'samples': 4554240, 'steps': 8894, 'batch_loss/train': 0.8306060368195176} 12/24/2021 03:41:08 - INFO - codeparrot_training - Step 8895: {'lr': 0.0004631548932886238, 'samples': 4554752, 'steps': 8895, 'batch_loss/train': 0.8710900824517012} 12/24/2021 03:41:19 - INFO - codeparrot_training - Step 8896: {'lr': 0.000463146656120257, 'samples': 4555264, 'steps': 8896, 'batch_loss/train': 1.2367287422530353} 12/24/2021 03:41:29 - INFO - codeparrot_training - Step 8897: {'lr': 0.0004631384181044994, 'samples': 4555776, 'steps': 8897, 'batch_loss/train': 0.8649364095181227} 12/24/2021 03:41:41 - INFO - codeparrot_training - Step 8898: {'lr': 0.000463130179241384, 'samples': 4556288, 'steps': 8898, 'batch_loss/train': 0.8685278436169028} 12/24/2021 03:41:52 - INFO - codeparrot_training - Step 8899: {'lr': 0.0004631219395309435, 'samples': 4556800, 'steps': 8899, 'batch_loss/train': 0.810417260043323} 12/24/2021 03:42:03 - INFO - codeparrot_training - Step 8900: {'lr': 0.00046311369897321066, 'samples': 4557312, 'steps': 8900, 'batch_loss/train': 0.8287769196322188} 12/24/2021 03:42:16 - INFO - codeparrot_training - Step 8901: {'lr': 0.0004631054575682182, 'samples': 4557824, 'steps': 
8901, 'batch_loss/train': 0.8040247047320008} 12/24/2021 03:42:27 - INFO - codeparrot_training - Step 8902: {'lr': 0.0004630972153159989, 'samples': 4558336, 'steps': 8902, 'batch_loss/train': 0.8196104876697063} 12/24/2021 03:42:38 - INFO - codeparrot_training - Step 8903: {'lr': 0.0004630889722165855, 'samples': 4558848, 'steps': 8903, 'batch_loss/train': 0.8792536156252027} 12/24/2021 03:42:50 - INFO - codeparrot_training - Step 8904: {'lr': 0.0004630807282700109, 'samples': 4559360, 'steps': 8904, 'batch_loss/train': 0.838012348394841} 12/24/2021 03:43:00 - INFO - codeparrot_training - Step 8905: {'lr': 0.0004630724834763077, 'samples': 4559872, 'steps': 8905, 'batch_loss/train': 0.7521625706576742} 12/24/2021 03:43:11 - INFO - codeparrot_training - Step 8906: {'lr': 0.0004630642378355088, 'samples': 4560384, 'steps': 8906, 'batch_loss/train': 0.9289271887391806} 12/24/2021 03:43:22 - INFO - codeparrot_training - Step 8907: {'lr': 0.00046305599134764685, 'samples': 4560896, 'steps': 8907, 'batch_loss/train': 1.6611131145618856} 12/24/2021 03:43:36 - INFO - codeparrot_training - Step 8908: {'lr': 0.00046304774401275485, 'samples': 4561408, 'steps': 8908, 'batch_loss/train': 0.8226766078732908} 12/24/2021 03:43:46 - INFO - codeparrot_training - Step 8909: {'lr': 0.00046303949583086535, 'samples': 4561920, 'steps': 8909, 'batch_loss/train': 1.0040516424924135} 12/24/2021 03:43:57 - INFO - codeparrot_training - Step 8910: {'lr': 0.00046303124680201137, 'samples': 4562432, 'steps': 8910, 'batch_loss/train': 0.8014153270050883} 12/24/2021 03:44:09 - INFO - codeparrot_training - Step 8911: {'lr': 0.0004630229969262255, 'samples': 4562944, 'steps': 8911, 'batch_loss/train': 0.8067196793854237} 12/24/2021 03:44:19 - INFO - codeparrot_training - Step 8912: {'lr': 0.00046301474620354067, 'samples': 4563456, 'steps': 8912, 'batch_loss/train': 0.7954771472141147} 12/24/2021 03:44:30 - INFO - codeparrot_training - Step 8913: {'lr': 0.0004630064946339896, 'samples': 4563968, 
'steps': 8913, 'batch_loss/train': 0.8797797134611756} 12/24/2021 03:44:42 - INFO - codeparrot_training - Step 8914: {'lr': 0.0004629982422176052, 'samples': 4564480, 'steps': 8914, 'batch_loss/train': 0.821965160779655} 12/24/2021 03:44:53 - INFO - codeparrot_training - Step 8915: {'lr': 0.0004629899889544201, 'samples': 4564992, 'steps': 8915, 'batch_loss/train': 0.9303118428215384} 12/24/2021 03:45:03 - INFO - codeparrot_training - Step 8916: {'lr': 0.0004629817348444674, 'samples': 4565504, 'steps': 8916, 'batch_loss/train': 0.7722420440986753} 12/24/2021 03:45:14 - INFO - codeparrot_training - Step 8917: {'lr': 0.0004629734798877796, 'samples': 4566016, 'steps': 8917, 'batch_loss/train': 0.7971955528482795} 12/24/2021 03:45:27 - INFO - codeparrot_training - Step 8918: {'lr': 0.00046296522408438974, 'samples': 4566528, 'steps': 8918, 'batch_loss/train': 1.0177656197920442} 12/24/2021 03:45:37 - INFO - codeparrot_training - Step 8919: {'lr': 0.00046295696743433044, 'samples': 4567040, 'steps': 8919, 'batch_loss/train': 0.808479679748416} 12/24/2021 03:45:48 - INFO - codeparrot_training - Step 8920: {'lr': 0.0004629487099376348, 'samples': 4567552, 'steps': 8920, 'batch_loss/train': 0.7913781711831689} 12/24/2021 03:46:00 - INFO - codeparrot_training - Step 8921: {'lr': 0.0004629404515943354, 'samples': 4568064, 'steps': 8921, 'batch_loss/train': 0.8224666947498918} 12/24/2021 03:46:10 - INFO - codeparrot_training - Step 8922: {'lr': 0.0004629321924044653, 'samples': 4568576, 'steps': 8922, 'batch_loss/train': 0.7932342877611518} 12/24/2021 03:46:21 - INFO - codeparrot_training - Step 8923: {'lr': 0.000462923932368057, 'samples': 4569088, 'steps': 8923, 'batch_loss/train': 0.8181796576827765} 12/24/2021 03:46:33 - INFO - codeparrot_training - Step 8924: {'lr': 0.0004629156714851437, 'samples': 4569600, 'steps': 8924, 'batch_loss/train': 0.8925583411473781} 12/24/2021 03:46:44 - INFO - codeparrot_training - Step 8925: {'lr': 0.000462907409755758, 'samples': 
4570112, 'steps': 8925, 'batch_loss/train': 0.886393565684557} 12/24/2021 03:46:54 - INFO - codeparrot_training - Step 8926: {'lr': 0.00046289914717993286, 'samples': 4570624, 'steps': 8926, 'batch_loss/train': 0.836228090338409} 12/24/2021 03:47:06 - INFO - codeparrot_training - Step 8927: {'lr': 0.00046289088375770106, 'samples': 4571136, 'steps': 8927, 'batch_loss/train': 0.9220543242990971} 12/24/2021 03:47:17 - INFO - codeparrot_training - Step 8928: {'lr': 0.00046288261948909556, 'samples': 4571648, 'steps': 8928, 'batch_loss/train': 0.808021760545671} 12/24/2021 03:47:27 - INFO - codeparrot_training - Step 8929: {'lr': 0.0004628743543741491, 'samples': 4572160, 'steps': 8929, 'batch_loss/train': 0.7920901505276561} 12/24/2021 03:47:38 - INFO - codeparrot_training - Step 8930: {'lr': 0.0004628660884128946, 'samples': 4572672, 'steps': 8930, 'batch_loss/train': 0.8148388769477606} 12/24/2021 03:47:51 - INFO - codeparrot_training - Step 8931: {'lr': 0.00046285782160536496, 'samples': 4573184, 'steps': 8931, 'batch_loss/train': 0.800593770109117} 12/24/2021 03:48:01 - INFO - codeparrot_training - Step 8932: {'lr': 0.0004628495539515929, 'samples': 4573696, 'steps': 8932, 'batch_loss/train': 0.7723029758781195} 12/24/2021 03:48:12 - INFO - codeparrot_training - Step 8933: {'lr': 0.0004628412854516114, 'samples': 4574208, 'steps': 8933, 'batch_loss/train': 0.8384164944291115} 12/24/2021 03:48:24 - INFO - codeparrot_training - Step 8934: {'lr': 0.00046283301610545335, 'samples': 4574720, 'steps': 8934, 'batch_loss/train': 0.7821415448561311} 12/24/2021 03:48:34 - INFO - codeparrot_training - Step 8935: {'lr': 0.0004628247459131516, 'samples': 4575232, 'steps': 8935, 'batch_loss/train': 0.8118915818631649} 12/24/2021 03:48:45 - INFO - codeparrot_training - Step 8936: {'lr': 0.000462816474874739, 'samples': 4575744, 'steps': 8936, 'batch_loss/train': 0.8008143082261086} 12/24/2021 03:48:58 - INFO - codeparrot_training - Step 8937: {'lr': 0.0004628082029902484, 
'samples': 4576256, 'steps': 8937, 'batch_loss/train': 0.678616359597072} 12/24/2021 03:49:08 - INFO - codeparrot_training - Step 8938: {'lr': 0.0004627999302597128, 'samples': 4576768, 'steps': 8938, 'batch_loss/train': 0.6810552128590643} 12/24/2021 03:49:19 - INFO - codeparrot_training - Step 8939: {'lr': 0.00046279165668316503, 'samples': 4577280, 'steps': 8939, 'batch_loss/train': 0.8138826068607159} 12/24/2021 03:49:30 - INFO - codeparrot_training - Step 8940: {'lr': 0.00046278338226063794, 'samples': 4577792, 'steps': 8940, 'batch_loss/train': 0.8348495848476887} 12/24/2021 03:49:42 - INFO - codeparrot_training - Step 8941: {'lr': 0.00046277510699216454, 'samples': 4578304, 'steps': 8941, 'batch_loss/train': 0.8506525959819555} 12/24/2021 03:49:52 - INFO - codeparrot_training - Step 8942: {'lr': 0.00046276683087777753, 'samples': 4578816, 'steps': 8942, 'batch_loss/train': 0.8473824872635305} 12/24/2021 03:50:03 - INFO - codeparrot_training - Step 8943: {'lr': 0.00046275855391751, 'samples': 4579328, 'steps': 8943, 'batch_loss/train': 0.8764035040512681} 12/24/2021 03:50:15 - INFO - codeparrot_training - Step 8944: {'lr': 0.00046275027611139487, 'samples': 4579840, 'steps': 8944, 'batch_loss/train': 0.7644228041172028} 12/24/2021 03:50:25 - INFO - codeparrot_training - Step 8945: {'lr': 0.00046274199745946487, 'samples': 4580352, 'steps': 8945, 'batch_loss/train': 0.7455387115478516} 12/24/2021 03:50:36 - INFO - codeparrot_training - Step 8946: {'lr': 0.00046273371796175304, 'samples': 4580864, 'steps': 8946, 'batch_loss/train': 0.9695305628702044} 12/24/2021 03:50:49 - INFO - codeparrot_training - Step 8947: {'lr': 0.00046272543761829234, 'samples': 4581376, 'steps': 8947, 'batch_loss/train': 0.8406647043302655} 12/24/2021 03:50:59 - INFO - codeparrot_training - Step 8948: {'lr': 0.00046271715642911556, 'samples': 4581888, 'steps': 8948, 'batch_loss/train': 0.8536031041294336} 12/24/2021 03:51:10 - INFO - codeparrot_training - Step 8949: {'lr': 
0.00046270887439425567, 'samples': 4582400, 'steps': 8949, 'batch_loss/train': 0.7957963952794671} 12/24/2021 03:51:22 - INFO - codeparrot_training - Step 8950: {'lr': 0.00046270059151374567, 'samples': 4582912, 'steps': 8950, 'batch_loss/train': 0.793576821219176} 12/24/2021 03:51:33 - INFO - codeparrot_training - Step 8951: {'lr': 0.00046269230778761837, 'samples': 4583424, 'steps': 8951, 'batch_loss/train': 0.8595886304974556} 12/24/2021 03:51:43 - INFO - codeparrot_training - Step 8952: {'lr': 0.0004626840232159068, 'samples': 4583936, 'steps': 8952, 'batch_loss/train': 0.6127372755436227} 12/24/2021 03:51:54 - INFO - codeparrot_training - Step 8953: {'lr': 0.00046267573779864386, 'samples': 4584448, 'steps': 8953, 'batch_loss/train': 0.762057647574693} 12/24/2021 03:52:06 - INFO - codeparrot_training - Step 8954: {'lr': 0.00046266745153586246, 'samples': 4584960, 'steps': 8954, 'batch_loss/train': 0.7570120298769325} 12/24/2021 03:52:17 - INFO - codeparrot_training - Step 8955: {'lr': 0.0004626591644275956, 'samples': 4585472, 'steps': 8955, 'batch_loss/train': 0.7375813936814666} 12/24/2021 03:52:28 - INFO - codeparrot_training - Step 8956: {'lr': 0.0004626508764738762, 'samples': 4585984, 'steps': 8956, 'batch_loss/train': 0.7941938443109393} 12/24/2021 03:52:40 - INFO - codeparrot_training - Step 8957: {'lr': 0.0004626425876747372, 'samples': 4586496, 'steps': 8957, 'batch_loss/train': 0.7299375059083104} 12/24/2021 03:52:50 - INFO - codeparrot_training - Step 8958: {'lr': 0.0004626342980302116, 'samples': 4587008, 'steps': 8958, 'batch_loss/train': 0.7334095789119601} 12/24/2021 03:53:01 - INFO - codeparrot_training - Step 8959: {'lr': 0.00046262600754033224, 'samples': 4587520, 'steps': 8959, 'batch_loss/train': 0.8409651778638363} 12/24/2021 03:53:13 - INFO - codeparrot_training - Step 8960: {'lr': 0.0004626177162051322, 'samples': 4588032, 'steps': 8960, 'batch_loss/train': 0.8345717657357454} 12/24/2021 03:53:24 - INFO - codeparrot_training - Step 
8961: {'lr': 0.0004626094240246444, 'samples': 4588544, 'steps': 8961, 'batch_loss/train': 0.8291657119989395} 12/24/2021 03:53:34 - INFO - codeparrot_training - Step 8962: {'lr': 0.0004626011309989018, 'samples': 4589056, 'steps': 8962, 'batch_loss/train': 0.7589532770216465} 12/24/2021 03:53:46 - INFO - codeparrot_training - Step 8963: {'lr': 0.00046259283712793745, 'samples': 4589568, 'steps': 8963, 'batch_loss/train': 0.7920797155238688} 12/24/2021 03:53:57 - INFO - codeparrot_training - Step 8964: {'lr': 0.00046258454241178415, 'samples': 4590080, 'steps': 8964, 'batch_loss/train': 0.8118520695716143} 12/24/2021 03:54:08 - INFO - codeparrot_training - Step 8965: {'lr': 0.00046257624685047504, 'samples': 4590592, 'steps': 8965, 'batch_loss/train': 0.8813617751002312} 12/24/2021 03:54:18 - INFO - codeparrot_training - Step 8966: {'lr': 0.00046256795044404296, 'samples': 4591104, 'steps': 8966, 'batch_loss/train': 0.759699510410428} 12/24/2021 03:54:30 - INFO - codeparrot_training - Step 8967: {'lr': 0.0004625596531925211, 'samples': 4591616, 'steps': 8967, 'batch_loss/train': 0.8118674452416599} 12/24/2021 03:54:41 - INFO - codeparrot_training - Step 8968: {'lr': 0.00046255135509594224, 'samples': 4592128, 'steps': 8968, 'batch_loss/train': 0.7383194384165108} 12/24/2021 03:54:51 - INFO - codeparrot_training - Step 8969: {'lr': 0.00046254305615433947, 'samples': 4592640, 'steps': 8969, 'batch_loss/train': 0.6865884508006275} 12/24/2021 03:55:05 - INFO - codeparrot_training - Step 8970: {'lr': 0.00046253475636774574, 'samples': 4593152, 'steps': 8970, 'batch_loss/train': 0.9337633051909506} 12/24/2021 03:55:16 - INFO - codeparrot_training - Step 8971: {'lr': 0.000462526455736194, 'samples': 4593664, 'steps': 8971, 'batch_loss/train': 0.8742987625300884} 12/24/2021 03:55:26 - INFO - codeparrot_training - Step 8972: {'lr': 0.00046251815425971745, 'samples': 4594176, 'steps': 8972, 'batch_loss/train': 0.7321689445525408} 12/24/2021 03:55:38 - INFO - 
codeparrot_training - Step 8973: {'lr': 0.0004625098519383489, 'samples': 4594688, 'steps': 8973, 'batch_loss/train': 0.827873945236206} 12/24/2021 03:55:49 - INFO - codeparrot_training - Step 8974: {'lr': 0.0004625015487721214, 'samples': 4595200, 'steps': 8974, 'batch_loss/train': 0.8275012914673425} 12/24/2021 03:55:59 - INFO - codeparrot_training - Step 8975: {'lr': 0.00046249324476106804, 'samples': 4595712, 'steps': 8975, 'batch_loss/train': 0.7635354798985645} 12/24/2021 03:56:10 - INFO - codeparrot_training - Step 8976: {'lr': 0.0004624849399052218, 'samples': 4596224, 'steps': 8976, 'batch_loss/train': 0.8133100564591587} 12/24/2021 03:56:23 - INFO - codeparrot_training - Step 8977: {'lr': 0.00046247663420461555, 'samples': 4596736, 'steps': 8977, 'batch_loss/train': 0.8656565458513796} 12/24/2021 03:56:34 - INFO - codeparrot_training - Step 8978: {'lr': 0.00046246832765928247, 'samples': 4597248, 'steps': 8978, 'batch_loss/train': 0.8401135905878618} 12/24/2021 03:56:44 - INFO - codeparrot_training - Step 8979: {'lr': 0.0004624600202692556, 'samples': 4597760, 'steps': 8979, 'batch_loss/train': 0.9254166977480054} 12/24/2021 03:56:56 - INFO - codeparrot_training - Step 8980: {'lr': 0.00046245171203456787, 'samples': 4598272, 'steps': 8980, 'batch_loss/train': 0.8198836981318891} 12/24/2021 03:57:07 - INFO - codeparrot_training - Step 8981: {'lr': 0.0004624434029552523, 'samples': 4598784, 'steps': 8981, 'batch_loss/train': 0.7926351046189666} 12/24/2021 03:57:18 - INFO - codeparrot_training - Step 8982: {'lr': 0.00046243509303134204, 'samples': 4599296, 'steps': 8982, 'batch_loss/train': 0.8208441762253642} 12/24/2021 03:57:30 - INFO - codeparrot_training - Step 8983: {'lr': 0.0004624267822628701, 'samples': 4599808, 'steps': 8983, 'batch_loss/train': 0.8258274137042463} 12/24/2021 03:57:40 - INFO - codeparrot_training - Step 8984: {'lr': 0.0004624184706498694, 'samples': 4600320, 'steps': 8984, 'batch_loss/train': 0.8251575999893248} 12/24/2021 03:57:51 
- INFO - codeparrot_training - Step 8985: {'lr': 0.00046241015819237317, 'samples': 4600832, 'steps': 8985, 'batch_loss/train': 0.8850004738196731} 12/24/2021 03:58:03 - INFO - codeparrot_training - Step 8986: {'lr': 0.0004624018448904143, 'samples': 4601344, 'steps': 8986, 'batch_loss/train': 0.8340547066181898} 12/24/2021 03:58:14 - INFO - codeparrot_training - Step 8987: {'lr': 0.00046239353074402584, 'samples': 4601856, 'steps': 8987, 'batch_loss/train': 0.8565515233203769} 12/24/2021 03:58:25 - INFO - codeparrot_training - Step 8988: {'lr': 0.00046238521575324096, 'samples': 4602368, 'steps': 8988, 'batch_loss/train': 0.8920341841876507} 12/24/2021 03:58:35 - INFO - codeparrot_training - Step 8989: {'lr': 0.0004623768999180926, 'samples': 4602880, 'steps': 8989, 'batch_loss/train': 0.8027377874823287} 12/24/2021 03:58:47 - INFO - codeparrot_training - Step 8990: {'lr': 0.000462368583238614, 'samples': 4603392, 'steps': 8990, 'batch_loss/train': 0.8448416758328676} 12/24/2021 03:58:58 - INFO - codeparrot_training - Step 8991: {'lr': 0.000462360265714838, 'samples': 4603904, 'steps': 8991, 'batch_loss/train': 0.8121149921789765} 12/24/2021 03:59:08 - INFO - codeparrot_training - Step 8992: {'lr': 0.00046235194734679775, 'samples': 4604416, 'steps': 8992, 'batch_loss/train': 0.8332606898620725} 12/24/2021 03:59:21 - INFO - codeparrot_training - Step 8993: {'lr': 0.00046234362813452637, 'samples': 4604928, 'steps': 8993, 'batch_loss/train': 0.8057300448417664} 12/24/2021 03:59:32 - INFO - codeparrot_training - Step 8994: {'lr': 0.0004623353080780569, 'samples': 4605440, 'steps': 8994, 'batch_loss/train': 0.8168519176542759} 12/24/2021 03:59:43 - INFO - codeparrot_training - Step 8995: {'lr': 0.00046232698717742237, 'samples': 4605952, 'steps': 8995, 'batch_loss/train': 0.7150279339402914} 12/24/2021 03:59:55 - INFO - codeparrot_training - Step 8996: {'lr': 0.000462318665432656, 'samples': 4606464, 'steps': 8996, 'batch_loss/train': 0.8553866342408583} 12/24/2021 
04:00:05 - INFO - codeparrot_training - Step 8997: {'lr': 0.00046231034284379075, 'samples': 4606976, 'steps': 8997, 'batch_loss/train': 1.609294599853456} 12/24/2021 04:00:16 - INFO - codeparrot_training - Step 8998: {'lr': 0.00046230201941085974, 'samples': 4607488, 'steps': 8998, 'batch_loss/train': 0.9166318532079458} 12/24/2021 04:00:26 - INFO - codeparrot_training - Step 8999: {'lr': 0.0004622936951338961, 'samples': 4608000, 'steps': 8999, 'batch_loss/train': 0.7963320898707025} 12/24/2021 04:00:38 - INFO - codeparrot_training - Step 9000: {'lr': 0.0004622853700129328, 'samples': 4608512, 'steps': 9000, 'batch_loss/train': 0.9208130799233913} 12/24/2021 04:00:49 - INFO - codeparrot_training - Step 9001: {'lr': 0.00046227704404800307, 'samples': 4609024, 'steps': 9001, 'batch_loss/train': 0.8370877973502502} 12/24/2021 04:00:59 - INFO - codeparrot_training - Step 9002: {'lr': 0.00046226871723914, 'samples': 4609536, 'steps': 9002, 'batch_loss/train': 0.8062913557514548} 12/24/2021 04:01:12 - INFO - codeparrot_training - Step 9003: {'lr': 0.0004622603895863766, 'samples': 4610048, 'steps': 9003, 'batch_loss/train': 0.7921785400249064} 12/24/2021 04:01:22 - INFO - codeparrot_training - Step 9004: {'lr': 0.00046225206108974613, 'samples': 4610560, 'steps': 9004, 'batch_loss/train': 0.8325676266103983} 12/24/2021 04:01:33 - INFO - codeparrot_training - Step 9005: {'lr': 0.0004622437317492815, 'samples': 4611072, 'steps': 9005, 'batch_loss/train': 0.7847919296473265} 12/24/2021 04:01:45 - INFO - codeparrot_training - Step 9006: {'lr': 0.00046223540156501596, 'samples': 4611584, 'steps': 9006, 'batch_loss/train': 0.7624960243701935} 12/24/2021 04:01:56 - INFO - codeparrot_training - Step 9007: {'lr': 0.00046222707053698267, 'samples': 4612096, 'steps': 9007, 'batch_loss/train': 0.7962430268526077} 12/24/2021 04:02:07 - INFO - codeparrot_training - Step 9008: {'lr': 0.0004622187386652146, 'samples': 4612608, 'steps': 9008, 'batch_loss/train': 0.7292566442629322} 
12/24/2021 04:02:19 - INFO - codeparrot_training - Step 9009: {'lr': 0.00046221040594974496, 'samples': 4613120, 'steps': 9009, 'batch_loss/train': 0.8094799490645528} 12/24/2021 04:02:29 - INFO - codeparrot_training - Step 9010: {'lr': 0.00046220207239060695, 'samples': 4613632, 'steps': 9010, 'batch_loss/train': 0.6649286067695357} 12/24/2021 04:02:40 - INFO - codeparrot_training - Step 9011: {'lr': 0.00046219373798783356, 'samples': 4614144, 'steps': 9011, 'batch_loss/train': 0.6277777237119153} 12/24/2021 04:02:51 - INFO - codeparrot_training - Step 9012: {'lr': 0.00046218540274145796, 'samples': 4614656, 'steps': 9012, 'batch_loss/train': 0.8923907419666648} 12/24/2021 04:03:03 - INFO - codeparrot_training - Step 9013: {'lr': 0.0004621770666515134, 'samples': 4615168, 'steps': 9013, 'batch_loss/train': 0.8578009328339249} 12/24/2021 04:03:13 - INFO - codeparrot_training - Step 9014: {'lr': 0.0004621687297180329, 'samples': 4615680, 'steps': 9014, 'batch_loss/train': 0.7762044947594404} 12/24/2021 04:03:24 - INFO - codeparrot_training - Step 9015: {'lr': 0.00046216039194104964, 'samples': 4616192, 'steps': 9015, 'batch_loss/train': 0.8530406365171075} 12/24/2021 04:03:36 - INFO - codeparrot_training - Step 9016: {'lr': 0.0004621520533205967, 'samples': 4616704, 'steps': 9016, 'batch_loss/train': 0.7963544828817248} 12/24/2021 04:03:47 - INFO - codeparrot_training - Step 9017: {'lr': 0.00046214371385670745, 'samples': 4617216, 'steps': 9017, 'batch_loss/train': 0.9156167628243566} 12/24/2021 04:03:58 - INFO - codeparrot_training - Step 9018: {'lr': 0.00046213537354941484, 'samples': 4617728, 'steps': 9018, 'batch_loss/train': 0.7523833829909563} 12/24/2021 04:04:10 - INFO - codeparrot_training - Step 9019: {'lr': 0.00046212703239875203, 'samples': 4618240, 'steps': 9019, 'batch_loss/train': 0.868171563372016} 12/24/2021 04:04:20 - INFO - codeparrot_training - Step 9020: {'lr': 0.00046211869040475227, 'samples': 4618752, 'steps': 9020, 'batch_loss/train': 
0.76880889153108} 12/24/2021 04:04:31 - INFO - codeparrot_training - Step 9021: {'lr': 0.0004621103475674486, 'samples': 4619264, 'steps': 9021, 'batch_loss/train': 0.8204775145277381} 12/24/2021 04:04:42 - INFO - codeparrot_training - Step 9022: {'lr': 0.0004621020038868745, 'samples': 4619776, 'steps': 9022, 'batch_loss/train': 0.7944969637319446} 12/24/2021 04:04:55 - INFO - codeparrot_training - Step 9023: {'lr': 0.0004620936593630627, 'samples': 4620288, 'steps': 9023, 'batch_loss/train': 0.8408987252041698} 12/24/2021 04:05:05 - INFO - codeparrot_training - Step 9024: {'lr': 0.0004620853139960467, 'samples': 4620800, 'steps': 9024, 'batch_loss/train': 0.826198942027986} 12/24/2021 04:05:16 - INFO - codeparrot_training - Step 9025: {'lr': 0.00046207696778585955, 'samples': 4621312, 'steps': 9025, 'batch_loss/train': 0.8341629598289728} 12/24/2021 04:05:28 - INFO - codeparrot_training - Step 9026: {'lr': 0.00046206862073253443, 'samples': 4621824, 'steps': 9026, 'batch_loss/train': 0.8445923468098044} 12/24/2021 04:05:39 - INFO - codeparrot_training - Step 9027: {'lr': 0.00046206027283610465, 'samples': 4622336, 'steps': 9027, 'batch_loss/train': 0.8447689106687903} 12/24/2021 04:05:49 - INFO - codeparrot_training - Step 9028: {'lr': 0.00046205192409660316, 'samples': 4622848, 'steps': 9028, 'batch_loss/train': 0.7831495516002178} 12/24/2021 04:06:02 - INFO - codeparrot_training - Step 9029: {'lr': 0.0004620435745140633, 'samples': 4623360, 'steps': 9029, 'batch_loss/train': 0.7325425138697028} 12/24/2021 04:06:12 - INFO - codeparrot_training - Step 9030: {'lr': 0.0004620352240885183, 'samples': 4623872, 'steps': 9030, 'batch_loss/train': 0.8634557593613863} 12/24/2021 04:06:23 - INFO - codeparrot_training - Step 9031: {'lr': 0.0004620268728200013, 'samples': 4624384, 'steps': 9031, 'batch_loss/train': 0.7265770649537444} 12/24/2021 04:06:34 - INFO - codeparrot_training - Step 9032: {'lr': 0.00046201852070854553, 'samples': 4624896, 'steps': 9032, 
'batch_loss/train': 0.831323220860213} 12/24/2021 04:06:46 - INFO - codeparrot_training - Step 9033: {'lr': 0.0004620101677541842, 'samples': 4625408, 'steps': 9033, 'batch_loss/train': 0.8187951361760497} 12/24/2021 04:06:57 - INFO - codeparrot_training - Step 9034: {'lr': 0.0004620018139569505, 'samples': 4625920, 'steps': 9034, 'batch_loss/train': 0.8507727468386292} 12/24/2021 04:07:07 - INFO - codeparrot_training - Step 9035: {'lr': 0.00046199345931687756, 'samples': 4626432, 'steps': 9035, 'batch_loss/train': 0.8381628687493503} 12/24/2021 04:07:20 - INFO - codeparrot_training - Step 9036: {'lr': 0.00046198510383399873, 'samples': 4626944, 'steps': 9036, 'batch_loss/train': 0.6356527936877683} 12/24/2021 04:07:31 - INFO - codeparrot_training - Step 9037: {'lr': 0.00046197674750834717, 'samples': 4627456, 'steps': 9037, 'batch_loss/train': 0.8722222703509033} 12/24/2021 04:07:41 - INFO - codeparrot_training - Step 9038: {'lr': 0.0004619683903399561, 'samples': 4627968, 'steps': 9038, 'batch_loss/train': 0.6878553287824616} 12/24/2021 04:07:53 - INFO - codeparrot_training - Step 9039: {'lr': 0.00046196003232885864, 'samples': 4628480, 'steps': 9039, 'batch_loss/train': 0.6447180293034762} 12/24/2021 04:08:04 - INFO - codeparrot_training - Step 9040: {'lr': 0.00046195167347508825, 'samples': 4628992, 'steps': 9040, 'batch_loss/train': 0.9090849659405649} 12/24/2021 04:08:15 - INFO - codeparrot_training - Step 9041: {'lr': 0.000461943313778678, 'samples': 4629504, 'steps': 9041, 'batch_loss/train': 0.8087686388753355} 12/24/2021 04:08:27 - INFO - codeparrot_training - Step 9042: {'lr': 0.0004619349532396612, 'samples': 4630016, 'steps': 9042, 'batch_loss/train': 0.9254551203921437} 12/24/2021 04:08:37 - INFO - codeparrot_training - Step 9043: {'lr': 0.00046192659185807094, 'samples': 4630528, 'steps': 9043, 'batch_loss/train': 0.7580113662406802} 12/24/2021 04:08:48 - INFO - codeparrot_training - Step 9044: {'lr': 0.0004619182296339406, 'samples': 4631040, 
'steps': 9044, 'batch_loss/train': 0.9445004034787416}
12/24/2021 04:08:59 - INFO - codeparrot_training - Step 9045: {'lr': 0.00046190986656730347, 'samples': 4631552, 'steps': 9045, 'batch_loss/train': 0.8761005429551005}
12/24/2021 04:09:11 - INFO - codeparrot_training - Step 9046: {'lr': 0.0004619015026581927, 'samples': 4632064, 'steps': 9046, 'batch_loss/train': 0.7601485950872302}
12/24/2021 04:09:22 - INFO - codeparrot_training - Step 9047: {'lr': 0.00046189313790664157, 'samples': 4632576, 'steps': 9047, 'batch_loss/train': 1.0643341177492402}
12/24/2021 04:09:32 - INFO - codeparrot_training - Step 9048: {'lr': 0.00046188477231268334, 'samples': 4633088, 'steps': 9048, 'batch_loss/train': 0.8600919283926487}
12/24/2021 04:09:45 - INFO - codeparrot_training - Step 9049: {'lr': 0.0004618764058763512, 'samples': 4633600, 'steps': 9049, 'batch_loss/train': 0.8072303542867303}
12/24/2021 04:09:56 - INFO - codeparrot_training - Step 9050: {'lr': 0.0004618680385976785, 'samples': 4634112, 'steps': 9050, 'batch_loss/train': 0.3660876451758668}
12/24/2021 04:10:06 - INFO - codeparrot_training - Step 9051: {'lr': 0.00046185967047669853, 'samples': 4634624, 'steps': 9051, 'batch_loss/train': 0.8117539910599589}
12/24/2021 04:10:19 - INFO - codeparrot_training - Step 9052: {'lr': 0.00046185130151344447, 'samples': 4635136, 'steps': 9052, 'batch_loss/train': 0.6801745612174273}
12/24/2021 04:10:29 - INFO - codeparrot_training - Step 9053: {'lr': 0.0004618429317079496, 'samples': 4635648, 'steps': 9053, 'batch_loss/train': 0.7846093690022826}
12/24/2021 04:10:40 - INFO - codeparrot_training - Step 9054: {'lr': 0.00046183456106024725, 'samples': 4636160, 'steps': 9054, 'batch_loss/train': 0.8319723415188491}
12/24/2021 04:10:51 - INFO - codeparrot_training - Step 9055: {'lr': 0.0004618261895703706, 'samples': 4636672, 'steps': 9055, 'batch_loss/train': 0.8924049288034439}
12/24/2021 04:11:03 - INFO - codeparrot_training - Step 9056: {'lr': 0.0004618178172383531, 'samples': 4637184, 'steps': 9056, 'batch_loss/train': 0.8332425486296415}
12/24/2021 04:11:13 - INFO - codeparrot_training - Step 9057: {'lr': 0.00046180944406422786, 'samples': 4637696, 'steps': 9057, 'batch_loss/train': 0.8738815831020474}
12/24/2021 04:11:24 - INFO - codeparrot_training - Step 9058: {'lr': 0.00046180107004802827, 'samples': 4638208, 'steps': 9058, 'batch_loss/train': 0.8623653301037848}
12/24/2021 04:11:36 - INFO - codeparrot_training - Step 9059: {'lr': 0.00046179269518978767, 'samples': 4638720, 'steps': 9059, 'batch_loss/train': 0.8407369945198298}
12/24/2021 04:11:46 - INFO - codeparrot_training - Step 9060: {'lr': 0.00046178431948953923, 'samples': 4639232, 'steps': 9060, 'batch_loss/train': 0.8405026718974113}
12/24/2021 04:11:57 - INFO - codeparrot_training - Step 9061: {'lr': 0.0004617759429473163, 'samples': 4639744, 'steps': 9061, 'batch_loss/train': 0.7760971384122968}
12/24/2021 04:12:10 - INFO - codeparrot_training - Step 9062: {'lr': 0.0004617675655631522, 'samples': 4640256, 'steps': 9062, 'batch_loss/train': 0.902062070555985}
12/24/2021 04:12:20 - INFO - codeparrot_training - Step 9063: {'lr': 0.0004617591873370802, 'samples': 4640768, 'steps': 9063, 'batch_loss/train': 0.9252905752509832}
12/24/2021 04:12:31 - INFO - codeparrot_training - Step 9064: {'lr': 0.0004617508082691336, 'samples': 4641280, 'steps': 9064, 'batch_loss/train': 0.8343501128256321}
12/24/2021 04:12:41 - INFO - codeparrot_training - Step 9065: {'lr': 0.0004617424283593458, 'samples': 4641792, 'steps': 9065, 'batch_loss/train': 0.8213038127869368}
12/24/2021 04:12:53 - INFO - codeparrot_training - Step 9066: {'lr': 0.00046173404760775004, 'samples': 4642304, 'steps': 9066, 'batch_loss/train': 0.8411748362705112}
12/24/2021 04:13:04 - INFO - codeparrot_training - Step 9067: {'lr': 0.0004617256660143796, 'samples': 4642816, 'steps': 9067, 'batch_loss/train': 0.8230982096865773}
12/24/2021 04:13:15 - INFO - codeparrot_training - Step 9068: {'lr': 0.0004617172835792679, 'samples': 4643328, 'steps': 9068, 'batch_loss/train': 0.796971907839179}
12/24/2021 04:13:27 - INFO - codeparrot_training - Step 9069: {'lr': 0.0004617089003024483, 'samples': 4643840, 'steps': 9069, 'batch_loss/train': 0.8740398909430951}
12/24/2021 04:13:37 - INFO - codeparrot_training - Step 9070: {'lr': 0.0004617005161839539, 'samples': 4644352, 'steps': 9070, 'batch_loss/train': 0.8894078712910414}
12/24/2021 04:13:48 - INFO - codeparrot_training - Step 9071: {'lr': 0.00046169213122381825, 'samples': 4644864, 'steps': 9071, 'batch_loss/train': 0.8276340696029365}
12/24/2021 04:14:00 - INFO - codeparrot_training - Step 9072: {'lr': 0.0004616837454220747, 'samples': 4645376, 'steps': 9072, 'batch_loss/train': 0.6992911719717085}
12/24/2021 04:14:11 - INFO - codeparrot_training - Step 9073: {'lr': 0.00046167535877875637, 'samples': 4645888, 'steps': 9073, 'batch_loss/train': 0.8240812160074711}
12/24/2021 04:14:21 - INFO - codeparrot_training - Step 9074: {'lr': 0.0004616669712938968, 'samples': 4646400, 'steps': 9074, 'batch_loss/train': 0.7725787370000035}
12/24/2021 04:14:35 - INFO - codeparrot_training - Step 9075: {'lr': 0.0004616585829675293, 'samples': 4646912, 'steps': 9075, 'batch_loss/train': 0.61961393232923}
12/24/2021 04:14:46 - INFO - codeparrot_training - Step 9076: {'lr': 0.0004616501937996872, 'samples': 4647424, 'steps': 9076, 'batch_loss/train': 0.8493666993454099}
12/24/2021 04:14:56 - INFO - codeparrot_training - Step 9077: {'lr': 0.00046164180379040375, 'samples': 4647936, 'steps': 9077, 'batch_loss/train': 0.7701422618702054}
12/24/2021 04:15:07 - INFO - codeparrot_training - Step 9078: {'lr': 0.0004616334129397125, 'samples': 4648448, 'steps': 9078, 'batch_loss/train': 0.8094748370349407}
12/24/2021 04:15:19 - INFO - codeparrot_training - Step 9079: {'lr': 0.00046162502124764667, 'samples': 4648960, 'steps': 9079, 'batch_loss/train': 0.8239981774240732}
12/24/2021 04:15:29 - INFO - codeparrot_training - Step 9080: {'lr': 0.00046161662871423967, 'samples': 4649472, 'steps': 9080, 'batch_loss/train': 0.7784845205023885}
12/24/2021 04:15:40 - INFO - codeparrot_training - Step 9081: {'lr': 0.00046160823533952485, 'samples': 4649984, 'steps': 9081, 'batch_loss/train': 0.7871765932068229}
12/24/2021 04:15:52 - INFO - codeparrot_training - Step 9082: {'lr': 0.0004615998411235356, 'samples': 4650496, 'steps': 9082, 'batch_loss/train': 0.8066849410533905}
12/24/2021 04:16:02 - INFO - codeparrot_training - Step 9083: {'lr': 0.0004615914460663053, 'samples': 4651008, 'steps': 9083, 'batch_loss/train': 0.8280006586574018}
12/24/2021 04:16:13 - INFO - codeparrot_training - Step 9084: {'lr': 0.0004615830501678672, 'samples': 4651520, 'steps': 9084, 'batch_loss/train': 0.798971111420542}
12/24/2021 04:16:26 - INFO - codeparrot_training - Step 9085: {'lr': 0.0004615746534282549, 'samples': 4652032, 'steps': 9085, 'batch_loss/train': 0.8173418762162328}
12/24/2021 04:16:36 - INFO - codeparrot_training - Step 9086: {'lr': 0.0004615662558475016, 'samples': 4652544, 'steps': 9086, 'batch_loss/train': 0.7886246684938669}
12/24/2021 04:16:47 - INFO - codeparrot_training - Step 9087: {'lr': 0.00046155785742564077, 'samples': 4653056, 'steps': 9087, 'batch_loss/train': 0.7693022671155632}
12/24/2021 04:16:59 - INFO - codeparrot_training - Step 9088: {'lr': 0.00046154945816270575, 'samples': 4653568, 'steps': 9088, 'batch_loss/train': 0.7349085584282875}
12/24/2021 04:17:10 - INFO - codeparrot_training - Step 9089: {'lr': 0.00046154105805873, 'samples': 4654080, 'steps': 9089, 'batch_loss/train': 0.8023040564730763}
12/24/2021 04:17:20 - INFO - codeparrot_training - Step 9090: {'lr': 0.00046153265711374695, 'samples': 4654592, 'steps': 9090, 'batch_loss/train': 0.7979142107069492}
12/24/2021 04:17:31 - INFO - codeparrot_training - Step 9091: {'lr': 0.00046152425532778983, 'samples': 4655104, 'steps': 9091, 'batch_loss/train': 0.8700813287869096}
12/24/2021 04:17:43 - INFO - codeparrot_training - Step 9092: {'lr': 0.0004615158527008921, 'samples': 4655616, 'steps': 9092, 'batch_loss/train': 0.9087547566741705}
12/24/2021 04:17:54 - INFO - codeparrot_training - Step 9093: {'lr': 0.0004615074492330873, 'samples': 4656128, 'steps': 9093, 'batch_loss/train': 0.7311166781000793}
12/24/2021 04:18:05 - INFO - codeparrot_training - Step 9094: {'lr': 0.00046149904492440874, 'samples': 4656640, 'steps': 9094, 'batch_loss/train': 0.8164987005293369}
12/24/2021 04:18:17 - INFO - codeparrot_training - Step 9095: {'lr': 0.0004614906397748898, 'samples': 4657152, 'steps': 9095, 'batch_loss/train': 0.8157443813979626}
12/24/2021 04:18:27 - INFO - codeparrot_training - Step 9096: {'lr': 0.0004614822337845639, 'samples': 4657664, 'steps': 9096, 'batch_loss/train': 0.9080551797524095}
12/24/2021 04:18:38 - INFO - codeparrot_training - Step 9097: {'lr': 0.00046147382695346453, 'samples': 4658176, 'steps': 9097, 'batch_loss/train': 0.8418636359274387}
12/24/2021 04:18:50 - INFO - codeparrot_training - Step 9098: {'lr': 0.0004614654192816251, 'samples': 4658688, 'steps': 9098, 'batch_loss/train': 0.6987967767636292}
12/24/2021 04:19:00 - INFO - codeparrot_training - Step 9099: {'lr': 0.00046145701076907896, 'samples': 4659200, 'steps': 9099, 'batch_loss/train': 0.7829007192049176}
12/24/2021 04:19:11 - INFO - codeparrot_training - Step 9100: {'lr': 0.00046144860141585965, 'samples': 4659712, 'steps': 9100, 'batch_loss/train': 0.8695377493277192}
12/24/2021 04:19:24 - INFO - codeparrot_training - Step 9101: {'lr': 0.0004614401912220005, 'samples': 4660224, 'steps': 9101, 'batch_loss/train': 0.869384678080678}
12/24/2021 04:19:35 - INFO - codeparrot_training - Step 9102: {'lr': 0.00046143178018753496, 'samples': 4660736, 'steps': 9102, 'batch_loss/train': 1.4356265533715487}
12/24/2021 04:19:45 - INFO - codeparrot_training - Step 9103: {'lr': 0.0004614233683124965, 'samples': 4661248, 'steps': 9103, 'batch_loss/train': 0.8221175484359264}
12/24/2021 04:19:56 - INFO - codeparrot_training - Step 9104: {'lr': 0.0004614149555969186, 'samples': 4661760, 'steps': 9104, 'batch_loss/train': 0.7657946571707726}
12/24/2021 04:20:08 - INFO - codeparrot_training - Step 9105: {'lr': 0.00046140654204083467, 'samples': 4662272, 'steps': 9105, 'batch_loss/train': 0.8844347512349486}
12/24/2021 04:20:19 - INFO - codeparrot_training - Step 9106: {'lr': 0.00046139812764427814, 'samples': 4662784, 'steps': 9106, 'batch_loss/train': 0.7892435854300857}
12/24/2021 04:20:29 - INFO - codeparrot_training - Step 9107: {'lr': 0.00046138971240728245, 'samples': 4663296, 'steps': 9107, 'batch_loss/train': 0.8034548563882709}
12/24/2021 04:20:41 - INFO - codeparrot_training - Step 9108: {'lr': 0.000461381296329881, 'samples': 4663808, 'steps': 9108, 'batch_loss/train': 0.824141537072137}
12/24/2021 04:20:52 - INFO - codeparrot_training - Step 9109: {'lr': 0.00046137287941210747, 'samples': 4664320, 'steps': 9109, 'batch_loss/train': 0.7709369072690606}
12/24/2021 04:21:02 - INFO - codeparrot_training - Step 9110: {'lr': 0.00046136446165399517, 'samples': 4664832, 'steps': 9110, 'batch_loss/train': 0.7769103636965156}
12/24/2021 04:21:14 - INFO - codeparrot_training - Step 9111: {'lr': 0.00046135604305557756, 'samples': 4665344, 'steps': 9111, 'batch_loss/train': 0.882134628482163}
12/24/2021 04:21:25 - INFO - codeparrot_training - Step 9112: {'lr': 0.0004613476236168881, 'samples': 4665856, 'steps': 9112, 'batch_loss/train': 0.8634155401960015}
12/24/2021 04:21:36 - INFO - codeparrot_training - Step 9113: {'lr': 0.0004613392033379603, 'samples': 4666368, 'steps': 9113, 'batch_loss/train': 0.8241383321583271}
12/24/2021 04:21:46 - INFO - codeparrot_training - Step 9114: {'lr': 0.0004613307822188276, 'samples': 4666880, 'steps': 9114, 'batch_loss/train': 0.9055484924465418}
12/24/2021 04:21:59 - INFO - codeparrot_training - Step 9115: {'lr': 0.00046132236025952356, 'samples': 4667392, 'steps': 9115, 'batch_loss/train': 0.8192314449697733}
12/24/2021 04:22:10 - INFO - codeparrot_training - Step 9116: {'lr': 0.00046131393746008155, 'samples': 4667904, 'steps': 9116, 'batch_loss/train': 0.9978381562978029}
12/24/2021 04:22:20 - INFO - codeparrot_training - Step 9117: {'lr': 0.0004613055138205351, 'samples': 4668416, 'steps': 9117, 'batch_loss/train': 0.8651253068819642}
12/24/2021 04:22:33 - INFO - codeparrot_training - Step 9118: {'lr': 0.0004612970893409178, 'samples': 4668928, 'steps': 9118, 'batch_loss/train': 0.6333243835251778}
12/24/2021 04:22:43 - INFO - codeparrot_training - Step 9119: {'lr': 0.000461288664021263, 'samples': 4669440, 'steps': 9119, 'batch_loss/train': 0.8623492177575827}
12/24/2021 04:22:54 - INFO - codeparrot_training - Step 9120: {'lr': 0.00046128023786160433, 'samples': 4669952, 'steps': 9120, 'batch_loss/train': 0.7906700670719147}
12/24/2021 04:23:07 - INFO - codeparrot_training - Step 9121: {'lr': 0.00046127181086197513, 'samples': 4670464, 'steps': 9121, 'batch_loss/train': 0.8305433532223105}
12/24/2021 04:23:17 - INFO - codeparrot_training - Step 9122: {'lr': 0.000461263383022409, 'samples': 4670976, 'steps': 9122, 'batch_loss/train': 0.8446031920611858}
12/24/2021 04:23:28 - INFO - codeparrot_training - Step 9123: {'lr': 0.0004612549543429394, 'samples': 4671488, 'steps': 9123, 'batch_loss/train': 0.8710035150870681}
12/24/2021 04:23:38 - INFO - codeparrot_training - Step 9124: {'lr': 0.0004612465248235999, 'samples': 4672000, 'steps': 9124, 'batch_loss/train': 0.7468689868110232}
12/24/2021 04:23:51 - INFO - codeparrot_training - Step 9125: {'lr': 0.000461238094464424, 'samples': 4672512, 'steps': 9125, 'batch_loss/train': 1.6180530618876219}
12/24/2021 04:24:01 - INFO - codeparrot_training - Step 9126: {'lr': 0.00046122966326544513, 'samples': 4673024, 'steps': 9126, 'batch_loss/train': 0.7914908203529194}
12/24/2021 04:24:12 - INFO - codeparrot_training - Step 9127: {'lr': 0.00046122123122669696, 'samples': 4673536, 'steps': 9127, 'batch_loss/train': 0.804833015310578}
12/24/2021 04:24:25 - INFO - codeparrot_training - Step 9128: {'lr': 0.0004612127983482129, 'samples': 4674048, 'steps': 9128, 'batch_loss/train': 0.6906405064510182}
12/24/2021 04:24:35 - INFO - codeparrot_training - Step 9129: {'lr': 0.0004612043646300265, 'samples': 4674560, 'steps': 9129, 'batch_loss/train': 0.8909429721534252}
12/24/2021 04:24:46 - INFO - codeparrot_training - Step 9130: {'lr': 0.00046119593007217126, 'samples': 4675072, 'steps': 9130, 'batch_loss/train': 0.9269743030890822}
12/24/2021 04:24:56 - INFO - codeparrot_training - Step 9131: {'lr': 0.0004611874946746808, 'samples': 4675584, 'steps': 9131, 'batch_loss/train': 1.0198177946731448}
12/24/2021 04:25:09 - INFO - codeparrot_training - Step 9132: {'lr': 0.0004611790584375886, 'samples': 4676096, 'steps': 9132, 'batch_loss/train': 0.7838801415637136}
12/24/2021 04:25:20 - INFO - codeparrot_training - Step 9133: {'lr': 0.00046117062136092816, 'samples': 4676608, 'steps': 9133, 'batch_loss/train': 1.5659847157076001}
12/24/2021 04:25:31 - INFO - codeparrot_training - Step 9134: {'lr': 0.000461162183444733, 'samples': 4677120, 'steps': 9134, 'batch_loss/train': 0.7434349171817303}
12/24/2021 04:25:43 - INFO - codeparrot_training - Step 9135: {'lr': 0.00046115374468903684, 'samples': 4677632, 'steps': 9135, 'batch_loss/train': 0.997488901950419}
12/24/2021 04:25:53 - INFO - codeparrot_training - Step 9136: {'lr': 0.0004611453050938731, 'samples': 4678144, 'steps': 9136, 'batch_loss/train': 0.7204769984236918}
12/24/2021 04:26:04 - INFO - codeparrot_training - Step 9137: {'lr': 0.00046113686465927534, 'samples': 4678656, 'steps': 9137, 'batch_loss/train': 0.7610568581148982}
12/24/2021 04:26:17 - INFO - codeparrot_training - Step 9138: {'lr': 0.0004611284233852772, 'samples': 4679168, 'steps': 9138, 'batch_loss/train': 0.8392598000355065}
12/24/2021 04:26:27 - INFO - codeparrot_training - Step 9139: {'lr': 0.000461119981271912, 'samples': 4679680, 'steps': 9139, 'batch_loss/train': 0.8513575522229075}
12/24/2021 04:26:38 - INFO - codeparrot_training - Step 9140: {'lr': 0.0004611115383192136, 'samples': 4680192, 'steps': 9140, 'batch_loss/train': 0.8055064301006496}
12/24/2021 04:26:48 - INFO - codeparrot_training - Step 9141: {'lr': 0.00046110309452721546, 'samples': 4680704, 'steps': 9141, 'batch_loss/train': 0.6190696333069354}
12/24/2021 04:27:00 - INFO - codeparrot_training - Step 9142: {'lr': 0.000461094649895951, 'samples': 4681216, 'steps': 9142, 'batch_loss/train': 0.9489371096715331}
12/24/2021 04:27:11 - INFO - codeparrot_training - Step 9143: {'lr': 0.00046108620442545403, 'samples': 4681728, 'steps': 9143, 'batch_loss/train': 0.7577829223591834}
12/24/2021 04:27:22 - INFO - codeparrot_training - Step 9144: {'lr': 0.00046107775811575795, 'samples': 4682240, 'steps': 9144, 'batch_loss/train': 0.7952453447505832}
12/24/2021 04:27:34 - INFO - codeparrot_training - Step 9145: {'lr': 0.0004610693109668964, 'samples': 4682752, 'steps': 9145, 'batch_loss/train': 0.999058801215142}
12/24/2021 04:27:45 - INFO - codeparrot_training - Step 9146: {'lr': 0.000461060862978903, 'samples': 4683264, 'steps': 9146, 'batch_loss/train': 0.5588194580050185}
12/24/2021 04:27:55 - INFO - codeparrot_training - Step 9147: {'lr': 0.0004610524141518113, 'samples': 4683776, 'steps': 9147, 'batch_loss/train': 0.8476582581643015}
12/24/2021 04:28:07 - INFO - codeparrot_training - Step 9148: {'lr': 0.0004610439644856549, 'samples': 4684288, 'steps': 9148, 'batch_loss/train': 0.8859212696552277}
12/24/2021 04:28:18 - INFO - codeparrot_training - Step 9149: {'lr': 0.0004610355139804674, 'samples': 4684800, 'steps': 9149, 'batch_loss/train': 0.8563114320859313}
12/24/2021 04:28:29 - INFO - codeparrot_training - Step 9150: {'lr': 0.00046102706263628237, 'samples': 4685312, 'steps': 9150, 'batch_loss/train': 0.7594685405492783}
12/24/2021 04:28:41 - INFO - codeparrot_training - Step 9151: {'lr': 0.0004610186104531334, 'samples': 4685824, 'steps': 9151, 'batch_loss/train': 0.8358186311088502}
12/24/2021 04:28:52 - INFO - codeparrot_training - Step 9152: {'lr': 0.0004610101574310541, 'samples': 4686336, 'steps': 9152, 'batch_loss/train': 0.8729059789329767}
12/24/2021 04:29:03 - INFO - codeparrot_training - Step 9153: {'lr': 0.00046100170357007807, 'samples': 4686848, 'steps': 9153, 'batch_loss/train': 0.7288405182771385}
12/24/2021 04:29:13 - INFO - codeparrot_training - Step 9154: {'lr': 0.00046099324887023896, 'samples': 4687360, 'steps': 9154, 'batch_loss/train': 0.8704532943665981}
12/24/2021 04:29:25 - INFO - codeparrot_training - Step 9155: {'lr': 0.0004609847933315704, 'samples': 4687872, 'steps': 9155, 'batch_loss/train': 0.8132764045149088}
12/24/2021 04:29:36 - INFO - codeparrot_training - Step 9156: {'lr': 0.00046097633695410593, 'samples': 4688384, 'steps': 9156, 'batch_loss/train': 0.8401018404401839}
12/24/2021 04:29:47 - INFO - codeparrot_training - Step 9157: {'lr': 0.00046096787973787923, 'samples': 4688896, 'steps': 9157, 'batch_loss/train': 0.8377270684577525}
12/24/2021 04:29:59 - INFO - codeparrot_training - Step 9158: {'lr': 0.0004609594216829239, 'samples': 4689408, 'steps': 9158, 'batch_loss/train': 0.5698886892641895}
12/24/2021 04:30:10 - INFO - codeparrot_training - Step 9159: {'lr': 0.00046095096278927356, 'samples': 4689920, 'steps': 9159, 'batch_loss/train': 0.8806459065526724}
12/24/2021 04:30:20 - INFO - codeparrot_training - Step 9160: {'lr': 0.00046094250305696186, 'samples': 4690432, 'steps': 9160, 'batch_loss/train': 0.7149030268192291}
12/24/2021 04:30:33 - INFO - codeparrot_training - Step 9161: {'lr': 0.0004609340424860223, 'samples': 4690944, 'steps': 9161, 'batch_loss/train': 0.8226194274611771}
12/24/2021 04:30:43 - INFO - codeparrot_training - Step 9162: {'lr': 0.00046092558107648873, 'samples': 4691456, 'steps': 9162, 'batch_loss/train': 0.7878418611362576}
12/24/2021 04:30:54 - INFO - codeparrot_training - Step 9163: {'lr': 0.0004609171188283947, 'samples': 4691968, 'steps': 9163, 'batch_loss/train': 0.8376116501167417}
12/24/2021 04:31:05 - INFO - codeparrot_training - Step 9164: {'lr': 0.00046090865574177387, 'samples': 4692480, 'steps': 9164, 'batch_loss/train': 0.8277245936915278}
12/24/2021 04:31:17 - INFO - codeparrot_training - Step 9165: {'lr': 0.0004609001918166598, 'samples': 4692992, 'steps': 9165, 'batch_loss/train': 0.6608478997368366}
12/24/2021 04:31:27 - INFO - codeparrot_training - Step 9166: {'lr': 0.00046089172705308625, 'samples': 4693504, 'steps': 9166, 'batch_loss/train': 0.9082511281594634}
12/24/2021 04:31:38 - INFO - codeparrot_training - Step 9167: {'lr': 0.00046088326145108673, 'samples': 4694016, 'steps': 9167, 'batch_loss/train': 0.7678459514863789}
12/24/2021 04:31:51 - INFO - codeparrot_training - Step 9168: {'lr': 0.0004608747950106951, 'samples': 4694528, 'steps': 9168, 'batch_loss/train': 0.8036837978288531}
12/24/2021 04:32:01 - INFO - codeparrot_training - Step 9169: {'lr': 0.0004608663277319448, 'samples': 4695040, 'steps': 9169, 'batch_loss/train': 0.7943072221241891}
12/24/2021 04:32:12 - INFO - codeparrot_training - Step 9170: {'lr': 0.0004608578596148697, 'samples': 4695552, 'steps': 9170, 'batch_loss/train': 0.8183227134868503}
12/24/2021 04:32:24 - INFO - codeparrot_training - Step 9171: {'lr': 0.00046084939065950333, 'samples': 4696064, 'steps': 9171, 'batch_loss/train': 0.8011547522619367}
12/24/2021 04:32:35 - INFO - codeparrot_training - Step 9172: {'lr': 0.0004608409208658794, 'samples': 4696576, 'steps': 9172, 'batch_loss/train': 0.7243156870827079}
12/24/2021 04:32:45 - INFO - codeparrot_training - Step 9173: {'lr': 0.00046083245023403153, 'samples': 4697088, 'steps': 9173, 'batch_loss/train': 0.6954598659649491}
12/24/2021 04:32:57 - INFO - codeparrot_training - Step 9174: {'lr': 0.00046082397876399353, 'samples': 4697600, 'steps': 9174, 'batch_loss/train': 0.743014182895422}
12/24/2021 04:33:08 - INFO - codeparrot_training - Step 9175: {'lr': 0.00046081550645579896, 'samples': 4698112, 'steps': 9175, 'batch_loss/train': 0.819140668027103}
12/24/2021 04:33:19 - INFO - codeparrot_training - Step 9176: {'lr': 0.00046080703330948146, 'samples': 4698624, 'steps': 9176, 'batch_loss/train': 0.8693468747660518}
12/24/2021 04:33:29 - INFO - codeparrot_training - Step 9177: {'lr': 0.0004607985593250749, 'samples': 4699136, 'steps': 9177, 'batch_loss/train': 0.7455008584074676}
12/24/2021 04:33:41 - INFO - codeparrot_training - Step 9178: {'lr': 0.00046079008450261284, 'samples': 4699648, 'steps': 9178, 'batch_loss/train': 0.888488280121237}
12/24/2021 04:33:52 - INFO - codeparrot_training - Step 9179: {'lr': 0.00046078160884212895, 'samples': 4700160, 'steps': 9179, 'batch_loss/train': 1.2139940187335014}
12/24/2021 04:34:03 - INFO - codeparrot_training - Step 9180: {'lr': 0.000460773132343657, 'samples': 4700672, 'steps': 9180, 'batch_loss/train': 0.7742520226165652}
12/24/2021 04:34:15 - INFO - codeparrot_training - Step 9181: {'lr': 0.0004607646550072307, 'samples': 4701184, 'steps': 9181, 'batch_loss/train': 0.7829918647184968}
12/24/2021 04:34:26 - INFO - codeparrot_training - Step 9182: {'lr': 0.00046075617683288367, 'samples': 4701696, 'steps': 9182, 'batch_loss/train': 0.9602191625162959}
12/24/2021 04:34:37 - INFO - codeparrot_training - Step 9183: {'lr': 0.00046074769782064964, 'samples': 4702208, 'steps': 9183, 'batch_loss/train': 0.7751077152788639}
12/24/2021 04:34:49 - INFO - codeparrot_training - Step 9184: {'lr': 0.0004607392179705624, 'samples': 4702720, 'steps': 9184, 'batch_loss/train': 0.6882170317694545}
12/24/2021 04:34:59 - INFO - codeparrot_training - Step 9185: {'lr': 0.00046073073728265557, 'samples': 4703232, 'steps': 9185, 'batch_loss/train': 0.8351130066439509}
12/24/2021 04:35:10 - INFO - codeparrot_training - Step 9186: {'lr': 0.00046072225575696286, 'samples': 4703744, 'steps': 9186, 'batch_loss/train': 0.8670218493789434}
12/24/2021 04:35:22 - INFO - codeparrot_training - Step 9187: {'lr': 0.00046071377339351803, 'samples': 4704256, 'steps': 9187, 'batch_loss/train': 0.7756240892922506}
12/24/2021 04:35:33 - INFO - codeparrot_training - Step 9188: {'lr': 0.00046070529019235485, 'samples': 4704768, 'steps': 9188, 'batch_loss/train': 0.7527682622894645}
12/24/2021 04:35:43 - INFO - codeparrot_training - Step 9189: {'lr': 0.00046069680615350693, 'samples': 4705280, 'steps': 9189, 'batch_loss/train': 0.7017813276033849}
12/24/2021 04:35:54 - INFO - codeparrot_training - Step 9190: {'lr': 0.00046068832127700815, 'samples': 4705792, 'steps': 9190, 'batch_loss/train': 0.8419773327186704}
12/24/2021 04:36:07 - INFO - codeparrot_training - Step 9191: {'lr': 0.0004606798355628922, 'samples': 4706304, 'steps': 9191, 'batch_loss/train': 0.7853822563192807}
12/24/2021 04:36:17 - INFO - codeparrot_training - Step 9192: {'lr': 0.0004606713490111926, 'samples': 4706816, 'steps': 9192, 'batch_loss/train': 0.8706380510702729}
12/24/2021 04:36:28 - INFO - codeparrot_training - Step 9193: {'lr': 0.00046066286162194336, 'samples': 4707328, 'steps': 9193, 'batch_loss/train': 0.5711594321765006}
12/24/2021 04:36:40 - INFO - codeparrot_training - Step 9194: {'lr': 0.0004606543733951781, 'samples': 4707840, 'steps': 9194, 'batch_loss/train': 0.7523833261802793}
12/24/2021 04:36:51 - INFO - codeparrot_training - Step 9195: {'lr': 0.0004606458843309307, 'samples': 4708352, 'steps': 9195, 'batch_loss/train': 0.8117997348308563}
12/24/2021 04:37:01 - INFO - codeparrot_training - Step 9196: {'lr': 0.0004606373944292347, 'samples': 4708864, 'steps': 9196, 'batch_loss/train': 0.7196988294599578}
12/24/2021 04:37:14 - INFO - codeparrot_training - Step 9197: {'lr': 0.0004606289036901239, 'samples': 4709376, 'steps': 9197, 'batch_loss/train': 0.7847016941523179}
12/24/2021 04:37:25 - INFO - codeparrot_training - Step 9198: {'lr': 0.0004606204121136322, 'samples': 4709888, 'steps': 9198, 'batch_loss/train': 0.724056736798957}
12/24/2021 04:37:36 - INFO - codeparrot_training - Step 9199: {'lr': 0.0004606119196997932, 'samples': 4710400, 'steps': 9199, 'batch_loss/train': 0.8864319967105985}
12/24/2021 04:37:47 - INFO - codeparrot_training - Step 9200: {'lr': 0.00046060342644864085, 'samples': 4710912, 'steps': 9200, 'batch_loss/train': 0.9622691171243787}
12/24/2021 04:37:58 - INFO - codeparrot_training - Step 9201: {'lr': 0.00046059493236020867, 'samples': 4711424, 'steps': 9201, 'batch_loss/train': 0.811912783421576}
12/24/2021 04:38:09 - INFO - codeparrot_training - Step 9202: {'lr': 0.0004605864374345306, 'samples': 4711936, 'steps': 9202, 'batch_loss/train': 0.8334788605570793}
12/24/2021 04:38:19 - INFO - codeparrot_training - Step 9203: {'lr': 0.00046057794167164034, 'samples': 4712448, 'steps': 9203, 'batch_loss/train': 0.8112786337733269}
12/24/2021 04:38:31 - INFO - codeparrot_training - Step 9204: {'lr': 0.00046056944507157175, 'samples': 4712960, 'steps': 9204, 'batch_loss/train': 0.8401810410432518}
12/24/2021 04:38:42 - INFO - codeparrot_training - Step 9205: {'lr': 0.0004605609476343585, 'samples': 4713472, 'steps': 9205, 'batch_loss/train': 0.8072110648499802}
12/24/2021 04:38:52 - INFO - codeparrot_training - Step 9206: {'lr': 0.0004605524493600345, 'samples': 4713984, 'steps': 9206, 'batch_loss/train': 0.9613432087935507}
12/24/2021 04:39:06 - INFO - codeparrot_training - Step 9207: {'lr': 0.0004605439502486334, 'samples': 4714496, 'steps': 9207, 'batch_loss/train': 0.8543742187321186}
12/24/2021 04:39:16 - INFO - codeparrot_training - Step 9208: {'lr': 0.0004605354503001891, 'samples': 4715008, 'steps': 9208, 'batch_loss/train': 0.8338683703914285}
12/24/2021 04:39:27 - INFO - codeparrot_training - Step 9209: {'lr': 0.00046052694951473534, 'samples': 4715520, 'steps': 9209, 'batch_loss/train': 0.8027002871967852}
12/24/2021 04:39:39 - INFO - codeparrot_training - Step 9210: {'lr': 0.0004605184478923059, 'samples': 4716032, 'steps': 9210, 'batch_loss/train': 0.5236996969906613}
12/24/2021 04:39:49 - INFO - codeparrot_training - Step 9211: {'lr': 0.00046050994543293464, 'samples': 4716544, 'steps': 9211, 'batch_loss/train': 0.8731245705857873}
12/24/2021 04:40:00 - INFO - codeparrot_training - Step 9212: {'lr': 0.00046050144213665524, 'samples': 4717056, 'steps': 9212, 'batch_loss/train': 0.8735182788223028}
12/24/2021 04:40:11 - INFO - codeparrot_training - Step 9213: {'lr': 0.00046049293800350165, 'samples': 4717568, 'steps': 9213, 'batch_loss/train': 0.7977657075971365}
12/24/2021 04:40:23 - INFO - codeparrot_training - Step 9214: {'lr': 0.00046048443303350764, 'samples': 4718080, 'steps': 9214, 'batch_loss/train': 0.8534786831587553}
12/24/2021 04:40:33 - INFO - codeparrot_training - Step 9215: {'lr': 0.0004604759272267069, 'samples': 4718592, 'steps': 9215, 'batch_loss/train': 0.7480785414809361}
12/24/2021 04:40:44 - INFO - codeparrot_training - Step 9216: {'lr': 0.0004604674205831334, 'samples': 4719104, 'steps': 9216, 'batch_loss/train': 0.6576629681512713}
12/24/2021 04:40:56 - INFO - codeparrot_training - Step 9217: {'lr': 0.00046045891310282096, 'samples': 4719616, 'steps': 9217, 'batch_loss/train': 0.8475822452455759}
12/24/2021 04:41:07 - INFO - codeparrot_training - Step 9218: {'lr': 0.00046045040478580324, 'samples': 4720128, 'steps': 9218, 'batch_loss/train': 0.7927730555529706}
12/24/2021 04:41:17 - INFO - codeparrot_training - Step 9219: {'lr': 0.00046044189563211423, 'samples': 4720640, 'steps': 9219, 'batch_loss/train': 0.7223151442594826}
12/24/2021 04:41:30 - INFO - codeparrot_training - Step 9220: {'lr': 0.0004604333856417877, 'samples': 4721152, 'steps': 9220, 'batch_loss/train': 0.797158126719296}
12/24/2021 04:41:41 - INFO - codeparrot_training - Step 9221: {'lr': 0.0004604248748148575, 'samples': 4721664, 'steps': 9221, 'batch_loss/train': 0.8281082157045603}
12/24/2021 04:41:51 - INFO - codeparrot_training - Step 9222: {'lr': 0.00046041636315135746, 'samples': 4722176, 'steps': 9222, 'batch_loss/train': 0.7677966672927141}
12/24/2021 04:42:02 - INFO - codeparrot_training - Step 9223: {'lr': 0.0004604078506513214, 'samples': 4722688, 'steps': 9223, 'batch_loss/train': 0.7900439444929361}
12/24/2021 04:42:14 - INFO - codeparrot_training - Step 9224: {'lr': 0.00046039933731478316, 'samples': 4723200, 'steps': 9224, 'batch_loss/train': 0.8220033701509237}
12/24/2021 04:42:24 - INFO - codeparrot_training - Step 9225: {'lr': 0.0004603908231417766, 'samples': 4723712, 'steps': 9225, 'batch_loss/train': 0.872375613078475}
12/24/2021 04:42:35 - INFO - codeparrot_training - Step 9226: {'lr': 0.0004603823081323356, 'samples': 4724224, 'steps': 9226, 'batch_loss/train': 0.8773355002049357}
12/24/2021 04:42:47 - INFO - codeparrot_training - Step 9227: {'lr': 0.0004603737922864939, 'samples': 4724736, 'steps': 9227, 'batch_loss/train': 0.8369566630572081}
12/24/2021 04:42:58 - INFO - codeparrot_training - Step 9228: {'lr': 0.0004603652756042855, 'samples': 4725248, 'steps': 9228, 'batch_loss/train': 0.8219319442287087}
12/24/2021 04:43:08 - INFO - codeparrot_training - Step 9229: {'lr': 0.00046035675808574417, 'samples': 4725760, 'steps': 9229, 'batch_loss/train': 0.8528985735028982}
12/24/2021 04:43:21 - INFO - codeparrot_training - Step 9230: {'lr': 0.00046034823973090377, 'samples': 4726272, 'steps': 9230, 'batch_loss/train': 0.8976454269140959}
12/24/2021 04:43:31 - INFO - codeparrot_training - Step 9231: {'lr': 0.0004603397205397983, 'samples': 4726784, 'steps': 9231, 'batch_loss/train': 0.7152925869449973}
12/24/2021 04:43:42 - INFO - codeparrot_training - Step 9232: {'lr': 0.0004603312005124615, 'samples': 4727296, 'steps': 9232, 'batch_loss/train': 0.7626667832955718}
12/24/2021 04:43:56 - INFO - codeparrot_training - Step 9233: {'lr': 0.00046032267964892713, 'samples': 4727808, 'steps': 9233, 'batch_loss/train': 0.6858877824270166}
12/24/2021 04:44:07 - INFO - codeparrot_training - Step 9234: {'lr': 0.0004603141579492293, 'samples': 4728320, 'steps': 9234, 'batch_loss/train': 0.7669557738117874}
12/24/2021 04:44:18 - INFO - codeparrot_training - Step 9235: {'lr': 0.0004603056354134018, 'samples': 4728832, 'steps': 9235, 'batch_loss/train': 0.8026896719820797}
12/24/2021 04:44:28 - INFO - codeparrot_training - Step 9236: {'lr': 0.0004602971120414785, 'samples': 4729344, 'steps': 9236, 'batch_loss/train': 1.258996531367302}
12/24/2021 04:44:40 - INFO - codeparrot_training - Step 9237: {'lr': 0.00046028858783349325, 'samples': 4729856, 'steps': 9237, 'batch_loss/train': 0.7796207554638386}
12/24/2021 04:44:51 - INFO - codeparrot_training - Step 9238: {'lr': 0.00046028006278948, 'samples': 4730368, 'steps': 9238, 'batch_loss/train': 0.8209165539592505}
12/24/2021 04:45:01 - INFO - codeparrot_training - Step 9239: {'lr': 0.00046027153690947266, 'samples': 4730880, 'steps': 9239, 'batch_loss/train': 0.8115491946227849}
12/24/2021 04:45:16 - INFO - codeparrot_training - Step 9240: {'lr': 0.00046026301019350504, 'samples': 4731392, 'steps': 9240, 'batch_loss/train': 0.907179694622755}
12/24/2021 04:45:26 - INFO - codeparrot_training - Step 9241: {'lr': 0.00046025448264161103, 'samples': 4731904, 'steps': 9241, 'batch_loss/train': 0.7871828677598387}
12/24/2021 04:45:37 - INFO - codeparrot_training - Step 9242: {'lr': 0.00046024595425382464, 'samples': 4732416, 'steps': 9242, 'batch_loss/train': 0.9173800898715854}
12/24/2021 04:45:49 - INFO - codeparrot_training - Step 9243: {'lr': 0.00046023742503017973, 'samples': 4732928, 'steps': 9243, 'batch_loss/train': 0.7822496606968343}
12/24/2021 04:46:00 - INFO - codeparrot_training - Step 9244: {'lr': 0.0004602288949707102, 'samples': 4733440, 'steps': 9244, 'batch_loss/train': 0.8277415951015428}
12/24/2021 04:46:10 - INFO - codeparrot_training - Step 9245: {'lr': 0.00046022036407544997, 'samples': 4733952, 'steps': 9245, 'batch_loss/train': 0.876098201610148}
12/24/2021 04:46:21 - INFO - codeparrot_training - Step 9246: {'lr': 0.0004602118323444329, 'samples': 4734464, 'steps': 9246, 'batch_loss/train': 0.7080364188877866}
12/24/2021 04:46:33 - INFO - codeparrot_training - Step 9247: {'lr': 0.00046020329977769297, 'samples': 4734976, 'steps': 9247, 'batch_loss/train': 0.8213855128269643}
12/24/2021 04:46:43 - INFO - codeparrot_training - Step 9248: {'lr': 0.0004601947663752641, 'samples': 4735488, 'steps': 9248, 'batch_loss/train': 0.9052314823493361}
12/24/2021 04:46:54 - INFO - codeparrot_training - Step 9249: {'lr': 0.00046018623213718026, 'samples': 4736000, 'steps': 9249, 'batch_loss/train': 0.9052667012438178}
12/24/2021 04:47:08 - INFO - codeparrot_training - Step 9250: {'lr': 0.00046017769706347523, 'samples': 4736512, 'steps': 9250, 'batch_loss/train': 0.7302255495451391}
12/24/2021 04:47:19 - INFO - codeparrot_training - Step 9251: {'lr': 0.00046016916115418317, 'samples': 4737024, 'steps': 9251, 'batch_loss/train': 0.784710144624114}
12/24/2021 04:47:29 - INFO - codeparrot_training - Step 9252: {'lr': 0.0004601606244093377, 'samples': 4737536, 'steps': 9252, 'batch_loss/train': 0.7380027244798839}
12/24/2021 04:47:42 - INFO - codeparrot_training - Step 9253: {'lr': 0.00046015208682897307, 'samples': 4738048, 'steps': 9253, 'batch_loss/train': 0.8533845786005259}
12/24/2021 04:47:52 - INFO - codeparrot_training - Step 9254: {'lr': 0.0004601435484131231, 'samples': 4738560, 'steps': 9254, 'batch_loss/train': 0.8718177089467645}
12/24/2021 04:48:03 - INFO - codeparrot_training - Step 9255: {'lr': 0.00046013500916182166, 'samples': 4739072, 'steps': 9255, 'batch_loss/train': 0.7957398863509297}
12/24/2021 04:48:13 - INFO - codeparrot_training - Step 9256: {'lr': 0.0004601264690751028, 'samples': 4739584, 'steps': 9256, 'batch_loss/train': 0.7659076829440892}
12/24/2021 04:48:26 - INFO - codeparrot_training - Step 9257: {'lr': 0.0004601179281530005, 'samples': 4740096, 'steps': 9257, 'batch_loss/train': 0.9196987394243479}
12/24/2021 04:48:37 - INFO - codeparrot_training - Step 9258: {'lr': 0.0004601093863955486, 'samples': 4740608, 'steps': 9258, 'batch_loss/train': 0.84360830206424}
12/24/2021 04:48:47 - INFO - codeparrot_training - Step 9259: {'lr': 0.0004601008438027811, 'samples': 4741120, 'steps': 9259, 'batch_loss/train': 0.7509365421719849}
12/24/2021 04:49:01 - INFO - codeparrot_training - Step 9260: {'lr': 0.0004600923003747319, 'samples': 4741632, 'steps': 9260, 'batch_loss/train': 0.9708970747888088}
12/24/2021 04:49:12 - INFO - codeparrot_training - Step 9261: {'lr': 0.00046008375611143515, 'samples': 4742144, 'steps': 9261, 'batch_loss/train': 0.823251566849649}
12/24/2021 04:49:22 - INFO - codeparrot_training - Step 9262: {'lr': 0.0004600752110129247, 'samples': 4742656, 'steps': 9262, 'batch_loss/train': 0.8312509190291166}
12/24/2021 04:49:34 - INFO - codeparrot_training - Step 9263: {'lr': 0.0004600666650792345, 'samples': 4743168, 'steps': 9263, 'batch_loss/train': 0.8871941892430186}
12/24/2021 04:49:45 - INFO - codeparrot_training - Step 9264: {'lr': 0.00046005811831039854, 'samples': 4743680, 'steps': 9264, 'batch_loss/train': 0.8751127123832703}
12/24/2021 04:49:56 - INFO - codeparrot_training - Step 9265: {'lr': 0.00046004957070645093, 'samples': 4744192, 'steps': 9265, 'batch_loss/train': 0.8287210892885923}
12/24/2021 04:50:06 - INFO - codeparrot_training - Step 9266: {'lr': 0.0004600410222674254, 'samples': 4744704, 'steps': 9266, 'batch_loss/train': 0.7105781342834234}
12/24/2021 04:50:19 - INFO - codeparrot_training - Step 9267: {'lr': 0.00046003247299335615, 'samples': 4745216, 'steps': 9267, 'batch_loss/train': 0.9116681823506951}
12/24/2021 04:50:29 - INFO - codeparrot_training - Step 9268: {'lr': 0.0004600239228842771, 'samples': 4745728, 'steps': 9268, 'batch_loss/train': 0.7805524853756651}
12/24/2021 04:50:40 - INFO - codeparrot_training - Step 9269: {'lr': 0.0004600153719402222, 'samples': 4746240, 'steps': 9269, 'batch_loss/train': 0.7882285746745765}
12/24/2021 04:50:54 - INFO - codeparrot_training - Step 9270: {'lr': 0.0004600068201612255, 'samples': 4746752, 'steps': 9270, 'batch_loss/train': 0.8020293544977903}
12/24/2021 04:51:04 - INFO - codeparrot_training - Step 9271: {'lr': 0.000459998267547321, 'samples': 4747264, 'steps': 9271, 'batch_loss/train': 0.7877363516017795}
12/24/2021 04:51:15 - INFO - codeparrot_training - Step 9272: {'lr': 0.0004599897140985426, 'samples': 4747776, 'steps': 9272, 'batch_loss/train': 0.8839797265827656}
12/24/2021 04:51:28 - INFO - codeparrot_training - Step 9273: {'lr': 0.00045998115981492446, 'samples': 4748288, 'steps': 9273, 'batch_loss/train': 0.9158861770993099}
12/24/2021 04:51:39 - INFO - codeparrot_training - Step 9274: {'lr': 0.0004599726046965005, 'samples': 4748800, 'steps': 9274, 'batch_loss/train': 1.1038900120183825}
12/24/2021 04:51:49 - INFO - codeparrot_training - Step 9275: {'lr': 0.00045996404874330474, 'samples': 4749312, 'steps': 9275, 'batch_loss/train': 0.9150612307712436}
12/24/2021 04:52:00 - INFO - codeparrot_training - Step 9276: {'lr': 0.0004599554919553712, 'samples': 4749824, 'steps': 9276, 'batch_loss/train': 0.8375118556432426}
12/24/2021 04:52:12 - INFO - codeparrot_training - Step 9277: {'lr': 0.0004599469343327338, 'samples': 4750336, 'steps': 9277, 'batch_loss/train': 0.8485537841916084}
12/24/2021 04:52:23 - INFO - codeparrot_training - Step 9278: {'lr': 0.0004599383758754268, 'samples': 4750848, 'steps': 9278, 'batch_loss/train': 0.774083343334496}
12/24/2021 04:52:33 - INFO - codeparrot_training - Step 9279: {'lr': 0.0004599298165834841, 'samples': 4751360, 'steps': 9279, 'batch_loss/train': 0.8016637642867863}
12/24/2021 04:52:47 - INFO - codeparrot_training - Step 9280: {'lr': 0.00045992125645693963, 'samples': 4751872, 'steps': 9280, 'batch_loss/train': 0.8096373518928885}
12/24/2021 04:52:58 - INFO - codeparrot_training - Step 9281: {'lr': 0.0004599126954958275, 'samples': 4752384, 'steps': 9281, 'batch_loss/train': 0.8075381172820926}
12/24/2021 04:53:08 - INFO - codeparrot_training - Step 9282: {'lr': 0.0004599041337001818, 'samples': 4752896, 'steps': 9282, 'batch_loss/train': 0.8995330529287457}
12/24/2021 04:53:19 - INFO - codeparrot_training - Step 9283: {'lr': 0.00045989557107003657, 'samples': 4753408, 'steps': 9283, 'batch_loss/train': 0.8089833837002516}
12/24/2021 04:53:31 - INFO - codeparrot_training - Step 9284: {'lr': 0.00045988700760542575, 'samples': 4753920, 'steps': 9284, 'batch_loss/train': 0.7587472428567708}
12/24/2021 04:53:42 - INFO - codeparrot_training - Step 9285: {'lr': 0.0004598784433063834, 'samples': 4754432, 'steps': 9285, 'batch_loss/train': 0.7621959992684424}
12/24/2021 04:53:52 - INFO - codeparrot_training - Step 9286: {'lr': 0.00045986987817294366, 'samples': 4754944, 'steps': 9286, 'batch_loss/train': 0.8232845840975642}
12/24/2021 04:54:06 - INFO - codeparrot_training - Step 9287: {'lr': 0.0004598613122051405, 'samples': 4755456, 'steps': 9287, 'batch_loss/train': 0.8708798363804817}
12/24/2021 04:54:17 - INFO - codeparrot_training - Step 9288: {'lr': 0.00045985274540300805, 'samples': 4755968, 'steps': 9288, 'batch_loss/train': 0.9340641228482127}
12/24/2021 04:54:28 - INFO - codeparrot_training - Step 9289: {'lr': 0.00045984417776658026, 'samples': 4756480, 'steps': 9289, 'batch_loss/train': 0.8598522581160069}
12/24/2021 04:54:40 - INFO - codeparrot_training - Step 9290: {'lr': 0.0004598356092958913, 'samples': 4756992, 'steps': 9290, 'batch_loss/train': 1.0210999231785536}
12/24/2021 04:54:50 - INFO - codeparrot_training - Step 9291: {'lr': 0.00045982703999097516, 'samples': 4757504, 'steps': 9291, 'batch_loss/train': 0.8950082696974277}
12/24/2021 04:55:01 - INFO - codeparrot_training - Step 9292: {'lr': 0.00045981846985186593, 'samples': 4758016, 'steps': 9292, 'batch_loss/train': 0.7747024795971811}
12/24/2021 04:55:11 - INFO - codeparrot_training - Step 9293: {'lr': 0.0004598098988785977, 'samples': 4758528, 'steps': 9293, 'batch_loss/train': 0.8648281996138394}
12/24/2021 04:55:24 - INFO - codeparrot_training - Step 9294: {'lr': 0.0004598013270712045, 'samples': 4759040, 'steps': 9294, 'batch_loss/train':
0.8759939088486135} 12/24/2021 04:55:34 - INFO - codeparrot_training - Step 9295: {'lr': 0.0004597927544297205, 'samples': 4759552, 'steps': 9295, 'batch_loss/train': 0.7951788953505456} 12/24/2021 04:55:45 - INFO - codeparrot_training - Step 9296: {'lr': 0.00045978418095417973, 'samples': 4760064, 'steps': 9296, 'batch_loss/train': 0.7649937216192484} 12/24/2021 04:55:59 - INFO - codeparrot_training - Step 9297: {'lr': 0.0004597756066446162, 'samples': 4760576, 'steps': 9297, 'batch_loss/train': 0.8561283838935196} 12/24/2021 04:56:10 - INFO - codeparrot_training - Step 9298: {'lr': 0.00045976703150106416, 'samples': 4761088, 'steps': 9298, 'batch_loss/train': 0.8499944480136037} 12/24/2021 04:56:21 - INFO - codeparrot_training - Step 9299: {'lr': 0.0004597584555235575, 'samples': 4761600, 'steps': 9299, 'batch_loss/train': 0.8641223842278123} 12/24/2021 04:56:33 - INFO - codeparrot_training - Step 9300: {'lr': 0.0004597498787121305, 'samples': 4762112, 'steps': 9300, 'batch_loss/train': 0.3855314434040338} 12/24/2021 04:56:44 - INFO - codeparrot_training - Step 9301: {'lr': 0.0004597413010668171, 'samples': 4762624, 'steps': 9301, 'batch_loss/train': 0.8144504968076944} 12/24/2021 04:56:54 - INFO - codeparrot_training - Step 9302: {'lr': 0.0004597327225876515, 'samples': 4763136, 'steps': 9302, 'batch_loss/train': 1.6427259691990912} 12/24/2021 04:57:05 - INFO - codeparrot_training - Step 9303: {'lr': 0.0004597241432746679, 'samples': 4763648, 'steps': 9303, 'batch_loss/train': 0.9008023855276406} 12/24/2021 04:57:17 - INFO - codeparrot_training - Step 9304: {'lr': 0.0004597155631279002, 'samples': 4764160, 'steps': 9304, 'batch_loss/train': 0.8824897054582834} 12/24/2021 04:57:27 - INFO - codeparrot_training - Step 9305: {'lr': 0.0004597069821473826, 'samples': 4764672, 'steps': 9305, 'batch_loss/train': 0.7982721021398902} 12/24/2021 04:57:38 - INFO - codeparrot_training - Step 9306: {'lr': 0.00045969840033314926, 'samples': 4765184, 'steps': 9306, 
'batch_loss/train': 0.82406219933182} 12/24/2021 04:57:52 - INFO - codeparrot_training - Step 9307: {'lr': 0.00045968981768523424, 'samples': 4765696, 'steps': 9307, 'batch_loss/train': 0.7879882585257292} 12/24/2021 04:58:02 - INFO - codeparrot_training - Step 9308: {'lr': 0.0004596812342036717, 'samples': 4766208, 'steps': 9308, 'batch_loss/train': 0.877680066972971} 12/24/2021 04:58:13 - INFO - codeparrot_training - Step 9309: {'lr': 0.0004596726498884958, 'samples': 4766720, 'steps': 9309, 'batch_loss/train': 0.8911777632310987} 12/24/2021 04:58:24 - INFO - codeparrot_training - Step 9310: {'lr': 0.00045966406473974046, 'samples': 4767232, 'steps': 9310, 'batch_loss/train': 0.7913695313036442} 12/24/2021 04:58:37 - INFO - codeparrot_training - Step 9311: {'lr': 0.00045965547875744005, 'samples': 4767744, 'steps': 9311, 'batch_loss/train': 0.8720221314579248} 12/24/2021 04:58:47 - INFO - codeparrot_training - Step 9312: {'lr': 0.00045964689194162865, 'samples': 4768256, 'steps': 9312, 'batch_loss/train': 0.8670799992978573} 12/24/2021 04:58:58 - INFO - codeparrot_training - Step 9313: {'lr': 0.0004596383042923403, 'samples': 4768768, 'steps': 9313, 'batch_loss/train': 0.8938649175688624} 12/24/2021 04:59:10 - INFO - codeparrot_training - Step 9314: {'lr': 0.00045962971580960926, 'samples': 4769280, 'steps': 9314, 'batch_loss/train': 1.1816372275352478} 12/24/2021 04:59:21 - INFO - codeparrot_training - Step 9315: {'lr': 0.0004596211264934696, 'samples': 4769792, 'steps': 9315, 'batch_loss/train': 0.7779205874539912} 12/24/2021 04:59:32 - INFO - codeparrot_training - Step 9316: {'lr': 0.0004596125363439555, 'samples': 4770304, 'steps': 9316, 'batch_loss/train': 0.7069731913506985} 12/24/2021 04:59:45 - INFO - codeparrot_training - Step 9317: {'lr': 0.00045960394536110114, 'samples': 4770816, 'steps': 9317, 'batch_loss/train': 1.1976966676302254} 12/24/2021 04:59:56 - INFO - codeparrot_training - Step 9318: {'lr': 0.00045959535354494054, 'samples': 4771328, 
'steps': 9318, 'batch_loss/train': 0.8221385451033711} 12/24/2021 05:00:07 - INFO - codeparrot_training - Step 9319: {'lr': 0.00045958676089550806, 'samples': 4771840, 'steps': 9319, 'batch_loss/train': 0.8527211674954742} 12/24/2021 05:00:17 - INFO - codeparrot_training - Step 9320: {'lr': 0.0004595781674128377, 'samples': 4772352, 'steps': 9320, 'batch_loss/train': 0.8114150678738952} 12/24/2021 05:00:30 - INFO - codeparrot_training - Step 9321: {'lr': 0.00045956957309696366, 'samples': 4772864, 'steps': 9321, 'batch_loss/train': 0.844662144780159} 12/24/2021 05:00:40 - INFO - codeparrot_training - Step 9322: {'lr': 0.00045956097794792015, 'samples': 4773376, 'steps': 9322, 'batch_loss/train': 0.8290257258340716} 12/24/2021 05:00:51 - INFO - codeparrot_training - Step 9323: {'lr': 0.0004595523819657413, 'samples': 4773888, 'steps': 9323, 'batch_loss/train': 0.8241915919352323} 12/24/2021 05:01:03 - INFO - codeparrot_training - Step 9324: {'lr': 0.00045954378515046133, 'samples': 4774400, 'steps': 9324, 'batch_loss/train': 0.8740467950701714} 12/24/2021 05:01:14 - INFO - codeparrot_training - Step 9325: {'lr': 0.00045953518750211443, 'samples': 4774912, 'steps': 9325, 'batch_loss/train': 0.8043915973976254} 12/24/2021 05:01:24 - INFO - codeparrot_training - Step 9326: {'lr': 0.0004595265890207347, 'samples': 4775424, 'steps': 9326, 'batch_loss/train': 0.9857061803340912} 12/24/2021 05:01:38 - INFO - codeparrot_training - Step 9327: {'lr': 0.00045951798970635636, 'samples': 4775936, 'steps': 9327, 'batch_loss/train': 0.8134226649999619} 12/24/2021 05:01:49 - INFO - codeparrot_training - Step 9328: {'lr': 0.0004595093895590136, 'samples': 4776448, 'steps': 9328, 'batch_loss/train': 0.8493755338713527} 12/24/2021 05:01:59 - INFO - codeparrot_training - Step 9329: {'lr': 0.00045950078857874067, 'samples': 4776960, 'steps': 9329, 'batch_loss/train': 0.8515662252902985} 12/24/2021 05:02:10 - INFO - codeparrot_training - Step 9330: {'lr': 0.00045949218676557163, 
'samples': 4777472, 'steps': 9330, 'batch_loss/train': 0.7793786218389869} 12/24/2021 05:02:22 - INFO - codeparrot_training - Step 9331: {'lr': 0.0004594835841195408, 'samples': 4777984, 'steps': 9331, 'batch_loss/train': 0.8305459627881646} 12/24/2021 05:02:33 - INFO - codeparrot_training - Step 9332: {'lr': 0.00045947498064068234, 'samples': 4778496, 'steps': 9332, 'batch_loss/train': 0.8289373542647809} 12/24/2021 05:02:43 - INFO - codeparrot_training - Step 9333: {'lr': 0.00045946637632903045, 'samples': 4779008, 'steps': 9333, 'batch_loss/train': 0.8905213233083487} 12/24/2021 05:02:55 - INFO - codeparrot_training - Step 9334: {'lr': 0.0004594577711846193, 'samples': 4779520, 'steps': 9334, 'batch_loss/train': 0.8727722521871328} 12/24/2021 05:03:06 - INFO - codeparrot_training - Step 9335: {'lr': 0.0004594491652074831, 'samples': 4780032, 'steps': 9335, 'batch_loss/train': 0.7946861684322357} 12/24/2021 05:03:17 - INFO - codeparrot_training - Step 9336: {'lr': 0.00045944055839765627, 'samples': 4780544, 'steps': 9336, 'batch_loss/train': 0.9541074745357037} 12/24/2021 05:03:31 - INFO - codeparrot_training - Step 9337: {'lr': 0.0004594319507551727, 'samples': 4781056, 'steps': 9337, 'batch_loss/train': 0.8935792236588895} 12/24/2021 05:03:41 - INFO - codeparrot_training - Step 9338: {'lr': 0.0004594233422800669, 'samples': 4781568, 'steps': 9338, 'batch_loss/train': 0.8819122649729252} 12/24/2021 05:03:52 - INFO - codeparrot_training - Step 9339: {'lr': 0.0004594147329723729, 'samples': 4782080, 'steps': 9339, 'batch_loss/train': 0.9431786257773638} 12/24/2021 05:04:02 - INFO - codeparrot_training - Step 9340: {'lr': 0.00045940612283212494, 'samples': 4782592, 'steps': 9340, 'batch_loss/train': 0.8611867972649634} 12/24/2021 05:04:14 - INFO - codeparrot_training - Step 9341: {'lr': 0.0004593975118593574, 'samples': 4783104, 'steps': 9341, 'batch_loss/train': 0.7992866281419992} 12/24/2021 05:04:25 - INFO - codeparrot_training - Step 9342: {'lr': 
0.0004593889000541044, 'samples': 4783616, 'steps': 9342, 'batch_loss/train': 0.7939455965533853} 12/24/2021 05:04:36 - INFO - codeparrot_training - Step 9343: {'lr': 0.00045938028741640015, 'samples': 4784128, 'steps': 9343, 'batch_loss/train': 0.9046383984386921} 12/24/2021 05:04:50 - INFO - codeparrot_training - Step 9344: {'lr': 0.00045937167394627897, 'samples': 4784640, 'steps': 9344, 'batch_loss/train': 0.8968088887631893} 12/24/2021 05:05:00 - INFO - codeparrot_training - Step 9345: {'lr': 0.00045936305964377507, 'samples': 4785152, 'steps': 9345, 'batch_loss/train': 0.9150660461746156} 12/24/2021 05:05:11 - INFO - codeparrot_training - Step 9346: {'lr': 0.00045935444450892265, 'samples': 4785664, 'steps': 9346, 'batch_loss/train': 0.8681742064654827} 12/24/2021 05:05:23 - INFO - codeparrot_training - Step 9347: {'lr': 0.00045934582854175604, 'samples': 4786176, 'steps': 9347, 'batch_loss/train': 0.8154079290106893} 12/24/2021 05:05:34 - INFO - codeparrot_training - Step 9348: {'lr': 0.0004593372117423095, 'samples': 4786688, 'steps': 9348, 'batch_loss/train': 0.8276100866496563} 12/24/2021 05:05:44 - INFO - codeparrot_training - Step 9349: {'lr': 0.0004593285941106172, 'samples': 4787200, 'steps': 9349, 'batch_loss/train': 0.895229032728821} 12/24/2021 05:05:55 - INFO - codeparrot_training - Step 9350: {'lr': 0.00045931997564671346, 'samples': 4787712, 'steps': 9350, 'batch_loss/train': 0.795809012837708} 12/24/2021 05:06:07 - INFO - codeparrot_training - Step 9351: {'lr': 0.00045931135635063255, 'samples': 4788224, 'steps': 9351, 'batch_loss/train': 0.894182306714356} 12/24/2021 05:06:18 - INFO - codeparrot_training - Step 9352: {'lr': 0.0004593027362224087, 'samples': 4788736, 'steps': 9352, 'batch_loss/train': 0.8946098294109106} 12/24/2021 05:06:28 - INFO - codeparrot_training - Step 9353: {'lr': 0.0004592941152620762, 'samples': 4789248, 'steps': 9353, 'batch_loss/train': 0.8300603730604053} 12/24/2021 05:06:40 - INFO - codeparrot_training - Step 
9354: {'lr': 0.0004592854934696693, 'samples': 4789760, 'steps': 9354, 'batch_loss/train': 0.7575595444068313} 12/24/2021 05:06:51 - INFO - codeparrot_training - Step 9355: {'lr': 0.00045927687084522243, 'samples': 4790272, 'steps': 9355, 'batch_loss/train': 0.8513914551585913} 12/24/2021 05:07:02 - INFO - codeparrot_training - Step 9356: {'lr': 0.0004592682473887696, 'samples': 4790784, 'steps': 9356, 'batch_loss/train': 0.767153870081529} 12/24/2021 05:07:15 - INFO - codeparrot_training - Step 9357: {'lr': 0.0004592596231003453, 'samples': 4791296, 'steps': 9357, 'batch_loss/train': 0.8331183716654778} 12/24/2021 05:07:26 - INFO - codeparrot_training - Step 9358: {'lr': 0.0004592509979799837, 'samples': 4791808, 'steps': 9358, 'batch_loss/train': 0.6505613764747977} 12/24/2021 05:07:37 - INFO - codeparrot_training - Step 9359: {'lr': 0.0004592423720277192, 'samples': 4792320, 'steps': 9359, 'batch_loss/train': 0.7604150315746665} 12/24/2021 05:07:49 - INFO - codeparrot_training - Step 9360: {'lr': 0.00045923374524358594, 'samples': 4792832, 'steps': 9360, 'batch_loss/train': 0.8349352143704891} 12/24/2021 05:08:00 - INFO - codeparrot_training - Step 9361: {'lr': 0.00045922511762761843, 'samples': 4793344, 'steps': 9361, 'batch_loss/train': 0.8160545979626477} 12/24/2021 05:08:10 - INFO - codeparrot_training - Step 9362: {'lr': 0.0004592164891798508, 'samples': 4793856, 'steps': 9362, 'batch_loss/train': 1.0545849353075027} 12/24/2021 05:08:21 - INFO - codeparrot_training - Step 9363: {'lr': 0.00045920785990031745, 'samples': 4794368, 'steps': 9363, 'batch_loss/train': 0.9075660295784473} 12/24/2021 05:08:33 - INFO - codeparrot_training - Step 9364: {'lr': 0.0004591992297890526, 'samples': 4794880, 'steps': 9364, 'batch_loss/train': 0.8178203520365059} 12/24/2021 05:08:44 - INFO - codeparrot_training - Step 9365: {'lr': 0.0004591905988460906, 'samples': 4795392, 'steps': 9365, 'batch_loss/train': 0.8686931568663567} 12/24/2021 05:08:54 - INFO - codeparrot_training 
- Step 9366: {'lr': 0.00045918196707146573, 'samples': 4795904, 'steps': 9366, 'batch_loss/train': 0.6785373943857849} 12/24/2021 05:09:08 - INFO - codeparrot_training - Step 9367: {'lr': 0.00045917333446521236, 'samples': 4796416, 'steps': 9367, 'batch_loss/train': 0.8648585665505379} 12/24/2021 05:09:19 - INFO - codeparrot_training - Step 9368: {'lr': 0.00045916470102736487, 'samples': 4796928, 'steps': 9368, 'batch_loss/train': 0.8101331973448396} 12/24/2021 05:09:29 - INFO - codeparrot_training - Step 9369: {'lr': 0.00045915606675795746, 'samples': 4797440, 'steps': 9369, 'batch_loss/train': 0.829486696049571} 12/24/2021 05:09:41 - INFO - codeparrot_training - Step 9370: {'lr': 0.0004591474316570245, 'samples': 4797952, 'steps': 9370, 'batch_loss/train': 0.8396150653716177} 12/24/2021 05:09:52 - INFO - codeparrot_training - Step 9371: {'lr': 0.0004591387957246003, 'samples': 4798464, 'steps': 9371, 'batch_loss/train': 0.8393078087829053} 12/24/2021 05:10:02 - INFO - codeparrot_training - Step 9372: {'lr': 0.00045913015896071924, 'samples': 4798976, 'steps': 9372, 'batch_loss/train': 0.7897831001318991} 12/24/2021 05:10:13 - INFO - codeparrot_training - Step 9373: {'lr': 0.0004591215213654156, 'samples': 4799488, 'steps': 9373, 'batch_loss/train': 0.7693133978173137} 12/24/2021 05:10:26 - INFO - codeparrot_training - Step 9374: {'lr': 0.00045911288293872383, 'samples': 4800000, 'steps': 9374, 'batch_loss/train': 0.8118476765230298} 12/24/2021 05:10:36 - INFO - codeparrot_training - Step 9375: {'lr': 0.0004591042436806782, 'samples': 4800512, 'steps': 9375, 'batch_loss/train': 0.886220058426261} 12/24/2021 05:10:47 - INFO - codeparrot_training - Step 9376: {'lr': 0.00045909560359131297, 'samples': 4801024, 'steps': 9376, 'batch_loss/train': 0.8461537105031312} 12/24/2021 05:10:59 - INFO - codeparrot_training - Step 9377: {'lr': 0.0004590869626706625, 'samples': 4801536, 'steps': 9377, 'batch_loss/train': 0.9265558728948236} 12/24/2021 05:11:10 - INFO - 
codeparrot_training - Step 9378: {'lr': 0.0004590783209187613, 'samples': 4802048, 'steps': 9378, 'batch_loss/train': 0.8334957584738731} 12/24/2021 05:11:20 - INFO - codeparrot_training - Step 9379: {'lr': 0.0004590696783356437, 'samples': 4802560, 'steps': 9379, 'batch_loss/train': 0.7394340145401657} 12/24/2021 05:11:33 - INFO - codeparrot_training - Step 9380: {'lr': 0.000459061034921344, 'samples': 4803072, 'steps': 9380, 'batch_loss/train': 0.8438987517729402} 12/24/2021 05:11:43 - INFO - codeparrot_training - Step 9381: {'lr': 0.0004590523906758965, 'samples': 4803584, 'steps': 9381, 'batch_loss/train': 0.7976532895117998} 12/24/2021 05:11:54 - INFO - codeparrot_training - Step 9382: {'lr': 0.0004590437455993357, 'samples': 4804096, 'steps': 9382, 'batch_loss/train': 0.7952272715047002} 12/24/2021 05:12:04 - INFO - codeparrot_training - Step 9383: {'lr': 0.0004590350996916958, 'samples': 4804608, 'steps': 9383, 'batch_loss/train': 0.8846123334951699} 12/24/2021 05:12:18 - INFO - codeparrot_training - Step 9384: {'lr': 0.0004590264529530114, 'samples': 4805120, 'steps': 9384, 'batch_loss/train': 0.8610642664134502} 12/24/2021 05:12:29 - INFO - codeparrot_training - Step 9385: {'lr': 0.0004590178053833166, 'samples': 4805632, 'steps': 9385, 'batch_loss/train': 0.8701298106461763} 12/24/2021 05:12:40 - INFO - codeparrot_training - Step 9386: {'lr': 0.000459009156982646, 'samples': 4806144, 'steps': 9386, 'batch_loss/train': 0.9523001732304692} 12/24/2021 05:12:52 - INFO - codeparrot_training - Step 9387: {'lr': 0.0004590005077510339, 'samples': 4806656, 'steps': 9387, 'batch_loss/train': 0.871425086632371} 12/24/2021 05:13:02 - INFO - codeparrot_training - Step 9388: {'lr': 0.0004589918576885147, 'samples': 4807168, 'steps': 9388, 'batch_loss/train': 0.889584549702704} 12/24/2021 05:13:13 - INFO - codeparrot_training - Step 9389: {'lr': 0.00045898320679512285, 'samples': 4807680, 'steps': 9389, 'batch_loss/train': 0.9076342741027474} 12/24/2021 05:13:26 - INFO 
- codeparrot_training - Step 9390: {'lr': 0.0004589745550708926, 'samples': 4808192, 'steps': 9390, 'batch_loss/train': 0.7804044177755713} 12/24/2021 05:13:36 - INFO - codeparrot_training - Step 9391: {'lr': 0.00045896590251585847, 'samples': 4808704, 'steps': 9391, 'batch_loss/train': 0.9022497502155602} 12/24/2021 05:13:47 - INFO - codeparrot_training - Step 9392: {'lr': 0.00045895724913005476, 'samples': 4809216, 'steps': 9392, 'batch_loss/train': 0.7794169187545776} 12/24/2021 05:13:57 - INFO - codeparrot_training - Step 9393: {'lr': 0.00045894859491351595, 'samples': 4809728, 'steps': 9393, 'batch_loss/train': 0.7740548504516482} 12/24/2021 05:14:09 - INFO - codeparrot_training - Step 9394: {'lr': 0.0004589399398662764, 'samples': 4810240, 'steps': 9394, 'batch_loss/train': 0.8948117941617966} 12/24/2021 05:14:20 - INFO - codeparrot_training - Step 9395: {'lr': 0.00045893128398837057, 'samples': 4810752, 'steps': 9395, 'batch_loss/train': 1.0164274973794818} 12/24/2021 05:14:30 - INFO - codeparrot_training - Step 9396: {'lr': 0.00045892262727983285, 'samples': 4811264, 'steps': 9396, 'batch_loss/train': 0.956196216866374} 12/24/2021 05:14:43 - INFO - codeparrot_training - Step 9397: {'lr': 0.00045891396974069766, 'samples': 4811776, 'steps': 9397, 'batch_loss/train': 0.7714368058368564} 12/24/2021 05:14:54 - INFO - codeparrot_training - Step 9398: {'lr': 0.0004589053113709994, 'samples': 4812288, 'steps': 9398, 'batch_loss/train': 0.8012040671892464} 12/24/2021 05:15:05 - INFO - codeparrot_training - Step 9399: {'lr': 0.0004588966521707725, 'samples': 4812800, 'steps': 9399, 'batch_loss/train': 0.8305113688111305} 12/24/2021 05:15:17 - INFO - codeparrot_training - Step 9400: {'lr': 0.0004588879921400514, 'samples': 4813312, 'steps': 9400, 'batch_loss/train': 0.8141802866011858} 12/24/2021 05:15:27 - INFO - codeparrot_training - Step 9401: {'lr': 0.0004588793312788705, 'samples': 4813824, 'steps': 9401, 'batch_loss/train': 0.8857954787090421} 12/24/2021 
05:15:38 - INFO - codeparrot_training - Step 9402: {'lr': 0.0004588706695872642, 'samples': 4814336, 'steps': 9402, 'batch_loss/train': 0.8623697450384498} 12/24/2021 05:15:49 - INFO - codeparrot_training - Step 9403: {'lr': 0.00045886200706526703, 'samples': 4814848, 'steps': 9403, 'batch_loss/train': 0.9037335156463087} 12/24/2021 05:16:01 - INFO - codeparrot_training - Step 9404: {'lr': 0.00045885334371291344, 'samples': 4815360, 'steps': 9404, 'batch_loss/train': 0.8876995667815208} 12/24/2021 05:16:12 - INFO - codeparrot_training - Step 9405: {'lr': 0.0004588446795302377, 'samples': 4815872, 'steps': 9405, 'batch_loss/train': 0.780706305988133} 12/24/2021 05:16:23 - INFO - codeparrot_training - Step 9406: {'lr': 0.00045883601451727444, 'samples': 4816384, 'steps': 9406, 'batch_loss/train': 0.7841793559491634} 12/24/2021 05:16:35 - INFO - codeparrot_training - Step 9407: {'lr': 0.0004588273486740581, 'samples': 4816896, 'steps': 9407, 'batch_loss/train': 0.891360579058528} 12/24/2021 05:16:45 - INFO - codeparrot_training - Step 9408: {'lr': 0.0004588186820006229, 'samples': 4817408, 'steps': 9408, 'batch_loss/train': 1.0137143498286605} 12/24/2021 05:16:56 - INFO - codeparrot_training - Step 9409: {'lr': 0.0004588100144970036, 'samples': 4817920, 'steps': 9409, 'batch_loss/train': 0.874416227452457} 12/24/2021 05:17:08 - INFO - codeparrot_training - Step 9410: {'lr': 0.00045880134616323447, 'samples': 4818432, 'steps': 9410, 'batch_loss/train': 0.9245344628579915} 12/24/2021 05:17:19 - INFO - codeparrot_training - Step 9411: {'lr': 0.0004587926769993501, 'samples': 4818944, 'steps': 9411, 'batch_loss/train': 0.9254362038336694} 12/24/2021 05:17:29 - INFO - codeparrot_training - Step 9412: {'lr': 0.00045878400700538483, 'samples': 4819456, 'steps': 9412, 'batch_loss/train': 0.7926641968078911} 12/24/2021 05:17:40 - INFO - codeparrot_training - Step 9413: {'lr': 0.0004587753361813732, 'samples': 4819968, 'steps': 9413, 'batch_loss/train': 0.7947397329844534} 
12/24/2021 05:17:53 - INFO - codeparrot_training - Step 9414: {'lr': 0.0004587666645273496, 'samples': 4820480, 'steps': 9414, 'batch_loss/train': 0.7559272265061736} 12/24/2021 05:18:04 - INFO - codeparrot_training - Step 9415: {'lr': 0.0004587579920433487, 'samples': 4820992, 'steps': 9415, 'batch_loss/train': 0.8533353181555867} 12/24/2021 05:18:14 - INFO - codeparrot_training - Step 9416: {'lr': 0.00045874931872940473, 'samples': 4821504, 'steps': 9416, 'batch_loss/train': 0.759435617364943} 12/24/2021 05:18:26 - INFO - codeparrot_training - Step 9417: {'lr': 0.0004587406445855523, 'samples': 4822016, 'steps': 9417, 'batch_loss/train': 0.9144456498324871} 12/24/2021 05:18:37 - INFO - codeparrot_training - Step 9418: {'lr': 0.0004587319696118259, 'samples': 4822528, 'steps': 9418, 'batch_loss/train': 0.8801171649247408} 12/24/2021 05:18:48 - INFO - codeparrot_training - Step 9419: {'lr': 0.0004587232938082601, 'samples': 4823040, 'steps': 9419, 'batch_loss/train': 0.8751288847997785} 12/24/2021 05:18:58 - INFO - codeparrot_training - Step 9420: {'lr': 0.00045871461717488914, 'samples': 4823552, 'steps': 9420, 'batch_loss/train': 0.903081152588129} 12/24/2021 05:19:10 - INFO - codeparrot_training - Step 9421: {'lr': 0.00045870593971174777, 'samples': 4824064, 'steps': 9421, 'batch_loss/train': 0.9234608802944422} 12/24/2021 05:19:21 - INFO - codeparrot_training - Step 9422: {'lr': 0.00045869726141887037, 'samples': 4824576, 'steps': 9422, 'batch_loss/train': 0.8395045157521963} 12/24/2021 05:19:31 - INFO - codeparrot_training - Step 9423: {'lr': 0.0004586885822962914, 'samples': 4825088, 'steps': 9423, 'batch_loss/train': 0.9079019408673048} 12/24/2021 05:19:44 - INFO - codeparrot_training - Step 9424: {'lr': 0.0004586799023440455, 'samples': 4825600, 'steps': 9424, 'batch_loss/train': 0.8186026145704091} 12/24/2021 05:19:55 - INFO - codeparrot_training - Step 9425: {'lr': 0.00045867122156216706, 'samples': 4826112, 'steps': 9425, 'batch_loss/train': 
0.7989912582561374} 12/24/2021 05:20:05 - INFO - codeparrot_training - Step 9426: {'lr': 0.00045866253995069064, 'samples': 4826624, 'steps': 9426, 'batch_loss/train': 0.8262464530998841} 12/24/2021 05:20:18 - INFO - codeparrot_training - Step 9427: {'lr': 0.0004586538575096508, 'samples': 4827136, 'steps': 9427, 'batch_loss/train': 0.8437932054512203} 12/24/2021 05:20:28 - INFO - codeparrot_training - Step 9428: {'lr': 0.0004586451742390819, 'samples': 4827648, 'steps': 9428, 'batch_loss/train': 0.8478099550120533} 12/24/2021 05:20:39 - INFO - codeparrot_training - Step 9429: {'lr': 0.0004586364901390186, 'samples': 4828160, 'steps': 9429, 'batch_loss/train': 0.8033909844234586} 12/24/2021 05:20:50 - INFO - codeparrot_training - Step 9430: {'lr': 0.0004586278052094955, 'samples': 4828672, 'steps': 9430, 'batch_loss/train': 0.6746425591409206} 12/24/2021 05:21:02 - INFO - codeparrot_training - Step 9431: {'lr': 0.00045861911945054696, 'samples': 4829184, 'steps': 9431, 'batch_loss/train': 0.804720627842471} 12/24/2021 05:21:12 - INFO - codeparrot_training - Step 9432: {'lr': 0.0004586104328622076, 'samples': 4829696, 'steps': 9432, 'batch_loss/train': 0.8604950020089746} 12/24/2021 05:21:23 - INFO - codeparrot_training - Step 9433: {'lr': 0.00045860174544451183, 'samples': 4830208, 'steps': 9433, 'batch_loss/train': 0.8692738022655249} 12/24/2021 05:21:35 - INFO - codeparrot_training - Step 9434: {'lr': 0.00045859305719749433, 'samples': 4830720, 'steps': 9434, 'batch_loss/train': 0.7671268153935671} 12/24/2021 05:21:46 - INFO - codeparrot_training - Step 9435: {'lr': 0.0004585843681211896, 'samples': 4831232, 'steps': 9435, 'batch_loss/train': 0.912977259606123} 12/24/2021 05:21:57 - INFO - codeparrot_training - Step 9436: {'lr': 0.0004585756782156322, 'samples': 4831744, 'steps': 9436, 'batch_loss/train': 0.8772754287347198} 12/24/2021 05:22:09 - INFO - codeparrot_training - Step 9437: {'lr': 0.0004585669874808567, 'samples': 4832256, 'steps': 9437, 
'batch_loss/train': 0.7618852816522121} 12/24/2021 05:22:19 - INFO - codeparrot_training - Step 9438: {'lr': 0.00045855829591689756, 'samples': 4832768, 'steps': 9438, 'batch_loss/train': 0.8490785255562514} 12/24/2021 05:22:30 - INFO - codeparrot_training - Step 9439: {'lr': 0.00045854960352378937, 'samples': 4833280, 'steps': 9439, 'batch_loss/train': 0.7992114573717117} 12/24/2021 05:22:42 - INFO - codeparrot_training - Step 9440: {'lr': 0.0004585409103015668, 'samples': 4833792, 'steps': 9440, 'batch_loss/train': 0.9111256874166429} 12/24/2021 05:22:52 - INFO - codeparrot_training - Step 9441: {'lr': 0.00045853221625026416, 'samples': 4834304, 'steps': 9441, 'batch_loss/train': 0.8599056303501129} 12/24/2021 05:23:03 - INFO - codeparrot_training - Step 9442: {'lr': 0.0004585235213699162, 'samples': 4834816, 'steps': 9442, 'batch_loss/train': 0.8941340744495392} 12/24/2021 05:23:13 - INFO - codeparrot_training - Step 9443: {'lr': 0.0004585148256605575, 'samples': 4835328, 'steps': 9443, 'batch_loss/train': 0.7437717374414206} 12/24/2021 05:23:26 - INFO - codeparrot_training - Step 9444: {'lr': 0.00045850612912222253, 'samples': 4835840, 'steps': 9444, 'batch_loss/train': 0.8491282537579536} 12/24/2021 05:23:37 - INFO - codeparrot_training - Step 9445: {'lr': 0.000458497431754946, 'samples': 4836352, 'steps': 9445, 'batch_loss/train': 0.8824955821037292} 12/24/2021 05:23:47 - INFO - codeparrot_training - Step 9446: {'lr': 0.0004584887335587624, 'samples': 4836864, 'steps': 9446, 'batch_loss/train': 0.8524293913505971} 12/24/2021 05:24:00 - INFO - codeparrot_training - Step 9447: {'lr': 0.00045848003453370623, 'samples': 4837376, 'steps': 9447, 'batch_loss/train': 0.7046545166522264} 12/24/2021 05:24:10 - INFO - codeparrot_training - Step 9448: {'lr': 0.0004584713346798123, 'samples': 4837888, 'steps': 9448, 'batch_loss/train': 0.8157966176513582} 12/24/2021 05:24:21 - INFO - codeparrot_training - Step 9449: {'lr': 0.00045846263399711493, 'samples': 4838400, 
'steps': 9449, 'batch_loss/train': 0.8938546273857355} 12/24/2021 05:24:31 - INFO - codeparrot_training - Step 9450: {'lr': 0.0004584539324856489, 'samples': 4838912, 'steps': 9450, 'batch_loss/train': 0.9900725400075316} 12/24/2021 05:24:45 - INFO - codeparrot_training - Step 9451: {'lr': 0.0004584452301454487, 'samples': 4839424, 'steps': 9451, 'batch_loss/train': 0.7124729999341071} 12/24/2021 05:24:55 - INFO - codeparrot_training - Step 9452: {'lr': 0.00045843652697654904, 'samples': 4839936, 'steps': 9452, 'batch_loss/train': 1.0100325737148523} 12/24/2021 05:25:06 - INFO - codeparrot_training - Step 9453: {'lr': 0.0004584278229789844, 'samples': 4840448, 'steps': 9453, 'batch_loss/train': 0.8073821268044412} 12/24/2021 05:25:18 - INFO - codeparrot_training - Step 9454: {'lr': 0.0004584191181527895, 'samples': 4840960, 'steps': 9454, 'batch_loss/train': 0.7803795733489096} 12/24/2021 05:25:29 - INFO - codeparrot_training - Step 9455: {'lr': 0.00045841041249799873, 'samples': 4841472, 'steps': 9455, 'batch_loss/train': 0.879527147859335} 12/24/2021 05:25:39 - INFO - codeparrot_training - Step 9456: {'lr': 0.000458401706014647, 'samples': 4841984, 'steps': 9456, 'batch_loss/train': 0.5705470560351387} 12/24/2021 05:25:50 - INFO - codeparrot_training - Step 9457: {'lr': 0.0004583929987027687, 'samples': 4842496, 'steps': 9457, 'batch_loss/train': 0.8526882516453043} 12/24/2021 05:26:02 - INFO - codeparrot_training - Step 9458: {'lr': 0.0004583842905623986, 'samples': 4843008, 'steps': 9458, 'batch_loss/train': 0.6971007413230836} 12/24/2021 05:26:13 - INFO - codeparrot_training - Step 9459: {'lr': 0.00045837558159357117, 'samples': 4843520, 'steps': 9459, 'batch_loss/train': 0.8864944766974077} 12/24/2021 05:26:24 - INFO - codeparrot_training - Step 9460: {'lr': 0.00045836687179632115, 'samples': 4844032, 'steps': 9460, 'batch_loss/train': 0.7251414959318936} 12/24/2021 05:26:36 - INFO - codeparrot_training - Step 9461: {'lr': 0.0004583581611706831, 'samples': 
4844544, 'steps': 9461, 'batch_loss/train': 0.9208791158162057} 12/24/2021 05:26:46 - INFO - codeparrot_training - Step 9462: {'lr': 0.00045834944971669166, 'samples': 4845056, 'steps': 9462, 'batch_loss/train': 0.7579721335787326} 12/24/2021 05:26:57 - INFO - codeparrot_training - Step 9463: {'lr': 0.0004583407374343815, 'samples': 4845568, 'steps': 9463, 'batch_loss/train': 0.8556600189767778} 12/24/2021 05:27:10 - INFO - codeparrot_training - Step 9464: {'lr': 0.0004583320243237873, 'samples': 4846080, 'steps': 9464, 'batch_loss/train': 0.9147521564736962} 12/24/2021 05:27:20 - INFO - codeparrot_training - Step 9465: {'lr': 0.0004583233103849435, 'samples': 4846592, 'steps': 9465, 'batch_loss/train': 0.9549387109000236} 12/24/2021 05:27:31 - INFO - codeparrot_training - Step 9466: {'lr': 0.000458314595617885, 'samples': 4847104, 'steps': 9466, 'batch_loss/train': 0.7977353748865426} 12/24/2021 05:27:43 - INFO - codeparrot_training - Step 9467: {'lr': 0.0004583058800226463, 'samples': 4847616, 'steps': 9467, 'batch_loss/train': 0.860174668719992} 12/24/2021 05:27:53 - INFO - codeparrot_training - Step 9468: {'lr': 0.000458297163599262, 'samples': 4848128, 'steps': 9468, 'batch_loss/train': 0.8718716623261571} 12/24/2021 05:28:04 - INFO - codeparrot_training - Step 9469: {'lr': 0.0004582884463477669, 'samples': 4848640, 'steps': 9469, 'batch_loss/train': 0.8554588984698057} 12/24/2021 05:28:14 - INFO - codeparrot_training - Step 9470: {'lr': 0.00045827972826819555, 'samples': 4849152, 'steps': 9470, 'batch_loss/train': 0.8384904768317938} 12/24/2021 05:28:26 - INFO - codeparrot_training - Step 9471: {'lr': 0.0004582710093605826, 'samples': 4849664, 'steps': 9471, 'batch_loss/train': 0.7896732483059168} 12/24/2021 05:28:37 - INFO - codeparrot_training - Step 9472: {'lr': 0.00045826228962496287, 'samples': 4850176, 'steps': 9472, 'batch_loss/train': 0.9034999739378691} 12/24/2021 05:28:48 - INFO - codeparrot_training - Step 9473: {'lr': 0.0004582535690613708, 
'samples': 4850688, 'steps': 9473, 'batch_loss/train': 0.8043697024695575} 12/24/2021 05:29:00 - INFO - codeparrot_training - Step 9474: {'lr': 0.0004582448476698412, 'samples': 4851200, 'steps': 9474, 'batch_loss/train': 0.903818522579968} 12/24/2021 05:29:11 - INFO - codeparrot_training - Step 9475: {'lr': 0.0004582361254504088, 'samples': 4851712, 'steps': 9475, 'batch_loss/train': 0.8648881409317255} 12/24/2021 05:29:22 - INFO - codeparrot_training - Step 9476: {'lr': 0.00045822740240310814, 'samples': 4852224, 'steps': 9476, 'batch_loss/train': 0.9577321698889136} 12/24/2021 05:29:34 - INFO - codeparrot_training - Step 9477: {'lr': 0.0004582186785279739, 'samples': 4852736, 'steps': 9477, 'batch_loss/train': 0.7686693901196122} 12/24/2021 05:29:44 - INFO - codeparrot_training - Step 9478: {'lr': 0.0004582099538250408, 'samples': 4853248, 'steps': 9478, 'batch_loss/train': 0.8342092428356409} 12/24/2021 05:29:55 - INFO - codeparrot_training - Step 9479: {'lr': 0.0004582012282943437, 'samples': 4853760, 'steps': 9479, 'batch_loss/train': 0.7835040362551808} 12/24/2021 05:30:07 - INFO - codeparrot_training - Step 9480: {'lr': 0.000458192501935917, 'samples': 4854272, 'steps': 9480, 'batch_loss/train': 0.8140187673270702} 12/24/2021 05:30:18 - INFO - codeparrot_training - Step 9481: {'lr': 0.00045818377474979554, 'samples': 4854784, 'steps': 9481, 'batch_loss/train': 0.8755070129409432} 12/24/2021 05:30:28 - INFO - codeparrot_training - Step 9482: {'lr': 0.000458175046736014, 'samples': 4855296, 'steps': 9482, 'batch_loss/train': 0.8071266887709498} 12/24/2021 05:30:39 - INFO - codeparrot_training - Step 9483: {'lr': 0.0004581663178946071, 'samples': 4855808, 'steps': 9483, 'batch_loss/train': 0.9595275921747088} 12/24/2021 05:30:51 - INFO - codeparrot_training - Step 9484: {'lr': 0.0004581575882256095, 'samples': 4856320, 'steps': 9484, 'batch_loss/train': 0.745866047218442} 12/24/2021 05:31:02 - INFO - codeparrot_training - Step 9485: {'lr': 
0.00045814885772905593, 'samples': 4856832, 'steps': 9485, 'batch_loss/train': 0.8390770256519318} 12/24/2021 05:31:12 - INFO - codeparrot_training - Step 9486: {'lr': 0.00045814012640498116, 'samples': 4857344, 'steps': 9486, 'batch_loss/train': 0.8847678364254534} 12/24/2021 05:31:25 - INFO - codeparrot_training - Step 9487: {'lr': 0.00045813139425341976, 'samples': 4857856, 'steps': 9487, 'batch_loss/train': 0.8743508635088801} 12/24/2021 05:31:35 - INFO - codeparrot_training - Step 9488: {'lr': 0.0004581226612744065, 'samples': 4858368, 'steps': 9488, 'batch_loss/train': 0.856115585193038} 12/24/2021 05:31:46 - INFO - codeparrot_training - Step 9489: {'lr': 0.0004581139274679762, 'samples': 4858880, 'steps': 9489, 'batch_loss/train': 0.8797520278021693} 12/24/2021 05:31:59 - INFO - codeparrot_training - Step 9490: {'lr': 0.00045810519283416345, 'samples': 4859392, 'steps': 9490, 'batch_loss/train': 0.7317400118336082} 12/24/2021 05:32:09 - INFO - codeparrot_training - Step 9491: {'lr': 0.0004580964573730031, 'samples': 4859904, 'steps': 9491, 'batch_loss/train': 0.863672873005271} 12/24/2021 05:32:20 - INFO - codeparrot_training - Step 9492: {'lr': 0.00045808772108452974, 'samples': 4860416, 'steps': 9492, 'batch_loss/train': 0.8428404591977596} 12/24/2021 05:32:30 - INFO - codeparrot_training - Step 9493: {'lr': 0.00045807898396877815, 'samples': 4860928, 'steps': 9493, 'batch_loss/train': 0.7419682946056128} 12/24/2021 05:32:42 - INFO - codeparrot_training - Step 9494: {'lr': 0.00045807024602578317, 'samples': 4861440, 'steps': 9494, 'batch_loss/train': 0.8102901712991297} 12/24/2021 05:32:53 - INFO - codeparrot_training - Step 9495: {'lr': 0.0004580615072555794, 'samples': 4861952, 'steps': 9495, 'batch_loss/train': 0.8368643969297409} 12/24/2021 05:33:04 - INFO - codeparrot_training - Step 9496: {'lr': 0.0004580527676582017, 'samples': 4862464, 'steps': 9496, 'batch_loss/train': 0.8584570176899433} 12/24/2021 05:33:16 - INFO - codeparrot_training - Step 
9497: {'lr': 0.00045804402723368466, 'samples': 4862976, 'steps': 9497, 'batch_loss/train': 0.8605176135897636} 12/24/2021 05:33:26 - INFO - codeparrot_training - Step 9498: {'lr': 0.0004580352859820632, 'samples': 4863488, 'steps': 9498, 'batch_loss/train': 0.7766885561868548} 12/24/2021 05:33:37 - INFO - codeparrot_training - Step 9499: {'lr': 0.00045802654390337193, 'samples': 4864000, 'steps': 9499, 'batch_loss/train': 0.9257853208109736} 12/24/2021 05:33:49 - INFO - codeparrot_training - Step 9500: {'lr': 0.00045801780099764567, 'samples': 4864512, 'steps': 9500, 'batch_loss/train': 0.7766750818118453} 12/24/2021 05:34:00 - INFO - codeparrot_training - Step 9501: {'lr': 0.00045800905726491926, 'samples': 4865024, 'steps': 9501, 'batch_loss/train': 0.655148642603308} 12/24/2021 05:34:10 - INFO - codeparrot_training - Step 9502: {'lr': 0.00045800031270522735, 'samples': 4865536, 'steps': 9502, 'batch_loss/train': 0.956457762978971} 12/24/2021 05:34:23 - INFO - codeparrot_training - Step 9503: {'lr': 0.0004579915673186047, 'samples': 4866048, 'steps': 9503, 'batch_loss/train': 0.856170586310327} 12/24/2021 05:34:34 - INFO - codeparrot_training - Step 9504: {'lr': 0.0004579828211050861, 'samples': 4866560, 'steps': 9504, 'batch_loss/train': 0.8177058408036828} 12/24/2021 05:34:44 - INFO - codeparrot_training - Step 9505: {'lr': 0.00045797407406470634, 'samples': 4867072, 'steps': 9505, 'batch_loss/train': 0.7932544313371181} 12/24/2021 05:34:55 - INFO - codeparrot_training - Step 9506: {'lr': 0.00045796532619750023, 'samples': 4867584, 'steps': 9506, 'batch_loss/train': 0.8437367295846343} 12/24/2021 05:35:07 - INFO - codeparrot_training - Step 9507: {'lr': 0.0004579565775035025, 'samples': 4868096, 'steps': 9507, 'batch_loss/train': 0.9299526499817148} 12/24/2021 05:35:18 - INFO - codeparrot_training - Step 9508: {'lr': 0.0004579478279827479, 'samples': 4868608, 'steps': 9508, 'batch_loss/train': 0.9163198778405786} 12/24/2021 05:35:28 - INFO - 
codeparrot_training - Step 9509: {'lr': 0.0004579390776352713, 'samples': 4869120, 'steps': 9509, 'batch_loss/train': 0.7822511903941631} 12/24/2021 05:35:40 - INFO - codeparrot_training - Step 9510: {'lr': 0.00045793032646110744, 'samples': 4869632, 'steps': 9510, 'batch_loss/train': 0.6944327093660831} 12/24/2021 05:35:51 - INFO - codeparrot_training - Step 9511: {'lr': 0.0004579215744602911, 'samples': 4870144, 'steps': 9511, 'batch_loss/train': 0.8980756318196654} 12/24/2021 05:36:02 - INFO - codeparrot_training - Step 9512: {'lr': 0.00045791282163285706, 'samples': 4870656, 'steps': 9512, 'batch_loss/train': 0.6989829358644783} 12/24/2021 05:36:12 - INFO - codeparrot_training - Step 9513: {'lr': 0.0004579040679788402, 'samples': 4871168, 'steps': 9513, 'batch_loss/train': 0.8204727652482688} 12/24/2021 05:36:25 - INFO - codeparrot_training - Step 9514: {'lr': 0.00045789531349827524, 'samples': 4871680, 'steps': 9514, 'batch_loss/train': 0.8714570924639702} 12/24/2021 05:36:36 - INFO - codeparrot_training - Step 9515: {'lr': 0.000457886558191197, 'samples': 4872192, 'steps': 9515, 'batch_loss/train': 0.8536004775669426} 12/24/2021 05:36:46 - INFO - codeparrot_training - Step 9516: {'lr': 0.0004578778020576403, 'samples': 4872704, 'steps': 9516, 'batch_loss/train': 0.8555603977292776} 12/24/2021 05:36:58 - INFO - codeparrot_training - Step 9517: {'lr': 0.00045786904509764, 'samples': 4873216, 'steps': 9517, 'batch_loss/train': 0.7956689894199371} 12/24/2021 05:37:09 - INFO - codeparrot_training - Step 9518: {'lr': 0.0004578602873112308, 'samples': 4873728, 'steps': 9518, 'batch_loss/train': 0.7681439763400704} 12/24/2021 05:37:19 - INFO - codeparrot_training - Step 9519: {'lr': 0.0004578515286984476, 'samples': 4874240, 'steps': 9519, 'batch_loss/train': 0.8225826062262058} 12/24/2021 05:37:32 - INFO - codeparrot_training - Step 9520: {'lr': 0.0004578427692593252, 'samples': 4874752, 'steps': 9520, 'batch_loss/train': 0.8305044732987881} 12/24/2021 05:37:43 - 
INFO - codeparrot_training - Step 9521: {'lr': 0.00045783400899389846, 'samples': 4875264, 'steps': 9521, 'batch_loss/train': 0.8849189258180559} 12/24/2021 05:37:53 - INFO - codeparrot_training - Step 9522: {'lr': 0.00045782524790220216, 'samples': 4875776, 'steps': 9522, 'batch_loss/train': 0.7659144503995776} 12/24/2021 05:38:05 - INFO - codeparrot_training - Step 9523: {'lr': 0.00045781648598427116, 'samples': 4876288, 'steps': 9523, 'batch_loss/train': 0.8027099389582872} 12/24/2021 05:38:16 - INFO - codeparrot_training - Step 9524: {'lr': 0.00045780772324014027, 'samples': 4876800, 'steps': 9524, 'batch_loss/train': 0.7544557563960552} 12/24/2021 05:38:27 - INFO - codeparrot_training - Step 9525: {'lr': 0.0004577989596698443, 'samples': 4877312, 'steps': 9525, 'batch_loss/train': 0.8801303477957845} 12/24/2021 05:38:37 - INFO - codeparrot_training - Step 9526: {'lr': 0.0004577901952734181, 'samples': 4877824, 'steps': 9526, 'batch_loss/train': 0.7394061014056206} 12/24/2021 05:38:49 - INFO - codeparrot_training - Step 9527: {'lr': 0.0004577814300508967, 'samples': 4878336, 'steps': 9527, 'batch_loss/train': 0.8425210565328598} 12/24/2021 05:39:00 - INFO - codeparrot_training - Step 9528: {'lr': 0.0004577726640023147, 'samples': 4878848, 'steps': 9528, 'batch_loss/train': 0.8909510294906795} 12/24/2021 05:39:11 - INFO - codeparrot_training - Step 9529: {'lr': 0.000457763897127707, 'samples': 4879360, 'steps': 9529, 'batch_loss/train': 0.8159848768264055} 12/24/2021 05:39:24 - INFO - codeparrot_training - Step 9530: {'lr': 0.00045775512942710853, 'samples': 4879872, 'steps': 9530, 'batch_loss/train': 0.7986289989203215} 12/24/2021 05:39:34 - INFO - codeparrot_training - Step 9531: {'lr': 0.0004577463609005541, 'samples': 4880384, 'steps': 9531, 'batch_loss/train': 0.6155790839111432} 12/24/2021 05:39:45 - INFO - codeparrot_training - Step 9532: {'lr': 0.0004577375915480786, 'samples': 4880896, 'steps': 9532, 'batch_loss/train': 0.6612106095999479} 12/24/2021 
05:39:56 - INFO - codeparrot_training - Step 9533: {'lr': 0.00045772882136971684, 'samples': 4881408, 'steps': 9533, 'batch_loss/train': 0.7906445488333702} 12/24/2021 05:40:08 - INFO - codeparrot_training - Step 9534: {'lr': 0.00045772005036550376, 'samples': 4881920, 'steps': 9534, 'batch_loss/train': 0.7687829510541633} 12/24/2021 05:40:19 - INFO - codeparrot_training - Step 9535: {'lr': 0.0004577112785354741, 'samples': 4882432, 'steps': 9535, 'batch_loss/train': 0.8498035883530974} 12/24/2021 05:40:29 - INFO - codeparrot_training - Step 9536: {'lr': 0.00045770250587966296, 'samples': 4882944, 'steps': 9536, 'batch_loss/train': 1.1121146641671658} 12/24/2021 05:40:42 - INFO - codeparrot_training - Step 9537: {'lr': 0.00045769373239810496, 'samples': 4883456, 'steps': 9537, 'batch_loss/train': 1.0269041769206524} 12/24/2021 05:40:53 - INFO - codeparrot_training - Step 9538: {'lr': 0.0004576849580908352, 'samples': 4883968, 'steps': 9538, 'batch_loss/train': 0.8497742065228522} 12/24/2021 05:41:03 - INFO - codeparrot_training - Step 9539: {'lr': 0.00045767618295788844, 'samples': 4884480, 'steps': 9539, 'batch_loss/train': 0.8758765421807766} 12/24/2021 05:41:15 - INFO - codeparrot_training - Step 9540: {'lr': 0.00045766740699929953, 'samples': 4884992, 'steps': 9540, 'batch_loss/train': 0.9051671288907528} 12/24/2021 05:41:26 - INFO - codeparrot_training - Step 9541: {'lr': 0.0004576586302151035, 'samples': 4885504, 'steps': 9541, 'batch_loss/train': 0.7559018353931606} 12/24/2021 05:41:36 - INFO - codeparrot_training - Step 9542: {'lr': 0.00045764985260533515, 'samples': 4886016, 'steps': 9542, 'batch_loss/train': 0.7906448654830456} 12/24/2021 05:41:47 - INFO - codeparrot_training - Step 9543: {'lr': 0.0004576410741700294, 'samples': 4886528, 'steps': 9543, 'batch_loss/train': 0.8053516545332968} 12/24/2021 05:41:59 - INFO - codeparrot_training - Step 9544: {'lr': 0.0004576322949092211, 'samples': 4887040, 'steps': 9544, 'batch_loss/train': 0.7499185718479566} 
12/24/2021 05:42:10 - INFO - codeparrot_training - Step 9545: {'lr': 0.00045762351482294517, 'samples': 4887552, 'steps': 9545, 'batch_loss/train': 0.9467488443478942} 12/24/2021 05:42:20 - INFO - codeparrot_training - Step 9546: {'lr': 0.00045761473391123665, 'samples': 4888064, 'steps': 9546, 'batch_loss/train': 0.895807791268453} 12/24/2021 05:42:32 - INFO - codeparrot_training - Step 9547: {'lr': 0.00045760595217413027, 'samples': 4888576, 'steps': 9547, 'batch_loss/train': 0.7240779320709407} 12/24/2021 05:42:43 - INFO - codeparrot_training - Step 9548: {'lr': 0.000457597169611661, 'samples': 4889088, 'steps': 9548, 'batch_loss/train': 0.8130100071430206} 12/24/2021 05:42:54 - INFO - codeparrot_training - Step 9549: {'lr': 0.0004575883862238638, 'samples': 4889600, 'steps': 9549, 'batch_loss/train': 0.7801639684475958} 12/24/2021 05:43:06 - INFO - codeparrot_training - Step 9550: {'lr': 0.00045757960201077354, 'samples': 4890112, 'steps': 9550, 'batch_loss/train': 0.7467315387912095} 12/24/2021 05:43:17 - INFO - codeparrot_training - Step 9551: {'lr': 0.0004575708169724251, 'samples': 4890624, 'steps': 9551, 'batch_loss/train': 0.6806566750165075} 12/24/2021 05:43:28 - INFO - codeparrot_training - Step 9552: {'lr': 0.00045756203110885357, 'samples': 4891136, 'steps': 9552, 'batch_loss/train': 0.8345841020345688} 12/24/2021 05:43:38 - INFO - codeparrot_training - Step 9553: {'lr': 0.00045755324442009373, 'samples': 4891648, 'steps': 9553, 'batch_loss/train': 0.8180984323844314} 12/24/2021 05:43:51 - INFO - codeparrot_training - Step 9554: {'lr': 0.00045754445690618064, 'samples': 4892160, 'steps': 9554, 'batch_loss/train': 0.9774195840582252} 12/24/2021 05:44:01 - INFO - codeparrot_training - Step 9555: {'lr': 0.00045753566856714903, 'samples': 4892672, 'steps': 9555, 'batch_loss/train': 0.8654497042298317} 12/24/2021 05:44:12 - INFO - codeparrot_training - Step 9556: {'lr': 0.00045752687940303405, 'samples': 4893184, 'steps': 9556, 'batch_loss/train': 
0.9205900016240776} 12/24/2021 05:44:24 - INFO - codeparrot_training - Step 9557: {'lr': 0.00045751808941387053, 'samples': 4893696, 'steps': 9557, 'batch_loss/train': 0.7690150821581483} 12/24/2021 05:44:34 - INFO - codeparrot_training - Step 9558: {'lr': 0.0004575092985996934, 'samples': 4894208, 'steps': 9558, 'batch_loss/train': 0.8850673446431756} 12/24/2021 05:44:45 - INFO - codeparrot_training - Step 9559: {'lr': 0.0004575005069605377, 'samples': 4894720, 'steps': 9559, 'batch_loss/train': 0.8060903968289495} 12/24/2021 05:44:58 - INFO - codeparrot_training - Step 9560: {'lr': 0.00045749171449643833, 'samples': 4895232, 'steps': 9560, 'batch_loss/train': 0.67896629543975} 12/24/2021 05:45:08 - INFO - codeparrot_training - Step 9561: {'lr': 0.0004574829212074302, 'samples': 4895744, 'steps': 9561, 'batch_loss/train': 1.5798277575522661} 12/24/2021 05:45:19 - INFO - codeparrot_training - Step 9562: {'lr': 0.0004574741270935484, 'samples': 4896256, 'steps': 9562, 'batch_loss/train': 0.825965978205204} 12/24/2021 05:45:30 - INFO - codeparrot_training - Step 9563: {'lr': 0.0004574653321548278, 'samples': 4896768, 'steps': 9563, 'batch_loss/train': 0.7634027698077261} 12/24/2021 05:45:42 - INFO - codeparrot_training - Step 9564: {'lr': 0.0004574565363913033, 'samples': 4897280, 'steps': 9564, 'batch_loss/train': 0.8244622219353914} 12/24/2021 05:45:52 - INFO - codeparrot_training - Step 9565: {'lr': 0.00045744773980301, 'samples': 4897792, 'steps': 9565, 'batch_loss/train': 0.6884225857211277} 12/24/2021 05:46:03 - INFO - codeparrot_training - Step 9566: {'lr': 0.00045743894238998284, 'samples': 4898304, 'steps': 9566, 'batch_loss/train': 0.8749162089079618} 12/24/2021 05:46:17 - INFO - codeparrot_training - Step 9567: {'lr': 0.0004574301441522567, 'samples': 4898816, 'steps': 9567, 'batch_loss/train': 0.7532420437783003} 12/24/2021 05:46:27 - INFO - codeparrot_training - Step 9568: {'lr': 0.00045742134508986663, 'samples': 4899328, 'steps': 9568, 
'batch_loss/train': 0.8324984423816204} 12/24/2021 05:46:38 - INFO - codeparrot_training - Step 9569: {'lr': 0.0004574125452028477, 'samples': 4899840, 'steps': 9569, 'batch_loss/train': 0.9510735822841525} 12/24/2021 05:46:48 - INFO - codeparrot_training - Step 9570: {'lr': 0.0004574037444912347, 'samples': 4900352, 'steps': 9570, 'batch_loss/train': 0.816591287497431} 12/24/2021 05:47:00 - INFO - codeparrot_training - Step 9571: {'lr': 0.00045739494295506285, 'samples': 4900864, 'steps': 9571, 'batch_loss/train': 0.8065312615945004} 12/24/2021 05:47:11 - INFO - codeparrot_training - Step 9572: {'lr': 0.0004573861405943669, 'samples': 4901376, 'steps': 9572, 'batch_loss/train': 0.9801194625906646} 12/24/2021 05:47:21 - INFO - codeparrot_training - Step 9573: {'lr': 0.00045737733740918205, 'samples': 4901888, 'steps': 9573, 'batch_loss/train': 0.8684886759147048} 12/24/2021 05:47:34 - INFO - codeparrot_training - Step 9574: {'lr': 0.0004573685333995431, 'samples': 4902400, 'steps': 9574, 'batch_loss/train': 0.9270066739991307} 12/24/2021 05:47:44 - INFO - codeparrot_training - Step 9575: {'lr': 0.00045735972856548524, 'samples': 4902912, 'steps': 9575, 'batch_loss/train': 0.8853498660027981} 12/24/2021 05:47:55 - INFO - codeparrot_training - Step 9576: {'lr': 0.0004573509229070434, 'samples': 4903424, 'steps': 9576, 'batch_loss/train': 1.4862727653235197} 12/24/2021 05:48:08 - INFO - codeparrot_training - Step 9577: {'lr': 0.0004573421164242526, 'samples': 4903936, 'steps': 9577, 'batch_loss/train': 0.8908916646614671} 12/24/2021 05:48:18 - INFO - codeparrot_training - Step 9578: {'lr': 0.00045733330911714777, 'samples': 4904448, 'steps': 9578, 'batch_loss/train': 0.7130797281861305} 12/24/2021 05:48:29 - INFO - codeparrot_training - Step 9579: {'lr': 0.000457324500985764, 'samples': 4904960, 'steps': 9579, 'batch_loss/train': 0.7607355760410428} 12/24/2021 05:48:40 - INFO - codeparrot_training - Step 9580: {'lr': 0.00045731569203013634, 'samples': 4905472, 
'steps': 9580, 'batch_loss/train': 1.5804492346942425} 12/24/2021 05:48:52 - INFO - codeparrot_training - Step 9581: {'lr': 0.0004573068822502997, 'samples': 4905984, 'steps': 9581, 'batch_loss/train': 0.7582211876288056} 12/24/2021 05:49:02 - INFO - codeparrot_training - Step 9582: {'lr': 0.00045729807164628924, 'samples': 4906496, 'steps': 9582, 'batch_loss/train': 0.8772495165467262} 12/24/2021 05:49:13 - INFO - codeparrot_training - Step 9583: {'lr': 0.0004572892602181399, 'samples': 4907008, 'steps': 9583, 'batch_loss/train': 0.7689906293526292} 12/24/2021 05:49:25 - INFO - codeparrot_training - Step 9584: {'lr': 0.00045728044796588673, 'samples': 4907520, 'steps': 9584, 'batch_loss/train': 0.7814955050125718} 12/24/2021 05:49:35 - INFO - codeparrot_training - Step 9585: {'lr': 0.0004572716348895647, 'samples': 4908032, 'steps': 9585, 'batch_loss/train': 0.793423580005765} 12/24/2021 05:49:46 - INFO - codeparrot_training - Step 9586: {'lr': 0.000457262820989209, 'samples': 4908544, 'steps': 9586, 'batch_loss/train': 0.9054970832075924} 12/24/2021 05:49:58 - INFO - codeparrot_training - Step 9587: {'lr': 0.00045725400626485456, 'samples': 4909056, 'steps': 9587, 'batch_loss/train': 0.8117129602469504} 12/24/2021 05:50:09 - INFO - codeparrot_training - Step 9588: {'lr': 0.0004572451907165364, 'samples': 4909568, 'steps': 9588, 'batch_loss/train': 0.826209164224565} 12/24/2021 05:50:19 - INFO - codeparrot_training - Step 9589: {'lr': 0.00045723637434428964, 'samples': 4910080, 'steps': 9589, 'batch_loss/train': 0.8805619524791837} 12/24/2021 05:50:30 - INFO - codeparrot_training - Step 9590: {'lr': 0.00045722755714814933, 'samples': 4910592, 'steps': 9590, 'batch_loss/train': 0.8505337880924344} 12/24/2021 05:50:43 - INFO - codeparrot_training - Step 9591: {'lr': 0.00045721873912815046, 'samples': 4911104, 'steps': 9591, 'batch_loss/train': 0.8868199561256915} 12/24/2021 05:50:53 - INFO - codeparrot_training - Step 9592: {'lr': 0.0004572099202843282, 'samples': 
4911616, 'steps': 9592, 'batch_loss/train': 0.8778874194249511} 12/24/2021 05:51:04 - INFO - codeparrot_training - Step 9593: {'lr': 0.0004572011006167175, 'samples': 4912128, 'steps': 9593, 'batch_loss/train': 0.7615751698613167} 12/24/2021 05:51:16 - INFO - codeparrot_training - Step 9594: {'lr': 0.00045719228012535345, 'samples': 4912640, 'steps': 9594, 'batch_loss/train': 0.7876712623983622} 12/24/2021 05:51:27 - INFO - codeparrot_training - Step 9595: {'lr': 0.00045718345881027114, 'samples': 4913152, 'steps': 9595, 'batch_loss/train': 0.836400693282485} 12/24/2021 05:51:37 - INFO - codeparrot_training - Step 9596: {'lr': 0.0004571746366715056, 'samples': 4913664, 'steps': 9596, 'batch_loss/train': 0.808773186057806} 12/24/2021 05:51:50 - INFO - codeparrot_training - Step 9597: {'lr': 0.00045716581370909196, 'samples': 4914176, 'steps': 9597, 'batch_loss/train': 1.050887354183942} 12/24/2021 05:52:00 - INFO - codeparrot_training - Step 9598: {'lr': 0.00045715698992306524, 'samples': 4914688, 'steps': 9598, 'batch_loss/train': 0.8390155183151364} 12/24/2021 05:52:11 - INFO - codeparrot_training - Step 9599: {'lr': 0.0004571481653134606, 'samples': 4915200, 'steps': 9599, 'batch_loss/train': 0.7821920095011592} 12/24/2021 05:52:22 - INFO - codeparrot_training - Step 9600: {'lr': 0.000457139339880313, 'samples': 4915712, 'steps': 9600, 'batch_loss/train': 0.8175229215994477} 12/24/2021 05:52:34 - INFO - codeparrot_training - Step 9601: {'lr': 0.00045713051362365764, 'samples': 4916224, 'steps': 9601, 'batch_loss/train': 0.7707579797133803} 12/24/2021 05:52:44 - INFO - codeparrot_training - Step 9602: {'lr': 0.0004571216865435296, 'samples': 4916736, 'steps': 9602, 'batch_loss/train': 0.9161257253726944} 12/24/2021 05:52:55 - INFO - codeparrot_training - Step 9603: {'lr': 0.0004571128586399639, 'samples': 4917248, 'steps': 9603, 'batch_loss/train': 0.8245256608352065} 12/24/2021 05:53:07 - INFO - codeparrot_training - Step 9604: {'lr': 0.0004571040299129956, 
'samples': 4917760, 'steps': 9604, 'batch_loss/train': 0.9412580225616693} 12/24/2021 05:53:18 - INFO - codeparrot_training - Step 9605: {'lr': 0.00045709520036266, 'samples': 4918272, 'steps': 9605, 'batch_loss/train': 0.8680046210065484} 12/24/2021 05:53:28 - INFO - codeparrot_training - Step 9606: {'lr': 0.000457086369988992, 'samples': 4918784, 'steps': 9606, 'batch_loss/train': 0.8719374602660537} 12/24/2021 05:53:41 - INFO - codeparrot_training - Step 9607: {'lr': 0.00045707753879202685, 'samples': 4919296, 'steps': 9607, 'batch_loss/train': 0.7171134259551764} 12/24/2021 05:53:52 - INFO - codeparrot_training - Step 9608: {'lr': 0.0004570687067717996, 'samples': 4919808, 'steps': 9608, 'batch_loss/train': 0.7993127787485719} 12/24/2021 05:54:02 - INFO - codeparrot_training - Step 9609: {'lr': 0.0004570598739283453, 'samples': 4920320, 'steps': 9609, 'batch_loss/train': 0.898720134049654} 12/24/2021 05:54:13 - INFO - codeparrot_training - Step 9610: {'lr': 0.00045705104026169907, 'samples': 4920832, 'steps': 9610, 'batch_loss/train': 0.9181181937456131} 12/24/2021 05:54:25 - INFO - codeparrot_training - Step 9611: {'lr': 0.00045704220577189616, 'samples': 4921344, 'steps': 9611, 'batch_loss/train': 0.9196424463298172} 12/24/2021 05:54:35 - INFO - codeparrot_training - Step 9612: {'lr': 0.00045703337045897163, 'samples': 4921856, 'steps': 9612, 'batch_loss/train': 0.7147837423253804} 12/24/2021 05:54:46 - INFO - codeparrot_training - Step 9613: {'lr': 0.0004570245343229606, 'samples': 4922368, 'steps': 9613, 'batch_loss/train': 0.7333982815034688} 12/24/2021 05:54:58 - INFO - codeparrot_training - Step 9614: {'lr': 0.00045701569736389814, 'samples': 4922880, 'steps': 9614, 'batch_loss/train': 0.8162879967130721} 12/24/2021 05:55:09 - INFO - codeparrot_training - Step 9615: {'lr': 0.0004570068595818195, 'samples': 4923392, 'steps': 9615, 'batch_loss/train': 0.7922001820988953} 12/24/2021 05:55:19 - INFO - codeparrot_training - Step 9616: {'lr': 
0.0004569980209767597, 'samples': 4923904, 'steps': 9616, 'batch_loss/train': 0.8479813858866692} 12/24/2021 05:55:32 - INFO - codeparrot_training - Step 9617: {'lr': 0.00045698918154875393, 'samples': 4924416, 'steps': 9617, 'batch_loss/train': 0.8836152986623347} 12/24/2021 05:55:43 - INFO - codeparrot_training - Step 9618: {'lr': 0.00045698034129783733, 'samples': 4924928, 'steps': 9618, 'batch_loss/train': 0.8529497743584216} 12/24/2021 05:55:53 - INFO - codeparrot_training - Step 9619: {'lr': 0.00045697150022404505, 'samples': 4925440, 'steps': 9619, 'batch_loss/train': 0.8483198136091232} 12/24/2021 05:56:04 - INFO - codeparrot_training - Step 9620: {'lr': 0.0004569626583274122, 'samples': 4925952, 'steps': 9620, 'batch_loss/train': 0.7629614954348654} 12/24/2021 05:56:16 - INFO - codeparrot_training - Step 9621: {'lr': 0.000456953815607974, 'samples': 4926464, 'steps': 9621, 'batch_loss/train': 0.8281185887753963} 12/24/2021 05:56:26 - INFO - codeparrot_training - Step 9622: {'lr': 0.0004569449720657656, 'samples': 4926976, 'steps': 9622, 'batch_loss/train': 0.8098871083930135} 12/24/2021 05:56:37 - INFO - codeparrot_training - Step 9623: {'lr': 0.0004569361277008221, 'samples': 4927488, 'steps': 9623, 'batch_loss/train': 0.8391839545220137} 12/24/2021 05:56:49 - INFO - codeparrot_training - Step 9624: {'lr': 0.00045692728251317874, 'samples': 4928000, 'steps': 9624, 'batch_loss/train': 0.8265668619424105} 12/24/2021 05:57:00 - INFO - codeparrot_training - Step 9625: {'lr': 0.0004569184365028706, 'samples': 4928512, 'steps': 9625, 'batch_loss/train': 0.7834334331564605} 12/24/2021 05:57:10 - INFO - codeparrot_training - Step 9626: {'lr': 0.00045690958966993286, 'samples': 4929024, 'steps': 9626, 'batch_loss/train': 0.7939013289287686} 12/24/2021 05:57:22 - INFO - codeparrot_training - Step 9627: {'lr': 0.0004569007420144007, 'samples': 4929536, 'steps': 9627, 'batch_loss/train': 0.8133483761921525} 12/24/2021 05:57:33 - INFO - codeparrot_training - Step 
9628: {'lr': 0.0004568918935363094, 'samples': 4930048, 'steps': 9628, 'batch_loss/train': 0.7746971319429576} 12/24/2021 05:57:44 - INFO - codeparrot_training - Step 9629: {'lr': 0.000456883044235694, 'samples': 4930560, 'steps': 9629, 'batch_loss/train': 0.806954079773277} 12/24/2021 05:57:54 - INFO - codeparrot_training - Step 9630: {'lr': 0.0004568741941125897, 'samples': 4931072, 'steps': 9630, 'batch_loss/train': 0.6240172714460641} 12/24/2021 05:58:07 - INFO - codeparrot_training - Step 9631: {'lr': 0.0004568653431670318, 'samples': 4931584, 'steps': 9631, 'batch_loss/train': 0.8023794586770236} 12/24/2021 05:58:18 - INFO - codeparrot_training - Step 9632: {'lr': 0.00045685649139905535, 'samples': 4932096, 'steps': 9632, 'batch_loss/train': 0.8423150535672903} 12/24/2021 05:58:28 - INFO - codeparrot_training - Step 9633: {'lr': 0.0004568476388086956, 'samples': 4932608, 'steps': 9633, 'batch_loss/train': 0.8282794747501612} 12/24/2021 05:58:41 - INFO - codeparrot_training - Step 9634: {'lr': 0.00045683878539598766, 'samples': 4933120, 'steps': 9634, 'batch_loss/train': 0.7848073816858232} 12/24/2021 05:58:51 - INFO - codeparrot_training - Step 9635: {'lr': 0.0004568299311609669, 'samples': 4933632, 'steps': 9635, 'batch_loss/train': 0.868182735517621} 12/24/2021 05:59:02 - INFO - codeparrot_training - Step 9636: {'lr': 0.00045682107610366845, 'samples': 4934144, 'steps': 9636, 'batch_loss/train': 0.6941593934316188} 12/24/2021 05:59:13 - INFO - codeparrot_training - Step 9637: {'lr': 0.00045681222022412747, 'samples': 4934656, 'steps': 9637, 'batch_loss/train': 0.594702294562012} 12/24/2021 05:59:25 - INFO - codeparrot_training - Step 9638: {'lr': 0.0004568033635223792, 'samples': 4935168, 'steps': 9638, 'batch_loss/train': 0.8109628171660006} 12/24/2021 05:59:36 - INFO - codeparrot_training - Step 9639: {'lr': 0.00045679450599845885, 'samples': 4935680, 'steps': 9639, 'batch_loss/train': 0.888530071824789} 12/24/2021 05:59:46 - INFO - codeparrot_training - 
Step 9640: {'lr': 0.0004567856476524016, 'samples': 4936192, 'steps': 9640, 'batch_loss/train': 0.6187977145891637} 12/24/2021 05:59:59 - INFO - codeparrot_training - Step 9641: {'lr': 0.0004567767884842426, 'samples': 4936704, 'steps': 9641, 'batch_loss/train': 0.7168398071080446} 12/24/2021 06:00:09 - INFO - codeparrot_training - Step 9642: {'lr': 0.0004567679284940173, 'samples': 4937216, 'steps': 9642, 'batch_loss/train': 0.8635705458000302} 12/24/2021 06:00:20 - INFO - codeparrot_training - Step 9643: {'lr': 0.0004567590676817608, 'samples': 4937728, 'steps': 9643, 'batch_loss/train': 0.9160610092803836} 12/24/2021 06:00:32 - INFO - codeparrot_training - Step 9644: {'lr': 0.0004567502060475083, 'samples': 4938240, 'steps': 9644, 'batch_loss/train': 0.8104589879512787} 12/24/2021 06:00:42 - INFO - codeparrot_training - Step 9645: {'lr': 0.000456741343591295, 'samples': 4938752, 'steps': 9645, 'batch_loss/train': 0.9009816581383348} 12/24/2021 06:00:53 - INFO - codeparrot_training - Step 9646: {'lr': 0.00045673248031315617, 'samples': 4939264, 'steps': 9646, 'batch_loss/train': 0.7458873877767473} 12/24/2021 06:01:06 - INFO - codeparrot_training - Step 9647: {'lr': 0.0004567236162131271, 'samples': 4939776, 'steps': 9647, 'batch_loss/train': 0.8248126748949289} 12/24/2021 06:01:16 - INFO - codeparrot_training - Step 9648: {'lr': 0.00045671475129124297, 'samples': 4940288, 'steps': 9648, 'batch_loss/train': 0.8830642690882087} 12/24/2021 06:01:27 - INFO - codeparrot_training - Step 9649: {'lr': 0.000456705885547539, 'samples': 4940800, 'steps': 9649, 'batch_loss/train': 0.7311841293703765} 12/24/2021 06:01:37 - INFO - codeparrot_training - Step 9650: {'lr': 0.0004566970189820505, 'samples': 4941312, 'steps': 9650, 'batch_loss/train': 0.7943334393203259} 12/24/2021 06:01:50 - INFO - codeparrot_training - Step 9651: {'lr': 0.00045668815159481273, 'samples': 4941824, 'steps': 9651, 'batch_loss/train': 0.8657347178086638} 12/24/2021 06:02:00 - INFO - 
codeparrot_training - Step 9652: {'lr': 0.00045667928338586093, 'samples': 4942336, 'steps': 9652, 'batch_loss/train': 0.7339180123526603} 12/24/2021 06:02:11 - INFO - codeparrot_training - Step 9653: {'lr': 0.0004566704143552303, 'samples': 4942848, 'steps': 9653, 'batch_loss/train': 0.9199830563738942} 12/24/2021 06:02:24 - INFO - codeparrot_training - Step 9654: {'lr': 0.00045666154450295615, 'samples': 4943360, 'steps': 9654, 'batch_loss/train': 0.8262374429032207} 12/24/2021 06:02:35 - INFO - codeparrot_training - Step 9655: {'lr': 0.0004566526738290737, 'samples': 4943872, 'steps': 9655, 'batch_loss/train': 0.8315300015965477} 12/24/2021 06:02:45 - INFO - codeparrot_training - Step 9656: {'lr': 0.00045664380233361834, 'samples': 4944384, 'steps': 9656, 'batch_loss/train': 0.8555973982438445} 12/24/2021 06:02:58 - INFO - codeparrot_training - Step 9657: {'lr': 0.00045663493001662516, 'samples': 4944896, 'steps': 9657, 'batch_loss/train': 0.9400074854493141} 12/24/2021 06:03:09 - INFO - codeparrot_training - Step 9658: {'lr': 0.0004566260568781295, 'samples': 4945408, 'steps': 9658, 'batch_loss/train': 0.8189732097089291} 12/24/2021 06:03:19 - INFO - codeparrot_training - Step 9659: {'lr': 0.0004566171829181667, 'samples': 4945920, 'steps': 9659, 'batch_loss/train': 0.7582128047943115} 12/24/2021 06:03:30 - INFO - codeparrot_training - Step 9660: {'lr': 0.000456608308136772, 'samples': 4946432, 'steps': 9660, 'batch_loss/train': 0.7479296531528234} 12/24/2021 06:03:42 - INFO - codeparrot_training - Step 9661: {'lr': 0.0004565994325339807, 'samples': 4946944, 'steps': 9661, 'batch_loss/train': 0.9002428203821182} 12/24/2021 06:03:52 - INFO - codeparrot_training - Step 9662: {'lr': 0.0004565905561098281, 'samples': 4947456, 'steps': 9662, 'batch_loss/train': 0.8570518596097827} 12/24/2021 06:04:03 - INFO - codeparrot_training - Step 9663: {'lr': 0.00045658167886434935, 'samples': 4947968, 'steps': 9663, 'batch_loss/train': 0.8824170143343508} 12/24/2021 06:04:18 
- INFO - codeparrot_training - Step 9664: {'lr': 0.0004565728007975799, 'samples': 4948480, 'steps': 9664, 'batch_loss/train': 0.8389441077597439} 12/24/2021 06:04:28 - INFO - codeparrot_training - Step 9665: {'lr': 0.000456563921909555, 'samples': 4948992, 'steps': 9665, 'batch_loss/train': 0.8904600539244711} 12/24/2021 06:04:39 - INFO - codeparrot_training - Step 9666: {'lr': 0.0004565550422003099, 'samples': 4949504, 'steps': 9666, 'batch_loss/train': 0.7896366263739765} 12/24/2021 06:04:49 - INFO - codeparrot_training - Step 9667: {'lr': 0.00045654616166988, 'samples': 4950016, 'steps': 9667, 'batch_loss/train': 0.7406542729586363} 12/24/2021 06:05:02 - INFO - codeparrot_training - Step 9668: {'lr': 0.00045653728031830055, 'samples': 4950528, 'steps': 9668, 'batch_loss/train': 0.8091281279921532} 12/24/2021 06:05:12 - INFO - codeparrot_training - Step 9669: {'lr': 0.0004565283981456068, 'samples': 4951040, 'steps': 9669, 'batch_loss/train': 0.7560581769794226} 12/24/2021 06:05:23 - INFO - codeparrot_training - Step 9670: {'lr': 0.0004565195151518342, 'samples': 4951552, 'steps': 9670, 'batch_loss/train': 0.7561590424738824} 12/24/2021 06:05:37 - INFO - codeparrot_training - Step 9671: {'lr': 0.00045651063133701785, 'samples': 4952064, 'steps': 9671, 'batch_loss/train': 0.7777276621200144} 12/24/2021 06:05:47 - INFO - codeparrot_training - Step 9672: {'lr': 0.00045650174670119333, 'samples': 4952576, 'steps': 9672, 'batch_loss/train': 0.8878359693335369} 12/24/2021 06:05:58 - INFO - codeparrot_training - Step 9673: {'lr': 0.0004564928612443958, 'samples': 4953088, 'steps': 9673, 'batch_loss/train': 0.9155812431126833} 12/24/2021 06:06:10 - INFO - codeparrot_training - Step 9674: {'lr': 0.0004564839749666606, 'samples': 4953600, 'steps': 9674, 'batch_loss/train': 0.6603578852955252} 12/24/2021 06:06:21 - INFO - codeparrot_training - Step 9675: {'lr': 0.00045647508786802316, 'samples': 4954112, 'steps': 9675, 'batch_loss/train': 1.0817725989036262} 12/24/2021 
06:06:31 - INFO - codeparrot_training - Step 9676: {'lr': 0.00045646619994851867, 'samples': 4954624, 'steps': 9676, 'batch_loss/train': 0.9308696109801531} 12/24/2021 06:06:42 - INFO - codeparrot_training - Step 9677: {'lr': 0.0004564573112081825, 'samples': 4955136, 'steps': 9677, 'batch_loss/train': 0.9193824804387987} 12/24/2021 06:06:54 - INFO - codeparrot_training - Step 9678: {'lr': 0.0004564484216470501, 'samples': 4955648, 'steps': 9678, 'batch_loss/train': 0.8202054630964994} 12/24/2021 06:07:05 - INFO - codeparrot_training - Step 9679: {'lr': 0.0004564395312651567, 'samples': 4956160, 'steps': 9679, 'batch_loss/train': 0.8433964066207409} 12/24/2021 06:07:15 - INFO - codeparrot_training - Step 9680: {'lr': 0.0004564306400625376, 'samples': 4956672, 'steps': 9680, 'batch_loss/train': 0.8440221431665123} 12/24/2021 06:07:29 - INFO - codeparrot_training - Step 9681: {'lr': 0.0004564217480392283, 'samples': 4957184, 'steps': 9681, 'batch_loss/train': 0.7946419077925384} 12/24/2021 06:07:40 - INFO - codeparrot_training - Step 9682: {'lr': 0.000456412855195264, 'samples': 4957696, 'steps': 9682, 'batch_loss/train': 0.8669867946300656} 12/24/2021 06:07:50 - INFO - codeparrot_training - Step 9683: {'lr': 0.0004564039615306802, 'samples': 4958208, 'steps': 9683, 'batch_loss/train': 0.7810119590722024} 12/24/2021 06:08:02 - INFO - codeparrot_training - Step 9684: {'lr': 0.00045639506704551216, 'samples': 4958720, 'steps': 9684, 'batch_loss/train': 0.8784043611958623} 12/24/2021 06:08:13 - INFO - codeparrot_training - Step 9685: {'lr': 0.00045638617173979527, 'samples': 4959232, 'steps': 9685, 'batch_loss/train': 0.8213427681475878} 12/24/2021 06:08:23 - INFO - codeparrot_training - Step 9686: {'lr': 0.00045637727561356485, 'samples': 4959744, 'steps': 9686, 'batch_loss/train': 0.8113030716776848} 12/24/2021 06:08:35 - INFO - codeparrot_training - Step 9687: {'lr': 0.0004563683786668563, 'samples': 4960256, 'steps': 9687, 'batch_loss/train': 0.8910337956622243} 
12/24/2021 06:08:46 - INFO - codeparrot_training - Step 9688: {'lr': 0.000456359480899705, 'samples': 4960768, 'steps': 9688, 'batch_loss/train': 0.8580961781553924} 12/24/2021 06:08:56 - INFO - codeparrot_training - Step 9689: {'lr': 0.0004563505823121464, 'samples': 4961280, 'steps': 9689, 'batch_loss/train': 0.8289892203174531} 12/24/2021 06:09:07 - INFO - codeparrot_training - Step 9690: {'lr': 0.00045634168290421573, 'samples': 4961792, 'steps': 9690, 'batch_loss/train': 0.8706825068220496} 12/24/2021 06:09:21 - INFO - codeparrot_training - Step 9691: {'lr': 0.00045633278267594846, 'samples': 4962304, 'steps': 9691, 'batch_loss/train': 0.8213005531579256} 12/24/2021 06:09:31 - INFO - codeparrot_training - Step 9692: {'lr': 0.00045632388162738006, 'samples': 4962816, 'steps': 9692, 'batch_loss/train': 0.8605448896996677} 12/24/2021 06:09:42 - INFO - codeparrot_training - Step 9693: {'lr': 0.00045631497975854565, 'samples': 4963328, 'steps': 9693, 'batch_loss/train': 0.8031728332862258} 12/24/2021 06:09:54 - INFO - codeparrot_training - Step 9694: {'lr': 0.00045630607706948086, 'samples': 4963840, 'steps': 9694, 'batch_loss/train': 0.8285289893392473} 12/24/2021 06:10:05 - INFO - codeparrot_training - Step 9695: {'lr': 0.000456297173560221, 'samples': 4964352, 'steps': 9695, 'batch_loss/train': 0.9193257144652307} 12/24/2021 06:10:15 - INFO - codeparrot_training - Step 9696: {'lr': 0.00045628826923080145, 'samples': 4964864, 'steps': 9696, 'batch_loss/train': 0.8471873032394797} 12/24/2021 06:10:29 - INFO - codeparrot_training - Step 9697: {'lr': 0.00045627936408125763, 'samples': 4965376, 'steps': 9697, 'batch_loss/train': 0.7986355936154723} 12/24/2021 06:10:40 - INFO - codeparrot_training - Step 9698: {'lr': 0.0004562704581116249, 'samples': 4965888, 'steps': 9698, 'batch_loss/train': 0.7280311235226691} 12/24/2021 06:10:50 - INFO - codeparrot_training - Step 9699: {'lr': 0.0004562615513219388, 'samples': 4966400, 'steps': 9699, 'batch_loss/train': 
0.7771798456087708} 12/24/2021 06:11:01 - INFO - codeparrot_training - Step 9700: {'lr': 0.0004562526437122347, 'samples': 4966912, 'steps': 9700, 'batch_loss/train': 0.7571565872058272} 12/24/2021 06:11:13 - INFO - codeparrot_training - Step 9701: {'lr': 0.00045624373528254785, 'samples': 4967424, 'steps': 9701, 'batch_loss/train': 0.811887277290225} 12/24/2021 06:11:23 - INFO - codeparrot_training - Step 9702: {'lr': 0.00045623482603291386, 'samples': 4967936, 'steps': 9702, 'batch_loss/train': 0.8788194498047233} 12/24/2021 06:11:34 - INFO - codeparrot_training - Step 9703: {'lr': 0.000456225915963368, 'samples': 4968448, 'steps': 9703, 'batch_loss/train': 0.8771311286836863} 12/24/2021 06:11:46 - INFO - codeparrot_training - Step 9704: {'lr': 0.0004562170050739458, 'samples': 4968960, 'steps': 9704, 'batch_loss/train': 0.8550418056547642} 12/24/2021 06:11:57 - INFO - codeparrot_training - Step 9705: {'lr': 0.00045620809336468273, 'samples': 4969472, 'steps': 9705, 'batch_loss/train': 0.8401023093611002} 12/24/2021 06:12:07 - INFO - codeparrot_training - Step 9706: {'lr': 0.0004561991808356141, 'samples': 4969984, 'steps': 9706, 'batch_loss/train': 0.8510893806815147} 12/24/2021 06:12:19 - INFO - codeparrot_training - Step 9707: {'lr': 0.0004561902674867753, 'samples': 4970496, 'steps': 9707, 'batch_loss/train': 0.7971376711502671} 12/24/2021 06:12:30 - INFO - codeparrot_training - Step 9708: {'lr': 0.0004561813533182019, 'samples': 4971008, 'steps': 9708, 'batch_loss/train': 0.64122163556749} 12/24/2021 06:12:41 - INFO - codeparrot_training - Step 9709: {'lr': 0.0004561724383299293, 'samples': 4971520, 'steps': 9709, 'batch_loss/train': 0.9400222329422832} 12/24/2021 06:12:54 - INFO - codeparrot_training - Step 9710: {'lr': 0.0004561635225219929, 'samples': 4972032, 'steps': 9710, 'batch_loss/train': 0.9626820366829634} 12/24/2021 06:13:05 - INFO - codeparrot_training - Step 9711: {'lr': 0.00045615460589442823, 'samples': 4972544, 'steps': 9711, 
'batch_loss/train': 0.8476070086471736} 12/24/2021 06:13:15 - INFO - codeparrot_training - Step 9712: {'lr': 0.0004561456884472707, 'samples': 4973056, 'steps': 9712, 'batch_loss/train': 0.831956590525806} 12/24/2021 06:13:26 - INFO - codeparrot_training - Step 9713: {'lr': 0.00045613677018055567, 'samples': 4973568, 'steps': 9713, 'batch_loss/train': 0.7904493436217308} 12/24/2021 06:13:38 - INFO - codeparrot_training - Step 9714: {'lr': 0.00045612785109431874, 'samples': 4974080, 'steps': 9714, 'batch_loss/train': 0.8336650775745511} 12/24/2021 06:13:49 - INFO - codeparrot_training - Step 9715: {'lr': 0.00045611893118859524, 'samples': 4974592, 'steps': 9715, 'batch_loss/train': 0.7989748566178605} 12/24/2021 06:13:59 - INFO - codeparrot_training - Step 9716: {'lr': 0.00045611001046342084, 'samples': 4975104, 'steps': 9716, 'batch_loss/train': 0.834887002594769} 12/24/2021 06:14:11 - INFO - codeparrot_training - Step 9717: {'lr': 0.00045610108891883073, 'samples': 4975616, 'steps': 9717, 'batch_loss/train': 0.7461786512285471} 12/24/2021 06:14:22 - INFO - codeparrot_training - Step 9718: {'lr': 0.0004560921665548605, 'samples': 4976128, 'steps': 9718, 'batch_loss/train': 0.7840787898749113} 12/24/2021 06:14:32 - INFO - codeparrot_training - Step 9719: {'lr': 0.0004560832433715457, 'samples': 4976640, 'steps': 9719, 'batch_loss/train': 0.7778612049296498} 12/24/2021 06:14:46 - INFO - codeparrot_training - Step 9720: {'lr': 0.00045607431936892177, 'samples': 4977152, 'steps': 9720, 'batch_loss/train': 0.7957605542615056} 12/24/2021 06:14:57 - INFO - codeparrot_training - Step 9721: {'lr': 0.0004560653945470241, 'samples': 4977664, 'steps': 9721, 'batch_loss/train': 0.8552877763286233} 12/24/2021 06:15:07 - INFO - codeparrot_training - Step 9722: {'lr': 0.0004560564689058882, 'samples': 4978176, 'steps': 9722, 'batch_loss/train': 1.1138531500473619} 12/24/2021 06:15:18 - INFO - codeparrot_training - Step 9723: {'lr': 0.0004560475424455496, 'samples': 4978688, 
'steps': 9723, 'batch_loss/train': 0.7267671201843768} 12/24/2021 06:15:30 - INFO - codeparrot_training - Step 9724: {'lr': 0.00045603861516604384, 'samples': 4979200, 'steps': 9724, 'batch_loss/train': 0.7282545128837228} 12/24/2021 06:15:41 - INFO - codeparrot_training - Step 9725: {'lr': 0.00045602968706740627, 'samples': 4979712, 'steps': 9725, 'batch_loss/train': 0.8350965352728963} 12/24/2021 06:15:51 - INFO - codeparrot_training - Step 9726: {'lr': 0.00045602075814967245, 'samples': 4980224, 'steps': 9726, 'batch_loss/train': 0.8225260465405881} 12/24/2021 06:16:05 - INFO - codeparrot_training - Step 9727: {'lr': 0.00045601182841287794, 'samples': 4980736, 'steps': 9727, 'batch_loss/train': 0.8662608824670315} 12/24/2021 06:16:16 - INFO - codeparrot_training - Step 9728: {'lr': 0.0004560028978570582, 'samples': 4981248, 'steps': 9728, 'batch_loss/train': 0.6609342311276123} 12/24/2021 06:16:26 - INFO - codeparrot_training - Step 9729: {'lr': 0.0004559939664822487, 'samples': 4981760, 'steps': 9729, 'batch_loss/train': 0.8497694209218025} 12/24/2021 06:16:38 - INFO - codeparrot_training - Step 9730: {'lr': 0.0004559850342884849, 'samples': 4982272, 'steps': 9730, 'batch_loss/train': 0.6581134703010321} 12/24/2021 06:16:49 - INFO - codeparrot_training - Step 9731: {'lr': 0.0004559761012758025, 'samples': 4982784, 'steps': 9731, 'batch_loss/train': 0.6691816188395023} 12/24/2021 06:17:00 - INFO - codeparrot_training - Step 9732: {'lr': 0.0004559671674442368, 'samples': 4983296, 'steps': 9732, 'batch_loss/train': 0.8832663577049971} 12/24/2021 06:17:10 - INFO - codeparrot_training - Step 9733: {'lr': 0.0004559582327938234, 'samples': 4983808, 'steps': 9733, 'batch_loss/train': 0.80171824619174} 12/24/2021 06:17:23 - INFO - codeparrot_training - Step 9734: {'lr': 0.00045594929732459787, 'samples': 4984320, 'steps': 9734, 'batch_loss/train': 0.8255399232730269} 12/24/2021 06:17:33 - INFO - codeparrot_training - Step 9735: {'lr': 0.00045594036103659575, 'samples': 
4984832, 'steps': 9735, 'batch_loss/train': 0.8221545340493321} 12/24/2021 06:17:44 - INFO - codeparrot_training - Step 9736: {'lr': 0.00045593142392985245, 'samples': 4985344, 'steps': 9736, 'batch_loss/train': 0.8067004075273871} 12/24/2021 06:17:57 - INFO - codeparrot_training - Step 9737: {'lr': 0.00045592248600440354, 'samples': 4985856, 'steps': 9737, 'batch_loss/train': 0.8942066356539726} 12/24/2021 06:18:08 - INFO - codeparrot_training - Step 9738: {'lr': 0.0004559135472602847, 'samples': 4986368, 'steps': 9738, 'batch_loss/train': 0.8682993720285594} 12/24/2021 06:18:19 - INFO - codeparrot_training - Step 9739: {'lr': 0.00045590460769753127, 'samples': 4986880, 'steps': 9739, 'batch_loss/train': 0.8682393934577703} 12/24/2021 06:18:31 - INFO - codeparrot_training - Step 9740: {'lr': 0.0004558956673161788, 'samples': 4987392, 'steps': 9740, 'batch_loss/train': 0.8429099740460515} 12/24/2021 06:18:42 - INFO - codeparrot_training - Step 9741: {'lr': 0.000455886726116263, 'samples': 4987904, 'steps': 9741, 'batch_loss/train': 0.8265067199245095} 12/24/2021 06:18:52 - INFO - codeparrot_training - Step 9742: {'lr': 0.0004558777840978193, 'samples': 4988416, 'steps': 9742, 'batch_loss/train': 0.8616525968536735} 12/24/2021 06:19:05 - INFO - codeparrot_training - Step 9743: {'lr': 0.00045586884126088323, 'samples': 4988928, 'steps': 9743, 'batch_loss/train': 0.8012158209457994} 12/24/2021 06:19:15 - INFO - codeparrot_training - Step 9744: {'lr': 0.0004558598976054904, 'samples': 4989440, 'steps': 9744, 'batch_loss/train': 0.8495800653472543} 12/24/2021 06:19:26 - INFO - codeparrot_training - Step 9745: {'lr': 0.00045585095313167633, 'samples': 4989952, 'steps': 9745, 'batch_loss/train': 0.84142588917166} 12/24/2021 06:19:36 - INFO - codeparrot_training - Step 9746: {'lr': 0.0004558420078394766, 'samples': 4990464, 'steps': 9746, 'batch_loss/train': 0.7609900804236531} 12/24/2021 06:19:48 - INFO - codeparrot_training - Step 9747: {'lr': 0.0004558330617289268, 
'samples': 4990976, 'steps': 9747, 'batch_loss/train': 0.7503502042964101} 12/24/2021 06:19:59 - INFO - codeparrot_training - Step 9748: {'lr': 0.0004558241148000625, 'samples': 4991488, 'steps': 9748, 'batch_loss/train': 0.7961379187181592} 12/24/2021 06:20:09 - INFO - codeparrot_training - Step 9749: {'lr': 0.0004558151670529191, 'samples': 4992000, 'steps': 9749, 'batch_loss/train': 0.7424012920819223} 12/24/2021 06:20:23 - INFO - codeparrot_training - Step 9750: {'lr': 0.00045580621848753243, 'samples': 4992512, 'steps': 9750, 'batch_loss/train': 0.8173338091000915} 12/24/2021 06:20:34 - INFO - codeparrot_training - Step 9751: {'lr': 0.00045579726910393795, 'samples': 4993024, 'steps': 9751, 'batch_loss/train': 0.8323910153703764} 12/24/2021 06:20:44 - INFO - codeparrot_training - Step 9752: {'lr': 0.0004557883189021712, 'samples': 4993536, 'steps': 9752, 'batch_loss/train': 0.9011134961619973} 12/24/2021 06:20:56 - INFO - codeparrot_training - Step 9753: {'lr': 0.00045577936788226774, 'samples': 4994048, 'steps': 9753, 'batch_loss/train': 0.8319387007504702} 12/24/2021 06:21:07 - INFO - codeparrot_training - Step 9754: {'lr': 0.00045577041604426326, 'samples': 4994560, 'steps': 9754, 'batch_loss/train': 0.8220574425067753} 12/24/2021 06:21:18 - INFO - codeparrot_training - Step 9755: {'lr': 0.0004557614633881934, 'samples': 4995072, 'steps': 9755, 'batch_loss/train': 0.7621094896458089} 12/24/2021 06:21:28 - INFO - codeparrot_training - Step 9756: {'lr': 0.00045575250991409345, 'samples': 4995584, 'steps': 9756, 'batch_loss/train': 0.7916381629183888} 12/24/2021 06:21:42 - INFO - codeparrot_training - Step 9757: {'lr': 0.0004557435556219993, 'samples': 4996096, 'steps': 9757, 'batch_loss/train': 0.819030387327075} 12/24/2021 06:21:53 - INFO - codeparrot_training - Step 9758: {'lr': 0.0004557346005119465, 'samples': 4996608, 'steps': 9758, 'batch_loss/train': 0.9198521412909031} 12/24/2021 06:22:04 - INFO - codeparrot_training - Step 9759: {'lr': 
0.0004557256445839706, 'samples': 4997120, 'steps': 9759, 'batch_loss/train': 1.03715615242254} 12/24/2021 06:22:16 - INFO - codeparrot_training - Step 9760: {'lr': 0.0004557166878381072, 'samples': 4997632, 'steps': 9760, 'batch_loss/train': 0.8288096394389868} 12/24/2021 06:22:26 - INFO - codeparrot_training - Step 9761: {'lr': 0.0004557077302743919, 'samples': 4998144, 'steps': 9761, 'batch_loss/train': 0.8068501329980791} 12/24/2021 06:22:37 - INFO - codeparrot_training - Step 9762: {'lr': 0.00045569877189286043, 'samples': 4998656, 'steps': 9762, 'batch_loss/train': 0.8815235765650868} 12/24/2021 06:22:49 - INFO - codeparrot_training - Step 9763: {'lr': 0.0004556898126935482, 'samples': 4999168, 'steps': 9763, 'batch_loss/train': 0.802037101238966} 12/24/2021 06:22:59 - INFO - codeparrot_training - Step 9764: {'lr': 0.00045568085267649097, 'samples': 4999680, 'steps': 9764, 'batch_loss/train': 0.8496105847880244} 12/24/2021 06:23:10 - INFO - codeparrot_training - Step 9765: {'lr': 0.00045567189184172443, 'samples': 5000192, 'steps': 9765, 'batch_loss/train': 0.7796499393880367} 12/24/2021 06:23:21 - INFO - codeparrot_training - Step 9766: {'lr': 0.00045566293018928396, 'samples': 5000704, 'steps': 9766, 'batch_loss/train': 0.840839963639155} 12/24/2021 06:23:35 - INFO - codeparrot_training - Step 9767: {'lr': 0.0004556539677192054, 'samples': 5001216, 'steps': 9767, 'batch_loss/train': 0.7391938711516559} 12/24/2021 06:23:46 - INFO - codeparrot_training - Step 9768: {'lr': 0.00045564500443152434, 'samples': 5001728, 'steps': 9768, 'batch_loss/train': 1.1810866799205542} 12/24/2021 06:23:57 - INFO - codeparrot_training - Step 9769: {'lr': 0.0004556360403262764, 'samples': 5002240, 'steps': 9769, 'batch_loss/train': 0.8769353171810508} 12/24/2021 06:24:09 - INFO - codeparrot_training - Step 9770: {'lr': 0.00045562707540349715, 'samples': 5002752, 'steps': 9770, 'batch_loss/train': 0.8565690591931343} 12/24/2021 06:24:19 - INFO - codeparrot_training - Step 9771: 
{'lr': 0.00045561810966322227, 'samples': 5003264, 'steps': 9771, 'batch_loss/train': 0.8951028604060411} 12/24/2021 06:24:30 - INFO - codeparrot_training - Step 9772: {'lr': 0.0004556091431054875, 'samples': 5003776, 'steps': 9772, 'batch_loss/train': 1.0405351435765624} 12/24/2021 06:24:40 - INFO - codeparrot_training - Step 9773: {'lr': 0.00045560017573032836, 'samples': 5004288, 'steps': 9773, 'batch_loss/train': 0.8346106561366469} 12/24/2021 06:24:54 - INFO - codeparrot_training - Step 9774: {'lr': 0.0004555912075377806, 'samples': 5004800, 'steps': 9774, 'batch_loss/train': 0.9078797944821417} 12/24/2021 06:25:05 - INFO - codeparrot_training - Step 9775: {'lr': 0.0004555822385278798, 'samples': 5005312, 'steps': 9775, 'batch_loss/train': 0.8411741144955158} 12/24/2021 06:25:15 - INFO - codeparrot_training - Step 9776: {'lr': 0.0004555732687006616, 'samples': 5005824, 'steps': 9776, 'batch_loss/train': 0.79968720395118} 12/24/2021 06:25:28 - INFO - codeparrot_training - Step 9777: {'lr': 0.0004555642980561618, 'samples': 5006336, 'steps': 9777, 'batch_loss/train': 0.8017441779375076} 12/24/2021 06:25:38 - INFO - codeparrot_training - Step 9778: {'lr': 0.0004555553265944159, 'samples': 5006848, 'steps': 9778, 'batch_loss/train': 0.8144642058759928} 12/24/2021 06:25:49 - INFO - codeparrot_training - Step 9779: {'lr': 0.00045554635431545965, 'samples': 5007360, 'steps': 9779, 'batch_loss/train': 0.800537217175588} 12/24/2021 06:26:01 - INFO - codeparrot_training - Step 9780: {'lr': 0.00045553738121932873, 'samples': 5007872, 'steps': 9780, 'batch_loss/train': 0.9227328523993492} 12/24/2021 06:26:11 - INFO - codeparrot_training - Step 9781: {'lr': 0.00045552840730605875, 'samples': 5008384, 'steps': 9781, 'batch_loss/train': 0.8480272423475981} 12/24/2021 06:26:22 - INFO - codeparrot_training - Step 9782: {'lr': 0.00045551943257568544, 'samples': 5008896, 'steps': 9782, 'batch_loss/train': 0.7982451294083148} 12/24/2021 06:26:34 - INFO - codeparrot_training - 
Step 9783: {'lr': 0.0004555104570282445, 'samples': 5009408, 'steps': 9783, 'batch_loss/train': 0.9377838494256139} 12/24/2021 06:26:45 - INFO - codeparrot_training - Step 9784: {'lr': 0.00045550148066377156, 'samples': 5009920, 'steps': 9784, 'batch_loss/train': 0.8155624363571405} 12/24/2021 06:26:55 - INFO - codeparrot_training - Step 9785: {'lr': 0.0004554925034823023, 'samples': 5010432, 'steps': 9785, 'batch_loss/train': 0.8278255239129066} 12/24/2021 06:27:06 - INFO - codeparrot_training - Step 9786: {'lr': 0.0004554835254838725, 'samples': 5010944, 'steps': 9786, 'batch_loss/train': 0.910793662071228} 12/24/2021 06:27:20 - INFO - codeparrot_training - Step 9787: {'lr': 0.00045547454666851773, 'samples': 5011456, 'steps': 9787, 'batch_loss/train': 0.904159783385694} 12/24/2021 06:27:30 - INFO - codeparrot_training - Step 9788: {'lr': 0.00045546556703627373, 'samples': 5011968, 'steps': 9788, 'batch_loss/train': 0.8218864165246487} 12/24/2021 06:27:41 - INFO - codeparrot_training - Step 9789: {'lr': 0.0004554565865871763, 'samples': 5012480, 'steps': 9789, 'batch_loss/train': 0.7705534314736724} 12/24/2021 06:27:53 - INFO - codeparrot_training - Step 9790: {'lr': 0.000455447605321261, 'samples': 5012992, 'steps': 9790, 'batch_loss/train': 0.7964772330597043} 12/24/2021 06:28:04 - INFO - codeparrot_training - Step 9791: {'lr': 0.0004554386232385637, 'samples': 5013504, 'steps': 9791, 'batch_loss/train': 0.8084822576493025} 12/24/2021 06:28:14 - INFO - codeparrot_training - Step 9792: {'lr': 0.0004554296403391198, 'samples': 5014016, 'steps': 9792, 'batch_loss/train': 0.9131637876853347} 12/24/2021 06:28:26 - INFO - codeparrot_training - Step 9793: {'lr': 0.00045542065662296536, 'samples': 5014528, 'steps': 9793, 'batch_loss/train': 0.8414010195992887} 12/24/2021 06:28:37 - INFO - codeparrot_training - Step 9794: {'lr': 0.0004554116720901359, 'samples': 5015040, 'steps': 9794, 'batch_loss/train': 0.8962394669651985} 12/24/2021 06:28:48 - INFO - 
codeparrot_training - Step 9795: {'lr': 0.0004554026867406672, 'samples': 5015552, 'steps': 9795, 'batch_loss/train': 0.8113033076515421} 12/24/2021 06:28:58 - INFO - codeparrot_training - Step 9796: {'lr': 0.00045539370057459504, 'samples': 5016064, 'steps': 9796, 'batch_loss/train': 0.8842766843736172} 12/24/2021 06:29:12 - INFO - codeparrot_training - Step 9797: {'lr': 0.000455384713591955, 'samples': 5016576, 'steps': 9797, 'batch_loss/train': 0.8201266331598163} 12/24/2021 06:29:23 - INFO - codeparrot_training - Step 9798: {'lr': 0.00045537572579278297, 'samples': 5017088, 'steps': 9798, 'batch_loss/train': 0.8903770484030247} 12/24/2021 06:29:34 - INFO - codeparrot_training - Step 9799: {'lr': 0.0004553667371771145, 'samples': 5017600, 'steps': 9799, 'batch_loss/train': 0.8652414875105023} 12/24/2021 06:29:46 - INFO - codeparrot_training - Step 9800: {'lr': 0.0004553577477449855, 'samples': 5018112, 'steps': 9800, 'batch_loss/train': 0.8899694747524336} 12/24/2021 06:29:57 - INFO - codeparrot_training - Step 9801: {'lr': 0.00045534875749643163, 'samples': 5018624, 'steps': 9801, 'batch_loss/train': 0.8215032392181456} 12/24/2021 06:30:07 - INFO - codeparrot_training - Step 9802: {'lr': 0.0004553397664314886, 'samples': 5019136, 'steps': 9802, 'batch_loss/train': 0.7654251120984554} 12/24/2021 06:30:21 - INFO - codeparrot_training - Step 9803: {'lr': 0.0004553307745501922, 'samples': 5019648, 'steps': 9803, 'batch_loss/train': 1.1043119570240378} 12/24/2021 06:30:32 - INFO - codeparrot_training - Step 9804: {'lr': 0.00045532178185257824, 'samples': 5020160, 'steps': 9804, 'batch_loss/train': 0.8340222341939807} 12/24/2021 06:30:42 - INFO - codeparrot_training - Step 9805: {'lr': 0.0004553127883386823, 'samples': 5020672, 'steps': 9805, 'batch_loss/train': 1.0062582464888692} 12/24/2021 06:30:53 - INFO - codeparrot_training - Step 9806: {'lr': 0.0004553037940085403, 'samples': 5021184, 'steps': 9806, 'batch_loss/train': 0.8673984562046826} 12/24/2021 06:31:05 - 
INFO - codeparrot_training - Step 9807: {'lr': 0.00045529479886218796, 'samples': 5021696, 'steps': 9807, 'batch_loss/train': 0.8211776041425765} 12/24/2021 06:31:16 - INFO - codeparrot_training - Step 9808: {'lr': 0.0004552858028996609, 'samples': 5022208, 'steps': 9808, 'batch_loss/train': 0.7691617778036743} 12/24/2021 06:31:26 - INFO - codeparrot_training - Step 9809: {'lr': 0.0004552768061209951, 'samples': 5022720, 'steps': 9809, 'batch_loss/train': 0.8430922580882907} 12/24/2021 06:31:39 - INFO - codeparrot_training - Step 9810: {'lr': 0.0004552678085262262, 'samples': 5023232, 'steps': 9810, 'batch_loss/train': 0.7717522086459212} 12/24/2021 06:31:49 - INFO - codeparrot_training - Step 9811: {'lr': 0.00045525881011538997, 'samples': 5023744, 'steps': 9811, 'batch_loss/train': 0.9606233518570662} 12/24/2021 06:32:00 - INFO - codeparrot_training - Step 9812: {'lr': 0.00045524981088852225, 'samples': 5024256, 'steps': 9812, 'batch_loss/train': 0.9060781598091125} 12/24/2021 06:32:14 - INFO - codeparrot_training - Step 9813: {'lr': 0.00045524081084565883, 'samples': 5024768, 'steps': 9813, 'batch_loss/train': 0.8490574583411217} 12/24/2021 06:32:24 - INFO - codeparrot_training - Step 9814: {'lr': 0.0004552318099868353, 'samples': 5025280, 'steps': 9814, 'batch_loss/train': 0.7255102326162159} 12/24/2021 06:32:35 - INFO - codeparrot_training - Step 9815: {'lr': 0.00045522280831208775, 'samples': 5025792, 'steps': 9815, 'batch_loss/train': 0.8680169493891299} 12/24/2021 06:32:45 - INFO - codeparrot_training - Step 9816: {'lr': 0.00045521380582145166, 'samples': 5026304, 'steps': 9816, 'batch_loss/train': 0.8392131170257926} 12/24/2021 06:32:57 - INFO - codeparrot_training - Step 9817: {'lr': 0.000455204802514963, 'samples': 5026816, 'steps': 9817, 'batch_loss/train': 0.7782282428815961} 12/24/2021 06:33:08 - INFO - codeparrot_training - Step 9818: {'lr': 0.00045519579839265763, 'samples': 5027328, 'steps': 9818, 'batch_loss/train': 0.895240779966116} 12/24/2021 
06:33:19 - INFO - codeparrot_training - Step 9819: {'lr': 0.00045518679345457115, 'samples': 5027840, 'steps': 9819, 'batch_loss/train': 0.8462752345949411} 12/24/2021 06:33:31 - INFO - codeparrot_training - Step 9820: {'lr': 0.00045517778770073946, 'samples': 5028352, 'steps': 9820, 'batch_loss/train': 0.7155407364480197} 12/24/2021 06:33:41 - INFO - codeparrot_training - Step 9821: {'lr': 0.0004551687811311984, 'samples': 5028864, 'steps': 9821, 'batch_loss/train': 0.783024494536221} 12/24/2021 06:33:52 - INFO - codeparrot_training - Step 9822: {'lr': 0.00045515977374598374, 'samples': 5029376, 'steps': 9822, 'batch_loss/train': 0.8207530761137605} 12/24/2021 06:34:04 - INFO - codeparrot_training - Step 9823: {'lr': 0.0004551507655451312, 'samples': 5029888, 'steps': 9823, 'batch_loss/train': 0.8311207965016365} 12/24/2021 06:34:15 - INFO - codeparrot_training - Step 9824: {'lr': 0.0004551417565286767, 'samples': 5030400, 'steps': 9824, 'batch_loss/train': 0.8552871090359986} 12/24/2021 06:34:25 - INFO - codeparrot_training - Step 9825: {'lr': 0.0004551327466966562, 'samples': 5030912, 'steps': 9825, 'batch_loss/train': 0.7842899574898183} 12/24/2021 06:34:39 - INFO - codeparrot_training - Step 9826: {'lr': 0.00045512373604910517, 'samples': 5031424, 'steps': 9826, 'batch_loss/train': 0.8833441007882357} 12/24/2021 06:34:50 - INFO - codeparrot_training - Step 9827: {'lr': 0.0004551147245860596, 'samples': 5031936, 'steps': 9827, 'batch_loss/train': 1.6603293847292662} 12/24/2021 06:35:00 - INFO - codeparrot_training - Step 9828: {'lr': 0.00045510571230755547, 'samples': 5032448, 'steps': 9828, 'batch_loss/train': 0.7898919135332108} 12/24/2021 06:35:11 - INFO - codeparrot_training - Step 9829: {'lr': 0.0004550966992136284, 'samples': 5032960, 'steps': 9829, 'batch_loss/train': 0.8255813429132104} 12/24/2021 06:35:23 - INFO - codeparrot_training - Step 9830: {'lr': 0.0004550876853043143, 'samples': 5033472, 'steps': 9830, 'batch_loss/train': 0.7684181537479162} 
12/24/2021 06:35:33 - INFO - codeparrot_training - Step 9831: {'lr': 0.000455078670579649, 'samples': 5033984, 'steps': 9831, 'batch_loss/train': 0.874704078771174} 12/24/2021 06:35:44 - INFO - codeparrot_training - Step 9832: {'lr': 0.00045506965503966834, 'samples': 5034496, 'steps': 9832, 'batch_loss/train': 0.8987348387017846} 12/24/2021 06:35:58 - INFO - codeparrot_training - Step 9833: {'lr': 0.0004550606386844082, 'samples': 5035008, 'steps': 9833, 'batch_loss/train': 0.7475741617381573} 12/24/2021 06:36:09 - INFO - codeparrot_training - Step 9834: {'lr': 0.00045505162151390435, 'samples': 5035520, 'steps': 9834, 'batch_loss/train': 0.8305448349565268} 12/24/2021 06:36:19 - INFO - codeparrot_training - Step 9835: {'lr': 0.0004550426035281926, 'samples': 5036032, 'steps': 9835, 'batch_loss/train': 0.7250332939438522} 12/24/2021 06:36:30 - INFO - codeparrot_training - Step 9836: {'lr': 0.000455033584727309, 'samples': 5036544, 'steps': 9836, 'batch_loss/train': 0.857207290828228} 12/24/2021 06:36:42 - INFO - codeparrot_training - Step 9837: {'lr': 0.0004550245651112892, 'samples': 5037056, 'steps': 9837, 'batch_loss/train': 0.8240932114422321} 12/24/2021 06:36:52 - INFO - codeparrot_training - Step 9838: {'lr': 0.00045501554468016926, 'samples': 5037568, 'steps': 9838, 'batch_loss/train': 0.8963577598333359} 12/24/2021 06:37:03 - INFO - codeparrot_training - Step 9839: {'lr': 0.00045500652343398484, 'samples': 5038080, 'steps': 9839, 'batch_loss/train': 0.5684199647512287} 12/24/2021 06:37:15 - INFO - codeparrot_training - Step 9840: {'lr': 0.00045499750137277187, 'samples': 5038592, 'steps': 9840, 'batch_loss/train': 0.8540344443172216} 12/24/2021 06:37:26 - INFO - codeparrot_training - Step 9841: {'lr': 0.00045498847849656635, 'samples': 5039104, 'steps': 9841, 'batch_loss/train': 0.8658972975099459} 12/24/2021 06:37:36 - INFO - codeparrot_training - Step 9842: {'lr': 0.00045497945480540395, 'samples': 5039616, 'steps': 9842, 'batch_loss/train': 
0.7586196707561612} 12/24/2021 06:37:50 - INFO - codeparrot_training - Step 9843: {'lr': 0.0004549704302993206, 'samples': 5040128, 'steps': 9843, 'batch_loss/train': 0.8321466762572527} 12/24/2021 06:38:00 - INFO - codeparrot_training - Step 9844: {'lr': 0.00045496140497835227, 'samples': 5040640, 'steps': 9844, 'batch_loss/train': 0.8299184432253242} 12/24/2021 06:38:11 - INFO - codeparrot_training - Step 9845: {'lr': 0.00045495237884253477, 'samples': 5041152, 'steps': 9845, 'batch_loss/train': 0.9082990409806371} 12/24/2021 06:38:22 - INFO - codeparrot_training - Step 9846: {'lr': 0.000454943351891904, 'samples': 5041664, 'steps': 9846, 'batch_loss/train': 0.8352394998073578} 12/24/2021 06:38:34 - INFO - codeparrot_training - Step 9847: {'lr': 0.0004549343241264958, 'samples': 5042176, 'steps': 9847, 'batch_loss/train': 0.9268957246094942} 12/24/2021 06:38:44 - INFO - codeparrot_training - Step 9848: {'lr': 0.0004549252955463462, 'samples': 5042688, 'steps': 9848, 'batch_loss/train': 0.8964537065476179} 12/24/2021 06:38:55 - INFO - codeparrot_training - Step 9849: {'lr': 0.00045491626615149096, 'samples': 5043200, 'steps': 9849, 'batch_loss/train': 0.7889579497277737} 12/24/2021 06:39:07 - INFO - codeparrot_training - Step 9850: {'lr': 0.00045490723594196605, 'samples': 5043712, 'steps': 9850, 'batch_loss/train': 0.9182048784568906} 12/24/2021 06:39:18 - INFO - codeparrot_training - Step 9851: {'lr': 0.0004548982049178073, 'samples': 5044224, 'steps': 9851, 'batch_loss/train': 0.6955841290764511} 12/24/2021 06:39:28 - INFO - codeparrot_training - Step 9852: {'lr': 0.00045488917307905065, 'samples': 5044736, 'steps': 9852, 'batch_loss/train': 0.8511983612552285} 12/24/2021 06:39:42 - INFO - codeparrot_training - Step 9853: {'lr': 0.000454880140425732, 'samples': 5045248, 'steps': 9853, 'batch_loss/train': 0.805098028620705} 12/24/2021 06:39:53 - INFO - codeparrot_training - Step 9854: {'lr': 0.00045487110695788735, 'samples': 5045760, 'steps': 9854, 
'batch_loss/train': 0.7650661908555776} 12/24/2021 06:40:03 - INFO - codeparrot_training - Step 9855: {'lr': 0.0004548620726755525, 'samples': 5046272, 'steps': 9855, 'batch_loss/train': 0.8438456566073} 12/24/2021 06:40:14 - INFO - codeparrot_training - Step 9856: {'lr': 0.0004548530375787633, 'samples': 5046784, 'steps': 9856, 'batch_loss/train': 0.7807524483650923} 12/24/2021 06:40:26 - INFO - codeparrot_training - Step 9857: {'lr': 0.0004548440016675559, 'samples': 5047296, 'steps': 9857, 'batch_loss/train': 0.8050119336694479} 12/24/2021 06:40:36 - INFO - codeparrot_training - Step 9858: {'lr': 0.000454834964941966, 'samples': 5047808, 'steps': 9858, 'batch_loss/train': 0.8292583152651787} 12/24/2021 06:40:47 - INFO - codeparrot_training - Step 9859: {'lr': 0.00045482592740202964, 'samples': 5048320, 'steps': 9859, 'batch_loss/train': 0.7283614838379435} 12/24/2021 06:40:59 - INFO - codeparrot_training - Step 9860: {'lr': 0.00045481688904778285, 'samples': 5048832, 'steps': 9860, 'batch_loss/train': 0.9259756300598383} 12/24/2021 06:41:10 - INFO - codeparrot_training - Step 9861: {'lr': 0.0004548078498792613, 'samples': 5049344, 'steps': 9861, 'batch_loss/train': 0.9349979031831026} 12/24/2021 06:41:20 - INFO - codeparrot_training - Step 9862: {'lr': 0.0004547988098965011, 'samples': 5049856, 'steps': 9862, 'batch_loss/train': 0.844707265496254} 12/24/2021 06:41:34 - INFO - codeparrot_training - Step 9863: {'lr': 0.0004547897690995383, 'samples': 5050368, 'steps': 9863, 'batch_loss/train': 0.7618209151551127} 12/24/2021 06:41:45 - INFO - codeparrot_training - Step 9864: {'lr': 0.00045478072748840855, 'samples': 5050880, 'steps': 9864, 'batch_loss/train': 0.8280975827947259} 12/24/2021 06:41:55 - INFO - codeparrot_training - Step 9865: {'lr': 0.000454771685063148, 'samples': 5051392, 'steps': 9865, 'batch_loss/train': 0.7663685223087668} 12/24/2021 06:42:07 - INFO - codeparrot_training - Step 9866: {'lr': 0.00045476264182379257, 'samples': 5051904, 'steps': 
9866, 'batch_loss/train': 0.9772258414886892} 12/24/2021 06:42:18 - INFO - codeparrot_training - Step 9867: {'lr': 0.00045475359777037813, 'samples': 5052416, 'steps': 9867, 'batch_loss/train': 0.7433432373218238} 12/24/2021 06:42:29 - INFO - codeparrot_training - Step 9868: {'lr': 0.0004547445529029407, 'samples': 5052928, 'steps': 9868, 'batch_loss/train': 0.7057605665177107} 12/24/2021 06:42:39 - INFO - codeparrot_training - Step 9869: {'lr': 0.00045473550722151626, 'samples': 5053440, 'steps': 9869, 'batch_loss/train': 0.560181507084053} 12/24/2021 06:42:51 - INFO - codeparrot_training - Step 9870: {'lr': 0.00045472646072614075, 'samples': 5053952, 'steps': 9870, 'batch_loss/train': 0.781887992285192} 12/24/2021 06:43:02 - INFO - codeparrot_training - Step 9871: {'lr': 0.0004547174134168501, 'samples': 5054464, 'steps': 9871, 'batch_loss/train': 0.7734268163330853} 12/24/2021 06:43:13 - INFO - codeparrot_training - Step 9872: {'lr': 0.0004547083652936804, 'samples': 5054976, 'steps': 9872, 'batch_loss/train': 0.8880233257077634} 12/24/2021 06:43:27 - INFO - codeparrot_training - Step 9873: {'lr': 0.0004546993163566675, 'samples': 5055488, 'steps': 9873, 'batch_loss/train': 0.8555382442427799} 12/24/2021 06:43:37 - INFO - codeparrot_training - Step 9874: {'lr': 0.00045469026660584737, 'samples': 5056000, 'steps': 9874, 'batch_loss/train': 0.7593304677866399} 12/24/2021 06:43:48 - INFO - codeparrot_training - Step 9875: {'lr': 0.00045468121604125614, 'samples': 5056512, 'steps': 9875, 'batch_loss/train': 0.7498592603951693} 12/24/2021 06:44:00 - INFO - codeparrot_training - Step 9876: {'lr': 0.00045467216466292955, 'samples': 5057024, 'steps': 9876, 'batch_loss/train': 0.7485098638571799} 12/24/2021 06:44:10 - INFO - codeparrot_training - Step 9877: {'lr': 0.00045466311247090375, 'samples': 5057536, 'steps': 9877, 'batch_loss/train': 0.7888044673018157} 12/24/2021 06:44:21 - INFO - codeparrot_training - Step 9878: {'lr': 0.00045465405946521474, 'samples': 
5058048, 'steps': 9878, 'batch_loss/train': 0.8713780287653208} 12/24/2021 06:44:32 - INFO - codeparrot_training - Step 9879: {'lr': 0.0004546450056458984, 'samples': 5058560, 'steps': 9879, 'batch_loss/train': 0.8050077946390957} 12/24/2021 06:44:44 - INFO - codeparrot_training - Step 9880: {'lr': 0.0004546359510129908, 'samples': 5059072, 'steps': 9880, 'batch_loss/train': 0.5800036228029057} 12/24/2021 06:44:55 - INFO - codeparrot_training - Step 9881: {'lr': 0.0004546268955665279, 'samples': 5059584, 'steps': 9881, 'batch_loss/train': 0.8801920283585787} 12/24/2021 06:45:05 - INFO - codeparrot_training - Step 9882: {'lr': 0.00045461783930654575, 'samples': 5060096, 'steps': 9882, 'batch_loss/train': 0.6038229528348893} 12/24/2021 06:45:19 - INFO - codeparrot_training - Step 9883: {'lr': 0.00045460878223308046, 'samples': 5060608, 'steps': 9883, 'batch_loss/train': 0.8904303815215826} 12/24/2021 06:45:30 - INFO - codeparrot_training - Step 9884: {'lr': 0.00045459972434616777, 'samples': 5061120, 'steps': 9884, 'batch_loss/train': 0.7878522360697389} 12/24/2021 06:45:40 - INFO - codeparrot_training - Step 9885: {'lr': 0.00045459066564584383, 'samples': 5061632, 'steps': 9885, 'batch_loss/train': 0.8206113483756781} 12/24/2021 06:45:52 - INFO - codeparrot_training - Step 9886: {'lr': 0.0004545816061321447, 'samples': 5062144, 'steps': 9886, 'batch_loss/train': 0.9087833417579532} 12/24/2021 06:46:03 - INFO - codeparrot_training - Step 9887: {'lr': 0.00045457254580510634, 'samples': 5062656, 'steps': 9887, 'batch_loss/train': 1.1327103888615966} 12/24/2021 06:46:14 - INFO - codeparrot_training - Step 9888: {'lr': 0.00045456348466476476, 'samples': 5063168, 'steps': 9888, 'batch_loss/train': 0.8100991128012538} 12/24/2021 06:46:24 - INFO - codeparrot_training - Step 9889: {'lr': 0.000454554422711156, 'samples': 5063680, 'steps': 9889, 'batch_loss/train': 0.7824692241847515} 12/24/2021 06:46:38 - INFO - codeparrot_training - Step 9890: {'lr': 0.0004545453599443161, 
'samples': 5064192, 'steps': 9890, 'batch_loss/train': 0.7421319507993758} 12/24/2021 06:46:49 - INFO - codeparrot_training - Step 9891: {'lr': 0.0004545362963642811, 'samples': 5064704, 'steps': 9891, 'batch_loss/train': 0.9342335853725672} 12/24/2021 06:46:59 - INFO - codeparrot_training - Step 9892: {'lr': 0.000454527231971087, 'samples': 5065216, 'steps': 9892, 'batch_loss/train': 1.0699348002672195} 12/24/2021 06:47:11 - INFO - codeparrot_training - Step 9893: {'lr': 0.00045451816676476985, 'samples': 5065728, 'steps': 9893, 'batch_loss/train': 0.7677188962697983} 12/24/2021 06:47:22 - INFO - codeparrot_training - Step 9894: {'lr': 0.0004545091007453658, 'samples': 5066240, 'steps': 9894, 'batch_loss/train': 0.8325847270898521} 12/24/2021 06:47:33 - INFO - codeparrot_training - Step 9895: {'lr': 0.0004545000339129106, 'samples': 5066752, 'steps': 9895, 'batch_loss/train': 0.8583777383901179} 12/24/2021 06:47:45 - INFO - codeparrot_training - Step 9896: {'lr': 0.00045449096626744057, 'samples': 5067264, 'steps': 9896, 'batch_loss/train': 0.8682390404865146} 12/24/2021 06:47:56 - INFO - codeparrot_training - Step 9897: {'lr': 0.0004544818978089917, 'samples': 5067776, 'steps': 9897, 'batch_loss/train': 0.7171646500937641} 12/24/2021 06:48:06 - INFO - codeparrot_training - Step 9898: {'lr': 0.00045447282853759995, 'samples': 5068288, 'steps': 9898, 'batch_loss/train': 1.0050035370513797} 12/24/2021 06:48:17 - INFO - codeparrot_training - Step 9899: {'lr': 0.0004544637584533015, 'samples': 5068800, 'steps': 9899, 'batch_loss/train': 0.7346268664114177} 12/24/2021 06:48:29 - INFO - codeparrot_training - Step 9900: {'lr': 0.00045445468755613233, 'samples': 5069312, 'steps': 9900, 'batch_loss/train': 0.8723827102221549} 12/24/2021 06:48:40 - INFO - codeparrot_training - Step 9901: {'lr': 0.00045444561584612853, 'samples': 5069824, 'steps': 9901, 'batch_loss/train': 0.7575823070947081} 12/24/2021 06:48:50 - INFO - codeparrot_training - Step 9902: {'lr': 
0.0004544365433233261, 'samples': 5070336, 'steps': 9902, 'batch_loss/train': 1.1281841811724007} 12/24/2021 06:49:04 - INFO - codeparrot_training - Step 9903: {'lr': 0.00045442746998776125, 'samples': 5070848, 'steps': 9903, 'batch_loss/train': 0.8063647542148829} 12/24/2021 06:49:14 - INFO - codeparrot_training - Step 9904: {'lr': 0.0004544183958394699, 'samples': 5071360, 'steps': 9904, 'batch_loss/train': 0.866689832881093} 12/24/2021 06:49:25 - INFO - codeparrot_training - Step 9905: {'lr': 0.0004544093208784882, 'samples': 5071872, 'steps': 9905, 'batch_loss/train': 0.8658365965820849} 12/24/2021 06:49:37 - INFO - codeparrot_training - Step 9906: {'lr': 0.0004544002451048523, 'samples': 5072384, 'steps': 9906, 'batch_loss/train': 0.9080610070377588} 12/24/2021 06:49:48 - INFO - codeparrot_training - Step 9907: {'lr': 0.00045439116851859806, 'samples': 5072896, 'steps': 9907, 'batch_loss/train': 0.8045594352297485} 12/24/2021 06:49:59 - INFO - codeparrot_training - Step 9908: {'lr': 0.00045438209111976183, 'samples': 5073408, 'steps': 9908, 'batch_loss/train': 0.5786517565138638} 12/24/2021 06:50:10 - INFO - codeparrot_training - Step 9909: {'lr': 0.00045437301290837955, 'samples': 5073920, 'steps': 9909, 'batch_loss/train': 0.8266146974638104} 12/24/2021 06:50:21 - INFO - codeparrot_training - Step 9910: {'lr': 0.0004543639338844874, 'samples': 5074432, 'steps': 9910, 'batch_loss/train': 0.8513415940105915} 12/24/2021 06:50:32 - INFO - codeparrot_training - Step 9911: {'lr': 0.00045435485404812125, 'samples': 5074944, 'steps': 9911, 'batch_loss/train': 0.8299615476280451} 12/24/2021 06:50:42 - INFO - codeparrot_training - Step 9912: {'lr': 0.0004543457733993175, 'samples': 5075456, 'steps': 9912, 'batch_loss/train': 0.7890086490660906} 12/24/2021 06:50:56 - INFO - codeparrot_training - Step 9913: {'lr': 0.00045433669193811207, 'samples': 5075968, 'steps': 9913, 'batch_loss/train': 0.7753328531980515} 12/24/2021 06:51:07 - INFO - codeparrot_training - Step 
9914: {'lr': 0.0004543276096645411, 'samples': 5076480, 'steps': 9914, 'batch_loss/train': 0.7587642536964267} 12/24/2021 06:51:17 - INFO - codeparrot_training - Step 9915: {'lr': 0.0004543185265786407, 'samples': 5076992, 'steps': 9915, 'batch_loss/train': 0.9330048142001033} 12/24/2021 06:51:30 - INFO - codeparrot_training - Step 9916: {'lr': 0.00045430944268044706, 'samples': 5077504, 'steps': 9916, 'batch_loss/train': 0.7620395505800843} 12/24/2021 06:51:40 - INFO - codeparrot_training - Step 9917: {'lr': 0.00045430035796999613, 'samples': 5078016, 'steps': 9917, 'batch_loss/train': 0.8113879095762968} 12/24/2021 06:51:51 - INFO - codeparrot_training - Step 9918: {'lr': 0.0004542912724473241, 'samples': 5078528, 'steps': 9918, 'batch_loss/train': 0.9891811507986858} 12/24/2021 06:52:05 - INFO - codeparrot_training - Step 9919: {'lr': 0.0004542821861124672, 'samples': 5079040, 'steps': 9919, 'batch_loss/train': 0.8324353211792186} 12/24/2021 06:52:15 - INFO - codeparrot_training - Step 9920: {'lr': 0.00045427309896546136, 'samples': 5079552, 'steps': 9920, 'batch_loss/train': 0.7891840534284711} 12/24/2021 06:52:26 - INFO - codeparrot_training - Step 9921: {'lr': 0.0004542640110063428, 'samples': 5080064, 'steps': 9921, 'batch_loss/train': 0.7406496061012149} 12/24/2021 06:52:36 - INFO - codeparrot_training - Step 9922: {'lr': 0.00045425492223514786, 'samples': 5080576, 'steps': 9922, 'batch_loss/train': 0.7951553214807063} 12/24/2021 06:52:48 - INFO - codeparrot_training - Step 9923: {'lr': 0.0004542458326519123, 'samples': 5081088, 'steps': 9923, 'batch_loss/train': 0.826094794087112} 12/24/2021 06:52:59 - INFO - codeparrot_training - Step 9924: {'lr': 0.0004542367422566724, 'samples': 5081600, 'steps': 9924, 'batch_loss/train': 0.8542165393009782} 12/24/2021 06:53:10 - INFO - codeparrot_training - Step 9925: {'lr': 0.0004542276510494645, 'samples': 5082112, 'steps': 9925, 'batch_loss/train': 0.8772210981696844} 12/24/2021 06:53:22 - INFO - codeparrot_training 
- Step 9926: {'lr': 0.00045421855903032436, 'samples': 5082624, 'steps': 9926, 'batch_loss/train': 0.8012725412845612} 12/24/2021 06:53:32 - INFO - codeparrot_training - Step 9927: {'lr': 0.0004542094661992885, 'samples': 5083136, 'steps': 9927, 'batch_loss/train': 0.7276257872581482} 12/24/2021 06:53:43 - INFO - codeparrot_training - Step 9928: {'lr': 0.0004542003725563929, 'samples': 5083648, 'steps': 9928, 'batch_loss/train': 0.824931469745934} 12/24/2021 06:53:57 - INFO - codeparrot_training - Step 9929: {'lr': 0.00045419127810167364, 'samples': 5084160, 'steps': 9929, 'batch_loss/train': 0.7267284831032157} 12/24/2021 06:54:07 - INFO - codeparrot_training - Step 9930: {'lr': 0.000454182182835167, 'samples': 5084672, 'steps': 9930, 'batch_loss/train': 0.8173040077090263} 12/24/2021 06:54:18 - INFO - codeparrot_training - Step 9931: {'lr': 0.00045417308675690905, 'samples': 5085184, 'steps': 9931, 'batch_loss/train': 0.9392796708270907} 12/24/2021 06:54:28 - INFO - codeparrot_training - Step 9932: {'lr': 0.0004541639898669361, 'samples': 5085696, 'steps': 9932, 'batch_loss/train': 0.8785857036709785} 12/24/2021 06:54:40 - INFO - codeparrot_training - Step 9933: {'lr': 0.00045415489216528416, 'samples': 5086208, 'steps': 9933, 'batch_loss/train': 0.8842425085604191} 12/24/2021 06:54:51 - INFO - codeparrot_training - Step 9934: {'lr': 0.00045414579365198943, 'samples': 5086720, 'steps': 9934, 'batch_loss/train': 0.7703712489455938} 12/24/2021 06:55:02 - INFO - codeparrot_training - Step 9935: {'lr': 0.0004541366943270882, 'samples': 5087232, 'steps': 9935, 'batch_loss/train': 1.0080536901950836} 12/24/2021 06:55:14 - INFO - codeparrot_training - Step 9936: {'lr': 0.00045412759419061644, 'samples': 5087744, 'steps': 9936, 'batch_loss/train': 0.9227300761267543} 12/24/2021 06:55:24 - INFO - codeparrot_training - Step 9937: {'lr': 0.00045411849324261043, 'samples': 5088256, 'steps': 9937, 'batch_loss/train': 0.6749605764634907} 12/24/2021 06:55:35 - INFO - 
codeparrot_training - Step 9938: {'lr': 0.00045410939148310647, 'samples': 5088768, 'steps': 9938, 'batch_loss/train': 0.8884128499776125} 12/24/2021 06:55:47 - INFO - codeparrot_training - Step 9939: {'lr': 0.0004541002889121406, 'samples': 5089280, 'steps': 9939, 'batch_loss/train': 0.7985226111486554} 12/24/2021 06:55:58 - INFO - codeparrot_training - Step 9940: {'lr': 0.00045409118552974906, 'samples': 5089792, 'steps': 9940, 'batch_loss/train': 0.913105795159936} 12/24/2021 06:56:08 - INFO - codeparrot_training - Step 9941: {'lr': 0.00045408208133596793, 'samples': 5090304, 'steps': 9941, 'batch_loss/train': 0.7739684656262398} 12/24/2021 06:56:19 - INFO - codeparrot_training - Step 9942: {'lr': 0.00045407297633083353, 'samples': 5090816, 'steps': 9942, 'batch_loss/train': 0.8655055705457926} 12/24/2021 06:56:33 - INFO - codeparrot_training - Step 9943: {'lr': 0.00045406387051438214, 'samples': 5091328, 'steps': 9943, 'batch_loss/train': 0.9040500167757273} 12/24/2021 06:56:43 - INFO - codeparrot_training - Step 9944: {'lr': 0.0004540547638866498, 'samples': 5091840, 'steps': 9944, 'batch_loss/train': 0.8350487798452377} 12/24/2021 06:56:54 - INFO - codeparrot_training - Step 9945: {'lr': 0.00045404565644767273, 'samples': 5092352, 'steps': 9945, 'batch_loss/train': 0.8778084479272366} 12/24/2021 06:57:06 - INFO - codeparrot_training - Step 9946: {'lr': 0.0004540365481974872, 'samples': 5092864, 'steps': 9946, 'batch_loss/train': 0.9063900224864483} 12/24/2021 06:57:16 - INFO - codeparrot_training - Step 9947: {'lr': 0.0004540274391361294, 'samples': 5093376, 'steps': 9947, 'batch_loss/train': 0.707359979627654} 12/24/2021 06:57:27 - INFO - codeparrot_training - Step 9948: {'lr': 0.0004540183292636355, 'samples': 5093888, 'steps': 9948, 'batch_loss/train': 0.7690678266808391} 12/24/2021 06:57:41 - INFO - codeparrot_training - Step 9949: {'lr': 0.0004540092185800418, 'samples': 5094400, 'steps': 9949, 'batch_loss/train': 0.7581482958048582} 12/24/2021 06:57:51 
- INFO - codeparrot_training - Step 9950: {'lr': 0.0004540001070853845, 'samples': 5094912, 'steps': 9950, 'batch_loss/train': 0.8189849378541112} 12/24/2021 06:58:02 - INFO - codeparrot_training - Step 9951: {'lr': 0.00045399099477969973, 'samples': 5095424, 'steps': 9951, 'batch_loss/train': 0.8065938465297222} 12/24/2021 06:58:14 - INFO - codeparrot_training - Step 9952: {'lr': 0.00045398188166302384, 'samples': 5095936, 'steps': 9952, 'batch_loss/train': 0.8935450143180788} 12/24/2021 06:58:25 - INFO - codeparrot_training - Step 9953: {'lr': 0.00045397276773539297, 'samples': 5096448, 'steps': 9953, 'batch_loss/train': 0.8030800935812294} 12/24/2021 06:58:35 - INFO - codeparrot_training - Step 9954: {'lr': 0.00045396365299684344, 'samples': 5096960, 'steps': 9954, 'batch_loss/train': 0.8439356107264757} 12/24/2021 06:58:46 - INFO - codeparrot_training - Step 9955: {'lr': 0.00045395453744741146, 'samples': 5097472, 'steps': 9955, 'batch_loss/train': 0.9428747491911054} 12/24/2021 06:58:58 - INFO - codeparrot_training - Step 9956: {'lr': 0.00045394542108713324, 'samples': 5097984, 'steps': 9956, 'batch_loss/train': 0.8459649961441755} 12/24/2021 06:59:09 - INFO - codeparrot_training - Step 9957: {'lr': 0.00045393630391604494, 'samples': 5098496, 'steps': 9957, 'batch_loss/train': 0.8214237224310637} 12/24/2021 06:59:20 - INFO - codeparrot_training - Step 9958: {'lr': 0.0004539271859341829, 'samples': 5099008, 'steps': 9958, 'batch_loss/train': 0.8571369918063283} 12/24/2021 06:59:33 - INFO - codeparrot_training - Step 9959: {'lr': 0.0004539180671415835, 'samples': 5099520, 'steps': 9959, 'batch_loss/train': 0.7859297711402178} 12/24/2021 06:59:44 - INFO - codeparrot_training - Step 9960: {'lr': 0.0004539089475382828, 'samples': 5100032, 'steps': 9960, 'batch_loss/train': 0.9279863201081753} 12/24/2021 06:59:55 - INFO - codeparrot_training - Step 9961: {'lr': 0.0004538998271243171, 'samples': 5100544, 'steps': 9961, 'batch_loss/train': 0.8660762282088399} 
12/24/2021 07:00:07 - INFO - codeparrot_training - Step 9962: {'lr': 0.0004538907058997227, 'samples': 5101056, 'steps': 9962, 'batch_loss/train': 0.8822347559034824} 12/24/2021 07:00:17 - INFO - codeparrot_training - Step 9963: {'lr': 0.0004538815838645358, 'samples': 5101568, 'steps': 9963, 'batch_loss/train': 0.7777657739352435} 12/24/2021 07:00:28 - INFO - codeparrot_training - Step 9964: {'lr': 0.00045387246101879277, 'samples': 5102080, 'steps': 9964, 'batch_loss/train': 0.8983555668964982} 12/24/2021 07:00:39 - INFO - codeparrot_training - Step 9965: {'lr': 0.0004538633373625297, 'samples': 5102592, 'steps': 9965, 'batch_loss/train': 0.8182264361530542} 12/24/2021 07:00:51 - INFO - codeparrot_training - Step 9966: {'lr': 0.0004538542128957831, 'samples': 5103104, 'steps': 9966, 'batch_loss/train': 1.046735038049519} 12/24/2021 07:01:01 - INFO - codeparrot_training - Step 9967: {'lr': 0.00045384508761858906, 'samples': 5103616, 'steps': 9967, 'batch_loss/train': 0.7672114670276642} 12/24/2021 07:01:12 - INFO - codeparrot_training - Step 9968: {'lr': 0.00045383596153098394, 'samples': 5104128, 'steps': 9968, 'batch_loss/train': 0.7261147871613503} 12/24/2021 07:01:26 - INFO - codeparrot_training - Step 9969: {'lr': 0.000453826834633004, 'samples': 5104640, 'steps': 9969, 'batch_loss/train': 1.5988618759438396} 12/24/2021 07:01:36 - INFO - codeparrot_training - Step 9970: {'lr': 0.0004538177069246855, 'samples': 5105152, 'steps': 9970, 'batch_loss/train': 0.8052012752741575} 12/24/2021 07:01:47 - INFO - codeparrot_training - Step 9971: {'lr': 0.00045380857840606484, 'samples': 5105664, 'steps': 9971, 'batch_loss/train': 0.7705423785373569} 12/24/2021 07:01:59 - INFO - codeparrot_training - Step 9972: {'lr': 0.0004537994490771781, 'samples': 5106176, 'steps': 9972, 'batch_loss/train': 0.8624460892751813} 12/24/2021 07:02:10 - INFO - codeparrot_training - Step 9973: {'lr': 0.00045379031893806176, 'samples': 5106688, 'steps': 9973, 'batch_loss/train': 
1.1622991282492876} 12/24/2021 07:02:21 - INFO - codeparrot_training - Step 9974: {'lr': 0.0004537811879887521, 'samples': 5107200, 'steps': 9974, 'batch_loss/train': 0.6822924399748445} 12/24/2021 07:02:31 - INFO - codeparrot_training - Step 9975: {'lr': 0.00045377205622928535, 'samples': 5107712, 'steps': 9975, 'batch_loss/train': 0.7919268552213907} 12/24/2021 07:02:43 - INFO - codeparrot_training - Step 9976: {'lr': 0.00045376292365969785, 'samples': 5108224, 'steps': 9976, 'batch_loss/train': 0.8561796126887202} 12/24/2021 07:02:54 - INFO - codeparrot_training - Step 9977: {'lr': 0.00045375379028002587, 'samples': 5108736, 'steps': 9977, 'batch_loss/train': 0.7809690446592867} 12/24/2021 07:03:04 - INFO - codeparrot_training - Step 9978: {'lr': 0.0004537446560903058, 'samples': 5109248, 'steps': 9978, 'batch_loss/train': 0.8433509797323495} 12/24/2021 07:03:18 - INFO - codeparrot_training - Step 9979: {'lr': 0.00045373552109057393, 'samples': 5109760, 'steps': 9979, 'batch_loss/train': 0.7565546619007364} 12/24/2021 07:03:29 - INFO - codeparrot_training - Step 9980: {'lr': 0.0004537263852808665, 'samples': 5110272, 'steps': 9980, 'batch_loss/train': 0.806799964979291} 12/24/2021 07:03:39 - INFO - codeparrot_training - Step 9981: {'lr': 0.0004537172486612199, 'samples': 5110784, 'steps': 9981, 'batch_loss/train': 0.8870962532237172} 12/24/2021 07:03:51 - INFO - codeparrot_training - Step 9982: {'lr': 0.00045370811123167045, 'samples': 5111296, 'steps': 9982, 'batch_loss/train': 0.6935918107628822} 12/24/2021 07:04:02 - INFO - codeparrot_training - Step 9983: {'lr': 0.0004536989729922544, 'samples': 5111808, 'steps': 9983, 'batch_loss/train': 0.8531123785069212} 12/24/2021 07:04:13 - INFO - codeparrot_training - Step 9984: {'lr': 0.00045368983394300825, 'samples': 5112320, 'steps': 9984, 'batch_loss/train': 0.8160441755317152} 12/24/2021 07:04:23 - INFO - codeparrot_training - Step 9985: {'lr': 0.0004536806940839682, 'samples': 5112832, 'steps': 9985, 
'batch_loss/train': 0.8743407651782036} 12/24/2021 07:04:35 - INFO - codeparrot_training - Step 9986: {'lr': 0.0004536715534151706, 'samples': 5113344, 'steps': 9986, 'batch_loss/train': 0.8123650378547609} 12/24/2021 07:04:46 - INFO - codeparrot_training - Step 9987: {'lr': 0.0004536624119366518, 'samples': 5113856, 'steps': 9987, 'batch_loss/train': 0.7674467675387859} 12/24/2021 07:04:56 - INFO - codeparrot_training - Step 9988: {'lr': 0.00045365326964844814, 'samples': 5114368, 'steps': 9988, 'batch_loss/train': 0.766269407235086} 12/24/2021 07:05:10 - INFO - codeparrot_training - Step 9989: {'lr': 0.0004536441265505961, 'samples': 5114880, 'steps': 9989, 'batch_loss/train': 0.7696530409157276} 12/24/2021 07:05:21 - INFO - codeparrot_training - Step 9990: {'lr': 0.00045363498264313175, 'samples': 5115392, 'steps': 9990, 'batch_loss/train': 0.8577555203810334} 12/24/2021 07:05:31 - INFO - codeparrot_training - Step 9991: {'lr': 0.0004536258379260916, 'samples': 5115904, 'steps': 9991, 'batch_loss/train': 0.7547545442357659} 12/24/2021 07:05:44 - INFO - codeparrot_training - Step 9992: {'lr': 0.0004536166923995121, 'samples': 5116416, 'steps': 9992, 'batch_loss/train': 0.735021716915071} 12/24/2021 07:05:54 - INFO - codeparrot_training - Step 9993: {'lr': 0.00045360754606342947, 'samples': 5116928, 'steps': 9993, 'batch_loss/train': 0.8807774819433689} 12/24/2021 07:06:05 - INFO - codeparrot_training - Step 9994: {'lr': 0.0004535983989178801, 'samples': 5117440, 'steps': 9994, 'batch_loss/train': 0.8011497594416142} 12/24/2021 07:06:15 - INFO - codeparrot_training - Step 9995: {'lr': 0.0004535892509629004, 'samples': 5117952, 'steps': 9995, 'batch_loss/train': 0.8677449263632298} 12/24/2021 07:06:28 - INFO - codeparrot_training - Step 9996: {'lr': 0.00045358010219852675, 'samples': 5118464, 'steps': 9996, 'batch_loss/train': 0.7966126864776015} 12/24/2021 07:06:38 - INFO - codeparrot_training - Step 9997: {'lr': 0.00045357095262479544, 'samples': 5118976, 
'steps': 9997, 'batch_loss/train': 0.8119243765249848} 12/24/2021 07:06:49 - INFO - codeparrot_training - Step 9998: {'lr': 0.0004535618022417428, 'samples': 5119488, 'steps': 9998, 'batch_loss/train': 0.820300068706274} 12/24/2021 07:07:02 - INFO - codeparrot_training - Step 9999: {'lr': 0.0004535526510494054, 'samples': 5120000, 'steps': 9999, 'batch_loss/train': 0.779665807262063} 12/24/2021 07:07:02 - INFO - codeparrot_training - Evaluating and saving model checkpoint 12/24/2021 07:10:22 - INFO - codeparrot_training - Step 10000: {'loss/eval': 0.8361794948577881, 'perplexity': 2.3075342178344727} 12/24/2021 07:10:40 - WARNING - huggingface_hub.repository - Several commits (2) will be pushed upstream. 12/24/2021 07:10:53 - INFO - codeparrot_training - Step 10000: {'lr': 0.0004535434990478195, 'samples': 5120512, 'steps': 10000, 'batch_loss/train': 0.7384463627822697} 12/24/2021 07:11:03 - INFO - codeparrot_training - Step 10001: {'lr': 0.0004535343462370215, 'samples': 5121024, 'steps': 10001, 'batch_loss/train': 0.8731622397899628} 12/24/2021 07:11:16 - INFO - codeparrot_training - Step 10002: {'lr': 0.0004535251926170478, 'samples': 5121536, 'steps': 10002, 'batch_loss/train': 0.9016392081975937} 12/24/2021 07:11:26 - INFO - codeparrot_training - Step 10003: {'lr': 0.00045351603818793475, 'samples': 5122048, 'steps': 10003, 'batch_loss/train': 0.7959101144224405} 12/24/2021 07:11:37 - INFO - codeparrot_training - Step 10004: {'lr': 0.00045350688294971877, 'samples': 5122560, 'steps': 10004, 'batch_loss/train': 0.7421799537260085} 12/24/2021 07:11:47 - INFO - codeparrot_training - Step 10005: {'lr': 0.00045349772690243627, 'samples': 5123072, 'steps': 10005, 'batch_loss/train': 0.8387846688274294} 12/24/2021 07:12:01 - INFO - codeparrot_training - Step 10006: {'lr': 0.00045348857004612374, 'samples': 5123584, 'steps': 10006, 'batch_loss/train': 0.8574999624397606} 12/24/2021 07:12:12 - INFO - codeparrot_training - Step 10007: {'lr': 0.00045347941238081733, 
'samples': 5124096, 'steps': 10007, 'batch_loss/train': 0.7256192564964294} 12/24/2021 07:12:22 - INFO - codeparrot_training - Step 10008: {'lr': 0.0004534702539065537, 'samples': 5124608, 'steps': 10008, 'batch_loss/train': 0.8157481774687767} 12/24/2021 07:12:34 - INFO - codeparrot_training - Step 10009: {'lr': 0.00045346109462336925, 'samples': 5125120, 'steps': 10009, 'batch_loss/train': 0.7270233146846294} 12/24/2021 07:12:45 - INFO - codeparrot_training - Step 10010: {'lr': 0.00045345193453130017, 'samples': 5125632, 'steps': 10010, 'batch_loss/train': 0.980207584798336} 12/24/2021 07:12:56 - INFO - codeparrot_training - Step 10011: {'lr': 0.0004534427736303831, 'samples': 5126144, 'steps': 10011, 'batch_loss/train': 0.8519828296266496} 12/24/2021 07:13:08 - INFO - codeparrot_training - Step 10012: {'lr': 0.00045343361192065433, 'samples': 5126656, 'steps': 10012, 'batch_loss/train': 0.887370047159493} 12/24/2021 07:13:18 - INFO - codeparrot_training - Step 10013: {'lr': 0.0004534244494021504, 'samples': 5127168, 'steps': 10013, 'batch_loss/train': 0.7606319226324558} 12/24/2021 07:13:29 - INFO - codeparrot_training - Step 10014: {'lr': 0.0004534152860749076, 'samples': 5127680, 'steps': 10014, 'batch_loss/train': 0.9046294093132019} 12/24/2021 07:13:41 - INFO - codeparrot_training - Step 10015: {'lr': 0.00045340612193896246, 'samples': 5128192, 'steps': 10015, 'batch_loss/train': 0.7700536062475294} 12/24/2021 07:13:52 - INFO - codeparrot_training - Step 10016: {'lr': 0.00045339695699435134, 'samples': 5128704, 'steps': 10016, 'batch_loss/train': 0.9798019295558333} 12/24/2021 07:14:02 - INFO - codeparrot_training - Step 10017: {'lr': 0.00045338779124111075, 'samples': 5129216, 'steps': 10017, 'batch_loss/train': 0.8092390866950154} 12/24/2021 07:14:13 - INFO - codeparrot_training - Step 10018: {'lr': 0.00045337862467927716, 'samples': 5129728, 'steps': 10018, 'batch_loss/train': 0.8123644669540226} 12/24/2021 07:14:27 - INFO - codeparrot_training - Step 
10019: {'lr': 0.0004533694573088869, 'samples': 5130240, 'steps': 10019, 'batch_loss/train': 0.7045790534466505} 12/24/2021 07:14:37 - INFO - codeparrot_training - Step 10020: {'lr': 0.0004533602891299764, 'samples': 5130752, 'steps': 10020, 'batch_loss/train': 0.8407152853906155} 12/24/2021 07:14:48 - INFO - codeparrot_training - Step 10021: {'lr': 0.0004533511201425823, 'samples': 5131264, 'steps': 10021, 'batch_loss/train': 0.8274174826219678} 12/24/2021 07:15:00 - INFO - codeparrot_training - Step 10022: {'lr': 0.00045334195034674087, 'samples': 5131776, 'steps': 10022, 'batch_loss/train': 0.7748201051726937} 12/24/2021 07:15:11 - INFO - codeparrot_training - Step 10023: {'lr': 0.00045333277974248854, 'samples': 5132288, 'steps': 10023, 'batch_loss/train': 0.5468731331638992} 12/24/2021 07:15:22 - INFO - codeparrot_training - Step 10024: {'lr': 0.00045332360832986196, 'samples': 5132800, 'steps': 10024, 'batch_loss/train': 0.7365310275927186} 12/24/2021 07:15:33 - INFO - codeparrot_training - Step 10025: {'lr': 0.00045331443610889744, 'samples': 5133312, 'steps': 10025, 'batch_loss/train': 0.7461051498539746} 12/24/2021 07:15:44 - INFO - codeparrot_training - Step 10026: {'lr': 0.0004533052630796315, 'samples': 5133824, 'steps': 10026, 'batch_loss/train': 0.8719005426391959} 12/24/2021 07:15:55 - INFO - codeparrot_training - Step 10027: {'lr': 0.00045329608924210064, 'samples': 5134336, 'steps': 10027, 'batch_loss/train': 0.9068466420285404} 12/24/2021 07:16:05 - INFO - codeparrot_training - Step 10028: {'lr': 0.00045328691459634127, 'samples': 5134848, 'steps': 10028, 'batch_loss/train': 0.8361698342487216} 12/24/2021 07:16:19 - INFO - codeparrot_training - Step 10029: {'lr': 0.00045327773914238985, 'samples': 5135360, 'steps': 10029, 'batch_loss/train': 0.8803750462830067} 12/24/2021 07:16:30 - INFO - codeparrot_training - Step 10030: {'lr': 0.00045326856288028296, 'samples': 5135872, 'steps': 10030, 'batch_loss/train': 1.063990918919444} 12/24/2021 07:16:40 
- INFO - codeparrot_training - Step 10031: {'lr': 0.00045325938581005694, 'samples': 5136384, 'steps': 10031, 'batch_loss/train': 0.8592365682125092} 12/24/2021 07:16:52 - INFO - codeparrot_training - Step 10032: {'lr': 0.00045325020793174836, 'samples': 5136896, 'steps': 10032, 'batch_loss/train': 0.8187144016847014} 12/24/2021 07:17:03 - INFO - codeparrot_training - Step 10033: {'lr': 0.00045324102924539376, 'samples': 5137408, 'steps': 10033, 'batch_loss/train': 0.7807132680900395} 12/24/2021 07:17:14 - INFO - codeparrot_training - Step 10034: {'lr': 0.0004532318497510295, 'samples': 5137920, 'steps': 10034, 'batch_loss/train': 0.8118524332530797} 12/24/2021 07:17:28 - INFO - codeparrot_training - Step 10035: {'lr': 0.00045322266944869215, 'samples': 5138432, 'steps': 10035, 'batch_loss/train': 0.8186742458492517} 12/24/2021 07:17:38 - INFO - codeparrot_training - Step 10036: {'lr': 0.00045321348833841826, 'samples': 5138944, 'steps': 10036, 'batch_loss/train': 0.8134125433862209} 12/24/2021 07:17:49 - INFO - codeparrot_training - Step 10037: {'lr': 0.00045320430642024423, 'samples': 5139456, 'steps': 10037, 'batch_loss/train': 0.7897336520254612} 12/24/2021 07:17:59 - INFO - codeparrot_training - Step 10038: {'lr': 0.0004531951236942066, 'samples': 5139968, 'steps': 10038, 'batch_loss/train': 0.8743872595950961} 12/24/2021 07:18:12 - INFO - codeparrot_training - Step 10039: {'lr': 0.0004531859401603419, 'samples': 5140480, 'steps': 10039, 'batch_loss/train': 0.9122858010232449} 12/24/2021 07:18:22 - INFO - codeparrot_training - Step 10040: {'lr': 0.0004531767558186866, 'samples': 5140992, 'steps': 10040, 'batch_loss/train': 0.9410342546179891} 12/24/2021 07:18:33 - INFO - codeparrot_training - Step 10041: {'lr': 0.0004531675706692773, 'samples': 5141504, 'steps': 10041, 'batch_loss/train': 0.836743775755167} 12/24/2021 07:18:45 - INFO - codeparrot_training - Step 10042: {'lr': 0.0004531583847121503, 'samples': 5142016, 'steps': 10042, 'batch_loss/train': 
0.8470422090031207} 12/24/2021 07:18:56 - INFO - codeparrot_training - Step 10043: {'lr': 0.00045314919794734246, 'samples': 5142528, 'steps': 10043, 'batch_loss/train': 0.8030982539057732} 12/24/2021 07:19:06 - INFO - codeparrot_training - Step 10044: {'lr': 0.00045314001037489, 'samples': 5143040, 'steps': 10044, 'batch_loss/train': 0.8329415237531066} 12/24/2021 07:19:20 - INFO - codeparrot_training - Step 10045: {'lr': 0.00045313082199482956, 'samples': 5143552, 'steps': 10045, 'batch_loss/train': 0.926754754036665} 12/24/2021 07:19:30 - INFO - codeparrot_training - Step 10046: {'lr': 0.00045312163280719777, 'samples': 5144064, 'steps': 10046, 'batch_loss/train': 0.8025999795645475} 12/24/2021 07:19:41 - INFO - codeparrot_training - Step 10047: {'lr': 0.00045311244281203095, 'samples': 5144576, 'steps': 10047, 'batch_loss/train': 0.8698282688856125} 12/24/2021 07:19:52 - INFO - codeparrot_training - Step 10048: {'lr': 0.0004531032520093658, 'samples': 5145088, 'steps': 10048, 'batch_loss/train': 0.8233818635344505} 12/24/2021 07:20:04 - INFO - codeparrot_training - Step 10049: {'lr': 0.0004530940603992388, 'samples': 5145600, 'steps': 10049, 'batch_loss/train': 0.8295921124517918} 12/24/2021 07:20:15 - INFO - codeparrot_training - Step 10050: {'lr': 0.00045308486798168644, 'samples': 5146112, 'steps': 10050, 'batch_loss/train': 0.887915451079607} 12/24/2021 07:20:25 - INFO - codeparrot_training - Step 10051: {'lr': 0.0004530756747567455, 'samples': 5146624, 'steps': 10051, 'batch_loss/train': 0.8592758062295616} 12/24/2021 07:20:37 - INFO - codeparrot_training - Step 10052: {'lr': 0.00045306648072445223, 'samples': 5147136, 'steps': 10052, 'batch_loss/train': 0.7690739911049604} 12/24/2021 07:20:48 - INFO - codeparrot_training - Step 10053: {'lr': 0.0004530572858848433, 'samples': 5147648, 'steps': 10053, 'batch_loss/train': 0.7960900254547596} 12/24/2021 07:20:58 - INFO - codeparrot_training - Step 10054: {'lr': 0.00045304809023795524, 'samples': 5148160, 
'steps': 10054, 'batch_loss/train': 0.7819537790492177} 12/24/2021 07:21:10 - INFO - codeparrot_training - Step 10055: {'lr': 0.0004530388937838247, 'samples': 5148672, 'steps': 10055, 'batch_loss/train': 0.7360132756875828} 12/24/2021 07:21:21 - INFO - codeparrot_training - Step 10056: {'lr': 0.0004530296965224882, 'samples': 5149184, 'steps': 10056, 'batch_loss/train': 0.7888494438957423} 12/24/2021 07:21:32 - INFO - codeparrot_training - Step 10057: {'lr': 0.0004530204984539823, 'samples': 5149696, 'steps': 10057, 'batch_loss/train': 0.8448001062497497} 12/24/2021 07:21:46 - INFO - codeparrot_training - Step 10058: {'lr': 0.00045301129957834354, 'samples': 5150208, 'steps': 10058, 'batch_loss/train': 0.8280531796626747} 12/24/2021 07:21:56 - INFO - codeparrot_training - Step 10059: {'lr': 0.0004530020998956085, 'samples': 5150720, 'steps': 10059, 'batch_loss/train': 0.7308527762070298} 12/24/2021 07:22:07 - INFO - codeparrot_training - Step 10060: {'lr': 0.00045299289940581365, 'samples': 5151232, 'steps': 10060, 'batch_loss/train': 0.8160730595700443} 12/24/2021 07:22:18 - INFO - codeparrot_training - Step 10061: {'lr': 0.0004529836981089958, 'samples': 5151744, 'steps': 10061, 'batch_loss/train': 1.0161210680380464} 12/24/2021 07:22:30 - INFO - codeparrot_training - Step 10062: {'lr': 0.00045297449600519143, 'samples': 5152256, 'steps': 10062, 'batch_loss/train': 0.8237703992053866} 12/24/2021 07:22:40 - INFO - codeparrot_training - Step 10063: {'lr': 0.000452965293094437, 'samples': 5152768, 'steps': 10063, 'batch_loss/train': 0.853280515410006} 12/24/2021 07:22:51 - INFO - codeparrot_training - Step 10064: {'lr': 0.00045295608937676935, 'samples': 5153280, 'steps': 10064, 'batch_loss/train': 0.804106748662889} 12/24/2021 07:23:05 - INFO - codeparrot_training - Step 10065: {'lr': 0.00045294688485222477, 'samples': 5153792, 'steps': 10065, 'batch_loss/train': 0.7881203233264387} 12/24/2021 07:23:16 - INFO - codeparrot_training - Step 10066: {'lr': 
0.0004529376795208401, 'samples': 5154304, 'steps': 10066, 'batch_loss/train': 0.7749195583164692} 12/24/2021 07:23:26 - INFO - codeparrot_training - Step 10067: {'lr': 0.00045292847338265183, 'samples': 5154816, 'steps': 10067, 'batch_loss/train': 0.8057665126398206} 12/24/2021 07:23:38 - INFO - codeparrot_training - Step 10068: {'lr': 0.00045291926643769654, 'samples': 5155328, 'steps': 10068, 'batch_loss/train': 0.7355767845874652} 12/24/2021 07:23:49 - INFO - codeparrot_training - Step 10069: {'lr': 0.00045291005868601086, 'samples': 5155840, 'steps': 10069, 'batch_loss/train': 0.8797267940826714} 12/24/2021 07:23:59 - INFO - codeparrot_training - Step 10070: {'lr': 0.0004529008501276315, 'samples': 5156352, 'steps': 10070, 'batch_loss/train': 0.9784447494894266} 12/24/2021 07:24:10 - INFO - codeparrot_training - Step 10071: {'lr': 0.0004528916407625948, 'samples': 5156864, 'steps': 10071, 'batch_loss/train': 0.8700631055980921} 12/24/2021 07:24:22 - INFO - codeparrot_training - Step 10072: {'lr': 0.0004528824305909377, 'samples': 5157376, 'steps': 10072, 'batch_loss/train': 0.8832313362509012} 12/24/2021 07:24:33 - INFO - codeparrot_training - Step 10073: {'lr': 0.0004528732196126966, 'samples': 5157888, 'steps': 10073, 'batch_loss/train': 0.7289623375982046} 12/24/2021 07:24:43 - INFO - codeparrot_training - Step 10074: {'lr': 0.00045286400782790817, 'samples': 5158400, 'steps': 10074, 'batch_loss/train': 0.823882938362658} 12/24/2021 07:24:58 - INFO - codeparrot_training - Step 10075: {'lr': 0.00045285479523660913, 'samples': 5158912, 'steps': 10075, 'batch_loss/train': 0.8409622204490006} 12/24/2021 07:25:08 - INFO - codeparrot_training - Step 10076: {'lr': 0.0004528455818388359, 'samples': 5159424, 'steps': 10076, 'batch_loss/train': 0.839268522337079} 12/24/2021 07:25:19 - INFO - codeparrot_training - Step 10077: {'lr': 0.00045283636763462524, 'samples': 5159936, 'steps': 10077, 'batch_loss/train': 0.7417753054760396} 12/24/2021 07:25:30 - INFO - 
codeparrot_training - Step 10078: {'lr': 0.00045282715262401384, 'samples': 5160448, 'steps': 10078, 'batch_loss/train': 0.6714838650077581} 12/24/2021 07:25:42 - INFO - codeparrot_training - Step 10079: {'lr': 0.0004528179368070382, 'samples': 5160960, 'steps': 10079, 'batch_loss/train': 0.8907219050452113} 12/24/2021 07:25:53 - INFO - codeparrot_training - Step 10080: {'lr': 0.0004528087201837351, 'samples': 5161472, 'steps': 10080, 'batch_loss/train': 0.646483794786036} 12/24/2021 07:26:03 - INFO - codeparrot_training - Step 10081: {'lr': 0.00045279950275414104, 'samples': 5161984, 'steps': 10081, 'batch_loss/train': 0.833185812458396} 12/24/2021 07:26:15 - INFO - codeparrot_training - Step 10082: {'lr': 0.0004527902845182927, 'samples': 5162496, 'steps': 10082, 'batch_loss/train': 0.844594944268465} 12/24/2021 07:26:26 - INFO - codeparrot_training - Step 10083: {'lr': 0.0004527810654762269, 'samples': 5163008, 'steps': 10083, 'batch_loss/train': 0.8578283013775945} 12/24/2021 07:26:37 - INFO - codeparrot_training - Step 10084: {'lr': 0.00045277184562798, 'samples': 5163520, 'steps': 10084, 'batch_loss/train': 0.9161212421022356} 12/24/2021 07:26:49 - INFO - codeparrot_training - Step 10085: {'lr': 0.0004527626249735889, 'samples': 5164032, 'steps': 10085, 'batch_loss/train': 0.8190567409619689} 12/24/2021 07:26:59 - INFO - codeparrot_training - Step 10086: {'lr': 0.00045275340351309015, 'samples': 5164544, 'steps': 10086, 'batch_loss/train': 0.7898024562746286} 12/24/2021 07:27:10 - INFO - codeparrot_training - Step 10087: {'lr': 0.0004527441812465204, 'samples': 5165056, 'steps': 10087, 'batch_loss/train': 0.9124792814254761} 12/24/2021 07:27:20 - INFO - codeparrot_training - Step 10088: {'lr': 0.00045273495817391634, 'samples': 5165568, 'steps': 10088, 'batch_loss/train': 0.8289537830278277} 12/24/2021 07:27:34 - INFO - codeparrot_training - Step 10089: {'lr': 0.00045272573429531467, 'samples': 5166080, 'steps': 10089, 'batch_loss/train': 0.6348412521183491} 
12/24/2021 07:27:45 - INFO - codeparrot_training - Step 10090: {'lr': 0.00045271650961075196, 'samples': 5166592, 'steps': 10090, 'batch_loss/train': 0.9243029146455228} 12/24/2021 07:27:56 - INFO - codeparrot_training - Step 10091: {'lr': 0.00045270728412026497, 'samples': 5167104, 'steps': 10091, 'batch_loss/train': 0.8729154216125607} 12/24/2021 07:28:08 - INFO - codeparrot_training - Step 10092: {'lr': 0.0004526980578238904, 'samples': 5167616, 'steps': 10092, 'batch_loss/train': 0.7847422672202811} 12/24/2021 07:28:18 - INFO - codeparrot_training - Step 10093: {'lr': 0.0004526888307216649, 'samples': 5168128, 'steps': 10093, 'batch_loss/train': 0.8348638716852292} 12/24/2021 07:28:29 - INFO - codeparrot_training - Step 10094: {'lr': 0.0004526796028136251, 'samples': 5168640, 'steps': 10094, 'batch_loss/train': 0.8527101911604404} 12/24/2021 07:28:41 - INFO - codeparrot_training - Step 10095: {'lr': 0.00045267037409980773, 'samples': 5169152, 'steps': 10095, 'batch_loss/train': 0.8223016960546374} 12/24/2021 07:28:51 - INFO - codeparrot_training - Step 10096: {'lr': 0.0004526611445802495, 'samples': 5169664, 'steps': 10096, 'batch_loss/train': 0.8311232556588948} 12/24/2021 07:29:02 - INFO - codeparrot_training - Step 10097: {'lr': 0.000452651914254987, 'samples': 5170176, 'steps': 10097, 'batch_loss/train': 0.9468939621001482} 12/24/2021 07:29:17 - INFO - codeparrot_training - Step 10098: {'lr': 0.0004526426831240571, 'samples': 5170688, 'steps': 10098, 'batch_loss/train': 0.86811697576195} 12/24/2021 07:29:28 - INFO - codeparrot_training - Step 10099: {'lr': 0.00045263345118749636, 'samples': 5171200, 'steps': 10099, 'batch_loss/train': 0.7917297640815377} 12/24/2021 07:29:38 - INFO - codeparrot_training - Step 10100: {'lr': 0.00045262421844534156, 'samples': 5171712, 'steps': 10100, 'batch_loss/train': 0.7075430077966303} 12/24/2021 07:29:49 - INFO - codeparrot_training - Step 10101: {'lr': 0.00045261498489762944, 'samples': 5172224, 'steps': 10101, 
'batch_loss/train': 0.758405345492065} 12/24/2021 07:30:01 - INFO - codeparrot_training - Step 10102: {'lr': 0.00045260575054439646, 'samples': 5172736, 'steps': 10102, 'batch_loss/train': 0.8015704997815192} 12/24/2021 07:30:12 - INFO - codeparrot_training - Step 10103: {'lr': 0.0004525965153856797, 'samples': 5173248, 'steps': 10103, 'batch_loss/train': 0.7968780760420486} 12/24/2021 07:30:22 - INFO - codeparrot_training - Step 10104: {'lr': 0.00045258727942151556, 'samples': 5173760, 'steps': 10104, 'batch_loss/train': 0.8237467309227213} 12/24/2021 07:30:34 - INFO - codeparrot_training - Step 10105: {'lr': 0.000452578042651941, 'samples': 5174272, 'steps': 10105, 'batch_loss/train': 0.8711291095241904} 12/24/2021 07:30:45 - INFO - codeparrot_training - Step 10106: {'lr': 0.0004525688050769926, 'samples': 5174784, 'steps': 10106, 'batch_loss/train': 0.8100786830764264} 12/24/2021 07:30:56 - INFO - codeparrot_training - Step 10107: {'lr': 0.0004525595666967071, 'samples': 5175296, 'steps': 10107, 'batch_loss/train': 0.8700910205952823} 12/24/2021 07:31:09 - INFO - codeparrot_training - Step 10108: {'lr': 0.00045255032751112116, 'samples': 5175808, 'steps': 10108, 'batch_loss/train': 0.9217575811780989} 12/24/2021 07:31:20 - INFO - codeparrot_training - Step 10109: {'lr': 0.00045254108752027167, 'samples': 5176320, 'steps': 10109, 'batch_loss/train': 0.7760897446423769} 12/24/2021 07:31:31 - INFO - codeparrot_training - Step 10110: {'lr': 0.00045253184672419536, 'samples': 5176832, 'steps': 10110, 'batch_loss/train': 0.9292285852134228} 12/24/2021 07:31:41 - INFO - codeparrot_training - Step 10111: {'lr': 0.00045252260512292883, 'samples': 5177344, 'steps': 10111, 'batch_loss/train': 0.8638786831870675} 12/24/2021 07:31:53 - INFO - codeparrot_training - Step 10112: {'lr': 0.0004525133627165089, 'samples': 5177856, 'steps': 10112, 'batch_loss/train': 0.8382687224075198} 12/24/2021 07:32:04 - INFO - codeparrot_training - Step 10113: {'lr': 0.0004525041195049723, 
'samples': 5178368, 'steps': 10113, 'batch_loss/train': 0.8630569819360971} 12/24/2021 07:32:15 - INFO - codeparrot_training - Step 10114: {'lr': 0.0004524948754883558, 'samples': 5178880, 'steps': 10114, 'batch_loss/train': 0.9624964110553265} 12/24/2021 07:32:27 - INFO - codeparrot_training - Step 10115: {'lr': 0.00045248563066669614, 'samples': 5179392, 'steps': 10115, 'batch_loss/train': 0.7550191027112305} 12/24/2021 07:32:37 - INFO - codeparrot_training - Step 10116: {'lr': 0.0004524763850400301, 'samples': 5179904, 'steps': 10116, 'batch_loss/train': 0.7920953119173646} 12/24/2021 07:32:48 - INFO - codeparrot_training - Step 10117: {'lr': 0.0004524671386083943, 'samples': 5180416, 'steps': 10117, 'batch_loss/train': 0.7800825322046876} 12/24/2021 07:33:02 - INFO - codeparrot_training - Step 10118: {'lr': 0.0004524578913718257, 'samples': 5180928, 'steps': 10118, 'batch_loss/train': 0.8213497446849942} 12/24/2021 07:33:13 - INFO - codeparrot_training - Step 10119: {'lr': 0.000452448643330361, 'samples': 5181440, 'steps': 10119, 'batch_loss/train': 0.7496966936159879} 12/24/2021 07:33:23 - INFO - codeparrot_training - Step 10120: {'lr': 0.0004524393944840368, 'samples': 5181952, 'steps': 10120, 'batch_loss/train': 0.8321598377078772} 12/24/2021 07:33:34 - INFO - codeparrot_training - Step 10121: {'lr': 0.00045243014483289007, 'samples': 5182464, 'steps': 10121, 'batch_loss/train': 0.6531371977180243} 12/24/2021 07:33:46 - INFO - codeparrot_training - Step 10122: {'lr': 0.00045242089437695757, 'samples': 5182976, 'steps': 10122, 'batch_loss/train': 0.8507469352334738} 12/24/2021 07:33:56 - INFO - codeparrot_training - Step 10123: {'lr': 0.0004524116431162759, 'samples': 5183488, 'steps': 10123, 'batch_loss/train': 0.7987604746595025} 12/24/2021 07:34:07 - INFO - codeparrot_training - Step 10124: {'lr': 0.0004524023910508821, 'samples': 5184000, 'steps': 10124, 'batch_loss/train': 0.6877826352138072} 12/24/2021 07:34:21 - INFO - codeparrot_training - Step 10125: 
{'lr': 0.0004523931381808128, 'samples': 5184512, 'steps': 10125, 'batch_loss/train': 0.7711085876217112} 12/24/2021 07:34:32 - INFO - codeparrot_training - Step 10126: {'lr': 0.00045238388450610475, 'samples': 5185024, 'steps': 10126, 'batch_loss/train': 0.8603158053010702} 12/24/2021 07:34:42 - INFO - codeparrot_training - Step 10127: {'lr': 0.0004523746300267948, 'samples': 5185536, 'steps': 10127, 'batch_loss/train': 0.7841445589438081} 12/24/2021 07:34:54 - INFO - codeparrot_training - Step 10128: {'lr': 0.00045236537474291973, 'samples': 5186048, 'steps': 10128, 'batch_loss/train': 0.860504912212491} 12/24/2021 07:35:05 - INFO - codeparrot_training - Step 10129: {'lr': 0.0004523561186545164, 'samples': 5186560, 'steps': 10129, 'batch_loss/train': 0.7541828127577901} 12/24/2021 07:35:15 - INFO - codeparrot_training - Step 10130: {'lr': 0.0004523468617616215, 'samples': 5187072, 'steps': 10130, 'batch_loss/train': 0.8550748387351632} 12/24/2021 07:35:26 - INFO - codeparrot_training - Step 10131: {'lr': 0.00045233760406427183, 'samples': 5187584, 'steps': 10131, 'batch_loss/train': 0.7783492486923933} 12/24/2021 07:35:38 - INFO - codeparrot_training - Step 10132: {'lr': 0.00045232834556250433, 'samples': 5188096, 'steps': 10132, 'batch_loss/train': 0.8599246330559254} 12/24/2021 07:35:48 - INFO - codeparrot_training - Step 10133: {'lr': 0.0004523190862563557, 'samples': 5188608, 'steps': 10133, 'batch_loss/train': 0.791618958581239} 12/24/2021 07:35:59 - INFO - codeparrot_training - Step 10134: {'lr': 0.0004523098261458628, 'samples': 5189120, 'steps': 10134, 'batch_loss/train': 0.8005390698090196} 12/24/2021 07:36:11 - INFO - codeparrot_training - Step 10135: {'lr': 0.00045230056523106236, 'samples': 5189632, 'steps': 10135, 'batch_loss/train': 0.6827683213632554} 12/24/2021 07:36:22 - INFO - codeparrot_training - Step 10136: {'lr': 0.0004522913035119913, 'samples': 5190144, 'steps': 10136, 'batch_loss/train': 0.8081786967813969} 12/24/2021 07:36:32 - INFO - 
codeparrot_training - Step 10137: {'lr': 0.00045228204098868643, 'samples': 5190656, 'steps': 10137, 'batch_loss/train': 0.8219251686241478} 12/24/2021 07:36:46 - INFO - codeparrot_training - Step 10138: {'lr': 0.00045227277766118445, 'samples': 5191168, 'steps': 10138, 'batch_loss/train': 0.7343961207661778} 12/24/2021 07:36:57 - INFO - codeparrot_training - Step 10139: {'lr': 0.00045226351352952235, 'samples': 5191680, 'steps': 10139, 'batch_loss/train': 0.8928267983719707} 12/24/2021 07:37:07 - INFO - codeparrot_training - Step 10140: {'lr': 0.0004522542485937369, 'samples': 5192192, 'steps': 10140, 'batch_loss/train': 0.8498469199985266} 12/24/2021 07:37:20 - INFO - codeparrot_training - Step 10141: {'lr': 0.00045224498285386484, 'samples': 5192704, 'steps': 10141, 'batch_loss/train': 0.8348737955093384} 12/24/2021 07:37:30 - INFO - codeparrot_training - Step 10142: {'lr': 0.0004522357163099432, 'samples': 5193216, 'steps': 10142, 'batch_loss/train': 0.711587633471936} 12/24/2021 07:37:41 - INFO - codeparrot_training - Step 10143: {'lr': 0.00045222644896200865, 'samples': 5193728, 'steps': 10143, 'batch_loss/train': 0.742865304928273} 12/24/2021 07:37:52 - INFO - codeparrot_training - Step 10144: {'lr': 0.0004522171808100981, 'samples': 5194240, 'steps': 10144, 'batch_loss/train': 0.8137966645881534} 12/24/2021 07:38:04 - INFO - codeparrot_training - Step 10145: {'lr': 0.00045220791185424846, 'samples': 5194752, 'steps': 10145, 'batch_loss/train': 0.6726222014985979} 12/24/2021 07:38:14 - INFO - codeparrot_training - Step 10146: {'lr': 0.00045219864209449645, 'samples': 5195264, 'steps': 10146, 'batch_loss/train': 0.7302912534214556} 12/24/2021 07:38:25 - INFO - codeparrot_training - Step 10147: {'lr': 0.00045218937153087896, 'samples': 5195776, 'steps': 10147, 'batch_loss/train': 0.6061277320841327} 12/24/2021 07:38:39 - INFO - codeparrot_training - Step 10148: {'lr': 0.00045218010016343294, 'samples': 5196288, 'steps': 10148, 'batch_loss/train': 
0.803413703572005} 12/24/2021 07:38:49 - INFO - codeparrot_training - Step 10149: {'lr': 0.0004521708279921951, 'samples': 5196800, 'steps': 10149, 'batch_loss/train': 0.9189768023788929} 12/24/2021 07:39:00 - INFO - codeparrot_training - Step 10150: {'lr': 0.00045216155501720247, 'samples': 5197312, 'steps': 10150, 'batch_loss/train': 0.6827120741363615} 12/24/2021 07:39:12 - INFO - codeparrot_training - Step 10151: {'lr': 0.00045215228123849185, 'samples': 5197824, 'steps': 10151, 'batch_loss/train': 0.8330251653678715} 12/24/2021 07:39:22 - INFO - codeparrot_training - Step 10152: {'lr': 0.0004521430066561, 'samples': 5198336, 'steps': 10152, 'batch_loss/train': 0.8049685256555676} 12/24/2021 07:39:33 - INFO - codeparrot_training - Step 10153: {'lr': 0.0004521337312700639, 'samples': 5198848, 'steps': 10153, 'batch_loss/train': 0.8245709100738168} 12/24/2021 07:39:44 - INFO - codeparrot_training - Step 10154: {'lr': 0.00045212445508042043, 'samples': 5199360, 'steps': 10154, 'batch_loss/train': 0.8472356360871345} 12/24/2021 07:39:57 - INFO - codeparrot_training - Step 10155: {'lr': 0.0004521151780872065, 'samples': 5199872, 'steps': 10155, 'batch_loss/train': 0.8183453134261072} 12/24/2021 07:40:08 - INFO - codeparrot_training - Step 10156: {'lr': 0.00045210590029045884, 'samples': 5200384, 'steps': 10156, 'batch_loss/train': 0.7691016055177897} 12/24/2021 07:40:19 - INFO - codeparrot_training - Step 10157: {'lr': 0.0004520966216902145, 'samples': 5200896, 'steps': 10157, 'batch_loss/train': 1.0115101160481572} 12/24/2021 07:40:31 - INFO - codeparrot_training - Step 10158: {'lr': 0.0004520873422865103, 'samples': 5201408, 'steps': 10158, 'batch_loss/train': 0.8065118454396725} 12/24/2021 07:40:41 - INFO - codeparrot_training - Step 10159: {'lr': 0.00045207806207938305, 'samples': 5201920, 'steps': 10159, 'batch_loss/train': 0.772226576693356} 12/24/2021 07:40:52 - INFO - codeparrot_training - Step 10160: {'lr': 0.00045206878106886976, 'samples': 5202432, 
'steps': 10160, 'batch_loss/train': 0.829184134490788} 12/24/2021 07:41:04 - INFO - codeparrot_training - Step 10161: {'lr': 0.00045205949925500733, 'samples': 5202944, 'steps': 10161, 'batch_loss/train': 0.886499559506774} 12/24/2021 07:41:15 - INFO - codeparrot_training - Step 10162: {'lr': 0.00045205021663783263, 'samples': 5203456, 'steps': 10162, 'batch_loss/train': 0.8334156973287463} 12/24/2021 07:41:25 - INFO - codeparrot_training - Step 10163: {'lr': 0.00045204093321738256, 'samples': 5203968, 'steps': 10163, 'batch_loss/train': 0.675555441644974} 12/24/2021 07:41:36 - INFO - codeparrot_training - Step 10164: {'lr': 0.000452031648993694, 'samples': 5204480, 'steps': 10164, 'batch_loss/train': 0.767580997897312} 12/24/2021 07:41:50 - INFO - codeparrot_training - Step 10165: {'lr': 0.00045202236396680386, 'samples': 5204992, 'steps': 10165, 'batch_loss/train': 0.7365834410302341} 12/24/2021 07:42:01 - INFO - codeparrot_training - Step 10166: {'lr': 0.00045201307813674916, 'samples': 5205504, 'steps': 10166, 'batch_loss/train': 0.7955602016299963} 12/24/2021 07:42:11 - INFO - codeparrot_training - Step 10167: {'lr': 0.0004520037915035667, 'samples': 5206016, 'steps': 10167, 'batch_loss/train': 0.9910332424333319} 12/24/2021 07:42:23 - INFO - codeparrot_training - Step 10168: {'lr': 0.00045199450406729344, 'samples': 5206528, 'steps': 10168, 'batch_loss/train': 0.7807694856310263} 12/24/2021 07:42:34 - INFO - codeparrot_training - Step 10169: {'lr': 0.00045198521582796625, 'samples': 5207040, 'steps': 10169, 'batch_loss/train': 0.8883673578966409} 12/24/2021 07:42:44 - INFO - codeparrot_training - Step 10170: {'lr': 0.00045197592678562217, 'samples': 5207552, 'steps': 10170, 'batch_loss/train': 0.6858114944770932} 12/24/2021 07:42:57 - INFO - codeparrot_training - Step 10171: {'lr': 0.0004519666369402981, 'samples': 5208064, 'steps': 10171, 'batch_loss/train': 0.8685567826032639} 12/24/2021 07:43:07 - INFO - codeparrot_training - Step 10172: {'lr': 
0.00045195734629203087, 'samples': 5208576, 'steps': 10172, 'batch_loss/train': 0.8468240521615371} 12/24/2021 07:43:18 - INFO - codeparrot_training - Step 10173: {'lr': 0.0004519480548408575, 'samples': 5209088, 'steps': 10173, 'batch_loss/train': 1.0350723434239626} 12/24/2021 07:43:29 - INFO - codeparrot_training - Step 10174: {'lr': 0.0004519387625868149, 'samples': 5209600, 'steps': 10174, 'batch_loss/train': 0.8249077764339745} 12/24/2021 07:43:41 - INFO - codeparrot_training - Step 10175: {'lr': 0.00045192946952994005, 'samples': 5210112, 'steps': 10175, 'batch_loss/train': 0.8250155737623572} 12/24/2021 07:43:51 - INFO - codeparrot_training - Step 10176: {'lr': 0.00045192017567026986, 'samples': 5210624, 'steps': 10176, 'batch_loss/train': 0.8343562250956893} 12/24/2021 07:44:02 - INFO - codeparrot_training - Step 10177: {'lr': 0.0004519108810078414, 'samples': 5211136, 'steps': 10177, 'batch_loss/train': 0.8678432370070368} 12/24/2021 07:44:16 - INFO - codeparrot_training - Step 10178: {'lr': 0.0004519015855426915, 'samples': 5211648, 'steps': 10178, 'batch_loss/train': 0.7938163378275931} 12/24/2021 07:44:26 - INFO - codeparrot_training - Step 10179: {'lr': 0.000451892289274857, 'samples': 5212160, 'steps': 10179, 'batch_loss/train': 0.8159626764245331} 12/24/2021 07:44:37 - INFO - codeparrot_training - Step 10180: {'lr': 0.000451882992204375, 'samples': 5212672, 'steps': 10180, 'batch_loss/train': 0.8041011686436832} 12/24/2021 07:44:49 - INFO - codeparrot_training - Step 10181: {'lr': 0.00045187369433128256, 'samples': 5213184, 'steps': 10181, 'batch_loss/train': 0.7416019970551133} 12/24/2021 07:45:00 - INFO - codeparrot_training - Step 10182: {'lr': 0.0004518643956556165, 'samples': 5213696, 'steps': 10182, 'batch_loss/train': 0.947774457745254} 12/24/2021 07:45:10 - INFO - codeparrot_training - Step 10183: {'lr': 0.0004518550961774138, 'samples': 5214208, 'steps': 10183, 'batch_loss/train': 0.764678927604109} 12/24/2021 07:45:24 - INFO - 
codeparrot_training - Step 10184: {'lr': 0.0004518457958967114, 'samples': 5214720, 'steps': 10184, 'batch_loss/train': 0.7565716598182917} 12/24/2021 07:45:35 - INFO - codeparrot_training - Step 10185: {'lr': 0.0004518364948135464, 'samples': 5215232, 'steps': 10185, 'batch_loss/train': 0.8987127877771854} 12/24/2021 07:45:45 - INFO - codeparrot_training - Step 10186: {'lr': 0.0004518271929279557, 'samples': 5215744, 'steps': 10186, 'batch_loss/train': 0.7852048621280119} 12/24/2021 07:45:56 - INFO - codeparrot_training - Step 10187: {'lr': 0.00045181789023997633, 'samples': 5216256, 'steps': 10187, 'batch_loss/train': 0.9116414990276098} 12/24/2021 07:46:08 - INFO - codeparrot_training - Step 10188: {'lr': 0.0004518085867496452, 'samples': 5216768, 'steps': 10188, 'batch_loss/train': 0.823799857404083} 12/24/2021 07:46:18 - INFO - codeparrot_training - Step 10189: {'lr': 0.0004517992824569993, 'samples': 5217280, 'steps': 10189, 'batch_loss/train': 0.7840465758927166} 12/24/2021 07:46:29 - INFO - codeparrot_training - Step 10190: {'lr': 0.0004517899773620756, 'samples': 5217792, 'steps': 10190, 'batch_loss/train': 0.8246329422108829} 12/24/2021 07:46:41 - INFO - codeparrot_training - Step 10191: {'lr': 0.0004517806714649113, 'samples': 5218304, 'steps': 10191, 'batch_loss/train': 0.8089480269700289} 12/24/2021 07:46:52 - INFO - codeparrot_training - Step 10192: {'lr': 0.0004517713647655431, 'samples': 5218816, 'steps': 10192, 'batch_loss/train': 0.8050560113042593} 12/24/2021 07:47:02 - INFO - codeparrot_training - Step 10193: {'lr': 0.00045176205726400823, 'samples': 5219328, 'steps': 10193, 'batch_loss/train': 0.8286050301976502} 12/24/2021 07:47:16 - INFO - codeparrot_training - Step 10194: {'lr': 0.00045175274896034354, 'samples': 5219840, 'steps': 10194, 'batch_loss/train': 0.8645461862906814} 12/24/2021 07:47:27 - INFO - codeparrot_training - Step 10195: {'lr': 0.0004517434398545861, 'samples': 5220352, 'steps': 10195, 'batch_loss/train': 0.818271319847554} 
12/24/2021 07:47:37 - INFO - codeparrot_training - Step 10196: {'lr': 0.00045173412994677293, 'samples': 5220864, 'steps': 10196, 'batch_loss/train': 0.8057068432681262} 12/24/2021 07:47:48 - INFO - codeparrot_training - Step 10197: {'lr': 0.000451724819236941, 'samples': 5221376, 'steps': 10197, 'batch_loss/train': 0.8423797767609358} 12/24/2021 07:48:00 - INFO - codeparrot_training - Step 10198: {'lr': 0.0004517155077251275, 'samples': 5221888, 'steps': 10198, 'batch_loss/train': 0.7995120352134109} 12/24/2021 07:48:11 - INFO - codeparrot_training - Step 10199: {'lr': 0.0004517061954113691, 'samples': 5222400, 'steps': 10199, 'batch_loss/train': 0.7586718774400651} 12/24/2021 07:48:21 - INFO - codeparrot_training - Step 10200: {'lr': 0.00045169688229570315, 'samples': 5222912, 'steps': 10200, 'batch_loss/train': 0.8215111689642072} 12/24/2021 07:48:35 - INFO - codeparrot_training - Step 10201: {'lr': 0.00045168756837816657, 'samples': 5223424, 'steps': 10201, 'batch_loss/train': 0.8303441409952939} 12/24/2021 07:48:46 - INFO - codeparrot_training - Step 10202: {'lr': 0.00045167825365879634, 'samples': 5223936, 'steps': 10202, 'batch_loss/train': 0.6923953462392092} 12/24/2021 07:48:56 - INFO - codeparrot_training - Step 10203: {'lr': 0.0004516689381376295, 'samples': 5224448, 'steps': 10203, 'batch_loss/train': 0.8087781015783548} 12/24/2021 07:49:08 - INFO - codeparrot_training - Step 10204: {'lr': 0.0004516596218147031, 'samples': 5224960, 'steps': 10204, 'batch_loss/train': 0.7944854367524385} 12/24/2021 07:49:19 - INFO - codeparrot_training - Step 10205: {'lr': 0.0004516503046900543, 'samples': 5225472, 'steps': 10205, 'batch_loss/train': 0.9153670333325863} 12/24/2021 07:49:29 - INFO - codeparrot_training - Step 10206: {'lr': 0.0004516409867637199, 'samples': 5225984, 'steps': 10206, 'batch_loss/train': 0.8367989370599389} 12/24/2021 07:49:40 - INFO - codeparrot_training - Step 10207: {'lr': 0.00045163166803573705, 'samples': 5226496, 'steps': 10207, 
'batch_loss/train': 0.7125702165067196} 12/24/2021 07:49:52 - INFO - codeparrot_training - Step 10208: {'lr': 0.00045162234850614294, 'samples': 5227008, 'steps': 10208, 'batch_loss/train': 0.8781838724389672} 12/24/2021 07:50:03 - INFO - codeparrot_training - Step 10209: {'lr': 0.00045161302817497447, 'samples': 5227520, 'steps': 10209, 'batch_loss/train': 0.7924749478697777} 12/24/2021 07:50:13 - INFO - codeparrot_training - Step 10210: {'lr': 0.0004516037070422687, 'samples': 5228032, 'steps': 10210, 'batch_loss/train': 0.8197552906349301} 12/24/2021 07:50:26 - INFO - codeparrot_training - Step 10211: {'lr': 0.0004515943851080627, 'samples': 5228544, 'steps': 10211, 'batch_loss/train': 0.8100837599486113} 12/24/2021 07:50:36 - INFO - codeparrot_training - Step 10212: {'lr': 0.0004515850623723936, 'samples': 5229056, 'steps': 10212, 'batch_loss/train': 0.7985985963605344} 12/24/2021 07:50:47 - INFO - codeparrot_training - Step 10213: {'lr': 0.00045157573883529835, 'samples': 5229568, 'steps': 10213, 'batch_loss/train': 0.7540758517570794} 12/24/2021 07:51:01 - INFO - codeparrot_training - Step 10214: {'lr': 0.00045156641449681413, 'samples': 5230080, 'steps': 10214, 'batch_loss/train': 0.7960753054358065} 12/24/2021 07:51:11 - INFO - codeparrot_training - Step 10215: {'lr': 0.0004515570893569779, 'samples': 5230592, 'steps': 10215, 'batch_loss/train': 0.8125176336616278} 12/24/2021 07:51:22 - INFO - codeparrot_training - Step 10216: {'lr': 0.0004515477634158268, 'samples': 5231104, 'steps': 10216, 'batch_loss/train': 0.8239252688363194} 12/24/2021 07:51:34 - INFO - codeparrot_training - Step 10217: {'lr': 0.0004515384366733979, 'samples': 5231616, 'steps': 10217, 'batch_loss/train': 0.6973228054121137} 12/24/2021 07:51:45 - INFO - codeparrot_training - Step 10218: {'lr': 0.00045152910912972825, 'samples': 5232128, 'steps': 10218, 'batch_loss/train': 0.7145278081297874} 12/24/2021 07:51:55 - INFO - codeparrot_training - Step 10219: {'lr': 0.000451519780784855, 
'samples': 5232640, 'steps': 10219, 'batch_loss/train': 0.7937457682564855} 12/24/2021 07:52:06 - INFO - codeparrot_training - Step 10220: {'lr': 0.00045151045163881523, 'samples': 5233152, 'steps': 10220, 'batch_loss/train': 0.8729423135519028} 12/24/2021 07:52:18 - INFO - codeparrot_training - Step 10221: {'lr': 0.0004515011216916459, 'samples': 5233664, 'steps': 10221, 'batch_loss/train': 0.7742177611216903} 12/24/2021 07:52:29 - INFO - codeparrot_training - Step 10222: {'lr': 0.00045149179094338424, 'samples': 5234176, 'steps': 10222, 'batch_loss/train': 0.8067156891338527} 12/24/2021 07:52:39 - INFO - codeparrot_training - Step 10223: {'lr': 0.0004514824593940673, 'samples': 5234688, 'steps': 10223, 'batch_loss/train': 0.8892269982025027} 12/24/2021 07:52:53 - INFO - codeparrot_training - Step 10224: {'lr': 0.0004514731270437322, 'samples': 5235200, 'steps': 10224, 'batch_loss/train': 0.8415128029300831} 12/24/2021 07:53:04 - INFO - codeparrot_training - Step 10225: {'lr': 0.000451463793892416, 'samples': 5235712, 'steps': 10225, 'batch_loss/train': 0.8357416205108166} 12/24/2021 07:53:14 - INFO - codeparrot_training - Step 10226: {'lr': 0.0004514544599401558, 'samples': 5236224, 'steps': 10226, 'batch_loss/train': 0.8530229879543185} 12/24/2021 07:53:26 - INFO - codeparrot_training - Step 10227: {'lr': 0.00045144512518698873, 'samples': 5236736, 'steps': 10227, 'batch_loss/train': 0.8862681584432721} 12/24/2021 07:53:37 - INFO - codeparrot_training - Step 10228: {'lr': 0.0004514357896329519, 'samples': 5237248, 'steps': 10228, 'batch_loss/train': 0.7423776767682284} 12/24/2021 07:53:47 - INFO - codeparrot_training - Step 10229: {'lr': 0.0004514264532780825, 'samples': 5237760, 'steps': 10229, 'batch_loss/train': 0.963329516351223} 12/24/2021 07:53:58 - INFO - codeparrot_training - Step 10230: {'lr': 0.0004514171161224175, 'samples': 5238272, 'steps': 10230, 'batch_loss/train': 0.9221114367246628} 12/24/2021 07:54:12 - INFO - codeparrot_training - Step 10231: 
{'lr': 0.0004514077781659941, 'samples': 5238784, 'steps': 10231, 'batch_loss/train': 0.7822184085380286} 12/24/2021 07:54:22 - INFO - codeparrot_training - Step 10232: {'lr': 0.0004513984394088495, 'samples': 5239296, 'steps': 10232, 'batch_loss/train': 0.8244590228423476} 12/24/2021 07:54:33 - INFO - codeparrot_training - Step 10233: {'lr': 0.00045138909985102075, 'samples': 5239808, 'steps': 10233, 'batch_loss/train': 0.8145471725147218} 12/24/2021 07:54:45 - INFO - codeparrot_training - Step 10234: {'lr': 0.0004513797594925448, 'samples': 5240320, 'steps': 10234, 'batch_loss/train': 0.8743651197291911} 12/24/2021 07:54:56 - INFO - codeparrot_training - Step 10235: {'lr': 0.00045137041833345917, 'samples': 5240832, 'steps': 10235, 'batch_loss/train': 0.8660732381977141} 12/24/2021 07:55:06 - INFO - codeparrot_training - Step 10236: {'lr': 0.00045136107637380073, 'samples': 5241344, 'steps': 10236, 'batch_loss/train': 0.8365636598318815} 12/24/2021 07:55:19 - INFO - codeparrot_training - Step 10237: {'lr': 0.00045135173361360667, 'samples': 5241856, 'steps': 10237, 'batch_loss/train': 0.8873686891747639} 12/24/2021 07:55:29 - INFO - codeparrot_training - Step 10238: {'lr': 0.0004513423900529141, 'samples': 5242368, 'steps': 10238, 'batch_loss/train': 0.8137933639809489} 12/24/2021 07:55:40 - INFO - codeparrot_training - Step 10239: {'lr': 0.00045133304569176027, 'samples': 5242880, 'steps': 10239, 'batch_loss/train': 0.7777752939146012} 12/24/2021 07:55:50 - INFO - codeparrot_training - Step 10240: {'lr': 0.00045132370053018226, 'samples': 5243392, 'steps': 10240, 'batch_loss/train': 0.8102483553811908} 12/24/2021 07:56:04 - INFO - codeparrot_training - Step 10241: {'lr': 0.00045131435456821726, 'samples': 5243904, 'steps': 10241, 'batch_loss/train': 0.8343093160074204} 12/24/2021 07:56:15 - INFO - codeparrot_training - Step 10242: {'lr': 0.00045130500780590234, 'samples': 5244416, 'steps': 10242, 'batch_loss/train': 0.9006024412810802} 12/24/2021 07:56:25 - INFO 
- codeparrot_training - Step 10243: {'lr': 0.0004512956602432747, 'samples': 5244928, 'steps': 10243, 'batch_loss/train': 0.8116131802089512} 12/24/2021 07:56:38 - INFO - codeparrot_training - Step 10244: {'lr': 0.0004512863118803716, 'samples': 5245440, 'steps': 10244, 'batch_loss/train': 0.7609433378092945} 12/24/2021 07:56:48 - INFO - codeparrot_training - Step 10245: {'lr': 0.0004512769627172301, 'samples': 5245952, 'steps': 10245, 'batch_loss/train': 0.8415548335760832} 12/24/2021 07:56:59 - INFO - codeparrot_training - Step 10246: {'lr': 0.0004512676127538874, 'samples': 5246464, 'steps': 10246, 'batch_loss/train': 0.8195787817239761} 12/24/2021 07:57:11 - INFO - codeparrot_training - Step 10247: {'lr': 0.0004512582619903807, 'samples': 5246976, 'steps': 10247, 'batch_loss/train': 0.8116241740062833} 12/24/2021 07:57:22 - INFO - codeparrot_training - Step 10248: {'lr': 0.0004512489104267471, 'samples': 5247488, 'steps': 10248, 'batch_loss/train': 0.7909357488388196} 12/24/2021 07:57:32 - INFO - codeparrot_training - Step 10249: {'lr': 0.0004512395580630238, 'samples': 5248000, 'steps': 10249, 'batch_loss/train': 0.7565472349524498} 12/24/2021 07:57:43 - INFO - codeparrot_training - Step 10250: {'lr': 0.00045123020489924807, 'samples': 5248512, 'steps': 10250, 'batch_loss/train': 1.334185654297471} 12/24/2021 07:57:55 - INFO - codeparrot_training - Step 10251: {'lr': 0.0004512208509354569, 'samples': 5249024, 'steps': 10251, 'batch_loss/train': 0.8100059786811471} 12/24/2021 07:58:05 - INFO - codeparrot_training - Step 10252: {'lr': 0.0004512114961716877, 'samples': 5249536, 'steps': 10252, 'batch_loss/train': 0.7752942675724626} 12/24/2021 07:58:16 - INFO - codeparrot_training - Step 10253: {'lr': 0.0004512021406079776, 'samples': 5250048, 'steps': 10253, 'batch_loss/train': 0.7004063088679686} 12/24/2021 07:58:30 - INFO - codeparrot_training - Step 10254: {'lr': 0.00045119278424436373, 'samples': 5250560, 'steps': 10254, 'batch_loss/train': 
0.8709450121968985} 12/24/2021 07:58:40 - INFO - codeparrot_training - Step 10255: {'lr': 0.0004511834270808832, 'samples': 5251072, 'steps': 10255, 'batch_loss/train': 0.7909171218052506} 12/24/2021 07:58:51 - INFO - codeparrot_training - Step 10256: {'lr': 0.00045117406911757347, 'samples': 5251584, 'steps': 10256, 'batch_loss/train': 0.8792840028181672} 12/24/2021 07:59:03 - INFO - codeparrot_training - Step 10257: {'lr': 0.0004511647103544716, 'samples': 5252096, 'steps': 10257, 'batch_loss/train': 0.7530454217921942} 12/24/2021 07:59:14 - INFO - codeparrot_training - Step 10258: {'lr': 0.00045115535079161474, 'samples': 5252608, 'steps': 10258, 'batch_loss/train': 0.7452674191445112} 12/24/2021 07:59:24 - INFO - codeparrot_training - Step 10259: {'lr': 0.0004511459904290401, 'samples': 5253120, 'steps': 10259, 'batch_loss/train': 0.8253311887383461} 12/24/2021 07:59:35 - INFO - codeparrot_training - Step 10260: {'lr': 0.0004511366292667851, 'samples': 5253632, 'steps': 10260, 'batch_loss/train': 0.8288919925689697} 12/24/2021 07:59:47 - INFO - codeparrot_training - Step 10261: {'lr': 0.00045112726730488673, 'samples': 5254144, 'steps': 10261, 'batch_loss/train': 0.8350591652560979} 12/24/2021 07:59:57 - INFO - codeparrot_training - Step 10262: {'lr': 0.0004511179045433823, 'samples': 5254656, 'steps': 10262, 'batch_loss/train': 0.8687929743900895} 12/24/2021 08:00:08 - INFO - codeparrot_training - Step 10263: {'lr': 0.000451108540982309, 'samples': 5255168, 'steps': 10263, 'batch_loss/train': 0.9114737650379539} 12/24/2021 08:00:22 - INFO - codeparrot_training - Step 10264: {'lr': 0.0004510991766217041, 'samples': 5255680, 'steps': 10264, 'batch_loss/train': 0.915807181969285} 12/24/2021 08:00:32 - INFO - codeparrot_training - Step 10265: {'lr': 0.0004510898114616048, 'samples': 5256192, 'steps': 10265, 'batch_loss/train': 0.8534510163590312} 12/24/2021 08:00:43 - INFO - codeparrot_training - Step 10266: {'lr': 0.0004510804455020483, 'samples': 5256704, 
'steps': 10266, 'batch_loss/train': 0.9005343653261662} 12/24/2021 08:00:55 - INFO - codeparrot_training - Step 10267: {'lr': 0.00045107107874307195, 'samples': 5257216, 'steps': 10267, 'batch_loss/train': 0.8069849656894803} 12/24/2021 08:01:06 - INFO - codeparrot_training - Step 10268: {'lr': 0.00045106171118471284, 'samples': 5257728, 'steps': 10268, 'batch_loss/train': 0.7835348849184811} 12/24/2021 08:01:16 - INFO - codeparrot_training - Step 10269: {'lr': 0.00045105234282700834, 'samples': 5258240, 'steps': 10269, 'batch_loss/train': 0.8199660927057266} 12/24/2021 08:01:30 - INFO - codeparrot_training - Step 10270: {'lr': 0.00045104297366999556, 'samples': 5258752, 'steps': 10270, 'batch_loss/train': 0.8108018860220909} 12/24/2021 08:01:41 - INFO - codeparrot_training - Step 10271: {'lr': 0.00045103360371371193, 'samples': 5259264, 'steps': 10271, 'batch_loss/train': 0.7861257623881102} 12/24/2021 08:01:51 - INFO - codeparrot_training - Step 10272: {'lr': 0.0004510242329581946, 'samples': 5259776, 'steps': 10272, 'batch_loss/train': 0.8148827869445086} 12/24/2021 08:02:02 - INFO - codeparrot_training - Step 10273: {'lr': 0.0004510148614034807, 'samples': 5260288, 'steps': 10273, 'batch_loss/train': 0.7867775342892855} 12/24/2021 08:02:14 - INFO - codeparrot_training - Step 10274: {'lr': 0.0004510054890496077, 'samples': 5260800, 'steps': 10274, 'batch_loss/train': 0.8322493406012654} 12/24/2021 08:02:25 - INFO - codeparrot_training - Step 10275: {'lr': 0.0004509961158966128, 'samples': 5261312, 'steps': 10275, 'batch_loss/train': 1.0161528908647597} 12/24/2021 08:02:35 - INFO - codeparrot_training - Step 10276: {'lr': 0.00045098674194453316, 'samples': 5261824, 'steps': 10276, 'batch_loss/train': 0.8454646375030279} 12/24/2021 08:02:47 - INFO - codeparrot_training - Step 10277: {'lr': 0.00045097736719340614, 'samples': 5262336, 'steps': 10277, 'batch_loss/train': 0.9434554455801845} 12/24/2021 08:02:58 - INFO - codeparrot_training - Step 10278: {'lr': 
0.000450967991643269, 'samples': 5262848, 'steps': 10278, 'batch_loss/train': 0.919761442579329} 12/24/2021 08:03:08 - INFO - codeparrot_training - Step 10279: {'lr': 0.00045095861529415905, 'samples': 5263360, 'steps': 10279, 'batch_loss/train': 0.8406583201140165} 12/24/2021 08:03:22 - INFO - codeparrot_training - Step 10280: {'lr': 0.0004509492381461135, 'samples': 5263872, 'steps': 10280, 'batch_loss/train': 0.7919424809515476} 12/24/2021 08:03:33 - INFO - codeparrot_training - Step 10281: {'lr': 0.0004509398601991697, 'samples': 5264384, 'steps': 10281, 'batch_loss/train': 0.8079623244702816} 12/24/2021 08:03:44 - INFO - codeparrot_training - Step 10282: {'lr': 0.0004509304814533648, 'samples': 5264896, 'steps': 10282, 'batch_loss/train': 0.7435082579031587} 12/24/2021 08:03:54 - INFO - codeparrot_training - Step 10283: {'lr': 0.0004509211019087362, 'samples': 5265408, 'steps': 10283, 'batch_loss/train': 0.823998243547976} 12/24/2021 08:04:06 - INFO - codeparrot_training - Step 10284: {'lr': 0.0004509117215653212, 'samples': 5265920, 'steps': 10284, 'batch_loss/train': 0.8517758268862963} 12/24/2021 08:04:17 - INFO - codeparrot_training - Step 10285: {'lr': 0.00045090234042315704, 'samples': 5266432, 'steps': 10285, 'batch_loss/train': 0.7436534278094769} 12/24/2021 08:04:27 - INFO - codeparrot_training - Step 10286: {'lr': 0.00045089295848228095, 'samples': 5266944, 'steps': 10286, 'batch_loss/train': 0.7470166180282831} 12/24/2021 08:04:40 - INFO - codeparrot_training - Step 10287: {'lr': 0.0004508835757427304, 'samples': 5267456, 'steps': 10287, 'batch_loss/train': 0.9922557808458805} 12/24/2021 08:04:50 - INFO - codeparrot_training - Step 10288: {'lr': 0.0004508741922045426, 'samples': 5267968, 'steps': 10288, 'batch_loss/train': 0.44264896400272846} 12/24/2021 08:05:01 - INFO - codeparrot_training - Step 10289: {'lr': 0.0004508648078677549, 'samples': 5268480, 'steps': 10289, 'batch_loss/train': 0.8122951062396169} 12/24/2021 08:05:13 - INFO - 
codeparrot_training - Step 10290: {'lr': 0.0004508554227324045, 'samples': 5268992, 'steps': 10290, 'batch_loss/train': 0.8636781722307205} 12/24/2021 08:05:24 - INFO - codeparrot_training - Step 10291: {'lr': 0.00045084603679852877, 'samples': 5269504, 'steps': 10291, 'batch_loss/train': 0.8585319691337645} 12/24/2021 08:05:34 - INFO - codeparrot_training - Step 10292: {'lr': 0.00045083665006616505, 'samples': 5270016, 'steps': 10292, 'batch_loss/train': 0.8738313559442759} 12/24/2021 08:05:45 - INFO - codeparrot_training - Step 10293: {'lr': 0.00045082726253535066, 'samples': 5270528, 'steps': 10293, 'batch_loss/train': 0.7842201255261898} 12/24/2021 08:05:59 - INFO - codeparrot_training - Step 10294: {'lr': 0.00045081787420612286, 'samples': 5271040, 'steps': 10294, 'batch_loss/train': 0.928266647271812} 12/24/2021 08:06:09 - INFO - codeparrot_training - Step 10295: {'lr': 0.00045080848507851915, 'samples': 5271552, 'steps': 10295, 'batch_loss/train': 0.8466083211824298} 12/24/2021 08:06:20 - INFO - codeparrot_training - Step 10296: {'lr': 0.00045079909515257656, 'samples': 5272064, 'steps': 10296, 'batch_loss/train': 0.8028324521146715} 12/24/2021 08:06:32 - INFO - codeparrot_training - Step 10297: {'lr': 0.00045078970442833266, 'samples': 5272576, 'steps': 10297, 'batch_loss/train': 0.7700351164676249} 12/24/2021 08:06:43 - INFO - codeparrot_training - Step 10298: {'lr': 0.0004507803129058247, 'samples': 5273088, 'steps': 10298, 'batch_loss/train': 0.8011528290808201} 12/24/2021 08:06:53 - INFO - codeparrot_training - Step 10299: {'lr': 0.00045077092058509006, 'samples': 5273600, 'steps': 10299, 'batch_loss/train': 0.8304066595737822} 12/24/2021 08:07:07 - INFO - codeparrot_training - Step 10300: {'lr': 0.000450761527466166, 'samples': 5274112, 'steps': 10300, 'batch_loss/train': 0.9314533211290836} 12/24/2021 08:07:18 - INFO - codeparrot_training - Step 10301: {'lr': 0.00045075213354908993, 'samples': 5274624, 'steps': 10301, 'batch_loss/train': 
0.8883287515491247} 12/24/2021 08:07:28 - INFO - codeparrot_training - Step 10302: {'lr': 0.00045074273883389913, 'samples': 5275136, 'steps': 10302, 'batch_loss/train': 0.7701586983166635} 12/24/2021 08:07:41 - INFO - codeparrot_training - Step 10303: {'lr': 0.0004507333433206311, 'samples': 5275648, 'steps': 10303, 'batch_loss/train': 0.845874235033989} 12/24/2021 08:07:52 - INFO - codeparrot_training - Step 10304: {'lr': 0.00045072394700932305, 'samples': 5276160, 'steps': 10304, 'batch_loss/train': 1.5876888651400805} 12/24/2021 08:08:03 - INFO - codeparrot_training - Step 10305: {'lr': 0.00045071454990001236, 'samples': 5276672, 'steps': 10305, 'batch_loss/train': 0.7223566191969439} 12/24/2021 08:08:13 - INFO - codeparrot_training - Step 10306: {'lr': 0.0004507051519927364, 'samples': 5277184, 'steps': 10306, 'batch_loss/train': 0.7596705295145512} 12/24/2021 08:08:25 - INFO - codeparrot_training - Step 10307: {'lr': 0.00045069575328753254, 'samples': 5277696, 'steps': 10307, 'batch_loss/train': 0.866282950155437} 12/24/2021 08:08:36 - INFO - codeparrot_training - Step 10308: {'lr': 0.00045068635378443816, 'samples': 5278208, 'steps': 10308, 'batch_loss/train': 0.6953384908847511} 12/24/2021 08:08:46 - INFO - codeparrot_training - Step 10309: {'lr': 0.00045067695348349063, 'samples': 5278720, 'steps': 10309, 'batch_loss/train': 0.8429928733967245} 12/24/2021 08:09:01 - INFO - codeparrot_training - Step 10310: {'lr': 0.0004506675523847273, 'samples': 5279232, 'steps': 10310, 'batch_loss/train': 0.8224631026387215} 12/24/2021 08:09:11 - INFO - codeparrot_training - Step 10311: {'lr': 0.00045065815048818555, 'samples': 5279744, 'steps': 10311, 'batch_loss/train': 0.9240517979487777} 12/24/2021 08:09:22 - INFO - codeparrot_training - Step 10312: {'lr': 0.0004506487477939027, 'samples': 5280256, 'steps': 10312, 'batch_loss/train': 0.8824596460908651} 12/24/2021 08:09:32 - INFO - codeparrot_training - Step 10313: {'lr': 0.0004506393443019162, 'samples': 5280768, 
'steps': 10313, 'batch_loss/train': 0.8022620864212513} 12/24/2021 08:09:44 - INFO - codeparrot_training - Step 10314: {'lr': 0.00045062994001226353, 'samples': 5281280, 'steps': 10314, 'batch_loss/train': 0.8702127188444138} 12/24/2021 08:09:55 - INFO - codeparrot_training - Step 10315: {'lr': 0.0004506205349249819, 'samples': 5281792, 'steps': 10315, 'batch_loss/train': 0.9232615171931684} 12/24/2021 08:10:06 - INFO - codeparrot_training - Step 10316: {'lr': 0.00045061112904010874, 'samples': 5282304, 'steps': 10316, 'batch_loss/train': 0.8210900838021189} 12/24/2021 08:10:19 - INFO - codeparrot_training - Step 10317: {'lr': 0.00045060172235768156, 'samples': 5282816, 'steps': 10317, 'batch_loss/train': 0.9154609264805913} 12/24/2021 08:10:30 - INFO - codeparrot_training - Step 10318: {'lr': 0.0004505923148777376, 'samples': 5283328, 'steps': 10318, 'batch_loss/train': 0.8785388441756368} 12/24/2021 08:10:41 - INFO - codeparrot_training - Step 10319: {'lr': 0.00045058290660031445, 'samples': 5283840, 'steps': 10319, 'batch_loss/train': 0.8675270867533982} 12/24/2021 08:10:53 - INFO - codeparrot_training - Step 10320: {'lr': 0.0004505734975254493, 'samples': 5284352, 'steps': 10320, 'batch_loss/train': 0.9122180202975869} 12/24/2021 08:11:03 - INFO - codeparrot_training - Step 10321: {'lr': 0.0004505640876531796, 'samples': 5284864, 'steps': 10321, 'batch_loss/train': 0.8379419643897563} 12/24/2021 08:11:14 - INFO - codeparrot_training - Step 10322: {'lr': 0.00045055467698354294, 'samples': 5285376, 'steps': 10322, 'batch_loss/train': 0.6547167628305033} 12/24/2021 08:11:26 - INFO - codeparrot_training - Step 10323: {'lr': 0.0004505452655165766, 'samples': 5285888, 'steps': 10323, 'batch_loss/train': 0.9101028288714588} 12/24/2021 08:11:37 - INFO - codeparrot_training - Step 10324: {'lr': 0.00045053585325231793, 'samples': 5286400, 'steps': 10324, 'batch_loss/train': 0.8868663720786572} 12/24/2021 08:11:47 - INFO - codeparrot_training - Step 10325: {'lr': 
0.00045052644019080446, 'samples': 5286912, 'steps': 10325, 'batch_loss/train': 0.7799813756719232} 12/24/2021 08:11:58 - INFO - codeparrot_training - Step 10326: {'lr': 0.00045051702633207356, 'samples': 5287424, 'steps': 10326, 'batch_loss/train': 0.8847516025416553} 12/24/2021 08:12:10 - INFO - codeparrot_training - Step 10327: {'lr': 0.00045050761167616274, 'samples': 5287936, 'steps': 10327, 'batch_loss/train': 1.0729634128510952} 12/24/2021 08:12:20 - INFO - codeparrot_training - Step 10328: {'lr': 0.00045049819622310934, 'samples': 5288448, 'steps': 10328, 'batch_loss/train': 0.8352006496861577} 12/24/2021 08:12:31 - INFO - codeparrot_training - Step 10329: {'lr': 0.0004504887799729508, 'samples': 5288960, 'steps': 10329, 'batch_loss/train': 0.833380987867713} 12/24/2021 08:12:45 - INFO - codeparrot_training - Step 10330: {'lr': 0.0004504793629257246, 'samples': 5289472, 'steps': 10330, 'batch_loss/train': 0.9104839023202658} 12/24/2021 08:12:55 - INFO - codeparrot_training - Step 10331: {'lr': 0.0004504699450814681, 'samples': 5289984, 'steps': 10331, 'batch_loss/train': 0.8698417311534286} 12/24/2021 08:13:06 - INFO - codeparrot_training - Step 10332: {'lr': 0.00045046052644021886, 'samples': 5290496, 'steps': 10332, 'batch_loss/train': 0.8124944055452943} 12/24/2021 08:13:18 - INFO - codeparrot_training - Step 10333: {'lr': 0.00045045110700201417, 'samples': 5291008, 'steps': 10333, 'batch_loss/train': 0.790280323009938} 12/24/2021 08:13:29 - INFO - codeparrot_training - Step 10334: {'lr': 0.00045044168676689167, 'samples': 5291520, 'steps': 10334, 'batch_loss/train': 0.8521400075405836} 12/24/2021 08:13:39 - INFO - codeparrot_training - Step 10335: {'lr': 0.00045043226573488864, 'samples': 5292032, 'steps': 10335, 'batch_loss/train': 0.9199348175898194} 12/24/2021 08:13:50 - INFO - codeparrot_training - Step 10336: {'lr': 0.0004504228439060426, 'samples': 5292544, 'steps': 10336, 'batch_loss/train': 0.8285428462550044} 12/24/2021 08:14:02 - INFO - 
codeparrot_training - Step 10337: {'lr': 0.0004504134212803911, 'samples': 5293056, 'steps': 10337, 'batch_loss/train': 0.9525949954986572} 12/24/2021 08:14:12 - INFO - codeparrot_training - Step 10338: {'lr': 0.00045040399785797137, 'samples': 5293568, 'steps': 10338, 'batch_loss/train': 0.8127890033647418} 12/24/2021 08:14:23 - INFO - codeparrot_training - Step 10339: {'lr': 0.0004503945736388211, 'samples': 5294080, 'steps': 10339, 'batch_loss/train': 0.7505010324530303} 12/24/2021 08:14:37 - INFO - codeparrot_training - Step 10340: {'lr': 0.0004503851486229776, 'samples': 5294592, 'steps': 10340, 'batch_loss/train': 0.8361553573049605} 12/24/2021 08:14:48 - INFO - codeparrot_training - Step 10341: {'lr': 0.00045037572281047855, 'samples': 5295104, 'steps': 10341, 'batch_loss/train': 0.7790012415498495} 12/24/2021 08:14:58 - INFO - codeparrot_training - Step 10342: {'lr': 0.0004503662962013612, 'samples': 5295616, 'steps': 10342, 'batch_loss/train': 0.8714140262454748} 12/24/2021 08:15:10 - INFO - codeparrot_training - Step 10343: {'lr': 0.0004503568687956631, 'samples': 5296128, 'steps': 10343, 'batch_loss/train': 0.7149409609846771} 12/24/2021 08:15:21 - INFO - codeparrot_training - Step 10344: {'lr': 0.00045034744059342175, 'samples': 5296640, 'steps': 10344, 'batch_loss/train': 0.9230180718004704} 12/24/2021 08:15:31 - INFO - codeparrot_training - Step 10345: {'lr': 0.00045033801159467465, 'samples': 5297152, 'steps': 10345, 'batch_loss/train': 0.7988626332953572} 12/24/2021 08:15:45 - INFO - codeparrot_training - Step 10346: {'lr': 0.00045032858179945925, 'samples': 5297664, 'steps': 10346, 'batch_loss/train': 0.8304738523438573} 12/24/2021 08:15:56 - INFO - codeparrot_training - Step 10347: {'lr': 0.00045031915120781295, 'samples': 5298176, 'steps': 10347, 'batch_loss/train': 0.8770318592432886} 12/24/2021 08:16:07 - INFO - codeparrot_training - Step 10348: {'lr': 0.0004503097198197735, 'samples': 5298688, 'steps': 10348, 'batch_loss/train': 
0.8657211661338806} 12/24/2021 08:16:17 - INFO - codeparrot_training - Step 10349: {'lr': 0.00045030028763537813, 'samples': 5299200, 'steps': 10349, 'batch_loss/train': 0.810427088290453} 12/24/2021 08:16:29 - INFO - codeparrot_training - Step 10350: {'lr': 0.0004502908546546645, 'samples': 5299712, 'steps': 10350, 'batch_loss/train': 0.7482154425233603} 12/24/2021 08:16:40 - INFO - codeparrot_training - Step 10351: {'lr': 0.00045028142087767003, 'samples': 5300224, 'steps': 10351, 'batch_loss/train': 0.7493707700632513} 12/24/2021 08:16:50 - INFO - codeparrot_training - Step 10352: {'lr': 0.0004502719863044323, 'samples': 5300736, 'steps': 10352, 'batch_loss/train': 0.8197464216500521} 12/24/2021 08:17:03 - INFO - codeparrot_training - Step 10353: {'lr': 0.0004502625509349887, 'samples': 5301248, 'steps': 10353, 'batch_loss/train': 0.8147727567702532} 12/24/2021 08:17:14 - INFO - codeparrot_training - Step 10354: {'lr': 0.00045025311476937685, 'samples': 5301760, 'steps': 10354, 'batch_loss/train': 0.805869847536087} 12/24/2021 08:17:24 - INFO - codeparrot_training - Step 10355: {'lr': 0.00045024367780763426, 'samples': 5302272, 'steps': 10355, 'batch_loss/train': 0.8846519775688648} 12/24/2021 08:17:38 - INFO - codeparrot_training - Step 10356: {'lr': 0.0004502342400497984, 'samples': 5302784, 'steps': 10356, 'batch_loss/train': 1.1092294799163938} 12/24/2021 08:17:49 - INFO - codeparrot_training - Step 10357: {'lr': 0.0004502248014959068, 'samples': 5303296, 'steps': 10357, 'batch_loss/train': 0.8329187147319317} 12/24/2021 08:17:59 - INFO - codeparrot_training - Step 10358: {'lr': 0.000450215362145997, 'samples': 5303808, 'steps': 10358, 'batch_loss/train': 0.7478586505167186} 12/24/2021 08:18:10 - INFO - codeparrot_training - Step 10359: {'lr': 0.00045020592200010643, 'samples': 5304320, 'steps': 10359, 'batch_loss/train': 0.8054930185899138} 12/24/2021 08:18:23 - INFO - codeparrot_training - Step 10360: {'lr': 0.00045019648105827284, 'samples': 5304832, 
'steps': 10360, 'batch_loss/train': 0.8395519200712442} 12/24/2021 08:18:34 - INFO - codeparrot_training - Step 10361: {'lr': 0.00045018703932053354, 'samples': 5305344, 'steps': 10361, 'batch_loss/train': 1.0830871537327766} 12/24/2021 08:18:44 - INFO - codeparrot_training - Step 10362: {'lr': 0.0004501775967869262, 'samples': 5305856, 'steps': 10362, 'batch_loss/train': 1.330589983612299} 12/24/2021 08:18:57 - INFO - codeparrot_training - Step 10363: {'lr': 0.0004501681534574883, 'samples': 5306368, 'steps': 10363, 'batch_loss/train': 1.0025493139401078} 12/24/2021 08:19:07 - INFO - codeparrot_training - Step 10364: {'lr': 0.00045015870933225746, 'samples': 5306880, 'steps': 10364, 'batch_loss/train': 0.5632179533131421} 12/24/2021 08:19:18 - INFO - codeparrot_training - Step 10365: {'lr': 0.00045014926441127115, 'samples': 5307392, 'steps': 10365, 'batch_loss/train': 0.7757951426319778} 12/24/2021 08:19:29 - INFO - codeparrot_training - Step 10366: {'lr': 0.0004501398186945669, 'samples': 5307904, 'steps': 10366, 'batch_loss/train': 0.6977612469345331} 12/24/2021 08:19:41 - INFO - codeparrot_training - Step 10367: {'lr': 0.00045013037218218235, 'samples': 5308416, 'steps': 10367, 'batch_loss/train': 0.8646204438991845} 12/24/2021 08:19:51 - INFO - codeparrot_training - Step 10368: {'lr': 0.0004501209248741549, 'samples': 5308928, 'steps': 10368, 'batch_loss/train': 0.924374058842659} 12/24/2021 08:20:02 - INFO - codeparrot_training - Step 10369: {'lr': 0.0004501114767705223, 'samples': 5309440, 'steps': 10369, 'batch_loss/train': 0.8210679916664958} 12/24/2021 08:20:16 - INFO - codeparrot_training - Step 10370: {'lr': 0.000450102027871322, 'samples': 5309952, 'steps': 10370, 'batch_loss/train': 0.8566546365618706} 12/24/2021 08:20:27 - INFO - codeparrot_training - Step 10371: {'lr': 0.00045009257817659165, 'samples': 5310464, 'steps': 10371, 'batch_loss/train': 0.7749910401180387} 12/24/2021 08:20:37 - INFO - codeparrot_training - Step 10372: {'lr': 
0.00045008312768636874, 'samples': 5310976, 'steps': 10372, 'batch_loss/train': 0.7820749682141468} 12/24/2021 08:20:49 - INFO - codeparrot_training - Step 10373: {'lr': 0.0004500736764006907, 'samples': 5311488, 'steps': 10373, 'batch_loss/train': 0.8087149886414409} 12/24/2021 08:21:00 - INFO - codeparrot_training - Step 10374: {'lr': 0.00045006422431959547, 'samples': 5312000, 'steps': 10374, 'batch_loss/train': 0.8441426232457161} 12/24/2021 08:21:10 - INFO - codeparrot_training - Step 10375: {'lr': 0.00045005477144312035, 'samples': 5312512, 'steps': 10375, 'batch_loss/train': 0.8710979912430048} 12/24/2021 08:21:21 - INFO - codeparrot_training - Step 10376: {'lr': 0.0004500453177713031, 'samples': 5313024, 'steps': 10376, 'batch_loss/train': 0.8478850685060024} 12/24/2021 08:21:33 - INFO - codeparrot_training - Step 10377: {'lr': 0.00045003586330418107, 'samples': 5313536, 'steps': 10377, 'batch_loss/train': 0.8139738860772923} 12/24/2021 08:21:44 - INFO - codeparrot_training - Step 10378: {'lr': 0.000450026408041792, 'samples': 5314048, 'steps': 10378, 'batch_loss/train': 1.371647898107767} 12/24/2021 08:21:54 - INFO - codeparrot_training - Step 10379: {'lr': 0.0004500169519841735, 'samples': 5314560, 'steps': 10379, 'batch_loss/train': 0.8690751660615206} 12/24/2021 08:22:08 - INFO - codeparrot_training - Step 10380: {'lr': 0.0004500074951313631, 'samples': 5315072, 'steps': 10380, 'batch_loss/train': 0.7596322582103312} 12/24/2021 08:22:19 - INFO - codeparrot_training - Step 10381: {'lr': 0.0004499980374833984, 'samples': 5315584, 'steps': 10381, 'batch_loss/train': 0.9441705290228128} 12/24/2021 08:22:30 - INFO - codeparrot_training - Step 10382: {'lr': 0.00044998857904031707, 'samples': 5316096, 'steps': 10382, 'batch_loss/train': 1.3894639923237264} 12/24/2021 08:22:42 - INFO - codeparrot_training - Step 10383: {'lr': 0.0004499791198021566, 'samples': 5316608, 'steps': 10383, 'batch_loss/train': 0.7575687577482313} 12/24/2021 08:22:52 - INFO - 
codeparrot_training - Step 10384: {'lr': 0.00044996965976895473, 'samples': 5317120, 'steps': 10384, 'batch_loss/train': 0.8638242569286376} 12/24/2021 08:23:03 - INFO - codeparrot_training - Step 10385: {'lr': 0.0004499601989407489, 'samples': 5317632, 'steps': 10385, 'batch_loss/train': 0.8008475466631353} 12/24/2021 08:23:13 - INFO - codeparrot_training - Step 10386: {'lr': 0.0004499507373175769, 'samples': 5318144, 'steps': 10386, 'batch_loss/train': 0.8193758139386773} 12/24/2021 08:23:28 - INFO - codeparrot_training - Step 10387: {'lr': 0.0004499412748994762, 'samples': 5318656, 'steps': 10387, 'batch_loss/train': 0.9721839679405093} 12/24/2021 08:23:38 - INFO - codeparrot_training - Step 10388: {'lr': 0.00044993181168648456, 'samples': 5319168, 'steps': 10388, 'batch_loss/train': 0.8641826482489705} 12/24/2021 08:23:49 - INFO - codeparrot_training - Step 10389: {'lr': 0.0004499223476786395, 'samples': 5319680, 'steps': 10389, 'batch_loss/train': 0.9758251756429672} 12/24/2021 08:24:01 - INFO - codeparrot_training - Step 10390: {'lr': 0.0004499128828759787, 'samples': 5320192, 'steps': 10390, 'batch_loss/train': 0.8993151895701885} 12/24/2021 08:24:11 - INFO - codeparrot_training - Step 10391: {'lr': 0.0004499034172785397, 'samples': 5320704, 'steps': 10391, 'batch_loss/train': 0.8164901384152472} 12/24/2021 08:24:22 - INFO - codeparrot_training - Step 10392: {'lr': 0.0004498939508863603, 'samples': 5321216, 'steps': 10392, 'batch_loss/train': 0.7565362965688109} 12/24/2021 08:24:34 - INFO - codeparrot_training - Step 10393: {'lr': 0.00044988448369947794, 'samples': 5321728, 'steps': 10393, 'batch_loss/train': 0.8635720293968916} 12/24/2021 08:24:45 - INFO - codeparrot_training - Step 10394: {'lr': 0.00044987501571793035, 'samples': 5322240, 'steps': 10394, 'batch_loss/train': 0.8440023520961404} 12/24/2021 08:24:55 - INFO - codeparrot_training - Step 10395: {'lr': 0.0004498655469417552, 'samples': 5322752, 'steps': 10395, 'batch_loss/train': 
0.8020794941112399} 12/24/2021 08:25:09 - INFO - codeparrot_training - Step 10396: {'lr': 0.0004498560773709901, 'samples': 5323264, 'steps': 10396, 'batch_loss/train': 0.8294040840119123} 12/24/2021 08:25:20 - INFO - codeparrot_training - Step 10397: {'lr': 0.00044984660700567267, 'samples': 5323776, 'steps': 10397, 'batch_loss/train': 0.7888339785858989} 12/24/2021 08:25:30 - INFO - codeparrot_training - Step 10398: {'lr': 0.00044983713584584057, 'samples': 5324288, 'steps': 10398, 'batch_loss/train': 0.9230124391615391} 12/24/2021 08:25:41 - INFO - codeparrot_training - Step 10399: {'lr': 0.0004498276638915315, 'samples': 5324800, 'steps': 10399, 'batch_loss/train': 0.8457343215122819} 12/24/2021 08:25:53 - INFO - codeparrot_training - Step 10400: {'lr': 0.00044981819114278313, 'samples': 5325312, 'steps': 10400, 'batch_loss/train': 0.793602844234556} 12/24/2021 08:26:04 - INFO - codeparrot_training - Step 10401: {'lr': 0.00044980871759963304, 'samples': 5325824, 'steps': 10401, 'batch_loss/train': 0.8347790199331939} 12/24/2021 08:26:14 - INFO - codeparrot_training - Step 10402: {'lr': 0.00044979924326211897, 'samples': 5326336, 'steps': 10402, 'batch_loss/train': 0.8243478182703257} 12/24/2021 08:26:28 - INFO - codeparrot_training - Step 10403: {'lr': 0.0004497897681302785, 'samples': 5326848, 'steps': 10403, 'batch_loss/train': 0.8094183653593063} 12/24/2021 08:26:39 - INFO - codeparrot_training - Step 10404: {'lr': 0.0004497802922041493, 'samples': 5327360, 'steps': 10404, 'batch_loss/train': 0.8489378970116377} 12/24/2021 08:26:49 - INFO - codeparrot_training - Step 10405: {'lr': 0.0004497708154837692, 'samples': 5327872, 'steps': 10405, 'batch_loss/train': 0.8085585385560989} 12/24/2021 08:27:01 - INFO - codeparrot_training - Step 10406: {'lr': 0.0004497613379691757, 'samples': 5328384, 'steps': 10406, 'batch_loss/train': 0.7855228644330055} 12/24/2021 08:27:12 - INFO - codeparrot_training - Step 10407: {'lr': 0.0004497518596604066, 'samples': 5328896, 
'steps': 10407, 'batch_loss/train': 0.8111181380227208} 12/24/2021 08:27:22 - INFO - codeparrot_training - Step 10408: {'lr': 0.0004497423805574995, 'samples': 5329408, 'steps': 10408, 'batch_loss/train': 0.7828346705064178} 12/24/2021 08:27:33 - INFO - codeparrot_training - Step 10409: {'lr': 0.00044973290066049215, 'samples': 5329920, 'steps': 10409, 'batch_loss/train': 0.8900573663413525} 12/24/2021 08:27:45 - INFO - codeparrot_training - Step 10410: {'lr': 0.00044972341996942216, 'samples': 5330432, 'steps': 10410, 'batch_loss/train': 0.8343139812350273} 12/24/2021 08:27:56 - INFO - codeparrot_training - Step 10411: {'lr': 0.0004497139384843273, 'samples': 5330944, 'steps': 10411, 'batch_loss/train': 0.8774671689607203} 12/24/2021 08:28:06 - INFO - codeparrot_training - Step 10412: {'lr': 0.0004497044562052452, 'samples': 5331456, 'steps': 10412, 'batch_loss/train': 0.81000464130193} 12/24/2021 08:28:19 - INFO - codeparrot_training - Step 10413: {'lr': 0.00044969497313221364, 'samples': 5331968, 'steps': 10413, 'batch_loss/train': 0.6875065765343606} 12/24/2021 08:28:29 - INFO - codeparrot_training - Step 10414: {'lr': 0.0004496854892652702, 'samples': 5332480, 'steps': 10414, 'batch_loss/train': 0.7391628567129374} 12/24/2021 08:28:40 - INFO - codeparrot_training - Step 10415: {'lr': 0.0004496760046044527, 'samples': 5332992, 'steps': 10415, 'batch_loss/train': 0.9452302847057581} 12/24/2021 08:28:51 - INFO - codeparrot_training - Step 10416: {'lr': 0.0004496665191497987, 'samples': 5333504, 'steps': 10416, 'batch_loss/train': 1.4446721789427102} 12/24/2021 08:29:04 - INFO - codeparrot_training - Step 10417: {'lr': 0.00044965703290134615, 'samples': 5334016, 'steps': 10417, 'batch_loss/train': 0.8675704216584563} 12/24/2021 08:29:15 - INFO - codeparrot_training - Step 10418: {'lr': 0.00044964754585913254, 'samples': 5334528, 'steps': 10418, 'batch_loss/train': 0.7481680288910866} 12/24/2021 08:29:26 - INFO - codeparrot_training - Step 10419: {'lr': 
0.00044963805802319566, 'samples': 5335040, 'steps': 10419, 'batch_loss/train': 0.8593883530702442} 12/24/2021 08:29:38 - INFO - codeparrot_training - Step 10420: {'lr': 0.00044962856939357335, 'samples': 5335552, 'steps': 10420, 'batch_loss/train': 0.6870936374180019} 12/24/2021 08:29:49 - INFO - codeparrot_training - Step 10421: {'lr': 0.00044961907997030317, 'samples': 5336064, 'steps': 10421, 'batch_loss/train': 0.7779771760106087} 12/24/2021 08:29:59 - INFO - codeparrot_training - Step 10422: {'lr': 0.0004496095897534228, 'samples': 5336576, 'steps': 10422, 'batch_loss/train': 0.823453830787912} 12/24/2021 08:30:11 - INFO - codeparrot_training - Step 10423: {'lr': 0.0004496000987429702, 'samples': 5337088, 'steps': 10423, 'batch_loss/train': 0.8829150283709168} 12/24/2021 08:30:22 - INFO - codeparrot_training - Step 10424: {'lr': 0.0004495906069389829, 'samples': 5337600, 'steps': 10424, 'batch_loss/train': 0.7995540150441229} 12/24/2021 08:30:32 - INFO - codeparrot_training - Step 10425: {'lr': 0.00044958111434149873, 'samples': 5338112, 'steps': 10425, 'batch_loss/train': 0.8270540037192404} 12/24/2021 08:30:46 - INFO - codeparrot_training - Step 10426: {'lr': 0.0004495716209505554, 'samples': 5338624, 'steps': 10426, 'batch_loss/train': 0.9198857424780726} 12/24/2021 08:30:57 - INFO - codeparrot_training - Step 10427: {'lr': 0.0004495621267661907, 'samples': 5339136, 'steps': 10427, 'batch_loss/train': 0.7846713690087199} 12/24/2021 08:31:07 - INFO - codeparrot_training - Step 10428: {'lr': 0.0004495526317884423, 'samples': 5339648, 'steps': 10428, 'batch_loss/train': 0.8033315436914563} 12/24/2021 08:31:18 - INFO - codeparrot_training - Step 10429: {'lr': 0.00044954313601734805, 'samples': 5340160, 'steps': 10429, 'batch_loss/train': 0.6984483300475404} 12/24/2021 08:31:30 - INFO - codeparrot_training - Step 10430: {'lr': 0.00044953363945294557, 'samples': 5340672, 'steps': 10430, 'batch_loss/train': 0.8103962983004749} 12/24/2021 08:31:41 - INFO - 
codeparrot_training - Step 10431: {'lr': 0.0004495241420952727, 'samples': 5341184, 'steps': 10431, 'batch_loss/train': 0.788681265199557} 12/24/2021 08:31:51 - INFO - codeparrot_training - Step 10432: {'lr': 0.00044951464394436715, 'samples': 5341696, 'steps': 10432, 'batch_loss/train': 0.8960340637713671} 12/24/2021 08:32:05 - INFO - codeparrot_training - Step 10433: {'lr': 0.00044950514500026673, 'samples': 5342208, 'steps': 10433, 'batch_loss/train': 0.8655779613181949} 12/24/2021 08:32:15 - INFO - codeparrot_training - Step 10434: {'lr': 0.00044949564526300915, 'samples': 5342720, 'steps': 10434, 'batch_loss/train': 0.8652540910989046} 12/24/2021 08:32:26 - INFO - codeparrot_training - Step 10435: {'lr': 0.00044948614473263226, 'samples': 5343232, 'steps': 10435, 'batch_loss/train': 0.7971383109688759} 12/24/2021 08:32:38 - INFO - codeparrot_training - Step 10436: {'lr': 0.0004494766434091738, 'samples': 5343744, 'steps': 10436, 'batch_loss/train': 0.8622135035693645} 12/24/2021 08:32:49 - INFO - codeparrot_training - Step 10437: {'lr': 0.00044946714129267146, 'samples': 5344256, 'steps': 10437, 'batch_loss/train': 0.7831653719767928} 12/24/2021 08:32:59 - INFO - codeparrot_training - Step 10438: {'lr': 0.0004494576383831631, 'samples': 5344768, 'steps': 10438, 'batch_loss/train': 0.7947959816083312} 12/24/2021 08:33:10 - INFO - codeparrot_training - Step 10439: {'lr': 0.0004494481346806865, 'samples': 5345280, 'steps': 10439, 'batch_loss/train': 0.8032490755431354} 12/24/2021 08:33:22 - INFO - codeparrot_training - Step 10440: {'lr': 0.0004494386301852794, 'samples': 5345792, 'steps': 10440, 'batch_loss/train': 0.837716837413609} 12/24/2021 08:33:33 - INFO - codeparrot_training - Step 10441: {'lr': 0.0004494291248969797, 'samples': 5346304, 'steps': 10441, 'batch_loss/train': 0.9482036773115396} 12/24/2021 08:33:43 - INFO - codeparrot_training - Step 10442: {'lr': 0.000449419618815825, 'samples': 5346816, 'steps': 10442, 'batch_loss/train': 
0.8309357687830925} 12/24/2021 08:33:55 - INFO - codeparrot_training - Step 10443: {'lr': 0.0004494101119418532, 'samples': 5347328, 'steps': 10443, 'batch_loss/train': 0.7901392187923193} 12/24/2021 08:34:06 - INFO - codeparrot_training - Step 10444: {'lr': 0.00044940060427510215, 'samples': 5347840, 'steps': 10444, 'batch_loss/train': 0.8370324575807899} 12/24/2021 08:34:16 - INFO - codeparrot_training - Step 10445: {'lr': 0.00044939109581560956, 'samples': 5348352, 'steps': 10445, 'batch_loss/train': 0.8368688654154539} 12/24/2021 08:34:31 - INFO - codeparrot_training - Step 10446: {'lr': 0.00044938158656341324, 'samples': 5348864, 'steps': 10446, 'batch_loss/train': 0.7448141831555404} 12/24/2021 08:34:41 - INFO - codeparrot_training - Step 10447: {'lr': 0.00044937207651855106, 'samples': 5349376, 'steps': 10447, 'batch_loss/train': 0.773899276740849} 12/24/2021 08:34:52 - INFO - codeparrot_training - Step 10448: {'lr': 0.0004493625656810607, 'samples': 5349888, 'steps': 10448, 'batch_loss/train': 0.5672785501228645} 12/24/2021 08:35:04 - INFO - codeparrot_training - Step 10449: {'lr': 0.00044935305405098016, 'samples': 5350400, 'steps': 10449, 'batch_loss/train': 0.8068037275224924} 12/24/2021 08:35:15 - INFO - codeparrot_training - Step 10450: {'lr': 0.0004493435416283471, 'samples': 5350912, 'steps': 10450, 'batch_loss/train': 0.7473713867366314} 12/24/2021 08:35:25 - INFO - codeparrot_training - Step 10451: {'lr': 0.0004493340284131994, 'samples': 5351424, 'steps': 10451, 'batch_loss/train': 0.7676063440740108} 12/24/2021 08:35:36 - INFO - codeparrot_training - Step 10452: {'lr': 0.0004493245144055748, 'samples': 5351936, 'steps': 10452, 'batch_loss/train': 0.9165369549300522} 12/24/2021 08:35:48 - INFO - codeparrot_training - Step 10453: {'lr': 0.00044931499960551124, 'samples': 5352448, 'steps': 10453, 'batch_loss/train': 0.7227677972987294} 12/24/2021 08:35:58 - INFO - codeparrot_training - Step 10454: {'lr': 0.0004493054840130465, 'samples': 5352960, 
'steps': 10454, 'batch_loss/train': 0.7901041256263852} 12/24/2021 08:36:09 - INFO - codeparrot_training - Step 10455: {'lr': 0.00044929596762821844, 'samples': 5353472, 'steps': 10455, 'batch_loss/train': 0.8187860157340765} 12/24/2021 08:36:23 - INFO - codeparrot_training - Step 10456: {'lr': 0.00044928645045106487, 'samples': 5353984, 'steps': 10456, 'batch_loss/train': 1.0309590459801257} 12/24/2021 08:36:34 - INFO - codeparrot_training - Step 10457: {'lr': 0.00044927693248162363, 'samples': 5354496, 'steps': 10457, 'batch_loss/train': 0.785574356210418} 12/24/2021 08:36:44 - INFO - codeparrot_training - Step 10458: {'lr': 0.0004492674137199325, 'samples': 5355008, 'steps': 10458, 'batch_loss/train': 0.838389863492921} 12/24/2021 08:36:57 - INFO - codeparrot_training - Step 10459: {'lr': 0.00044925789416602936, 'samples': 5355520, 'steps': 10459, 'batch_loss/train': 0.7583990129642189} 12/24/2021 08:37:07 - INFO - codeparrot_training - Step 10460: {'lr': 0.0004492483738199521, 'samples': 5356032, 'steps': 10460, 'batch_loss/train': 0.7395912841893733} 12/24/2021 08:37:18 - INFO - codeparrot_training - Step 10461: {'lr': 0.00044923885268173856, 'samples': 5356544, 'steps': 10461, 'batch_loss/train': 0.7231908850371838} 12/24/2021 08:37:28 - INFO - codeparrot_training - Step 10462: {'lr': 0.00044922933075142646, 'samples': 5357056, 'steps': 10462, 'batch_loss/train': 0.8621260123327374} 12/24/2021 08:37:40 - INFO - codeparrot_training - Step 10463: {'lr': 0.00044921980802905393, 'samples': 5357568, 'steps': 10463, 'batch_loss/train': 0.824397602584213} 12/24/2021 08:37:51 - INFO - codeparrot_training - Step 10464: {'lr': 0.00044921028451465854, 'samples': 5358080, 'steps': 10464, 'batch_loss/train': 0.7828119313344359} 12/24/2021 08:38:01 - INFO - codeparrot_training - Step 10465: {'lr': 0.0004492007602082784, 'samples': 5358592, 'steps': 10465, 'batch_loss/train': 0.743161978200078} 12/24/2021 08:38:15 - INFO - codeparrot_training - Step 10466: {'lr': 
0.00044919123510995114, 'samples': 5359104, 'steps': 10466, 'batch_loss/train': 0.8414688892662525} 12/24/2021 08:38:26 - INFO - codeparrot_training - Step 10467: {'lr': 0.0004491817092197148, 'samples': 5359616, 'steps': 10467, 'batch_loss/train': 0.8656922038644552} 12/24/2021 08:38:36 - INFO - codeparrot_training - Step 10468: {'lr': 0.00044917218253760717, 'samples': 5360128, 'steps': 10468, 'batch_loss/train': 0.7551662372425199} 12/24/2021 08:38:49 - INFO - codeparrot_training - Step 10469: {'lr': 0.00044916265506366615, 'samples': 5360640, 'steps': 10469, 'batch_loss/train': 0.7958375383168459} 12/24/2021 08:38:59 - INFO - codeparrot_training - Step 10470: {'lr': 0.00044915312679792957, 'samples': 5361152, 'steps': 10470, 'batch_loss/train': 0.9615722792223096} 12/24/2021 08:39:10 - INFO - codeparrot_training - Step 10471: {'lr': 0.00044914359774043543, 'samples': 5361664, 'steps': 10471, 'batch_loss/train': 0.6608166615478694} 12/24/2021 08:39:20 - INFO - codeparrot_training - Step 10472: {'lr': 0.00044913406789122147, 'samples': 5362176, 'steps': 10472, 'batch_loss/train': 0.8011406413279474} 12/24/2021 08:39:35 - INFO - codeparrot_training - Step 10473: {'lr': 0.0004491245372503257, 'samples': 5362688, 'steps': 10473, 'batch_loss/train': 0.9239508584141731} 12/24/2021 08:39:45 - INFO - codeparrot_training - Step 10474: {'lr': 0.00044911500581778596, 'samples': 5363200, 'steps': 10474, 'batch_loss/train': 0.7825911137042567} 12/24/2021 08:39:56 - INFO - codeparrot_training - Step 10475: {'lr': 0.00044910547359364004, 'samples': 5363712, 'steps': 10475, 'batch_loss/train': 0.9185909917578101} 12/24/2021 08:40:08 - INFO - codeparrot_training - Step 10476: {'lr': 0.00044909594057792614, 'samples': 5364224, 'steps': 10476, 'batch_loss/train': 0.7829916309565306} 12/24/2021 08:40:18 - INFO - codeparrot_training - Step 10477: {'lr': 0.0004490864067706818, 'samples': 5364736, 'steps': 10477, 'batch_loss/train': 0.785737476311624} 12/24/2021 08:40:29 - INFO - 
codeparrot_training - Step 10478: {'lr': 0.0004490768721719451, 'samples': 5365248, 'steps': 10478, 'batch_loss/train': 0.8436423763632774} 12/24/2021 08:40:41 - INFO - codeparrot_training - Step 10479: {'lr': 0.00044906733678175395, 'samples': 5365760, 'steps': 10479, 'batch_loss/train': 0.8636573841795325} 12/24/2021 08:40:52 - INFO - codeparrot_training - Step 10480: {'lr': 0.00044905780060014623, 'samples': 5366272, 'steps': 10480, 'batch_loss/train': 0.7012119552236982} 12/24/2021 08:41:02 - INFO - codeparrot_training - Step 10481: {'lr': 0.0004490482636271598, 'samples': 5366784, 'steps': 10481, 'batch_loss/train': 0.7294406974688172} 12/24/2021 08:41:14 - INFO - codeparrot_training - Step 10482: {'lr': 0.0004490387258628327, 'samples': 5367296, 'steps': 10482, 'batch_loss/train': 0.8129403959028423} 12/24/2021 08:41:25 - INFO - codeparrot_training - Step 10483: {'lr': 0.0004490291873072028, 'samples': 5367808, 'steps': 10483, 'batch_loss/train': 0.8430897956714034} 12/24/2021 08:41:35 - INFO - codeparrot_training - Step 10484: {'lr': 0.0004490196479603079, 'samples': 5368320, 'steps': 10484, 'batch_loss/train': 0.7426677686162293} 12/24/2021 08:41:46 - INFO - codeparrot_training - Step 10485: {'lr': 0.0004490101078221861, 'samples': 5368832, 'steps': 10485, 'batch_loss/train': 0.770033513661474} 12/24/2021 08:42:00 - INFO - codeparrot_training - Step 10486: {'lr': 0.00044900056689287527, 'samples': 5369344, 'steps': 10486, 'batch_loss/train': 0.7762870043516159} 12/24/2021 08:42:10 - INFO - codeparrot_training - Step 10487: {'lr': 0.0004489910251724133, 'samples': 5369856, 'steps': 10487, 'batch_loss/train': 0.8074964666739106} 12/24/2021 08:42:21 - INFO - codeparrot_training - Step 10488: {'lr': 0.0004489814826608381, 'samples': 5370368, 'steps': 10488, 'batch_loss/train': 0.8335095811635256} 12/24/2021 08:42:33 - INFO - codeparrot_training - Step 10489: {'lr': 0.0004489719393581877, 'samples': 5370880, 'steps': 10489, 'batch_loss/train': 
0.8695443440228701} 12/24/2021 08:42:44 - INFO - codeparrot_training - Step 10490: {'lr': 0.0004489623952645, 'samples': 5371392, 'steps': 10490, 'batch_loss/train': 1.8784070737892762} 12/24/2021 08:42:55 - INFO - codeparrot_training - Step 10491: {'lr': 0.0004489528503798129, 'samples': 5371904, 'steps': 10491, 'batch_loss/train': 1.0044024074450135} 12/24/2021 08:43:06 - INFO - codeparrot_training - Step 10492: {'lr': 0.0004489433047041645, 'samples': 5372416, 'steps': 10492, 'batch_loss/train': 0.7454308072337881} 12/24/2021 08:43:17 - INFO - codeparrot_training - Step 10493: {'lr': 0.0004489337582375926, 'samples': 5372928, 'steps': 10493, 'batch_loss/train': 0.8211299935355783} 12/24/2021 08:43:28 - INFO - codeparrot_training - Step 10494: {'lr': 0.0004489242109801351, 'samples': 5373440, 'steps': 10494, 'batch_loss/train': 0.7522375648841262} 12/24/2021 08:43:38 - INFO - codeparrot_training - Step 10495: {'lr': 0.00044891466293183016, 'samples': 5373952, 'steps': 10495, 'batch_loss/train': 0.8361212629824877} 12/24/2021 08:43:52 - INFO - codeparrot_training - Step 10496: {'lr': 0.00044890511409271557, 'samples': 5374464, 'steps': 10496, 'batch_loss/train': 0.7968784226104617} 12/24/2021 08:44:03 - INFO - codeparrot_training - Step 10497: {'lr': 0.0004488955644628294, 'samples': 5374976, 'steps': 10497, 'batch_loss/train': 0.8065261067822576} 12/24/2021 08:44:13 - INFO - codeparrot_training - Step 10498: {'lr': 0.00044888601404220946, 'samples': 5375488, 'steps': 10498, 'batch_loss/train': 0.7391413068398833} 12/24/2021 08:44:25 - INFO - codeparrot_training - Step 10499: {'lr': 0.0004488764628308939, 'samples': 5376000, 'steps': 10499, 'batch_loss/train': 0.7491197087801993} 12/24/2021 08:44:36 - INFO - codeparrot_training - Step 10500: {'lr': 0.0004488669108289206, 'samples': 5376512, 'steps': 10500, 'batch_loss/train': 0.7332215935457498} 12/24/2021 08:44:46 - INFO - codeparrot_training - Step 10501: {'lr': 0.00044885735803632757, 'samples': 5377024, 
'steps': 10501, 'batch_loss/train': 0.8085275292396545} 12/24/2021 08:45:00 - INFO - codeparrot_training - Step 10502: {'lr': 0.00044884780445315274, 'samples': 5377536, 'steps': 10502, 'batch_loss/train': 0.8331820571329445} 12/24/2021 08:45:11 - INFO - codeparrot_training - Step 10503: {'lr': 0.0004488382500794341, 'samples': 5378048, 'steps': 10503, 'batch_loss/train': 1.0027224821969867} 12/24/2021 08:45:22 - INFO - codeparrot_training - Step 10504: {'lr': 0.0004488286949152097, 'samples': 5378560, 'steps': 10504, 'batch_loss/train': 1.02275256998837} 12/24/2021 08:45:32 - INFO - codeparrot_training - Step 10505: {'lr': 0.0004488191389605175, 'samples': 5379072, 'steps': 10505, 'batch_loss/train': 0.7477519456297159} 12/24/2021 08:45:44 - INFO - codeparrot_training - Step 10506: {'lr': 0.00044880958221539546, 'samples': 5379584, 'steps': 10506, 'batch_loss/train': 0.8531738789752126} 12/24/2021 08:45:55 - INFO - codeparrot_training - Step 10507: {'lr': 0.0004488000246798816, 'samples': 5380096, 'steps': 10507, 'batch_loss/train': 0.8225855408236384} 12/24/2021 08:46:05 - INFO - codeparrot_training - Step 10508: {'lr': 0.0004487904663540139, 'samples': 5380608, 'steps': 10508, 'batch_loss/train': 0.7808821285143495} 12/24/2021 08:46:18 - INFO - codeparrot_training - Step 10509: {'lr': 0.00044878090723783035, 'samples': 5381120, 'steps': 10509, 'batch_loss/train': 0.9541792855598032} 12/24/2021 08:46:28 - INFO - codeparrot_training - Step 10510: {'lr': 0.000448771347331369, 'samples': 5381632, 'steps': 10510, 'batch_loss/train': 0.704401959432289} 12/24/2021 08:46:39 - INFO - codeparrot_training - Step 10511: {'lr': 0.00044876178663466777, 'samples': 5382144, 'steps': 10511, 'batch_loss/train': 0.7338099260814488} 12/24/2021 08:46:53 - INFO - codeparrot_training - Step 10512: {'lr': 0.00044875222514776476, 'samples': 5382656, 'steps': 10512, 'batch_loss/train': 0.8565622391179204} 12/24/2021 08:47:03 - INFO - codeparrot_training - Step 10513: {'lr': 
0.00044874266287069797, 'samples': 5383168, 'steps': 10513, 'batch_loss/train': 0.735381867736578} 12/24/2021 08:47:14 - INFO - codeparrot_training - Step 10514: {'lr': 0.00044873309980350535, 'samples': 5383680, 'steps': 10514, 'batch_loss/train': 0.7622875049710274} 12/24/2021 08:47:25 - INFO - codeparrot_training - Step 10515: {'lr': 0.00044872353594622495, 'samples': 5384192, 'steps': 10515, 'batch_loss/train': 0.7897776332683861} 12/24/2021 08:47:37 - INFO - codeparrot_training - Step 10516: {'lr': 0.0004487139712988949, 'samples': 5384704, 'steps': 10516, 'batch_loss/train': 0.8544242791831493} 12/24/2021 08:47:47 - INFO - codeparrot_training - Step 10517: {'lr': 0.00044870440586155304, 'samples': 5385216, 'steps': 10517, 'batch_loss/train': 0.7837307443842292} 12/24/2021 08:47:58 - INFO - codeparrot_training - Step 10518: {'lr': 0.00044869483963423754, 'samples': 5385728, 'steps': 10518, 'batch_loss/train': 0.8174629807472229} 12/24/2021 08:48:10 - INFO - codeparrot_training - Step 10519: {'lr': 0.0004486852726169863, 'samples': 5386240, 'steps': 10519, 'batch_loss/train': 0.7617954607121646} 12/24/2021 08:48:20 - INFO - codeparrot_training - Step 10520: {'lr': 0.00044867570480983755, 'samples': 5386752, 'steps': 10520, 'batch_loss/train': 0.7883866550400853} 12/24/2021 08:48:31 - INFO - codeparrot_training - Step 10521: {'lr': 0.0004486661362128291, 'samples': 5387264, 'steps': 10521, 'batch_loss/train': 0.7648498602211475} 12/24/2021 08:48:43 - INFO - codeparrot_training - Step 10522: {'lr': 0.0004486565668259992, 'samples': 5387776, 'steps': 10522, 'batch_loss/train': 0.7994718733243644} 12/24/2021 08:48:54 - INFO - codeparrot_training - Step 10523: {'lr': 0.0004486469966493858, 'samples': 5388288, 'steps': 10523, 'batch_loss/train': 0.7887312341481447} 12/24/2021 08:49:04 - INFO - codeparrot_training - Step 10524: {'lr': 0.0004486374256830269, 'samples': 5388800, 'steps': 10524, 'batch_loss/train': 0.865980570204556} 12/24/2021 08:49:15 - INFO - 
codeparrot_training - Step 10525: {'lr': 0.0004486278539269606, 'samples': 5389312, 'steps': 10525, 'batch_loss/train': 0.8340143347159028} 12/24/2021 08:49:28 - INFO - codeparrot_training - Step 10526: {'lr': 0.000448618281381225, 'samples': 5389824, 'steps': 10526, 'batch_loss/train': 0.809475330170244} 12/24/2021 08:49:39 - INFO - codeparrot_training - Step 10527: {'lr': 0.0004486087080458581, 'samples': 5390336, 'steps': 10527, 'batch_loss/train': 0.8235700158402324} 12/24/2021 08:49:50 - INFO - codeparrot_training - Step 10528: {'lr': 0.0004485991339208979, 'samples': 5390848, 'steps': 10528, 'batch_loss/train': 0.8781493296846747} 12/24/2021 08:50:02 - INFO - codeparrot_training - Step 10529: {'lr': 0.0004485895590063826, 'samples': 5391360, 'steps': 10529, 'batch_loss/train': 0.7845215261913836} 12/24/2021 08:50:12 - INFO - codeparrot_training - Step 10530: {'lr': 0.0004485799833023502, 'samples': 5391872, 'steps': 10530, 'batch_loss/train': 0.8524550283327699} 12/24/2021 08:50:23 - INFO - codeparrot_training - Step 10531: {'lr': 0.0004485704068088387, 'samples': 5392384, 'steps': 10531, 'batch_loss/train': 0.7006382821127772} 12/24/2021 08:50:35 - INFO - codeparrot_training - Step 10532: {'lr': 0.00044856082952588627, 'samples': 5392896, 'steps': 10532, 'batch_loss/train': 0.9550733994692564} 12/24/2021 08:50:45 - INFO - codeparrot_training - Step 10533: {'lr': 0.000448551251453531, 'samples': 5393408, 'steps': 10533, 'batch_loss/train': 0.8641991531476378} 12/24/2021 08:50:56 - INFO - codeparrot_training - Step 10534: {'lr': 0.0004485416725918109, 'samples': 5393920, 'steps': 10534, 'batch_loss/train': 0.8492156583815813} 12/24/2021 08:51:10 - INFO - codeparrot_training - Step 10535: {'lr': 0.00044853209294076405, 'samples': 5394432, 'steps': 10535, 'batch_loss/train': 0.86182074341923} 12/24/2021 08:51:21 - INFO - codeparrot_training - Step 10536: {'lr': 0.00044852251250042857, 'samples': 5394944, 'steps': 10536, 'batch_loss/train': 0.8630466856993735} 
12/24/2021 08:51:31 - INFO - codeparrot_training - Step 10537: {'lr': 0.0004485129312708426, 'samples': 5395456, 'steps': 10537, 'batch_loss/train': 0.7945640287362039} 12/24/2021 08:51:42 - INFO - codeparrot_training - Step 10538: {'lr': 0.00044850334925204406, 'samples': 5395968, 'steps': 10538, 'batch_loss/train': 0.7946008527651429} 12/24/2021 08:51:54 - INFO - codeparrot_training - Step 10539: {'lr': 0.00044849376644407124, 'samples': 5396480, 'steps': 10539, 'batch_loss/train': 1.004290483891964} 12/24/2021 08:52:04 - INFO - codeparrot_training - Step 10540: {'lr': 0.00044848418284696215, 'samples': 5396992, 'steps': 10540, 'batch_loss/train': 0.7636160908732563} 12/24/2021 08:52:15 - INFO - codeparrot_training - Step 10541: {'lr': 0.00044847459846075483, 'samples': 5397504, 'steps': 10541, 'batch_loss/train': 0.877150640822947} 12/24/2021 08:52:27 - INFO - codeparrot_training - Step 10542: {'lr': 0.0004484650132854875, 'samples': 5398016, 'steps': 10542, 'batch_loss/train': 0.8206699686124921} 12/24/2021 08:52:38 - INFO - codeparrot_training - Step 10543: {'lr': 0.00044845542732119823, 'samples': 5398528, 'steps': 10543, 'batch_loss/train': 0.7909706262871623} 12/24/2021 08:52:49 - INFO - codeparrot_training - Step 10544: {'lr': 0.00044844584056792506, 'samples': 5399040, 'steps': 10544, 'batch_loss/train': 0.7706031305715442} 12/24/2021 08:53:03 - INFO - codeparrot_training - Step 10545: {'lr': 0.00044843625302570616, 'samples': 5399552, 'steps': 10545, 'batch_loss/train': 0.7902661552652717} 12/24/2021 08:53:14 - INFO - codeparrot_training - Step 10546: {'lr': 0.00044842666469457974, 'samples': 5400064, 'steps': 10546, 'batch_loss/train': 0.8119341243291274} 12/24/2021 08:53:24 - INFO - codeparrot_training - Step 10547: {'lr': 0.00044841707557458365, 'samples': 5400576, 'steps': 10547, 'batch_loss/train': 0.6451569013879634} 12/24/2021 08:53:35 - INFO - codeparrot_training - Step 10548: {'lr': 0.00044840748566575637, 'samples': 5401088, 'steps': 10548, 
'batch_loss/train': 0.8684166162274778} 12/24/2021 08:53:47 - INFO - codeparrot_training - Step 10549: {'lr': 0.00044839789496813574, 'samples': 5401600, 'steps': 10549, 'batch_loss/train': 0.9543522372841835} 12/24/2021 08:53:58 - INFO - codeparrot_training - Step 10550: {'lr': 0.00044838830348176004, 'samples': 5402112, 'steps': 10550, 'batch_loss/train': 0.8350133486092091} 12/24/2021 08:54:08 - INFO - codeparrot_training - Step 10551: {'lr': 0.00044837871120666727, 'samples': 5402624, 'steps': 10551, 'batch_loss/train': 0.8238198910839856} 12/24/2021 08:54:22 - INFO - codeparrot_training - Step 10552: {'lr': 0.0004483691181428957, 'samples': 5403136, 'steps': 10552, 'batch_loss/train': 0.6662101931869984} 12/24/2021 08:54:33 - INFO - codeparrot_training - Step 10553: {'lr': 0.00044835952429048343, 'samples': 5403648, 'steps': 10553, 'batch_loss/train': 0.7566130897030234} 12/24/2021 08:54:43 - INFO - codeparrot_training - Step 10554: {'lr': 0.00044834992964946864, 'samples': 5404160, 'steps': 10554, 'batch_loss/train': 0.8130038110539317} 12/24/2021 08:54:55 - INFO - codeparrot_training - Step 10555: {'lr': 0.0004483403342198894, 'samples': 5404672, 'steps': 10555, 'batch_loss/train': 0.8879144857637584} 12/24/2021 08:55:06 - INFO - codeparrot_training - Step 10556: {'lr': 0.0004483307380017838, 'samples': 5405184, 'steps': 10556, 'batch_loss/train': 0.9058896605856717} 12/24/2021 08:55:17 - INFO - codeparrot_training - Step 10557: {'lr': 0.0004483211409951902, 'samples': 5405696, 'steps': 10557, 'batch_loss/train': 0.9302240200340748} 12/24/2021 08:55:29 - INFO - codeparrot_training - Step 10558: {'lr': 0.00044831154320014644, 'samples': 5406208, 'steps': 10558, 'batch_loss/train': 0.8224490424618125} 12/24/2021 08:55:39 - INFO - codeparrot_training - Step 10559: {'lr': 0.000448301944616691, 'samples': 5406720, 'steps': 10559, 'batch_loss/train': 0.8154609343037009} 12/24/2021 08:55:50 - INFO - codeparrot_training - Step 10560: {'lr': 0.0004482923452448619, 
'samples': 5407232, 'steps': 10560, 'batch_loss/train': 0.8805026914924383} 12/24/2021 08:56:00 - INFO - codeparrot_training - Step 10561: {'lr': 0.0004482827450846972, 'samples': 5407744, 'steps': 10561, 'batch_loss/train': 0.8072382817044854} 12/24/2021 08:56:14 - INFO - codeparrot_training - Step 10562: {'lr': 0.00044827314413623535, 'samples': 5408256, 'steps': 10562, 'batch_loss/train': 0.7830020184628665} 12/24/2021 08:56:25 - INFO - codeparrot_training - Step 10563: {'lr': 0.00044826354239951426, 'samples': 5408768, 'steps': 10563, 'batch_loss/train': 0.8367207692936063} 12/24/2021 08:56:35 - INFO - codeparrot_training - Step 10564: {'lr': 0.0004482539398745722, 'samples': 5409280, 'steps': 10564, 'batch_loss/train': 0.8378134733065963} 12/24/2021 08:56:48 - INFO - codeparrot_training - Step 10565: {'lr': 0.00044824433656144726, 'samples': 5409792, 'steps': 10565, 'batch_loss/train': 0.955959090963006} 12/24/2021 08:56:58 - INFO - codeparrot_training - Step 10566: {'lr': 0.0004482347324601778, 'samples': 5410304, 'steps': 10566, 'batch_loss/train': 0.7753243464976549} 12/24/2021 08:57:09 - INFO - codeparrot_training - Step 10567: {'lr': 0.0004482251275708018, 'samples': 5410816, 'steps': 10567, 'batch_loss/train': 0.7879804763942957} 12/24/2021 08:57:23 - INFO - codeparrot_training - Step 10568: {'lr': 0.0004482155218933576, 'samples': 5411328, 'steps': 10568, 'batch_loss/train': 0.917374768294394} 12/24/2021 08:57:33 - INFO - codeparrot_training - Step 10569: {'lr': 0.0004482059154278833, 'samples': 5411840, 'steps': 10569, 'batch_loss/train': 0.6635491494089365} 12/24/2021 08:57:44 - INFO - codeparrot_training - Step 10570: {'lr': 0.0004481963081744171, 'samples': 5412352, 'steps': 10570, 'batch_loss/train': 0.7197638675570488} 12/24/2021 08:57:55 - INFO - codeparrot_training - Step 10571: {'lr': 0.00044818670013299725, 'samples': 5412864, 'steps': 10571, 'batch_loss/train': 0.8336974447593093} 12/24/2021 08:58:07 - INFO - codeparrot_training - Step 10572: 
{'lr': 0.0004481770913036619, 'samples': 5413376, 'steps': 10572, 'batch_loss/train': 0.6935882938560098} 12/24/2021 08:58:17 - INFO - codeparrot_training - Step 10573: {'lr': 0.00044816748168644927, 'samples': 5413888, 'steps': 10573, 'batch_loss/train': 0.8075859230011702} 12/24/2021 08:58:28 - INFO - codeparrot_training - Step 10574: {'lr': 0.0004481578712813976, 'samples': 5414400, 'steps': 10574, 'batch_loss/train': 0.7090329648926854} 12/24/2021 08:58:40 - INFO - codeparrot_training - Step 10575: {'lr': 0.000448148260088545, 'samples': 5414912, 'steps': 10575, 'batch_loss/train': 0.8335407320410013} 12/24/2021 08:58:51 - INFO - codeparrot_training - Step 10576: {'lr': 0.0004481386481079297, 'samples': 5415424, 'steps': 10576, 'batch_loss/train': 0.9128653770312667} 12/24/2021 08:59:01 - INFO - codeparrot_training - Step 10577: {'lr': 0.00044812903533959005, 'samples': 5415936, 'steps': 10577, 'batch_loss/train': 0.7971903928555548} 12/24/2021 08:59:12 - INFO - codeparrot_training - Step 10578: {'lr': 0.0004481194217835641, 'samples': 5416448, 'steps': 10578, 'batch_loss/train': 0.8786277304170653} 12/24/2021 08:59:24 - INFO - codeparrot_training - Step 10579: {'lr': 0.00044810980743989015, 'samples': 5416960, 'steps': 10579, 'batch_loss/train': 1.2548747565597296} 12/24/2021 08:59:35 - INFO - codeparrot_training - Step 10580: {'lr': 0.0004481001923086064, 'samples': 5417472, 'steps': 10580, 'batch_loss/train': 0.8347665397450328} 12/24/2021 08:59:45 - INFO - codeparrot_training - Step 10581: {'lr': 0.0004480905763897511, 'samples': 5417984, 'steps': 10581, 'batch_loss/train': 0.8208323016297072} 12/24/2021 08:59:59 - INFO - codeparrot_training - Step 10582: {'lr': 0.0004480809596833625, 'samples': 5418496, 'steps': 10582, 'batch_loss/train': 0.7673409841954708} 12/24/2021 09:00:10 - INFO - codeparrot_training - Step 10583: {'lr': 0.00044807134218947866, 'samples': 5419008, 'steps': 10583, 'batch_loss/train': 0.7745740991085768} 12/24/2021 09:00:20 - INFO - 
codeparrot_training - Step 10584: {'lr': 0.0004480617239081381, 'samples': 5419520, 'steps': 10584, 'batch_loss/train': 0.7493897475069389} 12/24/2021 09:00:32 - INFO - codeparrot_training - Step 10585: {'lr': 0.0004480521048393789, 'samples': 5420032, 'steps': 10585, 'batch_loss/train': 0.7618426271365024} 12/24/2021 09:00:43 - INFO - codeparrot_training - Step 10586: {'lr': 0.0004480424849832392, 'samples': 5420544, 'steps': 10586, 'batch_loss/train': 0.8723383615724742} 12/24/2021 09:00:53 - INFO - codeparrot_training - Step 10587: {'lr': 0.00044803286433975744, 'samples': 5421056, 'steps': 10587, 'batch_loss/train': 0.8150781719014049} 12/24/2021 09:01:06 - INFO - codeparrot_training - Step 10588: {'lr': 0.0004480232429089718, 'samples': 5421568, 'steps': 10588, 'batch_loss/train': 0.8170482330024242} 12/24/2021 09:01:16 - INFO - codeparrot_training - Step 10589: {'lr': 0.00044801362069092046, 'samples': 5422080, 'steps': 10589, 'batch_loss/train': 0.843676658347249} 12/24/2021 09:01:27 - INFO - codeparrot_training - Step 10590: {'lr': 0.00044800399768564177, 'samples': 5422592, 'steps': 10590, 'batch_loss/train': 1.3380912691354752} 12/24/2021 09:01:38 - INFO - codeparrot_training - Step 10591: {'lr': 0.0004479943738931739, 'samples': 5423104, 'steps': 10591, 'batch_loss/train': 0.9472064170986414} 12/24/2021 09:01:51 - INFO - codeparrot_training - Step 10592: {'lr': 0.0004479847493135553, 'samples': 5423616, 'steps': 10592, 'batch_loss/train': 0.7896604961715639} 12/24/2021 09:02:02 - INFO - codeparrot_training - Step 10593: {'lr': 0.0004479751239468239, 'samples': 5424128, 'steps': 10593, 'batch_loss/train': 0.7387141897343099} 12/24/2021 09:02:13 - INFO - codeparrot_training - Step 10594: {'lr': 0.0004479654977930183, 'samples': 5424640, 'steps': 10594, 'batch_loss/train': 0.8608713671565056} 12/24/2021 09:02:25 - INFO - codeparrot_training - Step 10595: {'lr': 0.0004479558708521766, 'samples': 5425152, 'steps': 10595, 'batch_loss/train': 
0.8072370490990579} 12/24/2021 09:02:35 - INFO - codeparrot_training - Step 10596: {'lr': 0.0004479462431243371, 'samples': 5425664, 'steps': 10596, 'batch_loss/train': 0.8613957017660141} 12/24/2021 09:02:46 - INFO - codeparrot_training - Step 10597: {'lr': 0.00044793661460953803, 'samples': 5426176, 'steps': 10597, 'batch_loss/train': 0.854718470480293} 12/24/2021 09:03:00 - INFO - codeparrot_training - Step 10598: {'lr': 0.0004479269853078177, 'samples': 5426688, 'steps': 10598, 'batch_loss/train': 0.838535844348371} 12/24/2021 09:03:10 - INFO - codeparrot_training - Step 10599: {'lr': 0.0004479173552192145, 'samples': 5427200, 'steps': 10599, 'batch_loss/train': 0.7889236360788345} 12/24/2021 09:03:21 - INFO - codeparrot_training - Step 10600: {'lr': 0.0004479077243437666, 'samples': 5427712, 'steps': 10600, 'batch_loss/train': 0.7101184637285769} 12/24/2021 09:03:32 - INFO - codeparrot_training - Step 10601: {'lr': 0.00044789809268151227, 'samples': 5428224, 'steps': 10601, 'batch_loss/train': 0.8260312508791685} 12/24/2021 09:03:44 - INFO - codeparrot_training - Step 10602: {'lr': 0.0004478884602324899, 'samples': 5428736, 'steps': 10602, 'batch_loss/train': 1.297970243729651} 12/24/2021 09:03:55 - INFO - codeparrot_training - Step 10603: {'lr': 0.0004478788269967377, 'samples': 5429248, 'steps': 10603, 'batch_loss/train': 0.7521772449836135} 12/24/2021 09:04:05 - INFO - codeparrot_training - Step 10604: {'lr': 0.00044786919297429403, 'samples': 5429760, 'steps': 10604, 'batch_loss/train': 1.144776126369834} 12/24/2021 09:04:17 - INFO - codeparrot_training - Step 10605: {'lr': 0.00044785955816519716, 'samples': 5430272, 'steps': 10605, 'batch_loss/train': 0.7983669609529898} 12/24/2021 09:04:28 - INFO - codeparrot_training - Step 10606: {'lr': 0.0004478499225694854, 'samples': 5430784, 'steps': 10606, 'batch_loss/train': 0.8182103130966425} 12/24/2021 09:04:38 - INFO - codeparrot_training - Step 10607: {'lr': 0.00044784028618719696, 'samples': 5431296, 
'steps': 10607, 'batch_loss/train': 0.8467282885685563} 12/24/2021 09:04:52 - INFO - codeparrot_training - Step 10608: {'lr': 0.0004478306490183703, 'samples': 5431808, 'steps': 10608, 'batch_loss/train': 0.9414431154727936} 12/24/2021 09:05:03 - INFO - codeparrot_training - Step 10609: {'lr': 0.00044782101106304376, 'samples': 5432320, 'steps': 10609, 'batch_loss/train': 1.0579513502307236} 12/24/2021 09:05:14 - INFO - codeparrot_training - Step 10610: {'lr': 0.0004478113723212555, 'samples': 5432832, 'steps': 10610, 'batch_loss/train': 0.8264895798638463} 12/24/2021 09:05:24 - INFO - codeparrot_training - Step 10611: {'lr': 0.0004478017327930439, 'samples': 5433344, 'steps': 10611, 'batch_loss/train': 0.8902961339335889} 12/24/2021 09:05:36 - INFO - codeparrot_training - Step 10612: {'lr': 0.00044779209247844734, 'samples': 5433856, 'steps': 10612, 'batch_loss/train': 0.8916475251317024} 12/24/2021 09:05:47 - INFO - codeparrot_training - Step 10613: {'lr': 0.0004477824513775041, 'samples': 5434368, 'steps': 10613, 'batch_loss/train': 0.8635128643363714} 12/24/2021 09:05:57 - INFO - codeparrot_training - Step 10614: {'lr': 0.0004477728094902525, 'samples': 5434880, 'steps': 10614, 'batch_loss/train': 0.7734377272427082} 12/24/2021 09:06:09 - INFO - codeparrot_training - Step 10615: {'lr': 0.0004477631668167309, 'samples': 5435392, 'steps': 10615, 'batch_loss/train': 0.6912528977263719} 12/24/2021 09:06:20 - INFO - codeparrot_training - Step 10616: {'lr': 0.00044775352335697755, 'samples': 5435904, 'steps': 10616, 'batch_loss/train': 0.7631704900413752} 12/24/2021 09:06:30 - INFO - codeparrot_training - Step 10617: {'lr': 0.00044774387911103095, 'samples': 5436416, 'steps': 10617, 'batch_loss/train': 0.9497113227844238} 12/24/2021 09:06:44 - INFO - codeparrot_training - Step 10618: {'lr': 0.00044773423407892934, 'samples': 5436928, 'steps': 10618, 'batch_loss/train': 0.8919780347496271} 12/24/2021 09:06:55 - INFO - codeparrot_training - Step 10619: {'lr': 
0.000447724588260711, 'samples': 5437440, 'steps': 10619, 'batch_loss/train': 0.9074831353500485} 12/24/2021 09:07:06 - INFO - codeparrot_training - Step 10620: {'lr': 0.0004477149416564144, 'samples': 5437952, 'steps': 10620, 'batch_loss/train': 0.7401685980148613} 12/24/2021 09:07:18 - INFO - codeparrot_training - Step 10621: {'lr': 0.0004477052942660779, 'samples': 5438464, 'steps': 10621, 'batch_loss/train': 0.8670901628211141} 12/24/2021 09:07:28 - INFO - codeparrot_training - Step 10622: {'lr': 0.00044769564608973967, 'samples': 5438976, 'steps': 10622, 'batch_loss/train': 0.8266981709748507} 12/24/2021 09:07:39 - INFO - codeparrot_training - Step 10623: {'lr': 0.0004476859971274383, 'samples': 5439488, 'steps': 10623, 'batch_loss/train': 0.8723134370520711} 12/24/2021 09:07:49 - INFO - codeparrot_training - Step 10624: {'lr': 0.000447676347379212, 'samples': 5440000, 'steps': 10624, 'batch_loss/train': 0.8394975904375315} 12/24/2021 09:08:01 - INFO - codeparrot_training - Step 10625: {'lr': 0.00044766669684509925, 'samples': 5440512, 'steps': 10625, 'batch_loss/train': 1.1346929529681802} 12/24/2021 09:08:12 - INFO - codeparrot_training - Step 10626: {'lr': 0.0004476570455251383, 'samples': 5441024, 'steps': 10626, 'batch_loss/train': 1.330557193607092} 12/24/2021 09:08:23 - INFO - codeparrot_training - Step 10627: {'lr': 0.00044764739341936763, 'samples': 5441536, 'steps': 10627, 'batch_loss/train': 0.8242015764117241} 12/24/2021 09:08:35 - INFO - codeparrot_training - Step 10628: {'lr': 0.00044763774052782544, 'samples': 5442048, 'steps': 10628, 'batch_loss/train': 0.8164734728634357} 12/24/2021 09:08:45 - INFO - codeparrot_training - Step 10629: {'lr': 0.00044762808685055034, 'samples': 5442560, 'steps': 10629, 'batch_loss/train': 0.9385953210294247} 12/24/2021 09:08:56 - INFO - codeparrot_training - Step 10630: {'lr': 0.0004476184323875805, 'samples': 5443072, 'steps': 10630, 'batch_loss/train': 0.809606684371829} 12/24/2021 09:09:06 - INFO - 
codeparrot_training - Step 10631: {'lr': 0.0004476087771389544, 'samples': 5443584, 'steps': 10631, 'batch_loss/train': 0.8542956635355949} 12/24/2021 09:09:20 - INFO - codeparrot_training - Step 10632: {'lr': 0.0004475991211047105, 'samples': 5444096, 'steps': 10632, 'batch_loss/train': 0.7641462567262352} 12/24/2021 09:09:31 - INFO - codeparrot_training - Step 10633: {'lr': 0.000447589464284887, 'samples': 5444608, 'steps': 10633, 'batch_loss/train': 0.8749074535444379} 12/24/2021 09:09:41 - INFO - codeparrot_training - Step 10634: {'lr': 0.0004475798066795225, 'samples': 5445120, 'steps': 10634, 'batch_loss/train': 0.8058038502931595} 12/24/2021 09:09:53 - INFO - codeparrot_training - Step 10635: {'lr': 0.00044757014828865517, 'samples': 5445632, 'steps': 10635, 'batch_loss/train': 0.86651870328933} 12/24/2021 09:10:04 - INFO - codeparrot_training - Step 10636: {'lr': 0.0004475604891123236, 'samples': 5446144, 'steps': 10636, 'batch_loss/train': 0.7727127214893699} 12/24/2021 09:10:15 - INFO - codeparrot_training - Step 10637: {'lr': 0.0004475508291505661, 'samples': 5446656, 'steps': 10637, 'batch_loss/train': 0.7786223802249879} 12/24/2021 09:10:29 - INFO - codeparrot_training - Step 10638: {'lr': 0.0004475411684034212, 'samples': 5447168, 'steps': 10638, 'batch_loss/train': 0.9015214117243886} 12/24/2021 09:10:39 - INFO - codeparrot_training - Step 10639: {'lr': 0.00044753150687092705, 'samples': 5447680, 'steps': 10639, 'batch_loss/train': 0.8154057525098324} 12/24/2021 09:10:50 - INFO - codeparrot_training - Step 10640: {'lr': 0.0004475218445531223, 'samples': 5448192, 'steps': 10640, 'batch_loss/train': 0.7619993423577398} 12/24/2021 09:11:02 - INFO - codeparrot_training - Step 10641: {'lr': 0.00044751218145004526, 'samples': 5448704, 'steps': 10641, 'batch_loss/train': 0.7352472424972802} 12/24/2021 09:11:13 - INFO - codeparrot_training - Step 10642: {'lr': 0.0004475025175617343, 'samples': 5449216, 'steps': 10642, 'batch_loss/train': 0.8878405820578337} 
12/24/2021 09:11:23 - INFO - codeparrot_training - Step 10643: {'lr': 0.000447492852888228, 'samples': 5449728, 'steps': 10643, 'batch_loss/train': 0.7383301528170705} 12/24/2021 09:11:34 - INFO - codeparrot_training - Step 10644: {'lr': 0.0004474831874295647, 'samples': 5450240, 'steps': 10644, 'batch_loss/train': 0.8313696915283799} 12/24/2021 09:11:46 - INFO - codeparrot_training - Step 10645: {'lr': 0.0004474735211857827, 'samples': 5450752, 'steps': 10645, 'batch_loss/train': 0.7802159637212753} 12/24/2021 09:11:56 - INFO - codeparrot_training - Step 10646: {'lr': 0.0004474638541569206, 'samples': 5451264, 'steps': 10646, 'batch_loss/train': 0.9196296576410532} 12/24/2021 09:12:07 - INFO - codeparrot_training - Step 10647: {'lr': 0.0004474541863430168, 'samples': 5451776, 'steps': 10647, 'batch_loss/train': 0.7170592537149787} 12/24/2021 09:12:21 - INFO - codeparrot_training - Step 10648: {'lr': 0.00044744451774410965, 'samples': 5452288, 'steps': 10648, 'batch_loss/train': 0.8166247131302953} 12/24/2021 09:12:31 - INFO - codeparrot_training - Step 10649: {'lr': 0.0004474348483602377, 'samples': 5452800, 'steps': 10649, 'batch_loss/train': 0.827776487916708} 12/24/2021 09:12:42 - INFO - codeparrot_training - Step 10650: {'lr': 0.00044742517819143935, 'samples': 5453312, 'steps': 10650, 'batch_loss/train': 0.784471915801987} 12/24/2021 09:12:54 - INFO - codeparrot_training - Step 10651: {'lr': 0.00044741550723775305, 'samples': 5453824, 'steps': 10651, 'batch_loss/train': 0.8390163835138083} 12/24/2021 09:13:04 - INFO - codeparrot_training - Step 10652: {'lr': 0.00044740583549921723, 'samples': 5454336, 'steps': 10652, 'batch_loss/train': 0.8903054390102625} 12/24/2021 09:13:15 - INFO - codeparrot_training - Step 10653: {'lr': 0.0004473961629758703, 'samples': 5454848, 'steps': 10653, 'batch_loss/train': 0.951248818077147} 12/24/2021 09:13:25 - INFO - codeparrot_training - Step 10654: {'lr': 0.00044738648966775077, 'samples': 5455360, 'steps': 10654, 
'batch_loss/train': 0.7882992904633284} 12/24/2021 09:13:39 - INFO - codeparrot_training - Step 10655: {'lr': 0.00044737681557489704, 'samples': 5455872, 'steps': 10655, 'batch_loss/train': 0.9025096725672483} 12/24/2021 09:13:50 - INFO - codeparrot_training - Step 10656: {'lr': 0.00044736714069734776, 'samples': 5456384, 'steps': 10656, 'batch_loss/train': 0.8937898403964937} 12/24/2021 09:14:00 - INFO - codeparrot_training - Step 10657: {'lr': 0.00044735746503514115, 'samples': 5456896, 'steps': 10657, 'batch_loss/train': 0.783650086261332} 12/24/2021 09:14:12 - INFO - codeparrot_training - Step 10658: {'lr': 0.00044734778858831575, 'samples': 5457408, 'steps': 10658, 'batch_loss/train': 0.7006830698810518} 12/24/2021 09:14:23 - INFO - codeparrot_training - Step 10659: {'lr': 0.0004473381113569101, 'samples': 5457920, 'steps': 10659, 'batch_loss/train': 0.8627926111221313} 12/24/2021 09:14:34 - INFO - codeparrot_training - Step 10660: {'lr': 0.00044732843334096265, 'samples': 5458432, 'steps': 10660, 'batch_loss/train': 0.8557274211198092} 12/24/2021 09:14:46 - INFO - codeparrot_training - Step 10661: {'lr': 0.0004473187545405119, 'samples': 5458944, 'steps': 10661, 'batch_loss/train': 0.7673217905685306} 12/24/2021 09:14:56 - INFO - codeparrot_training - Step 10662: {'lr': 0.00044730907495559616, 'samples': 5459456, 'steps': 10662, 'batch_loss/train': 0.8474106728099287} 12/24/2021 09:15:07 - INFO - codeparrot_training - Step 10663: {'lr': 0.00044729939458625415, 'samples': 5459968, 'steps': 10663, 'batch_loss/train': 0.8636796236969531} 12/24/2021 09:15:18 - INFO - codeparrot_training - Step 10664: {'lr': 0.0004472897134325242, 'samples': 5460480, 'steps': 10664, 'batch_loss/train': 0.7362259281799197} 12/24/2021 09:15:30 - INFO - codeparrot_training - Step 10665: {'lr': 0.00044728003149444484, 'samples': 5460992, 'steps': 10665, 'batch_loss/train': 0.869433410000056} 12/24/2021 09:15:40 - INFO - codeparrot_training - Step 10666: {'lr': 0.0004472703487720545, 
'samples': 5461504, 'steps': 10666, 'batch_loss/train': 0.7741384468972683} 12/24/2021 09:15:51 - INFO - codeparrot_training - Step 10667: {'lr': 0.00044726066526539187, 'samples': 5462016, 'steps': 10667, 'batch_loss/train': 1.001711587421596} 12/24/2021 09:16:05 - INFO - codeparrot_training - Step 10668: {'lr': 0.0004472509809744952, 'samples': 5462528, 'steps': 10668, 'batch_loss/train': 0.8715832494199276} 12/24/2021 09:16:15 - INFO - codeparrot_training - Step 10669: {'lr': 0.0004472412958994032, 'samples': 5463040, 'steps': 10669, 'batch_loss/train': 0.7709946008399129} 12/24/2021 09:16:26 - INFO - codeparrot_training - Step 10670: {'lr': 0.0004472316100401542, 'samples': 5463552, 'steps': 10670, 'batch_loss/train': 0.8334818771108985} 12/24/2021 09:16:38 - INFO - codeparrot_training - Step 10671: {'lr': 0.0004472219233967869, 'samples': 5464064, 'steps': 10671, 'batch_loss/train': 0.8328980440273881} 12/24/2021 09:16:49 - INFO - codeparrot_training - Step 10672: {'lr': 0.00044721223596933955, 'samples': 5464576, 'steps': 10672, 'batch_loss/train': 0.9352585691958666} 12/24/2021 09:16:59 - INFO - codeparrot_training - Step 10673: {'lr': 0.00044720254775785096, 'samples': 5465088, 'steps': 10673, 'batch_loss/train': 0.8406273336149752} 12/24/2021 09:17:10 - INFO - codeparrot_training - Step 10674: {'lr': 0.00044719285876235947, 'samples': 5465600, 'steps': 10674, 'batch_loss/train': 0.8244474912062287} 12/24/2021 09:17:22 - INFO - codeparrot_training - Step 10675: {'lr': 0.00044718316898290355, 'samples': 5466112, 'steps': 10675, 'batch_loss/train': 0.8251267038285732} 12/24/2021 09:17:32 - INFO - codeparrot_training - Step 10676: {'lr': 0.0004471734784195219, 'samples': 5466624, 'steps': 10676, 'batch_loss/train': 0.7952269311062992} 12/24/2021 09:17:43 - INFO - codeparrot_training - Step 10677: {'lr': 0.0004471637870722529, 'samples': 5467136, 'steps': 10677, 'batch_loss/train': 0.8038314506411552} 12/24/2021 09:17:57 - INFO - codeparrot_training - Step 
10678: {'lr': 0.00044715409494113515, 'samples': 5467648, 'steps': 10678, 'batch_loss/train': 0.8667679252102971} 12/24/2021 09:18:07 - INFO - codeparrot_training - Step 10679: {'lr': 0.00044714440202620715, 'samples': 5468160, 'steps': 10679, 'batch_loss/train': 0.8296621972694993} 12/24/2021 09:18:18 - INFO - codeparrot_training - Step 10680: {'lr': 0.00044713470832750745, 'samples': 5468672, 'steps': 10680, 'batch_loss/train': 0.7843904001638293} 12/24/2021 09:18:30 - INFO - codeparrot_training - Step 10681: {'lr': 0.0004471250138450746, 'samples': 5469184, 'steps': 10681, 'batch_loss/train': 0.8224337678402662} 12/24/2021 09:18:41 - INFO - codeparrot_training - Step 10682: {'lr': 0.00044711531857894707, 'samples': 5469696, 'steps': 10682, 'batch_loss/train': 0.7592057818546891} 12/24/2021 09:18:51 - INFO - codeparrot_training - Step 10683: {'lr': 0.0004471056225291635, 'samples': 5470208, 'steps': 10683, 'batch_loss/train': 0.91695761308074} 12/24/2021 09:19:03 - INFO - codeparrot_training - Step 10684: {'lr': 0.0004470959256957624, 'samples': 5470720, 'steps': 10684, 'batch_loss/train': 0.7806129762902856} 12/24/2021 09:19:14 - INFO - codeparrot_training - Step 10685: {'lr': 0.0004470862280787823, 'samples': 5471232, 'steps': 10685, 'batch_loss/train': 0.8362739011645317} 12/24/2021 09:19:25 - INFO - codeparrot_training - Step 10686: {'lr': 0.00044707652967826184, 'samples': 5471744, 'steps': 10686, 'batch_loss/train': 0.7563408333808184} 12/24/2021 09:19:35 - INFO - codeparrot_training - Step 10687: {'lr': 0.00044706683049423944, 'samples': 5472256, 'steps': 10687, 'batch_loss/train': 0.8299167482182384} 12/24/2021 09:19:49 - INFO - codeparrot_training - Step 10688: {'lr': 0.00044705713052675374, 'samples': 5472768, 'steps': 10688, 'batch_loss/train': 0.8645736146718264} 12/24/2021 09:20:00 - INFO - codeparrot_training - Step 10689: {'lr': 0.00044704742977584334, 'samples': 5473280, 'steps': 10689, 'batch_loss/train': 0.7394348422531039} 12/24/2021 09:20:10 - 
INFO - codeparrot_training - Step 10690: {'lr': 0.0004470377282415467, 'samples': 5473792, 'steps': 10690, 'batch_loss/train': 0.760145265609026} 12/24/2021 09:20:22 - INFO - codeparrot_training - Step 10691: {'lr': 0.00044702802592390243, 'samples': 5474304, 'steps': 10691, 'batch_loss/train': 0.8815788766369224} 12/24/2021 09:20:33 - INFO - codeparrot_training - Step 10692: {'lr': 0.0004470183228229492, 'samples': 5474816, 'steps': 10692, 'batch_loss/train': 0.8140768520534039} 12/24/2021 09:20:43 - INFO - codeparrot_training - Step 10693: {'lr': 0.00044700861893872545, 'samples': 5475328, 'steps': 10693, 'batch_loss/train': 0.8426627519074827} 12/24/2021 09:20:57 - INFO - codeparrot_training - Step 10694: {'lr': 0.0004469989142712699, 'samples': 5475840, 'steps': 10694, 'batch_loss/train': 0.8200421996880323} 12/24/2021 09:21:08 - INFO - codeparrot_training - Step 10695: {'lr': 0.000446989208820621, 'samples': 5476352, 'steps': 10695, 'batch_loss/train': 0.8052851594984531} 12/24/2021 09:21:19 - INFO - codeparrot_training - Step 10696: {'lr': 0.0004469795025868174, 'samples': 5476864, 'steps': 10696, 'batch_loss/train': 0.7861223062500358} 12/24/2021 09:21:31 - INFO - codeparrot_training - Step 10697: {'lr': 0.00044696979556989764, 'samples': 5477376, 'steps': 10697, 'batch_loss/train': 0.8572793249040842} 12/24/2021 09:21:41 - INFO - codeparrot_training - Step 10698: {'lr': 0.0004469600877699004, 'samples': 5477888, 'steps': 10698, 'batch_loss/train': 0.8237648941576481} 12/24/2021 09:21:52 - INFO - codeparrot_training - Step 10699: {'lr': 0.0004469503791868642, 'samples': 5478400, 'steps': 10699, 'batch_loss/train': 0.7778997307177633} 12/24/2021 09:22:02 - INFO - codeparrot_training - Step 10700: {'lr': 0.0004469406698208276, 'samples': 5478912, 'steps': 10700, 'batch_loss/train': 0.6855802346253768} 12/24/2021 09:22:15 - INFO - codeparrot_training - Step 10701: {'lr': 0.00044693095967182926, 'samples': 5479424, 'steps': 10701, 'batch_loss/train': 
0.9241413064301014} 12/24/2021 09:22:25 - INFO - codeparrot_training - Step 10702: {'lr': 0.00044692124873990793, 'samples': 5479936, 'steps': 10702, 'batch_loss/train': 0.886272638104856} 12/24/2021 09:22:36 - INFO - codeparrot_training - Step 10703: {'lr': 0.0004469115370251019, 'samples': 5480448, 'steps': 10703, 'batch_loss/train': 0.8849778024014086} 12/24/2021 09:22:50 - INFO - codeparrot_training - Step 10704: {'lr': 0.00044690182452745004, 'samples': 5480960, 'steps': 10704, 'batch_loss/train': 0.8004162358120084} 12/24/2021 09:23:00 - INFO - codeparrot_training - Step 10705: {'lr': 0.00044689211124699084, 'samples': 5481472, 'steps': 10705, 'batch_loss/train': 0.7759924018755555} 12/24/2021 09:23:11 - INFO - codeparrot_training - Step 10706: {'lr': 0.000446882397183763, 'samples': 5481984, 'steps': 10706, 'batch_loss/train': 0.7923445971682668} 12/24/2021 09:23:24 - INFO - codeparrot_training - Step 10707: {'lr': 0.00044687268233780503, 'samples': 5482496, 'steps': 10707, 'batch_loss/train': 0.7712407680228353} 12/24/2021 09:23:35 - INFO - codeparrot_training - Step 10708: {'lr': 0.0004468629667091556, 'samples': 5483008, 'steps': 10708, 'batch_loss/train': 0.6322430008440278} 12/24/2021 09:23:45 - INFO - codeparrot_training - Step 10709: {'lr': 0.0004468532502978534, 'samples': 5483520, 'steps': 10709, 'batch_loss/train': 0.7189806001260877} 12/24/2021 09:23:56 - INFO - codeparrot_training - Step 10710: {'lr': 0.000446843533103937, 'samples': 5484032, 'steps': 10710, 'batch_loss/train': 0.844528921879828} 12/24/2021 09:24:08 - INFO - codeparrot_training - Step 10711: {'lr': 0.00044683381512744503, 'samples': 5484544, 'steps': 10711, 'batch_loss/train': 1.2104329098947346} 12/24/2021 09:24:19 - INFO - codeparrot_training - Step 10712: {'lr': 0.00044682409636841614, 'samples': 5485056, 'steps': 10712, 'batch_loss/train': 0.9334377571940422} 12/24/2021 09:24:29 - INFO - codeparrot_training - Step 10713: {'lr': 0.0004468143768268889, 'samples': 5485568, 
'steps': 10713, 'batch_loss/train': 0.7100014421157539} 12/24/2021 09:24:40 - INFO - codeparrot_training - Step 10714: {'lr': 0.00044680465650290204, 'samples': 5486080, 'steps': 10714, 'batch_loss/train': 0.8048262614756823} 12/24/2021 09:24:54 - INFO - codeparrot_training - Step 10715: {'lr': 0.0004467949353964942, 'samples': 5486592, 'steps': 10715, 'batch_loss/train': 0.7717455113306642} 12/24/2021 09:25:05 - INFO - codeparrot_training - Step 10716: {'lr': 0.00044678521350770403, 'samples': 5487104, 'steps': 10716, 'batch_loss/train': 0.8716578083112836} 12/24/2021 09:25:15 - INFO - codeparrot_training - Step 10717: {'lr': 0.0004467754908365701, 'samples': 5487616, 'steps': 10717, 'batch_loss/train': 0.83676578104496} 12/24/2021 09:25:27 - INFO - codeparrot_training - Step 10718: {'lr': 0.0004467657673831311, 'samples': 5488128, 'steps': 10718, 'batch_loss/train': 0.8599374089390039} 12/24/2021 09:25:38 - INFO - codeparrot_training - Step 10719: {'lr': 0.0004467560431474258, 'samples': 5488640, 'steps': 10719, 'batch_loss/train': 0.8468096276046708} 12/24/2021 09:25:48 - INFO - codeparrot_training - Step 10720: {'lr': 0.0004467463181294927, 'samples': 5489152, 'steps': 10720, 'batch_loss/train': 0.7650394537486136} 12/24/2021 09:26:01 - INFO - codeparrot_training - Step 10721: {'lr': 0.0004467365923293705, 'samples': 5489664, 'steps': 10721, 'batch_loss/train': 0.8472424913197756} 12/24/2021 09:26:11 - INFO - codeparrot_training - Step 10722: {'lr': 0.0004467268657470979, 'samples': 5490176, 'steps': 10722, 'batch_loss/train': 0.9371135188266635} 12/24/2021 09:26:22 - INFO - codeparrot_training - Step 10723: {'lr': 0.00044671713838271367, 'samples': 5490688, 'steps': 10723, 'batch_loss/train': 0.7303194017149508} 12/24/2021 09:26:36 - INFO - codeparrot_training - Step 10724: {'lr': 0.0004467074102362563, 'samples': 5491200, 'steps': 10724, 'batch_loss/train': 0.901885730214417} 12/24/2021 09:26:46 - INFO - codeparrot_training - Step 10725: {'lr': 
0.00044669768130776446, 'samples': 5491712, 'steps': 10725, 'batch_loss/train': 0.7903814218007028} 12/24/2021 09:26:57 - INFO - codeparrot_training - Step 10726: {'lr': 0.000446687951597277, 'samples': 5492224, 'steps': 10726, 'batch_loss/train': 0.905975298024714} 12/24/2021 09:27:07 - INFO - codeparrot_training - Step 10727: {'lr': 0.0004466782211048325, 'samples': 5492736, 'steps': 10727, 'batch_loss/train': 0.8839591145515442} 12/24/2021 09:27:19 - INFO - codeparrot_training - Step 10728: {'lr': 0.0004466684898304697, 'samples': 5493248, 'steps': 10728, 'batch_loss/train': 0.7837884039618075} 12/24/2021 09:27:30 - INFO - codeparrot_training - Step 10729: {'lr': 0.00044665875777422716, 'samples': 5493760, 'steps': 10729, 'batch_loss/train': 0.7982235653325915} 12/24/2021 09:27:41 - INFO - codeparrot_training - Step 10730: {'lr': 0.00044664902493614367, 'samples': 5494272, 'steps': 10730, 'batch_loss/train': 0.7623108280822635} 12/24/2021 09:27:53 - INFO - codeparrot_training - Step 10731: {'lr': 0.000446639291316258, 'samples': 5494784, 'steps': 10731, 'batch_loss/train': 0.9720001267269254} 12/24/2021 09:28:03 - INFO - codeparrot_training - Step 10732: {'lr': 0.0004466295569146086, 'samples': 5495296, 'steps': 10732, 'batch_loss/train': 0.8351074680685997} 12/24/2021 09:28:14 - INFO - codeparrot_training - Step 10733: {'lr': 0.0004466198217312345, 'samples': 5495808, 'steps': 10733, 'batch_loss/train': 0.7695295957382768} 12/24/2021 09:28:28 - INFO - codeparrot_training - Step 10734: {'lr': 0.0004466100857661741, 'samples': 5496320, 'steps': 10734, 'batch_loss/train': 0.9335852302610874} 12/24/2021 09:28:38 - INFO - codeparrot_training - Step 10735: {'lr': 0.0004466003490194663, 'samples': 5496832, 'steps': 10735, 'batch_loss/train': 0.7459423836553469} 12/24/2021 09:28:49 - INFO - codeparrot_training - Step 10736: {'lr': 0.0004465906114911497, 'samples': 5497344, 'steps': 10736, 'batch_loss/train': 0.8494129767641425} 12/24/2021 09:29:01 - INFO - 
codeparrot_training - Step 10737: {'lr': 0.00044658087318126305, 'samples': 5497856, 'steps': 10737, 'batch_loss/train': 0.8460237625986338} 12/24/2021 09:29:11 - INFO - codeparrot_training - Step 10738: {'lr': 0.0004465711340898451, 'samples': 5498368, 'steps': 10738, 'batch_loss/train': 0.9133526682853699} 12/24/2021 09:29:22 - INFO - codeparrot_training - Step 10739: {'lr': 0.0004465613942169346, 'samples': 5498880, 'steps': 10739, 'batch_loss/train': 0.9649323578923941} 12/24/2021 09:29:33 - INFO - codeparrot_training - Step 10740: {'lr': 0.00044655165356257017, 'samples': 5499392, 'steps': 10740, 'batch_loss/train': 0.7490476332604885} 12/24/2021 09:29:47 - INFO - codeparrot_training - Step 10741: {'lr': 0.0004465419121267905, 'samples': 5499904, 'steps': 10741, 'batch_loss/train': 0.7276659072376788} 12/24/2021 09:29:57 - INFO - codeparrot_training - Step 10742: {'lr': 0.0004465321699096345, 'samples': 5500416, 'steps': 10742, 'batch_loss/train': 0.7760238200426102} 12/24/2021 09:30:08 - INFO - codeparrot_training - Step 10743: {'lr': 0.00044652242691114075, 'samples': 5500928, 'steps': 10743, 'batch_loss/train': 0.7388210189528763} 12/24/2021 09:30:20 - INFO - codeparrot_training - Step 10744: {'lr': 0.0004465126831313481, 'samples': 5501440, 'steps': 10744, 'batch_loss/train': 0.8512135734781623} 12/24/2021 09:30:31 - INFO - codeparrot_training - Step 10745: {'lr': 0.0004465029385702952, 'samples': 5501952, 'steps': 10745, 'batch_loss/train': 0.7791887549683452} 12/24/2021 09:30:41 - INFO - codeparrot_training - Step 10746: {'lr': 0.0004464931932280208, 'samples': 5502464, 'steps': 10746, 'batch_loss/train': 0.8500005435198545} 12/24/2021 09:30:53 - INFO - codeparrot_training - Step 10747: {'lr': 0.0004464834471045636, 'samples': 5502976, 'steps': 10747, 'batch_loss/train': 0.8381902729161084} 12/24/2021 09:31:04 - INFO - codeparrot_training - Step 10748: {'lr': 0.0004464737001999625, 'samples': 5503488, 'steps': 10748, 'batch_loss/train': 
0.7749290927313268} 12/24/2021 09:31:14 - INFO - codeparrot_training - Step 10749: {'lr': 0.0004464639525142561, 'samples': 5504000, 'steps': 10749, 'batch_loss/train': 0.8670441480353475} 12/24/2021 09:31:25 - INFO - codeparrot_training - Step 10750: {'lr': 0.0004464542040474832, 'samples': 5504512, 'steps': 10750, 'batch_loss/train': 0.8347116885706782} 12/24/2021 09:31:37 - INFO - codeparrot_training - Step 10751: {'lr': 0.0004464444547996826, 'samples': 5505024, 'steps': 10751, 'batch_loss/train': 0.8655749396421015} 12/24/2021 09:31:48 - INFO - codeparrot_training - Step 10752: {'lr': 0.00044643470477089296, 'samples': 5505536, 'steps': 10752, 'batch_loss/train': 0.749332731240429} 12/24/2021 09:31:59 - INFO - codeparrot_training - Step 10753: {'lr': 0.0004464249539611532, 'samples': 5506048, 'steps': 10753, 'batch_loss/train': 0.6822500235866755} 12/24/2021 09:32:13 - INFO - codeparrot_training - Step 10754: {'lr': 0.00044641520237050195, 'samples': 5506560, 'steps': 10754, 'batch_loss/train': 0.8634938671020791} 12/24/2021 09:32:23 - INFO - codeparrot_training - Step 10755: {'lr': 0.0004464054499989779, 'samples': 5507072, 'steps': 10755, 'batch_loss/train': 0.8160443771630526} 12/24/2021 09:32:34 - INFO - codeparrot_training - Step 10756: {'lr': 0.00044639569684662005, 'samples': 5507584, 'steps': 10756, 'batch_loss/train': 0.8239875286817551} 12/24/2021 09:32:47 - INFO - codeparrot_training - Step 10757: {'lr': 0.00044638594291346713, 'samples': 5508096, 'steps': 10757, 'batch_loss/train': 0.7302989778108895} 12/24/2021 09:32:57 - INFO - codeparrot_training - Step 10758: {'lr': 0.00044637618819955773, 'samples': 5508608, 'steps': 10758, 'batch_loss/train': 0.8600463755428791} 12/24/2021 09:33:08 - INFO - codeparrot_training - Step 10759: {'lr': 0.0004463664327049308, 'samples': 5509120, 'steps': 10759, 'batch_loss/train': 3.8595864353701472} 12/24/2021 09:33:19 - INFO - codeparrot_training - Step 10760: {'lr': 0.0004463566764296251, 'samples': 5509632, 
'steps': 10760, 'batch_loss/train': 0.9670999366790056} 12/24/2021 09:33:31 - INFO - codeparrot_training - Step 10761: {'lr': 0.0004463469193736794, 'samples': 5510144, 'steps': 10761, 'batch_loss/train': 0.96539899520576} 12/24/2021 09:33:41 - INFO - codeparrot_training - Step 10762: {'lr': 0.0004463371615371325, 'samples': 5510656, 'steps': 10762, 'batch_loss/train': 0.7594765670364723} 12/24/2021 09:33:52 - INFO - codeparrot_training - Step 10763: {'lr': 0.00044632740292002314, 'samples': 5511168, 'steps': 10763, 'batch_loss/train': 0.8235981310717762} 12/24/2021 09:34:06 - INFO - codeparrot_training - Step 10764: {'lr': 0.0004463176435223901, 'samples': 5511680, 'steps': 10764, 'batch_loss/train': 2.02680457290262} 12/24/2021 09:34:16 - INFO - codeparrot_training - Step 10765: {'lr': 0.00044630788334427236, 'samples': 5512192, 'steps': 10765, 'batch_loss/train': 0.912223999388516} 12/24/2021 09:34:27 - INFO - codeparrot_training - Step 10766: {'lr': 0.00044629812238570853, 'samples': 5512704, 'steps': 10766, 'batch_loss/train': 0.8912660647183657} 12/24/2021 09:34:39 - INFO - codeparrot_training - Step 10767: {'lr': 0.0004462883606467375, 'samples': 5513216, 'steps': 10767, 'batch_loss/train': 0.8596058436669409} 12/24/2021 09:34:50 - INFO - codeparrot_training - Step 10768: {'lr': 0.000446278598127398, 'samples': 5513728, 'steps': 10768, 'batch_loss/train': 0.913379492238164} 12/24/2021 09:35:01 - INFO - codeparrot_training - Step 10769: {'lr': 0.000446268834827729, 'samples': 5514240, 'steps': 10769, 'batch_loss/train': 0.8937549740076065} 12/24/2021 09:35:11 - INFO - codeparrot_training - Step 10770: {'lr': 0.00044625907074776915, 'samples': 5514752, 'steps': 10770, 'batch_loss/train': 0.828069438226521} 12/24/2021 09:35:25 - INFO - codeparrot_training - Step 10771: {'lr': 0.00044624930588755743, 'samples': 5515264, 'steps': 10771, 'batch_loss/train': 0.973112441599369} 12/24/2021 09:35:36 - INFO - codeparrot_training - Step 10772: {'lr': 
0.0004462395402471325, 'samples': 5515776, 'steps': 10772, 'batch_loss/train': 0.9274201416410506} 12/24/2021 09:35:46 - INFO - codeparrot_training - Step 10773: {'lr': 0.0004462297738265333, 'samples': 5516288, 'steps': 10773, 'batch_loss/train': 0.764931860845536} 12/24/2021 09:35:59 - INFO - codeparrot_training - Step 10774: {'lr': 0.00044622000662579854, 'samples': 5516800, 'steps': 10774, 'batch_loss/train': 1.0648406958207488} 12/24/2021 09:36:09 - INFO - codeparrot_training - Step 10775: {'lr': 0.0004462102386449672, 'samples': 5517312, 'steps': 10775, 'batch_loss/train': 0.832618685439229} 12/24/2021 09:36:20 - INFO - codeparrot_training - Step 10776: {'lr': 0.000446200469884078, 'samples': 5517824, 'steps': 10776, 'batch_loss/train': 0.9229082828387618} 12/24/2021 09:36:32 - INFO - codeparrot_training - Step 10777: {'lr': 0.00044619070034316975, 'samples': 5518336, 'steps': 10777, 'batch_loss/train': 0.8358823852613568} 12/24/2021 09:36:42 - INFO - codeparrot_training - Step 10778: {'lr': 0.0004461809300222814, 'samples': 5518848, 'steps': 10778, 'batch_loss/train': 0.7189194373786449} 12/24/2021 09:36:53 - INFO - codeparrot_training - Step 10779: {'lr': 0.0004461711589214519, 'samples': 5519360, 'steps': 10779, 'batch_loss/train': 0.8693958595395088} 12/24/2021 09:37:04 - INFO - codeparrot_training - Step 10780: {'lr': 0.0004461613870407197, 'samples': 5519872, 'steps': 10780, 'batch_loss/train': 0.8215877953916788} 12/24/2021 09:37:18 - INFO - codeparrot_training - Step 10781: {'lr': 0.0004461516143801241, 'samples': 5520384, 'steps': 10781, 'batch_loss/train': 0.8258486269041896} 12/24/2021 09:37:28 - INFO - codeparrot_training - Step 10782: {'lr': 0.00044614184093970356, 'samples': 5520896, 'steps': 10782, 'batch_loss/train': 0.7986639821901917} 12/24/2021 09:37:39 - INFO - codeparrot_training - Step 10783: {'lr': 0.00044613206671949724, 'samples': 5521408, 'steps': 10783, 'batch_loss/train': 0.7631412289338186} 12/24/2021 09:37:51 - INFO - 
codeparrot_training - Step 10784: {'lr': 0.0004461222917195439, 'samples': 5521920, 'steps': 10784, 'batch_loss/train': 0.7913756985217333} 12/24/2021 09:38:01 - INFO - codeparrot_training - Step 10785: {'lr': 0.00044611251593988226, 'samples': 5522432, 'steps': 10785, 'batch_loss/train': 0.7926811063662171} 12/24/2021 09:38:12 - INFO - codeparrot_training - Step 10786: {'lr': 0.0004461027393805514, 'samples': 5522944, 'steps': 10786, 'batch_loss/train': 0.8561243424192071} 12/24/2021 09:38:24 - INFO - codeparrot_training - Step 10787: {'lr': 0.0004460929620415901, 'samples': 5523456, 'steps': 10787, 'batch_loss/train': 0.7901813229545951} 12/24/2021 09:38:35 - INFO - codeparrot_training - Step 10788: {'lr': 0.0004460831839230371, 'samples': 5523968, 'steps': 10788, 'batch_loss/train': 0.890609041787684} 12/24/2021 09:38:45 - INFO - codeparrot_training - Step 10789: {'lr': 0.0004460734050249315, 'samples': 5524480, 'steps': 10789, 'batch_loss/train': 0.8061244483105838} 12/24/2021 09:38:59 - INFO - codeparrot_training - Step 10790: {'lr': 0.00044606362534731214, 'samples': 5524992, 'steps': 10790, 'batch_loss/train': 0.7929441602900624} 12/24/2021 09:39:10 - INFO - codeparrot_training - Step 10791: {'lr': 0.00044605384489021776, 'samples': 5525504, 'steps': 10791, 'batch_loss/train': 0.8829972734674811} 12/24/2021 09:39:20 - INFO - codeparrot_training - Step 10792: {'lr': 0.00044604406365368734, 'samples': 5526016, 'steps': 10792, 'batch_loss/train': 0.7276713205501437} 12/24/2021 09:39:31 - INFO - codeparrot_training - Step 10793: {'lr': 0.0004460342816377597, 'samples': 5526528, 'steps': 10793, 'batch_loss/train': 0.8161604087799788} 12/24/2021 09:39:43 - INFO - codeparrot_training - Step 10794: {'lr': 0.00044602449884247384, 'samples': 5527040, 'steps': 10794, 'batch_loss/train': 0.8458502143621445} 12/24/2021 09:39:54 - INFO - codeparrot_training - Step 10795: {'lr': 0.00044601471526786854, 'samples': 5527552, 'steps': 10795, 'batch_loss/train': 
0.8544652827549726} 12/24/2021 09:40:04 - INFO - codeparrot_training - Step 10796: {'lr': 0.00044600493091398273, 'samples': 5528064, 'steps': 10796, 'batch_loss/train': 0.8474915400147438} 12/24/2021 09:40:16 - INFO - codeparrot_training - Step 10797: {'lr': 0.0004459951457808554, 'samples': 5528576, 'steps': 10797, 'batch_loss/train': 0.8334246864542365} 12/24/2021 09:40:27 - INFO - codeparrot_training - Step 10798: {'lr': 0.0004459853598685253, 'samples': 5529088, 'steps': 10798, 'batch_loss/train': 0.8378892135806382} 12/24/2021 09:40:37 - INFO - codeparrot_training - Step 10799: {'lr': 0.0004459755731770315, 'samples': 5529600, 'steps': 10799, 'batch_loss/train': 0.8318667802959681} 12/24/2021 09:40:51 - INFO - codeparrot_training - Step 10800: {'lr': 0.00044596578570641274, 'samples': 5530112, 'steps': 10800, 'batch_loss/train': 0.8668781742453575} 12/24/2021 09:41:02 - INFO - codeparrot_training - Step 10801: {'lr': 0.000445955997456708, 'samples': 5530624, 'steps': 10801, 'batch_loss/train': 0.9100549947470427} 12/24/2021 09:41:13 - INFO - codeparrot_training - Step 10802: {'lr': 0.0004459462084279562, 'samples': 5531136, 'steps': 10802, 'batch_loss/train': 0.8152145687490702} 12/24/2021 09:41:23 - INFO - codeparrot_training - Step 10803: {'lr': 0.00044593641862019636, 'samples': 5531648, 'steps': 10803, 'batch_loss/train': 0.8002207344397902} 12/24/2021 09:41:35 - INFO - codeparrot_training - Step 10804: {'lr': 0.00044592662803346716, 'samples': 5532160, 'steps': 10804, 'batch_loss/train': 0.81243612524122} 12/24/2021 09:41:46 - INFO - codeparrot_training - Step 10805: {'lr': 0.0004459168366678078, 'samples': 5532672, 'steps': 10805, 'batch_loss/train': 0.6852207900956273} 12/24/2021 09:41:57 - INFO - codeparrot_training - Step 10806: {'lr': 0.00044590704452325704, 'samples': 5533184, 'steps': 10806, 'batch_loss/train': 0.8224033461883664} 12/24/2021 09:42:09 - INFO - codeparrot_training - Step 10807: {'lr': 0.0004458972515998538, 'samples': 5533696, 
'steps': 10807, 'batch_loss/train': 0.7951122121885419} 12/24/2021 09:42:19 - INFO - codeparrot_training - Step 10808: {'lr': 0.0004458874578976371, 'samples': 5534208, 'steps': 10808, 'batch_loss/train': 0.8264442319050431} 12/24/2021 09:42:30 - INFO - codeparrot_training - Step 10809: {'lr': 0.00044587766341664574, 'samples': 5534720, 'steps': 10809, 'batch_loss/train': 0.8749741408973932} 12/24/2021 09:42:44 - INFO - codeparrot_training - Step 10810: {'lr': 0.0004458678681569188, 'samples': 5535232, 'steps': 10810, 'batch_loss/train': 0.7128149475902319} 12/24/2021 09:42:55 - INFO - codeparrot_training - Step 10811: {'lr': 0.00044585807211849524, 'samples': 5535744, 'steps': 10811, 'batch_loss/train': 0.7463586200028658} 12/24/2021 09:43:05 - INFO - codeparrot_training - Step 10812: {'lr': 0.0004458482753014139, 'samples': 5536256, 'steps': 10812, 'batch_loss/train': 0.7934063179418445} 12/24/2021 09:43:17 - INFO - codeparrot_training - Step 10813: {'lr': 0.00044583847770571374, 'samples': 5536768, 'steps': 10813, 'batch_loss/train': 0.8233286561444402} 12/24/2021 09:43:28 - INFO - codeparrot_training - Step 10814: {'lr': 0.00044582867933143377, 'samples': 5537280, 'steps': 10814, 'batch_loss/train': 0.8023870261386037} 12/24/2021 09:43:39 - INFO - codeparrot_training - Step 10815: {'lr': 0.0004458188801786129, 'samples': 5537792, 'steps': 10815, 'batch_loss/train': 0.977019976824522} 12/24/2021 09:43:49 - INFO - codeparrot_training - Step 10816: {'lr': 0.00044580908024729007, 'samples': 5538304, 'steps': 10816, 'batch_loss/train': 0.8743409449234605} 12/24/2021 09:44:01 - INFO - codeparrot_training - Step 10817: {'lr': 0.00044579927953750434, 'samples': 5538816, 'steps': 10817, 'batch_loss/train': 0.7849325931165367} 12/24/2021 09:44:12 - INFO - codeparrot_training - Step 10818: {'lr': 0.00044578947804929447, 'samples': 5539328, 'steps': 10818, 'batch_loss/train': 1.682756321504712} 12/24/2021 09:44:22 - INFO - codeparrot_training - Step 10819: {'lr': 
0.00044577967578269975, 'samples': 5539840, 'steps': 10819, 'batch_loss/train': 0.7580355703830719} 12/24/2021 09:44:36 - INFO - codeparrot_training - Step 10820: {'lr': 0.00044576987273775885, 'samples': 5540352, 'steps': 10820, 'batch_loss/train': 0.814213496632874} 12/24/2021 09:44:47 - INFO - codeparrot_training - Step 10821: {'lr': 0.00044576006891451083, 'samples': 5540864, 'steps': 10821, 'batch_loss/train': 0.7540136221796274} 12/24/2021 09:44:57 - INFO - codeparrot_training - Step 10822: {'lr': 0.00044575026431299474, 'samples': 5541376, 'steps': 10822, 'batch_loss/train': 0.8439429588615894} 12/24/2021 09:45:08 - INFO - codeparrot_training - Step 10823: {'lr': 0.00044574045893324956, 'samples': 5541888, 'steps': 10823, 'batch_loss/train': 0.8859980059787631} 12/24/2021 09:45:20 - INFO - codeparrot_training - Step 10824: {'lr': 0.0004457306527753141, 'samples': 5542400, 'steps': 10824, 'batch_loss/train': 0.9017994906753302} 12/24/2021 09:45:31 - INFO - codeparrot_training - Step 10825: {'lr': 0.0004457208458392276, 'samples': 5542912, 'steps': 10825, 'batch_loss/train': 0.7864972734823823} 12/24/2021 09:45:41 - INFO - codeparrot_training - Step 10826: {'lr': 0.00044571103812502885, 'samples': 5543424, 'steps': 10826, 'batch_loss/train': 0.6987049742601812} 12/24/2021 09:45:55 - INFO - codeparrot_training - Step 10827: {'lr': 0.00044570122963275694, 'samples': 5543936, 'steps': 10827, 'batch_loss/train': 0.7951440783217549} 12/24/2021 09:46:06 - INFO - codeparrot_training - Step 10828: {'lr': 0.0004456914203624508, 'samples': 5544448, 'steps': 10828, 'batch_loss/train': 0.8697733907029033} 12/24/2021 09:46:16 - INFO - codeparrot_training - Step 10829: {'lr': 0.00044568161031414944, 'samples': 5544960, 'steps': 10829, 'batch_loss/train': 0.7760400185361505} 12/24/2021 09:46:28 - INFO - codeparrot_training - Step 10830: {'lr': 0.00044567179948789193, 'samples': 5545472, 'steps': 10830, 'batch_loss/train': 0.8929698271676898} 12/24/2021 09:46:39 - INFO - 
codeparrot_training - Step 10831: {'lr': 0.00044566198788371724, 'samples': 5545984, 'steps': 10831, 'batch_loss/train': 0.8302591210231185} 12/24/2021 09:46:50 - INFO - codeparrot_training - Step 10832: {'lr': 0.0004456521755016643, 'samples': 5546496, 'steps': 10832, 'batch_loss/train': 0.8097523171454668} 12/24/2021 09:47:02 - INFO - codeparrot_training - Step 10833: {'lr': 0.00044564236234177225, 'samples': 5547008, 'steps': 10833, 'batch_loss/train': 0.7050154701573774} 12/24/2021 09:47:13 - INFO - codeparrot_training - Step 10834: {'lr': 0.0004456325484040801, 'samples': 5547520, 'steps': 10834, 'batch_loss/train': 0.558776949881576} 12/24/2021 09:47:23 - INFO - codeparrot_training - Step 10835: {'lr': 0.0004456227336886267, 'samples': 5548032, 'steps': 10835, 'batch_loss/train': 0.8192015569657087} 12/24/2021 09:47:34 - INFO - codeparrot_training - Step 10836: {'lr': 0.0004456129181954511, 'samples': 5548544, 'steps': 10836, 'batch_loss/train': 0.8169576050713658} 12/24/2021 09:47:46 - INFO - codeparrot_training - Step 10837: {'lr': 0.0004456031019245925, 'samples': 5549056, 'steps': 10837, 'batch_loss/train': 0.8075983813032508} 12/24/2021 09:47:56 - INFO - codeparrot_training - Step 10838: {'lr': 0.0004455932848760899, 'samples': 5549568, 'steps': 10838, 'batch_loss/train': 0.8819923540577292} 12/24/2021 09:48:07 - INFO - codeparrot_training - Step 10839: {'lr': 0.0004455834670499821, 'samples': 5550080, 'steps': 10839, 'batch_loss/train': 0.7288749376311898} 12/24/2021 09:48:21 - INFO - codeparrot_training - Step 10840: {'lr': 0.0004455736484463084, 'samples': 5550592, 'steps': 10840, 'batch_loss/train': 0.8309487430378795} 12/24/2021 09:48:32 - INFO - codeparrot_training - Step 10841: {'lr': 0.00044556382906510775, 'samples': 5551104, 'steps': 10841, 'batch_loss/train': 0.6681922012940049} 12/24/2021 09:48:42 - INFO - codeparrot_training - Step 10842: {'lr': 0.00044555400890641903, 'samples': 5551616, 'steps': 10842, 'batch_loss/train': 0.84134775493294} 
12/24/2021 09:48:54 - INFO - codeparrot_training - Step 10843: {'lr': 0.00044554418797028146, 'samples': 5552128, 'steps': 10843, 'batch_loss/train': 0.9195525757968426} 12/24/2021 09:49:05 - INFO - codeparrot_training - Step 10844: {'lr': 0.00044553436625673407, 'samples': 5552640, 'steps': 10844, 'batch_loss/train': 0.8575003603473306} 12/24/2021 09:49:16 - INFO - codeparrot_training - Step 10845: {'lr': 0.0004455245437658158, 'samples': 5553152, 'steps': 10845, 'batch_loss/train': 0.9079188946634531} 12/24/2021 09:49:26 - INFO - codeparrot_training - Step 10846: {'lr': 0.0004455147204975658, 'samples': 5553664, 'steps': 10846, 'batch_loss/train': 0.8096629707142711} 12/24/2021 09:49:39 - INFO - codeparrot_training - Step 10847: {'lr': 0.0004455048964520231, 'samples': 5554176, 'steps': 10847, 'batch_loss/train': 0.7315003359690309} 12/24/2021 09:49:49 - INFO - codeparrot_training - Step 10848: {'lr': 0.00044549507162922676, 'samples': 5554688, 'steps': 10848, 'batch_loss/train': 0.7763335896888748} 12/24/2021 09:50:00 - INFO - codeparrot_training - Step 10849: {'lr': 0.00044548524602921585, 'samples': 5555200, 'steps': 10849, 'batch_loss/train': 0.7261464078910649} 12/24/2021 09:50:14 - INFO - codeparrot_training - Step 10850: {'lr': 0.0004454754196520294, 'samples': 5555712, 'steps': 10850, 'batch_loss/train': 0.7816362353041768} 12/24/2021 09:50:24 - INFO - codeparrot_training - Step 10851: {'lr': 0.0004454655924977065, 'samples': 5556224, 'steps': 10851, 'batch_loss/train': 0.8554452676326036} 12/24/2021 09:50:35 - INFO - codeparrot_training - Step 10852: {'lr': 0.0004454557645662862, 'samples': 5556736, 'steps': 10852, 'batch_loss/train': 0.7424833127297461} 12/24/2021 09:50:46 - INFO - codeparrot_training - Step 10853: {'lr': 0.0004454459358578076, 'samples': 5557248, 'steps': 10853, 'batch_loss/train': 0.7684113923460245} 12/24/2021 09:50:58 - INFO - codeparrot_training - Step 10854: {'lr': 0.0004454361063723097, 'samples': 5557760, 'steps': 10854, 
'batch_loss/train': 0.7858063946478069} 12/24/2021 09:51:09 - INFO - codeparrot_training - Step 10855: {'lr': 0.00044542627610983163, 'samples': 5558272, 'steps': 10855, 'batch_loss/train': 0.8489623414352536} 12/24/2021 09:51:19 - INFO - codeparrot_training - Step 10856: {'lr': 0.00044541644507041246, 'samples': 5558784, 'steps': 10856, 'batch_loss/train': 0.7251322004012764} 12/24/2021 09:51:33 - INFO - codeparrot_training - Step 10857: {'lr': 0.00044540661325409136, 'samples': 5559296, 'steps': 10857, 'batch_loss/train': 0.6984347095713019} 12/24/2021 09:51:44 - INFO - codeparrot_training - Step 10858: {'lr': 0.00044539678066090737, 'samples': 5559808, 'steps': 10858, 'batch_loss/train': 0.8166635315865278} 12/24/2021 09:51:54 - INFO - codeparrot_training - Step 10859: {'lr': 0.0004453869472908995, 'samples': 5560320, 'steps': 10859, 'batch_loss/train': 0.8242715238593519} 12/24/2021 09:52:06 - INFO - codeparrot_training - Step 10860: {'lr': 0.0004453771131441069, 'samples': 5560832, 'steps': 10860, 'batch_loss/train': 0.7848525708541274} 12/24/2021 09:52:17 - INFO - codeparrot_training - Step 10861: {'lr': 0.0004453672782205687, 'samples': 5561344, 'steps': 10861, 'batch_loss/train': 0.8618548566009849} 12/24/2021 09:52:27 - INFO - codeparrot_training - Step 10862: {'lr': 0.00044535744252032394, 'samples': 5561856, 'steps': 10862, 'batch_loss/train': 0.8985954727977514} 12/24/2021 09:52:39 - INFO - codeparrot_training - Step 10863: {'lr': 0.00044534760604341187, 'samples': 5562368, 'steps': 10863, 'batch_loss/train': 0.77994518680498} 12/24/2021 09:52:50 - INFO - codeparrot_training - Step 10864: {'lr': 0.0004453377687898713, 'samples': 5562880, 'steps': 10864, 'batch_loss/train': 0.7802282785996795} 12/24/2021 09:53:01 - INFO - codeparrot_training - Step 10865: {'lr': 0.00044532793075974166, 'samples': 5563392, 'steps': 10865, 'batch_loss/train': 0.7864699875935912} 12/24/2021 09:53:11 - INFO - codeparrot_training - Step 10866: {'lr': 0.00044531809195306186, 
'samples': 5563904, 'steps': 10866, 'batch_loss/train': 0.83103958517313} 12/24/2021 09:53:25 - INFO - codeparrot_training - Step 10867: {'lr': 0.00044530825236987104, 'samples': 5564416, 'steps': 10867, 'batch_loss/train': 0.881015727063641} 12/24/2021 09:53:36 - INFO - codeparrot_training - Step 10868: {'lr': 0.0004452984120102084, 'samples': 5564928, 'steps': 10868, 'batch_loss/train': 0.6778335750568658} 12/24/2021 09:53:46 - INFO - codeparrot_training - Step 10869: {'lr': 0.000445288570874113, 'samples': 5565440, 'steps': 10869, 'batch_loss/train': 0.8559061018750072} 12/24/2021 09:53:58 - INFO - codeparrot_training - Step 10870: {'lr': 0.000445278728961624, 'samples': 5565952, 'steps': 10870, 'batch_loss/train': 0.9122867742553353} 12/24/2021 09:54:09 - INFO - codeparrot_training - Step 10871: {'lr': 0.00044526888627278045, 'samples': 5566464, 'steps': 10871, 'batch_loss/train': 0.7638042937032878} 12/24/2021 09:54:20 - INFO - codeparrot_training - Step 10872: {'lr': 0.0004452590428076216, 'samples': 5566976, 'steps': 10872, 'batch_loss/train': 0.7911194888874888} 12/24/2021 09:54:32 - INFO - codeparrot_training - Step 10873: {'lr': 0.0004452491985661865, 'samples': 5567488, 'steps': 10873, 'batch_loss/train': 0.8478828637162223} 12/24/2021 09:54:42 - INFO - codeparrot_training - Step 10874: {'lr': 0.0004452393535485143, 'samples': 5568000, 'steps': 10874, 'batch_loss/train': 0.842777319252491} 12/24/2021 09:54:53 - INFO - codeparrot_training - Step 10875: {'lr': 0.0004452295077546442, 'samples': 5568512, 'steps': 10875, 'batch_loss/train': 0.8600633628666401} 12/24/2021 09:55:03 - INFO - codeparrot_training - Step 10876: {'lr': 0.0004452196611846152, 'samples': 5569024, 'steps': 10876, 'batch_loss/train': 0.8084508427418768} 12/24/2021 09:55:15 - INFO - codeparrot_training - Step 10877: {'lr': 0.00044520981383846663, 'samples': 5569536, 'steps': 10877, 'batch_loss/train': 0.8433518446981907} 12/24/2021 09:55:26 - INFO - codeparrot_training - Step 10878: 
{'lr': 0.00044519996571623757, 'samples': 5570048, 'steps': 10878, 'batch_loss/train': 0.7583296946249902} 12/24/2021 09:55:37 - INFO - codeparrot_training - Step 10879: {'lr': 0.00044519011681796707, 'samples': 5570560, 'steps': 10879, 'batch_loss/train': 0.7901020054705441} 12/24/2021 09:55:50 - INFO - codeparrot_training - Step 10880: {'lr': 0.00044518026714369436, 'samples': 5571072, 'steps': 10880, 'batch_loss/train': 0.8028225414454937} 12/24/2021 09:56:01 - INFO - codeparrot_training - Step 10881: {'lr': 0.00044517041669345865, 'samples': 5571584, 'steps': 10881, 'batch_loss/train': 0.7613654863089323} 12/24/2021 09:56:11 - INFO - codeparrot_training - Step 10882: {'lr': 0.000445160565467299, 'samples': 5572096, 'steps': 10882, 'batch_loss/train': 0.726557212183252} 12/24/2021 09:56:24 - INFO - codeparrot_training - Step 10883: {'lr': 0.0004451507134652547, 'samples': 5572608, 'steps': 10883, 'batch_loss/train': 0.8624087944626808} 12/24/2021 09:56:34 - INFO - codeparrot_training - Step 10884: {'lr': 0.0004451408606873648, 'samples': 5573120, 'steps': 10884, 'batch_loss/train': 0.6715263319201767} 12/24/2021 09:56:45 - INFO - codeparrot_training - Step 10885: {'lr': 0.00044513100713366863, 'samples': 5573632, 'steps': 10885, 'batch_loss/train': 0.8371865935623646} 12/24/2021 09:56:59 - INFO - codeparrot_training - Step 10886: {'lr': 0.0004451211528042052, 'samples': 5574144, 'steps': 10886, 'batch_loss/train': 0.7784631456015632} 12/24/2021 09:57:09 - INFO - codeparrot_training - Step 10887: {'lr': 0.00044511129769901365, 'samples': 5574656, 'steps': 10887, 'batch_loss/train': 0.7407597405835986} 12/24/2021 09:57:20 - INFO - codeparrot_training - Step 10888: {'lr': 0.0004451014418181333, 'samples': 5575168, 'steps': 10888, 'batch_loss/train': 0.8157017147168517} 12/24/2021 09:57:31 - INFO - codeparrot_training - Step 10889: {'lr': 0.0004450915851616033, 'samples': 5575680, 'steps': 10889, 'batch_loss/train': 0.8007708759978414} 12/24/2021 09:57:43 - INFO - 
codeparrot_training - Step 10890: {'lr': 0.0004450817277294628, 'samples': 5576192, 'steps': 10890, 'batch_loss/train': 0.8193329237401485} 12/24/2021 09:57:53 - INFO - codeparrot_training - Step 10891: {'lr': 0.00044507186952175105, 'samples': 5576704, 'steps': 10891, 'batch_loss/train': 0.7373017957434058} 12/24/2021 09:58:04 - INFO - codeparrot_training - Step 10892: {'lr': 0.00044506201053850716, 'samples': 5577216, 'steps': 10892, 'batch_loss/train': 0.8503483477979898} 12/24/2021 09:58:16 - INFO - codeparrot_training - Step 10893: {'lr': 0.0004450521507797704, 'samples': 5577728, 'steps': 10893, 'batch_loss/train': 0.835589705966413} 12/24/2021 09:58:26 - INFO - codeparrot_training - Step 10894: {'lr': 0.00044504229024557994, 'samples': 5578240, 'steps': 10894, 'batch_loss/train': 0.8873794544488192} 12/24/2021 09:58:37 - INFO - codeparrot_training - Step 10895: {'lr': 0.00044503242893597495, 'samples': 5578752, 'steps': 10895, 'batch_loss/train': 0.7751860711723566} 12/24/2021 09:58:51 - INFO - codeparrot_training - Step 10896: {'lr': 0.00044502256685099474, 'samples': 5579264, 'steps': 10896, 'batch_loss/train': 0.7854784112423658} 12/24/2021 09:59:02 - INFO - codeparrot_training - Step 10897: {'lr': 0.00044501270399067836, 'samples': 5579776, 'steps': 10897, 'batch_loss/train': 0.8301303971093148} 12/24/2021 09:59:12 - INFO - codeparrot_training - Step 10898: {'lr': 0.0004450028403550651, 'samples': 5580288, 'steps': 10898, 'batch_loss/train': 0.8570888242684305} 12/24/2021 09:59:23 - INFO - codeparrot_training - Step 10899: {'lr': 0.0004449929759441942, 'samples': 5580800, 'steps': 10899, 'batch_loss/train': 0.8594654025509953} 12/24/2021 09:59:35 - INFO - codeparrot_training - Step 10900: {'lr': 0.00044498311075810487, 'samples': 5581312, 'steps': 10900, 'batch_loss/train': 0.8458545757457614} 12/24/2021 09:59:45 - INFO - codeparrot_training - Step 10901: {'lr': 0.0004449732447968363, 'samples': 5581824, 'steps': 10901, 'batch_loss/train': 
0.9073655577376485} 12/24/2021 09:59:56 - INFO - codeparrot_training - Step 10902: {'lr': 0.00044496337806042773, 'samples': 5582336, 'steps': 10902, 'batch_loss/train': 0.7975498139858246} 12/24/2021 10:00:10 - INFO - codeparrot_training - Step 10903: {'lr': 0.00044495351054891835, 'samples': 5582848, 'steps': 10903, 'batch_loss/train': 0.7500511696562171} 12/24/2021 10:00:21 - INFO - codeparrot_training - Step 10904: {'lr': 0.0004449436422623475, 'samples': 5583360, 'steps': 10904, 'batch_loss/train': 0.7496456168591976} 12/24/2021 10:00:31 - INFO - codeparrot_training - Step 10905: {'lr': 0.0004449337732007543, 'samples': 5583872, 'steps': 10905, 'batch_loss/train': 0.7525496045127511} 12/24/2021 10:00:43 - INFO - codeparrot_training - Step 10906: {'lr': 0.00044492390336417796, 'samples': 5584384, 'steps': 10906, 'batch_loss/train': 0.7217736782040447} 12/24/2021 10:00:54 - INFO - codeparrot_training - Step 10907: {'lr': 0.0004449140327526579, 'samples': 5584896, 'steps': 10907, 'batch_loss/train': 0.8827010681852698} 12/24/2021 10:01:04 - INFO - codeparrot_training - Step 10908: {'lr': 0.00044490416136623313, 'samples': 5585408, 'steps': 10908, 'batch_loss/train': 0.7920997152104974} 12/24/2021 10:01:15 - INFO - codeparrot_training - Step 10909: {'lr': 0.00044489428920494313, 'samples': 5585920, 'steps': 10909, 'batch_loss/train': 0.8518842607736588} 12/24/2021 10:01:27 - INFO - codeparrot_training - Step 10910: {'lr': 0.0004448844162688269, 'samples': 5586432, 'steps': 10910, 'batch_loss/train': 0.8199482588097453} 12/24/2021 10:01:38 - INFO - codeparrot_training - Step 10911: {'lr': 0.0004448745425579239, 'samples': 5586944, 'steps': 10911, 'batch_loss/train': 0.6876119426451623} 12/24/2021 10:01:48 - INFO - codeparrot_training - Step 10912: {'lr': 0.00044486466807227323, 'samples': 5587456, 'steps': 10912, 'batch_loss/train': 0.7972469907253981} 12/24/2021 10:02:01 - INFO - codeparrot_training - Step 10913: {'lr': 0.00044485479281191427, 'samples': 5587968, 
'steps': 10913, 'batch_loss/train': 0.7961425427347422} 12/24/2021 10:02:11 - INFO - codeparrot_training - Step 10914: {'lr': 0.0004448449167768862, 'samples': 5588480, 'steps': 10914, 'batch_loss/train': 0.8932485939003527} 12/24/2021 10:02:22 - INFO - codeparrot_training - Step 10915: {'lr': 0.00044483503996722834, 'samples': 5588992, 'steps': 10915, 'batch_loss/train': 0.7290362054482102} 12/24/2021 10:02:36 - INFO - codeparrot_training - Step 10916: {'lr': 0.0004448251623829799, 'samples': 5589504, 'steps': 10916, 'batch_loss/train': 0.7619414618238807} 12/24/2021 10:02:46 - INFO - codeparrot_training - Step 10917: {'lr': 0.0004448152840241802, 'samples': 5590016, 'steps': 10917, 'batch_loss/train': 0.8236713968217373} 12/24/2021 10:02:57 - INFO - codeparrot_training - Step 10918: {'lr': 0.00044480540489086844, 'samples': 5590528, 'steps': 10918, 'batch_loss/train': 0.7743346691131592} 12/24/2021 10:03:07 - INFO - codeparrot_training - Step 10919: {'lr': 0.000444795524983084, 'samples': 5591040, 'steps': 10919, 'batch_loss/train': 0.8843336012214422} 12/24/2021 10:03:20 - INFO - codeparrot_training - Step 10920: {'lr': 0.0004447856443008661, 'samples': 5591552, 'steps': 10920, 'batch_loss/train': 0.7977474583312869} 12/24/2021 10:03:30 - INFO - codeparrot_training - Step 10921: {'lr': 0.00044477576284425403, 'samples': 5592064, 'steps': 10921, 'batch_loss/train': 0.8077557235956192} 12/24/2021 10:03:41 - INFO - codeparrot_training - Step 10922: {'lr': 0.00044476588061328706, 'samples': 5592576, 'steps': 10922, 'batch_loss/train': 0.805933048017323} 12/24/2021 10:03:53 - INFO - codeparrot_training - Step 10923: {'lr': 0.00044475599760800447, 'samples': 5593088, 'steps': 10923, 'batch_loss/train': 0.8489169673994184} 12/24/2021 10:04:03 - INFO - codeparrot_training - Step 10924: {'lr': 0.00044474611382844557, 'samples': 5593600, 'steps': 10924, 'batch_loss/train': 0.7325566727668047} 12/24/2021 10:04:14 - INFO - codeparrot_training - Step 10925: {'lr': 
0.0004447362292746497, 'samples': 5594112, 'steps': 10925, 'batch_loss/train': 0.8545536263845861} 12/24/2021 10:04:28 - INFO - codeparrot_training - Step 10926: {'lr': 0.00044472634394665615, 'samples': 5594624, 'steps': 10926, 'batch_loss/train': 0.7926866160705686} 12/24/2021 10:04:38 - INFO - codeparrot_training - Step 10927: {'lr': 0.0004447164578445041, 'samples': 5595136, 'steps': 10927, 'batch_loss/train': 0.8534648772329092} 12/24/2021 10:04:49 - INFO - codeparrot_training - Step 10928: {'lr': 0.00044470657096823293, 'samples': 5595648, 'steps': 10928, 'batch_loss/train': 0.7777207613689825} 12/24/2021 10:05:01 - INFO - codeparrot_training - Step 10929: {'lr': 0.00044469668331788195, 'samples': 5596160, 'steps': 10929, 'batch_loss/train': 0.8182534407824278} 12/24/2021 10:05:12 - INFO - codeparrot_training - Step 10930: {'lr': 0.0004446867948934905, 'samples': 5596672, 'steps': 10930, 'batch_loss/train': 0.7191001363098621} 12/24/2021 10:05:22 - INFO - codeparrot_training - Step 10931: {'lr': 0.0004446769056950979, 'samples': 5597184, 'steps': 10931, 'batch_loss/train': 0.8371490156278014} 12/24/2021 10:05:33 - INFO - codeparrot_training - Step 10932: {'lr': 0.00044466701572274336, 'samples': 5597696, 'steps': 10932, 'batch_loss/train': 0.8934993366710842} 12/24/2021 10:05:45 - INFO - codeparrot_training - Step 10933: {'lr': 0.00044465712497646635, 'samples': 5598208, 'steps': 10933, 'batch_loss/train': 0.904198654461652} 12/24/2021 10:05:55 - INFO - codeparrot_training - Step 10934: {'lr': 0.000444647233456306, 'samples': 5598720, 'steps': 10934, 'batch_loss/train': 0.7638677265495062} 12/24/2021 10:06:06 - INFO - codeparrot_training - Step 10935: {'lr': 0.0004446373411623018, 'samples': 5599232, 'steps': 10935, 'batch_loss/train': 0.7862858907319605} 12/24/2021 10:06:20 - INFO - codeparrot_training - Step 10936: {'lr': 0.00044462744809449305, 'samples': 5599744, 'steps': 10936, 'batch_loss/train': 0.7867428343743086} 12/24/2021 10:06:31 - INFO - 
codeparrot_training - Step 10937: {'lr': 0.0004446175542529191, 'samples': 5600256, 'steps': 10937, 'batch_loss/train': 0.8393871113657951} 12/24/2021 10:06:41 - INFO - codeparrot_training - Step 10938: {'lr': 0.0004446076596376191, 'samples': 5600768, 'steps': 10938, 'batch_loss/train': 1.110221735201776} 12/24/2021 10:06:52 - INFO - codeparrot_training - Step 10939: {'lr': 0.0004445977642486326, 'samples': 5601280, 'steps': 10939, 'batch_loss/train': 0.8386326534673572} 12/24/2021 10:07:04 - INFO - codeparrot_training - Step 10940: {'lr': 0.0004445878680859989, 'samples': 5601792, 'steps': 10940, 'batch_loss/train': 0.8771393583156168} 12/24/2021 10:07:15 - INFO - codeparrot_training - Step 10941: {'lr': 0.0004445779711497573, 'samples': 5602304, 'steps': 10941, 'batch_loss/train': 0.8326777806505561} 12/24/2021 10:07:25 - INFO - codeparrot_training - Step 10942: {'lr': 0.0004445680734399471, 'samples': 5602816, 'steps': 10942, 'batch_loss/train': 0.8490407094359398} 12/24/2021 10:07:37 - INFO - codeparrot_training - Step 10943: {'lr': 0.00044455817495660775, 'samples': 5603328, 'steps': 10943, 'batch_loss/train': 0.8253244636580348} 12/24/2021 10:07:48 - INFO - codeparrot_training - Step 10944: {'lr': 0.0004445482756997785, 'samples': 5603840, 'steps': 10944, 'batch_loss/train': 0.810495144687593} 12/24/2021 10:07:58 - INFO - codeparrot_training - Step 10945: {'lr': 0.00044453837566949884, 'samples': 5604352, 'steps': 10945, 'batch_loss/train': 0.811394615098834} 12/24/2021 10:08:11 - INFO - codeparrot_training - Step 10946: {'lr': 0.00044452847486580805, 'samples': 5604864, 'steps': 10946, 'batch_loss/train': 0.6985065769404173} 12/24/2021 10:08:21 - INFO - codeparrot_training - Step 10947: {'lr': 0.00044451857328874544, 'samples': 5605376, 'steps': 10947, 'batch_loss/train': 0.8608842184767127} 12/24/2021 10:08:32 - INFO - codeparrot_training - Step 10948: {'lr': 0.00044450867093835043, 'samples': 5605888, 'steps': 10948, 'batch_loss/train': 
0.8813263014890254} 12/24/2021 10:08:46 - INFO - codeparrot_training - Step 10949: {'lr': 0.0004444987678146625, 'samples': 5606400, 'steps': 10949, 'batch_loss/train': 0.8300374504178762} 12/24/2021 10:08:56 - INFO - codeparrot_training - Step 10950: {'lr': 0.0004444888639177208, 'samples': 5606912, 'steps': 10950, 'batch_loss/train': 0.8666644552722573} 12/24/2021 10:09:07 - INFO - codeparrot_training - Step 10951: {'lr': 0.0004444789592475649, 'samples': 5607424, 'steps': 10951, 'batch_loss/train': 0.8004141692072153} 12/24/2021 10:09:18 - INFO - codeparrot_training - Step 10952: {'lr': 0.000444469053804234, 'samples': 5607936, 'steps': 10952, 'batch_loss/train': 0.8220204329118133} 12/24/2021 10:09:30 - INFO - codeparrot_training - Step 10953: {'lr': 0.00044445914758776755, 'samples': 5608448, 'steps': 10953, 'batch_loss/train': 0.8284599673934281} 12/24/2021 10:09:40 - INFO - codeparrot_training - Step 10954: {'lr': 0.000444449240598205, 'samples': 5608960, 'steps': 10954, 'batch_loss/train': 0.8313234075903893} 12/24/2021 10:09:51 - INFO - codeparrot_training - Step 10955: {'lr': 0.0004444393328355857, 'samples': 5609472, 'steps': 10955, 'batch_loss/train': 0.8259363223332912} 12/24/2021 10:10:05 - INFO - codeparrot_training - Step 10956: {'lr': 0.00044442942429994905, 'samples': 5609984, 'steps': 10956, 'batch_loss/train': 0.7803756976500154} 12/24/2021 10:10:15 - INFO - codeparrot_training - Step 10957: {'lr': 0.0004444195149913344, 'samples': 5610496, 'steps': 10957, 'batch_loss/train': 1.0157271986827254} 12/24/2021 10:10:26 - INFO - codeparrot_training - Step 10958: {'lr': 0.00044440960490978117, 'samples': 5611008, 'steps': 10958, 'batch_loss/train': 0.7972285109572113} 12/24/2021 10:10:38 - INFO - codeparrot_training - Step 10959: {'lr': 0.0004443996940553288, 'samples': 5611520, 'steps': 10959, 'batch_loss/train': 0.8503535101190209} 12/24/2021 10:10:48 - INFO - codeparrot_training - Step 10960: {'lr': 0.0004443897824280166, 'samples': 5612032, 
'steps': 10960, 'batch_loss/train': 0.8458780813962221} 12/24/2021 10:10:59 - INFO - codeparrot_training - Step 10961: {'lr': 0.000444379870027884, 'samples': 5612544, 'steps': 10961, 'batch_loss/train': 0.881940764375031} 12/24/2021 10:11:10 - INFO - codeparrot_training - Step 10962: {'lr': 0.0004443699568549705, 'samples': 5613056, 'steps': 10962, 'batch_loss/train': 0.8730244264006615} 12/24/2021 10:11:22 - INFO - codeparrot_training - Step 10963: {'lr': 0.00044436004290931535, 'samples': 5613568, 'steps': 10963, 'batch_loss/train': 0.8132384880445898} 12/24/2021 10:11:32 - INFO - codeparrot_training - Step 10964: {'lr': 0.0004443501281909581, 'samples': 5614080, 'steps': 10964, 'batch_loss/train': 0.8802147302776575} 12/24/2021 10:11:43 - INFO - codeparrot_training - Step 10965: {'lr': 0.00044434021269993817, 'samples': 5614592, 'steps': 10965, 'batch_loss/train': 0.8415317391045392} 12/24/2021 10:11:57 - INFO - codeparrot_training - Step 10966: {'lr': 0.00044433029643629483, 'samples': 5615104, 'steps': 10966, 'batch_loss/train': 0.7705396451056004} 12/24/2021 10:12:07 - INFO - codeparrot_training - Step 10967: {'lr': 0.00044432037940006773, 'samples': 5615616, 'steps': 10967, 'batch_loss/train': 0.7834710720926523} 12/24/2021 10:12:18 - INFO - codeparrot_training - Step 10968: {'lr': 0.0004443104615912961, 'samples': 5616128, 'steps': 10968, 'batch_loss/train': 0.6879663756117225} 12/24/2021 10:12:30 - INFO - codeparrot_training - Step 10969: {'lr': 0.0004443005430100195, 'samples': 5616640, 'steps': 10969, 'batch_loss/train': 0.8602978121489286} 12/24/2021 10:12:41 - INFO - codeparrot_training - Step 10970: {'lr': 0.00044429062365627725, 'samples': 5617152, 'steps': 10970, 'batch_loss/train': 0.9837675960734487} 12/24/2021 10:12:51 - INFO - codeparrot_training - Step 10971: {'lr': 0.00044428070353010887, 'samples': 5617664, 'steps': 10971, 'batch_loss/train': 0.8420333587564528} 12/24/2021 10:13:02 - INFO - codeparrot_training - Step 10972: {'lr': 
0.0004442707826315537, 'samples': 5618176, 'steps': 10972, 'batch_loss/train': 0.8418029248714447} 12/24/2021 10:13:16 - INFO - codeparrot_training - Step 10973: {'lr': 0.0004442608609606514, 'samples': 5618688, 'steps': 10973, 'batch_loss/train': 0.7856512004509568} 12/24/2021 10:13:26 - INFO - codeparrot_training - Step 10974: {'lr': 0.00044425093851744115, 'samples': 5619200, 'steps': 10974, 'batch_loss/train': 0.7625181656330824} 12/24/2021 10:13:37 - INFO - codeparrot_training - Step 10975: {'lr': 0.0004442410153019626, 'samples': 5619712, 'steps': 10975, 'batch_loss/train': 0.7805375289171934} 12/24/2021 10:13:49 - INFO - codeparrot_training - Step 10976: {'lr': 0.000444231091314255, 'samples': 5620224, 'steps': 10976, 'batch_loss/train': 0.9621887728571892} 12/24/2021 10:14:00 - INFO - codeparrot_training - Step 10977: {'lr': 0.00044422116655435794, 'samples': 5620736, 'steps': 10977, 'batch_loss/train': 0.8890165071934462} 12/24/2021 10:14:10 - INFO - codeparrot_training - Step 10978: {'lr': 0.0004442112410223109, 'samples': 5621248, 'steps': 10978, 'batch_loss/train': 0.8782825767993927} 12/24/2021 10:14:23 - INFO - codeparrot_training - Step 10979: {'lr': 0.00044420131471815336, 'samples': 5621760, 'steps': 10979, 'batch_loss/train': 0.7491826396435499} 12/24/2021 10:14:33 - INFO - codeparrot_training - Step 10980: {'lr': 0.00044419138764192467, 'samples': 5622272, 'steps': 10980, 'batch_loss/train': 0.7603728910908103} 12/24/2021 10:14:44 - INFO - codeparrot_training - Step 10981: {'lr': 0.00044418145979366433, 'samples': 5622784, 'steps': 10981, 'batch_loss/train': 0.6735513061285019} 12/24/2021 10:14:56 - INFO - codeparrot_training - Step 10982: {'lr': 0.00044417153117341183, 'samples': 5623296, 'steps': 10982, 'batch_loss/train': 0.7899807053618133} 12/24/2021 10:15:07 - INFO - codeparrot_training - Step 10983: {'lr': 0.00044416160178120656, 'samples': 5623808, 'steps': 10983, 'batch_loss/train': 0.7698804009705782} 12/24/2021 10:15:18 - INFO - 
codeparrot_training - Step 10984: {'lr': 0.00044415167161708817, 'samples': 5624320, 'steps': 10984, 'batch_loss/train': 0.8712650197558105} 12/24/2021 10:15:28 - INFO - codeparrot_training - Step 10985: {'lr': 0.00044414174068109605, 'samples': 5624832, 'steps': 10985, 'batch_loss/train': 0.8116234457120299} 12/24/2021 10:15:43 - INFO - codeparrot_training - Step 10986: {'lr': 0.0004441318089732697, 'samples': 5625344, 'steps': 10986, 'batch_loss/train': 0.8141689500771463} 12/24/2021 10:15:53 - INFO - codeparrot_training - Step 10987: {'lr': 0.00044412187649364845, 'samples': 5625856, 'steps': 10987, 'batch_loss/train': 0.8742978516966105} 12/24/2021 10:16:04 - INFO - codeparrot_training - Step 10988: {'lr': 0.000444111943242272, 'samples': 5626368, 'steps': 10988, 'batch_loss/train': 0.823435140773654} 12/24/2021 10:16:16 - INFO - codeparrot_training - Step 10989: {'lr': 0.00044410200921917977, 'samples': 5626880, 'steps': 10989, 'batch_loss/train': 0.7993581304326653} 12/24/2021 10:16:27 - INFO - codeparrot_training - Step 10990: {'lr': 0.00044409207442441123, 'samples': 5627392, 'steps': 10990, 'batch_loss/train': 0.8544584438204765} 12/24/2021 10:16:37 - INFO - codeparrot_training - Step 10991: {'lr': 0.00044408213885800587, 'samples': 5627904, 'steps': 10991, 'batch_loss/train': 0.825275064446032} 12/24/2021 10:16:50 - INFO - codeparrot_training - Step 10992: {'lr': 0.0004440722025200032, 'samples': 5628416, 'steps': 10992, 'batch_loss/train': 0.7560575162060559} 12/24/2021 10:17:00 - INFO - codeparrot_training - Step 10993: {'lr': 0.00044406226541044276, 'samples': 5628928, 'steps': 10993, 'batch_loss/train': 0.7675040597096086} 12/24/2021 10:17:11 - INFO - codeparrot_training - Step 10994: {'lr': 0.000444052327529364, 'samples': 5629440, 'steps': 10994, 'batch_loss/train': 0.7893856687005609} 12/24/2021 10:17:21 - INFO - codeparrot_training - Step 10995: {'lr': 0.0004440423888768065, 'samples': 5629952, 'steps': 10995, 'batch_loss/train': 
1.0207174480892718} 12/24/2021 10:17:35 - INFO - codeparrot_training - Step 10996: {'lr': 0.00044403244945280973, 'samples': 5630464, 'steps': 10996, 'batch_loss/train': 0.8031855854205787} 12/24/2021 10:17:46 - INFO - codeparrot_training - Step 10997: {'lr': 0.0004440225092574132, 'samples': 5630976, 'steps': 10997, 'batch_loss/train': 0.8113267701119184} 12/24/2021 10:17:57 - INFO - codeparrot_training - Step 10998: {'lr': 0.0004440125682906563, 'samples': 5631488, 'steps': 10998, 'batch_loss/train': 0.856562408618629} 12/24/2021 10:18:09 - INFO - codeparrot_training - Step 10999: {'lr': 0.00044400262655257884, 'samples': 5632000, 'steps': 10999, 'batch_loss/train': 0.857970992103219} 12/24/2021 10:18:20 - INFO - codeparrot_training - Step 11000: {'lr': 0.0004439926840432201, 'samples': 5632512, 'steps': 11000, 'batch_loss/train': 0.8386950520798564} 12/24/2021 10:18:30 - INFO - codeparrot_training - Step 11001: {'lr': 0.0004439827407626198, 'samples': 5633024, 'steps': 11001, 'batch_loss/train': 0.8656171392649412} 12/24/2021 10:18:41 - INFO - codeparrot_training - Step 11002: {'lr': 0.0004439727967108173, 'samples': 5633536, 'steps': 11002, 'batch_loss/train': 1.03991655446589} 12/24/2021 10:18:55 - INFO - codeparrot_training - Step 11003: {'lr': 0.00044396285188785225, 'samples': 5634048, 'steps': 11003, 'batch_loss/train': 0.9159948891028762} 12/24/2021 10:19:05 - INFO - codeparrot_training - Step 11004: {'lr': 0.00044395290629376406, 'samples': 5634560, 'steps': 11004, 'batch_loss/train': 0.8281197976320982} 12/24/2021 10:19:16 - INFO - codeparrot_training - Step 11005: {'lr': 0.0004439429599285924, 'samples': 5635072, 'steps': 11005, 'batch_loss/train': 0.8854619646444917} 12/24/2021 10:19:28 - INFO - codeparrot_training - Step 11006: {'lr': 0.00044393301279237673, 'samples': 5635584, 'steps': 11006, 'batch_loss/train': 0.8381113167852163} 12/24/2021 10:19:38 - INFO - codeparrot_training - Step 11007: {'lr': 0.00044392306488515664, 'samples': 5636096, 
'steps': 11007, 'batch_loss/train': 0.8057670420967042} 12/24/2021 10:19:49 - INFO - codeparrot_training - Step 11008: {'lr': 0.0004439131162069717, 'samples': 5636608, 'steps': 11008, 'batch_loss/train': 0.7844881652854383} 12/24/2021 10:20:01 - INFO - codeparrot_training - Step 11009: {'lr': 0.0004439031667578614, 'samples': 5637120, 'steps': 11009, 'batch_loss/train': 0.8640586668625474} 12/24/2021 10:20:12 - INFO - codeparrot_training - Step 11010: {'lr': 0.0004438932165378653, 'samples': 5637632, 'steps': 11010, 'batch_loss/train': 0.8228978021070361} 12/24/2021 10:20:23 - INFO - codeparrot_training - Step 11011: {'lr': 0.00044388326554702307, 'samples': 5638144, 'steps': 11011, 'batch_loss/train': 0.6155866431072354} 12/24/2021 10:20:33 - INFO - codeparrot_training - Step 11012: {'lr': 0.0004438733137853741, 'samples': 5638656, 'steps': 11012, 'batch_loss/train': 0.8309308819007128} 12/24/2021 10:20:47 - INFO - codeparrot_training - Step 11013: {'lr': 0.00044386336125295806, 'samples': 5639168, 'steps': 11013, 'batch_loss/train': 0.8373279348015785} 12/24/2021 10:20:58 - INFO - codeparrot_training - Step 11014: {'lr': 0.00044385340794981447, 'samples': 5639680, 'steps': 11014, 'batch_loss/train': 0.7298693128395826} 12/24/2021 10:21:08 - INFO - codeparrot_training - Step 11015: {'lr': 0.00044384345387598294, 'samples': 5640192, 'steps': 11015, 'batch_loss/train': 1.2123171482235193} 12/24/2021 10:21:20 - INFO - codeparrot_training - Step 11016: {'lr': 0.00044383349903150306, 'samples': 5640704, 'steps': 11016, 'batch_loss/train': 0.7798226769082248} 12/24/2021 10:21:31 - INFO - codeparrot_training - Step 11017: {'lr': 0.0004438235434164144, 'samples': 5641216, 'steps': 11017, 'batch_loss/train': 0.8722680797800422} 12/24/2021 10:21:42 - INFO - codeparrot_training - Step 11018: {'lr': 0.00044381358703075646, 'samples': 5641728, 'steps': 11018, 'batch_loss/train': 0.84231923148036} 12/24/2021 10:21:54 - INFO - codeparrot_training - Step 11019: {'lr': 
0.0004438036298745689, 'samples': 5642240, 'steps': 11019, 'batch_loss/train': 0.9258232787251472} 12/24/2021 10:22:05 - INFO - codeparrot_training - Step 11020: {'lr': 0.00044379367194789126, 'samples': 5642752, 'steps': 11020, 'batch_loss/train': 0.8293189113028347} 12/24/2021 10:22:15 - INFO - codeparrot_training - Step 11021: {'lr': 0.00044378371325076314, 'samples': 5643264, 'steps': 11021, 'batch_loss/train': 0.7908273374196142} 12/24/2021 10:22:26 - INFO - codeparrot_training - Step 11022: {'lr': 0.00044377375378322427, 'samples': 5643776, 'steps': 11022, 'batch_loss/train': 0.7714593755081296} 12/24/2021 10:22:38 - INFO - codeparrot_training - Step 11023: {'lr': 0.000443763793545314, 'samples': 5644288, 'steps': 11023, 'batch_loss/train': 0.7643750756978989} 12/24/2021 10:22:48 - INFO - codeparrot_training - Step 11024: {'lr': 0.000443753832537072, 'samples': 5644800, 'steps': 11024, 'batch_loss/train': 0.7669264758005738} 12/24/2021 10:22:59 - INFO - codeparrot_training - Step 11025: {'lr': 0.00044374387075853805, 'samples': 5645312, 'steps': 11025, 'batch_loss/train': 0.8069176571443677} 12/24/2021 10:23:14 - INFO - codeparrot_training - Step 11026: {'lr': 0.0004437339082097516, 'samples': 5645824, 'steps': 11026, 'batch_loss/train': 0.8354752995073795} 12/24/2021 10:23:24 - INFO - codeparrot_training - Step 11027: {'lr': 0.00044372394489075226, 'samples': 5646336, 'steps': 11027, 'batch_loss/train': 0.8525543054565787} 12/24/2021 10:23:35 - INFO - codeparrot_training - Step 11028: {'lr': 0.00044371398080157964, 'samples': 5646848, 'steps': 11028, 'batch_loss/train': 0.5464476672932506} 12/24/2021 10:23:47 - INFO - codeparrot_training - Step 11029: {'lr': 0.0004437040159422734, 'samples': 5647360, 'steps': 11029, 'batch_loss/train': 0.8285870314575732} 12/24/2021 10:23:58 - INFO - codeparrot_training - Step 11030: {'lr': 0.00044369405031287313, 'samples': 5647872, 'steps': 11030, 'batch_loss/train': 0.838546889834106} 12/24/2021 10:24:08 - INFO - 
codeparrot_training - Step 11031: {'lr': 0.00044368408391341844, 'samples': 5648384, 'steps': 11031, 'batch_loss/train': 0.9817924974486232} 12/24/2021 10:24:19 - INFO - codeparrot_training - Step 11032: {'lr': 0.00044367411674394895, 'samples': 5648896, 'steps': 11032, 'batch_loss/train': 0.8140457046683878} 12/24/2021 10:24:33 - INFO - codeparrot_training - Step 11033: {'lr': 0.00044366414880450434, 'samples': 5649408, 'steps': 11033, 'batch_loss/train': 0.8756933519616723} 12/24/2021 10:24:44 - INFO - codeparrot_training - Step 11034: {'lr': 0.00044365418009512423, 'samples': 5649920, 'steps': 11034, 'batch_loss/train': 0.7152390251867473} 12/24/2021 10:24:54 - INFO - codeparrot_training - Step 11035: {'lr': 0.00044364421061584815, 'samples': 5650432, 'steps': 11035, 'batch_loss/train': 1.3775691795162857} 12/24/2021 10:25:06 - INFO - codeparrot_training - Step 11036: {'lr': 0.00044363424036671583, 'samples': 5650944, 'steps': 11036, 'batch_loss/train': 1.0207097977399826} 12/24/2021 10:25:17 - INFO - codeparrot_training - Step 11037: {'lr': 0.0004436242693477669, 'samples': 5651456, 'steps': 11037, 'batch_loss/train': 0.8518465729430318} 12/24/2021 10:25:28 - INFO - codeparrot_training - Step 11038: {'lr': 0.00044361429755904094, 'samples': 5651968, 'steps': 11038, 'batch_loss/train': 0.9381633717566729} 12/24/2021 10:25:38 - INFO - codeparrot_training - Step 11039: {'lr': 0.00044360432500057767, 'samples': 5652480, 'steps': 11039, 'batch_loss/train': 0.9272404424846172} 12/24/2021 10:25:50 - INFO - codeparrot_training - Step 11040: {'lr': 0.0004435943516724167, 'samples': 5652992, 'steps': 11040, 'batch_loss/train': 0.8795708469115198} 12/24/2021 10:26:01 - INFO - codeparrot_training - Step 11041: {'lr': 0.0004435843775745977, 'samples': 5653504, 'steps': 11041, 'batch_loss/train': 0.8878348078578711} 12/24/2021 10:26:11 - INFO - codeparrot_training - Step 11042: {'lr': 0.00044357440270716034, 'samples': 5654016, 'steps': 11042, 'batch_loss/train': 
0.873943495680578} 12/24/2021 10:26:25 - INFO - codeparrot_training - Step 11043: {'lr': 0.00044356442707014414, 'samples': 5654528, 'steps': 11043, 'batch_loss/train': 0.8606471493840218} 12/24/2021 10:26:36 - INFO - codeparrot_training - Step 11044: {'lr': 0.00044355445066358894, 'samples': 5655040, 'steps': 11044, 'batch_loss/train': 0.9067669250071049} 12/24/2021 10:26:46 - INFO - codeparrot_training - Step 11045: {'lr': 0.00044354447348753437, 'samples': 5655552, 'steps': 11045, 'batch_loss/train': 0.8090451955795288} 12/24/2021 10:26:58 - INFO - codeparrot_training - Step 11046: {'lr': 0.00044353449554201995, 'samples': 5656064, 'steps': 11046, 'batch_loss/train': 0.771547794342041} 12/24/2021 10:27:09 - INFO - codeparrot_training - Step 11047: {'lr': 0.0004435245168270855, 'samples': 5656576, 'steps': 11047, 'batch_loss/train': 0.9030999755486846} 12/24/2021 10:27:20 - INFO - codeparrot_training - Step 11048: {'lr': 0.00044351453734277064, 'samples': 5657088, 'steps': 11048, 'batch_loss/train': 0.7800618216861039} 12/24/2021 10:27:31 - INFO - codeparrot_training - Step 11049: {'lr': 0.0004435045570891151, 'samples': 5657600, 'steps': 11049, 'batch_loss/train': 0.7912472374737263} 12/24/2021 10:27:42 - INFO - codeparrot_training - Step 11050: {'lr': 0.0004434945760661584, 'samples': 5658112, 'steps': 11050, 'batch_loss/train': 0.8543210583738983} 12/24/2021 10:27:53 - INFO - codeparrot_training - Step 11051: {'lr': 0.0004434845942739404, 'samples': 5658624, 'steps': 11051, 'batch_loss/train': 0.7559478995390236} 12/24/2021 10:28:03 - INFO - codeparrot_training - Step 11052: {'lr': 0.0004434746117125007, 'samples': 5659136, 'steps': 11052, 'batch_loss/train': 0.8301911693997681} 12/24/2021 10:28:17 - INFO - codeparrot_training - Step 11053: {'lr': 0.000443464628381879, 'samples': 5659648, 'steps': 11053, 'batch_loss/train': 0.8359573553316295} 12/24/2021 10:28:28 - INFO - codeparrot_training - Step 11054: {'lr': 0.000443454644282115, 'samples': 5660160, 
'steps': 11054, 'batch_loss/train': 1.5942865535616875}
12/24/2021 10:28:38 - INFO - codeparrot_training - Step 11055: {'lr': 0.0004434446594132483, 'samples': 5660672, 'steps': 11055, 'batch_loss/train': 0.839065209031105}
12/24/2021 10:28:51 - INFO - codeparrot_training - Step 11056: {'lr': 0.0004434346737753188, 'samples': 5661184, 'steps': 11056, 'batch_loss/train': 1.097311170771718}
12/24/2021 10:29:01 - INFO - codeparrot_training - Step 11057: {'lr': 0.00044342468736836594, 'samples': 5661696, 'steps': 11057, 'batch_loss/train': 0.8882558075711131}
12/24/2021 10:29:12 - INFO - codeparrot_training - Step 11058: {'lr': 0.0004434147001924297, 'samples': 5662208, 'steps': 11058, 'batch_loss/train': 0.9029380134306848}
12/24/2021 10:29:26 - INFO - codeparrot_training - Step 11059: {'lr': 0.0004434047122475495, 'samples': 5662720, 'steps': 11059, 'batch_loss/train': 0.8357595279812813}
12/24/2021 10:29:36 - INFO - codeparrot_training - Step 11060: {'lr': 0.0004433947235337653, 'samples': 5663232, 'steps': 11060, 'batch_loss/train': 0.9865466449409723}
12/24/2021 10:29:47 - INFO - codeparrot_training - Step 11061: {'lr': 0.00044338473405111667, 'samples': 5663744, 'steps': 11061, 'batch_loss/train': 0.8490144773386419}
12/24/2021 10:29:57 - INFO - codeparrot_training - Step 11062: {'lr': 0.00044337474379964336, 'samples': 5664256, 'steps': 11062, 'batch_loss/train': 0.9441705690696836}
12/24/2021 10:30:10 - INFO - codeparrot_training - Step 11063: {'lr': 0.000443364752779385, 'samples': 5664768, 'steps': 11063, 'batch_loss/train': 0.7336043514078483}
12/24/2021 10:30:20 - INFO - codeparrot_training - Step 11064: {'lr': 0.0004433547609903815, 'samples': 5665280, 'steps': 11064, 'batch_loss/train': 0.8096481449902058}
12/24/2021 10:30:31 - INFO - codeparrot_training - Step 11065: {'lr': 0.00044334476843267245, 'samples': 5665792, 'steps': 11065, 'batch_loss/train': 0.7529252655804157}
12/24/2021 10:30:43 - INFO - codeparrot_training - Step 11066: {'lr': 0.0004433347751062976, 'samples': 5666304, 'steps': 11066, 'batch_loss/train': 0.9186452683061361}
12/24/2021 10:30:54 - INFO - codeparrot_training - Step 11067: {'lr': 0.0004433247810112967, 'samples': 5666816, 'steps': 11067, 'batch_loss/train': 0.9059292552992702}
12/24/2021 10:31:04 - INFO - codeparrot_training - Step 11068: {'lr': 0.0004433147861477095, 'samples': 5667328, 'steps': 11068, 'batch_loss/train': 0.8963662125170231}
12/24/2021 10:31:16 - INFO - codeparrot_training - Step 11069: {'lr': 0.0004433047905155756, 'samples': 5667840, 'steps': 11069, 'batch_loss/train': 0.8476539044640958}
12/24/2021 10:31:27 - INFO - codeparrot_training - Step 11070: {'lr': 0.0004432947941149349, 'samples': 5668352, 'steps': 11070, 'batch_loss/train': 0.7888220651075244}
12/24/2021 10:31:38 - INFO - codeparrot_training - Step 11071: {'lr': 0.00044328479694582715, 'samples': 5668864, 'steps': 11071, 'batch_loss/train': 0.8464450263418257}
12/24/2021 10:31:48 - INFO - codeparrot_training - Step 11072: {'lr': 0.00044327479900829194, 'samples': 5669376, 'steps': 11072, 'batch_loss/train': 0.7308046696707606}
12/24/2021 10:32:02 - INFO - codeparrot_training - Step 11073: {'lr': 0.0004432648003023691, 'samples': 5669888, 'steps': 11073, 'batch_loss/train': 0.7461787988431752}
12/24/2021 10:32:13 - INFO - codeparrot_training - Step 11074: {'lr': 0.0004432548008280984, 'samples': 5670400, 'steps': 11074, 'batch_loss/train': 0.8540321453474462}
12/24/2021 10:32:23 - INFO - codeparrot_training - Step 11075: {'lr': 0.0004432448005855196, 'samples': 5670912, 'steps': 11075, 'batch_loss/train': 0.8936737673357129}
12/24/2021 10:32:35 - INFO - codeparrot_training - Step 11076: {'lr': 0.00044323479957467246, 'samples': 5671424, 'steps': 11076, 'batch_loss/train': 0.8017926504835486}
12/24/2021 10:32:46 - INFO - codeparrot_training - Step 11077: {'lr': 0.00044322479779559666, 'samples': 5671936, 'steps': 11077, 'batch_loss/train': 0.7790496675297618}
12/24/2021 10:32:56 - INFO - codeparrot_training - Step 11078: {'lr': 0.00044321479524833207, 'samples': 5672448, 'steps': 11078, 'batch_loss/train': 0.8787175340112299}
12/24/2021 10:33:09 - INFO - codeparrot_training - Step 11079: {'lr': 0.0004432047919329184, 'samples': 5672960, 'steps': 11079, 'batch_loss/train': 0.9444853942841291}
12/24/2021 10:33:19 - INFO - codeparrot_training - Step 11080: {'lr': 0.0004431947878493955, 'samples': 5673472, 'steps': 11080, 'batch_loss/train': 0.8756210263818502}
12/24/2021 10:33:30 - INFO - codeparrot_training - Step 11081: {'lr': 0.00044318478299780294, 'samples': 5673984, 'steps': 11081, 'batch_loss/train': 0.9114952967502177}
12/24/2021 10:33:40 - INFO - codeparrot_training - Step 11082: {'lr': 0.00044317477737818066, 'samples': 5674496, 'steps': 11082, 'batch_loss/train': 0.7025766004808247}
12/24/2021 10:33:54 - INFO - codeparrot_training - Step 11083: {'lr': 0.00044316477099056844, 'samples': 5675008, 'steps': 11083, 'batch_loss/train': 0.8304563031997532}
12/24/2021 10:34:05 - INFO - codeparrot_training - Step 11084: {'lr': 0.000443154763835006, 'samples': 5675520, 'steps': 11084, 'batch_loss/train': 0.8324671778827906}
12/24/2021 10:34:15 - INFO - codeparrot_training - Step 11085: {'lr': 0.00044314475591153326, 'samples': 5676032, 'steps': 11085, 'batch_loss/train': 0.813451140653342}
12/24/2021 10:34:27 - INFO - codeparrot_training - Step 11086: {'lr': 0.0004431347472201898, 'samples': 5676544, 'steps': 11086, 'batch_loss/train': 0.8129103556275368}
12/24/2021 10:34:38 - INFO - codeparrot_training - Step 11087: {'lr': 0.00044312473776101545, 'samples': 5677056, 'steps': 11087, 'batch_loss/train': 1.0447966633364558}
12/24/2021 10:34:49 - INFO - codeparrot_training - Step 11088: {'lr': 0.00044311472753405013, 'samples': 5677568, 'steps': 11088, 'batch_loss/train': 0.9039094634354115}
12/24/2021 10:35:02 - INFO - codeparrot_training - Step 11089: {'lr': 0.0004431047165393336, 'samples': 5678080, 'steps': 11089, 'batch_loss/train': 0.8215498132631183}
12/24/2021 10:35:13 - INFO - codeparrot_training - Step 11090: {'lr': 0.0004430947047769056, 'samples': 5678592, 'steps': 11090, 'batch_loss/train': 0.8879576670005918}
12/24/2021 10:35:23 - INFO - codeparrot_training - Step 11091: {'lr': 0.00044308469224680594, 'samples': 5679104, 'steps': 11091, 'batch_loss/train': 0.7398049116600305}
12/24/2021 10:35:34 - INFO - codeparrot_training - Step 11092: {'lr': 0.0004430746789490745, 'samples': 5679616, 'steps': 11092, 'batch_loss/train': 0.8157217362895608}
12/24/2021 10:35:46 - INFO - codeparrot_training - Step 11093: {'lr': 0.00044306466488375104, 'samples': 5680128, 'steps': 11093, 'batch_loss/train': 0.8054641969501972}
12/24/2021 10:35:57 - INFO - codeparrot_training - Step 11094: {'lr': 0.0004430546500508753, 'samples': 5680640, 'steps': 11094, 'batch_loss/train': 0.9058647528290749}
12/24/2021 10:36:07 - INFO - codeparrot_training - Step 11095: {'lr': 0.0004430446344504873, 'samples': 5681152, 'steps': 11095, 'batch_loss/train': 0.6630594772286713}
12/24/2021 10:36:19 - INFO - codeparrot_training - Step 11096: {'lr': 0.00044303461808262656, 'samples': 5681664, 'steps': 11096, 'batch_loss/train': 1.0374391051009297}
12/24/2021 10:36:30 - INFO - codeparrot_training - Step 11097: {'lr': 0.00044302460094733316, 'samples': 5682176, 'steps': 11097, 'batch_loss/train': 0.823808710090816}
12/24/2021 10:36:41 - INFO - codeparrot_training - Step 11098: {'lr': 0.00044301458304464683, 'samples': 5682688, 'steps': 11098, 'batch_loss/train': 0.8431851202622056}
12/24/2021 10:36:55 - INFO - codeparrot_training - Step 11099: {'lr': 0.00044300456437460737, 'samples': 5683200, 'steps': 11099, 'batch_loss/train': 0.8670665142126381}
12/24/2021 10:37:05 - INFO - codeparrot_training - Step 11100: {'lr': 0.00044299454493725466, 'samples': 5683712, 'steps': 11100, 'batch_loss/train': 0.8318648366257548}
12/24/2021 10:37:16 - INFO - codeparrot_training - Step 11101: {'lr': 0.00044298452473262844, 'samples': 5684224, 'steps': 11101, 'batch_loss/train': 0.8869622424244881}
12/24/2021 10:37:28 - INFO - codeparrot_training - Step 11102: {'lr': 0.0004429745037607686, 'samples': 5684736, 'steps': 11102, 'batch_loss/train': 0.8269353630021214}
12/24/2021 10:37:39 - INFO - codeparrot_training - Step 11103: {'lr': 0.00044296448202171514, 'samples': 5685248, 'steps': 11103, 'batch_loss/train': 0.8063894666265696}
12/24/2021 10:37:49 - INFO - codeparrot_training - Step 11104: {'lr': 0.00044295445951550775, 'samples': 5685760, 'steps': 11104, 'batch_loss/train': 0.6847405945882201}
12/24/2021 10:38:00 - INFO - codeparrot_training - Step 11105: {'lr': 0.00044294443624218617, 'samples': 5686272, 'steps': 11105, 'batch_loss/train': 0.8288700487464666}
12/24/2021 10:38:12 - INFO - codeparrot_training - Step 11106: {'lr': 0.0004429344122017904, 'samples': 5686784, 'steps': 11106, 'batch_loss/train': 0.940660996362567}
12/24/2021 10:38:23 - INFO - codeparrot_training - Step 11107: {'lr': 0.0004429243873943604, 'samples': 5687296, 'steps': 11107, 'batch_loss/train': 1.481989034684375}
12/24/2021 10:38:33 - INFO - codeparrot_training - Step 11108: {'lr': 0.0004429143618199357, 'samples': 5687808, 'steps': 11108, 'batch_loss/train': 0.8300324627198279}
12/24/2021 10:38:45 - INFO - codeparrot_training - Step 11109: {'lr': 0.0004429043354785564, 'samples': 5688320, 'steps': 11109, 'batch_loss/train': 0.829406343633309}
12/24/2021 10:38:56 - INFO - codeparrot_training - Step 11110: {'lr': 0.0004428943083702623, 'samples': 5688832, 'steps': 11110, 'batch_loss/train': 0.8620848394930363}
12/24/2021 10:39:06 - INFO - codeparrot_training - Step 11111: {'lr': 0.00044288428049509324, 'samples': 5689344, 'steps': 11111, 'batch_loss/train': 0.8169820019975305}
12/24/2021 10:39:20 - INFO - codeparrot_training - Step 11112: {'lr': 0.00044287425185308917, 'samples': 5689856, 'steps': 11112, 'batch_loss/train': 0.8594313263893127}
12/24/2021 10:39:31 - INFO - codeparrot_training - Step 11113: {'lr': 0.00044286422244428983, 'samples': 5690368, 'steps': 11113, 'batch_loss/train': 0.7942333240061998}
12/24/2021 10:39:41 - INFO - codeparrot_training - Step 11114: {'lr': 0.0004428541922687352, 'samples': 5690880, 'steps': 11114, 'batch_loss/train': 0.8725786674767733}
12/24/2021 10:39:52 - INFO - codeparrot_training - Step 11115: {'lr': 0.0004428441613264651, 'samples': 5691392, 'steps': 11115, 'batch_loss/train': 0.7966430040542036}
12/24/2021 10:40:04 - INFO - codeparrot_training - Step 11116: {'lr': 0.00044283412961751943, 'samples': 5691904, 'steps': 11116, 'batch_loss/train': 0.9165596491657197}
12/24/2021 10:40:15 - INFO - codeparrot_training - Step 11117: {'lr': 0.0004428240971419382, 'samples': 5692416, 'steps': 11117, 'batch_loss/train': 0.7973730945959687}
12/24/2021 10:40:25 - INFO - codeparrot_training - Step 11118: {'lr': 0.000442814063899761, 'samples': 5692928, 'steps': 11118, 'batch_loss/train': 0.8621896132826805}
12/24/2021 10:40:39 - INFO - codeparrot_training - Step 11119: {'lr': 0.00044280402989102795, 'samples': 5693440, 'steps': 11119, 'batch_loss/train': 0.8187073385342956}
12/24/2021 10:40:50 - INFO - codeparrot_training - Step 11120: {'lr': 0.00044279399511577887, 'samples': 5693952, 'steps': 11120, 'batch_loss/train': 0.8307838412001729}
12/24/2021 10:41:00 - INFO - codeparrot_training - Step 11121: {'lr': 0.00044278395957405366, 'samples': 5694464, 'steps': 11121, 'batch_loss/train': 0.8712563486769795}
12/24/2021 10:41:12 - INFO - codeparrot_training - Step 11122: {'lr': 0.0004427739232658922, 'samples': 5694976, 'steps': 11122, 'batch_loss/train': 0.8968441970646381}
12/24/2021 10:41:23 - INFO - codeparrot_training - Step 11123: {'lr': 0.00044276388619133444, 'samples': 5695488, 'steps': 11123, 'batch_loss/train': 0.8653715560212731}
12/24/2021 10:41:33 - INFO - codeparrot_training - Step 11124: {'lr': 0.0004427538483504202, 'samples': 5696000, 'steps': 11124, 'batch_loss/train': 0.8351251622661948}
12/24/2021 10:41:44 - INFO - codeparrot_training - Step 11125: {'lr': 0.0004427438097431895, 'samples': 5696512, 'steps': 11125, 'batch_loss/train': 0.7134669069200754}
12/24/2021 10:41:56 - INFO - codeparrot_training - Step 11126: {'lr': 0.0004427337703696822, 'samples': 5697024, 'steps': 11126, 'batch_loss/train': 0.6375352437607944}
12/24/2021 10:42:07 - INFO - codeparrot_training - Step 11127: {'lr': 0.00044272373022993816, 'samples': 5697536, 'steps': 11127, 'batch_loss/train': 0.860275296960026}
12/24/2021 10:42:18 - INFO - codeparrot_training - Step 11128: {'lr': 0.00044271368932399737, 'samples': 5698048, 'steps': 11128, 'batch_loss/train': 0.7285495409741998}
12/24/2021 10:42:31 - INFO - codeparrot_training - Step 11129: {'lr': 0.00044270364765189974, 'samples': 5698560, 'steps': 11129, 'batch_loss/train': 0.8645377792418003}
12/24/2021 10:42:42 - INFO - codeparrot_training - Step 11130: {'lr': 0.0004426936052136851, 'samples': 5699072, 'steps': 11130, 'batch_loss/train': 0.7293839233461767}
12/24/2021 10:42:53 - INFO - codeparrot_training - Step 11131: {'lr': 0.0004426835620093935, 'samples': 5699584, 'steps': 11131, 'batch_loss/train': 0.820781207177788}
12/24/2021 10:43:05 - INFO - codeparrot_training - Step 11132: {'lr': 0.00044267351803906475, 'samples': 5700096, 'steps': 11132, 'batch_loss/train': 0.7989412005990744}
12/24/2021 10:43:16 - INFO - codeparrot_training - Step 11133: {'lr': 0.00044266347330273895, 'samples': 5700608, 'steps': 11133, 'batch_loss/train': 0.8629915565252304}
12/24/2021 10:43:26 - INFO - codeparrot_training - Step 11134: {'lr': 0.0004426534278004558, 'samples': 5701120, 'steps': 11134, 'batch_loss/train': 0.8896132614463568}
12/24/2021 10:43:40 - INFO - codeparrot_training - Step 11135: {'lr': 0.0004426433815322555, 'samples': 5701632, 'steps': 11135, 'batch_loss/train': 0.8391165928915143}
12/24/2021 10:43:51 - INFO - codeparrot_training - Step 11136: {'lr': 0.0004426333344981777, 'samples': 5702144, 'steps': 11136, 'batch_loss/train': 0.9965373193845153}
12/24/2021 10:44:01 - INFO - codeparrot_training - Step 11137: {'lr': 0.0004426232866982626, 'samples': 5702656, 'steps': 11137, 'batch_loss/train': 0.8165169293060899}
12/24/2021 10:44:12 - INFO - codeparrot_training - Step 11138: {'lr': 0.00044261323813255, 'samples': 5703168, 'steps': 11138, 'batch_loss/train': 0.8223088579252362}
12/24/2021 10:44:24 - INFO - codeparrot_training - Step 11139: {'lr': 0.0004426031888010799, 'samples': 5703680, 'steps': 11139, 'batch_loss/train': 0.8249701331369579}
12/24/2021 10:44:34 - INFO - codeparrot_training - Step 11140: {'lr': 0.0004425931387038923, 'samples': 5704192, 'steps': 11140, 'batch_loss/train': 0.7277477788738906}
12/24/2021 10:44:45 - INFO - codeparrot_training - Step 11141: {'lr': 0.000442583087841027, 'samples': 5704704, 'steps': 11141, 'batch_loss/train': 0.7740936330519617}
12/24/2021 10:44:57 - INFO - codeparrot_training - Step 11142: {'lr': 0.0004425730362125241, 'samples': 5705216, 'steps': 11142, 'batch_loss/train': 0.8565419567748904}
12/24/2021 10:45:08 - INFO - codeparrot_training - Step 11143: {'lr': 0.00044256298381842356, 'samples': 5705728, 'steps': 11143, 'batch_loss/train': 0.7929166192188859}
12/24/2021 10:45:18 - INFO - codeparrot_training - Step 11144: {'lr': 0.0004425529306587653, 'samples': 5706240, 'steps': 11144, 'batch_loss/train': 0.8581315148621798}
12/24/2021 10:45:30 - INFO - codeparrot_training - Step 11145: {'lr': 0.0004425428767335893, 'samples': 5706752, 'steps': 11145, 'batch_loss/train': 1.5988205368630588}
12/24/2021 10:45:41 - INFO - codeparrot_training - Step 11146: {'lr': 0.0004425328220429355, 'samples': 5707264, 'steps': 11146, 'batch_loss/train': 0.8395934049040079}
12/24/2021 10:45:52 - INFO - codeparrot_training - Step 11147: {'lr': 0.00044252276658684385, 'samples': 5707776, 'steps': 11147, 'batch_loss/train': 0.8607608159072697}
12/24/2021 10:46:02 - INFO - codeparrot_training - Step 11148: {'lr': 0.0004425127103653545, 'samples': 5708288, 'steps': 11148, 'batch_loss/train': 0.8472192566841841}
12/24/2021 10:46:16 - INFO - codeparrot_training - Step 11149: {'lr': 0.00044250265337850717, 'samples': 5708800, 'steps': 11149, 'batch_loss/train': 0.9719356098212302}
12/24/2021 10:46:27 - INFO - codeparrot_training - Step 11150: {'lr': 0.00044249259562634204, 'samples': 5709312, 'steps': 11150, 'batch_loss/train': 0.8288464304059744}
12/24/2021 10:46:38 - INFO - codeparrot_training - Step 11151: {'lr': 0.000442482537108899, 'samples': 5709824, 'steps': 11151, 'batch_loss/train': 0.7702581195626408}
12/24/2021 10:46:50 - INFO - codeparrot_training - Step 11152: {'lr': 0.00044247247782621806, 'samples': 5710336, 'steps': 11152, 'batch_loss/train': 1.1492256624624133}
12/24/2021 10:47:00 - INFO - codeparrot_training - Step 11153: {'lr': 0.0004424624177783393, 'samples': 5710848, 'steps': 11153, 'batch_loss/train': 0.6790706706233323}
12/24/2021 10:47:11 - INFO - codeparrot_training - Step 11154: {'lr': 0.0004424523569653026, 'samples': 5711360, 'steps': 11154, 'batch_loss/train': 0.7970916898921132}
12/24/2021 10:47:21 - INFO - codeparrot_training - Step 11155: {'lr': 0.000442442295387148, 'samples': 5711872, 'steps': 11155, 'batch_loss/train': 0.802778500597924}
12/24/2021 10:47:34 - INFO - codeparrot_training - Step 11156: {'lr': 0.0004424322330439154, 'samples': 5712384, 'steps': 11156, 'batch_loss/train': 0.7263406878337264}
12/24/2021 10:47:44 - INFO - codeparrot_training - Step 11157: {'lr': 0.000442422169935645, 'samples': 5712896, 'steps': 11157, 'batch_loss/train': 0.7565686935558915}
12/24/2021 10:47:55 - INFO - codeparrot_training - Step 11158: {'lr': 0.00044241210606237676, 'samples': 5713408, 'steps': 11158, 'batch_loss/train': 0.8187745027244091}
12/24/2021 10:48:08 - INFO - codeparrot_training - Step 11159: {'lr': 0.0004424020414241505, 'samples': 5713920, 'steps': 11159, 'batch_loss/train': 0.8271576510742307}
12/24/2021 10:48:19 - INFO - codeparrot_training - Step 11160: {'lr': 0.0004423919760210065, 'samples': 5714432, 'steps': 11160, 'batch_loss/train': 0.8404491180554032}
12/24/2021 10:48:30 - INFO - codeparrot_training - Step 11161: {'lr': 0.0004423819098529846, 'samples': 5714944, 'steps': 11161, 'batch_loss/train': 0.8707414641976357}
12/24/2021 10:48:42 - INFO - codeparrot_training - Step 11162: {'lr': 0.0004423718429201248, 'samples': 5715456, 'steps': 11162, 'batch_loss/train': 0.8677752986550331}
12/24/2021 10:48:53 - INFO - codeparrot_training - Step 11163: {'lr': 0.0004423617752224673, 'samples': 5715968, 'steps': 11163, 'batch_loss/train': 0.8771598495077342}
12/24/2021 10:49:03 - INFO - codeparrot_training - Step 11164: {'lr': 0.00044235170676005197, 'samples': 5716480, 'steps': 11164, 'batch_loss/train': 0.8582580974325538}
12/24/2021 10:49:14 - INFO - codeparrot_training - Step 11165: {'lr': 0.0004423416375329189, 'samples': 5716992, 'steps': 11165, 'batch_loss/train': 0.4342476944439113}
12/24/2021 10:49:28 - INFO - codeparrot_training - Step 11166: {'lr': 0.00044233156754110804, 'samples': 5717504, 'steps': 11166, 'batch_loss/train': 0.904159078374505}
12/24/2021 10:49:38 - INFO - codeparrot_training - Step 11167: {'lr': 0.00044232149678465955, 'samples': 5718016, 'steps': 11167, 'batch_loss/train': 0.8716487353667617}
12/24/2021 10:49:49 - INFO - codeparrot_training - Step 11168: {'lr': 0.00044231142526361336, 'samples': 5718528, 'steps': 11168, 'batch_loss/train': 0.7914180117659271}
12/24/2021 10:50:01 - INFO - codeparrot_training - Step 11169: {'lr': 0.0004423013529780096, 'samples': 5719040, 'steps': 11169, 'batch_loss/train': 0.8193055149167776}
12/24/2021 10:50:11 - INFO - codeparrot_training - Step 11170: {'lr': 0.00044229127992788825, 'samples': 5719552, 'steps': 11170, 'batch_loss/train': 0.7323908100370318}
12/24/2021 10:50:22 - INFO - codeparrot_training - Step 11171: {'lr': 0.00044228120611328935, 'samples': 5720064, 'steps': 11171, 'batch_loss/train': 0.8911108383908868}
12/24/2021 10:50:33 - INFO - codeparrot_training - Step 11172: {'lr': 0.0004422711315342531, 'samples': 5720576, 'steps': 11172, 'batch_loss/train': 0.7858989862725139}
12/24/2021 10:50:45 - INFO - codeparrot_training - Step 11173: {'lr': 0.0004422610561908193, 'samples': 5721088, 'steps': 11173, 'batch_loss/train': 0.8519919933751225}
12/24/2021 10:50:55 - INFO - codeparrot_training - Step 11174: {'lr': 0.0004422509800830282, 'samples': 5721600, 'steps': 11174, 'batch_loss/train': 0.8694321778602898}
12/24/2021 10:51:06 - INFO - codeparrot_training - Step 11175: {'lr': 0.0004422409032109197, 'samples': 5722112, 'steps': 11175, 'batch_loss/train': 0.8770554577931762}
12/24/2021 10:51:20 - INFO - codeparrot_training - Step 11176: {'lr': 0.0004422308255745341, 'samples': 5722624, 'steps': 11176, 'batch_loss/train': 0.9905525036156178}
12/24/2021 10:51:30 - INFO - codeparrot_training - Step 11177: {'lr': 0.00044222074717391123, 'samples': 5723136, 'steps': 11177, 'batch_loss/train': 0.8736726539209485}
12/24/2021 10:51:41 - INFO - codeparrot_training - Step 11178: {'lr': 0.00044221066800909125, 'samples': 5723648, 'steps': 11178, 'batch_loss/train': 0.919305610936135}
12/24/2021 10:51:53 - INFO - codeparrot_training - Step 11179: {'lr': 0.0004422005880801142, 'samples': 5724160, 'steps': 11179, 'batch_loss/train': 1.2957436703145504}
12/24/2021 10:52:04 - INFO - codeparrot_training - Step 11180: {'lr': 0.0004421905073870203, 'samples': 5724672, 'steps': 11180, 'batch_loss/train': 0.796544331125915}
12/24/2021 10:52:14 - INFO - codeparrot_training - Step 11181: {'lr': 0.0004421804259298494, 'samples': 5725184, 'steps': 11181, 'batch_loss/train': 0.8540357677266002}
12/24/2021 10:52:26 - INFO - codeparrot_training - Step 11182: {'lr': 0.0004421703437086417, 'samples': 5725696, 'steps': 11182, 'batch_loss/train': 0.6905936072580516}
12/24/2021 10:52:37 - INFO - codeparrot_training - Step 11183: {'lr': 0.00044216026072343734, 'samples': 5726208, 'steps': 11183, 'batch_loss/train': 0.8800710837822407}
12/24/2021 10:52:48 - INFO - codeparrot_training - Step 11184: {'lr': 0.0004421501769742763, 'samples': 5726720, 'steps': 11184, 'batch_loss/train': 0.9592139050364494}
12/24/2021 10:52:58 - INFO - codeparrot_training - Step 11185: {'lr': 0.00044214009246119874, 'samples': 5727232, 'steps': 11185, 'batch_loss/train': 0.7899616369977593}
12/24/2021 10:53:12 - INFO - codeparrot_training - Step 11186: {'lr': 0.0004421300071842447, 'samples': 5727744, 'steps': 11186, 'batch_loss/train': 0.8674788124626502}
12/24/2021 10:53:23 - INFO - codeparrot_training - Step 11187: {'lr': 0.0004421199211434543, 'samples': 5728256, 'steps': 11187, 'batch_loss/train': 0.8708415199071169}
12/24/2021 10:53:33 - INFO - codeparrot_training - Step 11188: {'lr': 0.0004421098343388676, 'samples': 5728768, 'steps': 11188, 'batch_loss/train': 0.7717052698135376}
12/24/2021 10:53:45 - INFO - codeparrot_training - Step 11189: {'lr': 0.00044209974677052475, 'samples': 5729280, 'steps': 11189, 'batch_loss/train': 0.8244530567899346}
12/24/2021 10:53:56 - INFO - codeparrot_training - Step 11190: {'lr': 0.00044208965843846586, 'samples': 5729792, 'steps': 11190, 'batch_loss/train': 0.915540904738009}
12/24/2021 10:54:07 - INFO - codeparrot_training - Step 11191: {'lr': 0.000442079569342731, 'samples': 5730304, 'steps': 11191, 'batch_loss/train': 0.9720582254230976}
12/24/2021 10:54:19 - INFO - codeparrot_training - Step 11192: {'lr': 0.00044206947948336036, 'samples': 5730816, 'steps': 11192, 'batch_loss/train': 0.8505004663020372}
12/24/2021 10:54:30 - INFO - codeparrot_training - Step 11193: {'lr': 0.000442059388860394, 'samples': 5731328, 'steps': 11193, 'batch_loss/train': 0.9498777547851205}
12/24/2021 10:54:40 - INFO - codeparrot_training - Step 11194: {'lr': 0.0004420492974738719, 'samples': 5731840, 'steps': 11194, 'batch_loss/train': 0.849251595325768}
12/24/2021 10:54:51 - INFO - codeparrot_training - Step 11195: {'lr': 0.0004420392053238344, 'samples': 5732352, 'steps': 11195, 'batch_loss/train': 0.8019404695369303}
12/24/2021 10:55:03 - INFO - codeparrot_training - Step 11196: {'lr': 0.00044202911241032154, 'samples': 5732864, 'steps': 11196, 'batch_loss/train': 0.8295486159622669}
12/24/2021 10:55:13 - INFO - codeparrot_training - Step 11197: {'lr': 0.00044201901873337334, 'samples': 5733376, 'steps': 11197, 'batch_loss/train': 0.7422407595440745}
12/24/2021 10:55:24 - INFO - codeparrot_training - Step 11198: {'lr': 0.0004420089242930301, 'samples': 5733888, 'steps': 11198, 'batch_loss/train': 0.8162256292998791}
12/24/2021 10:55:38 - INFO - codeparrot_training - Step 11199: {'lr': 0.0004419988290893319, 'samples': 5734400, 'steps': 11199, 'batch_loss/train': 0.8221523206448182}
12/24/2021 10:55:48 - INFO - codeparrot_training - Step 11200: {'lr': 0.0004419887331223188, 'samples': 5734912, 'steps': 11200, 'batch_loss/train': 0.8353649012278765}
12/24/2021 10:55:59 - INFO - codeparrot_training - Step 11201: {'lr': 0.0004419786363920311, 'samples': 5735424, 'steps': 11201, 'batch_loss/train': 0.8013505423441529}
12/24/2021 10:56:11 - INFO - codeparrot_training - Step 11202: {'lr': 0.0004419685388985087, 'samples': 5735936, 'steps': 11202, 'batch_loss/train': 0.7680252005811781}
12/24/2021 10:56:22 - INFO - codeparrot_training - Step 11203: {'lr': 0.00044195844064179193, 'samples': 5736448, 'steps': 11203, 'batch_loss/train': 0.815872160717845}
12/24/2021 10:56:32 - INFO - codeparrot_training - Step 11204: {'lr': 0.0004419483416219208, 'samples': 5736960, 'steps': 11204, 'batch_loss/train': 0.9157598465681076}
12/24/2021 10:56:43 - INFO - codeparrot_training - Step 11205: {'lr': 0.0004419382418389356, 'samples': 5737472, 'steps': 11205, 'batch_loss/train': 0.9527588505297899}
12/24/2021 10:56:57 - INFO - codeparrot_training - Step 11206: {'lr': 0.0004419281412928765, 'samples': 5737984, 'steps': 11206, 'batch_loss/train': 0.8510850537568331}
12/24/2021 10:57:07 - INFO - codeparrot_training - Step 11207: {'lr': 0.0004419180399837835, 'samples': 5738496, 'steps': 11207, 'batch_loss/train': 0.8768197502940893}
12/24/2021 10:57:18 - INFO - codeparrot_training - Step 11208: {'lr': 0.00044190793791169683, 'samples': 5739008, 'steps': 11208, 'batch_loss/train': 0.9188743019476533}
12/24/2021 10:57:30 - INFO - codeparrot_training - Step 11209: {'lr': 0.0004418978350766567, 'samples': 5739520, 'steps': 11209, 'batch_loss/train': 0.824019598774612}
12/24/2021 10:57:41 - INFO - codeparrot_training - Step 11210: {'lr': 0.0004418877314787032, 'samples': 5740032, 'steps': 11210, 'batch_loss/train': 0.8195662889629602}
12/24/2021 10:57:51 - INFO - codeparrot_training - Step 11211: {'lr': 0.00044187762711787653, 'samples': 5740544, 'steps': 11211, 'batch_loss/train': 0.7950182454660535}
12/24/2021 10:58:03 - INFO - codeparrot_training - Step 11212: {'lr': 0.00044186752199421687, 'samples': 5741056, 'steps': 11212, 'batch_loss/train': 0.7598975412547588}
12/24/2021 10:58:14 - INFO - codeparrot_training - Step 11213: {'lr': 0.00044185741610776444, 'samples': 5741568, 'steps': 11213, 'batch_loss/train': 0.9093993450514972}
12/24/2021 10:58:24 - INFO - codeparrot_training - Step 11214: {'lr': 0.00044184730945855935, 'samples': 5742080, 'steps': 11214, 'batch_loss/train': 0.8479540352709591}
12/24/2021 10:58:38 - INFO - codeparrot_training - Step 11215: {'lr': 0.0004418372020466417, 'samples': 5742592, 'steps': 11215, 'batch_loss/train': 0.8043979057110846}
12/24/2021 10:58:49 - INFO - codeparrot_training - Step 11216: {'lr': 0.0004418270938720519, 'samples': 5743104, 'steps': 11216, 'batch_loss/train': 0.5510339685715735}
12/24/2021 10:59:00 - INFO - codeparrot_training - Step 11217: {'lr': 0.0004418169849348299, 'samples': 5743616, 'steps': 11217, 'batch_loss/train': 0.9335622685030103}
12/24/2021 10:59:10 - INFO - codeparrot_training - Step 11218: {'lr': 0.00044180687523501607, 'samples': 5744128, 'steps': 11218, 'batch_loss/train': 0.7888668100349605}
12/24/2021 10:59:22 - INFO - codeparrot_training - Step 11219: {'lr': 0.00044179676477265053, 'samples': 5744640, 'steps': 11219, 'batch_loss/train': 0.7729794946499169}
12/24/2021 10:59:33 - INFO - codeparrot_training - Step 11220: {'lr': 0.00044178665354777346, 'samples': 5745152, 'steps': 11220, 'batch_loss/train': 1.0207809312269092}
12/24/2021 10:59:44 - INFO - codeparrot_training - Step 11221: {'lr': 0.00044177654156042504, 'samples': 5745664, 'steps': 11221, 'batch_loss/train': 0.8156588603742421}
12/24/2021 10:59:57 - INFO - codeparrot_training - Step 11222: {'lr': 0.00044176642881064553, 'samples': 5746176, 'steps': 11222, 'batch_loss/train': 0.7532892231829464}
12/24/2021 11:00:08 - INFO - codeparrot_training - Step 11223: {'lr': 0.0004417563152984751, 'samples': 5746688, 'steps': 11223, 'batch_loss/train': 0.8347540097311139}
12/24/2021 11:00:19 - INFO - codeparrot_training - Step 11224: {'lr': 0.0004417462010239539, 'samples': 5747200, 'steps': 11224, 'batch_loss/train': 0.8525079041719437}
12/24/2021 11:00:29 - INFO - codeparrot_training - Step 11225: {'lr': 0.00044173608598712234, 'samples': 5747712, 'steps': 11225, 'batch_loss/train': 0.8903640280477703}
12/24/2021 11:00:41 - INFO - codeparrot_training - Step 11226: {'lr': 0.00044172597018802043, 'samples': 5748224, 'steps': 11226, 'batch_loss/train': 0.7309421263635159}
12/24/2021 11:00:52 - INFO - codeparrot_training - Step 11227: {'lr': 0.0004417158536266884, 'samples': 5748736, 'steps': 11227, 'batch_loss/train': 0.7857103231362998}
12/24/2021 11:01:02 - INFO - codeparrot_training - Step 11228: {'lr': 0.0004417057363031666, 'samples': 5749248, 'steps': 11228, 'batch_loss/train': 0.837283531203866}
12/24/2021 11:01:15 - INFO - codeparrot_training - Step 11229: {'lr': 0.00044169561821749513, 'samples': 5749760, 'steps': 11229, 'batch_loss/train': 0.7476801623124629}
12/24/2021 11:01:25 - INFO - codeparrot_training - Step 11230: {'lr': 0.0004416854993697143, 'samples': 5750272, 'steps': 11230, 'batch_loss/train': 0.9097103541716933}
12/24/2021 11:01:36 - INFO - codeparrot_training - Step 11231: {'lr': 0.0004416753797598643, 'samples': 5750784, 'steps': 11231, 'batch_loss/train': 0.8856975859962404}
12/24/2021 11:01:46 - INFO - codeparrot_training - Step 11232: {'lr': 0.00044166525938798537, 'samples': 5751296, 'steps': 11232, 'batch_loss/train': 0.8870255723595619}
12/24/2021 11:01:58 - INFO - codeparrot_training - Step 11233: {'lr': 0.00044165513825411773, 'samples': 5751808, 'steps': 11233, 'batch_loss/train': 0.7362006776966155}
12/24/2021 11:02:09 - INFO - codeparrot_training - Step 11234: {'lr': 0.00044164501635830154, 'samples': 5752320, 'steps': 11234, 'batch_loss/train': 0.9232042841613293}
12/24/2021 11:02:20 - INFO - codeparrot_training - Step 11235: {'lr': 0.0004416348937005773, 'samples': 5752832, 'steps': 11235, 'batch_loss/train': 0.9391067186370492}
12/24/2021 11:02:34 - INFO - codeparrot_training - Step 11236: {'lr': 0.00044162477028098494, 'samples': 5753344, 'steps': 11236, 'batch_loss/train': 0.8865085989236832}
12/24/2021 11:02:44 - INFO - codeparrot_training - Step 11237: {'lr': 0.0004416146460995649, 'samples': 5753856, 'steps': 11237, 'batch_loss/train': 0.7965357147622854}
12/24/2021 11:02:55 - INFO - codeparrot_training - Step 11238: {'lr': 0.0004416045211563574, 'samples': 5754368, 'steps': 11238, 'batch_loss/train': 0.8427147176116705}
12/24/2021 11:03:07 - INFO - codeparrot_training - Step 11239: {'lr': 0.0004415943954514026, 'samples': 5754880, 'steps': 11239, 'batch_loss/train': 0.8874052874743938}
12/24/2021 11:03:17 - INFO - codeparrot_training - Step 11240: {'lr': 0.0004415842689847409, 'samples': 5755392, 'steps': 11240, 'batch_loss/train': 0.9022743757814169}
12/24/2021 11:03:28 - INFO - codeparrot_training - Step 11241: {'lr': 0.0004415741417564124, 'samples': 5755904, 'steps': 11241, 'batch_loss/train': 0.8714800260495394}
12/24/2021 11:03:40 - INFO - codeparrot_training - Step 11242: {'lr': 0.00044156401376645757, 'samples': 5756416, 'steps': 11242, 'batch_loss/train': 0.8362982899416238}
12/24/2021 11:03:51 - INFO - codeparrot_training - Step 11243: {'lr': 0.00044155388501491646, 'samples': 5756928, 'steps': 11243, 'batch_loss/train': 0.8660383612150326}
12/24/2021 11:04:01 - INFO - codeparrot_training - Step 11244: {'lr': 0.00044154375550182947, 'samples': 5757440, 'steps': 11244, 'batch_loss/train': 0.8110981229692698}
12/24/2021 11:04:12 - INFO - codeparrot_training - Step 11245: {'lr': 0.0004415336252272368, 'samples': 5757952, 'steps': 11245, 'batch_loss/train': 0.8396625770255923}
12/24/2021 11:04:26 - INFO - codeparrot_training - Step 11246: {'lr': 0.00044152349419117885, 'samples': 5758464, 'steps': 11246, 'batch_loss/train': 0.8752744048833847}
12/24/2021 11:04:36 - INFO - codeparrot_training - Step 11247: {'lr': 0.00044151336239369566, 'samples': 5758976, 'steps': 11247, 'batch_loss/train': 0.8359992923215032}
12/24/2021 11:04:47 - INFO - codeparrot_training - Step 11248: {'lr': 0.0004415032298348277, 'samples': 5759488, 'steps': 11248, 'batch_loss/train': 0.7779805832542479}
12/24/2021 11:04:59 - INFO - codeparrot_training - Step 11249: {'lr': 0.00044149309651461525, 'samples': 5760000, 'steps': 11249, 'batch_loss/train': 0.7762058591470122}
12/24/2021 11:05:10 - INFO - codeparrot_training - Step 11250: {'lr': 0.0004414829624330986, 'samples': 5760512, 'steps': 11250, 'batch_loss/train': 0.7964166961610317}
12/24/2021 11:05:20 - INFO - codeparrot_training - Step 11251: {'lr': 0.00044147282759031794, 'samples': 5761024, 'steps': 11251, 'batch_loss/train': 0.7928303363732994}
12/24/2021 11:05:31 - INFO - codeparrot_training - Step 11252: {'lr': 0.00044146269198631357, 'samples': 5761536, 'steps': 11252, 'batch_loss/train': 0.9018150614574552}
12/24/2021 11:05:45 - INFO - codeparrot_training - Step 11253: {'lr': 0.0004414525556211258, 'samples': 5762048, 'steps': 11253, 'batch_loss/train': 0.9370051044970751}
12/24/2021 11:05:55 - INFO - codeparrot_training - Step 11254: {'lr': 0.00044144241849479503, 'samples': 5762560, 'steps': 11254, 'batch_loss/train': 0.7996881385333836}
12/24/2021 11:06:06 - INFO - codeparrot_training - Step 11255: {'lr': 0.0004414322806073615, 'samples': 5763072, 'steps': 11255, 'batch_loss/train': 0.8327784049324691}
12/24/2021 11:06:18 - INFO - codeparrot_training - Step 11256: {'lr': 0.0004414221419588655, 'samples': 5763584, 'steps': 11256, 'batch_loss/train': 0.841477588750422}
12/24/2021 11:06:29 - INFO - codeparrot_training - Step 11257: {'lr': 0.0004414120025493473, 'samples': 5764096, 'steps': 11257, 'batch_loss/train': 0.7966758152469993}
12/24/2021 11:06:39 - INFO - codeparrot_training - Step 11258: {'lr': 0.00044140186237884727, 'samples': 5764608, 'steps': 11258, 'batch_loss/train': 0.6987661223392934}
12/24/2021 11:06:51 - INFO - codeparrot_training - Step 11259: {'lr': 0.0004413917214474058, 'samples': 5765120, 'steps': 11259, 'batch_loss/train': 0.7847692333161831}
12/24/2021 11:07:02 - INFO - codeparrot_training - Step 11260: {'lr': 0.000441381579755063, 'samples': 5765632, 'steps': 11260, 'batch_loss/train': 0.8292094580829144}
12/24/2021 11:07:13 - INFO - codeparrot_training - Step 11261: {'lr': 0.00044137143730185936, 'samples': 5766144, 'steps': 11261, 'batch_loss/train': 0.7969049504026771}
12/24/2021 11:07:23 - INFO - codeparrot_training - Step 11262: {'lr': 0.0004413612940878351, 'samples': 5766656, 'steps': 11262, 'batch_loss/train': 0.8590116817504168}
12/24/2021 11:07:35 - INFO - codeparrot_training - Step 11263: {'lr': 0.0004413511501130306, 'samples': 5767168, 'steps': 11263, 'batch_loss/train': 0.7678686426952481}
12/24/2021 11:07:46 - INFO - codeparrot_training - Step 11264: {'lr': 0.00044134100537748615, 'samples': 5767680, 'steps': 11264, 'batch_loss/train': 0.802765496307984}
12/24/2021 11:07:56 - INFO - codeparrot_training - Step 11265: {'lr': 0.0004413308598812421, 'samples': 5768192, 'steps': 11265, 'batch_loss/train': 0.7897660518065095}
12/24/2021 11:08:10 - INFO - codeparrot_training - Step 11266: {'lr': 0.0004413207136243389, 'samples': 5768704, 'steps': 11266, 'batch_loss/train': 0.7187366515863687}
12/24/2021 11:08:21 - INFO - codeparrot_training - Step 11267: {'lr': 0.0004413105666068167, 'samples': 5769216, 'steps': 11267, 'batch_loss/train': 0.6731575652956963}
12/24/2021 11:08:31 - INFO - codeparrot_training - Step 11268: {'lr': 0.00044130041882871595, 'samples': 5769728, 'steps': 11268, 'batch_loss/train': 0.823978878557682}
12/24/2021 11:08:43 - INFO - codeparrot_training - Step 11269: {'lr': 0.00044129027029007694, 'samples': 5770240, 'steps': 11269, 'batch_loss/train': 0.7850673040375113}
12/24/2021 11:08:54 - INFO - codeparrot_training - Step 11270: {'lr': 0.00044128012099094003, 'samples': 5770752, 'steps': 11270, 'batch_loss/train': 0.8567401790060103}
12/24/2021 11:09:05 - INFO - codeparrot_training - Step 11271: {'lr': 0.0004412699709313457, 'samples': 5771264, 'steps': 11271, 'batch_loss/train': 0.817027491517365}
12/24/2021 11:09:15 - INFO - codeparrot_training - Step 11272: {'lr': 0.0004412598201113341, 'samples': 5771776, 'steps': 11272, 'batch_loss/train': 0.8625775841064751}
12/24/2021 11:09:27 - INFO - codeparrot_training - Step 11273: {'lr': 0.0004412496685309456, 'samples': 5772288, 'steps': 11273, 'batch_loss/train': 0.9549995735287666}
12/24/2021 11:09:38 - INFO - codeparrot_training - Step 11274: {'lr': 0.00044123951619022073, 'samples': 5772800, 'steps': 11274, 'batch_loss/train': 0.817158910445869}
12/24/2021 11:09:49 - INFO - codeparrot_training - Step 11275: {'lr': 0.0004412293630891998, 'samples': 5773312, 'steps': 11275, 'batch_loss/train': 0.7671540761366487}
12/24/2021 11:10:02 - INFO - codeparrot_training - Step 11276: {'lr': 0.00044121920922792304, 'samples': 5773824, 'steps': 11276, 'batch_loss/train': 0.812494439072907}
12/24/2021 11:10:13 - INFO - codeparrot_training - Step 11277: {'lr': 0.0004412090546064309, 'samples': 5774336, 'steps': 11277, 'batch_loss/train': 0.8377484964439645}
12/24/2021 11:10:24 - INFO - codeparrot_training - Step 11278: {'lr': 0.00044119889922476373, 'samples': 5774848, 'steps': 11278, 'batch_loss/train': 0.8151380242779851}
12/24/2021 11:10:36 - INFO - codeparrot_training - Step 11279: {'lr': 0.00044118874308296204, 'samples': 5775360, 'steps': 11279, 'batch_loss/train': 0.8400231581181288}
12/24/2021 11:10:47 - INFO - codeparrot_training - Step 11280: {'lr': 0.00044117858618106606, 'samples': 5775872, 'steps': 11280, 'batch_loss/train': 0.8155936682596803}
12/24/2021 11:10:57 - INFO - codeparrot_training - Step 11281: {'lr': 0.00044116842851911616, 'samples': 5776384, 'steps': 11281, 'batch_loss/train': 0.8166013704612851}
12/24/2021 11:11:08 - INFO - codeparrot_training - Step 11282: {'lr': 0.0004411582700971528, 'samples': 5776896, 'steps': 11282, 'batch_loss/train': 0.7909044022671878}
12/24/2021 11:11:22 - INFO - codeparrot_training - Step 11283: {'lr': 0.00044114811091521634, 'samples': 5777408, 'steps': 11283, 'batch_loss/train': 0.9080973295494914}
12/24/2021 11:11:33 - INFO - codeparrot_training - Step 11284: {'lr': 0.00044113795097334714, 'samples': 5777920, 'steps': 11284, 'batch_loss/train': 0.6880807410925627}
12/24/2021 11:11:43 - INFO - codeparrot_training - Step 11285: {'lr': 0.00044112779027158567, 'samples': 5778432, 'steps': 11285, 'batch_loss/train': 1.357455468736589}
12/24/2021 11:11:55 - INFO - codeparrot_training - Step 11286: {'lr': 0.0004411176288099722, 'samples': 5778944, 'steps': 11286, 'batch_loss/train': 0.7489157347008586}
12/24/2021 11:12:06 - INFO - codeparrot_training - Step 11287: {'lr': 0.0004411074665885473, 'samples': 5779456, 'steps': 11287, 'batch_loss/train': 0.49860278191044927}
12/24/2021 11:12:17 - INFO - codeparrot_training - Step 11288: {'lr': 0.0004410973036073511, 'samples': 5779968, 'steps': 11288, 'batch_loss/train': 0.8126764164771885}
12/24/2021 11:12:27 - INFO - codeparrot_training - Step 11289: {'lr': 0.0004410871398664243, 'samples': 5780480, 'steps': 11289, 'batch_loss/train': 0.8276293769013137}
12/24/2021 11:12:39 - INFO - codeparrot_training - Step 11290: {'lr': 0.00044107697536580726, 'samples': 5780992, 'steps': 11290, 'batch_loss/train': 0.8956139152869582}
12/24/2021 11:12:50 - INFO - codeparrot_training - Step 11291: {'lr': 0.00044106681010554015, 'samples': 5781504, 'steps': 11291, 'batch_loss/train': 0.7409002911299467}
12/24/2021 11:13:01 - INFO - codeparrot_training - Step 11292: {'lr': 0.00044105664408566366, 'samples': 5782016, 'steps': 11292, 'batch_loss/train': 0.769083091057837}
12/24/2021 11:13:15 - INFO - codeparrot_training - Step 11293: {'lr': 0.00044104647730621805, 'samples': 5782528, 'steps': 11293, 'batch_loss/train': 0.8350447665434331}
12/24/2021 11:13:25 - INFO - codeparrot_training - Step 11294: {'lr': 0.00044103630976724377, 'samples': 5783040, 'steps': 11294, 'batch_loss/train': 0.7894177651032805}
12/24/2021 11:13:36 - INFO - codeparrot_training - Step 11295: {'lr': 0.00044102614146878126, 'samples': 5783552, 'steps': 11295, 'batch_loss/train': 0.857789465226233}
12/24/2021 11:13:48 - INFO - codeparrot_training - Step 11296: {'lr': 0.000441015972410871, 'samples': 5784064, 'steps': 11296, 'batch_loss/train': 0.7403782471083105}
12/24/2021 11:13:59 - INFO - codeparrot_training - Step 11297: {'lr': 0.00044100580259355326, 'samples': 5784576, 'steps': 11297, 'batch_loss/train': 0.8024888252839446}
12/24/2021 11:14:09 - INFO - codeparrot_training - Step 11298: {'lr': 0.00044099563201686865, 'samples': 5785088, 'steps': 11298, 'batch_loss/train': 0.8984541399404407}
12/24/2021 11:14:20 - INFO - codeparrot_training - Step 11299: {'lr': 0.0004409854606808575, 'samples': 5785600, 'steps': 11299, 'batch_loss/train': 0.7565794140100479}
12/24/2021 11:14:32 - INFO - codeparrot_training - Step 11300: {'lr': 0.0004409752885855603, 'samples': 5786112, 'steps': 11300, 'batch_loss/train': 1.563860734924674}
12/24/2021 11:14:43 - INFO - codeparrot_training - Step 11301: {'lr': 0.0004409651157310174, 'samples': 5786624,
'steps': 11301, 'batch_loss/train': 0.8662031721323729} 12/24/2021 11:14:53 - INFO - codeparrot_training - Step 11302: {'lr': 0.00044095494211726936, 'samples': 5787136, 'steps': 11302, 'batch_loss/train': 1.1425714856013656} 12/24/2021 11:15:07 - INFO - codeparrot_training - Step 11303: {'lr': 0.00044094476774435656, 'samples': 5787648, 'steps': 11303, 'batch_loss/train': 0.7828534515574574} 12/24/2021 11:15:18 - INFO - codeparrot_training - Step 11304: {'lr': 0.00044093459261231944, 'samples': 5788160, 'steps': 11304, 'batch_loss/train': 0.841111671179533} 12/24/2021 11:15:28 - INFO - codeparrot_training - Step 11305: {'lr': 0.00044092441672119844, 'samples': 5788672, 'steps': 11305, 'batch_loss/train': 0.8769703167490661} 12/24/2021 11:15:40 - INFO - codeparrot_training - Step 11306: {'lr': 0.00044091424007103417, 'samples': 5789184, 'steps': 11306, 'batch_loss/train': 0.8077582363039255} 12/24/2021 11:15:51 - INFO - codeparrot_training - Step 11307: {'lr': 0.0004409040626618669, 'samples': 5789696, 'steps': 11307, 'batch_loss/train': 0.9099235748872161} 12/24/2021 11:16:02 - INFO - codeparrot_training - Step 11308: {'lr': 0.00044089388449373713, 'samples': 5790208, 'steps': 11308, 'batch_loss/train': 0.8829286559484899} 12/24/2021 11:16:12 - INFO - codeparrot_training - Step 11309: {'lr': 0.00044088370556668536, 'samples': 5790720, 'steps': 11309, 'batch_loss/train': 0.7316450150683522} 12/24/2021 11:16:24 - INFO - codeparrot_training - Step 11310: {'lr': 0.000440873525880752, 'samples': 5791232, 'steps': 11310, 'batch_loss/train': 0.8402590835466981} 12/24/2021 11:16:35 - INFO - codeparrot_training - Step 11311: {'lr': 0.0004408633454359777, 'samples': 5791744, 'steps': 11311, 'batch_loss/train': 0.8649861598387361} 12/24/2021 11:16:45 - INFO - codeparrot_training - Step 11312: {'lr': 0.00044085316423240274, 'samples': 5792256, 'steps': 11312, 'batch_loss/train': 0.8774905642494559} 12/24/2021 11:16:57 - INFO - codeparrot_training - Step 11313: {'lr': 
0.0004408429822700677, 'samples': 5792768, 'steps': 11313, 'batch_loss/train': 0.7774195987731218} 12/24/2021 11:17:08 - INFO - codeparrot_training - Step 11314: {'lr': 0.00044083279954901293, 'samples': 5793280, 'steps': 11314, 'batch_loss/train': 0.8064559018239379} 12/24/2021 11:17:19 - INFO - codeparrot_training - Step 11315: {'lr': 0.0004408226160692791, 'samples': 5793792, 'steps': 11315, 'batch_loss/train': 0.8414366524666548} 12/24/2021 11:17:32 - INFO - codeparrot_training - Step 11316: {'lr': 0.0004408124318309066, 'samples': 5794304, 'steps': 11316, 'batch_loss/train': 0.7989563271403313} 12/24/2021 11:17:43 - INFO - codeparrot_training - Step 11317: {'lr': 0.00044080224683393584, 'samples': 5794816, 'steps': 11317, 'batch_loss/train': 0.821447467431426} 12/24/2021 11:17:54 - INFO - codeparrot_training - Step 11318: {'lr': 0.00044079206107840743, 'samples': 5795328, 'steps': 11318, 'batch_loss/train': 0.9467098726890981} 12/24/2021 11:18:04 - INFO - codeparrot_training - Step 11319: {'lr': 0.0004407818745643618, 'samples': 5795840, 'steps': 11319, 'batch_loss/train': 0.8051995667628944} 12/24/2021 11:18:16 - INFO - codeparrot_training - Step 11320: {'lr': 0.00044077168729183956, 'samples': 5796352, 'steps': 11320, 'batch_loss/train': 0.8884082548320293} 12/24/2021 11:18:27 - INFO - codeparrot_training - Step 11321: {'lr': 0.000440761499260881, 'samples': 5796864, 'steps': 11321, 'batch_loss/train': 0.889969656476751} 12/24/2021 11:18:37 - INFO - codeparrot_training - Step 11322: {'lr': 0.0004407513104715268, 'samples': 5797376, 'steps': 11322, 'batch_loss/train': 0.8695155098102987} 12/24/2021 11:18:51 - INFO - codeparrot_training - Step 11323: {'lr': 0.0004407411209238175, 'samples': 5797888, 'steps': 11323, 'batch_loss/train': 0.8021983145736158} 12/24/2021 11:19:02 - INFO - codeparrot_training - Step 11324: {'lr': 0.0004407309306177934, 'samples': 5798400, 'steps': 11324, 'batch_loss/train': 0.9528757045045495} 12/24/2021 11:19:12 - INFO - 
codeparrot_training - Step 11325: {'lr': 0.00044072073955349523, 'samples': 5798912, 'steps': 11325, 'batch_loss/train': 0.7963427407667041} 12/24/2021 11:19:24 - INFO - codeparrot_training - Step 11326: {'lr': 0.0004407105477309633, 'samples': 5799424, 'steps': 11326, 'batch_loss/train': 0.8637789972126484} 12/24/2021 11:19:35 - INFO - codeparrot_training - Step 11327: {'lr': 0.0004407003551502383, 'samples': 5799936, 'steps': 11327, 'batch_loss/train': 0.8131297137588263} 12/24/2021 11:19:46 - INFO - codeparrot_training - Step 11328: {'lr': 0.0004406901618113608, 'samples': 5800448, 'steps': 11328, 'batch_loss/train': 0.7743616243824363} 12/24/2021 11:19:58 - INFO - codeparrot_training - Step 11329: {'lr': 0.0004406799677143711, 'samples': 5800960, 'steps': 11329, 'batch_loss/train': 0.730196988908574} 12/24/2021 11:20:08 - INFO - codeparrot_training - Step 11330: {'lr': 0.0004406697728593099, 'samples': 5801472, 'steps': 11330, 'batch_loss/train': 0.8447617157362401} 12/24/2021 11:20:19 - INFO - codeparrot_training - Step 11331: {'lr': 0.00044065957724621766, 'samples': 5801984, 'steps': 11331, 'batch_loss/train': 0.6813807997386903} 12/24/2021 11:20:29 - INFO - codeparrot_training - Step 11332: {'lr': 0.00044064938087513486, 'samples': 5802496, 'steps': 11332, 'batch_loss/train': 1.0104165854863822} 12/24/2021 11:20:43 - INFO - codeparrot_training - Step 11333: {'lr': 0.0004406391837461022, 'samples': 5803008, 'steps': 11333, 'batch_loss/train': 0.9553629858419299} 12/24/2021 11:20:54 - INFO - codeparrot_training - Step 11334: {'lr': 0.0004406289858591601, 'samples': 5803520, 'steps': 11334, 'batch_loss/train': 1.595450556371361} 12/24/2021 11:21:05 - INFO - codeparrot_training - Step 11335: {'lr': 0.0004406187872143491, 'samples': 5804032, 'steps': 11335, 'batch_loss/train': 0.8181416180450469} 12/24/2021 11:21:16 - INFO - codeparrot_training - Step 11336: {'lr': 0.0004406085878117098, 'samples': 5804544, 'steps': 11336, 'batch_loss/train': 1.5573893603868783} 
12/24/2021 11:21:27 - INFO - codeparrot_training - Step 11337: {'lr': 0.00044059838765128276, 'samples': 5805056, 'steps': 11337, 'batch_loss/train': 0.7691467702388763} 12/24/2021 11:21:38 - INFO - codeparrot_training - Step 11338: {'lr': 0.0004405881867331084, 'samples': 5805568, 'steps': 11338, 'batch_loss/train': 0.8373306100256741} 12/24/2021 11:21:51 - INFO - codeparrot_training - Step 11339: {'lr': 0.00044057798505722746, 'samples': 5806080, 'steps': 11339, 'batch_loss/train': 0.8558120606467128} 12/24/2021 11:22:02 - INFO - codeparrot_training - Step 11340: {'lr': 0.00044056778262368043, 'samples': 5806592, 'steps': 11340, 'batch_loss/train': 0.8162625916302204} 12/24/2021 11:22:13 - INFO - codeparrot_training - Step 11341: {'lr': 0.0004405575794325078, 'samples': 5807104, 'steps': 11341, 'batch_loss/train': 0.9721087142825127} 12/24/2021 11:22:23 - INFO - codeparrot_training - Step 11342: {'lr': 0.0004405473754837502, 'samples': 5807616, 'steps': 11342, 'batch_loss/train': 0.7686095493845642} 12/24/2021 11:22:35 - INFO - codeparrot_training - Step 11343: {'lr': 0.00044053717077744813, 'samples': 5808128, 'steps': 11343, 'batch_loss/train': 0.8152584312483668} 12/24/2021 11:22:46 - INFO - codeparrot_training - Step 11344: {'lr': 0.00044052696531364234, 'samples': 5808640, 'steps': 11344, 'batch_loss/train': 0.8345370595343411} 12/24/2021 11:22:56 - INFO - codeparrot_training - Step 11345: {'lr': 0.0004405167590923732, 'samples': 5809152, 'steps': 11345, 'batch_loss/train': 0.7945557525381446} 12/24/2021 11:23:09 - INFO - codeparrot_training - Step 11346: {'lr': 0.00044050655211368134, 'samples': 5809664, 'steps': 11346, 'batch_loss/train': 0.7587061413796619} 12/24/2021 11:23:19 - INFO - codeparrot_training - Step 11347: {'lr': 0.0004404963443776074, 'samples': 5810176, 'steps': 11347, 'batch_loss/train': 0.7599444701336324} 12/24/2021 11:23:30 - INFO - codeparrot_training - Step 11348: {'lr': 0.0004404861358841919, 'samples': 5810688, 'steps': 11348, 
'batch_loss/train': 0.6848871044348925} 12/24/2021 11:23:42 - INFO - codeparrot_training - Step 11349: {'lr': 0.0004404759266334755, 'samples': 5811200, 'steps': 11349, 'batch_loss/train': 0.6658417070284486} 12/24/2021 11:23:53 - INFO - codeparrot_training - Step 11350: {'lr': 0.0004404657166254986, 'samples': 5811712, 'steps': 11350, 'batch_loss/train': 0.7494333449285477} 12/24/2021 11:24:03 - INFO - codeparrot_training - Step 11351: {'lr': 0.000440455505860302, 'samples': 5812224, 'steps': 11351, 'batch_loss/train': 0.8018009262159467} 12/24/2021 11:24:14 - INFO - codeparrot_training - Step 11352: {'lr': 0.0004404452943379263, 'samples': 5812736, 'steps': 11352, 'batch_loss/train': 0.8997147381305695} 12/24/2021 11:24:28 - INFO - codeparrot_training - Step 11353: {'lr': 0.00044043508205841197, 'samples': 5813248, 'steps': 11353, 'batch_loss/train': 0.8719843979924917} 12/24/2021 11:24:39 - INFO - codeparrot_training - Step 11354: {'lr': 0.0004404248690217997, 'samples': 5813760, 'steps': 11354, 'batch_loss/train': 1.0419126022607088} 12/24/2021 11:24:49 - INFO - codeparrot_training - Step 11355: {'lr': 0.00044041465522813, 'samples': 5814272, 'steps': 11355, 'batch_loss/train': 0.9462229898199439} 12/24/2021 11:25:02 - INFO - codeparrot_training - Step 11356: {'lr': 0.00044040444067744346, 'samples': 5814784, 'steps': 11356, 'batch_loss/train': 0.870270598679781} 12/24/2021 11:25:13 - INFO - codeparrot_training - Step 11357: {'lr': 0.0004403942253697809, 'samples': 5815296, 'steps': 11357, 'batch_loss/train': 1.1140793059021235} 12/24/2021 11:25:23 - INFO - codeparrot_training - Step 11358: {'lr': 0.0004403840093051827, 'samples': 5815808, 'steps': 11358, 'batch_loss/train': 0.9009773063007742} 12/24/2021 11:25:34 - INFO - codeparrot_training - Step 11359: {'lr': 0.0004403737924836896, 'samples': 5816320, 'steps': 11359, 'batch_loss/train': 0.7704703537747264} 12/24/2021 11:25:46 - INFO - codeparrot_training - Step 11360: {'lr': 0.00044036357490534207, 
'samples': 5816832, 'steps': 11360, 'batch_loss/train': 0.914567454252392} 12/24/2021 11:25:56 - INFO - codeparrot_training - Step 11361: {'lr': 0.000440353356570181, 'samples': 5817344, 'steps': 11361, 'batch_loss/train': 0.7476448663510382} 12/24/2021 11:26:07 - INFO - codeparrot_training - Step 11362: {'lr': 0.00044034313747824677, 'samples': 5817856, 'steps': 11362, 'batch_loss/train': 0.8104160367511213} 12/24/2021 11:26:21 - INFO - codeparrot_training - Step 11363: {'lr': 0.00044033291762958016, 'samples': 5818368, 'steps': 11363, 'batch_loss/train': 0.8399882060475647} 12/24/2021 11:26:31 - INFO - codeparrot_training - Step 11364: {'lr': 0.0004403226970242217, 'samples': 5818880, 'steps': 11364, 'batch_loss/train': 0.7888123490847647} 12/24/2021 11:26:42 - INFO - codeparrot_training - Step 11365: {'lr': 0.00044031247566221203, 'samples': 5819392, 'steps': 11365, 'batch_loss/train': 0.9147117715328932} 12/24/2021 11:26:54 - INFO - codeparrot_training - Step 11366: {'lr': 0.0004403022535435919, 'samples': 5819904, 'steps': 11366, 'batch_loss/train': 0.676472885068506} 12/24/2021 11:27:05 - INFO - codeparrot_training - Step 11367: {'lr': 0.00044029203066840185, 'samples': 5820416, 'steps': 11367, 'batch_loss/train': 0.7275005377596244} 12/24/2021 11:27:15 - INFO - codeparrot_training - Step 11368: {'lr': 0.00044028180703668243, 'samples': 5820928, 'steps': 11368, 'batch_loss/train': 0.8755998881533742} 12/24/2021 11:27:26 - INFO - codeparrot_training - Step 11369: {'lr': 0.0004402715826484745, 'samples': 5821440, 'steps': 11369, 'batch_loss/train': 0.9250883394852281} 12/24/2021 11:27:40 - INFO - codeparrot_training - Step 11370: {'lr': 0.00044026135750381855, 'samples': 5821952, 'steps': 11370, 'batch_loss/train': 0.6970645980909467} 12/24/2021 11:27:50 - INFO - codeparrot_training - Step 11371: {'lr': 0.00044025113160275534, 'samples': 5822464, 'steps': 11371, 'batch_loss/train': 0.8350741773610935} 12/24/2021 11:28:01 - INFO - codeparrot_training - Step 
11372: {'lr': 0.00044024090494532543, 'samples': 5822976, 'steps': 11372, 'batch_loss/train': 0.8687480096705258} 12/24/2021 11:28:13 - INFO - codeparrot_training - Step 11373: {'lr': 0.00044023067753156943, 'samples': 5823488, 'steps': 11373, 'batch_loss/train': 0.8435819299193099} 12/24/2021 11:28:24 - INFO - codeparrot_training - Step 11374: {'lr': 0.0004402204493615282, 'samples': 5824000, 'steps': 11374, 'batch_loss/train': 0.8296959912404418} 12/24/2021 11:28:34 - INFO - codeparrot_training - Step 11375: {'lr': 0.0004402102204352423, 'samples': 5824512, 'steps': 11375, 'batch_loss/train': 0.806734180659987} 12/24/2021 11:28:46 - INFO - codeparrot_training - Step 11376: {'lr': 0.0004401999907527523, 'samples': 5825024, 'steps': 11376, 'batch_loss/train': 0.8019633768126369} 12/24/2021 11:28:57 - INFO - codeparrot_training - Step 11377: {'lr': 0.00044018976031409896, 'samples': 5825536, 'steps': 11377, 'batch_loss/train': 0.7315721653867513} 12/24/2021 11:29:07 - INFO - codeparrot_training - Step 11378: {'lr': 0.000440179529119323, 'samples': 5826048, 'steps': 11378, 'batch_loss/train': 0.7762039890512824} 12/24/2021 11:29:18 - INFO - codeparrot_training - Step 11379: {'lr': 0.00044016929716846504, 'samples': 5826560, 'steps': 11379, 'batch_loss/train': 0.8606168311089277} 12/24/2021 11:29:32 - INFO - codeparrot_training - Step 11380: {'lr': 0.0004401590644615657, 'samples': 5827072, 'steps': 11380, 'batch_loss/train': 0.986626404337585} 12/24/2021 11:29:42 - INFO - codeparrot_training - Step 11381: {'lr': 0.0004401488309986658, 'samples': 5827584, 'steps': 11381, 'batch_loss/train': 1.0284949839115143} 12/24/2021 11:29:53 - INFO - codeparrot_training - Step 11382: {'lr': 0.0004401385967798059, 'samples': 5828096, 'steps': 11382, 'batch_loss/train': 0.8954215245321393} 12/24/2021 11:30:05 - INFO - codeparrot_training - Step 11383: {'lr': 0.00044012836180502666, 'samples': 5828608, 'steps': 11383, 'batch_loss/train': 0.7528470666147768} 12/24/2021 11:30:16 - 
INFO - codeparrot_training - Step 11384: {'lr': 0.00044011812607436897, 'samples': 5829120, 'steps': 11384, 'batch_loss/train': 0.8458589781075716} 12/24/2021 11:30:26 - INFO - codeparrot_training - Step 11385: {'lr': 0.00044010788958787333, 'samples': 5829632, 'steps': 11385, 'batch_loss/train': 0.9255357421934605} 12/24/2021 11:30:40 - INFO - codeparrot_training - Step 11386: {'lr': 0.0004400976523455805, 'samples': 5830144, 'steps': 11386, 'batch_loss/train': 0.8469593040645123} 12/24/2021 11:30:51 - INFO - codeparrot_training - Step 11387: {'lr': 0.00044008741434753125, 'samples': 5830656, 'steps': 11387, 'batch_loss/train': 0.7618680596351624} 12/24/2021 11:31:01 - INFO - codeparrot_training - Step 11388: {'lr': 0.0004400771755937661, 'samples': 5831168, 'steps': 11388, 'batch_loss/train': 0.9711963944137096} 12/24/2021 11:31:12 - INFO - codeparrot_training - Step 11389: {'lr': 0.000440066936084326, 'samples': 5831680, 'steps': 11389, 'batch_loss/train': 0.7642939984798431} 12/24/2021 11:31:24 - INFO - codeparrot_training - Step 11390: {'lr': 0.0004400566958192515, 'samples': 5832192, 'steps': 11390, 'batch_loss/train': 0.8314000959508121} 12/24/2021 11:31:35 - INFO - codeparrot_training - Step 11391: {'lr': 0.00044004645479858327, 'samples': 5832704, 'steps': 11391, 'batch_loss/train': 0.5979772233404219} 12/24/2021 11:31:45 - INFO - codeparrot_training - Step 11392: {'lr': 0.0004400362130223621, 'samples': 5833216, 'steps': 11392, 'batch_loss/train': 1.422635587863624} 12/24/2021 11:31:58 - INFO - codeparrot_training - Step 11393: {'lr': 0.00044002597049062875, 'samples': 5833728, 'steps': 11393, 'batch_loss/train': 1.0367473978549242} 12/24/2021 11:32:08 - INFO - codeparrot_training - Step 11394: {'lr': 0.00044001572720342387, 'samples': 5834240, 'steps': 11394, 'batch_loss/train': 3.043247641995549} 12/24/2021 11:32:19 - INFO - codeparrot_training - Step 11395: {'lr': 0.0004400054831607883, 'samples': 5834752, 'steps': 11395, 'batch_loss/train': 
0.9062797138467431} 12/24/2021 11:32:33 - INFO - codeparrot_training - Step 11396: {'lr': 0.00043999523836276246, 'samples': 5835264, 'steps': 11396, 'batch_loss/train': 0.9052890376187861} 12/24/2021 11:32:43 - INFO - codeparrot_training - Step 11397: {'lr': 0.0004399849928093874, 'samples': 5835776, 'steps': 11397, 'batch_loss/train': 0.8514207215048373} 12/24/2021 11:32:54 - INFO - codeparrot_training - Step 11398: {'lr': 0.0004399747465007038, 'samples': 5836288, 'steps': 11398, 'batch_loss/train': 0.896930436603725} 12/24/2021 11:33:05 - INFO - codeparrot_training - Step 11399: {'lr': 0.0004399644994367523, 'samples': 5836800, 'steps': 11399, 'batch_loss/train': 1.8249150458723307} 12/24/2021 11:33:17 - INFO - codeparrot_training - Step 11400: {'lr': 0.00043995425161757367, 'samples': 5837312, 'steps': 11400, 'batch_loss/train': 0.8876331504434347} 12/24/2021 11:33:27 - INFO - codeparrot_training - Step 11401: {'lr': 0.0004399440030432087, 'samples': 5837824, 'steps': 11401, 'batch_loss/train': 0.9029317423701286} 12/24/2021 11:33:38 - INFO - codeparrot_training - Step 11402: {'lr': 0.00043993375371369805, 'samples': 5838336, 'steps': 11402, 'batch_loss/train': 0.8800773937255144} 12/24/2021 11:33:50 - INFO - codeparrot_training - Step 11403: {'lr': 0.0004399235036290825, 'samples': 5838848, 'steps': 11403, 'batch_loss/train': 0.9051732774823904} 12/24/2021 11:34:01 - INFO - codeparrot_training - Step 11404: {'lr': 0.00043991325278940286, 'samples': 5839360, 'steps': 11404, 'batch_loss/train': 0.8953386666253209} 12/24/2021 11:34:11 - INFO - codeparrot_training - Step 11405: {'lr': 0.00043990300119469984, 'samples': 5839872, 'steps': 11405, 'batch_loss/train': 0.7678148923441768} 12/24/2021 11:34:23 - INFO - codeparrot_training - Step 11406: {'lr': 0.0004398927488450141, 'samples': 5840384, 'steps': 11406, 'batch_loss/train': 0.9020877620205283} 12/24/2021 11:34:34 - INFO - codeparrot_training - Step 11407: {'lr': 0.0004398824957403865, 'samples': 5840896, 
'steps': 11407, 'batch_loss/train': 0.9354302939027548} 12/24/2021 11:34:44 - INFO - codeparrot_training - Step 11408: {'lr': 0.00043987224188085796, 'samples': 5841408, 'steps': 11408, 'batch_loss/train': 0.9749234486371279} 12/24/2021 11:34:55 - INFO - codeparrot_training - Step 11409: {'lr': 0.0004398619872664689, 'samples': 5841920, 'steps': 11409, 'batch_loss/train': 0.7265792179387063} 12/24/2021 11:35:09 - INFO - codeparrot_training - Step 11410: {'lr': 0.00043985173189726035, 'samples': 5842432, 'steps': 11410, 'batch_loss/train': 0.7114562108181417} 12/24/2021 11:35:19 - INFO - codeparrot_training - Step 11411: {'lr': 0.00043984147577327294, 'samples': 5842944, 'steps': 11411, 'batch_loss/train': 0.836038043256849} 12/24/2021 11:35:30 - INFO - codeparrot_training - Step 11412: {'lr': 0.00043983121889454757, 'samples': 5843456, 'steps': 11412, 'batch_loss/train': 0.754476091475226} 12/24/2021 11:35:42 - INFO - codeparrot_training - Step 11413: {'lr': 0.0004398209612611249, 'samples': 5843968, 'steps': 11413, 'batch_loss/train': 0.7112555443309247} 12/24/2021 11:35:53 - INFO - codeparrot_training - Step 11414: {'lr': 0.0004398107028730458, 'samples': 5844480, 'steps': 11414, 'batch_loss/train': 0.8259914740920067} 12/24/2021 11:36:03 - INFO - codeparrot_training - Step 11415: {'lr': 0.000439800443730351, 'samples': 5844992, 'steps': 11415, 'batch_loss/train': 0.9193179458379745} 12/24/2021 11:36:18 - INFO - codeparrot_training - Step 11416: {'lr': 0.0004397901838330813, 'samples': 5845504, 'steps': 11416, 'batch_loss/train': 0.8083784524351358} 12/24/2021 11:36:28 - INFO - codeparrot_training - Step 11417: {'lr': 0.0004397799231812775, 'samples': 5846016, 'steps': 11417, 'batch_loss/train': 0.9438673174008727} 12/24/2021 11:36:39 - INFO - codeparrot_training - Step 11418: {'lr': 0.0004397696617749804, 'samples': 5846528, 'steps': 11418, 'batch_loss/train': 0.868253281340003} 12/24/2021 11:36:50 - INFO - codeparrot_training - Step 11419: {'lr': 
0.0004397593996142307, 'samples': 5847040, 'steps': 11419, 'batch_loss/train': 0.8524649776518345} 12/24/2021 11:37:02 - INFO - codeparrot_training - Step 11420: {'lr': 0.00043974913669906935, 'samples': 5847552, 'steps': 11420, 'batch_loss/train': 0.8146538529545069} 12/24/2021 11:37:12 - INFO - codeparrot_training - Step 11421: {'lr': 0.0004397388730295371, 'samples': 5848064, 'steps': 11421, 'batch_loss/train': 0.8308792071184143} 12/24/2021 11:37:23 - INFO - codeparrot_training - Step 11422: {'lr': 0.00043972860860567466, 'samples': 5848576, 'steps': 11422, 'batch_loss/train': 0.8121947199106216} 12/24/2021 11:37:37 - INFO - codeparrot_training - Step 11423: {'lr': 0.000439718343427523, 'samples': 5849088, 'steps': 11423, 'batch_loss/train': 0.7972674653865397} 12/24/2021 11:37:47 - INFO - codeparrot_training - Step 11424: {'lr': 0.00043970807749512276, 'samples': 5849600, 'steps': 11424, 'batch_loss/train': 0.8175267055630684} 12/24/2021 11:37:58 - INFO - codeparrot_training - Step 11425: {'lr': 0.0004396978108085149, 'samples': 5850112, 'steps': 11425, 'batch_loss/train': 0.7880559656769037} 12/24/2021 11:38:09 - INFO - codeparrot_training - Step 11426: {'lr': 0.0004396875433677401, 'samples': 5850624, 'steps': 11426, 'batch_loss/train': 0.8729026317596436} 12/24/2021 11:38:21 - INFO - codeparrot_training - Step 11427: {'lr': 0.0004396772751728393, 'samples': 5851136, 'steps': 11427, 'batch_loss/train': 0.869652895256877} 12/24/2021 11:38:31 - INFO - codeparrot_training - Step 11428: {'lr': 0.0004396670062238532, 'samples': 5851648, 'steps': 11428, 'batch_loss/train': 0.7724233479239047} 12/24/2021 11:38:42 - INFO - codeparrot_training - Step 11429: {'lr': 0.0004396567365208228, 'samples': 5852160, 'steps': 11429, 'batch_loss/train': 0.8729764893651009} 12/24/2021 11:38:54 - INFO - codeparrot_training - Step 11430: {'lr': 0.00043964646606378877, 'samples': 5852672, 'steps': 11430, 'batch_loss/train': 0.8438466107472777} 12/24/2021 11:39:04 - INFO - 
codeparrot_training - Step 11431: {'lr': 0.000439636194852792, 'samples': 5853184, 'steps': 11431, 'batch_loss/train': 0.8399789007380605} 12/24/2021 11:39:15 - INFO - codeparrot_training - Step 11432: {'lr': 0.0004396259228878734, 'samples': 5853696, 'steps': 11432, 'batch_loss/train': 0.9447877584025264} 12/24/2021 11:39:29 - INFO - codeparrot_training - Step 11433: {'lr': 0.00043961565016907366, 'samples': 5854208, 'steps': 11433, 'batch_loss/train': 0.6625346598448232} 12/24/2021 11:39:40 - INFO - codeparrot_training - Step 11434: {'lr': 0.0004396053766964337, 'samples': 5854720, 'steps': 11434, 'batch_loss/train': 0.7615937017835677} 12/24/2021 11:39:50 - INFO - codeparrot_training - Step 11435: {'lr': 0.00043959510246999433, 'samples': 5855232, 'steps': 11435, 'batch_loss/train': 1.223414660897106} 12/24/2021 11:40:01 - INFO - codeparrot_training - Step 11436: {'lr': 0.00043958482748979644, 'samples': 5855744, 'steps': 11436, 'batch_loss/train': 0.5974070176016539} 12/24/2021 11:40:13 - INFO - codeparrot_training - Step 11437: {'lr': 0.0004395745517558809, 'samples': 5856256, 'steps': 11437, 'batch_loss/train': 0.8991320449858904} 12/24/2021 11:40:24 - INFO - codeparrot_training - Step 11438: {'lr': 0.00043956427526828845, 'samples': 5856768, 'steps': 11438, 'batch_loss/train': 0.8072096733376384} 12/24/2021 11:40:34 - INFO - codeparrot_training - Step 11439: {'lr': 0.0004395539980270601, 'samples': 5857280, 'steps': 11439, 'batch_loss/train': 0.8836915697902441} 12/24/2021 11:40:48 - INFO - codeparrot_training - Step 11440: {'lr': 0.0004395437200322365, 'samples': 5857792, 'steps': 11440, 'batch_loss/train': 0.6081516586709768} 12/24/2021 11:40:59 - INFO - codeparrot_training - Step 11441: {'lr': 0.0004395334412838588, 'samples': 5858304, 'steps': 11441, 'batch_loss/train': 0.813672557938844} 12/24/2021 11:41:09 - INFO - codeparrot_training - Step 11442: {'lr': 0.0004395231617819676, 'samples': 5858816, 'steps': 11442, 'batch_loss/train': 0.7588015422224998} 
12/24/2021 11:41:21 - INFO - codeparrot_training - Step 11443: {'lr': 0.0004395128815266039, 'samples': 5859328, 'steps': 11443, 'batch_loss/train': 0.8493780912831426} 12/24/2021 11:41:32 - INFO - codeparrot_training - Step 11444: {'lr': 0.00043950260051780853, 'samples': 5859840, 'steps': 11444, 'batch_loss/train': 0.7773493472486734} 12/24/2021 11:41:43 - INFO - codeparrot_training - Step 11445: {'lr': 0.0004394923187556224, 'samples': 5860352, 'steps': 11445, 'batch_loss/train': 0.8440180830657482} 12/24/2021 11:41:53 - INFO - codeparrot_training - Step 11446: {'lr': 0.00043948203624008634, 'samples': 5860864, 'steps': 11446, 'batch_loss/train': 0.6966573195531964} 12/24/2021 11:42:06 - INFO - codeparrot_training - Step 11447: {'lr': 0.0004394717529712412, 'samples': 5861376, 'steps': 11447, 'batch_loss/train': 0.7507518042693846} 12/24/2021 11:42:16 - INFO - codeparrot_training - Step 11448: {'lr': 0.000439461468949128, 'samples': 5861888, 'steps': 11448, 'batch_loss/train': 0.7032372027169913} 12/24/2021 11:42:27 - INFO - codeparrot_training - Step 11449: {'lr': 0.0004394511841737875, 'samples': 5862400, 'steps': 11449, 'batch_loss/train': 0.7894241362810135} 12/24/2021 11:42:39 - INFO - codeparrot_training - Step 11450: {'lr': 0.0004394408986452606, 'samples': 5862912, 'steps': 11450, 'batch_loss/train': 0.5605557216331363} 12/24/2021 11:42:50 - INFO - codeparrot_training - Step 11451: {'lr': 0.00043943061236358826, 'samples': 5863424, 'steps': 11451, 'batch_loss/train': 0.9105438468977809} 12/24/2021 11:43:01 - INFO - codeparrot_training - Step 11452: {'lr': 0.00043942032532881124, 'samples': 5863936, 'steps': 11452, 'batch_loss/train': 0.773772525601089} 12/24/2021 11:43:11 - INFO - codeparrot_training - Step 11453: {'lr': 0.00043941003754097065, 'samples': 5864448, 'steps': 11453, 'batch_loss/train': 0.6851388737559319} 12/24/2021 11:43:25 - INFO - codeparrot_training - Step 11454: {'lr': 0.0004393997490001072, 'samples': 5864960, 'steps': 11454, 
'batch_loss/train': 0.7945845505455509} 12/24/2021 11:43:36 - INFO - codeparrot_training - Step 11455: {'lr': 0.00043938945970626187, 'samples': 5865472, 'steps': 11455, 'batch_loss/train': 0.8986167018301785} 12/24/2021 11:43:46 - INFO - codeparrot_training - Step 11456: {'lr': 0.0004393791696594756, 'samples': 5865984, 'steps': 11456, 'batch_loss/train': 0.8106352214235812} 12/24/2021 11:43:58 - INFO - codeparrot_training - Step 11457: {'lr': 0.0004393688788597893, 'samples': 5866496, 'steps': 11457, 'batch_loss/train': 0.8564373072003946} 12/24/2021 11:44:09 - INFO - codeparrot_training - Step 11458: {'lr': 0.0004393585873072436, 'samples': 5867008, 'steps': 11458, 'batch_loss/train': 0.7892782450653613} 12/24/2021 11:44:20 - INFO - codeparrot_training - Step 11459: {'lr': 0.0004393482950018799, 'samples': 5867520, 'steps': 11459, 'batch_loss/train': 0.9033614341169596} 12/24/2021 11:44:32 - INFO - codeparrot_training - Step 11460: {'lr': 0.00043933800194373884, 'samples': 5868032, 'steps': 11460, 'batch_loss/train': 0.7115324779879302} 12/24/2021 11:44:42 - INFO - codeparrot_training - Step 11461: {'lr': 0.00043932770813286126, 'samples': 5868544, 'steps': 11461, 'batch_loss/train': 0.7757073221728206} 12/24/2021 11:44:53 - INFO - codeparrot_training - Step 11462: {'lr': 0.00043931741356928835, 'samples': 5869056, 'steps': 11462, 'batch_loss/train': 1.0164434388279915} 12/24/2021 11:45:04 - INFO - codeparrot_training - Step 11463: {'lr': 0.0004393071182530608, 'samples': 5869568, 'steps': 11463, 'batch_loss/train': 0.9067450519651175} 12/24/2021 11:45:17 - INFO - codeparrot_training - Step 11464: {'lr': 0.0004392968221842196, 'samples': 5870080, 'steps': 11464, 'batch_loss/train': 0.7767503741197288} 12/24/2021 11:45:28 - INFO - codeparrot_training - Step 11465: {'lr': 0.00043928652536280577, 'samples': 5870592, 'steps': 11465, 'batch_loss/train': 0.9073716988787055} 12/24/2021 11:45:39 - INFO - codeparrot_training - Step 11466: {'lr': 0.00043927622778886015, 
'samples': 5871104, 'steps': 11466, 'batch_loss/train': 0.7870850772596896} 12/24/2021 11:45:51 - INFO - codeparrot_training - Step 11467: {'lr': 0.00043926592946242374, 'samples': 5871616, 'steps': 11467, 'batch_loss/train': 0.8345678588375449} 12/24/2021 11:46:01 - INFO - codeparrot_training - Step 11468: {'lr': 0.00043925563038353747, 'samples': 5872128, 'steps': 11468, 'batch_loss/train': 0.8357008304446936} 12/24/2021 11:46:12 - INFO - codeparrot_training - Step 11469: {'lr': 0.00043924533055224226, 'samples': 5872640, 'steps': 11469, 'batch_loss/train': 0.8381891846656799} 12/24/2021 11:46:24 - INFO - codeparrot_training - Step 11470: {'lr': 0.000439235029968579, 'samples': 5873152, 'steps': 11470, 'batch_loss/train': 0.8456002669408917} 12/24/2021 11:46:34 - INFO - codeparrot_training - Step 11471: {'lr': 0.00043922472863258883, 'samples': 5873664, 'steps': 11471, 'batch_loss/train': 0.8898353381082416} 12/24/2021 11:46:45 - INFO - codeparrot_training - Step 11472: {'lr': 0.00043921442654431257, 'samples': 5874176, 'steps': 11472, 'batch_loss/train': 0.7647916411515325} 12/24/2021 11:46:59 - INFO - codeparrot_training - Step 11473: {'lr': 0.00043920412370379114, 'samples': 5874688, 'steps': 11473, 'batch_loss/train': 0.8291858937591314} 12/24/2021 11:47:10 - INFO - codeparrot_training - Step 11474: {'lr': 0.0004391938201110656, 'samples': 5875200, 'steps': 11474, 'batch_loss/train': 0.9321765080094337} 12/24/2021 11:47:20 - INFO - codeparrot_training - Step 11475: {'lr': 0.00043918351576617684, 'samples': 5875712, 'steps': 11475, 'batch_loss/train': 0.7186591359786689} 12/24/2021 11:47:31 - INFO - codeparrot_training - Step 11476: {'lr': 0.0004391732106691658, 'samples': 5876224, 'steps': 11476, 'batch_loss/train': 0.8105414400342852} 12/24/2021 11:47:43 - INFO - codeparrot_training - Step 11477: {'lr': 0.00043916290482007357, 'samples': 5876736, 'steps': 11477, 'batch_loss/train': 0.8032513409852982} 12/24/2021 11:47:54 - INFO - codeparrot_training - Step 
11478: {'lr': 0.00043915259821894107, 'samples': 5877248, 'steps': 11478, 'batch_loss/train': 0.5639570143539459} 12/24/2021 11:48:04 - INFO - codeparrot_training - Step 11479: {'lr': 0.00043914229086580925, 'samples': 5877760, 'steps': 11479, 'batch_loss/train': 0.9335974073037505} 12/24/2021 11:48:19 - INFO - codeparrot_training - Step 11480: {'lr': 0.0004391319827607191, 'samples': 5878272, 'steps': 11480, 'batch_loss/train': 0.8201419143006206} 12/24/2021 11:48:29 - INFO - codeparrot_training - Step 11481: {'lr': 0.0004391216739037116, 'samples': 5878784, 'steps': 11481, 'batch_loss/train': 0.8541193380951881} 12/24/2021 11:48:40 - INFO - codeparrot_training - Step 11482: {'lr': 0.00043911136429482765, 'samples': 5879296, 'steps': 11482, 'batch_loss/train': 0.6772475382313132} 12/24/2021 11:48:52 - INFO - codeparrot_training - Step 11483: {'lr': 0.00043910105393410844, 'samples': 5879808, 'steps': 11483, 'batch_loss/train': 0.8057158847805113} 12/24/2021 11:49:02 - INFO - codeparrot_training - Step 11484: {'lr': 0.00043909074282159476, 'samples': 5880320, 'steps': 11484, 'batch_loss/train': 0.8751262435689569} 12/24/2021 11:49:13 - INFO - codeparrot_training - Step 11485: {'lr': 0.00043908043095732777, 'samples': 5880832, 'steps': 11485, 'batch_loss/train': 0.7856243557762355} 12/24/2021 11:49:24 - INFO - codeparrot_training - Step 11486: {'lr': 0.0004390701183413483, 'samples': 5881344, 'steps': 11486, 'batch_loss/train': 0.8698148168623447} 12/24/2021 11:49:36 - INFO - codeparrot_training - Step 11487: {'lr': 0.00043905980497369745, 'samples': 5881856, 'steps': 11487, 'batch_loss/train': 0.7315941755659878} 12/24/2021 11:49:46 - INFO - codeparrot_training - Step 11488: {'lr': 0.00043904949085441625, 'samples': 5882368, 'steps': 11488, 'batch_loss/train': 0.8183371517807245} 12/24/2021 11:49:57 - INFO - codeparrot_training - Step 11489: {'lr': 0.0004390391759835456, 'samples': 5882880, 'steps': 11489, 'batch_loss/train': 0.8444261653348804} 12/24/2021 11:50:09 
- INFO - codeparrot_training - Step 11490: {'lr': 0.0004390288603611265, 'samples': 5883392, 'steps': 11490, 'batch_loss/train': 0.8296479175332934} 12/24/2021 11:50:20 - INFO - codeparrot_training - Step 11491: {'lr': 0.0004390185439872002, 'samples': 5883904, 'steps': 11491, 'batch_loss/train': 0.7818595077842474} 12/24/2021 11:50:31 - INFO - codeparrot_training - Step 11492: {'lr': 0.0004390082268618074, 'samples': 5884416, 'steps': 11492, 'batch_loss/train': 0.51517152285669} 12/24/2021 11:50:44 - INFO - codeparrot_training - Step 11493: {'lr': 0.0004389979089849894, 'samples': 5884928, 'steps': 11493, 'batch_loss/train': 0.7759296726435423} 12/24/2021 11:50:55 - INFO - codeparrot_training - Step 11494: {'lr': 0.00043898759035678695, 'samples': 5885440, 'steps': 11494, 'batch_loss/train': 0.6861410145647824} 12/24/2021 11:51:06 - INFO - codeparrot_training - Step 11495: {'lr': 0.00043897727097724114, 'samples': 5885952, 'steps': 11495, 'batch_loss/train': 0.7754378607496619} 12/24/2021 11:51:18 - INFO - codeparrot_training - Step 11496: {'lr': 0.0004389669508463932, 'samples': 5886464, 'steps': 11496, 'batch_loss/train': 0.7684177961200476} 12/24/2021 11:51:28 - INFO - codeparrot_training - Step 11497: {'lr': 0.000438956629964284, 'samples': 5886976, 'steps': 11497, 'batch_loss/train': 0.8593681121710688} 12/24/2021 11:51:39 - INFO - codeparrot_training - Step 11498: {'lr': 0.0004389463083309545, 'samples': 5887488, 'steps': 11498, 'batch_loss/train': 0.8513759155757725} 12/24/2021 11:51:49 - INFO - codeparrot_training - Step 11499: {'lr': 0.0004389359859464459, 'samples': 5888000, 'steps': 11499, 'batch_loss/train': 0.880959052592516} 12/24/2021 11:52:02 - INFO - codeparrot_training - Step 11500: {'lr': 0.00043892566281079914, 'samples': 5888512, 'steps': 11500, 'batch_loss/train': 0.9226209782063961} 12/24/2021 11:52:13 - INFO - codeparrot_training - Step 11501: {'lr': 0.0004389153389240552, 'samples': 5889024, 'steps': 11501, 'batch_loss/train': 
0.845322267152369} 12/24/2021 11:52:23 - INFO - codeparrot_training - Step 11502: {'lr': 0.0004389050142862553, 'samples': 5889536, 'steps': 11502, 'batch_loss/train': 0.7735745385289192} 12/24/2021 11:52:37 - INFO - codeparrot_training - Step 11503: {'lr': 0.0004388946888974404, 'samples': 5890048, 'steps': 11503, 'batch_loss/train': 1.7818388808518648} 12/24/2021 11:52:48 - INFO - codeparrot_training - Step 11504: {'lr': 0.00043888436275765145, 'samples': 5890560, 'steps': 11504, 'batch_loss/train': 0.8964297133497894} 12/24/2021 11:52:58 - INFO - codeparrot_training - Step 11505: {'lr': 0.0004388740358669297, 'samples': 5891072, 'steps': 11505, 'batch_loss/train': 0.8431287263520062} 12/24/2021 11:53:09 - INFO - codeparrot_training - Step 11506: {'lr': 0.00043886370822531607, 'samples': 5891584, 'steps': 11506, 'batch_loss/train': 0.8163112783804536} 12/24/2021 11:53:21 - INFO - codeparrot_training - Step 11507: {'lr': 0.00043885337983285154, 'samples': 5892096, 'steps': 11507, 'batch_loss/train': 0.8088490567170084} 12/24/2021 11:53:32 - INFO - codeparrot_training - Step 11508: {'lr': 0.00043884305068957744, 'samples': 5892608, 'steps': 11508, 'batch_loss/train': 0.7794548005331308} 12/24/2021 11:53:42 - INFO - codeparrot_training - Step 11509: {'lr': 0.0004388327207955346, 'samples': 5893120, 'steps': 11509, 'batch_loss/train': 0.8476337715983391} 12/24/2021 11:53:56 - INFO - codeparrot_training - Step 11510: {'lr': 0.00043882239015076413, 'samples': 5893632, 'steps': 11510, 'batch_loss/train': 0.9530813442543149} 12/24/2021 11:54:07 - INFO - codeparrot_training - Step 11511: {'lr': 0.00043881205875530716, 'samples': 5894144, 'steps': 11511, 'batch_loss/train': 0.8622795073315501} 12/24/2021 11:54:18 - INFO - codeparrot_training - Step 11512: {'lr': 0.00043880172660920477, 'samples': 5894656, 'steps': 11512, 'batch_loss/train': 0.8452543360181153} 12/24/2021 11:54:30 - INFO - codeparrot_training - Step 11513: {'lr': 0.000438791393712498, 'samples': 5895168, 
'steps': 11513, 'batch_loss/train': 0.8661960893077776} 12/24/2021 11:54:40 - INFO - codeparrot_training - Step 11514: {'lr': 0.00043878106006522787, 'samples': 5895680, 'steps': 11514, 'batch_loss/train': 0.811108241789043} 12/24/2021 11:54:51 - INFO - codeparrot_training - Step 11515: {'lr': 0.0004387707256674356, 'samples': 5896192, 'steps': 11515, 'batch_loss/train': 0.8660155571997166} 12/24/2021 11:55:01 - INFO - codeparrot_training - Step 11516: {'lr': 0.00043876039051916215, 'samples': 5896704, 'steps': 11516, 'batch_loss/train': 0.743276767898351} 12/24/2021 11:55:13 - INFO - codeparrot_training - Step 11517: {'lr': 0.0004387500546204487, 'samples': 5897216, 'steps': 11517, 'batch_loss/train': 0.8721374105662107} 12/24/2021 11:55:24 - INFO - codeparrot_training - Step 11518: {'lr': 0.00043873971797133625, 'samples': 5897728, 'steps': 11518, 'batch_loss/train': 0.8066286323592067} 12/24/2021 11:55:35 - INFO - codeparrot_training - Step 11519: {'lr': 0.00043872938057186597, 'samples': 5898240, 'steps': 11519, 'batch_loss/train': 0.7491935351863503} 12/24/2021 11:55:48 - INFO - codeparrot_training - Step 11520: {'lr': 0.00043871904242207894, 'samples': 5898752, 'steps': 11520, 'batch_loss/train': 0.8140249107964337} 12/24/2021 11:55:59 - INFO - codeparrot_training - Step 11521: {'lr': 0.0004387087035220163, 'samples': 5899264, 'steps': 11521, 'batch_loss/train': 0.866629206109792} 12/24/2021 11:56:10 - INFO - codeparrot_training - Step 11522: {'lr': 0.000438698363871719, 'samples': 5899776, 'steps': 11522, 'batch_loss/train': 0.7747926791198552} 12/24/2021 11:56:21 - INFO - codeparrot_training - Step 11523: {'lr': 0.0004386880234712284, 'samples': 5900288, 'steps': 11523, 'batch_loss/train': 0.8436418352648616} 12/24/2021 11:56:32 - INFO - codeparrot_training - Step 11524: {'lr': 0.00043867768232058534, 'samples': 5900800, 'steps': 11524, 'batch_loss/train': 0.8251732541248202} 12/24/2021 11:56:43 - INFO - codeparrot_training - Step 11525: {'lr': 
0.0004386673404198311, 'samples': 5901312, 'steps': 11525, 'batch_loss/train': 0.7491602348163724} 12/24/2021 11:56:53 - INFO - codeparrot_training - Step 11526: {'lr': 0.00043865699776900674, 'samples': 5901824, 'steps': 11526, 'batch_loss/train': 0.7937993472442031} 12/24/2021 11:57:06 - INFO - codeparrot_training - Step 11527: {'lr': 0.0004386466543681534, 'samples': 5902336, 'steps': 11527, 'batch_loss/train': 0.8175212820060551} 12/24/2021 11:57:16 - INFO - codeparrot_training - Step 11528: {'lr': 0.0004386363102173122, 'samples': 5902848, 'steps': 11528, 'batch_loss/train': 0.7377858082763851} 12/24/2021 11:57:27 - INFO - codeparrot_training - Step 11529: {'lr': 0.0004386259653165243, 'samples': 5903360, 'steps': 11529, 'batch_loss/train': 0.6438781097531319} 12/24/2021 11:57:39 - INFO - codeparrot_training - Step 11530: {'lr': 0.00043861561966583075, 'samples': 5903872, 'steps': 11530, 'batch_loss/train': 0.7557124673621729} 12/24/2021 11:57:49 - INFO - codeparrot_training - Step 11531: {'lr': 0.00043860527326527267, 'samples': 5904384, 'steps': 11531, 'batch_loss/train': 0.7579001812264323} 12/24/2021 11:58:00 - INFO - codeparrot_training - Step 11532: {'lr': 0.00043859492611489126, 'samples': 5904896, 'steps': 11532, 'batch_loss/train': 0.7573266001418233} 12/24/2021 11:58:14 - INFO - codeparrot_training - Step 11533: {'lr': 0.0004385845782147276, 'samples': 5905408, 'steps': 11533, 'batch_loss/train': 0.8767263961490244} 12/24/2021 11:58:24 - INFO - codeparrot_training - Step 11534: {'lr': 0.00043857422956482294, 'samples': 5905920, 'steps': 11534, 'batch_loss/train': 0.9078770624473691} 12/24/2021 11:58:35 - INFO - codeparrot_training - Step 11535: {'lr': 0.0004385638801652183, 'samples': 5906432, 'steps': 11535, 'batch_loss/train': 0.685559471603483} 12/24/2021 11:58:45 - INFO - codeparrot_training - Step 11536: {'lr': 0.0004385535300159549, 'samples': 5906944, 'steps': 11536, 'batch_loss/train': 0.6620963714085519} 12/24/2021 11:58:58 - INFO - 
codeparrot_training - Step 11537: {'lr': 0.0004385431791170739, 'samples': 5907456, 'steps': 11537, 'batch_loss/train': 1.0220761625096202} 12/24/2021 11:59:08 - INFO - codeparrot_training - Step 11538: {'lr': 0.00043853282746861635, 'samples': 5907968, 'steps': 11538, 'batch_loss/train': 0.8154927038121969} 12/24/2021 11:59:19 - INFO - codeparrot_training - Step 11539: {'lr': 0.00043852247507062347, 'samples': 5908480, 'steps': 11539, 'batch_loss/train': 0.9037615228444338} 12/24/2021 11:59:33 - INFO - codeparrot_training - Step 11540: {'lr': 0.00043851212192313636, 'samples': 5908992, 'steps': 11540, 'batch_loss/train': 0.7746686106547713} 12/24/2021 11:59:43 - INFO - codeparrot_training - Step 11541: {'lr': 0.00043850176802619635, 'samples': 5909504, 'steps': 11541, 'batch_loss/train': 0.7585024992004037} 12/24/2021 11:59:54 - INFO - codeparrot_training - Step 11542: {'lr': 0.00043849141337984446, 'samples': 5910016, 'steps': 11542, 'batch_loss/train': 0.8458942454308271} 12/24/2021 12:00:06 - INFO - codeparrot_training - Step 11543: {'lr': 0.0004384810579841218, 'samples': 5910528, 'steps': 11543, 'batch_loss/train': 1.0000664331018925} 12/24/2021 12:00:17 - INFO - codeparrot_training - Step 11544: {'lr': 0.00043847070183906966, 'samples': 5911040, 'steps': 11544, 'batch_loss/train': 0.8557570651173592} 12/24/2021 12:00:27 - INFO - codeparrot_training - Step 11545: {'lr': 0.0004384603449447292, 'samples': 5911552, 'steps': 11545, 'batch_loss/train': 0.7446236581308767} 12/24/2021 12:00:38 - INFO - codeparrot_training - Step 11546: {'lr': 0.0004384499873011415, 'samples': 5912064, 'steps': 11546, 'batch_loss/train': 0.8709399895742536} 12/24/2021 12:00:50 - INFO - codeparrot_training - Step 11547: {'lr': 0.00043843962890834787, 'samples': 5912576, 'steps': 11547, 'batch_loss/train': 0.8185308873653412} 12/24/2021 12:01:01 - INFO - codeparrot_training - Step 11548: {'lr': 0.0004384292697663894, 'samples': 5913088, 'steps': 11548, 'batch_loss/train': 
0.6751411753939465} 12/24/2021 12:01:11 - INFO - codeparrot_training - Step 11549: {'lr': 0.0004384189098753073, 'samples': 5913600, 'steps': 11549, 'batch_loss/train': 0.5978180398233235} 12/24/2021 12:01:25 - INFO - codeparrot_training - Step 11550: {'lr': 0.0004384085492351428, 'samples': 5914112, 'steps': 11550, 'batch_loss/train': 0.8774652816355228} 12/24/2021 12:01:36 - INFO - codeparrot_training - Step 11551: {'lr': 0.00043839818784593705, 'samples': 5914624, 'steps': 11551, 'batch_loss/train': 0.8697770293802023} 12/24/2021 12:01:47 - INFO - codeparrot_training - Step 11552: {'lr': 0.00043838782570773124, 'samples': 5915136, 'steps': 11552, 'batch_loss/train': 0.875305705703795} 12/24/2021 12:01:59 - INFO - codeparrot_training - Step 11553: {'lr': 0.00043837746282056653, 'samples': 5915648, 'steps': 11553, 'batch_loss/train': 0.8442014278843999} 12/24/2021 12:02:09 - INFO - codeparrot_training - Step 11554: {'lr': 0.00043836709918448414, 'samples': 5916160, 'steps': 11554, 'batch_loss/train': 0.9014040534384549} 12/24/2021 12:02:20 - INFO - codeparrot_training - Step 11555: {'lr': 0.0004383567347995253, 'samples': 5916672, 'steps': 11555, 'batch_loss/train': 0.9334429064765573} 12/24/2021 12:02:30 - INFO - codeparrot_training - Step 11556: {'lr': 0.0004383463696657313, 'samples': 5917184, 'steps': 11556, 'batch_loss/train': 0.8477493366226554} 12/24/2021 12:02:42 - INFO - codeparrot_training - Step 11557: {'lr': 0.00043833600378314307, 'samples': 5917696, 'steps': 11557, 'batch_loss/train': 0.8880462828092277} 12/24/2021 12:02:53 - INFO - codeparrot_training - Step 11558: {'lr': 0.0004383256371518021, 'samples': 5918208, 'steps': 11558, 'batch_loss/train': 0.8059919467195868} 12/24/2021 12:03:04 - INFO - codeparrot_training - Step 11559: {'lr': 0.00043831526977174953, 'samples': 5918720, 'steps': 11559, 'batch_loss/train': 0.9824076788499951} 12/24/2021 12:03:16 - INFO - codeparrot_training - Step 11560: {'lr': 0.00043830490164302655, 'samples': 5919232, 
'steps': 11560, 'batch_loss/train': 0.7192414122400805} 12/24/2021 12:03:26 - INFO - codeparrot_training - Step 11561: {'lr': 0.00043829453276567434, 'samples': 5919744, 'steps': 11561, 'batch_loss/train': 0.8202870711684227} 12/24/2021 12:03:37 - INFO - codeparrot_training - Step 11562: {'lr': 0.0004382841631397342, 'samples': 5920256, 'steps': 11562, 'batch_loss/train': 0.8567637549713254} 12/24/2021 12:03:51 - INFO - codeparrot_training - Step 11563: {'lr': 0.0004382737927652473, 'samples': 5920768, 'steps': 11563, 'batch_loss/train': 0.8342273626476526} 12/24/2021 12:04:01 - INFO - codeparrot_training - Step 11564: {'lr': 0.0004382634216422549, 'samples': 5921280, 'steps': 11564, 'batch_loss/train': 0.7645568675361574} 12/24/2021 12:04:12 - INFO - codeparrot_training - Step 11565: {'lr': 0.0004382530497707982, 'samples': 5921792, 'steps': 11565, 'batch_loss/train': 0.8697323426604271} 12/24/2021 12:04:24 - INFO - codeparrot_training - Step 11566: {'lr': 0.0004382426771509185, 'samples': 5922304, 'steps': 11566, 'batch_loss/train': 0.8339061778970063} 12/24/2021 12:04:35 - INFO - codeparrot_training - Step 11567: {'lr': 0.000438232303782657, 'samples': 5922816, 'steps': 11567, 'batch_loss/train': 0.8140732357278466} 12/24/2021 12:04:45 - INFO - codeparrot_training - Step 11568: {'lr': 0.0004382219296660549, 'samples': 5923328, 'steps': 11568, 'batch_loss/train': 0.8244605539366603} 12/24/2021 12:04:56 - INFO - codeparrot_training - Step 11569: {'lr': 0.00043821155480115346, 'samples': 5923840, 'steps': 11569, 'batch_loss/train': 0.7966225678101182} 12/24/2021 12:05:10 - INFO - codeparrot_training - Step 11570: {'lr': 0.00043820117918799397, 'samples': 5924352, 'steps': 11570, 'batch_loss/train': 0.8934427765198052} 12/24/2021 12:05:20 - INFO - codeparrot_training - Step 11571: {'lr': 0.0004381908028266176, 'samples': 5924864, 'steps': 11571, 'batch_loss/train': 0.9220105726271868} 12/24/2021 12:05:31 - INFO - codeparrot_training - Step 11572: {'lr': 
0.00043818042571706575, 'samples': 5925376, 'steps': 11572, 'batch_loss/train': 0.6622952258912846} 12/24/2021 12:05:43 - INFO - codeparrot_training - Step 11573: {'lr': 0.00043817004785937953, 'samples': 5925888, 'steps': 11573, 'batch_loss/train': 0.7782238684594631} 12/24/2021 12:05:53 - INFO - codeparrot_training - Step 11574: {'lr': 0.0004381596692536002, 'samples': 5926400, 'steps': 11574, 'batch_loss/train': 0.8657109106425196} 12/24/2021 12:06:04 - INFO - codeparrot_training - Step 11575: {'lr': 0.00043814928989976916, 'samples': 5926912, 'steps': 11575, 'batch_loss/train': 0.8737171944230795} 12/24/2021 12:06:16 - INFO - codeparrot_training - Step 11576: {'lr': 0.0004381389097979276, 'samples': 5927424, 'steps': 11576, 'batch_loss/train': 0.8877669908106327} 12/24/2021 12:06:27 - INFO - codeparrot_training - Step 11577: {'lr': 0.00043812852894811673, 'samples': 5927936, 'steps': 11577, 'batch_loss/train': 0.853417057543993} 12/24/2021 12:06:37 - INFO - codeparrot_training - Step 11578: {'lr': 0.0004381181473503778, 'samples': 5928448, 'steps': 11578, 'batch_loss/train': 0.8276679487898946} 12/24/2021 12:06:51 - INFO - codeparrot_training - Step 11579: {'lr': 0.0004381077650047522, 'samples': 5928960, 'steps': 11579, 'batch_loss/train': 0.8200833974406123} 12/24/2021 12:07:02 - INFO - codeparrot_training - Step 11580: {'lr': 0.00043809738191128117, 'samples': 5929472, 'steps': 11580, 'batch_loss/train': 0.8188296286389232} 12/24/2021 12:07:12 - INFO - codeparrot_training - Step 11581: {'lr': 0.00043808699807000595, 'samples': 5929984, 'steps': 11581, 'batch_loss/train': 0.7894366988912225} 12/24/2021 12:07:23 - INFO - codeparrot_training - Step 11582: {'lr': 0.0004380766134809678, 'samples': 5930496, 'steps': 11582, 'batch_loss/train': 0.7638015942648053} 12/24/2021 12:07:35 - INFO - codeparrot_training - Step 11583: {'lr': 0.00043806622814420815, 'samples': 5931008, 'steps': 11583, 'batch_loss/train': 1.1734613124281168} 12/24/2021 12:07:45 - INFO - 
codeparrot_training - Step 11584: {'lr': 0.0004380558420597682, 'samples': 5931520, 'steps': 11584, 'batch_loss/train': 0.9385200515389442} 12/24/2021 12:07:56 - INFO - codeparrot_training - Step 11585: {'lr': 0.0004380454552276891, 'samples': 5932032, 'steps': 11585, 'batch_loss/train': 0.8411019686609507} 12/24/2021 12:08:10 - INFO - codeparrot_training - Step 11586: {'lr': 0.0004380350676480124, 'samples': 5932544, 'steps': 11586, 'batch_loss/train': 0.9061531717889011} 12/24/2021 12:08:20 - INFO - codeparrot_training - Step 11587: {'lr': 0.0004380246793207792, 'samples': 5933056, 'steps': 11587, 'batch_loss/train': 0.845973645336926} 12/24/2021 12:08:31 - INFO - codeparrot_training - Step 11588: {'lr': 0.0004380142902460309, 'samples': 5933568, 'steps': 11588, 'batch_loss/train': 0.823011415079236} 12/24/2021 12:08:43 - INFO - codeparrot_training - Step 11589: {'lr': 0.00043800390042380877, 'samples': 5934080, 'steps': 11589, 'batch_loss/train': 0.7755142385140061} 12/24/2021 12:08:54 - INFO - codeparrot_training - Step 11590: {'lr': 0.0004379935098541541, 'samples': 5934592, 'steps': 11590, 'batch_loss/train': 0.8481012303382158} 12/24/2021 12:09:04 - INFO - codeparrot_training - Step 11591: {'lr': 0.0004379831185371082, 'samples': 5935104, 'steps': 11591, 'batch_loss/train': 0.7527904894668609} 12/24/2021 12:09:15 - INFO - codeparrot_training - Step 11592: {'lr': 0.00043797272647271246, 'samples': 5935616, 'steps': 11592, 'batch_loss/train': 0.8771649142727256} 12/24/2021 12:09:27 - INFO - codeparrot_training - Step 11593: {'lr': 0.000437962333661008, 'samples': 5936128, 'steps': 11593, 'batch_loss/train': 0.8441026918590069} 12/24/2021 12:09:38 - INFO - codeparrot_training - Step 11594: {'lr': 0.00043795194010203643, 'samples': 5936640, 'steps': 11594, 'batch_loss/train': 1.0156392906792462} 12/24/2021 12:09:48 - INFO - codeparrot_training - Step 11595: {'lr': 0.0004379415457958389, 'samples': 5937152, 'steps': 11595, 'batch_loss/train': 0.8102868818677962} 
12/24/2021 12:10:00 - INFO - codeparrot_training - Step 11596: {'lr': 0.00043793115074245673, 'samples': 5937664, 'steps': 11596, 'batch_loss/train': 0.8321265345439315} 12/24/2021 12:10:11 - INFO - codeparrot_training - Step 11597: {'lr': 0.0004379207549419312, 'samples': 5938176, 'steps': 11597, 'batch_loss/train': 0.9067682139575481} 12/24/2021 12:10:22 - INFO - codeparrot_training - Step 11598: {'lr': 0.0004379103583943037, 'samples': 5938688, 'steps': 11598, 'batch_loss/train': 0.727907867403701} 12/24/2021 12:10:35 - INFO - codeparrot_training - Step 11599: {'lr': 0.0004378999610996156, 'samples': 5939200, 'steps': 11599, 'batch_loss/train': 0.8384908409789205} 12/24/2021 12:10:46 - INFO - codeparrot_training - Step 11600: {'lr': 0.0004378895630579083, 'samples': 5939712, 'steps': 11600, 'batch_loss/train': 0.9088318115100265} 12/24/2021 12:10:57 - INFO - codeparrot_training - Step 11601: {'lr': 0.0004378791642692229, 'samples': 5940224, 'steps': 11601, 'batch_loss/train': 0.8191968044266105} 12/24/2021 12:11:07 - INFO - codeparrot_training - Step 11602: {'lr': 0.000437868764733601, 'samples': 5940736, 'steps': 11602, 'batch_loss/train': 0.847730478271842} 12/24/2021 12:11:19 - INFO - codeparrot_training - Step 11603: {'lr': 0.00043785836445108374, 'samples': 5941248, 'steps': 11603, 'batch_loss/train': 0.8868412636220455} 12/24/2021 12:11:30 - INFO - codeparrot_training - Step 11604: {'lr': 0.0004378479634217126, 'samples': 5941760, 'steps': 11604, 'batch_loss/train': 0.9145413301885128} 12/24/2021 12:11:40 - INFO - codeparrot_training - Step 11605: {'lr': 0.0004378375616455289, 'samples': 5942272, 'steps': 11605, 'batch_loss/train': 0.8403426893055439} 12/24/2021 12:11:52 - INFO - codeparrot_training - Step 11606: {'lr': 0.00043782715912257385, 'samples': 5942784, 'steps': 11606, 'batch_loss/train': 0.7754302183166146} 12/24/2021 12:12:03 - INFO - codeparrot_training - Step 11607: {'lr': 0.0004378167558528891, 'samples': 5943296, 'steps': 11607, 
'batch_loss/train': 0.8443226399831474} 12/24/2021 12:12:14 - INFO - codeparrot_training - Step 11608: {'lr': 0.00043780635183651576, 'samples': 5943808, 'steps': 11608, 'batch_loss/train': 0.7966298656538129} 12/24/2021 12:12:27 - INFO - codeparrot_training - Step 11609: {'lr': 0.0004377959470734953, 'samples': 5944320, 'steps': 11609, 'batch_loss/train': 0.711787095759064} 12/24/2021 12:12:38 - INFO - codeparrot_training - Step 11610: {'lr': 0.00043778554156386905, 'samples': 5944832, 'steps': 11610, 'batch_loss/train': 0.8151143302675337} 12/24/2021 12:12:49 - INFO - codeparrot_training - Step 11611: {'lr': 0.0004377751353076784, 'samples': 5945344, 'steps': 11611, 'batch_loss/train': 0.8266148697584867} 12/24/2021 12:12:59 - INFO - codeparrot_training - Step 11612: {'lr': 0.0004377647283049647, 'samples': 5945856, 'steps': 11612, 'batch_loss/train': 0.8527458151802421} 12/24/2021 12:13:11 - INFO - codeparrot_training - Step 11613: {'lr': 0.0004377543205557694, 'samples': 5946368, 'steps': 11613, 'batch_loss/train': 0.7467412957921624} 12/24/2021 12:13:22 - INFO - codeparrot_training - Step 11614: {'lr': 0.00043774391206013373, 'samples': 5946880, 'steps': 11614, 'batch_loss/train': 0.7940186904743314} 12/24/2021 12:13:32 - INFO - codeparrot_training - Step 11615: {'lr': 0.00043773350281809927, 'samples': 5947392, 'steps': 11615, 'batch_loss/train': 0.8509871494024992} 12/24/2021 12:13:46 - INFO - codeparrot_training - Step 11616: {'lr': 0.0004377230928297072, 'samples': 5947904, 'steps': 11616, 'batch_loss/train': 0.7894705166108906} 12/24/2021 12:13:57 - INFO - codeparrot_training - Step 11617: {'lr': 0.000437712682094999, 'samples': 5948416, 'steps': 11617, 'batch_loss/train': 0.8988416399806738} 12/24/2021 12:14:07 - INFO - codeparrot_training - Step 11618: {'lr': 0.000437702270614016, 'samples': 5948928, 'steps': 11618, 'batch_loss/train': 0.8381661400198936} 12/24/2021 12:14:20 - INFO - codeparrot_training - Step 11619: {'lr': 0.00043769185838679977, 
'samples': 5949440, 'steps': 11619, 'batch_loss/train': 0.8507503569126129} 12/24/2021 12:14:30 - INFO - codeparrot_training - Step 11620: {'lr': 0.0004376814454133915, 'samples': 5949952, 'steps': 11620, 'batch_loss/train': 0.8262763805687428} 12/24/2021 12:14:41 - INFO - codeparrot_training - Step 11621: {'lr': 0.0004376710316938327, 'samples': 5950464, 'steps': 11621, 'batch_loss/train': 0.8283967729657888} 12/24/2021 12:14:51 - INFO - codeparrot_training - Step 11622: {'lr': 0.00043766061722816477, 'samples': 5950976, 'steps': 11622, 'batch_loss/train': 0.8396310694515705} 12/24/2021 12:15:03 - INFO - codeparrot_training - Step 11623: {'lr': 0.000437650202016429, 'samples': 5951488, 'steps': 11623, 'batch_loss/train': 0.7218479961156845} 12/24/2021 12:15:14 - INFO - codeparrot_training - Step 11624: {'lr': 0.00043763978605866697, 'samples': 5952000, 'steps': 11624, 'batch_loss/train': 0.7771215580869466} 12/24/2021 12:15:25 - INFO - codeparrot_training - Step 11625: {'lr': 0.00043762936935491995, 'samples': 5952512, 'steps': 11625, 'batch_loss/train': 0.8501758212223649} 12/24/2021 12:15:38 - INFO - codeparrot_training - Step 11626: {'lr': 0.00043761895190522936, 'samples': 5953024, 'steps': 11626, 'batch_loss/train': 0.7616294156759977} 12/24/2021 12:15:49 - INFO - codeparrot_training - Step 11627: {'lr': 0.00043760853370963674, 'samples': 5953536, 'steps': 11627, 'batch_loss/train': 0.996350129134953} 12/24/2021 12:16:00 - INFO - codeparrot_training - Step 11628: {'lr': 0.0004375981147681833, 'samples': 5954048, 'steps': 11628, 'batch_loss/train': 0.8273781882598996} 12/24/2021 12:16:12 - INFO - codeparrot_training - Step 11629: {'lr': 0.00043758769508091077, 'samples': 5954560, 'steps': 11629, 'batch_loss/train': 0.7882094881497324} 12/24/2021 12:16:22 - INFO - codeparrot_training - Step 11630: {'lr': 0.00043757727464786026, 'samples': 5955072, 'steps': 11630, 'batch_loss/train': 0.9794230796396732} 12/24/2021 12:16:33 - INFO - codeparrot_training - Step 
11631: {'lr': 0.00043756685346907343, 'samples': 5955584, 'steps': 11631, 'batch_loss/train': 0.8474355777725577} 12/24/2021 12:16:45 - INFO - codeparrot_training - Step 11632: {'lr': 0.00043755643154459153, 'samples': 5956096, 'steps': 11632, 'batch_loss/train': 0.7616382292471826} 12/24/2021 12:16:55 - INFO - codeparrot_training - Step 11633: {'lr': 0.00043754600887445605, 'samples': 5956608, 'steps': 11633, 'batch_loss/train': 0.9091257723048329} 12/24/2021 12:17:06 - INFO - codeparrot_training - Step 11634: {'lr': 0.0004375355854587086, 'samples': 5957120, 'steps': 11634, 'batch_loss/train': 0.8459434201940894} 12/24/2021 12:17:17 - INFO - codeparrot_training - Step 11635: {'lr': 0.0004375251612973904, 'samples': 5957632, 'steps': 11635, 'batch_loss/train': 0.871345560066402} 12/24/2021 12:17:29 - INFO - codeparrot_training - Step 11636: {'lr': 0.00043751473639054296, 'samples': 5958144, 'steps': 11636, 'batch_loss/train': 0.8685217942111194} 12/24/2021 12:17:39 - INFO - codeparrot_training - Step 11637: {'lr': 0.0004375043107382077, 'samples': 5958656, 'steps': 11637, 'batch_loss/train': 0.8585586016997695} 12/24/2021 12:17:50 - INFO - codeparrot_training - Step 11638: {'lr': 0.0004374938843404261, 'samples': 5959168, 'steps': 11638, 'batch_loss/train': 0.7084438952151686} 12/24/2021 12:18:04 - INFO - codeparrot_training - Step 11639: {'lr': 0.00043748345719723966, 'samples': 5959680, 'steps': 11639, 'batch_loss/train': 0.837214611005038} 12/24/2021 12:18:14 - INFO - codeparrot_training - Step 11640: {'lr': 0.0004374730293086897, 'samples': 5960192, 'steps': 11640, 'batch_loss/train': 0.8082345370203257} 12/24/2021 12:18:25 - INFO - codeparrot_training - Step 11641: {'lr': 0.00043746260067481784, 'samples': 5960704, 'steps': 11641, 'batch_loss/train': 0.8450025185011327} 12/24/2021 12:18:37 - INFO - codeparrot_training - Step 11642: {'lr': 0.0004374521712956655, 'samples': 5961216, 'steps': 11642, 'batch_loss/train': 0.8291798415593803} 12/24/2021 12:18:47 - 
INFO - codeparrot_training - Step 11643: {'lr': 0.000437441741171274, 'samples': 5961728, 'steps': 11643, 'batch_loss/train': 0.7923050546087325} 12/24/2021 12:18:58 - INFO - codeparrot_training - Step 11644: {'lr': 0.0004374313103016849, 'samples': 5962240, 'steps': 11644, 'batch_loss/train': 0.8133429707959294} 12/24/2021 12:19:09 - INFO - codeparrot_training - Step 11645: {'lr': 0.00043742087868693983, 'samples': 5962752, 'steps': 11645, 'batch_loss/train': 0.8146659429185092} 12/24/2021 12:19:22 - INFO - codeparrot_training - Step 11646: {'lr': 0.00043741044632708, 'samples': 5963264, 'steps': 11646, 'batch_loss/train': 0.7089346639113501} 12/24/2021 12:19:33 - INFO - codeparrot_training - Step 11647: {'lr': 0.00043740001322214707, 'samples': 5963776, 'steps': 11647, 'batch_loss/train': 0.8009026320651174} 12/24/2021 12:19:44 - INFO - codeparrot_training - Step 11648: {'lr': 0.0004373895793721824, 'samples': 5964288, 'steps': 11648, 'batch_loss/train': 0.8284736732020974} 12/24/2021 12:19:56 - INFO - codeparrot_training - Step 11649: {'lr': 0.00043737914477722753, 'samples': 5964800, 'steps': 11649, 'batch_loss/train': 0.7777611752972007} 12/24/2021 12:20:06 - INFO - codeparrot_training - Step 11650: {'lr': 0.0004373687094373239, 'samples': 5965312, 'steps': 11650, 'batch_loss/train': 0.7867803620174527} 12/24/2021 12:20:17 - INFO - codeparrot_training - Step 11651: {'lr': 0.000437358273352513, 'samples': 5965824, 'steps': 11651, 'batch_loss/train': 0.8711332988459617} 12/24/2021 12:20:29 - INFO - codeparrot_training - Step 11652: {'lr': 0.00043734783652283645, 'samples': 5966336, 'steps': 11652, 'batch_loss/train': 0.8531818185001612} 12/24/2021 12:20:40 - INFO - codeparrot_training - Step 11653: {'lr': 0.00043733739894833557, 'samples': 5966848, 'steps': 11653, 'batch_loss/train': 0.8047435497865081} 12/24/2021 12:20:50 - INFO - codeparrot_training - Step 11654: {'lr': 0.000437326960629052, 'samples': 5967360, 'steps': 11654, 'batch_loss/train': 
0.7301565357483923} 12/24/2021 12:21:01 - INFO - codeparrot_training - Step 11655: {'lr': 0.0004373165215650271, 'samples': 5967872, 'steps': 11655, 'batch_loss/train': 0.8319264343008399} 12/24/2021 12:21:15 - INFO - codeparrot_training - Step 11656: {'lr': 0.00043730608175630246, 'samples': 5968384, 'steps': 11656, 'batch_loss/train': 0.8410697123035789} 12/24/2021 12:21:25 - INFO - codeparrot_training - Step 11657: {'lr': 0.0004372956412029195, 'samples': 5968896, 'steps': 11657, 'batch_loss/train': 0.810084879398346} 12/24/2021 12:21:36 - INFO - codeparrot_training - Step 11658: {'lr': 0.00043728519990491986, 'samples': 5969408, 'steps': 11658, 'batch_loss/train': 0.8982053315266967} 12/24/2021 12:21:48 - INFO - codeparrot_training - Step 11659: {'lr': 0.00043727475786234496, 'samples': 5969920, 'steps': 11659, 'batch_loss/train': 0.9012392973527312} 12/24/2021 12:21:58 - INFO - codeparrot_training - Step 11660: {'lr': 0.00043726431507523633, 'samples': 5970432, 'steps': 11660, 'batch_loss/train': 0.8851936450228095} 12/24/2021 12:22:09 - INFO - codeparrot_training - Step 11661: {'lr': 0.00043725387154363557, 'samples': 5970944, 'steps': 11661, 'batch_loss/train': 0.8768702326342463} 12/24/2021 12:22:21 - INFO - codeparrot_training - Step 11662: {'lr': 0.000437243427267584, 'samples': 5971456, 'steps': 11662, 'batch_loss/train': 0.6247611874714494} 12/24/2021 12:22:32 - INFO - codeparrot_training - Step 11663: {'lr': 0.0004372329822471234, 'samples': 5971968, 'steps': 11663, 'batch_loss/train': 0.8242616865318269} 12/24/2021 12:22:42 - INFO - codeparrot_training - Step 11664: {'lr': 0.000437222536482295, 'samples': 5972480, 'steps': 11664, 'batch_loss/train': 0.9245325615629554} 12/24/2021 12:22:53 - INFO - codeparrot_training - Step 11665: {'lr': 0.0004372120899731407, 'samples': 5972992, 'steps': 11665, 'batch_loss/train': 0.8259729845449328} 12/24/2021 12:23:07 - INFO - codeparrot_training - Step 11666: {'lr': 0.0004372016427197016, 'samples': 5973504, 
'steps': 11666, 'batch_loss/train': 0.8904564768308774} 12/24/2021 12:23:17 - INFO - codeparrot_training - Step 11667: {'lr': 0.0004371911947220195, 'samples': 5974016, 'steps': 11667, 'batch_loss/train': 0.7684264169074595} 12/24/2021 12:23:28 - INFO - codeparrot_training - Step 11668: {'lr': 0.00043718074598013597, 'samples': 5974528, 'steps': 11668, 'batch_loss/train': 0.7596973037580028} 12/24/2021 12:23:40 - INFO - codeparrot_training - Step 11669: {'lr': 0.0004371702964940924, 'samples': 5975040, 'steps': 11669, 'batch_loss/train': 0.7767551152501255} 12/24/2021 12:23:51 - INFO - codeparrot_training - Step 11670: {'lr': 0.00043715984626393047, 'samples': 5975552, 'steps': 11670, 'batch_loss/train': 0.7542935321107507} 12/24/2021 12:24:01 - INFO - codeparrot_training - Step 11671: {'lr': 0.0004371493952896916, 'samples': 5976064, 'steps': 11671, 'batch_loss/train': 0.7821650551632047} 12/24/2021 12:24:14 - INFO - codeparrot_training - Step 11672: {'lr': 0.0004371389435714174, 'samples': 5976576, 'steps': 11672, 'batch_loss/train': 0.8529095877893269} 12/24/2021 12:24:24 - INFO - codeparrot_training - Step 11673: {'lr': 0.0004371284911091495, 'samples': 5977088, 'steps': 11673, 'batch_loss/train': 0.8914652578532696} 12/24/2021 12:24:35 - INFO - codeparrot_training - Step 11674: {'lr': 0.0004371180379029293, 'samples': 5977600, 'steps': 11674, 'batch_loss/train': 0.7425255984999239} 12/24/2021 12:24:47 - INFO - codeparrot_training - Step 11675: {'lr': 0.0004371075839527985, 'samples': 5978112, 'steps': 11675, 'batch_loss/train': 0.8027457995340228} 12/24/2021 12:24:58 - INFO - codeparrot_training - Step 11676: {'lr': 0.0004370971292587985, 'samples': 5978624, 'steps': 11676, 'batch_loss/train': 0.7696618787012994} 12/24/2021 12:25:08 - INFO - codeparrot_training - Step 11677: {'lr': 0.000437086673820971, 'samples': 5979136, 'steps': 11677, 'batch_loss/train': 0.8422513124532998} 12/24/2021 12:25:19 - INFO - codeparrot_training - Step 11678: {'lr': 
0.00043707621763935755, 'samples': 5979648, 'steps': 11678, 'batch_loss/train': 0.6558418060885742} 12/24/2021 12:25:33 - INFO - codeparrot_training - Step 11679: {'lr': 0.0004370657607139997, 'samples': 5980160, 'steps': 11679, 'batch_loss/train': 0.9518253523856401} 12/24/2021 12:25:44 - INFO - codeparrot_training - Step 11680: {'lr': 0.000437055303044939, 'samples': 5980672, 'steps': 11680, 'batch_loss/train': 0.7912518617231399} 12/24/2021 12:25:54 - INFO - codeparrot_training - Step 11681: {'lr': 0.00043704484463221707, 'samples': 5981184, 'steps': 11681, 'batch_loss/train': 0.7746468344703317} 12/24/2021 12:26:06 - INFO - codeparrot_training - Step 11682: {'lr': 0.0004370343854758754, 'samples': 5981696, 'steps': 11682, 'batch_loss/train': 0.9657378923147917} 12/24/2021 12:26:17 - INFO - codeparrot_training - Step 11683: {'lr': 0.0004370239255759557, 'samples': 5982208, 'steps': 11683, 'batch_loss/train': 0.8253814931958914} 12/24/2021 12:26:28 - INFO - codeparrot_training - Step 11684: {'lr': 0.00043701346493249944, 'samples': 5982720, 'steps': 11684, 'batch_loss/train': 0.9408642328344285} 12/24/2021 12:26:41 - INFO - codeparrot_training - Step 11685: {'lr': 0.0004370030035455483, 'samples': 5983232, 'steps': 11685, 'batch_loss/train': 0.7501863790675998} 12/24/2021 12:26:52 - INFO - codeparrot_training - Step 11686: {'lr': 0.00043699254141514376, 'samples': 5983744, 'steps': 11686, 'batch_loss/train': 0.7745876619592309} 12/24/2021 12:27:03 - INFO - codeparrot_training - Step 11687: {'lr': 0.00043698207854132756, 'samples': 5984256, 'steps': 11687, 'batch_loss/train': 0.8657521372660995} 12/24/2021 12:27:13 - INFO - codeparrot_training - Step 11688: {'lr': 0.00043697161492414115, 'samples': 5984768, 'steps': 11688, 'batch_loss/train': 0.9096247609704733} 12/24/2021 12:27:25 - INFO - codeparrot_training - Step 11689: {'lr': 0.00043696115056362627, 'samples': 5985280, 'steps': 11689, 'batch_loss/train': 1.0440878588706255} 12/24/2021 12:27:36 - INFO - 
codeparrot_training - Step 11690: {'lr': 0.0004369506854598244, 'samples': 5985792, 'steps': 11690, 'batch_loss/train': 0.8987984983250499} 12/24/2021 12:27:46 - INFO - codeparrot_training - Step 11691: {'lr': 0.0004369402196127772, 'samples': 5986304, 'steps': 11691, 'batch_loss/train': 0.8621318873483688} 12/24/2021 12:27:59 - INFO - codeparrot_training - Step 11692: {'lr': 0.00043692975302252624, 'samples': 5986816, 'steps': 11692, 'batch_loss/train': 0.8721506798174232} 12/24/2021 12:28:10 - INFO - codeparrot_training - Step 11693: {'lr': 0.00043691928568911323, 'samples': 5987328, 'steps': 11693, 'batch_loss/train': 0.8967791767790914} 12/24/2021 12:28:20 - INFO - codeparrot_training - Step 11694: {'lr': 0.0004369088176125797, 'samples': 5987840, 'steps': 11694, 'batch_loss/train': 0.9681771313771605} 12/24/2021 12:28:34 - INFO - codeparrot_training - Step 11695: {'lr': 0.0004368983487929673, 'samples': 5988352, 'steps': 11695, 'batch_loss/train': 0.8839779915288091} 12/24/2021 12:28:45 - INFO - codeparrot_training - Step 11696: {'lr': 0.0004368878792303176, 'samples': 5988864, 'steps': 11696, 'batch_loss/train': 0.8072274690493941} 12/24/2021 12:28:55 - INFO - codeparrot_training - Step 11697: {'lr': 0.00043687740892467224, 'samples': 5989376, 'steps': 11697, 'batch_loss/train': 0.7863908582367003} 12/24/2021 12:29:07 - INFO - codeparrot_training - Step 11698: {'lr': 0.0004368669378760729, 'samples': 5989888, 'steps': 11698, 'batch_loss/train': 0.9614530391991138} 12/24/2021 12:29:18 - INFO - codeparrot_training - Step 11699: {'lr': 0.0004368564660845612, 'samples': 5990400, 'steps': 11699, 'batch_loss/train': 0.8467415058985353} 12/24/2021 12:29:29 - INFO - codeparrot_training - Step 11700: {'lr': 0.00043684599355017876, 'samples': 5990912, 'steps': 11700, 'batch_loss/train': 0.8407152649015188} 12/24/2021 12:29:39 - INFO - codeparrot_training - Step 11701: {'lr': 0.00043683552027296707, 'samples': 5991424, 'steps': 11701, 'batch_loss/train': 
0.8326765270903707} 12/24/2021 12:29:54 - INFO - codeparrot_training - Step 11702: {'lr': 0.00043682504625296804, 'samples': 5991936, 'steps': 11702, 'batch_loss/train': 0.8221745411865413} 12/24/2021 12:30:05 - INFO - codeparrot_training - Step 11703: {'lr': 0.0004368145714902231, 'samples': 5992448, 'steps': 11703, 'batch_loss/train': 1.0882466034963727} 12/24/2021 12:30:16 - INFO - codeparrot_training - Step 11704: {'lr': 0.00043680409598477395, 'samples': 5992960, 'steps': 11704, 'batch_loss/train': 1.6158346589654684} 12/24/2021 12:30:28 - INFO - codeparrot_training - Step 11705: {'lr': 0.00043679361973666226, 'samples': 5993472, 'steps': 11705, 'batch_loss/train': 0.893579164519906} 12/24/2021 12:30:38 - INFO - codeparrot_training - Step 11706: {'lr': 0.00043678314274592966, 'samples': 5993984, 'steps': 11706, 'batch_loss/train': 0.9781519779935479} 12/24/2021 12:30:49 - INFO - codeparrot_training - Step 11707: {'lr': 0.0004367726650126178, 'samples': 5994496, 'steps': 11707, 'batch_loss/train': 0.7835615272633731} 12/24/2021 12:30:59 - INFO - codeparrot_training - Step 11708: {'lr': 0.0004367621865367684, 'samples': 5995008, 'steps': 11708, 'batch_loss/train': 0.8529587858356535} 12/24/2021 12:31:11 - INFO - codeparrot_training - Step 11709: {'lr': 0.00043675170731842305, 'samples': 5995520, 'steps': 11709, 'batch_loss/train': 0.8837235737591982} 12/24/2021 12:31:22 - INFO - codeparrot_training - Step 11710: {'lr': 0.00043674122735762344, 'samples': 5996032, 'steps': 11710, 'batch_loss/train': 0.7550776381976902} 12/24/2021 12:31:33 - INFO - codeparrot_training - Step 11711: {'lr': 0.0004367307466544111, 'samples': 5996544, 'steps': 11711, 'batch_loss/train': 0.7472761878743768} 12/24/2021 12:31:47 - INFO - codeparrot_training - Step 11712: {'lr': 0.000436720265208828, 'samples': 5997056, 'steps': 11712, 'batch_loss/train': 0.8980620428919792} 12/24/2021 12:31:58 - INFO - codeparrot_training - Step 11713: {'lr': 0.0004367097830209156, 'samples': 5997568, 
'steps': 11713, 'batch_loss/train': 0.6514723324216902} 12/24/2021 12:32:08 - INFO - codeparrot_training - Step 11714: {'lr': 0.00043669930009071543, 'samples': 5998080, 'steps': 11714, 'batch_loss/train': 0.7670709285885096} 12/24/2021 12:32:20 - INFO - codeparrot_training - Step 11715: {'lr': 0.00043668881641826954, 'samples': 5998592, 'steps': 11715, 'batch_loss/train': 0.844336380250752} 12/24/2021 12:32:31 - INFO - codeparrot_training - Step 11716: {'lr': 0.0004366783320036193, 'samples': 5999104, 'steps': 11716, 'batch_loss/train': 0.7729344889521599} 12/24/2021 12:32:41 - INFO - codeparrot_training - Step 11717: {'lr': 0.0004366678468468065, 'samples': 5999616, 'steps': 11717, 'batch_loss/train': 0.7550327843055129} 12/24/2021 12:32:52 - INFO - codeparrot_training - Step 11718: {'lr': 0.0004366573609478729, 'samples': 6000128, 'steps': 11718, 'batch_loss/train': 0.6654296314809471} 12/24/2021 12:33:04 - INFO - codeparrot_training - Step 11719: {'lr': 0.00043664687430686, 'samples': 6000640, 'steps': 11719, 'batch_loss/train': 0.7945548910647631} 12/24/2021 12:33:15 - INFO - codeparrot_training - Step 11720: {'lr': 0.0004366363869238097, 'samples': 6001152, 'steps': 11720, 'batch_loss/train': 0.9105821088887751} 12/24/2021 12:33:26 - INFO - codeparrot_training - Step 11721: {'lr': 0.0004366258987987635, 'samples': 6001664, 'steps': 11721, 'batch_loss/train': 0.7153103835880756} 12/24/2021 12:33:38 - INFO - codeparrot_training - Step 11722: {'lr': 0.0004366154099317633, 'samples': 6002176, 'steps': 11722, 'batch_loss/train': 0.7832771930843592} 12/24/2021 12:33:48 - INFO - codeparrot_training - Step 11723: {'lr': 0.0004366049203228506, 'samples': 6002688, 'steps': 11723, 'batch_loss/train': 0.8174538357416168} 12/24/2021 12:33:59 - INFO - codeparrot_training - Step 11724: {'lr': 0.00043659442997206723, 'samples': 6003200, 'steps': 11724, 'batch_loss/train': 0.8664301242679358} 12/24/2021 12:34:13 - INFO - codeparrot_training - Step 11725: {'lr': 
0.0004365839388794549, 'samples': 6003712, 'steps': 11725, 'batch_loss/train': 0.853627516888082} 12/24/2021 12:34:23 - INFO - codeparrot_training - Step 11726: {'lr': 0.00043657344704505527, 'samples': 6004224, 'steps': 11726, 'batch_loss/train': 0.8337930995039642} 12/24/2021 12:34:34 - INFO - codeparrot_training - Step 11727: {'lr': 0.00043656295446891003, 'samples': 6004736, 'steps': 11727, 'batch_loss/train': 0.8072500275447965} 12/24/2021 12:34:46 - INFO - codeparrot_training - Step 11728: {'lr': 0.00043655246115106096, 'samples': 6005248, 'steps': 11728, 'batch_loss/train': 0.8795997216366231} 12/24/2021 12:34:57 - INFO - codeparrot_training - Step 11729: {'lr': 0.00043654196709154967, 'samples': 6005760, 'steps': 11729, 'batch_loss/train': 0.8219874203205109} 12/24/2021 12:35:08 - INFO - codeparrot_training - Step 11730: {'lr': 0.000436531472290418, 'samples': 6006272, 'steps': 11730, 'batch_loss/train': 0.7846016865223646} 12/24/2021 12:35:18 - INFO - codeparrot_training - Step 11731: {'lr': 0.00043652097674770765, 'samples': 6006784, 'steps': 11731, 'batch_loss/train': 0.7991420216858387} 12/24/2021 12:35:32 - INFO - codeparrot_training - Step 11732: {'lr': 0.00043651048046346034, 'samples': 6007296, 'steps': 11732, 'batch_loss/train': 0.9710358744487166} 12/24/2021 12:35:42 - INFO - codeparrot_training - Step 11733: {'lr': 0.00043649998343771773, 'samples': 6007808, 'steps': 11733, 'batch_loss/train': 0.8618728118017316} 12/24/2021 12:35:53 - INFO - codeparrot_training - Step 11734: {'lr': 0.0004364894856705216, 'samples': 6008320, 'steps': 11734, 'batch_loss/train': 0.7907873613294214} 12/24/2021 12:36:05 - INFO - codeparrot_training - Step 11735: {'lr': 0.0004364789871619137, 'samples': 6008832, 'steps': 11735, 'batch_loss/train': 0.7759946091100574} 12/24/2021 12:36:16 - INFO - codeparrot_training - Step 11736: {'lr': 0.0004364684879119358, 'samples': 6009344, 'steps': 11736, 'batch_loss/train': 0.7773369867354631} 12/24/2021 12:36:26 - INFO - 
codeparrot_training - Step 11737: {'lr': 0.00043645798792062965, 'samples': 6009856, 'steps': 11737, 'batch_loss/train': 0.7772903814911842} 12/24/2021 12:36:37 - INFO - codeparrot_training - Step 11738: {'lr': 0.00043644748718803684, 'samples': 6010368, 'steps': 11738, 'batch_loss/train': 0.8165462613105774} 12/24/2021 12:36:49 - INFO - codeparrot_training - Step 11739: {'lr': 0.0004364369857141992, 'samples': 6010880, 'steps': 11739, 'batch_loss/train': 0.8460969654843211} 12/24/2021 12:37:00 - INFO - codeparrot_training - Step 11740: {'lr': 0.0004364264834991586, 'samples': 6011392, 'steps': 11740, 'batch_loss/train': 0.8603791231289506} 12/24/2021 12:37:10 - INFO - codeparrot_training - Step 11741: {'lr': 0.0004364159805429566, 'samples': 6011904, 'steps': 11741, 'batch_loss/train': 0.8132722098380327} 12/24/2021 12:37:24 - INFO - codeparrot_training - Step 11742: {'lr': 0.00043640547684563514, 'samples': 6012416, 'steps': 11742, 'batch_loss/train': 0.802887330763042} 12/24/2021 12:37:35 - INFO - codeparrot_training - Step 11743: {'lr': 0.0004363949724072358, 'samples': 6012928, 'steps': 11743, 'batch_loss/train': 0.7293046121485531} 12/24/2021 12:37:45 - INFO - codeparrot_training - Step 11744: {'lr': 0.0004363844672278005, 'samples': 6013440, 'steps': 11744, 'batch_loss/train': 0.7824789620935917} 12/24/2021 12:37:58 - INFO - codeparrot_training - Step 11745: {'lr': 0.0004363739613073709, 'samples': 6013952, 'steps': 11745, 'batch_loss/train': 0.7825160357169807} 12/24/2021 12:38:08 - INFO - codeparrot_training - Step 11746: {'lr': 0.0004363634546459888, 'samples': 6014464, 'steps': 11746, 'batch_loss/train': 0.8749338062480092} 12/24/2021 12:38:19 - INFO - codeparrot_training - Step 11747: {'lr': 0.000436352947243696, 'samples': 6014976, 'steps': 11747, 'batch_loss/train': 0.8819288667291403} 12/24/2021 12:38:31 - INFO - codeparrot_training - Step 11748: {'lr': 0.00043634243910053424, 'samples': 6015488, 'steps': 11748, 'batch_loss/train': 0.830249990336597} 
12/24/2021 12:38:41 - INFO - codeparrot_training - Step 11749: {'lr': 0.00043633193021654527, 'samples': 6016000, 'steps': 11749, 'batch_loss/train': 0.757921767886728} 12/24/2021 12:38:52 - INFO - codeparrot_training - Step 11750: {'lr': 0.0004363214205917709, 'samples': 6016512, 'steps': 11750, 'batch_loss/train': 0.7933501079678535} 12/24/2021 12:39:02 - INFO - codeparrot_training - Step 11751: {'lr': 0.0004363109102262529, 'samples': 6017024, 'steps': 11751, 'batch_loss/train': 0.7887306548655033} 12/24/2021 12:39:16 - INFO - codeparrot_training - Step 11752: {'lr': 0.00043630039912003317, 'samples': 6017536, 'steps': 11752, 'batch_loss/train': 0.926324880681932} 12/24/2021 12:39:27 - INFO - codeparrot_training - Step 11753: {'lr': 0.00043628988727315336, 'samples': 6018048, 'steps': 11753, 'batch_loss/train': 0.8114265957847238} 12/24/2021 12:39:37 - INFO - codeparrot_training - Step 11754: {'lr': 0.00043627937468565525, 'samples': 6018560, 'steps': 11754, 'batch_loss/train': 0.8682216387242079} 12/24/2021 12:39:50 - INFO - codeparrot_training - Step 11755: {'lr': 0.0004362688613575807, 'samples': 6019072, 'steps': 11755, 'batch_loss/train': 0.879512270912528} 12/24/2021 12:40:00 - INFO - codeparrot_training - Step 11756: {'lr': 0.0004362583472889715, 'samples': 6019584, 'steps': 11756, 'batch_loss/train': 0.8891006326302886} 12/24/2021 12:40:11 - INFO - codeparrot_training - Step 11757: {'lr': 0.00043624783247986945, 'samples': 6020096, 'steps': 11757, 'batch_loss/train': 0.7561360895633698} 12/24/2021 12:40:23 - INFO - codeparrot_training - Step 11758: {'lr': 0.00043623731693031626, 'samples': 6020608, 'steps': 11758, 'batch_loss/train': 0.8668024085927755} 12/24/2021 12:40:34 - INFO - codeparrot_training - Step 11759: {'lr': 0.00043622680064035394, 'samples': 6021120, 'steps': 11759, 'batch_loss/train': 0.7900274740532041} 12/24/2021 12:40:44 - INFO - codeparrot_training - Step 11760: {'lr': 0.0004362162836100241, 'samples': 6021632, 'steps': 11760, 
'batch_loss/train': 0.8467871602624655} 12/24/2021 12:40:55 - INFO - codeparrot_training - Step 11761: {'lr': 0.00043620576583936865, 'samples': 6022144, 'steps': 11761, 'batch_loss/train': 0.7847380917519331} 12/24/2021 12:41:09 - INFO - codeparrot_training - Step 11762: {'lr': 0.0004361952473284294, 'samples': 6022656, 'steps': 11762, 'batch_loss/train': 0.6808999546337873} 12/24/2021 12:41:19 - INFO - codeparrot_training - Step 11763: {'lr': 0.00043618472807724806, 'samples': 6023168, 'steps': 11763, 'batch_loss/train': 0.7562377247959375} 12/24/2021 12:41:30 - INFO - codeparrot_training - Step 11764: {'lr': 0.0004361742080858667, 'samples': 6023680, 'steps': 11764, 'batch_loss/train': 0.7916139222215861} 12/24/2021 12:41:42 - INFO - codeparrot_training - Step 11765: {'lr': 0.0004361636873543269, 'samples': 6024192, 'steps': 11765, 'batch_loss/train': 0.8639918174594641} 12/24/2021 12:41:52 - INFO - codeparrot_training - Step 11766: {'lr': 0.0004361531658826705, 'samples': 6024704, 'steps': 11766, 'batch_loss/train': 0.46093166223727167} 12/24/2021 12:42:03 - INFO - codeparrot_training - Step 11767: {'lr': 0.00043614264367093945, 'samples': 6025216, 'steps': 11767, 'batch_loss/train': 0.8467028300510719} 12/24/2021 12:42:15 - INFO - codeparrot_training - Step 11768: {'lr': 0.00043613212071917564, 'samples': 6025728, 'steps': 11768, 'batch_loss/train': 0.8703522263094783} 12/24/2021 12:42:26 - INFO - codeparrot_training - Step 11769: {'lr': 0.0004361215970274207, 'samples': 6026240, 'steps': 11769, 'batch_loss/train': 0.7103698856662959} 12/24/2021 12:42:36 - INFO - codeparrot_training - Step 11770: {'lr': 0.0004361110725957166, 'samples': 6026752, 'steps': 11770, 'batch_loss/train': 0.795048288826365} 12/24/2021 12:42:47 - INFO - codeparrot_training - Step 11771: {'lr': 0.0004361005474241051, 'samples': 6027264, 'steps': 11771, 'batch_loss/train': 0.8296354096382856} 12/24/2021 12:43:01 - INFO - codeparrot_training - Step 11772: {'lr': 0.0004360900215126281, 
'samples': 6027776, 'steps': 11772, 'batch_loss/train': 0.890944005921483} 12/24/2021 12:43:11 - INFO - codeparrot_training - Step 11773: {'lr': 0.00043607949486132745, 'samples': 6028288, 'steps': 11773, 'batch_loss/train': 0.7936165826395154} 12/24/2021 12:43:22 - INFO - codeparrot_training - Step 11774: {'lr': 0.00043606896747024504, 'samples': 6028800, 'steps': 11774, 'batch_loss/train': 0.9413049495778978} 12/24/2021 12:43:34 - INFO - codeparrot_training - Step 11775: {'lr': 0.0004360584393394226, 'samples': 6029312, 'steps': 11775, 'batch_loss/train': 0.8628997793421149} 12/24/2021 12:43:45 - INFO - codeparrot_training - Step 11776: {'lr': 0.00043604791046890216, 'samples': 6029824, 'steps': 11776, 'batch_loss/train': 0.7722929921001196} 12/24/2021 12:43:55 - INFO - codeparrot_training - Step 11777: {'lr': 0.0004360373808587254, 'samples': 6030336, 'steps': 11777, 'batch_loss/train': 0.8917267202050425} 12/24/2021 12:44:07 - INFO - codeparrot_training - Step 11778: {'lr': 0.0004360268505089342, 'samples': 6030848, 'steps': 11778, 'batch_loss/train': 0.8836273029446602} 12/24/2021 12:44:18 - INFO - codeparrot_training - Step 11779: {'lr': 0.00043601631941957055, 'samples': 6031360, 'steps': 11779, 'batch_loss/train': 0.8447324177250266} 12/24/2021 12:44:29 - INFO - codeparrot_training - Step 11780: {'lr': 0.00043600578759067627, 'samples': 6031872, 'steps': 11780, 'batch_loss/train': 0.748365749605} 12/24/2021 12:44:39 - INFO - codeparrot_training - Step 11781: {'lr': 0.0004359952550222932, 'samples': 6032384, 'steps': 11781, 'batch_loss/train': 0.8467170423828065} 12/24/2021 12:44:53 - INFO - codeparrot_training - Step 11782: {'lr': 0.00043598472171446325, 'samples': 6032896, 'steps': 11782, 'batch_loss/train': 0.7953021316789091} 12/24/2021 12:45:04 - INFO - codeparrot_training - Step 11783: {'lr': 0.0004359741876672282, 'samples': 6033408, 'steps': 11783, 'batch_loss/train': 0.7701834655599669} 12/24/2021 12:45:14 - INFO - codeparrot_training - Step 11784: 
{'lr': 0.0004359636528806301, 'samples': 6033920, 'steps': 11784, 'batch_loss/train': 0.7965559314470738} 12/24/2021 12:45:26 - INFO - codeparrot_training - Step 11785: {'lr': 0.0004359531173547107, 'samples': 6034432, 'steps': 11785, 'batch_loss/train': 0.9467361997812986} 12/24/2021 12:45:37 - INFO - codeparrot_training - Step 11786: {'lr': 0.00043594258108951183, 'samples': 6034944, 'steps': 11786, 'batch_loss/train': 0.8318564454093575} 12/24/2021 12:45:47 - INFO - codeparrot_training - Step 11787: {'lr': 0.00043593204408507553, 'samples': 6035456, 'steps': 11787, 'batch_loss/train': 0.8276078784838319} 12/24/2021 12:46:01 - INFO - codeparrot_training - Step 11788: {'lr': 0.00043592150634144363, 'samples': 6035968, 'steps': 11788, 'batch_loss/train': 0.7686287658289075} 12/24/2021 12:46:12 - INFO - codeparrot_training - Step 11789: {'lr': 0.00043591096785865804, 'samples': 6036480, 'steps': 11789, 'batch_loss/train': 0.808810006827116} 12/24/2021 12:46:22 - INFO - codeparrot_training - Step 11790: {'lr': 0.0004359004286367606, 'samples': 6036992, 'steps': 11790, 'batch_loss/train': 0.6633726314175874} 12/24/2021 12:46:33 - INFO - codeparrot_training - Step 11791: {'lr': 0.0004358898886757933, 'samples': 6037504, 'steps': 11791, 'batch_loss/train': 0.8187879333272576} 12/24/2021 12:46:45 - INFO - codeparrot_training - Step 11792: {'lr': 0.00043587934797579793, 'samples': 6038016, 'steps': 11792, 'batch_loss/train': 0.8068176822271198} 12/24/2021 12:46:56 - INFO - codeparrot_training - Step 11793: {'lr': 0.0004358688065368165, 'samples': 6038528, 'steps': 11793, 'batch_loss/train': 0.8482115548104048} 12/24/2021 12:47:06 - INFO - codeparrot_training - Step 11794: {'lr': 0.00043585826435889086, 'samples': 6039040, 'steps': 11794, 'batch_loss/train': 0.9011016711592674} 12/24/2021 12:47:19 - INFO - codeparrot_training - Step 11795: {'lr': 0.000435847721442063, 'samples': 6039552, 'steps': 11795, 'batch_loss/train': 0.8579520303755999} 12/24/2021 12:47:29 - INFO - 
codeparrot_training - Step 11796: {'lr': 0.00043583717778637466, 'samples': 6040064, 'steps': 11796, 'batch_loss/train': 0.847842057235539} 12/24/2021 12:47:40 - INFO - codeparrot_training - Step 11797: {'lr': 0.0004358266333918679, 'samples': 6040576, 'steps': 11797, 'batch_loss/train': 0.9169369582086802} 12/24/2021 12:47:52 - INFO - codeparrot_training - Step 11798: {'lr': 0.00043581608825858463, 'samples': 6041088, 'steps': 11798, 'batch_loss/train': 0.8589324569329619} 12/24/2021 12:48:03 - INFO - codeparrot_training - Step 11799: {'lr': 0.0004358055423865668, 'samples': 6041600, 'steps': 11799, 'batch_loss/train': 0.7989592109806836} 12/24/2021 12:48:13 - INFO - codeparrot_training - Step 11800: {'lr': 0.00043579499577585623, 'samples': 6042112, 'steps': 11800, 'batch_loss/train': 0.9526096731424332} 12/24/2021 12:48:24 - INFO - codeparrot_training - Step 11801: {'lr': 0.0004357844484264949, 'samples': 6042624, 'steps': 11801, 'batch_loss/train': 0.8616102701053023} 12/24/2021 12:48:38 - INFO - codeparrot_training - Step 11802: {'lr': 0.0004357739003385248, 'samples': 6043136, 'steps': 11802, 'batch_loss/train': 0.8860848830081522} 12/24/2021 12:48:48 - INFO - codeparrot_training - Step 11803: {'lr': 0.0004357633515119878, 'samples': 6043648, 'steps': 11803, 'batch_loss/train': 0.7833988219499588} 12/24/2021 12:48:59 - INFO - codeparrot_training - Step 11804: {'lr': 0.0004357528019469259, 'samples': 6044160, 'steps': 11804, 'batch_loss/train': 0.7641990631818771} 12/24/2021 12:49:11 - INFO - codeparrot_training - Step 11805: {'lr': 0.0004357422516433809, 'samples': 6044672, 'steps': 11805, 'batch_loss/train': 0.7949884841218591} 12/24/2021 12:49:22 - INFO - codeparrot_training - Step 11806: {'lr': 0.00043573170060139495, 'samples': 6045184, 'steps': 11806, 'batch_loss/train': 0.7862296181265265} 12/24/2021 12:49:32 - INFO - codeparrot_training - Step 11807: {'lr': 0.00043572114882100985, 'samples': 6045696, 'steps': 11807, 'batch_loss/train': 
0.7172277169302106} 12/24/2021 12:49:44 - INFO - codeparrot_training - Step 11808: {'lr': 0.00043571059630226756, 'samples': 6046208, 'steps': 11808, 'batch_loss/train': 0.8426872864365578} 12/24/2021 12:49:55 - INFO - codeparrot_training - Step 11809: {'lr': 0.00043570004304521005, 'samples': 6046720, 'steps': 11809, 'batch_loss/train': 0.8818366227205843} 12/24/2021 12:50:06 - INFO - codeparrot_training - Step 11810: {'lr': 0.0004356894890498794, 'samples': 6047232, 'steps': 11810, 'batch_loss/train': 0.6523339636623859} 12/24/2021 12:50:19 - INFO - codeparrot_training - Step 11811: {'lr': 0.00043567893431631735, 'samples': 6047744, 'steps': 11811, 'batch_loss/train': 0.6968432124704123} 12/24/2021 12:50:30 - INFO - codeparrot_training - Step 11812: {'lr': 0.00043566837884456596, 'samples': 6048256, 'steps': 11812, 'batch_loss/train': 0.7675347300246358} 12/24/2021 12:50:41 - INFO - codeparrot_training - Step 11813: {'lr': 0.0004356578226346672, 'samples': 6048768, 'steps': 11813, 'batch_loss/train': 0.7798335617408156} 12/24/2021 12:50:51 - INFO - codeparrot_training - Step 11814: {'lr': 0.00043564726568666304, 'samples': 6049280, 'steps': 11814, 'batch_loss/train': 0.9543699948117137} 12/24/2021 12:51:03 - INFO - codeparrot_training - Step 11815: {'lr': 0.00043563670800059553, 'samples': 6049792, 'steps': 11815, 'batch_loss/train': 0.7953560976311564} 12/24/2021 12:51:14 - INFO - codeparrot_training - Step 11816: {'lr': 0.0004356261495765065, 'samples': 6050304, 'steps': 11816, 'batch_loss/train': 0.8544840160757303} 12/24/2021 12:51:24 - INFO - codeparrot_training - Step 11817: {'lr': 0.00043561559041443797, 'samples': 6050816, 'steps': 11817, 'batch_loss/train': 0.7850363631732762} 12/24/2021 12:51:41 - INFO - codeparrot_training - Step 11818: {'lr': 0.00043560503051443195, 'samples': 6051328, 'steps': 11818, 'batch_loss/train': 0.7316721491515636} 12/24/2021 12:51:52 - INFO - codeparrot_training - Step 11819: {'lr': 0.0004355944698765304, 'samples': 6051840, 
'steps': 11819, 'batch_loss/train': 0.8317349464632571} 12/24/2021 12:52:02 - INFO - codeparrot_training - Step 11820: {'lr': 0.00043558390850077534, 'samples': 6052352, 'steps': 11820, 'batch_loss/train': 0.8591380091384053} 12/24/2021 12:52:14 - INFO - codeparrot_training - Step 11821: {'lr': 0.00043557334638720876, 'samples': 6052864, 'steps': 11821, 'batch_loss/train': 0.7983113322407007} 12/24/2021 12:52:25 - INFO - codeparrot_training - Step 11822: {'lr': 0.0004355627835358726, 'samples': 6053376, 'steps': 11822, 'batch_loss/train': 0.8416651161387563} 12/24/2021 12:52:36 - INFO - codeparrot_training - Step 11823: {'lr': 0.00043555221994680883, 'samples': 6053888, 'steps': 11823, 'batch_loss/train': 0.7899772003293037} 12/24/2021 12:52:46 - INFO - codeparrot_training - Step 11824: {'lr': 0.00043554165562005955, 'samples': 6054400, 'steps': 11824, 'batch_loss/train': 0.8370476737618446} 12/24/2021 12:52:58 - INFO - codeparrot_training - Step 11825: {'lr': 0.00043553109055566677, 'samples': 6054912, 'steps': 11825, 'batch_loss/train': 0.809784384444356} 12/24/2021 12:53:09 - INFO - codeparrot_training - Step 11826: {'lr': 0.0004355205247536723, 'samples': 6055424, 'steps': 11826, 'batch_loss/train': 0.923926368355751} 12/24/2021 12:53:19 - INFO - codeparrot_training - Step 11827: {'lr': 0.00043550995821411834, 'samples': 6055936, 'steps': 11827, 'batch_loss/train': 0.8641903512179852} 12/24/2021 12:53:33 - INFO - codeparrot_training - Step 11828: {'lr': 0.00043549939093704684, 'samples': 6056448, 'steps': 11828, 'batch_loss/train': 0.9117903104051948} 12/24/2021 12:53:44 - INFO - codeparrot_training - Step 11829: {'lr': 0.0004354888229224997, 'samples': 6056960, 'steps': 11829, 'batch_loss/train': 0.6719609531573951} 12/24/2021 12:53:55 - INFO - codeparrot_training - Step 11830: {'lr': 0.00043547825417051915, 'samples': 6057472, 'steps': 11830, 'batch_loss/train': 0.7593609630130231} 12/24/2021 12:54:07 - INFO - codeparrot_training - Step 11831: {'lr': 
0.00043546768468114705, 'samples': 6057984, 'steps': 11831, 'batch_loss/train': 0.6688989661633968} 12/24/2021 12:54:17 - INFO - codeparrot_training - Step 11832: {'lr': 0.00043545711445442547, 'samples': 6058496, 'steps': 11832, 'batch_loss/train': 0.7868369240313768} 12/24/2021 12:54:28 - INFO - codeparrot_training - Step 11833: {'lr': 0.0004354465434903965, 'samples': 6059008, 'steps': 11833, 'batch_loss/train': 0.8628587340936065} 12/24/2021 12:54:40 - INFO - codeparrot_training - Step 11834: {'lr': 0.000435435971789102, 'samples': 6059520, 'steps': 11834, 'batch_loss/train': 0.8213010798208416} 12/24/2021 12:54:51 - INFO - codeparrot_training - Step 11835: {'lr': 0.0004354253993505841, 'samples': 6060032, 'steps': 11835, 'batch_loss/train': 0.8403674885630608} 12/24/2021 12:55:01 - INFO - codeparrot_training - Step 11836: {'lr': 0.0004354148261748849, 'samples': 6060544, 'steps': 11836, 'batch_loss/train': 0.8825730895623565} 12/24/2021 12:55:12 - INFO - codeparrot_training - Step 11837: {'lr': 0.0004354042522620463, 'samples': 6061056, 'steps': 11837, 'batch_loss/train': 0.8562242102343589} 12/24/2021 12:55:24 - INFO - codeparrot_training - Step 11838: {'lr': 0.0004353936776121104, 'samples': 6061568, 'steps': 11838, 'batch_loss/train': 0.9402781333774328} 12/24/2021 12:55:34 - INFO - codeparrot_training - Step 11839: {'lr': 0.0004353831022251193, 'samples': 6062080, 'steps': 11839, 'batch_loss/train': 0.8214693833142519} 12/24/2021 12:55:45 - INFO - codeparrot_training - Step 11840: {'lr': 0.0004353725261011149, 'samples': 6062592, 'steps': 11840, 'batch_loss/train': 0.8373309238813818} 12/24/2021 12:55:59 - INFO - codeparrot_training - Step 11841: {'lr': 0.0004353619492401394, 'samples': 6063104, 'steps': 11841, 'batch_loss/train': 0.8669061791151762} 12/24/2021 12:56:10 - INFO - codeparrot_training - Step 11842: {'lr': 0.00043535137164223473, 'samples': 6063616, 'steps': 11842, 'batch_loss/train': 0.8766533127054572} 12/24/2021 12:56:20 - INFO - 
codeparrot_training - Step 11843: {'lr': 0.00043534079330744305, 'samples': 6064128, 'steps': 11843, 'batch_loss/train': 0.7981370706111193} 12/24/2021 12:56:32 - INFO - codeparrot_training - Step 11844: {'lr': 0.0004353302142358063, 'samples': 6064640, 'steps': 11844, 'batch_loss/train': 0.9849724778905511} 12/24/2021 12:56:43 - INFO - codeparrot_training - Step 11845: {'lr': 0.00043531963442736653, 'samples': 6065152, 'steps': 11845, 'batch_loss/train': 0.8458804972469807} 12/24/2021 12:56:53 - INFO - codeparrot_training - Step 11846: {'lr': 0.000435309053882166, 'samples': 6065664, 'steps': 11846, 'batch_loss/train': 1.2453971486538649} 12/24/2021 12:57:04 - INFO - codeparrot_training - Step 11847: {'lr': 0.00043529847260024657, 'samples': 6066176, 'steps': 11847, 'batch_loss/train': 0.8235119588207453} 12/24/2021 12:57:18 - INFO - codeparrot_training - Step 11848: {'lr': 0.00043528789058165044, 'samples': 6066688, 'steps': 11848, 'batch_loss/train': 0.8499731335323304} 12/24/2021 12:57:28 - INFO - codeparrot_training - Step 11849: {'lr': 0.0004352773078264196, 'samples': 6067200, 'steps': 11849, 'batch_loss/train': 0.7830452676862478} 12/24/2021 12:57:39 - INFO - codeparrot_training - Step 11850: {'lr': 0.0004352667243345961, 'samples': 6067712, 'steps': 11850, 'batch_loss/train': 0.8306307368911803} 12/24/2021 12:57:51 - INFO - codeparrot_training - Step 11851: {'lr': 0.00043525614010622206, 'samples': 6068224, 'steps': 11851, 'batch_loss/train': 0.8127089887857437} 12/24/2021 12:58:01 - INFO - codeparrot_training - Step 11852: {'lr': 0.0004352455551413396, 'samples': 6068736, 'steps': 11852, 'batch_loss/train': 0.8241313844919205} 12/24/2021 12:58:12 - INFO - codeparrot_training - Step 11853: {'lr': 0.00043523496943999074, 'samples': 6069248, 'steps': 11853, 'batch_loss/train': 0.8757241507992148} 12/24/2021 12:58:26 - INFO - codeparrot_training - Step 11854: {'lr': 0.00043522438300221757, 'samples': 6069760, 'steps': 11854, 'batch_loss/train': 
0.9087372627109289} 12/24/2021 12:58:37 - INFO - codeparrot_training - Step 11855: {'lr': 0.0004352137958280622, 'samples': 6070272, 'steps': 11855, 'batch_loss/train': 0.8366520344279706} 12/24/2021 12:58:47 - INFO - codeparrot_training - Step 11856: {'lr': 0.00043520320791756674, 'samples': 6070784, 'steps': 11856, 'batch_loss/train': 0.8849855605512857} 12/24/2021 12:58:58 - INFO - codeparrot_training - Step 11857: {'lr': 0.0004351926192707732, 'samples': 6071296, 'steps': 11857, 'batch_loss/train': 0.7886021314188838} 12/24/2021 12:59:10 - INFO - codeparrot_training - Step 11858: {'lr': 0.00043518202988772375, 'samples': 6071808, 'steps': 11858, 'batch_loss/train': 0.6899315748596564} 12/24/2021 12:59:21 - INFO - codeparrot_training - Step 11859: {'lr': 0.0004351714397684605, 'samples': 6072320, 'steps': 11859, 'batch_loss/train': 0.770823429338634} 12/24/2021 12:59:31 - INFO - codeparrot_training - Step 11860: {'lr': 0.00043516084891302546, 'samples': 6072832, 'steps': 11860, 'batch_loss/train': 0.8737191492691636} 12/24/2021 12:59:45 - INFO - codeparrot_training - Step 11861: {'lr': 0.00043515025732146083, 'samples': 6073344, 'steps': 11861, 'batch_loss/train': 0.7886578310281038} 12/24/2021 12:59:56 - INFO - codeparrot_training - Step 11862: {'lr': 0.00043513966499380874, 'samples': 6073856, 'steps': 11862, 'batch_loss/train': 0.7998228432843462} 12/24/2021 13:00:06 - INFO - codeparrot_training - Step 11863: {'lr': 0.0004351290719301112, 'samples': 6074368, 'steps': 11863, 'batch_loss/train': 0.754747124388814} 12/24/2021 13:00:18 - INFO - codeparrot_training - Step 11864: {'lr': 0.0004351184781304104, 'samples': 6074880, 'steps': 11864, 'batch_loss/train': 0.8827817784622312} 12/24/2021 13:00:29 - INFO - codeparrot_training - Step 11865: {'lr': 0.0004351078835947484, 'samples': 6075392, 'steps': 11865, 'batch_loss/train': 0.796937664039433} 12/24/2021 13:00:40 - INFO - codeparrot_training - Step 11866: {'lr': 0.0004350972883231674, 'samples': 6075904, 
'steps': 11866, 'batch_loss/train': 0.7985184844583273} 12/24/2021 13:00:51 - INFO - codeparrot_training - Step 11867: {'lr': 0.00043508669231570943, 'samples': 6076416, 'steps': 11867, 'batch_loss/train': 0.8437642948701978} 12/24/2021 13:01:02 - INFO - codeparrot_training - Step 11868: {'lr': 0.00043507609557241663, 'samples': 6076928, 'steps': 11868, 'batch_loss/train': 0.7176186714787036} 12/24/2021 13:01:13 - INFO - codeparrot_training - Step 11869: {'lr': 0.00043506549809333117, 'samples': 6077440, 'steps': 11869, 'batch_loss/train': 0.8000565832480788} 12/24/2021 13:01:23 - INFO - codeparrot_training - Step 11870: {'lr': 0.0004350548998784952, 'samples': 6077952, 'steps': 11870, 'batch_loss/train': 0.7144257901236415} 12/24/2021 13:01:37 - INFO - codeparrot_training - Step 11871: {'lr': 0.0004350443009279508, 'samples': 6078464, 'steps': 11871, 'batch_loss/train': 0.8540997114032507} 12/24/2021 13:01:48 - INFO - codeparrot_training - Step 11872: {'lr': 0.0004350337012417401, 'samples': 6078976, 'steps': 11872, 'batch_loss/train': 0.783287649974227} 12/24/2021 13:01:58 - INFO - codeparrot_training - Step 11873: {'lr': 0.00043502310081990536, 'samples': 6079488, 'steps': 11873, 'batch_loss/train': 0.8073576886672527} 12/24/2021 13:02:10 - INFO - codeparrot_training - Step 11874: {'lr': 0.00043501249966248855, 'samples': 6080000, 'steps': 11874, 'batch_loss/train': 0.8158600572496653} 12/24/2021 13:02:21 - INFO - codeparrot_training - Step 11875: {'lr': 0.0004350018977695319, 'samples': 6080512, 'steps': 11875, 'batch_loss/train': 0.7651701997965574} 12/24/2021 13:02:32 - INFO - codeparrot_training - Step 11876: {'lr': 0.0004349912951410776, 'samples': 6081024, 'steps': 11876, 'batch_loss/train': 0.7442944445647299} 12/24/2021 13:02:46 - INFO - codeparrot_training - Step 11877: {'lr': 0.0004349806917771677, 'samples': 6081536, 'steps': 11877, 'batch_loss/train': 0.856900162063539} 12/24/2021 13:02:56 - INFO - codeparrot_training - Step 11878: {'lr': 
0.0004349700876778445, 'samples': 6082048, 'steps': 11878, 'batch_loss/train': 0.790679992409423} 12/24/2021 13:03:07 - INFO - codeparrot_training - Step 11879: {'lr': 0.00043495948284315, 'samples': 6082560, 'steps': 11879, 'batch_loss/train': 0.7691604858264327} 12/24/2021 13:03:17 - INFO - codeparrot_training - Step 11880: {'lr': 0.0004349488772731265, 'samples': 6083072, 'steps': 11880, 'batch_loss/train': 0.820023181848228} 12/24/2021 13:03:29 - INFO - codeparrot_training - Step 11881: {'lr': 0.00043493827096781606, 'samples': 6083584, 'steps': 11881, 'batch_loss/train': 0.8148936703801155} 12/24/2021 13:03:40 - INFO - codeparrot_training - Step 11882: {'lr': 0.0004349276639272609, 'samples': 6084096, 'steps': 11882, 'batch_loss/train': 0.8576838592998683} 12/24/2021 13:03:51 - INFO - codeparrot_training - Step 11883: {'lr': 0.0004349170561515032, 'samples': 6084608, 'steps': 11883, 'batch_loss/train': 1.4754521828144789} 12/24/2021 13:04:03 - INFO - codeparrot_training - Step 11884: {'lr': 0.00043490644764058507, 'samples': 6085120, 'steps': 11884, 'batch_loss/train': 0.7798147881403565} 12/24/2021 13:04:13 - INFO - codeparrot_training - Step 11885: {'lr': 0.0004348958383945487, 'samples': 6085632, 'steps': 11885, 'batch_loss/train': 1.0492250602692366} 12/24/2021 13:04:24 - INFO - codeparrot_training - Step 11886: {'lr': 0.0004348852284134363, 'samples': 6086144, 'steps': 11886, 'batch_loss/train': 0.7952179349958897} 12/24/2021 13:04:38 - INFO - codeparrot_training - Step 11887: {'lr': 0.0004348746176972901, 'samples': 6086656, 'steps': 11887, 'batch_loss/train': 0.873935048468411} 12/24/2021 13:04:49 - INFO - codeparrot_training - Step 11888: {'lr': 0.0004348640062461521, 'samples': 6087168, 'steps': 11888, 'batch_loss/train': 0.7692730670096353} 12/24/2021 13:04:59 - INFO - codeparrot_training - Step 11889: {'lr': 0.00043485339406006476, 'samples': 6087680, 'steps': 11889, 'batch_loss/train': 1.5687491530552506} 12/24/2021 13:05:10 - INFO - 
codeparrot_training - Step 11890: {'lr': 0.00043484278113907, 'samples': 6088192, 'steps': 11890, 'batch_loss/train': 0.7712330324575305} 12/24/2021 13:05:22 - INFO - codeparrot_training - Step 11891: {'lr': 0.0004348321674832102, 'samples': 6088704, 'steps': 11891, 'batch_loss/train': 0.7967668662313372} 12/24/2021 13:05:32 - INFO - codeparrot_training - Step 11892: {'lr': 0.00043482155309252746, 'samples': 6089216, 'steps': 11892, 'batch_loss/train': 0.6564927594736218} 12/24/2021 13:05:43 - INFO - codeparrot_training - Step 11893: {'lr': 0.00043481093796706404, 'samples': 6089728, 'steps': 11893, 'batch_loss/train': 0.8709563985466957} 12/24/2021 13:05:55 - INFO - codeparrot_training - Step 11894: {'lr': 0.0004348003221068622, 'samples': 6090240, 'steps': 11894, 'batch_loss/train': 0.8545932294800878} 12/24/2021 13:06:06 - INFO - codeparrot_training - Step 11895: {'lr': 0.0004347897055119639, 'samples': 6090752, 'steps': 11895, 'batch_loss/train': 0.7053077064920217} 12/24/2021 13:06:17 - INFO - codeparrot_training - Step 11896: {'lr': 0.0004347790881824116, 'samples': 6091264, 'steps': 11896, 'batch_loss/train': 0.8131794836372137} 12/24/2021 13:06:29 - INFO - codeparrot_training - Step 11897: {'lr': 0.00043476847011824737, 'samples': 6091776, 'steps': 11897, 'batch_loss/train': 0.8484513682778925} 12/24/2021 13:06:39 - INFO - codeparrot_training - Step 11898: {'lr': 0.0004347578513195135, 'samples': 6092288, 'steps': 11898, 'batch_loss/train': 0.7146672131493688} 12/24/2021 13:06:50 - INFO - codeparrot_training - Step 11899: {'lr': 0.0004347472317862521, 'samples': 6092800, 'steps': 11899, 'batch_loss/train': 0.8771570399403572} 12/24/2021 13:07:04 - INFO - codeparrot_training - Step 11900: {'lr': 0.00043473661151850556, 'samples': 6093312, 'steps': 11900, 'batch_loss/train': 0.7403679378330708} 12/24/2021 13:07:14 - INFO - codeparrot_training - Step 11901: {'lr': 0.000434725990516316, 'samples': 6093824, 'steps': 11901, 'batch_loss/train': 0.7681078761816025} 
12/24/2021 13:07:25 - INFO - codeparrot_training - Step 11902: {'lr': 0.00043471536877972563, 'samples': 6094336, 'steps': 11902, 'batch_loss/train': 0.8531141756102443} 12/24/2021 13:07:35 - INFO - codeparrot_training - Step 11903: {'lr': 0.00043470474630877666, 'samples': 6094848, 'steps': 11903, 'batch_loss/train': 0.800629633013159} 12/24/2021 13:07:48 - INFO - codeparrot_training - Step 11904: {'lr': 0.00043469412310351144, 'samples': 6095360, 'steps': 11904, 'batch_loss/train': 0.7660149495350197} 12/24/2021 13:07:58 - INFO - codeparrot_training - Step 11905: {'lr': 0.000434683499163972, 'samples': 6095872, 'steps': 11905, 'batch_loss/train': 0.9125418420881033} 12/24/2021 13:08:09 - INFO - codeparrot_training - Step 11906: {'lr': 0.0004346728744902008, 'samples': 6096384, 'steps': 11906, 'batch_loss/train': 0.7752507915720344} 12/24/2021 13:08:23 - INFO - codeparrot_training - Step 11907: {'lr': 0.00043466224908223996, 'samples': 6096896, 'steps': 11907, 'batch_loss/train': 1.6098135113716125} 12/24/2021 13:08:33 - INFO - codeparrot_training - Step 11908: {'lr': 0.0004346516229401317, 'samples': 6097408, 'steps': 11908, 'batch_loss/train': 0.7288808301091194} 12/24/2021 13:08:44 - INFO - codeparrot_training - Step 11909: {'lr': 0.0004346409960639184, 'samples': 6097920, 'steps': 11909, 'batch_loss/train': 1.0458652777597308} 12/24/2021 13:08:56 - INFO - codeparrot_training - Step 11910: {'lr': 0.0004346303684536422, 'samples': 6098432, 'steps': 11910, 'batch_loss/train': 0.766910299891606} 12/24/2021 13:09:07 - INFO - codeparrot_training - Step 11911: {'lr': 0.0004346197401093453, 'samples': 6098944, 'steps': 11911, 'batch_loss/train': 1.0224649412557483} 12/24/2021 13:09:17 - INFO - codeparrot_training - Step 11912: {'lr': 0.00043460911103107005, 'samples': 6099456, 'steps': 11912, 'batch_loss/train': 0.8006429113447666} 12/24/2021 13:09:28 - INFO - codeparrot_training - Step 11913: {'lr': 0.0004345984812188587, 'samples': 6099968, 'steps': 11913, 
'batch_loss/train': 0.8322311816737056} 12/24/2021 13:09:40 - INFO - codeparrot_training - Step 11914: {'lr': 0.0004345878506727535, 'samples': 6100480, 'steps': 11914, 'batch_loss/train': 0.7711698925122619} 12/24/2021 13:09:51 - INFO - codeparrot_training - Step 11915: {'lr': 0.00043457721939279664, 'samples': 6100992, 'steps': 11915, 'batch_loss/train': 0.793739520246163} 12/24/2021 13:10:01 - INFO - codeparrot_training - Step 11916: {'lr': 0.00043456658737903056, 'samples': 6101504, 'steps': 11916, 'batch_loss/train': 0.8048275224864483} 12/24/2021 13:10:15 - INFO - codeparrot_training - Step 11917: {'lr': 0.0004345559546314973, 'samples': 6102016, 'steps': 11917, 'batch_loss/train': 0.7713828142732382} 12/24/2021 13:10:26 - INFO - codeparrot_training - Step 11918: {'lr': 0.0004345453211502393, 'samples': 6102528, 'steps': 11918, 'batch_loss/train': 0.9225285742431879} 12/24/2021 13:10:36 - INFO - codeparrot_training - Step 11919: {'lr': 0.0004345346869352988, 'samples': 6103040, 'steps': 11919, 'batch_loss/train': 0.8104855716228485} 12/24/2021 13:10:48 - INFO - codeparrot_training - Step 11920: {'lr': 0.0004345240519867181, 'samples': 6103552, 'steps': 11920, 'batch_loss/train': 0.7976308241486549} 12/24/2021 13:10:59 - INFO - codeparrot_training - Step 11921: {'lr': 0.0004345134163045393, 'samples': 6104064, 'steps': 11921, 'batch_loss/train': 0.8360689696855843} 12/24/2021 13:11:10 - INFO - codeparrot_training - Step 11922: {'lr': 0.0004345027798888049, 'samples': 6104576, 'steps': 11922, 'batch_loss/train': 0.86832404602319} 12/24/2021 13:11:21 - INFO - codeparrot_training - Step 11923: {'lr': 0.0004344921427395572, 'samples': 6105088, 'steps': 11923, 'batch_loss/train': 0.7784860573010519} 12/24/2021 13:11:32 - INFO - codeparrot_training - Step 11924: {'lr': 0.00043448150485683837, 'samples': 6105600, 'steps': 11924, 'batch_loss/train': 0.8819501083344221} 12/24/2021 13:11:43 - INFO - codeparrot_training - Step 11925: {'lr': 0.0004344708662406907, 
'samples': 6106112, 'steps': 11925, 'batch_loss/train': 0.7826263522729278} 12/24/2021 13:11:53 - INFO - codeparrot_training - Step 11926: {'lr': 0.00043446022689115643, 'samples': 6106624, 'steps': 11926, 'batch_loss/train': 0.7835884774103761} 12/24/2021 13:12:07 - INFO - codeparrot_training - Step 11927: {'lr': 0.0004344495868082781, 'samples': 6107136, 'steps': 11927, 'batch_loss/train': 0.8270883425138891} 12/24/2021 13:12:18 - INFO - codeparrot_training - Step 11928: {'lr': 0.00043443894599209774, 'samples': 6107648, 'steps': 11928, 'batch_loss/train': 0.7472119061276317} 12/24/2021 13:12:28 - INFO - codeparrot_training - Step 11929: {'lr': 0.0004344283044426578, 'samples': 6108160, 'steps': 11929, 'batch_loss/train': 0.8436671285890043} 12/24/2021 13:12:40 - INFO - codeparrot_training - Step 11930: {'lr': 0.0004344176621600006, 'samples': 6108672, 'steps': 11930, 'batch_loss/train': 0.7349183559417725} 12/24/2021 13:12:51 - INFO - codeparrot_training - Step 11931: {'lr': 0.0004344070191441684, 'samples': 6109184, 'steps': 11931, 'batch_loss/train': 0.755842974409461} 12/24/2021 13:13:02 - INFO - codeparrot_training - Step 11932: {'lr': 0.0004343963753952035, 'samples': 6109696, 'steps': 11932, 'batch_loss/train': 0.8083255318924785} 12/24/2021 13:13:15 - INFO - codeparrot_training - Step 11933: {'lr': 0.0004343857309131483, 'samples': 6110208, 'steps': 11933, 'batch_loss/train': 0.8076547633390874} 12/24/2021 13:13:26 - INFO - codeparrot_training - Step 11934: {'lr': 0.00043437508569804493, 'samples': 6110720, 'steps': 11934, 'batch_loss/train': 0.7719375404994935} 12/24/2021 13:13:37 - INFO - codeparrot_training - Step 11935: {'lr': 0.00043436443974993597, 'samples': 6111232, 'steps': 11935, 'batch_loss/train': 0.7791567398235202} 12/24/2021 13:13:47 - INFO - codeparrot_training - Step 11936: {'lr': 0.00043435379306886357, 'samples': 6111744, 'steps': 11936, 'batch_loss/train': 0.8651179987937212} 12/24/2021 13:14:00 - INFO - codeparrot_training - Step 
11937: {'lr': 0.00043434314565487007, 'samples': 6112256, 'steps': 11937, 'batch_loss/train': 0.7891220026649535} 12/24/2021 13:14:10 - INFO - codeparrot_training - Step 11938: {'lr': 0.00043433249750799785, 'samples': 6112768, 'steps': 11938, 'batch_loss/train': 0.7257841763785109} 12/24/2021 13:14:21 - INFO - codeparrot_training - Step 11939: {'lr': 0.00043432184862828916, 'samples': 6113280, 'steps': 11939, 'batch_loss/train': 0.8680744739249349} 12/24/2021 13:14:34 - INFO - codeparrot_training - Step 11940: {'lr': 0.00043431119901578643, 'samples': 6113792, 'steps': 11940, 'batch_loss/train': 0.6302207950502634} 12/24/2021 13:14:44 - INFO - codeparrot_training - Step 11941: {'lr': 0.00043430054867053206, 'samples': 6114304, 'steps': 11941, 'batch_loss/train': 1.473327080719173} 12/24/2021 13:14:55 - INFO - codeparrot_training - Step 11942: {'lr': 0.0004342898975925682, 'samples': 6114816, 'steps': 11942, 'batch_loss/train': 0.763594685238786} 12/24/2021 13:15:07 - INFO - codeparrot_training - Step 11943: {'lr': 0.00043427924578193726, 'samples': 6115328, 'steps': 11943, 'batch_loss/train': 0.8341294936835766} 12/24/2021 13:15:17 - INFO - codeparrot_training - Step 11944: {'lr': 0.0004342685932386817, 'samples': 6115840, 'steps': 11944, 'batch_loss/train': 0.8410936510190368} 12/24/2021 13:15:28 - INFO - codeparrot_training - Step 11945: {'lr': 0.0004342579399628438, 'samples': 6116352, 'steps': 11945, 'batch_loss/train': 0.8496141536161304} 12/24/2021 13:15:42 - INFO - codeparrot_training - Step 11946: {'lr': 0.00043424728595446585, 'samples': 6116864, 'steps': 11946, 'batch_loss/train': 0.7380674984306097} 12/24/2021 13:15:53 - INFO - codeparrot_training - Step 11947: {'lr': 0.00043423663121359025, 'samples': 6117376, 'steps': 11947, 'batch_loss/train': 0.851070745382458} 12/24/2021 13:16:03 - INFO - codeparrot_training - Step 11948: {'lr': 0.0004342259757402594, 'samples': 6117888, 'steps': 11948, 'batch_loss/train': 0.7194676836952567} 12/24/2021 13:16:14 - 
INFO - codeparrot_training - Step 11949: {'lr': 0.00043421531953451566, 'samples': 6118400, 'steps': 11949, 'batch_loss/train': 0.8033730061724782} 12/24/2021 13:16:26 - INFO - codeparrot_training - Step 11950: {'lr': 0.00043420466259640135, 'samples': 6118912, 'steps': 11950, 'batch_loss/train': 0.7483339253813028} 12/24/2021 13:16:36 - INFO - codeparrot_training - Step 11951: {'lr': 0.0004341940049259588, 'samples': 6119424, 'steps': 11951, 'batch_loss/train': 0.7880545752122998} 12/24/2021 13:16:47 - INFO - codeparrot_training - Step 11952: {'lr': 0.00043418334652323045, 'samples': 6119936, 'steps': 11952, 'batch_loss/train': 0.6090754705946892} 12/24/2021 13:16:59 - INFO - codeparrot_training - Step 11953: {'lr': 0.00043417268738825865, 'samples': 6120448, 'steps': 11953, 'batch_loss/train': 0.9017784399911761} 12/24/2021 13:17:10 - INFO - codeparrot_training - Step 11954: {'lr': 0.00043416202752108587, 'samples': 6120960, 'steps': 11954, 'batch_loss/train': 0.7862270215991884} 12/24/2021 13:17:20 - INFO - codeparrot_training - Step 11955: {'lr': 0.00043415136692175434, 'samples': 6121472, 'steps': 11955, 'batch_loss/train': 0.6223590908921324} 12/24/2021 13:17:34 - INFO - codeparrot_training - Step 11956: {'lr': 0.0004341407055903065, 'samples': 6121984, 'steps': 11956, 'batch_loss/train': 0.7898448742926121} 12/24/2021 13:17:45 - INFO - codeparrot_training - Step 11957: {'lr': 0.0004341300435267848, 'samples': 6122496, 'steps': 11957, 'batch_loss/train': 0.7708774721249938} 12/24/2021 13:17:55 - INFO - codeparrot_training - Step 11958: {'lr': 0.00043411938073123154, 'samples': 6123008, 'steps': 11958, 'batch_loss/train': 0.7756940061226487} 12/24/2021 13:18:06 - INFO - codeparrot_training - Step 11959: {'lr': 0.0004341087172036892, 'samples': 6123520, 'steps': 11959, 'batch_loss/train': 0.779917165171355} 12/24/2021 13:18:18 - INFO - codeparrot_training - Step 11960: {'lr': 0.0004340980529442, 'samples': 6124032, 'steps': 11960, 'batch_loss/train': 
0.7812254833988845} 12/24/2021 13:18:29 - INFO - codeparrot_training - Step 11961: {'lr': 0.0004340873879528065, 'samples': 6124544, 'steps': 11961, 'batch_loss/train': 0.8948405412957072} 12/24/2021 13:18:39 - INFO - codeparrot_training - Step 11962: {'lr': 0.0004340767222295511, 'samples': 6125056, 'steps': 11962, 'batch_loss/train': 0.8088585259392858} 12/24/2021 13:18:54 - INFO - codeparrot_training - Step 11963: {'lr': 0.0004340660557744761, 'samples': 6125568, 'steps': 11963, 'batch_loss/train': 0.7808480900712311} 12/24/2021 13:19:04 - INFO - codeparrot_training - Step 11964: {'lr': 0.00043405538858762397, 'samples': 6126080, 'steps': 11964, 'batch_loss/train': 0.8044634833931923} 12/24/2021 13:19:15 - INFO - codeparrot_training - Step 11965: {'lr': 0.0004340447206690371, 'samples': 6126592, 'steps': 11965, 'batch_loss/train': 0.7605924839153886} 12/24/2021 13:19:27 - INFO - codeparrot_training - Step 11966: {'lr': 0.000434034052018758, 'samples': 6127104, 'steps': 11966, 'batch_loss/train': 0.7480822796933353} 12/24/2021 13:19:38 - INFO - codeparrot_training - Step 11967: {'lr': 0.0004340233826368289, 'samples': 6127616, 'steps': 11967, 'batch_loss/train': 0.8587241917848587} 12/24/2021 13:19:48 - INFO - codeparrot_training - Step 11968: {'lr': 0.0004340127125232923, 'samples': 6128128, 'steps': 11968, 'batch_loss/train': 0.6997383637353778} 12/24/2021 13:19:59 - INFO - codeparrot_training - Step 11969: {'lr': 0.00043400204167819075, 'samples': 6128640, 'steps': 11969, 'batch_loss/train': 0.7980838501825929} 12/24/2021 13:20:11 - INFO - codeparrot_training - Step 11970: {'lr': 0.0004339913701015665, 'samples': 6129152, 'steps': 11970, 'batch_loss/train': 0.8090494899079204} 12/24/2021 13:20:21 - INFO - codeparrot_training - Step 11971: {'lr': 0.00043398069779346205, 'samples': 6129664, 'steps': 11971, 'batch_loss/train': 0.8515591421164572} 12/24/2021 13:20:32 - INFO - codeparrot_training - Step 11972: {'lr': 0.0004339700247539198, 'samples': 6130176, 
'steps': 11972, 'batch_loss/train': 0.8308892324566841} 12/24/2021 13:20:44 - INFO - codeparrot_training - Step 11973: {'lr': 0.0004339593509829822, 'samples': 6130688, 'steps': 11973, 'batch_loss/train': 0.8296873979270458} 12/24/2021 13:20:55 - INFO - codeparrot_training - Step 11974: {'lr': 0.00043394867648069164, 'samples': 6131200, 'steps': 11974, 'batch_loss/train': 0.6863043562625535} 12/24/2021 13:21:05 - INFO - codeparrot_training - Step 11975: {'lr': 0.0004339380012470907, 'samples': 6131712, 'steps': 11975, 'batch_loss/train': 0.815657471306622} 12/24/2021 13:21:19 - INFO - codeparrot_training - Step 11976: {'lr': 0.0004339273252822216, 'samples': 6132224, 'steps': 11976, 'batch_loss/train': 0.7670378335751593} 12/24/2021 13:21:30 - INFO - codeparrot_training - Step 11977: {'lr': 0.00043391664858612693, 'samples': 6132736, 'steps': 11977, 'batch_loss/train': 0.8033824423328042} 12/24/2021 13:21:40 - INFO - codeparrot_training - Step 11978: {'lr': 0.0004339059711588491, 'samples': 6133248, 'steps': 11978, 'batch_loss/train': 0.7699840143322945} 12/24/2021 13:21:53 - INFO - codeparrot_training - Step 11979: {'lr': 0.0004338952930004306, 'samples': 6133760, 'steps': 11979, 'batch_loss/train': 0.8482827628031373} 12/24/2021 13:22:03 - INFO - codeparrot_training - Step 11980: {'lr': 0.00043388461411091397, 'samples': 6134272, 'steps': 11980, 'batch_loss/train': 0.7806691909208894} 12/24/2021 13:22:14 - INFO - codeparrot_training - Step 11981: {'lr': 0.00043387393449034143, 'samples': 6134784, 'steps': 11981, 'batch_loss/train': 0.7692093548830599} 12/24/2021 13:22:25 - INFO - codeparrot_training - Step 11982: {'lr': 0.0004338632541387555, 'samples': 6135296, 'steps': 11982, 'batch_loss/train': 0.6906398597639054} 12/24/2021 13:22:37 - INFO - codeparrot_training - Step 11983: {'lr': 0.0004338525730561987, 'samples': 6135808, 'steps': 11983, 'batch_loss/train': 0.8967654390726238} 12/24/2021 13:22:47 - INFO - codeparrot_training - Step 11984: {'lr': 
0.00043384189124271355, 'samples': 6136320, 'steps': 11984, 'batch_loss/train': 0.8247928433120251} 12/24/2021 13:22:58 - INFO - codeparrot_training - Step 11985: {'lr': 0.0004338312086983425, 'samples': 6136832, 'steps': 11985, 'batch_loss/train': 0.6573777862358838} 12/24/2021 13:23:12 - INFO - codeparrot_training - Step 11986: {'lr': 0.0004338205254231279, 'samples': 6137344, 'steps': 11986, 'batch_loss/train': 0.7345309427473694} 12/24/2021 13:23:22 - INFO - codeparrot_training - Step 11987: {'lr': 0.0004338098414171123, 'samples': 6137856, 'steps': 11987, 'batch_loss/train': 0.8202324188314378} 12/24/2021 13:23:33 - INFO - codeparrot_training - Step 11988: {'lr': 0.0004337991566803382, 'samples': 6138368, 'steps': 11988, 'batch_loss/train': 0.7970065101981163} 12/24/2021 13:23:45 - INFO - codeparrot_training - Step 11989: {'lr': 0.00043378847121284804, 'samples': 6138880, 'steps': 11989, 'batch_loss/train': 0.8772339336574078} 12/24/2021 13:23:56 - INFO - codeparrot_training - Step 11990: {'lr': 0.0004337777850146843, 'samples': 6139392, 'steps': 11990, 'batch_loss/train': 0.925829271087423} 12/24/2021 13:24:06 - INFO - codeparrot_training - Step 11991: {'lr': 0.00043376709808588956, 'samples': 6139904, 'steps': 11991, 'batch_loss/train': 0.7645116122439504} 12/24/2021 13:24:20 - INFO - codeparrot_training - Step 11992: {'lr': 0.00043375641042650613, 'samples': 6140416, 'steps': 11992, 'batch_loss/train': 0.7492439419729635} 12/24/2021 13:24:31 - INFO - codeparrot_training - Step 11993: {'lr': 0.0004337457220365767, 'samples': 6140928, 'steps': 11993, 'batch_loss/train': 0.7759853042662144} 12/24/2021 13:24:41 - INFO - codeparrot_training - Step 11994: {'lr': 0.0004337350329161436, 'samples': 6141440, 'steps': 11994, 'batch_loss/train': 0.7987184313242324} 12/24/2021 13:24:52 - INFO - codeparrot_training - Step 11995: {'lr': 0.0004337243430652493, 'samples': 6141952, 'steps': 11995, 'batch_loss/train': 0.7321375866304152} 12/24/2021 13:25:04 - INFO - 
codeparrot_training - Step 11996: {'lr': 0.0004337136524839365, 'samples': 6142464, 'steps': 11996, 'batch_loss/train': 0.9309822128270753} 12/24/2021 13:25:15 - INFO - codeparrot_training - Step 11997: {'lr': 0.00043370296117224754, 'samples': 6142976, 'steps': 11997, 'batch_loss/train': 0.9136658385396004} 12/24/2021 13:25:25 - INFO - codeparrot_training - Step 11998: {'lr': 0.00043369226913022495, 'samples': 6143488, 'steps': 11998, 'batch_loss/train': 0.7793627576902509} 12/24/2021 13:25:38 - INFO - codeparrot_training - Step 11999: {'lr': 0.0004336815763579113, 'samples': 6144000, 'steps': 11999, 'batch_loss/train': 0.7981039644801058} 12/24/2021 13:25:48 - INFO - codeparrot_training - Step 12000: {'lr': 0.00043367088285534903, 'samples': 6144512, 'steps': 12000, 'batch_loss/train': 0.8025233270600438} 12/24/2021 13:25:59 - INFO - codeparrot_training - Step 12001: {'lr': 0.0004336601886225807, 'samples': 6145024, 'steps': 12001, 'batch_loss/train': 0.8558838004246354} 12/24/2021 13:26:12 - INFO - codeparrot_training - Step 12002: {'lr': 0.0004336494936596488, 'samples': 6145536, 'steps': 12002, 'batch_loss/train': 0.8207219010218978} 12/24/2021 13:26:23 - INFO - codeparrot_training - Step 12003: {'lr': 0.0004336387979665959, 'samples': 6146048, 'steps': 12003, 'batch_loss/train': 0.7779877954162657} 12/24/2021 13:26:34 - INFO - codeparrot_training - Step 12004: {'lr': 0.0004336281015434644, 'samples': 6146560, 'steps': 12004, 'batch_loss/train': 0.7787324109231122} 12/24/2021 13:26:45 - INFO - codeparrot_training - Step 12005: {'lr': 0.00043361740439029693, 'samples': 6147072, 'steps': 12005, 'batch_loss/train': 0.8745403653010726} 12/24/2021 13:26:57 - INFO - codeparrot_training - Step 12006: {'lr': 0.000433606706507136, 'samples': 6147584, 'steps': 12006, 'batch_loss/train': 0.7967035854235291} 12/24/2021 13:27:07 - INFO - codeparrot_training - Step 12007: {'lr': 0.00043359600789402415, 'samples': 6148096, 'steps': 12007, 'batch_loss/train': 
0.747179362224415} 12/24/2021 13:27:18 - INFO - codeparrot_training - Step 12008: {'lr': 0.00043358530855100387, 'samples': 6148608, 'steps': 12008, 'batch_loss/train': 0.8402700703591108} 12/24/2021 13:27:32 - INFO - codeparrot_training - Step 12009: {'lr': 0.00043357460847811777, 'samples': 6149120, 'steps': 12009, 'batch_loss/train': 0.8399537745863199} 12/24/2021 13:27:43 - INFO - codeparrot_training - Step 12010: {'lr': 0.0004335639076754083, 'samples': 6149632, 'steps': 12010, 'batch_loss/train': 0.8270249776542187} 12/24/2021 13:27:53 - INFO - codeparrot_training - Step 12011: {'lr': 0.00043355320614291804, 'samples': 6150144, 'steps': 12011, 'batch_loss/train': 0.8013569566537626} 12/24/2021 13:28:05 - INFO - codeparrot_training - Step 12012: {'lr': 0.0004335425038806895, 'samples': 6150656, 'steps': 12012, 'batch_loss/train': 1.0109678893350065} 12/24/2021 13:28:16 - INFO - codeparrot_training - Step 12013: {'lr': 0.0004335318008887653, 'samples': 6151168, 'steps': 12013, 'batch_loss/train': 0.928726676851511} 12/24/2021 13:28:27 - INFO - codeparrot_training - Step 12014: {'lr': 0.000433521097167188, 'samples': 6151680, 'steps': 12014, 'batch_loss/train': 0.7806793497875333} 12/24/2021 13:28:39 - INFO - codeparrot_training - Step 12015: {'lr': 0.00043351039271600014, 'samples': 6152192, 'steps': 12015, 'batch_loss/train': 0.6586292001884431} 12/24/2021 13:28:49 - INFO - codeparrot_training - Step 12016: {'lr': 0.0004334996875352442, 'samples': 6152704, 'steps': 12016, 'batch_loss/train': 0.8668024986982346} 12/24/2021 13:29:00 - INFO - codeparrot_training - Step 12017: {'lr': 0.00043348898162496273, 'samples': 6153216, 'steps': 12017, 'batch_loss/train': 0.8686914062127471} 12/24/2021 13:29:13 - INFO - codeparrot_training - Step 12018: {'lr': 0.0004334782749851984, 'samples': 6153728, 'steps': 12018, 'batch_loss/train': 0.8442731876857579} 12/24/2021 13:29:24 - INFO - codeparrot_training - Step 12019: {'lr': 0.0004334675676159937, 'samples': 6154240, 
'steps': 12019, 'batch_loss/train': 1.5195806212723255} 12/24/2021 13:29:34 - INFO - codeparrot_training - Step 12020: {'lr': 0.0004334568595173913, 'samples': 6154752, 'steps': 12020, 'batch_loss/train': 0.6478023232193664} 12/24/2021 13:29:45 - INFO - codeparrot_training - Step 12021: {'lr': 0.0004334461506894336, 'samples': 6155264, 'steps': 12021, 'batch_loss/train': 0.8636379218660295} 12/24/2021 13:29:59 - INFO - codeparrot_training - Step 12022: {'lr': 0.0004334354411321634, 'samples': 6155776, 'steps': 12022, 'batch_loss/train': 0.9001597724854946} 12/24/2021 13:30:09 - INFO - codeparrot_training - Step 12023: {'lr': 0.0004334247308456231, 'samples': 6156288, 'steps': 12023, 'batch_loss/train': 0.895394274033606} 12/24/2021 13:30:20 - INFO - codeparrot_training - Step 12024: {'lr': 0.0004334140198298553, 'samples': 6156800, 'steps': 12024, 'batch_loss/train': 0.7955957166850567} 12/24/2021 13:30:32 - INFO - codeparrot_training - Step 12025: {'lr': 0.0004334033080849026, 'samples': 6157312, 'steps': 12025, 'batch_loss/train': 0.8007966075092554} 12/24/2021 13:30:43 - INFO - codeparrot_training - Step 12026: {'lr': 0.0004333925956108076, 'samples': 6157824, 'steps': 12026, 'batch_loss/train': 0.5938624573173001} 12/24/2021 13:30:53 - INFO - codeparrot_training - Step 12027: {'lr': 0.0004333818824076129, 'samples': 6158336, 'steps': 12027, 'batch_loss/train': 0.9980204682797194} 12/24/2021 13:31:04 - INFO - codeparrot_training - Step 12028: {'lr': 0.00043337116847536104, 'samples': 6158848, 'steps': 12028, 'batch_loss/train': 0.8174608172848821} 12/24/2021 13:31:16 - INFO - codeparrot_training - Step 12029: {'lr': 0.00043336045381409474, 'samples': 6159360, 'steps': 12029, 'batch_loss/train': 0.8341906759887934} 12/24/2021 13:31:27 - INFO - codeparrot_training - Step 12030: {'lr': 0.0004333497384238564, 'samples': 6159872, 'steps': 12030, 'batch_loss/train': 0.8025997444055974} 12/24/2021 13:31:37 - INFO - codeparrot_training - Step 12031: {'lr': 
0.0004333390223046888, 'samples': 6160384, 'steps': 12031, 'batch_loss/train': 0.7785857068374753} 12/24/2021 13:31:51 - INFO - codeparrot_training - Step 12032: {'lr': 0.0004333283054566345, 'samples': 6160896, 'steps': 12032, 'batch_loss/train': 0.8373321760445833} 12/24/2021 13:32:02 - INFO - codeparrot_training - Step 12033: {'lr': 0.000433317587879736, 'samples': 6161408, 'steps': 12033, 'batch_loss/train': 0.8255718215368688} 12/24/2021 13:32:12 - INFO - codeparrot_training - Step 12034: {'lr': 0.0004333068695740361, 'samples': 6161920, 'steps': 12034, 'batch_loss/train': 0.9039104143157601} 12/24/2021 13:32:25 - INFO - codeparrot_training - Step 12035: {'lr': 0.0004332961505395772, 'samples': 6162432, 'steps': 12035, 'batch_loss/train': 0.7146047961432487} 12/24/2021 13:32:35 - INFO - codeparrot_training - Step 12036: {'lr': 0.000433285430776402, 'samples': 6162944, 'steps': 12036, 'batch_loss/train': 1.0177789935842156} 12/24/2021 13:32:46 - INFO - codeparrot_training - Step 12037: {'lr': 0.0004332747102845532, 'samples': 6163456, 'steps': 12037, 'batch_loss/train': 0.8485936732031405} 12/24/2021 13:32:58 - INFO - codeparrot_training - Step 12038: {'lr': 0.00043326398906407335, 'samples': 6163968, 'steps': 12038, 'batch_loss/train': 0.798572460655123} 12/24/2021 13:33:08 - INFO - codeparrot_training - Step 12039: {'lr': 0.000433253267115005, 'samples': 6164480, 'steps': 12039, 'batch_loss/train': 0.804407762363553} 12/24/2021 13:33:19 - INFO - codeparrot_training - Step 12040: {'lr': 0.000433242544437391, 'samples': 6164992, 'steps': 12040, 'batch_loss/train': 0.7167921075597405} 12/24/2021 13:33:30 - INFO - codeparrot_training - Step 12041: {'lr': 0.0004332318210312737, 'samples': 6165504, 'steps': 12041, 'batch_loss/train': 0.6646615361096337} 12/24/2021 13:33:44 - INFO - codeparrot_training - Step 12042: {'lr': 0.0004332210968966959, 'samples': 6166016, 'steps': 12042, 'batch_loss/train': 0.6714208922348917} 12/24/2021 13:33:54 - INFO - 
codeparrot_training - Step 12043: {'lr': 0.0004332103720337003, 'samples': 6166528, 'steps': 12043, 'batch_loss/train': 0.8136640929151326} 12/24/2021 13:34:05 - INFO - codeparrot_training - Step 12044: {'lr': 0.0004331996464423293, 'samples': 6167040, 'steps': 12044, 'batch_loss/train': 0.7752528917044401} 12/24/2021 13:34:17 - INFO - codeparrot_training - Step 12045: {'lr': 0.00043318892012262576, 'samples': 6167552, 'steps': 12045, 'batch_loss/train': 0.8303656941279769} 12/24/2021 13:34:28 - INFO - codeparrot_training - Step 12046: {'lr': 0.0004331781930746322, 'samples': 6168064, 'steps': 12046, 'batch_loss/train': 0.7861363771371543} 12/24/2021 13:34:38 - INFO - codeparrot_training - Step 12047: {'lr': 0.00043316746529839127, 'samples': 6168576, 'steps': 12047, 'batch_loss/train': 0.8319925339892507} 12/24/2021 13:34:52 - INFO - codeparrot_training - Step 12048: {'lr': 0.00043315673679394576, 'samples': 6169088, 'steps': 12048, 'batch_loss/train': 0.7758658528327942} 12/24/2021 13:35:03 - INFO - codeparrot_training - Step 12049: {'lr': 0.00043314600756133814, 'samples': 6169600, 'steps': 12049, 'batch_loss/train': 0.8196838963776827} 12/24/2021 13:35:13 - INFO - codeparrot_training - Step 12050: {'lr': 0.00043313527760061125, 'samples': 6170112, 'steps': 12050, 'batch_loss/train': 0.686512092128396} 12/24/2021 13:35:25 - INFO - codeparrot_training - Step 12051: {'lr': 0.0004331245469118075, 'samples': 6170624, 'steps': 12051, 'batch_loss/train': 0.7645513825118542} 12/24/2021 13:35:36 - INFO - codeparrot_training - Step 12052: {'lr': 0.00043311381549496973, 'samples': 6171136, 'steps': 12052, 'batch_loss/train': 0.6990576835814863} 12/24/2021 13:35:47 - INFO - codeparrot_training - Step 12053: {'lr': 0.0004331030833501407, 'samples': 6171648, 'steps': 12053, 'batch_loss/train': 0.7606160724535584} 12/24/2021 13:35:57 - INFO - codeparrot_training - Step 12054: {'lr': 0.00043309235047736284, 'samples': 6172160, 'steps': 12054, 'batch_loss/train': 
0.7606904590502381}
12/24/2021 13:36:09 - INFO - codeparrot_training - Step 12055: {'lr': 0.00043308161687667895, 'samples': 6172672, 'steps': 12055, 'batch_loss/train': 0.7545986194163561}
12/24/2021 13:36:20 - INFO - codeparrot_training - Step 12056: {'lr': 0.0004330708825481317, 'samples': 6173184, 'steps': 12056, 'batch_loss/train': 0.769294775556773}
12/24/2021 13:36:31 - INFO - codeparrot_training - Step 12057: {'lr': 0.0004330601474917637, 'samples': 6173696, 'steps': 12057, 'batch_loss/train': 0.7668360415846109}
12/24/2021 13:36:43 - INFO - codeparrot_training - Step 12058: {'lr': 0.00043304941170761767, 'samples': 6174208, 'steps': 12058, 'batch_loss/train': 0.8142498843371868}
12/24/2021 13:36:53 - INFO - codeparrot_training - Step 12059: {'lr': 0.00043303867519573635, 'samples': 6174720, 'steps': 12059, 'batch_loss/train': 0.7863112618215382}
12/24/2021 13:37:04 - INFO - codeparrot_training - Step 12060: {'lr': 0.00043302793795616234, 'samples': 6175232, 'steps': 12060, 'batch_loss/train': 0.7919756253249943}
12/24/2021 13:37:18 - INFO - codeparrot_training - Step 12061: {'lr': 0.0004330171999889384, 'samples': 6175744, 'steps': 12061, 'batch_loss/train': 0.8816065602004528}
12/24/2021 13:37:28 - INFO - codeparrot_training - Step 12062: {'lr': 0.00043300646129410713, 'samples': 6176256, 'steps': 12062, 'batch_loss/train': 0.822160336188972}
12/24/2021 13:37:39 - INFO - codeparrot_training - Step 12063: {'lr': 0.00043299572187171126, 'samples': 6176768, 'steps': 12063, 'batch_loss/train': 0.8527468382380903}
12/24/2021 13:37:51 - INFO - codeparrot_training - Step 12064: {'lr': 0.00043298498172179357, 'samples': 6177280, 'steps': 12064, 'batch_loss/train': 0.7828642521053553}
12/24/2021 13:38:02 - INFO - codeparrot_training - Step 12065: {'lr': 0.00043297424084439665, 'samples': 6177792, 'steps': 12065, 'batch_loss/train': 0.7169225299730897}
12/24/2021 13:38:12 - INFO - codeparrot_training - Step 12066: {'lr': 0.0004329634992395631, 'samples': 6178304, 'steps': 12066, 'batch_loss/train': 0.8987132766051218}
12/24/2021 13:38:23 - INFO - codeparrot_training - Step 12067: {'lr': 0.0004329527569073359, 'samples': 6178816, 'steps': 12067, 'batch_loss/train': 0.7925849677994847}
12/24/2021 13:38:35 - INFO - codeparrot_training - Step 12068: {'lr': 0.00043294201384775765, 'samples': 6179328, 'steps': 12068, 'batch_loss/train': 0.7671774867922068}
12/24/2021 13:38:46 - INFO - codeparrot_training - Step 12069: {'lr': 0.00043293127006087097, 'samples': 6179840, 'steps': 12069, 'batch_loss/train': 0.7923414530232549}
12/24/2021 13:38:56 - INFO - codeparrot_training - Step 12070: {'lr': 0.00043292052554671866, 'samples': 6180352, 'steps': 12070, 'batch_loss/train': 0.7783680772408843}
12/24/2021 13:39:10 - INFO - codeparrot_training - Step 12071: {'lr': 0.0004329097803053434, 'samples': 6180864, 'steps': 12071, 'batch_loss/train': 0.6429271740489639}
12/24/2021 13:39:21 - INFO - codeparrot_training - Step 12072: {'lr': 0.0004328990343367879, 'samples': 6181376, 'steps': 12072, 'batch_loss/train': 0.653667739010416}
12/24/2021 13:39:31 - INFO - codeparrot_training - Step 12073: {'lr': 0.0004328882876410949, 'samples': 6181888, 'steps': 12073, 'batch_loss/train': 0.8491568770259619}
12/24/2021 13:39:44 - INFO - codeparrot_training - Step 12074: {'lr': 0.0004328775402183072, 'samples': 6182400, 'steps': 12074, 'batch_loss/train': 0.78175516333431}
12/24/2021 13:39:54 - INFO - codeparrot_training - Step 12075: {'lr': 0.0004328667920684673, 'samples': 6182912, 'steps': 12075, 'batch_loss/train': 0.8433824861422181}
12/24/2021 13:40:05 - INFO - codeparrot_training - Step 12076: {'lr': 0.0004328560431916182, 'samples': 6183424, 'steps': 12076, 'batch_loss/train': 0.775952888187021}
12/24/2021 13:40:18 - INFO - codeparrot_training - Step 12077: {'lr': 0.00043284529358780246, 'samples': 6183936, 'steps': 12077, 'batch_loss/train': 0.7095570330275223}
12/24/2021 13:40:29 - INFO - codeparrot_training - Step 12078: {'lr': 0.0004328345432570629, 'samples': 6184448, 'steps': 12078, 'batch_loss/train': 0.8503950252197683}
12/24/2021 13:40:40 - INFO - codeparrot_training - Step 12079: {'lr': 0.0004328237921994422, 'samples': 6184960, 'steps': 12079, 'batch_loss/train': 0.8368097599595785}
12/24/2021 13:40:50 - INFO - codeparrot_training - Step 12080: {'lr': 0.00043281304041498315, 'samples': 6185472, 'steps': 12080, 'batch_loss/train': 0.8028869167901576}
12/24/2021 13:41:02 - INFO - codeparrot_training - Step 12081: {'lr': 0.00043280228790372856, 'samples': 6185984, 'steps': 12081, 'batch_loss/train': 0.7937676021829247}
12/24/2021 13:41:13 - INFO - codeparrot_training - Step 12082: {'lr': 0.00043279153466572096, 'samples': 6186496, 'steps': 12082, 'batch_loss/train': 0.7305656308308244}
12/24/2021 13:41:23 - INFO - codeparrot_training - Step 12083: {'lr': 0.00043278078070100337, 'samples': 6187008, 'steps': 12083, 'batch_loss/train': 0.7962971571832895}
12/24/2021 13:41:36 - INFO - codeparrot_training - Step 12084: {'lr': 0.00043277002600961836, 'samples': 6187520, 'steps': 12084, 'batch_loss/train': 0.857848254032433}
12/24/2021 13:41:46 - INFO - codeparrot_training - Step 12085: {'lr': 0.0004327592705916087, 'samples': 6188032, 'steps': 12085, 'batch_loss/train': 0.8568882327526808}
12/24/2021 13:41:57 - INFO - codeparrot_training - Step 12086: {'lr': 0.0004327485144470171, 'samples': 6188544, 'steps': 12086, 'batch_loss/train': 0.7505155252292752}
12/24/2021 13:42:11 - INFO - codeparrot_training - Step 12087: {'lr': 0.00043273775757588656, 'samples': 6189056, 'steps': 12087, 'batch_loss/train': 0.8033103100024164}
12/24/2021 13:42:21 - INFO - codeparrot_training - Step 12088: {'lr': 0.0004327269999782597, 'samples': 6189568, 'steps': 12088, 'batch_loss/train': 0.7785016032867134}
12/24/2021 13:42:32 - INFO - codeparrot_training - Step 12089: {'lr': 0.00043271624165417923, 'samples': 6190080, 'steps': 12089, 'batch_loss/train': 0.7535198093391955}
12/24/2021 13:42:44 - INFO - codeparrot_training - Step 12090: {'lr': 0.000432705482603688, 'samples': 6190592, 'steps': 12090, 'batch_loss/train': 0.7738839983940125}
12/24/2021 13:42:55 - INFO - codeparrot_training - Step 12091: {'lr': 0.00043269472282682873, 'samples': 6191104, 'steps': 12091, 'batch_loss/train': 0.8585946839302778}
12/24/2021 13:43:06 - INFO - codeparrot_training - Step 12092: {'lr': 0.0004326839623236442, 'samples': 6191616, 'steps': 12092, 'batch_loss/train': 0.6338164906483144}
12/24/2021 13:43:16 - INFO - codeparrot_training - Step 12093: {'lr': 0.0004326732010941773, 'samples': 6192128, 'steps': 12093, 'batch_loss/train': 0.7482505207881331}
12/24/2021 13:43:28 - INFO - codeparrot_training - Step 12094: {'lr': 0.00043266243913847065, 'samples': 6192640, 'steps': 12094, 'batch_loss/train': 0.5795548679307103}
12/24/2021 13:43:39 - INFO - codeparrot_training - Step 12095: {'lr': 0.00043265167645656713, 'samples': 6193152, 'steps': 12095, 'batch_loss/train': 0.7760138348676264}
12/24/2021 13:43:49 - INFO - codeparrot_training - Step 12096: {'lr': 0.0004326409130485095, 'samples': 6193664, 'steps': 12096, 'batch_loss/train': 0.8688940703868866}
12/24/2021 13:44:02 - INFO - codeparrot_training - Step 12097: {'lr': 0.0004326301489143406, 'samples': 6194176, 'steps': 12097, 'batch_loss/train': 0.6946710087358952}
12/24/2021 13:44:13 - INFO - codeparrot_training - Step 12098: {'lr': 0.00043261938405410316, 'samples': 6194688, 'steps': 12098, 'batch_loss/train': 0.7707851519808173}
12/24/2021 13:44:23 - INFO - codeparrot_training - Step 12099: {'lr': 0.00043260861846784005, 'samples': 6195200, 'steps': 12099, 'batch_loss/train': 0.7624118267558515}
12/24/2021 13:44:34 - INFO - codeparrot_training - Step 12100: {'lr': 0.0004325978521555939, 'samples': 6195712, 'steps': 12100, 'batch_loss/train': 1.0380412051454186}
12/24/2021 13:44:48 - INFO - codeparrot_training - Step 12101: {'lr': 0.0004325870851174078, 'samples': 6196224, 'steps': 12101, 'batch_loss/train': 0.8064001030288637}
12/24/2021 13:44:59 - INFO - codeparrot_training - Step 12102: {'lr': 0.00043257631735332423, 'samples': 6196736, 'steps': 12102, 'batch_loss/train': 0.8147513065487146}
12/24/2021 13:45:09 - INFO - codeparrot_training - Step 12103: {'lr': 0.00043256554886338626, 'samples': 6197248, 'steps': 12103, 'batch_loss/train': 0.5317398654296994}
12/24/2021 13:45:22 - INFO - codeparrot_training - Step 12104: {'lr': 0.0004325547796476367, 'samples': 6197760, 'steps': 12104, 'batch_loss/train': 0.6718515446409583}
12/24/2021 13:45:32 - INFO - codeparrot_training - Step 12105: {'lr': 0.00043254400970611805, 'samples': 6198272, 'steps': 12105, 'batch_loss/train': 0.8263368664775044}
12/24/2021 13:45:43 - INFO - codeparrot_training - Step 12106: {'lr': 0.00043253323903887347, 'samples': 6198784, 'steps': 12106, 'batch_loss/train': 0.7677881123963743}
12/24/2021 13:45:57 - INFO - codeparrot_training - Step 12107: {'lr': 0.0004325224676459456, 'samples': 6199296, 'steps': 12107, 'batch_loss/train': 0.7561642359942198}
12/24/2021 13:46:07 - INFO - codeparrot_training - Step 12108: {'lr': 0.00043251169552737734, 'samples': 6199808, 'steps': 12108, 'batch_loss/train': 0.901011317037046}
12/24/2021 13:46:18 - INFO - codeparrot_training - Step 12109: {'lr': 0.00043250092268321157, 'samples': 6200320, 'steps': 12109, 'batch_loss/train': 0.8624045215547085}
12/24/2021 13:46:30 - INFO - codeparrot_training - Step 12110: {'lr': 0.0004324901491134909, 'samples': 6200832, 'steps': 12110, 'batch_loss/train': 0.9288313901051879}
12/24/2021 13:46:41 - INFO - codeparrot_training - Step 12111: {'lr': 0.0004324793748182584, 'samples': 6201344, 'steps': 12111, 'batch_loss/train': 0.833576693199575}
12/24/2021 13:46:51 - INFO - codeparrot_training - Step 12112: {'lr': 0.0004324685997975568, 'samples': 6201856, 'steps': 12112, 'batch_loss/train': 0.7055014956276864}
12/24/2021 13:47:02 - INFO - codeparrot_training - Step 12113: {'lr': 0.00043245782405142884, 'samples': 6202368, 'steps': 12113, 'batch_loss/train': 0.6756849898956716}
12/24/2021 13:47:14 - INFO - codeparrot_training - Step 12114: {'lr': 0.0004324470475799176, 'samples': 6202880, 'steps': 12114, 'batch_loss/train': 0.7570183970965445}
12/24/2021 13:47:25 - INFO - codeparrot_training - Step 12115: {'lr': 0.0004324362703830658, 'samples': 6203392, 'steps': 12115, 'batch_loss/train': 0.8445266801863909}
12/24/2021 13:47:35 - INFO - codeparrot_training - Step 12116: {'lr': 0.00043242549246091613, 'samples': 6203904, 'steps': 12116, 'batch_loss/train': 0.787812101887539}
12/24/2021 13:47:49 - INFO - codeparrot_training - Step 12117: {'lr': 0.0004324147138135117, 'samples': 6204416, 'steps': 12117, 'batch_loss/train': 0.7794267679564655}
12/24/2021 13:48:00 - INFO - codeparrot_training - Step 12118: {'lr': 0.0004324039344408952, 'samples': 6204928, 'steps': 12118, 'batch_loss/train': 0.876647999510169}
12/24/2021 13:48:10 - INFO - codeparrot_training - Step 12119: {'lr': 0.0004323931543431096, 'samples': 6205440, 'steps': 12119, 'batch_loss/train': 0.8246309207752347}
12/24/2021 13:48:23 - INFO - codeparrot_training - Step 12120: {'lr': 0.0004323823735201976, 'samples': 6205952, 'steps': 12120, 'batch_loss/train': 1.1589002273976803}
12/24/2021 13:48:34 - INFO - codeparrot_training - Step 12121: {'lr': 0.00043237159197220213, 'samples': 6206464, 'steps': 12121, 'batch_loss/train': 0.8923396710306406}
12/24/2021 13:48:45 - INFO - codeparrot_training - Step 12122: {'lr': 0.00043236080969916613, 'samples': 6206976, 'steps': 12122, 'batch_loss/train': 0.7662212019786239}
12/24/2021 13:48:55 - INFO - codeparrot_training - Step 12123: {'lr': 0.00043235002670113237, 'samples': 6207488, 'steps': 12123, 'batch_loss/train': 0.7702013852540404}
12/24/2021 13:49:07 - INFO - codeparrot_training - Step 12124: {'lr': 0.0004323392429781438, 'samples': 6208000, 'steps': 12124, 'batch_loss/train': 0.8110847882926464}
12/24/2021 13:49:18 - INFO - codeparrot_training - Step 12125: {'lr': 0.00043232845853024323, 'samples': 6208512, 'steps': 12125, 'batch_loss/train': 0.8448348864912987}
12/24/2021 13:49:28 - INFO - codeparrot_training - Step 12126: {'lr': 0.0004323176733574735, 'samples': 6209024, 'steps': 12126, 'batch_loss/train': 0.7583858696743846}
12/24/2021 13:49:40 - INFO - codeparrot_training - Step 12127: {'lr': 0.0004323068874598777, 'samples': 6209536, 'steps': 12127, 'batch_loss/train': 0.6979982014745474}
12/24/2021 13:49:51 - INFO - codeparrot_training - Step 12128: {'lr': 0.0004322961008374984, 'samples': 6210048, 'steps': 12128, 'batch_loss/train': 0.7727275239303708}
12/24/2021 13:50:02 - INFO - codeparrot_training - Step 12129: {'lr': 0.00043228531349037865, 'samples': 6210560, 'steps': 12129, 'batch_loss/train': 0.8180572707206011}
12/24/2021 13:50:16 - INFO - codeparrot_training - Step 12130: {'lr': 0.0004322745254185614, 'samples': 6211072, 'steps': 12130, 'batch_loss/train': 0.7980290343984962}
12/24/2021 13:50:26 - INFO - codeparrot_training - Step 12131: {'lr': 0.0004322637366220894, 'samples': 6211584, 'steps': 12131, 'batch_loss/train': 0.7319812389323488}
12/24/2021 13:50:37 - INFO - codeparrot_training - Step 12132: {'lr': 0.0004322529471010056, 'samples': 6212096, 'steps': 12132, 'batch_loss/train': 0.7053784416057169}
12/24/2021 13:50:48 - INFO - codeparrot_training - Step 12133: {'lr': 0.00043224215685535287, 'samples': 6212608, 'steps': 12133, 'batch_loss/train': 0.840707965195179}
12/24/2021 13:51:00 - INFO - codeparrot_training - Step 12134: {'lr': 0.0004322313658851742, 'samples': 6213120, 'steps': 12134, 'batch_loss/train': 0.9273670921102166}
12/24/2021 13:51:10 - INFO - codeparrot_training - Step 12135: {'lr': 0.00043222057419051244, 'samples': 6213632, 'steps': 12135, 'batch_loss/train': 0.852154447697103}
12/24/2021 13:51:21 - INFO - codeparrot_training - Step 12136: {'lr': 0.0004322097817714104, 'samples': 6214144, 'steps': 12136, 'batch_loss/train': 0.7650745860300958}
12/24/2021 13:51:35 - INFO - codeparrot_training - Step 12137: {'lr': 0.0004321989886279111, 'samples': 6214656, 'steps': 12137, 'batch_loss/train': 0.9048011135309935}
12/24/2021 13:51:46 - INFO - codeparrot_training - Step 12138: {'lr': 0.00043218819476005744, 'samples': 6215168, 'steps': 12138, 'batch_loss/train': 0.7998532392084599}
12/24/2021 13:51:56 - INFO - codeparrot_training - Step 12139: {'lr': 0.0004321774001678923, 'samples': 6215680, 'steps': 12139, 'batch_loss/train': 0.8518052687868476}
12/24/2021 13:52:08 - INFO - codeparrot_training - Step 12140: {'lr': 0.00043216660485145863, 'samples': 6216192, 'steps': 12140, 'batch_loss/train': 0.736466329661198}
12/24/2021 13:52:19 - INFO - codeparrot_training - Step 12141: {'lr': 0.0004321558088107993, 'samples': 6216704, 'steps': 12141, 'batch_loss/train': 0.5886839171871543}
12/24/2021 13:52:30 - INFO - codeparrot_training - Step 12142: {'lr': 0.00043214501204595727, 'samples': 6217216, 'steps': 12142, 'batch_loss/train': 0.7395176461432129}
12/24/2021 13:52:40 - INFO - codeparrot_training - Step 12143: {'lr': 0.0004321342145569754, 'samples': 6217728, 'steps': 12143, 'batch_loss/train': 0.8438049992546439}
12/24/2021 13:52:52 - INFO - codeparrot_training - Step 12144: {'lr': 0.0004321234163438967, 'samples': 6218240, 'steps': 12144, 'batch_loss/train': 0.6892033943440765}
12/24/2021 13:53:03 - INFO - codeparrot_training - Step 12145: {'lr': 0.0004321126174067641, 'samples': 6218752, 'steps': 12145, 'batch_loss/train': 0.7757965493947268}
12/24/2021 13:53:14 - INFO - codeparrot_training - Step 12146: {'lr': 0.0004321018177456204, 'samples': 6219264, 'steps': 12146, 'batch_loss/train': 0.5549590727314353}
12/24/2021 13:53:27 - INFO - codeparrot_training - Step 12147: {'lr': 0.0004320910173605087, 'samples': 6219776, 'steps': 12147, 'batch_loss/train': 0.7895357111701742}
12/24/2021 13:53:38 - INFO - codeparrot_training - Step 12148: {'lr': 0.00043208021625147186, 'samples': 6220288, 'steps': 12148, 'batch_loss/train': 0.8499232083559036}
12/24/2021 13:53:49 - INFO - codeparrot_training - Step 12149: {'lr': 0.0004320694144185528, 'samples': 6220800, 'steps': 12149, 'batch_loss/train': 0.8776790266856551}
12/24/2021 13:54:01 - INFO - codeparrot_training - Step 12150: {'lr': 0.0004320586118617945, 'samples': 6221312, 'steps': 12150, 'batch_loss/train': 0.8216724386438727}
12/24/2021 13:54:11 - INFO - codeparrot_training - Step 12151: {'lr': 0.0004320478085812399, 'samples': 6221824, 'steps': 12151, 'batch_loss/train': 0.8004731903783977}
12/24/2021 13:54:22 - INFO - codeparrot_training - Step 12152: {'lr': 0.00043203700457693207, 'samples': 6222336, 'steps': 12152, 'batch_loss/train': 0.7993661630898714}
12/24/2021 13:54:34 - INFO - codeparrot_training - Step 12153: {'lr': 0.00043202619984891374, 'samples': 6222848, 'steps': 12153, 'batch_loss/train': 0.776324643753469}
12/24/2021 13:54:45 - INFO - codeparrot_training - Step 12154: {'lr': 0.00043201539439722793, 'samples': 6223360, 'steps': 12154, 'batch_loss/train': 0.7943091122433543}
12/24/2021 13:54:55 - INFO - codeparrot_training - Step 12155: {'lr': 0.0004320045882219177, 'samples': 6223872, 'steps': 12155, 'batch_loss/train': 0.7873107823543251}
12/24/2021 13:55:06 - INFO - codeparrot_training - Step 12156: {'lr': 0.0004319937813230259, 'samples': 6224384, 'steps': 12156, 'batch_loss/train': 0.7097842073999345}
12/24/2021 13:55:20 - INFO - codeparrot_training - Step 12157: {'lr': 0.00043198297370059566, 'samples': 6224896, 'steps': 12157, 'batch_loss/train': 0.7848970030900091}
12/24/2021 13:55:30 - INFO - codeparrot_training - Step 12158: {'lr': 0.0004319721653546697, 'samples': 6225408, 'steps': 12158, 'batch_loss/train': 0.8257901230826974}
12/24/2021 13:55:41 - INFO - codeparrot_training - Step 12159: {'lr': 0.0004319613562852912, 'samples': 6225920, 'steps': 12159, 'batch_loss/train': 0.8684250032529235}
12/24/2021 13:55:53 - INFO - codeparrot_training - Step 12160: {'lr': 0.000431950546492503, 'samples': 6226432, 'steps': 12160, 'batch_loss/train': 0.8824580479413271}
12/24/2021 13:56:04 - INFO - codeparrot_training - Step 12161: {'lr': 0.00043193973597634817, 'samples': 6226944, 'steps': 12161, 'batch_loss/train': 0.9468091176822782}
12/24/2021 13:56:14 - INFO - codeparrot_training - Step 12162: {'lr': 0.00043192892473686964, 'samples': 6227456, 'steps': 12162, 'batch_loss/train': 0.7538204491138458}
12/24/2021 13:56:28 - INFO - codeparrot_training - Step 12163: {'lr': 0.00043191811277411036, 'samples': 6227968, 'steps': 12163, 'batch_loss/train': 0.738327962346375}
12/24/2021 13:56:39 - INFO - codeparrot_training - Step 12164: {'lr': 0.0004319073000881134, 'samples': 6228480, 'steps': 12164, 'batch_loss/train': 0.7511085723526776}
12/24/2021 13:56:49 - INFO - codeparrot_training - Step 12165: {'lr': 0.0004318964866789217, 'samples': 6228992, 'steps': 12165, 'batch_loss/train': 0.835702917072922}
12/24/2021 13:57:01 - INFO - codeparrot_training - Step 12166: {'lr': 0.0004318856725465782, 'samples': 6229504, 'steps': 12166, 'batch_loss/train': 0.7806347164441831}
12/24/2021 13:57:12 - INFO - codeparrot_training - Step 12167: {'lr': 0.000431874857691126, 'samples': 6230016, 'steps': 12167, 'batch_loss/train': 0.8389895586296916}
12/24/2021 13:57:23 - INFO - codeparrot_training - Step 12168: {'lr': 0.000431864042112608, 'samples': 6230528, 'steps': 12168, 'batch_loss/train': 0.793075324036181}
12/24/2021 13:57:33 - INFO - codeparrot_training - Step 12169: {'lr': 0.0004318532258110673, 'samples': 6231040, 'steps': 12169, 'batch_loss/train': 0.7863520615501329}
12/24/2021 13:57:46 - INFO - codeparrot_training - Step 12170: {'lr': 0.0004318424087865468, 'samples': 6231552, 'steps': 12170, 'batch_loss/train': 0.7430710233747959}
12/24/2021 13:57:56 - INFO - codeparrot_training - Step 12171: {'lr': 0.0004318315910390895, 'samples': 6232064, 'steps': 12171, 'batch_loss/train': 0.7546932101249695}
12/24/2021 13:58:07 - INFO - codeparrot_training - Step 12172: {'lr': 0.00043182077256873855, 'samples': 6232576, 'steps': 12172, 'batch_loss/train': 0.8911141697317362}
12/24/2021 13:58:19 - INFO - codeparrot_training - Step 12173: {'lr': 0.00043180995337553673, 'samples': 6233088, 'steps': 12173, 'batch_loss/train': 0.8013013806194067}
12/24/2021 13:58:30 - INFO - codeparrot_training - Step 12174: {'lr': 0.00043179913345952725, 'samples': 6233600, 'steps': 12174, 'batch_loss/train': 0.7833660906180739}
12/24/2021 13:58:41 - INFO - codeparrot_training - Step 12175: {'lr': 0.00043178831282075303, 'samples': 6234112, 'steps': 12175, 'batch_loss/train': 0.7892635436728597}
12/24/2021 13:58:55 - INFO - codeparrot_training - Step 12176: {'lr': 0.00043177749145925713, 'samples': 6234624, 'steps': 12176, 'batch_loss/train': 0.5601776269031689}
12/24/2021 13:59:05 - INFO - codeparrot_training - Step 12177: {'lr': 0.00043176666937508264, 'samples': 6235136, 'steps': 12177, 'batch_loss/train': 0.8713003349257633}
12/24/2021 13:59:16 - INFO - codeparrot_training - Step 12178: {'lr': 0.00043175584656827244, 'samples': 6235648, 'steps': 12178, 'batch_loss/train': 0.8647477524355054}
12/24/2021 13:59:26 - INFO - codeparrot_training - Step 12179: {'lr': 0.0004317450230388696, 'samples': 6236160, 'steps': 12179, 'batch_loss/train': 0.7730991786811501}
12/24/2021 13:59:39 - INFO - codeparrot_training - Step 12180: {'lr': 0.00043173419878691724, 'samples': 6236672, 'steps': 12180, 'batch_loss/train': 0.6066297115758061}
12/24/2021 13:59:49 - INFO - codeparrot_training - Step 12181: {'lr': 0.0004317233738124583, 'samples': 6237184, 'steps': 12181, 'batch_loss/train': 0.8075444037094712}
12/24/2021 14:00:00 - INFO - codeparrot_training - Step 12182: {'lr': 0.0004317125481155358, 'samples': 6237696, 'steps': 12182, 'batch_loss/train': 0.6694544111378491}
12/24/2021 14:00:12 - INFO - codeparrot_training - Step 12183: {'lr': 0.00043170172169619287, 'samples': 6238208, 'steps': 12183, 'batch_loss/train': 0.8363775480538607}
12/24/2021 14:00:23 - INFO - codeparrot_training - Step 12184: {'lr': 0.00043169089455447247, 'samples': 6238720, 'steps': 12184, 'batch_loss/train': 0.6249499695259146}
12/24/2021 14:00:33 - INFO - codeparrot_training - Step 12185: {'lr': 0.0004316800666904177, 'samples': 6239232, 'steps': 12185, 'batch_loss/train': 0.8666834064642899}
12/24/2021 14:00:44 - INFO - codeparrot_training - Step 12186: {'lr': 0.00043166923810407154, 'samples': 6239744, 'steps': 12186, 'batch_loss/train': 0.7818394117057323}
12/24/2021 14:00:58 - INFO - codeparrot_training - Step 12187: {'lr': 0.00043165840879547714, 'samples': 6240256, 'steps': 12187, 'batch_loss/train': 0.7852836130186915}
12/24/2021 14:01:09 - INFO - codeparrot_training - Step 12188: {'lr': 0.0004316475787646775, 'samples': 6240768, 'steps': 12188, 'batch_loss/train': 0.7339230049401522}
12/24/2021 14:01:19 - INFO - codeparrot_training - Step 12189: {'lr': 0.0004316367480117156, 'samples': 6241280, 'steps': 12189, 'batch_loss/train': 0.8091825768351555}
12/24/2021 14:01:31 - INFO - codeparrot_training - Step 12190: {'lr': 0.00043162591653663466, 'samples': 6241792, 'steps': 12190, 'batch_loss/train': 0.8661072626709938}
12/24/2021 14:01:42 - INFO - codeparrot_training - Step 12191: {'lr': 0.0004316150843394776, 'samples': 6242304, 'steps': 12191, 'batch_loss/train': 0.7776654381304979}
12/24/2021 14:01:53 - INFO - codeparrot_training - Step 12192: {'lr': 0.00043160425142028757, 'samples': 6242816, 'steps': 12192, 'batch_loss/train': 0.8028164331335574}
12/24/2021 14:02:06 - INFO - codeparrot_training - Step 12193: {'lr': 0.00043159341777910767, 'samples': 6243328, 'steps': 12193, 'batch_loss/train': 0.7768774759024382}
12/24/2021 14:02:17 - INFO - codeparrot_training - Step 12194: {'lr': 0.0004315825834159808, 'samples': 6243840, 'steps': 12194, 'batch_loss/train': 0.7136036343872547}
12/24/2021 14:02:28 - INFO - codeparrot_training - Step 12195: {'lr': 0.00043157174833095016, 'samples': 6244352, 'steps': 12195, 'batch_loss/train': 0.7560820725047961}
12/24/2021 14:02:38 - INFO - codeparrot_training - Step 12196: {'lr': 0.0004315609125240588, 'samples': 6244864, 'steps': 12196, 'batch_loss/train': 0.8561049848794937}
12/24/2021 14:02:51 - INFO - codeparrot_training - Step 12197: {'lr': 0.0004315500759953498, 'samples': 6245376, 'steps': 12197, 'batch_loss/train': 0.7450854876078665}
12/24/2021 14:03:01 - INFO - codeparrot_training - Step 12198: {'lr': 0.0004315392387448663, 'samples': 6245888, 'steps': 12198, 'batch_loss/train': 0.8229757882654667}
12/24/2021 14:03:12 - INFO - codeparrot_training - Step 12199: {'lr': 0.0004315284007726513, 'samples': 6246400, 'steps': 12199, 'batch_loss/train': 0.621798378881067}
12/24/2021 14:03:24 - INFO - codeparrot_training - Step 12200: {'lr': 0.000431517562078748, 'samples': 6246912, 'steps': 12200, 'batch_loss/train': 0.8489806051366031}
12/24/2021 14:03:35 - INFO - codeparrot_training - Step 12201: {'lr': 0.00043150672266319937, 'samples': 6247424, 'steps': 12201, 'batch_loss/train': 0.8569862665608525}
12/24/2021 14:03:45 - INFO - codeparrot_training - Step 12202: {'lr': 0.0004314958825260485, 'samples': 6247936, 'steps': 12202, 'batch_loss/train': 0.894162411801517}
12/24/2021 14:03:57 - INFO - codeparrot_training - Step 12203: {'lr': 0.00043148504166733853, 'samples': 6248448, 'steps': 12203, 'batch_loss/train': 0.6968813781859353}
12/24/2021 14:04:08 - INFO - codeparrot_training - Step 12204: {'lr': 0.00043147420008711256, 'samples': 6248960, 'steps': 12204, 'batch_loss/train': 0.7722311218385585}
12/24/2021 14:04:19 - INFO - codeparrot_training - Step 12205: {'lr': 0.00043146335778541375, 'samples': 6249472, 'steps': 12205, 'batch_loss/train': 0.7287326105870306}
12/24/2021 14:04:33 - INFO - codeparrot_training - Step 12206: {'lr': 0.00043145251476228507, 'samples': 6249984, 'steps': 12206, 'batch_loss/train': 0.7710372731089592}
12/24/2021 14:04:43 - INFO - codeparrot_training - Step 12207: {'lr': 0.00043144167101776974, 'samples': 6250496, 'steps': 12207, 'batch_loss/train': 0.7727518356405199}
12/24/2021 14:04:54 - INFO - codeparrot_training - Step 12208: {'lr': 0.0004314308265519109, 'samples': 6251008, 'steps': 12208, 'batch_loss/train': 0.8175657652318478}
12/24/2021 14:05:04 - INFO - codeparrot_training - Step 12209: {'lr': 0.0004314199813647515, 'samples': 6251520, 'steps': 12209, 'batch_loss/train': 0.9083910225890577}
12/24/2021 14:05:17 - INFO - codeparrot_training - Step 12210: {'lr': 0.0004314091354563348, 'samples': 6252032, 'steps': 12210, 'batch_loss/train': 0.8039097627624869}
12/24/2021 14:05:27 - INFO - codeparrot_training - Step 12211: {'lr': 0.0004313982888267039, 'samples': 6252544, 'steps': 12211, 'batch_loss/train': 0.7489824118092656}
12/24/2021 14:05:38 - INFO - codeparrot_training - Step 12212: {'lr': 0.0004313874414759019, 'samples': 6253056, 'steps': 12212, 'batch_loss/train': 0.7979347184300423}
12/24/2021 14:05:50 - INFO - codeparrot_training - Step 12213: {'lr': 0.0004313765934039718, 'samples': 6253568, 'steps': 12213, 'batch_loss/train': 0.7650578459724784}
12/24/2021 14:06:00 - INFO - codeparrot_training - Step 12214: {'lr': 0.000431365744610957, 'samples': 6254080, 'steps': 12214, 'batch_loss/train': 0.755791119299829}
12/24/2021 14:06:11 - INFO - codeparrot_training - Step 12215: {'lr': 0.0004313548950969004, 'samples': 6254592, 'steps': 12215, 'batch_loss/train': 0.8028151411563158}
12/24/2021 14:06:25 - INFO - codeparrot_training - Step 12216: {'lr': 0.00043134404486184526, 'samples': 6255104, 'steps': 12216, 'batch_loss/train': 0.7442912459373474}
12/24/2021 14:06:35 - INFO - codeparrot_training - Step 12217: {'lr': 0.00043133319390583464, 'samples': 6255616, 'steps': 12217, 'batch_loss/train': 0.7946487683802843}
12/24/2021 14:06:46 - INFO - codeparrot_training - Step 12218: {'lr': 0.00043132234222891176, 'samples': 6256128, 'steps': 12218, 'batch_loss/train': 0.679262104909867}
12/24/2021 14:06:58 - INFO - codeparrot_training - Step 12219: {'lr': 0.00043131148983111965, 'samples': 6256640, 'steps': 12219, 'batch_loss/train': 0.8262413847260177}
12/24/2021 14:07:09 - INFO - codeparrot_training - Step 12220: {'lr': 0.00043130063671250154, 'samples': 6257152, 'steps': 12220, 'batch_loss/train': 0.7125571900978684}
12/24/2021 14:07:19 - INFO - codeparrot_training - Step 12221: {'lr': 0.0004312897828731005, 'samples': 6257664, 'steps': 12221, 'batch_loss/train': 0.8014109269715846}
12/24/2021 14:07:30 - INFO - codeparrot_training - Step 12222: {'lr': 0.00043127892831295976, 'samples': 6258176, 'steps': 12222, 'batch_loss/train': 0.9121318068355322}
12/24/2021 14:07:44 - INFO - codeparrot_training - Step 12223: {'lr': 0.0004312680730321225, 'samples': 6258688, 'steps': 12223, 'batch_loss/train': 0.7847340200096369}
12/24/2021 14:07:55 - INFO - codeparrot_training - Step 12224: {'lr': 0.0004312572170306318, 'samples': 6259200, 'steps': 12224, 'batch_loss/train': 0.7949756467714906}
12/24/2021 14:08:05 - INFO - codeparrot_training - Step 12225: {'lr': 0.0004312463603085309, 'samples': 6259712, 'steps': 12225, 'batch_loss/train': 0.8203471973538399}
12/24/2021 14:08:18 - INFO - codeparrot_training - Step 12226: {'lr': 0.00043123550286586275, 'samples': 6260224, 'steps': 12226, 'batch_loss/train': 1.081231483258307}
12/24/2021 14:08:28 - INFO - codeparrot_training - Step 12227: {'lr': 0.0004312246447026709, 'samples': 6260736, 'steps': 12227, 'batch_loss/train': 0.8444774700328708}
12/24/2021 14:08:39 - INFO - codeparrot_training - Step 12228: {'lr': 0.0004312137858189981, 'samples': 6261248, 'steps': 12228, 'batch_loss/train': 0.7109765144996345}
12/24/2021 14:08:51 - INFO - codeparrot_training - Step 12229: {'lr': 0.00043120292621488777, 'samples': 6261760, 'steps': 12229, 'batch_loss/train': 0.784294793382287}
12/24/2021 14:09:02 - INFO - codeparrot_training - Step 12230: {'lr': 0.00043119206589038305, 'samples': 6262272, 'steps': 12230, 'batch_loss/train': 0.8325478108599782}
12/24/2021 14:09:12 - INFO - codeparrot_training - Step 12231: {'lr': 0.0004311812048455271, 'samples': 6262784, 'steps': 12231, 'batch_loss/train': 0.8605559123679996}
12/24/2021 14:09:23 - INFO - codeparrot_training - Step 12232: {'lr': 0.0004311703430803631, 'samples': 6263296, 'steps': 12232, 'batch_loss/train': 0.7291262617800385}
12/24/2021 14:09:37 - INFO - codeparrot_training - Step 12233: {'lr': 0.00043115948059493415, 'samples': 6263808, 'steps': 12233, 'batch_loss/train': 0.7600404312834144}
12/24/2021 14:09:47 - INFO - codeparrot_training - Step 12234: {'lr': 0.00043114861738928354, 'samples': 6264320, 'steps': 12234, 'batch_loss/train': 0.7993414434604347}
12/24/2021 14:09:58 - INFO - codeparrot_training - Step 12235: {'lr': 0.0004311377534634544, 'samples': 6264832, 'steps': 12235, 'batch_loss/train': 0.703122076112777}
12/24/2021 14:10:10 - INFO - codeparrot_training - Step 12236: {'lr': 0.00043112688881749, 'samples': 6265344, 'steps': 12236, 'batch_loss/train': 0.7846689717844129}
12/24/2021 14:10:21 - INFO - codeparrot_training - Step 12237: {'lr': 0.00043111602345143345, 'samples': 6265856, 'steps': 12237, 'batch_loss/train': 0.9069689041934907}
12/24/2021 14:10:32 - INFO - codeparrot_training - Step 12238: {'lr': 0.00043110515736532795, 'samples': 6266368, 'steps': 12238, 'batch_loss/train': 0.9603544445708394}
12/24/2021 14:10:44 - INFO - codeparrot_training - Step 12239: {'lr': 0.0004310942905592168, 'samples': 6266880, 'steps': 12239, 'batch_loss/train': 0.7955480236560106}
12/24/2021 14:10:55 - INFO - codeparrot_training - Step 12240: {'lr': 0.0004310834230331431, 'samples': 6267392, 'steps': 12240, 'batch_loss/train': 0.754021446686238}
12/24/2021 14:11:05 - INFO - codeparrot_training - Step 12241: {'lr': 0.0004310725547871501, 'samples': 6267904, 'steps': 12241, 'batch_loss/train': 0.8625084049999714}
12/24/2021 14:11:16 - INFO - codeparrot_training - Step 12242: {'lr': 0.0004310616858212809, 'samples': 6268416, 'steps': 12242, 'batch_loss/train': 0.7031020298600197}
12/24/2021 14:11:28 - INFO - codeparrot_training - Step 12243: {'lr': 0.000431050816135579, 'samples': 6268928, 'steps': 12243, 'batch_loss/train': 0.5938820857554674}
12/24/2021 14:11:39 - INFO - codeparrot_training - Step 12244: {'lr': 0.0004310399457300872, 'samples': 6269440, 'steps': 12244, 'batch_loss/train': 0.5328081625048071}
12/24/2021 14:11:49 - INFO - codeparrot_training - Step 12245: {'lr': 0.0004310290746048491, 'samples': 6269952, 'steps': 12245, 'batch_loss/train': 0.8085874691605568}
12/24/2021 14:12:03 - INFO - codeparrot_training - Step 12246: {'lr': 0.0004310182027599077, 'samples': 6270464, 'steps': 12246, 'batch_loss/train': 0.6635234493296593}
12/24/2021 14:12:14 - INFO - codeparrot_training - Step 12247: {'lr': 0.0004310073301953061, 'samples': 6270976, 'steps': 12247, 'batch_loss/train': 0.6714389031985775}
12/24/2021 14:12:24 - INFO - codeparrot_training - Step 12248: {'lr': 0.0004309964569110879, 'samples': 6271488, 'steps': 12248, 'batch_loss/train': 0.7707402356900275}
12/24/2021 14:12:36 - INFO - codeparrot_training - Step 12249: {'lr': 0.0004309855829072961, 'samples': 6272000, 'steps': 12249, 'batch_loss/train': 0.6956543419510126}
12/24/2021 14:12:47 - INFO - codeparrot_training - Step 12250: {'lr': 0.000430974708183974, 'samples': 6272512, 'steps': 12250, 'batch_loss/train': 0.7815654948353767}
12/24/2021 14:12:57 - INFO - codeparrot_training - Step 12251: {'lr': 0.0004309638327411647, 'samples': 6273024, 'steps': 12251, 'batch_loss/train': 0.7641333164647222}
12/24/2021 14:13:08 - INFO - codeparrot_training - Step 12252: {'lr': 0.00043095295657891153, 'samples': 6273536, 'steps': 12252, 'batch_loss/train': 0.8381065614521503}
12/24/2021 14:13:22 - INFO - codeparrot_training - Step 12253: {'lr': 0.0004309420796972579, 'samples': 6274048, 'steps': 12253, 'batch_loss/train': 0.8946061907336116}
12/24/2021 14:13:33 - INFO - codeparrot_training - Step 12254: {'lr': 0.0004309312020962467, 'samples': 6274560, 'steps': 12254, 'batch_loss/train': 0.87475167773664}
12/24/2021 14:13:43 - INFO - codeparrot_training - Step 12255: {'lr': 0.0004309203237759214, 'samples': 6275072, 'steps': 12255, 'batch_loss/train': 0.5965061821043491}
12/24/2021 14:13:55 - INFO - codeparrot_training - Step 12256: {'lr': 0.00043090944473632523, 'samples': 6275584, 'steps': 12256, 'batch_loss/train': 0.8844601325690746}
12/24/2021 14:14:06 - INFO - codeparrot_training - Step 12257: {'lr': 0.00043089856497750134, 'samples': 6276096, 'steps': 12257, 'batch_loss/train': 0.7973493495956063}
12/24/2021 14:14:16 - INFO - codeparrot_training - Step 12258: {'lr': 0.00043088768449949313, 'samples': 6276608, 'steps': 12258, 'batch_loss/train': 0.8365019112825394}
12/24/2021 14:14:28 - INFO - codeparrot_training - Step 12259: {'lr': 0.0004308768033023438, 'samples': 6277120, 'steps': 12259, 'batch_loss/train': 0.7256309906952083}
12/24/2021 14:14:39 - INFO - codeparrot_training - Step 12260: {'lr': 0.00043086592138609655, 'samples': 6277632, 'steps': 12260, 'batch_loss/train': 0.7710163900628686}
12/24/2021 14:14:50 - INFO - codeparrot_training - Step 12261: {'lr': 0.0004308550387507947, 'samples': 6278144, 'steps': 12261, 'batch_loss/train': 0.8073843782767653}
12/24/2021 14:15:04 - INFO - codeparrot_training - Step 12262: {'lr': 0.00043084415539648154, 'samples': 6278656, 'steps': 12262, 'batch_loss/train': 0.8443245647940785}
12/24/2021 14:15:14 - INFO - codeparrot_training - Step 12263: {'lr': 0.0004308332713232003, 'samples': 6279168, 'steps': 12263, 'batch_loss/train': 0.9828111757524312}
12/24/2021 14:15:25 - INFO - codeparrot_training - Step 12264: {'lr': 0.0004308223865309942, 'samples': 6279680, 'steps': 12264, 'batch_loss/train': 0.7774681709706783}
12/24/2021 14:15:35 - INFO - codeparrot_training - Step 12265: {'lr': 0.0004308115010199066, 'samples': 6280192, 'steps': 12265, 'batch_loss/train': 0.8235852681100368}
12/24/2021 14:15:48 - INFO - codeparrot_training - Step 12266: {'lr': 0.00043080061478998076, 'samples': 6280704, 'steps': 12266, 'batch_loss/train': 0.7496848022565246}
12/24/2021 14:15:58 - INFO - codeparrot_training - Step 12267: {'lr': 0.0004307897278412599, 'samples': 6281216, 'steps': 12267, 'batch_loss/train': 0.6827171443146653}
12/24/2021 14:16:09 - INFO - codeparrot_training - Step 12268: {'lr': 0.00043077884017378735, 'samples': 6281728, 'steps': 12268, 'batch_loss/train': 0.882659096212592}
12/24/2021 14:16:21 - INFO - codeparrot_training - Step 12269: {'lr': 0.00043076795178760644, 'samples': 6282240, 'steps': 12269, 'batch_loss/train': 0.8584082620218396}
12/24/2021 14:16:32 - INFO - codeparrot_training - Step 12270: {'lr': 0.0004307570626827604, 'samples': 6282752, 'steps': 12270, 'batch_loss/train': 0.9035632056184113}
12/24/2021 14:16:42 - INFO - codeparrot_training - Step 12271: {'lr': 0.0004307461728592925, 'samples': 6283264, 'steps': 12271, 'batch_loss/train': 0.7601705752313137}
12/24/2021 14:16:56 - INFO - codeparrot_training - Step 12272: {'lr': 0.0004307352823172461, 'samples': 6283776, 'steps': 12272, 'batch_loss/train': 0.7714057741686702}
12/24/2021 14:17:07 - INFO - codeparrot_training - Step 12273: {'lr': 0.0004307243910566644, 'samples': 6284288, 'steps': 12273, 'batch_loss/train': 0.8042116044089198}
12/24/2021 14:17:17 - INFO - codeparrot_training - Step 12274: {'lr': 0.0004307134990775908, 'samples': 6284800, 'steps': 12274, 'batch_loss/train': 0.6747999526560307}
12/24/2021 14:17:30 - INFO - codeparrot_training - Step 12275: {'lr': 0.00043070260638006854, 'samples': 6285312, 'steps': 12275, 'batch_loss/train': 0.8408170812763274}
12/24/2021 14:17:40 - INFO - codeparrot_training - Step 12276: {'lr': 0.000430691712964141, 'samples': 6285824, 'steps': 12276, 'batch_loss/train': 0.8883801400661469}
12/24/2021 14:17:51 - INFO - codeparrot_training - Step 12277: {'lr': 0.0004306808188298514, 'samples': 6286336, 'steps': 12277, 'batch_loss/train': 0.7596436813473701}
12/24/2021 14:18:01 - INFO - codeparrot_training - Step 12278: {'lr': 0.0004306699239772431, 'samples': 6286848, 'steps': 12278, 'batch_loss/train': 0.8206784054636955}
12/24/2021 14:18:13 - INFO - codeparrot_training - Step 12279: {'lr': 0.0004306590284063594, 'samples': 6287360, 'steps': 12279, 'batch_loss/train': 0.8353832811117172}
12/24/2021 14:18:24 - INFO - codeparrot_training - Step 12280: {'lr': 0.0004306481321172436, 'samples': 6287872, 'steps': 12280, 'batch_loss/train': 0.9013335031922907}
12/24/2021 14:18:35 - INFO - codeparrot_training - Step 12281: {'lr': 0.000430637235109939, 'samples': 6288384, 'steps': 12281, 'batch_loss/train': 0.7987179309129715}
12/24/2021 14:18:49 - INFO - codeparrot_training - Step 12282: {'lr': 0.00043062633738448903, 'samples': 6288896, 'steps': 12282, 'batch_loss/train': 0.7496815640479326}
12/24/2021 14:18:59 - INFO - codeparrot_training - Step 12283: {'lr': 0.0004306154389409369, 'samples': 6289408, 'steps': 12283, 'batch_loss/train': 0.8097202340140939}
12/24/2021 14:19:10 - INFO - codeparrot_training - Step 12284: {'lr': 0.000430604539779326, 'samples': 6289920, 'steps': 12284, 'batch_loss/train': 0.8041735477745533}
12/24/2021 14:19:22 - INFO - codeparrot_training - Step 12285: {'lr': 0.0004305936398996997, 'samples': 6290432, 'steps': 12285, 'batch_loss/train': 0.7634845175780356}
12/24/2021 14:19:33 - INFO - codeparrot_training - Step 12286: {'lr': 0.00043058273930210114, 'samples': 6290944, 'steps': 12286, 'batch_loss/train': 0.7311426964588463}
12/24/2021 14:19:43 - INFO - codeparrot_training - Step 12287: {'lr': 0.0004305718379865739, 'samples': 6291456, 'steps': 12287, 'batch_loss/train': 0.6604692842811346}
12/24/2021 14:19:54 - INFO - codeparrot_training - Step 12288: {'lr': 0.0004305609359531612, 'samples': 6291968, 'steps': 12288, 'batch_loss/train': 0.7842123620212078}
12/24/2021 14:20:06 - INFO - codeparrot_training - Step 12289: {'lr': 0.0004305500332019063, 'samples': 6292480, 'steps': 12289, 'batch_loss/train': 0.5927304878132418}
12/24/2021 14:20:17 - INFO - codeparrot_training - Step 12290: {'lr': 0.0004305391297328528, 'samples': 6292992, 'steps': 12290, 'batch_loss/train': 0.8528223328758031}
12/24/2021 14:20:27 - INFO - codeparrot_training - Step 12291: {'lr': 0.0004305282255460438, 'samples': 6293504, 'steps': 12291, 'batch_loss/train': 0.7427678909152746}
12/24/2021 14:20:41 - INFO - codeparrot_training - Step 12292: {'lr': 0.00043051732064152277, 'samples': 6294016, 'steps': 12292, 'batch_loss/train': 0.7486382238566875}
12/24/2021 14:20:52 - INFO - codeparrot_training - Step 12293: {'lr': 0.000430506415019333, 'samples': 6294528, 'steps': 12293, 'batch_loss/train': 0.6885273982770741}
12/24/2021 14:21:03 - INFO - codeparrot_training - Step 12294: {'lr': 0.0004304955086795179, 'samples': 6295040, 'steps': 12294, 'batch_loss/train': 0.8153868867084384}
12/24/2021 14:21:15 - INFO - codeparrot_training - Step 12295: {'lr': 0.00043048460162212085, 'samples': 6295552, 'steps': 12295, 'batch_loss/train': 0.6607426677364856}
12/24/2021 14:21:25 - INFO - codeparrot_training - Step 12296: {'lr': 0.0004304736938471851, 'samples': 6296064, 'steps': 12296, 'batch_loss/train': 0.7550199691904709}
12/24/2021 14:21:36 - INFO - codeparrot_training - Step 12297: {'lr': 0.00043046278535475416, 'samples': 6296576, 'steps': 12297, 'batch_loss/train': 0.8060110516380519}
12/24/2021 14:21:48 - INFO - codeparrot_training - Step 12298: {'lr': 0.00043045187614487127, 'samples': 6297088, 'steps': 12298, 'batch_loss/train': 0.816579967038706}
12/24/2021 14:21:59 - INFO - codeparrot_training - Step 12299: {'lr': 0.00043044096621757983, 'samples': 6297600, 'steps': 12299, 'batch_loss/train': 0.6799819166772068}
12/24/2021 14:22:10 - INFO - codeparrot_training - Step 12300: {'lr': 0.0004304300555729233, 'samples': 6298112, 'steps': 12300, 'batch_loss/train': 0.7456321865320206}
12/24/2021 14:22:20 - INFO - codeparrot_training - Step 12301: {'lr': 0.00043041914421094497, 'samples': 6298624, 'steps': 12301, 'batch_loss/train':
0.620535402558744} 12/24/2021 14:22:32 - INFO - codeparrot_training - Step 12302: {'lr': 0.00043040823213168824, 'samples': 6299136, 'steps': 12302, 'batch_loss/train': 0.7617781965527683} 12/24/2021 14:22:43 - INFO - codeparrot_training - Step 12303: {'lr': 0.00043039731933519653, 'samples': 6299648, 'steps': 12303, 'batch_loss/train': 0.8768266410334036} 12/24/2021 14:22:53 - INFO - codeparrot_training - Step 12304: {'lr': 0.00043038640582151316, 'samples': 6300160, 'steps': 12304, 'batch_loss/train': 0.7214069752953947} 12/24/2021 14:23:08 - INFO - codeparrot_training - Step 12305: {'lr': 0.00043037549159068156, 'samples': 6300672, 'steps': 12305, 'batch_loss/train': 0.8150719702243805} 12/24/2021 14:23:18 - INFO - codeparrot_training - Step 12306: {'lr': 0.00043036457664274514, 'samples': 6301184, 'steps': 12306, 'batch_loss/train': 0.8488970650359988} 12/24/2021 14:23:29 - INFO - codeparrot_training - Step 12307: {'lr': 0.00043035366097774727, 'samples': 6301696, 'steps': 12307, 'batch_loss/train': 0.7865452799014747} 12/24/2021 14:23:41 - INFO - codeparrot_training - Step 12308: {'lr': 0.0004303427445957313, 'samples': 6302208, 'steps': 12308, 'batch_loss/train': 0.7726334594190121} 12/24/2021 14:23:52 - INFO - codeparrot_training - Step 12309: {'lr': 0.00043033182749674067, 'samples': 6302720, 'steps': 12309, 'batch_loss/train': 0.8064951361157} 12/24/2021 14:24:02 - INFO - codeparrot_training - Step 12310: {'lr': 0.00043032090968081883, 'samples': 6303232, 'steps': 12310, 'batch_loss/train': 0.8118260093033314} 12/24/2021 14:24:14 - INFO - codeparrot_training - Step 12311: {'lr': 0.0004303099911480091, 'samples': 6303744, 'steps': 12311, 'batch_loss/train': 0.7829987532459199} 12/24/2021 14:24:25 - INFO - codeparrot_training - Step 12312: {'lr': 0.0004302990718983549, 'samples': 6304256, 'steps': 12312, 'batch_loss/train': 0.7770630577579141} 12/24/2021 14:24:36 - INFO - codeparrot_training - Step 12313: {'lr': 0.00043028815193189976, 'samples': 6304768, 
'steps': 12313, 'batch_loss/train': 0.7889005547622219} 12/24/2021 14:24:46 - INFO - codeparrot_training - Step 12314: {'lr': 0.00043027723124868693, 'samples': 6305280, 'steps': 12314, 'batch_loss/train': 0.869221244007349} 12/24/2021 14:25:00 - INFO - codeparrot_training - Step 12315: {'lr': 0.00043026630984875993, 'samples': 6305792, 'steps': 12315, 'batch_loss/train': 1.040066933259368} 12/24/2021 14:25:11 - INFO - codeparrot_training - Step 12316: {'lr': 0.0004302553877321622, 'samples': 6306304, 'steps': 12316, 'batch_loss/train': 1.232151735574007} 12/24/2021 14:25:21 - INFO - codeparrot_training - Step 12317: {'lr': 0.000430244464898937, 'samples': 6306816, 'steps': 12317, 'batch_loss/train': 0.7509137410670519} 12/24/2021 14:25:33 - INFO - codeparrot_training - Step 12318: {'lr': 0.0004302335413491279, 'samples': 6307328, 'steps': 12318, 'batch_loss/train': 0.7756724371574819} 12/24/2021 14:25:44 - INFO - codeparrot_training - Step 12319: {'lr': 0.00043022261708277834, 'samples': 6307840, 'steps': 12319, 'batch_loss/train': 0.6838391253259033} 12/24/2021 14:25:55 - INFO - codeparrot_training - Step 12320: {'lr': 0.0004302116920999317, 'samples': 6308352, 'steps': 12320, 'batch_loss/train': 0.8346721231937408} 12/24/2021 14:26:05 - INFO - codeparrot_training - Step 12321: {'lr': 0.0004302007664006313, 'samples': 6308864, 'steps': 12321, 'batch_loss/train': 0.8757670421618968} 12/24/2021 14:26:20 - INFO - codeparrot_training - Step 12322: {'lr': 0.0004301898399849208, 'samples': 6309376, 'steps': 12322, 'batch_loss/train': 0.6817302466370165} 12/24/2021 14:26:30 - INFO - codeparrot_training - Step 12323: {'lr': 0.0004301789128528435, 'samples': 6309888, 'steps': 12323, 'batch_loss/train': 0.841561047360301} 12/24/2021 14:26:41 - INFO - codeparrot_training - Step 12324: {'lr': 0.00043016798500444294, 'samples': 6310400, 'steps': 12324, 'batch_loss/train': 0.8472357550635934} 12/24/2021 14:26:53 - INFO - codeparrot_training - Step 12325: {'lr': 
0.00043015705643976245, 'samples': 6310912, 'steps': 12325, 'batch_loss/train': 0.7516172897303477} 12/24/2021 14:27:03 - INFO - codeparrot_training - Step 12326: {'lr': 0.0004301461271588455, 'samples': 6311424, 'steps': 12326, 'batch_loss/train': 0.759111138060689} 12/24/2021 14:27:14 - INFO - codeparrot_training - Step 12327: {'lr': 0.00043013519716173567, 'samples': 6311936, 'steps': 12327, 'batch_loss/train': 0.9010791126638651} 12/24/2021 14:27:26 - INFO - codeparrot_training - Step 12328: {'lr': 0.0004301242664484762, 'samples': 6312448, 'steps': 12328, 'batch_loss/train': 0.883692990988493} 12/24/2021 14:27:37 - INFO - codeparrot_training - Step 12329: {'lr': 0.0004301133350191107, 'samples': 6312960, 'steps': 12329, 'batch_loss/train': 0.7739092805422843} 12/24/2021 14:27:47 - INFO - codeparrot_training - Step 12330: {'lr': 0.0004301024028736826, 'samples': 6313472, 'steps': 12330, 'batch_loss/train': 0.7015121169388294} 12/24/2021 14:28:01 - INFO - codeparrot_training - Step 12331: {'lr': 0.00043009147001223536, 'samples': 6313984, 'steps': 12331, 'batch_loss/train': 0.8352522421628237} 12/24/2021 14:28:12 - INFO - codeparrot_training - Step 12332: {'lr': 0.00043008053643481237, 'samples': 6314496, 'steps': 12332, 'batch_loss/train': 0.849798796698451} 12/24/2021 14:28:22 - INFO - codeparrot_training - Step 12333: {'lr': 0.0004300696021414572, 'samples': 6315008, 'steps': 12333, 'batch_loss/train': 0.7924626478925347} 12/24/2021 14:28:33 - INFO - codeparrot_training - Step 12334: {'lr': 0.00043005866713221333, 'samples': 6315520, 'steps': 12334, 'batch_loss/train': 0.8549627820029855} 12/24/2021 14:28:45 - INFO - codeparrot_training - Step 12335: {'lr': 0.0004300477314071242, 'samples': 6316032, 'steps': 12335, 'batch_loss/train': 0.6617271099239588} 12/24/2021 14:28:56 - INFO - codeparrot_training - Step 12336: {'lr': 0.00043003679496623316, 'samples': 6316544, 'steps': 12336, 'batch_loss/train': 0.885429815389216} 12/24/2021 14:29:06 - INFO - 
codeparrot_training - Step 12337: {'lr': 0.00043002585780958384, 'samples': 6317056, 'steps': 12337, 'batch_loss/train': 0.7952076601795852} 12/24/2021 14:29:20 - INFO - codeparrot_training - Step 12338: {'lr': 0.00043001491993721976, 'samples': 6317568, 'steps': 12338, 'batch_loss/train': 0.882630517706275} 12/24/2021 14:29:31 - INFO - codeparrot_training - Step 12339: {'lr': 0.00043000398134918424, 'samples': 6318080, 'steps': 12339, 'batch_loss/train': 0.7606533570215106} 12/24/2021 14:29:42 - INFO - codeparrot_training - Step 12340: {'lr': 0.0004299930420455209, 'samples': 6318592, 'steps': 12340, 'batch_loss/train': 0.8053768817335367} 12/24/2021 14:29:54 - INFO - codeparrot_training - Step 12341: {'lr': 0.0004299821020262732, 'samples': 6319104, 'steps': 12341, 'batch_loss/train': 0.8374393973499537} 12/24/2021 14:30:04 - INFO - codeparrot_training - Step 12342: {'lr': 0.00042997116129148464, 'samples': 6319616, 'steps': 12342, 'batch_loss/train': 0.8192249396815896} 12/24/2021 14:30:15 - INFO - codeparrot_training - Step 12343: {'lr': 0.0004299602198411986, 'samples': 6320128, 'steps': 12343, 'batch_loss/train': 0.5555128389969468} 12/24/2021 14:30:26 - INFO - codeparrot_training - Step 12344: {'lr': 0.0004299492776754588, 'samples': 6320640, 'steps': 12344, 'batch_loss/train': 0.683896423317492} 12/24/2021 14:30:38 - INFO - codeparrot_training - Step 12345: {'lr': 0.0004299383347943085, 'samples': 6321152, 'steps': 12345, 'batch_loss/train': 0.8146888706833124} 12/24/2021 14:30:48 - INFO - codeparrot_training - Step 12346: {'lr': 0.0004299273911977913, 'samples': 6321664, 'steps': 12346, 'batch_loss/train': 0.8355367723852396} 12/24/2021 14:30:59 - INFO - codeparrot_training - Step 12347: {'lr': 0.0004299164468859509, 'samples': 6322176, 'steps': 12347, 'batch_loss/train': 0.9169576009735465} 12/24/2021 14:31:11 - INFO - codeparrot_training - Step 12348: {'lr': 0.0004299055018588305, 'samples': 6322688, 'steps': 12348, 'batch_loss/train': 
0.8729316801764071} 12/24/2021 14:31:22 - INFO - codeparrot_training - Step 12349: {'lr': 0.00042989455611647384, 'samples': 6323200, 'steps': 12349, 'batch_loss/train': 0.8273122515529394} 12/24/2021 14:31:32 - INFO - codeparrot_training - Step 12350: {'lr': 0.0004298836096589243, 'samples': 6323712, 'steps': 12350, 'batch_loss/train': 0.7345685651525855} 12/24/2021 14:31:46 - INFO - codeparrot_training - Step 12351: {'lr': 0.00042987266248622547, 'samples': 6324224, 'steps': 12351, 'batch_loss/train': 0.7613907686900347} 12/24/2021 14:31:57 - INFO - codeparrot_training - Step 12352: {'lr': 0.0004298617145984208, 'samples': 6324736, 'steps': 12352, 'batch_loss/train': 0.7969642365351319} 12/24/2021 14:32:08 - INFO - codeparrot_training - Step 12353: {'lr': 0.00042985076599555395, 'samples': 6325248, 'steps': 12353, 'batch_loss/train': 0.8981754798442125} 12/24/2021 14:32:18 - INFO - codeparrot_training - Step 12354: {'lr': 0.00042983981667766836, 'samples': 6325760, 'steps': 12354, 'batch_loss/train': 0.7687706106808037} 12/24/2021 14:32:31 - INFO - codeparrot_training - Step 12355: {'lr': 0.00042982886664480745, 'samples': 6326272, 'steps': 12355, 'batch_loss/train': 0.742304490879178} 12/24/2021 14:32:41 - INFO - codeparrot_training - Step 12356: {'lr': 0.000429817915897015, 'samples': 6326784, 'steps': 12356, 'batch_loss/train': 0.8171188058331609} 12/24/2021 14:32:52 - INFO - codeparrot_training - Step 12357: {'lr': 0.00042980696443433434, 'samples': 6327296, 'steps': 12357, 'batch_loss/train': 0.6658969540148973} 12/24/2021 14:33:04 - INFO - codeparrot_training - Step 12358: {'lr': 0.0004297960122568091, 'samples': 6327808, 'steps': 12358, 'batch_loss/train': 0.8810875238850713} 12/24/2021 14:33:15 - INFO - codeparrot_training - Step 12359: {'lr': 0.0004297850593644828, 'samples': 6328320, 'steps': 12359, 'batch_loss/train': 0.7863573115319014} 12/24/2021 14:33:25 - INFO - codeparrot_training - Step 12360: {'lr': 0.00042977410575739894, 'samples': 6328832, 
'steps': 12360, 'batch_loss/train': 0.7934357421472669} 12/24/2021 14:33:39 - INFO - codeparrot_training - Step 12361: {'lr': 0.0004297631514356012, 'samples': 6329344, 'steps': 12361, 'batch_loss/train': 0.7711820569820702} 12/24/2021 14:33:50 - INFO - codeparrot_training - Step 12362: {'lr': 0.00042975219639913306, 'samples': 6329856, 'steps': 12362, 'batch_loss/train': 0.7265287027694285} 12/24/2021 14:34:00 - INFO - codeparrot_training - Step 12363: {'lr': 0.000429741240648038, 'samples': 6330368, 'steps': 12363, 'batch_loss/train': 0.8654192164540291} 12/24/2021 14:34:12 - INFO - codeparrot_training - Step 12364: {'lr': 0.0004297302841823596, 'samples': 6330880, 'steps': 12364, 'batch_loss/train': 0.8018408510833979} 12/24/2021 14:34:23 - INFO - codeparrot_training - Step 12365: {'lr': 0.0004297193270021416, 'samples': 6331392, 'steps': 12365, 'batch_loss/train': 0.8146932553499937} 12/24/2021 14:34:34 - INFO - codeparrot_training - Step 12366: {'lr': 0.0004297083691074273, 'samples': 6331904, 'steps': 12366, 'batch_loss/train': 0.8488896784838289} 12/24/2021 14:34:44 - INFO - codeparrot_training - Step 12367: {'lr': 0.00042969741049826037, 'samples': 6332416, 'steps': 12367, 'batch_loss/train': 0.6493413867428899} 12/24/2021 14:34:56 - INFO - codeparrot_training - Step 12368: {'lr': 0.0004296864511746844, 'samples': 6332928, 'steps': 12368, 'batch_loss/train': 0.8007960086688399} 12/24/2021 14:35:07 - INFO - codeparrot_training - Step 12369: {'lr': 0.000429675491136743, 'samples': 6333440, 'steps': 12369, 'batch_loss/train': 0.7585000756662339} 12/24/2021 14:35:17 - INFO - codeparrot_training - Step 12370: {'lr': 0.0004296645303844796, 'samples': 6333952, 'steps': 12370, 'batch_loss/train': 0.7781162741594017} 12/24/2021 14:35:31 - INFO - codeparrot_training - Step 12371: {'lr': 0.00042965356891793793, 'samples': 6334464, 'steps': 12371, 'batch_loss/train': 0.7978723011910915} 12/24/2021 14:35:42 - INFO - codeparrot_training - Step 12372: {'lr': 
0.0004296426067371615, 'samples': 6334976, 'steps': 12372, 'batch_loss/train': 0.8087886527646333} 12/24/2021 14:35:53 - INFO - codeparrot_training - Step 12373: {'lr': 0.00042963164384219384, 'samples': 6335488, 'steps': 12373, 'batch_loss/train': 0.7957653071498498} 12/24/2021 14:36:05 - INFO - codeparrot_training - Step 12374: {'lr': 0.00042962068023307866, 'samples': 6336000, 'steps': 12374, 'batch_loss/train': 0.7948509419802576} 12/24/2021 14:36:15 - INFO - codeparrot_training - Step 12375: {'lr': 0.0004296097159098594, 'samples': 6336512, 'steps': 12375, 'batch_loss/train': 0.8411115687340498} 12/24/2021 14:36:26 - INFO - codeparrot_training - Step 12376: {'lr': 0.00042959875087257983, 'samples': 6337024, 'steps': 12376, 'batch_loss/train': 0.7524054679088295} 12/24/2021 14:36:36 - INFO - codeparrot_training - Step 12377: {'lr': 0.0004295877851212834, 'samples': 6337536, 'steps': 12377, 'batch_loss/train': 0.7777447272092104} 12/24/2021 14:36:49 - INFO - codeparrot_training - Step 12378: {'lr': 0.00042957681865601373, 'samples': 6338048, 'steps': 12378, 'batch_loss/train': 0.8293453427031636} 12/24/2021 14:36:59 - INFO - codeparrot_training - Step 12379: {'lr': 0.0004295658514768145, 'samples': 6338560, 'steps': 12379, 'batch_loss/train': 0.7339596655219793} 12/24/2021 14:37:10 - INFO - codeparrot_training - Step 12380: {'lr': 0.00042955488358372915, 'samples': 6339072, 'steps': 12380, 'batch_loss/train': 0.7774992897175252} 12/24/2021 14:37:24 - INFO - codeparrot_training - Step 12381: {'lr': 0.0004295439149768014, 'samples': 6339584, 'steps': 12381, 'batch_loss/train': 0.7842135555110872} 12/24/2021 14:37:35 - INFO - codeparrot_training - Step 12382: {'lr': 0.0004295329456560748, 'samples': 6340096, 'steps': 12382, 'batch_loss/train': 0.8480190169066191} 12/24/2021 14:37:45 - INFO - codeparrot_training - Step 12383: {'lr': 0.0004295219756215931, 'samples': 6340608, 'steps': 12383, 'batch_loss/train': 0.8587280753999949} 12/24/2021 14:37:57 - INFO - 
codeparrot_training - Step 12384: {'lr': 0.0004295110048733998, 'samples': 6341120, 'steps': 12384, 'batch_loss/train': 0.5135120488703251} 12/24/2021 14:38:08 - INFO - codeparrot_training - Step 12385: {'lr': 0.00042950003341153843, 'samples': 6341632, 'steps': 12385, 'batch_loss/train': 0.6832177576143295} 12/24/2021 14:38:18 - INFO - codeparrot_training - Step 12386: {'lr': 0.0004294890612360528, 'samples': 6342144, 'steps': 12386, 'batch_loss/train': 0.8896194339031354} 12/24/2021 14:38:29 - INFO - codeparrot_training - Step 12387: {'lr': 0.00042947808834698637, 'samples': 6342656, 'steps': 12387, 'batch_loss/train': 1.1233422113582492} 12/24/2021 14:38:41 - INFO - codeparrot_training - Step 12388: {'lr': 0.00042946711474438283, 'samples': 6343168, 'steps': 12388, 'batch_loss/train': 0.7994206380099058} 12/24/2021 14:38:52 - INFO - codeparrot_training - Step 12389: {'lr': 0.00042945614042828585, 'samples': 6343680, 'steps': 12389, 'batch_loss/train': 0.8528813631273806} 12/24/2021 14:39:02 - INFO - codeparrot_training - Step 12390: {'lr': 0.000429445165398739, 'samples': 6344192, 'steps': 12390, 'batch_loss/train': 0.845009452663362} 12/24/2021 14:39:17 - INFO - codeparrot_training - Step 12391: {'lr': 0.00042943418965578594, 'samples': 6344704, 'steps': 12391, 'batch_loss/train': 0.7247587935999036} 12/24/2021 14:39:28 - INFO - codeparrot_training - Step 12392: {'lr': 0.0004294232131994703, 'samples': 6345216, 'steps': 12392, 'batch_loss/train': 0.8841559169813991} 12/24/2021 14:39:38 - INFO - codeparrot_training - Step 12393: {'lr': 0.0004294122360298357, 'samples': 6345728, 'steps': 12393, 'batch_loss/train': 2.432684081606567} 12/24/2021 14:39:51 - INFO - codeparrot_training - Step 12394: {'lr': 0.00042940125814692576, 'samples': 6346240, 'steps': 12394, 'batch_loss/train': 1.1373613842297345} 12/24/2021 14:40:01 - INFO - codeparrot_training - Step 12395: {'lr': 0.00042939027955078414, 'samples': 6346752, 'steps': 12395, 'batch_loss/train': 
0.5492775349412113} 12/24/2021 14:40:12 - INFO - codeparrot_training - Step 12396: {'lr': 0.00042937930024145455, 'samples': 6347264, 'steps': 12396, 'batch_loss/train': 0.6568741435185075} 12/24/2021 14:40:23 - INFO - codeparrot_training - Step 12397: {'lr': 0.00042936832021898063, 'samples': 6347776, 'steps': 12397, 'batch_loss/train': 0.7974602496251464} 12/24/2021 14:40:35 - INFO - codeparrot_training - Step 12398: {'lr': 0.00042935733948340596, 'samples': 6348288, 'steps': 12398, 'batch_loss/train': 0.8843438404146582} 12/24/2021 14:40:45 - INFO - codeparrot_training - Step 12399: {'lr': 0.0004293463580347742, 'samples': 6348800, 'steps': 12399, 'batch_loss/train': 0.7980016823858023} 12/24/2021 14:40:56 - INFO - codeparrot_training - Step 12400: {'lr': 0.0004293353758731291, 'samples': 6349312, 'steps': 12400, 'batch_loss/train': 0.8523324499838054} 12/24/2021 14:41:10 - INFO - codeparrot_training - Step 12401: {'lr': 0.00042932439299851424, 'samples': 6349824, 'steps': 12401, 'batch_loss/train': 0.8101717010140419} 12/24/2021 14:41:20 - INFO - codeparrot_training - Step 12402: {'lr': 0.0004293134094109732, 'samples': 6350336, 'steps': 12402, 'batch_loss/train': 0.7708416245877743} 12/24/2021 14:41:31 - INFO - codeparrot_training - Step 12403: {'lr': 0.00042930242511054985, 'samples': 6350848, 'steps': 12403, 'batch_loss/train': 0.7616898538544774} 12/24/2021 14:41:43 - INFO - codeparrot_training - Step 12404: {'lr': 0.00042929144009728776, 'samples': 6351360, 'steps': 12404, 'batch_loss/train': 0.8309040572494268} 12/24/2021 14:41:54 - INFO - codeparrot_training - Step 12405: {'lr': 0.00042928045437123056, 'samples': 6351872, 'steps': 12405, 'batch_loss/train': 0.818053456954658} 12/24/2021 14:42:04 - INFO - codeparrot_training - Step 12406: {'lr': 0.00042926946793242206, 'samples': 6352384, 'steps': 12406, 'batch_loss/train': 0.8208971414715052} 12/24/2021 14:42:15 - INFO - codeparrot_training - Step 12407: {'lr': 0.0004292584807809057, 'samples': 6352896, 
'steps': 12407, 'batch_loss/train': 0.8000459643080831} 12/24/2021 14:42:29 - INFO - codeparrot_training - Step 12408: {'lr': 0.00042924749291672545, 'samples': 6353408, 'steps': 12408, 'batch_loss/train': 0.869529961142689} 12/24/2021 14:42:39 - INFO - codeparrot_training - Step 12409: {'lr': 0.0004292365043399248, 'samples': 6353920, 'steps': 12409, 'batch_loss/train': 0.8928617369383574} 12/24/2021 14:42:50 - INFO - codeparrot_training - Step 12410: {'lr': 0.0004292255150505474, 'samples': 6354432, 'steps': 12410, 'batch_loss/train': 0.8330930266529322} 12/24/2021 14:43:02 - INFO - codeparrot_training - Step 12411: {'lr': 0.00042921452504863714, 'samples': 6354944, 'steps': 12411, 'batch_loss/train': 0.8879835773259401} 12/24/2021 14:43:13 - INFO - codeparrot_training - Step 12412: {'lr': 0.0004292035343342375, 'samples': 6355456, 'steps': 12412, 'batch_loss/train': 0.8154512764886022} 12/24/2021 14:43:23 - INFO - codeparrot_training - Step 12413: {'lr': 0.0004291925429073924, 'samples': 6355968, 'steps': 12413, 'batch_loss/train': 0.7850723941810429} 12/24/2021 14:43:36 - INFO - codeparrot_training - Step 12414: {'lr': 0.00042918155076814533, 'samples': 6356480, 'steps': 12414, 'batch_loss/train': 0.8165755672380328} 12/24/2021 14:43:46 - INFO - codeparrot_training - Step 12415: {'lr': 0.00042917055791654005, 'samples': 6356992, 'steps': 12415, 'batch_loss/train': 0.8653854383155704} 12/24/2021 14:43:57 - INFO - codeparrot_training - Step 12416: {'lr': 0.00042915956435262033, 'samples': 6357504, 'steps': 12416, 'batch_loss/train': 0.7915253937244415} 12/24/2021 14:44:11 - INFO - codeparrot_training - Step 12417: {'lr': 0.0004291485700764298, 'samples': 6358016, 'steps': 12417, 'batch_loss/train': 0.6017493405961432} 12/24/2021 14:44:21 - INFO - codeparrot_training - Step 12418: {'lr': 0.0004291375750880122, 'samples': 6358528, 'steps': 12418, 'batch_loss/train': 0.7540674274787307} 12/24/2021 14:44:32 - INFO - codeparrot_training - Step 12419: {'lr': 
0.0004291265793874113, 'samples': 6359040, 'steps': 12419, 'batch_loss/train': 0.8317429460585117} 12/24/2021 14:44:42 - INFO - codeparrot_training - Step 12420: {'lr': 0.0004291155829746707, 'samples': 6359552, 'steps': 12420, 'batch_loss/train': 0.8359288175124675} 12/24/2021 14:44:55 - INFO - codeparrot_training - Step 12421: {'lr': 0.0004291045858498341, 'samples': 6360064, 'steps': 12421, 'batch_loss/train': 0.8056728024967015} 12/24/2021 14:45:05 - INFO - codeparrot_training - Step 12422: {'lr': 0.0004290935880129454, 'samples': 6360576, 'steps': 12422, 'batch_loss/train': 0.797962804324925} 12/24/2021 14:45:16 - INFO - codeparrot_training - Step 12423: {'lr': 0.0004290825894640482, 'samples': 6361088, 'steps': 12423, 'batch_loss/train': 0.7938190503045917} 12/24/2021 14:45:28 - INFO - codeparrot_training - Step 12424: {'lr': 0.00042907159020318624, 'samples': 6361600, 'steps': 12424, 'batch_loss/train': 0.7552765402942896} 12/24/2021 14:45:39 - INFO - codeparrot_training - Step 12425: {'lr': 0.0004290605902304032, 'samples': 6362112, 'steps': 12425, 'batch_loss/train': 0.7254874566569924} 12/24/2021 14:45:49 - INFO - codeparrot_training - Step 12426: {'lr': 0.000429049589545743, 'samples': 6362624, 'steps': 12426, 'batch_loss/train': 0.7881489107385278} 12/24/2021 14:46:01 - INFO - codeparrot_training - Step 12427: {'lr': 0.0004290385881492491, 'samples': 6363136, 'steps': 12427, 'batch_loss/train': 0.821887026540935} 12/24/2021 14:46:12 - INFO - codeparrot_training - Step 12428: {'lr': 0.0004290275860409655, 'samples': 6363648, 'steps': 12428, 'batch_loss/train': 0.7941449906211346} 12/24/2021 14:46:23 - INFO - codeparrot_training - Step 12429: {'lr': 0.00042901658322093574, 'samples': 6364160, 'steps': 12429, 'batch_loss/train': 0.8985576704144478} 12/24/2021 14:46:33 - INFO - codeparrot_training - Step 12430: {'lr': 0.00042900557968920364, 'samples': 6364672, 'steps': 12430, 'batch_loss/train': 0.8640203410759568} 12/24/2021 14:46:47 - INFO - 
codeparrot_training - Step 12431: {'lr': 0.0004289945754458129, 'samples': 6365184, 'steps': 12431, 'batch_loss/train': 0.8400302901864052} 12/24/2021 14:46:58 - INFO - codeparrot_training - Step 12432: {'lr': 0.0004289835704908074, 'samples': 6365696, 'steps': 12432, 'batch_loss/train': 0.8091635387390852} 12/24/2021 14:47:08 - INFO - codeparrot_training - Step 12433: {'lr': 0.0004289725648242307, 'samples': 6366208, 'steps': 12433, 'batch_loss/train': 0.7840483714826405} 12/24/2021 14:47:20 - INFO - codeparrot_training - Step 12434: {'lr': 0.00042896155844612675, 'samples': 6366720, 'steps': 12434, 'batch_loss/train': 0.8793235425837338} 12/24/2021 14:47:31 - INFO - codeparrot_training - Step 12435: {'lr': 0.00042895055135653914, 'samples': 6367232, 'steps': 12435, 'batch_loss/train': 0.783255934715271} 12/24/2021 14:47:42 - INFO - codeparrot_training - Step 12436: {'lr': 0.0004289395435555118, 'samples': 6367744, 'steps': 12436, 'batch_loss/train': 0.8168025072664022} 12/24/2021 14:47:55 - INFO - codeparrot_training - Step 12437: {'lr': 0.00042892853504308836, 'samples': 6368256, 'steps': 12437, 'batch_loss/train': 0.7355806706473231} 12/24/2021 14:48:06 - INFO - codeparrot_training - Step 12438: {'lr': 0.00042891752581931264, 'samples': 6368768, 'steps': 12438, 'batch_loss/train': 0.833764361217618} 12/24/2021 14:48:17 - INFO - codeparrot_training - Step 12439: {'lr': 0.00042890651588422835, 'samples': 6369280, 'steps': 12439, 'batch_loss/train': 0.794440578436479} 12/24/2021 14:48:27 - INFO - codeparrot_training - Step 12440: {'lr': 0.00042889550523787936, 'samples': 6369792, 'steps': 12440, 'batch_loss/train': 0.9145561130717397} 12/24/2021 14:48:39 - INFO - codeparrot_training - Step 12441: {'lr': 0.00042888449388030935, 'samples': 6370304, 'steps': 12441, 'batch_loss/train': 0.825006281491369} 12/24/2021 14:48:50 - INFO - codeparrot_training - Step 12442: {'lr': 0.00042887348181156214, 'samples': 6370816, 'steps': 12442, 'batch_loss/train': 
0.8343889401294291} 12/24/2021 14:49:00 - INFO - codeparrot_training - Step 12443: {'lr': 0.0004288624690316815, 'samples': 6371328, 'steps': 12443, 'batch_loss/train': 0.8926136689260602} 12/24/2021 14:49:12 - INFO - codeparrot_training - Step 12444: {'lr': 0.0004288514555407113, 'samples': 6371840, 'steps': 12444, 'batch_loss/train': 0.7384015124989673} 12/24/2021 14:49:23 - INFO - codeparrot_training - Step 12445: {'lr': 0.00042884044133869513, 'samples': 6372352, 'steps': 12445, 'batch_loss/train': 0.9055578703992069} 12/24/2021 14:49:34 - INFO - codeparrot_training - Step 12446: {'lr': 0.0004288294264256769, 'samples': 6372864, 'steps': 12446, 'batch_loss/train': 0.7229413795284927} 12/24/2021 14:49:48 - INFO - codeparrot_training - Step 12447: {'lr': 0.00042881841080170046, 'samples': 6373376, 'steps': 12447, 'batch_loss/train': 0.7975387158803642} 12/24/2021 14:49:58 - INFO - codeparrot_training - Step 12448: {'lr': 0.0004288073944668095, 'samples': 6373888, 'steps': 12448, 'batch_loss/train': 0.8061680789105594} 12/24/2021 14:50:09 - INFO - codeparrot_training - Step 12449: {'lr': 0.00042879637742104787, 'samples': 6374400, 'steps': 12449, 'batch_loss/train': 0.8445382053032517} 12/24/2021 14:50:21 - INFO - codeparrot_training - Step 12450: {'lr': 0.0004287853596644594, 'samples': 6374912, 'steps': 12450, 'batch_loss/train': 0.8607777100987732} 12/24/2021 14:50:32 - INFO - codeparrot_training - Step 12451: {'lr': 0.0004287743411970878, 'samples': 6375424, 'steps': 12451, 'batch_loss/train': 0.7319486390333623} 12/24/2021 14:50:42 - INFO - codeparrot_training - Step 12452: {'lr': 0.0004287633220189769, 'samples': 6375936, 'steps': 12452, 'batch_loss/train': 0.9332597106695175} 12/24/2021 14:50:53 - INFO - codeparrot_training - Step 12453: {'lr': 0.00042875230213017056, 'samples': 6376448, 'steps': 12453, 'batch_loss/train': 0.8597526177763939} 12/24/2021 14:51:07 - INFO - codeparrot_training - Step 12454: {'lr': 0.0004287412815307126, 'samples': 6376960, 
'steps': 12454, 'batch_loss/train': 0.7001634244807065} 12/24/2021 14:51:17 - INFO - codeparrot_training - Step 12455: {'lr': 0.00042873026022064673, 'samples': 6377472, 'steps': 12455, 'batch_loss/train': 0.8199717933312058} 12/24/2021 14:51:28 - INFO - codeparrot_training - Step 12456: {'lr': 0.00042871923820001677, 'samples': 6377984, 'steps': 12456, 'batch_loss/train': 0.8612269423902035} 12/24/2021 14:51:40 - INFO - codeparrot_training - Step 12457: {'lr': 0.0004287082154688667, 'samples': 6378496, 'steps': 12457, 'batch_loss/train': 0.8541154004633427} 12/24/2021 14:51:51 - INFO - codeparrot_training - Step 12458: {'lr': 0.0004286971920272402, 'samples': 6379008, 'steps': 12458, 'batch_loss/train': 0.7874083679635078} 12/24/2021 14:52:01 - INFO - codeparrot_training - Step 12459: {'lr': 0.0004286861678751812, 'samples': 6379520, 'steps': 12459, 'batch_loss/train': 0.8218161724507809} 12/24/2021 14:52:13 - INFO - codeparrot_training - Step 12460: {'lr': 0.00042867514301273344, 'samples': 6380032, 'steps': 12460, 'batch_loss/train': 0.8971048789098859} 12/24/2021 14:52:24 - INFO - codeparrot_training - Step 12461: {'lr': 0.00042866411743994073, 'samples': 6380544, 'steps': 12461, 'batch_loss/train': 0.807311562821269} 12/24/2021 14:52:34 - INFO - codeparrot_training - Step 12462: {'lr': 0.00042865309115684697, 'samples': 6381056, 'steps': 12462, 'batch_loss/train': 0.8173784422688186} 12/24/2021 14:52:48 - INFO - codeparrot_training - Step 12463: {'lr': 0.00042864206416349604, 'samples': 6381568, 'steps': 12463, 'batch_loss/train': 0.8336331240134314} 12/24/2021 14:52:59 - INFO - codeparrot_training - Step 12464: {'lr': 0.0004286310364599316, 'samples': 6382080, 'steps': 12464, 'batch_loss/train': 0.767644037026912} 12/24/2021 14:53:10 - INFO - codeparrot_training - Step 12465: {'lr': 0.0004286200080461977, 'samples': 6382592, 'steps': 12465, 'batch_loss/train': 0.795863633044064} 12/24/2021 14:53:20 - INFO - codeparrot_training - Step 12466: {'lr': 
0.0004286089789223381, 'samples': 6383104, 'steps': 12466, 'batch_loss/train': 0.8848249709699303} 12/24/2021 14:53:32 - INFO - codeparrot_training - Step 12467: {'lr': 0.00042859794908839654, 'samples': 6383616, 'steps': 12467, 'batch_loss/train': 0.8330605169758201} 12/24/2021 14:53:43 - INFO - codeparrot_training - Step 12468: {'lr': 0.0004285869185444171, 'samples': 6384128, 'steps': 12468, 'batch_loss/train': 0.8444941518828273} 12/24/2021 14:53:53 - INFO - codeparrot_training - Step 12469: {'lr': 0.00042857588729044337, 'samples': 6384640, 'steps': 12469, 'batch_loss/train': 0.8321745526045561} 12/24/2021 14:54:05 - INFO - codeparrot_training - Step 12470: {'lr': 0.00042856485532651945, 'samples': 6385152, 'steps': 12470, 'batch_loss/train': 0.8132377900183201} 12/24/2021 14:54:16 - INFO - codeparrot_training - Step 12471: {'lr': 0.00042855382265268905, 'samples': 6385664, 'steps': 12471, 'batch_loss/train': 0.801152341067791} 12/24/2021 14:54:27 - INFO - codeparrot_training - Step 12472: {'lr': 0.00042854278926899604, 'samples': 6386176, 'steps': 12472, 'batch_loss/train': 0.8682806268334389} 12/24/2021 14:54:39 - INFO - codeparrot_training - Step 12473: {'lr': 0.00042853175517548433, 'samples': 6386688, 'steps': 12473, 'batch_loss/train': 0.8032119488343596} 12/24/2021 14:54:49 - INFO - codeparrot_training - Step 12474: {'lr': 0.0004285207203721978, 'samples': 6387200, 'steps': 12474, 'batch_loss/train': 0.7422875328920782} 12/24/2021 14:55:00 - INFO - codeparrot_training - Step 12475: {'lr': 0.0004285096848591803, 'samples': 6387712, 'steps': 12475, 'batch_loss/train': 0.7591179790906608} 12/24/2021 14:55:11 - INFO - codeparrot_training - Step 12476: {'lr': 0.0004284986486364756, 'samples': 6388224, 'steps': 12476, 'batch_loss/train': 0.7809181055054069} 12/24/2021 14:55:25 - INFO - codeparrot_training - Step 12477: {'lr': 0.00042848761170412775, 'samples': 6388736, 'steps': 12477, 'batch_loss/train': 0.8499645423144102} 12/24/2021 14:55:36 - INFO - 
codeparrot_training - Step 12478: {'lr': 0.00042847657406218056, 'samples': 6389248, 'steps': 12478, 'batch_loss/train': 0.9583186907693744} 12/24/2021 14:55:46 - INFO - codeparrot_training - Step 12479: {'lr': 0.0004284655357106779, 'samples': 6389760, 'steps': 12479, 'batch_loss/train': 0.8876091334968805} 12/24/2021 14:55:58 - INFO - codeparrot_training - Step 12480: {'lr': 0.0004284544966496636, 'samples': 6390272, 'steps': 12480, 'batch_loss/train': 0.7561095459386706} 12/24/2021 14:56:09 - INFO - codeparrot_training - Step 12481: {'lr': 0.0004284434568791817, 'samples': 6390784, 'steps': 12481, 'batch_loss/train': 0.8393377370666713} 12/24/2021 14:56:19 - INFO - codeparrot_training - Step 12482: {'lr': 0.00042843241639927587, 'samples': 6391296, 'steps': 12482, 'batch_loss/train': 0.7879965198226273} 12/24/2021 14:56:33 - INFO - codeparrot_training - Step 12483: {'lr': 0.00042842137520999026, 'samples': 6391808, 'steps': 12483, 'batch_loss/train': 0.9268497116863728} 12/24/2021 14:56:44 - INFO - codeparrot_training - Step 12484: {'lr': 0.0004284103333113686, 'samples': 6392320, 'steps': 12484, 'batch_loss/train': 0.8065941096283495} 12/24/2021 14:56:55 - INFO - codeparrot_training - Step 12485: {'lr': 0.00042839929070345473, 'samples': 6392832, 'steps': 12485, 'batch_loss/train': 0.7044813940301538} 12/24/2021 14:57:07 - INFO - codeparrot_training - Step 12486: {'lr': 0.0004283882473862927, 'samples': 6393344, 'steps': 12486, 'batch_loss/train': 0.8050214995164424} 12/24/2021 14:57:17 - INFO - codeparrot_training - Step 12487: {'lr': 0.00042837720335992636, 'samples': 6393856, 'steps': 12487, 'batch_loss/train': 0.7180748523678631} 12/24/2021 14:57:28 - INFO - codeparrot_training - Step 12488: {'lr': 0.0004283661586243996, 'samples': 6394368, 'steps': 12488, 'batch_loss/train': 0.7622534325346351} 12/24/2021 14:57:39 - INFO - codeparrot_training - Step 12489: {'lr': 0.00042835511317975637, 'samples': 6394880, 'steps': 12489, 'batch_loss/train': 
0.8191845621913671} 12/24/2021 14:57:51 - INFO - codeparrot_training - Step 12490: {'lr': 0.0004283440670260405, 'samples': 6395392, 'steps': 12490, 'batch_loss/train': 0.8970791446045041} 12/24/2021 14:58:01 - INFO - codeparrot_training - Step 12491: {'lr': 0.000428333020163296, 'samples': 6395904, 'steps': 12491, 'batch_loss/train': 1.001878778450191} 12/24/2021 14:58:12 - INFO - codeparrot_training - Step 12492: {'lr': 0.00042832197259156675, 'samples': 6396416, 'steps': 12492, 'batch_loss/train': 0.7568719228729606} 12/24/2021 14:58:26 - INFO - codeparrot_training - Step 12493: {'lr': 0.0004283109243108966, 'samples': 6396928, 'steps': 12493, 'batch_loss/train': 0.8483172729611397} 12/24/2021 14:58:37 - INFO - codeparrot_training - Step 12494: {'lr': 0.0004282998753213296, 'samples': 6397440, 'steps': 12494, 'batch_loss/train': 0.8277612710371614} 12/24/2021 14:58:47 - INFO - codeparrot_training - Step 12495: {'lr': 0.0004282888256229096, 'samples': 6397952, 'steps': 12495, 'batch_loss/train': 0.8111834598821588} 12/24/2021 14:58:59 - INFO - codeparrot_training - Step 12496: {'lr': 0.00042827777521568057, 'samples': 6398464, 'steps': 12496, 'batch_loss/train': 0.9000555835664272} 12/24/2021 14:59:10 - INFO - codeparrot_training - Step 12497: {'lr': 0.00042826672409968647, 'samples': 6398976, 'steps': 12497, 'batch_loss/train': 0.7856691861525178} 12/24/2021 14:59:20 - INFO - codeparrot_training - Step 12498: {'lr': 0.0004282556722749711, 'samples': 6399488, 'steps': 12498, 'batch_loss/train': 0.8893981771543622} 12/24/2021 14:59:31 - INFO - codeparrot_training - Step 12499: {'lr': 0.0004282446197415785, 'samples': 6400000, 'steps': 12499, 'batch_loss/train': 0.745213175425306} 12/24/2021 14:59:31 - INFO - codeparrot_training - Evaluating and saving model checkpoint 12/24/2021 15:02:52 - INFO - codeparrot_training - Step 12500: {'loss/eval': 0.8193633556365967, 'perplexity': 2.269054889678955} 12/24/2021 15:03:09 - WARNING - huggingface_hub.repository - Several 
commits (3) will be pushed upstream. 12/24/2021 15:03:24 - INFO - codeparrot_training - Step 12500: {'lr': 0.00042823356649955265, 'samples': 6400512, 'steps': 12500, 'batch_loss/train': 0.8593634327407926} 12/24/2021 15:03:35 - INFO - codeparrot_training - Step 12501: {'lr': 0.0004282225125489374, 'samples': 6401024, 'steps': 12501, 'batch_loss/train': 0.7067815940827131} 12/24/2021 15:03:45 - INFO - codeparrot_training - Step 12502: {'lr': 0.0004282114578897767, 'samples': 6401536, 'steps': 12502, 'batch_loss/train': 0.7497427952475846} 12/24/2021 15:04:00 - INFO - codeparrot_training - Step 12503: {'lr': 0.00042820040252211454, 'samples': 6402048, 'steps': 12503, 'batch_loss/train': 0.8325769518269226} 12/24/2021 15:04:10 - INFO - codeparrot_training - Step 12504: {'lr': 0.00042818934644599493, 'samples': 6402560, 'steps': 12504, 'batch_loss/train': 0.7711407681927085} 12/24/2021 15:04:21 - INFO - codeparrot_training - Step 12505: {'lr': 0.00042817828966146173, 'samples': 6403072, 'steps': 12505, 'batch_loss/train': 0.5320454495376907} 12/24/2021 15:04:33 - INFO - codeparrot_training - Step 12506: {'lr': 0.0004281672321685589, 'samples': 6403584, 'steps': 12506, 'batch_loss/train': 0.754804483614862} 12/24/2021 15:04:44 - INFO - codeparrot_training - Step 12507: {'lr': 0.0004281561739673304, 'samples': 6404096, 'steps': 12507, 'batch_loss/train': 0.8037345167249441} 12/24/2021 15:04:54 - INFO - codeparrot_training - Step 12508: {'lr': 0.0004281451150578203, 'samples': 6404608, 'steps': 12508, 'batch_loss/train': 0.7624505225103348} 12/24/2021 15:05:07 - INFO - codeparrot_training - Step 12509: {'lr': 0.00042813405544007245, 'samples': 6405120, 'steps': 12509, 'batch_loss/train': 0.836041453294456} 12/24/2021 15:05:17 - INFO - codeparrot_training - Step 12510: {'lr': 0.00042812299511413087, 'samples': 6405632, 'steps': 12510, 'batch_loss/train': 0.7718489300459623} 12/24/2021 15:05:28 - INFO - codeparrot_training - Step 12511: {'lr': 0.00042811193408003956, 
'samples': 6406144, 'steps': 12511, 'batch_loss/train': 0.7994400281459093} 12/24/2021 15:05:39 - INFO - codeparrot_training - Step 12512: {'lr': 0.00042810087233784237, 'samples': 6406656, 'steps': 12512, 'batch_loss/train': 0.8050190825015306} 12/24/2021 15:05:51 - INFO - codeparrot_training - Step 12513: {'lr': 0.0004280898098875834, 'samples': 6407168, 'steps': 12513, 'batch_loss/train': 0.6992241148836911} 12/24/2021 15:06:01 - INFO - codeparrot_training - Step 12514: {'lr': 0.00042807874672930657, 'samples': 6407680, 'steps': 12514, 'batch_loss/train': 0.71758639998734} 12/24/2021 15:06:12 - INFO - codeparrot_training - Step 12515: {'lr': 0.00042806768286305594, 'samples': 6408192, 'steps': 12515, 'batch_loss/train': 0.7814584905281663} 12/24/2021 15:06:26 - INFO - codeparrot_training - Step 12516: {'lr': 0.0004280566182888754, 'samples': 6408704, 'steps': 12516, 'batch_loss/train': 0.8314587259665132} 12/24/2021 15:06:36 - INFO - codeparrot_training - Step 12517: {'lr': 0.0004280455530068089, 'samples': 6409216, 'steps': 12517, 'batch_loss/train': 0.9352252334356308} 12/24/2021 15:06:47 - INFO - codeparrot_training - Step 12518: {'lr': 0.00042803448701690065, 'samples': 6409728, 'steps': 12518, 'batch_loss/train': 0.8647064417600632} 12/24/2021 15:06:58 - INFO - codeparrot_training - Step 12519: {'lr': 0.0004280234203191944, 'samples': 6410240, 'steps': 12519, 'batch_loss/train': 0.9176020724698901} 12/24/2021 15:07:10 - INFO - codeparrot_training - Step 12520: {'lr': 0.0004280123529137343, 'samples': 6410752, 'steps': 12520, 'batch_loss/train': 0.7387919737957418} 12/24/2021 15:07:20 - INFO - codeparrot_training - Step 12521: {'lr': 0.0004280012848005643, 'samples': 6411264, 'steps': 12521, 'batch_loss/train': 0.7734425188973546} 12/24/2021 15:07:31 - INFO - codeparrot_training - Step 12522: {'lr': 0.0004279902159797283, 'samples': 6411776, 'steps': 12522, 'batch_loss/train': 0.7120393645018339} 12/24/2021 15:07:45 - INFO - codeparrot_training - Step 12523: 
{'lr': 0.0004279791464512706, 'samples': 6412288, 'steps': 12523, 'batch_loss/train': 0.9026863630861044} 12/24/2021 15:07:55 - INFO - codeparrot_training - Step 12524: {'lr': 0.0004279680762152349, 'samples': 6412800, 'steps': 12524, 'batch_loss/train': 0.9006065241992474} 12/24/2021 15:08:06 - INFO - codeparrot_training - Step 12525: {'lr': 0.00042795700527166523, 'samples': 6413312, 'steps': 12525, 'batch_loss/train': 0.7988934030290693} 12/24/2021 15:08:18 - INFO - codeparrot_training - Step 12526: {'lr': 0.00042794593362060584, 'samples': 6413824, 'steps': 12526, 'batch_loss/train': 0.8117581093683839} 12/24/2021 15:08:29 - INFO - codeparrot_training - Step 12527: {'lr': 0.00042793486126210063, 'samples': 6414336, 'steps': 12527, 'batch_loss/train': 0.7496100584976375} 12/24/2021 15:08:39 - INFO - codeparrot_training - Step 12528: {'lr': 0.0004279237881961935, 'samples': 6414848, 'steps': 12528, 'batch_loss/train': 0.8645166782662272} 12/24/2021 15:08:50 - INFO - codeparrot_training - Step 12529: {'lr': 0.00042791271442292866, 'samples': 6415360, 'steps': 12529, 'batch_loss/train': 0.8367462414316833} 12/24/2021 15:09:02 - INFO - codeparrot_training - Step 12530: {'lr': 0.00042790163994235, 'samples': 6415872, 'steps': 12530, 'batch_loss/train': 1.0348610579967499} 12/24/2021 15:09:12 - INFO - codeparrot_training - Step 12531: {'lr': 0.00042789056475450157, 'samples': 6416384, 'steps': 12531, 'batch_loss/train': 0.7822382273152471} 12/24/2021 15:09:23 - INFO - codeparrot_training - Step 12532: {'lr': 0.0004278794888594274, 'samples': 6416896, 'steps': 12532, 'batch_loss/train': 0.936638574115932} 12/24/2021 15:09:37 - INFO - codeparrot_training - Step 12533: {'lr': 0.00042786841225717163, 'samples': 6417408, 'steps': 12533, 'batch_loss/train': 0.7991258660331368} 12/24/2021 15:09:47 - INFO - codeparrot_training - Step 12534: {'lr': 0.0004278573349477782, 'samples': 6417920, 'steps': 12534, 'batch_loss/train': 0.8300980255007744} 12/24/2021 15:09:58 - INFO - 
codeparrot_training - Step 12535: {'lr': 0.0004278462569312911, 'samples': 6418432, 'steps': 12535, 'batch_loss/train': 0.7783073745667934} 12/24/2021 15:10:10 - INFO - codeparrot_training - Step 12536: {'lr': 0.0004278351782077545, 'samples': 6418944, 'steps': 12536, 'batch_loss/train': 0.788862157613039} 12/24/2021 15:10:21 - INFO - codeparrot_training - Step 12537: {'lr': 0.0004278240987772123, 'samples': 6419456, 'steps': 12537, 'batch_loss/train': 0.788922606036067} 12/24/2021 15:10:31 - INFO - codeparrot_training - Step 12538: {'lr': 0.0004278130186397087, 'samples': 6419968, 'steps': 12538, 'batch_loss/train': 0.8325432632118464} 12/24/2021 15:10:44 - INFO - codeparrot_training - Step 12539: {'lr': 0.00042780193779528767, 'samples': 6420480, 'steps': 12539, 'batch_loss/train': 0.7274671760387719} 12/24/2021 15:10:54 - INFO - codeparrot_training - Step 12540: {'lr': 0.0004277908562439933, 'samples': 6420992, 'steps': 12540, 'batch_loss/train': 0.7741058515384793} 12/24/2021 15:11:05 - INFO - codeparrot_training - Step 12541: {'lr': 0.0004277797739858695, 'samples': 6421504, 'steps': 12541, 'batch_loss/train': 0.7834247495047748} 12/24/2021 15:11:15 - INFO - codeparrot_training - Step 12542: {'lr': 0.0004277686910209605, 'samples': 6422016, 'steps': 12542, 'batch_loss/train': 0.5988394973683171} 12/24/2021 15:11:28 - INFO - codeparrot_training - Step 12543: {'lr': 0.00042775760734931035, 'samples': 6422528, 'steps': 12543, 'batch_loss/train': 0.8744842577725649} 12/24/2021 15:11:38 - INFO - codeparrot_training - Step 12544: {'lr': 0.00042774652297096304, 'samples': 6423040, 'steps': 12544, 'batch_loss/train': 0.7917586546391249} 12/24/2021 15:11:49 - INFO - codeparrot_training - Step 12545: {'lr': 0.0004277354378859627, 'samples': 6423552, 'steps': 12545, 'batch_loss/train': 0.7943313712021336} 12/24/2021 15:12:03 - INFO - codeparrot_training - Step 12546: {'lr': 0.00042772435209435333, 'samples': 6424064, 'steps': 12546, 'batch_loss/train': 
0.8008491066284478} 12/24/2021 15:12:13 - INFO - codeparrot_training - Step 12547: {'lr': 0.0004277132655961791, 'samples': 6424576, 'steps': 12547, 'batch_loss/train': 0.7708990881219506} 12/24/2021 15:12:24 - INFO - codeparrot_training - Step 12548: {'lr': 0.00042770217839148395, 'samples': 6425088, 'steps': 12548, 'batch_loss/train': 1.0026995101943612} 12/24/2021 15:12:36 - INFO - codeparrot_training - Step 12549: {'lr': 0.00042769109048031205, 'samples': 6425600, 'steps': 12549, 'batch_loss/train': 0.881767921615392} 12/24/2021 15:12:47 - INFO - codeparrot_training - Step 12550: {'lr': 0.0004276800018627075, 'samples': 6426112, 'steps': 12550, 'batch_loss/train': 0.7471167128533125} 12/24/2021 15:12:57 - INFO - codeparrot_training - Step 12551: {'lr': 0.0004276689125387143, 'samples': 6426624, 'steps': 12551, 'batch_loss/train': 0.7414843237493187} 12/24/2021 15:13:08 - INFO - codeparrot_training - Step 12552: {'lr': 0.00042765782250837664, 'samples': 6427136, 'steps': 12552, 'batch_loss/train': 0.8894875384867191} 12/24/2021 15:13:21 - INFO - codeparrot_training - Step 12553: {'lr': 0.0004276467317717386, 'samples': 6427648, 'steps': 12553, 'batch_loss/train': 0.8328850325196981} 12/24/2021 15:13:32 - INFO - codeparrot_training - Step 12554: {'lr': 0.00042763564032884414, 'samples': 6428160, 'steps': 12554, 'batch_loss/train': 0.8170148031786084} 12/24/2021 15:13:43 - INFO - codeparrot_training - Step 12555: {'lr': 0.0004276245481797375, 'samples': 6428672, 'steps': 12555, 'batch_loss/train': 0.8414353891275823} 12/24/2021 15:13:55 - INFO - codeparrot_training - Step 12556: {'lr': 0.00042761345532446273, 'samples': 6429184, 'steps': 12556, 'batch_loss/train': 0.8538457024842501} 12/24/2021 15:14:05 - INFO - codeparrot_training - Step 12557: {'lr': 0.0004276023617630639, 'samples': 6429696, 'steps': 12557, 'batch_loss/train': 0.8339118333533406} 12/24/2021 15:14:16 - INFO - codeparrot_training - Step 12558: {'lr': 0.00042759126749558515, 'samples': 6430208, 
'steps': 12558, 'batch_loss/train': 0.8530780868604779} 12/24/2021 15:14:28 - INFO - codeparrot_training - Step 12559: {'lr': 0.0004275801725220705, 'samples': 6430720, 'steps': 12559, 'batch_loss/train': 0.8244136100402102} 12/24/2021 15:14:39 - INFO - codeparrot_training - Step 12560: {'lr': 0.0004275690768425643, 'samples': 6431232, 'steps': 12560, 'batch_loss/train': 0.8316478095948696} 12/24/2021 15:14:49 - INFO - codeparrot_training - Step 12561: {'lr': 0.00042755798045711037, 'samples': 6431744, 'steps': 12561, 'batch_loss/train': 0.833941025659442} 12/24/2021 15:15:03 - INFO - codeparrot_training - Step 12562: {'lr': 0.000427546883365753, 'samples': 6432256, 'steps': 12562, 'batch_loss/train': 0.7276930769439787} 12/24/2021 15:15:14 - INFO - codeparrot_training - Step 12563: {'lr': 0.0004275357855685362, 'samples': 6432768, 'steps': 12563, 'batch_loss/train': 1.7352975346148014} 12/24/2021 15:15:25 - INFO - codeparrot_training - Step 12564: {'lr': 0.0004275246870655043, 'samples': 6433280, 'steps': 12564, 'batch_loss/train': 0.8410628056153655} 12/24/2021 15:15:35 - INFO - codeparrot_training - Step 12565: {'lr': 0.0004275135878567011, 'samples': 6433792, 'steps': 12565, 'batch_loss/train': 1.4730480615980923} 12/24/2021 15:15:47 - INFO - codeparrot_training - Step 12566: {'lr': 0.000427502487942171, 'samples': 6434304, 'steps': 12566, 'batch_loss/train': 0.8459376911632717} 12/24/2021 15:15:58 - INFO - codeparrot_training - Step 12567: {'lr': 0.00042749138732195804, 'samples': 6434816, 'steps': 12567, 'batch_loss/train': 0.8043190236203372} 12/24/2021 15:16:08 - INFO - codeparrot_training - Step 12568: {'lr': 0.0004274802859961063, 'samples': 6435328, 'steps': 12568, 'batch_loss/train': 0.8507891995832324} 12/24/2021 15:16:22 - INFO - codeparrot_training - Step 12569: {'lr': 0.00042746918396465995, 'samples': 6435840, 'steps': 12569, 'batch_loss/train': 0.80434625223279} 12/24/2021 15:16:33 - INFO - codeparrot_training - Step 12570: {'lr': 
0.0004274580812276632, 'samples': 6436352, 'steps': 12570, 'batch_loss/train': 0.7979062153026462} 12/24/2021 15:16:44 - INFO - codeparrot_training - Step 12571: {'lr': 0.0004274469777851601, 'samples': 6436864, 'steps': 12571, 'batch_loss/train': 0.9130801339633763} 12/24/2021 15:16:54 - INFO - codeparrot_training - Step 12572: {'lr': 0.0004274358736371947, 'samples': 6437376, 'steps': 12572, 'batch_loss/train': 0.8188600828871131} 12/24/2021 15:17:06 - INFO - codeparrot_training - Step 12573: {'lr': 0.00042742476878381144, 'samples': 6437888, 'steps': 12573, 'batch_loss/train': 0.7786412108689547} 12/24/2021 15:17:17 - INFO - codeparrot_training - Step 12574: {'lr': 0.00042741366322505416, 'samples': 6438400, 'steps': 12574, 'batch_loss/train': 0.9284954508766532} 12/24/2021 15:17:28 - INFO - codeparrot_training - Step 12575: {'lr': 0.0004274025569609672, 'samples': 6438912, 'steps': 12575, 'batch_loss/train': 0.7526645252946764} 12/24/2021 15:17:40 - INFO - codeparrot_training - Step 12576: {'lr': 0.0004273914499915946, 'samples': 6439424, 'steps': 12576, 'batch_loss/train': 0.7894351202994585} 12/24/2021 15:17:50 - INFO - codeparrot_training - Step 12577: {'lr': 0.00042738034231698063, 'samples': 6439936, 'steps': 12577, 'batch_loss/train': 0.7836560728028417} 12/24/2021 15:18:01 - INFO - codeparrot_training - Step 12578: {'lr': 0.0004273692339371694, 'samples': 6440448, 'steps': 12578, 'batch_loss/train': 0.7091091936454177} 12/24/2021 15:18:15 - INFO - codeparrot_training - Step 12579: {'lr': 0.00042735812485220504, 'samples': 6440960, 'steps': 12579, 'batch_loss/train': 0.8197984588332474} 12/24/2021 15:18:26 - INFO - codeparrot_training - Step 12580: {'lr': 0.0004273470150621318, 'samples': 6441472, 'steps': 12580, 'batch_loss/train': 0.8144944543018937} 12/24/2021 15:18:36 - INFO - codeparrot_training - Step 12581: {'lr': 0.0004273359045669938, 'samples': 6441984, 'steps': 12581, 'batch_loss/train': 0.8112665330991149} 12/24/2021 15:18:47 - INFO - 
codeparrot_training - Step 12582: {'lr': 0.00042732479336683515, 'samples': 6442496, 'steps': 12582, 'batch_loss/train': 0.8013476715423167} 12/24/2021 15:18:59 - INFO - codeparrot_training - Step 12583: {'lr': 0.0004273136814617, 'samples': 6443008, 'steps': 12583, 'batch_loss/train': 0.6059524192241952} 12/24/2021 15:19:10 - INFO - codeparrot_training - Step 12584: {'lr': 0.00042730256885163275, 'samples': 6443520, 'steps': 12584, 'batch_loss/train': 0.8411729331128299} 12/24/2021 15:19:20 - INFO - codeparrot_training - Step 12585: {'lr': 0.0004272914555366774, 'samples': 6444032, 'steps': 12585, 'batch_loss/train': 0.893549625761807} 12/24/2021 15:19:33 - INFO - codeparrot_training - Step 12586: {'lr': 0.00042728034151687813, 'samples': 6444544, 'steps': 12586, 'batch_loss/train': 0.7435586103238165} 12/24/2021 15:19:43 - INFO - codeparrot_training - Step 12587: {'lr': 0.00042726922679227916, 'samples': 6445056, 'steps': 12587, 'batch_loss/train': 0.7921226331964135} 12/24/2021 15:19:54 - INFO - codeparrot_training - Step 12588: {'lr': 0.0004272581113629247, 'samples': 6445568, 'steps': 12588, 'batch_loss/train': 0.921039080247283} 12/24/2021 15:20:06 - INFO - codeparrot_training - Step 12589: {'lr': 0.0004272469952288589, 'samples': 6446080, 'steps': 12589, 'batch_loss/train': 0.6759250807808712} 12/24/2021 15:20:17 - INFO - codeparrot_training - Step 12590: {'lr': 0.000427235878390126, 'samples': 6446592, 'steps': 12590, 'batch_loss/train': 0.8440966885536909} 12/24/2021 15:20:27 - INFO - codeparrot_training - Step 12591: {'lr': 0.0004272247608467701, 'samples': 6447104, 'steps': 12591, 'batch_loss/train': 0.877419629599899} 12/24/2021 15:20:38 - INFO - codeparrot_training - Step 12592: {'lr': 0.00042721364259883554, 'samples': 6447616, 'steps': 12592, 'batch_loss/train': 0.8526618042960763} 12/24/2021 15:20:52 - INFO - codeparrot_training - Step 12593: {'lr': 0.00042720252364636635, 'samples': 6448128, 'steps': 12593, 'batch_loss/train': 0.8771097280550748} 
12/24/2021 15:21:02 - INFO - codeparrot_training - Step 12594: {'lr': 0.000427191403989407, 'samples': 6448640, 'steps': 12594, 'batch_loss/train': 0.7811001027002931} 12/24/2021 15:21:13 - INFO - codeparrot_training - Step 12595: {'lr': 0.00042718028362800144, 'samples': 6449152, 'steps': 12595, 'batch_loss/train': 0.7772481930442154} 12/24/2021 15:21:25 - INFO - codeparrot_training - Step 12596: {'lr': 0.00042716916256219393, 'samples': 6449664, 'steps': 12596, 'batch_loss/train': 0.8449200331233442} 12/24/2021 15:21:36 - INFO - codeparrot_training - Step 12597: {'lr': 0.0004271580407920288, 'samples': 6450176, 'steps': 12597, 'batch_loss/train': 0.7910690953722224} 12/24/2021 15:21:46 - INFO - codeparrot_training - Step 12598: {'lr': 0.00042714691831755014, 'samples': 6450688, 'steps': 12598, 'batch_loss/train': 0.6939439445268363} 12/24/2021 15:22:00 - INFO - codeparrot_training - Step 12599: {'lr': 0.0004271357951388023, 'samples': 6451200, 'steps': 12599, 'batch_loss/train': 0.8071074280887842} 12/24/2021 15:22:11 - INFO - codeparrot_training - Step 12600: {'lr': 0.0004271246712558293, 'samples': 6451712, 'steps': 12600, 'batch_loss/train': 0.6413073476869613} 12/24/2021 15:22:22 - INFO - codeparrot_training - Step 12601: {'lr': 0.00042711354666867557, 'samples': 6452224, 'steps': 12601, 'batch_loss/train': 0.7591525570023805} 12/24/2021 15:22:32 - INFO - codeparrot_training - Step 12602: {'lr': 0.0004271024213773853, 'samples': 6452736, 'steps': 12602, 'batch_loss/train': 0.790421596262604} 12/24/2021 15:22:44 - INFO - codeparrot_training - Step 12603: {'lr': 0.0004270912953820026, 'samples': 6453248, 'steps': 12603, 'batch_loss/train': 0.851144996471703} 12/24/2021 15:22:55 - INFO - codeparrot_training - Step 12604: {'lr': 0.0004270801686825717, 'samples': 6453760, 'steps': 12604, 'batch_loss/train': 0.7764520337805152} 12/24/2021 15:23:05 - INFO - codeparrot_training - Step 12605: {'lr': 0.00042706904127913704, 'samples': 6454272, 'steps': 12605, 
'batch_loss/train': 0.7731544822454453} 12/24/2021 15:23:17 - INFO - codeparrot_training - Step 12606: {'lr': 0.0004270579131717427, 'samples': 6454784, 'steps': 12606, 'batch_loss/train': 0.7640663693891838} 12/24/2021 15:23:28 - INFO - codeparrot_training - Step 12607: {'lr': 0.000427046784360433, 'samples': 6455296, 'steps': 12607, 'batch_loss/train': 0.8692188682034612} 12/24/2021 15:23:39 - INFO - codeparrot_training - Step 12608: {'lr': 0.000427035654845252, 'samples': 6455808, 'steps': 12608, 'batch_loss/train': 0.9442226234823465} 12/24/2021 15:23:52 - INFO - codeparrot_training - Step 12609: {'lr': 0.0004270245246262442, 'samples': 6456320, 'steps': 12609, 'batch_loss/train': 0.855852272361517} 12/24/2021 15:24:03 - INFO - codeparrot_training - Step 12610: {'lr': 0.0004270133937034537, 'samples': 6456832, 'steps': 12610, 'batch_loss/train': 0.8145561208948493} 12/24/2021 15:24:14 - INFO - codeparrot_training - Step 12611: {'lr': 0.0004270022620769248, 'samples': 6457344, 'steps': 12611, 'batch_loss/train': 0.8343264537397772} 12/24/2021 15:24:24 - INFO - codeparrot_training - Step 12612: {'lr': 0.00042699112974670166, 'samples': 6457856, 'steps': 12612, 'batch_loss/train': 0.8086019335314631} 12/24/2021 15:24:37 - INFO - codeparrot_training - Step 12613: {'lr': 0.0004269799967128287, 'samples': 6458368, 'steps': 12613, 'batch_loss/train': 0.8558925767429173} 12/24/2021 15:24:47 - INFO - codeparrot_training - Step 12614: {'lr': 0.0004269688629753501, 'samples': 6458880, 'steps': 12614, 'batch_loss/train': 0.7844953574240208} 12/24/2021 15:24:58 - INFO - codeparrot_training - Step 12615: {'lr': 0.0004269577285343101, 'samples': 6459392, 'steps': 12615, 'batch_loss/train': 0.7611038484610617} 12/24/2021 15:25:10 - INFO - codeparrot_training - Step 12616: {'lr': 0.00042694659338975296, 'samples': 6459904, 'steps': 12616, 'batch_loss/train': 0.7625792613252997} 12/24/2021 15:25:21 - INFO - codeparrot_training - Step 12617: {'lr': 0.000426935457541723, 
'samples': 6460416, 'steps': 12617, 'batch_loss/train': 0.8155375448986888} 12/24/2021 15:25:32 - INFO - codeparrot_training - Step 12618: {'lr': 0.00042692432099026445, 'samples': 6460928, 'steps': 12618, 'batch_loss/train': 0.7241238900460303} 12/24/2021 15:25:42 - INFO - codeparrot_training - Step 12619: {'lr': 0.00042691318373542164, 'samples': 6461440, 'steps': 12619, 'batch_loss/train': 0.9593733809888363} 12/24/2021 15:25:54 - INFO - codeparrot_training - Step 12620: {'lr': 0.00042690204577723877, 'samples': 6461952, 'steps': 12620, 'batch_loss/train': 0.9320247764699161} 12/24/2021 15:26:05 - INFO - codeparrot_training - Step 12621: {'lr': 0.00042689090711576025, 'samples': 6462464, 'steps': 12621, 'batch_loss/train': 0.7988232458010316} 12/24/2021 15:26:16 - INFO - codeparrot_training - Step 12622: {'lr': 0.0004268797677510302, 'samples': 6462976, 'steps': 12622, 'batch_loss/train': 0.9186774916015565} 12/24/2021 15:26:29 - INFO - codeparrot_training - Step 12623: {'lr': 0.00042686862768309295, 'samples': 6463488, 'steps': 12623, 'batch_loss/train': 0.8664492834359407} 12/24/2021 15:26:40 - INFO - codeparrot_training - Step 12624: {'lr': 0.00042685748691199287, 'samples': 6464000, 'steps': 12624, 'batch_loss/train': 0.7460404895246029} 12/24/2021 15:26:51 - INFO - codeparrot_training - Step 12625: {'lr': 0.00042684634543777424, 'samples': 6464512, 'steps': 12625, 'batch_loss/train': 0.92150027025491} 12/24/2021 15:27:03 - INFO - codeparrot_training - Step 12626: {'lr': 0.0004268352032604812, 'samples': 6465024, 'steps': 12626, 'batch_loss/train': 0.8366102380678058} 12/24/2021 15:27:14 - INFO - codeparrot_training - Step 12627: {'lr': 0.00042682406038015825, 'samples': 6465536, 'steps': 12627, 'batch_loss/train': 0.8040793659165502} 12/24/2021 15:27:24 - INFO - codeparrot_training - Step 12628: {'lr': 0.0004268129167968496, 'samples': 6466048, 'steps': 12628, 'batch_loss/train': 0.8458579108119011} 12/24/2021 15:27:35 - INFO - codeparrot_training - Step 
12629: {'lr': 0.00042680177251059947, 'samples': 6466560, 'steps': 12629, 'batch_loss/train': 0.9221493825316429} 12/24/2021 15:27:49 - INFO - codeparrot_training - Step 12630: {'lr': 0.00042679062752145236, 'samples': 6467072, 'steps': 12630, 'batch_loss/train': 2.394324777647853} 12/24/2021 15:28:00 - INFO - codeparrot_training - Step 12631: {'lr': 0.0004267794818294525, 'samples': 6467584, 'steps': 12631, 'batch_loss/train': 0.8000651281327009} 12/24/2021 15:28:10 - INFO - codeparrot_training - Step 12632: {'lr': 0.000426768335434644, 'samples': 6468096, 'steps': 12632, 'batch_loss/train': 0.8566823611035943} 12/24/2021 15:28:22 - INFO - codeparrot_training - Step 12633: {'lr': 0.00042675718833707146, 'samples': 6468608, 'steps': 12633, 'batch_loss/train': 0.8347082762047648} 12/24/2021 15:28:33 - INFO - codeparrot_training - Step 12634: {'lr': 0.00042674604053677905, 'samples': 6469120, 'steps': 12634, 'batch_loss/train': 0.8318511140532792} 12/24/2021 15:28:44 - INFO - codeparrot_training - Step 12635: {'lr': 0.0004267348920338111, 'samples': 6469632, 'steps': 12635, 'batch_loss/train': 0.8622719384729862} 12/24/2021 15:28:54 - INFO - codeparrot_training - Step 12636: {'lr': 0.00042672374282821203, 'samples': 6470144, 'steps': 12636, 'batch_loss/train': 0.7548525109887123} 12/24/2021 15:29:06 - INFO - codeparrot_training - Step 12637: {'lr': 0.000426712592920026, 'samples': 6470656, 'steps': 12637, 'batch_loss/train': 0.8233505475800484} 12/24/2021 15:29:17 - INFO - codeparrot_training - Step 12638: {'lr': 0.00042670144230929746, 'samples': 6471168, 'steps': 12638, 'batch_loss/train': 0.8472527214325964} 12/24/2021 15:29:27 - INFO - codeparrot_training - Step 12639: {'lr': 0.0004266902909960707, 'samples': 6471680, 'steps': 12639, 'batch_loss/train': 0.9356502238661051} 12/24/2021 15:29:41 - INFO - codeparrot_training - Step 12640: {'lr': 0.00042667913898039014, 'samples': 6472192, 'steps': 12640, 'batch_loss/train': 0.7816589353606105} 12/24/2021 15:29:52 - 
INFO - codeparrot_training - Step 12641: {'lr': 0.0004266679862622999, 'samples': 6472704, 'steps': 12641, 'batch_loss/train': 0.8311442490667105} 12/24/2021 15:30:03 - INFO - codeparrot_training - Step 12642: {'lr': 0.0004266568328418445, 'samples': 6473216, 'steps': 12642, 'batch_loss/train': 0.7273613173747435} 12/24/2021 15:30:15 - INFO - codeparrot_training - Step 12643: {'lr': 0.0004266456787190682, 'samples': 6473728, 'steps': 12643, 'batch_loss/train': 0.9173296145163476} 12/24/2021 15:30:26 - INFO - codeparrot_training - Step 12644: {'lr': 0.00042663452389401547, 'samples': 6474240, 'steps': 12644, 'batch_loss/train': 0.7772900313138962} 12/24/2021 15:30:36 - INFO - codeparrot_training - Step 12645: {'lr': 0.00042662336836673056, 'samples': 6474752, 'steps': 12645, 'batch_loss/train': 1.0938143348321319} 12/24/2021 15:30:50 - INFO - codeparrot_training - Step 12646: {'lr': 0.00042661221213725783, 'samples': 6475264, 'steps': 12646, 'batch_loss/train': 0.8102797737810761} 12/24/2021 15:31:01 - INFO - codeparrot_training - Step 12647: {'lr': 0.0004266010552056416, 'samples': 6475776, 'steps': 12647, 'batch_loss/train': 0.849279472604394} 12/24/2021 15:31:11 - INFO - codeparrot_training - Step 12648: {'lr': 0.00042658989757192624, 'samples': 6476288, 'steps': 12648, 'batch_loss/train': 1.0747971422970295} 12/24/2021 15:31:22 - INFO - codeparrot_training - Step 12649: {'lr': 0.00042657873923615606, 'samples': 6476800, 'steps': 12649, 'batch_loss/train': 0.8088468257337809} 12/24/2021 15:31:34 - INFO - codeparrot_training - Step 12650: {'lr': 0.0004265675801983756, 'samples': 6477312, 'steps': 12650, 'batch_loss/train': 0.8456621773075312} 12/24/2021 15:31:45 - INFO - codeparrot_training - Step 12651: {'lr': 0.0004265564204586291, 'samples': 6477824, 'steps': 12651, 'batch_loss/train': 0.863393499981612} 12/24/2021 15:31:55 - INFO - codeparrot_training - Step 12652: {'lr': 0.00042654526001696094, 'samples': 6478336, 'steps': 12652, 'batch_loss/train': 
0.9086840711534023}
12/24/2021 15:32:07 - INFO - codeparrot_training - Step 12653: {'lr': 0.00042653409887341543, 'samples': 6478848, 'steps': 12653, 'batch_loss/train': 0.7863477990031242}
12/24/2021 15:32:18 - INFO - codeparrot_training - Step 12654: {'lr': 0.000426522937028037, 'samples': 6479360, 'steps': 12654, 'batch_loss/train': 0.820214862935245}
12/24/2021 15:32:29 - INFO - codeparrot_training - Step 12655: {'lr': 0.0004265117744808701, 'samples': 6479872, 'steps': 12655, 'batch_loss/train': 0.8802738226950169}
12/24/2021 15:32:42 - INFO - codeparrot_training - Step 12656: {'lr': 0.00042650061123195894, 'samples': 6480384, 'steps': 12656, 'batch_loss/train': 0.8391699083149433}
12/24/2021 15:32:53 - INFO - codeparrot_training - Step 12657: {'lr': 0.0004264894472813481, 'samples': 6480896, 'steps': 12657, 'batch_loss/train': 0.8960793800652027}
12/24/2021 15:33:04 - INFO - codeparrot_training - Step 12658: {'lr': 0.00042647828262908175, 'samples': 6481408, 'steps': 12658, 'batch_loss/train': 0.8040118655189872}
12/24/2021 15:33:14 - INFO - codeparrot_training - Step 12659: {'lr': 0.00042646711727520445, 'samples': 6481920, 'steps': 12659, 'batch_loss/train': 0.8710431689396501}
12/24/2021 15:33:27 - INFO - codeparrot_training - Step 12660: {'lr': 0.0004264559512197605, 'samples': 6482432, 'steps': 12660, 'batch_loss/train': 0.8073851075023413}
12/24/2021 15:33:37 - INFO - codeparrot_training - Step 12661: {'lr': 0.00042644478446279426, 'samples': 6482944, 'steps': 12661, 'batch_loss/train': 0.8246556045487523}
12/24/2021 15:33:48 - INFO - codeparrot_training - Step 12662: {'lr': 0.0004264336170043502, 'samples': 6483456, 'steps': 12662, 'batch_loss/train': 0.8643769966438413}
12/24/2021 15:34:00 - INFO - codeparrot_training - Step 12663: {'lr': 0.0004264224488444728, 'samples': 6483968, 'steps': 12663, 'batch_loss/train': 0.8250672835856676}
12/24/2021 15:34:11 - INFO - codeparrot_training - Step 12664: {'lr': 0.0004264112799832063, 'samples': 6484480, 'steps': 12664, 'batch_loss/train': 0.8980627553537488}
12/24/2021 15:34:21 - INFO - codeparrot_training - Step 12665: {'lr': 0.0004264001104205951, 'samples': 6484992, 'steps': 12665, 'batch_loss/train': 0.8328432757407427}
12/24/2021 15:34:33 - INFO - codeparrot_training - Step 12666: {'lr': 0.0004263889401566838, 'samples': 6485504, 'steps': 12666, 'batch_loss/train': 0.7941936503630131}
12/24/2021 15:34:44 - INFO - codeparrot_training - Step 12667: {'lr': 0.0004263777691915166, 'samples': 6486016, 'steps': 12667, 'batch_loss/train': 0.7961399145424366}
12/24/2021 15:34:54 - INFO - codeparrot_training - Step 12668: {'lr': 0.00042636659752513795, 'samples': 6486528, 'steps': 12668, 'batch_loss/train': 0.7109446451067924}
12/24/2021 15:35:05 - INFO - codeparrot_training - Step 12669: {'lr': 0.00042635542515759235, 'samples': 6487040, 'steps': 12669, 'batch_loss/train': 0.8301319610327482}
12/24/2021 15:35:19 - INFO - codeparrot_training - Step 12670: {'lr': 0.0004263442520889241, 'samples': 6487552, 'steps': 12670, 'batch_loss/train': 0.8594667594879866}
12/24/2021 15:35:30 - INFO - codeparrot_training - Step 12671: {'lr': 0.0004263330783191778, 'samples': 6488064, 'steps': 12671, 'batch_loss/train': 0.8265599743463099}
12/24/2021 15:35:40 - INFO - codeparrot_training - Step 12672: {'lr': 0.00042632190384839767, 'samples': 6488576, 'steps': 12672, 'batch_loss/train': 0.9004784114658833}
12/24/2021 15:35:52 - INFO - codeparrot_training - Step 12673: {'lr': 0.00042631072867662837, 'samples': 6489088, 'steps': 12673, 'batch_loss/train': 0.8903949270024896}
12/24/2021 15:36:03 - INFO - codeparrot_training - Step 12674: {'lr': 0.00042629955280391396, 'samples': 6489600, 'steps': 12674, 'batch_loss/train': 0.9640490128658712}
12/24/2021 15:36:13 - INFO - codeparrot_training - Step 12675: {'lr': 0.0004262883762302993, 'samples': 6490112, 'steps': 12675, 'batch_loss/train': 0.7534837296698242}
12/24/2021 15:36:26 - INFO - codeparrot_training - Step 12676: {'lr': 0.0004262771989558285, 'samples': 6490624, 'steps': 12676, 'batch_loss/train': 0.8838881365954876}
12/24/2021 15:36:36 - INFO - codeparrot_training - Step 12677: {'lr': 0.00042626602098054615, 'samples': 6491136, 'steps': 12677, 'batch_loss/train': 0.8567611183971167}
12/24/2021 15:36:47 - INFO - codeparrot_training - Step 12678: {'lr': 0.00042625484230449665, 'samples': 6491648, 'steps': 12678, 'batch_loss/train': 0.7626380105502903}
12/24/2021 15:37:01 - INFO - codeparrot_training - Step 12679: {'lr': 0.00042624366292772447, 'samples': 6492160, 'steps': 12679, 'batch_loss/train': 0.6806810586713254}
12/24/2021 15:37:12 - INFO - codeparrot_training - Step 12680: {'lr': 0.000426232482850274, 'samples': 6492672, 'steps': 12680, 'batch_loss/train': 0.7608084445819259}
12/24/2021 15:37:22 - INFO - codeparrot_training - Step 12681: {'lr': 0.0004262213020721898, 'samples': 6493184, 'steps': 12681, 'batch_loss/train': 0.8114660084247589}
12/24/2021 15:37:33 - INFO - codeparrot_training - Step 12682: {'lr': 0.0004262101205935162, 'samples': 6493696, 'steps': 12682, 'batch_loss/train': 0.6110063777305186}
12/24/2021 15:37:45 - INFO - codeparrot_training - Step 12683: {'lr': 0.0004261989384142976, 'samples': 6494208, 'steps': 12683, 'batch_loss/train': 0.8294248105958104}
12/24/2021 15:37:55 - INFO - codeparrot_training - Step 12684: {'lr': 0.00042618775553457866, 'samples': 6494720, 'steps': 12684, 'batch_loss/train': 0.7517662858590484}
12/24/2021 15:38:06 - INFO - codeparrot_training - Step 12685: {'lr': 0.0004261765719544036, 'samples': 6495232, 'steps': 12685, 'batch_loss/train': 0.919594987295568}
12/24/2021 15:38:20 - INFO - codeparrot_training - Step 12686: {'lr': 0.0004261653876738172, 'samples': 6495744, 'steps': 12686, 'batch_loss/train': 0.7763039005221799}
12/24/2021 15:38:30 - INFO - codeparrot_training - Step 12687: {'lr': 0.00042615420269286367, 'samples': 6496256, 'steps': 12687, 'batch_loss/train': 0.8003615217749029}
12/24/2021 15:38:41 - INFO - codeparrot_training - Step 12688: {'lr': 0.0004261430170115875, 'samples': 6496768, 'steps': 12688, 'batch_loss/train': 0.8178893560543656}
12/24/2021 15:38:54 - INFO - codeparrot_training - Step 12689: {'lr': 0.0004261318306300332, 'samples': 6497280, 'steps': 12689, 'batch_loss/train': 0.8015596193727106}
12/24/2021 15:39:04 - INFO - codeparrot_training - Step 12690: {'lr': 0.00042612064354824523, 'samples': 6497792, 'steps': 12690, 'batch_loss/train': 0.7825410077348351}
12/24/2021 15:39:15 - INFO - codeparrot_training - Step 12691: {'lr': 0.00042610945576626813, 'samples': 6498304, 'steps': 12691, 'batch_loss/train': 1.1071297666057944}
12/24/2021 15:39:25 - INFO - codeparrot_training - Step 12692: {'lr': 0.00042609826728414634, 'samples': 6498816, 'steps': 12692, 'batch_loss/train': 0.917801265604794}
12/24/2021 15:39:37 - INFO - codeparrot_training - Step 12693: {'lr': 0.00042608707810192425, 'samples': 6499328, 'steps': 12693, 'batch_loss/train': 0.7874202290549874}
12/24/2021 15:39:48 - INFO - codeparrot_training - Step 12694: {'lr': 0.00042607588821964653, 'samples': 6499840, 'steps': 12694, 'batch_loss/train': 0.8696433985605836}
12/24/2021 15:39:59 - INFO - codeparrot_training - Step 12695: {'lr': 0.0004260646976373576, 'samples': 6500352, 'steps': 12695, 'batch_loss/train': 0.6950437125051394}
12/24/2021 15:40:12 - INFO - codeparrot_training - Step 12696: {'lr': 0.00042605350635510175, 'samples': 6500864, 'steps': 12696, 'batch_loss/train': 0.8192175021395087}
12/24/2021 15:40:23 - INFO - codeparrot_training - Step 12697: {'lr': 0.00042604231437292375, 'samples': 6501376, 'steps': 12697, 'batch_loss/train': 0.7752391751855612}
12/24/2021 15:40:34 - INFO - codeparrot_training - Step 12698: {'lr': 0.0004260311216908679, 'samples': 6501888, 'steps': 12698, 'batch_loss/train': 0.7753578573465347}
12/24/2021 15:40:46 - INFO - codeparrot_training - Step 12699: {'lr': 0.00042601992830897887, 'samples': 6502400, 'steps': 12699, 'batch_loss/train': 0.8408265884499997}
12/24/2021 15:40:56 - INFO - codeparrot_training - Step 12700: {'lr': 0.00042600873422730094, 'samples': 6502912, 'steps': 12700, 'batch_loss/train': 0.6720321490429342}
12/24/2021 15:41:07 - INFO - codeparrot_training - Step 12701: {'lr': 0.0004259975394458788, 'samples': 6503424, 'steps': 12701, 'batch_loss/train': 0.7840562369674444}
12/24/2021 15:41:18 - INFO - codeparrot_training - Step 12702: {'lr': 0.000425986343964757, 'samples': 6503936, 'steps': 12702, 'batch_loss/train': 0.8313891720026731}
12/24/2021 15:41:30 - INFO - codeparrot_training - Step 12703: {'lr': 0.0004259751477839798, 'samples': 6504448, 'steps': 12703, 'batch_loss/train': 0.9774435837753117}
12/24/2021 15:41:40 - INFO - codeparrot_training - Step 12704: {'lr': 0.000425963950903592, 'samples': 6504960, 'steps': 12704, 'batch_loss/train': 0.8382281444501132}
12/24/2021 15:41:51 - INFO - codeparrot_training - Step 12705: {'lr': 0.00042595275332363785, 'samples': 6505472, 'steps': 12705, 'batch_loss/train': 0.7529152799397707}
12/24/2021 15:42:05 - INFO - codeparrot_training - Step 12706: {'lr': 0.000425941555044162, 'samples': 6505984, 'steps': 12706, 'batch_loss/train': 0.7596880234777927}
12/24/2021 15:42:16 - INFO - codeparrot_training - Step 12707: {'lr': 0.00042593035606520904, 'samples': 6506496, 'steps': 12707, 'batch_loss/train': 0.8207313334569335}
12/24/2021 15:42:26 - INFO - codeparrot_training - Step 12708: {'lr': 0.00042591915638682333, 'samples': 6507008, 'steps': 12708, 'batch_loss/train': 0.8979011261835694}
12/24/2021 15:42:38 - INFO - codeparrot_training - Step 12709: {'lr': 0.00042590795600904954, 'samples': 6507520, 'steps': 12709, 'batch_loss/train': 0.6501489887014031}
12/24/2021 15:42:49 - INFO - codeparrot_training - Step 12710: {'lr': 0.0004258967549319321, 'samples': 6508032, 'steps': 12710, 'batch_loss/train': 0.7392860548570752}
12/24/2021 15:42:59 - INFO - codeparrot_training - Step 12711: {'lr': 0.00042588555315551557, 'samples': 6508544, 'steps': 12711, 'batch_loss/train': 0.9209254134912044}
12/24/2021 15:43:12 - INFO - codeparrot_training - Step 12712: {'lr': 0.00042587435067984456, 'samples': 6509056, 'steps': 12712, 'batch_loss/train': 0.8230609287275001}
12/24/2021 15:43:22 - INFO - codeparrot_training - Step 12713: {'lr': 0.00042586314750496344, 'samples': 6509568, 'steps': 12713, 'batch_loss/train': 0.7564549492672086}
12/24/2021 15:43:33 - INFO - codeparrot_training - Step 12714: {'lr': 0.00042585194363091693, 'samples': 6510080, 'steps': 12714, 'batch_loss/train': 0.709659906104207}
12/24/2021 15:43:44 - INFO - codeparrot_training - Step 12715: {'lr': 0.00042584073905774944, 'samples': 6510592, 'steps': 12715, 'batch_loss/train': 0.7618708619847894}
12/24/2021 15:43:58 - INFO - codeparrot_training - Step 12716: {'lr': 0.0004258295337855056, 'samples': 6511104, 'steps': 12716, 'batch_loss/train': 0.8756120847538114}
12/24/2021 15:44:08 - INFO - codeparrot_training - Step 12717: {'lr': 0.0004258183278142299, 'samples': 6511616, 'steps': 12717, 'batch_loss/train': 0.7428342083003372}
12/24/2021 15:44:19 - INFO - codeparrot_training - Step 12718: {'lr': 0.000425807121143967, 'samples': 6512128, 'steps': 12718, 'batch_loss/train': 0.8353196093812585}
12/24/2021 15:44:31 - INFO - codeparrot_training - Step 12719: {'lr': 0.00042579591377476123, 'samples': 6512640, 'steps': 12719, 'batch_loss/train': 0.8151398524641991}
12/24/2021 15:44:42 - INFO - codeparrot_training - Step 12720: {'lr': 0.0004257847057066574, 'samples': 6513152, 'steps': 12720, 'batch_loss/train': 0.673821653239429}
12/24/2021 15:44:52 - INFO - codeparrot_training - Step 12721: {'lr': 0.0004257734969396999, 'samples': 6513664, 'steps': 12721, 'batch_loss/train': 0.8814121698960662}
12/24/2021 15:45:04 - INFO - codeparrot_training - Step 12722: {'lr': 0.0004257622874739333, 'samples': 6514176, 'steps': 12722, 'batch_loss/train': 0.7826341334730387}
12/24/2021 15:45:15 - INFO - codeparrot_training - Step 12723: {'lr': 0.0004257510773094023, 'samples': 6514688, 'steps': 12723, 'batch_loss/train': 0.8300854614935815}
12/24/2021 15:45:26 - INFO - codeparrot_training - Step 12724: {'lr': 0.00042573986644615133, 'samples': 6515200, 'steps': 12724, 'batch_loss/train': 0.7651649974286556}
12/24/2021 15:45:36 - INFO - codeparrot_training - Step 12725: {'lr': 0.000425728654884225, 'samples': 6515712, 'steps': 12725, 'batch_loss/train': 0.9484895491041243}
12/24/2021 15:45:50 - INFO - codeparrot_training - Step 12726: {'lr': 0.0004257174426236679, 'samples': 6516224, 'steps': 12726, 'batch_loss/train': 0.8161309584975243}
12/24/2021 15:46:01 - INFO - codeparrot_training - Step 12727: {'lr': 0.0004257062296645247, 'samples': 6516736, 'steps': 12727, 'batch_loss/train': 0.7853889411780983}
12/24/2021 15:46:11 - INFO - codeparrot_training - Step 12728: {'lr': 0.0004256950160068397, 'samples': 6517248, 'steps': 12728, 'batch_loss/train': 1.9278644304722548}
12/24/2021 15:46:23 - INFO - codeparrot_training - Step 12729: {'lr': 0.00042568380165065774, 'samples': 6517760, 'steps': 12729, 'batch_loss/train': 0.8399738287553191}
12/24/2021 15:46:34 - INFO - codeparrot_training - Step 12730: {'lr': 0.00042567258659602326, 'samples': 6518272, 'steps': 12730, 'batch_loss/train': 0.8110517105087638}
12/24/2021 15:46:45 - INFO - codeparrot_training - Step 12731: {'lr': 0.00042566137084298103, 'samples': 6518784, 'steps': 12731, 'batch_loss/train': 0.8210046254098415}
12/24/2021 15:46:58 - INFO - codeparrot_training - Step 12732: {'lr': 0.0004256501543915754, 'samples': 6519296, 'steps': 12732, 'batch_loss/train': 0.8385020447894931}
12/24/2021 15:47:09 - INFO - codeparrot_training - Step 12733: {'lr': 0.0004256389372418511, 'samples': 6519808, 'steps': 12733, 'batch_loss/train': 0.909946602769196}
12/24/2021 15:47:20 - INFO - codeparrot_training - Step 12734: {'lr': 0.0004256277193938528, 'samples': 6520320, 'steps': 12734, 'batch_loss/train': 0.8419289598241448}
12/24/2021 15:47:30 - INFO - codeparrot_training - Step 12735: {'lr': 0.00042561650084762496, 'samples': 6520832, 'steps': 12735, 'batch_loss/train': 0.8131282413378358}
12/24/2021 15:47:42 - INFO - codeparrot_training - Step 12736: {'lr': 0.00042560528160321217, 'samples': 6521344, 'steps': 12736, 'batch_loss/train': 0.8101909919641912}
12/24/2021 15:47:53 - INFO - codeparrot_training - Step 12737: {'lr': 0.00042559406166065906, 'samples': 6521856, 'steps': 12737, 'batch_loss/train': 0.7043209981638938}
12/24/2021 15:48:03 - INFO - codeparrot_training - Step 12738: {'lr': 0.00042558284102001034, 'samples': 6522368, 'steps': 12738, 'batch_loss/train': 0.7476955850142986}
12/24/2021 15:48:16 - INFO - codeparrot_training - Step 12739: {'lr': 0.0004255716196813105, 'samples': 6522880, 'steps': 12739, 'batch_loss/train': 0.9167607808485627}
12/24/2021 15:48:26 - INFO - codeparrot_training - Step 12740: {'lr': 0.00042556039764460427, 'samples': 6523392, 'steps': 12740, 'batch_loss/train': 0.7797140325419605}
12/24/2021 15:48:37 - INFO - codeparrot_training - Step 12741: {'lr': 0.00042554917490993604, 'samples': 6523904, 'steps': 12741, 'batch_loss/train': 0.7089812071062624}
12/24/2021 15:48:51 - INFO - codeparrot_training - Step 12742: {'lr': 0.00042553795147735077, 'samples': 6524416, 'steps': 12742, 'batch_loss/train': 0.6982485051266849}
12/24/2021 15:49:02 - INFO - codeparrot_training - Step 12743: {'lr': 0.0004255267273468927, 'samples': 6524928, 'steps': 12743, 'batch_loss/train': 0.77466229069978}
12/24/2021 15:49:13 - INFO - codeparrot_training - Step 12744: {'lr': 0.00042551550251860676, 'samples': 6525440, 'steps': 12744, 'batch_loss/train': 0.7350398725830019}
12/24/2021 15:49:23 - INFO - codeparrot_training - Step 12745: {'lr': 0.00042550427699253736, 'samples': 6525952, 'steps': 12745, 'batch_loss/train': 0.8042952544055879}
12/24/2021 15:49:35 - INFO - codeparrot_training - Step 12746: {'lr': 0.00042549305076872927, 'samples': 6526464, 'steps': 12746, 'batch_loss/train': 1.0275668017566204}
12/24/2021 15:49:46 - INFO - codeparrot_training - Step 12747: {'lr': 0.0004254818238472271, 'samples': 6526976, 'steps': 12747, 'batch_loss/train': 0.8515878701582551}
12/24/2021 15:49:57 - INFO - codeparrot_training - Step 12748: {'lr': 0.00042547059622807543, 'samples': 6527488, 'steps': 12748, 'batch_loss/train': 0.8016918962821364}
12/24/2021 15:50:10 - INFO - codeparrot_training - Step 12749: {'lr': 0.000425459367911319, 'samples': 6528000, 'steps': 12749, 'batch_loss/train': 0.7291825793217868}
12/24/2021 15:50:21 - INFO - codeparrot_training - Step 12750: {'lr': 0.0004254481388970022, 'samples': 6528512, 'steps': 12750, 'batch_loss/train': 0.788270459510386}
12/24/2021 15:50:32 - INFO - codeparrot_training - Step 12751: {'lr': 0.00042543690918517, 'samples': 6529024, 'steps': 12751, 'batch_loss/train': 0.814669125713408}
12/24/2021 15:50:44 - INFO - codeparrot_training - Step 12752: {'lr': 0.00042542567877586684, 'samples': 6529536, 'steps': 12752, 'batch_loss/train': 0.8347485233098269}
12/24/2021 15:50:55 - INFO - codeparrot_training - Step 12753: {'lr': 0.0004254144476691374, 'samples': 6530048, 'steps': 12753, 'batch_loss/train': 0.7812588871456683}
12/24/2021 15:51:05 - INFO - codeparrot_training - Step 12754: {'lr': 0.0004254032158650264, 'samples': 6530560, 'steps': 12754, 'batch_loss/train': 0.8233502148650587}
12/24/2021 15:51:16 - INFO - codeparrot_training - Step 12755: {'lr': 0.00042539198336357845, 'samples': 6531072, 'steps': 12755, 'batch_loss/train': 0.5994498632499017}
12/24/2021 15:51:28 - INFO - codeparrot_training - Step 12756: {'lr': 0.0004253807501648381, 'samples': 6531584, 'steps': 12756, 'batch_loss/train': 0.8200299479067326}
12/24/2021 15:51:38 - INFO - codeparrot_training - Step 12757: {'lr': 0.00042536951626885025, 'samples': 6532096, 'steps': 12757, 'batch_loss/train': 0.8660965692251921}
12/24/2021 15:51:49 - INFO - codeparrot_training - Step 12758: {'lr': 0.0004253582816756593, 'samples': 6532608, 'steps': 12758, 'batch_loss/train': 0.7224141941405833}
12/24/2021 15:52:03 - INFO - codeparrot_training - Step 12759: {'lr': 0.00042534704638531005, 'samples': 6533120, 'steps': 12759, 'batch_loss/train': 0.8657286437228322}
12/24/2021 15:52:14 - INFO - codeparrot_training - Step 12760: {'lr': 0.0004253358103978472, 'samples': 6533632, 'steps': 12760, 'batch_loss/train': 0.7427330582868308}
12/24/2021 15:52:24 - INFO - codeparrot_training - Step 12761: {'lr': 0.00042532457371331534, 'samples': 6534144, 'steps': 12761, 'batch_loss/train': 0.725032807327807}
12/24/2021 15:52:35 - INFO - codeparrot_training - Step 12762: {'lr': 0.0004253133363317592, 'samples': 6534656, 'steps': 12762, 'batch_loss/train': 0.9207190806046128}
12/24/2021 15:52:47 - INFO - codeparrot_training - Step 12763: {'lr': 0.0004253020982532234, 'samples': 6535168, 'steps': 12763, 'batch_loss/train': 0.7935615945607424}
12/24/2021 15:52:57 - INFO - codeparrot_training - Step 12764: {'lr': 0.0004252908594777527, 'samples': 6535680, 'steps': 12764, 'batch_loss/train': 0.7802359769120812}
12/24/2021 15:53:08 - INFO - codeparrot_training - Step 12765: {'lr': 0.0004252796200053917, 'samples': 6536192, 'steps': 12765, 'batch_loss/train': 0.8010279815644026}
12/24/2021 15:53:20 - INFO - codeparrot_training - Step 12766: {'lr': 0.0004252683798361851, 'samples': 6536704, 'steps': 12766, 'batch_loss/train': 0.8574970979243517}
12/24/2021 15:53:31 - INFO - codeparrot_training - Step 12767: {'lr': 0.0004252571389701776, 'samples': 6537216, 'steps': 12767, 'batch_loss/train': 0.8903798819519579}
12/24/2021 15:53:41 - INFO - codeparrot_training - Step 12768: {'lr': 0.00042524589740741394, 'samples': 6537728, 'steps': 12768, 'batch_loss/train': 0.8013594676740468}
12/24/2021 15:53:56 - INFO - codeparrot_training - Step 12769: {'lr': 0.0004252346551479387, 'samples': 6538240, 'steps': 12769, 'batch_loss/train': 0.7794401561841369}
12/24/2021 15:54:06 - INFO - codeparrot_training - Step 12770: {'lr': 0.0004252234121917966, 'samples': 6538752, 'steps': 12770, 'batch_loss/train': 1.5623252354562283}
12/24/2021 15:54:17 - INFO - codeparrot_training - Step 12771: {'lr': 0.0004252121685390325, 'samples': 6539264, 'steps': 12771, 'batch_loss/train': 0.7665633629076183}
12/24/2021 15:54:29 - INFO - codeparrot_training - Step 12772: {'lr': 0.0004252009241896909, 'samples': 6539776, 'steps': 12772, 'batch_loss/train': 0.8329424569383264}
12/24/2021 15:54:40 - INFO - codeparrot_training - Step 12773: {'lr': 0.00042518967914381657, 'samples': 6540288, 'steps': 12773, 'batch_loss/train': 0.8265408724546432}
12/24/2021 15:54:50 - INFO - codeparrot_training - Step 12774: {'lr': 0.0004251784334014542, 'samples': 6540800, 'steps': 12774, 'batch_loss/train': 0.8363358052447438}
12/24/2021 15:55:01 - INFO - codeparrot_training - Step 12775: {'lr': 0.00042516718696264865, 'samples': 6541312, 'steps': 12775, 'batch_loss/train': 0.9082498261705041}
12/24/2021 15:55:13 - INFO - codeparrot_training - Step 12776: {'lr': 0.00042515593982744436, 'samples': 6541824, 'steps': 12776, 'batch_loss/train': 0.8303805682808161}
12/24/2021 15:55:24 - INFO - codeparrot_training - Step 12777: {'lr': 0.00042514469199588634, 'samples': 6542336, 'steps': 12777, 'batch_loss/train': 0.39812382258241996}
12/24/2021 15:55:34 - INFO - codeparrot_training - Step 12778: {'lr': 0.00042513344346801904, 'samples': 6542848, 'steps': 12778, 'batch_loss/train': 0.9018856538459659}
12/24/2021 15:55:48 - INFO - codeparrot_training - Step 12779: {'lr': 0.0004251221942438873, 'samples': 6543360, 'steps': 12779, 'batch_loss/train': 0.8950139163061976}
12/24/2021 15:55:59 - INFO - codeparrot_training - Step 12780: {'lr': 0.00042511094432353584, 'samples': 6543872, 'steps': 12780, 'batch_loss/train': 0.8873281383421272}
12/24/2021 15:56:10 - INFO - codeparrot_training - Step 12781: {'lr': 0.00042509969370700944, 'samples': 6544384, 'steps': 12781, 'batch_loss/train': 0.9058304224163294}
12/24/2021 15:56:22 - INFO - codeparrot_training - Step 12782: {'lr': 0.0004250884423943527, 'samples': 6544896, 'steps': 12782, 'batch_loss/train': 0.7441819603554904}
12/24/2021 15:56:32 - INFO - codeparrot_training - Step 12783: {'lr': 0.00042507719038561054, 'samples': 6545408, 'steps': 12783, 'batch_loss/train': 0.7923362869769335}
12/24/2021 15:56:43 - INFO - codeparrot_training - Step 12784: {'lr': 0.0004250659376808275, 'samples': 6545920, 'steps': 12784, 'batch_loss/train': 0.8717459570616484}
12/24/2021 15:56:54 - INFO - codeparrot_training - Step 12785: {'lr': 0.00042505468428004833, 'samples': 6546432, 'steps': 12785, 'batch_loss/train': 0.8286224843468517}
12/24/2021 15:57:06 - INFO - codeparrot_training - Step 12786: {'lr': 0.000425043430183318, 'samples': 6546944, 'steps': 12786, 'batch_loss/train': 0.794399737729691}
12/24/2021 15:57:16 - INFO - codeparrot_training - Step 12787: {'lr': 0.000425032175390681, 'samples': 6547456, 'steps': 12787, 'batch_loss/train': 0.802183295134455}
12/24/2021 15:57:27 - INFO - codeparrot_training - Step 12788: {'lr': 0.00042502091990218206, 'samples': 6547968, 'steps': 12788, 'batch_loss/train': 0.9288619412109256}
12/24/2021 15:57:41 - INFO - codeparrot_training - Step 12789: {'lr': 0.00042500966371786616, 'samples': 6548480, 'steps': 12789, 'batch_loss/train': 0.7333228255156428}
12/24/2021 15:57:52 - INFO - codeparrot_training - Step 12790: {'lr': 0.00042499840683777793, 'samples': 6548992, 'steps': 12790, 'batch_loss/train': 0.8084307704120874}
12/24/2021 15:58:02 - INFO - codeparrot_training - Step 12791: {'lr': 0.00042498714926196205, 'samples': 6549504, 'steps': 12791, 'batch_loss/train': 0.8325918009504676}
12/24/2021 15:58:14 - INFO - codeparrot_training - Step 12792: {'lr': 0.0004249758909904633, 'samples': 6550016, 'steps': 12792, 'batch_loss/train': 0.6749343434348702}
12/24/2021 15:58:25 - INFO - codeparrot_training - Step 12793: {'lr': 0.0004249646320233265, 'samples': 6550528, 'steps': 12793, 'batch_loss/train': 0.7739299070090055}
12/24/2021 15:58:36 - INFO - codeparrot_training - Step 12794: {'lr': 0.0004249533723605964, 'samples': 6551040, 'steps': 12794, 'batch_loss/train': 0.8691810043528676}
12/24/2021 15:58:46 - INFO - codeparrot_training - Step 12795: {'lr': 0.0004249421120023177, 'samples': 6551552, 'steps': 12795, 'batch_loss/train': 0.8210790152661502}
12/24/2021 15:58:59 - INFO - codeparrot_training - Step 12796: {'lr': 0.00042493085094853534, 'samples': 6552064, 'steps': 12796, 'batch_loss/train': 0.8595274081453681}
12/24/2021 15:59:10 - INFO - codeparrot_training - Step 12797: {'lr': 0.00042491958919929385, 'samples': 6552576, 'steps': 12797, 'batch_loss/train': 0.8257254608906806}
12/24/2021 15:59:20 - INFO - codeparrot_training - Step 12798: {'lr': 0.0004249083267546382, 'samples': 6553088, 'steps': 12798, 'batch_loss/train': 1.073230025358498}
12/24/2021 15:59:34 - INFO - codeparrot_training - Step 12799: {'lr': 0.0004248970636146131, 'samples': 6553600, 'steps': 12799, 'batch_loss/train': 1.1721860617399216}
12/24/2021 15:59:45 - INFO - codeparrot_training - Step 12800: {'lr': 0.0004248857997792632, 'samples': 6554112, 'steps': 12800, 'batch_loss/train': 0.8692452553659678}
12/24/2021 15:59:55 - INFO - codeparrot_training - Step 12801: {'lr': 0.0004248745352486335, 'samples': 6554624, 'steps': 12801, 'batch_loss/train': 0.8132470529526472}
12/24/2021 16:00:06 - INFO - codeparrot_training - Step 12802: {'lr': 0.0004248632700227686, 'samples': 6555136, 'steps': 12802, 'batch_loss/train': 0.8837095377966762}
12/24/2021 16:00:18 - INFO - codeparrot_training - Step 12803: {'lr': 0.0004248520041017134, 'samples': 6555648, 'steps': 12803, 'batch_loss/train': 0.871555722784251}
12/24/2021 16:00:29 - INFO - codeparrot_training - Step 12804: {'lr': 0.0004248407374855127, 'samples': 6556160, 'steps': 12804, 'batch_loss/train': 0.7545436061918736}
12/24/2021 16:00:40 - INFO - codeparrot_training - Step 12805: {'lr': 0.00042482947017421125, 'samples': 6556672, 'steps': 12805, 'batch_loss/train': 0.8418841515667737}
12/24/2021 16:00:53 - INFO - codeparrot_training - Step 12806: {'lr': 0.0004248182021678538, 'samples': 6557184, 'steps': 12806, 'batch_loss/train': 0.9695285316556692}
12/24/2021 16:01:04 - INFO - codeparrot_training - Step 12807: {'lr': 0.0004248069334664851, 'samples': 6557696, 'steps': 12807, 'batch_loss/train': 0.7371480215806514}
12/24/2021 16:01:15 - INFO - codeparrot_training - Step 12808: {'lr': 0.0004247956640701502, 'samples': 6558208, 'steps': 12808, 'batch_loss/train': 0.8456141427159309}
12/24/2021 16:01:27 - INFO - codeparrot_training - Step 12809: {'lr': 0.0004247843939788936, 'samples': 6558720, 'steps': 12809, 'batch_loss/train': 0.8769786106422544}
12/24/2021 16:01:37 - INFO - codeparrot_training - Step 12810: {'lr': 0.00042477312319276033, 'samples': 6559232, 'steps': 12810, 'batch_loss/train': 0.7933020000346005}
12/24/2021 16:01:48 - INFO - codeparrot_training - Step 12811: {'lr': 0.0004247618517117951, 'samples': 6559744, 'steps': 12811, 'batch_loss/train': 0.7833170474041253}
12/24/2021 16:02:00 - INFO - codeparrot_training - Step 12812: {'lr': 0.00042475057953604274, 'samples': 6560256, 'steps': 12812, 'batch_loss/train': 0.8756840396672487}
12/24/2021 16:02:10 - INFO - codeparrot_training - Step 12813: {'lr': 0.00042473930666554803, 'samples': 6560768, 'steps': 12813, 'batch_loss/train': 0.875840462744236}
12/24/2021 16:02:21 - INFO - codeparrot_training - Step 12814: {'lr': 0.0004247280331003559, 'samples': 6561280, 'steps': 12814, 'batch_loss/train': 0.8746453253552318}
12/24/2021 16:02:32 - INFO - codeparrot_training - Step 12815: {'lr': 0.000424716758840511, 'samples': 6561792, 'steps': 12815, 'batch_loss/train': 0.9115716628730297}
12/24/2021 16:02:44 - INFO - codeparrot_training - Step 12816: {'lr': 0.0004247054838860582, 'samples': 6562304, 'steps': 12816, 'batch_loss/train': 0.7310405457392335}
12/24/2021 16:02:54 - INFO - codeparrot_training - Step 12817: {'lr': 0.0004246942082370424, 'samples': 6562816, 'steps': 12817, 'batch_loss/train': 0.7861380120739341}
12/24/2021 16:03:05 - INFO - codeparrot_training - Step 12818: {'lr': 0.00042468293189350844, 'samples': 6563328, 'steps': 12818, 'batch_loss/train': 0.8692785315215588}
12/24/2021 16:03:19 - INFO - codeparrot_training - Step 12819: {'lr': 0.0004246716548555011, 'samples': 6563840, 'steps': 12819, 'batch_loss/train': 0.7535063019022346}
12/24/2021 16:03:29 - INFO - codeparrot_training - Step 12820: {'lr': 0.0004246603771230651, 'samples': 6564352, 'steps': 12820, 'batch_loss/train': 0.7090701311826706}
12/24/2021 16:03:40 - INFO - codeparrot_training - Step 12821: {'lr': 0.00042464909869624544, 'samples': 6564864, 'steps': 12821, 'batch_loss/train': 0.8412591709056869}
12/24/2021 16:03:52 - INFO - codeparrot_training - Step 12822: {'lr': 0.00042463781957508695, 'samples': 6565376, 'steps': 12822, 'batch_loss/train': 0.7924729147925973}
12/24/2021 16:04:03 - INFO - codeparrot_training - Step 12823: {'lr': 0.0004246265397596345, 'samples': 6565888, 'steps': 12823, 'batch_loss/train': 0.6617266838438809}
12/24/2021 16:04:14 - INFO - codeparrot_training - Step 12824: {'lr': 0.0004246152592499327, 'samples': 6566400, 'steps': 12824, 'batch_loss/train': 0.7435141829773784}
12/24/2021 16:04:24 - INFO - codeparrot_training - Step 12825: {'lr': 0.00042460397804602667, 'samples': 6566912, 'steps': 12825, 'batch_loss/train': 0.887202775105834}
12/24/2021 16:04:36 - INFO - codeparrot_training - Step 12826: {'lr': 0.00042459269614796117, 'samples': 6567424, 'steps': 12826, 'batch_loss/train': 0.9623308554291725}
12/24/2021 16:04:47 - INFO - codeparrot_training - Step 12827: {'lr': 0.0004245814135557809, 'samples': 6567936, 'steps': 12827, 'batch_loss/train': 0.8751085568219423}
12/24/2021 16:04:57 - INFO - codeparrot_training - Step 12828: {'lr': 0.00042457013026953093, 'samples': 6568448, 'steps': 12828, 'batch_loss/train': 0.8285783343017101}
12/24/2021 16:05:11 - INFO - codeparrot_training - Step 12829: {'lr': 0.00042455884628925613, 'samples': 6568960, 'steps': 12829, 'batch_loss/train': 0.7740848795510828}
12/24/2021 16:05:22 - INFO - codeparrot_training - Step 12830: {'lr': 0.00042454756161500114, 'samples': 6569472, 'steps': 12830, 'batch_loss/train': 0.7670541357947513}
12/24/2021 16:05:33 - INFO - codeparrot_training - Step 12831: {'lr': 0.000424536276246811, 'samples': 6569984, 'steps': 12831, 'batch_loss/train': 0.6500386586412787}
12/24/2021 16:05:45 - INFO - codeparrot_training - Step 12832: {'lr': 0.00042452499018473056, 'samples': 6570496, 'steps': 12832, 'batch_loss/train': 0.7886459876317531}
12/24/2021 16:05:56 - INFO - codeparrot_training - Step 12833: {'lr': 0.0004245137034288046, 'samples': 6571008, 'steps': 12833, 'batch_loss/train': 0.8753083618357778}
12/24/2021 16:06:06 - INFO - codeparrot_training - Step 12834: {'lr': 0.00042450241597907806, 'samples': 6571520, 'steps': 12834, 'batch_loss/train': 0.8468303941190243}
12/24/2021 16:06:17 - INFO - codeparrot_training - Step 12835: {'lr': 0.00042449112783559577, 'samples': 6572032, 'steps': 12835, 'batch_loss/train': 0.7751929583027959}
12/24/2021 16:06:31 - INFO - codeparrot_training - Step 12836: {'lr': 0.0004244798389984027, 'samples': 6572544, 'steps': 12836, 'batch_loss/train': 0.7530256072059274}
12/24/2021 16:06:41 - INFO - codeparrot_training - Step 12837: {'lr': 0.0004244685494675437, 'samples': 6573056, 'steps': 12837, 'batch_loss/train': 0.5218226362485439}
12/24/2021 16:06:52 - INFO - codeparrot_training - Step 12838: {'lr': 0.00042445725924306353, 'samples': 6573568, 'steps': 12838, 'batch_loss/train': 0.7315205556806177}
12/24/2021 16:07:04 - INFO - codeparrot_training - Step 12839: {'lr': 0.0004244459683250073, 'samples': 6574080, 'steps': 12839, 'batch_loss/train': 0.71277996012941}
12/24/2021 16:07:15 - INFO - codeparrot_training - Step 12840: {'lr': 0.00042443467671341963, 'samples': 6574592, 'steps': 12840, 'batch_loss/train': 0.7733085071668029}
12/24/2021 16:07:25 - INFO - codeparrot_training - Step 12841: {'lr': 0.0004244233844083456, 'samples': 6575104, 'steps': 12841, 'batch_loss/train': 0.8253549213986844}
12/24/2021 16:07:37 - INFO - codeparrot_training - Step 12842: {'lr': 0.00042441209140983005, 'samples': 6575616, 'steps': 12842, 'batch_loss/train': 0.857629369944334}
12/24/2021 16:07:48 - INFO - codeparrot_training - Step 12843: {'lr': 0.00042440079771791783, 'samples': 6576128, 'steps': 12843, 'batch_loss/train': 0.7988179917447269}
12/24/2021 16:07:58 - INFO - codeparrot_training - Step 12844: {'lr': 0.00042438950333265404, 'samples': 6576640, 'steps': 12844, 'batch_loss/train': 0.8076379294507205}
12/24/2021 16:08:09 - INFO - codeparrot_training - Step 12845: {'lr': 0.0004243782082540833, 'samples': 6577152, 'steps': 12845, 'batch_loss/train': 0.7710771028650925}
12/24/2021 16:08:23 - INFO - codeparrot_training - Step 12846: {'lr': 0.0004243669124822507, 'samples': 6577664, 'steps': 12846, 'batch_loss/train': 0.5990145090036094}
12/24/2021 16:08:34 - INFO - codeparrot_training - Step 12847: {'lr': 0.0004243556160172011, 'samples': 6578176, 'steps': 12847, 'batch_loss/train': 0.8161477211397141}
12/24/2021 16:08:44 - INFO - codeparrot_training - Step 12848: {'lr': 0.0004243443188589793, 'samples': 6578688, 'steps': 12848, 'batch_loss/train': 0.7905095323221758}
12/24/2021 16:08:57 - INFO - codeparrot_training - Step 12849: {'lr': 0.00042433302100763047, 'samples': 6579200, 'steps': 12849, 'batch_loss/train': 0.8251441931352019}
12/24/2021 16:09:08 - INFO - codeparrot_training - Step 12850: {'lr': 0.00042432172246319926, 'samples': 6579712, 'steps': 12850, 'batch_loss/train': 0.8701100486796349}
12/24/2021 16:09:18 - INFO - codeparrot_training - Step 12851: {'lr': 0.0004243104232257308, 'samples': 6580224, 'steps': 12851, 'batch_loss/train': 0.9728533320594579}
12/24/2021 16:09:30 - INFO - codeparrot_training - Step 12852: {'lr': 0.0004242991232952699, 'samples': 6580736, 'steps': 12852, 'batch_loss/train': 0.8231268348172307}
12/24/2021 16:09:41 - INFO - codeparrot_training - Step 12853: {'lr': 0.0004242878226718614, 'samples': 6581248, 'steps': 12853, 'batch_loss/train': 0.8407412227243185}
12/24/2021 16:09:51 - INFO - codeparrot_training - Step 12854: {'lr': 0.0004242765213555504, 'samples': 6581760, 'steps': 12854, 'batch_loss/train': 0.7935853274539113}
12/24/2021 16:10:02 - INFO - codeparrot_training - Step 12855: {'lr': 0.0004242652193463817, 'samples': 6582272, 'steps': 12855, 'batch_loss/train': 0.7893967612180859}
12/24/2021 16:10:14 - INFO - codeparrot_training - Step 12856: {'lr': 0.0004242539166444004, 'samples': 6582784, 'steps': 12856, 'batch_loss/train': 0.8717006323859096}
12/24/2021 16:10:25 - INFO - codeparrot_training - Step 12857: {'lr': 0.00042424261324965117, 'samples': 6583296, 'steps': 12857, 'batch_loss/train': 0.7725236427504569}
12/24/2021 16:10:35 - INFO - codeparrot_training - Step 12858: {'lr': 0.0004242313091621792, 'samples': 6583808, 'steps': 12858, 'batch_loss/train': 0.8630127932410687}
12/24/2021 16:10:49 - INFO - codeparrot_training - Step 12859: {'lr': 0.00042422000438202935, 'samples': 6584320, 'steps': 12859, 'batch_loss/train': 0.8358451439999044}
12/24/2021 16:11:00 - INFO - codeparrot_training - Step 12860: {'lr': 0.00042420869890924653, 'samples': 6584832, 'steps': 12860, 'batch_loss/train': 0.8723146729171276}
12/24/2021 16:11:10 - INFO - codeparrot_training - Step 12861: {'lr': 0.0004241973927438757, 'samples': 6585344, 'steps': 12861, 'batch_loss/train': 0.9047963423654437}
12/24/2021 16:11:23 - INFO - codeparrot_training - Step 12862: {'lr': 0.0004241860858859618, 'samples': 6585856, 'steps': 12862, 'batch_loss/train': 0.8510332745499909}
12/24/2021 16:11:34 - INFO - codeparrot_training - Step 12863: {'lr': 0.00042417477833554975, 'samples': 6586368, 'steps': 12863, 'batch_loss/train': 0.645011093467474}
12/24/2021 16:11:45 - INFO - codeparrot_training - Step 12864: {'lr': 0.00042416347009268464, 'samples': 6586880, 'steps': 12864, 'batch_loss/train': 0.6720038597122766}
12/24/2021 16:11:55 - INFO - codeparrot_training - Step 12865: {'lr': 0.00042415216115741127, 'samples': 6587392, 'steps': 12865, 'batch_loss/train': 1.670983268879354}
12/24/2021 16:12:09 - INFO - codeparrot_training - Step 12866: {'lr': 0.0004241408515297747, 'samples': 6587904, 'steps': 12866, 'batch_loss/train': 0.8908951254561543}
12/24/2021 16:12:20 - INFO - codeparrot_training - Step 12867: {'lr': 0.00042412954120981983, 'samples': 6588416, 'steps': 12867, 'batch_loss/train': 0.7524021058343351}
12/24/2021 16:12:30 - INFO - codeparrot_training - Step 12868: {'lr': 0.00042411823019759164, 'samples': 6588928, 'steps': 12868, 'batch_loss/train': 0.7653742488473654}
12/24/2021 16:12:42 - INFO - codeparrot_training - Step 12869: {'lr': 0.0004241069184931351, 'samples': 6589440, 'steps': 12869, 'batch_loss/train': 0.7920643562683836}
12/24/2021 16:12:53 - INFO - codeparrot_training - Step 12870: {'lr': 0.00042409560609649524, 'samples': 6589952, 'steps': 12870, 'batch_loss/train': 0.7942575979977846}
12/24/2021 16:13:04 - INFO - codeparrot_training - Step 12871: {'lr': 0.00042408429300771697, 'samples': 6590464, 'steps': 12871, 'batch_loss/train': 0.8007700052112341}
12/24/2021 16:13:16 - INFO - codeparrot_training - Step 12872: {'lr': 0.00042407297922684523, 'samples': 6590976, 'steps': 12872, 'batch_loss/train': 0.8388366277795285}
12/24/2021 16:13:27 - INFO - codeparrot_training - Step 12873: {'lr': 0.0004240616647539252, 'samples': 6591488, 'steps': 12873, 'batch_loss/train': 0.8070039581507444}
12/24/2021 16:13:37 - INFO - codeparrot_training - Step 12874: {'lr': 0.0004240503495890015, 'samples': 6592000, 'steps': 12874, 'batch_loss/train': 0.8159436276182532}
12/24/2021 16:13:48 - INFO - codeparrot_training - Step 12875: {'lr': 0.0004240390337321195, 'samples': 6592512, 'steps': 12875, 'batch_loss/train': 0.7678877813741565}
12/24/2021 16:14:03 - INFO - codeparrot_training - Step 12876: {'lr': 0.0004240277171833239, 'samples': 6593024, 'steps': 12876, 'batch_loss/train': 0.8536263089627028}
12/24/2021 16:14:13 - INFO - codeparrot_training - Step 12877: {'lr': 0.0004240163999426599, 'samples': 6593536, 'steps': 12877, 'batch_loss/train': 1.0798377469182014}
12/24/2021 16:14:24 - INFO - codeparrot_training - Step 12878: {'lr': 0.0004240050820101724, 'samples': 6594048, 'steps': 12878, 'batch_loss/train': 0.8909198241308331}
12/24/2021 16:14:36 - INFO - codeparrot_training - Step 12879: {'lr': 0.00042399376338590636, 'samples': 6594560, 'steps': 12879, 'batch_loss/train': 0.7922730674035847}
12/24/2021 16:14:47 - INFO - codeparrot_training - Step 12880: {'lr': 0.0004239824440699068, 'samples': 6595072, 'steps': 12880, 'batch_loss/train': 0.8348797839134932}
12/24/2021 16:14:57 - INFO - codeparrot_training - Step 12881: {'lr': 0.0004239711240622187, 'samples': 6595584, 'steps': 12881, 'batch_loss/train': 0.834947076626122}
12/24/2021 16:15:08 - INFO - codeparrot_training - Step 12882: {'lr': 0.0004239598033628872, 'samples': 6596096, 'steps': 12882, 'batch_loss/train': 0.8637937535531819}
12/24/2021 16:15:20 - INFO - codeparrot_training - Step 12883: {'lr': 0.00042394848197195714, 'samples': 6596608, 'steps': 12883, 'batch_loss/train': 0.8865162078291178}
12/24/2021 16:15:31 - INFO - codeparrot_training - Step 12884: {'lr': 0.0004239371598894736, 'samples': 6597120, 'steps': 12884, 'batch_loss/train': 0.8915656581521034}
12/24/2021 16:15:41 - INFO - codeparrot_training - Step 12885: {'lr': 0.0004239258371154816, 'samples': 6597632, 'steps': 12885, 'batch_loss/train': 0.7715214379131794}
12/24/2021 16:15:54 - INFO - codeparrot_training - Step 12886: {'lr': 0.0004239145136500262, 'samples': 6598144, 'steps': 12886, 'batch_loss/train': 0.7916861004196107}
12/24/2021 16:16:04 - INFO - codeparrot_training - Step 12887: {'lr': 0.0004239031894931523, 'samples': 6598656, 'steps': 12887, 'batch_loss/train': 0.7386882519349456}
12/24/2021 16:16:15 - INFO - codeparrot_training - Step 12888: {'lr': 0.0004238918646449049, 'samples': 6599168, 'steps': 12888, 'batch_loss/train': 0.9849840514361858} 12/24/2021 16:16:29 - INFO - codeparrot_training - Step 12889: {'lr': 0.00042388053910532924, 'samples': 6599680, 'steps': 12889, 'batch_loss/train': 0.8501082500442863} 12/24/2021 16:16:39 - INFO - codeparrot_training - Step 12890: {'lr': 0.0004238692128744702, 'samples': 6600192, 'steps': 12890, 'batch_loss/train': 0.8690782925114036} 12/24/2021 16:16:50 - INFO - codeparrot_training - Step 12891: {'lr': 0.00042385788595237275, 'samples': 6600704, 'steps': 12891, 'batch_loss/train': 0.8052077945321798} 12/24/2021 16:17:01 - INFO - codeparrot_training - Step 12892: {'lr': 0.000423846558339082, 'samples': 6601216, 'steps': 12892, 'batch_loss/train': 0.8478165487758815} 12/24/2021 16:17:13 - INFO - codeparrot_training - Step 12893: {'lr': 0.00042383523003464304, 'samples': 6601728, 'steps': 12893, 'batch_loss/train': 0.8770785238593817} 12/24/2021 16:17:23 - INFO - codeparrot_training - Step 12894: {'lr': 0.0004238239010391008, 'samples': 6602240, 'steps': 12894, 'batch_loss/train': 0.9043105095624924} 12/24/2021 16:17:34 - INFO - codeparrot_training - Step 12895: {'lr': 0.0004238125713525004, 'samples': 6602752, 'steps': 12895, 'batch_loss/train': 0.7998585114255548} 12/24/2021 16:17:48 - INFO - codeparrot_training - Step 12896: {'lr': 0.0004238012409748868, 'samples': 6603264, 'steps': 12896, 'batch_loss/train': 0.7953733475878835} 12/24/2021 16:17:59 - INFO - codeparrot_training - Step 12897: {'lr': 0.00042378990990630513, 'samples': 6603776, 'steps': 12897, 'batch_loss/train': 0.7621951913461089} 12/24/2021 16:18:10 - INFO - codeparrot_training - Step 12898: {'lr': 0.0004237785781468003, 'samples': 6604288, 'steps': 12898, 'batch_loss/train': 0.9496196322143078} 12/24/2021 16:18:22 - INFO - codeparrot_training - Step 12899: {'lr': 0.00042376724569641755, 'samples': 6604800, 'steps': 12899, 
'batch_loss/train': 0.7285679820924997} 12/24/2021 16:18:32 - INFO - codeparrot_training - Step 12900: {'lr': 0.00042375591255520187, 'samples': 6605312, 'steps': 12900, 'batch_loss/train': 0.6456664949655533} 12/24/2021 16:18:43 - INFO - codeparrot_training - Step 12901: {'lr': 0.0004237445787231982, 'samples': 6605824, 'steps': 12901, 'batch_loss/train': 0.8336421232670546} 12/24/2021 16:18:54 - INFO - codeparrot_training - Step 12902: {'lr': 0.0004237332442004518, 'samples': 6606336, 'steps': 12902, 'batch_loss/train': 0.7635426777414978} 12/24/2021 16:19:06 - INFO - codeparrot_training - Step 12903: {'lr': 0.0004237219089870075, 'samples': 6606848, 'steps': 12903, 'batch_loss/train': 0.8602701369673014} 12/24/2021 16:19:17 - INFO - codeparrot_training - Step 12904: {'lr': 0.00042371057308291055, 'samples': 6607360, 'steps': 12904, 'batch_loss/train': 0.8576609455049038} 12/24/2021 16:19:27 - INFO - codeparrot_training - Step 12905: {'lr': 0.00042369923648820595, 'samples': 6607872, 'steps': 12905, 'batch_loss/train': 0.8462854092940688} 12/24/2021 16:19:41 - INFO - codeparrot_training - Step 12906: {'lr': 0.0004236878992029387, 'samples': 6608384, 'steps': 12906, 'batch_loss/train': 0.7943619778379798} 12/24/2021 16:19:52 - INFO - codeparrot_training - Step 12907: {'lr': 0.00042367656122715405, 'samples': 6608896, 'steps': 12907, 'batch_loss/train': 0.7403267612680793} 12/24/2021 16:20:02 - INFO - codeparrot_training - Step 12908: {'lr': 0.0004236652225608969, 'samples': 6609408, 'steps': 12908, 'batch_loss/train': 0.8042130004614592} 12/24/2021 16:20:14 - INFO - codeparrot_training - Step 12909: {'lr': 0.0004236538832042125, 'samples': 6609920, 'steps': 12909, 'batch_loss/train': 0.9341345317661762} 12/24/2021 16:20:25 - INFO - codeparrot_training - Step 12910: {'lr': 0.00042364254315714573, 'samples': 6610432, 'steps': 12910, 'batch_loss/train': 0.8562896563671529} 12/24/2021 16:20:36 - INFO - codeparrot_training - Step 12911: {'lr': 0.0004236312024197417, 
'samples': 6610944, 'steps': 12911, 'batch_loss/train': 0.8190116416662931} 12/24/2021 16:20:46 - INFO - codeparrot_training - Step 12912: {'lr': 0.00042361986099204575, 'samples': 6611456, 'steps': 12912, 'batch_loss/train': 0.7989669675007463} 12/24/2021 16:20:58 - INFO - codeparrot_training - Step 12913: {'lr': 0.00042360851887410266, 'samples': 6611968, 'steps': 12913, 'batch_loss/train': 0.9435015814378858} 12/24/2021 16:21:09 - INFO - codeparrot_training - Step 12914: {'lr': 0.00042359717606595777, 'samples': 6612480, 'steps': 12914, 'batch_loss/train': 0.7882572235539556} 12/24/2021 16:21:19 - INFO - codeparrot_training - Step 12915: {'lr': 0.0004235858325676559, 'samples': 6612992, 'steps': 12915, 'batch_loss/train': 0.8400699933990836} 12/24/2021 16:21:33 - INFO - codeparrot_training - Step 12916: {'lr': 0.0004235744883792425, 'samples': 6613504, 'steps': 12916, 'batch_loss/train': 0.8912825724110007} 12/24/2021 16:21:44 - INFO - codeparrot_training - Step 12917: {'lr': 0.0004235631435007623, 'samples': 6614016, 'steps': 12917, 'batch_loss/train': 0.8114601764827967} 12/24/2021 16:21:55 - INFO - codeparrot_training - Step 12918: {'lr': 0.0004235517979322606, 'samples': 6614528, 'steps': 12918, 'batch_loss/train': 0.8857047227211297} 12/24/2021 16:22:07 - INFO - codeparrot_training - Step 12919: {'lr': 0.0004235404516737826, 'samples': 6615040, 'steps': 12919, 'batch_loss/train': 0.8208198649808764} 12/24/2021 16:22:17 - INFO - codeparrot_training - Step 12920: {'lr': 0.0004235291047253732, 'samples': 6615552, 'steps': 12920, 'batch_loss/train': 0.962240731343627} 12/24/2021 16:22:28 - INFO - codeparrot_training - Step 12921: {'lr': 0.00042351775708707764, 'samples': 6616064, 'steps': 12921, 'batch_loss/train': 0.9240679070353508} 12/24/2021 16:22:39 - INFO - codeparrot_training - Step 12922: {'lr': 0.0004235064087589409, 'samples': 6616576, 'steps': 12922, 'batch_loss/train': 0.9372880132868886} 12/24/2021 16:22:52 - INFO - codeparrot_training - Step 
12923: {'lr': 0.00042349505974100833, 'samples': 6617088, 'steps': 12923, 'batch_loss/train': 0.7791684423573315} 12/24/2021 16:23:03 - INFO - codeparrot_training - Step 12924: {'lr': 0.0004234837100333248, 'samples': 6617600, 'steps': 12924, 'batch_loss/train': 0.8185719531029463} 12/24/2021 16:23:14 - INFO - codeparrot_training - Step 12925: {'lr': 0.00042347235963593555, 'samples': 6618112, 'steps': 12925, 'batch_loss/train': 0.753034558147192} 12/24/2021 16:23:26 - INFO - codeparrot_training - Step 12926: {'lr': 0.00042346100854888575, 'samples': 6618624, 'steps': 12926, 'batch_loss/train': 0.8551320396363735} 12/24/2021 16:23:36 - INFO - codeparrot_training - Step 12927: {'lr': 0.0004234496567722205, 'samples': 6619136, 'steps': 12927, 'batch_loss/train': 0.8448055284097791} 12/24/2021 16:23:47 - INFO - codeparrot_training - Step 12928: {'lr': 0.00042343830430598486, 'samples': 6619648, 'steps': 12928, 'batch_loss/train': 0.7405439242720604} 12/24/2021 16:23:59 - INFO - codeparrot_training - Step 12929: {'lr': 0.000423426951150224, 'samples': 6620160, 'steps': 12929, 'batch_loss/train': 0.8573675584048033} 12/24/2021 16:24:10 - INFO - codeparrot_training - Step 12930: {'lr': 0.0004234155973049831, 'samples': 6620672, 'steps': 12930, 'batch_loss/train': 0.8770044185221195} 12/24/2021 16:24:20 - INFO - codeparrot_training - Step 12931: {'lr': 0.0004234042427703072, 'samples': 6621184, 'steps': 12931, 'batch_loss/train': 0.8417053634766489} 12/24/2021 16:24:31 - INFO - codeparrot_training - Step 12932: {'lr': 0.0004233928875462416, 'samples': 6621696, 'steps': 12932, 'batch_loss/train': 0.8311327574774623} 12/24/2021 16:24:43 - INFO - codeparrot_training - Step 12933: {'lr': 0.00042338153163283125, 'samples': 6622208, 'steps': 12933, 'batch_loss/train': 0.7965037440881133} 12/24/2021 16:24:54 - INFO - codeparrot_training - Step 12934: {'lr': 0.0004233701750301215, 'samples': 6622720, 'steps': 12934, 'batch_loss/train': 0.7580520557239652} 12/24/2021 16:25:04 - 
INFO - codeparrot_training - Step 12935: {'lr': 0.00042335881773815733, 'samples': 6623232, 'steps': 12935, 'batch_loss/train': 0.8304054976906627} 12/24/2021 16:25:19 - INFO - codeparrot_training - Step 12936: {'lr': 0.000423347459756984, 'samples': 6623744, 'steps': 12936, 'batch_loss/train': 0.7545634275302291} 12/24/2021 16:25:29 - INFO - codeparrot_training - Step 12937: {'lr': 0.00042333610108664655, 'samples': 6624256, 'steps': 12937, 'batch_loss/train': 0.7333661860320717} 12/24/2021 16:25:40 - INFO - codeparrot_training - Step 12938: {'lr': 0.0004233247417271903, 'samples': 6624768, 'steps': 12938, 'batch_loss/train': 0.8189774843631312} 12/24/2021 16:25:52 - INFO - codeparrot_training - Step 12939: {'lr': 0.0004233133816786602, 'samples': 6625280, 'steps': 12939, 'batch_loss/train': 0.8388893138617277} 12/24/2021 16:26:02 - INFO - codeparrot_training - Step 12940: {'lr': 0.0004233020209411017, 'samples': 6625792, 'steps': 12940, 'batch_loss/train': 0.6992448113160208} 12/24/2021 16:26:13 - INFO - codeparrot_training - Step 12941: {'lr': 0.0004232906595145597, 'samples': 6626304, 'steps': 12941, 'batch_loss/train': 0.813586562871933} 12/24/2021 16:26:24 - INFO - codeparrot_training - Step 12942: {'lr': 0.00042327929739907947, 'samples': 6626816, 'steps': 12942, 'batch_loss/train': 0.7618722589686513} 12/24/2021 16:26:38 - INFO - codeparrot_training - Step 12943: {'lr': 0.0004232679345947062, 'samples': 6627328, 'steps': 12943, 'batch_loss/train': 0.7277508727274835} 12/24/2021 16:26:48 - INFO - codeparrot_training - Step 12944: {'lr': 0.000423256571101485, 'samples': 6627840, 'steps': 12944, 'batch_loss/train': 0.7882340475916862} 12/24/2021 16:26:59 - INFO - codeparrot_training - Step 12945: {'lr': 0.00042324520691946116, 'samples': 6628352, 'steps': 12945, 'batch_loss/train': 0.921651654643938} 12/24/2021 16:27:11 - INFO - codeparrot_training - Step 12946: {'lr': 0.00042323384204867977, 'samples': 6628864, 'steps': 12946, 'batch_loss/train': 
0.8580932421609759} 12/24/2021 16:27:22 - INFO - codeparrot_training - Step 12947: {'lr': 0.00042322247648918606, 'samples': 6629376, 'steps': 12947, 'batch_loss/train': 0.7443629554472864} 12/24/2021 16:27:32 - INFO - codeparrot_training - Step 12948: {'lr': 0.0004232111102410251, 'samples': 6629888, 'steps': 12948, 'batch_loss/train': 0.7954574376344681} 12/24/2021 16:27:45 - INFO - codeparrot_training - Step 12949: {'lr': 0.0004231997433042423, 'samples': 6630400, 'steps': 12949, 'batch_loss/train': 0.9307677503675222} 12/24/2021 16:27:55 - INFO - codeparrot_training - Step 12950: {'lr': 0.00042318837567888255, 'samples': 6630912, 'steps': 12950, 'batch_loss/train': 0.810433529317379} 12/24/2021 16:28:06 - INFO - codeparrot_training - Step 12951: {'lr': 0.0004231770073649914, 'samples': 6631424, 'steps': 12951, 'batch_loss/train': 0.8712178955320269} 12/24/2021 16:28:20 - INFO - codeparrot_training - Step 12952: {'lr': 0.0004231656383626137, 'samples': 6631936, 'steps': 12952, 'batch_loss/train': 0.7995509002357721} 12/24/2021 16:28:30 - INFO - codeparrot_training - Step 12953: {'lr': 0.0004231542686717949, 'samples': 6632448, 'steps': 12953, 'batch_loss/train': 0.7501424308866262} 12/24/2021 16:28:41 - INFO - codeparrot_training - Step 12954: {'lr': 0.0004231428982925801, 'samples': 6632960, 'steps': 12954, 'batch_loss/train': 0.7359261745586991} 12/24/2021 16:28:52 - INFO - codeparrot_training - Step 12955: {'lr': 0.00042313152722501444, 'samples': 6633472, 'steps': 12955, 'batch_loss/train': 0.92414107080549} 12/24/2021 16:29:04 - INFO - codeparrot_training - Step 12956: {'lr': 0.0004231201554691433, 'samples': 6633984, 'steps': 12956, 'batch_loss/train': 0.8774133296683431} 12/24/2021 16:29:14 - INFO - codeparrot_training - Step 12957: {'lr': 0.00042310878302501175, 'samples': 6634496, 'steps': 12957, 'batch_loss/train': 0.9773971885442734} 12/24/2021 16:29:25 - INFO - codeparrot_training - Step 12958: {'lr': 0.000423097409892665, 'samples': 6635008, 
'steps': 12958, 'batch_loss/train': 0.8849851721897721} 12/24/2021 16:29:37 - INFO - codeparrot_training - Step 12959: {'lr': 0.00042308603607214834, 'samples': 6635520, 'steps': 12959, 'batch_loss/train': 0.8340517971664667} 12/24/2021 16:29:48 - INFO - codeparrot_training - Step 12960: {'lr': 0.000423074661563507, 'samples': 6636032, 'steps': 12960, 'batch_loss/train': 0.7923588007688522} 12/24/2021 16:29:58 - INFO - codeparrot_training - Step 12961: {'lr': 0.0004230632863667861, 'samples': 6636544, 'steps': 12961, 'batch_loss/train': 0.9344980455935001} 12/24/2021 16:30:12 - INFO - codeparrot_training - Step 12962: {'lr': 0.00042305191048203094, 'samples': 6637056, 'steps': 12962, 'batch_loss/train': 0.8713934682309628} 12/24/2021 16:30:23 - INFO - codeparrot_training - Step 12963: {'lr': 0.0004230405339092867, 'samples': 6637568, 'steps': 12963, 'batch_loss/train': 0.7460556310834363} 12/24/2021 16:30:33 - INFO - codeparrot_training - Step 12964: {'lr': 0.00042302915664859867, 'samples': 6638080, 'steps': 12964, 'batch_loss/train': 0.8466383214108646} 12/24/2021 16:30:44 - INFO - codeparrot_training - Step 12965: {'lr': 0.000423017778700012, 'samples': 6638592, 'steps': 12965, 'batch_loss/train': 0.9413957940414548} 12/24/2021 16:30:56 - INFO - codeparrot_training - Step 12966: {'lr': 0.00042300640006357203, 'samples': 6639104, 'steps': 12966, 'batch_loss/train': 0.9890270158648491} 12/24/2021 16:31:07 - INFO - codeparrot_training - Step 12967: {'lr': 0.00042299502073932395, 'samples': 6639616, 'steps': 12967, 'batch_loss/train': 0.8127199485898018} 12/24/2021 16:31:17 - INFO - codeparrot_training - Step 12968: {'lr': 0.0004229836407273129, 'samples': 6640128, 'steps': 12968, 'batch_loss/train': 0.8018375718966126} 12/24/2021 16:31:31 - INFO - codeparrot_training - Step 12969: {'lr': 0.0004229722600275843, 'samples': 6640640, 'steps': 12969, 'batch_loss/train': 0.5608920340891927} 12/24/2021 16:31:42 - INFO - codeparrot_training - Step 12970: {'lr': 
0.0004229608786401832, 'samples': 6641152, 'steps': 12970, 'batch_loss/train': 0.7730783838778734} 12/24/2021 16:31:53 - INFO - codeparrot_training - Step 12971: {'lr': 0.00042294949656515504, 'samples': 6641664, 'steps': 12971, 'batch_loss/train': 0.8355894018895924} 12/24/2021 16:32:05 - INFO - codeparrot_training - Step 12972: {'lr': 0.000422938113802545, 'samples': 6642176, 'steps': 12972, 'batch_loss/train': 0.7861653240397573} 12/24/2021 16:32:16 - INFO - codeparrot_training - Step 12973: {'lr': 0.00042292673035239825, 'samples': 6642688, 'steps': 12973, 'batch_loss/train': 1.1248339572921395} 12/24/2021 16:32:26 - INFO - codeparrot_training - Step 12974: {'lr': 0.00042291534621476015, 'samples': 6643200, 'steps': 12974, 'batch_loss/train': 0.814558366779238} 12/24/2021 16:32:37 - INFO - codeparrot_training - Step 12975: {'lr': 0.0004229039613896759, 'samples': 6643712, 'steps': 12975, 'batch_loss/train': 0.8070625215768814} 12/24/2021 16:32:49 - INFO - codeparrot_training - Step 12976: {'lr': 0.00042289257587719084, 'samples': 6644224, 'steps': 12976, 'batch_loss/train': 0.7938446374610066} 12/24/2021 16:33:00 - INFO - codeparrot_training - Step 12977: {'lr': 0.0004228811896773501, 'samples': 6644736, 'steps': 12977, 'batch_loss/train': 0.8234686926007271} 12/24/2021 16:33:10 - INFO - codeparrot_training - Step 12978: {'lr': 0.0004228698027901991, 'samples': 6645248, 'steps': 12978, 'batch_loss/train': 0.7840134901925921} 12/24/2021 16:33:22 - INFO - codeparrot_training - Step 12979: {'lr': 0.00042285841521578306, 'samples': 6645760, 'steps': 12979, 'batch_loss/train': 0.7938153930008411} 12/24/2021 16:33:33 - INFO - codeparrot_training - Step 12980: {'lr': 0.0004228470269541471, 'samples': 6646272, 'steps': 12980, 'batch_loss/train': 0.8854358643293381} 12/24/2021 16:33:44 - INFO - codeparrot_training - Step 12981: {'lr': 0.00042283563800533675, 'samples': 6646784, 'steps': 12981, 'batch_loss/train': 0.8268008991144598} 12/24/2021 16:33:57 - INFO - 
codeparrot_training - Step 12982: {'lr': 0.0004228242483693971, 'samples': 6647296, 'steps': 12982, 'batch_loss/train': 0.8235091126989573} 12/24/2021 16:34:08 - INFO - codeparrot_training - Step 12983: {'lr': 0.0004228128580463735, 'samples': 6647808, 'steps': 12983, 'batch_loss/train': 0.829069378785789} 12/24/2021 16:34:19 - INFO - codeparrot_training - Step 12984: {'lr': 0.00042280146703631126, 'samples': 6648320, 'steps': 12984, 'batch_loss/train': 0.8653337145224214} 12/24/2021 16:34:31 - INFO - codeparrot_training - Step 12985: {'lr': 0.0004227900753392556, 'samples': 6648832, 'steps': 12985, 'batch_loss/train': 0.7965043929871172} 12/24/2021 16:34:41 - INFO - codeparrot_training - Step 12986: {'lr': 0.0004227786829552519, 'samples': 6649344, 'steps': 12986, 'batch_loss/train': 0.8376366160809994} 12/24/2021 16:34:52 - INFO - codeparrot_training - Step 12987: {'lr': 0.0004227672898843453, 'samples': 6649856, 'steps': 12987, 'batch_loss/train': 0.8670059777796268} 12/24/2021 16:35:03 - INFO - codeparrot_training - Step 12988: {'lr': 0.0004227558961265813, 'samples': 6650368, 'steps': 12988, 'batch_loss/train': 0.7635803883895278} 12/24/2021 16:35:15 - INFO - codeparrot_training - Step 12989: {'lr': 0.00042274450168200493, 'samples': 6650880, 'steps': 12989, 'batch_loss/train': 1.117776634171605} 12/24/2021 16:35:25 - INFO - codeparrot_training - Step 12990: {'lr': 0.00042273310655066185, 'samples': 6651392, 'steps': 12990, 'batch_loss/train': 0.8456959133036435} 12/24/2021 16:35:36 - INFO - codeparrot_training - Step 12991: {'lr': 0.000422721710732597, 'samples': 6651904, 'steps': 12991, 'batch_loss/train': 0.7946229944936931} 12/24/2021 16:35:50 - INFO - codeparrot_training - Step 12992: {'lr': 0.0004227103142278559, 'samples': 6652416, 'steps': 12992, 'batch_loss/train': 0.8751590171596035} 12/24/2021 16:36:01 - INFO - codeparrot_training - Step 12993: {'lr': 0.00042269891703648386, 'samples': 6652928, 'steps': 12993, 'batch_loss/train': 0.8767042635008693} 
12/24/2021 16:36:11 - INFO - codeparrot_training - Step 12994: {'lr': 0.00042268751915852605, 'samples': 6653440, 'steps': 12994, 'batch_loss/train': 0.840452398173511} 12/24/2021 16:36:22 - INFO - codeparrot_training - Step 12995: {'lr': 0.00042267612059402796, 'samples': 6653952, 'steps': 12995, 'batch_loss/train': 0.8565544402226806} 12/24/2021 16:36:34 - INFO - codeparrot_training - Step 12996: {'lr': 0.0004226647213430347, 'samples': 6654464, 'steps': 12996, 'batch_loss/train': 0.8668846189975739} 12/24/2021 16:36:44 - INFO - codeparrot_training - Step 12997: {'lr': 0.00042265332140559175, 'samples': 6654976, 'steps': 12997, 'batch_loss/train': 0.8541797064244747} 12/24/2021 16:36:55 - INFO - codeparrot_training - Step 12998: {'lr': 0.0004226419207817444, 'samples': 6655488, 'steps': 12998, 'batch_loss/train': 0.8081822261447087} 12/24/2021 16:37:09 - INFO - codeparrot_training - Step 12999: {'lr': 0.00042263051947153795, 'samples': 6656000, 'steps': 12999, 'batch_loss/train': 0.6979994125431404} 12/24/2021 16:37:20 - INFO - codeparrot_training - Step 13000: {'lr': 0.00042261911747501767, 'samples': 6656512, 'steps': 13000, 'batch_loss/train': 1.042100409977138} 12/24/2021 16:37:30 - INFO - codeparrot_training - Step 13001: {'lr': 0.00042260771479222907, 'samples': 6657024, 'steps': 13001, 'batch_loss/train': 0.8218228998593986} 12/24/2021 16:37:42 - INFO - codeparrot_training - Step 13002: {'lr': 0.0004225963114232173, 'samples': 6657536, 'steps': 13002, 'batch_loss/train': 0.8825387079268694} 12/24/2021 16:37:53 - INFO - codeparrot_training - Step 13003: {'lr': 0.0004225849073680279, 'samples': 6658048, 'steps': 13003, 'batch_loss/train': 0.8849582988768816} 12/24/2021 16:38:03 - INFO - codeparrot_training - Step 13004: {'lr': 0.0004225735026267059, 'samples': 6658560, 'steps': 13004, 'batch_loss/train': 0.8320559058338404} 12/24/2021 16:38:14 - INFO - codeparrot_training - Step 13005: {'lr': 0.0004225620971992968, 'samples': 6659072, 'steps': 13005, 
'batch_loss/train': 1.014697226230055} 12/24/2021 16:38:26 - INFO - codeparrot_training - Step 13006: {'lr': 0.0004225506910858461, 'samples': 6659584, 'steps': 13006, 'batch_loss/train': 0.8912302805110812} 12/24/2021 16:38:37 - INFO - codeparrot_training - Step 13007: {'lr': 0.000422539284286399, 'samples': 6660096, 'steps': 13007, 'batch_loss/train': 0.7969660256057978} 12/24/2021 16:38:47 - INFO - codeparrot_training - Step 13008: {'lr': 0.0004225278768010008, 'samples': 6660608, 'steps': 13008, 'batch_loss/train': 0.8367911288514733} 12/24/2021 16:39:01 - INFO - codeparrot_training - Step 13009: {'lr': 0.00042251646862969685, 'samples': 6661120, 'steps': 13009, 'batch_loss/train': 0.807616968639195} 12/24/2021 16:39:12 - INFO - codeparrot_training - Step 13010: {'lr': 0.0004225050597725326, 'samples': 6661632, 'steps': 13010, 'batch_loss/train': 0.9081397503614426} 12/24/2021 16:39:22 - INFO - codeparrot_training - Step 13011: {'lr': 0.00042249365022955344, 'samples': 6662144, 'steps': 13011, 'batch_loss/train': 0.790582480840385} 12/24/2021 16:39:35 - INFO - codeparrot_training - Step 13012: {'lr': 0.0004224822400008046, 'samples': 6662656, 'steps': 13012, 'batch_loss/train': 0.9067452680319548} 12/24/2021 16:39:45 - INFO - codeparrot_training - Step 13013: {'lr': 0.0004224708290863315, 'samples': 6663168, 'steps': 13013, 'batch_loss/train': 0.8366348496638238} 12/24/2021 16:39:56 - INFO - codeparrot_training - Step 13014: {'lr': 0.0004224594174861795, 'samples': 6663680, 'steps': 13014, 'batch_loss/train': 0.814844858366996} 12/24/2021 16:40:08 - INFO - codeparrot_training - Step 13015: {'lr': 0.00042244800520039395, 'samples': 6664192, 'steps': 13015, 'batch_loss/train': 0.8299070438370109} 12/24/2021 16:40:19 - INFO - codeparrot_training - Step 13016: {'lr': 0.0004224365922290203, 'samples': 6664704, 'steps': 13016, 'batch_loss/train': 0.8946619965136051} 12/24/2021 16:40:29 - INFO - codeparrot_training - Step 13017: {'lr': 0.0004224251785721038, 
'samples': 6665216, 'steps': 13017, 'batch_loss/train': 0.898843658156693} 12/24/2021 16:40:40 - INFO - codeparrot_training - Step 13018: {'lr': 0.00042241376422968996, 'samples': 6665728, 'steps': 13018, 'batch_loss/train': 0.8849222734570503} 12/24/2021 16:40:52 - INFO - codeparrot_training - Step 13019: {'lr': 0.0004224023492018241, 'samples': 6666240, 'steps': 13019, 'batch_loss/train': 0.8875184264034033} 12/24/2021 16:41:02 - INFO - codeparrot_training - Step 13020: {'lr': 0.00042239093348855157, 'samples': 6666752, 'steps': 13020, 'batch_loss/train': 0.8142429143190384} 12/24/2021 16:41:13 - INFO - codeparrot_training - Step 13021: {'lr': 0.0004223795170899178, 'samples': 6667264, 'steps': 13021, 'batch_loss/train': 0.712556635029614} 12/24/2021 16:41:27 - INFO - codeparrot_training - Step 13022: {'lr': 0.00042236810000596814, 'samples': 6667776, 'steps': 13022, 'batch_loss/train': 0.7989917658269405} 12/24/2021 16:41:37 - INFO - codeparrot_training - Step 13023: {'lr': 0.000422356682236748, 'samples': 6668288, 'steps': 13023, 'batch_loss/train': 0.7957085063681006} 12/24/2021 16:41:48 - INFO - codeparrot_training - Step 13024: {'lr': 0.00042234526378230277, 'samples': 6668800, 'steps': 13024, 'batch_loss/train': 0.6589626700151712} 12/24/2021 16:42:00 - INFO - codeparrot_training - Step 13025: {'lr': 0.0004223338446426779, 'samples': 6669312, 'steps': 13025, 'batch_loss/train': 0.7192874809261411} 12/24/2021 16:42:11 - INFO - codeparrot_training - Step 13026: {'lr': 0.0004223224248179186, 'samples': 6669824, 'steps': 13026, 'batch_loss/train': 0.8128350535407662} 12/24/2021 16:42:21 - INFO - codeparrot_training - Step 13027: {'lr': 0.00042231100430807053, 'samples': 6670336, 'steps': 13027, 'batch_loss/train': 0.7505819397047162} 12/24/2021 16:42:36 - INFO - codeparrot_training - Step 13028: {'lr': 0.0004222995831131789, 'samples': 6670848, 'steps': 13028, 'batch_loss/train': 0.8713273471221328} 12/24/2021 16:42:46 - INFO - codeparrot_training - Step 13029: 
{'lr': 0.0004222881612332892, 'samples': 6671360, 'steps': 13029, 'batch_loss/train': 0.8348371647298336} 12/24/2021 16:42:57 - INFO - codeparrot_training - Step 13030: {'lr': 0.00042227673866844685, 'samples': 6671872, 'steps': 13030, 'batch_loss/train': 0.5584628558717668} 12/24/2021 16:43:07 - INFO - codeparrot_training - Step 13031: {'lr': 0.0004222653154186972, 'samples': 6672384, 'steps': 13031, 'batch_loss/train': 0.8696496095508337} 12/24/2021 16:43:20 - INFO - codeparrot_training - Step 13032: {'lr': 0.0004222538914840857, 'samples': 6672896, 'steps': 13032, 'batch_loss/train': 0.7789918419439346} 12/24/2021 16:43:30 - INFO - codeparrot_training - Step 13033: {'lr': 0.00042224246686465785, 'samples': 6673408, 'steps': 13033, 'batch_loss/train': 0.7508451219182462} 12/24/2021 16:43:41 - INFO - codeparrot_training - Step 13034: {'lr': 0.00042223104156045887, 'samples': 6673920, 'steps': 13034, 'batch_loss/train': 0.7919189482927322} 12/24/2021 16:43:53 - INFO - codeparrot_training - Step 13035: {'lr': 0.0004222196155715344, 'samples': 6674432, 'steps': 13035, 'batch_loss/train': 0.7666469966061413} 12/24/2021 16:44:04 - INFO - codeparrot_training - Step 13036: {'lr': 0.00042220818889792967, 'samples': 6674944, 'steps': 13036, 'batch_loss/train': 0.8327909549698234} 12/24/2021 16:44:14 - INFO - codeparrot_training - Step 13037: {'lr': 0.00042219676153969026, 'samples': 6675456, 'steps': 13037, 'batch_loss/train': 0.8984936960041523} 12/24/2021 16:44:28 - INFO - codeparrot_training - Step 13038: {'lr': 0.00042218533349686163, 'samples': 6675968, 'steps': 13038, 'batch_loss/train': 0.797020610421896} 12/24/2021 16:44:39 - INFO - codeparrot_training - Step 13039: {'lr': 0.00042217390476948896, 'samples': 6676480, 'steps': 13039, 'batch_loss/train': 0.8313870327547193} 12/24/2021 16:44:49 - INFO - codeparrot_training - Step 13040: {'lr': 0.000422162475357618, 'samples': 6676992, 'steps': 13040, 'batch_loss/train': 0.9251939682289958} 12/24/2021 16:45:00 - INFO - 
codeparrot_training - Step 13041: {'lr': 0.00042215104526129396, 'samples': 6677504, 'steps': 13041, 'batch_loss/train': 0.7407164219766855} 12/24/2021 16:45:12 - INFO - codeparrot_training - Step 13042: {'lr': 0.0004221396144805624, 'samples': 6678016, 'steps': 13042, 'batch_loss/train': 0.7497866554185748} 12/24/2021 16:45:23 - INFO - codeparrot_training - Step 13043: {'lr': 0.0004221281830154687, 'samples': 6678528, 'steps': 13043, 'batch_loss/train': 0.6723706083139405} 12/24/2021 16:45:33 - INFO - codeparrot_training - Step 13044: {'lr': 0.0004221167508660583, 'samples': 6679040, 'steps': 13044, 'batch_loss/train': 0.8112401450052857} 12/24/2021 16:45:45 - INFO - codeparrot_training - Step 13045: {'lr': 0.00042210531803237673, 'samples': 6679552, 'steps': 13045, 'batch_loss/train': 0.8690565046854317} 12/24/2021 16:45:56 - INFO - codeparrot_training - Step 13046: {'lr': 0.00042209388451446944, 'samples': 6680064, 'steps': 13046, 'batch_loss/train': 0.8057667752727866} 12/24/2021 16:46:07 - INFO - codeparrot_training - Step 13047: {'lr': 0.0004220824503123818, 'samples': 6680576, 'steps': 13047, 'batch_loss/train': 0.8355517974123359} 12/24/2021 16:46:21 - INFO - codeparrot_training - Step 13048: {'lr': 0.0004220710154261593, 'samples': 6681088, 'steps': 13048, 'batch_loss/train': 0.7158059785142541} 12/24/2021 16:46:31 - INFO - codeparrot_training - Step 13049: {'lr': 0.00042205957985584755, 'samples': 6681600, 'steps': 13049, 'batch_loss/train': 0.9339654054492712} 12/24/2021 16:46:42 - INFO - codeparrot_training - Step 13050: {'lr': 0.0004220481436014917, 'samples': 6682112, 'steps': 13050, 'batch_loss/train': 0.8802523007616401} 12/24/2021 16:46:52 - INFO - codeparrot_training - Step 13051: {'lr': 0.0004220367066631375, 'samples': 6682624, 'steps': 13051, 'batch_loss/train': 0.8858281802386045} 12/24/2021 16:47:05 - INFO - codeparrot_training - Step 13052: {'lr': 0.0004220252690408303, 'samples': 6683136, 'steps': 13052, 'batch_loss/train': 
0.9059086898341775} 12/24/2021 16:47:15 - INFO - codeparrot_training - Step 13053: {'lr': 0.00042201383073461554, 'samples': 6683648, 'steps': 13053, 'batch_loss/train': 0.7837138865143061} 12/24/2021 16:47:26 - INFO - codeparrot_training - Step 13054: {'lr': 0.00042200239174453876, 'samples': 6684160, 'steps': 13054, 'batch_loss/train': 0.824236567132175} 12/24/2021 16:47:38 - INFO - codeparrot_training - Step 13055: {'lr': 0.00042199095207064543, 'samples': 6684672, 'steps': 13055, 'batch_loss/train': 0.8167127736378461} 12/24/2021 16:47:48 - INFO - codeparrot_training - Step 13056: {'lr': 0.00042197951171298097, 'samples': 6685184, 'steps': 13056, 'batch_loss/train': 0.7987996125593781} 12/24/2021 16:47:59 - INFO - codeparrot_training - Step 13057: {'lr': 0.0004219680706715909, 'samples': 6685696, 'steps': 13057, 'batch_loss/train': 0.8129761908203363} 12/24/2021 16:48:13 - INFO - codeparrot_training - Step 13058: {'lr': 0.0004219566289465208, 'samples': 6686208, 'steps': 13058, 'batch_loss/train': 0.8367360085248947} 12/24/2021 16:48:24 - INFO - codeparrot_training - Step 13059: {'lr': 0.00042194518653781597, 'samples': 6686720, 'steps': 13059, 'batch_loss/train': 0.7538531725294888} 12/24/2021 16:48:34 - INFO - codeparrot_training - Step 13060: {'lr': 0.00042193374344552215, 'samples': 6687232, 'steps': 13060, 'batch_loss/train': 0.8887944733723998} 12/24/2021 16:48:45 - INFO - codeparrot_training - Step 13061: {'lr': 0.00042192229966968455, 'samples': 6687744, 'steps': 13061, 'batch_loss/train': 0.7824779665097594} 12/24/2021 16:48:57 - INFO - codeparrot_training - Step 13062: {'lr': 0.0004219108552103489, 'samples': 6688256, 'steps': 13062, 'batch_loss/train': 0.6332334568724036} 12/24/2021 16:49:08 - INFO - codeparrot_training - Step 13063: {'lr': 0.0004218994100675605, 'samples': 6688768, 'steps': 13063, 'batch_loss/train': 0.7438715272583067} 12/24/2021 16:49:18 - INFO - codeparrot_training - Step 13064: {'lr': 0.000421887964241365, 'samples': 6689280, 
'steps': 13064, 'batch_loss/train': 0.8649081652984023} 12/24/2021 16:49:30 - INFO - codeparrot_training - Step 13065: {'lr': 0.0004218765177318079, 'samples': 6689792, 'steps': 13065, 'batch_loss/train': 0.7449036955367774} 12/24/2021 16:49:41 - INFO - codeparrot_training - Step 13066: {'lr': 0.00042186507053893456, 'samples': 6690304, 'steps': 13066, 'batch_loss/train': 0.861285739345476} 12/24/2021 16:49:51 - INFO - codeparrot_training - Step 13067: {'lr': 0.0004218536226627907, 'samples': 6690816, 'steps': 13067, 'batch_loss/train': 0.8224632360506803} 12/24/2021 16:50:05 - INFO - codeparrot_training - Step 13068: {'lr': 0.00042184217410342164, 'samples': 6691328, 'steps': 13068, 'batch_loss/train': 0.8836798593401909} 12/24/2021 16:50:16 - INFO - codeparrot_training - Step 13069: {'lr': 0.00042183072486087303, 'samples': 6691840, 'steps': 13069, 'batch_loss/train': 0.8200494545744732} 12/24/2021 16:50:26 - INFO - codeparrot_training - Step 13070: {'lr': 0.0004218192749351903, 'samples': 6692352, 'steps': 13070, 'batch_loss/train': 0.7909489534795284} 12/24/2021 16:50:38 - INFO - codeparrot_training - Step 13071: {'lr': 0.0004218078243264191, 'samples': 6692864, 'steps': 13071, 'batch_loss/train': 0.8447822220623493} 12/24/2021 16:50:49 - INFO - codeparrot_training - Step 13072: {'lr': 0.0004217963730346048, 'samples': 6693376, 'steps': 13072, 'batch_loss/train': 0.8061314676888287} 12/24/2021 16:51:00 - INFO - codeparrot_training - Step 13073: {'lr': 0.0004217849210597929, 'samples': 6693888, 'steps': 13073, 'batch_loss/train': 0.8490304304286838} 12/24/2021 16:51:10 - INFO - codeparrot_training - Step 13074: {'lr': 0.0004217734684020292, 'samples': 6694400, 'steps': 13074, 'batch_loss/train': 0.9232624350115657} 12/24/2021 16:51:22 - INFO - codeparrot_training - Step 13075: {'lr': 0.0004217620150613589, 'samples': 6694912, 'steps': 13075, 'batch_loss/train': 0.7579597264993936} 12/24/2021 16:51:33 - INFO - codeparrot_training - Step 13076: {'lr': 
0.0004217505610378277, 'samples': 6695424, 'steps': 13076, 'batch_loss/train': 0.8566374685615301} 12/24/2021 16:51:44 - INFO - codeparrot_training - Step 13077: {'lr': 0.0004217391063314813, 'samples': 6695936, 'steps': 13077, 'batch_loss/train': 0.8428273824974895} 12/24/2021 16:51:57 - INFO - codeparrot_training - Step 13078: {'lr': 0.0004217276509423649, 'samples': 6696448, 'steps': 13078, 'batch_loss/train': 0.8802339620888233} 12/24/2021 16:52:08 - INFO - codeparrot_training - Step 13079: {'lr': 0.00042171619487052416, 'samples': 6696960, 'steps': 13079, 'batch_loss/train': 0.7408670908771455} 12/24/2021 16:52:19 - INFO - codeparrot_training - Step 13080: {'lr': 0.00042170473811600475, 'samples': 6697472, 'steps': 13080, 'batch_loss/train': 0.7662916025146842} 12/24/2021 16:52:31 - INFO - codeparrot_training - Step 13081: {'lr': 0.000421693280678852, 'samples': 6697984, 'steps': 13081, 'batch_loss/train': 0.6994417998939753} 12/24/2021 16:52:42 - INFO - codeparrot_training - Step 13082: {'lr': 0.0004216818225591118, 'samples': 6698496, 'steps': 13082, 'batch_loss/train': 0.7425871398299932} 12/24/2021 16:52:52 - INFO - codeparrot_training - Step 13083: {'lr': 0.00042167036375682937, 'samples': 6699008, 'steps': 13083, 'batch_loss/train': 0.7254088125191629} 12/24/2021 16:53:03 - INFO - codeparrot_training - Step 13084: {'lr': 0.00042165890427205046, 'samples': 6699520, 'steps': 13084, 'batch_loss/train': 0.768653488252312} 12/24/2021 16:53:17 - INFO - codeparrot_training - Step 13085: {'lr': 0.0004216474441048205, 'samples': 6700032, 'steps': 13085, 'batch_loss/train': 0.8007477829232812} 12/24/2021 16:53:27 - INFO - codeparrot_training - Step 13086: {'lr': 0.00042163598325518513, 'samples': 6700544, 'steps': 13086, 'batch_loss/train': 0.9080955665558577} 12/24/2021 16:53:38 - INFO - codeparrot_training - Step 13087: {'lr': 0.00042162452172318993, 'samples': 6701056, 'steps': 13087, 'batch_loss/train': 0.7961451306473464} 12/24/2021 16:53:50 - INFO - 
codeparrot_training - Step 13088: {'lr': 0.00042161305950888045, 'samples': 6701568, 'steps': 13088, 'batch_loss/train': 0.8136374466121197} 12/24/2021 16:54:01 - INFO - codeparrot_training - Step 13089: {'lr': 0.00042160159661230217, 'samples': 6702080, 'steps': 13089, 'batch_loss/train': 1.648950087837875} 12/24/2021 16:54:12 - INFO - codeparrot_training - Step 13090: {'lr': 0.0004215901330335008, 'samples': 6702592, 'steps': 13090, 'batch_loss/train': 0.7735862487461418} 12/24/2021 16:54:22 - INFO - codeparrot_training - Step 13091: {'lr': 0.0004215786687725218, 'samples': 6703104, 'steps': 13091, 'batch_loss/train': 0.7598436083644629} 12/24/2021 16:54:34 - INFO - codeparrot_training - Step 13092: {'lr': 0.00042156720382941084, 'samples': 6703616, 'steps': 13092, 'batch_loss/train': 0.6827410813421011} 12/24/2021 16:54:45 - INFO - codeparrot_training - Step 13093: {'lr': 0.00042155573820421347, 'samples': 6704128, 'steps': 13093, 'batch_loss/train': 0.8204029169864953} 12/24/2021 16:54:55 - INFO - codeparrot_training - Step 13094: {'lr': 0.0004215442718969752, 'samples': 6704640, 'steps': 13094, 'batch_loss/train': 0.8206214485689998} 12/24/2021 16:55:10 - INFO - codeparrot_training - Step 13095: {'lr': 0.0004215328049077417, 'samples': 6705152, 'steps': 13095, 'batch_loss/train': 1.7788200648501515} 12/24/2021 16:55:20 - INFO - codeparrot_training - Step 13096: {'lr': 0.0004215213372365586, 'samples': 6705664, 'steps': 13096, 'batch_loss/train': 0.8699695970863104} 12/24/2021 16:55:31 - INFO - codeparrot_training - Step 13097: {'lr': 0.0004215098688834714, 'samples': 6706176, 'steps': 13097, 'batch_loss/train': 0.8947219327092171} 12/24/2021 16:55:43 - INFO - codeparrot_training - Step 13098: {'lr': 0.0004214983998485257, 'samples': 6706688, 'steps': 13098, 'batch_loss/train': 0.7787996651604772} 12/24/2021 16:55:54 - INFO - codeparrot_training - Step 13099: {'lr': 0.00042148693013176707, 'samples': 6707200, 'steps': 13099, 'batch_loss/train': 
0.9528190204873681} 12/24/2021 16:56:04 - INFO - codeparrot_training - Step 13100: {'lr': 0.0004214754597332412, 'samples': 6707712, 'steps': 13100, 'batch_loss/train': 0.6809896575286984} 12/24/2021 16:56:15 - INFO - codeparrot_training - Step 13101: {'lr': 0.0004214639886529937, 'samples': 6708224, 'steps': 13101, 'batch_loss/train': 0.8001194875687361} 12/24/2021 16:56:27 - INFO - codeparrot_training - Step 13102: {'lr': 0.00042145251689107, 'samples': 6708736, 'steps': 13102, 'batch_loss/train': 0.7701521813869476} 12/24/2021 16:56:38 - INFO - codeparrot_training - Step 13103: {'lr': 0.0004214410444475159, 'samples': 6709248, 'steps': 13103, 'batch_loss/train': 0.8042446027975529} 12/24/2021 16:56:48 - INFO - codeparrot_training - Step 13104: {'lr': 0.00042142957132237696, 'samples': 6709760, 'steps': 13104, 'batch_loss/train': 0.9083388913422823} 12/24/2021 16:57:00 - INFO - codeparrot_training - Step 13105: {'lr': 0.00042141809751569876, 'samples': 6710272, 'steps': 13105, 'batch_loss/train': 0.876733418321237} 12/24/2021 16:57:11 - INFO - codeparrot_training - Step 13106: {'lr': 0.00042140662302752696, 'samples': 6710784, 'steps': 13106, 'batch_loss/train': 0.9210051447153091} 12/24/2021 16:57:22 - INFO - codeparrot_training - Step 13107: {'lr': 0.0004213951478579071, 'samples': 6711296, 'steps': 13107, 'batch_loss/train': 0.8606929956004024} 12/24/2021 16:57:36 - INFO - codeparrot_training - Step 13108: {'lr': 0.00042138367200688487, 'samples': 6711808, 'steps': 13108, 'batch_loss/train': 0.8006187113933265} 12/24/2021 16:57:46 - INFO - codeparrot_training - Step 13109: {'lr': 0.0004213721954745058, 'samples': 6712320, 'steps': 13109, 'batch_loss/train': 0.9175910842604935} 12/24/2021 16:57:57 - INFO - codeparrot_training - Step 13110: {'lr': 0.0004213607182608157, 'samples': 6712832, 'steps': 13110, 'batch_loss/train': 0.8409126475453377} 12/24/2021 16:58:07 - INFO - codeparrot_training - Step 13111: {'lr': 0.0004213492403658601, 'samples': 6713344, 
'steps': 13111, 'batch_loss/train': 0.8381201233714819} 12/24/2021 16:58:20 - INFO - codeparrot_training - Step 13112: {'lr': 0.00042133776178968454, 'samples': 6713856, 'steps': 13112, 'batch_loss/train': 0.8473222041502595} 12/24/2021 16:58:30 - INFO - codeparrot_training - Step 13113: {'lr': 0.00042132628253233474, 'samples': 6714368, 'steps': 13113, 'batch_loss/train': 0.7532546450383961} 12/24/2021 16:58:41 - INFO - codeparrot_training - Step 13114: {'lr': 0.00042131480259385634, 'samples': 6714880, 'steps': 13114, 'batch_loss/train': 0.6967521458864212} 12/24/2021 16:58:55 - INFO - codeparrot_training - Step 13115: {'lr': 0.000421303321974295, 'samples': 6715392, 'steps': 13115, 'batch_loss/train': 0.820867155212909} 12/24/2021 16:59:05 - INFO - codeparrot_training - Step 13116: {'lr': 0.0004212918406736963, 'samples': 6715904, 'steps': 13116, 'batch_loss/train': 0.7943852748721838} 12/24/2021 16:59:16 - INFO - codeparrot_training - Step 13117: {'lr': 0.0004212803586921059, 'samples': 6716416, 'steps': 13117, 'batch_loss/train': 0.8732364354655147} 12/24/2021 16:59:28 - INFO - codeparrot_training - Step 13118: {'lr': 0.0004212688760295695, 'samples': 6716928, 'steps': 13118, 'batch_loss/train': 0.9149840781465173} 12/24/2021 16:59:39 - INFO - codeparrot_training - Step 13119: {'lr': 0.0004212573926861327, 'samples': 6717440, 'steps': 13119, 'batch_loss/train': 0.7117884964682162} 12/24/2021 16:59:49 - INFO - codeparrot_training - Step 13120: {'lr': 0.0004212459086618412, 'samples': 6717952, 'steps': 13120, 'batch_loss/train': 0.8240248709917068} 12/24/2021 17:00:00 - INFO - codeparrot_training - Step 13121: {'lr': 0.00042123442395674063, 'samples': 6718464, 'steps': 13121, 'batch_loss/train': 0.7553907157853246} 12/24/2021 17:00:12 - INFO - codeparrot_training - Step 13122: {'lr': 0.00042122293857087663, 'samples': 6718976, 'steps': 13122, 'batch_loss/train': 0.8174107396043837} 12/24/2021 17:00:23 - INFO - codeparrot_training - Step 13123: {'lr': 
0.0004212114525042949, 'samples': 6719488, 'steps': 13123, 'batch_loss/train': 0.8471985463984311} 12/24/2021 17:00:33 - INFO - codeparrot_training - Step 13124: {'lr': 0.000421199965757041, 'samples': 6720000, 'steps': 13124, 'batch_loss/train': 0.8777696965262294} 12/24/2021 17:00:47 - INFO - codeparrot_training - Step 13125: {'lr': 0.00042118847832916075, 'samples': 6720512, 'steps': 13125, 'batch_loss/train': 0.8086504535749555} 12/24/2021 17:00:58 - INFO - codeparrot_training - Step 13126: {'lr': 0.0004211769902206997, 'samples': 6721024, 'steps': 13126, 'batch_loss/train': 0.8186282245442271} 12/24/2021 17:01:08 - INFO - codeparrot_training - Step 13127: {'lr': 0.0004211655014317036, 'samples': 6721536, 'steps': 13127, 'batch_loss/train': 0.7929616752080619} 12/24/2021 17:01:20 - INFO - codeparrot_training - Step 13128: {'lr': 0.00042115401196221814, 'samples': 6722048, 'steps': 13128, 'batch_loss/train': 0.7859851950779557} 12/24/2021 17:01:31 - INFO - codeparrot_training - Step 13129: {'lr': 0.0004211425218122889, 'samples': 6722560, 'steps': 13129, 'batch_loss/train': 0.7358655240386724} 12/24/2021 17:01:42 - INFO - codeparrot_training - Step 13130: {'lr': 0.00042113103098196167, 'samples': 6723072, 'steps': 13130, 'batch_loss/train': 0.8902473300695419} 12/24/2021 17:01:54 - INFO - codeparrot_training - Step 13131: {'lr': 0.000421119539471282, 'samples': 6723584, 'steps': 13131, 'batch_loss/train': 0.7605984611436725} 12/24/2021 17:02:04 - INFO - codeparrot_training - Step 13132: {'lr': 0.00042110804728029573, 'samples': 6724096, 'steps': 13132, 'batch_loss/train': 0.7795050721615553} 12/24/2021 17:02:15 - INFO - codeparrot_training - Step 13133: {'lr': 0.0004210965544090484, 'samples': 6724608, 'steps': 13133, 'batch_loss/train': 0.8133056019432843} 12/24/2021 17:02:25 - INFO - codeparrot_training - Step 13134: {'lr': 0.0004210850608575858, 'samples': 6725120, 'steps': 13134, 'batch_loss/train': 0.7581839985214174} 12/24/2021 17:02:39 - INFO - 
codeparrot_training - Step 13135: {'lr': 0.0004210735666259536, 'samples': 6725632, 'steps': 13135, 'batch_loss/train': 0.7936996873468161} 12/24/2021 17:02:50 - INFO - codeparrot_training - Step 13136: {'lr': 0.0004210620717141975, 'samples': 6726144, 'steps': 13136, 'batch_loss/train': 0.8584145000204444} 12/24/2021 17:03:01 - INFO - codeparrot_training - Step 13137: {'lr': 0.0004210505761223632, 'samples': 6726656, 'steps': 13137, 'batch_loss/train': 0.7955347178503871} 12/24/2021 17:03:13 - INFO - codeparrot_training - Step 13138: {'lr': 0.0004210390798504964, 'samples': 6727168, 'steps': 13138, 'batch_loss/train': 0.8258552616462111} 12/24/2021 17:03:23 - INFO - codeparrot_training - Step 13139: {'lr': 0.00042102758289864274, 'samples': 6727680, 'steps': 13139, 'batch_loss/train': 0.8441900098696351} 12/24/2021 17:03:34 - INFO - codeparrot_training - Step 13140: {'lr': 0.000421016085266848, 'samples': 6728192, 'steps': 13140, 'batch_loss/train': 0.8016094556078315} 12/24/2021 17:03:46 - INFO - codeparrot_training - Step 13141: {'lr': 0.000421004586955158, 'samples': 6728704, 'steps': 13141, 'batch_loss/train': 0.7931282571516931} 12/24/2021 17:03:57 - INFO - codeparrot_training - Step 13142: {'lr': 0.00042099308796361825, 'samples': 6729216, 'steps': 13142, 'batch_loss/train': 0.9046736750751734} 12/24/2021 17:04:08 - INFO - codeparrot_training - Step 13143: {'lr': 0.0004209815882922745, 'samples': 6729728, 'steps': 13143, 'batch_loss/train': 0.7448652876773849} 12/24/2021 17:04:18 - INFO - codeparrot_training - Step 13144: {'lr': 0.0004209700879411725, 'samples': 6730240, 'steps': 13144, 'batch_loss/train': 0.8598833531141281} 12/24/2021 17:04:32 - INFO - codeparrot_training - Step 13145: {'lr': 0.00042095858691035807, 'samples': 6730752, 'steps': 13145, 'batch_loss/train': 0.7562524392269552} 12/24/2021 17:04:43 - INFO - codeparrot_training - Step 13146: {'lr': 0.00042094708519987685, 'samples': 6731264, 'steps': 13146, 'batch_loss/train': 
0.7780975312925875} 12/24/2021 17:04:53 - INFO - codeparrot_training - Step 13147: {'lr': 0.0004209355828097745, 'samples': 6731776, 'steps': 13147, 'batch_loss/train': 0.8267309288494289} 12/24/2021 17:05:05 - INFO - codeparrot_training - Step 13148: {'lr': 0.0004209240797400969, 'samples': 6732288, 'steps': 13148, 'batch_loss/train': 0.7949872203171253} 12/24/2021 17:05:16 - INFO - codeparrot_training - Step 13149: {'lr': 0.0004209125759908896, 'samples': 6732800, 'steps': 13149, 'batch_loss/train': 1.6677916990593076} 12/24/2021 17:05:27 - INFO - codeparrot_training - Step 13150: {'lr': 0.0004209010715621985, 'samples': 6733312, 'steps': 13150, 'batch_loss/train': 0.9607327671255916} 12/24/2021 17:05:39 - INFO - codeparrot_training - Step 13151: {'lr': 0.0004208895664540693, 'samples': 6733824, 'steps': 13151, 'batch_loss/train': 0.7455375869758427} 12/24/2021 17:05:50 - INFO - codeparrot_training - Step 13152: {'lr': 0.00042087806066654765, 'samples': 6734336, 'steps': 13152, 'batch_loss/train': 0.8654231792315841} 12/24/2021 17:06:00 - INFO - codeparrot_training - Step 13153: {'lr': 0.00042086655419967936, 'samples': 6734848, 'steps': 13153, 'batch_loss/train': 0.7539959447458386} 12/24/2021 17:06:11 - INFO - codeparrot_training - Step 13154: {'lr': 0.0004208550470535102, 'samples': 6735360, 'steps': 13154, 'batch_loss/train': 0.7595639815554023} 12/24/2021 17:06:25 - INFO - codeparrot_training - Step 13155: {'lr': 0.00042084353922808575, 'samples': 6735872, 'steps': 13155, 'batch_loss/train': 0.754623148823157} 12/24/2021 17:06:36 - INFO - codeparrot_training - Step 13156: {'lr': 0.00042083203072345205, 'samples': 6736384, 'steps': 13156, 'batch_loss/train': 0.6780898203141987} 12/24/2021 17:06:46 - INFO - codeparrot_training - Step 13157: {'lr': 0.00042082052153965464, 'samples': 6736896, 'steps': 13157, 'batch_loss/train': 0.6764449521433562} 12/24/2021 17:06:59 - INFO - codeparrot_training - Step 13158: {'lr': 0.00042080901167673934, 'samples': 6737408, 
'steps': 13158, 'batch_loss/train': 0.8008964778855443} 12/24/2021 17:07:09 - INFO - codeparrot_training - Step 13159: {'lr': 0.0004207975011347519, 'samples': 6737920, 'steps': 13159, 'batch_loss/train': 0.8585302941501141} 12/24/2021 17:07:20 - INFO - codeparrot_training - Step 13160: {'lr': 0.00042078598991373813, 'samples': 6738432, 'steps': 13160, 'batch_loss/train': 0.8295428650453687} 12/24/2021 17:07:32 - INFO - codeparrot_training - Step 13161: {'lr': 0.00042077447801374367, 'samples': 6738944, 'steps': 13161, 'batch_loss/train': 0.8740113712847233} 12/24/2021 17:07:43 - INFO - codeparrot_training - Step 13162: {'lr': 0.0004207629654348145, 'samples': 6739456, 'steps': 13162, 'batch_loss/train': 0.8759404812008142} 12/24/2021 17:07:53 - INFO - codeparrot_training - Step 13163: {'lr': 0.0004207514521769961, 'samples': 6739968, 'steps': 13163, 'batch_loss/train': 0.7964446407277137} 12/24/2021 17:08:04 - INFO - codeparrot_training - Step 13164: {'lr': 0.00042073993824033454, 'samples': 6740480, 'steps': 13164, 'batch_loss/train': 0.7669547580881044} 12/24/2021 17:08:17 - INFO - codeparrot_training - Step 13165: {'lr': 0.0004207284236248753, 'samples': 6740992, 'steps': 13165, 'batch_loss/train': 0.8292675539851189} 12/24/2021 17:08:28 - INFO - codeparrot_training - Step 13166: {'lr': 0.00042071690833066444, 'samples': 6741504, 'steps': 13166, 'batch_loss/train': 0.7647723373956978} 12/24/2021 17:08:39 - INFO - codeparrot_training - Step 13167: {'lr': 0.0004207053923577476, 'samples': 6742016, 'steps': 13167, 'batch_loss/train': 0.8224443276412785} 12/24/2021 17:08:51 - INFO - codeparrot_training - Step 13168: {'lr': 0.0004206938757061705, 'samples': 6742528, 'steps': 13168, 'batch_loss/train': 0.8203578346874565} 12/24/2021 17:09:02 - INFO - codeparrot_training - Step 13169: {'lr': 0.0004206823583759791, 'samples': 6743040, 'steps': 13169, 'batch_loss/train': 0.8630714854225516} 12/24/2021 17:09:12 - INFO - codeparrot_training - Step 13170: {'lr': 
0.00042067084036721895, 'samples': 6743552, 'steps': 13170, 'batch_loss/train': 0.6003166257869452} 12/24/2021 17:09:24 - INFO - codeparrot_training - Step 13171: {'lr': 0.00042065932167993615, 'samples': 6744064, 'steps': 13171, 'batch_loss/train': 0.7873190985992551} 12/24/2021 17:09:35 - INFO - codeparrot_training - Step 13172: {'lr': 0.0004206478023141762, 'samples': 6744576, 'steps': 13172, 'batch_loss/train': 0.712820737157017} 12/24/2021 17:09:45 - INFO - codeparrot_training - Step 13173: {'lr': 0.00042063628226998506, 'samples': 6745088, 'steps': 13173, 'batch_loss/train': 0.7872478519566357} 12/24/2021 17:09:56 - INFO - codeparrot_training - Step 13174: {'lr': 0.0004206247615474085, 'samples': 6745600, 'steps': 13174, 'batch_loss/train': 0.7385636568069458} 12/24/2021 17:10:11 - INFO - codeparrot_training - Step 13175: {'lr': 0.00042061324014649234, 'samples': 6746112, 'steps': 13175, 'batch_loss/train': 0.851138062775135} 12/24/2021 17:10:21 - INFO - codeparrot_training - Step 13176: {'lr': 0.00042060171806728233, 'samples': 6746624, 'steps': 13176, 'batch_loss/train': 0.7606805134564638} 12/24/2021 17:10:32 - INFO - codeparrot_training - Step 13177: {'lr': 0.00042059019530982423, 'samples': 6747136, 'steps': 13177, 'batch_loss/train': 0.7792458129115403} 12/24/2021 17:10:44 - INFO - codeparrot_training - Step 13178: {'lr': 0.00042057867187416406, 'samples': 6747648, 'steps': 13178, 'batch_loss/train': 0.7666065362282097} 12/24/2021 17:10:55 - INFO - codeparrot_training - Step 13179: {'lr': 0.0004205671477603474, 'samples': 6748160, 'steps': 13179, 'batch_loss/train': 0.8250810289755464} 12/24/2021 17:11:05 - INFO - codeparrot_training - Step 13180: {'lr': 0.0004205556229684202, 'samples': 6748672, 'steps': 13180, 'batch_loss/train': 0.8443933548405766} 12/24/2021 17:11:20 - INFO - codeparrot_training - Step 13181: {'lr': 0.00042054409749842824, 'samples': 6749184, 'steps': 13181, 'batch_loss/train': 0.8561663711443543} 12/24/2021 17:11:31 - INFO - 
codeparrot_training - Step 13182: {'lr': 0.0004205325713504174, 'samples': 6749696, 'steps': 13182, 'batch_loss/train': 0.7828223360702395} 12/24/2021 17:11:41 - INFO - codeparrot_training - Step 13183: {'lr': 0.0004205210445244334, 'samples': 6750208, 'steps': 13183, 'batch_loss/train': 0.773463205434382} 12/24/2021 17:11:52 - INFO - codeparrot_training - Step 13184: {'lr': 0.0004205095170205221, 'samples': 6750720, 'steps': 13184, 'batch_loss/train': 0.6466374862939119} 12/24/2021 17:12:04 - INFO - codeparrot_training - Step 13185: {'lr': 0.00042049798883872936, 'samples': 6751232, 'steps': 13185, 'batch_loss/train': 0.9124206686392426} 12/24/2021 17:12:14 - INFO - codeparrot_training - Step 13186: {'lr': 0.00042048645997910095, 'samples': 6751744, 'steps': 13186, 'batch_loss/train': 0.8038037652149796} 12/24/2021 17:12:25 - INFO - codeparrot_training - Step 13187: {'lr': 0.00042047493044168285, 'samples': 6752256, 'steps': 13187, 'batch_loss/train': 0.922662565484643} 12/24/2021 17:12:37 - INFO - codeparrot_training - Step 13188: {'lr': 0.00042046340022652075, 'samples': 6752768, 'steps': 13188, 'batch_loss/train': 0.7041425784118474} 12/24/2021 17:12:48 - INFO - codeparrot_training - Step 13189: {'lr': 0.0004204518693336604, 'samples': 6753280, 'steps': 13189, 'batch_loss/train': 0.8429762180894613} 12/24/2021 17:12:58 - INFO - codeparrot_training - Step 13190: {'lr': 0.0004204403377631479, 'samples': 6753792, 'steps': 13190, 'batch_loss/train': 0.9573446903377771} 12/24/2021 17:13:12 - INFO - codeparrot_training - Step 13191: {'lr': 0.00042042880551502906, 'samples': 6754304, 'steps': 13191, 'batch_loss/train': 0.8361318558454514} 12/24/2021 17:13:23 - INFO - codeparrot_training - Step 13192: {'lr': 0.0004204172725893495, 'samples': 6754816, 'steps': 13192, 'batch_loss/train': 0.9040102055296302} 12/24/2021 17:13:33 - INFO - codeparrot_training - Step 13193: {'lr': 0.00042040573898615525, 'samples': 6755328, 'steps': 13193, 'batch_loss/train': 
0.7336675287224352} 12/24/2021 17:13:44 - INFO - codeparrot_training - Step 13194: {'lr': 0.00042039420470549213, 'samples': 6755840, 'steps': 13194, 'batch_loss/train': 0.7916770880110562} 12/24/2021 17:13:56 - INFO - codeparrot_training - Step 13195: {'lr': 0.00042038266974740604, 'samples': 6756352, 'steps': 13195, 'batch_loss/train': 0.819748766720295} 12/24/2021 17:14:07 - INFO - codeparrot_training - Step 13196: {'lr': 0.00042037113411194275, 'samples': 6756864, 'steps': 13196, 'batch_loss/train': 0.775233889464289} 12/24/2021 17:14:17 - INFO - codeparrot_training - Step 13197: {'lr': 0.0004203595977991481, 'samples': 6757376, 'steps': 13197, 'batch_loss/train': 0.8552856640890241} 12/24/2021 17:14:31 - INFO - codeparrot_training - Step 13198: {'lr': 0.0004203480608090681, 'samples': 6757888, 'steps': 13198, 'batch_loss/train': 0.6801711132284254} 12/24/2021 17:14:42 - INFO - codeparrot_training - Step 13199: {'lr': 0.00042033652314174845, 'samples': 6758400, 'steps': 13199, 'batch_loss/train': 0.8157510804012418} 12/24/2021 17:14:52 - INFO - codeparrot_training - Step 13200: {'lr': 0.00042032498479723516, 'samples': 6758912, 'steps': 13200, 'batch_loss/train': 0.9044127142988145} 12/24/2021 17:15:04 - INFO - codeparrot_training - Step 13201: {'lr': 0.00042031344577557407, 'samples': 6759424, 'steps': 13201, 'batch_loss/train': 0.8240072159096599} 12/24/2021 17:15:15 - INFO - codeparrot_training - Step 13202: {'lr': 0.00042030190607681095, 'samples': 6759936, 'steps': 13202, 'batch_loss/train': 0.797450901940465} 12/24/2021 17:15:26 - INFO - codeparrot_training - Step 13203: {'lr': 0.00042029036570099176, 'samples': 6760448, 'steps': 13203, 'batch_loss/train': 0.8776646926999092} 12/24/2021 17:15:38 - INFO - codeparrot_training - Step 13204: {'lr': 0.0004202788246481625, 'samples': 6760960, 'steps': 13204, 'batch_loss/train': 0.7455227500759065} 12/24/2021 17:15:48 - INFO - codeparrot_training - Step 13205: {'lr': 0.00042026728291836874, 'samples': 6761472, 
'steps': 13205, 'batch_loss/train': 0.7987164817750454} 12/24/2021 17:15:59 - INFO - codeparrot_training - Step 13206: {'lr': 0.00042025574051165673, 'samples': 6761984, 'steps': 13206, 'batch_loss/train': 0.7251236925367266} 12/24/2021 17:16:09 - INFO - codeparrot_training - Step 13207: {'lr': 0.00042024419742807206, 'samples': 6762496, 'steps': 13207, 'batch_loss/train': 0.8656305363401771} 12/24/2021 17:16:22 - INFO - codeparrot_training - Step 13208: {'lr': 0.0004202326536676608, 'samples': 6763008, 'steps': 13208, 'batch_loss/train': 0.7224058164283633} 12/24/2021 17:16:32 - INFO - codeparrot_training - Step 13209: {'lr': 0.0004202211092304687, 'samples': 6763520, 'steps': 13209, 'batch_loss/train': 0.7434908912982792} 12/24/2021 17:16:43 - INFO - codeparrot_training - Step 13210: {'lr': 0.0004202095641165419, 'samples': 6764032, 'steps': 13210, 'batch_loss/train': 0.8964906767942011} 12/24/2021 17:16:57 - INFO - codeparrot_training - Step 13211: {'lr': 0.000420198018325926, 'samples': 6764544, 'steps': 13211, 'batch_loss/train': 0.9036046036053449} 12/24/2021 17:17:07 - INFO - codeparrot_training - Step 13212: {'lr': 0.0004201864718586671, 'samples': 6765056, 'steps': 13212, 'batch_loss/train': 0.8115497333928943} 12/24/2021 17:17:18 - INFO - codeparrot_training - Step 13213: {'lr': 0.000420174924714811, 'samples': 6765568, 'steps': 13213, 'batch_loss/train': 0.7471444895491004} 12/24/2021 17:17:30 - INFO - codeparrot_training - Step 13214: {'lr': 0.00042016337689440374, 'samples': 6766080, 'steps': 13214, 'batch_loss/train': 0.8473425358533859} 12/24/2021 17:17:41 - INFO - codeparrot_training - Step 13215: {'lr': 0.00042015182839749113, 'samples': 6766592, 'steps': 13215, 'batch_loss/train': 0.7608842412009835} 12/24/2021 17:17:51 - INFO - codeparrot_training - Step 13216: {'lr': 0.00042014027922411903, 'samples': 6767104, 'steps': 13216, 'batch_loss/train': 0.7013066569343209} 12/24/2021 17:18:02 - INFO - codeparrot_training - Step 13217: {'lr': 
0.00042012872937433347, 'samples': 6767616, 'steps': 13217, 'batch_loss/train': 0.742322466801852} 12/24/2021 17:18:14 - INFO - codeparrot_training - Step 13218: {'lr': 0.00042011717884818037, 'samples': 6768128, 'steps': 13218, 'batch_loss/train': 0.7789000226184726} 12/24/2021 17:18:24 - INFO - codeparrot_training - Step 13219: {'lr': 0.0004201056276457055, 'samples': 6768640, 'steps': 13219, 'batch_loss/train': 0.7577952770516276} 12/24/2021 17:18:35 - INFO - codeparrot_training - Step 13220: {'lr': 0.0004200940757669549, 'samples': 6769152, 'steps': 13220, 'batch_loss/train': 1.5286987030413002} 12/24/2021 17:18:49 - INFO - codeparrot_training - Step 13221: {'lr': 0.00042008252321197457, 'samples': 6769664, 'steps': 13221, 'batch_loss/train': 0.8726291283965111} 12/24/2021 17:19:00 - INFO - codeparrot_training - Step 13222: {'lr': 0.0004200709699808103, 'samples': 6770176, 'steps': 13222, 'batch_loss/train': 0.7807420853059739} 12/24/2021 17:19:11 - INFO - codeparrot_training - Step 13223: {'lr': 0.0004200594160735081, 'samples': 6770688, 'steps': 13223, 'batch_loss/train': 0.9366481369361281} 12/24/2021 17:19:23 - INFO - codeparrot_training - Step 13224: {'lr': 0.0004200478614901138, 'samples': 6771200, 'steps': 13224, 'batch_loss/train': 0.8520561046898365} 12/24/2021 17:19:33 - INFO - codeparrot_training - Step 13225: {'lr': 0.00042003630623067346, 'samples': 6771712, 'steps': 13225, 'batch_loss/train': 0.8370273830369115} 12/24/2021 17:19:44 - INFO - codeparrot_training - Step 13226: {'lr': 0.0004200247502952329, 'samples': 6772224, 'steps': 13226, 'batch_loss/train': 0.7792091674637049} 12/24/2021 17:19:58 - INFO - codeparrot_training - Step 13227: {'lr': 0.0004200131936838382, 'samples': 6772736, 'steps': 13227, 'batch_loss/train': 0.8585053854621947} 12/24/2021 17:20:08 - INFO - codeparrot_training - Step 13228: {'lr': 0.0004200016363965352, 'samples': 6773248, 'steps': 13228, 'batch_loss/train': 0.8052849899977446} 12/24/2021 17:20:19 - INFO - 
codeparrot_training - Step 13229: {'lr': 0.0004199900784333699, 'samples': 6773760, 'steps': 13229, 'batch_loss/train': 0.8036480685696006} 12/24/2021 17:20:29 - INFO - codeparrot_training - Step 13230: {'lr': 0.0004199785197943883, 'samples': 6774272, 'steps': 13230, 'batch_loss/train': 1.8355634207837284} 12/24/2021 17:20:41 - INFO - codeparrot_training - Step 13231: {'lr': 0.0004199669604796361, 'samples': 6774784, 'steps': 13231, 'batch_loss/train': 0.818474655970931} 12/24/2021 17:20:52 - INFO - codeparrot_training - Step 13232: {'lr': 0.00041995540048915964, 'samples': 6775296, 'steps': 13232, 'batch_loss/train': 0.8032191414386034} 12/24/2021 17:21:03 - INFO - codeparrot_training - Step 13233: {'lr': 0.00041994383982300453, 'samples': 6775808, 'steps': 13233, 'batch_loss/train': 0.8257854050025344} 12/24/2021 17:21:15 - INFO - codeparrot_training - Step 13234: {'lr': 0.000419932278481217, 'samples': 6776320, 'steps': 13234, 'batch_loss/train': 0.8169780680909753} 12/24/2021 17:21:26 - INFO - codeparrot_training - Step 13235: {'lr': 0.0004199207164638429, 'samples': 6776832, 'steps': 13235, 'batch_loss/train': 0.6918790945783257} 12/24/2021 17:21:36 - INFO - codeparrot_training - Step 13236: {'lr': 0.0004199091537709281, 'samples': 6777344, 'steps': 13236, 'batch_loss/train': 0.7978520188480616} 12/24/2021 17:21:50 - INFO - codeparrot_training - Step 13237: {'lr': 0.0004198975904025187, 'samples': 6777856, 'steps': 13237, 'batch_loss/train': 0.9272961686365306} 12/24/2021 17:22:01 - INFO - codeparrot_training - Step 13238: {'lr': 0.0004198860263586607, 'samples': 6778368, 'steps': 13238, 'batch_loss/train': 0.8610140793025494} 12/24/2021 17:22:11 - INFO - codeparrot_training - Step 13239: {'lr': 0.0004198744616393999, 'samples': 6778880, 'steps': 13239, 'batch_loss/train': 1.0166155565530062} 12/24/2021 17:22:22 - INFO - codeparrot_training - Step 13240: {'lr': 0.00041986289624478234, 'samples': 6779392, 'steps': 13240, 'batch_loss/train': 0.9031170625239611} 
12/24/2021 17:22:34 - INFO - codeparrot_training - Step 13241: {'lr': 0.0004198513301748542, 'samples': 6779904, 'steps': 13241, 'batch_loss/train': 0.8386277128010988} 12/24/2021 17:22:45 - INFO - codeparrot_training - Step 13242: {'lr': 0.00041983976342966126, 'samples': 6780416, 'steps': 13242, 'batch_loss/train': 0.9529253551736474} 12/24/2021 17:22:56 - INFO - codeparrot_training - Step 13243: {'lr': 0.0004198281960092495, 'samples': 6780928, 'steps': 13243, 'batch_loss/train': 0.8206408582627773} 12/24/2021 17:23:08 - INFO - codeparrot_training - Step 13244: {'lr': 0.0004198166279136649, 'samples': 6781440, 'steps': 13244, 'batch_loss/train': 0.7052513130474836} 12/24/2021 17:23:19 - INFO - codeparrot_training - Step 13245: {'lr': 0.00041980505914295365, 'samples': 6781952, 'steps': 13245, 'batch_loss/train': 0.8615616867318749} 12/24/2021 17:23:29 - INFO - codeparrot_training - Step 13246: {'lr': 0.0004197934896971615, 'samples': 6782464, 'steps': 13246, 'batch_loss/train': 0.8064258699305356} 12/24/2021 17:23:41 - INFO - codeparrot_training - Step 13247: {'lr': 0.0004197819195763346, 'samples': 6782976, 'steps': 13247, 'batch_loss/train': 0.7359414137899876} 12/24/2021 17:23:52 - INFO - codeparrot_training - Step 13248: {'lr': 0.00041977034878051887, 'samples': 6783488, 'steps': 13248, 'batch_loss/train': 0.8326709279790521} 12/24/2021 17:24:03 - INFO - codeparrot_training - Step 13249: {'lr': 0.00041975877730976034, 'samples': 6784000, 'steps': 13249, 'batch_loss/train': 1.1424960168078542} 12/24/2021 17:24:13 - INFO - codeparrot_training - Step 13250: {'lr': 0.00041974720516410504, 'samples': 6784512, 'steps': 13250, 'batch_loss/train': 0.9345102943480015} 12/24/2021 17:24:27 - INFO - codeparrot_training - Step 13251: {'lr': 0.0004197356323435989, 'samples': 6785024, 'steps': 13251, 'batch_loss/train': 0.716802439186722} 12/24/2021 17:24:38 - INFO - codeparrot_training - Step 13252: {'lr': 0.000419724058848288, 'samples': 6785536, 'steps': 13252, 
'batch_loss/train': 0.8231111750937998} 12/24/2021 17:24:48 - INFO - codeparrot_training - Step 13253: {'lr': 0.00041971248467821833, 'samples': 6786048, 'steps': 13253, 'batch_loss/train': 0.8548115501180291} 12/24/2021 17:25:00 - INFO - codeparrot_training - Step 13254: {'lr': 0.00041970090983343587, 'samples': 6786560, 'steps': 13254, 'batch_loss/train': 0.8501289011910558} 12/24/2021 17:25:11 - INFO - codeparrot_training - Step 13255: {'lr': 0.0004196893343139867, 'samples': 6787072, 'steps': 13255, 'batch_loss/train': 0.6921546200755984} 12/24/2021 17:25:21 - INFO - codeparrot_training - Step 13256: {'lr': 0.0004196777581199168, 'samples': 6787584, 'steps': 13256, 'batch_loss/train': 0.7871983805671334} 12/24/2021 17:25:36 - INFO - codeparrot_training - Step 13257: {'lr': 0.0004196661812512722, 'samples': 6788096, 'steps': 13257, 'batch_loss/train': 0.7923246412537992} 12/24/2021 17:25:46 - INFO - codeparrot_training - Step 13258: {'lr': 0.0004196546037080989, 'samples': 6788608, 'steps': 13258, 'batch_loss/train': 0.8318671612069011} 12/24/2021 17:25:57 - INFO - codeparrot_training - Step 13259: {'lr': 0.00041964302549044296, 'samples': 6789120, 'steps': 13259, 'batch_loss/train': 0.7719853976741433} 12/24/2021 17:26:08 - INFO - codeparrot_training - Step 13260: {'lr': 0.0004196314465983505, 'samples': 6789632, 'steps': 13260, 'batch_loss/train': 0.7263758017215878} 12/24/2021 17:26:20 - INFO - codeparrot_training - Step 13261: {'lr': 0.0004196198670318674, 'samples': 6790144, 'steps': 13261, 'batch_loss/train': 0.7636184627190232} 12/24/2021 17:26:30 - INFO - codeparrot_training - Step 13262: {'lr': 0.0004196082867910397, 'samples': 6790656, 'steps': 13262, 'batch_loss/train': 0.7440125392749906} 12/24/2021 17:26:41 - INFO - codeparrot_training - Step 13263: {'lr': 0.00041959670587591355, 'samples': 6791168, 'steps': 13263, 'batch_loss/train': 0.7965268651023507} 12/24/2021 17:26:53 - INFO - codeparrot_training - Step 13264: {'lr': 0.0004195851242865349, 
'samples': 6791680, 'steps': 13264, 'batch_loss/train': 0.6420968394959345} 12/24/2021 17:27:04 - INFO - codeparrot_training - Step 13265: {'lr': 0.0004195735420229498, 'samples': 6792192, 'steps': 13265, 'batch_loss/train': 0.8301542485132813} 12/24/2021 17:27:14 - INFO - codeparrot_training - Step 13266: {'lr': 0.0004195619590852045, 'samples': 6792704, 'steps': 13266, 'batch_loss/train': 0.7865244783461094} 12/24/2021 17:27:25 - INFO - codeparrot_training - Step 13267: {'lr': 0.0004195503754733447, 'samples': 6793216, 'steps': 13267, 'batch_loss/train': 0.8419237041380256} 12/24/2021 17:27:39 - INFO - codeparrot_training - Step 13268: {'lr': 0.00041953879118741664, 'samples': 6793728, 'steps': 13268, 'batch_loss/train': 0.822650546208024} 12/24/2021 17:27:50 - INFO - codeparrot_training - Step 13269: {'lr': 0.0004195272062274664, 'samples': 6794240, 'steps': 13269, 'batch_loss/train': 0.7038432052358985} 12/24/2021 17:28:01 - INFO - codeparrot_training - Step 13270: {'lr': 0.00041951562059353996, 'samples': 6794752, 'steps': 13270, 'batch_loss/train': 1.0804718043655157} 12/24/2021 17:28:12 - INFO - codeparrot_training - Step 13271: {'lr': 0.0004195040342856835, 'samples': 6795264, 'steps': 13271, 'batch_loss/train': 0.5876227328553796} 12/24/2021 17:28:23 - INFO - codeparrot_training - Step 13272: {'lr': 0.000419492447303943, 'samples': 6795776, 'steps': 13272, 'batch_loss/train': 0.9046404212713242} 12/24/2021 17:28:34 - INFO - codeparrot_training - Step 13273: {'lr': 0.00041948085964836444, 'samples': 6796288, 'steps': 13273, 'batch_loss/train': 0.9339744411408901} 12/24/2021 17:28:44 - INFO - codeparrot_training - Step 13274: {'lr': 0.000419469271318994, 'samples': 6796800, 'steps': 13274, 'batch_loss/train': 0.8228378109633923} 12/24/2021 17:28:56 - INFO - codeparrot_training - Step 13275: {'lr': 0.00041945768231587777, 'samples': 6797312, 'steps': 13275, 'batch_loss/train': 0.8863197742030025} 12/24/2021 17:29:07 - INFO - codeparrot_training - Step 13276: 
{'lr': 0.00041944609263906175, 'samples': 6797824, 'steps': 13276, 'batch_loss/train': 0.8466470362618566} 12/24/2021 17:29:18 - INFO - codeparrot_training - Step 13277: {'lr': 0.000419434502288592, 'samples': 6798336, 'steps': 13277, 'batch_loss/train': 0.7285827847663313} 12/24/2021 17:29:32 - INFO - codeparrot_training - Step 13278: {'lr': 0.0004194229112645148, 'samples': 6798848, 'steps': 13278, 'batch_loss/train': 0.812964313197881} 12/24/2021 17:29:42 - INFO - codeparrot_training - Step 13279: {'lr': 0.00041941131956687596, 'samples': 6799360, 'steps': 13279, 'batch_loss/train': 1.5197123121470213} 12/24/2021 17:29:53 - INFO - codeparrot_training - Step 13280: {'lr': 0.0004193997271957217, 'samples': 6799872, 'steps': 13280, 'batch_loss/train': 0.8432503836229444} 12/24/2021 17:30:04 - INFO - codeparrot_training - Step 13281: {'lr': 0.00041938813415109803, 'samples': 6800384, 'steps': 13281, 'batch_loss/train': 0.8109947703778744} 12/24/2021 17:30:16 - INFO - codeparrot_training - Step 13282: {'lr': 0.0004193765404330512, 'samples': 6800896, 'steps': 13282, 'batch_loss/train': 0.8635568963363767} 12/24/2021 17:30:26 - INFO - codeparrot_training - Step 13283: {'lr': 0.0004193649460416271, 'samples': 6801408, 'steps': 13283, 'batch_loss/train': 0.8430218491703272} 12/24/2021 17:30:37 - INFO - codeparrot_training - Step 13284: {'lr': 0.000419353350976872, 'samples': 6801920, 'steps': 13284, 'batch_loss/train': 0.7463191552087665} 12/24/2021 17:30:49 - INFO - codeparrot_training - Step 13285: {'lr': 0.00041934175523883186, 'samples': 6802432, 'steps': 13285, 'batch_loss/train': 0.8257656851783395} 12/24/2021 17:31:00 - INFO - codeparrot_training - Step 13286: {'lr': 0.00041933015882755293, 'samples': 6802944, 'steps': 13286, 'batch_loss/train': 0.6541245197877288} 12/24/2021 17:31:10 - INFO - codeparrot_training - Step 13287: {'lr': 0.00041931856174308115, 'samples': 6803456, 'steps': 13287, 'batch_loss/train': 0.8294146803673357} 12/24/2021 17:31:24 - INFO - 
codeparrot_training - Step 13288: {'lr': 0.00041930696398546273, 'samples': 6803968, 'steps': 13288, 'batch_loss/train': 0.7950731101445854} 12/24/2021 17:31:35 - INFO - codeparrot_training - Step 13289: {'lr': 0.00041929536555474373, 'samples': 6804480, 'steps': 13289, 'batch_loss/train': 0.8013875912874937} 12/24/2021 17:31:45 - INFO - codeparrot_training - Step 13290: {'lr': 0.00041928376645097035, 'samples': 6804992, 'steps': 13290, 'batch_loss/train': 0.7737386012449861} 12/24/2021 17:31:56 - INFO - codeparrot_training - Step 13291: {'lr': 0.00041927216667418855, 'samples': 6805504, 'steps': 13291, 'batch_loss/train': 0.8003901885822415} 12/24/2021 17:32:08 - INFO - codeparrot_training - Step 13292: {'lr': 0.0004192605662244445, 'samples': 6806016, 'steps': 13292, 'batch_loss/train': 0.6707771243527532} 12/24/2021 17:32:19 - INFO - codeparrot_training - Step 13293: {'lr': 0.00041924896510178445, 'samples': 6806528, 'steps': 13293, 'batch_loss/train': 0.7174275789875537} 12/24/2021 17:32:29 - INFO - codeparrot_training - Step 13294: {'lr': 0.00041923736330625437, 'samples': 6807040, 'steps': 13294, 'batch_loss/train': 0.7700326316989958} 12/24/2021 17:32:41 - INFO - codeparrot_training - Step 13295: {'lr': 0.00041922576083790045, 'samples': 6807552, 'steps': 13295, 'batch_loss/train': 0.6872891215607524} 12/24/2021 17:32:52 - INFO - codeparrot_training - Step 13296: {'lr': 0.00041921415769676873, 'samples': 6808064, 'steps': 13296, 'batch_loss/train': 0.6190993885975331} 12/24/2021 17:33:03 - INFO - codeparrot_training - Step 13297: {'lr': 0.0004192025538829055, 'samples': 6808576, 'steps': 13297, 'batch_loss/train': 0.7996860444545746} 12/24/2021 17:33:17 - INFO - codeparrot_training - Step 13298: {'lr': 0.00041919094939635684, 'samples': 6809088, 'steps': 13298, 'batch_loss/train': 0.9172459095716476} 12/24/2021 17:33:27 - INFO - codeparrot_training - Step 13299: {'lr': 0.00041917934423716876, 'samples': 6809600, 'steps': 13299, 'batch_loss/train': 
0.9185072458349168} 12/24/2021 17:33:38 - INFO - codeparrot_training - Step 13300: {'lr': 0.00041916773840538753, 'samples': 6810112, 'steps': 13300, 'batch_loss/train': 0.8052965914830565} 12/24/2021 17:33:49 - INFO - codeparrot_training - Step 13301: {'lr': 0.00041915613190105915, 'samples': 6810624, 'steps': 13301, 'batch_loss/train': 0.8304195990785956} 12/24/2021 17:34:01 - INFO - codeparrot_training - Step 13302: {'lr': 0.00041914452472423003, 'samples': 6811136, 'steps': 13302, 'batch_loss/train': 0.8278103922493756} 12/24/2021 17:34:11 - INFO - codeparrot_training - Step 13303: {'lr': 0.00041913291687494604, 'samples': 6811648, 'steps': 13303, 'batch_loss/train': 0.8680481966584921} 12/24/2021 17:34:22 - INFO - codeparrot_training - Step 13304: {'lr': 0.00041912130835325347, 'samples': 6812160, 'steps': 13304, 'batch_loss/train': 0.8712839563377202} 12/24/2021 17:34:36 - INFO - codeparrot_training - Step 13305: {'lr': 0.0004191096991591984, 'samples': 6812672, 'steps': 13305, 'batch_loss/train': 0.8498088205233216} 12/24/2021 17:34:46 - INFO - codeparrot_training - Step 13306: {'lr': 0.00041909808929282706, 'samples': 6813184, 'steps': 13306, 'batch_loss/train': 0.807948412373662} 12/24/2021 17:34:57 - INFO - codeparrot_training - Step 13307: {'lr': 0.0004190864787541855, 'samples': 6813696, 'steps': 13307, 'batch_loss/train': 0.856605876237154} 12/24/2021 17:35:09 - INFO - codeparrot_training - Step 13308: {'lr': 0.00041907486754332, 'samples': 6814208, 'steps': 13308, 'batch_loss/train': 0.8302074614912271} 12/24/2021 17:35:20 - INFO - codeparrot_training - Step 13309: {'lr': 0.0004190632556602767, 'samples': 6814720, 'steps': 13309, 'batch_loss/train': 0.7285954039543867} 12/24/2021 17:35:30 - INFO - codeparrot_training - Step 13310: {'lr': 0.0004190516431051017, 'samples': 6815232, 'steps': 13310, 'batch_loss/train': 0.7418063841760159} 12/24/2021 17:35:41 - INFO - codeparrot_training - Step 13311: {'lr': 0.00041904002987784117, 'samples': 6815744, 
'steps': 13311, 'batch_loss/train': 0.7928233630955219} 12/24/2021 17:35:53 - INFO - codeparrot_training - Step 13312: {'lr': 0.0004190284159785413, 'samples': 6816256, 'steps': 13312, 'batch_loss/train': 0.9133796067908406} 12/24/2021 17:36:04 - INFO - codeparrot_training - Step 13313: {'lr': 0.0004190168014072483, 'samples': 6816768, 'steps': 13313, 'batch_loss/train': 0.8735930770635605} 12/24/2021 17:36:14 - INFO - codeparrot_training - Step 13314: {'lr': 0.00041900518616400827, 'samples': 6817280, 'steps': 13314, 'batch_loss/train': 0.6894259739201516} 12/24/2021 17:36:27 - INFO - codeparrot_training - Step 13315: {'lr': 0.00041899357024886754, 'samples': 6817792, 'steps': 13315, 'batch_loss/train': 0.7608071370050311} 12/24/2021 17:36:37 - INFO - codeparrot_training - Step 13316: {'lr': 0.0004189819536618721, 'samples': 6818304, 'steps': 13316, 'batch_loss/train': 0.7466223938390613} 12/24/2021 17:36:48 - INFO - codeparrot_training - Step 13317: {'lr': 0.0004189703364030683, 'samples': 6818816, 'steps': 13317, 'batch_loss/train': 0.8331453334540129} 12/24/2021 17:37:02 - INFO - codeparrot_training - Step 13318: {'lr': 0.0004189587184725021, 'samples': 6819328, 'steps': 13318, 'batch_loss/train': 0.6060700166854076} 12/24/2021 17:37:12 - INFO - codeparrot_training - Step 13319: {'lr': 0.00041894709987022, 'samples': 6819840, 'steps': 13319, 'batch_loss/train': 0.9940622511785477} 12/24/2021 17:37:23 - INFO - codeparrot_training - Step 13320: {'lr': 0.00041893548059626786, 'samples': 6820352, 'steps': 13320, 'batch_loss/train': 0.9877703925594687} 12/24/2021 17:37:34 - INFO - codeparrot_training - Step 13321: {'lr': 0.0004189238606506921, 'samples': 6820864, 'steps': 13321, 'batch_loss/train': 0.8252851944416761} 12/24/2021 17:37:46 - INFO - codeparrot_training - Step 13322: {'lr': 0.00041891224003353885, 'samples': 6821376, 'steps': 13322, 'batch_loss/train': 1.49789622053504} 12/24/2021 17:37:57 - INFO - codeparrot_training - Step 13323: {'lr': 
0.0004189006187448543, 'samples': 6821888, 'steps': 13323, 'batch_loss/train': 0.8329881355166435} 12/24/2021 17:38:07 - INFO - codeparrot_training - Step 13324: {'lr': 0.0004188889967846847, 'samples': 6822400, 'steps': 13324, 'batch_loss/train': 0.8335860809311271} 12/24/2021 17:38:20 - INFO - codeparrot_training - Step 13325: {'lr': 0.0004188773741530762, 'samples': 6822912, 'steps': 13325, 'batch_loss/train': 0.7524183467030525} 12/24/2021 17:38:31 - INFO - codeparrot_training - Step 13326: {'lr': 0.0004188657508500751, 'samples': 6823424, 'steps': 13326, 'batch_loss/train': 1.3128201309591532} 12/24/2021 17:38:42 - INFO - codeparrot_training - Step 13327: {'lr': 0.00041885412687572745, 'samples': 6823936, 'steps': 13327, 'batch_loss/train': 0.8415225767530501} 12/24/2021 17:38:55 - INFO - codeparrot_training - Step 13328: {'lr': 0.00041884250223007957, 'samples': 6824448, 'steps': 13328, 'batch_loss/train': 0.8406263990327716} 12/24/2021 17:39:06 - INFO - codeparrot_training - Step 13329: {'lr': 0.0004188308769131777, 'samples': 6824960, 'steps': 13329, 'batch_loss/train': 0.8336712580639869} 12/24/2021 17:39:17 - INFO - codeparrot_training - Step 13330: {'lr': 0.0004188192509250679, 'samples': 6825472, 'steps': 13330, 'batch_loss/train': 0.8500007721595466} 12/24/2021 17:39:27 - INFO - codeparrot_training - Step 13331: {'lr': 0.00041880762426579654, 'samples': 6825984, 'steps': 13331, 'batch_loss/train': 0.7913707671687007} 12/24/2021 17:39:39 - INFO - codeparrot_training - Step 13332: {'lr': 0.0004187959969354099, 'samples': 6826496, 'steps': 13332, 'batch_loss/train': 0.8297595670446754} 12/24/2021 17:39:50 - INFO - codeparrot_training - Step 13333: {'lr': 0.000418784368933954, 'samples': 6827008, 'steps': 13333, 'batch_loss/train': 0.8408014131709933} 12/24/2021 17:40:00 - INFO - codeparrot_training - Step 13334: {'lr': 0.00041877274026147525, 'samples': 6827520, 'steps': 13334, 'batch_loss/train': 0.7653535848949105} 12/24/2021 17:40:13 - INFO - 
codeparrot_training - Step 13335: {'lr': 0.0004187611109180197, 'samples': 6828032, 'steps': 13335, 'batch_loss/train': 0.8307512337341905} 12/24/2021 17:40:23 - INFO - codeparrot_training - Step 13336: {'lr': 0.00041874948090363386, 'samples': 6828544, 'steps': 13336, 'batch_loss/train': 0.7472973740659654} 12/24/2021 17:40:34 - INFO - codeparrot_training - Step 13337: {'lr': 0.0004187378502183637, 'samples': 6829056, 'steps': 13337, 'batch_loss/train': 0.8492257352918386} 12/24/2021 17:40:48 - INFO - codeparrot_training - Step 13338: {'lr': 0.0004187262188622556, 'samples': 6829568, 'steps': 13338, 'batch_loss/train': 0.7791922183241695} 12/24/2021 17:40:58 - INFO - codeparrot_training - Step 13339: {'lr': 0.0004187145868353557, 'samples': 6830080, 'steps': 13339, 'batch_loss/train': 0.8387351459823549} 12/24/2021 17:41:09 - INFO - codeparrot_training - Step 13340: {'lr': 0.0004187029541377103, 'samples': 6830592, 'steps': 13340, 'batch_loss/train': 0.8349424367770553} 12/24/2021 17:41:20 - INFO - codeparrot_training - Step 13341: {'lr': 0.0004186913207693658, 'samples': 6831104, 'steps': 13341, 'batch_loss/train': 0.7070877784863114} 12/24/2021 17:41:32 - INFO - codeparrot_training - Step 13342: {'lr': 0.00041867968673036814, 'samples': 6831616, 'steps': 13342, 'batch_loss/train': 0.8450437178835273} 12/24/2021 17:41:42 - INFO - codeparrot_training - Step 13343: {'lr': 0.00041866805202076386, 'samples': 6832128, 'steps': 13343, 'batch_loss/train': 0.8562998804263771} 12/24/2021 17:41:53 - INFO - codeparrot_training - Step 13344: {'lr': 0.000418656416640599, 'samples': 6832640, 'steps': 13344, 'batch_loss/train': 0.7734531089663506} 12/24/2021 17:42:07 - INFO - codeparrot_training - Step 13345: {'lr': 0.00041864478058991993, 'samples': 6833152, 'steps': 13345, 'batch_loss/train': 0.8757303152233362} 12/24/2021 17:42:18 - INFO - codeparrot_training - Step 13346: {'lr': 0.0004186331438687729, 'samples': 6833664, 'steps': 13346, 'batch_loss/train': 
0.8191946353763342} 12/24/2021 17:42:28 - INFO - codeparrot_training - Step 13347: {'lr': 0.0004186215064772042, 'samples': 6834176, 'steps': 13347, 'batch_loss/train': 0.6873598594684154} 12/24/2021 17:42:41 - INFO - codeparrot_training - Step 13348: {'lr': 0.0004186098684152601, 'samples': 6834688, 'steps': 13348, 'batch_loss/train': 0.8036960307508707} 12/24/2021 17:42:52 - INFO - codeparrot_training - Step 13349: {'lr': 0.00041859822968298667, 'samples': 6835200, 'steps': 13349, 'batch_loss/train': 0.6913217566907406} 12/24/2021 17:43:02 - INFO - codeparrot_training - Step 13350: {'lr': 0.00041858659028043047, 'samples': 6835712, 'steps': 13350, 'batch_loss/train': 0.8616676316596568} 12/24/2021 17:43:13 - INFO - codeparrot_training - Step 13351: {'lr': 0.00041857495020763757, 'samples': 6836224, 'steps': 13351, 'batch_loss/train': 0.8809795812703669} 12/24/2021 17:43:25 - INFO - codeparrot_training - Step 13352: {'lr': 0.00041856330946465437, 'samples': 6836736, 'steps': 13352, 'batch_loss/train': 0.7497248114086688} 12/24/2021 17:43:35 - INFO - codeparrot_training - Step 13353: {'lr': 0.000418551668051527, 'samples': 6837248, 'steps': 13353, 'batch_loss/train': 0.8533205271232873} 12/24/2021 17:43:46 - INFO - codeparrot_training - Step 13354: {'lr': 0.00041854002596830196, 'samples': 6837760, 'steps': 13354, 'batch_loss/train': 0.7701509715989232} 12/24/2021 17:43:58 - INFO - codeparrot_training - Step 13355: {'lr': 0.0004185283832150254, 'samples': 6838272, 'steps': 13355, 'batch_loss/train': 0.7721464815549552} 12/24/2021 17:44:09 - INFO - codeparrot_training - Step 13356: {'lr': 0.0004185167397917436, 'samples': 6838784, 'steps': 13356, 'batch_loss/train': 0.8191412430023775} 12/24/2021 17:44:19 - INFO - codeparrot_training - Step 13357: {'lr': 0.00041850509569850284, 'samples': 6839296, 'steps': 13357, 'batch_loss/train': 0.6810070211067796} 12/24/2021 17:44:33 - INFO - codeparrot_training - Step 13358: {'lr': 0.0004184934509353495, 'samples': 6839808, 
'steps': 13358, 'batch_loss/train': 0.76257219677791} 12/24/2021 17:44:44 - INFO - codeparrot_training - Step 13359: {'lr': 0.0004184818055023297, 'samples': 6840320, 'steps': 13359, 'batch_loss/train': 0.765519677195698} 12/24/2021 17:44:54 - INFO - codeparrot_training - Step 13360: {'lr': 0.00041847015939949, 'samples': 6840832, 'steps': 13360, 'batch_loss/train': 0.9343729726970196} 12/24/2021 17:45:05 - INFO - codeparrot_training - Step 13361: {'lr': 0.0004184585126268765, 'samples': 6841344, 'steps': 13361, 'batch_loss/train': 0.8133214060217142} 12/24/2021 17:45:17 - INFO - codeparrot_training - Step 13362: {'lr': 0.00041844686518453556, 'samples': 6841856, 'steps': 13362, 'batch_loss/train': 0.7686606161296368} 12/24/2021 17:45:28 - INFO - codeparrot_training - Step 13363: {'lr': 0.00041843521707251354, 'samples': 6842368, 'steps': 13363, 'batch_loss/train': 0.7544627776369452} 12/24/2021 17:45:38 - INFO - codeparrot_training - Step 13364: {'lr': 0.0004184235682908566, 'samples': 6842880, 'steps': 13364, 'batch_loss/train': 0.8090066853910685} 12/24/2021 17:45:50 - INFO - codeparrot_training - Step 13365: {'lr': 0.0004184119188396112, 'samples': 6843392, 'steps': 13365, 'batch_loss/train': 0.8371434365399182} 12/24/2021 17:46:01 - INFO - codeparrot_training - Step 13366: {'lr': 0.0004184002687188236, 'samples': 6843904, 'steps': 13366, 'batch_loss/train': 0.8754458893090487} 12/24/2021 17:46:12 - INFO - codeparrot_training - Step 13367: {'lr': 0.0004183886179285401, 'samples': 6844416, 'steps': 13367, 'batch_loss/train': 0.8907420560717583} 12/24/2021 17:46:25 - INFO - codeparrot_training - Step 13368: {'lr': 0.0004183769664688071, 'samples': 6844928, 'steps': 13368, 'batch_loss/train': 0.8635064223781228} 12/24/2021 17:46:36 - INFO - codeparrot_training - Step 13369: {'lr': 0.0004183653143396708, 'samples': 6845440, 'steps': 13369, 'batch_loss/train': 0.7956645516678691} 12/24/2021 17:46:47 - INFO - codeparrot_training - Step 13370: {'lr': 
0.0004183536615411775, 'samples': 6845952, 'steps': 13370, 'batch_loss/train': 0.8054405869916081} 12/24/2021 17:46:57 - INFO - codeparrot_training - Step 13371: {'lr': 0.00041834200807337377, 'samples': 6846464, 'steps': 13371, 'batch_loss/train': 0.8345984816551208} 12/24/2021 17:47:10 - INFO - codeparrot_training - Step 13372: {'lr': 0.0004183303539363057, 'samples': 6846976, 'steps': 13372, 'batch_loss/train': 0.7045520239043981} 12/24/2021 17:47:20 - INFO - codeparrot_training - Step 13373: {'lr': 0.00041831869913001976, 'samples': 6847488, 'steps': 13373, 'batch_loss/train': 0.9016745518893003} 12/24/2021 17:47:31 - INFO - codeparrot_training - Step 13374: {'lr': 0.00041830704365456215, 'samples': 6848000, 'steps': 13374, 'batch_loss/train': 0.8175651987548918} 12/24/2021 17:47:45 - INFO - codeparrot_training - Step 13375: {'lr': 0.0004182953875099793, 'samples': 6848512, 'steps': 13375, 'batch_loss/train': 0.9010699614882469} 12/24/2021 17:47:55 - INFO - codeparrot_training - Step 13376: {'lr': 0.00041828373069631756, 'samples': 6849024, 'steps': 13376, 'batch_loss/train': 0.7926432113163173} 12/24/2021 17:48:06 - INFO - codeparrot_training - Step 13377: {'lr': 0.00041827207321362323, 'samples': 6849536, 'steps': 13377, 'batch_loss/train': 0.8869187328964472} 12/24/2021 17:48:18 - INFO - codeparrot_training - Step 13378: {'lr': 0.00041826041506194266, 'samples': 6850048, 'steps': 13378, 'batch_loss/train': 0.8356825411319733} 12/24/2021 17:48:29 - INFO - codeparrot_training - Step 13379: {'lr': 0.00041824875624132225, 'samples': 6850560, 'steps': 13379, 'batch_loss/train': 0.7507787146605551} 12/24/2021 17:48:39 - INFO - codeparrot_training - Step 13380: {'lr': 0.00041823709675180835, 'samples': 6851072, 'steps': 13380, 'batch_loss/train': 0.7026171020697802} 12/24/2021 17:48:50 - INFO - codeparrot_training - Step 13381: {'lr': 0.00041822543659344715, 'samples': 6851584, 'steps': 13381, 'batch_loss/train': 0.7332087230170146} 12/24/2021 17:49:02 - INFO - 
codeparrot_training - Step 13382: {'lr': 0.0004182137757662852, 'samples': 6852096, 'steps': 13382, 'batch_loss/train': 0.8823283137753606} 12/24/2021 17:49:13 - INFO - codeparrot_training - Step 13383: {'lr': 0.0004182021142703688, 'samples': 6852608, 'steps': 13383, 'batch_loss/train': 0.8496929779648781} 12/24/2021 17:49:23 - INFO - codeparrot_training - Step 13384: {'lr': 0.00041819045210574437, 'samples': 6853120, 'steps': 13384, 'batch_loss/train': 0.7568288557231426} 12/24/2021 17:49:37 - INFO - codeparrot_training - Step 13385: {'lr': 0.0004181787892724581, 'samples': 6853632, 'steps': 13385, 'batch_loss/train': 0.8010180708952248} 12/24/2021 17:49:48 - INFO - codeparrot_training - Step 13386: {'lr': 0.0004181671257705565, 'samples': 6854144, 'steps': 13386, 'batch_loss/train': 0.8819185774773359} 12/24/2021 17:49:59 - INFO - codeparrot_training - Step 13387: {'lr': 0.00041815546160008587, 'samples': 6854656, 'steps': 13387, 'batch_loss/train': 0.7572279986925423} 12/24/2021 17:50:11 - INFO - codeparrot_training - Step 13388: {'lr': 0.0004181437967610927, 'samples': 6855168, 'steps': 13388, 'batch_loss/train': 0.8902539378032088} 12/24/2021 17:50:21 - INFO - codeparrot_training - Step 13389: {'lr': 0.00041813213125362325, 'samples': 6855680, 'steps': 13389, 'batch_loss/train': 0.7840015487745404} 12/24/2021 17:50:32 - INFO - codeparrot_training - Step 13390: {'lr': 0.00041812046507772396, 'samples': 6856192, 'steps': 13390, 'batch_loss/train': 0.8718201937153935} 12/24/2021 17:50:43 - INFO - codeparrot_training - Step 13391: {'lr': 0.0004181087982334411, 'samples': 6856704, 'steps': 13391, 'batch_loss/train': 0.7378954140003771} 12/24/2021 17:50:55 - INFO - codeparrot_training - Step 13392: {'lr': 0.0004180971307208212, 'samples': 6857216, 'steps': 13392, 'batch_loss/train': 0.7407225521747023} 12/24/2021 17:51:06 - INFO - codeparrot_training - Step 13393: {'lr': 0.0004180854625399105, 'samples': 6857728, 'steps': 13393, 'batch_loss/train': 
0.9035733994096518} 12/24/2021 17:51:16 - INFO - codeparrot_training - Step 13394: {'lr': 0.00041807379369075557, 'samples': 6858240, 'steps': 13394, 'batch_loss/train': 0.7611780995503068} 12/24/2021 17:51:28 - INFO - codeparrot_training - Step 13395: {'lr': 0.00041806212417340265, 'samples': 6858752, 'steps': 13395, 'batch_loss/train': 0.7739942250773311} 12/24/2021 17:51:39 - INFO - codeparrot_training - Step 13396: {'lr': 0.0004180504539878982, 'samples': 6859264, 'steps': 13396, 'batch_loss/train': 0.8764342637732625} 12/24/2021 17:51:50 - INFO - codeparrot_training - Step 13397: {'lr': 0.00041803878313428856, 'samples': 6859776, 'steps': 13397, 'batch_loss/train': 0.8011249527335167} 12/24/2021 17:52:03 - INFO - codeparrot_training - Step 13398: {'lr': 0.0004180271116126202, 'samples': 6860288, 'steps': 13398, 'batch_loss/train': 0.8016375675797462} 12/24/2021 17:52:14 - INFO - codeparrot_training - Step 13399: {'lr': 0.0004180154394229395, 'samples': 6860800, 'steps': 13399, 'batch_loss/train': 0.8022057870402932} 12/24/2021 17:52:25 - INFO - codeparrot_training - Step 13400: {'lr': 0.0004180037665652928, 'samples': 6861312, 'steps': 13400, 'batch_loss/train': 0.6648685880936682} 12/24/2021 17:52:37 - INFO - codeparrot_training - Step 13401: {'lr': 0.0004179920930397266, 'samples': 6861824, 'steps': 13401, 'batch_loss/train': 0.8426976129412651} 12/24/2021 17:52:47 - INFO - codeparrot_training - Step 13402: {'lr': 0.0004179804188462872, 'samples': 6862336, 'steps': 13402, 'batch_loss/train': 0.7290635666577145} 12/24/2021 17:52:58 - INFO - codeparrot_training - Step 13403: {'lr': 0.00041796874398502117, 'samples': 6862848, 'steps': 13403, 'batch_loss/train': 0.8328350745141506} 12/24/2021 17:53:08 - INFO - codeparrot_training - Step 13404: {'lr': 0.0004179570684559749, 'samples': 6863360, 'steps': 13404, 'batch_loss/train': 0.8729202635586262} 12/24/2021 17:53:21 - INFO - codeparrot_training - Step 13405: {'lr': 0.0004179453922591946, 'samples': 6863872, 
'steps': 13405, 'batch_loss/train': 0.7957298546098173} 12/24/2021 17:53:31 - INFO - codeparrot_training - Step 13406: {'lr': 0.0004179337153947269, 'samples': 6864384, 'steps': 13406, 'batch_loss/train': 0.8176814126782119} 12/24/2021 17:53:42 - INFO - codeparrot_training - Step 13407: {'lr': 0.00041792203786261814, 'samples': 6864896, 'steps': 13407, 'batch_loss/train': 0.9412024901248515} 12/24/2021 17:53:55 - INFO - codeparrot_training - Step 13408: {'lr': 0.0004179103596629148, 'samples': 6865408, 'steps': 13408, 'batch_loss/train': 0.8411218301625922} 12/24/2021 17:54:06 - INFO - codeparrot_training - Step 13409: {'lr': 0.00041789868079566316, 'samples': 6865920, 'steps': 13409, 'batch_loss/train': 0.8496462875045836} 12/24/2021 17:54:17 - INFO - codeparrot_training - Step 13410: {'lr': 0.0004178870012609099, 'samples': 6866432, 'steps': 13410, 'batch_loss/train': 0.8245499702170491} 12/24/2021 17:54:29 - INFO - codeparrot_training - Step 13411: {'lr': 0.0004178753210587012, 'samples': 6866944, 'steps': 13411, 'batch_loss/train': 0.8328776822891086} 12/24/2021 17:54:39 - INFO - codeparrot_training - Step 13412: {'lr': 0.0004178636401890837, 'samples': 6867456, 'steps': 13412, 'batch_loss/train': 0.848421691916883} 12/24/2021 17:54:50 - INFO - codeparrot_training - Step 13413: {'lr': 0.0004178519586521037, 'samples': 6867968, 'steps': 13413, 'batch_loss/train': 0.8031352516263723} 12/24/2021 17:55:01 - INFO - codeparrot_training - Step 13414: {'lr': 0.0004178402764478077, 'samples': 6868480, 'steps': 13414, 'batch_loss/train': 0.7918480960652232} 12/24/2021 17:55:15 - INFO - codeparrot_training - Step 13415: {'lr': 0.00041782859357624217, 'samples': 6868992, 'steps': 13415, 'batch_loss/train': 0.7259304998442531} 12/24/2021 17:55:25 - INFO - codeparrot_training - Step 13416: {'lr': 0.0004178169100374535, 'samples': 6869504, 'steps': 13416, 'batch_loss/train': 0.8813172243535519} 12/24/2021 17:55:36 - INFO - codeparrot_training - Step 13417: {'lr': 
0.00041780522583148816, 'samples': 6870016, 'steps': 13417, 'batch_loss/train': 0.8207487808540463} 12/24/2021 17:55:48 - INFO - codeparrot_training - Step 13418: {'lr': 0.00041779354095839257, 'samples': 6870528, 'steps': 13418, 'batch_loss/train': 0.7831490007229149} 12/24/2021 17:55:59 - INFO - codeparrot_training - Step 13419: {'lr': 0.00041778185541821325, 'samples': 6871040, 'steps': 13419, 'batch_loss/train': 0.8219869141466916} 12/24/2021 17:56:09 - INFO - codeparrot_training - Step 13420: {'lr': 0.0004177701692109966, 'samples': 6871552, 'steps': 13420, 'batch_loss/train': 0.6229864414199255} 12/24/2021 17:56:21 - INFO - codeparrot_training - Step 13421: {'lr': 0.00041775848233678917, 'samples': 6872064, 'steps': 13421, 'batch_loss/train': 0.8627431849017739} 12/24/2021 17:56:32 - INFO - codeparrot_training - Step 13422: {'lr': 0.00041774679479563727, 'samples': 6872576, 'steps': 13422, 'batch_loss/train': 0.8289665686897933} 12/24/2021 17:56:43 - INFO - codeparrot_training - Step 13423: {'lr': 0.0004177351065875875, 'samples': 6873088, 'steps': 13423, 'batch_loss/train': 0.8349010618403554} 12/24/2021 17:56:57 - INFO - codeparrot_training - Step 13424: {'lr': 0.00041772341771268627, 'samples': 6873600, 'steps': 13424, 'batch_loss/train': 0.8185014012269676} 12/24/2021 17:57:07 - INFO - codeparrot_training - Step 13425: {'lr': 0.00041771172817098003, 'samples': 6874112, 'steps': 13425, 'batch_loss/train': 0.7164841839112341} 12/24/2021 17:57:18 - INFO - codeparrot_training - Step 13426: {'lr': 0.0004177000379625153, 'samples': 6874624, 'steps': 13426, 'batch_loss/train': 0.8300656527280807} 12/24/2021 17:57:28 - INFO - codeparrot_training - Step 13427: {'lr': 0.0004176883470873386, 'samples': 6875136, 'steps': 13427, 'batch_loss/train': 0.7777676025871187} 12/24/2021 17:57:41 - INFO - codeparrot_training - Step 13428: {'lr': 0.00041767665554549627, 'samples': 6875648, 'steps': 13428, 'batch_loss/train': 0.8375170149374753} 12/24/2021 17:57:51 - INFO - 
codeparrot_training - Step 13429: {'lr': 0.00041766496333703497, 'samples': 6876160, 'steps': 13429, 'batch_loss/train': 0.7536641955375671} 12/24/2021 17:58:02 - INFO - codeparrot_training - Step 13430: {'lr': 0.000417653270462001, 'samples': 6876672, 'steps': 13430, 'batch_loss/train': 0.7740793759003282} 12/24/2021 17:58:15 - INFO - codeparrot_training - Step 13431: {'lr': 0.0004176415769204409, 'samples': 6877184, 'steps': 13431, 'batch_loss/train': 0.8555116085335612} 12/24/2021 17:58:26 - INFO - codeparrot_training - Step 13432: {'lr': 0.0004176298827124013, 'samples': 6877696, 'steps': 13432, 'batch_loss/train': 0.8157337317243218} 12/24/2021 17:58:36 - INFO - codeparrot_training - Step 13433: {'lr': 0.0004176181878379285, 'samples': 6878208, 'steps': 13433, 'batch_loss/train': 0.8766613742336631} 12/24/2021 17:58:47 - INFO - codeparrot_training - Step 13434: {'lr': 0.0004176064922970691, 'samples': 6878720, 'steps': 13434, 'batch_loss/train': 1.316168317105621} 12/24/2021 17:59:00 - INFO - codeparrot_training - Step 13435: {'lr': 0.0004175947960898696, 'samples': 6879232, 'steps': 13435, 'batch_loss/train': 0.8376758471131325} 12/24/2021 17:59:11 - INFO - codeparrot_training - Step 13436: {'lr': 0.0004175830992163764, 'samples': 6879744, 'steps': 13436, 'batch_loss/train': 0.8476493777707219} 12/24/2021 17:59:22 - INFO - codeparrot_training - Step 13437: {'lr': 0.0004175714016766361, 'samples': 6880256, 'steps': 13437, 'batch_loss/train': 0.7847469444386661} 12/24/2021 17:59:34 - INFO - codeparrot_training - Step 13438: {'lr': 0.0004175597034706952, 'samples': 6880768, 'steps': 13438, 'batch_loss/train': 0.6914884564466774} 12/24/2021 17:59:44 - INFO - codeparrot_training - Step 13439: {'lr': 0.0004175480045986002, 'samples': 6881280, 'steps': 13439, 'batch_loss/train': 0.844203433021903} 12/24/2021 17:59:55 - INFO - codeparrot_training - Step 13440: {'lr': 0.0004175363050603976, 'samples': 6881792, 'steps': 13440, 'batch_loss/train': 0.8065213989466429} 
12/24/2021 18:00:07 - INFO - codeparrot_training - Step 13441: {'lr': 0.0004175246048561339, 'samples': 6882304, 'steps': 13441, 'batch_loss/train': 0.8238637167960405} 12/24/2021 18:00:18 - INFO - codeparrot_training - Step 13442: {'lr': 0.0004175129039858556, 'samples': 6882816, 'steps': 13442, 'batch_loss/train': 0.8412407729774714} 12/24/2021 18:00:28 - INFO - codeparrot_training - Step 13443: {'lr': 0.00041750120244960924, 'samples': 6883328, 'steps': 13443, 'batch_loss/train': 0.7507714862003922} 12/24/2021 18:00:39 - INFO - codeparrot_training - Step 13444: {'lr': 0.00041748950024744134, 'samples': 6883840, 'steps': 13444, 'batch_loss/train': 0.7733234157785773} 12/24/2021 18:00:53 - INFO - codeparrot_training - Step 13445: {'lr': 0.0004174777973793985, 'samples': 6884352, 'steps': 13445, 'batch_loss/train': 0.8829414653591812} 12/24/2021 18:01:03 - INFO - codeparrot_training - Step 13446: {'lr': 0.0004174660938455271, 'samples': 6884864, 'steps': 13446, 'batch_loss/train': 0.823197865858674} 12/24/2021 18:01:14 - INFO - codeparrot_training - Step 13447: {'lr': 0.00041745438964587377, 'samples': 6885376, 'steps': 13447, 'batch_loss/train': 0.797347885556519} 12/24/2021 18:01:26 - INFO - codeparrot_training - Step 13448: {'lr': 0.00041744268478048505, 'samples': 6885888, 'steps': 13448, 'batch_loss/train': 0.7927600429393351} 12/24/2021 18:01:37 - INFO - codeparrot_training - Step 13449: {'lr': 0.0004174309792494073, 'samples': 6886400, 'steps': 13449, 'batch_loss/train': 0.8191743725910783} 12/24/2021 18:01:47 - INFO - codeparrot_training - Step 13450: {'lr': 0.0004174192730526874, 'samples': 6886912, 'steps': 13450, 'batch_loss/train': 0.7274933671578765} 12/24/2021 18:01:59 - INFO - codeparrot_training - Step 13451: {'lr': 0.00041740756619037156, 'samples': 6887424, 'steps': 13451, 'batch_loss/train': 0.8235009191557765} 12/24/2021 18:02:10 - INFO - codeparrot_training - Step 13452: {'lr': 0.00041739585866250653, 'samples': 6887936, 'steps': 13452, 
'batch_loss/train': 0.8182401545345783} 12/24/2021 18:02:21 - INFO - codeparrot_training - Step 13453: {'lr': 0.0004173841504691387, 'samples': 6888448, 'steps': 13453, 'batch_loss/train': 0.8768950831145048} 12/24/2021 18:02:31 - INFO - codeparrot_training - Step 13454: {'lr': 0.00041737244161031467, 'samples': 6888960, 'steps': 13454, 'batch_loss/train': 0.8934615729376674} 12/24/2021 18:02:45 - INFO - codeparrot_training - Step 13455: {'lr': 0.0004173607320860811, 'samples': 6889472, 'steps': 13455, 'batch_loss/train': 0.8441444272175431} 12/24/2021 18:02:56 - INFO - codeparrot_training - Step 13456: {'lr': 0.0004173490218964844, 'samples': 6889984, 'steps': 13456, 'batch_loss/train': 0.7794137524906546} 12/24/2021 18:03:07 - INFO - codeparrot_training - Step 13457: {'lr': 0.0004173373110415711, 'samples': 6890496, 'steps': 13457, 'batch_loss/train': 0.8137453380040824} 12/24/2021 18:03:19 - INFO - codeparrot_training - Step 13458: {'lr': 0.00041732559952138796, 'samples': 6891008, 'steps': 13458, 'batch_loss/train': 0.7862089965492487} 12/24/2021 18:03:29 - INFO - codeparrot_training - Step 13459: {'lr': 0.0004173138873359813, 'samples': 6891520, 'steps': 13459, 'batch_loss/train': 0.7879939554259181} 12/24/2021 18:03:40 - INFO - codeparrot_training - Step 13460: {'lr': 0.0004173021744853979, 'samples': 6892032, 'steps': 13460, 'batch_loss/train': 0.7839152151718736} 12/24/2021 18:03:54 - INFO - codeparrot_training - Step 13461: {'lr': 0.0004172904609696842, 'samples': 6892544, 'steps': 13461, 'batch_loss/train': 0.8495272658765316} 12/24/2021 18:04:04 - INFO - codeparrot_training - Step 13462: {'lr': 0.0004172787467888868, 'samples': 6893056, 'steps': 13462, 'batch_loss/train': 0.7569084241986275} 12/24/2021 18:04:15 - INFO - codeparrot_training - Step 13463: {'lr': 0.0004172670319430523, 'samples': 6893568, 'steps': 13463, 'batch_loss/train': 0.8081970158964396} 12/24/2021 18:04:26 - INFO - codeparrot_training - Step 13464: {'lr': 0.00041725531643222716, 
'samples': 6894080, 'steps': 13464, 'batch_loss/train': 0.8522781739011407} 12/24/2021 18:04:38 - INFO - codeparrot_training - Step 13465: {'lr': 0.000417243600256458, 'samples': 6894592, 'steps': 13465, 'batch_loss/train': 0.7486175802769139} 12/24/2021 18:04:48 - INFO - codeparrot_training - Step 13466: {'lr': 0.00041723188341579153, 'samples': 6895104, 'steps': 13466, 'batch_loss/train': 0.8573319837450981} 12/24/2021 18:04:59 - INFO - codeparrot_training - Step 13467: {'lr': 0.00041722016591027423, 'samples': 6895616, 'steps': 13467, 'batch_loss/train': 0.8864365788176656} 12/24/2021 18:05:11 - INFO - codeparrot_training - Step 13468: {'lr': 0.0004172084477399526, 'samples': 6896128, 'steps': 13468, 'batch_loss/train': 0.8727437341585755} 12/24/2021 18:05:22 - INFO - codeparrot_training - Step 13469: {'lr': 0.0004171967289048734, 'samples': 6896640, 'steps': 13469, 'batch_loss/train': 0.6809875041944906} 12/24/2021 18:05:32 - INFO - codeparrot_training - Step 13470: {'lr': 0.0004171850094050831, 'samples': 6897152, 'steps': 13470, 'batch_loss/train': 0.8569710375741124} 12/24/2021 18:05:45 - INFO - codeparrot_training - Step 13471: {'lr': 0.00041717328924062837, 'samples': 6897664, 'steps': 13471, 'batch_loss/train': 0.7467970867874101} 12/24/2021 18:05:55 - INFO - codeparrot_training - Step 13472: {'lr': 0.0004171615684115557, 'samples': 6898176, 'steps': 13472, 'batch_loss/train': 0.5932987760170363} 12/24/2021 18:06:06 - INFO - codeparrot_training - Step 13473: {'lr': 0.0004171498469179118, 'samples': 6898688, 'steps': 13473, 'batch_loss/train': 0.8325558914802969} 12/24/2021 18:06:20 - INFO - codeparrot_training - Step 13474: {'lr': 0.00041713812475974323, 'samples': 6899200, 'steps': 13474, 'batch_loss/train': 0.7193480203859508} 12/24/2021 18:06:31 - INFO - codeparrot_training - Step 13475: {'lr': 0.00041712640193709654, 'samples': 6899712, 'steps': 13475, 'batch_loss/train': 0.8736268309876323} 12/24/2021 18:06:41 - INFO - codeparrot_training - Step 
13476: {'lr': 0.0004171146784500185, 'samples': 6900224, 'steps': 13476, 'batch_loss/train': 0.9661603420972824} 12/24/2021 18:06:52 - INFO - codeparrot_training - Step 13477: {'lr': 0.0004171029542985554, 'samples': 6900736, 'steps': 13477, 'batch_loss/train': 1.1155674736946821} 12/24/2021 18:07:04 - INFO - codeparrot_training - Step 13478: {'lr': 0.0004170912294827542, 'samples': 6901248, 'steps': 13478, 'batch_loss/train': 0.7038574302569032} 12/24/2021 18:07:15 - INFO - codeparrot_training - Step 13479: {'lr': 0.0004170795040026614, 'samples': 6901760, 'steps': 13479, 'batch_loss/train': 0.7300950079225004} 12/24/2021 18:07:25 - INFO - codeparrot_training - Step 13480: {'lr': 0.00041706777785832346, 'samples': 6902272, 'steps': 13480, 'batch_loss/train': 0.8460462652146816} 12/24/2021 18:07:37 - INFO - codeparrot_training - Step 13481: {'lr': 0.0004170560510497872, 'samples': 6902784, 'steps': 13481, 'batch_loss/train': 0.66613624477759} 12/24/2021 18:07:48 - INFO - codeparrot_training - Step 13482: {'lr': 0.0004170443235770991, 'samples': 6903296, 'steps': 13482, 'batch_loss/train': 0.74916025204584} 12/24/2021 18:07:59 - INFO - codeparrot_training - Step 13483: {'lr': 0.0004170325954403058, 'samples': 6903808, 'steps': 13483, 'batch_loss/train': 0.8341453336179256} 12/24/2021 18:08:12 - INFO - codeparrot_training - Step 13484: {'lr': 0.00041702086663945404, 'samples': 6904320, 'steps': 13484, 'batch_loss/train': 0.8342033205553889} 12/24/2021 18:08:23 - INFO - codeparrot_training - Step 13485: {'lr': 0.00041700913717459035, 'samples': 6904832, 'steps': 13485, 'batch_loss/train': 0.7946330029517412} 12/24/2021 18:08:34 - INFO - codeparrot_training - Step 13486: {'lr': 0.0004169974070457614, 'samples': 6905344, 'steps': 13486, 'batch_loss/train': 0.9165064059197903} 12/24/2021 18:08:44 - INFO - codeparrot_training - Step 13487: {'lr': 0.00041698567625301385, 'samples': 6905856, 'steps': 13487, 'batch_loss/train': 0.9000625032931566} 12/24/2021 18:08:57 - INFO 
- codeparrot_training - Step 13488: {'lr': 0.0004169739447963943, 'samples': 6906368, 'steps': 13488, 'batch_loss/train': 0.967975964769721} 12/24/2021 18:09:07 - INFO - codeparrot_training - Step 13489: {'lr': 0.0004169622126759494, 'samples': 6906880, 'steps': 13489, 'batch_loss/train': 0.7822508821263909} 12/24/2021 18:09:18 - INFO - codeparrot_training - Step 13490: {'lr': 0.00041695047989172567, 'samples': 6907392, 'steps': 13490, 'batch_loss/train': 0.6566777806729078} 12/24/2021 18:09:32 - INFO - codeparrot_training - Step 13491: {'lr': 0.00041693874644376997, 'samples': 6907904, 'steps': 13491, 'batch_loss/train': 0.7683784570544958} 12/24/2021 18:09:43 - INFO - codeparrot_training - Step 13492: {'lr': 0.0004169270123321288, 'samples': 6908416, 'steps': 13492, 'batch_loss/train': 0.808711014688015} 12/24/2021 18:09:53 - INFO - codeparrot_training - Step 13493: {'lr': 0.00041691527755684894, 'samples': 6908928, 'steps': 13493, 'batch_loss/train': 0.8252057433128357} 12/24/2021 18:10:05 - INFO - codeparrot_training - Step 13494: {'lr': 0.00041690354211797686, 'samples': 6909440, 'steps': 13494, 'batch_loss/train': 0.7961625498719513} 12/24/2021 18:10:16 - INFO - codeparrot_training - Step 13495: {'lr': 0.00041689180601555943, 'samples': 6909952, 'steps': 13495, 'batch_loss/train': 0.7735111592337489} 12/24/2021 18:10:27 - INFO - codeparrot_training - Step 13496: {'lr': 0.00041688006924964314, 'samples': 6910464, 'steps': 13496, 'batch_loss/train': 0.7838389207608998} 12/24/2021 18:10:37 - INFO - codeparrot_training - Step 13497: {'lr': 0.00041686833182027463, 'samples': 6910976, 'steps': 13497, 'batch_loss/train': 0.8607270494103432} 12/24/2021 18:10:49 - INFO - codeparrot_training - Step 13498: {'lr': 0.0004168565937275007, 'samples': 6911488, 'steps': 13498, 'batch_loss/train': 0.8828485663980246} 12/24/2021 18:11:00 - INFO - codeparrot_training - Step 13499: {'lr': 0.00041684485497136804, 'samples': 6912000, 'steps': 13499, 'batch_loss/train': 
0.957943425513804} 12/24/2021 18:11:11 - INFO - codeparrot_training - Step 13500: {'lr': 0.00041683311555192314, 'samples': 6912512, 'steps': 13500, 'batch_loss/train': 0.7643457055091858} 12/24/2021 18:11:23 - INFO - codeparrot_training - Step 13501: {'lr': 0.00041682137546921274, 'samples': 6913024, 'steps': 13501, 'batch_loss/train': 0.806865626713261} 12/24/2021 18:11:33 - INFO - codeparrot_training - Step 13502: {'lr': 0.0004168096347232836, 'samples': 6913536, 'steps': 13502, 'batch_loss/train': 0.8002302469685674} 12/24/2021 18:11:44 - INFO - codeparrot_training - Step 13503: {'lr': 0.00041679789331418234, 'samples': 6914048, 'steps': 13503, 'batch_loss/train': 0.8567804843187332} 12/24/2021 18:11:58 - INFO - codeparrot_training - Step 13504: {'lr': 0.00041678615124195564, 'samples': 6914560, 'steps': 13504, 'batch_loss/train': 0.7756170593202114} 12/24/2021 18:12:08 - INFO - codeparrot_training - Step 13505: {'lr': 0.0004167744085066502, 'samples': 6915072, 'steps': 13505, 'batch_loss/train': 0.8307293020188808} 12/24/2021 18:12:19 - INFO - codeparrot_training - Step 13506: {'lr': 0.0004167626651083126, 'samples': 6915584, 'steps': 13506, 'batch_loss/train': 0.7335264547728002} 12/24/2021 18:12:30 - INFO - codeparrot_training - Step 13507: {'lr': 0.0004167509210469896, 'samples': 6916096, 'steps': 13507, 'batch_loss/train': 0.8580498928204179} 12/24/2021 18:12:42 - INFO - codeparrot_training - Step 13508: {'lr': 0.000416739176322728, 'samples': 6916608, 'steps': 13508, 'batch_loss/train': 0.7321929335594177} 12/24/2021 18:12:53 - INFO - codeparrot_training - Step 13509: {'lr': 0.0004167274309355743, 'samples': 6917120, 'steps': 13509, 'batch_loss/train': 0.8430396215990186} 12/24/2021 18:13:03 - INFO - codeparrot_training - Step 13510: {'lr': 0.0004167156848855753, 'samples': 6917632, 'steps': 13510, 'batch_loss/train': 0.8778301689308137} 12/24/2021 18:13:15 - INFO - codeparrot_training - Step 13511: {'lr': 0.0004167039381727778, 'samples': 6918144, 
'steps': 13511, 'batch_loss/train': 0.8345812894403934} 12/24/2021 18:13:26 - INFO - codeparrot_training - Step 13512: {'lr': 0.0004166921907972283, 'samples': 6918656, 'steps': 13512, 'batch_loss/train': 0.6932782281655818} 12/24/2021 18:13:36 - INFO - codeparrot_training - Step 13513: {'lr': 0.0004166804427589736, 'samples': 6919168, 'steps': 13513, 'batch_loss/train': 0.7686655893921852} 12/24/2021 18:13:50 - INFO - codeparrot_training - Step 13514: {'lr': 0.0004166686940580604, 'samples': 6919680, 'steps': 13514, 'batch_loss/train': 0.7660488232504576} 12/24/2021 18:14:01 - INFO - codeparrot_training - Step 13515: {'lr': 0.00041665694469453544, 'samples': 6920192, 'steps': 13515, 'batch_loss/train': 0.8607385698705912} 12/24/2021 18:14:12 - INFO - codeparrot_training - Step 13516: {'lr': 0.0004166451946684453, 'samples': 6920704, 'steps': 13516, 'batch_loss/train': 0.6800875917542726} 12/24/2021 18:14:24 - INFO - codeparrot_training - Step 13517: {'lr': 0.0004166334439798368, 'samples': 6921216, 'steps': 13517, 'batch_loss/train': 0.7620429182425141} 12/24/2021 18:14:34 - INFO - codeparrot_training - Step 13518: {'lr': 0.0004166216926287567, 'samples': 6921728, 'steps': 13518, 'batch_loss/train': 0.8660935582593083} 12/24/2021 18:14:45 - INFO - codeparrot_training - Step 13519: {'lr': 0.0004166099406152517, 'samples': 6922240, 'steps': 13519, 'batch_loss/train': 0.832039772765711} 12/24/2021 18:14:55 - INFO - codeparrot_training - Step 13520: {'lr': 0.00041659818793936845, 'samples': 6922752, 'steps': 13520, 'batch_loss/train': 0.8911683410406113} 12/24/2021 18:15:09 - INFO - codeparrot_training - Step 13521: {'lr': 0.00041658643460115366, 'samples': 6923264, 'steps': 13521, 'batch_loss/train': 0.8742270432412624} 12/24/2021 18:15:20 - INFO - codeparrot_training - Step 13522: {'lr': 0.00041657468060065414, 'samples': 6923776, 'steps': 13522, 'batch_loss/train': 0.892209294019267} 12/24/2021 18:15:31 - INFO - codeparrot_training - Step 13523: {'lr': 
0.00041656292593791655, 'samples': 6924288, 'steps': 13523, 'batch_loss/train': 0.7969346344470978} 12/24/2021 18:15:43 - INFO - codeparrot_training - Step 13524: {'lr': 0.00041655117061298774, 'samples': 6924800, 'steps': 13524, 'batch_loss/train': 0.8883431719150394} 12/24/2021 18:15:53 - INFO - codeparrot_training - Step 13525: {'lr': 0.0004165394146259143, 'samples': 6925312, 'steps': 13525, 'batch_loss/train': 0.7722043972462416} 12/24/2021 18:16:04 - INFO - codeparrot_training - Step 13526: {'lr': 0.00041652765797674306, 'samples': 6925824, 'steps': 13526, 'batch_loss/train': 0.8182902177795768} 12/24/2021 18:16:16 - INFO - codeparrot_training - Step 13527: {'lr': 0.0004165159006655207, 'samples': 6926336, 'steps': 13527, 'batch_loss/train': 0.8291089567355812} 12/24/2021 18:16:27 - INFO - codeparrot_training - Step 13528: {'lr': 0.000416504142692294, 'samples': 6926848, 'steps': 13528, 'batch_loss/train': 0.8431429513730109} 12/24/2021 18:16:37 - INFO - codeparrot_training - Step 13529: {'lr': 0.00041649238405710976, 'samples': 6927360, 'steps': 13529, 'batch_loss/train': 0.7660315232351422} 12/24/2021 18:16:48 - INFO - codeparrot_training - Step 13530: {'lr': 0.0004164806247600146, 'samples': 6927872, 'steps': 13530, 'batch_loss/train': 0.8278312152251601} 12/24/2021 18:17:02 - INFO - codeparrot_training - Step 13531: {'lr': 0.0004164688648010554, 'samples': 6928384, 'steps': 13531, 'batch_loss/train': 0.905308123677969} 12/24/2021 18:17:12 - INFO - codeparrot_training - Step 13532: {'lr': 0.0004164571041802788, 'samples': 6928896, 'steps': 13532, 'batch_loss/train': 0.7631172575056553} 12/24/2021 18:17:23 - INFO - codeparrot_training - Step 13533: {'lr': 0.00041644534289773157, 'samples': 6929408, 'steps': 13533, 'batch_loss/train': 0.8230935633182526} 12/24/2021 18:17:35 - INFO - codeparrot_training - Step 13534: {'lr': 0.0004164335809534605, 'samples': 6929920, 'steps': 13534, 'batch_loss/train': 0.8858921630308032} 12/24/2021 18:17:46 - INFO - 
codeparrot_training - Step 13535: {'lr': 0.00041642181834751247, 'samples': 6930432, 'steps': 13535, 'batch_loss/train': 0.6871832953765988} 12/24/2021 18:17:56 - INFO - codeparrot_training - Step 13536: {'lr': 0.00041641005507993403, 'samples': 6930944, 'steps': 13536, 'batch_loss/train': 0.7912917165085673} 12/24/2021 18:18:09 - INFO - codeparrot_training - Step 13537: {'lr': 0.0004163982911507721, 'samples': 6931456, 'steps': 13537, 'batch_loss/train': 0.7571589183062315} 12/24/2021 18:18:19 - INFO - codeparrot_training - Step 13538: {'lr': 0.0004163865265600733, 'samples': 6931968, 'steps': 13538, 'batch_loss/train': 1.1577836433425546} 12/24/2021 18:18:30 - INFO - codeparrot_training - Step 13539: {'lr': 0.0004163747613078846, 'samples': 6932480, 'steps': 13539, 'batch_loss/train': 0.8656878378242254} 12/24/2021 18:18:41 - INFO - codeparrot_training - Step 13540: {'lr': 0.0004163629953942526, 'samples': 6932992, 'steps': 13540, 'batch_loss/train': 0.7274064254015684} 12/24/2021 18:18:53 - INFO - codeparrot_training - Step 13541: {'lr': 0.00041635122881922423, 'samples': 6933504, 'steps': 13541, 'batch_loss/train': 0.7746416749432683} 12/24/2021 18:19:03 - INFO - codeparrot_training - Step 13542: {'lr': 0.00041633946158284605, 'samples': 6934016, 'steps': 13542, 'batch_loss/train': 0.7561235805042088} 12/24/2021 18:19:14 - INFO - codeparrot_training - Step 13543: {'lr': 0.0004163276936851651, 'samples': 6934528, 'steps': 13543, 'batch_loss/train': 0.7302806712687016} 12/24/2021 18:19:28 - INFO - codeparrot_training - Step 13544: {'lr': 0.000416315925126228, 'samples': 6935040, 'steps': 13544, 'batch_loss/train': 0.7187257376499474} 12/24/2021 18:19:39 - INFO - codeparrot_training - Step 13545: {'lr': 0.0004163041559060816, 'samples': 6935552, 'steps': 13545, 'batch_loss/train': 0.808296175673604} 12/24/2021 18:19:49 - INFO - codeparrot_training - Step 13546: {'lr': 0.00041629238602477266, 'samples': 6936064, 'steps': 13546, 'batch_loss/train': 
0.8431949052028358} 12/24/2021 18:20:01 - INFO - codeparrot_training - Step 13547: {'lr': 0.00041628061548234797, 'samples': 6936576, 'steps': 13547, 'batch_loss/train': 0.7906867447309196} 12/24/2021 18:20:12 - INFO - codeparrot_training - Step 13548: {'lr': 0.00041626884427885437, 'samples': 6937088, 'steps': 13548, 'batch_loss/train': 0.7996720841620117} 12/24/2021 18:20:23 - INFO - codeparrot_training - Step 13549: {'lr': 0.0004162570724143386, 'samples': 6937600, 'steps': 13549, 'batch_loss/train': 0.8480516914278269} 12/24/2021 18:20:33 - INFO - codeparrot_training - Step 13550: {'lr': 0.0004162452998888475, 'samples': 6938112, 'steps': 13550, 'batch_loss/train': 0.8501515774987638} 12/24/2021 18:20:47 - INFO - codeparrot_training - Step 13551: {'lr': 0.0004162335267024278, 'samples': 6938624, 'steps': 13551, 'batch_loss/train': 0.6762274018255994} 12/24/2021 18:20:58 - INFO - codeparrot_training - Step 13552: {'lr': 0.0004162217528551264, 'samples': 6939136, 'steps': 13552, 'batch_loss/train': 0.7632704516872764} 12/24/2021 18:21:09 - INFO - codeparrot_training - Step 13553: {'lr': 0.0004162099783469901, 'samples': 6939648, 'steps': 13553, 'batch_loss/train': 0.8442970439791679} 12/24/2021 18:21:20 - INFO - codeparrot_training - Step 13554: {'lr': 0.0004161982031780657, 'samples': 6940160, 'steps': 13554, 'batch_loss/train': 0.805717951618135} 12/24/2021 18:21:31 - INFO - codeparrot_training - Step 13555: {'lr': 0.00041618642734839997, 'samples': 6940672, 'steps': 13555, 'batch_loss/train': 0.7349476516246796} 12/24/2021 18:21:42 - INFO - codeparrot_training - Step 13556: {'lr': 0.0004161746508580397, 'samples': 6941184, 'steps': 13556, 'batch_loss/train': 0.8336070696823299} 12/24/2021 18:21:54 - INFO - codeparrot_training - Step 13557: {'lr': 0.0004161628737070319, 'samples': 6941696, 'steps': 13557, 'batch_loss/train': 0.8172261312138289} 12/24/2021 18:22:05 - INFO - codeparrot_training - Step 13558: {'lr': 0.0004161510958954231, 'samples': 6942208, 
'steps': 13558, 'batch_loss/train': 0.797247963026166} 12/24/2021 18:22:15 - INFO - codeparrot_training - Step 13559: {'lr': 0.0004161393174232604, 'samples': 6942720, 'steps': 13559, 'batch_loss/train': 0.8284044377505779} 12/24/2021 18:22:29 - INFO - codeparrot_training - Step 13560: {'lr': 0.0004161275382905904, 'samples': 6943232, 'steps': 13560, 'batch_loss/train': 0.852595305070281} 12/24/2021 18:22:40 - INFO - codeparrot_training - Step 13561: {'lr': 0.0004161157584974601, 'samples': 6943744, 'steps': 13561, 'batch_loss/train': 0.8503582179546356} 12/24/2021 18:22:50 - INFO - codeparrot_training - Step 13562: {'lr': 0.00041610397804391633, 'samples': 6944256, 'steps': 13562, 'batch_loss/train': 0.7518537547439337} 12/24/2021 18:23:01 - INFO - codeparrot_training - Step 13563: {'lr': 0.0004160921969300058, 'samples': 6944768, 'steps': 13563, 'batch_loss/train': 0.7942410269752145} 12/24/2021 18:23:13 - INFO - codeparrot_training - Step 13564: {'lr': 0.0004160804151557753, 'samples': 6945280, 'steps': 13564, 'batch_loss/train': 0.7812806433066726} 12/24/2021 18:23:24 - INFO - codeparrot_training - Step 13565: {'lr': 0.00041606863272127193, 'samples': 6945792, 'steps': 13565, 'batch_loss/train': 0.7502692146226764} 12/24/2021 18:23:34 - INFO - codeparrot_training - Step 13566: {'lr': 0.00041605684962654234, 'samples': 6946304, 'steps': 13566, 'batch_loss/train': 0.8206184215378016} 12/24/2021 18:23:46 - INFO - codeparrot_training - Step 13567: {'lr': 0.0004160450658716334, 'samples': 6946816, 'steps': 13567, 'batch_loss/train': 0.7317204936407506} 12/24/2021 18:23:57 - INFO - codeparrot_training - Step 13568: {'lr': 0.0004160332814565919, 'samples': 6947328, 'steps': 13568, 'batch_loss/train': 0.8107765773311257} 12/24/2021 18:24:07 - INFO - codeparrot_training - Step 13569: {'lr': 0.00041602149638146494, 'samples': 6947840, 'steps': 13569, 'batch_loss/train': 0.8233556877821684} 12/24/2021 18:24:21 - INFO - codeparrot_training - Step 13570: {'lr': 
0.00041600971064629903, 'samples': 6948352, 'steps': 13570, 'batch_loss/train': 0.79493148624897} 12/24/2021 18:24:32 - INFO - codeparrot_training - Step 13571: {'lr': 0.00041599792425114126, 'samples': 6948864, 'steps': 13571, 'batch_loss/train': 0.7601630084682256} 12/24/2021 18:24:42 - INFO - codeparrot_training - Step 13572: {'lr': 0.00041598613719603846, 'samples': 6949376, 'steps': 13572, 'batch_loss/train': 0.8443589173257351} 12/24/2021 18:24:53 - INFO - codeparrot_training - Step 13573: {'lr': 0.0004159743494810374, 'samples': 6949888, 'steps': 13573, 'batch_loss/train': 0.6615303644211963} 12/24/2021 18:25:05 - INFO - codeparrot_training - Step 13574: {'lr': 0.00041596256110618503, 'samples': 6950400, 'steps': 13574, 'batch_loss/train': 0.907196993008256} 12/24/2021 18:25:16 - INFO - codeparrot_training - Step 13575: {'lr': 0.00041595077207152813, 'samples': 6950912, 'steps': 13575, 'batch_loss/train': 0.9993758499622345} 12/24/2021 18:25:26 - INFO - codeparrot_training - Step 13576: {'lr': 0.0004159389823771137, 'samples': 6951424, 'steps': 13576, 'batch_loss/train': 0.7420989573001862} 12/24/2021 18:25:38 - INFO - codeparrot_training - Step 13577: {'lr': 0.00041592719202298846, 'samples': 6951936, 'steps': 13577, 'batch_loss/train': 0.8297704339493066} 12/24/2021 18:25:49 - INFO - codeparrot_training - Step 13578: {'lr': 0.00041591540100919947, 'samples': 6952448, 'steps': 13578, 'batch_loss/train': 0.8221791572868824} 12/24/2021 18:26:00 - INFO - codeparrot_training - Step 13579: {'lr': 0.0004159036093357934, 'samples': 6952960, 'steps': 13579, 'batch_loss/train': 0.797499518841505} 12/24/2021 18:26:13 - INFO - codeparrot_training - Step 13580: {'lr': 0.00041589181700281717, 'samples': 6953472, 'steps': 13580, 'batch_loss/train': 0.7871373433154076} 12/24/2021 18:26:24 - INFO - codeparrot_training - Step 13581: {'lr': 0.00041588002401031775, 'samples': 6953984, 'steps': 13581, 'batch_loss/train': 0.7627917525824159} 12/24/2021 18:26:35 - INFO - 
codeparrot_training - Step 13582: {'lr': 0.0004158682303583421, 'samples': 6954496, 'steps': 13582, 'batch_loss/train': 0.9050031621009111} 12/24/2021 18:26:45 - INFO - codeparrot_training - Step 13583: {'lr': 0.0004158564360469368, 'samples': 6955008, 'steps': 13583, 'batch_loss/train': 0.8953026542440057} 12/24/2021 18:26:57 - INFO - codeparrot_training - Step 13584: {'lr': 0.00041584464107614905, 'samples': 6955520, 'steps': 13584, 'batch_loss/train': 0.8781520901247859} 12/24/2021 18:27:08 - INFO - codeparrot_training - Step 13585: {'lr': 0.0004158328454460257, 'samples': 6956032, 'steps': 13585, 'batch_loss/train': 0.8078726553358138} 12/24/2021 18:27:18 - INFO - codeparrot_training - Step 13586: {'lr': 0.00041582104915661343, 'samples': 6956544, 'steps': 13586, 'batch_loss/train': 0.8386942148208618} 12/24/2021 18:27:31 - INFO - codeparrot_training - Step 13587: {'lr': 0.0004158092522079594, 'samples': 6957056, 'steps': 13587, 'batch_loss/train': 0.8247412638738751} 12/24/2021 18:27:41 - INFO - codeparrot_training - Step 13588: {'lr': 0.0004157974546001103, 'samples': 6957568, 'steps': 13588, 'batch_loss/train': 0.8546771369874477} 12/24/2021 18:27:52 - INFO - codeparrot_training - Step 13589: {'lr': 0.00041578565633311313, 'samples': 6958080, 'steps': 13589, 'batch_loss/train': 0.8126274775713682} 12/24/2021 18:28:06 - INFO - codeparrot_training - Step 13590: {'lr': 0.00041577385740701483, 'samples': 6958592, 'steps': 13590, 'batch_loss/train': 0.7123595825396478} 12/24/2021 18:28:16 - INFO - codeparrot_training - Step 13591: {'lr': 0.0004157620578218623, 'samples': 6959104, 'steps': 13591, 'batch_loss/train': 0.7426079995930195} 12/24/2021 18:28:27 - INFO - codeparrot_training - Step 13592: {'lr': 0.0004157502575777023, 'samples': 6959616, 'steps': 13592, 'batch_loss/train': 0.7773772475775331} 12/24/2021 18:28:39 - INFO - codeparrot_training - Step 13593: {'lr': 0.00041573845667458196, 'samples': 6960128, 'steps': 13593, 'batch_loss/train': 
0.7342138603562489} 12/24/2021 18:28:50 - INFO - codeparrot_training - Step 13594: {'lr': 0.0004157266551125481, 'samples': 6960640, 'steps': 13594, 'batch_loss/train': 0.7890412475680932} 12/24/2021 18:29:00 - INFO - codeparrot_training - Step 13595: {'lr': 0.00041571485289164763, 'samples': 6961152, 'steps': 13595, 'batch_loss/train': 0.7613468486815691} 12/24/2021 18:29:11 - INFO - codeparrot_training - Step 13596: {'lr': 0.00041570305001192745, 'samples': 6961664, 'steps': 13596, 'batch_loss/train': 0.6747157517820597} 12/24/2021 18:29:23 - INFO - codeparrot_training - Step 13597: {'lr': 0.0004156912464734345, 'samples': 6962176, 'steps': 13597, 'batch_loss/train': 0.5618223410565406} 12/24/2021 18:29:34 - INFO - codeparrot_training - Step 13598: {'lr': 0.00041567944227621567, 'samples': 6962688, 'steps': 13598, 'batch_loss/train': 1.2834808812476695} 12/24/2021 18:29:44 - INFO - codeparrot_training - Step 13599: {'lr': 0.000415667637420318, 'samples': 6963200, 'steps': 13599, 'batch_loss/train': 0.8605329317506403} 12/24/2021 18:29:58 - INFO - codeparrot_training - Step 13600: {'lr': 0.00041565583190578836, 'samples': 6963712, 'steps': 13600, 'batch_loss/train': 0.8286610064096749} 12/24/2021 18:30:09 - INFO - codeparrot_training - Step 13601: {'lr': 0.0004156440257326737, 'samples': 6964224, 'steps': 13601, 'batch_loss/train': 0.7679252438247204} 12/24/2021 18:30:19 - INFO - codeparrot_training - Step 13602: {'lr': 0.0004156322189010209, 'samples': 6964736, 'steps': 13602, 'batch_loss/train': 0.8736072815954685} 12/24/2021 18:30:32 - INFO - codeparrot_training - Step 13603: {'lr': 0.00041562041141087695, 'samples': 6965248, 'steps': 13603, 'batch_loss/train': 0.9162763254716992} 12/24/2021 18:30:42 - INFO - codeparrot_training - Step 13604: {'lr': 0.0004156086032622888, 'samples': 6965760, 'steps': 13604, 'batch_loss/train': 0.8754878863692284} 12/24/2021 18:30:53 - INFO - codeparrot_training - Step 13605: {'lr': 0.0004155967944553033, 'samples': 6966272, 
'steps': 13605, 'batch_loss/train': 0.8162585999816656} 12/24/2021 18:31:03 - INFO - codeparrot_training - Step 13606: {'lr': 0.00041558498498996756, 'samples': 6966784, 'steps': 13606, 'batch_loss/train': 0.7420970080420375} 12/24/2021 18:31:18 - INFO - codeparrot_training - Step 13607: {'lr': 0.0004155731748663284, 'samples': 6967296, 'steps': 13607, 'batch_loss/train': 0.8076087161898613} 12/24/2021 18:31:28 - INFO - codeparrot_training - Step 13608: {'lr': 0.0004155613640844328, 'samples': 6967808, 'steps': 13608, 'batch_loss/train': 1.5994646856561303} 12/24/2021 18:31:39 - INFO - codeparrot_training - Step 13609: {'lr': 0.00041554955264432774, 'samples': 6968320, 'steps': 13609, 'batch_loss/train': 0.8448003521189094} 12/24/2021 18:31:50 - INFO - codeparrot_training - Step 13610: {'lr': 0.0004155377405460602, 'samples': 6968832, 'steps': 13610, 'batch_loss/train': 1.0527706816792488} 12/24/2021 18:32:02 - INFO - codeparrot_training - Step 13611: {'lr': 0.00041552592778967715, 'samples': 6969344, 'steps': 13611, 'batch_loss/train': 0.9617097899317741} 12/24/2021 18:32:12 - INFO - codeparrot_training - Step 13612: {'lr': 0.0004155141143752254, 'samples': 6969856, 'steps': 13612, 'batch_loss/train': 0.8335088947787881} 12/24/2021 18:32:23 - INFO - codeparrot_training - Step 13613: {'lr': 0.00041550230030275206, 'samples': 6970368, 'steps': 13613, 'batch_loss/train': 0.7734317518770695} 12/24/2021 18:32:35 - INFO - codeparrot_training - Step 13614: {'lr': 0.00041549048557230407, 'samples': 6970880, 'steps': 13614, 'batch_loss/train': 0.7767074825242162} 12/24/2021 18:32:46 - INFO - codeparrot_training - Step 13615: {'lr': 0.0004154786701839284, 'samples': 6971392, 'steps': 13615, 'batch_loss/train': 0.8982709851115942} 12/24/2021 18:32:57 - INFO - codeparrot_training - Step 13616: {'lr': 0.00041546685413767207, 'samples': 6971904, 'steps': 13616, 'batch_loss/train': 0.7459296784363687} 12/24/2021 18:33:11 - INFO - codeparrot_training - Step 13617: {'lr': 
0.0004154550374335819, 'samples': 6972416, 'steps': 13617, 'batch_loss/train': 0.8025864227674901} 12/24/2021 18:33:22 - INFO - codeparrot_training - Step 13618: {'lr': 0.0004154432200717051, 'samples': 6972928, 'steps': 13618, 'batch_loss/train': 0.8186353500932455} 12/24/2021 18:33:32 - INFO - codeparrot_training - Step 13619: {'lr': 0.00041543140205208847, 'samples': 6973440, 'steps': 13619, 'batch_loss/train': 0.8373757395893335} 12/24/2021 18:33:44 - INFO - codeparrot_training - Step 13620: {'lr': 0.00041541958337477903, 'samples': 6973952, 'steps': 13620, 'batch_loss/train': 0.8270373838022351} 12/24/2021 18:33:55 - INFO - codeparrot_training - Step 13621: {'lr': 0.0004154077640398238, 'samples': 6974464, 'steps': 13621, 'batch_loss/train': 0.8414728604257107} 12/24/2021 18:34:05 - INFO - codeparrot_training - Step 13622: {'lr': 0.00041539594404726976, 'samples': 6974976, 'steps': 13622, 'batch_loss/train': 0.7877097204327583} 12/24/2021 18:34:16 - INFO - codeparrot_training - Step 13623: {'lr': 0.0004153841233971639, 'samples': 6975488, 'steps': 13623, 'batch_loss/train': 0.8000831445679069} 12/24/2021 18:34:30 - INFO - codeparrot_training - Step 13624: {'lr': 0.0004153723020895532, 'samples': 6976000, 'steps': 13624, 'batch_loss/train': 0.8378220936283469} 12/24/2021 18:34:41 - INFO - codeparrot_training - Step 13625: {'lr': 0.00041536048012448464, 'samples': 6976512, 'steps': 13625, 'batch_loss/train': 0.7665108414366841} 12/24/2021 18:34:51 - INFO - codeparrot_training - Step 13626: {'lr': 0.0004153486575020053, 'samples': 6977024, 'steps': 13626, 'batch_loss/train': 0.6879871636629105} 12/24/2021 18:35:04 - INFO - codeparrot_training - Step 13627: {'lr': 0.0004153368342221621, 'samples': 6977536, 'steps': 13627, 'batch_loss/train': 0.8053453015163541} 12/24/2021 18:35:14 - INFO - codeparrot_training - Step 13628: {'lr': 0.00041532501028500215, 'samples': 6978048, 'steps': 13628, 'batch_loss/train': 0.771147623192519} 12/24/2021 18:35:25 - INFO - 
codeparrot_training - Step 13629: {'lr': 0.0004153131856905723, 'samples': 6978560, 'steps': 13629, 'batch_loss/train': 0.7066908853594214} 12/24/2021 18:35:37 - INFO - codeparrot_training - Step 13630: {'lr': 0.0004153013604389196, 'samples': 6979072, 'steps': 13630, 'batch_loss/train': 0.6941221880260855} 12/24/2021 18:35:48 - INFO - codeparrot_training - Step 13631: {'lr': 0.0004152895345300912, 'samples': 6979584, 'steps': 13631, 'batch_loss/train': 0.7088221539743245} 12/24/2021 18:35:58 - INFO - codeparrot_training - Step 13632: {'lr': 0.000415277707964134, 'samples': 6980096, 'steps': 13632, 'batch_loss/train': 0.8061456470750272} 12/24/2021 18:36:12 - INFO - codeparrot_training - Step 13633: {'lr': 0.0004152658807410951, 'samples': 6980608, 'steps': 13633, 'batch_loss/train': 0.8245454579591751} 12/24/2021 18:36:23 - INFO - codeparrot_training - Step 13634: {'lr': 0.0004152540528610214, 'samples': 6981120, 'steps': 13634, 'batch_loss/train': 0.7408408988267183} 12/24/2021 18:36:33 - INFO - codeparrot_training - Step 13635: {'lr': 0.0004152422243239599, 'samples': 6981632, 'steps': 13635, 'batch_loss/train': 0.8544167056679726} 12/24/2021 18:36:44 - INFO - codeparrot_training - Step 13636: {'lr': 0.00041523039512995777, 'samples': 6982144, 'steps': 13636, 'batch_loss/train': 0.8134855362586677} 12/24/2021 18:36:56 - INFO - codeparrot_training - Step 13637: {'lr': 0.00041521856527906197, 'samples': 6982656, 'steps': 13637, 'batch_loss/train': 0.7779782065190375} 12/24/2021 18:37:06 - INFO - codeparrot_training - Step 13638: {'lr': 0.0004152067347713196, 'samples': 6983168, 'steps': 13638, 'batch_loss/train': 0.7503496606368572} 12/24/2021 18:37:17 - INFO - codeparrot_training - Step 13639: {'lr': 0.0004151949036067776, 'samples': 6983680, 'steps': 13639, 'batch_loss/train': 0.7322690435685217} 12/24/2021 18:37:31 - INFO - codeparrot_training - Step 13640: {'lr': 0.00041518307178548306, 'samples': 6984192, 'steps': 13640, 'batch_loss/train': 
0.5727132566971704} 12/24/2021 18:37:42 - INFO - codeparrot_training - Step 13641: {'lr': 0.00041517123930748294, 'samples': 6984704, 'steps': 13641, 'batch_loss/train': 0.7602025484666228} 12/24/2021 18:37:52 - INFO - codeparrot_training - Step 13642: {'lr': 0.0004151594061728244, 'samples': 6985216, 'steps': 13642, 'batch_loss/train': 0.616954448632896} 12/24/2021 18:38:04 - INFO - codeparrot_training - Step 13643: {'lr': 0.00041514757238155436, 'samples': 6985728, 'steps': 13643, 'batch_loss/train': 0.7560896258801222} 12/24/2021 18:38:15 - INFO - codeparrot_training - Step 13644: {'lr': 0.00041513573793371995, 'samples': 6986240, 'steps': 13644, 'batch_loss/train': 0.6395050549181178} 12/24/2021 18:38:25 - INFO - codeparrot_training - Step 13645: {'lr': 0.00041512390282936817, 'samples': 6986752, 'steps': 13645, 'batch_loss/train': 0.6964522725902498} 12/24/2021 18:38:36 - INFO - codeparrot_training - Step 13646: {'lr': 0.0004151120670685462, 'samples': 6987264, 'steps': 13646, 'batch_loss/train': 0.6883084822911769} 12/24/2021 18:38:48 - INFO - codeparrot_training - Step 13647: {'lr': 0.0004151002306513009, 'samples': 6987776, 'steps': 13647, 'batch_loss/train': 0.8000080175697803} 12/24/2021 18:38:59 - INFO - codeparrot_training - Step 13648: {'lr': 0.0004150883935776795, 'samples': 6988288, 'steps': 13648, 'batch_loss/train': 0.5171181011828594} 12/24/2021 18:39:09 - INFO - codeparrot_training - Step 13649: {'lr': 0.00041507655584772896, 'samples': 6988800, 'steps': 13649, 'batch_loss/train': 0.866093622520566} 12/24/2021 18:39:22 - INFO - codeparrot_training - Step 13650: {'lr': 0.00041506471746149634, 'samples': 6989312, 'steps': 13650, 'batch_loss/train': 0.7229694500565529} 12/24/2021 18:39:32 - INFO - codeparrot_training - Step 13651: {'lr': 0.0004150528784190287, 'samples': 6989824, 'steps': 13651, 'batch_loss/train': 0.7149697262793779} 12/24/2021 18:39:43 - INFO - codeparrot_training - Step 13652: {'lr': 0.00041504103872037326, 'samples': 6990336, 
'steps': 13652, 'batch_loss/train': 0.8414313551038504} 12/24/2021 18:39:57 - INFO - codeparrot_training - Step 13653: {'lr': 0.0004150291983655768, 'samples': 6990848, 'steps': 13653, 'batch_loss/train': 0.822242493275553} 12/24/2021 18:40:07 - INFO - codeparrot_training - Step 13654: {'lr': 0.00041501735735468664, 'samples': 6991360, 'steps': 13654, 'batch_loss/train': 0.6815868830308318} 12/24/2021 18:40:18 - INFO - codeparrot_training - Step 13655: {'lr': 0.0004150055156877498, 'samples': 6991872, 'steps': 13655, 'batch_loss/train': 0.8054414838552475} 12/24/2021 18:40:30 - INFO - codeparrot_training - Step 13656: {'lr': 0.0004149936733648133, 'samples': 6992384, 'steps': 13656, 'batch_loss/train': 0.8378195278346539} 12/24/2021 18:40:40 - INFO - codeparrot_training - Step 13657: {'lr': 0.0004149818303859243, 'samples': 6992896, 'steps': 13657, 'batch_loss/train': 0.7690370213240385} 12/24/2021 18:40:51 - INFO - codeparrot_training - Step 13658: {'lr': 0.00041496998675112976, 'samples': 6993408, 'steps': 13658, 'batch_loss/train': 0.7681764317676425} 12/24/2021 18:41:02 - INFO - codeparrot_training - Step 13659: {'lr': 0.00041495814246047684, 'samples': 6993920, 'steps': 13659, 'batch_loss/train': 0.8496627351269126} 12/24/2021 18:41:14 - INFO - codeparrot_training - Step 13660: {'lr': 0.0004149462975140127, 'samples': 6994432, 'steps': 13660, 'batch_loss/train': 0.7705060271546245} 12/24/2021 18:41:24 - INFO - codeparrot_training - Step 13661: {'lr': 0.00041493445191178436, 'samples': 6994944, 'steps': 13661, 'batch_loss/train': 0.779621466062963} 12/24/2021 18:41:35 - INFO - codeparrot_training - Step 13662: {'lr': 0.0004149226056538389, 'samples': 6995456, 'steps': 13662, 'batch_loss/train': 0.8515476898755878} 12/24/2021 18:41:49 - INFO - codeparrot_training - Step 13663: {'lr': 0.0004149107587402234, 'samples': 6995968, 'steps': 13663, 'batch_loss/train': 0.8078048098832369} 12/24/2021 18:41:59 - INFO - codeparrot_training - Step 13664: {'lr': 
0.000414898911170985, 'samples': 6996480, 'steps': 13664, 'batch_loss/train': 0.7279259655624628} 12/24/2021 18:42:10 - INFO - codeparrot_training - Step 13665: {'lr': 0.00041488706294617086, 'samples': 6996992, 'steps': 13665, 'batch_loss/train': 0.66089813481085} 12/24/2021 18:42:22 - INFO - codeparrot_training - Step 13666: {'lr': 0.0004148752140658279, 'samples': 6997504, 'steps': 13666, 'batch_loss/train': 0.9951891526579857} 12/24/2021 18:42:33 - INFO - codeparrot_training - Step 13667: {'lr': 0.0004148633645300034, 'samples': 6998016, 'steps': 13667, 'batch_loss/train': 0.8146980693563819} 12/24/2021 18:42:43 - INFO - codeparrot_training - Step 13668: {'lr': 0.00041485151433874444, 'samples': 6998528, 'steps': 13668, 'batch_loss/train': 0.9112509777769446} 12/24/2021 18:42:58 - INFO - codeparrot_training - Step 13669: {'lr': 0.00041483966349209814, 'samples': 6999040, 'steps': 13669, 'batch_loss/train': 0.6775628785253502} 12/24/2021 18:43:08 - INFO - codeparrot_training - Step 13670: {'lr': 0.0004148278119901114, 'samples': 6999552, 'steps': 13670, 'batch_loss/train': 0.8585020517930388} 12/24/2021 18:43:19 - INFO - codeparrot_training - Step 13671: {'lr': 0.0004148159598328317, 'samples': 7000064, 'steps': 13671, 'batch_loss/train': 0.8119877362623811} 12/24/2021 18:43:29 - INFO - codeparrot_training - Step 13672: {'lr': 0.00041480410702030593, 'samples': 7000576, 'steps': 13672, 'batch_loss/train': 0.7146873185411096} 12/24/2021 18:43:41 - INFO - codeparrot_training - Step 13673: {'lr': 0.0004147922535525812, 'samples': 7001088, 'steps': 13673, 'batch_loss/train': 0.8589732346590608} 12/24/2021 18:43:52 - INFO - codeparrot_training - Step 13674: {'lr': 0.0004147803994297048, 'samples': 7001600, 'steps': 13674, 'batch_loss/train': 0.7894395207986236} 12/24/2021 18:44:03 - INFO - codeparrot_training - Step 13675: {'lr': 0.00041476854465172366, 'samples': 7002112, 'steps': 13675, 'batch_loss/train': 0.7312543778680265} 12/24/2021 18:44:15 - INFO - 
codeparrot_training - Step 13676: {'lr': 0.000414756689218685, 'samples': 7002624, 'steps': 13676, 'batch_loss/train': 0.816523902118206} 12/24/2021 18:44:25 - INFO - codeparrot_training - Step 13677: {'lr': 0.00041474483313063597, 'samples': 7003136, 'steps': 13677, 'batch_loss/train': 0.6221683954936452} 12/24/2021 18:44:36 - INFO - codeparrot_training - Step 13678: {'lr': 0.00041473297638762376, 'samples': 7003648, 'steps': 13678, 'batch_loss/train': 0.819923160597682} 12/24/2021 18:44:50 - INFO - codeparrot_training - Step 13679: {'lr': 0.0004147211189896953, 'samples': 7004160, 'steps': 13679, 'batch_loss/train': 0.755083053605631} 12/24/2021 18:45:00 - INFO - codeparrot_training - Step 13680: {'lr': 0.000414709260936898, 'samples': 7004672, 'steps': 13680, 'batch_loss/train': 0.7362191779538989} 12/24/2021 18:45:11 - INFO - codeparrot_training - Step 13681: {'lr': 0.0004146974022292788, 'samples': 7005184, 'steps': 13681, 'batch_loss/train': 0.6980527229025029} 12/24/2021 18:45:22 - INFO - codeparrot_training - Step 13682: {'lr': 0.00041468554286688494, 'samples': 7005696, 'steps': 13682, 'batch_loss/train': 0.7809050139039755} 12/24/2021 18:45:34 - INFO - codeparrot_training - Step 13683: {'lr': 0.0004146736828497635, 'samples': 7006208, 'steps': 13683, 'batch_loss/train': 0.7738628190709278} 12/24/2021 18:45:44 - INFO - codeparrot_training - Step 13684: {'lr': 0.0004146618221779617, 'samples': 7006720, 'steps': 13684, 'batch_loss/train': 0.8167421203106642} 12/24/2021 18:45:55 - INFO - codeparrot_training - Step 13685: {'lr': 0.0004146499608515267, 'samples': 7007232, 'steps': 13685, 'batch_loss/train': 0.8444514884613454} 12/24/2021 18:46:07 - INFO - codeparrot_training - Step 13686: {'lr': 0.00041463809887050564, 'samples': 7007744, 'steps': 13686, 'batch_loss/train': 0.8419130739057437} 12/24/2021 18:46:18 - INFO - codeparrot_training - Step 13687: {'lr': 0.0004146262362349456, 'samples': 7008256, 'steps': 13687, 'batch_loss/train': 0.7723642699420452} 
12/24/2021 18:46:28 - INFO - codeparrot_training - Step 13688: {'lr': 0.00041461437294489377, 'samples': 7008768, 'steps': 13688, 'batch_loss/train': 0.8000844828784466} 12/24/2021 18:46:40 - INFO - codeparrot_training - Step 13689: {'lr': 0.0004146025090003975, 'samples': 7009280, 'steps': 13689, 'batch_loss/train': 0.8232099590823054} 12/24/2021 18:46:51 - INFO - codeparrot_training - Step 13690: {'lr': 0.0004145906444015037, 'samples': 7009792, 'steps': 13690, 'batch_loss/train': 0.824959154240787} 12/24/2021 18:47:01 - INFO - codeparrot_training - Step 13691: {'lr': 0.0004145787791482597, 'samples': 7010304, 'steps': 13691, 'batch_loss/train': 0.6821286793565378} 12/24/2021 18:47:16 - INFO - codeparrot_training - Step 13692: {'lr': 0.0004145669132407125, 'samples': 7010816, 'steps': 13692, 'batch_loss/train': 0.7573491861112416} 12/24/2021 18:47:26 - INFO - codeparrot_training - Step 13693: {'lr': 0.00041455504667890953, 'samples': 7011328, 'steps': 13693, 'batch_loss/train': 0.7247601402923465} 12/24/2021 18:47:37 - INFO - codeparrot_training - Step 13694: {'lr': 0.0004145431794628978, 'samples': 7011840, 'steps': 13694, 'batch_loss/train': 0.5414795642718673} 12/24/2021 18:47:47 - INFO - codeparrot_training - Step 13695: {'lr': 0.0004145313115927244, 'samples': 7012352, 'steps': 13695, 'batch_loss/train': 0.5723126069642603} 12/24/2021 18:47:59 - INFO - codeparrot_training - Step 13696: {'lr': 0.0004145194430684367, 'samples': 7012864, 'steps': 13696, 'batch_loss/train': 0.7723208030220121} 12/24/2021 18:48:10 - INFO - codeparrot_training - Step 13697: {'lr': 0.0004145075738900819, 'samples': 7013376, 'steps': 13697, 'batch_loss/train': 0.7757347812876105} 12/24/2021 18:48:21 - INFO - codeparrot_training - Step 13698: {'lr': 0.00041449570405770696, 'samples': 7013888, 'steps': 13698, 'batch_loss/train': 0.7542486554011703} 12/24/2021 18:48:33 - INFO - codeparrot_training - Step 13699: {'lr': 0.00041448383357135934, 'samples': 7014400, 'steps': 13699, 
'batch_loss/train': 0.7780963629484177} 12/24/2021 18:48:43 - INFO - codeparrot_training - Step 13700: {'lr': 0.000414471962431086, 'samples': 7014912, 'steps': 13700, 'batch_loss/train': 0.8019173555076122} 12/24/2021 18:48:54 - INFO - codeparrot_training - Step 13701: {'lr': 0.0004144600906369343, 'samples': 7015424, 'steps': 13701, 'batch_loss/train': 0.6196174866636284} 12/24/2021 18:49:08 - INFO - codeparrot_training - Step 13702: {'lr': 0.0004144482181889514, 'samples': 7015936, 'steps': 13702, 'batch_loss/train': 0.7423722818493843} 12/24/2021 18:49:18 - INFO - codeparrot_training - Step 13703: {'lr': 0.00041443634508718443, 'samples': 7016448, 'steps': 13703, 'batch_loss/train': 0.5910641030641273} 12/24/2021 18:49:29 - INFO - codeparrot_training - Step 13704: {'lr': 0.00041442447133168065, 'samples': 7016960, 'steps': 13704, 'batch_loss/train': 0.8018735032528639} 12/24/2021 18:49:41 - INFO - codeparrot_training - Step 13705: {'lr': 0.0004144125969224872, 'samples': 7017472, 'steps': 13705, 'batch_loss/train': 0.7842907197773457} 12/24/2021 18:49:52 - INFO - codeparrot_training - Step 13706: {'lr': 0.00041440072185965147, 'samples': 7017984, 'steps': 13706, 'batch_loss/train': 0.7329177493229508} 12/24/2021 18:50:02 - INFO - codeparrot_training - Step 13707: {'lr': 0.0004143888461432205, 'samples': 7018496, 'steps': 13707, 'batch_loss/train': 0.7898299028165638} 12/24/2021 18:50:13 - INFO - codeparrot_training - Step 13708: {'lr': 0.00041437696977324147, 'samples': 7019008, 'steps': 13708, 'batch_loss/train': 0.7254409631714225} 12/24/2021 18:50:27 - INFO - codeparrot_training - Step 13709: {'lr': 0.00041436509274976174, 'samples': 7019520, 'steps': 13709, 'batch_loss/train': 0.8507406148128211} 12/24/2021 18:50:37 - INFO - codeparrot_training - Step 13710: {'lr': 0.0004143532150728285, 'samples': 7020032, 'steps': 13710, 'batch_loss/train': 0.7709641717374325} 12/24/2021 18:50:48 - INFO - codeparrot_training - Step 13711: {'lr': 0.00041434133674248886, 
'samples': 7020544, 'steps': 13711, 'batch_loss/train': 0.7490430531324819} 12/24/2021 18:51:00 - INFO - codeparrot_training - Step 13712: {'lr': 0.00041432945775879014, 'samples': 7021056, 'steps': 13712, 'batch_loss/train': 0.8391087293857709} 12/24/2021 18:51:11 - INFO - codeparrot_training - Step 13713: {'lr': 0.00041431757812177954, 'samples': 7021568, 'steps': 13713, 'batch_loss/train': 0.8657533032819629} 12/24/2021 18:51:21 - INFO - codeparrot_training - Step 13714: {'lr': 0.0004143056978315043, 'samples': 7022080, 'steps': 13714, 'batch_loss/train': 0.8324931645765901} 12/24/2021 18:51:34 - INFO - codeparrot_training - Step 13715: {'lr': 0.0004142938168880117, 'samples': 7022592, 'steps': 13715, 'batch_loss/train': 0.7390142614021897} 12/24/2021 18:51:44 - INFO - codeparrot_training - Step 13716: {'lr': 0.00041428193529134883, 'samples': 7023104, 'steps': 13716, 'batch_loss/train': 0.8528866451233625} 12/24/2021 18:51:55 - INFO - codeparrot_training - Step 13717: {'lr': 0.00041427005304156306, 'samples': 7023616, 'steps': 13717, 'batch_loss/train': 0.8304711235687137} 12/24/2021 18:52:09 - INFO - codeparrot_training - Step 13718: {'lr': 0.0004142581701387016, 'samples': 7024128, 'steps': 13718, 'batch_loss/train': 0.7579422313719988} 12/24/2021 18:52:19 - INFO - codeparrot_training - Step 13719: {'lr': 0.00041424628658281163, 'samples': 7024640, 'steps': 13719, 'batch_loss/train': 0.8063094979152083} 12/24/2021 18:52:30 - INFO - codeparrot_training - Step 13720: {'lr': 0.0004142344023739405, 'samples': 7025152, 'steps': 13720, 'batch_loss/train': 0.7543291477486491} 12/24/2021 18:52:40 - INFO - codeparrot_training - Step 13721: {'lr': 0.00041422251751213536, 'samples': 7025664, 'steps': 13721, 'batch_loss/train': 0.845734991133213} 12/24/2021 18:52:53 - INFO - codeparrot_training - Step 13722: {'lr': 0.00041421063199744345, 'samples': 7026176, 'steps': 13722, 'batch_loss/train': 0.6845304323360324} 12/24/2021 18:53:04 - INFO - codeparrot_training - Step 
13723: {'lr': 0.0004141987458299121, 'samples': 7026688, 'steps': 13723, 'batch_loss/train': 0.824881193228066} 12/24/2021 18:53:14 - INFO - codeparrot_training - Step 13724: {'lr': 0.00041418685900958865, 'samples': 7027200, 'steps': 13724, 'batch_loss/train': 1.102796956896782} 12/24/2021 18:53:28 - INFO - codeparrot_training - Step 13725: {'lr': 0.00041417497153652013, 'samples': 7027712, 'steps': 13725, 'batch_loss/train': 0.7118276503169909} 12/24/2021 18:53:39 - INFO - codeparrot_training - Step 13726: {'lr': 0.00041416308341075394, 'samples': 7028224, 'steps': 13726, 'batch_loss/train': 0.9084341879934072} 12/24/2021 18:53:49 - INFO - codeparrot_training - Step 13727: {'lr': 0.0004141511946323373, 'samples': 7028736, 'steps': 13727, 'batch_loss/train': 1.6263319330755621} 12/24/2021 18:54:01 - INFO - codeparrot_training - Step 13728: {'lr': 0.0004141393052013175, 'samples': 7029248, 'steps': 13728, 'batch_loss/train': 0.83491116669029} 12/24/2021 18:54:12 - INFO - codeparrot_training - Step 13729: {'lr': 0.0004141274151177418, 'samples': 7029760, 'steps': 13729, 'batch_loss/train': 0.7719694809056818} 12/24/2021 18:54:23 - INFO - codeparrot_training - Step 13730: {'lr': 0.0004141155243816575, 'samples': 7030272, 'steps': 13730, 'batch_loss/train': 0.8388674193993211} 12/24/2021 18:54:33 - INFO - codeparrot_training - Step 13731: {'lr': 0.00041410363299311184, 'samples': 7030784, 'steps': 13731, 'batch_loss/train': 0.6981899528764188} 12/24/2021 18:54:45 - INFO - codeparrot_training - Step 13732: {'lr': 0.000414091740952152, 'samples': 7031296, 'steps': 13732, 'batch_loss/train': 0.7887698905542493} 12/24/2021 18:54:56 - INFO - codeparrot_training - Step 13733: {'lr': 0.0004140798482588255, 'samples': 7031808, 'steps': 13733, 'batch_loss/train': 0.8351681074127555} 12/24/2021 18:55:06 - INFO - codeparrot_training - Step 13734: {'lr': 0.0004140679549131794, 'samples': 7032320, 'steps': 13734, 'batch_loss/train': 0.7311671525822021} 12/24/2021 18:55:19 - INFO - 
codeparrot_training - Step 13735: {'lr': 0.00041405606091526106, 'samples': 7032832, 'steps': 13735, 'batch_loss/train': 0.7369772981619462} 12/24/2021 18:55:29 - INFO - codeparrot_training - Step 13736: {'lr': 0.00041404416626511774, 'samples': 7033344, 'steps': 13736, 'batch_loss/train': 0.7665149951353669} 12/24/2021 18:55:40 - INFO - codeparrot_training - Step 13737: {'lr': 0.00041403227096279685, 'samples': 7033856, 'steps': 13737, 'batch_loss/train': 0.5664254012517631} 12/24/2021 18:55:58 - INFO - codeparrot_training - Step 13738: {'lr': 0.00041402037500834547, 'samples': 7034368, 'steps': 13738, 'batch_loss/train': 0.8143429988995194} 12/24/2021 18:56:08 - INFO - codeparrot_training - Step 13739: {'lr': 0.0004140084784018111, 'samples': 7034880, 'steps': 13739, 'batch_loss/train': 0.7422681129537523} 12/24/2021 18:56:19 - INFO - codeparrot_training - Step 13740: {'lr': 0.0004139965811432409, 'samples': 7035392, 'steps': 13740, 'batch_loss/train': 0.8327446104958653} 12/24/2021 18:56:31 - INFO - codeparrot_training - Step 13741: {'lr': 0.00041398468323268224, 'samples': 7035904, 'steps': 13741, 'batch_loss/train': 0.807286012917757} 12/24/2021 18:56:42 - INFO - codeparrot_training - Step 13742: {'lr': 0.0004139727846701824, 'samples': 7036416, 'steps': 13742, 'batch_loss/train': 0.5721896469476633} 12/24/2021 18:56:52 - INFO - codeparrot_training - Step 13743: {'lr': 0.0004139608854557887, 'samples': 7036928, 'steps': 13743, 'batch_loss/train': 0.7039823178201914} 12/24/2021 18:57:03 - INFO - codeparrot_training - Step 13744: {'lr': 0.0004139489855895484, 'samples': 7037440, 'steps': 13744, 'batch_loss/train': 0.8394855805672705} 12/24/2021 18:57:15 - INFO - codeparrot_training - Step 13745: {'lr': 0.0004139370850715088, 'samples': 7037952, 'steps': 13745, 'batch_loss/train': 0.7994116768240929} 12/24/2021 18:57:26 - INFO - codeparrot_training - Step 13746: {'lr': 0.0004139251839017173, 'samples': 7038464, 'steps': 13746, 'batch_loss/train': 
0.8983790781348944} 12/24/2021 18:57:36 - INFO - codeparrot_training - Step 13747: {'lr': 0.0004139132820802212, 'samples': 7038976, 'steps': 13747, 'batch_loss/train': 0.47267161414492875} 12/24/2021 18:57:50 - INFO - codeparrot_training - Step 13748: {'lr': 0.00041390137960706767, 'samples': 7039488, 'steps': 13748, 'batch_loss/train': 0.6873089817818254} 12/24/2021 18:58:01 - INFO - codeparrot_training - Step 13749: {'lr': 0.00041388947648230426, 'samples': 7040000, 'steps': 13749, 'batch_loss/train': 0.7105399006977677} 12/24/2021 18:58:11 - INFO - codeparrot_training - Step 13750: {'lr': 0.00041387757270597807, 'samples': 7040512, 'steps': 13750, 'batch_loss/train': 0.8636395689100027} 12/24/2021 18:58:23 - INFO - codeparrot_training - Step 13751: {'lr': 0.0004138656682781365, 'samples': 7041024, 'steps': 13751, 'batch_loss/train': 0.7472064206376672} 12/24/2021 18:58:34 - INFO - codeparrot_training - Step 13752: {'lr': 0.00041385376319882705, 'samples': 7041536, 'steps': 13752, 'batch_loss/train': 0.7574474830180407} 12/24/2021 18:58:45 - INFO - codeparrot_training - Step 13753: {'lr': 0.0004138418574680969, 'samples': 7042048, 'steps': 13753, 'batch_loss/train': 0.8692765822634101} 12/24/2021 18:58:59 - INFO - codeparrot_training - Step 13754: {'lr': 0.0004138299510859933, 'samples': 7042560, 'steps': 13754, 'batch_loss/train': 0.8347559403628111} 12/24/2021 18:59:09 - INFO - codeparrot_training - Step 13755: {'lr': 0.0004138180440525637, 'samples': 7043072, 'steps': 13755, 'batch_loss/train': 0.7110188116785139} 12/24/2021 18:59:20 - INFO - codeparrot_training - Step 13756: {'lr': 0.00041380613636785534, 'samples': 7043584, 'steps': 13756, 'batch_loss/train': 0.5543778762221336} 12/24/2021 18:59:31 - INFO - codeparrot_training - Step 13757: {'lr': 0.0004137942280319158, 'samples': 7044096, 'steps': 13757, 'batch_loss/train': 0.7029660665430129} 12/24/2021 18:59:43 - INFO - codeparrot_training - Step 13758: {'lr': 0.0004137823190447921, 'samples': 7044608, 
'steps': 13758, 'batch_loss/train': 0.8327977927401662} 12/24/2021 18:59:53 - INFO - codeparrot_training - Step 13759: {'lr': 0.0004137704094065319, 'samples': 7045120, 'steps': 13759, 'batch_loss/train': 0.7297650510445237} 12/24/2021 19:00:04 - INFO - codeparrot_training - Step 13760: {'lr': 0.00041375849911718223, 'samples': 7045632, 'steps': 13760, 'batch_loss/train': 0.6714879022911191} 12/24/2021 19:00:16 - INFO - codeparrot_training - Step 13761: {'lr': 0.0004137465881767907, 'samples': 7046144, 'steps': 13761, 'batch_loss/train': 0.8099147165194154} 12/24/2021 19:00:27 - INFO - codeparrot_training - Step 13762: {'lr': 0.00041373467658540453, 'samples': 7046656, 'steps': 13762, 'batch_loss/train': 0.7030640514567494} 12/24/2021 19:00:37 - INFO - codeparrot_training - Step 13763: {'lr': 0.0004137227643430711, 'samples': 7047168, 'steps': 13763, 'batch_loss/train': 0.8173875032225624} 12/24/2021 19:00:51 - INFO - codeparrot_training - Step 13764: {'lr': 0.0004137108514498378, 'samples': 7047680, 'steps': 13764, 'batch_loss/train': 0.7252163379453123} 12/24/2021 19:01:02 - INFO - codeparrot_training - Step 13765: {'lr': 0.000413698937905752, 'samples': 7048192, 'steps': 13765, 'batch_loss/train': 0.7321086304727942} 12/24/2021 19:01:12 - INFO - codeparrot_training - Step 13766: {'lr': 0.0004136870237108609, 'samples': 7048704, 'steps': 13766, 'batch_loss/train': 0.7876685298979282} 12/24/2021 19:01:24 - INFO - codeparrot_training - Step 13767: {'lr': 0.00041367510886521216, 'samples': 7049216, 'steps': 13767, 'batch_loss/train': 0.7364500996191055} 12/24/2021 19:01:35 - INFO - codeparrot_training - Step 13768: {'lr': 0.00041366319336885287, 'samples': 7049728, 'steps': 13768, 'batch_loss/train': 0.8186225444078445} 12/24/2021 19:01:46 - INFO - codeparrot_training - Step 13769: {'lr': 0.0004136512772218305, 'samples': 7050240, 'steps': 13769, 'batch_loss/train': 0.804456191137433} 12/24/2021 19:01:56 - INFO - codeparrot_training - Step 13770: {'lr': 
0.0004136393604241926, 'samples': 7050752, 'steps': 13770, 'batch_loss/train': 0.7949732150882483} 12/24/2021 19:02:08 - INFO - codeparrot_training - Step 13771: {'lr': 0.0004136274429759863, 'samples': 7051264, 'steps': 13771, 'batch_loss/train': 0.77310883870814} 12/24/2021 19:02:19 - INFO - codeparrot_training - Step 13772: {'lr': 0.00041361552487725905, 'samples': 7051776, 'steps': 13772, 'batch_loss/train': 0.850149161182344} 12/24/2021 19:02:30 - INFO - codeparrot_training - Step 13773: {'lr': 0.00041360360612805824, 'samples': 7052288, 'steps': 13773, 'batch_loss/train': 0.7344567750114948} 12/24/2021 19:02:42 - INFO - codeparrot_training - Step 13774: {'lr': 0.0004135916867284314, 'samples': 7052800, 'steps': 13774, 'batch_loss/train': 0.8862059488892555} 12/24/2021 19:02:52 - INFO - codeparrot_training - Step 13775: {'lr': 0.0004135797666784257, 'samples': 7053312, 'steps': 13775, 'batch_loss/train': 0.8259107945486903} 12/24/2021 19:03:03 - INFO - codeparrot_training - Step 13776: {'lr': 0.00041356784597808857, 'samples': 7053824, 'steps': 13776, 'batch_loss/train': 0.6449545131181367} 12/24/2021 19:03:17 - INFO - codeparrot_training - Step 13777: {'lr': 0.00041355592462746744, 'samples': 7054336, 'steps': 13777, 'batch_loss/train': 0.8018773994408548} 12/24/2021 19:03:27 - INFO - codeparrot_training - Step 13778: {'lr': 0.0004135440026266097, 'samples': 7054848, 'steps': 13778, 'batch_loss/train': 0.7598493844270706} 12/24/2021 19:03:38 - INFO - codeparrot_training - Step 13779: {'lr': 0.00041353207997556283, 'samples': 7055360, 'steps': 13779, 'batch_loss/train': 0.7611016016453505} 12/24/2021 19:03:48 - INFO - codeparrot_training - Step 13780: {'lr': 0.00041352015667437417, 'samples': 7055872, 'steps': 13780, 'batch_loss/train': 0.8139701951295137} 12/24/2021 19:04:01 - INFO - codeparrot_training - Step 13781: {'lr': 0.000413508232723091, 'samples': 7056384, 'steps': 13781, 'batch_loss/train': 0.8054298779461533} 12/24/2021 19:04:12 - INFO - 
codeparrot_training - Step 13782: {'lr': 0.0004134963081217609, 'samples': 7056896, 'steps': 13782, 'batch_loss/train': 0.7181649958947673} 12/24/2021 19:04:22 - INFO - codeparrot_training - Step 13783: {'lr': 0.00041348438287043126, 'samples': 7057408, 'steps': 13783, 'batch_loss/train': 1.254366279579699} 12/24/2021 19:04:36 - INFO - codeparrot_training - Step 13784: {'lr': 0.0004134724569691494, 'samples': 7057920, 'steps': 13784, 'batch_loss/train': 0.8452668786048889} 12/24/2021 19:04:47 - INFO - codeparrot_training - Step 13785: {'lr': 0.0004134605304179628, 'samples': 7058432, 'steps': 13785, 'batch_loss/train': 0.8826300539076328} 12/24/2021 19:04:57 - INFO - codeparrot_training - Step 13786: {'lr': 0.0004134486032169188, 'samples': 7058944, 'steps': 13786, 'batch_loss/train': 0.6541162537178025} 12/24/2021 19:05:09 - INFO - codeparrot_training - Step 13787: {'lr': 0.0004134366753660649, 'samples': 7059456, 'steps': 13787, 'batch_loss/train': 0.8040604181587696} 12/24/2021 19:05:20 - INFO - codeparrot_training - Step 13788: {'lr': 0.0004134247468654485, 'samples': 7059968, 'steps': 13788, 'batch_loss/train': 0.7409105292754248} 12/24/2021 19:05:31 - INFO - codeparrot_training - Step 13789: {'lr': 0.00041341281771511696, 'samples': 7060480, 'steps': 13789, 'batch_loss/train': 0.588150198047515} 12/24/2021 19:05:41 - INFO - codeparrot_training - Step 13790: {'lr': 0.0004134008879151178, 'samples': 7060992, 'steps': 13790, 'batch_loss/train': 0.7603881377726793} 12/24/2021 19:05:54 - INFO - codeparrot_training - Step 13791: {'lr': 0.0004133889574654984, 'samples': 7061504, 'steps': 13791, 'batch_loss/train': 0.7656506458297372} 12/24/2021 19:06:04 - INFO - codeparrot_training - Step 13792: {'lr': 0.00041337702636630626, 'samples': 7062016, 'steps': 13792, 'batch_loss/train': 0.6573046429548413} 12/24/2021 19:06:15 - INFO - codeparrot_training - Step 13793: {'lr': 0.0004133650946175886, 'samples': 7062528, 'steps': 13793, 'batch_loss/train': 0.8574365223757923} 
12/24/2021 19:06:29 - INFO - codeparrot_training - Step 13794: {'lr': 0.0004133531622193932, 'samples': 7063040, 'steps': 13794, 'batch_loss/train': 0.7609620625153184} 12/24/2021 19:06:39 - INFO - codeparrot_training - Step 13795: {'lr': 0.0004133412291717672, 'samples': 7063552, 'steps': 13795, 'batch_loss/train': 0.7340179816819727} 12/24/2021 19:06:50 - INFO - codeparrot_training - Step 13796: {'lr': 0.00041332929547475814, 'samples': 7064064, 'steps': 13796, 'batch_loss/train': 0.7534839473664761} 12/24/2021 19:07:02 - INFO - codeparrot_training - Step 13797: {'lr': 0.00041331736112841343, 'samples': 7064576, 'steps': 13797, 'batch_loss/train': 0.6740753091289662} 12/24/2021 19:07:13 - INFO - codeparrot_training - Step 13798: {'lr': 0.00041330542613278063, 'samples': 7065088, 'steps': 13798, 'batch_loss/train': 0.8201344264671206} 12/24/2021 19:07:23 - INFO - codeparrot_training - Step 13799: {'lr': 0.00041329349048790714, 'samples': 7065600, 'steps': 13799, 'batch_loss/train': 0.8119339402765036} 12/24/2021 19:07:35 - INFO - codeparrot_training - Step 13800: {'lr': 0.0004132815541938404, 'samples': 7066112, 'steps': 13800, 'batch_loss/train': 0.8328267401084304} 12/24/2021 19:07:46 - INFO - codeparrot_training - Step 13801: {'lr': 0.00041326961725062774, 'samples': 7066624, 'steps': 13801, 'batch_loss/train': 0.8421200728043914} 12/24/2021 19:07:56 - INFO - codeparrot_training - Step 13802: {'lr': 0.00041325767965831687, 'samples': 7067136, 'steps': 13802, 'batch_loss/train': 0.821764625608921} 12/24/2021 19:08:07 - INFO - codeparrot_training - Step 13803: {'lr': 0.00041324574141695503, 'samples': 7067648, 'steps': 13803, 'batch_loss/train': 0.7783314744010568} 12/24/2021 19:08:21 - INFO - codeparrot_training - Step 13804: {'lr': 0.00041323380252658977, 'samples': 7068160, 'steps': 13804, 'batch_loss/train': 0.6536838025785983} 12/24/2021 19:08:32 - INFO - codeparrot_training - Step 13805: {'lr': 0.00041322186298726847, 'samples': 7068672, 'steps': 13805, 
'batch_loss/train': 0.7722303494811058} 12/24/2021 19:08:42 - INFO - codeparrot_training - Step 13806: {'lr': 0.0004132099227990388, 'samples': 7069184, 'steps': 13806, 'batch_loss/train': 1.6551284750457853} 12/24/2021 19:08:54 - INFO - codeparrot_training - Step 13807: {'lr': 0.00041319798196194803, 'samples': 7069696, 'steps': 13807, 'batch_loss/train': 0.8546575368382037} 12/24/2021 19:09:05 - INFO - codeparrot_training - Step 13808: {'lr': 0.0004131860404760437, 'samples': 7070208, 'steps': 13808, 'batch_loss/train': 0.6782859904924408} 12/24/2021 19:09:16 - INFO - codeparrot_training - Step 13809: {'lr': 0.00041317409834137333, 'samples': 7070720, 'steps': 13809, 'batch_loss/train': 0.6992060746997595} 12/24/2021 19:09:28 - INFO - codeparrot_training - Step 13810: {'lr': 0.00041316215555798437, 'samples': 7071232, 'steps': 13810, 'batch_loss/train': 0.7432420841942076} 12/24/2021 19:09:38 - INFO - codeparrot_training - Step 13811: {'lr': 0.0004131502121259243, 'samples': 7071744, 'steps': 13811, 'batch_loss/train': 0.8000726504251361} 12/24/2021 19:09:49 - INFO - codeparrot_training - Step 13812: {'lr': 0.00041313826804524053, 'samples': 7072256, 'steps': 13812, 'batch_loss/train': 0.6671086917631328} 12/24/2021 19:10:03 - INFO - codeparrot_training - Step 13813: {'lr': 0.00041312632331598064, 'samples': 7072768, 'steps': 13813, 'batch_loss/train': 0.8069752398878336} 12/24/2021 19:10:13 - INFO - codeparrot_training - Step 13814: {'lr': 0.000413114377938192, 'samples': 7073280, 'steps': 13814, 'batch_loss/train': 0.5336777325719595} 12/24/2021 19:10:24 - INFO - codeparrot_training - Step 13815: {'lr': 0.00041310243191192234, 'samples': 7073792, 'steps': 13815, 'batch_loss/train': 0.8123653624206781} 12/24/2021 19:10:35 - INFO - codeparrot_training - Step 13816: {'lr': 0.0004130904852372188, 'samples': 7074304, 'steps': 13816, 'batch_loss/train': 0.8205056395381689} 12/24/2021 19:10:47 - INFO - codeparrot_training - Step 13817: {'lr': 0.0004130785379141292, 
'samples': 7074816, 'steps': 13817, 'batch_loss/train': 0.6982385264709592} 12/24/2021 19:10:57 - INFO - codeparrot_training - Step 13818: {'lr': 0.0004130665899427009, 'samples': 7075328, 'steps': 13818, 'batch_loss/train': 0.7416854574694298} 12/24/2021 19:11:08 - INFO - codeparrot_training - Step 13819: {'lr': 0.00041305464132298137, 'samples': 7075840, 'steps': 13819, 'batch_loss/train': 0.8176624048501253} 12/24/2021 19:11:20 - INFO - codeparrot_training - Step 13820: {'lr': 0.0004130426920550182, 'samples': 7076352, 'steps': 13820, 'batch_loss/train': 0.8363923761062324} 12/24/2021 19:11:31 - INFO - codeparrot_training - Step 13821: {'lr': 0.0004130307421388588, 'samples': 7076864, 'steps': 13821, 'batch_loss/train': 0.7054522428661585} 12/24/2021 19:11:41 - INFO - codeparrot_training - Step 13822: {'lr': 0.0004130187915745508, 'samples': 7077376, 'steps': 13822, 'batch_loss/train': 0.6923885527648963} 12/24/2021 19:11:55 - INFO - codeparrot_training - Step 13823: {'lr': 0.0004130068403621416, 'samples': 7077888, 'steps': 13823, 'batch_loss/train': 0.7390890698879957} 12/24/2021 19:12:06 - INFO - codeparrot_training - Step 13824: {'lr': 0.0004129948885016788, 'samples': 7078400, 'steps': 13824, 'batch_loss/train': 0.8712930865585804} 12/24/2021 19:12:16 - INFO - codeparrot_training - Step 13825: {'lr': 0.00041298293599320975, 'samples': 7078912, 'steps': 13825, 'batch_loss/train': 0.711054421029985} 12/24/2021 19:12:27 - INFO - codeparrot_training - Step 13826: {'lr': 0.00041297098283678215, 'samples': 7079424, 'steps': 13826, 'batch_loss/train': 0.7081672102212906} 12/24/2021 19:12:39 - INFO - codeparrot_training - Step 13827: {'lr': 0.0004129590290324434, 'samples': 7079936, 'steps': 13827, 'batch_loss/train': 0.8310264572501183} 12/24/2021 19:12:50 - INFO - codeparrot_training - Step 13828: {'lr': 0.0004129470745802412, 'samples': 7080448, 'steps': 13828, 'batch_loss/train': 0.7231151815503836} 12/24/2021 19:13:00 - INFO - codeparrot_training - Step 13829: 
{'lr': 0.0004129351194802229, 'samples': 7080960, 'steps': 13829, 'batch_loss/train': 0.5560917900002096} 12/24/2021 19:13:13 - INFO - codeparrot_training - Step 13830: {'lr': 0.00041292316373243613, 'samples': 7081472, 'steps': 13830, 'batch_loss/train': 0.6821587373851798} 12/24/2021 19:13:24 - INFO - codeparrot_training - Step 13831: {'lr': 0.0004129112073369283, 'samples': 7081984, 'steps': 13831, 'batch_loss/train': 0.7934941085986793} 12/24/2021 19:13:34 - INFO - codeparrot_training - Step 13832: {'lr': 0.000412899250293747, 'samples': 7082496, 'steps': 13832, 'batch_loss/train': 0.787737812846899} 12/24/2021 19:13:48 - INFO - codeparrot_training - Step 13833: {'lr': 0.0004128872926029399, 'samples': 7083008, 'steps': 13833, 'batch_loss/train': 0.7046836782246828} 12/24/2021 19:13:59 - INFO - codeparrot_training - Step 13834: {'lr': 0.00041287533426455434, 'samples': 7083520, 'steps': 13834, 'batch_loss/train': 0.7218586192466319} 12/24/2021 19:14:09 - INFO - codeparrot_training - Step 13835: {'lr': 0.000412863375278638, 'samples': 7084032, 'steps': 13835, 'batch_loss/train': 0.7680007279850543} 12/24/2021 19:14:21 - INFO - codeparrot_training - Step 13836: {'lr': 0.00041285141564523833, 'samples': 7084544, 'steps': 13836, 'batch_loss/train': 0.7419218700379133} 12/24/2021 19:14:32 - INFO - codeparrot_training - Step 13837: {'lr': 0.000412839455364403, 'samples': 7085056, 'steps': 13837, 'batch_loss/train': 0.7094696526182815} 12/24/2021 19:14:43 - INFO - codeparrot_training - Step 13838: {'lr': 0.0004128274944361794, 'samples': 7085568, 'steps': 13838, 'batch_loss/train': 1.5229622810147703} 12/24/2021 19:14:53 - INFO - codeparrot_training - Step 13839: {'lr': 0.00041281553286061525, 'samples': 7086080, 'steps': 13839, 'batch_loss/train': 0.7207257116679102} 12/24/2021 19:15:07 - INFO - codeparrot_training - Step 13840: {'lr': 0.00041280357063775786, 'samples': 7086592, 'steps': 13840, 'batch_loss/train': 0.7817963939160109} 12/24/2021 19:15:17 - INFO - 
codeparrot_training - Step 13841: {'lr': 0.00041279160776765513, 'samples': 7087104, 'steps': 13841, 'batch_loss/train': 0.6269746905309148} 12/24/2021 19:15:28 - INFO - codeparrot_training - Step 13842: {'lr': 0.0004127796442503543, 'samples': 7087616, 'steps': 13842, 'batch_loss/train': 0.7492403733776882} 12/24/2021 19:15:40 - INFO - codeparrot_training - Step 13843: {'lr': 0.00041276768008590315, 'samples': 7088128, 'steps': 13843, 'batch_loss/train': 0.7393173091113567} 12/24/2021 19:15:51 - INFO - codeparrot_training - Step 13844: {'lr': 0.0004127557152743491, 'samples': 7088640, 'steps': 13844, 'batch_loss/train': 0.7269350872375071} 12/24/2021 19:16:01 - INFO - codeparrot_training - Step 13845: {'lr': 0.0004127437498157398, 'samples': 7089152, 'steps': 13845, 'batch_loss/train': 0.9026270033791661} 12/24/2021 19:16:14 - INFO - codeparrot_training - Step 13846: {'lr': 0.00041273178371012284, 'samples': 7089664, 'steps': 13846, 'batch_loss/train': 0.7188053233548999} 12/24/2021 19:16:24 - INFO - codeparrot_training - Step 13847: {'lr': 0.0004127198169575457, 'samples': 7090176, 'steps': 13847, 'batch_loss/train': 0.7137117060483433} 12/24/2021 19:16:35 - INFO - codeparrot_training - Step 13848: {'lr': 0.00041270784955805607, 'samples': 7090688, 'steps': 13848, 'batch_loss/train': 0.8657901147380471} 12/24/2021 19:16:45 - INFO - codeparrot_training - Step 13849: {'lr': 0.0004126958815117014, 'samples': 7091200, 'steps': 13849, 'batch_loss/train': 0.7976212091743946} 12/24/2021 19:16:58 - INFO - codeparrot_training - Step 13850: {'lr': 0.00041268391281852943, 'samples': 7091712, 'steps': 13850, 'batch_loss/train': 0.8473435062915087} 12/24/2021 19:17:08 - INFO - codeparrot_training - Step 13851: {'lr': 0.00041267194347858755, 'samples': 7092224, 'steps': 13851, 'batch_loss/train': 0.8161035836674273} 12/24/2021 19:17:19 - INFO - codeparrot_training - Step 13852: {'lr': 0.00041265997349192353, 'samples': 7092736, 'steps': 13852, 'batch_loss/train': 
0.6923118841368705} 12/24/2021 19:17:33 - INFO - codeparrot_training - Step 13853: {'lr': 0.00041264800285858473, 'samples': 7093248, 'steps': 13853, 'batch_loss/train': 0.7879116013646126} 12/24/2021 19:17:44 - INFO - codeparrot_training - Step 13854: {'lr': 0.0004126360315786191, 'samples': 7093760, 'steps': 13854, 'batch_loss/train': 0.8251137644983828} 12/24/2021 19:17:54 - INFO - codeparrot_training - Step 13855: {'lr': 0.00041262405965207386, 'samples': 7094272, 'steps': 13855, 'batch_loss/train': 0.7922248980030417} 12/24/2021 19:18:06 - INFO - codeparrot_training - Step 13856: {'lr': 0.00041261208707899676, 'samples': 7094784, 'steps': 13856, 'batch_loss/train': 0.5713498702971265} 12/24/2021 19:18:17 - INFO - codeparrot_training - Step 13857: {'lr': 0.0004126001138594354, 'samples': 7095296, 'steps': 13857, 'batch_loss/train': 0.7829071450978518} 12/24/2021 19:18:28 - INFO - codeparrot_training - Step 13858: {'lr': 0.0004125881399934375, 'samples': 7095808, 'steps': 13858, 'batch_loss/train': 0.8107706811279058} 12/24/2021 19:18:38 - INFO - codeparrot_training - Step 13859: {'lr': 0.0004125761654810504, 'samples': 7096320, 'steps': 13859, 'batch_loss/train': 0.8227475108578801} 12/24/2021 19:18:50 - INFO - codeparrot_training - Step 13860: {'lr': 0.00041256419032232195, 'samples': 7096832, 'steps': 13860, 'batch_loss/train': 0.8826451729983091} 12/24/2021 19:19:01 - INFO - codeparrot_training - Step 13861: {'lr': 0.00041255221451729965, 'samples': 7097344, 'steps': 13861, 'batch_loss/train': 0.7966415546834469} 12/24/2021 19:19:12 - INFO - codeparrot_training - Step 13862: {'lr': 0.0004125402380660311, 'samples': 7097856, 'steps': 13862, 'batch_loss/train': 0.7238206492038444} 12/24/2021 19:19:25 - INFO - codeparrot_training - Step 13863: {'lr': 0.000412528260968564, 'samples': 7098368, 'steps': 13863, 'batch_loss/train': 0.7158109792508185} 12/24/2021 19:19:36 - INFO - codeparrot_training - Step 13864: {'lr': 0.00041251628322494575, 'samples': 7098880, 
'steps': 13864, 'batch_loss/train': 0.6837175348773599} 12/24/2021 19:19:47 - INFO - codeparrot_training - Step 13865: {'lr': 0.0004125043048352243, 'samples': 7099392, 'steps': 13865, 'batch_loss/train': 0.729366144631058} 12/24/2021 19:19:59 - INFO - codeparrot_training - Step 13866: {'lr': 0.00041249232579944697, 'samples': 7099904, 'steps': 13866, 'batch_loss/train': 0.8408379824832082} 12/24/2021 19:20:09 - INFO - codeparrot_training - Step 13867: {'lr': 0.0004124803461176615, 'samples': 7100416, 'steps': 13867, 'batch_loss/train': 0.8124661212787032} 12/24/2021 19:20:20 - INFO - codeparrot_training - Step 13868: {'lr': 0.00041246836578991565, 'samples': 7100928, 'steps': 13868, 'batch_loss/train': 0.7492935814661905} 12/24/2021 19:20:32 - INFO - codeparrot_training - Step 13869: {'lr': 0.0004124563848162568, 'samples': 7101440, 'steps': 13869, 'batch_loss/train': 0.7596230180934072} 12/24/2021 19:20:43 - INFO - codeparrot_training - Step 13870: {'lr': 0.0004124444031967328, 'samples': 7101952, 'steps': 13870, 'batch_loss/train': 0.8455656040459871} 12/24/2021 19:20:54 - INFO - codeparrot_training - Step 13871: {'lr': 0.0004124324209313911, 'samples': 7102464, 'steps': 13871, 'batch_loss/train': 0.7423830169718713} 12/24/2021 19:21:04 - INFO - codeparrot_training - Step 13872: {'lr': 0.0004124204380202795, 'samples': 7102976, 'steps': 13872, 'batch_loss/train': 0.7052617112058215} 12/24/2021 19:21:18 - INFO - codeparrot_training - Step 13873: {'lr': 0.0004124084544634455, 'samples': 7103488, 'steps': 13873, 'batch_loss/train': 0.5409138493123464} 12/24/2021 19:21:29 - INFO - codeparrot_training - Step 13874: {'lr': 0.00041239647026093684, 'samples': 7104000, 'steps': 13874, 'batch_loss/train': 0.826431043446064} 12/24/2021 19:21:39 - INFO - codeparrot_training - Step 13875: {'lr': 0.00041238448541280117, 'samples': 7104512, 'steps': 13875, 'batch_loss/train': 0.7790813045576215} 12/24/2021 19:21:51 - INFO - codeparrot_training - Step 13876: {'lr': 
0.00041237249991908604, 'samples': 7105024, 'steps': 13876, 'batch_loss/train': 0.7998730465769768} 12/24/2021 19:22:02 - INFO - codeparrot_training - Step 13877: {'lr': 0.00041236051377983916, 'samples': 7105536, 'steps': 13877, 'batch_loss/train': 0.7523517006775364} 12/24/2021 19:22:12 - INFO - codeparrot_training - Step 13878: {'lr': 0.0004123485269951083, 'samples': 7106048, 'steps': 13878, 'batch_loss/train': 0.7682518009096384} 12/24/2021 19:22:26 - INFO - codeparrot_training - Step 13879: {'lr': 0.0004123365395649408, 'samples': 7106560, 'steps': 13879, 'batch_loss/train': 0.7962389052845538} 12/24/2021 19:22:37 - INFO - codeparrot_training - Step 13880: {'lr': 0.00041232455148938464, 'samples': 7107072, 'steps': 13880, 'batch_loss/train': 0.5892719498369843} 12/24/2021 19:22:48 - INFO - codeparrot_training - Step 13881: {'lr': 0.0004123125627684874, 'samples': 7107584, 'steps': 13881, 'batch_loss/train': 0.832256143912673} 12/24/2021 19:22:58 - INFO - codeparrot_training - Step 13882: {'lr': 0.00041230057340229656, 'samples': 7108096, 'steps': 13882, 'batch_loss/train': 0.8257887233048677} 12/24/2021 19:23:11 - INFO - codeparrot_training - Step 13883: {'lr': 0.00041228858339086, 'samples': 7108608, 'steps': 13883, 'batch_loss/train': 0.8291447944939137} 12/24/2021 19:23:21 - INFO - codeparrot_training - Step 13884: {'lr': 0.0004122765927342253, 'samples': 7109120, 'steps': 13884, 'batch_loss/train': 0.5774636511923745} 12/24/2021 19:23:32 - INFO - codeparrot_training - Step 13885: {'lr': 0.00041226460143244017, 'samples': 7109632, 'steps': 13885, 'batch_loss/train': 0.8914419384673238} 12/24/2021 19:23:44 - INFO - codeparrot_training - Step 13886: {'lr': 0.0004122526094855522, 'samples': 7110144, 'steps': 13886, 'batch_loss/train': 0.6644128388725221} 12/24/2021 19:23:55 - INFO - codeparrot_training - Step 13887: {'lr': 0.00041224061689360914, 'samples': 7110656, 'steps': 13887, 'batch_loss/train': 0.9055015463382006} 12/24/2021 19:24:05 - INFO - 
codeparrot_training - Step 13888: {'lr': 0.0004122286236566587, 'samples': 7111168, 'steps': 13888, 'batch_loss/train': 0.7438381639658473} 12/24/2021 19:24:20 - INFO - codeparrot_training - Step 13889: {'lr': 0.00041221662977474844, 'samples': 7111680, 'steps': 13889, 'batch_loss/train': 0.6994635771261528} 12/24/2021 19:24:30 - INFO - codeparrot_training - Step 13890: {'lr': 0.00041220463524792604, 'samples': 7112192, 'steps': 13890, 'batch_loss/train': 0.8810921711847186} 12/24/2021 19:24:41 - INFO - codeparrot_training - Step 13891: {'lr': 0.00041219264007623943, 'samples': 7112704, 'steps': 13891, 'batch_loss/train': 1.6836614892818034} 12/24/2021 19:24:52 - INFO - codeparrot_training - Step 13892: {'lr': 0.000412180644259736, 'samples': 7113216, 'steps': 13892, 'batch_loss/train': 0.8163724625483155} 12/24/2021 19:25:04 - INFO - codeparrot_training - Step 13893: {'lr': 0.00041216864779846366, 'samples': 7113728, 'steps': 13893, 'batch_loss/train': 0.8661471856757998} 12/24/2021 19:25:14 - INFO - codeparrot_training - Step 13894: {'lr': 0.00041215665069246996, 'samples': 7114240, 'steps': 13894, 'batch_loss/train': 0.8577881581149995} 12/24/2021 19:25:25 - INFO - codeparrot_training - Step 13895: {'lr': 0.0004121446529418026, 'samples': 7114752, 'steps': 13895, 'batch_loss/train': 0.8635135681251995} 12/24/2021 19:25:37 - INFO - codeparrot_training - Step 13896: {'lr': 0.0004121326545465094, 'samples': 7115264, 'steps': 13896, 'batch_loss/train': 0.5714422440505587} 12/24/2021 19:25:48 - INFO - codeparrot_training - Step 13897: {'lr': 0.00041212065550663793, 'samples': 7115776, 'steps': 13897, 'batch_loss/train': 0.7325993133708835} 12/24/2021 19:25:58 - INFO - codeparrot_training - Step 13898: {'lr': 0.000412108655822236, 'samples': 7116288, 'steps': 13898, 'batch_loss/train': 0.7717183511704206} 12/24/2021 19:26:12 - INFO - codeparrot_training - Step 13899: {'lr': 0.0004120966554933512, 'samples': 7116800, 'steps': 13899, 'batch_loss/train': 
0.6340821543126367} 12/24/2021 19:26:23 - INFO - codeparrot_training - Step 13900: {'lr': 0.0004120846545200313, 'samples': 7117312, 'steps': 13900, 'batch_loss/train': 0.8131558392196894} 12/24/2021 19:26:33 - INFO - codeparrot_training - Step 13901: {'lr': 0.0004120726529023241, 'samples': 7117824, 'steps': 13901, 'batch_loss/train': 0.8128682803362608} 12/24/2021 19:26:45 - INFO - codeparrot_training - Step 13902: {'lr': 0.0004120606506402772, 'samples': 7118336, 'steps': 13902, 'batch_loss/train': 0.8131849518977106} 12/24/2021 19:26:56 - INFO - codeparrot_training - Step 13903: {'lr': 0.00041204864773393825, 'samples': 7118848, 'steps': 13903, 'batch_loss/train': 0.8673683982342482} 12/24/2021 19:27:07 - INFO - codeparrot_training - Step 13904: {'lr': 0.0004120366441833551, 'samples': 7119360, 'steps': 13904, 'batch_loss/train': 0.7107691017445177} 12/24/2021 19:27:17 - INFO - codeparrot_training - Step 13905: {'lr': 0.0004120246399885755, 'samples': 7119872, 'steps': 13905, 'batch_loss/train': 0.6990752932615578} 12/24/2021 19:27:29 - INFO - codeparrot_training - Step 13906: {'lr': 0.0004120126351496471, 'samples': 7120384, 'steps': 13906, 'batch_loss/train': 0.7233710014261305} 12/24/2021 19:27:40 - INFO - codeparrot_training - Step 13907: {'lr': 0.00041200062966661766, 'samples': 7120896, 'steps': 13907, 'batch_loss/train': 0.7338734278455377} 12/24/2021 19:27:51 - INFO - codeparrot_training - Step 13908: {'lr': 0.0004119886235395348, 'samples': 7121408, 'steps': 13908, 'batch_loss/train': 0.7287185871973634} 12/24/2021 19:28:04 - INFO - codeparrot_training - Step 13909: {'lr': 0.0004119766167684464, 'samples': 7121920, 'steps': 13909, 'batch_loss/train': 0.8099658545106649} 12/24/2021 19:28:15 - INFO - codeparrot_training - Step 13910: {'lr': 0.00041196460935340006, 'samples': 7122432, 'steps': 13910, 'batch_loss/train': 0.7518203910440207} 12/24/2021 19:28:26 - INFO - codeparrot_training - Step 13911: {'lr': 0.0004119526012944437, 'samples': 7122944, 
'steps': 13911, 'batch_loss/train': 0.7432790687307715} 12/24/2021 19:28:38 - INFO - codeparrot_training - Step 13912: {'lr': 0.0004119405925916249, 'samples': 7123456, 'steps': 13912, 'batch_loss/train': 0.7531077340245247} 12/24/2021 19:28:48 - INFO - codeparrot_training - Step 13913: {'lr': 0.0004119285832449914, 'samples': 7123968, 'steps': 13913, 'batch_loss/train': 0.7231715171365067} 12/24/2021 19:28:59 - INFO - codeparrot_training - Step 13914: {'lr': 0.00041191657325459106, 'samples': 7124480, 'steps': 13914, 'batch_loss/train': 0.7240879312157631} 12/24/2021 19:29:11 - INFO - codeparrot_training - Step 13915: {'lr': 0.00041190456262047147, 'samples': 7124992, 'steps': 13915, 'batch_loss/train': 0.75280727352947} 12/24/2021 19:29:22 - INFO - codeparrot_training - Step 13916: {'lr': 0.00041189255134268057, 'samples': 7125504, 'steps': 13916, 'batch_loss/train': 0.756965477950871} 12/24/2021 19:29:32 - INFO - codeparrot_training - Step 13917: {'lr': 0.00041188053942126595, 'samples': 7126016, 'steps': 13917, 'batch_loss/train': 0.7368351048789918} 12/24/2021 19:29:43 - INFO - codeparrot_training - Step 13918: {'lr': 0.00041186852685627553, 'samples': 7126528, 'steps': 13918, 'batch_loss/train': 0.8117497656494379} 12/24/2021 19:29:57 - INFO - codeparrot_training - Step 13919: {'lr': 0.0004118565136477568, 'samples': 7127040, 'steps': 13919, 'batch_loss/train': 0.5958026023581624} 12/24/2021 19:30:08 - INFO - codeparrot_training - Step 13920: {'lr': 0.00041184449979575786, 'samples': 7127552, 'steps': 13920, 'batch_loss/train': 0.7007558700861409} 12/24/2021 19:30:18 - INFO - codeparrot_training - Step 13921: {'lr': 0.0004118324853003262, 'samples': 7128064, 'steps': 13921, 'batch_loss/train': 0.6878484301269054} 12/24/2021 19:30:30 - INFO - codeparrot_training - Step 13922: {'lr': 0.0004118204701615097, 'samples': 7128576, 'steps': 13922, 'batch_loss/train': 0.8642453588545322} 12/24/2021 19:30:41 - INFO - codeparrot_training - Step 13923: {'lr': 
0.0004118084543793561, 'samples': 7129088, 'steps': 13923, 'batch_loss/train': 0.7895752554759383} 12/24/2021 19:30:51 - INFO - codeparrot_training - Step 13924: {'lr': 0.0004117964379539132, 'samples': 7129600, 'steps': 13924, 'batch_loss/train': 0.7869997974485159} 12/24/2021 19:31:05 - INFO - codeparrot_training - Step 13925: {'lr': 0.0004117844208852288, 'samples': 7130112, 'steps': 13925, 'batch_loss/train': 0.7850196645595133} 12/24/2021 19:31:16 - INFO - codeparrot_training - Step 13926: {'lr': 0.0004117724031733506, 'samples': 7130624, 'steps': 13926, 'batch_loss/train': 0.775435988791287} 12/24/2021 19:31:27 - INFO - codeparrot_training - Step 13927: {'lr': 0.00041176038481832635, 'samples': 7131136, 'steps': 13927, 'batch_loss/train': 0.8554649790748954} 12/24/2021 19:31:39 - INFO - codeparrot_training - Step 13928: {'lr': 0.0004117483658202039, 'samples': 7131648, 'steps': 13928, 'batch_loss/train': 0.8903634222224355} 12/24/2021 19:31:49 - INFO - codeparrot_training - Step 13929: {'lr': 0.0004117363461790311, 'samples': 7132160, 'steps': 13929, 'batch_loss/train': 0.7546544759534299} 12/24/2021 19:32:00 - INFO - codeparrot_training - Step 13930: {'lr': 0.0004117243258948555, 'samples': 7132672, 'steps': 13930, 'batch_loss/train': 0.7029049132252112} 12/24/2021 19:32:11 - INFO - codeparrot_training - Step 13931: {'lr': 0.0004117123049677253, 'samples': 7133184, 'steps': 13931, 'batch_loss/train': 0.7112559049855918} 12/24/2021 19:32:23 - INFO - codeparrot_training - Step 13932: {'lr': 0.00041170028339768784, 'samples': 7133696, 'steps': 13932, 'batch_loss/train': 0.745105360285379} 12/24/2021 19:32:33 - INFO - codeparrot_training - Step 13933: {'lr': 0.0004116882611847912, 'samples': 7134208, 'steps': 13933, 'batch_loss/train': 0.8006752240471542} 12/24/2021 19:32:44 - INFO - codeparrot_training - Step 13934: {'lr': 0.00041167623832908296, 'samples': 7134720, 'steps': 13934, 'batch_loss/train': 0.8043532008305192} 12/24/2021 19:32:56 - INFO - 
codeparrot_training - Step 13935: {'lr': 0.0004116642148306111, 'samples': 7135232, 'steps': 13935, 'batch_loss/train': 0.8331430535763502} 12/24/2021 19:33:07 - INFO - codeparrot_training - Step 13936: {'lr': 0.00041165219068942343, 'samples': 7135744, 'steps': 13936, 'batch_loss/train': 0.8339579286985099} 12/24/2021 19:33:17 - INFO - codeparrot_training - Step 13937: {'lr': 0.0004116401659055676, 'samples': 7136256, 'steps': 13937, 'batch_loss/train': 0.6731277249054983} 12/24/2021 19:33:31 - INFO - codeparrot_training - Step 13938: {'lr': 0.00041162814047909156, 'samples': 7136768, 'steps': 13938, 'batch_loss/train': 0.7882440134417266} 12/24/2021 19:33:42 - INFO - codeparrot_training - Step 13939: {'lr': 0.0004116161144100431, 'samples': 7137280, 'steps': 13939, 'batch_loss/train': 0.7255773618817329} 12/24/2021 19:33:53 - INFO - codeparrot_training - Step 13940: {'lr': 0.0004116040876984699, 'samples': 7137792, 'steps': 13940, 'batch_loss/train': 0.7576815681532025} 12/24/2021 19:34:03 - INFO - codeparrot_training - Step 13941: {'lr': 0.0004115920603444199, 'samples': 7138304, 'steps': 13941, 'batch_loss/train': 0.7593799885362387} 12/24/2021 19:34:15 - INFO - codeparrot_training - Step 13942: {'lr': 0.00041158003234794094, 'samples': 7138816, 'steps': 13942, 'batch_loss/train': 0.8055243836715817} 12/24/2021 19:34:26 - INFO - codeparrot_training - Step 13943: {'lr': 0.00041156800370908075, 'samples': 7139328, 'steps': 13943, 'batch_loss/train': 0.7637424366548657} 12/24/2021 19:34:36 - INFO - codeparrot_training - Step 13944: {'lr': 0.00041155597442788715, 'samples': 7139840, 'steps': 13944, 'batch_loss/train': 0.6354571045958437} 12/24/2021 19:34:49 - INFO - codeparrot_training - Step 13945: {'lr': 0.000411543944504408, 'samples': 7140352, 'steps': 13945, 'batch_loss/train': 0.7756950249895453} 12/24/2021 19:34:59 - INFO - codeparrot_training - Step 13946: {'lr': 0.00041153191393869114, 'samples': 7140864, 'steps': 13946, 'batch_loss/train': 
0.7994091277942061} 12/24/2021 19:35:10 - INFO - codeparrot_training - Step 13947: {'lr': 0.0004115198827307844, 'samples': 7141376, 'steps': 13947, 'batch_loss/train': 0.775285454466939} 12/24/2021 19:35:24 - INFO - codeparrot_training - Step 13948: {'lr': 0.00041150785088073564, 'samples': 7141888, 'steps': 13948, 'batch_loss/train': 0.8141383100301027} 12/24/2021 19:35:34 - INFO - codeparrot_training - Step 13949: {'lr': 0.0004114958183885925, 'samples': 7142400, 'steps': 13949, 'batch_loss/train': 0.8275740873068571} 12/24/2021 19:35:45 - INFO - codeparrot_training - Step 13950: {'lr': 0.0004114837852544031, 'samples': 7142912, 'steps': 13950, 'batch_loss/train': 0.7098526963964105} 12/24/2021 19:35:57 - INFO - codeparrot_training - Step 13951: {'lr': 0.00041147175147821505, 'samples': 7143424, 'steps': 13951, 'batch_loss/train': 0.8021522685885429} 12/24/2021 19:36:08 - INFO - codeparrot_training - Step 13952: {'lr': 0.00041145971706007633, 'samples': 7143936, 'steps': 13952, 'batch_loss/train': 0.7627892135642469} 12/24/2021 19:36:18 - INFO - codeparrot_training - Step 13953: {'lr': 0.00041144768200003474, 'samples': 7144448, 'steps': 13953, 'batch_loss/train': 0.7356639923527837} 12/24/2021 19:36:29 - INFO - codeparrot_training - Step 13954: {'lr': 0.00041143564629813814, 'samples': 7144960, 'steps': 13954, 'batch_loss/train': 0.8613860569894314} 12/24/2021 19:36:43 - INFO - codeparrot_training - Step 13955: {'lr': 0.00041142360995443435, 'samples': 7145472, 'steps': 13955, 'batch_loss/train': 0.7789567415602505} 12/24/2021 19:36:53 - INFO - codeparrot_training - Step 13956: {'lr': 0.00041141157296897124, 'samples': 7145984, 'steps': 13956, 'batch_loss/train': 0.7483404912054539} 12/24/2021 19:37:04 - INFO - codeparrot_training - Step 13957: {'lr': 0.0004113995353417966, 'samples': 7146496, 'steps': 13957, 'batch_loss/train': 0.7338802739977837} 12/24/2021 19:37:16 - INFO - codeparrot_training - Step 13958: {'lr': 0.0004113874970729584, 'samples': 7147008, 
'steps': 13958, 'batch_loss/train': 0.7805010955780745} 12/24/2021 19:37:26 - INFO - codeparrot_training - Step 13959: {'lr': 0.00041137545816250453, 'samples': 7147520, 'steps': 13959, 'batch_loss/train': 0.8014207938686013} 12/24/2021 19:37:37 - INFO - codeparrot_training - Step 13960: {'lr': 0.00041136341861048263, 'samples': 7148032, 'steps': 13960, 'batch_loss/train': 0.622970447584521} 12/24/2021 19:37:49 - INFO - codeparrot_training - Step 13961: {'lr': 0.0004113513784169408, 'samples': 7148544, 'steps': 13961, 'batch_loss/train': 0.574941924714949} 12/24/2021 19:38:00 - INFO - codeparrot_training - Step 13962: {'lr': 0.00041133933758192675, 'samples': 7149056, 'steps': 13962, 'batch_loss/train': 0.7517035251948982} 12/24/2021 19:38:10 - INFO - codeparrot_training - Step 13963: {'lr': 0.0004113272961054885, 'samples': 7149568, 'steps': 13963, 'batch_loss/train': 0.6952986119431444} 12/24/2021 19:38:24 - INFO - codeparrot_training - Step 13964: {'lr': 0.00041131525398767376, 'samples': 7150080, 'steps': 13964, 'batch_loss/train': 0.7622777394717559} 12/24/2021 19:38:35 - INFO - codeparrot_training - Step 13965: {'lr': 0.00041130321122853056, 'samples': 7150592, 'steps': 13965, 'batch_loss/train': 0.85027090087533} 12/24/2021 19:38:46 - INFO - codeparrot_training - Step 13966: {'lr': 0.0004112911678281066, 'samples': 7151104, 'steps': 13966, 'batch_loss/train': 0.7995479786768556} 12/24/2021 19:38:56 - INFO - codeparrot_training - Step 13967: {'lr': 0.0004112791237864499, 'samples': 7151616, 'steps': 13967, 'batch_loss/train': 0.8043054859153926} 12/24/2021 19:39:08 - INFO - codeparrot_training - Step 13968: {'lr': 0.0004112670791036083, 'samples': 7152128, 'steps': 13968, 'batch_loss/train': 0.8312921114265919} 12/24/2021 19:39:19 - INFO - codeparrot_training - Step 13969: {'lr': 0.0004112550337796297, 'samples': 7152640, 'steps': 13969, 'batch_loss/train': 0.8016615780070424} 12/24/2021 19:39:30 - INFO - codeparrot_training - Step 13970: {'lr': 
0.00041124298781456206, 'samples': 7153152, 'steps': 13970, 'batch_loss/train': 0.8084822967648506} 12/24/2021 19:39:41 - INFO - codeparrot_training - Step 13971: {'lr': 0.000411230941208453, 'samples': 7153664, 'steps': 13971, 'batch_loss/train': 0.81635542656295} 12/24/2021 19:39:52 - INFO - codeparrot_training - Step 13972: {'lr': 0.0004112188939613507, 'samples': 7154176, 'steps': 13972, 'batch_loss/train': 0.833665503654629} 12/24/2021 19:40:03 - INFO - codeparrot_training - Step 13973: {'lr': 0.00041120684607330295, 'samples': 7154688, 'steps': 13973, 'batch_loss/train': 0.851745905354619} 12/24/2021 19:40:16 - INFO - codeparrot_training - Step 13974: {'lr': 0.00041119479754435763, 'samples': 7155200, 'steps': 13974, 'batch_loss/train': 0.8028105106204748} 12/24/2021 19:40:27 - INFO - codeparrot_training - Step 13975: {'lr': 0.00041118274837456263, 'samples': 7155712, 'steps': 13975, 'batch_loss/train': 0.7516265958547592} 12/24/2021 19:40:38 - INFO - codeparrot_training - Step 13976: {'lr': 0.00041117069856396593, 'samples': 7156224, 'steps': 13976, 'batch_loss/train': 0.788801888236776} 12/24/2021 19:40:50 - INFO - codeparrot_training - Step 13977: {'lr': 0.0004111586481126154, 'samples': 7156736, 'steps': 13977, 'batch_loss/train': 0.7451761099509895} 12/24/2021 19:41:01 - INFO - codeparrot_training - Step 13978: {'lr': 0.0004111465970205589, 'samples': 7157248, 'steps': 13978, 'batch_loss/train': 0.7992221610620618} 12/24/2021 19:41:11 - INFO - codeparrot_training - Step 13979: {'lr': 0.0004111345452878444, 'samples': 7157760, 'steps': 13979, 'batch_loss/train': 0.7731765392236412} 12/24/2021 19:41:23 - INFO - codeparrot_training - Step 13980: {'lr': 0.00041112249291451973, 'samples': 7158272, 'steps': 13980, 'batch_loss/train': 0.7843627939000726} 12/24/2021 19:41:34 - INFO - codeparrot_training - Step 13981: {'lr': 0.0004111104399006329, 'samples': 7158784, 'steps': 13981, 'batch_loss/train': 0.7495027398690581} 12/24/2021 19:41:44 - INFO - 
codeparrot_training - Step 13982: {'lr': 0.00041109838624623187, 'samples': 7159296, 'steps': 13982, 'batch_loss/train': 0.7196072184015065} 12/24/2021 19:41:55 - INFO - codeparrot_training - Step 13983: {'lr': 0.0004110863319513644, 'samples': 7159808, 'steps': 13983, 'batch_loss/train': 0.8414551597088575} 12/24/2021 19:42:07 - INFO - codeparrot_training - Step 13984: {'lr': 0.00041107427701607847, 'samples': 7160320, 'steps': 13984, 'batch_loss/train': 2.7947718827053905} 12/24/2021 19:42:18 - INFO - codeparrot_training - Step 13985: {'lr': 0.00041106222144042217, 'samples': 7160832, 'steps': 13985, 'batch_loss/train': 0.8259323509410024} 12/24/2021 19:42:29 - INFO - codeparrot_training - Step 13986: {'lr': 0.00041105016522444317, 'samples': 7161344, 'steps': 13986, 'batch_loss/train': 0.7863217790145427} 12/24/2021 19:42:43 - INFO - codeparrot_training - Step 13987: {'lr': 0.00041103810836818955, 'samples': 7161856, 'steps': 13987, 'batch_loss/train': 0.7929088985547423} 12/24/2021 19:42:53 - INFO - codeparrot_training - Step 13988: {'lr': 0.0004110260508717092, 'samples': 7162368, 'steps': 13988, 'batch_loss/train': 0.7943822648376226} 12/24/2021 19:43:04 - INFO - codeparrot_training - Step 13989: {'lr': 0.00041101399273505015, 'samples': 7162880, 'steps': 13989, 'batch_loss/train': 0.7463091788813472} 12/24/2021 19:43:16 - INFO - codeparrot_training - Step 13990: {'lr': 0.0004110019339582602, 'samples': 7163392, 'steps': 13990, 'batch_loss/train': 0.6905973094981164} 12/24/2021 19:43:27 - INFO - codeparrot_training - Step 13991: {'lr': 0.0004109898745413874, 'samples': 7163904, 'steps': 13991, 'batch_loss/train': 0.7841538959182799} 12/24/2021 19:43:37 - INFO - codeparrot_training - Step 13992: {'lr': 0.00041097781448447957, 'samples': 7164416, 'steps': 13992, 'batch_loss/train': 0.8576564802788198} 12/24/2021 19:43:48 - INFO - codeparrot_training - Step 13993: {'lr': 0.0004109657537875847, 'samples': 7164928, 'steps': 13993, 'batch_loss/train': 
0.6257255235686898} 12/24/2021 19:44:02 - INFO - codeparrot_training - Step 13994: {'lr': 0.00041095369245075085, 'samples': 7165440, 'steps': 13994, 'batch_loss/train': 0.8077971367165446} 12/24/2021 19:44:13 - INFO - codeparrot_training - Step 13995: {'lr': 0.00041094163047402586, 'samples': 7165952, 'steps': 13995, 'batch_loss/train': 0.7517152586951852} 12/24/2021 19:44:23 - INFO - codeparrot_training - Step 13996: {'lr': 0.0004109295678574578, 'samples': 7166464, 'steps': 13996, 'batch_loss/train': 0.7959950855001807} 12/24/2021 19:44:35 - INFO - codeparrot_training - Step 13997: {'lr': 0.0004109175046010944, 'samples': 7166976, 'steps': 13997, 'batch_loss/train': 0.8046049668919295} 12/24/2021 19:44:46 - INFO - codeparrot_training - Step 13998: {'lr': 0.00041090544070498394, 'samples': 7167488, 'steps': 13998, 'batch_loss/train': 0.7995187342166901} 12/24/2021 19:44:57 - INFO - codeparrot_training - Step 13999: {'lr': 0.0004108933761691741, 'samples': 7168000, 'steps': 13999, 'batch_loss/train': 0.7244901661761105} 12/24/2021 19:45:09 - INFO - codeparrot_training - Step 14000: {'lr': 0.00041088131099371297, 'samples': 7168512, 'steps': 14000, 'batch_loss/train': 0.7847578818909824} 12/24/2021 19:45:19 - INFO - codeparrot_training - Step 14001: {'lr': 0.0004108692451786485, 'samples': 7169024, 'steps': 14001, 'batch_loss/train': 0.760838049929589} 12/24/2021 19:45:30 - INFO - codeparrot_training - Step 14002: {'lr': 0.0004108571787240286, 'samples': 7169536, 'steps': 14002, 'batch_loss/train': 0.6755525299813598} 12/24/2021 19:45:40 - INFO - codeparrot_training - Step 14003: {'lr': 0.00041084511162990135, 'samples': 7170048, 'steps': 14003, 'batch_loss/train': 0.8030363521538675} 12/24/2021 19:45:54 - INFO - codeparrot_training - Step 14004: {'lr': 0.0004108330438963146, 'samples': 7170560, 'steps': 14004, 'batch_loss/train': 0.8462497442960739} 12/24/2021 19:46:05 - INFO - codeparrot_training - Step 14005: {'lr': 0.0004108209755233165, 'samples': 7171072, 
'steps': 14005, 'batch_loss/train': 0.7949494593776762} 12/24/2021 19:46:16 - INFO - codeparrot_training - Step 14006: {'lr': 0.0004108089065109548, 'samples': 7171584, 'steps': 14006, 'batch_loss/train': 0.8325726203620434} 12/24/2021 19:46:28 - INFO - codeparrot_training - Step 14007: {'lr': 0.00041079683685927773, 'samples': 7172096, 'steps': 14007, 'batch_loss/train': 0.7496057087555528} 12/24/2021 19:46:39 - INFO - codeparrot_training - Step 14008: {'lr': 0.0004107847665683331, 'samples': 7172608, 'steps': 14008, 'batch_loss/train': 0.8517486620694399} 12/24/2021 19:46:49 - INFO - codeparrot_training - Step 14009: {'lr': 0.000410772695638169, 'samples': 7173120, 'steps': 14009, 'batch_loss/train': 0.8007533731870353} 12/24/2021 19:47:03 - INFO - codeparrot_training - Step 14010: {'lr': 0.00041076062406883326, 'samples': 7173632, 'steps': 14010, 'batch_loss/train': 0.715678779524751} 12/24/2021 19:47:14 - INFO - codeparrot_training - Step 14011: {'lr': 0.000410748551860374, 'samples': 7174144, 'steps': 14011, 'batch_loss/train': 1.0619625765830278} 12/24/2021 19:47:25 - INFO - codeparrot_training - Step 14012: {'lr': 0.0004107364790128393, 'samples': 7174656, 'steps': 14012, 'batch_loss/train': 0.5922567581292242} 12/24/2021 19:47:35 - INFO - codeparrot_training - Step 14013: {'lr': 0.00041072440552627695, 'samples': 7175168, 'steps': 14013, 'batch_loss/train': 0.8731426503509283} 12/24/2021 19:47:47 - INFO - codeparrot_training - Step 14014: {'lr': 0.00041071233140073507, 'samples': 7175680, 'steps': 14014, 'batch_loss/train': 0.7490117596462369} 12/24/2021 19:47:58 - INFO - codeparrot_training - Step 14015: {'lr': 0.00041070025663626157, 'samples': 7176192, 'steps': 14015, 'batch_loss/train': 0.8013992328196764} 12/24/2021 19:48:08 - INFO - codeparrot_training - Step 14016: {'lr': 0.00041068818123290464, 'samples': 7176704, 'steps': 14016, 'batch_loss/train': 0.7276998027227819} 12/24/2021 19:48:20 - INFO - codeparrot_training - Step 14017: {'lr': 
0.0004106761051907122, 'samples': 7177216, 'steps': 14017, 'batch_loss/train': 0.821045970544219} 12/24/2021 19:48:31 - INFO - codeparrot_training - Step 14018: {'lr': 0.00041066402850973216, 'samples': 7177728, 'steps': 14018, 'batch_loss/train': 0.776848113629967} 12/24/2021 19:48:42 - INFO - codeparrot_training - Step 14019: {'lr': 0.00041065195119001266, 'samples': 7178240, 'steps': 14019, 'batch_loss/train': 0.8276777165010571} 12/24/2021 19:48:54 - INFO - codeparrot_training - Step 14020: {'lr': 0.0004106398732316016, 'samples': 7178752, 'steps': 14020, 'batch_loss/train': 0.798963051638566} 12/24/2021 19:49:04 - INFO - codeparrot_training - Step 14021: {'lr': 0.000410627794634547, 'samples': 7179264, 'steps': 14021, 'batch_loss/train': 0.6916033100569621} 12/24/2021 19:49:15 - INFO - codeparrot_training - Step 14022: {'lr': 0.0004106157153988971, 'samples': 7179776, 'steps': 14022, 'batch_loss/train': 0.8705926267430186} 12/24/2021 19:49:29 - INFO - codeparrot_training - Step 14023: {'lr': 0.0004106036355246997, 'samples': 7180288, 'steps': 14023, 'batch_loss/train': 0.7666177619248629} 12/24/2021 19:49:40 - INFO - codeparrot_training - Step 14024: {'lr': 0.00041059155501200285, 'samples': 7180800, 'steps': 14024, 'batch_loss/train': 0.5735791793558747} 12/24/2021 19:49:50 - INFO - codeparrot_training - Step 14025: {'lr': 0.0004105794738608546, 'samples': 7181312, 'steps': 14025, 'batch_loss/train': 0.7323762244195677} 12/24/2021 19:50:01 - INFO - codeparrot_training - Step 14026: {'lr': 0.00041056739207130313, 'samples': 7181824, 'steps': 14026, 'batch_loss/train': 0.7950607314705849} 12/24/2021 19:50:13 - INFO - codeparrot_training - Step 14027: {'lr': 0.00041055530964339616, 'samples': 7182336, 'steps': 14027, 'batch_loss/train': 0.7942894101142883} 12/24/2021 19:50:23 - INFO - codeparrot_training - Step 14028: {'lr': 0.00041054322657718196, 'samples': 7182848, 'steps': 14028, 'batch_loss/train': 0.8391943201422691} 12/24/2021 19:50:34 - INFO - 
codeparrot_training - Step 14029: {'lr': 0.00041053114287270857, 'samples': 7183360, 'steps': 14029, 'batch_loss/train': 0.8328424021601677} 12/24/2021 19:50:46 - INFO - codeparrot_training - Step 14030: {'lr': 0.0004105190585300239, 'samples': 7183872, 'steps': 14030, 'batch_loss/train': 0.7934282119385898} 12/24/2021 19:50:57 - INFO - codeparrot_training - Step 14031: {'lr': 0.0004105069735491761, 'samples': 7184384, 'steps': 14031, 'batch_loss/train': 0.7938456488773227} 12/24/2021 19:51:07 - INFO - codeparrot_training - Step 14032: {'lr': 0.00041049488793021315, 'samples': 7184896, 'steps': 14032, 'batch_loss/train': 0.8413038873113692} 12/24/2021 19:51:21 - INFO - codeparrot_training - Step 14033: {'lr': 0.0004104828016731832, 'samples': 7185408, 'steps': 14033, 'batch_loss/train': 0.715989341493696} 12/24/2021 19:51:32 - INFO - codeparrot_training - Step 14034: {'lr': 0.00041047071477813416, 'samples': 7185920, 'steps': 14034, 'batch_loss/train': 0.6463325400254689} 12/24/2021 19:51:42 - INFO - codeparrot_training - Step 14035: {'lr': 0.0004104586272451142, 'samples': 7186432, 'steps': 14035, 'batch_loss/train': 0.7646285081282258} 12/24/2021 19:51:54 - INFO - codeparrot_training - Step 14036: {'lr': 0.0004104465390741713, 'samples': 7186944, 'steps': 14036, 'batch_loss/train': 0.8008784870617092} 12/24/2021 19:52:05 - INFO - codeparrot_training - Step 14037: {'lr': 0.00041043445026535353, 'samples': 7187456, 'steps': 14037, 'batch_loss/train': 0.7819372015073895} 12/24/2021 19:52:16 - INFO - codeparrot_training - Step 14038: {'lr': 0.0004104223608187089, 'samples': 7187968, 'steps': 14038, 'batch_loss/train': 0.9120669336989522} 12/24/2021 19:52:26 - INFO - codeparrot_training - Step 14039: {'lr': 0.0004104102707342856, 'samples': 7188480, 'steps': 14039, 'batch_loss/train': 0.811393903568387} 12/24/2021 19:52:40 - INFO - codeparrot_training - Step 14040: {'lr': 0.00041039818001213164, 'samples': 7188992, 'steps': 14040, 'batch_loss/train': 
0.7627051658928394} 12/24/2021 19:52:51 - INFO - codeparrot_training - Step 14041: {'lr': 0.0004103860886522951, 'samples': 7189504, 'steps': 14041, 'batch_loss/train': 0.706456414074637} 12/24/2021 19:53:01 - INFO - codeparrot_training - Step 14042: {'lr': 0.000410373996654824, 'samples': 7190016, 'steps': 14042, 'batch_loss/train': 0.8067855667322874} 12/24/2021 19:53:13 - INFO - codeparrot_training - Step 14043: {'lr': 0.00041036190401976647, 'samples': 7190528, 'steps': 14043, 'batch_loss/train': 0.7703754790127277} 12/24/2021 19:53:24 - INFO - codeparrot_training - Step 14044: {'lr': 0.00041034981074717053, 'samples': 7191040, 'steps': 14044, 'batch_loss/train': 0.7125793821178377} 12/24/2021 19:53:35 - INFO - codeparrot_training - Step 14045: {'lr': 0.00041033771683708425, 'samples': 7191552, 'steps': 14045, 'batch_loss/train': 0.8026843816041946} 12/24/2021 19:53:47 - INFO - codeparrot_training - Step 14046: {'lr': 0.00041032562228955584, 'samples': 7192064, 'steps': 14046, 'batch_loss/train': 0.7613607458770275} 12/24/2021 19:53:57 - INFO - codeparrot_training - Step 14047: {'lr': 0.0004103135271046332, 'samples': 7192576, 'steps': 14047, 'batch_loss/train': 0.7608857830055058} 12/24/2021 19:54:08 - INFO - codeparrot_training - Step 14048: {'lr': 0.00041030143128236465, 'samples': 7193088, 'steps': 14048, 'batch_loss/train': 0.7914261408150196} 12/24/2021 19:54:22 - INFO - codeparrot_training - Step 14049: {'lr': 0.00041028933482279795, 'samples': 7193600, 'steps': 14049, 'batch_loss/train': 0.8004131467314437} 12/24/2021 19:54:32 - INFO - codeparrot_training - Step 14050: {'lr': 0.00041027723772598155, 'samples': 7194112, 'steps': 14050, 'batch_loss/train': 0.6950438290368766} 12/24/2021 19:54:43 - INFO - codeparrot_training - Step 14051: {'lr': 0.00041026513999196327, 'samples': 7194624, 'steps': 14051, 'batch_loss/train': 0.7212957831216045} 12/24/2021 19:54:54 - INFO - codeparrot_training - Step 14052: {'lr': 0.0004102530416207913, 'samples': 7195136, 
'steps': 14052, 'batch_loss/train': 0.7901826556189917} 12/24/2021 19:55:06 - INFO - codeparrot_training - Step 14053: {'lr': 0.0004102409426125138, 'samples': 7195648, 'steps': 14053, 'batch_loss/train': 0.8061887472867966} 12/24/2021 19:55:17 - INFO - codeparrot_training - Step 14054: {'lr': 0.00041022884296717876, 'samples': 7196160, 'steps': 14054, 'batch_loss/train': 0.7862129518762231} 12/24/2021 19:55:27 - INFO - codeparrot_training - Step 14055: {'lr': 0.00041021674268483445, 'samples': 7196672, 'steps': 14055, 'batch_loss/train': 0.6609127585252281} 12/24/2021 19:55:39 - INFO - codeparrot_training - Step 14056: {'lr': 0.00041020464176552875, 'samples': 7197184, 'steps': 14056, 'batch_loss/train': 0.7881434278097004} 12/24/2021 19:55:50 - INFO - codeparrot_training - Step 14057: {'lr': 0.00041019254020930993, 'samples': 7197696, 'steps': 14057, 'batch_loss/train': 0.8012278127716854} 12/24/2021 19:56:00 - INFO - codeparrot_training - Step 14058: {'lr': 0.000410180438016226, 'samples': 7198208, 'steps': 14058, 'batch_loss/train': 0.7957572788000107} 12/24/2021 19:56:14 - INFO - codeparrot_training - Step 14059: {'lr': 0.00041016833518632524, 'samples': 7198720, 'steps': 14059, 'batch_loss/train': 0.7421736361575313} 12/24/2021 19:56:25 - INFO - codeparrot_training - Step 14060: {'lr': 0.00041015623171965557, 'samples': 7199232, 'steps': 14060, 'batch_loss/train': 0.820157153531909} 12/24/2021 19:56:35 - INFO - codeparrot_training - Step 14061: {'lr': 0.00041014412761626525, 'samples': 7199744, 'steps': 14061, 'batch_loss/train': 0.909615270793438} 12/24/2021 19:56:50 - INFO - codeparrot_training - Step 14062: {'lr': 0.00041013202287620235, 'samples': 7200256, 'steps': 14062, 'batch_loss/train': 0.7349101467989385} 12/24/2021 19:57:00 - INFO - codeparrot_training - Step 14063: {'lr': 0.0004101199174995149, 'samples': 7200768, 'steps': 14063, 'batch_loss/train': 0.7495371940312907} 12/24/2021 19:57:11 - INFO - codeparrot_training - Step 14064: {'lr': 
0.0004101078114862512, 'samples': 7201280, 'steps': 14064, 'batch_loss/train': 0.6547567481175065} 12/24/2021 19:57:22 - INFO - codeparrot_training - Step 14065: {'lr': 0.00041009570483645924, 'samples': 7201792, 'steps': 14065, 'batch_loss/train': 0.8014970822259784} 12/24/2021 19:57:33 - INFO - codeparrot_training - Step 14066: {'lr': 0.0004100835975501873, 'samples': 7202304, 'steps': 14066, 'batch_loss/train': 0.7888337066397071} 12/24/2021 19:57:44 - INFO - codeparrot_training - Step 14067: {'lr': 0.00041007148962748333, 'samples': 7202816, 'steps': 14067, 'batch_loss/train': 0.772339534945786} 12/24/2021 19:57:55 - INFO - codeparrot_training - Step 14068: {'lr': 0.0004100593810683956, 'samples': 7203328, 'steps': 14068, 'batch_loss/train': 0.7338426066562533} 12/24/2021 19:58:07 - INFO - codeparrot_training - Step 14069: {'lr': 0.0004100472718729722, 'samples': 7203840, 'steps': 14069, 'batch_loss/train': 0.7766570611856878} 12/24/2021 19:58:18 - INFO - codeparrot_training - Step 14070: {'lr': 0.0004100351620412612, 'samples': 7204352, 'steps': 14070, 'batch_loss/train': 0.7847294025123119} 12/24/2021 19:58:28 - INFO - codeparrot_training - Step 14071: {'lr': 0.0004100230515733109, 'samples': 7204864, 'steps': 14071, 'batch_loss/train': 0.8807678716257215} 12/24/2021 19:58:42 - INFO - codeparrot_training - Step 14072: {'lr': 0.00041001094046916943, 'samples': 7205376, 'steps': 14072, 'batch_loss/train': 0.8835269594565034} 12/24/2021 19:58:52 - INFO - codeparrot_training - Step 14073: {'lr': 0.00040999882872888485, 'samples': 7205888, 'steps': 14073, 'batch_loss/train': 0.7462989906780422} 12/24/2021 19:59:03 - INFO - codeparrot_training - Step 14074: {'lr': 0.0004099867163525053, 'samples': 7206400, 'steps': 14074, 'batch_loss/train': 0.7411756068468094} 12/24/2021 19:59:14 - INFO - codeparrot_training - Step 14075: {'lr': 0.000409974603340079, 'samples': 7206912, 'steps': 14075, 'batch_loss/train': 0.8305881302803755} 12/24/2021 19:59:26 - INFO - 
codeparrot_training - Step 14076: {'lr': 0.0004099624896916541, 'samples': 7207424, 'steps': 14076, 'batch_loss/train': 0.8827858865261078} 12/24/2021 19:59:37 - INFO - codeparrot_training - Step 14077: {'lr': 0.00040995037540727863, 'samples': 7207936, 'steps': 14077, 'batch_loss/train': 0.7368094949051738} 12/24/2021 19:59:47 - INFO - codeparrot_training - Step 14078: {'lr': 0.0004099382604870009, 'samples': 7208448, 'steps': 14078, 'batch_loss/train': 0.7392447423189878} 12/24/2021 19:59:59 - INFO - codeparrot_training - Step 14079: {'lr': 0.0004099261449308691, 'samples': 7208960, 'steps': 14079, 'batch_loss/train': 0.8063699887134135} 12/24/2021 20:00:10 - INFO - codeparrot_training - Step 14080: {'lr': 0.00040991402873893126, 'samples': 7209472, 'steps': 14080, 'batch_loss/train': 0.9096806878224015} 12/24/2021 20:00:20 - INFO - codeparrot_training - Step 14081: {'lr': 0.0004099019119112357, 'samples': 7209984, 'steps': 14081, 'batch_loss/train': 0.8129604021087289} 12/24/2021 20:00:34 - INFO - codeparrot_training - Step 14082: {'lr': 0.00040988979444783047, 'samples': 7210496, 'steps': 14082, 'batch_loss/train': 0.7635708246380091} 12/24/2021 20:00:45 - INFO - codeparrot_training - Step 14083: {'lr': 0.00040987767634876384, 'samples': 7211008, 'steps': 14083, 'batch_loss/train': 0.7330233221873641} 12/24/2021 20:00:56 - INFO - codeparrot_training - Step 14084: {'lr': 0.00040986555761408385, 'samples': 7211520, 'steps': 14084, 'batch_loss/train': 0.7771141218254343} 12/24/2021 20:01:06 - INFO - codeparrot_training - Step 14085: {'lr': 0.0004098534382438388, 'samples': 7212032, 'steps': 14085, 'batch_loss/train': 0.7814133095089346} 12/24/2021 20:01:18 - INFO - codeparrot_training - Step 14086: {'lr': 0.00040984131823807693, 'samples': 7212544, 'steps': 14086, 'batch_loss/train': 0.8256251979619265} 12/24/2021 20:01:29 - INFO - codeparrot_training - Step 14087: {'lr': 0.00040982919759684623, 'samples': 7213056, 'steps': 14087, 'batch_loss/train': 
0.8067771075293422} 12/24/2021 20:01:40 - INFO - codeparrot_training - Step 14088: {'lr': 0.000409817076320195, 'samples': 7213568, 'steps': 14088, 'batch_loss/train': 0.7376611909130588} 12/24/2021 20:01:53 - INFO - codeparrot_training - Step 14089: {'lr': 0.0004098049544081714, 'samples': 7214080, 'steps': 14089, 'batch_loss/train': 0.7373716356232762} 12/24/2021 20:02:04 - INFO - codeparrot_training - Step 14090: {'lr': 0.00040979283186082374, 'samples': 7214592, 'steps': 14090, 'batch_loss/train': 0.8836178127676249} 12/24/2021 20:02:15 - INFO - codeparrot_training - Step 14091: {'lr': 0.0004097807086782, 'samples': 7215104, 'steps': 14091, 'batch_loss/train': 0.6901006888365373} 12/24/2021 20:02:27 - INFO - codeparrot_training - Step 14092: {'lr': 0.0004097685848603485, 'samples': 7215616, 'steps': 14092, 'batch_loss/train': 0.7082422156818211} 12/24/2021 20:02:37 - INFO - codeparrot_training - Step 14093: {'lr': 0.0004097564604073175, 'samples': 7216128, 'steps': 14093, 'batch_loss/train': 0.7008565003052354} 12/24/2021 20:02:48 - INFO - codeparrot_training - Step 14094: {'lr': 0.0004097443353191551, 'samples': 7216640, 'steps': 14094, 'batch_loss/train': 0.8609400410205126} 12/24/2021 20:03:00 - INFO - codeparrot_training - Step 14095: {'lr': 0.00040973220959590964, 'samples': 7217152, 'steps': 14095, 'batch_loss/train': 1.0982046108692884} 12/24/2021 20:03:11 - INFO - codeparrot_training - Step 14096: {'lr': 0.0004097200832376291, 'samples': 7217664, 'steps': 14096, 'batch_loss/train': 0.7326824790798128} 12/24/2021 20:03:22 - INFO - codeparrot_training - Step 14097: {'lr': 0.00040970795624436184, 'samples': 7218176, 'steps': 14097, 'batch_loss/train': 0.6975506080780178} 12/24/2021 20:03:32 - INFO - codeparrot_training - Step 14098: {'lr': 0.00040969582861615606, 'samples': 7218688, 'steps': 14098, 'batch_loss/train': 0.5968134280119557} 12/24/2021 20:03:45 - INFO - codeparrot_training - Step 14099: {'lr': 0.00040968370035306004, 'samples': 7219200, 
'steps': 14099, 'batch_loss/train': 0.7338516684249043} 12/24/2021 20:03:55 - INFO - codeparrot_training - Step 14100: {'lr': 0.0004096715714551218, 'samples': 7219712, 'steps': 14100, 'batch_loss/train': 0.8346553631126881} 12/24/2021 20:04:06 - INFO - codeparrot_training - Step 14101: {'lr': 0.00040965944192238973, 'samples': 7220224, 'steps': 14101, 'batch_loss/train': 0.8026429433375597} 12/24/2021 20:04:20 - INFO - codeparrot_training - Step 14102: {'lr': 0.000409647311754912, 'samples': 7220736, 'steps': 14102, 'batch_loss/train': 0.9185064183548093} 12/24/2021 20:04:30 - INFO - codeparrot_training - Step 14103: {'lr': 0.00040963518095273683, 'samples': 7221248, 'steps': 14103, 'batch_loss/train': 0.7733487877994776} 12/24/2021 20:04:41 - INFO - codeparrot_training - Step 14104: {'lr': 0.00040962304951591244, 'samples': 7221760, 'steps': 14104, 'batch_loss/train': 0.7222949415445328} 12/24/2021 20:04:53 - INFO - codeparrot_training - Step 14105: {'lr': 0.0004096109174444872, 'samples': 7222272, 'steps': 14105, 'batch_loss/train': 0.7994396630674601} 12/24/2021 20:05:04 - INFO - codeparrot_training - Step 14106: {'lr': 0.0004095987847385091, 'samples': 7222784, 'steps': 14106, 'batch_loss/train': 0.769797375658527} 12/24/2021 20:05:14 - INFO - codeparrot_training - Step 14107: {'lr': 0.00040958665139802653, 'samples': 7223296, 'steps': 14107, 'batch_loss/train': 0.8747116532176733} 12/24/2021 20:05:25 - INFO - codeparrot_training - Step 14108: {'lr': 0.0004095745174230876, 'samples': 7223808, 'steps': 14108, 'batch_loss/train': 0.697512713028118} 12/24/2021 20:05:37 - INFO - codeparrot_training - Step 14109: {'lr': 0.0004095623828137408, 'samples': 7224320, 'steps': 14109, 'batch_loss/train': 0.8060676995664835} 12/24/2021 20:05:48 - INFO - codeparrot_training - Step 14110: {'lr': 0.00040955024757003414, 'samples': 7224832, 'steps': 14110, 'batch_loss/train': 0.7904270123690367} 12/24/2021 20:05:58 - INFO - codeparrot_training - Step 14111: {'lr': 
0.0004095381116920159, 'samples': 7225344, 'steps': 14111, 'batch_loss/train': 0.7515649958513677} 12/24/2021 20:06:12 - INFO - codeparrot_training - Step 14112: {'lr': 0.00040952597517973443, 'samples': 7225856, 'steps': 14112, 'batch_loss/train': 0.5038019128260203} 12/24/2021 20:06:23 - INFO - codeparrot_training - Step 14113: {'lr': 0.00040951383803323793, 'samples': 7226368, 'steps': 14113, 'batch_loss/train': 0.849129959475249} 12/24/2021 20:06:34 - INFO - codeparrot_training - Step 14114: {'lr': 0.0004095017002525746, 'samples': 7226880, 'steps': 14114, 'batch_loss/train': 0.8465683949179947} 12/24/2021 20:06:46 - INFO - codeparrot_training - Step 14115: {'lr': 0.0004094895618377927, 'samples': 7227392, 'steps': 14115, 'batch_loss/train': 0.7482229573652148} 12/24/2021 20:06:56 - INFO - codeparrot_training - Step 14116: {'lr': 0.0004094774227889405, 'samples': 7227904, 'steps': 14116, 'batch_loss/train': 0.7891454817727208} 12/24/2021 20:07:07 - INFO - codeparrot_training - Step 14117: {'lr': 0.0004094652831060664, 'samples': 7228416, 'steps': 14117, 'batch_loss/train': 0.8256255020387471} 12/24/2021 20:07:21 - INFO - codeparrot_training - Step 14118: {'lr': 0.00040945314278921854, 'samples': 7228928, 'steps': 14118, 'batch_loss/train': 0.8587329108268023} 12/24/2021 20:07:31 - INFO - codeparrot_training - Step 14119: {'lr': 0.0004094410018384451, 'samples': 7229440, 'steps': 14119, 'batch_loss/train': 0.6968080988153815} 12/24/2021 20:07:42 - INFO - codeparrot_training - Step 14120: {'lr': 0.0004094288602537945, 'samples': 7229952, 'steps': 14120, 'batch_loss/train': 0.7365560214966536} 12/24/2021 20:07:53 - INFO - codeparrot_training - Step 14121: {'lr': 0.00040941671803531487, 'samples': 7230464, 'steps': 14121, 'batch_loss/train': 0.882798632606864} 12/24/2021 20:08:05 - INFO - codeparrot_training - Step 14122: {'lr': 0.0004094045751830547, 'samples': 7230976, 'steps': 14122, 'batch_loss/train': 0.8106290800496936} 12/24/2021 20:08:15 - INFO - 
codeparrot_training - Step 14123: {'lr': 0.00040939243169706195, 'samples': 7231488, 'steps': 14123, 'batch_loss/train': 0.7590055200271308} 12/24/2021 20:08:26 - INFO - codeparrot_training - Step 14124: {'lr': 0.00040938028757738523, 'samples': 7232000, 'steps': 14124, 'batch_loss/train': 0.7384843118488789} 12/24/2021 20:08:38 - INFO - codeparrot_training - Step 14125: {'lr': 0.0004093681428240726, 'samples': 7232512, 'steps': 14125, 'batch_loss/train': 0.8105757031589746} 12/24/2021 20:08:49 - INFO - codeparrot_training - Step 14126: {'lr': 0.00040935599743717243, 'samples': 7233024, 'steps': 14126, 'batch_loss/train': 0.6496426229714416} 12/24/2021 20:08:59 - INFO - codeparrot_training - Step 14127: {'lr': 0.000409343851416733, 'samples': 7233536, 'steps': 14127, 'batch_loss/train': 0.7339321379549801} 12/24/2021 20:09:13 - INFO - codeparrot_training - Step 14128: {'lr': 0.00040933170476280256, 'samples': 7234048, 'steps': 14128, 'batch_loss/train': 0.7774470248259604} 12/24/2021 20:09:24 - INFO - codeparrot_training - Step 14129: {'lr': 0.0004093195574754294, 'samples': 7234560, 'steps': 14129, 'batch_loss/train': 0.7736155712045729} 12/24/2021 20:09:34 - INFO - codeparrot_training - Step 14130: {'lr': 0.00040930740955466186, 'samples': 7235072, 'steps': 14130, 'batch_loss/train': 0.8196102287620306} 12/24/2021 20:09:47 - INFO - codeparrot_training - Step 14131: {'lr': 0.00040929526100054827, 'samples': 7235584, 'steps': 14131, 'batch_loss/train': 0.7714681960642338} 12/24/2021 20:09:57 - INFO - codeparrot_training - Step 14132: {'lr': 0.00040928311181313674, 'samples': 7236096, 'steps': 14132, 'batch_loss/train': 0.7496281713247299} 12/24/2021 20:10:08 - INFO - codeparrot_training - Step 14133: {'lr': 0.0004092709619924758, 'samples': 7236608, 'steps': 14133, 'batch_loss/train': 0.8534570019692183} 12/24/2021 20:10:19 - INFO - codeparrot_training - Step 14134: {'lr': 0.0004092588115386136, 'samples': 7237120, 'steps': 14134, 'batch_loss/train': 
0.7490087235346437} 12/24/2021 20:10:31 - INFO - codeparrot_training - Step 14135: {'lr': 0.0004092466604515985, 'samples': 7237632, 'steps': 14135, 'batch_loss/train': 0.826731524663046} 12/24/2021 20:10:41 - INFO - codeparrot_training - Step 14136: {'lr': 0.0004092345087314789, 'samples': 7238144, 'steps': 14136, 'batch_loss/train': 0.8175032990984619} 12/24/2021 20:10:52 - INFO - codeparrot_training - Step 14137: {'lr': 0.00040922235637830295, 'samples': 7238656, 'steps': 14137, 'batch_loss/train': 0.8048050818033516} 12/24/2021 20:11:04 - INFO - codeparrot_training - Step 14138: {'lr': 0.00040921020339211903, 'samples': 7239168, 'steps': 14138, 'batch_loss/train': 0.6270486218854785} 12/24/2021 20:11:15 - INFO - codeparrot_training - Step 14139: {'lr': 0.00040919804977297534, 'samples': 7239680, 'steps': 14139, 'batch_loss/train': 0.7753886543214321} 12/24/2021 20:11:25 - INFO - codeparrot_training - Step 14140: {'lr': 0.00040918589552092055, 'samples': 7240192, 'steps': 14140, 'batch_loss/train': 0.8668060833588243} 12/24/2021 20:11:39 - INFO - codeparrot_training - Step 14141: {'lr': 0.00040917374063600254, 'samples': 7240704, 'steps': 14141, 'batch_loss/train': 0.8039778508245945} 12/24/2021 20:11:50 - INFO - codeparrot_training - Step 14142: {'lr': 0.00040916158511827, 'samples': 7241216, 'steps': 14142, 'batch_loss/train': 0.8245840454474092} 12/24/2021 20:12:00 - INFO - codeparrot_training - Step 14143: {'lr': 0.00040914942896777095, 'samples': 7241728, 'steps': 14143, 'batch_loss/train': 0.7397364298813045} 12/24/2021 20:12:11 - INFO - codeparrot_training - Step 14144: {'lr': 0.000409137272184554, 'samples': 7242240, 'steps': 14144, 'batch_loss/train': 0.7973557831719518} 12/24/2021 20:12:23 - INFO - codeparrot_training - Step 14145: {'lr': 0.00040912511476866726, 'samples': 7242752, 'steps': 14145, 'batch_loss/train': 0.8044205158948898} 12/24/2021 20:12:34 - INFO - codeparrot_training - Step 14146: {'lr': 0.0004091129567201592, 'samples': 7243264, 
'steps': 14146, 'batch_loss/train': 0.8200795063748956} 12/24/2021 20:12:44 - INFO - codeparrot_training - Step 14147: {'lr': 0.0004091007980390781, 'samples': 7243776, 'steps': 14147, 'batch_loss/train': 0.7113238926976919} 12/24/2021 20:12:58 - INFO - codeparrot_training - Step 14148: {'lr': 0.0004090886387254722, 'samples': 7244288, 'steps': 14148, 'batch_loss/train': 0.857639379799366} 12/24/2021 20:13:09 - INFO - codeparrot_training - Step 14149: {'lr': 0.00040907647877939, 'samples': 7244800, 'steps': 14149, 'batch_loss/train': 0.7679396970197558} 12/24/2021 20:13:19 - INFO - codeparrot_training - Step 14150: {'lr': 0.00040906431820087987, 'samples': 7245312, 'steps': 14150, 'batch_loss/train': 0.7621099806856364} 12/24/2021 20:13:31 - INFO - codeparrot_training - Step 14151: {'lr': 0.00040905215698999, 'samples': 7245824, 'steps': 14151, 'batch_loss/train': 0.9140664953738451} 12/24/2021 20:13:42 - INFO - codeparrot_training - Step 14152: {'lr': 0.0004090399951467687, 'samples': 7246336, 'steps': 14152, 'batch_loss/train': 0.7170029813423753} 12/24/2021 20:13:53 - INFO - codeparrot_training - Step 14153: {'lr': 0.0004090278326712646, 'samples': 7246848, 'steps': 14153, 'batch_loss/train': 0.786207596771419} 12/24/2021 20:14:04 - INFO - codeparrot_training - Step 14154: {'lr': 0.00040901566956352576, 'samples': 7247360, 'steps': 14154, 'batch_loss/train': 0.798956316895783} 12/24/2021 20:14:15 - INFO - codeparrot_training - Step 14155: {'lr': 0.0004090035058236008, 'samples': 7247872, 'steps': 14155, 'batch_loss/train': 0.7238874617032707} 12/24/2021 20:14:26 - INFO - codeparrot_training - Step 14156: {'lr': 0.0004089913414515378, 'samples': 7248384, 'steps': 14156, 'batch_loss/train': 0.8156622909009457} 12/24/2021 20:14:36 - INFO - codeparrot_training - Step 14157: {'lr': 0.00040897917644738536, 'samples': 7248896, 'steps': 14157, 'batch_loss/train': 0.7405556766316295} 12/24/2021 20:14:50 - INFO - codeparrot_training - Step 14158: {'lr': 
0.0004089670108111917, 'samples': 7249408, 'steps': 14158, 'batch_loss/train': 0.753852279856801} 12/24/2021 20:15:01 - INFO - codeparrot_training - Step 14159: {'lr': 0.00040895484454300525, 'samples': 7249920, 'steps': 14159, 'batch_loss/train': 0.841150576248765} 12/24/2021 20:15:12 - INFO - codeparrot_training - Step 14160: {'lr': 0.0004089426776428743, 'samples': 7250432, 'steps': 14160, 'batch_loss/train': 0.8365287035703659} 12/24/2021 20:15:24 - INFO - codeparrot_training - Step 14161: {'lr': 0.0004089305101108474, 'samples': 7250944, 'steps': 14161, 'batch_loss/train': 0.7837501866742969} 12/24/2021 20:15:34 - INFO - codeparrot_training - Step 14162: {'lr': 0.0004089183419469726, 'samples': 7251456, 'steps': 14162, 'batch_loss/train': 0.7640772147569805} 12/24/2021 20:15:45 - INFO - codeparrot_training - Step 14163: {'lr': 0.00040890617315129864, 'samples': 7251968, 'steps': 14163, 'batch_loss/train': 0.7964846240356565} 12/24/2021 20:15:57 - INFO - codeparrot_training - Step 14164: {'lr': 0.00040889400372387367, 'samples': 7252480, 'steps': 14164, 'batch_loss/train': 0.775511578656733} 12/24/2021 20:16:08 - INFO - codeparrot_training - Step 14165: {'lr': 0.00040888183366474617, 'samples': 7252992, 'steps': 14165, 'batch_loss/train': 0.7397130709141493} 12/24/2021 20:16:18 - INFO - codeparrot_training - Step 14166: {'lr': 0.00040886966297396443, 'samples': 7253504, 'steps': 14166, 'batch_loss/train': 0.7880379185080528} 12/24/2021 20:16:29 - INFO - codeparrot_training - Step 14167: {'lr': 0.000408857491651577, 'samples': 7254016, 'steps': 14167, 'batch_loss/train': 0.8227582294493914} 12/24/2021 20:16:42 - INFO - codeparrot_training - Step 14168: {'lr': 0.0004088453196976321, 'samples': 7254528, 'steps': 14168, 'batch_loss/train': 0.7094215587712824} 12/24/2021 20:16:53 - INFO - codeparrot_training - Step 14169: {'lr': 0.00040883314711217823, 'samples': 7255040, 'steps': 14169, 'batch_loss/train': 0.8144409307278693} 12/24/2021 20:17:04 - INFO - 
codeparrot_training - Step 14170: {'lr': 0.0004088209738952637, 'samples': 7255552, 'steps': 14170, 'batch_loss/train': 0.7449844377115369} 12/24/2021 20:17:16 - INFO - codeparrot_training - Step 14171: {'lr': 0.0004088088000469369, 'samples': 7256064, 'steps': 14171, 'batch_loss/train': 0.8071022699587047} 12/24/2021 20:17:26 - INFO - codeparrot_training - Step 14172: {'lr': 0.0004087966255672464, 'samples': 7256576, 'steps': 14172, 'batch_loss/train': 0.6717624788579997} 12/24/2021 20:17:37 - INFO - codeparrot_training - Step 14173: {'lr': 0.0004087844504562404, 'samples': 7257088, 'steps': 14173, 'batch_loss/train': 0.8161356905475259} 12/24/2021 20:17:49 - INFO - codeparrot_training - Step 14174: {'lr': 0.0004087722747139674, 'samples': 7257600, 'steps': 14174, 'batch_loss/train': 0.8115293411538005} 12/24/2021 20:18:00 - INFO - codeparrot_training - Step 14175: {'lr': 0.0004087600983404757, 'samples': 7258112, 'steps': 14175, 'batch_loss/train': 0.7828636169433594} 12/24/2021 20:18:10 - INFO - codeparrot_training - Step 14176: {'lr': 0.00040874792133581384, 'samples': 7258624, 'steps': 14176, 'batch_loss/train': 0.7721194886835292} 12/24/2021 20:18:24 - INFO - codeparrot_training - Step 14177: {'lr': 0.00040873574370003024, 'samples': 7259136, 'steps': 14177, 'batch_loss/train': 0.7624199977144599} 12/24/2021 20:18:35 - INFO - codeparrot_training - Step 14178: {'lr': 0.00040872356543317326, 'samples': 7259648, 'steps': 14178, 'batch_loss/train': 0.7657391289249063} 12/24/2021 20:18:45 - INFO - codeparrot_training - Step 14179: {'lr': 0.00040871138653529127, 'samples': 7260160, 'steps': 14179, 'batch_loss/train': 0.8235079981386662} 12/24/2021 20:18:56 - INFO - codeparrot_training - Step 14180: {'lr': 0.0004086992070064327, 'samples': 7260672, 'steps': 14180, 'batch_loss/train': 0.8613769179210067} 12/24/2021 20:19:08 - INFO - codeparrot_training - Step 14181: {'lr': 0.0004086870268466461, 'samples': 7261184, 'steps': 14181, 'batch_loss/train': 
0.7020208723843098} 12/24/2021 20:19:19 - INFO - codeparrot_training - Step 14182: {'lr': 0.00040867484605597976, 'samples': 7261696, 'steps': 14182, 'batch_loss/train': 0.8931556548923254} 12/24/2021 20:19:29 - INFO - codeparrot_training - Step 14183: {'lr': 0.0004086626646344822, 'samples': 7262208, 'steps': 14183, 'batch_loss/train': 0.6277901870198548} 12/24/2021 20:19:42 - INFO - codeparrot_training - Step 14184: {'lr': 0.0004086504825822016, 'samples': 7262720, 'steps': 14184, 'batch_loss/train': 0.6421396975056268} 12/24/2021 20:19:52 - INFO - codeparrot_training - Step 14185: {'lr': 0.00040863829989918677, 'samples': 7263232, 'steps': 14185, 'batch_loss/train': 0.8069097362458706} 12/24/2021 20:20:03 - INFO - codeparrot_training - Step 14186: {'lr': 0.00040862611658548594, 'samples': 7263744, 'steps': 14186, 'batch_loss/train': 0.7651180964894593} 12/24/2021 20:20:17 - INFO - codeparrot_training - Step 14187: {'lr': 0.0004086139326411475, 'samples': 7264256, 'steps': 14187, 'batch_loss/train': 0.7465059558162466} 12/24/2021 20:20:27 - INFO - codeparrot_training - Step 14188: {'lr': 0.00040860174806621994, 'samples': 7264768, 'steps': 14188, 'batch_loss/train': 0.8732327520847321} 12/24/2021 20:20:38 - INFO - codeparrot_training - Step 14189: {'lr': 0.00040858956286075177, 'samples': 7265280, 'steps': 14189, 'batch_loss/train': 0.8552466062828898} 12/24/2021 20:20:50 - INFO - codeparrot_training - Step 14190: {'lr': 0.00040857737702479146, 'samples': 7265792, 'steps': 14190, 'batch_loss/train': 0.8440984962508082} 12/24/2021 20:21:01 - INFO - codeparrot_training - Step 14191: {'lr': 0.00040856519055838727, 'samples': 7266304, 'steps': 14191, 'batch_loss/train': 0.7954618521034718} 12/24/2021 20:21:11 - INFO - codeparrot_training - Step 14192: {'lr': 0.00040855300346158776, 'samples': 7266816, 'steps': 14192, 'batch_loss/train': 0.7428499041125178} 12/24/2021 20:21:22 - INFO - codeparrot_training - Step 14193: {'lr': 0.00040854081573444135, 'samples': 
7267328, 'steps': 14193, 'batch_loss/train': 0.8593726791441441} 12/24/2021 20:21:34 - INFO - codeparrot_training - Step 14194: {'lr': 0.00040852862737699656, 'samples': 7267840, 'steps': 14194, 'batch_loss/train': 0.820599090307951} 12/24/2021 20:21:44 - INFO - codeparrot_training - Step 14195: {'lr': 0.00040851643838930185, 'samples': 7268352, 'steps': 14195, 'batch_loss/train': 0.8116918564774096} 12/24/2021 20:21:55 - INFO - codeparrot_training - Step 14196: {'lr': 0.00040850424877140557, 'samples': 7268864, 'steps': 14196, 'batch_loss/train': 0.9033478377386928} 12/24/2021 20:22:09 - INFO - codeparrot_training - Step 14197: {'lr': 0.00040849205852335624, 'samples': 7269376, 'steps': 14197, 'batch_loss/train': 0.668145720846951} 12/24/2021 20:22:19 - INFO - codeparrot_training - Step 14198: {'lr': 0.0004084798676452024, 'samples': 7269888, 'steps': 14198, 'batch_loss/train': 0.7250907635316253} 12/24/2021 20:22:30 - INFO - codeparrot_training - Step 14199: {'lr': 0.00040846767613699235, 'samples': 7270400, 'steps': 14199, 'batch_loss/train': 0.7495850310660899} 12/24/2021 20:22:42 - INFO - codeparrot_training - Step 14200: {'lr': 0.00040845548399877477, 'samples': 7270912, 'steps': 14200, 'batch_loss/train': 0.8211204813560471} 12/24/2021 20:22:53 - INFO - codeparrot_training - Step 14201: {'lr': 0.0004084432912305979, 'samples': 7271424, 'steps': 14201, 'batch_loss/train': 0.83620441891253} 12/24/2021 20:23:03 - INFO - codeparrot_training - Step 14202: {'lr': 0.0004084310978325104, 'samples': 7271936, 'steps': 14202, 'batch_loss/train': 0.9357540383934975} 12/24/2021 20:23:14 - INFO - codeparrot_training - Step 14203: {'lr': 0.00040841890380456067, 'samples': 7272448, 'steps': 14203, 'batch_loss/train': 0.730705738067627} 12/24/2021 20:23:28 - INFO - codeparrot_training - Step 14204: {'lr': 0.0004084067091467971, 'samples': 7272960, 'steps': 14204, 'batch_loss/train': 0.7364543434232473} 12/24/2021 20:23:39 - INFO - codeparrot_training - Step 14205: {'lr': 
0.0004083945138592684, 'samples': 7273472, 'steps': 14205, 'batch_loss/train': 0.7516931015998125} 12/24/2021 20:23:49 - INFO - codeparrot_training - Step 14206: {'lr': 0.00040838231794202286, 'samples': 7273984, 'steps': 14206, 'batch_loss/train': 0.7806793553754687} 12/24/2021 20:24:01 - INFO - codeparrot_training - Step 14207: {'lr': 0.00040837012139510905, 'samples': 7274496, 'steps': 14207, 'batch_loss/train': 0.8574721999466419} 12/24/2021 20:24:12 - INFO - codeparrot_training - Step 14208: {'lr': 0.0004083579242185754, 'samples': 7275008, 'steps': 14208, 'batch_loss/train': 0.7220795904286206} 12/24/2021 20:24:22 - INFO - codeparrot_training - Step 14209: {'lr': 0.0004083457264124705, 'samples': 7275520, 'steps': 14209, 'batch_loss/train': 0.8207021877169609} 12/24/2021 20:24:34 - INFO - codeparrot_training - Step 14210: {'lr': 0.0004083335279768426, 'samples': 7276032, 'steps': 14210, 'batch_loss/train': 0.7830867017619312} 12/24/2021 20:24:45 - INFO - codeparrot_training - Step 14211: {'lr': 0.0004083213289117406, 'samples': 7276544, 'steps': 14211, 'batch_loss/train': 0.6638002007966861} 12/24/2021 20:24:56 - INFO - codeparrot_training - Step 14212: {'lr': 0.0004083091292172127, 'samples': 7277056, 'steps': 14212, 'batch_loss/train': 0.8658229857683182} 12/24/2021 20:25:06 - INFO - codeparrot_training - Step 14213: {'lr': 0.00040829692889330743, 'samples': 7277568, 'steps': 14213, 'batch_loss/train': 0.8006741218268871} 12/24/2021 20:25:18 - INFO - codeparrot_training - Step 14214: {'lr': 0.0004082847279400734, 'samples': 7278080, 'steps': 14214, 'batch_loss/train': 0.7397617711685598} 12/24/2021 20:25:29 - INFO - codeparrot_training - Step 14215: {'lr': 0.000408272526357559, 'samples': 7278592, 'steps': 14215, 'batch_loss/train': 0.6702103691641241} 12/24/2021 20:25:40 - INFO - codeparrot_training - Step 14216: {'lr': 0.00040826032414581286, 'samples': 7279104, 'steps': 14216, 'batch_loss/train': 0.7101637991145253} 12/24/2021 20:25:54 - INFO - 
codeparrot_training - Step 14217: {'lr': 0.0004082481213048834, 'samples': 7279616, 'steps': 14217, 'batch_loss/train': 0.8430973803624511} 12/24/2021 20:26:04 - INFO - codeparrot_training - Step 14218: {'lr': 0.00040823591783481917, 'samples': 7280128, 'steps': 14218, 'batch_loss/train': 0.8430909896269441} 12/24/2021 20:26:15 - INFO - codeparrot_training - Step 14219: {'lr': 0.0004082237137356687, 'samples': 7280640, 'steps': 14219, 'batch_loss/train': 0.8324483365286142} 12/24/2021 20:26:27 - INFO - codeparrot_training - Step 14220: {'lr': 0.00040821150900748044, 'samples': 7281152, 'steps': 14220, 'batch_loss/train': 1.1485723936930299} 12/24/2021 20:26:38 - INFO - codeparrot_training - Step 14221: {'lr': 0.000408199303650303, 'samples': 7281664, 'steps': 14221, 'batch_loss/train': 0.8540010182186961} 12/24/2021 20:26:48 - INFO - codeparrot_training - Step 14222: {'lr': 0.0004081870976641848, 'samples': 7282176, 'steps': 14222, 'batch_loss/train': 0.7661696722498164} 12/24/2021 20:26:59 - INFO - codeparrot_training - Step 14223: {'lr': 0.00040817489104917447, 'samples': 7282688, 'steps': 14223, 'batch_loss/train': 0.806489902548492} 12/24/2021 20:27:11 - INFO - codeparrot_training - Step 14224: {'lr': 0.0004081626838053205, 'samples': 7283200, 'steps': 14224, 'batch_loss/train': 0.8409593179821968} 12/24/2021 20:27:22 - INFO - codeparrot_training - Step 14225: {'lr': 0.0004081504759326714, 'samples': 7283712, 'steps': 14225, 'batch_loss/train': 0.9127850234508514} 12/24/2021 20:27:32 - INFO - codeparrot_training - Step 14226: {'lr': 0.00040813826743127575, 'samples': 7284224, 'steps': 14226, 'batch_loss/train': 0.8927137460559607} 12/24/2021 20:27:46 - INFO - codeparrot_training - Step 14227: {'lr': 0.0004081260583011821, 'samples': 7284736, 'steps': 14227, 'batch_loss/train': 0.7269861723762006} 12/24/2021 20:27:57 - INFO - codeparrot_training - Step 14228: {'lr': 0.00040811384854243883, 'samples': 7285248, 'steps': 14228, 'batch_loss/train': 
0.8607850887347013} 12/24/2021 20:28:07 - INFO - codeparrot_training - Step 14229: {'lr': 0.0004081016381550947, 'samples': 7285760, 'steps': 14229, 'batch_loss/train': 0.7037463122978806} 12/24/2021 20:28:19 - INFO - codeparrot_training - Step 14230: {'lr': 0.00040808942713919813, 'samples': 7286272, 'steps': 14230, 'batch_loss/train': 0.7550274822860956} 12/24/2021 20:28:30 - INFO - codeparrot_training - Step 14231: {'lr': 0.00040807721549479764, 'samples': 7286784, 'steps': 14231, 'batch_loss/train': 0.757731375284493} 12/24/2021 20:28:41 - INFO - codeparrot_training - Step 14232: {'lr': 0.00040806500322194186, 'samples': 7287296, 'steps': 14232, 'batch_loss/train': 0.8208673158660531} 12/24/2021 20:28:54 - INFO - codeparrot_training - Step 14233: {'lr': 0.00040805279032067935, 'samples': 7287808, 'steps': 14233, 'batch_loss/train': 0.8181716143153608} 12/24/2021 20:29:05 - INFO - codeparrot_training - Step 14234: {'lr': 0.0004080405767910585, 'samples': 7288320, 'steps': 14234, 'batch_loss/train': 0.8410442429594696} 12/24/2021 20:29:16 - INFO - codeparrot_training - Step 14235: {'lr': 0.0004080283626331282, 'samples': 7288832, 'steps': 14235, 'batch_loss/train': 0.9518751678988338} 12/24/2021 20:29:26 - INFO - codeparrot_training - Step 14236: {'lr': 0.00040801614784693666, 'samples': 7289344, 'steps': 14236, 'batch_loss/train': 0.8394506215117872} 12/24/2021 20:29:38 - INFO - codeparrot_training - Step 14237: {'lr': 0.0004080039324325326, 'samples': 7289856, 'steps': 14237, 'batch_loss/train': 0.7108169868588448} 12/24/2021 20:29:49 - INFO - codeparrot_training - Step 14238: {'lr': 0.00040799171638996457, 'samples': 7290368, 'steps': 14238, 'batch_loss/train': 0.7122271408443339} 12/24/2021 20:30:00 - INFO - codeparrot_training - Step 14239: {'lr': 0.00040797949971928103, 'samples': 7290880, 'steps': 14239, 'batch_loss/train': 0.7405793752986938} 12/24/2021 20:30:12 - INFO - codeparrot_training - Step 14240: {'lr': 0.00040796728242053083, 'samples': 7291392, 
'steps': 14240, 'batch_loss/train': 0.9127914309501648} 12/24/2021 20:30:23 - INFO - codeparrot_training - Step 14241: {'lr': 0.0004079550644937623, 'samples': 7291904, 'steps': 14241, 'batch_loss/train': 1.0051098086405545} 12/24/2021 20:30:34 - INFO - codeparrot_training - Step 14242: {'lr': 0.000407942845939024, 'samples': 7292416, 'steps': 14242, 'batch_loss/train': 0.6784502449445426} 12/24/2021 20:30:44 - INFO - codeparrot_training - Step 14243: {'lr': 0.00040793062675636467, 'samples': 7292928, 'steps': 14243, 'batch_loss/train': 0.80370142031461} 12/24/2021 20:30:58 - INFO - codeparrot_training - Step 14244: {'lr': 0.0004079184069458328, 'samples': 7293440, 'steps': 14244, 'batch_loss/train': 0.7817684458568692} 12/24/2021 20:31:09 - INFO - codeparrot_training - Step 14245: {'lr': 0.0004079061865074769, 'samples': 7293952, 'steps': 14245, 'batch_loss/train': 0.6905032468494028} 12/24/2021 20:31:19 - INFO - codeparrot_training - Step 14246: {'lr': 0.00040789396544134573, 'samples': 7294464, 'steps': 14246, 'batch_loss/train': 0.8237797152251005} 12/24/2021 20:31:31 - INFO - codeparrot_training - Step 14247: {'lr': 0.00040788174374748766, 'samples': 7294976, 'steps': 14247, 'batch_loss/train': 0.7603100598789752} 12/24/2021 20:31:42 - INFO - codeparrot_training - Step 14248: {'lr': 0.00040786952142595146, 'samples': 7295488, 'steps': 14248, 'batch_loss/train': 0.8774069817736745} 12/24/2021 20:31:52 - INFO - codeparrot_training - Step 14249: {'lr': 0.0004078572984767855, 'samples': 7296000, 'steps': 14249, 'batch_loss/train': 0.8117302786558867} 12/24/2021 20:32:05 - INFO - codeparrot_training - Step 14250: {'lr': 0.00040784507490003875, 'samples': 7296512, 'steps': 14250, 'batch_loss/train': 0.886431235470809} 12/24/2021 20:32:15 - INFO - codeparrot_training - Step 14251: {'lr': 0.0004078328506957595, 'samples': 7297024, 'steps': 14251, 'batch_loss/train': 0.8625734075903893} 12/24/2021 20:32:26 - INFO - codeparrot_training - Step 14252: {'lr': 
0.00040782062586399635, 'samples': 7297536, 'steps': 14252, 'batch_loss/train': 0.7727146437391639} 12/24/2021 20:32:36 - INFO - codeparrot_training - Step 14253: {'lr': 0.0004078084004047981, 'samples': 7298048, 'steps': 14253, 'batch_loss/train': 0.8114218674600124} 12/24/2021 20:32:48 - INFO - codeparrot_training - Step 14254: {'lr': 0.00040779617431821317, 'samples': 7298560, 'steps': 14254, 'batch_loss/train': 0.7393130538985133} 12/24/2021 20:32:59 - INFO - codeparrot_training - Step 14255: {'lr': 0.0004077839476042903, 'samples': 7299072, 'steps': 14255, 'batch_loss/train': 0.8065955750644207} 12/24/2021 20:33:09 - INFO - codeparrot_training - Step 14256: {'lr': 0.00040777172026307794, 'samples': 7299584, 'steps': 14256, 'batch_loss/train': 0.6928978185169399} 12/24/2021 20:33:24 - INFO - codeparrot_training - Step 14257: {'lr': 0.0004077594922946248, 'samples': 7300096, 'steps': 14257, 'batch_loss/train': 0.7926750890910625} 12/24/2021 20:33:34 - INFO - codeparrot_training - Step 14258: {'lr': 0.0004077472636989795, 'samples': 7300608, 'steps': 14258, 'batch_loss/train': 0.7970528153236955} 12/24/2021 20:33:45 - INFO - codeparrot_training - Step 14259: {'lr': 0.0004077350344761908, 'samples': 7301120, 'steps': 14259, 'batch_loss/train': 0.6630721702240407} 12/24/2021 20:33:57 - INFO - codeparrot_training - Step 14260: {'lr': 0.00040772280462630695, 'samples': 7301632, 'steps': 14260, 'batch_loss/train': 0.6785494980867952} 12/24/2021 20:34:07 - INFO - codeparrot_training - Step 14261: {'lr': 0.00040771057414937684, 'samples': 7302144, 'steps': 14261, 'batch_loss/train': 0.8476568101905286} 12/24/2021 20:34:18 - INFO - codeparrot_training - Step 14262: {'lr': 0.00040769834304544905, 'samples': 7302656, 'steps': 14262, 'batch_loss/train': 0.7798189846798778} 12/24/2021 20:34:29 - INFO - codeparrot_training - Step 14263: {'lr': 0.0004076861113145722, 'samples': 7303168, 'steps': 14263, 'batch_loss/train': 0.7683620238676667} 12/24/2021 20:34:43 - INFO - 
codeparrot_training - Step 14264: {'lr': 0.0004076738789567949, 'samples': 7303680, 'steps': 14264, 'batch_loss/train': 0.7504128526197746} 12/24/2021 20:34:53 - INFO - codeparrot_training - Step 14265: {'lr': 0.00040766164597216576, 'samples': 7304192, 'steps': 14265, 'batch_loss/train': 0.8290198147296906} 12/24/2021 20:35:04 - INFO - codeparrot_training - Step 14266: {'lr': 0.0004076494123607335, 'samples': 7304704, 'steps': 14266, 'batch_loss/train': 0.8140048859640956} 12/24/2021 20:35:16 - INFO - codeparrot_training - Step 14267: {'lr': 0.0004076371781225467, 'samples': 7305216, 'steps': 14267, 'batch_loss/train': 0.721267007291317} 12/24/2021 20:35:27 - INFO - codeparrot_training - Step 14268: {'lr': 0.000407624943257654, 'samples': 7305728, 'steps': 14268, 'batch_loss/train': 0.80481799505651} 12/24/2021 20:35:37 - INFO - codeparrot_training - Step 14269: {'lr': 0.00040761270776610404, 'samples': 7306240, 'steps': 14269, 'batch_loss/train': 0.8436556118540466} 12/24/2021 20:35:49 - INFO - codeparrot_training - Step 14270: {'lr': 0.00040760047164794546, 'samples': 7306752, 'steps': 14270, 'batch_loss/train': 0.737927345559001} 12/24/2021 20:36:00 - INFO - codeparrot_training - Step 14271: {'lr': 0.0004075882349032269, 'samples': 7307264, 'steps': 14271, 'batch_loss/train': 0.774925182107836} 12/24/2021 20:36:10 - INFO - codeparrot_training - Step 14272: {'lr': 0.00040757599753199703, 'samples': 7307776, 'steps': 14272, 'batch_loss/train': 0.7151597184129059} 12/24/2021 20:36:21 - INFO - codeparrot_training - Step 14273: {'lr': 0.0004075637595343045, 'samples': 7308288, 'steps': 14273, 'batch_loss/train': 0.8395281368866563} 12/24/2021 20:36:35 - INFO - codeparrot_training - Step 14274: {'lr': 0.00040755152091019787, 'samples': 7308800, 'steps': 14274, 'batch_loss/train': 0.745793903246522} 12/24/2021 20:36:46 - INFO - codeparrot_training - Step 14275: {'lr': 0.000407539281659726, 'samples': 7309312, 'steps': 14275, 'batch_loss/train': 0.722167759668082} 
12/24/2021 20:36:56 - INFO - codeparrot_training - Step 14276: {'lr': 0.00040752704178293744, 'samples': 7309824, 'steps': 14276, 'batch_loss/train': 0.7461062986403704} 12/24/2021 20:37:08 - INFO - codeparrot_training - Step 14277: {'lr': 0.0004075148012798807, 'samples': 7310336, 'steps': 14277, 'batch_loss/train': 0.7330661043524742} 12/24/2021 20:37:19 - INFO - codeparrot_training - Step 14278: {'lr': 0.00040750256015060474, 'samples': 7310848, 'steps': 14278, 'batch_loss/train': 0.7695222431793809} 12/24/2021 20:37:30 - INFO - codeparrot_training - Step 14279: {'lr': 0.000407490318395158, 'samples': 7311360, 'steps': 14279, 'batch_loss/train': 0.6530494157341309} 12/24/2021 20:37:42 - INFO - codeparrot_training - Step 14280: {'lr': 0.0004074780760135892, 'samples': 7311872, 'steps': 14280, 'batch_loss/train': 0.7521150358952582} 12/24/2021 20:37:52 - INFO - codeparrot_training - Step 14281: {'lr': 0.0004074658330059471, 'samples': 7312384, 'steps': 14281, 'batch_loss/train': 0.8218121195677668} 12/24/2021 20:38:03 - INFO - codeparrot_training - Step 14282: {'lr': 0.00040745358937228026, 'samples': 7312896, 'steps': 14282, 'batch_loss/train': 0.6141473844763823} 12/24/2021 20:38:17 - INFO - codeparrot_training - Step 14283: {'lr': 0.0004074413451126374, 'samples': 7313408, 'steps': 14283, 'batch_loss/train': 0.7565831667743623} 12/24/2021 20:38:27 - INFO - codeparrot_training - Step 14284: {'lr': 0.00040742910022706716, 'samples': 7313920, 'steps': 14284, 'batch_loss/train': 0.7297872398048639} 12/24/2021 20:38:38 - INFO - codeparrot_training - Step 14285: {'lr': 0.0004074168547156184, 'samples': 7314432, 'steps': 14285, 'batch_loss/train': 0.6750283711589873} 12/24/2021 20:38:49 - INFO - codeparrot_training - Step 14286: {'lr': 0.0004074046085783395, 'samples': 7314944, 'steps': 14286, 'batch_loss/train': 0.7528380386065692} 12/24/2021 20:39:01 - INFO - codeparrot_training - Step 14287: {'lr': 0.00040739236181527936, 'samples': 7315456, 'steps': 14287, 
'batch_loss/train': 0.6303626339649782} 12/24/2021 20:39:12 - INFO - codeparrot_training - Step 14288: {'lr': 0.0004073801144264867, 'samples': 7315968, 'steps': 14288, 'batch_loss/train': 0.7771528512239456} 12/24/2021 20:39:22 - INFO - codeparrot_training - Step 14289: {'lr': 0.00040736786641201, 'samples': 7316480, 'steps': 14289, 'batch_loss/train': 0.7955869548022747} 12/24/2021 20:39:34 - INFO - codeparrot_training - Step 14290: {'lr': 0.00040735561777189823, 'samples': 7316992, 'steps': 14290, 'batch_loss/train': 0.6547762549598701} 12/24/2021 20:39:45 - INFO - codeparrot_training - Step 14291: {'lr': 0.0004073433685061998, 'samples': 7317504, 'steps': 14291, 'batch_loss/train': 0.9104081047698855} 12/24/2021 20:39:56 - INFO - codeparrot_training - Step 14292: {'lr': 0.00040733111861496364, 'samples': 7318016, 'steps': 14292, 'batch_loss/train': 0.8211538596078753} 12/24/2021 20:40:09 - INFO - codeparrot_training - Step 14293: {'lr': 0.00040731886809823825, 'samples': 7318528, 'steps': 14293, 'batch_loss/train': 0.7718511000275612} 12/24/2021 20:40:20 - INFO - codeparrot_training - Step 14294: {'lr': 0.00040730661695607257, 'samples': 7319040, 'steps': 14294, 'batch_loss/train': 0.7908499927725643} 12/24/2021 20:40:31 - INFO - codeparrot_training - Step 14295: {'lr': 0.00040729436518851516, 'samples': 7319552, 'steps': 14295, 'batch_loss/train': 0.8040943476371467} 12/24/2021 20:40:41 - INFO - codeparrot_training - Step 14296: {'lr': 0.00040728211279561467, 'samples': 7320064, 'steps': 14296, 'batch_loss/train': 0.8509047888219357} 12/24/2021 20:40:53 - INFO - codeparrot_training - Step 14297: {'lr': 0.0004072698597774199, 'samples': 7320576, 'steps': 14297, 'batch_loss/train': 0.6713111757999286} 12/24/2021 20:41:04 - INFO - codeparrot_training - Step 14298: {'lr': 0.00040725760613397966, 'samples': 7321088, 'steps': 14298, 'batch_loss/train': 0.6694465749897063} 12/24/2021 20:41:14 - INFO - codeparrot_training - Step 14299: {'lr': 0.0004072453518653424, 
'samples': 7321600, 'steps': 14299, 'batch_loss/train': 0.7325177020393312} 12/24/2021 20:41:27 - INFO - codeparrot_training - Step 14300: {'lr': 0.0004072330969715571, 'samples': 7322112, 'steps': 14300, 'batch_loss/train': 0.7073480949620716} 12/24/2021 20:41:37 - INFO - codeparrot_training - Step 14301: {'lr': 0.00040722084145267237, 'samples': 7322624, 'steps': 14301, 'batch_loss/train': 0.829408855875954} 12/24/2021 20:41:48 - INFO - codeparrot_training - Step 14302: {'lr': 0.00040720858530873686, 'samples': 7323136, 'steps': 14302, 'batch_loss/train': 0.8231947598978877} 12/24/2021 20:42:02 - INFO - codeparrot_training - Step 14303: {'lr': 0.0004071963285397995, 'samples': 7323648, 'steps': 14303, 'batch_loss/train': 0.8291209014132619} 12/24/2021 20:42:12 - INFO - codeparrot_training - Step 14304: {'lr': 0.0004071840711459087, 'samples': 7324160, 'steps': 14304, 'batch_loss/train': 0.773192759603262} 12/24/2021 20:42:23 - INFO - codeparrot_training - Step 14305: {'lr': 0.0004071718131271136, 'samples': 7324672, 'steps': 14305, 'batch_loss/train': 0.8420504359528422} 12/24/2021 20:42:35 - INFO - codeparrot_training - Step 14306: {'lr': 0.0004071595544834626, 'samples': 7325184, 'steps': 14306, 'batch_loss/train': 0.750297780148685} 12/24/2021 20:42:46 - INFO - codeparrot_training - Step 14307: {'lr': 0.0004071472952150046, 'samples': 7325696, 'steps': 14307, 'batch_loss/train': 0.7428242852911353} 12/24/2021 20:42:56 - INFO - codeparrot_training - Step 14308: {'lr': 0.0004071350353217882, 'samples': 7326208, 'steps': 14308, 'batch_loss/train': 0.5401429663761519} 12/24/2021 20:43:07 - INFO - codeparrot_training - Step 14309: {'lr': 0.00040712277480386234, 'samples': 7326720, 'steps': 14309, 'batch_loss/train': 0.8020009929314256} 12/24/2021 20:43:19 - INFO - codeparrot_training - Step 14310: {'lr': 0.0004071105136612756, 'samples': 7327232, 'steps': 14310, 'batch_loss/train': 0.7200181894004345} 12/24/2021 20:43:29 - INFO - codeparrot_training - Step 14311: 
{'lr': 0.0004070982518940768, 'samples': 7327744, 'steps': 14311, 'batch_loss/train': 0.7831822540611029} 12/24/2021 20:43:40 - INFO - codeparrot_training - Step 14312: {'lr': 0.0004070859895023146, 'samples': 7328256, 'steps': 14312, 'batch_loss/train': 0.7719675274565816} 12/24/2021 20:43:54 - INFO - codeparrot_training - Step 14313: {'lr': 0.0004070737264860379, 'samples': 7328768, 'steps': 14313, 'batch_loss/train': 1.4753432204015553} 12/24/2021 20:44:05 - INFO - codeparrot_training - Step 14314: {'lr': 0.0004070614628452953, 'samples': 7329280, 'steps': 14314, 'batch_loss/train': 0.8944761361926794} 12/24/2021 20:44:15 - INFO - codeparrot_training - Step 14315: {'lr': 0.0004070491985801357, 'samples': 7329792, 'steps': 14315, 'batch_loss/train': 0.7941536232829094} 12/24/2021 20:44:27 - INFO - codeparrot_training - Step 14316: {'lr': 0.0004070369336906078, 'samples': 7330304, 'steps': 14316, 'batch_loss/train': 0.7853873383719474} 12/24/2021 20:44:38 - INFO - codeparrot_training - Step 14317: {'lr': 0.0004070246681767603, 'samples': 7330816, 'steps': 14317, 'batch_loss/train': 0.8705396996811032} 12/24/2021 20:44:49 - INFO - codeparrot_training - Step 14318: {'lr': 0.000407012402038642, 'samples': 7331328, 'steps': 14318, 'batch_loss/train': 0.7665174310095608} 12/24/2021 20:44:59 - INFO - codeparrot_training - Step 14319: {'lr': 0.00040700013527630174, 'samples': 7331840, 'steps': 14319, 'batch_loss/train': 0.8156115703750402} 12/24/2021 20:45:13 - INFO - codeparrot_training - Step 14320: {'lr': 0.0004069878678897881, 'samples': 7332352, 'steps': 14320, 'batch_loss/train': 0.8373958133161068} 12/24/2021 20:45:24 - INFO - codeparrot_training - Step 14321: {'lr': 0.0004069755998791501, 'samples': 7332864, 'steps': 14321, 'batch_loss/train': 0.8537029828876257} 12/24/2021 20:45:34 - INFO - codeparrot_training - Step 14322: {'lr': 0.00040696333124443643, 'samples': 7333376, 'steps': 14322, 'batch_loss/train': 0.7166109117679298} 12/24/2021 20:45:46 - INFO - 
codeparrot_training - Step 14323: {'lr': 0.00040695106198569567, 'samples': 7333888, 'steps': 14323, 'batch_loss/train': 0.8212225837633014} 12/24/2021 20:45:57 - INFO - codeparrot_training - Step 14324: {'lr': 0.00040693879210297677, 'samples': 7334400, 'steps': 14324, 'batch_loss/train': 0.7919746628031135} 12/24/2021 20:46:07 - INFO - codeparrot_training - Step 14325: {'lr': 0.0004069265215963285, 'samples': 7334912, 'steps': 14325, 'batch_loss/train': 0.8389196745119989} 12/24/2021 20:46:19 - INFO - codeparrot_training - Step 14326: {'lr': 0.00040691425046579966, 'samples': 7335424, 'steps': 14326, 'batch_loss/train': 0.6447340575396083} 12/24/2021 20:46:30 - INFO - codeparrot_training - Step 14327: {'lr': 0.0004069019787114391, 'samples': 7335936, 'steps': 14327, 'batch_loss/train': 0.7932939818128943} 12/24/2021 20:46:41 - INFO - codeparrot_training - Step 14328: {'lr': 0.00040688970633329536, 'samples': 7336448, 'steps': 14328, 'batch_loss/train': 0.8296339986845851} 12/24/2021 20:46:53 - INFO - codeparrot_training - Step 14329: {'lr': 0.00040687743333141745, 'samples': 7336960, 'steps': 14329, 'batch_loss/train': 0.8597945431247354} 12/24/2021 20:47:04 - INFO - codeparrot_training - Step 14330: {'lr': 0.00040686515970585413, 'samples': 7337472, 'steps': 14330, 'batch_loss/train': 0.8818600727245212} 12/24/2021 20:47:15 - INFO - codeparrot_training - Step 14331: {'lr': 0.00040685288545665415, 'samples': 7337984, 'steps': 14331, 'batch_loss/train': 1.2538460846990347} 12/24/2021 20:47:25 - INFO - codeparrot_training - Step 14332: {'lr': 0.00040684061058386635, 'samples': 7338496, 'steps': 14332, 'batch_loss/train': 0.7989172330126166} 12/24/2021 20:47:39 - INFO - codeparrot_training - Step 14333: {'lr': 0.00040682833508753945, 'samples': 7339008, 'steps': 14333, 'batch_loss/train': 0.723459176835604} 12/24/2021 20:47:50 - INFO - codeparrot_training - Step 14334: {'lr': 0.0004068160589677223, 'samples': 7339520, 'steps': 14334, 'batch_loss/train': 
0.8259889148175716} 12/24/2021 20:48:00 - INFO - codeparrot_training - Step 14335: {'lr': 0.00040680378222446377, 'samples': 7340032, 'steps': 14335, 'batch_loss/train': 0.786514900624752} 12/24/2021 20:48:12 - INFO - codeparrot_training - Step 14336: {'lr': 0.0004067915048578125, 'samples': 7340544, 'steps': 14336, 'batch_loss/train': 0.789344544056803} 12/24/2021 20:48:23 - INFO - codeparrot_training - Step 14337: {'lr': 0.00040677922686781746, 'samples': 7341056, 'steps': 14337, 'batch_loss/train': 0.8179225400090218} 12/24/2021 20:48:33 - INFO - codeparrot_training - Step 14338: {'lr': 0.0004067669482545275, 'samples': 7341568, 'steps': 14338, 'batch_loss/train': 0.7652892866171896} 12/24/2021 20:48:45 - INFO - codeparrot_training - Step 14339: {'lr': 0.0004067546690179913, 'samples': 7342080, 'steps': 14339, 'batch_loss/train': 0.8232034924440086} 12/24/2021 20:48:56 - INFO - codeparrot_training - Step 14340: {'lr': 0.0004067423891582577, 'samples': 7342592, 'steps': 14340, 'batch_loss/train': 0.8265803401591256} 12/24/2021 20:49:07 - INFO - codeparrot_training - Step 14341: {'lr': 0.0004067301086753755, 'samples': 7343104, 'steps': 14341, 'batch_loss/train': 0.770349288941361} 12/24/2021 20:49:17 - INFO - codeparrot_training - Step 14342: {'lr': 0.0004067178275693936, 'samples': 7343616, 'steps': 14342, 'batch_loss/train': 0.8449778454378247} 12/24/2021 20:49:31 - INFO - codeparrot_training - Step 14343: {'lr': 0.0004067055458403608, 'samples': 7344128, 'steps': 14343, 'batch_loss/train': 0.8214195249602199} 12/24/2021 20:49:42 - INFO - codeparrot_training - Step 14344: {'lr': 0.00040669326348832596, 'samples': 7344640, 'steps': 14344, 'batch_loss/train': 0.7833186108618975} 12/24/2021 20:49:52 - INFO - codeparrot_training - Step 14345: {'lr': 0.0004066809805133378, 'samples': 7345152, 'steps': 14345, 'batch_loss/train': 0.8221320817247033} 12/24/2021 20:50:04 - INFO - codeparrot_training - Step 14346: {'lr': 0.0004066686969154453, 'samples': 7345664, 
'steps': 14346, 'batch_loss/train': 0.7837064553750679} 12/24/2021 20:50:15 - INFO - codeparrot_training - Step 14347: {'lr': 0.00040665641269469713, 'samples': 7346176, 'steps': 14347, 'batch_loss/train': 0.7309074411168694} 12/24/2021 20:50:26 - INFO - codeparrot_training - Step 14348: {'lr': 0.0004066441278511422, 'samples': 7346688, 'steps': 14348, 'batch_loss/train': 0.8982451381161809} 12/24/2021 20:50:40 - INFO - codeparrot_training - Step 14349: {'lr': 0.00040663184238482955, 'samples': 7347200, 'steps': 14349, 'batch_loss/train': 0.7791677988134325} 12/24/2021 20:50:50 - INFO - codeparrot_training - Step 14350: {'lr': 0.00040661955629580764, 'samples': 7347712, 'steps': 14350, 'batch_loss/train': 0.7063709188951179} 12/24/2021 20:51:01 - INFO - codeparrot_training - Step 14351: {'lr': 0.00040660726958412566, 'samples': 7348224, 'steps': 14351, 'batch_loss/train': 0.7622749167494476} 12/24/2021 20:51:11 - INFO - codeparrot_training - Step 14352: {'lr': 0.0004065949822498322, 'samples': 7348736, 'steps': 14352, 'batch_loss/train': 0.6751569795887917} 12/24/2021 20:51:24 - INFO - codeparrot_training - Step 14353: {'lr': 0.00040658269429297625, 'samples': 7349248, 'steps': 14353, 'batch_loss/train': 0.5513375150039792} 12/24/2021 20:51:34 - INFO - codeparrot_training - Step 14354: {'lr': 0.00040657040571360657, 'samples': 7349760, 'steps': 14354, 'batch_loss/train': 0.681512679962907} 12/24/2021 20:51:45 - INFO - codeparrot_training - Step 14355: {'lr': 0.00040655811651177213, 'samples': 7350272, 'steps': 14355, 'batch_loss/train': 0.6066668545245193} 12/24/2021 20:51:57 - INFO - codeparrot_training - Step 14356: {'lr': 0.0004065458266875218, 'samples': 7350784, 'steps': 14356, 'batch_loss/train': 0.660497963661328} 12/24/2021 20:52:07 - INFO - codeparrot_training - Step 14357: {'lr': 0.00040653353624090427, 'samples': 7351296, 'steps': 14357, 'batch_loss/train': 0.629343444830738} 12/24/2021 20:52:18 - INFO - codeparrot_training - Step 14358: {'lr': 
0.0004065212451719685, 'samples': 7351808, 'steps': 14358, 'batch_loss/train': 0.8514356147497892} 12/24/2021 20:52:30 - INFO - codeparrot_training - Step 14359: {'lr': 0.00040650895348076345, 'samples': 7352320, 'steps': 14359, 'batch_loss/train': 0.7420325763523579} 12/24/2021 20:52:41 - INFO - codeparrot_training - Step 14360: {'lr': 0.00040649666116733784, 'samples': 7352832, 'steps': 14360, 'batch_loss/train': 0.7140755311120301} 12/24/2021 20:52:51 - INFO - codeparrot_training - Step 14361: {'lr': 0.0004064843682317406, 'samples': 7353344, 'steps': 14361, 'batch_loss/train': 0.6681561913574114} 12/24/2021 20:53:05 - INFO - codeparrot_training - Step 14362: {'lr': 0.0004064720746740206, 'samples': 7353856, 'steps': 14362, 'batch_loss/train': 0.8321587573736906} 12/24/2021 20:53:16 - INFO - codeparrot_training - Step 14363: {'lr': 0.0004064597804942267, 'samples': 7354368, 'steps': 14363, 'batch_loss/train': 0.7165713491849601} 12/24/2021 20:53:27 - INFO - codeparrot_training - Step 14364: {'lr': 0.00040644748569240783, 'samples': 7354880, 'steps': 14364, 'batch_loss/train': 0.7727621709927917} 12/24/2021 20:53:37 - INFO - codeparrot_training - Step 14365: {'lr': 0.00040643519026861274, 'samples': 7355392, 'steps': 14365, 'batch_loss/train': 0.7886540177278221} 12/24/2021 20:53:49 - INFO - codeparrot_training - Step 14366: {'lr': 0.0004064228942228906, 'samples': 7355904, 'steps': 14366, 'batch_loss/train': 0.8207669202238321} 12/24/2021 20:54:00 - INFO - codeparrot_training - Step 14367: {'lr': 0.00040641059755528987, 'samples': 7356416, 'steps': 14367, 'batch_loss/train': 0.7243892289698124} 12/24/2021 20:54:10 - INFO - codeparrot_training - Step 14368: {'lr': 0.0004063983002658598, 'samples': 7356928, 'steps': 14368, 'batch_loss/train': 0.8269480438902974} 12/24/2021 20:54:22 - INFO - codeparrot_training - Step 14369: {'lr': 0.0004063860023546491, 'samples': 7357440, 'steps': 14369, 'batch_loss/train': 0.745899316854775} 12/24/2021 20:54:33 - INFO - 
codeparrot_training - Step 14370: {'lr': 0.00040637370382170667, 'samples': 7357952, 'steps': 14370, 'batch_loss/train': 0.5914999585947953} 12/24/2021 20:54:44 - INFO - codeparrot_training - Step 14371: {'lr': 0.00040636140466708147, 'samples': 7358464, 'steps': 14371, 'batch_loss/train': 0.8597923330962658} 12/24/2021 20:54:57 - INFO - codeparrot_training - Step 14372: {'lr': 0.00040634910489082234, 'samples': 7358976, 'steps': 14372, 'batch_loss/train': 0.8541155429556966} 12/24/2021 20:55:08 - INFO - codeparrot_training - Step 14373: {'lr': 0.0004063368044929783, 'samples': 7359488, 'steps': 14373, 'batch_loss/train': 0.7342601106502116} 12/24/2021 20:55:19 - INFO - codeparrot_training - Step 14374: {'lr': 0.00040632450347359805, 'samples': 7360000, 'steps': 14374, 'batch_loss/train': 0.8001265237107873} 12/24/2021 20:55:31 - INFO - codeparrot_training - Step 14375: {'lr': 0.00040631220183273063, 'samples': 7360512, 'steps': 14375, 'batch_loss/train': 0.7714180091861635} 12/24/2021 20:55:41 - INFO - codeparrot_training - Step 14376: {'lr': 0.0004062998995704249, 'samples': 7361024, 'steps': 14376, 'batch_loss/train': 0.7871728148311377} 12/24/2021 20:55:52 - INFO - codeparrot_training - Step 14377: {'lr': 0.00040628759668672983, 'samples': 7361536, 'steps': 14377, 'batch_loss/train': 0.6693661054596305} 12/24/2021 20:56:03 - INFO - codeparrot_training - Step 14378: {'lr': 0.0004062752931816942, 'samples': 7362048, 'steps': 14378, 'batch_loss/train': 0.7243287408491597} 12/24/2021 20:56:15 - INFO - codeparrot_training - Step 14379: {'lr': 0.00040626298905536713, 'samples': 7362560, 'steps': 14379, 'batch_loss/train': 0.840857190079987} 12/24/2021 20:56:25 - INFO - codeparrot_training - Step 14380: {'lr': 0.0004062506843077973, 'samples': 7363072, 'steps': 14380, 'batch_loss/train': 0.7972024958580732} 12/24/2021 20:56:36 - INFO - codeparrot_training - Step 14381: {'lr': 0.00040623837893903386, 'samples': 7363584, 'steps': 14381, 'batch_loss/train': 
0.6530896725598723} 12/24/2021 20:56:50 - INFO - codeparrot_training - Step 14382: {'lr': 0.00040622607294912556, 'samples': 7364096, 'steps': 14382, 'batch_loss/train': 0.784143315278925} 12/24/2021 20:57:00 - INFO - codeparrot_training - Step 14383: {'lr': 0.00040621376633812143, 'samples': 7364608, 'steps': 14383, 'batch_loss/train': 0.6975715907756239} 12/24/2021 20:57:11 - INFO - codeparrot_training - Step 14384: {'lr': 0.0004062014591060703, 'samples': 7365120, 'steps': 14384, 'batch_loss/train': 0.9049372989684343} 12/24/2021 20:57:23 - INFO - codeparrot_training - Step 14385: {'lr': 0.0004061891512530211, 'samples': 7365632, 'steps': 14385, 'batch_loss/train': 0.659433924825862} 12/24/2021 20:57:34 - INFO - codeparrot_training - Step 14386: {'lr': 0.0004061768427790229, 'samples': 7366144, 'steps': 14386, 'batch_loss/train': 0.9216384636238217} 12/24/2021 20:57:44 - INFO - codeparrot_training - Step 14387: {'lr': 0.00040616453368412454, 'samples': 7366656, 'steps': 14387, 'batch_loss/train': 0.7939587980508804} 12/24/2021 20:57:55 - INFO - codeparrot_training - Step 14388: {'lr': 0.00040615222396837494, 'samples': 7367168, 'steps': 14388, 'batch_loss/train': 1.0023083174601197} 12/24/2021 20:58:09 - INFO - codeparrot_training - Step 14389: {'lr': 0.000406139913631823, 'samples': 7367680, 'steps': 14389, 'batch_loss/train': 0.7222729837521911} 12/24/2021 20:58:19 - INFO - codeparrot_training - Step 14390: {'lr': 0.00040612760267451776, 'samples': 7368192, 'steps': 14390, 'batch_loss/train': 0.756210420280695} 12/24/2021 20:58:30 - INFO - codeparrot_training - Step 14391: {'lr': 0.00040611529109650814, 'samples': 7368704, 'steps': 14391, 'batch_loss/train': 0.8094795513898134} 12/24/2021 20:58:42 - INFO - codeparrot_training - Step 14392: {'lr': 0.0004061029788978431, 'samples': 7369216, 'steps': 14392, 'batch_loss/train': 0.8046063678339124} 12/24/2021 20:58:52 - INFO - codeparrot_training - Step 14393: {'lr': 0.0004060906660785715, 'samples': 7369728, 
'steps': 14393, 'batch_loss/train': 0.7926086313091218} 12/24/2021 20:59:03 - INFO - codeparrot_training - Step 14394: {'lr': 0.00040607835263874236, 'samples': 7370240, 'steps': 14394, 'batch_loss/train': 0.8430774984881282} 12/24/2021 20:59:15 - INFO - codeparrot_training - Step 14395: {'lr': 0.00040606603857840464, 'samples': 7370752, 'steps': 14395, 'batch_loss/train': 0.7586956219747663} 12/24/2021 20:59:26 - INFO - codeparrot_training - Step 14396: {'lr': 0.0004060537238976073, 'samples': 7371264, 'steps': 14396, 'batch_loss/train': 0.8319260831922293} 12/24/2021 20:59:36 - INFO - codeparrot_training - Step 14397: {'lr': 0.00040604140859639917, 'samples': 7371776, 'steps': 14397, 'batch_loss/train': 0.7976538436487317} 12/24/2021 20:59:47 - INFO - codeparrot_training - Step 14398: {'lr': 0.0004060290926748294, 'samples': 7372288, 'steps': 14398, 'batch_loss/train': 0.7616070611402392} 12/24/2021 20:59:59 - INFO - codeparrot_training - Step 14399: {'lr': 0.00040601677613294685, 'samples': 7372800, 'steps': 14399, 'batch_loss/train': 0.7436474654823542} 12/24/2021 21:00:09 - INFO - codeparrot_training - Step 14400: {'lr': 0.0004060044589708005, 'samples': 7373312, 'steps': 14400, 'batch_loss/train': 0.765016702003777} 12/24/2021 21:00:20 - INFO - codeparrot_training - Step 14401: {'lr': 0.0004059921411884394, 'samples': 7373824, 'steps': 14401, 'batch_loss/train': 0.7874116702005267} 12/24/2021 21:00:34 - INFO - codeparrot_training - Step 14402: {'lr': 0.00040597982278591237, 'samples': 7374336, 'steps': 14402, 'batch_loss/train': 0.7543824990279973} 12/24/2021 21:00:45 - INFO - codeparrot_training - Step 14403: {'lr': 0.00040596750376326854, 'samples': 7374848, 'steps': 14403, 'batch_loss/train': 0.8271801867522299} 12/24/2021 21:00:55 - INFO - codeparrot_training - Step 14404: {'lr': 0.00040595518412055674, 'samples': 7375360, 'steps': 14404, 'batch_loss/train': 0.8070313036441803} 12/24/2021 21:01:07 - INFO - codeparrot_training - Step 14405: {'lr': 
0.00040594286385782605, 'samples': 7375872, 'steps': 14405, 'batch_loss/train': 0.8165835151448846} 12/24/2021 21:01:18 - INFO - codeparrot_training - Step 14406: {'lr': 0.0004059305429751254, 'samples': 7376384, 'steps': 14406, 'batch_loss/train': 0.5629420534241945} 12/24/2021 21:01:29 - INFO - codeparrot_training - Step 14407: {'lr': 0.0004059182214725038, 'samples': 7376896, 'steps': 14407, 'batch_loss/train': 0.8440208183601499} 12/24/2021 21:01:41 - INFO - codeparrot_training - Step 14408: {'lr': 0.0004059058993500102, 'samples': 7377408, 'steps': 14408, 'batch_loss/train': 0.7572104558348656} 12/24/2021 21:01:51 - INFO - codeparrot_training - Step 14409: {'lr': 0.00040589357660769356, 'samples': 7377920, 'steps': 14409, 'batch_loss/train': 0.7743108496069908} 12/24/2021 21:02:02 - INFO - codeparrot_training - Step 14410: {'lr': 0.000405881253245603, 'samples': 7378432, 'steps': 14410, 'batch_loss/train': 0.7802382949739695} 12/24/2021 21:02:13 - INFO - codeparrot_training - Step 14411: {'lr': 0.0004058689292637874, 'samples': 7378944, 'steps': 14411, 'batch_loss/train': 0.8228210303932428} 12/24/2021 21:02:26 - INFO - codeparrot_training - Step 14412: {'lr': 0.0004058566046622958, 'samples': 7379456, 'steps': 14412, 'batch_loss/train': 0.7692320761270821} 12/24/2021 21:02:37 - INFO - codeparrot_training - Step 14413: {'lr': 0.0004058442794411772, 'samples': 7379968, 'steps': 14413, 'batch_loss/train': 0.8544259583577514} 12/24/2021 21:02:48 - INFO - codeparrot_training - Step 14414: {'lr': 0.00040583195360048064, 'samples': 7380480, 'steps': 14414, 'batch_loss/train': 0.7756724515929818} 12/24/2021 21:03:00 - INFO - codeparrot_training - Step 14415: {'lr': 0.000405819627140255, 'samples': 7380992, 'steps': 14415, 'batch_loss/train': 0.8199131153523922} 12/24/2021 21:03:10 - INFO - codeparrot_training - Step 14416: {'lr': 0.0004058073000605493, 'samples': 7381504, 'steps': 14416, 'batch_loss/train': 0.7609236361458898} 12/24/2021 21:03:21 - INFO - 
codeparrot_training - Step 14417: {'lr': 0.0004057949723614127, 'samples': 7382016, 'steps': 14417, 'batch_loss/train': 0.7696857322007418} 12/24/2021 21:03:35 - INFO - codeparrot_training - Step 14418: {'lr': 0.0004057826440428941, 'samples': 7382528, 'steps': 14418, 'batch_loss/train': 0.8580714724957943} 12/24/2021 21:03:45 - INFO - codeparrot_training - Step 14419: {'lr': 0.0004057703151050426, 'samples': 7383040, 'steps': 14419, 'batch_loss/train': 0.6969712261925451} 12/24/2021 21:03:56 - INFO - codeparrot_training - Step 14420: {'lr': 0.00040575798554790704, 'samples': 7383552, 'steps': 14420, 'batch_loss/train': 0.7922532535158098} 12/24/2021 21:04:08 - INFO - codeparrot_training - Step 14421: {'lr': 0.00040574565537153655, 'samples': 7384064, 'steps': 14421, 'batch_loss/train': 0.7026864956133068} 12/24/2021 21:04:18 - INFO - codeparrot_training - Step 14422: {'lr': 0.0004057333245759801, 'samples': 7384576, 'steps': 14422, 'batch_loss/train': 0.7750430807936937} 12/24/2021 21:04:29 - INFO - codeparrot_training - Step 14423: {'lr': 0.00040572099316128687, 'samples': 7385088, 'steps': 14423, 'batch_loss/train': 0.6753226849250495} 12/24/2021 21:04:40 - INFO - codeparrot_training - Step 14424: {'lr': 0.0004057086611275057, 'samples': 7385600, 'steps': 14424, 'batch_loss/train': 0.7686782237142324} 12/24/2021 21:04:52 - INFO - codeparrot_training - Step 14425: {'lr': 0.00040569632847468575, 'samples': 7386112, 'steps': 14425, 'batch_loss/train': 0.738016399089247} 12/24/2021 21:05:02 - INFO - codeparrot_training - Step 14426: {'lr': 0.0004056839952028759, 'samples': 7386624, 'steps': 14426, 'batch_loss/train': 0.697270252276212} 12/24/2021 21:05:13 - INFO - codeparrot_training - Step 14427: {'lr': 0.0004056716613121253, 'samples': 7387136, 'steps': 14427, 'batch_loss/train': 1.7606603652238846} 12/24/2021 21:05:27 - INFO - codeparrot_training - Step 14428: {'lr': 0.00040565932680248297, 'samples': 7387648, 'steps': 14428, 'batch_loss/train': 
0.7046131719835103} 12/24/2021 21:05:37 - INFO - codeparrot_training - Step 14429: {'lr': 0.000405646991673998, 'samples': 7388160, 'steps': 14429, 'batch_loss/train': 0.876697420142591} 12/24/2021 21:05:48 - INFO - codeparrot_training - Step 14430: {'lr': 0.00040563465592671923, 'samples': 7388672, 'steps': 14430, 'batch_loss/train': 0.8342354092746973} 12/24/2021 21:06:00 - INFO - codeparrot_training - Step 14431: {'lr': 0.0004056223195606959, 'samples': 7389184, 'steps': 14431, 'batch_loss/train': 0.7949042366817594} 12/24/2021 21:06:11 - INFO - codeparrot_training - Step 14432: {'lr': 0.00040560998257597693, 'samples': 7389696, 'steps': 14432, 'batch_loss/train': 0.7923786672763526} 12/24/2021 21:06:21 - INFO - codeparrot_training - Step 14433: {'lr': 0.00040559764497261156, 'samples': 7390208, 'steps': 14433, 'batch_loss/train': 0.7645766119239852} 12/24/2021 21:06:32 - INFO - codeparrot_training - Step 14434: {'lr': 0.0004055853067506485, 'samples': 7390720, 'steps': 14434, 'batch_loss/train': 0.8497481131926179} 12/24/2021 21:06:44 - INFO - codeparrot_training - Step 14435: {'lr': 0.00040557296791013717, 'samples': 7391232, 'steps': 14435, 'batch_loss/train': 0.7813273291103542} 12/24/2021 21:06:55 - INFO - codeparrot_training - Step 14436: {'lr': 0.0004055606284511264, 'samples': 7391744, 'steps': 14436, 'batch_loss/train': 0.850170599296689} 12/24/2021 21:07:05 - INFO - codeparrot_training - Step 14437: {'lr': 0.0004055482883736653, 'samples': 7392256, 'steps': 14437, 'batch_loss/train': 0.7952681118622422} 12/24/2021 21:07:18 - INFO - codeparrot_training - Step 14438: {'lr': 0.000405535947677803, 'samples': 7392768, 'steps': 14438, 'batch_loss/train': 0.8147792816162109} 12/24/2021 21:07:28 - INFO - codeparrot_training - Step 14439: {'lr': 0.00040552360636358844, 'samples': 7393280, 'steps': 14439, 'batch_loss/train': 0.7802589861676097} 12/24/2021 21:07:39 - INFO - codeparrot_training - Step 14440: {'lr': 0.0004055112644310708, 'samples': 7393792, 
'steps': 14440, 'batch_loss/train': 0.6766338543966413} 12/24/2021 21:07:53 - INFO - codeparrot_training - Step 14441: {'lr': 0.000405498921880299, 'samples': 7394304, 'steps': 14441, 'batch_loss/train': 0.8081174492835999} 12/24/2021 21:08:03 - INFO - codeparrot_training - Step 14442: {'lr': 0.00040548657871132234, 'samples': 7394816, 'steps': 14442, 'batch_loss/train': 0.8362015290185809} 12/24/2021 21:08:14 - INFO - codeparrot_training - Step 14443: {'lr': 0.0004054742349241896, 'samples': 7395328, 'steps': 14443, 'batch_loss/train': 0.7607070347294211} 12/24/2021 21:08:24 - INFO - codeparrot_training - Step 14444: {'lr': 0.0004054618905189501, 'samples': 7395840, 'steps': 14444, 'batch_loss/train': 0.7140958642121404} 12/24/2021 21:08:37 - INFO - codeparrot_training - Step 14445: {'lr': 0.0004054495454956528, 'samples': 7396352, 'steps': 14445, 'batch_loss/train': 0.7482699556276202} 12/24/2021 21:08:47 - INFO - codeparrot_training - Step 14446: {'lr': 0.0004054371998543468, 'samples': 7396864, 'steps': 14446, 'batch_loss/train': 0.9408856919035316} 12/24/2021 21:08:58 - INFO - codeparrot_training - Step 14447: {'lr': 0.0004054248535950812, 'samples': 7397376, 'steps': 14447, 'batch_loss/train': 0.8490971475839615} 12/24/2021 21:09:10 - INFO - codeparrot_training - Step 14448: {'lr': 0.000405412506717905, 'samples': 7397888, 'steps': 14448, 'batch_loss/train': 0.7894862918183208} 12/24/2021 21:09:20 - INFO - codeparrot_training - Step 14449: {'lr': 0.00040540015922286746, 'samples': 7398400, 'steps': 14449, 'batch_loss/train': 0.7831326867453754} 12/24/2021 21:09:31 - INFO - codeparrot_training - Step 14450: {'lr': 0.00040538781111001753, 'samples': 7398912, 'steps': 14450, 'batch_loss/train': 0.7940477705560625} 12/24/2021 21:09:45 - INFO - codeparrot_training - Step 14451: {'lr': 0.0004053754623794043, 'samples': 7399424, 'steps': 14451, 'batch_loss/train': 0.7632901519536972} 12/24/2021 21:09:55 - INFO - codeparrot_training - Step 14452: {'lr': 
0.0004053631130310769, 'samples': 7399936, 'steps': 14452, 'batch_loss/train': 0.8291241135448217} 12/24/2021 21:10:06 - INFO - codeparrot_training - Step 14453: {'lr': 0.00040535076306508444, 'samples': 7400448, 'steps': 14453, 'batch_loss/train': 0.7774081956595182} 12/24/2021 21:10:17 - INFO - codeparrot_training - Step 14454: {'lr': 0.000405338412481476, 'samples': 7400960, 'steps': 14454, 'batch_loss/train': 1.723494753241539} 12/24/2021 21:10:29 - INFO - codeparrot_training - Step 14455: {'lr': 0.0004053260612803007, 'samples': 7401472, 'steps': 14455, 'batch_loss/train': 0.8344197850674391} 12/24/2021 21:10:39 - INFO - codeparrot_training - Step 14456: {'lr': 0.0004053137094616076, 'samples': 7401984, 'steps': 14456, 'batch_loss/train': 0.8360478105023503} 12/24/2021 21:10:50 - INFO - codeparrot_training - Step 14457: {'lr': 0.0004053013570254459, 'samples': 7402496, 'steps': 14457, 'batch_loss/train': 0.8614283357746899} 12/24/2021 21:11:04 - INFO - codeparrot_training - Step 14458: {'lr': 0.00040528900397186464, 'samples': 7403008, 'steps': 14458, 'batch_loss/train': 0.8023588084615767} 12/24/2021 21:11:14 - INFO - codeparrot_training - Step 14459: {'lr': 0.0004052766503009129, 'samples': 7403520, 'steps': 14459, 'batch_loss/train': 0.800335356965661} 12/24/2021 21:11:25 - INFO - codeparrot_training - Step 14460: {'lr': 0.0004052642960126398, 'samples': 7404032, 'steps': 14460, 'batch_loss/train': 0.8864487195387483} 12/24/2021 21:11:37 - INFO - codeparrot_training - Step 14461: {'lr': 0.0004052519411070945, 'samples': 7404544, 'steps': 14461, 'batch_loss/train': 0.7907833433710039} 12/24/2021 21:11:48 - INFO - codeparrot_training - Step 14462: {'lr': 0.0004052395855843261, 'samples': 7405056, 'steps': 14462, 'batch_loss/train': 0.7736672139726579} 12/24/2021 21:11:58 - INFO - codeparrot_training - Step 14463: {'lr': 0.0004052272294443837, 'samples': 7405568, 'steps': 14463, 'batch_loss/train': 0.7443662602454424} 12/24/2021 21:12:10 - INFO - 
codeparrot_training - Step 14464: {'lr': 0.00040521487268731655, 'samples': 7406080, 'steps': 14464, 'batch_loss/train': 0.7715516202151775} 12/24/2021 21:12:21 - INFO - codeparrot_training - Step 14465: {'lr': 0.00040520251531317355, 'samples': 7406592, 'steps': 14465, 'batch_loss/train': 0.9143158011138439} 12/24/2021 21:12:31 - INFO - codeparrot_training - Step 14466: {'lr': 0.00040519015732200404, 'samples': 7407104, 'steps': 14466, 'batch_loss/train': 0.8553401045501232} 12/24/2021 21:12:42 - INFO - codeparrot_training - Step 14467: {'lr': 0.0004051777987138571, 'samples': 7407616, 'steps': 14467, 'batch_loss/train': 0.8076601540669799} 12/24/2021 21:12:56 - INFO - codeparrot_training - Step 14468: {'lr': 0.0004051654394887817, 'samples': 7408128, 'steps': 14468, 'batch_loss/train': 0.7534263795241714} 12/24/2021 21:13:06 - INFO - codeparrot_training - Step 14469: {'lr': 0.0004051530796468271, 'samples': 7408640, 'steps': 14469, 'batch_loss/train': 0.7380002615973353} 12/24/2021 21:13:17 - INFO - codeparrot_training - Step 14470: {'lr': 0.0004051407191880426, 'samples': 7409152, 'steps': 14470, 'batch_loss/train': 0.8548561858478934} 12/24/2021 21:13:29 - INFO - codeparrot_training - Step 14471: {'lr': 0.00040512835811247705, 'samples': 7409664, 'steps': 14471, 'batch_loss/train': 0.829748303629458} 12/24/2021 21:13:40 - INFO - codeparrot_training - Step 14472: {'lr': 0.0004051159964201797, 'samples': 7410176, 'steps': 14472, 'batch_loss/train': 0.7177524211583659} 12/24/2021 21:13:50 - INFO - codeparrot_training - Step 14473: {'lr': 0.0004051036341111998, 'samples': 7410688, 'steps': 14473, 'batch_loss/train': 0.7052846199367195} 12/24/2021 21:14:04 - INFO - codeparrot_training - Step 14474: {'lr': 0.00040509127118558636, 'samples': 7411200, 'steps': 14474, 'batch_loss/train': 0.7871590792201459} 12/24/2021 21:14:15 - INFO - codeparrot_training - Step 14475: {'lr': 0.00040507890764338863, 'samples': 7411712, 'steps': 14475, 'batch_loss/train': 
0.8317597033455968} 12/24/2021 21:14:25 - INFO - codeparrot_training - Step 14476: {'lr': 0.00040506654348465565, 'samples': 7412224, 'steps': 14476, 'batch_loss/train': 0.9397415351122618} 12/24/2021 21:14:38 - INFO - codeparrot_training - Step 14477: {'lr': 0.00040505417870943675, 'samples': 7412736, 'steps': 14477, 'batch_loss/train': 0.794648103415966} 12/24/2021 21:14:48 - INFO - codeparrot_training - Step 14478: {'lr': 0.0004050418133177809, 'samples': 7413248, 'steps': 14478, 'batch_loss/train': 0.8141411207616329} 12/24/2021 21:14:59 - INFO - codeparrot_training - Step 14479: {'lr': 0.0004050294473097373, 'samples': 7413760, 'steps': 14479, 'batch_loss/train': 1.0222084252163768} 12/24/2021 21:15:09 - INFO - codeparrot_training - Step 14480: {'lr': 0.00040501708068535524, 'samples': 7414272, 'steps': 14480, 'batch_loss/train': 0.7162191388197243} 12/24/2021 21:15:22 - INFO - codeparrot_training - Step 14481: {'lr': 0.0004050047134446838, 'samples': 7414784, 'steps': 14481, 'batch_loss/train': 0.7546407417394221} 12/24/2021 21:15:32 - INFO - codeparrot_training - Step 14482: {'lr': 0.0004049923455877721, 'samples': 7415296, 'steps': 14482, 'batch_loss/train': 0.7985744497273117} 12/24/2021 21:15:43 - INFO - codeparrot_training - Step 14483: {'lr': 0.00040497997711466937, 'samples': 7415808, 'steps': 14483, 'batch_loss/train': 0.8330721776001155} 12/24/2021 21:15:55 - INFO - codeparrot_training - Step 14484: {'lr': 0.0004049676080254248, 'samples': 7416320, 'steps': 14484, 'batch_loss/train': 0.7047268128953874} 12/24/2021 21:16:06 - INFO - codeparrot_training - Step 14485: {'lr': 0.0004049552383200875, 'samples': 7416832, 'steps': 14485, 'batch_loss/train': 0.78544981777668} 12/24/2021 21:16:16 - INFO - codeparrot_training - Step 14486: {'lr': 0.0004049428679987067, 'samples': 7417344, 'steps': 14486, 'batch_loss/train': 0.8239551442675292} 12/24/2021 21:16:30 - INFO - codeparrot_training - Step 14487: {'lr': 0.00040493049706133156, 'samples': 7417856, 
'steps': 14487, 'batch_loss/train': 0.8231672011315823} 12/24/2021 21:16:41 - INFO - codeparrot_training - Step 14488: {'lr': 0.0004049181255080113, 'samples': 7418368, 'steps': 14488, 'batch_loss/train': 0.9033278157003224} 12/24/2021 21:16:51 - INFO - codeparrot_training - Step 14489: {'lr': 0.000404905753338795, 'samples': 7418880, 'steps': 14489, 'batch_loss/train': 0.8578813103958964} 12/24/2021 21:17:02 - INFO - codeparrot_training - Step 14490: {'lr': 0.00040489338055373195, 'samples': 7419392, 'steps': 14490, 'batch_loss/train': 0.7298800591379404} 12/24/2021 21:17:14 - INFO - codeparrot_training - Step 14491: {'lr': 0.00040488100715287123, 'samples': 7419904, 'steps': 14491, 'batch_loss/train': 0.8715792894363403} 12/24/2021 21:17:24 - INFO - codeparrot_training - Step 14492: {'lr': 0.0004048686331362623, 'samples': 7420416, 'steps': 14492, 'batch_loss/train': 0.7645857213065028} 12/24/2021 21:17:35 - INFO - codeparrot_training - Step 14493: {'lr': 0.00040485625850395396, 'samples': 7420928, 'steps': 14493, 'batch_loss/train': 0.7197452606633306} 12/24/2021 21:17:47 - INFO - codeparrot_training - Step 14494: {'lr': 0.0004048438832559957, 'samples': 7421440, 'steps': 14494, 'batch_loss/train': 0.8186878236010671} 12/24/2021 21:17:58 - INFO - codeparrot_training - Step 14495: {'lr': 0.0004048315073924366, 'samples': 7421952, 'steps': 14495, 'batch_loss/train': 0.7725222473964095} 12/24/2021 21:18:08 - INFO - codeparrot_training - Step 14496: {'lr': 0.0004048191309133259, 'samples': 7422464, 'steps': 14496, 'batch_loss/train': 0.8178280902793631} 12/24/2021 21:18:22 - INFO - codeparrot_training - Step 14497: {'lr': 0.0004048067538187128, 'samples': 7422976, 'steps': 14497, 'batch_loss/train': 0.7897657649591565} 12/24/2021 21:18:33 - INFO - codeparrot_training - Step 14498: {'lr': 0.00040479437610864644, 'samples': 7423488, 'steps': 14498, 'batch_loss/train': 0.7959894272498786} 12/24/2021 21:18:43 - INFO - codeparrot_training - Step 14499: {'lr': 
0.0004047819977831761, 'samples': 7424000, 'steps': 14499, 'batch_loss/train': 0.746421517804265} 12/24/2021 21:18:55 - INFO - codeparrot_training - Step 14500: {'lr': 0.0004047696188423511, 'samples': 7424512, 'steps': 14500, 'batch_loss/train': 0.8268796298652887} 12/24/2021 21:19:06 - INFO - codeparrot_training - Step 14501: {'lr': 0.00040475723928622045, 'samples': 7425024, 'steps': 14501, 'batch_loss/train': 0.6409639795310795} 12/24/2021 21:19:17 - INFO - codeparrot_training - Step 14502: {'lr': 0.00040474485911483343, 'samples': 7425536, 'steps': 14502, 'batch_loss/train': 0.8077704403549433} 12/24/2021 21:19:27 - INFO - codeparrot_training - Step 14503: {'lr': 0.00040473247832823936, 'samples': 7426048, 'steps': 14503, 'batch_loss/train': 0.8073315555229783} 12/24/2021 21:19:41 - INFO - codeparrot_training - Step 14504: {'lr': 0.00040472009692648725, 'samples': 7426560, 'steps': 14504, 'batch_loss/train': 0.7771367463283241} 12/24/2021 21:19:51 - INFO - codeparrot_training - Step 14505: {'lr': 0.00040470771490962653, 'samples': 7427072, 'steps': 14505, 'batch_loss/train': 0.7759137470275164} 12/24/2021 21:20:02 - INFO - codeparrot_training - Step 14506: {'lr': 0.0004046953322777064, 'samples': 7427584, 'steps': 14506, 'batch_loss/train': 0.8157443013042212} 12/24/2021 21:20:14 - INFO - codeparrot_training - Step 14507: {'lr': 0.00040468294903077596, 'samples': 7428096, 'steps': 14507, 'batch_loss/train': 0.8753816806711257} 12/24/2021 21:20:25 - INFO - codeparrot_training - Step 14508: {'lr': 0.00040467056516888455, 'samples': 7428608, 'steps': 14508, 'batch_loss/train': 0.7813797183334827} 12/24/2021 21:20:35 - INFO - codeparrot_training - Step 14509: {'lr': 0.0004046581806920814, 'samples': 7429120, 'steps': 14509, 'batch_loss/train': 0.7999724131077528} 12/24/2021 21:20:46 - INFO - codeparrot_training - Step 14510: {'lr': 0.00040464579560041565, 'samples': 7429632, 'steps': 14510, 'batch_loss/train': 0.7055282741785049} 12/24/2021 21:21:00 - INFO - 
codeparrot_training - Step 14511: {'lr': 0.00040463340989393666, 'samples': 7430144, 'steps': 14511, 'batch_loss/train': 0.7527643274515867} 12/24/2021 21:21:11 - INFO - codeparrot_training - Step 14512: {'lr': 0.0004046210235726936, 'samples': 7430656, 'steps': 14512, 'batch_loss/train': 0.8780729696154594} 12/24/2021 21:21:21 - INFO - codeparrot_training - Step 14513: {'lr': 0.0004046086366367358, 'samples': 7431168, 'steps': 14513, 'batch_loss/train': 0.7858016239479184} 12/24/2021 21:21:33 - INFO - codeparrot_training - Step 14514: {'lr': 0.00040459624908611237, 'samples': 7431680, 'steps': 14514, 'batch_loss/train': 0.7888796348124743} 12/24/2021 21:21:44 - INFO - codeparrot_training - Step 14515: {'lr': 0.00040458386092087265, 'samples': 7432192, 'steps': 14515, 'batch_loss/train': 0.7124688038602471} 12/24/2021 21:21:55 - INFO - codeparrot_training - Step 14516: {'lr': 0.0004045714721410658, 'samples': 7432704, 'steps': 14516, 'batch_loss/train': 0.8292450793087482} 12/24/2021 21:22:07 - INFO - codeparrot_training - Step 14517: {'lr': 0.0004045590827467412, 'samples': 7433216, 'steps': 14517, 'batch_loss/train': 0.7938361028209329} 12/24/2021 21:22:17 - INFO - codeparrot_training - Step 14518: {'lr': 0.0004045466927379481, 'samples': 7433728, 'steps': 14518, 'batch_loss/train': 0.8147512190043926} 12/24/2021 21:22:28 - INFO - codeparrot_training - Step 14519: {'lr': 0.00040453430211473564, 'samples': 7434240, 'steps': 14519, 'batch_loss/train': 0.7829461582005024} 12/24/2021 21:22:42 - INFO - codeparrot_training - Step 14520: {'lr': 0.00040452191087715314, 'samples': 7434752, 'steps': 14520, 'batch_loss/train': 0.7936670864000916} 12/24/2021 21:22:52 - INFO - codeparrot_training - Step 14521: {'lr': 0.0004045095190252499, 'samples': 7435264, 'steps': 14521, 'batch_loss/train': 0.8160509457811713} 12/24/2021 21:23:03 - INFO - codeparrot_training - Step 14522: {'lr': 0.0004044971265590751, 'samples': 7435776, 'steps': 14522, 'batch_loss/train': 
0.7609271751716733} 12/24/2021 21:23:14 - INFO - codeparrot_training - Step 14523: {'lr': 0.0004044847334786781, 'samples': 7436288, 'steps': 14523, 'batch_loss/train': 0.9109337767586112} 12/24/2021 21:23:26 - INFO - codeparrot_training - Step 14524: {'lr': 0.00040447233978410814, 'samples': 7436800, 'steps': 14524, 'batch_loss/train': 0.8418207569047809} 12/24/2021 21:23:37 - INFO - codeparrot_training - Step 14525: {'lr': 0.0004044599454754144, 'samples': 7437312, 'steps': 14525, 'batch_loss/train': 0.8197522228583694} 12/24/2021 21:23:47 - INFO - codeparrot_training - Step 14526: {'lr': 0.0004044475505526464, 'samples': 7437824, 'steps': 14526, 'batch_loss/train': 1.0450520738959312} 12/24/2021 21:24:01 - INFO - codeparrot_training - Step 14527: {'lr': 0.00040443515501585315, 'samples': 7438336, 'steps': 14527, 'batch_loss/train': 0.8251100000925362} 12/24/2021 21:24:12 - INFO - codeparrot_training - Step 14528: {'lr': 0.000404422758865084, 'samples': 7438848, 'steps': 14528, 'batch_loss/train': 0.771146728657186} 12/24/2021 21:24:22 - INFO - codeparrot_training - Step 14529: {'lr': 0.0004044103621003883, 'samples': 7439360, 'steps': 14529, 'batch_loss/train': 0.8224132545292377} 12/24/2021 21:24:34 - INFO - codeparrot_training - Step 14530: {'lr': 0.0004043979647218153, 'samples': 7439872, 'steps': 14530, 'batch_loss/train': 0.793625402264297} 12/24/2021 21:24:45 - INFO - codeparrot_training - Step 14531: {'lr': 0.0004043855667294144, 'samples': 7440384, 'steps': 14531, 'batch_loss/train': 0.7698456692742184} 12/24/2021 21:24:56 - INFO - codeparrot_training - Step 14532: {'lr': 0.00040437316812323465, 'samples': 7440896, 'steps': 14532, 'batch_loss/train': 0.733096657320857} 12/24/2021 21:25:06 - INFO - codeparrot_training - Step 14533: {'lr': 0.0004043607689033255, 'samples': 7441408, 'steps': 14533, 'batch_loss/train': 0.8685655393637717} 12/24/2021 21:25:18 - INFO - codeparrot_training - Step 14534: {'lr': 0.00040434836906973626, 'samples': 7441920, 
'steps': 14534, 'batch_loss/train': 0.7888433029875159} 12/24/2021 21:25:29 - INFO - codeparrot_training - Step 14535: {'lr': 0.0004043359686225161, 'samples': 7442432, 'steps': 14535, 'batch_loss/train': 0.7533476836979389} 12/24/2021 21:25:40 - INFO - codeparrot_training - Step 14536: {'lr': 0.00040432356756171453, 'samples': 7442944, 'steps': 14536, 'batch_loss/train': 0.8257367052137852} 12/24/2021 21:25:54 - INFO - codeparrot_training - Step 14537: {'lr': 0.00040431116588738057, 'samples': 7443456, 'steps': 14537, 'batch_loss/train': 0.6066584931686521} 12/24/2021 21:26:05 - INFO - codeparrot_training - Step 14538: {'lr': 0.0004042987635995639, 'samples': 7443968, 'steps': 14538, 'batch_loss/train': 0.8582726761233062} 12/24/2021 21:26:15 - INFO - codeparrot_training - Step 14539: {'lr': 0.00040428636069831335, 'samples': 7444480, 'steps': 14539, 'batch_loss/train': 1.2410197705030441} 12/24/2021 21:26:26 - INFO - codeparrot_training - Step 14540: {'lr': 0.00040427395718367864, 'samples': 7444992, 'steps': 14540, 'batch_loss/train': 0.8568713590502739} 12/24/2021 21:26:38 - INFO - codeparrot_training - Step 14541: {'lr': 0.00040426155305570887, 'samples': 7445504, 'steps': 14541, 'batch_loss/train': 0.8312932522967458} 12/24/2021 21:26:48 - INFO - codeparrot_training - Step 14542: {'lr': 0.00040424914831445346, 'samples': 7446016, 'steps': 14542, 'batch_loss/train': 0.8890148410573602} 12/24/2021 21:26:59 - INFO - codeparrot_training - Step 14543: {'lr': 0.0004042367429599617, 'samples': 7446528, 'steps': 14543, 'batch_loss/train': 0.769819799810648} 12/24/2021 21:27:13 - INFO - codeparrot_training - Step 14544: {'lr': 0.00040422433699228283, 'samples': 7447040, 'steps': 14544, 'batch_loss/train': 0.766990203410387} 12/24/2021 21:27:24 - INFO - codeparrot_training - Step 14545: {'lr': 0.00040421193041146626, 'samples': 7447552, 'steps': 14545, 'batch_loss/train': 0.841575366910547} 12/24/2021 21:27:34 - INFO - codeparrot_training - Step 14546: {'lr': 
0.00040419952321756123, 'samples': 7448064, 'steps': 14546, 'batch_loss/train': 0.7757745208218694} 12/24/2021 21:27:46 - INFO - codeparrot_training - Step 14547: {'lr': 0.0004041871154106172, 'samples': 7448576, 'steps': 14547, 'batch_loss/train': 0.7619125072378665} 12/24/2021 21:27:57 - INFO - codeparrot_training - Step 14548: {'lr': 0.0004041747069906834, 'samples': 7449088, 'steps': 14548, 'batch_loss/train': 0.6649721707217395} 12/24/2021 21:28:07 - INFO - codeparrot_training - Step 14549: {'lr': 0.0004041622979578091, 'samples': 7449600, 'steps': 14549, 'batch_loss/train': 0.7963699079118669} 12/24/2021 21:28:18 - INFO - codeparrot_training - Step 14550: {'lr': 0.0004041498883120438, 'samples': 7450112, 'steps': 14550, 'batch_loss/train': 0.7421456566080451} 12/24/2021 21:28:30 - INFO - codeparrot_training - Step 14551: {'lr': 0.0004041374780534367, 'samples': 7450624, 'steps': 14551, 'batch_loss/train': 0.8041170306969434} 12/24/2021 21:28:41 - INFO - codeparrot_training - Step 14552: {'lr': 0.00040412506718203725, 'samples': 7451136, 'steps': 14552, 'batch_loss/train': 0.9684199746698141} 12/24/2021 21:28:51 - INFO - codeparrot_training - Step 14553: {'lr': 0.0004041126556978946, 'samples': 7451648, 'steps': 14553, 'batch_loss/train': 0.7940661013126373} 12/24/2021 21:29:04 - INFO - codeparrot_training - Step 14554: {'lr': 0.00040410024360105834, 'samples': 7452160, 'steps': 14554, 'batch_loss/train': 0.8970004618167877} 12/24/2021 21:29:14 - INFO - codeparrot_training - Step 14555: {'lr': 0.0004040878308915776, 'samples': 7452672, 'steps': 14555, 'batch_loss/train': 0.698346015997231} 12/24/2021 21:29:25 - INFO - codeparrot_training - Step 14556: {'lr': 0.0004040754175695019, 'samples': 7453184, 'steps': 14556, 'batch_loss/train': 0.6728684189729393} 12/24/2021 21:29:39 - INFO - codeparrot_training - Step 14557: {'lr': 0.0004040630036348805, 'samples': 7453696, 'steps': 14557, 'batch_loss/train': 0.7551039564423263} 12/24/2021 21:29:50 - INFO - 
codeparrot_training - Step 14558: {'lr': 0.0004040505890877628, 'samples': 7454208, 'steps': 14558, 'batch_loss/train': 0.7986617586575449} 12/24/2021 21:30:00 - INFO - codeparrot_training - Step 14559: {'lr': 0.0004040381739281981, 'samples': 7454720, 'steps': 14559, 'batch_loss/train': 0.824303669272922} 12/24/2021 21:30:12 - INFO - codeparrot_training - Step 14560: {'lr': 0.0004040257581562358, 'samples': 7455232, 'steps': 14560, 'batch_loss/train': 0.7590979798696935} 12/24/2021 21:30:23 - INFO - codeparrot_training - Step 14561: {'lr': 0.00040401334177192516, 'samples': 7455744, 'steps': 14561, 'batch_loss/train': 0.7899989051511511} 12/24/2021 21:30:34 - INFO - codeparrot_training - Step 14562: {'lr': 0.0004040009247753157, 'samples': 7456256, 'steps': 14562, 'batch_loss/train': 0.760648949071765} 12/24/2021 21:30:44 - INFO - codeparrot_training - Step 14563: {'lr': 0.00040398850716645676, 'samples': 7456768, 'steps': 14563, 'batch_loss/train': 0.8485035374760628} 12/24/2021 21:30:56 - INFO - codeparrot_training - Step 14564: {'lr': 0.00040397608894539755, 'samples': 7457280, 'steps': 14564, 'batch_loss/train': 0.8137924997135997} 12/24/2021 21:31:07 - INFO - codeparrot_training - Step 14565: {'lr': 0.0004039636701121876, 'samples': 7457792, 'steps': 14565, 'batch_loss/train': 0.8754673525691032} 12/24/2021 21:31:18 - INFO - codeparrot_training - Step 14566: {'lr': 0.00040395125066687625, 'samples': 7458304, 'steps': 14566, 'batch_loss/train': 0.8997135902754962} 12/24/2021 21:31:32 - INFO - codeparrot_training - Step 14567: {'lr': 0.0004039388306095128, 'samples': 7458816, 'steps': 14567, 'batch_loss/train': 0.8499926966615021} 12/24/2021 21:31:42 - INFO - codeparrot_training - Step 14568: {'lr': 0.00040392640994014675, 'samples': 7459328, 'steps': 14568, 'batch_loss/train': 0.8714629160240293} 12/24/2021 21:31:53 - INFO - codeparrot_training - Step 14569: {'lr': 0.0004039139886588274, 'samples': 7459840, 'steps': 14569, 'batch_loss/train': 
0.9056868515908718} 12/24/2021 21:32:05 - INFO - codeparrot_training - Step 14570: {'lr': 0.0004039015667656041, 'samples': 7460352, 'steps': 14570, 'batch_loss/train': 0.8453264003619552} 12/24/2021 21:32:16 - INFO - codeparrot_training - Step 14571: {'lr': 0.00040388914426052637, 'samples': 7460864, 'steps': 14571, 'batch_loss/train': 0.716292129131034} 12/24/2021 21:32:26 - INFO - codeparrot_training - Step 14572: {'lr': 0.0004038767211436434, 'samples': 7461376, 'steps': 14572, 'batch_loss/train': 0.8655237276107073} 12/24/2021 21:32:37 - INFO - codeparrot_training - Step 14573: {'lr': 0.00040386429741500477, 'samples': 7461888, 'steps': 14573, 'batch_loss/train': 0.8225331245921552} 12/24/2021 21:32:51 - INFO - codeparrot_training - Step 14574: {'lr': 0.00040385187307465976, 'samples': 7462400, 'steps': 14574, 'batch_loss/train': 0.9576403154060245} 12/24/2021 21:33:01 - INFO - codeparrot_training - Step 14575: {'lr': 0.0004038394481226578, 'samples': 7462912, 'steps': 14575, 'batch_loss/train': 0.8233979064971209} 12/24/2021 21:33:12 - INFO - codeparrot_training - Step 14576: {'lr': 0.00040382702255904827, 'samples': 7463424, 'steps': 14576, 'batch_loss/train': 0.7850408768281341} 12/24/2021 21:33:24 - INFO - codeparrot_training - Step 14577: {'lr': 0.00040381459638388065, 'samples': 7463936, 'steps': 14577, 'batch_loss/train': 0.7895810091868043} 12/24/2021 21:33:34 - INFO - codeparrot_training - Step 14578: {'lr': 0.00040380216959720415, 'samples': 7464448, 'steps': 14578, 'batch_loss/train': 0.7064187587238848} 12/24/2021 21:33:45 - INFO - codeparrot_training - Step 14579: {'lr': 0.00040378974219906836, 'samples': 7464960, 'steps': 14579, 'batch_loss/train': 0.8725610454566777} 12/24/2021 21:33:57 - INFO - codeparrot_training - Step 14580: {'lr': 0.0004037773141895226, 'samples': 7465472, 'steps': 14580, 'batch_loss/train': 0.8897393238730729} 12/24/2021 21:34:08 - INFO - codeparrot_training - Step 14581: {'lr': 0.00040376488556861635, 'samples': 7465984, 
'steps': 14581, 'batch_loss/train': 0.8103339613880962} 12/24/2021 21:34:18 - INFO - codeparrot_training - Step 14582: {'lr': 0.00040375245633639896, 'samples': 7466496, 'steps': 14582, 'batch_loss/train': 0.8898217019159347} 12/24/2021 21:34:29 - INFO - codeparrot_training - Step 14583: {'lr': 0.00040374002649291987, 'samples': 7467008, 'steps': 14583, 'batch_loss/train': 0.7659333236515522} 12/24/2021 21:34:43 - INFO - codeparrot_training - Step 14584: {'lr': 0.0004037275960382285, 'samples': 7467520, 'steps': 14584, 'batch_loss/train': 0.701867675408721} 12/24/2021 21:34:53 - INFO - codeparrot_training - Step 14585: {'lr': 0.00040371516497237416, 'samples': 7468032, 'steps': 14585, 'batch_loss/train': 0.7863708473742008} 12/24/2021 21:35:04 - INFO - codeparrot_training - Step 14586: {'lr': 0.0004037027332954064, 'samples': 7468544, 'steps': 14586, 'batch_loss/train': 0.8043652372434735} 12/24/2021 21:35:16 - INFO - codeparrot_training - Step 14587: {'lr': 0.00040369030100737463, 'samples': 7469056, 'steps': 14587, 'batch_loss/train': 0.7700865329243243} 12/24/2021 21:35:27 - INFO - codeparrot_training - Step 14588: {'lr': 0.0004036778681083283, 'samples': 7469568, 'steps': 14588, 'batch_loss/train': 0.7773967040702701} 12/24/2021 21:35:38 - INFO - codeparrot_training - Step 14589: {'lr': 0.00040366543459831674, 'samples': 7470080, 'steps': 14589, 'batch_loss/train': 0.6707971030846238} 12/24/2021 21:35:50 - INFO - codeparrot_training - Step 14590: {'lr': 0.00040365300047738947, 'samples': 7470592, 'steps': 14590, 'batch_loss/train': 0.8184917871840298} 12/24/2021 21:36:00 - INFO - codeparrot_training - Step 14591: {'lr': 0.0004036405657455958, 'samples': 7471104, 'steps': 14591, 'batch_loss/train': 0.8367636390030384} 12/24/2021 21:36:11 - INFO - codeparrot_training - Step 14592: {'lr': 0.00040362813040298534, 'samples': 7471616, 'steps': 14592, 'batch_loss/train': 0.8385505238547921} 12/24/2021 21:36:21 - INFO - codeparrot_training - Step 14593: {'lr': 
0.0004036156944496074, 'samples': 7472128, 'steps': 14593, 'batch_loss/train': 0.8278913744725287} 12/24/2021 21:36:33 - INFO - codeparrot_training - Step 14594: {'lr': 0.00040360325788551144, 'samples': 7472640, 'steps': 14594, 'batch_loss/train': 0.8313835365697742} 12/24/2021 21:36:44 - INFO - codeparrot_training - Step 14595: {'lr': 0.00040359082071074703, 'samples': 7473152, 'steps': 14595, 'batch_loss/train': 0.8190669601899572} 12/24/2021 21:36:55 - INFO - codeparrot_training - Step 14596: {'lr': 0.00040357838292536343, 'samples': 7473664, 'steps': 14596, 'batch_loss/train': 0.7751601226627827} 12/24/2021 21:37:08 - INFO - codeparrot_training - Step 14597: {'lr': 0.0004035659445294102, 'samples': 7474176, 'steps': 14597, 'batch_loss/train': 0.8301773127168417} 12/24/2021 21:37:19 - INFO - codeparrot_training - Step 14598: {'lr': 0.0004035535055229367, 'samples': 7474688, 'steps': 14598, 'batch_loss/train': 0.7630405388772488} 12/24/2021 21:37:30 - INFO - codeparrot_training - Step 14599: {'lr': 0.0004035410659059925, 'samples': 7475200, 'steps': 14599, 'batch_loss/train': 0.7815540349110961} 12/24/2021 21:37:42 - INFO - codeparrot_training - Step 14600: {'lr': 0.0004035286256786269, 'samples': 7475712, 'steps': 14600, 'batch_loss/train': 0.7653406602330506} 12/24/2021 21:37:52 - INFO - codeparrot_training - Step 14601: {'lr': 0.0004035161848408896, 'samples': 7476224, 'steps': 14601, 'batch_loss/train': 0.7147982083261013} 12/24/2021 21:38:03 - INFO - codeparrot_training - Step 14602: {'lr': 0.0004035037433928298, 'samples': 7476736, 'steps': 14602, 'batch_loss/train': 0.912590678781271} 12/24/2021 21:38:13 - INFO - codeparrot_training - Step 14603: {'lr': 0.0004034913013344972, 'samples': 7477248, 'steps': 14603, 'batch_loss/train': 0.8001081068068743} 12/24/2021 21:38:27 - INFO - codeparrot_training - Step 14604: {'lr': 0.000403478858665941, 'samples': 7477760, 'steps': 14604, 'batch_loss/train': 0.7987002303125337} 12/24/2021 21:38:38 - INFO - 
codeparrot_training - Step 14605: {'lr': 0.0004034664153872109, 'samples': 7478272, 'steps': 14605, 'batch_loss/train': 0.8290638970211148} 12/24/2021 21:38:49 - INFO - codeparrot_training - Step 14606: {'lr': 0.0004034539714983563, 'samples': 7478784, 'steps': 14606, 'batch_loss/train': 0.7457584007643163} 12/24/2021 21:39:01 - INFO - codeparrot_training - Step 14607: {'lr': 0.00040344152699942656, 'samples': 7479296, 'steps': 14607, 'batch_loss/train': 0.7755988594144583} 12/24/2021 21:39:12 - INFO - codeparrot_training - Step 14608: {'lr': 0.0004034290818904713, 'samples': 7479808, 'steps': 14608, 'batch_loss/train': 0.7539607470389456} 12/24/2021 21:39:22 - INFO - codeparrot_training - Step 14609: {'lr': 0.0004034166361715399, 'samples': 7480320, 'steps': 14609, 'batch_loss/train': 0.8808780759572983} 12/24/2021 21:39:34 - INFO - codeparrot_training - Step 14610: {'lr': 0.0004034041898426819, 'samples': 7480832, 'steps': 14610, 'batch_loss/train': 0.8753639664500952} 12/24/2021 21:39:45 - INFO - codeparrot_training - Step 14611: {'lr': 0.00040339174290394676, 'samples': 7481344, 'steps': 14611, 'batch_loss/train': 0.7702323636040092} 12/24/2021 21:39:56 - INFO - codeparrot_training - Step 14612: {'lr': 0.000403379295355384, 'samples': 7481856, 'steps': 14612, 'batch_loss/train': 0.9027400254271924} 12/24/2021 21:40:06 - INFO - codeparrot_training - Step 14613: {'lr': 0.0004033668471970431, 'samples': 7482368, 'steps': 14613, 'batch_loss/train': 0.7685903124511242} 12/24/2021 21:40:20 - INFO - codeparrot_training - Step 14614: {'lr': 0.0004033543984289735, 'samples': 7482880, 'steps': 14614, 'batch_loss/train': 0.7418541247025132} 12/24/2021 21:40:31 - INFO - codeparrot_training - Step 14615: {'lr': 0.0004033419490512247, 'samples': 7483392, 'steps': 14615, 'batch_loss/train': 0.7779615707695484} 12/24/2021 21:40:41 - INFO - codeparrot_training - Step 14616: {'lr': 0.0004033294990638462, 'samples': 7483904, 'steps': 14616, 'batch_loss/train': 0.9161662005353719} 
12/24/2021 21:40:54 - INFO - codeparrot_training - Step 14617: {'lr': 0.00040331704846688755, 'samples': 7484416, 'steps': 14617, 'batch_loss/train': 0.7740978267975152} 12/24/2021 21:41:04 - INFO - codeparrot_training - Step 14618: {'lr': 0.00040330459726039815, 'samples': 7484928, 'steps': 14618, 'batch_loss/train': 0.6939873769879341} 12/24/2021 21:41:15 - INFO - codeparrot_training - Step 14619: {'lr': 0.0004032921454444276, 'samples': 7485440, 'steps': 14619, 'batch_loss/train': 0.7895460072904825} 12/24/2021 21:41:27 - INFO - codeparrot_training - Step 14620: {'lr': 0.00040327969301902534, 'samples': 7485952, 'steps': 14620, 'batch_loss/train': 0.8766858843155205} 12/24/2021 21:41:38 - INFO - codeparrot_training - Step 14621: {'lr': 0.0004032672399842409, 'samples': 7486464, 'steps': 14621, 'batch_loss/train': 0.7995488932356238} 12/24/2021 21:41:48 - INFO - codeparrot_training - Step 14622: {'lr': 0.00040325478634012374, 'samples': 7486976, 'steps': 14622, 'batch_loss/train': 0.8814340000972152} 12/24/2021 21:41:59 - INFO - codeparrot_training - Step 14623: {'lr': 0.0004032423320867235, 'samples': 7487488, 'steps': 14623, 'batch_loss/train': 0.7904592324048281} 12/24/2021 21:42:13 - INFO - codeparrot_training - Step 14624: {'lr': 0.00040322987722408955, 'samples': 7488000, 'steps': 14624, 'batch_loss/train': 0.7924273842945695} 12/24/2021 21:42:24 - INFO - codeparrot_training - Step 14625: {'lr': 0.0004032174217522715, 'samples': 7488512, 'steps': 14625, 'batch_loss/train': 0.7049997246358544} 12/24/2021 21:42:34 - INFO - codeparrot_training - Step 14626: {'lr': 0.00040320496567131884, 'samples': 7489024, 'steps': 14626, 'batch_loss/train': 0.8478294983506203} 12/24/2021 21:42:46 - INFO - codeparrot_training - Step 14627: {'lr': 0.0004031925089812811, 'samples': 7489536, 'steps': 14627, 'batch_loss/train': 0.8927136786514893} 12/24/2021 21:42:57 - INFO - codeparrot_training - Step 14628: {'lr': 0.0004031800516822077, 'samples': 7490048, 'steps': 14628, 
'batch_loss/train': 0.7886912731919438} 12/24/2021 21:43:07 - INFO - codeparrot_training - Step 14629: {'lr': 0.00040316759377414835, 'samples': 7490560, 'steps': 14629, 'batch_loss/train': 0.8446077187545598} 12/24/2021 21:43:19 - INFO - codeparrot_training - Step 14630: {'lr': 0.0004031551352571524, 'samples': 7491072, 'steps': 14630, 'batch_loss/train': 0.7636511917226017} 12/24/2021 21:43:30 - INFO - codeparrot_training - Step 14631: {'lr': 0.0004031426761312695, 'samples': 7491584, 'steps': 14631, 'batch_loss/train': 0.7932222243398428} 12/24/2021 21:43:41 - INFO - codeparrot_training - Step 14632: {'lr': 0.00040313021639654914, 'samples': 7492096, 'steps': 14632, 'batch_loss/train': 0.8905920786783099} 12/24/2021 21:43:51 - INFO - codeparrot_training - Step 14633: {'lr': 0.0004031177560530409, 'samples': 7492608, 'steps': 14633, 'batch_loss/train': 0.8923349380493164} 12/24/2021 21:44:05 - INFO - codeparrot_training - Step 14634: {'lr': 0.0004031052951007942, 'samples': 7493120, 'steps': 14634, 'batch_loss/train': 0.8332614516839385} 12/24/2021 21:44:16 - INFO - codeparrot_training - Step 14635: {'lr': 0.0004030928335398587, 'samples': 7493632, 'steps': 14635, 'batch_loss/train': 0.770465200766921} 12/24/2021 21:44:26 - INFO - codeparrot_training - Step 14636: {'lr': 0.0004030803713702839, 'samples': 7494144, 'steps': 14636, 'batch_loss/train': 0.8384248511865735} 12/24/2021 21:44:39 - INFO - codeparrot_training - Step 14637: {'lr': 0.00040306790859211936, 'samples': 7494656, 'steps': 14637, 'batch_loss/train': 0.7342606505844742} 12/24/2021 21:44:49 - INFO - codeparrot_training - Step 14638: {'lr': 0.0004030554452054146, 'samples': 7495168, 'steps': 14638, 'batch_loss/train': 0.7777777884621173} 12/24/2021 21:45:00 - INFO - codeparrot_training - Step 14639: {'lr': 0.00040304298121021914, 'samples': 7495680, 'steps': 14639, 'batch_loss/train': 0.6442889912286773} 12/24/2021 21:45:12 - INFO - codeparrot_training - Step 14640: {'lr': 0.0004030305166065826, 
'samples': 7496192, 'steps': 14640, 'batch_loss/train': 0.8641754714772105} 12/24/2021 21:45:23 - INFO - codeparrot_training - Step 14641: {'lr': 0.0004030180513945545, 'samples': 7496704, 'steps': 14641, 'batch_loss/train': 0.776433426886797} 12/24/2021 21:45:34 - INFO - codeparrot_training - Step 14642: {'lr': 0.0004030055855741845, 'samples': 7497216, 'steps': 14642, 'batch_loss/train': 2.4252782594412565} 12/24/2021 21:45:44 - INFO - codeparrot_training - Step 14643: {'lr': 0.00040299311914552194, 'samples': 7497728, 'steps': 14643, 'batch_loss/train': 0.9243030790239573} 12/24/2021 21:45:58 - INFO - codeparrot_training - Step 14644: {'lr': 0.00040298065210861656, 'samples': 7498240, 'steps': 14644, 'batch_loss/train': 0.7998006660491228} 12/24/2021 21:46:09 - INFO - codeparrot_training - Step 14645: {'lr': 0.00040296818446351783, 'samples': 7498752, 'steps': 14645, 'batch_loss/train': 0.8106344761326909} 12/24/2021 21:46:19 - INFO - codeparrot_training - Step 14646: {'lr': 0.00040295571621027537, 'samples': 7499264, 'steps': 14646, 'batch_loss/train': 0.8896235972642899} 12/24/2021 21:46:31 - INFO - codeparrot_training - Step 14647: {'lr': 0.0004029432473489388, 'samples': 7499776, 'steps': 14647, 'batch_loss/train': 0.81540038716048} 12/24/2021 21:46:42 - INFO - codeparrot_training - Step 14648: {'lr': 0.00040293077787955755, 'samples': 7500288, 'steps': 14648, 'batch_loss/train': 0.7576495683752} 12/24/2021 21:46:53 - INFO - codeparrot_training - Step 14649: {'lr': 0.0004029183078021813, 'samples': 7500800, 'steps': 14649, 'batch_loss/train': 0.8076618192717433} 12/24/2021 21:47:05 - INFO - codeparrot_training - Step 14650: {'lr': 0.0004029058371168596, 'samples': 7501312, 'steps': 14650, 'batch_loss/train': 0.8425707661081105} 12/24/2021 21:47:15 - INFO - codeparrot_training - Step 14651: {'lr': 0.00040289336582364214, 'samples': 7501824, 'steps': 14651, 'batch_loss/train': 0.7769680125638843} 12/24/2021 21:47:26 - INFO - codeparrot_training - Step 14652: 
{'lr': 0.0004028808939225782, 'samples': 7502336, 'steps': 14652, 'batch_loss/train': 0.8413285445421934} 12/24/2021 21:47:37 - INFO - codeparrot_training - Step 14653: {'lr': 0.00040286842141371763, 'samples': 7502848, 'steps': 14653, 'batch_loss/train': 0.7409617621451616} 12/24/2021 21:47:51 - INFO - codeparrot_training - Step 14654: {'lr': 0.0004028559482971099, 'samples': 7503360, 'steps': 14654, 'batch_loss/train': 0.8566116055008024} 12/24/2021 21:48:01 - INFO - codeparrot_training - Step 14655: {'lr': 0.0004028434745728047, 'samples': 7503872, 'steps': 14655, 'batch_loss/train': 0.7707671616226435} 12/24/2021 21:48:12 - INFO - codeparrot_training - Step 14656: {'lr': 0.00040283100024085153, 'samples': 7504384, 'steps': 14656, 'batch_loss/train': 0.7474158178083599} 12/24/2021 21:48:24 - INFO - codeparrot_training - Step 14657: {'lr': 0.0004028185253013, 'samples': 7504896, 'steps': 14657, 'batch_loss/train': 0.7722333716228604} 12/24/2021 21:48:35 - INFO - codeparrot_training - Step 14658: {'lr': 0.00040280604975419975, 'samples': 7505408, 'steps': 14658, 'batch_loss/train': 0.8418080592527986} 12/24/2021 21:48:45 - INFO - codeparrot_training - Step 14659: {'lr': 0.0004027935735996003, 'samples': 7505920, 'steps': 14659, 'batch_loss/train': 0.8281535031273961} 12/24/2021 21:48:59 - INFO - codeparrot_training - Step 14660: {'lr': 0.00040278109683755133, 'samples': 7506432, 'steps': 14660, 'batch_loss/train': 0.8595015809405595} 12/24/2021 21:49:10 - INFO - codeparrot_training - Step 14661: {'lr': 0.0004027686194681023, 'samples': 7506944, 'steps': 14661, 'batch_loss/train': 0.832519329385832} 12/24/2021 21:49:20 - INFO - codeparrot_training - Step 14662: {'lr': 0.0004027561414913031, 'samples': 7507456, 'steps': 14662, 'batch_loss/train': 0.8198362239636481} 12/24/2021 21:49:31 - INFO - codeparrot_training - Step 14663: {'lr': 0.00040274366290720304, 'samples': 7507968, 'steps': 14663, 'batch_loss/train': 0.771451243199408} 12/24/2021 21:49:43 - INFO - 
codeparrot_training - Step 14664: {'lr': 0.00040273118371585186, 'samples': 7508480, 'steps': 14664, 'batch_loss/train': 0.7829523887485266} 12/24/2021 21:49:54 - INFO - codeparrot_training - Step 14665: {'lr': 0.0004027187039172992, 'samples': 7508992, 'steps': 14665, 'batch_loss/train': 0.7813259558752179} 12/24/2021 21:50:04 - INFO - codeparrot_training - Step 14666: {'lr': 0.0004027062235115946, 'samples': 7509504, 'steps': 14666, 'batch_loss/train': 0.8141427123919129} 12/24/2021 21:50:16 - INFO - codeparrot_training - Step 14667: {'lr': 0.00040269374249878774, 'samples': 7510016, 'steps': 14667, 'batch_loss/train': 1.1692237919196486} 12/24/2021 21:50:27 - INFO - codeparrot_training - Step 14668: {'lr': 0.0004026812608789282, 'samples': 7510528, 'steps': 14668, 'batch_loss/train': 0.7247258508577943} 12/24/2021 21:50:38 - INFO - codeparrot_training - Step 14669: {'lr': 0.00040266877865206564, 'samples': 7511040, 'steps': 14669, 'batch_loss/train': 0.78247433854267} 12/24/2021 21:50:51 - INFO - codeparrot_training - Step 14670: {'lr': 0.00040265629581824966, 'samples': 7511552, 'steps': 14670, 'batch_loss/train': 0.9570004073902965} 12/24/2021 21:51:02 - INFO - codeparrot_training - Step 14671: {'lr': 0.0004026438123775299, 'samples': 7512064, 'steps': 14671, 'batch_loss/train': 0.7685965381097049} 12/24/2021 21:51:12 - INFO - codeparrot_training - Step 14672: {'lr': 0.0004026313283299559, 'samples': 7512576, 'steps': 14672, 'batch_loss/train': 0.8519279453903437} 12/24/2021 21:51:23 - INFO - codeparrot_training - Step 14673: {'lr': 0.00040261884367557747, 'samples': 7513088, 'steps': 14673, 'batch_loss/train': 0.8525078697130084} 12/24/2021 21:51:35 - INFO - codeparrot_training - Step 14674: {'lr': 0.000402606358414444, 'samples': 7513600, 'steps': 14674, 'batch_loss/train': 0.7494188183918595} 12/24/2021 21:51:46 - INFO - codeparrot_training - Step 14675: {'lr': 0.00040259387254660547, 'samples': 7514112, 'steps': 14675, 'batch_loss/train': 
0.8471954734995961} 12/24/2021 21:51:56 - INFO - codeparrot_training - Step 14676: {'lr': 0.0004025813860721112, 'samples': 7514624, 'steps': 14676, 'batch_loss/train': 0.8501309119164944} 12/24/2021 21:52:08 - INFO - codeparrot_training - Step 14677: {'lr': 0.000402568898991011, 'samples': 7515136, 'steps': 14677, 'batch_loss/train': 0.8709222609177232} 12/24/2021 21:52:19 - INFO - codeparrot_training - Step 14678: {'lr': 0.0004025564113033544, 'samples': 7515648, 'steps': 14678, 'batch_loss/train': 0.7778072720393538} 12/24/2021 21:52:30 - INFO - codeparrot_training - Step 14679: {'lr': 0.00040254392300919116, 'samples': 7516160, 'steps': 14679, 'batch_loss/train': 0.8305049641057849} 12/24/2021 21:52:42 - INFO - codeparrot_training - Step 14680: {'lr': 0.00040253143410857084, 'samples': 7516672, 'steps': 14680, 'batch_loss/train': 0.8162033329717815} 12/24/2021 21:52:53 - INFO - codeparrot_training - Step 14681: {'lr': 0.00040251894460154313, 'samples': 7517184, 'steps': 14681, 'batch_loss/train': 0.8003407586365938} 12/24/2021 21:53:03 - INFO - codeparrot_training - Step 14682: {'lr': 0.00040250645448815773, 'samples': 7517696, 'steps': 14682, 'batch_loss/train': 0.7438982841558754} 12/24/2021 21:53:14 - INFO - codeparrot_training - Step 14683: {'lr': 0.00040249396376846426, 'samples': 7518208, 'steps': 14683, 'batch_loss/train': 1.2223894894123077} 12/24/2021 21:53:28 - INFO - codeparrot_training - Step 14684: {'lr': 0.0004024814724425123, 'samples': 7518720, 'steps': 14684, 'batch_loss/train': 0.6957352580502629} 12/24/2021 21:53:38 - INFO - codeparrot_training - Step 14685: {'lr': 0.00040246898051035163, 'samples': 7519232, 'steps': 14685, 'batch_loss/train': 0.6984557002433576} 12/24/2021 21:53:49 - INFO - codeparrot_training - Step 14686: {'lr': 0.00040245648797203196, 'samples': 7519744, 'steps': 14686, 'batch_loss/train': 0.7833325816318393} 12/24/2021 21:54:01 - INFO - codeparrot_training - Step 14687: {'lr': 0.00040244399482760267, 'samples': 7520256, 
'steps': 14687, 'batch_loss/train': 0.8580555943772197} 12/24/2021 21:54:12 - INFO - codeparrot_training - Step 14688: {'lr': 0.00040243150107711374, 'samples': 7520768, 'steps': 14688, 'batch_loss/train': 0.791399285197258} 12/24/2021 21:54:22 - INFO - codeparrot_training - Step 14689: {'lr': 0.00040241900672061473, 'samples': 7521280, 'steps': 14689, 'batch_loss/train': 0.839608071371913} 12/24/2021 21:54:36 - INFO - codeparrot_training - Step 14690: {'lr': 0.00040240651175815523, 'samples': 7521792, 'steps': 14690, 'batch_loss/train': 0.8449339848011732} 12/24/2021 21:54:47 - INFO - codeparrot_training - Step 14691: {'lr': 0.000402394016189785, 'samples': 7522304, 'steps': 14691, 'batch_loss/train': 0.901790140196681} 12/24/2021 21:54:57 - INFO - codeparrot_training - Step 14692: {'lr': 0.0004023815200155537, 'samples': 7522816, 'steps': 14692, 'batch_loss/train': 0.9180742623284459} 12/24/2021 21:55:08 - INFO - codeparrot_training - Step 14693: {'lr': 0.00040236902323551103, 'samples': 7523328, 'steps': 14693, 'batch_loss/train': 0.7606714167632163} 12/24/2021 21:55:20 - INFO - codeparrot_training - Step 14694: {'lr': 0.0004023565258497066, 'samples': 7523840, 'steps': 14694, 'batch_loss/train': 0.9124003704637289} 12/24/2021 21:55:31 - INFO - codeparrot_training - Step 14695: {'lr': 0.00040234402785819026, 'samples': 7524352, 'steps': 14695, 'batch_loss/train': 0.8864890905097127} 12/24/2021 21:55:41 - INFO - codeparrot_training - Step 14696: {'lr': 0.0004023315292610115, 'samples': 7524864, 'steps': 14696, 'batch_loss/train': 0.7856913227587938} 12/24/2021 21:55:53 - INFO - codeparrot_training - Step 14697: {'lr': 0.0004023190300582201, 'samples': 7525376, 'steps': 14697, 'batch_loss/train': 0.8222546172328293} 12/24/2021 21:56:04 - INFO - codeparrot_training - Step 14698: {'lr': 0.0004023065302498658, 'samples': 7525888, 'steps': 14698, 'batch_loss/train': 0.987383687403053} 12/24/2021 21:56:15 - INFO - codeparrot_training - Step 14699: {'lr': 
0.0004022940298359983, 'samples': 7526400, 'steps': 14699, 'batch_loss/train': 0.8651194218546152} 12/24/2021 21:56:25 - INFO - codeparrot_training - Step 14700: {'lr': 0.00040228152881666715, 'samples': 7526912, 'steps': 14700, 'batch_loss/train': 0.8914685919880867} 12/24/2021 21:56:40 - INFO - codeparrot_training - Step 14701: {'lr': 0.00040226902719192213, 'samples': 7527424, 'steps': 14701, 'batch_loss/train': 0.7007809979841113} 12/24/2021 21:56:51 - INFO - codeparrot_training - Step 14702: {'lr': 0.000402256524961813, 'samples': 7527936, 'steps': 14702, 'batch_loss/train': 0.7637762001249939} 12/24/2021 21:57:01 - INFO - codeparrot_training - Step 14703: {'lr': 0.0004022440221263894, 'samples': 7528448, 'steps': 14703, 'batch_loss/train': 1.6005335003137589} 12/24/2021 21:57:13 - INFO - codeparrot_training - Step 14704: {'lr': 0.00040223151868570096, 'samples': 7528960, 'steps': 14704, 'batch_loss/train': 1.3203336261212826} 12/24/2021 21:57:24 - INFO - codeparrot_training - Step 14705: {'lr': 0.0004022190146397976, 'samples': 7529472, 'steps': 14705, 'batch_loss/train': 0.8572336575016379} 12/24/2021 21:57:35 - INFO - codeparrot_training - Step 14706: {'lr': 0.0004022065099887289, 'samples': 7529984, 'steps': 14706, 'batch_loss/train': 0.8424673723056912} 12/24/2021 21:57:45 - INFO - codeparrot_training - Step 14707: {'lr': 0.0004021940047325445, 'samples': 7530496, 'steps': 14707, 'batch_loss/train': 0.8102518925443292} 12/24/2021 21:57:59 - INFO - codeparrot_training - Step 14708: {'lr': 0.00040218149887129436, 'samples': 7531008, 'steps': 14708, 'batch_loss/train': 0.7859471691772342} 12/24/2021 21:58:10 - INFO - codeparrot_training - Step 14709: {'lr': 0.0004021689924050279, 'samples': 7531520, 'steps': 14709, 'batch_loss/train': 0.9628745624795556} 12/24/2021 21:58:20 - INFO - codeparrot_training - Step 14710: {'lr': 0.0004021564853337951, 'samples': 7532032, 'steps': 14710, 'batch_loss/train': 0.9495108285918832} 12/24/2021 21:58:32 - INFO - 
codeparrot_training - Step 14711: {'lr': 0.0004021439776576454, 'samples': 7532544, 'steps': 14711, 'batch_loss/train': 0.9006870593875647} 12/24/2021 21:58:43 - INFO - codeparrot_training - Step 14712: {'lr': 0.00040213146937662894, 'samples': 7533056, 'steps': 14712, 'batch_loss/train': 0.7905729562044144} 12/24/2021 21:58:54 - INFO - codeparrot_training - Step 14713: {'lr': 0.000402118960490795, 'samples': 7533568, 'steps': 14713, 'batch_loss/train': 0.7598259625956416} 12/24/2021 21:59:06 - INFO - codeparrot_training - Step 14714: {'lr': 0.00040210645100019363, 'samples': 7534080, 'steps': 14714, 'batch_loss/train': 0.8467762069776654} 12/24/2021 21:59:16 - INFO - codeparrot_training - Step 14715: {'lr': 0.0004020939409048743, 'samples': 7534592, 'steps': 14715, 'batch_loss/train': 0.8124277796596289} 12/24/2021 21:59:27 - INFO - codeparrot_training - Step 14716: {'lr': 0.000402081430204887, 'samples': 7535104, 'steps': 14716, 'batch_loss/train': 0.8355035055428743} 12/24/2021 21:59:37 - INFO - codeparrot_training - Step 14717: {'lr': 0.0004020689189002813, 'samples': 7535616, 'steps': 14717, 'batch_loss/train': 0.8335599424317479} 12/24/2021 21:59:50 - INFO - codeparrot_training - Step 14718: {'lr': 0.000402056406991107, 'samples': 7536128, 'steps': 14718, 'batch_loss/train': 0.8779905049595982} 12/24/2021 22:00:00 - INFO - codeparrot_training - Step 14719: {'lr': 0.0004020438944774139, 'samples': 7536640, 'steps': 14719, 'batch_loss/train': 0.7282224111258984} 12/24/2021 22:00:11 - INFO - codeparrot_training - Step 14720: {'lr': 0.00040203138135925165, 'samples': 7537152, 'steps': 14720, 'batch_loss/train': 0.718089078553021} 12/24/2021 22:00:25 - INFO - codeparrot_training - Step 14721: {'lr': 0.00040201886763667006, 'samples': 7537664, 'steps': 14721, 'batch_loss/train': 0.9969516415148973} 12/24/2021 22:00:36 - INFO - codeparrot_training - Step 14722: {'lr': 0.0004020063533097188, 'samples': 7538176, 'steps': 14722, 'batch_loss/train': 0.8725247969850898} 
12/24/2021 22:00:46 - INFO - codeparrot_training - Step 14723: {'lr': 0.00040199383837844773, 'samples': 7538688, 'steps': 14723, 'batch_loss/train': 0.744363805744797} 12/24/2021 22:00:58 - INFO - codeparrot_training - Step 14724: {'lr': 0.00040198132284290653, 'samples': 7539200, 'steps': 14724, 'batch_loss/train': 0.8182488223537803} 12/24/2021 22:01:09 - INFO - codeparrot_training - Step 14725: {'lr': 0.00040196880670314493, 'samples': 7539712, 'steps': 14725, 'batch_loss/train': 0.8497895654290915} 12/24/2021 22:01:20 - INFO - codeparrot_training - Step 14726: {'lr': 0.0004019562899592128, 'samples': 7540224, 'steps': 14726, 'batch_loss/train': 0.8007540460675955} 12/24/2021 22:01:30 - INFO - codeparrot_training - Step 14727: {'lr': 0.00040194377261115977, 'samples': 7540736, 'steps': 14727, 'batch_loss/train': 0.8043962195515633} 12/24/2021 22:01:42 - INFO - codeparrot_training - Step 14728: {'lr': 0.00040193125465903573, 'samples': 7541248, 'steps': 14728, 'batch_loss/train': 0.7338452736148611} 12/24/2021 22:01:53 - INFO - codeparrot_training - Step 14729: {'lr': 0.0004019187361028904, 'samples': 7541760, 'steps': 14729, 'batch_loss/train': 0.7572613246738911} 12/24/2021 22:02:03 - INFO - codeparrot_training - Step 14730: {'lr': 0.00040190621694277353, 'samples': 7542272, 'steps': 14730, 'batch_loss/train': 0.8842874085530639} 12/24/2021 22:02:17 - INFO - codeparrot_training - Step 14731: {'lr': 0.0004018936971787348, 'samples': 7542784, 'steps': 14731, 'batch_loss/train': 0.7325809765607119} 12/24/2021 22:02:28 - INFO - codeparrot_training - Step 14732: {'lr': 0.0004018811768108242, 'samples': 7543296, 'steps': 14732, 'batch_loss/train': 0.6752582401968539} 12/24/2021 22:02:38 - INFO - codeparrot_training - Step 14733: {'lr': 0.0004018686558390913, 'samples': 7543808, 'steps': 14733, 'batch_loss/train': 0.7781572537496686} 12/24/2021 22:02:51 - INFO - codeparrot_training - Step 14734: {'lr': 0.00040185613426358606, 'samples': 7544320, 'steps': 14734, 
'batch_loss/train': 0.928446221165359} 12/24/2021 22:03:01 - INFO - codeparrot_training - Step 14735: {'lr': 0.0004018436120843581, 'samples': 7544832, 'steps': 14735, 'batch_loss/train': 0.7900923150591552} 12/24/2021 22:03:12 - INFO - codeparrot_training - Step 14736: {'lr': 0.00040183108930145735, 'samples': 7545344, 'steps': 14736, 'batch_loss/train': 0.8397740502841771} 12/24/2021 22:03:22 - INFO - codeparrot_training - Step 14737: {'lr': 0.0004018185659149334, 'samples': 7545856, 'steps': 14737, 'batch_loss/train': 0.8531016670167446} 12/24/2021 22:03:36 - INFO - codeparrot_training - Step 14738: {'lr': 0.00040180604192483626, 'samples': 7546368, 'steps': 14738, 'batch_loss/train': 0.8185305339284241} 12/24/2021 22:03:47 - INFO - codeparrot_training - Step 14739: {'lr': 0.00040179351733121563, 'samples': 7546880, 'steps': 14739, 'batch_loss/train': 0.7432420197874308} 12/24/2021 22:03:57 - INFO - codeparrot_training - Step 14740: {'lr': 0.00040178099213412124, 'samples': 7547392, 'steps': 14740, 'batch_loss/train': 0.9690715614706278} 12/24/2021 22:04:10 - INFO - codeparrot_training - Step 14741: {'lr': 0.00040176846633360286, 'samples': 7547904, 'steps': 14741, 'batch_loss/train': 0.7629574341699481} 12/24/2021 22:04:20 - INFO - codeparrot_training - Step 14742: {'lr': 0.00040175593992971046, 'samples': 7548416, 'steps': 14742, 'batch_loss/train': 0.7295037524309009} 12/24/2021 22:04:31 - INFO - codeparrot_training - Step 14743: {'lr': 0.0004017434129224937, 'samples': 7548928, 'steps': 14743, 'batch_loss/train': 0.7478513839887455} 12/24/2021 22:04:43 - INFO - codeparrot_training - Step 14744: {'lr': 0.00040173088531200246, 'samples': 7549440, 'steps': 14744, 'batch_loss/train': 0.709140264429152} 12/24/2021 22:04:54 - INFO - codeparrot_training - Step 14745: {'lr': 0.0004017183570982865, 'samples': 7549952, 'steps': 14745, 'batch_loss/train': 0.743111508898437} 12/24/2021 22:05:04 - INFO - codeparrot_training - Step 14746: {'lr': 0.0004017058282813957, 
'samples': 7550464, 'steps': 14746, 'batch_loss/train': 0.6991888168267906} 12/24/2021 22:05:16 - INFO - codeparrot_training - Step 14747: {'lr': 0.0004016932988613797, 'samples': 7550976, 'steps': 14747, 'batch_loss/train': 0.8043817104771733} 12/24/2021 22:05:27 - INFO - codeparrot_training - Step 14748: {'lr': 0.0004016807688382884, 'samples': 7551488, 'steps': 14748, 'batch_loss/train': 0.8419307917356491} 12/24/2021 22:05:38 - INFO - codeparrot_training - Step 14749: {'lr': 0.0004016682382121718, 'samples': 7552000, 'steps': 14749, 'batch_loss/train': 0.7979575032368302} 12/24/2021 22:05:48 - INFO - codeparrot_training - Step 14750: {'lr': 0.0004016557069830794, 'samples': 7552512, 'steps': 14750, 'batch_loss/train': 0.8107557399198413} 12/24/2021 22:06:02 - INFO - codeparrot_training - Step 14751: {'lr': 0.00040164317515106133, 'samples': 7553024, 'steps': 14751, 'batch_loss/train': 0.8797395492438227} 12/24/2021 22:06:13 - INFO - codeparrot_training - Step 14752: {'lr': 0.0004016306427161671, 'samples': 7553536, 'steps': 14752, 'batch_loss/train': 0.7289293417707086} 12/24/2021 22:06:23 - INFO - codeparrot_training - Step 14753: {'lr': 0.0004016181096784467, 'samples': 7554048, 'steps': 14753, 'batch_loss/train': 0.8381202085874975} 12/24/2021 22:06:35 - INFO - codeparrot_training - Step 14754: {'lr': 0.00040160557603795, 'samples': 7554560, 'steps': 14754, 'batch_loss/train': 0.7448510839603841} 12/24/2021 22:06:46 - INFO - codeparrot_training - Step 14755: {'lr': 0.00040159304179472675, 'samples': 7555072, 'steps': 14755, 'batch_loss/train': 0.8684057397767901} 12/24/2021 22:06:57 - INFO - codeparrot_training - Step 14756: {'lr': 0.0004015805069488268, 'samples': 7555584, 'steps': 14756, 'batch_loss/train': 0.9861093750223517} 12/24/2021 22:07:09 - INFO - codeparrot_training - Step 14757: {'lr': 0.0004015679715003, 'samples': 7556096, 'steps': 14757, 'batch_loss/train': 0.7571808276697993} 12/24/2021 22:07:19 - INFO - codeparrot_training - Step 14758: 
{'lr': 0.0004015554354491962, 'samples': 7556608, 'steps': 14758, 'batch_loss/train': 0.9296991126611829} 12/24/2021 22:07:30 - INFO - codeparrot_training - Step 14759: {'lr': 0.00040154289879556517, 'samples': 7557120, 'steps': 14759, 'batch_loss/train': 0.6475030835717916} 12/24/2021 22:07:40 - INFO - codeparrot_training - Step 14760: {'lr': 0.0004015303615394569, 'samples': 7557632, 'steps': 14760, 'batch_loss/train': 0.8902539489790797} 12/24/2021 22:07:54 - INFO - codeparrot_training - Step 14761: {'lr': 0.00040151782368092094, 'samples': 7558144, 'steps': 14761, 'batch_loss/train': 0.7532916201744229} 12/24/2021 22:08:05 - INFO - codeparrot_training - Step 14762: {'lr': 0.0004015052852200075, 'samples': 7558656, 'steps': 14762, 'batch_loss/train': 0.775721188634634} 12/24/2021 22:08:16 - INFO - codeparrot_training - Step 14763: {'lr': 0.00040149274615676613, 'samples': 7559168, 'steps': 14763, 'batch_loss/train': 0.8039434894453734} 12/24/2021 22:08:28 - INFO - codeparrot_training - Step 14764: {'lr': 0.0004014802064912469, 'samples': 7559680, 'steps': 14764, 'batch_loss/train': 0.7805453431792557} 12/24/2021 22:08:39 - INFO - codeparrot_training - Step 14765: {'lr': 0.0004014676662234995, 'samples': 7560192, 'steps': 14765, 'batch_loss/train': 0.8478530710563064} 12/24/2021 22:08:49 - INFO - codeparrot_training - Step 14766: {'lr': 0.00040145512535357387, 'samples': 7560704, 'steps': 14766, 'batch_loss/train': 0.7502506170421839} 12/24/2021 22:09:03 - INFO - codeparrot_training - Step 14767: {'lr': 0.0004014425838815198, 'samples': 7561216, 'steps': 14767, 'batch_loss/train': 0.8349290792830288} 12/24/2021 22:09:14 - INFO - codeparrot_training - Step 14768: {'lr': 0.0004014300418073873, 'samples': 7561728, 'steps': 14768, 'batch_loss/train': 0.8687594896182418} 12/24/2021 22:09:24 - INFO - codeparrot_training - Step 14769: {'lr': 0.00040141749913122604, 'samples': 7562240, 'steps': 14769, 'batch_loss/train': 0.8419452169910073} 12/24/2021 22:09:35 - INFO - 
codeparrot_training - Step 14770: {'lr': 0.000401404955853086, 'samples': 7562752, 'steps': 14770, 'batch_loss/train': 0.8517233012244105} 12/24/2021 22:09:47 - INFO - codeparrot_training - Step 14771: {'lr': 0.00040139241197301715, 'samples': 7563264, 'steps': 14771, 'batch_loss/train': 0.8110721465200186} 12/24/2021 22:09:57 - INFO - codeparrot_training - Step 14772: {'lr': 0.00040137986749106904, 'samples': 7563776, 'steps': 14772, 'batch_loss/train': 0.8260789783671498} 12/24/2021 22:10:08 - INFO - codeparrot_training - Step 14773: {'lr': 0.00040136732240729186, 'samples': 7564288, 'steps': 14773, 'batch_loss/train': 0.6609376836568117} 12/24/2021 22:10:20 - INFO - codeparrot_training - Step 14774: {'lr': 0.00040135477672173525, 'samples': 7564800, 'steps': 14774, 'batch_loss/train': 0.770225047133863} 12/24/2021 22:10:31 - INFO - codeparrot_training - Step 14775: {'lr': 0.0004013422304344493, 'samples': 7565312, 'steps': 14775, 'batch_loss/train': 0.7926526321098208} 12/24/2021 22:10:41 - INFO - codeparrot_training - Step 14776: {'lr': 0.00040132968354548374, 'samples': 7565824, 'steps': 14776, 'batch_loss/train': 0.7241932116448879} 12/24/2021 22:10:55 - INFO - codeparrot_training - Step 14777: {'lr': 0.00040131713605488854, 'samples': 7566336, 'steps': 14777, 'batch_loss/train': 0.8546199640259147} 12/24/2021 22:11:06 - INFO - codeparrot_training - Step 14778: {'lr': 0.0004013045879627135, 'samples': 7566848, 'steps': 14778, 'batch_loss/train': 0.8242675373330712} 12/24/2021 22:11:16 - INFO - codeparrot_training - Step 14779: {'lr': 0.0004012920392690086, 'samples': 7567360, 'steps': 14779, 'batch_loss/train': 0.8311926331371069} 12/24/2021 22:11:27 - INFO - codeparrot_training - Step 14780: {'lr': 0.00040127948997382355, 'samples': 7567872, 'steps': 14780, 'batch_loss/train': 0.8115632841363549} 12/24/2021 22:11:39 - INFO - codeparrot_training - Step 14781: {'lr': 0.0004012669400772085, 'samples': 7568384, 'steps': 14781, 'batch_loss/train': 
0.849894504994154} 12/24/2021 22:11:50 - INFO - codeparrot_training - Step 14782: {'lr': 0.0004012543895792132, 'samples': 7568896, 'steps': 14782, 'batch_loss/train': 0.7833979772403836} 12/24/2021 22:12:00 - INFO - codeparrot_training - Step 14783: {'lr': 0.0004012418384798875, 'samples': 7569408, 'steps': 14783, 'batch_loss/train': 0.979450412094593} 12/24/2021 22:12:12 - INFO - codeparrot_training - Step 14784: {'lr': 0.0004012292867792814, 'samples': 7569920, 'steps': 14784, 'batch_loss/train': 0.8685245225206017} 12/24/2021 22:12:23 - INFO - codeparrot_training - Step 14785: {'lr': 0.0004012167344774447, 'samples': 7570432, 'steps': 14785, 'batch_loss/train': 0.8927674684673548} 12/24/2021 22:12:33 - INFO - codeparrot_training - Step 14786: {'lr': 0.0004012041815744275, 'samples': 7570944, 'steps': 14786, 'batch_loss/train': 0.9062859397381544} 12/24/2021 22:12:46 - INFO - codeparrot_training - Step 14787: {'lr': 0.00040119162807027945, 'samples': 7571456, 'steps': 14787, 'batch_loss/train': 0.9237598236650229} 12/24/2021 22:12:56 - INFO - codeparrot_training - Step 14788: {'lr': 0.0004011790739650506, 'samples': 7571968, 'steps': 14788, 'batch_loss/train': 0.7106257025152445} 12/24/2021 22:13:07 - INFO - codeparrot_training - Step 14789: {'lr': 0.00040116651925879075, 'samples': 7572480, 'steps': 14789, 'batch_loss/train': 0.8307496686466038} 12/24/2021 22:13:21 - INFO - codeparrot_training - Step 14790: {'lr': 0.00040115396395155, 'samples': 7572992, 'steps': 14790, 'batch_loss/train': 0.8290072530508041} 12/24/2021 22:13:32 - INFO - codeparrot_training - Step 14791: {'lr': 0.00040114140804337815, 'samples': 7573504, 'steps': 14791, 'batch_loss/train': 0.8805717076174915} 12/24/2021 22:13:42 - INFO - codeparrot_training - Step 14792: {'lr': 0.0004011288515343251, 'samples': 7574016, 'steps': 14792, 'batch_loss/train': 0.840916465036571} 12/24/2021 22:13:53 - INFO - codeparrot_training - Step 14793: {'lr': 0.00040111629442444085, 'samples': 7574528, 'steps': 
14793, 'batch_loss/train': 0.7815494351089001} 12/24/2021 22:14:05 - INFO - codeparrot_training - Step 14794: {'lr': 0.00040110373671377526, 'samples': 7575040, 'steps': 14794, 'batch_loss/train': 0.8226264868862927} 12/24/2021 22:14:16 - INFO - codeparrot_training - Step 14795: {'lr': 0.0004010911784023782, 'samples': 7575552, 'steps': 14795, 'batch_loss/train': 0.8212643876904622} 12/24/2021 22:14:27 - INFO - codeparrot_training - Step 14796: {'lr': 0.0004010786194902998, 'samples': 7576064, 'steps': 14796, 'batch_loss/train': 0.8094999364111573} 12/24/2021 22:14:39 - INFO - codeparrot_training - Step 14797: {'lr': 0.00040106605997758973, 'samples': 7576576, 'steps': 14797, 'batch_loss/train': 0.7515414047520608} 12/24/2021 22:14:49 - INFO - codeparrot_training - Step 14798: {'lr': 0.0004010534998642981, 'samples': 7577088, 'steps': 14798, 'batch_loss/train': 0.8203367255628109} 12/24/2021 22:15:00 - INFO - codeparrot_training - Step 14799: {'lr': 0.0004010409391504748, 'samples': 7577600, 'steps': 14799, 'batch_loss/train': 0.7644413465168327} 12/24/2021 22:15:10 - INFO - codeparrot_training - Step 14800: {'lr': 0.0004010283778361697, 'samples': 7578112, 'steps': 14800, 'batch_loss/train': 0.8560554499272257} 12/24/2021 22:15:24 - INFO - codeparrot_training - Step 14801: {'lr': 0.0004010158159214328, 'samples': 7578624, 'steps': 14801, 'batch_loss/train': 0.8593790698796511} 12/24/2021 22:15:35 - INFO - codeparrot_training - Step 14802: {'lr': 0.0004010032534063141, 'samples': 7579136, 'steps': 14802, 'batch_loss/train': 0.7521167425438762} 12/24/2021 22:15:46 - INFO - codeparrot_training - Step 14803: {'lr': 0.0004009906902908634, 'samples': 7579648, 'steps': 14803, 'batch_loss/train': 0.7800775719806552} 12/24/2021 22:15:58 - INFO - codeparrot_training - Step 14804: {'lr': 0.0004009781265751308, 'samples': 7580160, 'steps': 14804, 'batch_loss/train': 0.7598210019059479} 12/24/2021 22:16:08 - INFO - codeparrot_training - Step 14805: {'lr': 
0.0004009655622591661, 'samples': 7580672, 'steps': 14805, 'batch_loss/train': 0.891825657337904} 12/24/2021 22:16:19 - INFO - codeparrot_training - Step 14806: {'lr': 0.0004009529973430194, 'samples': 7581184, 'steps': 14806, 'batch_loss/train': 0.7223222032189369} 12/24/2021 22:16:33 - INFO - codeparrot_training - Step 14807: {'lr': 0.00040094043182674056, 'samples': 7581696, 'steps': 14807, 'batch_loss/train': 0.8287630979903042} 12/24/2021 22:16:43 - INFO - codeparrot_training - Step 14808: {'lr': 0.0004009278657103796, 'samples': 7582208, 'steps': 14808, 'batch_loss/train': 0.8132887948304415} 12/24/2021 22:16:54 - INFO - codeparrot_training - Step 14809: {'lr': 0.00040091529899398637, 'samples': 7582720, 'steps': 14809, 'batch_loss/train': 1.4020000686869025} 12/24/2021 22:17:04 - INFO - codeparrot_training - Step 14810: {'lr': 0.00040090273167761093, 'samples': 7583232, 'steps': 14810, 'batch_loss/train': 1.1368119465187192} 12/24/2021 22:17:17 - INFO - codeparrot_training - Step 14811: {'lr': 0.0004008901637613032, 'samples': 7583744, 'steps': 14811, 'batch_loss/train': 0.7732313359156251} 12/24/2021 22:17:27 - INFO - codeparrot_training - Step 14812: {'lr': 0.00040087759524511313, 'samples': 7584256, 'steps': 14812, 'batch_loss/train': 0.7297165528871119} 12/24/2021 22:17:38 - INFO - codeparrot_training - Step 14813: {'lr': 0.00040086502612909073, 'samples': 7584768, 'steps': 14813, 'batch_loss/train': 0.8917263466864824} 12/24/2021 22:17:50 - INFO - codeparrot_training - Step 14814: {'lr': 0.00040085245641328595, 'samples': 7585280, 'steps': 14814, 'batch_loss/train': 0.7104241019114852} 12/24/2021 22:18:01 - INFO - codeparrot_training - Step 14815: {'lr': 0.0004008398860977487, 'samples': 7585792, 'steps': 14815, 'batch_loss/train': 0.8470465801656246} 12/24/2021 22:18:12 - INFO - codeparrot_training - Step 14816: {'lr': 0.0004008273151825291, 'samples': 7586304, 'steps': 14816, 'batch_loss/train': 1.0927011175081134} 12/24/2021 22:18:25 - INFO - 
codeparrot_training - Step 14817: {'lr': 0.000400814743667677, 'samples': 7586816, 'steps': 14817, 'batch_loss/train': 0.8689357065595686} 12/24/2021 22:18:36 - INFO - codeparrot_training - Step 14818: {'lr': 0.00040080217155324244, 'samples': 7587328, 'steps': 14818, 'batch_loss/train': 0.7834439263679087} 12/24/2021 22:18:47 - INFO - codeparrot_training - Step 14819: {'lr': 0.00040078959883927545, 'samples': 7587840, 'steps': 14819, 'batch_loss/train': 0.8272016411647201} 12/24/2021 22:18:57 - INFO - codeparrot_training - Step 14820: {'lr': 0.00040077702552582587, 'samples': 7588352, 'steps': 14820, 'batch_loss/train': 0.7530165971256793} 12/24/2021 22:19:09 - INFO - codeparrot_training - Step 14821: {'lr': 0.0004007644516129438, 'samples': 7588864, 'steps': 14821, 'batch_loss/train': 0.8085882337763906} 12/24/2021 22:19:20 - INFO - codeparrot_training - Step 14822: {'lr': 0.00040075187710067917, 'samples': 7589376, 'steps': 14822, 'batch_loss/train': 0.9034493928775191} 12/24/2021 22:19:31 - INFO - codeparrot_training - Step 14823: {'lr': 0.000400739301989082, 'samples': 7589888, 'steps': 14823, 'batch_loss/train': 0.7958934297785163} 12/24/2021 22:19:44 - INFO - codeparrot_training - Step 14824: {'lr': 0.0004007267262782023, 'samples': 7590400, 'steps': 14824, 'batch_loss/train': 1.5059992801398039} 12/24/2021 22:19:55 - INFO - codeparrot_training - Step 14825: {'lr': 0.00040071414996809007, 'samples': 7590912, 'steps': 14825, 'batch_loss/train': 0.8437318587675691} 12/24/2021 22:20:06 - INFO - codeparrot_training - Step 14826: {'lr': 0.00040070157305879524, 'samples': 7591424, 'steps': 14826, 'batch_loss/train': 0.859690835699439} 12/24/2021 22:20:18 - INFO - codeparrot_training - Step 14827: {'lr': 0.00040068899555036785, 'samples': 7591936, 'steps': 14827, 'batch_loss/train': 0.7792858814354986} 12/24/2021 22:20:29 - INFO - codeparrot_training - Step 14828: {'lr': 0.000400676417442858, 'samples': 7592448, 'steps': 14828, 'batch_loss/train': 
0.8596856170333922} 12/24/2021 22:20:39 - INFO - codeparrot_training - Step 14829: {'lr': 0.00040066383873631553, 'samples': 7592960, 'steps': 14829, 'batch_loss/train': 0.8437265520915389} 12/24/2021 22:20:50 - INFO - codeparrot_training - Step 14830: {'lr': 0.0004006512594307905, 'samples': 7593472, 'steps': 14830, 'batch_loss/train': 0.9952767066424713} 12/24/2021 22:21:02 - INFO - codeparrot_training - Step 14831: {'lr': 0.000400638679526333, 'samples': 7593984, 'steps': 14831, 'batch_loss/train': 0.7796980463899672} 12/24/2021 22:21:12 - INFO - codeparrot_training - Step 14832: {'lr': 0.000400626099022993, 'samples': 7594496, 'steps': 14832, 'batch_loss/train': 0.8017937773838639} 12/24/2021 22:21:23 - INFO - codeparrot_training - Step 14833: {'lr': 0.00040061351792082046, 'samples': 7595008, 'steps': 14833, 'batch_loss/train': 0.7646425175480545} 12/24/2021 22:21:35 - INFO - codeparrot_training - Step 14834: {'lr': 0.0004006009362198655, 'samples': 7595520, 'steps': 14834, 'batch_loss/train': 0.8063022396527231} 12/24/2021 22:21:46 - INFO - codeparrot_training - Step 14835: {'lr': 0.000400588353920178, 'samples': 7596032, 'steps': 14835, 'batch_loss/train': 0.7533129518851638} 12/24/2021 22:21:56 - INFO - codeparrot_training - Step 14836: {'lr': 0.0004005757710218082, 'samples': 7596544, 'steps': 14836, 'batch_loss/train': 0.7866422403603792} 12/24/2021 22:22:10 - INFO - codeparrot_training - Step 14837: {'lr': 0.00040056318752480584, 'samples': 7597056, 'steps': 14837, 'batch_loss/train': 0.6565287751145661} 12/24/2021 22:22:21 - INFO - codeparrot_training - Step 14838: {'lr': 0.0004005506034292212, 'samples': 7597568, 'steps': 14838, 'batch_loss/train': 0.9078730633482337} 12/24/2021 22:22:32 - INFO - codeparrot_training - Step 14839: {'lr': 0.00040053801873510416, 'samples': 7598080, 'steps': 14839, 'batch_loss/train': 0.6661032875999808} 12/24/2021 22:22:42 - INFO - codeparrot_training - Step 14840: {'lr': 0.0004005254334425048, 'samples': 7598592, 
'steps': 14840, 'batch_loss/train': 0.8580302838236094} 12/24/2021 22:22:54 - INFO - codeparrot_training - Step 14841: {'lr': 0.0004005128475514732, 'samples': 7599104, 'steps': 14841, 'batch_loss/train': 0.8213707860559225} 12/24/2021 22:23:05 - INFO - codeparrot_training - Step 14842: {'lr': 0.00040050026106205926, 'samples': 7599616, 'steps': 14842, 'batch_loss/train': 0.8812612863257527} 12/24/2021 22:23:15 - INFO - codeparrot_training - Step 14843: {'lr': 0.0004004876739743132, 'samples': 7600128, 'steps': 14843, 'batch_loss/train': 0.8590751877054572} 12/24/2021 22:23:27 - INFO - codeparrot_training - Step 14844: {'lr': 0.0004004750862882849, 'samples': 7600640, 'steps': 14844, 'batch_loss/train': 0.7665859599364921} 12/24/2021 22:23:38 - INFO - codeparrot_training - Step 14845: {'lr': 0.0004004624980040245, 'samples': 7601152, 'steps': 14845, 'batch_loss/train': 0.8229421116411686} 12/24/2021 22:23:49 - INFO - codeparrot_training - Step 14846: {'lr': 0.00040044990912158204, 'samples': 7601664, 'steps': 14846, 'batch_loss/train': 0.730053863953799} 12/24/2021 22:24:02 - INFO - codeparrot_training - Step 14847: {'lr': 0.00040043731964100747, 'samples': 7602176, 'steps': 14847, 'batch_loss/train': 0.8180617094039917} 12/24/2021 22:24:13 - INFO - codeparrot_training - Step 14848: {'lr': 0.000400424729562351, 'samples': 7602688, 'steps': 14848, 'batch_loss/train': 0.7661726045189425} 12/24/2021 22:24:24 - INFO - codeparrot_training - Step 14849: {'lr': 0.00040041213888566263, 'samples': 7603200, 'steps': 14849, 'batch_loss/train': 0.7981159146875143} 12/24/2021 22:24:34 - INFO - codeparrot_training - Step 14850: {'lr': 0.00040039954761099226, 'samples': 7603712, 'steps': 14850, 'batch_loss/train': 0.7678952448768541} 12/24/2021 22:24:46 - INFO - codeparrot_training - Step 14851: {'lr': 0.00040038695573839026, 'samples': 7604224, 'steps': 14851, 'batch_loss/train': 0.8113216292113066} 12/24/2021 22:24:57 - INFO - codeparrot_training - Step 14852: {'lr': 
0.0004003743632679063, 'samples': 7604736, 'steps': 14852, 'batch_loss/train': 0.7680471939966083} 12/24/2021 22:25:07 - INFO - codeparrot_training - Step 14853: {'lr': 0.0004003617701995908, 'samples': 7605248, 'steps': 14853, 'batch_loss/train': 0.7215701953973621} 12/24/2021 22:25:21 - INFO - codeparrot_training - Step 14854: {'lr': 0.0004003491765334937, 'samples': 7605760, 'steps': 14854, 'batch_loss/train': 0.6879022109205835} 12/24/2021 22:25:32 - INFO - codeparrot_training - Step 14855: {'lr': 0.0004003365822696649, 'samples': 7606272, 'steps': 14855, 'batch_loss/train': 0.8165665538981557} 12/24/2021 22:25:42 - INFO - codeparrot_training - Step 14856: {'lr': 0.0004003239874081547, 'samples': 7606784, 'steps': 14856, 'batch_loss/train': 0.8529785564169288} 12/24/2021 22:25:54 - INFO - codeparrot_training - Step 14857: {'lr': 0.0004003113919490131, 'samples': 7607296, 'steps': 14857, 'batch_loss/train': 0.7673701131716371} 12/24/2021 22:26:05 - INFO - codeparrot_training - Step 14858: {'lr': 0.0004002987958922901, 'samples': 7607808, 'steps': 14858, 'batch_loss/train': 0.8257623799145222} 12/24/2021 22:26:15 - INFO - codeparrot_training - Step 14859: {'lr': 0.00040028619923803587, 'samples': 7608320, 'steps': 14859, 'batch_loss/train': 0.7897538165561855} 12/24/2021 22:26:27 - INFO - codeparrot_training - Step 14860: {'lr': 0.0004002736019863005, 'samples': 7608832, 'steps': 14860, 'batch_loss/train': 0.8202797323465347} 12/24/2021 22:26:38 - INFO - codeparrot_training - Step 14861: {'lr': 0.000400261004137134, 'samples': 7609344, 'steps': 14861, 'batch_loss/train': 0.8010152019560337} 12/24/2021 22:26:48 - INFO - codeparrot_training - Step 14862: {'lr': 0.0004002484056905865, 'samples': 7609856, 'steps': 14862, 'batch_loss/train': 0.8140564800705761} 12/24/2021 22:26:59 - INFO - codeparrot_training - Step 14863: {'lr': 0.000400235806646708, 'samples': 7610368, 'steps': 14863, 'batch_loss/train': 0.8490180997177958} 12/24/2021 22:27:13 - INFO - 
codeparrot_training - Step 14864: {'lr': 0.0004002232070055488, 'samples': 7610880, 'steps': 14864, 'batch_loss/train': 0.7839597379788756} 12/24/2021 22:27:24 - INFO - codeparrot_training - Step 14865: {'lr': 0.0004002106067671587, 'samples': 7611392, 'steps': 14865, 'batch_loss/train': 0.7881908644922078} 12/24/2021 22:27:34 - INFO - codeparrot_training - Step 14866: {'lr': 0.000400198005931588, 'samples': 7611904, 'steps': 14866, 'batch_loss/train': 0.7450574864633381} 12/24/2021 22:27:46 - INFO - codeparrot_training - Step 14867: {'lr': 0.00040018540449888676, 'samples': 7612416, 'steps': 14867, 'batch_loss/train': 0.7937466034200042} 12/24/2021 22:27:57 - INFO - codeparrot_training - Step 14868: {'lr': 0.000400172802469105, 'samples': 7612928, 'steps': 14868, 'batch_loss/train': 0.7303532278165221} 12/24/2021 22:28:07 - INFO - codeparrot_training - Step 14869: {'lr': 0.00040016019984229304, 'samples': 7613440, 'steps': 14869, 'batch_loss/train': 0.7804905446246266} 12/24/2021 22:28:19 - INFO - codeparrot_training - Step 14870: {'lr': 0.00040014759661850065, 'samples': 7613952, 'steps': 14870, 'batch_loss/train': 0.8562190802767873} 12/24/2021 22:28:30 - INFO - codeparrot_training - Step 14871: {'lr': 0.0004001349927977782, 'samples': 7614464, 'steps': 14871, 'batch_loss/train': 0.8433669060468674} 12/24/2021 22:28:41 - INFO - codeparrot_training - Step 14872: {'lr': 0.00040012238838017566, 'samples': 7614976, 'steps': 14872, 'batch_loss/train': 0.8049775357358158} 12/24/2021 22:28:51 - INFO - codeparrot_training - Step 14873: {'lr': 0.0004001097833657432, 'samples': 7615488, 'steps': 14873, 'batch_loss/train': 0.887266818434} 12/24/2021 22:29:03 - INFO - codeparrot_training - Step 14874: {'lr': 0.0004000971777545309, 'samples': 7616000, 'steps': 14874, 'batch_loss/train': 0.7063533149193972} 12/24/2021 22:29:14 - INFO - codeparrot_training - Step 14875: {'lr': 0.00040008457154658896, 'samples': 7616512, 'steps': 14875, 'batch_loss/train': 0.8169862143695354} 
12/24/2021 22:29:25 - INFO - codeparrot_training - Step 14876: {'lr': 0.0004000719647419674, 'samples': 7617024, 'steps': 14876, 'batch_loss/train': 0.802849605679512} 12/24/2021 22:29:38 - INFO - codeparrot_training - Step 14877: {'lr': 0.00040005935734071634, 'samples': 7617536, 'steps': 14877, 'batch_loss/train': 0.7765915418276563} 12/24/2021 22:29:49 - INFO - codeparrot_training - Step 14878: {'lr': 0.000400046749342886, 'samples': 7618048, 'steps': 14878, 'batch_loss/train': 0.8169589564204216} 12/24/2021 22:30:00 - INFO - codeparrot_training - Step 14879: {'lr': 0.00040003414074852634, 'samples': 7618560, 'steps': 14879, 'batch_loss/train': 0.8262485032901168} 12/24/2021 22:30:12 - INFO - codeparrot_training - Step 14880: {'lr': 0.00040002153155768765, 'samples': 7619072, 'steps': 14880, 'batch_loss/train': 0.7724743662402034} 12/24/2021 22:30:22 - INFO - codeparrot_training - Step 14881: {'lr': 0.00040000892177042, 'samples': 7619584, 'steps': 14881, 'batch_loss/train': 0.8182650813832879} 12/24/2021 22:30:33 - INFO - codeparrot_training - Step 14882: {'lr': 0.0003999963113867735, 'samples': 7620096, 'steps': 14882, 'batch_loss/train': 0.6956731197424233} 12/24/2021 22:30:43 - INFO - codeparrot_training - Step 14883: {'lr': 0.00039998370040679834, 'samples': 7620608, 'steps': 14883, 'batch_loss/train': 0.8173117274418473} 12/24/2021 22:30:57 - INFO - codeparrot_training - Step 14884: {'lr': 0.00039997108883054464, 'samples': 7621120, 'steps': 14884, 'batch_loss/train': 0.8768704673275352} 12/24/2021 22:31:08 - INFO - codeparrot_training - Step 14885: {'lr': 0.0003999584766580624, 'samples': 7621632, 'steps': 14885, 'batch_loss/train': 0.8246625391766429} 12/24/2021 22:31:18 - INFO - codeparrot_training - Step 14886: {'lr': 0.00039994586388940204, 'samples': 7622144, 'steps': 14886, 'batch_loss/train': 0.7943075348157436} 12/24/2021 22:31:30 - INFO - codeparrot_training - Step 14887: {'lr': 0.00039993325052461336, 'samples': 7622656, 'steps': 14887, 
'batch_loss/train': 0.7713945617433637} 12/24/2021 22:31:41 - INFO - codeparrot_training - Step 14888: {'lr': 0.0003999206365637468, 'samples': 7623168, 'steps': 14888, 'batch_loss/train': 0.8313033310696483} 12/24/2021 22:31:52 - INFO - codeparrot_training - Step 14889: {'lr': 0.0003999080220068524, 'samples': 7623680, 'steps': 14889, 'batch_loss/train': 0.7187997926957905} 12/24/2021 22:32:04 - INFO - codeparrot_training - Step 14890: {'lr': 0.0003998954068539803, 'samples': 7624192, 'steps': 14890, 'batch_loss/train': 0.8642866935115308} 12/24/2021 22:32:14 - INFO - codeparrot_training - Step 14891: {'lr': 0.00039988279110518066, 'samples': 7624704, 'steps': 14891, 'batch_loss/train': 0.7275898745283484} 12/24/2021 22:32:25 - INFO - codeparrot_training - Step 14892: {'lr': 0.00039987017476050357, 'samples': 7625216, 'steps': 14892, 'batch_loss/train': 0.8561246804893017} 12/24/2021 22:32:39 - INFO - codeparrot_training - Step 14893: {'lr': 0.00039985755781999933, 'samples': 7625728, 'steps': 14893, 'batch_loss/train': 0.8180895773693919} 12/24/2021 22:32:49 - INFO - codeparrot_training - Step 14894: {'lr': 0.00039984494028371797, 'samples': 7626240, 'steps': 14894, 'batch_loss/train': 0.7310387641191483} 12/24/2021 22:33:00 - INFO - codeparrot_training - Step 14895: {'lr': 0.0003998323221517098, 'samples': 7626752, 'steps': 14895, 'batch_loss/train': 0.8623589952476323} 12/24/2021 22:33:10 - INFO - codeparrot_training - Step 14896: {'lr': 0.00039981970342402485, 'samples': 7627264, 'steps': 14896, 'batch_loss/train': 0.9010824132710695} 12/24/2021 22:33:23 - INFO - codeparrot_training - Step 14897: {'lr': 0.00039980708410071333, 'samples': 7627776, 'steps': 14897, 'batch_loss/train': 0.9831913104280829} 12/24/2021 22:33:34 - INFO - codeparrot_training - Step 14898: {'lr': 0.0003997944641818253, 'samples': 7628288, 'steps': 14898, 'batch_loss/train': 0.948082908289507} 12/24/2021 22:33:44 - INFO - codeparrot_training - Step 14899: {'lr': 0.0003997818436674112, 
'samples': 7628800, 'steps': 14899, 'batch_loss/train': 0.8376731583848596} 12/24/2021 22:33:56 - INFO - codeparrot_training - Step 14900: {'lr': 0.00039976922255752093, 'samples': 7629312, 'steps': 14900, 'batch_loss/train': 0.8369915243238211} 12/24/2021 22:34:07 - INFO - codeparrot_training - Step 14901: {'lr': 0.0003997566008522048, 'samples': 7629824, 'steps': 14901, 'batch_loss/train': 0.9597814227454364} 12/24/2021 22:34:17 - INFO - codeparrot_training - Step 14902: {'lr': 0.000399743978551513, 'samples': 7630336, 'steps': 14902, 'batch_loss/train': 0.8145941118709743} 12/24/2021 22:34:28 - INFO - codeparrot_training - Step 14903: {'lr': 0.0003997313556554957, 'samples': 7630848, 'steps': 14903, 'batch_loss/train': 0.7989378133788705} 12/24/2021 22:34:42 - INFO - codeparrot_training - Step 14904: {'lr': 0.0003997187321642031, 'samples': 7631360, 'steps': 14904, 'batch_loss/train': 0.7855122238397598} 12/24/2021 22:34:52 - INFO - codeparrot_training - Step 14905: {'lr': 0.00039970610807768525, 'samples': 7631872, 'steps': 14905, 'batch_loss/train': 0.8552204556763172} 12/24/2021 22:35:03 - INFO - codeparrot_training - Step 14906: {'lr': 0.00039969348339599246, 'samples': 7632384, 'steps': 14906, 'batch_loss/train': 0.7772174868732691} 12/24/2021 22:35:15 - INFO - codeparrot_training - Step 14907: {'lr': 0.00039968085811917497, 'samples': 7632896, 'steps': 14907, 'batch_loss/train': 0.8056781715713441} 12/24/2021 22:35:26 - INFO - codeparrot_training - Step 14908: {'lr': 0.00039966823224728287, 'samples': 7633408, 'steps': 14908, 'batch_loss/train': 0.9317420329898596} 12/24/2021 22:35:36 - INFO - codeparrot_training - Step 14909: {'lr': 0.0003996556057803664, 'samples': 7633920, 'steps': 14909, 'batch_loss/train': 0.7913907701149583} 12/24/2021 22:35:48 - INFO - codeparrot_training - Step 14910: {'lr': 0.00039964297871847575, 'samples': 7634432, 'steps': 14910, 'batch_loss/train': 0.7517104977741838} 12/24/2021 22:35:59 - INFO - codeparrot_training - Step 
14911: {'lr': 0.00039963035106166107, 'samples': 7634944, 'steps': 14911, 'batch_loss/train': 0.9379186378791928} 12/24/2021 22:36:10 - INFO - codeparrot_training - Step 14912: {'lr': 0.00039961772280997263, 'samples': 7635456, 'steps': 14912, 'batch_loss/train': 0.7479285253211856} 12/24/2021 22:36:22 - INFO - codeparrot_training - Step 14913: {'lr': 0.0003996050939634607, 'samples': 7635968, 'steps': 14913, 'batch_loss/train': 0.840595243498683} 12/24/2021 22:36:32 - INFO - codeparrot_training - Step 14914: {'lr': 0.00039959246452217533, 'samples': 7636480, 'steps': 14914, 'batch_loss/train': 0.8410772574134171} 12/24/2021 22:36:43 - INFO - codeparrot_training - Step 14915: {'lr': 0.0003995798344861668, 'samples': 7636992, 'steps': 14915, 'batch_loss/train': 0.7583388611674309} 12/24/2021 22:36:54 - INFO - codeparrot_training - Step 14916: {'lr': 0.00039956720385548526, 'samples': 7637504, 'steps': 14916, 'batch_loss/train': 0.7782814704114571} 12/24/2021 22:37:07 - INFO - codeparrot_training - Step 14917: {'lr': 0.0003995545726301811, 'samples': 7638016, 'steps': 14917, 'batch_loss/train': 0.7786327949725091} 12/24/2021 22:37:18 - INFO - codeparrot_training - Step 14918: {'lr': 0.00039954194081030445, 'samples': 7638528, 'steps': 14918, 'batch_loss/train': 0.8110890993848443} 12/24/2021 22:37:29 - INFO - codeparrot_training - Step 14919: {'lr': 0.00039952930839590547, 'samples': 7639040, 'steps': 14919, 'batch_loss/train': 0.782548317569308} 12/24/2021 22:37:41 - INFO - codeparrot_training - Step 14920: {'lr': 0.0003995166753870344, 'samples': 7639552, 'steps': 14920, 'batch_loss/train': 0.7225279211997986} 12/24/2021 22:37:52 - INFO - codeparrot_training - Step 14921: {'lr': 0.0003995040417837415, 'samples': 7640064, 'steps': 14921, 'batch_loss/train': 0.7985578309744596} 12/24/2021 22:38:02 - INFO - codeparrot_training - Step 14922: {'lr': 0.000399491407586077, 'samples': 7640576, 'steps': 14922, 'batch_loss/train': 0.8664869396016002} 12/24/2021 22:38:16 - 
INFO - codeparrot_training - Step 14923: {'lr': 0.0003994787727940911, 'samples': 7641088, 'steps': 14923, 'batch_loss/train': 0.7197042619809508} 12/24/2021 22:38:27 - INFO - codeparrot_training - Step 14924: {'lr': 0.000399466137407834, 'samples': 7641600, 'steps': 14924, 'batch_loss/train': 0.8584628775715828} 12/24/2021 22:38:37 - INFO - codeparrot_training - Step 14925: {'lr': 0.00039945350142735604, 'samples': 7642112, 'steps': 14925, 'batch_loss/train': 0.7831713948398829} 12/24/2021 22:38:48 - INFO - codeparrot_training - Step 14926: {'lr': 0.0003994408648527073, 'samples': 7642624, 'steps': 14926, 'batch_loss/train': 0.7987029626965523} 12/24/2021 22:39:00 - INFO - codeparrot_training - Step 14927: {'lr': 0.0003994282276839382, 'samples': 7643136, 'steps': 14927, 'batch_loss/train': 0.9441551715135574} 12/24/2021 22:39:11 - INFO - codeparrot_training - Step 14928: {'lr': 0.0003994155899210988, 'samples': 7643648, 'steps': 14928, 'batch_loss/train': 0.658662679605186} 12/24/2021 22:39:21 - INFO - codeparrot_training - Step 14929: {'lr': 0.00039940295156423944, 'samples': 7644160, 'steps': 14929, 'batch_loss/train': 0.8753558411262929} 12/24/2021 22:39:34 - INFO - codeparrot_training - Step 14930: {'lr': 0.00039939031261341043, 'samples': 7644672, 'steps': 14930, 'batch_loss/train': 0.8455679705366492} 12/24/2021 22:39:44 - INFO - codeparrot_training - Step 14931: {'lr': 0.0003993776730686619, 'samples': 7645184, 'steps': 14931, 'batch_loss/train': 0.8144983602687716} 12/24/2021 22:39:55 - INFO - codeparrot_training - Step 14932: {'lr': 0.00039936503293004416, 'samples': 7645696, 'steps': 14932, 'batch_loss/train': 0.5203846454387531} 12/24/2021 22:40:08 - INFO - codeparrot_training - Step 14933: {'lr': 0.00039935239219760746, 'samples': 7646208, 'steps': 14933, 'batch_loss/train': 0.7989084664732218} 12/24/2021 22:40:18 - INFO - codeparrot_training - Step 14934: {'lr': 0.0003993397508714021, 'samples': 7646720, 'steps': 14934, 'batch_loss/train': 
0.7818194208666682} 12/24/2021 22:40:29 - INFO - codeparrot_training - Step 14935: {'lr': 0.0003993271089514781, 'samples': 7647232, 'steps': 14935, 'batch_loss/train': 0.6784079452045262} 12/24/2021 22:40:39 - INFO - codeparrot_training - Step 14936: {'lr': 0.00039931446643788604, 'samples': 7647744, 'steps': 14936, 'batch_loss/train': 0.7971495534293354} 12/24/2021 22:40:54 - INFO - codeparrot_training - Step 14937: {'lr': 0.000399301823330676, 'samples': 7648256, 'steps': 14937, 'batch_loss/train': 0.8062593387439847} 12/24/2021 22:41:05 - INFO - codeparrot_training - Step 14938: {'lr': 0.0003992891796298983, 'samples': 7648768, 'steps': 14938, 'batch_loss/train': 0.8546439072815701} 12/24/2021 22:41:15 - INFO - codeparrot_training - Step 14939: {'lr': 0.00039927653533560316, 'samples': 7649280, 'steps': 14939, 'batch_loss/train': 0.7384168040007353} 12/24/2021 22:41:27 - INFO - codeparrot_training - Step 14940: {'lr': 0.00039926389044784086, 'samples': 7649792, 'steps': 14940, 'batch_loss/train': 0.7761679757386446} 12/24/2021 22:41:38 - INFO - codeparrot_training - Step 14941: {'lr': 0.00039925124496666175, 'samples': 7650304, 'steps': 14941, 'batch_loss/train': 0.8224639587569982} 12/24/2021 22:41:49 - INFO - codeparrot_training - Step 14942: {'lr': 0.000399238598892116, 'samples': 7650816, 'steps': 14942, 'batch_loss/train': 0.7289525454398245} 12/24/2021 22:42:03 - INFO - codeparrot_training - Step 14943: {'lr': 0.00039922595222425394, 'samples': 7651328, 'steps': 14943, 'batch_loss/train': 0.9488638639450073} 12/24/2021 22:42:13 - INFO - codeparrot_training - Step 14944: {'lr': 0.0003992133049631259, 'samples': 7651840, 'steps': 14944, 'batch_loss/train': 0.886180782224983} 12/24/2021 22:42:24 - INFO - codeparrot_training - Step 14945: {'lr': 0.00039920065710878205, 'samples': 7652352, 'steps': 14945, 'batch_loss/train': 0.6940825453493744} 12/24/2021 22:42:34 - INFO - codeparrot_training - Step 14946: {'lr': 0.0003991880086612727, 'samples': 7652864, 
'steps': 14946, 'batch_loss/train': 0.7763613378629088} 12/24/2021 22:42:47 - INFO - codeparrot_training - Step 14947: {'lr': 0.00039917535962064824, 'samples': 7653376, 'steps': 14947, 'batch_loss/train': 0.7274105120450258} 12/24/2021 22:42:58 - INFO - codeparrot_training - Step 14948: {'lr': 0.0003991627099869588, 'samples': 7653888, 'steps': 14948, 'batch_loss/train': 0.7751791313057765} 12/24/2021 22:43:08 - INFO - codeparrot_training - Step 14949: {'lr': 0.00039915005976025477, 'samples': 7654400, 'steps': 14949, 'batch_loss/train': 1.1046269834041595} 12/24/2021 22:43:20 - INFO - codeparrot_training - Step 14950: {'lr': 0.0003991374089405864, 'samples': 7654912, 'steps': 14950, 'batch_loss/train': 0.7606855486519635} 12/24/2021 22:43:31 - INFO - codeparrot_training - Step 14951: {'lr': 0.000399124757528004, 'samples': 7655424, 'steps': 14951, 'batch_loss/train': 0.8064826531335711} 12/24/2021 22:43:41 - INFO - codeparrot_training - Step 14952: {'lr': 0.00039911210552255795, 'samples': 7655936, 'steps': 14952, 'batch_loss/train': 0.7880838735727593} 12/24/2021 22:43:55 - INFO - codeparrot_training - Step 14953: {'lr': 0.00039909945292429836, 'samples': 7656448, 'steps': 14953, 'batch_loss/train': 0.7198272217065096} 12/24/2021 22:44:06 - INFO - codeparrot_training - Step 14954: {'lr': 0.0003990867997332758, 'samples': 7656960, 'steps': 14954, 'batch_loss/train': 0.8193459170870483} 12/24/2021 22:44:17 - INFO - codeparrot_training - Step 14955: {'lr': 0.0003990741459495403, 'samples': 7657472, 'steps': 14955, 'batch_loss/train': 0.8289420870132744} 12/24/2021 22:44:27 - INFO - codeparrot_training - Step 14956: {'lr': 0.00039906149157314225, 'samples': 7657984, 'steps': 14956, 'batch_loss/train': 0.9309912407770753} 12/24/2021 22:44:39 - INFO - codeparrot_training - Step 14957: {'lr': 0.00039904883660413205, 'samples': 7658496, 'steps': 14957, 'batch_loss/train': 0.8533587232232094} 12/24/2021 22:44:50 - INFO - codeparrot_training - Step 14958: {'lr': 
0.00039903618104256, 'samples': 7659008, 'steps': 14958, 'batch_loss/train': 0.6924831725191325} 12/24/2021 22:45:01 - INFO - codeparrot_training - Step 14959: {'lr': 0.00039902352488847625, 'samples': 7659520, 'steps': 14959, 'batch_loss/train': 0.9944223209749907} 12/24/2021 22:45:13 - INFO - codeparrot_training - Step 14960: {'lr': 0.0003990108681419313, 'samples': 7660032, 'steps': 14960, 'batch_loss/train': 0.7691072570160031} 12/24/2021 22:45:23 - INFO - codeparrot_training - Step 14961: {'lr': 0.0003989982108029754, 'samples': 7660544, 'steps': 14961, 'batch_loss/train': 0.8569365860894322} 12/24/2021 22:45:34 - INFO - codeparrot_training - Step 14962: {'lr': 0.00039898555287165897, 'samples': 7661056, 'steps': 14962, 'batch_loss/train': 0.7216180141549557} 12/24/2021 22:45:48 - INFO - codeparrot_training - Step 14963: {'lr': 0.0003989728943480321, 'samples': 7661568, 'steps': 14963, 'batch_loss/train': 0.8912443136796355} 12/24/2021 22:45:58 - INFO - codeparrot_training - Step 14964: {'lr': 0.00039896023523214525, 'samples': 7662080, 'steps': 14964, 'batch_loss/train': 0.7967049318831414} 12/24/2021 22:46:09 - INFO - codeparrot_training - Step 14965: {'lr': 0.0003989475755240488, 'samples': 7662592, 'steps': 14965, 'batch_loss/train': 0.835612652823329} 12/24/2021 22:46:20 - INFO - codeparrot_training - Step 14966: {'lr': 0.000398934915223793, 'samples': 7663104, 'steps': 14966, 'batch_loss/train': 0.7750417930074036} 12/24/2021 22:46:32 - INFO - codeparrot_training - Step 14967: {'lr': 0.00039892225433142827, 'samples': 7663616, 'steps': 14967, 'batch_loss/train': 0.8652515388093889} 12/24/2021 22:46:42 - INFO - codeparrot_training - Step 14968: {'lr': 0.00039890959284700486, 'samples': 7664128, 'steps': 14968, 'batch_loss/train': 0.8014767728745937} 12/24/2021 22:46:53 - INFO - codeparrot_training - Step 14969: {'lr': 0.0003988969307705731, 'samples': 7664640, 'steps': 14969, 'batch_loss/train': 0.8777642250061035} 12/24/2021 22:47:05 - INFO - 
codeparrot_training - Step 14970: {'lr': 0.0003988842681021834, 'samples': 7665152, 'steps': 14970, 'batch_loss/train': 0.8367453264072537} 12/24/2021 22:47:16 - INFO - codeparrot_training - Step 14971: {'lr': 0.0003988716048418861, 'samples': 7665664, 'steps': 14971, 'batch_loss/train': 0.8290500291623175} 12/24/2021 22:47:26 - INFO - codeparrot_training - Step 14972: {'lr': 0.00039885894098973143, 'samples': 7666176, 'steps': 14972, 'batch_loss/train': 0.7725702449679375} 12/24/2021 22:47:40 - INFO - codeparrot_training - Step 14973: {'lr': 0.00039884627654576985, 'samples': 7666688, 'steps': 14973, 'batch_loss/train': 0.8569014258682728} 12/24/2021 22:47:51 - INFO - codeparrot_training - Step 14974: {'lr': 0.0003988336115100517, 'samples': 7667200, 'steps': 14974, 'batch_loss/train': 0.8242682134732604} 12/24/2021 22:48:01 - INFO - codeparrot_training - Step 14975: {'lr': 0.0003988209458826273, 'samples': 7667712, 'steps': 14975, 'batch_loss/train': 0.8479360193014145} 12/24/2021 22:48:12 - INFO - codeparrot_training - Step 14976: {'lr': 0.00039880827966354705, 'samples': 7668224, 'steps': 14976, 'batch_loss/train': 0.7561071803793311} 12/24/2021 22:48:24 - INFO - codeparrot_training - Step 14977: {'lr': 0.00039879561285286114, 'samples': 7668736, 'steps': 14977, 'batch_loss/train': 0.8227624869905412} 12/24/2021 22:48:35 - INFO - codeparrot_training - Step 14978: {'lr': 0.0003987829454506202, 'samples': 7669248, 'steps': 14978, 'batch_loss/train': 0.8530326834879816} 12/24/2021 22:48:45 - INFO - codeparrot_training - Step 14979: {'lr': 0.0003987702774568743, 'samples': 7669760, 'steps': 14979, 'batch_loss/train': 0.8644650019705296} 12/24/2021 22:48:57 - INFO - codeparrot_training - Step 14980: {'lr': 0.0003987576088716741, 'samples': 7670272, 'steps': 14980, 'batch_loss/train': 0.8709141761064529} 12/24/2021 22:49:08 - INFO - codeparrot_training - Step 14981: {'lr': 0.0003987449396950698, 'samples': 7670784, 'steps': 14981, 'batch_loss/train': 
0.8841860480606556} 12/24/2021 22:49:19 - INFO - codeparrot_training - Step 14982: {'lr': 0.0003987322699271117, 'samples': 7671296, 'steps': 14982, 'batch_loss/train': 0.839030846953392} 12/24/2021 22:49:33 - INFO - codeparrot_training - Step 14983: {'lr': 0.0003987195995678503, 'samples': 7671808, 'steps': 14983, 'batch_loss/train': 0.8072294876910746} 12/24/2021 22:49:43 - INFO - codeparrot_training - Step 14984: {'lr': 0.00039870692861733594, 'samples': 7672320, 'steps': 14984, 'batch_loss/train': 0.6850467405747622} 12/24/2021 22:49:54 - INFO - codeparrot_training - Step 14985: {'lr': 0.0003986942570756189, 'samples': 7672832, 'steps': 14985, 'batch_loss/train': 0.7369942953810096} 12/24/2021 22:50:05 - INFO - codeparrot_training - Step 14986: {'lr': 0.00039868158494274974, 'samples': 7673344, 'steps': 14986, 'batch_loss/train': 0.8134306538850069} 12/24/2021 22:50:17 - INFO - codeparrot_training - Step 14987: {'lr': 0.0003986689122187787, 'samples': 7673856, 'steps': 14987, 'batch_loss/train': 0.8122547334060073} 12/24/2021 22:50:27 - INFO - codeparrot_training - Step 14988: {'lr': 0.0003986562389037562, 'samples': 7674368, 'steps': 14988, 'batch_loss/train': 0.966645794454962} 12/24/2021 22:50:38 - INFO - codeparrot_training - Step 14989: {'lr': 0.0003986435649977326, 'samples': 7674880, 'steps': 14989, 'batch_loss/train': 0.49386740755289793} 12/24/2021 22:50:50 - INFO - codeparrot_training - Step 14990: {'lr': 0.0003986308905007583, 'samples': 7675392, 'steps': 14990, 'batch_loss/train': 0.7801290457136929} 12/24/2021 22:51:01 - INFO - codeparrot_training - Step 14991: {'lr': 0.00039861821541288376, 'samples': 7675904, 'steps': 14991, 'batch_loss/train': 0.8083182186819613} 12/24/2021 22:51:11 - INFO - codeparrot_training - Step 14992: {'lr': 0.00039860553973415923, 'samples': 7676416, 'steps': 14992, 'batch_loss/train': 0.8115547096822411} 12/24/2021 22:51:25 - INFO - codeparrot_training - Step 14993: {'lr': 0.0003985928634646353, 'samples': 7676928, 
'steps': 14993, 'batch_loss/train': 0.9199220193549991} 12/24/2021 22:51:36 - INFO - codeparrot_training - Step 14994: {'lr': 0.00039858018660436217, 'samples': 7677440, 'steps': 14994, 'batch_loss/train': 0.8914810419082642} 12/24/2021 22:51:47 - INFO - codeparrot_training - Step 14995: {'lr': 0.00039856750915339036, 'samples': 7677952, 'steps': 14995, 'batch_loss/train': 0.7796540134586394} 12/24/2021 22:51:57 - INFO - codeparrot_training - Step 14996: {'lr': 0.0003985548311117702, 'samples': 7678464, 'steps': 14996, 'batch_loss/train': 0.9431326556950808} 12/24/2021 22:52:09 - INFO - codeparrot_training - Step 14997: {'lr': 0.00039854215247955214, 'samples': 7678976, 'steps': 14997, 'batch_loss/train': 0.8029667832888663} 12/24/2021 22:52:20 - INFO - codeparrot_training - Step 14998: {'lr': 0.0003985294732567866, 'samples': 7679488, 'steps': 14998, 'batch_loss/train': 0.8013834027806297} 12/24/2021 22:52:30 - INFO - codeparrot_training - Step 14999: {'lr': 0.00039851679344352385, 'samples': 7680000, 'steps': 14999, 'batch_loss/train': 0.8290996924042702} 12/24/2021 22:52:30 - INFO - codeparrot_training - Evaluating and saving model checkpoint 12/24/2021 22:55:51 - INFO - codeparrot_training - Step 15000: {'loss/eval': 0.8110089898109436, 'perplexity': 2.2501771450042725} 12/24/2021 22:56:09 - WARNING - huggingface_hub.repository - Several commits (4) will be pushed upstream. 
12/24/2021 22:56:25 - INFO - codeparrot_training - Step 15000: {'lr': 0.0003985041130398145, 'samples': 7680512, 'steps': 15000, 'batch_loss/train': 0.7323928384575993} 12/24/2021 22:56:36 - INFO - codeparrot_training - Step 15001: {'lr': 0.00039849143204570884, 'samples': 7681024, 'steps': 15001, 'batch_loss/train': 0.7468573972582817} 12/24/2021 22:56:46 - INFO - codeparrot_training - Step 15002: {'lr': 0.0003984787504612574, 'samples': 7681536, 'steps': 15002, 'batch_loss/train': 0.8788123605772853} 12/24/2021 22:56:59 - INFO - codeparrot_training - Step 15003: {'lr': 0.00039846606828651033, 'samples': 7682048, 'steps': 15003, 'batch_loss/train': 0.7611135505139828} 12/24/2021 22:57:10 - INFO - codeparrot_training - Step 15004: {'lr': 0.0003984533855215184, 'samples': 7682560, 'steps': 15004, 'batch_loss/train': 0.7579456567764282} 12/24/2021 22:57:20 - INFO - codeparrot_training - Step 15005: {'lr': 0.00039844070216633177, 'samples': 7683072, 'steps': 15005, 'batch_loss/train': 1.1400523371994495} 12/24/2021 22:57:31 - INFO - codeparrot_training - Step 15006: {'lr': 0.0003984280182210009, 'samples': 7683584, 'steps': 15006, 'batch_loss/train': 0.9818686752114445} 12/24/2021 22:57:43 - INFO - codeparrot_training - Step 15007: {'lr': 0.0003984153336855764, 'samples': 7684096, 'steps': 15007, 'batch_loss/train': 0.8182533206418157} 12/24/2021 22:57:53 - INFO - codeparrot_training - Step 15008: {'lr': 0.00039840264856010857, 'samples': 7684608, 'steps': 15008, 'batch_loss/train': 0.8189091105014086} 12/24/2021 22:58:04 - INFO - codeparrot_training - Step 15009: {'lr': 0.0003983899628446478, 'samples': 7685120, 'steps': 15009, 'batch_loss/train': 0.856389774940908} 12/24/2021 22:58:16 - INFO - codeparrot_training - Step 15010: {'lr': 0.0003983772765392446, 'samples': 7685632, 'steps': 15010, 'batch_loss/train': 0.8704529319657013} 12/24/2021 22:58:27 - INFO - codeparrot_training - Step 15011: {'lr': 0.0003983645896439493, 'samples': 7686144, 'steps': 15011, 
'batch_loss/train': 0.7813219825038686} 12/24/2021 22:58:37 - INFO - codeparrot_training - Step 15012: {'lr': 0.0003983519021588125, 'samples': 7686656, 'steps': 15012, 'batch_loss/train': 0.7942292508669198} 12/24/2021 22:58:51 - INFO - codeparrot_training - Step 15013: {'lr': 0.0003983392140838845, 'samples': 7687168, 'steps': 15013, 'batch_loss/train': 0.8168053589761257} 12/24/2021 22:59:02 - INFO - codeparrot_training - Step 15014: {'lr': 0.0003983265254192158, 'samples': 7687680, 'steps': 15014, 'batch_loss/train': 0.8579426174983382} 12/24/2021 22:59:12 - INFO - codeparrot_training - Step 15015: {'lr': 0.00039831383616485683, 'samples': 7688192, 'steps': 15015, 'batch_loss/train': 0.8076726503204554} 12/24/2021 22:59:23 - INFO - codeparrot_training - Step 15016: {'lr': 0.000398301146320858, 'samples': 7688704, 'steps': 15016, 'batch_loss/train': 0.7976699406281114} 12/24/2021 22:59:35 - INFO - codeparrot_training - Step 15017: {'lr': 0.00039828845588726994, 'samples': 7689216, 'steps': 15017, 'batch_loss/train': 0.9528036992996931} 12/24/2021 22:59:46 - INFO - codeparrot_training - Step 15018: {'lr': 0.00039827576486414285, 'samples': 7689728, 'steps': 15018, 'batch_loss/train': 0.7632062267512083} 12/24/2021 22:59:56 - INFO - codeparrot_training - Step 15019: {'lr': 0.00039826307325152745, 'samples': 7690240, 'steps': 15019, 'batch_loss/train': 0.7892520194873214} 12/24/2021 23:00:08 - INFO - codeparrot_training - Step 15020: {'lr': 0.00039825038104947393, 'samples': 7690752, 'steps': 15020, 'batch_loss/train': 0.869445831514895} 12/24/2021 23:00:19 - INFO - codeparrot_training - Step 15021: {'lr': 0.00039823768825803294, 'samples': 7691264, 'steps': 15021, 'batch_loss/train': 0.8454812597483397} 12/24/2021 23:00:29 - INFO - codeparrot_training - Step 15022: {'lr': 0.0003982249948772548, 'samples': 7691776, 'steps': 15022, 'batch_loss/train': 0.8623221954330802} 12/24/2021 23:00:43 - INFO - codeparrot_training - Step 15023: {'lr': 0.00039821230090719006, 
'samples': 7692288, 'steps': 15023, 'batch_loss/train': 0.87711819447577} 12/24/2021 23:00:54 - INFO - codeparrot_training - Step 15024: {'lr': 0.00039819960634788923, 'samples': 7692800, 'steps': 15024, 'batch_loss/train': 0.7335321521386504} 12/24/2021 23:01:05 - INFO - codeparrot_training - Step 15025: {'lr': 0.0003981869111994027, 'samples': 7693312, 'steps': 15025, 'batch_loss/train': 0.810753146186471} 12/24/2021 23:01:15 - INFO - codeparrot_training - Step 15026: {'lr': 0.00039817421546178096, 'samples': 7693824, 'steps': 15026, 'batch_loss/train': 0.7496292444411665} 12/24/2021 23:01:27 - INFO - codeparrot_training - Step 15027: {'lr': 0.00039816151913507446, 'samples': 7694336, 'steps': 15027, 'batch_loss/train': 0.7806814622599632} 12/24/2021 23:01:38 - INFO - codeparrot_training - Step 15028: {'lr': 0.00039814882221933377, 'samples': 7694848, 'steps': 15028, 'batch_loss/train': 0.8405343350023031} 12/24/2021 23:01:49 - INFO - codeparrot_training - Step 15029: {'lr': 0.00039813612471460924, 'samples': 7695360, 'steps': 15029, 'batch_loss/train': 0.816377985291183} 12/24/2021 23:02:02 - INFO - codeparrot_training - Step 15030: {'lr': 0.00039812342662095147, 'samples': 7695872, 'steps': 15030, 'batch_loss/train': 0.8074464499950409} 12/24/2021 23:02:13 - INFO - codeparrot_training - Step 15031: {'lr': 0.0003981107279384108, 'samples': 7696384, 'steps': 15031, 'batch_loss/train': 0.9141397182829678} 12/24/2021 23:02:24 - INFO - codeparrot_training - Step 15032: {'lr': 0.0003980980286670378, 'samples': 7696896, 'steps': 15032, 'batch_loss/train': 0.8522725664079189} 12/24/2021 23:02:36 - INFO - codeparrot_training - Step 15033: {'lr': 0.0003980853288068831, 'samples': 7697408, 'steps': 15033, 'batch_loss/train': 0.806667416356504} 12/24/2021 23:02:46 - INFO - codeparrot_training - Step 15034: {'lr': 0.00039807262835799695, 'samples': 7697920, 'steps': 15034, 'batch_loss/train': 0.8526432104408741} 12/24/2021 23:02:57 - INFO - codeparrot_training - Step 15035: 
{'lr': 0.00039805992732042995, 'samples': 7698432, 'steps': 15035, 'batch_loss/train': 0.7657937714830041} 12/24/2021 23:03:07 - INFO - codeparrot_training - Step 15036: {'lr': 0.00039804722569423257, 'samples': 7698944, 'steps': 15036, 'batch_loss/train': 0.6937014763243496} 12/24/2021 23:03:20 - INFO - codeparrot_training - Step 15037: {'lr': 0.0003980345234794553, 'samples': 7699456, 'steps': 15037, 'batch_loss/train': 0.7814481174573302} 12/24/2021 23:03:31 - INFO - codeparrot_training - Step 15038: {'lr': 0.00039802182067614865, 'samples': 7699968, 'steps': 15038, 'batch_loss/train': 0.8368531824089587} 12/24/2021 23:03:41 - INFO - codeparrot_training - Step 15039: {'lr': 0.00039800911728436316, 'samples': 7700480, 'steps': 15039, 'batch_loss/train': 0.7595681836828589} 12/24/2021 23:03:55 - INFO - codeparrot_training - Step 15040: {'lr': 0.0003979964133041494, 'samples': 7700992, 'steps': 15040, 'batch_loss/train': 0.9541434911079705} 12/24/2021 23:04:06 - INFO - codeparrot_training - Step 15041: {'lr': 0.0003979837087355577, 'samples': 7701504, 'steps': 15041, 'batch_loss/train': 0.7991894581355155} 12/24/2021 23:04:16 - INFO - codeparrot_training - Step 15042: {'lr': 0.0003979710035786386, 'samples': 7702016, 'steps': 15042, 'batch_loss/train': 0.8787587517872453} 12/24/2021 23:04:28 - INFO - codeparrot_training - Step 15043: {'lr': 0.0003979582978334427, 'samples': 7702528, 'steps': 15043, 'batch_loss/train': 0.7576564309420064} 12/24/2021 23:04:39 - INFO - codeparrot_training - Step 15044: {'lr': 0.00039794559150002054, 'samples': 7703040, 'steps': 15044, 'batch_loss/train': 0.7905866168439388} 12/24/2021 23:04:50 - INFO - codeparrot_training - Step 15045: {'lr': 0.0003979328845784225, 'samples': 7703552, 'steps': 15045, 'batch_loss/train': 0.822965175844729} 12/24/2021 23:05:00 - INFO - codeparrot_training - Step 15046: {'lr': 0.0003979201770686991, 'samples': 7704064, 'steps': 15046, 'batch_loss/train': 0.7433033818379045} 12/24/2021 23:05:12 - INFO - 
codeparrot_training - Step 15047: {'lr': 0.0003979074689709011, 'samples': 7704576, 'steps': 15047, 'batch_loss/train': 0.7430890733376145} 12/24/2021 23:05:23 - INFO - codeparrot_training - Step 15048: {'lr': 0.00039789476028507875, 'samples': 7705088, 'steps': 15048, 'batch_loss/train': 0.9044713494367898} 12/24/2021 23:05:34 - INFO - codeparrot_training - Step 15049: {'lr': 0.0003978820510112827, 'samples': 7705600, 'steps': 15049, 'batch_loss/train': 1.03290344029665} 12/24/2021 23:05:46 - INFO - codeparrot_training - Step 15050: {'lr': 0.00039786934114956343, 'samples': 7706112, 'steps': 15050, 'batch_loss/train': 0.8341285428032279} 12/24/2021 23:05:56 - INFO - codeparrot_training - Step 15051: {'lr': 0.00039785663069997145, 'samples': 7706624, 'steps': 15051, 'batch_loss/train': 0.8590572364628315} 12/24/2021 23:06:07 - INFO - codeparrot_training - Step 15052: {'lr': 0.0003978439196625574, 'samples': 7707136, 'steps': 15052, 'batch_loss/train': 0.7257058883551508} 12/24/2021 23:06:21 - INFO - codeparrot_training - Step 15053: {'lr': 0.00039783120803737175, 'samples': 7707648, 'steps': 15053, 'batch_loss/train': 0.7481983585748821} 12/24/2021 23:06:31 - INFO - codeparrot_training - Step 15054: {'lr': 0.00039781849582446507, 'samples': 7708160, 'steps': 15054, 'batch_loss/train': 0.8697206219658256} 12/24/2021 23:06:42 - INFO - codeparrot_training - Step 15055: {'lr': 0.0003978057830238878, 'samples': 7708672, 'steps': 15055, 'batch_loss/train': 0.7952612451044843} 12/24/2021 23:06:53 - INFO - codeparrot_training - Step 15056: {'lr': 0.00039779306963569057, 'samples': 7709184, 'steps': 15056, 'batch_loss/train': 0.8505194690078497} 12/24/2021 23:07:05 - INFO - codeparrot_training - Step 15057: {'lr': 0.00039778035565992385, 'samples': 7709696, 'steps': 15057, 'batch_loss/train': 0.7527310631703585} 12/24/2021 23:07:15 - INFO - codeparrot_training - Step 15058: {'lr': 0.00039776764109663833, 'samples': 7710208, 'steps': 15058, 'batch_loss/train': 
0.7145704725990072} 12/24/2021 23:07:26 - INFO - codeparrot_training - Step 15059: {'lr': 0.0003977549259458844, 'samples': 7710720, 'steps': 15059, 'batch_loss/train': 0.717323882272467} 12/24/2021 23:07:38 - INFO - codeparrot_training - Step 15060: {'lr': 0.00039774221020771276, 'samples': 7711232, 'steps': 15060, 'batch_loss/train': 0.6659583924338222} 12/24/2021 23:07:48 - INFO - codeparrot_training - Step 15061: {'lr': 0.0003977294938821737, 'samples': 7711744, 'steps': 15061, 'batch_loss/train': 0.7779099374311045} 12/24/2021 23:07:59 - INFO - codeparrot_training - Step 15062: {'lr': 0.00039771677696931815, 'samples': 7712256, 'steps': 15062, 'batch_loss/train': 0.7578109826426953} 12/24/2021 23:08:13 - INFO - codeparrot_training - Step 15063: {'lr': 0.0003977040594691963, 'samples': 7712768, 'steps': 15063, 'batch_loss/train': 0.7201820353511721} 12/24/2021 23:08:23 - INFO - codeparrot_training - Step 15064: {'lr': 0.00039769134138185896, 'samples': 7713280, 'steps': 15064, 'batch_loss/train': 0.7406089096330106} 12/24/2021 23:08:34 - INFO - codeparrot_training - Step 15065: {'lr': 0.00039767862270735665, 'samples': 7713792, 'steps': 15065, 'batch_loss/train': 0.8531364137306809} 12/24/2021 23:08:45 - INFO - codeparrot_training - Step 15066: {'lr': 0.00039766590344573983, 'samples': 7714304, 'steps': 15066, 'batch_loss/train': 0.7384786889888346} 12/24/2021 23:08:57 - INFO - codeparrot_training - Step 15067: {'lr': 0.0003976531835970592, 'samples': 7714816, 'steps': 15067, 'batch_loss/train': 0.8131009351927787} 12/24/2021 23:09:07 - INFO - codeparrot_training - Step 15068: {'lr': 0.0003976404631613652, 'samples': 7715328, 'steps': 15068, 'batch_loss/train': 0.6923494255170226} 12/24/2021 23:09:18 - INFO - codeparrot_training - Step 15069: {'lr': 0.0003976277421387085, 'samples': 7715840, 'steps': 15069, 'batch_loss/train': 0.7069573672488332} 12/24/2021 23:09:32 - INFO - codeparrot_training - Step 15070: {'lr': 0.0003976150205291397, 'samples': 7716352, 
'steps': 15070, 'batch_loss/train': 0.8969735838472843} 12/24/2021 23:09:42 - INFO - codeparrot_training - Step 15071: {'lr': 0.0003976022983327092, 'samples': 7716864, 'steps': 15071, 'batch_loss/train': 0.8563282443210483} 12/24/2021 23:09:53 - INFO - codeparrot_training - Step 15072: {'lr': 0.00039758957554946774, 'samples': 7717376, 'steps': 15072, 'batch_loss/train': 0.8789429324679077} 12/24/2021 23:10:05 - INFO - codeparrot_training - Step 15073: {'lr': 0.0003975768521794659, 'samples': 7717888, 'steps': 15073, 'batch_loss/train': 0.8038508519530296} 12/24/2021 23:10:16 - INFO - codeparrot_training - Step 15074: {'lr': 0.00039756412822275413, 'samples': 7718400, 'steps': 15074, 'batch_loss/train': 0.8309927452355623} 12/24/2021 23:10:26 - INFO - codeparrot_training - Step 15075: {'lr': 0.00039755140367938316, 'samples': 7718912, 'steps': 15075, 'batch_loss/train': 0.7763067381456494} 12/24/2021 23:10:38 - INFO - codeparrot_training - Step 15076: {'lr': 0.00039753867854940353, 'samples': 7719424, 'steps': 15076, 'batch_loss/train': 0.7739174454472959} 12/24/2021 23:10:49 - INFO - codeparrot_training - Step 15077: {'lr': 0.00039752595283286575, 'samples': 7719936, 'steps': 15077, 'batch_loss/train': 0.8755602966994047} 12/24/2021 23:10:59 - INFO - codeparrot_training - Step 15078: {'lr': 0.00039751322652982056, 'samples': 7720448, 'steps': 15078, 'batch_loss/train': 0.8920748624950647} 12/24/2021 23:11:10 - INFO - codeparrot_training - Step 15079: {'lr': 0.00039750049964031844, 'samples': 7720960, 'steps': 15079, 'batch_loss/train': 0.8616991052404046} 12/24/2021 23:11:24 - INFO - codeparrot_training - Step 15080: {'lr': 0.00039748777216441013, 'samples': 7721472, 'steps': 15080, 'batch_loss/train': 0.7732932986691594} 12/24/2021 23:11:35 - INFO - codeparrot_training - Step 15081: {'lr': 0.000397475044102146, 'samples': 7721984, 'steps': 15081, 'batch_loss/train': 0.6883193701505661} 12/24/2021 23:11:45 - INFO - codeparrot_training - Step 15082: {'lr': 
0.0003974623154535768, 'samples': 7722496, 'steps': 15082, 'batch_loss/train': 0.829625194426626} 12/24/2021 23:11:57 - INFO - codeparrot_training - Step 15083: {'lr': 0.00039744958621875314, 'samples': 7723008, 'steps': 15083, 'batch_loss/train': 0.8233994143083692} 12/24/2021 23:12:08 - INFO - codeparrot_training - Step 15084: {'lr': 0.00039743685639772566, 'samples': 7723520, 'steps': 15084, 'batch_loss/train': 0.8417293066158891} 12/24/2021 23:12:19 - INFO - codeparrot_training - Step 15085: {'lr': 0.0003974241259905448, 'samples': 7724032, 'steps': 15085, 'batch_loss/train': 0.8430235777050257} 12/24/2021 23:12:31 - INFO - codeparrot_training - Step 15086: {'lr': 0.0003974113949972614, 'samples': 7724544, 'steps': 15086, 'batch_loss/train': 0.7918209033086896} 12/24/2021 23:12:41 - INFO - codeparrot_training - Step 15087: {'lr': 0.0003973986634179259, 'samples': 7725056, 'steps': 15087, 'batch_loss/train': 0.8122613496379927} 12/24/2021 23:12:52 - INFO - codeparrot_training - Step 15088: {'lr': 0.00039738593125258887, 'samples': 7725568, 'steps': 15088, 'batch_loss/train': 0.798822452314198} 12/24/2021 23:13:02 - INFO - codeparrot_training - Step 15089: {'lr': 0.0003973731985013011, 'samples': 7726080, 'steps': 15089, 'batch_loss/train': 0.8528025895357132} 12/24/2021 23:13:14 - INFO - codeparrot_training - Step 15090: {'lr': 0.00039736046516411315, 'samples': 7726592, 'steps': 15090, 'batch_loss/train': 0.7823106086580083} 12/24/2021 23:13:25 - INFO - codeparrot_training - Step 15091: {'lr': 0.00039734773124107566, 'samples': 7727104, 'steps': 15091, 'batch_loss/train': 0.7150093035306782} 12/24/2021 23:13:36 - INFO - codeparrot_training - Step 15092: {'lr': 0.00039733499673223917, 'samples': 7727616, 'steps': 15092, 'batch_loss/train': 0.8804581421427429} 12/24/2021 23:13:50 - INFO - codeparrot_training - Step 15093: {'lr': 0.0003973222616376544, 'samples': 7728128, 'steps': 15093, 'batch_loss/train': 0.8199139162898064} 12/24/2021 23:14:00 - INFO - 
codeparrot_training - Step 15094: {'lr': 0.00039730952595737193, 'samples': 7728640, 'steps': 15094, 'batch_loss/train': 0.7786889607086778} 12/24/2021 23:14:11 - INFO - codeparrot_training - Step 15095: {'lr': 0.00039729678969144244, 'samples': 7729152, 'steps': 15095, 'batch_loss/train': 0.6638925014995039} 12/24/2021 23:14:23 - INFO - codeparrot_training - Step 15096: {'lr': 0.0003972840528399165, 'samples': 7729664, 'steps': 15096, 'batch_loss/train': 0.7942576501518488} 12/24/2021 23:14:34 - INFO - codeparrot_training - Step 15097: {'lr': 0.0003972713154028448, 'samples': 7730176, 'steps': 15097, 'batch_loss/train': 0.8614864069968462} 12/24/2021 23:14:44 - INFO - codeparrot_training - Step 15098: {'lr': 0.0003972585773802779, 'samples': 7730688, 'steps': 15098, 'batch_loss/train': 0.8010946726426482} 12/24/2021 23:14:58 - INFO - codeparrot_training - Step 15099: {'lr': 0.00039724583877226655, 'samples': 7731200, 'steps': 15099, 'batch_loss/train': 0.8070316882804036} 12/24/2021 23:15:09 - INFO - codeparrot_training - Step 15100: {'lr': 0.00039723309957886136, 'samples': 7731712, 'steps': 15100, 'batch_loss/train': 0.8169192844070494} 12/24/2021 23:15:19 - INFO - codeparrot_training - Step 15101: {'lr': 0.0003972203598001129, 'samples': 7732224, 'steps': 15101, 'batch_loss/train': 0.6924112597480416} 12/24/2021 23:15:30 - INFO - codeparrot_training - Step 15102: {'lr': 0.00039720761943607194, 'samples': 7732736, 'steps': 15102, 'batch_loss/train': 0.8531133430078626} 12/24/2021 23:15:42 - INFO - codeparrot_training - Step 15103: {'lr': 0.00039719487848678903, 'samples': 7733248, 'steps': 15103, 'batch_loss/train': 0.7288213148713112} 12/24/2021 23:15:53 - INFO - codeparrot_training - Step 15104: {'lr': 0.0003971821369523149, 'samples': 7733760, 'steps': 15104, 'batch_loss/train': 0.8520951448008418} 12/24/2021 23:16:03 - INFO - codeparrot_training - Step 15105: {'lr': 0.00039716939483270007, 'samples': 7734272, 'steps': 15105, 'batch_loss/train': 
0.7818022151477635} 12/24/2021 23:16:15 - INFO - codeparrot_training - Step 15106: {'lr': 0.00039715665212799534, 'samples': 7734784, 'steps': 15106, 'batch_loss/train': 0.8393593486398458} 12/24/2021 23:16:26 - INFO - codeparrot_training - Step 15107: {'lr': 0.0003971439088382514, 'samples': 7735296, 'steps': 15107, 'batch_loss/train': 0.8650860283523798} 12/24/2021 23:16:36 - INFO - codeparrot_training - Step 15108: {'lr': 0.00039713116496351876, 'samples': 7735808, 'steps': 15108, 'batch_loss/train': 0.8163296999409795} 12/24/2021 23:16:50 - INFO - codeparrot_training - Step 15109: {'lr': 0.0003971184205038482, 'samples': 7736320, 'steps': 15109, 'batch_loss/train': 0.7946484116837382} 12/24/2021 23:17:01 - INFO - codeparrot_training - Step 15110: {'lr': 0.0003971056754592903, 'samples': 7736832, 'steps': 15110, 'batch_loss/train': 0.7268859702162445} 12/24/2021 23:17:12 - INFO - codeparrot_training - Step 15111: {'lr': 0.00039709292982989584, 'samples': 7737344, 'steps': 15111, 'batch_loss/train': 0.8360483264550567} 12/24/2021 23:17:22 - INFO - codeparrot_training - Step 15112: {'lr': 0.0003970801836157154, 'samples': 7737856, 'steps': 15112, 'batch_loss/train': 0.8480463186278939} 12/24/2021 23:17:34 - INFO - codeparrot_training - Step 15113: {'lr': 0.00039706743681679966, 'samples': 7738368, 'steps': 15113, 'batch_loss/train': 0.8101447802037001} 12/24/2021 23:17:45 - INFO - codeparrot_training - Step 15114: {'lr': 0.00039705468943319937, 'samples': 7738880, 'steps': 15114, 'batch_loss/train': 0.7534350296482444} 12/24/2021 23:17:55 - INFO - codeparrot_training - Step 15115: {'lr': 0.0003970419414649652, 'samples': 7739392, 'steps': 15115, 'batch_loss/train': 0.8573392233811319} 12/24/2021 23:18:09 - INFO - codeparrot_training - Step 15116: {'lr': 0.00039702919291214765, 'samples': 7739904, 'steps': 15116, 'batch_loss/train': 0.8077794564887881} 12/24/2021 23:18:20 - INFO - codeparrot_training - Step 15117: {'lr': 0.0003970164437747977, 'samples': 7740416, 
'steps': 15117, 'batch_loss/train': 0.8712782580405474} 12/24/2021 23:18:31 - INFO - codeparrot_training - Step 15118: {'lr': 0.00039700369405296575, 'samples': 7740928, 'steps': 15118, 'batch_loss/train': 0.6550051795784384} 12/24/2021 23:18:43 - INFO - codeparrot_training - Step 15119: {'lr': 0.00039699094374670275, 'samples': 7741440, 'steps': 15119, 'batch_loss/train': 0.8504453618079424} 12/24/2021 23:18:53 - INFO - codeparrot_training - Step 15120: {'lr': 0.00039697819285605917, 'samples': 7741952, 'steps': 15120, 'batch_loss/train': 0.7935726381838322} 12/24/2021 23:19:04 - INFO - codeparrot_training - Step 15121: {'lr': 0.0003969654413810858, 'samples': 7742464, 'steps': 15121, 'batch_loss/train': 0.7337455414235592} 12/24/2021 23:19:15 - INFO - codeparrot_training - Step 15122: {'lr': 0.0003969526893218334, 'samples': 7742976, 'steps': 15122, 'batch_loss/train': 0.8123761937022209} 12/24/2021 23:19:27 - INFO - codeparrot_training - Step 15123: {'lr': 0.00039693993667835246, 'samples': 7743488, 'steps': 15123, 'batch_loss/train': 0.800334285479039} 12/24/2021 23:19:37 - INFO - codeparrot_training - Step 15124: {'lr': 0.0003969271834506939, 'samples': 7744000, 'steps': 15124, 'batch_loss/train': 0.6258733808062971} 12/24/2021 23:19:48 - INFO - codeparrot_training - Step 15125: {'lr': 0.00039691442963890833, 'samples': 7744512, 'steps': 15125, 'batch_loss/train': 0.7738456185907125} 12/24/2021 23:20:00 - INFO - codeparrot_training - Step 15126: {'lr': 0.00039690167524304644, 'samples': 7745024, 'steps': 15126, 'batch_loss/train': 0.7364361896179616} 12/24/2021 23:20:11 - INFO - codeparrot_training - Step 15127: {'lr': 0.000396888920263159, 'samples': 7745536, 'steps': 15127, 'batch_loss/train': 0.7780486447736621} 12/24/2021 23:20:22 - INFO - codeparrot_training - Step 15128: {'lr': 0.0003968761646992967, 'samples': 7746048, 'steps': 15128, 'batch_loss/train': 0.7675537450704724} 12/24/2021 23:20:35 - INFO - codeparrot_training - Step 15129: {'lr': 
0.00039686340855151007, 'samples': 7746560, 'steps': 15129, 'batch_loss/train': 0.8317308062687516} 12/24/2021 23:20:46 - INFO - codeparrot_training - Step 15130: {'lr': 0.0003968506518198501, 'samples': 7747072, 'steps': 15130, 'batch_loss/train': 0.8268822989193723} 12/24/2021 23:20:57 - INFO - codeparrot_training - Step 15131: {'lr': 0.00039683789450436737, 'samples': 7747584, 'steps': 15131, 'batch_loss/train': 0.9185410728678107} 12/24/2021 23:21:07 - INFO - codeparrot_training - Step 15132: {'lr': 0.00039682513660511265, 'samples': 7748096, 'steps': 15132, 'batch_loss/train': 0.7483747224323452} 12/24/2021 23:21:20 - INFO - codeparrot_training - Step 15133: {'lr': 0.0003968123781221366, 'samples': 7748608, 'steps': 15133, 'batch_loss/train': 0.7404730105772614} 12/24/2021 23:21:30 - INFO - codeparrot_training - Step 15134: {'lr': 0.00039679961905549, 'samples': 7749120, 'steps': 15134, 'batch_loss/train': 0.8691325206309557} 12/24/2021 23:21:41 - INFO - codeparrot_training - Step 15135: {'lr': 0.00039678685940522346, 'samples': 7749632, 'steps': 15135, 'batch_loss/train': 0.8440402615815401} 12/24/2021 23:21:53 - INFO - codeparrot_training - Step 15136: {'lr': 0.00039677409917138786, 'samples': 7750144, 'steps': 15136, 'batch_loss/train': 0.6636732942424715} 12/24/2021 23:22:03 - INFO - codeparrot_training - Step 15137: {'lr': 0.00039676133835403384, 'samples': 7750656, 'steps': 15137, 'batch_loss/train': 0.8741588294506073} 12/24/2021 23:22:14 - INFO - codeparrot_training - Step 15138: {'lr': 0.0003967485769532121, 'samples': 7751168, 'steps': 15138, 'batch_loss/train': 0.8052665712311864} 12/24/2021 23:22:28 - INFO - codeparrot_training - Step 15139: {'lr': 0.0003967358149689735, 'samples': 7751680, 'steps': 15139, 'batch_loss/train': 0.7525038653984666} 12/24/2021 23:22:38 - INFO - codeparrot_training - Step 15140: {'lr': 0.00039672305240136876, 'samples': 7752192, 'steps': 15140, 'batch_loss/train': 0.7669370144139975} 12/24/2021 23:22:49 - INFO - 
codeparrot_training - Step 15141: {'lr': 0.00039671028925044845, 'samples': 7752704, 'steps': 15141, 'batch_loss/train': 0.7143866987898946} 12/24/2021 23:23:00 - INFO - codeparrot_training - Step 15142: {'lr': 0.0003966975255162635, 'samples': 7753216, 'steps': 15142, 'batch_loss/train': 0.8555583218112588} 12/24/2021 23:23:12 - INFO - codeparrot_training - Step 15143: {'lr': 0.00039668476119886453, 'samples': 7753728, 'steps': 15143, 'batch_loss/train': 0.8193344594910741} 12/24/2021 23:23:22 - INFO - codeparrot_training - Step 15144: {'lr': 0.0003966719962983023, 'samples': 7754240, 'steps': 15144, 'batch_loss/train': 0.7991278534755111} 12/24/2021 23:23:33 - INFO - codeparrot_training - Step 15145: {'lr': 0.00039665923081462766, 'samples': 7754752, 'steps': 15145, 'batch_loss/train': 0.8186865486204624} 12/24/2021 23:23:45 - INFO - codeparrot_training - Step 15146: {'lr': 0.0003966464647478913, 'samples': 7755264, 'steps': 15146, 'batch_loss/train': 0.8277139340061694} 12/24/2021 23:23:56 - INFO - codeparrot_training - Step 15147: {'lr': 0.000396633698098144, 'samples': 7755776, 'steps': 15147, 'batch_loss/train': 0.810782527551055} 12/24/2021 23:24:06 - INFO - codeparrot_training - Step 15148: {'lr': 0.00039662093086543636, 'samples': 7756288, 'steps': 15148, 'batch_loss/train': 0.7826961502432823} 12/24/2021 23:24:20 - INFO - codeparrot_training - Step 15149: {'lr': 0.0003966081630498193, 'samples': 7756800, 'steps': 15149, 'batch_loss/train': 0.7071765039581805} 12/24/2021 23:24:30 - INFO - codeparrot_training - Step 15150: {'lr': 0.00039659539465134363, 'samples': 7757312, 'steps': 15150, 'batch_loss/train': 0.7552444115281105} 12/24/2021 23:24:41 - INFO - codeparrot_training - Step 15151: {'lr': 0.00039658262567005996, 'samples': 7757824, 'steps': 15151, 'batch_loss/train': 0.814811140531674} 12/24/2021 23:24:53 - INFO - codeparrot_training - Step 15152: {'lr': 0.00039656985610601913, 'samples': 7758336, 'steps': 15152, 'batch_loss/train': 
0.7091871043667197} 12/24/2021 23:25:04 - INFO - codeparrot_training - Step 15153: {'lr': 0.0003965570859592719, 'samples': 7758848, 'steps': 15153, 'batch_loss/train': 0.7672374583780766} 12/24/2021 23:25:14 - INFO - codeparrot_training - Step 15154: {'lr': 0.000396544315229869, 'samples': 7759360, 'steps': 15154, 'batch_loss/train': 0.7738391431048512} 12/24/2021 23:25:25 - INFO - codeparrot_training - Step 15155: {'lr': 0.00039653154391786126, 'samples': 7759872, 'steps': 15155, 'batch_loss/train': 0.8447645823471248} 12/24/2021 23:25:39 - INFO - codeparrot_training - Step 15156: {'lr': 0.00039651877202329946, 'samples': 7760384, 'steps': 15156, 'batch_loss/train': 0.8995185848325491} 12/24/2021 23:25:50 - INFO - codeparrot_training - Step 15157: {'lr': 0.0003965059995462342, 'samples': 7760896, 'steps': 15157, 'batch_loss/train': 0.8583915270864964} 12/24/2021 23:26:00 - INFO - codeparrot_training - Step 15158: {'lr': 0.0003964932264867166, 'samples': 7761408, 'steps': 15158, 'batch_loss/train': 0.8328680917620659} 12/24/2021 23:26:12 - INFO - codeparrot_training - Step 15159: {'lr': 0.00039648045284479714, 'samples': 7761920, 'steps': 15159, 'batch_loss/train': 0.7826241133734584} 12/24/2021 23:26:23 - INFO - codeparrot_training - Step 15160: {'lr': 0.0003964676786205268, 'samples': 7762432, 'steps': 15160, 'batch_loss/train': 0.8668542206287384} 12/24/2021 23:26:33 - INFO - codeparrot_training - Step 15161: {'lr': 0.00039645490381395615, 'samples': 7762944, 'steps': 15161, 'batch_loss/train': 0.9402839401736856} 12/24/2021 23:26:45 - INFO - codeparrot_training - Step 15162: {'lr': 0.00039644212842513614, 'samples': 7763456, 'steps': 15162, 'batch_loss/train': 0.7987944409251213} 12/24/2021 23:26:56 - INFO - codeparrot_training - Step 15163: {'lr': 0.0003964293524541176, 'samples': 7763968, 'steps': 15163, 'batch_loss/train': 0.8435867577791214} 12/24/2021 23:27:07 - INFO - codeparrot_training - Step 15164: {'lr': 0.00039641657590095113, 'samples': 7764480, 
'steps': 15164, 'batch_loss/train': 0.7539418437518179} 12/24/2021 23:27:17 - INFO - codeparrot_training - Step 15165: {'lr': 0.00039640379876568774, 'samples': 7764992, 'steps': 15165, 'batch_loss/train': 0.6861504239495844} 12/24/2021 23:27:29 - INFO - codeparrot_training - Step 15166: {'lr': 0.00039639102104837805, 'samples': 7765504, 'steps': 15166, 'batch_loss/train': 0.7977317944169044} 12/24/2021 23:27:40 - INFO - codeparrot_training - Step 15167: {'lr': 0.000396378242749073, 'samples': 7766016, 'steps': 15167, 'batch_loss/train': 0.7883560061454773} 12/24/2021 23:27:51 - INFO - codeparrot_training - Step 15168: {'lr': 0.00039636546386782325, 'samples': 7766528, 'steps': 15168, 'batch_loss/train': 0.8972636186517775} 12/24/2021 23:28:04 - INFO - codeparrot_training - Step 15169: {'lr': 0.00039635268440467973, 'samples': 7767040, 'steps': 15169, 'batch_loss/train': 0.8340826760977507} 12/24/2021 23:28:15 - INFO - codeparrot_training - Step 15170: {'lr': 0.00039633990435969314, 'samples': 7767552, 'steps': 15170, 'batch_loss/train': 0.8260392621159554} 12/24/2021 23:28:26 - INFO - codeparrot_training - Step 15171: {'lr': 0.00039632712373291446, 'samples': 7768064, 'steps': 15171, 'batch_loss/train': 0.669057690538466} 12/24/2021 23:28:38 - INFO - codeparrot_training - Step 15172: {'lr': 0.0003963143425243943, 'samples': 7768576, 'steps': 15172, 'batch_loss/train': 0.8530959747731686} 12/24/2021 23:28:48 - INFO - codeparrot_training - Step 15173: {'lr': 0.0003963015607341836, 'samples': 7769088, 'steps': 15173, 'batch_loss/train': 0.7718411954119802} 12/24/2021 23:28:59 - INFO - codeparrot_training - Step 15174: {'lr': 0.00039628877836233315, 'samples': 7769600, 'steps': 15174, 'batch_loss/train': 0.8343448443338275} 12/24/2021 23:29:09 - INFO - codeparrot_training - Step 15175: {'lr': 0.0003962759954088936, 'samples': 7770112, 'steps': 15175, 'batch_loss/train': 0.876037257257849} 12/24/2021 23:29:23 - INFO - codeparrot_training - Step 15176: {'lr': 
0.00039626321187391614, 'samples': 7770624, 'steps': 15176, 'batch_loss/train': 0.8979691732674837} 12/24/2021 23:29:34 - INFO - codeparrot_training - Step 15177: {'lr': 0.0003962504277574512, 'samples': 7771136, 'steps': 15177, 'batch_loss/train': 1.0067491391673684} 12/24/2021 23:29:45 - INFO - codeparrot_training - Step 15178: {'lr': 0.0003962376430595499, 'samples': 7771648, 'steps': 15178, 'batch_loss/train': 0.6952723423019052} 12/24/2021 23:29:57 - INFO - codeparrot_training - Step 15179: {'lr': 0.000396224857780263, 'samples': 7772160, 'steps': 15179, 'batch_loss/train': 0.8290817383676767} 12/24/2021 23:30:07 - INFO - codeparrot_training - Step 15180: {'lr': 0.00039621207191964115, 'samples': 7772672, 'steps': 15180, 'batch_loss/train': 0.8133543627336621} 12/24/2021 23:30:18 - INFO - codeparrot_training - Step 15181: {'lr': 0.00039619928547773533, 'samples': 7773184, 'steps': 15181, 'batch_loss/train': 0.8846636367961764} 12/24/2021 23:30:30 - INFO - codeparrot_training - Step 15182: {'lr': 0.00039618649845459637, 'samples': 7773696, 'steps': 15182, 'batch_loss/train': 0.7706340712029487} 12/24/2021 23:30:41 - INFO - codeparrot_training - Step 15183: {'lr': 0.0003961737108502751, 'samples': 7774208, 'steps': 15183, 'batch_loss/train': 0.733369723893702} 12/24/2021 23:30:51 - INFO - codeparrot_training - Step 15184: {'lr': 0.00039616092266482237, 'samples': 7774720, 'steps': 15184, 'batch_loss/train': 0.7700279669370502} 12/24/2021 23:31:02 - INFO - codeparrot_training - Step 15185: {'lr': 0.00039614813389828897, 'samples': 7775232, 'steps': 15185, 'batch_loss/train': 0.7666797069832683} 12/24/2021 23:31:16 - INFO - codeparrot_training - Step 15186: {'lr': 0.0003961353445507258, 'samples': 7775744, 'steps': 15186, 'batch_loss/train': 0.830552932806313} 12/24/2021 23:31:27 - INFO - codeparrot_training - Step 15187: {'lr': 0.00039612255462218363, 'samples': 7776256, 'steps': 15187, 'batch_loss/train': 0.5156158680329099} 12/24/2021 23:31:37 - INFO - 
codeparrot_training - Step 15188: {'lr': 0.00039610976411271333, 'samples': 7776768, 'steps': 15188, 'batch_loss/train': 0.791062846314162} 12/24/2021 23:31:49 - INFO - codeparrot_training - Step 15189: {'lr': 0.00039609697302236583, 'samples': 7777280, 'steps': 15189, 'batch_loss/train': 0.7706606574356556} 12/24/2021 23:32:00 - INFO - codeparrot_training - Step 15190: {'lr': 0.00039608418135119193, 'samples': 7777792, 'steps': 15190, 'batch_loss/train': 0.7988136233761907} 12/24/2021 23:32:11 - INFO - codeparrot_training - Step 15191: {'lr': 0.0003960713890992425, 'samples': 7778304, 'steps': 15191, 'batch_loss/train': 0.8365946924313903} 12/24/2021 23:32:23 - INFO - codeparrot_training - Step 15192: {'lr': 0.0003960585962665683, 'samples': 7778816, 'steps': 15192, 'batch_loss/train': 1.058469807729125} 12/24/2021 23:32:33 - INFO - codeparrot_training - Step 15193: {'lr': 0.0003960458028532204, 'samples': 7779328, 'steps': 15193, 'batch_loss/train': 0.7676019659265876} 12/24/2021 23:32:44 - INFO - codeparrot_training - Step 15194: {'lr': 0.00039603300885924944, 'samples': 7779840, 'steps': 15194, 'batch_loss/train': 0.8938004188239574} 12/24/2021 23:32:55 - INFO - codeparrot_training - Step 15195: {'lr': 0.0003960202142847064, 'samples': 7780352, 'steps': 15195, 'batch_loss/train': 0.7109097065404058} 12/24/2021 23:33:08 - INFO - codeparrot_training - Step 15196: {'lr': 0.0003960074191296421, 'samples': 7780864, 'steps': 15196, 'batch_loss/train': 0.8934599608182907} 12/24/2021 23:33:19 - INFO - codeparrot_training - Step 15197: {'lr': 0.00039599462339410754, 'samples': 7781376, 'steps': 15197, 'batch_loss/train': 0.7967575890943408} 12/24/2021 23:33:30 - INFO - codeparrot_training - Step 15198: {'lr': 0.0003959818270781533, 'samples': 7781888, 'steps': 15198, 'batch_loss/train': 0.8008099119178951} 12/24/2021 23:33:42 - INFO - codeparrot_training - Step 15199: {'lr': 0.00039596903018183054, 'samples': 7782400, 'steps': 15199, 'batch_loss/train': 
0.8431852860376239} 12/24/2021 23:33:53 - INFO - codeparrot_training - Step 15200: {'lr': 0.00039595623270519007, 'samples': 7782912, 'steps': 15200, 'batch_loss/train': 0.8032833319157362} 12/24/2021 23:34:03 - INFO - codeparrot_training - Step 15201: {'lr': 0.00039594343464828264, 'samples': 7783424, 'steps': 15201, 'batch_loss/train': 0.7308995230123401} 12/24/2021 23:34:15 - INFO - codeparrot_training - Step 15202: {'lr': 0.00039593063601115933, 'samples': 7783936, 'steps': 15202, 'batch_loss/train': 0.8329409470316023} 12/24/2021 23:34:26 - INFO - codeparrot_training - Step 15203: {'lr': 0.0003959178367938708, 'samples': 7784448, 'steps': 15203, 'batch_loss/train': 0.8268965412862599} 12/24/2021 23:34:37 - INFO - codeparrot_training - Step 15204: {'lr': 0.0003959050369964681, 'samples': 7784960, 'steps': 15204, 'batch_loss/train': 0.8728166855871677} 12/24/2021 23:34:49 - INFO - codeparrot_training - Step 15205: {'lr': 0.00039589223661900207, 'samples': 7785472, 'steps': 15205, 'batch_loss/train': 0.8298898874782026} 12/24/2021 23:34:59 - INFO - codeparrot_training - Step 15206: {'lr': 0.00039587943566152364, 'samples': 7785984, 'steps': 15206, 'batch_loss/train': 0.7540445271879435} 12/24/2021 23:35:10 - INFO - codeparrot_training - Step 15207: {'lr': 0.0003958666341240835, 'samples': 7786496, 'steps': 15207, 'batch_loss/train': 0.8549695904366672} 12/24/2021 23:35:20 - INFO - codeparrot_training - Step 15208: {'lr': 0.00039585383200673287, 'samples': 7787008, 'steps': 15208, 'batch_loss/train': 0.8277378026396036} 12/24/2021 23:35:34 - INFO - codeparrot_training - Step 15209: {'lr': 0.0003958410293095224, 'samples': 7787520, 'steps': 15209, 'batch_loss/train': 0.8565903278067708} 12/24/2021 23:35:45 - INFO - codeparrot_training - Step 15210: {'lr': 0.000395828226032503, 'samples': 7788032, 'steps': 15210, 'batch_loss/train': 0.9148605838418007} 12/24/2021 23:35:55 - INFO - codeparrot_training - Step 15211: {'lr': 0.00039581542217572574, 'samples': 7788544, 
'steps': 15211, 'batch_loss/train': 0.6562419291585684} 12/24/2021 23:36:07 - INFO - codeparrot_training - Step 15212: {'lr': 0.00039580261773924144, 'samples': 7789056, 'steps': 15212, 'batch_loss/train': 0.8248153859749436} 12/24/2021 23:36:18 - INFO - codeparrot_training - Step 15213: {'lr': 0.0003957898127231009, 'samples': 7789568, 'steps': 15213, 'batch_loss/train': 0.6860710091423243} 12/24/2021 23:36:29 - INFO - codeparrot_training - Step 15214: {'lr': 0.00039577700712735517, 'samples': 7790080, 'steps': 15214, 'batch_loss/train': 0.8259097570553422} 12/24/2021 23:36:42 - INFO - codeparrot_training - Step 15215: {'lr': 0.00039576420095205507, 'samples': 7790592, 'steps': 15215, 'batch_loss/train': 0.7875052792951465} 12/24/2021 23:36:53 - INFO - codeparrot_training - Step 15216: {'lr': 0.0003957513941972515, 'samples': 7791104, 'steps': 15216, 'batch_loss/train': 0.7290696385316551} 12/24/2021 23:37:04 - INFO - codeparrot_training - Step 15217: {'lr': 0.00039573858686299556, 'samples': 7791616, 'steps': 15217, 'batch_loss/train': 0.8668636307120323} 12/24/2021 23:37:14 - INFO - codeparrot_training - Step 15218: {'lr': 0.0003957257789493379, 'samples': 7792128, 'steps': 15218, 'batch_loss/train': 0.7975559551268816} 12/24/2021 23:37:26 - INFO - codeparrot_training - Step 15219: {'lr': 0.0003957129704563296, 'samples': 7792640, 'steps': 15219, 'batch_loss/train': 0.7456521112471819} 12/24/2021 23:37:37 - INFO - codeparrot_training - Step 15220: {'lr': 0.00039570016138402157, 'samples': 7793152, 'steps': 15220, 'batch_loss/train': 0.7849466083571315} 12/24/2021 23:37:48 - INFO - codeparrot_training - Step 15221: {'lr': 0.00039568735173246475, 'samples': 7793664, 'steps': 15221, 'batch_loss/train': 0.8994745616801083} 12/24/2021 23:38:02 - INFO - codeparrot_training - Step 15222: {'lr': 0.0003956745415017099, 'samples': 7794176, 'steps': 15222, 'batch_loss/train': 0.8183975322172046} 12/24/2021 23:38:12 - INFO - codeparrot_training - Step 15223: {'lr': 
0.0003956617306918081, 'samples': 7794688, 'steps': 15223, 'batch_loss/train': 0.7495822970522568} 12/24/2021 23:38:23 - INFO - codeparrot_training - Step 15224: {'lr': 0.00039564891930281043, 'samples': 7795200, 'steps': 15224, 'batch_loss/train': 0.6374032786698081} 12/24/2021 23:38:35 - INFO - codeparrot_training - Step 15225: {'lr': 0.00039563610733476753, 'samples': 7795712, 'steps': 15225, 'batch_loss/train': 0.7640522895380855} 12/24/2021 23:38:46 - INFO - codeparrot_training - Step 15226: {'lr': 0.0003956232947877304, 'samples': 7796224, 'steps': 15226, 'batch_loss/train': 0.9347076122649014} 12/24/2021 23:38:56 - INFO - codeparrot_training - Step 15227: {'lr': 0.0003956104816617501, 'samples': 7796736, 'steps': 15227, 'batch_loss/train': 0.5211698145139962} 12/24/2021 23:39:07 - INFO - codeparrot_training - Step 15228: {'lr': 0.00039559766795687755, 'samples': 7797248, 'steps': 15228, 'batch_loss/train': 0.7111411956138909} 12/24/2021 23:39:20 - INFO - codeparrot_training - Step 15229: {'lr': 0.00039558485367316356, 'samples': 7797760, 'steps': 15229, 'batch_loss/train': 0.8849537717178464} 12/24/2021 23:39:30 - INFO - codeparrot_training - Step 15230: {'lr': 0.00039557203881065924, 'samples': 7798272, 'steps': 15230, 'batch_loss/train': 0.635241795796901} 12/24/2021 23:39:41 - INFO - codeparrot_training - Step 15231: {'lr': 0.00039555922336941544, 'samples': 7798784, 'steps': 15231, 'batch_loss/train': 0.8372219274751842} 12/24/2021 23:39:55 - INFO - codeparrot_training - Step 15232: {'lr': 0.0003955464073494831, 'samples': 7799296, 'steps': 15232, 'batch_loss/train': 1.033272203989327} 12/24/2021 23:40:05 - INFO - codeparrot_training - Step 15233: {'lr': 0.00039553359075091324, 'samples': 7799808, 'steps': 15233, 'batch_loss/train': 0.9007005114108324} 12/24/2021 23:40:16 - INFO - codeparrot_training - Step 15234: {'lr': 0.0003955207735737567, 'samples': 7800320, 'steps': 15234, 'batch_loss/train': 0.8266675611957908} 12/24/2021 23:40:26 - INFO - 
codeparrot_training - Step 15235: {'lr': 0.00039550795581806453, 'samples': 7800832, 'steps': 15235, 'batch_loss/train': 0.824978893622756} 12/24/2021 23:40:38 - INFO - codeparrot_training - Step 15236: {'lr': 0.00039549513748388766, 'samples': 7801344, 'steps': 15236, 'batch_loss/train': 0.9377342509105802} 12/24/2021 23:40:49 - INFO - codeparrot_training - Step 15237: {'lr': 0.00039548231857127717, 'samples': 7801856, 'steps': 15237, 'batch_loss/train': 0.8612476494163275} 12/24/2021 23:41:00 - INFO - codeparrot_training - Step 15238: {'lr': 0.00039546949908028385, 'samples': 7802368, 'steps': 15238, 'batch_loss/train': 0.8517179070040584} 12/24/2021 23:41:12 - INFO - codeparrot_training - Step 15239: {'lr': 0.0003954566790109587, 'samples': 7802880, 'steps': 15239, 'batch_loss/train': 0.8091731970198452} 12/24/2021 23:41:23 - INFO - codeparrot_training - Step 15240: {'lr': 0.0003954438583633527, 'samples': 7803392, 'steps': 15240, 'batch_loss/train': 1.7024442432448268} 12/24/2021 23:41:33 - INFO - codeparrot_training - Step 15241: {'lr': 0.00039543103713751693, 'samples': 7803904, 'steps': 15241, 'batch_loss/train': 0.7636218355037272} 12/24/2021 23:41:45 - INFO - codeparrot_training - Step 15242: {'lr': 0.0003954182153335022, 'samples': 7804416, 'steps': 15242, 'batch_loss/train': 0.7916224421933293} 12/24/2021 23:41:56 - INFO - codeparrot_training - Step 15243: {'lr': 0.0003954053929513596, 'samples': 7804928, 'steps': 15243, 'batch_loss/train': 0.5504813748411834} 12/24/2021 23:42:07 - INFO - codeparrot_training - Step 15244: {'lr': 0.00039539256999113995, 'samples': 7805440, 'steps': 15244, 'batch_loss/train': 0.7843561815097928} 12/24/2021 23:42:17 - INFO - codeparrot_training - Step 15245: {'lr': 0.0003953797464528944, 'samples': 7805952, 'steps': 15245, 'batch_loss/train': 0.8095387681387365} 12/24/2021 23:42:31 - INFO - codeparrot_training - Step 15246: {'lr': 0.0003953669223366739, 'samples': 7806464, 'steps': 15246, 'batch_loss/train': 
0.7328188996762037}
12/24/2021 23:42:42 - INFO - codeparrot_training - Step 15247: {'lr': 0.00039535409764252937, 'samples': 7806976, 'steps': 15247, 'batch_loss/train': 0.822218150831759}
12/24/2021 23:42:52 - INFO - codeparrot_training - Step 15248: {'lr': 0.0003953412723705119, 'samples': 7807488, 'steps': 15248, 'batch_loss/train': 0.8289752132259309}
12/24/2021 23:43:05 - INFO - codeparrot_training - Step 15249: {'lr': 0.00039532844652067235, 'samples': 7808000, 'steps': 15249, 'batch_loss/train': 0.845595276914537}
12/24/2021 23:43:15 - INFO - codeparrot_training - Step 15250: {'lr': 0.0003953156200930618, 'samples': 7808512, 'steps': 15250, 'batch_loss/train': 0.6388786202296615}
12/24/2021 23:43:26 - INFO - codeparrot_training - Step 15251: {'lr': 0.0003953027930877312, 'samples': 7809024, 'steps': 15251, 'batch_loss/train': 0.8394711501896381}
12/24/2021 23:43:38 - INFO - codeparrot_training - Step 15252: {'lr': 0.0003952899655047315, 'samples': 7809536, 'steps': 15252, 'batch_loss/train': 0.8296585688367486}
12/24/2021 23:43:49 - INFO - codeparrot_training - Step 15253: {'lr': 0.00039527713734411387, 'samples': 7810048, 'steps': 15253, 'batch_loss/train': 0.7796742038335651}
12/24/2021 23:43:59 - INFO - codeparrot_training - Step 15254: {'lr': 0.0003952643086059291, 'samples': 7810560, 'steps': 15254, 'batch_loss/train': 0.8325061895884573}
12/24/2021 23:44:10 - INFO - codeparrot_training - Step 15255: {'lr': 0.00039525147929022833, 'samples': 7811072, 'steps': 15255, 'batch_loss/train': 0.7886962532065809}
12/24/2021 23:44:24 - INFO - codeparrot_training - Step 15256: {'lr': 0.00039523864939706256, 'samples': 7811584, 'steps': 15256, 'batch_loss/train': 0.8393502626568079}
12/24/2021 23:44:34 - INFO - codeparrot_training - Step 15257: {'lr': 0.0003952258189264827, 'samples': 7812096, 'steps': 15257, 'batch_loss/train': 0.9068024298176169}
12/24/2021 23:44:45 - INFO - codeparrot_training - Step 15258: {'lr': 0.00039521298787853986, 'samples': 7812608, 'steps': 15258, 'batch_loss/train': 0.8234501127153635}
12/24/2021 23:44:57 - INFO - codeparrot_training - Step 15259: {'lr': 0.000395200156253285, 'samples': 7813120, 'steps': 15259, 'batch_loss/train': 0.8700215429998934}
12/24/2021 23:45:08 - INFO - codeparrot_training - Step 15260: {'lr': 0.0003951873240507691, 'samples': 7813632, 'steps': 15260, 'batch_loss/train': 0.853349793702364}
12/24/2021 23:45:18 - INFO - codeparrot_training - Step 15261: {'lr': 0.00039517449127104325, 'samples': 7814144, 'steps': 15261, 'batch_loss/train': 0.8446015669032931}
12/24/2021 23:45:32 - INFO - codeparrot_training - Step 15262: {'lr': 0.0003951616579141585, 'samples': 7814656, 'steps': 15262, 'batch_loss/train': 0.7721384083852172}
12/24/2021 23:45:43 - INFO - codeparrot_training - Step 15263: {'lr': 0.00039514882398016573, 'samples': 7815168, 'steps': 15263, 'batch_loss/train': 0.7718077804893255}
12/24/2021 23:45:53 - INFO - codeparrot_training - Step 15264: {'lr': 0.000395135989469116, 'samples': 7815680, 'steps': 15264, 'batch_loss/train': 0.902513300999999}
12/24/2021 23:46:04 - INFO - codeparrot_training - Step 15265: {'lr': 0.0003951231543810605, 'samples': 7816192, 'steps': 15265, 'batch_loss/train': 0.7315835314802825}
12/24/2021 23:46:16 - INFO - codeparrot_training - Step 15266: {'lr': 0.0003951103187160501, 'samples': 7816704, 'steps': 15266, 'batch_loss/train': 0.8205102658830583}
12/24/2021 23:46:27 - INFO - codeparrot_training - Step 15267: {'lr': 0.0003950974824741358, 'samples': 7817216, 'steps': 15267, 'batch_loss/train': 0.8576538264751434}
12/24/2021 23:46:37 - INFO - codeparrot_training - Step 15268: {'lr': 0.0003950846456553687, 'samples': 7817728, 'steps': 15268, 'batch_loss/train': 0.8596719820052385}
12/24/2021 23:46:50 - INFO - codeparrot_training - Step 15269: {'lr': 0.00039507180825979995, 'samples': 7818240, 'steps': 15269, 'batch_loss/train': 0.7999380528926849}
12/24/2021 23:47:00 - INFO - codeparrot_training - Step 15270: {'lr': 0.00039505897028748035, 'samples': 7818752, 'steps': 15270, 'batch_loss/train': 0.6651856862008572}
12/24/2021 23:47:11 - INFO - codeparrot_training - Step 15271: {'lr': 0.0003950461317384611, 'samples': 7819264, 'steps': 15271, 'batch_loss/train': 0.8458511978387833}
12/24/2021 23:47:23 - INFO - codeparrot_training - Step 15272: {'lr': 0.0003950332926127932, 'samples': 7819776, 'steps': 15272, 'batch_loss/train': 0.6369884633459151}
12/24/2021 23:47:34 - INFO - codeparrot_training - Step 15273: {'lr': 0.00039502045291052765, 'samples': 7820288, 'steps': 15273, 'batch_loss/train': 0.807364541105926}
12/24/2021 23:47:44 - INFO - codeparrot_training - Step 15274: {'lr': 0.0003950076126317156, 'samples': 7820800, 'steps': 15274, 'batch_loss/train': 0.8725978098809719}
12/24/2021 23:47:55 - INFO - codeparrot_training - Step 15275: {'lr': 0.000394994771776408, 'samples': 7821312, 'steps': 15275, 'batch_loss/train': 0.8805484967306256}
12/24/2021 23:48:09 - INFO - codeparrot_training - Step 15276: {'lr': 0.0003949819303446559, 'samples': 7821824, 'steps': 15276, 'batch_loss/train': 0.8575741044478491}
12/24/2021 23:48:20 - INFO - codeparrot_training - Step 15277: {'lr': 0.0003949690883365104, 'samples': 7822336, 'steps': 15277, 'batch_loss/train': 0.7590005556121469}
12/24/2021 23:48:30 - INFO - codeparrot_training - Step 15278: {'lr': 0.0003949562457520226, 'samples': 7822848, 'steps': 15278, 'batch_loss/train': 0.7179674888029695}
12/24/2021 23:48:42 - INFO - codeparrot_training - Step 15279: {'lr': 0.00039494340259124347, 'samples': 7823360, 'steps': 15279, 'batch_loss/train': 0.7208007965236902}
12/24/2021 23:48:53 - INFO - codeparrot_training - Step 15280: {'lr': 0.0003949305588542241, 'samples': 7823872, 'steps': 15280, 'batch_loss/train': 0.7616277313791215}
12/24/2021 23:49:03 - INFO - codeparrot_training - Step 15281: {'lr': 0.00039491771454101555, 'samples': 7824384, 'steps': 15281, 'batch_loss/train': 0.8682294376194477}
12/24/2021 23:49:16 - INFO - codeparrot_training - Step 15282: {'lr': 0.0003949048696516689, 'samples': 7824896, 'steps': 15282, 'batch_loss/train': 0.8339403097052127}
12/24/2021 23:49:27 - INFO - codeparrot_training - Step 15283: {'lr': 0.00039489202418623514, 'samples': 7825408, 'steps': 15283, 'batch_loss/train': 1.1996589116752148}
12/24/2021 23:49:38 - INFO - codeparrot_training - Step 15284: {'lr': 0.0003948791781447655, 'samples': 7825920, 'steps': 15284, 'batch_loss/train': 0.9651167918927968}
12/24/2021 23:49:48 - INFO - codeparrot_training - Step 15285: {'lr': 0.0003948663315273109, 'samples': 7826432, 'steps': 15285, 'batch_loss/train': 1.0895264390856028}
12/24/2021 23:50:02 - INFO - codeparrot_training - Step 15286: {'lr': 0.00039485348433392244, 'samples': 7826944, 'steps': 15286, 'batch_loss/train': 0.8015530379489064}
12/24/2021 23:50:12 - INFO - codeparrot_training - Step 15287: {'lr': 0.00039484063656465123, 'samples': 7827456, 'steps': 15287, 'batch_loss/train': 0.8312156777828932}
12/24/2021 23:50:23 - INFO - codeparrot_training - Step 15288: {'lr': 0.0003948277882195483, 'samples': 7827968, 'steps': 15288, 'batch_loss/train': 0.6776157435961068}
12/24/2021 23:50:35 - INFO - codeparrot_training - Step 15289: {'lr': 0.00039481493929866484, 'samples': 7828480, 'steps': 15289, 'batch_loss/train': 0.7711026258766651}
12/24/2021 23:50:46 - INFO - codeparrot_training - Step 15290: {'lr': 0.0003948020898020519, 'samples': 7828992, 'steps': 15290, 'batch_loss/train': 0.8091305615380406}
12/24/2021 23:50:56 - INFO - codeparrot_training - Step 15291: {'lr': 0.00039478923972976045, 'samples': 7829504, 'steps': 15291, 'batch_loss/train': 0.8416535463184118}
12/24/2021 23:51:07 - INFO - codeparrot_training - Step 15292: {'lr': 0.00039477638908184167, 'samples': 7830016, 'steps': 15292, 'batch_loss/train': 0.7902555484324694}
12/24/2021 23:51:21 - INFO - codeparrot_training - Step 15293: {'lr': 0.0003947635378583467, 'samples': 7830528, 'steps': 15293, 'batch_loss/train': 0.7495507840067148}
12/24/2021 23:51:32 - INFO - codeparrot_training - Step 15294: {'lr': 0.0003947506860593264, 'samples': 7831040, 'steps': 15294, 'batch_loss/train': 0.814183505717665}
12/24/2021 23:51:42 - INFO - codeparrot_training - Step 15295: {'lr': 0.0003947378336848322, 'samples': 7831552, 'steps': 15295, 'batch_loss/train': 1.0456463396549225}
12/24/2021 23:51:55 - INFO - codeparrot_training - Step 15296: {'lr': 0.00039472498073491493, 'samples': 7832064, 'steps': 15296, 'batch_loss/train': 1.7354341787286103}
12/24/2021 23:52:05 - INFO - codeparrot_training - Step 15297: {'lr': 0.0003947121272096258, 'samples': 7832576, 'steps': 15297, 'batch_loss/train': 0.7719140336848795}
12/24/2021 23:52:16 - INFO - codeparrot_training - Step 15298: {'lr': 0.00039469927310901585, 'samples': 7833088, 'steps': 15298, 'batch_loss/train': 0.7897148537449539}
12/24/2021 23:52:28 - INFO - codeparrot_training - Step 15299: {'lr': 0.0003946864184331363, 'samples': 7833600, 'steps': 15299, 'batch_loss/train': 0.8273596549406648}
12/24/2021 23:52:38 - INFO - codeparrot_training - Step 15300: {'lr': 0.0003946735631820382, 'samples': 7834112, 'steps': 15300, 'batch_loss/train': 0.9460896928794682}
12/24/2021 23:52:49 - INFO - codeparrot_training - Step 15301: {'lr': 0.00039466070735577255, 'samples': 7834624, 'steps': 15301, 'batch_loss/train': 0.8251147661358118}
12/24/2021 23:53:00 - INFO - codeparrot_training - Step 15302: {'lr': 0.0003946478509543906, 'samples': 7835136, 'steps': 15302, 'batch_loss/train': 0.7804057290777564}
12/24/2021 23:53:14 - INFO - codeparrot_training - Step 15303: {'lr': 0.00039463499397794345, 'samples': 7835648, 'steps': 15303, 'batch_loss/train': 0.8204263602383435}
12/24/2021 23:53:24 - INFO - codeparrot_training - Step 15304: {'lr': 0.00039462213642648216, 'samples': 7836160, 'steps': 15304, 'batch_loss/train': 0.6956490692682564}
12/24/2021 23:53:35 - INFO - codeparrot_training - Step 15305: {'lr': 0.0003946092783000578, 'samples': 7836672, 'steps': 15305, 'batch_loss/train': 0.827483655884862}
12/24/2021 23:53:47 - INFO - codeparrot_training - Step 15306: {'lr': 0.0003945964195987216, 'samples': 7837184, 'steps': 15306, 'batch_loss/train': 0.8577577485702932}
12/24/2021 23:53:57 - INFO - codeparrot_training - Step 15307: {'lr': 0.00039458356032252465, 'samples': 7837696, 'steps': 15307, 'batch_loss/train': 0.877626858651638}
12/24/2021 23:54:08 - INFO - codeparrot_training - Step 15308: {'lr': 0.000394570700471518, 'samples': 7838208, 'steps': 15308, 'batch_loss/train': 0.7947085397318006}
12/24/2021 23:54:20 - INFO - codeparrot_training - Step 15309: {'lr': 0.00039455784004575287, 'samples': 7838720, 'steps': 15309, 'batch_loss/train': 0.7105868139769882}
12/24/2021 23:54:31 - INFO - codeparrot_training - Step 15310: {'lr': 0.0003945449790452803, 'samples': 7839232, 'steps': 15310, 'batch_loss/train': 0.7947533294791356}
12/24/2021 23:54:42 - INFO - codeparrot_training - Step 15311: {'lr': 0.0003945321174701515, 'samples': 7839744, 'steps': 15311, 'batch_loss/train': 0.9295290433801711}
12/24/2021 23:54:52 - INFO - codeparrot_training - Step 15312: {'lr': 0.0003945192553204176, 'samples': 7840256, 'steps': 15312, 'batch_loss/train': 1.2544735125266016}
12/24/2021 23:55:06 - INFO - codeparrot_training - Step 15313: {'lr': 0.0003945063925961297, 'samples': 7840768, 'steps': 15313, 'batch_loss/train': 0.7409102916717529}
12/24/2021 23:55:17 - INFO - codeparrot_training - Step 15314: {'lr': 0.00039449352929733897, 'samples': 7841280, 'steps': 15314, 'batch_loss/train': 0.934954047203064}
12/24/2021 23:55:27 - INFO - codeparrot_training - Step 15315: {'lr': 0.00039448066542409647, 'samples': 7841792, 'steps': 15315, 'batch_loss/train': 0.7833278919570148}
12/24/2021 23:55:39 - INFO - codeparrot_training - Step 15316: {'lr': 0.0003944678009764534, 'samples': 7842304, 'steps': 15316, 'batch_loss/train': 0.6941078037489206}
12/24/2021 23:55:50 - INFO - codeparrot_training - Step 15317: {'lr': 0.000394454935954461, 'samples': 7842816, 'steps': 15317, 'batch_loss/train': 0.8945236026775092}
12/24/2021 23:56:01 - INFO - codeparrot_training - Step 15318: {'lr': 0.0003944420703581702, 'samples': 7843328, 'steps': 15318, 'batch_loss/train': 0.8717637844383717}
12/24/2021 23:56:13 - INFO - codeparrot_training - Step 15319: {'lr': 0.0003944292041876324, 'samples': 7843840, 'steps': 15319, 'batch_loss/train': 0.7810328742489219}
12/24/2021 23:56:23 - INFO - codeparrot_training - Step 15320: {'lr': 0.00039441633744289847, 'samples': 7844352, 'steps': 15320, 'batch_loss/train': 0.8049283623695374}
12/24/2021 23:56:34 - INFO - codeparrot_training - Step 15321: {'lr': 0.0003944034701240198, 'samples': 7844864, 'steps': 15321, 'batch_loss/train': 0.8648116923868656}
12/24/2021 23:56:44 - INFO - codeparrot_training - Step 15322: {'lr': 0.0003943906022310474, 'samples': 7845376, 'steps': 15322, 'batch_loss/train': 0.6674295370467007}
12/24/2021 23:56:58 - INFO - codeparrot_training - Step 15323: {'lr': 0.0003943777337640325, 'samples': 7845888, 'steps': 15323, 'batch_loss/train': 0.6943317730911076}
12/24/2021 23:57:09 - INFO - codeparrot_training - Step 15324: {'lr': 0.0003943648647230263, 'samples': 7846400, 'steps': 15324, 'batch_loss/train': 0.84399195574224}
12/24/2021 23:57:19 - INFO - codeparrot_training - Step 15325: {'lr': 0.00039435199510807985, 'samples': 7846912, 'steps': 15325, 'batch_loss/train': 0.8206710597733036}
12/24/2021 23:57:32 - INFO - codeparrot_training - Step 15326: {'lr': 0.0003943391249192444, 'samples': 7847424, 'steps': 15326, 'batch_loss/train': 0.7402524277567863}
12/24/2021 23:57:42 - INFO - codeparrot_training - Step 15327: {'lr': 0.0003943262541565711, 'samples': 7847936, 'steps': 15327, 'batch_loss/train': 0.8995179804041982}
12/24/2021 23:57:53 - INFO - codeparrot_training - Step 15328: {'lr': 0.0003943133828201111, 'samples': 7848448, 'steps': 15328, 'batch_loss/train': 0.7442954215221107}
12/24/2021 23:58:05 - INFO - codeparrot_training - Step 15329: {'lr': 0.0003943005109099156, 'samples': 7848960, 'steps': 15329, 'batch_loss/train': 0.7635772693902254}
12/24/2021 23:58:15 - INFO - codeparrot_training - Step 15330: {'lr': 0.00039428763842603567, 'samples': 7849472, 'steps': 15330, 'batch_loss/train': 0.9991921787150204}
12/24/2021 23:58:26 - INFO - codeparrot_training - Step 15331: {'lr': 0.00039427476536852264, 'samples': 7849984, 'steps': 15331, 'batch_loss/train': 0.8093024604022503}
12/24/2021 23:58:37 - INFO - codeparrot_training - Step 15332: {'lr': 0.00039426189173742763, 'samples': 7850496, 'steps': 15332, 'batch_loss/train': 0.8615695964545012}
12/24/2021 23:58:51 - INFO - codeparrot_training - Step 15333: {'lr': 0.00039424901753280176, 'samples': 7851008, 'steps': 15333, 'batch_loss/train': 0.7762933578342199}
12/24/2021 23:59:02 - INFO - codeparrot_training - Step 15334: {'lr': 0.00039423614275469626, 'samples': 7851520, 'steps': 15334, 'batch_loss/train': 1.2032432123087347}
12/24/2021 23:59:12 - INFO - codeparrot_training - Step 15335: {'lr': 0.0003942232674031624, 'samples': 7852032, 'steps': 15335, 'batch_loss/train': 0.7871411154046655}
12/24/2021 23:59:24 - INFO - codeparrot_training - Step 15336: {'lr': 0.0003942103914782512, 'samples': 7852544, 'steps': 15336, 'batch_loss/train': 0.8919547973200679}
12/24/2021 23:59:35 - INFO - codeparrot_training - Step 15337: {'lr': 0.000394197514980014, 'samples': 7853056, 'steps': 15337, 'batch_loss/train': 0.978473620954901}
12/24/2021 23:59:46 - INFO - codeparrot_training - Step 15338: {'lr': 0.0003941846379085019, 'samples': 7853568, 'steps': 15338, 'batch_loss/train': 0.8288198690861464}
12/24/2021 23:59:58 - INFO - codeparrot_training - Step 15339: {'lr': 0.00039417176026376613, 'samples': 7854080, 'steps': 15339, 'batch_loss/train': 0.8032253300771117}
12/25/2021 00:00:08 - INFO - codeparrot_training - Step 15340: {'lr': 0.0003941588820458578, 'samples': 7854592, 'steps': 15340, 'batch_loss/train': 0.7834733896888793}
12/25/2021 00:00:19 - INFO - codeparrot_training - Step 15341: {'lr': 0.0003941460032548283, 'samples': 7855104, 'steps': 15341, 'batch_loss/train': 0.9138165982440114}
12/25/2021 00:00:29 - INFO - codeparrot_training - Step 15342: {'lr': 0.00039413312389072864, 'samples': 7855616, 'steps': 15342, 'batch_loss/train': 0.9806715538725257}
12/25/2021 00:00:43 - INFO - codeparrot_training - Step 15343: {'lr': 0.0003941202439536101, 'samples': 7856128, 'steps': 15343, 'batch_loss/train': 0.7262321896851063}
12/25/2021 00:00:54 - INFO - codeparrot_training - Step 15344: {'lr': 0.00039410736344352396, 'samples': 7856640, 'steps': 15344, 'batch_loss/train': 0.8052313097286969}
12/25/2021 00:01:04 - INFO - codeparrot_training - Step 15345: {'lr': 0.00039409448236052137, 'samples': 7857152, 'steps': 15345, 'batch_loss/train': 0.7908361717127264}
12/25/2021 00:01:17 - INFO - codeparrot_training - Step 15346: {'lr': 0.00039408160070465346, 'samples': 7857664, 'steps': 15346, 'batch_loss/train': 0.8719135774299502}
12/25/2021 00:01:27 - INFO - codeparrot_training - Step 15347: {'lr': 0.0003940687184759715, 'samples': 7858176, 'steps': 15347, 'batch_loss/train': 0.7877610977739096}
12/25/2021 00:01:38 - INFO - codeparrot_training - Step 15348: {'lr': 0.00039405583567452674, 'samples': 7858688, 'steps': 15348, 'batch_loss/train': 0.8835359560325742}
12/25/2021 00:01:52 - INFO - codeparrot_training - Step 15349: {'lr': 0.00039404295230037033, 'samples': 7859200, 'steps': 15349, 'batch_loss/train': 0.7950740053784102}
12/25/2021 00:02:02 - INFO - codeparrot_training - Step 15350: {'lr': 0.0003940300683535536, 'samples': 7859712, 'steps': 15350, 'batch_loss/train': 0.7297645083162934}
12/25/2021 00:02:13 - INFO - codeparrot_training - Step 15351: {'lr': 0.00039401718383412764, 'samples': 7860224, 'steps': 15351, 'batch_loss/train': 0.7984880302101374}
12/25/2021 00:02:24 - INFO - codeparrot_training - Step 15352: {'lr': 0.00039400429874214375, 'samples': 7860736, 'steps': 15352, 'batch_loss/train': 0.7347029009833932}
12/25/2021 00:02:36 - INFO - codeparrot_training - Step 15353: {'lr': 0.0003939914130776532, 'samples': 7861248, 'steps': 15353, 'batch_loss/train': 0.7624992993660271}
12/25/2021 00:02:46 - INFO - codeparrot_training - Step 15354: {'lr': 0.00039397852684070714, 'samples': 7861760, 'steps': 15354, 'batch_loss/train': 0.8180785411968827}
12/25/2021 00:02:57 - INFO - codeparrot_training - Step 15355: {'lr': 0.00039396564003135677, 'samples': 7862272, 'steps': 15355, 'batch_loss/train': 0.8469901252537966}
12/25/2021 00:03:09 - INFO - codeparrot_training - Step 15356: {'lr': 0.0003939527526496534, 'samples': 7862784, 'steps': 15356, 'batch_loss/train': 0.8815382746979594}
12/25/2021 00:03:20 - INFO - codeparrot_training - Step 15357: {'lr': 0.00039393986469564825, 'samples': 7863296, 'steps': 15357, 'batch_loss/train': 0.9258132046088576}
12/25/2021 00:03:30 - INFO - codeparrot_training - Step 15358: {'lr': 0.0003939269761693925, 'samples': 7863808, 'steps': 15358, 'batch_loss/train': 1.0674787480384111}
12/25/2021 00:03:42 - INFO - codeparrot_training - Step 15359: {'lr': 0.00039391408707093747, 'samples': 7864320, 'steps': 15359, 'batch_loss/train': 0.8111414082814008}
12/25/2021 00:03:53 - INFO - codeparrot_training - Step 15360: {'lr': 0.00039390119740033435, 'samples': 7864832, 'steps': 15360, 'batch_loss/train': 0.7386034093797207}
12/25/2021 00:04:04 - INFO - codeparrot_training - Step 15361: {'lr': 0.00039388830715763446, 'samples': 7865344, 'steps': 15361, 'batch_loss/train': 0.8536323141306639}
12/25/2021 00:04:14 - INFO - codeparrot_training - Step 15362: {'lr': 0.00039387541634288895, 'samples': 7865856, 'steps': 15362, 'batch_loss/train': 0.7349637674633414}
12/25/2021 00:04:28 - INFO - codeparrot_training - Step 15363: {'lr': 0.00039386252495614917, 'samples': 7866368, 'steps': 15363, 'batch_loss/train': 0.7472774405032396}
12/25/2021 00:04:39 - INFO - codeparrot_training - Step 15364: {'lr': 0.00039384963299746623, 'samples': 7866880, 'steps': 15364, 'batch_loss/train': 0.7057433561421931}
12/25/2021 00:04:49 - INFO - codeparrot_training - Step 15365: {'lr': 0.0003938367404668915, 'samples': 7867392, 'steps': 15365, 'batch_loss/train': 0.8308334683533758}
12/25/2021 00:05:01 - INFO - codeparrot_training - Step 15366: {'lr': 0.0003938238473644762, 'samples': 7867904, 'steps': 15366, 'batch_loss/train': 0.7694023521617055}
12/25/2021 00:05:12 - INFO - codeparrot_training - Step 15367: {'lr': 0.00039381095369027164, 'samples': 7868416, 'steps': 15367, 'batch_loss/train': 0.8482060246169567}
12/25/2021 00:05:22 - INFO - codeparrot_training - Step 15368: {'lr': 0.0003937980594443289, 'samples': 7868928, 'steps': 15368, 'batch_loss/train': 0.7649976285174489}
12/25/2021 00:05:34 - INFO - codeparrot_training - Step 15369: {'lr': 0.0003937851646266995, 'samples': 7869440, 'steps': 15369, 'batch_loss/train': 0.8743349593132734}
12/25/2021 00:05:45 - INFO - codeparrot_training - Step 15370: {'lr': 0.00039377226923743445, 'samples': 7869952, 'steps': 15370, 'batch_loss/train': 1.009115882217884}
12/25/2021 00:05:56 - INFO - codeparrot_training - Step 15371: {'lr': 0.0003937593732765853, 'samples': 7870464, 'steps': 15371, 'batch_loss/train': 0.7386303464882076}
12/25/2021 00:06:06 - INFO - codeparrot_training - Step 15372: {'lr': 0.00039374647674420303, 'samples': 7870976, 'steps': 15372, 'batch_loss/train': 0.7017532768659294}
12/25/2021 00:06:19 - INFO - codeparrot_training - Step 15373: {'lr': 0.0003937335796403391, 'samples': 7871488, 'steps': 15373, 'batch_loss/train': 0.8037280533462763}
12/25/2021 00:06:30 - INFO - codeparrot_training - Step 15374: {'lr': 0.00039372068196504473, 'samples': 7872000, 'steps': 15374, 'batch_loss/train': 0.8005885374732316}
12/25/2021 00:06:40 - INFO - codeparrot_training - Step 15375: {'lr': 0.00039370778371837116, 'samples': 7872512, 'steps': 15375, 'batch_loss/train': 0.7594993752427399}
12/25/2021 00:06:55 - INFO - codeparrot_training - Step 15376: {'lr': 0.00039369488490036977, 'samples': 7873024, 'steps': 15376, 'batch_loss/train': 0.923601589165628}
12/25/2021 00:07:05 - INFO - codeparrot_training - Step 15377: {'lr': 0.00039368198551109173, 'samples': 7873536, 'steps': 15377, 'batch_loss/train': 0.8154107974842191}
12/25/2021 00:07:16 - INFO - codeparrot_training - Step 15378: {'lr': 0.00039366908555058845, 'samples': 7874048, 'steps': 15378, 'batch_loss/train': 0.6304224440827966}
12/25/2021 00:07:28 - INFO - codeparrot_training - Step 15379: {'lr': 0.000393656185018911, 'samples': 7874560, 'steps': 15379, 'batch_loss/train': 0.8558851694688201}
12/25/2021 00:07:38 - INFO - codeparrot_training - Step 15380: {'lr': 0.00039364328391611094, 'samples': 7875072, 'steps': 15380, 'batch_loss/train': 0.8929771557450294}
12/25/2021 00:07:49 - INFO - codeparrot_training - Step 15381: {'lr': 0.00039363038224223923, 'samples': 7875584, 'steps': 15381, 'batch_loss/train': 0.7742041076999158}
12/25/2021 00:08:03 - INFO - codeparrot_training - Step 15382: {'lr': 0.00039361747999734756, 'samples': 7876096, 'steps': 15382, 'batch_loss/train': 0.7681278521195054}
12/25/2021 00:08:14 - INFO - codeparrot_training - Step 15383: {'lr': 0.0003936045771814869, 'samples': 7876608, 'steps': 15383, 'batch_loss/train': 0.79065868165344}
12/25/2021 00:08:24 - INFO - codeparrot_training - Step 15384: {'lr': 0.0003935916737947086, 'samples': 7877120, 'steps': 15384, 'batch_loss/train': 0.802138869650662}
12/25/2021 00:08:35 - INFO - codeparrot_training - Step 15385: {'lr': 0.0003935787698370642, 'samples': 7877632, 'steps': 15385, 'batch_loss/train': 0.8399899676442146}
12/25/2021 00:08:47 - INFO - codeparrot_training - Step 15386: {'lr': 0.0003935658653086047, 'samples': 7878144, 'steps': 15386, 'batch_loss/train': 0.8955361098051071}
12/25/2021 00:08:58 - INFO - codeparrot_training - Step 15387: {'lr': 0.0003935529602093816, 'samples': 7878656, 'steps': 15387, 'batch_loss/train': 0.8994762925431132}
12/25/2021 00:09:08 - INFO - codeparrot_training - Step 15388: {'lr': 0.0003935400545394461, 'samples': 7879168, 'steps': 15388, 'batch_loss/train': 0.8322640594560653}
12/25/2021 00:09:20 - INFO - codeparrot_training - Step 15389: {'lr': 0.0003935271482988495, 'samples': 7879680, 'steps': 15389, 'batch_loss/train': 0.691342709120363}
12/25/2021 00:09:31 - INFO - codeparrot_training - Step 15390: {'lr': 0.00039351424148764317, 'samples': 7880192, 'steps': 15390, 'batch_loss/train': 0.7536364635452628}
12/25/2021 00:09:42 - INFO - codeparrot_training - Step 15391: {'lr': 0.0003935013341058785, 'samples': 7880704, 'steps': 15391, 'batch_loss/train': 0.8349892925471067}
12/25/2021 00:09:55 - INFO - codeparrot_training - Step 15392: {'lr': 0.0003934884261536066, 'samples': 7881216, 'steps': 15392, 'batch_loss/train': 0.831968973390758}
12/25/2021 00:10:06 - INFO - codeparrot_training - Step 15393: {'lr': 0.0003934755176308789, 'samples': 7881728, 'steps': 15393, 'batch_loss/train': 0.9420473920181394}
12/25/2021 00:10:17 - INFO - codeparrot_training - Step 15394: {'lr': 0.00039346260853774675, 'samples': 7882240, 'steps': 15394, 'batch_loss/train': 0.7862703111022711}
12/25/2021 00:10:27 - INFO - codeparrot_training - Step 15395: {'lr': 0.0003934496988742614, 'samples': 7882752, 'steps': 15395, 'batch_loss/train': 0.8618777603842318}
12/25/2021 00:10:39 - INFO - codeparrot_training - Step 15396: {'lr': 0.00039343678864047426, 'samples': 7883264, 'steps': 15396, 'batch_loss/train': 0.8074067840352654}
12/25/2021 00:10:50 - INFO - codeparrot_training - Step 15397: {'lr': 0.00039342387783643663, 'samples': 7883776, 'steps': 15397, 'batch_loss/train': 0.8848116304725409}
12/25/2021 00:11:01 - INFO - codeparrot_training - Step 15398: {'lr': 0.0003934109664621998, 'samples': 7884288, 'steps': 15398, 'batch_loss/train': 0.8217656603083014}
12/25/2021 00:11:15 - INFO - codeparrot_training - Step 15399: {'lr': 0.00039339805451781506, 'samples': 7884800, 'steps': 15399, 'batch_loss/train': 0.7408628053963184}
12/25/2021 00:11:25 - INFO - codeparrot_training - Step 15400: {'lr': 0.00039338514200333384, 'samples': 7885312, 'steps': 15400, 'batch_loss/train': 0.7160932403057814}
12/25/2021 00:11:36 - INFO - codeparrot_training - Step 15401: {'lr': 0.0003933722289188074, 'samples': 7885824, 'steps': 15401, 'batch_loss/train': 0.8668087217956781}
12/25/2021 00:11:48 - INFO - codeparrot_training - Step 15402: {'lr': 0.00039335931526428714, 'samples': 7886336, 'steps': 15402, 'batch_loss/train': 0.8535954095423222}
12/25/2021 00:11:59 - INFO - codeparrot_training - Step 15403: {'lr': 0.00039334640103982443, 'samples': 7886848, 'steps': 15403, 'batch_loss/train': 0.7793702550698072}
12/25/2021 00:12:09 - INFO - codeparrot_training - Step 15404: {'lr': 0.0003933334862454705, 'samples': 7887360, 'steps': 15404, 'batch_loss/train': 0.7852452073711902}
12/25/2021 00:12:20 - INFO - codeparrot_training - Step 15405: {'lr': 0.0003933205708812768, 'samples': 7887872, 'steps': 15405, 'batch_loss/train': 0.7534645590931177}
12/25/2021 00:12:33 - INFO - codeparrot_training - Step 15406: {'lr': 0.0003933076549472946, 'samples': 7888384, 'steps': 15406, 'batch_loss/train': 0.8160502659156919}
12/25/2021 00:12:43 - INFO - codeparrot_training - Step 15407: {'lr': 0.00039329473844357515, 'samples': 7888896, 'steps': 15407, 'batch_loss/train': 0.7635744772851467}
12/25/2021 00:12:54 - INFO - codeparrot_training - Step 15408: {'lr': 0.00039328182137017006, 'samples': 7889408, 'steps': 15408, 'batch_loss/train': 0.7269302839413285}
12/25/2021 00:13:06 - INFO - codeparrot_training - Step 15409: {'lr': 0.0003932689037271305, 'samples': 7889920, 'steps': 15409, 'batch_loss/train': 0.696372346021235}
12/25/2021 00:13:16 - INFO - codeparrot_training - Step 15410: {'lr': 0.0003932559855145079, 'samples': 7890432, 'steps': 15410, 'batch_loss/train': 0.8139834208413959}
12/25/2021 00:13:27 - INFO - codeparrot_training - Step 15411: {'lr': 0.0003932430667323536, 'samples': 7890944, 'steps': 15411, 'batch_loss/train': 0.8406892237253487}
12/25/2021 00:13:41 - INFO - codeparrot_training - Step 15412: {'lr': 0.00039323014738071896, 'samples': 7891456, 'steps': 15412, 'batch_loss/train': 0.9310083440504968}
12/25/2021 00:13:51 - INFO - codeparrot_training - Step 15413: {'lr': 0.0003932172274596553, 'samples': 7891968, 'steps': 15413, 'batch_loss/train': 0.87450149981305}
12/25/2021 00:14:02 - INFO - codeparrot_training - Step 15414: {'lr': 0.00039320430696921407, 'samples': 7892480, 'steps': 15414, 'batch_loss/train': 0.7474045334383845}
12/25/2021 00:14:14 - INFO - codeparrot_training - Step 15415: {'lr': 0.0003931913859094465, 'samples': 7892992, 'steps': 15415, 'batch_loss/train': 0.8597736386582255}
12/25/2021 00:14:25 - INFO - codeparrot_training - Step 15416: {'lr': 0.00039317846428040416, 'samples': 7893504, 'steps': 15416, 'batch_loss/train': 0.809345924295485}
12/25/2021 00:14:36 - INFO - codeparrot_training - Step 15417: {'lr': 0.00039316554208213825, 'samples': 7894016, 'steps': 15417, 'batch_loss/train': 0.7909046276472509}
12/25/2021 00:14:46 - INFO - codeparrot_training - Step 15418: {'lr': 0.00039315261931470013, 'samples': 7894528, 'steps': 15418, 'batch_loss/train': 0.8312062285840511}
12/25/2021 00:14:58 - INFO - codeparrot_training - Step 15419: {'lr': 0.00039313969597814136, 'samples': 7895040, 'steps': 15419, 'batch_loss/train': 0.8647487591952085}
12/25/2021 00:15:09 - INFO - codeparrot_training - Step 15420: {'lr': 0.0003931267720725131, 'samples': 7895552, 'steps': 15420, 'batch_loss/train': 0.8310729628428817}
12/25/2021 00:15:19 - INFO - codeparrot_training - Step 15421: {'lr': 0.00039311384759786695, 'samples': 7896064, 'steps': 15421, 'batch_loss/train': 0.7724769804626703}
12/25/2021 00:15:34 - INFO - codeparrot_training - Step 15422: {'lr': 0.00039310092255425414, 'samples': 7896576, 'steps': 15422, 'batch_loss/train': 0.8304917574860156}
12/25/2021 00:15:44 - INFO - codeparrot_training - Step 15423: {'lr': 0.0003930879969417261, 'samples': 7897088, 'steps': 15423, 'batch_loss/train': 0.7945666725281626}
12/25/2021 00:15:55 - INFO - codeparrot_training - Step 15424: {'lr': 0.0003930750707603342, 'samples': 7897600, 'steps': 15424, 'batch_loss/train': 0.8140885727480054}
12/25/2021 00:16:06 - INFO - codeparrot_training - Step 15425: {'lr': 0.0003930621440101298, 'samples': 7898112, 'steps': 15425, 'batch_loss/train': 0.8673714464530349}
12/25/2021 00:16:18 - INFO - codeparrot_training - Step 15426: {'lr': 0.0003930492166911644, 'samples': 7898624, 'steps': 15426, 'batch_loss/train': 0.8208811087533832}
12/25/2021 00:16:28 - INFO - codeparrot_training - Step 15427: {'lr': 0.0003930362888034893, 'samples': 7899136, 'steps': 15427, 'batch_loss/train': 0.6008962830528617}
12/25/2021 00:16:39 - INFO - codeparrot_training - Step 15428: {'lr': 0.0003930233603471559, 'samples': 7899648, 'steps': 15428, 'batch_loss/train': 0.8424033559858799}
12/25/2021 00:16:51 - INFO - codeparrot_training - Step 15429: {'lr': 0.0003930104313222157, 'samples': 7900160, 'steps': 15429, 'batch_loss/train': 0.8384014121256769}
12/25/2021 00:17:02 - INFO - codeparrot_training - Step 15430: {'lr': 0.00039299750172871995, 'samples': 7900672, 'steps': 15430, 'batch_loss/train': 0.7091637207195163}
12/25/2021 00:17:12 - INFO - codeparrot_training - Step 15431: {'lr': 0.00039298457156672015, 'samples': 7901184, 'steps': 15431, 'batch_loss/train': 0.817709757538978}
12/25/2021 00:17:26 - INFO - codeparrot_training - Step 15432: {'lr': 0.00039297164083626767, 'samples': 7901696, 'steps': 15432, 'batch_loss/train': 0.6525339372456074}
12/25/2021 00:17:37 - INFO - codeparrot_training - Step 15433: {'lr': 0.00039295870953741396, 'samples': 7902208, 'steps': 15433, 'batch_loss/train': 0.8550559850409627}
12/25/2021 00:17:47 - INFO - codeparrot_training - Step 15434: {'lr': 0.00039294577767021036, 'samples': 7902720, 'steps': 15434, 'batch_loss/train': 0.8230439687613398}
12/25/2021 00:18:00 - INFO - codeparrot_training - Step 15435: {'lr': 0.00039293284523470837, 'samples': 7903232, 'steps': 15435, 'batch_loss/train': 0.7900687619112432}
12/25/2021 00:18:10 - INFO - codeparrot_training - Step 15436: {'lr': 0.0003929199122309593, 'samples': 7903744, 'steps': 15436, 'batch_loss/train': 0.8334807460196316}
12/25/2021 00:18:21 - INFO - codeparrot_training - Step 15437: {'lr': 0.0003929069786590147, 'samples': 7904256, 'steps': 15437, 'batch_loss/train': 0.7249587115366012}
12/25/2021 00:18:31 - INFO - codeparrot_training - Step 15438: {'lr': 0.00039289404451892584, 'samples': 7904768, 'steps': 15438, 'batch_loss/train': 0.8580064000561833}
12/25/2021 00:18:45 - INFO - codeparrot_training - Step 15439: {'lr': 0.0003928811098107442, 'samples': 7905280, 'steps': 15439, 'batch_loss/train': 0.6998079603072256}
12/25/2021 00:18:56 - INFO - codeparrot_training - Step 15440: {'lr': 0.00039286817453452127, 'samples': 7905792, 'steps': 15440, 'batch_loss/train': 0.6649850859539583}
12/25/2021 00:19:06 - INFO - codeparrot_training - Step 15441: {'lr': 0.0003928552386903084, 'samples': 7906304, 'steps': 15441, 'batch_loss/train': 0.8246154179796576}
12/25/2021 00:19:18 - INFO - codeparrot_training - Step 15442: {'lr': 0.0003928423022781571, 'samples': 7906816, 'steps': 15442, 'batch_loss/train': 0.8495868630707264}
12/25/2021 00:19:29 - INFO - codeparrot_training - Step 15443: {'lr': 0.0003928293652981187, 'samples': 7907328, 'steps': 15443, 'batch_loss/train': 0.8034689659252763}
12/25/2021 00:19:40 - INFO - codeparrot_training - Step 15444: {'lr': 0.0003928164277502446, 'samples': 7907840, 'steps': 15444, 'batch_loss/train': 0.8321050174999982}
12/25/2021 00:19:52 - INFO - codeparrot_training - Step 15445: {'lr': 0.0003928034896345864, 'samples': 7908352, 'steps': 15445, 'batch_loss/train': 0.792326987022534}
12/25/2021 00:20:02 - INFO - codeparrot_training - Step 15446: {'lr': 0.00039279055095119543, 'samples': 7908864, 'steps': 15446, 'batch_loss/train': 0.8171819006092846}
12/25/2021 00:20:13 - INFO - codeparrot_training - Step 15447: {'lr': 0.00039277761170012314, 'samples': 7909376, 'steps': 15447, 'batch_loss/train': 1.5732653397135437}
12/25/2021 00:20:24 - INFO - codeparrot_training - Step 15448: {'lr': 0.000392764671881421, 'samples': 7909888, 'steps': 15448, 'batch_loss/train': 0.7629914609715343}
12/25/2021 00:20:38 - INFO - codeparrot_training - Step 15449: {'lr': 0.00039275173149514033, 'samples': 7910400, 'steps': 15449, 'batch_loss/train': 1.6077060787938535}
12/25/2021 00:20:49 - INFO - codeparrot_training - Step 15450: {'lr': 0.00039273879054133276, 'samples': 7910912, 'steps': 15450, 'batch_loss/train': 1.080953678349033}
12/25/2021 00:20:59 - INFO - codeparrot_training - Step 15451: {'lr': 0.00039272584902004965, 'samples': 7911424, 'steps': 15451, 'batch_loss/train': 0.6728944568894804}
12/25/2021 00:21:11 - INFO - codeparrot_training - Step 15452: {'lr': 0.0003927129069313425, 'samples': 7911936, 'steps': 15452, 'batch_loss/train': 0.7973935091868043}
12/25/2021 00:21:22 - INFO - codeparrot_training - Step 15453: {'lr': 0.00039269996427526265, 'samples': 7912448, 'steps': 15453, 'batch_loss/train': 0.8017470501363277}
12/25/2021 00:21:33 - INFO - codeparrot_training - Step 15454: {'lr': 0.0003926870210518616, 'samples': 7912960, 'steps': 15454, 'batch_loss/train': 0.8062455449253321}
12/25/2021 00:21:43 - INFO - codeparrot_training - Step 15455: {'lr': 0.0003926740772611909, 'samples': 7913472, 'steps': 15455, 'batch_loss/train': 0.7315845232224092}
12/25/2021 00:21:55 - INFO - codeparrot_training - Step 15456: {'lr': 0.00039266113290330196, 'samples': 7913984, 'steps': 15456, 'batch_loss/train': 0.8114470988512039}
12/25/2021 00:22:06 - INFO - codeparrot_training - Step 15457: {'lr': 0.0003926481879782461, 'samples': 7914496, 'steps': 15457, 'batch_loss/train': 0.8648897334933281}
12/25/2021 00:22:16 - INFO - codeparrot_training - Step 15458: {'lr': 0.00039263524248607496, 'samples': 7915008, 'steps': 15458, 'batch_loss/train': 0.8705814396962523}
12/25/2021 00:22:29 - INFO - codeparrot_training - Step 15459: {'lr': 0.00039262229642684, 'samples': 7915520, 'steps': 15459, 'batch_loss/train': 0.9686509789898992}
12/25/2021 00:22:39 - INFO - codeparrot_training - Step 15460: {'lr': 0.0003926093498005926, 'samples': 7916032, 'steps': 15460, 'batch_loss/train': 1.1495653428137302}
12/25/2021 00:22:50 - INFO - codeparrot_training - Step 15461: {'lr': 0.00039259640260738433, 'samples': 7916544, 'steps': 15461, 'batch_loss/train': 0.7696927483193576}
12/25/2021 00:23:04 - INFO - codeparrot_training - Step 15462: {'lr': 0.0003925834548472665, 'samples': 7917056, 'steps': 15462, 'batch_loss/train': 0.7790194143308327}
12/25/2021 00:23:15 - INFO - codeparrot_training - Step 15463: {'lr': 0.0003925705065202908, 'samples': 7917568, 'steps': 15463, 'batch_loss/train': 0.7402936359867454}
12/25/2021 00:23:26 - INFO - codeparrot_training - Step 15464: {'lr': 0.0003925575576265086, 'samples': 7918080, 'steps': 15464, 'batch_loss/train': 0.7631677582394332}
12/25/2021 00:23:36 - INFO - codeparrot_training - Step 15465: {'lr': 0.0003925446081659713, 'samples': 7918592, 'steps': 15465, 'batch_loss/train': 1.3681001856457442}
12/25/2021 00:23:48 - INFO - codeparrot_training - Step 15466: {'lr': 0.00039253165813873047, 'samples': 7919104, 'steps': 15466, 'batch_loss/train': 0.9209087183699012}
12/25/2021 00:23:59 - INFO - codeparrot_training - Step 15467: {'lr': 0.00039251870754483767, 'samples': 7919616, 'steps': 15467, 'batch_loss/train': 0.6643075267784297}
12/25/2021 00:24:10 - INFO - codeparrot_training - Step 15468: {'lr': 0.00039250575638434423, 'samples': 7920128, 'steps': 15468, 'batch_loss/train': 0.8092904724180698}
12/25/2021 00:24:24 - INFO - codeparrot_training - Step 15469: {'lr': 0.00039249280465730165, 'samples': 7920640, 'steps': 15469, 'batch_loss/train': 0.8748049465939403}
12/25/2021 00:24:35 - INFO - codeparrot_training - Step 15470: {'lr': 0.0003924798523637616, 'samples': 7921152, 'steps': 15470, 'batch_loss/train': 0.7931233043782413}
12/25/2021 00:24:45 - INFO - codeparrot_training - Step 15471: {'lr': 0.0003924668995037754, 'samples': 7921664, 'steps': 15471, 'batch_loss/train': 1.6996760331094265}
12/25/2021 00:24:56 - INFO - codeparrot_training - Step 15472: {'lr': 0.0003924539460773946, 'samples': 7922176, 'steps': 15472, 'batch_loss/train': 0.799569989554584}
12/25/2021 00:25:08 - INFO - codeparrot_training - Step 15473: {'lr': 0.00039244099208467074, 'samples': 7922688, 'steps': 15473, 'batch_loss/train': 0.8034088630229235}
12/25/2021 00:25:19 - INFO - codeparrot_training - Step 15474: {'lr': 0.0003924280375256553, 'samples': 7923200, 'steps': 15474, 'batch_loss/train': 0.8535936614498496}
12/25/2021 00:25:29 - INFO - codeparrot_training - Step 15475: {'lr': 0.00039241508240039956, 'samples': 7923712, 'steps': 15475, 'batch_loss/train': 0.8388318466022611}
12/25/2021 00:25:41 - INFO - codeparrot_training - Step 15476: {'lr': 0.0003924021267089555, 'samples': 7924224, 'steps': 15476, 'batch_loss/train': 0.7392398132942617}
12/25/2021 00:25:52 - INFO - codeparrot_training - Step 15477: {'lr': 0.00039238917045137414, 'samples': 7924736, 'steps': 15477, 'batch_loss/train': 0.7558491444215178}
12/25/2021 00:26:02 - INFO - codeparrot_training - Step 15478: {'lr': 0.0003923762136277074, 'samples': 7925248, 'steps': 15478, 'batch_loss/train': 0.7909296401776373}
12/25/2021 00:26:16 - INFO - codeparrot_training - Step 15479: {'lr': 0.0003923632562380064, 'samples': 7925760, 'steps': 15479, 'batch_loss/train': 0.9313461170531809}
12/25/2021 00:26:27 - INFO - codeparrot_training - Step 15480: {'lr': 0.00039235029828232296, 'samples': 7926272, 'steps': 15480, 'batch_loss/train': 0.849990067537874}
12/25/2021 00:26:37 - INFO - codeparrot_training - Step 15481: {'lr': 0.0003923373397607084, 'samples': 7926784, 'steps': 15481, 'batch_loss/train': 0.8472864180803299}
12/25/2021 00:26:48 - INFO - codeparrot_training - Step 15482: {'lr': 0.0003923243806732144, 'samples': 7927296, 'steps': 15482, 'batch_loss/train': 0.8863506549969316}
12/25/2021 00:27:01 - INFO - codeparrot_training - Step 15483: {'lr': 0.00039231142101989235, 'samples': 7927808, 'steps': 15483, 'batch_loss/train': 0.8423961414955556}
12/25/2021 00:27:11 - INFO - codeparrot_training - Step 15484: {'lr': 0.0003922984608007939, 'samples': 7928320, 'steps': 15484, 'batch_loss/train': 0.8887643441557884}
12/25/2021 00:27:22 - INFO - codeparrot_training - Step 15485: {'lr': 0.0003922855000159704, 'samples': 7928832, 'steps': 15485, 'batch_loss/train': 0.7831525625661016}
12/25/2021 00:27:34 - INFO - codeparrot_training - Step 15486: {'lr': 0.00039227253866547354, 'samples': 7929344, 'steps': 15486, 'batch_loss/train': 0.7329606304410845}
12/25/2021 00:27:45 - INFO - codeparrot_training - Step 15487: {'lr': 0.0003922595767493548, 'samples': 7929856, 'steps': 15487, 'batch_loss/train': 0.667101523373276}
12/25/2021 00:27:56 - INFO - codeparrot_training - Step 15488: {'lr': 0.00039224661426766566, 'samples': 7930368, 'steps': 15488, 'batch_loss/train': 0.9813191811554134}
12/25/2021 00:28:09 - INFO - codeparrot_training - Step 15489: {'lr': 0.00039223365122045774, 'samples': 7930880, 'steps': 15489, 'batch_loss/train': 1.569396206876263}
12/25/2021 00:28:20 - INFO - codeparrot_training - Step 15490: {'lr': 0.00039222068760778243, 'samples': 7931392, 'steps': 15490, 'batch_loss/train': 0.7854859274812043}
12/25/2021 00:28:31 - INFO - codeparrot_training - Step 15491: {'lr': 0.00039220772342969147, 'samples': 7931904, 'steps': 15491, 'batch_loss/train': 0.8249241779558361}
12/25/2021 00:28:41 - INFO - codeparrot_training - Step 15492: {'lr': 0.00039219475868623624, 'samples': 7932416, 'steps': 15492, 'batch_loss/train': 0.9155715396627784}
12/25/2021 00:28:53 - INFO - codeparrot_training - Step 15493: {'lr': 0.0003921817933774684, 'samples': 7932928, 'batch_loss/train': 
0.8992560263723135} 12/25/2021 00:29:04 - INFO - codeparrot_training - Step 15494: {'lr': 0.00039216882750343934, 'samples': 7933440, 'steps': 15494, 'batch_loss/train': 0.8340519033372402} 12/25/2021 00:29:14 - INFO - codeparrot_training - Step 15495: {'lr': 0.0003921558610642008, 'samples': 7933952, 'steps': 15495, 'batch_loss/train': 0.8282835800200701} 12/25/2021 00:29:26 - INFO - codeparrot_training - Step 15496: {'lr': 0.00039214289405980416, 'samples': 7934464, 'steps': 15496, 'batch_loss/train': 0.8108610678464174} 12/25/2021 00:29:37 - INFO - codeparrot_training - Step 15497: {'lr': 0.0003921299264903011, 'samples': 7934976, 'steps': 15497, 'batch_loss/train': 0.9232405844377354} 12/25/2021 00:29:48 - INFO - codeparrot_training - Step 15498: {'lr': 0.0003921169583557431, 'samples': 7935488, 'steps': 15498, 'batch_loss/train': 0.8240866083651781} 12/25/2021 00:30:01 - INFO - codeparrot_training - Step 15499: {'lr': 0.00039210398965618164, 'samples': 7936000, 'steps': 15499, 'batch_loss/train': 0.7807441803161055} 12/25/2021 00:30:12 - INFO - codeparrot_training - Step 15500: {'lr': 0.00039209102039166856, 'samples': 7936512, 'steps': 15500, 'batch_loss/train': 0.8909327271394432} 12/25/2021 00:30:23 - INFO - codeparrot_training - Step 15501: {'lr': 0.0003920780505622551, 'samples': 7937024, 'steps': 15501, 'batch_loss/train': 0.8480931399390101} 12/25/2021 00:30:33 - INFO - codeparrot_training - Step 15502: {'lr': 0.000392065080167993, 'samples': 7937536, 'steps': 15502, 'batch_loss/train': 0.8588917988818139} 12/25/2021 00:30:45 - INFO - codeparrot_training - Step 15503: {'lr': 0.00039205210920893377, 'samples': 7938048, 'steps': 15503, 'batch_loss/train': 0.7960372469387949} 12/25/2021 00:30:56 - INFO - codeparrot_training - Step 15504: {'lr': 0.000392039137685129, 'samples': 7938560, 'steps': 15504, 'batch_loss/train': 0.8583042211830616} 12/25/2021 00:31:06 - INFO - codeparrot_training - Step 15505: {'lr': 0.00039202616559663023, 'samples': 7939072, 
'steps': 15505, 'batch_loss/train': 0.87407529912889} 12/25/2021 00:31:18 - INFO - codeparrot_training - Step 15506: {'lr': 0.000392013192943489, 'samples': 7939584, 'steps': 15506, 'batch_loss/train': 0.8212019093334675} 12/25/2021 00:31:29 - INFO - codeparrot_training - Step 15507: {'lr': 0.00039200021972575705, 'samples': 7940096, 'steps': 15507, 'batch_loss/train': 0.8646645043045282} 12/25/2021 00:31:40 - INFO - codeparrot_training - Step 15508: {'lr': 0.0003919872459434858, 'samples': 7940608, 'steps': 15508, 'batch_loss/train': 0.8546788354869932} 12/25/2021 00:31:53 - INFO - codeparrot_training - Step 15509: {'lr': 0.0003919742715967269, 'samples': 7941120, 'steps': 15509, 'batch_loss/train': 0.8729835527483374} 12/25/2021 00:32:04 - INFO - codeparrot_training - Step 15510: {'lr': 0.0003919612966855318, 'samples': 7941632, 'steps': 15510, 'batch_loss/train': 0.8323172730160877} 12/25/2021 00:32:15 - INFO - codeparrot_training - Step 15511: {'lr': 0.00039194832120995234, 'samples': 7942144, 'steps': 15511, 'batch_loss/train': 0.8231093473732471} 12/25/2021 00:32:25 - INFO - codeparrot_training - Step 15512: {'lr': 0.0003919353451700398, 'samples': 7942656, 'steps': 15512, 'batch_loss/train': 0.8115687659010291} 12/25/2021 00:32:37 - INFO - codeparrot_training - Step 15513: {'lr': 0.0003919223685658461, 'samples': 7943168, 'steps': 15513, 'batch_loss/train': 0.692241829354316} 12/25/2021 00:32:48 - INFO - codeparrot_training - Step 15514: {'lr': 0.0003919093913974224, 'samples': 7943680, 'steps': 15514, 'batch_loss/train': 0.7884855680167675} 12/25/2021 00:32:59 - INFO - codeparrot_training - Step 15515: {'lr': 0.00039189641366482074, 'samples': 7944192, 'steps': 15515, 'batch_loss/train': 0.9363048588857055} 12/25/2021 00:33:13 - INFO - codeparrot_training - Step 15516: {'lr': 0.0003918834353680925, 'samples': 7944704, 'steps': 15516, 'batch_loss/train': 0.8053292147815228} 12/25/2021 00:33:23 - INFO - codeparrot_training - Step 15517: {'lr': 
0.0003918704565072893, 'samples': 7945216, 'steps': 15517, 'batch_loss/train': 1.5121963080018759} 12/25/2021 00:33:34 - INFO - codeparrot_training - Step 15518: {'lr': 0.0003918574770824627, 'samples': 7945728, 'steps': 15518, 'batch_loss/train': 1.117984197102487} 12/25/2021 00:33:46 - INFO - codeparrot_training - Step 15519: {'lr': 0.0003918444970936642, 'samples': 7946240, 'steps': 15519, 'batch_loss/train': 0.8285683011636138} 12/25/2021 00:33:57 - INFO - codeparrot_training - Step 15520: {'lr': 0.00039183151654094574, 'samples': 7946752, 'steps': 15520, 'batch_loss/train': 0.8389555234462023} 12/25/2021 00:34:07 - INFO - codeparrot_training - Step 15521: {'lr': 0.00039181853542435865, 'samples': 7947264, 'steps': 15521, 'batch_loss/train': 0.8526093261316419} 12/25/2021 00:34:18 - INFO - codeparrot_training - Step 15522: {'lr': 0.0003918055537439546, 'samples': 7947776, 'steps': 15522, 'batch_loss/train': 0.8167966110631824} 12/25/2021 00:34:30 - INFO - codeparrot_training - Step 15523: {'lr': 0.0003917925714997852, 'samples': 7948288, 'steps': 15523, 'batch_loss/train': 0.8556069027399644} 12/25/2021 00:34:40 - INFO - codeparrot_training - Step 15524: {'lr': 0.0003917795886919021, 'samples': 7948800, 'steps': 15524, 'batch_loss/train': 0.8426658986136317} 12/25/2021 00:34:51 - INFO - codeparrot_training - Step 15525: {'lr': 0.0003917666053203569, 'samples': 7949312, 'steps': 15525, 'batch_loss/train': 0.9090210068970919} 12/25/2021 00:35:05 - INFO - codeparrot_training - Step 15526: {'lr': 0.0003917536213852012, 'samples': 7949824, 'steps': 15526, 'batch_loss/train': 1.5696737598627806} 12/25/2021 00:35:15 - INFO - codeparrot_training - Step 15527: {'lr': 0.0003917406368864866, 'samples': 7950336, 'steps': 15527, 'batch_loss/train': 0.8645992139354348} 12/25/2021 00:35:26 - INFO - codeparrot_training - Step 15528: {'lr': 0.00039172765182426473, 'samples': 7950848, 'steps': 15528, 'batch_loss/train': 0.8126124059781432} 12/25/2021 00:35:38 - INFO - 
codeparrot_training - Step 15529: {'lr': 0.00039171466619858733, 'samples': 7951360, 'steps': 15529, 'batch_loss/train': 0.8865252118557692} 12/25/2021 00:35:49 - INFO - codeparrot_training - Step 15530: {'lr': 0.00039170168000950584, 'samples': 7951872, 'steps': 15530, 'batch_loss/train': 0.8211094690486789} 12/25/2021 00:35:59 - INFO - codeparrot_training - Step 15531: {'lr': 0.00039168869325707204, 'samples': 7952384, 'steps': 15531, 'batch_loss/train': 0.8324412847869098} 12/25/2021 00:36:10 - INFO - codeparrot_training - Step 15532: {'lr': 0.0003916757059413374, 'samples': 7952896, 'steps': 15532, 'batch_loss/train': 0.7325259856879711} 12/25/2021 00:36:22 - INFO - codeparrot_training - Step 15533: {'lr': 0.00039166271806235364, 'samples': 7953408, 'steps': 15533, 'batch_loss/train': 0.6985779809765518} 12/25/2021 00:36:32 - INFO - codeparrot_training - Step 15534: {'lr': 0.00039164972962017254, 'samples': 7953920, 'steps': 15534, 'batch_loss/train': 0.7995972139760852} 12/25/2021 00:36:43 - INFO - codeparrot_training - Step 15535: {'lr': 0.0003916367406148455, 'samples': 7954432, 'steps': 15535, 'batch_loss/train': 0.7749213967472315} 12/25/2021 00:36:57 - INFO - codeparrot_training - Step 15536: {'lr': 0.0003916237510464243, 'samples': 7954944, 'steps': 15536, 'batch_loss/train': 0.7736589964479208} 12/25/2021 00:37:08 - INFO - codeparrot_training - Step 15537: {'lr': 0.0003916107609149605, 'samples': 7955456, 'steps': 15537, 'batch_loss/train': 0.7918870653957129} 12/25/2021 00:37:18 - INFO - codeparrot_training - Step 15538: {'lr': 0.0003915977702205058, 'samples': 7955968, 'steps': 15538, 'batch_loss/train': 0.8460940374061465} 12/25/2021 00:37:30 - INFO - codeparrot_training - Step 15539: {'lr': 0.00039158477896311184, 'samples': 7956480, 'steps': 15539, 'batch_loss/train': 0.7648451839340851} 12/25/2021 00:37:41 - INFO - codeparrot_training - Step 15540: {'lr': 0.00039157178714283025, 'samples': 7956992, 'steps': 15540, 'batch_loss/train': 
0.7378618272487074} 12/25/2021 00:37:52 - INFO - codeparrot_training - Step 15541: {'lr': 0.00039155879475971265, 'samples': 7957504, 'steps': 15541, 'batch_loss/train': 0.8679680735804141} 12/25/2021 00:38:02 - INFO - codeparrot_training - Step 15542: {'lr': 0.00039154580181381086, 'samples': 7958016, 'steps': 15542, 'batch_loss/train': 0.8437802498228848} 12/25/2021 00:38:14 - INFO - codeparrot_training - Step 15543: {'lr': 0.00039153280830517626, 'samples': 7958528, 'steps': 15543, 'batch_loss/train': 0.89771568775177} 12/25/2021 00:38:25 - INFO - codeparrot_training - Step 15544: {'lr': 0.00039151981423386073, 'samples': 7959040, 'steps': 15544, 'batch_loss/train': 0.8372146440669894} 12/25/2021 00:38:36 - INFO - codeparrot_training - Step 15545: {'lr': 0.00039150681959991584, 'samples': 7959552, 'steps': 15545, 'batch_loss/train': 0.9180911555886269} 12/25/2021 00:38:49 - INFO - codeparrot_training - Step 15546: {'lr': 0.00039149382440339334, 'samples': 7960064, 'steps': 15546, 'batch_loss/train': 0.8079301053658128} 12/25/2021 00:39:00 - INFO - codeparrot_training - Step 15547: {'lr': 0.0003914808286443447, 'samples': 7960576, 'steps': 15547, 'batch_loss/train': 0.7537226434797049} 12/25/2021 00:39:11 - INFO - codeparrot_training - Step 15548: {'lr': 0.0003914678323228218, 'samples': 7961088, 'steps': 15548, 'batch_loss/train': 0.9934687539935112} 12/25/2021 00:39:23 - INFO - codeparrot_training - Step 15549: {'lr': 0.00039145483543887613, 'samples': 7961600, 'steps': 15549, 'batch_loss/train': 0.8111543254926801} 12/25/2021 00:39:33 - INFO - codeparrot_training - Step 15550: {'lr': 0.0003914418379925595, 'samples': 7962112, 'steps': 15550, 'batch_loss/train': 1.337262315209955} 12/25/2021 00:39:44 - INFO - codeparrot_training - Step 15551: {'lr': 0.00039142883998392354, 'samples': 7962624, 'steps': 15551, 'batch_loss/train': 0.7343388134613633} 12/25/2021 00:39:54 - INFO - codeparrot_training - Step 15552: {'lr': 0.00039141584141301987, 'samples': 7963136, 
'steps': 15552, 'batch_loss/train': 1.006969394395128} 12/25/2021 00:40:06 - INFO - codeparrot_training - Step 15553: {'lr': 0.0003914028422799002, 'samples': 7963648, 'steps': 15553, 'batch_loss/train': 0.7408300954848528} 12/25/2021 00:40:17 - INFO - codeparrot_training - Step 15554: {'lr': 0.0003913898425846163, 'samples': 7964160, 'steps': 15554, 'batch_loss/train': 0.9307961594313383} 12/25/2021 00:40:28 - INFO - codeparrot_training - Step 15555: {'lr': 0.0003913768423272197, 'samples': 7964672, 'steps': 15555, 'batch_loss/train': 0.8014033641666174} 12/25/2021 00:40:41 - INFO - codeparrot_training - Step 15556: {'lr': 0.00039136384150776217, 'samples': 7965184, 'steps': 15556, 'batch_loss/train': 0.8081126278266311} 12/25/2021 00:40:52 - INFO - codeparrot_training - Step 15557: {'lr': 0.0003913508401262954, 'samples': 7965696, 'steps': 15557, 'batch_loss/train': 0.7548124867025763} 12/25/2021 00:41:02 - INFO - codeparrot_training - Step 15558: {'lr': 0.00039133783818287105, 'samples': 7966208, 'steps': 15558, 'batch_loss/train': 0.7962747167330235} 12/25/2021 00:41:15 - INFO - codeparrot_training - Step 15559: {'lr': 0.00039132483567754086, 'samples': 7966720, 'steps': 15559, 'batch_loss/train': 0.8701520282775164} 12/25/2021 00:41:25 - INFO - codeparrot_training - Step 15560: {'lr': 0.0003913118326103564, 'samples': 7967232, 'steps': 15560, 'batch_loss/train': 0.8612917094724253} 12/25/2021 00:41:36 - INFO - codeparrot_training - Step 15561: {'lr': 0.00039129882898136955, 'samples': 7967744, 'steps': 15561, 'batch_loss/train': 0.850821046391502} 12/25/2021 00:41:47 - INFO - codeparrot_training - Step 15562: {'lr': 0.0003912858247906318, 'samples': 7968256, 'steps': 15562, 'batch_loss/train': 0.755170030053705} 12/25/2021 00:41:59 - INFO - codeparrot_training - Step 15563: {'lr': 0.0003912728200381951, 'samples': 7968768, 'steps': 15563, 'batch_loss/train': 0.8073657415807247} 12/25/2021 00:42:09 - INFO - codeparrot_training - Step 15564: {'lr': 
0.00039125981472411085, 'samples': 7969280, 'steps': 15564, 'batch_loss/train': 0.8094406686723232} 12/25/2021 00:42:20 - INFO - codeparrot_training - Step 15565: {'lr': 0.00039124680884843104, 'samples': 7969792, 'steps': 15565, 'batch_loss/train': 0.8039568189997226} 12/25/2021 00:42:34 - INFO - codeparrot_training - Step 15566: {'lr': 0.0003912338024112072, 'samples': 7970304, 'steps': 15566, 'batch_loss/train': 0.755577010801062} 12/25/2021 00:42:45 - INFO - codeparrot_training - Step 15567: {'lr': 0.0003912207954124911, 'samples': 7970816, 'steps': 15567, 'batch_loss/train': 0.7332959161140025} 12/25/2021 00:42:55 - INFO - codeparrot_training - Step 15568: {'lr': 0.00039120778785233445, 'samples': 7971328, 'steps': 15568, 'batch_loss/train': 0.8438756903633475} 12/25/2021 00:43:07 - INFO - codeparrot_training - Step 15569: {'lr': 0.0003911947797307889, 'samples': 7971840, 'steps': 15569, 'batch_loss/train': 0.9080109111964703} 12/25/2021 00:43:18 - INFO - codeparrot_training - Step 15570: {'lr': 0.0003911817710479063, 'samples': 7972352, 'steps': 15570, 'batch_loss/train': 0.6607720986939967} 12/25/2021 00:43:28 - INFO - codeparrot_training - Step 15571: {'lr': 0.0003911687618037382, 'samples': 7972864, 'steps': 15571, 'batch_loss/train': 0.7699328698217869} 12/25/2021 00:43:39 - INFO - codeparrot_training - Step 15572: {'lr': 0.0003911557519983365, 'samples': 7973376, 'steps': 15572, 'batch_loss/train': 0.8451412077993155} 12/25/2021 00:43:51 - INFO - codeparrot_training - Step 15573: {'lr': 0.0003911427416317528, 'samples': 7973888, 'steps': 15573, 'batch_loss/train': 0.7718981048092246} 12/25/2021 00:44:02 - INFO - codeparrot_training - Step 15574: {'lr': 0.00039112973070403877, 'samples': 7974400, 'steps': 15574, 'batch_loss/train': 0.8797804759815335} 12/25/2021 00:44:13 - INFO - codeparrot_training - Step 15575: {'lr': 0.0003911167192152463, 'samples': 7974912, 'steps': 15575, 'batch_loss/train': 0.8555625090375543} 12/25/2021 00:44:24 - INFO - 
codeparrot_training - Step 15576: {'lr': 0.0003911037071654271, 'samples': 7975424, 'steps': 15576, 'batch_loss/train': 0.7032338932622224} 12/25/2021 00:44:35 - INFO - codeparrot_training - Step 15577: {'lr': 0.0003910906945546327, 'samples': 7975936, 'steps': 15577, 'batch_loss/train': 0.8043194133788347} 12/25/2021 00:44:46 - INFO - codeparrot_training - Step 15578: {'lr': 0.00039107768138291497, 'samples': 7976448, 'steps': 15578, 'batch_loss/train': 0.7788636186160147} 12/25/2021 00:44:59 - INFO - codeparrot_training - Step 15579: {'lr': 0.00039106466765032575, 'samples': 7976960, 'steps': 15579, 'batch_loss/train': 0.7821403611451387} 12/25/2021 00:45:10 - INFO - codeparrot_training - Step 15580: {'lr': 0.0003910516533569166, 'samples': 7977472, 'steps': 15580, 'batch_loss/train': 0.7623580417130142} 12/25/2021 00:45:21 - INFO - codeparrot_training - Step 15581: {'lr': 0.00039103863850273947, 'samples': 7977984, 'steps': 15581, 'batch_loss/train': 0.8365695457905531} 12/25/2021 00:45:33 - INFO - codeparrot_training - Step 15582: {'lr': 0.00039102562308784584, 'samples': 7978496, 'steps': 15582, 'batch_loss/train': 0.7589267203584313} 12/25/2021 00:45:43 - INFO - codeparrot_training - Step 15583: {'lr': 0.00039101260711228757, 'samples': 7979008, 'steps': 15583, 'batch_loss/train': 0.8801112612709403} 12/25/2021 00:45:54 - INFO - codeparrot_training - Step 15584: {'lr': 0.00039099959057611644, 'samples': 7979520, 'steps': 15584, 'batch_loss/train': 0.836783423088491} 12/25/2021 00:46:05 - INFO - codeparrot_training - Step 15585: {'lr': 0.0003909865734793843, 'samples': 7980032, 'steps': 15585, 'batch_loss/train': 0.8447759822010994} 12/25/2021 00:46:18 - INFO - codeparrot_training - Step 15586: {'lr': 0.0003909735558221426, 'samples': 7980544, 'steps': 15586, 'batch_loss/train': 1.4221013626083732} 12/25/2021 00:46:29 - INFO - codeparrot_training - Step 15587: {'lr': 0.00039096053760444337, 'samples': 7981056, 'steps': 15587, 'batch_loss/train': 
0.765137389767915} 12/25/2021 00:46:40 - INFO - codeparrot_training - Step 15588: {'lr': 0.00039094751882633817, 'samples': 7981568, 'steps': 15588, 'batch_loss/train': 0.8962667817249894} 12/25/2021 00:46:52 - INFO - codeparrot_training - Step 15589: {'lr': 0.00039093449948787896, 'samples': 7982080, 'steps': 15589, 'batch_loss/train': 0.938039437867701} 12/25/2021 00:47:02 - INFO - codeparrot_training - Step 15590: {'lr': 0.0003909214795891174, 'samples': 7982592, 'steps': 15590, 'batch_loss/train': 0.7523462832905352} 12/25/2021 00:47:13 - INFO - codeparrot_training - Step 15591: {'lr': 0.0003909084591301052, 'samples': 7983104, 'steps': 15591, 'batch_loss/train': 0.8182441758690402} 12/25/2021 00:47:25 - INFO - codeparrot_training - Step 15592: {'lr': 0.0003908954381108942, 'samples': 7983616, 'steps': 15592, 'batch_loss/train': 0.8076259540393949} 12/25/2021 00:47:35 - INFO - codeparrot_training - Step 15593: {'lr': 0.00039088241653153604, 'samples': 7984128, 'steps': 15593, 'batch_loss/train': 0.862928718328476} 12/25/2021 00:47:46 - INFO - codeparrot_training - Step 15594: {'lr': 0.0003908693943920827, 'samples': 7984640, 'steps': 15594, 'batch_loss/train': 0.7839058823883533} 12/25/2021 00:48:00 - INFO - codeparrot_training - Step 15595: {'lr': 0.0003908563716925858, 'samples': 7985152, 'steps': 15595, 'batch_loss/train': 0.8448459999635816} 12/25/2021 00:48:10 - INFO - codeparrot_training - Step 15596: {'lr': 0.00039084334843309714, 'samples': 7985664, 'steps': 15596, 'batch_loss/train': 0.8715806817635894} 12/25/2021 00:48:21 - INFO - codeparrot_training - Step 15597: {'lr': 0.0003908303246136685, 'samples': 7986176, 'steps': 15597, 'batch_loss/train': 0.9483963567763567} 12/25/2021 00:48:32 - INFO - codeparrot_training - Step 15598: {'lr': 0.00039081730023435177, 'samples': 7986688, 'steps': 15598, 'batch_loss/train': 0.8258726913481951} 12/25/2021 00:48:44 - INFO - codeparrot_training - Step 15599: {'lr': 0.00039080427529519844, 'samples': 7987200, 
'steps': 15599, 'batch_loss/train': 0.7841024221852422} 12/25/2021 00:48:55 - INFO - codeparrot_training - Step 15600: {'lr': 0.0003907912497962607, 'samples': 7987712, 'steps': 15600, 'batch_loss/train': 0.5916137634776533} 12/25/2021 00:49:05 - INFO - codeparrot_training - Step 15601: {'lr': 0.0003907782237375899, 'samples': 7988224, 'steps': 15601, 'batch_loss/train': 0.7409629048779607} 12/25/2021 00:49:19 - INFO - codeparrot_training - Step 15602: {'lr': 0.0003907651971192382, 'samples': 7988736, 'steps': 15602, 'batch_loss/train': 1.0718247359618545} 12/25/2021 00:49:30 - INFO - codeparrot_training - Step 15603: {'lr': 0.00039075216994125705, 'samples': 7989248, 'steps': 15603, 'batch_loss/train': 0.8195392405614257} 12/25/2021 00:49:40 - INFO - codeparrot_training - Step 15604: {'lr': 0.0003907391422036986, 'samples': 7989760, 'steps': 15604, 'batch_loss/train': 0.7676490461453795} 12/25/2021 00:49:52 - INFO - codeparrot_training - Step 15605: {'lr': 0.0003907261139066144, 'samples': 7990272, 'steps': 15605, 'batch_loss/train': 0.7542807147838175} 12/25/2021 00:50:03 - INFO - codeparrot_training - Step 15606: {'lr': 0.0003907130850500562, 'samples': 7990784, 'steps': 15606, 'batch_loss/train': 0.8313349899835885} 12/25/2021 00:50:13 - INFO - codeparrot_training - Step 15607: {'lr': 0.0003907000556340761, 'samples': 7991296, 'steps': 15607, 'batch_loss/train': 0.8789976816624403} 12/25/2021 00:50:24 - INFO - codeparrot_training - Step 15608: {'lr': 0.00039068702565872555, 'samples': 7991808, 'steps': 15608, 'batch_loss/train': 0.7688020030036569} 12/25/2021 00:50:36 - INFO - codeparrot_training - Step 15609: {'lr': 0.0003906739951240565, 'samples': 7992320, 'steps': 15609, 'batch_loss/train': 0.8901801048777997} 12/25/2021 00:50:46 - INFO - codeparrot_training - Step 15610: {'lr': 0.00039066096403012084, 'samples': 7992832, 'steps': 15610, 'batch_loss/train': 0.8095833538100123} 12/25/2021 00:50:57 - INFO - codeparrot_training - Step 15611: {'lr': 
0.0003906479323769703, 'samples': 7993344, 'steps': 15611, 'batch_loss/train': 0.6379567254334688} 12/25/2021 00:51:11 - INFO - codeparrot_training - Step 15612: {'lr': 0.00039063490016465663, 'samples': 7993856, 'steps': 15612, 'batch_loss/train': 1.3505712519399822} 12/25/2021 00:51:21 - INFO - codeparrot_training - Step 15613: {'lr': 0.0003906218673932318, 'samples': 7994368, 'steps': 15613, 'batch_loss/train': 0.8107848167419434} 12/25/2021 00:51:32 - INFO - codeparrot_training - Step 15614: {'lr': 0.00039060883406274744, 'samples': 7994880, 'steps': 15614, 'batch_loss/train': 0.795276123448275} 12/25/2021 00:51:44 - INFO - codeparrot_training - Step 15615: {'lr': 0.0003905958001732555, 'samples': 7995392, 'steps': 15615, 'batch_loss/train': 0.798854848369956} 12/25/2021 00:51:55 - INFO - codeparrot_training - Step 15616: {'lr': 0.00039058276572480767, 'samples': 7995904, 'steps': 15616, 'batch_loss/train': 0.8527885056100786} 12/25/2021 00:52:05 - INFO - codeparrot_training - Step 15617: {'lr': 0.00039056973071745595, 'samples': 7996416, 'steps': 15617, 'batch_loss/train': 0.7480744551867247} 12/25/2021 00:52:16 - INFO - codeparrot_training - Step 15618: {'lr': 0.00039055669515125196, 'samples': 7996928, 'steps': 15618, 'batch_loss/train': 0.7616115676937625} 12/25/2021 00:52:28 - INFO - codeparrot_training - Step 15619: {'lr': 0.00039054365902624767, 'samples': 7997440, 'steps': 15619, 'batch_loss/train': 0.9089390505105257} 12/25/2021 00:52:39 - INFO - codeparrot_training - Step 15620: {'lr': 0.0003905306223424948, 'samples': 7997952, 'steps': 15620, 'batch_loss/train': 0.7680186476791278} 12/25/2021 00:52:49 - INFO - codeparrot_training - Step 15621: {'lr': 0.0003905175851000453, 'samples': 7998464, 'steps': 15621, 'batch_loss/train': 0.8272782019339502} 12/25/2021 00:53:01 - INFO - codeparrot_training - Step 15622: {'lr': 0.0003905045472989509, 'samples': 7998976, 'steps': 15622, 'batch_loss/train': 0.8188977288082242} 12/25/2021 00:53:12 - INFO - 
codeparrot_training - Step 15623: {'lr': 0.00039049150893926353, 'samples': 7999488, 'steps': 15623, 'batch_loss/train': 0.7710812175646424} 12/25/2021 00:53:22 - INFO - codeparrot_training - Step 15624: {'lr': 0.00039047847002103493, 'samples': 8000000, 'steps': 15624, 'batch_loss/train': 0.9041711688041687} 12/25/2021 00:53:36 - INFO - codeparrot_training - Step 15625: {'lr': 0.00039046543054431694, 'samples': 8000512, 'steps': 15625, 'batch_loss/train': 0.8444310743361712} 12/25/2021 00:53:47 - INFO - codeparrot_training - Step 15626: {'lr': 0.00039045239050916153, 'samples': 8001024, 'steps': 15626, 'batch_loss/train': 0.7852856387617067} 12/25/2021 00:53:58 - INFO - codeparrot_training - Step 15627: {'lr': 0.00039043934991562037, 'samples': 8001536, 'steps': 15627, 'batch_loss/train': 0.9021553881466389} 12/25/2021 00:54:10 - INFO - codeparrot_training - Step 15628: {'lr': 0.0003904263087637454, 'samples': 8002048, 'steps': 15628, 'batch_loss/train': 0.7989698708988726} 12/25/2021 00:54:20 - INFO - codeparrot_training - Step 15629: {'lr': 0.0003904132670535885, 'samples': 8002560, 'steps': 15629, 'batch_loss/train': 0.8259442932903767} 12/25/2021 00:54:31 - INFO - codeparrot_training - Step 15630: {'lr': 0.0003904002247852013, 'samples': 8003072, 'steps': 15630, 'batch_loss/train': 0.7882327167317271} 12/25/2021 00:54:42 - INFO - codeparrot_training - Step 15631: {'lr': 0.00039038718195863605, 'samples': 8003584, 'steps': 15631, 'batch_loss/train': 0.8365611680783331} 12/25/2021 00:54:56 - INFO - codeparrot_training - Step 15632: {'lr': 0.00039037413857394423, 'samples': 8004096, 'steps': 15632, 'batch_loss/train': 0.7803334966301918} 12/25/2021 00:55:06 - INFO - codeparrot_training - Step 15633: {'lr': 0.0003903610946311779, 'samples': 8004608, 'steps': 15633, 'batch_loss/train': 0.8393072276376188} 12/25/2021 00:55:17 - INFO - codeparrot_training - Step 15634: {'lr': 0.00039034805013038877, 'samples': 8005120, 'steps': 15634, 'batch_loss/train': 
1.8175356816500425} 12/25/2021 00:55:29 - INFO - codeparrot_training - Step 15635: {'lr': 0.0003903350050716289, 'samples': 8005632, 'steps': 15635, 'batch_loss/train': 0.8105579894036055} 12/25/2021 00:55:40 - INFO - codeparrot_training - Step 15636: {'lr': 0.00039032195945494996, 'samples': 8006144, 'steps': 15636, 'batch_loss/train': 0.7940618945285678} 12/25/2021 00:55:51 - INFO - codeparrot_training - Step 15637: {'lr': 0.00039030891328040396, 'samples': 8006656, 'steps': 15637, 'batch_loss/train': 0.7389607694931328} 12/25/2021 00:56:01 - INFO - codeparrot_training - Step 15638: {'lr': 0.0003902958665480426, 'samples': 8007168, 'steps': 15638, 'batch_loss/train': 0.6845337934792042} 12/25/2021 00:56:13 - INFO - codeparrot_training - Step 15639: {'lr': 0.00039028281925791795, 'samples': 8007680, 'steps': 15639, 'batch_loss/train': 0.8781865369528532} 12/25/2021 00:56:24 - INFO - codeparrot_training - Step 15640: {'lr': 0.0003902697714100817, 'samples': 8008192, 'steps': 15640, 'batch_loss/train': 0.8711416008882225} 12/25/2021 00:56:35 - INFO - codeparrot_training - Step 15641: {'lr': 0.0003902567230045858, 'samples': 8008704, 'steps': 15641, 'batch_loss/train': 0.7773706265725195} 12/25/2021 00:56:48 - INFO - codeparrot_training - Step 15642: {'lr': 0.0003902436740414822, 'samples': 8009216, 'steps': 15642, 'batch_loss/train': 0.8316805632784963} 12/25/2021 00:56:59 - INFO - codeparrot_training - Step 15643: {'lr': 0.0003902306245208226, 'samples': 8009728, 'steps': 15643, 'batch_loss/train': 0.9593033967539668} 12/25/2021 00:57:10 - INFO - codeparrot_training - Step 15644: {'lr': 0.0003902175744426591, 'samples': 8010240, 'steps': 15644, 'batch_loss/train': 0.7876972891390324} 12/25/2021 00:57:20 - INFO - codeparrot_training - Step 15645: {'lr': 0.00039020452380704343, 'samples': 8010752, 'steps': 15645, 'batch_loss/train': 0.9239841103553772} 12/25/2021 00:57:32 - INFO - codeparrot_training - Step 15646: {'lr': 0.0003901914726140275, 'samples': 8011264, 
'steps': 15646, 'batch_loss/train': 0.7829052079468966} 12/25/2021 00:57:43 - INFO - codeparrot_training - Step 15647: {'lr': 0.00039017842086366323, 'samples': 8011776, 'steps': 15647, 'batch_loss/train': 0.8907040655612946} 12/25/2021 00:57:53 - INFO - codeparrot_training - Step 15648: {'lr': 0.0003901653685560025, 'samples': 8012288, 'steps': 15648, 'batch_loss/train': 0.8123979223892093} 12/25/2021 00:58:05 - INFO - codeparrot_training - Step 15649: {'lr': 0.0003901523156910972, 'samples': 8012800, 'steps': 15649, 'batch_loss/train': 0.831674643792212} 12/25/2021 00:58:16 - INFO - codeparrot_training - Step 15650: {'lr': 0.0003901392622689992, 'samples': 8013312, 'steps': 15650, 'batch_loss/train': 0.7798086651600897} 12/25/2021 00:58:27 - INFO - codeparrot_training - Step 15651: {'lr': 0.0003901262082897603, 'samples': 8013824, 'steps': 15651, 'batch_loss/train': 0.9306235343683511} 12/25/2021 00:58:40 - INFO - codeparrot_training - Step 15652: {'lr': 0.0003901131537534326, 'samples': 8014336, 'steps': 15652, 'batch_loss/train': 0.7771784961223602} 12/25/2021 00:58:51 - INFO - codeparrot_training - Step 15653: {'lr': 0.0003901000986600679, 'samples': 8014848, 'steps': 15653, 'batch_loss/train': 0.7306192917749286} 12/25/2021 00:59:02 - INFO - codeparrot_training - Step 15654: {'lr': 0.0003900870430097181, 'samples': 8015360, 'steps': 15654, 'batch_loss/train': 0.8658047588542104} 12/25/2021 00:59:14 - INFO - codeparrot_training - Step 15655: {'lr': 0.0003900739868024351, 'samples': 8015872, 'steps': 15655, 'batch_loss/train': 0.802291358821094} 12/25/2021 00:59:24 - INFO - codeparrot_training - Step 15656: {'lr': 0.0003900609300382708, 'samples': 8016384, 'steps': 15656, 'batch_loss/train': 0.837952318135649} 12/25/2021 00:59:35 - INFO - codeparrot_training - Step 15657: {'lr': 0.00039004787271727716, 'samples': 8016896, 'steps': 15657, 'batch_loss/train': 0.822870985371992} 12/25/2021 00:59:45 - INFO - codeparrot_training - Step 15658: {'lr': 
0.0003900348148395061, 'samples': 8017408, 'steps': 15658, 'batch_loss/train': 0.8294438743032515} 12/25/2021 00:59:58 - INFO - codeparrot_training - Step 15659: {'lr': 0.0003900217564050094, 'samples': 8017920, 'steps': 15659, 'batch_loss/train': 0.7942709615454078} 12/25/2021 01:00:08 - INFO - codeparrot_training - Step 15660: {'lr': 0.00039000869741383915, 'samples': 8018432, 'steps': 15660, 'batch_loss/train': 0.8996332613751292} 12/25/2021 01:00:19 - INFO - codeparrot_training - Step 15661: {'lr': 0.00038999563786604717, 'samples': 8018944, 'steps': 15661, 'batch_loss/train': 0.8115709372796118} 12/25/2021 01:00:33 - INFO - codeparrot_training - Step 15662: {'lr': 0.0003899825777616853, 'samples': 8019456, 'steps': 15662, 'batch_loss/train': 0.8351846979930997} 12/25/2021 01:00:43 - INFO - codeparrot_training - Step 15663: {'lr': 0.0003899695171008057, 'samples': 8019968, 'steps': 15663, 'batch_loss/train': 0.7678010510280728} 12/25/2021 01:00:54 - INFO - codeparrot_training - Step 15664: {'lr': 0.00038995645588346006, 'samples': 8020480, 'steps': 15664, 'batch_loss/train': 0.7469604378566146} 12/25/2021 01:01:06 - INFO - codeparrot_training - Step 15665: {'lr': 0.0003899433941097004, 'samples': 8020992, 'steps': 15665, 'batch_loss/train': 0.8263297425583005} 12/25/2021 01:01:17 - INFO - codeparrot_training - Step 15666: {'lr': 0.0003899303317795787, 'samples': 8021504, 'steps': 15666, 'batch_loss/train': 0.714073384529911} 12/25/2021 01:01:27 - INFO - codeparrot_training - Step 15667: {'lr': 0.0003899172688931468, 'samples': 8022016, 'steps': 15667, 'batch_loss/train': 0.741092776414007} 12/25/2021 01:01:38 - INFO - codeparrot_training - Step 15668: {'lr': 0.0003899042054504566, 'samples': 8022528, 'steps': 15668, 'batch_loss/train': 0.8543092459440231} 12/25/2021 01:01:50 - INFO - codeparrot_training - Step 15669: {'lr': 0.00038989114145156017, 'samples': 8023040, 'steps': 15669, 'batch_loss/train': 0.8005679533816874} 12/25/2021 01:02:00 - INFO - 
codeparrot_training - Step 15670: {'lr': 0.0003898780768965095, 'samples': 8023552, 'steps': 15670, 'batch_loss/train': 0.7964179976843297} 12/25/2021 01:02:11 - INFO - codeparrot_training - Step 15671: {'lr': 0.0003898650117853562, 'samples': 8024064, 'steps': 15671, 'batch_loss/train': 0.8297726735472679} 12/25/2021 01:02:25 - INFO - codeparrot_training - Step 15672: {'lr': 0.00038985194611815255, 'samples': 8024576, 'steps': 15672, 'batch_loss/train': 0.7566685490310192} 12/25/2021 01:02:36 - INFO - codeparrot_training - Step 15673: {'lr': 0.0003898388798949503, 'samples': 8025088, 'steps': 15673, 'batch_loss/train': 0.5992351775057614} 12/25/2021 01:02:47 - INFO - codeparrot_training - Step 15674: {'lr': 0.0003898258131158016, 'samples': 8025600, 'steps': 15674, 'batch_loss/train': 0.8505495525896549} 12/25/2021 01:02:59 - INFO - codeparrot_training - Step 15675: {'lr': 0.00038981274578075807, 'samples': 8026112, 'steps': 15675, 'batch_loss/train': 0.748377921525389} 12/25/2021 01:03:09 - INFO - codeparrot_training - Step 15676: {'lr': 0.00038979967788987203, 'samples': 8026624, 'steps': 15676, 'batch_loss/train': 0.7041781803127378} 12/25/2021 01:03:20 - INFO - codeparrot_training - Step 15677: {'lr': 0.0003897866094431951, 'samples': 8027136, 'steps': 15677, 'batch_loss/train': 0.8204400455579162} 12/25/2021 01:03:30 - INFO - codeparrot_training - Step 15678: {'lr': 0.00038977354044077953, 'samples': 8027648, 'steps': 15678, 'batch_loss/train': 0.7606696139555424} 12/25/2021 01:03:42 - INFO - codeparrot_training - Step 15679: {'lr': 0.0003897604708826771, 'samples': 8028160, 'steps': 15679, 'batch_loss/train': 0.7910135770216584} 12/25/2021 01:03:53 - INFO - codeparrot_training - Step 15680: {'lr': 0.0003897474007689397, 'samples': 8028672, 'steps': 15680, 'batch_loss/train': 0.7602348843356594} 12/25/2021 01:04:04 - INFO - codeparrot_training - Step 15681: {'lr': 0.00038973433009961955, 'samples': 8029184, 'steps': 15681, 'batch_loss/train': 
0.7013574067968875} 12/25/2021 01:04:17 - INFO - codeparrot_training - Step 15682: {'lr': 0.0003897212588747684, 'samples': 8029696, 'steps': 15682, 'batch_loss/train': 0.8649849254870787} 12/25/2021 01:04:28 - INFO - codeparrot_training - Step 15683: {'lr': 0.0003897081870944383, 'samples': 8030208, 'steps': 15683, 'batch_loss/train': 0.7846765033900738} 12/25/2021 01:04:39 - INFO - codeparrot_training - Step 15684: {'lr': 0.00038969511475868114, 'samples': 8030720, 'steps': 15684, 'batch_loss/train': 0.9033584278076887} 12/25/2021 01:04:51 - INFO - codeparrot_training - Step 15685: {'lr': 0.000389682041867549, 'samples': 8031232, 'steps': 15685, 'batch_loss/train': 0.8118885159492493} 12/25/2021 01:05:02 - INFO - codeparrot_training - Step 15686: {'lr': 0.0003896689684210938, 'samples': 8031744, 'steps': 15686, 'batch_loss/train': 0.7150546838529408} 12/25/2021 01:05:12 - INFO - codeparrot_training - Step 15687: {'lr': 0.00038965589441936754, 'samples': 8032256, 'steps': 15687, 'batch_loss/train': 0.8802613383159041} 12/25/2021 01:05:23 - INFO - codeparrot_training - Step 15688: {'lr': 0.00038964281986242214, 'samples': 8032768, 'steps': 15688, 'batch_loss/train': 0.6905770096927881} 12/25/2021 01:05:35 - INFO - codeparrot_training - Step 15689: {'lr': 0.00038962974475030966, 'samples': 8033280, 'steps': 15689, 'batch_loss/train': 0.8207397861406207} 12/25/2021 01:05:45 - INFO - codeparrot_training - Step 15690: {'lr': 0.00038961666908308203, 'samples': 8033792, 'steps': 15690, 'batch_loss/train': 0.8059776993468404} 12/25/2021 01:05:56 - INFO - codeparrot_training - Step 15691: {'lr': 0.00038960359286079114, 'samples': 8034304, 'steps': 15691, 'batch_loss/train': 0.8229903252795339} 12/25/2021 01:06:10 - INFO - codeparrot_training - Step 15692: {'lr': 0.0003895905160834892, 'samples': 8034816, 'steps': 15692, 'batch_loss/train': 0.7868875470012426} 12/25/2021 01:06:21 - INFO - codeparrot_training - Step 15693: {'lr': 0.00038957743875122795, 'samples': 8035328, 
'steps': 15693, 'batch_loss/train': 0.930713227018714} 12/25/2021 01:06:31 - INFO - codeparrot_training - Step 15694: {'lr': 0.0003895643608640597, 'samples': 8035840, 'steps': 15694, 'batch_loss/train': 0.7515664519742131} 12/25/2021 01:06:43 - INFO - codeparrot_training - Step 15695: {'lr': 0.0003895512824220361, 'samples': 8036352, 'steps': 15695, 'batch_loss/train': 0.6789912935346365} 12/25/2021 01:06:54 - INFO - codeparrot_training - Step 15696: {'lr': 0.0003895382034252093, 'samples': 8036864, 'steps': 15696, 'batch_loss/train': 0.7572893109172583} 12/25/2021 01:07:04 - INFO - codeparrot_training - Step 15697: {'lr': 0.00038952512387363135, 'samples': 8037376, 'steps': 15697, 'batch_loss/train': 0.8829627772793174} 12/25/2021 01:07:15 - INFO - codeparrot_training - Step 15698: {'lr': 0.00038951204376735417, 'samples': 8037888, 'steps': 15698, 'batch_loss/train': 0.7392695050220937} 12/25/2021 01:07:27 - INFO - codeparrot_training - Step 15699: {'lr': 0.00038949896310642973, 'samples': 8038400, 'steps': 15699, 'batch_loss/train': 0.7797246011905372} 12/25/2021 01:07:38 - INFO - codeparrot_training - Step 15700: {'lr': 0.00038948588189091014, 'samples': 8038912, 'steps': 15700, 'batch_loss/train': 0.8238237909972668} 12/25/2021 01:07:48 - INFO - codeparrot_training - Step 15701: {'lr': 0.00038947280012084737, 'samples': 8039424, 'steps': 15701, 'batch_loss/train': 0.8965264977887273} 12/25/2021 01:08:02 - INFO - codeparrot_training - Step 15702: {'lr': 0.00038945971779629336, 'samples': 8039936, 'steps': 15702, 'batch_loss/train': 0.757955034263432} 12/25/2021 01:08:13 - INFO - codeparrot_training - Step 15703: {'lr': 0.0003894466349173002, 'samples': 8040448, 'steps': 15703, 'batch_loss/train': 0.7798960972577333} 12/25/2021 01:08:23 - INFO - codeparrot_training - Step 15704: {'lr': 0.00038943355148391985, 'samples': 8040960, 'steps': 15704, 'batch_loss/train': 0.780336843803525} 12/25/2021 01:08:35 - INFO - codeparrot_training - Step 15705: {'lr': 
0.00038942046749620437, 'samples': 8041472, 'steps': 15705, 'batch_loss/train': 0.9025689139962196} 12/25/2021 01:08:46 - INFO - codeparrot_training - Step 15706: {'lr': 0.00038940738295420576, 'samples': 8041984, 'steps': 15706, 'batch_loss/train': 0.8348572133108974} 12/25/2021 01:08:57 - INFO - codeparrot_training - Step 15707: {'lr': 0.00038939429785797596, 'samples': 8042496, 'steps': 15707, 'batch_loss/train': 0.9253094987943769} 12/25/2021 01:09:07 - INFO - codeparrot_training - Step 15708: {'lr': 0.00038938121220756717, 'samples': 8043008, 'steps': 15708, 'batch_loss/train': 0.8840648680925369} 12/25/2021 01:09:19 - INFO - codeparrot_training - Step 15709: {'lr': 0.0003893681260030312, 'samples': 8043520, 'steps': 15709, 'batch_loss/train': 0.8569790851324797} 12/25/2021 01:09:30 - INFO - codeparrot_training - Step 15710: {'lr': 0.0003893550392444203, 'samples': 8044032, 'steps': 15710, 'batch_loss/train': 0.8894339427351952} 12/25/2021 01:09:41 - INFO - codeparrot_training - Step 15711: {'lr': 0.00038934195193178635, 'samples': 8044544, 'steps': 15711, 'batch_loss/train': 0.7771852947771549} 12/25/2021 01:09:54 - INFO - codeparrot_training - Step 15712: {'lr': 0.00038932886406518137, 'samples': 8045056, 'steps': 15712, 'batch_loss/train': 0.8696241073776037} 12/25/2021 01:10:05 - INFO - codeparrot_training - Step 15713: {'lr': 0.0003893157756446575, 'samples': 8045568, 'steps': 15713, 'batch_loss/train': 0.7652731547132134} 12/25/2021 01:10:16 - INFO - codeparrot_training - Step 15714: {'lr': 0.0003893026866702666, 'samples': 8046080, 'steps': 15714, 'batch_loss/train': 0.8657754063606262} 12/25/2021 01:10:28 - INFO - codeparrot_training - Step 15715: {'lr': 0.0003892895971420609, 'samples': 8046592, 'steps': 15715, 'batch_loss/train': 0.8399825226515532} 12/25/2021 01:10:38 - INFO - codeparrot_training - Step 15716: {'lr': 0.0003892765070600923, 'samples': 8047104, 'steps': 15716, 'batch_loss/train': 0.8654966567410156} 12/25/2021 01:10:49 - INFO - 
codeparrot_training - Step 15717: {'lr': 0.00038926341642441295, 'samples': 8047616, 'steps': 15717, 'batch_loss/train': 0.7958465637639165} 12/25/2021 01:10:59 - INFO - codeparrot_training - Step 15718: {'lr': 0.0003892503252350749, 'samples': 8048128, 'steps': 15718, 'batch_loss/train': 0.8577659204602242} 12/25/2021 01:11:14 - INFO - codeparrot_training - Step 15719: {'lr': 0.00038923723349212994, 'samples': 8048640, 'steps': 15719, 'batch_loss/train': 0.7846201597712934} 12/25/2021 01:11:24 - INFO - codeparrot_training - Step 15720: {'lr': 0.0003892241411956304, 'samples': 8049152, 'steps': 15720, 'batch_loss/train': 0.797103806398809} 12/25/2021 01:11:35 - INFO - codeparrot_training - Step 15721: {'lr': 0.0003892110483456284, 'samples': 8049664, 'steps': 15721, 'batch_loss/train': 0.6394203954841942} 12/25/2021 01:11:47 - INFO - codeparrot_training - Step 15722: {'lr': 0.0003891979549421757, 'samples': 8050176, 'steps': 15722, 'batch_loss/train': 0.8317654971033335} 12/25/2021 01:11:57 - INFO - codeparrot_training - Step 15723: {'lr': 0.0003891848609853245, 'samples': 8050688, 'steps': 15723, 'batch_loss/train': 0.7907711057923734} 12/25/2021 01:12:08 - INFO - codeparrot_training - Step 15724: {'lr': 0.0003891717664751268, 'samples': 8051200, 'steps': 15724, 'batch_loss/train': 0.9412051662802696} 12/25/2021 01:12:20 - INFO - codeparrot_training - Step 15725: {'lr': 0.00038915867141163475, 'samples': 8051712, 'steps': 15725, 'batch_loss/train': 0.8463269295170903} 12/25/2021 01:12:31 - INFO - codeparrot_training - Step 15726: {'lr': 0.0003891455757949004, 'samples': 8052224, 'steps': 15726, 'batch_loss/train': 0.9003653433173895} 12/25/2021 01:12:41 - INFO - codeparrot_training - Step 15727: {'lr': 0.00038913247962497574, 'samples': 8052736, 'steps': 15727, 'batch_loss/train': 0.8189238319173455} 12/25/2021 01:12:52 - INFO - codeparrot_training - Step 15728: {'lr': 0.00038911938290191287, 'samples': 8053248, 'steps': 15728, 'batch_loss/train': 
0.8456539362668991} 12/25/2021 01:13:04 - INFO - codeparrot_training - Step 15729: {'lr': 0.0003891062856257639, 'samples': 8053760, 'steps': 15729, 'batch_loss/train': 0.8315775259397924} 12/25/2021 01:13:14 - INFO - codeparrot_training - Step 15730: {'lr': 0.0003890931877965808, 'samples': 8054272, 'steps': 15730, 'batch_loss/train': 0.7180466895224527} 12/25/2021 01:13:25 - INFO - codeparrot_training - Step 15731: {'lr': 0.00038908008941441574, 'samples': 8054784, 'steps': 15731, 'batch_loss/train': 0.619721140130423} 12/25/2021 01:13:39 - INFO - codeparrot_training - Step 15732: {'lr': 0.0003890669904793208, 'samples': 8055296, 'steps': 15732, 'batch_loss/train': 0.7848477009683847} 12/25/2021 01:13:49 - INFO - codeparrot_training - Step 15733: {'lr': 0.00038905389099134797, 'samples': 8055808, 'steps': 15733, 'batch_loss/train': 0.8551715640351176} 12/25/2021 01:14:00 - INFO - codeparrot_training - Step 15734: {'lr': 0.0003890407909505494, 'samples': 8056320, 'steps': 15734, 'batch_loss/train': 0.7273200983181596} 12/25/2021 01:14:12 - INFO - codeparrot_training - Step 15735: {'lr': 0.00038902769035697717, 'samples': 8056832, 'steps': 15735, 'batch_loss/train': 0.7942839420866221} 12/25/2021 01:14:23 - INFO - codeparrot_training - Step 15736: {'lr': 0.0003890145892106832, 'samples': 8057344, 'steps': 15736, 'batch_loss/train': 0.7505403193645179} 12/25/2021 01:14:33 - INFO - codeparrot_training - Step 15737: {'lr': 0.0003890014875117199, 'samples': 8057856, 'steps': 15737, 'batch_loss/train': 0.8676652563735843} 12/25/2021 01:14:44 - INFO - codeparrot_training - Step 15738: {'lr': 0.000388988385260139, 'samples': 8058368, 'steps': 15738, 'batch_loss/train': 0.9023545440286398} 12/25/2021 01:14:56 - INFO - codeparrot_training - Step 15739: {'lr': 0.0003889752824559929, 'samples': 8058880, 'steps': 15739, 'batch_loss/train': 0.8071632222272456} 12/25/2021 01:15:07 - INFO - codeparrot_training - Step 15740: {'lr': 0.00038896217909933353, 'samples': 8059392, 
'steps': 15740, 'batch_loss/train': 0.8287861885037273} 12/25/2021 01:15:17 - INFO - codeparrot_training - Step 15741: {'lr': 0.0003889490751902129, 'samples': 8059904, 'steps': 15741, 'batch_loss/train': 0.8391963667236269} 12/25/2021 01:15:31 - INFO - codeparrot_training - Step 15742: {'lr': 0.0003889359707286833, 'samples': 8060416, 'steps': 15742, 'batch_loss/train': 0.9243177846074104} 12/25/2021 01:15:42 - INFO - codeparrot_training - Step 15743: {'lr': 0.0003889228657147967, 'samples': 8060928, 'steps': 15743, 'batch_loss/train': 0.7521049389615655} 12/25/2021 01:15:52 - INFO - codeparrot_training - Step 15744: {'lr': 0.00038890976014860536, 'samples': 8061440, 'steps': 15744, 'batch_loss/train': 0.863966217264533} 12/25/2021 01:16:04 - INFO - codeparrot_training - Step 15745: {'lr': 0.00038889665403016104, 'samples': 8061952, 'steps': 15745, 'batch_loss/train': 0.7828999310731888} 12/25/2021 01:16:15 - INFO - codeparrot_training - Step 15746: {'lr': 0.00038888354735951624, 'samples': 8062464, 'steps': 15746, 'batch_loss/train': 0.7836721306666732} 12/25/2021 01:16:25 - INFO - codeparrot_training - Step 15747: {'lr': 0.0003888704401367229, 'samples': 8062976, 'steps': 15747, 'batch_loss/train': 0.7520188391208649} 12/25/2021 01:16:39 - INFO - codeparrot_training - Step 15748: {'lr': 0.0003888573323618331, 'samples': 8063488, 'steps': 15748, 'batch_loss/train': 0.8477816851809621} 12/25/2021 01:16:50 - INFO - codeparrot_training - Step 15749: {'lr': 0.00038884422403489893, 'samples': 8064000, 'steps': 15749, 'batch_loss/train': 0.7981779668480158} 12/25/2021 01:17:00 - INFO - codeparrot_training - Step 15750: {'lr': 0.00038883111515597256, 'samples': 8064512, 'steps': 15750, 'batch_loss/train': 0.8079640376381576} 12/25/2021 01:17:11 - INFO - codeparrot_training - Step 15751: {'lr': 0.0003888180057251061, 'samples': 8065024, 'steps': 15751, 'batch_loss/train': 0.8195306742563844} 12/25/2021 01:17:23 - INFO - codeparrot_training - Step 15752: {'lr': 
0.00038880489574235174, 'samples': 8065536, 'steps': 15752, 'batch_loss/train': 0.7093187631107867} 12/25/2021 01:17:34 - INFO - codeparrot_training - Step 15753: {'lr': 0.00038879178520776137, 'samples': 8066048, 'steps': 15753, 'batch_loss/train': 0.7841576496139169} 12/25/2021 01:17:44 - INFO - codeparrot_training - Step 15754: {'lr': 0.0003887786741213874, 'samples': 8066560, 'steps': 15754, 'batch_loss/train': 0.865578860975802} 12/25/2021 01:17:56 - INFO - codeparrot_training - Step 15755: {'lr': 0.00038876556248328175, 'samples': 8067072, 'steps': 15755, 'batch_loss/train': 0.7194801014848053} 12/25/2021 01:18:07 - INFO - codeparrot_training - Step 15756: {'lr': 0.0003887524502934967, 'samples': 8067584, 'steps': 15756, 'batch_loss/train': 0.7807535557076335} 12/25/2021 01:18:17 - INFO - codeparrot_training - Step 15757: {'lr': 0.00038873933755208424, 'samples': 8068096, 'steps': 15757, 'batch_loss/train': 0.8337992522865534} 12/25/2021 01:18:31 - INFO - codeparrot_training - Step 15758: {'lr': 0.00038872622425909653, 'samples': 8068608, 'steps': 15758, 'batch_loss/train': 0.8307599449763075} 12/25/2021 01:18:42 - INFO - codeparrot_training - Step 15759: {'lr': 0.0003887131104145858, 'samples': 8069120, 'steps': 15759, 'batch_loss/train': 0.7859542621299624} 12/25/2021 01:18:52 - INFO - codeparrot_training - Step 15760: {'lr': 0.00038869999601860413, 'samples': 8069632, 'steps': 15760, 'batch_loss/train': 0.8759892955422401} 12/25/2021 01:19:03 - INFO - codeparrot_training - Step 15761: {'lr': 0.0003886868810712036, 'samples': 8070144, 'steps': 15761, 'batch_loss/train': 0.7202123799361289} 12/25/2021 01:19:15 - INFO - codeparrot_training - Step 15762: {'lr': 0.0003886737655724364, 'samples': 8070656, 'steps': 15762, 'batch_loss/train': 1.1478245612233877} 12/25/2021 01:19:26 - INFO - codeparrot_training - Step 15763: {'lr': 0.00038866064952235467, 'samples': 8071168, 'steps': 15763, 'batch_loss/train': 0.7654831726104021} 12/25/2021 01:19:36 - INFO - 
codeparrot_training - Step 15764: {'lr': 0.0003886475329210106, 'samples': 8071680, 'steps': 15764, 'batch_loss/train': 0.8525100278202444} 12/25/2021 01:19:48 - INFO - codeparrot_training - Step 15765: {'lr': 0.0003886344157684562, 'samples': 8072192, 'steps': 15765, 'batch_loss/train': 0.8039464661851525} 12/25/2021 01:19:59 - INFO - codeparrot_training - Step 15766: {'lr': 0.0003886212980647438, 'samples': 8072704, 'steps': 15766, 'batch_loss/train': 0.7589243678376079} 12/25/2021 01:20:10 - INFO - codeparrot_training - Step 15767: {'lr': 0.00038860817980992533, 'samples': 8073216, 'steps': 15767, 'batch_loss/train': 0.8870594874024391} 12/25/2021 01:20:22 - INFO - codeparrot_training - Step 15768: {'lr': 0.00038859506100405325, 'samples': 8073728, 'steps': 15768, 'batch_loss/train': 0.8176291091367602} 12/25/2021 01:20:33 - INFO - codeparrot_training - Step 15769: {'lr': 0.00038858194164717943, 'samples': 8074240, 'steps': 15769, 'batch_loss/train': 0.7526822052896023} 12/25/2021 01:20:43 - INFO - codeparrot_training - Step 15770: {'lr': 0.00038856882173935614, 'samples': 8074752, 'steps': 15770, 'batch_loss/train': 0.7799402549862862} 12/25/2021 01:20:57 - INFO - codeparrot_training - Step 15771: {'lr': 0.00038855570128063557, 'samples': 8075264, 'steps': 15771, 'batch_loss/train': 0.5921095956582576} 12/25/2021 01:21:08 - INFO - codeparrot_training - Step 15772: {'lr': 0.0003885425802710698, 'samples': 8075776, 'steps': 15772, 'batch_loss/train': 0.7451295657083392} 12/25/2021 01:21:18 - INFO - codeparrot_training - Step 15773: {'lr': 0.0003885294587107111, 'samples': 8076288, 'steps': 15773, 'batch_loss/train': 0.8002108056098223} 12/25/2021 01:21:29 - INFO - codeparrot_training - Step 15774: {'lr': 0.0003885163365996116, 'samples': 8076800, 'steps': 15774, 'batch_loss/train': 0.8798824055120349} 12/25/2021 01:21:41 - INFO - codeparrot_training - Step 15775: {'lr': 0.0003885032139378234, 'samples': 8077312, 'steps': 15775, 'batch_loss/train': 
0.8160007442347705} 12/25/2021 01:21:52 - INFO - codeparrot_training - Step 15776: {'lr': 0.0003884900907253988, 'samples': 8077824, 'steps': 15776, 'batch_loss/train': 0.7782416632398963} 12/25/2021 01:22:03 - INFO - codeparrot_training - Step 15777: {'lr': 0.00038847696696238973, 'samples': 8078336, 'steps': 15777, 'batch_loss/train': 0.796853574924171} 12/25/2021 01:22:16 - INFO - codeparrot_training - Step 15778: {'lr': 0.0003884638426488486, 'samples': 8078848, 'steps': 15778, 'batch_loss/train': 0.7981440995354205} 12/25/2021 01:22:27 - INFO - codeparrot_training - Step 15779: {'lr': 0.00038845071778482754, 'samples': 8079360, 'steps': 15779, 'batch_loss/train': 0.8315826123580337} 12/25/2021 01:22:38 - INFO - codeparrot_training - Step 15780: {'lr': 0.00038843759237037866, 'samples': 8079872, 'steps': 15780, 'batch_loss/train': 0.8533467542147264} 12/25/2021 01:22:49 - INFO - codeparrot_training - Step 15781: {'lr': 0.0003884244664055542, 'samples': 8080384, 'steps': 15781, 'batch_loss/train': 0.8294198457151651} 12/25/2021 01:23:00 - INFO - codeparrot_training - Step 15782: {'lr': 0.0003884113398904063, 'samples': 8080896, 'steps': 15782, 'batch_loss/train': 0.76695159857627} 12/25/2021 01:23:11 - INFO - codeparrot_training - Step 15783: {'lr': 0.0003883982128249872, 'samples': 8081408, 'steps': 15783, 'batch_loss/train': 0.964668293017894} 12/25/2021 01:23:21 - INFO - codeparrot_training - Step 15784: {'lr': 0.0003883850852093491, 'samples': 8081920, 'steps': 15784, 'batch_loss/train': 0.7925855591893196} 12/25/2021 01:23:33 - INFO - codeparrot_training - Step 15785: {'lr': 0.00038837195704354417, 'samples': 8082432, 'steps': 15785, 'batch_loss/train': 0.6926341284997761} 12/25/2021 01:23:44 - INFO - codeparrot_training - Step 15786: {'lr': 0.00038835882832762453, 'samples': 8082944, 'steps': 15786, 'batch_loss/train': 0.8618613248690963} 12/25/2021 01:23:55 - INFO - codeparrot_training - Step 15787: {'lr': 0.00038834569906164246, 'samples': 8083456, 
'steps': 15787, 'batch_loss/train': 0.7856345595791936} 12/25/2021 01:24:09 - INFO - codeparrot_training - Step 15788: {'lr': 0.0003883325692456501, 'samples': 8083968, 'steps': 15788, 'batch_loss/train': 0.7576914150267839} 12/25/2021 01:24:19 - INFO - codeparrot_training - Step 15789: {'lr': 0.0003883194388796997, 'samples': 8084480, 'steps': 15789, 'batch_loss/train': 0.8791575022041798} 12/25/2021 01:24:30 - INFO - codeparrot_training - Step 15790: {'lr': 0.00038830630796384344, 'samples': 8084992, 'steps': 15790, 'batch_loss/train': 0.8359487438574433} 12/25/2021 01:24:42 - INFO - codeparrot_training - Step 15791: {'lr': 0.00038829317649813344, 'samples': 8085504, 'steps': 15791, 'batch_loss/train': 2.3388342964462936} 12/25/2021 01:24:53 - INFO - codeparrot_training - Step 15792: {'lr': 0.0003882800444826221, 'samples': 8086016, 'steps': 15792, 'batch_loss/train': 0.8238516272976995} 12/25/2021 01:25:03 - INFO - codeparrot_training - Step 15793: {'lr': 0.0003882669119173615, 'samples': 8086528, 'steps': 15793, 'batch_loss/train': 0.8959995210170746} 12/25/2021 01:25:14 - INFO - codeparrot_training - Step 15794: {'lr': 0.0003882537788024039, 'samples': 8087040, 'steps': 15794, 'batch_loss/train': 0.8046516655012965} 12/25/2021 01:25:29 - INFO - codeparrot_training - Step 15795: {'lr': 0.00038824064513780134, 'samples': 8087552, 'steps': 15795, 'batch_loss/train': 0.8112784763798118} 12/25/2021 01:25:39 - INFO - codeparrot_training - Step 15796: {'lr': 0.00038822751092360626, 'samples': 8088064, 'steps': 15796, 'batch_loss/train': 1.3510772194713354} 12/25/2021 01:25:50 - INFO - codeparrot_training - Step 15797: {'lr': 0.0003882143761598708, 'samples': 8088576, 'steps': 15797, 'batch_loss/train': 0.7776253074407578} 12/25/2021 01:26:02 - INFO - codeparrot_training - Step 15798: {'lr': 0.00038820124084664717, 'samples': 8089088, 'steps': 15798, 'batch_loss/train': 0.7244692739332095} 12/25/2021 01:26:12 - INFO - codeparrot_training - Step 15799: {'lr': 
0.0003881881049839875, 'samples': 8089600, 'steps': 15799, 'batch_loss/train': 0.9769402123056352} 12/25/2021 01:26:23 - INFO - codeparrot_training - Step 15800: {'lr': 0.00038817496857194425, 'samples': 8090112, 'steps': 15800, 'batch_loss/train': 0.8827284863218665} 12/25/2021 01:26:33 - INFO - codeparrot_training - Step 15801: {'lr': 0.0003881618316105694, 'samples': 8090624, 'steps': 15801, 'batch_loss/train': 0.7763396454975009} 12/25/2021 01:26:46 - INFO - codeparrot_training - Step 15802: {'lr': 0.0003881486940999153, 'samples': 8091136, 'steps': 15802, 'batch_loss/train': 0.8585425550118089} 12/25/2021 01:26:56 - INFO - codeparrot_training - Step 15803: {'lr': 0.0003881355560400341, 'samples': 8091648, 'steps': 15803, 'batch_loss/train': 0.8608677983283997} 12/25/2021 01:27:07 - INFO - codeparrot_training - Step 15804: {'lr': 0.00038812241743097815, 'samples': 8092160, 'steps': 15804, 'batch_loss/train': 0.7936427155509591} 12/25/2021 01:27:19 - INFO - codeparrot_training - Step 15805: {'lr': 0.00038810927827279964, 'samples': 8092672, 'steps': 15805, 'batch_loss/train': 0.7100028721615672} 12/25/2021 01:27:30 - INFO - codeparrot_training - Step 15806: {'lr': 0.0003880961385655507, 'samples': 8093184, 'steps': 15806, 'batch_loss/train': 0.8852280862629414} 12/25/2021 01:27:40 - INFO - codeparrot_training - Step 15807: {'lr': 0.00038808299830928375, 'samples': 8093696, 'steps': 15807, 'batch_loss/train': 0.8328166026622057} 12/25/2021 01:27:55 - INFO - codeparrot_training - Step 15808: {'lr': 0.0003880698575040509, 'samples': 8094208, 'steps': 15808, 'batch_loss/train': 1.552181909326464} 12/25/2021 01:28:05 - INFO - codeparrot_training - Step 15809: {'lr': 0.0003880567161499044, 'samples': 8094720, 'steps': 15809, 'batch_loss/train': 0.7883434616960585} 12/25/2021 01:28:16 - INFO - codeparrot_training - Step 15810: {'lr': 0.00038804357424689653, 'samples': 8095232, 'steps': 15810, 'batch_loss/train': 0.8242688747122884} 12/25/2021 01:28:27 - INFO - 
codeparrot_training - Step 15811: {'lr': 0.0003880304317950796, 'samples': 8095744, 'steps': 15811, 'batch_loss/train': 0.8659042678773403} 12/25/2021 01:28:39 - INFO - codeparrot_training - Step 15812: {'lr': 0.0003880172887945057, 'samples': 8096256, 'steps': 15812, 'batch_loss/train': 0.9363572802394629} 12/25/2021 01:28:50 - INFO - codeparrot_training - Step 15813: {'lr': 0.00038800414524522726, 'samples': 8096768, 'steps': 15813, 'batch_loss/train': 0.7141980035230517} 12/25/2021 01:29:00 - INFO - codeparrot_training - Step 15814: {'lr': 0.00038799100114729636, 'samples': 8097280, 'steps': 15814, 'batch_loss/train': 0.7787153943208978} 12/25/2021 01:29:12 - INFO - codeparrot_training - Step 15815: {'lr': 0.0003879778565007654, 'samples': 8097792, 'steps': 15815, 'batch_loss/train': 1.5909388340078294} 12/25/2021 01:29:23 - INFO - codeparrot_training - Step 15816: {'lr': 0.0003879647113056866, 'samples': 8098304, 'steps': 15816, 'batch_loss/train': 0.8117691529914737} 12/25/2021 01:29:34 - INFO - codeparrot_training - Step 15817: {'lr': 0.00038795156556211216, 'samples': 8098816, 'steps': 15817, 'batch_loss/train': 0.8949981573969126} 12/25/2021 01:29:44 - INFO - codeparrot_training - Step 15818: {'lr': 0.00038793841927009444, 'samples': 8099328, 'steps': 15818, 'batch_loss/train': 0.821864596568048} 12/25/2021 01:29:58 - INFO - codeparrot_training - Step 15819: {'lr': 0.00038792527242968554, 'samples': 8099840, 'steps': 15819, 'batch_loss/train': 0.7330413246527314} 12/25/2021 01:30:09 - INFO - codeparrot_training - Step 15820: {'lr': 0.0003879121250409379, 'samples': 8100352, 'steps': 15820, 'batch_loss/train': 1.0355707705020905} 12/25/2021 01:30:19 - INFO - codeparrot_training - Step 15821: {'lr': 0.0003878989771039038, 'samples': 8100864, 'steps': 15821, 'batch_loss/train': 0.7631393475458026} 12/25/2021 01:30:32 - INFO - codeparrot_training - Step 15822: {'lr': 0.0003878858286186353, 'samples': 8101376, 'steps': 15822, 'batch_loss/train': 
0.8185333721339703} 12/25/2021 01:30:42 - INFO - codeparrot_training - Step 15823: {'lr': 0.00038787267958518483, 'samples': 8101888, 'steps': 15823, 'batch_loss/train': 0.7538043493404984} 12/25/2021 01:30:53 - INFO - codeparrot_training - Step 15824: {'lr': 0.00038785953000360475, 'samples': 8102400, 'steps': 15824, 'batch_loss/train': 1.027272627223283} 12/25/2021 01:31:05 - INFO - codeparrot_training - Step 15825: {'lr': 0.0003878463798739471, 'samples': 8102912, 'steps': 15825, 'batch_loss/train': 0.9033350525423884} 12/25/2021 01:31:16 - INFO - codeparrot_training - Step 15826: {'lr': 0.0003878332291962644, 'samples': 8103424, 'steps': 15826, 'batch_loss/train': 0.9149735840037465} 12/25/2021 01:31:26 - INFO - codeparrot_training - Step 15827: {'lr': 0.00038782007797060874, 'samples': 8103936, 'steps': 15827, 'batch_loss/train': 0.9362528212368488} 12/25/2021 01:31:37 - INFO - codeparrot_training - Step 15828: {'lr': 0.0003878069261970325, 'samples': 8104448, 'steps': 15828, 'batch_loss/train': 0.6363034762907773} 12/25/2021 01:31:51 - INFO - codeparrot_training - Step 15829: {'lr': 0.000387793773875588, 'samples': 8104960, 'steps': 15829, 'batch_loss/train': 0.7258311687037349} 12/25/2021 01:32:01 - INFO - codeparrot_training - Step 15830: {'lr': 0.0003877806210063275, 'samples': 8105472, 'steps': 15830, 'batch_loss/train': 0.8605827707797289} 12/25/2021 01:32:12 - INFO - codeparrot_training - Step 15831: {'lr': 0.0003877674675893033, 'samples': 8105984, 'steps': 15831, 'batch_loss/train': 0.7228332534432411} 12/25/2021 01:32:24 - INFO - codeparrot_training - Step 15832: {'lr': 0.0003877543136245676, 'samples': 8106496, 'steps': 15832, 'batch_loss/train': 0.640756132081151} 12/25/2021 01:32:35 - INFO - codeparrot_training - Step 15833: {'lr': 0.0003877411591121728, 'samples': 8107008, 'steps': 15833, 'batch_loss/train': 0.8831069339066744} 12/25/2021 01:32:46 - INFO - codeparrot_training - Step 15834: {'lr': 0.0003877280040521711, 'samples': 8107520, 
'steps': 15834, 'batch_loss/train': 0.6902056368999183} 12/25/2021 01:32:59 - INFO - codeparrot_training - Step 15835: {'lr': 0.0003877148484446149, 'samples': 8108032, 'steps': 15835, 'batch_loss/train': 1.0067432979121804} 12/25/2021 01:33:10 - INFO - codeparrot_training - Step 15836: {'lr': 0.0003877016922895564, 'samples': 8108544, 'steps': 15836, 'batch_loss/train': 0.8549411841668189} 12/25/2021 01:33:21 - INFO - codeparrot_training - Step 15837: {'lr': 0.0003876885355870481, 'samples': 8109056, 'steps': 15837, 'batch_loss/train': 0.8040331108495593} 12/25/2021 01:33:31 - INFO - codeparrot_training - Step 15838: {'lr': 0.0003876753783371421, 'samples': 8109568, 'steps': 15838, 'batch_loss/train': 0.8002834897488356} 12/25/2021 01:33:43 - INFO - codeparrot_training - Step 15839: {'lr': 0.0003876622205398907, 'samples': 8110080, 'steps': 15839, 'batch_loss/train': 0.8838485758751631} 12/25/2021 01:33:54 - INFO - codeparrot_training - Step 15840: {'lr': 0.00038764906219534637, 'samples': 8110592, 'steps': 15840, 'batch_loss/train': 0.9330827724188566} 12/25/2021 01:34:05 - INFO - codeparrot_training - Step 15841: {'lr': 0.00038763590330356134, 'samples': 8111104, 'steps': 15841, 'batch_loss/train': 0.8955740034580231} 12/25/2021 01:34:17 - INFO - codeparrot_training - Step 15842: {'lr': 0.00038762274386458794, 'samples': 8111616, 'steps': 15842, 'batch_loss/train': 0.9341829782351851} 12/25/2021 01:34:27 - INFO - codeparrot_training - Step 15843: {'lr': 0.00038760958387847847, 'samples': 8112128, 'steps': 15843, 'batch_loss/train': 0.7551649492233992} 12/25/2021 01:34:38 - INFO - codeparrot_training - Step 15844: {'lr': 0.00038759642334528524, 'samples': 8112640, 'steps': 15844, 'batch_loss/train': 0.8448627507314086} 12/25/2021 01:34:51 - INFO - codeparrot_training - Step 15845: {'lr': 0.00038758326226506057, 'samples': 8113152, 'steps': 15845, 'batch_loss/train': 0.843301868531853} 12/25/2021 01:35:02 - INFO - codeparrot_training - Step 15846: {'lr': 
0.00038757010063785683, 'samples': 8113664, 'steps': 15846, 'batch_loss/train': 1.5293164160102606}
12/25/2021 01:35:13 - INFO - codeparrot_training - Step 15847: {'lr': 0.0003875569384637263, 'samples': 8114176, 'steps': 15847, 'batch_loss/train': 0.9010554095730186}
12/25/2021 01:35:23 - INFO - codeparrot_training - Step 15848: {'lr': 0.0003875437757427214, 'samples': 8114688, 'steps': 15848, 'batch_loss/train': 0.8444194495677948}
12/25/2021 01:35:36 - INFO - codeparrot_training - Step 15849: {'lr': 0.00038753061247489423, 'samples': 8115200, 'steps': 15849, 'batch_loss/train': 0.9140803748741746}
12/25/2021 01:35:46 - INFO - codeparrot_training - Step 15850: {'lr': 0.0003875174486602974, 'samples': 8115712, 'steps': 15850, 'batch_loss/train': 1.711788178421557}
12/25/2021 01:35:57 - INFO - codeparrot_training - Step 15851: {'lr': 0.000387504284298983, 'samples': 8116224, 'steps': 15851, 'batch_loss/train': 1.260687910602428}
12/25/2021 01:36:09 - INFO - codeparrot_training - Step 15852: {'lr': 0.00038749111939100355, 'samples': 8116736, 'steps': 15852, 'batch_loss/train': 0.808910702355206}
12/25/2021 01:36:20 - INFO - codeparrot_training - Step 15853: {'lr': 0.00038747795393641137, 'samples': 8117248, 'steps': 15853, 'batch_loss/train': 0.8862286945804954}
12/25/2021 01:36:30 - INFO - codeparrot_training - Step 15854: {'lr': 0.00038746478793525867, 'samples': 8117760, 'steps': 15854, 'batch_loss/train': 0.8333350396715105}
12/25/2021 01:36:41 - INFO - codeparrot_training - Step 15855: {'lr': 0.0003874516213875979, 'samples': 8118272, 'steps': 15855, 'batch_loss/train': 0.7371831117197871}
12/25/2021 01:36:53 - INFO - codeparrot_training - Step 15856: {'lr': 0.00038743845429348145, 'samples': 8118784, 'steps': 15856, 'batch_loss/train': 0.8397204801440239}
12/25/2021 01:37:03 - INFO - codeparrot_training - Step 15857: {'lr': 0.0003874252866529616, 'samples': 8119296, 'steps': 15857, 'batch_loss/train': 0.8299220823682845}
12/25/2021 01:37:14 - INFO - codeparrot_training - Step 15858: {'lr': 0.00038741211846609057, 'samples': 8119808, 'steps': 15858, 'batch_loss/train': 0.88418132904917}
12/25/2021 01:37:28 - INFO - codeparrot_training - Step 15859: {'lr': 0.000387398949732921, 'samples': 8120320, 'steps': 15859, 'batch_loss/train': 1.6038665059022605}
12/25/2021 01:37:38 - INFO - codeparrot_training - Step 15860: {'lr': 0.000387385780453505, 'samples': 8120832, 'steps': 15860, 'batch_loss/train': 0.7689180788584054}
12/25/2021 01:37:49 - INFO - codeparrot_training - Step 15861: {'lr': 0.000387372610627895, 'samples': 8121344, 'steps': 15861, 'batch_loss/train': 0.8917086201254278}
12/25/2021 01:38:01 - INFO - codeparrot_training - Step 15862: {'lr': 0.00038735944025614343, 'samples': 8121856, 'steps': 15862, 'batch_loss/train': 0.7735841209068894}
12/25/2021 01:38:12 - INFO - codeparrot_training - Step 15863: {'lr': 0.00038734626933830253, 'samples': 8122368, 'steps': 15863, 'batch_loss/train': 0.8367095682770014}
12/25/2021 01:38:22 - INFO - codeparrot_training - Step 15864: {'lr': 0.0003873330978744247, 'samples': 8122880, 'steps': 15864, 'batch_loss/train': 0.8241643570363522}
12/25/2021 01:38:36 - INFO - codeparrot_training - Step 15865: {'lr': 0.00038731992586456244, 'samples': 8123392, 'steps': 15865, 'batch_loss/train': 0.8427147883921862}
12/25/2021 01:38:47 - INFO - codeparrot_training - Step 15866: {'lr': 0.000387306753308768, 'samples': 8123904, 'steps': 15866, 'batch_loss/train': 0.8458883622661233}
12/25/2021 01:38:57 - INFO - codeparrot_training - Step 15867: {'lr': 0.0003872935802070936, 'samples': 8124416, 'steps': 15867, 'batch_loss/train': 0.7779077213490382}
12/25/2021 01:39:08 - INFO - codeparrot_training - Step 15868: {'lr': 0.0003872804065595919, 'samples': 8124928, 'steps': 15868, 'batch_loss/train': 0.8522523127030581}
12/25/2021 01:39:21 - INFO - codeparrot_training - Step 15869: {'lr': 0.00038726723236631505, 'samples': 8125440, 'steps': 15869, 'batch_loss/train': 0.7877747411839664}
12/25/2021 01:39:31 - INFO - codeparrot_training - Step 15870: {'lr': 0.0003872540576273156, 'samples': 8125952, 'steps': 15870, 'batch_loss/train': 0.8664722796529531}
12/25/2021 01:39:42 - INFO - codeparrot_training - Step 15871: {'lr': 0.00038724088234264574, 'samples': 8126464, 'steps': 15871, 'batch_loss/train': 0.5499703856185079}
12/25/2021 01:39:54 - INFO - codeparrot_training - Step 15872: {'lr': 0.0003872277065123581, 'samples': 8126976, 'steps': 15872, 'batch_loss/train': 0.8324263896793127}
12/25/2021 01:40:04 - INFO - codeparrot_training - Step 15873: {'lr': 0.00038721453013650483, 'samples': 8127488, 'steps': 15873, 'batch_loss/train': 0.7686089789494872}
12/25/2021 01:40:15 - INFO - codeparrot_training - Step 15874: {'lr': 0.0003872013532151384, 'samples': 8128000, 'steps': 15874, 'batch_loss/train': 0.8209060980007052}
12/25/2021 01:40:26 - INFO - codeparrot_training - Step 15875: {'lr': 0.00038718817574831115, 'samples': 8128512, 'steps': 15875, 'batch_loss/train': 0.6660781520185992}
12/25/2021 01:40:39 - INFO - codeparrot_training - Step 15876: {'lr': 0.00038717499773607557, 'samples': 8129024, 'steps': 15876, 'batch_loss/train': 0.8291748436167836}
12/25/2021 01:40:50 - INFO - codeparrot_training - Step 15877: {'lr': 0.00038716181917848396, 'samples': 8129536, 'steps': 15877, 'batch_loss/train': 0.7278742502676323}
12/25/2021 01:41:01 - INFO - codeparrot_training - Step 15878: {'lr': 0.0003871486400755888, 'samples': 8130048, 'steps': 15878, 'batch_loss/train': 0.8116934881545603}
12/25/2021 01:41:13 - INFO - codeparrot_training - Step 15879: {'lr': 0.0003871354604274424, 'samples': 8130560, 'steps': 15879, 'batch_loss/train': 0.9245752098504454}
12/25/2021 01:41:24 - INFO - codeparrot_training - Step 15880: {'lr': 0.00038712228023409716, 'samples': 8131072, 'steps': 15880, 'batch_loss/train': 0.7718422044999897}
12/25/2021 01:41:34 - INFO - codeparrot_training - Step 15881: {'lr': 0.00038710909949560556, 'samples': 8131584, 'steps': 15881, 'batch_loss/train': 0.8212699617724866}
12/25/2021 01:41:46 - INFO - codeparrot_training - Step 15882: {'lr': 0.00038709591821201993, 'samples': 8132096, 'steps': 15882, 'batch_loss/train': 0.7212107892846689}
12/25/2021 01:41:57 - INFO - codeparrot_training - Step 15883: {'lr': 0.00038708273638339276, 'samples': 8132608, 'steps': 15883, 'batch_loss/train': 0.7743928264826536}
12/25/2021 01:42:08 - INFO - codeparrot_training - Step 15884: {'lr': 0.0003870695540097763, 'samples': 8133120, 'steps': 15884, 'batch_loss/train': 0.7857242487370968}
12/25/2021 01:42:18 - INFO - codeparrot_training - Step 15885: {'lr': 0.00038705637109122306, 'samples': 8133632, 'steps': 15885, 'batch_loss/train': 0.7918857175391167}
12/25/2021 01:42:32 - INFO - codeparrot_training - Step 15886: {'lr': 0.00038704318762778543, 'samples': 8134144, 'steps': 15886, 'batch_loss/train': 0.7338874759152532}
12/25/2021 01:42:43 - INFO - codeparrot_training - Step 15887: {'lr': 0.0003870300036195159, 'samples': 8134656, 'steps': 15887, 'batch_loss/train': 0.8313284711912274}
12/25/2021 01:42:53 - INFO - codeparrot_training - Step 15888: {'lr': 0.00038701681906646673, 'samples': 8135168, 'steps': 15888, 'batch_loss/train': 0.8071038937196136}
12/25/2021 01:43:05 - INFO - codeparrot_training - Step 15889: {'lr': 0.00038700363396869045, 'samples': 8135680, 'steps': 15889, 'batch_loss/train': 0.7605591872707009}
12/25/2021 01:43:16 - INFO - codeparrot_training - Step 15890: {'lr': 0.0003869904483262395, 'samples': 8136192, 'steps': 15890, 'batch_loss/train': 0.8158892346546054}
12/25/2021 01:43:26 - INFO - codeparrot_training - Step 15891: {'lr': 0.00038697726213916616, 'samples': 8136704, 'steps': 15891, 'batch_loss/train': 0.835092926165089}
12/25/2021 01:43:39 - INFO - codeparrot_training - Step 15892: {'lr': 0.00038696407540752306, 'samples': 8137216, 'steps': 15892, 'batch_loss/train': 0.8660615645349026}
12/25/2021 01:43:49 - INFO - codeparrot_training - Step 15893: {'lr': 0.00038695088813136245, 'samples': 8137728, 'steps': 15893, 'batch_loss/train': 0.7833918277174234}
12/25/2021 01:44:00 - INFO - codeparrot_training - Step 15894: {'lr': 0.0003869377003107368, 'samples': 8138240, 'steps': 15894, 'batch_loss/train': 0.7370122694410384}
12/25/2021 01:44:10 - INFO - codeparrot_training - Step 15895: {'lr': 0.00038692451194569853, 'samples': 8138752, 'steps': 15895, 'batch_loss/train': 0.8058617315255105}
12/25/2021 01:44:24 - INFO - codeparrot_training - Step 15896: {'lr': 0.0003869113230363002, 'samples': 8139264, 'steps': 15896, 'batch_loss/train': 0.8570855716243386}
12/25/2021 01:44:35 - INFO - codeparrot_training - Step 15897: {'lr': 0.00038689813358259415, 'samples': 8139776, 'steps': 15897, 'batch_loss/train': 0.8876968119293451}
12/25/2021 01:44:45 - INFO - codeparrot_training - Step 15898: {'lr': 0.0003868849435846328, 'samples': 8140288, 'steps': 15898, 'batch_loss/train': 0.850636709947139}
12/25/2021 01:44:58 - INFO - codeparrot_training - Step 15899: {'lr': 0.0003868717530424685, 'samples': 8140800, 'steps': 15899, 'batch_loss/train': 0.8473618859425187}
12/25/2021 01:45:08 - INFO - codeparrot_training - Step 15900: {'lr': 0.0003868585619561539, 'samples': 8141312, 'steps': 15900, 'batch_loss/train': 1.1239904006943107}
12/25/2021 01:45:19 - INFO - codeparrot_training - Step 15901: {'lr': 0.00038684537032574124, 'samples': 8141824, 'steps': 15901, 'batch_loss/train': 0.7259607380256057}
12/25/2021 01:45:31 - INFO - codeparrot_training - Step 15902: {'lr': 0.00038683217815128315, 'samples': 8142336, 'steps': 15902, 'batch_loss/train': 0.8668377576395869}
12/25/2021 01:45:41 - INFO - codeparrot_training - Step 15903: {'lr': 0.0003868189854328319, 'samples': 8142848, 'steps': 15903, 'batch_loss/train': 0.7397450185380876}
12/25/2021 01:45:52 - INFO - codeparrot_training - Step 15904: {'lr': 0.0003868057921704401, 'samples': 8143360, 'steps': 15904, 'batch_loss/train': 0.8425621276255697}
12/25/2021 01:46:03 - INFO - codeparrot_training - Step 15905: {'lr': 0.0003867925983641601, 'samples': 8143872, 'steps': 15905, 'batch_loss/train': 0.9375513000413775}
12/25/2021 01:46:16 - INFO - codeparrot_training - Step 15906: {'lr': 0.0003867794040140443, 'samples': 8144384, 'steps': 15906, 'batch_loss/train': 0.8249609540216625}
12/25/2021 01:46:27 - INFO - codeparrot_training - Step 15907: {'lr': 0.0003867662091201454, 'samples': 8144896, 'steps': 15907, 'batch_loss/train': 0.8533682487905025}
12/25/2021 01:46:38 - INFO - codeparrot_training - Step 15908: {'lr': 0.0003867530136825156, 'samples': 8145408, 'steps': 15908, 'batch_loss/train': 0.8363103144802153}
12/25/2021 01:46:50 - INFO - codeparrot_training - Step 15909: {'lr': 0.0003867398177012074, 'samples': 8145920, 'steps': 15909, 'batch_loss/train': 0.7907789656892419}
12/25/2021 01:47:00 - INFO - codeparrot_training - Step 15910: {'lr': 0.0003867266211762734, 'samples': 8146432, 'steps': 15910, 'batch_loss/train': 0.7807363520842046}
12/25/2021 01:47:11 - INFO - codeparrot_training - Step 15911: {'lr': 0.00038671342410776595, 'samples': 8146944, 'steps': 15911, 'batch_loss/train': 0.8284837482497096}
12/25/2021 01:47:25 - INFO - codeparrot_training - Step 15912: {'lr': 0.00038670022649573755, 'samples': 8147456, 'steps': 15912, 'batch_loss/train': 0.8006750317290425}
12/25/2021 01:47:35 - INFO - codeparrot_training - Step 15913: {'lr': 0.0003866870283402406, 'samples': 8147968, 'steps': 15913, 'batch_loss/train': 0.8742643436416984}
12/25/2021 01:47:46 - INFO - codeparrot_training - Step 15914: {'lr': 0.0003866738296413277, 'samples': 8148480, 'steps': 15914, 'batch_loss/train': 0.8479123343713582}
12/25/2021 01:47:56 - INFO - codeparrot_training - Step 15915: {'lr': 0.0003866606303990513, 'samples': 8148992, 'steps': 15915, 'batch_loss/train': 0.7559634274803102}
12/25/2021 01:48:09 - INFO - codeparrot_training - Step 15916: {'lr': 0.0003866474306134638, 'samples': 8149504, 'steps': 15916, 'batch_loss/train': 0.8192490248475224}
12/25/2021 01:48:19 - INFO - codeparrot_training - Step 15917: {'lr': 0.00038663423028461767, 'samples': 8150016, 'steps': 15917, 'batch_loss/train': 0.7710335599258542}
12/25/2021 01:48:30 - INFO - codeparrot_training - Step 15918: {'lr': 0.0003866210294125655, 'samples': 8150528, 'steps': 15918, 'batch_loss/train': 1.0272972490638494}
12/25/2021 01:48:42 - INFO - codeparrot_training - Step 15919: {'lr': 0.00038660782799735966, 'samples': 8151040, 'steps': 15919, 'batch_loss/train': 0.82004225673154}
12/25/2021 01:48:53 - INFO - codeparrot_training - Step 15920: {'lr': 0.0003865946260390527, 'samples': 8151552, 'steps': 15920, 'batch_loss/train': 0.817459037527442}
12/25/2021 01:49:03 - INFO - codeparrot_training - Step 15921: {'lr': 0.0003865814235376971, 'samples': 8152064, 'steps': 15921, 'batch_loss/train': 0.8045809753239155}
12/25/2021 01:49:15 - INFO - codeparrot_training - Step 15922: {'lr': 0.00038656822049334526, 'samples': 8152576, 'steps': 15922, 'batch_loss/train': 0.8840352231636643}
12/25/2021 01:49:26 - INFO - codeparrot_training - Step 15923: {'lr': 0.00038655501690604976, 'samples': 8153088, 'steps': 15923, 'batch_loss/train': 0.86929767485708}
12/25/2021 01:49:36 - INFO - codeparrot_training - Step 15924: {'lr': 0.00038654181277586317, 'samples': 8153600, 'steps': 15924, 'batch_loss/train': 0.8093113685026765}
12/25/2021 01:49:50 - INFO - codeparrot_training - Step 15925: {'lr': 0.0003865286081028378, 'samples': 8154112, 'steps': 15925, 'batch_loss/train': 0.7946366625837982}
12/25/2021 01:50:01 - INFO - codeparrot_training - Step 15926: {'lr': 0.0003865154028870262, 'samples': 8154624, 'steps': 15926, 'batch_loss/train': 0.8584465961903334}
12/25/2021 01:50:11 - INFO - codeparrot_training - Step 15927: {'lr': 0.000386502197128481, 'samples': 8155136, 'steps': 15927, 'batch_loss/train': 0.7386957972776145}
12/25/2021 01:50:22 - INFO - codeparrot_training - Step 15928: {'lr': 0.00038648899082725454, 'samples': 8155648, 'steps': 15928, 'batch_loss/train': 0.7640315662138164}
12/25/2021 01:50:34 - INFO - codeparrot_training - Step 15929: {'lr': 0.00038647578398339947, 'samples': 8156160, 'steps': 15929, 'batch_loss/train': 0.8132463060319424}
12/25/2021 01:50:45 - INFO - codeparrot_training - Step 15930: {'lr': 0.0003864625765969681, 'samples': 8156672, 'steps': 15930, 'batch_loss/train': 0.758939404040575}
12/25/2021 01:50:55 - INFO - codeparrot_training - Step 15931: {'lr': 0.00038644936866801314, 'samples': 8157184, 'steps': 15931, 'batch_loss/train': 0.7920699408277869}
12/25/2021 01:51:07 - INFO - codeparrot_training - Step 15932: {'lr': 0.0003864361601965869, 'samples': 8157696, 'steps': 15932, 'batch_loss/train': 0.8473845645785332}
12/25/2021 01:51:18 - INFO - codeparrot_training - Step 15933: {'lr': 0.0003864229511827422, 'samples': 8158208, 'steps': 15933, 'batch_loss/train': 0.8401345312595367}
12/25/2021 01:51:28 - INFO - codeparrot_training - Step 15934: {'lr': 0.0003864097416265311, 'samples': 8158720, 'steps': 15934, 'batch_loss/train': 0.7683365680277348}
12/25/2021 01:51:42 - INFO - codeparrot_training - Step 15935: {'lr': 0.00038639653152800656, 'samples': 8159232, 'steps': 15935, 'batch_loss/train': 0.8957844823598862}
12/25/2021 01:51:53 - INFO - codeparrot_training - Step 15936: {'lr': 0.00038638332088722084, 'samples': 8159744, 'steps': 15936, 'batch_loss/train': 0.7911417931318283}
12/25/2021 01:52:04 - INFO - codeparrot_training - Step 15937: {'lr': 0.0003863701097042266, 'samples': 8160256, 'steps': 15937, 'batch_loss/train': 1.0067486630287021}
12/25/2021 01:52:14 - INFO - codeparrot_training - Step 15938: {'lr': 0.00038635689797907627, 'samples': 8160768, 'steps': 15938, 'batch_loss/train': 0.8668714445084333}
12/25/2021 01:52:26 - INFO - codeparrot_training - Step 15939: {'lr': 0.0003863436857118223, 'samples': 8161280, 'steps': 15939, 'batch_loss/train': 0.8365232376381755}
12/25/2021 01:52:37 - INFO - codeparrot_training - Step 15940: {'lr': 0.00038633047290251743, 'samples': 8161792, 'steps': 15940, 'batch_loss/train': 0.7780972379259765}
12/25/2021 01:52:47 - INFO - codeparrot_training - Step 15941: {'lr': 0.000386317259551214, 'samples': 8162304, 'steps': 15941, 'batch_loss/train': 0.727162878960371}
12/25/2021 01:53:01 - INFO - codeparrot_training - Step 15942: {'lr': 0.00038630404565796473, 'samples': 8162816, 'steps': 15942, 'batch_loss/train': 0.7797388387843966}
12/25/2021 01:53:12 - INFO - codeparrot_training - Step 15943: {'lr': 0.0003862908312228219, 'samples': 8163328, 'steps': 15943, 'batch_loss/train': 0.8519325125962496}
12/25/2021 01:53:23 - INFO - codeparrot_training - Step 15944: {'lr': 0.00038627761624583835, 'samples': 8163840, 'steps': 15944, 'batch_loss/train': 0.9391738567501307}
12/25/2021 01:53:35 - INFO - codeparrot_training - Step 15945: {'lr': 0.0003862644007270663, 'samples': 8164352, 'steps': 15945, 'batch_loss/train': 0.7426668554544449}
12/25/2021 01:53:45 - INFO - codeparrot_training - Step 15946: {'lr': 0.0003862511846665585, 'samples': 8164864, 'steps': 15946, 'batch_loss/train': 0.8808911424130201}
12/25/2021 01:53:56 - INFO - codeparrot_training - Step 15947: {'lr': 0.0003862379680643675, 'samples': 8165376, 'steps': 15947, 'batch_loss/train': 0.7879922252614051}
12/25/2021 01:54:07 - INFO - codeparrot_training - Step 15948: {'lr': 0.0003862247509205457, 'samples': 8165888, 'steps': 15948, 'batch_loss/train': 0.8925350410863757}
12/25/2021 01:54:19 - INFO - codeparrot_training - Step 15949: {'lr': 0.0003862115332351458, 'samples': 8166400, 'steps': 15949, 'batch_loss/train': 0.5835096794180572}
12/25/2021 01:54:29 - INFO - codeparrot_training - Step 15950: {'lr': 0.0003861983150082201, 'samples': 8166912, 'steps': 15950, 'batch_loss/train': 0.9149036314338446}
12/25/2021 01:54:40 - INFO - codeparrot_training - Step 15951: {'lr': 0.00038618509623982155, 'samples': 8167424, 'steps': 15951, 'batch_loss/train': 0.7576398244127631}
12/25/2021 01:54:53 - INFO - codeparrot_training - Step 15952: {'lr': 0.00038617187693000234, 'samples': 8167936, 'steps': 15952, 'batch_loss/train': 0.8431260804645717}
12/25/2021 01:55:03 - INFO - codeparrot_training - Step 15953: {'lr': 0.00038615865707881524, 'samples': 8168448, 'steps': 15953, 'batch_loss/train': 0.8190409708768129}
12/25/2021 01:55:14 - INFO - codeparrot_training - Step 15954: {'lr': 0.0003861454366863126, 'samples': 8168960, 'steps': 15954, 'batch_loss/train': 0.7545430640457198}
12/25/2021 01:55:26 - INFO - codeparrot_training - Step 15955: {'lr': 0.0003861322157525473, 'samples': 8169472, 'steps': 15955, 'batch_loss/train': 0.787816857919097}
12/25/2021 01:55:36 - INFO - codeparrot_training - Step 15956: {'lr': 0.0003861189942775716, 'samples': 8169984, 'steps': 15956, 'batch_loss/train': 0.8659446869278327}
12/25/2021 01:55:47 - INFO - codeparrot_training - Step 15957: {'lr': 0.0003861057722614382, 'samples': 8170496, 'steps': 15957, 'batch_loss/train': 0.7620205436833203}
12/25/2021 01:55:59 - INFO - codeparrot_training - Step 15958: {'lr': 0.00038609254970419963, 'samples': 8171008, 'steps': 15958, 'batch_loss/train': 0.7407871743198484}
12/25/2021 01:56:10 - INFO - codeparrot_training - Step 15959: {'lr': 0.0003860793266059085, 'samples': 8171520, 'steps': 15959, 'batch_loss/train': 0.8457195097580552}
12/25/2021 01:56:20 - INFO - codeparrot_training - Step 15960: {'lr': 0.0003860661029666173, 'samples': 8172032, 'steps': 15960, 'batch_loss/train': 0.7798801818862557}
12/25/2021 01:56:31 - INFO - codeparrot_training - Step 15961: {'lr': 0.0003860528787863787, 'samples': 8172544, 'steps': 15961, 'batch_loss/train': 0.7474884039256722}
12/25/2021 01:56:43 - INFO - codeparrot_training - Step 15962: {'lr': 0.0003860396540652452, 'samples': 8173056, 'steps': 15962, 'batch_loss/train': 0.8923864169046283}
12/25/2021 01:56:53 - INFO - codeparrot_training - Step 15963: {'lr': 0.00038602642880326935, 'samples': 8173568, 'steps': 15963, 'batch_loss/train': 0.8600718812085688}
12/25/2021 01:57:04 - INFO - codeparrot_training - Step 15964: {'lr': 0.0003860132030005038, 'samples': 8174080, 'steps': 15964, 'batch_loss/train': 0.8738114880397916}
12/25/2021 01:57:18 - INFO - codeparrot_training - Step 15965: {'lr': 0.00038599997665700114, 'samples': 8174592, 'steps': 15965, 'batch_loss/train': 0.8058833004906774}
12/25/2021 01:57:28 - INFO - codeparrot_training - Step 15966: {'lr': 0.000385986749772814, 'samples': 8175104, 'steps': 15966, 'batch_loss/train': 0.8185350354760885}
12/25/2021 01:57:39 - INFO - codeparrot_training - Step 15967: {'lr': 0.0003859735223479948, 'samples': 8175616, 'steps': 15967, 'batch_loss/train': 0.986338971182704}
12/25/2021 01:57:51 - INFO - codeparrot_training - Step 15968: {'lr': 0.00038596029438259623, 'samples': 8176128, 'steps': 15968, 'batch_loss/train': 0.8111048098653555}
12/25/2021 01:58:02 - INFO - codeparrot_training - Step 15969: {'lr': 0.00038594706587667085, 'samples': 8176640, 'steps': 15969, 'batch_loss/train': 0.922180250287056}
12/25/2021 01:58:13 - INFO - codeparrot_training - Step 15970: {'lr': 0.00038593383683027134, 'samples': 8177152, 'steps': 15970, 'batch_loss/train': 1.156101196538657}
12/25/2021 01:58:23 - INFO - codeparrot_training - Step 15971: {'lr': 0.0003859206072434501, 'samples': 8177664, 'steps': 15971, 'batch_loss/train': 1.1727699318435043}
12/25/2021 01:58:36 - INFO - codeparrot_training - Step 15972: {'lr': 0.00038590737711625994, 'samples': 8178176, 'steps': 15972, 'batch_loss/train': 0.6796718371333554}
12/25/2021 01:58:46 - INFO - codeparrot_training - Step 15973: {'lr': 0.0003858941464487533, 'samples': 8178688, 'steps': 15973, 'batch_loss/train': 0.8828242023009807}
12/25/2021 01:58:57 - INFO - codeparrot_training - Step 15974: {'lr': 0.00038588091524098295, 'samples': 8179200, 'steps': 15974, 'batch_loss/train': 0.7550649229669943}
12/25/2021 01:59:09 - INFO - codeparrot_training - Step 15975: {'lr': 0.00038586768349300126, 'samples': 8179712, 'steps': 15975, 'batch_loss/train': 0.7840683097019792}
12/25/2021 01:59:20 - INFO - codeparrot_training - Step 15976: {'lr': 0.00038585445120486105, 'samples': 8180224, 'steps': 15976, 'batch_loss/train': 0.7015423662960529}
12/25/2021 01:59:30 - INFO - codeparrot_training - Step 15977: {'lr': 0.0003858412183766148, 'samples': 8180736, 'steps': 15977, 'batch_loss/train': 0.837539890781045}
12/25/2021 01:59:43 - INFO - codeparrot_training - Step 15978: {'lr': 0.0003858279850083152, 'samples': 8181248, 'steps': 15978, 'batch_loss/train': 0.8729233774356544}
12/25/2021 01:59:53 - INFO - codeparrot_training - Step 15979: {'lr': 0.0003858147511000148, 'samples': 8181760, 'steps': 15979, 'batch_loss/train': 0.8719982132315636}
12/25/2021 02:00:04 - INFO - codeparrot_training - Step 15980: {'lr': 0.0003858015166517662, 'samples': 8182272, 'steps': 15980, 'batch_loss/train': 0.7959411516785622}
12/25/2021 02:00:15 - INFO - codeparrot_training - Step 15981: {'lr': 0.00038578828166362214, 'samples': 8182784, 'steps': 15981, 'batch_loss/train': 0.7907535424456}
12/25/2021 02:00:27 - INFO - codeparrot_training - Step 15982: {'lr': 0.00038577504613563496, 'samples': 8183296, 'steps': 15982, 'batch_loss/train': 0.9062036825343966}
12/25/2021 02:00:38 - INFO - codeparrot_training - Step 15983: {'lr': 0.00038576181006785766, 'samples': 8183808, 'steps': 15983, 'batch_loss/train': 0.9249244048260152}
12/25/2021 02:00:49 - INFO - codeparrot_training - Step 15984: {'lr': 0.0003857485734603425, 'samples': 8184320, 'steps': 15984, 'batch_loss/train': 0.9262448074296117}
12/25/2021 02:01:01 - INFO - codeparrot_training - Step 15985: {'lr': 0.0003857353363131424, 'samples': 8184832, 'steps': 15985, 'batch_loss/train': 0.9019403057172894}
12/25/2021 02:01:11 - INFO - codeparrot_training - Step 15986: {'lr': 0.00038572209862630983, 'samples': 8185344, 'steps': 15986, 'batch_loss/train': 0.908678999170661}
12/25/2021 02:01:22 - INFO - codeparrot_training - Step 15987: {'lr': 0.0003857088603998974, 'samples': 8185856, 'steps': 15987, 'batch_loss/train': 0.7979239327833056}
12/25/2021 02:01:34 - INFO - codeparrot_training - Step 15988: {'lr': 0.00038569562163395777, 'samples': 8186368, 'steps': 15988, 'batch_loss/train': 0.923509752843529}
12/25/2021 02:01:45 - INFO - codeparrot_training - Step 15989: {'lr': 0.0003856823823285436, 'samples': 8186880, 'steps': 15989, 'batch_loss/train': 0.5822119442746043}
12/25/2021 02:01:56 - INFO - codeparrot_training - Step 15990: {'lr': 0.00038566914248370755, 'samples': 8187392, 'steps': 15990, 'batch_loss/train': 0.790259781293571}
12/25/2021 02:02:06 - INFO - codeparrot_training - Step 15991: {'lr': 0.0003856559020995022, 'samples': 8187904, 'steps': 15991, 'batch_loss/train': 0.8976043476723135}
12/25/2021 02:02:19 - INFO - codeparrot_training - Step 15992: {'lr': 0.0003856426611759802, 'samples': 8188416, 'steps': 15992, 'batch_loss/train': 0.8077692771330476}
12/25/2021 02:02:29 - INFO - codeparrot_training - Step 15993: {'lr': 0.00038562941971319423, 'samples': 8188928, 'steps': 15993, 'batch_loss/train': 0.9481079075485468}
12/25/2021 02:02:40 - INFO - codeparrot_training - Step 15994: {'lr': 0.00038561617771119686, 'samples': 8189440, 'steps': 15994, 'batch_loss/train': 0.9331595590338111}
12/25/2021 02:02:52 - INFO - codeparrot_training - Step 15995: {'lr': 0.0003856029351700407, 'samples': 8189952, 'steps': 15995, 'batch_loss/train': 0.8227151832543314}
12/25/2021 02:03:03 - INFO - codeparrot_training - Step 15996: {'lr': 0.0003855896920897786, 'samples': 8190464, 'steps': 15996, 'batch_loss/train': 0.7849228528793901}
12/25/2021 02:03:13 - INFO - codeparrot_training - Step 15997: {'lr': 0.00038557644847046296, 'samples': 8190976, 'steps': 15997, 'batch_loss/train': 0.8737126723863184}
12/25/2021 02:03:26 - INFO - codeparrot_training - Step 15998: {'lr': 0.00038556320431214664, 'samples': 8191488, 'steps': 15998, 'batch_loss/train': 0.8017204981297255}
12/25/2021 02:03:36 - INFO - codeparrot_training - Step 15999: {'lr': 0.00038554995961488214, 'samples': 8192000, 'steps': 15999, 'batch_loss/train': 0.8557989597320557}
12/25/2021 02:03:47 - INFO - codeparrot_training - Step 16000: {'lr': 0.00038553671437872226, 'samples': 8192512, 'steps': 16000, 'batch_loss/train': 0.5661846497096121}
12/25/2021 02:03:57 - INFO - codeparrot_training - Step 16001: {'lr': 0.00038552346860371955, 'samples': 8193024, 'steps': 16001, 'batch_loss/train': 0.8278540319297463}
12/25/2021 02:04:11 - INFO - codeparrot_training - Step 16002: {'lr': 0.00038551022228992674, 'samples': 8193536, 'steps': 16002, 'batch_loss/train': 0.5938393741380423}
12/25/2021 02:04:21 - INFO - codeparrot_training - Step 16003: {'lr': 0.00038549697543739637, 'samples': 8194048, 'steps': 16003, 'batch_loss/train': 0.8776170946657658}
12/25/2021 02:04:32 - INFO - codeparrot_training - Step 16004: {'lr': 0.0003854837280461813, 'samples': 8194560, 'steps': 16004, 'batch_loss/train': 0.8427702859044075}
12/25/2021 02:04:44 - INFO - codeparrot_training - Step 16005: {'lr': 0.00038547048011633396, 'samples': 8195072, 'steps': 16005, 'batch_loss/train': 0.7575284410268068}
12/25/2021 02:04:55 - INFO - codeparrot_training - Step 16006: {'lr': 0.0003854572316479072, 'samples': 8195584, 'steps': 16006, 'batch_loss/train': 0.9957738658413291}
12/25/2021 02:05:05 - INFO - codeparrot_training - Step 16007: {'lr': 0.0003854439826409536, 'samples': 8196096, 'steps': 16007, 'batch_loss/train': 1.1693482715636492}
12/25/2021 02:05:16 - INFO - codeparrot_training - Step 16008: {'lr': 0.0003854307330955259, 'samples': 8196608, 'steps': 16008, 'batch_loss/train': 0.8145033726468682}
12/25/2021 02:05:28 - INFO - codeparrot_training - Step 16009: {'lr': 0.0003854174830116767, 'samples': 8197120, 'steps': 16009, 'batch_loss/train': 1.078247170895338}
12/25/2021 02:05:39 - INFO - codeparrot_training - Step 16010: {'lr': 0.00038540423238945875, 'samples': 8197632, 'steps': 16010, 'batch_loss/train': 0.9409205941483378}
12/25/2021 02:05:49 - INFO - codeparrot_training - Step 16011: {'lr': 0.0003853909812289248, 'samples': 8198144, 'steps': 16011, 'batch_loss/train': 0.8196765496395528}
12/25/2021 02:06:02 - INFO - codeparrot_training - Step 16012: {'lr': 0.0003853777295301273, 'samples': 8198656, 'steps': 16012, 'batch_loss/train': 0.7268450483679771}
12/25/2021 02:06:13 - INFO - codeparrot_training - Step 16013: {'lr': 0.00038536447729311906, 'samples': 8199168, 'steps': 16013, 'batch_loss/train': 0.8359199985861778}
12/25/2021 02:06:23 - INFO - codeparrot_training - Step 16014: {'lr': 0.00038535122451795287, 'samples': 8199680, 'steps': 16014, 'batch_loss/train': 0.6864448846317828}
12/25/2021 02:06:35 - INFO - codeparrot_training - Step 16015: {'lr': 0.00038533797120468116, 'samples': 8200192, 'steps': 16015, 'batch_loss/train': 0.8538241358473897}
12/25/2021 02:06:46 - INFO - codeparrot_training - Step 16016: {'lr': 0.00038532471735335693, 'samples': 8200704, 'steps': 16016, 'batch_loss/train': 0.7892446191981435}
12/25/2021 02:06:57 - INFO - codeparrot_training - Step 16017: {'lr': 0.0003853114629640326, 'samples': 8201216, 'steps': 16017, 'batch_loss/train': 0.8845113404095173}
12/25/2021 02:07:07 - INFO - codeparrot_training - Step 16018: {'lr': 0.0003852982080367611, 'samples': 8201728, 'steps': 16018, 'batch_loss/train': 0.6081043148878962}
12/25/2021 02:07:20 - INFO - codeparrot_training - Step 16019: {'lr': 0.00038528495257159496, 'samples': 8202240, 'steps': 16019, 'batch_loss/train': 0.7931799464859068}
12/25/2021 02:07:30 - INFO - codeparrot_training - Step 16020: {'lr': 0.00038527169656858695, 'samples': 8202752, 'steps': 16020, 'batch_loss/train': 0.7702388390898705}
12/25/2021 02:07:41 - INFO - codeparrot_training - Step 16021: {'lr': 0.0003852584400277897, 'samples': 8203264, 'steps': 16021, 'batch_loss/train': 0.9468609434552491}
12/25/2021 02:07:53 - INFO - codeparrot_training - Step 16022: {'lr': 0.00038524518294925605, 'samples': 8203776, 'steps': 16022, 'batch_loss/train': 0.8058160310611129}
12/25/2021 02:08:04 - INFO - codeparrot_training - Step 16023: {'lr': 0.00038523192533303853, 'samples': 8204288, 'steps': 16023, 'batch_loss/train': 0.9166689086705446}
12/25/2021 02:08:14 - INFO - codeparrot_training - Step 16024: {'lr': 0.00038521866717919, 'samples': 8204800, 'steps': 16024, 'batch_loss/train': 0.9957166203530505}
12/25/2021 02:08:26 - INFO - codeparrot_training - Step 16025: {'lr': 0.0003852054084877631, 'samples': 8205312, 'steps': 16025, 'batch_loss/train': 0.8418481200933456}
12/25/2021 02:08:37 - INFO - codeparrot_training - Step 16026: {'lr': 0.00038519214925881046, 'samples': 8205824, 'steps': 16026, 'batch_loss/train': 0.8600256172940135}
12/25/2021 02:08:48 - INFO - codeparrot_training - Step 16027: {'lr': 0.000385178889492385, 'samples': 8206336, 'steps': 16027, 'batch_loss/train': 0.8112637223675847}
12/25/2021 02:08:58 - INFO - codeparrot_training - Step 16028: {'lr': 0.00038516562918853924, 'samples': 8206848, 'steps': 16028, 'batch_loss/train': 0.7726600728929043}
12/25/2021 02:09:11 - INFO - codeparrot_training - Step 16029: {'lr': 0.00038515236834732604, 'samples': 8207360, 'steps': 16029, 'batch_loss/train': 0.7598995317239314}
12/25/2021 02:09:22 - INFO - codeparrot_training - Step 16030: {'lr': 0.00038513910696879797, 'samples': 8207872, 'steps': 16030, 'batch_loss/train': 0.7537999115884304}
12/25/2021 02:09:32 - INFO - codeparrot_training - Step 16031: {'lr': 0.00038512584505300787, 'samples': 8208384, 'steps': 16031, 'batch_loss/train': 0.8204616480506957}
12/25/2021 02:09:45 - INFO - codeparrot_training - Step 16032: {'lr': 0.00038511258260000845, 'samples': 8208896, 'steps': 16032, 'batch_loss/train': 0.8652409724891186}
12/25/2021 02:09:55 - INFO - codeparrot_training - Step 16033: {'lr': 0.00038509931960985244, 'samples': 8209408, 'steps': 16033, 'batch_loss/train': 0.8952907249331474}
12/25/2021 02:10:06 - INFO - codeparrot_training - Step 16034: {'lr': 0.0003850860560825925, 'samples': 8209920, 'steps': 16034, 'batch_loss/train': 0.8826564839109778}
12/25/2021 02:10:18 - INFO - codeparrot_training - Step 16035: {'lr': 0.0003850727920182814, 'samples': 8210432, 'steps': 16035, 'batch_loss/train': 0.8212894881144166}
12/25/2021 02:10:29 - INFO - codeparrot_training - Step 16036: {'lr': 0.0003850595274169719, 'samples': 8210944, 'steps': 16036, 'batch_loss/train': 0.7932083439081907}
12/25/2021 02:10:39 - INFO - codeparrot_training - Step 16037: {'lr': 0.00038504626227871676, 'samples': 8211456, 'steps': 16037, 'batch_loss/train': 0.7684995280578732}
12/25/2021 02:10:50 - INFO - codeparrot_training - Step 16038: {'lr': 0.0003850329966035686, 'samples': 8211968, 'steps': 16038, 'batch_loss/train': 0.8060618788003922}
12/25/2021 02:11:02 - INFO - codeparrot_training - Step 16039: {'lr': 0.00038501973039158023, 'samples': 8212480, 'steps': 16039, 'batch_loss/train': 0.8994606621563435}
12/25/2021 02:11:13 - INFO - codeparrot_training - Step 16040: {'lr': 0.00038500646364280445, 'samples': 8212992, 'steps': 16040, 'batch_loss/train': 0.7925277790054679}
12/25/2021 02:11:23 - INFO - codeparrot_training - Step 16041: {'lr': 0.0003849931963572939, 'samples': 8213504, 'steps': 16041, 'batch_loss/train': 0.6804145281203091}
12/25/2021 02:11:36 - INFO - codeparrot_training - Step 16042: {'lr': 0.00038497992853510143, 'samples': 8214016, 'steps': 16042, 'batch_loss/train': 0.7541121831163764}
12/25/2021 02:11:47 - INFO - codeparrot_training - Step 16043: {'lr': 0.0003849666601762797, 'samples': 8214528, 'steps': 16043, 'batch_loss/train': 0.798967189155519}
12/25/2021 02:11:57 - INFO - codeparrot_training - Step 16044: {'lr': 0.00038495339128088145, 'samples': 8215040, 'steps': 16044, 'batch_loss/train': 0.7961586341261864}
12/25/2021 02:12:09 - INFO - codeparrot_training - Step 16045: {'lr': 0.00038494012184895954, 'samples': 8215552, 'steps': 16045, 'batch_loss/train': 0.8044637273997068}
12/25/2021 02:12:20 - INFO - codeparrot_training - Step 16046: {'lr': 0.00038492685188056664, 'samples': 8216064, 'steps': 16046, 'batch_loss/train': 0.7534742979332805}
12/25/2021 02:12:31 - INFO - codeparrot_training - Step 16047: {'lr': 0.00038491358137575555, 'samples': 8216576, 'steps': 16047, 'batch_loss/train': 0.797695537796244}
12/25/2021 02:12:43 - INFO - codeparrot_training - Step 16048: {'lr': 0.000384900310334579, 'samples': 8217088, 'steps': 16048, 'batch_loss/train': 0.8077402408234775}
12/25/2021 02:12:54 - INFO - codeparrot_training - Step 16049: {'lr': 0.00038488703875708965, 'samples': 8217600, 'steps': 16049, 'batch_loss/train': 0.8062938784714788}
12/25/2021 02:13:04 - INFO - codeparrot_training - Step 16050: {'lr': 0.0003848737666433404, 'samples': 8218112, 'steps': 16050, 'batch_loss/train': 0.8183341817930341}
12/25/2021 02:13:15 - INFO - codeparrot_training - Step 16051: {'lr': 0.00038486049399338407, 'samples': 8218624, 'steps': 16051, 'batch_loss/train': 0.8031514943577349}
12/25/2021 02:13:27 - INFO - codeparrot_training - Step 16052: {'lr': 0.00038484722080727325, 'samples': 8219136, 'steps': 16052, 'batch_loss/train': 0.8830859381705523}
12/25/2021 02:13:37 - INFO - codeparrot_training - Step 16053: {'lr': 0.0003848339470850609, 'samples': 8219648, 'steps': 16053, 'batch_loss/train': 0.8541720771463588}
12/25/2021 02:13:48 - INFO - codeparrot_training - Step 16054: {'lr': 0.0003848206728267995, 'samples': 8220160, 'steps': 16054, 'batch_loss/train': 0.8824032871052623}
12/25/2021 02:14:00 - INFO - codeparrot_training - Step 16055: {'lr': 0.0003848073980325422, 'samples': 8220672, 'steps': 16055, 'batch_loss/train': 0.9452198547078297}
12/25/2021 02:14:11 - INFO - codeparrot_training - Step 16056: {'lr': 0.0003847941227023415, 'samples': 8221184, 'steps': 16056, 'batch_loss/train': 0.9067675499245524}
12/25/2021 02:14:21 - INFO - codeparrot_training - Step 16057: {'lr': 0.00038478084683625025, 'samples': 8221696, 'steps': 16057, 'batch_loss/train': 0.6366102094762027}
12/25/2021 02:14:35 - INFO - codeparrot_training - Step 16058: {'lr': 0.00038476757043432125, 'samples': 8222208, 'steps': 16058, 'batch_loss/train': 0.758772318251431}
12/25/2021 02:14:46 - INFO - codeparrot_training - Step 16059: {'lr': 0.00038475429349660734, 'samples': 8222720, 'steps': 16059, 'batch_loss/train': 0.7825368791818619}
12/25/2021 02:14:56 - INFO - codeparrot_training - Step 16060: {'lr': 0.00038474101602316117, 'samples': 8223232, 'steps': 16060, 'batch_loss/train': 0.7972139772027731}
12/25/2021 02:15:07 - INFO - codeparrot_training - Step 16061: {'lr': 0.0003847277380140356, 'samples': 8223744, 'steps': 16061, 'batch_loss/train': 0.7779015469131991}
12/25/2021 02:15:19 - INFO - codeparrot_training - Step 16062: {'lr': 0.0003847144594692835, 'samples': 8224256, 'steps': 16062, 'batch_loss/train': 0.8281307513825595}
12/25/2021 02:15:30 - INFO - codeparrot_training - Step 16063: {'lr': 0.0003847011803889575, 'samples': 8224768, 'steps': 16063, 'batch_loss/train': 0.8663481143303216}
12/25/2021 02:15:40 - INFO - codeparrot_training - Step 16064: {'lr': 0.00038468790077311044, 'samples': 8225280, 'steps': 16064, 'batch_loss/train': 0.6960632111877203}
12/25/2021 02:15:52 - INFO - codeparrot_training - Step 16065: {'lr': 0.0003846746206217952, 'samples': 8225792, 'steps': 16065, 'batch_loss/train': 0.9468915048055351}
12/25/2021 02:16:03 - INFO - codeparrot_training - Step 16066: {'lr': 0.00038466133993506454, 'samples': 8226304, 'steps': 16066, 'batch_loss/train': 0.7841430846601725}
12/25/2021 02:16:13 - INFO - codeparrot_training - Step 16067: {'lr': 0.00038464805871297114, 'samples': 8226816, 'steps': 16067, 'batch_loss/train': 0.6967691984027624}
12/25/2021 02:16:27 - INFO - codeparrot_training - Step 16068: {'lr': 0.00038463477695556804, 'samples': 8227328, 'steps': 16068, 'batch_loss/train': 0.7953929621726274}
12/25/2021 02:16:37 - INFO - codeparrot_training - Step 16069: {'lr': 0.0003846214946629077, 'samples': 8227840, 'steps': 16069, 'batch_loss/train': 0.6268124002963305}
12/25/2021 02:16:48 - INFO - codeparrot_training - Step 16070: {'lr': 0.00038460821183504336, 'samples': 8228352, 'steps': 16070, 'batch_loss/train': 0.8324579335749149}
12/25/2021 02:16:58 - INFO - codeparrot_training - Step 16071: {'lr': 0.0003845949284720275, 'samples': 8228864, 'steps': 16071, 'batch_loss/train': 0.9602774975355715}
12/25/2021 02:17:11 - INFO - codeparrot_training - Step 16072: {'lr': 0.00038458164457391296, 'samples': 8229376, 'steps': 16072, 'batch_loss/train': 0.7889931909739971}
12/25/2021 02:17:21 - INFO - codeparrot_training - Step 16073: {'lr': 0.0003845683601407527, 'samples': 8229888, 'steps': 16073, 'batch_loss/train': 0.8188540693372488}
12/25/2021 02:17:32 - INFO - codeparrot_training - Step 16074: {'lr': 0.00038455507517259934, 'samples': 8230400, 'steps': 16074, 'batch_loss/train': 1.860931719187647}
12/25/2021 02:17:44 - INFO - codeparrot_training - Step 16075: {'lr': 0.00038454178966950596, 'samples': 8230912, 'steps': 16075, 'batch_loss/train': 0.7472907162737101}
12/25/2021 02:17:54 - INFO - codeparrot_training - Step 16076: {'lr': 0.00038452850363152513, 'samples': 8231424, 'steps': 16076, 'batch_loss/train': 0.8430562992580235}
12/25/2021 02:18:05 - INFO - codeparrot_training - Step 16077: {'lr': 0.00038451521705870986, 'samples': 8231936, 'steps': 16077, 'batch_loss/train': 0.845568910241127}
12/25/2021 02:18:18 - INFO - codeparrot_training - Step 16078: {'lr': 0.0003845019299511128, 'samples': 8232448, 'steps': 16078, 'batch_loss/train': 0.8897799476981163}
12/25/2021 02:18:28 - INFO - codeparrot_training - Step 16079: {'lr': 0.0003844886423087869, 'samples': 8232960, 'steps': 16079, 'batch_loss/train': 0.8258010167628527}
12/25/2021 02:18:39 - INFO - codeparrot_training - Step 16080: {'lr': 0.00038447535413178483, 'samples': 8233472, 'steps': 16080, 'batch_loss/train': 0.829025040846318}
12/25/2021 02:18:50 - INFO - codeparrot_training - Step 16081: {'lr': 0.0003844620654201596, 'samples': 8233984, 'steps': 16081, 'batch_loss/train': 0.9346692012622952}
12/25/2021 02:19:02 - INFO - codeparrot_training - Step 16082: {'lr': 0.000384448776173964, 'samples': 8234496, 'steps': 16082, 'batch_loss/train': 0.8145050131715834}
12/25/2021 02:19:12 - INFO - codeparrot_training - Step 16083: {'lr': 0.0003844354863932508, 'samples': 8235008, 'steps': 16083, 'batch_loss/train': 0.7777903089299798}
12/25/2021 02:19:23 - INFO - codeparrot_training - Step 16084: {'lr': 0.0003844221960780728, 'samples': 8235520, 'steps': 16084, 'batch_loss/train': 0.9084560540504754}
12/25/2021 02:19:35 - INFO - codeparrot_training - Step 16085: {'lr': 0.000384408905228483, 'samples': 8236032, 'steps': 16085, 'batch_loss/train': 0.7596675539389253}
12/25/2021 02:19:46 - INFO - codeparrot_training - Step 16086: {'lr': 0.00038439561384453406, 'samples': 8236544, 'steps': 16086, 'batch_loss/train': 0.8080680230632424}
12/25/2021 02:19:56 - INFO - codeparrot_training - Step 16087: {'lr': 0.000384382321926279, 'samples': 8237056, 'steps': 16087, 'batch_loss/train': 0.9012317303568125}
12/25/2021 02:20:10 - INFO - codeparrot_training - Step 16088: {'lr': 0.00038436902947377056, 'samples': 8237568, 'steps': 16088, 'batch_loss/train': 0.7921625766903162}
12/25/2021 02:20:21 - INFO - codeparrot_training - Step 16089: {'lr': 0.00038435573648706157, 'samples': 8238080, 'steps': 16089, 'batch_loss/train': 0.8104747767793015}
12/25/2021 02:20:31 - INFO - codeparrot_training - Step 16090: {'lr': 0.00038434244296620487, 'samples': 8238592, 'steps': 16090, 'batch_loss/train': 0.9291906571015716}
12/25/2021 02:20:42 - INFO - codeparrot_training - Step 16091: {'lr': 0.0003843291489112534, 'samples': 8239104, 'steps': 16091, 'batch_loss/train': 0.7777676261030138}
12/25/2021 02:20:54 - INFO - codeparrot_training - Step 16092: {'lr': 0.0003843158543222599, 'samples': 8239616, 'steps': 16092, 'batch_loss/train': 0.6837736679008231}
12/25/2021 02:21:05 - INFO - codeparrot_training - Step 16093: {'lr':
0.00038430255919927726, 'samples': 8240128, 'steps': 16093, 'batch_loss/train': 0.8517035860568285} 12/25/2021 02:21:15 - INFO - codeparrot_training - Step 16094: {'lr': 0.00038428926354235846, 'samples': 8240640, 'steps': 16094, 'batch_loss/train': 0.8731510024517775} 12/25/2021 02:21:27 - INFO - codeparrot_training - Step 16095: {'lr': 0.0003842759673515561, 'samples': 8241152, 'steps': 16095, 'batch_loss/train': 0.8405044543324038} 12/25/2021 02:21:38 - INFO - codeparrot_training - Step 16096: {'lr': 0.0003842626706269233, 'samples': 8241664, 'steps': 16096, 'batch_loss/train': 0.8264044355601072} 12/25/2021 02:21:49 - INFO - codeparrot_training - Step 16097: {'lr': 0.00038424937336851284, 'samples': 8242176, 'steps': 16097, 'batch_loss/train': 0.8894099835306406} 12/25/2021 02:22:01 - INFO - codeparrot_training - Step 16098: {'lr': 0.0003842360755763775, 'samples': 8242688, 'steps': 16098, 'batch_loss/train': 0.7954582301899791} 12/25/2021 02:22:12 - INFO - codeparrot_training - Step 16099: {'lr': 0.0003842227772505702, 'samples': 8243200, 'steps': 16099, 'batch_loss/train': 0.8231162172742188} 12/25/2021 02:22:22 - INFO - codeparrot_training - Step 16100: {'lr': 0.00038420947839114386, 'samples': 8243712, 'steps': 16100, 'batch_loss/train': 0.8821257147938013} 12/25/2021 02:22:33 - INFO - codeparrot_training - Step 16101: {'lr': 0.0003841961789981513, 'samples': 8244224, 'steps': 16101, 'batch_loss/train': 0.853860404342413} 12/25/2021 02:22:45 - INFO - codeparrot_training - Step 16102: {'lr': 0.00038418287907164545, 'samples': 8244736, 'steps': 16102, 'batch_loss/train': 0.7823797035962343} 12/25/2021 02:22:56 - INFO - codeparrot_training - Step 16103: {'lr': 0.0003841695786116791, 'samples': 8245248, 'steps': 16103, 'batch_loss/train': 0.8867076030001044} 12/25/2021 02:23:07 - INFO - codeparrot_training - Step 16104: {'lr': 0.0003841562776183051, 'samples': 8245760, 'steps': 16104, 'batch_loss/train': 0.830942084081471} 12/25/2021 02:23:19 - INFO - 
codeparrot_training - Step 16105: {'lr': 0.00038414297609157657, 'samples': 8246272, 'steps': 16105, 'batch_loss/train': 0.7915238274727017} 12/25/2021 02:23:30 - INFO - codeparrot_training - Step 16106: {'lr': 0.000384129674031546, 'samples': 8246784, 'steps': 16106, 'batch_loss/train': 0.9486546241678298} 12/25/2021 02:23:40 - INFO - codeparrot_training - Step 16107: {'lr': 0.0003841163714382666, 'samples': 8247296, 'steps': 16107, 'batch_loss/train': 0.7764795202529058} 12/25/2021 02:23:52 - INFO - codeparrot_training - Step 16108: {'lr': 0.0003841030683117912, 'samples': 8247808, 'steps': 16108, 'batch_loss/train': 0.795268782414496} 12/25/2021 02:24:03 - INFO - codeparrot_training - Step 16109: {'lr': 0.00038408976465217253, 'samples': 8248320, 'steps': 16109, 'batch_loss/train': 0.8978043971583247} 12/25/2021 02:24:14 - INFO - codeparrot_training - Step 16110: {'lr': 0.00038407646045946365, 'samples': 8248832, 'steps': 16110, 'batch_loss/train': 0.7513648503227159} 12/25/2021 02:24:24 - INFO - codeparrot_training - Step 16111: {'lr': 0.0003840631557337173, 'samples': 8249344, 'steps': 16111, 'batch_loss/train': 0.6768448045477271} 12/25/2021 02:24:36 - INFO - codeparrot_training - Step 16112: {'lr': 0.0003840498504749865, 'samples': 8249856, 'steps': 16112, 'batch_loss/train': 0.8626173892989755} 12/25/2021 02:24:47 - INFO - codeparrot_training - Step 16113: {'lr': 0.0003840365446833242, 'samples': 8250368, 'steps': 16113, 'batch_loss/train': 0.8779346612282097} 12/25/2021 02:24:58 - INFO - codeparrot_training - Step 16114: {'lr': 0.00038402323835878317, 'samples': 8250880, 'steps': 16114, 'batch_loss/train': 0.7885295065352693} 12/25/2021 02:25:10 - INFO - codeparrot_training - Step 16115: {'lr': 0.0003840099315014164, 'samples': 8251392, 'steps': 16115, 'batch_loss/train': 0.8517162532079965} 12/25/2021 02:25:20 - INFO - codeparrot_training - Step 16116: {'lr': 0.0003839966241112766, 'samples': 8251904, 'steps': 16116, 'batch_loss/train': 
0.7740895696915686} 12/25/2021 02:25:31 - INFO - codeparrot_training - Step 16117: {'lr': 0.0003839833161884169, 'samples': 8252416, 'steps': 16117, 'batch_loss/train': 0.7529205842874944} 12/25/2021 02:25:44 - INFO - codeparrot_training - Step 16118: {'lr': 0.00038397000773289015, 'samples': 8252928, 'steps': 16118, 'batch_loss/train': 0.8813658097060397} 12/25/2021 02:25:54 - INFO - codeparrot_training - Step 16119: {'lr': 0.00038395669874474915, 'samples': 8253440, 'steps': 16119, 'batch_loss/train': 0.7670102324336767} 12/25/2021 02:26:05 - INFO - codeparrot_training - Step 16120: {'lr': 0.00038394338922404696, 'samples': 8253952, 'steps': 16120, 'batch_loss/train': 0.8111936482600868} 12/25/2021 02:26:17 - INFO - codeparrot_training - Step 16121: {'lr': 0.00038393007917083644, 'samples': 8254464, 'steps': 16121, 'batch_loss/train': 0.783370922319591} 12/25/2021 02:26:28 - INFO - codeparrot_training - Step 16122: {'lr': 0.00038391676858517043, 'samples': 8254976, 'steps': 16122, 'batch_loss/train': 0.8042816944653168} 12/25/2021 02:26:39 - INFO - codeparrot_training - Step 16123: {'lr': 0.00038390345746710196, 'samples': 8255488, 'steps': 16123, 'batch_loss/train': 0.7798807388171554} 12/25/2021 02:26:49 - INFO - codeparrot_training - Step 16124: {'lr': 0.00038389014581668394, 'samples': 8256000, 'steps': 16124, 'batch_loss/train': 0.6217466522939503} 12/25/2021 02:27:01 - INFO - codeparrot_training - Step 16125: {'lr': 0.0003838768336339692, 'samples': 8256512, 'steps': 16125, 'batch_loss/train': 0.8432097621262074} 12/25/2021 02:27:12 - INFO - codeparrot_training - Step 16126: {'lr': 0.0003838635209190107, 'samples': 8257024, 'steps': 16126, 'batch_loss/train': 0.7852634442970157} 12/25/2021 02:27:22 - INFO - codeparrot_training - Step 16127: {'lr': 0.00038385020767186146, 'samples': 8257536, 'steps': 16127, 'batch_loss/train': 0.9125803597271442} 12/25/2021 02:27:35 - INFO - codeparrot_training - Step 16128: {'lr': 0.00038383689389257425, 'samples': 8258048, 
'steps': 16128, 'batch_loss/train': 0.8589435648173094} 12/25/2021 02:27:46 - INFO - codeparrot_training - Step 16129: {'lr': 0.0003838235795812022, 'samples': 8258560, 'steps': 16129, 'batch_loss/train': 0.8678293521516025} 12/25/2021 02:27:57 - INFO - codeparrot_training - Step 16130: {'lr': 0.00038381026473779804, 'samples': 8259072, 'steps': 16130, 'batch_loss/train': 0.8520138985477388} 12/25/2021 02:28:09 - INFO - codeparrot_training - Step 16131: {'lr': 0.00038379694936241474, 'samples': 8259584, 'steps': 16131, 'batch_loss/train': 0.9291203459724784} 12/25/2021 02:28:19 - INFO - codeparrot_training - Step 16132: {'lr': 0.0003837836334551054, 'samples': 8260096, 'steps': 16132, 'batch_loss/train': 0.7545004175044596} 12/25/2021 02:28:30 - INFO - codeparrot_training - Step 16133: {'lr': 0.0003837703170159228, 'samples': 8260608, 'steps': 16133, 'batch_loss/train': 0.8118537589907646} 12/25/2021 02:28:40 - INFO - codeparrot_training - Step 16134: {'lr': 0.00038375700004491994, 'samples': 8261120, 'steps': 16134, 'batch_loss/train': 0.8234759420156479} 12/25/2021 02:28:53 - INFO - codeparrot_training - Step 16135: {'lr': 0.0003837436825421496, 'samples': 8261632, 'steps': 16135, 'batch_loss/train': 0.8515349454246461} 12/25/2021 02:29:04 - INFO - codeparrot_training - Step 16136: {'lr': 0.00038373036450766514, 'samples': 8262144, 'steps': 16136, 'batch_loss/train': 0.746354472823441} 12/25/2021 02:29:14 - INFO - codeparrot_training - Step 16137: {'lr': 0.00038371704594151914, 'samples': 8262656, 'steps': 16137, 'batch_loss/train': 0.8614595294930041} 12/25/2021 02:29:27 - INFO - codeparrot_training - Step 16138: {'lr': 0.00038370372684376464, 'samples': 8263168, 'steps': 16138, 'batch_loss/train': 0.8568549691699445} 12/25/2021 02:29:37 - INFO - codeparrot_training - Step 16139: {'lr': 0.0003836904072144546, 'samples': 8263680, 'steps': 16139, 'batch_loss/train': 0.6737520538736135} 12/25/2021 02:29:48 - INFO - codeparrot_training - Step 16140: {'lr': 
0.00038367708705364204, 'samples': 8264192, 'steps': 16140, 'batch_loss/train': 0.8398873191326857} 12/25/2021 02:30:00 - INFO - codeparrot_training - Step 16141: {'lr': 0.0003836637663613798, 'samples': 8264704, 'steps': 16141, 'batch_loss/train': 0.8552912138402462} 12/25/2021 02:30:10 - INFO - codeparrot_training - Step 16142: {'lr': 0.0003836504451377209, 'samples': 8265216, 'steps': 16142, 'batch_loss/train': 0.8001596452668309} 12/25/2021 02:30:21 - INFO - codeparrot_training - Step 16143: {'lr': 0.00038363712338271825, 'samples': 8265728, 'steps': 16143, 'batch_loss/train': 0.8731149146333337} 12/25/2021 02:30:32 - INFO - codeparrot_training - Step 16144: {'lr': 0.0003836238010964249, 'samples': 8266240, 'steps': 16144, 'batch_loss/train': 0.8250919877900742} 12/25/2021 02:30:44 - INFO - codeparrot_training - Step 16145: {'lr': 0.00038361047827889375, 'samples': 8266752, 'steps': 16145, 'batch_loss/train': 0.7390557007165626} 12/25/2021 02:30:55 - INFO - codeparrot_training - Step 16146: {'lr': 0.0003835971549301779, 'samples': 8267264, 'steps': 16146, 'batch_loss/train': 0.732437239959836} 12/25/2021 02:31:05 - INFO - codeparrot_training - Step 16147: {'lr': 0.00038358383105033006, 'samples': 8267776, 'steps': 16147, 'batch_loss/train': 0.9339723063167185} 12/25/2021 02:31:18 - INFO - codeparrot_training - Step 16148: {'lr': 0.0003835705066394034, 'samples': 8268288, 'steps': 16148, 'batch_loss/train': 0.855466884560883} 12/25/2021 02:31:28 - INFO - codeparrot_training - Step 16149: {'lr': 0.00038355718169745084, 'samples': 8268800, 'steps': 16149, 'batch_loss/train': 0.9499933803454041} 12/25/2021 02:31:39 - INFO - codeparrot_training - Step 16150: {'lr': 0.0003835438562245254, 'samples': 8269312, 'steps': 16150, 'batch_loss/train': 0.9050011802464724} 12/25/2021 02:31:51 - INFO - codeparrot_training - Step 16151: {'lr': 0.00038353053022067995, 'samples': 8269824, 'steps': 16151, 'batch_loss/train': 0.8764250101521611} 12/25/2021 02:32:01 - INFO - 
codeparrot_training - Step 16152: {'lr': 0.00038351720368596755, 'samples': 8270336, 'steps': 16152, 'batch_loss/train': 0.8196594431065023} 12/25/2021 02:32:12 - INFO - codeparrot_training - Step 16153: {'lr': 0.00038350387662044116, 'samples': 8270848, 'steps': 16153, 'batch_loss/train': 0.7788875987753272} 12/25/2021 02:32:24 - INFO - codeparrot_training - Step 16154: {'lr': 0.0003834905490241537, 'samples': 8271360, 'steps': 16154, 'batch_loss/train': 0.8461709609255195} 12/25/2021 02:32:35 - INFO - codeparrot_training - Step 16155: {'lr': 0.0003834772208971584, 'samples': 8271872, 'steps': 16155, 'batch_loss/train': 1.3943219548091292} 12/25/2021 02:32:45 - INFO - codeparrot_training - Step 16156: {'lr': 0.00038346389223950793, 'samples': 8272384, 'steps': 16156, 'batch_loss/train': 0.8177695901831612} 12/25/2021 02:32:56 - INFO - codeparrot_training - Step 16157: {'lr': 0.00038345056305125544, 'samples': 8272896, 'steps': 16157, 'batch_loss/train': 0.802061190828681} 12/25/2021 02:33:09 - INFO - codeparrot_training - Step 16158: {'lr': 0.00038343723333245387, 'samples': 8273408, 'steps': 16158, 'batch_loss/train': 0.660091666970402} 12/25/2021 02:33:20 - INFO - codeparrot_training - Step 16159: {'lr': 0.00038342390308315636, 'samples': 8273920, 'steps': 16159, 'batch_loss/train': 0.7810402475297451} 12/25/2021 02:33:30 - INFO - codeparrot_training - Step 16160: {'lr': 0.0003834105723034157, 'samples': 8274432, 'steps': 16160, 'batch_loss/train': 0.842733591562137} 12/25/2021 02:33:42 - INFO - codeparrot_training - Step 16161: {'lr': 0.00038339724099328495, 'samples': 8274944, 'steps': 16161, 'batch_loss/train': 0.7191121784271672} 12/25/2021 02:33:53 - INFO - codeparrot_training - Step 16162: {'lr': 0.00038338390915281716, 'samples': 8275456, 'steps': 16162, 'batch_loss/train': 0.6876327644567937} 12/25/2021 02:34:04 - INFO - codeparrot_training - Step 16163: {'lr': 0.0003833705767820653, 'samples': 8275968, 'steps': 16163, 'batch_loss/train': 
0.8216088060289621} 12/25/2021 02:34:17 - INFO - codeparrot_training - Step 16164: {'lr': 0.0003833572438810825, 'samples': 8276480, 'steps': 16164, 'batch_loss/train': 0.7792325129266828} 12/25/2021 02:34:28 - INFO - codeparrot_training - Step 16165: {'lr': 0.0003833439104499216, 'samples': 8276992, 'steps': 16165, 'batch_loss/train': 0.833079076372087} 12/25/2021 02:34:38 - INFO - codeparrot_training - Step 16166: {'lr': 0.0003833305764886356, 'samples': 8277504, 'steps': 16166, 'batch_loss/train': 0.8044096622616053} 12/25/2021 02:34:50 - INFO - codeparrot_training - Step 16167: {'lr': 0.00038331724199727757, 'samples': 8278016, 'steps': 16167, 'batch_loss/train': 0.8251318573020399} 12/25/2021 02:35:01 - INFO - codeparrot_training - Step 16168: {'lr': 0.00038330390697590056, 'samples': 8278528, 'steps': 16168, 'batch_loss/train': 0.7515134192071855} 12/25/2021 02:35:12 - INFO - codeparrot_training - Step 16169: {'lr': 0.00038329057142455757, 'samples': 8279040, 'steps': 16169, 'batch_loss/train': 0.8703766325488687} 12/25/2021 02:35:22 - INFO - codeparrot_training - Step 16170: {'lr': 0.00038327723534330155, 'samples': 8279552, 'steps': 16170, 'batch_loss/train': 0.7304678549990058} 12/25/2021 02:35:35 - INFO - codeparrot_training - Step 16171: {'lr': 0.0003832638987321856, 'samples': 8280064, 'steps': 16171, 'batch_loss/train': 0.8240309120155871} 12/25/2021 02:35:46 - INFO - codeparrot_training - Step 16172: {'lr': 0.0003832505615912627, 'samples': 8280576, 'steps': 16172, 'batch_loss/train': 1.0469013368710876} 12/25/2021 02:35:56 - INFO - codeparrot_training - Step 16173: {'lr': 0.00038323722392058593, 'samples': 8281088, 'steps': 16173, 'batch_loss/train': 0.8479386363178492} 12/25/2021 02:36:09 - INFO - codeparrot_training - Step 16174: {'lr': 0.0003832238857202081, 'samples': 8281600, 'steps': 16174, 'batch_loss/train': 0.8659068867564201} 12/25/2021 02:36:20 - INFO - codeparrot_training - Step 16175: {'lr': 0.0003832105469901825, 'samples': 8282112, 
'steps': 16175, 'batch_loss/train': 0.9240828398615122} 12/25/2021 02:36:30 - INFO - codeparrot_training - Step 16176: {'lr': 0.0003831972077305621, 'samples': 8282624, 'steps': 16176, 'batch_loss/train': 0.866184108890593} 12/25/2021 02:36:41 - INFO - codeparrot_training - Step 16177: {'lr': 0.0003831838679413999, 'samples': 8283136, 'steps': 16177, 'batch_loss/train': 0.8343033920973539} 12/25/2021 02:36:53 - INFO - codeparrot_training - Step 16178: {'lr': 0.0003831705276227489, 'samples': 8283648, 'steps': 16178, 'batch_loss/train': 0.8500022701919079} 12/25/2021 02:37:04 - INFO - codeparrot_training - Step 16179: {'lr': 0.0003831571867746622, 'samples': 8284160, 'steps': 16179, 'batch_loss/train': 0.8047098175156862} 12/25/2021 02:37:15 - INFO - codeparrot_training - Step 16180: {'lr': 0.0003831438453971927, 'samples': 8284672, 'steps': 16180, 'batch_loss/train': 0.9802279439754784} 12/25/2021 02:37:27 - INFO - codeparrot_training - Step 16181: {'lr': 0.00038313050349039366, 'samples': 8285184, 'steps': 16181, 'batch_loss/train': 0.8271002685651183} 12/25/2021 02:37:37 - INFO - codeparrot_training - Step 16182: {'lr': 0.000383117161054318, 'samples': 8285696, 'steps': 16182, 'batch_loss/train': 0.8083352847024798} 12/25/2021 02:37:48 - INFO - codeparrot_training - Step 16183: {'lr': 0.0003831038180890187, 'samples': 8286208, 'steps': 16183, 'batch_loss/train': 0.7051412118598819} 12/25/2021 02:38:00 - INFO - codeparrot_training - Step 16184: {'lr': 0.00038309047459454893, 'samples': 8286720, 'steps': 16184, 'batch_loss/train': 0.7563889320008457} 12/25/2021 02:38:11 - INFO - codeparrot_training - Step 16185: {'lr': 0.0003830771305709616, 'samples': 8287232, 'steps': 16185, 'batch_loss/train': 0.7882638117298484} 12/25/2021 02:38:22 - INFO - codeparrot_training - Step 16186: {'lr': 0.00038306378601831, 'samples': 8287744, 'steps': 16186, 'batch_loss/train': 0.9089311957359314} 12/25/2021 02:38:32 - INFO - codeparrot_training - Step 16187: {'lr': 
0.00038305044093664685, 'samples': 8288256, 'steps': 16187, 'batch_loss/train': 0.7887330874800682} 12/25/2021 02:38:44 - INFO - codeparrot_training - Step 16188: {'lr': 0.00038303709532602553, 'samples': 8288768, 'steps': 16188, 'batch_loss/train': 0.8721259208396077} 12/25/2021 02:38:55 - INFO - codeparrot_training - Step 16189: {'lr': 0.0003830237491864989, 'samples': 8289280, 'steps': 16189, 'batch_loss/train': 0.9919768180698156} 12/25/2021 02:39:06 - INFO - codeparrot_training - Step 16190: {'lr': 0.0003830104025181201, 'samples': 8289792, 'steps': 16190, 'batch_loss/train': 0.8124116510152817} 12/25/2021 02:39:18 - INFO - codeparrot_training - Step 16191: {'lr': 0.0003829970553209421, 'samples': 8290304, 'steps': 16191, 'batch_loss/train': 0.8019555388018489} 12/25/2021 02:39:28 - INFO - codeparrot_training - Step 16192: {'lr': 0.0003829837075950181, 'samples': 8290816, 'steps': 16192, 'batch_loss/train': 0.8329824283719063} 12/25/2021 02:39:39 - INFO - codeparrot_training - Step 16193: {'lr': 0.000382970359340401, 'samples': 8291328, 'steps': 16193, 'batch_loss/train': 0.8318270607851446} 12/25/2021 02:39:52 - INFO - codeparrot_training - Step 16194: {'lr': 0.00038295701055714404, 'samples': 8291840, 'steps': 16194, 'batch_loss/train': 0.8721837925259024} 12/25/2021 02:40:02 - INFO - codeparrot_training - Step 16195: {'lr': 0.00038294366124530017, 'samples': 8292352, 'steps': 16195, 'batch_loss/train': 0.7231748431222513} 12/25/2021 02:40:13 - INFO - codeparrot_training - Step 16196: {'lr': 0.00038293031140492254, 'samples': 8292864, 'steps': 16196, 'batch_loss/train': 0.777944527566433} 12/25/2021 02:40:24 - INFO - codeparrot_training - Step 16197: {'lr': 0.0003829169610360641, 'samples': 8293376, 'steps': 16197, 'batch_loss/train': 0.8974739853292704} 12/25/2021 02:40:36 - INFO - codeparrot_training - Step 16198: {'lr': 0.0003829036101387781, 'samples': 8293888, 'steps': 16198, 'batch_loss/train': 0.8058635968482122} 12/25/2021 02:40:46 - INFO - 
codeparrot_training - Step 16199: {'lr': 0.00038289025871311743, 'samples': 8294400, 'steps': 16199, 'batch_loss/train': 0.8138997131027281} 12/25/2021 02:40:57 - INFO - codeparrot_training - Step 16200: {'lr': 0.0003828769067591353, 'samples': 8294912, 'steps': 16200, 'batch_loss/train': 0.8115601791068912} 12/25/2021 02:41:09 - INFO - codeparrot_training - Step 16201: {'lr': 0.0003828635542768849, 'samples': 8295424, 'steps': 16201, 'batch_loss/train': 0.9758891286328435} 12/25/2021 02:41:19 - INFO - codeparrot_training - Step 16202: {'lr': 0.000382850201266419, 'samples': 8295936, 'steps': 16202, 'batch_loss/train': 0.7352871701586992} 12/25/2021 02:41:30 - INFO - codeparrot_training - Step 16203: {'lr': 0.0003828368477277909, 'samples': 8296448, 'steps': 16203, 'batch_loss/train': 0.8047132124193013} 12/25/2021 02:41:44 - INFO - codeparrot_training - Step 16204: {'lr': 0.00038282349366105373, 'samples': 8296960, 'steps': 16204, 'batch_loss/train': 0.7303614737465978} 12/25/2021 02:41:54 - INFO - codeparrot_training - Step 16205: {'lr': 0.0003828101390662605, 'samples': 8297472, 'steps': 16205, 'batch_loss/train': 0.8570972718298435} 12/25/2021 02:42:05 - INFO - codeparrot_training - Step 16206: {'lr': 0.00038279678394346415, 'samples': 8297984, 'steps': 16206, 'batch_loss/train': 0.8039840771816671} 12/25/2021 02:42:15 - INFO - codeparrot_training - Step 16207: {'lr': 0.0003827834282927181, 'samples': 8298496, 'steps': 16207, 'batch_loss/train': 0.8151399600319564} 12/25/2021 02:42:28 - INFO - codeparrot_training - Step 16208: {'lr': 0.00038277007211407523, 'samples': 8299008, 'steps': 16208, 'batch_loss/train': 0.8584961043670774} 12/25/2021 02:42:39 - INFO - codeparrot_training - Step 16209: {'lr': 0.0003827567154075887, 'samples': 8299520, 'steps': 16209, 'batch_loss/train': 0.7867444818839431} 12/25/2021 02:42:49 - INFO - codeparrot_training - Step 16210: {'lr': 0.00038274335817331154, 'samples': 8300032, 'steps': 16210, 'batch_loss/train': 
0.9490944603458047} 12/25/2021 02:43:02 - INFO - codeparrot_training - Step 16211: {'lr': 0.0003827300004112969, 'samples': 8300544, 'steps': 16211, 'batch_loss/train': 1.1683026031823829} 12/25/2021 02:43:13 - INFO - codeparrot_training - Step 16212: {'lr': 0.00038271664212159793, 'samples': 8301056, 'steps': 16212, 'batch_loss/train': 0.7080003125593066} 12/25/2021 02:43:24 - INFO - codeparrot_training - Step 16213: {'lr': 0.00038270328330426766, 'samples': 8301568, 'steps': 16213, 'batch_loss/train': 0.6375029543414712} 12/25/2021 02:43:34 - INFO - codeparrot_training - Step 16214: {'lr': 0.0003826899239593593, 'samples': 8302080, 'steps': 16214, 'batch_loss/train': 0.6934953164309263} 12/25/2021 02:43:46 - INFO - codeparrot_training - Step 16215: {'lr': 0.0003826765640869259, 'samples': 8302592, 'steps': 16215, 'batch_loss/train': 0.791712237522006} 12/25/2021 02:43:57 - INFO - codeparrot_training - Step 16216: {'lr': 0.0003826632036870206, 'samples': 8303104, 'steps': 16216, 'batch_loss/train': 0.8386116125620902} 12/25/2021 02:44:07 - INFO - codeparrot_training - Step 16217: {'lr': 0.00038264984275969644, 'samples': 8303616, 'steps': 16217, 'batch_loss/train': 0.8120864052325487} 12/25/2021 02:44:20 - INFO - codeparrot_training - Step 16218: {'lr': 0.0003826364813050066, 'samples': 8304128, 'steps': 16218, 'batch_loss/train': 0.8275433601811528} 12/25/2021 02:44:31 - INFO - codeparrot_training - Step 16219: {'lr': 0.0003826231193230042, 'samples': 8304640, 'steps': 16219, 'batch_loss/train': 1.2695331145077944} 12/25/2021 02:44:41 - INFO - codeparrot_training - Step 16220: {'lr': 0.0003826097568137424, 'samples': 8305152, 'steps': 16220, 'batch_loss/train': 0.8420792364049703} 12/25/2021 02:44:52 - INFO - codeparrot_training - Step 16221: {'lr': 0.00038259639377727416, 'samples': 8305664, 'steps': 16221, 'batch_loss/train': 0.7659246348775923} 12/25/2021 02:45:05 - INFO - codeparrot_training - Step 16222: {'lr': 0.0003825830302136528, 'samples': 8306176, 
'steps': 16222, 'batch_loss/train': 0.8562729209661484} 12/25/2021 02:45:15 - INFO - codeparrot_training - Step 16223: {'lr': 0.0003825696661229314, 'samples': 8306688, 'steps': 16223, 'batch_loss/train': 0.7829266441985965} 12/25/2021 02:45:26 - INFO - codeparrot_training - Step 16224: {'lr': 0.00038255630150516305, 'samples': 8307200, 'steps': 16224, 'batch_loss/train': 0.9104764088988304} 12/25/2021 02:45:38 - INFO - codeparrot_training - Step 16225: {'lr': 0.0003825429363604009, 'samples': 8307712, 'steps': 16225, 'batch_loss/train': 0.6452954738633707} 12/25/2021 02:45:49 - INFO - codeparrot_training - Step 16226: {'lr': 0.00038252957068869806, 'samples': 8308224, 'steps': 16226, 'batch_loss/train': 0.7890948755666614} 12/25/2021 02:45:59 - INFO - codeparrot_training - Step 16227: {'lr': 0.0003825162044901077, 'samples': 8308736, 'steps': 16227, 'batch_loss/train': 0.7174768336117268} 12/25/2021 02:46:11 - INFO - codeparrot_training - Step 16228: {'lr': 0.00038250283776468293, 'samples': 8309248, 'steps': 16228, 'batch_loss/train': 0.6789648090489209} 12/25/2021 02:46:22 - INFO - codeparrot_training - Step 16229: {'lr': 0.000382489470512477, 'samples': 8309760, 'steps': 16229, 'batch_loss/train': 0.7255231295712292} 12/25/2021 02:46:33 - INFO - codeparrot_training - Step 16230: {'lr': 0.0003824761027335428, 'samples': 8310272, 'steps': 16230, 'batch_loss/train': 0.9285817821510136} 12/25/2021 02:46:43 - INFO - codeparrot_training - Step 16231: {'lr': 0.0003824627344279338, 'samples': 8310784, 'steps': 16231, 'batch_loss/train': 0.8836229434236884} 12/25/2021 02:46:55 - INFO - codeparrot_training - Step 16232: {'lr': 0.0003824493655957028, 'samples': 8311296, 'steps': 16232, 'batch_loss/train': 0.8141471054404974} 12/25/2021 02:47:06 - INFO - codeparrot_training - Step 16233: {'lr': 0.00038243599623690323, 'samples': 8311808, 'steps': 16233, 'batch_loss/train': 0.8879043301567435} 12/25/2021 02:47:16 - INFO - codeparrot_training - Step 16234: {'lr': 
0.0003824226263515882, 'samples': 8312320, 'steps': 16234, 'batch_loss/train': 0.7990043358877301} 12/25/2021 02:47:29 - INFO - codeparrot_training - Step 16235: {'lr': 0.0003824092559398107, 'samples': 8312832, 'steps': 16235, 'batch_loss/train': 0.9992855554446578} 12/25/2021 02:47:39 - INFO - codeparrot_training - Step 16236: {'lr': 0.000382395885001624, 'samples': 8313344, 'steps': 16236, 'batch_loss/train': 0.6943497508764267} 12/25/2021 02:47:50 - INFO - codeparrot_training - Step 16237: {'lr': 0.00038238251353708136, 'samples': 8313856, 'steps': 16237, 'batch_loss/train': 0.818963487399742} 12/25/2021 02:48:02 - INFO - codeparrot_training - Step 16238: {'lr': 0.0003823691415462358, 'samples': 8314368, 'steps': 16238, 'batch_loss/train': 0.8258743102196604} 12/25/2021 02:48:13 - INFO - codeparrot_training - Step 16239: {'lr': 0.00038235576902914047, 'samples': 8314880, 'steps': 16239, 'batch_loss/train': 0.9781196531839669} 12/25/2021 02:48:23 - INFO - codeparrot_training - Step 16240: {'lr': 0.00038234239598584864, 'samples': 8315392, 'steps': 16240, 'batch_loss/train': 0.777174765476957} 12/25/2021 02:48:34 - INFO - codeparrot_training - Step 16241: {'lr': 0.00038232902241641333, 'samples': 8315904, 'steps': 16241, 'batch_loss/train': 0.8433862077072263} 12/25/2021 02:48:48 - INFO - codeparrot_training - Step 16242: {'lr': 0.00038231564832088785, 'samples': 8316416, 'steps': 16242, 'batch_loss/train': 0.7381761316210032} 12/25/2021 02:48:59 - INFO - codeparrot_training - Step 16243: {'lr': 0.00038230227369932525, 'samples': 8316928, 'steps': 16243, 'batch_loss/train': 0.8026910526677966} 12/25/2021 02:49:10 - INFO - codeparrot_training - Step 16244: {'lr': 0.0003822888985517788, 'samples': 8317440, 'steps': 16244, 'batch_loss/train': 0.8288120809011161} 12/25/2021 02:49:22 - INFO - codeparrot_training - Step 16245: {'lr': 0.0003822755228783017, 'samples': 8317952, 'steps': 16245, 'batch_loss/train': 0.7907813740894198} 12/25/2021 02:49:32 - INFO - 
codeparrot_training - Step 16246: {'lr': 0.000382262146678947, 'samples': 8318464, 'steps': 16246, 'batch_loss/train': 0.7615100733237341} 12/25/2021 02:49:43 - INFO - codeparrot_training - Step 16247: {'lr': 0.000382248769953768, 'samples': 8318976, 'steps': 16247, 'batch_loss/train': 0.8002092554233968} 12/25/2021 02:49:57 - INFO - codeparrot_training - Step 16248: {'lr': 0.0003822353927028177, 'samples': 8319488, 'steps': 16248, 'batch_loss/train': 0.8524952782317996} 12/25/2021 02:50:08 - INFO - codeparrot_training - Step 16249: {'lr': 0.00038222201492614955, 'samples': 8320000, 'steps': 16249, 'batch_loss/train': 0.8053049091249704} 12/25/2021 02:50:18 - INFO - codeparrot_training - Step 16250: {'lr': 0.0003822086366238165, 'samples': 8320512, 'steps': 16250, 'batch_loss/train': 0.7493419107049704} 12/25/2021 02:50:30 - INFO - codeparrot_training - Step 16251: {'lr': 0.00038219525779587194, 'samples': 8321024, 'steps': 16251, 'batch_loss/train': 0.7699570660479367} 12/25/2021 02:50:41 - INFO - codeparrot_training - Step 16252: {'lr': 0.00038218187844236886, 'samples': 8321536, 'steps': 16252, 'batch_loss/train': 0.7990143317729235} 12/25/2021 02:50:52 - INFO - codeparrot_training - Step 16253: {'lr': 0.0003821684985633606, 'samples': 8322048, 'steps': 16253, 'batch_loss/train': 0.8389142705127597} 12/25/2021 02:51:02 - INFO - codeparrot_training - Step 16254: {'lr': 0.00038215511815890027, 'samples': 8322560, 'steps': 16254, 'batch_loss/train': 1.1116315193939954} 12/25/2021 02:51:14 - INFO - codeparrot_training - Step 16255: {'lr': 0.000382141737229041, 'samples': 8323072, 'steps': 16255, 'batch_loss/train': 0.7991490121930838} 12/25/2021 02:51:25 - INFO - codeparrot_training - Step 16256: {'lr': 0.0003821283557738362, 'samples': 8323584, 'steps': 16256, 'batch_loss/train': 0.661286910995841} 12/25/2021 02:51:36 - INFO - codeparrot_training - Step 16257: {'lr': 0.00038211497379333885, 'samples': 8324096, 'steps': 16257, 'batch_loss/train': 1.0016185706481338} 
12/25/2021 02:51:49 - INFO - codeparrot_training - Step 16258: {'lr': 0.0003821015912876023, 'samples': 8324608, 'steps': 16258, 'batch_loss/train': 0.7847919738851488} 12/25/2021 02:52:00 - INFO - codeparrot_training - Step 16259: {'lr': 0.0003820882082566797, 'samples': 8325120, 'steps': 16259, 'batch_loss/train': 0.7772012120112777} 12/25/2021 02:52:11 - INFO - codeparrot_training - Step 16260: {'lr': 0.00038207482470062435, 'samples': 8325632, 'steps': 16260, 'batch_loss/train': 0.6735875172307715} 12/25/2021 02:52:21 - INFO - codeparrot_training - Step 16261: {'lr': 0.00038206144061948925, 'samples': 8326144, 'steps': 16261, 'batch_loss/train': 0.847744164057076} 12/25/2021 02:52:33 - INFO - codeparrot_training - Step 16262: {'lr': 0.0003820480560133277, 'samples': 8326656, 'steps': 16262, 'batch_loss/train': 0.8696646196767688} 12/25/2021 02:52:44 - INFO - codeparrot_training - Step 16263: {'lr': 0.000382034670882193, 'samples': 8327168, 'steps': 16263, 'batch_loss/train': 0.9551706224447116} 12/25/2021 02:52:55 - INFO - codeparrot_training - Step 16264: {'lr': 0.0003820212852261383, 'samples': 8327680, 'steps': 16264, 'batch_loss/train': 0.8641198491677642} 12/25/2021 02:53:08 - INFO - codeparrot_training - Step 16265: {'lr': 0.0003820078990452168, 'samples': 8328192, 'steps': 16265, 'batch_loss/train': 0.8823937671259046} 12/25/2021 02:53:19 - INFO - codeparrot_training - Step 16266: {'lr': 0.0003819945123394818, 'samples': 8328704, 'steps': 16266, 'batch_loss/train': 0.637250320520252} 12/25/2021 02:53:30 - INFO - codeparrot_training - Step 16267: {'lr': 0.00038198112510898636, 'samples': 8329216, 'steps': 16267, 'batch_loss/train': 0.8441781578585505} 12/25/2021 02:53:42 - INFO - codeparrot_training - Step 16268: {'lr': 0.00038196773735378386, 'samples': 8329728, 'steps': 16268, 'batch_loss/train': 0.6463514685165137} 12/25/2021 02:53:52 - INFO - codeparrot_training - Step 16269: {'lr': 0.0003819543490739275, 'samples': 8330240, 'steps': 16269, 
'batch_loss/train': 0.7859427556395531} 12/25/2021 02:54:03 - INFO - codeparrot_training - Step 16270: {'lr': 0.00038194096026947043, 'samples': 8330752, 'steps': 16270, 'batch_loss/train': 0.8431342211551964} 12/25/2021 02:54:14 - INFO - codeparrot_training - Step 16271: {'lr': 0.0003819275709404659, 'samples': 8331264, 'steps': 16271, 'batch_loss/train': 0.8546801628544927} 12/25/2021 02:54:26 - INFO - codeparrot_training - Step 16272: {'lr': 0.00038191418108696716, 'samples': 8331776, 'steps': 16272, 'batch_loss/train': 0.8469636179506779} 12/25/2021 02:54:36 - INFO - codeparrot_training - Step 16273: {'lr': 0.0003819007907090275, 'samples': 8332288, 'steps': 16273, 'batch_loss/train': 0.7885841981042176} 12/25/2021 02:54:47 - INFO - codeparrot_training - Step 16274: {'lr': 0.0003818873998067001, 'samples': 8332800, 'steps': 16274, 'batch_loss/train': 0.7810239517129958} 12/25/2021 02:55:01 - INFO - codeparrot_training - Step 16275: {'lr': 0.0003818740083800382, 'samples': 8333312, 'steps': 16275, 'batch_loss/train': 0.9006401170045137} 12/25/2021 02:55:11 - INFO - codeparrot_training - Step 16276: {'lr': 0.0003818606164290949, 'samples': 8333824, 'steps': 16276, 'batch_loss/train': 0.8330425946041942} 12/25/2021 02:55:22 - INFO - codeparrot_training - Step 16277: {'lr': 0.0003818472239539237, 'samples': 8334336, 'steps': 16277, 'batch_loss/train': 0.8336154632270336} 12/25/2021 02:55:34 - INFO - codeparrot_training - Step 16278: {'lr': 0.0003818338309545777, 'samples': 8334848, 'steps': 16278, 'batch_loss/train': 0.8230243781581521} 12/25/2021 02:55:45 - INFO - codeparrot_training - Step 16279: {'lr': 0.00038182043743111016, 'samples': 8335360, 'steps': 16279, 'batch_loss/train': 0.8242060896009207} 12/25/2021 02:55:55 - INFO - codeparrot_training - Step 16280: {'lr': 0.0003818070433835743, 'samples': 8335872, 'steps': 16280, 'batch_loss/train': 0.759116203757003} 12/25/2021 02:56:07 - INFO - codeparrot_training - Step 16281: {'lr': 0.0003817936488120235, 
'samples': 8336384, 'steps': 16281, 'batch_loss/train': 0.8300940189510584} 12/25/2021 02:56:18 - INFO - codeparrot_training - Step 16282: {'lr': 0.0003817802537165109, 'samples': 8336896, 'steps': 16282, 'batch_loss/train': 0.799283092841506} 12/25/2021 02:56:28 - INFO - codeparrot_training - Step 16283: {'lr': 0.00038176685809708966, 'samples': 8337408, 'steps': 16283, 'batch_loss/train': 0.8662338946014643} 12/25/2021 02:56:39 - INFO - codeparrot_training - Step 16284: {'lr': 0.00038175346195381323, 'samples': 8337920, 'steps': 16284, 'batch_loss/train': 0.9057999240467325} 12/25/2021 02:56:52 - INFO - codeparrot_training - Step 16285: {'lr': 0.0003817400652867348, 'samples': 8338432, 'steps': 16285, 'batch_loss/train': 1.4642799436114728} 12/25/2021 02:57:02 - INFO - codeparrot_training - Step 16286: {'lr': 0.00038172666809590763, 'samples': 8338944, 'steps': 16286, 'batch_loss/train': 1.2755515538156033} 12/25/2021 02:57:13 - INFO - codeparrot_training - Step 16287: {'lr': 0.0003817132703813848, 'samples': 8339456, 'steps': 16287, 'batch_loss/train': 0.9616307532414794} 12/25/2021 02:57:27 - INFO - codeparrot_training - Step 16288: {'lr': 0.0003816998721432199, 'samples': 8339968, 'steps': 16288, 'batch_loss/train': 0.8111745901405811} 12/25/2021 02:57:37 - INFO - codeparrot_training - Step 16289: {'lr': 0.00038168647338146603, 'samples': 8340480, 'steps': 16289, 'batch_loss/train': 0.7330580805428326} 12/25/2021 02:57:48 - INFO - codeparrot_training - Step 16290: {'lr': 0.0003816730740961764, 'samples': 8340992, 'steps': 16290, 'batch_loss/train': 0.781814088113606} 12/25/2021 02:57:59 - INFO - codeparrot_training - Step 16291: {'lr': 0.00038165967428740435, 'samples': 8341504, 'steps': 16291, 'batch_loss/train': 0.8783562872558832} 12/25/2021 02:58:11 - INFO - codeparrot_training - Step 16292: {'lr': 0.0003816462739552032, 'samples': 8342016, 'steps': 16292, 'batch_loss/train': 0.7450939053669572} 12/25/2021 02:58:21 - INFO - codeparrot_training - Step 
16293: {'lr': 0.00038163287309962615, 'samples': 8342528, 'steps': 16293, 'batch_loss/train': 0.9064376140013337} 12/25/2021 02:58:32 - INFO - codeparrot_training - Step 16294: {'lr': 0.00038161947172072653, 'samples': 8343040, 'steps': 16294, 'batch_loss/train': 0.826167767867446} 12/25/2021 02:58:46 - INFO - codeparrot_training - Step 16295: {'lr': 0.0003816060698185575, 'samples': 8343552, 'steps': 16295, 'batch_loss/train': 1.0067956410348415} 12/25/2021 02:58:57 - INFO - codeparrot_training - Step 16296: {'lr': 0.0003815926673931725, 'samples': 8344064, 'steps': 16296, 'batch_loss/train': 0.8287309063598514} 12/25/2021 02:59:07 - INFO - codeparrot_training - Step 16297: {'lr': 0.0003815792644446247, 'samples': 8344576, 'steps': 16297, 'batch_loss/train': 0.7022109534591436} 12/25/2021 02:59:19 - INFO - codeparrot_training - Step 16298: {'lr': 0.0003815658609729674, 'samples': 8345088, 'steps': 16298, 'batch_loss/train': 0.7975855451077223} 12/25/2021 02:59:30 - INFO - codeparrot_training - Step 16299: {'lr': 0.000381552456978254, 'samples': 8345600, 'steps': 16299, 'batch_loss/train': 0.8270023120567203} 12/25/2021 02:59:40 - INFO - codeparrot_training - Step 16300: {'lr': 0.00038153905246053767, 'samples': 8346112, 'steps': 16300, 'batch_loss/train': 0.7898675173055381} 12/25/2021 02:59:52 - INFO - codeparrot_training - Step 16301: {'lr': 0.0003815256474198718, 'samples': 8346624, 'steps': 16301, 'batch_loss/train': 0.8794332067482173} 12/25/2021 03:00:03 - INFO - codeparrot_training - Step 16302: {'lr': 0.00038151224185630953, 'samples': 8347136, 'steps': 16302, 'batch_loss/train': 0.7721956996247172} 12/25/2021 03:00:14 - INFO - codeparrot_training - Step 16303: {'lr': 0.00038149883576990426, 'samples': 8347648, 'steps': 16303, 'batch_loss/train': 0.8384956512600183} 12/25/2021 03:00:24 - INFO - codeparrot_training - Step 16304: {'lr': 0.0003814854291607093, 'samples': 8348160, 'steps': 16304, 'batch_loss/train': 0.6014195396564901} 12/25/2021 03:00:38 - 
INFO - codeparrot_training - Step 16305: {'lr': 0.00038147202202877783, 'samples': 8348672, 'steps': 16305, 'batch_loss/train': 0.7817904027178884} 12/25/2021 03:00:49 - INFO - codeparrot_training - Step 16306: {'lr': 0.0003814586143741633, 'samples': 8349184, 'steps': 16306, 'batch_loss/train': 0.7968473830260336} 12/25/2021 03:00:59 - INFO - codeparrot_training - Step 16307: {'lr': 0.0003814452061969189, 'samples': 8349696, 'steps': 16307, 'batch_loss/train': 0.8895552940666676} 12/25/2021 03:01:11 - INFO - codeparrot_training - Step 16308: {'lr': 0.0003814317974970981, 'samples': 8350208, 'steps': 16308, 'batch_loss/train': 0.8768477402627468} 12/25/2021 03:01:22 - INFO - codeparrot_training - Step 16309: {'lr': 0.00038141838827475405, 'samples': 8350720, 'steps': 16309, 'batch_loss/train': 0.7282009613700211} 12/25/2021 03:01:33 - INFO - codeparrot_training - Step 16310: {'lr': 0.00038140497852994014, 'samples': 8351232, 'steps': 16310, 'batch_loss/train': 0.8366803848184645} 12/25/2021 03:01:45 - INFO - codeparrot_training - Step 16311: {'lr': 0.0003813915682627096, 'samples': 8351744, 'steps': 16311, 'batch_loss/train': 0.7656125624198467} 12/25/2021 03:01:55 - INFO - codeparrot_training - Step 16312: {'lr': 0.0003813781574731159, 'samples': 8352256, 'steps': 16312, 'batch_loss/train': 0.8573290696367621} 12/25/2021 03:02:06 - INFO - codeparrot_training - Step 16313: {'lr': 0.00038136474616121206, 'samples': 8352768, 'steps': 16313, 'batch_loss/train': 0.8767125280573964} 12/25/2021 03:02:16 - INFO - codeparrot_training - Step 16314: {'lr': 0.0003813513343270518, 'samples': 8353280, 'steps': 16314, 'batch_loss/train': 0.9904765579849482} 12/25/2021 03:02:34 - INFO - codeparrot_training - Step 16315: {'lr': 0.00038133792197068815, 'samples': 8353792, 'steps': 16315, 'batch_loss/train': 0.7749294840032235} 12/25/2021 03:02:44 - INFO - codeparrot_training - Step 16316: {'lr': 0.0003813245090921745, 'samples': 8354304, 'steps': 16316, 'batch_loss/train': 
0.9286512201651931} 12/25/2021 03:02:55 - INFO - codeparrot_training - Step 16317: {'lr': 0.00038131109569156416, 'samples': 8354816, 'steps': 16317, 'batch_loss/train': 0.7767377435229719} 12/25/2021 03:03:07 - INFO - codeparrot_training - Step 16318: {'lr': 0.00038129768176891054, 'samples': 8355328, 'steps': 16318, 'batch_loss/train': 0.8014679723419249} 12/25/2021 03:03:18 - INFO - codeparrot_training - Step 16319: {'lr': 0.00038128426732426685, 'samples': 8355840, 'steps': 16319, 'batch_loss/train': 0.76744638197124} 12/25/2021 03:03:28 - INFO - codeparrot_training - Step 16320: {'lr': 0.0003812708523576865, 'samples': 8356352, 'steps': 16320, 'batch_loss/train': 0.7214752165600657} 12/25/2021 03:03:40 - INFO - codeparrot_training - Step 16321: {'lr': 0.0003812574368692229, 'samples': 8356864, 'steps': 16321, 'batch_loss/train': 0.7971192426048219} 12/25/2021 03:03:51 - INFO - codeparrot_training - Step 16322: {'lr': 0.00038124402085892915, 'samples': 8357376, 'steps': 16322, 'batch_loss/train': 0.7164738974533975} 12/25/2021 03:04:01 - INFO - codeparrot_training - Step 16323: {'lr': 0.0003812306043268589, 'samples': 8357888, 'steps': 16323, 'batch_loss/train': 0.7719359621405602} 12/25/2021 03:04:12 - INFO - codeparrot_training - Step 16324: {'lr': 0.00038121718727306516, 'samples': 8358400, 'steps': 16324, 'batch_loss/train': 0.7943692915141582} 12/25/2021 03:04:26 - INFO - codeparrot_training - Step 16325: {'lr': 0.00038120376969760143, 'samples': 8358912, 'steps': 16325, 'batch_loss/train': 0.7947669941931963} 12/25/2021 03:04:36 - INFO - codeparrot_training - Step 16326: {'lr': 0.0003811903516005211, 'samples': 8359424, 'steps': 16326, 'batch_loss/train': 0.9899771269410849} 12/25/2021 03:04:47 - INFO - codeparrot_training - Step 16327: {'lr': 0.0003811769329818775, 'samples': 8359936, 'steps': 16327, 'batch_loss/train': 0.7655866574496031} 12/25/2021 03:04:59 - INFO - codeparrot_training - Step 16328: {'lr': 0.000381163513841724, 'samples': 8360448, 
'steps': 16328, 'batch_loss/train': 0.8454933380708098} 12/25/2021 03:05:10 - INFO - codeparrot_training - Step 16329: {'lr': 0.00038115009418011376, 'samples': 8360960, 'steps': 16329, 'batch_loss/train': 1.033647095784545} 12/25/2021 03:05:20 - INFO - codeparrot_training - Step 16330: {'lr': 0.0003811366739971003, 'samples': 8361472, 'steps': 16330, 'batch_loss/train': 0.7833782411180437} 12/25/2021 03:05:33 - INFO - codeparrot_training - Step 16331: {'lr': 0.000381123253292737, 'samples': 8361984, 'steps': 16331, 'batch_loss/train': 0.824759311042726} 12/25/2021 03:05:43 - INFO - codeparrot_training - Step 16332: {'lr': 0.00038110983206707705, 'samples': 8362496, 'steps': 16332, 'batch_loss/train': 0.7552850181236863} 12/25/2021 03:05:54 - INFO - codeparrot_training - Step 16333: {'lr': 0.000381096410320174, 'samples': 8363008, 'steps': 16333, 'batch_loss/train': 0.6612557023763657} 12/25/2021 03:06:04 - INFO - codeparrot_training - Step 16334: {'lr': 0.00038108298805208116, 'samples': 8363520, 'steps': 16334, 'batch_loss/train': 0.8151645949110389} 12/25/2021 03:06:18 - INFO - codeparrot_training - Step 16335: {'lr': 0.00038106956526285176, 'samples': 8364032, 'steps': 16335, 'batch_loss/train': 0.8989729033783078} 12/25/2021 03:06:29 - INFO - codeparrot_training - Step 16336: {'lr': 0.0003810561419525393, 'samples': 8364544, 'steps': 16336, 'batch_loss/train': 1.3498526024632156} 12/25/2021 03:06:40 - INFO - codeparrot_training - Step 16337: {'lr': 0.00038104271812119706, 'samples': 8365056, 'steps': 16337, 'batch_loss/train': 0.7144277915358543} 12/25/2021 03:06:52 - INFO - codeparrot_training - Step 16338: {'lr': 0.00038102929376887854, 'samples': 8365568, 'steps': 16338, 'batch_loss/train': 0.7862965976819396} 12/25/2021 03:07:02 - INFO - codeparrot_training - Step 16339: {'lr': 0.00038101586889563693, 'samples': 8366080, 'steps': 16339, 'batch_loss/train': 0.8331911994609982} 12/25/2021 03:07:13 - INFO - codeparrot_training - Step 16340: {'lr': 
0.0003810024435015258, 'samples': 8366592, 'steps': 16340, 'batch_loss/train': 0.810480953194201} 12/25/2021 03:07:25 - INFO - codeparrot_training - Step 16341: {'lr': 0.00038098901758659834, 'samples': 8367104, 'steps': 16341, 'batch_loss/train': 0.8134991200640798} 12/25/2021 03:07:36 - INFO - codeparrot_training - Step 16342: {'lr': 0.00038097559115090804, 'samples': 8367616, 'steps': 16342, 'batch_loss/train': 0.8433480272069573} 12/25/2021 03:07:46 - INFO - codeparrot_training - Step 16343: {'lr': 0.0003809621641945082, 'samples': 8368128, 'steps': 16343, 'batch_loss/train': 0.7810773374512792} 12/25/2021 03:07:57 - INFO - codeparrot_training - Step 16344: {'lr': 0.0003809487367174522, 'samples': 8368640, 'steps': 16344, 'batch_loss/train': 0.7124492945149541} 12/25/2021 03:08:11 - INFO - codeparrot_training - Step 16345: {'lr': 0.0003809353087197936, 'samples': 8369152, 'steps': 16345, 'batch_loss/train': 0.8122470653615892} 12/25/2021 03:08:21 - INFO - codeparrot_training - Step 16346: {'lr': 0.0003809218802015856, 'samples': 8369664, 'steps': 16346, 'batch_loss/train': 0.8174110765103251} 12/25/2021 03:08:32 - INFO - codeparrot_training - Step 16347: {'lr': 0.00038090845116288164, 'samples': 8370176, 'steps': 16347, 'batch_loss/train': 1.6463056881912053} 12/25/2021 03:08:44 - INFO - codeparrot_training - Step 16348: {'lr': 0.00038089502160373504, 'samples': 8370688, 'steps': 16348, 'batch_loss/train': 0.7560862839454785} 12/25/2021 03:08:55 - INFO - codeparrot_training - Step 16349: {'lr': 0.0003808815915241993, 'samples': 8371200, 'steps': 16349, 'batch_loss/train': 0.8990575261414051} 12/25/2021 03:09:05 - INFO - codeparrot_training - Step 16350: {'lr': 0.00038086816092432787, 'samples': 8371712, 'steps': 16350, 'batch_loss/train': 0.7955160373821855} 12/25/2021 03:09:19 - INFO - codeparrot_training - Step 16351: {'lr': 0.00038085472980417393, 'samples': 8372224, 'steps': 16351, 'batch_loss/train': 0.8065568981692195} 12/25/2021 03:09:30 - INFO - 
codeparrot_training - Step 16352: {'lr': 0.000380841298163791, 'samples': 8372736, 'steps': 16352, 'batch_loss/train': 0.8769071912392974} 12/25/2021 03:09:40 - INFO - codeparrot_training - Step 16353: {'lr': 0.00038082786600323254, 'samples': 8373248, 'steps': 16353, 'batch_loss/train': 0.7477641692385077} 12/25/2021 03:09:51 - INFO - codeparrot_training - Step 16354: {'lr': 0.0003808144333225518, 'samples': 8373760, 'steps': 16354, 'batch_loss/train': 0.7459174729883671} 12/25/2021 03:10:03 - INFO - codeparrot_training - Step 16355: {'lr': 0.0003808010001218023, 'samples': 8374272, 'steps': 16355, 'batch_loss/train': 0.8904078425839543} 12/25/2021 03:10:13 - INFO - codeparrot_training - Step 16356: {'lr': 0.00038078756640103745, 'samples': 8374784, 'steps': 16356, 'batch_loss/train': 0.9255089508369565} 12/25/2021 03:10:24 - INFO - codeparrot_training - Step 16357: {'lr': 0.00038077413216031054, 'samples': 8375296, 'steps': 16357, 'batch_loss/train': 0.7476654928177595} 12/25/2021 03:10:36 - INFO - codeparrot_training - Step 16358: {'lr': 0.0003807606973996751, 'samples': 8375808, 'steps': 16358, 'batch_loss/train': 0.8408073158934712} 12/25/2021 03:10:47 - INFO - codeparrot_training - Step 16359: {'lr': 0.00038074726211918456, 'samples': 8376320, 'steps': 16359, 'batch_loss/train': 0.8173728152178228} 12/25/2021 03:10:57 - INFO - codeparrot_training - Step 16360: {'lr': 0.00038073382631889223, 'samples': 8376832, 'steps': 16360, 'batch_loss/train': 0.7536002132110298} 12/25/2021 03:11:11 - INFO - codeparrot_training - Step 16361: {'lr': 0.0003807203899988515, 'samples': 8377344, 'steps': 16361, 'batch_loss/train': 0.8604493821039796} 12/25/2021 03:11:22 - INFO - codeparrot_training - Step 16362: {'lr': 0.00038070695315911595, 'samples': 8377856, 'steps': 16362, 'batch_loss/train': 0.7734404196962714} 12/25/2021 03:11:32 - INFO - codeparrot_training - Step 16363: {'lr': 0.0003806935157997389, 'samples': 8378368, 'steps': 16363, 'batch_loss/train': 
0.8833962343633175} 12/25/2021 03:11:44 - INFO - codeparrot_training - Step 16364: {'lr': 0.0003806800779207737, 'samples': 8378880, 'steps': 16364, 'batch_loss/train': 0.9135098652914166} 12/25/2021 03:11:55 - INFO - codeparrot_training - Step 16365: {'lr': 0.000380666639522274, 'samples': 8379392, 'steps': 16365, 'batch_loss/train': 0.6967317988164723} 12/25/2021 03:12:05 - INFO - codeparrot_training - Step 16366: {'lr': 0.000380653200604293, 'samples': 8379904, 'steps': 16366, 'batch_loss/train': 0.8212092723697424} 12/25/2021 03:12:16 - INFO - codeparrot_training - Step 16367: {'lr': 0.00038063976116688413, 'samples': 8380416, 'steps': 16367, 'batch_loss/train': 0.8110054866410792} 12/25/2021 03:12:28 - INFO - codeparrot_training - Step 16368: {'lr': 0.000380626321210101, 'samples': 8380928, 'steps': 16368, 'batch_loss/train': 0.8403459768742323} 12/25/2021 03:12:39 - INFO - codeparrot_training - Step 16369: {'lr': 0.0003806128807339969, 'samples': 8381440, 'steps': 16369, 'batch_loss/train': 0.8076573088765144} 12/25/2021 03:12:49 - INFO - codeparrot_training - Step 16370: {'lr': 0.0003805994397386253, 'samples': 8381952, 'steps': 16370, 'batch_loss/train': 0.7928742929361761} 12/25/2021 03:13:01 - INFO - codeparrot_training - Step 16371: {'lr': 0.00038058599822403963, 'samples': 8382464, 'steps': 16371, 'batch_loss/train': 0.8355450322851539} 12/25/2021 03:13:12 - INFO - codeparrot_training - Step 16372: {'lr': 0.00038057255619029334, 'samples': 8382976, 'steps': 16372, 'batch_loss/train': 0.7147802908439189} 12/25/2021 03:13:23 - INFO - codeparrot_training - Step 16373: {'lr': 0.0003805591136374399, 'samples': 8383488, 'steps': 16373, 'batch_loss/train': 0.8613752662204206} 12/25/2021 03:13:33 - INFO - codeparrot_training - Step 16374: {'lr': 0.00038054567056553266, 'samples': 8384000, 'steps': 16374, 'batch_loss/train': 0.7096129877027124} 12/25/2021 03:13:48 - INFO - codeparrot_training - Step 16375: {'lr': 0.0003805322269746252, 'samples': 8384512, 
'steps': 16375, 'batch_loss/train': 0.8322852025739849} 12/25/2021 03:13:58 - INFO - codeparrot_training - Step 16376: {'lr': 0.0003805187828647709, 'samples': 8385024, 'steps': 16376, 'batch_loss/train': 0.8532070713117719} 12/25/2021 03:14:09 - INFO - codeparrot_training - Step 16377: {'lr': 0.0003805053382360231, 'samples': 8385536, 'steps': 16377, 'batch_loss/train': 0.7727809818461537} 12/25/2021 03:14:21 - INFO - codeparrot_training - Step 16378: {'lr': 0.00038049189308843545, 'samples': 8386048, 'steps': 16378, 'batch_loss/train': 0.6201814038213342} 12/25/2021 03:14:32 - INFO - codeparrot_training - Step 16379: {'lr': 0.00038047844742206127, 'samples': 8386560, 'steps': 16379, 'batch_loss/train': 0.7912314990535378} 12/25/2021 03:14:42 - INFO - codeparrot_training - Step 16380: {'lr': 0.00038046500123695403, 'samples': 8387072, 'steps': 16380, 'batch_loss/train': 0.7682696841657162} 12/25/2021 03:14:56 - INFO - codeparrot_training - Step 16381: {'lr': 0.00038045155453316724, 'samples': 8387584, 'steps': 16381, 'batch_loss/train': 0.7935175374150276} 12/25/2021 03:15:07 - INFO - codeparrot_training - Step 16382: {'lr': 0.00038043810731075434, 'samples': 8388096, 'steps': 16382, 'batch_loss/train': 0.8303677807562053} 12/25/2021 03:15:17 - INFO - codeparrot_training - Step 16383: {'lr': 0.00038042465956976873, 'samples': 8388608, 'steps': 16383, 'batch_loss/train': 0.8093399005010724} 12/25/2021 03:15:28 - INFO - codeparrot_training - Step 16384: {'lr': 0.0003804112113102639, 'samples': 8389120, 'steps': 16384, 'batch_loss/train': 0.8039600357878953} 12/25/2021 03:15:40 - INFO - codeparrot_training - Step 16385: {'lr': 0.0003803977625322934, 'samples': 8389632, 'steps': 16385, 'batch_loss/train': 0.8676012698560953} 12/25/2021 03:15:50 - INFO - codeparrot_training - Step 16386: {'lr': 0.0003803843132359107, 'samples': 8390144, 'steps': 16386, 'batch_loss/train': 0.7880402319133282} 12/25/2021 03:16:01 - INFO - codeparrot_training - Step 16387: {'lr': 
0.000380370863421169, 'samples': 8390656, 'steps': 16387, 'batch_loss/train': 0.82987698353827} 12/25/2021 03:16:14 - INFO - codeparrot_training - Step 16388: {'lr': 0.0003803574130881221, 'samples': 8391168, 'steps': 16388, 'batch_loss/train': 0.9423878313973546} 12/25/2021 03:16:24 - INFO - codeparrot_training - Step 16389: {'lr': 0.0003803439622368233, 'samples': 8391680, 'steps': 16389, 'batch_loss/train': 1.5762659348547459} 12/25/2021 03:16:35 - INFO - codeparrot_training - Step 16390: {'lr': 0.00038033051086732616, 'samples': 8392192, 'steps': 16390, 'batch_loss/train': 0.8115044236183167} 12/25/2021 03:16:48 - INFO - codeparrot_training - Step 16391: {'lr': 0.00038031705897968417, 'samples': 8392704, 'steps': 16391, 'batch_loss/train': 0.7338745906017721} 12/25/2021 03:16:59 - INFO - codeparrot_training - Step 16392: {'lr': 0.0003803036065739507, 'samples': 8393216, 'steps': 16392, 'batch_loss/train': 0.8738591875880957} 12/25/2021 03:17:09 - INFO - codeparrot_training - Step 16393: {'lr': 0.00038029015365017926, 'samples': 8393728, 'steps': 16393, 'batch_loss/train': 0.8754670964553952} 12/25/2021 03:17:21 - INFO - codeparrot_training - Step 16394: {'lr': 0.00038027670020842345, 'samples': 8394240, 'steps': 16394, 'batch_loss/train': 0.6716324593871832} 12/25/2021 03:17:32 - INFO - codeparrot_training - Step 16395: {'lr': 0.0003802632462487366, 'samples': 8394752, 'steps': 16395, 'batch_loss/train': 0.7741752951405942} 12/25/2021 03:17:43 - INFO - codeparrot_training - Step 16396: {'lr': 0.00038024979177117226, 'samples': 8395264, 'steps': 16396, 'batch_loss/train': 0.8425773023627698} 12/25/2021 03:17:53 - INFO - codeparrot_training - Step 16397: {'lr': 0.00038023633677578396, 'samples': 8395776, 'steps': 16397, 'batch_loss/train': 0.7516240701079369} 12/25/2021 03:18:05 - INFO - codeparrot_training - Step 16398: {'lr': 0.00038022288126262515, 'samples': 8396288, 'steps': 16398, 'batch_loss/train': 0.7779428055509925} 12/25/2021 03:18:16 - INFO - 
codeparrot_training - Step 16399: {'lr': 0.0003802094252317494, 'samples': 8396800, 'steps': 16399, 'batch_loss/train': 0.8505194000899792} 12/25/2021 03:18:26 - INFO - codeparrot_training - Step 16400: {'lr': 0.00038019596868321, 'samples': 8397312, 'steps': 16400, 'batch_loss/train': 0.8677960052154958} 12/25/2021 03:18:39 - INFO - codeparrot_training - Step 16401: {'lr': 0.0003801825116170607, 'samples': 8397824, 'steps': 16401, 'batch_loss/train': 0.8925122581422329} 12/25/2021 03:18:50 - INFO - codeparrot_training - Step 16402: {'lr': 0.0003801690540333549, 'samples': 8398336, 'steps': 16402, 'batch_loss/train': 0.5527537409216166} 12/25/2021 03:19:01 - INFO - codeparrot_training - Step 16403: {'lr': 0.000380155595932146, 'samples': 8398848, 'steps': 16403, 'batch_loss/train': 0.8026828377041966} 12/25/2021 03:19:13 - INFO - codeparrot_training - Step 16404: {'lr': 0.0003801421373134876, 'samples': 8399360, 'steps': 16404, 'batch_loss/train': 0.836663456633687} 12/25/2021 03:19:24 - INFO - codeparrot_training - Step 16405: {'lr': 0.00038012867817743335, 'samples': 8399872, 'steps': 16405, 'batch_loss/train': 0.8296491596847773} 12/25/2021 03:19:34 - INFO - codeparrot_training - Step 16406: {'lr': 0.0003801152185240364, 'samples': 8400384, 'steps': 16406, 'batch_loss/train': 0.7445092252455652} 12/25/2021 03:19:45 - INFO - codeparrot_training - Step 16407: {'lr': 0.0003801017583533506, 'samples': 8400896, 'steps': 16407, 'batch_loss/train': 0.8346738899126649} 12/25/2021 03:19:57 - INFO - codeparrot_training - Step 16408: {'lr': 0.0003800882976654293, 'samples': 8401408, 'steps': 16408, 'batch_loss/train': 0.731676128692925} 12/25/2021 03:20:08 - INFO - codeparrot_training - Step 16409: {'lr': 0.0003800748364603261, 'samples': 8401920, 'steps': 16409, 'batch_loss/train': 0.8197162933647633} 12/25/2021 03:20:18 - INFO - codeparrot_training - Step 16410: {'lr': 0.00038006137473809446, 'samples': 8402432, 'steps': 16410, 'batch_loss/train': 0.8964916691184044} 
12/25/2021 03:20:31 - INFO - codeparrot_training - Step 16411: {'lr': 0.0003800479124987878, 'samples': 8402944, 'steps': 16411, 'batch_loss/train': 0.8020818531513214} 12/25/2021 03:20:42 - INFO - codeparrot_training - Step 16412: {'lr': 0.0003800344497424598, 'samples': 8403456, 'steps': 16412, 'batch_loss/train': 0.8265873305499554} 12/25/2021 03:20:53 - INFO - codeparrot_training - Step 16413: {'lr': 0.00038002098646916396, 'samples': 8403968, 'steps': 16413, 'batch_loss/train': 0.9385354509577155} 12/25/2021 03:21:05 - INFO - codeparrot_training - Step 16414: {'lr': 0.0003800075226789538, 'samples': 8404480, 'steps': 16414, 'batch_loss/train': 0.6658437331207097} 12/25/2021 03:21:15 - INFO - codeparrot_training - Step 16415: {'lr': 0.0003799940583718827, 'samples': 8404992, 'steps': 16415, 'batch_loss/train': 0.8669708520174026} 12/25/2021 03:21:26 - INFO - codeparrot_training - Step 16416: {'lr': 0.0003799805935480044, 'samples': 8405504, 'steps': 16416, 'batch_loss/train': 0.8121983092278242} 12/25/2021 03:21:36 - INFO - codeparrot_training - Step 16417: {'lr': 0.0003799671282073723, 'samples': 8406016, 'steps': 16417, 'batch_loss/train': 0.8629690702073276} 12/25/2021 03:21:48 - INFO - codeparrot_training - Step 16418: {'lr': 0.0003799536623500399, 'samples': 8406528, 'steps': 16418, 'batch_loss/train': 0.7434429600834846} 12/25/2021 03:21:59 - INFO - codeparrot_training - Step 16419: {'lr': 0.00037994019597606095, 'samples': 8407040, 'steps': 16419, 'batch_loss/train': 0.6952060237526894} 12/25/2021 03:22:10 - INFO - codeparrot_training - Step 16420: {'lr': 0.00037992672908548875, 'samples': 8407552, 'steps': 16420, 'batch_loss/train': 0.7852361318655312} 12/25/2021 03:22:23 - INFO - codeparrot_training - Step 16421: {'lr': 0.00037991326167837705, 'samples': 8408064, 'steps': 16421, 'batch_loss/train': 0.8495915979146957} 12/25/2021 03:22:34 - INFO - codeparrot_training - Step 16422: {'lr': 0.0003798997937547792, 'samples': 8408576, 'steps': 16422, 
'batch_loss/train': 0.7924232976511121} 12/25/2021 03:22:44 - INFO - codeparrot_training - Step 16423: {'lr': 0.00037988632531474873, 'samples': 8409088, 'steps': 16423, 'batch_loss/train': 1.0281326174736023} 12/25/2021 03:22:56 - INFO - codeparrot_training - Step 16424: {'lr': 0.0003798728563583394, 'samples': 8409600, 'steps': 16424, 'batch_loss/train': 0.8153260499238968} 12/25/2021 03:23:07 - INFO - codeparrot_training - Step 16425: {'lr': 0.00037985938688560463, 'samples': 8410112, 'steps': 16425, 'batch_loss/train': 0.8157568015158176} 12/25/2021 03:23:18 - INFO - codeparrot_training - Step 16426: {'lr': 0.0003798459168965979, 'samples': 8410624, 'steps': 16426, 'batch_loss/train': 0.7279916242696345} 12/25/2021 03:23:28 - INFO - codeparrot_training - Step 16427: {'lr': 0.0003798324463913729, 'samples': 8411136, 'steps': 16427, 'batch_loss/train': 0.8510133742820472} 12/25/2021 03:23:40 - INFO - codeparrot_training - Step 16428: {'lr': 0.000379818975369983, 'samples': 8411648, 'steps': 16428, 'batch_loss/train': 0.794634415069595} 12/25/2021 03:23:51 - INFO - codeparrot_training - Step 16429: {'lr': 0.00037980550383248196, 'samples': 8412160, 'steps': 16429, 'batch_loss/train': 0.8521801605820656} 12/25/2021 03:24:01 - INFO - codeparrot_training - Step 16430: {'lr': 0.00037979203177892324, 'samples': 8412672, 'steps': 16430, 'batch_loss/train': 0.7866526050493121} 12/25/2021 03:24:14 - INFO - codeparrot_training - Step 16431: {'lr': 0.00037977855920936045, 'samples': 8413184, 'steps': 16431, 'batch_loss/train': 0.8104956718161702} 12/25/2021 03:24:25 - INFO - codeparrot_training - Step 16432: {'lr': 0.00037976508612384705, 'samples': 8413696, 'steps': 16432, 'batch_loss/train': 0.8834657943807542} 12/25/2021 03:24:35 - INFO - codeparrot_training - Step 16433: {'lr': 0.00037975161252243664, 'samples': 8414208, 'steps': 16433, 'batch_loss/train': 0.910928484518081} 12/25/2021 03:24:47 - INFO - codeparrot_training - Step 16434: {'lr': 0.0003797381384051829, 
'samples': 8414720, 'steps': 16434, 'batch_loss/train': 0.837822275236249} 12/25/2021 03:24:58 - INFO - codeparrot_training - Step 16435: {'lr': 0.00037972466377213926, 'samples': 8415232, 'steps': 16435, 'batch_loss/train': 0.753268264234066} 12/25/2021 03:25:08 - INFO - codeparrot_training - Step 16436: {'lr': 0.0003797111886233593, 'samples': 8415744, 'steps': 16436, 'batch_loss/train': 0.7353754403302446} 12/25/2021 03:25:19 - INFO - codeparrot_training - Step 16437: {'lr': 0.00037969771295889664, 'samples': 8416256, 'steps': 16437, 'batch_loss/train': 0.8474216526374221} 12/25/2021 03:25:32 - INFO - codeparrot_training - Step 16438: {'lr': 0.00037968423677880493, 'samples': 8416768, 'steps': 16438, 'batch_loss/train': 0.8273716932162642} 12/25/2021 03:25:42 - INFO - codeparrot_training - Step 16439: {'lr': 0.0003796707600831376, 'samples': 8417280, 'steps': 16439, 'batch_loss/train': 0.833724650554359} 12/25/2021 03:25:53 - INFO - codeparrot_training - Step 16440: {'lr': 0.00037965728287194835, 'samples': 8417792, 'steps': 16440, 'batch_loss/train': 0.9830353688448668} 12/25/2021 03:26:05 - INFO - codeparrot_training - Step 16441: {'lr': 0.00037964380514529065, 'samples': 8418304, 'steps': 16441, 'batch_loss/train': 0.7767963903024793} 12/25/2021 03:26:16 - INFO - codeparrot_training - Step 16442: {'lr': 0.0003796303269032181, 'samples': 8418816, 'steps': 16442, 'batch_loss/train': 0.7797692895401269} 12/25/2021 03:26:26 - INFO - codeparrot_training - Step 16443: {'lr': 0.00037961684814578434, 'samples': 8419328, 'steps': 16443, 'batch_loss/train': 0.8390476023778319} 12/25/2021 03:26:38 - INFO - codeparrot_training - Step 16444: {'lr': 0.0003796033688730429, 'samples': 8419840, 'steps': 16444, 'batch_loss/train': 0.7752347416244447} 12/25/2021 03:26:49 - INFO - codeparrot_training - Step 16445: {'lr': 0.0003795898890850475, 'samples': 8420352, 'steps': 16445, 'batch_loss/train': 0.7760807042941451} 12/25/2021 03:26:59 - INFO - codeparrot_training - Step 
16446: {'lr': 0.00037957640878185153, 'samples': 8420864, 'steps': 16446, 'batch_loss/train': 0.7619779882952571}
12/25/2021 03:27:12 - INFO - codeparrot_training - Step 16447: {'lr': 0.0003795629279635088, 'samples': 8421376, 'steps': 16447, 'batch_loss/train': 0.8889281153678894}
12/25/2021 03:27:22 - INFO - codeparrot_training - Step 16448: {'lr': 0.00037954944663007274, 'samples': 8421888, 'steps': 16448, 'batch_loss/train': 0.8099018186330795}
12/25/2021 03:27:33 - INFO - codeparrot_training - Step 16449: {'lr': 0.0003795359647815969, 'samples': 8422400, 'steps': 16449, 'batch_loss/train': 0.8250860443804413}
12/25/2021 03:27:43 - INFO - codeparrot_training - Step 16450: {'lr': 0.0003795224824181351, 'samples': 8422912, 'steps': 16450, 'batch_loss/train': 0.7922704815864563}
12/25/2021 03:27:56 - INFO - codeparrot_training - Step 16451: {'lr': 0.0003795089995397407, 'samples': 8423424, 'steps': 16451, 'batch_loss/train': 0.7929414631798863}
12/25/2021 03:28:07 - INFO - codeparrot_training - Step 16452: {'lr': 0.0003794955161464675, 'samples': 8423936, 'steps': 16452, 'batch_loss/train': 0.9027003515511751}
12/25/2021 03:28:17 - INFO - codeparrot_training - Step 16453: {'lr': 0.0003794820322383691, 'samples': 8424448, 'steps': 16453, 'batch_loss/train': 0.8533775340765715}
12/25/2021 03:28:29 - INFO - codeparrot_training - Step 16454: {'lr': 0.0003794685478154989, 'samples': 8424960, 'steps': 16454, 'batch_loss/train': 0.8236110154539347}
12/25/2021 03:28:40 - INFO - codeparrot_training - Step 16455: {'lr': 0.0003794550628779107, 'samples': 8425472, 'steps': 16455, 'batch_loss/train': 0.7584734526462853}
12/25/2021 03:28:51 - INFO - codeparrot_training - Step 16456: {'lr': 0.000379441577425658, 'samples': 8425984, 'steps': 16456, 'batch_loss/train': 0.7601137375459075}
12/25/2021 03:29:03 - INFO - codeparrot_training - Step 16457: {'lr': 0.0003794280914587945, 'samples': 8426496, 'steps': 16457, 'batch_loss/train': 0.8395616477355361}
12/25/2021 03:29:13 - INFO - codeparrot_training - Step 16458: {'lr': 0.0003794146049773737, 'samples': 8427008, 'steps': 16458, 'batch_loss/train': 0.601382864639163}
12/25/2021 03:29:24 - INFO - codeparrot_training - Step 16459: {'lr': 0.0003794011179814494, 'samples': 8427520, 'steps': 16459, 'batch_loss/train': 0.8463073363527656}
12/25/2021 03:29:35 - INFO - codeparrot_training - Step 16460: {'lr': 0.0003793876304710751, 'samples': 8428032, 'steps': 16460, 'batch_loss/train': 0.7563352892175317}
12/25/2021 03:29:47 - INFO - codeparrot_training - Step 16461: {'lr': 0.0003793741424463044, 'samples': 8428544, 'steps': 16461, 'batch_loss/train': 0.7561961910687387}
12/25/2021 03:29:58 - INFO - codeparrot_training - Step 16462: {'lr': 0.000379360653907191, 'samples': 8429056, 'steps': 16462, 'batch_loss/train': 1.0054340222850442}
12/25/2021 03:30:08 - INFO - codeparrot_training - Step 16463: {'lr': 0.00037934716485378843, 'samples': 8429568, 'steps': 16463, 'batch_loss/train': 0.8225608076900244}
12/25/2021 03:30:21 - INFO - codeparrot_training - Step 16464: {'lr': 0.00037933367528615044, 'samples': 8430080, 'steps': 16464, 'batch_loss/train': 0.9183127526193857}
12/25/2021 03:30:31 - INFO - codeparrot_training - Step 16465: {'lr': 0.00037932018520433054, 'samples': 8430592, 'steps': 16465, 'batch_loss/train': 0.8509008605033159}
12/25/2021 03:30:42 - INFO - codeparrot_training - Step 16466: {'lr': 0.00037930669460838247, 'samples': 8431104, 'steps': 16466, 'batch_loss/train': 0.8368670279160142}
12/25/2021 03:30:55 - INFO - codeparrot_training - Step 16467: {'lr': 0.0003792932034983597, 'samples': 8431616, 'steps': 16467, 'batch_loss/train': 1.0684367178473622}
12/25/2021 03:31:05 - INFO - codeparrot_training - Step 16468: {'lr': 0.000379279711874316, 'samples': 8432128, 'steps': 16468, 'batch_loss/train': 0.7960388326318935}
12/25/2021 03:31:16 - INFO - codeparrot_training - Step 16469: {'lr': 0.00037926621973630506, 'samples': 8432640, 'steps': 16469, 'batch_loss/train': 0.7981840176507831}
12/25/2021 03:31:27 - INFO - codeparrot_training - Step 16470: {'lr': 0.0003792527270843804, 'samples': 8433152, 'steps': 16470, 'batch_loss/train': 0.779503648635}
12/25/2021 03:31:39 - INFO - codeparrot_training - Step 16471: {'lr': 0.0003792392339185957, 'samples': 8433664, 'steps': 16471, 'batch_loss/train': 0.8255496290512383}
12/25/2021 03:31:49 - INFO - codeparrot_training - Step 16472: {'lr': 0.0003792257402390046, 'samples': 8434176, 'steps': 16472, 'batch_loss/train': 0.769402094418183}
12/25/2021 03:32:00 - INFO - codeparrot_training - Step 16473: {'lr': 0.0003792122460456608, 'samples': 8434688, 'steps': 16473, 'batch_loss/train': 0.8440640748012811}
12/25/2021 03:32:12 - INFO - codeparrot_training - Step 16474: {'lr': 0.0003791987513386178, 'samples': 8435200, 'steps': 16474, 'batch_loss/train': 1.0737783601507545}
12/25/2021 03:32:23 - INFO - codeparrot_training - Step 16475: {'lr': 0.0003791852561179294, 'samples': 8435712, 'steps': 16475, 'batch_loss/train': 0.8259078186238185}
12/25/2021 03:32:33 - INFO - codeparrot_training - Step 16476: {'lr': 0.0003791717603836492, 'samples': 8436224, 'steps': 16476, 'batch_loss/train': 0.7899503069929779}
12/25/2021 03:32:46 - INFO - codeparrot_training - Step 16477: {'lr': 0.0003791582641358309, 'samples': 8436736, 'steps': 16477, 'batch_loss/train': 0.8505193097516894}
12/25/2021 03:32:56 - INFO - codeparrot_training - Step 16478: {'lr': 0.000379144767374528, 'samples': 8437248, 'steps': 16478, 'batch_loss/train': 0.8115126470220275}
12/25/2021 03:33:07 - INFO - codeparrot_training - Step 16479: {'lr': 0.0003791312700997944, 'samples': 8437760, 'steps': 16479, 'batch_loss/train': 0.8243734799325466}
12/25/2021 03:33:18 - INFO - codeparrot_training - Step 16480: {'lr': 0.0003791177723116835, 'samples': 8438272, 'steps': 16480, 'batch_loss/train': 0.8393291682004929}
12/25/2021 03:33:30 - INFO - codeparrot_training - Step 16481: {'lr': 0.0003791042740102492, 'samples': 8438784, 'steps': 16481, 'batch_loss/train': 0.8002667257096618}
12/25/2021 03:33:40 - INFO - codeparrot_training - Step 16482: {'lr': 0.00037909077519554497, 'samples': 8439296, 'steps': 16482, 'batch_loss/train': 0.9083249159157276}
12/25/2021 03:33:51 - INFO - codeparrot_training - Step 16483: {'lr': 0.0003790772758676246, 'samples': 8439808, 'steps': 16483, 'batch_loss/train': 0.7681462168693542}
12/25/2021 03:34:03 - INFO - codeparrot_training - Step 16484: {'lr': 0.0003790637760265417, 'samples': 8440320, 'steps': 16484, 'batch_loss/train': 0.7831322737038136}
12/25/2021 03:34:14 - INFO - codeparrot_training - Step 16485: {'lr': 0.00037905027567234994, 'samples': 8440832, 'steps': 16485, 'batch_loss/train': 0.6631510443985462}
12/25/2021 03:34:24 - INFO - codeparrot_training - Step 16486: {'lr': 0.00037903677480510313, 'samples': 8441344, 'steps': 16486, 'batch_loss/train': 0.8327266471460462}
12/25/2021 03:34:36 - INFO - codeparrot_training - Step 16487: {'lr': 0.0003790232734248547, 'samples': 8441856, 'steps': 16487, 'batch_loss/train': 0.7892580246552825}
12/25/2021 03:34:47 - INFO - codeparrot_training - Step 16488: {'lr': 0.0003790097715316585, 'samples': 8442368, 'steps': 16488, 'batch_loss/train': 0.7624923280673102}
12/25/2021 03:34:58 - INFO - codeparrot_training - Step 16489: {'lr': 0.00037899626912556817, 'samples': 8442880, 'steps': 16489, 'batch_loss/train': 0.7753078872337937}
12/25/2021 03:35:08 - INFO - codeparrot_training - Step 16490: {'lr': 0.0003789827662066373, 'samples': 8443392, 'steps': 16490, 'batch_loss/train': 0.8057362074032426}
12/25/2021 03:35:22 - INFO - codeparrot_training - Step 16491: {'lr': 0.0003789692627749197, 'samples': 8443904, 'steps': 16491, 'batch_loss/train': 0.7815865671727806}
12/25/2021 03:35:32 - INFO - codeparrot_training - Step 16492: {'lr': 0.00037895575883046905, 'samples': 8444416, 'steps': 16492, 'batch_loss/train': 0.6906238934025168}
12/25/2021 03:35:43 - INFO - codeparrot_training - Step 16493: {'lr': 0.000378942254373339, 'samples': 8444928, 'steps': 16493, 'batch_loss/train': 0.6305379341356456}
12/25/2021 03:35:55 - INFO - codeparrot_training - Step 16494: {'lr': 0.00037892874940358314, 'samples': 8445440, 'steps': 16494, 'batch_loss/train': 0.7956684986129403}
12/25/2021 03:36:05 - INFO - codeparrot_training - Step 16495: {'lr': 0.0003789152439212554, 'samples': 8445952, 'steps': 16495, 'batch_loss/train': 0.8217930993996561}
12/25/2021 03:36:16 - INFO - codeparrot_training - Step 16496: {'lr': 0.0003789017379264093, 'samples': 8446464, 'steps': 16496, 'batch_loss/train': 0.8296674899756908}
12/25/2021 03:36:29 - INFO - codeparrot_training - Step 16497: {'lr': 0.0003788882314190984, 'samples': 8446976, 'steps': 16497, 'batch_loss/train': 0.7899437178857625}
12/25/2021 03:36:39 - INFO - codeparrot_training - Step 16498: {'lr': 0.00037887472439937666, 'samples': 8447488, 'steps': 16498, 'batch_loss/train': 0.6996634872630239}
12/25/2021 03:36:50 - INFO - codeparrot_training - Step 16499: {'lr': 0.00037886121686729775, 'samples': 8448000, 'steps': 16499, 'batch_loss/train': 0.797478562220931}
12/25/2021 03:37:00 - INFO - codeparrot_training - Step 16500: {'lr': 0.0003788477088229152, 'samples': 8448512, 'steps': 16500, 'batch_loss/train': 0.7477188266348094}
12/25/2021 03:37:12 - INFO - codeparrot_training - Step 16501: {'lr': 0.00037883420026628287, 'samples': 8449024, 'steps': 16501, 'batch_loss/train': 0.950902255717665}
12/25/2021 03:37:23 - INFO - codeparrot_training - Step 16502: {'lr': 0.00037882069119745434, 'samples': 8449536, 'steps': 16502, 'batch_loss/train': 0.7583601241931319}
12/25/2021 03:37:34 - INFO - codeparrot_training - Step 16503: {'lr': 0.0003788071816164834, 'samples': 8450048, 'steps': 16503, 'batch_loss/train': 0.8168062679469585}
12/25/2021 03:37:46 - INFO - codeparrot_training - Step 16504: {'lr': 0.00037879367152342377, 'samples': 8450560, 'steps': 16504, 'batch_loss/train': 0.78849514760077}
12/25/2021 03:37:56 - INFO - codeparrot_training - Step 16505: {'lr': 0.0003787801609183291, 'samples': 8451072, 'steps': 16505, 'batch_loss/train': 0.7755010211840272}
12/25/2021 03:38:07 - INFO - codeparrot_training - Step 16506: {'lr': 0.0003787666498012531, 'samples': 8451584, 'steps': 16506, 'batch_loss/train': 0.7951684494037181}
12/25/2021 03:38:20 - INFO - codeparrot_training - Step 16507: {'lr': 0.00037875313817224964, 'samples': 8452096, 'steps': 16507, 'batch_loss/train': 0.5416113164974377}
12/25/2021 03:38:30 - INFO - codeparrot_training - Step 16508: {'lr': 0.00037873962603137226, 'samples': 8452608, 'steps': 16508, 'batch_loss/train': 0.7746973051689565}
12/25/2021 03:38:41 - INFO - codeparrot_training - Step 16509: {'lr': 0.00037872611337867474, 'samples': 8453120, 'steps': 16509, 'batch_loss/train': 0.8164829509332776}
12/25/2021 03:38:53 - INFO - codeparrot_training - Step 16510: {'lr': 0.00037871260021421076, 'samples': 8453632, 'steps': 16510, 'batch_loss/train': 0.8254687869921327}
12/25/2021 03:39:04 - INFO - codeparrot_training - Step 16511: {'lr': 0.00037869908653803414, 'samples': 8454144, 'steps': 16511, 'batch_loss/train': 0.7941197762265801}
12/25/2021 03:39:14 - INFO - codeparrot_training - Step 16512: {'lr': 0.0003786855723501985, 'samples': 8454656, 'steps': 16512, 'batch_loss/train': 0.8040883066132665}
12/25/2021 03:39:25 - INFO - codeparrot_training - Step 16513: {'lr': 0.00037867205765075763, 'samples': 8455168, 'steps': 16513, 'batch_loss/train': 0.821681444067508}
12/25/2021 03:39:37 - INFO - codeparrot_training - Step 16514: {'lr': 0.0003786585424397653, 'samples': 8455680, 'steps': 16514, 'batch_loss/train': 0.7770571578294039}
12/25/2021 03:39:47 - INFO - codeparrot_training - Step 16515: {'lr': 0.0003786450267172751, 'samples': 8456192, 'steps': 16515, 'batch_loss/train': 0.6866654111072421}
12/25/2021 03:39:58 - INFO - codeparrot_training - Step 16516: {'lr': 0.00037863151048334086, 'samples': 8456704, 'steps': 16516, 'batch_loss/train': 0.8063623425550759}
12/25/2021 03:40:11 - INFO - codeparrot_training - Step 16517: {'lr': 0.0003786179937380163, 'samples': 8457216, 'steps': 16517, 'batch_loss/train': 0.7917315142694861}
12/25/2021 03:40:21 - INFO - codeparrot_training - Step 16518: {'lr': 0.00037860447648135524, 'samples': 8457728, 'steps': 16518, 'batch_loss/train': 0.8119336636736989}
12/25/2021 03:40:32 - INFO - codeparrot_training - Step 16519: {'lr': 0.0003785909587134113, 'samples': 8458240, 'steps': 16519, 'batch_loss/train': 0.8167871756013483}
12/25/2021 03:40:44 - INFO - codeparrot_training - Step 16520: {'lr': 0.0003785774404342382, 'samples': 8458752, 'steps': 16520, 'batch_loss/train': 0.7268245306331664}
12/25/2021 03:40:55 - INFO - codeparrot_training - Step 16521: {'lr': 0.00037856392164388984, 'samples': 8459264, 'steps': 16521, 'batch_loss/train': 1.3858943413943052}
12/25/2021 03:41:06 - INFO - codeparrot_training - Step 16522: {'lr': 0.00037855040234241976, 'samples': 8459776, 'steps': 16522, 'batch_loss/train': 0.7487788200378418}
12/25/2021 03:41:16 - INFO - codeparrot_training - Step 16523: {'lr': 0.0003785368825298819, 'samples': 8460288, 'steps': 16523, 'batch_loss/train': 0.8955449759960175}
12/25/2021 03:41:28 - INFO - codeparrot_training - Step 16524: {'lr': 0.00037852336220632995, 'samples': 8460800, 'steps': 16524, 'batch_loss/train': 0.682218772592023}
12/25/2021 03:41:39 - INFO - codeparrot_training - Step 16525: {'lr': 0.00037850984137181767, 'samples': 8461312, 'steps': 16525, 'batch_loss/train': 0.6752700781216845}
12/25/2021 03:41:49 - INFO - codeparrot_training - Step 16526: {'lr': 0.00037849632002639864, 'samples': 8461824, 'steps': 16526, 'batch_loss/train': 0.7811495019122958}
12/25/2021 03:42:02 - INFO - codeparrot_training - Step 16527: {'lr': 0.00037848279817012686, 'samples': 8462336, 'steps': 16527, 'batch_loss/train': 0.7766978833824396}
12/25/2021 03:42:13 - INFO - codeparrot_training - Step 16528: {'lr': 0.00037846927580305594, 'samples': 8462848, 'steps': 16528, 'batch_loss/train': 0.8493122756481171}
12/25/2021 03:42:23 - INFO - codeparrot_training - Step 16529: {'lr': 0.0003784557529252397, 'samples': 8463360, 'steps': 16529, 'batch_loss/train': 0.8459546724334359}
12/25/2021 03:42:35 - INFO - codeparrot_training - Step 16530: {'lr': 0.0003784422295367319, 'samples': 8463872, 'steps': 16530, 'batch_loss/train': 0.8420109911821783}
12/25/2021 03:42:46 - INFO - codeparrot_training - Step 16531: {'lr': 0.0003784287056375863, 'samples': 8464384, 'steps': 16531, 'batch_loss/train': 0.7454065852798522}
12/25/2021 03:42:57 - INFO - codeparrot_training - Step 16532: {'lr': 0.00037841518122785667, 'samples': 8464896, 'steps': 16532, 'batch_loss/train': 0.8109003826975822}
12/25/2021 03:43:07 - INFO - codeparrot_training - Step 16533: {'lr': 0.0003784016563075967, 'samples': 8465408, 'steps': 16533, 'batch_loss/train': 0.7368392455391586}
12/25/2021 03:43:19 - INFO - codeparrot_training - Step 16534: {'lr': 0.00037838813087686036, 'samples': 8465920, 'steps': 16534, 'batch_loss/train': 0.8313487195409834}
12/25/2021 03:43:30 - INFO - codeparrot_training - Step 16535: {'lr': 0.0003783746049357012, 'samples': 8466432, 'steps': 16535, 'batch_loss/train': 0.8405038439668715}
12/25/2021 03:43:40 - INFO - codeparrot_training - Step 16536: {'lr': 0.0003783610784841731, 'samples': 8466944, 'steps': 16536, 'batch_loss/train': 0.8064529099501669}
12/25/2021 03:43:53 - INFO - codeparrot_training - Step 16537: {'lr': 0.00037834755152232983, 'samples': 8467456, 'steps': 16537, 'batch_loss/train': 0.8049694681540132}
12/25/2021 03:44:04 - INFO - codeparrot_training - Step 16538: {'lr': 0.0003783340240502251, 'samples': 8467968, 'steps': 16538, 'batch_loss/train': 0.8242783132009208}
12/25/2021 03:44:14 - INFO - codeparrot_training - Step 16539: {'lr': 0.00037832049606791274, 'samples': 8468480, 'steps': 16539, 'batch_loss/train': 0.7106008271221071}
12/25/2021 03:44:26 - INFO - codeparrot_training - Step 16540: {'lr': 0.00037830696757544665, 'samples': 8468992, 'steps': 16540, 'batch_loss/train': 0.8009513840079308}
12/25/2021 03:44:37 - INFO - codeparrot_training - Step 16541: {'lr': 0.00037829343857288046, 'samples': 8469504, 'steps': 16541, 'batch_loss/train': 0.8087261221371591}
12/25/2021 03:44:48 - INFO - codeparrot_training - Step 16542: {'lr': 0.0003782799090602679, 'samples': 8470016, 'steps': 16542, 'batch_loss/train': 0.7671624538488686}
12/25/2021 03:44:58 - INFO - codeparrot_training - Step 16543: {'lr': 0.0003782663790376629, 'samples': 8470528, 'steps': 16543, 'batch_loss/train': 0.8415856389328837}
12/25/2021 03:45:10 - INFO - codeparrot_training - Step 16544: {'lr': 0.00037825284850511934, 'samples': 8471040, 'steps': 16544, 'batch_loss/train': 0.739308618940413}
12/25/2021 03:45:21 - INFO - codeparrot_training - Step 16545: {'lr': 0.00037823931746269073, 'samples': 8471552, 'steps': 16545, 'batch_loss/train': 0.7557910224422812}
12/25/2021 03:45:32 - INFO - codeparrot_training - Step 16546: {'lr': 0.0003782257859104311, 'samples': 8472064, 'steps': 16546, 'batch_loss/train': 0.8061896306462586}
12/25/2021 03:45:45 - INFO - codeparrot_training - Step 16547: {'lr': 0.00037821225384839413, 'samples': 8472576, 'steps': 16547, 'batch_loss/train': 0.8988741151988506}
12/25/2021 03:45:55 - INFO - codeparrot_training - Step 16548: {'lr': 0.0003781987212766337, 'samples': 8473088, 'steps': 16548, 'batch_loss/train': 0.7664727522060275}
12/25/2021 03:46:06 - INFO - codeparrot_training - Step 16549: {'lr': 0.00037818518819520347, 'samples': 8473600, 'steps': 16549, 'batch_loss/train': 0.8005831367336214}
12/25/2021 03:46:18 - INFO - codeparrot_training - Step 16550: {'lr': 0.00037817165460415735, 'samples': 8474112, 'steps': 16550, 'batch_loss/train': 0.726163043640554}
12/25/2021 03:46:28 - INFO - codeparrot_training - Step 16551: {'lr': 0.00037815812050354916, 'samples': 8474624, 'steps': 16551, 'batch_loss/train': 0.7795983799733222}
12/25/2021 03:46:39 - INFO - codeparrot_training - Step 16552: {'lr': 0.00037814458589343264, 'samples': 8475136, 'steps': 16552, 'batch_loss/train': 0.869534901343286}
12/25/2021 03:46:50 - INFO - codeparrot_training - Step 16553: {'lr': 0.00037813105077386165, 'samples': 8475648, 'steps': 16553, 'batch_loss/train': 0.8297166768461466}
12/25/2021 03:47:03 - INFO - codeparrot_training - Step 16554: {'lr': 0.00037811751514489, 'samples': 8476160, 'steps': 16554, 'batch_loss/train': 0.6709253331646323}
12/25/2021 03:47:14 - INFO - codeparrot_training - Step 16555: {'lr': 0.00037810397900657147, 'samples': 8476672, 'steps': 16555, 'batch_loss/train': 0.8448412213474512}
12/25/2021 03:47:25 - INFO - codeparrot_training - Step 16556: {'lr': 0.00037809044235895993, 'samples': 8477184, 'steps': 16556, 'batch_loss/train': 0.740744206123054}
12/25/2021 03:47:38 - INFO - codeparrot_training - Step 16557: {'lr': 0.00037807690520210915, 'samples': 8477696, 'steps': 16557, 'batch_loss/train': 0.8634712006896734}
12/25/2021 03:47:48 - INFO - codeparrot_training - Step 16558: {'lr': 0.00037806336753607285, 'samples': 8478208, 'steps': 16558, 'batch_loss/train': 0.5663075284101069}
12/25/2021 03:47:59 - INFO - codeparrot_training - Step 16559: {'lr': 0.000378049829360905, 'samples': 8478720, 'steps': 16559, 'batch_loss/train': 1.0266973900143057}
12/25/2021 03:48:09 - INFO - codeparrot_training - Step 16560: {'lr': 0.0003780362906766595, 'samples': 8479232, 'steps': 16560, 'batch_loss/train': 0.7004072163254023}
12/25/2021 03:48:21 - INFO - codeparrot_training - Step 16561: {'lr': 0.00037802275148338994, 'samples': 8479744, 'steps': 16561, 'batch_loss/train': 0.8039108701050282}
12/25/2021 03:48:32 - INFO - codeparrot_training - Step 16562: {'lr': 0.00037800921178115035, 'samples': 8480256, 'steps': 16562, 'batch_loss/train': 0.8419303754344583}
12/25/2021 03:48:43 - INFO - codeparrot_training - Step 16563: {'lr': 0.0003779956715699944, 'samples': 8480768, 'steps': 16563, 'batch_loss/train': 0.7792437979951501}
12/25/2021 03:48:55 - INFO - codeparrot_training - Step 16564: {'lr': 0.0003779821308499759, 'samples': 8481280, 'steps': 16564, 'batch_loss/train': 0.8435241850093007}
12/25/2021 03:49:05 - INFO - codeparrot_training - Step 16565: {'lr': 0.0003779685896211489, 'samples': 8481792, 'steps': 16565, 'batch_loss/train': 0.6361147145507857}
12/25/2021 03:49:16 - INFO - codeparrot_training - Step 16566: {'lr': 0.00037795504788356703, 'samples': 8482304, 'steps': 16566, 'batch_loss/train': 0.735586043796502}
12/25/2021 03:49:29 - INFO - codeparrot_training - Step 16567: {'lr': 0.0003779415056372843, 'samples': 8482816, 'steps': 16567, 'batch_loss/train': 0.8960874630138278}
12/25/2021 03:49:40 - INFO - codeparrot_training - Step 16568: {'lr': 0.0003779279628823543, 'samples': 8483328, 'steps': 16568, 'batch_loss/train': 0.8110686857253313}
12/25/2021 03:49:50 - INFO - codeparrot_training - Step 16569: {'lr': 0.0003779144196188312, 'samples': 8483840, 'steps': 16569, 'batch_loss/train': 0.7972037107683718}
12/25/2021 03:50:02 - INFO - codeparrot_training - Step 16570: {'lr': 0.00037790087584676856, 'samples': 8484352, 'steps': 16570, 'batch_loss/train': 0.7254444495774806}
12/25/2021 03:50:13 - INFO - codeparrot_training - Step 16571: {'lr': 0.00037788733156622037, 'samples': 8484864, 'steps': 16571, 'batch_loss/train': 0.9556894423440099}
12/25/2021 03:50:24 - INFO - codeparrot_training - Step 16572: {'lr': 0.0003778737867772404, 'samples': 8485376, 'steps': 16572, 'batch_loss/train': 0.8679171307012439}
12/25/2021 03:50:34 - INFO - codeparrot_training - Step 16573: {'lr': 0.00037786024147988263, 'samples': 8485888, 'steps': 16573, 'batch_loss/train': 0.8097782908007503}
12/25/2021 03:50:46 - INFO - codeparrot_training - Step 16574: {'lr': 0.00037784669567420063, 'samples': 8486400, 'steps': 16574, 'batch_loss/train': 1.1774125043302774}
12/25/2021 03:50:57 - INFO - codeparrot_training - Step 16575: {'lr': 0.0003778331493602486, 'samples': 8486912, 'steps': 16575, 'batch_loss/train': 0.7441344138933346}
12/25/2021 03:51:07 - INFO - codeparrot_training - Step 16576: {'lr': 0.00037781960253808024, 'samples': 8487424, 'steps': 16576, 'batch_loss/train': 0.754784211050719}
12/25/2021 03:51:20 - INFO - codeparrot_training - Step 16577: {'lr': 0.00037780605520774935, 'samples': 8487936, 'steps': 16577, 'batch_loss/train': 0.7040003549773246}
12/25/2021 03:51:31 - INFO - codeparrot_training - Step 16578: {'lr': 0.0003777925073693098, 'samples': 8488448, 'steps': 16578, 'batch_loss/train': 0.8811511285603046}
12/25/2021 03:51:41 - INFO - codeparrot_training - Step 16579: {'lr': 0.00037777895902281553, 'samples': 8488960, 'steps': 16579, 'batch_loss/train': 0.7976763565093279}
12/25/2021 03:51:53 - INFO - codeparrot_training - Step 16580: {'lr': 0.00037776541016832044, 'samples': 8489472, 'steps': 16580, 'batch_loss/train': 0.8060139836743474}
12/25/2021 03:52:04 - INFO - codeparrot_training - Step 16581: {'lr': 0.0003777518608058782, 'samples': 8489984, 'steps': 16581, 'batch_loss/train': 0.8140704685356468}
12/25/2021 03:52:14 - INFO - codeparrot_training - Step 16582: {'lr': 0.0003777383109355429, 'samples': 8490496, 'steps': 16582, 'batch_loss/train': 0.9451856901869178}
12/25/2021 03:52:25 - INFO - codeparrot_training - Step 16583: {'lr': 0.00037772476055736824, 'samples': 8491008, 'steps': 16583, 'batch_loss/train': 0.7788053378462791}
12/25/2021 03:52:38 - INFO - codeparrot_training - Step 16584: {'lr': 0.0003777112096714082, 'samples': 8491520, 'steps': 16584, 'batch_loss/train': 0.7775973826646805}
12/25/2021 03:52:49 - INFO - codeparrot_training - Step 16585: {'lr': 0.00037769765827771665, 'samples': 8492032, 'steps': 16585, 'batch_loss/train': 0.8107525454834104}
12/25/2021 03:52:59 - INFO - codeparrot_training - Step 16586: {'lr': 0.0003776841063763474, 'samples': 8492544, 'steps': 16586, 'batch_loss/train': 0.7395923510193825}
12/25/2021 03:53:11 - INFO - codeparrot_training - Step 16587: {'lr': 0.0003776705539673543, 'samples': 8493056, 'steps': 16587, 'batch_loss/train': 0.7230852670036256}
12/25/2021 03:53:22 - INFO - codeparrot_training - Step 16588: {'lr': 0.0003776570010507913, 'samples': 8493568, 'steps': 16588, 'batch_loss/train': 0.7370801763609052}
12/25/2021 03:53:32 - INFO - codeparrot_training - Step 16589: {'lr': 0.00037764344762671227, 'samples': 8494080, 'steps': 16589, 'batch_loss/train': 0.8609027564525604}
12/25/2021 03:53:43 - INFO - codeparrot_training - Step 16590: {'lr': 0.0003776298936951711, 'samples': 8494592, 'steps': 16590, 'batch_loss/train': 0.830739120952785}
12/25/2021 03:53:55 - INFO - codeparrot_training - Step 16591: {'lr': 0.00037761633925622164, 'samples': 8495104, 'steps': 16591, 'batch_loss/train': 0.8052471922710538}
12/25/2021 03:54:06 - INFO - codeparrot_training - Step 16592: {'lr': 0.0003776027843099178, 'samples': 8495616, 'steps': 16592, 'batch_loss/train': 0.8935459610074759}
12/25/2021 03:54:16 - INFO - codeparrot_training - Step 16593: {'lr': 0.0003775892288563135, 'samples': 8496128, 'steps': 16593, 'batch_loss/train': 0.6191854854114354}
12/25/2021 03:54:30 - INFO - codeparrot_training - Step 16594: {'lr': 0.00037757567289546255, 'samples': 8496640, 'steps': 16594, 'batch_loss/train': 0.6453753454843536}
12/25/2021 03:54:40 - INFO - codeparrot_training - Step 16595: {'lr': 0.000377562116427419, 'samples': 8497152, 'steps': 16595, 'batch_loss/train': 0.7863879706710577}
12/25/2021 03:54:51 - INFO - codeparrot_training - Step 16596: {'lr': 0.00037754855945223666, 'samples': 8497664, 'steps': 16596, 'batch_loss/train': 0.9586450178176165}
12/25/2021 03:55:03 - INFO - codeparrot_training - Step 16597: {'lr': 0.00037753500196996924, 'samples': 8498176, 'steps': 16597, 'batch_loss/train': 0.8055119905620813}
12/25/2021 03:55:13 - INFO - codeparrot_training - Step 16598: {'lr': 0.0003775214439806709, 'samples': 8498688, 'steps': 16598, 'batch_loss/train': 0.8896419415250421}
12/25/2021 03:55:24 - INFO - codeparrot_training - Step 16599: {'lr': 0.0003775078854843955, 'samples': 8499200, 'steps': 16599, 'batch_loss/train': 0.7959301364608109}
12/25/2021 03:55:35 - INFO - codeparrot_training - Step 16600: {'lr': 0.0003774943264811968, 'samples': 8499712, 'steps': 16600, 'batch_loss/train': 0.8302432345226407}
12/25/2021 03:55:47 - INFO - codeparrot_training - Step 16601: {'lr': 0.0003774807669711289, 'samples': 8500224, 'steps': 16601, 'batch_loss/train': 0.7830287925899029}
12/25/2021 03:55:57 - INFO - codeparrot_training - Step 16602: {'lr': 0.0003774672069542455, 'samples': 8500736, 'steps': 16602, 'batch_loss/train': 0.8227236882084981}
12/25/2021 03:56:08 - INFO - codeparrot_training - Step 16603: {'lr': 0.00037745364643060065, 'samples': 8501248, 'steps': 16603, 'batch_loss/train': 0.7878961353562772}
12/25/2021 03:56:20 - INFO - codeparrot_training - Step 16604: {'lr': 0.0003774400854002482, 'samples': 8501760, 'steps': 16604, 'batch_loss/train': 0.726678610779345}
12/25/2021 03:56:31 - INFO - codeparrot_training - Step 16605: {'lr': 0.00037742652386324216, 'samples': 8502272, 'steps': 16605, 'batch_loss/train': 0.762860338203609}
12/25/2021 03:56:42 - INFO - codeparrot_training - Step 16606: {'lr': 0.00037741296181963634, 'samples': 8502784, 'steps': 16606, 'batch_loss/train': 0.8711648182943463}
12/25/2021 03:56:54 - INFO - codeparrot_training - Step 16607: {'lr': 0.00037739939926948466, 'samples': 8503296, 'steps': 16607, 'batch_loss/train': 0.7261677663773298}
12/25/2021 03:57:05 - INFO - codeparrot_training - Step 16608: {'lr': 0.00037738583621284115, 'samples': 8503808, 'steps': 16608, 'batch_loss/train': 0.8561877384781837}
12/25/2021 03:57:15 - INFO - codeparrot_training - Step 16609: {'lr': 0.0003773722726497596, 'samples': 8504320, 'steps': 16609, 'batch_loss/train': 0.6730571042280644}
12/25/2021 03:57:27 - INFO - codeparrot_training - Step 16610: {'lr': 0.000377358708580294, 'samples': 8504832, 'steps': 16610, 'batch_loss/train': 0.6267926581203938}
12/25/2021 03:57:38 - INFO - codeparrot_training - Step 16611: {'lr': 0.0003773451440044983, 'samples': 8505344, 'steps': 16611, 'batch_loss/train': 0.7693609306588769}
12/25/2021 03:57:49 - INFO - codeparrot_training - Step 16612: {'lr': 0.00037733157892242634, 'samples': 8505856, 'steps': 16612, 'batch_loss/train': 0.8431441392749548}
12/25/2021 03:57:59 - INFO - codeparrot_training - Step 16613: {'lr': 0.00037731801333413205, 'samples': 8506368, 'steps': 16613, 'batch_loss/train': 0.8202112372964621}
12/25/2021 03:58:12 - INFO - codeparrot_training - Step 16614: {'lr': 0.0003773044472396695, 'samples': 8506880, 'steps': 16614, 'batch_loss/train': 0.8036119181197137}
12/25/2021 03:58:23 - INFO - codeparrot_training - Step 16615: {'lr': 0.00037729088063909253, 'samples': 8507392, 'steps': 16615, 'batch_loss/train': 0.7770014917477965}
12/25/2021 03:58:33 - INFO - codeparrot_training - Step 16616: {'lr': 0.00037727731353245507, 'samples': 8507904, 'steps': 16616, 'batch_loss/train': 0.8533467063680291}
12/25/2021 03:58:45 - INFO - codeparrot_training - Step 16617: {'lr': 0.0003772637459198111, 'samples': 8508416, 'steps': 16617, 'batch_loss/train': 0.8291139649227262}
12/25/2021 03:58:56 - INFO - codeparrot_training - Step 16618: {'lr': 0.0003772501778012144, 'samples': 8508928, 'steps': 16618, 'batch_loss/train': 0.9490412268787622}
12/25/2021 03:59:07 - INFO - codeparrot_training - Step 16619: {'lr': 0.00037723660917671914, 'samples': 8509440, 'steps': 16619, 'batch_loss/train': 1.052719285711646}
12/25/2021 03:59:19 - INFO - codeparrot_training - Step 16620: {'lr': 0.0003772230400463792, 'samples': 8509952, 'steps': 16620, 'batch_loss/train': 0.8003109847195446}
12/25/2021 03:59:29 - INFO - codeparrot_training - Step 16621: {'lr': 0.00037720947041024845, 'samples': 8510464, 'steps': 16621, 'batch_loss/train': 1.0083906565560028}
12/25/2021 03:59:40 - INFO - codeparrot_training - Step 16622: {'lr': 0.0003771959002683809, 'samples': 8510976, 'steps': 16622, 'batch_loss/train': 0.7835095576010644}
12/25/2021 03:59:51 - INFO - codeparrot_training - Step 16623: {'lr': 0.0003771823296208304, 'samples': 8511488, 'steps': 16623, 'batch_loss/train': 0.8393483646214008}
12/25/2021 04:00:03 - INFO - codeparrot_training - Step 16624: {'lr': 0.00037716875846765104, 'samples': 8512000, 'steps': 16624, 'batch_loss/train': 0.7517706081271172}
12/25/2021 04:00:14 - INFO - codeparrot_training - Step 16625: {'lr': 0.0003771551868088967, 'samples': 8512512, 'steps': 16625, 'batch_loss/train': 0.8402067485731095}
12/25/2021 04:00:25 - INFO - codeparrot_training - Step 16626: {'lr': 0.0003771416146446214, 'samples': 8513024, 'steps': 16626, 'batch_loss/train': 0.8476774012669921}
12/25/2021 04:00:37 - INFO - codeparrot_training - Step 16627: {'lr': 0.000377128041974879, 'samples': 8513536, 'steps': 16627, 'batch_loss/train': 0.8483178850729018}
12/25/2021 04:00:47 - INFO - codeparrot_training - Step 16628: {'lr': 0.0003771144687997235, 'samples': 8514048, 'steps': 16628, 'batch_loss/train': 0.8186574587598443}
12/25/2021 04:00:58 - INFO - codeparrot_training - Step 16629: {'lr': 0.00037710089511920885, 'samples': 8514560, 'steps': 16629, 'batch_loss/train': 0.7525997310876846}
12/25/2021 04:01:11 - INFO - codeparrot_training - Step 16630: {'lr': 0.00037708732093338904, 'samples': 8515072, 'steps': 16630, 'batch_loss/train': 0.8002572609111667}
12/25/2021 04:01:21 - INFO - codeparrot_training - Step 16631: {'lr': 0.00037707374624231806, 'samples': 8515584, 'steps': 16631, 'batch_loss/train': 0.7531296997331083}
12/25/2021 04:01:32 - INFO - codeparrot_training - Step 16632: {'lr': 0.00037706017104604985, 'samples': 8516096, 'steps': 16632, 'batch_loss/train': 0.8644985544960946}
12/25/2021 04:01:42 - INFO - codeparrot_training - Step 16633: {'lr': 0.00037704659534463836, 'samples': 8516608, 'steps': 16633, 'batch_loss/train': 0.8625244228169322}
12/25/2021 04:01:54 - INFO - codeparrot_training - Step 16634: {'lr': 0.0003770330191381376, 'samples': 8517120, 'steps': 16634, 'batch_loss/train': 0.7983301468193531}
12/25/2021 04:02:05 - INFO - codeparrot_training - Step 16635: {'lr': 0.00037701944242660146, 'samples': 8517632, 'steps': 16635, 'batch_loss/train': 0.8187095366884023}
12/25/2021 04:02:16 - INFO - codeparrot_training - Step 16636: {'lr': 0.00037700586521008405, 'samples': 8518144, 'steps': 16636, 'batch_loss/train': 0.7138143037445843}
12/25/2021 04:02:28 - INFO - codeparrot_training - Step 16637: {'lr': 0.0003769922874886392, 'samples': 8518656, 'steps': 16637, 'batch_loss/train': 0.7799130799248815}
12/25/2021 04:02:38 - INFO - codeparrot_training - Step 16638: {'lr': 0.000376978709262321, 'samples': 8519168, 'steps': 16638, 'batch_loss/train': 0.7998257642611861}
12/25/2021 04:02:49 - INFO - codeparrot_training - Step 16639: {'lr': 0.0003769651305311834, 'samples': 8519680, 'steps': 16639, 'batch_loss/train': 0.7067429006565362}
12/25/2021 04:03:01 - INFO - codeparrot_training - Step 16640: {'lr': 0.00037695155129528036, 'samples': 8520192, 'steps': 16640, 'batch_loss/train': 0.8946420215070248}
12/25/2021 04:03:12 - INFO - codeparrot_training - Step 16641: {'lr': 0.000376937971554666, 'samples': 8520704, 'steps': 16641, 'batch_loss/train': 1.3274777377955616}
12/25/2021 04:03:22 - INFO - codeparrot_training - Step 16642: {'lr': 0.00037692439130939407, 'samples': 8521216, 'steps': 16642, 'batch_loss/train': 0.6658538654446602}
12/25/2021 04:03:33 - INFO - codeparrot_training - Step 16643: {'lr': 0.0003769108105595187, 'samples': 8521728, 'steps': 16643, 'batch_loss/train': 0.8461403069086373}
12/25/2021 04:03:46 - INFO - codeparrot_training - Step 16644: {'lr': 0.00037689722930509396, 'samples': 8522240, 'steps': 16644, 'batch_loss/train': 0.6072911680676043}
12/25/2021 04:03:57 - INFO - codeparrot_training - Step 16645: {'lr': 0.00037688364754617364, 'samples': 8522752, 'steps': 16645, 'batch_loss/train': 0.7474903869442642}
12/25/2021 04:04:07 - INFO - codeparrot_training - Step 16646: {'lr': 0.0003768700652828119, 'samples': 8523264, 'steps': 16646, 'batch_loss/train': 0.7459417106583714}
12/25/2021 04:04:20 - INFO - codeparrot_training - Step 16647: {'lr': 0.00037685648251506264, 'samples': 8523776, 'steps': 16647, 'batch_loss/train': 0.7201112443581223}
12/25/2021 04:04:30 - INFO - codeparrot_training - Step 16648: {'lr': 0.00037684289924298, 'samples': 8524288, 'steps': 16648, 'batch_loss/train': 0.8634482966735959}
12/25/2021 04:04:41 - INFO - codeparrot_training - Step 16649: {'lr': 0.0003768293154666178, 'samples': 8524800, 'steps': 16649, 'batch_loss/train': 0.8697161339223385}
12/25/2021 04:04:53 - INFO - codeparrot_training - Step 16650: {'lr': 0.0003768157311860302, 'samples': 8525312, 'steps': 16650, 'batch_loss/train': 0.9556110184639692}
12/25/2021 04:05:03 - INFO - codeparrot_training - Step 16651: {'lr': 0.00037680214640127107, 'samples': 8525824, 'steps': 16651, 'batch_loss/train': 0.7613786077126861}
12/25/2021 04:05:14 - INFO - codeparrot_training - Step 16652: {'lr': 0.0003767885611123945, 'samples': 8526336, 'steps': 16652, 'batch_loss/train': 0.7624295821879059}
12/25/2021 04:05:25 - INFO - codeparrot_training - Step 16653: {'lr': 0.0003767749753194545, 'samples': 8526848, 'steps': 16653, 'batch_loss/train': 0.802471979521215}
12/25/2021 04:05:37 - INFO - codeparrot_training - Step 16654: {'lr': 0.000376761389022505, 'samples': 8527360, 'steps': 16654, 'batch_loss/train': 0.6946359314024448}
12/25/2021 04:05:48 - INFO - codeparrot_training - Step 16655: {'lr': 0.0003767478022216001, 'samples': 8527872, 'steps': 16655, 'batch_loss/train': 0.8241288680583239}
12/25/2021 04:05:58 - INFO - codeparrot_training - Step 16656: {'lr': 0.00037673421491679393, 'samples': 8528384, 'steps': 16656, 'batch_loss/train': 0.7735501201823354}
12/25/2021 04:06:10 - INFO - codeparrot_training - Step 16657: {'lr': 0.00037672062710814025, 'samples': 8528896, 'steps': 16657, 'batch_loss/train': 0.734732729382813}
12/25/2021 04:06:21 - INFO - codeparrot_training - Step 16658: {'lr': 0.0003767070387956932, 'samples': 8529408, 'steps': 16658, 'batch_loss/train': 0.7430945299565792}
12/25/2021 04:06:32 - INFO - codeparrot_training - Step 16659: {'lr': 0.00037669344997950683, 'samples': 8529920, 'steps': 16659, 'batch_loss/train': 0.8033862560987473}
12/25/2021 04:06:45 - INFO - codeparrot_training - Step 16660: {'lr': 0.00037667986065963524, 'samples': 8530432, 'steps': 16660, 'batch_loss/train': 0.8128122547641397}
12/25/2021 04:06:55 - INFO - codeparrot_training - Step 16661: {'lr': 0.0003766662708361323, 'samples': 8530944, 'steps': 16661, 'batch_loss/train': 0.8029925227165222}
12/25/2021 04:07:06 - INFO - codeparrot_training - Step 16662: {'lr': 0.000376652680509052, 'samples': 8531456, 'steps': 16662, 'batch_loss/train': 0.7580370707437396}
12/25/2021 04:07:16 - INFO - codeparrot_training - Step 16663: {'lr': 0.0003766390896784485, 'samples': 8531968, 'steps': 16663, 'batch_loss/train': 0.5628888821229339}
12/25/2021 04:07:29 - INFO - codeparrot_training - Step 16664: {'lr': 0.0003766254983443758, 'samples': 8532480, 'steps': 16664, 'batch_loss/train': 0.7270012283697724}
12/25/2021 04:07:39 - INFO - codeparrot_training - Step 16665: {'lr': 0.00037661190650688795, 'samples': 8532992, 'steps': 16665, 'batch_loss/train': 0.6384776596678421}
12/25/2021 04:07:50 - INFO - codeparrot_training - Step 16666: {'lr': 0.00037659831416603896, 'samples': 8533504, 'steps': 16666, 'batch_loss/train': 0.8827766422182322}
12/25/2021 04:08:02 - INFO - codeparrot_training - Step 16667: {'lr': 0.00037658472132188293, 'samples': 8534016, 'steps': 16667, 'batch_loss/train': 0.8269816106185317}
12/25/2021 04:08:12 - INFO - codeparrot_training - Step 16668: {'lr': 0.0003765711279744738, 'samples': 8534528, 'steps': 16668, 'batch_loss/train': 0.73791917366907}
12/25/2021 04:08:23 - INFO - codeparrot_training - Step 16669: {'lr': 0.0003765575341238656, 'samples': 8535040, 'steps': 16669, 'batch_loss/train': 0.8481732900254428}
12/25/2021 04:08:35 - INFO - codeparrot_training - Step 16670: {'lr': 0.00037654393977011255, 'samples': 8535552, 'steps': 16670, 'batch_loss/train': 0.8000491447746754}
12/25/2021 04:08:46 - INFO - codeparrot_training - Step 16671: {'lr': 0.0003765303449132685, 'samples': 8536064, 'steps': 16671, 'batch_loss/train': 0.8287209486588836}
12/25/2021 04:08:56 - INFO - codeparrot_training - Step 16672: {'lr': 0.00037651674955338754, 'samples': 8536576, 'steps': 16672, 'batch_loss/train': 0.7714840499684215}
12/25/2021 04:09:10 - INFO - codeparrot_training - Step 16673: {'lr': 0.00037650315369052383, 'samples': 8537088, 'steps': 16673, 'batch_loss/train': 0.8203267510980368}
12/25/2021 04:09:20 - INFO - codeparrot_training - Step 16674: {'lr': 0.0003764895573247313, 'samples': 8537600, 'steps': 16674, 'batch_loss/train': 0.6649383117910475}
12/25/2021 04:09:31 - INFO - codeparrot_training - Step 16675: {'lr': 0.00037647596045606416, 'samples': 8538112, 'steps': 16675, 'batch_loss/train': 0.8307843180373311}
12/25/2021 04:09:42 - INFO - codeparrot_training - Step 16676: {'lr': 0.00037646236308457623, 'samples': 8538624, 'steps': 16676, 'batch_loss/train': 0.8384478408843279}
12/25/2021 04:09:56 - INFO - codeparrot_training - Step 16677: {'lr': 0.0003764487652103217, 'samples': 8539136, 'steps': 16677, 'batch_loss/train': 0.5985100935213268}
12/25/2021 04:10:06 - INFO - codeparrot_training - Step 16678: {'lr': 0.00037643516683335464, 'samples': 8539648, 'steps': 16678, 'batch_loss/train': 0.7695278450846672}
12/25/2021 04:10:17 - INFO - codeparrot_training - Step 16679: {'lr': 0.0003764215679537291, 'samples': 8540160, 'steps': 16679, 'batch_loss/train': 0.8905997835099697}
12/25/2021 04:10:29 - INFO - codeparrot_training - Step 16680: {'lr': 0.00037640796857149914, 'samples': 8540672, 'steps': 16680, 'batch_loss/train': 0.8428367357701063}
12/25/2021 04:10:40 - INFO - codeparrot_training - Step 16681: {'lr': 0.00037639436868671885,
'samples': 8541184, 'steps': 16681, 'batch_loss/train': 0.8353266492486} 12/25/2021 04:10:50 - INFO - codeparrot_training - Step 16682: {'lr': 0.0003763807682994423, 'samples': 8541696, 'steps': 16682, 'batch_loss/train': 0.9855530862696469} 12/25/2021 04:11:02 - INFO - codeparrot_training - Step 16683: {'lr': 0.00037636716740972346, 'samples': 8542208, 'steps': 16683, 'batch_loss/train': 0.7413860354572535} 12/25/2021 04:11:13 - INFO - codeparrot_training - Step 16684: {'lr': 0.0003763535660176165, 'samples': 8542720, 'steps': 16684, 'batch_loss/train': 0.8318873876705766} 12/25/2021 04:11:23 - INFO - codeparrot_training - Step 16685: {'lr': 0.0003763399641231755, 'samples': 8543232, 'steps': 16685, 'batch_loss/train': 0.7351667629554868} 12/25/2021 04:11:34 - INFO - codeparrot_training - Step 16686: {'lr': 0.0003763263617264545, 'samples': 8543744, 'steps': 16686, 'batch_loss/train': 0.7680886490270495} 12/25/2021 04:11:48 - INFO - codeparrot_training - Step 16687: {'lr': 0.00037631275882750757, 'samples': 8544256, 'steps': 16687, 'batch_loss/train': 0.8841645289212465} 12/25/2021 04:11:59 - INFO - codeparrot_training - Step 16688: {'lr': 0.0003762991554263888, 'samples': 8544768, 'steps': 16688, 'batch_loss/train': 0.7944960384629667} 12/25/2021 04:12:09 - INFO - codeparrot_training - Step 16689: {'lr': 0.00037628555152315236, 'samples': 8545280, 'steps': 16689, 'batch_loss/train': 0.7439304627478123} 12/25/2021 04:12:21 - INFO - codeparrot_training - Step 16690: {'lr': 0.00037627194711785215, 'samples': 8545792, 'steps': 16690, 'batch_loss/train': 0.7636669613420963} 12/25/2021 04:12:32 - INFO - codeparrot_training - Step 16691: {'lr': 0.0003762583422105423, 'samples': 8546304, 'steps': 16691, 'batch_loss/train': 0.810523965395987} 12/25/2021 04:12:43 - INFO - codeparrot_training - Step 16692: {'lr': 0.00037624473680127713, 'samples': 8546816, 'steps': 16692, 'batch_loss/train': 0.7420337973162532} 12/25/2021 04:12:56 - INFO - codeparrot_training - Step 16693: 
{'lr': 0.00037623113089011053, 'samples': 8547328, 'steps': 16693, 'batch_loss/train': 0.8775714319199324} 12/25/2021 04:13:07 - INFO - codeparrot_training - Step 16694: {'lr': 0.00037621752447709644, 'samples': 8547840, 'steps': 16694, 'batch_loss/train': 0.7410791693255305} 12/25/2021 04:13:17 - INFO - codeparrot_training - Step 16695: {'lr': 0.0003762039175622893, 'samples': 8548352, 'steps': 16695, 'batch_loss/train': 0.8371320944279432} 12/25/2021 04:13:28 - INFO - codeparrot_training - Step 16696: {'lr': 0.00037619031014574295, 'samples': 8548864, 'steps': 16696, 'batch_loss/train': 0.8016345559153706} 12/25/2021 04:13:40 - INFO - codeparrot_training - Step 16697: {'lr': 0.0003761767022275116, 'samples': 8549376, 'steps': 16697, 'batch_loss/train': 0.721653460059315} 12/25/2021 04:13:51 - INFO - codeparrot_training - Step 16698: {'lr': 0.00037616309380764936, 'samples': 8549888, 'steps': 16698, 'batch_loss/train': 0.8414503233507276} 12/25/2021 04:14:01 - INFO - codeparrot_training - Step 16699: {'lr': 0.0003761494848862103, 'samples': 8550400, 'steps': 16699, 'batch_loss/train': 0.7424035912845284} 12/25/2021 04:14:13 - INFO - codeparrot_training - Step 16700: {'lr': 0.0003761358754632484, 'samples': 8550912, 'steps': 16700, 'batch_loss/train': 1.4238585326820612} 12/25/2021 04:14:24 - INFO - codeparrot_training - Step 16701: {'lr': 0.000376122265538818, 'samples': 8551424, 'steps': 16701, 'batch_loss/train': 0.7585296733304858} 12/25/2021 04:14:35 - INFO - codeparrot_training - Step 16702: {'lr': 0.0003761086551129731, 'samples': 8551936, 'steps': 16702, 'batch_loss/train': 0.8075391612946987} 12/25/2021 04:14:48 - INFO - codeparrot_training - Step 16703: {'lr': 0.00037609504418576774, 'samples': 8552448, 'steps': 16703, 'batch_loss/train': 0.7992858774960041} 12/25/2021 04:14:59 - INFO - codeparrot_training - Step 16704: {'lr': 0.00037608143275725614, 'samples': 8552960, 'steps': 16704, 'batch_loss/train': 0.7491212114691734} 12/25/2021 04:15:10 - INFO - 
codeparrot_training - Step 16705: {'lr': 0.00037606782082749235, 'samples': 8553472, 'steps': 16705, 'batch_loss/train': 0.8708731951192021} 12/25/2021 04:15:20 - INFO - codeparrot_training - Step 16706: {'lr': 0.0003760542083965305, 'samples': 8553984, 'steps': 16706, 'batch_loss/train': 0.7283955793827772} 12/25/2021 04:15:33 - INFO - codeparrot_training - Step 16707: {'lr': 0.0003760405954644247, 'samples': 8554496, 'steps': 16707, 'batch_loss/train': 0.8372174296528101} 12/25/2021 04:15:44 - INFO - codeparrot_training - Step 16708: {'lr': 0.00037602698203122913, 'samples': 8555008, 'steps': 16708, 'batch_loss/train': 0.8319190966431051} 12/25/2021 04:15:55 - INFO - codeparrot_training - Step 16709: {'lr': 0.00037601336809699784, 'samples': 8555520, 'steps': 16709, 'batch_loss/train': 1.0880826972424984} 12/25/2021 04:16:08 - INFO - codeparrot_training - Step 16710: {'lr': 0.00037599975366178496, 'samples': 8556032, 'steps': 16710, 'batch_loss/train': 0.7835386430379003} 12/25/2021 04:16:19 - INFO - codeparrot_training - Step 16711: {'lr': 0.0003759861387256447, 'samples': 8556544, 'steps': 16711, 'batch_loss/train': 1.0227637367788702} 12/25/2021 04:16:30 - INFO - codeparrot_training - Step 16712: {'lr': 0.0003759725232886311, 'samples': 8557056, 'steps': 16712, 'batch_loss/train': 0.7972689643502235} 12/25/2021 04:16:40 - INFO - codeparrot_training - Step 16713: {'lr': 0.00037595890735079825, 'samples': 8557568, 'steps': 16713, 'batch_loss/train': 0.8617054848000407} 12/25/2021 04:16:53 - INFO - codeparrot_training - Step 16714: {'lr': 0.00037594529091220034, 'samples': 8558080, 'steps': 16714, 'batch_loss/train': 0.80406145285815} 12/25/2021 04:17:03 - INFO - codeparrot_training - Step 16715: {'lr': 0.00037593167397289163, 'samples': 8558592, 'steps': 16715, 'batch_loss/train': 0.6784041188657284} 12/25/2021 04:17:14 - INFO - codeparrot_training - Step 16716: {'lr': 0.00037591805653292606, 'samples': 8559104, 'steps': 16716, 'batch_loss/train': 
0.691033324226737} 12/25/2021 04:17:26 - INFO - codeparrot_training - Step 16717: {'lr': 0.0003759044385923578, 'samples': 8559616, 'steps': 16717, 'batch_loss/train': 0.7961978958919644} 12/25/2021 04:17:37 - INFO - codeparrot_training - Step 16718: {'lr': 0.0003758908201512411, 'samples': 8560128, 'steps': 16718, 'batch_loss/train': 0.8703258624300361} 12/25/2021 04:17:47 - INFO - codeparrot_training - Step 16719: {'lr': 0.00037587720120963006, 'samples': 8560640, 'steps': 16719, 'batch_loss/train': 0.7380960467271507} 12/25/2021 04:17:59 - INFO - codeparrot_training - Step 16720: {'lr': 0.00037586358176757865, 'samples': 8561152, 'steps': 16720, 'batch_loss/train': 0.920890144072473} 12/25/2021 04:18:10 - INFO - codeparrot_training - Step 16721: {'lr': 0.00037584996182514133, 'samples': 8561664, 'steps': 16721, 'batch_loss/train': 1.04406201466918} 12/25/2021 04:18:20 - INFO - codeparrot_training - Step 16722: {'lr': 0.00037583634138237197, 'samples': 8562176, 'steps': 16722, 'batch_loss/train': 0.7928624846972525} 12/25/2021 04:18:31 - INFO - codeparrot_training - Step 16723: {'lr': 0.00037582272043932484, 'samples': 8562688, 'steps': 16723, 'batch_loss/train': 0.7397365440847352} 12/25/2021 04:18:45 - INFO - codeparrot_training - Step 16724: {'lr': 0.00037580909899605414, 'samples': 8563200, 'steps': 16724, 'batch_loss/train': 0.6649538045749068} 12/25/2021 04:18:56 - INFO - codeparrot_training - Step 16725: {'lr': 0.00037579547705261394, 'samples': 8563712, 'steps': 16725, 'batch_loss/train': 0.8215033332817256} 12/25/2021 04:19:07 - INFO - codeparrot_training - Step 16726: {'lr': 0.00037578185460905843, 'samples': 8564224, 'steps': 16726, 'batch_loss/train': 0.8163546514697373} 12/25/2021 04:19:19 - INFO - codeparrot_training - Step 16727: {'lr': 0.0003757682316654417, 'samples': 8564736, 'steps': 16727, 'batch_loss/train': 0.8572987858206034} 12/25/2021 04:19:29 - INFO - codeparrot_training - Step 16728: {'lr': 0.0003757546082218181, 'samples': 8565248, 
'steps': 16728, 'batch_loss/train': 0.8037804765626788} 12/25/2021 04:19:40 - INFO - codeparrot_training - Step 16729: {'lr': 0.00037574098427824154, 'samples': 8565760, 'steps': 16729, 'batch_loss/train': 0.8352682497352362} 12/25/2021 04:19:52 - INFO - codeparrot_training - Step 16730: {'lr': 0.0003757273598347664, 'samples': 8566272, 'steps': 16730, 'batch_loss/train': 0.8697860867250711} 12/25/2021 04:20:02 - INFO - codeparrot_training - Step 16731: {'lr': 0.00037571373489144665, 'samples': 8566784, 'steps': 16731, 'batch_loss/train': 0.8067708157468587} 12/25/2021 04:20:13 - INFO - codeparrot_training - Step 16732: {'lr': 0.00037570010944833665, 'samples': 8567296, 'steps': 16732, 'batch_loss/train': 0.8181959111243486} 12/25/2021 04:20:24 - INFO - codeparrot_training - Step 16733: {'lr': 0.0003756864835054904, 'samples': 8567808, 'steps': 16733, 'batch_loss/train': 0.7223282610066235} 12/25/2021 04:20:38 - INFO - codeparrot_training - Step 16734: {'lr': 0.0003756728570629622, 'samples': 8568320, 'steps': 16734, 'batch_loss/train': 0.854135949164629} 12/25/2021 04:20:48 - INFO - codeparrot_training - Step 16735: {'lr': 0.0003756592301208062, 'samples': 8568832, 'steps': 16735, 'batch_loss/train': 0.867919348180294} 12/25/2021 04:20:59 - INFO - codeparrot_training - Step 16736: {'lr': 0.00037564560267907643, 'samples': 8569344, 'steps': 16736, 'batch_loss/train': 0.8571986358147115} 12/25/2021 04:21:11 - INFO - codeparrot_training - Step 16737: {'lr': 0.0003756319747378272, 'samples': 8569856, 'steps': 16737, 'batch_loss/train': 1.0261741261929274} 12/25/2021 04:21:22 - INFO - codeparrot_training - Step 16738: {'lr': 0.00037561834629711277, 'samples': 8570368, 'steps': 16738, 'batch_loss/train': 0.8690635357052088} 12/25/2021 04:21:32 - INFO - codeparrot_training - Step 16739: {'lr': 0.0003756047173569872, 'samples': 8570880, 'steps': 16739, 'batch_loss/train': 1.0075305809732527} 12/25/2021 04:21:46 - INFO - codeparrot_training - Step 16740: {'lr': 
0.00037559108791750465, 'samples': 8571392, 'steps': 16740, 'batch_loss/train': 0.8820587508380413} 12/25/2021 04:21:57 - INFO - codeparrot_training - Step 16741: {'lr': 0.00037557745797871944, 'samples': 8571904, 'steps': 16741, 'batch_loss/train': 0.922706002369523} 12/25/2021 04:22:07 - INFO - codeparrot_training - Step 16742: {'lr': 0.00037556382754068564, 'samples': 8572416, 'steps': 16742, 'batch_loss/train': 0.7966108587570488} 12/25/2021 04:22:18 - INFO - codeparrot_training - Step 16743: {'lr': 0.00037555019660345747, 'samples': 8572928, 'steps': 16743, 'batch_loss/train': 0.7386943942401558} 12/25/2021 04:22:30 - INFO - codeparrot_training - Step 16744: {'lr': 0.00037553656516708906, 'samples': 8573440, 'steps': 16744, 'batch_loss/train': 0.8106554327532649} 12/25/2021 04:22:40 - INFO - codeparrot_training - Step 16745: {'lr': 0.0003755229332316347, 'samples': 8573952, 'steps': 16745, 'batch_loss/train': 1.6150413770228624} 12/25/2021 04:22:51 - INFO - codeparrot_training - Step 16746: {'lr': 0.0003755093007971486, 'samples': 8574464, 'steps': 16746, 'batch_loss/train': 0.750589590985328} 12/25/2021 04:23:03 - INFO - codeparrot_training - Step 16747: {'lr': 0.0003754956678636849, 'samples': 8574976, 'steps': 16747, 'batch_loss/train': 0.9121319097466767} 12/25/2021 04:23:14 - INFO - codeparrot_training - Step 16748: {'lr': 0.0003754820344312978, 'samples': 8575488, 'steps': 16748, 'batch_loss/train': 1.5113547146320343} 12/25/2021 04:23:24 - INFO - codeparrot_training - Step 16749: {'lr': 0.0003754684005000415, 'samples': 8576000, 'steps': 16749, 'batch_loss/train': 0.8710870193317533} 12/25/2021 04:23:38 - INFO - codeparrot_training - Step 16750: {'lr': 0.0003754547660699702, 'samples': 8576512, 'steps': 16750, 'batch_loss/train': 0.8505454044789076} 12/25/2021 04:23:49 - INFO - codeparrot_training - Step 16751: {'lr': 0.0003754411311411381, 'samples': 8577024, 'steps': 16751, 'batch_loss/train': 0.80067571811378} 12/25/2021 04:23:59 - INFO - 
codeparrot_training - Step 16752: {'lr': 0.00037542749571359946, 'samples': 8577536, 'steps': 16752, 'batch_loss/train': 0.7034958554431796} 12/25/2021 04:24:10 - INFO - codeparrot_training - Step 16753: {'lr': 0.0003754138597874085, 'samples': 8578048, 'steps': 16753, 'batch_loss/train': 0.7743125660344958} 12/25/2021 04:24:22 - INFO - codeparrot_training - Step 16754: {'lr': 0.0003754002233626194, 'samples': 8578560, 'steps': 16754, 'batch_loss/train': 0.7833639597520232} 12/25/2021 04:24:33 - INFO - codeparrot_training - Step 16755: {'lr': 0.00037538658643928625, 'samples': 8579072, 'steps': 16755, 'batch_loss/train': 0.7990532154217362} 12/25/2021 04:24:44 - INFO - codeparrot_training - Step 16756: {'lr': 0.0003753729490174634, 'samples': 8579584, 'steps': 16756, 'batch_loss/train': 0.7921490548178554} 12/25/2021 04:24:55 - INFO - codeparrot_training - Step 16757: {'lr': 0.0003753593110972051, 'samples': 8580096, 'steps': 16757, 'batch_loss/train': 0.9167395504191518} 12/25/2021 04:25:06 - INFO - codeparrot_training - Step 16758: {'lr': 0.00037534567267856554, 'samples': 8580608, 'steps': 16758, 'batch_loss/train': 0.749888158403337} 12/25/2021 04:25:17 - INFO - codeparrot_training - Step 16759: {'lr': 0.00037533203376159885, 'samples': 8581120, 'steps': 16759, 'batch_loss/train': 0.8290657559409738} 12/25/2021 04:25:31 - INFO - codeparrot_training - Step 16760: {'lr': 0.0003753183943463594, 'samples': 8581632, 'steps': 16760, 'batch_loss/train': 0.8132038963958621} 12/25/2021 04:25:41 - INFO - codeparrot_training - Step 16761: {'lr': 0.00037530475443290127, 'samples': 8582144, 'steps': 16761, 'batch_loss/train': 0.8011262258514762} 12/25/2021 04:25:52 - INFO - codeparrot_training - Step 16762: {'lr': 0.0003752911140212787, 'samples': 8582656, 'steps': 16762, 'batch_loss/train': 0.9268913585692644} 12/25/2021 04:26:02 - INFO - codeparrot_training - Step 16763: {'lr': 0.000375277473111546, 'samples': 8583168, 'steps': 16763, 'batch_loss/train': 
0.8020074982196093} 12/25/2021 04:26:15 - INFO - codeparrot_training - Step 16764: {'lr': 0.00037526383170375743, 'samples': 8583680, 'steps': 16764, 'batch_loss/train': 0.7643739888444543} 12/25/2021 04:26:25 - INFO - codeparrot_training - Step 16765: {'lr': 0.0003752501897979671, 'samples': 8584192, 'steps': 16765, 'batch_loss/train': 0.88400295143947} 12/25/2021 04:26:36 - INFO - codeparrot_training - Step 16766: {'lr': 0.0003752365473942293, 'samples': 8584704, 'steps': 16766, 'batch_loss/train': 0.8554494008421898} 12/25/2021 04:26:48 - INFO - codeparrot_training - Step 16767: {'lr': 0.0003752229044925983, 'samples': 8585216, 'steps': 16767, 'batch_loss/train': 0.8141784938052297} 12/25/2021 04:26:58 - INFO - codeparrot_training - Step 16768: {'lr': 0.00037520926109312834, 'samples': 8585728, 'steps': 16768, 'batch_loss/train': 0.7900130702182651} 12/25/2021 04:27:09 - INFO - codeparrot_training - Step 16769: {'lr': 0.00037519561719587366, 'samples': 8586240, 'steps': 16769, 'batch_loss/train': 0.7626943355426192} 12/25/2021 04:27:23 - INFO - codeparrot_training - Step 16770: {'lr': 0.0003751819728008884, 'samples': 8586752, 'steps': 16770, 'batch_loss/train': 0.7388465732801706} 12/25/2021 04:27:34 - INFO - codeparrot_training - Step 16771: {'lr': 0.00037516832790822694, 'samples': 8587264, 'steps': 16771, 'batch_loss/train': 0.7580350628122687} 12/25/2021 04:27:44 - INFO - codeparrot_training - Step 16772: {'lr': 0.00037515468251794336, 'samples': 8587776, 'steps': 16772, 'batch_loss/train': 0.7898198114708066} 12/25/2021 04:27:55 - INFO - codeparrot_training - Step 16773: {'lr': 0.0003751410366300922, 'samples': 8588288, 'steps': 16773, 'batch_loss/train': 0.790811381302774} 12/25/2021 04:28:07 - INFO - codeparrot_training - Step 16774: {'lr': 0.00037512739024472743, 'samples': 8588800, 'steps': 16774, 'batch_loss/train': 0.720656436868012} 12/25/2021 04:28:17 - INFO - codeparrot_training - Step 16775: {'lr': 0.0003751137433619034, 'samples': 8589312, 
'steps': 16775, 'batch_loss/train': 0.8315798453986645} 12/25/2021 04:28:28 - INFO - codeparrot_training - Step 16776: {'lr': 0.0003751000959816744, 'samples': 8589824, 'steps': 16776, 'batch_loss/train': 0.7864353025797755} 12/25/2021 04:28:40 - INFO - codeparrot_training - Step 16777: {'lr': 0.0003750864481040946, 'samples': 8590336, 'steps': 16777, 'batch_loss/train': 0.7324558054096997} 12/25/2021 04:28:51 - INFO - codeparrot_training - Step 16778: {'lr': 0.00037507279972921834, 'samples': 8590848, 'steps': 16778, 'batch_loss/train': 0.7810623636469245} 12/25/2021 04:29:02 - INFO - codeparrot_training - Step 16779: {'lr': 0.00037505915085709983, 'samples': 8591360, 'steps': 16779, 'batch_loss/train': 0.8001713114790618} 12/25/2021 04:29:15 - INFO - codeparrot_training - Step 16780: {'lr': 0.00037504550148779336, 'samples': 8591872, 'steps': 16780, 'batch_loss/train': 0.7363702841103077} 12/25/2021 04:29:26 - INFO - codeparrot_training - Step 16781: {'lr': 0.00037503185162135325, 'samples': 8592384, 'steps': 16781, 'batch_loss/train': 0.7494247816503048} 12/25/2021 04:29:36 - INFO - codeparrot_training - Step 16782: {'lr': 0.0003750182012578337, 'samples': 8592896, 'steps': 16782, 'batch_loss/train': 0.8237836356274784} 12/25/2021 04:29:47 - INFO - codeparrot_training - Step 16783: {'lr': 0.0003750045503972889, 'samples': 8593408, 'steps': 16783, 'batch_loss/train': 0.8554332396015525} 12/25/2021 04:29:59 - INFO - codeparrot_training - Step 16784: {'lr': 0.0003749908990397733, 'samples': 8593920, 'steps': 16784, 'batch_loss/train': 0.7756406227126718} 12/25/2021 04:30:10 - INFO - codeparrot_training - Step 16785: {'lr': 0.00037497724718534096, 'samples': 8594432, 'steps': 16785, 'batch_loss/train': 0.7669764935271814} 12/25/2021 04:30:20 - INFO - codeparrot_training - Step 16786: {'lr': 0.00037496359483404646, 'samples': 8594944, 'steps': 16786, 'batch_loss/train': 0.8374728402122855} 12/25/2021 04:30:33 - INFO - codeparrot_training - Step 16787: {'lr': 
0.0003749499419859438, 'samples': 8595456, 'steps': 16787, 'batch_loss/train': 0.9900995716452599} 12/25/2021 04:30:43 - INFO - codeparrot_training - Step 16788: {'lr': 0.00037493628864108724, 'samples': 8595968, 'steps': 16788, 'batch_loss/train': 0.80385383637622} 12/25/2021 04:30:54 - INFO - codeparrot_training - Step 16789: {'lr': 0.0003749226347995314, 'samples': 8596480, 'steps': 16789, 'batch_loss/train': 0.7412880454212427} 12/25/2021 04:31:07 - INFO - codeparrot_training - Step 16790: {'lr': 0.00037490898046133015, 'samples': 8596992, 'steps': 16790, 'batch_loss/train': 0.7800582079216838} 12/25/2021 04:31:18 - INFO - codeparrot_training - Step 16791: {'lr': 0.000374895325626538, 'samples': 8597504, 'steps': 16791, 'batch_loss/train': 0.8001766074448824} 12/25/2021 04:31:29 - INFO - codeparrot_training - Step 16792: {'lr': 0.0003748816702952092, 'samples': 8598016, 'steps': 16792, 'batch_loss/train': 0.7999563883058727} 12/25/2021 04:31:39 - INFO - codeparrot_training - Step 16793: {'lr': 0.000374868014467398, 'samples': 8598528, 'steps': 16793, 'batch_loss/train': 0.7717184991342947} 12/25/2021 04:31:51 - INFO - codeparrot_training - Step 16794: {'lr': 0.0003748543581431588, 'samples': 8599040, 'steps': 16794, 'batch_loss/train': 0.8096832735463977} 12/25/2021 04:32:02 - INFO - codeparrot_training - Step 16795: {'lr': 0.00037484070132254576, 'samples': 8599552, 'steps': 16795, 'batch_loss/train': 0.9273629480740055} 12/25/2021 04:32:12 - INFO - codeparrot_training - Step 16796: {'lr': 0.0003748270440056133, 'samples': 8600064, 'steps': 16796, 'batch_loss/train': 0.7405969118699431} 12/25/2021 04:32:25 - INFO - codeparrot_training - Step 16797: {'lr': 0.00037481338619241555, 'samples': 8600576, 'steps': 16797, 'batch_loss/train': 0.8248808346688747} 12/25/2021 04:32:35 - INFO - codeparrot_training - Step 16798: {'lr': 0.000374799727883007, 'samples': 8601088, 'steps': 16798, 'batch_loss/train': 0.7495758039876819} 12/25/2021 04:32:46 - INFO - 
codeparrot_training - Step 16799: {'lr': 0.0003747860690774418, 'samples': 8601600, 'steps': 16799, 'batch_loss/train': 0.7080374551005661} 12/25/2021 04:32:58 - INFO - codeparrot_training - Step 16800: {'lr': 0.0003747724097757743, 'samples': 8602112, 'steps': 16800, 'batch_loss/train': 0.7591205672360957} 12/25/2021 04:33:09 - INFO - codeparrot_training - Step 16801: {'lr': 0.00037475874997805876, 'samples': 8602624, 'steps': 16801, 'batch_loss/train': 0.6869521522894502} 12/25/2021 04:33:19 - INFO - codeparrot_training - Step 16802: {'lr': 0.00037474508968434963, 'samples': 8603136, 'steps': 16802, 'batch_loss/train': 0.8975872443988919} 12/25/2021 04:33:30 - INFO - codeparrot_training - Step 16803: {'lr': 0.00037473142889470106, 'samples': 8603648, 'steps': 16803, 'batch_loss/train': 0.7286734813824296} 12/25/2021 04:33:44 - INFO - codeparrot_training - Step 16804: {'lr': 0.00037471776760916747, 'samples': 8604160, 'steps': 16804, 'batch_loss/train': 0.844454251229763} 12/25/2021 04:33:54 - INFO - codeparrot_training - Step 16805: {'lr': 0.00037470410582780313, 'samples': 8604672, 'steps': 16805, 'batch_loss/train': 0.8757756883278489} 12/25/2021 04:34:05 - INFO - codeparrot_training - Step 16806: {'lr': 0.0003746904435506624, 'samples': 8605184, 'steps': 16806, 'batch_loss/train': 0.7178453872911632} 12/25/2021 04:34:17 - INFO - codeparrot_training - Step 16807: {'lr': 0.00037467678077779935, 'samples': 8605696, 'steps': 16807, 'batch_loss/train': 0.8316476369509473} 12/25/2021 04:34:28 - INFO - codeparrot_training - Step 16808: {'lr': 0.0003746631175092687, 'samples': 8606208, 'steps': 16808, 'batch_loss/train': 0.8450525214429945} 12/25/2021 04:34:38 - INFO - codeparrot_training - Step 16809: {'lr': 0.0003746494537451245, 'samples': 8606720, 'steps': 16809, 'batch_loss/train': 1.079945339821279} 12/25/2021 04:34:53 - INFO - codeparrot_training - Step 16810: {'lr': 0.0003746357894854211, 'samples': 8607232, 'steps': 16810, 'batch_loss/train': 
0.7586823347955942} 12/25/2021 04:35:03 - INFO - codeparrot_training - Step 16811: {'lr': 0.0003746221247302129, 'samples': 8607744, 'steps': 16811, 'batch_loss/train': 0.8975166026502848} 12/25/2021 04:35:14 - INFO - codeparrot_training - Step 16812: {'lr': 0.00037460845947955425, 'samples': 8608256, 'steps': 16812, 'batch_loss/train': 0.8366717818425968} 12/25/2021 04:35:24 - INFO - codeparrot_training - Step 16813: {'lr': 0.00037459479373349924, 'samples': 8608768, 'steps': 16813, 'batch_loss/train': 0.9382120068185031} 12/25/2021 04:35:37 - INFO - codeparrot_training - Step 16814: {'lr': 0.0003745811274921025, 'samples': 8609280, 'steps': 16814, 'batch_loss/train': 0.8180460128933191} 12/25/2021 04:35:47 - INFO - codeparrot_training - Step 16815: {'lr': 0.00037456746075541827, 'samples': 8609792, 'steps': 16815, 'batch_loss/train': 0.7871474418789148} 12/25/2021 04:35:58 - INFO - codeparrot_training - Step 16816: {'lr': 0.00037455379352350083, 'samples': 8610304, 'steps': 16816, 'batch_loss/train': 1.0414102021604776} 12/25/2021 04:36:10 - INFO - codeparrot_training - Step 16817: {'lr': 0.00037454012579640446, 'samples': 8610816, 'steps': 16817, 'batch_loss/train': 0.8451581979170442} 12/25/2021 04:36:20 - INFO - codeparrot_training - Step 16818: {'lr': 0.00037452645757418365, 'samples': 8611328, 'steps': 16818, 'batch_loss/train': 0.815602975897491} 12/25/2021 04:36:31 - INFO - codeparrot_training - Step 16819: {'lr': 0.00037451278885689265, 'samples': 8611840, 'steps': 16819, 'batch_loss/train': 0.7941820891574025} 12/25/2021 04:36:45 - INFO - codeparrot_training - Step 16820: {'lr': 0.0003744991196445857, 'samples': 8612352, 'steps': 16820, 'batch_loss/train': 0.8423488661646843} 12/25/2021 04:36:55 - INFO - codeparrot_training - Step 16821: {'lr': 0.00037448544993731747, 'samples': 8612864, 'steps': 16821, 'batch_loss/train': 0.8763151094317436} 12/25/2021 04:37:06 - INFO - codeparrot_training - Step 16822: {'lr': 0.000374471779735142, 'samples': 8613376, 
'steps': 16822, 'batch_loss/train': 1.2840434511890635} 12/25/2021 04:37:17 - INFO - codeparrot_training - Step 16823: {'lr': 0.00037445810903811365, 'samples': 8613888, 'steps': 16823, 'batch_loss/train': 0.9142883643507957} 12/25/2021 04:37:29 - INFO - codeparrot_training - Step 16824: {'lr': 0.0003744444378462869, 'samples': 8614400, 'steps': 16824, 'batch_loss/train': 0.9053923459723592} 12/25/2021 04:37:39 - INFO - codeparrot_training - Step 16825: {'lr': 0.0003744307661597161, 'samples': 8614912, 'steps': 16825, 'batch_loss/train': 0.7743001570925117} 12/25/2021 04:37:50 - INFO - codeparrot_training - Step 16826: {'lr': 0.00037441709397845554, 'samples': 8615424, 'steps': 16826, 'batch_loss/train': 0.7293683758471161} 12/25/2021 04:38:04 - INFO - codeparrot_training - Step 16827: {'lr': 0.00037440342130255956, 'samples': 8615936, 'steps': 16827, 'batch_loss/train': 0.8242256743833423} 12/25/2021 04:38:14 - INFO - codeparrot_training - Step 16828: {'lr': 0.0003743897481320826, 'samples': 8616448, 'steps': 16828, 'batch_loss/train': 0.8289335928857327} 12/25/2021 04:38:25 - INFO - codeparrot_training - Step 16829: {'lr': 0.00037437607446707893, 'samples': 8616960, 'steps': 16829, 'batch_loss/train': 0.9635726604610682} 12/25/2021 04:38:37 - INFO - codeparrot_training - Step 16830: {'lr': 0.000374362400307603, 'samples': 8617472, 'steps': 16830, 'batch_loss/train': 0.8408830380067229} 12/25/2021 04:38:48 - INFO - codeparrot_training - Step 16831: {'lr': 0.00037434872565370904, 'samples': 8617984, 'steps': 16831, 'batch_loss/train': 0.851435306482017} 12/25/2021 04:38:58 - INFO - codeparrot_training - Step 16832: {'lr': 0.0003743350505054516, 'samples': 8618496, 'steps': 16832, 'batch_loss/train': 0.8041081479750574} 12/25/2021 04:39:09 - INFO - codeparrot_training - Step 16833: {'lr': 0.00037432137486288487, 'samples': 8619008, 'steps': 16833, 'batch_loss/train': 0.7768758498132229} 12/25/2021 04:39:21 - INFO - codeparrot_training - Step 16834: {'lr': 
0.0003743076987260633, 'samples': 8619520, 'steps': 16834, 'batch_loss/train': 0.7301510344259441} 12/25/2021 04:39:31 - INFO - codeparrot_training - Step 16835: {'lr': 0.00037429402209504124, 'samples': 8620032, 'steps': 16835, 'batch_loss/train': 0.8032632139511406} 12/25/2021 04:39:42 - INFO - codeparrot_training - Step 16836: {'lr': 0.0003742803449698732, 'samples': 8620544, 'steps': 16836, 'batch_loss/train': 0.8658873476088047} 12/25/2021 04:39:54 - INFO - codeparrot_training - Step 16837: {'lr': 0.0003742666673506133, 'samples': 8621056, 'steps': 16837, 'batch_loss/train': 0.8268053364008665} 12/25/2021 04:40:05 - INFO - codeparrot_training - Step 16838: {'lr': 0.0003742529892373161, 'samples': 8621568, 'steps': 16838, 'batch_loss/train': 0.8435197463259101} 12/25/2021 04:40:15 - INFO - codeparrot_training - Step 16839: {'lr': 0.0003742393106300359, 'samples': 8622080, 'steps': 16839, 'batch_loss/train': 0.8032750058919191} 12/25/2021 04:40:28 - INFO - codeparrot_training - Step 16840: {'lr': 0.0003742256315288272, 'samples': 8622592, 'steps': 16840, 'batch_loss/train': 0.8487167488783598} 12/25/2021 04:40:39 - INFO - codeparrot_training - Step 16841: {'lr': 0.00037421195193374423, 'samples': 8623104, 'steps': 16841, 'batch_loss/train': 0.8966254452243447} 12/25/2021 04:40:49 - INFO - codeparrot_training - Step 16842: {'lr': 0.00037419827184484146, 'samples': 8623616, 'steps': 16842, 'batch_loss/train': 0.757526075351052} 12/25/2021 04:41:00 - INFO - codeparrot_training - Step 16843: {'lr': 0.0003741845912621732, 'samples': 8624128, 'steps': 16843, 'batch_loss/train': 0.8758406322449446} 12/25/2021 04:41:12 - INFO - codeparrot_training - Step 16844: {'lr': 0.000374170910185794, 'samples': 8624640, 'steps': 16844, 'batch_loss/train': 0.9332855828106403} 12/25/2021 04:41:23 - INFO - codeparrot_training - Step 16845: {'lr': 0.00037415722861575807, 'samples': 8625152, 'steps': 16845, 'batch_loss/train': 0.8676942409947515} 12/25/2021 04:41:33 - INFO - 
codeparrot_training - Step 16846: {'lr': 0.00037414354655211987, 'samples': 8625664, 'steps': 16846, 'batch_loss/train': 0.8006712384521961} 12/25/2021 04:41:45 - INFO - codeparrot_training - Step 16847: {'lr': 0.0003741298639949339, 'samples': 8626176, 'steps': 16847, 'batch_loss/train': 0.8401034455746412} 12/25/2021 04:41:56 - INFO - codeparrot_training - Step 16848: {'lr': 0.00037411618094425434, 'samples': 8626688, 'steps': 16848, 'batch_loss/train': 0.765834906604141} 12/25/2021 04:42:07 - INFO - codeparrot_training - Step 16849: {'lr': 0.00037410249740013575, 'samples': 8627200, 'steps': 16849, 'batch_loss/train': 0.7458612332120538} 12/25/2021 04:42:19 - INFO - codeparrot_training - Step 16850: {'lr': 0.00037408881336263247, 'samples': 8627712, 'steps': 16850, 'batch_loss/train': 0.7210748337674886} 12/25/2021 04:42:30 - INFO - codeparrot_training - Step 16851: {'lr': 0.00037407512883179893, 'samples': 8628224, 'steps': 16851, 'batch_loss/train': 0.8079910955857486} 12/25/2021 04:42:41 - INFO - codeparrot_training - Step 16852: {'lr': 0.00037406144380768957, 'samples': 8628736, 'steps': 16852, 'batch_loss/train': 0.7502630657982081} 12/25/2021 04:42:51 - INFO - codeparrot_training - Step 16853: {'lr': 0.0003740477582903587, 'samples': 8629248, 'steps': 16853, 'batch_loss/train': 0.8051105607300997} 12/25/2021 04:43:03 - INFO - codeparrot_training - Step 16854: {'lr': 0.00037403407227986084, 'samples': 8629760, 'steps': 16854, 'batch_loss/train': 0.7492449590936303} 12/25/2021 04:43:14 - INFO - codeparrot_training - Step 16855: {'lr': 0.00037402038577625033, 'samples': 8630272, 'steps': 16855, 'batch_loss/train': 0.7248593939002603} 12/25/2021 04:43:24 - INFO - codeparrot_training - Step 16856: {'lr': 0.0003740066987795816, 'samples': 8630784, 'steps': 16856, 'batch_loss/train': 0.8419830687344074} 12/25/2021 04:43:37 - INFO - codeparrot_training - Step 16857: {'lr': 0.00037399301128990904, 'samples': 8631296, 'steps': 16857, 'batch_loss/train': 
1.2562970221042633} 12/25/2021 04:43:48 - INFO - codeparrot_training - Step 16858: {'lr': 0.000373979323307287, 'samples': 8631808, 'steps': 16858, 'batch_loss/train': 0.9367145309224725} 12/25/2021 04:43:59 - INFO - codeparrot_training - Step 16859: {'lr': 0.0003739656348317701, 'samples': 8632320, 'steps': 16859, 'batch_loss/train': 0.8665110496804118} 12/25/2021 04:44:11 - INFO - codeparrot_training - Step 16860: {'lr': 0.00037395194586341263, 'samples': 8632832, 'steps': 16860, 'batch_loss/train': 0.7645210912451148} 12/25/2021 04:44:21 - INFO - codeparrot_training - Step 16861: {'lr': 0.00037393825640226903, 'samples': 8633344, 'steps': 16861, 'batch_loss/train': 0.8444813657552004} 12/25/2021 04:44:32 - INFO - codeparrot_training - Step 16862: {'lr': 0.0003739245664483936, 'samples': 8633856, 'steps': 16862, 'batch_loss/train': 0.7573119085282087} 12/25/2021 04:44:42 - INFO - codeparrot_training - Step 16863: {'lr': 0.00037391087600184103, 'samples': 8634368, 'steps': 16863, 'batch_loss/train': 0.852228369563818} 12/25/2021 04:44:55 - INFO - codeparrot_training - Step 16864: {'lr': 0.0003738971850626656, 'samples': 8634880, 'steps': 16864, 'batch_loss/train': 0.7812547390349209} 12/25/2021 04:45:05 - INFO - codeparrot_training - Step 16865: {'lr': 0.00037388349363092166, 'samples': 8635392, 'steps': 16865, 'batch_loss/train': 0.7754620788618922} 12/25/2021 04:45:16 - INFO - codeparrot_training - Step 16866: {'lr': 0.0003738698017066637, 'samples': 8635904, 'steps': 16866, 'batch_loss/train': 0.8738084305077791} 12/25/2021 04:45:29 - INFO - codeparrot_training - Step 16867: {'lr': 0.00037385610928994633, 'samples': 8636416, 'steps': 16867, 'batch_loss/train': 0.7871058732271194} 12/25/2021 04:45:39 - INFO - codeparrot_training - Step 16868: {'lr': 0.00037384241638082374, 'samples': 8636928, 'steps': 16868, 'batch_loss/train': 1.4962820969522} 12/25/2021 04:45:50 - INFO - codeparrot_training - Step 16869: {'lr': 0.0003738287229793504, 'samples': 8637440, 
'steps': 16869, 'batch_loss/train': 0.9379601497203112} 12/25/2021 04:46:02 - INFO - codeparrot_training - Step 16870: {'lr': 0.00037381502908558095, 'samples': 8637952, 'steps': 16870, 'batch_loss/train': 0.9421974755823612} 12/25/2021 04:46:13 - INFO - codeparrot_training - Step 16871: {'lr': 0.0003738013346995696, 'samples': 8638464, 'steps': 16871, 'batch_loss/train': 0.853887845762074} 12/25/2021 04:46:23 - INFO - codeparrot_training - Step 16872: {'lr': 0.00037378763982137095, 'samples': 8638976, 'steps': 16872, 'batch_loss/train': 0.8635143684223294} 12/25/2021 04:46:34 - INFO - codeparrot_training - Step 16873: {'lr': 0.0003737739444510394, 'samples': 8639488, 'steps': 16873, 'batch_loss/train': 0.8265854506753385} 12/25/2021 04:46:47 - INFO - codeparrot_training - Step 16874: {'lr': 0.00037376024858862934, 'samples': 8640000, 'steps': 16874, 'batch_loss/train': 0.8448872733861208} 12/25/2021 04:46:57 - INFO - codeparrot_training - Step 16875: {'lr': 0.00037374655223419516, 'samples': 8640512, 'steps': 16875, 'batch_loss/train': 0.793537843041122} 12/25/2021 04:47:08 - INFO - codeparrot_training - Step 16876: {'lr': 0.00037373285538779157, 'samples': 8641024, 'steps': 16876, 'batch_loss/train': 0.7814443272072822} 12/25/2021 04:47:20 - INFO - codeparrot_training - Step 16877: {'lr': 0.0003737191580494728, 'samples': 8641536, 'steps': 16877, 'batch_loss/train': 0.8900720013771206} 12/25/2021 04:47:31 - INFO - codeparrot_training - Step 16878: {'lr': 0.00037370546021929334, 'samples': 8642048, 'steps': 16878, 'batch_loss/train': 0.830570001155138} 12/25/2021 04:47:41 - INFO - codeparrot_training - Step 16879: {'lr': 0.0003736917618973078, 'samples': 8642560, 'steps': 16879, 'batch_loss/train': 0.7548836001660675} 12/25/2021 04:47:53 - INFO - codeparrot_training - Step 16880: {'lr': 0.00037367806308357044, 'samples': 8643072, 'steps': 16880, 'batch_loss/train': 0.8298603799194098} 12/25/2021 04:48:04 - INFO - codeparrot_training - Step 16881: {'lr': 
0.0003736643637781357, 'samples': 8643584, 'steps': 16881, 'batch_loss/train': 0.7812337651848793} 12/25/2021 04:48:14 - INFO - codeparrot_training - Step 16882: {'lr': 0.0003736506639810583, 'samples': 8644096, 'steps': 16882, 'batch_loss/train': 0.7866130210459232} 12/25/2021 04:48:25 - INFO - codeparrot_training - Step 16883: {'lr': 0.00037363696369239254, 'samples': 8644608, 'steps': 16883, 'batch_loss/train': 0.833422499417793} 12/25/2021 04:48:37 - INFO - codeparrot_training - Step 16884: {'lr': 0.0003736232629121928, 'samples': 8645120, 'steps': 16884, 'batch_loss/train': 0.8255998343229294} 12/25/2021 04:48:48 - INFO - codeparrot_training - Step 16885: {'lr': 0.00037360956164051367, 'samples': 8645632, 'steps': 16885, 'batch_loss/train': 0.7343484321609139} 12/25/2021 04:48:58 - INFO - codeparrot_training - Step 16886: {'lr': 0.0003735958598774097, 'samples': 8646144, 'steps': 16886, 'batch_loss/train': 0.7365163564682007} 12/25/2021 04:49:11 - INFO - codeparrot_training - Step 16887: {'lr': 0.0003735821576229351, 'samples': 8646656, 'steps': 16887, 'batch_loss/train': 0.7812972385436296} 12/25/2021 04:49:22 - INFO - codeparrot_training - Step 16888: {'lr': 0.0003735684548771446, 'samples': 8647168, 'steps': 16888, 'batch_loss/train': 0.8191139097325504} 12/25/2021 04:49:32 - INFO - codeparrot_training - Step 16889: {'lr': 0.00037355475164009257, 'samples': 8647680, 'steps': 16889, 'batch_loss/train': 0.8778741369023919} 12/25/2021 04:49:45 - INFO - codeparrot_training - Step 16890: {'lr': 0.0003735410479118335, 'samples': 8648192, 'steps': 16890, 'batch_loss/train': 0.781808169092983} 12/25/2021 04:49:55 - INFO - codeparrot_training - Step 16891: {'lr': 0.0003735273436924218, 'samples': 8648704, 'steps': 16891, 'batch_loss/train': 0.8211659854277968} 12/25/2021 04:50:06 - INFO - codeparrot_training - Step 16892: {'lr': 0.00037351363898191215, 'samples': 8649216, 'steps': 16892, 'batch_loss/train': 0.9147819066420197} 12/25/2021 04:50:16 - INFO - 
codeparrot_training - Step 16893: {'lr': 0.00037349993378035887, 'samples': 8649728, 'steps': 16893, 'batch_loss/train': 0.5834653661586344} 12/25/2021 04:50:29 - INFO - codeparrot_training - Step 16894: {'lr': 0.0003734862280878164, 'samples': 8650240, 'steps': 16894, 'batch_loss/train': 0.8009211886674166} 12/25/2021 04:50:39 - INFO - codeparrot_training - Step 16895: {'lr': 0.00037347252190433937, 'samples': 8650752, 'steps': 16895, 'batch_loss/train': 0.75249785464257} 12/25/2021 04:50:50 - INFO - codeparrot_training - Step 16896: {'lr': 0.0003734588152299823, 'samples': 8651264, 'steps': 16896, 'batch_loss/train': 0.7644891669042408} 12/25/2021 04:51:02 - INFO - codeparrot_training - Step 16897: {'lr': 0.0003734451080647995, 'samples': 8651776, 'steps': 16897, 'batch_loss/train': 0.8142156768590212} 12/25/2021 04:51:13 - INFO - codeparrot_training - Step 16898: {'lr': 0.0003734314004088455, 'samples': 8652288, 'steps': 16898, 'batch_loss/train': 0.8041269518435001} 12/25/2021 04:51:24 - INFO - codeparrot_training - Step 16899: {'lr': 0.0003734176922621749, 'samples': 8652800, 'steps': 16899, 'batch_loss/train': 0.829131368547678} 12/25/2021 04:51:36 - INFO - codeparrot_training - Step 16900: {'lr': 0.0003734039836248423, 'samples': 8653312, 'steps': 16900, 'batch_loss/train': 0.8460450582206249} 12/25/2021 04:51:47 - INFO - codeparrot_training - Step 16901: {'lr': 0.0003733902744969019, 'samples': 8653824, 'steps': 16901, 'batch_loss/train': 0.7268404867500067} 12/25/2021 04:51:57 - INFO - codeparrot_training - Step 16902: {'lr': 0.00037337656487840845, 'samples': 8654336, 'steps': 16902, 'batch_loss/train': 0.8798059429973364} 12/25/2021 04:52:09 - INFO - codeparrot_training - Step 16903: {'lr': 0.0003733628547694163, 'samples': 8654848, 'steps': 16903, 'batch_loss/train': 0.7025984893552959} 12/25/2021 04:52:20 - INFO - codeparrot_training - Step 16904: {'lr': 0.00037334914416998, 'samples': 8655360, 'steps': 16904, 'batch_loss/train': 0.8321634414605796} 
12/25/2021 04:52:30 - INFO - codeparrot_training - Step 16905: {'lr': 0.0003733354330801541, 'samples': 8655872, 'steps': 16905, 'batch_loss/train': 0.9225273355841637} 12/25/2021 04:52:41 - INFO - codeparrot_training - Step 16906: {'lr': 0.0003733217214999932, 'samples': 8656384, 'steps': 16906, 'batch_loss/train': 0.8796259518712759} 12/25/2021 04:52:53 - INFO - codeparrot_training - Step 16907: {'lr': 0.0003733080094295515, 'samples': 8656896, 'steps': 16907, 'batch_loss/train': 0.7861633766442537} 12/25/2021 04:53:04 - INFO - codeparrot_training - Step 16908: {'lr': 0.0003732942968688839, 'samples': 8657408, 'steps': 16908, 'batch_loss/train': 0.8392008277587593} 12/25/2021 04:53:15 - INFO - codeparrot_training - Step 16909: {'lr': 0.0003732805838180446, 'samples': 8657920, 'steps': 16909, 'batch_loss/train': 0.7890594693599269} 12/25/2021 04:53:27 - INFO - codeparrot_training - Step 16910: {'lr': 0.00037326687027708827, 'samples': 8658432, 'steps': 16910, 'batch_loss/train': 0.8801270104013383} 12/25/2021 04:53:37 - INFO - codeparrot_training - Step 16911: {'lr': 0.0003732531562460695, 'samples': 8658944, 'steps': 16911, 'batch_loss/train': 0.7697160290554166} 12/25/2021 04:53:48 - INFO - codeparrot_training - Step 16912: {'lr': 0.0003732394417250426, 'samples': 8659456, 'steps': 16912, 'batch_loss/train': 0.8670120844617486} 12/25/2021 04:54:00 - INFO - codeparrot_training - Step 16913: {'lr': 0.0003732257267140623, 'samples': 8659968, 'steps': 16913, 'batch_loss/train': 0.8218140280805528} 12/25/2021 04:54:11 - INFO - codeparrot_training - Step 16914: {'lr': 0.00037321201121318293, 'samples': 8660480, 'steps': 16914, 'batch_loss/train': 0.705955928307958} 12/25/2021 04:54:21 - INFO - codeparrot_training - Step 16915: {'lr': 0.0003731982952224593, 'samples': 8660992, 'steps': 16915, 'batch_loss/train': 0.8162917965673842} 12/25/2021 04:54:32 - INFO - codeparrot_training - Step 16916: {'lr': 0.0003731845787419456, 'samples': 8661504, 'steps': 16916, 
'batch_loss/train': 0.8141737142577767} 12/25/2021 04:54:45 - INFO - codeparrot_training - Step 16917: {'lr': 0.0003731708617716966, 'samples': 8662016, 'steps': 16917, 'batch_loss/train': 0.8149023531004786} 12/25/2021 04:54:55 - INFO - codeparrot_training - Step 16918: {'lr': 0.0003731571443117668, 'samples': 8662528, 'steps': 16918, 'batch_loss/train': 0.7836297894828022} 12/25/2021 04:55:06 - INFO - codeparrot_training - Step 16919: {'lr': 0.00037314342636221066, 'samples': 8663040, 'steps': 16919, 'batch_loss/train': 0.8819235051050782} 12/25/2021 04:55:18 - INFO - codeparrot_training - Step 16920: {'lr': 0.0003731297079230827, 'samples': 8663552, 'steps': 16920, 'batch_loss/train': 0.8513206010684371} 12/25/2021 04:55:29 - INFO - codeparrot_training - Step 16921: {'lr': 0.0003731159889944376, 'samples': 8664064, 'steps': 16921, 'batch_loss/train': 0.7505997074767947} 12/25/2021 04:55:39 - INFO - codeparrot_training - Step 16922: {'lr': 0.00037310226957632973, 'samples': 8664576, 'steps': 16922, 'batch_loss/train': 0.7619612785056233} 12/25/2021 04:55:51 - INFO - codeparrot_training - Step 16923: {'lr': 0.00037308854966881376, 'samples': 8665088, 'steps': 16923, 'batch_loss/train': 0.8756872545927763} 12/25/2021 04:56:02 - INFO - codeparrot_training - Step 16924: {'lr': 0.0003730748292719442, 'samples': 8665600, 'steps': 16924, 'batch_loss/train': 0.8660275805741549} 12/25/2021 04:56:12 - INFO - codeparrot_training - Step 16925: {'lr': 0.00037306110838577556, 'samples': 8666112, 'steps': 16925, 'batch_loss/train': 0.8338968120515347} 12/25/2021 04:56:23 - INFO - codeparrot_training - Step 16926: {'lr': 0.00037304738701036245, 'samples': 8666624, 'steps': 16926, 'batch_loss/train': 0.8496685037389398} 12/25/2021 04:56:36 - INFO - codeparrot_training - Step 16927: {'lr': 0.00037303366514575937, 'samples': 8667136, 'steps': 16927, 'batch_loss/train': 0.6638684456702322} 12/25/2021 04:56:46 - INFO - codeparrot_training - Step 16928: {'lr': 0.0003730199427920209, 
'samples': 8667648, 'steps': 16928, 'batch_loss/train': 0.8140806192532182} 12/25/2021 04:56:57 - INFO - codeparrot_training - Step 16929: {'lr': 0.00037300621994920165, 'samples': 8668160, 'steps': 16929, 'batch_loss/train': 0.7958425553515553} 12/25/2021 04:57:09 - INFO - codeparrot_training - Step 16930: {'lr': 0.00037299249661735607, 'samples': 8668672, 'steps': 16930, 'batch_loss/train': 1.0119575143326074} 12/25/2021 04:57:20 - INFO - codeparrot_training - Step 16931: {'lr': 0.0003729787727965387, 'samples': 8669184, 'steps': 16931, 'batch_loss/train': 0.8294768524356186} 12/25/2021 04:57:31 - INFO - codeparrot_training - Step 16932: {'lr': 0.0003729650484868043, 'samples': 8669696, 'steps': 16932, 'batch_loss/train': 0.8029115973040462} 12/25/2021 04:57:44 - INFO - codeparrot_training - Step 16933: {'lr': 0.00037295132368820715, 'samples': 8670208, 'steps': 16933, 'batch_loss/train': 0.7087505388772115} 12/25/2021 04:57:54 - INFO - codeparrot_training - Step 16934: {'lr': 0.00037293759840080205, 'samples': 8670720, 'steps': 16934, 'batch_loss/train': 0.7133074882440269} 12/25/2021 04:58:05 - INFO - codeparrot_training - Step 16935: {'lr': 0.00037292387262464344, 'samples': 8671232, 'steps': 16935, 'batch_loss/train': 0.7368608319666237} 12/25/2021 04:58:16 - INFO - codeparrot_training - Step 16936: {'lr': 0.00037291014635978595, 'samples': 8671744, 'steps': 16936, 'batch_loss/train': 1.0861839968711138} 12/25/2021 04:58:27 - INFO - codeparrot_training - Step 16937: {'lr': 0.00037289641960628413, 'samples': 8672256, 'steps': 16937, 'batch_loss/train': 0.7455763346515596} 12/25/2021 04:58:38 - INFO - codeparrot_training - Step 16938: {'lr': 0.0003728826923641925, 'samples': 8672768, 'steps': 16938, 'batch_loss/train': 0.73654621001333} 12/25/2021 04:58:49 - INFO - codeparrot_training - Step 16939: {'lr': 0.00037286896463356573, 'samples': 8673280, 'steps': 16939, 'batch_loss/train': 0.7829347690567374} 12/25/2021 04:59:01 - INFO - codeparrot_training - Step 
16940: {'lr': 0.00037285523641445833, 'samples': 8673792, 'steps': 16940, 'batch_loss/train': 0.7066250217612833} 12/25/2021 04:59:11 - INFO - codeparrot_training - Step 16941: {'lr': 0.00037284150770692495, 'samples': 8674304, 'steps': 16941, 'batch_loss/train': 0.6985812222119421} 12/25/2021 04:59:22 - INFO - codeparrot_training - Step 16942: {'lr': 0.0003728277785110201, 'samples': 8674816, 'steps': 16942, 'batch_loss/train': 0.8073779544793069} 12/25/2021 04:59:35 - INFO - codeparrot_training - Step 16943: {'lr': 0.0003728140488267983, 'samples': 8675328, 'steps': 16943, 'batch_loss/train': 0.7262226333841681} 12/25/2021 04:59:45 - INFO - codeparrot_training - Step 16944: {'lr': 0.0003728003186543143, 'samples': 8675840, 'steps': 16944, 'batch_loss/train': 0.7448132731951773} 12/25/2021 04:59:56 - INFO - codeparrot_training - Step 16945: {'lr': 0.00037278658799362266, 'samples': 8676352, 'steps': 16945, 'batch_loss/train': 0.8564401036128402} 12/25/2021 05:00:06 - INFO - codeparrot_training - Step 16946: {'lr': 0.00037277285684477775, 'samples': 8676864, 'steps': 16946, 'batch_loss/train': 0.7912777378223836} 12/25/2021 05:00:19 - INFO - codeparrot_training - Step 16947: {'lr': 0.0003727591252078344, 'samples': 8677376, 'steps': 16947, 'batch_loss/train': 0.8023835401982069} 12/25/2021 05:00:29 - INFO - codeparrot_training - Step 16948: {'lr': 0.0003727453930828471, 'samples': 8677888, 'steps': 16948, 'batch_loss/train': 0.7704978481051512} 12/25/2021 05:00:40 - INFO - codeparrot_training - Step 16949: {'lr': 0.0003727316604698704, 'samples': 8678400, 'steps': 16949, 'batch_loss/train': 0.8058857824653387} 12/25/2021 05:00:52 - INFO - codeparrot_training - Step 16950: {'lr': 0.0003727179273689591, 'samples': 8678912, 'steps': 16950, 'batch_loss/train': 0.8283862844109535} 12/25/2021 05:01:02 - INFO - codeparrot_training - Step 16951: {'lr': 0.0003727041937801676, 'samples': 8679424, 'steps': 16951, 'batch_loss/train': 0.806168888695538} 12/25/2021 05:01:13 - 
INFO - codeparrot_training - Step 16952: {'lr': 0.0003726904597035505, 'samples': 8679936, 'steps': 16952, 'batch_loss/train': 0.771311673335731} 12/25/2021 05:01:25 - INFO - codeparrot_training - Step 16953: {'lr': 0.0003726767251391625, 'samples': 8680448, 'steps': 16953, 'batch_loss/train': 0.8209850583225489} 12/25/2021 05:01:36 - INFO - codeparrot_training - Step 16954: {'lr': 0.00037266299008705824, 'samples': 8680960, 'steps': 16954, 'batch_loss/train': 0.7592252916656435} 12/25/2021 05:01:46 - INFO - codeparrot_training - Step 16955: {'lr': 0.00037264925454729217, 'samples': 8681472, 'steps': 16955, 'batch_loss/train': 0.8462404278106987} 12/25/2021 05:01:57 - INFO - codeparrot_training - Step 16956: {'lr': 0.00037263551851991896, 'samples': 8681984, 'steps': 16956, 'batch_loss/train': 0.7634237920865417} 12/25/2021 05:02:10 - INFO - codeparrot_training - Step 16957: {'lr': 0.00037262178200499327, 'samples': 8682496, 'steps': 16957, 'batch_loss/train': 0.8196874542627484} 12/25/2021 05:02:21 - INFO - codeparrot_training - Step 16958: {'lr': 0.0003726080450025697, 'samples': 8683008, 'steps': 16958, 'batch_loss/train': 0.7169225828256458} 12/25/2021 05:02:31 - INFO - codeparrot_training - Step 16959: {'lr': 0.0003725943075127027, 'samples': 8683520, 'steps': 16959, 'batch_loss/train': 0.8817387595772743} 12/25/2021 05:02:43 - INFO - codeparrot_training - Step 16960: {'lr': 0.0003725805695354472, 'samples': 8684032, 'steps': 16960, 'batch_loss/train': 1.0477619240991771} 12/25/2021 05:02:54 - INFO - codeparrot_training - Step 16961: {'lr': 0.00037256683107085754, 'samples': 8684544, 'steps': 16961, 'batch_loss/train': 0.8922459445893764} 12/25/2021 05:03:05 - INFO - codeparrot_training - Step 16962: {'lr': 0.0003725530921189884, 'samples': 8685056, 'steps': 16962, 'batch_loss/train': 0.7681152317672968} 12/25/2021 05:03:17 - INFO - codeparrot_training - Step 16963: {'lr': 0.0003725393526798946, 'samples': 8685568, 'steps': 16963, 'batch_loss/train': 
0.808703288435936} 12/25/2021 05:03:27 - INFO - codeparrot_training - Step 16964: {'lr': 0.00037252561275363046, 'samples': 8686080, 'steps': 16964, 'batch_loss/train': 0.8164129443466663} 12/25/2021 05:03:38 - INFO - codeparrot_training - Step 16965: {'lr': 0.0003725118723402507, 'samples': 8686592, 'steps': 16965, 'batch_loss/train': 0.7724475581198931} 12/25/2021 05:03:48 - INFO - codeparrot_training - Step 16966: {'lr': 0.0003724981314398102, 'samples': 8687104, 'steps': 16966, 'batch_loss/train': 0.6600355408154428} 12/25/2021 05:04:01 - INFO - codeparrot_training - Step 16967: {'lr': 0.0003724843900523633, 'samples': 8687616, 'steps': 16967, 'batch_loss/train': 1.0596976913511753} 12/25/2021 05:04:12 - INFO - codeparrot_training - Step 16968: {'lr': 0.0003724706481779646, 'samples': 8688128, 'steps': 16968, 'batch_loss/train': 0.7758840350434184} 12/25/2021 05:04:23 - INFO - codeparrot_training - Step 16969: {'lr': 0.000372456905816669, 'samples': 8688640, 'steps': 16969, 'batch_loss/train': 0.7719105272553861} 12/25/2021 05:04:35 - INFO - codeparrot_training - Step 16970: {'lr': 0.0003724431629685309, 'samples': 8689152, 'steps': 16970, 'batch_loss/train': 0.843370673013851} 12/25/2021 05:04:45 - INFO - codeparrot_training - Step 16971: {'lr': 0.000372429419633605, 'samples': 8689664, 'steps': 16971, 'batch_loss/train': 0.8245174754410982} 12/25/2021 05:04:56 - INFO - codeparrot_training - Step 16972: {'lr': 0.000372415675811946, 'samples': 8690176, 'steps': 16972, 'batch_loss/train': 0.7421648184536025} 12/25/2021 05:05:09 - INFO - codeparrot_training - Step 16973: {'lr': 0.0003724019315036084, 'samples': 8690688, 'steps': 16973, 'batch_loss/train': 0.7741259718313813} 12/25/2021 05:05:20 - INFO - codeparrot_training - Step 16974: {'lr': 0.00037238818670864705, 'samples': 8691200, 'steps': 16974, 'batch_loss/train': 0.8205073582939804} 12/25/2021 05:05:30 - INFO - codeparrot_training - Step 16975: {'lr': 0.0003723744414271164, 'samples': 8691712, 'steps': 
16975, 'batch_loss/train': 0.5250114519149065} 12/25/2021 05:05:41 - INFO - codeparrot_training - Step 16976: {'lr': 0.0003723606956590713, 'samples': 8692224, 'steps': 16976, 'batch_loss/train': 0.8636396112851799} 12/25/2021 05:05:53 - INFO - codeparrot_training - Step 16977: {'lr': 0.00037234694940456616, 'samples': 8692736, 'steps': 16977, 'batch_loss/train': 1.7329059895128012} 12/25/2021 05:06:04 - INFO - codeparrot_training - Step 16978: {'lr': 0.0003723332026636558, 'samples': 8693248, 'steps': 16978, 'batch_loss/train': 0.7181051638908684} 12/25/2021 05:06:14 - INFO - codeparrot_training - Step 16979: {'lr': 0.0003723194554363948, 'samples': 8693760, 'steps': 16979, 'batch_loss/train': 0.8630260471254587} 12/25/2021 05:06:26 - INFO - codeparrot_training - Step 16980: {'lr': 0.0003723057077228378, 'samples': 8694272, 'steps': 16980, 'batch_loss/train': 0.808372332714498} 12/25/2021 05:06:37 - INFO - codeparrot_training - Step 16981: {'lr': 0.00037229195952303955, 'samples': 8694784, 'steps': 16981, 'batch_loss/train': 0.6976062250323594} 12/25/2021 05:06:48 - INFO - codeparrot_training - Step 16982: {'lr': 0.0003722782108370546, 'samples': 8695296, 'steps': 16982, 'batch_loss/train': 0.709970714058727} 12/25/2021 05:07:00 - INFO - codeparrot_training - Step 16983: {'lr': 0.00037226446166493764, 'samples': 8695808, 'steps': 16983, 'batch_loss/train': 0.6320819370448589} 12/25/2021 05:07:11 - INFO - codeparrot_training - Step 16984: {'lr': 0.0003722507120067434, 'samples': 8696320, 'steps': 16984, 'batch_loss/train': 0.7729596709832549} 12/25/2021 05:07:22 - INFO - codeparrot_training - Step 16985: {'lr': 0.00037223696186252644, 'samples': 8696832, 'steps': 16985, 'batch_loss/train': 0.7636933021713048} 12/25/2021 05:07:32 - INFO - codeparrot_training - Step 16986: {'lr': 0.0003722232112323415, 'samples': 8697344, 'steps': 16986, 'batch_loss/train': 0.8066178378649056} 12/25/2021 05:07:45 - INFO - codeparrot_training - Step 16987: {'lr': 
0.00037220946011624326, 'samples': 8697856, 'steps': 16987, 'batch_loss/train': 0.8255515063647181} 12/25/2021 05:07:55 - INFO - codeparrot_training - Step 16988: {'lr': 0.00037219570851428627, 'samples': 8698368, 'steps': 16988, 'batch_loss/train': 0.6923258695751429} 12/25/2021 05:08:06 - INFO - codeparrot_training - Step 16989: {'lr': 0.00037218195642652535, 'samples': 8698880, 'steps': 16989, 'batch_loss/train': 0.7696397416293621} 12/25/2021 05:08:18 - INFO - codeparrot_training - Step 16990: {'lr': 0.00037216820385301507, 'samples': 8699392, 'steps': 16990, 'batch_loss/train': 0.5952057163231075} 12/25/2021 05:08:28 - INFO - codeparrot_training - Step 16991: {'lr': 0.0003721544507938102, 'samples': 8699904, 'steps': 16991, 'batch_loss/train': 0.7983366781845689} 12/25/2021 05:08:39 - INFO - codeparrot_training - Step 16992: {'lr': 0.00037214069724896524, 'samples': 8700416, 'steps': 16992, 'batch_loss/train': 0.8746267631649971} 12/25/2021 05:08:51 - INFO - codeparrot_training - Step 16993: {'lr': 0.0003721269432185351, 'samples': 8700928, 'steps': 16993, 'batch_loss/train': 1.0383064737543464} 12/25/2021 05:09:02 - INFO - codeparrot_training - Step 16994: {'lr': 0.0003721131887025742, 'samples': 8701440, 'steps': 16994, 'batch_loss/train': 0.8653882518410683} 12/25/2021 05:09:12 - INFO - codeparrot_training - Step 16995: {'lr': 0.00037209943370113753, 'samples': 8701952, 'steps': 16995, 'batch_loss/train': 0.7376897372305393} 12/25/2021 05:09:23 - INFO - codeparrot_training - Step 16996: {'lr': 0.0003720856782142796, 'samples': 8702464, 'steps': 16996, 'batch_loss/train': 0.9481252499390393} 12/25/2021 05:09:36 - INFO - codeparrot_training - Step 16997: {'lr': 0.000372071922242055, 'samples': 8702976, 'steps': 16997, 'batch_loss/train': 0.7774981781840324} 12/25/2021 05:09:46 - INFO - codeparrot_training - Step 16998: {'lr': 0.0003720581657845186, 'samples': 8703488, 'steps': 16998, 'batch_loss/train': 1.001218719407916} 12/25/2021 05:09:57 - INFO - 
codeparrot_training - Step 16999: {'lr': 0.00037204440884172505, 'samples': 8704000, 'steps': 16999, 'batch_loss/train': 0.8737264340743423} 12/25/2021 05:10:09 - INFO - codeparrot_training - Step 17000: {'lr': 0.000372030651413729, 'samples': 8704512, 'steps': 17000, 'batch_loss/train': 0.775111373513937} 12/25/2021 05:10:20 - INFO - codeparrot_training - Step 17001: {'lr': 0.00037201689350058507, 'samples': 8705024, 'steps': 17001, 'batch_loss/train': 0.7977651349501684} 12/25/2021 05:10:30 - INFO - codeparrot_training - Step 17002: {'lr': 0.00037200313510234813, 'samples': 8705536, 'steps': 17002, 'batch_loss/train': 0.8370692068710923} 12/25/2021 05:10:43 - INFO - codeparrot_training - Step 17003: {'lr': 0.00037198937621907275, 'samples': 8706048, 'steps': 17003, 'batch_loss/train': 0.7835679426789284} 12/25/2021 05:10:54 - INFO - codeparrot_training - Step 17004: {'lr': 0.00037197561685081373, 'samples': 8706560, 'steps': 17004, 'batch_loss/train': 0.818569521070458} 12/25/2021 05:11:04 - INFO - codeparrot_training - Step 17005: {'lr': 0.0003719618569976256, 'samples': 8707072, 'steps': 17005, 'batch_loss/train': 0.9426284385845065} 12/25/2021 05:11:15 - INFO - codeparrot_training - Step 17006: {'lr': 0.0003719480966595633, 'samples': 8707584, 'steps': 17006, 'batch_loss/train': 0.798375753685832} 12/25/2021 05:11:27 - INFO - codeparrot_training - Step 17007: {'lr': 0.00037193433583668136, 'samples': 8708096, 'steps': 17007, 'batch_loss/train': 0.7254099519923329} 12/25/2021 05:11:38 - INFO - codeparrot_training - Step 17008: {'lr': 0.00037192057452903457, 'samples': 8708608, 'steps': 17008, 'batch_loss/train': 0.844433113001287} 12/25/2021 05:11:48 - INFO - codeparrot_training - Step 17009: {'lr': 0.00037190681273667757, 'samples': 8709120, 'steps': 17009, 'batch_loss/train': 0.7963527720421553} 12/25/2021 05:12:00 - INFO - codeparrot_training - Step 17010: {'lr': 0.00037189305045966513, 'samples': 8709632, 'steps': 17010, 'batch_loss/train': 
0.7469498068094254} 12/25/2021 05:12:11 - INFO - codeparrot_training - Step 17011: {'lr': 0.0003718792876980519, 'samples': 8710144, 'steps': 17011, 'batch_loss/train': 0.8230086555704474} 12/25/2021 05:12:22 - INFO - codeparrot_training - Step 17012: {'lr': 0.0003718655244518927, 'samples': 8710656, 'steps': 17012, 'batch_loss/train': 0.8152289055287838} 12/25/2021 05:12:34 - INFO - codeparrot_training - Step 17013: {'lr': 0.0003718517607212422, 'samples': 8711168, 'steps': 17013, 'batch_loss/train': 0.8193493811413646} 12/25/2021 05:12:45 - INFO - codeparrot_training - Step 17014: {'lr': 0.00037183799650615507, 'samples': 8711680, 'steps': 17014, 'batch_loss/train': 0.8222498865798116} 12/25/2021 05:12:56 - INFO - codeparrot_training - Step 17015: {'lr': 0.00037182423180668613, 'samples': 8712192, 'steps': 17015, 'batch_loss/train': 0.8117305459454656} 12/25/2021 05:13:08 - INFO - codeparrot_training - Step 17016: {'lr': 0.00037181046662289, 'samples': 8712704, 'steps': 17016, 'batch_loss/train': 0.8326842095702887} 12/25/2021 05:13:19 - INFO - codeparrot_training - Step 17017: {'lr': 0.00037179670095482145, 'samples': 8713216, 'steps': 17017, 'batch_loss/train': 0.8051888970658183} 12/25/2021 05:13:29 - INFO - codeparrot_training - Step 17018: {'lr': 0.00037178293480253516, 'samples': 8713728, 'steps': 17018, 'batch_loss/train': 0.8654617122374475} 12/25/2021 05:13:40 - INFO - codeparrot_training - Step 17019: {'lr': 0.00037176916816608597, 'samples': 8714240, 'steps': 17019, 'batch_loss/train': 0.7680071946233511} 12/25/2021 05:13:53 - INFO - codeparrot_training - Step 17020: {'lr': 0.0003717554010455285, 'samples': 8714752, 'steps': 17020, 'batch_loss/train': 0.801099980250001} 12/25/2021 05:14:03 - INFO - codeparrot_training - Step 17021: {'lr': 0.0003717416334409176, 'samples': 8715264, 'steps': 17021, 'batch_loss/train': 0.5832780604250729} 12/25/2021 05:14:14 - INFO - codeparrot_training - Step 17022: {'lr': 0.00037172786535230785, 'samples': 8715776, 
'steps': 17022, 'batch_loss/train': 0.8202964058145881} 12/25/2021 05:14:26 - INFO - codeparrot_training - Step 17023: {'lr': 0.0003717140967797541, 'samples': 8716288, 'steps': 17023, 'batch_loss/train': 0.7297754213213921} 12/25/2021 05:14:37 - INFO - codeparrot_training - Step 17024: {'lr': 0.0003717003277233111, 'samples': 8716800, 'steps': 17024, 'batch_loss/train': 0.6359574077650905} 12/25/2021 05:14:47 - INFO - codeparrot_training - Step 17025: {'lr': 0.00037168655818303346, 'samples': 8717312, 'steps': 17025, 'batch_loss/train': 0.8652881570160389} 12/25/2021 05:14:58 - INFO - codeparrot_training - Step 17026: {'lr': 0.0003716727881589761, 'samples': 8717824, 'steps': 17026, 'batch_loss/train': 0.8711988758295774} 12/25/2021 05:15:10 - INFO - codeparrot_training - Step 17027: {'lr': 0.0003716590176511936, 'samples': 8718336, 'steps': 17027, 'batch_loss/train': 0.8255058235954493} 12/25/2021 05:15:20 - INFO - codeparrot_training - Step 17028: {'lr': 0.00037164524665974085, 'samples': 8718848, 'steps': 17028, 'batch_loss/train': 0.7768983574351296} 12/25/2021 05:15:31 - INFO - codeparrot_training - Step 17029: {'lr': 0.00037163147518467257, 'samples': 8719360, 'steps': 17029, 'batch_loss/train': 0.7421838426962495} 12/25/2021 05:15:43 - INFO - codeparrot_training - Step 17030: {'lr': 0.0003716177032260434, 'samples': 8719872, 'steps': 17030, 'batch_loss/train': 0.7593359113670886} 12/25/2021 05:15:54 - INFO - codeparrot_training - Step 17031: {'lr': 0.0003716039307839081, 'samples': 8720384, 'steps': 17031, 'batch_loss/train': 0.7846761122345924} 12/25/2021 05:16:04 - INFO - codeparrot_training - Step 17032: {'lr': 0.0003715901578583216, 'samples': 8720896, 'steps': 17032, 'batch_loss/train': 0.8314266912639141} 12/25/2021 05:16:17 - INFO - codeparrot_training - Step 17033: {'lr': 0.00037157638444933845, 'samples': 8721408, 'steps': 17033, 'batch_loss/train': 0.7595108640380204} 12/25/2021 05:16:28 - INFO - codeparrot_training - Step 17034: {'lr': 
0.00037156261055701356, 'samples': 8721920, 'steps': 17034, 'batch_loss/train': 0.7954403317999095} 12/25/2021 05:16:38 - INFO - codeparrot_training - Step 17035: {'lr': 0.00037154883618140165, 'samples': 8722432, 'steps': 17035, 'batch_loss/train': 0.7787006911821663} 12/25/2021 05:16:50 - INFO - codeparrot_training - Step 17036: {'lr': 0.0003715350613225574, 'samples': 8722944, 'steps': 17036, 'batch_loss/train': 0.7244045082479715} 12/25/2021 05:17:01 - INFO - codeparrot_training - Step 17037: {'lr': 0.0003715212859805357, 'samples': 8723456, 'steps': 17037, 'batch_loss/train': 0.7764649079181254} 12/25/2021 05:17:11 - INFO - codeparrot_training - Step 17038: {'lr': 0.0003715075101553912, 'samples': 8723968, 'steps': 17038, 'batch_loss/train': 0.8228448508307338} 12/25/2021 05:17:22 - INFO - codeparrot_training - Step 17039: {'lr': 0.0003714937338471787, 'samples': 8724480, 'steps': 17039, 'batch_loss/train': 0.9349382780492306} 12/25/2021 05:17:34 - INFO - codeparrot_training - Step 17040: {'lr': 0.00037147995705595307, 'samples': 8724992, 'steps': 17040, 'batch_loss/train': 0.6677048946730793} 12/25/2021 05:17:45 - INFO - codeparrot_training - Step 17041: {'lr': 0.000371466179781769, 'samples': 8725504, 'steps': 17041, 'batch_loss/train': 0.7491424176841974} 12/25/2021 05:17:55 - INFO - codeparrot_training - Step 17042: {'lr': 0.0003714524020246812, 'samples': 8726016, 'steps': 17042, 'batch_loss/train': 0.8117257002741098} 12/25/2021 05:18:08 - INFO - codeparrot_training - Step 17043: {'lr': 0.00037143862378474454, 'samples': 8726528, 'steps': 17043, 'batch_loss/train': 0.8114864882081747} 12/25/2021 05:18:19 - INFO - codeparrot_training - Step 17044: {'lr': 0.0003714248450620137, 'samples': 8727040, 'steps': 17044, 'batch_loss/train': 0.8313294323161244} 12/25/2021 05:18:29 - INFO - codeparrot_training - Step 17045: {'lr': 0.0003714110658565437, 'samples': 8727552, 'steps': 17045, 'batch_loss/train': 0.8326535443775356} 12/25/2021 05:18:41 - INFO - 
codeparrot_training - Step 17046: {'lr': 0.00037139728616838893, 'samples': 8728064, 'steps': 17046, 'batch_loss/train': 0.8670677645131946} 12/25/2021 05:18:52 - INFO - codeparrot_training - Step 17047: {'lr': 0.00037138350599760447, 'samples': 8728576, 'steps': 17047, 'batch_loss/train': 0.8820416778326035} 12/25/2021 05:19:03 - INFO - codeparrot_training - Step 17048: {'lr': 0.0003713697253442451, 'samples': 8729088, 'steps': 17048, 'batch_loss/train': 0.9414411610923707} 12/25/2021 05:19:13 - INFO - codeparrot_training - Step 17049: {'lr': 0.00037135594420836544, 'samples': 8729600, 'steps': 17049, 'batch_loss/train': 0.7101630161050707} 12/25/2021 05:19:26 - INFO - codeparrot_training - Step 17050: {'lr': 0.00037134216259002037, 'samples': 8730112, 'steps': 17050, 'batch_loss/train': 0.7099400307051837} 12/25/2021 05:19:37 - INFO - codeparrot_training - Step 17051: {'lr': 0.0003713283804892647, 'samples': 8730624, 'steps': 17051, 'batch_loss/train': 0.835782659240067} 12/25/2021 05:19:47 - INFO - codeparrot_training - Step 17052: {'lr': 0.00037131459790615317, 'samples': 8731136, 'steps': 17052, 'batch_loss/train': 0.7469722246751189} 12/25/2021 05:19:59 - INFO - codeparrot_training - Step 17053: {'lr': 0.00037130081484074064, 'samples': 8731648, 'steps': 17053, 'batch_loss/train': 0.7161566922441125} 12/25/2021 05:20:10 - INFO - codeparrot_training - Step 17054: {'lr': 0.00037128703129308193, 'samples': 8732160, 'steps': 17054, 'batch_loss/train': 0.7126652075676247} 12/25/2021 05:20:21 - INFO - codeparrot_training - Step 17055: {'lr': 0.0003712732472632317, 'samples': 8732672, 'steps': 17055, 'batch_loss/train': 0.8200038126669824} 12/25/2021 05:20:33 - INFO - codeparrot_training - Step 17056: {'lr': 0.00037125946275124485, 'samples': 8733184, 'steps': 17056, 'batch_loss/train': 0.8452499764971435} 12/25/2021 05:20:43 - INFO - codeparrot_training - Step 17057: {'lr': 0.00037124567775717615, 'samples': 8733696, 'steps': 17057, 'batch_loss/train': 
0.6267070721369237} 12/25/2021 05:20:54 - INFO - codeparrot_training - Step 17058: {'lr': 0.0003712318922810804, 'samples': 8734208, 'steps': 17058, 'batch_loss/train': 0.8093329521361738} 12/25/2021 05:21:04 - INFO - codeparrot_training - Step 17059: {'lr': 0.00037121810632301244, 'samples': 8734720, 'steps': 17059, 'batch_loss/train': 0.6699704579077661} 12/25/2021 05:21:16 - INFO - codeparrot_training - Step 17060: {'lr': 0.00037120431988302705, 'samples': 8735232, 'steps': 17060, 'batch_loss/train': 0.6628342038020492} 12/25/2021 05:21:27 - INFO - codeparrot_training - Step 17061: {'lr': 0.0003711905329611791, 'samples': 8735744, 'steps': 17061, 'batch_loss/train': 0.7603456387296319} 12/25/2021 05:21:38 - INFO - codeparrot_training - Step 17062: {'lr': 0.0003711767455575233, 'samples': 8736256, 'steps': 17062, 'batch_loss/train': 0.7674389842431992} 12/25/2021 05:21:50 - INFO - codeparrot_training - Step 17063: {'lr': 0.00037116295767211454, 'samples': 8736768, 'steps': 17063, 'batch_loss/train': 0.8636061288416386} 12/25/2021 05:22:01 - INFO - codeparrot_training - Step 17064: {'lr': 0.00037114916930500766, 'samples': 8737280, 'steps': 17064, 'batch_loss/train': 0.7856367046479136} 12/25/2021 05:22:12 - INFO - codeparrot_training - Step 17065: {'lr': 0.00037113538045625735, 'samples': 8737792, 'steps': 17065, 'batch_loss/train': 0.8325524749234319} 12/25/2021 05:22:24 - INFO - codeparrot_training - Step 17066: {'lr': 0.0003711215911259185, 'samples': 8738304, 'steps': 17066, 'batch_loss/train': 0.8270903588272631} 12/25/2021 05:22:34 - INFO - codeparrot_training - Step 17067: {'lr': 0.00037110780131404596, 'samples': 8738816, 'steps': 17067, 'batch_loss/train': 0.6566957010654733} 12/25/2021 05:22:45 - INFO - codeparrot_training - Step 17068: {'lr': 0.0003710940110206946, 'samples': 8739328, 'steps': 17068, 'batch_loss/train': 0.7230028910562396} 12/25/2021 05:22:56 - INFO - codeparrot_training - Step 17069: {'lr': 0.00037108022024591903, 'samples': 8739840, 
'steps': 17069, 'batch_loss/train': 0.7805981310084462} 12/25/2021 05:23:08 - INFO - codeparrot_training - Step 17070: {'lr': 0.00037106642898977434, 'samples': 8740352, 'steps': 17070, 'batch_loss/train': 0.7856095964089036} 12/25/2021 05:23:18 - INFO - codeparrot_training - Step 17071: {'lr': 0.0003710526372523152, 'samples': 8740864, 'steps': 17071, 'batch_loss/train': 0.7707490520551801} 12/25/2021 05:23:29 - INFO - codeparrot_training - Step 17072: {'lr': 0.0003710388450335964, 'samples': 8741376, 'steps': 17072, 'batch_loss/train': 0.8724807128310204} 12/25/2021 05:23:42 - INFO - codeparrot_training - Step 17073: {'lr': 0.000371025052333673, 'samples': 8741888, 'steps': 17073, 'batch_loss/train': 0.7088010662700981} 12/25/2021 05:23:52 - INFO - codeparrot_training - Step 17074: {'lr': 0.00037101125915259964, 'samples': 8742400, 'steps': 17074, 'batch_loss/train': 0.7696491181850433} 12/25/2021 05:24:03 - INFO - codeparrot_training - Step 17075: {'lr': 0.0003709974654904311, 'samples': 8742912, 'steps': 17075, 'batch_loss/train': 0.7994547141715884} 12/25/2021 05:24:15 - INFO - codeparrot_training - Step 17076: {'lr': 0.0003709836713472225, 'samples': 8743424, 'steps': 17076, 'batch_loss/train': 0.6993579147383571} 12/25/2021 05:24:26 - INFO - codeparrot_training - Step 17077: {'lr': 0.0003709698767230284, 'samples': 8743936, 'steps': 17077, 'batch_loss/train': 0.8109991308301687} 12/25/2021 05:24:36 - INFO - codeparrot_training - Step 17078: {'lr': 0.00037095608161790376, 'samples': 8744448, 'steps': 17078, 'batch_loss/train': 0.7571219063829631} 12/25/2021 05:24:47 - INFO - codeparrot_training - Step 17079: {'lr': 0.00037094228603190343, 'samples': 8744960, 'steps': 17079, 'batch_loss/train': 0.7960456013679504} 12/25/2021 05:24:59 - INFO - codeparrot_training - Step 17080: {'lr': 0.00037092848996508226, 'samples': 8745472, 'steps': 17080, 'batch_loss/train': 0.9401393355801702} 12/25/2021 05:25:09 - INFO - codeparrot_training - Step 17081: {'lr': 
0.00037091469341749505, 'samples': 8745984, 'steps': 17081, 'batch_loss/train': 0.7447575032711029} 12/25/2021 05:25:20 - INFO - codeparrot_training - Step 17082: {'lr': 0.00037090089638919666, 'samples': 8746496, 'steps': 17082, 'batch_loss/train': 0.8495045732706785} 12/25/2021 05:25:33 - INFO - codeparrot_training - Step 17083: {'lr': 0.00037088709888024197, 'samples': 8747008, 'steps': 17083, 'batch_loss/train': 0.7843368935864419} 12/25/2021 05:25:43 - INFO - codeparrot_training - Step 17084: {'lr': 0.00037087330089068576, 'samples': 8747520, 'steps': 17084, 'batch_loss/train': 0.8121597226709127} 12/25/2021 05:25:54 - INFO - codeparrot_training - Step 17085: {'lr': 0.00037085950242058297, 'samples': 8748032, 'steps': 17085, 'batch_loss/train': 0.720646555069834} 12/25/2021 05:26:06 - INFO - codeparrot_training - Step 17086: {'lr': 0.0003708457034699885, 'samples': 8748544, 'steps': 17086, 'batch_loss/train': 0.7477453174069524} 12/25/2021 05:26:17 - INFO - codeparrot_training - Step 17087: {'lr': 0.0003708319040389571, 'samples': 8749056, 'steps': 17087, 'batch_loss/train': 0.803238358348608} 12/25/2021 05:26:27 - INFO - codeparrot_training - Step 17088: {'lr': 0.00037081810412754364, 'samples': 8749568, 'steps': 17088, 'batch_loss/train': 0.800892467610538} 12/25/2021 05:26:40 - INFO - codeparrot_training - Step 17089: {'lr': 0.0003708043037358031, 'samples': 8750080, 'steps': 17089, 'batch_loss/train': 0.7398656863952056} 12/25/2021 05:26:51 - INFO - codeparrot_training - Step 17090: {'lr': 0.00037079050286379023, 'samples': 8750592, 'steps': 17090, 'batch_loss/train': 0.7577791521325707} 12/25/2021 05:27:01 - INFO - codeparrot_training - Step 17091: {'lr': 0.0003707767015115599, 'samples': 8751104, 'steps': 17091, 'batch_loss/train': 0.7902941983193159} 12/25/2021 05:27:12 - INFO - codeparrot_training - Step 17092: {'lr': 0.000370762899679167, 'samples': 8751616, 'steps': 17092, 'batch_loss/train': 0.8260375175741501} 12/25/2021 05:27:24 - INFO - 
codeparrot_training - Step 17093: {'lr': 0.0003707490973666665, 'samples': 8752128, 'steps': 17093, 'batch_loss/train': 0.7915805159136653} 12/25/2021 05:27:34 - INFO - codeparrot_training - Step 17094: {'lr': 0.0003707352945741131, 'samples': 8752640, 'steps': 17094, 'batch_loss/train': 0.7801563115790486} 12/25/2021 05:27:45 - INFO - codeparrot_training - Step 17095: {'lr': 0.00037072149130156173, 'samples': 8753152, 'steps': 17095, 'batch_loss/train': 0.8081841710954905} 12/25/2021 05:27:57 - INFO - codeparrot_training - Step 17096: {'lr': 0.0003707076875490674, 'samples': 8753664, 'steps': 17096, 'batch_loss/train': 0.9831717004999518} 12/25/2021 05:28:08 - INFO - codeparrot_training - Step 17097: {'lr': 0.0003706938833166847, 'samples': 8754176, 'steps': 17097, 'batch_loss/train': 0.7952755680307746} 12/25/2021 05:28:19 - INFO - codeparrot_training - Step 17098: {'lr': 0.0003706800786044688, 'samples': 8754688, 'steps': 17098, 'batch_loss/train': 0.6150395916774869} 12/25/2021 05:28:31 - INFO - codeparrot_training - Step 17099: {'lr': 0.00037066627341247454, 'samples': 8755200, 'steps': 17099, 'batch_loss/train': 0.7736872346140444} 12/25/2021 05:28:41 - INFO - codeparrot_training - Step 17100: {'lr': 0.00037065246774075673, 'samples': 8755712, 'steps': 17100, 'batch_loss/train': 0.8054636428132653} 12/25/2021 05:28:52 - INFO - codeparrot_training - Step 17101: {'lr': 0.00037063866158937016, 'samples': 8756224, 'steps': 17101, 'batch_loss/train': 0.6889229714870453} 12/25/2021 05:29:03 - INFO - codeparrot_training - Step 17102: {'lr': 0.0003706248549583699, 'samples': 8756736, 'steps': 17102, 'batch_loss/train': 0.8030995186418295} 12/25/2021 05:29:15 - INFO - codeparrot_training - Step 17103: {'lr': 0.00037061104784781075, 'samples': 8757248, 'steps': 17103, 'batch_loss/train': 0.8032166035845876} 12/25/2021 05:29:26 - INFO - codeparrot_training - Step 17104: {'lr': 0.0003705972402577476, 'samples': 8757760, 'steps': 17104, 'batch_loss/train': 
0.7925503524020314} 12/25/2021 05:29:36 - INFO - codeparrot_training - Step 17105: {'lr': 0.00037058343218823535, 'samples': 8758272, 'steps': 17105, 'batch_loss/train': 0.6788564736489207} 12/25/2021 05:29:49 - INFO - codeparrot_training - Step 17106: {'lr': 0.000370569623639329, 'samples': 8758784, 'steps': 17106, 'batch_loss/train': 0.8527213279157877} 12/25/2021 05:29:59 - INFO - codeparrot_training - Step 17107: {'lr': 0.0003705558146110832, 'samples': 8759296, 'steps': 17107, 'batch_loss/train': 0.7974973483942449} 12/25/2021 05:30:10 - INFO - codeparrot_training - Step 17108: {'lr': 0.00037054200510355307, 'samples': 8759808, 'steps': 17108, 'batch_loss/train': 0.8022144790738821} 12/25/2021 05:30:24 - INFO - codeparrot_training - Step 17109: {'lr': 0.00037052819511679345, 'samples': 8760320, 'steps': 17109, 'batch_loss/train': 0.8363966040778905} 12/25/2021 05:30:35 - INFO - codeparrot_training - Step 17110: {'lr': 0.00037051438465085924, 'samples': 8760832, 'steps': 17110, 'batch_loss/train': 0.7322226818650961} 12/25/2021 05:30:45 - INFO - codeparrot_training - Step 17111: {'lr': 0.0003705005737058053, 'samples': 8761344, 'steps': 17111, 'batch_loss/train': 0.793107071891427} 12/25/2021 05:30:58 - INFO - codeparrot_training - Step 17112: {'lr': 0.00037048676228168667, 'samples': 8761856, 'steps': 17112, 'batch_loss/train': 0.6431608846178278} 12/25/2021 05:31:08 - INFO - codeparrot_training - Step 17113: {'lr': 0.00037047295037855813, 'samples': 8762368, 'steps': 17113, 'batch_loss/train': 0.7216192702762783} 12/25/2021 05:31:19 - INFO - codeparrot_training - Step 17114: {'lr': 0.0003704591379964745, 'samples': 8762880, 'steps': 17114, 'batch_loss/train': 0.6937162929680198} 12/25/2021 05:31:30 - INFO - codeparrot_training - Step 17115: {'lr': 0.000370445325135491, 'samples': 8763392, 'steps': 17115, 'batch_loss/train': 0.6918797986581922} 12/25/2021 05:31:44 - INFO - codeparrot_training - Step 17116: {'lr': 0.0003704315117956623, 'samples': 8763904, 
'steps': 17116, 'batch_loss/train': 0.8787906486541033} 12/25/2021 05:31:54 - INFO - codeparrot_training - Step 17117: {'lr': 0.00037041769797704335, 'samples': 8764416, 'steps': 17117, 'batch_loss/train': 0.7820621030405164} 12/25/2021 05:32:05 - INFO - codeparrot_training - Step 17118: {'lr': 0.00037040388367968915, 'samples': 8764928, 'steps': 17118, 'batch_loss/train': 0.7833555764518678} 12/25/2021 05:32:17 - INFO - codeparrot_training - Step 17119: {'lr': 0.00037039006890365455, 'samples': 8765440, 'steps': 17119, 'batch_loss/train': 0.8891285564750433} 12/25/2021 05:32:28 - INFO - codeparrot_training - Step 17120: {'lr': 0.0003703762536489944, 'samples': 8765952, 'steps': 17120, 'batch_loss/train': 0.7905563646927476} 12/25/2021 05:32:38 - INFO - codeparrot_training - Step 17121: {'lr': 0.0003703624379157639, 'samples': 8766464, 'steps': 17121, 'batch_loss/train': 0.831133765168488} 12/25/2021 05:32:50 - INFO - codeparrot_training - Step 17122: {'lr': 0.0003703486217040177, 'samples': 8766976, 'steps': 17122, 'batch_loss/train': 0.8065251596271992} 12/25/2021 05:33:01 - INFO - codeparrot_training - Step 17123: {'lr': 0.00037033480501381076, 'samples': 8767488, 'steps': 17123, 'batch_loss/train': 0.865512371994555} 12/25/2021 05:33:11 - INFO - codeparrot_training - Step 17124: {'lr': 0.0003703209878451982, 'samples': 8768000, 'steps': 17124, 'batch_loss/train': 0.7592856697738171} 12/25/2021 05:33:22 - INFO - codeparrot_training - Step 17125: {'lr': 0.00037030717019823474, 'samples': 8768512, 'steps': 17125, 'batch_loss/train': 0.7127336743287742} 12/25/2021 05:33:36 - INFO - codeparrot_training - Step 17126: {'lr': 0.00037029335207297543, 'samples': 8769024, 'steps': 17126, 'batch_loss/train': 0.8838270250707865} 12/25/2021 05:33:47 - INFO - codeparrot_training - Step 17127: {'lr': 0.00037027953346947513, 'samples': 8769536, 'steps': 17127, 'batch_loss/train': 0.7983716493472457} 12/25/2021 05:33:57 - INFO - codeparrot_training - Step 17128: {'lr': 
0.0003702657143877889, 'samples': 8770048, 'steps': 17128, 'batch_loss/train': 0.8672707006335258} 12/25/2021 05:34:09 - INFO - codeparrot_training - Step 17129: {'lr': 0.00037025189482797153, 'samples': 8770560, 'steps': 17129, 'batch_loss/train': 0.8403270551934838} 12/25/2021 05:34:20 - INFO - codeparrot_training - Step 17130: {'lr': 0.00037023807479007805, 'samples': 8771072, 'steps': 17130, 'batch_loss/train': 0.7310327524319291} 12/25/2021 05:34:31 - INFO - codeparrot_training - Step 17131: {'lr': 0.0003702242542741634, 'samples': 8771584, 'steps': 17131, 'batch_loss/train': 0.5109979594126344} 12/25/2021 05:34:41 - INFO - codeparrot_training - Step 17132: {'lr': 0.00037021043328028247, 'samples': 8772096, 'steps': 17132, 'batch_loss/train': 0.770905896788463} 12/25/2021 05:34:53 - INFO - codeparrot_training - Step 17133: {'lr': 0.00037019661180849025, 'samples': 8772608, 'steps': 17133, 'batch_loss/train': 0.8295010104775429} 12/25/2021 05:35:04 - INFO - codeparrot_training - Step 17134: {'lr': 0.0003701827898588417, 'samples': 8773120, 'steps': 17134, 'batch_loss/train': 0.8354463633149862} 12/25/2021 05:35:15 - INFO - codeparrot_training - Step 17135: {'lr': 0.0003701689674313918, 'samples': 8773632, 'steps': 17135, 'batch_loss/train': 0.7758597326464951} 12/25/2021 05:35:28 - INFO - codeparrot_training - Step 17136: {'lr': 0.0003701551445261954, 'samples': 8774144, 'steps': 17136, 'batch_loss/train': 0.8007802008651197} 12/25/2021 05:35:39 - INFO - codeparrot_training - Step 17137: {'lr': 0.0003701413211433076, 'samples': 8774656, 'steps': 17137, 'batch_loss/train': 0.9960801098495722} 12/25/2021 05:35:50 - INFO - codeparrot_training - Step 17138: {'lr': 0.0003701274972827832, 'samples': 8775168, 'steps': 17138, 'batch_loss/train': 0.7930916622281075} 12/25/2021 05:36:02 - INFO - codeparrot_training - Step 17139: {'lr': 0.00037011367294467724, 'samples': 8775680, 'steps': 17139, 'batch_loss/train': 0.7727675884962082} 12/25/2021 05:36:12 - INFO - 
codeparrot_training - Step 17140: {'lr': 0.0003700998481290446, 'samples': 8776192, 'steps': 17140, 'batch_loss/train': 1.0492531957570463} 12/25/2021 05:36:23 - INFO - codeparrot_training - Step 17141: {'lr': 0.00037008602283594043, 'samples': 8776704, 'steps': 17141, 'batch_loss/train': 0.7697528246790171} 12/25/2021 05:36:34 - INFO - codeparrot_training - Step 17142: {'lr': 0.0003700721970654196, 'samples': 8777216, 'steps': 17142, 'batch_loss/train': 0.7902096423786134} 12/25/2021 05:36:46 - INFO - codeparrot_training - Step 17143: {'lr': 0.00037005837081753693, 'samples': 8777728, 'steps': 17143, 'batch_loss/train': 0.793646652251482} 12/25/2021 05:36:56 - INFO - codeparrot_training - Step 17144: {'lr': 0.00037004454409234755, 'samples': 8778240, 'steps': 17144, 'batch_loss/train': 0.8268961990252137} 12/25/2021 05:37:07 - INFO - codeparrot_training - Step 17145: {'lr': 0.00037003071688990636, 'samples': 8778752, 'steps': 17145, 'batch_loss/train': 0.8262108414201066} 12/25/2021 05:37:21 - INFO - codeparrot_training - Step 17146: {'lr': 0.00037001688921026844, 'samples': 8779264, 'steps': 17146, 'batch_loss/train': 0.7126466673798859} 12/25/2021 05:37:31 - INFO - codeparrot_training - Step 17147: {'lr': 0.0003700030610534886, 'samples': 8779776, 'steps': 17147, 'batch_loss/train': 0.8005399061366916} 12/25/2021 05:37:42 - INFO - codeparrot_training - Step 17148: {'lr': 0.000369989232419622, 'samples': 8780288, 'steps': 17148, 'batch_loss/train': 0.8152961293235421} 12/25/2021 05:37:54 - INFO - codeparrot_training - Step 17149: {'lr': 0.0003699754033087235, 'samples': 8780800, 'steps': 17149, 'batch_loss/train': 0.754080421756953} 12/25/2021 05:38:05 - INFO - codeparrot_training - Step 17150: {'lr': 0.00036996157372084804, 'samples': 8781312, 'steps': 17150, 'batch_loss/train': 0.7759597929543816} 12/25/2021 05:38:16 - INFO - codeparrot_training - Step 17151: {'lr': 0.00036994774365605067, 'samples': 8781824, 'steps': 17151, 'batch_loss/train': 
0.6168080973438919} 12/25/2021 05:38:28 - INFO - codeparrot_training - Step 17152: {'lr': 0.0003699339131143864, 'samples': 8782336, 'steps': 17152, 'batch_loss/train': 0.6827954089385457} 12/25/2021 05:38:38 - INFO - codeparrot_training - Step 17153: {'lr': 0.00036992008209591013, 'samples': 8782848, 'steps': 17153, 'batch_loss/train': 0.737398182740435} 12/25/2021 05:38:49 - INFO - codeparrot_training - Step 17154: {'lr': 0.0003699062506006769, 'samples': 8783360, 'steps': 17154, 'batch_loss/train': 0.7487293141894042} 12/25/2021 05:38:59 - INFO - codeparrot_training - Step 17155: {'lr': 0.00036989241862874167, 'samples': 8783872, 'steps': 17155, 'batch_loss/train': 0.827027621679008} 12/25/2021 05:39:13 - INFO - codeparrot_training - Step 17156: {'lr': 0.0003698785861801595, 'samples': 8784384, 'steps': 17156, 'batch_loss/train': 0.8330007204785943} 12/25/2021 05:39:24 - INFO - codeparrot_training - Step 17157: {'lr': 0.00036986475325498537, 'samples': 8784896, 'steps': 17157, 'batch_loss/train': 0.7616052106022835} 12/25/2021 05:39:34 - INFO - codeparrot_training - Step 17158: {'lr': 0.00036985091985327414, 'samples': 8785408, 'steps': 17158, 'batch_loss/train': 0.8466999679803848} 12/25/2021 05:39:47 - INFO - codeparrot_training - Step 17159: {'lr': 0.000369837085975081, 'samples': 8785920, 'steps': 17159, 'batch_loss/train': 0.790565887466073} 12/25/2021 05:39:57 - INFO - codeparrot_training - Step 17160: {'lr': 0.00036982325162046077, 'samples': 8786432, 'steps': 17160, 'batch_loss/train': 0.8181645576842129} 12/25/2021 05:40:08 - INFO - codeparrot_training - Step 17161: {'lr': 0.00036980941678946863, 'samples': 8786944, 'steps': 17161, 'batch_loss/train': 0.782793365418911} 12/25/2021 05:40:20 - INFO - codeparrot_training - Step 17162: {'lr': 0.0003697955814821594, 'samples': 8787456, 'steps': 17162, 'batch_loss/train': 0.8024579202756286} 12/25/2021 05:40:31 - INFO - codeparrot_training - Step 17163: {'lr': 0.00036978174569858823, 'samples': 8787968, 
'steps': 17163, 'batch_loss/train': 0.7688863810617477} 12/25/2021 05:40:41 - INFO - codeparrot_training - Step 17164: {'lr': 0.0003697679094388101, 'samples': 8788480, 'steps': 17164, 'batch_loss/train': 0.7994425594806671} 12/25/2021 05:40:52 - INFO - codeparrot_training - Step 17165: {'lr': 0.0003697540727028799, 'samples': 8788992, 'steps': 17165, 'batch_loss/train': 0.8119599288329482} 12/25/2021 05:41:06 - INFO - codeparrot_training - Step 17166: {'lr': 0.00036974023549085275, 'samples': 8789504, 'steps': 17166, 'batch_loss/train': 0.7908940459601581} 12/25/2021 05:41:16 - INFO - codeparrot_training - Step 17167: {'lr': 0.0003697263978027836, 'samples': 8790016, 'steps': 17167, 'batch_loss/train': 0.787070064805448} 12/25/2021 05:41:27 - INFO - codeparrot_training - Step 17168: {'lr': 0.00036971255963872757, 'samples': 8790528, 'steps': 17168, 'batch_loss/train': 0.8024102123454213} 12/25/2021 05:41:39 - INFO - codeparrot_training - Step 17169: {'lr': 0.00036969872099873956, 'samples': 8791040, 'steps': 17169, 'batch_loss/train': 0.7616748907603323} 12/25/2021 05:41:49 - INFO - codeparrot_training - Step 17170: {'lr': 0.0003696848818828746, 'samples': 8791552, 'steps': 17170, 'batch_loss/train': 1.2399832950904965} 12/25/2021 05:42:00 - INFO - codeparrot_training - Step 17171: {'lr': 0.0003696710422911879, 'samples': 8792064, 'steps': 17171, 'batch_loss/train': 0.8060918838018551} 12/25/2021 05:42:14 - INFO - codeparrot_training - Step 17172: {'lr': 0.00036965720222373414, 'samples': 8792576, 'steps': 17172, 'batch_loss/train': 0.8064326541498303} 12/25/2021 05:42:24 - INFO - codeparrot_training - Step 17173: {'lr': 0.00036964336168056854, 'samples': 8793088, 'steps': 17173, 'batch_loss/train': 0.8085316345095634} 12/25/2021 05:42:35 - INFO - codeparrot_training - Step 17174: {'lr': 0.00036962952066174626, 'samples': 8793600, 'steps': 17174, 'batch_loss/train': 0.7349209170788527} 12/25/2021 05:42:46 - INFO - codeparrot_training - Step 17175: {'lr': 
0.0003696156791673221, 'samples': 8794112, 'steps': 17175, 'batch_loss/train': 0.8086433010175824} 12/25/2021 05:42:58 - INFO - codeparrot_training - Step 17176: {'lr': 0.00036960183719735105, 'samples': 8794624, 'steps': 17176, 'batch_loss/train': 0.769500915077515} 12/25/2021 05:43:08 - INFO - codeparrot_training - Step 17177: {'lr': 0.00036958799475188844, 'samples': 8795136, 'steps': 17177, 'batch_loss/train': 1.007725934498012} 12/25/2021 05:43:19 - INFO - codeparrot_training - Step 17178: {'lr': 0.000369574151830989, 'samples': 8795648, 'steps': 17178, 'batch_loss/train': 0.7485896209254861} 12/25/2021 05:43:31 - INFO - codeparrot_training - Step 17179: {'lr': 0.0003695603084347079, 'samples': 8796160, 'steps': 17179, 'batch_loss/train': 0.7761750807985663} 12/25/2021 05:43:42 - INFO - codeparrot_training - Step 17180: {'lr': 0.00036954646456310023, 'samples': 8796672, 'steps': 17180, 'batch_loss/train': 0.7554480084218085} 12/25/2021 05:43:52 - INFO - codeparrot_training - Step 17181: {'lr': 0.0003695326202162209, 'samples': 8797184, 'steps': 17181, 'batch_loss/train': 0.7695747809484601} 12/25/2021 05:44:04 - INFO - codeparrot_training - Step 17182: {'lr': 0.0003695187753941249, 'samples': 8797696, 'steps': 17182, 'batch_loss/train': 0.8344891214510426} 12/25/2021 05:44:15 - INFO - codeparrot_training - Step 17183: {'lr': 0.00036950493009686746, 'samples': 8798208, 'steps': 17183, 'batch_loss/train': 0.8149739978834987} 12/25/2021 05:44:25 - INFO - codeparrot_training - Step 17184: {'lr': 0.0003694910843245036, 'samples': 8798720, 'steps': 17184, 'batch_loss/train': 0.9017830584198236} 12/25/2021 05:44:39 - INFO - codeparrot_training - Step 17185: {'lr': 0.0003694772380770882, 'samples': 8799232, 'steps': 17185, 'batch_loss/train': 0.7965011243941262} 12/25/2021 05:44:50 - INFO - codeparrot_training - Step 17186: {'lr': 0.0003694633913546765, 'samples': 8799744, 'steps': 17186, 'batch_loss/train': 0.7136217532679439} 12/25/2021 05:45:01 - INFO - 
codeparrot_training - Step 17187: {'lr': 0.0003694495441573235, 'samples': 8800256, 'steps': 17187, 'batch_loss/train': 0.7999111721292138} 12/25/2021 05:45:11 - INFO - codeparrot_training - Step 17188: {'lr': 0.0003694356964850841, 'samples': 8800768, 'steps': 17188, 'batch_loss/train': 0.6560873351991177} 12/25/2021 05:45:23 - INFO - codeparrot_training - Step 17189: {'lr': 0.0003694218483380136, 'samples': 8801280, 'steps': 17189, 'batch_loss/train': 0.7716332674026489} 12/25/2021 05:45:34 - INFO - codeparrot_training - Step 17190: {'lr': 0.00036940799971616677, 'samples': 8801792, 'steps': 17190, 'batch_loss/train': 0.810953987762332} 12/25/2021 05:45:44 - INFO - codeparrot_training - Step 17191: {'lr': 0.0003693941506195989, 'samples': 8802304, 'steps': 17191, 'batch_loss/train': 0.756384102627635} 12/25/2021 05:45:57 - INFO - codeparrot_training - Step 17192: {'lr': 0.000369380301048365, 'samples': 8802816, 'steps': 17192, 'batch_loss/train': 0.8056000221986324} 12/25/2021 05:46:07 - INFO - codeparrot_training - Step 17193: {'lr': 0.0003693664510025201, 'samples': 8803328, 'steps': 17193, 'batch_loss/train': 0.8632957097142935} 12/25/2021 05:46:18 - INFO - codeparrot_training - Step 17194: {'lr': 0.0003693526004821192, 'samples': 8803840, 'steps': 17194, 'batch_loss/train': 0.9539960715919733} 12/25/2021 05:46:32 - INFO - codeparrot_training - Step 17195: {'lr': 0.0003693387494872174, 'samples': 8804352, 'steps': 17195, 'batch_loss/train': 0.9601670550182462} 12/25/2021 05:46:42 - INFO - codeparrot_training - Step 17196: {'lr': 0.0003693248980178699, 'samples': 8804864, 'steps': 17196, 'batch_loss/train': 0.6505533329909667} 12/25/2021 05:46:53 - INFO - codeparrot_training - Step 17197: {'lr': 0.0003693110460741316, 'samples': 8805376, 'steps': 17197, 'batch_loss/train': 0.8875416368246078} 12/25/2021 05:47:05 - INFO - codeparrot_training - Step 17198: {'lr': 0.00036929719365605765, 'samples': 8805888, 'steps': 17198, 'batch_loss/train': 0.8395707728341222} 
12/25/2021 05:47:15 - INFO - codeparrot_training - Step 17199: {'lr': 0.00036928334076370304, 'samples': 8806400, 'steps': 17199, 'batch_loss/train': 0.8049160344526172} 12/25/2021 05:47:26 - INFO - codeparrot_training - Step 17200: {'lr': 0.000369269487397123, 'samples': 8806912, 'steps': 17200, 'batch_loss/train': 0.7998269181698561} 12/25/2021 05:47:37 - INFO - codeparrot_training - Step 17201: {'lr': 0.00036925563355637243, 'samples': 8807424, 'steps': 17201, 'batch_loss/train': 0.9319567214697599} 12/25/2021 05:47:51 - INFO - codeparrot_training - Step 17202: {'lr': 0.0003692417792415066, 'samples': 8807936, 'steps': 17202, 'batch_loss/train': 0.7968158777803183} 12/25/2021 05:48:01 - INFO - codeparrot_training - Step 17203: {'lr': 0.0003692279244525804, 'samples': 8808448, 'steps': 17203, 'batch_loss/train': 0.8134445394389331} 12/25/2021 05:48:12 - INFO - codeparrot_training - Step 17204: {'lr': 0.000369214069189649, 'samples': 8808960, 'steps': 17204, 'batch_loss/train': 0.7873349864967167} 12/25/2021 05:48:24 - INFO - codeparrot_training - Step 17205: {'lr': 0.0003692002134527674, 'samples': 8809472, 'steps': 17205, 'batch_loss/train': 0.8352088769897819} 12/25/2021 05:48:34 - INFO - codeparrot_training - Step 17206: {'lr': 0.0003691863572419909, 'samples': 8809984, 'steps': 17206, 'batch_loss/train': 0.7591881887055933} 12/25/2021 05:48:45 - INFO - codeparrot_training - Step 17207: {'lr': 0.0003691725005573744, 'samples': 8810496, 'steps': 17207, 'batch_loss/train': 0.8510277699679136} 12/25/2021 05:48:57 - INFO - codeparrot_training - Step 17208: {'lr': 0.000369158643398973, 'samples': 8811008, 'steps': 17208, 'batch_loss/train': 0.6452690217993222} 12/25/2021 05:49:08 - INFO - codeparrot_training - Step 17209: {'lr': 0.00036914478576684185, 'samples': 8811520, 'steps': 17209, 'batch_loss/train': 0.8176303561776876} 12/25/2021 05:49:18 - INFO - codeparrot_training - Step 17210: {'lr': 0.000369130927661036, 'samples': 8812032, 'steps': 17210, 
'batch_loss/train': 0.8321163556538522} 12/25/2021 05:49:29 - INFO - codeparrot_training - Step 17211: {'lr': 0.00036911706908161057, 'samples': 8812544, 'steps': 17211, 'batch_loss/train': 0.84617964271456} 12/25/2021 05:49:43 - INFO - codeparrot_training - Step 17212: {'lr': 0.0003691032100286207, 'samples': 8813056, 'steps': 17212, 'batch_loss/train': 0.725850717164576} 12/25/2021 05:49:53 - INFO - codeparrot_training - Step 17213: {'lr': 0.0003690893505021214, 'samples': 8813568, 'steps': 17213, 'batch_loss/train': 0.813361641485244} 12/25/2021 05:50:04 - INFO - codeparrot_training - Step 17214: {'lr': 0.0003690754905021677, 'samples': 8814080, 'steps': 17214, 'batch_loss/train': 0.759377745911479} 12/25/2021 05:50:16 - INFO - codeparrot_training - Step 17215: {'lr': 0.000369061630028815, 'samples': 8814592, 'steps': 17215, 'batch_loss/train': 0.8136732103303075} 12/25/2021 05:50:27 - INFO - codeparrot_training - Step 17216: {'lr': 0.00036904776908211815, 'samples': 8815104, 'steps': 17216, 'batch_loss/train': 0.6879623420536518} 12/25/2021 05:50:37 - INFO - codeparrot_training - Step 17217: {'lr': 0.0003690339076621323, 'samples': 8815616, 'steps': 17217, 'batch_loss/train': 0.8923871610313654} 12/25/2021 05:50:50 - INFO - codeparrot_training - Step 17218: {'lr': 0.00036902004576891255, 'samples': 8816128, 'steps': 17218, 'batch_loss/train': 0.8098624097183347} 12/25/2021 05:51:00 - INFO - codeparrot_training - Step 17219: {'lr': 0.000369006183402514, 'samples': 8816640, 'steps': 17219, 'batch_loss/train': 0.7620040588080883} 12/25/2021 05:51:11 - INFO - codeparrot_training - Step 17220: {'lr': 0.00036899232056299196, 'samples': 8817152, 'steps': 17220, 'batch_loss/train': 0.7384964972734451} 12/25/2021 05:51:21 - INFO - codeparrot_training - Step 17221: {'lr': 0.00036897845725040123, 'samples': 8817664, 'steps': 17221, 'batch_loss/train': 0.7824685508385301} 12/25/2021 05:51:33 - INFO - codeparrot_training - Step 17222: {'lr': 0.00036896459346479716, 
'samples': 8818176, 'steps': 17222, 'batch_loss/train': 0.7398649463430047} 12/25/2021 05:51:44 - INFO - codeparrot_training - Step 17223: {'lr': 0.0003689507292062348, 'samples': 8818688, 'steps': 17223, 'batch_loss/train': 0.7668903039302677} 12/25/2021 05:51:55 - INFO - codeparrot_training - Step 17224: {'lr': 0.0003689368644747692, 'samples': 8819200, 'steps': 17224, 'batch_loss/train': 0.8208004613406956} 12/25/2021 05:52:09 - INFO - codeparrot_training - Step 17225: {'lr': 0.00036892299927045554, 'samples': 8819712, 'steps': 17225, 'batch_loss/train': 0.5844693616963923} 12/25/2021 05:52:19 - INFO - codeparrot_training - Step 17226: {'lr': 0.000368909133593349, 'samples': 8820224, 'steps': 17226, 'batch_loss/train': 0.8876215191558003} 12/25/2021 05:52:30 - INFO - codeparrot_training - Step 17227: {'lr': 0.00036889526744350457, 'samples': 8820736, 'steps': 17227, 'batch_loss/train': 0.8609638772904873} 12/25/2021 05:52:42 - INFO - codeparrot_training - Step 17228: {'lr': 0.0003688814008209775, 'samples': 8821248, 'steps': 17228, 'batch_loss/train': 0.7903991714119911} 12/25/2021 05:52:52 - INFO - codeparrot_training - Step 17229: {'lr': 0.0003688675337258228, 'samples': 8821760, 'steps': 17229, 'batch_loss/train': 0.7894077077507973} 12/25/2021 05:53:03 - INFO - codeparrot_training - Step 17230: {'lr': 0.00036885366615809574, 'samples': 8822272, 'steps': 17230, 'batch_loss/train': 0.7709722374565899} 12/25/2021 05:53:15 - INFO - codeparrot_training - Step 17231: {'lr': 0.00036883979811785136, 'samples': 8822784, 'steps': 17231, 'batch_loss/train': 0.8407090781256557} 12/25/2021 05:53:26 - INFO - codeparrot_training - Step 17232: {'lr': 0.0003688259296051448, 'samples': 8823296, 'steps': 17232, 'batch_loss/train': 0.7226608591154218} 12/25/2021 05:53:36 - INFO - codeparrot_training - Step 17233: {'lr': 0.0003688120606200312, 'samples': 8823808, 'steps': 17233, 'batch_loss/train': 0.6584227411076427} 12/25/2021 05:53:47 - INFO - codeparrot_training - Step 
17234: {'lr': 0.00036879819116256575, 'samples': 8824320, 'steps': 17234, 'batch_loss/train': 0.7364468658342957} 12/25/2021 05:54:01 - INFO - codeparrot_training - Step 17235: {'lr': 0.0003687843212328035, 'samples': 8824832, 'steps': 17235, 'batch_loss/train': 0.8319362620823085} 12/25/2021 05:54:11 - INFO - codeparrot_training - Step 17236: {'lr': 0.0003687704508307997, 'samples': 8825344, 'steps': 17236, 'batch_loss/train': 0.8988587693311274} 12/25/2021 05:54:22 - INFO - codeparrot_training - Step 17237: {'lr': 0.0003687565799566094, 'samples': 8825856, 'steps': 17237, 'batch_loss/train': 0.8252285839989781} 12/25/2021 05:54:34 - INFO - codeparrot_training - Step 17238: {'lr': 0.00036874270861028777, 'samples': 8826368, 'steps': 17238, 'batch_loss/train': 0.8260625600814819} 12/25/2021 05:54:45 - INFO - codeparrot_training - Step 17239: {'lr': 0.00036872883679189004, 'samples': 8826880, 'steps': 17239, 'batch_loss/train': 0.838415885809809} 12/25/2021 05:54:55 - INFO - codeparrot_training - Step 17240: {'lr': 0.0003687149645014711, 'samples': 8827392, 'steps': 17240, 'batch_loss/train': 0.8419815097004175} 12/25/2021 05:55:09 - INFO - codeparrot_training - Step 17241: {'lr': 0.0003687010917390865, 'samples': 8827904, 'steps': 17241, 'batch_loss/train': 0.8275818657130003} 12/25/2021 05:55:20 - INFO - codeparrot_training - Step 17242: {'lr': 0.00036868721850479106, 'samples': 8828416, 'steps': 17242, 'batch_loss/train': 0.7847740761935711} 12/25/2021 05:55:30 - INFO - codeparrot_training - Step 17243: {'lr': 0.0003686733447986401, 'samples': 8828928, 'steps': 17243, 'batch_loss/train': 0.8192842146381736} 12/25/2021 05:55:41 - INFO - codeparrot_training - Step 17244: {'lr': 0.00036865947062068865, 'samples': 8829440, 'steps': 17244, 'batch_loss/train': 0.7871213546022773} 12/25/2021 05:55:54 - INFO - codeparrot_training - Step 17245: {'lr': 0.00036864559597099203, 'samples': 8829952, 'steps': 17245, 'batch_loss/train': 0.7378698829561472} 12/25/2021 05:56:04 - 
INFO - codeparrot_training - Step 17246: {'lr': 0.0003686317208496053, 'samples': 8830464, 'steps': 17246, 'batch_loss/train': 0.9209167528897524}
12/25/2021 05:56:15 - INFO - codeparrot_training - Step 17247: {'lr': 0.0003686178452565836, 'samples': 8830976, 'steps': 17247, 'batch_loss/train': 0.8059413126902655}
12/25/2021 05:56:27 - INFO - codeparrot_training - Step 17248: {'lr': 0.00036860396919198227, 'samples': 8831488, 'steps': 17248, 'batch_loss/train': 0.8309496017172933}
12/25/2021 05:56:38 - INFO - codeparrot_training - Step 17249: {'lr': 0.0003685900926558562, 'samples': 8832000, 'steps': 17249, 'batch_loss/train': 0.8492163764312863}
12/25/2021 05:56:48 - INFO - codeparrot_training - Step 17250: {'lr': 0.0003685762156482607, 'samples': 8832512, 'steps': 17250, 'batch_loss/train': 0.8308644741773605}
12/25/2021 05:56:59 - INFO - codeparrot_training - Step 17251: {'lr': 0.000368562338169251, 'samples': 8833024, 'steps': 17251, 'batch_loss/train': 0.8423826210200787}
12/25/2021 05:57:12 - INFO - codeparrot_training - Step 17252: {'lr': 0.00036854846021888227, 'samples': 8833536, 'steps': 17252, 'batch_loss/train': 0.8134873230010271}
12/25/2021 05:57:22 - INFO - codeparrot_training - Step 17253: {'lr': 0.0003685345817972095, 'samples': 8834048, 'steps': 17253, 'batch_loss/train': 0.7132485271431506}
12/25/2021 05:57:33 - INFO - codeparrot_training - Step 17254: {'lr': 0.00036852070290428805, 'samples': 8834560, 'steps': 17254, 'batch_loss/train': 0.8196551776491106}
12/25/2021 05:57:45 - INFO - codeparrot_training - Step 17255: {'lr': 0.00036850682354017307, 'samples': 8835072, 'steps': 17255, 'batch_loss/train': 0.8841849556192756}
12/25/2021 05:57:56 - INFO - codeparrot_training - Step 17256: {'lr': 0.0003684929437049196, 'samples': 8835584, 'steps': 17256, 'batch_loss/train': 0.8413330428302288}
12/25/2021 05:58:06 - INFO - codeparrot_training - Step 17257: {'lr': 0.000368479063398583, 'samples': 8836096, 'steps': 17257, 'batch_loss/train': 0.6746061585145071}
12/25/2021 05:58:18 - INFO - codeparrot_training - Step 17258: {'lr': 0.0003684651826212184, 'samples': 8836608, 'steps': 17258, 'batch_loss/train': 0.7552834236994386}
12/25/2021 05:58:29 - INFO - codeparrot_training - Step 17259: {'lr': 0.0003684513013728809, 'samples': 8837120, 'steps': 17259, 'batch_loss/train': 0.777313694357872}
12/25/2021 05:58:40 - INFO - codeparrot_training - Step 17260: {'lr': 0.00036843741965362576, 'samples': 8837632, 'steps': 17260, 'batch_loss/train': 0.7562119869980961}
12/25/2021 05:58:52 - INFO - codeparrot_training - Step 17261: {'lr': 0.00036842353746350826, 'samples': 8838144, 'steps': 17261, 'batch_loss/train': 0.8283456321805716}
12/25/2021 05:59:03 - INFO - codeparrot_training - Step 17262: {'lr': 0.00036840965480258335, 'samples': 8838656, 'steps': 17262, 'batch_loss/train': 0.8363459454849362}
12/25/2021 05:59:14 - INFO - codeparrot_training - Step 17263: {'lr': 0.0003683957716709064, 'samples': 8839168, 'steps': 17263, 'batch_loss/train': 0.9010779391974211}
12/25/2021 05:59:24 - INFO - codeparrot_training - Step 17264: {'lr': 0.0003683818880685327, 'samples': 8839680, 'steps': 17264, 'batch_loss/train': 0.7892210716381669}
12/25/2021 05:59:36 - INFO - codeparrot_training - Step 17265: {'lr': 0.0003683680039955172, 'samples': 8840192, 'steps': 17265, 'batch_loss/train': 0.7559919310733676}
12/25/2021 05:59:47 - INFO - codeparrot_training - Step 17266: {'lr': 0.00036835411945191515, 'samples': 8840704, 'steps': 17266, 'batch_loss/train': 0.781342061702162}
12/25/2021 05:59:57 - INFO - codeparrot_training - Step 17267: {'lr': 0.0003683402344377819, 'samples': 8841216, 'steps': 17267, 'batch_loss/train': 0.8051368105225265}
12/25/2021 06:00:10 - INFO - codeparrot_training - Step 17268: {'lr': 0.00036832634895317255, 'samples': 8841728, 'steps': 17268, 'batch_loss/train': 0.9045368121005595}
12/25/2021 06:00:20 - INFO - codeparrot_training - Step 17269: {'lr': 0.00036831246299814225, 'samples': 8842240, 'steps': 17269, 'batch_loss/train': 0.8010994454380125}
12/25/2021 06:00:31 - INFO - codeparrot_training - Step 17270: {'lr': 0.00036829857657274634, 'samples': 8842752, 'steps': 17270, 'batch_loss/train': 0.9919133512303233}
12/25/2021 06:00:44 - INFO - codeparrot_training - Step 17271: {'lr': 0.00036828468967704, 'samples': 8843264, 'steps': 17271, 'batch_loss/train': 0.6956812762655318}
12/25/2021 06:00:54 - INFO - codeparrot_training - Step 17272: {'lr': 0.0003682708023110783, 'samples': 8843776, 'steps': 17272, 'batch_loss/train': 0.7350250878371298}
12/25/2021 06:01:05 - INFO - codeparrot_training - Step 17273: {'lr': 0.0003682569144749166, 'samples': 8844288, 'steps': 17273, 'batch_loss/train': 0.647052991669625}
12/25/2021 06:01:15 - INFO - codeparrot_training - Step 17274: {'lr': 0.00036824302616861006, 'samples': 8844800, 'steps': 17274, 'batch_loss/train': 0.8193786414340138}
12/25/2021 06:01:27 - INFO - codeparrot_training - Step 17275: {'lr': 0.00036822913739221384, 'samples': 8845312, 'steps': 17275, 'batch_loss/train': 0.8136151032522321}
12/25/2021 06:01:38 - INFO - codeparrot_training - Step 17276: {'lr': 0.0003682152481457833, 'samples': 8845824, 'steps': 17276, 'batch_loss/train': 0.8512329300865531}
12/25/2021 06:01:49 - INFO - codeparrot_training - Step 17277: {'lr': 0.00036820135842937344, 'samples': 8846336, 'steps': 17277, 'batch_loss/train': 0.7783723613247275}
12/25/2021 06:02:01 - INFO - codeparrot_training - Step 17278: {'lr': 0.00036818746824303974, 'samples': 8846848, 'steps': 17278, 'batch_loss/train': 0.776592331007123}
12/25/2021 06:02:11 - INFO - codeparrot_training - Step 17279: {'lr': 0.0003681735775868372, 'samples': 8847360, 'steps': 17279, 'batch_loss/train': 0.9105262234807014}
12/25/2021 06:02:22 - INFO - codeparrot_training - Step 17280: {'lr': 0.00036815968646082125, 'samples': 8847872, 'steps': 17280, 'batch_loss/train': 0.9032530598342419}
12/25/2021 06:02:35 - INFO - codeparrot_training - Step 17281: {'lr': 0.0003681457948650469, 'samples': 8848384, 'steps': 17281, 'batch_loss/train': 0.7415383302140981}
12/25/2021 06:02:45 - INFO - codeparrot_training - Step 17282: {'lr': 0.0003681319027995695, 'samples': 8848896, 'steps': 17282, 'batch_loss/train': 0.8926421753130853}
12/25/2021 06:02:56 - INFO - codeparrot_training - Step 17283: {'lr': 0.0003681180102644443, 'samples': 8849408, 'steps': 17283, 'batch_loss/train': 0.836852471344173}
12/25/2021 06:03:07 - INFO - codeparrot_training - Step 17284: {'lr': 0.0003681041172597264, 'samples': 8849920, 'steps': 17284, 'batch_loss/train': 0.9254018319770694}
12/25/2021 06:03:18 - INFO - codeparrot_training - Step 17285: {'lr': 0.0003680902237854712, 'samples': 8850432, 'steps': 17285, 'batch_loss/train': 0.79186587408185}
12/25/2021 06:03:29 - INFO - codeparrot_training - Step 17286: {'lr': 0.00036807632984173385, 'samples': 8850944, 'steps': 17286, 'batch_loss/train': 0.7811263110488653}
12/25/2021 06:03:40 - INFO - codeparrot_training - Step 17287: {'lr': 0.00036806243542856967, 'samples': 8851456, 'steps': 17287, 'batch_loss/train': 0.7241918281652033}
12/25/2021 06:03:52 - INFO - codeparrot_training - Step 17288: {'lr': 0.0003680485405460337, 'samples': 8851968, 'steps': 17288, 'batch_loss/train': 0.9191487282514572}
12/25/2021 06:04:03 - INFO - codeparrot_training - Step 17289: {'lr': 0.00036803464519418133, 'samples': 8852480, 'steps': 17289, 'batch_loss/train': 0.8602727898396552}
12/25/2021 06:04:14 - INFO - codeparrot_training - Step 17290: {'lr': 0.0003680207493730678, 'samples': 8852992, 'steps': 17290, 'batch_loss/train': 0.9003111515194178}
12/25/2021 06:04:26 - INFO - codeparrot_training - Step 17291: {'lr': 0.00036800685308274836, 'samples': 8853504, 'steps': 17291, 'batch_loss/train': 0.9448501076549292}
12/25/2021 06:04:36 - INFO - codeparrot_training - Step 17292: {'lr': 0.00036799295632327815, 'samples': 8854016, 'steps': 17292, 'batch_loss/train': 0.8012983910739422}
12/25/2021 06:04:47 - INFO - codeparrot_training - Step 17293: {'lr': 0.00036797905909471257, 'samples': 8854528, 'steps': 17293, 'batch_loss/train': 0.892436935682781}
12/25/2021 06:04:59 - INFO - codeparrot_training - Step 17294: {'lr': 0.0003679651613971068, 'samples': 8855040, 'steps': 17294, 'batch_loss/train': 0.8247943129390478}
12/25/2021 06:05:10 - INFO - codeparrot_training - Step 17295: {'lr': 0.00036795126323051604, 'samples': 8855552, 'steps': 17295, 'batch_loss/train': 0.6962253591045737}
12/25/2021 06:05:20 - INFO - codeparrot_training - Step 17296: {'lr': 0.00036793736459499567, 'samples': 8856064, 'steps': 17296, 'batch_loss/train': 0.8119937884621322}
12/25/2021 06:05:31 - INFO - codeparrot_training - Step 17297: {'lr': 0.0003679234654906008, 'samples': 8856576, 'steps': 17297, 'batch_loss/train': 0.8557657487690449}
12/25/2021 06:05:43 - INFO - codeparrot_training - Step 17298: {'lr': 0.00036790956591738677, 'samples': 8857088, 'steps': 17298, 'batch_loss/train': 0.8500382946804166}
12/25/2021 06:05:54 - INFO - codeparrot_training - Step 17299: {'lr': 0.0003678956658754088, 'samples': 8857600, 'steps': 17299, 'batch_loss/train': 0.8305372996255755}
12/25/2021 06:06:04 - INFO - codeparrot_training - Step 17300: {'lr': 0.0003678817653647223, 'samples': 8858112, 'steps': 17300, 'batch_loss/train': 0.769153090659529}
12/25/2021 06:06:17 - INFO - codeparrot_training - Step 17301: {'lr': 0.0003678678643853823, 'samples': 8858624, 'steps': 17301, 'batch_loss/train': 0.8107721908017993}
12/25/2021 06:06:28 - INFO - codeparrot_training - Step 17302: {'lr': 0.0003678539629374442, 'samples': 8859136, 'steps': 17302, 'batch_loss/train': 0.8316402575001121}
12/25/2021 06:06:38 - INFO - codeparrot_training - Step 17303: {'lr': 0.0003678400610209632, 'samples': 8859648, 'steps': 17303, 'batch_loss/train': 0.9490910302847624}
12/25/2021 06:06:50 - INFO - codeparrot_training - Step 17304: {'lr': 0.0003678261586359947, 'samples': 8860160, 'steps': 17304, 'batch_loss/train': 0.7164638976100832}
12/25/2021 06:07:01 - INFO - codeparrot_training - Step 17305: {'lr': 0.00036781225578259386, 'samples': 8860672, 'steps': 17305, 'batch_loss/train': 0.7079206863418221}
12/25/2021 06:07:12 - INFO - codeparrot_training - Step 17306: {'lr': 0.0003677983524608159, 'samples': 8861184, 'steps': 17306, 'batch_loss/train': 0.9506425615400076}
12/25/2021 06:07:22 - INFO - codeparrot_training - Step 17307: {'lr': 0.0003677844486707163, 'samples': 8861696, 'steps': 17307, 'batch_loss/train': 0.8129914905875921}
12/25/2021 06:07:34 - INFO - codeparrot_training - Step 17308: {'lr': 0.00036777054441235006, 'samples': 8862208, 'steps': 17308, 'batch_loss/train': 0.8563566855154932}
12/25/2021 06:07:45 - INFO - codeparrot_training - Step 17309: {'lr': 0.0003677566396857728, 'samples': 8862720, 'steps': 17309, 'batch_loss/train': 0.8674702383577824}
12/25/2021 06:07:55 - INFO - codeparrot_training - Step 17310: {'lr': 0.0003677427344910394, 'samples': 8863232, 'steps': 17310, 'batch_loss/train': 0.8358895787969232}
12/25/2021 06:08:08 - INFO - codeparrot_training - Step 17311: {'lr': 0.00036772882882820545, 'samples': 8863744, 'steps': 17311, 'batch_loss/train': 0.7306287875398993}
12/25/2021 06:08:19 - INFO - codeparrot_training - Step 17312: {'lr': 0.0003677149226973261, 'samples': 8864256, 'steps': 17312, 'batch_loss/train': 0.8093151347711682}
12/25/2021 06:08:29 - INFO - codeparrot_training - Step 17313: {'lr': 0.00036770101609845667, 'samples': 8864768, 'steps': 17313, 'batch_loss/train': 0.8037114106118679}
12/25/2021 06:08:41 - INFO - codeparrot_training - Step 17314: {'lr': 0.0003676871090316524, 'samples': 8865280, 'steps': 17314, 'batch_loss/train': 0.8632569825276732}
12/25/2021 06:08:52 - INFO - codeparrot_training - Step 17315: {'lr': 0.0003676732014969687, 'samples': 8865792, 'steps': 17315, 'batch_loss/train': 0.9286961993202567}
12/25/2021 06:09:02 - INFO - codeparrot_training - Step 17316: {'lr': 0.0003676592934944608, 'samples': 8866304, 'steps': 17316, 'batch_loss/train': 0.8093081833794713}
12/25/2021 06:09:13 - INFO - codeparrot_training - Step 17317: {'lr': 0.00036764538502418386, 'samples': 8866816, 'steps': 17317, 'batch_loss/train': 0.9293639780953526}
12/25/2021 06:09:26 - INFO - codeparrot_training - Step 17318: {'lr': 0.00036763147608619335, 'samples': 8867328, 'steps': 17318, 'batch_loss/train': 0.7940994258970022}
12/25/2021 06:09:36 - INFO - codeparrot_training - Step 17319: {'lr': 0.00036761756668054454, 'samples': 8867840, 'steps': 17319, 'batch_loss/train': 0.7405308838933706}
12/25/2021 06:09:47 - INFO - codeparrot_training - Step 17320: {'lr': 0.0003676036568072927, 'samples': 8868352, 'steps': 17320, 'batch_loss/train': 0.8616985755506903}
12/25/2021 06:09:59 - INFO - codeparrot_training - Step 17321: {'lr': 0.00036758974646649304, 'samples': 8868864, 'steps': 17321, 'batch_loss/train': 0.807452242821455}
12/25/2021 06:10:09 - INFO - codeparrot_training - Step 17322: {'lr': 0.00036757583565820103, 'samples': 8869376, 'steps': 17322, 'batch_loss/train': 0.8254247857257724}
12/25/2021 06:10:20 - INFO - codeparrot_training - Step 17323: {'lr': 0.0003675619243824718, 'samples': 8869888, 'steps': 17323, 'batch_loss/train': 0.7898417352698743}
12/25/2021 06:10:32 - INFO - codeparrot_training - Step 17324: {'lr': 0.00036754801263936084, 'samples': 8870400, 'steps': 17324, 'batch_loss/train': 0.8572938917204738}
12/25/2021 06:10:43 - INFO - codeparrot_training - Step 17325: {'lr': 0.0003675341004289233, 'samples': 8870912, 'steps': 17325, 'batch_loss/train': 0.8903050748631358}
12/25/2021 06:10:53 - INFO - codeparrot_training - Step 17326: {'lr': 0.00036752018775121467, 'samples': 8871424, 'steps': 17326, 'batch_loss/train': 0.7560677705332637}
12/25/2021 06:11:06 - INFO - codeparrot_training - Step 17327: {'lr': 0.00036750627460629, 'samples': 8871936, 'steps': 17327, 'batch_loss/train': 0.7570600626058877}
12/25/2021 06:11:17 - INFO - codeparrot_training - Step 17328: {'lr': 0.0003674923609942049, 'samples': 8872448, 'steps': 17328, 'batch_loss/train': 0.7924507949501276}
12/25/2021 06:11:27 - INFO - codeparrot_training - Step 17329: {'lr': 0.0003674784469150144, 'samples': 8872960, 'steps': 17329, 'batch_loss/train': 0.7820098977535963}
12/25/2021 06:11:38 - INFO - codeparrot_training - Step 17330: {'lr': 0.00036746453236877403, 'samples': 8873472, 'steps': 17330, 'batch_loss/train': 0.7168837389908731}
12/25/2021 06:11:50 - INFO - codeparrot_training - Step 17331: {'lr': 0.000367450617355539, 'samples': 8873984, 'steps': 17331, 'batch_loss/train': 1.0078589647309855}
12/25/2021 06:12:01 - INFO - codeparrot_training - Step 17332: {'lr': 0.00036743670187536464, 'samples': 8874496, 'steps': 17332, 'batch_loss/train': 0.8487690822221339}
12/25/2021 06:12:12 - INFO - codeparrot_training - Step 17333: {'lr': 0.0003674227859283064, 'samples': 8875008, 'steps': 17333, 'batch_loss/train': 0.6980200614780188}
12/25/2021 06:12:24 - INFO - codeparrot_training - Step 17334: {'lr': 0.0003674088695144194, 'samples': 8875520, 'steps': 17334, 'batch_loss/train': 0.8561439355835319}
12/25/2021 06:12:34 - INFO - codeparrot_training - Step 17335: {'lr': 0.0003673949526337591, 'samples': 8876032, 'steps': 17335, 'batch_loss/train': 0.7016499261371791}
12/25/2021 06:12:45 - INFO - codeparrot_training - Step 17336: {'lr': 0.00036738103528638076, 'samples': 8876544, 'steps': 17336, 'batch_loss/train': 0.648280308349058}
12/25/2021 06:12:57 - INFO - codeparrot_training - Step 17337: {'lr': 0.0003673671174723398, 'samples': 8877056, 'steps': 17337, 'batch_loss/train': 0.8724385090172291}
12/25/2021 06:13:07 - INFO - codeparrot_training - Step 17338: {'lr': 0.00036735319919169147, 'samples': 8877568, 'steps': 17338, 'batch_loss/train': 0.8258048743009567}
12/25/2021 06:13:18 - INFO - codeparrot_training - Step 17339: {'lr': 0.00036733928044449114, 'samples': 8878080, 'steps': 17339, 'batch_loss/train': 0.8446066798642278}
12/25/2021 06:13:29 - INFO - codeparrot_training - Step 17340: {'lr': 0.0003673253612307941, 'samples': 8878592, 'steps': 17340, 'batch_loss/train': 0.8150220136158168}
12/25/2021 06:13:41 - INFO - codeparrot_training - Step 17341: {'lr': 0.0003673114415506558, 'samples': 8879104, 'steps': 17341, 'batch_loss/train': 0.7038753349334002}
12/25/2021 06:13:52 - INFO - codeparrot_training - Step 17342: {'lr': 0.0003672975214041314, 'samples': 8879616, 'steps': 17342, 'batch_loss/train': 0.79065768327564}
12/25/2021 06:14:02 - INFO - codeparrot_training - Step 17343: {'lr': 0.0003672836007912764, 'samples': 8880128, 'steps': 17343, 'batch_loss/train': 0.8401235141791403}
12/25/2021 06:14:15 - INFO - codeparrot_training - Step 17344: {'lr': 0.00036726967971214614, 'samples': 8880640, 'steps': 17344, 'batch_loss/train': 0.8591272979974747}
12/25/2021 06:14:25 - INFO - codeparrot_training - Step 17345: {'lr': 0.00036725575816679587, 'samples': 8881152, 'steps': 17345, 'batch_loss/train': 0.7184110749512911}
12/25/2021 06:14:36 - INFO - codeparrot_training - Step 17346: {'lr': 0.00036724183615528095, 'samples': 8881664, 'steps': 17346, 'batch_loss/train': 0.8183328243903816}
12/25/2021 06:14:48 - INFO - codeparrot_training - Step 17347: {'lr': 0.0003672279136776568, 'samples': 8882176, 'steps': 17347, 'batch_loss/train': 0.8509383341297507}
12/25/2021 06:14:59 - INFO - codeparrot_training - Step 17348: {'lr': 0.0003672139907339787, 'samples': 8882688, 'steps': 17348, 'batch_loss/train': 0.7889105910435319}
12/25/2021 06:15:10 - INFO - codeparrot_training - Step 17349: {'lr': 0.00036720006732430204, 'samples': 8883200, 'steps': 17349, 'batch_loss/train': 0.7534478532616049}
12/25/2021 06:15:20 - INFO - codeparrot_training - Step 17350: {'lr': 0.00036718614344868216, 'samples': 8883712, 'steps': 17350, 'batch_loss/train': 0.8664698367938399}
12/25/2021 06:15:32 - INFO - codeparrot_training - Step 17351: {'lr': 0.0003671722191071745, 'samples': 8884224, 'steps': 17351, 'batch_loss/train': 0.7885325280949473}
12/25/2021 06:15:43 - INFO - codeparrot_training - Step 17352: {'lr': 0.00036715829429983426, 'samples': 8884736, 'steps': 17352, 'batch_loss/train': 0.677269549574703}
12/25/2021 06:15:54 - INFO - codeparrot_training - Step 17353: {'lr': 0.0003671443690267168, 'samples': 8885248, 'steps': 17353, 'batch_loss/train': 0.7831704868003726}
12/25/2021 06:16:06 - INFO - codeparrot_training - Step 17354: {'lr': 0.0003671304432878777, 'samples': 8885760, 'steps': 17354, 'batch_loss/train': 0.6443128366954625}
12/25/2021 06:16:16 - INFO - codeparrot_training - Step 17355: {'lr': 0.0003671165170833721, 'samples': 8886272, 'steps': 17355, 'batch_loss/train': 0.8150873696431518}
12/25/2021 06:16:27 - INFO - codeparrot_training - Step 17356: {'lr': 0.0003671025904132555, 'samples': 8886784, 'steps': 17356, 'batch_loss/train': 0.8245222175028175}
12/25/2021 06:16:40 - INFO - codeparrot_training - Step 17357: {'lr': 0.00036708866327758315, 'samples': 8887296, 'steps': 17357, 'batch_loss/train': 0.7705566422082484}
12/25/2021 06:16:50 - INFO - codeparrot_training - Step 17358: {'lr': 0.0003670747356764106, 'samples': 8887808, 'steps': 17358, 'batch_loss/train': 0.6782016432262026}
12/25/2021 06:17:01 - INFO - codeparrot_training - Step 17359: {'lr': 0.000367060807609793, 'samples': 8888320, 'steps': 17359, 'batch_loss/train': 1.1025715228170156}
12/25/2021 06:17:13 - INFO - codeparrot_training - Step 17360: {'lr': 0.0003670468790777859, 'samples': 8888832, 'steps': 17360, 'batch_loss/train': 0.7848197855055332}
12/25/2021 06:17:24 - INFO - codeparrot_training - Step 17361: {'lr': 0.0003670329500804446, 'samples': 8889344, 'steps': 17361, 'batch_loss/train': 0.7547322893515229}
12/25/2021 06:17:34 - INFO - codeparrot_training - Step 17362: {'lr': 0.00036701902061782445, 'samples': 8889856, 'steps': 17362, 'batch_loss/train': 0.7704257583245635}
12/25/2021 06:17:45 - INFO - codeparrot_training - Step 17363: {'lr': 0.00036700509068998086, 'samples': 8890368, 'steps': 17363, 'batch_loss/train': 0.8262847103178501}
12/25/2021 06:17:57 - INFO - codeparrot_training - Step 17364: {'lr': 0.0003669911602969692, 'samples': 8890880, 'steps': 17364, 'batch_loss/train': 0.7317970981821418}
12/25/2021 06:18:07 - INFO - codeparrot_training - Step 17365: {'lr': 0.0003669772294388449, 'samples': 8891392, 'steps': 17365, 'batch_loss/train': 0.8317115046083927}
12/25/2021 06:18:18 - INFO - codeparrot_training - Step 17366: {'lr': 0.0003669632981156632, 'samples': 8891904, 'steps': 17366, 'batch_loss/train': 0.8449640218168497}
12/25/2021 06:18:31 - INFO - codeparrot_training - Step 17367: {'lr': 0.0003669493663274797, 'samples': 8892416, 'steps': 17367, 'batch_loss/train': 0.9459993746131659}
12/25/2021 06:18:41 - INFO - codeparrot_training - Step 17368: {'lr': 0.00036693543407434963, 'samples': 8892928, 'steps': 17368, 'batch_loss/train': 0.7527595912106335}
12/25/2021 06:18:52 - INFO - codeparrot_training - Step 17369: {'lr': 0.0003669215013563284, 'samples': 8893440, 'steps': 17369, 'batch_loss/train': 0.8039830820634961}
12/25/2021 06:19:04 - INFO - codeparrot_training - Step 17370: {'lr': 0.0003669075681734715, 'samples': 8893952, 'steps': 17370, 'batch_loss/train': 0.8196515720337629}
12/25/2021 06:19:15 - INFO - codeparrot_training - Step 17371: {'lr': 0.00036689363452583425, 'samples': 8894464, 'steps': 17371, 'batch_loss/train': 0.7886779754189774}
12/25/2021 06:19:25 - INFO - codeparrot_training - Step 17372: {'lr': 0.00036687970041347204, 'samples': 8894976, 'steps': 17372, 'batch_loss/train': 0.7890283425804228}
12/25/2021 06:19:36 - INFO - codeparrot_training - Step 17373: {'lr': 0.00036686576583644024, 'samples': 8895488, 'steps': 17373, 'batch_loss/train': 0.8475205721333623}
12/25/2021 06:19:49 - INFO - codeparrot_training - Step 17374: {'lr': 0.00036685183079479436, 'samples': 8896000, 'steps': 17374, 'batch_loss/train': 0.8267627423629165}
12/25/2021 06:19:59 - INFO - codeparrot_training - Step 17375: {'lr': 0.00036683789528858964, 'samples': 8896512, 'steps': 17375, 'batch_loss/train': 0.8540192320942879}
12/25/2021 06:20:10 - INFO - codeparrot_training - Step 17376: {'lr': 0.00036682395931788154, 'samples': 8897024, 'steps': 17376, 'batch_loss/train': 0.7792487149126828}
12/25/2021 06:20:22 - INFO - codeparrot_training - Step 17377: {'lr': 0.0003668100228827255, 'samples': 8897536, 'steps': 17377, 'batch_loss/train': 0.879600495332852}
12/25/2021 06:20:32 - INFO - codeparrot_training - Step 17378: {'lr': 0.00036679608598317703, 'samples': 8898048, 'steps': 17378, 'batch_loss/train': 0.7788811279460788}
12/25/2021 06:20:43 - INFO - codeparrot_training - Step 17379: {'lr': 0.00036678214861929126, 'samples': 8898560, 'steps': 17379, 'batch_loss/train': 0.8339707474224269}
12/25/2021 06:20:55 - INFO - codeparrot_training - Step 17380: {'lr': 0.0003667682107911238, 'samples': 8899072, 'steps': 17380, 'batch_loss/train': 0.8452870724722743}
12/25/2021 06:21:06 - INFO - codeparrot_training - Step 17381: {'lr': 0.0003667542724987302, 'samples': 8899584, 'steps': 17381, 'batch_loss/train': 0.8210428059101105}
12/25/2021 06:21:16 - INFO - codeparrot_training - Step 17382: {'lr': 0.0003667403337421654, 'samples': 8900096, 'steps': 17382, 'batch_loss/train': 0.8437640465563163}
12/25/2021 06:21:27 - INFO - codeparrot_training - Step 17383: {'lr': 0.00036672639452148536, 'samples': 8900608, 'steps': 17383, 'batch_loss/train': 0.771002198336646}
12/25/2021 06:21:39 - INFO - codeparrot_training - Step 17384: {'lr': 0.00036671245483674514, 'samples': 8901120, 'steps': 17384, 'batch_loss/train': 0.7774600340053439}
12/25/2021 06:21:50 - INFO - codeparrot_training - Step 17385: {'lr': 0.0003666985146880003, 'samples': 8901632, 'steps': 17385, 'batch_loss/train': 0.8147623077966273}
12/25/2021 06:22:00 - INFO - codeparrot_training - Step 17386: {'lr': 0.0003666845740753062, 'samples': 8902144, 'steps': 17386, 'batch_loss/train': 0.9571607792750001}
12/25/2021 06:22:13 - INFO - codeparrot_training - Step 17387: {'lr': 0.00036667063299871837, 'samples': 8902656, 'steps': 17387, 'batch_loss/train': 0.8104799287393689}
12/25/2021 06:22:23 - INFO - codeparrot_training - Step 17388: {'lr': 0.00036665669145829213, 'samples': 8903168, 'steps': 17388, 'batch_loss/train': 0.7564617437310517}
12/25/2021 06:22:34 - INFO - codeparrot_training - Step 17389: {'lr': 0.0003666427494540828, 'samples': 8903680, 'steps': 17389, 'batch_loss/train': 0.8002532478421926}
12/25/2021 06:22:46 - INFO - codeparrot_training - Step 17390: {'lr': 0.00036662880698614607, 'samples': 8904192, 'steps': 17390, 'batch_loss/train': 0.8140713572502136}
12/25/2021 06:22:57 - INFO - codeparrot_training - Step 17391: {'lr': 0.00036661486405453725, 'samples': 8904704, 'steps': 17391, 'batch_loss/train': 0.7687893072143197}
12/25/2021 06:23:07 - INFO - codeparrot_training - Step 17392: {'lr': 0.0003666009206593117, 'samples': 8905216, 'steps': 17392, 'batch_loss/train': 0.8748242845758796}
12/25/2021 06:23:19 - INFO - codeparrot_training - Step 17393: {'lr': 0.0003665869768005249, 'samples': 8905728, 'steps': 17393, 'batch_loss/train': 0.8501117685809731}
12/25/2021 06:23:30 - INFO - codeparrot_training - Step 17394: {'lr': 0.0003665730324782324, 'samples': 8906240, 'steps': 17394, 'batch_loss/train': 0.8477291194722056}
12/25/2021 06:23:41 - INFO - codeparrot_training - Step 17395: {'lr': 0.00036655908769248947, 'samples': 8906752, 'steps': 17395, 'batch_loss/train': 0.8678005691617727}
12/25/2021 06:23:51 - INFO - codeparrot_training - Step 17396: {'lr': 0.0003665451424433516, 'samples': 8907264, 'steps': 17396, 'batch_loss/train': 0.7923067584633827}
12/25/2021 06:24:04 - INFO - codeparrot_training - Step 17397: {'lr': 0.00036653119673087434, 'samples': 8907776, 'steps': 17397, 'batch_loss/train': 0.7290579518303275}
12/25/2021 06:24:15 - INFO - codeparrot_training - Step 17398: {'lr': 0.00036651725055511297, 'samples': 8908288, 'steps': 17398, 'batch_loss/train': 0.7167352255200967}
12/25/2021 06:24:25 - INFO - codeparrot_training - Step 17399: {'lr': 0.00036650330391612296, 'samples': 8908800, 'steps': 17399, 'batch_loss/train': 0.8150149323046207}
12/25/2021 06:24:37 - INFO - codeparrot_training - Step 17400: {'lr': 0.00036648935681395995, 'samples': 8909312, 'steps': 17400, 'batch_loss/train': 0.8821622189134359}
12/25/2021 06:24:48 - INFO - codeparrot_training - Step 17401: {'lr': 0.00036647540924867917, 'samples': 8909824, 'steps': 17401, 'batch_loss/train': 0.7694982807151973}
12/25/2021 06:24:58 - INFO - codeparrot_training - Step 17402: {'lr': 0.00036646146122033607, 'samples': 8910336, 'steps': 17402, 'batch_loss/train': 0.834856373257935}
12/25/2021 06:25:11 - INFO - codeparrot_training - Step 17403: {'lr': 0.0003664475127289862, 'samples': 8910848, 'steps': 17403, 'batch_loss/train': 0.7484819306991994}
12/25/2021 06:25:22 - INFO - codeparrot_training - Step 17404: {'lr': 0.0003664335637746851, 'samples': 8911360, 'steps': 17404, 'batch_loss/train': 0.7945677083916962}
12/25/2021 06:25:32 - INFO - codeparrot_training - Step 17405: {'lr': 0.000366419614357488, 'samples': 8911872, 'steps': 17405, 'batch_loss/train': 0.8812666898593307}
12/25/2021 06:25:44 - INFO - codeparrot_training - Step 17406: {'lr': 0.0003664056644774505, 'samples': 8912384, 'steps': 17406, 'batch_loss/train': 0.7111737192608416}
12/25/2021 06:25:55 - INFO - codeparrot_training - Step 17407: {'lr': 0.0003663917141346281, 'samples': 8912896, 'steps': 17407, 'batch_loss/train': 0.8680298179388046}
12/25/2021 06:26:06 - INFO - codeparrot_training - Step 17408: {'lr': 0.00036637776332907603, 'samples': 8913408, 'steps': 17408, 'batch_loss/train': 0.8710408583283424}
12/25/2021 06:26:16 - INFO - codeparrot_training - Step 17409: {'lr': 0.00036636381206085007, 'samples': 8913920, 'steps': 17409, 'batch_loss/train': 0.7672727997414768}
12/25/2021 06:26:28 - INFO - codeparrot_training - Step 17410: {'lr': 0.00036634986033000545, 'samples': 8914432, 'steps': 17410, 'batch_loss/train': 0.8321215459145606}
12/25/2021 06:26:39 - INFO - codeparrot_training - Step 17411: {'lr': 0.00036633590813659774, 'samples': 8914944, 'steps': 17411, 'batch_loss/train': 0.7823754716664553}
12/25/2021 06:26:49 - INFO - codeparrot_training - Step 17412: {'lr': 0.0003663219554806824, 'samples': 8915456, 'steps': 17412, 'batch_loss/train': 0.8264126181602478}
12/25/2021 06:27:02 - INFO - codeparrot_training - Step 17413: {'lr': 0.00036630800236231486, 'samples': 8915968, 'steps': 17413, 'batch_loss/train': 0.8462514355778694}
12/25/2021 06:27:13 - INFO - codeparrot_training - Step 17414: {'lr': 0.0003662940487815506, 'samples': 8916480, 'steps': 17414, 'batch_loss/train': 0.6566808423958719}
12/25/2021 06:27:23 - INFO - codeparrot_training - Step 17415: {'lr': 0.0003662800947384452, 'samples': 8916992, 'steps': 17415, 'batch_loss/train': 0.9131703218445182}
12/25/2021 06:27:35 - INFO - codeparrot_training - Step 17416: {'lr': 0.000366266140233054, 'samples': 8917504, 'steps': 17416, 'batch_loss/train': 0.8549003629013896}
12/25/2021 06:27:46 - INFO - codeparrot_training - Step 17417: {'lr': 0.0003662521852654325, 'samples': 8918016, 'steps': 17417, 'batch_loss/train': 0.8635540874674916}
12/25/2021 06:27:57 - INFO - codeparrot_training - Step 17418: {'lr': 0.00036623822983563627, 'samples': 8918528, 'steps': 17418, 'batch_loss/train': 0.8263587718829513}
12/25/2021 06:28:09 - INFO - codeparrot_training - Step 17419: {'lr': 0.00036622427394372064, 'samples': 8919040, 'steps': 17419, 'batch_loss/train': 0.8798807505518198}
12/25/2021 06:28:19 - INFO - codeparrot_training - Step 17420: {'lr': 0.0003662103175897413, 'samples': 8919552, 'steps': 17420, 'batch_loss/train': 0.8186856126412749}
12/25/2021 06:28:30 - INFO - codeparrot_training - Step 17421: {'lr': 0.00036619636077375343, 'samples': 8920064, 'steps': 17421, 'batch_loss/train': 0.8478096220642328}
12/25/2021 06:28:41 - INFO - codeparrot_training - Step 17422: {'lr': 0.0003661824034958129, 'samples': 8920576, 'steps': 17422, 'batch_loss/train': 0.8655264256522059}
12/25/2021 06:28:53 - INFO - codeparrot_training - Step 17423: {'lr': 0.00036616844575597493, 'samples': 8921088, 'steps': 17423, 'batch_loss/train': 0.8330664969980717}
12/25/2021 06:29:04 - INFO - codeparrot_training - Step 17424: {'lr': 0.00036615448755429505, 'samples': 8921600, 'steps': 17424, 'batch_loss/train': 0.841694189235568}
12/25/2021 06:29:14 - INFO - codeparrot_training - Step 17425: {'lr': 0.00036614052889082885, 'samples': 8922112, 'steps': 17425, 'batch_loss/train': 0.7405771990306675}
12/25/2021 06:29:27 - INFO - codeparrot_training - Step 17426: {'lr': 0.00036612656976563174, 'samples': 8922624, 'steps': 17426, 'batch_loss/train': 0.8007215103134513}
12/25/2021 06:29:37 - INFO - codeparrot_training - Step 17427: {'lr': 0.0003661126101787591, 'samples': 8923136, 'steps': 17427, 'batch_loss/train': 0.8317997781559825}
12/25/2021 06:29:48 - INFO - codeparrot_training - Step 17428: {'lr': 0.00036609865013026676, 'samples': 8923648, 'steps': 17428, 'batch_loss/train': 0.6807360252132639}
12/25/2021 06:30:00 - INFO - codeparrot_training - Step 17429: {'lr': 0.0003660846896202099, 'samples': 8924160, 'steps': 17429, 'batch_loss/train': 1.5827063405886292}
12/25/2021 06:30:11 - INFO - codeparrot_training - Step 17430: {'lr': 0.0003660707286486442, 'samples': 8924672, 'steps': 17430, 'batch_loss/train': 0.9250199971720576}
12/25/2021 06:30:21 - INFO - codeparrot_training - Step 17431: {'lr': 0.0003660567672156251, 'samples': 8925184, 'steps': 17431, 'batch_loss/train': 0.7911776434630156}
12/25/2021 06:30:32 - INFO - codeparrot_training - Step 17432: {'lr': 0.0003660428053212082, 'samples': 8925696, 'steps': 17432, 'batch_loss/train': 0.7495663054287434}
12/25/2021 06:30:45 - INFO - codeparrot_training - Step 17433: {'lr': 0.0003660288429654488, 'samples': 8926208, 'steps': 17433, 'batch_loss/train': 0.8315925328060985}
12/25/2021 06:30:55 - INFO - codeparrot_training - Step 17434: {'lr': 0.0003660148801484025, 'samples': 8926720, 'steps': 17434, 'batch_loss/train': 0.8217172026634216}
12/25/2021 06:31:06 - INFO - codeparrot_training - Step 17435: {'lr': 0.0003660009168701249, 'samples': 8927232, 'steps': 17435, 'batch_loss/train': 0.773083757609129}
12/25/2021 06:31:18 - INFO - codeparrot_training - Step 17436: {'lr': 0.00036598695313067156, 'samples': 8927744, 'steps': 17436, 'batch_loss/train': 0.6873998296796344}
12/25/2021 06:31:28 - INFO - codeparrot_training - Step 17437: {'lr': 0.0003659729889300978, 'samples': 8928256, 'steps': 17437, 'batch_loss/train': 0.7631347207352519}
12/25/2021 06:31:39 - INFO - codeparrot_training - Step 17438: {'lr': 0.00036595902426845916, 'samples': 8928768, 'steps': 17438, 'batch_loss/train': 0.8004321996122599}
12/25/2021 06:31:51 - INFO - codeparrot_training - Step 17439: {'lr': 0.00036594505914581136, 'samples': 8929280, 'steps': 17439, 'batch_loss/train': 0.7429881563875824}
12/25/2021 06:32:02 - INFO - codeparrot_training - Step 17440: {'lr': 0.0003659310935622096, 'samples': 8929792, 'steps': 17440, 'batch_loss/train': 0.8600389467319474}
12/25/2021 06:32:12 - INFO - codeparrot_training - Step 17441: {'lr': 0.0003659171275177097, 'samples': 8930304, 'steps': 17441, 'batch_loss/train': 0.7702525313943624}
12/25/2021 06:32:23 - INFO - codeparrot_training - Step 17442: {'lr': 0.0003659031610123671, 'samples': 8930816, 'steps': 17442, 'batch_loss/train': 0.8319746782071888}
12/25/2021 06:32:36 - INFO - codeparrot_training - Step 17443: {'lr': 0.0003658891940462373, 'samples': 8931328, 'steps': 17443, 'batch_loss/train': 0.865491196513176}
12/25/2021 06:32:46 - INFO - codeparrot_training - Step 17444: {'lr': 0.00036587522661937576, 'samples': 8931840, 'steps': 17444, 'batch_loss/train': 0.7817246387712657}
12/25/2021 06:32:57 - INFO - codeparrot_training - Step 17445: {'lr': 0.0003658612587318381, 'samples': 8932352, 'steps': 17445, 'batch_loss/train': 0.7843131460249424}
12/25/2021 06:33:09 - INFO - codeparrot_training - Step 17446: {'lr': 0.0003658472903836798, 'samples': 8932864, 'steps': 17446, 'batch_loss/train': 0.8150057559832931}
12/25/2021 06:33:20 - INFO - codeparrot_training - Step 17447: {'lr': 0.0003658333215749564, 'samples': 8933376, 'steps': 17447, 'batch_loss/train': 0.736543252132833}
12/25/2021 06:33:30 - INFO - codeparrot_training - Step 17448: {'lr': 0.0003658193523057235, 'samples': 8933888, 'steps': 17448, 'batch_loss/train': 0.8731275340542197}
12/25/2021 06:33:42 - INFO - codeparrot_training - Step 17449: {'lr': 0.0003658053825760366, 'samples': 8934400, 'steps': 17449, 'batch_loss/train': 0.7878011409193277}
12/25/2021 06:33:53 - INFO - codeparrot_training - Step 17450: {'lr': 0.00036579141238595115, 'samples': 8934912, 'steps': 17450, 'batch_loss/train': 0.7260540304705501}
12/25/2021 06:34:04 - INFO - codeparrot_training - Step 17451: {'lr': 0.00036577744173552285, 'samples': 8935424, 'steps': 17451, 'batch_loss/train': 0.8659751610830426}
12/25/2021 06:34:16 - INFO - codeparrot_training - Step 17452: {'lr': 0.00036576347062480715, 'samples': 8935936, 'steps': 17452, 'batch_loss/train': 0.7270058700814843}
12/25/2021 06:34:27 - INFO - codeparrot_training - Step 17453: {'lr': 0.0003657494990538595, 'samples': 8936448, 'steps': 17453, 'batch_loss/train': 0.6993022486567497}
12/25/2021 06:34:37 - INFO - codeparrot_training - Step 17454: {'lr': 0.00036573552702273563, 'samples': 8936960, 'steps': 17454, 'batch_loss/train': 0.8078706124797463}
12/25/2021 06:34:48 - INFO - codeparrot_training - Step 17455: {'lr': 0.00036572155453149104, 'samples': 8937472, 'steps': 17455, 'batch_loss/train': 0.713283383869566}
12/25/2021 06:35:00 - INFO - codeparrot_training - Step 17456: {'lr': 0.00036570758158018115, 'samples': 8937984, 'steps': 17456, 'batch_loss/train': 0.70717708254233}
12/25/2021 06:35:11 - INFO - codeparrot_training - Step 17457: {'lr': 0.0003656936081688617, 'samples': 8938496, 'steps': 17457, 'batch_loss/train': 0.6310438867658377}
12/25/2021 06:35:21 - INFO - codeparrot_training - Step 17458: {'lr': 0.000365679634297588, 'samples': 8939008, 'steps': 17458, 'batch_loss/train': 0.7631240736227483}
12/25/2021 06:35:34 - INFO - codeparrot_training - Step 17459: {'lr': 0.00036566565996641587, 'samples': 8939520, 'steps': 17459, 'batch_loss/train': 0.7502413596957922}
12/25/2021 06:35:45 - INFO - codeparrot_training - Step 17460: {'lr': 0.0003656516851754008, 'samples': 8940032, 'steps': 17460, 'batch_loss/train': 0.7712232791818678}
12/25/2021 06:35:55 - INFO - codeparrot_training - Step 17461: {'lr': 0.0003656377099245982, 'samples': 8940544, 'steps': 17461, 'batch_loss/train': 0.8671118777710944}
12/25/2021 06:36:07 - INFO - codeparrot_training - Step 17462: {'lr': 0.00036562373421406377, 'samples': 8941056, 'steps': 17462, 'batch_loss/train': 0.794335660059005}
12/25/2021 06:36:18 - INFO - codeparrot_training - Step 17463: {'lr': 0.000365609758043853, 'samples': 8941568, 'steps': 17463, 'batch_loss/train': 0.8260528426617384}
12/25/2021 06:36:29 - INFO - codeparrot_training - Step 17464: {'lr': 0.0003655957814140215, 'samples': 8942080, 'steps': 17464, 'batch_loss/train': 0.8164424132555723}
12/25/2021 06:36:39 - INFO - codeparrot_training - Step 17465: {'lr': 0.00036558180432462483, 'samples': 8942592, 'steps': 17465, 'batch_loss/train': 0.830485075712204}
12/25/2021 06:36:52 - INFO - codeparrot_training - Step 17466: {'lr': 0.00036556782677571854, 'samples': 8943104, 'steps': 17466, 'batch_loss/train': 0.857560420408845}
12/25/2021 06:37:02 - INFO - codeparrot_training - Step 17467: {'lr': 0.0003655538487673582, 'samples': 8943616, 'steps': 17467, 'batch_loss/train': 0.6089750733226538}
12/25/2021 06:37:13 - INFO - codeparrot_training - Step 17468: {'lr': 0.0003655398702995994, 'samples': 8944128, 'steps': 17468, 'batch_loss/train': 0.7589163966476917}
12/25/2021 06:37:25 - INFO - codeparrot_training - Step 17469: {'lr': 0.0003655258913724977, 'samples': 8944640, 'steps': 17469, 'batch_loss/train': 0.77393215498887}
12/25/2021 06:37:35 - INFO - codeparrot_training - Step 17470: {'lr': 0.00036551191198610865, 'samples': 8945152, 'steps': 17470, 'batch_loss/train': 0.8750169444829226}
12/25/2021 06:37:46 - INFO - codeparrot_training - Step 17471: {'lr': 0.0003654979321404879, 'samples': 8945664, 'steps': 17471, 'batch_loss/train': 0.7714402242563665}
12/25/2021 06:37:59 - INFO - codeparrot_training - Step 17472: {'lr': 0.00036548395183569094, 'samples': 8946176, 'steps': 17472, 'batch_loss/train': 0.7904062662273645}
12/25/2021 06:38:10 - INFO - codeparrot_training - Step 17473: {'lr': 0.00036546997107177344, 'samples': 8946688, 'steps': 17473, 'batch_loss/train': 0.8401390672661364}
12/25/2021 06:38:20 - INFO - codeparrot_training - Step 17474: {'lr': 0.0003654559898487909, 'samples': 8947200, 'steps': 17474, 'batch_loss/train': 0.7598392039071769}
12/25/2021 06:38:32 - INFO - codeparrot_training - Step 17475: {'lr': 0.00036544200816679894, 'samples': 8947712, 'steps': 17475, 'batch_loss/train': 0.8148228228092194}
12/25/2021 06:38:43 - INFO - codeparrot_training - Step 17476: {'lr': 0.0003654280260258532, 'samples': 8948224, 'steps': 17476, 'batch_loss/train': 0.7977593307732604}
12/25/2021 06:38:54 - INFO - codeparrot_training - Step 17477: {'lr': 0.00036541404342600915, 'samples': 8948736, 'steps': 17477, 'batch_loss/train': 0.8087447155267}
12/25/2021 06:39:04 - INFO - codeparrot_training - Step 17478: {'lr': 0.0003654000603673224, 'samples': 8949248, 'steps': 17478, 'batch_loss/train': 0.8449163034092635}
12/25/2021 06:39:16 - INFO - codeparrot_training - Step 17479: {'lr': 0.00036538607684984863, 'samples': 8949760, 'steps': 17479, 'batch_loss/train': 0.7880113758146763}
12/25/2021 06:39:27 - INFO - codeparrot_training - Step 17480: {'lr': 0.00036537209287364344, 'samples': 8950272, 'steps': 17480, 'batch_loss/train': 0.7587065203115344}
12/25/2021 06:39:37 - INFO - codeparrot_training - Step 17481: {'lr': 0.0003653581084387624, 'samples': 8950784, 'steps': 17481, 'batch_loss/train': 0.9208804350346327}
12/25/2021 06:39:50 - INFO - codeparrot_training - Step 17482: {'lr': 0.0003653441235452609, 'samples': 8951296, 'steps': 17482, 'batch_loss/train': 0.8329276763834059}
12/25/2021 06:40:01 - INFO - codeparrot_training - Step 17483: {'lr': 0.00036533013819319487, 'samples': 8951808, 'steps': 17483, 'batch_loss/train': 0.7580092360731214}
12/25/2021 06:40:11 - INFO - codeparrot_training - Step 17484: {'lr': 0.00036531615238261973, 'samples': 8952320, 'steps': 17484, 'batch_loss/train': 0.9084412958472967}
12/25/2021 06:40:24 - INFO - codeparrot_training - Step 17485: {'lr': 0.00036530216611359106, 'samples': 8952832, 'steps': 17485, 'batch_loss/train': 0.7212890683440492}
12/25/2021 06:40:35 - INFO - codeparrot_training - Step 17486: {'lr': 0.0003652881793861645, 'samples': 8953344, 'steps': 17486, 'batch_loss/train': 0.8253106465563178}
12/25/2021 06:40:45 - INFO - codeparrot_training - Step 17487: {'lr': 0.0003652741922003958, 'samples': 8953856, 'steps': 17487, 'batch_loss/train': 0.7835287684574723}
12/25/2021 06:40:56 - INFO - codeparrot_training - Step 17488: {'lr': 0.00036526020455634035, 'samples': 8954368, 'steps': 17488, 'batch_loss/train': 0.8386829970404506}
12/25/2021 06:41:08 - INFO - codeparrot_training - Step 17489: {'lr': 0.0003652462164540538, 'samples': 8954880, 'steps': 17489, 'batch_loss/train': 0.8457741970196366}
12/25/2021 06:41:19 - INFO - codeparrot_training - Step 17490: {'lr': 0.0003652322278935919, 'samples': 8955392, 'steps': 17490, 'batch_loss/train': 0.8108110325410962}
12/25/2021 06:41:30 - INFO - codeparrot_training - Step 17491: {'lr': 0.0003652182388750101, 'samples': 8955904, 'steps': 17491, 'batch_loss/train': 0.6465431621181779}
12/25/2021 06:41:42 - INFO - codeparrot_training - Step 17492: {'lr': 0.0003652042493983641, 'samples': 8956416, 'steps': 17492, 'batch_loss/train': 
0.8066728254780173} 12/25/2021 06:41:53 - INFO - codeparrot_training - Step 17493: {'lr': 0.0003651902594637095, 'samples': 8956928, 'steps': 17493, 'batch_loss/train': 1.004271681420505} 12/25/2021 06:42:03 - INFO - codeparrot_training - Step 17494: {'lr': 0.00036517626907110203, 'samples': 8957440, 'steps': 17494, 'batch_loss/train': 0.6365663467440754} 12/25/2021 06:42:14 - INFO - codeparrot_training - Step 17495: {'lr': 0.000365162278220597, 'samples': 8957952, 'steps': 17495, 'batch_loss/train': 0.8705560937523842} 12/25/2021 06:42:26 - INFO - codeparrot_training - Step 17496: {'lr': 0.00036514828691225045, 'samples': 8958464, 'steps': 17496, 'batch_loss/train': 0.8482341710478067} 12/25/2021 06:42:36 - INFO - codeparrot_training - Step 17497: {'lr': 0.0003651342951461177, 'samples': 8958976, 'steps': 17497, 'batch_loss/train': 0.8688390105962753} 12/25/2021 06:42:47 - INFO - codeparrot_training - Step 17498: {'lr': 0.0003651203029222544, 'samples': 8959488, 'steps': 17498, 'batch_loss/train': 0.8372709983959794} 12/25/2021 06:43:00 - INFO - codeparrot_training - Step 17499: {'lr': 0.0003651063102407163, 'samples': 8960000, 'steps': 17499, 'batch_loss/train': 0.8981794789433479} 12/25/2021 06:43:00 - INFO - codeparrot_training - Evaluating and saving model checkpoint 12/25/2021 06:46:22 - INFO - codeparrot_training - Step 17500: {'loss/eval': 0.8102298974990845, 'perplexity': 2.248424768447876} 12/25/2021 06:46:39 - WARNING - huggingface_hub.repository - Several commits (5) will be pushed upstream. 
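The evaluation record at Step 17500 above reports `loss/eval` and `perplexity` together. For a causal language model, perplexity is the exponential of the mean cross-entropy loss (in nats), and the two logged numbers are consistent with that relationship. A minimal check, using the values copied from the record above:

```python
import math

# Values taken verbatim from the Step 17500 evaluation record in the log.
eval_loss = 0.8102298974990845
logged_perplexity = 2.248424768447876

# Perplexity of a language model is exp(mean cross-entropy loss in nats).
perplexity = math.exp(eval_loss)

# The computed value agrees with the logged one to floating-point precision.
assert math.isclose(perplexity, logged_perplexity, rel_tol=1e-6)
```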
12/25/2021 06:46:52 - INFO - codeparrot_training - Step 17500: {'lr': 0.000365092317101559, 'samples': 8960512, 'steps': 17500, 'batch_loss/train': 0.8307197419926524} 12/25/2021 06:47:03 - INFO - codeparrot_training - Step 17501: {'lr': 0.0003650783235048381, 'samples': 8961024, 'steps': 17501, 'batch_loss/train': 0.9144590720534325} 12/25/2021 06:47:15 - INFO - codeparrot_training - Step 17502: {'lr': 0.00036506432945060917, 'samples': 8961536, 'steps': 17502, 'batch_loss/train': 0.7914969539269805} 12/25/2021 06:47:25 - INFO - codeparrot_training - Step 17503: {'lr': 0.00036505033493892795, 'samples': 8962048, 'steps': 17503, 'batch_loss/train': 0.7712360359728336} 12/25/2021 06:47:36 - INFO - codeparrot_training - Step 17504: {'lr': 0.0003650363399698501, 'samples': 8962560, 'steps': 17504, 'batch_loss/train': 0.8089114557951689} 12/25/2021 06:47:47 - INFO - codeparrot_training - Step 17505: {'lr': 0.00036502234454343106, 'samples': 8963072, 'steps': 17505, 'batch_loss/train': 0.8838100219145417} 12/25/2021 06:47:59 - INFO - codeparrot_training - Step 17506: {'lr': 0.00036500834865972674, 'samples': 8963584, 'steps': 17506, 'batch_loss/train': 0.9241408538073301} 12/25/2021 06:48:09 - INFO - codeparrot_training - Step 17507: {'lr': 0.0003649943523187926, 'samples': 8964096, 'steps': 17507, 'batch_loss/train': 0.5533155215671286} 12/25/2021 06:48:20 - INFO - codeparrot_training - Step 17508: {'lr': 0.00036498035552068434, 'samples': 8964608, 'steps': 17508, 'batch_loss/train': 0.8114378890022635} 12/25/2021 06:48:32 - INFO - codeparrot_training - Step 17509: {'lr': 0.00036496635826545765, 'samples': 8965120, 'steps': 17509, 'batch_loss/train': 0.8427792554721236} 12/25/2021 06:48:43 - INFO - codeparrot_training - Step 17510: {'lr': 0.00036495236055316804, 'samples': 8965632, 'steps': 17510, 'batch_loss/train': 0.8862853366881609} 12/25/2021 06:48:53 - INFO - codeparrot_training - Step 17511: {'lr': 0.0003649383623838713, 'samples': 8966144, 'steps': 17511, 
'batch_loss/train': 0.8413079120218754} 12/25/2021 06:49:06 - INFO - codeparrot_training - Step 17512: {'lr': 0.000364924363757623, 'samples': 8966656, 'steps': 17512, 'batch_loss/train': 0.8658405407331884} 12/25/2021 06:49:17 - INFO - codeparrot_training - Step 17513: {'lr': 0.0003649103646744789, 'samples': 8967168, 'steps': 17513, 'batch_loss/train': 0.7635018015280366} 12/25/2021 06:49:27 - INFO - codeparrot_training - Step 17514: {'lr': 0.00036489636513449455, 'samples': 8967680, 'steps': 17514, 'batch_loss/train': 0.8171255439519882} 12/25/2021 06:49:39 - INFO - codeparrot_training - Step 17515: {'lr': 0.00036488236513772547, 'samples': 8968192, 'steps': 17515, 'batch_loss/train': 0.7920855414122343} 12/25/2021 06:49:50 - INFO - codeparrot_training - Step 17516: {'lr': 0.00036486836468422766, 'samples': 8968704, 'steps': 17516, 'batch_loss/train': 0.8362925872206688} 12/25/2021 06:50:00 - INFO - codeparrot_training - Step 17517: {'lr': 0.0003648543637740566, 'samples': 8969216, 'steps': 17517, 'batch_loss/train': 0.8468721536919475} 12/25/2021 06:50:11 - INFO - codeparrot_training - Step 17518: {'lr': 0.00036484036240726794, 'samples': 8969728, 'steps': 17518, 'batch_loss/train': 0.7744992033112794} 12/25/2021 06:50:24 - INFO - codeparrot_training - Step 17519: {'lr': 0.00036482636058391733, 'samples': 8970240, 'steps': 17519, 'batch_loss/train': 0.7002326406072825} 12/25/2021 06:50:34 - INFO - codeparrot_training - Step 17520: {'lr': 0.0003648123583040605, 'samples': 8970752, 'steps': 17520, 'batch_loss/train': 0.8811657819896936} 12/25/2021 06:50:45 - INFO - codeparrot_training - Step 17521: {'lr': 0.00036479835556775303, 'samples': 8971264, 'steps': 17521, 'batch_loss/train': 0.78761544264853} 12/25/2021 06:50:57 - INFO - codeparrot_training - Step 17522: {'lr': 0.0003647843523750507, 'samples': 8971776, 'steps': 17522, 'batch_loss/train': 0.8567458372563124} 12/25/2021 06:51:08 - INFO - codeparrot_training - Step 17523: {'lr': 0.0003647703487260091, 
'samples': 8972288, 'steps': 17523, 'batch_loss/train': 0.7463348587043583} 12/25/2021 06:51:18 - INFO - codeparrot_training - Step 17524: {'lr': 0.000364756344620684, 'samples': 8972800, 'steps': 17524, 'batch_loss/train': 0.8922322941944003} 12/25/2021 06:51:31 - INFO - codeparrot_training - Step 17525: {'lr': 0.00036474234005913085, 'samples': 8973312, 'steps': 17525, 'batch_loss/train': 0.8088172436691821} 12/25/2021 06:51:41 - INFO - codeparrot_training - Step 17526: {'lr': 0.0003647283350414056, 'samples': 8973824, 'steps': 17526, 'batch_loss/train': 1.2374369511380792} 12/25/2021 06:51:52 - INFO - codeparrot_training - Step 17527: {'lr': 0.0003647143295675638, 'samples': 8974336, 'steps': 17527, 'batch_loss/train': 0.7628041077405214} 12/25/2021 06:52:02 - INFO - codeparrot_training - Step 17528: {'lr': 0.0003647003236376611, 'samples': 8974848, 'steps': 17528, 'batch_loss/train': 0.8127096234820783} 12/25/2021 06:52:15 - INFO - codeparrot_training - Step 17529: {'lr': 0.00036468631725175327, 'samples': 8975360, 'steps': 17529, 'batch_loss/train': 0.6570054669864476} 12/25/2021 06:52:26 - INFO - codeparrot_training - Step 17530: {'lr': 0.0003646723104098959, 'samples': 8975872, 'steps': 17530, 'batch_loss/train': 0.75329699809663} 12/25/2021 06:52:37 - INFO - codeparrot_training - Step 17531: {'lr': 0.0003646583031121448, 'samples': 8976384, 'steps': 17531, 'batch_loss/train': 0.8538977205753326} 12/25/2021 06:52:49 - INFO - codeparrot_training - Step 17532: {'lr': 0.0003646442953585555, 'samples': 8976896, 'steps': 17532, 'batch_loss/train': 0.7317030155099928} 12/25/2021 06:52:59 - INFO - codeparrot_training - Step 17533: {'lr': 0.0003646302871491838, 'samples': 8977408, 'steps': 17533, 'batch_loss/train': 0.7808196181431413} 12/25/2021 06:53:10 - INFO - codeparrot_training - Step 17534: {'lr': 0.0003646162784840853, 'samples': 8977920, 'steps': 17534, 'batch_loss/train': 0.7623348142951727} 12/25/2021 06:53:22 - INFO - codeparrot_training - Step 17535: 
{'lr': 0.0003646022693633159, 'samples': 8978432, 'steps': 17535, 'batch_loss/train': 0.7366389683447778} 12/25/2021 06:53:33 - INFO - codeparrot_training - Step 17536: {'lr': 0.000364588259786931, 'samples': 8978944, 'steps': 17536, 'batch_loss/train': 0.8142022858373821} 12/25/2021 06:53:43 - INFO - codeparrot_training - Step 17537: {'lr': 0.0003645742497549865, 'samples': 8979456, 'steps': 17537, 'batch_loss/train': 0.7674435074441135} 12/25/2021 06:53:54 - INFO - codeparrot_training - Step 17538: {'lr': 0.000364560239267538, 'samples': 8979968, 'steps': 17538, 'batch_loss/train': 0.8256448516622186} 12/25/2021 06:54:08 - INFO - codeparrot_training - Step 17539: {'lr': 0.00036454622832464135, 'samples': 8980480, 'steps': 17539, 'batch_loss/train': 0.8501670202240348} 12/25/2021 06:54:19 - INFO - codeparrot_training - Step 17540: {'lr': 0.00036453221692635203, 'samples': 8980992, 'steps': 17540, 'batch_loss/train': 0.7851654877886176} 12/25/2021 06:54:29 - INFO - codeparrot_training - Step 17541: {'lr': 0.0003645182050727259, 'samples': 8981504, 'steps': 17541, 'batch_loss/train': 0.7624305402860045} 12/25/2021 06:54:41 - INFO - codeparrot_training - Step 17542: {'lr': 0.00036450419276381867, 'samples': 8982016, 'steps': 17542, 'batch_loss/train': 0.8506679404526949} 12/25/2021 06:54:52 - INFO - codeparrot_training - Step 17543: {'lr': 0.00036449017999968603, 'samples': 8982528, 'steps': 17543, 'batch_loss/train': 0.7728010513819754} 12/25/2021 06:55:03 - INFO - codeparrot_training - Step 17544: {'lr': 0.0003644761667803836, 'samples': 8983040, 'steps': 17544, 'batch_loss/train': 0.7885484900325537} 12/25/2021 06:55:17 - INFO - codeparrot_training - Step 17545: {'lr': 0.00036446215310596716, 'samples': 8983552, 'steps': 17545, 'batch_loss/train': 0.8060733275488019} 12/25/2021 06:55:27 - INFO - codeparrot_training - Step 17546: {'lr': 0.00036444813897649244, 'samples': 8984064, 'steps': 17546, 'batch_loss/train': 0.6928237667307258} 12/25/2021 06:55:38 - INFO - 
codeparrot_training - Step 17547: {'lr': 0.00036443412439201504, 'samples': 8984576, 'steps': 17547, 'batch_loss/train': 0.709927087649703} 12/25/2021 06:55:48 - INFO - codeparrot_training - Step 17548: {'lr': 0.00036442010935259096, 'samples': 8985088, 'steps': 17548, 'batch_loss/train': 0.7909206310287118} 12/25/2021 06:56:01 - INFO - codeparrot_training - Step 17549: {'lr': 0.0003644060938582756, 'samples': 8985600, 'steps': 17549, 'batch_loss/train': 0.7729290095157921} 12/25/2021 06:56:11 - INFO - codeparrot_training - Step 17550: {'lr': 0.0003643920779091248, 'samples': 8986112, 'steps': 17550, 'batch_loss/train': 0.7600880955578759} 12/25/2021 06:56:22 - INFO - codeparrot_training - Step 17551: {'lr': 0.00036437806150519433, 'samples': 8986624, 'steps': 17551, 'batch_loss/train': 0.6122954800957814} 12/25/2021 06:56:34 - INFO - codeparrot_training - Step 17552: {'lr': 0.0003643640446465399, 'samples': 8987136, 'steps': 17552, 'batch_loss/train': 0.7316226824186742} 12/25/2021 06:56:45 - INFO - codeparrot_training - Step 17553: {'lr': 0.00036435002733321717, 'samples': 8987648, 'steps': 17553, 'batch_loss/train': 0.6844443837180734} 12/25/2021 06:56:55 - INFO - codeparrot_training - Step 17554: {'lr': 0.0003643360095652819, 'samples': 8988160, 'steps': 17554, 'batch_loss/train': 0.8313455134630203} 12/25/2021 06:57:09 - INFO - codeparrot_training - Step 17555: {'lr': 0.0003643219913427899, 'samples': 8988672, 'steps': 17555, 'batch_loss/train': 0.833085217513144} 12/25/2021 06:57:20 - INFO - codeparrot_training - Step 17556: {'lr': 0.0003643079726657967, 'samples': 8989184, 'steps': 17556, 'batch_loss/train': 0.7757612625136971} 12/25/2021 06:57:30 - INFO - codeparrot_training - Step 17557: {'lr': 0.00036429395353435826, 'samples': 8989696, 'steps': 17557, 'batch_loss/train': 0.7181702015222982} 12/25/2021 06:57:43 - INFO - codeparrot_training - Step 17558: {'lr': 0.0003642799339485301, 'samples': 8990208, 'steps': 17558, 'batch_loss/train': 
0.7596110142767429} 12/25/2021 06:57:53 - INFO - codeparrot_training - Step 17559: {'lr': 0.00036426591390836826, 'samples': 8990720, 'steps': 17559, 'batch_loss/train': 1.07494986243546} 12/25/2021 06:58:04 - INFO - codeparrot_training - Step 17560: {'lr': 0.0003642518934139282, 'samples': 8991232, 'steps': 17560, 'batch_loss/train': 0.7112321534659714} 12/25/2021 06:58:14 - INFO - codeparrot_training - Step 17561: {'lr': 0.0003642378724652657, 'samples': 8991744, 'steps': 17561, 'batch_loss/train': 0.7707404792308807} 12/25/2021 06:58:26 - INFO - codeparrot_training - Step 17562: {'lr': 0.0003642238510624366, 'samples': 8992256, 'steps': 17562, 'batch_loss/train': 0.7842715652659535} 12/25/2021 06:58:37 - INFO - codeparrot_training - Step 17563: {'lr': 0.00036420982920549663, 'samples': 8992768, 'steps': 17563, 'batch_loss/train': 0.8554857429116964} 12/25/2021 06:58:48 - INFO - codeparrot_training - Step 17564: {'lr': 0.00036419580689450146, 'samples': 8993280, 'steps': 17564, 'batch_loss/train': 0.7414386430755258} 12/25/2021 06:59:01 - INFO - codeparrot_training - Step 17565: {'lr': 0.00036418178412950687, 'samples': 8993792, 'steps': 17565, 'batch_loss/train': 0.8422224130481482} 12/25/2021 06:59:12 - INFO - codeparrot_training - Step 17566: {'lr': 0.00036416776091056864, 'samples': 8994304, 'steps': 17566, 'batch_loss/train': 0.7690241551026702} 12/25/2021 06:59:23 - INFO - codeparrot_training - Step 17567: {'lr': 0.00036415373723774246, 'samples': 8994816, 'steps': 17567, 'batch_loss/train': 0.6803681775927544} 12/25/2021 06:59:35 - INFO - codeparrot_training - Step 17568: {'lr': 0.0003641397131110842, 'samples': 8995328, 'steps': 17568, 'batch_loss/train': 0.8099179090932012} 12/25/2021 06:59:45 - INFO - codeparrot_training - Step 17569: {'lr': 0.0003641256885306495, 'samples': 8995840, 'steps': 17569, 'batch_loss/train': 0.6165088304551318} 12/25/2021 06:59:56 - INFO - codeparrot_training - Step 17570: {'lr': 0.00036411166349649415, 'samples': 8996352, 
'steps': 17570, 'batch_loss/train': 0.6918530219700187} 12/25/2021 07:00:06 - INFO - codeparrot_training - Step 17571: {'lr': 0.00036409763800867396, 'samples': 8996864, 'steps': 17571, 'batch_loss/train': 0.7085294311400503} 12/25/2021 07:00:18 - INFO - codeparrot_training - Step 17572: {'lr': 0.0003640836120672446, 'samples': 8997376, 'steps': 17572, 'batch_loss/train': 0.7883330322802067} 12/25/2021 07:00:29 - INFO - codeparrot_training - Step 17573: {'lr': 0.00036406958567226184, 'samples': 8997888, 'steps': 17573, 'batch_loss/train': 0.64566471404396} 12/25/2021 07:00:40 - INFO - codeparrot_training - Step 17574: {'lr': 0.00036405555882378153, 'samples': 8998400, 'steps': 17574, 'batch_loss/train': 0.7790735820308328} 12/25/2021 07:00:52 - INFO - codeparrot_training - Step 17575: {'lr': 0.00036404153152185936, 'samples': 8998912, 'steps': 17575, 'batch_loss/train': 0.7942043663933873} 12/25/2021 07:01:02 - INFO - codeparrot_training - Step 17576: {'lr': 0.00036402750376655116, 'samples': 8999424, 'steps': 17576, 'batch_loss/train': 0.7676638141274452} 12/25/2021 07:01:13 - INFO - codeparrot_training - Step 17577: {'lr': 0.00036401347555791266, 'samples': 8999936, 'steps': 17577, 'batch_loss/train': 0.7898757169023156} 12/25/2021 07:01:27 - INFO - codeparrot_training - Step 17578: {'lr': 0.0003639994468959996, 'samples': 9000448, 'steps': 17578, 'batch_loss/train': 0.7626757230609655} 12/25/2021 07:01:37 - INFO - codeparrot_training - Step 17579: {'lr': 0.00036398541778086773, 'samples': 9000960, 'steps': 17579, 'batch_loss/train': 0.7332722833380103} 12/25/2021 07:01:48 - INFO - codeparrot_training - Step 17580: {'lr': 0.0003639713882125729, 'samples': 9001472, 'steps': 17580, 'batch_loss/train': 0.7381465600337833} 12/25/2021 07:02:00 - INFO - codeparrot_training - Step 17581: {'lr': 0.000363957358191171, 'samples': 9001984, 'steps': 17581, 'batch_loss/train': 0.6689006948145106} 12/25/2021 07:02:11 - INFO - codeparrot_training - Step 17582: {'lr': 
0.00036394332771671757, 'samples': 9002496, 'steps': 17582, 'batch_loss/train': 0.7835452607832849} 12/25/2021 07:02:21 - INFO - codeparrot_training - Step 17583: {'lr': 0.00036392929678926843, 'samples': 9003008, 'steps': 17583, 'batch_loss/train': 0.7601524055935442} 12/25/2021 07:02:32 - INFO - codeparrot_training - Step 17584: {'lr': 0.0003639152654088795, 'samples': 9003520, 'steps': 17584, 'batch_loss/train': 0.8712026868015528} 12/25/2021 07:02:46 - INFO - codeparrot_training - Step 17585: {'lr': 0.00036390123357560647, 'samples': 9004032, 'steps': 17585, 'batch_loss/train': 0.7415044941008091} 12/25/2021 07:02:56 - INFO - codeparrot_training - Step 17586: {'lr': 0.00036388720128950516, 'samples': 9004544, 'steps': 17586, 'batch_loss/train': 0.7359963285271078} 12/25/2021 07:03:07 - INFO - codeparrot_training - Step 17587: {'lr': 0.00036387316855063137, 'samples': 9005056, 'steps': 17587, 'batch_loss/train': 0.7464203489944339} 12/25/2021 07:03:19 - INFO - codeparrot_training - Step 17588: {'lr': 0.00036385913535904083, 'samples': 9005568, 'steps': 17588, 'batch_loss/train': 0.8501342721283436} 12/25/2021 07:03:30 - INFO - codeparrot_training - Step 17589: {'lr': 0.0003638451017147892, 'samples': 9006080, 'steps': 17589, 'batch_loss/train': 0.8405546294525266} 12/25/2021 07:03:40 - INFO - codeparrot_training - Step 17590: {'lr': 0.0003638310676179327, 'samples': 9006592, 'steps': 17590, 'batch_loss/train': 0.7981152031570673} 12/25/2021 07:03:52 - INFO - codeparrot_training - Step 17591: {'lr': 0.0003638170330685267, 'samples': 9007104, 'steps': 17591, 'batch_loss/train': 0.7595372861251235} 12/25/2021 07:04:03 - INFO - codeparrot_training - Step 17592: {'lr': 0.0003638029980666272, 'samples': 9007616, 'steps': 17592, 'batch_loss/train': 0.8733402779325843} 12/25/2021 07:04:13 - INFO - codeparrot_training - Step 17593: {'lr': 0.0003637889626122899, 'samples': 9008128, 'steps': 17593, 'batch_loss/train': 0.8103013224899769} 12/25/2021 07:04:24 - INFO - 
codeparrot_training - Step 17594: {'lr': 0.0003637749267055707, 'samples': 9008640, 'steps': 17594, 'batch_loss/train': 0.7836054358631372} 12/25/2021 07:04:38 - INFO - codeparrot_training - Step 17595: {'lr': 0.0003637608903465254, 'samples': 9009152, 'steps': 17595, 'batch_loss/train': 0.5463819614378735} 12/25/2021 07:04:49 - INFO - codeparrot_training - Step 17596: {'lr': 0.00036374685353520966, 'samples': 9009664, 'steps': 17596, 'batch_loss/train': 0.8394410097971559} 12/25/2021 07:04:59 - INFO - codeparrot_training - Step 17597: {'lr': 0.0003637328162716794, 'samples': 9010176, 'steps': 17597, 'batch_loss/train': 0.5892921526101418} 12/25/2021 07:05:11 - INFO - codeparrot_training - Step 17598: {'lr': 0.0003637187785559904, 'samples': 9010688, 'steps': 17598, 'batch_loss/train': 0.7683813683688641} 12/25/2021 07:05:22 - INFO - codeparrot_training - Step 17599: {'lr': 0.00036370474038819846, 'samples': 9011200, 'steps': 17599, 'batch_loss/train': 0.7961241444572806} 12/25/2021 07:05:33 - INFO - codeparrot_training - Step 17600: {'lr': 0.0003636907017683595, 'samples': 9011712, 'steps': 17600, 'batch_loss/train': 0.7746596592478454} 12/25/2021 07:05:45 - INFO - codeparrot_training - Step 17601: {'lr': 0.00036367666269652917, 'samples': 9012224, 'steps': 17601, 'batch_loss/train': 0.526436266489327} 12/25/2021 07:05:56 - INFO - codeparrot_training - Step 17602: {'lr': 0.0003636626231727633, 'samples': 9012736, 'steps': 17602, 'batch_loss/train': 0.7766845347359776} 12/25/2021 07:06:06 - INFO - codeparrot_training - Step 17603: {'lr': 0.0003636485831971179, 'samples': 9013248, 'steps': 17603, 'batch_loss/train': 0.5968628613627516} 12/25/2021 07:06:17 - INFO - codeparrot_training - Step 17604: {'lr': 0.00036363454276964845, 'samples': 9013760, 'steps': 17604, 'batch_loss/train': 0.7931735422462225} 12/25/2021 07:06:29 - INFO - codeparrot_training - Step 17605: {'lr': 0.0003636205018904111, 'samples': 9014272, 'steps': 17605, 'batch_loss/train': 
0.7060590973123908} 12/25/2021 07:06:40 - INFO - codeparrot_training - Step 17606: {'lr': 0.0003636064605594615, 'samples': 9014784, 'steps': 17606, 'batch_loss/train': 0.764197283424437} 12/25/2021 07:06:50 - INFO - codeparrot_training - Step 17607: {'lr': 0.00036359241877685557, 'samples': 9015296, 'steps': 17607, 'batch_loss/train': 0.6348173234146088} 12/25/2021 07:07:04 - INFO - codeparrot_training - Step 17608: {'lr': 0.000363578376542649, 'samples': 9015808, 'steps': 17608, 'batch_loss/train': 0.7466598202008754} 12/25/2021 07:07:15 - INFO - codeparrot_training - Step 17609: {'lr': 0.00036356433385689767, 'samples': 9016320, 'steps': 17609, 'batch_loss/train': 0.5412121939007193} 12/25/2021 07:07:26 - INFO - codeparrot_training - Step 17610: {'lr': 0.0003635502907196575, 'samples': 9016832, 'steps': 17610, 'batch_loss/train': 0.8012448595836759} 12/25/2021 07:07:38 - INFO - codeparrot_training - Step 17611: {'lr': 0.0003635362471309842, 'samples': 9017344, 'steps': 17611, 'batch_loss/train': 0.5684515759348869} 12/25/2021 07:07:48 - INFO - codeparrot_training - Step 17612: {'lr': 0.0003635222030909337, 'samples': 9017856, 'steps': 17612, 'batch_loss/train': 0.8721035663038492} 12/25/2021 07:07:59 - INFO - codeparrot_training - Step 17613: {'lr': 0.0003635081585995618, 'samples': 9018368, 'steps': 17613, 'batch_loss/train': 0.7471745884977281} 12/25/2021 07:08:13 - INFO - codeparrot_training - Step 17614: {'lr': 0.00036349411365692427, 'samples': 9018880, 'steps': 17614, 'batch_loss/train': 0.7213918539928272} 12/25/2021 07:08:23 - INFO - codeparrot_training - Step 17615: {'lr': 0.00036348006826307695, 'samples': 9019392, 'steps': 17615, 'batch_loss/train': 0.7990055694244802} 12/25/2021 07:08:34 - INFO - codeparrot_training - Step 17616: {'lr': 0.0003634660224180758, 'samples': 9019904, 'steps': 17616, 'batch_loss/train': 0.7320225690491498} 12/25/2021 07:08:44 - INFO - codeparrot_training - Step 17617: {'lr': 0.0003634519761219766, 'samples': 9020416, 
'steps': 17617, 'batch_loss/train': 0.9956974033266306} 12/25/2021 07:08:56 - INFO - codeparrot_training - Step 17618: {'lr': 0.0003634379293748351, 'samples': 9020928, 'steps': 17618, 'batch_loss/train': 0.795455145649612} 12/25/2021 07:09:07 - INFO - codeparrot_training - Step 17619: {'lr': 0.0003634238821767073, 'samples': 9021440, 'steps': 17619, 'batch_loss/train': 0.8042892324738204} 12/25/2021 07:09:18 - INFO - codeparrot_training - Step 17620: {'lr': 0.000363409834527649, 'samples': 9021952, 'steps': 17620, 'batch_loss/train': 0.45912693394348025} 12/25/2021 07:09:30 - INFO - codeparrot_training - Step 17621: {'lr': 0.0003633957864277159, 'samples': 9022464, 'steps': 17621, 'batch_loss/train': 0.6204775972291827} 12/25/2021 07:09:40 - INFO - codeparrot_training - Step 17622: {'lr': 0.00036338173787696403, 'samples': 9022976, 'steps': 17622, 'batch_loss/train': 0.7657664536964148} 12/25/2021 07:09:51 - INFO - codeparrot_training - Step 17623: {'lr': 0.0003633676888754492, 'samples': 9023488, 'steps': 17623, 'batch_loss/train': 0.7847698270343244} 12/25/2021 07:10:05 - INFO - codeparrot_training - Step 17624: {'lr': 0.0003633536394232272, 'samples': 9024000, 'steps': 17624, 'batch_loss/train': 0.7346685752272606} 12/25/2021 07:10:15 - INFO - codeparrot_training - Step 17625: {'lr': 0.000363339589520354, 'samples': 9024512, 'steps': 17625, 'batch_loss/train': 0.7717712866142392} 12/25/2021 07:10:26 - INFO - codeparrot_training - Step 17626: {'lr': 0.00036332553916688535, 'samples': 9025024, 'steps': 17626, 'batch_loss/train': 0.7106748977676034} 12/25/2021 07:10:36 - INFO - codeparrot_training - Step 17627: {'lr': 0.0003633114883628772, 'samples': 9025536, 'steps': 17627, 'batch_loss/train': 0.7787209295202047} 12/25/2021 07:10:48 - INFO - codeparrot_training - Step 17628: {'lr': 0.00036329743710838524, 'samples': 9026048, 'steps': 17628, 'batch_loss/train': 0.7140819481573999} 12/25/2021 07:10:59 - INFO - codeparrot_training - Step 17629: {'lr': 
0.00036328338540346554, 'samples': 9026560, 'steps': 17629, 'batch_loss/train': 0.8445842936635017} 12/25/2021 07:11:10 - INFO - codeparrot_training - Step 17630: {'lr': 0.00036326933324817384, 'samples': 9027072, 'steps': 17630, 'batch_loss/train': 0.7690131897106767} 12/25/2021 07:11:23 - INFO - codeparrot_training - Step 17631: {'lr': 0.0003632552806425661, 'samples': 9027584, 'steps': 17631, 'batch_loss/train': 0.7639965566340834} 12/25/2021 07:11:34 - INFO - codeparrot_training - Step 17632: {'lr': 0.000363241227586698, 'samples': 9028096, 'steps': 17632, 'batch_loss/train': 0.7879350152797997} 12/25/2021 07:11:45 - INFO - codeparrot_training - Step 17633: {'lr': 0.00036322717408062566, 'samples': 9028608, 'steps': 17633, 'batch_loss/train': 0.7350771129131317} 12/25/2021 07:11:57 - INFO - codeparrot_training - Step 17634: {'lr': 0.0003632131201244047, 'samples': 9029120, 'steps': 17634, 'batch_loss/train': 0.7032949663698673} 12/25/2021 07:12:07 - INFO - codeparrot_training - Step 17635: {'lr': 0.00036319906571809126, 'samples': 9029632, 'steps': 17635, 'batch_loss/train': 0.8632441889494658} 12/25/2021 07:12:18 - INFO - codeparrot_training - Step 17636: {'lr': 0.00036318501086174106, 'samples': 9030144, 'steps': 17636, 'batch_loss/train': 0.8104841616004705} 12/25/2021 07:12:30 - INFO - codeparrot_training - Step 17637: {'lr': 0.0003631709555554099, 'samples': 9030656, 'steps': 17637, 'batch_loss/train': 0.7733089569956064} 12/25/2021 07:12:40 - INFO - codeparrot_training - Step 17638: {'lr': 0.00036315689979915386, 'samples': 9031168, 'steps': 17638, 'batch_loss/train': 0.6101869707927108} 12/25/2021 07:12:51 - INFO - codeparrot_training - Step 17639: {'lr': 0.0003631428435930286, 'samples': 9031680, 'steps': 17639, 'batch_loss/train': 0.8473807945847511} 12/25/2021 07:13:02 - INFO - codeparrot_training - Step 17640: {'lr': 0.00036312878693709025, 'samples': 9032192, 'steps': 17640, 'batch_loss/train': 0.8029047204181552} 12/25/2021 07:13:14 - INFO - 
codeparrot_training - Step 17641: {'lr': 0.00036311472983139437, 'samples': 9032704, 'steps': 17641, 'batch_loss/train': 0.704572825692594} 12/25/2021 07:13:24 - INFO - codeparrot_training - Step 17642: {'lr': 0.0003631006722759972, 'samples': 9033216, 'steps': 17642, 'batch_loss/train': 0.5113499721046537} 12/25/2021 07:13:35 - INFO - codeparrot_training - Step 17643: {'lr': 0.0003630866142709544, 'samples': 9033728, 'steps': 17643, 'batch_loss/train': 0.8660334460437298} 12/25/2021 07:13:47 - INFO - codeparrot_training - Step 17644: {'lr': 0.00036307255581632193, 'samples': 9034240, 'steps': 17644, 'batch_loss/train': 0.7041191309690475} 12/25/2021 07:13:57 - INFO - codeparrot_training - Step 17645: {'lr': 0.0003630584969121557, 'samples': 9034752, 'steps': 17645, 'batch_loss/train': 0.7543501590844244} 12/25/2021 07:14:08 - INFO - codeparrot_training - Step 17646: {'lr': 0.0003630444375585116, 'samples': 9035264, 'steps': 17646, 'batch_loss/train': 0.9205021909438074} 12/25/2021 07:14:22 - INFO - codeparrot_training - Step 17647: {'lr': 0.00036303037775544533, 'samples': 9035776, 'steps': 17647, 'batch_loss/train': 0.7302853185683489} 12/25/2021 07:14:33 - INFO - codeparrot_training - Step 17648: {'lr': 0.00036301631750301316, 'samples': 9036288, 'steps': 17648, 'batch_loss/train': 0.7928586844354868} 12/25/2021 07:14:43 - INFO - codeparrot_training - Step 17649: {'lr': 0.0003630022568012707, 'samples': 9036800, 'steps': 17649, 'batch_loss/train': 0.670804787427187} 12/25/2021 07:14:54 - INFO - codeparrot_training - Step 17650: {'lr': 0.00036298819565027394, 'samples': 9037312, 'steps': 17650, 'batch_loss/train': 0.6158018579008058} 12/25/2021 07:15:06 - INFO - codeparrot_training - Step 17651: {'lr': 0.0003629741340500787, 'samples': 9037824, 'steps': 17651, 'batch_loss/train': 0.7634647022932768} 12/25/2021 07:15:16 - INFO - codeparrot_training - Step 17652: {'lr': 0.0003629600720007411, 'samples': 9038336, 'steps': 17652, 'batch_loss/train': 
0.7944959010928869} 12/25/2021 07:15:27 - INFO - codeparrot_training - Step 17653: {'lr': 0.0003629460095023168, 'samples': 9038848, 'steps': 17653, 'batch_loss/train': 0.7545455219224095} 12/25/2021 07:15:41 - INFO - codeparrot_training - Step 17654: {'lr': 0.0003629319465548618, 'samples': 9039360, 'steps': 17654, 'batch_loss/train': 0.8213546965271235} 12/25/2021 07:15:52 - INFO - codeparrot_training - Step 17655: {'lr': 0.00036291788315843216, 'samples': 9039872, 'steps': 17655, 'batch_loss/train': 0.6403610522975214} 12/25/2021 07:16:02 - INFO - codeparrot_training - Step 17656: {'lr': 0.0003629038193130836, 'samples': 9040384, 'steps': 17656, 'batch_loss/train': 0.7838566498830914} 12/25/2021 07:16:14 - INFO - codeparrot_training - Step 17657: {'lr': 0.000362889755018872, 'samples': 9040896, 'steps': 17657, 'batch_loss/train': 0.7583698743255809} 12/25/2021 07:16:25 - INFO - codeparrot_training - Step 17658: {'lr': 0.00036287569027585343, 'samples': 9041408, 'steps': 17658, 'batch_loss/train': 0.8250404074788094} 12/25/2021 07:16:35 - INFO - codeparrot_training - Step 17659: {'lr': 0.00036286162508408374, 'samples': 9041920, 'steps': 17659, 'batch_loss/train': 0.8280791495926678} 12/25/2021 07:16:47 - INFO - codeparrot_training - Step 17660: {'lr': 0.0003628475594436188, 'samples': 9042432, 'steps': 17660, 'batch_loss/train': 0.705647261114791} 12/25/2021 07:16:58 - INFO - codeparrot_training - Step 17661: {'lr': 0.0003628334933545146, 'samples': 9042944, 'steps': 17661, 'batch_loss/train': 0.8902605269104242} 12/25/2021 07:17:09 - INFO - codeparrot_training - Step 17662: {'lr': 0.000362819426816827, 'samples': 9043456, 'steps': 17662, 'batch_loss/train': 0.7084405519999564} 12/25/2021 07:17:19 - INFO - codeparrot_training - Step 17663: {'lr': 0.00036280535983061204, 'samples': 9043968, 'steps': 17663, 'batch_loss/train': 0.8116202522069216} 12/25/2021 07:17:33 - INFO - codeparrot_training - Step 17664: {'lr': 0.00036279129239592545, 'samples': 9044480, 
'steps': 17664, 'batch_loss/train': 0.8217359138652682} 12/25/2021 07:17:44 - INFO - codeparrot_training - Step 17665: {'lr': 0.00036277722451282337, 'samples': 9044992, 'steps': 17665, 'batch_loss/train': 0.7727597337216139} 12/25/2021 07:17:54 - INFO - codeparrot_training - Step 17666: {'lr': 0.00036276315618136157, 'samples': 9045504, 'steps': 17666, 'batch_loss/train': 0.774626309517771} 12/25/2021 07:18:06 - INFO - codeparrot_training - Step 17667: {'lr': 0.00036274908740159604, 'samples': 9046016, 'steps': 17667, 'batch_loss/train': 0.7706552548334002} 12/25/2021 07:18:17 - INFO - codeparrot_training - Step 17668: {'lr': 0.0003627350181735828, 'samples': 9046528, 'steps': 17668, 'batch_loss/train': 0.7347961165942252} 12/25/2021 07:18:28 - INFO - codeparrot_training - Step 17669: {'lr': 0.00036272094849737756, 'samples': 9047040, 'steps': 17669, 'batch_loss/train': 0.7259458638727665} 12/25/2021 07:18:41 - INFO - codeparrot_training - Step 17670: {'lr': 0.00036270687837303644, 'samples': 9047552, 'steps': 17670, 'batch_loss/train': 0.8207469023764133} 12/25/2021 07:18:52 - INFO - codeparrot_training - Step 17671: {'lr': 0.00036269280780061537, 'samples': 9048064, 'steps': 17671, 'batch_loss/train': 0.6640577528160065} 12/25/2021 07:19:02 - INFO - codeparrot_training - Step 17672: {'lr': 0.00036267873678017025, 'samples': 9048576, 'steps': 17672, 'batch_loss/train': 0.8079728824086487} 12/25/2021 07:19:13 - INFO - codeparrot_training - Step 17673: {'lr': 0.0003626646653117569, 'samples': 9049088, 'steps': 17673, 'batch_loss/train': 0.8468906646594405} 12/25/2021 07:19:25 - INFO - codeparrot_training - Step 17674: {'lr': 0.00036265059339543154, 'samples': 9049600, 'steps': 17674, 'batch_loss/train': 0.8667345577850938} 12/25/2021 07:19:36 - INFO - codeparrot_training - Step 17675: {'lr': 0.0003626365210312499, 'samples': 9050112, 'steps': 17675, 'batch_loss/train': 0.7375476988963783} 12/25/2021 07:19:46 - INFO - codeparrot_training - Step 17676: {'lr': 
0.00036262244821926785, 'samples': 9050624, 'steps': 17676, 'batch_loss/train': 0.9690533727407455} 12/25/2021 07:19:59 - INFO - codeparrot_training - Step 17677: {'lr': 0.0003626083749595417, 'samples': 9051136, 'steps': 17677, 'batch_loss/train': 0.7467159480438568} 12/25/2021 07:20:09 - INFO - codeparrot_training - Step 17678: {'lr': 0.000362594301252127, 'samples': 9051648, 'steps': 17678, 'batch_loss/train': 0.8046428337693214} 12/25/2021 07:20:20 - INFO - codeparrot_training - Step 17679: {'lr': 0.00036258022709707995, 'samples': 9052160, 'steps': 17679, 'batch_loss/train': 0.7997638434171677} 12/25/2021 07:20:34 - INFO - codeparrot_training - Step 17680: {'lr': 0.0003625661524944564, 'samples': 9052672, 'steps': 17680, 'batch_loss/train': 1.0758110247552395} 12/25/2021 07:20:44 - INFO - codeparrot_training - Step 17681: {'lr': 0.0003625520774443124, 'samples': 9053184, 'steps': 17681, 'batch_loss/train': 0.8179168170318007} 12/25/2021 07:20:55 - INFO - codeparrot_training - Step 17682: {'lr': 0.00036253800194670373, 'samples': 9053696, 'steps': 17682, 'batch_loss/train': 0.7226031459285878} 12/25/2021 07:21:07 - INFO - codeparrot_training - Step 17683: {'lr': 0.0003625239260016865, 'samples': 9054208, 'steps': 17683, 'batch_loss/train': 0.7658065613359213} 12/25/2021 07:21:18 - INFO - codeparrot_training - Step 17684: {'lr': 0.0003625098496093166, 'samples': 9054720, 'steps': 17684, 'batch_loss/train': 0.7987106796354055} 12/25/2021 07:21:28 - INFO - codeparrot_training - Step 17685: {'lr': 0.00036249577276965014, 'samples': 9055232, 'steps': 17685, 'batch_loss/train': 0.8192397551611066} 12/25/2021 07:21:39 - INFO - codeparrot_training - Step 17686: {'lr': 0.0003624816954827428, 'samples': 9055744, 'steps': 17686, 'batch_loss/train': 0.7557659633457661} 12/25/2021 07:21:51 - INFO - codeparrot_training - Step 17687: {'lr': 0.0003624676177486508, 'samples': 9056256, 'steps': 17687, 'batch_loss/train': 0.811277718283236} 12/25/2021 07:22:02 - INFO - 
codeparrot_training - Step 17688: {'lr': 0.00036245353956743, 'samples': 9056768, 'steps': 17688, 'batch_loss/train': 0.8302309012506157} 12/25/2021 07:22:12 - INFO - codeparrot_training - Step 17689: {'lr': 0.0003624394609391363, 'samples': 9057280, 'steps': 17689, 'batch_loss/train': 0.7640439281240106} 12/25/2021 07:22:24 - INFO - codeparrot_training - Step 17690: {'lr': 0.0003624253818638259, 'samples': 9057792, 'steps': 17690, 'batch_loss/train': 0.9392843637615442} 12/25/2021 07:22:35 - INFO - codeparrot_training - Step 17691: {'lr': 0.0003624113023415546, 'samples': 9058304, 'steps': 17691, 'batch_loss/train': 0.782216589897871} 12/25/2021 07:22:45 - INFO - codeparrot_training - Step 17692: {'lr': 0.00036239722237237836, 'samples': 9058816, 'steps': 17692, 'batch_loss/train': 0.7129388963803649} 12/25/2021 07:22:59 - INFO - codeparrot_training - Step 17693: {'lr': 0.0003623831419563532, 'samples': 9059328, 'steps': 17693, 'batch_loss/train': 0.7712692660279572} 12/25/2021 07:23:10 - INFO - codeparrot_training - Step 17694: {'lr': 0.0003623690610935352, 'samples': 9059840, 'steps': 17694, 'batch_loss/train': 0.8685065787285566} 12/25/2021 07:23:21 - INFO - codeparrot_training - Step 17695: {'lr': 0.0003623549797839802, 'samples': 9060352, 'steps': 17695, 'batch_loss/train': 0.8203643206506968} 12/25/2021 07:23:31 - INFO - codeparrot_training - Step 17696: {'lr': 0.0003623408980277442, 'samples': 9060864, 'steps': 17696, 'batch_loss/train': 0.7494438004214317} 12/25/2021 07:23:43 - INFO - codeparrot_training - Step 17697: {'lr': 0.0003623268158248831, 'samples': 9061376, 'steps': 17697, 'batch_loss/train': 0.6448751664720476} 12/25/2021 07:23:54 - INFO - codeparrot_training - Step 17698: {'lr': 0.0003623127331754532, 'samples': 9061888, 'steps': 17698, 'batch_loss/train': 0.7729178238660097} 12/25/2021 07:24:04 - INFO - codeparrot_training - Step 17699: {'lr': 0.0003622986500795101, 'samples': 9062400, 'steps': 17699, 'batch_loss/train': 0.8362145340070128} 
12/25/2021 07:24:18 - INFO - codeparrot_training - Step 17700: {'lr': 0.0003622845665371101, 'samples': 9062912, 'steps': 17700, 'batch_loss/train': 0.7773134093731642} 12/25/2021 07:24:29 - INFO - codeparrot_training - Step 17701: {'lr': 0.000362270482548309, 'samples': 9063424, 'steps': 17701, 'batch_loss/train': 0.7742315595969558} 12/25/2021 07:24:39 - INFO - codeparrot_training - Step 17702: {'lr': 0.0003622563981131628, 'samples': 9063936, 'steps': 17702, 'batch_loss/train': 0.7243327675387263} 12/25/2021 07:24:51 - INFO - codeparrot_training - Step 17703: {'lr': 0.0003622423132317277, 'samples': 9064448, 'steps': 17703, 'batch_loss/train': 0.8004544912837446} 12/25/2021 07:25:02 - INFO - codeparrot_training - Step 17704: {'lr': 0.0003622282279040594, 'samples': 9064960, 'steps': 17704, 'batch_loss/train': 0.673150465299841} 12/25/2021 07:25:12 - INFO - codeparrot_training - Step 17705: {'lr': 0.0003622141421302141, 'samples': 9065472, 'steps': 17705, 'batch_loss/train': 0.5958645657810848} 12/25/2021 07:25:25 - INFO - codeparrot_training - Step 17706: {'lr': 0.00036220005591024775, 'samples': 9065984, 'steps': 17706, 'batch_loss/train': 0.7407970083877444} 12/25/2021 07:25:35 - INFO - codeparrot_training - Step 17707: {'lr': 0.00036218596924421636, 'samples': 9066496, 'steps': 17707, 'batch_loss/train': 0.7628747536800802} 12/25/2021 07:25:46 - INFO - codeparrot_training - Step 17708: {'lr': 0.0003621718821321759, 'samples': 9067008, 'steps': 17708, 'batch_loss/train': 0.804329183883965} 12/25/2021 07:25:57 - INFO - codeparrot_training - Step 17709: {'lr': 0.0003621577945741824, 'samples': 9067520, 'steps': 17709, 'batch_loss/train': 0.840302050113678} 12/25/2021 07:26:10 - INFO - codeparrot_training - Step 17710: {'lr': 0.0003621437065702918, 'samples': 9068032, 'steps': 17710, 'batch_loss/train': 0.750988744199276} 12/25/2021 07:26:21 - INFO - codeparrot_training - Step 17711: {'lr': 0.00036212961812056015, 'samples': 9068544, 'steps': 17711, 
'batch_loss/train': 0.7387147452682257} 12/25/2021 07:26:32 - INFO - codeparrot_training - Step 17712: {'lr': 0.00036211552922504355, 'samples': 9069056, 'steps': 17712, 'batch_loss/train': 0.6058566796127707} 12/25/2021 07:26:44 - INFO - codeparrot_training - Step 17713: {'lr': 0.0003621014398837979, 'samples': 9069568, 'steps': 17713, 'batch_loss/train': 0.7289773840457201} 12/25/2021 07:26:54 - INFO - codeparrot_training - Step 17714: {'lr': 0.0003620873500968793, 'samples': 9070080, 'steps': 17714, 'batch_loss/train': 0.669861247879453} 12/25/2021 07:27:05 - INFO - codeparrot_training - Step 17715: {'lr': 0.0003620732598643436, 'samples': 9070592, 'steps': 17715, 'batch_loss/train': 0.6559730423614383} 12/25/2021 07:27:19 - INFO - codeparrot_training - Step 17716: {'lr': 0.00036205916918624707, 'samples': 9071104, 'steps': 17716, 'batch_loss/train': 0.8102619219571352} 12/25/2021 07:27:30 - INFO - codeparrot_training - Step 17717: {'lr': 0.00036204507806264556, 'samples': 9071616, 'steps': 17717, 'batch_loss/train': 0.7798335468396544} 12/25/2021 07:27:40 - INFO - codeparrot_training - Step 17718: {'lr': 0.0003620309864935951, 'samples': 9072128, 'steps': 17718, 'batch_loss/train': 0.7422978039830923} 12/25/2021 07:27:51 - INFO - codeparrot_training - Step 17719: {'lr': 0.00036201689447915177, 'samples': 9072640, 'steps': 17719, 'batch_loss/train': 0.6350918546668254} 12/25/2021 07:28:03 - INFO - codeparrot_training - Step 17720: {'lr': 0.0003620028020193716, 'samples': 9073152, 'steps': 17720, 'batch_loss/train': 0.8238627393729985} 12/25/2021 07:28:14 - INFO - codeparrot_training - Step 17721: {'lr': 0.0003619887091143105, 'samples': 9073664, 'steps': 17721, 'batch_loss/train': 0.7511789780110121} 12/25/2021 07:28:24 - INFO - codeparrot_training - Step 17722: {'lr': 0.00036197461576402455, 'samples': 9074176, 'steps': 17722, 'batch_loss/train': 0.7783185890875757} 12/25/2021 07:28:36 - INFO - codeparrot_training - Step 17723: {'lr': 0.00036196052196856986, 
'samples': 9074688, 'steps': 17723, 'batch_loss/train': 0.7756461766548455} 12/25/2021 07:28:47 - INFO - codeparrot_training - Step 17724: {'lr': 0.00036194642772800246, 'samples': 9075200, 'steps': 17724, 'batch_loss/train': 0.6385670628515072} 12/25/2021 07:28:57 - INFO - codeparrot_training - Step 17725: {'lr': 0.00036193233304237827, 'samples': 9075712, 'steps': 17725, 'batch_loss/train': 0.758380638435483} 12/25/2021 07:29:09 - INFO - codeparrot_training - Step 17726: {'lr': 0.00036191823791175337, 'samples': 9076224, 'steps': 17726, 'batch_loss/train': 0.7110959626734257} 12/25/2021 07:29:20 - INFO - codeparrot_training - Step 17727: {'lr': 0.00036190414233618393, 'samples': 9076736, 'steps': 17727, 'batch_loss/train': 0.757047691848129} 12/25/2021 07:29:31 - INFO - codeparrot_training - Step 17728: {'lr': 0.00036189004631572574, 'samples': 9077248, 'steps': 17728, 'batch_loss/train': 0.693729575665202} 12/25/2021 07:29:45 - INFO - codeparrot_training - Step 17729: {'lr': 0.000361875949850435, 'samples': 9077760, 'steps': 17729, 'batch_loss/train': 0.7730196961201727} 12/25/2021 07:29:55 - INFO - codeparrot_training - Step 17730: {'lr': 0.0003618618529403678, 'samples': 9078272, 'steps': 17730, 'batch_loss/train': 0.7071758477832191} 12/25/2021 07:30:06 - INFO - codeparrot_training - Step 17731: {'lr': 0.00036184775558558005, 'samples': 9078784, 'steps': 17731, 'batch_loss/train': 0.8133057411760092} 12/25/2021 07:30:16 - INFO - codeparrot_training - Step 17732: {'lr': 0.00036183365778612786, 'samples': 9079296, 'steps': 17732, 'batch_loss/train': 0.7648610947653651} 12/25/2021 07:30:28 - INFO - codeparrot_training - Step 17733: {'lr': 0.0003618195595420673, 'samples': 9079808, 'steps': 17733, 'batch_loss/train': 0.9724224405363202} 12/25/2021 07:30:39 - INFO - codeparrot_training - Step 17734: {'lr': 0.00036180546085345434, 'samples': 9080320, 'steps': 17734, 'batch_loss/train': 0.8040316170081496} 12/25/2021 07:30:49 - INFO - codeparrot_training - Step 
17735: {'lr': 0.00036179136172034517, 'samples': 9080832, 'steps': 17735, 'batch_loss/train': 0.7773992479778826} 12/25/2021 07:31:02 - INFO - codeparrot_training - Step 17736: {'lr': 0.0003617772621427957, 'samples': 9081344, 'steps': 17736, 'batch_loss/train': 0.7461039163172245} 12/25/2021 07:31:12 - INFO - codeparrot_training - Step 17737: {'lr': 0.000361763162120862, 'samples': 9081856, 'steps': 17737, 'batch_loss/train': 0.7351092882454395} 12/25/2021 07:31:23 - INFO - codeparrot_training - Step 17738: {'lr': 0.0003617490616546002, 'samples': 9082368, 'steps': 17738, 'batch_loss/train': 0.6011605417006649} 12/25/2021 07:31:37 - INFO - codeparrot_training - Step 17739: {'lr': 0.0003617349607440664, 'samples': 9082880, 'steps': 17739, 'batch_loss/train': 0.7561554722487926} 12/25/2021 07:31:47 - INFO - codeparrot_training - Step 17740: {'lr': 0.00036172085938931655, 'samples': 9083392, 'steps': 17740, 'batch_loss/train': 0.8168542766943574} 12/25/2021 07:31:58 - INFO - codeparrot_training - Step 17741: {'lr': 0.00036170675759040664, 'samples': 9083904, 'steps': 17741, 'batch_loss/train': 0.800414873752743} 12/25/2021 07:32:10 - INFO - codeparrot_training - Step 17742: {'lr': 0.00036169265534739294, 'samples': 9084416, 'steps': 17742, 'batch_loss/train': 0.7387314364314079} 12/25/2021 07:32:21 - INFO - codeparrot_training - Step 17743: {'lr': 0.0003616785526603314, 'samples': 9084928, 'steps': 17743, 'batch_loss/train': 1.4695174950174987} 12/25/2021 07:32:31 - INFO - codeparrot_training - Step 17744: {'lr': 0.0003616644495292781, 'samples': 9085440, 'steps': 17744, 'batch_loss/train': 0.7009374792687595} 12/25/2021 07:32:42 - INFO - codeparrot_training - Step 17745: {'lr': 0.00036165034595428907, 'samples': 9085952, 'steps': 17745, 'batch_loss/train': 0.7174155875109136} 12/25/2021 07:32:54 - INFO - codeparrot_training - Step 17746: {'lr': 0.00036163624193542057, 'samples': 9086464, 'steps': 17746, 'batch_loss/train': 0.7612993596121669} 12/25/2021 07:33:04 - 
INFO - codeparrot_training - Step 17747: {'lr': 0.0003616221374727283, 'samples': 9086976, 'steps': 17747, 'batch_loss/train': 0.8842564942315221} 12/25/2021 07:33:15 - INFO - codeparrot_training - Step 17748: {'lr': 0.0003616080325662687, 'samples': 9087488, 'steps': 17748, 'batch_loss/train': 0.7470891401171684} 12/25/2021 07:33:29 - INFO - codeparrot_training - Step 17749: {'lr': 0.00036159392721609774, 'samples': 9088000, 'steps': 17749, 'batch_loss/train': 0.7783700358122587} 12/25/2021 07:33:40 - INFO - codeparrot_training - Step 17750: {'lr': 0.00036157982142227135, 'samples': 9088512, 'steps': 17750, 'batch_loss/train': 0.9824235951527953} 12/25/2021 07:33:50 - INFO - codeparrot_training - Step 17751: {'lr': 0.00036156571518484575, 'samples': 9089024, 'steps': 17751, 'batch_loss/train': 0.69862193177687} 12/25/2021 07:34:02 - INFO - codeparrot_training - Step 17752: {'lr': 0.000361551608503877, 'samples': 9089536, 'steps': 17752, 'batch_loss/train': 0.7812846698798239} 12/25/2021 07:34:13 - INFO - codeparrot_training - Step 17753: {'lr': 0.00036153750137942126, 'samples': 9090048, 'steps': 17753, 'batch_loss/train': 0.5470663697924465} 12/25/2021 07:34:24 - INFO - codeparrot_training - Step 17754: {'lr': 0.0003615233938115344, 'samples': 9090560, 'steps': 17754, 'batch_loss/train': 0.7298129852861166} 12/25/2021 07:34:34 - INFO - codeparrot_training - Step 17755: {'lr': 0.00036150928580027275, 'samples': 9091072, 'steps': 17755, 'batch_loss/train': 0.8184777405112982} 12/25/2021 07:34:48 - INFO - codeparrot_training - Step 17756: {'lr': 0.0003614951773456922, 'samples': 9091584, 'steps': 17756, 'batch_loss/train': 0.848335612565279} 12/25/2021 07:34:59 - INFO - codeparrot_training - Step 17757: {'lr': 0.00036148106844784893, 'samples': 9092096, 'steps': 17757, 'batch_loss/train': 0.718683710321784} 12/25/2021 07:35:09 - INFO - codeparrot_training - Step 17758: {'lr': 0.00036146695910679904, 'samples': 9092608, 'steps': 17758, 'batch_loss/train': 
0.8226437084376812} 12/25/2021 07:35:22 - INFO - codeparrot_training - Step 17759: {'lr': 0.0003614528493225987, 'samples': 9093120, 'steps': 17759, 'batch_loss/train': 0.732352935243398} 12/25/2021 07:35:32 - INFO - codeparrot_training - Step 17760: {'lr': 0.0003614387390953038, 'samples': 9093632, 'steps': 17760, 'batch_loss/train': 0.8053757585585117} 12/25/2021 07:35:43 - INFO - codeparrot_training - Step 17761: {'lr': 0.0003614246284249706, 'samples': 9094144, 'steps': 17761, 'batch_loss/train': 0.7629609738942236} 12/25/2021 07:35:55 - INFO - codeparrot_training - Step 17762: {'lr': 0.0003614105173116551, 'samples': 9094656, 'steps': 17762, 'batch_loss/train': 1.014538362622261} 12/25/2021 07:36:05 - INFO - codeparrot_training - Step 17763: {'lr': 0.0003613964057554135, 'samples': 9095168, 'steps': 17763, 'batch_loss/train': 0.7166716544888914} 12/25/2021 07:36:16 - INFO - codeparrot_training - Step 17764: {'lr': 0.0003613822937563018, 'samples': 9095680, 'steps': 17764, 'batch_loss/train': 0.7259231763891876} 12/25/2021 07:36:26 - INFO - codeparrot_training - Step 17765: {'lr': 0.0003613681813143762, 'samples': 9096192, 'steps': 17765, 'batch_loss/train': 0.8100455468520522} 12/25/2021 07:36:39 - INFO - codeparrot_training - Step 17766: {'lr': 0.0003613540684296928, 'samples': 9096704, 'steps': 17766, 'batch_loss/train': 0.7677058707922697} 12/25/2021 07:36:49 - INFO - codeparrot_training - Step 17767: {'lr': 0.00036133995510230754, 'samples': 9097216, 'steps': 17767, 'batch_loss/train': 0.7678533188300207} 12/25/2021 07:37:00 - INFO - codeparrot_training - Step 17768: {'lr': 0.0003613258413322768, 'samples': 9097728, 'steps': 17768, 'batch_loss/train': 0.7608571611344814} 12/25/2021 07:37:14 - INFO - codeparrot_training - Step 17769: {'lr': 0.0003613117271196565, 'samples': 9098240, 'steps': 17769, 'batch_loss/train': 0.703584723174572} 12/25/2021 07:37:25 - INFO - codeparrot_training - Step 17770: {'lr': 0.0003612976124645028, 'samples': 9098752, 'steps': 
17770, 'batch_loss/train': 0.786126715131104} 12/25/2021 07:37:35 - INFO - codeparrot_training - Step 17771: {'lr': 0.00036128349736687176, 'samples': 9099264, 'steps': 17771, 'batch_loss/train': 0.784556683152914} 12/25/2021 07:37:47 - INFO - codeparrot_training - Step 17772: {'lr': 0.0003612693818268197, 'samples': 9099776, 'steps': 17772, 'batch_loss/train': 0.5393916087923571} 12/25/2021 07:37:58 - INFO - codeparrot_training - Step 17773: {'lr': 0.00036125526584440237, 'samples': 9100288, 'steps': 17773, 'batch_loss/train': 0.6917036215309054} 12/25/2021 07:38:08 - INFO - codeparrot_training - Step 17774: {'lr': 0.0003612411494196762, 'samples': 9100800, 'steps': 17774, 'batch_loss/train': 0.7501732483506203} 12/25/2021 07:38:19 - INFO - codeparrot_training - Step 17775: {'lr': 0.00036122703255269734, 'samples': 9101312, 'steps': 17775, 'batch_loss/train': 0.7919832747429609} 12/25/2021 07:38:31 - INFO - codeparrot_training - Step 17776: {'lr': 0.00036121291524352173, 'samples': 9101824, 'steps': 17776, 'batch_loss/train': 0.7547921007499099} 12/25/2021 07:38:42 - INFO - codeparrot_training - Step 17777: {'lr': 0.00036119879749220554, 'samples': 9102336, 'steps': 17777, 'batch_loss/train': 0.8710017576813698} 12/25/2021 07:38:52 - INFO - codeparrot_training - Step 17778: {'lr': 0.00036118467929880494, 'samples': 9102848, 'steps': 17778, 'batch_loss/train': 0.769886291003786} 12/25/2021 07:39:06 - INFO - codeparrot_training - Step 17779: {'lr': 0.000361170560663376, 'samples': 9103360, 'steps': 17779, 'batch_loss/train': 0.7797093074768782} 12/25/2021 07:39:17 - INFO - codeparrot_training - Step 17780: {'lr': 0.0003611564415859749, 'samples': 9103872, 'steps': 17780, 'batch_loss/train': 0.8370343940332532} 12/25/2021 07:39:27 - INFO - codeparrot_training - Step 17781: {'lr': 0.0003611423220666578, 'samples': 9104384, 'steps': 17781, 'batch_loss/train': 0.7666533938609064} 12/25/2021 07:39:40 - INFO - codeparrot_training - Step 17782: {'lr': 
0.00036112820210548074, 'samples': 9104896, 'steps': 17782, 'batch_loss/train': 0.7449966752901673} 12/25/2021 07:39:50 - INFO - codeparrot_training - Step 17783: {'lr': 0.0003611140817024999, 'samples': 9105408, 'steps': 17783, 'batch_loss/train': 0.7528112531872466} 12/25/2021 07:40:01 - INFO - codeparrot_training - Step 17784: {'lr': 0.0003610999608577715, 'samples': 9105920, 'steps': 17784, 'batch_loss/train': 0.8288560352521017} 12/25/2021 07:40:12 - INFO - codeparrot_training - Step 17785: {'lr': 0.0003610858395713516, 'samples': 9106432, 'steps': 17785, 'batch_loss/train': 0.6245989724993706} 12/25/2021 07:40:25 - INFO - codeparrot_training - Step 17786: {'lr': 0.0003610717178432962, 'samples': 9106944, 'steps': 17786, 'batch_loss/train': 0.7511773807927966} 12/25/2021 07:40:36 - INFO - codeparrot_training - Step 17787: {'lr': 0.00036105759567366173, 'samples': 9107456, 'steps': 17787, 'batch_loss/train': 0.7778019141405821} 12/25/2021 07:40:47 - INFO - codeparrot_training - Step 17788: {'lr': 0.0003610434730625042, 'samples': 9107968, 'steps': 17788, 'batch_loss/train': 0.7990216100588441} 12/25/2021 07:40:58 - INFO - codeparrot_training - Step 17789: {'lr': 0.0003610293500098797, 'samples': 9108480, 'steps': 17789, 'batch_loss/train': 0.7021351725561544} 12/25/2021 07:41:09 - INFO - codeparrot_training - Step 17790: {'lr': 0.00036101522651584445, 'samples': 9108992, 'steps': 17790, 'batch_loss/train': 0.7662493728566915} 12/25/2021 07:41:20 - INFO - codeparrot_training - Step 17791: {'lr': 0.00036100110258045456, 'samples': 9109504, 'steps': 17791, 'batch_loss/train': 0.738510187715292} 12/25/2021 07:41:32 - INFO - codeparrot_training - Step 17792: {'lr': 0.0003609869782037662, 'samples': 9110016, 'steps': 17792, 'batch_loss/train': 0.73478311765939} 12/25/2021 07:41:42 - INFO - codeparrot_training - Step 17793: {'lr': 0.0003609728533858355, 'samples': 9110528, 'steps': 17793, 'batch_loss/train': 0.7905145846307278} 12/25/2021 07:41:53 - INFO - 
codeparrot_training - Step 17794: {'lr': 0.00036095872812671873, 'samples': 9111040, 'steps': 17794, 'batch_loss/train': 0.7520768456161022} 12/25/2021 07:42:07 - INFO - codeparrot_training - Step 17795: {'lr': 0.00036094460242647186, 'samples': 9111552, 'steps': 17795, 'batch_loss/train': 0.6991760046221316} 12/25/2021 07:42:17 - INFO - codeparrot_training - Step 17796: {'lr': 0.00036093047628515116, 'samples': 9112064, 'steps': 17796, 'batch_loss/train': 1.3609985983930528} 12/25/2021 07:42:28 - INFO - codeparrot_training - Step 17797: {'lr': 0.0003609163497028128, 'samples': 9112576, 'steps': 17797, 'batch_loss/train': 0.7487308471463621} 12/25/2021 07:42:39 - INFO - codeparrot_training - Step 17798: {'lr': 0.0003609022226795128, 'samples': 9113088, 'steps': 17798, 'batch_loss/train': 0.7653584992513061} 12/25/2021 07:42:51 - INFO - codeparrot_training - Step 17799: {'lr': 0.00036088809521530755, 'samples': 9113600, 'steps': 17799, 'batch_loss/train': 0.7316322905244306} 12/25/2021 07:43:01 - INFO - codeparrot_training - Step 17800: {'lr': 0.0003608739673102531, 'samples': 9114112, 'steps': 17800, 'batch_loss/train': 0.7617828715592623} 12/25/2021 07:43:12 - INFO - codeparrot_training - Step 17801: {'lr': 0.00036085983896440565, 'samples': 9114624, 'steps': 17801, 'batch_loss/train': 0.6068501759436913} 12/25/2021 07:43:24 - INFO - codeparrot_training - Step 17802: {'lr': 0.0003608457101778213, 'samples': 9115136, 'steps': 17802, 'batch_loss/train': 0.8772157751955092} 12/25/2021 07:43:35 - INFO - codeparrot_training - Step 17803: {'lr': 0.00036083158095055626, 'samples': 9115648, 'steps': 17803, 'batch_loss/train': 0.683884691039566} 12/25/2021 07:43:45 - INFO - codeparrot_training - Step 17804: {'lr': 0.00036081745128266666, 'samples': 9116160, 'steps': 17804, 'batch_loss/train': 0.8082655528560281} 12/25/2021 07:43:58 - INFO - codeparrot_training - Step 17805: {'lr': 0.0003608033211742088, 'samples': 9116672, 'steps': 17805, 'batch_loss/train': 
0.788816318847239} 12/25/2021 07:44:08 - INFO - codeparrot_training - Step 17806: {'lr': 0.0003607891906252386, 'samples': 9117184, 'steps': 17806, 'batch_loss/train': 0.8618509895168245} 12/25/2021 07:44:19 - INFO - codeparrot_training - Step 17807: {'lr': 0.00036077505963581265, 'samples': 9117696, 'steps': 17807, 'batch_loss/train': 0.7447652197442949} 12/25/2021 07:44:29 - INFO - codeparrot_training - Step 17808: {'lr': 0.00036076092820598674, 'samples': 9118208, 'steps': 17808, 'batch_loss/train': 0.7228549565188587} 12/25/2021 07:44:43 - INFO - codeparrot_training - Step 17809: {'lr': 0.0003607467963358172, 'samples': 9118720, 'steps': 17809, 'batch_loss/train': 0.6099377707578242} 12/25/2021 07:44:54 - INFO - codeparrot_training - Step 17810: {'lr': 0.0003607326640253603, 'samples': 9119232, 'steps': 17810, 'batch_loss/train': 0.7987926178611815} 12/25/2021 07:45:05 - INFO - codeparrot_training - Step 17811: {'lr': 0.0003607185312746721, 'samples': 9119744, 'steps': 17811, 'batch_loss/train': 0.756514276843518} 12/25/2021 07:45:17 - INFO - codeparrot_training - Step 17812: {'lr': 0.0003607043980838088, 'samples': 9120256, 'steps': 17812, 'batch_loss/train': 0.7808274142444134} 12/25/2021 07:45:27 - INFO - codeparrot_training - Step 17813: {'lr': 0.0003606902644528266, 'samples': 9120768, 'steps': 17813, 'batch_loss/train': 0.8189877457916737} 12/25/2021 07:45:38 - INFO - codeparrot_training - Step 17814: {'lr': 0.00036067613038178175, 'samples': 9121280, 'steps': 17814, 'batch_loss/train': 0.7943511442281306} 12/25/2021 07:45:52 - INFO - codeparrot_training - Step 17815: {'lr': 0.00036066199587073036, 'samples': 9121792, 'steps': 17815, 'batch_loss/train': 0.8086743066087365} 12/25/2021 07:46:02 - INFO - codeparrot_training - Step 17816: {'lr': 0.0003606478609197287, 'samples': 9122304, 'steps': 17816, 'batch_loss/train': 0.7321174507960677} 12/25/2021 07:46:13 - INFO - codeparrot_training - Step 17817: {'lr': 0.0003606337255288329, 'samples': 9122816, 
'steps': 17817, 'batch_loss/train': 0.8418209105730057} 12/25/2021 07:46:23 - INFO - codeparrot_training - Step 17818: {'lr': 0.0003606195896980992, 'samples': 9123328, 'steps': 17818, 'batch_loss/train': 0.7433380247093737} 12/25/2021 07:46:35 - INFO - codeparrot_training - Step 17819: {'lr': 0.00036060545342758374, 'samples': 9123840, 'steps': 17819, 'batch_loss/train': 0.7643007785081863} 12/25/2021 07:46:46 - INFO - codeparrot_training - Step 17820: {'lr': 0.0003605913167173428, 'samples': 9124352, 'steps': 17820, 'batch_loss/train': 0.7032612923067063} 12/25/2021 07:46:57 - INFO - codeparrot_training - Step 17821: {'lr': 0.00036057717956743254, 'samples': 9124864, 'steps': 17821, 'batch_loss/train': 0.7414277320494875} 12/25/2021 07:47:09 - INFO - codeparrot_training - Step 17822: {'lr': 0.0003605630419779091, 'samples': 9125376, 'steps': 17822, 'batch_loss/train': 0.6772505636326969} 12/25/2021 07:47:19 - INFO - codeparrot_training - Step 17823: {'lr': 0.00036054890394882885, 'samples': 9125888, 'steps': 17823, 'batch_loss/train': 0.7956299902871251} 12/25/2021 07:47:30 - INFO - codeparrot_training - Step 17824: {'lr': 0.00036053476548024785, 'samples': 9126400, 'steps': 17824, 'batch_loss/train': 0.7695869193412364} 12/25/2021 07:47:44 - INFO - codeparrot_training - Step 17825: {'lr': 0.0003605206265722223, 'samples': 9126912, 'steps': 17825, 'batch_loss/train': 0.7410836559720337} 12/25/2021 07:47:54 - INFO - codeparrot_training - Step 17826: {'lr': 0.00036050648722480853, 'samples': 9127424, 'steps': 17826, 'batch_loss/train': 0.736137677449733} 12/25/2021 07:48:05 - INFO - codeparrot_training - Step 17827: {'lr': 0.00036049234743806277, 'samples': 9127936, 'steps': 17827, 'batch_loss/train': 0.6366320834495127} 12/25/2021 07:48:15 - INFO - codeparrot_training - Step 17828: {'lr': 0.000360478207212041, 'samples': 9128448, 'steps': 17828, 'batch_loss/train': 0.6901456958148628} 12/25/2021 07:48:28 - INFO - codeparrot_training - Step 17829: {'lr': 
0.00036046406654679976, 'samples': 9128960, 'steps': 17829, 'batch_loss/train': 0.8458763677626848} 12/25/2021 07:48:38 - INFO - codeparrot_training - Step 17830: {'lr': 0.00036044992544239496, 'samples': 9129472, 'steps': 17830, 'batch_loss/train': 0.788299972191453} 12/25/2021 07:48:49 - INFO - codeparrot_training - Step 17831: {'lr': 0.00036043578389888297, 'samples': 9129984, 'steps': 17831, 'batch_loss/train': 0.8030510246753693} 12/25/2021 07:49:03 - INFO - codeparrot_training - Step 17832: {'lr': 0.0003604216419163201, 'samples': 9130496, 'steps': 17832, 'batch_loss/train': 0.7297929683700204} 12/25/2021 07:49:14 - INFO - codeparrot_training - Step 17833: {'lr': 0.0003604074994947624, 'samples': 9131008, 'steps': 17833, 'batch_loss/train': 0.6888589970767498} 12/25/2021 07:49:24 - INFO - codeparrot_training - Step 17834: {'lr': 0.0003603933566342662, 'samples': 9131520, 'steps': 17834, 'batch_loss/train': 0.8634268119931221} 12/25/2021 07:49:36 - INFO - codeparrot_training - Step 17835: {'lr': 0.00036037921333488765, 'samples': 9132032, 'steps': 17835, 'batch_loss/train': 0.8103292975574732} 12/25/2021 07:49:47 - INFO - codeparrot_training - Step 17836: {'lr': 0.00036036506959668315, 'samples': 9132544, 'steps': 17836, 'batch_loss/train': 0.7381943073123693} 12/25/2021 07:49:57 - INFO - codeparrot_training - Step 17837: {'lr': 0.0003603509254197087, 'samples': 9133056, 'steps': 17837, 'batch_loss/train': 0.8192145507782698} 12/25/2021 07:50:08 - INFO - codeparrot_training - Step 17838: {'lr': 0.00036033678080402067, 'samples': 9133568, 'steps': 17838, 'batch_loss/train': 0.7399440128356218} 12/25/2021 07:50:20 - INFO - codeparrot_training - Step 17839: {'lr': 0.00036032263574967523, 'samples': 9134080, 'steps': 17839, 'batch_loss/train': 0.7430320819839835} 12/25/2021 07:50:31 - INFO - codeparrot_training - Step 17840: {'lr': 0.0003603084902567287, 'samples': 9134592, 'steps': 17840, 'batch_loss/train': 0.726772328838706} 12/25/2021 07:50:41 - INFO - 
codeparrot_training - Step 17841: {'lr': 0.00036029434432523715, 'samples': 9135104, 'steps': 17841, 'batch_loss/train': 0.7708215941675007} 12/25/2021 07:50:53 - INFO - codeparrot_training - Step 17842: {'lr': 0.0003602801979552571, 'samples': 9135616, 'steps': 17842, 'batch_loss/train': 0.6950089964084327} 12/25/2021 07:51:04 - INFO - codeparrot_training - Step 17843: {'lr': 0.00036026605114684445, 'samples': 9136128, 'steps': 17843, 'batch_loss/train': 0.7343624467030168} 12/25/2021 07:51:14 - INFO - codeparrot_training - Step 17844: {'lr': 0.0003602519039000557, 'samples': 9136640, 'steps': 17844, 'batch_loss/train': 0.7798390155658126} 12/25/2021 07:51:28 - INFO - codeparrot_training - Step 17845: {'lr': 0.000360237756214947, 'samples': 9137152, 'steps': 17845, 'batch_loss/train': 0.8290816080989316} 12/25/2021 07:51:39 - INFO - codeparrot_training - Step 17846: {'lr': 0.00036022360809157466, 'samples': 9137664, 'steps': 17846, 'batch_loss/train': 0.5746826746035367} 12/25/2021 07:51:49 - INFO - codeparrot_training - Step 17847: {'lr': 0.0003602094595299948, 'samples': 9138176, 'steps': 17847, 'batch_loss/train': 0.7282516141422093} 12/25/2021 07:52:01 - INFO - codeparrot_training - Step 17848: {'lr': 0.0003601953105302638, 'samples': 9138688, 'steps': 17848, 'batch_loss/train': 0.7761868191882968} 12/25/2021 07:52:12 - INFO - codeparrot_training - Step 17849: {'lr': 0.00036018116109243784, 'samples': 9139200, 'steps': 17849, 'batch_loss/train': 0.795620322227478} 12/25/2021 07:52:22 - INFO - codeparrot_training - Step 17850: {'lr': 0.00036016701121657313, 'samples': 9139712, 'steps': 17850, 'batch_loss/train': 0.8223141841590405} 12/25/2021 07:52:33 - INFO - codeparrot_training - Step 17851: {'lr': 0.00036015286090272604, 'samples': 9140224, 'steps': 17851, 'batch_loss/train': 0.7412056121975183} 12/25/2021 07:52:45 - INFO - codeparrot_training - Step 17852: {'lr': 0.0003601387101509527, 'samples': 9140736, 'steps': 17852, 'batch_loss/train': 
0.7315345797687769} 12/25/2021 07:52:56 - INFO - codeparrot_training - Step 17853: {'lr': 0.00036012455896130955, 'samples': 9141248, 'steps': 17853, 'batch_loss/train': 0.6843323418870568} 12/25/2021 07:53:06 - INFO - codeparrot_training - Step 17854: {'lr': 0.0003601104073338526, 'samples': 9141760, 'steps': 17854, 'batch_loss/train': 0.783759037964046} 12/25/2021 07:53:20 - INFO - codeparrot_training - Step 17855: {'lr': 0.0003600962552686384, 'samples': 9142272, 'steps': 17855, 'batch_loss/train': 0.7316460961010307} 12/25/2021 07:53:31 - INFO - codeparrot_training - Step 17856: {'lr': 0.000360082102765723, 'samples': 9142784, 'steps': 17856, 'batch_loss/train': 0.7006014280486852} 12/25/2021 07:53:41 - INFO - codeparrot_training - Step 17857: {'lr': 0.00036006794982516263, 'samples': 9143296, 'steps': 17857, 'batch_loss/train': 0.7050288191530854} 12/25/2021 07:53:53 - INFO - codeparrot_training - Step 17858: {'lr': 0.0003600537964470137, 'samples': 9143808, 'steps': 17858, 'batch_loss/train': 0.6979926889762282} 12/25/2021 07:54:04 - INFO - codeparrot_training - Step 17859: {'lr': 0.00036003964263133247, 'samples': 9144320, 'steps': 17859, 'batch_loss/train': 0.6818162848940119} 12/25/2021 07:54:15 - INFO - codeparrot_training - Step 17860: {'lr': 0.0003600254883781751, 'samples': 9144832, 'steps': 17860, 'batch_loss/train': 0.6981060227844864} 12/25/2021 07:54:27 - INFO - codeparrot_training - Step 17861: {'lr': 0.000360011333687598, 'samples': 9145344, 'steps': 17861, 'batch_loss/train': 0.7608326952904463} 12/25/2021 07:54:37 - INFO - codeparrot_training - Step 17862: {'lr': 0.00035999717855965743, 'samples': 9145856, 'steps': 17862, 'batch_loss/train': 0.7618128680624068} 12/25/2021 07:54:48 - INFO - codeparrot_training - Step 17863: {'lr': 0.0003599830229944095, 'samples': 9146368, 'steps': 17863, 'batch_loss/train': 0.8259791117161512} 12/25/2021 07:54:58 - INFO - codeparrot_training - Step 17864: {'lr': 0.00035996886699191067, 'samples': 9146880, 
'steps': 17864, 'batch_loss/train': 0.8009195737540722} 12/25/2021 07:55:12 - INFO - codeparrot_training - Step 17865: {'lr': 0.00035995471055221706, 'samples': 9147392, 'steps': 17865, 'batch_loss/train': 0.6335430153994821} 12/25/2021 07:55:23 - INFO - codeparrot_training - Step 17866: {'lr': 0.00035994055367538516, 'samples': 9147904, 'steps': 17866, 'batch_loss/train': 0.7495693620294333} 12/25/2021 07:55:33 - INFO - codeparrot_training - Step 17867: {'lr': 0.000359926396361471, 'samples': 9148416, 'steps': 17867, 'batch_loss/train': 0.8701529251411557} 12/25/2021 07:55:45 - INFO - codeparrot_training - Step 17868: {'lr': 0.0003599122386105311, 'samples': 9148928, 'steps': 17868, 'batch_loss/train': 0.7870325809344649} 12/25/2021 07:55:56 - INFO - codeparrot_training - Step 17869: {'lr': 0.0003598980804226216, 'samples': 9149440, 'steps': 17869, 'batch_loss/train': 0.797814545687288} 12/25/2021 07:56:07 - INFO - codeparrot_training - Step 17870: {'lr': 0.0003598839217977988, 'samples': 9149952, 'steps': 17870, 'batch_loss/train': 0.8511852398514748} 12/25/2021 07:56:20 - INFO - codeparrot_training - Step 17871: {'lr': 0.000359869762736119, 'samples': 9150464, 'steps': 17871, 'batch_loss/train': 0.805783424526453} 12/25/2021 07:56:31 - INFO - codeparrot_training - Step 17872: {'lr': 0.0003598556032376386, 'samples': 9150976, 'steps': 17872, 'batch_loss/train': 0.8310567662119865} 12/25/2021 07:56:42 - INFO - codeparrot_training - Step 17873: {'lr': 0.0003598414433024137, 'samples': 9151488, 'steps': 17873, 'batch_loss/train': 0.7201795554719865} 12/25/2021 07:56:54 - INFO - codeparrot_training - Step 17874: {'lr': 0.0003598272829305008, 'samples': 9152000, 'steps': 17874, 'batch_loss/train': 0.704858306562528} 12/25/2021 07:57:05 - INFO - codeparrot_training - Step 17875: {'lr': 0.00035981312212195605, 'samples': 9152512, 'steps': 17875, 'batch_loss/train': 0.8485827594995499} 12/25/2021 07:57:15 - INFO - codeparrot_training - Step 17876: {'lr': 
0.0003597989608768358, 'samples': 9153024, 'steps': 17876, 'batch_loss/train': 0.9452601028606296} 12/25/2021 07:57:27 - INFO - codeparrot_training - Step 17877: {'lr': 0.00035978479919519633, 'samples': 9153536, 'steps': 17877, 'batch_loss/train': 0.8348833685740829} 12/25/2021 07:57:38 - INFO - codeparrot_training - Step 17878: {'lr': 0.00035977063707709397, 'samples': 9154048, 'steps': 17878, 'batch_loss/train': 0.7460111556574702} 12/25/2021 07:57:48 - INFO - codeparrot_training - Step 17879: {'lr': 0.00035975647452258494, 'samples': 9154560, 'steps': 17879, 'batch_loss/train': 0.713279367890209} 12/25/2021 07:57:59 - INFO - codeparrot_training - Step 17880: {'lr': 0.0003597423115317256, 'samples': 9155072, 'steps': 17880, 'batch_loss/train': 0.8075179900042713} 12/25/2021 07:58:13 - INFO - codeparrot_training - Step 17881: {'lr': 0.00035972814810457236, 'samples': 9155584, 'steps': 17881, 'batch_loss/train': 0.7255802797153592} 12/25/2021 07:58:23 - INFO - codeparrot_training - Step 17882: {'lr': 0.0003597139842411814, 'samples': 9156096, 'steps': 17882, 'batch_loss/train': 0.7535289200022817} 12/25/2021 07:58:34 - INFO - codeparrot_training - Step 17883: {'lr': 0.0003596998199416091, 'samples': 9156608, 'steps': 17883, 'batch_loss/train': 0.7828155281022191} 12/25/2021 07:58:46 - INFO - codeparrot_training - Step 17884: {'lr': 0.00035968565520591166, 'samples': 9157120, 'steps': 17884, 'batch_loss/train': 0.7688402673229575} 12/25/2021 07:58:57 - INFO - codeparrot_training - Step 17885: {'lr': 0.0003596714900341455, 'samples': 9157632, 'steps': 17885, 'batch_loss/train': 0.6633924982161261} 12/25/2021 07:59:07 - INFO - codeparrot_training - Step 17886: {'lr': 0.0003596573244263669, 'samples': 9158144, 'steps': 17886, 'batch_loss/train': 0.7185263230931014} 12/25/2021 07:59:19 - INFO - codeparrot_training - Step 17887: {'lr': 0.0003596431583826322, 'samples': 9158656, 'steps': 17887, 'batch_loss/train': 0.810705728828907} 12/25/2021 07:59:30 - INFO - 
codeparrot_training - Step 17888: {'lr': 0.0003596289919029977, 'samples': 9159168, 'steps': 17888, 'batch_loss/train': 0.8424942195415497} 12/25/2021 07:59:41 - INFO - codeparrot_training - Step 17889: {'lr': 0.0003596148249875197, 'samples': 9159680, 'steps': 17889, 'batch_loss/train': 0.790847728960216} 12/25/2021 07:59:51 - INFO - codeparrot_training - Step 17890: {'lr': 0.0003596006576362546, 'samples': 9160192, 'steps': 17890, 'batch_loss/train': 0.7345295459963381} 12/25/2021 08:00:03 - INFO - codeparrot_training - Step 17891: {'lr': 0.0003595864898492586, 'samples': 9160704, 'steps': 17891, 'batch_loss/train': 0.7800624556839466} 12/25/2021 08:00:14 - INFO - codeparrot_training - Step 17892: {'lr': 0.00035957232162658816, 'samples': 9161216, 'steps': 17892, 'batch_loss/train': 0.7356341145932674} 12/25/2021 08:00:24 - INFO - codeparrot_training - Step 17893: {'lr': 0.0003595581529682995, 'samples': 9161728, 'steps': 17893, 'batch_loss/train': 0.6112324977293611} 12/25/2021 08:00:38 - INFO - codeparrot_training - Step 17894: {'lr': 0.000359543983874449, 'samples': 9162240, 'steps': 17894, 'batch_loss/train': 0.8752388022840023} 12/25/2021 08:00:48 - INFO - codeparrot_training - Step 17895: {'lr': 0.000359529814345093, 'samples': 9162752, 'steps': 17895, 'batch_loss/train': 0.7113552229711786} 12/25/2021 08:00:59 - INFO - codeparrot_training - Step 17896: {'lr': 0.00035951564438028777, 'samples': 9163264, 'steps': 17896, 'batch_loss/train': 0.7181276082992554} 12/25/2021 08:01:11 - INFO - codeparrot_training - Step 17897: {'lr': 0.0003595014739800897, 'samples': 9163776, 'steps': 17897, 'batch_loss/train': 0.8090062122792006} 12/25/2021 08:01:22 - INFO - codeparrot_training - Step 17898: {'lr': 0.0003594873031445551, 'samples': 9164288, 'steps': 17898, 'batch_loss/train': 0.8341998532414436} 12/25/2021 08:01:32 - INFO - codeparrot_training - Step 17899: {'lr': 0.0003594731318737404, 'samples': 9164800, 'steps': 17899, 'batch_loss/train': 0.6941004949621856} 
12/25/2021 08:01:46 - INFO - codeparrot_training - Step 17900: {'lr': 0.0003594589601677018, 'samples': 9165312, 'steps': 17900, 'batch_loss/train': 0.7142551252618432} 12/25/2021 08:01:57 - INFO - codeparrot_training - Step 17901: {'lr': 0.0003594447880264958, 'samples': 9165824, 'steps': 17901, 'batch_loss/train': 0.8498937347903848} 12/25/2021 08:02:07 - INFO - codeparrot_training - Step 17902: {'lr': 0.00035943061545017844, 'samples': 9166336, 'steps': 17902, 'batch_loss/train': 0.6493525975383818} 12/25/2021 08:02:18 - INFO - codeparrot_training - Step 17903: {'lr': 0.0003594164424388065, 'samples': 9166848, 'steps': 17903, 'batch_loss/train': 0.7294132527895272} 12/25/2021 08:02:30 - INFO - codeparrot_training - Step 17904: {'lr': 0.000359402268992436, 'samples': 9167360, 'steps': 17904, 'batch_loss/train': 0.7864136369898915} 12/25/2021 08:02:41 - INFO - codeparrot_training - Step 17905: {'lr': 0.00035938809511112333, 'samples': 9167872, 'steps': 17905, 'batch_loss/train': 0.782247724942863} 12/25/2021 08:02:51 - INFO - codeparrot_training - Step 17906: {'lr': 0.00035937392079492497, 'samples': 9168384, 'steps': 17906, 'batch_loss/train': 0.6099997507408261} 12/25/2021 08:03:03 - INFO - codeparrot_training - Step 17907: {'lr': 0.0003593597460438972, 'samples': 9168896, 'steps': 17907, 'batch_loss/train': 0.7918103344272822} 12/25/2021 08:03:14 - INFO - codeparrot_training - Step 17908: {'lr': 0.00035934557085809637, 'samples': 9169408, 'steps': 17908, 'batch_loss/train': 0.8856122093275189} 12/25/2021 08:03:24 - INFO - codeparrot_training - Step 17909: {'lr': 0.0003593313952375788, 'samples': 9169920, 'steps': 17909, 'batch_loss/train': 0.7855873825028539} 12/25/2021 08:03:38 - INFO - codeparrot_training - Step 17910: {'lr': 0.00035931721918240096, 'samples': 9170432, 'steps': 17910, 'batch_loss/train': 0.7967827599495649} 12/25/2021 08:03:49 - INFO - codeparrot_training - Step 17911: {'lr': 0.000359303042692619, 'samples': 9170944, 'steps': 17911, 
'batch_loss/train': 0.7758290329948068} 12/25/2021 08:03:59 - INFO - codeparrot_training - Step 17912: {'lr': 0.00035928886576828956, 'samples': 9171456, 'steps': 17912, 'batch_loss/train': 0.8138824179768562} 12/25/2021 08:04:11 - INFO - codeparrot_training - Step 17913: {'lr': 0.00035927468840946873, 'samples': 9171968, 'steps': 17913, 'batch_loss/train': 0.8014692727010697} 12/25/2021 08:04:22 - INFO - codeparrot_training - Step 17914: {'lr': 0.00035926051061621313, 'samples': 9172480, 'steps': 17914, 'batch_loss/train': 0.7778085209429264} 12/25/2021 08:04:33 - INFO - codeparrot_training - Step 17915: {'lr': 0.00035924633238857887, 'samples': 9172992, 'steps': 17915, 'batch_loss/train': 0.8132130773738027} 12/25/2021 08:04:43 - INFO - codeparrot_training - Step 17916: {'lr': 0.00035923215372662254, 'samples': 9173504, 'steps': 17916, 'batch_loss/train': 0.7739381724968553} 12/25/2021 08:04:57 - INFO - codeparrot_training - Step 17917: {'lr': 0.0003592179746304004, 'samples': 9174016, 'steps': 17917, 'batch_loss/train': 0.5227722069248557} 12/25/2021 08:05:08 - INFO - codeparrot_training - Step 17918: {'lr': 0.00035920379509996877, 'samples': 9174528, 'steps': 17918, 'batch_loss/train': 0.7941859578713775} 12/25/2021 08:05:18 - INFO - codeparrot_training - Step 17919: {'lr': 0.00035918961513538414, 'samples': 9175040, 'steps': 17919, 'batch_loss/train': 0.6831435724161565} 12/25/2021 08:05:31 - INFO - codeparrot_training - Step 17920: {'lr': 0.0003591754347367028, 'samples': 9175552, 'steps': 17920, 'batch_loss/train': 0.7552271038293839} 12/25/2021 08:05:41 - INFO - codeparrot_training - Step 17921: {'lr': 0.00035916125390398116, 'samples': 9176064, 'steps': 17921, 'batch_loss/train': 0.7462661005556583} 12/25/2021 08:05:52 - INFO - codeparrot_training - Step 17922: {'lr': 0.0003591470726372755, 'samples': 9176576, 'steps': 17922, 'batch_loss/train': 0.7556256214156747} 12/25/2021 08:06:04 - INFO - codeparrot_training - Step 17923: {'lr': 
0.00035913289093664246, 'samples': 9177088, 'steps': 17923, 'batch_loss/train': 1.129672598093748} 12/25/2021 08:06:15 - INFO - codeparrot_training - Step 17924: {'lr': 0.00035911870880213814, 'samples': 9177600, 'steps': 17924, 'batch_loss/train': 0.6896006097085774} 12/25/2021 08:06:25 - INFO - codeparrot_training - Step 17925: {'lr': 0.00035910452623381906, 'samples': 9178112, 'steps': 17925, 'batch_loss/train': 0.7896982440724969} 12/25/2021 08:06:36 - INFO - codeparrot_training - Step 17926: {'lr': 0.0003590903432317416, 'samples': 9178624, 'steps': 17926, 'batch_loss/train': 1.6338009471073747} 12/25/2021 08:06:48 - INFO - codeparrot_training - Step 17927: {'lr': 0.0003590761597959622, 'samples': 9179136, 'steps': 17927, 'batch_loss/train': 0.7899782657623291} 12/25/2021 08:06:59 - INFO - codeparrot_training - Step 17928: {'lr': 0.000359061975926537, 'samples': 9179648, 'steps': 17928, 'batch_loss/train': 0.8386667869053781} 12/25/2021 08:07:09 - INFO - codeparrot_training - Step 17929: {'lr': 0.0003590477916235227, 'samples': 9180160, 'steps': 17929, 'batch_loss/train': 0.7890217388048768} 12/25/2021 08:07:23 - INFO - codeparrot_training - Step 17930: {'lr': 0.0003590336068869755, 'samples': 9180672, 'steps': 17930, 'batch_loss/train': 0.6776490486226976} 12/25/2021 08:07:34 - INFO - codeparrot_training - Step 17931: {'lr': 0.00035901942171695186, 'samples': 9181184, 'steps': 17931, 'batch_loss/train': 0.7758531586732715} 12/25/2021 08:07:44 - INFO - codeparrot_training - Step 17932: {'lr': 0.00035900523611350816, 'samples': 9181696, 'steps': 17932, 'batch_loss/train': 0.6284844006877393} 12/25/2021 08:07:56 - INFO - codeparrot_training - Step 17933: {'lr': 0.00035899105007670083, 'samples': 9182208, 'steps': 17933, 'batch_loss/train': 0.6950108115561306} 12/25/2021 08:08:07 - INFO - codeparrot_training - Step 17934: {'lr': 0.0003589768636065862, 'samples': 9182720, 'steps': 17934, 'batch_loss/train': 0.6938013667240739} 12/25/2021 08:08:17 - INFO - 
codeparrot_training - Step 17935: {'lr': 0.0003589626767032207, 'samples': 9183232, 'steps': 17935, 'batch_loss/train': 0.6841128952801228} 12/25/2021 08:08:28 - INFO - codeparrot_training - Step 17936: {'lr': 0.0003589484893666608, 'samples': 9183744, 'steps': 17936, 'batch_loss/train': 0.7871584824752063} 12/25/2021 08:08:40 - INFO - codeparrot_training - Step 17937: {'lr': 0.0003589343015969628, 'samples': 9184256, 'steps': 17937, 'batch_loss/train': 0.7249570544809103} 12/25/2021 08:08:51 - INFO - codeparrot_training - Step 17938: {'lr': 0.00035892011339418313, 'samples': 9184768, 'steps': 17938, 'batch_loss/train': 0.7386880959384143} 12/25/2021 08:09:01 - INFO - codeparrot_training - Step 17939: {'lr': 0.00035890592475837824, 'samples': 9185280, 'steps': 17939, 'batch_loss/train': 0.6662533115595579} 12/25/2021 08:09:16 - INFO - codeparrot_training - Step 17940: {'lr': 0.0003588917356896045, 'samples': 9185792, 'steps': 17940, 'batch_loss/train': 0.5740222339518368} 12/25/2021 08:09:26 - INFO - codeparrot_training - Step 17941: {'lr': 0.00035887754618791837, 'samples': 9186304, 'steps': 17941, 'batch_loss/train': 0.6408050982281566} 12/25/2021 08:09:37 - INFO - codeparrot_training - Step 17942: {'lr': 0.0003588633562533762, 'samples': 9186816, 'steps': 17942, 'batch_loss/train': 0.8514378871768713} 12/25/2021 08:09:49 - INFO - codeparrot_training - Step 17943: {'lr': 0.00035884916588603446, 'samples': 9187328, 'steps': 17943, 'batch_loss/train': 0.6469210954383016} 12/25/2021 08:09:59 - INFO - codeparrot_training - Step 17944: {'lr': 0.00035883497508594947, 'samples': 9187840, 'steps': 17944, 'batch_loss/train': 0.616413538926281} 12/25/2021 08:10:10 - INFO - codeparrot_training - Step 17945: {'lr': 0.0003588207838531777, 'samples': 9188352, 'steps': 17945, 'batch_loss/train': 0.8340562162920833} 12/25/2021 08:10:21 - INFO - codeparrot_training - Step 17946: {'lr': 0.0003588065921877757, 'samples': 9188864, 'steps': 17946, 'batch_loss/train': 
0.7782134097069502} 12/25/2021 08:10:34 - INFO - codeparrot_training - Step 17947: {'lr': 0.0003587924000897997, 'samples': 9189376, 'steps': 17947, 'batch_loss/train': 1.2779919817112386} 12/25/2021 08:10:45 - INFO - codeparrot_training - Step 17948: {'lr': 0.0003587782075593062, 'samples': 9189888, 'steps': 17948, 'batch_loss/train': 0.7896372159011662} 12/25/2021 08:10:56 - INFO - codeparrot_training - Step 17949: {'lr': 0.00035876401459635166, 'samples': 9190400, 'steps': 17949, 'batch_loss/train': 0.7195736085996032} 12/25/2021 08:11:08 - INFO - codeparrot_training - Step 17950: {'lr': 0.0003587498212009924, 'samples': 9190912, 'steps': 17950, 'batch_loss/train': 0.893751721829176} 12/25/2021 08:11:19 - INFO - codeparrot_training - Step 17951: {'lr': 0.0003587356273732849, 'samples': 9191424, 'steps': 17951, 'batch_loss/train': 0.8177682748064399} 12/25/2021 08:11:29 - INFO - codeparrot_training - Step 17952: {'lr': 0.0003587214331132857, 'samples': 9191936, 'steps': 17952, 'batch_loss/train': 0.5647807231289335} 12/25/2021 08:11:41 - INFO - codeparrot_training - Step 17953: {'lr': 0.0003587072384210511, 'samples': 9192448, 'steps': 17953, 'batch_loss/train': 0.7991350749507546} 12/25/2021 08:11:52 - INFO - codeparrot_training - Step 17954: {'lr': 0.0003586930432966375, 'samples': 9192960, 'steps': 17954, 'batch_loss/train': 0.7887737555429339} 12/25/2021 08:12:03 - INFO - codeparrot_training - Step 17955: {'lr': 0.0003586788477401015, 'samples': 9193472, 'steps': 17955, 'batch_loss/train': 0.8598822513595223} 12/25/2021 08:12:17 - INFO - codeparrot_training - Step 17956: {'lr': 0.00035866465175149936, 'samples': 9193984, 'steps': 17956, 'batch_loss/train': 0.7465929626487195} 12/25/2021 08:12:27 - INFO - codeparrot_training - Step 17957: {'lr': 0.0003586504553308876, 'samples': 9194496, 'steps': 17957, 'batch_loss/train': 0.7838387247174978} 12/25/2021 08:12:38 - INFO - codeparrot_training - Step 17958: {'lr': 0.0003586362584783227, 'samples': 9195008, 
'steps': 17958, 'batch_loss/train': 0.7819814500398934} 12/25/2021 08:12:48 - INFO - codeparrot_training - Step 17959: {'lr': 0.00035862206119386097, 'samples': 9195520, 'steps': 17959, 'batch_loss/train': 0.5519686695188284} 12/25/2021 08:13:00 - INFO - codeparrot_training - Step 17960: {'lr': 0.00035860786347755903, 'samples': 9196032, 'steps': 17960, 'batch_loss/train': 0.7857985408045352} 12/25/2021 08:13:11 - INFO - codeparrot_training - Step 17961: {'lr': 0.0003585936653294732, 'samples': 9196544, 'steps': 17961, 'batch_loss/train': 0.7469450421631336} 12/25/2021 08:13:22 - INFO - codeparrot_training - Step 17962: {'lr': 0.00035857946674966, 'samples': 9197056, 'steps': 17962, 'batch_loss/train': 0.7047128267586231} 12/25/2021 08:13:34 - INFO - codeparrot_training - Step 17963: {'lr': 0.0003585652677381758, 'samples': 9197568, 'steps': 17963, 'batch_loss/train': 0.7726348675787449} 12/25/2021 08:13:44 - INFO - codeparrot_training - Step 17964: {'lr': 0.00035855106829507703, 'samples': 9198080, 'steps': 17964, 'batch_loss/train': 0.7891871696338058} 12/25/2021 08:13:55 - INFO - codeparrot_training - Step 17965: {'lr': 0.0003585368684204203, 'samples': 9198592, 'steps': 17965, 'batch_loss/train': 0.7747810315340757} 12/25/2021 08:14:07 - INFO - codeparrot_training - Step 17966: {'lr': 0.00035852266811426195, 'samples': 9199104, 'steps': 17966, 'batch_loss/train': 0.6687993888626806} 12/25/2021 08:14:18 - INFO - codeparrot_training - Step 17967: {'lr': 0.0003585084673766584, 'samples': 9199616, 'steps': 17967, 'batch_loss/train': 0.7887977156788111} 12/25/2021 08:14:29 - INFO - codeparrot_training - Step 17968: {'lr': 0.00035849426620766615, 'samples': 9200128, 'steps': 17968, 'batch_loss/train': 0.6113012530840933} 12/25/2021 08:14:39 - INFO - codeparrot_training - Step 17969: {'lr': 0.0003584800646073417, 'samples': 9200640, 'steps': 17969, 'batch_loss/train': 0.7265186663717031} 12/25/2021 08:14:53 - INFO - codeparrot_training - Step 17970: {'lr': 
0.00035846586257574144, 'samples': 9201152, 'steps': 17970, 'batch_loss/train': 0.7899477975443006} 12/25/2021 08:15:04 - INFO - codeparrot_training - Step 17971: {'lr': 0.0003584516601129219, 'samples': 9201664, 'steps': 17971, 'batch_loss/train': 0.8547662822529674} 12/25/2021 08:15:14 - INFO - codeparrot_training - Step 17972: {'lr': 0.0003584374572189395, 'samples': 9202176, 'steps': 17972, 'batch_loss/train': 0.7347933298442513} 12/25/2021 08:15:26 - INFO - codeparrot_training - Step 17973: {'lr': 0.0003584232538938507, 'samples': 9202688, 'steps': 17973, 'batch_loss/train': 0.8387369848787785} 12/25/2021 08:15:37 - INFO - codeparrot_training - Step 17974: {'lr': 0.000358409050137712, 'samples': 9203200, 'steps': 17974, 'batch_loss/train': 0.7517831844743341} 12/25/2021 08:15:48 - INFO - codeparrot_training - Step 17975: {'lr': 0.0003583948459505799, 'samples': 9203712, 'steps': 17975, 'batch_loss/train': 0.7637354098260403} 12/25/2021 08:16:02 - INFO - codeparrot_training - Step 17976: {'lr': 0.0003583806413325107, 'samples': 9204224, 'steps': 17976, 'batch_loss/train': 0.8272593496367335} 12/25/2021 08:16:12 - INFO - codeparrot_training - Step 17977: {'lr': 0.0003583664362835611, 'samples': 9204736, 'steps': 17977, 'batch_loss/train': 0.7066159271635115} 12/25/2021 08:16:23 - INFO - codeparrot_training - Step 17978: {'lr': 0.00035835223080378736, 'samples': 9205248, 'steps': 17978, 'batch_loss/train': 0.7154819015413523} 12/25/2021 08:16:35 - INFO - codeparrot_training - Step 17979: {'lr': 0.0003583380248932462, 'samples': 9205760, 'steps': 17979, 'batch_loss/train': 0.8088692161254585} 12/25/2021 08:16:45 - INFO - codeparrot_training - Step 17980: {'lr': 0.0003583238185519938, 'samples': 9206272, 'steps': 17980, 'batch_loss/train': 0.7470748107880354} 12/25/2021 08:16:56 - INFO - codeparrot_training - Step 17981: {'lr': 0.00035830961178008695, 'samples': 9206784, 'steps': 17981, 'batch_loss/train': 0.7483450472354889} 12/25/2021 08:17:07 - INFO - 
codeparrot_training - Step 17982: {'lr': 0.0003582954045775819, 'samples': 9207296, 'steps': 17982, 'batch_loss/train': 0.709467462147586} 12/25/2021 08:17:19 - INFO - codeparrot_training - Step 17983: {'lr': 0.0003582811969445351, 'samples': 9207808, 'steps': 17983, 'batch_loss/train': 0.8133959074039012} 12/25/2021 08:17:29 - INFO - codeparrot_training - Step 17984: {'lr': 0.0003582669888810034, 'samples': 9208320, 'steps': 17984, 'batch_loss/train': 0.7242816729121841} 12/25/2021 08:17:40 - INFO - codeparrot_training - Step 17985: {'lr': 0.0003582527803870429, 'samples': 9208832, 'steps': 17985, 'batch_loss/train': 0.8012088220566511} 12/25/2021 08:17:52 - INFO - codeparrot_training - Step 17986: {'lr': 0.0003582385714627102, 'samples': 9209344, 'steps': 17986, 'batch_loss/train': 0.757776400540024} 12/25/2021 08:18:03 - INFO - codeparrot_training - Step 17987: {'lr': 0.00035822436210806183, 'samples': 9209856, 'steps': 17987, 'batch_loss/train': 0.7740676682442427} 12/25/2021 08:18:13 - INFO - codeparrot_training - Step 17988: {'lr': 0.0003582101523231543, 'samples': 9210368, 'steps': 17988, 'batch_loss/train': 0.721419503854122} 12/25/2021 08:18:28 - INFO - codeparrot_training - Step 17989: {'lr': 0.00035819594210804404, 'samples': 9210880, 'steps': 17989, 'batch_loss/train': 0.7732008676975965} 12/25/2021 08:18:39 - INFO - codeparrot_training - Step 17990: {'lr': 0.00035818173146278753, 'samples': 9211392, 'steps': 17990, 'batch_loss/train': 0.8654278581961989} 12/25/2021 08:18:49 - INFO - codeparrot_training - Step 17991: {'lr': 0.0003581675203874413, 'samples': 9211904, 'steps': 17991, 'batch_loss/train': 0.7352234162390232} 12/25/2021 08:19:00 - INFO - codeparrot_training - Step 17992: {'lr': 0.00035815330888206196, 'samples': 9212416, 'steps': 17992, 'batch_loss/train': 0.6819380226661451} 12/25/2021 08:19:12 - INFO - codeparrot_training - Step 17993: {'lr': 0.00035813909694670576, 'samples': 9212928, 'steps': 17993, 'batch_loss/train': 
0.8022443321533501} 12/25/2021 08:19:23 - INFO - codeparrot_training - Step 17994: {'lr': 0.00035812488458142946, 'samples': 9213440, 'steps': 17994, 'batch_loss/train': 0.7760406825691462} 12/25/2021 08:19:33 - INFO - codeparrot_training - Step 17995: {'lr': 0.00035811067178628933, 'samples': 9213952, 'steps': 17995, 'batch_loss/train': 0.6545461164787412} 12/25/2021 08:19:47 - INFO - codeparrot_training - Step 17996: {'lr': 0.00035809645856134197, 'samples': 9214464, 'steps': 17996, 'batch_loss/train': 0.7219920232892036} 12/25/2021 08:19:58 - INFO - codeparrot_training - Step 17997: {'lr': 0.00035808224490664403, 'samples': 9214976, 'steps': 17997, 'batch_loss/train': 0.761314956471324} 12/25/2021 08:20:09 - INFO - codeparrot_training - Step 17998: {'lr': 0.0003580680308222519, 'samples': 9215488, 'steps': 17998, 'batch_loss/train': 0.7528296438977122} 12/25/2021 08:20:21 - INFO - codeparrot_training - Step 17999: {'lr': 0.00035805381630822197, 'samples': 9216000, 'steps': 17999, 'batch_loss/train': 0.7405782712739892} 12/25/2021 08:20:31 - INFO - codeparrot_training - Step 18000: {'lr': 0.00035803960136461093, 'samples': 9216512, 'steps': 18000, 'batch_loss/train': 1.50251404941082} 12/25/2021 08:20:42 - INFO - codeparrot_training - Step 18001: {'lr': 0.00035802538599147526, 'samples': 9217024, 'steps': 18001, 'batch_loss/train': 0.7431442211382091} 12/25/2021 08:20:53 - INFO - codeparrot_training - Step 18002: {'lr': 0.0003580111701888714, 'samples': 9217536, 'steps': 18002, 'batch_loss/train': 0.7576367296278477} 12/25/2021 08:21:05 - INFO - codeparrot_training - Step 18003: {'lr': 0.00035799695395685597, 'samples': 9218048, 'steps': 18003, 'batch_loss/train': 0.6729948143474758} 12/25/2021 08:21:15 - INFO - codeparrot_training - Step 18004: {'lr': 0.00035798273729548537, 'samples': 9218560, 'steps': 18004, 'batch_loss/train': 0.7414884227328002} 12/25/2021 08:21:26 - INFO - codeparrot_training - Step 18005: {'lr': 0.0003579685202048162, 'samples': 9219072, 
'steps': 18005, 'batch_loss/train': 0.7696673972532153} 12/25/2021 08:21:40 - INFO - codeparrot_training - Step 18006: {'lr': 0.0003579543026849049, 'samples': 9219584, 'steps': 18006, 'batch_loss/train': 0.757774646859616} 12/25/2021 08:21:51 - INFO - codeparrot_training - Step 18007: {'lr': 0.00035794008473580816, 'samples': 9220096, 'steps': 18007, 'batch_loss/train': 0.8522818749770522} 12/25/2021 08:22:01 - INFO - codeparrot_training - Step 18008: {'lr': 0.00035792586635758234, 'samples': 9220608, 'steps': 18008, 'batch_loss/train': 0.6781428731046617} 12/25/2021 08:22:13 - INFO - codeparrot_training - Step 18009: {'lr': 0.000357911647550284, 'samples': 9221120, 'steps': 18009, 'batch_loss/train': 0.7549514886923134} 12/25/2021 08:22:24 - INFO - codeparrot_training - Step 18010: {'lr': 0.0003578974283139698, 'samples': 9221632, 'steps': 18010, 'batch_loss/train': 0.8066920862765983} 12/25/2021 08:22:35 - INFO - codeparrot_training - Step 18011: {'lr': 0.00035788320864869606, 'samples': 9222144, 'steps': 18011, 'batch_loss/train': 1.1306907013058662} 12/25/2021 08:22:45 - INFO - codeparrot_training - Step 18012: {'lr': 0.00035786898855451945, 'samples': 9222656, 'steps': 18012, 'batch_loss/train': 0.7631335817277431} 12/25/2021 08:22:57 - INFO - codeparrot_training - Step 18013: {'lr': 0.0003578547680314964, 'samples': 9223168, 'steps': 18013, 'batch_loss/train': 0.802593074273318} 12/25/2021 08:23:08 - INFO - codeparrot_training - Step 18014: {'lr': 0.00035784054707968364, 'samples': 9223680, 'steps': 18014, 'batch_loss/train': 0.7456806129775941} 12/25/2021 08:23:18 - INFO - codeparrot_training - Step 18015: {'lr': 0.0003578263256991375, 'samples': 9224192, 'steps': 18015, 'batch_loss/train': 0.6912586501566693} 12/25/2021 08:23:32 - INFO - codeparrot_training - Step 18016: {'lr': 0.0003578121038899147, 'samples': 9224704, 'steps': 18016, 'batch_loss/train': 0.7460655029863119} 12/25/2021 08:23:43 - INFO - codeparrot_training - Step 18017: {'lr': 
0.0003577978816520716, 'samples': 9225216, 'steps': 18017, 'batch_loss/train': 0.773007595911622} 12/25/2021 08:23:53 - INFO - codeparrot_training - Step 18018: {'lr': 0.00035778365898566484, 'samples': 9225728, 'steps': 18018, 'batch_loss/train': 0.8175439108163118} 12/25/2021 08:24:05 - INFO - codeparrot_training - Step 18019: {'lr': 0.0003577694358907509, 'samples': 9226240, 'steps': 18019, 'batch_loss/train': 0.8007014319300652} 12/25/2021 08:24:16 - INFO - codeparrot_training - Step 18020: {'lr': 0.00035775521236738645, 'samples': 9226752, 'steps': 18020, 'batch_loss/train': 0.7910770792514086} 12/25/2021 08:24:27 - INFO - codeparrot_training - Step 18021: {'lr': 0.000357740988415628, 'samples': 9227264, 'steps': 18021, 'batch_loss/train': 0.7486796891316772} 12/25/2021 08:24:37 - INFO - codeparrot_training - Step 18022: {'lr': 0.00035772676403553194, 'samples': 9227776, 'steps': 18022, 'batch_loss/train': 0.6901904738042504} 12/25/2021 08:24:51 - INFO - codeparrot_training - Step 18023: {'lr': 0.000357712539227155, 'samples': 9228288, 'steps': 18023, 'batch_loss/train': 0.7023059241473675} 12/25/2021 08:25:02 - INFO - codeparrot_training - Step 18024: {'lr': 0.0003576983139905537, 'samples': 9228800, 'steps': 18024, 'batch_loss/train': 0.7616584650240839} 12/25/2021 08:25:12 - INFO - codeparrot_training - Step 18025: {'lr': 0.00035768408832578455, 'samples': 9229312, 'steps': 18025, 'batch_loss/train': 0.8151846374385059} 12/25/2021 08:25:24 - INFO - codeparrot_training - Step 18026: {'lr': 0.0003576698622329041, 'samples': 9229824, 'steps': 18026, 'batch_loss/train': 0.7010313849896193} 12/25/2021 08:25:35 - INFO - codeparrot_training - Step 18027: {'lr': 0.00035765563571196903, 'samples': 9230336, 'steps': 18027, 'batch_loss/train': 0.7225790824741125} 12/25/2021 08:25:45 - INFO - codeparrot_training - Step 18028: {'lr': 0.00035764140876303564, 'samples': 9230848, 'steps': 18028, 'batch_loss/train': 0.8526424001902342} 12/25/2021 08:25:58 - INFO - 
codeparrot_training - Step 18029: {'lr': 0.00035762718138616077, 'samples': 9231360, 'steps': 18029, 'batch_loss/train': 0.7714153425768018} 12/25/2021 08:26:08 - INFO - codeparrot_training - Step 18030: {'lr': 0.00035761295358140084, 'samples': 9231872, 'steps': 18030, 'batch_loss/train': 0.6287211526650935} 12/25/2021 08:26:19 - INFO - codeparrot_training - Step 18031: {'lr': 0.0003575987253488124, 'samples': 9232384, 'steps': 18031, 'batch_loss/train': 0.6888563483953476} 12/25/2021 08:26:30 - INFO - codeparrot_training - Step 18032: {'lr': 0.0003575844966884521, 'samples': 9232896, 'steps': 18032, 'batch_loss/train': 0.7873077616095543} 12/25/2021 08:26:42 - INFO - codeparrot_training - Step 18033: {'lr': 0.00035757026760037646, 'samples': 9233408, 'steps': 18033, 'batch_loss/train': 0.7503557032905519} 12/25/2021 08:26:52 - INFO - codeparrot_training - Step 18034: {'lr': 0.0003575560380846421, 'samples': 9233920, 'steps': 18034, 'batch_loss/train': 0.7535805222578347} 12/25/2021 08:27:03 - INFO - codeparrot_training - Step 18035: {'lr': 0.00035754180814130546, 'samples': 9234432, 'steps': 18035, 'batch_loss/train': 0.8216107292100787} 12/25/2021 08:27:17 - INFO - codeparrot_training - Step 18036: {'lr': 0.0003575275777704232, 'samples': 9234944, 'steps': 18036, 'batch_loss/train': 0.7651052037253976} 12/25/2021 08:27:27 - INFO - codeparrot_training - Step 18037: {'lr': 0.00035751334697205195, 'samples': 9235456, 'steps': 18037, 'batch_loss/train': 0.7631453615613282} 12/25/2021 08:27:38 - INFO - codeparrot_training - Step 18038: {'lr': 0.00035749911574624814, 'samples': 9235968, 'steps': 18038, 'batch_loss/train': 0.8542109616100788} 12/25/2021 08:27:50 - INFO - codeparrot_training - Step 18039: {'lr': 0.00035748488409306854, 'samples': 9236480, 'steps': 18039, 'batch_loss/train': 0.7904482870362699} 12/25/2021 08:28:01 - INFO - codeparrot_training - Step 18040: {'lr': 0.0003574706520125696, 'samples': 9236992, 'steps': 18040, 'batch_loss/train': 
0.9757859111996368}
12/25/2021 08:28:11 - INFO - codeparrot_training - Step 18041: {'lr': 0.00035745641950480787, 'samples': 9237504, 'steps': 18041, 'batch_loss/train': 0.8174637202173471}
12/25/2021 08:28:23 - INFO - codeparrot_training - Step 18042: {'lr': 0.00035744218656984006, 'samples': 9238016, 'steps': 18042, 'batch_loss/train': 0.8400824069976807}
12/25/2021 08:28:34 - INFO - codeparrot_training - Step 18043: {'lr': 0.00035742795320772267, 'samples': 9238528, 'steps': 18043, 'batch_loss/train': 0.8633889523334801}
12/25/2021 08:28:45 - INFO - codeparrot_training - Step 18044: {'lr': 0.00035741371941851223, 'samples': 9239040, 'steps': 18044, 'batch_loss/train': 0.9193166270852089}
12/25/2021 08:28:55 - INFO - codeparrot_training - Step 18045: {'lr': 0.00035739948520226546, 'samples': 9239552, 'steps': 18045, 'batch_loss/train': 0.8489407487213612}
12/25/2021 08:29:09 - INFO - codeparrot_training - Step 18046: {'lr': 0.00035738525055903884, 'samples': 9240064, 'steps': 18046, 'batch_loss/train': 0.7366886333329603}
12/25/2021 08:29:20 - INFO - codeparrot_training - Step 18047: {'lr': 0.00035737101548888913, 'samples': 9240576, 'steps': 18047, 'batch_loss/train': 0.7270556762814522}
12/25/2021 08:29:31 - INFO - codeparrot_training - Step 18048: {'lr': 0.00035735677999187266, 'samples': 9241088, 'steps': 18048, 'batch_loss/train': 0.8159760646522045}
12/25/2021 08:29:43 - INFO - codeparrot_training - Step 18049: {'lr': 0.0003573425440680463, 'samples': 9241600, 'steps': 18049, 'batch_loss/train': 0.7386567518115044}
12/25/2021 08:29:54 - INFO - codeparrot_training - Step 18050: {'lr': 0.00035732830771746644, 'samples': 9242112, 'steps': 18050, 'batch_loss/train': 0.8147954475134611}
12/25/2021 08:30:04 - INFO - codeparrot_training - Step 18051: {'lr': 0.00035731407094018977, 'samples': 9242624, 'steps': 18051, 'batch_loss/train': 0.6669768729479983}
12/25/2021 08:30:16 - INFO - codeparrot_training - Step 18052: {'lr': 0.00035729983373627283, 'samples': 9243136, 'steps': 18052, 'batch_loss/train': 0.5283348395023495}
12/25/2021 08:30:27 - INFO - codeparrot_training - Step 18053: {'lr': 0.0003572855961057724, 'samples': 9243648, 'steps': 18053, 'batch_loss/train': 0.816974394954741}
12/25/2021 08:30:38 - INFO - codeparrot_training - Step 18054: {'lr': 0.0003572713580487448, 'samples': 9244160, 'steps': 18054, 'batch_loss/train': 0.9299474079161882}
12/25/2021 08:30:48 - INFO - codeparrot_training - Step 18055: {'lr': 0.00035725711956524685, 'samples': 9244672, 'steps': 18055, 'batch_loss/train': 0.7922656764858402}
12/25/2021 08:31:02 - INFO - codeparrot_training - Step 18056: {'lr': 0.0003572428806553351, 'samples': 9245184, 'steps': 18056, 'batch_loss/train': 0.900157243013382}
12/25/2021 08:31:13 - INFO - codeparrot_training - Step 18057: {'lr': 0.00035722864131906607, 'samples': 9245696, 'steps': 18057, 'batch_loss/train': 0.8177419062703848}
12/25/2021 08:31:23 - INFO - codeparrot_training - Step 18058: {'lr': 0.00035721440155649656, 'samples': 9246208, 'steps': 18058, 'batch_loss/train': 0.8525614580139518}
12/25/2021 08:31:35 - INFO - codeparrot_training - Step 18059: {'lr': 0.00035720016136768295, 'samples': 9246720, 'steps': 18059, 'batch_loss/train': 0.762987450696528}
12/25/2021 08:31:46 - INFO - codeparrot_training - Step 18060: {'lr': 0.0003571859207526821, 'samples': 9247232, 'steps': 18060, 'batch_loss/train': 0.8530765231698751}
12/25/2021 08:31:56 - INFO - codeparrot_training - Step 18061: {'lr': 0.0003571716797115504, 'samples': 9247744, 'steps': 18061, 'batch_loss/train': 0.8653159160166979}
12/25/2021 08:32:11 - INFO - codeparrot_training - Step 18062: {'lr': 0.00035715743824434466, 'samples': 9248256, 'steps': 18062, 'batch_loss/train': 0.9101435570046306}
12/25/2021 08:32:21 - INFO - codeparrot_training - Step 18063: {'lr': 0.0003571431963511214, 'samples': 9248768, 'steps': 18063, 'batch_loss/train': 0.847901126369834}
12/25/2021 08:32:32 - INFO - codeparrot_training - Step 18064: {'lr': 0.0003571289540319371, 'samples': 9249280, 'steps': 18064, 'batch_loss/train': 0.7641618775669485}
12/25/2021 08:32:44 - INFO - codeparrot_training - Step 18065: {'lr': 0.0003571147112868487, 'samples': 9249792, 'steps': 18065, 'batch_loss/train': 0.8529309323057532}
12/25/2021 08:32:55 - INFO - codeparrot_training - Step 18066: {'lr': 0.00035710046811591263, 'samples': 9250304, 'steps': 18066, 'batch_loss/train': 0.8765636370517313}
12/25/2021 08:33:05 - INFO - codeparrot_training - Step 18067: {'lr': 0.0003570862245191854, 'samples': 9250816, 'steps': 18067, 'batch_loss/train': 0.7473999529611319}
12/25/2021 08:33:16 - INFO - codeparrot_training - Step 18068: {'lr': 0.00035707198049672394, 'samples': 9251328, 'steps': 18068, 'batch_loss/train': 0.7504917057231069}
12/25/2021 08:33:28 - INFO - codeparrot_training - Step 18069: {'lr': 0.00035705773604858464, 'samples': 9251840, 'steps': 18069, 'batch_loss/train': 0.7873640218749642}
12/25/2021 08:33:38 - INFO - codeparrot_training - Step 18070: {'lr': 0.0003570434911748242, 'samples': 9252352, 'steps': 18070, 'batch_loss/train': 0.8325519734062254}
12/25/2021 08:33:49 - INFO - codeparrot_training - Step 18071: {'lr': 0.0003570292458754993, 'samples': 9252864, 'steps': 18071, 'batch_loss/train': 0.7413284645881504}
12/25/2021 08:34:01 - INFO - codeparrot_training - Step 18072: {'lr': 0.00035701500015066645, 'samples': 9253376, 'steps': 18072, 'batch_loss/train': 0.7178043639287353}
12/25/2021 08:34:12 - INFO - codeparrot_training - Step 18073: {'lr': 0.0003570007540003825, 'samples': 9253888, 'steps': 18073, 'batch_loss/train': 0.6429313896223903}
12/25/2021 08:34:22 - INFO - codeparrot_training - Step 18074: {'lr': 0.0003569865074247038, 'samples': 9254400, 'steps': 18074, 'batch_loss/train': 0.797141679096967}
12/25/2021 08:34:36 - INFO - codeparrot_training - Step 18075: {'lr': 0.0003569722604236872, 'samples': 9254912, 'steps': 18075, 'batch_loss/train': 0.7119513088837266}
12/25/2021 08:34:47 - INFO - codeparrot_training - Step 18076: {'lr': 0.0003569580129973894, 'samples': 9255424, 'steps': 18076, 'batch_loss/train': 1.3891665302217007}
12/25/2021 08:34:57 - INFO - codeparrot_training - Step 18077: {'lr': 0.0003569437651458668, 'samples': 9255936, 'steps': 18077, 'batch_loss/train': 0.983474699780345}
12/25/2021 08:35:08 - INFO - codeparrot_training - Step 18078: {'lr': 0.0003569295168691762, 'samples': 9256448, 'steps': 18078, 'batch_loss/train': 0.8987373067066073}
12/25/2021 08:35:20 - INFO - codeparrot_training - Step 18079: {'lr': 0.00035691526816737423, 'samples': 9256960, 'steps': 18079, 'batch_loss/train': 0.7845617025159299}
12/25/2021 08:35:31 - INFO - codeparrot_training - Step 18080: {'lr': 0.0003569010190405174, 'samples': 9257472, 'steps': 18080, 'batch_loss/train': 0.7848463701084256}
12/25/2021 08:35:41 - INFO - codeparrot_training - Step 18081: {'lr': 0.0003568867694886627, 'samples': 9257984, 'steps': 18081, 'batch_loss/train': 0.8279461171478033}
12/25/2021 08:35:55 - INFO - codeparrot_training - Step 18082: {'lr': 0.0003568725195118664, 'samples': 9258496, 'steps': 18082, 'batch_loss/train': 0.682474825065583}
12/25/2021 08:36:06 - INFO - codeparrot_training - Step 18083: {'lr': 0.00035685826911018535, 'samples': 9259008, 'steps': 18083, 'batch_loss/train': 0.8513691071420908}
12/25/2021 08:36:16 - INFO - codeparrot_training - Step 18084: {'lr': 0.0003568440182836762, 'samples': 9259520, 'steps': 18084, 'batch_loss/train': 0.9135038927197456}
12/25/2021 08:36:28 - INFO - codeparrot_training - Step 18085: {'lr': 0.0003568297670323956, 'samples': 9260032, 'steps': 18085, 'batch_loss/train': 0.788715117610991}
12/25/2021 08:36:39 - INFO - codeparrot_training - Step 18086: {'lr': 0.0003568155153564001, 'samples': 9260544, 'steps': 18086, 'batch_loss/train': 0.9857478244230151}
12/25/2021 08:36:50 - INFO - codeparrot_training - Step 18087: {'lr': 0.0003568012632557465, 'samples': 9261056, 'steps': 18087, 'batch_loss/train': 0.7928656367585063}
12/25/2021 08:37:02 - INFO - codeparrot_training - Step 18088: {'lr': 0.00035678701073049143, 'samples': 9261568, 'steps': 18088, 'batch_loss/train': 0.762219331227243}
12/25/2021 08:37:12 - INFO - codeparrot_training - Step 18089: {'lr': 0.00035677275778069154, 'samples': 9262080, 'steps': 18089, 'batch_loss/train': 0.7894670034293085}
12/25/2021 08:37:23 - INFO - codeparrot_training - Step 18090: {'lr': 0.00035675850440640335, 'samples': 9262592, 'steps': 18090, 'batch_loss/train': 0.8283325689844787}
12/25/2021 08:37:34 - INFO - codeparrot_training - Step 18091: {'lr': 0.00035674425060768373, 'samples': 9263104, 'steps': 18091, 'batch_loss/train': 0.7848785864189267}
12/25/2021 08:37:47 - INFO - codeparrot_training - Step 18092: {'lr': 0.0003567299963845894, 'samples': 9263616, 'steps': 18092, 'batch_loss/train': 0.6291342582553625}
12/25/2021 08:37:58 - INFO - codeparrot_training - Step 18093: {'lr': 0.0003567157417371767, 'samples': 9264128, 'steps': 18093, 'batch_loss/train': 0.8658454539254308}
12/25/2021 08:38:08 - INFO - codeparrot_training - Step 18094: {'lr': 0.00035670148666550256, 'samples': 9264640, 'steps': 18094, 'batch_loss/train': 0.6990630300715566}
12/25/2021 08:38:20 - INFO - codeparrot_training - Step 18095: {'lr': 0.00035668723116962365, 'samples': 9265152, 'steps': 18095, 'batch_loss/train': 0.7974061910063028}
12/25/2021 08:38:31 - INFO - codeparrot_training - Step 18096: {'lr': 0.0003566729752495965, 'samples': 9265664, 'steps': 18096, 'batch_loss/train': 0.8366684759967029}
12/25/2021 08:38:42 - INFO - codeparrot_training - Step 18097: {'lr': 0.000356658718905478, 'samples': 9266176, 'steps': 18097, 'batch_loss/train': 0.8468843244481832}
12/25/2021 08:38:54 - INFO - codeparrot_training - Step 18098: {'lr': 0.0003566444621373247, 'samples': 9266688, 'steps': 18098, 'batch_loss/train': 0.865192074328661}
12/25/2021 08:39:04 - INFO - codeparrot_training - Step 18099: {'lr': 0.00035663020494519323, 'samples': 9267200, 'steps': 18099, 'batch_loss/train': 0.7830242915078998}
12/25/2021 08:39:15 - INFO - codeparrot_training - Step 18100: {'lr': 0.00035661594732914033, 'samples': 9267712, 'steps': 18100, 'batch_loss/train': 0.8485425626859069}
12/25/2021 08:39:25 - INFO - codeparrot_training - Step 18101: {'lr': 0.00035660168928922277, 'samples': 9268224, 'steps': 18101, 'batch_loss/train': 0.7940842187963426}
12/25/2021 08:39:39 - INFO - codeparrot_training - Step 18102: {'lr': 0.00035658743082549703, 'samples': 9268736, 'steps': 18102, 'batch_loss/train': 0.846746638417244}
12/25/2021 08:39:50 - INFO - codeparrot_training - Step 18103: {'lr': 0.0003565731719380199, 'samples': 9269248, 'steps': 18103, 'batch_loss/train': 0.7290667071938515}
12/25/2021 08:40:00 - INFO - codeparrot_training - Step 18104: {'lr': 0.00035655891262684815, 'samples': 9269760, 'steps': 18104, 'batch_loss/train': 0.7071254896000028}
12/25/2021 08:40:13 - INFO - codeparrot_training - Step 18105: {'lr': 0.00035654465289203843, 'samples': 9270272, 'steps': 18105, 'batch_loss/train': 0.8398349778726697}
12/25/2021 08:40:23 - INFO - codeparrot_training - Step 18106: {'lr': 0.0003565303927336473, 'samples': 9270784, 'steps': 18106, 'batch_loss/train': 1.6366764530539513}
12/25/2021 08:40:34 - INFO - codeparrot_training - Step 18107: {'lr': 0.0003565161321517316, 'samples': 9271296, 'steps': 18107, 'batch_loss/train': 0.8430662723258138}
12/25/2021 08:40:46 - INFO - codeparrot_training - Step 18108: {'lr': 0.000356501871146348, 'samples': 9271808, 'steps': 18108, 'batch_loss/train': 0.7722038964275271}
12/25/2021 08:40:57 - INFO - codeparrot_training - Step 18109: {'lr': 0.0003564876097175531, 'samples': 9272320, 'steps': 18109, 'batch_loss/train': 0.8868825575336814}
12/25/2021 08:41:07 - INFO - codeparrot_training - Step 18110: {'lr': 0.00035647334786540375, 'samples': 9272832, 'steps': 18110, 'batch_loss/train': 0.9719025306403637}
12/25/2021 08:41:18 - INFO - codeparrot_training - Step 18111: {'lr': 0.00035645908558995656, 'samples': 9273344, 'steps': 18111, 'batch_loss/train': 0.8098523633088917}
12/25/2021 08:41:32 - INFO - codeparrot_training - Step 18112: {'lr': 0.0003564448228912682, 'samples': 9273856, 'steps': 18112, 'batch_loss/train': 0.8216093452647328}
12/25/2021 08:41:42 - INFO - codeparrot_training - Step 18113: {'lr': 0.0003564305597693954, 'samples': 9274368, 'steps': 18113, 'batch_loss/train': 0.8353100954554975}
12/25/2021 08:41:53 - INFO - codeparrot_training - Step 18114: {'lr': 0.00035641629622439497, 'samples': 9274880, 'steps': 18114, 'batch_loss/train': 0.8476331494748592}
12/25/2021 08:42:05 - INFO - codeparrot_training - Step 18115: {'lr': 0.00035640203225632345, 'samples': 9275392, 'steps': 18115, 'batch_loss/train': 0.6335338777862489}
12/25/2021 08:42:16 - INFO - codeparrot_training - Step 18116: {'lr': 0.00035638776786523765, 'samples': 9275904, 'steps': 18116, 'batch_loss/train': 0.7739562122151256}
12/25/2021 08:42:26 - INFO - codeparrot_training - Step 18117: {'lr': 0.0003563735030511943, 'samples': 9276416, 'steps': 18117, 'batch_loss/train': 0.8076600208878517}
12/25/2021 08:42:38 - INFO - codeparrot_training - Step 18118: {'lr': 0.00035635923781425, 'samples': 9276928, 'steps': 18118, 'batch_loss/train': 0.736934176646173}
12/25/2021 08:42:49 - INFO - codeparrot_training - Step 18119: {'lr': 0.00035634497215446156, 'samples': 9277440, 'steps': 18119, 'batch_loss/train': 0.8756352066993713}
12/25/2021 08:43:00 - INFO - codeparrot_training - Step 18120: {'lr': 0.00035633070607188566, 'samples': 9277952, 'steps': 18120, 'batch_loss/train': 0.8475366681814194}
12/25/2021 08:43:13 - INFO - codeparrot_training - Step 18121: {'lr': 0.00035631643956657907, 'samples': 9278464, 'steps': 18121, 'batch_loss/train': 0.8193118013441563}
12/25/2021 08:43:24 - INFO - codeparrot_training - Step 18122: {'lr': 0.0003563021726385983, 'samples': 9278976, 'steps': 18122, 'batch_loss/train': 0.7245549439685419}
12/25/2021 08:43:35 - INFO - codeparrot_training - Step 18123: {'lr': 0.00035628790528800043, 'samples': 9279488, 'steps': 18123, 'batch_loss/train': 0.8384366622194648}
12/25/2021 08:43:45 - INFO - codeparrot_training - Step 18124: {'lr': 0.00035627363751484186, 'samples': 9280000, 'steps': 18124, 'batch_loss/train': 0.7838488528504968}
12/25/2021 08:43:57 - INFO - codeparrot_training - Step 18125: {'lr': 0.00035625936931917945, 'samples': 9280512, 'steps': 18125, 'batch_loss/train': 0.7992687033256516}
12/25/2021 08:44:08 - INFO - codeparrot_training - Step 18126: {'lr': 0.00035624510070107, 'samples': 9281024, 'steps': 18126, 'batch_loss/train': 0.7112114366609603}
12/25/2021 08:44:18 - INFO - codeparrot_training - Step 18127: {'lr': 0.0003562308316605701, 'samples': 9281536, 'steps': 18127, 'batch_loss/train': 0.7160125657101162}
12/25/2021 08:44:31 - INFO - codeparrot_training - Step 18128: {'lr': 0.00035621656219773646, 'samples': 9282048, 'steps': 18128, 'batch_loss/train': 0.7902070221025497}
12/25/2021 08:44:41 - INFO - codeparrot_training - Step 18129: {'lr': 0.0003562022923126259, 'samples': 9282560, 'steps': 18129, 'batch_loss/train': 0.8086669351905584}
12/25/2021 08:44:52 - INFO - codeparrot_training - Step 18130: {'lr': 0.00035618802200529517, 'samples': 9283072, 'steps': 18130, 'batch_loss/train': 0.7708114478737116}
12/25/2021 08:45:05 - INFO - codeparrot_training - Step 18131: {'lr': 0.0003561737512758009, 'samples': 9283584, 'steps': 18131, 'batch_loss/train': 0.7739643910899758}
12/25/2021 08:45:16 - INFO - codeparrot_training - Step 18132: {'lr': 0.0003561594801241999, 'samples': 9284096, 'steps': 18132, 'batch_loss/train': 0.7935405120952055}
12/25/2021 08:45:27 - INFO - codeparrot_training - Step 18133: {'lr': 0.0003561452085505489, 'samples': 9284608, 'steps': 18133, 'batch_loss/train': 0.8007267699576914}
12/25/2021 08:45:37 - INFO - codeparrot_training - Step 18134: {'lr': 0.00035613093655490467, 'samples': 9285120, 'steps': 18134, 'batch_loss/train': 0.790337479673326}
12/25/2021 08:45:49 - INFO - codeparrot_training - Step 18135: {'lr': 0.0003561166641373238, 'samples': 9285632, 'steps': 18135, 'batch_loss/train': 0.7367321443744004}
12/25/2021 08:46:00 - INFO - codeparrot_training - Step 18136: {'lr': 0.00035610239129786327, 'samples': 9286144, 'steps': 18136, 'batch_loss/train': 0.7659020749852061}
12/25/2021 08:46:11 - INFO - codeparrot_training - Step 18137: {'lr': 0.00035608811803657965, 'samples': 9286656, 'steps': 18137, 'batch_loss/train': 0.7844307490158826}
12/25/2021 08:46:24 - INFO - codeparrot_training - Step 18138: {'lr': 0.00035607384435352974, 'samples': 9287168, 'steps': 18138, 'batch_loss/train': 0.7088221220765263}
12/25/2021 08:46:35 - INFO - codeparrot_training - Step 18139: {'lr': 0.00035605957024877023, 'samples': 9287680, 'steps': 18139, 'batch_loss/train': 0.7447025999426842}
12/25/2021 08:46:46 - INFO - codeparrot_training - Step 18140: {'lr': 0.000356045295722358, 'samples': 9288192, 'steps': 18140, 'batch_loss/train': 0.8366460325196385}
12/25/2021 08:46:58 - INFO - codeparrot_training - Step 18141: {'lr': 0.00035603102077434956, 'samples': 9288704, 'steps': 18141, 'batch_loss/train': 0.7551999660208821}
12/25/2021 08:47:08 - INFO - codeparrot_training - Step 18142: {'lr': 0.000356016745404802, 'samples': 9289216, 'steps': 18142, 'batch_loss/train': 0.8483545645140111}
12/25/2021 08:47:19 - INFO - codeparrot_training - Step 18143: {'lr': 0.0003560024696137718, 'samples': 9289728, 'steps': 18143, 'batch_loss/train': 0.7712157610803843}
12/25/2021 08:47:29 - INFO - codeparrot_training - Step 18144: {'lr': 0.00035598819340131575, 'samples': 9290240, 'steps': 18144, 'batch_loss/train': 0.799200483597815}
12/25/2021 08:47:42 - INFO - codeparrot_training - Step 18145: {'lr': 0.00035597391676749074, 'samples': 9290752, 'steps': 18145, 'batch_loss/train': 1.7518038777634501}
12/25/2021 08:47:52 - INFO - codeparrot_training - Step 18146: {'lr': 0.00035595963971235344, 'samples': 9291264, 'steps': 18146, 'batch_loss/train': 0.7712211571633816}
12/25/2021 08:48:03 - INFO - codeparrot_training - Step 18147: {'lr': 0.0003559453622359606, 'samples': 9291776, 'steps': 18147, 'batch_loss/train': 0.8473204066976905}
12/25/2021 08:48:15 - INFO - codeparrot_training - Step 18148: {'lr': 0.000355931084338369, 'samples': 9292288, 'steps': 18148, 'batch_loss/train': 0.6671972501208074}
12/25/2021 08:48:26 - INFO - codeparrot_training - Step 18149: {'lr': 0.0003559168060196355, 'samples': 9292800, 'steps': 18149, 'batch_loss/train': 0.8234849036671221}
12/25/2021 08:48:36 - INFO - codeparrot_training - Step 18150: {'lr': 0.0003559025272798167, 'samples': 9293312, 'steps': 18150, 'batch_loss/train': 0.7931095147505403}
12/25/2021 08:48:50 - INFO - codeparrot_training - Step 18151: {'lr': 0.00035588824811896944, 'samples': 9293824, 'steps': 18151, 'batch_loss/train': 0.8036462047602981}
12/25/2021 08:49:01 - INFO - codeparrot_training - Step 18152: {'lr': 0.0003558739685371505, 'samples': 9294336, 'steps': 18152, 'batch_loss/train': 0.8471349277533591}
12/25/2021 08:49:11 - INFO - codeparrot_training - Step 18153: {'lr': 0.00035585968853441664, 'samples': 9294848, 'steps': 18153, 'batch_loss/train': 0.7213865857920609}
12/25/2021 08:49:22 - INFO - codeparrot_training - Step 18154: {'lr': 0.00035584540811082457, 'samples': 9295360, 'steps': 18154, 'batch_loss/train': 0.8430404746904969}
12/25/2021 08:49:34 - INFO - codeparrot_training - Step 18155: {'lr': 0.0003558311272664312, 'samples': 9295872, 'steps': 18155, 'batch_loss/train': 0.841616740450263}
12/25/2021 08:49:45 - INFO - codeparrot_training - Step 18156: {'lr': 0.00035581684600129317, 'samples': 9296384, 'steps': 18156, 'batch_loss/train': 0.7761376677080989}
12/25/2021 08:49:55 - INFO - codeparrot_training - Step 18157: {'lr': 0.0003558025643154673, 'samples': 9296896, 'steps': 18157, 'batch_loss/train': 0.7785442974418402}
12/25/2021 08:50:08 - INFO - codeparrot_training - Step 18158: {'lr': 0.00035578828220901045, 'samples': 9297408, 'steps': 18158, 'batch_loss/train': 0.8582198964431882}
12/25/2021 08:50:18 - INFO - codeparrot_training - Step 18159: {'lr': 0.0003557739996819793, 'samples': 9297920, 'steps': 18159, 'batch_loss/train': 0.7366494811140001}
12/25/2021 08:50:29 - INFO - codeparrot_training - Step 18160: {'lr': 0.0003557597167344307, 'samples': 9298432, 'steps': 18160, 'batch_loss/train': 0.8719141464680433}
12/25/2021 08:50:42 - INFO - codeparrot_training - Step 18161: {'lr': 0.0003557454333664213, 'samples': 9298944, 'steps': 18161, 'batch_loss/train': 0.8827848427463323}
12/25/2021 08:50:53 - INFO - codeparrot_training - Step 18162: {'lr': 0.0003557311495780081, 'samples': 9299456, 'steps': 18162, 'batch_loss/train': 0.8224620744585991}
12/25/2021 08:51:04 - INFO - codeparrot_training - Step 18163: {'lr': 0.00035571686536924773, 'samples': 9299968, 'steps': 18163, 'batch_loss/train': 0.8184007406234741}
12/25/2021 08:51:14 - INFO - codeparrot_training - Step 18164: {'lr': 0.00035570258074019704, 'samples': 9300480, 'steps': 18164, 'batch_loss/train': 0.7829972207546234}
12/25/2021 08:51:26 - INFO - codeparrot_training - Step 18165: {'lr': 0.0003556882956909128, 'samples': 9300992, 'steps': 18165, 'batch_loss/train': 0.8500883188098669}
12/25/2021 08:51:37 - INFO - codeparrot_training - Step 18166: {'lr': 0.0003556740102214518, 'samples': 9301504, 'steps': 18166, 'batch_loss/train': 0.7374429302290082}
12/25/2021 08:51:48 - INFO - codeparrot_training - Step 18167: {'lr': 0.0003556597243318708, 'samples': 9302016, 'steps': 18167, 'batch_loss/train': 0.982724032131955}
12/25/2021 08:52:02 - INFO - codeparrot_training - Step 18168: {'lr': 0.0003556454380222266, 'samples': 9302528, 'steps': 18168, 'batch_loss/train': 0.7031833194196224}
12/25/2021 08:52:12 - INFO - codeparrot_training - Step 18169: {'lr': 0.0003556311512925761, 'samples': 9303040, 'steps': 18169, 'batch_loss/train': 0.8328291783109307}
12/25/2021 08:52:23 - INFO - codeparrot_training - Step 18170: {'lr': 0.000355616864142976, 'samples': 9303552, 'steps': 18170, 'batch_loss/train': 0.8524392764084041}
12/25/2021 08:52:35 - INFO - codeparrot_training - Step 18171: {'lr': 0.0003556025765734832, 'samples': 9304064, 'steps': 18171, 'batch_loss/train': 0.7823582208366133}
12/25/2021 08:52:46 - INFO - codeparrot_training - Step 18172: {'lr': 0.00035558828858415424, 'samples': 9304576, 'steps': 18172, 'batch_loss/train': 0.6745367529802024}
12/25/2021 08:52:56 - INFO - codeparrot_training - Step 18173: {'lr': 0.00035557400017504633, 'samples': 9305088, 'steps': 18173, 'batch_loss/train': 0.8637550286948681}
12/25/2021 08:53:07 - INFO - codeparrot_training - Step 18174: {'lr': 0.0003555597113462159, 'samples': 9305600, 'steps': 18174, 'batch_loss/train': 0.8511621123179793}
12/25/2021 08:53:19 - INFO - codeparrot_training - Step 18175: {'lr': 0.00035554542209772003, 'samples': 9306112, 'steps': 18175, 'batch_loss/train': 0.7401229583192617}
12/25/2021 08:53:30 - INFO - codeparrot_training - Step 18176: {'lr': 0.00035553113242961536, 'samples': 9306624, 'steps': 18176, 'batch_loss/train': 0.7972088307142258}
12/25/2021 08:53:40 - INFO - codeparrot_training - Step 18177: {'lr': 0.0003555168423419587, 'samples': 9307136, 'steps': 18177, 'batch_loss/train': 0.8280125178862363}
12/25/2021 08:53:54 - INFO - codeparrot_training - Step 18178: {'lr': 0.000355502551834807, 'samples': 9307648, 'steps': 18178, 'batch_loss/train': 0.6773721827194095}
12/25/2021 08:54:05 - INFO - codeparrot_training - Step 18179: {'lr': 0.000355488260908217, 'samples': 9308160, 'steps': 18179, 'batch_loss/train': 0.7591168796643615}
12/25/2021 08:54:15 - INFO - codeparrot_training - Step 18180: {'lr': 0.0003554739695622454, 'samples': 9308672, 'steps': 18180, 'batch_loss/train': 1.1027994928881526}
12/25/2021 08:54:27 - INFO - codeparrot_training - Step 18181: {'lr': 0.00035545967779694924, 'samples': 9309184, 'steps': 18181, 'batch_loss/train': 0.8884366443380713}
12/25/2021 08:54:38 - INFO - codeparrot_training - Step 18182: {'lr': 0.0003554453856123852, 'samples': 9309696, 'steps': 18182, 'batch_loss/train': 0.839467016980052}
12/25/2021 08:54:49 - INFO - codeparrot_training - Step 18183: {'lr': 0.0003554310930086101, 'samples': 9310208, 'steps': 18183, 'batch_loss/train': 0.7194480211474001}
12/25/2021 08:54:59 - INFO - codeparrot_training - Step 18184: {'lr': 0.0003554167999856808, 'samples': 9310720, 'steps': 18184, 'batch_loss/train': 0.6648047724738717}
12/25/2021 08:55:11 - INFO - codeparrot_training - Step 18185: {'lr': 0.000355402506543654, 'samples': 9311232, 'steps': 18185, 'batch_loss/train': 0.7051963191479445}
12/25/2021 08:55:22 - INFO - codeparrot_training - Step 18186: {'lr': 0.00035538821268258676, 'samples': 9311744, 'steps': 18186, 'batch_loss/train': 0.6639888865174726}
12/25/2021 08:55:32 - INFO - codeparrot_training - Step 18187: {'lr': 0.00035537391840253574, 'samples': 9312256, 'steps': 18187, 'batch_loss/train': 0.8040711299981922}
12/25/2021 08:55:44 - INFO - codeparrot_training - Step 18188: {'lr': 0.00035535962370355785, 'samples': 9312768, 'steps': 18188, 'batch_loss/train': 0.7494156323373318}
12/25/2021 08:55:55 - INFO - codeparrot_training - Step 18189: {'lr': 0.0003553453285857098, 'samples': 9313280, 'steps': 18189, 'batch_loss/train': 0.728972080629319}
12/25/2021 08:56:06 - INFO - codeparrot_training - Step 18190: {'lr': 0.0003553310330490486, 'samples': 9313792, 'steps': 18190, 'batch_loss/train': 0.7666038302704692}
12/25/2021 08:56:19 - INFO - codeparrot_training - Step 18191: {'lr': 0.00035531673709363087, 'samples': 9314304, 'steps': 18191, 'batch_loss/train': 0.7780453593004495}
12/25/2021 08:56:30 - INFO - codeparrot_training - Step 18192: {'lr': 0.00035530244071951367, 'samples': 9314816, 'steps': 18192, 'batch_loss/train': 0.7532292529940605}
12/25/2021 08:56:41 - INFO - codeparrot_training - Step 18193: {'lr': 0.0003552881439267536, 'samples': 9315328, 'steps': 18193, 'batch_loss/train': 0.7440367676317692}
12/25/2021 08:56:53 - INFO - codeparrot_training - Step 18194: {'lr': 0.0003552738467154076, 'samples': 9315840, 'steps': 18194, 'batch_loss/train': 0.8186815739609301}
12/25/2021 08:57:04 - INFO - codeparrot_training - Step 18195: {'lr': 0.0003552595490855327, 'samples': 9316352, 'steps': 18195, 'batch_loss/train': 1.0369075844064355}
12/25/2021 08:57:14 - INFO - codeparrot_training - Step 18196: {'lr': 0.0003552452510371855, 'samples': 9316864, 'steps': 18196, 'batch_loss/train': 0.7879798132926226}
12/25/2021 08:57:25 - INFO - codeparrot_training - Step 18197: {'lr': 0.00035523095257042293, 'samples': 9317376, 'steps': 18197, 'batch_loss/train': 0.8725592354312539}
12/25/2021 08:57:39 - INFO - codeparrot_training - Step 18198: {'lr': 0.0003552166536853018, 'samples': 9317888, 'steps': 18198, 'batch_loss/train': 0.7514808450359851}
12/25/2021 08:57:49 - INFO - codeparrot_training - Step 18199: {'lr': 0.00035520235438187903, 'samples': 9318400, 'steps': 18199, 'batch_loss/train': 0.7238637034315616}
12/25/2021 08:58:00 - INFO - codeparrot_training - Step 18200: {'lr': 0.00035518805466021135, 'samples': 9318912, 'steps': 18200, 'batch_loss/train': 0.8457416642922908}
12/25/2021 08:58:12 - INFO - codeparrot_training - Step 18201: {'lr': 0.00035517375452035583, 'samples': 9319424, 'steps': 18201, 'batch_loss/train': 0.7515152748674154}
12/25/2021 08:58:22 - INFO - codeparrot_training - Step 18202: {'lr': 0.000355159453962369, 'samples': 9319936, 'steps': 18202, 'batch_loss/train': 0.7860074685886502}
12/25/2021 08:58:33 - INFO - codeparrot_training - Step 18203: {'lr': 0.000355145152986308, 'samples': 9320448, 'steps': 18203, 'batch_loss/train': 0.7717059520073235}
12/25/2021 08:58:45 - INFO - codeparrot_training - Step 18204: {'lr': 0.0003551308515922295, 'samples': 9320960, 'steps': 18204, 'batch_loss/train': 0.8655316652730107}
12/25/2021 08:58:56 - INFO - codeparrot_training - Step 18205: {'lr': 0.0003551165497801905, 'samples': 9321472, 'steps': 18205, 'batch_loss/train': 0.8160864124074578}
12/25/2021 08:59:06 - INFO - codeparrot_training - Step 18206: {'lr': 0.0003551022475502477, 'samples': 9321984, 'steps': 18206, 'batch_loss/train': 0.8484799051657319}
12/25/2021 08:59:17 - INFO - codeparrot_training - Step 18207: {'lr': 0.0003550879449024581, 'samples': 9322496, 'steps': 18207, 'batch_loss/train': 0.7510592038743198}
12/25/2021 08:59:31 - INFO - codeparrot_training - Step 18208: {'lr': 0.0003550736418368786, 'samples': 9323008, 'steps': 18208, 'batch_loss/train': 0.7138039660640061}
12/25/2021 08:59:41 - INFO - codeparrot_training - Step 18209: {'lr': 0.0003550593383535658, 'samples': 9323520, 'steps': 18209, 'batch_loss/train': 0.7504810496466234}
12/25/2021 08:59:52 - INFO - codeparrot_training - Step 18210: {'lr': 0.00035504503445257687, 'samples': 9324032, 'steps': 18210, 'batch_loss/train': 0.8907199408859015}
12/25/2021 09:00:04 - INFO - codeparrot_training - Step 18211: {'lr': 0.00035503073013396847, 'samples': 9324544, 'steps': 18211, 'batch_loss/train': 0.7337094470858574}
12/25/2021 09:00:15 - INFO - codeparrot_training - Step 18212: {'lr': 0.00035501642539779754, 'samples': 9325056, 'steps': 18212, 'batch_loss/train': 0.7464310918003321}
12/25/2021 09:00:25 - INFO - codeparrot_training - Step 18213: {'lr': 0.000355002120244121, 'samples': 9325568, 'steps': 18213, 'batch_loss/train': 0.8410328640602529}
12/25/2021 09:00:37 - INFO - codeparrot_training - Step 18214: {'lr': 0.00035498781467299575, 'samples': 9326080, 'steps': 18214, 'batch_loss/train': 0.7988122152164578}
12/25/2021 09:00:48 - INFO - codeparrot_training - Step 18215: {'lr': 0.00035497350868447846, 'samples': 9326592, 'steps': 18215, 'batch_loss/train': 0.8272709306329489}
12/25/2021 09:00:58 - INFO - codeparrot_training - Step 18216: {'lr': 0.0003549592022786262, 'samples': 9327104, 'steps': 18216, 'batch_loss/train': 0.8837997624650598}
12/25/2021 09:01:11 - INFO - codeparrot_training - Step 18217: {'lr': 0.00035494489545549577, 'samples': 9327616, 'steps': 18217, 'batch_loss/train': 0.7778209209209308}
12/25/2021 09:01:21 - INFO - codeparrot_training - Step 18218: {'lr': 0.000354930588215144, 'samples': 9328128, 'steps': 18218, 'batch_loss/train': 0.6828866014257073}
12/25/2021 09:01:32 - INFO - codeparrot_training - Step 18219: {'lr': 0.0003549162805576279, 'samples': 9328640, 'steps': 18219, 'batch_loss/train': 0.681366142584011}
12/25/2021 09:01:42 - INFO - codeparrot_training - Step 18220: {'lr': 0.00035490197248300425, 'samples': 9329152, 'steps': 18220, 'batch_loss/train': 0.9162348245154135}
12/25/2021 09:01:56 - INFO - codeparrot_training - Step 18221: {'lr': 0.0003548876639913301, 'samples': 9329664, 'steps': 18221, 'batch_loss/train': 0.7215407646726817}
12/25/2021 09:02:07 - INFO - codeparrot_training - Step 18222: {'lr': 0.000354873355082662, 'samples': 9330176, 'steps': 18222, 'batch_loss/train': 0.7352759891655296}
12/25/2021 09:02:17 - INFO - codeparrot_training - Step 18223: {'lr': 0.00035485904575705717, 'samples': 9330688, 'steps': 18223, 'batch_loss/train': 0.7610133150592446}
12/25/2021 09:02:29 - INFO - codeparrot_training - Step 18224: {'lr': 0.00035484473601457236, 'samples': 9331200, 'steps': 18224, 'batch_loss/train': 0.797614177223295}
12/25/2021 09:02:40 - INFO - codeparrot_training - Step 18225: {'lr': 0.00035483042585526445, 'samples': 9331712, 'steps': 18225, 'batch_loss/train': 0.998081830330193}
12/25/2021 09:02:51 - INFO - codeparrot_training - Step 18226: {'lr': 0.00035481611527919026, 'samples': 9332224, 'steps': 18226, 'batch_loss/train': 0.8188760271295905}
12/25/2021 09:03:05 - INFO - codeparrot_training - Step 18227: {'lr': 0.0003548018042864069, 'samples': 9332736, 'steps': 18227, 'batch_loss/train': 0.7961616022512317}
12/25/2021 09:03:15 - INFO - codeparrot_training - Step 18228: {'lr': 0.0003547874928769711, 'samples': 9333248, 'steps': 18228, 'batch_loss/train': 0.6623072340153158}
12/25/2021 09:03:26 - INFO - codeparrot_training - Step 18229: {'lr': 0.00035477318105093977, 'samples': 9333760, 'steps': 18229, 'batch_loss/train': 0.8249494570773095}
12/25/2021 09:03:36 - INFO - codeparrot_training - Step 18230: {'lr': 0.0003547588688083698, 'samples': 9334272, 'steps': 18230, 'batch_loss/train': 0.8093482414260507}
12/25/2021 09:03:49 - INFO - codeparrot_training - Step 18231: {'lr': 0.0003547445561493182, 'samples': 9334784, 'steps': 18231, 'batch_loss/train': 0.7520762784406543}
12/25/2021 09:03:59 - INFO - codeparrot_training - Step 18232: {'lr': 0.0003547302430738417, 'samples': 9335296, 'steps': 18232, 'batch_loss/train': 0.8520737811923027}
12/25/2021 09:04:10 - INFO - codeparrot_training - Step 18233: {'lr': 0.0003547159295819974, 'samples': 9335808, 'steps': 18233, 'batch_loss/train': 0.719253889285028}
12/25/2021 09:04:22 - INFO - codeparrot_training - Step 18234: {'lr': 0.0003547016156738421, 'samples': 9336320, 'steps': 18234, 'batch_loss/train': 0.7951536709442735}
12/25/2021 09:04:32 - INFO - codeparrot_training - Step 18235: {'lr': 0.0003546873013494326, 'samples': 9336832, 'steps': 18235, 'batch_loss/train': 0.8564813584089279}
12/25/2021 09:04:43 - INFO - codeparrot_training - Step 18236: {'lr': 0.0003546729866088261, 'samples': 9337344, 'steps': 18236, 'batch_loss/train': 0.7967358399182558}
12/25/2021 09:04:57 - INFO - codeparrot_training - Step 18237: {'lr': 0.0003546586714520792, 'samples': 9337856, 'steps': 18237, 'batch_loss/train': 0.8339882772415876}
12/25/2021 09:05:08 - INFO - codeparrot_training - Step 18238: {'lr': 0.0003546443558792489, 'samples': 9338368, 'steps': 18238, 'batch_loss/train': 0.681442828848958}
12/25/2021 09:05:18 - INFO - codeparrot_training - Step 18239: {'lr': 0.00035463003989039224, 'samples': 9338880, 'steps': 18239, 'batch_loss/train': 0.6254093071911484}
12/25/2021 09:05:30 - INFO - codeparrot_training - Step 18240: {'lr': 0.000354615723485566, 'samples': 9339392, 'steps': 18240, 'batch_loss/train': 0.7657980592921376}
12/25/2021 09:05:41 - INFO - codeparrot_training - Step 18241: {'lr': 0.0003546014066648272, 'samples': 9339904, 'steps': 18241, 'batch_loss/train': 0.805814060382545}
12/25/2021 09:05:52 - INFO - codeparrot_training - Step 18242: {'lr': 0.00035458708942823257, 'samples': 9340416, 'steps': 18242, 'batch_loss/train': 0.8251910926774144}
12/25/2021 09:06:02 - INFO - codeparrot_training - Step 18243: {'lr': 0.00035457277177583936, 'samples': 9340928, 'steps': 18243, 'batch_loss/train': 0.8015730828046799}
12/25/2021 09:06:16 - INFO - codeparrot_training - Step 18244: {'lr': 0.00035455845370770415, 'samples': 9341440, 'steps': 18244, 'batch_loss/train': 0.7503705499693751}
12/25/2021 09:06:27 - INFO - codeparrot_training - Step 18245: {'lr': 0.00035454413522388405, 'samples': 9341952, 'steps': 18245, 'batch_loss/train': 0.8589896447956562}
12/25/2021 09:06:37 - INFO - codeparrot_training - Step 18246: {'lr': 0.00035452981632443597, 'samples': 9342464, 'steps': 18246, 'batch_loss/train': 0.7428545278962702}
12/25/2021 09:06:49 - INFO - codeparrot_training - Step 18247: {'lr': 0.0003545154970094168, 'samples': 9342976, 'steps': 18247, 'batch_loss/train': 0.8344084564596415}
12/25/2021 09:07:00 - INFO - codeparrot_training - Step 18248: {'lr': 0.0003545011772788834, 'samples': 9343488, 'steps': 18248, 'batch_loss/train': 0.7808633437380195}
12/25/2021 09:07:10 - INFO - codeparrot_training - Step 18249: {'lr': 0.0003544868571328929, 'samples': 9344000, 'steps': 18249, 'batch_loss/train': 0.8018792939838022}
12/25/2021 09:07:21 - INFO - codeparrot_training - Step 18250: {'lr': 0.000354472536571502, 'samples': 9344512, 'steps': 18250, 'batch_loss/train': 0.682825290132314}
12/25/2021 09:07:33 - INFO - codeparrot_training - Step 18251: {'lr': 0.00035445821559476783, 'samples': 9345024, 'steps': 18251, 'batch_loss/train': 0.8040415104478598}
12/25/2021 09:07:44 - INFO - codeparrot_training - Step 18252: {'lr': 0.0003544438942027472, 'samples': 9345536, 'steps': 18252, 'batch_loss/train': 0.7764335656538606}
12/25/2021 09:07:54 - INFO - codeparrot_training - Step 18253: {'lr': 0.0003544295723954971, 'samples': 9346048, 'steps': 18253, 'batch_loss/train': 0.7690345710143447}
12/25/2021 09:08:07 - INFO - codeparrot_training - Step 18254: {'lr': 0.00035441525017307444, 'samples': 9346560, 'steps': 18254, 'batch_loss/train': 0.7543277041986585}
12/25/2021 09:08:17 - INFO - codeparrot_training - Step 18255: {'lr': 0.0003544009275355361, 'samples': 9347072, 'steps': 18255, 'batch_loss/train': 0.748220129404217}
12/25/2021 09:08:28 - INFO - codeparrot_training - Step 18256: {'lr': 0.00035438660448293917, 'samples': 9347584, 'steps': 18256, 'batch_loss/train': 0.821914991363883}
12/25/2021 09:08:42 - INFO - codeparrot_training - Step 18257: {'lr': 0.00035437228101534054, 'samples': 9348096, 'steps': 18257, 'batch_loss/train': 0.7893476346507668}
12/25/2021 09:08:52 - INFO - codeparrot_training - Step 18258: {'lr': 0.00035435795713279703, 'samples': 9348608, 'steps': 18258, 'batch_loss/train': 0.61796925769886}
12/25/2021 09:09:03 - INFO - codeparrot_training - Step 18259: {'lr': 0.0003543436328353658, 'samples': 9349120, 'steps': 18259, 'batch_loss/train': 0.7780861525097862}
12/25/2021 09:09:16 - INFO - codeparrot_training - Step 18260: {'lr': 0.00035432930812310365, 'samples': 9349632, 'steps': 18260, 'batch_loss/train': 0.745546504855156}
12/25/2021 09:09:26 - INFO - codeparrot_training - Step 18261: {'lr': 0.0003543149829960675, 'samples': 9350144, 'steps': 18261, 'batch_loss/train': 0.8328284695744514}
12/25/2021 09:09:37 - INFO - codeparrot_training - Step 18262: {'lr': 0.00035430065745431445, 'samples': 9350656, 'steps': 18262, 'batch_loss/train': 0.8422632268629968}
12/25/2021 09:09:48 - INFO - codeparrot_training - Step 18263: {'lr': 0.00035428633149790133, 'samples': 9351168, 'steps': 18263, 'batch_loss/train': 0.7139165208209306}
12/25/2021 09:10:00 - INFO - codeparrot_training - Step 18264: {'lr': 0.0003542720051268851, 'samples': 9351680, 'steps': 18264, 'batch_loss/train': 0.7482269858010113}
12/25/2021 09:10:10 - INFO - codeparrot_training - Step 18265: {'lr': 0.00035425767834132275, 'samples': 9352192, 'steps': 18265, 'batch_loss/train': 0.7577455551363528}
12/25/2021 09:10:21 - INFO - codeparrot_training - Step 18266: {'lr': 0.00035424335114127135, 'samples': 9352704, 'steps': 18266, 'batch_loss/train': 0.7431050278246403}
12/25/2021 09:10:35 - INFO - codeparrot_training - Step 18267: {'lr': 0.0003542290235267876, 'samples': 9353216, 'steps': 18267, 'batch_loss/train': 0.7935047978535295}
12/25/2021 09:10:45 - INFO - codeparrot_training - Step 18268: {'lr': 0.00035421469549792867, 'samples': 9353728, 'steps': 18268, 'batch_loss/train': 0.7845139745622873}
12/25/2021 09:10:56 - INFO - codeparrot_training - Step 18269: {'lr': 0.0003542003670547515, 'samples': 9354240, 'steps': 18269, 'batch_loss/train': 0.9026742232963443}
12/25/2021 09:11:08 - INFO - codeparrot_training - Step 18270: {'lr': 0.000354186038197313, 'samples': 9354752, 'steps': 18270, 'batch_loss/train': 0.8354894514195621}
12/25/2021 09:11:19 - INFO - codeparrot_training - Step 18271: {'lr': 0.0003541717089256701, 'samples': 9355264, 'steps': 18271, 'batch_loss/train': 0.804980113171041}
12/25/2021 09:11:29 - INFO - codeparrot_training - Step 18272: {'lr': 0.00035415737923987986, 'samples': 9355776, 'steps': 18272, 'batch_loss/train': 0.7814507437869906}
12/25/2021 09:11:40 - INFO - codeparrot_training - Step 18273: {'lr': 0.0003541430491399992, 'samples': 9356288, 'steps': 18273, 'batch_loss/train': 0.6723559305537492}
12/25/2021 09:11:52 - INFO - codeparrot_training - Step 18274: {'lr': 0.00035412871862608507, 'samples': 9356800, 'steps': 18274, 'batch_loss/train': 0.7063667383044958}
12/25/2021 09:12:03 - INFO - codeparrot_training - Step 18275: {'lr': 0.00035411438769819456, 'samples': 9357312, 'steps': 18275, 'batch_loss/train': 0.9605860636220314}
12/25/2021 09:12:13 - INFO - codeparrot_training - Step 18276: {'lr': 0.0003541000563563845, 'samples': 9357824, 'steps': 18276, 'batch_loss/train': 0.8696876643225551}
12/25/2021 09:12:27 - INFO - codeparrot_training - Step 18277: {'lr': 0.00035408572460071187, 'samples': 9358336, 'steps': 18277, 'batch_loss/train': 0.7113186568021774}
12/25/2021 09:12:38 - INFO - codeparrot_training - Step 18278: {'lr': 0.00035407139243123374, 'samples': 9358848, 'steps': 18278, 'batch_loss/train': 0.8831107984296978}
12/25/2021 09:12:48 - INFO - codeparrot_training - Step 18279: {'lr': 0.00035405705984800705, 'samples': 9359360, 'steps': 18279, 'batch_loss/train': 0.8636399731040001}
12/25/2021 09:13:00 - INFO - codeparrot_training - Step 18280: {'lr': 0.0003540427268510888, 'samples': 9359872, 'steps': 18280, 'batch_loss/train': 0.7732694465667009}
12/25/2021 09:13:11 - INFO - codeparrot_training - Step 18281: {'lr': 0.00035402839344053595, 'samples': 9360384, 'steps': 18281, 'batch_loss/train': 0.777085043489933}
12/25/2021 09:13:22 - INFO - codeparrot_training - Step 18282: {'lr': 0.00035401405961640555, 'samples': 9360896, 'steps': 18282, 'batch_loss/train': 0.705643164459616}
12/25/2021 09:13:32 - INFO - codeparrot_training - Step 18283: {'lr': 0.00035399972537875443, 'samples': 9361408, 'steps': 18283, 'batch_loss/train': 1.0015364452265203}
12/25/2021 09:13:46 - INFO - codeparrot_training - Step 18284: {'lr': 0.0003539853907276396, 'samples': 9361920, 'steps': 18284, 'batch_loss/train': 0.7692234236747026}
12/25/2021 09:13:57 - INFO - codeparrot_training - Step 18285: {'lr': 0.0003539710556631183, 'samples': 9362432, 'steps': 18285, 'batch_loss/train': 0.7875398006290197}
12/25/2021 09:14:07 - INFO - codeparrot_training - Step 18286: {'lr': 0.00035395672018524726, 'samples': 9362944, 'steps': 18286, 'batch_loss/train': 0.8903102586045861}
12/25/2021 09:14:19 - INFO - codeparrot_training - Step 18287: {'lr': 0.0003539423842940835, 'samples': 9363456, 'steps': 18287, 'batch_loss/train': 0.8243071222677827}
12/25/2021 09:14:30 -
INFO - codeparrot_training - Step 18288: {'lr': 0.00035392804798968416, 'samples': 9363968, 'steps': 18288, 'batch_loss/train': 0.7923187874257565} 12/25/2021 09:14:40 - INFO - codeparrot_training - Step 18289: {'lr': 0.00035391371127210607, 'samples': 9364480, 'steps': 18289, 'batch_loss/train': 0.7887764228507876} 12/25/2021 09:14:53 - INFO - codeparrot_training - Step 18290: {'lr': 0.00035389937414140635, 'samples': 9364992, 'steps': 18290, 'batch_loss/train': 0.7812305456027389} 12/25/2021 09:15:03 - INFO - codeparrot_training - Step 18291: {'lr': 0.00035388503659764187, 'samples': 9365504, 'steps': 18291, 'batch_loss/train': 0.7946096230298281} 12/25/2021 09:15:14 - INFO - codeparrot_training - Step 18292: {'lr': 0.0003538706986408698, 'samples': 9366016, 'steps': 18292, 'batch_loss/train': 0.7761995550245047} 12/25/2021 09:15:24 - INFO - codeparrot_training - Step 18293: {'lr': 0.000353856360271147, 'samples': 9366528, 'steps': 18293, 'batch_loss/train': 0.6408047825680114} 12/25/2021 09:15:37 - INFO - codeparrot_training - Step 18294: {'lr': 0.0003538420214885305, 'samples': 9367040, 'steps': 18294, 'batch_loss/train': 0.8614823436364532} 12/25/2021 09:15:47 - INFO - codeparrot_training - Step 18295: {'lr': 0.0003538276822930774, 'samples': 9367552, 'steps': 18295, 'batch_loss/train': 0.6207938101142645} 12/25/2021 09:15:58 - INFO - codeparrot_training - Step 18296: {'lr': 0.00035381334268484466, 'samples': 9368064, 'steps': 18296, 'batch_loss/train': 0.6700556157156825} 12/25/2021 09:16:11 - INFO - codeparrot_training - Step 18297: {'lr': 0.0003537990026638892, 'samples': 9368576, 'steps': 18297, 'batch_loss/train': 0.6179670555284247} 12/25/2021 09:16:22 - INFO - codeparrot_training - Step 18298: {'lr': 0.0003537846622302681, 'samples': 9369088, 'steps': 18298, 'batch_loss/train': 0.7945935372263193} 12/25/2021 09:16:33 - INFO - codeparrot_training - Step 18299: {'lr': 0.0003537703213840384, 'samples': 9369600, 'steps': 18299, 'batch_loss/train': 
0.7843016711995006} 12/25/2021 09:16:45 - INFO - codeparrot_training - Step 18300: {'lr': 0.0003537559801252571, 'samples': 9370112, 'steps': 18300, 'batch_loss/train': 0.8158385562710464} 12/25/2021 09:16:56 - INFO - codeparrot_training - Step 18301: {'lr': 0.0003537416384539812, 'samples': 9370624, 'steps': 18301, 'batch_loss/train': 0.7257327884435654} 12/25/2021 09:17:06 - INFO - codeparrot_training - Step 18302: {'lr': 0.0003537272963702677, 'samples': 9371136, 'steps': 18302, 'batch_loss/train': 0.9158914415165782} 12/25/2021 09:17:17 - INFO - codeparrot_training - Step 18303: {'lr': 0.00035371295387417355, 'samples': 9371648, 'steps': 18303, 'batch_loss/train': 0.8321188790723681} 12/25/2021 09:17:29 - INFO - codeparrot_training - Step 18304: {'lr': 0.00035369861096575596, 'samples': 9372160, 'steps': 18304, 'batch_loss/train': 0.7409073007293046} 12/25/2021 09:17:40 - INFO - codeparrot_training - Step 18305: {'lr': 0.00035368426764507185, 'samples': 9372672, 'steps': 18305, 'batch_loss/train': 0.7263753646984696} 12/25/2021 09:17:50 - INFO - codeparrot_training - Step 18306: {'lr': 0.0003536699239121782, 'samples': 9373184, 'steps': 18306, 'batch_loss/train': 0.7017416239832528} 12/25/2021 09:18:04 - INFO - codeparrot_training - Step 18307: {'lr': 0.00035365557976713206, 'samples': 9373696, 'steps': 18307, 'batch_loss/train': 0.7430018947925419} 12/25/2021 09:18:15 - INFO - codeparrot_training - Step 18308: {'lr': 0.00035364123520999056, 'samples': 9374208, 'steps': 18308, 'batch_loss/train': 0.8671739641577005} 12/25/2021 09:18:25 - INFO - codeparrot_training - Step 18309: {'lr': 0.0003536268902408106, 'samples': 9374720, 'steps': 18309, 'batch_loss/train': 0.844694409519434} 12/25/2021 09:18:37 - INFO - codeparrot_training - Step 18310: {'lr': 0.0003536125448596492, 'samples': 9375232, 'steps': 18310, 'batch_loss/train': 0.7963730273768306} 12/25/2021 09:18:48 - INFO - codeparrot_training - Step 18311: {'lr': 0.00035359819906656347, 'samples': 9375744, 
'steps': 18311, 'batch_loss/train': 0.7272380404174328} 12/25/2021 09:18:58 - INFO - codeparrot_training - Step 18312: {'lr': 0.0003535838528616104, 'samples': 9376256, 'steps': 18312, 'batch_loss/train': 0.7526754450518638} 12/25/2021 09:19:12 - INFO - codeparrot_training - Step 18313: {'lr': 0.0003535695062448471, 'samples': 9376768, 'steps': 18313, 'batch_loss/train': 0.7949621734442189} 12/25/2021 09:19:23 - INFO - codeparrot_training - Step 18314: {'lr': 0.0003535551592163306, 'samples': 9377280, 'steps': 18314, 'batch_loss/train': 0.8229397516697645} 12/25/2021 09:19:34 - INFO - codeparrot_training - Step 18315: {'lr': 0.00035354081177611774, 'samples': 9377792, 'steps': 18315, 'batch_loss/train': 0.7252557491883636} 12/25/2021 09:19:44 - INFO - codeparrot_training - Step 18316: {'lr': 0.00035352646392426576, 'samples': 9378304, 'steps': 18316, 'batch_loss/train': 0.7060390496626496} 12/25/2021 09:19:56 - INFO - codeparrot_training - Step 18317: {'lr': 0.0003535121156608317, 'samples': 9378816, 'steps': 18317, 'batch_loss/train': 0.696082498121541} 12/25/2021 09:20:07 - INFO - codeparrot_training - Step 18318: {'lr': 0.0003534977669858725, 'samples': 9379328, 'steps': 18318, 'batch_loss/train': 0.6980329593643546} 12/25/2021 09:20:18 - INFO - codeparrot_training - Step 18319: {'lr': 0.0003534834178994453, 'samples': 9379840, 'steps': 18319, 'batch_loss/train': 0.8172870120033622} 12/25/2021 09:20:30 - INFO - codeparrot_training - Step 18320: {'lr': 0.0003534690684016071, 'samples': 9380352, 'steps': 18320, 'batch_loss/train': 0.7752114036120474} 12/25/2021 09:20:40 - INFO - codeparrot_training - Step 18321: {'lr': 0.00035345471849241493, 'samples': 9380864, 'steps': 18321, 'batch_loss/train': 0.7745637758634984} 12/25/2021 09:20:51 - INFO - codeparrot_training - Step 18322: {'lr': 0.00035344036817192584, 'samples': 9381376, 'steps': 18322, 'batch_loss/train': 0.768857812974602} 12/25/2021 09:21:05 - INFO - codeparrot_training - Step 18323: {'lr': 
0.00035342601744019697, 'samples': 9381888, 'steps': 18323, 'batch_loss/train': 0.7469159150496125} 12/25/2021 09:21:15 - INFO - codeparrot_training - Step 18324: {'lr': 0.0003534116662972853, 'samples': 9382400, 'steps': 18324, 'batch_loss/train': 0.6484354408457875} 12/25/2021 09:21:26 - INFO - codeparrot_training - Step 18325: {'lr': 0.0003533973147432478, 'samples': 9382912, 'steps': 18325, 'batch_loss/train': 0.7444615047425032} 12/25/2021 09:21:36 - INFO - codeparrot_training - Step 18326: {'lr': 0.0003533829627781417, 'samples': 9383424, 'steps': 18326, 'batch_loss/train': 0.7807052573189139} 12/25/2021 09:21:48 - INFO - codeparrot_training - Step 18327: {'lr': 0.00035336861040202397, 'samples': 9383936, 'steps': 18327, 'batch_loss/train': 0.8218171345070004} 12/25/2021 09:21:59 - INFO - codeparrot_training - Step 18328: {'lr': 0.0003533542576149517, 'samples': 9384448, 'steps': 18328, 'batch_loss/train': 0.7342365239746869} 12/25/2021 09:22:10 - INFO - codeparrot_training - Step 18329: {'lr': 0.0003533399044169818, 'samples': 9384960, 'steps': 18329, 'batch_loss/train': 0.7782477913424373} 12/25/2021 09:22:22 - INFO - codeparrot_training - Step 18330: {'lr': 0.00035332555080817157, 'samples': 9385472, 'steps': 18330, 'batch_loss/train': 0.6318672521156259} 12/25/2021 09:22:33 - INFO - codeparrot_training - Step 18331: {'lr': 0.0003533111967885779, 'samples': 9385984, 'steps': 18331, 'batch_loss/train': 0.5513495883205906} 12/25/2021 09:22:44 - INFO - codeparrot_training - Step 18332: {'lr': 0.000353296842358258, 'samples': 9386496, 'steps': 18332, 'batch_loss/train': 0.729237389517948} 12/25/2021 09:22:54 - INFO - codeparrot_training - Step 18333: {'lr': 0.00035328248751726875, 'samples': 9387008, 'steps': 18333, 'batch_loss/train': 0.5763221976812929} 12/25/2021 09:23:08 - INFO - codeparrot_training - Step 18334: {'lr': 0.0003532681322656675, 'samples': 9387520, 'steps': 18334, 'batch_loss/train': 0.6979349227622151} 12/25/2021 09:23:19 - INFO - 
codeparrot_training - Step 18335: {'lr': 0.000353253776603511, 'samples': 9388032, 'steps': 18335, 'batch_loss/train': 0.8118067756295204} 12/25/2021 09:23:29 - INFO - codeparrot_training - Step 18336: {'lr': 0.00035323942053085654, 'samples': 9388544, 'steps': 18336, 'batch_loss/train': 0.7836951222270727} 12/25/2021 09:23:42 - INFO - codeparrot_training - Step 18337: {'lr': 0.0003532250640477611, 'samples': 9389056, 'steps': 18337, 'batch_loss/train': 0.8181560505181551} 12/25/2021 09:23:52 - INFO - codeparrot_training - Step 18338: {'lr': 0.00035321070715428176, 'samples': 9389568, 'steps': 18338, 'batch_loss/train': 0.66237459378317} 12/25/2021 09:24:03 - INFO - codeparrot_training - Step 18339: {'lr': 0.00035319634985047564, 'samples': 9390080, 'steps': 18339, 'batch_loss/train': 0.794788976199925} 12/25/2021 09:24:13 - INFO - codeparrot_training - Step 18340: {'lr': 0.00035318199213639983, 'samples': 9390592, 'steps': 18340, 'batch_loss/train': 0.5129680405370891} 12/25/2021 09:24:26 - INFO - codeparrot_training - Step 18341: {'lr': 0.0003531676340121114, 'samples': 9391104, 'steps': 18341, 'batch_loss/train': 0.7732332702726126} 12/25/2021 09:24:36 - INFO - codeparrot_training - Step 18342: {'lr': 0.0003531532754776673, 'samples': 9391616, 'steps': 18342, 'batch_loss/train': 0.8067026287317276} 12/25/2021 09:24:47 - INFO - codeparrot_training - Step 18343: {'lr': 0.0003531389165331248, 'samples': 9392128, 'steps': 18343, 'batch_loss/train': 0.7272674362175167} 12/25/2021 09:25:01 - INFO - codeparrot_training - Step 18344: {'lr': 0.00035312455717854084, 'samples': 9392640, 'steps': 18344, 'batch_loss/train': 0.8683722009882331} 12/25/2021 09:25:11 - INFO - codeparrot_training - Step 18345: {'lr': 0.00035311019741397266, 'samples': 9393152, 'steps': 18345, 'batch_loss/train': 0.8254996738396585} 12/25/2021 09:25:22 - INFO - codeparrot_training - Step 18346: {'lr': 0.00035309583723947723, 'samples': 9393664, 'steps': 18346, 'batch_loss/train': 
0.8027626154944301} 12/25/2021 09:25:34 - INFO - codeparrot_training - Step 18347: {'lr': 0.00035308147665511175, 'samples': 9394176, 'steps': 18347, 'batch_loss/train': 0.8367168372496963} 12/25/2021 09:25:45 - INFO - codeparrot_training - Step 18348: {'lr': 0.00035306711566093313, 'samples': 9394688, 'steps': 18348, 'batch_loss/train': 0.7773719802498817} 12/25/2021 09:25:55 - INFO - codeparrot_training - Step 18349: {'lr': 0.0003530527542569987, 'samples': 9395200, 'steps': 18349, 'batch_loss/train': 0.7781973788514733} 12/25/2021 09:26:08 - INFO - codeparrot_training - Step 18350: {'lr': 0.0003530383924433653, 'samples': 9395712, 'steps': 18350, 'batch_loss/train': 0.7836681150365621} 12/25/2021 09:26:18 - INFO - codeparrot_training - Step 18351: {'lr': 0.0003530240302200902, 'samples': 9396224, 'steps': 18351, 'batch_loss/train': 0.8598368028178811} 12/25/2021 09:26:29 - INFO - codeparrot_training - Step 18352: {'lr': 0.0003530096675872305, 'samples': 9396736, 'steps': 18352, 'batch_loss/train': 0.7745438269339502} 12/25/2021 09:26:40 - INFO - codeparrot_training - Step 18353: {'lr': 0.0003529953045448432, 'samples': 9397248, 'steps': 18353, 'batch_loss/train': 0.6523140167118981} 12/25/2021 09:26:54 - INFO - codeparrot_training - Step 18354: {'lr': 0.0003529809410929856, 'samples': 9397760, 'steps': 18354, 'batch_loss/train': 0.6020921564195305} 12/25/2021 09:27:04 - INFO - codeparrot_training - Step 18355: {'lr': 0.00035296657723171443, 'samples': 9398272, 'steps': 18355, 'batch_loss/train': 0.98278624471277} 12/25/2021 09:27:15 - INFO - codeparrot_training - Step 18356: {'lr': 0.0003529522129610872, 'samples': 9398784, 'steps': 18356, 'batch_loss/train': 0.7787591367959976} 12/25/2021 09:27:28 - INFO - codeparrot_training - Step 18357: {'lr': 0.0003529378482811608, 'samples': 9399296, 'steps': 18357, 'batch_loss/train': 0.8295708457008004} 12/25/2021 09:27:38 - INFO - codeparrot_training - Step 18358: {'lr': 0.00035292348319199244, 'samples': 9399808, 
'steps': 18358, 'batch_loss/train': 0.8052599197253585} 12/25/2021 09:27:49 - INFO - codeparrot_training - Step 18359: {'lr': 0.00035290911769363906, 'samples': 9400320, 'steps': 18359, 'batch_loss/train': 0.7366670751944184} 12/25/2021 09:28:00 - INFO - codeparrot_training - Step 18360: {'lr': 0.00035289475178615793, 'samples': 9400832, 'steps': 18360, 'batch_loss/train': 0.7555169714614749} 12/25/2021 09:28:12 - INFO - codeparrot_training - Step 18361: {'lr': 0.00035288038546960613, 'samples': 9401344, 'steps': 18361, 'batch_loss/train': 0.8571860156953335} 12/25/2021 09:28:22 - INFO - codeparrot_training - Step 18362: {'lr': 0.0003528660187440408, 'samples': 9401856, 'steps': 18362, 'batch_loss/train': 0.7668040618300438} 12/25/2021 09:28:33 - INFO - codeparrot_training - Step 18363: {'lr': 0.00035285165160951895, 'samples': 9402368, 'steps': 18363, 'batch_loss/train': 0.8049483050126582} 12/25/2021 09:28:43 - INFO - codeparrot_training - Step 18364: {'lr': 0.0003528372840660978, 'samples': 9402880, 'steps': 18364, 'batch_loss/train': 0.7892106873914599} 12/25/2021 09:28:57 - INFO - codeparrot_training - Step 18365: {'lr': 0.00035282291611383446, 'samples': 9403392, 'steps': 18365, 'batch_loss/train': 0.9090329334139824} 12/25/2021 09:29:08 - INFO - codeparrot_training - Step 18366: {'lr': 0.0003528085477527859, 'samples': 9403904, 'steps': 18366, 'batch_loss/train': 0.8230371470563114} 12/25/2021 09:29:18 - INFO - codeparrot_training - Step 18367: {'lr': 0.00035279417898300954, 'samples': 9404416, 'steps': 18367, 'batch_loss/train': 0.7671612286940217} 12/25/2021 09:29:31 - INFO - codeparrot_training - Step 18368: {'lr': 0.0003527798098045623, 'samples': 9404928, 'steps': 18368, 'batch_loss/train': 0.777278482913971} 12/25/2021 09:29:41 - INFO - codeparrot_training - Step 18369: {'lr': 0.0003527654402175013, 'samples': 9405440, 'steps': 18369, 'batch_loss/train': 0.8271399072837085} 12/25/2021 09:29:52 - INFO - codeparrot_training - Step 18370: {'lr': 
0.00035275107022188375, 'samples': 9405952, 'steps': 18370, 'batch_loss/train': 0.7710636388510466} 12/25/2021 09:30:04 - INFO - codeparrot_training - Step 18371: {'lr': 0.0003527366998177667, 'samples': 9406464, 'steps': 18371, 'batch_loss/train': 0.7292368452181108} 12/25/2021 09:30:15 - INFO - codeparrot_training - Step 18372: {'lr': 0.00035272232900520733, 'samples': 9406976, 'steps': 18372, 'batch_loss/train': 0.8146260809153318} 12/25/2021 09:30:25 - INFO - codeparrot_training - Step 18373: {'lr': 0.0003527079577842628, 'samples': 9407488, 'steps': 18373, 'batch_loss/train': 0.8319081924855709} 12/25/2021 09:30:39 - INFO - codeparrot_training - Step 18374: {'lr': 0.00035269358615499017, 'samples': 9408000, 'steps': 18374, 'batch_loss/train': 0.832861764356494} 12/25/2021 09:30:50 - INFO - codeparrot_training - Step 18375: {'lr': 0.0003526792141174467, 'samples': 9408512, 'steps': 18375, 'batch_loss/train': 0.762994110584259} 12/25/2021 09:31:00 - INFO - codeparrot_training - Step 18376: {'lr': 0.0003526648416716893, 'samples': 9409024, 'steps': 18376, 'batch_loss/train': 0.8434227630496025} 12/25/2021 09:31:11 - INFO - codeparrot_training - Step 18377: {'lr': 0.0003526504688177754, 'samples': 9409536, 'steps': 18377, 'batch_loss/train': 0.7131613784004003} 12/25/2021 09:31:23 - INFO - codeparrot_training - Step 18378: {'lr': 0.0003526360955557619, 'samples': 9410048, 'steps': 18378, 'batch_loss/train': 0.7761743701994419} 12/25/2021 09:31:33 - INFO - codeparrot_training - Step 18379: {'lr': 0.00035262172188570606, 'samples': 9410560, 'steps': 18379, 'batch_loss/train': 0.7651150566525757} 12/25/2021 09:31:44 - INFO - codeparrot_training - Step 18380: {'lr': 0.000352607347807665, 'samples': 9411072, 'steps': 18380, 'batch_loss/train': 0.7806137832812965} 12/25/2021 09:31:56 - INFO - codeparrot_training - Step 18381: {'lr': 0.0003525929733216959, 'samples': 9411584, 'steps': 18381, 'batch_loss/train': 0.8184688370674849} 12/25/2021 09:32:07 - INFO - 
codeparrot_training - Step 18382: {'lr': 0.00035257859842785586, 'samples': 9412096, 'steps': 18382, 'batch_loss/train': 0.7608974371105433} 12/25/2021 09:32:17 - INFO - codeparrot_training - Step 18383: {'lr': 0.00035256422312620207, 'samples': 9412608, 'steps': 18383, 'batch_loss/train': 0.8034849953837693} 12/25/2021 09:32:31 - INFO - codeparrot_training - Step 18384: {'lr': 0.00035254984741679165, 'samples': 9413120, 'steps': 18384, 'batch_loss/train': 0.8169329550582916} 12/25/2021 09:32:42 - INFO - codeparrot_training - Step 18385: {'lr': 0.0003525354712996817, 'samples': 9413632, 'steps': 18385, 'batch_loss/train': 0.6902801664546132} 12/25/2021 09:32:52 - INFO - codeparrot_training - Step 18386: {'lr': 0.00035252109477492956, 'samples': 9414144, 'steps': 18386, 'batch_loss/train': 0.677153970580548} 12/25/2021 09:33:03 - INFO - codeparrot_training - Step 18387: {'lr': 0.0003525067178425921, 'samples': 9414656, 'steps': 18387, 'batch_loss/train': 0.7070654109120369} 12/25/2021 09:33:15 - INFO - codeparrot_training - Step 18388: {'lr': 0.0003524923405027267, 'samples': 9415168, 'steps': 18388, 'batch_loss/train': 0.8864517519250512} 12/25/2021 09:33:26 - INFO - codeparrot_training - Step 18389: {'lr': 0.0003524779627553905, 'samples': 9415680, 'steps': 18389, 'batch_loss/train': 0.7647458738647401} 12/25/2021 09:33:36 - INFO - codeparrot_training - Step 18390: {'lr': 0.00035246358460064054, 'samples': 9416192, 'steps': 18390, 'batch_loss/train': 0.7296621405985206} 12/25/2021 09:33:50 - INFO - codeparrot_training - Step 18391: {'lr': 0.00035244920603853407, 'samples': 9416704, 'steps': 18391, 'batch_loss/train': 0.8188914330676198} 12/25/2021 09:34:01 - INFO - codeparrot_training - Step 18392: {'lr': 0.0003524348270691283, 'samples': 9417216, 'steps': 18392, 'batch_loss/train': 0.7191170761361718} 12/25/2021 09:34:11 - INFO - codeparrot_training - Step 18393: {'lr': 0.00035242044769248025, 'samples': 9417728, 'steps': 18393, 'batch_loss/train': 
0.8480361727997661} 12/25/2021 09:34:23 - INFO - codeparrot_training - Step 18394: {'lr': 0.0003524060679086472, 'samples': 9418240, 'steps': 18394, 'batch_loss/train': 0.783493984490633} 12/25/2021 09:34:34 - INFO - codeparrot_training - Step 18395: {'lr': 0.00035239168771768637, 'samples': 9418752, 'steps': 18395, 'batch_loss/train': 0.8120946511626244} 12/25/2021 09:34:45 - INFO - codeparrot_training - Step 18396: {'lr': 0.00035237730711965476, 'samples': 9419264, 'steps': 18396, 'batch_loss/train': 0.8496187906712294} 12/25/2021 09:34:55 - INFO - codeparrot_training - Step 18397: {'lr': 0.0003523629261146097, 'samples': 9419776, 'steps': 18397, 'batch_loss/train': 0.9665097268298268} 12/25/2021 09:35:07 - INFO - codeparrot_training - Step 18398: {'lr': 0.00035234854470260826, 'samples': 9420288, 'steps': 18398, 'batch_loss/train': 0.8078749617561698} 12/25/2021 09:35:18 - INFO - codeparrot_training - Step 18399: {'lr': 0.00035233416288370765, 'samples': 9420800, 'steps': 18399, 'batch_loss/train': 0.8033066997304559} 12/25/2021 09:35:28 - INFO - codeparrot_training - Step 18400: {'lr': 0.000352319780657965, 'samples': 9421312, 'steps': 18400, 'batch_loss/train': 0.8556622816249728} 12/25/2021 09:35:40 - INFO - codeparrot_training - Step 18401: {'lr': 0.0003523053980254376, 'samples': 9421824, 'steps': 18401, 'batch_loss/train': 0.882877612952143} 12/25/2021 09:35:51 - INFO - codeparrot_training - Step 18402: {'lr': 0.00035229101498618256, 'samples': 9422336, 'steps': 18402, 'batch_loss/train': 0.8633098499849439} 12/25/2021 09:36:02 - INFO - codeparrot_training - Step 18403: {'lr': 0.000352276631540257, 'samples': 9422848, 'steps': 18403, 'batch_loss/train': 0.7876148680225015} 12/25/2021 09:36:16 - INFO - codeparrot_training - Step 18404: {'lr': 0.0003522622476877183, 'samples': 9423360, 'steps': 18404, 'batch_loss/train': 0.7506908196955919} 12/25/2021 09:36:26 - INFO - codeparrot_training - Step 18405: {'lr': 0.00035224786342862344, 'samples': 9423872, 
'steps': 18405, 'batch_loss/train': 0.72286984231323} 12/25/2021 09:36:37 - INFO - codeparrot_training - Step 18406: {'lr': 0.0003522334787630297, 'samples': 9424384, 'steps': 18406, 'batch_loss/train': 0.5549972746521235} 12/25/2021 09:36:47 - INFO - codeparrot_training - Step 18407: {'lr': 0.0003522190936909942, 'samples': 9424896, 'steps': 18407, 'batch_loss/train': 0.7020266368053854} 12/25/2021 09:36:59 - INFO - codeparrot_training - Step 18408: {'lr': 0.0003522047082125742, 'samples': 9425408, 'steps': 18408, 'batch_loss/train': 0.7129774398636073} 12/25/2021 09:37:10 - INFO - codeparrot_training - Step 18409: {'lr': 0.0003521903223278269, 'samples': 9425920, 'steps': 18409, 'batch_loss/train': 0.8658796139061451} 12/25/2021 09:37:21 - INFO - codeparrot_training - Step 18410: {'lr': 0.00035217593603680946, 'samples': 9426432, 'steps': 18410, 'batch_loss/train': 0.8054534578695893} 12/25/2021 09:37:33 - INFO - codeparrot_training - Step 18411: {'lr': 0.0003521615493395791, 'samples': 9426944, 'steps': 18411, 'batch_loss/train': 0.7674991576932371} 12/25/2021 09:37:43 - INFO - codeparrot_training - Step 18412: {'lr': 0.000352147162236193, 'samples': 9427456, 'steps': 18412, 'batch_loss/train': 0.8886044723913074} 12/25/2021 09:37:54 - INFO - codeparrot_training - Step 18413: {'lr': 0.00035213277472670825, 'samples': 9427968, 'steps': 18413, 'batch_loss/train': 0.8009956507012248} 12/25/2021 09:38:08 - INFO - codeparrot_training - Step 18414: {'lr': 0.00035211838681118227, 'samples': 9428480, 'steps': 18414, 'batch_loss/train': 0.8006974491290748} 12/25/2021 09:38:19 - INFO - codeparrot_training - Step 18415: {'lr': 0.00035210399848967206, 'samples': 9428992, 'steps': 18415, 'batch_loss/train': 1.5751467812806368} 12/25/2021 09:38:29 - INFO - codeparrot_training - Step 18416: {'lr': 0.00035208960976223494, 'samples': 9429504, 'steps': 18416, 'batch_loss/train': 0.7419598968699574} 12/25/2021 09:38:40 - INFO - codeparrot_training - Step 18417: {'lr': 
0.00035207522062892803, 'samples': 9430016, 'steps': 18417, 'batch_loss/train': 0.7496707048267126} 12/25/2021 09:38:52 - INFO - codeparrot_training - Step 18418: {'lr': 0.0003520608310898087, 'samples': 9430528, 'steps': 18418, 'batch_loss/train': 0.6792563390918076} 12/25/2021 09:39:02 - INFO - codeparrot_training - Step 18419: {'lr': 0.00035204644114493396, 'samples': 9431040, 'steps': 18419, 'batch_loss/train': 0.7132848845794797} 12/25/2021 09:39:13 - INFO - codeparrot_training - Step 18420: {'lr': 0.0003520320507943611, 'samples': 9431552, 'steps': 18420, 'batch_loss/train': 0.8202406894415617} 12/25/2021 09:39:27 - INFO - codeparrot_training - Step 18421: {'lr': 0.0003520176600381474, 'samples': 9432064, 'steps': 18421, 'batch_loss/train': 0.8092004335485399} 12/25/2021 09:39:38 - INFO - codeparrot_training - Step 18422: {'lr': 0.00035200326887634993, 'samples': 9432576, 'steps': 18422, 'batch_loss/train': 0.7614607224240899} 12/25/2021 09:39:48 - INFO - codeparrot_training - Step 18423: {'lr': 0.00035198887730902595, 'samples': 9433088, 'steps': 18423, 'batch_loss/train': 0.8188060980755836} 12/25/2021 09:40:00 - INFO - codeparrot_training - Step 18424: {'lr': 0.00035197448533623275, 'samples': 9433600, 'steps': 18424, 'batch_loss/train': 0.7835003260988742} 12/25/2021 09:40:11 - INFO - codeparrot_training - Step 18425: {'lr': 0.00035196009295802757, 'samples': 9434112, 'steps': 18425, 'batch_loss/train': 0.793625914491713} 12/25/2021 09:40:21 - INFO - codeparrot_training - Step 18426: {'lr': 0.00035194570017446744, 'samples': 9434624, 'steps': 18426, 'batch_loss/train': 0.8246061641257256} 12/25/2021 09:40:33 - INFO - codeparrot_training - Step 18427: {'lr': 0.0003519313069856097, 'samples': 9435136, 'steps': 18427, 'batch_loss/train': 0.7740435842424631} 12/25/2021 09:40:44 - INFO - codeparrot_training - Step 18428: {'lr': 0.0003519169133915117, 'samples': 9435648, 'steps': 18428, 'batch_loss/train': 0.6806459208019078} 12/25/2021 09:40:55 - INFO - 
codeparrot_training - Step 18429: {'lr': 0.0003519025193922304, 'samples': 9436160, 'steps': 18429, 'batch_loss/train': 0.819563252851367} 12/25/2021 09:41:05 - INFO - codeparrot_training - Step 18430: {'lr': 0.0003518881249878232, 'samples': 9436672, 'steps': 18430, 'batch_loss/train': 0.7654887610115111} 12/25/2021 09:41:19 - INFO - codeparrot_training - Step 18431: {'lr': 0.00035187373017834736, 'samples': 9437184, 'steps': 18431, 'batch_loss/train': 0.839984490768984} 12/25/2021 09:41:30 - INFO - codeparrot_training - Step 18432: {'lr': 0.00035185933496386, 'samples': 9437696, 'steps': 18432, 'batch_loss/train': 0.7546302061527967} 12/25/2021 09:41:40 - INFO - codeparrot_training - Step 18433: {'lr': 0.0003518449393444183, 'samples': 9438208, 'steps': 18433, 'batch_loss/train': 1.1097107473760843} 12/25/2021 09:41:53 - INFO - codeparrot_training - Step 18434: {'lr': 0.0003518305433200797, 'samples': 9438720, 'steps': 18434, 'batch_loss/train': 0.8140541147440672} 12/25/2021 09:42:03 - INFO - codeparrot_training - Step 18435: {'lr': 0.00035181614689090126, 'samples': 9439232, 'steps': 18435, 'batch_loss/train': 0.6822377280332148} 12/25/2021 09:42:14 - INFO - codeparrot_training - Step 18436: {'lr': 0.0003518017500569403, 'samples': 9439744, 'steps': 18436, 'batch_loss/train': 1.0445223338901997} 12/25/2021 09:42:26 - INFO - codeparrot_training - Step 18437: {'lr': 0.00035178735281825394, 'samples': 9440256, 'steps': 18437, 'batch_loss/train': 0.742509507574141} 12/25/2021 09:42:36 - INFO - codeparrot_training - Step 18438: {'lr': 0.0003517729551748995, 'samples': 9440768, 'steps': 18438, 'batch_loss/train': 0.6269812655518763} 12/25/2021 09:42:47 - INFO - codeparrot_training - Step 18439: {'lr': 0.00035175855712693424, 'samples': 9441280, 'steps': 18439, 'batch_loss/train': 0.7774883136153221} 12/25/2021 09:42:58 - INFO - codeparrot_training - Step 18440: {'lr': 0.0003517441586744153, 'samples': 9441792, 'steps': 18440, 'batch_loss/train': 0.8088339299429208} 
12/25/2021 09:43:12 - INFO - codeparrot_training - Step 18441: {'lr': 0.0003517297598174002, 'samples': 9442304, 'steps': 18441, 'batch_loss/train': 0.7800716953352094} 12/25/2021 09:43:23 - INFO - codeparrot_training - Step 18442: {'lr': 0.00035171536055594583, 'samples': 9442816, 'steps': 18442, 'batch_loss/train': 1.018951764330268} 12/25/2021 09:43:34 - INFO - codeparrot_training - Step 18443: {'lr': 0.00035170096089010965, 'samples': 9443328, 'steps': 18443, 'batch_loss/train': 0.7749126749113202} 12/25/2021 09:43:46 - INFO - codeparrot_training - Step 18444: {'lr': 0.0003516865608199489, 'samples': 9443840, 'steps': 18444, 'batch_loss/train': 1.2080784761346877} 12/25/2021 09:43:56 - INFO - codeparrot_training - Step 18445: {'lr': 0.0003516721603455207, 'samples': 9444352, 'steps': 18445, 'batch_loss/train': 0.891740697901696} 12/25/2021 09:44:07 - INFO - codeparrot_training - Step 18446: {'lr': 0.0003516577594668824, 'samples': 9444864, 'steps': 18446, 'batch_loss/train': 0.8409069143235683} 12/25/2021 09:44:17 - INFO - codeparrot_training - Step 18447: {'lr': 0.00035164335818409123, 'samples': 9445376, 'steps': 18447, 'batch_loss/train': 0.8388987453654408} 12/25/2021 09:44:32 - INFO - codeparrot_training - Step 18448: {'lr': 0.0003516289564972044, 'samples': 9445888, 'steps': 18448, 'batch_loss/train': 0.7447444112040102} 12/25/2021 09:44:42 - INFO - codeparrot_training - Step 18449: {'lr': 0.00035161455440627933, 'samples': 9446400, 'steps': 18449, 'batch_loss/train': 0.7045399295166135} 12/25/2021 09:44:53 - INFO - codeparrot_training - Step 18450: {'lr': 0.00035160015191137306, 'samples': 9446912, 'steps': 18450, 'batch_loss/train': 0.5488360283197835} 12/25/2021 09:45:05 - INFO - codeparrot_training - Step 18451: {'lr': 0.000351585749012543, 'samples': 9447424, 'steps': 18451, 'batch_loss/train': 0.8429188225418329} 12/25/2021 09:45:16 - INFO - codeparrot_training - Step 18452: {'lr': 0.0003515713457098463, 'samples': 9447936, 'steps': 18452, 
'batch_loss/train': 0.6187885365216061} 12/25/2021 09:45:26 - INFO - codeparrot_training - Step 18453: {'lr': 0.00035155694200334034, 'samples': 9448448, 'steps': 18453, 'batch_loss/train': 0.6664107709657401} 12/25/2021 09:45:38 - INFO - codeparrot_training - Step 18454: {'lr': 0.00035154253789308233, 'samples': 9448960, 'steps': 18454, 'batch_loss/train': 0.8088032985106111} 12/25/2021 09:45:49 - INFO - codeparrot_training - Step 18455: {'lr': 0.0003515281333791295, 'samples': 9449472, 'steps': 18455, 'batch_loss/train': 0.9011161997914314} 12/25/2021 09:45:59 - INFO - codeparrot_training - Step 18456: {'lr': 0.00035151372846153916, 'samples': 9449984, 'steps': 18456, 'batch_loss/train': 0.7288131872192025} 12/25/2021 09:46:10 - INFO - codeparrot_training - Step 18457: {'lr': 0.00035149932314036854, 'samples': 9450496, 'steps': 18457, 'batch_loss/train': 0.7363517438061535} 12/25/2021 09:46:24 - INFO - codeparrot_training - Step 18458: {'lr': 0.00035148491741567495, 'samples': 9451008, 'steps': 18458, 'batch_loss/train': 0.7919001346454024} 12/25/2021 09:46:34 - INFO - codeparrot_training - Step 18459: {'lr': 0.0003514705112875157, 'samples': 9451520, 'steps': 18459, 'batch_loss/train': 0.7507019620388746} 12/25/2021 09:46:45 - INFO - codeparrot_training - Step 18460: {'lr': 0.000351456104755948, 'samples': 9452032, 'steps': 18460, 'batch_loss/train': 0.8054242413491011} 12/25/2021 09:46:57 - INFO - codeparrot_training - Step 18461: {'lr': 0.000351441697821029, 'samples': 9452544, 'steps': 18461, 'batch_loss/train': 0.8417347259819508} 12/25/2021 09:47:08 - INFO - codeparrot_training - Step 18462: {'lr': 0.0003514272904828163, 'samples': 9453056, 'steps': 18462, 'batch_loss/train': 0.7443844974040985} 12/25/2021 09:47:18 - INFO - codeparrot_training - Step 18463: {'lr': 0.0003514128827413668, 'samples': 9453568, 'steps': 18463, 'batch_loss/train': 0.8722412865608931} 12/25/2021 09:47:30 - INFO - codeparrot_training - Step 18464: {'lr': 0.0003513984745967381, 
'samples': 9454080, 'steps': 18464, 'batch_loss/train': 0.772864792495966} 12/25/2021 09:47:41 - INFO - codeparrot_training - Step 18465: {'lr': 0.0003513840660489873, 'samples': 9454592, 'steps': 18465, 'batch_loss/train': 0.7467098506167531} 12/25/2021 09:47:51 - INFO - codeparrot_training - Step 18466: {'lr': 0.0003513696570981718, 'samples': 9455104, 'steps': 18466, 'batch_loss/train': 0.7967634918168187} 12/25/2021 09:48:05 - INFO - codeparrot_training - Step 18467: {'lr': 0.00035135524774434877, 'samples': 9455616, 'steps': 18467, 'batch_loss/train': 0.6825023852288723} 12/25/2021 09:48:16 - INFO - codeparrot_training - Step 18468: {'lr': 0.00035134083798757555, 'samples': 9456128, 'steps': 18468, 'batch_loss/train': 0.7773242723196745} 12/25/2021 09:48:26 - INFO - codeparrot_training - Step 18469: {'lr': 0.00035132642782790944, 'samples': 9456640, 'steps': 18469, 'batch_loss/train': 0.7572660967707634} 12/25/2021 09:48:37 - INFO - codeparrot_training - Step 18470: {'lr': 0.00035131201726540775, 'samples': 9457152, 'steps': 18470, 'batch_loss/train': 0.6497233984991908} 12/25/2021 09:48:49 - INFO - codeparrot_training - Step 18471: {'lr': 0.00035129760630012765, 'samples': 9457664, 'steps': 18471, 'batch_loss/train': 0.7899620048701763} 12/25/2021 09:49:00 - INFO - codeparrot_training - Step 18472: {'lr': 0.00035128319493212657, 'samples': 9458176, 'steps': 18472, 'batch_loss/train': 0.8327213935554028} 12/25/2021 09:49:10 - INFO - codeparrot_training - Step 18473: {'lr': 0.0003512687831614617, 'samples': 9458688, 'steps': 18473, 'batch_loss/train': 0.6567718968726695} 12/25/2021 09:49:22 - INFO - codeparrot_training - Step 18474: {'lr': 0.0003512543709881904, 'samples': 9459200, 'steps': 18474, 'batch_loss/train': 0.7831233751494437} 12/25/2021 09:49:33 - INFO - codeparrot_training - Step 18475: {'lr': 0.00035123995841237, 'samples': 9459712, 'steps': 18475, 'batch_loss/train': 0.8086553094908595} 12/25/2021 09:49:43 - INFO - codeparrot_training - Step 
18476: {'lr': 0.0003512255454340577, 'samples': 9460224, 'steps': 18476, 'batch_loss/train': 0.7939342032186687} 12/25/2021 09:49:58 - INFO - codeparrot_training - Step 18477: {'lr': 0.0003512111320533109, 'samples': 9460736, 'steps': 18477, 'batch_loss/train': 0.6982813733629882} 12/25/2021 09:50:08 - INFO - codeparrot_training - Step 18478: {'lr': 0.0003511967182701868, 'samples': 9461248, 'steps': 18478, 'batch_loss/train': 0.8363623635377735} 12/25/2021 09:50:19 - INFO - codeparrot_training - Step 18479: {'lr': 0.00035118230408474275, 'samples': 9461760, 'steps': 18479, 'batch_loss/train': 0.7282878682017326} 12/25/2021 09:50:29 - INFO - codeparrot_training - Step 18480: {'lr': 0.00035116788949703613, 'samples': 9462272, 'steps': 18480, 'batch_loss/train': 0.7617618483491242} 12/25/2021 09:50:41 - INFO - codeparrot_training - Step 18481: {'lr': 0.0003511534745071242, 'samples': 9462784, 'steps': 18481, 'batch_loss/train': 0.7873905295855366} 12/25/2021 09:50:52 - INFO - codeparrot_training - Step 18482: {'lr': 0.0003511390591150642, 'samples': 9463296, 'steps': 18482, 'batch_loss/train': 0.8573336899280548} 12/25/2021 09:51:03 - INFO - codeparrot_training - Step 18483: {'lr': 0.00035112464332091343, 'samples': 9463808, 'steps': 18483, 'batch_loss/train': 0.8053197572007775} 12/25/2021 09:51:15 - INFO - codeparrot_training - Step 18484: {'lr': 0.00035111022712472936, 'samples': 9464320, 'steps': 18484, 'batch_loss/train': 0.8100597427692264} 12/25/2021 09:51:25 - INFO - codeparrot_training - Step 18485: {'lr': 0.00035109581052656915, 'samples': 9464832, 'steps': 18485, 'batch_loss/train': 0.8136961339041591} 12/25/2021 09:51:36 - INFO - codeparrot_training - Step 18486: {'lr': 0.0003510813935264902, 'samples': 9465344, 'steps': 18486, 'batch_loss/train': 0.8055492280982435} 12/25/2021 09:51:50 - INFO - codeparrot_training - Step 18487: {'lr': 0.00035106697612454976, 'samples': 9465856, 'steps': 18487, 'batch_loss/train': 0.8272001408040524} 12/25/2021 09:52:00 - 
INFO - codeparrot_training - Step 18488: {'lr': 0.00035105255832080525, 'samples': 9466368, 'steps': 18488, 'batch_loss/train': 0.7899927580729127} 12/25/2021 09:52:11 - INFO - codeparrot_training - Step 18489: {'lr': 0.00035103814011531385, 'samples': 9466880, 'steps': 18489, 'batch_loss/train': 0.8290909631177783} 12/25/2021 09:52:22 - INFO - codeparrot_training - Step 18490: {'lr': 0.000351023721508133, 'samples': 9467392, 'steps': 18490, 'batch_loss/train': 0.49344855453819036} 12/25/2021 09:52:34 - INFO - codeparrot_training - Step 18491: {'lr': 0.00035100930249931997, 'samples': 9467904, 'steps': 18491, 'batch_loss/train': 0.6994014708325267} 12/25/2021 09:52:44 - INFO - codeparrot_training - Step 18492: {'lr': 0.0003509948830889321, 'samples': 9468416, 'steps': 18492, 'batch_loss/train': 0.7457867292687297} 12/25/2021 09:52:55 - INFO - codeparrot_training - Step 18493: {'lr': 0.00035098046327702664, 'samples': 9468928, 'steps': 18493, 'batch_loss/train': 0.7943231749814004} 12/25/2021 09:53:07 - INFO - codeparrot_training - Step 18494: {'lr': 0.000350966043063661, 'samples': 9469440, 'steps': 18494, 'batch_loss/train': 0.8269388619810343} 12/25/2021 09:53:18 - INFO - codeparrot_training - Step 18495: {'lr': 0.0003509516224488926, 'samples': 9469952, 'steps': 18495, 'batch_loss/train': 0.7100692942040041} 12/25/2021 09:53:28 - INFO - codeparrot_training - Step 18496: {'lr': 0.0003509372014327786, 'samples': 9470464, 'steps': 18496, 'batch_loss/train': 0.7444573936518282} 12/25/2021 09:53:42 - INFO - codeparrot_training - Step 18497: {'lr': 0.00035092278001537644, 'samples': 9470976, 'steps': 18497, 'batch_loss/train': 0.8227430712431669} 12/25/2021 09:53:53 - INFO - codeparrot_training - Step 18498: {'lr': 0.00035090835819674335, 'samples': 9471488, 'steps': 18498, 'batch_loss/train': 0.8004835797473788} 12/25/2021 09:54:03 - INFO - codeparrot_training - Step 18499: {'lr': 0.0003508939359769368, 'samples': 9472000, 'steps': 18499, 'batch_loss/train': 
0.8622113587334752} 12/25/2021 09:54:14 - INFO - codeparrot_training - Step 18500: {'lr': 0.00035087951335601396, 'samples': 9472512, 'steps': 18500, 'batch_loss/train': 0.8201148817315698} 12/25/2021 09:54:26 - INFO - codeparrot_training - Step 18501: {'lr': 0.0003508650903340323, 'samples': 9473024, 'steps': 18501, 'batch_loss/train': 0.735537339001894} 12/25/2021 09:54:37 - INFO - codeparrot_training - Step 18502: {'lr': 0.00035085066691104916, 'samples': 9473536, 'steps': 18502, 'batch_loss/train': 0.7794305291026831} 12/25/2021 09:54:47 - INFO - codeparrot_training - Step 18503: {'lr': 0.00035083624308712187, 'samples': 9474048, 'steps': 18503, 'batch_loss/train': 0.7841711319051683} 12/25/2021 09:55:01 - INFO - codeparrot_training - Step 18504: {'lr': 0.0003508218188623077, 'samples': 9474560, 'steps': 18504, 'batch_loss/train': 0.7929606987163424} 12/25/2021 09:55:12 - INFO - codeparrot_training - Step 18505: {'lr': 0.00035080739423666406, 'samples': 9475072, 'steps': 18505, 'batch_loss/train': 0.8649655729532242} 12/25/2021 09:55:23 - INFO - codeparrot_training - Step 18506: {'lr': 0.0003507929692102483, 'samples': 9475584, 'steps': 18506, 'batch_loss/train': 0.5463851449894719} 12/25/2021 09:55:35 - INFO - codeparrot_training - Step 18507: {'lr': 0.0003507785437831177, 'samples': 9476096, 'steps': 18507, 'batch_loss/train': 0.823594281449914} 12/25/2021 09:55:45 - INFO - codeparrot_training - Step 18508: {'lr': 0.00035076411795532974, 'samples': 9476608, 'steps': 18508, 'batch_loss/train': 0.8192264214158058} 12/25/2021 09:55:56 - INFO - codeparrot_training - Step 18509: {'lr': 0.00035074969172694165, 'samples': 9477120, 'steps': 18509, 'batch_loss/train': 0.8704605605453253} 12/25/2021 09:56:08 - INFO - codeparrot_training - Step 18510: {'lr': 0.00035073526509801084, 'samples': 9477632, 'steps': 18510, 'batch_loss/train': 0.7362459804862738} 12/25/2021 09:56:19 - INFO - codeparrot_training - Step 18511: {'lr': 0.0003507208380685946, 'samples': 9478144, 
'steps': 18511, 'batch_loss/train': 0.8276361301541328} 12/25/2021 09:56:29 - INFO - codeparrot_training - Step 18512: {'lr': 0.0003507064106387505, 'samples': 9478656, 'steps': 18512, 'batch_loss/train': 0.8180172350257635} 12/25/2021 09:56:40 - INFO - codeparrot_training - Step 18513: {'lr': 0.00035069198280853554, 'samples': 9479168, 'steps': 18513, 'batch_loss/train': 0.8223583539947867} 12/25/2021 09:56:52 - INFO - codeparrot_training - Step 18514: {'lr': 0.0003506775545780073, 'samples': 9479680, 'steps': 18514, 'batch_loss/train': 0.7755202320404351} 12/25/2021 09:57:02 - INFO - codeparrot_training - Step 18515: {'lr': 0.00035066312594722327, 'samples': 9480192, 'steps': 18515, 'batch_loss/train': 0.9205630542710423} 12/25/2021 09:57:13 - INFO - codeparrot_training - Step 18516: {'lr': 0.00035064869691624044, 'samples': 9480704, 'steps': 18516, 'batch_loss/train': 0.7471378119662404} 12/25/2021 09:57:27 - INFO - codeparrot_training - Step 18517: {'lr': 0.00035063426748511663, 'samples': 9481216, 'steps': 18517, 'batch_loss/train': 0.7351823467761278} 12/25/2021 09:57:37 - INFO - codeparrot_training - Step 18518: {'lr': 0.0003506198376539088, 'samples': 9481728, 'steps': 18518, 'batch_loss/train': 1.020582877099514} 12/25/2021 09:57:48 - INFO - codeparrot_training - Step 18519: {'lr': 0.00035060540742267457, 'samples': 9482240, 'steps': 18519, 'batch_loss/train': 0.8661873508244753} 12/25/2021 09:58:00 - INFO - codeparrot_training - Step 18520: {'lr': 0.0003505909767914712, 'samples': 9482752, 'steps': 18520, 'batch_loss/train': 0.7131892265751958} 12/25/2021 09:58:11 - INFO - codeparrot_training - Step 18521: {'lr': 0.00035057654576035614, 'samples': 9483264, 'steps': 18521, 'batch_loss/train': 0.8043440449982882} 12/25/2021 09:58:21 - INFO - codeparrot_training - Step 18522: {'lr': 0.0003505621143293867, 'samples': 9483776, 'steps': 18522, 'batch_loss/train': 0.7533901380375028} 12/25/2021 09:58:33 - INFO - codeparrot_training - Step 18523: {'lr': 
0.00035054768249862016, 'samples': 9484288, 'steps': 18523, 'batch_loss/train': 0.8510654661804438} 12/25/2021 09:58:44 - INFO - codeparrot_training - Step 18524: {'lr': 0.00035053325026811405, 'samples': 9484800, 'steps': 18524, 'batch_loss/train': 0.846810937859118} 12/25/2021 09:58:54 - INFO - codeparrot_training - Step 18525: {'lr': 0.00035051881763792576, 'samples': 9485312, 'steps': 18525, 'batch_loss/train': 0.7042150893248618} 12/25/2021 09:59:05 - INFO - codeparrot_training - Step 18526: {'lr': 0.0003505043846081125, 'samples': 9485824, 'steps': 18526, 'batch_loss/train': 0.8604983547702432} 12/25/2021 09:59:19 - INFO - codeparrot_training - Step 18527: {'lr': 0.00035048995117873186, 'samples': 9486336, 'steps': 18527, 'batch_loss/train': 1.5649059377610683} 12/25/2021 09:59:30 - INFO - codeparrot_training - Step 18528: {'lr': 0.0003504755173498411, 'samples': 9486848, 'steps': 18528, 'batch_loss/train': 1.0222053276374936} 12/25/2021 09:59:40 - INFO - codeparrot_training - Step 18529: {'lr': 0.00035046108312149747, 'samples': 9487360, 'steps': 18529, 'batch_loss/train': 0.6952368096681312} 12/25/2021 09:59:52 - INFO - codeparrot_training - Step 18530: {'lr': 0.0003504466484937587, 'samples': 9487872, 'steps': 18530, 'batch_loss/train': 0.8547652247361839} 12/25/2021 10:00:03 - INFO - codeparrot_training - Step 18531: {'lr': 0.0003504322134666819, 'samples': 9488384, 'steps': 18531, 'batch_loss/train': 0.791063763666898} 12/25/2021 10:00:13 - INFO - codeparrot_training - Step 18532: {'lr': 0.00035041777804032455, 'samples': 9488896, 'steps': 18532, 'batch_loss/train': 0.7918801410123706} 12/25/2021 10:00:27 - INFO - codeparrot_training - Step 18533: {'lr': 0.000350403342214744, 'samples': 9489408, 'steps': 18533, 'batch_loss/train': 0.7423478867858648} 12/25/2021 10:00:38 - INFO - codeparrot_training - Step 18534: {'lr': 0.0003503889059899978, 'samples': 9489920, 'steps': 18534, 'batch_loss/train': 0.7152494085021317} 12/25/2021 10:00:48 - INFO - 
codeparrot_training - Step 18535: {'lr': 0.0003503744693661431, 'samples': 9490432, 'steps': 18535, 'batch_loss/train': 0.7444629155797884} 12/25/2021 10:00:59 - INFO - codeparrot_training - Step 18536: {'lr': 0.00035036003234323744, 'samples': 9490944, 'steps': 18536, 'batch_loss/train': 0.7915724897757173} 12/25/2021 10:01:11 - INFO - codeparrot_training - Step 18537: {'lr': 0.00035034559492133817, 'samples': 9491456, 'steps': 18537, 'batch_loss/train': 0.7237428538501263} 12/25/2021 10:01:22 - INFO - codeparrot_training - Step 18538: {'lr': 0.0003503311571005027, 'samples': 9491968, 'steps': 18538, 'batch_loss/train': 0.7534356308169663} 12/25/2021 10:01:32 - INFO - codeparrot_training - Step 18539: {'lr': 0.00035031671888078843, 'samples': 9492480, 'steps': 18539, 'batch_loss/train': 0.8344368077814579} 12/25/2021 10:01:44 - INFO - codeparrot_training - Step 18540: {'lr': 0.00035030228026225273, 'samples': 9492992, 'steps': 18540, 'batch_loss/train': 0.7007877491414547} 12/25/2021 10:01:55 - INFO - codeparrot_training - Step 18541: {'lr': 0.00035028784124495316, 'samples': 9493504, 'steps': 18541, 'batch_loss/train': 0.731095630559139} 12/25/2021 10:02:05 - INFO - codeparrot_training - Step 18542: {'lr': 0.00035027340182894686, 'samples': 9494016, 'steps': 18542, 'batch_loss/train': 0.7909980928525329} 12/25/2021 10:02:19 - INFO - codeparrot_training - Step 18543: {'lr': 0.00035025896201429146, 'samples': 9494528, 'steps': 18543, 'batch_loss/train': 0.8458621054887772} 12/25/2021 10:02:30 - INFO - codeparrot_training - Step 18544: {'lr': 0.00035024452180104426, 'samples': 9495040, 'steps': 18544, 'batch_loss/train': 0.8485962068662047} 12/25/2021 10:02:40 - INFO - codeparrot_training - Step 18545: {'lr': 0.0003502300811892627, 'samples': 9495552, 'steps': 18545, 'batch_loss/train': 0.7578001518268138} 12/25/2021 10:02:51 - INFO - codeparrot_training - Step 18546: {'lr': 0.00035021564017900423, 'samples': 9496064, 'steps': 18546, 'batch_loss/train': 
0.7521484978497028} 12/25/2021 10:03:03 - INFO - codeparrot_training - Step 18547: {'lr': 0.0003502011987703261, 'samples': 9496576, 'steps': 18547, 'batch_loss/train': 0.758905554190278} 12/25/2021 10:03:14 - INFO - codeparrot_training - Step 18548: {'lr': 0.00035018675696328594, 'samples': 9497088, 'steps': 18548, 'batch_loss/train': 0.7074940726161003} 12/25/2021 10:03:25 - INFO - codeparrot_training - Step 18549: {'lr': 0.00035017231475794106, 'samples': 9497600, 'steps': 18549, 'batch_loss/train': 0.947256704326719} 12/25/2021 10:03:37 - INFO - codeparrot_training - Step 18550: {'lr': 0.0003501578721543488, 'samples': 9498112, 'steps': 18550, 'batch_loss/train': 0.6190831797430292} 12/25/2021 10:03:47 - INFO - codeparrot_training - Step 18551: {'lr': 0.0003501434291525667, 'samples': 9498624, 'steps': 18551, 'batch_loss/train': 0.7952403011731803} 12/25/2021 10:03:58 - INFO - codeparrot_training - Step 18552: {'lr': 0.0003501289857526521, 'samples': 9499136, 'steps': 18552, 'batch_loss/train': 0.8304721182212234} 12/25/2021 10:04:10 - INFO - codeparrot_training - Step 18553: {'lr': 0.0003501145419546625, 'samples': 9499648, 'steps': 18553, 'batch_loss/train': 0.6950850570574403} 12/25/2021 10:04:21 - INFO - codeparrot_training - Step 18554: {'lr': 0.0003501000977586554, 'samples': 9500160, 'steps': 18554, 'batch_loss/train': 0.7858161246404052} 12/25/2021 10:04:31 - INFO - codeparrot_training - Step 18555: {'lr': 0.0003500856531646879, 'samples': 9500672, 'steps': 18555, 'batch_loss/train': 0.6870260098949075} 12/25/2021 10:04:45 - INFO - codeparrot_training - Step 18556: {'lr': 0.00035007120817281777, 'samples': 9501184, 'steps': 18556, 'batch_loss/train': 0.7821826497092843} 12/25/2021 10:04:56 - INFO - codeparrot_training - Step 18557: {'lr': 0.0003500567627831022, 'samples': 9501696, 'steps': 18557, 'batch_loss/train': 0.7923790188506246} 12/25/2021 10:05:07 - INFO - codeparrot_training - Step 18558: {'lr': 0.0003500423169955988, 'samples': 9502208, 
'steps': 18558, 'batch_loss/train': 1.2060062875971198} 12/25/2021 10:05:17 - INFO - codeparrot_training - Step 18559: {'lr': 0.0003500278708103649, 'samples': 9502720, 'steps': 18559, 'batch_loss/train': 0.7950780158862472} 12/25/2021 10:05:29 - INFO - codeparrot_training - Step 18560: {'lr': 0.00035001342422745794, 'samples': 9503232, 'steps': 18560, 'batch_loss/train': 0.4690779214142822} 12/25/2021 10:05:40 - INFO - codeparrot_training - Step 18561: {'lr': 0.0003499989772469354, 'samples': 9503744, 'steps': 18561, 'batch_loss/train': 0.8034382266923785} 12/25/2021 10:05:50 - INFO - codeparrot_training - Step 18562: {'lr': 0.0003499845298688547, 'samples': 9504256, 'steps': 18562, 'batch_loss/train': 0.7249832623638213} 12/25/2021 10:06:02 - INFO - codeparrot_training - Step 18563: {'lr': 0.00034997008209327316, 'samples': 9504768, 'steps': 18563, 'batch_loss/train': 0.788810140453279} 12/25/2021 10:06:13 - INFO - codeparrot_training - Step 18564: {'lr': 0.0003499556339202484, 'samples': 9505280, 'steps': 18564, 'batch_loss/train': 1.5779247526079416} 12/25/2021 10:06:24 - INFO - codeparrot_training - Step 18565: {'lr': 0.00034994118534983784, 'samples': 9505792, 'steps': 18565, 'batch_loss/train': 0.6071939803659916} 12/25/2021 10:06:37 - INFO - codeparrot_training - Step 18566: {'lr': 0.00034992673638209877, 'samples': 9506304, 'steps': 18566, 'batch_loss/train': 0.851705027744174} 12/25/2021 10:06:48 - INFO - codeparrot_training - Step 18567: {'lr': 0.0003499122870170888, 'samples': 9506816, 'steps': 18567, 'batch_loss/train': 0.7624698856379837} 12/25/2021 10:06:59 - INFO - codeparrot_training - Step 18568: {'lr': 0.00034989783725486516, 'samples': 9507328, 'steps': 18568, 'batch_loss/train': 0.7108784352894872} 12/25/2021 10:07:09 - INFO - codeparrot_training - Step 18569: {'lr': 0.0003498833870954856, 'samples': 9507840, 'steps': 18569, 'batch_loss/train': 0.7666690601035953} 12/25/2021 10:07:21 - INFO - codeparrot_training - Step 18570: {'lr': 
0.0003498689365390073, 'samples': 9508352, 'steps': 18570, 'batch_loss/train': 0.711030375212431} 12/25/2021 10:07:32 - INFO - codeparrot_training - Step 18571: {'lr': 0.00034985448558548786, 'samples': 9508864, 'steps': 18571, 'batch_loss/train': 0.876465986482799} 12/25/2021 10:07:43 - INFO - codeparrot_training - Step 18572: {'lr': 0.0003498400342349847, 'samples': 9509376, 'steps': 18572, 'batch_loss/train': 0.7675204836996272} 12/25/2021 10:07:56 - INFO - codeparrot_training - Step 18573: {'lr': 0.0003498255824875553, 'samples': 9509888, 'steps': 18573, 'batch_loss/train': 0.7353970864787698} 12/25/2021 10:08:07 - INFO - codeparrot_training - Step 18574: {'lr': 0.00034981113034325697, 'samples': 9510400, 'steps': 18574, 'batch_loss/train': 0.752272040117532} 12/25/2021 10:08:18 - INFO - codeparrot_training - Step 18575: {'lr': 0.0003497966778021473, 'samples': 9510912, 'steps': 18575, 'batch_loss/train': 0.8208312727510929} 12/25/2021 10:08:30 - INFO - codeparrot_training - Step 18576: {'lr': 0.0003497822248642838, 'samples': 9511424, 'steps': 18576, 'batch_loss/train': 0.7432823530398309} 12/25/2021 10:08:40 - INFO - codeparrot_training - Step 18577: {'lr': 0.00034976777152972386, 'samples': 9511936, 'steps': 18577, 'batch_loss/train': 0.8014957495033741} 12/25/2021 10:08:51 - INFO - codeparrot_training - Step 18578: {'lr': 0.0003497533177985248, 'samples': 9512448, 'steps': 18578, 'batch_loss/train': 0.7709511173889041} 12/25/2021 10:09:01 - INFO - codeparrot_training - Step 18579: {'lr': 0.0003497388636707443, 'samples': 9512960, 'steps': 18579, 'batch_loss/train': 0.8167014485225081} 12/25/2021 10:09:13 - INFO - codeparrot_training - Step 18580: {'lr': 0.00034972440914643975, 'samples': 9513472, 'steps': 18580, 'batch_loss/train': 0.8731568148359656} 12/25/2021 10:09:24 - INFO - codeparrot_training - Step 18581: {'lr': 0.00034970995422566856, 'samples': 9513984, 'steps': 18581, 'batch_loss/train': 0.8232574192807078} 12/25/2021 10:09:35 - INFO - 
codeparrot_training - Step 18582: {'lr': 0.00034969549890848827, 'samples': 9514496, 'steps': 18582, 'batch_loss/train': 0.6017559023457579} 12/25/2021 10:09:48 - INFO - codeparrot_training - Step 18583: {'lr': 0.0003496810431949563, 'samples': 9515008, 'steps': 18583, 'batch_loss/train': 0.8506343010812998} 12/25/2021 10:09:59 - INFO - codeparrot_training - Step 18584: {'lr': 0.0003496665870851302, 'samples': 9515520, 'steps': 18584, 'batch_loss/train': 0.8054359825327992} 12/25/2021 10:10:10 - INFO - codeparrot_training - Step 18585: {'lr': 0.0003496521305790673, 'samples': 9516032, 'steps': 18585, 'batch_loss/train': 0.7835210236953571} 12/25/2021 10:10:22 - INFO - codeparrot_training - Step 18586: {'lr': 0.0003496376736768252, 'samples': 9516544, 'steps': 18586, 'batch_loss/train': 0.846599037759006} 12/25/2021 10:10:32 - INFO - codeparrot_training - Step 18587: {'lr': 0.0003496232163784613, 'samples': 9517056, 'steps': 18587, 'batch_loss/train': 0.7676551740150899} 12/25/2021 10:10:43 - INFO - codeparrot_training - Step 18588: {'lr': 0.0003496087586840331, 'samples': 9517568, 'steps': 18588, 'batch_loss/train': 0.8159684529528022} 12/25/2021 10:10:55 - INFO - codeparrot_training - Step 18589: {'lr': 0.00034959430059359814, 'samples': 9518080, 'steps': 18589, 'batch_loss/train': 0.7894628650974482} 12/25/2021 10:11:05 - INFO - codeparrot_training - Step 18590: {'lr': 0.0003495798421072138, 'samples': 9518592, 'steps': 18590, 'batch_loss/train': 0.7716496139764786} 12/25/2021 10:11:16 - INFO - codeparrot_training - Step 18591: {'lr': 0.00034956538322493766, 'samples': 9519104, 'steps': 18591, 'batch_loss/train': 0.8582292320206761} 12/25/2021 10:11:27 - INFO - codeparrot_training - Step 18592: {'lr': 0.0003495509239468271, 'samples': 9519616, 'steps': 18592, 'batch_loss/train': 0.793743958696723} 12/25/2021 10:11:41 - INFO - codeparrot_training - Step 18593: {'lr': 0.0003495364642729397, 'samples': 9520128, 'steps': 18593, 'batch_loss/train': 0.8568330220878124} 
12/25/2021 10:11:51 - INFO - codeparrot_training - Step 18594: {'lr': 0.0003495220042033328, 'samples': 9520640, 'steps': 18594, 'batch_loss/train': 0.778947307728231} 12/25/2021 10:12:02 - INFO - codeparrot_training - Step 18595: {'lr': 0.00034950754373806416, 'samples': 9521152, 'steps': 18595, 'batch_loss/train': 0.7938401526771486} 12/25/2021 10:12:14 - INFO - codeparrot_training - Step 18596: {'lr': 0.00034949308287719093, 'samples': 9521664, 'steps': 18596, 'batch_loss/train': 0.5970530820195563} 12/25/2021 10:12:25 - INFO - codeparrot_training - Step 18597: {'lr': 0.0003494786216207709, 'samples': 9522176, 'steps': 18597, 'batch_loss/train': 0.8506741933524609} 12/25/2021 10:12:35 - INFO - codeparrot_training - Step 18598: {'lr': 0.0003494641599688613, 'samples': 9522688, 'steps': 18598, 'batch_loss/train': 1.0108198667876422} 12/25/2021 10:12:47 - INFO - codeparrot_training - Step 18599: {'lr': 0.00034944969792151994, 'samples': 9523200, 'steps': 18599, 'batch_loss/train': 0.5904375276295468} 12/25/2021 10:12:58 - INFO - codeparrot_training - Step 18600: {'lr': 0.000349435235478804, 'samples': 9523712, 'steps': 18600, 'batch_loss/train': 0.7485554807353765} 12/25/2021 10:13:09 - INFO - codeparrot_training - Step 18601: {'lr': 0.00034942077264077105, 'samples': 9524224, 'steps': 18601, 'batch_loss/train': 0.8213092996738851} 12/25/2021 10:13:22 - INFO - codeparrot_training - Step 18602: {'lr': 0.00034940630940747877, 'samples': 9524736, 'steps': 18602, 'batch_loss/train': 0.7653691875748336} 12/25/2021 10:13:33 - INFO - codeparrot_training - Step 18603: {'lr': 0.0003493918457789845, 'samples': 9525248, 'steps': 18603, 'batch_loss/train': 0.8208873784169555} 12/25/2021 10:13:44 - INFO - codeparrot_training - Step 18604: {'lr': 0.00034937738175534576, 'samples': 9525760, 'steps': 18604, 'batch_loss/train': 0.7988451765850186} 12/25/2021 10:13:54 - INFO - codeparrot_training - Step 18605: {'lr': 0.00034936291733662006, 'samples': 9526272, 'steps': 18605, 
'batch_loss/train': 0.7200880870223045} 12/25/2021 10:14:06 - INFO - codeparrot_training - Step 18606: {'lr': 0.00034934845252286497, 'samples': 9526784, 'steps': 18606, 'batch_loss/train': 0.7879826445132494} 12/25/2021 10:14:17 - INFO - codeparrot_training - Step 18607: {'lr': 0.0003493339873141379, 'samples': 9527296, 'steps': 18607, 'batch_loss/train': 0.8131896341219544} 12/25/2021 10:14:28 - INFO - codeparrot_training - Step 18608: {'lr': 0.00034931952171049653, 'samples': 9527808, 'steps': 18608, 'batch_loss/train': 0.7653361037373543} 12/25/2021 10:14:41 - INFO - codeparrot_training - Step 18609: {'lr': 0.0003493050557119981, 'samples': 9528320, 'steps': 18609, 'batch_loss/train': 0.7518735965713859} 12/25/2021 10:14:52 - INFO - codeparrot_training - Step 18610: {'lr': 0.00034929058931870027, 'samples': 9528832, 'steps': 18610, 'batch_loss/train': 0.7569465673295781} 12/25/2021 10:15:03 - INFO - codeparrot_training - Step 18611: {'lr': 0.0003492761225306606, 'samples': 9529344, 'steps': 18611, 'batch_loss/train': 0.7759028393775225} 12/25/2021 10:15:15 - INFO - codeparrot_training - Step 18612: {'lr': 0.0003492616553479365, 'samples': 9529856, 'steps': 18612, 'batch_loss/train': 0.726267644087784} 12/25/2021 10:15:25 - INFO - codeparrot_training - Step 18613: {'lr': 0.0003492471877705856, 'samples': 9530368, 'steps': 18613, 'batch_loss/train': 0.7023731676163152} 12/25/2021 10:15:36 - INFO - codeparrot_training - Step 18614: {'lr': 0.0003492327197986653, 'samples': 9530880, 'steps': 18614, 'batch_loss/train': 0.6843872576137073} 12/25/2021 10:15:46 - INFO - codeparrot_training - Step 18615: {'lr': 0.0003492182514322333, 'samples': 9531392, 'steps': 18615, 'batch_loss/train': 0.8530434686690569} 12/25/2021 10:15:59 - INFO - codeparrot_training - Step 18616: {'lr': 0.0003492037826713469, 'samples': 9531904, 'steps': 18616, 'batch_loss/train': 0.8461071541532874} 12/25/2021 10:16:09 - INFO - codeparrot_training - Step 18617: {'lr': 0.00034918931351606366, 
'samples': 9532416, 'steps': 18617, 'batch_loss/train': 0.765920854639262} 12/25/2021 10:16:20 - INFO - codeparrot_training - Step 18618: {'lr': 0.0003491748439664412, 'samples': 9532928, 'steps': 18618, 'batch_loss/train': 0.8062882698141038} 12/25/2021 10:16:33 - INFO - codeparrot_training - Step 18619: {'lr': 0.000349160374022537, 'samples': 9533440, 'steps': 18619, 'batch_loss/train': 0.7811072547920048} 12/25/2021 10:16:44 - INFO - codeparrot_training - Step 18620: {'lr': 0.0003491459036844087, 'samples': 9533952, 'steps': 18620, 'batch_loss/train': 0.8095726650208235} 12/25/2021 10:16:55 - INFO - codeparrot_training - Step 18621: {'lr': 0.00034913143295211364, 'samples': 9534464, 'steps': 18621, 'batch_loss/train': 0.6611928211059421} 12/25/2021 10:17:07 - INFO - codeparrot_training - Step 18622: {'lr': 0.0003491169618257094, 'samples': 9534976, 'steps': 18622, 'batch_loss/train': 0.8876396967098117} 12/25/2021 10:17:18 - INFO - codeparrot_training - Step 18623: {'lr': 0.0003491024903052536, 'samples': 9535488, 'steps': 18623, 'batch_loss/train': 0.5663747927173972} 12/25/2021 10:17:28 - INFO - codeparrot_training - Step 18624: {'lr': 0.00034908801839080375, 'samples': 9536000, 'steps': 18624, 'batch_loss/train': 0.7485396799165756} 12/25/2021 10:17:39 - INFO - codeparrot_training - Step 18625: {'lr': 0.0003490735460824173, 'samples': 9536512, 'steps': 18625, 'batch_loss/train': 0.8582814885303378} 12/25/2021 10:17:51 - INFO - codeparrot_training - Step 18626: {'lr': 0.0003490590733801518, 'samples': 9537024, 'steps': 18626, 'batch_loss/train': 0.8231639917939901} 12/25/2021 10:18:01 - INFO - codeparrot_training - Step 18627: {'lr': 0.0003490446002840648, 'samples': 9537536, 'steps': 18627, 'batch_loss/train': 0.7834791131317616} 12/25/2021 10:18:12 - INFO - codeparrot_training - Step 18628: {'lr': 0.00034903012679421403, 'samples': 9538048, 'steps': 18628, 'batch_loss/train': 0.7174890735186636} 12/25/2021 10:18:24 - INFO - codeparrot_training - Step 18629: 
{'lr': 0.00034901565291065674, 'samples': 9538560, 'steps': 18629, 'batch_loss/train': 0.8066465817391872} 12/25/2021 10:18:35 - INFO - codeparrot_training - Step 18630: {'lr': 0.00034900117863345063, 'samples': 9539072, 'steps': 18630, 'batch_loss/train': 0.7231382052414119} 12/25/2021 10:18:45 - INFO - codeparrot_training - Step 18631: {'lr': 0.0003489867039626533, 'samples': 9539584, 'steps': 18631, 'batch_loss/train': 0.8064300855621696} 12/25/2021 10:18:59 - INFO - codeparrot_training - Step 18632: {'lr': 0.0003489722288983221, 'samples': 9540096, 'steps': 18632, 'batch_loss/train': 0.6776224607019685} 12/25/2021 10:19:10 - INFO - codeparrot_training - Step 18633: {'lr': 0.0003489577534405147, 'samples': 9540608, 'steps': 18633, 'batch_loss/train': 0.7027961726416834} 12/25/2021 10:19:20 - INFO - codeparrot_training - Step 18634: {'lr': 0.0003489432775892887, 'samples': 9541120, 'steps': 18634, 'batch_loss/train': 0.8358309529721737} 12/25/2021 10:19:32 - INFO - codeparrot_training - Step 18635: {'lr': 0.0003489288013447015, 'samples': 9541632, 'steps': 18635, 'batch_loss/train': 0.7415444860234857} 12/25/2021 10:19:43 - INFO - codeparrot_training - Step 18636: {'lr': 0.00034891432470681084, 'samples': 9542144, 'steps': 18636, 'batch_loss/train': 0.7131045274436474} 12/25/2021 10:19:54 - INFO - codeparrot_training - Step 18637: {'lr': 0.0003488998476756741, 'samples': 9542656, 'steps': 18637, 'batch_loss/train': 0.7670680441660807} 12/25/2021 10:20:04 - INFO - codeparrot_training - Step 18638: {'lr': 0.0003488853702513489, 'samples': 9543168, 'steps': 18638, 'batch_loss/train': 0.7475365973077714} 12/25/2021 10:20:16 - INFO - codeparrot_training - Step 18639: {'lr': 0.00034887089243389283, 'samples': 9543680, 'steps': 18639, 'batch_loss/train': 0.8077807677909732} 12/25/2021 10:20:27 - INFO - codeparrot_training - Step 18640: {'lr': 0.00034885641422336344, 'samples': 9544192, 'steps': 18640, 'batch_loss/train': 0.8440766772255301} 12/25/2021 10:20:37 - INFO - 
codeparrot_training - Step 18641: {'lr': 0.00034884193561981824, 'samples': 9544704, 'steps': 18641, 'batch_loss/train': 0.8090447552967817} 12/25/2021 10:20:51 - INFO - codeparrot_training - Step 18642: {'lr': 0.00034882745662331486, 'samples': 9545216, 'steps': 18642, 'batch_loss/train': 0.7643302649958059} 12/25/2021 10:21:02 - INFO - codeparrot_training - Step 18643: {'lr': 0.0003488129772339108, 'samples': 9545728, 'steps': 18643, 'batch_loss/train': 0.6212476934306324} 12/25/2021 10:21:13 - INFO - codeparrot_training - Step 18644: {'lr': 0.0003487984974516637, 'samples': 9546240, 'steps': 18644, 'batch_loss/train': 0.8462937725707889} 12/25/2021 10:21:25 - INFO - codeparrot_training - Step 18645: {'lr': 0.000348784017276631, 'samples': 9546752, 'steps': 18645, 'batch_loss/train': 0.7389103671011981} 12/25/2021 10:21:35 - INFO - codeparrot_training - Step 18646: {'lr': 0.0003487695367088704, 'samples': 9547264, 'steps': 18646, 'batch_loss/train': 0.7120762299164198} 12/25/2021 10:21:46 - INFO - codeparrot_training - Step 18647: {'lr': 0.00034875505574843947, 'samples': 9547776, 'steps': 18647, 'batch_loss/train': 0.7278025085106492} 12/25/2021 10:21:56 - INFO - codeparrot_training - Step 18648: {'lr': 0.0003487405743953956, 'samples': 9548288, 'steps': 18648, 'batch_loss/train': 0.77752052876167} 12/25/2021 10:22:10 - INFO - codeparrot_training - Step 18649: {'lr': 0.0003487260926497965, 'samples': 9548800, 'steps': 18649, 'batch_loss/train': 0.7905545458197594} 12/25/2021 10:22:21 - INFO - codeparrot_training - Step 18650: {'lr': 0.0003487116105116999, 'samples': 9549312, 'steps': 18650, 'batch_loss/train': 0.6823003264144063} 12/25/2021 10:22:32 - INFO - codeparrot_training - Step 18651: {'lr': 0.0003486971279811631, 'samples': 9549824, 'steps': 18651, 'batch_loss/train': 0.7672174675390124} 12/25/2021 10:22:44 - INFO - codeparrot_training - Step 18652: {'lr': 0.00034868264505824375, 'samples': 9550336, 'steps': 18652, 'batch_loss/train': 0.6928790817037225} 
12/25/2021 10:22:54 - INFO - codeparrot_training - Step 18653: {'lr': 0.00034866816174299953, 'samples': 9550848, 'steps': 18653, 'batch_loss/train': 0.8363316738978028} 12/25/2021 10:23:05 - INFO - codeparrot_training - Step 18654: {'lr': 0.000348653678035488, 'samples': 9551360, 'steps': 18654, 'batch_loss/train': 0.7065886575728655} 12/25/2021 10:23:17 - INFO - codeparrot_training - Step 18655: {'lr': 0.00034863919393576646, 'samples': 9551872, 'steps': 18655, 'batch_loss/train': 0.6785203283652663} 12/25/2021 10:23:28 - INFO - codeparrot_training - Step 18656: {'lr': 0.00034862470944389297, 'samples': 9552384, 'steps': 18656, 'batch_loss/train': 0.6507465643808246} 12/25/2021 10:23:38 - INFO - codeparrot_training - Step 18657: {'lr': 0.0003486102245599248, 'samples': 9552896, 'steps': 18657, 'batch_loss/train': 0.8141964860260487} 12/25/2021 10:23:49 - INFO - codeparrot_training - Step 18658: {'lr': 0.00034859573928391956, 'samples': 9553408, 'steps': 18658, 'batch_loss/train': 0.777099933475256} 12/25/2021 10:24:01 - INFO - codeparrot_training - Step 18659: {'lr': 0.00034858125361593494, 'samples': 9553920, 'steps': 18659, 'batch_loss/train': 0.8016505837440491} 12/25/2021 10:24:12 - INFO - codeparrot_training - Step 18660: {'lr': 0.0003485667675560285, 'samples': 9554432, 'steps': 18660, 'batch_loss/train': 0.7704608412459493} 12/25/2021 10:24:22 - INFO - codeparrot_training - Step 18661: {'lr': 0.0003485522811042578, 'samples': 9554944, 'steps': 18661, 'batch_loss/train': 0.7517595207318664} 12/25/2021 10:24:36 - INFO - codeparrot_training - Step 18662: {'lr': 0.0003485377942606804, 'samples': 9555456, 'steps': 18662, 'batch_loss/train': 0.796337288338691} 12/25/2021 10:24:47 - INFO - codeparrot_training - Step 18663: {'lr': 0.000348523307025354, 'samples': 9555968, 'steps': 18663, 'batch_loss/train': 0.7442475100979209} 12/25/2021 10:24:58 - INFO - codeparrot_training - Step 18664: {'lr': 0.00034850881939833605, 'samples': 9556480, 'steps': 18664, 
'batch_loss/train': 0.5027170343091711} 12/25/2021 10:25:10 - INFO - codeparrot_training - Step 18665: {'lr': 0.0003484943313796843, 'samples': 9556992, 'steps': 18665, 'batch_loss/train': 0.8233391344547272} 12/25/2021 10:25:20 - INFO - codeparrot_training - Step 18666: {'lr': 0.00034847984296945624, 'samples': 9557504, 'steps': 18666, 'batch_loss/train': 0.8339140948373824} 12/25/2021 10:25:31 - INFO - codeparrot_training - Step 18667: {'lr': 0.00034846535416770954, 'samples': 9558016, 'steps': 18667, 'batch_loss/train': 0.7640705578960478} 12/25/2021 10:25:43 - INFO - codeparrot_training - Step 18668: {'lr': 0.0003484508649745017, 'samples': 9558528, 'steps': 18668, 'batch_loss/train': 0.7281005056574941} 12/25/2021 10:25:53 - INFO - codeparrot_training - Step 18669: {'lr': 0.0003484363753898905, 'samples': 9559040, 'steps': 18669, 'batch_loss/train': 0.8641115988139063} 12/25/2021 10:26:04 - INFO - codeparrot_training - Step 18670: {'lr': 0.0003484218854139334, 'samples': 9559552, 'steps': 18670, 'batch_loss/train': 0.7717179246246815} 12/25/2021 10:26:15 - INFO - codeparrot_training - Step 18671: {'lr': 0.0003484073950466879, 'samples': 9560064, 'steps': 18671, 'batch_loss/train': 0.7619197880849242} 12/25/2021 10:26:29 - INFO - codeparrot_training - Step 18672: {'lr': 0.00034839290428821196, 'samples': 9560576, 'steps': 18672, 'batch_loss/train': 0.7263971352949739} 12/25/2021 10:26:39 - INFO - codeparrot_training - Step 18673: {'lr': 0.0003483784131385629, 'samples': 9561088, 'steps': 18673, 'batch_loss/train': 0.7839862391119823} 12/25/2021 10:26:50 - INFO - codeparrot_training - Step 18674: {'lr': 0.00034836392159779837, 'samples': 9561600, 'steps': 18674, 'batch_loss/train': 1.3158676157472655} 12/25/2021 10:27:02 - INFO - codeparrot_training - Step 18675: {'lr': 0.0003483494296659761, 'samples': 9562112, 'steps': 18675, 'batch_loss/train': 0.745675160549581} 12/25/2021 10:27:13 - INFO - codeparrot_training - Step 18676: {'lr': 0.00034833493734315367, 
'samples': 9562624, 'steps': 18676, 'batch_loss/train': 0.7530901348218322} 12/25/2021 10:27:23 - INFO - codeparrot_training - Step 18677: {'lr': 0.0003483204446293886, 'samples': 9563136, 'steps': 18677, 'batch_loss/train': 0.6786763614509255} 12/25/2021 10:27:35 - INFO - codeparrot_training - Step 18678: {'lr': 0.00034830595152473854, 'samples': 9563648, 'steps': 18678, 'batch_loss/train': 0.7740502739325166} 12/25/2021 10:27:46 - INFO - codeparrot_training - Step 18679: {'lr': 0.0003482914580292612, 'samples': 9564160, 'steps': 18679, 'batch_loss/train': 0.7938507776707411} 12/25/2021 10:27:57 - INFO - codeparrot_training - Step 18680: {'lr': 0.00034827696414301413, 'samples': 9564672, 'steps': 18680, 'batch_loss/train': 0.7347599156200886} 12/25/2021 10:28:07 - INFO - codeparrot_training - Step 18681: {'lr': 0.00034826246986605487, 'samples': 9565184, 'steps': 18681, 'batch_loss/train': 0.7263248916715384} 12/25/2021 10:28:21 - INFO - codeparrot_training - Step 18682: {'lr': 0.00034824797519844125, 'samples': 9565696, 'steps': 18682, 'batch_loss/train': 0.7270228625275195} 12/25/2021 10:28:32 - INFO - codeparrot_training - Step 18683: {'lr': 0.0003482334801402307, 'samples': 9566208, 'steps': 18683, 'batch_loss/train': 0.817718019708991} 12/25/2021 10:28:42 - INFO - codeparrot_training - Step 18684: {'lr': 0.00034821898469148093, 'samples': 9566720, 'steps': 18684, 'batch_loss/train': 0.7201154314680025} 12/25/2021 10:28:55 - INFO - codeparrot_training - Step 18685: {'lr': 0.0003482044888522496, 'samples': 9567232, 'steps': 18685, 'batch_loss/train': 0.7293989164754748} 12/25/2021 10:29:05 - INFO - codeparrot_training - Step 18686: {'lr': 0.00034818999262259425, 'samples': 9567744, 'steps': 18686, 'batch_loss/train': 0.7897984026931226} 12/25/2021 10:29:16 - INFO - codeparrot_training - Step 18687: {'lr': 0.0003481754960025726, 'samples': 9568256, 'steps': 18687, 'batch_loss/train': 0.8176657439325936} 12/25/2021 10:29:29 - INFO - codeparrot_training - Step 
18688: {'lr': 0.00034816099899224226, 'samples': 9568768, 'steps': 18688, 'batch_loss/train': 0.8803627463057637} 12/25/2021 10:29:40 - INFO - codeparrot_training - Step 18689: {'lr': 0.00034814650159166086, 'samples': 9569280, 'steps': 18689, 'batch_loss/train': 0.6661311839707196} 12/25/2021 10:29:50 - INFO - codeparrot_training - Step 18690: {'lr': 0.000348132003800886, 'samples': 9569792, 'steps': 18690, 'batch_loss/train': 0.8113193642348051} 12/25/2021 10:30:02 - INFO - codeparrot_training - Step 18691: {'lr': 0.00034811750561997536, 'samples': 9570304, 'steps': 18691, 'batch_loss/train': 0.8312365664169192} 12/25/2021 10:30:13 - INFO - codeparrot_training - Step 18692: {'lr': 0.0003481030070489865, 'samples': 9570816, 'steps': 18692, 'batch_loss/train': 0.7124893586151302} 12/25/2021 10:30:24 - INFO - codeparrot_training - Step 18693: {'lr': 0.00034808850808797727, 'samples': 9571328, 'steps': 18693, 'batch_loss/train': 0.853909308090806} 12/25/2021 10:30:34 - INFO - codeparrot_training - Step 18694: {'lr': 0.000348074008737005, 'samples': 9571840, 'steps': 18694, 'batch_loss/train': 0.7388953445479274} 12/25/2021 10:30:46 - INFO - codeparrot_training - Step 18695: {'lr': 0.00034805950899612764, 'samples': 9572352, 'steps': 18695, 'batch_loss/train': 0.9521759524941444} 12/25/2021 10:30:57 - INFO - codeparrot_training - Step 18696: {'lr': 0.0003480450088654026, 'samples': 9572864, 'steps': 18696, 'batch_loss/train': 0.8540058457292616} 12/25/2021 10:31:07 - INFO - codeparrot_training - Step 18697: {'lr': 0.0003480305083448877, 'samples': 9573376, 'steps': 18697, 'batch_loss/train': 0.7991096121259034} 12/25/2021 10:31:20 - INFO - codeparrot_training - Step 18698: {'lr': 0.00034801600743464045, 'samples': 9573888, 'steps': 18698, 'batch_loss/train': 0.7955236975103617} 12/25/2021 10:31:31 - INFO - codeparrot_training - Step 18699: {'lr': 0.00034800150613471857, 'samples': 9574400, 'steps': 18699, 'batch_loss/train': 0.6442947941832244} 12/25/2021 10:31:41 - 
INFO - codeparrot_training - Step 18700: {'lr': 0.0003479870044451797, 'samples': 9574912, 'steps': 18700, 'batch_loss/train': 0.7580149350687861} 12/25/2021 10:31:55 - INFO - codeparrot_training - Step 18701: {'lr': 0.00034797250236608156, 'samples': 9575424, 'steps': 18701, 'batch_loss/train': 0.7611505603417754} 12/25/2021 10:32:06 - INFO - codeparrot_training - Step 18702: {'lr': 0.0003479579998974818, 'samples': 9575936, 'steps': 18702, 'batch_loss/train': 0.7558206552639604} 12/25/2021 10:32:16 - INFO - codeparrot_training - Step 18703: {'lr': 0.0003479434970394379, 'samples': 9576448, 'steps': 18703, 'batch_loss/train': 0.7497061677277088} 12/25/2021 10:32:27 - INFO - codeparrot_training - Step 18704: {'lr': 0.0003479289937920077, 'samples': 9576960, 'steps': 18704, 'batch_loss/train': 0.8461113478988409} 12/25/2021 10:32:39 - INFO - codeparrot_training - Step 18705: {'lr': 0.00034791449015524875, 'samples': 9577472, 'steps': 18705, 'batch_loss/train': 0.6700677805347368} 12/25/2021 10:32:49 - INFO - codeparrot_training - Step 18706: {'lr': 0.0003478999861292188, 'samples': 9577984, 'steps': 18706, 'batch_loss/train': 1.0453611360862851} 12/25/2021 10:33:00 - INFO - codeparrot_training - Step 18707: {'lr': 0.0003478854817139754, 'samples': 9578496, 'steps': 18707, 'batch_loss/train': 0.8400431601330638} 12/25/2021 10:33:12 - INFO - codeparrot_training - Step 18708: {'lr': 0.00034787097690957636, 'samples': 9579008, 'steps': 18708, 'batch_loss/train': 0.897584080696106} 12/25/2021 10:33:23 - INFO - codeparrot_training - Step 18709: {'lr': 0.0003478564717160792, 'samples': 9579520, 'steps': 18709, 'batch_loss/train': 0.7940082754939795} 12/25/2021 10:33:33 - INFO - codeparrot_training - Step 18710: {'lr': 0.0003478419661335417, 'samples': 9580032, 'steps': 18710, 'batch_loss/train': 0.753147232811898} 12/25/2021 10:33:47 - INFO - codeparrot_training - Step 18711: {'lr': 0.0003478274601620215, 'samples': 9580544, 'steps': 18711, 'batch_loss/train': 
0.7954136929474771} 12/25/2021 10:33:58 - INFO - codeparrot_training - Step 18712: {'lr': 0.00034781295380157626, 'samples': 9581056, 'steps': 18712, 'batch_loss/train': 0.7055199826136231} 12/25/2021 10:34:08 - INFO - codeparrot_training - Step 18713: {'lr': 0.0003477984470522636, 'samples': 9581568, 'steps': 18713, 'batch_loss/train': 0.7991477157920599} 12/25/2021 10:34:19 - INFO - codeparrot_training - Step 18714: {'lr': 0.00034778393991414126, 'samples': 9582080, 'steps': 18714, 'batch_loss/train': 0.7830173936672509} 12/25/2021 10:34:31 - INFO - codeparrot_training - Step 18715: {'lr': 0.00034776943238726694, 'samples': 9582592, 'steps': 18715, 'batch_loss/train': 0.8153009442612529} 12/25/2021 10:34:41 - INFO - codeparrot_training - Step 18716: {'lr': 0.00034775492447169823, 'samples': 9583104, 'steps': 18716, 'batch_loss/train': 0.7761154621839523} 12/25/2021 10:34:52 - INFO - codeparrot_training - Step 18717: {'lr': 0.00034774041616749286, 'samples': 9583616, 'steps': 18717, 'batch_loss/train': 0.6402372729498893} 12/25/2021 10:35:06 - INFO - codeparrot_training - Step 18718: {'lr': 0.0003477259074747085, 'samples': 9584128, 'steps': 18718, 'batch_loss/train': 0.8047755612060428} 12/25/2021 10:35:16 - INFO - codeparrot_training - Step 18719: {'lr': 0.0003477113983934028, 'samples': 9584640, 'steps': 18719, 'batch_loss/train': 0.8303862875327468} 12/25/2021 10:35:27 - INFO - codeparrot_training - Step 18720: {'lr': 0.00034769688892363347, 'samples': 9585152, 'steps': 18720, 'batch_loss/train': 0.7318877140060067} 12/25/2021 10:35:39 - INFO - codeparrot_training - Step 18721: {'lr': 0.00034768237906545837, 'samples': 9585664, 'steps': 18721, 'batch_loss/train': 0.704299733042717} 12/25/2021 10:35:49 - INFO - codeparrot_training - Step 18722: {'lr': 0.00034766786881893484, 'samples': 9586176, 'steps': 18722, 'batch_loss/train': 0.8184508658014238} 12/25/2021 10:36:00 - INFO - codeparrot_training - Step 18723: {'lr': 0.00034765335818412067, 'samples': 9586688, 
'steps': 18723, 'batch_loss/train': 0.7311738121206872} 12/25/2021 10:36:12 - INFO - codeparrot_training - Step 18724: {'lr': 0.00034763884716107387, 'samples': 9587200, 'steps': 18724, 'batch_loss/train': 0.6833886122331023} 12/25/2021 10:36:23 - INFO - codeparrot_training - Step 18725: {'lr': 0.0003476243357498517, 'samples': 9587712, 'steps': 18725, 'batch_loss/train': 0.8306498778983951} 12/25/2021 10:36:33 - INFO - codeparrot_training - Step 18726: {'lr': 0.0003476098239505121, 'samples': 9588224, 'steps': 18726, 'batch_loss/train': 0.7596334731206298} 12/25/2021 10:36:44 - INFO - codeparrot_training - Step 18727: {'lr': 0.00034759531176311264, 'samples': 9588736, 'steps': 18727, 'batch_loss/train': 0.6957074459642172} 12/25/2021 10:36:58 - INFO - codeparrot_training - Step 18728: {'lr': 0.00034758079918771113, 'samples': 9589248, 'steps': 18728, 'batch_loss/train': 0.7442321057897061} 12/25/2021 10:37:09 - INFO - codeparrot_training - Step 18729: {'lr': 0.0003475662862243652, 'samples': 9589760, 'steps': 18729, 'batch_loss/train': 0.7706875158473849} 12/25/2021 10:37:19 - INFO - codeparrot_training - Step 18730: {'lr': 0.00034755177287313255, 'samples': 9590272, 'steps': 18730, 'batch_loss/train': 0.8853028994053602} 12/25/2021 10:37:31 - INFO - codeparrot_training - Step 18731: {'lr': 0.00034753725913407084, 'samples': 9590784, 'steps': 18731, 'batch_loss/train': 0.6702821079525165} 12/25/2021 10:37:42 - INFO - codeparrot_training - Step 18732: {'lr': 0.0003475227450072378, 'samples': 9591296, 'steps': 18732, 'batch_loss/train': 0.7983731357380748} 12/25/2021 10:37:53 - INFO - codeparrot_training - Step 18733: {'lr': 0.0003475082304926912, 'samples': 9591808, 'steps': 18733, 'batch_loss/train': 0.8046897621825337} 12/25/2021 10:38:05 - INFO - codeparrot_training - Step 18734: {'lr': 0.00034749371559048876, 'samples': 9592320, 'steps': 18734, 'batch_loss/train': 0.8778349682688713} 12/25/2021 10:38:16 - INFO - codeparrot_training - Step 18735: {'lr': 
0.000347479200300688, 'samples': 9592832, 'steps': 18735, 'batch_loss/train': 0.7403411446139216} 12/25/2021 10:38:26 - INFO - codeparrot_training - Step 18736: {'lr': 0.0003474646846233468, 'samples': 9593344, 'steps': 18736, 'batch_loss/train': 0.7462278003804386} 12/25/2021 10:38:39 - INFO - codeparrot_training - Step 18737: {'lr': 0.00034745016855852284, 'samples': 9593856, 'steps': 18737, 'batch_loss/train': 0.5788874074351043} 12/25/2021 10:38:49 - INFO - codeparrot_training - Step 18738: {'lr': 0.00034743565210627376, 'samples': 9594368, 'steps': 18738, 'batch_loss/train': 0.7379473452456295} 12/25/2021 10:39:00 - INFO - codeparrot_training - Step 18739: {'lr': 0.00034742113526665735, 'samples': 9594880, 'steps': 18739, 'batch_loss/train': 0.780501761008054} 12/25/2021 10:39:10 - INFO - codeparrot_training - Step 18740: {'lr': 0.0003474066180397313, 'samples': 9595392, 'steps': 18740, 'batch_loss/train': 0.7628365489654243} 12/25/2021 10:39:24 - INFO - codeparrot_training - Step 18741: {'lr': 0.0003473921004255533, 'samples': 9595904, 'steps': 18741, 'batch_loss/train': 0.7311419873731211} 12/25/2021 10:39:35 - INFO - codeparrot_training - Step 18742: {'lr': 0.000347377582424181, 'samples': 9596416, 'steps': 18742, 'batch_loss/train': 0.8206292571267113} 12/25/2021 10:39:46 - INFO - codeparrot_training - Step 18743: {'lr': 0.00034736306403567233, 'samples': 9596928, 'steps': 18743, 'batch_loss/train': 0.8102513616904616} 12/25/2021 10:39:58 - INFO - codeparrot_training - Step 18744: {'lr': 0.0003473485452600848, 'samples': 9597440, 'steps': 18744, 'batch_loss/train': 0.6988061685115099} 12/25/2021 10:40:08 - INFO - codeparrot_training - Step 18745: {'lr': 0.0003473340260974762, 'samples': 9597952, 'steps': 18745, 'batch_loss/train': 0.7128123175352812} 12/25/2021 10:40:19 - INFO - codeparrot_training - Step 18746: {'lr': 0.00034731950654790433, 'samples': 9598464, 'steps': 18746, 'batch_loss/train': 0.6825957037508488} 12/25/2021 10:40:29 - INFO - 
codeparrot_training - Step 18747: {'lr': 0.00034730498661142683, 'samples': 9598976, 'steps': 18747, 'batch_loss/train': 0.7418289333581924} 12/25/2021 10:40:43 - INFO - codeparrot_training - Step 18748: {'lr': 0.00034729046628810144, 'samples': 9599488, 'steps': 18748, 'batch_loss/train': 0.8229359183460474} 12/25/2021 10:40:54 - INFO - codeparrot_training - Step 18749: {'lr': 0.00034727594557798584, 'samples': 9600000, 'steps': 18749, 'batch_loss/train': 0.7019653972238302} 12/25/2021 10:41:04 - INFO - codeparrot_training - Step 18750: {'lr': 0.0003472614244811379, 'samples': 9600512, 'steps': 18750, 'batch_loss/train': 0.7596914919558913} 12/25/2021 10:41:16 - INFO - codeparrot_training - Step 18751: {'lr': 0.00034724690299761527, 'samples': 9601024, 'steps': 18751, 'batch_loss/train': 0.7622581920586526} 12/25/2021 10:41:27 - INFO - codeparrot_training - Step 18752: {'lr': 0.00034723238112747555, 'samples': 9601536, 'steps': 18752, 'batch_loss/train': 0.7385304244235158} 12/25/2021 10:41:38 - INFO - codeparrot_training - Step 18753: {'lr': 0.00034721785887077667, 'samples': 9602048, 'steps': 18753, 'batch_loss/train': 0.7778456876985729} 12/25/2021 10:41:50 - INFO - codeparrot_training - Step 18754: {'lr': 0.0003472033362275763, 'samples': 9602560, 'steps': 18754, 'batch_loss/train': 0.7016493175178766} 12/25/2021 10:42:00 - INFO - codeparrot_training - Step 18755: {'lr': 0.0003471888131979322, 'samples': 9603072, 'steps': 18755, 'batch_loss/train': 0.7589031711686403} 12/25/2021 10:42:11 - INFO - codeparrot_training - Step 18756: {'lr': 0.00034717428978190204, 'samples': 9603584, 'steps': 18756, 'batch_loss/train': 0.784433921566233} 12/25/2021 10:42:25 - INFO - codeparrot_training - Step 18757: {'lr': 0.00034715976597954356, 'samples': 9604096, 'steps': 18757, 'batch_loss/train': 0.8479606723412871} 12/25/2021 10:42:35 - INFO - codeparrot_training - Step 18758: {'lr': 0.00034714524179091453, 'samples': 9604608, 'steps': 18758, 'batch_loss/train': 
0.8120886189863086} 12/25/2021 10:42:46 - INFO - codeparrot_training - Step 18759: {'lr': 0.0003471307172160727, 'samples': 9605120, 'steps': 18759, 'batch_loss/train': 0.6650810228893533} 12/25/2021 10:42:57 - INFO - codeparrot_training - Step 18760: {'lr': 0.00034711619225507586, 'samples': 9605632, 'steps': 18760, 'batch_loss/train': 0.8140001338906586} 12/25/2021 10:43:09 - INFO - codeparrot_training - Step 18761: {'lr': 0.00034710166690798174, 'samples': 9606144, 'steps': 18761, 'batch_loss/train': 0.7260458143427968} 12/25/2021 10:43:19 - INFO - codeparrot_training - Step 18762: {'lr': 0.00034708714117484795, 'samples': 9606656, 'steps': 18762, 'batch_loss/train': 1.5299993236549199} 12/25/2021 10:43:30 - INFO - codeparrot_training - Step 18763: {'lr': 0.00034707261505573245, 'samples': 9607168, 'steps': 18763, 'batch_loss/train': 0.7201629905030131} 12/25/2021 10:43:44 - INFO - codeparrot_training - Step 18764: {'lr': 0.00034705808855069284, 'samples': 9607680, 'steps': 18764, 'batch_loss/train': 0.8743476802483201} 12/25/2021 10:43:55 - INFO - codeparrot_training - Step 18765: {'lr': 0.0003470435616597869, 'samples': 9608192, 'steps': 18765, 'batch_loss/train': 0.7962455209344625} 12/25/2021 10:44:05 - INFO - codeparrot_training - Step 18766: {'lr': 0.00034702903438307245, 'samples': 9608704, 'steps': 18766, 'batch_loss/train': 0.8975196918472648} 12/25/2021 10:44:17 - INFO - codeparrot_training - Step 18767: {'lr': 0.0003470145067206072, 'samples': 9609216, 'steps': 18767, 'batch_loss/train': 0.7739746675360948} 12/25/2021 10:44:28 - INFO - codeparrot_training - Step 18768: {'lr': 0.0003469999786724488, 'samples': 9609728, 'steps': 18768, 'batch_loss/train': 0.8507387340068817} 12/25/2021 10:44:38 - INFO - codeparrot_training - Step 18769: {'lr': 0.00034698545023865524, 'samples': 9610240, 'steps': 18769, 'batch_loss/train': 0.7809443827718496} 12/25/2021 10:44:50 - INFO - codeparrot_training - Step 18770: {'lr': 0.0003469709214192841, 'samples': 9610752, 
'steps': 18770, 'batch_loss/train': 0.7147701354697347} 12/25/2021 10:45:01 - INFO - codeparrot_training - Step 18771: {'lr': 0.0003469563922143931, 'samples': 9611264, 'steps': 18771, 'batch_loss/train': 0.8855693768709898} 12/25/2021 10:45:12 - INFO - codeparrot_training - Step 18772: {'lr': 0.0003469418626240402, 'samples': 9611776, 'steps': 18772, 'batch_loss/train': 0.9029129752889276} 12/25/2021 10:45:22 - INFO - codeparrot_training - Step 18773: {'lr': 0.000346927332648283, 'samples': 9612288, 'steps': 18773, 'batch_loss/train': 0.7599883126094937} 12/25/2021 10:45:35 - INFO - codeparrot_training - Step 18774: {'lr': 0.0003469128022871794, 'samples': 9612800, 'steps': 18774, 'batch_loss/train': 0.7819186518900096} 12/25/2021 10:45:46 - INFO - codeparrot_training - Step 18775: {'lr': 0.00034689827154078703, 'samples': 9613312, 'steps': 18775, 'batch_loss/train': 0.5636951606720686} 12/25/2021 10:45:56 - INFO - codeparrot_training - Step 18776: {'lr': 0.00034688374040916375, 'samples': 9613824, 'steps': 18776, 'batch_loss/train': 0.8043180657550693} 12/25/2021 10:46:07 - INFO - codeparrot_training - Step 18777: {'lr': 0.0003468692088923673, 'samples': 9614336, 'steps': 18777, 'batch_loss/train': 1.0767905563116074} 12/25/2021 10:46:21 - INFO - codeparrot_training - Step 18778: {'lr': 0.0003468546769904554, 'samples': 9614848, 'steps': 18778, 'batch_loss/train': 0.8407108687097207} 12/25/2021 10:46:31 - INFO - codeparrot_training - Step 18779: {'lr': 0.0003468401447034859, 'samples': 9615360, 'steps': 18779, 'batch_loss/train': 0.7571865865029395} 12/25/2021 10:46:42 - INFO - codeparrot_training - Step 18780: {'lr': 0.00034682561203151665, 'samples': 9615872, 'steps': 18780, 'batch_loss/train': 0.8277892023324966} 12/25/2021 10:46:54 - INFO - codeparrot_training - Step 18781: {'lr': 0.00034681107897460516, 'samples': 9616384, 'steps': 18781, 'batch_loss/train': 0.821262652054429} 12/25/2021 10:47:05 - INFO - codeparrot_training - Step 18782: {'lr': 
0.0003467965455328095, 'samples': 9616896, 'steps': 18782, 'batch_loss/train': 0.8084032190963626} 12/25/2021 10:47:15 - INFO - codeparrot_training - Step 18783: {'lr': 0.0003467820117061873, 'samples': 9617408, 'steps': 18783, 'batch_loss/train': 0.8234917987138033} 12/25/2021 10:47:27 - INFO - codeparrot_training - Step 18784: {'lr': 0.0003467674774947963, 'samples': 9617920, 'steps': 18784, 'batch_loss/train': 0.775075500831008} 12/25/2021 10:47:38 - INFO - codeparrot_training - Step 18785: {'lr': 0.00034675294289869443, 'samples': 9618432, 'steps': 18785, 'batch_loss/train': 0.8120484426617622} 12/25/2021 10:47:49 - INFO - codeparrot_training - Step 18786: {'lr': 0.0003467384079179393, 'samples': 9618944, 'steps': 18786, 'batch_loss/train': 0.7900130739435554} 12/25/2021 10:47:59 - INFO - codeparrot_training - Step 18787: {'lr': 0.00034672387255258884, 'samples': 9619456, 'steps': 18787, 'batch_loss/train': 2.1022124723531306} 12/25/2021 10:48:13 - INFO - codeparrot_training - Step 18788: {'lr': 0.0003467093368027008, 'samples': 9619968, 'steps': 18788, 'batch_loss/train': 0.8401096938177943} 12/25/2021 10:48:24 - INFO - codeparrot_training - Step 18789: {'lr': 0.000346694800668333, 'samples': 9620480, 'steps': 18789, 'batch_loss/train': 0.724070455878973} 12/25/2021 10:48:34 - INFO - codeparrot_training - Step 18790: {'lr': 0.0003466802641495431, 'samples': 9620992, 'steps': 18790, 'batch_loss/train': 0.6483173528104089} 12/25/2021 10:48:46 - INFO - codeparrot_training - Step 18791: {'lr': 0.00034666572724638894, 'samples': 9621504, 'steps': 18791, 'batch_loss/train': 0.7087461054325104} 12/25/2021 10:48:57 - INFO - codeparrot_training - Step 18792: {'lr': 0.00034665118995892843, 'samples': 9622016, 'steps': 18792, 'batch_loss/train': 0.8043855624273419} 12/25/2021 10:49:08 - INFO - codeparrot_training - Step 18793: {'lr': 0.0003466366522872193, 'samples': 9622528, 'steps': 18793, 'batch_loss/train': 0.7762519866228104} 12/25/2021 10:49:22 - INFO - 
codeparrot_training - Step 18794: {'lr': 0.0003466221142313192, 'samples': 9623040, 'steps': 18794, 'batch_loss/train': 0.8207670822739601} 12/25/2021 10:49:32 - INFO - codeparrot_training - Step 18795: {'lr': 0.0003466075757912862, 'samples': 9623552, 'steps': 18795, 'batch_loss/train': 0.797900921665132} 12/25/2021 10:49:43 - INFO - codeparrot_training - Step 18796: {'lr': 0.00034659303696717793, 'samples': 9624064, 'steps': 18796, 'batch_loss/train': 0.7571481531485915} 12/25/2021 10:49:53 - INFO - codeparrot_training - Step 18797: {'lr': 0.00034657849775905213, 'samples': 9624576, 'steps': 18797, 'batch_loss/train': 0.7720773853361607} 12/25/2021 10:50:05 - INFO - codeparrot_training - Step 18798: {'lr': 0.00034656395816696676, 'samples': 9625088, 'steps': 18798, 'batch_loss/train': 0.7241190960630774} 12/25/2021 10:50:16 - INFO - codeparrot_training - Step 18799: {'lr': 0.0003465494181909795, 'samples': 9625600, 'steps': 18799, 'batch_loss/train': 0.7626742050051689} 12/25/2021 10:50:27 - INFO - codeparrot_training - Step 18800: {'lr': 0.0003465348778311482, 'samples': 9626112, 'steps': 18800, 'batch_loss/train': 0.8596837231889367} 12/25/2021 10:50:39 - INFO - codeparrot_training - Step 18801: {'lr': 0.0003465203370875308, 'samples': 9626624, 'steps': 18801, 'batch_loss/train': 0.7621482284739614} 12/25/2021 10:50:49 - INFO - codeparrot_training - Step 18802: {'lr': 0.00034650579596018496, 'samples': 9627136, 'steps': 18802, 'batch_loss/train': 0.8153364532627165} 12/25/2021 10:51:00 - INFO - codeparrot_training - Step 18803: {'lr': 0.00034649125444916843, 'samples': 9627648, 'steps': 18803, 'batch_loss/train': 0.7630178742110729} 12/25/2021 10:51:14 - INFO - codeparrot_training - Step 18804: {'lr': 0.0003464767125545391, 'samples': 9628160, 'steps': 18804, 'batch_loss/train': 0.6906400633743033} 12/25/2021 10:51:24 - INFO - codeparrot_training - Step 18805: {'lr': 0.00034646217027635483, 'samples': 9628672, 'steps': 18805, 'batch_loss/train': 
0.7963052913546562} 12/25/2021 10:51:35 - INFO - codeparrot_training - Step 18806: {'lr': 0.0003464476276146734, 'samples': 9629184, 'steps': 18806, 'batch_loss/train': 0.6719077780726366} 12/25/2021 10:51:46 - INFO - codeparrot_training - Step 18807: {'lr': 0.0003464330845695526, 'samples': 9629696, 'steps': 18807, 'batch_loss/train': 0.8173222951591015} 12/25/2021 10:51:58 - INFO - codeparrot_training - Step 18808: {'lr': 0.0003464185411410503, 'samples': 9630208, 'steps': 18808, 'batch_loss/train': 0.7819103789515793} 12/25/2021 10:52:09 - INFO - codeparrot_training - Step 18809: {'lr': 0.00034640399732922426, 'samples': 9630720, 'steps': 18809, 'batch_loss/train': 0.8944466034881771} 12/25/2021 10:52:19 - INFO - codeparrot_training - Step 18810: {'lr': 0.00034638945313413233, 'samples': 9631232, 'steps': 18810, 'batch_loss/train': 0.6137861517490819} 12/25/2021 10:52:31 - INFO - codeparrot_training - Step 18811: {'lr': 0.00034637490855583234, 'samples': 9631744, 'steps': 18811, 'batch_loss/train': 0.8185211368836462} 12/25/2021 10:52:42 - INFO - codeparrot_training - Step 18812: {'lr': 0.00034636036359438207, 'samples': 9632256, 'steps': 18812, 'batch_loss/train': 0.6823826071922667} 12/25/2021 10:52:52 - INFO - codeparrot_training - Step 18813: {'lr': 0.00034634581824983943, 'samples': 9632768, 'steps': 18813, 'batch_loss/train': 0.884008752182126} 12/25/2021 10:53:06 - INFO - codeparrot_training - Step 18814: {'lr': 0.00034633127252226215, 'samples': 9633280, 'steps': 18814, 'batch_loss/train': 0.7895805900916457} 12/25/2021 10:53:17 - INFO - codeparrot_training - Step 18815: {'lr': 0.0003463167264117081, 'samples': 9633792, 'steps': 18815, 'batch_loss/train': 0.7545284717343748} 12/25/2021 10:53:27 - INFO - codeparrot_training - Step 18816: {'lr': 0.0003463021799182352, 'samples': 9634304, 'steps': 18816, 'batch_loss/train': 0.8222132306545973} 12/25/2021 10:53:39 - INFO - codeparrot_training - Step 18817: {'lr': 0.0003462876330419011, 'samples': 9634816, 
'steps': 18817, 'batch_loss/train': 0.6460932848858647} 12/25/2021 10:53:50 - INFO - codeparrot_training - Step 18818: {'lr': 0.00034627308578276373, 'samples': 9635328, 'steps': 18818, 'batch_loss/train': 0.7320482926443219} 12/25/2021 10:54:01 - INFO - codeparrot_training - Step 18819: {'lr': 0.00034625853814088104, 'samples': 9635840, 'steps': 18819, 'batch_loss/train': 0.8835523631423712} 12/25/2021 10:54:11 - INFO - codeparrot_training - Step 18820: {'lr': 0.0003462439901163106, 'samples': 9636352, 'steps': 18820, 'batch_loss/train': 0.7598060574382544} 12/25/2021 10:54:23 - INFO - codeparrot_training - Step 18821: {'lr': 0.0003462294417091104, 'samples': 9636864, 'steps': 18821, 'batch_loss/train': 0.6735816053114831} 12/25/2021 10:54:34 - INFO - codeparrot_training - Step 18822: {'lr': 0.00034621489291933837, 'samples': 9637376, 'steps': 18822, 'batch_loss/train': 0.7666902532801032} 12/25/2021 10:54:44 - INFO - codeparrot_training - Step 18823: {'lr': 0.00034620034374705216, 'samples': 9637888, 'steps': 18823, 'batch_loss/train': 0.7374418021645397} 12/25/2021 10:54:56 - INFO - codeparrot_training - Step 18824: {'lr': 0.00034618579419230984, 'samples': 9638400, 'steps': 18824, 'batch_loss/train': 0.7792266244068742} 12/25/2021 10:55:07 - INFO - codeparrot_training - Step 18825: {'lr': 0.000346171244255169, 'samples': 9638912, 'steps': 18825, 'batch_loss/train': 0.6745827539125457} 12/25/2021 10:55:18 - INFO - codeparrot_training - Step 18826: {'lr': 0.00034615669393568764, 'samples': 9639424, 'steps': 18826, 'batch_loss/train': 0.8658986706286669} 12/25/2021 10:55:32 - INFO - codeparrot_training - Step 18827: {'lr': 0.00034614214323392355, 'samples': 9639936, 'steps': 18827, 'batch_loss/train': 0.725441100075841} 12/25/2021 10:55:42 - INFO - codeparrot_training - Step 18828: {'lr': 0.0003461275921499347, 'samples': 9640448, 'steps': 18828, 'batch_loss/train': 0.9069531410932541} 12/25/2021 10:55:53 - INFO - codeparrot_training - Step 18829: {'lr': 
0.0003461130406837788, 'samples': 9640960, 'steps': 18829, 'batch_loss/train': 0.7980020404793322} 12/25/2021 10:56:05 - INFO - codeparrot_training - Step 18830: {'lr': 0.0003460984888355137, 'samples': 9641472, 'steps': 18830, 'batch_loss/train': 0.6664891364052892} 12/25/2021 10:56:16 - INFO - codeparrot_training - Step 18831: {'lr': 0.00034608393660519733, 'samples': 9641984, 'steps': 18831, 'batch_loss/train': 0.7499700076878071} 12/25/2021 10:56:26 - INFO - codeparrot_training - Step 18832: {'lr': 0.00034606938399288746, 'samples': 9642496, 'steps': 18832, 'batch_loss/train': 0.7862380011938512} 12/25/2021 10:56:37 - INFO - codeparrot_training - Step 18833: {'lr': 0.000346054830998642, 'samples': 9643008, 'steps': 18833, 'batch_loss/train': 1.5490319717791863} 12/25/2021 10:56:51 - INFO - codeparrot_training - Step 18834: {'lr': 0.00034604027762251877, 'samples': 9643520, 'steps': 18834, 'batch_loss/train': 0.7371652913279831} 12/25/2021 10:57:01 - INFO - codeparrot_training - Step 18835: {'lr': 0.0003460257238645758, 'samples': 9644032, 'steps': 18835, 'batch_loss/train': 0.7495512776076794} 12/25/2021 10:57:12 - INFO - codeparrot_training - Step 18836: {'lr': 0.0003460111697248706, 'samples': 9644544, 'steps': 18836, 'batch_loss/train': 0.8094020774587989} 12/25/2021 10:57:24 - INFO - codeparrot_training - Step 18837: {'lr': 0.0003459966152034614, 'samples': 9645056, 'steps': 18837, 'batch_loss/train': 0.7738287462852895} 12/25/2021 10:57:35 - INFO - codeparrot_training - Step 18838: {'lr': 0.0003459820603004059, 'samples': 9645568, 'steps': 18838, 'batch_loss/train': 0.775902496650815} 12/25/2021 10:57:45 - INFO - codeparrot_training - Step 18839: {'lr': 0.00034596750501576193, 'samples': 9646080, 'steps': 18839, 'batch_loss/train': 0.7429843069985509} 12/25/2021 10:57:57 - INFO - codeparrot_training - Step 18840: {'lr': 0.0003459529493495874, 'samples': 9646592, 'steps': 18840, 'batch_loss/train': 0.7633619336411357} 12/25/2021 10:58:08 - INFO - 
codeparrot_training - Step 18841: {'lr': 0.00034593839330194023, 'samples': 9647104, 'steps': 18841, 'batch_loss/train': 0.721444251248613} 12/25/2021 10:58:18 - INFO - codeparrot_training - Step 18842: {'lr': 0.00034592383687287817, 'samples': 9647616, 'steps': 18842, 'batch_loss/train': 0.5338749701040797} 12/25/2021 10:58:32 - INFO - codeparrot_training - Step 18843: {'lr': 0.00034590928006245916, 'samples': 9648128, 'steps': 18843, 'batch_loss/train': 1.5821482581086457} 12/25/2021 10:58:43 - INFO - codeparrot_training - Step 18844: {'lr': 0.00034589472287074104, 'samples': 9648640, 'steps': 18844, 'batch_loss/train': 0.7097996214870363} 12/25/2021 10:58:54 - INFO - codeparrot_training - Step 18845: {'lr': 0.0003458801652977817, 'samples': 9649152, 'steps': 18845, 'batch_loss/train': 0.859203249681741} 12/25/2021 10:59:04 - INFO - codeparrot_training - Step 18846: {'lr': 0.00034586560734363903, 'samples': 9649664, 'steps': 18846, 'batch_loss/train': 0.7251520012505352} 12/25/2021 10:59:16 - INFO - codeparrot_training - Step 18847: {'lr': 0.0003458510490083709, 'samples': 9650176, 'steps': 18847, 'batch_loss/train': 0.7868373151868582} 12/25/2021 10:59:27 - INFO - codeparrot_training - Step 18848: {'lr': 0.00034583649029203535, 'samples': 9650688, 'steps': 18848, 'batch_loss/train': 0.7691463362425566} 12/25/2021 10:59:37 - INFO - codeparrot_training - Step 18849: {'lr': 0.0003458219311946899, 'samples': 9651200, 'steps': 18849, 'batch_loss/train': 0.7132683475501835} 12/25/2021 10:59:51 - INFO - codeparrot_training - Step 18850: {'lr': 0.00034580737171639275, 'samples': 9651712, 'steps': 18850, 'batch_loss/train': 0.9894132632762194} 12/25/2021 11:00:02 - INFO - codeparrot_training - Step 18851: {'lr': 0.0003457928118572016, 'samples': 9652224, 'steps': 18851, 'batch_loss/train': 0.7505553457885981} 12/25/2021 11:00:12 - INFO - codeparrot_training - Step 18852: {'lr': 0.00034577825161717445, 'samples': 9652736, 'steps': 18852, 'batch_loss/train': 
0.7145899132592604} 12/25/2021 11:00:24 - INFO - codeparrot_training - Step 18853: {'lr': 0.00034576369099636914, 'samples': 9653248, 'steps': 18853, 'batch_loss/train': 0.8752715920563787} 12/25/2021 11:00:35 - INFO - codeparrot_training - Step 18854: {'lr': 0.0003457491299948435, 'samples': 9653760, 'steps': 18854, 'batch_loss/train': 0.784692170098424} 12/25/2021 11:00:45 - INFO - codeparrot_training - Step 18855: {'lr': 0.0003457345686126555, 'samples': 9654272, 'steps': 18855, 'batch_loss/train': 0.7954192431643605} 12/25/2021 11:00:56 - INFO - codeparrot_training - Step 18856: {'lr': 0.00034572000684986306, 'samples': 9654784, 'steps': 18856, 'batch_loss/train': 0.8012550445273519} 12/25/2021 11:01:08 - INFO - codeparrot_training - Step 18857: {'lr': 0.00034570544470652385, 'samples': 9655296, 'steps': 18857, 'batch_loss/train': 0.884464175440371} 12/25/2021 11:01:19 - INFO - codeparrot_training - Step 18858: {'lr': 0.00034569088218269604, 'samples': 9655808, 'steps': 18858, 'batch_loss/train': 0.6018707109615207} 12/25/2021 11:01:29 - INFO - codeparrot_training - Step 18859: {'lr': 0.00034567631927843746, 'samples': 9656320, 'steps': 18859, 'batch_loss/train': 0.7266822493402287} 12/25/2021 11:01:41 - INFO - codeparrot_training - Step 18860: {'lr': 0.0003456617559938059, 'samples': 9656832, 'steps': 18860, 'batch_loss/train': 0.7907217307947576} 12/25/2021 11:01:52 - INFO - codeparrot_training - Step 18861: {'lr': 0.0003456471923288593, 'samples': 9657344, 'steps': 18861, 'batch_loss/train': 0.8698963560163975} 12/25/2021 11:02:02 - INFO - codeparrot_training - Step 18862: {'lr': 0.0003456326282836556, 'samples': 9657856, 'steps': 18862, 'batch_loss/train': 0.7815736243501306} 12/25/2021 11:02:16 - INFO - codeparrot_training - Step 18863: {'lr': 0.00034561806385825273, 'samples': 9658368, 'steps': 18863, 'batch_loss/train': 0.7430653721094131} 12/25/2021 11:02:27 - INFO - codeparrot_training - Step 18864: {'lr': 0.0003456034990527085, 'samples': 9658880, 
'steps': 18864, 'batch_loss/train': 0.8432388277724385} 12/25/2021 11:02:38 - INFO - codeparrot_training - Step 18865: {'lr': 0.00034558893386708086, 'samples': 9659392, 'steps': 18865, 'batch_loss/train': 0.7899119760841131} 12/25/2021 11:02:50 - INFO - codeparrot_training - Step 18866: {'lr': 0.0003455743683014277, 'samples': 9659904, 'steps': 18866, 'batch_loss/train': 0.8412643680348992} 12/25/2021 11:03:00 - INFO - codeparrot_training - Step 18867: {'lr': 0.00034555980235580696, 'samples': 9660416, 'steps': 18867, 'batch_loss/train': 0.7804617816582322} 12/25/2021 11:03:11 - INFO - codeparrot_training - Step 18868: {'lr': 0.00034554523603027643, 'samples': 9660928, 'steps': 18868, 'batch_loss/train': 0.8037967355921865} 12/25/2021 11:03:21 - INFO - codeparrot_training - Step 18869: {'lr': 0.0003455306693248943, 'samples': 9661440, 'steps': 18869, 'batch_loss/train': 0.8518189648166299} 12/25/2021 11:03:33 - INFO - codeparrot_training - Step 18870: {'lr': 0.0003455161022397182, 'samples': 9661952, 'steps': 18870, 'batch_loss/train': 0.7342694839462638} 12/25/2021 11:03:44 - INFO - codeparrot_training - Step 18871: {'lr': 0.0003455015347748061, 'samples': 9662464, 'steps': 18871, 'batch_loss/train': 0.7933737109415233} 12/25/2021 11:03:55 - INFO - codeparrot_training - Step 18872: {'lr': 0.00034548696693021595, 'samples': 9662976, 'steps': 18872, 'batch_loss/train': 0.8390513067133725} 12/25/2021 11:04:13 - INFO - codeparrot_training - Step 18873: {'lr': 0.00034547239870600575, 'samples': 9663488, 'steps': 18873, 'batch_loss/train': 0.6212321505881846} 12/25/2021 11:04:24 - INFO - codeparrot_training - Step 18874: {'lr': 0.00034545783010223334, 'samples': 9664000, 'steps': 18874, 'batch_loss/train': 0.7878332077525556} 12/25/2021 11:04:34 - INFO - codeparrot_training - Step 18875: {'lr': 0.0003454432611189566, 'samples': 9664512, 'steps': 18875, 'batch_loss/train': 0.7953470908105373} 12/25/2021 11:04:47 - INFO - codeparrot_training - Step 18876: {'lr': 
0.0003454286917562335, 'samples': 9665024, 'steps': 18876, 'batch_loss/train': 0.7465778458863497} 12/25/2021 11:04:58 - INFO - codeparrot_training - Step 18877: {'lr': 0.00034541412201412206, 'samples': 9665536, 'steps': 18877, 'batch_loss/train': 0.7982910005375743} 12/25/2021 11:05:08 - INFO - codeparrot_training - Step 18878: {'lr': 0.00034539955189267997, 'samples': 9666048, 'steps': 18878, 'batch_loss/train': 0.8410755875520408} 12/25/2021 11:05:19 - INFO - codeparrot_training - Step 18879: {'lr': 0.00034538498139196526, 'samples': 9666560, 'steps': 18879, 'batch_loss/train': 0.5149264142964967} 12/25/2021 11:05:32 - INFO - codeparrot_training - Step 18880: {'lr': 0.000345370410512036, 'samples': 9667072, 'steps': 18880, 'batch_loss/train': 0.8774821190163493} 12/25/2021 11:05:43 - INFO - codeparrot_training - Step 18881: {'lr': 0.0003453558392529499, 'samples': 9667584, 'steps': 18881, 'batch_loss/train': 0.7791848030174151} 12/25/2021 11:05:53 - INFO - codeparrot_training - Step 18882: {'lr': 0.000345341267614765, 'samples': 9668096, 'steps': 18882, 'batch_loss/train': 0.8021212564781308} 12/25/2021 11:06:05 - INFO - codeparrot_training - Step 18883: {'lr': 0.0003453266955975393, 'samples': 9668608, 'steps': 18883, 'batch_loss/train': 0.7070535547100008} 12/25/2021 11:06:16 - INFO - codeparrot_training - Step 18884: {'lr': 0.00034531212320133056, 'samples': 9669120, 'steps': 18884, 'batch_loss/train': 0.75937454495579} 12/25/2021 11:06:27 - INFO - codeparrot_training - Step 18885: {'lr': 0.0003452975504261968, 'samples': 9669632, 'steps': 18885, 'batch_loss/train': 0.7370803523808718} 12/25/2021 11:06:39 - INFO - codeparrot_training - Step 18886: {'lr': 0.00034528297727219603, 'samples': 9670144, 'steps': 18886, 'batch_loss/train': 0.83522320818156} 12/25/2021 11:06:50 - INFO - codeparrot_training - Step 18887: {'lr': 0.00034526840373938614, 'samples': 9670656, 'steps': 18887, 'batch_loss/train': 0.8697035126388073} 12/25/2021 11:07:00 - INFO - 
codeparrot_training - Step 18888: {'lr': 0.00034525382982782495, 'samples': 9671168, 'steps': 18888, 'batch_loss/train': 0.6131943005602807} 12/25/2021 11:07:14 - INFO - codeparrot_training - Step 18889: {'lr': 0.00034523925553757055, 'samples': 9671680, 'steps': 18889, 'batch_loss/train': 0.8815916199237108} 12/25/2021 11:07:25 - INFO - codeparrot_training - Step 18890: {'lr': 0.00034522468086868087, 'samples': 9672192, 'steps': 18890, 'batch_loss/train': 0.7859963001683354} 12/25/2021 11:07:36 - INFO - codeparrot_training - Step 18891: {'lr': 0.00034521010582121375, 'samples': 9672704, 'steps': 18891, 'batch_loss/train': 0.9040429997257888} 12/25/2021 11:07:46 - INFO - codeparrot_training - Step 18892: {'lr': 0.0003451955303952272, 'samples': 9673216, 'steps': 18892, 'batch_loss/train': 0.8796165566891432} 12/25/2021 11:07:58 - INFO - codeparrot_training - Step 18893: {'lr': 0.00034518095459077925, 'samples': 9673728, 'steps': 18893, 'batch_loss/train': 0.8394273230805993} 12/25/2021 11:08:09 - INFO - codeparrot_training - Step 18894: {'lr': 0.00034516637840792765, 'samples': 9674240, 'steps': 18894, 'batch_loss/train': 0.8147627962753177} 12/25/2021 11:08:20 - INFO - codeparrot_training - Step 18895: {'lr': 0.0003451518018467305, 'samples': 9674752, 'steps': 18895, 'batch_loss/train': 0.7812266687396914} 12/25/2021 11:08:34 - INFO - codeparrot_training - Step 18896: {'lr': 0.00034513722490724574, 'samples': 9675264, 'steps': 18896, 'batch_loss/train': 0.7692278493195772} 12/25/2021 11:08:44 - INFO - codeparrot_training - Step 18897: {'lr': 0.0003451226475895313, 'samples': 9675776, 'steps': 18897, 'batch_loss/train': 0.9887528028339148} 12/25/2021 11:08:55 - INFO - codeparrot_training - Step 18898: {'lr': 0.00034510806989364513, 'samples': 9676288, 'steps': 18898, 'batch_loss/train': 0.6896610306575894} 12/25/2021 11:09:07 - INFO - codeparrot_training - Step 18899: {'lr': 0.00034509349181964516, 'samples': 9676800, 'steps': 18899, 'batch_loss/train': 
0.8197603262960911} 12/25/2021 11:09:17 - INFO - codeparrot_training - Step 18900: {'lr': 0.0003450789133675895, 'samples': 9677312, 'steps': 18900, 'batch_loss/train': 0.7961986060254276} 12/25/2021 11:09:28 - INFO - codeparrot_training - Step 18901: {'lr': 0.0003450643345375357, 'samples': 9677824, 'steps': 18901, 'batch_loss/train': 0.7662342134863138} 12/25/2021 11:09:39 - INFO - codeparrot_training - Step 18902: {'lr': 0.00034504975532954224, 'samples': 9678336, 'steps': 18902, 'batch_loss/train': 0.7336469236761332} 12/25/2021 11:09:51 - INFO - codeparrot_training - Step 18903: {'lr': 0.0003450351757436668, 'samples': 9678848, 'steps': 18903, 'batch_loss/train': 0.6070020623737946} 12/25/2021 11:10:02 - INFO - codeparrot_training - Step 18904: {'lr': 0.0003450205957799673, 'samples': 9679360, 'steps': 18904, 'batch_loss/train': 0.7625165614299476} 12/25/2021 11:10:12 - INFO - codeparrot_training - Step 18905: {'lr': 0.00034500601543850185, 'samples': 9679872, 'steps': 18905, 'batch_loss/train': 0.6765964357182384} 12/25/2021 11:10:26 - INFO - codeparrot_training - Step 18906: {'lr': 0.00034499143471932836, 'samples': 9680384, 'steps': 18906, 'batch_loss/train': 0.7076512267813087} 12/25/2021 11:10:37 - INFO - codeparrot_training - Step 18907: {'lr': 0.00034497685362250475, 'samples': 9680896, 'steps': 18907, 'batch_loss/train': 0.8477155342698097} 12/25/2021 11:10:47 - INFO - codeparrot_training - Step 18908: {'lr': 0.00034496227214808904, 'samples': 9681408, 'steps': 18908, 'batch_loss/train': 0.7796395951882005} 12/25/2021 11:10:59 - INFO - codeparrot_training - Step 18909: {'lr': 0.0003449476902961393, 'samples': 9681920, 'steps': 18909, 'batch_loss/train': 0.7725904702674598} 12/25/2021 11:11:10 - INFO - codeparrot_training - Step 18910: {'lr': 0.0003449331080667133, 'samples': 9682432, 'steps': 18910, 'batch_loss/train': 0.8364040972664952} 12/25/2021 11:11:20 - INFO - codeparrot_training - Step 18911: {'lr': 0.0003449185254598691, 'samples': 9682944, 
'steps': 18911, 'batch_loss/train': 0.8896825863048434} 12/25/2021 11:11:31 - INFO - codeparrot_training - Step 18912: {'lr': 0.0003449039424756647, 'samples': 9683456, 'steps': 18912, 'batch_loss/train': 0.6100451839156449} 12/25/2021 11:11:45 - INFO - codeparrot_training - Step 18913: {'lr': 0.00034488935911415803, 'samples': 9683968, 'steps': 18913, 'batch_loss/train': 0.7836779020726681} 12/25/2021 11:11:56 - INFO - codeparrot_training - Step 18914: {'lr': 0.0003448747753754071, 'samples': 9684480, 'steps': 18914, 'batch_loss/train': 0.637688017450273} 12/25/2021 11:12:07 - INFO - codeparrot_training - Step 18915: {'lr': 0.00034486019125947, 'samples': 9684992, 'steps': 18915, 'batch_loss/train': 0.8173379339277744} 12/25/2021 11:12:19 - INFO - codeparrot_training - Step 18916: {'lr': 0.00034484560676640446, 'samples': 9685504, 'steps': 18916, 'batch_loss/train': 0.693392219953239} 12/25/2021 11:12:29 - INFO - codeparrot_training - Step 18917: {'lr': 0.00034483102189626863, 'samples': 9686016, 'steps': 18917, 'batch_loss/train': 0.5602737215813249} 12/25/2021 11:12:40 - INFO - codeparrot_training - Step 18918: {'lr': 0.00034481643664912047, 'samples': 9686528, 'steps': 18918, 'batch_loss/train': 0.8348768390715122} 12/25/2021 11:12:52 - INFO - codeparrot_training - Step 18919: {'lr': 0.00034480185102501803, 'samples': 9687040, 'steps': 18919, 'batch_loss/train': 0.7909836629405618} 12/25/2021 11:13:03 - INFO - codeparrot_training - Step 18920: {'lr': 0.00034478726502401916, 'samples': 9687552, 'steps': 18920, 'batch_loss/train': 0.80684794485569} 12/25/2021 11:13:14 - INFO - codeparrot_training - Step 18921: {'lr': 0.00034477267864618187, 'samples': 9688064, 'steps': 18921, 'batch_loss/train': 0.7862258879467845} 12/25/2021 11:13:24 - INFO - codeparrot_training - Step 18922: {'lr': 0.0003447580918915643, 'samples': 9688576, 'steps': 18922, 'batch_loss/train': 0.695979259442538} 12/25/2021 11:13:37 - INFO - codeparrot_training - Step 18923: {'lr': 
0.00034474350476022423, 'samples': 9689088, 'steps': 18923, 'batch_loss/train': 0.7024097711546347} 12/25/2021 11:13:47 - INFO - codeparrot_training - Step 18924: {'lr': 0.0003447289172522199, 'samples': 9689600, 'steps': 18924, 'batch_loss/train': 0.5488348975195549} 12/25/2021 11:13:58 - INFO - codeparrot_training - Step 18925: {'lr': 0.00034471432936760907, 'samples': 9690112, 'steps': 18925, 'batch_loss/train': 0.8316330476664007} 12/25/2021 11:14:12 - INFO - codeparrot_training - Step 18926: {'lr': 0.00034469974110644977, 'samples': 9690624, 'steps': 18926, 'batch_loss/train': 0.7493955451063812} 12/25/2021 11:14:22 - INFO - codeparrot_training - Step 18927: {'lr': 0.00034468515246880013, 'samples': 9691136, 'steps': 18927, 'batch_loss/train': 0.9337174743413925} 12/25/2021 11:14:33 - INFO - codeparrot_training - Step 18928: {'lr': 0.0003446705634547181, 'samples': 9691648, 'steps': 18928, 'batch_loss/train': 0.8085188949480653} 12/25/2021 11:14:45 - INFO - codeparrot_training - Step 18929: {'lr': 0.0003446559740642616, 'samples': 9692160, 'steps': 18929, 'batch_loss/train': 0.7981852418743074} 12/25/2021 11:14:55 - INFO - codeparrot_training - Step 18930: {'lr': 0.0003446413842974887, 'samples': 9692672, 'steps': 18930, 'batch_loss/train': 0.7689325022511184} 12/25/2021 11:15:06 - INFO - codeparrot_training - Step 18931: {'lr': 0.0003446267941544574, 'samples': 9693184, 'steps': 18931, 'batch_loss/train': 0.7857854522299021} 12/25/2021 11:15:17 - INFO - codeparrot_training - Step 18932: {'lr': 0.0003446122036352257, 'samples': 9693696, 'steps': 18932, 'batch_loss/train': 0.8910740930587053} 12/25/2021 11:15:29 - INFO - codeparrot_training - Step 18933: {'lr': 0.00034459761273985157, 'samples': 9694208, 'steps': 18933, 'batch_loss/train': 0.7898478461429477} 12/25/2021 11:15:39 - INFO - codeparrot_training - Step 18934: {'lr': 0.0003445830214683931, 'samples': 9694720, 'steps': 18934, 'batch_loss/train': 0.7669618440559134} 12/25/2021 11:15:50 - INFO - 
codeparrot_training - Step 18935: {'lr': 0.0003445684298209083, 'samples': 9695232, 'steps': 18935, 'batch_loss/train': 0.7304667998105288} 12/25/2021 11:16:04 - INFO - codeparrot_training - Step 18936: {'lr': 0.000344553837797455, 'samples': 9695744, 'steps': 18936, 'batch_loss/train': 0.7469926462508738} 12/25/2021 11:16:14 - INFO - codeparrot_training - Step 18937: {'lr': 0.00034453924539809143, 'samples': 9696256, 'steps': 18937, 'batch_loss/train': 0.8783903867006302} 12/25/2021 11:16:25 - INFO - codeparrot_training - Step 18938: {'lr': 0.00034452465262287547, 'samples': 9696768, 'steps': 18938, 'batch_loss/train': 0.7822185903787613} 12/25/2021 11:16:37 - INFO - codeparrot_training - Step 18939: {'lr': 0.0003445100594718652, 'samples': 9697280, 'steps': 18939, 'batch_loss/train': 0.924892982468009} 12/25/2021 11:16:48 - INFO - codeparrot_training - Step 18940: {'lr': 0.0003444954659451186, 'samples': 9697792, 'steps': 18940, 'batch_loss/train': 1.025515919085592} 12/25/2021 11:16:58 - INFO - codeparrot_training - Step 18941: {'lr': 0.0003444808720426938, 'samples': 9698304, 'steps': 18941, 'batch_loss/train': 0.5884568919427693} 12/25/2021 11:17:09 - INFO - codeparrot_training - Step 18942: {'lr': 0.00034446627776464866, 'samples': 9698816, 'steps': 18942, 'batch_loss/train': 0.6412900141440332} 12/25/2021 11:17:21 - INFO - codeparrot_training - Step 18943: {'lr': 0.00034445168311104127, 'samples': 9699328, 'steps': 18943, 'batch_loss/train': 0.7450275840237737} 12/25/2021 11:17:32 - INFO - codeparrot_training - Step 18944: {'lr': 0.0003444370880819296, 'samples': 9699840, 'steps': 18944, 'batch_loss/train': 0.8295315643772483} 12/25/2021 11:17:42 - INFO - codeparrot_training - Step 18945: {'lr': 0.0003444224926773718, 'samples': 9700352, 'steps': 18945, 'batch_loss/train': 0.7914006533101201} 12/25/2021 11:17:56 - INFO - codeparrot_training - Step 18946: {'lr': 0.0003444078968974258, 'samples': 9700864, 'steps': 18946, 'batch_loss/train': 0.7334040882997215} 
12/25/2021 11:18:07 - INFO - codeparrot_training - Step 18947: {'lr': 0.0003443933007421496, 'samples': 9701376, 'steps': 18947, 'batch_loss/train': 0.8137115454301238} 12/25/2021 11:18:17 - INFO - codeparrot_training - Step 18948: {'lr': 0.0003443787042116014, 'samples': 9701888, 'steps': 18948, 'batch_loss/train': 0.9085010695271194} 12/25/2021 11:18:28 - INFO - codeparrot_training - Step 18949: {'lr': 0.00034436410730583896, 'samples': 9702400, 'steps': 18949, 'batch_loss/train': 0.7355142330052331} 12/25/2021 11:18:40 - INFO - codeparrot_training - Step 18950: {'lr': 0.00034434951002492055, 'samples': 9702912, 'steps': 18950, 'batch_loss/train': 0.8175578946247697} 12/25/2021 11:18:50 - INFO - codeparrot_training - Step 18951: {'lr': 0.0003443349123689041, 'samples': 9703424, 'steps': 18951, 'batch_loss/train': 0.8442245349287987} 12/25/2021 11:19:01 - INFO - codeparrot_training - Step 18952: {'lr': 0.0003443203143378476, 'samples': 9703936, 'steps': 18952, 'batch_loss/train': 0.8129753787070513} 12/25/2021 11:19:15 - INFO - codeparrot_training - Step 18953: {'lr': 0.00034430571593180927, 'samples': 9704448, 'steps': 18953, 'batch_loss/train': 0.8224596208892763} 12/25/2021 11:19:25 - INFO - codeparrot_training - Step 18954: {'lr': 0.00034429111715084695, 'samples': 9704960, 'steps': 18954, 'batch_loss/train': 0.6930286672431976} 12/25/2021 11:19:36 - INFO - codeparrot_training - Step 18955: {'lr': 0.00034427651799501875, 'samples': 9705472, 'steps': 18955, 'batch_loss/train': 0.8151391712017357} 12/25/2021 11:19:48 - INFO - codeparrot_training - Step 18956: {'lr': 0.00034426191846438273, 'samples': 9705984, 'steps': 18956, 'batch_loss/train': 0.7968354723416269} 12/25/2021 11:19:59 - INFO - codeparrot_training - Step 18957: {'lr': 0.00034424731855899683, 'samples': 9706496, 'steps': 18957, 'batch_loss/train': 0.9017368042841554} 12/25/2021 11:20:09 - INFO - codeparrot_training - Step 18958: {'lr': 0.00034423271827891923, 'samples': 9707008, 'steps': 18958, 
'batch_loss/train': 0.8207686273381114} 12/25/2021 11:20:20 - INFO - codeparrot_training - Step 18959: {'lr': 0.000344218117624208, 'samples': 9707520, 'steps': 18959, 'batch_loss/train': 0.7843058807775378} 12/25/2021 11:20:32 - INFO - codeparrot_training - Step 18960: {'lr': 0.000344203516594921, 'samples': 9708032, 'steps': 18960, 'batch_loss/train': 0.8720582174137235} 12/25/2021 11:20:43 - INFO - codeparrot_training - Step 18961: {'lr': 0.00034418891519111646, 'samples': 9708544, 'steps': 18961, 'batch_loss/train': 0.8094629612751305} 12/25/2021 11:20:53 - INFO - codeparrot_training - Step 18962: {'lr': 0.00034417431341285225, 'samples': 9709056, 'steps': 18962, 'batch_loss/train': 0.8479080917313695} 12/25/2021 11:21:05 - INFO - codeparrot_training - Step 18963: {'lr': 0.0003441597112601867, 'samples': 9709568, 'steps': 18963, 'batch_loss/train': 0.7574905501678586} 12/25/2021 11:21:16 - INFO - codeparrot_training - Step 18964: {'lr': 0.00034414510873317754, 'samples': 9710080, 'steps': 18964, 'batch_loss/train': 0.7833065977320075} 12/25/2021 11:21:26 - INFO - codeparrot_training - Step 18965: {'lr': 0.00034413050583188305, 'samples': 9710592, 'steps': 18965, 'batch_loss/train': 0.7646569162607193} 12/25/2021 11:21:40 - INFO - codeparrot_training - Step 18966: {'lr': 0.0003441159025563612, 'samples': 9711104, 'steps': 18966, 'batch_loss/train': 0.8017541710287333} 12/25/2021 11:21:51 - INFO - codeparrot_training - Step 18967: {'lr': 0.00034410129890667006, 'samples': 9711616, 'steps': 18967, 'batch_loss/train': 0.8207068406045437} 12/25/2021 11:22:01 - INFO - codeparrot_training - Step 18968: {'lr': 0.0003440866948828677, 'samples': 9712128, 'steps': 18968, 'batch_loss/train': 0.809895733371377} 12/25/2021 11:22:13 - INFO - codeparrot_training - Step 18969: {'lr': 0.00034407209048501207, 'samples': 9712640, 'steps': 18969, 'batch_loss/train': 0.8373144643846899} 12/25/2021 11:22:24 - INFO - codeparrot_training - Step 18970: {'lr': 0.00034405748571316147, 
'samples': 9713152, 'steps': 18970, 'batch_loss/train': 0.7564894370734692} 12/25/2021 11:22:35 - INFO - codeparrot_training - Step 18971: {'lr': 0.00034404288056737376, 'samples': 9713664, 'steps': 18971, 'batch_loss/train': 0.8089568922296166} 12/25/2021 11:22:45 - INFO - codeparrot_training - Step 18972: {'lr': 0.00034402827504770707, 'samples': 9714176, 'steps': 18972, 'batch_loss/train': 0.7698019477538764} 12/25/2021 11:22:57 - INFO - codeparrot_training - Step 18973: {'lr': 0.00034401366915421945, 'samples': 9714688, 'steps': 18973, 'batch_loss/train': 0.8153756195679307} 12/25/2021 11:23:08 - INFO - codeparrot_training - Step 18974: {'lr': 0.000343999062886969, 'samples': 9715200, 'steps': 18974, 'batch_loss/train': 0.829488143324852} 12/25/2021 11:23:19 - INFO - codeparrot_training - Step 18975: {'lr': 0.0003439844562460137, 'samples': 9715712, 'steps': 18975, 'batch_loss/train': 0.6181540066609159} 12/25/2021 11:23:33 - INFO - codeparrot_training - Step 18976: {'lr': 0.0003439698492314118, 'samples': 9716224, 'steps': 18976, 'batch_loss/train': 0.7439787187613547} 12/25/2021 11:23:43 - INFO - codeparrot_training - Step 18977: {'lr': 0.0003439552418432212, 'samples': 9716736, 'steps': 18977, 'batch_loss/train': 0.7847492962609977} 12/25/2021 11:23:54 - INFO - codeparrot_training - Step 18978: {'lr': 0.0003439406340815, 'samples': 9717248, 'steps': 18978, 'batch_loss/train': 0.9666954865679145} 12/25/2021 11:24:06 - INFO - codeparrot_training - Step 18979: {'lr': 0.0003439260259463064, 'samples': 9717760, 'steps': 18979, 'batch_loss/train': 0.8407314820215106} 12/25/2021 11:24:16 - INFO - codeparrot_training - Step 18980: {'lr': 0.00034391141743769836, 'samples': 9718272, 'steps': 18980, 'batch_loss/train': 0.7534983055666089} 12/25/2021 11:24:27 - INFO - codeparrot_training - Step 18981: {'lr': 0.0003438968085557339, 'samples': 9718784, 'steps': 18981, 'batch_loss/train': 0.7744739844929427} 12/25/2021 11:24:38 - INFO - codeparrot_training - Step 18982: 
{'lr': 0.00034388219930047125, 'samples': 9719296, 'steps': 18982, 'batch_loss/train': 0.7183057551737875} 12/25/2021 11:24:52 - INFO - codeparrot_training - Step 18983: {'lr': 0.00034386758967196846, 'samples': 9719808, 'steps': 18983, 'batch_loss/train': 0.9695058539509773} 12/25/2021 11:25:03 - INFO - codeparrot_training - Step 18984: {'lr': 0.0003438529796702835, 'samples': 9720320, 'steps': 18984, 'batch_loss/train': 0.903760117245838} 12/25/2021 11:25:13 - INFO - codeparrot_training - Step 18985: {'lr': 0.0003438383692954746, 'samples': 9720832, 'steps': 18985, 'batch_loss/train': 0.6981176254339516} 12/25/2021 11:25:25 - INFO - codeparrot_training - Step 18986: {'lr': 0.0003438237585475997, 'samples': 9721344, 'steps': 18986, 'batch_loss/train': 0.7627580721164122} 12/25/2021 11:25:36 - INFO - codeparrot_training - Step 18987: {'lr': 0.00034380914742671703, 'samples': 9721856, 'steps': 18987, 'batch_loss/train': 0.6545045846141875} 12/25/2021 11:25:46 - INFO - codeparrot_training - Step 18988: {'lr': 0.0003437945359328846, 'samples': 9722368, 'steps': 18988, 'batch_loss/train': 0.7863453030586243} 12/25/2021 11:25:59 - INFO - codeparrot_training - Step 18989: {'lr': 0.0003437799240661605, 'samples': 9722880, 'steps': 18989, 'batch_loss/train': 0.7755393106490374} 12/25/2021 11:26:09 - INFO - codeparrot_training - Step 18990: {'lr': 0.0003437653118266028, 'samples': 9723392, 'steps': 18990, 'batch_loss/train': 0.7858003270812333} 12/25/2021 11:26:20 - INFO - codeparrot_training - Step 18991: {'lr': 0.0003437506992142696, 'samples': 9723904, 'steps': 18991, 'batch_loss/train': 1.1099650347605348} 12/25/2021 11:26:31 - INFO - codeparrot_training - Step 18992: {'lr': 0.0003437360862292191, 'samples': 9724416, 'steps': 18992, 'batch_loss/train': 0.7870021602138877} 12/25/2021 11:26:44 - INFO - codeparrot_training - Step 18993: {'lr': 0.0003437214728715093, 'samples': 9724928, 'steps': 18993, 'batch_loss/train': 0.733010845258832} 12/25/2021 11:26:55 - INFO - 
codeparrot_training - Step 18994: {'lr': 0.0003437068591411983, 'samples': 9725440, 'steps': 18994, 'batch_loss/train': 0.8460205225273967} 12/25/2021 11:27:06 - INFO - codeparrot_training - Step 18995: {'lr': 0.00034369224503834415, 'samples': 9725952, 'steps': 18995, 'batch_loss/train': 0.9501663092523813} 12/25/2021 11:27:18 - INFO - codeparrot_training - Step 18996: {'lr': 0.0003436776305630051, 'samples': 9726464, 'steps': 18996, 'batch_loss/train': 0.9087984105572104} 12/25/2021 11:27:28 - INFO - codeparrot_training - Step 18997: {'lr': 0.00034366301571523913, 'samples': 9726976, 'steps': 18997, 'batch_loss/train': 0.8644480783841573} 12/25/2021 11:27:39 - INFO - codeparrot_training - Step 18998: {'lr': 0.00034364840049510437, 'samples': 9727488, 'steps': 18998, 'batch_loss/train': 0.7631767392158508} 12/25/2021 11:27:51 - INFO - codeparrot_training - Step 18999: {'lr': 0.00034363378490265896, 'samples': 9728000, 'steps': 18999, 'batch_loss/train': 0.6978705930523574} 12/25/2021 11:28:02 - INFO - codeparrot_training - Step 19000: {'lr': 0.000343619168937961, 'samples': 9728512, 'steps': 19000, 'batch_loss/train': 0.7238049160223454} 12/25/2021 11:28:12 - INFO - codeparrot_training - Step 19001: {'lr': 0.00034360455260106843, 'samples': 9729024, 'steps': 19001, 'batch_loss/train': 0.6035127388313413} 12/25/2021 11:28:23 - INFO - codeparrot_training - Step 19002: {'lr': 0.00034358993589203964, 'samples': 9729536, 'steps': 19002, 'batch_loss/train': 0.7659834926016629} 12/25/2021 11:28:35 - INFO - codeparrot_training - Step 19003: {'lr': 0.0003435753188109325, 'samples': 9730048, 'steps': 19003, 'batch_loss/train': 0.7478170180693269} 12/25/2021 11:28:45 - INFO - codeparrot_training - Step 19004: {'lr': 0.0003435607013578053, 'samples': 9730560, 'steps': 19004, 'batch_loss/train': 0.7421515276655555} 12/25/2021 11:28:56 - INFO - codeparrot_training - Step 19005: {'lr': 0.0003435460835327161, 'samples': 9731072, 'steps': 19005, 'batch_loss/train': 
0.8020669342949986} 12/25/2021 11:29:10 - INFO - codeparrot_training - Step 19006: {'lr': 0.0003435314653357229, 'samples': 9731584, 'steps': 19006, 'batch_loss/train': 0.8200759091414511} 12/25/2021 11:29:21 - INFO - codeparrot_training - Step 19007: {'lr': 0.00034351684676688397, 'samples': 9732096, 'steps': 19007, 'batch_loss/train': 0.7225058563053608} 12/25/2021 11:29:31 - INFO - codeparrot_training - Step 19008: {'lr': 0.00034350222782625735, 'samples': 9732608, 'steps': 19008, 'batch_loss/train': 0.8709912914782763} 12/25/2021 11:29:43 - INFO - codeparrot_training - Step 19009: {'lr': 0.0003434876085139013, 'samples': 9733120, 'steps': 19009, 'batch_loss/train': 0.8135534888133407} 12/25/2021 11:29:54 - INFO - codeparrot_training - Step 19010: {'lr': 0.00034347298882987366, 'samples': 9733632, 'steps': 19010, 'batch_loss/train': 0.7324072066694498} 12/25/2021 11:30:04 - INFO - codeparrot_training - Step 19011: {'lr': 0.00034345836877423277, 'samples': 9734144, 'steps': 19011, 'batch_loss/train': 0.7262754859402776} 12/25/2021 11:30:15 - INFO - codeparrot_training - Step 19012: {'lr': 0.0003434437483470366, 'samples': 9734656, 'steps': 19012, 'batch_loss/train': 0.89160090778023} 12/25/2021 11:30:27 - INFO - codeparrot_training - Step 19013: {'lr': 0.00034342912754834354, 'samples': 9735168, 'steps': 19013, 'batch_loss/train': 0.8105199607089162} 12/25/2021 11:30:38 - INFO - codeparrot_training - Step 19014: {'lr': 0.0003434145063782114, 'samples': 9735680, 'steps': 19014, 'batch_loss/train': 0.83460240252316} 12/25/2021 11:30:48 - INFO - codeparrot_training - Step 19015: {'lr': 0.0003433998848366985, 'samples': 9736192, 'steps': 19015, 'batch_loss/train': 0.8748245956376195} 12/25/2021 11:31:02 - INFO - codeparrot_training - Step 19016: {'lr': 0.00034338526292386295, 'samples': 9736704, 'steps': 19016, 'batch_loss/train': 0.7682958263903856} 12/25/2021 11:31:13 - INFO - codeparrot_training - Step 19017: {'lr': 0.0003433706406397629, 'samples': 9737216, 
'steps': 19017, 'batch_loss/train': 0.9412972582504153} 12/25/2021 11:31:24 - INFO - codeparrot_training - Step 19018: {'lr': 0.0003433560179844564, 'samples': 9737728, 'steps': 19018, 'batch_loss/train': 0.8205118253827095} 12/25/2021 11:31:34 - INFO - codeparrot_training - Step 19019: {'lr': 0.0003433413949580015, 'samples': 9738240, 'steps': 19019, 'batch_loss/train': 1.0734419794753194} 12/25/2021 11:31:46 - INFO - codeparrot_training - Step 19020: {'lr': 0.0003433267715604566, 'samples': 9738752, 'steps': 19020, 'batch_loss/train': 0.8577863546088338} 12/25/2021 11:31:57 - INFO - codeparrot_training - Step 19021: {'lr': 0.0003433121477918796, 'samples': 9739264, 'steps': 19021, 'batch_loss/train': 0.8694042265415192} 12/25/2021 11:32:07 - INFO - codeparrot_training - Step 19022: {'lr': 0.0003432975236523289, 'samples': 9739776, 'steps': 19022, 'batch_loss/train': 0.6958002485334873} 12/25/2021 11:32:22 - INFO - codeparrot_training - Step 19023: {'lr': 0.00034328289914186235, 'samples': 9740288, 'steps': 19023, 'batch_loss/train': 0.7410474715288728} 12/25/2021 11:32:32 - INFO - codeparrot_training - Step 19024: {'lr': 0.0003432682742605382, 'samples': 9740800, 'steps': 19024, 'batch_loss/train': 0.5937281345250085} 12/25/2021 11:32:43 - INFO - codeparrot_training - Step 19025: {'lr': 0.0003432536490084147, 'samples': 9741312, 'steps': 19025, 'batch_loss/train': 0.6977864463115111} 12/25/2021 11:32:55 - INFO - codeparrot_training - Step 19026: {'lr': 0.0003432390233855498, 'samples': 9741824, 'steps': 19026, 'batch_loss/train': 0.7792317699640989} 12/25/2021 11:33:06 - INFO - codeparrot_training - Step 19027: {'lr': 0.00034322439739200173, 'samples': 9742336, 'steps': 19027, 'batch_loss/train': 0.5930554084479809} 12/25/2021 11:33:16 - INFO - codeparrot_training - Step 19028: {'lr': 0.0003432097710278288, 'samples': 9742848, 'steps': 19028, 'batch_loss/train': 0.8744341556448489} 12/25/2021 11:33:27 - INFO - codeparrot_training - Step 19029: {'lr': 
0.0003431951442930889, 'samples': 9743360, 'steps': 19029, 'batch_loss/train': 0.8563024494796991} 12/25/2021 11:33:39 - INFO - codeparrot_training - Step 19030: {'lr': 0.0003431805171878403, 'samples': 9743872, 'steps': 19030, 'batch_loss/train': 0.729962641838938} 12/25/2021 11:33:50 - INFO - codeparrot_training - Step 19031: {'lr': 0.0003431658897121412, 'samples': 9744384, 'steps': 19031, 'batch_loss/train': 0.6928334888070822} 12/25/2021 11:34:00 - INFO - codeparrot_training - Step 19032: {'lr': 0.00034315126186604966, 'samples': 9744896, 'steps': 19032, 'batch_loss/train': 0.8168751122429967} 12/25/2021 11:34:14 - INFO - codeparrot_training - Step 19033: {'lr': 0.0003431366336496239, 'samples': 9745408, 'steps': 19033, 'batch_loss/train': 0.8350469721481204} 12/25/2021 11:34:25 - INFO - codeparrot_training - Step 19034: {'lr': 0.000343122005062922, 'samples': 9745920, 'steps': 19034, 'batch_loss/train': 0.6993662742897868} 12/25/2021 11:34:35 - INFO - codeparrot_training - Step 19035: {'lr': 0.0003431073761060023, 'samples': 9746432, 'steps': 19035, 'batch_loss/train': 0.7734264172613621} 12/25/2021 11:34:47 - INFO - codeparrot_training - Step 19036: {'lr': 0.0003430927467789227, 'samples': 9746944, 'steps': 19036, 'batch_loss/train': 0.7956632347777486} 12/25/2021 11:34:58 - INFO - codeparrot_training - Step 19037: {'lr': 0.0003430781170817415, 'samples': 9747456, 'steps': 19037, 'batch_loss/train': 0.6535061894683167} 12/25/2021 11:35:09 - INFO - codeparrot_training - Step 19038: {'lr': 0.0003430634870145169, 'samples': 9747968, 'steps': 19038, 'batch_loss/train': 0.8474367279559374} 12/25/2021 11:35:21 - INFO - codeparrot_training - Step 19039: {'lr': 0.000343048856577307, 'samples': 9748480, 'steps': 19039, 'batch_loss/train': 0.7212464376352727} 12/25/2021 11:35:31 - INFO - codeparrot_training - Step 19040: {'lr': 0.00034303422577016996, 'samples': 9748992, 'steps': 19040, 'batch_loss/train': 0.7973963995464146} 12/25/2021 11:35:42 - INFO - 
codeparrot_training - Step 19041: {'lr': 0.000343019594593164, 'samples': 9749504, 'steps': 19041, 'batch_loss/train': 0.7619206020608544}
12/25/2021 11:35:53 - INFO - codeparrot_training - Step 19042: {'lr': 0.0003430049630463472, 'samples': 9750016, 'steps': 19042, 'batch_loss/train': 0.7384495369624346}
12/25/2021 11:36:05 - INFO - codeparrot_training - Step 19043: {'lr': 0.0003429903311297778, 'samples': 9750528, 'steps': 19043, 'batch_loss/train': 0.8001390895806253}
12/25/2021 11:36:15 - INFO - codeparrot_training - Step 19044: {'lr': 0.00034297569884351407, 'samples': 9751040, 'steps': 19044, 'batch_loss/train': 0.7537841554731131}
12/25/2021 11:36:26 - INFO - codeparrot_training - Step 19045: {'lr': 0.0003429610661876139, 'samples': 9751552, 'steps': 19045, 'batch_loss/train': 0.7772928439080715}
12/25/2021 11:36:40 - INFO - codeparrot_training - Step 19046: {'lr': 0.00034294643316213567, 'samples': 9752064, 'steps': 19046, 'batch_loss/train': 0.7745780367404222}
12/25/2021 11:36:50 - INFO - codeparrot_training - Step 19047: {'lr': 0.0003429317997671376, 'samples': 9752576, 'steps': 19047, 'batch_loss/train': 0.7765437001362443}
12/25/2021 11:37:01 - INFO - codeparrot_training - Step 19048: {'lr': 0.00034291716600267773, 'samples': 9753088, 'steps': 19048, 'batch_loss/train': 0.7452428000979125}
12/25/2021 11:37:12 - INFO - codeparrot_training - Step 19049: {'lr': 0.00034290253186881425, 'samples': 9753600, 'steps': 19049, 'batch_loss/train': 0.9948066109791398}
12/25/2021 11:37:24 - INFO - codeparrot_training - Step 19050: {'lr': 0.0003428878973656055, 'samples': 9754112, 'steps': 19050, 'batch_loss/train': 0.8752378160133958}
12/25/2021 11:37:34 - INFO - codeparrot_training - Step 19051: {'lr': 0.00034287326249310944, 'samples': 9754624, 'steps': 19051, 'batch_loss/train': 0.8178628260502592}
12/25/2021 11:37:45 - INFO - codeparrot_training - Step 19052: {'lr': 0.0003428586272513844, 'samples': 9755136, 'steps': 19052, 'batch_loss/train': 0.823531097266823}
12/25/2021 11:37:58 - INFO - codeparrot_training - Step 19053: {'lr': 0.00034284399164048854, 'samples': 9755648, 'steps': 19053, 'batch_loss/train': 0.8798393774777651}
12/25/2021 11:38:09 - INFO - codeparrot_training - Step 19054: {'lr': 0.00034282935566048, 'samples': 9756160, 'steps': 19054, 'batch_loss/train': 0.8454690366052091}
12/25/2021 11:38:20 - INFO - codeparrot_training - Step 19055: {'lr': 0.000342814719311417, 'samples': 9756672, 'steps': 19055, 'batch_loss/train': 0.8093813871964812}
12/25/2021 11:38:32 - INFO - codeparrot_training - Step 19056: {'lr': 0.00034280008259335765, 'samples': 9757184, 'steps': 19056, 'batch_loss/train': 0.8823787979781628}
12/25/2021 11:38:42 - INFO - codeparrot_training - Step 19057: {'lr': 0.00034278544550636034, 'samples': 9757696, 'steps': 19057, 'batch_loss/train': 0.7727574687451124}
12/25/2021 11:38:53 - INFO - codeparrot_training - Step 19058: {'lr': 0.00034277080805048297, 'samples': 9758208, 'steps': 19058, 'batch_loss/train': 0.8227803977206349}
12/25/2021 11:39:05 - INFO - codeparrot_training - Step 19059: {'lr': 0.00034275617022578404, 'samples': 9758720, 'steps': 19059, 'batch_loss/train': 0.8096349691040814}
12/25/2021 11:39:16 - INFO - codeparrot_training - Step 19060: {'lr': 0.0003427415320323215, 'samples': 9759232, 'steps': 19060, 'batch_loss/train': 0.7827959572896361}
12/25/2021 11:39:26 - INFO - codeparrot_training - Step 19061: {'lr': 0.0003427268934701538, 'samples': 9759744, 'steps': 19061, 'batch_loss/train': 0.7279633523430675}
12/25/2021 11:39:37 - INFO - codeparrot_training - Step 19062: {'lr': 0.0003427122545393388, 'samples': 9760256, 'steps': 19062, 'batch_loss/train': 0.7830529352650046}
12/25/2021 11:39:51 - INFO - codeparrot_training - Step 19063: {'lr': 0.000342697615239935, 'samples': 9760768, 'steps': 19063, 'batch_loss/train': 0.5368184747640043}
12/25/2021 11:40:02 - INFO - codeparrot_training - Step 19064: {'lr': 0.00034268297557200043, 'samples': 9761280, 'steps': 19064, 'batch_loss/train': 0.8261583978310227}
12/25/2021 11:40:12 - INFO - codeparrot_training - Step 19065: {'lr': 0.00034266833553559336, 'samples': 9761792, 'steps': 19065, 'batch_loss/train': 0.764887098222971}
12/25/2021 11:40:24 - INFO - codeparrot_training - Step 19066: {'lr': 0.000342653695130772, 'samples': 9762304, 'steps': 19066, 'batch_loss/train': 0.5594276660121977}
12/25/2021 11:40:35 - INFO - codeparrot_training - Step 19067: {'lr': 0.00034263905435759454, 'samples': 9762816, 'steps': 19067, 'batch_loss/train': 0.8009321275167167}
12/25/2021 11:40:46 - INFO - codeparrot_training - Step 19068: {'lr': 0.0003426244132161191, 'samples': 9763328, 'steps': 19068, 'batch_loss/train': 0.7595717567019165}
12/25/2021 11:40:58 - INFO - codeparrot_training - Step 19069: {'lr': 0.000342609771706404, 'samples': 9763840, 'steps': 19069, 'batch_loss/train': 0.7996326694265008}
12/25/2021 11:41:08 - INFO - codeparrot_training - Step 19070: {'lr': 0.0003425951298285075, 'samples': 9764352, 'steps': 19070, 'batch_loss/train': 0.8631745560560375}
12/25/2021 11:41:19 - INFO - codeparrot_training - Step 19071: {'lr': 0.0003425804875824876, 'samples': 9764864, 'steps': 19071, 'batch_loss/train': 0.7523689521476626}
12/25/2021 11:41:30 - INFO - codeparrot_training - Step 19072: {'lr': 0.0003425658449684027, 'samples': 9765376, 'steps': 19072, 'batch_loss/train': 0.7733934391289949}
12/25/2021 11:41:43 - INFO - codeparrot_training - Step 19073: {'lr': 0.00034255120198631094, 'samples': 9765888, 'steps': 19073, 'batch_loss/train': 0.803483278490603}
12/25/2021 11:41:54 - INFO - codeparrot_training - Step 19074: {'lr': 0.00034253655863627065, 'samples': 9766400, 'steps': 19074, 'batch_loss/train': 0.7578798979520798}
12/25/2021 11:42:04 - INFO - codeparrot_training - Step 19075: {'lr': 0.00034252191491833975, 'samples': 9766912, 'steps': 19075, 'batch_loss/train': 0.8255442149238661}
12/25/2021 11:42:17 - INFO - codeparrot_training - Step 19076: {'lr': 0.00034250727083257683, 'samples': 9767424, 'steps': 19076, 'batch_loss/train': 0.8082969607785344}
12/25/2021 11:42:27 - INFO - codeparrot_training - Step 19077: {'lr': 0.00034249262637903985, 'samples': 9767936, 'steps': 19077, 'batch_loss/train': 0.8545428114011884}
12/25/2021 11:42:38 - INFO - codeparrot_training - Step 19078: {'lr': 0.0003424779815577871, 'samples': 9768448, 'steps': 19078, 'batch_loss/train': 0.6523821935988963}
12/25/2021 11:42:50 - INFO - codeparrot_training - Step 19079: {'lr': 0.00034246333636887675, 'samples': 9768960, 'steps': 19079, 'batch_loss/train': 0.7958850646391511}
12/25/2021 11:43:00 - INFO - codeparrot_training - Step 19080: {'lr': 0.0003424486908123672, 'samples': 9769472, 'steps': 19080, 'batch_loss/train': 0.7427552817389369}
12/25/2021 11:43:11 - INFO - codeparrot_training - Step 19081: {'lr': 0.0003424340448883166, 'samples': 9769984, 'steps': 19081, 'batch_loss/train': 0.8125301022082567}
12/25/2021 11:43:22 - INFO - codeparrot_training - Step 19082: {'lr': 0.00034241939859678297, 'samples': 9770496, 'steps': 19082, 'batch_loss/train': 0.8044766944367439}
12/25/2021 11:43:35 - INFO - codeparrot_training - Step 19083: {'lr': 0.0003424047519378248, 'samples': 9771008, 'steps': 19083, 'batch_loss/train': 0.8703726083040237}
12/25/2021 11:43:46 - INFO - codeparrot_training - Step 19084: {'lr': 0.0003423901049115003, 'samples': 9771520, 'steps': 19084, 'batch_loss/train': 0.8010066035203636}
12/25/2021 11:43:57 - INFO - codeparrot_training - Step 19085: {'lr': 0.0003423754575178675, 'samples': 9772032, 'steps': 19085, 'batch_loss/train': 0.8703152504749596}
12/25/2021 11:44:09 - INFO - codeparrot_training - Step 19086: {'lr': 0.0003423608097569849, 'samples': 9772544, 'steps': 19086, 'batch_loss/train': 0.9098725616931915}
12/25/2021 11:44:19 - INFO - codeparrot_training - Step 19087: {'lr': 0.00034234616162891055, 'samples': 9773056, 'steps': 19087, 'batch_loss/train': 0.8152003148570657}
12/25/2021 11:44:30 - INFO - codeparrot_training - Step 19088: {'lr': 0.0003423315131337026, 'samples': 9773568, 'steps': 19088, 'batch_loss/train': 0.8530476111918688}
12/25/2021 11:44:42 - INFO - codeparrot_training - Step 19089: {'lr': 0.0003423168642714196, 'samples': 9774080, 'steps': 19089, 'batch_loss/train': 0.8442722344771028}
12/25/2021 11:44:53 - INFO - codeparrot_training - Step 19090: {'lr': 0.0003423022150421194, 'samples': 9774592, 'steps': 19090, 'batch_loss/train': 0.8146810648031533}
12/25/2021 11:45:03 - INFO - codeparrot_training - Step 19091: {'lr': 0.0003422875654458606, 'samples': 9775104, 'steps': 19091, 'batch_loss/train': 0.6445473502390087}
12/25/2021 11:45:17 - INFO - codeparrot_training - Step 19092: {'lr': 0.0003422729154827012, 'samples': 9775616, 'steps': 19092, 'batch_loss/train': 0.7341042654588819}
12/25/2021 11:45:27 - INFO - codeparrot_training - Step 19093: {'lr': 0.0003422582651526996, 'samples': 9776128, 'steps': 19093, 'batch_loss/train': 0.8345770360901952}
12/25/2021 11:45:38 - INFO - codeparrot_training - Step 19094: {'lr': 0.00034224361445591393, 'samples': 9776640, 'steps': 19094, 'batch_loss/train': 0.7870086412876844}
12/25/2021 11:45:49 - INFO - codeparrot_training - Step 19095: {'lr': 0.0003422289633924025, 'samples': 9777152, 'steps': 19095, 'batch_loss/train': 0.7737015844322741}
12/25/2021 11:46:01 - INFO - codeparrot_training - Step 19096: {'lr': 0.0003422143119622236, 'samples': 9777664, 'steps': 19096, 'batch_loss/train': 0.8963851882144809}
12/25/2021 11:46:11 - INFO - codeparrot_training - Step 19097: {'lr': 0.00034219966016543535, 'samples': 9778176, 'steps': 19097, 'batch_loss/train': 0.7972266576252878}
12/25/2021 11:46:22 - INFO - codeparrot_training - Step 19098: {'lr': 0.00034218500800209606, 'samples': 9778688, 'steps': 19098, 'batch_loss/train': 0.8446383681148291}
12/25/2021 11:46:36 - INFO - codeparrot_training - Step 19099: {'lr': 0.0003421703554722639, 'samples': 9779200, 'steps': 19099, 'batch_loss/train': 0.7690163441002369}
12/25/2021 11:46:46 - INFO - codeparrot_training - Step 19100: {'lr': 0.0003421557025759974, 'samples': 9779712, 'steps': 19100, 'batch_loss/train': 0.7352585811167955}
12/25/2021 11:46:57 - INFO - codeparrot_training - Step 19101: {'lr': 0.0003421410493133544, 'samples': 9780224, 'steps': 19101, 'batch_loss/train': 0.8067571758292615}
12/25/2021 11:47:08 - INFO - codeparrot_training - Step 19102: {'lr': 0.0003421263956843936, 'samples': 9780736, 'steps': 19102, 'batch_loss/train': 0.7408691914752126}
12/25/2021 11:47:19 - INFO - codeparrot_training - Step 19103: {'lr': 0.0003421117416891729, 'samples': 9781248, 'steps': 19103, 'batch_loss/train': 0.80238077044487}
12/25/2021 11:47:30 - INFO - codeparrot_training - Step 19104: {'lr': 0.00034209708732775074, 'samples': 9781760, 'steps': 19104, 'batch_loss/train': 0.816266949288547}
12/25/2021 11:47:41 - INFO - codeparrot_training - Step 19105: {'lr': 0.00034208243260018534, 'samples': 9782272, 'steps': 19105, 'batch_loss/train': 0.8272165143862367}
12/25/2021 11:47:53 - INFO - codeparrot_training - Step 19106: {'lr': 0.00034206777750653495, 'samples': 9782784, 'steps': 19106, 'batch_loss/train': 0.8357526031322777}
12/25/2021 11:48:03 - INFO - codeparrot_training - Step 19107: {'lr': 0.0003420531220468578, 'samples': 9783296, 'steps': 19107, 'batch_loss/train': 0.7086948584765196}
12/25/2021 11:48:14 - INFO - codeparrot_training - Step 19108: {'lr': 0.00034203846622121226, 'samples': 9783808, 'steps': 19108, 'batch_loss/train': 0.8530042278580368}
12/25/2021 11:48:28 - INFO - codeparrot_training - Step 19109: {'lr': 0.0003420238100296565, 'samples': 9784320, 'steps': 19109, 'batch_loss/train': 0.917745303362608}
12/25/2021 11:48:38 - INFO - codeparrot_training - Step 19110: {'lr': 0.0003420091534722488, 'samples': 9784832, 'steps': 19110, 'batch_loss/train': 0.6521452150773257}
12/25/2021 11:48:49 - INFO - codeparrot_training - Step 19111: {'lr': 0.00034199449654904746, 'samples': 9785344, 'steps': 19111, 'batch_loss/train': 0.7572850417345762}
12/25/2021 11:49:00 - INFO - codeparrot_training - Step 19112: {'lr': 0.0003419798392601108, 'samples': 9785856, 'steps': 19112, 'batch_loss/train': 0.6918014928232878}
12/25/2021 11:49:12 - INFO - codeparrot_training - Step 19113: {'lr': 0.0003419651816054969, 'samples': 9786368, 'steps': 19113, 'batch_loss/train': 0.7023476890753955}
12/25/2021 11:49:22 - INFO - codeparrot_training - Step 19114: {'lr': 0.0003419505235852642, 'samples': 9786880, 'steps': 19114, 'batch_loss/train': 0.8109932774677873}
12/25/2021 11:49:33 - INFO - codeparrot_training - Step 19115: {'lr': 0.00034193586519947107, 'samples': 9787392, 'steps': 19115, 'batch_loss/train': 0.8605398326180875}
12/25/2021 11:49:45 - INFO - codeparrot_training - Step 19116: {'lr': 0.0003419212064481755, 'samples': 9787904, 'steps': 19116, 'batch_loss/train': 0.7756087733432651}
12/25/2021 11:49:55 - INFO - codeparrot_training - Step 19117: {'lr': 0.00034190654733143593, 'samples': 9788416, 'steps': 19117, 'batch_loss/train': 0.7685662854928523}
12/25/2021 11:50:06 - INFO - codeparrot_training - Step 19118: {'lr': 0.00034189188784931076, 'samples': 9788928, 'steps': 19118, 'batch_loss/train': 0.8560688742436469}
12/25/2021 11:50:18 - INFO - codeparrot_training - Step 19119: {'lr': 0.00034187722800185804, 'samples': 9789440, 'steps': 19119, 'batch_loss/train': 0.6036968347616494}
12/25/2021 11:50:29 - INFO - codeparrot_training - Step 19120: {'lr': 0.00034186256778913614, 'samples': 9789952, 'steps': 19120, 'batch_loss/train': 0.7709991366136819}
12/25/2021 11:50:39 - INFO - codeparrot_training - Step 19121: {'lr': 0.0003418479072112034, 'samples': 9790464, 'steps': 19121, 'batch_loss/train': 0.7457755031064153}
12/25/2021 11:50:50 - INFO - codeparrot_training - Step 19122: {'lr': 0.0003418332462681181, 'samples': 9790976, 'steps': 19122, 'batch_loss/train': 0.806864962913096}
12/25/2021 11:51:04 - INFO - codeparrot_training - Step 19123: {'lr': 0.00034181858495993837, 'samples': 9791488, 'steps': 19123, 'batch_loss/train': 1.0528552792966366}
12/25/2021 11:51:15 - INFO - codeparrot_training - Step 19124: {'lr': 0.0003418039232867227, 'samples': 9792000, 'steps': 19124, 'batch_loss/train': 0.7737647602334619}
12/25/2021 11:51:25 - INFO - codeparrot_training - Step 19125: {'lr': 0.00034178926124852927, 'samples': 9792512, 'steps': 19125, 'batch_loss/train': 0.7936754636466503}
12/25/2021 11:51:37 - INFO - codeparrot_training - Step 19126: {'lr': 0.00034177459884541644, 'samples': 9793024, 'steps': 19126, 'batch_loss/train': 0.8529649758711457}
12/25/2021 11:51:48 - INFO - codeparrot_training - Step 19127: {'lr': 0.00034175993607744236, 'samples': 9793536, 'steps': 19127, 'batch_loss/train': 0.8574472712352872}
12/25/2021 11:51:58 - INFO - codeparrot_training - Step 19128: {'lr': 0.00034174527294466553, 'samples': 9794048, 'steps': 19128, 'batch_loss/train': 0.756684903986752}
12/25/2021 11:52:12 - INFO - codeparrot_training - Step 19129: {'lr': 0.00034173060944714406, 'samples': 9794560, 'steps': 19129, 'batch_loss/train': 0.7539969119243324}
12/25/2021 11:52:23 - INFO - codeparrot_training - Step 19130: {'lr': 0.00034171594558493634, 'samples': 9795072, 'steps': 19130, 'batch_loss/train': 0.7604246018454432}
12/25/2021 11:52:33 - INFO - codeparrot_training - Step 19131: {'lr': 0.0003417012813581007, 'samples': 9795584, 'steps': 19131, 'batch_loss/train': 0.8014653129503131}
12/25/2021 11:52:44 - INFO - codeparrot_training - Step 19132: {'lr': 0.00034168661676669534, 'samples': 9796096, 'steps': 19132, 'batch_loss/train': 0.7789742152672261}
12/25/2021 11:52:56 - INFO - codeparrot_training - Step 19133: {'lr': 0.00034167195181077857, 'samples': 9796608, 'steps': 19133, 'batch_loss/train': 0.7803488615900278}
12/25/2021 11:53:07 - INFO - codeparrot_training - Step 19134: {'lr': 0.00034165728649040886, 'samples': 9797120, 'steps': 19134, 'batch_loss/train': 0.7623062734492123}
12/25/2021 11:53:17 - INFO - codeparrot_training - Step 19135: {'lr': 0.0003416426208056443, 'samples': 9797632, 'steps': 19135, 'batch_loss/train': 0.7856967141851783}
12/25/2021 11:53:30 - INFO - codeparrot_training - Step 19136: {'lr': 0.0003416279547565433, 'samples': 9798144, 'steps': 19136, 'batch_loss/train': 0.7837529517710209}
12/25/2021 11:53:40 - INFO - codeparrot_training - Step 19137: {'lr': 0.0003416132883431641, 'samples': 9798656, 'steps': 19137, 'batch_loss/train': 0.8354904633015394}
12/25/2021 11:53:51 - INFO - codeparrot_training - Step 19138: {'lr': 0.00034159862156556516, 'samples': 9799168, 'steps': 19138, 'batch_loss/train': 0.6586664505302906}
12/25/2021 11:54:04 - INFO - codeparrot_training - Step 19139: {'lr': 0.0003415839544238046, 'samples': 9799680, 'steps': 19139, 'batch_loss/train': 0.8237846218980849}
12/25/2021 11:54:15 - INFO - codeparrot_training - Step 19140: {'lr': 0.0003415692869179408, 'samples': 9800192, 'steps': 19140, 'batch_loss/train': 0.7069016366149299}
12/25/2021 11:54:26 - INFO - codeparrot_training - Step 19141: {'lr': 0.0003415546190480322, 'samples': 9800704, 'steps': 19141, 'batch_loss/train': 0.7159099383279681}
12/25/2021 11:54:38 - INFO - codeparrot_training - Step 19142: {'lr': 0.000341539950814137, 'samples': 9801216, 'steps': 19142, 'batch_loss/train': 0.7978453044779599}
12/25/2021 11:54:48 - INFO - codeparrot_training - Step 19143: {'lr': 0.00034152528221631333, 'samples': 9801728, 'steps': 19143, 'batch_loss/train': 0.8061254629865289}
12/25/2021 11:54:59 - INFO - codeparrot_training - Step 19144: {'lr': 0.0003415106132546199, 'samples': 9802240, 'steps': 19144, 'batch_loss/train': 0.7252282402478158}
12/25/2021 11:55:10 - INFO - codeparrot_training - Step 19145: {'lr': 0.00034149594392911476, 'samples': 9802752, 'steps': 19145, 'batch_loss/train': 0.8552953307516873}
12/25/2021 11:55:21 - INFO - codeparrot_training - Step 19146: {'lr': 0.0003414812742398563, 'samples': 9803264, 'steps': 19146, 'batch_loss/train': 0.7005230477079749}
12/25/2021 11:55:32 - INFO - codeparrot_training - Step 19147: {'lr': 0.0003414666041869028, 'samples': 9803776, 'steps': 19147, 'batch_loss/train': 0.8351018279790878}
12/25/2021 11:55:43 - INFO - codeparrot_training - Step 19148: {'lr': 0.0003414519337703127, 'samples': 9804288, 'steps': 19148, 'batch_loss/train': 0.8216613554395735}
12/25/2021 11:55:55 - INFO - codeparrot_training - Step 19149: {'lr': 0.00034143726299014414, 'samples': 9804800, 'steps': 19149, 'batch_loss/train': 0.705009404104203}
12/25/2021 11:56:06 - INFO - codeparrot_training - Step 19150: {'lr': 0.00034142259184645555, 'samples': 9805312, 'steps': 19150, 'batch_loss/train': 0.6995638445951045}
12/25/2021 11:56:16 - INFO - codeparrot_training - Step 19151: {'lr': 0.00034140792033930537, 'samples': 9805824, 'steps': 19151, 'batch_loss/train': 0.8305348912253976}
12/25/2021 11:56:30 - INFO - codeparrot_training - Step 19152: {'lr': 0.0003413932484687517, 'samples': 9806336, 'steps': 19152, 'batch_loss/train': 0.7445748881436884}
12/25/2021 11:56:41 - INFO - codeparrot_training - Step 19153: {'lr': 0.000341378576234853, 'samples': 9806848, 'steps': 19153, 'batch_loss/train': 0.7829413609579206}
12/25/2021 11:56:51 - INFO - codeparrot_training - Step 19154: {'lr': 0.00034136390363766764, 'samples': 9807360, 'steps': 19154, 'batch_loss/train': 0.6995797466952354}
12/25/2021 11:57:02 - INFO - codeparrot_training - Step 19155: {'lr': 0.0003413492306772539, 'samples': 9807872, 'steps': 19155, 'batch_loss/train': 0.7415723474696279}
12/25/2021 11:57:14 - INFO - codeparrot_training - Step 19156: {'lr': 0.00034133455735367, 'samples': 9808384, 'steps': 19156, 'batch_loss/train': 0.7317403489723802}
12/25/2021 11:57:25 - INFO - codeparrot_training - Step 19157: {'lr': 0.0003413198836669745, 'samples': 9808896, 'steps': 19157, 'batch_loss/train': 0.8477084320038557}
12/25/2021 11:57:35 - INFO - codeparrot_training - Step 19158: {'lr': 0.00034130520961722557, 'samples': 9809408, 'steps': 19158, 'batch_loss/train': 0.7269365931861103}
12/25/2021 11:57:49 - INFO - codeparrot_training - Step 19159: {'lr': 0.0003412905352044816, 'samples': 9809920, 'steps': 19159, 'batch_loss/train': 0.8435047268867493}
12/25/2021 11:58:00 - INFO - codeparrot_training - Step 19160: {'lr': 0.00034127586042880095, 'samples': 9810432, 'steps': 19160, 'batch_loss/train': 0.6656023885589093}
12/25/2021 11:58:11 - INFO - codeparrot_training - Step 19161: {'lr': 0.00034126118529024196, 'samples': 9810944, 'steps': 19161, 'batch_loss/train': 0.8245161296799779}
12/25/2021 11:58:23 - INFO - codeparrot_training - Step 19162: {'lr': 0.0003412465097888629, 'samples': 9811456, 'steps': 19162, 'batch_loss/train': 0.7480680234730244}
12/25/2021 11:58:33 - INFO - codeparrot_training - Step 19163: {'lr': 0.00034123183392472224, 'samples': 9811968, 'steps': 19163, 'batch_loss/train': 0.7272695861756802}
12/25/2021 11:58:44 - INFO - codeparrot_training - Step 19164: {'lr': 0.0003412171576978782, 'samples': 9812480, 'steps': 19164, 'batch_loss/train': 0.7190673826262355}
12/25/2021 11:58:55 - INFO - codeparrot_training - Step 19165: {'lr': 0.00034120248110838927, 'samples': 9812992, 'steps': 19165, 'batch_loss/train': 1.453710772562772}
12/25/2021 11:59:07 - INFO - codeparrot_training - Step 19166: {'lr': 0.00034118780415631354, 'samples': 9813504, 'steps': 19166, 'batch_loss/train': 0.891032655723393}
12/25/2021 11:59:17 - INFO - codeparrot_training - Step 19167: {'lr': 0.0003411731268417097, 'samples': 9814016, 'steps': 19167, 'batch_loss/train': 0.7599384472705424}
12/25/2021 11:59:28 - INFO - codeparrot_training - Step 19168: {'lr': 0.0003411584491646359, 'samples': 9814528, 'steps': 19168, 'batch_loss/train': 0.8271486661396921}
12/25/2021 11:59:41 - INFO - codeparrot_training - Step 19169: {'lr': 0.0003411437711251504, 'samples': 9815040, 'steps': 19169, 'batch_loss/train': 0.8196439947932959}
12/25/2021 11:59:52 - INFO - codeparrot_training - Step 19170: {'lr': 0.0003411290927233119, 'samples': 9815552, 'steps': 19170, 'batch_loss/train': 0.7847076803445816}
12/25/2021 12:00:03 - INFO - codeparrot_training - Step 19171: {'lr': 0.0003411144139591784, 'samples': 9816064, 'steps': 19171, 'batch_loss/train': 0.9134666556492448}
12/25/2021 12:00:15 - INFO - codeparrot_training - Step 19172: {'lr': 0.00034109973483280835, 'samples': 9816576, 'steps': 19172, 'batch_loss/train': 0.8124117855913937}
12/25/2021 12:00:25 - INFO - codeparrot_training - Step 19173: {'lr': 0.00034108505534426024, 'samples': 9817088, 'steps': 19173, 'batch_loss/train': 0.9094450455158949}
12/25/2021 12:00:36 - INFO - codeparrot_training - Step 19174: {'lr': 0.0003410703754935924, 'samples': 9817600, 'steps': 19174, 'batch_loss/train': 0.7286688783206046}
12/25/2021 12:00:47 - INFO - codeparrot_training - Step 19175: {'lr': 0.00034105569528086293, 'samples': 9818112, 'steps': 19175, 'batch_loss/train': 0.8921861005946994}
12/25/2021 12:00:59 - INFO - codeparrot_training - Step 19176: {'lr': 0.00034104101470613056, 'samples': 9818624, 'steps': 19176, 'batch_loss/train': 0.9763051960617304}
12/25/2021 12:01:09 - INFO - codeparrot_training - Step 19177: {'lr': 0.00034102633376945345, 'samples': 9819136, 'steps': 19177, 'batch_loss/train': 0.820205676369369}
12/25/2021 12:01:20 - INFO - codeparrot_training - Step 19178: {'lr': 0.00034101165247089, 'samples': 9819648, 'steps': 19178, 'batch_loss/train': 0.8006602106615901}
12/25/2021 12:01:34 - INFO - codeparrot_training - Step 19179: {'lr': 0.0003409969708104986, 'samples': 9820160, 'steps': 19179, 'batch_loss/train': 0.7874918119050562}
12/25/2021 12:01:44 - INFO - codeparrot_training - Step 19180: {'lr': 0.00034098228878833757, 'samples': 9820672, 'steps': 19180, 'batch_loss/train': 0.8149806812871248}
12/25/2021 12:01:55 - INFO - codeparrot_training - Step 19181: {'lr': 0.0003409676064044654, 'samples': 9821184, 'steps': 19181, 'batch_loss/train': 0.7631547157652676}
12/25/2021 12:02:07 - INFO - codeparrot_training - Step 19182: {'lr': 0.00034095292365894024, 'samples': 9821696, 'steps': 19182, 'batch_loss/train': 0.7629688140004873}
12/25/2021 12:02:18 - INFO - codeparrot_training - Step 19183: {'lr': 0.0003409382405518207, 'samples': 9822208, 'steps': 19183, 'batch_loss/train': 0.7728549670428038}
12/25/2021 12:02:28 - INFO - codeparrot_training - Step 19184: {'lr': 0.000340923557083165, 'samples': 9822720, 'steps': 19184, 'batch_loss/train': 0.7979437466710806}
12/25/2021 12:02:40 - INFO - codeparrot_training - Step 19185: {'lr': 0.0003409088732530316, 'samples': 9823232, 'steps': 19185, 'batch_loss/train': 0.8117071831366047}
12/25/2021 12:02:51 - INFO - codeparrot_training - Step 19186: {'lr': 0.0003408941890614788, 'samples': 9823744, 'steps': 19186, 'batch_loss/train': 0.7809488172642887}
12/25/2021 12:03:01 - INFO - codeparrot_training - Step 19187: {'lr': 0.0003408795045085651, 'samples': 9824256, 'steps': 19187, 'batch_loss/train': 0.7804613513872027}
12/25/2021 12:03:12 - INFO - codeparrot_training - Step 19188: {'lr': 0.00034086481959434874, 'samples': 9824768, 'steps': 19188, 'batch_loss/train': 0.8126434020814486}
12/25/2021 12:03:24 - INFO - codeparrot_training - Step 19189: {'lr': 0.0003408501343188882, 'samples': 9825280, 'steps': 19189, 'batch_loss/train': 0.9002312617376447}
12/25/2021 12:03:35 - INFO - codeparrot_training - Step 19190: {'lr': 0.00034083544868224186, 'samples': 9825792, 'steps': 19190, 'batch_loss/train': 0.8524558329954743}
12/25/2021 12:03:46 - INFO - codeparrot_training - Step 19191: {'lr': 0.000340820762684468, 'samples': 9826304, 'steps': 19191, 'batch_loss/train': 0.4509412278421223}
12/25/2021 12:03:59 - INFO - codeparrot_training - Step 19192: {'lr': 0.0003408060763256251, 'samples': 9826816, 'steps': 19192, 'batch_loss/train': 1.0881256945431232}
12/25/2021 12:04:10 - INFO - codeparrot_training - Step 19193: {'lr': 0.00034079138960577154, 'samples': 9827328, 'steps': 19193, 'batch_loss/train': 0.681819359306246}
12/25/2021 12:04:21 - INFO - codeparrot_training - Step 19194: {'lr': 0.00034077670252496576, 'samples': 9827840, 'steps': 19194, 'batch_loss/train': 0.8471947740763426}
12/25/2021 12:04:31 - INFO - codeparrot_training - Step 19195: {'lr': 0.000340762015083266, 'samples': 9828352, 'steps': 19195, 'batch_loss/train': 0.7725845396053046}
12/25/2021 12:04:43 - INFO - codeparrot_training - Step 19196: {'lr': 0.00034074732728073075, 'samples': 9828864, 'steps': 19196, 'batch_loss/train': 0.8484473312273622}
12/25/2021 12:04:54 - INFO - codeparrot_training - Step 19197: {'lr': 0.00034073263911741844, 'samples': 9829376, 'steps': 19197, 'batch_loss/train': 0.8738347664475441}
12/25/2021 12:05:05 - INFO - codeparrot_training - Step 19198: {'lr': 0.0003407179505933874, 'samples': 9829888, 'steps': 19198, 'batch_loss/train': 0.8059860104694963}
12/25/2021 12:05:20 - INFO - codeparrot_training - Step 19199: {'lr': 0.000340703261708696, 'samples': 9830400, 'steps': 19199, 'batch_loss/train': 0.9015171446371824}
12/25/2021 12:05:30 - INFO - codeparrot_training - Step 19200: {'lr': 0.00034068857246340276, 'samples': 9830912, 'steps': 19200, 'batch_loss/train': 0.7309086811728776}
12/25/2021 12:05:41 - INFO - codeparrot_training - Step 19201: {'lr': 0.00034067388285756586, 'samples': 9831424, 'steps': 19201, 'batch_loss/train': 0.827351882122457}
12/25/2021 12:05:51 - INFO - codeparrot_training - Step 19202: {'lr': 0.000340659192891244, 'samples': 9831936, 'steps': 19202, 'batch_loss/train': 2.057128122076392}
12/25/2021 12:06:03 - INFO - codeparrot_training - Step 19203: {'lr': 0.0003406445025644953, 'samples': 9832448, 'steps': 19203, 'batch_loss/train': 0.7777504804544151}
12/25/2021 12:06:14 - INFO - codeparrot_training - Step 19204: {'lr': 0.00034062981187737827, 'samples': 9832960, 'steps': 19204, 'batch_loss/train': 0.8462946508079767}
12/25/2021 12:06:25 - INFO - codeparrot_training - Step 19205: {'lr': 0.00034061512082995146, 'samples': 9833472, 'steps': 19205, 'batch_loss/train': 0.7623741959687322}
12/25/2021 12:06:37 - INFO - codeparrot_training - Step 19206: {'lr': 0.00034060042942227306, 'samples': 9833984, 'steps': 19206, 'batch_loss/train': 0.7936516366899014}
12/25/2021 12:06:47 - INFO - codeparrot_training - Step 19207: {'lr': 0.00034058573765440156, 'samples': 9834496, 'steps': 19207, 'batch_loss/train': 0.7947227824479342}
12/25/2021 12:06:58 - INFO - codeparrot_training - Step 19208: {'lr': 0.00034057104552639533, 'samples': 9835008, 'steps': 19208, 'batch_loss/train': 0.7776137182954699}
12/25/2021 12:07:12 - INFO - codeparrot_training - Step 19209: {'lr': 0.00034055635303831294, 'samples': 9835520, 'steps': 19209, 'batch_loss/train': 0.8850781042128801}
12/25/2021 12:07:23 - INFO - codeparrot_training - Step 19210: {'lr': 0.0003405416601902126, 'samples': 9836032, 'steps': 19210, 'batch_loss/train': 0.5198158216662705}
12/25/2021 12:07:33 - INFO - codeparrot_training - Step 19211: {'lr': 0.0003405269669821528, 'samples': 9836544, 'steps': 19211, 'batch_loss/train': 0.7859416431747377}
12/25/2021 12:07:44 - INFO - codeparrot_training - Step 19212: {'lr': 0.000340512273414192, 'samples': 9837056, 'steps': 19212, 'batch_loss/train': 0.7990975477732718}
12/25/2021 12:07:56 - INFO - codeparrot_training - Step 19213: {'lr': 0.0003404975794863886, 'samples': 9837568, 'steps': 19213, 'batch_loss/train': 0.8221611212939024}
12/25/2021 12:08:06 - INFO - codeparrot_training - Step 19214: {'lr': 0.0003404828851988009, 'samples': 9838080, 'steps': 19214, 'batch_loss/train': 0.7936675241217017}
12/25/2021 12:08:17 - INFO - codeparrot_training - Step 19215: {'lr': 0.0003404681905514875, 'samples': 9838592, 'steps': 19215, 'batch_loss/train': 0.7257119668647647}
12/25/2021 12:08:31 - INFO - codeparrot_training - Step 19216: {'lr': 0.00034045349554450674, 'samples': 9839104, 'steps': 19216, 'batch_loss/train': 0.9313956471160054}
12/25/2021 12:08:41 - INFO - codeparrot_training - Step 19217: {'lr': 0.00034043880017791696, 'samples': 9839616, 'steps': 19217, 'batch_loss/train': 0.8897378882393241}
12/25/2021 12:08:52 - INFO - codeparrot_training - Step 19218: {'lr': 0.0003404241044517767, 'samples': 9840128, 'steps': 19218, 'batch_loss/train': 0.9634927017614245}
12/25/2021 12:09:04 - INFO - codeparrot_training - Step 19219: {'lr': 0.0003404094083661443, 'samples': 9840640, 'steps': 19219, 'batch_loss/train': 0.7487823311239481}
12/25/2021 12:09:15 - INFO - codeparrot_training - Step 19220: {'lr': 0.0003403947119210783, 'samples': 9841152, 'steps': 19220, 'batch_loss/train': 0.708446444477886}
12/25/2021 12:09:25 - INFO - codeparrot_training - Step 19221: {'lr': 0.000340380015116637, 'samples': 9841664, 'steps': 19221, 'batch_loss/train': 0.8127450309693813}
12/25/2021 12:09:38 - INFO - codeparrot_training - Step 19222: {'lr': 0.00034036531795287895, 'samples': 9842176, 'steps': 19222, 'batch_loss/train': 0.806622963398695}
12/25/2021 12:09:48 - INFO - codeparrot_training - Step 19223: {'lr': 0.0003403506204298625, 'samples': 9842688, 'steps': 19223, 'batch_loss/train': 0.7812731093727052}
12/25/2021 12:09:59 - INFO - codeparrot_training - Step 19224: {'lr': 0.00034033592254764606, 'samples': 9843200, 'steps': 19224, 'batch_loss/train': 0.7437905129045248}
12/25/2021 12:10:10 - INFO - codeparrot_training - Step 19225: {'lr': 0.0003403212243062881, 'samples': 9843712, 'steps': 19225, 'batch_loss/train': 0.7884960770606995}
12/25/2021 12:10:23 - INFO - codeparrot_training - Step 19226: {'lr': 0.00034030652570584714, 'samples': 9844224, 'steps': 19226, 'batch_loss/train': 0.6794148040935397}
12/25/2021 12:10:34 - INFO - codeparrot_training - Step 19227: {'lr': 0.00034029182674638137, 'samples': 9844736, 'steps': 19227, 'batch_loss/train': 0.7216609346214682}
12/25/2021 12:10:45 - INFO - codeparrot_training - Step 19228: {'lr': 0.0003402771274279495, 'samples': 9845248, 'steps': 19228, 'batch_loss/train': 0.7181802205741405}
12/25/2021 12:10:57 - INFO - codeparrot_training - Step 19229: {'lr': 0.0003402624277506099, 'samples': 9845760, 'steps': 19229, 'batch_loss/train': 0.8997695837169886}
12/25/2021 12:11:07 - INFO - codeparrot_training - Step 19230: {'lr': 0.0003402477277144209, 'samples': 9846272, 'steps': 19230, 'batch_loss/train': 0.8775286171585321}
12/25/2021 12:11:18 - INFO - codeparrot_training - Step 19231: {'lr': 0.000340233027319441, 'samples': 9846784, 'steps': 19231, 'batch_loss/train': 0.6771390154026449}
12/25/2021 12:11:30 - INFO - codeparrot_training - Step 19232: {'lr': 0.0003402183265657287, 'samples': 9847296, 'steps': 19232, 'batch_loss/train': 0.8364608865231276}
12/25/2021 12:11:41 - INFO - codeparrot_training - Step 19233: {'lr': 0.0003402036254533424, 'samples': 9847808, 'steps': 19233, 'batch_loss/train': 0.725783265195787}
12/25/2021 12:11:51 - INFO - codeparrot_training - Step 19234: {'lr': 0.00034018892398234046, 'samples': 9848320, 'steps': 19234, 'batch_loss/train': 0.7659849328920245}
12/25/2021 12:12:02 - INFO - codeparrot_training - Step 19235: {'lr': 0.0003401742221527815, 'samples': 9848832, 'steps': 19235, 'batch_loss/train': 0.8374696332029998}
12/25/2021 12:12:14 - INFO - codeparrot_training - Step 19236: {'lr': 0.00034015951996472387, 'samples': 9849344, 'steps': 19236, 'batch_loss/train': 0.7730953034479171}
12/25/2021 12:12:25 - INFO - codeparrot_training - Step 19237: {'lr': 0.000340144817418226, 'samples': 9849856, 'steps': 19237, 'batch_loss/train': 0.857890397310257}
12/25/2021 12:12:35 - INFO - codeparrot_training - Step 19238: {'lr': 0.0003401301145133464, 'samples': 9850368, 'steps': 19238, 'batch_loss/train': 0.6988323931582272}
12/25/2021 12:12:49 - INFO - codeparrot_training - Step 19239: {'lr': 0.0003401154112501436, 'samples': 9850880, 'steps': 19239, 'batch_loss/train': 0.6680547641590238}
12/25/2021 12:13:00 - INFO - codeparrot_training - Step 19240: {'lr': 0.00034010070762867573, 'samples': 9851392, 'steps': 19240, 'batch_loss/train': 0.8429321986623108}
12/25/2021 12:13:10 - INFO - codeparrot_training - Step 19241: {'lr': 0.00034008600364900153, 'samples': 9851904, 'steps': 19241, 'batch_loss/train': 0.8133578170090914}
12/25/2021 12:13:22 - INFO - codeparrot_training - Step 19242: {'lr': 0.0003400712993111795, 'samples': 9852416, 'steps': 19242, 'batch_loss/train': 0.8761980645358562}
12/25/2021 12:13:33 - INFO - codeparrot_training - Step 19243: {'lr': 0.00034005659461526787, 'samples': 9852928, 'steps': 19243, 'batch_loss/train': 0.7325364431599155}
12/25/2021 12:13:44 - INFO - codeparrot_training - Step 19244: {'lr': 0.0003400418895613254, 'samples': 9853440, 'steps': 19244, 'batch_loss/train': 0.678958993870765}
12/25/2021 12:13:58 - INFO - codeparrot_training - Step 19245: {'lr': 0.0003400271841494102, 'samples': 9853952, 'steps': 19245, 'batch_loss/train': 1.0644110655412078}
12/25/2021 12:14:08 - INFO - codeparrot_training - Step 19246: {'lr': 0.000340012478379581, 'samples': 9854464, 'steps': 19246, 'batch_loss/train': 0.7808265881612897}
12/25/2021 12:14:19 - INFO - codeparrot_training - Step 19247: {'lr': 0.0003399977722518961, 'samples': 9854976, 'steps': 19247, 'batch_loss/train': 0.7773832392413169}
12/25/2021 12:14:30 - INFO - codeparrot_training - Step 19248: {'lr': 0.00033998306576641415, 'samples': 9855488, 'steps': 19248, 'batch_loss/train': 0.8407233189791441}
12/25/2021 12:14:42 - INFO - codeparrot_training - Step 19249: {'lr': 0.0003399683589231934, 'samples': 9856000, 'steps': 19249, 'batch_loss/train': 1.2980567160993814}
12/25/2021 12:14:52 - INFO - codeparrot_training - Step 19250: {'lr': 0.0003399536517222925, 'samples': 9856512, 'steps': 19250, 'batch_loss/train': 0.7806986030191183}
12/25/2021 12:15:03 - INFO - codeparrot_training - Step 19251: {'lr': 0.0003399389441637698, 'samples': 9857024, 'steps': 19251, 'batch_loss/train': 0.7532025203108788}
12/25/2021 12:15:15 - INFO - codeparrot_training - Step 19252: {'lr': 0.00033992423624768387, 'samples': 9857536, 'steps': 19252, 'batch_loss/train': 0.8752811043523252}
12/25/2021 12:15:26 - INFO - codeparrot_training - Step 19253: {'lr': 0.00033990952797409305, 'samples': 9858048, 'steps': 19253, 'batch_loss/train': 0.833582203835249}
12/25/2021 12:15:36 - INFO - codeparrot_training - Step 19254: {'lr': 0.00033989481934305595, 'samples': 9858560, 'steps': 19254, 'batch_loss/train': 0.8135738829150796}
12/25/2021 12:15:50 - INFO - codeparrot_training - Step 19255: {'lr': 0.000339880110354631, 'samples': 9859072, 'steps': 19255, 'batch_loss/train': 0.7658934248611331}
12/25/2021 12:16:01 - INFO - codeparrot_training - Step 19256: {'lr': 0.00033986540100887656, 'samples': 9859584, 'steps': 19256, 'batch_loss/train': 0.7344338982366025}
12/25/2021 12:16:11 - INFO - codeparrot_training - Step 19257: {'lr': 0.00033985069130585137, 'samples': 9860096, 'steps': 19257, 'batch_loss/train': 0.7318161334842443}
12/25/2021 12:16:22 - INFO - codeparrot_training - Step 19258: {'lr': 0.00033983598124561365, 'samples': 9860608, 'steps': 19258, 'batch_loss/train': 0.8096662242896855}
12/25/2021 12:16:34 - INFO - codeparrot_training - Step 19259: {'lr': 0.00033982127082822207, 'samples': 9861120, 'steps': 19259, 'batch_loss/train': 0.6786584914661944}
12/25/2021 12:16:45 - INFO - codeparrot_training - Step 19260: {'lr': 0.000339806560053735, 'samples': 9861632, 'steps': 19260, 'batch_loss/train': 0.7559305829927325}
12/25/2021 12:16:55 - INFO - codeparrot_training - Step 19261: {'lr': 0.000339791848922211, 'samples': 9862144, 'steps': 19261, 'batch_loss/train': 0.7170963324606419}
12/25/2021 12:17:08 - INFO - codeparrot_training - Step 19262: {'lr': 0.00033977713743370844, 'samples': 9862656, 'steps': 19262, 'batch_loss/train': 0.5882945274934173}
12/25/2021 12:17:18 - INFO - codeparrot_training - Step 19263: {'lr': 0.00033976242558828593, 'samples': 9863168, 'steps': 19263, 'batch_loss/train': 0.8147554500028491}
12/25/2021 12:17:29 - INFO - codeparrot_training - Step 19264: {'lr': 0.00033974771338600185, 'samples': 9863680, 'steps': 19264, 'batch_loss/train': 0.9059812342748046}
12/25/2021 12:17:39 - INFO - codeparrot_training - Step 19265: {'lr': 0.00033973300082691476, 'samples': 9864192, 'steps': 19265, 'batch_loss/train': 0.7288140561431646}
12/25/2021 12:17:51 - INFO - codeparrot_training - Step 19266: {'lr': 0.0003397182879110832, 'samples': 9864704, 'steps': 19266, 'batch_loss/train': 0.860971000045538}
12/25/2021 12:18:02 - INFO - codeparrot_training - Step 19267: {'lr': 0.0003397035746385655, 'samples': 9865216, 'steps': 19267, 'batch_loss/train': 0.8421807899139822}
12/25/2021 12:18:13 - INFO - codeparrot_training - Step 19268: {'lr': 0.0003396888610094204, 'samples': 9865728, 'steps': 19268, 'batch_loss/train': 0.7515709511935711}
12/25/2021 12:18:26 - INFO - codeparrot_training - Step 19269: {'lr': 0.0003396741470237061, 'samples': 9866240, 'steps': 19269, 'batch_loss/train': 0.8370964704081416}
12/25/2021 12:18:37 - INFO - codeparrot_training - Step 19270: {'lr': 0.0003396594326814814, 'samples': 9866752, 'steps': 19270, 'batch_loss/train': 0.8260572580620646}
12/25/2021 12:18:48 - INFO - codeparrot_training - Step 19271: {'lr': 0.00033964471798280456, 'samples': 9867264, 'steps': 19271, 'batch_loss/train': 0.8172975797206163}
12/25/2021 12:19:00 - INFO - codeparrot_training - Step 19272: {'lr': 0.00033963000292773417, 'samples': 9867776, 'steps': 19272, 'batch_loss/train': 0.8916820334270597}
12/25/2021 12:19:10 - INFO - codeparrot_training - Step 19273: {'lr': 0.00033961528751632876, 'samples': 9868288, 'steps': 19273, 'batch_loss/train': 0.8783552851527929}
12/25/2021 12:19:21 - INFO - codeparrot_training - Step 19274: {'lr': 0.0003396005717486468, 'samples': 9868800, 'steps': 19274, 'batch_loss/train': 0.837614830583334}
12/25/2021 12:19:35 - INFO - codeparrot_training - Step 19275: {'lr': 0.00033958585562474683, 'samples': 9869312, 'steps': 19275, 'batch_loss/train': 0.8341867392882705}
12/25/2021 12:19:45 - INFO - codeparrot_training - Step 19276: {'lr': 0.00033957113914468725, 'samples': 9869824, 'steps': 19276, 'batch_loss/train': 0.9194378908723593}
12/25/2021 12:19:56 - INFO - codeparrot_training - Step 19277: {'lr': 0.00033955642230852666, 'samples': 9870336, 'steps': 19277, 'batch_loss/train': 0.8457382158376276}
12/25/2021 12:20:07 - INFO - codeparrot_training - Step 19278: {'lr': 0.00033954170511632357, 'samples': 9870848, 'steps': 19278, 'batch_loss/train': 0.8307268179487437}
12/25/2021 12:20:19 - INFO - codeparrot_training - Step 19279: {'lr': 0.00033952698756813647, 'samples': 9871360, 'steps': 19279, 'batch_loss/train': 0.8737841006368399}
12/25/2021 12:20:29 - INFO - codeparrot_training - Step 19280: {'lr': 0.0003395122696640239, 'samples': 9871872, 'steps': 19280, 'batch_loss/train': 0.6636199913918972}
12/25/2021 12:20:40 - INFO - codeparrot_training - Step 19281: {'lr': 0.0003394975514040443, 'samples': 9872384, 'steps': 19281, 'batch_loss/train': 0.7381147471023723}
12/25/2021 12:20:52 - INFO - codeparrot_training - Step 19282: {'lr': 0.0003394828327882562, 'samples': 9872896, 'steps': 19282, 'batch_loss/train': 0.7801781117450446}
12/25/2021 12:21:03 - INFO - codeparrot_training - Step 19283: {'lr': 0.00033946811381671825, 'samples': 9873408, 'steps': 19283, 'batch_loss/train': 0.7951333019882441}
12/25/2021 12:21:13 - INFO - codeparrot_training - Step 19284: {'lr': 0.00033945339448948874, 'samples': 9873920, 'steps': 19284, 'batch_loss/train': 0.7841847818344831}
12/25/2021 12:21:27 - INFO - codeparrot_training - Step 19285: {'lr': 0.0003394386748066264, 'samples': 9874432, 'steps': 19285, 'batch_loss/train': 0.8167680772021413}
12/25/2021 12:21:38 - INFO - codeparrot_training - Step 19286: {'lr': 0.0003394239547681896, 'samples': 9874944, 'steps': 19286, 'batch_loss/train': 0.8662865408696234}
12/25/2021 12:21:48 - INFO - codeparrot_training - Step 19287: {'lr': 0.00033940923437423697, 'samples': 9875456, 'steps': 19287, 'batch_loss/train': 0.8371666986495256}
12/25/2021 12:21:59 - INFO - 
codeparrot_training - Step 19288: {'lr': 0.000339394513624827, 'samples': 9875968, 'steps': 19288, 'batch_loss/train': 0.7968941889703274} 12/25/2021 12:22:11 - INFO - codeparrot_training - Step 19289: {'lr': 0.00033937979252001815, 'samples': 9876480, 'steps': 19289, 'batch_loss/train': 0.8440513792447746} 12/25/2021 12:22:22 - INFO - codeparrot_training - Step 19290: {'lr': 0.00033936507105986904, 'samples': 9876992, 'steps': 19290, 'batch_loss/train': 0.7158919097855687} 12/25/2021 12:22:32 - INFO - codeparrot_training - Step 19291: {'lr': 0.0003393503492444381, 'samples': 9877504, 'steps': 19291, 'batch_loss/train': 0.9276327099651098} 12/25/2021 12:22:46 - INFO - codeparrot_training - Step 19292: {'lr': 0.0003393356270737839, 'samples': 9878016, 'steps': 19292, 'batch_loss/train': 0.8498662589117885} 12/25/2021 12:22:56 - INFO - codeparrot_training - Step 19293: {'lr': 0.000339320904547965, 'samples': 9878528, 'steps': 19293, 'batch_loss/train': 0.7846953868865967} 12/25/2021 12:23:07 - INFO - codeparrot_training - Step 19294: {'lr': 0.00033930618166703997, 'samples': 9879040, 'steps': 19294, 'batch_loss/train': 0.8431632658466697} 12/25/2021 12:23:19 - INFO - codeparrot_training - Step 19295: {'lr': 0.0003392914584310671, 'samples': 9879552, 'steps': 19295, 'batch_loss/train': 0.8460046397522092} 12/25/2021 12:23:30 - INFO - codeparrot_training - Step 19296: {'lr': 0.0003392767348401053, 'samples': 9880064, 'steps': 19296, 'batch_loss/train': 0.6605680913198739} 12/25/2021 12:23:40 - INFO - codeparrot_training - Step 19297: {'lr': 0.00033926201089421284, 'samples': 9880576, 'steps': 19297, 'batch_loss/train': 1.0607068776153028} 12/25/2021 12:23:51 - INFO - codeparrot_training - Step 19298: {'lr': 0.00033924728659344834, 'samples': 9881088, 'steps': 19298, 'batch_loss/train': 0.8694544462487102} 12/25/2021 12:24:03 - INFO - codeparrot_training - Step 19299: {'lr': 0.00033923256193787033, 'samples': 9881600, 'steps': 19299, 'batch_loss/train': 
0.8839273434132338} 12/25/2021 12:24:13 - INFO - codeparrot_training - Step 19300: {'lr': 0.0003392178369275373, 'samples': 9882112, 'steps': 19300, 'batch_loss/train': 0.8268254012800753} 12/25/2021 12:24:24 - INFO - codeparrot_training - Step 19301: {'lr': 0.0003392031115625078, 'samples': 9882624, 'steps': 19301, 'batch_loss/train': 0.8229448152706027} 12/25/2021 12:24:38 - INFO - codeparrot_training - Step 19302: {'lr': 0.0003391883858428405, 'samples': 9883136, 'steps': 19302, 'batch_loss/train': 0.8231361033394933} 12/25/2021 12:24:49 - INFO - codeparrot_training - Step 19303: {'lr': 0.00033917365976859394, 'samples': 9883648, 'steps': 19303, 'batch_loss/train': 0.6140969628468156} 12/25/2021 12:24:59 - INFO - codeparrot_training - Step 19304: {'lr': 0.0003391589333398264, 'samples': 9884160, 'steps': 19304, 'batch_loss/train': 2.4370279079303145} 12/25/2021 12:25:11 - INFO - codeparrot_training - Step 19305: {'lr': 0.00033914420655659674, 'samples': 9884672, 'steps': 19305, 'batch_loss/train': 0.8492528246715665} 12/25/2021 12:25:22 - INFO - codeparrot_training - Step 19306: {'lr': 0.00033912947941896327, 'samples': 9885184, 'steps': 19306, 'batch_loss/train': 0.8324839128181338} 12/25/2021 12:25:33 - INFO - codeparrot_training - Step 19307: {'lr': 0.00033911475192698477, 'samples': 9885696, 'steps': 19307, 'batch_loss/train': 1.0911848321557045} 12/25/2021 12:25:43 - INFO - codeparrot_training - Step 19308: {'lr': 0.0003391000240807195, 'samples': 9886208, 'steps': 19308, 'batch_loss/train': 0.7994669154286385} 12/25/2021 12:25:56 - INFO - codeparrot_training - Step 19309: {'lr': 0.0003390852958802263, 'samples': 9886720, 'steps': 19309, 'batch_loss/train': 0.7968425117433071} 12/25/2021 12:26:07 - INFO - codeparrot_training - Step 19310: {'lr': 0.00033907056732556355, 'samples': 9887232, 'steps': 19310, 'batch_loss/train': 0.8974938169121742} 12/25/2021 12:26:17 - INFO - codeparrot_training - Step 19311: {'lr': 0.0003390558384167899, 'samples': 9887744, 
'steps': 19311, 'batch_loss/train': 0.6518733591074124} 12/25/2021 12:26:29 - INFO - codeparrot_training - Step 19312: {'lr': 0.0003390411091539638, 'samples': 9888256, 'steps': 19312, 'batch_loss/train': 0.9232479343190789} 12/25/2021 12:26:40 - INFO - codeparrot_training - Step 19313: {'lr': 0.000339026379537144, 'samples': 9888768, 'steps': 19313, 'batch_loss/train': 0.884498598985374} 12/25/2021 12:26:51 - INFO - codeparrot_training - Step 19314: {'lr': 0.00033901164956638875, 'samples': 9889280, 'steps': 19314, 'batch_loss/train': 0.8354439130052924} 12/25/2021 12:27:01 - INFO - codeparrot_training - Step 19315: {'lr': 0.00033899691924175687, 'samples': 9889792, 'steps': 19315, 'batch_loss/train': 1.17485567368567} 12/25/2021 12:27:15 - INFO - codeparrot_training - Step 19316: {'lr': 0.0003389821885633069, 'samples': 9890304, 'steps': 19316, 'batch_loss/train': 0.744038748787716} 12/25/2021 12:27:26 - INFO - codeparrot_training - Step 19317: {'lr': 0.00033896745753109723, 'samples': 9890816, 'steps': 19317, 'batch_loss/train': 0.8178095044568181} 12/25/2021 12:27:36 - INFO - codeparrot_training - Step 19318: {'lr': 0.0003389527261451866, 'samples': 9891328, 'steps': 19318, 'batch_loss/train': 0.7650678069330752} 12/25/2021 12:27:48 - INFO - codeparrot_training - Step 19319: {'lr': 0.00033893799440563356, 'samples': 9891840, 'steps': 19319, 'batch_loss/train': 0.834799088537693} 12/25/2021 12:27:59 - INFO - codeparrot_training - Step 19320: {'lr': 0.0003389232623124966, 'samples': 9892352, 'steps': 19320, 'batch_loss/train': 0.849893611157313} 12/25/2021 12:28:09 - INFO - codeparrot_training - Step 19321: {'lr': 0.00033890852986583433, 'samples': 9892864, 'steps': 19321, 'batch_loss/train': 0.8659488074481487} 12/25/2021 12:28:24 - INFO - codeparrot_training - Step 19322: {'lr': 0.0003388937970657053, 'samples': 9893376, 'steps': 19322, 'batch_loss/train': 0.8246587160974741} 12/25/2021 12:28:35 - INFO - codeparrot_training - Step 19323: {'lr': 
0.0003388790639121681, 'samples': 9893888, 'steps': 19323, 'batch_loss/train': 0.9084949418902397} 12/25/2021 12:28:45 - INFO - codeparrot_training - Step 19324: {'lr': 0.0003388643304052813, 'samples': 9894400, 'steps': 19324, 'batch_loss/train': 0.9398560160771012} 12/25/2021 12:28:56 - INFO - codeparrot_training - Step 19325: {'lr': 0.0003388495965451035, 'samples': 9894912, 'steps': 19325, 'batch_loss/train': 0.5448211496695876} 12/25/2021 12:29:08 - INFO - codeparrot_training - Step 19326: {'lr': 0.00033883486233169324, 'samples': 9895424, 'steps': 19326, 'batch_loss/train': 0.7943283058702946} 12/25/2021 12:29:19 - INFO - codeparrot_training - Step 19327: {'lr': 0.0003388201277651091, 'samples': 9895936, 'steps': 19327, 'batch_loss/train': 0.8224752424284816} 12/25/2021 12:29:29 - INFO - codeparrot_training - Step 19328: {'lr': 0.00033880539284540965, 'samples': 9896448, 'steps': 19328, 'batch_loss/train': 0.8651547282934189} 12/25/2021 12:29:43 - INFO - codeparrot_training - Step 19329: {'lr': 0.0003387906575726536, 'samples': 9896960, 'steps': 19329, 'batch_loss/train': 1.0318410927429795} 12/25/2021 12:29:54 - INFO - codeparrot_training - Step 19330: {'lr': 0.00033877592194689934, 'samples': 9897472, 'steps': 19330, 'batch_loss/train': 0.8160574720241129} 12/25/2021 12:30:04 - INFO - codeparrot_training - Step 19331: {'lr': 0.0003387611859682055, 'samples': 9897984, 'steps': 19331, 'batch_loss/train': 0.5522027559345588} 12/25/2021 12:30:17 - INFO - codeparrot_training - Step 19332: {'lr': 0.00033874644963663073, 'samples': 9898496, 'steps': 19332, 'batch_loss/train': 0.9340404695831239} 12/25/2021 12:30:27 - INFO - codeparrot_training - Step 19333: {'lr': 0.00033873171295223354, 'samples': 9899008, 'steps': 19333, 'batch_loss/train': 0.8018654566258192} 12/25/2021 12:30:38 - INFO - codeparrot_training - Step 19334: {'lr': 0.00033871697591507263, 'samples': 9899520, 'steps': 19334, 'batch_loss/train': 0.862503569573164} 12/25/2021 12:30:50 - INFO - 
codeparrot_training - Step 19335: {'lr': 0.00033870223852520655, 'samples': 9900032, 'steps': 19335, 'batch_loss/train': 0.5520425752620213} 12/25/2021 12:31:01 - INFO - codeparrot_training - Step 19336: {'lr': 0.0003386875007826938, 'samples': 9900544, 'steps': 19336, 'batch_loss/train': 0.8291188306175172} 12/25/2021 12:31:11 - INFO - codeparrot_training - Step 19337: {'lr': 0.0003386727626875931, 'samples': 9901056, 'steps': 19337, 'batch_loss/train': 0.7567511796951294} 12/25/2021 12:31:22 - INFO - codeparrot_training - Step 19338: {'lr': 0.00033865802423996283, 'samples': 9901568, 'steps': 19338, 'batch_loss/train': 0.800274739973247} 12/25/2021 12:31:36 - INFO - codeparrot_training - Step 19339: {'lr': 0.0003386432854398618, 'samples': 9902080, 'steps': 19339, 'batch_loss/train': 0.5765659349272028} 12/25/2021 12:31:46 - INFO - codeparrot_training - Step 19340: {'lr': 0.00033862854628734863, 'samples': 9902592, 'steps': 19340, 'batch_loss/train': 0.7019078400917351} 12/25/2021 12:31:57 - INFO - codeparrot_training - Step 19341: {'lr': 0.00033861380678248173, 'samples': 9903104, 'steps': 19341, 'batch_loss/train': 0.8051914614625275} 12/25/2021 12:32:09 - INFO - codeparrot_training - Step 19342: {'lr': 0.00033859906692531986, 'samples': 9903616, 'steps': 19342, 'batch_loss/train': 0.8097694981843233} 12/25/2021 12:32:20 - INFO - codeparrot_training - Step 19343: {'lr': 0.0003385843267159215, 'samples': 9904128, 'steps': 19343, 'batch_loss/train': 0.8434883709996939} 12/25/2021 12:32:30 - INFO - codeparrot_training - Step 19344: {'lr': 0.0003385695861543454, 'samples': 9904640, 'steps': 19344, 'batch_loss/train': 0.7655209245858714} 12/25/2021 12:32:44 - INFO - codeparrot_training - Step 19345: {'lr': 0.00033855484524064996, 'samples': 9905152, 'steps': 19345, 'batch_loss/train': 0.744946739403531} 12/25/2021 12:32:55 - INFO - codeparrot_training - Step 19346: {'lr': 0.0003385401039748939, 'samples': 9905664, 'steps': 19346, 'batch_loss/train': 
0.9013884267769754} 12/25/2021 12:33:06 - INFO - codeparrot_training - Step 19347: {'lr': 0.00033852536235713584, 'samples': 9906176, 'steps': 19347, 'batch_loss/train': 0.733587889932096} 12/25/2021 12:33:16 - INFO - codeparrot_training - Step 19348: {'lr': 0.0003385106203874344, 'samples': 9906688, 'steps': 19348, 'batch_loss/train': 0.7130904698278755} 12/25/2021 12:33:29 - INFO - codeparrot_training - Step 19349: {'lr': 0.0003384958780658481, 'samples': 9907200, 'steps': 19349, 'batch_loss/train': 0.8399328049272299} 12/25/2021 12:33:39 - INFO - codeparrot_training - Step 19350: {'lr': 0.0003384811353924356, 'samples': 9907712, 'steps': 19350, 'batch_loss/train': 0.7137036190833896} 12/25/2021 12:33:50 - INFO - codeparrot_training - Step 19351: {'lr': 0.0003384663923672555, 'samples': 9908224, 'steps': 19351, 'batch_loss/train': 0.8355257750954479} 12/25/2021 12:34:02 - INFO - codeparrot_training - Step 19352: {'lr': 0.0003384516489903665, 'samples': 9908736, 'steps': 19352, 'batch_loss/train': 0.588152613316197} 12/25/2021 12:34:12 - INFO - codeparrot_training - Step 19353: {'lr': 0.00033843690526182715, 'samples': 9909248, 'steps': 19353, 'batch_loss/train': 0.7416276726871729} 12/25/2021 12:34:23 - INFO - codeparrot_training - Step 19354: {'lr': 0.0003384221611816959, 'samples': 9909760, 'steps': 19354, 'batch_loss/train': 0.8428750834427774} 12/25/2021 12:34:35 - INFO - codeparrot_training - Step 19355: {'lr': 0.0003384074167500317, 'samples': 9910272, 'steps': 19355, 'batch_loss/train': 0.7009006827138364} 12/25/2021 12:34:46 - INFO - codeparrot_training - Step 19356: {'lr': 0.00033839267196689296, 'samples': 9910784, 'steps': 19356, 'batch_loss/train': 0.7202132632955909} 12/25/2021 12:34:56 - INFO - codeparrot_training - Step 19357: {'lr': 0.0003383779268323383, 'samples': 9911296, 'steps': 19357, 'batch_loss/train': 0.684387510176748} 12/25/2021 12:35:07 - INFO - codeparrot_training - Step 19358: {'lr': 0.0003383631813464264, 'samples': 9911808, 
'steps': 19358, 'batch_loss/train': 0.8072540980065241} 12/25/2021 12:35:21 - INFO - codeparrot_training - Step 19359: {'lr': 0.0003383484355092158, 'samples': 9912320, 'steps': 19359, 'batch_loss/train': 0.7332202726975083} 12/25/2021 12:35:32 - INFO - codeparrot_training - Step 19360: {'lr': 0.0003383336893207652, 'samples': 9912832, 'steps': 19360, 'batch_loss/train': 0.7355721286730841} 12/25/2021 12:35:42 - INFO - codeparrot_training - Step 19361: {'lr': 0.00033831894278113327, 'samples': 9913344, 'steps': 19361, 'batch_loss/train': 0.7869819216430187} 12/25/2021 12:35:54 - INFO - codeparrot_training - Step 19362: {'lr': 0.0003383041958903785, 'samples': 9913856, 'steps': 19362, 'batch_loss/train': 0.7605577651411295} 12/25/2021 12:36:05 - INFO - codeparrot_training - Step 19363: {'lr': 0.0003382894486485596, 'samples': 9914368, 'steps': 19363, 'batch_loss/train': 0.7715198397636414} 12/25/2021 12:36:15 - INFO - codeparrot_training - Step 19364: {'lr': 0.0003382747010557352, 'samples': 9914880, 'steps': 19364, 'batch_loss/train': 0.7917483597993851} 12/25/2021 12:36:28 - INFO - codeparrot_training - Step 19365: {'lr': 0.0003382599531119639, 'samples': 9915392, 'steps': 19365, 'batch_loss/train': 0.8743238003225997} 12/25/2021 12:36:38 - INFO - codeparrot_training - Step 19366: {'lr': 0.0003382452048173044, 'samples': 9915904, 'steps': 19366, 'batch_loss/train': 0.6493665737798437} 12/25/2021 12:36:49 - INFO - codeparrot_training - Step 19367: {'lr': 0.0003382304561718152, 'samples': 9916416, 'steps': 19367, 'batch_loss/train': 0.6234127681236714} 12/25/2021 12:37:03 - INFO - codeparrot_training - Step 19368: {'lr': 0.0003382157071755551, 'samples': 9916928, 'steps': 19368, 'batch_loss/train': 0.6789052695967257} 12/25/2021 12:37:13 - INFO - codeparrot_training - Step 19369: {'lr': 0.0003382009578285826, 'samples': 9917440, 'steps': 19369, 'batch_loss/train': 0.8423147303983569} 12/25/2021 12:37:24 - INFO - codeparrot_training - Step 19370: {'lr': 
0.00033818620813095646, 'samples': 9917952, 'steps': 19370, 'batch_loss/train': 0.7237131893634796} 12/25/2021 12:37:34 - INFO - codeparrot_training - Step 19371: {'lr': 0.0003381714580827352, 'samples': 9918464, 'steps': 19371, 'batch_loss/train': 0.6767357416101731} 12/25/2021 12:37:46 - INFO - codeparrot_training - Step 19372: {'lr': 0.00033815670768397755, 'samples': 9918976, 'steps': 19372, 'batch_loss/train': 0.6944130249321461} 12/25/2021 12:37:57 - INFO - codeparrot_training - Step 19373: {'lr': 0.0003381419569347421, 'samples': 9919488, 'steps': 19373, 'batch_loss/train': 0.7984233507886529} 12/25/2021 12:38:08 - INFO - codeparrot_training - Step 19374: {'lr': 0.00033812720583508755, 'samples': 9920000, 'steps': 19374, 'batch_loss/train': 0.8529981952160597} 12/25/2021 12:38:21 - INFO - codeparrot_training - Step 19375: {'lr': 0.00033811245438507243, 'samples': 9920512, 'steps': 19375, 'batch_loss/train': 0.6500848298892379} 12/25/2021 12:38:32 - INFO - codeparrot_training - Step 19376: {'lr': 0.0003380977025847555, 'samples': 9921024, 'steps': 19376, 'batch_loss/train': 0.8099347399547696} 12/25/2021 12:38:43 - INFO - codeparrot_training - Step 19377: {'lr': 0.0003380829504341954, 'samples': 9921536, 'steps': 19377, 'batch_loss/train': 0.6396290876436979} 12/25/2021 12:38:55 - INFO - codeparrot_training - Step 19378: {'lr': 0.0003380681979334507, 'samples': 9922048, 'steps': 19378, 'batch_loss/train': 0.7449266784824431} 12/25/2021 12:39:06 - INFO - codeparrot_training - Step 19379: {'lr': 0.0003380534450825801, 'samples': 9922560, 'steps': 19379, 'batch_loss/train': 0.6433963664458133} 12/25/2021 12:39:16 - INFO - codeparrot_training - Step 19380: {'lr': 0.0003380386918816423, 'samples': 9923072, 'steps': 19380, 'batch_loss/train': 0.8510670620016754} 12/25/2021 12:39:28 - INFO - codeparrot_training - Step 19381: {'lr': 0.00033802393833069587, 'samples': 9923584, 'steps': 19381, 'batch_loss/train': 0.7108434103429317} 12/25/2021 12:39:39 - INFO - 
codeparrot_training - Step 19382: {'lr': 0.00033800918442979946, 'samples': 9924096, 'steps': 19382, 'batch_loss/train': 0.8746078601107001} 12/25/2021 12:39:49 - INFO - codeparrot_training - Step 19383: {'lr': 0.0003379944301790118, 'samples': 9924608, 'steps': 19383, 'batch_loss/train': 0.7208983050659299} 12/25/2021 12:40:00 - INFO - codeparrot_training - Step 19384: {'lr': 0.00033797967557839154, 'samples': 9925120, 'steps': 19384, 'batch_loss/train': 0.7741072345525026} 12/25/2021 12:40:12 - INFO - codeparrot_training - Step 19385: {'lr': 0.0003379649206279973, 'samples': 9925632, 'steps': 19385, 'batch_loss/train': 0.8517016232945025} 12/25/2021 12:40:23 - INFO - codeparrot_training - Step 19386: {'lr': 0.00033795016532788767, 'samples': 9926144, 'steps': 19386, 'batch_loss/train': 0.6145481518469751} 12/25/2021 12:40:33 - INFO - codeparrot_training - Step 19387: {'lr': 0.00033793540967812154, 'samples': 9926656, 'steps': 19387, 'batch_loss/train': 0.8279763795435429} 12/25/2021 12:40:45 - INFO - codeparrot_training - Step 19388: {'lr': 0.0003379206536787573, 'samples': 9927168, 'steps': 19388, 'batch_loss/train': 0.9200336057692766} 12/25/2021 12:40:56 - INFO - codeparrot_training - Step 19389: {'lr': 0.0003379058973298537, 'samples': 9927680, 'steps': 19389, 'batch_loss/train': 0.7627336527220905} 12/25/2021 12:41:06 - INFO - codeparrot_training - Step 19390: {'lr': 0.00033789114063146957, 'samples': 9928192, 'steps': 19390, 'batch_loss/train': 0.8120149010792375} 12/25/2021 12:41:20 - INFO - codeparrot_training - Step 19391: {'lr': 0.0003378763835836634, 'samples': 9928704, 'steps': 19391, 'batch_loss/train': 0.6980343894101679} 12/25/2021 12:41:31 - INFO - codeparrot_training - Step 19392: {'lr': 0.0003378616261864939, 'samples': 9929216, 'steps': 19392, 'batch_loss/train': 0.7588065415620804} 12/25/2021 12:41:42 - INFO - codeparrot_training - Step 19393: {'lr': 0.00033784686844001976, 'samples': 9929728, 'steps': 19393, 'batch_loss/train': 
0.7196375534404069} 12/25/2021 12:41:52 - INFO - codeparrot_training - Step 19394: {'lr': 0.00033783211034429966, 'samples': 9930240, 'steps': 19394, 'batch_loss/train': 0.7724633943289518} 12/25/2021 12:42:04 - INFO - codeparrot_training - Step 19395: {'lr': 0.00033781735189939224, 'samples': 9930752, 'steps': 19395, 'batch_loss/train': 0.795247389934957} 12/25/2021 12:42:15 - INFO - codeparrot_training - Step 19396: {'lr': 0.00033780259310535617, 'samples': 9931264, 'steps': 19396, 'batch_loss/train': 0.7588893249630928} 12/25/2021 12:42:25 - INFO - codeparrot_training - Step 19397: {'lr': 0.00033778783396225015, 'samples': 9931776, 'steps': 19397, 'batch_loss/train': 0.7097208080813289} 12/25/2021 12:42:39 - INFO - codeparrot_training - Step 19398: {'lr': 0.00033777307447013284, 'samples': 9932288, 'steps': 19398, 'batch_loss/train': 0.7792844139039516} 12/25/2021 12:42:50 - INFO - codeparrot_training - Step 19399: {'lr': 0.00033775831462906294, 'samples': 9932800, 'steps': 19399, 'batch_loss/train': 0.9447717517614365} 12/25/2021 12:43:00 - INFO - codeparrot_training - Step 19400: {'lr': 0.0003377435544390992, 'samples': 9933312, 'steps': 19400, 'batch_loss/train': 0.6418302831589244} 12/25/2021 12:43:12 - INFO - codeparrot_training - Step 19401: {'lr': 0.0003377287939003, 'samples': 9933824, 'steps': 19401, 'batch_loss/train': 0.8413646255503409} 12/25/2021 12:43:23 - INFO - codeparrot_training - Step 19402: {'lr': 0.0003377140330127244, 'samples': 9934336, 'steps': 19402, 'batch_loss/train': 0.7187584121711552} 12/25/2021 12:43:34 - INFO - codeparrot_training - Step 19403: {'lr': 0.00033769927177643096, 'samples': 9934848, 'steps': 19403, 'batch_loss/train': 0.8486006706953049} 12/25/2021 12:43:46 - INFO - codeparrot_training - Step 19404: {'lr': 0.0003376845101914783, 'samples': 9935360, 'steps': 19404, 'batch_loss/train': 0.8268695357255638} 12/25/2021 12:43:56 - INFO - codeparrot_training - Step 19405: {'lr': 0.00033766974825792504, 'samples': 9935872, 
'steps': 19405, 'batch_loss/train': 0.6475267157074995} 12/25/2021 12:44:07 - INFO - codeparrot_training - Step 19406: {'lr': 0.00033765498597583006, 'samples': 9936384, 'steps': 19406, 'batch_loss/train': 0.7983793709427118} 12/25/2021 12:44:18 - INFO - codeparrot_training - Step 19407: {'lr': 0.00033764022334525196, 'samples': 9936896, 'steps': 19407, 'batch_loss/train': 0.7317205062136054} 12/25/2021 12:44:31 - INFO - codeparrot_training - Step 19408: {'lr': 0.0003376254603662493, 'samples': 9937408, 'steps': 19408, 'batch_loss/train': 0.6969330782594625} 12/25/2021 12:44:42 - INFO - codeparrot_training - Step 19409: {'lr': 0.00033761069703888106, 'samples': 9937920, 'steps': 19409, 'batch_loss/train': 0.732094376347959} 12/25/2021 12:44:52 - INFO - codeparrot_training - Step 19410: {'lr': 0.0003375959333632057, 'samples': 9938432, 'steps': 19410, 'batch_loss/train': 0.7256546563003212} 12/25/2021 12:45:04 - INFO - codeparrot_training - Step 19411: {'lr': 0.00033758116933928196, 'samples': 9938944, 'steps': 19411, 'batch_loss/train': 0.7576970357913524} 12/25/2021 12:45:15 - INFO - codeparrot_training - Step 19412: {'lr': 0.00033756640496716855, 'samples': 9939456, 'steps': 19412, 'batch_loss/train': 0.7894995650276542} 12/25/2021 12:45:26 - INFO - codeparrot_training - Step 19413: {'lr': 0.00033755164024692417, 'samples': 9939968, 'steps': 19413, 'batch_loss/train': 0.7735016765072942} 12/25/2021 12:45:39 - INFO - codeparrot_training - Step 19414: {'lr': 0.0003375368751786075, 'samples': 9940480, 'steps': 19414, 'batch_loss/train': 0.9177544554695487} 12/25/2021 12:45:50 - INFO - codeparrot_training - Step 19415: {'lr': 0.0003375221097622774, 'samples': 9940992, 'steps': 19415, 'batch_loss/train': 0.7165138954296708} 12/25/2021 12:46:01 - INFO - codeparrot_training - Step 19416: {'lr': 0.00033750734399799244, 'samples': 9941504, 'steps': 19416, 'batch_loss/train': 0.7680450156331062} 12/25/2021 12:46:13 - INFO - codeparrot_training - Step 19417: {'lr': 
0.0003374925778858112, 'samples': 9942016, 'steps': 19417, 'batch_loss/train': 0.6961935793515295} 12/25/2021 12:46:24 - INFO - codeparrot_training - Step 19418: {'lr': 0.00033747781142579253, 'samples': 9942528, 'steps': 19418, 'batch_loss/train': 0.6933922371827066} 12/25/2021 12:46:34 - INFO - codeparrot_training - Step 19419: {'lr': 0.00033746304461799515, 'samples': 9943040, 'steps': 19419, 'batch_loss/train': 0.6719983834773302} 12/25/2021 12:46:45 - INFO - codeparrot_training - Step 19420: {'lr': 0.00033744827746247776, 'samples': 9943552, 'steps': 19420, 'batch_loss/train': 0.7082045205170289} 12/25/2021 12:46:57 - INFO - codeparrot_training - Step 19421: {'lr': 0.00033743350995929896, 'samples': 9944064, 'steps': 19421, 'batch_loss/train': 0.7374995639547706} 12/25/2021 12:47:07 - INFO - codeparrot_training - Step 19422: {'lr': 0.00033741874210851757, 'samples': 9944576, 'steps': 19422, 'batch_loss/train': 0.7369820456951857} 12/25/2021 12:47:18 - INFO - codeparrot_training - Step 19423: {'lr': 0.0003374039739101923, 'samples': 9945088, 'steps': 19423, 'batch_loss/train': 0.7720836885273457} 12/25/2021 12:47:30 - INFO - codeparrot_training - Step 19424: {'lr': 0.00033738920536438175, 'samples': 9945600, 'steps': 19424, 'batch_loss/train': 0.9513475799467415} 12/25/2021 12:47:41 - INFO - codeparrot_training - Step 19425: {'lr': 0.0003373744364711448, 'samples': 9946112, 'steps': 19425, 'batch_loss/train': 0.6246646911313292} 12/25/2021 12:47:51 - INFO - codeparrot_training - Step 19426: {'lr': 0.00033735966723054, 'samples': 9946624, 'steps': 19426, 'batch_loss/train': 0.6291450849093962} 12/25/2021 12:48:05 - INFO - codeparrot_training - Step 19427: {'lr': 0.0003373448976426262, 'samples': 9947136, 'steps': 19427, 'batch_loss/train': 0.7759872823953629} 12/25/2021 12:48:16 - INFO - codeparrot_training - Step 19428: {'lr': 0.00033733012770746206, 'samples': 9947648, 'steps': 19428, 'batch_loss/train': 0.7636505807749927} 12/25/2021 12:48:26 - INFO - 
codeparrot_training - Step 19429: {'lr': 0.00033731535742510636, 'samples': 9948160, 'steps': 19429, 'batch_loss/train': 0.7741937031969428} 12/25/2021 12:48:37 - INFO - codeparrot_training - Step 19430: {'lr': 0.0003373005867956177, 'samples': 9948672, 'steps': 19430, 'batch_loss/train': 0.6283999133738689} 12/25/2021 12:48:49 - INFO - codeparrot_training - Step 19431: {'lr': 0.00033728581581905494, 'samples': 9949184, 'steps': 19431, 'batch_loss/train': 0.6948893535882235} 12/25/2021 12:49:00 - INFO - codeparrot_training - Step 19432: {'lr': 0.00033727104449547663, 'samples': 9949696, 'steps': 19432, 'batch_loss/train': 0.7809727797284722} 12/25/2021 12:49:10 - INFO - codeparrot_training - Step 19433: {'lr': 0.0003372562728249417, 'samples': 9950208, 'steps': 19433, 'batch_loss/train': 0.7889904626645148} 12/25/2021 12:49:22 - INFO - codeparrot_training - Step 19434: {'lr': 0.0003372415008075087, 'samples': 9950720, 'steps': 19434, 'batch_loss/train': 0.7288680793717504} 12/25/2021 12:49:33 - INFO - codeparrot_training - Step 19435: {'lr': 0.0003372267284432365, 'samples': 9951232, 'steps': 19435, 'batch_loss/train': 0.6330698125530034} 12/25/2021 12:49:43 - INFO - codeparrot_training - Step 19436: {'lr': 0.00033721195573218374, 'samples': 9951744, 'steps': 19436, 'batch_loss/train': 0.7777733774855733} 12/25/2021 12:49:57 - INFO - codeparrot_training - Step 19437: {'lr': 0.00033719718267440915, 'samples': 9952256, 'steps': 19437, 'batch_loss/train': 0.7602188624441624} 12/25/2021 12:50:08 - INFO - codeparrot_training - Step 19438: {'lr': 0.0003371824092699715, 'samples': 9952768, 'steps': 19438, 'batch_loss/train': 0.7915907618589699} 12/25/2021 12:50:18 - INFO - codeparrot_training - Step 19439: {'lr': 0.0003371676355189295, 'samples': 9953280, 'steps': 19439, 'batch_loss/train': 0.8411492267623544} 12/25/2021 12:50:30 - INFO - codeparrot_training - Step 19440: {'lr': 0.0003371528614213419, 'samples': 9953792, 'steps': 19440, 'batch_loss/train': 
0.7354329442605376} 12/25/2021 12:50:41 - INFO - codeparrot_training - Step 19441: {'lr': 0.0003371380869772674, 'samples': 9954304, 'steps': 19441, 'batch_loss/train': 0.7549167964607477} 12/25/2021 12:50:52 - INFO - codeparrot_training - Step 19442: {'lr': 0.0003371233121867648, 'samples': 9954816, 'steps': 19442, 'batch_loss/train': 0.8019930555019528} 12/25/2021 12:51:02 - INFO - codeparrot_training - Step 19443: {'lr': 0.00033710853704989286, 'samples': 9955328, 'steps': 19443, 'batch_loss/train': 0.7868570648133755} 12/25/2021 12:51:16 - INFO - codeparrot_training - Step 19444: {'lr': 0.0003370937615667102, 'samples': 9955840, 'steps': 19444, 'batch_loss/train': 0.7407513475045562} 12/25/2021 12:51:27 - INFO - codeparrot_training - Step 19445: {'lr': 0.00033707898573727565, 'samples': 9956352, 'steps': 19445, 'batch_loss/train': 0.5883020524051972} 12/25/2021 12:51:37 - INFO - codeparrot_training - Step 19446: {'lr': 0.0003370642095616479, 'samples': 9956864, 'steps': 19446, 'batch_loss/train': 0.8137112045660615} 12/25/2021 12:51:49 - INFO - codeparrot_training - Step 19447: {'lr': 0.00033704943303988576, 'samples': 9957376, 'steps': 19447, 'batch_loss/train': 0.7620361624285579} 12/25/2021 12:52:00 - INFO - codeparrot_training - Step 19448: {'lr': 0.00033703465617204796, 'samples': 9957888, 'steps': 19448, 'batch_loss/train': 0.7727515157312155} 12/25/2021 12:52:10 - INFO - codeparrot_training - Step 19449: {'lr': 0.0003370198789581932, 'samples': 9958400, 'steps': 19449, 'batch_loss/train': 0.7484184964559972} 12/25/2021 12:52:23 - INFO - codeparrot_training - Step 19450: {'lr': 0.0003370051013983802, 'samples': 9958912, 'steps': 19450, 'batch_loss/train': 0.7778863264247775} 12/25/2021 12:52:33 - INFO - codeparrot_training - Step 19451: {'lr': 0.0003369903234926679, 'samples': 9959424, 'steps': 19451, 'batch_loss/train': 0.7338932123966515} 12/25/2021 12:52:44 - INFO - codeparrot_training - Step 19452: {'lr': 0.00033697554524111484, 'samples': 9959936, 
'steps': 19452, 'batch_loss/train': 0.8594960253685713} 12/25/2021 12:52:54 - INFO - codeparrot_training - Step 19453: {'lr': 0.00033696076664377984, 'samples': 9960448, 'steps': 19453, 'batch_loss/train': 0.7499636947759427} 12/25/2021 12:53:08 - INFO - codeparrot_training - Step 19454: {'lr': 0.00033694598770072173, 'samples': 9960960, 'steps': 19454, 'batch_loss/train': 0.9638900137506425} 12/25/2021 12:53:19 - INFO - codeparrot_training - Step 19455: {'lr': 0.0003369312084119992, 'samples': 9961472, 'steps': 19455, 'batch_loss/train': 0.7352330787107348} 12/25/2021 12:53:30 - INFO - codeparrot_training - Step 19456: {'lr': 0.0003369164287776709, 'samples': 9961984, 'steps': 19456, 'batch_loss/train': 0.7696544080972672} 12/25/2021 12:53:42 - INFO - codeparrot_training - Step 19457: {'lr': 0.00033690164879779583, 'samples': 9962496, 'steps': 19457, 'batch_loss/train': 0.7668086107587442} 12/25/2021 12:53:53 - INFO - codeparrot_training - Step 19458: {'lr': 0.00033688686847243253, 'samples': 9963008, 'steps': 19458, 'batch_loss/train': 1.0413974225521088} 12/25/2021 12:54:03 - INFO - codeparrot_training - Step 19459: {'lr': 0.0003368720878016399, 'samples': 9963520, 'steps': 19459, 'batch_loss/train': 0.755291067995131} 12/25/2021 12:54:15 - INFO - codeparrot_training - Step 19460: {'lr': 0.0003368573067854766, 'samples': 9964032, 'steps': 19460, 'batch_loss/train': 0.8276866618543863} 12/25/2021 12:54:26 - INFO - codeparrot_training - Step 19461: {'lr': 0.0003368425254240015, 'samples': 9964544, 'steps': 19461, 'batch_loss/train': 0.7490599304437637} 12/25/2021 12:54:36 - INFO - codeparrot_training - Step 19462: {'lr': 0.0003368277437172733, 'samples': 9965056, 'steps': 19462, 'batch_loss/train': 0.7515160739421844} 12/25/2021 12:54:47 - INFO - codeparrot_training - Step 19463: {'lr': 0.0003368129616653507, 'samples': 9965568, 'steps': 19463, 'batch_loss/train': 0.7418404575437307} 12/25/2021 12:54:59 - INFO - codeparrot_training - Step 19464: {'lr': 
0.00033679817926829267, 'samples': 9966080, 'steps': 19464, 'batch_loss/train': 0.8359878128394485} 12/25/2021 12:55:09 - INFO - codeparrot_training - Step 19465: {'lr': 0.0003367833965261577, 'samples': 9966592, 'steps': 19465, 'batch_loss/train': 0.9379617301747203} 12/25/2021 12:55:20 - INFO - codeparrot_training - Step 19466: {'lr': 0.0003367686134390049, 'samples': 9967104, 'steps': 19466, 'batch_loss/train': 0.7653883434832096} 12/25/2021 12:55:34 - INFO - codeparrot_training - Step 19467: {'lr': 0.0003367538300068927, 'samples': 9967616, 'steps': 19467, 'batch_loss/train': 0.8229397814720869} 12/25/2021 12:55:44 - INFO - codeparrot_training - Step 19468: {'lr': 0.0003367390462298802, 'samples': 9968128, 'steps': 19468, 'batch_loss/train': 0.7416500453837216} 12/25/2021 12:55:55 - INFO - codeparrot_training - Step 19469: {'lr': 0.00033672426210802576, 'samples': 9968640, 'steps': 19469, 'batch_loss/train': 0.7542283404618502} 12/25/2021 12:56:07 - INFO - codeparrot_training - Step 19470: {'lr': 0.0003367094776413886, 'samples': 9969152, 'steps': 19470, 'batch_loss/train': 0.7418921869248152} 12/25/2021 12:56:18 - INFO - codeparrot_training - Step 19471: {'lr': 0.0003366946928300272, 'samples': 9969664, 'steps': 19471, 'batch_loss/train': 0.5266873928485438} 12/25/2021 12:56:28 - INFO - codeparrot_training - Step 19472: {'lr': 0.00033667990767400046, 'samples': 9970176, 'steps': 19472, 'batch_loss/train': 0.7845820793882012} 12/25/2021 12:56:42 - INFO - codeparrot_training - Step 19473: {'lr': 0.0003366651221733671, 'samples': 9970688, 'steps': 19473, 'batch_loss/train': 0.7829364137724042} 12/25/2021 12:56:53 - INFO - codeparrot_training - Step 19474: {'lr': 0.00033665033632818603, 'samples': 9971200, 'steps': 19474, 'batch_loss/train': 0.7919726893305779} 12/25/2021 12:57:03 - INFO - codeparrot_training - Step 19475: {'lr': 0.00033663555013851586, 'samples': 9971712, 'steps': 19475, 'batch_loss/train': 0.7807818725705147} 12/25/2021 12:57:14 - INFO - 
codeparrot_training - Step 19476: {'lr': 0.0003366207636044154, 'samples': 9972224, 'steps': 19476, 'batch_loss/train': 0.7273260275833309} 12/25/2021 12:57:26 - INFO - codeparrot_training - Step 19477: {'lr': 0.00033660597672594363, 'samples': 9972736, 'steps': 19477, 'batch_loss/train': 0.8487360626459122} 12/25/2021 12:57:37 - INFO - codeparrot_training - Step 19478: {'lr': 0.00033659118950315917, 'samples': 9973248, 'steps': 19478, 'batch_loss/train': 0.7318122694268823} 12/25/2021 12:57:47 - INFO - codeparrot_training - Step 19479: {'lr': 0.0003365764019361208, 'samples': 9973760, 'steps': 19479, 'batch_loss/train': 0.711569006962236} 12/25/2021 12:57:59 - INFO - codeparrot_training - Step 19480: {'lr': 0.00033656161402488734, 'samples': 9974272, 'steps': 19480, 'batch_loss/train': 0.7653743466362357} 12/25/2021 12:58:10 - INFO - codeparrot_training - Step 19481: {'lr': 0.0003365468257695176, 'samples': 9974784, 'steps': 19481, 'batch_loss/train': 0.7648177943192422} 12/25/2021 12:58:20 - INFO - codeparrot_training - Step 19482: {'lr': 0.0003365320371700703, 'samples': 9975296, 'steps': 19482, 'batch_loss/train': 0.7336982823908329} 12/25/2021 12:58:34 - INFO - codeparrot_training - Step 19483: {'lr': 0.0003365172482266044, 'samples': 9975808, 'steps': 19483, 'batch_loss/train': 0.7694850983098149} 12/25/2021 12:58:45 - INFO - codeparrot_training - Step 19484: {'lr': 0.0003365024589391785, 'samples': 9976320, 'steps': 19484, 'batch_loss/train': 0.7658127751201391} 12/25/2021 12:58:55 - INFO - codeparrot_training - Step 19485: {'lr': 0.00033648766930785155, 'samples': 9976832, 'steps': 19485, 'batch_loss/train': 0.6543244779459201} 12/25/2021 12:59:06 - INFO - codeparrot_training - Step 19486: {'lr': 0.0003364728793326823, 'samples': 9977344, 'steps': 19486, 'batch_loss/train': 0.8618660358479246} 12/25/2021 12:59:18 - INFO - codeparrot_training - Step 19487: {'lr': 0.0003364580890137294, 'samples': 9977856, 'steps': 19487, 'batch_loss/train': 
1.5762820485979319} 12/25/2021 12:59:29 - INFO - codeparrot_training - Step 19488: {'lr': 0.000336443298351052, 'samples': 9978368, 'steps': 19488, 'batch_loss/train': 0.8575445632450283} 12/25/2021 12:59:39 - INFO - codeparrot_training - Step 19489: {'lr': 0.00033642850734470845, 'samples': 9978880, 'steps': 19489, 'batch_loss/train': 0.9178540555294603} 12/25/2021 12:59:53 - INFO - codeparrot_training - Step 19490: {'lr': 0.00033641371599475793, 'samples': 9979392, 'steps': 19490, 'batch_loss/train': 0.6045818163838703} 12/25/2021 13:00:04 - INFO - codeparrot_training - Step 19491: {'lr': 0.0003363989243012591, 'samples': 9979904, 'steps': 19491, 'batch_loss/train': 0.7224711300805211} 12/25/2021 13:00:14 - INFO - codeparrot_training - Step 19492: {'lr': 0.0003363841322642707, 'samples': 9980416, 'steps': 19492, 'batch_loss/train': 0.7652281546033919} 12/25/2021 13:00:26 - INFO - codeparrot_training - Step 19493: {'lr': 0.00033636933988385163, 'samples': 9980928, 'steps': 19493, 'batch_loss/train': 0.7678417284041643} 12/25/2021 13:00:37 - INFO - codeparrot_training - Step 19494: {'lr': 0.0003363545471600608, 'samples': 9981440, 'steps': 19494, 'batch_loss/train': 0.8204493094235659} 12/25/2021 13:00:48 - INFO - codeparrot_training - Step 19495: {'lr': 0.0003363397540929567, 'samples': 9981952, 'steps': 19495, 'batch_loss/train': 0.8276125607080758} 12/25/2021 13:00:58 - INFO - codeparrot_training - Step 19496: {'lr': 0.0003363249606825985, 'samples': 9982464, 'steps': 19496, 'batch_loss/train': 0.7945496346801519} 12/25/2021 13:01:10 - INFO - codeparrot_training - Step 19497: {'lr': 0.00033631016692904474, 'samples': 9982976, 'steps': 19497, 'batch_loss/train': 0.8785252310335636} 12/25/2021 13:01:21 - INFO - codeparrot_training - Step 19498: {'lr': 0.0003362953728323544, 'samples': 9983488, 'steps': 19498, 'batch_loss/train': 0.6862039682455361} 12/25/2021 13:01:31 - INFO - codeparrot_training - Step 19499: {'lr': 0.0003362805783925862, 'samples': 9984000, 
'steps': 19499, 'batch_loss/train': 0.7830887269228697} 12/25/2021 13:01:44 - INFO - codeparrot_training - Step 19500: {'lr': 0.00033626578360979904, 'samples': 9984512, 'steps': 19500, 'batch_loss/train': 0.5786883262917399} 12/25/2021 13:01:54 - INFO - codeparrot_training - Step 19501: {'lr': 0.00033625098848405173, 'samples': 9985024, 'steps': 19501, 'batch_loss/train': 0.8443183358758688} 12/25/2021 13:02:05 - INFO - codeparrot_training - Step 19502: {'lr': 0.000336236193015403, 'samples': 9985536, 'steps': 19502, 'batch_loss/train': 0.8571350369602442} 12/25/2021 13:02:19 - INFO - codeparrot_training - Step 19503: {'lr': 0.00033622139720391173, 'samples': 9986048, 'steps': 19503, 'batch_loss/train': 0.7989849606528878} 12/25/2021 13:02:29 - INFO - codeparrot_training - Step 19504: {'lr': 0.00033620660104963674, 'samples': 9986560, 'steps': 19504, 'batch_loss/train': 0.8371643722057343} 12/25/2021 13:02:40 - INFO - codeparrot_training - Step 19505: {'lr': 0.00033619180455263687, 'samples': 9987072, 'steps': 19505, 'batch_loss/train': 0.8193959062919021} 12/25/2021 13:02:51 - INFO - codeparrot_training - Step 19506: {'lr': 0.0003361770077129709, 'samples': 9987584, 'steps': 19506, 'batch_loss/train': 0.8246302027255297} 12/25/2021 13:03:03 - INFO - codeparrot_training - Step 19507: {'lr': 0.0003361622105306978, 'samples': 9988096, 'steps': 19507, 'batch_loss/train': 0.7127262963913381} 12/25/2021 13:03:13 - INFO - codeparrot_training - Step 19508: {'lr': 0.0003361474130058761, 'samples': 9988608, 'steps': 19508, 'batch_loss/train': 0.8035558271221817} 12/25/2021 13:03:24 - INFO - codeparrot_training - Step 19509: {'lr': 0.000336132615138565, 'samples': 9989120, 'steps': 19509, 'batch_loss/train': 0.8267213944345713} 12/25/2021 13:03:36 - INFO - codeparrot_training - Step 19510: {'lr': 0.000336117816928823, 'samples': 9989632, 'steps': 19510, 'batch_loss/train': 0.7559444727376103} 12/25/2021 13:03:46 - INFO - codeparrot_training - Step 19511: {'lr': 
0.00033610301837670915, 'samples': 9990144, 'steps': 19511, 'batch_loss/train': 0.6202369406819344} 12/25/2021 13:03:57 - INFO - codeparrot_training - Step 19512: {'lr': 0.0003360882194822822, 'samples': 9990656, 'steps': 19512, 'batch_loss/train': 0.5948975447099656} 12/25/2021 13:04:11 - INFO - codeparrot_training - Step 19513: {'lr': 0.00033607342024560095, 'samples': 9991168, 'steps': 19513, 'batch_loss/train': 0.8534215446561575} 12/25/2021 13:04:22 - INFO - codeparrot_training - Step 19514: {'lr': 0.0003360586206667243, 'samples': 9991680, 'steps': 19514, 'batch_loss/train': 0.762177811935544} 12/25/2021 13:04:32 - INFO - codeparrot_training - Step 19515: {'lr': 0.00033604382074571104, 'samples': 9992192, 'steps': 19515, 'batch_loss/train': 0.8262188248336315} 12/25/2021 13:04:44 - INFO - codeparrot_training - Step 19516: {'lr': 0.00033602902048262016, 'samples': 9992704, 'steps': 19516, 'batch_loss/train': 0.8016765411011875} 12/25/2021 13:04:55 - INFO - codeparrot_training - Step 19517: {'lr': 0.0003360142198775103, 'samples': 9993216, 'steps': 19517, 'batch_loss/train': 0.7859623543918133} 12/25/2021 13:05:06 - INFO - codeparrot_training - Step 19518: {'lr': 0.0003359994189304404, 'samples': 9993728, 'steps': 19518, 'batch_loss/train': 0.7694400241598487} 12/25/2021 13:05:16 - INFO - codeparrot_training - Step 19519: {'lr': 0.0003359846176414693, 'samples': 9994240, 'steps': 19519, 'batch_loss/train': 0.8077850681729615} 12/25/2021 13:05:28 - INFO - codeparrot_training - Step 19520: {'lr': 0.0003359698160106558, 'samples': 9994752, 'steps': 19520, 'batch_loss/train': 0.7594521343708038} 12/25/2021 13:05:39 - INFO - codeparrot_training - Step 19521: {'lr': 0.00033595501403805867, 'samples': 9995264, 'steps': 19521, 'batch_loss/train': 0.8037137589417398} 12/25/2021 13:05:49 - INFO - codeparrot_training - Step 19522: {'lr': 0.000335940211723737, 'samples': 9995776, 'steps': 19522, 'batch_loss/train': 0.7771094164345413} 12/25/2021 13:06:03 - INFO - 
codeparrot_training - Step 19523: {'lr': 0.00033592540906774945, 'samples': 9996288, 'steps': 19523, 'batch_loss/train': 0.7903123288415372} 12/25/2021 13:06:14 - INFO - codeparrot_training - Step 19524: {'lr': 0.00033591060607015487, 'samples': 9996800, 'steps': 19524, 'batch_loss/train': 0.7893110988661647} 12/25/2021 13:06:24 - INFO - codeparrot_training - Step 19525: {'lr': 0.0003358958027310122, 'samples': 9997312, 'steps': 19525, 'batch_loss/train': 0.6920755915343761} 12/25/2021 13:06:36 - INFO - codeparrot_training - Step 19526: {'lr': 0.0003358809990503803, 'samples': 9997824, 'steps': 19526, 'batch_loss/train': 0.6394603789085522} 12/25/2021 13:06:47 - INFO - codeparrot_training - Step 19527: {'lr': 0.00033586619502831786, 'samples': 9998336, 'steps': 19527, 'batch_loss/train': 0.7778921648859978} 12/25/2021 13:06:58 - INFO - codeparrot_training - Step 19528: {'lr': 0.000335851390664884, 'samples': 9998848, 'steps': 19528, 'batch_loss/train': 0.7505654999986291} 12/25/2021 13:07:11 - INFO - codeparrot_training - Step 19529: {'lr': 0.00033583658596013735, 'samples': 9999360, 'steps': 19529, 'batch_loss/train': 0.7598025863990188} 12/25/2021 13:07:22 - INFO - codeparrot_training - Step 19530: {'lr': 0.00033582178091413683, 'samples': 9999872, 'steps': 19530, 'batch_loss/train': 0.7822008794173598} 12/25/2021 13:07:33 - INFO - codeparrot_training - Step 19531: {'lr': 0.00033580697552694136, 'samples': 10000384, 'steps': 19531, 'batch_loss/train': 0.6736198246362619} 12/25/2021 13:07:43 - INFO - codeparrot_training - Step 19532: {'lr': 0.0003357921697986097, 'samples': 10000896, 'steps': 19532, 'batch_loss/train': 0.5965044224285521} 12/25/2021 13:07:56 - INFO - codeparrot_training - Step 19533: {'lr': 0.00033577736372920083, 'samples': 10001408, 'steps': 19533, 'batch_loss/train': 0.6199517872883007} 12/25/2021 13:08:06 - INFO - codeparrot_training - Step 19534: {'lr': 0.00033576255731877346, 'samples': 10001920, 'steps': 19534, 'batch_loss/train': 
0.7495137358782813} 12/25/2021 13:08:17 - INFO - codeparrot_training - Step 19535: {'lr': 0.0003357477505673867, 'samples': 10002432, 'steps': 19535, 'batch_loss/train': 0.7732848920859396} 12/25/2021 13:08:29 - INFO - codeparrot_training - Step 19536: {'lr': 0.0003357329434750991, 'samples': 10002944, 'steps': 19536, 'batch_loss/train': 0.7166735231876373} 12/25/2021 13:08:40 - INFO - codeparrot_training - Step 19537: {'lr': 0.00033571813604196967, 'samples': 10003456, 'steps': 19537, 'batch_loss/train': 0.668528029229492} 12/25/2021 13:08:50 - INFO - codeparrot_training - Step 19538: {'lr': 0.0003357033282680574, 'samples': 10003968, 'steps': 19538, 'batch_loss/train': 0.7102754674269818} 12/25/2021 13:09:02 - INFO - codeparrot_training - Step 19539: {'lr': 0.00033568852015342104, 'samples': 10004480, 'steps': 19539, 'batch_loss/train': 0.7724588643759489} 12/25/2021 13:09:13 - INFO - codeparrot_training - Step 19540: {'lr': 0.0003356737116981194, 'samples': 10004992, 'steps': 19540, 'batch_loss/train': 1.0890143793076277} 12/25/2021 13:09:24 - INFO - codeparrot_training - Step 19541: {'lr': 0.00033565890290221156, 'samples': 10005504, 'steps': 19541, 'batch_loss/train': 0.7941262610256672} 12/25/2021 13:09:34 - INFO - codeparrot_training - Step 19542: {'lr': 0.00033564409376575624, 'samples': 10006016, 'steps': 19542, 'batch_loss/train': 0.8481623698025942} 12/25/2021 13:09:48 - INFO - codeparrot_training - Step 19543: {'lr': 0.00033562928428881227, 'samples': 10006528, 'steps': 19543, 'batch_loss/train': 0.7178716326598078} 12/25/2021 13:09:59 - INFO - codeparrot_training - Step 19544: {'lr': 0.0003356144744714387, 'samples': 10007040, 'steps': 19544, 'batch_loss/train': 0.8240390634164214} 12/25/2021 13:10:09 - INFO - codeparrot_training - Step 19545: {'lr': 0.00033559966431369423, 'samples': 10007552, 'steps': 19545, 'batch_loss/train': 0.7866469956934452} 12/25/2021 13:10:22 - INFO - codeparrot_training - Step 19546: {'lr': 0.00033558485381563787, 'samples': 
10008064, 'steps': 19546, 'batch_loss/train': 0.7783441161736846} 12/25/2021 13:10:32 - INFO - codeparrot_training - Step 19547: {'lr': 0.0003355700429773283, 'samples': 10008576, 'steps': 19547, 'batch_loss/train': 0.6017497622524388} 12/25/2021 13:10:43 - INFO - codeparrot_training - Step 19548: {'lr': 0.0003355552317988246, 'samples': 10009088, 'steps': 19548, 'batch_loss/train': 0.8660011062747799} 12/25/2021 13:10:55 - INFO - codeparrot_training - Step 19549: {'lr': 0.0003355404202801857, 'samples': 10009600, 'steps': 19549, 'batch_loss/train': 0.7072007444221526} 12/25/2021 13:11:05 - INFO - codeparrot_training - Step 19550: {'lr': 0.00033552560842147036, 'samples': 10010112, 'steps': 19550, 'batch_loss/train': 0.6311269451689441} 12/25/2021 13:11:16 - INFO - codeparrot_training - Step 19551: {'lr': 0.0003355107962227375, 'samples': 10010624, 'steps': 19551, 'batch_loss/train': 0.7926126103848219} 12/25/2021 13:11:27 - INFO - codeparrot_training - Step 19552: {'lr': 0.000335495983684046, 'samples': 10011136, 'steps': 19552, 'batch_loss/train': 0.7562589095905423} 12/25/2021 13:11:41 - INFO - codeparrot_training - Step 19553: {'lr': 0.0003354811708054547, 'samples': 10011648, 'steps': 19553, 'batch_loss/train': 0.6924139119219035} 12/25/2021 13:11:51 - INFO - codeparrot_training - Step 19554: {'lr': 0.0003354663575870225, 'samples': 10012160, 'steps': 19554, 'batch_loss/train': 0.5913781860726885} 12/25/2021 13:12:02 - INFO - codeparrot_training - Step 19555: {'lr': 0.0003354515440288085, 'samples': 10012672, 'steps': 19555, 'batch_loss/train': 0.7205105787143111} 12/25/2021 13:12:14 - INFO - codeparrot_training - Step 19556: {'lr': 0.00033543673013087134, 'samples': 10013184, 'steps': 19556, 'batch_loss/train': 0.8724177908152342} 12/25/2021 13:12:25 - INFO - codeparrot_training - Step 19557: {'lr': 0.00033542191589327, 'samples': 10013696, 'steps': 19557, 'batch_loss/train': 0.6607834626920521} 12/25/2021 13:12:35 - INFO - codeparrot_training - Step 19558: 
{'lr': 0.0003354071013160633, 'samples': 10014208, 'steps': 19558, 'batch_loss/train': 0.8012175755575299} 12/25/2021 13:12:49 - INFO - codeparrot_training - Step 19559: {'lr': 0.0003353922863993103, 'samples': 10014720, 'steps': 19559, 'batch_loss/train': 0.7743444163352251} 12/25/2021 13:13:00 - INFO - codeparrot_training - Step 19560: {'lr': 0.00033537747114306975, 'samples': 10015232, 'steps': 19560, 'batch_loss/train': 0.6173870249767788} 12/25/2021 13:13:10 - INFO - codeparrot_training - Step 19561: {'lr': 0.00033536265554740067, 'samples': 10015744, 'steps': 19561, 'batch_loss/train': 0.7080611633136868} 12/25/2021 13:13:22 - INFO - codeparrot_training - Step 19562: {'lr': 0.00033534783961236195, 'samples': 10016256, 'steps': 19562, 'batch_loss/train': 0.7598349065519869} 12/25/2021 13:13:33 - INFO - codeparrot_training - Step 19563: {'lr': 0.0003353330233380123, 'samples': 10016768, 'steps': 19563, 'batch_loss/train': 0.7602174712810665} 12/25/2021 13:13:44 - INFO - codeparrot_training - Step 19564: {'lr': 0.00033531820672441095, 'samples': 10017280, 'steps': 19564, 'batch_loss/train': 0.4872615776548628} 12/25/2021 13:13:54 - INFO - codeparrot_training - Step 19565: {'lr': 0.0003353033897716165, 'samples': 10017792, 'steps': 19565, 'batch_loss/train': 0.6658559637144208} 12/25/2021 13:14:06 - INFO - codeparrot_training - Step 19566: {'lr': 0.00033528857247968805, 'samples': 10018304, 'steps': 19566, 'batch_loss/train': 0.6900717651005834} 12/25/2021 13:14:17 - INFO - codeparrot_training - Step 19567: {'lr': 0.00033527375484868435, 'samples': 10018816, 'steps': 19567, 'batch_loss/train': 0.8821525406092405} 12/25/2021 13:14:27 - INFO - codeparrot_training - Step 19568: {'lr': 0.0003352589368786645, 'samples': 10019328, 'steps': 19568, 'batch_loss/train': 0.7390900610480458} 12/25/2021 13:14:39 - INFO - codeparrot_training - Step 19569: {'lr': 0.0003352441185696872, 'samples': 10019840, 'steps': 19569, 'batch_loss/train': 0.6223106562392786} 12/25/2021 
13:14:50 - INFO - codeparrot_training - Step 19570: {'lr': 0.00033522929992181155, 'samples': 10020352, 'steps': 19570, 'batch_loss/train': 0.8142134305089712} 12/25/2021 13:15:01 - INFO - codeparrot_training - Step 19571: {'lr': 0.0003352144809350964, 'samples': 10020864, 'steps': 19571, 'batch_loss/train': 0.6521538305096328} 12/25/2021 13:15:14 - INFO - codeparrot_training - Step 19572: {'lr': 0.0003351996616096006, 'samples': 10021376, 'steps': 19572, 'batch_loss/train': 0.7641755510121584} 12/25/2021 13:15:25 - INFO - codeparrot_training - Step 19573: {'lr': 0.0003351848419453831, 'samples': 10021888, 'steps': 19573, 'batch_loss/train': 0.7769574201665819} 12/25/2021 13:15:36 - INFO - codeparrot_training - Step 19574: {'lr': 0.00033517002194250286, 'samples': 10022400, 'steps': 19574, 'batch_loss/train': 0.7040599518222734} 12/25/2021 13:15:46 - INFO - codeparrot_training - Step 19575: {'lr': 0.0003351552016010188, 'samples': 10022912, 'steps': 19575, 'batch_loss/train': 0.797177292406559} 12/25/2021 13:15:58 - INFO - codeparrot_training - Step 19576: {'lr': 0.00033514038092098973, 'samples': 10023424, 'steps': 19576, 'batch_loss/train': 0.8103371858596802} 12/25/2021 13:16:09 - INFO - codeparrot_training - Step 19577: {'lr': 0.0003351255599024747, 'samples': 10023936, 'steps': 19577, 'batch_loss/train': 0.7223797827027738} 12/25/2021 13:16:19 - INFO - codeparrot_training - Step 19578: {'lr': 0.00033511073854553263, 'samples': 10024448, 'steps': 19578, 'batch_loss/train': 0.6876494670286775} 12/25/2021 13:16:32 - INFO - codeparrot_training - Step 19579: {'lr': 0.00033509591685022236, 'samples': 10024960, 'steps': 19579, 'batch_loss/train': 0.9165573064237833} 12/25/2021 13:16:42 - INFO - codeparrot_training - Step 19580: {'lr': 0.00033508109481660285, 'samples': 10025472, 'steps': 19580, 'batch_loss/train': 0.7679535134229809} 12/25/2021 13:16:53 - INFO - codeparrot_training - Step 19581: {'lr': 0.0003350662724447331, 'samples': 10025984, 'steps': 19581, 
'batch_loss/train': 0.6932059666141868} 12/25/2021 13:17:06 - INFO - codeparrot_training - Step 19582: {'lr': 0.0003350514497346718, 'samples': 10026496, 'steps': 19582, 'batch_loss/train': 0.7746670315973461} 12/25/2021 13:17:17 - INFO - codeparrot_training - Step 19583: {'lr': 0.0003350366266864783, 'samples': 10027008, 'steps': 19583, 'batch_loss/train': 0.7629072181880474} 12/25/2021 13:17:28 - INFO - codeparrot_training - Step 19584: {'lr': 0.0003350218033002111, 'samples': 10027520, 'steps': 19584, 'batch_loss/train': 0.752563470043242} 12/25/2021 13:17:40 - INFO - codeparrot_training - Step 19585: {'lr': 0.0003350069795759294, 'samples': 10028032, 'steps': 19585, 'batch_loss/train': 0.7243459869641811} 12/25/2021 13:17:50 - INFO - codeparrot_training - Step 19586: {'lr': 0.0003349921555136919, 'samples': 10028544, 'steps': 19586, 'batch_loss/train': 0.7642186116427183} 12/25/2021 13:18:01 - INFO - codeparrot_training - Step 19587: {'lr': 0.0003349773311135578, 'samples': 10029056, 'steps': 19587, 'batch_loss/train': 0.7809441806748509} 12/25/2021 13:18:12 - INFO - codeparrot_training - Step 19588: {'lr': 0.000334962506375586, 'samples': 10029568, 'steps': 19588, 'batch_loss/train': 0.7069387696683407} 12/25/2021 13:18:26 - INFO - codeparrot_training - Step 19589: {'lr': 0.00033494768129983517, 'samples': 10030080, 'steps': 19589, 'batch_loss/train': 0.6251228367909789} 12/25/2021 13:18:36 - INFO - codeparrot_training - Step 19590: {'lr': 0.0003349328558863646, 'samples': 10030592, 'steps': 19590, 'batch_loss/train': 0.7631533592939377} 12/25/2021 13:18:47 - INFO - codeparrot_training - Step 19591: {'lr': 0.0003349180301352329, 'samples': 10031104, 'steps': 19591, 'batch_loss/train': 0.7275045365095139} 12/25/2021 13:18:59 - INFO - codeparrot_training - Step 19592: {'lr': 0.0003349032040464993, 'samples': 10031616, 'steps': 19592, 'batch_loss/train': 0.7099621442612261} 12/25/2021 13:19:10 - INFO - codeparrot_training - Step 19593: {'lr': 
0.0003348883776202226, 'samples': 10032128, 'steps': 19593, 'batch_loss/train': 0.7065711866598576} 12/25/2021 13:19:20 - INFO - codeparrot_training - Step 19594: {'lr': 0.00033487355085646175, 'samples': 10032640, 'steps': 19594, 'batch_loss/train': 0.7381461511831731} 12/25/2021 13:19:33 - INFO - codeparrot_training - Step 19595: {'lr': 0.00033485872375527565, 'samples': 10033152, 'steps': 19595, 'batch_loss/train': 0.951011685654521} 12/25/2021 13:19:43 - INFO - codeparrot_training - Step 19596: {'lr': 0.0003348438963167234, 'samples': 10033664, 'steps': 19596, 'batch_loss/train': 0.7520605587633327} 12/25/2021 13:19:54 - INFO - codeparrot_training - Step 19597: {'lr': 0.0003348290685408638, 'samples': 10034176, 'steps': 19597, 'batch_loss/train': 0.6920773507736158} 12/25/2021 13:20:04 - INFO - codeparrot_training - Step 19598: {'lr': 0.0003348142404277559, 'samples': 10034688, 'steps': 19598, 'batch_loss/train': 0.5962140867777634} 12/25/2021 13:20:18 - INFO - codeparrot_training - Step 19599: {'lr': 0.00033479941197745854, 'samples': 10035200, 'steps': 19599, 'batch_loss/train': 0.7920302525162697} 12/25/2021 13:20:29 - INFO - codeparrot_training - Step 19600: {'lr': 0.0003347845831900308, 'samples': 10035712, 'steps': 19600, 'batch_loss/train': 0.7246963244397193} 12/25/2021 13:20:39 - INFO - codeparrot_training - Step 19601: {'lr': 0.00033476975406553154, 'samples': 10036224, 'steps': 19601, 'batch_loss/train': 0.7789356047287583} 12/25/2021 13:20:51 - INFO - codeparrot_training - Step 19602: {'lr': 0.0003347549246040197, 'samples': 10036736, 'steps': 19602, 'batch_loss/train': 0.7752177636139095} 12/25/2021 13:21:02 - INFO - codeparrot_training - Step 19603: {'lr': 0.0003347400948055544, 'samples': 10037248, 'steps': 19603, 'batch_loss/train': 0.7780035492032766} 12/25/2021 13:21:13 - INFO - codeparrot_training - Step 19604: {'lr': 0.00033472526467019436, 'samples': 10037760, 'steps': 19604, 'batch_loss/train': 0.7368175969459116} 12/25/2021 13:21:25 - 
INFO - codeparrot_training - Step 19605: {'lr': 0.00033471043419799873, 'samples': 10038272, 'steps': 19605, 'batch_loss/train': 0.7045154203660786} 12/25/2021 13:21:35 - INFO - codeparrot_training - Step 19606: {'lr': 0.0003346956033890264, 'samples': 10038784, 'steps': 19606, 'batch_loss/train': 0.7500477125868201} 12/25/2021 13:21:46 - INFO - codeparrot_training - Step 19607: {'lr': 0.0003346807722433363, 'samples': 10039296, 'steps': 19607, 'batch_loss/train': 0.6134556168108247} 12/25/2021 13:21:58 - INFO - codeparrot_training - Step 19608: {'lr': 0.0003346659407609874, 'samples': 10039808, 'steps': 19608, 'batch_loss/train': 0.9664136590436101} 12/25/2021 13:22:08 - INFO - codeparrot_training - Step 19609: {'lr': 0.00033465110894203876, 'samples': 10040320, 'steps': 19609, 'batch_loss/train': 0.8518481720238924} 12/25/2021 13:22:19 - INFO - codeparrot_training - Step 19610: {'lr': 0.00033463627678654916, 'samples': 10040832, 'steps': 19610, 'batch_loss/train': 0.8315147710964084} 12/25/2021 13:22:30 - INFO - codeparrot_training - Step 19611: {'lr': 0.0003346214442945778, 'samples': 10041344, 'steps': 19611, 'batch_loss/train': 0.7828465504571795} 12/25/2021 13:22:44 - INFO - codeparrot_training - Step 19612: {'lr': 0.0003346066114661835, 'samples': 10041856, 'steps': 19612, 'batch_loss/train': 0.8566099219024181} 12/25/2021 13:22:54 - INFO - codeparrot_training - Step 19613: {'lr': 0.0003345917783014252, 'samples': 10042368, 'steps': 19613, 'batch_loss/train': 0.7199951093643904} 12/25/2021 13:23:05 - INFO - codeparrot_training - Step 19614: {'lr': 0.000334576944800362, 'samples': 10042880, 'steps': 19614, 'batch_loss/train': 0.8186840682174079} 12/25/2021 13:23:17 - INFO - codeparrot_training - Step 19615: {'lr': 0.0003345621109630527, 'samples': 10043392, 'steps': 19615, 'batch_loss/train': 0.7661044206470251} 12/25/2021 13:23:27 - INFO - codeparrot_training - Step 19616: {'lr': 0.00033454727678955653, 'samples': 10043904, 'steps': 19616, 
'batch_loss/train': 0.6706488057971001} 12/25/2021 13:23:38 - INFO - codeparrot_training - Step 19617: {'lr': 0.00033453244227993226, 'samples': 10044416, 'steps': 19617, 'batch_loss/train': 0.7405991544947028} 12/25/2021 13:23:50 - INFO - codeparrot_training - Step 19618: {'lr': 0.00033451760743423886, 'samples': 10044928, 'steps': 19618, 'batch_loss/train': 0.7134188208729029} 12/25/2021 13:24:01 - INFO - codeparrot_training - Step 19619: {'lr': 0.00033450277225253533, 'samples': 10045440, 'steps': 19619, 'batch_loss/train': 0.6853352580219507} 12/25/2021 13:24:12 - INFO - codeparrot_training - Step 19620: {'lr': 0.0003344879367348808, 'samples': 10045952, 'steps': 19620, 'batch_loss/train': 0.7214874569326639} 12/25/2021 13:24:22 - INFO - codeparrot_training - Step 19621: {'lr': 0.0003344731008813341, 'samples': 10046464, 'steps': 19621, 'batch_loss/train': 0.727629872970283} 12/25/2021 13:24:36 - INFO - codeparrot_training - Step 19622: {'lr': 0.00033445826469195416, 'samples': 10046976, 'steps': 19622, 'batch_loss/train': 0.7028927984647453} 12/25/2021 13:24:47 - INFO - codeparrot_training - Step 19623: {'lr': 0.00033444342816680016, 'samples': 10047488, 'steps': 19623, 'batch_loss/train': 0.672880639496725} 12/25/2021 13:24:57 - INFO - codeparrot_training - Step 19624: {'lr': 0.0003344285913059309, 'samples': 10048000, 'steps': 19624, 'batch_loss/train': 0.7968210186809301} 12/25/2021 13:25:10 - INFO - codeparrot_training - Step 19625: {'lr': 0.0003344137541094055, 'samples': 10048512, 'steps': 19625, 'batch_loss/train': 0.7650969005189836} 12/25/2021 13:25:20 - INFO - codeparrot_training - Step 19626: {'lr': 0.0003343989165772829, 'samples': 10049024, 'steps': 19626, 'batch_loss/train': 0.7487255707383156} 12/25/2021 13:25:31 - INFO - codeparrot_training - Step 19627: {'lr': 0.000334384078709622, 'samples': 10049536, 'steps': 19627, 'batch_loss/train': 0.835383977741003} 12/25/2021 13:25:45 - INFO - codeparrot_training - Step 19628: {'lr': 
0.00033436924050648186, 'samples': 10050048, 'steps': 19628, 'batch_loss/train': 0.7791956532746553} 12/25/2021 13:25:55 - INFO - codeparrot_training - Step 19629: {'lr': 0.0003343544019679216, 'samples': 10050560, 'steps': 19629, 'batch_loss/train': 0.7540671993046999} 12/25/2021 13:26:06 - INFO - codeparrot_training - Step 19630: {'lr': 0.000334339563094, 'samples': 10051072, 'steps': 19630, 'batch_loss/train': 0.7531551318243146} 12/25/2021 13:26:18 - INFO - codeparrot_training - Step 19631: {'lr': 0.0003343247238847761, 'samples': 10051584, 'steps': 19631, 'batch_loss/train': 0.8769439104944468} 12/25/2021 13:26:29 - INFO - codeparrot_training - Step 19632: {'lr': 0.000334309884340309, 'samples': 10052096, 'steps': 19632, 'batch_loss/train': 0.7515480509027839} 12/25/2021 13:26:39 - INFO - codeparrot_training - Step 19633: {'lr': 0.0003342950444606577, 'samples': 10052608, 'steps': 19633, 'batch_loss/train': 0.7585529461503029} 12/25/2021 13:26:50 - INFO - codeparrot_training - Step 19634: {'lr': 0.000334280204245881, 'samples': 10053120, 'steps': 19634, 'batch_loss/train': 0.7421376793645322} 12/25/2021 13:27:02 - INFO - codeparrot_training - Step 19635: {'lr': 0.00033426536369603806, 'samples': 10053632, 'steps': 19635, 'batch_loss/train': 0.8050362514331937} 12/25/2021 13:27:13 - INFO - codeparrot_training - Step 19636: {'lr': 0.00033425052281118797, 'samples': 10054144, 'steps': 19636, 'batch_loss/train': 0.8230428816750646} 12/25/2021 13:27:23 - INFO - codeparrot_training - Step 19637: {'lr': 0.00033423568159138945, 'samples': 10054656, 'steps': 19637, 'batch_loss/train': 0.6596011286601424} 12/25/2021 13:27:37 - INFO - codeparrot_training - Step 19638: {'lr': 0.00033422084003670175, 'samples': 10055168, 'steps': 19638, 'batch_loss/train': 0.7229127112659626} 12/25/2021 13:27:48 - INFO - codeparrot_training - Step 19639: {'lr': 0.0003342059981471838, 'samples': 10055680, 'steps': 19639, 'batch_loss/train': 0.8027661917731166} 12/25/2021 13:27:58 - INFO - 
codeparrot_training - Step 19640: {'lr': 0.0003341911559228945, 'samples': 10056192, 'steps': 19640, 'batch_loss/train': 0.7402538498863578} 12/25/2021 13:28:11 - INFO - codeparrot_training - Step 19641: {'lr': 0.00033417631336389304, 'samples': 10056704, 'steps': 19641, 'batch_loss/train': 0.9485483486205339} 12/25/2021 13:28:21 - INFO - codeparrot_training - Step 19642: {'lr': 0.00033416147047023837, 'samples': 10057216, 'steps': 19642, 'batch_loss/train': 0.7317321845330298} 12/25/2021 13:28:32 - INFO - codeparrot_training - Step 19643: {'lr': 0.0003341466272419894, 'samples': 10057728, 'steps': 19643, 'batch_loss/train': 0.9068066021427512} 12/25/2021 13:28:42 - INFO - codeparrot_training - Step 19644: {'lr': 0.0003341317836792053, 'samples': 10058240, 'steps': 19644, 'batch_loss/train': 0.7412715330719948} 12/25/2021 13:28:54 - INFO - codeparrot_training - Step 19645: {'lr': 0.0003341169397819449, 'samples': 10058752, 'steps': 19645, 'batch_loss/train': 0.8075044131837785} 12/25/2021 13:29:05 - INFO - codeparrot_training - Step 19646: {'lr': 0.00033410209555026737, 'samples': 10059264, 'steps': 19646, 'batch_loss/train': 0.7447033692151308} 12/25/2021 13:29:16 - INFO - codeparrot_training - Step 19647: {'lr': 0.0003340872509842316, 'samples': 10059776, 'steps': 19647, 'batch_loss/train': 0.6058185857837088} 12/25/2021 13:29:28 - INFO - codeparrot_training - Step 19648: {'lr': 0.0003340724060838967, 'samples': 10060288, 'steps': 19648, 'batch_loss/train': 0.7545967572368681} 12/25/2021 13:29:38 - INFO - codeparrot_training - Step 19649: {'lr': 0.0003340575608493217, 'samples': 10060800, 'steps': 19649, 'batch_loss/train': 0.7048797234892845} 12/25/2021 13:29:49 - INFO - codeparrot_training - Step 19650: {'lr': 0.0003340427152805655, 'samples': 10061312, 'steps': 19650, 'batch_loss/train': 0.7464162688702345} 12/25/2021 13:30:03 - INFO - codeparrot_training - Step 19651: {'lr': 0.0003340278693776873, 'samples': 10061824, 'steps': 19651, 'batch_loss/train': 
0.8568926155567169} 12/25/2021 13:30:13 - INFO - codeparrot_training - Step 19652: {'lr': 0.0003340130231407459, 'samples': 10062336, 'steps': 19652, 'batch_loss/train': 0.7104270039126277} 12/25/2021 13:30:24 - INFO - codeparrot_training - Step 19653: {'lr': 0.0003339981765698005, 'samples': 10062848, 'steps': 19653, 'batch_loss/train': 0.7061568338249344} 12/25/2021 13:30:34 - INFO - codeparrot_training - Step 19654: {'lr': 0.00033398332966491005, 'samples': 10063360, 'steps': 19654, 'batch_loss/train': 0.8100498304702342} 12/25/2021 13:30:47 - INFO - codeparrot_training - Step 19655: {'lr': 0.0003339684824261336, 'samples': 10063872, 'steps': 19655, 'batch_loss/train': 0.72798215970397} 12/25/2021 13:30:57 - INFO - codeparrot_training - Step 19656: {'lr': 0.00033395363485353027, 'samples': 10064384, 'steps': 19656, 'batch_loss/train': 0.6827553734183311} 12/25/2021 13:31:08 - INFO - codeparrot_training - Step 19657: {'lr': 0.00033393878694715885, 'samples': 10064896, 'steps': 19657, 'batch_loss/train': 0.8266622723313048} 12/25/2021 13:31:22 - INFO - codeparrot_training - Step 19658: {'lr': 0.0003339239387070786, 'samples': 10065408, 'steps': 19658, 'batch_loss/train': 0.6270717002335005} 12/25/2021 13:31:32 - INFO - codeparrot_training - Step 19659: {'lr': 0.00033390909013334834, 'samples': 10065920, 'steps': 19659, 'batch_loss/train': 0.8596728765405715} 12/25/2021 13:31:43 - INFO - codeparrot_training - Step 19660: {'lr': 0.00033389424122602727, 'samples': 10066432, 'steps': 19660, 'batch_loss/train': 0.7258437016280368} 12/25/2021 13:31:55 - INFO - codeparrot_training - Step 19661: {'lr': 0.00033387939198517444, 'samples': 10066944, 'steps': 19661, 'batch_loss/train': 0.7727510733529925} 12/25/2021 13:32:06 - INFO - codeparrot_training - Step 19662: {'lr': 0.00033386454241084883, 'samples': 10067456, 'steps': 19662, 'batch_loss/train': 0.6655904864892364} 12/25/2021 13:32:16 - INFO - codeparrot_training - Step 19663: {'lr': 0.0003338496925031094, 'samples': 
10067968, 'steps': 19663, 'batch_loss/train': 0.7743688141927123} 12/25/2021 13:32:27 - INFO - codeparrot_training - Step 19664: {'lr': 0.00033383484226201526, 'samples': 10068480, 'steps': 19664, 'batch_loss/train': 0.7610686104744673} 12/25/2021 13:32:39 - INFO - codeparrot_training - Step 19665: {'lr': 0.0003338199916876254, 'samples': 10068992, 'steps': 19665, 'batch_loss/train': 0.8084621212910861} 12/25/2021 13:32:50 - INFO - codeparrot_training - Step 19666: {'lr': 0.00033380514077999894, 'samples': 10069504, 'steps': 19666, 'batch_loss/train': 1.9378396635875106} 12/25/2021 13:33:00 - INFO - codeparrot_training - Step 19667: {'lr': 0.00033379028953919496, 'samples': 10070016, 'steps': 19667, 'batch_loss/train': 1.0881170402280986} 12/25/2021 13:33:14 - INFO - codeparrot_training - Step 19668: {'lr': 0.00033377543796527236, 'samples': 10070528, 'steps': 19668, 'batch_loss/train': 0.7540377052500844} 12/25/2021 13:33:25 - INFO - codeparrot_training - Step 19669: {'lr': 0.0003337605860582902, 'samples': 10071040, 'steps': 19669, 'batch_loss/train': 0.7848611017689109} 12/25/2021 13:33:35 - INFO - codeparrot_training - Step 19670: {'lr': 0.00033374573381830764, 'samples': 10071552, 'steps': 19670, 'batch_loss/train': 0.7101921979337931} 12/25/2021 13:33:47 - INFO - codeparrot_training - Step 19671: {'lr': 0.00033373088124538365, 'samples': 10072064, 'steps': 19671, 'batch_loss/train': 0.7319788197055459} 12/25/2021 13:33:58 - INFO - codeparrot_training - Step 19672: {'lr': 0.00033371602833957734, 'samples': 10072576, 'steps': 19672, 'batch_loss/train': 0.7926926556974649} 12/25/2021 13:34:09 - INFO - codeparrot_training - Step 19673: {'lr': 0.0003337011751009476, 'samples': 10073088, 'steps': 19673, 'batch_loss/train': 0.7012355465849396} 12/25/2021 13:34:22 - INFO - codeparrot_training - Step 19674: {'lr': 0.0003336863215295537, 'samples': 10073600, 'steps': 19674, 'batch_loss/train': 0.7159625273197889} 12/25/2021 13:34:33 - INFO - codeparrot_training - Step 
19675: {'lr': 0.00033367146762545454, 'samples': 10074112, 'steps': 19675, 'batch_loss/train': 0.6598931159824133} 12/25/2021 13:34:43 - INFO - codeparrot_training - Step 19676: {'lr': 0.0003336566133887092, 'samples': 10074624, 'steps': 19676, 'batch_loss/train': 1.5460790153592825} 12/25/2021 13:34:54 - INFO - codeparrot_training - Step 19677: {'lr': 0.0003336417588193768, 'samples': 10075136, 'steps': 19677, 'batch_loss/train': 0.8296757936477661} 12/25/2021 13:35:06 - INFO - codeparrot_training - Step 19678: {'lr': 0.00033362690391751626, 'samples': 10075648, 'steps': 19678, 'batch_loss/train': 0.9496915265917778} 12/25/2021 13:35:16 - INFO - codeparrot_training - Step 19679: {'lr': 0.0003336120486831868, 'samples': 10076160, 'steps': 19679, 'batch_loss/train': 0.8305088942870498} 12/25/2021 13:35:27 - INFO - codeparrot_training - Step 19680: {'lr': 0.0003335971931164474, 'samples': 10076672, 'steps': 19680, 'batch_loss/train': 0.571404721646104} 12/25/2021 13:35:39 - INFO - codeparrot_training - Step 19681: {'lr': 0.00033358233721735714, 'samples': 10077184, 'steps': 19681, 'batch_loss/train': 0.8289780030027032} 12/25/2021 13:35:50 - INFO - codeparrot_training - Step 19682: {'lr': 0.00033356748098597506, 'samples': 10077696, 'steps': 19682, 'batch_loss/train': 0.7480535457725637} 12/25/2021 13:36:00 - INFO - codeparrot_training - Step 19683: {'lr': 0.0003335526244223602, 'samples': 10078208, 'steps': 19683, 'batch_loss/train': 0.7234477924648672} 12/25/2021 13:36:14 - INFO - codeparrot_training - Step 19684: {'lr': 0.00033353776752657167, 'samples': 10078720, 'steps': 19684, 'batch_loss/train': 0.8221851140260696} 12/25/2021 13:36:25 - INFO - codeparrot_training - Step 19685: {'lr': 0.00033352291029866847, 'samples': 10079232, 'steps': 19685, 'batch_loss/train': 0.8588974811136723} 12/25/2021 13:36:35 - INFO - codeparrot_training - Step 19686: {'lr': 0.00033350805273870977, 'samples': 10079744, 'steps': 19686, 'batch_loss/train': 0.7956264587119222} 
12/25/2021 13:36:46 - INFO - codeparrot_training - Step 19687: {'lr': 0.00033349319484675454, 'samples': 10080256, 'steps': 19687, 'batch_loss/train': 0.6783196672331542} 12/25/2021 13:36:58 - INFO - codeparrot_training - Step 19688: {'lr': 0.00033347833662286197, 'samples': 10080768, 'steps': 19688, 'batch_loss/train': 0.6308112996630371} 12/25/2021 13:37:09 - INFO - codeparrot_training - Step 19689: {'lr': 0.000333463478067091, 'samples': 10081280, 'steps': 19689, 'batch_loss/train': 0.7312213173136115} 12/25/2021 13:37:19 - INFO - codeparrot_training - Step 19690: {'lr': 0.0003334486191795008, 'samples': 10081792, 'steps': 19690, 'batch_loss/train': 0.8591122878715396} 12/25/2021 13:37:31 - INFO - codeparrot_training - Step 19691: {'lr': 0.00033343375996015037, 'samples': 10082304, 'steps': 19691, 'batch_loss/train': 0.7897567423060536} 12/25/2021 13:37:42 - INFO - codeparrot_training - Step 19692: {'lr': 0.0003334189004090988, 'samples': 10082816, 'steps': 19692, 'batch_loss/train': 0.7500589275732636} 12/25/2021 13:37:53 - INFO - codeparrot_training - Step 19693: {'lr': 0.0003334040405264052, 'samples': 10083328, 'steps': 19693, 'batch_loss/train': 0.8118527675978839} 12/25/2021 13:38:05 - INFO - codeparrot_training - Step 19694: {'lr': 0.00033338918031212863, 'samples': 10083840, 'steps': 19694, 'batch_loss/train': 0.8783733332529664} 12/25/2021 13:38:15 - INFO - codeparrot_training - Step 19695: {'lr': 0.00033337431976632816, 'samples': 10084352, 'steps': 19695, 'batch_loss/train': 0.7265019649639726} 12/25/2021 13:38:26 - INFO - codeparrot_training - Step 19696: {'lr': 0.00033335945888906285, 'samples': 10084864, 'steps': 19696, 'batch_loss/train': 0.7722848830744624} 12/25/2021 13:38:36 - INFO - codeparrot_training - Step 19697: {'lr': 0.00033334459768039184, 'samples': 10085376, 'steps': 19697, 'batch_loss/train': 0.8083558600628749} 12/25/2021 13:38:51 - INFO - codeparrot_training - Step 19698: {'lr': 0.00033332973614037413, 'samples': 10085888, 'steps': 
19698, 'batch_loss/train': 0.8138823648914695} 12/25/2021 13:39:01 - INFO - codeparrot_training - Step 19699: {'lr': 0.00033331487426906884, 'samples': 10086400, 'steps': 19699, 'batch_loss/train': 0.7716761492192745} 12/25/2021 13:39:12 - INFO - codeparrot_training - Step 19700: {'lr': 0.00033330001206653513, 'samples': 10086912, 'steps': 19700, 'batch_loss/train': 0.8145867986604571} 12/25/2021 13:39:24 - INFO - codeparrot_training - Step 19701: {'lr': 0.0003332851495328321, 'samples': 10087424, 'steps': 19701, 'batch_loss/train': 0.632499698549509} 12/25/2021 13:39:35 - INFO - codeparrot_training - Step 19702: {'lr': 0.00033327028666801863, 'samples': 10087936, 'steps': 19702, 'batch_loss/train': 0.8709803065285087} 12/25/2021 13:39:45 - INFO - codeparrot_training - Step 19703: {'lr': 0.000333255423472154, 'samples': 10088448, 'steps': 19703, 'batch_loss/train': 0.5904999074700754} 12/25/2021 13:39:59 - INFO - codeparrot_training - Step 19704: {'lr': 0.0003332405599452972, 'samples': 10088960, 'steps': 19704, 'batch_loss/train': 0.8065050728619099} 12/25/2021 13:40:09 - INFO - codeparrot_training - Step 19705: {'lr': 0.0003332256960875074, 'samples': 10089472, 'steps': 19705, 'batch_loss/train': 0.7474699039012194} 12/25/2021 13:40:20 - INFO - codeparrot_training - Step 19706: {'lr': 0.0003332108318988436, 'samples': 10089984, 'steps': 19706, 'batch_loss/train': 0.6835919638397172} 12/25/2021 13:40:31 - INFO - codeparrot_training - Step 19707: {'lr': 0.0003331959673793651, 'samples': 10090496, 'steps': 19707, 'batch_loss/train': 0.7716570030897856} 12/25/2021 13:40:43 - INFO - codeparrot_training - Step 19708: {'lr': 0.0003331811025291307, 'samples': 10091008, 'steps': 19708, 'batch_loss/train': 0.7518632551655173} 12/25/2021 13:40:53 - INFO - codeparrot_training - Step 19709: {'lr': 0.0003331662373481996, 'samples': 10091520, 'steps': 19709, 'batch_loss/train': 0.540497890120605} 12/25/2021 13:41:04 - INFO - codeparrot_training - Step 19710: {'lr': 
0.00033315137183663116, 'samples': 10092032, 'steps': 19710, 'batch_loss/train': 0.8175843385979533} 12/25/2021 13:41:16 - INFO - codeparrot_training - Step 19711: {'lr': 0.0003331365059944841, 'samples': 10092544, 'steps': 19711, 'batch_loss/train': 0.6766013281885535} 12/25/2021 13:41:26 - INFO - codeparrot_training - Step 19712: {'lr': 0.0003331216398218177, 'samples': 10093056, 'steps': 19712, 'batch_loss/train': 0.6602714502369054} 12/25/2021 13:41:37 - INFO - codeparrot_training - Step 19713: {'lr': 0.0003331067733186911, 'samples': 10093568, 'steps': 19713, 'batch_loss/train': 0.7824912182986736} 12/25/2021 13:41:51 - INFO - codeparrot_training - Step 19714: {'lr': 0.0003330919064851633, 'samples': 10094080, 'steps': 19714, 'batch_loss/train': 0.7902708537876606} 12/25/2021 13:42:02 - INFO - codeparrot_training - Step 19715: {'lr': 0.0003330770393212934, 'samples': 10094592, 'steps': 19715, 'batch_loss/train': 0.7079025628045201} 12/25/2021 13:42:12 - INFO - codeparrot_training - Step 19716: {'lr': 0.0003330621718271407, 'samples': 10095104, 'steps': 19716, 'batch_loss/train': 0.6446340745314956} 12/25/2021 13:42:23 - INFO - codeparrot_training - Step 19717: {'lr': 0.0003330473040027641, 'samples': 10095616, 'steps': 19717, 'batch_loss/train': 0.6585393371060491} 12/25/2021 13:42:35 - INFO - codeparrot_training - Step 19718: {'lr': 0.0003330324358482227, 'samples': 10096128, 'steps': 19718, 'batch_loss/train': 0.6651708451099694} 12/25/2021 13:42:46 - INFO - codeparrot_training - Step 19719: {'lr': 0.0003330175673635758, 'samples': 10096640, 'steps': 19719, 'batch_loss/train': 0.7100905044935644} 12/25/2021 13:42:56 - INFO - codeparrot_training - Step 19720: {'lr': 0.0003330026985488824, 'samples': 10097152, 'steps': 19720, 'batch_loss/train': 0.8195054046809673} 12/25/2021 13:43:08 - INFO - codeparrot_training - Step 19721: {'lr': 0.00033298782940420154, 'samples': 10097664, 'steps': 19721, 'batch_loss/train': 0.8617296172305942} 12/25/2021 13:43:19 - INFO 
- codeparrot_training - Step 19722: {'lr': 0.0003329729599295924, 'samples': 10098176, 'steps': 19722, 'batch_loss/train': 0.7762842047959566} 12/25/2021 13:43:30 - INFO - codeparrot_training - Step 19723: {'lr': 0.0003329580901251142, 'samples': 10098688, 'steps': 19723, 'batch_loss/train': 0.7519249338656664} 12/25/2021 13:43:42 - INFO - codeparrot_training - Step 19724: {'lr': 0.0003329432199908259, 'samples': 10099200, 'steps': 19724, 'batch_loss/train': 0.7482915534637868} 12/25/2021 13:43:52 - INFO - codeparrot_training - Step 19725: {'lr': 0.00033292834952678666, 'samples': 10099712, 'steps': 19725, 'batch_loss/train': 0.7517625000327826} 12/25/2021 13:44:03 - INFO - codeparrot_training - Step 19726: {'lr': 0.0003329134787330556, 'samples': 10100224, 'steps': 19726, 'batch_loss/train': 0.8551893969997764} 12/25/2021 13:44:13 - INFO - codeparrot_training - Step 19727: {'lr': 0.00033289860760969193, 'samples': 10100736, 'steps': 19727, 'batch_loss/train': 0.6423012503655627} 12/25/2021 13:44:27 - INFO - codeparrot_training - Step 19728: {'lr': 0.0003328837361567546, 'samples': 10101248, 'steps': 19728, 'batch_loss/train': 0.6383606757735834} 12/25/2021 13:44:38 - INFO - codeparrot_training - Step 19729: {'lr': 0.0003328688643743029, 'samples': 10101760, 'steps': 19729, 'batch_loss/train': 0.6982517228461802} 12/25/2021 13:44:49 - INFO - codeparrot_training - Step 19730: {'lr': 0.00033285399226239587, 'samples': 10102272, 'steps': 19730, 'batch_loss/train': 0.7458322118036449} 12/25/2021 13:45:01 - INFO - codeparrot_training - Step 19731: {'lr': 0.0003328391198210926, 'samples': 10102784, 'steps': 19731, 'batch_loss/train': 0.7162808384746313} 12/25/2021 13:45:12 - INFO - codeparrot_training - Step 19732: {'lr': 0.0003328242470504523, 'samples': 10103296, 'steps': 19732, 'batch_loss/train': 0.7638714329805225} 12/25/2021 13:45:22 - INFO - codeparrot_training - Step 19733: {'lr': 0.0003328093739505341, 'samples': 10103808, 'steps': 19733, 'batch_loss/train': 
0.6994631150737405} 12/25/2021 13:45:36 - INFO - codeparrot_training - Step 19734: {'lr': 0.0003327945005213971, 'samples': 10104320, 'steps': 19734, 'batch_loss/train': 0.7130704566370696} 12/25/2021 13:45:47 - INFO - codeparrot_training - Step 19735: {'lr': 0.00033277962676310036, 'samples': 10104832, 'steps': 19735, 'batch_loss/train': 0.7918446809053421} 12/25/2021 13:45:57 - INFO - codeparrot_training - Step 19736: {'lr': 0.00033276475267570316, 'samples': 10105344, 'steps': 19736, 'batch_loss/train': 0.8178295698016882} 12/25/2021 13:46:08 - INFO - codeparrot_training - Step 19737: {'lr': 0.0003327498782592645, 'samples': 10105856, 'steps': 19737, 'batch_loss/train': 0.7170332316309214} 12/25/2021 13:46:20 - INFO - codeparrot_training - Step 19738: {'lr': 0.00033273500351384364, 'samples': 10106368, 'steps': 19738, 'batch_loss/train': 0.7505749091506004} 12/25/2021 13:46:31 - INFO - codeparrot_training - Step 19739: {'lr': 0.00033272012843949954, 'samples': 10106880, 'steps': 19739, 'batch_loss/train': 0.7654351908713579} 12/25/2021 13:46:41 - INFO - codeparrot_training - Step 19740: {'lr': 0.00033270525303629155, 'samples': 10107392, 'steps': 19740, 'batch_loss/train': 0.7271760734729469} 12/25/2021 13:46:53 - INFO - codeparrot_training - Step 19741: {'lr': 0.0003326903773042786, 'samples': 10107904, 'steps': 19741, 'batch_loss/train': 0.8528415951877832} 12/25/2021 13:47:04 - INFO - codeparrot_training - Step 19742: {'lr': 0.00033267550124352005, 'samples': 10108416, 'steps': 19742, 'batch_loss/train': 0.7462307279929519} 12/25/2021 13:47:15 - INFO - codeparrot_training - Step 19743: {'lr': 0.0003326606248540748, 'samples': 10108928, 'steps': 19743, 'batch_loss/train': 0.741925423964858} 12/25/2021 13:47:28 - INFO - codeparrot_training - Step 19744: {'lr': 0.00033264574813600216, 'samples': 10109440, 'steps': 19744, 'batch_loss/train': 0.7375512551516294} 12/25/2021 13:47:39 - INFO - codeparrot_training - Step 19745: {'lr': 0.0003326308710893613, 'samples': 
10109952, 'steps': 19745, 'batch_loss/train': 0.7127858269959688} 12/25/2021 13:47:50 - INFO - codeparrot_training - Step 19746: {'lr': 0.0003326159937142112, 'samples': 10110464, 'steps': 19746, 'batch_loss/train': 0.8150993762537837} 12/25/2021 13:48:02 - INFO - codeparrot_training - Step 19747: {'lr': 0.00033260111601061113, 'samples': 10110976, 'steps': 19747, 'batch_loss/train': 0.7895487239584327} 12/25/2021 13:48:12 - INFO - codeparrot_training - Step 19748: {'lr': 0.0003325862379786202, 'samples': 10111488, 'steps': 19748, 'batch_loss/train': 0.6737368376925588} 12/25/2021 13:48:23 - INFO - codeparrot_training - Step 19749: {'lr': 0.00033257135961829764, 'samples': 10112000, 'steps': 19749, 'batch_loss/train': 0.8597360495477915} 12/25/2021 13:48:33 - INFO - codeparrot_training - Step 19750: {'lr': 0.0003325564809297025, 'samples': 10112512, 'steps': 19750, 'batch_loss/train': 0.8182838438078761} 12/25/2021 13:48:47 - INFO - codeparrot_training - Step 19751: {'lr': 0.00033254160191289393, 'samples': 10113024, 'steps': 19751, 'batch_loss/train': 0.7665403070859611} 12/25/2021 13:48:58 - INFO - codeparrot_training - Step 19752: {'lr': 0.00033252672256793114, 'samples': 10113536, 'steps': 19752, 'batch_loss/train': 0.7297327923588455} 12/25/2021 13:49:08 - INFO - codeparrot_training - Step 19753: {'lr': 0.0003325118428948732, 'samples': 10114048, 'steps': 19753, 'batch_loss/train': 0.7894171453081071} 12/25/2021 13:49:20 - INFO - codeparrot_training - Step 19754: {'lr': 0.00033249696289377946, 'samples': 10114560, 'steps': 19754, 'batch_loss/train': 0.772781556006521} 12/25/2021 13:49:31 - INFO - codeparrot_training - Step 19755: {'lr': 0.0003324820825647089, 'samples': 10115072, 'steps': 19755, 'batch_loss/train': 0.6376710087060928} 12/25/2021 13:49:42 - INFO - codeparrot_training - Step 19756: {'lr': 0.0003324672019077207, 'samples': 10115584, 'steps': 19756, 'batch_loss/train': 0.7539802349638194} 12/25/2021 13:49:54 - INFO - codeparrot_training - Step 
19757: {'lr': 0.000332452320922874, 'samples': 10116096, 'steps': 19757, 'batch_loss/train': 0.7991183130070567} 12/25/2021 13:50:04 - INFO - codeparrot_training - Step 19758: {'lr': 0.0003324374396102281, 'samples': 10116608, 'steps': 19758, 'batch_loss/train': 0.7393001913442276} 12/25/2021 13:50:15 - INFO - codeparrot_training - Step 19759: {'lr': 0.000332422557969842, 'samples': 10117120, 'steps': 19759, 'batch_loss/train': 0.7733512222766876} 12/25/2021 13:50:29 - INFO - codeparrot_training - Step 19760: {'lr': 0.0003324076760017749, 'samples': 10117632, 'steps': 19760, 'batch_loss/train': 0.7810017140582204} 12/25/2021 13:50:39 - INFO - codeparrot_training - Step 19761: {'lr': 0.00033239279370608603, 'samples': 10118144, 'steps': 19761, 'batch_loss/train': 0.6798600321053527} 12/25/2021 13:50:50 - INFO - codeparrot_training - Step 19762: {'lr': 0.00033237791108283463, 'samples': 10118656, 'steps': 19762, 'batch_loss/train': 0.7790517564862967} 12/25/2021 13:51:00 - INFO - codeparrot_training - Step 19763: {'lr': 0.00033236302813207964, 'samples': 10119168, 'steps': 19763, 'batch_loss/train': 0.709942908026278} 12/25/2021 13:51:13 - INFO - codeparrot_training - Step 19764: {'lr': 0.00033234814485388035, 'samples': 10119680, 'steps': 19764, 'batch_loss/train': 0.7080601043999195} 12/25/2021 13:51:23 - INFO - codeparrot_training - Step 19765: {'lr': 0.0003323332612482959, 'samples': 10120192, 'steps': 19765, 'batch_loss/train': 0.8694875715300441} 12/25/2021 13:51:34 - INFO - codeparrot_training - Step 19766: {'lr': 0.0003323183773153856, 'samples': 10120704, 'steps': 19766, 'batch_loss/train': 0.852404021541588} 12/25/2021 13:51:46 - INFO - codeparrot_training - Step 19767: {'lr': 0.00033230349305520847, 'samples': 10121216, 'steps': 19767, 'batch_loss/train': 0.8705939957872033} 12/25/2021 13:51:57 - INFO - codeparrot_training - Step 19768: {'lr': 0.00033228860846782373, 'samples': 10121728, 'steps': 19768, 'batch_loss/train': 0.699289025273174} 12/25/2021 
13:52:07 - INFO - codeparrot_training - Step 19769: {'lr': 0.00033227372355329053, 'samples': 10122240, 'steps': 19769, 'batch_loss/train': 0.7455281771253794} 12/25/2021 13:52:19 - INFO - codeparrot_training - Step 19770: {'lr': 0.000332258838311668, 'samples': 10122752, 'steps': 19770, 'batch_loss/train': 0.8103673858568072} 12/25/2021 13:52:30 - INFO - codeparrot_training - Step 19771: {'lr': 0.0003322439527430155, 'samples': 10123264, 'steps': 19771, 'batch_loss/train': 0.7928755497559905} 12/25/2021 13:52:41 - INFO - codeparrot_training - Step 19772: {'lr': 0.00033222906684739206, 'samples': 10123776, 'steps': 19772, 'batch_loss/train': 0.9410797689342871} 12/25/2021 13:52:51 - INFO - codeparrot_training - Step 19773: {'lr': 0.0003322141806248569, 'samples': 10124288, 'steps': 19773, 'batch_loss/train': 0.758756842231378} 12/25/2021 13:53:06 - INFO - codeparrot_training - Step 19774: {'lr': 0.00033219929407546925, 'samples': 10124800, 'steps': 19774, 'batch_loss/train': 0.8018079968169332} 12/25/2021 13:53:16 - INFO - codeparrot_training - Step 19775: {'lr': 0.0003321844071992882, 'samples': 10125312, 'steps': 19775, 'batch_loss/train': 0.8228520750999451} 12/25/2021 13:53:27 - INFO - codeparrot_training - Step 19776: {'lr': 0.00033216951999637294, 'samples': 10125824, 'steps': 19776, 'batch_loss/train': 0.656349312979728} 12/25/2021 13:53:39 - INFO - codeparrot_training - Step 19777: {'lr': 0.0003321546324667828, 'samples': 10126336, 'steps': 19777, 'batch_loss/train': 0.7453155729454011} 12/25/2021 13:53:49 - INFO - codeparrot_training - Step 19778: {'lr': 0.00033213974461057677, 'samples': 10126848, 'steps': 19778, 'batch_loss/train': 0.717448613489978} 12/25/2021 13:54:00 - INFO - codeparrot_training - Step 19779: {'lr': 0.00033212485642781413, 'samples': 10127360, 'steps': 19779, 'batch_loss/train': 0.80884122219868} 12/25/2021 13:54:14 - INFO - codeparrot_training - Step 19780: {'lr': 0.0003321099679185541, 'samples': 10127872, 'steps': 19780, 
'batch_loss/train': 0.8825694406405091} 12/25/2021 13:54:25 - INFO - codeparrot_training - Step 19781: {'lr': 0.0003320950790828559, 'samples': 10128384, 'steps': 19781, 'batch_loss/train': 0.6777104737702757} 12/25/2021 13:54:36 - INFO - codeparrot_training - Step 19782: {'lr': 0.0003320801899207786, 'samples': 10128896, 'steps': 19782, 'batch_loss/train': 0.757104454562068} 12/25/2021 13:54:47 - INFO - codeparrot_training - Step 19783: {'lr': 0.0003320653004323814, 'samples': 10129408, 'steps': 19783, 'batch_loss/train': 0.7653856105171144} 12/25/2021 13:54:58 - INFO - codeparrot_training - Step 19784: {'lr': 0.0003320504106177236, 'samples': 10129920, 'steps': 19784, 'batch_loss/train': 0.7294176947325468} 12/25/2021 13:55:09 - INFO - codeparrot_training - Step 19785: {'lr': 0.00033203552047686435, 'samples': 10130432, 'steps': 19785, 'batch_loss/train': 0.8328425670042634} 12/25/2021 13:55:19 - INFO - codeparrot_training - Step 19786: {'lr': 0.0003320206300098629, 'samples': 10130944, 'steps': 19786, 'batch_loss/train': 0.8341011048760265} 12/25/2021 13:55:32 - INFO - codeparrot_training - Step 19787: {'lr': 0.00033200573921677836, 'samples': 10131456, 'steps': 19787, 'batch_loss/train': 0.7935294415801764} 12/25/2021 13:55:42 - INFO - codeparrot_training - Step 19788: {'lr': 0.00033199084809767, 'samples': 10131968, 'steps': 19788, 'batch_loss/train': 0.6166613278910518} 12/25/2021 13:55:53 - INFO - codeparrot_training - Step 19789: {'lr': 0.00033197595665259685, 'samples': 10132480, 'steps': 19789, 'batch_loss/train': 0.8414192516356707} 12/25/2021 13:56:07 - INFO - codeparrot_training - Step 19790: {'lr': 0.0003319610648816184, 'samples': 10132992, 'steps': 19790, 'batch_loss/train': 0.542982813436538} 12/25/2021 13:56:17 - INFO - codeparrot_training - Step 19791: {'lr': 0.0003319461727847936, 'samples': 10133504, 'steps': 19791, 'batch_loss/train': 0.6363523509353399} 12/25/2021 13:56:28 - INFO - codeparrot_training - Step 19792: {'lr': 
0.0003319312803621818, 'samples': 10134016, 'steps': 19792, 'batch_loss/train': 0.7761846911162138} 12/25/2021 13:56:40 - INFO - codeparrot_training - Step 19793: {'lr': 0.0003319163876138422, 'samples': 10134528, 'steps': 19793, 'batch_loss/train': 0.7439369226340204} 12/25/2021 13:56:50 - INFO - codeparrot_training - Step 19794: {'lr': 0.000331901494539834, 'samples': 10135040, 'steps': 19794, 'batch_loss/train': 0.7095064902678132} 12/25/2021 13:57:01 - INFO - codeparrot_training - Step 19795: {'lr': 0.0003318866011402163, 'samples': 10135552, 'steps': 19795, 'batch_loss/train': 0.9149298761039972} 12/25/2021 13:57:12 - INFO - codeparrot_training - Step 19796: {'lr': 0.00033187170741504833, 'samples': 10136064, 'steps': 19796, 'batch_loss/train': 0.8199225119315088} 12/25/2021 13:57:24 - INFO - codeparrot_training - Step 19797: {'lr': 0.0003318568133643895, 'samples': 10136576, 'steps': 19797, 'batch_loss/train': 0.739319437649101} 12/25/2021 13:57:34 - INFO - codeparrot_training - Step 19798: {'lr': 0.00033184191898829885, 'samples': 10137088, 'steps': 19798, 'batch_loss/train': 0.8547781892120838} 12/25/2021 13:57:45 - INFO - codeparrot_training - Step 19799: {'lr': 0.0003318270242868356, 'samples': 10137600, 'steps': 19799, 'batch_loss/train': 0.7238557267119177} 12/25/2021 13:57:59 - INFO - codeparrot_training - Step 19800: {'lr': 0.000331812129260059, 'samples': 10138112, 'steps': 19800, 'batch_loss/train': 0.8081320375204086} 12/25/2021 13:58:09 - INFO - codeparrot_training - Step 19801: {'lr': 0.0003317972339080283, 'samples': 10138624, 'steps': 19801, 'batch_loss/train': 0.7887421606574208} 12/25/2021 13:58:20 - INFO - codeparrot_training - Step 19802: {'lr': 0.0003317823382308026, 'samples': 10139136, 'steps': 19802, 'batch_loss/train': 0.751061849296093} 12/25/2021 13:58:32 - INFO - codeparrot_training - Step 19803: {'lr': 0.0003317674422284414, 'samples': 10139648, 'steps': 19803, 'batch_loss/train': 0.8045786172151566} 12/25/2021 13:58:43 - INFO - 
codeparrot_training - Step 19804: {'lr': 0.0003317525459010036, 'samples': 10140160, 'steps': 19804, 'batch_loss/train': 0.77053077891469} 12/25/2021 13:58:53 - INFO - codeparrot_training - Step 19805: {'lr': 0.00033173764924854845, 'samples': 10140672, 'steps': 19805, 'batch_loss/train': 0.791541256941855} 12/25/2021 13:59:07 - INFO - codeparrot_training - Step 19806: {'lr': 0.00033172275227113537, 'samples': 10141184, 'steps': 19806, 'batch_loss/train': 0.794615070335567} 12/25/2021 13:59:18 - INFO - codeparrot_training - Step 19807: {'lr': 0.0003317078549688235, 'samples': 10141696, 'steps': 19807, 'batch_loss/train': 0.7321228226646781} 12/25/2021 13:59:29 - INFO - codeparrot_training - Step 19808: {'lr': 0.00033169295734167205, 'samples': 10142208, 'steps': 19808, 'batch_loss/train': 0.7327813003212214} 12/25/2021 13:59:39 - INFO - codeparrot_training - Step 19809: {'lr': 0.00033167805938974016, 'samples': 10142720, 'steps': 19809, 'batch_loss/train': 0.8308488472830504} 12/25/2021 13:59:51 - INFO - codeparrot_training - Step 19810: {'lr': 0.0003316631611130872, 'samples': 10143232, 'steps': 19810, 'batch_loss/train': 0.9272597366943955} 12/25/2021 14:00:02 - INFO - codeparrot_training - Step 19811: {'lr': 0.0003316482625117724, 'samples': 10143744, 'steps': 19811, 'batch_loss/train': 0.8579767182818614} 12/25/2021 14:00:13 - INFO - codeparrot_training - Step 19812: {'lr': 0.0003316333635858549, 'samples': 10144256, 'steps': 19812, 'batch_loss/train': 0.8077869927510619} 12/25/2021 14:00:25 - INFO - codeparrot_training - Step 19813: {'lr': 0.00033161846433539397, 'samples': 10144768, 'steps': 19813, 'batch_loss/train': 0.7679282873868942} 12/25/2021 14:00:35 - INFO - codeparrot_training - Step 19814: {'lr': 0.00033160356476044883, 'samples': 10145280, 'steps': 19814, 'batch_loss/train': 0.9485955853015184} 12/25/2021 14:00:46 - INFO - codeparrot_training - Step 19815: {'lr': 0.0003315886648610787, 'samples': 10145792, 'steps': 19815, 'batch_loss/train': 
0.7853373186662793} 12/25/2021 14:00:58 - INFO - codeparrot_training - Step 19816: {'lr': 0.0003315737646373429, 'samples': 10146304, 'steps': 19816, 'batch_loss/train': 0.7572991028428078} 12/25/2021 14:01:09 - INFO - codeparrot_training - Step 19817: {'lr': 0.00033155886408930057, 'samples': 10146816, 'steps': 19817, 'batch_loss/train': 0.7141323103569448} 12/25/2021 14:01:19 - INFO - codeparrot_training - Step 19818: {'lr': 0.00033154396321701104, 'samples': 10147328, 'steps': 19818, 'batch_loss/train': 0.7071905788034201} 12/25/2021 14:01:34 - INFO - codeparrot_training - Step 19819: {'lr': 0.00033152906202053346, 'samples': 10147840, 'steps': 19819, 'batch_loss/train': 0.7763480301946402} 12/25/2021 14:01:44 - INFO - codeparrot_training - Step 19820: {'lr': 0.00033151416049992713, 'samples': 10148352, 'steps': 19820, 'batch_loss/train': 0.7881541224196553} 12/25/2021 14:01:55 - INFO - codeparrot_training - Step 19821: {'lr': 0.00033149925865525126, 'samples': 10148864, 'steps': 19821, 'batch_loss/train': 0.6110230647027493} 12/25/2021 14:02:05 - INFO - codeparrot_training - Step 19822: {'lr': 0.0003314843564865651, 'samples': 10149376, 'steps': 19822, 'batch_loss/train': 0.7690599635243416} 12/25/2021 14:02:18 - INFO - codeparrot_training - Step 19823: {'lr': 0.00033146945399392804, 'samples': 10149888, 'steps': 19823, 'batch_loss/train': 0.8596256831660867} 12/25/2021 14:02:28 - INFO - codeparrot_training - Step 19824: {'lr': 0.00033145455117739904, 'samples': 10150400, 'steps': 19824, 'batch_loss/train': 0.7690382455475628} 12/25/2021 14:02:39 - INFO - codeparrot_training - Step 19825: {'lr': 0.00033143964803703755, 'samples': 10150912, 'steps': 19825, 'batch_loss/train': 0.688265617704019} 12/25/2021 14:02:51 - INFO - codeparrot_training - Step 19826: {'lr': 0.0003314247445729027, 'samples': 10151424, 'steps': 19826, 'batch_loss/train': 0.7841478968039155} 12/25/2021 14:03:01 - INFO - codeparrot_training - Step 19827: {'lr': 0.000331409840785054, 'samples': 
10151936, 'steps': 19827, 'batch_loss/train': 0.7604004042223096} 12/25/2021 14:03:12 - INFO - codeparrot_training - Step 19828: {'lr': 0.0003313949366735503, 'samples': 10152448, 'steps': 19828, 'batch_loss/train': 0.7710892572067678} 12/25/2021 14:03:26 - INFO - codeparrot_training - Step 19829: {'lr': 0.0003313800322384512, 'samples': 10152960, 'steps': 19829, 'batch_loss/train': 0.7753146439790726} 12/25/2021 14:03:36 - INFO - codeparrot_training - Step 19830: {'lr': 0.00033136512747981574, 'samples': 10153472, 'steps': 19830, 'batch_loss/train': 0.8233117992058396} 12/25/2021 14:03:47 - INFO - codeparrot_training - Step 19831: {'lr': 0.0003313502223977033, 'samples': 10153984, 'steps': 19831, 'batch_loss/train': 0.7869573880452663} 12/25/2021 14:03:58 - INFO - codeparrot_training - Step 19832: {'lr': 0.00033133531699217306, 'samples': 10154496, 'steps': 19832, 'batch_loss/train': 0.7407341171056032} 12/25/2021 14:04:10 - INFO - codeparrot_training - Step 19833: {'lr': 0.00033132041126328435, 'samples': 10155008, 'steps': 19833, 'batch_loss/train': 0.8203223403543234} 12/25/2021 14:04:20 - INFO - codeparrot_training - Step 19834: {'lr': 0.0003313055052110964, 'samples': 10155520, 'steps': 19834, 'batch_loss/train': 0.8448980539105833} 12/25/2021 14:04:31 - INFO - codeparrot_training - Step 19835: {'lr': 0.00033129059883566837, 'samples': 10156032, 'steps': 19835, 'batch_loss/train': 0.7874072368722409} 12/25/2021 14:04:45 - INFO - codeparrot_training - Step 19836: {'lr': 0.00033127569213705976, 'samples': 10156544, 'steps': 19836, 'batch_loss/train': 0.7332330578938127} 12/25/2021 14:04:55 - INFO - codeparrot_training - Step 19837: {'lr': 0.00033126078511532957, 'samples': 10157056, 'steps': 19837, 'batch_loss/train': 0.6320949343207758} 12/25/2021 14:05:06 - INFO - codeparrot_training - Step 19838: {'lr': 0.0003312458777705372, 'samples': 10157568, 'steps': 19838, 'batch_loss/train': 0.7012064326554537} 12/25/2021 14:05:18 - INFO - codeparrot_training - Step 
19839: {'lr': 0.00033123097010274196, 'samples': 10158080, 'steps': 19839, 'batch_loss/train': 0.8159881983883679} 12/25/2021 14:05:29 - INFO - codeparrot_training - Step 19840: {'lr': 0.000331216062112003, 'samples': 10158592, 'steps': 19840, 'batch_loss/train': 0.7742619253695011} 12/25/2021 14:05:39 - INFO - codeparrot_training - Step 19841: {'lr': 0.00033120115379837957, 'samples': 10159104, 'steps': 19841, 'batch_loss/train': 0.7780139362439513} 12/25/2021 14:05:50 - INFO - codeparrot_training - Step 19842: {'lr': 0.0003311862451619312, 'samples': 10159616, 'steps': 19842, 'batch_loss/train': 0.7063482482917607} 12/25/2021 14:06:02 - INFO - codeparrot_training - Step 19843: {'lr': 0.0003311713362027168, 'samples': 10160128, 'steps': 19843, 'batch_loss/train': 0.9737032819539309} 12/25/2021 14:06:13 - INFO - codeparrot_training - Step 19844: {'lr': 0.00033115642692079585, 'samples': 10160640, 'steps': 19844, 'batch_loss/train': 0.8207956752739847} 12/25/2021 14:06:23 - INFO - codeparrot_training - Step 19845: {'lr': 0.00033114151731622764, 'samples': 10161152, 'steps': 19845, 'batch_loss/train': 0.7966754138469696} 12/25/2021 14:06:37 - INFO - codeparrot_training - Step 19846: {'lr': 0.0003311266073890714, 'samples': 10161664, 'steps': 19846, 'batch_loss/train': 0.9158485443331301} 12/25/2021 14:06:47 - INFO - codeparrot_training - Step 19847: {'lr': 0.0003311116971393864, 'samples': 10162176, 'steps': 19847, 'batch_loss/train': 0.7907574828714132} 12/25/2021 14:06:58 - INFO - codeparrot_training - Step 19848: {'lr': 0.0003310967865672319, 'samples': 10162688, 'steps': 19848, 'batch_loss/train': 0.7827622490003705} 12/25/2021 14:07:10 - INFO - codeparrot_training - Step 19849: {'lr': 0.0003310818756726672, 'samples': 10163200, 'steps': 19849, 'batch_loss/train': 0.8388052703812718} 12/25/2021 14:07:21 - INFO - codeparrot_training - Step 19850: {'lr': 0.00033106696445575157, 'samples': 10163712, 'steps': 19850, 'batch_loss/train': 0.7096098228357732} 12/25/2021 
14:07:31 - INFO - codeparrot_training - Step 19851: {'lr': 0.0003310520529165443, 'samples': 10164224, 'steps': 19851, 'batch_loss/train': 0.713821655139327} 12/25/2021 14:07:44 - INFO - codeparrot_training - Step 19852: {'lr': 0.0003310371410551046, 'samples': 10164736, 'steps': 19852, 'batch_loss/train': 0.7296188119798899} 12/25/2021 14:07:54 - INFO - codeparrot_training - Step 19853: {'lr': 0.00033102222887149195, 'samples': 10165248, 'steps': 19853, 'batch_loss/train': 0.8018823517486453} 12/25/2021 14:08:05 - INFO - codeparrot_training - Step 19854: {'lr': 0.0003310073163657654, 'samples': 10165760, 'steps': 19854, 'batch_loss/train': 0.5880736241233535} 12/25/2021 14:08:16 - INFO - codeparrot_training - Step 19855: {'lr': 0.0003309924035379845, 'samples': 10166272, 'steps': 19855, 'batch_loss/train': 0.5768822212121449} 12/25/2021 14:08:27 - INFO - codeparrot_training - Step 19856: {'lr': 0.00033097749038820827, 'samples': 10166784, 'steps': 19856, 'batch_loss/train': 0.8368224473670125} 12/25/2021 14:08:38 - INFO - codeparrot_training - Step 19857: {'lr': 0.0003309625769164961, 'samples': 10167296, 'steps': 19857, 'batch_loss/train': 0.7746324443724006} 12/25/2021 14:08:49 - INFO - codeparrot_training - Step 19858: {'lr': 0.00033094766312290733, 'samples': 10167808, 'steps': 19858, 'batch_loss/train': 0.7490446446463466} 12/25/2021 14:09:02 - INFO - codeparrot_training - Step 19859: {'lr': 0.00033093274900750124, 'samples': 10168320, 'steps': 19859, 'batch_loss/train': 0.7625183360651135} 12/25/2021 14:09:13 - INFO - codeparrot_training - Step 19860: {'lr': 0.00033091783457033706, 'samples': 10168832, 'steps': 19860, 'batch_loss/train': 0.6603894695872441} 12/25/2021 14:09:23 - INFO - codeparrot_training - Step 19861: {'lr': 0.0003309029198114741, 'samples': 10169344, 'steps': 19861, 'batch_loss/train': 0.8595767812803388} 12/25/2021 14:09:36 - INFO - codeparrot_training - Step 19862: {'lr': 0.0003308880047309718, 'samples': 10169856, 'steps': 19862, 
'batch_loss/train': 0.7976423278450966} 12/25/2021 14:09:46 - INFO - codeparrot_training - Step 19863: {'lr': 0.00033087308932888926, 'samples': 10170368, 'steps': 19863, 'batch_loss/train': 0.692877407069318} 12/25/2021 14:09:57 - INFO - codeparrot_training - Step 19864: {'lr': 0.0003308581736052858, 'samples': 10170880, 'steps': 19864, 'batch_loss/train': 0.6901850246358663} 12/25/2021 14:10:11 - INFO - codeparrot_training - Step 19865: {'lr': 0.0003308432575602208, 'samples': 10171392, 'steps': 19865, 'batch_loss/train': 0.6993816732428968} 12/25/2021 14:10:21 - INFO - codeparrot_training - Step 19866: {'lr': 0.0003308283411937536, 'samples': 10171904, 'steps': 19866, 'batch_loss/train': 0.7686030147597194} 12/25/2021 14:10:32 - INFO - codeparrot_training - Step 19867: {'lr': 0.00033081342450594337, 'samples': 10172416, 'steps': 19867, 'batch_loss/train': 0.8261073837056756} 12/25/2021 14:10:42 - INFO - codeparrot_training - Step 19868: {'lr': 0.00033079850749684954, 'samples': 10172928, 'steps': 19868, 'batch_loss/train': 0.8075451790355146} 12/25/2021 14:10:54 - INFO - codeparrot_training - Step 19869: {'lr': 0.0003307835901665313, 'samples': 10173440, 'steps': 19869, 'batch_loss/train': 0.5951279430883005} 12/25/2021 14:11:05 - INFO - codeparrot_training - Step 19870: {'lr': 0.00033076867251504794, 'samples': 10173952, 'steps': 19870, 'batch_loss/train': 0.7463081683963537} 12/25/2021 14:11:16 - INFO - codeparrot_training - Step 19871: {'lr': 0.00033075375454245893, 'samples': 10174464, 'steps': 19871, 'batch_loss/train': 0.860478900372982} 12/25/2021 14:11:28 - INFO - codeparrot_training - Step 19872: {'lr': 0.0003307388362488235, 'samples': 10174976, 'steps': 19872, 'batch_loss/train': 0.790059044957161} 12/25/2021 14:11:38 - INFO - codeparrot_training - Step 19873: {'lr': 0.0003307239176342009, 'samples': 10175488, 'steps': 19873, 'batch_loss/train': 0.7962406873703003} 12/25/2021 14:11:49 - INFO - codeparrot_training - Step 19874: {'lr': 
0.00033070899869865045, 'samples': 10176000, 'steps': 19874, 'batch_loss/train': 0.7362729860469699} 12/25/2021 14:12:03 - INFO - codeparrot_training - Step 19875: {'lr': 0.00033069407944223163, 'samples': 10176512, 'steps': 19875, 'batch_loss/train': 0.8409357699565589} 12/25/2021 14:12:13 - INFO - codeparrot_training - Step 19876: {'lr': 0.0003306791598650035, 'samples': 10177024, 'steps': 19876, 'batch_loss/train': 0.7837816732935607} 12/25/2021 14:12:24 - INFO - codeparrot_training - Step 19877: {'lr': 0.0003306642399670255, 'samples': 10177536, 'steps': 19877, 'batch_loss/train': 0.7539327349513769} 12/25/2021 14:12:35 - INFO - codeparrot_training - Step 19878: {'lr': 0.00033064931974835694, 'samples': 10178048, 'steps': 19878, 'batch_loss/train': 1.2727277521044016} 12/25/2021 14:12:47 - INFO - codeparrot_training - Step 19879: {'lr': 0.00033063439920905715, 'samples': 10178560, 'steps': 19879, 'batch_loss/train': 0.8485001502558589} 12/25/2021 14:12:57 - INFO - codeparrot_training - Step 19880: {'lr': 0.00033061947834918544, 'samples': 10179072, 'steps': 19880, 'batch_loss/train': 0.8601746847853065} 12/25/2021 14:13:08 - INFO - codeparrot_training - Step 19881: {'lr': 0.00033060455716880114, 'samples': 10179584, 'steps': 19881, 'batch_loss/train': 0.8126991530880332} 12/25/2021 14:13:22 - INFO - codeparrot_training - Step 19882: {'lr': 0.0003305896356679636, 'samples': 10180096, 'steps': 19882, 'batch_loss/train': 0.7309302613139153} 12/25/2021 14:13:33 - INFO - codeparrot_training - Step 19883: {'lr': 0.0003305747138467319, 'samples': 10180608, 'steps': 19883, 'batch_loss/train': 0.7978587215766311} 12/25/2021 14:13:43 - INFO - codeparrot_training - Step 19884: {'lr': 0.0003305597917051658, 'samples': 10181120, 'steps': 19884, 'batch_loss/train': 0.7995003554970026} 12/25/2021 14:13:54 - INFO - codeparrot_training - Step 19885: {'lr': 0.0003305448692433243, 'samples': 10181632, 'steps': 19885, 'batch_loss/train': 1.176112626446411} 12/25/2021 14:14:06 - 
INFO - codeparrot_training - Step 19886: {'lr': 0.00033052994646126676, 'samples': 10182144, 'steps': 19886, 'batch_loss/train': 0.7665414317743853} 12/25/2021 14:14:17 - INFO - codeparrot_training - Step 19887: {'lr': 0.00033051502335905257, 'samples': 10182656, 'steps': 19887, 'batch_loss/train': 0.9491775650531054} 12/25/2021 14:14:27 - INFO - codeparrot_training - Step 19888: {'lr': 0.00033050009993674113, 'samples': 10183168, 'steps': 19888, 'batch_loss/train': 0.7777958828955889} 12/25/2021 14:14:39 - INFO - codeparrot_training - Step 19889: {'lr': 0.0003304851761943916, 'samples': 10183680, 'steps': 19889, 'batch_loss/train': 0.8325575981289148} 12/25/2021 14:14:50 - INFO - codeparrot_training - Step 19890: {'lr': 0.0003304702521320635, 'samples': 10184192, 'steps': 19890, 'batch_loss/train': 0.7789106764830649} 12/25/2021 14:15:01 - INFO - codeparrot_training - Step 19891: {'lr': 0.00033045532774981595, 'samples': 10184704, 'steps': 19891, 'batch_loss/train': 0.7910813922062516} 12/25/2021 14:15:13 - INFO - codeparrot_training - Step 19892: {'lr': 0.0003304404030477084, 'samples': 10185216, 'steps': 19892, 'batch_loss/train': 0.7689934372901917} 12/25/2021 14:15:23 - INFO - codeparrot_training - Step 19893: {'lr': 0.0003304254780258002, 'samples': 10185728, 'steps': 19893, 'batch_loss/train': 0.6604242702014744} 12/25/2021 14:15:34 - INFO - codeparrot_training - Step 19894: {'lr': 0.0003304105526841508, 'samples': 10186240, 'steps': 19894, 'batch_loss/train': 0.761886257212609} 12/25/2021 14:15:44 - INFO - codeparrot_training - Step 19895: {'lr': 0.0003303956270228193, 'samples': 10186752, 'steps': 19895, 'batch_loss/train': 0.831630127504468} 12/25/2021 14:15:58 - INFO - codeparrot_training - Step 19896: {'lr': 0.0003303807010418651, 'samples': 10187264, 'steps': 19896, 'batch_loss/train': 0.7698019091039896} 12/25/2021 14:16:09 - INFO - codeparrot_training - Step 19897: {'lr': 0.0003303657747413477, 'samples': 10187776, 'steps': 19897, 'batch_loss/train': 
0.9614993501454592} 12/25/2021 14:16:20 - INFO - codeparrot_training - Step 19898: {'lr': 0.0003303508481213262, 'samples': 10188288, 'steps': 19898, 'batch_loss/train': 0.879398541059345} 12/25/2021 14:16:31 - INFO - codeparrot_training - Step 19899: {'lr': 0.0003303359211818601, 'samples': 10188800, 'steps': 19899, 'batch_loss/train': 0.7912765610963106} 12/25/2021 14:16:42 - INFO - codeparrot_training - Step 19900: {'lr': 0.00033032099392300867, 'samples': 10189312, 'steps': 19900, 'batch_loss/train': 0.7136207344010472} 12/25/2021 14:16:53 - INFO - codeparrot_training - Step 19901: {'lr': 0.0003303060663448314, 'samples': 10189824, 'steps': 19901, 'batch_loss/train': 0.8630408472381532} 12/25/2021 14:17:05 - INFO - codeparrot_training - Step 19902: {'lr': 0.0003302911384473875, 'samples': 10190336, 'steps': 19902, 'batch_loss/train': 0.7937371055595577} 12/25/2021 14:17:16 - INFO - codeparrot_training - Step 19903: {'lr': 0.00033027621023073636, 'samples': 10190848, 'steps': 19903, 'batch_loss/train': 0.804478750564158} 12/25/2021 14:17:26 - INFO - codeparrot_training - Step 19904: {'lr': 0.0003302612816949372, 'samples': 10191360, 'steps': 19904, 'batch_loss/train': 0.8055781833827496} 12/25/2021 14:17:40 - INFO - codeparrot_training - Step 19905: {'lr': 0.0003302463528400496, 'samples': 10191872, 'steps': 19905, 'batch_loss/train': 0.7343020932748914} 12/25/2021 14:17:51 - INFO - codeparrot_training - Step 19906: {'lr': 0.0003302314236661328, 'samples': 10192384, 'steps': 19906, 'batch_loss/train': 0.7997299307025969} 12/25/2021 14:18:01 - INFO - codeparrot_training - Step 19907: {'lr': 0.00033021649417324614, 'samples': 10192896, 'steps': 19907, 'batch_loss/train': 0.858190824277699} 12/25/2021 14:18:12 - INFO - codeparrot_training - Step 19908: {'lr': 0.000330201564361449, 'samples': 10193408, 'steps': 19908, 'batch_loss/train': 0.6795084206387401} 12/25/2021 14:18:24 - INFO - codeparrot_training - Step 19909: {'lr': 0.00033018663423080064, 'samples': 
10193920, 'steps': 19909, 'batch_loss/train': 1.0083413682878017} 12/25/2021 14:18:35 - INFO - codeparrot_training - Step 19910: {'lr': 0.00033017170378136064, 'samples': 10194432, 'steps': 19910, 'batch_loss/train': 1.0087790917605162} 12/25/2021 14:18:45 - INFO - codeparrot_training - Step 19911: {'lr': 0.00033015677301318814, 'samples': 10194944, 'steps': 19911, 'batch_loss/train': 0.7625384614802897} 12/25/2021 14:18:59 - INFO - codeparrot_training - Step 19912: {'lr': 0.00033014184192634255, 'samples': 10195456, 'steps': 19912, 'batch_loss/train': 0.8146240741480142} 12/25/2021 14:19:10 - INFO - codeparrot_training - Step 19913: {'lr': 0.0003301269105208833, 'samples': 10195968, 'steps': 19913, 'batch_loss/train': 0.9853343833237886} 12/25/2021 14:19:20 - INFO - codeparrot_training - Step 19914: {'lr': 0.00033011197879686974, 'samples': 10196480, 'steps': 19914, 'batch_loss/train': 0.6982919310103171} 12/25/2021 14:19:32 - INFO - codeparrot_training - Step 19915: {'lr': 0.0003300970467543611, 'samples': 10196992, 'steps': 19915, 'batch_loss/train': 0.8738999515771866} 12/25/2021 14:19:43 - INFO - codeparrot_training - Step 19916: {'lr': 0.00033008211439341693, 'samples': 10197504, 'steps': 19916, 'batch_loss/train': 0.7813809788785875} 12/25/2021 14:19:53 - INFO - codeparrot_training - Step 19917: {'lr': 0.0003300671817140965, 'samples': 10198016, 'steps': 19917, 'batch_loss/train': 0.7937680706381798} 12/25/2021 14:20:04 - INFO - codeparrot_training - Step 19918: {'lr': 0.00033005224871645916, 'samples': 10198528, 'steps': 19918, 'batch_loss/train': 0.7303359038196504} 12/25/2021 14:20:16 - INFO - codeparrot_training - Step 19919: {'lr': 0.0003300373154005643, 'samples': 10199040, 'steps': 19919, 'batch_loss/train': 0.7416617004200816} 12/25/2021 14:20:27 - INFO - codeparrot_training - Step 19920: {'lr': 0.0003300223817664713, 'samples': 10199552, 'steps': 19920, 'batch_loss/train': 0.8060117270797491} 12/25/2021 14:20:37 - INFO - codeparrot_training - Step 
19921: {'lr': 0.0003300074478142396, 'samples': 10200064, 'steps': 19921, 'batch_loss/train': 0.6775719353463501} 12/25/2021 14:20:51 - INFO - codeparrot_training - Step 19922: {'lr': 0.00032999251354392834, 'samples': 10200576, 'steps': 19922, 'batch_loss/train': 0.7454033883986995} 12/25/2021 14:21:02 - INFO - codeparrot_training - Step 19923: {'lr': 0.00032997757895559724, 'samples': 10201088, 'steps': 19923, 'batch_loss/train': 1.4975067800842226} 12/25/2021 14:21:13 - INFO - codeparrot_training - Step 19924: {'lr': 0.00032996264404930537, 'samples': 10201600, 'steps': 19924, 'batch_loss/train': 0.7908287439495325} 12/25/2021 14:21:25 - INFO - codeparrot_training - Step 19925: {'lr': 0.0003299477088251122, 'samples': 10202112, 'steps': 19925, 'batch_loss/train': 0.6901120850816369} 12/25/2021 14:21:35 - INFO - codeparrot_training - Step 19926: {'lr': 0.0003299327732830772, 'samples': 10202624, 'steps': 19926, 'batch_loss/train': 0.7438973467797041} 12/25/2021 14:21:46 - INFO - codeparrot_training - Step 19927: {'lr': 0.00032991783742325964, 'samples': 10203136, 'steps': 19927, 'batch_loss/train': 0.8252671724185348} 12/25/2021 14:21:56 - INFO - codeparrot_training - Step 19928: {'lr': 0.0003299029012457189, 'samples': 10203648, 'steps': 19928, 'batch_loss/train': 0.7015080023556948} 12/25/2021 14:22:08 - INFO - codeparrot_training - Step 19929: {'lr': 0.00032988796475051446, 'samples': 10204160, 'steps': 19929, 'batch_loss/train': 0.8372968700714409} 12/25/2021 14:22:19 - INFO - codeparrot_training - Step 19930: {'lr': 0.00032987302793770565, 'samples': 10204672, 'steps': 19930, 'batch_loss/train': 0.7796865825075656} 12/25/2021 14:22:30 - INFO - codeparrot_training - Step 19931: {'lr': 0.00032985809080735175, 'samples': 10205184, 'steps': 19931, 'batch_loss/train': 0.6867331725079566} 12/25/2021 14:22:43 - INFO - codeparrot_training - Step 19932: {'lr': 0.00032984315335951234, 'samples': 10205696, 'steps': 19932, 'batch_loss/train': 0.7338219359517097} 
12/25/2021 14:22:54 - INFO - codeparrot_training - Step 19933: {'lr': 0.00032982821559424656, 'samples': 10206208, 'steps': 19933, 'batch_loss/train': 0.6977458447217941} 12/25/2021 14:23:05 - INFO - codeparrot_training - Step 19934: {'lr': 0.00032981327751161404, 'samples': 10206720, 'steps': 19934, 'batch_loss/train': 0.7640957322437316} 12/25/2021 14:23:17 - INFO - codeparrot_training - Step 19935: {'lr': 0.00032979833911167403, 'samples': 10207232, 'steps': 19935, 'batch_loss/train': 0.7297810195013881} 12/25/2021 14:23:27 - INFO - codeparrot_training - Step 19936: {'lr': 0.000329783400394486, 'samples': 10207744, 'steps': 19936, 'batch_loss/train': 0.7482617879286408} 12/25/2021 14:23:38 - INFO - codeparrot_training - Step 19937: {'lr': 0.0003297684613601093, 'samples': 10208256, 'steps': 19937, 'batch_loss/train': 0.7517013885080814} 12/25/2021 14:23:50 - INFO - codeparrot_training - Step 19938: {'lr': 0.0003297535220086032, 'samples': 10208768, 'steps': 19938, 'batch_loss/train': 0.6649945543031208} 12/25/2021 14:24:01 - INFO - codeparrot_training - Step 19939: {'lr': 0.00032973858234002727, 'samples': 10209280, 'steps': 19939, 'batch_loss/train': 0.826441315934062} 12/25/2021 14:24:11 - INFO - codeparrot_training - Step 19940: {'lr': 0.000329723642354441, 'samples': 10209792, 'steps': 19940, 'batch_loss/train': 0.7439119014889002} 12/25/2021 14:24:22 - INFO - codeparrot_training - Step 19941: {'lr': 0.0003297087020519034, 'samples': 10210304, 'steps': 19941, 'batch_loss/train': 0.7858030740171671} 12/25/2021 14:24:36 - INFO - codeparrot_training - Step 19942: {'lr': 0.0003296937614324742, 'samples': 10210816, 'steps': 19942, 'batch_loss/train': 0.8749083776492625} 12/25/2021 14:24:47 - INFO - codeparrot_training - Step 19943: {'lr': 0.0003296788204962127, 'samples': 10211328, 'steps': 19943, 'batch_loss/train': 1.066975319525227} 12/25/2021 14:24:57 - INFO - codeparrot_training - Step 19944: {'lr': 0.0003296638792431783, 'samples': 10211840, 'steps': 19944, 
'batch_loss/train': 0.8326115552335978} 12/25/2021 14:25:09 - INFO - codeparrot_training - Step 19945: {'lr': 0.00032964893767343037, 'samples': 10212352, 'steps': 19945, 'batch_loss/train': 0.8072260865010321} 12/25/2021 14:25:20 - INFO - codeparrot_training - Step 19946: {'lr': 0.0003296339957870284, 'samples': 10212864, 'steps': 19946, 'batch_loss/train': 0.7976519018411636} 12/25/2021 14:25:30 - INFO - codeparrot_training - Step 19947: {'lr': 0.0003296190535840316, 'samples': 10213376, 'steps': 19947, 'batch_loss/train': 0.7497691176831722} 12/25/2021 14:25:42 - INFO - codeparrot_training - Step 19948: {'lr': 0.00032960411106449955, 'samples': 10213888, 'steps': 19948, 'batch_loss/train': 0.6331467472482473} 12/25/2021 14:25:53 - INFO - codeparrot_training - Step 19949: {'lr': 0.0003295891682284917, 'samples': 10214400, 'steps': 19949, 'batch_loss/train': 0.8795682974159718} 12/25/2021 14:26:04 - INFO - codeparrot_training - Step 19950: {'lr': 0.00032957422507606733, 'samples': 10214912, 'steps': 19950, 'batch_loss/train': 0.8218779666349292} 12/25/2021 14:26:14 - INFO - codeparrot_training - Step 19951: {'lr': 0.0003295592816072859, 'samples': 10215424, 'steps': 19951, 'batch_loss/train': 0.7252268576994538} 12/25/2021 14:26:28 - INFO - codeparrot_training - Step 19952: {'lr': 0.0003295443378222067, 'samples': 10215936, 'steps': 19952, 'batch_loss/train': 1.094376103952527} 12/25/2021 14:26:39 - INFO - codeparrot_training - Step 19953: {'lr': 0.0003295293937208894, 'samples': 10216448, 'steps': 19953, 'batch_loss/train': 0.8567602355033159} 12/25/2021 14:26:49 - INFO - codeparrot_training - Step 19954: {'lr': 0.00032951444930339315, 'samples': 10216960, 'steps': 19954, 'batch_loss/train': 0.7525981448125094} 12/25/2021 14:27:02 - INFO - codeparrot_training - Step 19955: {'lr': 0.00032949950456977753, 'samples': 10217472, 'steps': 19955, 'batch_loss/train': 0.8104070825502276} 12/25/2021 14:27:12 - INFO - codeparrot_training - Step 19956: {'lr': 
0.00032948455952010186, 'samples': 10217984, 'steps': 19956, 'batch_loss/train': 0.8436165535822511} 12/25/2021 14:27:23 - INFO - codeparrot_training - Step 19957: {'lr': 0.00032946961415442554, 'samples': 10218496, 'steps': 19957, 'batch_loss/train': 0.7527623474597931} 12/25/2021 14:27:35 - INFO - codeparrot_training - Step 19958: {'lr': 0.0003294546684728082, 'samples': 10219008, 'steps': 19958, 'batch_loss/train': 0.9655302884057164} 12/25/2021 14:27:45 - INFO - codeparrot_training - Step 19959: {'lr': 0.00032943972247530896, 'samples': 10219520, 'steps': 19959, 'batch_loss/train': 0.739163912832737} 12/25/2021 14:27:56 - INFO - codeparrot_training - Step 19960: {'lr': 0.0003294247761619875, 'samples': 10220032, 'steps': 19960, 'batch_loss/train': 0.811477170791477} 12/25/2021 14:28:10 - INFO - codeparrot_training - Step 19961: {'lr': 0.000329409829532903, 'samples': 10220544, 'steps': 19961, 'batch_loss/train': 0.7990440207067877} 12/25/2021 14:28:20 - INFO - codeparrot_training - Step 19962: {'lr': 0.00032939488258811506, 'samples': 10221056, 'steps': 19962, 'batch_loss/train': 0.7589819473214447} 12/25/2021 14:28:31 - INFO - codeparrot_training - Step 19963: {'lr': 0.00032937993532768307, 'samples': 10221568, 'steps': 19963, 'batch_loss/train': 0.767220149282366} 12/25/2021 14:28:41 - INFO - codeparrot_training - Step 19964: {'lr': 0.00032936498775166633, 'samples': 10222080, 'steps': 19964, 'batch_loss/train': 0.7953554354608059} 12/25/2021 14:28:54 - INFO - codeparrot_training - Step 19965: {'lr': 0.0003293500398601245, 'samples': 10222592, 'steps': 19965, 'batch_loss/train': 0.7605789350345731} 12/25/2021 14:29:04 - INFO - codeparrot_training - Step 19966: {'lr': 0.00032933509165311684, 'samples': 10223104, 'steps': 19966, 'batch_loss/train': 0.7034329222515225} 12/25/2021 14:29:15 - INFO - codeparrot_training - Step 19967: {'lr': 0.0003293201431307027, 'samples': 10223616, 'steps': 19967, 'batch_loss/train': 0.9052071701735258} 12/25/2021 14:29:29 - INFO 
- codeparrot_training - Step 19968: {'lr': 0.0003293051942929417, 'samples': 10224128, 'steps': 19968, 'batch_loss/train': 0.7511838860809803} 12/25/2021 14:29:39 - INFO - codeparrot_training - Step 19969: {'lr': 0.0003292902451398932, 'samples': 10224640, 'steps': 19969, 'batch_loss/train': 0.7952274018898606} 12/25/2021 14:29:50 - INFO - codeparrot_training - Step 19970: {'lr': 0.00032927529567161655, 'samples': 10225152, 'steps': 19970, 'batch_loss/train': 0.7186517957597971} 12/25/2021 14:30:02 - INFO - codeparrot_training - Step 19971: {'lr': 0.0003292603458881714, 'samples': 10225664, 'steps': 19971, 'batch_loss/train': 0.6746059352299199} 12/25/2021 14:30:13 - INFO - codeparrot_training - Step 19972: {'lr': 0.00032924539578961694, 'samples': 10226176, 'steps': 19972, 'batch_loss/train': 0.7467587585560977} 12/25/2021 14:30:23 - INFO - codeparrot_training - Step 19973: {'lr': 0.0003292304453760127, 'samples': 10226688, 'steps': 19973, 'batch_loss/train': 0.828743489459157} 12/25/2021 14:30:34 - INFO - codeparrot_training - Step 19974: {'lr': 0.0003292154946474182, 'samples': 10227200, 'steps': 19974, 'batch_loss/train': 0.7379760518670082} 12/25/2021 14:30:46 - INFO - codeparrot_training - Step 19975: {'lr': 0.0003292005436038927, 'samples': 10227712, 'steps': 19975, 'batch_loss/train': 0.5658871012856252} 12/25/2021 14:30:57 - INFO - codeparrot_training - Step 19976: {'lr': 0.0003291855922454959, 'samples': 10228224, 'steps': 19976, 'batch_loss/train': 0.5042017662199214} 12/25/2021 14:31:08 - INFO - codeparrot_training - Step 19977: {'lr': 0.00032917064057228694, 'samples': 10228736, 'steps': 19977, 'batch_loss/train': 1.4931736555881798} 12/25/2021 14:31:20 - INFO - codeparrot_training - Step 19978: {'lr': 0.00032915568858432544, 'samples': 10229248, 'steps': 19978, 'batch_loss/train': 0.8132456606253982} 12/25/2021 14:31:30 - INFO - codeparrot_training - Step 19979: {'lr': 0.00032914073628167084, 'samples': 10229760, 'steps': 19979, 'batch_loss/train': 
0.7897936380468309} 12/25/2021 14:31:41 - INFO - codeparrot_training - Step 19980: {'lr': 0.0003291257836643825, 'samples': 10230272, 'steps': 19980, 'batch_loss/train': 0.6945570079842582} 12/25/2021 14:31:55 - INFO - codeparrot_training - Step 19981: {'lr': 0.00032911083073251996, 'samples': 10230784, 'steps': 19981, 'batch_loss/train': 0.6162888593971729} 12/25/2021 14:32:05 - INFO - codeparrot_training - Step 19982: {'lr': 0.00032909587748614267, 'samples': 10231296, 'steps': 19982, 'batch_loss/train': 0.8591531924903393} 12/25/2021 14:32:16 - INFO - codeparrot_training - Step 19983: {'lr': 0.00032908092392531, 'samples': 10231808, 'steps': 19983, 'batch_loss/train': 0.7430411870591342} 12/25/2021 14:32:26 - INFO - codeparrot_training - Step 19984: {'lr': 0.0003290659700500814, 'samples': 10232320, 'steps': 19984, 'batch_loss/train': 0.7931963698938489} 12/25/2021 14:32:39 - INFO - codeparrot_training - Step 19985: {'lr': 0.0003290510158605164, 'samples': 10232832, 'steps': 19985, 'batch_loss/train': 0.6738342014723457} 12/25/2021 14:32:49 - INFO - codeparrot_training - Step 19986: {'lr': 0.0003290360613566744, 'samples': 10233344, 'steps': 19986, 'batch_loss/train': 0.8295544483698905} 12/25/2021 14:33:00 - INFO - codeparrot_training - Step 19987: {'lr': 0.0003290211065386148, 'samples': 10233856, 'steps': 19987, 'batch_loss/train': 0.7424704981967807} 12/25/2021 14:33:12 - INFO - codeparrot_training - Step 19988: {'lr': 0.00032900615140639725, 'samples': 10234368, 'steps': 19988, 'batch_loss/train': 0.6992637771181762} 12/25/2021 14:33:23 - INFO - codeparrot_training - Step 19989: {'lr': 0.00032899119596008106, 'samples': 10234880, 'steps': 19989, 'batch_loss/train': 0.7570667510153726} 12/25/2021 14:33:33 - INFO - codeparrot_training - Step 19990: {'lr': 0.0003289762401997256, 'samples': 10235392, 'steps': 19990, 'batch_loss/train': 0.7790047624148428} 12/25/2021 14:33:47 - INFO - codeparrot_training - Step 19991: {'lr': 0.0003289612841253905, 'samples': 
10235904, 'steps': 19991, 'batch_loss/train': 0.7194998664781451} 12/25/2021 14:33:58 - INFO - codeparrot_training - Step 19992: {'lr': 0.00032894632773713506, 'samples': 10236416, 'steps': 19992, 'batch_loss/train': 0.503225272288546} 12/25/2021 14:34:08 - INFO - codeparrot_training - Step 19993: {'lr': 0.0003289313710350189, 'samples': 10236928, 'steps': 19993, 'batch_loss/train': 0.8917665891349316} 12/25/2021 14:34:19 - INFO - codeparrot_training - Step 19994: {'lr': 0.0003289164140191014, 'samples': 10237440, 'steps': 19994, 'batch_loss/train': 0.7171231503598392} 12/25/2021 14:34:31 - INFO - codeparrot_training - Step 19995: {'lr': 0.0003289014566894421, 'samples': 10237952, 'steps': 19995, 'batch_loss/train': 0.7587660774588585} 12/25/2021 14:34:42 - INFO - codeparrot_training - Step 19996: {'lr': 0.00032888649904610026, 'samples': 10238464, 'steps': 19996, 'batch_loss/train': 0.7848679977469146} 12/25/2021 14:34:52 - INFO - codeparrot_training - Step 19997: {'lr': 0.00032887154108913563, 'samples': 10238976, 'steps': 19997, 'batch_loss/train': 0.7392015559598804} 12/25/2021 14:35:06 - INFO - codeparrot_training - Step 19998: {'lr': 0.00032885658281860743, 'samples': 10239488, 'steps': 19998, 'batch_loss/train': 0.7149940677918494} 12/25/2021 14:35:17 - INFO - codeparrot_training - Step 19999: {'lr': 0.0003288416242345753, 'samples': 10240000, 'steps': 19999, 'batch_loss/train': 0.9672560142353177} 12/25/2021 14:35:17 - INFO - codeparrot_training - Evaluating and saving model checkpoint 12/25/2021 14:38:39 - INFO - codeparrot_training - Step 20000: {'loss/eval': 0.8066897988319397, 'perplexity': 2.2404792308807373} 12/25/2021 14:38:57 - WARNING - huggingface_hub.repository - Several commits (6) will be pushed upstream. 
12/25/2021 14:39:10 - INFO - codeparrot_training - Step 20000: {'lr': 0.0003288266653370986, 'samples': 10240512, 'steps': 20000, 'batch_loss/train': 0.7516108351992443} 12/25/2021 14:39:22 - INFO - codeparrot_training - Step 20001: {'lr': 0.0003288117061262369, 'samples': 10241024, 'steps': 20001, 'batch_loss/train': 0.8406344018876553} 12/25/2021 14:39:33 - INFO - codeparrot_training - Step 20002: {'lr': 0.0003287967466020495, 'samples': 10241536, 'steps': 20002, 'batch_loss/train': 0.8381735887378454} 12/25/2021 14:39:44 - INFO - codeparrot_training - Step 20003: {'lr': 0.00032878178676459607, 'samples': 10242048, 'steps': 20003, 'batch_loss/train': 0.9467295035719872} 12/25/2021 14:39:54 - INFO - codeparrot_training - Step 20004: {'lr': 0.00032876682661393597, 'samples': 10242560, 'steps': 20004, 'batch_loss/train': 0.7659104950726032} 12/25/2021 14:40:06 - INFO - codeparrot_training - Step 20005: {'lr': 0.00032875186615012874, 'samples': 10243072, 'steps': 20005, 'batch_loss/train': 0.6687926545273513} 12/25/2021 14:40:17 - INFO - codeparrot_training - Step 20006: {'lr': 0.0003287369053732338, 'samples': 10243584, 'steps': 20006, 'batch_loss/train': 0.6712382787372917} 12/25/2021 14:40:28 - INFO - codeparrot_training - Step 20007: {'lr': 0.0003287219442833107, 'samples': 10244096, 'steps': 20007, 'batch_loss/train': 0.7737946566194296} 12/25/2021 14:40:40 - INFO - codeparrot_training - Step 20008: {'lr': 0.0003287069828804189, 'samples': 10244608, 'steps': 20008, 'batch_loss/train': 0.7921066023409367} 12/25/2021 14:40:50 - INFO - codeparrot_training - Step 20009: {'lr': 0.0003286920211646177, 'samples': 10245120, 'steps': 20009, 'batch_loss/train': 0.8887295555323362} 12/25/2021 14:41:01 - INFO - codeparrot_training - Step 20010: {'lr': 0.0003286770591359669, 'samples': 10245632, 'steps': 20010, 'batch_loss/train': 0.815077718347311} 12/25/2021 14:41:15 - INFO - codeparrot_training - Step 20011: {'lr': 0.00032866209679452576, 'samples': 10246144, 'steps': 
20011, 'batch_loss/train': 0.8467678255401552} 12/25/2021 14:41:25 - INFO - codeparrot_training - Step 20012: {'lr': 0.00032864713414035377, 'samples': 10246656, 'steps': 20012, 'batch_loss/train': 0.8618984646163881} 12/25/2021 14:41:36 - INFO - codeparrot_training - Step 20013: {'lr': 0.0003286321711735105, 'samples': 10247168, 'steps': 20013, 'batch_loss/train': 0.7339107468724251} 12/25/2021 14:41:48 - INFO - codeparrot_training - Step 20014: {'lr': 0.00032861720789405545, 'samples': 10247680, 'steps': 20014, 'batch_loss/train': 0.8533911649137735} 12/25/2021 14:41:59 - INFO - codeparrot_training - Step 20015: {'lr': 0.00032860224430204804, 'samples': 10248192, 'steps': 20015, 'batch_loss/train': 2.388478160602972} 12/25/2021 14:42:10 - INFO - codeparrot_training - Step 20016: {'lr': 0.0003285872803975478, 'samples': 10248704, 'steps': 20016, 'batch_loss/train': 0.7819903651252389} 12/25/2021 14:42:20 - INFO - codeparrot_training - Step 20017: {'lr': 0.00032857231618061425, 'samples': 10249216, 'steps': 20017, 'batch_loss/train': 0.7407124079763889} 12/25/2021 14:42:33 - INFO - codeparrot_training - Step 20018: {'lr': 0.00032855735165130675, 'samples': 10249728, 'steps': 20018, 'batch_loss/train': 0.7293716720305383} 12/25/2021 14:42:43 - INFO - codeparrot_training - Step 20019: {'lr': 0.0003285423868096849, 'samples': 10250240, 'steps': 20019, 'batch_loss/train': 0.6455809769686311} 12/25/2021 14:42:54 - INFO - codeparrot_training - Step 20020: {'lr': 0.0003285274216558082, 'samples': 10250752, 'steps': 20020, 'batch_loss/train': 0.7171271136030555} 12/25/2021 14:43:08 - INFO - codeparrot_training - Step 20021: {'lr': 0.00032851245618973615, 'samples': 10251264, 'steps': 20021, 'batch_loss/train': 0.7484619254246354} 12/25/2021 14:43:19 - INFO - codeparrot_training - Step 20022: {'lr': 0.0003284974904115281, 'samples': 10251776, 'steps': 20022, 'batch_loss/train': 0.6976329172030091} 12/25/2021 14:43:29 - INFO - codeparrot_training - Step 20023: {'lr': 
0.0003284825243212438, 'samples': 10252288, 'steps': 20023, 'batch_loss/train': 0.7659404070582241} 12/25/2021 14:43:41 - INFO - codeparrot_training - Step 20024: {'lr': 0.0003284675579189426, 'samples': 10252800, 'steps': 20024, 'batch_loss/train': 0.6969831681344658} 12/25/2021 14:43:52 - INFO - codeparrot_training - Step 20025: {'lr': 0.00032845259120468397, 'samples': 10253312, 'steps': 20025, 'batch_loss/train': 0.8191772177815437} 12/25/2021 14:44:02 - INFO - codeparrot_training - Step 20026: {'lr': 0.00032843762417852744, 'samples': 10253824, 'steps': 20026, 'batch_loss/train': 0.7254862440750003} 12/25/2021 14:44:13 - INFO - codeparrot_training - Step 20027: {'lr': 0.00032842265684053263, 'samples': 10254336, 'steps': 20027, 'batch_loss/train': 0.7323339870199561} 12/25/2021 14:44:27 - INFO - codeparrot_training - Step 20028: {'lr': 0.0003284076891907588, 'samples': 10254848, 'steps': 20028, 'batch_loss/train': 0.8810483990237117} 12/25/2021 14:44:38 - INFO - codeparrot_training - Step 20029: {'lr': 0.0003283927212292657, 'samples': 10255360, 'steps': 20029, 'batch_loss/train': 0.8804379999637604} 12/25/2021 14:44:48 - INFO - codeparrot_training - Step 20030: {'lr': 0.0003283777529561128, 'samples': 10255872, 'steps': 20030, 'batch_loss/train': 0.6936622997745872} 12/25/2021 14:45:00 - INFO - codeparrot_training - Step 20031: {'lr': 0.0003283627843713595, 'samples': 10256384, 'steps': 20031, 'batch_loss/train': 0.8088537321891636} 12/25/2021 14:45:11 - INFO - codeparrot_training - Step 20032: {'lr': 0.0003283478154750653, 'samples': 10256896, 'steps': 20032, 'batch_loss/train': 0.6748854169854894} 12/25/2021 14:45:22 - INFO - codeparrot_training - Step 20033: {'lr': 0.0003283328462672898, 'samples': 10257408, 'steps': 20033, 'batch_loss/train': 0.6950084373820573} 12/25/2021 14:45:34 - INFO - codeparrot_training - Step 20034: {'lr': 0.00032831787674809254, 'samples': 10257920, 'steps': 20034, 'batch_loss/train': 0.7824219177709892} 12/25/2021 14:45:45 - 
INFO - codeparrot_training - Step 20035: {'lr': 0.0003283029069175328, 'samples': 10258432, 'steps': 20035, 'batch_loss/train': 0.674139595386805} 12/25/2021 14:45:55 - INFO - codeparrot_training - Step 20036: {'lr': 0.00032828793677567046, 'samples': 10258944, 'steps': 20036, 'batch_loss/train': 0.881163471378386} 12/25/2021 14:46:09 - INFO - codeparrot_training - Step 20037: {'lr': 0.0003282729663225648, 'samples': 10259456, 'steps': 20037, 'batch_loss/train': 0.810365142300725} 12/25/2021 14:46:20 - INFO - codeparrot_training - Step 20038: {'lr': 0.00032825799555827534, 'samples': 10259968, 'steps': 20038, 'batch_loss/train': 0.8097698595374823} 12/25/2021 14:46:30 - INFO - codeparrot_training - Step 20039: {'lr': 0.00032824302448286163, 'samples': 10260480, 'steps': 20039, 'batch_loss/train': 0.8172009224072099} 12/25/2021 14:46:41 - INFO - codeparrot_training - Step 20040: {'lr': 0.00032822805309638326, 'samples': 10260992, 'steps': 20040, 'batch_loss/train': 0.6484505140106194} 12/25/2021 14:46:53 - INFO - codeparrot_training - Step 20041: {'lr': 0.0003282130813988996, 'samples': 10261504, 'steps': 20041, 'batch_loss/train': 0.7970918882638216} 12/25/2021 14:47:04 - INFO - codeparrot_training - Step 20042: {'lr': 0.0003281981093904703, 'samples': 10262016, 'steps': 20042, 'batch_loss/train': 0.7179042166098952} 12/25/2021 14:47:14 - INFO - codeparrot_training - Step 20043: {'lr': 0.0003281831370711549, 'samples': 10262528, 'steps': 20043, 'batch_loss/train': 0.6387337121414021} 12/25/2021 14:47:26 - INFO - codeparrot_training - Step 20044: {'lr': 0.00032816816444101283, 'samples': 10263040, 'steps': 20044, 'batch_loss/train': 0.7121214922517538} 12/25/2021 14:47:37 - INFO - codeparrot_training - Step 20045: {'lr': 0.00032815319150010356, 'samples': 10263552, 'steps': 20045, 'batch_loss/train': 0.6157715239096433} 12/25/2021 14:47:47 - INFO - codeparrot_training - Step 20046: {'lr': 0.00032813821824848684, 'samples': 10264064, 'steps': 20046, 
'batch_loss/train': 0.8005156945437193} 12/25/2021 14:48:01 - INFO - codeparrot_training - Step 20047: {'lr': 0.000328123244686222, 'samples': 10264576, 'steps': 20047, 'batch_loss/train': 0.9163009971380234} 12/25/2021 14:48:12 - INFO - codeparrot_training - Step 20048: {'lr': 0.0003281082708133687, 'samples': 10265088, 'steps': 20048, 'batch_loss/train': 0.8839359171688557} 12/25/2021 14:48:22 - INFO - codeparrot_training - Step 20049: {'lr': 0.00032809329662998637, 'samples': 10265600, 'steps': 20049, 'batch_loss/train': 0.6542446153762285} 12/25/2021 14:48:34 - INFO - codeparrot_training - Step 20050: {'lr': 0.0003280783221361346, 'samples': 10266112, 'steps': 20050, 'batch_loss/train': 0.7193212441634387} 12/25/2021 14:48:45 - INFO - codeparrot_training - Step 20051: {'lr': 0.00032806334733187286, 'samples': 10266624, 'steps': 20051, 'batch_loss/train': 0.7822445151396096} 12/25/2021 14:48:56 - INFO - codeparrot_training - Step 20052: {'lr': 0.00032804837221726076, 'samples': 10267136, 'steps': 20052, 'batch_loss/train': 0.7812391994521022} 12/25/2021 14:49:06 - INFO - codeparrot_training - Step 20053: {'lr': 0.0003280333967923578, 'samples': 10267648, 'steps': 20053, 'batch_loss/train': 0.7089542528847232} 12/25/2021 14:49:19 - INFO - codeparrot_training - Step 20054: {'lr': 0.00032801842105722345, 'samples': 10268160, 'steps': 20054, 'batch_loss/train': 1.0270322052529082} 12/25/2021 14:49:29 - INFO - codeparrot_training - Step 20055: {'lr': 0.00032800344501191734, 'samples': 10268672, 'steps': 20055, 'batch_loss/train': 0.7309125326573849} 12/25/2021 14:49:40 - INFO - codeparrot_training - Step 20056: {'lr': 0.0003279884686564991, 'samples': 10269184, 'steps': 20056, 'batch_loss/train': 0.7824658663012087} 12/25/2021 14:49:52 - INFO - codeparrot_training - Step 20057: {'lr': 0.00032797349199102806, 'samples': 10269696, 'steps': 20057, 'batch_loss/train': 0.7917037354782224} 12/25/2021 14:50:03 - INFO - codeparrot_training - Step 20058: {'lr': 
0.0003279585150155638, 'samples': 10270208, 'steps': 20058, 'batch_loss/train': 0.9297961136326194} 12/25/2021 14:50:14 - INFO - codeparrot_training - Step 20059: {'lr': 0.00032794353773016603, 'samples': 10270720, 'steps': 20059, 'batch_loss/train': 0.7397515955381095} 12/25/2021 14:50:27 - INFO - codeparrot_training - Step 20060: {'lr': 0.00032792856013489415, 'samples': 10271232, 'steps': 20060, 'batch_loss/train': 0.6303414835128933} 12/25/2021 14:50:38 - INFO - codeparrot_training - Step 20061: {'lr': 0.0003279135822298077, 'samples': 10271744, 'steps': 20061, 'batch_loss/train': 0.775074940174818} 12/25/2021 14:50:49 - INFO - codeparrot_training - Step 20062: {'lr': 0.00032789860401496643, 'samples': 10272256, 'steps': 20062, 'batch_loss/train': 0.6910626299795695} 12/25/2021 14:50:59 - INFO - codeparrot_training - Step 20063: {'lr': 0.00032788362549042953, 'samples': 10272768, 'steps': 20063, 'batch_loss/train': 0.8177893767133355} 12/25/2021 14:51:11 - INFO - codeparrot_training - Step 20064: {'lr': 0.0003278686466562568, 'samples': 10273280, 'steps': 20064, 'batch_loss/train': 0.7408711470197886} 12/25/2021 14:51:22 - INFO - codeparrot_training - Step 20065: {'lr': 0.00032785366751250775, 'samples': 10273792, 'steps': 20065, 'batch_loss/train': 0.6393524897284806} 12/25/2021 14:51:32 - INFO - codeparrot_training - Step 20066: {'lr': 0.0003278386880592419, 'samples': 10274304, 'steps': 20066, 'batch_loss/train': 0.7575772684067488} 12/25/2021 14:51:46 - INFO - codeparrot_training - Step 20067: {'lr': 0.0003278237082965188, 'samples': 10274816, 'steps': 20067, 'batch_loss/train': 0.7387090418487787} 12/25/2021 14:51:57 - INFO - codeparrot_training - Step 20068: {'lr': 0.00032780872822439806, 'samples': 10275328, 'steps': 20068, 'batch_loss/train': 0.7612328752875328} 12/25/2021 14:52:07 - INFO - codeparrot_training - Step 20069: {'lr': 0.0003277937478429392, 'samples': 10275840, 'steps': 20069, 'batch_loss/train': 0.8145161606371403} 12/25/2021 14:52:20 - 
INFO - codeparrot_training - Step 20070: {'lr': 0.00032777876715220177, 'samples': 10276352, 'steps': 20070, 'batch_loss/train': 0.743548353202641} 12/25/2021 14:52:30 - INFO - codeparrot_training - Step 20071: {'lr': 0.00032776378615224526, 'samples': 10276864, 'steps': 20071, 'batch_loss/train': 0.8458421411924064} 12/25/2021 14:52:41 - INFO - codeparrot_training - Step 20072: {'lr': 0.00032774880484312936, 'samples': 10277376, 'steps': 20072, 'batch_loss/train': 0.7802700409665704} 12/25/2021 14:52:51 - INFO - codeparrot_training - Step 20073: {'lr': 0.00032773382322491356, 'samples': 10277888, 'steps': 20073, 'batch_loss/train': 0.5920307002670597} 12/25/2021 14:53:04 - INFO - codeparrot_training - Step 20074: {'lr': 0.00032771884129765747, 'samples': 10278400, 'steps': 20074, 'batch_loss/train': 0.7829324770718813} 12/25/2021 14:53:14 - INFO - codeparrot_training - Step 20075: {'lr': 0.00032770385906142057, 'samples': 10278912, 'steps': 20075, 'batch_loss/train': 0.7401765789836645} 12/25/2021 14:53:25 - INFO - codeparrot_training - Step 20076: {'lr': 0.00032768887651626243, 'samples': 10279424, 'steps': 20076, 'batch_loss/train': 0.6357410273049027} 12/25/2021 14:53:39 - INFO - codeparrot_training - Step 20077: {'lr': 0.0003276738936622427, 'samples': 10279936, 'steps': 20077, 'batch_loss/train': 0.8270188625901937} 12/25/2021 14:53:49 - INFO - codeparrot_training - Step 20078: {'lr': 0.00032765891049942085, 'samples': 10280448, 'steps': 20078, 'batch_loss/train': 0.7156894970685244} 12/25/2021 14:54:00 - INFO - codeparrot_training - Step 20079: {'lr': 0.00032764392702785657, 'samples': 10280960, 'steps': 20079, 'batch_loss/train': 0.6772412760183215} 12/25/2021 14:54:12 - INFO - codeparrot_training - Step 20080: {'lr': 0.00032762894324760925, 'samples': 10281472, 'steps': 20080, 'batch_loss/train': 0.729524738737382} 12/25/2021 14:54:23 - INFO - codeparrot_training - Step 20081: {'lr': 0.0003276139591587387, 'samples': 10281984, 'steps': 20081, 
'batch_loss/train': 0.6401456231833436} 12/25/2021 14:54:33 - INFO - codeparrot_training - Step 20082: {'lr': 0.0003275989747613043, 'samples': 10282496, 'steps': 20082, 'batch_loss/train': 0.7149079266237095} 12/25/2021 14:54:47 - INFO - codeparrot_training - Step 20083: {'lr': 0.0003275839900553656, 'samples': 10283008, 'steps': 20083, 'batch_loss/train': 0.7675534207373857} 12/25/2021 14:54:58 - INFO - codeparrot_training - Step 20084: {'lr': 0.0003275690050409824, 'samples': 10283520, 'steps': 20084, 'batch_loss/train': 1.2192465206608176} 12/25/2021 14:55:08 - INFO - codeparrot_training - Step 20085: {'lr': 0.00032755401971821396, 'samples': 10284032, 'steps': 20085, 'batch_loss/train': 0.6976623372174799} 12/25/2021 14:55:19 - INFO - codeparrot_training - Step 20086: {'lr': 0.0003275390340871201, 'samples': 10284544, 'steps': 20086, 'batch_loss/train': 0.8102625459432602} 12/25/2021 14:55:31 - INFO - codeparrot_training - Step 20087: {'lr': 0.00032752404814776035, 'samples': 10285056, 'steps': 20087, 'batch_loss/train': 0.7414546692743897} 12/25/2021 14:55:41 - INFO - codeparrot_training - Step 20088: {'lr': 0.00032750906190019425, 'samples': 10285568, 'steps': 20088, 'batch_loss/train': 0.6571818871889263} 12/25/2021 14:55:52 - INFO - codeparrot_training - Step 20089: {'lr': 0.0003274940753444813, 'samples': 10286080, 'steps': 20089, 'batch_loss/train': 0.699192576110363} 12/25/2021 14:56:04 - INFO - codeparrot_training - Step 20090: {'lr': 0.00032747908848068126, 'samples': 10286592, 'steps': 20090, 'batch_loss/train': 0.7862463500350714} 12/25/2021 14:56:15 - INFO - codeparrot_training - Step 20091: {'lr': 0.0003274641013088536, 'samples': 10287104, 'steps': 20091, 'batch_loss/train': 0.7615328141837381} 12/25/2021 14:56:25 - INFO - codeparrot_training - Step 20092: {'lr': 0.00032744911382905783, 'samples': 10287616, 'steps': 20092, 'batch_loss/train': 0.8103056391701102} 12/25/2021 14:56:37 - INFO - codeparrot_training - Step 20093: {'lr': 
0.0003274341260413537, 'samples': 10288128, 'steps': 20093, 'batch_loss/train': 0.8226445131003857} 12/25/2021 14:56:48 - INFO - codeparrot_training - Step 20094: {'lr': 0.00032741913794580074, 'samples': 10288640, 'steps': 20094, 'batch_loss/train': 0.8109539528377354} 12/25/2021 14:56:59 - INFO - codeparrot_training - Step 20095: {'lr': 0.00032740414954245855, 'samples': 10289152, 'steps': 20095, 'batch_loss/train': 0.7862735516391695} 12/25/2021 14:57:13 - INFO - codeparrot_training - Step 20096: {'lr': 0.0003273891608313866, 'samples': 10289664, 'steps': 20096, 'batch_loss/train': 0.872495487332344} 12/25/2021 14:57:23 - INFO - codeparrot_training - Step 20097: {'lr': 0.00032737417181264464, 'samples': 10290176, 'steps': 20097, 'batch_loss/train': 0.7460854798555374} 12/25/2021 14:57:34 - INFO - codeparrot_training - Step 20098: {'lr': 0.00032735918248629213, 'samples': 10290688, 'steps': 20098, 'batch_loss/train': 0.5575914866640233} 12/25/2021 14:57:44 - INFO - codeparrot_training - Step 20099: {'lr': 0.00032734419285238876, 'samples': 10291200, 'steps': 20099, 'batch_loss/train': 0.7678386215120554} 12/25/2021 14:57:56 - INFO - codeparrot_training - Step 20100: {'lr': 0.00032732920291099406, 'samples': 10291712, 'steps': 20100, 'batch_loss/train': 0.7541093854233623} 12/25/2021 14:58:07 - INFO - codeparrot_training - Step 20101: {'lr': 0.0003273142126621677, 'samples': 10292224, 'steps': 20101, 'batch_loss/train': 0.6499490608694032} 12/25/2021 14:58:18 - INFO - codeparrot_training - Step 20102: {'lr': 0.00032729922210596917, 'samples': 10292736, 'steps': 20102, 'batch_loss/train': 0.8031998868100345} 12/25/2021 14:58:30 - INFO - codeparrot_training - Step 20103: {'lr': 0.0003272842312424581, 'samples': 10293248, 'steps': 20103, 'batch_loss/train': 0.7326462920755148} 12/25/2021 14:58:40 - INFO - codeparrot_training - Step 20104: {'lr': 0.0003272692400716942, 'samples': 10293760, 'steps': 20104, 'batch_loss/train': 0.8617498744279146} 12/25/2021 14:58:51 - 
INFO - codeparrot_training - Step 20105: {'lr': 0.00032725424859373687, 'samples': 10294272, 'steps': 20105, 'batch_loss/train': 1.1560103381052613} 12/25/2021 14:59:05 - INFO - codeparrot_training - Step 20106: {'lr': 0.00032723925680864586, 'samples': 10294784, 'steps': 20106, 'batch_loss/train': 0.8060187580995262} 12/25/2021 14:59:16 - INFO - codeparrot_training - Step 20107: {'lr': 0.0003272242647164807, 'samples': 10295296, 'steps': 20107, 'batch_loss/train': 0.8061449071392417} 12/25/2021 14:59:26 - INFO - codeparrot_training - Step 20108: {'lr': 0.00032720927231730116, 'samples': 10295808, 'steps': 20108, 'batch_loss/train': 0.7599426340311766} 12/25/2021 14:59:37 - INFO - codeparrot_training - Step 20109: {'lr': 0.0003271942796111665, 'samples': 10296320, 'steps': 20109, 'batch_loss/train': 0.7835470177233219} 12/25/2021 14:59:49 - INFO - codeparrot_training - Step 20110: {'lr': 0.0003271792865981367, 'samples': 10296832, 'steps': 20110, 'batch_loss/train': 0.8343117968179286} 12/25/2021 15:00:00 - INFO - codeparrot_training - Step 20111: {'lr': 0.00032716429327827115, 'samples': 10297344, 'steps': 20111, 'batch_loss/train': 0.7587038329802454} 12/25/2021 15:00:10 - INFO - codeparrot_training - Step 20112: {'lr': 0.0003271492996516294, 'samples': 10297856, 'steps': 20112, 'batch_loss/train': 0.8070117589086294} 12/25/2021 15:00:25 - INFO - codeparrot_training - Step 20113: {'lr': 0.0003271343057182713, 'samples': 10298368, 'steps': 20113, 'batch_loss/train': 0.7165342676453292} 12/25/2021 15:00:35 - INFO - codeparrot_training - Step 20114: {'lr': 0.00032711931147825634, 'samples': 10298880, 'steps': 20114, 'batch_loss/train': 0.7219824523781426} 12/25/2021 15:00:46 - INFO - codeparrot_training - Step 20115: {'lr': 0.000327104316931644, 'samples': 10299392, 'steps': 20115, 'batch_loss/train': 0.7921135718934238} 12/25/2021 15:00:58 - INFO - codeparrot_training - Step 20116: {'lr': 0.00032708932207849405, 'samples': 10299904, 'steps': 20116, 
'batch_loss/train': 0.6649726863251999} 12/25/2021 15:01:08 - INFO - codeparrot_training - Step 20117: {'lr': 0.00032707432691886606, 'samples': 10300416, 'steps': 20117, 'batch_loss/train': 0.7171965688467026} 12/25/2021 15:01:19 - INFO - codeparrot_training - Step 20118: {'lr': 0.0003270593314528197, 'samples': 10300928, 'steps': 20118, 'batch_loss/train': 0.669920374522917} 12/25/2021 15:01:30 - INFO - codeparrot_training - Step 20119: {'lr': 0.0003270443356804145, 'samples': 10301440, 'steps': 20119, 'batch_loss/train': 0.7811965593136847} 12/25/2021 15:01:42 - INFO - codeparrot_training - Step 20120: {'lr': 0.0003270293396017101, 'samples': 10301952, 'steps': 20120, 'batch_loss/train': 0.787470560928341} 12/25/2021 15:01:52 - INFO - codeparrot_training - Step 20121: {'lr': 0.0003270143432167662, 'samples': 10302464, 'steps': 20121, 'batch_loss/train': 0.8766360559966415} 12/25/2021 15:02:03 - INFO - codeparrot_training - Step 20122: {'lr': 0.00032699934652564224, 'samples': 10302976, 'steps': 20122, 'batch_loss/train': 0.7705223890952766} 12/25/2021 15:02:17 - INFO - codeparrot_training - Step 20123: {'lr': 0.0003269843495283981, 'samples': 10303488, 'steps': 20123, 'batch_loss/train': 0.6953254286199808} 12/25/2021 15:02:27 - INFO - codeparrot_training - Step 20124: {'lr': 0.0003269693522250931, 'samples': 10304000, 'steps': 20124, 'batch_loss/train': 0.8233179738745093} 12/25/2021 15:02:38 - INFO - codeparrot_training - Step 20125: {'lr': 0.00032695435461578705, 'samples': 10304512, 'steps': 20125, 'batch_loss/train': 0.7685092235915363} 12/25/2021 15:02:50 - INFO - codeparrot_training - Step 20126: {'lr': 0.0003269393567005396, 'samples': 10305024, 'steps': 20126, 'batch_loss/train': 0.8185796262696385} 12/25/2021 15:03:01 - INFO - codeparrot_training - Step 20127: {'lr': 0.0003269243584794103, 'samples': 10305536, 'steps': 20127, 'batch_loss/train': 0.8832245694939047} 12/25/2021 15:03:11 - INFO - codeparrot_training - Step 20128: {'lr': 
0.00032690935995245875, 'samples': 10306048, 'steps': 20128, 'batch_loss/train': 0.7812029973138124} 12/25/2021 15:03:22 - INFO - codeparrot_training - Step 20129: {'lr': 0.0003268943611197446, 'samples': 10306560, 'steps': 20129, 'batch_loss/train': 1.5453306627459824} 12/25/2021 15:03:34 - INFO - codeparrot_training - Step 20130: {'lr': 0.0003268793619813277, 'samples': 10307072, 'steps': 20130, 'batch_loss/train': 0.7822633031755686} 12/25/2021 15:03:44 - INFO - codeparrot_training - Step 20131: {'lr': 0.0003268643625372673, 'samples': 10307584, 'steps': 20131, 'batch_loss/train': 0.7567244851961732} 12/25/2021 15:03:55 - INFO - codeparrot_training - Step 20132: {'lr': 0.0003268493627876233, 'samples': 10308096, 'steps': 20132, 'batch_loss/train': 0.7062480635941029} 12/25/2021 15:04:07 - INFO - codeparrot_training - Step 20133: {'lr': 0.0003268343627324552, 'samples': 10308608, 'steps': 20133, 'batch_loss/train': 0.7656051575904712} 12/25/2021 15:04:18 - INFO - codeparrot_training - Step 20134: {'lr': 0.00032681936237182276, 'samples': 10309120, 'steps': 20134, 'batch_loss/train': 0.5846809475915506} 12/25/2021 15:04:28 - INFO - codeparrot_training - Step 20135: {'lr': 0.0003268043617057854, 'samples': 10309632, 'steps': 20135, 'batch_loss/train': 0.7568809166550636} 12/25/2021 15:04:43 - INFO - codeparrot_training - Step 20136: {'lr': 0.00032678936073440303, 'samples': 10310144, 'steps': 20136, 'batch_loss/train': 0.7600237817969173} 12/25/2021 15:04:53 - INFO - codeparrot_training - Step 20137: {'lr': 0.0003267743594577351, 'samples': 10310656, 'steps': 20137, 'batch_loss/train': 0.7996827820315957} 12/25/2021 15:05:04 - INFO - codeparrot_training - Step 20138: {'lr': 0.0003267593578758413, 'samples': 10311168, 'steps': 20138, 'batch_loss/train': 0.7702654208987951} 12/25/2021 15:05:14 - INFO - codeparrot_training - Step 20139: {'lr': 0.0003267443559887813, 'samples': 10311680, 'steps': 20139, 'batch_loss/train': 0.9349186718463898} 12/25/2021 15:05:27 - INFO 
- codeparrot_training - Step 20140: {'lr': 0.0003267293537966147, 'samples': 10312192, 'steps': 20140, 'batch_loss/train': 2.6010182425379753} 12/25/2021 15:05:37 - INFO - codeparrot_training - Step 20141: {'lr': 0.00032671435129940117, 'samples': 10312704, 'steps': 20141, 'batch_loss/train': 0.7805527318269014} 12/25/2021 15:05:48 - INFO - codeparrot_training - Step 20142: {'lr': 0.00032669934849720035, 'samples': 10313216, 'steps': 20142, 'batch_loss/train': 0.7404842376708984} 12/25/2021 15:06:00 - INFO - codeparrot_training - Step 20143: {'lr': 0.0003266843453900719, 'samples': 10313728, 'steps': 20143, 'batch_loss/train': 0.5931840591947548} 12/25/2021 15:06:11 - INFO - codeparrot_training - Step 20144: {'lr': 0.0003266693419780754, 'samples': 10314240, 'steps': 20144, 'batch_loss/train': 0.813837799243629} 12/25/2021 15:06:21 - INFO - codeparrot_training - Step 20145: {'lr': 0.0003266543382612706, 'samples': 10314752, 'steps': 20145, 'batch_loss/train': 0.8273656107485294} 12/25/2021 15:06:35 - INFO - codeparrot_training - Step 20146: {'lr': 0.00032663933423971705, 'samples': 10315264, 'steps': 20146, 'batch_loss/train': 0.670067643513903} 12/25/2021 15:06:45 - INFO - codeparrot_training - Step 20147: {'lr': 0.0003266243299134745, 'samples': 10315776, 'steps': 20147, 'batch_loss/train': 1.5024842992424965} 12/25/2021 15:06:56 - INFO - codeparrot_training - Step 20148: {'lr': 0.00032660932528260245, 'samples': 10316288, 'steps': 20148, 'batch_loss/train': 0.7132462378358468} 12/25/2021 15:07:07 - INFO - codeparrot_training - Step 20149: {'lr': 0.00032659432034716073, 'samples': 10316800, 'steps': 20149, 'batch_loss/train': 0.8038295730948448} 12/25/2021 15:07:19 - INFO - codeparrot_training - Step 20150: {'lr': 0.0003265793151072089, 'samples': 10317312, 'steps': 20150, 'batch_loss/train': 0.6763924527913332} 12/25/2021 15:07:29 - INFO - codeparrot_training - Step 20151: {'lr': 0.00032656430956280665, 'samples': 10317824, 'steps': 20151, 'batch_loss/train': 
1.7037232453003526} 12/25/2021 15:07:40 - INFO - codeparrot_training - Step 20152: {'lr': 0.00032654930371401355, 'samples': 10318336, 'steps': 20152, 'batch_loss/train': 0.7774181487038732} 12/25/2021 15:07:54 - INFO - codeparrot_training - Step 20153: {'lr': 0.00032653429756088935, 'samples': 10318848, 'steps': 20153, 'batch_loss/train': 0.80968813970685} 12/25/2021 15:08:05 - INFO - codeparrot_training - Step 20154: {'lr': 0.00032651929110349363, 'samples': 10319360, 'steps': 20154, 'batch_loss/train': 0.6945489261997864} 12/25/2021 15:08:15 - INFO - codeparrot_training - Step 20155: {'lr': 0.00032650428434188617, 'samples': 10319872, 'steps': 20155, 'batch_loss/train': 0.7169505449710414} 12/25/2021 15:08:27 - INFO - codeparrot_training - Step 20156: {'lr': 0.00032648927727612666, 'samples': 10320384, 'steps': 20156, 'batch_loss/train': 0.7936511738225818} 12/25/2021 15:08:38 - INFO - codeparrot_training - Step 20157: {'lr': 0.0003264742699062745, 'samples': 10320896, 'steps': 20157, 'batch_loss/train': 0.7413773285225034} 12/25/2021 15:08:48 - INFO - codeparrot_training - Step 20158: {'lr': 0.0003264592622323896, 'samples': 10321408, 'steps': 20158, 'batch_loss/train': 0.7976208087056875} 12/25/2021 15:08:59 - INFO - codeparrot_training - Step 20159: {'lr': 0.0003264442542545315, 'samples': 10321920, 'steps': 20159, 'batch_loss/train': 0.6245912780286744} 12/25/2021 15:09:11 - INFO - codeparrot_training - Step 20160: {'lr': 0.00032642924597275994, 'samples': 10322432, 'steps': 20160, 'batch_loss/train': 0.8670930943917483} 12/25/2021 15:09:22 - INFO - codeparrot_training - Step 20161: {'lr': 0.0003264142373871345, 'samples': 10322944, 'steps': 20161, 'batch_loss/train': 0.7784127206541598} 12/25/2021 15:09:32 - INFO - codeparrot_training - Step 20162: {'lr': 0.00032639922849771495, 'samples': 10323456, 'steps': 20162, 'batch_loss/train': 0.7540425881743431} 12/25/2021 15:09:46 - INFO - codeparrot_training - Step 20163: {'lr': 0.000326384219304561, 'samples': 
10323968, 'steps': 20163, 'batch_loss/train': 0.7513246131129563} 12/25/2021 15:09:57 - INFO - codeparrot_training - Step 20164: {'lr': 0.0003263692098077321, 'samples': 10324480, 'steps': 20164, 'batch_loss/train': 0.8038742290809751} 12/25/2021 15:10:07 - INFO - codeparrot_training - Step 20165: {'lr': 0.00032635420000728813, 'samples': 10324992, 'steps': 20165, 'batch_loss/train': 0.740114729385823} 12/25/2021 15:10:19 - INFO - codeparrot_training - Step 20166: {'lr': 0.0003263391899032886, 'samples': 10325504, 'steps': 20166, 'batch_loss/train': 0.7838940862566233} 12/25/2021 15:10:30 - INFO - codeparrot_training - Step 20167: {'lr': 0.00032632417949579335, 'samples': 10326016, 'steps': 20167, 'batch_loss/train': 0.7408351330086589} 12/25/2021 15:10:41 - INFO - codeparrot_training - Step 20168: {'lr': 0.00032630916878486195, 'samples': 10326528, 'steps': 20168, 'batch_loss/train': 0.5105721873696893} 12/25/2021 15:10:51 - INFO - codeparrot_training - Step 20169: {'lr': 0.00032629415777055414, 'samples': 10327040, 'steps': 20169, 'batch_loss/train': 0.6720320368185639} 12/25/2021 15:11:03 - INFO - codeparrot_training - Step 20170: {'lr': 0.0003262791464529296, 'samples': 10327552, 'steps': 20170, 'batch_loss/train': 0.7117675870540552} 12/25/2021 15:11:14 - INFO - codeparrot_training - Step 20171: {'lr': 0.0003262641348320479, 'samples': 10328064, 'steps': 20171, 'batch_loss/train': 0.6765287410235032} 12/25/2021 15:11:25 - INFO - codeparrot_training - Step 20172: {'lr': 0.0003262491229079688, 'samples': 10328576, 'steps': 20172, 'batch_loss/train': 0.6838320293463767} 12/25/2021 15:11:37 - INFO - codeparrot_training - Step 20173: {'lr': 0.00032623411068075206, 'samples': 10329088, 'steps': 20173, 'batch_loss/train': 0.787105436436832} 12/25/2021 15:11:47 - INFO - codeparrot_training - Step 20174: {'lr': 0.0003262190981504572, 'samples': 10329600, 'steps': 20174, 'batch_loss/train': 0.7188296373933554} 12/25/2021 15:11:58 - INFO - codeparrot_training - Step 
20175: {'lr': 0.00032620408531714405, 'samples': 10330112, 'steps': 20175, 'batch_loss/train': 0.7655693411361426} 12/25/2021 15:12:12 - INFO - codeparrot_training - Step 20176: {'lr': 0.0003261890721808722, 'samples': 10330624, 'steps': 20176, 'batch_loss/train': 0.7637721132487059} 12/25/2021 15:12:22 - INFO - codeparrot_training - Step 20177: {'lr': 0.0003261740587417013, 'samples': 10331136, 'steps': 20177, 'batch_loss/train': 0.6750504402443767} 12/25/2021 15:12:33 - INFO - codeparrot_training - Step 20178: {'lr': 0.00032615904499969117, 'samples': 10331648, 'steps': 20178, 'batch_loss/train': 0.8184616975486279} 12/25/2021 15:12:43 - INFO - codeparrot_training - Step 20179: {'lr': 0.00032614403095490146, 'samples': 10332160, 'steps': 20179, 'batch_loss/train': 0.7602923994418234} 12/25/2021 15:12:55 - INFO - codeparrot_training - Step 20180: {'lr': 0.0003261290166073918, 'samples': 10332672, 'steps': 20180, 'batch_loss/train': 0.8428636761382222} 12/25/2021 15:13:06 - INFO - codeparrot_training - Step 20181: {'lr': 0.0003261140019572219, 'samples': 10333184, 'steps': 20181, 'batch_loss/train': 0.8199235079810023} 12/25/2021 15:13:17 - INFO - codeparrot_training - Step 20182: {'lr': 0.0003260989870044515, 'samples': 10333696, 'steps': 20182, 'batch_loss/train': 0.791678408626467} 12/25/2021 15:13:31 - INFO - codeparrot_training - Step 20183: {'lr': 0.00032608397174914026, 'samples': 10334208, 'steps': 20183, 'batch_loss/train': 0.9404210248030722} 12/25/2021 15:13:41 - INFO - codeparrot_training - Step 20184: {'lr': 0.00032606895619134783, 'samples': 10334720, 'steps': 20184, 'batch_loss/train': 0.8591789505444467} 12/25/2021 15:13:52 - INFO - codeparrot_training - Step 20185: {'lr': 0.000326053940331134, 'samples': 10335232, 'steps': 20185, 'batch_loss/train': 0.7704460201784968} 12/25/2021 15:14:04 - INFO - codeparrot_training - Step 20186: {'lr': 0.0003260389241685584, 'samples': 10335744, 'steps': 20186, 'batch_loss/train': 0.8154418431222439} 12/25/2021 
15:14:15 - INFO - codeparrot_training - Step 20187: {'lr': 0.0003260239077036807, 'samples': 10336256, 'steps': 20187, 'batch_loss/train': 0.7336491476744413} 12/25/2021 15:14:25 - INFO - codeparrot_training - Step 20188: {'lr': 0.00032600889093656083, 'samples': 10336768, 'steps': 20188, 'batch_loss/train': 0.6661092285066843} 12/25/2021 15:14:36 - INFO - codeparrot_training - Step 20189: {'lr': 0.00032599387386725813, 'samples': 10337280, 'steps': 20189, 'batch_loss/train': 0.7175526982173324} 12/25/2021 15:14:48 - INFO - codeparrot_training - Step 20190: {'lr': 0.00032597885649583247, 'samples': 10337792, 'steps': 20190, 'batch_loss/train': 0.9460917529650033} 12/25/2021 15:14:59 - INFO - codeparrot_training - Step 20191: {'lr': 0.0003259638388223437, 'samples': 10338304, 'steps': 20191, 'batch_loss/train': 0.7709247674793005} 12/25/2021 15:15:09 - INFO - codeparrot_training - Step 20192: {'lr': 0.00032594882084685127, 'samples': 10338816, 'steps': 20192, 'batch_loss/train': 0.8903357516974211} 12/25/2021 15:15:23 - INFO - codeparrot_training - Step 20193: {'lr': 0.0003259338025694151, 'samples': 10339328, 'steps': 20193, 'batch_loss/train': 0.5497885640943423} 12/25/2021 15:15:34 - INFO - codeparrot_training - Step 20194: {'lr': 0.00032591878399009476, 'samples': 10339840, 'steps': 20194, 'batch_loss/train': 0.7213228906039149} 12/25/2021 15:15:44 - INFO - codeparrot_training - Step 20195: {'lr': 0.00032590376510895005, 'samples': 10340352, 'steps': 20195, 'batch_loss/train': 0.7208594675175846} 12/25/2021 15:15:56 - INFO - codeparrot_training - Step 20196: {'lr': 0.00032588874592604053, 'samples': 10340864, 'steps': 20196, 'batch_loss/train': 0.5972942050429992} 12/25/2021 15:16:07 - INFO - codeparrot_training - Step 20197: {'lr': 0.0003258737264414261, 'samples': 10341376, 'steps': 20197, 'batch_loss/train': 0.6574225150980055} 12/25/2021 15:16:18 - INFO - codeparrot_training - Step 20198: {'lr': 0.0003258587066551663, 'samples': 10341888, 'steps': 20198, 
'batch_loss/train': 0.7573861731216311} 12/25/2021 15:16:29 - INFO - codeparrot_training - Step 20199: {'lr': 0.000325843686567321, 'samples': 10342400, 'steps': 20199, 'batch_loss/train': 0.6871665087528527} 12/25/2021 15:16:40 - INFO - codeparrot_training - Step 20200: {'lr': 0.0003258286661779498, 'samples': 10342912, 'steps': 20200, 'batch_loss/train': 0.6544286965508945} 12/25/2021 15:16:51 - INFO - codeparrot_training - Step 20201: {'lr': 0.00032581364548711243, 'samples': 10343424, 'steps': 20201, 'batch_loss/train': 0.816485577262938} 12/25/2021 15:17:01 - INFO - codeparrot_training - Step 20202: {'lr': 0.0003257986244948687, 'samples': 10343936, 'steps': 20202, 'batch_loss/train': 0.7114079017192125} 12/25/2021 15:17:15 - INFO - codeparrot_training - Step 20203: {'lr': 0.00032578360320127816, 'samples': 10344448, 'steps': 20203, 'batch_loss/train': 0.7059875754639506} 12/25/2021 15:17:26 - INFO - codeparrot_training - Step 20204: {'lr': 0.0003257685816064007, 'samples': 10344960, 'steps': 20204, 'batch_loss/train': 0.8544602515175939} 12/25/2021 15:17:37 - INFO - codeparrot_training - Step 20205: {'lr': 0.000325753559710296, 'samples': 10345472, 'steps': 20205, 'batch_loss/train': 0.7777105700224638} 12/25/2021 15:17:49 - INFO - codeparrot_training - Step 20206: {'lr': 0.0003257385375130237, 'samples': 10345984, 'steps': 20206, 'batch_loss/train': 0.7177513521164656} 12/25/2021 15:17:59 - INFO - codeparrot_training - Step 20207: {'lr': 0.0003257235150146436, 'samples': 10346496, 'steps': 20207, 'batch_loss/train': 0.7359495018608868} 12/25/2021 15:18:10 - INFO - codeparrot_training - Step 20208: {'lr': 0.0003257084922152154, 'samples': 10347008, 'steps': 20208, 'batch_loss/train': 0.7729731034487486} 12/25/2021 15:18:22 - INFO - codeparrot_training - Step 20209: {'lr': 0.00032569346911479876, 'samples': 10347520, 'steps': 20209, 'batch_loss/train': 0.7937833850737661} 12/25/2021 15:18:33 - INFO - codeparrot_training - Step 20210: {'lr': 
0.0003256784457134535, 'samples': 10348032, 'steps': 20210, 'batch_loss/train': 0.8393762893974781} 12/25/2021 15:18:43 - INFO - codeparrot_training - Step 20211: {'lr': 0.00032566342201123934, 'samples': 10348544, 'steps': 20211, 'batch_loss/train': 0.8147809058427811} 12/25/2021 15:18:54 - INFO - codeparrot_training - Step 20212: {'lr': 0.00032564839800821595, 'samples': 10349056, 'steps': 20212, 'batch_loss/train': 0.7489484767429531} 12/25/2021 15:19:08 - INFO - codeparrot_training - Step 20213: {'lr': 0.00032563337370444313, 'samples': 10349568, 'steps': 20213, 'batch_loss/train': 0.8696633046492934} 12/25/2021 15:19:18 - INFO - codeparrot_training - Step 20214: {'lr': 0.0003256183490999805, 'samples': 10350080, 'steps': 20214, 'batch_loss/train': 0.7953188493847847} 12/25/2021 15:19:29 - INFO - codeparrot_training - Step 20215: {'lr': 0.000325603324194888, 'samples': 10350592, 'steps': 20215, 'batch_loss/train': 0.694058287423104} 12/25/2021 15:19:41 - INFO - codeparrot_training - Step 20216: {'lr': 0.00032558829898922506, 'samples': 10351104, 'steps': 20216, 'batch_loss/train': 0.8105367531534284} 12/25/2021 15:19:51 - INFO - codeparrot_training - Step 20217: {'lr': 0.0003255732734830517, 'samples': 10351616, 'steps': 20217, 'batch_loss/train': 0.795595093164593} 12/25/2021 15:20:02 - INFO - codeparrot_training - Step 20218: {'lr': 0.00032555824767642753, 'samples': 10352128, 'steps': 20218, 'batch_loss/train': 0.8267094390466809} 12/25/2021 15:20:14 - INFO - codeparrot_training - Step 20219: {'lr': 0.0003255432215694123, 'samples': 10352640, 'steps': 20219, 'batch_loss/train': 0.6727261361666024} 12/25/2021 15:20:25 - INFO - codeparrot_training - Step 20220: {'lr': 0.0003255281951620657, 'samples': 10353152, 'steps': 20220, 'batch_loss/train': 0.8008471474749967} 12/25/2021 15:20:35 - INFO - codeparrot_training - Step 20221: {'lr': 0.0003255131684544476, 'samples': 10353664, 'steps': 20221, 'batch_loss/train': 0.774843318387866} 12/25/2021 15:20:49 - INFO - 
codeparrot_training - Step 20222: {'lr': 0.0003254981414466175, 'samples': 10354176, 'steps': 20222, 'batch_loss/train': 0.7949742870405316} 12/25/2021 15:21:00 - INFO - codeparrot_training - Step 20223: {'lr': 0.00032548311413863545, 'samples': 10354688, 'steps': 20223, 'batch_loss/train': 0.8255056086927652} 12/25/2021 15:21:11 - INFO - codeparrot_training - Step 20224: {'lr': 0.00032546808653056096, 'samples': 10355200, 'steps': 20224, 'batch_loss/train': 0.7126733111217618} 12/25/2021 15:21:21 - INFO - codeparrot_training - Step 20225: {'lr': 0.0003254530586224539, 'samples': 10355712, 'steps': 20225, 'batch_loss/train': 0.7751510115340352} 12/25/2021 15:21:34 - INFO - codeparrot_training - Step 20226: {'lr': 0.0003254380304143739, 'samples': 10356224, 'steps': 20226, 'batch_loss/train': 0.8384215808473527} 12/25/2021 15:21:44 - INFO - codeparrot_training - Step 20227: {'lr': 0.00032542300190638085, 'samples': 10356736, 'steps': 20227, 'batch_loss/train': 0.7828881675377488} 12/25/2021 15:21:55 - INFO - codeparrot_training - Step 20228: {'lr': 0.0003254079730985344, 'samples': 10357248, 'steps': 20228, 'batch_loss/train': 0.9186239647679031} 12/25/2021 15:22:09 - INFO - codeparrot_training - Step 20229: {'lr': 0.0003253929439908943, 'samples': 10357760, 'steps': 20229, 'batch_loss/train': 0.8320273701101542} 12/25/2021 15:22:19 - INFO - codeparrot_training - Step 20230: {'lr': 0.00032537791458352037, 'samples': 10358272, 'steps': 20230, 'batch_loss/train': 0.7851127081084996} 12/25/2021 15:22:30 - INFO - codeparrot_training - Step 20231: {'lr': 0.0003253628848764723, 'samples': 10358784, 'steps': 20231, 'batch_loss/train': 0.6917900089174509} 12/25/2021 15:22:42 - INFO - codeparrot_training - Step 20232: {'lr': 0.0003253478548698098, 'samples': 10359296, 'steps': 20232, 'batch_loss/train': 0.8408985352143645} 12/25/2021 15:22:53 - INFO - codeparrot_training - Step 20233: {'lr': 0.0003253328245635927, 'samples': 10359808, 'steps': 20233, 'batch_loss/train': 
0.7628953526727855} 12/25/2021 15:23:03 - INFO - codeparrot_training - Step 20234: {'lr': 0.0003253177939578808, 'samples': 10360320, 'steps': 20234, 'batch_loss/train': 0.5565993483178318} 12/25/2021 15:23:14 - INFO - codeparrot_training - Step 20235: {'lr': 0.00032530276305273365, 'samples': 10360832, 'steps': 20235, 'batch_loss/train': 0.7353943274356425} 12/25/2021 15:23:26 - INFO - codeparrot_training - Step 20236: {'lr': 0.0003252877318482113, 'samples': 10361344, 'steps': 20236, 'batch_loss/train': 0.6733389478176832} 12/25/2021 15:23:37 - INFO - codeparrot_training - Step 20237: {'lr': 0.0003252727003443733, 'samples': 10361856, 'steps': 20237, 'batch_loss/train': 0.7491289731115103} 12/25/2021 15:23:47 - INFO - codeparrot_training - Step 20238: {'lr': 0.00032525766854127943, 'samples': 10362368, 'steps': 20238, 'batch_loss/train': 0.7358358890051022} 12/25/2021 15:24:01 - INFO - codeparrot_training - Step 20239: {'lr': 0.0003252426364389895, 'samples': 10362880, 'steps': 20239, 'batch_loss/train': 0.7602832741104066} 12/25/2021 15:24:12 - INFO - codeparrot_training - Step 20240: {'lr': 0.0003252276040375632, 'samples': 10363392, 'steps': 20240, 'batch_loss/train': 0.8344408883713186} 12/25/2021 15:24:22 - INFO - codeparrot_training - Step 20241: {'lr': 0.0003252125713370605, 'samples': 10363904, 'steps': 20241, 'batch_loss/train': 0.71071254927665} 12/25/2021 15:24:34 - INFO - codeparrot_training - Step 20242: {'lr': 0.0003251975383375409, 'samples': 10364416, 'steps': 20242, 'batch_loss/train': 0.9200003994628787} 12/25/2021 15:24:45 - INFO - codeparrot_training - Step 20243: {'lr': 0.00032518250503906437, 'samples': 10364928, 'steps': 20243, 'batch_loss/train': 0.7164929088903591} 12/25/2021 15:24:56 - INFO - codeparrot_training - Step 20244: {'lr': 0.0003251674714416905, 'samples': 10365440, 'steps': 20244, 'batch_loss/train': 0.7019199142232537} 12/25/2021 15:25:06 - INFO - codeparrot_training - Step 20245: {'lr': 0.00032515243754547917, 'samples': 
10365952, 'steps': 20245, 'batch_loss/train': 0.788789939135313} 12/25/2021 15:25:20 - INFO - codeparrot_training - Step 20246: {'lr': 0.00032513740335049013, 'samples': 10366464, 'steps': 20246, 'batch_loss/train': 0.5854255310259759} 12/25/2021 15:25:31 - INFO - codeparrot_training - Step 20247: {'lr': 0.00032512236885678323, 'samples': 10366976, 'steps': 20247, 'batch_loss/train': 0.8040669448673725} 12/25/2021 15:25:41 - INFO - codeparrot_training - Step 20248: {'lr': 0.000325107334064418, 'samples': 10367488, 'steps': 20248, 'batch_loss/train': 0.8081757745239884} 12/25/2021 15:25:53 - INFO - codeparrot_training - Step 20249: {'lr': 0.00032509229897345447, 'samples': 10368000, 'steps': 20249, 'batch_loss/train': 0.7006158656440675} 12/25/2021 15:26:04 - INFO - codeparrot_training - Step 20250: {'lr': 0.00032507726358395236, 'samples': 10368512, 'steps': 20250, 'batch_loss/train': 0.8067115629091859} 12/25/2021 15:26:15 - INFO - codeparrot_training - Step 20251: {'lr': 0.0003250622278959713, 'samples': 10369024, 'steps': 20251, 'batch_loss/train': 0.7264031376689672} 12/25/2021 15:26:27 - INFO - codeparrot_training - Step 20252: {'lr': 0.00032504719190957115, 'samples': 10369536, 'steps': 20252, 'batch_loss/train': 0.7748216823674738} 12/25/2021 15:26:37 - INFO - codeparrot_training - Step 20253: {'lr': 0.00032503215562481176, 'samples': 10370048, 'steps': 20253, 'batch_loss/train': 0.8309554168954492} 12/25/2021 15:26:48 - INFO - codeparrot_training - Step 20254: {'lr': 0.0003250171190417528, 'samples': 10370560, 'steps': 20254, 'batch_loss/train': 0.9034096486866474} 12/25/2021 15:26:58 - INFO - codeparrot_training - Step 20255: {'lr': 0.00032500208216045413, 'samples': 10371072, 'steps': 20255, 'batch_loss/train': 0.7007117746397853} 12/25/2021 15:27:11 - INFO - codeparrot_training - Step 20256: {'lr': 0.0003249870449809756, 'samples': 10371584, 'steps': 20256, 'batch_loss/train': 0.8107067737728357} 12/25/2021 15:27:21 - INFO - codeparrot_training - Step 
20257: {'lr': 0.00032497200750337675, 'samples': 10372096, 'steps': 20257, 'batch_loss/train': 0.805160432588309} 12/25/2021 15:27:32 - INFO - codeparrot_training - Step 20258: {'lr': 0.00032495696972771756, 'samples': 10372608, 'steps': 20258, 'batch_loss/train': 0.6935341991484165} 12/25/2021 15:27:46 - INFO - codeparrot_training - Step 20259: {'lr': 0.0003249419316540577, 'samples': 10373120, 'steps': 20259, 'batch_loss/train': 0.8346672710031271} 12/25/2021 15:27:56 - INFO - codeparrot_training - Step 20260: {'lr': 0.0003249268932824572, 'samples': 10373632, 'steps': 20260, 'batch_loss/train': 0.8155595045536757} 12/25/2021 15:28:07 - INFO - codeparrot_training - Step 20261: {'lr': 0.00032491185461297543, 'samples': 10374144, 'steps': 20261, 'batch_loss/train': 0.8287894036620855} 12/25/2021 15:28:19 - INFO - codeparrot_training - Step 20262: {'lr': 0.00032489681564567256, 'samples': 10374656, 'steps': 20262, 'batch_loss/train': 0.8764652134850621} 12/25/2021 15:28:30 - INFO - codeparrot_training - Step 20263: {'lr': 0.00032488177638060825, 'samples': 10375168, 'steps': 20263, 'batch_loss/train': 0.8004700979217887} 12/25/2021 15:28:40 - INFO - codeparrot_training - Step 20264: {'lr': 0.0003248667368178421, 'samples': 10375680, 'steps': 20264, 'batch_loss/train': 0.8595525659620762} 12/25/2021 15:28:51 - INFO - codeparrot_training - Step 20265: {'lr': 0.0003248516969574342, 'samples': 10376192, 'steps': 20265, 'batch_loss/train': 0.6409593867138028} 12/25/2021 15:29:03 - INFO - codeparrot_training - Step 20266: {'lr': 0.00032483665679944426, 'samples': 10376704, 'steps': 20266, 'batch_loss/train': 0.8327285412233323} 12/25/2021 15:29:14 - INFO - codeparrot_training - Step 20267: {'lr': 0.00032482161634393193, 'samples': 10377216, 'steps': 20267, 'batch_loss/train': 0.8489232957363129} 12/25/2021 15:29:24 - INFO - codeparrot_training - Step 20268: {'lr': 0.00032480657559095713, 'samples': 10377728, 'steps': 20268, 'batch_loss/train': 1.4531153738498688} 
12/25/2021 15:29:38 - INFO - codeparrot_training - Step 20269: {'lr': 0.0003247915345405797, 'samples': 10378240, 'steps': 20269, 'batch_loss/train': 1.2409063382074237} 12/25/2021 15:29:49 - INFO - codeparrot_training - Step 20270: {'lr': 0.0003247764931928593, 'samples': 10378752, 'steps': 20270, 'batch_loss/train': 0.7341011045500636} 12/25/2021 15:30:00 - INFO - codeparrot_training - Step 20271: {'lr': 0.0003247614515478558, 'samples': 10379264, 'steps': 20271, 'batch_loss/train': 0.8151791715063155} 12/25/2021 15:30:10 - INFO - codeparrot_training - Step 20272: {'lr': 0.00032474640960562903, 'samples': 10379776, 'steps': 20272, 'batch_loss/train': 0.6410303536104038} 12/25/2021 15:30:22 - INFO - codeparrot_training - Step 20273: {'lr': 0.00032473136736623875, 'samples': 10380288, 'steps': 20273, 'batch_loss/train': 0.7869839165359735} 12/25/2021 15:30:33 - INFO - codeparrot_training - Step 20274: {'lr': 0.0003247163248297447, 'samples': 10380800, 'steps': 20274, 'batch_loss/train': 0.5798918148502707} 12/25/2021 15:30:43 - INFO - codeparrot_training - Step 20275: {'lr': 0.00032470128199620687, 'samples': 10381312, 'steps': 20275, 'batch_loss/train': 0.712617689743638} 12/25/2021 15:30:56 - INFO - codeparrot_training - Step 20276: {'lr': 0.00032468623886568486, 'samples': 10381824, 'steps': 20276, 'batch_loss/train': 0.8390891812741756} 12/25/2021 15:31:06 - INFO - codeparrot_training - Step 20277: {'lr': 0.0003246711954382385, 'samples': 10382336, 'steps': 20277, 'batch_loss/train': 0.7810233952477574} 12/25/2021 15:31:17 - INFO - codeparrot_training - Step 20278: {'lr': 0.00032465615171392787, 'samples': 10382848, 'steps': 20278, 'batch_loss/train': 0.8658593352884054} 12/25/2021 15:31:31 - INFO - codeparrot_training - Step 20279: {'lr': 0.00032464110769281243, 'samples': 10383360, 'steps': 20279, 'batch_loss/train': 0.7587796301813796} 12/25/2021 15:31:41 - INFO - codeparrot_training - Step 20280: {'lr': 0.00032462606337495216, 'samples': 10383872, 'steps': 
20280, 'batch_loss/train': 0.7559916242025793} 12/25/2021 15:31:52 - INFO - codeparrot_training - Step 20281: {'lr': 0.00032461101876040684, 'samples': 10384384, 'steps': 20281, 'batch_loss/train': 0.827564864535816} 12/25/2021 15:32:03 - INFO - codeparrot_training - Step 20282: {'lr': 0.00032459597384923634, 'samples': 10384896, 'steps': 20282, 'batch_loss/train': 0.8870370667427778} 12/25/2021 15:32:15 - INFO - codeparrot_training - Step 20283: {'lr': 0.0003245809286415003, 'samples': 10385408, 'steps': 20283, 'batch_loss/train': 0.7895474620163441} 12/25/2021 15:32:25 - INFO - codeparrot_training - Step 20284: {'lr': 0.0003245658831372587, 'samples': 10385920, 'steps': 20284, 'batch_loss/train': 0.7501103403046727} 12/25/2021 15:32:36 - INFO - codeparrot_training - Step 20285: {'lr': 0.00032455083733657134, 'samples': 10386432, 'steps': 20285, 'batch_loss/train': 0.826915018260479} 12/25/2021 15:32:50 - INFO - codeparrot_training - Step 20286: {'lr': 0.00032453579123949795, 'samples': 10386944, 'steps': 20286, 'batch_loss/train': 0.8332304563373327} 12/25/2021 15:33:00 - INFO - codeparrot_training - Step 20287: {'lr': 0.0003245207448460984, 'samples': 10387456, 'steps': 20287, 'batch_loss/train': 0.8010310262907296} 12/25/2021 15:33:11 - INFO - codeparrot_training - Step 20288: {'lr': 0.0003245056981564325, 'samples': 10387968, 'steps': 20288, 'batch_loss/train': 0.790238393470645} 12/25/2021 15:33:23 - INFO - codeparrot_training - Step 20289: {'lr': 0.00032449065117056016, 'samples': 10388480, 'steps': 20289, 'batch_loss/train': 0.751148849260062} 12/25/2021 15:33:34 - INFO - codeparrot_training - Step 20290: {'lr': 0.00032447560388854097, 'samples': 10388992, 'steps': 20290, 'batch_loss/train': 0.8172975331544876} 12/25/2021 15:33:44 - INFO - codeparrot_training - Step 20291: {'lr': 0.00032446055631043505, 'samples': 10389504, 'steps': 20291, 'batch_loss/train': 0.8229379637632519} 12/25/2021 15:33:56 - INFO - codeparrot_training - Step 20292: {'lr': 
0.000324445508436302, 'samples': 10390016, 'steps': 20292, 'batch_loss/train': 0.7459646100178361} 12/25/2021 15:34:07 - INFO - codeparrot_training - Step 20293: {'lr': 0.00032443046026620173, 'samples': 10390528, 'steps': 20293, 'batch_loss/train': 0.8022803906351328} 12/25/2021 15:34:17 - INFO - codeparrot_training - Step 20294: {'lr': 0.00032441541180019403, 'samples': 10391040, 'steps': 20294, 'batch_loss/train': 0.8279868140816689} 12/25/2021 15:34:28 - INFO - codeparrot_training - Step 20295: {'lr': 0.00032440036303833873, 'samples': 10391552, 'steps': 20295, 'batch_loss/train': 0.7333365197991952} 12/25/2021 15:34:42 - INFO - codeparrot_training - Step 20296: {'lr': 0.0003243853139806958, 'samples': 10392064, 'steps': 20296, 'batch_loss/train': 0.7629866590723395} 12/25/2021 15:34:52 - INFO - codeparrot_training - Step 20297: {'lr': 0.0003243702646273248, 'samples': 10392576, 'steps': 20297, 'batch_loss/train': 0.8209614004008472} 12/25/2021 15:35:03 - INFO - codeparrot_training - Step 20298: {'lr': 0.0003243552149782858, 'samples': 10393088, 'steps': 20298, 'batch_loss/train': 0.8873012876138091} 12/25/2021 15:35:15 - INFO - codeparrot_training - Step 20299: {'lr': 0.0003243401650336385, 'samples': 10393600, 'steps': 20299, 'batch_loss/train': 0.8421481978148222} 12/25/2021 15:35:26 - INFO - codeparrot_training - Step 20300: {'lr': 0.0003243251147934428, 'samples': 10394112, 'steps': 20300, 'batch_loss/train': 0.7042948333546519} 12/25/2021 15:35:36 - INFO - codeparrot_training - Step 20301: {'lr': 0.0003243100642577584, 'samples': 10394624, 'steps': 20301, 'batch_loss/train': 0.825210252776742} 12/25/2021 15:35:49 - INFO - codeparrot_training - Step 20302: {'lr': 0.0003242950134266455, 'samples': 10395136, 'steps': 20302, 'batch_loss/train': 0.8546831279527396} 12/25/2021 15:35:59 - INFO - codeparrot_training - Step 20303: {'lr': 0.00032427996230016345, 'samples': 10395648, 'steps': 20303, 'batch_loss/train': 0.7438715668395162} 12/25/2021 15:36:10 - INFO 
- codeparrot_training - Step 20304: {'lr': 0.0003242649108783724, 'samples': 10396160, 'steps': 20304, 'batch_loss/train': 0.7287598941475153} 12/25/2021 15:36:20 - INFO - codeparrot_training - Step 20305: {'lr': 0.0003242498591613321, 'samples': 10396672, 'steps': 20305, 'batch_loss/train': 0.8071626098826528} 12/25/2021 15:36:32 - INFO - codeparrot_training - Step 20306: {'lr': 0.00032423480714910234, 'samples': 10397184, 'steps': 20306, 'batch_loss/train': 0.8133898652158678} 12/25/2021 15:36:43 - INFO - codeparrot_training - Step 20307: {'lr': 0.0003242197548417431, 'samples': 10397696, 'steps': 20307, 'batch_loss/train': 0.7390750125050545} 12/25/2021 15:36:53 - INFO - codeparrot_training - Step 20308: {'lr': 0.00032420470223931414, 'samples': 10398208, 'steps': 20308, 'batch_loss/train': 0.7960674557834864} 12/25/2021 15:37:07 - INFO - codeparrot_training - Step 20309: {'lr': 0.0003241896493418752, 'samples': 10398720, 'steps': 20309, 'batch_loss/train': 0.7996757673099637} 12/25/2021 15:37:18 - INFO - codeparrot_training - Step 20310: {'lr': 0.0003241745961494863, 'samples': 10399232, 'steps': 20310, 'batch_loss/train': 0.7523765256628394} 12/25/2021 15:37:28 - INFO - codeparrot_training - Step 20311: {'lr': 0.00032415954266220724, 'samples': 10399744, 'steps': 20311, 'batch_loss/train': 0.8752235053107142} 12/25/2021 15:37:41 - INFO - codeparrot_training - Step 20312: {'lr': 0.0003241444888800978, 'samples': 10400256, 'steps': 20312, 'batch_loss/train': 0.8278079219162464} 12/25/2021 15:37:51 - INFO - codeparrot_training - Step 20313: {'lr': 0.0003241294348032179, 'samples': 10400768, 'steps': 20313, 'batch_loss/train': 1.0146654741838574} 12/25/2021 15:38:02 - INFO - codeparrot_training - Step 20314: {'lr': 0.0003241143804316273, 'samples': 10401280, 'steps': 20314, 'batch_loss/train': 0.8173142760060728} 12/25/2021 15:38:13 - INFO - codeparrot_training - Step 20315: {'lr': 0.00032409932576538597, 'samples': 10401792, 'steps': 20315, 'batch_loss/train': 
0.8299499109853059} 12/25/2021 15:38:27 - INFO - codeparrot_training - Step 20316: {'lr': 0.0003240842708045536, 'samples': 10402304, 'steps': 20316, 'batch_loss/train': 1.0477609757799655} 12/25/2021 15:38:37 - INFO - codeparrot_training - Step 20317: {'lr': 0.0003240692155491903, 'samples': 10402816, 'steps': 20317, 'batch_loss/train': 0.7521345044951886} 12/25/2021 15:38:48 - INFO - codeparrot_training - Step 20318: {'lr': 0.0003240541599993556, 'samples': 10403328, 'steps': 20318, 'batch_loss/train': 0.8129470092244446} 12/25/2021 15:39:00 - INFO - codeparrot_training - Step 20319: {'lr': 0.00032403910415510965, 'samples': 10403840, 'steps': 20319, 'batch_loss/train': 0.5716915189987049} 12/25/2021 15:39:11 - INFO - codeparrot_training - Step 20320: {'lr': 0.0003240240480165121, 'samples': 10404352, 'steps': 20320, 'batch_loss/train': 0.8580971220508218} 12/25/2021 15:39:22 - INFO - codeparrot_training - Step 20321: {'lr': 0.00032400899158362297, 'samples': 10404864, 'steps': 20321, 'batch_loss/train': 0.8702658042311668} 12/25/2021 15:39:32 - INFO - codeparrot_training - Step 20322: {'lr': 0.0003239939348565019, 'samples': 10405376, 'steps': 20322, 'batch_loss/train': 0.6424271561554633} 12/25/2021 15:39:44 - INFO - codeparrot_training - Step 20323: {'lr': 0.00032397887783520903, 'samples': 10405888, 'steps': 20323, 'batch_loss/train': 0.7510972169111483} 12/25/2021 15:39:55 - INFO - codeparrot_training - Step 20324: {'lr': 0.000323963820519804, 'samples': 10406400, 'steps': 20324, 'batch_loss/train': 0.8452211488038301} 12/25/2021 15:40:06 - INFO - codeparrot_training - Step 20325: {'lr': 0.00032394876291034667, 'samples': 10406912, 'steps': 20325, 'batch_loss/train': 0.8086402893532068} 12/25/2021 15:40:19 - INFO - codeparrot_training - Step 20326: {'lr': 0.00032393370500689714, 'samples': 10407424, 'steps': 20326, 'batch_loss/train': 0.8326742523349822} 12/25/2021 15:40:30 - INFO - codeparrot_training - Step 20327: {'lr': 0.000323918646809515, 'samples': 
10407936, 'steps': 20327, 'batch_loss/train': 0.7553698448464274} 12/25/2021 15:40:41 - INFO - codeparrot_training - Step 20328: {'lr': 0.0003239035883182603, 'samples': 10408448, 'steps': 20328, 'batch_loss/train': 0.7675014501437545} 12/25/2021 15:40:53 - INFO - codeparrot_training - Step 20329: {'lr': 0.0003238885295331928, 'samples': 10408960, 'steps': 20329, 'batch_loss/train': 0.8315989831462502} 12/25/2021 15:41:04 - INFO - codeparrot_training - Step 20330: {'lr': 0.0003238734704543724, 'samples': 10409472, 'steps': 20330, 'batch_loss/train': 0.6924102432094514} 12/25/2021 15:41:14 - INFO - codeparrot_training - Step 20331: {'lr': 0.000323858411081859, 'samples': 10409984, 'steps': 20331, 'batch_loss/train': 0.7383995321579278} 12/25/2021 15:41:25 - INFO - codeparrot_training - Step 20332: {'lr': 0.00032384335141571243, 'samples': 10410496, 'steps': 20332, 'batch_loss/train': 1.0307193789631128} 12/25/2021 15:41:39 - INFO - codeparrot_training - Step 20333: {'lr': 0.00032382829145599255, 'samples': 10411008, 'steps': 20333, 'batch_loss/train': 0.6771677164360881} 12/25/2021 15:41:49 - INFO - codeparrot_training - Step 20334: {'lr': 0.00032381323120275936, 'samples': 10411520, 'steps': 20334, 'batch_loss/train': 1.0958364428952336} 12/25/2021 15:42:00 - INFO - codeparrot_training - Step 20335: {'lr': 0.00032379817065607253, 'samples': 10412032, 'steps': 20335, 'batch_loss/train': 1.1089810989797115} 12/25/2021 15:42:12 - INFO - codeparrot_training - Step 20336: {'lr': 0.0003237831098159921, 'samples': 10412544, 'steps': 20336, 'batch_loss/train': 0.7976193617796525} 12/25/2021 15:42:22 - INFO - codeparrot_training - Step 20337: {'lr': 0.0003237680486825778, 'samples': 10413056, 'steps': 20337, 'batch_loss/train': 0.7363938270136714} 12/25/2021 15:42:33 - INFO - codeparrot_training - Step 20338: {'lr': 0.0003237529872558897, 'samples': 10413568, 'steps': 20338, 'batch_loss/train': 0.9033055892214179} 12/25/2021 15:42:45 - INFO - codeparrot_training - Step 
20339: {'lr': 0.00032373792553598747, 'samples': 10414080, 'steps': 20339, 'batch_loss/train': 0.7156923324801028} 12/25/2021 15:42:56 - INFO - codeparrot_training - Step 20340: {'lr': 0.0003237228635229312, 'samples': 10414592, 'steps': 20340, 'batch_loss/train': 0.681959860958159} 12/25/2021 15:43:06 - INFO - codeparrot_training - Step 20341: {'lr': 0.0003237078012167806, 'samples': 10415104, 'steps': 20341, 'batch_loss/train': 0.7273992964765057} 12/25/2021 15:43:17 - INFO - codeparrot_training - Step 20342: {'lr': 0.00032369273861759557, 'samples': 10415616, 'steps': 20342, 'batch_loss/train': 0.7253060461953282} 12/25/2021 15:43:29 - INFO - codeparrot_training - Step 20343: {'lr': 0.0003236776757254361, 'samples': 10416128, 'steps': 20343, 'batch_loss/train': 0.7921827929094434} 12/25/2021 15:43:40 - INFO - codeparrot_training - Step 20344: {'lr': 0.000323662612540362, 'samples': 10416640, 'steps': 20344, 'batch_loss/train': 0.8399056259077042} 12/25/2021 15:43:50 - INFO - codeparrot_training - Step 20345: {'lr': 0.00032364754906243315, 'samples': 10417152, 'steps': 20345, 'batch_loss/train': 0.6921674599871039} 12/25/2021 15:44:04 - INFO - codeparrot_training - Step 20346: {'lr': 0.00032363248529170945, 'samples': 10417664, 'steps': 20346, 'batch_loss/train': 0.8305049743503332} 12/25/2021 15:44:15 - INFO - codeparrot_training - Step 20347: {'lr': 0.0003236174212282509, 'samples': 10418176, 'steps': 20347, 'batch_loss/train': 0.7853541299700737} 12/25/2021 15:44:25 - INFO - codeparrot_training - Step 20348: {'lr': 0.00032360235687211707, 'samples': 10418688, 'steps': 20348, 'batch_loss/train': 0.7259169896133244} 12/25/2021 15:44:38 - INFO - codeparrot_training - Step 20349: {'lr': 0.00032358729222336824, 'samples': 10419200, 'steps': 20349, 'batch_loss/train': 0.6178260239539668} 12/25/2021 15:44:48 - INFO - codeparrot_training - Step 20350: {'lr': 0.00032357222728206404, 'samples': 10419712, 'steps': 20350, 'batch_loss/train': 0.7826402122154832} 12/25/2021 
15:44:59 - INFO - codeparrot_training - Step 20351: {'lr': 0.0003235571620482645, 'samples': 10420224, 'steps': 20351, 'batch_loss/train': 0.7546604293165728} 12/25/2021 15:45:09 - INFO - codeparrot_training - Step 20352: {'lr': 0.00032354209652202937, 'samples': 10420736, 'steps': 20352, 'batch_loss/train': 0.8372752917930484} 12/25/2021 15:45:21 - INFO - codeparrot_training - Step 20353: {'lr': 0.00032352703070341864, 'samples': 10421248, 'steps': 20353, 'batch_loss/train': 0.695987680926919} 12/25/2021 15:45:32 - INFO - codeparrot_training - Step 20354: {'lr': 0.00032351196459249227, 'samples': 10421760, 'steps': 20354, 'batch_loss/train': 0.8266833648085594} 12/25/2021 15:45:43 - INFO - codeparrot_training - Step 20355: {'lr': 0.00032349689818931003, 'samples': 10422272, 'steps': 20355, 'batch_loss/train': 0.7830546759068966} 12/25/2021 15:45:56 - INFO - codeparrot_training - Step 20356: {'lr': 0.00032348183149393195, 'samples': 10422784, 'steps': 20356, 'batch_loss/train': 0.9385820017196238} 12/25/2021 15:46:07 - INFO - codeparrot_training - Step 20357: {'lr': 0.00032346676450641776, 'samples': 10423296, 'steps': 20357, 'batch_loss/train': 0.850600172765553} 12/25/2021 15:46:18 - INFO - codeparrot_training - Step 20358: {'lr': 0.00032345169722682753, 'samples': 10423808, 'steps': 20358, 'batch_loss/train': 0.8426347859203815} 12/25/2021 15:46:30 - INFO - codeparrot_training - Step 20359: {'lr': 0.000323436629655221, 'samples': 10424320, 'steps': 20359, 'batch_loss/train': 0.7651050596032292} 12/25/2021 15:46:40 - INFO - codeparrot_training - Step 20360: {'lr': 0.0003234215617916583, 'samples': 10424832, 'steps': 20360, 'batch_loss/train': 0.7968326006084681} 12/25/2021 15:46:51 - INFO - codeparrot_training - Step 20361: {'lr': 0.000323406493636199, 'samples': 10425344, 'steps': 20361, 'batch_loss/train': 0.9035193179734051} 12/25/2021 15:47:05 - INFO - codeparrot_training - Step 20362: {'lr': 0.00032339142518890337, 'samples': 10425856, 'steps': 20362, 
'batch_loss/train': 0.8296113992109895} 12/25/2021 15:47:16 - INFO - codeparrot_training - Step 20363: {'lr': 0.0003233763564498311, 'samples': 10426368, 'steps': 20363, 'batch_loss/train': 0.6634293459355831} 12/25/2021 15:47:26 - INFO - codeparrot_training - Step 20364: {'lr': 0.0003233612874190421, 'samples': 10426880, 'steps': 20364, 'batch_loss/train': 0.6981726353988051} 12/25/2021 15:47:37 - INFO - codeparrot_training - Step 20365: {'lr': 0.00032334621809659636, 'samples': 10427392, 'steps': 20365, 'batch_loss/train': 0.6546913594938815} 12/25/2021 15:47:49 - INFO - codeparrot_training - Step 20366: {'lr': 0.00032333114848255367, 'samples': 10427904, 'steps': 20366, 'batch_loss/train': 0.7958157164976001} 12/25/2021 15:48:00 - INFO - codeparrot_training - Step 20367: {'lr': 0.0003233160785769741, 'samples': 10428416, 'steps': 20367, 'batch_loss/train': 0.7836035517975688} 12/25/2021 15:48:10 - INFO - codeparrot_training - Step 20368: {'lr': 0.0003233010083799175, 'samples': 10428928, 'steps': 20368, 'batch_loss/train': 0.7385249065700918} 12/25/2021 15:48:22 - INFO - codeparrot_training - Step 20369: {'lr': 0.0003232859378914438, 'samples': 10429440, 'steps': 20369, 'batch_loss/train': 0.7355432603508234} 12/25/2021 15:48:33 - INFO - codeparrot_training - Step 20370: {'lr': 0.0003232708671116128, 'samples': 10429952, 'steps': 20370, 'batch_loss/train': 0.8797380356118083} 12/25/2021 15:48:44 - INFO - codeparrot_training - Step 20371: {'lr': 0.0003232557960404845, 'samples': 10430464, 'steps': 20371, 'batch_loss/train': 0.788156061898917} 12/25/2021 15:48:58 - INFO - codeparrot_training - Step 20372: {'lr': 0.0003232407246781188, 'samples': 10430976, 'steps': 20372, 'batch_loss/train': 0.7591753690503538} 12/25/2021 15:49:08 - INFO - codeparrot_training - Step 20373: {'lr': 0.0003232256530245757, 'samples': 10431488, 'steps': 20373, 'batch_loss/train': 0.809672437608242} 12/25/2021 15:49:19 - INFO - codeparrot_training - Step 20374: {'lr': 
0.00032321058107991497, 'samples': 10432000, 'steps': 20374, 'batch_loss/train': 0.7176207797601819} 12/25/2021 15:49:31 - INFO - codeparrot_training - Step 20375: {'lr': 0.0003231955088441967, 'samples': 10432512, 'steps': 20375, 'batch_loss/train': 0.7240033718990162} 12/25/2021 15:49:42 - INFO - codeparrot_training - Step 20376: {'lr': 0.00032318043631748074, 'samples': 10433024, 'steps': 20376, 'batch_loss/train': 0.6663282788358629} 12/25/2021 15:49:52 - INFO - codeparrot_training - Step 20377: {'lr': 0.00032316536349982685, 'samples': 10433536, 'steps': 20377, 'batch_loss/train': 0.8131999876350164} 12/25/2021 15:50:03 - INFO - codeparrot_training - Step 20378: {'lr': 0.00032315029039129523, 'samples': 10434048, 'steps': 20378, 'batch_loss/train': 0.8125688075087965} 12/25/2021 15:50:15 - INFO - codeparrot_training - Step 20379: {'lr': 0.0003231352169919456, 'samples': 10434560, 'steps': 20379, 'batch_loss/train': 0.7492664959281683} 12/25/2021 15:50:25 - INFO - codeparrot_training - Step 20380: {'lr': 0.00032312014330183804, 'samples': 10435072, 'steps': 20380, 'batch_loss/train': 0.7122536851093173} 12/25/2021 15:50:36 - INFO - codeparrot_training - Step 20381: {'lr': 0.00032310506932103233, 'samples': 10435584, 'steps': 20381, 'batch_loss/train': 0.7088971950579435} 12/25/2021 15:50:48 - INFO - codeparrot_training - Step 20382: {'lr': 0.0003230899950495886, 'samples': 10436096, 'steps': 20382, 'batch_loss/train': 0.8174371421337128} 12/25/2021 15:50:59 - INFO - codeparrot_training - Step 20383: {'lr': 0.00032307492048756656, 'samples': 10436608, 'steps': 20383, 'batch_loss/train': 0.8445188635960221} 12/25/2021 15:51:09 - INFO - codeparrot_training - Step 20384: {'lr': 0.00032305984563502614, 'samples': 10437120, 'steps': 20384, 'batch_loss/train': 0.8550049625337124} 12/25/2021 15:51:23 - INFO - codeparrot_training - Step 20385: {'lr': 0.00032304477049202747, 'samples': 10437632, 'steps': 20385, 'batch_loss/train': 0.8307242598384619} 12/25/2021 15:51:34 
- INFO - codeparrot_training - Step 20386: {'lr': 0.00032302969505863035, 'samples': 10438144, 'steps': 20386, 'batch_loss/train': 0.8404160065110773} 12/25/2021 15:51:44 - INFO - codeparrot_training - Step 20387: {'lr': 0.00032301461933489467, 'samples': 10438656, 'steps': 20387, 'batch_loss/train': 0.8358125444501638} 12/25/2021 15:51:55 - INFO - codeparrot_training - Step 20388: {'lr': 0.0003229995433208805, 'samples': 10439168, 'steps': 20388, 'batch_loss/train': 0.8151692235842347} 12/25/2021 15:52:07 - INFO - codeparrot_training - Step 20389: {'lr': 0.00032298446701664775, 'samples': 10439680, 'steps': 20389, 'batch_loss/train': 0.8123342068865895} 12/25/2021 15:52:18 - INFO - codeparrot_training - Step 20390: {'lr': 0.0003229693904222562, 'samples': 10440192, 'steps': 20390, 'batch_loss/train': 0.7070113618392497} 12/25/2021 15:52:28 - INFO - codeparrot_training - Step 20391: {'lr': 0.00032295431353776603, 'samples': 10440704, 'steps': 20391, 'batch_loss/train': 0.6763443863019347} 12/25/2021 15:52:42 - INFO - codeparrot_training - Step 20392: {'lr': 0.000322939236363237, 'samples': 10441216, 'steps': 20392, 'batch_loss/train': 0.717442266875878} 12/25/2021 15:52:53 - INFO - codeparrot_training - Step 20393: {'lr': 0.00032292415889872907, 'samples': 10441728, 'steps': 20393, 'batch_loss/train': 0.7630364932119846} 12/25/2021 15:53:03 - INFO - codeparrot_training - Step 20394: {'lr': 0.0003229090811443023, 'samples': 10442240, 'steps': 20394, 'batch_loss/train': 0.5743357604369521} 12/25/2021 15:53:15 - INFO - codeparrot_training - Step 20395: {'lr': 0.0003228940031000165, 'samples': 10442752, 'steps': 20395, 'batch_loss/train': 0.8766210554167628} 12/25/2021 15:53:26 - INFO - codeparrot_training - Step 20396: {'lr': 0.00032287892476593166, 'samples': 10443264, 'steps': 20396, 'batch_loss/train': 0.6691449377685785} 12/25/2021 15:53:37 - INFO - codeparrot_training - Step 20397: {'lr': 0.0003228638461421077, 'samples': 10443776, 'steps': 20397, 
'batch_loss/train': 0.8995641842484474} 12/25/2021 15:53:47 - INFO - codeparrot_training - Step 20398: {'lr': 0.0003228487672286047, 'samples': 10444288, 'steps': 20398, 'batch_loss/train': 0.697022276930511} 12/25/2021 15:53:59 - INFO - codeparrot_training - Step 20399: {'lr': 0.00032283368802548247, 'samples': 10444800, 'steps': 20399, 'batch_loss/train': 0.8133374033495784} 12/25/2021 15:54:10 - INFO - codeparrot_training - Step 20400: {'lr': 0.00032281860853280097, 'samples': 10445312, 'steps': 20400, 'batch_loss/train': 0.7729704575613141} 12/25/2021 15:54:21 - INFO - codeparrot_training - Step 20401: {'lr': 0.00032280352875062014, 'samples': 10445824, 'steps': 20401, 'batch_loss/train': 0.8652265286073089} 12/25/2021 15:54:34 - INFO - codeparrot_training - Step 20402: {'lr': 0.0003227884486790001, 'samples': 10446336, 'steps': 20402, 'batch_loss/train': 0.7836144641041756} 12/25/2021 15:54:45 - INFO - codeparrot_training - Step 20403: {'lr': 0.00032277336831800055, 'samples': 10446848, 'steps': 20403, 'batch_loss/train': 0.83588089607656} 12/25/2021 15:54:56 - INFO - codeparrot_training - Step 20404: {'lr': 0.00032275828766768164, 'samples': 10447360, 'steps': 20404, 'batch_loss/train': 0.9214985342696309} 12/25/2021 15:55:08 - INFO - codeparrot_training - Step 20405: {'lr': 0.0003227432067281032, 'samples': 10447872, 'steps': 20405, 'batch_loss/train': 0.9059944041073322} 12/25/2021 15:55:19 - INFO - codeparrot_training - Step 20406: {'lr': 0.0003227281254993252, 'samples': 10448384, 'steps': 20406, 'batch_loss/train': 0.7518356377258897} 12/25/2021 15:55:29 - INFO - codeparrot_training - Step 20407: {'lr': 0.00032271304398140773, 'samples': 10448896, 'steps': 20407, 'batch_loss/train': 0.8393677501007915} 12/25/2021 15:55:41 - INFO - codeparrot_training - Step 20408: {'lr': 0.00032269796217441067, 'samples': 10449408, 'steps': 20408, 'batch_loss/train': 0.7645256603136659} 12/25/2021 15:55:52 - INFO - codeparrot_training - Step 20409: {'lr': 
0.00032268288007839387, 'samples': 10449920, 'steps': 20409, 'batch_loss/train': 0.7255886553321034} 12/25/2021 15:56:02 - INFO - codeparrot_training - Step 20410: {'lr': 0.0003226677976934174, 'samples': 10450432, 'steps': 20410, 'batch_loss/train': 0.8018298475071788} 12/25/2021 15:56:13 - INFO - codeparrot_training - Step 20411: {'lr': 0.00032265271501954123, 'samples': 10450944, 'steps': 20411, 'batch_loss/train': 0.7189889918081462} 12/25/2021 15:56:25 - INFO - codeparrot_training - Step 20412: {'lr': 0.0003226376320568253, 'samples': 10451456, 'steps': 20412, 'batch_loss/train': 0.8811397152021527} 12/25/2021 15:56:36 - INFO - codeparrot_training - Step 20413: {'lr': 0.0003226225488053295, 'samples': 10451968, 'steps': 20413, 'batch_loss/train': 0.8109844541177154} 12/25/2021 15:56:46 - INFO - codeparrot_training - Step 20414: {'lr': 0.00032260746526511396, 'samples': 10452480, 'steps': 20414, 'batch_loss/train': 0.8041966408491135} 12/25/2021 15:57:00 - INFO - codeparrot_training - Step 20415: {'lr': 0.00032259238143623857, 'samples': 10452992, 'steps': 20415, 'batch_loss/train': 0.8288278188556433} 12/25/2021 15:57:11 - INFO - codeparrot_training - Step 20416: {'lr': 0.0003225772973187632, 'samples': 10453504, 'steps': 20416, 'batch_loss/train': 0.7848436686908826} 12/25/2021 15:57:22 - INFO - codeparrot_training - Step 20417: {'lr': 0.00032256221291274796, 'samples': 10454016, 'steps': 20417, 'batch_loss/train': 0.7384022483602166} 12/25/2021 15:57:34 - INFO - codeparrot_training - Step 20418: {'lr': 0.0003225471282182527, 'samples': 10454528, 'steps': 20418, 'batch_loss/train': 0.6710933689028025} 12/25/2021 15:57:45 - INFO - codeparrot_training - Step 20419: {'lr': 0.00032253204323533745, 'samples': 10455040, 'steps': 20419, 'batch_loss/train': 0.8183737583458424} 12/25/2021 15:57:55 - INFO - codeparrot_training - Step 20420: {'lr': 0.00032251695796406226, 'samples': 10455552, 'steps': 20420, 'batch_loss/train': 0.8547175480052829} 12/25/2021 15:58:06 - 
INFO - codeparrot_training - Step 20421: {'lr': 0.000322501872404487, 'samples': 10456064, 'steps': 20421, 'batch_loss/train': 0.7247743923217058} 12/25/2021 15:58:18 - INFO - codeparrot_training - Step 20422: {'lr': 0.0003224867865566716, 'samples': 10456576, 'steps': 20422, 'batch_loss/train': 0.8103503622114658} 12/25/2021 15:58:29 - INFO - codeparrot_training - Step 20423: {'lr': 0.0003224717004206762, 'samples': 10457088, 'steps': 20423, 'batch_loss/train': 0.7025958793237805} 12/25/2021 15:58:40 - INFO - codeparrot_training - Step 20424: {'lr': 0.00032245661399656067, 'samples': 10457600, 'steps': 20424, 'batch_loss/train': 0.9462755499407649} 12/25/2021 15:58:53 - INFO - codeparrot_training - Step 20425: {'lr': 0.00032244152728438496, 'samples': 10458112, 'steps': 20425, 'batch_loss/train': 1.1450284137390554} 12/25/2021 15:59:04 - INFO - codeparrot_training - Step 20426: {'lr': 0.0003224264402842091, 'samples': 10458624, 'steps': 20426, 'batch_loss/train': 0.7583929402753711} 12/25/2021 15:59:14 - INFO - codeparrot_training - Step 20427: {'lr': 0.000322411352996093, 'samples': 10459136, 'steps': 20427, 'batch_loss/train': 0.8009198876097798} 12/25/2021 15:59:26 - INFO - codeparrot_training - Step 20428: {'lr': 0.0003223962654200969, 'samples': 10459648, 'steps': 20428, 'batch_loss/train': 0.7603636106941849} 12/25/2021 15:59:37 - INFO - codeparrot_training - Step 20429: {'lr': 0.0003223811775562804, 'samples': 10460160, 'steps': 20429, 'batch_loss/train': 0.7327617835253477} 12/25/2021 15:59:48 - INFO - codeparrot_training - Step 20430: {'lr': 0.00032236608940470373, 'samples': 10460672, 'steps': 20430, 'batch_loss/train': 0.8171411454677582} 12/25/2021 15:59:58 - INFO - codeparrot_training - Step 20431: {'lr': 0.0003223510009654268, 'samples': 10461184, 'steps': 20431, 'batch_loss/train': 0.760220430791378} 12/25/2021 16:00:12 - INFO - codeparrot_training - Step 20432: {'lr': 0.0003223359122385096, 'samples': 10461696, 'steps': 20432, 'batch_loss/train': 
0.7428834140300751} 12/25/2021 16:00:23 - INFO - codeparrot_training - Step 20433: {'lr': 0.00032232082322401214, 'samples': 10462208, 'steps': 20433, 'batch_loss/train': 0.7994868559762836} 12/25/2021 16:00:33 - INFO - codeparrot_training - Step 20434: {'lr': 0.00032230573392199447, 'samples': 10462720, 'steps': 20434, 'batch_loss/train': 0.8250437645474449} 12/25/2021 16:00:46 - INFO - codeparrot_training - Step 20435: {'lr': 0.0003222906443325163, 'samples': 10463232, 'steps': 20435, 'batch_loss/train': 0.6728767682798207} 12/25/2021 16:00:56 - INFO - codeparrot_training - Step 20436: {'lr': 0.00032227555445563794, 'samples': 10463744, 'steps': 20436, 'batch_loss/train': 0.8213811228051782} 12/25/2021 16:01:07 - INFO - codeparrot_training - Step 20437: {'lr': 0.0003222604642914194, 'samples': 10464256, 'steps': 20437, 'batch_loss/train': 0.7635980029590428} 12/25/2021 16:01:19 - INFO - codeparrot_training - Step 20438: {'lr': 0.00032224537383992037, 'samples': 10464768, 'steps': 20438, 'batch_loss/train': 0.7559531964361668} 12/25/2021 16:01:29 - INFO - codeparrot_training - Step 20439: {'lr': 0.0003222302831012011, 'samples': 10465280, 'steps': 20439, 'batch_loss/train': 0.8093550959601998} 12/25/2021 16:01:40 - INFO - codeparrot_training - Step 20440: {'lr': 0.0003222151920753214, 'samples': 10465792, 'steps': 20440, 'batch_loss/train': 0.7281398237682879} 12/25/2021 16:01:51 - INFO - codeparrot_training - Step 20441: {'lr': 0.0003222001007623415, 'samples': 10466304, 'steps': 20441, 'batch_loss/train': 0.7811534013599157} 12/25/2021 16:02:04 - INFO - codeparrot_training - Step 20442: {'lr': 0.0003221850091623211, 'samples': 10466816, 'steps': 20442, 'batch_loss/train': 0.8486070558428764} 12/25/2021 16:02:15 - INFO - codeparrot_training - Step 20443: {'lr': 0.00032216991727532045, 'samples': 10467328, 'steps': 20443, 'batch_loss/train': 0.7228862540796399} 12/25/2021 16:02:25 - INFO - codeparrot_training - Step 20444: {'lr': 0.00032215482510139943, 'samples': 
10467840, 'steps': 20444, 'batch_loss/train': 0.8061618534848094} 12/25/2021 16:02:38 - INFO - codeparrot_training - Step 20445: {'lr': 0.00032213973264061814, 'samples': 10468352, 'steps': 20445, 'batch_loss/train': 0.7793464791029692} 12/25/2021 16:02:48 - INFO - codeparrot_training - Step 20446: {'lr': 0.0003221246398930364, 'samples': 10468864, 'steps': 20446, 'batch_loss/train': 0.8613942191004753} 12/25/2021 16:02:59 - INFO - codeparrot_training - Step 20447: {'lr': 0.00032210954685871446, 'samples': 10469376, 'steps': 20447, 'batch_loss/train': 0.9742087982594967} 12/25/2021 16:03:13 - INFO - codeparrot_training - Step 20448: {'lr': 0.00032209445353771203, 'samples': 10469888, 'steps': 20448, 'batch_loss/train': 0.7055770670995116} 12/25/2021 16:03:23 - INFO - codeparrot_training - Step 20449: {'lr': 0.00032207935993008933, 'samples': 10470400, 'steps': 20449, 'batch_loss/train': 0.7612980637932196} 12/25/2021 16:03:34 - INFO - codeparrot_training - Step 20450: {'lr': 0.00032206426603590646, 'samples': 10470912, 'steps': 20450, 'batch_loss/train': 0.8472271552309394} 12/25/2021 16:03:46 - INFO - codeparrot_training - Step 20451: {'lr': 0.00032204917185522307, 'samples': 10471424, 'steps': 20451, 'batch_loss/train': 0.7783714017132297} 12/25/2021 16:03:57 - INFO - codeparrot_training - Step 20452: {'lr': 0.00032203407738809947, 'samples': 10471936, 'steps': 20452, 'batch_loss/train': 0.648680136539042} 12/25/2021 16:04:07 - INFO - codeparrot_training - Step 20453: {'lr': 0.00032201898263459563, 'samples': 10472448, 'steps': 20453, 'batch_loss/train': 0.8817802723497152} 12/25/2021 16:04:18 - INFO - codeparrot_training - Step 20454: {'lr': 0.00032200388759477146, 'samples': 10472960, 'steps': 20454, 'batch_loss/train': 0.7590532200410962} 12/25/2021 16:04:30 - INFO - codeparrot_training - Step 20455: {'lr': 0.00032198879226868696, 'samples': 10473472, 'steps': 20455, 'batch_loss/train': 0.8744523897767067} 12/25/2021 16:04:41 - INFO - codeparrot_training - 
Step 20456: {'lr': 0.0003219736966564023, 'samples': 10473984, 'steps': 20456, 'batch_loss/train': 0.8220123746432364} 12/25/2021 16:04:51 - INFO - codeparrot_training - Step 20457: {'lr': 0.00032195860075797734, 'samples': 10474496, 'steps': 20457, 'batch_loss/train': 0.7623159186914563} 12/25/2021 16:05:03 - INFO - codeparrot_training - Step 20458: {'lr': 0.00032194350457347213, 'samples': 10475008, 'steps': 20458, 'batch_loss/train': 0.7092311913147569} 12/25/2021 16:05:14 - INFO - codeparrot_training - Step 20459: {'lr': 0.00032192840810294675, 'samples': 10475520, 'steps': 20459, 'batch_loss/train': 0.7557890811003745} 12/25/2021 16:05:25 - INFO - codeparrot_training - Step 20460: {'lr': 0.0003219133113464612, 'samples': 10476032, 'steps': 20460, 'batch_loss/train': 0.7489467333070934} 12/25/2021 16:05:38 - INFO - codeparrot_training - Step 20461: {'lr': 0.00032189821430407545, 'samples': 10476544, 'steps': 20461, 'batch_loss/train': 0.8664036588743329} 12/25/2021 16:05:49 - INFO - codeparrot_training - Step 20462: {'lr': 0.00032188311697584955, 'samples': 10477056, 'steps': 20462, 'batch_loss/train': 0.6437580061610788} 12/25/2021 16:05:59 - INFO - codeparrot_training - Step 20463: {'lr': 0.00032186801936184357, 'samples': 10477568, 'steps': 20463, 'batch_loss/train': 0.733587525319308} 12/25/2021 16:06:10 - INFO - codeparrot_training - Step 20464: {'lr': 0.0003218529214621174, 'samples': 10478080, 'steps': 20464, 'batch_loss/train': 0.7251591747626662} 12/25/2021 16:06:22 - INFO - codeparrot_training - Step 20465: {'lr': 0.0003218378232767312, 'samples': 10478592, 'steps': 20465, 'batch_loss/train': 0.7945766048505902} 12/25/2021 16:06:33 - INFO - codeparrot_training - Step 20466: {'lr': 0.00032182272480574487, 'samples': 10479104, 'steps': 20466, 'batch_loss/train': 0.7992869894951582} 12/25/2021 16:06:43 - INFO - codeparrot_training - Step 20467: {'lr': 0.00032180762604921856, 'samples': 10479616, 'steps': 20467, 'batch_loss/train': 0.8509317375719547} 
12/25/2021 16:06:55 - INFO - codeparrot_training - Step 20468: {'lr': 0.00032179252700721223, 'samples': 10480128, 'steps': 20468, 'batch_loss/train': 0.7116446734871715} 12/25/2021 16:07:06 - INFO - codeparrot_training - Step 20469: {'lr': 0.000321777427679786, 'samples': 10480640, 'steps': 20469, 'batch_loss/train': 0.6696370099671185} 12/25/2021 16:07:17 - INFO - codeparrot_training - Step 20470: {'lr': 0.0003217623280669997, 'samples': 10481152, 'steps': 20470, 'batch_loss/train': 0.7771881413646042} 12/25/2021 16:07:30 - INFO - codeparrot_training - Step 20471: {'lr': 0.00032174722816891355, 'samples': 10481664, 'steps': 20471, 'batch_loss/train': 0.9149417644366622} 12/25/2021 16:07:41 - INFO - codeparrot_training - Step 20472: {'lr': 0.0003217321279855875, 'samples': 10482176, 'steps': 20472, 'batch_loss/train': 0.8179667824879289} 12/25/2021 16:07:52 - INFO - codeparrot_training - Step 20473: {'lr': 0.00032171702751708156, 'samples': 10482688, 'steps': 20473, 'batch_loss/train': 0.7014381983317435} 12/25/2021 16:08:02 - INFO - codeparrot_training - Step 20474: {'lr': 0.0003217019267634558, 'samples': 10483200, 'steps': 20474, 'batch_loss/train': 0.7595402672886848} 12/25/2021 16:08:14 - INFO - codeparrot_training - Step 20475: {'lr': 0.00032168682572477025, 'samples': 10483712, 'steps': 20475, 'batch_loss/train': 0.7874371451325715} 12/25/2021 16:08:25 - INFO - codeparrot_training - Step 20476: {'lr': 0.000321671724401085, 'samples': 10484224, 'steps': 20476, 'batch_loss/train': 0.7609019349329174} 12/25/2021 16:08:36 - INFO - codeparrot_training - Step 20477: {'lr': 0.00032165662279246, 'samples': 10484736, 'steps': 20477, 'batch_loss/train': 0.6868932512588799} 12/25/2021 16:08:49 - INFO - codeparrot_training - Step 20478: {'lr': 0.0003216415208989553, 'samples': 10485248, 'steps': 20478, 'batch_loss/train': 0.8578658681362867} 12/25/2021 16:09:00 - INFO - codeparrot_training - Step 20479: {'lr': 0.000321626418720631, 'samples': 10485760, 'steps': 20479, 
'batch_loss/train': 0.8928857631981373} 12/25/2021 16:09:11 - INFO - codeparrot_training - Step 20480: {'lr': 0.00032161131625754714, 'samples': 10486272, 'steps': 20480, 'batch_loss/train': 0.9402186137158424} 12/25/2021 16:09:23 - INFO - codeparrot_training - Step 20481: {'lr': 0.00032159621350976364, 'samples': 10486784, 'steps': 20481, 'batch_loss/train': 0.6696542487479746} 12/25/2021 16:09:33 - INFO - codeparrot_training - Step 20482: {'lr': 0.00032158111047734075, 'samples': 10487296, 'steps': 20482, 'batch_loss/train': 0.8416994321160018} 12/25/2021 16:09:44 - INFO - codeparrot_training - Step 20483: {'lr': 0.0003215660071603383, 'samples': 10487808, 'steps': 20483, 'batch_loss/train': 0.8281344138085842} 12/25/2021 16:09:54 - INFO - codeparrot_training - Step 20484: {'lr': 0.00032155090355881645, 'samples': 10488320, 'steps': 20484, 'batch_loss/train': 0.7895041340962052} 12/25/2021 16:10:07 - INFO - codeparrot_training - Step 20485: {'lr': 0.0003215357996728352, 'samples': 10488832, 'steps': 20485, 'batch_loss/train': 1.2720593120902777} 12/25/2021 16:10:18 - INFO - codeparrot_training - Step 20486: {'lr': 0.0003215206955024546, 'samples': 10489344, 'steps': 20486, 'batch_loss/train': 0.8206763686612248} 12/25/2021 16:10:29 - INFO - codeparrot_training - Step 20487: {'lr': 0.00032150559104773477, 'samples': 10489856, 'steps': 20487, 'batch_loss/train': 0.7646927279420197} 12/25/2021 16:10:42 - INFO - codeparrot_training - Step 20488: {'lr': 0.0003214904863087357, 'samples': 10490368, 'steps': 20488, 'batch_loss/train': 0.7977928002364933} 12/25/2021 16:10:53 - INFO - codeparrot_training - Step 20489: {'lr': 0.0003214753812855175, 'samples': 10490880, 'steps': 20489, 'batch_loss/train': 0.8313047364354134} 12/25/2021 16:11:04 - INFO - codeparrot_training - Step 20490: {'lr': 0.0003214602759781401, 'samples': 10491392, 'steps': 20490, 'batch_loss/train': 0.8000267730094492} 12/25/2021 16:11:16 - INFO - codeparrot_training - Step 20491: {'lr': 
0.00032144517038666366, 'samples': 10491904, 'steps': 20491, 'batch_loss/train': 0.6920935306698084} 12/25/2021 16:11:27 - INFO - codeparrot_training - Step 20492: {'lr': 0.00032143006451114815, 'samples': 10492416, 'steps': 20492, 'batch_loss/train': 0.85767253767699} 12/25/2021 16:11:37 - INFO - codeparrot_training - Step 20493: {'lr': 0.0003214149583516537, 'samples': 10492928, 'steps': 20493, 'batch_loss/train': 0.8293882505968213} 12/25/2021 16:11:48 - INFO - codeparrot_training - Step 20494: {'lr': 0.00032139985190824034, 'samples': 10493440, 'steps': 20494, 'batch_loss/train': 0.6916696634143591} 12/25/2021 16:12:00 - INFO - codeparrot_training - Step 20495: {'lr': 0.00032138474518096817, 'samples': 10493952, 'steps': 20495, 'batch_loss/train': 0.7352753435261548} 12/25/2021 16:12:11 - INFO - codeparrot_training - Step 20496: {'lr': 0.0003213696381698972, 'samples': 10494464, 'steps': 20496, 'batch_loss/train': 0.7753432118333876} 12/25/2021 16:12:21 - INFO - codeparrot_training - Step 20497: {'lr': 0.00032135453087508737, 'samples': 10494976, 'steps': 20497, 'batch_loss/train': 0.8175692008808255} 12/25/2021 16:12:33 - INFO - codeparrot_training - Step 20498: {'lr': 0.000321339423296599, 'samples': 10495488, 'steps': 20498, 'batch_loss/train': 0.7459754422307014} 12/25/2021 16:12:44 - INFO - codeparrot_training - Step 20499: {'lr': 0.000321324315434492, 'samples': 10496000, 'steps': 20499, 'batch_loss/train': 0.7760772090405226} 12/25/2021 16:12:54 - INFO - codeparrot_training - Step 20500: {'lr': 0.00032130920728882644, 'samples': 10496512, 'steps': 20500, 'batch_loss/train': 0.7166312778717838} 12/25/2021 16:13:08 - INFO - codeparrot_training - Step 20501: {'lr': 0.00032129409885966244, 'samples': 10497024, 'steps': 20501, 'batch_loss/train': 0.7821380337700248} 12/25/2021 16:13:19 - INFO - codeparrot_training - Step 20502: {'lr': 0.00032127899014706, 'samples': 10497536, 'steps': 20502, 'batch_loss/train': 0.8156931172125041} 12/25/2021 16:13:29 - INFO - 
codeparrot_training - Step 20503: {'lr': 0.00032126388115107915, 'samples': 10498048, 'steps': 20503, 'batch_loss/train': 0.8268472962081432} 12/25/2021 16:13:40 - INFO - codeparrot_training - Step 20504: {'lr': 0.0003212487718717801, 'samples': 10498560, 'steps': 20504, 'batch_loss/train': 0.8087481390684843} 12/25/2021 16:13:52 - INFO - codeparrot_training - Step 20505: {'lr': 0.00032123366230922274, 'samples': 10499072, 'steps': 20505, 'batch_loss/train': 0.8211919236928225} 12/25/2021 16:14:03 - INFO - codeparrot_training - Step 20506: {'lr': 0.0003212185524634673, 'samples': 10499584, 'steps': 20506, 'batch_loss/train': 0.8002102039754391} 12/25/2021 16:14:13 - INFO - codeparrot_training - Step 20507: {'lr': 0.00032120344233457375, 'samples': 10500096, 'steps': 20507, 'batch_loss/train': 0.7373662898316979} 12/25/2021 16:14:27 - INFO - codeparrot_training - Step 20508: {'lr': 0.0003211883319226023, 'samples': 10500608, 'steps': 20508, 'batch_loss/train': 0.7939179702661932} 12/25/2021 16:14:38 - INFO - codeparrot_training - Step 20509: {'lr': 0.00032117322122761284, 'samples': 10501120, 'steps': 20509, 'batch_loss/train': 0.7469173674471676} 12/25/2021 16:14:48 - INFO - codeparrot_training - Step 20510: {'lr': 0.00032115811024966545, 'samples': 10501632, 'steps': 20510, 'batch_loss/train': 0.7490466572344303} 12/25/2021 16:15:00 - INFO - codeparrot_training - Step 20511: {'lr': 0.00032114299898882044, 'samples': 10502144, 'steps': 20511, 'batch_loss/train': 0.7257564589381218} 12/25/2021 16:15:11 - INFO - codeparrot_training - Step 20512: {'lr': 0.00032112788744513766, 'samples': 10502656, 'steps': 20512, 'batch_loss/train': 0.6592912892228924} 12/25/2021 16:15:22 - INFO - codeparrot_training - Step 20513: {'lr': 0.00032111277561867717, 'samples': 10503168, 'steps': 20513, 'batch_loss/train': 0.8061968982219696} 12/25/2021 16:15:32 - INFO - codeparrot_training - Step 20514: {'lr': 0.00032109766350949925, 'samples': 10503680, 'steps': 20514, 'batch_loss/train': 
0.6267467360012233} 12/25/2021 16:15:44 - INFO - codeparrot_training - Step 20515: {'lr': 0.00032108255111766386, 'samples': 10504192, 'steps': 20515, 'batch_loss/train': 0.7059608837589622} 12/25/2021 16:15:55 - INFO - codeparrot_training - Step 20516: {'lr': 0.000321067438443231, 'samples': 10504704, 'steps': 20516, 'batch_loss/train': 0.8150557866320014} 12/25/2021 16:16:06 - INFO - codeparrot_training - Step 20517: {'lr': 0.00032105232548626094, 'samples': 10505216, 'steps': 20517, 'batch_loss/train': 0.8194044847041368} 12/25/2021 16:16:19 - INFO - codeparrot_training - Step 20518: {'lr': 0.0003210372122468136, 'samples': 10505728, 'steps': 20518, 'batch_loss/train': 0.7456455263309181} 12/25/2021 16:16:30 - INFO - codeparrot_training - Step 20519: {'lr': 0.00032102209872494913, 'samples': 10506240, 'steps': 20519, 'batch_loss/train': 0.8412744421511889} 12/25/2021 16:16:41 - INFO - codeparrot_training - Step 20520: {'lr': 0.00032100698492072764, 'samples': 10506752, 'steps': 20520, 'batch_loss/train': 0.7146226880140603} 12/25/2021 16:16:53 - INFO - codeparrot_training - Step 20521: {'lr': 0.00032099187083420923, 'samples': 10507264, 'steps': 20521, 'batch_loss/train': 0.7844343855977058} 12/25/2021 16:17:03 - INFO - codeparrot_training - Step 20522: {'lr': 0.00032097675646545386, 'samples': 10507776, 'steps': 20522, 'batch_loss/train': 0.7887844862416387} 12/25/2021 16:17:14 - INFO - codeparrot_training - Step 20523: {'lr': 0.0003209616418145217, 'samples': 10508288, 'steps': 20523, 'batch_loss/train': 0.826610317453742} 12/25/2021 16:17:26 - INFO - codeparrot_training - Step 20524: {'lr': 0.0003209465268814729, 'samples': 10508800, 'steps': 20524, 'batch_loss/train': 0.7903364622034132} 12/25/2021 16:17:37 - INFO - codeparrot_training - Step 20525: {'lr': 0.00032093141166636755, 'samples': 10509312, 'steps': 20525, 'batch_loss/train': 0.7855398175306618} 12/25/2021 16:17:47 - INFO - codeparrot_training - Step 20526: {'lr': 0.0003209162961692655, 'samples': 
10509824, 'steps': 20526, 'batch_loss/train': 0.8240212947130203} 12/25/2021 16:17:58 - INFO - codeparrot_training - Step 20527: {'lr': 0.0003209011803902272, 'samples': 10510336, 'steps': 20527, 'batch_loss/train': 0.7886972073465586} 12/25/2021 16:18:12 - INFO - codeparrot_training - Step 20528: {'lr': 0.0003208860643293126, 'samples': 10510848, 'steps': 20528, 'batch_loss/train': 0.7908745408058167} 12/25/2021 16:18:22 - INFO - codeparrot_training - Step 20529: {'lr': 0.0003208709479865816, 'samples': 10511360, 'steps': 20529, 'batch_loss/train': 0.8190637985244393} 12/25/2021 16:18:33 - INFO - codeparrot_training - Step 20530: {'lr': 0.00032085583136209466, 'samples': 10511872, 'steps': 20530, 'batch_loss/train': 0.8496872149407864} 12/25/2021 16:18:45 - INFO - codeparrot_training - Step 20531: {'lr': 0.0003208407144559116, 'samples': 10512384, 'steps': 20531, 'batch_loss/train': 0.8003496991004795} 12/25/2021 16:18:56 - INFO - codeparrot_training - Step 20532: {'lr': 0.0003208255972680926, 'samples': 10512896, 'steps': 20532, 'batch_loss/train': 0.7693450404331088} 12/25/2021 16:19:06 - INFO - codeparrot_training - Step 20533: {'lr': 0.0003208104797986978, 'samples': 10513408, 'steps': 20533, 'batch_loss/train': 0.7917458154261112} 12/25/2021 16:19:18 - INFO - codeparrot_training - Step 20534: {'lr': 0.0003207953620477873, 'samples': 10513920, 'steps': 20534, 'batch_loss/train': 0.7860917868092656} 12/25/2021 16:19:29 - INFO - codeparrot_training - Step 20535: {'lr': 0.0003207802440154211, 'samples': 10514432, 'steps': 20535, 'batch_loss/train': 0.6287324369768612} 12/25/2021 16:19:40 - INFO - codeparrot_training - Step 20536: {'lr': 0.00032076512570165935, 'samples': 10514944, 'steps': 20536, 'batch_loss/train': 0.5329772742406931} 12/25/2021 16:19:50 - INFO - codeparrot_training - Step 20537: {'lr': 0.00032075000710656226, 'samples': 10515456, 'steps': 20537, 'batch_loss/train': 0.7986410409212112} 12/25/2021 16:20:03 - INFO - codeparrot_training - Step 
20538: {'lr': 0.0003207348882301898, 'samples': 10515968, 'steps': 20538, 'batch_loss/train': 0.7740316754207015} 12/25/2021 16:20:13 - INFO - codeparrot_training - Step 20539: {'lr': 0.0003207197690726021, 'samples': 10516480, 'steps': 20539, 'batch_loss/train': 0.7761922907084227} 12/25/2021 16:20:24 - INFO - codeparrot_training - Step 20540: {'lr': 0.0003207046496338594, 'samples': 10516992, 'steps': 20540, 'batch_loss/train': 0.8627011543139815} 12/25/2021 16:20:38 - INFO - codeparrot_training - Step 20541: {'lr': 0.0003206895299140217, 'samples': 10517504, 'steps': 20541, 'batch_loss/train': 0.7149343490600586} 12/25/2021 16:20:48 - INFO - codeparrot_training - Step 20542: {'lr': 0.00032067440991314906, 'samples': 10518016, 'steps': 20542, 'batch_loss/train': 0.8821976911276579} 12/25/2021 16:20:59 - INFO - codeparrot_training - Step 20543: {'lr': 0.00032065928963130174, 'samples': 10518528, 'steps': 20543, 'batch_loss/train': 0.7756542060524225} 12/25/2021 16:21:11 - INFO - codeparrot_training - Step 20544: {'lr': 0.00032064416906853963, 'samples': 10519040, 'steps': 20544, 'batch_loss/train': 0.8377587264403701} 12/25/2021 16:21:22 - INFO - codeparrot_training - Step 20545: {'lr': 0.0003206290482249231, 'samples': 10519552, 'steps': 20545, 'batch_loss/train': 0.8253525346517563} 12/25/2021 16:21:32 - INFO - codeparrot_training - Step 20546: {'lr': 0.00032061392710051206, 'samples': 10520064, 'steps': 20546, 'batch_loss/train': 0.9919190350919962} 12/25/2021 16:21:46 - INFO - codeparrot_training - Step 20547: {'lr': 0.00032059880569536677, 'samples': 10520576, 'steps': 20547, 'batch_loss/train': 0.7824178216978908} 12/25/2021 16:21:57 - INFO - codeparrot_training - Step 20548: {'lr': 0.00032058368400954723, 'samples': 10521088, 'steps': 20548, 'batch_loss/train': 0.7251292413566262} 12/25/2021 16:22:07 - INFO - codeparrot_training - Step 20549: {'lr': 0.0003205685620431137, 'samples': 10521600, 'steps': 20549, 'batch_loss/train': 0.7352380333468318} 
12/25/2021 16:22:18 - INFO - codeparrot_training - Step 20550: {'lr': 0.00032055343979612617, 'samples': 10522112, 'steps': 20550, 'batch_loss/train': 0.7590975083876401} 12/25/2021 16:22:30 - INFO - codeparrot_training - Step 20551: {'lr': 0.0003205383172686448, 'samples': 10522624, 'steps': 20551, 'batch_loss/train': 0.881092278752476} 12/25/2021 16:22:41 - INFO - codeparrot_training - Step 20552: {'lr': 0.00032052319446072966, 'samples': 10523136, 'steps': 20552, 'batch_loss/train': 0.7797554000280797} 12/25/2021 16:22:51 - INFO - codeparrot_training - Step 20553: {'lr': 0.000320508071372441, 'samples': 10523648, 'steps': 20553, 'batch_loss/train': 0.7989561716094613} 12/25/2021 16:23:03 - INFO - codeparrot_training - Step 20554: {'lr': 0.00032049294800383886, 'samples': 10524160, 'steps': 20554, 'batch_loss/train': 0.7607894744724035} 12/25/2021 16:23:14 - INFO - codeparrot_training - Step 20555: {'lr': 0.0003204778243549833, 'samples': 10524672, 'steps': 20555, 'batch_loss/train': 0.7676982888951898} 12/25/2021 16:23:24 - INFO - codeparrot_training - Step 20556: {'lr': 0.00032046270042593463, 'samples': 10525184, 'steps': 20556, 'batch_loss/train': 0.7587820370681584} 12/25/2021 16:23:38 - INFO - codeparrot_training - Step 20557: {'lr': 0.0003204475762167528, 'samples': 10525696, 'steps': 20557, 'batch_loss/train': 0.7887032860890031} 12/25/2021 16:23:49 - INFO - codeparrot_training - Step 20558: {'lr': 0.000320432451727498, 'samples': 10526208, 'steps': 20558, 'batch_loss/train': 0.7977784238755703} 12/25/2021 16:23:59 - INFO - codeparrot_training - Step 20559: {'lr': 0.0003204173269582304, 'samples': 10526720, 'steps': 20559, 'batch_loss/train': 0.8107769847847521} 12/25/2021 16:24:11 - INFO - codeparrot_training - Step 20560: {'lr': 0.0003204022019090101, 'samples': 10527232, 'steps': 20560, 'batch_loss/train': 0.7132084490731359} 12/25/2021 16:24:22 - INFO - codeparrot_training - Step 20561: {'lr': 0.0003203870765798972, 'samples': 10527744, 'steps': 
20561, 'batch_loss/train': 0.8886520806699991} 12/25/2021 16:24:33 - INFO - codeparrot_training - Step 20562: {'lr': 0.00032037195097095185, 'samples': 10528256, 'steps': 20562, 'batch_loss/train': 0.8232317892834544} 12/25/2021 16:24:43 - INFO - codeparrot_training - Step 20563: {'lr': 0.0003203568250822343, 'samples': 10528768, 'steps': 20563, 'batch_loss/train': 0.7901975284330547} 12/25/2021 16:24:57 - INFO - codeparrot_training - Step 20564: {'lr': 0.0003203416989138045, 'samples': 10529280, 'steps': 20564, 'batch_loss/train': 0.6840060070971958} 12/25/2021 16:25:08 - INFO - codeparrot_training - Step 20565: {'lr': 0.0003203265724657227, 'samples': 10529792, 'steps': 20565, 'batch_loss/train': 0.7485851221717894} 12/25/2021 16:25:18 - INFO - codeparrot_training - Step 20566: {'lr': 0.00032031144573804895, 'samples': 10530304, 'steps': 20566, 'batch_loss/train': 0.7442555122543126} 12/25/2021 16:25:31 - INFO - codeparrot_training - Step 20567: {'lr': 0.00032029631873084345, 'samples': 10530816, 'steps': 20567, 'batch_loss/train': 0.8049518438056111} 12/25/2021 16:25:41 - INFO - codeparrot_training - Step 20568: {'lr': 0.0003202811914441663, 'samples': 10531328, 'steps': 20568, 'batch_loss/train': 0.8258728744694963} 12/25/2021 16:25:52 - INFO - codeparrot_training - Step 20569: {'lr': 0.0003202660638780778, 'samples': 10531840, 'steps': 20569, 'batch_loss/train': 0.7469266009284183} 12/25/2021 16:26:04 - INFO - codeparrot_training - Step 20570: {'lr': 0.0003202509360326378, 'samples': 10532352, 'steps': 20570, 'batch_loss/train': 0.73267813998973} 12/25/2021 16:26:15 - INFO - codeparrot_training - Step 20571: {'lr': 0.00032023580790790663, 'samples': 10532864, 'steps': 20571, 'batch_loss/train': 0.8604395189322531} 12/25/2021 16:26:25 - INFO - codeparrot_training - Step 20572: {'lr': 0.0003202206795039445, 'samples': 10533376, 'steps': 20572, 'batch_loss/train': 1.071135493926704} 12/25/2021 16:26:36 - INFO - codeparrot_training - Step 20573: {'lr': 
0.0003202055508208114, 'samples': 10533888, 'steps': 20573, 'batch_loss/train': 0.8617442273534834} 12/25/2021 16:26:48 - INFO - codeparrot_training - Step 20574: {'lr': 0.00032019042185856754, 'samples': 10534400, 'steps': 20574, 'batch_loss/train': 0.6796473786234856} 12/25/2021 16:26:59 - INFO - codeparrot_training - Step 20575: {'lr': 0.0003201752926172731, 'samples': 10534912, 'steps': 20575, 'batch_loss/train': 0.7428617710247636} 12/25/2021 16:27:09 - INFO - codeparrot_training - Step 20576: {'lr': 0.0003201601630969882, 'samples': 10535424, 'steps': 20576, 'batch_loss/train': 0.8586378926411271} 12/25/2021 16:27:23 - INFO - codeparrot_training - Step 20577: {'lr': 0.00032014503329777294, 'samples': 10535936, 'steps': 20577, 'batch_loss/train': 0.7742369910702109} 12/25/2021 16:27:34 - INFO - codeparrot_training - Step 20578: {'lr': 0.0003201299032196875, 'samples': 10536448, 'steps': 20578, 'batch_loss/train': 0.821088396362029} 12/25/2021 16:27:44 - INFO - codeparrot_training - Step 20579: {'lr': 0.00032011477286279213, 'samples': 10536960, 'steps': 20579, 'batch_loss/train': 0.7981434500543401} 12/25/2021 16:27:56 - INFO - codeparrot_training - Step 20580: {'lr': 0.0003200996422271468, 'samples': 10537472, 'steps': 20580, 'batch_loss/train': 0.700233937241137} 12/25/2021 16:28:07 - INFO - codeparrot_training - Step 20581: {'lr': 0.0003200845113128118, 'samples': 10537984, 'steps': 20581, 'batch_loss/train': 0.8730518319644034} 12/25/2021 16:28:18 - INFO - codeparrot_training - Step 20582: {'lr': 0.00032006938011984725, 'samples': 10538496, 'steps': 20582, 'batch_loss/train': 0.7271788734942675} 12/25/2021 16:28:28 - INFO - codeparrot_training - Step 20583: {'lr': 0.0003200542486483133, 'samples': 10539008, 'steps': 20583, 'batch_loss/train': 0.9910708237439394} 12/25/2021 16:28:40 - INFO - codeparrot_training - Step 20584: {'lr': 0.0003200391168982701, 'samples': 10539520, 'steps': 20584, 'batch_loss/train': 0.6936454236856662} 12/25/2021 16:28:51 - INFO 
- codeparrot_training - Step 20585: {'lr': 0.00032002398486977786, 'samples': 10540032, 'steps': 20585, 'batch_loss/train': 0.7508127242326736} 12/25/2021 16:29:02 - INFO - codeparrot_training - Step 20586: {'lr': 0.0003200088525628966, 'samples': 10540544, 'steps': 20586, 'batch_loss/train': 0.7687851241789758} 12/25/2021 16:29:15 - INFO - codeparrot_training - Step 20587: {'lr': 0.00031999371997768665, 'samples': 10541056, 'steps': 20587, 'batch_loss/train': 0.93476881692186} 12/25/2021 16:29:26 - INFO - codeparrot_training - Step 20588: {'lr': 0.00031997858711420806, 'samples': 10541568, 'steps': 20588, 'batch_loss/train': 0.8010334949940443} 12/25/2021 16:29:37 - INFO - codeparrot_training - Step 20589: {'lr': 0.00031996345397252113, 'samples': 10542080, 'steps': 20589, 'batch_loss/train': 0.84309535427019} 12/25/2021 16:29:49 - INFO - codeparrot_training - Step 20590: {'lr': 0.00031994832055268587, 'samples': 10542592, 'steps': 20590, 'batch_loss/train': 0.7930965656414628} 12/25/2021 16:29:59 - INFO - codeparrot_training - Step 20591: {'lr': 0.00031993318685476246, 'samples': 10543104, 'steps': 20591, 'batch_loss/train': 0.827706269454211} 12/25/2021 16:30:10 - INFO - codeparrot_training - Step 20592: {'lr': 0.0003199180528788111, 'samples': 10543616, 'steps': 20592, 'batch_loss/train': 0.7636387231759727} 12/25/2021 16:30:24 - INFO - codeparrot_training - Step 20593: {'lr': 0.000319902918624892, 'samples': 10544128, 'steps': 20593, 'batch_loss/train': 0.8052846398204565} 12/25/2021 16:30:34 - INFO - codeparrot_training - Step 20594: {'lr': 0.0003198877840930653, 'samples': 10544640, 'steps': 20594, 'batch_loss/train': 0.807477219030261} 12/25/2021 16:30:45 - INFO - codeparrot_training - Step 20595: {'lr': 0.0003198726492833912, 'samples': 10545152, 'steps': 20595, 'batch_loss/train': 0.7990481127053499} 12/25/2021 16:30:56 - INFO - codeparrot_training - Step 20596: {'lr': 0.0003198575141959297, 'samples': 10545664, 'steps': 20596, 'batch_loss/train': 
0.8222625395283103} 12/25/2021 16:31:08 - INFO - codeparrot_training - Step 20597: {'lr': 0.00031984237883074106, 'samples': 10546176, 'steps': 20597, 'batch_loss/train': 0.7773072463460267} 12/25/2021 16:31:18 - INFO - codeparrot_training - Step 20598: {'lr': 0.0003198272431878857, 'samples': 10546688, 'steps': 20598, 'batch_loss/train': 0.7810155302286148} 12/25/2021 16:31:29 - INFO - codeparrot_training - Step 20599: {'lr': 0.0003198121072674235, 'samples': 10547200, 'steps': 20599, 'batch_loss/train': 0.7367694680579007} 12/25/2021 16:31:41 - INFO - codeparrot_training - Step 20600: {'lr': 0.0003197969710694146, 'samples': 10547712, 'steps': 20600, 'batch_loss/train': 0.7282607508823276} 12/25/2021 16:31:51 - INFO - codeparrot_training - Step 20601: {'lr': 0.00031978183459391947, 'samples': 10548224, 'steps': 20601, 'batch_loss/train': 0.8896538419649005} 12/25/2021 16:32:02 - INFO - codeparrot_training - Step 20602: {'lr': 0.0003197666978409981, 'samples': 10548736, 'steps': 20602, 'batch_loss/train': 0.7575295437127352} 12/25/2021 16:32:16 - INFO - codeparrot_training - Step 20603: {'lr': 0.0003197515608107106, 'samples': 10549248, 'steps': 20603, 'batch_loss/train': 0.7532401327043772} 12/25/2021 16:32:27 - INFO - codeparrot_training - Step 20604: {'lr': 0.0003197364235031173, 'samples': 10549760, 'steps': 20604, 'batch_loss/train': 0.8963593859225512} 12/25/2021 16:32:37 - INFO - codeparrot_training - Step 20605: {'lr': 0.0003197212859182782, 'samples': 10550272, 'steps': 20605, 'batch_loss/train': 0.7371639804914594} 12/25/2021 16:32:48 - INFO - codeparrot_training - Step 20606: {'lr': 0.0003197061480562537, 'samples': 10550784, 'steps': 20606, 'batch_loss/train': 0.7332896168809384} 12/25/2021 16:33:00 - INFO - codeparrot_training - Step 20607: {'lr': 0.0003196910099171038, 'samples': 10551296, 'steps': 20607, 'batch_loss/train': 0.9883291558362544} 12/25/2021 16:33:11 - INFO - codeparrot_training - Step 20608: {'lr': 0.0003196758715008888, 'samples': 
10551808, 'steps': 20608, 'batch_loss/train': 0.7307457495480776} 12/25/2021 16:33:21 - INFO - codeparrot_training - Step 20609: {'lr': 0.0003196607328076689, 'samples': 10552320, 'steps': 20609, 'batch_loss/train': 0.970738154835999} 12/25/2021 16:33:33 - INFO - codeparrot_training - Step 20610: {'lr': 0.00031964559383750415, 'samples': 10552832, 'steps': 20610, 'batch_loss/train': 0.8723037531599402} 12/25/2021 16:33:44 - INFO - codeparrot_training - Step 20611: {'lr': 0.00031963045459045485, 'samples': 10553344, 'steps': 20611, 'batch_loss/train': 0.7446660525165498} 12/25/2021 16:33:54 - INFO - codeparrot_training - Step 20612: {'lr': 0.00031961531506658115, 'samples': 10553856, 'steps': 20612, 'batch_loss/train': 0.7311819004826248} 12/25/2021 16:34:08 - INFO - codeparrot_training - Step 20613: {'lr': 0.0003196001752659432, 'samples': 10554368, 'steps': 20613, 'batch_loss/train': 0.8566945847123861} 12/25/2021 16:34:19 - INFO - codeparrot_training - Step 20614: {'lr': 0.00031958503518860127, 'samples': 10554880, 'steps': 20614, 'batch_loss/train': 0.8342235721647739} 12/25/2021 16:34:29 - INFO - codeparrot_training - Step 20615: {'lr': 0.00031956989483461555, 'samples': 10555392, 'steps': 20615, 'batch_loss/train': 0.869383255019784} 12/25/2021 16:34:40 - INFO - codeparrot_training - Step 20616: {'lr': 0.0003195547542040461, 'samples': 10555904, 'steps': 20616, 'batch_loss/train': 0.7380676405737177} 12/25/2021 16:34:52 - INFO - codeparrot_training - Step 20617: {'lr': 0.00031953961329695335, 'samples': 10556416, 'steps': 20617, 'batch_loss/train': 0.8074232856743038} 12/25/2021 16:35:03 - INFO - codeparrot_training - Step 20618: {'lr': 0.00031952447211339727, 'samples': 10556928, 'steps': 20618, 'batch_loss/train': 0.7488932712003589} 12/25/2021 16:35:13 - INFO - codeparrot_training - Step 20619: {'lr': 0.0003195093306534381, 'samples': 10557440, 'steps': 20619, 'batch_loss/train': 0.8780489629134536} 12/25/2021 16:35:26 - INFO - codeparrot_training - Step 
20620: {'lr': 0.00031949418891713614, 'samples': 10557952, 'steps': 20620, 'batch_loss/train': 0.7625947082415223} 12/25/2021 16:35:36 - INFO - codeparrot_training - Step 20621: {'lr': 0.00031947904690455154, 'samples': 10558464, 'steps': 20621, 'batch_loss/train': 0.7547262394800782} 12/25/2021 16:35:47 - INFO - codeparrot_training - Step 20622: {'lr': 0.0003194639046157445, 'samples': 10558976, 'steps': 20622, 'batch_loss/train': 0.8064928143285215} 12/25/2021 16:36:01 - INFO - codeparrot_training - Step 20623: {'lr': 0.00031944876205077503, 'samples': 10559488, 'steps': 20623, 'batch_loss/train': 0.9042696407996118} 12/25/2021 16:36:12 - INFO - codeparrot_training - Step 20624: {'lr': 0.0003194336192097037, 'samples': 10560000, 'steps': 20624, 'batch_loss/train': 0.925789219327271} 12/25/2021 16:36:22 - INFO - codeparrot_training - Step 20625: {'lr': 0.00031941847609259045, 'samples': 10560512, 'steps': 20625, 'batch_loss/train': 1.0134448166936636} 12/25/2021 16:36:34 - INFO - codeparrot_training - Step 20626: {'lr': 0.0003194033326994955, 'samples': 10561024, 'steps': 20626, 'batch_loss/train': 0.8798077953979373} 12/25/2021 16:36:45 - INFO - codeparrot_training - Step 20627: {'lr': 0.0003193881890304792, 'samples': 10561536, 'steps': 20627, 'batch_loss/train': 0.8026040531694889} 12/25/2021 16:36:56 - INFO - codeparrot_training - Step 20628: {'lr': 0.0003193730450856016, 'samples': 10562048, 'steps': 20628, 'batch_loss/train': 0.813130778260529} 12/25/2021 16:37:06 - INFO - codeparrot_training - Step 20629: {'lr': 0.00031935790086492295, 'samples': 10562560, 'steps': 20629, 'batch_loss/train': 0.8221997563960031} 12/25/2021 16:37:19 - INFO - codeparrot_training - Step 20630: {'lr': 0.00031934275636850355, 'samples': 10563072, 'steps': 20630, 'batch_loss/train': 0.804894479457289} 12/25/2021 16:37:29 - INFO - codeparrot_training - Step 20631: {'lr': 0.0003193276115964035, 'samples': 10563584, 'steps': 20631, 'batch_loss/train': 0.825675338273868} 12/25/2021 
16:37:40 - INFO - codeparrot_training - Step 20632: {'lr': 0.000319312466548683, 'samples': 10564096, 'steps': 20632, 'batch_loss/train': 0.8452239846810699} 12/25/2021 16:37:54 - INFO - codeparrot_training - Step 20633: {'lr': 0.00031929732122540236, 'samples': 10564608, 'steps': 20633, 'batch_loss/train': 0.918884938582778} 12/25/2021 16:38:04 - INFO - codeparrot_training - Step 20634: {'lr': 0.0003192821756266217, 'samples': 10565120, 'steps': 20634, 'batch_loss/train': 0.8734355717897415} 12/25/2021 16:38:15 - INFO - codeparrot_training - Step 20635: {'lr': 0.0003192670297524013, 'samples': 10565632, 'steps': 20635, 'batch_loss/train': 0.7616549469530582} 12/25/2021 16:38:27 - INFO - codeparrot_training - Step 20636: {'lr': 0.0003192518836028013, 'samples': 10566144, 'steps': 20636, 'batch_loss/train': 0.7644224606920034} 12/25/2021 16:38:38 - INFO - codeparrot_training - Step 20637: {'lr': 0.00031923673717788203, 'samples': 10566656, 'steps': 20637, 'batch_loss/train': 0.752143525518477} 12/25/2021 16:38:48 - INFO - codeparrot_training - Step 20638: {'lr': 0.0003192215904777036, 'samples': 10567168, 'steps': 20638, 'batch_loss/train': 0.8469934435561299} 12/25/2021 16:38:59 - INFO - codeparrot_training - Step 20639: {'lr': 0.0003192064435023262, 'samples': 10567680, 'steps': 20639, 'batch_loss/train': 0.8265793854370713} 12/25/2021 16:39:13 - INFO - codeparrot_training - Step 20640: {'lr': 0.0003191912962518102, 'samples': 10568192, 'steps': 20640, 'batch_loss/train': 0.7961061343085021} 12/25/2021 16:39:23 - INFO - codeparrot_training - Step 20641: {'lr': 0.0003191761487262157, 'samples': 10568704, 'steps': 20641, 'batch_loss/train': 0.8682731790468097} 12/25/2021 16:39:34 - INFO - codeparrot_training - Step 20642: {'lr': 0.0003191610009256029, 'samples': 10569216, 'steps': 20642, 'batch_loss/train': 0.8753026528283954} 12/25/2021 16:39:46 - INFO - codeparrot_training - Step 20643: {'lr': 0.0003191458528500322, 'samples': 10569728, 'steps': 20643, 
'batch_loss/train': 0.761408681049943} 12/25/2021 16:39:57 - INFO - codeparrot_training - Step 20644: {'lr': 0.0003191307044995636, 'samples': 10570240, 'steps': 20644, 'batch_loss/train': 0.8255800353363156} 12/25/2021 16:40:08 - INFO - codeparrot_training - Step 20645: {'lr': 0.00031911555587425746, 'samples': 10570752, 'steps': 20645, 'batch_loss/train': 0.7430255236104131} 12/25/2021 16:40:20 - INFO - codeparrot_training - Step 20646: {'lr': 0.00031910040697417393, 'samples': 10571264, 'steps': 20646, 'batch_loss/train': 0.6485860944376327} 12/25/2021 16:40:30 - INFO - codeparrot_training - Step 20647: {'lr': 0.00031908525779937325, 'samples': 10571776, 'steps': 20647, 'batch_loss/train': 1.6448128730989993} 12/25/2021 16:40:41 - INFO - codeparrot_training - Step 20648: {'lr': 0.0003190701083499158, 'samples': 10572288, 'steps': 20648, 'batch_loss/train': 0.6902328971773386} 12/25/2021 16:40:52 - INFO - codeparrot_training - Step 20649: {'lr': 0.0003190549586258615, 'samples': 10572800, 'steps': 20649, 'batch_loss/train': 0.6543174888356589} 12/25/2021 16:41:04 - INFO - codeparrot_training - Step 20650: {'lr': 0.0003190398086272709, 'samples': 10573312, 'steps': 20650, 'batch_loss/train': 0.7772560762241483} 12/25/2021 16:41:14 - INFO - codeparrot_training - Step 20651: {'lr': 0.00031902465835420405, 'samples': 10573824, 'steps': 20651, 'batch_loss/train': 0.7957794880494475} 12/25/2021 16:41:25 - INFO - codeparrot_training - Step 20652: {'lr': 0.00031900950780672124, 'samples': 10574336, 'steps': 20652, 'batch_loss/train': 0.5830813047941774} 12/25/2021 16:41:39 - INFO - codeparrot_training - Step 20653: {'lr': 0.0003189943569848826, 'samples': 10574848, 'steps': 20653, 'batch_loss/train': 0.8065604311414063} 12/25/2021 16:41:50 - INFO - codeparrot_training - Step 20654: {'lr': 0.0003189792058887486, 'samples': 10575360, 'steps': 20654, 'batch_loss/train': 1.03493835311383} 12/25/2021 16:42:00 - INFO - codeparrot_training - Step 20655: {'lr': 
0.00031896405451837915, 'samples': 10575872, 'steps': 20655, 'batch_loss/train': 0.6083125388249755} 12/25/2021 16:42:11 - INFO - codeparrot_training - Step 20656: {'lr': 0.0003189489028738348, 'samples': 10576384, 'steps': 20656, 'batch_loss/train': 0.791523321531713} 12/25/2021 16:42:23 - INFO - codeparrot_training - Step 20657: {'lr': 0.00031893375095517557, 'samples': 10576896, 'steps': 20657, 'batch_loss/train': 1.041588267777115} 12/25/2021 16:42:34 - INFO - codeparrot_training - Step 20658: {'lr': 0.00031891859876246177, 'samples': 10577408, 'steps': 20658, 'batch_loss/train': 0.8681944254785776} 12/25/2021 16:42:44 - INFO - codeparrot_training - Step 20659: {'lr': 0.00031890344629575366, 'samples': 10577920, 'steps': 20659, 'batch_loss/train': 0.790959483012557} 12/25/2021 16:42:57 - INFO - codeparrot_training - Step 20660: {'lr': 0.00031888829355511145, 'samples': 10578432, 'steps': 20660, 'batch_loss/train': 0.8349452232941985} 12/25/2021 16:43:07 - INFO - codeparrot_training - Step 20661: {'lr': 0.0003188731405405954, 'samples': 10578944, 'steps': 20661, 'batch_loss/train': 0.7116714306175709} 12/25/2021 16:43:18 - INFO - codeparrot_training - Step 20662: {'lr': 0.0003188579872522658, 'samples': 10579456, 'steps': 20662, 'batch_loss/train': 0.5776403620839119} 12/25/2021 16:43:32 - INFO - codeparrot_training - Step 20663: {'lr': 0.0003188428336901828, 'samples': 10579968, 'steps': 20663, 'batch_loss/train': 0.811422792263329} 12/25/2021 16:43:43 - INFO - codeparrot_training - Step 20664: {'lr': 0.0003188276798544067, 'samples': 10580480, 'steps': 20664, 'batch_loss/train': 0.8091708039864898} 12/25/2021 16:43:53 - INFO - codeparrot_training - Step 20665: {'lr': 0.0003188125257449977, 'samples': 10580992, 'steps': 20665, 'batch_loss/train': 1.0731214019469917} 12/25/2021 16:44:04 - INFO - codeparrot_training - Step 20666: {'lr': 0.00031879737136201607, 'samples': 10581504, 'steps': 20666, 'batch_loss/train': 0.6142965890467167} 12/25/2021 16:44:16 - INFO 
- codeparrot_training - Step 20667: {'lr': 0.0003187822167055222, 'samples': 10582016, 'steps': 20667, 'batch_loss/train': 0.8428884502500296} 12/25/2021 16:44:27 - INFO - codeparrot_training - Step 20668: {'lr': 0.000318767061775576, 'samples': 10582528, 'steps': 20668, 'batch_loss/train': 0.8274567183107138} 12/25/2021 16:44:37 - INFO - codeparrot_training - Step 20669: {'lr': 0.0003187519065722381, 'samples': 10583040, 'steps': 20669, 'batch_loss/train': 0.686627674032934} 12/25/2021 16:44:51 - INFO - codeparrot_training - Step 20670: {'lr': 0.0003187367510955685, 'samples': 10583552, 'steps': 20670, 'batch_loss/train': 0.7890185341238976} 12/25/2021 16:45:02 - INFO - codeparrot_training - Step 20671: {'lr': 0.00031872159534562756, 'samples': 10584064, 'steps': 20671, 'batch_loss/train': 0.8086393484845757} 12/25/2021 16:45:12 - INFO - codeparrot_training - Step 20672: {'lr': 0.0003187064393224755, 'samples': 10584576, 'steps': 20672, 'batch_loss/train': 0.7817170217167586} 12/25/2021 16:45:23 - INFO - codeparrot_training - Step 20673: {'lr': 0.00031869128302617256, 'samples': 10585088, 'steps': 20673, 'batch_loss/train': 0.8237396664917469} 12/25/2021 16:45:35 - INFO - codeparrot_training - Step 20674: {'lr': 0.000318676126456779, 'samples': 10585600, 'steps': 20674, 'batch_loss/train': 0.7067085774615407} 12/25/2021 16:45:46 - INFO - codeparrot_training - Step 20675: {'lr': 0.00031866096961435513, 'samples': 10586112, 'steps': 20675, 'batch_loss/train': 0.8464892990887165} 12/25/2021 16:45:56 - INFO - codeparrot_training - Step 20676: {'lr': 0.0003186458124989612, 'samples': 10586624, 'steps': 20676, 'batch_loss/train': 0.7868736158125103} 12/25/2021 16:46:08 - INFO - codeparrot_training - Step 20677: {'lr': 0.00031863065511065743, 'samples': 10587136, 'steps': 20677, 'batch_loss/train': 0.7068403544835746} 12/25/2021 16:46:19 - INFO - codeparrot_training - Step 20678: {'lr': 0.000318615497449504, 'samples': 10587648, 'steps': 20678, 'batch_loss/train': 
0.817637505941093} 12/25/2021 16:46:30 - INFO - codeparrot_training - Step 20679: {'lr': 0.00031860033951556133, 'samples': 10588160, 'steps': 20679, 'batch_loss/train': 0.7847480960190296} 12/25/2021 16:46:44 - INFO - codeparrot_training - Step 20680: {'lr': 0.0003185851813088897, 'samples': 10588672, 'steps': 20680, 'batch_loss/train': 0.713567822240293} 12/25/2021 16:46:54 - INFO - codeparrot_training - Step 20681: {'lr': 0.0003185700228295491, 'samples': 10589184, 'steps': 20681, 'batch_loss/train': 0.9234573105350137} 12/25/2021 16:47:05 - INFO - codeparrot_training - Step 20682: {'lr': 0.0003185548640776, 'samples': 10589696, 'steps': 20682, 'batch_loss/train': 0.7584499786607921} 12/25/2021 16:47:17 - INFO - codeparrot_training - Step 20683: {'lr': 0.0003185397050531028, 'samples': 10590208, 'steps': 20683, 'batch_loss/train': 0.8107380783185363} 12/25/2021 16:47:27 - INFO - codeparrot_training - Step 20684: {'lr': 0.0003185245457561174, 'samples': 10590720, 'steps': 20684, 'batch_loss/train': 0.7769383587874472} 12/25/2021 16:47:38 - INFO - codeparrot_training - Step 20685: {'lr': 0.00031850938618670455, 'samples': 10591232, 'steps': 20685, 'batch_loss/train': 0.763155426364392} 12/25/2021 16:47:49 - INFO - codeparrot_training - Step 20686: {'lr': 0.0003184942263449241, 'samples': 10591744, 'steps': 20686, 'batch_loss/train': 0.6882833312265575} 12/25/2021 16:48:01 - INFO - codeparrot_training - Step 20687: {'lr': 0.00031847906623083645, 'samples': 10592256, 'steps': 20687, 'batch_loss/train': 0.7789269587956369} 12/25/2021 16:48:11 - INFO - codeparrot_training - Step 20688: {'lr': 0.0003184639058445019, 'samples': 10592768, 'steps': 20688, 'batch_loss/train': 0.7914813281968236} 12/25/2021 16:48:22 - INFO - codeparrot_training - Step 20689: {'lr': 0.0003184487451859808, 'samples': 10593280, 'steps': 20689, 'batch_loss/train': 0.6919280830770731} 12/25/2021 16:48:34 - INFO - codeparrot_training - Step 20690: {'lr': 0.0003184335842553332, 'samples': 
10593792, 'steps': 20690, 'batch_loss/train': 1.1090209502726793} 12/25/2021 16:48:45 - INFO - codeparrot_training - Step 20691: {'lr': 0.0003184184230526196, 'samples': 10594304, 'steps': 20691, 'batch_loss/train': 0.9094926444813609} 12/25/2021 16:48:56 - INFO - codeparrot_training - Step 20692: {'lr': 0.0003184032615779001, 'samples': 10594816, 'steps': 20692, 'batch_loss/train': 0.8835597392171621} 12/25/2021 16:49:06 - INFO - codeparrot_training - Step 20693: {'lr': 0.0003183880998312352, 'samples': 10595328, 'steps': 20693, 'batch_loss/train': 1.0774854874471202} 12/25/2021 16:49:20 - INFO - codeparrot_training - Step 20694: {'lr': 0.0003183729378126849, 'samples': 10595840, 'steps': 20694, 'batch_loss/train': 0.8196503198705614} 12/25/2021 16:49:31 - INFO - codeparrot_training - Step 20695: {'lr': 0.0003183577755223096, 'samples': 10596352, 'steps': 20695, 'batch_loss/train': 0.9864908661693335} 12/25/2021 16:49:41 - INFO - codeparrot_training - Step 20696: {'lr': 0.00031834261296016976, 'samples': 10596864, 'steps': 20696, 'batch_loss/train': 0.7201269021024927} 12/25/2021 16:49:54 - INFO - codeparrot_training - Step 20697: {'lr': 0.0003183274501263253, 'samples': 10597376, 'steps': 20697, 'batch_loss/train': 0.7760432166978717} 12/25/2021 16:50:04 - INFO - codeparrot_training - Step 20698: {'lr': 0.00031831228702083685, 'samples': 10597888, 'steps': 20698, 'batch_loss/train': 0.7603639988228679} 12/25/2021 16:50:15 - INFO - codeparrot_training - Step 20699: {'lr': 0.0003182971236437645, 'samples': 10598400, 'steps': 20699, 'batch_loss/train': 0.7074852772057056} 12/25/2021 16:50:28 - INFO - codeparrot_training - Step 20700: {'lr': 0.0003182819599951685, 'samples': 10598912, 'steps': 20700, 'batch_loss/train': 0.8325720094144344} 12/25/2021 16:50:39 - INFO - codeparrot_training - Step 20701: {'lr': 0.00031826679607510925, 'samples': 10599424, 'steps': 20701, 'batch_loss/train': 0.8886342188343406} 12/25/2021 16:50:50 - INFO - codeparrot_training - Step 
20702: {'lr': 0.00031825163188364716, 'samples': 10599936, 'steps': 20702, 'batch_loss/train': 0.8462010622024536} 12/25/2021 16:51:02 - INFO - codeparrot_training - Step 20703: {'lr': 0.0003182364674208421, 'samples': 10600448, 'steps': 20703, 'batch_loss/train': 0.7976519097574055} 12/25/2021 16:51:12 - INFO - codeparrot_training - Step 20704: {'lr': 0.00031822130268675477, 'samples': 10600960, 'steps': 20704, 'batch_loss/train': 0.8081730175763369} 12/25/2021 16:51:23 - INFO - codeparrot_training - Step 20705: {'lr': 0.00031820613768144526, 'samples': 10601472, 'steps': 20705, 'batch_loss/train': 0.738548468798399} 12/25/2021 16:51:34 - INFO - codeparrot_training - Step 20706: {'lr': 0.00031819097240497386, 'samples': 10601984, 'steps': 20706, 'batch_loss/train': 0.7792332827812061} 12/25/2021 16:51:46 - INFO - codeparrot_training - Step 20707: {'lr': 0.0003181758068574009, 'samples': 10602496, 'steps': 20707, 'batch_loss/train': 0.760855782602448} 12/25/2021 16:51:56 - INFO - codeparrot_training - Step 20708: {'lr': 0.0003181606410387867, 'samples': 10603008, 'steps': 20708, 'batch_loss/train': 0.7791023487225175} 12/25/2021 16:52:07 - INFO - codeparrot_training - Step 20709: {'lr': 0.00031814547494919156, 'samples': 10603520, 'steps': 20709, 'batch_loss/train': 0.769526875577867} 12/25/2021 16:52:21 - INFO - codeparrot_training - Step 20710: {'lr': 0.00031813030858867563, 'samples': 10604032, 'steps': 20710, 'batch_loss/train': 0.773581306450069} 12/25/2021 16:52:31 - INFO - codeparrot_training - Step 20711: {'lr': 0.00031811514195729954, 'samples': 10604544, 'steps': 20711, 'batch_loss/train': 0.7608402725309134} 12/25/2021 16:52:42 - INFO - codeparrot_training - Step 20712: {'lr': 0.0003180999750551232, 'samples': 10605056, 'steps': 20712, 'batch_loss/train': 0.8730061212554574} 12/25/2021 16:52:54 - INFO - codeparrot_training - Step 20713: {'lr': 0.00031808480788220704, 'samples': 10605568, 'steps': 20713, 'batch_loss/train': 0.7109370599500835} 12/25/2021 
16:53:04 - INFO - codeparrot_training - Step 20714: {'lr': 0.00031806964043861147, 'samples': 10606080, 'steps': 20714, 'batch_loss/train': 0.8079652721062303} 12/25/2021 16:53:15 - INFO - codeparrot_training - Step 20715: {'lr': 0.0003180544727243967, 'samples': 10606592, 'steps': 20715, 'batch_loss/train': 0.814089993480593} 12/25/2021 16:53:26 - INFO - codeparrot_training - Step 20716: {'lr': 0.00031803930473962307, 'samples': 10607104, 'steps': 20716, 'batch_loss/train': 0.7888765800744295} 12/25/2021 16:53:38 - INFO - codeparrot_training - Step 20717: {'lr': 0.0003180241364843508, 'samples': 10607616, 'steps': 20717, 'batch_loss/train': 0.7909697415307164} 12/25/2021 16:53:49 - INFO - codeparrot_training - Step 20718: {'lr': 0.00031800896795864027, 'samples': 10608128, 'steps': 20718, 'batch_loss/train': 0.758084581233561} 12/25/2021 16:54:00 - INFO - codeparrot_training - Step 20719: {'lr': 0.0003179937991625517, 'samples': 10608640, 'steps': 20719, 'batch_loss/train': 0.728199768345803} 12/25/2021 16:54:13 - INFO - codeparrot_training - Step 20720: {'lr': 0.00031797863009614557, 'samples': 10609152, 'steps': 20720, 'batch_loss/train': 0.7732649832032621} 12/25/2021 16:54:24 - INFO - codeparrot_training - Step 20721: {'lr': 0.000317963460759482, 'samples': 10609664, 'steps': 20721, 'batch_loss/train': 0.7149091551546007} 12/25/2021 16:54:35 - INFO - codeparrot_training - Step 20722: {'lr': 0.0003179482911526214, 'samples': 10610176, 'steps': 20722, 'batch_loss/train': 0.8579126200638711} 12/25/2021 16:54:47 - INFO - codeparrot_training - Step 20723: {'lr': 0.000317933121275624, 'samples': 10610688, 'steps': 20723, 'batch_loss/train': 0.6672708485275507} 12/25/2021 16:54:58 - INFO - codeparrot_training - Step 20724: {'lr': 0.00031791795112855024, 'samples': 10611200, 'steps': 20724, 'batch_loss/train': 0.7608506786637008} 12/25/2021 16:55:09 - INFO - codeparrot_training - Step 20725: {'lr': 0.00031790278071146033, 'samples': 10611712, 'steps': 20725, 
'batch_loss/train': 0.8110561585053802} 12/25/2021 16:55:19 - INFO - codeparrot_training - Step 20726: {'lr': 0.0003178876100244146, 'samples': 10612224, 'steps': 20726, 'batch_loss/train': 1.329759024316445} 12/25/2021 16:55:34 - INFO - codeparrot_training - Step 20727: {'lr': 0.0003178724390674733, 'samples': 10612736, 'steps': 20727, 'batch_loss/train': 0.860953951254487} 12/25/2021 16:55:44 - INFO - codeparrot_training - Step 20728: {'lr': 0.00031785726784069694, 'samples': 10613248, 'steps': 20728, 'batch_loss/train': 0.7313082884647883} 12/25/2021 16:55:55 - INFO - codeparrot_training - Step 20729: {'lr': 0.0003178420963441456, 'samples': 10613760, 'steps': 20729, 'batch_loss/train': 1.038029219955206} 12/25/2021 16:56:07 - INFO - codeparrot_training - Step 20730: {'lr': 0.00031782692457787975, 'samples': 10614272, 'steps': 20730, 'batch_loss/train': 0.8768211808055639} 12/25/2021 16:56:18 - INFO - codeparrot_training - Step 20731: {'lr': 0.00031781175254195967, 'samples': 10614784, 'steps': 20731, 'batch_loss/train': 0.7842903640121222} 12/25/2021 16:56:28 - INFO - codeparrot_training - Step 20732: {'lr': 0.0003177965802364456, 'samples': 10615296, 'steps': 20732, 'batch_loss/train': 0.7680251998826861} 12/25/2021 16:56:40 - INFO - codeparrot_training - Step 20733: {'lr': 0.00031778140766139796, 'samples': 10615808, 'steps': 20733, 'batch_loss/train': 0.7894617202691734} 12/25/2021 16:56:51 - INFO - codeparrot_training - Step 20734: {'lr': 0.0003177662348168771, 'samples': 10616320, 'steps': 20734, 'batch_loss/train': 0.8842875505797565} 12/25/2021 16:57:02 - INFO - codeparrot_training - Step 20735: {'lr': 0.0003177510617029432, 'samples': 10616832, 'steps': 20735, 'batch_loss/train': 0.8011470939964056} 12/25/2021 16:57:12 - INFO - codeparrot_training - Step 20736: {'lr': 0.0003177358883196566, 'samples': 10617344, 'steps': 20736, 'batch_loss/train': 0.7516027959063649} 12/25/2021 16:57:24 - INFO - codeparrot_training - Step 20737: {'lr': 
0.0003177207146670778, 'samples': 10617856, 'steps': 20737, 'batch_loss/train': 0.7920179106295109} 12/25/2021 16:57:35 - INFO - codeparrot_training - Step 20738: {'lr': 0.00031770554074526697, 'samples': 10618368, 'steps': 20738, 'batch_loss/train': 0.7075320621952415} 12/25/2021 16:57:46 - INFO - codeparrot_training - Step 20739: {'lr': 0.0003176903665542845, 'samples': 10618880, 'steps': 20739, 'batch_loss/train': 0.7524078763090074} 12/25/2021 16:58:00 - INFO - codeparrot_training - Step 20740: {'lr': 0.00031767519209419066, 'samples': 10619392, 'steps': 20740, 'batch_loss/train': 0.8155276523903012} 12/25/2021 16:58:10 - INFO - codeparrot_training - Step 20741: {'lr': 0.00031766001736504586, 'samples': 10619904, 'steps': 20741, 'batch_loss/train': 0.793404511641711} 12/25/2021 16:58:21 - INFO - codeparrot_training - Step 20742: {'lr': 0.0003176448423669103, 'samples': 10620416, 'steps': 20742, 'batch_loss/train': 0.7642731992527843} 12/25/2021 16:58:33 - INFO - codeparrot_training - Step 20743: {'lr': 0.00031762966709984443, 'samples': 10620928, 'steps': 20743, 'batch_loss/train': 0.7233245042152703} 12/25/2021 16:58:44 - INFO - codeparrot_training - Step 20744: {'lr': 0.00031761449156390853, 'samples': 10621440, 'steps': 20744, 'batch_loss/train': 0.8231580378487706} 12/25/2021 16:58:54 - INFO - codeparrot_training - Step 20745: {'lr': 0.0003175993157591629, 'samples': 10621952, 'steps': 20745, 'batch_loss/train': 0.860473798122257} 12/25/2021 16:59:06 - INFO - codeparrot_training - Step 20746: {'lr': 0.00031758413968566797, 'samples': 10622464, 'steps': 20746, 'batch_loss/train': 0.8342337096109986} 12/25/2021 16:59:17 - INFO - codeparrot_training - Step 20747: {'lr': 0.000317568963343484, 'samples': 10622976, 'steps': 20747, 'batch_loss/train': 0.7779009444639087} 12/25/2021 16:59:28 - INFO - codeparrot_training - Step 20748: {'lr': 0.00031755378673267145, 'samples': 10623488, 'steps': 20748, 'batch_loss/train': 0.7432471225038171} 12/25/2021 16:59:38 - 
INFO - codeparrot_training - Step 20749: {'lr': 0.00031753860985329034, 'samples': 10624000, 'steps': 20749, 'batch_loss/train': 0.8598503423854709} 12/25/2021 16:59:52 - INFO - codeparrot_training - Step 20750: {'lr': 0.0003175234327054014, 'samples': 10624512, 'steps': 20750, 'batch_loss/train': 0.7697476278990507} 12/25/2021 17:00:03 - INFO - codeparrot_training - Step 20751: {'lr': 0.00031750825528906466, 'samples': 10625024, 'steps': 20751, 'batch_loss/train': 1.0041792187839746} 12/25/2021 17:00:13 - INFO - codeparrot_training - Step 20752: {'lr': 0.0003174930776043407, 'samples': 10625536, 'steps': 20752, 'batch_loss/train': 0.8004184560850263} 12/25/2021 17:00:26 - INFO - codeparrot_training - Step 20753: {'lr': 0.0003174778996512897, 'samples': 10626048, 'steps': 20753, 'batch_loss/train': 0.837042679078877} 12/25/2021 17:00:36 - INFO - codeparrot_training - Step 20754: {'lr': 0.0003174627214299721, 'samples': 10626560, 'steps': 20754, 'batch_loss/train': 0.8353864662349224} 12/25/2021 17:00:47 - INFO - codeparrot_training - Step 20755: {'lr': 0.000317447542940448, 'samples': 10627072, 'steps': 20755, 'batch_loss/train': 0.7714295396581292} 12/25/2021 17:01:01 - INFO - codeparrot_training - Step 20756: {'lr': 0.0003174323641827781, 'samples': 10627584, 'steps': 20756, 'batch_loss/train': 0.7685087155550718} 12/25/2021 17:01:12 - INFO - codeparrot_training - Step 20757: {'lr': 0.0003174171851570226, 'samples': 10628096, 'steps': 20757, 'batch_loss/train': 0.9391792493406683} 12/25/2021 17:01:22 - INFO - codeparrot_training - Step 20758: {'lr': 0.0003174020058632418, 'samples': 10628608, 'steps': 20758, 'batch_loss/train': 0.755695883417502} 12/25/2021 17:01:33 - INFO - codeparrot_training - Step 20759: {'lr': 0.00031738682630149604, 'samples': 10629120, 'steps': 20759, 'batch_loss/train': 0.7727309260517359} 12/25/2021 17:01:45 - INFO - codeparrot_training - Step 20760: {'lr': 0.00031737164647184567, 'samples': 10629632, 'steps': 20760, 'batch_loss/train': 
0.8447295241057873} 12/25/2021 17:01:56 - INFO - codeparrot_training - Step 20761: {'lr': 0.0003173564663743512, 'samples': 10630144, 'steps': 20761, 'batch_loss/train': 1.0900024781003594} 12/25/2021 17:02:06 - INFO - codeparrot_training - Step 20762: {'lr': 0.00031734128600907277, 'samples': 10630656, 'steps': 20762, 'batch_loss/train': 0.8289775550365448} 12/25/2021 17:02:18 - INFO - codeparrot_training - Step 20763: {'lr': 0.0003173261053760708, 'samples': 10631168, 'steps': 20763, 'batch_loss/train': 0.9191285315901041} 12/25/2021 17:02:29 - INFO - codeparrot_training - Step 20764: {'lr': 0.0003173109244754057, 'samples': 10631680, 'steps': 20764, 'batch_loss/train': 0.7902698861435056} 12/25/2021 17:02:40 - INFO - codeparrot_training - Step 20765: {'lr': 0.00031729574330713773, 'samples': 10632192, 'steps': 20765, 'batch_loss/train': 0.7441295238677412} 12/25/2021 17:02:50 - INFO - codeparrot_training - Step 20766: {'lr': 0.0003172805618713273, 'samples': 10632704, 'steps': 20766, 'batch_loss/train': 0.8334908057004213} 12/25/2021 17:03:04 - INFO - codeparrot_training - Step 20767: {'lr': 0.00031726538016803486, 'samples': 10633216, 'steps': 20767, 'batch_loss/train': 0.9332935558632016} 12/25/2021 17:03:15 - INFO - codeparrot_training - Step 20768: {'lr': 0.0003172501981973205, 'samples': 10633728, 'steps': 20768, 'batch_loss/train': 0.8238036325201392} 12/25/2021 17:03:25 - INFO - codeparrot_training - Step 20769: {'lr': 0.00031723501595924485, 'samples': 10634240, 'steps': 20769, 'batch_loss/train': 0.6984069366008043} 12/25/2021 17:03:37 - INFO - codeparrot_training - Step 20770: {'lr': 0.0003172198334538682, 'samples': 10634752, 'steps': 20770, 'batch_loss/train': 0.8662050142884254} 12/25/2021 17:03:48 - INFO - codeparrot_training - Step 20771: {'lr': 0.0003172046506812508, 'samples': 10635264, 'steps': 20771, 'batch_loss/train': 0.6147570488974452} 12/25/2021 17:03:59 - INFO - codeparrot_training - Step 20772: {'lr': 0.00031718946764145307, 'samples': 
10635776, 'steps': 20772, 'batch_loss/train': 0.7748419139534235} 12/25/2021 17:04:11 - INFO - codeparrot_training - Step 20773: {'lr': 0.00031717428433453545, 'samples': 10636288, 'steps': 20773, 'batch_loss/train': 0.8053344544023275} 12/25/2021 17:04:21 - INFO - codeparrot_training - Step 20774: {'lr': 0.0003171591007605582, 'samples': 10636800, 'steps': 20774, 'batch_loss/train': 0.8224380416795611} 12/25/2021 17:04:32 - INFO - codeparrot_training - Step 20775: {'lr': 0.00031714391691958175, 'samples': 10637312, 'steps': 20775, 'batch_loss/train': 0.8641496151685715} 12/25/2021 17:04:44 - INFO - codeparrot_training - Step 20776: {'lr': 0.00031712873281166645, 'samples': 10637824, 'steps': 20776, 'batch_loss/train': 0.6995805660262704} 12/25/2021 17:04:55 - INFO - codeparrot_training - Step 20777: {'lr': 0.00031711354843687263, 'samples': 10638336, 'steps': 20777, 'batch_loss/train': 0.8191759753972292} 12/25/2021 17:05:05 - INFO - codeparrot_training - Step 20778: {'lr': 0.00031709836379526066, 'samples': 10638848, 'steps': 20778, 'batch_loss/train': 0.8239962095394731} 12/25/2021 17:05:16 - INFO - codeparrot_training - Step 20779: {'lr': 0.00031708317888689094, 'samples': 10639360, 'steps': 20779, 'batch_loss/train': 0.8137095086276531} 12/25/2021 17:05:30 - INFO - codeparrot_training - Step 20780: {'lr': 0.00031706799371182385, 'samples': 10639872, 'steps': 20780, 'batch_loss/train': 0.8606041113380343} 12/25/2021 17:05:40 - INFO - codeparrot_training - Step 20781: {'lr': 0.00031705280827011974, 'samples': 10640384, 'steps': 20781, 'batch_loss/train': 0.8264648579061031} 12/25/2021 17:05:51 - INFO - codeparrot_training - Step 20782: {'lr': 0.00031703762256183893, 'samples': 10640896, 'steps': 20782, 'batch_loss/train': 0.9513144930824637} 12/25/2021 17:06:03 - INFO - codeparrot_training - Step 20783: {'lr': 0.00031702243658704194, 'samples': 10641408, 'steps': 20783, 'batch_loss/train': 0.7114966372027993} 12/25/2021 17:06:14 - INFO - codeparrot_training - 
Step 20784: {'lr': 0.000317007250345789, 'samples': 10641920, 'steps': 20784, 'batch_loss/train': 0.7030325387604535} 12/25/2021 17:06:25 - INFO - codeparrot_training - Step 20785: {'lr': 0.0003169920638381405, 'samples': 10642432, 'steps': 20785, 'batch_loss/train': 0.7733098031021655} 12/25/2021 17:06:38 - INFO - codeparrot_training - Step 20786: {'lr': 0.0003169768770641569, 'samples': 10642944, 'steps': 20786, 'batch_loss/train': 0.7659326707944274} 12/25/2021 17:06:49 - INFO - codeparrot_training - Step 20787: {'lr': 0.0003169616900238985, 'samples': 10643456, 'steps': 20787, 'batch_loss/train': 0.8173147505149245} 12/25/2021 17:07:00 - INFO - codeparrot_training - Step 20788: {'lr': 0.0003169465027174257, 'samples': 10643968, 'steps': 20788, 'batch_loss/train': 0.8032829789444804} 12/25/2021 17:07:10 - INFO - codeparrot_training - Step 20789: {'lr': 0.00031693131514479893, 'samples': 10644480, 'steps': 20789, 'batch_loss/train': 0.8016627752222121} 12/25/2021 17:07:22 - INFO - codeparrot_training - Step 20790: {'lr': 0.0003169161273060785, 'samples': 10644992, 'steps': 20790, 'batch_loss/train': 0.7978215091861784} 12/25/2021 17:07:33 - INFO - codeparrot_training - Step 20791: {'lr': 0.0003169009392013248, 'samples': 10645504, 'steps': 20791, 'batch_loss/train': 0.788269255310297} 12/25/2021 17:07:44 - INFO - codeparrot_training - Step 20792: {'lr': 0.00031688575083059826, 'samples': 10646016, 'steps': 20792, 'batch_loss/train': 0.7503024787874892} 12/25/2021 17:07:56 - INFO - codeparrot_training - Step 20793: {'lr': 0.00031687056219395916, 'samples': 10646528, 'steps': 20793, 'batch_loss/train': 0.8708896598545834} 12/25/2021 17:08:06 - INFO - codeparrot_training - Step 20794: {'lr': 0.00031685537329146794, 'samples': 10647040, 'steps': 20794, 'batch_loss/train': 0.8304088776931167} 12/25/2021 17:08:17 - INFO - codeparrot_training - Step 20795: {'lr': 0.00031684018412318505, 'samples': 10647552, 'steps': 20795, 'batch_loss/train': 0.6744407396763563} 
12/25/2021 17:08:31 - INFO - codeparrot_training - Step 20796: {'lr': 0.00031682499468917093, 'samples': 10648064, 'steps': 20796, 'batch_loss/train': 0.7722097707446665} 12/25/2021 17:08:41 - INFO - codeparrot_training - Step 20797: {'lr': 0.00031680980498948564, 'samples': 10648576, 'steps': 20797, 'batch_loss/train': 0.7105117649771273} 12/25/2021 17:08:52 - INFO - codeparrot_training - Step 20798: {'lr': 0.00031679461502419, 'samples': 10649088, 'steps': 20798, 'batch_loss/train': 0.811378619633615} 12/25/2021 17:09:04 - INFO - codeparrot_training - Step 20799: {'lr': 0.0003167794247933441, 'samples': 10649600, 'steps': 20799, 'batch_loss/train': 0.8201248552650213} 12/25/2021 17:09:15 - INFO - codeparrot_training - Step 20800: {'lr': 0.0003167642342970084, 'samples': 10650112, 'steps': 20800, 'batch_loss/train': 0.6921199208591133} 12/25/2021 17:09:25 - INFO - codeparrot_training - Step 20801: {'lr': 0.0003167490435352433, 'samples': 10650624, 'steps': 20801, 'batch_loss/train': 0.7590450485004112} 12/25/2021 17:09:36 - INFO - codeparrot_training - Step 20802: {'lr': 0.00031673385250810933, 'samples': 10651136, 'steps': 20802, 'batch_loss/train': 0.7786946799606085} 12/25/2021 17:09:50 - INFO - codeparrot_training - Step 20803: {'lr': 0.00031671866121566665, 'samples': 10651648, 'steps': 20803, 'batch_loss/train': 0.8807614231482148} 12/25/2021 17:10:00 - INFO - codeparrot_training - Step 20804: {'lr': 0.0003167034696579758, 'samples': 10652160, 'steps': 20804, 'batch_loss/train': 0.7675597933121026} 12/25/2021 17:10:11 - INFO - codeparrot_training - Step 20805: {'lr': 0.00031668827783509713, 'samples': 10652672, 'steps': 20805, 'batch_loss/train': 0.7968179788440466} 12/25/2021 17:10:23 - INFO - codeparrot_training - Step 20806: {'lr': 0.000316673085747091, 'samples': 10653184, 'steps': 20806, 'batch_loss/train': 0.6989122380036861} 12/25/2021 17:10:34 - INFO - codeparrot_training - Step 20807: {'lr': 0.00031665789339401797, 'samples': 10653696, 'steps': 
20807, 'batch_loss/train': 0.7158207204192877} 12/25/2021 17:10:44 - INFO - codeparrot_training - Step 20808: {'lr': 0.0003166427007759382, 'samples': 10654208, 'steps': 20808, 'batch_loss/train': 0.7700895592570305} 12/25/2021 17:10:57 - INFO - codeparrot_training - Step 20809: {'lr': 0.00031662750789291234, 'samples': 10654720, 'steps': 20809, 'batch_loss/train': 0.7673951555043459} 12/25/2021 17:11:08 - INFO - codeparrot_training - Step 20810: {'lr': 0.00031661231474500054, 'samples': 10655232, 'steps': 20810, 'batch_loss/train': 0.5803633122704923} 12/25/2021 17:11:18 - INFO - codeparrot_training - Step 20811: {'lr': 0.0003165971213322634, 'samples': 10655744, 'steps': 20811, 'batch_loss/train': 0.7834602235816419} 12/25/2021 17:11:29 - INFO - codeparrot_training - Step 20812: {'lr': 0.00031658192765476124, 'samples': 10656256, 'steps': 20812, 'batch_loss/train': 0.8145269770175219} 12/25/2021 17:11:41 - INFO - codeparrot_training - Step 20813: {'lr': 0.00031656673371255453, 'samples': 10656768, 'steps': 20813, 'batch_loss/train': 0.8014486599713564} 12/25/2021 17:11:52 - INFO - codeparrot_training - Step 20814: {'lr': 0.00031655153950570355, 'samples': 10657280, 'steps': 20814, 'batch_loss/train': 0.8453418277204037} 12/25/2021 17:12:02 - INFO - codeparrot_training - Step 20815: {'lr': 0.0003165363450342688, 'samples': 10657792, 'steps': 20815, 'batch_loss/train': 0.8047037580981851} 12/25/2021 17:12:16 - INFO - codeparrot_training - Step 20816: {'lr': 0.0003165211502983107, 'samples': 10658304, 'steps': 20816, 'batch_loss/train': 0.6286919084377587} 12/25/2021 17:12:27 - INFO - codeparrot_training - Step 20817: {'lr': 0.00031650595529788953, 'samples': 10658816, 'steps': 20817, 'batch_loss/train': 1.0029003219678998} 12/25/2021 17:12:37 - INFO - codeparrot_training - Step 20818: {'lr': 0.0003164907600330659, 'samples': 10659328, 'steps': 20818, 'batch_loss/train': 0.8862906214781106} 12/25/2021 17:12:49 - INFO - codeparrot_training - Step 20819: {'lr': 
0.00031647556450390004, 'samples': 10659840, 'steps': 20819, 'batch_loss/train': 0.8120873775333166} 12/25/2021 17:13:00 - INFO - codeparrot_training - Step 20820: {'lr': 0.0003164603687104525, 'samples': 10660352, 'steps': 20820, 'batch_loss/train': 0.8079974912106991} 12/25/2021 17:13:11 - INFO - codeparrot_training - Step 20821: {'lr': 0.0003164451726527835, 'samples': 10660864, 'steps': 20821, 'batch_loss/train': 0.8044680957682431} 12/25/2021 17:13:21 - INFO - codeparrot_training - Step 20822: {'lr': 0.0003164299763309537, 'samples': 10661376, 'steps': 20822, 'batch_loss/train': 0.7361273672431707} 12/25/2021 17:13:33 - INFO - codeparrot_training - Step 20823: {'lr': 0.0003164147797450234, 'samples': 10661888, 'steps': 20823, 'batch_loss/train': 0.8429056690074503} 12/25/2021 17:13:44 - INFO - codeparrot_training - Step 20824: {'lr': 0.000316399582895053, 'samples': 10662400, 'steps': 20824, 'batch_loss/train': 0.8073760899715126} 12/25/2021 17:13:54 - INFO - codeparrot_training - Step 20825: {'lr': 0.0003163843857811028, 'samples': 10662912, 'steps': 20825, 'batch_loss/train': 1.4121344545856118} 12/25/2021 17:14:08 - INFO - codeparrot_training - Step 20826: {'lr': 0.00031636918840323355, 'samples': 10663424, 'steps': 20826, 'batch_loss/train': 0.9016489135101438} 12/25/2021 17:14:19 - INFO - codeparrot_training - Step 20827: {'lr': 0.00031635399076150535, 'samples': 10663936, 'steps': 20827, 'batch_loss/train': 0.8573445538058877} 12/25/2021 17:14:30 - INFO - codeparrot_training - Step 20828: {'lr': 0.0003163387928559788, 'samples': 10664448, 'steps': 20828, 'batch_loss/train': 0.6971305552870035} 12/25/2021 17:14:42 - INFO - codeparrot_training - Step 20829: {'lr': 0.00031632359468671423, 'samples': 10664960, 'steps': 20829, 'batch_loss/train': 0.661685143597424} 12/25/2021 17:14:53 - INFO - codeparrot_training - Step 20830: {'lr': 0.000316308396253772, 'samples': 10665472, 'steps': 20830, 'batch_loss/train': 0.7657850850373507} 12/25/2021 17:15:03 - INFO - 
codeparrot_training - Step 20831: {'lr': 0.00031629319755721276, 'samples': 10665984, 'steps': 20831, 'batch_loss/train': 0.7966088382527232}
12/25/2021 17:15:14 - INFO - codeparrot_training - Step 20832: {'lr': 0.00031627799859709673, 'samples': 10666496, 'steps': 20832, 'batch_loss/train': 0.765366175211966}
12/25/2021 17:15:26 - INFO - codeparrot_training - Step 20833: {'lr': 0.0003162627993734844, 'samples': 10667008, 'steps': 20833, 'batch_loss/train': 1.041530990973115}
12/25/2021 17:15:37 - INFO - codeparrot_training - Step 20834: {'lr': 0.00031624759988643623, 'samples': 10667520, 'steps': 20834, 'batch_loss/train': 0.7473955920431763}
12/25/2021 17:15:47 - INFO - codeparrot_training - Step 20835: {'lr': 0.0003162324001360126, 'samples': 10668032, 'steps': 20835, 'batch_loss/train': 0.8038120679557323}
12/25/2021 17:16:01 - INFO - codeparrot_training - Step 20836: {'lr': 0.00031621720012227393, 'samples': 10668544, 'steps': 20836, 'batch_loss/train': 0.7969208573922515}
12/25/2021 17:16:12 - INFO - codeparrot_training - Step 20837: {'lr': 0.00031620199984528074, 'samples': 10669056, 'steps': 20837, 'batch_loss/train': 0.751103189541027}
12/25/2021 17:16:23 - INFO - codeparrot_training - Step 20838: {'lr': 0.00031618679930509334, 'samples': 10669568, 'steps': 20838, 'batch_loss/train': 0.8202129672281444}
12/25/2021 17:16:35 - INFO - codeparrot_training - Step 20839: {'lr': 0.00031617159850177225, 'samples': 10670080, 'steps': 20839, 'batch_loss/train': 0.8083915030583739}
12/25/2021 17:16:45 - INFO - codeparrot_training - Step 20840: {'lr': 0.00031615639743537785, 'samples': 10670592, 'steps': 20840, 'batch_loss/train': 0.8744724034331739}
12/25/2021 17:16:56 - INFO - codeparrot_training - Step 20841: {'lr': 0.00031614119610597064, 'samples': 10671104, 'steps': 20841, 'batch_loss/train': 0.7895042346790433}
12/25/2021 17:17:07 - INFO - codeparrot_training - Step 20842: {'lr': 0.0003161259945136109, 'samples': 10671616, 'steps': 20842, 'batch_loss/train': 0.7228258037939668}
12/25/2021 17:17:21 - INFO - codeparrot_training - Step 20843: {'lr': 0.00031611079265835923, 'samples': 10672128, 'steps': 20843, 'batch_loss/train': 1.0147223304957151}
12/25/2021 17:17:31 - INFO - codeparrot_training - Step 20844: {'lr': 0.0003160955905402761, 'samples': 10672640, 'steps': 20844, 'batch_loss/train': 1.483256708830595}
12/25/2021 17:17:42 - INFO - codeparrot_training - Step 20845: {'lr': 0.00031608038815942176, 'samples': 10673152, 'steps': 20845, 'batch_loss/train': 1.0812549777328968}
12/25/2021 17:17:54 - INFO - codeparrot_training - Step 20846: {'lr': 0.0003160651855158568, 'samples': 10673664, 'steps': 20846, 'batch_loss/train': 0.8183331517502666}
12/25/2021 17:18:05 - INFO - codeparrot_training - Step 20847: {'lr': 0.00031604998260964154, 'samples': 10674176, 'steps': 20847, 'batch_loss/train': 0.7935527320951223}
12/25/2021 17:18:15 - INFO - codeparrot_training - Step 20848: {'lr': 0.00031603477944083656, 'samples': 10674688, 'steps': 20848, 'batch_loss/train': 0.650481061078608}
12/25/2021 17:18:26 - INFO - codeparrot_training - Step 20849: {'lr': 0.00031601957600950215, 'samples': 10675200, 'steps': 20849, 'batch_loss/train': 1.014587253332138}
12/25/2021 17:18:38 - INFO - codeparrot_training - Step 20850: {'lr': 0.0003160043723156989, 'samples': 10675712, 'steps': 20850, 'batch_loss/train': 0.8266443563625216}
12/25/2021 17:18:49 - INFO - codeparrot_training - Step 20851: {'lr': 0.00031598916835948713, 'samples': 10676224, 'steps': 20851, 'batch_loss/train': 0.8164447396993637}
12/25/2021 17:18:59 - INFO - codeparrot_training - Step 20852: {'lr': 0.0003159739641409274, 'samples': 10676736, 'steps': 20852, 'batch_loss/train': 0.780389940366149}
12/25/2021 17:19:13 - INFO - codeparrot_training - Step 20853: {'lr': 0.0003159587596600801, 'samples': 10677248, 'steps': 20853, 'batch_loss/train': 0.8597957575693727}
12/25/2021 17:19:24 - INFO - codeparrot_training - Step 20854: {'lr': 0.00031594355491700564, 'samples': 10677760, 'steps': 20854, 'batch_loss/train': 0.9537774194031954}
12/25/2021 17:19:34 - INFO - codeparrot_training - Step 20855: {'lr': 0.0003159283499117645, 'samples': 10678272, 'steps': 20855, 'batch_loss/train': 0.7242741156369448}
12/25/2021 17:19:46 - INFO - codeparrot_training - Step 20856: {'lr': 0.0003159131446444172, 'samples': 10678784, 'steps': 20856, 'batch_loss/train': 0.8152351011522114}
12/25/2021 17:19:57 - INFO - codeparrot_training - Step 20857: {'lr': 0.00031589793911502407, 'samples': 10679296, 'steps': 20857, 'batch_loss/train': 0.7781965518370271}
12/25/2021 17:20:08 - INFO - codeparrot_training - Step 20858: {'lr': 0.00031588273332364565, 'samples': 10679808, 'steps': 20858, 'batch_loss/train': 0.8865037383511662}
12/25/2021 17:20:20 - INFO - codeparrot_training - Step 20859: {'lr': 0.0003158675272703423, 'samples': 10680320, 'steps': 20859, 'batch_loss/train': 0.8238958227448165}
12/25/2021 17:20:30 - INFO - codeparrot_training - Step 20860: {'lr': 0.0003158523209551746, 'samples': 10680832, 'steps': 20860, 'batch_loss/train': 0.8593175178393722}
12/25/2021 17:20:41 - INFO - codeparrot_training - Step 20861: {'lr': 0.0003158371143782029, 'samples': 10681344, 'steps': 20861, 'batch_loss/train': 0.7283411682583392}
12/25/2021 17:20:51 - INFO - codeparrot_training - Step 20862: {'lr': 0.00031582190753948766, 'samples': 10681856, 'steps': 20862, 'batch_loss/train': 0.7592202331870794}
12/25/2021 17:21:04 - INFO - codeparrot_training - Step 20863: {'lr': 0.00031580670043908945, 'samples': 10682368, 'steps': 20863, 'batch_loss/train': 0.9016687255352736}
12/25/2021 17:21:14 - INFO - codeparrot_training - Step 20864: {'lr': 0.00031579149307706854, 'samples': 10682880, 'steps': 20864, 'batch_loss/train': 0.858155039139092}
12/25/2021 17:21:25 - INFO - codeparrot_training - Step 20865: {'lr': 0.0003157762854534856, 'samples': 10683392, 'steps': 20865, 'batch_loss/train': 0.7796046934090555}
12/25/2021 17:21:39 - INFO - codeparrot_training - Step 20866: {'lr': 0.00031576107756840094, 'samples': 10683904, 'steps': 20866, 'batch_loss/train': 0.8755441661924124}
12/25/2021 17:21:50 - INFO - codeparrot_training - Step 20867: {'lr': 0.0003157458694218751, 'samples': 10684416, 'steps': 20867, 'batch_loss/train': 0.9329717149958014}
12/25/2021 17:22:00 - INFO - codeparrot_training - Step 20868: {'lr': 0.0003157306610139684, 'samples': 10684928, 'steps': 20868, 'batch_loss/train': 0.7242140141315758}
12/25/2021 17:22:11 - INFO - codeparrot_training - Step 20869: {'lr': 0.0003157154523447415, 'samples': 10685440, 'steps': 20869, 'batch_loss/train': 0.7541667767800391}
12/25/2021 17:22:23 - INFO - codeparrot_training - Step 20870: {'lr': 0.0003157002434142548, 'samples': 10685952, 'steps': 20870, 'batch_loss/train': 0.7700156066566706}
12/25/2021 17:22:34 - INFO - codeparrot_training - Step 20871: {'lr': 0.00031568503422256875, 'samples': 10686464, 'steps': 20871, 'batch_loss/train': 0.8386618453077972}
12/25/2021 17:22:44 - INFO - codeparrot_training - Step 20872: {'lr': 0.0003156698247697437, 'samples': 10686976, 'steps': 20872, 'batch_loss/train': 0.8031304152682424}
12/25/2021 17:22:59 - INFO - codeparrot_training - Step 20873: {'lr': 0.00031565461505584025, 'samples': 10687488, 'steps': 20873, 'batch_loss/train': 0.823119018226862}
12/25/2021 17:23:09 - INFO - codeparrot_training - Step 20874: {'lr': 0.00031563940508091894, 'samples': 10688000, 'steps': 20874, 'batch_loss/train': 0.9284690879285336}
12/25/2021 17:23:20 - INFO - codeparrot_training - Step 20875: {'lr': 0.00031562419484504003, 'samples': 10688512, 'steps': 20875, 'batch_loss/train': 0.9841533228172921}
12/25/2021 17:23:32 - INFO - codeparrot_training - Step 20876: {'lr': 0.0003156089843482642, 'samples': 10689024, 'steps': 20876, 'batch_loss/train': 0.7633686861954629}
12/25/2021 17:23:43 - INFO - codeparrot_training - Step 20877: {'lr': 0.00031559377359065174, 'samples': 10689536, 'steps': 20877, 'batch_loss/train': 0.6585816279985011}
12/25/2021 17:23:53 - INFO - codeparrot_training - Step 20878: {'lr': 0.0003155785625722632, 'samples': 10690048, 'steps': 20878, 'batch_loss/train': 0.8432875648140907}
12/25/2021 17:24:05 - INFO - codeparrot_training - Step 20879: {'lr': 0.00031556335129315905, 'samples': 10690560, 'steps': 20879, 'batch_loss/train': 0.6595729957334697}
12/25/2021 17:24:16 - INFO - codeparrot_training - Step 20880: {'lr': 0.0003155481397533998, 'samples': 10691072, 'steps': 20880, 'batch_loss/train': 0.8430394362658262}
12/25/2021 17:24:27 - INFO - codeparrot_training - Step 20881: {'lr': 0.0003155329279530459, 'samples': 10691584, 'steps': 20881, 'batch_loss/train': 0.8145928224548697}
12/25/2021 17:24:37 - INFO - codeparrot_training - Step 20882: {'lr': 0.0003155177158921578, 'samples': 10692096, 'steps': 20882, 'batch_loss/train': 0.8551636352203786}
12/25/2021 17:24:51 - INFO - codeparrot_training - Step 20883: {'lr': 0.0003155025035707961, 'samples': 10692608, 'steps': 20883, 'batch_loss/train': 0.7579833758063614}
12/25/2021 17:25:02 - INFO - codeparrot_training - Step 20884: {'lr': 0.00031548729098902106, 'samples': 10693120, 'steps': 20884, 'batch_loss/train': 0.795007080771029}
12/25/2021 17:25:12 - INFO - codeparrot_training - Step 20885: {'lr': 0.0003154720781468933, 'samples': 10693632, 'steps': 20885, 'batch_loss/train': 0.7623209632001817}
12/25/2021 17:25:24 - INFO - codeparrot_training - Step 20886: {'lr': 0.00031545686504447337, 'samples': 10694144, 'steps': 20886, 'batch_loss/train': 0.8612517286092043}
12/25/2021 17:25:35 - INFO - codeparrot_training - Step 20887: {'lr': 0.00031544165168182153, 'samples': 10694656, 'steps': 20887, 'batch_loss/train': 0.7293339986354113}
12/25/2021 17:25:45 - INFO - codeparrot_training - Step 20888: {'lr': 0.00031542643805899847, 'samples': 10695168, 'steps': 20888, 'batch_loss/train': 0.817761950660497}
12/25/2021 17:25:56 - INFO - codeparrot_training - Step 20889: {'lr': 0.0003154112241760646, 'samples': 10695680, 'steps': 20889, 'batch_loss/train': 0.8114175824448466}
12/25/2021 17:26:08 - INFO - codeparrot_training - Step 20890: {'lr': 0.0003153960100330804, 'samples': 10696192, 'steps': 20890, 'batch_loss/train': 0.7837381670251489}
12/25/2021 17:26:19 - INFO - codeparrot_training - Step 20891: {'lr': 0.0003153807956301063, 'samples': 10696704, 'steps': 20891, 'batch_loss/train': 0.7050910391844809}
12/25/2021 17:26:29 - INFO - codeparrot_training - Step 20892: {'lr': 0.00031536558096720285, 'samples': 10697216, 'steps': 20892, 'batch_loss/train': 0.8817819454707205}
12/25/2021 17:26:44 - INFO - codeparrot_training - Step 20893: {'lr': 0.00031535036604443055, 'samples': 10697728, 'steps': 20893, 'batch_loss/train': 0.8014595513232052}
12/25/2021 17:26:54 - INFO - codeparrot_training - Step 20894: {'lr': 0.0003153351508618499, 'samples': 10698240, 'steps': 20894, 'batch_loss/train': 0.733031926676631}
12/25/2021 17:27:05 - INFO - codeparrot_training - Step 20895: {'lr': 0.0003153199354195213, 'samples': 10698752, 'steps': 20895, 'batch_loss/train': 0.877772057428956}
12/25/2021 17:27:17 - INFO - codeparrot_training - Step 20896: {'lr': 0.0003153047197175054, 'samples': 10699264, 'steps': 20896, 'batch_loss/train': 0.7233489848440513}
12/25/2021 17:27:27 - INFO - codeparrot_training - Step 20897: {'lr': 0.0003152895037558625, 'samples': 10699776, 'steps': 20897, 'batch_loss/train': 0.8123360592871904}
12/25/2021 17:27:38 - INFO - codeparrot_training - Step 20898: {'lr': 0.00031527428753465325, 'samples': 10700288, 'steps': 20898, 'batch_loss/train': 0.723218816332519}
12/25/2021 17:27:50 - INFO - codeparrot_training - Step 20899: {'lr': 0.00031525907105393804, 'samples': 10700800, 'steps': 20899, 'batch_loss/train': 0.7276505695190281}
12/25/2021 17:28:01 - INFO - codeparrot_training - Step 20900: {'lr': 0.00031524385431377737, 'samples': 10701312, 'steps': 20900, 'batch_loss/train': 0.7991412575356662}
12/25/2021 17:28:11 - INFO - codeparrot_training - Step 20901: {'lr': 0.00031522863731423185, 'samples': 10701824, 'steps': 20901, 'batch_loss/train': 0.7240221559768543}
12/25/2021 17:28:22 - INFO - codeparrot_training - Step 20902: {'lr': 0.0003152134200553619, 'samples': 10702336, 'steps': 20902, 'batch_loss/train': 0.8386616631178185}
12/25/2021 17:28:36 - INFO - codeparrot_training - Step 20903: {'lr': 0.00031519820253722797, 'samples': 10702848, 'steps': 20903, 'batch_loss/train': 0.7928722556680441}
12/25/2021 17:28:47 - INFO - codeparrot_training - Step 20904: {'lr': 0.00031518298475989055, 'samples': 10703360, 'steps': 20904, 'batch_loss/train': 0.7245787023566663}
12/25/2021 17:28:57 - INFO - codeparrot_training - Step 20905: {'lr': 0.0003151677667234103, 'samples': 10703872, 'steps': 20905, 'batch_loss/train': 0.7484100461006165}
12/25/2021 17:29:09 - INFO - codeparrot_training - Step 20906: {'lr': 0.0003151525484278476, 'samples': 10704384, 'steps': 20906, 'batch_loss/train': 0.7727577984333038}
12/25/2021 17:29:20 - INFO - codeparrot_training - Step 20907: {'lr': 0.0003151373298732629, 'samples': 10704896, 'steps': 20907, 'batch_loss/train': 0.9073736444115639}
12/25/2021 17:29:31 - INFO - codeparrot_training - Step 20908: {'lr': 0.0003151221110597168, 'samples': 10705408, 'steps': 20908, 'batch_loss/train': 1.6124002262949944}
12/25/2021 17:29:43 - INFO - codeparrot_training - Step 20909: {'lr': 0.0003151068919872698, 'samples': 10705920, 'steps': 20909, 'batch_loss/train': 0.7367806979455054}
12/25/2021 17:29:54 - INFO - codeparrot_training - Step 20910: {'lr': 0.0003150916726559824, 'samples': 10706432, 'steps': 20910, 'batch_loss/train': 0.8501403951086104}
12/25/2021 17:30:04 - INFO - codeparrot_training - Step 20911: {'lr': 0.00031507645306591504, 'samples': 10706944, 'steps': 20911, 'batch_loss/train': 0.7286242854315788}
12/25/2021 17:30:15 - INFO - codeparrot_training - Step 20912: {'lr': 0.0003150612332171283, 'samples': 10707456, 'steps': 20912, 'batch_loss/train': 0.8144367686472833}
12/25/2021 17:30:29 - INFO - codeparrot_training - Step 20913: {'lr': 0.00031504601310968264, 'samples': 10707968, 'steps': 20913, 'batch_loss/train': 0.6934464005753398}
12/25/2021 17:30:39 - INFO - codeparrot_training - Step 20914: {'lr': 0.0003150307927436385, 'samples': 10708480, 'steps': 20914, 'batch_loss/train': 0.6440494407434016}
12/25/2021 17:30:50 - INFO - codeparrot_training - Step 20915: {'lr': 0.0003150155721190567, 'samples': 10708992, 'steps': 20915, 'batch_loss/train': 0.827441577333957}
12/25/2021 17:31:02 - INFO - codeparrot_training - Step 20916: {'lr': 0.00031500035123599733, 'samples': 10709504, 'steps': 20916, 'batch_loss/train': 0.697260120883584}
12/25/2021 17:31:13 - INFO - codeparrot_training - Step 20917: {'lr': 0.0003149851300945212, 'samples': 10710016, 'steps': 20917, 'batch_loss/train': 1.730204064399004}
12/25/2021 17:31:23 - INFO - codeparrot_training - Step 20918: {'lr': 0.0003149699086946888, 'samples': 10710528, 'steps': 20918, 'batch_loss/train': 0.7283911922131665}
12/25/2021 17:31:38 - INFO - codeparrot_training - Step 20919: {'lr': 0.00031495468703656045, 'samples': 10711040, 'steps': 20919, 'batch_loss/train': 0.7356393651571125}
12/25/2021 17:31:48 - INFO - codeparrot_training - Step 20920: {'lr': 0.00031493946512019683, 'samples': 10711552, 'steps': 20920, 'batch_loss/train': 0.75192791223526}
12/25/2021 17:31:59 - INFO - codeparrot_training - Step 20921: {'lr': 0.0003149242429456584, 'samples': 10712064, 'steps': 20921, 'batch_loss/train': 0.7772849006578326}
12/25/2021 17:32:09 - INFO - codeparrot_training - Step 20922: {'lr': 0.0003149090205130058, 'samples': 10712576, 'steps': 20922, 'batch_loss/train': 1.0777819184586406}
12/25/2021 17:32:22 - INFO - codeparrot_training - Step 20923: {'lr': 0.00031489379782229934, 'samples': 10713088, 'steps': 20923, 'batch_loss/train': 0.8505238490179181}
12/25/2021 17:32:33 - INFO - codeparrot_training - Step 20924: {'lr': 0.00031487857487359967, 'samples': 10713600, 'steps': 20924, 'batch_loss/train': 0.7379973940551281}
12/25/2021 17:32:43 - INFO - codeparrot_training - Step 20925: {'lr': 0.00031486335166696727, 'samples': 10714112, 'steps': 20925, 'batch_loss/train': 0.7801443885546178}
12/25/2021 17:32:56 - INFO - codeparrot_training - Step 20926: {'lr': 0.00031484812820246265, 'samples': 10714624, 'steps': 20926, 'batch_loss/train': 0.7991950875148177}
12/25/2021 17:33:07 - INFO - codeparrot_training - Step 20927: {'lr': 0.00031483290448014646, 'samples': 10715136, 'steps': 20927, 'batch_loss/train': 1.1697238832712173}
12/25/2021 17:33:17 - INFO - codeparrot_training - Step 20928: {'lr': 0.0003148176805000791, 'samples': 10715648, 'steps': 20928, 'batch_loss/train': 0.7870495244860649}
12/25/2021 17:33:28 - INFO - codeparrot_training - Step 20929: {'lr': 0.00031480245626232104, 'samples': 10716160, 'steps': 20929, 'batch_loss/train': 0.7772356020286679}
12/25/2021 17:33:42 - INFO - codeparrot_training - Step 20930: {'lr': 0.00031478723176693287, 'samples': 10716672, 'steps': 20930, 'batch_loss/train': 0.8090518643148243}
12/25/2021 17:33:53 - INFO - codeparrot_training - Step 20931: {'lr': 0.0003147720070139752, 'samples': 10717184, 'steps': 20931, 'batch_loss/train': 0.7526632337830961}
12/25/2021 17:34:03 - INFO - codeparrot_training - Step 20932: {'lr': 0.0003147567820035084, 'samples': 10717696, 'steps': 20932, 'batch_loss/train': 0.9967810669913888}
12/25/2021 17:34:15 - INFO - codeparrot_training - Step 20933: {'lr': 0.0003147415567355932, 'samples': 10718208, 'steps': 20933, 'batch_loss/train': 0.8523782826960087}
12/25/2021 17:34:26 - INFO - codeparrot_training - Step 20934: {'lr': 0.0003147263312102899, 'samples': 10718720, 'steps': 20934, 'batch_loss/train': 0.8549497285857797}
12/25/2021 17:34:36 - INFO - codeparrot_training - Step 20935: {'lr': 0.0003147111054276593, 'samples': 10719232, 'steps': 20935, 'batch_loss/train': 0.7728236466646194}
12/25/2021 17:34:47 - INFO - codeparrot_training - Step 20936: {'lr': 0.0003146958793877616, 'samples': 10719744, 'steps': 20936, 'batch_loss/train': 0.9034397331997752}
12/25/2021 17:34:59 - INFO - codeparrot_training - Step 20937: {'lr': 0.0003146806530906577, 'samples': 10720256, 'steps': 20937, 'batch_loss/train': 0.8844528160989285}
12/25/2021 17:35:10 - INFO - codeparrot_training - Step 20938: {'lr': 0.00031466542653640786, 'samples': 10720768, 'steps': 20938, 'batch_loss/train': 0.8941939193755388}
12/25/2021 17:35:20 - INFO - codeparrot_training - Step 20939: {'lr': 0.0003146501997250726, 'samples': 10721280, 'steps': 20939, 'batch_loss/train': 1.027276020962745}
12/25/2021 17:35:32 - INFO - codeparrot_training - Step 20940: {'lr': 0.0003146349726567127, 'samples': 10721792, 'steps': 20940, 'batch_loss/train': 0.8316166461445391}
12/25/2021 17:35:43 - INFO - codeparrot_training - Step 20941: {'lr': 0.0003146197453313885, 'samples': 10722304, 'steps': 20941, 'batch_loss/train': 0.8831067094579339}
12/25/2021 17:35:54 - INFO - codeparrot_training - Step 20942: {'lr': 0.0003146045177491606, 'samples': 10722816, 'steps': 20942, 'batch_loss/train': 0.6961988941766322}
12/25/2021 17:36:07 - INFO - codeparrot_training - Step 20943: {'lr': 0.00031458928991008947, 'samples': 10723328, 'steps': 20943, 'batch_loss/train': 0.8615252196323127}
12/25/2021 17:36:18 - INFO - codeparrot_training - Step 20944: {'lr': 0.00031457406181423583, 'samples': 10723840, 'steps': 20944, 'batch_loss/train': 0.8216991648077965}
12/25/2021 17:36:28 - INFO - codeparrot_training - Step 20945: {'lr': 0.00031455883346166003, 'samples': 10724352, 'steps': 20945, 'batch_loss/train': 0.8363816677592695}
12/25/2021 17:36:40 - INFO - codeparrot_training - Step 20946: {'lr': 0.0003145436048524227, 'samples': 10724864, 'steps': 20946, 'batch_loss/train': 0.8740376017522067}
12/25/2021 17:36:51 - INFO - codeparrot_training - Step 20947: {'lr': 0.00031452837598658427, 'samples': 10725376, 'steps': 20947, 'batch_loss/train': 0.8027481269091368}
12/25/2021 17:37:02 - INFO - codeparrot_training - Step 20948: {'lr': 0.0003145131468642056, 'samples': 10725888, 'steps': 20948, 'batch_loss/train': 0.7940815887413919}
12/25/2021 17:37:12 - INFO - codeparrot_training - Step 20949: {'lr': 0.00031449791748534676, 'samples': 10726400, 'steps': 20949, 'batch_loss/train': 0.7059417413547635}
12/25/2021 17:37:24 - INFO - codeparrot_training - Step 20950: {'lr': 0.0003144826878500687, 'samples': 10726912, 'steps': 20950, 'batch_loss/train': 0.9106154665350914}
12/25/2021 17:37:35 - INFO - codeparrot_training - Step 20951: {'lr': 0.00031446745795843183, 'samples': 10727424, 'steps': 20951, 'batch_loss/train': 0.9129069251939654}
12/25/2021 17:37:46 - INFO - codeparrot_training - Step 20952: {'lr': 0.00031445222781049665, 'samples': 10727936, 'steps': 20952, 'batch_loss/train': 0.7661856599152088}
12/25/2021 17:37:59 - INFO - codeparrot_training - Step 20953: {'lr': 0.00031443699740632373, 'samples': 10728448, 'steps': 20953, 'batch_loss/train': 0.7922326093539596}
12/25/2021 17:38:10 - INFO - codeparrot_training - Step 20954: {'lr': 0.0003144217667459737, 'samples': 10728960, 'steps': 20954, 'batch_loss/train': 0.6955711548216641}
12/25/2021 17:38:21 - INFO - codeparrot_training - Step 20955: {'lr': 0.00031440653582950705, 'samples': 10729472, 'steps': 20955, 'batch_loss/train': 0.8723693424835801}
12/25/2021 17:38:33 - INFO - codeparrot_training - Step 20956: {'lr': 0.00031439130465698423, 'samples': 10729984, 'steps': 20956, 'batch_loss/train': 0.7367733760038391}
12/25/2021 17:38:44 - INFO - codeparrot_training - Step 20957: {'lr': 0.00031437607322846604, 'samples': 10730496, 'steps': 20957, 'batch_loss/train': 0.7911250037141144}
12/25/2021 17:38:54 - INFO - codeparrot_training - Step 20958: {'lr': 0.00031436084154401277, 'samples': 10731008, 'steps': 20958, 'batch_loss/train': 0.8627846271265298}
12/25/2021 17:39:05 - INFO - codeparrot_training - Step 20959: {'lr': 0.0003143456096036852, 'samples': 10731520, 'steps': 20959, 'batch_loss/train': 0.7819713670760393}
12/25/2021 17:39:19 - INFO - codeparrot_training - Step 20960: {'lr': 0.0003143303774075437, 'samples': 10732032, 'steps': 20960, 'batch_loss/train': 0.8259890191257}
12/25/2021 17:39:29 - INFO - codeparrot_training - Step 20961: {'lr': 0.00031431514495564895, 'samples': 10732544, 'steps': 20961, 'batch_loss/train': 0.6158032636158168}
12/25/2021 17:39:40 - INFO - codeparrot_training - Step 20962: {'lr': 0.0003142999122480614, 'samples': 10733056, 'steps': 20962, 'batch_loss/train': 0.7590981898829341}
12/25/2021 17:39:52 - INFO - codeparrot_training - Step 20963: {'lr': 0.00031428467928484177, 'samples': 10733568, 'steps': 20963, 'batch_loss/train': 0.7824127553030849}
12/25/2021 17:40:03 - INFO - codeparrot_training - Step 20964: {'lr': 0.00031426944606605047, 'samples': 10734080, 'steps': 20964, 'batch_loss/train': 1.326983334030956}
12/25/2021 17:40:13 - INFO - codeparrot_training - Step 20965: {'lr': 0.0003142542125917481, 'samples': 10734592, 'steps': 20965, 'batch_loss/train': 0.9198965737596154}
12/25/2021 17:40:25 - INFO - codeparrot_training - Step 20966: {'lr': 0.0003142389788619953, 'samples': 10735104, 'steps': 20966, 'batch_loss/train': 0.8474147198721766}
12/25/2021 17:40:36 - INFO - codeparrot_training - Step 20967: {'lr': 0.00031422374487685256, 'samples': 10735616, 'steps': 20967, 'batch_loss/train': 0.738188287243247}
12/25/2021 17:40:46 - INFO - codeparrot_training - Step 20968: {'lr': 0.00031420851063638047, 'samples': 10736128, 'steps': 20968, 'batch_loss/train': 0.88165676407516}
12/25/2021 17:40:57 - INFO - codeparrot_training - Step 20969: {'lr': 0.00031419327614063953, 'samples': 10736640, 'steps': 20969, 'batch_loss/train': 0.7888828970026225}
12/25/2021 17:41:11 - INFO - codeparrot_training - Step 20970: {'lr': 0.0003141780413896905, 'samples': 10737152, 'steps': 20970, 'batch_loss/train': 0.899686143733561}
12/25/2021 17:41:21 - INFO - codeparrot_training - Step 20971: {'lr': 0.0003141628063835937, 'samples': 10737664, 'steps': 20971, 'batch_loss/train': 0.7810713876970112}
12/25/2021 17:41:32 - INFO - codeparrot_training - Step 20972: {'lr': 0.00031414757112240975, 'samples': 10738176, 'steps': 20972, 'batch_loss/train': 0.7430214777123183}
12/25/2021 17:41:44 - INFO - codeparrot_training - Step 20973: {'lr': 0.00031413233560619936, 'samples': 10738688, 'steps': 20973, 'batch_loss/train': 0.7341588782146573}
12/25/2021 17:41:54 - INFO - codeparrot_training - Step 20974: {'lr': 0.00031411709983502305, 'samples': 10739200, 'steps': 20974, 'batch_loss/train': 0.768998043378815}
12/25/2021 17:42:05 - INFO - codeparrot_training - Step 20975: {'lr': 0.00031410186380894127, 'samples': 10739712, 'steps': 20975, 'batch_loss/train': 0.7781648179516196}
12/25/2021 17:42:17 - INFO - codeparrot_training - Step 20976: {'lr': 0.00031408662752801484, 'samples': 10740224, 'steps': 20976, 'batch_loss/train': 0.759973204578273}
12/25/2021 17:42:28 - INFO - codeparrot_training - Step 20977: {'lr': 0.00031407139099230404, 'samples': 10740736, 'steps': 20977, 'batch_loss/train': 0.5726105766370893}
12/25/2021 17:42:39 - INFO - codeparrot_training - Step 20978: {'lr': 0.0003140561542018696, 'samples': 10741248, 'steps': 20978, 'batch_loss/train': 0.8288984969258308}
12/25/2021 17:42:49 - INFO - codeparrot_training - Step 20979: {'lr': 0.0003140409171567721, 'samples': 10741760, 'steps': 20979, 'batch_loss/train': 0.853514916729182}
12/25/2021 17:43:01 - INFO - codeparrot_training - Step 20980: {'lr': 0.00031402567985707206, 'samples': 10742272, 'steps': 20980, 'batch_loss/train': 0.8513376312330365}
12/25/2021 17:43:12 - INFO - codeparrot_training - Step 20981: {'lr': 0.0003140104423028301, 'samples': 10742784, 'steps': 20981, 'batch_loss/train': 0.8114662589505315}
12/25/2021 17:43:22 - INFO - codeparrot_training - Step 20982: {'lr': 0.0003139952044941068, 'samples': 10743296, 'steps': 20982, 'batch_loss/train': 0.7594912918284535}
12/25/2021 17:43:36 - INFO - codeparrot_training - Step 20983: {'lr': 0.0003139799664309628, 'samples': 10743808, 'steps': 20983, 'batch_loss/train': 0.7431791676208377}
12/25/2021 17:43:47 - INFO - codeparrot_training - Step 20984: {'lr': 0.00031396472811345854, 'samples': 10744320, 'steps': 20984, 'batch_loss/train': 0.7717546690255404}
12/25/2021 17:43:58 - INFO - codeparrot_training - Step 20985: {'lr': 0.0003139494895416547, 'samples': 10744832, 'steps': 20985, 'batch_loss/train': 0.8486716179177165}
12/25/2021 17:44:10 - INFO - codeparrot_training - Step 20986: {'lr': 0.00031393425071561177, 'samples': 10745344, 'steps': 20986, 'batch_loss/train': 0.7856956208124757}
12/25/2021 17:44:20 - INFO - codeparrot_training - Step 20987: {'lr': 0.0003139190116353905, 'samples': 10745856, 'steps': 20987, 'batch_loss/train': 0.740481385961175}
12/25/2021 17:44:31 - INFO - codeparrot_training - Step 20988: {'lr': 0.00031390377230105127, 'samples': 10746368, 'steps': 20988, 'batch_loss/train': 0.7294294373132288}
12/25/2021 17:44:41 - INFO - codeparrot_training - Step 20989: {'lr': 0.00031388853271265487, 'samples': 10746880, 'steps': 20989, 'batch_loss/train': 0.7380249091656879}
12/25/2021 17:44:56 - INFO - codeparrot_training - Step 20990: {'lr': 0.00031387329287026173, 'samples': 10747392, 'steps': 20990, 'batch_loss/train': 0.853828786406666}
12/25/2021 17:45:06 - INFO - codeparrot_training - Step 20991: {'lr': 0.00031385805277393244, 'samples': 10747904, 'steps': 20991, 'batch_loss/train': 0.9931286904029548}
12/25/2021 17:45:17 - INFO - codeparrot_training - Step 20992: {'lr': 0.0003138428124237278, 'samples': 10748416, 'steps': 20992, 'batch_loss/train': 0.8366677286103368}
12/25/2021 17:45:29 - INFO - codeparrot_training - Step 20993: {'lr': 0.0003138275718197081, 'samples': 10748928, 'steps': 20993, 'batch_loss/train': 0.7269932813942432}
12/25/2021 17:45:40 - INFO - codeparrot_training - Step 20994: {'lr': 0.0003138123309619342, 'samples': 10749440, 'steps': 20994, 'batch_loss/train': 0.9581929659470916}
12/25/2021 17:45:50 - INFO - codeparrot_training - Step 20995: {'lr': 0.0003137970898504664, 'samples': 10749952, 'steps': 20995, 'batch_loss/train': 0.7144991625100374}
12/25/2021 17:46:01 - INFO - codeparrot_training - Step 20996: {'lr': 0.00031378184848536564, 'samples': 10750464, 'steps': 20996, 'batch_loss/train': 0.627372425980866}
12/25/2021 17:46:13 - INFO - codeparrot_training - Step 20997: {'lr': 0.00031376660686669225, 'samples': 10750976, 'steps': 20997, 'batch_loss/train': 0.7443262692540884}
12/25/2021 17:46:24 - INFO - codeparrot_training - Step 20998: {'lr': 0.0003137513649945069, 'samples': 10751488, 'steps': 20998, 'batch_loss/train': 0.9101256732828915}
12/25/2021 17:46:34 - INFO - codeparrot_training - Step 20999: {'lr': 0.0003137361228688702, 'samples': 10752000, 'steps': 20999, 'batch_loss/train': 0.7822125921957195}
12/25/2021 17:46:48 - INFO - codeparrot_training - Step 21000: {'lr': 0.00031372088048984274, 'samples': 10752512, 'steps': 21000, 'batch_loss/train': 0.8234328767284751}
12/25/2021 17:46:59 - INFO - codeparrot_training - Step 21001: {'lr': 0.00031370563785748504, 'samples': 10753024, 'steps': 21001, 'batch_loss/train': 0.8300470123067498}
12/25/2021 17:47:09 - INFO - codeparrot_training - Step 21002: {'lr': 0.000313690394971858, 'samples': 10753536, 'steps': 21002, 'batch_loss/train': 0.6447406688821502}
12/25/2021 17:47:22 - INFO - codeparrot_training - Step 21003: {'lr': 0.0003136751518330218, 'samples': 10754048, 'steps': 21003, 'batch_loss/train': 0.680279622785747}
12/25/2021 17:47:32 - INFO - codeparrot_training - Step 21004: {'lr': 0.0003136599084410373, 'samples': 10754560, 'steps': 21004, 'batch_loss/train': 0.777822834905237}
12/25/2021 17:47:43 - INFO - codeparrot_training - Step 21005: {'lr': 0.00031364466479596504, 'samples': 10755072, 'steps': 21005, 'batch_loss/train': 0.8601602474227548}
12/25/2021 17:47:55 - INFO - codeparrot_training - Step 21006: {'lr': 0.0003136294208978656, 'samples': 10755584, 'steps': 21006, 'batch_loss/train': 0.7614302258007228}
12/25/2021 17:48:05 - INFO - codeparrot_training - Step 21007: {'lr': 0.00031361417674679963, 'samples': 10756096, 'steps': 21007, 'batch_loss/train': 0.8288928144611418}
12/25/2021 17:48:16 - INFO - codeparrot_training - Step 21008: {'lr': 0.0003135989323428277, 'samples': 10756608, 'steps': 21008, 'batch_loss/train': 0.7534341160207987}
12/25/2021 17:48:26 - INFO - codeparrot_training - Step 21009: {'lr': 0.00031358368768601055, 'samples': 10757120, 'steps': 21009, 'batch_loss/train': 0.8008698937483132}
12/25/2021 17:48:40 - INFO - codeparrot_training - Step 21010: {'lr': 0.0003135684427764085, 'samples': 10757632, 'steps': 21010, 'batch_loss/train': 0.7902173716574907}
12/25/2021 17:48:51 - INFO - codeparrot_training - Step 21011: {'lr': 0.00031355319761408243, 'samples': 10758144, 'steps': 21011, 'batch_loss/train': 0.744621463585645}
12/25/2021 17:49:02 - INFO - codeparrot_training - Step 21012: {'lr': 0.00031353795219909276, 'samples': 10758656, 'steps': 21012, 'batch_loss/train': 0.8096189312636852}
12/25/2021 17:49:14 - INFO - codeparrot_training - Step 21013: {'lr': 0.0003135227065315002, 'samples': 10759168, 'steps': 21013, 'batch_loss/train': 0.8545039496384561}
12/25/2021 17:49:24 - INFO - codeparrot_training - Step 21014: {'lr': 0.0003135074606113654, 'samples': 10759680, 'steps': 21014, 'batch_loss/train': 0.7757806004956365}
12/25/2021 17:49:35 - INFO - codeparrot_training - Step 21015: {'lr': 0.00031349221443874883, 'samples': 10760192, 'steps': 21015, 'batch_loss/train': 0.8239925373345613}
12/25/2021 17:49:47 - INFO - codeparrot_training - Step 21016: {'lr': 0.00031347696801371127, 'samples': 10760704, 'steps': 21016, 'batch_loss/train': 0.7561230785213411}
12/25/2021 17:49:58 - INFO - codeparrot_training - Step 21017: {'lr': 0.0003134617213363132, 'samples': 10761216, 'steps': 21017, 'batch_loss/train': 0.8309653867036104}
12/25/2021 17:50:08 - INFO - codeparrot_training - Step 21018: {'lr': 0.00031344647440661533, 'samples': 10761728, 'steps': 21018, 'batch_loss/train': 0.6519914265954867}
12/25/2021 17:50:19 - INFO - codeparrot_training - Step 21019: {'lr': 0.00031343122722467823, 'samples': 10762240, 'steps': 21019, 'batch_loss/train': 0.7518523437902331}
12/25/2021 17:50:31 - INFO - codeparrot_training - Step 21020: {'lr': 0.0003134159797905625, 'samples': 10762752, 'steps': 21020, 'batch_loss/train': 0.7837419058196247}
12/25/2021 17:50:41 - INFO - codeparrot_training - Step 21021: {'lr': 0.00031340073210432874, 'samples': 10763264, 'steps': 21021, 'batch_loss/train': 0.7869485095143318}
12/25/2021 17:50:52 - INFO - codeparrot_training - Step 21022: {'lr': 0.00031338548416603773, 'samples': 10763776, 'steps': 21022, 'batch_loss/train': 0.7384994067251682}
12/25/2021 17:51:06 - INFO - codeparrot_training - Step 21023: {'lr': 0.0003133702359757499, 'samples': 10764288, 'steps': 21023, 'batch_loss/train': 0.701359617523849}
12/25/2021 17:51:17 - INFO - codeparrot_training - Step 21024: {'lr': 0.0003133549875335259, 'samples': 10764800, 'steps': 21024, 'batch_loss/train': 0.7992260297760367}
12/25/2021 17:51:27 - INFO - codeparrot_training - Step 21025: {'lr': 0.0003133397388394264, 'samples': 10765312, 'steps': 21025, 'batch_loss/train': 0.6854863869957626}
12/25/2021 17:51:40 - INFO - codeparrot_training - Step 21026: {'lr': 0.000313324489893512, 'samples': 10765824, 'steps': 21026, 'batch_loss/train': 0.7983390409499407}
12/25/2021 17:51:50 - INFO - codeparrot_training - Step 21027: {'lr': 0.00031330924069584336, 'samples': 10766336, 'steps': 21027, 'batch_loss/train': 0.874585096957162}
12/25/2021 17:52:01 - INFO - codeparrot_training - Step 21028: {'lr': 0.00031329399124648113, 'samples': 10766848, 'steps': 21028, 'batch_loss/train': 0.8007093761116266}
12/25/2021 17:52:15 - INFO - codeparrot_training - Step 21029: {'lr': 0.00031327874154548594, 'samples': 10767360, 'steps': 21029, 'batch_loss/train': 0.7813457855954766}
12/25/2021 17:52:25 - INFO - codeparrot_training - Step 21030: {'lr': 0.00031326349159291816, 'samples': 10767872, 'steps': 21030, 'batch_loss/train': 0.6829981266055256}
12/25/2021 17:52:36 - INFO - codeparrot_training - Step 21031: {'lr': 0.00031324824138883884, 'samples': 10768384, 'steps': 21031, 'batch_loss/train': 0.6935455477796495}
12/25/2021 17:52:46 - INFO - codeparrot_training - Step 21032: {'lr': 0.00031323299093330827, 'samples': 10768896, 'steps': 21032, 'batch_loss/train': 0.8692555194720626}
12/25/2021 17:52:59 - INFO - codeparrot_training - Step 21033: {'lr': 0.00031321774022638726, 'samples': 10769408, 'steps': 21033, 'batch_loss/train': 1.0220796754583716}
12/25/2021 17:53:09 - INFO - codeparrot_training - Step 21034: {'lr': 0.0003132024892681363, 'samples': 10769920, 'steps': 21034, 'batch_loss/train': 0.8726307190954685}
12/25/2021 17:53:20 - INFO - codeparrot_training - Step 21035: {'lr': 0.00031318723805861626, 'samples': 10770432, 'steps': 21035, 'batch_loss/train': 0.8357840524986386}
12/25/2021 17:53:34 - INFO - codeparrot_training - Step 21036: {'lr': 0.0003131719865978875, 'samples': 10770944, 'steps': 21036, 'batch_loss/train': 0.7059460529126227}
12/25/2021 17:53:45 - INFO - codeparrot_training - Step 21037: {'lr': 0.0003131567348860108, 'samples': 10771456, 'steps': 21037, 'batch_loss/train': 0.8484548209235072}
12/25/2021 17:53:55 - INFO - codeparrot_training - Step 21038: {'lr': 0.0003131414829230468, 'samples': 10771968, 'steps': 21038, 'batch_loss/train': 0.8192545981146395}
12/25/2021 17:54:07 - INFO - codeparrot_training - Step 21039: {'lr': 0.000313126230709056, 'samples': 10772480, 'steps': 21039, 'batch_loss/train': 0.7208462883718312}
12/25/2021 17:54:18 - INFO - codeparrot_training - Step 21040: {'lr': 0.0003131109782440993, 'samples': 10772992, 'steps': 21040, 'batch_loss/train': 0.7813972309231758}
12/25/2021 17:54:28 - INFO - codeparrot_training - Step 21041: {'lr': 0.00031309572552823704, 'samples': 10773504, 'steps': 21041, 'batch_loss/train': 0.9498893604613841}
12/25/2021 17:54:39 - INFO - codeparrot_training - Step 21042: {'lr': 0.00031308047256153014, 'samples': 10774016, 'steps': 21042, 'batch_loss/train': 0.8056883132085204}
12/25/2021 17:54:51 - INFO - codeparrot_training - Step 21043: {'lr': 0.000313065219344039, 'samples': 10774528, 'steps': 21043, 'batch_loss/train': 0.7911169033031911}
12/25/2021 17:55:02 - INFO - codeparrot_training - Step 21044: {'lr': 0.00031304996587582434, 'samples': 10775040, 'steps': 21044, 'batch_loss/train': 0.7223036549985409}
12/25/2021 17:55:12 - INFO - codeparrot_training - Step 21045: {'lr': 0.0003130347121569469, 'samples': 10775552, 'steps': 21045, 'batch_loss/train': 0.7769979655276984}
12/25/2021 17:55:26 - INFO - codeparrot_training - Step 21046: {'lr': 0.0003130194581874672, 'samples': 10776064, 'steps': 21046, 'batch_loss/train': 0.702313126064837}
12/25/2021 17:55:37 - INFO - codeparrot_training - Step 21047: {'lr': 0.0003130042039674459, 'samples': 10776576, 'steps': 21047, 'batch_loss/train': 0.7874680310487747}
12/25/2021 17:55:47 - INFO - codeparrot_training - Step 21048: {'lr': 0.0003129889494969438, 'samples': 10777088, 'steps': 21048, 'batch_loss/train': 0.8488950226455927}
12/25/2021 17:55:59 - INFO - codeparrot_training - Step 21049: {'lr': 0.00031297369477602125, 'samples': 10777600, 'steps': 21049, 'batch_loss/train': 0.7562400372698903}
12/25/2021 17:56:10 - INFO - codeparrot_training - Step 21050: {'lr': 0.0003129584398047392, 'samples': 10778112, 'steps': 21050, 'batch_loss/train': 0.8514187317341566}
12/25/2021 17:56:20 - INFO - codeparrot_training - Step 21051: {'lr': 0.0003129431845831581, 'samples': 10778624, 'steps': 21051, 'batch_loss/train': 0.8117073960602283}
12/25/2021 17:56:31 - INFO - codeparrot_training - Step 21052: {'lr': 0.00031292792911133865, 'samples': 10779136, 'steps': 21052, 'batch_loss/train': 0.7671935213729739}
12/25/2021 17:56:43 - INFO - codeparrot_training - Step 21053: {'lr': 0.0003129126733893416, 'samples': 10779648, 'steps': 21053, 'batch_loss/train': 1.1588633032515645}
12/25/2021 17:56:54 - INFO - codeparrot_training - Step 21054: {'lr': 0.00031289741741722743, 'samples': 10780160, 'steps': 21054, 'batch_loss/train': 1.092694423510693}
12/25/2021 17:57:04 - INFO - codeparrot_training - Step 21055: {'lr': 0.000312882161195057, 'samples': 10780672, 'steps': 21055, 'batch_loss/train': 0.8065482638776302}
12/25/2021 17:57:18 - INFO - codeparrot_training - Step 21056: {'lr': 0.00031286690472289065, 'samples': 10781184, 'steps': 21056, 'batch_loss/train': 0.8190050777047873}
12/25/2021 17:57:29 - INFO - codeparrot_training - Step 21057: {'lr': 0.00031285164800078937, 'samples': 10781696, 'steps': 21057, 'batch_loss/train': 0.8575726849958301}
12/25/2021 17:57:39 - INFO - codeparrot_training - Step 21058: {'lr': 0.00031283639102881353, 'samples': 10782208, 'steps': 21058, 'batch_loss/train': 0.8159942822530866}
12/25/2021 17:57:51 - INFO - codeparrot_training - Step 21059: {'lr': 0.0003128211338070241, 'samples': 10782720, 'steps': 21059, 'batch_loss/train': 0.7713777744211257}
12/25/2021 17:58:02 - INFO - codeparrot_training - Step 21060: {'lr': 0.0003128058763354815, 'samples': 10783232, 'steps': 21060, 'batch_loss/train': 0.8476803442463279}
12/25/2021 17:58:13 - INFO - codeparrot_training - Step 21061: {'lr': 0.0003127906186142464, 'samples': 10783744, 'steps': 21061, 'batch_loss/train': 0.8131040195003152}
12/25/2021 17:58:23 - INFO - codeparrot_training - Step 21062: {'lr': 0.0003127753606433795, 'samples': 10784256, 'steps': 21062, 'batch_loss/train': 0.7162399087101221}
12/25/2021 17:58:36 - INFO - codeparrot_training - Step 21063: {'lr': 0.0003127601024229416, 'samples': 10784768, 'steps': 21063, 'batch_loss/train': 0.8642041767016053}
12/25/2021 17:58:46 - INFO - codeparrot_training - Step 21064: {'lr': 0.00031274484395299305, 'samples': 10785280, 'steps': 21064, 'batch_loss/train': 0.7230834318324924}
12/25/2021 17:58:57 - INFO - codeparrot_training - Step 21065: {'lr': 0.0003127295852335948, 'samples': 10785792, 'steps': 21065, 'batch_loss/train': 0.7798186461441219}
12/25/2021 17:59:09 - INFO - codeparrot_training - Step 21066: {'lr': 0.0003127143262648074, 'samples': 10786304, 'steps': 21066, 'batch_loss/train': 1.0325191393494606}
12/25/2021 17:59:20 - INFO - codeparrot_training - Step 21067: {'lr': 0.0003126990670466915, 'samples': 10786816, 'steps': 21067, 'batch_loss/train': 0.8439769083634019}
12/25/2021 17:59:30 - INFO - codeparrot_training - Step 21068: {'lr': 0.0003126838075793078, 'samples': 10787328, 'steps': 21068, 'batch_loss/train': 0.7653259597718716}
12/25/2021 17:59:44 - INFO - codeparrot_training - Step 21069: {'lr': 0.00031266854786271694, 'samples': 10787840, 'steps': 21069, 'batch_loss/train': 0.761144146323204}
12/25/2021 17:59:55 - INFO - codeparrot_training - Step 21070: {'lr': 0.0003126532878969796, 'samples': 10788352, 'steps': 21070, 'batch_loss/train': 0.8026068359613419}
12/25/2021 18:00:05 - INFO - codeparrot_training - Step 21071: {'lr': 0.00031263802768215644, 'samples': 10788864, 'steps': 21071, 'batch_loss/train': 0.7757658008486032}
12/25/2021 18:00:16 - INFO - codeparrot_training - Step 21072: {'lr': 0.0003126227672183081, 'samples': 10789376, 'steps': 21072, 'batch_loss/train': 0.7468495129141957}
12/25/2021 18:00:28 - INFO - codeparrot_training - Step 21073: {'lr': 0.00031260750650549527, 'samples': 10789888, 'steps': 21073, 'batch_loss/train': 0.7542937528342009}
12/25/2021 18:00:39 - INFO - codeparrot_training - Step 21074: {'lr': 0.0003125922455437787, 'samples': 10790400, 'steps': 21074, 'batch_loss/train': 0.8343564514070749}
12/25/2021 18:00:49 - INFO - codeparrot_training - Step 21075: {'lr': 0.000312576984333219, 'samples': 10790912, 'steps': 21075, 'batch_loss/train': 0.7270121495239437}
12/25/2021 18:01:03 - INFO - codeparrot_training - Step 21076: {'lr': 0.00031256172287387685, 'samples': 10791424, 'steps': 21076, 'batch_loss/train': 0.7495823446661234}
12/25/2021 18:01:14 - INFO
- codeparrot_training - Step 21077: {'lr': 0.00031254646116581287, 'samples': 10791936, 'steps': 21077, 'batch_loss/train': 0.7206234066397883} 12/25/2021 18:01:24 - INFO - codeparrot_training - Step 21078: {'lr': 0.0003125311992090878, 'samples': 10792448, 'steps': 21078, 'batch_loss/train': 0.7831077435985208} 12/25/2021 18:01:36 - INFO - codeparrot_training - Step 21079: {'lr': 0.00031251593700376226, 'samples': 10792960, 'steps': 21079, 'batch_loss/train': 0.8591587119735777} 12/25/2021 18:01:47 - INFO - codeparrot_training - Step 21080: {'lr': 0.00031250067454989696, 'samples': 10793472, 'steps': 21080, 'batch_loss/train': 0.7956735440529883} 12/25/2021 18:01:57 - INFO - codeparrot_training - Step 21081: {'lr': 0.0003124854118475526, 'samples': 10793984, 'steps': 21081, 'batch_loss/train': 0.9264197209849954} 12/25/2021 18:02:08 - INFO - codeparrot_training - Step 21082: {'lr': 0.0003124701488967898, 'samples': 10794496, 'steps': 21082, 'batch_loss/train': 0.6325657598208636} 12/25/2021 18:02:20 - INFO - codeparrot_training - Step 21083: {'lr': 0.00031245488569766933, 'samples': 10795008, 'steps': 21083, 'batch_loss/train': 0.7905158515204675} 12/25/2021 18:02:31 - INFO - codeparrot_training - Step 21084: {'lr': 0.0003124396222502518, 'samples': 10795520, 'steps': 21084, 'batch_loss/train': 0.753379849717021} 12/25/2021 18:02:41 - INFO - codeparrot_training - Step 21085: {'lr': 0.0003124243585545979, 'samples': 10796032, 'steps': 21085, 'batch_loss/train': 0.7988245971500874} 12/25/2021 18:02:53 - INFO - codeparrot_training - Step 21086: {'lr': 0.00031240909461076826, 'samples': 10796544, 'steps': 21086, 'batch_loss/train': 0.5916313249617815} 12/25/2021 18:03:04 - INFO - codeparrot_training - Step 21087: {'lr': 0.00031239383041882377, 'samples': 10797056, 'steps': 21087, 'batch_loss/train': 0.8070811798097566} 12/25/2021 18:03:15 - INFO - codeparrot_training - Step 21088: {'lr': 0.0003123785659788248, 'samples': 10797568, 'steps': 21088, 'batch_loss/train': 
0.6976904779439792} 12/25/2021 18:03:29 - INFO - codeparrot_training - Step 21089: {'lr': 0.0003123633012908323, 'samples': 10798080, 'steps': 21089, 'batch_loss/train': 0.9139987416565418} 12/25/2021 18:03:40 - INFO - codeparrot_training - Step 21090: {'lr': 0.0003123480363549069, 'samples': 10798592, 'steps': 21090, 'batch_loss/train': 0.7726110308431089} 12/25/2021 18:03:51 - INFO - codeparrot_training - Step 21091: {'lr': 0.00031233277117110914, 'samples': 10799104, 'steps': 21091, 'batch_loss/train': 0.5441053616814315} 12/25/2021 18:04:01 - INFO - codeparrot_training - Step 21092: {'lr': 0.0003123175057394999, 'samples': 10799616, 'steps': 21092, 'batch_loss/train': 0.7129050651565194} 12/25/2021 18:04:14 - INFO - codeparrot_training - Step 21093: {'lr': 0.00031230224006013977, 'samples': 10800128, 'steps': 21093, 'batch_loss/train': 0.7695477484958246} 12/25/2021 18:04:24 - INFO - codeparrot_training - Step 21094: {'lr': 0.0003122869741330894, 'samples': 10800640, 'steps': 21094, 'batch_loss/train': 0.7914838080760092} 12/25/2021 18:04:35 - INFO - codeparrot_training - Step 21095: {'lr': 0.0003122717079584096, 'samples': 10801152, 'steps': 21095, 'batch_loss/train': 0.743922954890877} 12/25/2021 18:04:49 - INFO - codeparrot_training - Step 21096: {'lr': 0.00031225644153616105, 'samples': 10801664, 'steps': 21096, 'batch_loss/train': 0.7480635498650372} 12/25/2021 18:04:59 - INFO - codeparrot_training - Step 21097: {'lr': 0.00031224117486640434, 'samples': 10802176, 'steps': 21097, 'batch_loss/train': 0.7947740331292152} 12/25/2021 18:05:10 - INFO - codeparrot_training - Step 21098: {'lr': 0.0003122259079492002, 'samples': 10802688, 'steps': 21098, 'batch_loss/train': 0.7891149232164025} 12/25/2021 18:05:22 - INFO - codeparrot_training - Step 21099: {'lr': 0.00031221064078460934, 'samples': 10803200, 'steps': 21099, 'batch_loss/train': 0.7715819351142272} 12/25/2021 18:05:33 - INFO - codeparrot_training - Step 21100: {'lr': 0.00031219537337269255, 'samples': 
10803712, 'steps': 21100, 'batch_loss/train': 0.7706480300985277} 12/25/2021 18:05:44 - INFO - codeparrot_training - Step 21101: {'lr': 0.0003121801057135103, 'samples': 10804224, 'steps': 21101, 'batch_loss/train': 0.7832375535508618} 12/25/2021 18:05:54 - INFO - codeparrot_training - Step 21102: {'lr': 0.0003121648378071235, 'samples': 10804736, 'steps': 21102, 'batch_loss/train': 0.6897115878527984} 12/25/2021 18:06:06 - INFO - codeparrot_training - Step 21103: {'lr': 0.0003121495696535928, 'samples': 10805248, 'steps': 21103, 'batch_loss/train': 0.9616347495466471} 12/25/2021 18:06:17 - INFO - codeparrot_training - Step 21104: {'lr': 0.0003121343012529788, 'samples': 10805760, 'steps': 21104, 'batch_loss/train': 0.7671171752735972} 12/25/2021 18:06:28 - INFO - codeparrot_training - Step 21105: {'lr': 0.0003121190326053424, 'samples': 10806272, 'steps': 21105, 'batch_loss/train': 0.7886057095602155} 12/25/2021 18:06:42 - INFO - codeparrot_training - Step 21106: {'lr': 0.0003121037637107441, 'samples': 10806784, 'steps': 21106, 'batch_loss/train': 0.7666667951270938} 12/25/2021 18:06:52 - INFO - codeparrot_training - Step 21107: {'lr': 0.0003120884945692447, 'samples': 10807296, 'steps': 21107, 'batch_loss/train': 0.7582380063831806} 12/25/2021 18:07:03 - INFO - codeparrot_training - Step 21108: {'lr': 0.0003120732251809049, 'samples': 10807808, 'steps': 21108, 'batch_loss/train': 0.8354140522424132} 12/25/2021 18:07:15 - INFO - codeparrot_training - Step 21109: {'lr': 0.0003120579555457854, 'samples': 10808320, 'steps': 21109, 'batch_loss/train': 0.7919439887627959} 12/25/2021 18:07:26 - INFO - codeparrot_training - Step 21110: {'lr': 0.00031204268566394696, 'samples': 10808832, 'steps': 21110, 'batch_loss/train': 0.6757994771469384} 12/25/2021 18:07:37 - INFO - codeparrot_training - Step 21111: {'lr': 0.0003120274155354501, 'samples': 10809344, 'steps': 21111, 'batch_loss/train': 0.7386413346976042} 12/25/2021 18:07:48 - INFO - codeparrot_training - Step 21112: 
{'lr': 0.0003120121451603557, 'samples': 10809856, 'steps': 21112, 'batch_loss/train': 0.5227747480385005} 12/25/2021 18:07:59 - INFO - codeparrot_training - Step 21113: {'lr': 0.0003119968745387245, 'samples': 10810368, 'steps': 21113, 'batch_loss/train': 0.7029417706653476} 12/25/2021 18:08:10 - INFO - codeparrot_training - Step 21114: {'lr': 0.000311981603670617, 'samples': 10810880, 'steps': 21114, 'batch_loss/train': 0.7718703201971948} 12/25/2021 18:08:20 - INFO - codeparrot_training - Step 21115: {'lr': 0.00031196633255609415, 'samples': 10811392, 'steps': 21115, 'batch_loss/train': 0.7826337567530572} 12/25/2021 18:08:34 - INFO - codeparrot_training - Step 21116: {'lr': 0.0003119510611952165, 'samples': 10811904, 'steps': 21116, 'batch_loss/train': 0.8399809086695313} 12/25/2021 18:08:45 - INFO - codeparrot_training - Step 21117: {'lr': 0.00031193578958804485, 'samples': 10812416, 'steps': 21117, 'batch_loss/train': 0.8544343030080199} 12/25/2021 18:08:55 - INFO - codeparrot_training - Step 21118: {'lr': 0.00031192051773463997, 'samples': 10812928, 'steps': 21118, 'batch_loss/train': 0.8008981458842754} 12/25/2021 18:09:08 - INFO - codeparrot_training - Step 21119: {'lr': 0.0003119052456350624, 'samples': 10813440, 'steps': 21119, 'batch_loss/train': 0.8831020267680287} 12/25/2021 18:09:18 - INFO - codeparrot_training - Step 21120: {'lr': 0.00031188997328937296, 'samples': 10813952, 'steps': 21120, 'batch_loss/train': 0.7685265748295933} 12/25/2021 18:09:29 - INFO - codeparrot_training - Step 21121: {'lr': 0.00031187470069763237, 'samples': 10814464, 'steps': 21121, 'batch_loss/train': 0.8963340222835541} 12/25/2021 18:09:43 - INFO - codeparrot_training - Step 21122: {'lr': 0.00031185942785990137, 'samples': 10814976, 'steps': 21122, 'batch_loss/train': 0.9038263475522399} 12/25/2021 18:09:53 - INFO - codeparrot_training - Step 21123: {'lr': 0.0003118441547762405, 'samples': 10815488, 'steps': 21123, 'batch_loss/train': 0.8480308894068003} 12/25/2021 
18:10:04 - INFO - codeparrot_training - Step 21124: {'lr': 0.00031182888144671076, 'samples': 10816000, 'steps': 21124, 'batch_loss/train': 0.8172612222842872} 12/25/2021 18:10:15 - INFO - codeparrot_training - Step 21125: {'lr': 0.0003118136078713726, 'samples': 10816512, 'steps': 21125, 'batch_loss/train': 0.8859287230297923} 12/25/2021 18:10:27 - INFO - codeparrot_training - Step 21126: {'lr': 0.000311798334050287, 'samples': 10817024, 'steps': 21126, 'batch_loss/train': 0.7072918168269098} 12/25/2021 18:10:37 - INFO - codeparrot_training - Step 21127: {'lr': 0.0003117830599835145, 'samples': 10817536, 'steps': 21127, 'batch_loss/train': 0.861964332871139} 12/25/2021 18:10:48 - INFO - codeparrot_training - Step 21128: {'lr': 0.0003117677856711159, 'samples': 10818048, 'steps': 21128, 'batch_loss/train': 0.7929306393489242} 12/25/2021 18:11:00 - INFO - codeparrot_training - Step 21129: {'lr': 0.00031175251111315196, 'samples': 10818560, 'steps': 21129, 'batch_loss/train': 0.8085124511271715} 12/25/2021 18:11:11 - INFO - codeparrot_training - Step 21130: {'lr': 0.0003117372363096833, 'samples': 10819072, 'steps': 21130, 'batch_loss/train': 0.7555899727158248} 12/25/2021 18:11:21 - INFO - codeparrot_training - Step 21131: {'lr': 0.0003117219612607707, 'samples': 10819584, 'steps': 21131, 'batch_loss/train': 0.7999111773679033} 12/25/2021 18:11:33 - INFO - codeparrot_training - Step 21132: {'lr': 0.0003117066859664749, 'samples': 10820096, 'steps': 21132, 'batch_loss/train': 0.781915158033371} 12/25/2021 18:11:44 - INFO - codeparrot_training - Step 21133: {'lr': 0.00031169141042685657, 'samples': 10820608, 'steps': 21133, 'batch_loss/train': 0.8366254325956106} 12/25/2021 18:11:54 - INFO - codeparrot_training - Step 21134: {'lr': 0.00031167613464197654, 'samples': 10821120, 'steps': 21134, 'batch_loss/train': 0.7642814312130213} 12/25/2021 18:12:05 - INFO - codeparrot_training - Step 21135: {'lr': 0.00031166085861189546, 'samples': 10821632, 'steps': 21135, 
'batch_loss/train': 0.705956248100847} 12/25/2021 18:12:19 - INFO - codeparrot_training - Step 21136: {'lr': 0.0003116455823366741, 'samples': 10822144, 'steps': 21136, 'batch_loss/train': 0.7338276060763747} 12/25/2021 18:12:30 - INFO - codeparrot_training - Step 21137: {'lr': 0.00031163030581637317, 'samples': 10822656, 'steps': 21137, 'batch_loss/train': 0.8981343306368217} 12/25/2021 18:12:40 - INFO - codeparrot_training - Step 21138: {'lr': 0.00031161502905105345, 'samples': 10823168, 'steps': 21138, 'batch_loss/train': 0.8177247690036893} 12/25/2021 18:12:52 - INFO - codeparrot_training - Step 21139: {'lr': 0.00031159975204077554, 'samples': 10823680, 'steps': 21139, 'batch_loss/train': 0.7768886615522206} 12/25/2021 18:13:03 - INFO - codeparrot_training - Step 21140: {'lr': 0.0003115844747856004, 'samples': 10824192, 'steps': 21140, 'batch_loss/train': 0.6946512688882649} 12/25/2021 18:13:14 - INFO - codeparrot_training - Step 21141: {'lr': 0.0003115691972855885, 'samples': 10824704, 'steps': 21141, 'batch_loss/train': 0.6869699938688427} 12/25/2021 18:13:26 - INFO - codeparrot_training - Step 21142: {'lr': 0.00031155391954080093, 'samples': 10825216, 'steps': 21142, 'batch_loss/train': 0.749017502181232} 12/25/2021 18:13:36 - INFO - codeparrot_training - Step 21143: {'lr': 0.000311538641551298, 'samples': 10825728, 'steps': 21143, 'batch_loss/train': 0.7668589907698333} 12/25/2021 18:13:47 - INFO - codeparrot_training - Step 21144: {'lr': 0.0003115233633171408, 'samples': 10826240, 'steps': 21144, 'batch_loss/train': 0.7355698239989579} 12/25/2021 18:13:57 - INFO - codeparrot_training - Step 21145: {'lr': 0.0003115080848383899, 'samples': 10826752, 'steps': 21145, 'batch_loss/train': 0.8714522905647755} 12/25/2021 18:14:11 - INFO - codeparrot_training - Step 21146: {'lr': 0.00031149280611510606, 'samples': 10827264, 'steps': 21146, 'batch_loss/train': 0.8041241019964218} 12/25/2021 18:14:22 - INFO - codeparrot_training - Step 21147: {'lr': 
0.0003114775271473501, 'samples': 10827776, 'steps': 21147, 'batch_loss/train': 0.9109010975807905} 12/25/2021 18:14:33 - INFO - codeparrot_training - Step 21148: {'lr': 0.00031146224793518267, 'samples': 10828288, 'steps': 21148, 'batch_loss/train': 0.8404353708028793} 12/25/2021 18:14:45 - INFO - codeparrot_training - Step 21149: {'lr': 0.0003114469684786645, 'samples': 10828800, 'steps': 21149, 'batch_loss/train': 0.8288251319900155} 12/25/2021 18:14:55 - INFO - codeparrot_training - Step 21150: {'lr': 0.00031143168877785645, 'samples': 10829312, 'steps': 21150, 'batch_loss/train': 0.9395424854010344} 12/25/2021 18:15:06 - INFO - codeparrot_training - Step 21151: {'lr': 0.0003114164088328192, 'samples': 10829824, 'steps': 21151, 'batch_loss/train': 0.9245664358604699} 12/25/2021 18:15:20 - INFO - codeparrot_training - Step 21152: {'lr': 0.0003114011286436135, 'samples': 10830336, 'steps': 21152, 'batch_loss/train': 0.7739222999662161} 12/25/2021 18:15:31 - INFO - codeparrot_training - Step 21153: {'lr': 0.0003113858482103, 'samples': 10830848, 'steps': 21153, 'batch_loss/train': 0.8205440612509847} 12/25/2021 18:15:41 - INFO - codeparrot_training - Step 21154: {'lr': 0.00031137056753293964, 'samples': 10831360, 'steps': 21154, 'batch_loss/train': 0.8269610549323261} 12/25/2021 18:15:53 - INFO - codeparrot_training - Step 21155: {'lr': 0.0003113552866115931, 'samples': 10831872, 'steps': 21155, 'batch_loss/train': 0.8629821008071303} 12/25/2021 18:16:04 - INFO - codeparrot_training - Step 21156: {'lr': 0.00031134000544632103, 'samples': 10832384, 'steps': 21156, 'batch_loss/train': 0.7965078102424741} 12/25/2021 18:16:14 - INFO - codeparrot_training - Step 21157: {'lr': 0.0003113247240371843, 'samples': 10832896, 'steps': 21157, 'batch_loss/train': 0.7824748940765858} 12/25/2021 18:16:25 - INFO - codeparrot_training - Step 21158: {'lr': 0.0003113094423842436, 'samples': 10833408, 'steps': 21158, 'batch_loss/train': 0.7662981385365129} 12/25/2021 18:16:37 - INFO - 
codeparrot_training - Step 21159: {'lr': 0.00031129416048755976, 'samples': 10833920, 'steps': 21159, 'batch_loss/train': 0.8099019043147564} 12/25/2021 18:16:48 - INFO - codeparrot_training - Step 21160: {'lr': 0.00031127887834719344, 'samples': 10834432, 'steps': 21160, 'batch_loss/train': 0.8895755056291819} 12/25/2021 18:16:58 - INFO - codeparrot_training - Step 21161: {'lr': 0.0003112635959632055, 'samples': 10834944, 'steps': 21161, 'batch_loss/train': 0.7807073127478361} 12/25/2021 18:17:12 - INFO - codeparrot_training - Step 21162: {'lr': 0.0003112483133356565, 'samples': 10835456, 'steps': 21162, 'batch_loss/train': 0.7499191029928625} 12/25/2021 18:17:23 - INFO - codeparrot_training - Step 21163: {'lr': 0.0003112330304646074, 'samples': 10835968, 'steps': 21163, 'batch_loss/train': 0.8518969099968672} 12/25/2021 18:17:33 - INFO - codeparrot_training - Step 21164: {'lr': 0.00031121774735011894, 'samples': 10836480, 'steps': 21164, 'batch_loss/train': 0.8953264961019158} 12/25/2021 18:17:45 - INFO - codeparrot_training - Step 21165: {'lr': 0.0003112024639922518, 'samples': 10836992, 'steps': 21165, 'batch_loss/train': 0.7377715506590903} 12/25/2021 18:17:56 - INFO - codeparrot_training - Step 21166: {'lr': 0.00031118718039106676, 'samples': 10837504, 'steps': 21166, 'batch_loss/train': 0.9026155676692724} 12/25/2021 18:18:06 - INFO - codeparrot_training - Step 21167: {'lr': 0.0003111718965466246, 'samples': 10838016, 'steps': 21167, 'batch_loss/train': 0.7145317769609392} 12/25/2021 18:18:17 - INFO - codeparrot_training - Step 21168: {'lr': 0.0003111566124589861, 'samples': 10838528, 'steps': 21168, 'batch_loss/train': 0.746454288251698} 12/25/2021 18:18:29 - INFO - codeparrot_training - Step 21169: {'lr': 0.0003111413281282119, 'samples': 10839040, 'steps': 21169, 'batch_loss/train': 0.7482721892884001} 12/25/2021 18:18:40 - INFO - codeparrot_training - Step 21170: {'lr': 0.00031112604355436297, 'samples': 10839552, 'steps': 21170, 'batch_loss/train': 
0.7537987055256963} 12/25/2021 18:18:51 - INFO - codeparrot_training - Step 21171: {'lr': 0.00031111075873749993, 'samples': 10840064, 'steps': 21171, 'batch_loss/train': 0.820670054294169} 12/25/2021 18:19:04 - INFO - codeparrot_training - Step 21172: {'lr': 0.00031109547367768367, 'samples': 10840576, 'steps': 21172, 'batch_loss/train': 0.7702312618494034} 12/25/2021 18:19:15 - INFO - codeparrot_training - Step 21173: {'lr': 0.0003110801883749748, 'samples': 10841088, 'steps': 21173, 'batch_loss/train': 0.750069385394454} 12/25/2021 18:19:25 - INFO - codeparrot_training - Step 21174: {'lr': 0.00031106490282943415, 'samples': 10841600, 'steps': 21174, 'batch_loss/train': 0.806127046700567} 12/25/2021 18:19:37 - INFO - codeparrot_training - Step 21175: {'lr': 0.00031104961704112244, 'samples': 10842112, 'steps': 21175, 'batch_loss/train': 0.8019031565636396} 12/25/2021 18:19:48 - INFO - codeparrot_training - Step 21176: {'lr': 0.0003110343310101006, 'samples': 10842624, 'steps': 21176, 'batch_loss/train': 0.8045588720124215} 12/25/2021 18:19:59 - INFO - codeparrot_training - Step 21177: {'lr': 0.00031101904473642935, 'samples': 10843136, 'steps': 21177, 'batch_loss/train': 0.7989874060731381} 12/25/2021 18:20:09 - INFO - codeparrot_training - Step 21178: {'lr': 0.00031100375822016934, 'samples': 10843648, 'steps': 21178, 'batch_loss/train': 0.8054421525448561} 12/25/2021 18:20:22 - INFO - codeparrot_training - Step 21179: {'lr': 0.00031098847146138143, 'samples': 10844160, 'steps': 21179, 'batch_loss/train': 0.7934329046402127} 12/25/2021 18:20:32 - INFO - codeparrot_training - Step 21180: {'lr': 0.0003109731844601264, 'samples': 10844672, 'steps': 21180, 'batch_loss/train': 0.9145090603269637} 12/25/2021 18:20:43 - INFO - codeparrot_training - Step 21181: {'lr': 0.0003109578972164651, 'samples': 10845184, 'steps': 21181, 'batch_loss/train': 0.793787736678496} 12/25/2021 18:20:57 - INFO - codeparrot_training - Step 21182: {'lr': 0.00031094260973045805, 'samples': 
10845696, 'steps': 21182, 'batch_loss/train': 0.6298931795172393} 12/25/2021 18:21:07 - INFO - codeparrot_training - Step 21183: {'lr': 0.0003109273220021663, 'samples': 10846208, 'steps': 21183, 'batch_loss/train': 0.7663156930357218} 12/25/2021 18:21:18 - INFO - codeparrot_training - Step 21184: {'lr': 0.0003109120340316505, 'samples': 10846720, 'steps': 21184, 'batch_loss/train': 0.6696853805333376} 12/25/2021 18:21:30 - INFO - codeparrot_training - Step 21185: {'lr': 0.00031089674581897143, 'samples': 10847232, 'steps': 21185, 'batch_loss/train': 0.7517464682459831} 12/25/2021 18:21:41 - INFO - codeparrot_training - Step 21186: {'lr': 0.0003108814573641899, 'samples': 10847744, 'steps': 21186, 'batch_loss/train': 0.8598740054294467} 12/25/2021 18:21:51 - INFO - codeparrot_training - Step 21187: {'lr': 0.0003108661686673667, 'samples': 10848256, 'steps': 21187, 'batch_loss/train': 0.7304942994378507} 12/25/2021 18:22:02 - INFO - codeparrot_training - Step 21188: {'lr': 0.00031085087972856256, 'samples': 10848768, 'steps': 21188, 'batch_loss/train': 0.8568593710660934} 12/25/2021 18:22:14 - INFO - codeparrot_training - Step 21189: {'lr': 0.0003108355905478383, 'samples': 10849280, 'steps': 21189, 'batch_loss/train': 0.8427406288683414} 12/25/2021 18:22:24 - INFO - codeparrot_training - Step 21190: {'lr': 0.00031082030112525484, 'samples': 10849792, 'steps': 21190, 'batch_loss/train': 0.7867625079816207} 12/25/2021 18:22:35 - INFO - codeparrot_training - Step 21191: {'lr': 0.0003108050114608727, 'samples': 10850304, 'steps': 21191, 'batch_loss/train': 0.8250701576471329} 12/25/2021 18:22:49 - INFO - codeparrot_training - Step 21192: {'lr': 0.00031078972155475275, 'samples': 10850816, 'steps': 21192, 'batch_loss/train': 0.7846110030077398} 12/25/2021 18:22:59 - INFO - codeparrot_training - Step 21193: {'lr': 0.00031077443140695594, 'samples': 10851328, 'steps': 21193, 'batch_loss/train': 0.9214900089427829} 12/25/2021 18:23:10 - INFO - codeparrot_training - Step 
21194: {'lr': 0.00031075914101754283, 'samples': 10851840, 'steps': 21194, 'batch_loss/train': 0.756954996380955} 12/25/2021 18:23:22 - INFO - codeparrot_training - Step 21195: {'lr': 0.00031074385038657435, 'samples': 10852352, 'steps': 21195, 'batch_loss/train': 0.7765986751765013} 12/25/2021 18:23:33 - INFO - codeparrot_training - Step 21196: {'lr': 0.00031072855951411134, 'samples': 10852864, 'steps': 21196, 'batch_loss/train': 0.7976712157251313} 12/25/2021 18:23:43 - INFO - codeparrot_training - Step 21197: {'lr': 0.0003107132684002144, 'samples': 10853376, 'steps': 21197, 'batch_loss/train': 0.8353700414299965} 12/25/2021 18:23:57 - INFO - codeparrot_training - Step 21198: {'lr': 0.00031069797704494454, 'samples': 10853888, 'steps': 21198, 'batch_loss/train': 0.8402393199503422} 12/25/2021 18:24:08 - INFO - codeparrot_training - Step 21199: {'lr': 0.0003106826854483623, 'samples': 10854400, 'steps': 21199, 'batch_loss/train': 0.7101958962157369} 12/25/2021 18:24:18 - INFO - codeparrot_training - Step 21200: {'lr': 0.0003106673936105287, 'samples': 10854912, 'steps': 21200, 'batch_loss/train': 0.7255009841173887} 12/25/2021 18:24:29 - INFO - codeparrot_training - Step 21201: {'lr': 0.0003106521015315044, 'samples': 10855424, 'steps': 21201, 'batch_loss/train': 0.8052193569019437} 12/25/2021 18:24:41 - INFO - codeparrot_training - Step 21202: {'lr': 0.0003106368092113503, 'samples': 10855936, 'steps': 21202, 'batch_loss/train': 0.6974898690823466} 12/25/2021 18:24:52 - INFO - codeparrot_training - Step 21203: {'lr': 0.00031062151665012716, 'samples': 10856448, 'steps': 21203, 'batch_loss/train': 0.8009270913898945} 12/25/2021 18:25:02 - INFO - codeparrot_training - Step 21204: {'lr': 0.00031060622384789573, 'samples': 10856960, 'steps': 21204, 'batch_loss/train': 0.7646315483143553} 12/25/2021 18:25:14 - INFO - codeparrot_training - Step 21205: {'lr': 0.00031059093080471685, 'samples': 10857472, 'steps': 21205, 'batch_loss/train': 0.8477187184616923} 
12/25/2021 18:25:25 - INFO - codeparrot_training - Step 21206: {'lr': 0.0003105756375206513, 'samples': 10857984, 'steps': 21206, 'batch_loss/train': 0.7899513052543625} 12/25/2021 18:25:36 - INFO - codeparrot_training - Step 21207: {'lr': 0.0003105603439957598, 'samples': 10858496, 'steps': 21207, 'batch_loss/train': 0.8331679627299309} 12/25/2021 18:25:48 - INFO - codeparrot_training - Step 21208: {'lr': 0.00031054505023010335, 'samples': 10859008, 'steps': 21208, 'batch_loss/train': 0.767692347522825} 12/25/2021 18:25:58 - INFO - codeparrot_training - Step 21209: {'lr': 0.0003105297562237426, 'samples': 10859520, 'steps': 21209, 'batch_loss/train': 0.7842717608436942} 12/25/2021 18:26:09 - INFO - codeparrot_training - Step 21210: {'lr': 0.00031051446197673846, 'samples': 10860032, 'steps': 21210, 'batch_loss/train': 0.8678265316411853} 12/25/2021 18:26:19 - INFO - codeparrot_training - Step 21211: {'lr': 0.00031049916748915156, 'samples': 10860544, 'steps': 21211, 'batch_loss/train': 0.766386684961617} 12/25/2021 18:26:34 - INFO - codeparrot_training - Step 21212: {'lr': 0.0003104838727610429, 'samples': 10861056, 'steps': 21212, 'batch_loss/train': 0.7881709379144013} 12/25/2021 18:26:44 - INFO - codeparrot_training - Step 21213: {'lr': 0.00031046857779247315, 'samples': 10861568, 'steps': 21213, 'batch_loss/train': 0.8445006730034947} 12/25/2021 18:26:55 - INFO - codeparrot_training - Step 21214: {'lr': 0.0003104532825835032, 'samples': 10862080, 'steps': 21214, 'batch_loss/train': 0.8232757351361215} 12/25/2021 18:27:07 - INFO - codeparrot_training - Step 21215: {'lr': 0.0003104379871341938, 'samples': 10862592, 'steps': 21215, 'batch_loss/train': 0.918127027573064} 12/25/2021 18:27:17 - INFO - codeparrot_training - Step 21216: {'lr': 0.0003104226914446058, 'samples': 10863104, 'steps': 21216, 'batch_loss/train': 0.7886214246973395} 12/25/2021 18:27:28 - INFO - codeparrot_training - Step 21217: {'lr': 0.0003104073955147999, 'samples': 10863616, 'steps': 
21217, 'batch_loss/train': 0.8388323355466127} 12/25/2021 18:27:40 - INFO - codeparrot_training - Step 21218: {'lr': 0.00031039209934483714, 'samples': 10864128, 'steps': 21218, 'batch_loss/train': 0.8163379766047001} 12/25/2021 18:27:51 - INFO - codeparrot_training - Step 21219: {'lr': 0.00031037680293477814, 'samples': 10864640, 'steps': 21219, 'batch_loss/train': 0.756906776688993} 12/25/2021 18:28:01 - INFO - codeparrot_training - Step 21220: {'lr': 0.0003103615062846838, 'samples': 10865152, 'steps': 21220, 'batch_loss/train': 0.7208876758813858} 12/25/2021 18:28:12 - INFO - codeparrot_training - Step 21221: {'lr': 0.0003103462093946149, 'samples': 10865664, 'steps': 21221, 'batch_loss/train': 0.8540738103911281} 12/25/2021 18:28:26 - INFO - codeparrot_training - Step 21222: {'lr': 0.0003103309122646322, 'samples': 10866176, 'steps': 21222, 'batch_loss/train': 0.7869669201318175} 12/25/2021 18:28:36 - INFO - codeparrot_training - Step 21223: {'lr': 0.0003103156148947967, 'samples': 10866688, 'steps': 21223, 'batch_loss/train': 0.760885582305491} 12/25/2021 18:28:47 - INFO - codeparrot_training - Step 21224: {'lr': 0.0003103003172851689, 'samples': 10867200, 'steps': 21224, 'batch_loss/train': 0.6805126876570284} 12/25/2021 18:28:59 - INFO - codeparrot_training - Step 21225: {'lr': 0.0003102850194358099, 'samples': 10867712, 'steps': 21225, 'batch_loss/train': 0.7090739496052265} 12/25/2021 18:29:10 - INFO - codeparrot_training - Step 21226: {'lr': 0.00031026972134678044, 'samples': 10868224, 'steps': 21226, 'batch_loss/train': 0.7704030564054847} 12/25/2021 18:29:20 - INFO - codeparrot_training - Step 21227: {'lr': 0.00031025442301814134, 'samples': 10868736, 'steps': 21227, 'batch_loss/train': 0.7906185649335384} 12/25/2021 18:29:34 - INFO - codeparrot_training - Step 21228: {'lr': 0.00031023912444995335, 'samples': 10869248, 'steps': 21228, 'batch_loss/train': 0.8120958819054067} 12/25/2021 18:29:45 - INFO - codeparrot_training - Step 21229: {'lr': 
0.0003102238256422774, 'samples': 10869760, 'steps': 21229, 'batch_loss/train': 0.8616431383416057}
12/25/2021 18:29:55 - INFO - codeparrot_training - Step 21230: {'lr': 0.00031020852659517417, 'samples': 10870272, 'steps': 21230, 'batch_loss/train': 0.8037239387631416}
12/25/2021 18:30:06 - INFO - codeparrot_training - Step 21231: {'lr': 0.0003101932273087047, 'samples': 10870784, 'steps': 21231, 'batch_loss/train': 0.7597157908603549}
12/25/2021 18:30:18 - INFO - codeparrot_training - Step 21232: {'lr': 0.00031017792778292966, 'samples': 10871296, 'steps': 21232, 'batch_loss/train': 0.7864271014695987}
12/25/2021 18:30:29 - INFO - codeparrot_training - Step 21233: {'lr': 0.00031016262801790976, 'samples': 10871808, 'steps': 21233, 'batch_loss/train': 0.8187255727825686}
12/25/2021 18:30:39 - INFO - codeparrot_training - Step 21234: {'lr': 0.0003101473280137061, 'samples': 10872320, 'steps': 21234, 'batch_loss/train': 0.8241942441090941}
12/25/2021 18:30:51 - INFO - codeparrot_training - Step 21235: {'lr': 0.0003101320277703794, 'samples': 10872832, 'steps': 21235, 'batch_loss/train': 0.5577925449470058}
12/25/2021 18:31:02 - INFO - codeparrot_training - Step 21236: {'lr': 0.0003101167272879904, 'samples': 10873344, 'steps': 21236, 'batch_loss/train': 0.7487212177366018}
12/25/2021 18:31:13 - INFO - codeparrot_training - Step 21237: {'lr': 0.00031010142656659993, 'samples': 10873856, 'steps': 21237, 'batch_loss/train': 0.8442176897078753}
12/25/2021 18:31:27 - INFO - codeparrot_training - Step 21238: {'lr': 0.00031008612560626904, 'samples': 10874368, 'steps': 21238, 'batch_loss/train': 0.7329771285876632}
12/25/2021 18:31:37 - INFO - codeparrot_training - Step 21239: {'lr': 0.00031007082440705826, 'samples': 10874880, 'steps': 21239, 'batch_loss/train': 0.717415967490524}
12/25/2021 18:31:48 - INFO - codeparrot_training - Step 21240: {'lr': 0.0003100555229690286, 'samples': 10875392, 'steps': 21240, 'batch_loss/train': 0.8166081858798862}
12/25/2021 18:31:59 - INFO - codeparrot_training - Step 21241: {'lr': 0.0003100402212922409, 'samples': 10875904, 'steps': 21241, 'batch_loss/train': 0.7843575868755579}
12/25/2021 18:32:11 - INFO - codeparrot_training - Step 21242: {'lr': 0.000310024919376756, 'samples': 10876416, 'steps': 21242, 'batch_loss/train': 0.8339450573548675}
12/25/2021 18:32:21 - INFO - codeparrot_training - Step 21243: {'lr': 0.00031000961722263447, 'samples': 10876928, 'steps': 21243, 'batch_loss/train': 0.8037578854709864}
12/25/2021 18:32:32 - INFO - codeparrot_training - Step 21244: {'lr': 0.0003099943148299376, 'samples': 10877440, 'steps': 21244, 'batch_loss/train': 0.6957164426567033}
12/25/2021 18:32:44 - INFO - codeparrot_training - Step 21245: {'lr': 0.00030997901219872586, 'samples': 10877952, 'steps': 21245, 'batch_loss/train': 1.5923991622403264}
12/25/2021 18:32:55 - INFO - codeparrot_training - Step 21246: {'lr': 0.00030996370932906024, 'samples': 10878464, 'steps': 21246, 'batch_loss/train': 0.797051636967808}
12/25/2021 18:33:05 - INFO - codeparrot_training - Step 21247: {'lr': 0.00030994840622100155, 'samples': 10878976, 'steps': 21247, 'batch_loss/train': 0.8535949679790065}
12/25/2021 18:33:17 - INFO - codeparrot_training - Step 21248: {'lr': 0.00030993310287461066, 'samples': 10879488, 'steps': 21248, 'batch_loss/train': 0.7172076660208404}
12/25/2021 18:33:28 - INFO - codeparrot_training - Step 21249: {'lr': 0.00030991779928994835, 'samples': 10880000, 'steps': 21249, 'batch_loss/train': 0.750581881031394}
12/25/2021 18:33:39 - INFO - codeparrot_training - Step 21250: {'lr': 0.0003099024954670755, 'samples': 10880512, 'steps': 21250, 'batch_loss/train': 0.8490879023447633}
12/25/2021 18:33:49 - INFO - codeparrot_training - Step 21251: {'lr': 0.00030988719140605296, 'samples': 10881024, 'steps': 21251, 'batch_loss/train': 0.5259569870540872}
12/25/2021 18:34:04 - INFO - codeparrot_training - Step 21252: {'lr': 0.00030987188710694154, 'samples': 10881536, 'steps': 21252, 'batch_loss/train': 0.6776475410442799}
12/25/2021 18:34:14 - INFO - codeparrot_training - Step 21253: {'lr': 0.0003098565825698021, 'samples': 10882048, 'steps': 21253, 'batch_loss/train': 0.6231724822428077}
12/25/2021 18:34:25 - INFO - codeparrot_training - Step 21254: {'lr': 0.0003098412777946955, 'samples': 10882560, 'steps': 21254, 'batch_loss/train': 0.8780254637822509}
12/25/2021 18:34:37 - INFO - codeparrot_training - Step 21255: {'lr': 0.00030982597278168264, 'samples': 10883072, 'steps': 21255, 'batch_loss/train': 0.8659840254113078}
12/25/2021 18:34:48 - INFO - codeparrot_training - Step 21256: {'lr': 0.00030981066753082417, 'samples': 10883584, 'steps': 21256, 'batch_loss/train': 0.8569015199318528}
12/25/2021 18:34:58 - INFO - codeparrot_training - Step 21257: {'lr': 0.0003097953620421812, 'samples': 10884096, 'steps': 21257, 'batch_loss/train': 0.8577764872461557}
12/25/2021 18:35:12 - INFO - codeparrot_training - Step 21258: {'lr': 0.0003097800563158144, 'samples': 10884608, 'steps': 21258, 'batch_loss/train': 0.8554258504882455}
12/25/2021 18:35:23 - INFO - codeparrot_training - Step 21259: {'lr': 0.00030976475035178464, 'samples': 10885120, 'steps': 21259, 'batch_loss/train': 0.8605599701404572}
12/25/2021 18:35:33 - INFO - codeparrot_training - Step 21260: {'lr': 0.00030974944415015275, 'samples': 10885632, 'steps': 21260, 'batch_loss/train': 0.8717638943344355}
12/25/2021 18:35:44 - INFO - codeparrot_training - Step 21261: {'lr': 0.0003097341377109797, 'samples': 10886144, 'steps': 21261, 'batch_loss/train': 0.7391209714114666}
12/25/2021 18:35:56 - INFO - codeparrot_training - Step 21262: {'lr': 0.0003097188310343263, 'samples': 10886656, 'steps': 21262, 'batch_loss/train': 0.707053707446903}
12/25/2021 18:36:07 - INFO - codeparrot_training - Step 21263: {'lr': 0.0003097035241202533, 'samples': 10887168, 'steps': 21263, 'batch_loss/train': 0.8432399723678827}
12/25/2021 18:36:17 - INFO - codeparrot_training - Step 21264: {'lr': 0.0003096882169688218, 'samples': 10887680, 'steps': 21264, 'batch_loss/train': 0.8020638506859541}
12/25/2021 18:36:29 - INFO - codeparrot_training - Step 21265: {'lr': 0.00030967290958009233, 'samples': 10888192, 'steps': 21265, 'batch_loss/train': 0.8895733403041959}
12/25/2021 18:36:40 - INFO - codeparrot_training - Step 21266: {'lr': 0.0003096576019541259, 'samples': 10888704, 'steps': 21266, 'batch_loss/train': 0.7520571704953909}
12/25/2021 18:36:51 - INFO - codeparrot_training - Step 21267: {'lr': 0.0003096422940909834, 'samples': 10889216, 'steps': 21267, 'batch_loss/train': 0.817035298794508}
12/25/2021 18:37:04 - INFO - codeparrot_training - Step 21268: {'lr': 0.0003096269859907257, 'samples': 10889728, 'steps': 21268, 'batch_loss/train': 0.8632746981456876}
12/25/2021 18:37:15 - INFO - codeparrot_training - Step 21269: {'lr': 0.00030961167765341353, 'samples': 10890240, 'steps': 21269, 'batch_loss/train': 0.7927479390054941}
12/25/2021 18:37:26 - INFO - codeparrot_training - Step 21270: {'lr': 0.0003095963690791079, 'samples': 10890752, 'steps': 21270, 'batch_loss/train': 0.8273571762256324}
12/25/2021 18:37:36 - INFO - codeparrot_training - Step 21271: {'lr': 0.00030958106026786965, 'samples': 10891264, 'steps': 21271, 'batch_loss/train': 0.8364191169384867}
12/25/2021 18:37:49 - INFO - codeparrot_training - Step 21272: {'lr': 0.0003095657512197595, 'samples': 10891776, 'steps': 21272, 'batch_loss/train': 0.9061705046333373}
12/25/2021 18:37:59 - INFO - codeparrot_training - Step 21273: {'lr': 0.0003095504419348385, 'samples': 10892288, 'steps': 21273, 'batch_loss/train': 0.7655343487858772}
12/25/2021 18:38:10 - INFO - codeparrot_training - Step 21274: {'lr': 0.00030953513241316734, 'samples': 10892800, 'steps': 21274, 'batch_loss/train': 0.7880459045991302}
12/25/2021 18:38:22 - INFO - codeparrot_training - Step 21275: {'lr': 0.00030951982265480703, 'samples': 10893312, 'steps': 21275, 'batch_loss/train': 0.7783167830202729}
12/25/2021 18:38:32 - INFO - codeparrot_training - Step 21276: {'lr': 0.0003095045126598184, 'samples': 10893824, 'steps': 21276, 'batch_loss/train': 0.7695694686844945}
12/25/2021 18:38:43 - INFO - codeparrot_training - Step 21277: {'lr': 0.0003094892024282623, 'samples': 10894336, 'steps': 21277, 'batch_loss/train': 0.8233458111062646}
12/25/2021 18:38:55 - INFO - codeparrot_training - Step 21278: {'lr': 0.00030947389196019954, 'samples': 10894848, 'steps': 21278, 'batch_loss/train': 0.7477448515128344}
12/25/2021 18:39:06 - INFO - codeparrot_training - Step 21279: {'lr': 0.00030945858125569107, 'samples': 10895360, 'steps': 21279, 'batch_loss/train': 0.8216762864030898}
12/25/2021 18:39:16 - INFO - codeparrot_training - Step 21280: {'lr': 0.00030944327031479774, 'samples': 10895872, 'steps': 21280, 'batch_loss/train': 0.7968197716400027}
12/25/2021 18:39:27 - INFO - codeparrot_training - Step 21281: {'lr': 0.0003094279591375804, 'samples': 10896384, 'steps': 21281, 'batch_loss/train': 0.7694469201378524}
12/25/2021 18:39:41 - INFO - codeparrot_training - Step 21282: {'lr': 0.00030941264772409995, 'samples': 10896896, 'steps': 21282, 'batch_loss/train': 0.8140924703329802}
12/25/2021 18:39:51 - INFO - codeparrot_training - Step 21283: {'lr': 0.0003093973360744172, 'samples': 10897408, 'steps': 21283, 'batch_loss/train': 0.9396515623666346}
12/25/2021 18:40:02 - INFO - codeparrot_training - Step 21284: {'lr': 0.0003093820241885931, 'samples': 10897920, 'steps': 21284, 'batch_loss/train': 0.8135196529328823}
12/25/2021 18:40:14 - INFO - codeparrot_training - Step 21285: {'lr': 0.00030936671206668854, 'samples': 10898432, 'steps': 21285, 'batch_loss/train': 0.8715874440968037}
12/25/2021 18:40:25 - INFO - codeparrot_training - Step 21286: {'lr': 0.00030935139970876427, 'samples': 10898944, 'steps': 21286, 'batch_loss/train': 0.7500976650044322}
12/25/2021 18:40:35 - INFO - codeparrot_training - Step 21287: {'lr': 0.00030933608711488123, 'samples': 10899456, 'steps': 21287, 'batch_loss/train': 0.7379147419705987}
12/25/2021 18:40:49 - INFO - codeparrot_training - Step 21288: {'lr': 0.0003093207742851004, 'samples': 10899968, 'steps': 21288, 'batch_loss/train': 0.6492803185246885}
12/25/2021 18:41:00 - INFO - codeparrot_training - Step 21289: {'lr': 0.0003093054612194825, 'samples': 10900480, 'steps': 21289, 'batch_loss/train': 0.6238240534439683}
12/25/2021 18:41:10 - INFO - codeparrot_training - Step 21290: {'lr': 0.0003092901479180886, 'samples': 10900992, 'steps': 21290, 'batch_loss/train': 0.6975105064921081}
12/25/2021 18:41:21 - INFO - codeparrot_training - Step 21291: {'lr': 0.0003092748343809793, 'samples': 10901504, 'steps': 21291, 'batch_loss/train': 0.7380539728328586}
12/25/2021 18:41:33 - INFO - codeparrot_training - Step 21292: {'lr': 0.0003092595206082157, 'samples': 10902016, 'steps': 21292, 'batch_loss/train': 0.8239453069400042}
12/25/2021 18:41:44 - INFO - codeparrot_training - Step 21293: {'lr': 0.0003092442065998586, 'samples': 10902528, 'steps': 21293, 'batch_loss/train': 0.7704212330281734}
12/25/2021 18:41:54 - INFO - codeparrot_training - Step 21294: {'lr': 0.000309228892355969, 'samples': 10903040, 'steps': 21294, 'batch_loss/train': 0.85536074929405}
12/25/2021 18:42:06 - INFO - codeparrot_training - Step 21295: {'lr': 0.0003092135778766076, 'samples': 10903552, 'steps': 21295, 'batch_loss/train': 0.651786329690367}
12/25/2021 18:42:17 - INFO - codeparrot_training - Step 21296: {'lr': 0.0003091982631618354, 'samples': 10904064, 'steps': 21296, 'batch_loss/train': 0.7982847280218266}
12/25/2021 18:42:28 - INFO - codeparrot_training - Step 21297: {'lr': 0.00030918294821171327, 'samples': 10904576, 'steps': 21297, 'batch_loss/train': 0.7856607446447015}
12/25/2021 18:42:41 - INFO - codeparrot_training - Step 21298: {'lr': 0.000309167633026302, 'samples': 10905088, 'steps': 21298, 'batch_loss/train': 0.7206678356742486}
12/25/2021 18:42:52 - INFO - codeparrot_training - Step 21299: {'lr': 0.00030915231760566273, 'samples': 10905600, 'steps': 21299, 'batch_loss/train': 0.7661393713206053}
12/25/2021 18:43:03 - INFO - codeparrot_training - Step 21300: {'lr': 0.00030913700194985604, 'samples': 10906112, 'steps': 21300, 'batch_loss/train': 0.7453343167435378}
12/25/2021 18:43:15 - INFO - codeparrot_training - Step 21301: {'lr': 0.000309121686058943, 'samples': 10906624, 'steps': 21301, 'batch_loss/train': 0.7924841735512018}
12/25/2021 18:43:25 - INFO - codeparrot_training - Step 21302: {'lr': 0.00030910636993298444, 'samples': 10907136, 'steps': 21302, 'batch_loss/train': 0.8315385049208999}
12/25/2021 18:43:36 - INFO - codeparrot_training - Step 21303: {'lr': 0.00030909105357204133, 'samples': 10907648, 'steps': 21303, 'batch_loss/train': 0.7835819013416767}
12/25/2021 18:43:47 - INFO - codeparrot_training - Step 21304: {'lr': 0.00030907573697617444, 'samples': 10908160, 'steps': 21304, 'batch_loss/train': 0.6458223899826407}
12/25/2021 18:43:59 - INFO - codeparrot_training - Step 21305: {'lr': 0.0003090604201454448, 'samples': 10908672, 'steps': 21305, 'batch_loss/train': 0.8284167032688856}
12/25/2021 18:44:09 - INFO - codeparrot_training - Step 21306: {'lr': 0.00030904510307991313, 'samples': 10909184, 'steps': 21306, 'batch_loss/train': 0.8211221974343061}
12/25/2021 18:44:20 - INFO - codeparrot_training - Step 21307: {'lr': 0.00030902978577964046, 'samples': 10909696, 'steps': 21307, 'batch_loss/train': 0.882870375062339}
12/25/2021 18:44:34 - INFO - codeparrot_training - Step 21308: {'lr': 0.00030901446824468763, 'samples': 10910208, 'steps': 21308, 'batch_loss/train': 0.7539018720854074}
12/25/2021 18:44:45 - INFO - codeparrot_training - Step 21309: {'lr': 0.00030899915047511567, 'samples': 10910720, 'steps': 21309, 'batch_loss/train': 0.6972575122490525}
12/25/2021 18:44:55 - INFO - codeparrot_training - Step 21310: {'lr': 0.00030898383247098525, 'samples': 10911232, 'steps': 21310, 'batch_loss/train': 0.7263535605743527}
12/25/2021 18:45:07 - INFO - codeparrot_training - Step 21311: {'lr': 0.0003089685142323574, 'samples': 10911744, 'steps': 21311, 'batch_loss/train': 0.8234719764441252}
12/25/2021 18:45:18 - INFO - codeparrot_training - Step 21312: {'lr': 0.00030895319575929305, 'samples': 10912256, 'steps': 21312, 'batch_loss/train': 0.7266007773578167}
12/25/2021 18:45:28 - INFO - codeparrot_training - Step 21313: {'lr': 0.000308937877051853, 'samples': 10912768, 'steps': 21313, 'batch_loss/train': 0.7957153264433146}
12/25/2021 18:45:39 - INFO - codeparrot_training - Step 21314: {'lr': 0.0003089225581100982, 'samples': 10913280, 'steps': 21314, 'batch_loss/train': 0.8450005259364843}
12/25/2021 18:45:51 - INFO - codeparrot_training - Step 21315: {'lr': 0.0003089072389340896, 'samples': 10913792, 'steps': 21315, 'batch_loss/train': 0.7635768512263894}
12/25/2021 18:46:02 - INFO - codeparrot_training - Step 21316: {'lr': 0.0003088919195238881, 'samples': 10914304, 'steps': 21316, 'batch_loss/train': 0.7841794360429049}
12/25/2021 18:46:12 - INFO - codeparrot_training - Step 21317: {'lr': 0.00030887659987955453, 'samples': 10914816, 'steps': 21317, 'batch_loss/train': 0.9466829244047403}
12/25/2021 18:46:26 - INFO - codeparrot_training - Step 21318: {'lr': 0.00030886128000114977, 'samples': 10915328, 'steps': 21318, 'batch_loss/train': 0.7829925715923309}
12/25/2021 18:46:37 - INFO - codeparrot_training - Step 21319: {'lr': 0.0003088459598887348, 'samples': 10915840, 'steps': 21319, 'batch_loss/train': 0.6293457639403641}
12/25/2021 18:46:47 - INFO - codeparrot_training - Step 21320: {'lr': 0.00030883063954237057, 'samples': 10916352, 'steps': 21320, 'batch_loss/train': 1.2055371589958668}
12/25/2021 18:46:59 - INFO - codeparrot_training - Step 21321: {'lr': 0.0003088153189621179, 'samples': 10916864, 'steps': 21321, 'batch_loss/train': 0.7204382307827473}
12/25/2021 18:47:10 - INFO - codeparrot_training - Step 21322: {'lr': 0.00030879999814803776, 'samples': 10917376, 'steps': 21322, 'batch_loss/train': 0.8845014497637749}
12/25/2021 18:47:20 - INFO - codeparrot_training - Step 21323: {'lr': 0.00030878467710019096, 'samples': 10917888, 'steps': 21323, 'batch_loss/train': 0.906511795707047}
12/25/2021 18:47:31 - INFO - codeparrot_training - Step 21324: {'lr': 0.0003087693558186385, 'samples': 10918400, 'steps': 21324, 'batch_loss/train': 0.8432582374662161}
12/25/2021 18:47:43 - INFO - codeparrot_training - Step 21325: {'lr': 0.00030875403430344133, 'samples': 10918912, 'steps': 21325, 'batch_loss/train': 0.715311779640615}
12/25/2021 18:47:54 - INFO - codeparrot_training - Step 21326: {'lr': 0.00030873871255466027, 'samples': 10919424, 'steps': 21326, 'batch_loss/train': 0.7394724981859326}
12/25/2021 18:48:04 - INFO - codeparrot_training - Step 21327: {'lr': 0.00030872339057235625, 'samples': 10919936, 'steps': 21327, 'batch_loss/train': 0.8056457713246346}
12/25/2021 18:48:19 - INFO - codeparrot_training - Step 21328: {'lr': 0.00030870806835659023, 'samples': 10920448, 'steps': 21328, 'batch_loss/train': 0.8308915980160236}
12/25/2021 18:48:29 - INFO - codeparrot_training - Step 21329: {'lr': 0.0003086927459074231, 'samples': 10920960, 'steps': 21329, 'batch_loss/train': 0.7261955724097788}
12/25/2021 18:48:40 - INFO - codeparrot_training - Step 21330: {'lr': 0.00030867742322491577, 'samples': 10921472, 'steps': 21330, 'batch_loss/train': 0.6640890135895461}
12/25/2021 18:48:50 - INFO - codeparrot_training - Step 21331: {'lr': 0.00030866210030912913, 'samples': 10921984, 'steps': 21331, 'batch_loss/train': 0.8226004000753164}
12/25/2021 18:49:03 - INFO - codeparrot_training - Step 21332: {'lr': 0.0003086467771601241, 'samples': 10922496, 'steps': 21332, 'batch_loss/train': 0.743084367015399}
12/25/2021 18:49:13 - INFO - codeparrot_training - Step 21333: {'lr': 0.0003086314537779617, 'samples': 10923008, 'steps': 21333, 'batch_loss/train': 0.780269724316895}
12/25/2021 18:49:24 - INFO - codeparrot_training - Step 21334: {'lr': 0.0003086161301627027, 'samples': 10923520, 'steps': 21334, 'batch_loss/train': 0.8336620507761836}
12/25/2021 18:49:36 - INFO - codeparrot_training - Step 21335: {'lr': 0.0003086008063144082, 'samples': 10924032, 'steps': 21335, 'batch_loss/train': 0.9033430349081755}
12/25/2021 18:49:47 - INFO - codeparrot_training - Step 21336: {'lr': 0.00030858548223313906, 'samples': 10924544, 'steps': 21336, 'batch_loss/train': 0.7074007777264342}
12/25/2021 18:49:57 - INFO - codeparrot_training - Step 21337: {'lr': 0.000308570157918956, 'samples': 10925056, 'steps': 21337, 'batch_loss/train': 0.7608864158391953}
12/25/2021 18:50:11 - INFO - codeparrot_training - Step 21338: {'lr': 0.0003085548333719202, 'samples': 10925568, 'steps': 21338, 'batch_loss/train': 0.6518070639576763}
12/25/2021 18:50:22 - INFO - codeparrot_training - Step 21339: {'lr': 0.00030853950859209247, 'samples': 10926080, 'steps': 21339, 'batch_loss/train': 0.9092455431818962}
12/25/2021 18:50:33 - INFO - codeparrot_training - Step 21340: {'lr': 0.00030852418357953375, 'samples': 10926592, 'steps': 21340, 'batch_loss/train': 0.6516070964280516}
12/25/2021 18:50:43 - INFO - codeparrot_training - Step 21341: {'lr': 0.000308508858334305, 'samples': 10927104, 'steps': 21341, 'batch_loss/train': 0.8970368029549718}
12/25/2021 18:50:55 - INFO - codeparrot_training - Step 21342: {'lr': 0.00030849353285646714, 'samples': 10927616, 'steps': 21342, 'batch_loss/train': 0.8506719879806042}
12/25/2021 18:51:06 - INFO - codeparrot_training - Step 21343: {'lr': 0.000308478207146081, 'samples': 10928128, 'steps': 21343, 'batch_loss/train': 0.8120100549422204}
12/25/2021 18:51:17 - INFO - codeparrot_training - Step 21344: {'lr': 0.0003084628812032077, 'samples': 10928640, 'steps': 21344, 'batch_loss/train': 0.8004375454038382}
12/25/2021 18:51:30 - INFO - codeparrot_training - Step 21345: {'lr': 0.000308447555027908, 'samples': 10929152, 'steps': 21345, 'batch_loss/train': 0.802770952694118}
12/25/2021 18:51:41 - INFO - codeparrot_training - Step 21346: {'lr': 0.00030843222862024283, 'samples': 10929664, 'steps': 21346, 'batch_loss/train': 0.7875805320218205}
12/25/2021 18:51:52 - INFO - codeparrot_training - Step 21347: {'lr': 0.0003084169019802733, 'samples': 10930176, 'steps': 21347, 'batch_loss/train': 0.7707591084763408}
12/25/2021 18:52:03 - INFO - codeparrot_training - Step 21348: {'lr': 0.0003084015751080601, 'samples': 10930688, 'steps': 21348, 'batch_loss/train': 0.8503723442554474}
12/25/2021 18:52:14 - INFO - codeparrot_training - Step 21349: {'lr': 0.00030838624800366444, 'samples': 10931200, 'steps': 21349, 'batch_loss/train': 0.8007566053420305}
12/25/2021 18:52:25 - INFO - codeparrot_training - Step 21350: {'lr': 0.00030837092066714693, 'samples': 10931712, 'steps': 21350, 'batch_loss/train': 0.86874979455024}
12/25/2021 18:52:37 - INFO - codeparrot_training - Step 21351: {'lr': 0.0003083555930985688, 'samples': 10932224, 'steps': 21351, 'batch_loss/train': 0.8112620385363698}
12/25/2021 18:52:47 - INFO - codeparrot_training - Step 21352: {'lr': 0.00030834026529799087, 'samples': 10932736, 'steps': 21352, 'batch_loss/train': 0.7864052886143327}
12/25/2021 18:52:58 - INFO - codeparrot_training - Step 21353: {'lr': 0.000308324937265474, 'samples': 10933248, 'steps': 21353, 'batch_loss/train': 0.7842887602746487}
12/25/2021 18:53:09 - INFO - codeparrot_training - Step 21354: {'lr': 0.0003083096090010792, 'samples': 10933760, 'steps': 21354, 'batch_loss/train': 0.8602576600387692}
12/25/2021 18:53:21 - INFO - codeparrot_training - Step 21355: {'lr': 0.0003082942805048675, 'samples': 10934272, 'steps': 21355, 'batch_loss/train': 0.8369640903547406}
12/25/2021 18:53:31 - INFO - codeparrot_training - Step 21356: {'lr': 0.0003082789517768997, 'samples': 10934784, 'steps': 21356, 'batch_loss/train': 0.8641293318942189}
12/25/2021 18:53:42 - INFO - codeparrot_training - Step 21357: {'lr': 0.0003082636228172368, 'samples': 10935296, 'steps': 21357, 'batch_loss/train': 0.7387863928452134}
12/25/2021 18:53:56 - INFO - codeparrot_training - Step 21358: {'lr': 0.00030824829362593976, 'samples': 10935808, 'steps': 21358, 'batch_loss/train': 0.6589140160940588}
12/25/2021 18:54:07 - INFO - codeparrot_training - Step 21359: {'lr': 0.00030823296420306946, 'samples': 10936320, 'steps': 21359, 'batch_loss/train': 0.8023851490579545}
12/25/2021 18:54:17 - INFO - codeparrot_training - Step 21360: {'lr': 0.0003082176345486868, 'samples': 10936832, 'steps': 21360, 'batch_loss/train': 0.6989520881325006}
12/25/2021 18:54:29 - INFO - codeparrot_training - Step 21361: {'lr': 0.00030820230466285293, 'samples': 10937344, 'steps': 21361, 'batch_loss/train': 0.8337215012870729}
12/25/2021 18:54:40 - INFO - codeparrot_training - Step 21362: {'lr': 0.0003081869745456287, 'samples': 10937856, 'steps': 21362, 'batch_loss/train': 0.731284134555608}
12/25/2021 18:54:50 - INFO - codeparrot_training - Step 21363: {'lr': 0.00030817164419707496, 'samples': 10938368, 'steps': 21363, 'batch_loss/train': 0.7564103823387995}
12/25/2021 18:55:01 - INFO - codeparrot_training - Step 21364: {'lr': 0.0003081563136172528, 'samples': 10938880, 'steps': 21364, 'batch_loss/train': 0.83621262665838}
12/25/2021 18:55:13 - INFO - codeparrot_training - Step 21365: {'lr': 0.000308140982806223, 'samples': 10939392, 'steps': 21365, 'batch_loss/train': 0.7731449622660875}
12/25/2021 18:55:23 - INFO - codeparrot_training - Step 21366: {'lr': 0.0003081256517640466, 'samples': 10939904, 'steps': 21366, 'batch_loss/train': 0.8221535668708384}
12/25/2021 18:55:34 - INFO - codeparrot_training - Step 21367: {'lr': 0.0003081103204907846, 'samples': 10940416, 'steps': 21367, 'batch_loss/train': 0.7698005635756999}
12/25/2021 18:55:48 - INFO - codeparrot_training - Step 21368: {'lr': 0.00030809498898649797, 'samples': 10940928, 'steps': 21368, 'batch_loss/train': 0.7750478964298964}
12/25/2021 18:55:58 - INFO - codeparrot_training - Step 21369: {'lr': 0.0003080796572512474, 'samples': 10941440, 'steps': 21369, 'batch_loss/train': 0.8453302541747689}
12/25/2021 18:56:09 - INFO - codeparrot_training - Step 21370: {'lr': 0.00030806432528509425, 'samples': 10941952, 'steps': 21370, 'batch_loss/train': 0.7392718668561429}
12/25/2021 18:56:22 - INFO - codeparrot_training - Step 21371: {'lr': 0.00030804899308809917, 'samples': 10942464, 'steps': 21371, 'batch_loss/train': 0.8377829939126968}
12/25/2021 18:56:32 - INFO - codeparrot_training - Step 21372: {'lr': 0.00030803366066032325, 'samples': 10942976, 'steps': 21372, 'batch_loss/train': 0.80472570983693}
12/25/2021 18:56:43 - INFO - codeparrot_training - Step 21373: {'lr': 0.0003080183280018274, 'samples': 10943488, 'steps': 21373, 'batch_loss/train': 0.8627549884840846}
12/25/2021 18:56:53 - INFO - codeparrot_training - Step 21374: {'lr': 0.00030800299511267253, 'samples': 10944000, 'steps': 21374, 'batch_loss/train': 1.043556958436966}
12/25/2021 18:57:07 - INFO - codeparrot_training - Step 21375: {'lr': 0.00030798766199291975, 'samples': 10944512, 'steps': 21375, 'batch_loss/train': 0.839661319507286}
12/25/2021 18:57:18 - INFO - codeparrot_training - Step 21376: {'lr': 0.00030797232864262987, 'samples': 10945024, 'steps': 21376, 'batch_loss/train': 0.723780739121139}
12/25/2021 18:57:28 - INFO - codeparrot_training - Step 21377: {'lr': 0.0003079569950618639, 'samples': 10945536, 'steps': 21377, 'batch_loss/train': 0.8894696859642863}
12/25/2021 18:57:41 - INFO - codeparrot_training - Step 21378: {'lr': 0.0003079416612506828, 'samples': 10946048, 'steps': 21378, 'batch_loss/train': 0.8348777532810345}
12/25/2021 18:57:51 - INFO - codeparrot_training - Step 21379: {'lr': 0.00030792632720914756, 'samples': 10946560, 'steps': 21379, 'batch_loss/train': 0.8142237300053239}
12/25/2021 18:58:02 - INFO - codeparrot_training - Step 21380: {'lr': 0.00030791099293731906, 'samples': 10947072, 'steps': 21380, 'batch_loss/train': 0.8502652244642377}
12/25/2021 18:58:14 - INFO - codeparrot_training - Step 21381: {'lr': 0.0003078956584352584, 'samples': 10947584, 'steps': 21381, 'batch_loss/train': 0.8666487401351333}
12/25/2021 18:58:25 - INFO - codeparrot_training - Step 21382: {'lr': 0.0003078803237030263, 'samples': 10948096, 'steps': 21382, 'batch_loss/train': 0.7390119414776564}
12/25/2021 18:58:35 - INFO - codeparrot_training - Step 21383: {'lr': 0.0003078649887406841, 'samples': 10948608, 'steps': 21383, 'batch_loss/train': 1.1043879870558158}
12/25/2021 18:58:46 - INFO - codeparrot_training - Step 21384: {'lr': 0.00030784965354829243, 'samples': 10949120, 'steps': 21384, 'batch_loss/train': 0.7277362060267478}
12/25/2021 18:59:00 - INFO - codeparrot_training - Step 21385: {'lr': 0.0003078343181259124, 'samples': 10949632, 'steps': 21385, 'batch_loss/train': 0.8001581272110343}
12/25/2021 18:59:10 - INFO - codeparrot_training - Step 21386: {'lr': 0.000307818982473605, 'samples': 10950144, 'steps': 21386, 'batch_loss/train': 0.7631844579009339}
12/25/2021 18:59:21 - INFO - codeparrot_training - Step 21387: {'lr': 0.00030780364659143105, 'samples': 10950656, 'steps': 21387, 'batch_loss/train': 0.7468353575095534}
12/25/2021 18:59:33 - INFO - codeparrot_training - Step 21388: {'lr': 0.0003077883104794517, 'samples': 10951168, 'steps': 21388, 'batch_loss/train': 0.7519762613810599}
12/25/2021 18:59:44 - INFO - codeparrot_training - Step 21389: {'lr': 0.0003077729741377279, 'samples': 10951680, 'steps': 21389, 'batch_loss/train': 0.7006411203183234}
12/25/2021 18:59:54 - INFO - codeparrot_training - Step 21390: {'lr': 0.0003077576375663205, 'samples': 10952192, 'steps': 21390, 'batch_loss/train': 1.5008535979432054}
12/25/2021 19:00:06 - INFO - codeparrot_training - Step 21391: {'lr': 0.0003077423007652905, 'samples': 10952704, 'steps': 21391, 'batch_loss/train': 0.7087000015890226}
12/25/2021 19:00:17 - INFO - codeparrot_training - Step 21392: {'lr': 0.000307726963734699, 'samples': 10953216, 'steps': 21392, 'batch_loss/train': 0.7272133900551125}
12/25/2021 19:00:28 - INFO - codeparrot_training - Step 21393: {'lr': 0.0003077116264746069, 'samples': 10953728, 'steps': 21393, 'batch_loss/train': 0.7773723634891212}
12/25/2021 19:00:38 - INFO - codeparrot_training - Step 21394: {'lr': 0.0003076962889850751, 'samples': 10954240, 'steps': 21394, 'batch_loss/train': 0.806181053398177}
12/25/2021 19:00:52 - INFO - codeparrot_training - Step 21395: {'lr': 0.00030768095126616464, 'samples': 10954752, 'steps': 21395, 'batch_loss/train': 0.7981310002505779}
12/25/2021 19:01:03 - INFO - codeparrot_training - Step 21396: {'lr': 0.0003076656133179366, 'samples': 10955264, 'steps': 21396, 'batch_loss/train': 0.769370533991605}
12/25/2021 19:01:13 - INFO - codeparrot_training - Step 21397: {'lr': 0.00030765027514045173, 'samples': 10955776, 'steps': 21397, 'batch_loss/train': 0.8252338864840567}
12/25/2021 19:01:25 - INFO - codeparrot_training - Step 21398: {'lr': 0.00030763493673377116, 'samples': 10956288, 'steps': 21398, 'batch_loss/train': 0.7936061378568411}
12/25/2021 19:01:36 - INFO - codeparrot_training - Step 21399: {'lr': 0.00030761959809795584, 'samples': 10956800, 'steps': 21399, 'batch_loss/train': 0.81147888652049}
12/25/2021 19:01:47 - INFO - codeparrot_training - Step 21400: {'lr': 0.00030760425923306677, 'samples': 10957312, 'steps': 21400, 'batch_loss/train': 0.8652982832863927}
12/25/2021 19:01:59 - INFO - codeparrot_training - Step 21401: {'lr': 0.00030758892013916497, 'samples': 10957824, 'steps': 21401, 'batch_loss/train': 0.9375110198743641}
12/25/2021 19:02:09 - INFO - codeparrot_training - Step 21402: {'lr': 0.00030757358081631134, 'samples': 10958336, 'steps': 21402, 'batch_loss/train': 0.7232202673330903}
12/25/2021 19:02:20 - INFO - codeparrot_training - Step 21403: {'lr': 0.0003075582412645669, 'samples': 10958848, 'steps': 21403, 'batch_loss/train': 0.8315951889380813}
12/25/2021 19:02:31 - INFO - codeparrot_training - Step 21404: {'lr': 0.00030754290148399255, 'samples': 10959360, 'steps': 21404, 'batch_loss/train': 0.8612406812608242}
12/25/2021 19:02:45 - INFO - codeparrot_training - Step 21405: {'lr': 0.00030752756147464937, 'samples': 10959872, 'steps': 21405, 'batch_loss/train': 0.6791415449697524}
12/25/2021 19:02:55 - INFO - codeparrot_training - Step 21406: {'lr': 0.0003075122212365984, 'samples': 10960384, 'steps': 21406, 'batch_loss/train': 0.7501584622077644}
12/25/2021 19:03:06 - INFO - codeparrot_training - Step 21407: {'lr': 0.0003074968807699006, 'samples': 10960896, 'steps': 21407, 'batch_loss/train': 0.8066430976614356}
12/25/2021 19:03:18 - INFO - codeparrot_training - Step 21408: {'lr': 0.0003074815400746167, 'samples': 10961408, 'steps': 21408, 'batch_loss/train': 0.9305227845907211}
12/25/2021 19:03:28 - INFO - codeparrot_training - Step 21409: {'lr': 0.00030746619915080804, 'samples': 10961920, 'steps': 21409, 'batch_loss/train': 0.841689764522016}
12/25/2021 19:03:39 - INFO - codeparrot_training - Step 21410: {'lr': 0.0003074508579985355, 'samples': 10962432, 'steps': 21410, 'batch_loss/train': 0.774676289409399}
12/25/2021 19:03:51 - INFO - codeparrot_training - Step 21411: {'lr': 0.0003074355166178599, 'samples': 10962944, 'steps': 21411, 'batch_loss/train': 0.7372829795349389}
12/25/2021 19:04:02 - INFO - codeparrot_training - Step 21412: {'lr': 0.0003074201750088426, 'samples': 10963456, 'steps': 21412, 'batch_loss/train': 0.8387435711920261}
12/25/2021 19:04:12 - INFO - codeparrot_training - Step 21413: {'lr': 0.0003074048331715442, 'samples': 10963968, 'steps': 21413, 'batch_loss/train': 0.7518254918977618}
12/25/2021 19:04:23 - INFO - codeparrot_training - Step 21414: {'lr': 0.0003073894911060259, 'samples': 10964480, 'steps': 21414, 'batch_loss/train': 0.7539161895401776}
12/25/2021 19:04:37 - INFO - codeparrot_training - Step 21415: {'lr': 0.0003073741488123486, 'samples': 10964992, 'steps': 21415, 'batch_loss/train': 0.7499756487086415}
12/25/2021 19:04:47 - INFO - codeparrot_training - Step 21416: {'lr': 0.00030735880629057346, 'samples': 10965504, 'steps': 21416, 'batch_loss/train': 0.885468540713191}
12/25/2021 19:04:58 - INFO - codeparrot_training - Step 21417: {'lr': 0.00030734346354076124, 'samples': 10966016, 'steps': 21417, 'batch_loss/train': 0.7062977883033454}
12/25/2021 19:05:10 - INFO - codeparrot_training - Step 21418: {'lr': 0.0003073281205629731, 'samples': 10966528, 'steps': 21418, 'batch_loss/train': 0.6907536592334509}
12/25/2021 19:05:21 - INFO - codeparrot_training - Step 21419: {'lr': 0.00030731277735727004, 'samples': 10967040, 'steps': 21419, 'batch_loss/train': 0.7874907348304987}
12/25/2021 19:05:31 - INFO - codeparrot_training - Step 21420: {'lr': 0.000307297433923713, 'samples': 10967552, 'steps': 21420, 'batch_loss/train': 0.7954421071335673}
12/25/2021 19:05:45 - INFO - codeparrot_training - Step 21421: {'lr': 0.000307282090262363, 'samples': 10968064, 'steps': 21421, 'batch_loss/train': 0.7416437994688749}
12/25/2021 19:05:56 - INFO - codeparrot_training - Step 21422: {'lr': 0.000307266746373281, 'samples': 10968576, 'steps': 21422, 'batch_loss/train': 0.7798425336368382}
12/25/2021 19:06:06 - INFO - codeparrot_training - Step 21423: {'lr': 0.0003072514022565281, 'samples': 10969088, 'steps': 21423, 'batch_loss/train': 0.7903793244622648}
12/25/2021 19:06:17 - INFO - codeparrot_training - Step 21424: {'lr': 0.00030723605791216515, 'samples': 10969600, 'steps': 21424, 'batch_loss/train': 0.8600709894672036}
12/25/2021 19:06:29 - INFO - codeparrot_training - Step 21425: {'lr': 0.0003072207133402534, 'samples': 10970112, 'steps': 21425, 'batch_loss/train': 0.776899135671556}
12/25/2021 19:06:40 - INFO - codeparrot_training - Step 21426: {'lr': 0.00030720536854085355, 'samples': 10970624, 'steps': 21426, 'batch_loss/train': 0.756364107131958}
12/25/2021 19:06:50 - INFO - codeparrot_training - Step 21427: {'lr': 0.0003071900235140268, 'samples': 10971136, 'steps': 21427, 'batch_loss/train': 0.8063179943710566}
12/25/2021 19:07:02 - INFO - codeparrot_training - Step 21428: {'lr': 0.00030717467825983416, 'samples': 10971648, 'steps': 21428, 'batch_loss/train': 0.8032510296907276}
12/25/2021 19:07:13 - INFO - codeparrot_training - Step 21429: {'lr': 0.00030715933277833656, 'samples': 10972160, 'steps': 21429, 'batch_loss/train': 0.7813432386610657}
12/25/2021 19:07:24 - INFO - codeparrot_training - Step 21430: {'lr': 0.000307143987069595, 'samples': 10972672, 'steps': 21430, 'batch_loss/train': 0.7256445838138461}
12/25/2021 19:07:37 - INFO - codeparrot_training - Step 21431: {'lr': 0.00030712864113367056, 'samples': 10973184, 'steps': 21431, 'batch_loss/train': 0.750083084218204}
12/25/2021 19:07:48 - INFO - codeparrot_training - Step 21432: {'lr': 0.0003071132949706242, 'samples': 10973696, 'steps': 21432, 'batch_loss/train': 0.6911157574504614}
12/25/2021 19:07:59 - INFO - codeparrot_training - Step 21433: {'lr': 0.0003070979485805169, 'samples': 10974208, 'steps': 21433, 'batch_loss/train': 0.7787242277991027}
12/25/2021 19:08:09 - INFO - codeparrot_training - Step 21434: {'lr': 0.00030708260196340975, 'samples': 10974720, 'steps': 21434, 'batch_loss/train': 0.7988218031823635}
12/25/2021 19:08:21 - INFO - codeparrot_training - Step 21435: {'lr': 0.0003070672551193638, 'samples': 10975232, 'steps': 21435, 'batch_loss/train': 0.7433811752125621}
12/25/2021 19:08:32 - INFO - codeparrot_training - Step 21436: {'lr': 0.00030705190804843996, 'samples': 10975744, 'steps': 21436, 'batch_loss/train': 0.8807838493958116}
12/25/2021 19:08:43 - INFO - codeparrot_training - Step 21437: {'lr': 0.0003070365607506992, 'samples': 10976256, 'steps': 21437, 'batch_loss/train': 0.8278906056657434}
12/25/2021 19:08:55 - INFO - codeparrot_training - Step 21438: {'lr': 0.0003070212132262027, 'samples': 10976768, 'steps': 21438, 'batch_loss/train': 0.7863313816487789}
12/25/2021 19:09:05 - INFO - codeparrot_training - Step 21439: {'lr': 0.0003070058654750113, 'samples': 10977280, 'steps': 21439, 'batch_loss/train': 0.7832781318575144}
12/25/2021 19:09:16 - INFO - codeparrot_training - Step 21440: {'lr': 0.00030699051749718615, 'samples': 10977792, 'steps': 21440, 'batch_loss/train': 0.7505771624855697}
12/25/2021 19:09:28 - INFO - codeparrot_training - Step 21441: {'lr': 0.00030697516929278825, 'samples': 10978304, 'steps': 21441, 'batch_loss/train': 0.7525730491615832}
12/25/2021 19:09:39 - INFO - codeparrot_training - Step 21442: {'lr': 0.00030695982086187856, 'samples': 10978816, 'steps': 21442, 'batch_loss/train': 0.7930433913134038}
12/25/2021 19:09:49 - INFO - codeparrot_training - Step 21443: {'lr': 0.00030694447220451813, 'samples': 10979328, 'steps': 21443, 'batch_loss/train': 0.8345437371172011}
12/25/2021 19:10:00 - INFO - codeparrot_training - Step 21444: {'lr': 0.000306929123320768, 'samples': 10979840, 'steps': 21444, 'batch_loss/train': 0.763813991798088}
12/25/2021 19:10:14 - INFO - codeparrot_training - Step 21445: {'lr': 0.0003069137742106891, 'samples': 10980352, 'steps': 21445, 'batch_loss/train': 0.7473996132612228}
12/25/2021 19:10:24 - INFO - codeparrot_training - Step 21446: {'lr': 0.00030689842487434263, 'samples': 10980864, 'steps': 21446, 'batch_loss/train': 0.7433265345171094}
12/25/2021 19:10:35 - INFO - codeparrot_training - Step 21447: {'lr': 0.0003068830753117894, 'samples': 10981376, 'steps': 21447, 'batch_loss/train': 0.8038466479629278}
12/25/2021 19:10:47 - INFO - codeparrot_training - Step 21448: {'lr': 0.0003068677255230906, 'samples': 10981888, 'steps': 21448, 'batch_loss/train': 0.7959183221682906}
12/25/2021 19:10:58 - INFO - codeparrot_training - Step 21449: {'lr': 0.0003068523755083072, 'samples': 10982400, 'steps': 21449, 'batch_loss/train': 0.752116116695106}
12/25/2021 19:11:08 - INFO - codeparrot_training - Step 21450: {'lr': 0.0003068370252675002, 'samples': 10982912, 'steps': 21450, 'batch_loss/train': 0.5732114675920457}
12/25/2021 19:11:20 - INFO - codeparrot_training - Step 21451: {'lr': 0.00030682167480073063, 'samples': 10983424, 'steps': 21451, 'batch_loss/train': 0.7490898513933644}
12/25/2021 19:11:31 - INFO - codeparrot_training - Step 21452: {'lr': 0.0003068063241080596, 'samples': 10983936, 'steps': 21452, 'batch_loss/train': 0.8296984555199742}
12/25/2021 19:11:41 - INFO - codeparrot_training - Step 21453: {'lr': 0.000306790973189548, 'samples': 10984448, 'steps': 21453, 'batch_loss/train': 0.7687450260855258}
12/25/2021 19:11:55 - INFO - codeparrot_training - Step 21454: {'lr': 0.000306775622045257, 'samples': 10984960, 'steps': 21454, 'batch_loss/train': 0.7914414377883077}
12/25/2021 19:12:06 - INFO - codeparrot_training - Step 21455: {'lr': 0.00030676027067524757, 'samples': 10985472, 'steps': 21455, 'batch_loss/train': 0.7435496263206005}
12/25/2021 19:12:17 - INFO - codeparrot_training - Step 21456: {'lr': 0.00030674491907958063, 'samples': 10985984, 'steps': 21456, 'batch_loss/train': 0.7877756869420409}
12/25/2021 19:12:27 - INFO - codeparrot_training - Step 21457: {'lr': 0.00030672956725831746, 'samples': 10986496, 'steps': 21457, 'batch_loss/train': 0.8168024467304349}
12/25/2021 19:12:39 - INFO - codeparrot_training - Step 21458: {'lr': 0.0003067142152115189, 'samples': 10987008, 'steps': 21458, 'batch_loss/train': 0.7880743402056396}
12/25/2021 19:12:50 - INFO - codeparrot_training - Step 21459: {'lr': 0.000306698862939246, 'samples': 10987520, 'steps': 21459, 'batch_loss/train': 0.7718414075206965}
12/25/2021 19:13:00 - INFO - codeparrot_training - Step 21460: {'lr': 0.00030668351044155977, 'samples': 10988032, 'steps': 21460, 'batch_loss/train': 0.7920391522347927}
12/25/2021 19:13:14 - INFO - codeparrot_training - Step 21461: {'lr': 0.0003066681577185214, 'samples': 10988544, 'steps': 21461, 'batch_loss/train': 0.7787633575499058}
12/25/2021 19:13:25 - INFO - codeparrot_training - Step 21462: {'lr': 0.00030665280477019187, 'samples': 10989056, 'steps': 21462, 'batch_loss/train': 0.8181329872459173}
12/25/2021 19:13:35 - INFO - codeparrot_training - Step 21463: {'lr': 0.000306637451596632, 'samples': 10989568, 'steps': 21463, 'batch_loss/train': 0.796400876250118}
12/25/2021 19:13:47 - INFO - codeparrot_training - Step 21464: {'lr': 0.0003066220981979032, 'samples': 10990080, 'steps': 21464, 'batch_loss/train': 0.8834101405227557}
12/25/2021 19:13:58 - INFO - codeparrot_training - Step 21465: {'lr': 0.0003066067445740663, 'samples': 10990592, 'steps': 21465, 'batch_loss/train': 0.8881001222180203}
12/25/2021 19:14:09 - INFO - codeparrot_training - Step 21466: {'lr': 0.0003065913907251823, 'samples': 10991104, 'steps': 21466, 'batch_loss/train': 0.7532036965712905}
12/25/2021 19:14:19 - INFO - codeparrot_training - Step 21467: {'lr': 0.0003065760366513123, 'samples': 10991616, 'steps': 21467, 'batch_loss/train': 0.8258855273015797}
12/25/2021 19:14:31 - INFO - codeparrot_training - Step 21468: {'lr': 0.0003065606823525173, 'samples': 10992128, 'steps': 21468, 'batch_loss/train': 0.7701048315502703}
12/25/2021 19:14:42 - INFO - codeparrot_training - Step 21469: {'lr': 0.00030654532782885843, 'samples': 10992640, 'steps': 21469, 'batch_loss/train': 0.8375124773010612}
12/25/2021 19:14:52 - INFO - codeparrot_training - Step 21470: {'lr': 0.0003065299730803967, 'samples': 10993152, 'steps': 21470, 'batch_loss/train': 0.7781537671107799}
12/25/2021 19:15:07 - INFO - codeparrot_training - Step 21471: {'lr': 0.0003065146181071931, 'samples': 10993664, 'steps': 21471, 'batch_loss/train': 0.7678787987679243}
12/25/2021 19:15:17 - INFO - codeparrot_training - Step 21472: {'lr': 0.00030649926290930875, 'samples': 10994176, 'steps': 21472, 'batch_loss/train': 0.7347242555115372}
12/25/2021 19:15:28 - INFO - codeparrot_training - Step 21473: {'lr': 0.00030648390748680464, 'samples': 10994688, 'steps': 21473, 'batch_loss/train': 0.8571016741916537}
12/25/2021 19:15:40 - INFO - codeparrot_training - Step 21474: {'lr': 0.0003064685518397419, 'samples': 10995200, 'steps': 21474, 'batch_loss/train': 0.7505806870758533}
12/25/2021 19:15:51 - INFO - codeparrot_training - Step 
21475: {'lr': 0.0003064531959681815, 'samples': 10995712, 'steps': 21475, 'batch_loss/train': 0.684099250822328} 12/25/2021 19:16:01 - INFO - codeparrot_training - Step 21476: {'lr': 0.00030643783987218445, 'samples': 10996224, 'steps': 21476, 'batch_loss/train': 0.8439376847818494} 12/25/2021 19:16:12 - INFO - codeparrot_training - Step 21477: {'lr': 0.0003064224835518119, 'samples': 10996736, 'steps': 21477, 'batch_loss/train': 0.6442134355893359} 12/25/2021 19:16:24 - INFO - codeparrot_training - Step 21478: {'lr': 0.00030640712700712484, 'samples': 10997248, 'steps': 21478, 'batch_loss/train': 0.799895079806447} 12/25/2021 19:16:34 - INFO - codeparrot_training - Step 21479: {'lr': 0.00030639177023818436, 'samples': 10997760, 'steps': 21479, 'batch_loss/train': 1.6452938793227077} 12/25/2021 19:16:45 - INFO - codeparrot_training - Step 21480: {'lr': 0.0003063764132450515, 'samples': 10998272, 'steps': 21480, 'batch_loss/train': 0.78654097719118} 12/25/2021 19:16:57 - INFO - codeparrot_training - Step 21481: {'lr': 0.00030636105602778734, 'samples': 10998784, 'steps': 21481, 'batch_loss/train': 0.7819621129892766} 12/25/2021 19:17:08 - INFO - codeparrot_training - Step 21482: {'lr': 0.00030634569858645276, 'samples': 10999296, 'steps': 21482, 'batch_loss/train': 0.7967916075140238} 12/25/2021 19:17:18 - INFO - codeparrot_training - Step 21483: {'lr': 0.0003063303409211091, 'samples': 10999808, 'steps': 21483, 'batch_loss/train': 0.637115869903937} 12/25/2021 19:17:32 - INFO - codeparrot_training - Step 21484: {'lr': 0.00030631498303181727, 'samples': 11000320, 'steps': 21484, 'batch_loss/train': 0.7604306656867266} 12/25/2021 19:17:43 - INFO - codeparrot_training - Step 21485: {'lr': 0.00030629962491863827, 'samples': 11000832, 'steps': 21485, 'batch_loss/train': 0.6580871725454926} 12/25/2021 19:17:54 - INFO - codeparrot_training - Step 21486: {'lr': 0.00030628426658163327, 'samples': 11001344, 'steps': 21486, 'batch_loss/train': 0.626587797421962} 12/25/2021 
19:18:04 - INFO - codeparrot_training - Step 21487: {'lr': 0.0003062689080208632, 'samples': 11001856, 'steps': 21487, 'batch_loss/train': 0.779495581286028} 12/25/2021 19:18:16 - INFO - codeparrot_training - Step 21488: {'lr': 0.0003062535492363893, 'samples': 11002368, 'steps': 21488, 'batch_loss/train': 0.7475925404578447} 12/25/2021 19:18:27 - INFO - codeparrot_training - Step 21489: {'lr': 0.00030623819022827244, 'samples': 11002880, 'steps': 21489, 'batch_loss/train': 0.7584715050179511} 12/25/2021 19:18:38 - INFO - codeparrot_training - Step 21490: {'lr': 0.0003062228309965739, 'samples': 11003392, 'steps': 21490, 'batch_loss/train': 0.6444059884524904} 12/25/2021 19:18:51 - INFO - codeparrot_training - Step 21491: {'lr': 0.0003062074715413545, 'samples': 11003904, 'steps': 21491, 'batch_loss/train': 0.7990823015570641} 12/25/2021 19:19:02 - INFO - codeparrot_training - Step 21492: {'lr': 0.00030619211186267537, 'samples': 11004416, 'steps': 21492, 'batch_loss/train': 0.8253949880599976} 12/25/2021 19:19:13 - INFO - codeparrot_training - Step 21493: {'lr': 0.0003061767519605977, 'samples': 11004928, 'steps': 21493, 'batch_loss/train': 0.7656127260997891} 12/25/2021 19:19:25 - INFO - codeparrot_training - Step 21494: {'lr': 0.00030616139183518257, 'samples': 11005440, 'steps': 21494, 'batch_loss/train': 0.7899115364998579} 12/25/2021 19:19:35 - INFO - codeparrot_training - Step 21495: {'lr': 0.00030614603148649075, 'samples': 11005952, 'steps': 21495, 'batch_loss/train': 0.825658711604774} 12/25/2021 19:19:46 - INFO - codeparrot_training - Step 21496: {'lr': 0.0003061306709145836, 'samples': 11006464, 'steps': 21496, 'batch_loss/train': 0.8389479103498161} 12/25/2021 19:19:56 - INFO - codeparrot_training - Step 21497: {'lr': 0.00030611531011952216, 'samples': 11006976, 'steps': 21497, 'batch_loss/train': 0.7228121189400554} 12/25/2021 19:20:08 - INFO - codeparrot_training - Step 21498: {'lr': 0.0003060999491013674, 'samples': 11007488, 'steps': 21498, 
'batch_loss/train': 0.7785449405200779} 12/25/2021 19:20:19 - INFO - codeparrot_training - Step 21499: {'lr': 0.0003060845878601803, 'samples': 11008000, 'steps': 21499, 'batch_loss/train': 0.8105968488380313} 12/25/2021 19:20:30 - INFO - codeparrot_training - Step 21500: {'lr': 0.00030606922639602217, 'samples': 11008512, 'steps': 21500, 'batch_loss/train': 0.7528745052404702} 12/25/2021 19:20:43 - INFO - codeparrot_training - Step 21501: {'lr': 0.00030605386470895394, 'samples': 11009024, 'steps': 21501, 'batch_loss/train': 0.7741949977935292} 12/25/2021 19:20:54 - INFO - codeparrot_training - Step 21502: {'lr': 0.00030603850279903674, 'samples': 11009536, 'steps': 21502, 'batch_loss/train': 0.8041209471412003} 12/25/2021 19:21:05 - INFO - codeparrot_training - Step 21503: {'lr': 0.00030602314066633166, 'samples': 11010048, 'steps': 21503, 'batch_loss/train': 0.7888090070337057} 12/25/2021 19:21:17 - INFO - codeparrot_training - Step 21504: {'lr': 0.00030600777831089963, 'samples': 11010560, 'steps': 21504, 'batch_loss/train': 0.7442550868727267} 12/25/2021 19:21:27 - INFO - codeparrot_training - Step 21505: {'lr': 0.00030599241573280186, 'samples': 11011072, 'steps': 21505, 'batch_loss/train': 0.8222315954044461} 12/25/2021 19:21:38 - INFO - codeparrot_training - Step 21506: {'lr': 0.0003059770529320994, 'samples': 11011584, 'steps': 21506, 'batch_loss/train': 0.756180805619806} 12/25/2021 19:21:49 - INFO - codeparrot_training - Step 21507: {'lr': 0.0003059616899088533, 'samples': 11012096, 'steps': 21507, 'batch_loss/train': 0.798927997238934} 12/25/2021 19:22:02 - INFO - codeparrot_training - Step 21508: {'lr': 0.0003059463266631246, 'samples': 11012608, 'steps': 21508, 'batch_loss/train': 0.8565654354169965} 12/25/2021 19:22:13 - INFO - codeparrot_training - Step 21509: {'lr': 0.00030593096319497446, 'samples': 11013120, 'steps': 21509, 'batch_loss/train': 0.8989107366651297} 12/25/2021 19:22:23 - INFO - codeparrot_training - Step 21510: {'lr': 
0.00030591559950446395, 'samples': 11013632, 'steps': 21510, 'batch_loss/train': 0.7178655941970646} 12/25/2021 19:22:35 - INFO - codeparrot_training - Step 21511: {'lr': 0.00030590023559165416, 'samples': 11014144, 'steps': 21511, 'batch_loss/train': 0.820986470207572} 12/25/2021 19:22:46 - INFO - codeparrot_training - Step 21512: {'lr': 0.00030588487145660607, 'samples': 11014656, 'steps': 21512, 'batch_loss/train': 0.8866018187254667} 12/25/2021 19:22:57 - INFO - codeparrot_training - Step 21513: {'lr': 0.0003058695070993809, 'samples': 11015168, 'steps': 21513, 'batch_loss/train': 0.8155271667055786} 12/25/2021 19:23:11 - INFO - codeparrot_training - Step 21514: {'lr': 0.0003058541425200396, 'samples': 11015680, 'steps': 21514, 'batch_loss/train': 0.6956935673952103} 12/25/2021 19:23:22 - INFO - codeparrot_training - Step 21515: {'lr': 0.00030583877771864335, 'samples': 11016192, 'steps': 21515, 'batch_loss/train': 0.7926079761236906} 12/25/2021 19:23:32 - INFO - codeparrot_training - Step 21516: {'lr': 0.00030582341269525323, 'samples': 11016704, 'steps': 21516, 'batch_loss/train': 0.7247236706316471} 12/25/2021 19:23:43 - INFO - codeparrot_training - Step 21517: {'lr': 0.00030580804744993024, 'samples': 11017216, 'steps': 21517, 'batch_loss/train': 0.8504107277840376} 12/25/2021 19:23:55 - INFO - codeparrot_training - Step 21518: {'lr': 0.00030579268198273557, 'samples': 11017728, 'steps': 21518, 'batch_loss/train': 1.0347618041560054} 12/25/2021 19:24:05 - INFO - codeparrot_training - Step 21519: {'lr': 0.0003057773162937302, 'samples': 11018240, 'steps': 21519, 'batch_loss/train': 0.7610499548027292} 12/25/2021 19:24:16 - INFO - codeparrot_training - Step 21520: {'lr': 0.0003057619503829754, 'samples': 11018752, 'steps': 21520, 'batch_loss/train': 0.7983681075274944} 12/25/2021 19:24:28 - INFO - codeparrot_training - Step 21521: {'lr': 0.00030574658425053204, 'samples': 11019264, 'steps': 21521, 'batch_loss/train': 0.7051816461607814} 12/25/2021 19:24:39 - 
INFO - codeparrot_training - Step 21522: {'lr': 0.0003057312178964613, 'samples': 11019776, 'steps': 21522, 'batch_loss/train': 0.7188307205215096} 12/25/2021 19:24:49 - INFO - codeparrot_training - Step 21523: {'lr': 0.0003057158513208244, 'samples': 11020288, 'steps': 21523, 'batch_loss/train': 0.8233579107327387} 12/25/2021 19:25:03 - INFO - codeparrot_training - Step 21524: {'lr': 0.00030570048452368214, 'samples': 11020800, 'steps': 21524, 'batch_loss/train': 0.6742327283136547} 12/25/2021 19:25:14 - INFO - codeparrot_training - Step 21525: {'lr': 0.00030568511750509597, 'samples': 11021312, 'steps': 21525, 'batch_loss/train': 0.6460121538839303} 12/25/2021 19:25:24 - INFO - codeparrot_training - Step 21526: {'lr': 0.00030566975026512667, 'samples': 11021824, 'steps': 21526, 'batch_loss/train': 0.7418652754276991} 12/25/2021 19:25:35 - INFO - codeparrot_training - Step 21527: {'lr': 0.0003056543828038355, 'samples': 11022336, 'steps': 21527, 'batch_loss/train': 0.8731283869128674} 12/25/2021 19:25:47 - INFO - codeparrot_training - Step 21528: {'lr': 0.00030563901512128353, 'samples': 11022848, 'steps': 21528, 'batch_loss/train': 0.8034256100654602} 12/25/2021 19:25:58 - INFO - codeparrot_training - Step 21529: {'lr': 0.0003056236472175319, 'samples': 11023360, 'steps': 21529, 'batch_loss/train': 0.7041905983351171} 12/25/2021 19:26:08 - INFO - codeparrot_training - Step 21530: {'lr': 0.00030560827909264154, 'samples': 11023872, 'steps': 21530, 'batch_loss/train': 0.763848640024662} 12/25/2021 19:26:20 - INFO - codeparrot_training - Step 21531: {'lr': 0.0003055929107466737, 'samples': 11024384, 'steps': 21531, 'batch_loss/train': 0.8294296996900812} 12/25/2021 19:26:31 - INFO - codeparrot_training - Step 21532: {'lr': 0.00030557754217968946, 'samples': 11024896, 'steps': 21532, 'batch_loss/train': 0.7288417709060013} 12/25/2021 19:26:41 - INFO - codeparrot_training - Step 21533: {'lr': 0.00030556217339174985, 'samples': 11025408, 'steps': 21533, 
'batch_loss/train': 0.798798781586811} 12/25/2021 19:26:55 - INFO - codeparrot_training - Step 21534: {'lr': 0.00030554680438291603, 'samples': 11025920, 'steps': 21534, 'batch_loss/train': 0.7110007093288004} 12/25/2021 19:27:06 - INFO - codeparrot_training - Step 21535: {'lr': 0.0003055314351532491, 'samples': 11026432, 'steps': 21535, 'batch_loss/train': 0.8010717798024416} 12/25/2021 19:27:16 - INFO - codeparrot_training - Step 21536: {'lr': 0.0003055160657028102, 'samples': 11026944, 'steps': 21536, 'batch_loss/train': 0.6627857001149096} 12/25/2021 19:27:29 - INFO - codeparrot_training - Step 21537: {'lr': 0.0003055006960316603, 'samples': 11027456, 'steps': 21537, 'batch_loss/train': 0.8290005377493799} 12/25/2021 19:27:39 - INFO - codeparrot_training - Step 21538: {'lr': 0.0003054853261398607, 'samples': 11027968, 'steps': 21538, 'batch_loss/train': 0.7016949476674199} 12/25/2021 19:27:50 - INFO - codeparrot_training - Step 21539: {'lr': 0.0003054699560274723, 'samples': 11028480, 'steps': 21539, 'batch_loss/train': 0.7810685178264976} 12/25/2021 19:28:00 - INFO - codeparrot_training - Step 21540: {'lr': 0.0003054545856945563, 'samples': 11028992, 'steps': 21540, 'batch_loss/train': 0.6725348671898246} 12/25/2021 19:28:14 - INFO - codeparrot_training - Step 21541: {'lr': 0.0003054392151411738, 'samples': 11029504, 'steps': 21541, 'batch_loss/train': 0.7738049337640405} 12/25/2021 19:28:25 - INFO - codeparrot_training - Step 21542: {'lr': 0.00030542384436738605, 'samples': 11030016, 'steps': 21542, 'batch_loss/train': 0.7345831422135234} 12/25/2021 19:28:36 - INFO - codeparrot_training - Step 21543: {'lr': 0.0003054084733732539, 'samples': 11030528, 'steps': 21543, 'batch_loss/train': 0.704367034137249} 12/25/2021 19:28:48 - INFO - codeparrot_training - Step 21544: {'lr': 0.00030539310215883857, 'samples': 11031040, 'steps': 21544, 'batch_loss/train': 0.7405444607138634} 12/25/2021 19:28:58 - INFO - codeparrot_training - Step 21545: {'lr': 
0.0003053777307242012, 'samples': 11031552, 'steps': 21545, 'batch_loss/train': 0.6337972934124991} 12/25/2021 19:29:09 - INFO - codeparrot_training - Step 21546: {'lr': 0.0003053623590694029, 'samples': 11032064, 'steps': 21546, 'batch_loss/train': 0.7715261895209551} 12/25/2021 19:29:21 - INFO - codeparrot_training - Step 21547: {'lr': 0.00030534698719450476, 'samples': 11032576, 'steps': 21547, 'batch_loss/train': 0.6819934230297804} 12/25/2021 19:29:31 - INFO - codeparrot_training - Step 21548: {'lr': 0.0003053316150995679, 'samples': 11033088, 'steps': 21548, 'batch_loss/train': 0.7589010433293879} 12/25/2021 19:29:42 - INFO - codeparrot_training - Step 21549: {'lr': 0.00030531624278465347, 'samples': 11033600, 'steps': 21549, 'batch_loss/train': 0.7795665613375604} 12/25/2021 19:29:53 - INFO - codeparrot_training - Step 21550: {'lr': 0.00030530087024982245, 'samples': 11034112, 'steps': 21550, 'batch_loss/train': 0.7418425483629107} 12/25/2021 19:30:05 - INFO - codeparrot_training - Step 21551: {'lr': 0.0003052854974951361, 'samples': 11034624, 'steps': 21551, 'batch_loss/train': 0.7675558952614665} 12/25/2021 19:30:15 - INFO - codeparrot_training - Step 21552: {'lr': 0.0003052701245206555, 'samples': 11035136, 'steps': 21552, 'batch_loss/train': 0.7962346253916621} 12/25/2021 19:30:26 - INFO - codeparrot_training - Step 21553: {'lr': 0.00030525475132644174, 'samples': 11035648, 'steps': 21553, 'batch_loss/train': 0.7568629134912044} 12/25/2021 19:30:40 - INFO - codeparrot_training - Step 21554: {'lr': 0.00030523937791255597, 'samples': 11036160, 'steps': 21554, 'batch_loss/train': 0.6683776509016752} 12/25/2021 19:30:51 - INFO - codeparrot_training - Step 21555: {'lr': 0.00030522400427905936, 'samples': 11036672, 'steps': 21555, 'batch_loss/train': 0.6916555161587894} 12/25/2021 19:31:01 - INFO - codeparrot_training - Step 21556: {'lr': 0.00030520863042601283, 'samples': 11037184, 'steps': 21556, 'batch_loss/train': 0.8272879029391333} 12/25/2021 19:31:14 - 
INFO - codeparrot_training - Step 21557: {'lr': 0.0003051932563534776, 'samples': 11037696, 'steps': 21557, 'batch_loss/train': 0.7408475866541266} 12/25/2021 19:31:25 - INFO - codeparrot_training - Step 21558: {'lr': 0.000305177882061515, 'samples': 11038208, 'steps': 21558, 'batch_loss/train': 0.6112666358239949} 12/25/2021 19:31:35 - INFO - codeparrot_training - Step 21559: {'lr': 0.00030516250755018584, 'samples': 11038720, 'steps': 21559, 'batch_loss/train': 0.7169027994386852} 12/25/2021 19:31:46 - INFO - codeparrot_training - Step 21560: {'lr': 0.00030514713281955144, 'samples': 11039232, 'steps': 21560, 'batch_loss/train': 0.7966790962964296} 12/25/2021 19:32:00 - INFO - codeparrot_training - Step 21561: {'lr': 0.00030513175786967277, 'samples': 11039744, 'steps': 21561, 'batch_loss/train': 0.9048821250908077} 12/25/2021 19:32:10 - INFO - codeparrot_training - Step 21562: {'lr': 0.0003051163827006112, 'samples': 11040256, 'steps': 21562, 'batch_loss/train': 0.8478874005377293} 12/25/2021 19:32:21 - INFO - codeparrot_training - Step 21563: {'lr': 0.0003051010073124276, 'samples': 11040768, 'steps': 21563, 'batch_loss/train': 0.7740449691191316} 12/25/2021 19:32:33 - INFO - codeparrot_training - Step 21564: {'lr': 0.0003050856317051832, 'samples': 11041280, 'steps': 21564, 'batch_loss/train': 0.5248787282616831} 12/25/2021 19:32:44 - INFO - codeparrot_training - Step 21565: {'lr': 0.00030507025587893914, 'samples': 11041792, 'steps': 21565, 'batch_loss/train': 0.6766049955040216} 12/25/2021 19:32:54 - INFO - codeparrot_training - Step 21566: {'lr': 0.00030505487983375655, 'samples': 11042304, 'steps': 21566, 'batch_loss/train': 0.8450709078460932} 12/25/2021 19:33:06 - INFO - codeparrot_training - Step 21567: {'lr': 0.0003050395035696966, 'samples': 11042816, 'steps': 21567, 'batch_loss/train': 0.702679303009063} 12/25/2021 19:33:17 - INFO - codeparrot_training - Step 21568: {'lr': 0.0003050241270868203, 'samples': 11043328, 'steps': 21568, 
'batch_loss/train': 0.6694721831008792} 12/25/2021 19:33:28 - INFO - codeparrot_training - Step 21569: {'lr': 0.0003050087503851888, 'samples': 11043840, 'steps': 21569, 'batch_loss/train': 0.8429274423979223} 12/25/2021 19:33:42 - INFO - codeparrot_training - Step 21570: {'lr': 0.0003049933734648633, 'samples': 11044352, 'steps': 21570, 'batch_loss/train': 0.6983583504334092} 12/25/2021 19:33:52 - INFO - codeparrot_training - Step 21571: {'lr': 0.00030497799632590503, 'samples': 11044864, 'steps': 21571, 'batch_loss/train': 0.7364636103156954} 12/25/2021 19:34:03 - INFO - codeparrot_training - Step 21572: {'lr': 0.00030496261896837486, 'samples': 11045376, 'steps': 21572, 'batch_loss/train': 0.5761338582378812} 12/25/2021 19:34:13 - INFO - codeparrot_training - Step 21573: {'lr': 0.00030494724139233415, 'samples': 11045888, 'steps': 21573, 'batch_loss/train': 0.6713157123886049} 12/25/2021 19:34:25 - INFO - codeparrot_training - Step 21574: {'lr': 0.00030493186359784394, 'samples': 11046400, 'steps': 21574, 'batch_loss/train': 0.7102039521560073} 12/25/2021 19:34:36 - INFO - codeparrot_training - Step 21575: {'lr': 0.0003049164855849654, 'samples': 11046912, 'steps': 21575, 'batch_loss/train': 0.7506180568598211} 12/25/2021 19:34:47 - INFO - codeparrot_training - Step 21576: {'lr': 0.0003049011073537595, 'samples': 11047424, 'steps': 21576, 'batch_loss/train': 0.778630904853344} 12/25/2021 19:34:59 - INFO - codeparrot_training - Step 21577: {'lr': 0.0003048857289042878, 'samples': 11047936, 'steps': 21577, 'batch_loss/train': 0.6782687330851331} 12/25/2021 19:35:09 - INFO - codeparrot_training - Step 21578: {'lr': 0.000304870350236611, 'samples': 11048448, 'steps': 21578, 'batch_loss/train': 0.665055800229311} 12/25/2021 19:35:20 - INFO - codeparrot_training - Step 21579: {'lr': 0.0003048549713507904, 'samples': 11048960, 'steps': 21579, 'batch_loss/train': 0.6822619618615136} 12/25/2021 19:35:34 - INFO - codeparrot_training - Step 21580: {'lr': 
0.0003048395922468871, 'samples': 11049472, 'steps': 21580, 'batch_loss/train': 0.8234840631484985} 12/25/2021 19:35:45 - INFO - codeparrot_training - Step 21581: {'lr': 0.0003048242129249624, 'samples': 11049984, 'steps': 21581, 'batch_loss/train': 0.5694918700028211} 12/25/2021 19:35:55 - INFO - codeparrot_training - Step 21582: {'lr': 0.00030480883338507726, 'samples': 11050496, 'steps': 21582, 'batch_loss/train': 0.8013955457136035} 12/25/2021 19:36:07 - INFO - codeparrot_training - Step 21583: {'lr': 0.00030479345362729294, 'samples': 11051008, 'steps': 21583, 'batch_loss/train': 0.727657597977668} 12/25/2021 19:36:18 - INFO - codeparrot_training - Step 21584: {'lr': 0.00030477807365167055, 'samples': 11051520, 'steps': 21584, 'batch_loss/train': 0.7598607949912548} 12/25/2021 19:36:28 - INFO - codeparrot_training - Step 21585: {'lr': 0.0003047626934582712, 'samples': 11052032, 'steps': 21585, 'batch_loss/train': 0.6184044575784355} 12/25/2021 19:36:39 - INFO - codeparrot_training - Step 21586: {'lr': 0.00030474731304715603, 'samples': 11052544, 'steps': 21586, 'batch_loss/train': 0.7038221176480874} 12/25/2021 19:36:51 - INFO - codeparrot_training - Step 21587: {'lr': 0.0003047319324183862, 'samples': 11053056, 'steps': 21587, 'batch_loss/train': 0.7442520293407142} 12/25/2021 19:37:02 - INFO - codeparrot_training - Step 21588: {'lr': 0.000304716551572023, 'samples': 11053568, 'steps': 21588, 'batch_loss/train': 0.7371418550610542} 12/25/2021 19:37:12 - INFO - codeparrot_training - Step 21589: {'lr': 0.0003047011705081273, 'samples': 11054080, 'steps': 21589, 'batch_loss/train': 0.749401013366878} 12/25/2021 19:37:27 - INFO - codeparrot_training - Step 21590: {'lr': 0.0003046857892267605, 'samples': 11054592, 'steps': 21590, 'batch_loss/train': 0.7787123904563487} 12/25/2021 19:37:37 - INFO - codeparrot_training - Step 21591: {'lr': 0.0003046704077279836, 'samples': 11055104, 'steps': 21591, 'batch_loss/train': 0.7568726111203432} 12/25/2021 19:37:48 - INFO - 
codeparrot_training - Step 21592: {'lr': 0.0003046550260118578, 'samples': 11055616, 'steps': 21592, 'batch_loss/train': 0.7616159012541175} 12/25/2021 19:38:00 - INFO - codeparrot_training - Step 21593: {'lr': 0.0003046396440784442, 'samples': 11056128, 'steps': 21593, 'batch_loss/train': 0.7165712551213801} 12/25/2021 19:38:11 - INFO - codeparrot_training - Step 21594: {'lr': 0.00030462426192780413, 'samples': 11056640, 'steps': 21594, 'batch_loss/train': 0.7473671801853925} 12/25/2021 19:38:22 - INFO - codeparrot_training - Step 21595: {'lr': 0.00030460887955999856, 'samples': 11057152, 'steps': 21595, 'batch_loss/train': 0.7146541127003729} 12/25/2021 19:38:32 - INFO - codeparrot_training - Step 21596: {'lr': 0.00030459349697508866, 'samples': 11057664, 'steps': 21596, 'batch_loss/train': 0.8372028499143198} 12/25/2021 19:38:44 - INFO - codeparrot_training - Step 21597: {'lr': 0.00030457811417313575, 'samples': 11058176, 'steps': 21597, 'batch_loss/train': 0.7925406186841428} 12/25/2021 19:38:55 - INFO - codeparrot_training - Step 21598: {'lr': 0.00030456273115420074, 'samples': 11058688, 'steps': 21598, 'batch_loss/train': 0.7716545015573502} 12/25/2021 19:39:05 - INFO - codeparrot_training - Step 21599: {'lr': 0.00030454734791834497, 'samples': 11059200, 'steps': 21599, 'batch_loss/train': 0.7598011074587703} 12/25/2021 19:39:19 - INFO - codeparrot_training - Step 21600: {'lr': 0.0003045319644656295, 'samples': 11059712, 'steps': 21600, 'batch_loss/train': 0.7347716800868511} 12/25/2021 19:39:30 - INFO - codeparrot_training - Step 21601: {'lr': 0.00030451658079611564, 'samples': 11060224, 'steps': 21601, 'batch_loss/train': 0.7519646801520139} 12/25/2021 19:39:40 - INFO - codeparrot_training - Step 21602: {'lr': 0.0003045011969098643, 'samples': 11060736, 'steps': 21602, 'batch_loss/train': 0.8086389470845461} 12/25/2021 19:39:51 - INFO - codeparrot_training - Step 21603: {'lr': 0.0003044858128069369, 'samples': 11061248, 'steps': 21603, 'batch_loss/train': 
0.81368796993047} 12/25/2021 19:40:03 - INFO - codeparrot_training - Step 21604: {'lr': 0.00030447042848739443, 'samples': 11061760, 'steps': 21604, 'batch_loss/train': 0.857514695264399} 12/25/2021 19:40:14 - INFO - codeparrot_training - Step 21605: {'lr': 0.0003044550439512981, 'samples': 11062272, 'steps': 21605, 'batch_loss/train': 0.7402027840726078} 12/25/2021 19:40:24 - INFO - codeparrot_training - Step 21606: {'lr': 0.00030443965919870906, 'samples': 11062784, 'steps': 21606, 'batch_loss/train': 0.7784549696370959} 12/25/2021 19:40:36 - INFO - codeparrot_training - Step 21607: {'lr': 0.0003044242742296885, 'samples': 11063296, 'steps': 21607, 'batch_loss/train': 0.7602799152955413} 12/25/2021 19:40:47 - INFO - codeparrot_training - Step 21608: {'lr': 0.0003044088890442976, 'samples': 11063808, 'steps': 21608, 'batch_loss/train': 0.8536348612979054} 12/25/2021 19:40:58 - INFO - codeparrot_training - Step 21609: {'lr': 0.00030439350364259743, 'samples': 11064320, 'steps': 21609, 'batch_loss/train': 0.6902810302563012} 12/25/2021 19:41:12 - INFO - codeparrot_training - Step 21610: {'lr': 0.0003043781180246493, 'samples': 11064832, 'steps': 21610, 'batch_loss/train': 0.7746408749371767} 12/25/2021 19:41:22 - INFO - codeparrot_training - Step 21611: {'lr': 0.0003043627321905143, 'samples': 11065344, 'steps': 21611, 'batch_loss/train': 0.774260763078928} 12/25/2021 19:41:33 - INFO - codeparrot_training - Step 21612: {'lr': 0.0003043473461402536, 'samples': 11065856, 'steps': 21612, 'batch_loss/train': 0.7441278668120503} 12/25/2021 19:41:45 - INFO - codeparrot_training - Step 21613: {'lr': 0.00030433195987392836, 'samples': 11066368, 'steps': 21613, 'batch_loss/train': 0.7896316172555089} 12/25/2021 19:41:56 - INFO - codeparrot_training - Step 21614: {'lr': 0.0003043165733915997, 'samples': 11066880, 'steps': 21614, 'batch_loss/train': 0.8230118211358786} 12/25/2021 19:42:06 - INFO - codeparrot_training - Step 21615: {'lr': 0.00030430118669332897, 'samples': 
11067392, 'steps': 21615, 'batch_loss/train': 0.8401936795562506} 12/25/2021 19:42:17 - INFO - codeparrot_training - Step 21616: {'lr': 0.0003042857997791772, 'samples': 11067904, 'steps': 21616, 'batch_loss/train': 0.8276388747617602} 12/25/2021 19:42:29 - INFO - codeparrot_training - Step 21617: {'lr': 0.0003042704126492056, 'samples': 11068416, 'steps': 21617, 'batch_loss/train': 0.7845488069579005} 12/25/2021 19:42:40 - INFO - codeparrot_training - Step 21618: {'lr': 0.00030425502530347526, 'samples': 11068928, 'steps': 21618, 'batch_loss/train': 0.732019501272589} 12/25/2021 19:42:51 - INFO - codeparrot_training - Step 21619: {'lr': 0.0003042396377420474, 'samples': 11069440, 'steps': 21619, 'batch_loss/train': 0.8255353351123631} 12/25/2021 19:43:03 - INFO - codeparrot_training - Step 21620: {'lr': 0.0003042242499649833, 'samples': 11069952, 'steps': 21620, 'batch_loss/train': 0.8377506611868739} 12/25/2021 19:43:13 - INFO - codeparrot_training - Step 21621: {'lr': 0.000304208861972344, 'samples': 11070464, 'steps': 21621, 'batch_loss/train': 0.6086444770917296} 12/25/2021 19:43:24 - INFO - codeparrot_training - Step 21622: {'lr': 0.0003041934737641908, 'samples': 11070976, 'steps': 21622, 'batch_loss/train': 0.8339142731856555} 12/25/2021 19:43:35 - INFO - codeparrot_training - Step 21623: {'lr': 0.0003041780853405848, 'samples': 11071488, 'steps': 21623, 'batch_loss/train': 0.700105267111212} 12/25/2021 19:43:49 - INFO - codeparrot_training - Step 21624: {'lr': 0.0003041626967015872, 'samples': 11072000, 'steps': 21624, 'batch_loss/train': 0.7496843765256926} 12/25/2021 19:44:00 - INFO - codeparrot_training - Step 21625: {'lr': 0.0003041473078472591, 'samples': 11072512, 'steps': 21625, 'batch_loss/train': 0.9562859735451639} 12/25/2021 19:44:10 - INFO - codeparrot_training - Step 21626: {'lr': 0.00030413191877766177, 'samples': 11073024, 'steps': 21626, 'batch_loss/train': 0.8121361304074526} 12/25/2021 19:44:22 - INFO - codeparrot_training - Step 21627: 
{'lr': 0.0003041165294928564, 'samples': 11073536, 'steps': 21627, 'batch_loss/train': 0.7653068844228983} 12/25/2021 19:44:33 - INFO - codeparrot_training - Step 21628: {'lr': 0.00030410113999290413, 'samples': 11074048, 'steps': 21628, 'batch_loss/train': 0.7761339074932039} 12/25/2021 19:44:44 - INFO - codeparrot_training - Step 21629: {'lr': 0.00030408575027786615, 'samples': 11074560, 'steps': 21629, 'batch_loss/train': 0.7346700703492388} 12/25/2021 19:44:58 - INFO - codeparrot_training - Step 21630: {'lr': 0.00030407036034780366, 'samples': 11075072, 'steps': 21630, 'batch_loss/train': 0.7655298011377454} 12/25/2021 19:45:08 - INFO - codeparrot_training - Step 21631: {'lr': 0.00030405497020277776, 'samples': 11075584, 'steps': 21631, 'batch_loss/train': 0.8312774784862995} 12/25/2021 19:45:19 - INFO - codeparrot_training - Step 21632: {'lr': 0.0003040395798428499, 'samples': 11076096, 'steps': 21632, 'batch_loss/train': 0.7836050200276077} 12/25/2021 19:45:29 - INFO - codeparrot_training - Step 21633: {'lr': 0.0003040241892680809, 'samples': 11076608, 'steps': 21633, 'batch_loss/train': 0.7148877572035417} 12/25/2021 19:45:41 - INFO - codeparrot_training - Step 21634: {'lr': 0.00030400879847853213, 'samples': 11077120, 'steps': 21634, 'batch_loss/train': 0.7410896318033338} 12/25/2021 19:45:52 - INFO - codeparrot_training - Step 21635: {'lr': 0.00030399340747426485, 'samples': 11077632, 'steps': 21635, 'batch_loss/train': 0.9714519092813134} 12/25/2021 19:46:03 - INFO - codeparrot_training - Step 21636: {'lr': 0.00030397801625534017, 'samples': 11078144, 'steps': 21636, 'batch_loss/train': 0.7650278927758336} 12/25/2021 19:46:15 - INFO - codeparrot_training - Step 21637: {'lr': 0.00030396262482181914, 'samples': 11078656, 'steps': 21637, 'batch_loss/train': 0.805614766664803} 12/25/2021 19:46:25 - INFO - codeparrot_training - Step 21638: {'lr': 0.0003039472331737633, 'samples': 11079168, 'steps': 21638, 'batch_loss/train': 0.7597398487851024} 12/25/2021 
19:46:36 - INFO - codeparrot_training - Step 21639: {'lr': 0.0003039318413112335, 'samples': 11079680, 'steps': 21639, 'batch_loss/train': 0.8248839667066932} 12/25/2021 19:46:50 - INFO - codeparrot_training - Step 21640: {'lr': 0.0003039164492342911, 'samples': 11080192, 'steps': 21640, 'batch_loss/train': 0.7798600578680634} 12/25/2021 19:47:01 - INFO - codeparrot_training - Step 21641: {'lr': 0.00030390105694299717, 'samples': 11080704, 'steps': 21641, 'batch_loss/train': 0.830640684813261} 12/25/2021 19:47:11 - INFO - codeparrot_training - Step 21642: {'lr': 0.0003038856644374131, 'samples': 11081216, 'steps': 21642, 'batch_loss/train': 0.7153996285051107} 12/25/2021 19:47:22 - INFO - codeparrot_training - Step 21643: {'lr': 0.00030387027171759995, 'samples': 11081728, 'steps': 21643, 'batch_loss/train': 0.7268636033404619} 12/25/2021 19:47:34 - INFO - codeparrot_training - Step 21644: {'lr': 0.00030385487878361884, 'samples': 11082240, 'steps': 21644, 'batch_loss/train': 0.7647067569196224} 12/25/2021 19:47:45 - INFO - codeparrot_training - Step 21645: {'lr': 0.0003038394856355312, 'samples': 11082752, 'steps': 21645, 'batch_loss/train': 0.8666303139179945} 12/25/2021 19:47:55 - INFO - codeparrot_training - Step 21646: {'lr': 0.000303824092273398, 'samples': 11083264, 'steps': 21646, 'batch_loss/train': 0.8432019650936127} 12/25/2021 19:48:07 - INFO - codeparrot_training - Step 21647: {'lr': 0.0003038086986972806, 'samples': 11083776, 'steps': 21647, 'batch_loss/train': 0.8164770128205419} 12/25/2021 19:48:18 - INFO - codeparrot_training - Step 21648: {'lr': 0.0003037933049072401, 'samples': 11084288, 'steps': 21648, 'batch_loss/train': 0.6843745214864612} 12/25/2021 19:48:28 - INFO - codeparrot_training - Step 21649: {'lr': 0.00030377791090333776, 'samples': 11084800, 'steps': 21649, 'batch_loss/train': 0.8159085782244802} 12/25/2021 19:48:42 - INFO - codeparrot_training - Step 21650: {'lr': 0.0003037625166856347, 'samples': 11085312, 'steps': 21650, 
'batch_loss/train': 0.672091371845454} 12/25/2021 19:48:53 - INFO - codeparrot_training - Step 21651: {'lr': 0.0003037471222541923, 'samples': 11085824, 'steps': 21651, 'batch_loss/train': 0.8565218914300203} 12/25/2021 19:49:04 - INFO - codeparrot_training - Step 21652: {'lr': 0.0003037317276090715, 'samples': 11086336, 'steps': 21652, 'batch_loss/train': 0.816854756558314} 12/25/2021 19:49:16 - INFO - codeparrot_training - Step 21653: {'lr': 0.0003037163327503337, 'samples': 11086848, 'steps': 21653, 'batch_loss/train': 0.7664710963144898} 12/25/2021 19:49:26 - INFO - codeparrot_training - Step 21654: {'lr': 0.00030370093767804, 'samples': 11087360, 'steps': 21654, 'batch_loss/train': 0.7195174186490476} 12/25/2021 19:49:37 - INFO - codeparrot_training - Step 21655: {'lr': 0.0003036855423922517, 'samples': 11087872, 'steps': 21655, 'batch_loss/train': 0.7865195535123348} 12/25/2021 19:49:48 - INFO - codeparrot_training - Step 21656: {'lr': 0.0003036701468930299, 'samples': 11088384, 'steps': 21656, 'batch_loss/train': 0.8315541911870241} 12/25/2021 19:50:00 - INFO - codeparrot_training - Step 21657: {'lr': 0.0003036547511804358, 'samples': 11088896, 'steps': 21657, 'batch_loss/train': 0.7851467034779489} 12/25/2021 19:50:10 - INFO - codeparrot_training - Step 21658: {'lr': 0.00030363935525453084, 'samples': 11089408, 'steps': 21658, 'batch_loss/train': 0.713365366216749} 12/25/2021 19:50:21 - INFO - codeparrot_training - Step 21659: {'lr': 0.000303623959115376, 'samples': 11089920, 'steps': 21659, 'batch_loss/train': 0.7853684592992067} 12/25/2021 19:50:33 - INFO - codeparrot_training - Step 21660: {'lr': 0.0003036085627630325, 'samples': 11090432, 'steps': 21660, 'batch_loss/train': 0.7862247256562114} 12/25/2021 19:50:44 - INFO - codeparrot_training - Step 21661: {'lr': 0.00030359316619756163, 'samples': 11090944, 'steps': 21661, 'batch_loss/train': 0.7020458495244384} 12/25/2021 19:50:54 - INFO - codeparrot_training - Step 21662: {'lr': 0.00030357776941902456, 
'samples': 11091456, 'steps': 21662, 'batch_loss/train': 0.7413963340222836} 12/25/2021 19:51:08 - INFO - codeparrot_training - Step 21663: {'lr': 0.0003035623724274824, 'samples': 11091968, 'steps': 21663, 'batch_loss/train': 0.7558290930464864} 12/25/2021 19:51:19 - INFO - codeparrot_training - Step 21664: {'lr': 0.0003035469752229966, 'samples': 11092480, 'steps': 21664, 'batch_loss/train': 0.737107404274866} 12/25/2021 19:51:30 - INFO - codeparrot_training - Step 21665: {'lr': 0.0003035315778056282, 'samples': 11092992, 'steps': 21665, 'batch_loss/train': 0.7323341458104551} 12/25/2021 19:51:40 - INFO - codeparrot_training - Step 21666: {'lr': 0.0003035161801754385, 'samples': 11093504, 'steps': 21666, 'batch_loss/train': 0.8042996991425753} 12/25/2021 19:51:52 - INFO - codeparrot_training - Step 21667: {'lr': 0.00030350078233248867, 'samples': 11094016, 'steps': 21667, 'batch_loss/train': 0.752497692592442} 12/25/2021 19:52:03 - INFO - codeparrot_training - Step 21668: {'lr': 0.00030348538427683993, 'samples': 11094528, 'steps': 21668, 'batch_loss/train': 0.8626807052642107} 12/25/2021 19:52:14 - INFO - codeparrot_training - Step 21669: {'lr': 0.0003034699860085535, 'samples': 11095040, 'steps': 21669, 'batch_loss/train': 0.8161943200975657} 12/25/2021 19:52:27 - INFO - codeparrot_training - Step 21670: {'lr': 0.0003034545875276905, 'samples': 11095552, 'steps': 21670, 'batch_loss/train': 0.8130249977111816} 12/25/2021 19:52:38 - INFO - codeparrot_training - Step 21671: {'lr': 0.00030343918883431236, 'samples': 11096064, 'steps': 21671, 'batch_loss/train': 0.7279970948584378} 12/25/2021 19:52:48 - INFO - codeparrot_training - Step 21672: {'lr': 0.00030342378992848014, 'samples': 11096576, 'steps': 21672, 'batch_loss/train': 0.847705646418035} 12/25/2021 19:53:01 - INFO - codeparrot_training - Step 21673: {'lr': 0.0003034083908102551, 'samples': 11097088, 'steps': 21673, 'batch_loss/train': 0.8196066310629249} 12/25/2021 19:53:11 - INFO - codeparrot_training - 
Step 21674: {'lr': 0.00030339299147969846, 'samples': 11097600, 'steps': 21674, 'batch_loss/train': 0.613457562867552} 12/25/2021 19:53:22 - INFO - codeparrot_training - Step 21675: {'lr': 0.00030337759193687155, 'samples': 11098112, 'steps': 21675, 'batch_loss/train': 0.892320946790278} 12/25/2021 19:53:32 - INFO - codeparrot_training - Step 21676: {'lr': 0.00030336219218183533, 'samples': 11098624, 'steps': 21676, 'batch_loss/train': 0.7905655850190669} 12/25/2021 19:53:45 - INFO - codeparrot_training - Step 21677: {'lr': 0.00030334679221465127, 'samples': 11099136, 'steps': 21677, 'batch_loss/train': 0.7476136708864942} 12/25/2021 19:53:55 - INFO - codeparrot_training - Step 21678: {'lr': 0.00030333139203538045, 'samples': 11099648, 'steps': 21678, 'batch_loss/train': 0.8339485260657966} 12/25/2021 19:54:06 - INFO - codeparrot_training - Step 21679: {'lr': 0.0003033159916440842, 'samples': 11100160, 'steps': 21679, 'batch_loss/train': 0.8282827297225595} 12/25/2021 19:54:20 - INFO - codeparrot_training - Step 21680: {'lr': 0.0003033005910408237, 'samples': 11100672, 'steps': 21680, 'batch_loss/train': 0.6522509437054396} 12/25/2021 19:54:31 - INFO - codeparrot_training - Step 21681: {'lr': 0.00030328519022566013, 'samples': 11101184, 'steps': 21681, 'batch_loss/train': 0.7111117015592754} 12/25/2021 19:54:41 - INFO - codeparrot_training - Step 21682: {'lr': 0.0003032697891986548, 'samples': 11101696, 'steps': 21682, 'batch_loss/train': 0.737415261566639} 12/25/2021 19:54:53 - INFO - codeparrot_training - Step 21683: {'lr': 0.00030325438795986885, 'samples': 11102208, 'steps': 21683, 'batch_loss/train': 0.8037994373589754} 12/25/2021 19:55:04 - INFO - codeparrot_training - Step 21684: {'lr': 0.00030323898650936367, 'samples': 11102720, 'steps': 21684, 'batch_loss/train': 0.7859842564212158} 12/25/2021 19:55:15 - INFO - codeparrot_training - Step 21685: {'lr': 0.0003032235848472003, 'samples': 11103232, 'steps': 21685, 'batch_loss/train': 0.7193423183634877} 
12/25/2021 19:55:28 - INFO - codeparrot_training - Step 21686: {'lr': 0.0003032081829734401, 'samples': 11103744, 'steps': 21686, 'batch_loss/train': 0.7777295159175992} 12/25/2021 19:55:39 - INFO - codeparrot_training - Step 21687: {'lr': 0.0003031927808881442, 'samples': 11104256, 'steps': 21687, 'batch_loss/train': 0.7436101492494345} 12/25/2021 19:55:50 - INFO - codeparrot_training - Step 21688: {'lr': 0.000303177378591374, 'samples': 11104768, 'steps': 21688, 'batch_loss/train': 0.8823218918405473} 12/25/2021 19:56:00 - INFO - codeparrot_training - Step 21689: {'lr': 0.00030316197608319044, 'samples': 11105280, 'steps': 21689, 'batch_loss/train': 0.7106430421117693} 12/25/2021 19:56:12 - INFO - codeparrot_training - Step 21690: {'lr': 0.00030314657336365507, 'samples': 11105792, 'steps': 21690, 'batch_loss/train': 0.7875736132264137} 12/25/2021 19:56:23 - INFO - codeparrot_training - Step 21691: {'lr': 0.000303131170432829, 'samples': 11106304, 'steps': 21691, 'batch_loss/train': 0.8019383051432669} 12/25/2021 19:56:34 - INFO - codeparrot_training - Step 21692: {'lr': 0.0003031157672907734, 'samples': 11106816, 'steps': 21692, 'batch_loss/train': 0.8034082869999111} 12/25/2021 19:56:46 - INFO - codeparrot_training - Step 21693: {'lr': 0.0003031003639375495, 'samples': 11107328, 'steps': 21693, 'batch_loss/train': 0.7797649530693889} 12/25/2021 19:56:56 - INFO - codeparrot_training - Step 21694: {'lr': 0.0003030849603732187, 'samples': 11107840, 'steps': 21694, 'batch_loss/train': 0.7872220099670812} 12/25/2021 19:57:07 - INFO - codeparrot_training - Step 21695: {'lr': 0.00030306955659784214, 'samples': 11108352, 'steps': 21695, 'batch_loss/train': 0.7886026743799448} 12/25/2021 19:57:19 - INFO - codeparrot_training - Step 21696: {'lr': 0.000303054152611481, 'samples': 11108864, 'steps': 21696, 'batch_loss/train': 0.8748296545818448} 12/25/2021 19:57:29 - INFO - codeparrot_training - Step 21697: {'lr': 0.0003030387484141967, 'samples': 11109376, 'steps': 21697, 
'batch_loss/train': 0.7292300085537136} 12/25/2021 19:57:40 - INFO - codeparrot_training - Step 21698: {'lr': 0.00030302334400605024, 'samples': 11109888, 'steps': 21698, 'batch_loss/train': 0.8019502689130604} 12/25/2021 19:57:51 - INFO - codeparrot_training - Step 21699: {'lr': 0.0003030079393871031, 'samples': 11110400, 'steps': 21699, 'batch_loss/train': 0.7739984858781099} 12/25/2021 19:58:04 - INFO - codeparrot_training - Step 21700: {'lr': 0.0003029925345574163, 'samples': 11110912, 'steps': 21700, 'batch_loss/train': 0.7279646429233253} 12/25/2021 19:58:15 - INFO - codeparrot_training - Step 21701: {'lr': 0.00030297712951705123, 'samples': 11111424, 'steps': 21701, 'batch_loss/train': 0.7566862171515822} 12/25/2021 19:58:26 - INFO - codeparrot_training - Step 21702: {'lr': 0.00030296172426606913, 'samples': 11111936, 'steps': 21702, 'batch_loss/train': 0.6430254966253415} 12/25/2021 19:58:38 - INFO - codeparrot_training - Step 21703: {'lr': 0.0003029463188045312, 'samples': 11112448, 'steps': 21703, 'batch_loss/train': 0.7506979601457715} 12/25/2021 19:58:48 - INFO - codeparrot_training - Step 21704: {'lr': 0.0003029309131324987, 'samples': 11112960, 'steps': 21704, 'batch_loss/train': 0.7583531844429672} 12/25/2021 19:58:59 - INFO - codeparrot_training - Step 21705: {'lr': 0.00030291550725003273, 'samples': 11113472, 'steps': 21705, 'batch_loss/train': 0.7450877940282226} 12/25/2021 19:59:11 - INFO - codeparrot_training - Step 21706: {'lr': 0.0003029001011571949, 'samples': 11113984, 'steps': 21706, 'batch_loss/train': 0.7473811078816652} 12/25/2021 19:59:22 - INFO - codeparrot_training - Step 21707: {'lr': 0.0003028846948540461, 'samples': 11114496, 'steps': 21707, 'batch_loss/train': 0.8773031048476696} 12/25/2021 19:59:32 - INFO - codeparrot_training - Step 21708: {'lr': 0.0003028692883406478, 'samples': 11115008, 'steps': 21708, 'batch_loss/train': 0.7195160752162337} 12/25/2021 19:59:46 - INFO - codeparrot_training - Step 21709: {'lr': 
0.0003028538816170611, 'samples': 11115520, 'steps': 21709, 'batch_loss/train': 0.769854458514601} 12/25/2021 19:59:57 - INFO - codeparrot_training - Step 21710: {'lr': 0.00030283847468334747, 'samples': 11116032, 'steps': 21710, 'batch_loss/train': 0.825947328004986} 12/25/2021 20:00:07 - INFO - codeparrot_training - Step 21711: {'lr': 0.0003028230675395679, 'samples': 11116544, 'steps': 21711, 'batch_loss/train': 0.8852997648064047} 12/25/2021 20:00:18 - INFO - codeparrot_training - Step 21712: {'lr': 0.00030280766018578374, 'samples': 11117056, 'steps': 21712, 'batch_loss/train': 0.6857566379476339} 12/25/2021 20:00:30 - INFO - codeparrot_training - Step 21713: {'lr': 0.0003027922526220563, 'samples': 11117568, 'steps': 21713, 'batch_loss/train': 0.7770761128049344} 12/25/2021 20:00:41 - INFO - codeparrot_training - Step 21714: {'lr': 0.00030277684484844683, 'samples': 11118080, 'steps': 21714, 'batch_loss/train': 0.7223312495043501} 12/25/2021 20:00:51 - INFO - codeparrot_training - Step 21715: {'lr': 0.00030276143686501646, 'samples': 11118592, 'steps': 21715, 'batch_loss/train': 0.7386154689593241} 12/25/2021 20:01:05 - INFO - codeparrot_training - Step 21716: {'lr': 0.00030274602867182664, 'samples': 11119104, 'steps': 21716, 'batch_loss/train': 0.7406843286007643} 12/25/2021 20:01:16 - INFO - codeparrot_training - Step 21717: {'lr': 0.0003027306202689385, 'samples': 11119616, 'steps': 21717, 'batch_loss/train': 1.214528464130126} 12/25/2021 20:01:27 - INFO - codeparrot_training - Step 21718: {'lr': 0.0003027152116564132, 'samples': 11120128, 'steps': 21718, 'batch_loss/train': 2.2229896495118737} 12/25/2021 20:01:38 - INFO - codeparrot_training - Step 21719: {'lr': 0.0003026998028343123, 'samples': 11120640, 'steps': 21719, 'batch_loss/train': 0.7702243896201253} 12/25/2021 20:01:49 - INFO - codeparrot_training - Step 21720: {'lr': 0.0003026843938026968, 'samples': 11121152, 'steps': 21720, 'batch_loss/train': 0.764949320233427} 12/25/2021 20:02:00 - INFO - 
codeparrot_training - Step 21721: {'lr': 0.00030266898456162805, 'samples': 11121664, 'steps': 21721, 'batch_loss/train': 0.7312946962192655} 12/25/2021 20:02:10 - INFO - codeparrot_training - Step 21722: {'lr': 0.0003026535751111673, 'samples': 11122176, 'steps': 21722, 'batch_loss/train': 0.8546478161588311} 12/25/2021 20:02:22 - INFO - codeparrot_training - Step 21723: {'lr': 0.00030263816545137584, 'samples': 11122688, 'steps': 21723, 'batch_loss/train': 0.803718407638371} 12/25/2021 20:02:33 - INFO - codeparrot_training - Step 21724: {'lr': 0.00030262275558231487, 'samples': 11123200, 'steps': 21724, 'batch_loss/train': 0.6857456855941564} 12/25/2021 20:02:44 - INFO - codeparrot_training - Step 21725: {'lr': 0.0003026073455040457, 'samples': 11123712, 'steps': 21725, 'batch_loss/train': 0.721919291652739} 12/25/2021 20:02:58 - INFO - codeparrot_training - Step 21726: {'lr': 0.0003025919352166296, 'samples': 11124224, 'steps': 21726, 'batch_loss/train': 0.829678840469569} 12/25/2021 20:03:08 - INFO - codeparrot_training - Step 21727: {'lr': 0.00030257652472012785, 'samples': 11124736, 'steps': 21727, 'batch_loss/train': 0.7962195854634047} 12/25/2021 20:03:19 - INFO - codeparrot_training - Step 21728: {'lr': 0.00030256111401460165, 'samples': 11125248, 'steps': 21728, 'batch_loss/train': 0.8343350936775096} 12/25/2021 20:03:31 - INFO - codeparrot_training - Step 21729: {'lr': 0.00030254570310011235, 'samples': 11125760, 'steps': 21729, 'batch_loss/train': 0.675458999350667} 12/25/2021 20:03:42 - INFO - codeparrot_training - Step 21730: {'lr': 0.00030253029197672114, 'samples': 11126272, 'steps': 21730, 'batch_loss/train': 0.7231427601072937} 12/25/2021 20:03:52 - INFO - codeparrot_training - Step 21731: {'lr': 0.0003025148806444893, 'samples': 11126784, 'steps': 21731, 'batch_loss/train': 0.716327007394284} 12/25/2021 20:04:03 - INFO - codeparrot_training - Step 21732: {'lr': 0.0003024994691034782, 'samples': 11127296, 'steps': 21732, 'batch_loss/train': 
0.7705176556482911} 12/25/2021 20:04:15 - INFO - codeparrot_training - Step 21733: {'lr': 0.00030248405735374903, 'samples': 11127808, 'steps': 21733, 'batch_loss/train': 0.7785163326188922} 12/25/2021 20:04:26 - INFO - codeparrot_training - Step 21734: {'lr': 0.000302468645395363, 'samples': 11128320, 'steps': 21734, 'batch_loss/train': 0.7426060354337096} 12/25/2021 20:04:36 - INFO - codeparrot_training - Step 21735: {'lr': 0.0003024532332283815, 'samples': 11128832, 'steps': 21735, 'batch_loss/train': 0.7423032482620329} 12/25/2021 20:04:48 - INFO - codeparrot_training - Step 21736: {'lr': 0.00030243782085286573, 'samples': 11129344, 'steps': 21736, 'batch_loss/train': 0.7940146224573255} 12/25/2021 20:04:59 - INFO - codeparrot_training - Step 21737: {'lr': 0.000302422408268877, 'samples': 11129856, 'steps': 21737, 'batch_loss/train': 0.7682674573734403} 12/25/2021 20:05:10 - INFO - codeparrot_training - Step 21738: {'lr': 0.0003024069954764765, 'samples': 11130368, 'steps': 21738, 'batch_loss/train': 0.6852199831046164} 12/25/2021 20:05:24 - INFO - codeparrot_training - Step 21739: {'lr': 0.00030239158247572564, 'samples': 11130880, 'steps': 21739, 'batch_loss/train': 0.7469110453967005} 12/25/2021 20:05:34 - INFO - codeparrot_training - Step 21740: {'lr': 0.0003023761692666856, 'samples': 11131392, 'steps': 21740, 'batch_loss/train': 0.7145860688760877} 12/25/2021 20:05:45 - INFO - codeparrot_training - Step 21741: {'lr': 0.00030236075584941767, 'samples': 11131904, 'steps': 21741, 'batch_loss/train': 0.8808816093951464} 12/25/2021 20:05:55 - INFO - codeparrot_training - Step 21742: {'lr': 0.0003023453422239832, 'samples': 11132416, 'steps': 21742, 'batch_loss/train': 0.8600933644920588} 12/25/2021 20:06:08 - INFO - codeparrot_training - Step 21743: {'lr': 0.0003023299283904434, 'samples': 11132928, 'steps': 21743, 'batch_loss/train': 0.7874844763427973} 12/25/2021 20:06:18 - INFO - codeparrot_training - Step 21744: {'lr': 0.0003023145143488595, 'samples': 
11133440, 'steps': 21744, 'batch_loss/train': 0.7469536216231063} 12/25/2021 20:06:29 - INFO - codeparrot_training - Step 21745: {'lr': 0.00030229910009929296, 'samples': 11133952, 'steps': 21745, 'batch_loss/train': 0.78172139916569} 12/25/2021 20:06:43 - INFO - codeparrot_training - Step 21746: {'lr': 0.00030228368564180487, 'samples': 11134464, 'steps': 21746, 'batch_loss/train': 0.8048917930573225} 12/25/2021 20:06:54 - INFO - codeparrot_training - Step 21747: {'lr': 0.0003022682709764566, 'samples': 11134976, 'steps': 21747, 'batch_loss/train': 0.4861287041567266} 12/25/2021 20:07:04 - INFO - codeparrot_training - Step 21748: {'lr': 0.00030225285610330937, 'samples': 11135488, 'steps': 21748, 'batch_loss/train': 0.7487192028202116} 12/25/2021 20:07:16 - INFO - codeparrot_training - Step 21749: {'lr': 0.0003022374410224246, 'samples': 11136000, 'steps': 21749, 'batch_loss/train': 0.8059965092688799} 12/25/2021 20:07:27 - INFO - codeparrot_training - Step 21750: {'lr': 0.0003022220257338635, 'samples': 11136512, 'steps': 21750, 'batch_loss/train': 0.7323553888127208} 12/25/2021 20:07:37 - INFO - codeparrot_training - Step 21751: {'lr': 0.00030220661023768727, 'samples': 11137024, 'steps': 21751, 'batch_loss/train': 0.7750813802704215} 12/25/2021 20:07:48 - INFO - codeparrot_training - Step 21752: {'lr': 0.0003021911945339573, 'samples': 11137536, 'steps': 21752, 'batch_loss/train': 0.6978559317067266} 12/25/2021 20:08:00 - INFO - codeparrot_training - Step 21753: {'lr': 0.0003021757786227348, 'samples': 11138048, 'steps': 21753, 'batch_loss/train': 0.7983056586235762} 12/25/2021 20:08:11 - INFO - codeparrot_training - Step 21754: {'lr': 0.00030216036250408116, 'samples': 11138560, 'steps': 21754, 'batch_loss/train': 0.6883805319666862} 12/25/2021 20:08:21 - INFO - codeparrot_training - Step 21755: {'lr': 0.0003021449461780576, 'samples': 11139072, 'steps': 21755, 'batch_loss/train': 0.7816830268129706} 12/25/2021 20:08:35 - INFO - codeparrot_training - Step 
21756: {'lr': 0.0003021295296447255, 'samples': 11139584, 'steps': 21756, 'batch_loss/train': 0.8120068358257413} 12/25/2021 20:08:45 - INFO - codeparrot_training - Step 21757: {'lr': 0.0003021141129041459, 'samples': 11140096, 'steps': 21757, 'batch_loss/train': 0.8005796507932246} 12/25/2021 20:08:56 - INFO - codeparrot_training - Step 21758: {'lr': 0.0003020986959563804, 'samples': 11140608, 'steps': 21758, 'batch_loss/train': 0.7830412490293384} 12/25/2021 20:09:08 - INFO - codeparrot_training - Step 21759: {'lr': 0.0003020832788014901, 'samples': 11141120, 'steps': 21759, 'batch_loss/train': 0.7946380665525794} 12/25/2021 20:09:19 - INFO - codeparrot_training - Step 21760: {'lr': 0.0003020678614395364, 'samples': 11141632, 'steps': 21760, 'batch_loss/train': 0.8065224615857005} 12/25/2021 20:09:29 - INFO - codeparrot_training - Step 21761: {'lr': 0.0003020524438705805, 'samples': 11142144, 'steps': 21761, 'batch_loss/train': 0.6988464077585377} 12/25/2021 20:09:47 - INFO - codeparrot_training - Step 21762: {'lr': 0.0003020370260946838, 'samples': 11142656, 'steps': 21762, 'batch_loss/train': 0.6845917475875467} 12/25/2021 20:09:58 - INFO - codeparrot_training - Step 21763: {'lr': 0.00030202160811190743, 'samples': 11143168, 'steps': 21763, 'batch_loss/train': 0.7189415907487273} 12/25/2021 20:10:08 - INFO - codeparrot_training - Step 21764: {'lr': 0.00030200618992231286, 'samples': 11143680, 'steps': 21764, 'batch_loss/train': 0.849908672273159} 12/25/2021 20:10:19 - INFO - codeparrot_training - Step 21765: {'lr': 0.00030199077152596126, 'samples': 11144192, 'steps': 21765, 'batch_loss/train': 0.7186003960086964} 12/25/2021 20:10:31 - INFO - codeparrot_training - Step 21766: {'lr': 0.00030197535292291405, 'samples': 11144704, 'steps': 21766, 'batch_loss/train': 0.7894544079899788} 12/25/2021 20:10:42 - INFO - codeparrot_training - Step 21767: {'lr': 0.00030195993411323243, 'samples': 11145216, 'steps': 21767, 'batch_loss/train': 0.7997955670580268} 12/25/2021 
20:10:52 - INFO - codeparrot_training - Step 21768: {'lr': 0.0003019445150969777, 'samples': 11145728, 'steps': 21768, 'batch_loss/train': 0.8672935729846358} 12/25/2021 20:11:04 - INFO - codeparrot_training - Step 21769: {'lr': 0.0003019290958742113, 'samples': 11146240, 'steps': 21769, 'batch_loss/train': 0.7379426280967891} 12/25/2021 20:11:15 - INFO - codeparrot_training - Step 21770: {'lr': 0.00030191367644499426, 'samples': 11146752, 'steps': 21770, 'batch_loss/train': 0.7415580749511719} 12/25/2021 20:11:26 - INFO - codeparrot_training - Step 21771: {'lr': 0.0003018982568093882, 'samples': 11147264, 'steps': 21771, 'batch_loss/train': 0.7516803257167339} 12/25/2021 20:11:38 - INFO - codeparrot_training - Step 21772: {'lr': 0.00030188283696745417, 'samples': 11147776, 'steps': 21772, 'batch_loss/train': 0.7955492930486798} 12/25/2021 20:11:48 - INFO - codeparrot_training - Step 21773: {'lr': 0.0003018674169192536, 'samples': 11148288, 'steps': 21773, 'batch_loss/train': 0.8228360661305487} 12/25/2021 20:11:59 - INFO - codeparrot_training - Step 21774: {'lr': 0.0003018519966648478, 'samples': 11148800, 'steps': 21774, 'batch_loss/train': 0.6969235455617309} 12/25/2021 20:12:10 - INFO - codeparrot_training - Step 21775: {'lr': 0.000301836576204298, 'samples': 11149312, 'steps': 21775, 'batch_loss/train': 0.7239756104536355} 12/25/2021 20:12:22 - INFO - codeparrot_training - Step 21776: {'lr': 0.0003018211555376655, 'samples': 11149824, 'steps': 21776, 'batch_loss/train': 0.8070351504720747} 12/25/2021 20:12:32 - INFO - codeparrot_training - Step 21777: {'lr': 0.00030180573466501176, 'samples': 11150336, 'steps': 21777, 'batch_loss/train': 0.8113983944058418} 12/25/2021 20:12:43 - INFO - codeparrot_training - Step 21778: {'lr': 0.0003017903135863979, 'samples': 11150848, 'steps': 21778, 'batch_loss/train': 0.7525339676067233} 12/25/2021 20:12:57 - INFO - codeparrot_training - Step 21779: {'lr': 0.0003017748923018853, 'samples': 11151360, 'steps': 21779, 
'batch_loss/train': 0.702129096724093} 12/25/2021 20:13:07 - INFO - codeparrot_training - Step 21780: {'lr': 0.00030175947081153534, 'samples': 11151872, 'steps': 21780, 'batch_loss/train': 0.770982219837606} 12/25/2021 20:13:18 - INFO - codeparrot_training - Step 21781: {'lr': 0.0003017440491154092, 'samples': 11152384, 'steps': 21781, 'batch_loss/train': 0.8177307052537799} 12/25/2021 20:13:30 - INFO - codeparrot_training - Step 21782: {'lr': 0.0003017286272135684, 'samples': 11152896, 'steps': 21782, 'batch_loss/train': 0.8832708671689034} 12/25/2021 20:13:41 - INFO - codeparrot_training - Step 21783: {'lr': 0.0003017132051060739, 'samples': 11153408, 'steps': 21783, 'batch_loss/train': 0.7863808285910636} 12/25/2021 20:13:51 - INFO - codeparrot_training - Step 21784: {'lr': 0.00030169778279298735, 'samples': 11153920, 'steps': 21784, 'batch_loss/train': 0.8946099430322647} 12/25/2021 20:14:05 - INFO - codeparrot_training - Step 21785: {'lr': 0.0003016823602743699, 'samples': 11154432, 'steps': 21785, 'batch_loss/train': 0.7368855788372457} 12/25/2021 20:14:16 - INFO - codeparrot_training - Step 21786: {'lr': 0.00030166693755028293, 'samples': 11154944, 'steps': 21786, 'batch_loss/train': 1.6354260547086596} 12/25/2021 20:14:26 - INFO - codeparrot_training - Step 21787: {'lr': 0.0003016515146207877, 'samples': 11155456, 'steps': 21787, 'batch_loss/train': 0.7723368512233719} 12/25/2021 20:14:37 - INFO - codeparrot_training - Step 21788: {'lr': 0.0003016360914859456, 'samples': 11155968, 'steps': 21788, 'batch_loss/train': 0.697106370003894} 12/25/2021 20:14:49 - INFO - codeparrot_training - Step 21789: {'lr': 0.0003016206681458178, 'samples': 11156480, 'steps': 21789, 'batch_loss/train': 0.8385377833619714} 12/25/2021 20:15:00 - INFO - codeparrot_training - Step 21790: {'lr': 0.0003016052446004658, 'samples': 11156992, 'steps': 21790, 'batch_loss/train': 0.7576788160949945} 12/25/2021 20:15:10 - INFO - codeparrot_training - Step 21791: {'lr': 
0.0003015898208499508, 'samples': 11157504, 'steps': 21791, 'batch_loss/train': 0.7058961763978004} 12/25/2021 20:15:22 - INFO - codeparrot_training - Step 21792: {'lr': 0.0003015743968943341, 'samples': 11158016, 'steps': 21792, 'batch_loss/train': 0.7474004970863461} 12/25/2021 20:15:33 - INFO - codeparrot_training - Step 21793: {'lr': 0.00030155897273367714, 'samples': 11158528, 'steps': 21793, 'batch_loss/train': 0.7817183448933065} 12/25/2021 20:15:44 - INFO - codeparrot_training - Step 21794: {'lr': 0.0003015435483680412, 'samples': 11159040, 'steps': 21794, 'batch_loss/train': 0.7072770660743117} 12/25/2021 20:15:57 - INFO - codeparrot_training - Step 21795: {'lr': 0.0003015281237974876, 'samples': 11159552, 'steps': 21795, 'batch_loss/train': 0.7510645077563822} 12/25/2021 20:16:08 - INFO - codeparrot_training - Step 21796: {'lr': 0.00030151269902207745, 'samples': 11160064, 'steps': 21796, 'batch_loss/train': 0.8042666437104344} 12/25/2021 20:16:18 - INFO - codeparrot_training - Step 21797: {'lr': 0.00030149727404187244, 'samples': 11160576, 'steps': 21797, 'batch_loss/train': 0.7373918471857905} 12/25/2021 20:16:29 - INFO - codeparrot_training - Step 21798: {'lr': 0.0003014818488569336, 'samples': 11161088, 'steps': 21798, 'batch_loss/train': 0.7774154008366168} 12/25/2021 20:16:41 - INFO - codeparrot_training - Step 21799: {'lr': 0.0003014664234673224, 'samples': 11161600, 'steps': 21799, 'batch_loss/train': 0.8055311273783445} 12/25/2021 20:16:52 - INFO - codeparrot_training - Step 21800: {'lr': 0.0003014509978731002, 'samples': 11162112, 'steps': 21800, 'batch_loss/train': 0.7459573405794799} 12/25/2021 20:17:02 - INFO - codeparrot_training - Step 21801: {'lr': 0.00030143557207432825, 'samples': 11162624, 'steps': 21801, 'batch_loss/train': 0.6283707916736603} 12/25/2021 20:17:16 - INFO - codeparrot_training - Step 21802: {'lr': 0.00030142014607106784, 'samples': 11163136, 'steps': 21802, 'batch_loss/train': 0.7198295868001878} 12/25/2021 20:17:27 - 
INFO - codeparrot_training - Step 21803: {'lr': 0.00030140471986338035, 'samples': 11163648, 'steps': 21803, 'batch_loss/train': 0.7231528367847204} 12/25/2021 20:17:37 - INFO - codeparrot_training - Step 21804: {'lr': 0.0003013892934513271, 'samples': 11164160, 'steps': 21804, 'batch_loss/train': 0.7759091490879655} 12/25/2021 20:17:49 - INFO - codeparrot_training - Step 21805: {'lr': 0.00030137386683496943, 'samples': 11164672, 'steps': 21805, 'batch_loss/train': 0.7652347981929779} 12/25/2021 20:18:00 - INFO - codeparrot_training - Step 21806: {'lr': 0.00030135844001436863, 'samples': 11165184, 'steps': 21806, 'batch_loss/train': 0.8186292536556721} 12/25/2021 20:18:10 - INFO - codeparrot_training - Step 21807: {'lr': 0.0003013430129895861, 'samples': 11165696, 'steps': 21807, 'batch_loss/train': 0.8530946187674999} 12/25/2021 20:18:22 - INFO - codeparrot_training - Step 21808: {'lr': 0.0003013275857606831, 'samples': 11166208, 'steps': 21808, 'batch_loss/train': 0.7851993818767369} 12/25/2021 20:18:33 - INFO - codeparrot_training - Step 21809: {'lr': 0.0003013121583277211, 'samples': 11166720, 'steps': 21809, 'batch_loss/train': 0.7881487850099802} 12/25/2021 20:18:44 - INFO - codeparrot_training - Step 21810: {'lr': 0.0003012967306907613, 'samples': 11167232, 'steps': 21810, 'batch_loss/train': 0.8065042984671891} 12/25/2021 20:18:54 - INFO - codeparrot_training - Step 21811: {'lr': 0.000301281302849865, 'samples': 11167744, 'steps': 21811, 'batch_loss/train': 0.7766705714166164} 12/25/2021 20:19:06 - INFO - codeparrot_training - Step 21812: {'lr': 0.00030126587480509365, 'samples': 11168256, 'steps': 21812, 'batch_loss/train': 0.7348324274644256} 12/25/2021 20:19:17 - INFO - codeparrot_training - Step 21813: {'lr': 0.0003012504465565085, 'samples': 11168768, 'steps': 21813, 'batch_loss/train': 0.8513434797059745} 12/25/2021 20:19:28 - INFO - codeparrot_training - Step 21814: {'lr': 0.000301235018104171, 'samples': 11169280, 'steps': 21814, 'batch_loss/train': 
0.7438778108917177} 12/25/2021 20:19:42 - INFO - codeparrot_training - Step 21815: {'lr': 0.0003012195894481423, 'samples': 11169792, 'steps': 21815, 'batch_loss/train': 0.7941347118467093} 12/25/2021 20:19:53 - INFO - codeparrot_training - Step 21816: {'lr': 0.00030120416058848393, 'samples': 11170304, 'steps': 21816, 'batch_loss/train': 0.5343351542251185} 12/25/2021 20:20:03 - INFO - codeparrot_training - Step 21817: {'lr': 0.0003011887315252572, 'samples': 11170816, 'steps': 21817, 'batch_loss/train': 0.6971523007377982} 12/25/2021 20:20:15 - INFO - codeparrot_training - Step 21818: {'lr': 0.0003011733022585233, 'samples': 11171328, 'steps': 21818, 'batch_loss/train': 0.8834353983402252} 12/25/2021 20:20:26 - INFO - codeparrot_training - Step 21819: {'lr': 0.00030115787278834374, 'samples': 11171840, 'steps': 21819, 'batch_loss/train': 0.8400908783078194} 12/25/2021 20:20:36 - INFO - codeparrot_training - Step 21820: {'lr': 0.00030114244311477974, 'samples': 11172352, 'steps': 21820, 'batch_loss/train': 0.7586525928927585} 12/25/2021 20:20:47 - INFO - codeparrot_training - Step 21821: {'lr': 0.0003011270132378927, 'samples': 11172864, 'steps': 21821, 'batch_loss/train': 0.665014710277319} 12/25/2021 20:20:59 - INFO - codeparrot_training - Step 21822: {'lr': 0.000301111583157744, 'samples': 11173376, 'steps': 21822, 'batch_loss/train': 0.8020491793868132} 12/25/2021 20:21:10 - INFO - codeparrot_training - Step 21823: {'lr': 0.00030109615287439494, 'samples': 11173888, 'steps': 21823, 'batch_loss/train': 0.7000716016627848} 12/25/2021 20:21:20 - INFO - codeparrot_training - Step 21824: {'lr': 0.0003010807223879069, 'samples': 11174400, 'steps': 21824, 'batch_loss/train': 0.7918072892352939} 12/25/2021 20:21:34 - INFO - codeparrot_training - Step 21825: {'lr': 0.00030106529169834115, 'samples': 11174912, 'steps': 21825, 'batch_loss/train': 0.6945345769636333} 12/25/2021 20:21:45 - INFO - codeparrot_training - Step 21826: {'lr': 0.0003010498608057591, 'samples': 
11175424, 'steps': 21826, 'batch_loss/train': 0.751835118047893} 12/25/2021 20:21:55 - INFO - codeparrot_training - Step 21827: {'lr': 0.0003010344297102221, 'samples': 11175936, 'steps': 21827, 'batch_loss/train': 0.8025036563631147} 12/25/2021 20:22:07 - INFO - codeparrot_training - Step 21828: {'lr': 0.00030101899841179135, 'samples': 11176448, 'steps': 21828, 'batch_loss/train': 0.7213026527315378} 12/25/2021 20:22:18 - INFO - codeparrot_training - Step 21829: {'lr': 0.00030100356691052844, 'samples': 11176960, 'steps': 21829, 'batch_loss/train': 0.7479225212009624} 12/25/2021 20:22:29 - INFO - codeparrot_training - Step 21830: {'lr': 0.00030098813520649457, 'samples': 11177472, 'steps': 21830, 'batch_loss/train': 0.6963226429652423} 12/25/2021 20:22:42 - INFO - codeparrot_training - Step 21831: {'lr': 0.0003009727032997511, 'samples': 11177984, 'steps': 21831, 'batch_loss/train': 0.8559762295335531} 12/25/2021 20:22:53 - INFO - codeparrot_training - Step 21832: {'lr': 0.00030095727119035953, 'samples': 11178496, 'steps': 21832, 'batch_loss/train': 0.752054340322502} 12/25/2021 20:23:04 - INFO - codeparrot_training - Step 21833: {'lr': 0.000300941838878381, 'samples': 11179008, 'steps': 21833, 'batch_loss/train': 0.7805432192981243} 12/25/2021 20:23:14 - INFO - codeparrot_training - Step 21834: {'lr': 0.0003009264063638769, 'samples': 11179520, 'steps': 21834, 'batch_loss/train': 0.7794882757589221} 12/25/2021 20:23:26 - INFO - codeparrot_training - Step 21835: {'lr': 0.00030091097364690875, 'samples': 11180032, 'steps': 21835, 'batch_loss/train': 0.7849797317758203} 12/25/2021 20:23:37 - INFO - codeparrot_training - Step 21836: {'lr': 0.00030089554072753775, 'samples': 11180544, 'steps': 21836, 'batch_loss/train': 0.7261097282171249} 12/25/2021 20:23:48 - INFO - codeparrot_training - Step 21837: {'lr': 0.0003008801076058253, 'samples': 11181056, 'steps': 21837, 'batch_loss/train': 0.7481316598132253} 12/25/2021 20:24:00 - INFO - codeparrot_training - Step 
21838: {'lr': 0.0003008646742818327, 'samples': 11181568, 'steps': 21838, 'batch_loss/train': 0.6718958441633731} 12/25/2021 20:24:11 - INFO - codeparrot_training - Step 21839: {'lr': 0.00030084924075562145, 'samples': 11182080, 'steps': 21839, 'batch_loss/train': 0.9431786704808474} 12/25/2021 20:24:22 - INFO - codeparrot_training - Step 21840: {'lr': 0.00030083380702725283, 'samples': 11182592, 'steps': 21840, 'batch_loss/train': 0.8152795638889074} 12/25/2021 20:24:36 - INFO - codeparrot_training - Step 21841: {'lr': 0.00030081837309678806, 'samples': 11183104, 'steps': 21841, 'batch_loss/train': 0.5780198858119547} 12/25/2021 20:24:46 - INFO - codeparrot_training - Step 21842: {'lr': 0.00030080293896428867, 'samples': 11183616, 'steps': 21842, 'batch_loss/train': 0.7364504807628691} 12/25/2021 20:24:57 - INFO - codeparrot_training - Step 21843: {'lr': 0.0003007875046298161, 'samples': 11184128, 'steps': 21843, 'batch_loss/train': 0.7963507138192654} 12/25/2021 20:25:08 - INFO - codeparrot_training - Step 21844: {'lr': 0.00030077207009343137, 'samples': 11184640, 'steps': 21844, 'batch_loss/train': 0.7234895284054801} 12/25/2021 20:25:20 - INFO - codeparrot_training - Step 21845: {'lr': 0.00030075663535519625, 'samples': 11185152, 'steps': 21845, 'batch_loss/train': 0.7239616243168712} 12/25/2021 20:25:31 - INFO - codeparrot_training - Step 21846: {'lr': 0.00030074120041517186, 'samples': 11185664, 'steps': 21846, 'batch_loss/train': 0.6041680076159537} 12/25/2021 20:25:41 - INFO - codeparrot_training - Step 21847: {'lr': 0.0003007257652734196, 'samples': 11186176, 'steps': 21847, 'batch_loss/train': 0.7435081088915467} 12/25/2021 20:25:53 - INFO - codeparrot_training - Step 21848: {'lr': 0.0003007103299300008, 'samples': 11186688, 'steps': 21848, 'batch_loss/train': 0.8281703498214483} 12/25/2021 20:26:04 - INFO - codeparrot_training - Step 21849: {'lr': 0.000300694894384977, 'samples': 11187200, 'steps': 21849, 'batch_loss/train': 0.7630328764207661} 
12/25/2021 20:26:14 - INFO - codeparrot_training - Step 21850: {'lr': 0.00030067945863840935, 'samples': 11187712, 'steps': 21850, 'batch_loss/train': 1.1528428113088012} 12/25/2021 20:26:28 - INFO - codeparrot_training - Step 21851: {'lr': 0.0003006640226903593, 'samples': 11188224, 'steps': 21851, 'batch_loss/train': 0.7723339158110321} 12/25/2021 20:26:39 - INFO - codeparrot_training - Step 21852: {'lr': 0.00030064858654088827, 'samples': 11188736, 'steps': 21852, 'batch_loss/train': 0.7957523111253977} 12/25/2021 20:26:50 - INFO - codeparrot_training - Step 21853: {'lr': 0.00030063315019005757, 'samples': 11189248, 'steps': 21853, 'batch_loss/train': 0.7761156223714352} 12/25/2021 20:27:00 - INFO - codeparrot_training - Step 21854: {'lr': 0.0003006177136379286, 'samples': 11189760, 'steps': 21854, 'batch_loss/train': 0.6586869875900447} 12/25/2021 20:27:12 - INFO - codeparrot_training - Step 21855: {'lr': 0.00030060227688456266, 'samples': 11190272, 'steps': 21855, 'batch_loss/train': 0.7949742968194187} 12/25/2021 20:27:23 - INFO - codeparrot_training - Step 21856: {'lr': 0.0003005868399300213, 'samples': 11190784, 'steps': 21856, 'batch_loss/train': 0.7609881055541337} 12/25/2021 20:27:33 - INFO - codeparrot_training - Step 21857: {'lr': 0.0003005714027743656, 'samples': 11191296, 'steps': 21857, 'batch_loss/train': 0.7927875136956573} 12/25/2021 20:27:46 - INFO - codeparrot_training - Step 21858: {'lr': 0.00030055596541765715, 'samples': 11191808, 'steps': 21858, 'batch_loss/train': 0.8138004271313548} 12/25/2021 20:27:56 - INFO - codeparrot_training - Step 21859: {'lr': 0.0003005405278599573, 'samples': 11192320, 'steps': 21859, 'batch_loss/train': 0.7161385185318068} 12/25/2021 20:28:07 - INFO - codeparrot_training - Step 21860: {'lr': 0.0003005250901013274, 'samples': 11192832, 'steps': 21860, 'batch_loss/train': 0.6420025000115857} 12/25/2021 20:28:21 - INFO - codeparrot_training - Step 21861: {'lr': 0.0003005096521418288, 'samples': 11193344, 'steps': 
21861, 'batch_loss/train': 0.570721589087043} 12/25/2021 20:28:31 - INFO - codeparrot_training - Step 21862: {'lr': 0.00030049421398152293, 'samples': 11193856, 'steps': 21862, 'batch_loss/train': 0.734308841638267} 12/25/2021 20:28:42 - INFO - codeparrot_training - Step 21863: {'lr': 0.0003004787756204711, 'samples': 11194368, 'steps': 21863, 'batch_loss/train': 0.7827247153036296} 12/25/2021 20:28:53 - INFO - codeparrot_training - Step 21864: {'lr': 0.00030046333705873476, 'samples': 11194880, 'steps': 21864, 'batch_loss/train': 0.6655402276664972} 12/25/2021 20:29:05 - INFO - codeparrot_training - Step 21865: {'lr': 0.00030044789829637513, 'samples': 11195392, 'steps': 21865, 'batch_loss/train': 0.7069303756579757} 12/25/2021 20:29:15 - INFO - codeparrot_training - Step 21866: {'lr': 0.0003004324593334538, 'samples': 11195904, 'steps': 21866, 'batch_loss/train': 0.8957937671802938} 12/25/2021 20:29:26 - INFO - codeparrot_training - Step 21867: {'lr': 0.000300417020170032, 'samples': 11196416, 'steps': 21867, 'batch_loss/train': 0.6275378337595612} 12/25/2021 20:29:38 - INFO - codeparrot_training - Step 21868: {'lr': 0.0003004015808061712, 'samples': 11196928, 'steps': 21868, 'batch_loss/train': 0.7086980191525072} 12/25/2021 20:29:49 - INFO - codeparrot_training - Step 21869: {'lr': 0.0003003861412419328, 'samples': 11197440, 'steps': 21869, 'batch_loss/train': 0.7914966898970306} 12/25/2021 20:29:59 - INFO - codeparrot_training - Step 21870: {'lr': 0.00030037070147737803, 'samples': 11197952, 'steps': 21870, 'batch_loss/train': 0.7985558640211821} 12/25/2021 20:30:13 - INFO - codeparrot_training - Step 21871: {'lr': 0.00030035526151256847, 'samples': 11198464, 'steps': 21871, 'batch_loss/train': 0.8317836285568774} 12/25/2021 20:30:24 - INFO - codeparrot_training - Step 21872: {'lr': 0.0003003398213475653, 'samples': 11198976, 'steps': 21872, 'batch_loss/train': 0.7241822336800396} 12/25/2021 20:30:35 - INFO - codeparrot_training - Step 21873: {'lr': 
0.0003003243809824301, 'samples': 11199488, 'steps': 21873, 'batch_loss/train': 0.7678713263012469} 12/25/2021 20:30:45 - INFO - codeparrot_training - Step 21874: {'lr': 0.00030030894041722407, 'samples': 11200000, 'steps': 21874, 'batch_loss/train': 0.7613486992195249} 12/25/2021 20:30:57 - INFO - codeparrot_training - Step 21875: {'lr': 0.0003002934996520088, 'samples': 11200512, 'steps': 21875, 'batch_loss/train': 0.8861903632059693} 12/25/2021 20:31:08 - INFO - codeparrot_training - Step 21876: {'lr': 0.00030027805868684544, 'samples': 11201024, 'steps': 21876, 'batch_loss/train': 1.0144324731081724} 12/25/2021 20:31:19 - INFO - codeparrot_training - Step 21877: {'lr': 0.0003002626175217956, 'samples': 11201536, 'steps': 21877, 'batch_loss/train': 0.8474867194890976} 12/25/2021 20:31:30 - INFO - codeparrot_training - Step 21878: {'lr': 0.00030024717615692055, 'samples': 11202048, 'steps': 21878, 'batch_loss/train': 0.8212610008195043} 12/25/2021 20:31:41 - INFO - codeparrot_training - Step 21879: {'lr': 0.0003002317345922816, 'samples': 11202560, 'steps': 21879, 'batch_loss/train': 0.7529792380519211} 12/25/2021 20:31:52 - INFO - codeparrot_training - Step 21880: {'lr': 0.0003002162928279404, 'samples': 11203072, 'steps': 21880, 'batch_loss/train': 0.783116590231657} 12/25/2021 20:32:04 - INFO - codeparrot_training - Step 21881: {'lr': 0.0003002008508639581, 'samples': 11203584, 'steps': 21881, 'batch_loss/train': 0.770495240110904} 12/25/2021 20:32:14 - INFO - codeparrot_training - Step 21882: {'lr': 0.00030018540870039624, 'samples': 11204096, 'steps': 21882, 'batch_loss/train': 0.7764054406434298} 12/25/2021 20:32:25 - INFO - codeparrot_training - Step 21883: {'lr': 0.00030016996633731597, 'samples': 11204608, 'steps': 21883, 'batch_loss/train': 0.773604879155755} 12/25/2021 20:32:36 - INFO - codeparrot_training - Step 21884: {'lr': 0.00030015452377477904, 'samples': 11205120, 'steps': 21884, 'batch_loss/train': 0.642207445576787} 12/25/2021 20:32:50 - INFO 
- codeparrot_training - Step 21885: {'lr': 0.0003001390810128466, 'samples': 11205632, 'steps': 21885, 'batch_loss/train': 0.7205858924426138} 12/25/2021 20:33:00 - INFO - codeparrot_training - Step 21886: {'lr': 0.00030012363805158006, 'samples': 11206144, 'steps': 21886, 'batch_loss/train': 0.8339743810938671} 12/25/2021 20:33:11 - INFO - codeparrot_training - Step 21887: {'lr': 0.0003001081948910409, 'samples': 11206656, 'steps': 21887, 'batch_loss/train': 0.7291786260902882} 12/25/2021 20:33:23 - INFO - codeparrot_training - Step 21888: {'lr': 0.0003000927515312905, 'samples': 11207168, 'steps': 21888, 'batch_loss/train': 0.7421121504157782} 12/25/2021 20:33:34 - INFO - codeparrot_training - Step 21889: {'lr': 0.0003000773079723902, 'samples': 11207680, 'steps': 21889, 'batch_loss/train': 0.8122859355062246} 12/25/2021 20:33:44 - INFO - codeparrot_training - Step 21890: {'lr': 0.0003000618642144014, 'samples': 11208192, 'steps': 21890, 'batch_loss/train': 0.8384867487475276} 12/25/2021 20:33:58 - INFO - codeparrot_training - Step 21891: {'lr': 0.0003000464202573856, 'samples': 11208704, 'steps': 21891, 'batch_loss/train': 0.4949322056490928} 12/25/2021 20:34:09 - INFO - codeparrot_training - Step 21892: {'lr': 0.00030003097610140414, 'samples': 11209216, 'steps': 21892, 'batch_loss/train': 0.7936019457411021} 12/25/2021 20:34:19 - INFO - codeparrot_training - Step 21893: {'lr': 0.00030001553174651834, 'samples': 11209728, 'steps': 21893, 'batch_loss/train': 0.8558837566524744} 12/25/2021 20:34:30 - INFO - codeparrot_training - Step 21894: {'lr': 0.0003000000871927897, 'samples': 11210240, 'steps': 21894, 'batch_loss/train': 0.7713704560883343} 12/25/2021 20:34:42 - INFO - codeparrot_training - Step 21895: {'lr': 0.0002999846424402796, 'samples': 11210752, 'steps': 21895, 'batch_loss/train': 0.8044189754873514} 12/25/2021 20:34:53 - INFO - codeparrot_training - Step 21896: {'lr': 0.0002999691974890494, 'samples': 11211264, 'steps': 21896, 'batch_loss/train': 
0.7129811954800971} 12/25/2021 20:35:03 - INFO - codeparrot_training - Step 21897: {'lr': 0.00029995375233916064, 'samples': 11211776, 'steps': 21897, 'batch_loss/train': 0.8964819489046931} 12/25/2021 20:35:15 - INFO - codeparrot_training - Step 21898: {'lr': 0.00029993830699067455, 'samples': 11212288, 'steps': 21898, 'batch_loss/train': 0.7887199660763144} 12/25/2021 20:35:26 - INFO - codeparrot_training - Step 21899: {'lr': 0.0002999228614436526, 'samples': 11212800, 'steps': 21899, 'batch_loss/train': 0.6328779319301248} 12/25/2021 20:35:37 - INFO - codeparrot_training - Step 21900: {'lr': 0.00029990741569815615, 'samples': 11213312, 'steps': 21900, 'batch_loss/train': 0.7512085624039173} 12/25/2021 20:35:51 - INFO - codeparrot_training - Step 21901: {'lr': 0.00029989196975424684, 'samples': 11213824, 'steps': 21901, 'batch_loss/train': 0.7945332859526388} 12/25/2021 20:36:01 - INFO - codeparrot_training - Step 21902: {'lr': 0.0002998765236119857, 'samples': 11214336, 'steps': 21902, 'batch_loss/train': 0.8398638038779609} 12/25/2021 20:36:12 - INFO - codeparrot_training - Step 21903: {'lr': 0.00029986107727143443, 'samples': 11214848, 'steps': 21903, 'batch_loss/train': 0.8317987229675055} 12/25/2021 20:36:22 - INFO - codeparrot_training - Step 21904: {'lr': 0.0002998456307326543, 'samples': 11215360, 'steps': 21904, 'batch_loss/train': 0.9196885554119945} 12/25/2021 20:36:34 - INFO - codeparrot_training - Step 21905: {'lr': 0.0002998301839957068, 'samples': 11215872, 'steps': 21905, 'batch_loss/train': 0.7247612612554803} 12/25/2021 20:36:45 - INFO - codeparrot_training - Step 21906: {'lr': 0.0002998147370606533, 'samples': 11216384, 'steps': 21906, 'batch_loss/train': 0.731421313714236} 12/25/2021 20:36:55 - INFO - codeparrot_training - Step 21907: {'lr': 0.00029979928992755517, 'samples': 11216896, 'steps': 21907, 'batch_loss/train': 0.7769901445135474} 12/25/2021 20:37:08 - INFO - codeparrot_training - Step 21908: {'lr': 0.00029978384259647396, 'samples': 
11217408, 'steps': 21908, 'batch_loss/train': 0.7798324921168387} 12/25/2021 20:37:18 - INFO - codeparrot_training - Step 21909: {'lr': 0.0002997683950674709, 'samples': 11217920, 'steps': 21909, 'batch_loss/train': 0.8497155765071511} 12/25/2021 20:37:29 - INFO - codeparrot_training - Step 21910: {'lr': 0.0002997529473406075, 'samples': 11218432, 'steps': 21910, 'batch_loss/train': 0.8107758359983563} 12/25/2021 20:37:43 - INFO - codeparrot_training - Step 21911: {'lr': 0.00029973749941594524, 'samples': 11218944, 'steps': 21911, 'batch_loss/train': 0.8313865810632706} 12/25/2021 20:37:53 - INFO - codeparrot_training - Step 21912: {'lr': 0.0002997220512935454, 'samples': 11219456, 'steps': 21912, 'batch_loss/train': 0.6107340990565717} 12/25/2021 20:38:04 - INFO - codeparrot_training - Step 21913: {'lr': 0.00029970660297346945, 'samples': 11219968, 'steps': 21913, 'batch_loss/train': 0.777978309430182} 12/25/2021 20:38:16 - INFO - codeparrot_training - Step 21914: {'lr': 0.00029969115445577894, 'samples': 11220480, 'steps': 21914, 'batch_loss/train': 0.8128669108264148} 12/25/2021 20:38:26 - INFO - codeparrot_training - Step 21915: {'lr': 0.000299675705740535, 'samples': 11220992, 'steps': 21915, 'batch_loss/train': 0.837540254695341} 12/25/2021 20:38:37 - INFO - codeparrot_training - Step 21916: {'lr': 0.00029966025682779925, 'samples': 11221504, 'steps': 21916, 'batch_loss/train': 0.8646846422925591} 12/25/2021 20:38:48 - INFO - codeparrot_training - Step 21917: {'lr': 0.0002996448077176331, 'samples': 11222016, 'steps': 21917, 'batch_loss/train': 0.7318331366404891} 12/25/2021 20:39:00 - INFO - codeparrot_training - Step 21918: {'lr': 0.000299629358410098, 'samples': 11222528, 'steps': 21918, 'batch_loss/train': 0.7432524904725142} 12/25/2021 20:39:11 - INFO - codeparrot_training - Step 21919: {'lr': 0.0002996139089052552, 'samples': 11223040, 'steps': 21919, 'batch_loss/train': 0.4685673051280901} 12/25/2021 20:39:21 - INFO - codeparrot_training - Step 21920: 
{'lr': 0.00029959845920316626, 'samples': 11223552, 'steps': 21920, 'batch_loss/train': 0.7898734603077173} 12/25/2021 20:39:33 - INFO - codeparrot_training - Step 21921: {'lr': 0.0002995830093038926, 'samples': 11224064, 'steps': 21921, 'batch_loss/train': 0.5840713980724104} 12/25/2021 20:39:44 - INFO - codeparrot_training - Step 21922: {'lr': 0.00029956755920749556, 'samples': 11224576, 'steps': 21922, 'batch_loss/train': 0.7511916032526642} 12/25/2021 20:39:55 - INFO - codeparrot_training - Step 21923: {'lr': 0.0002995521089140367, 'samples': 11225088, 'steps': 21923, 'batch_loss/train': 0.6132587911561131} 12/25/2021 20:40:09 - INFO - codeparrot_training - Step 21924: {'lr': 0.00029953665842357736, 'samples': 11225600, 'steps': 21924, 'batch_loss/train': 0.7963054366409779} 12/25/2021 20:40:19 - INFO - codeparrot_training - Step 21925: {'lr': 0.000299521207736179, 'samples': 11226112, 'steps': 21925, 'batch_loss/train': 0.7381735872477293} 12/25/2021 20:40:30 - INFO - codeparrot_training - Step 21926: {'lr': 0.0002995057568519029, 'samples': 11226624, 'steps': 21926, 'batch_loss/train': 0.7140925102867186} 12/25/2021 20:40:40 - INFO - codeparrot_training - Step 21927: {'lr': 0.00029949030577081066, 'samples': 11227136, 'steps': 21927, 'batch_loss/train': 0.7367670612875372} 12/25/2021 20:40:52 - INFO - codeparrot_training - Step 21928: {'lr': 0.0002994748544929636, 'samples': 11227648, 'steps': 21928, 'batch_loss/train': 0.7660871072439477} 12/25/2021 20:41:03 - INFO - codeparrot_training - Step 21929: {'lr': 0.0002994594030184233, 'samples': 11228160, 'steps': 21929, 'batch_loss/train': 0.6730618099682033} 12/25/2021 20:41:14 - INFO - codeparrot_training - Step 21930: {'lr': 0.00029944395134725107, 'samples': 11228672, 'steps': 21930, 'batch_loss/train': 0.8068613293580711} 12/25/2021 20:41:27 - INFO - codeparrot_training - Step 21931: {'lr': 0.00029942849947950834, 'samples': 11229184, 'steps': 21931, 'batch_loss/train': 0.7393533561844379} 12/25/2021 
20:41:38 - INFO - codeparrot_training - Step 21932: {'lr': 0.00029941304741525656, 'samples': 11229696, 'steps': 21932, 'batch_loss/train': 0.7722396990284324} 12/25/2021 20:41:49 - INFO - codeparrot_training - Step 21933: {'lr': 0.0002993975951545571, 'samples': 11230208, 'steps': 21933, 'batch_loss/train': 0.6820308002643287} 12/25/2021 20:42:01 - INFO - codeparrot_training - Step 21934: {'lr': 0.0002993821426974715, 'samples': 11230720, 'steps': 21934, 'batch_loss/train': 0.7812938024289906} 12/25/2021 20:42:11 - INFO - codeparrot_training - Step 21935: {'lr': 0.00029936669004406114, 'samples': 11231232, 'steps': 21935, 'batch_loss/train': 0.729031816823408} 12/25/2021 20:42:22 - INFO - codeparrot_training - Step 21936: {'lr': 0.0002993512371943875, 'samples': 11231744, 'steps': 21936, 'batch_loss/train': 0.7281381541397423} 12/25/2021 20:42:32 - INFO - codeparrot_training - Step 21937: {'lr': 0.000299335784148512, 'samples': 11232256, 'steps': 21937, 'batch_loss/train': 0.7024277523159981} 12/25/2021 20:42:44 - INFO - codeparrot_training - Step 21938: {'lr': 0.00029932033090649594, 'samples': 11232768, 'steps': 21938, 'batch_loss/train': 0.7958067588042468} 12/25/2021 20:42:55 - INFO - codeparrot_training - Step 21939: {'lr': 0.00029930487746840096, 'samples': 11233280, 'steps': 21939, 'batch_loss/train': 0.7124599418602884} 12/25/2021 20:43:06 - INFO - codeparrot_training - Step 21940: {'lr': 0.0002992894238342883, 'samples': 11233792, 'steps': 21940, 'batch_loss/train': 0.7984644835814834} 12/25/2021 20:43:19 - INFO - codeparrot_training - Step 21941: {'lr': 0.0002992739700042196, 'samples': 11234304, 'steps': 21941, 'batch_loss/train': 0.7747234250418842} 12/25/2021 20:43:30 - INFO - codeparrot_training - Step 21942: {'lr': 0.00029925851597825616, 'samples': 11234816, 'steps': 21942, 'batch_loss/train': 0.8187569277361035} 12/25/2021 20:43:41 - INFO - codeparrot_training - Step 21943: {'lr': 0.0002992430617564595, 'samples': 11235328, 'steps': 21943, 
'batch_loss/train': 0.771906013134867} 12/25/2021 20:43:53 - INFO - codeparrot_training - Step 21944: {'lr': 0.00029922760733889093, 'samples': 11235840, 'steps': 21944, 'batch_loss/train': 0.8633510526269674} 12/25/2021 20:44:04 - INFO - codeparrot_training - Step 21945: {'lr': 0.0002992121527256121, 'samples': 11236352, 'steps': 21945, 'batch_loss/train': 0.7749299239367247} 12/25/2021 20:44:14 - INFO - codeparrot_training - Step 21946: {'lr': 0.00029919669791668424, 'samples': 11236864, 'steps': 21946, 'batch_loss/train': 0.8127411501482129} 12/25/2021 20:44:25 - INFO - codeparrot_training - Step 21947: {'lr': 0.00029918124291216893, 'samples': 11237376, 'steps': 21947, 'batch_loss/train': 0.7989994015078992} 12/25/2021 20:44:38 - INFO - codeparrot_training - Step 21948: {'lr': 0.00029916578771212757, 'samples': 11237888, 'steps': 21948, 'batch_loss/train': 0.7661845423281193} 12/25/2021 20:44:49 - INFO - codeparrot_training - Step 21949: {'lr': 0.00029915033231662163, 'samples': 11238400, 'steps': 21949, 'batch_loss/train': 0.8043581359088421} 12/25/2021 20:45:00 - INFO - codeparrot_training - Step 21950: {'lr': 0.0002991348767257125, 'samples': 11238912, 'steps': 21950, 'batch_loss/train': 0.763597478158772} 12/25/2021 20:45:12 - INFO - codeparrot_training - Step 21951: {'lr': 0.0002991194209394616, 'samples': 11239424, 'steps': 21951, 'batch_loss/train': 0.7986746514216065} 12/25/2021 20:45:22 - INFO - codeparrot_training - Step 21952: {'lr': 0.00029910396495793045, 'samples': 11239936, 'steps': 21952, 'batch_loss/train': 0.7747420653467998} 12/25/2021 20:45:33 - INFO - codeparrot_training - Step 21953: {'lr': 0.00029908850878118047, 'samples': 11240448, 'steps': 21953, 'batch_loss/train': 0.79797178437002} 12/25/2021 20:45:45 - INFO - codeparrot_training - Step 21954: {'lr': 0.00029907305240927317, 'samples': 11240960, 'steps': 21954, 'batch_loss/train': 0.875832783523947} 12/25/2021 20:45:56 - INFO - codeparrot_training - Step 21955: {'lr': 
0.00029905759584226987, 'samples': 11241472, 'steps': 21955, 'batch_loss/train': 0.7793063065037131} 12/25/2021 20:46:06 - INFO - codeparrot_training - Step 21956: {'lr': 0.0002990421390802322, 'samples': 11241984, 'steps': 21956, 'batch_loss/train': 0.6897595487535} 12/25/2021 20:46:18 - INFO - codeparrot_training - Step 21957: {'lr': 0.0002990266821232214, 'samples': 11242496, 'steps': 21957, 'batch_loss/train': 0.600284403655678} 12/25/2021 20:46:29 - INFO - codeparrot_training - Step 21958: {'lr': 0.0002990112249712991, 'samples': 11243008, 'steps': 21958, 'batch_loss/train': 0.7186895846389234} 12/25/2021 20:46:40 - INFO - codeparrot_training - Step 21959: {'lr': 0.00029899576762452664, 'samples': 11243520, 'steps': 21959, 'batch_loss/train': 0.7621767604723573} 12/25/2021 20:46:50 - INFO - codeparrot_training - Step 21960: {'lr': 0.00029898031008296545, 'samples': 11244032, 'steps': 21960, 'batch_loss/train': 0.7186395744793117} 12/25/2021 20:47:05 - INFO - codeparrot_training - Step 21961: {'lr': 0.0002989648523466771, 'samples': 11244544, 'steps': 21961, 'batch_loss/train': 0.8604612005874515} 12/25/2021 20:47:15 - INFO - codeparrot_training - Step 21962: {'lr': 0.0002989493944157231, 'samples': 11245056, 'steps': 21962, 'batch_loss/train': 0.735345565713942} 12/25/2021 20:47:26 - INFO - codeparrot_training - Step 21963: {'lr': 0.00029893393629016457, 'samples': 11245568, 'steps': 21963, 'batch_loss/train': 0.7770599396899343} 12/25/2021 20:47:39 - INFO - codeparrot_training - Step 21964: {'lr': 0.0002989184779700633, 'samples': 11246080, 'steps': 21964, 'batch_loss/train': 0.6933944597840309} 12/25/2021 20:47:50 - INFO - codeparrot_training - Step 21965: {'lr': 0.00029890301945548057, 'samples': 11246592, 'steps': 21965, 'batch_loss/train': 1.2833381597884} 12/25/2021 20:48:00 - INFO - codeparrot_training - Step 21966: {'lr': 0.000298887560746478, 'samples': 11247104, 'steps': 21966, 'batch_loss/train': 0.9758167434483767} 12/25/2021 20:48:11 - INFO - 
codeparrot_training - Step 21967: {'lr': 0.0002988721018431169, 'samples': 11247616, 'steps': 21967, 'batch_loss/train': 0.7321238027652726} 12/25/2021 20:48:23 - INFO - codeparrot_training - Step 21968: {'lr': 0.00029885664274545876, 'samples': 11248128, 'steps': 21968, 'batch_loss/train': 0.7750827902927995} 12/25/2021 20:48:34 - INFO - codeparrot_training - Step 21969: {'lr': 0.0002988411834535651, 'samples': 11248640, 'steps': 21969, 'batch_loss/train': 0.7784760924987495} 12/25/2021 20:48:44 - INFO - codeparrot_training - Step 21970: {'lr': 0.00029882572396749724, 'samples': 11249152, 'steps': 21970, 'batch_loss/train': 0.7447829954326153} 12/25/2021 20:48:58 - INFO - codeparrot_training - Step 21971: {'lr': 0.00029881026428731685, 'samples': 11249664, 'steps': 21971, 'batch_loss/train': 0.8423774316906929} 12/25/2021 20:49:09 - INFO - codeparrot_training - Step 21972: {'lr': 0.0002987948044130852, 'samples': 11250176, 'steps': 21972, 'batch_loss/train': 0.8856706339865923} 12/25/2021 20:49:19 - INFO - codeparrot_training - Step 21973: {'lr': 0.00029877934434486387, 'samples': 11250688, 'steps': 21973, 'batch_loss/train': 0.9138374384492636} 12/25/2021 20:49:32 - INFO - codeparrot_training - Step 21974: {'lr': 0.0002987638840827143, 'samples': 11251200, 'steps': 21974, 'batch_loss/train': 0.8805929124355316} 12/25/2021 20:49:42 - INFO - codeparrot_training - Step 21975: {'lr': 0.00029874842362669795, 'samples': 11251712, 'steps': 21975, 'batch_loss/train': 0.7442231071181595} 12/25/2021 20:49:53 - INFO - codeparrot_training - Step 21976: {'lr': 0.00029873296297687626, 'samples': 11252224, 'steps': 21976, 'batch_loss/train': 0.7445679442025721} 12/25/2021 20:50:03 - INFO - codeparrot_training - Step 21977: {'lr': 0.0002987175021333106, 'samples': 11252736, 'steps': 21977, 'batch_loss/train': 0.7419776855967939} 12/25/2021 20:50:17 - INFO - codeparrot_training - Step 21978: {'lr': 0.00029870204109606267, 'samples': 11253248, 'steps': 21978, 'batch_loss/train': 
0.7422736813314259} 12/25/2021 20:50:28 - INFO - codeparrot_training - Step 21979: {'lr': 0.00029868657986519383, 'samples': 11253760, 'steps': 21979, 'batch_loss/train': 0.7971428642049432} 12/25/2021 20:50:38 - INFO - codeparrot_training - Step 21980: {'lr': 0.00029867111844076547, 'samples': 11254272, 'steps': 21980, 'batch_loss/train': 0.7180687775835395} 12/25/2021 20:50:50 - INFO - codeparrot_training - Step 21981: {'lr': 0.0002986556568228391, 'samples': 11254784, 'steps': 21981, 'batch_loss/train': 0.8104707198217511} 12/25/2021 20:51:01 - INFO - codeparrot_training - Step 21982: {'lr': 0.00029864019501147634, 'samples': 11255296, 'steps': 21982, 'batch_loss/train': 0.7350450798403472} 12/25/2021 20:51:12 - INFO - codeparrot_training - Step 21983: {'lr': 0.0002986247330067383, 'samples': 11255808, 'steps': 21983, 'batch_loss/train': 0.7549083270132542} 12/25/2021 20:51:24 - INFO - codeparrot_training - Step 21984: {'lr': 0.00029860927080868686, 'samples': 11256320, 'steps': 21984, 'batch_loss/train': 0.7912832186557353} 12/25/2021 20:51:34 - INFO - codeparrot_training - Step 21985: {'lr': 0.0002985938084173833, 'samples': 11256832, 'steps': 21985, 'batch_loss/train': 0.8037005765363574} 12/25/2021 20:51:45 - INFO - codeparrot_training - Step 21986: {'lr': 0.00029857834583288905, 'samples': 11257344, 'steps': 21986, 'batch_loss/train': 0.829635196365416} 12/25/2021 20:51:55 - INFO - codeparrot_training - Step 21987: {'lr': 0.0002985628830552657, 'samples': 11257856, 'steps': 21987, 'batch_loss/train': 0.7034969455562532} 12/25/2021 20:52:10 - INFO - codeparrot_training - Step 21988: {'lr': 0.00029854742008457463, 'samples': 11258368, 'steps': 21988, 'batch_loss/train': 0.750882476568222} 12/25/2021 20:52:20 - INFO - codeparrot_training - Step 21989: {'lr': 0.00029853195692087733, 'samples': 11258880, 'steps': 21989, 'batch_loss/train': 0.8345207190141082} 12/25/2021 20:52:31 - INFO - codeparrot_training - Step 21990: {'lr': 0.00029851649356423524, 'samples': 
11259392, 'steps': 21990, 'batch_loss/train': 0.8294505691155791} 12/25/2021 20:52:43 - INFO - codeparrot_training - Step 21991: {'lr': 0.00029850103001471007, 'samples': 11259904, 'steps': 21991, 'batch_loss/train': 1.0385208395309746} 12/25/2021 20:52:54 - INFO - codeparrot_training - Step 21992: {'lr': 0.000298485566272363, 'samples': 11260416, 'steps': 21992, 'batch_loss/train': 0.6405597129487433} 12/25/2021 20:53:04 - INFO - codeparrot_training - Step 21993: {'lr': 0.00029847010233725565, 'samples': 11260928, 'steps': 21993, 'batch_loss/train': 0.7700677290558815} 12/25/2021 20:53:18 - INFO - codeparrot_training - Step 21994: {'lr': 0.0002984546382094494, 'samples': 11261440, 'steps': 21994, 'batch_loss/train': 0.7718228083103895} 12/25/2021 20:53:29 - INFO - codeparrot_training - Step 21995: {'lr': 0.00029843917388900596, 'samples': 11261952, 'steps': 21995, 'batch_loss/train': 0.7536871107295156} 12/25/2021 20:53:39 - INFO - codeparrot_training - Step 21996: {'lr': 0.0002984237093759865, 'samples': 11262464, 'steps': 21996, 'batch_loss/train': 0.7269871695898473} 12/25/2021 20:53:50 - INFO - codeparrot_training - Step 21997: {'lr': 0.00029840824467045274, 'samples': 11262976, 'steps': 21997, 'batch_loss/train': 0.8453911505639553} 12/25/2021 20:54:02 - INFO - codeparrot_training - Step 21998: {'lr': 0.000298392779772466, 'samples': 11263488, 'steps': 21998, 'batch_loss/train': 0.7308031097054482} 12/25/2021 20:54:13 - INFO - codeparrot_training - Step 21999: {'lr': 0.00029837731468208796, 'samples': 11264000, 'steps': 21999, 'batch_loss/train': 1.6441803500056267} 12/25/2021 20:54:23 - INFO - codeparrot_training - Step 22000: {'lr': 0.0002983618493993799, 'samples': 11264512, 'steps': 22000, 'batch_loss/train': 0.8562913369387388} 12/25/2021 20:54:35 - INFO - codeparrot_training - Step 22001: {'lr': 0.0002983463839244034, 'samples': 11265024, 'steps': 22001, 'batch_loss/train': 0.6839741305448115} 12/25/2021 20:54:46 - INFO - codeparrot_training - Step 
22002: {'lr': 0.0002983309182572199, 'samples': 11265536, 'steps': 22002, 'batch_loss/train': 0.7798808049410582} 12/25/2021 20:54:57 - INFO - codeparrot_training - Step 22003: {'lr': 0.00029831545239789096, 'samples': 11266048, 'steps': 22003, 'batch_loss/train': 0.7995055364444852} 12/25/2021 20:55:07 - INFO - codeparrot_training - Step 22004: {'lr': 0.0002982999863464781, 'samples': 11266560, 'steps': 22004, 'batch_loss/train': 0.7354688014602289} 12/25/2021 20:55:20 - INFO - codeparrot_training - Step 22005: {'lr': 0.0002982845201030426, 'samples': 11267072, 'steps': 22005, 'batch_loss/train': 0.732332477113232} 12/25/2021 20:55:30 - INFO - codeparrot_training - Step 22006: {'lr': 0.0002982690536676461, 'samples': 11267584, 'steps': 22006, 'batch_loss/train': 0.7706478927284479} 12/25/2021 20:55:41 - INFO - codeparrot_training - Step 22007: {'lr': 0.0002982535870403501, 'samples': 11268096, 'steps': 22007, 'batch_loss/train': 0.7586540300399065} 12/25/2021 20:55:55 - INFO - codeparrot_training - Step 22008: {'lr': 0.00029823812022121616, 'samples': 11268608, 'steps': 22008, 'batch_loss/train': 0.7790867183357477} 12/25/2021 20:56:05 - INFO - codeparrot_training - Step 22009: {'lr': 0.0002982226532103055, 'samples': 11269120, 'steps': 22009, 'batch_loss/train': 0.7123064568731934} 12/25/2021 20:56:16 - INFO - codeparrot_training - Step 22010: {'lr': 0.0002982071860076799, 'samples': 11269632, 'steps': 22010, 'batch_loss/train': 0.753374494612217} 12/25/2021 20:56:28 - INFO - codeparrot_training - Step 22011: {'lr': 0.00029819171861340067, 'samples': 11270144, 'steps': 22011, 'batch_loss/train': 0.8259098171256483} 12/25/2021 20:56:39 - INFO - codeparrot_training - Step 22012: {'lr': 0.0002981762510275293, 'samples': 11270656, 'steps': 22012, 'batch_loss/train': 0.7354294563410804} 12/25/2021 20:56:49 - INFO - codeparrot_training - Step 22013: {'lr': 0.0002981607832501274, 'samples': 11271168, 'steps': 22013, 'batch_loss/train': 0.7992221191525459} 12/25/2021 
20:57:01 - INFO - codeparrot_training - Step 22014: {'lr': 0.00029814531528125644, 'samples': 11271680, 'steps': 22014, 'batch_loss/train': 1.0728486506268382} 12/25/2021 20:57:12 - INFO - codeparrot_training - Step 22015: {'lr': 0.0002981298471209779, 'samples': 11272192, 'steps': 22015, 'batch_loss/train': 1.0417657243087888} 12/25/2021 20:57:23 - INFO - codeparrot_training - Step 22016: {'lr': 0.0002981143787693532, 'samples': 11272704, 'steps': 22016, 'batch_loss/train': 0.8068354427814484} 12/25/2021 20:57:33 - INFO - codeparrot_training - Step 22017: {'lr': 0.00029809891022644397, 'samples': 11273216, 'steps': 22017, 'batch_loss/train': 0.815771964378655} 12/25/2021 20:57:47 - INFO - codeparrot_training - Step 22018: {'lr': 0.0002980834414923116, 'samples': 11273728, 'steps': 22018, 'batch_loss/train': 0.8199558351188898} 12/25/2021 20:57:58 - INFO - codeparrot_training - Step 22019: {'lr': 0.0002980679725670176, 'samples': 11274240, 'steps': 22019, 'batch_loss/train': 0.6038853939971887} 12/25/2021 20:58:09 - INFO - codeparrot_training - Step 22020: {'lr': 0.0002980525034506235, 'samples': 11274752, 'steps': 22020, 'batch_loss/train': 0.5931091867387295} 12/25/2021 20:58:21 - INFO - codeparrot_training - Step 22021: {'lr': 0.00029803703414319084, 'samples': 11275264, 'steps': 22021, 'batch_loss/train': 0.7490154500119388} 12/25/2021 20:58:31 - INFO - codeparrot_training - Step 22022: {'lr': 0.00029802156464478095, 'samples': 11275776, 'steps': 22022, 'batch_loss/train': 0.816233916208148} 12/25/2021 20:58:42 - INFO - codeparrot_training - Step 22023: {'lr': 0.00029800609495545563, 'samples': 11276288, 'steps': 22023, 'batch_loss/train': 0.7349053851794451} 12/25/2021 20:58:54 - INFO - codeparrot_training - Step 22024: {'lr': 0.0002979906250752761, 'samples': 11276800, 'steps': 22024, 'batch_loss/train': 0.7039102360140532} 12/25/2021 20:59:05 - INFO - codeparrot_training - Step 22025: {'lr': 0.00029797515500430395, 'samples': 11277312, 'steps': 22025, 
'batch_loss/train': 0.9599845362827182} 12/25/2021 20:59:16 - INFO - codeparrot_training - Step 22026: {'lr': 0.0002979596847426007, 'samples': 11277824, 'steps': 22026, 'batch_loss/train': 0.7989513259381056} 12/25/2021 20:59:26 - INFO - codeparrot_training - Step 22027: {'lr': 0.00029794421429022783, 'samples': 11278336, 'steps': 22027, 'batch_loss/train': 0.7735831150785089} 12/25/2021 20:59:40 - INFO - codeparrot_training - Step 22028: {'lr': 0.0002979287436472469, 'samples': 11278848, 'steps': 22028, 'batch_loss/train': 0.8423164980486035} 12/25/2021 20:59:51 - INFO - codeparrot_training - Step 22029: {'lr': 0.0002979132728137194, 'samples': 11279360, 'steps': 22029, 'batch_loss/train': 0.7326913385186344} 12/25/2021 21:00:01 - INFO - codeparrot_training - Step 22030: {'lr': 0.0002978978017897068, 'samples': 11279872, 'steps': 22030, 'batch_loss/train': 0.7336203875020146} 12/25/2021 21:00:13 - INFO - codeparrot_training - Step 22031: {'lr': 0.0002978823305752706, 'samples': 11280384, 'steps': 22031, 'batch_loss/train': 0.858673345297575} 12/25/2021 21:00:24 - INFO - codeparrot_training - Step 22032: {'lr': 0.0002978668591704724, 'samples': 11280896, 'steps': 22032, 'batch_loss/train': 0.7336889570578933} 12/25/2021 21:00:35 - INFO - codeparrot_training - Step 22033: {'lr': 0.0002978513875753736, 'samples': 11281408, 'steps': 22033, 'batch_loss/train': 0.7224828226026148} 12/25/2021 21:00:47 - INFO - codeparrot_training - Step 22034: {'lr': 0.00029783591579003575, 'samples': 11281920, 'steps': 22034, 'batch_loss/train': 0.8041857369244099} 12/25/2021 21:00:57 - INFO - codeparrot_training - Step 22035: {'lr': 0.0002978204438145204, 'samples': 11282432, 'steps': 22035, 'batch_loss/train': 0.7746338341385126} 12/25/2021 21:01:08 - INFO - codeparrot_training - Step 22036: {'lr': 0.00029780497164888904, 'samples': 11282944, 'steps': 22036, 'batch_loss/train': 0.7511190660297871} 12/25/2021 21:01:19 - INFO - codeparrot_training - Step 22037: {'lr': 
0.0002977894992932031, 'samples': 11283456, 'steps': 22037, 'batch_loss/train': 0.6712731007719412} 12/25/2021 21:01:32 - INFO - codeparrot_training - Step 22038: {'lr': 0.00029777402674752406, 'samples': 11283968, 'steps': 22038, 'batch_loss/train': 0.7060362403281033} 12/25/2021 21:01:43 - INFO - codeparrot_training - Step 22039: {'lr': 0.00029775855401191373, 'samples': 11284480, 'steps': 22039, 'batch_loss/train': 0.7214896769728512} 12/25/2021 21:01:54 - INFO - codeparrot_training - Step 22040: {'lr': 0.00029774308108643336, 'samples': 11284992, 'steps': 22040, 'batch_loss/train': 0.7892780713737011} 12/25/2021 21:02:06 - INFO - codeparrot_training - Step 22041: {'lr': 0.0002977276079711444, 'samples': 11285504, 'steps': 22041, 'batch_loss/train': 0.7952788723632693} 12/25/2021 21:02:17 - INFO - codeparrot_training - Step 22042: {'lr': 0.00029771213466610856, 'samples': 11286016, 'steps': 22042, 'batch_loss/train': 0.7033723169006407} 12/25/2021 21:02:27 - INFO - codeparrot_training - Step 22043: {'lr': 0.0002976966611713874, 'samples': 11286528, 'steps': 22043, 'batch_loss/train': 0.6637075049802661} 12/25/2021 21:02:39 - INFO - codeparrot_training - Step 22044: {'lr': 0.0002976811874870423, 'samples': 11287040, 'steps': 22044, 'batch_loss/train': 0.6371756056323647} 12/25/2021 21:02:50 - INFO - codeparrot_training - Step 22045: {'lr': 0.00029766571361313464, 'samples': 11287552, 'steps': 22045, 'batch_loss/train': 0.7778779612854123} 12/25/2021 21:03:01 - INFO - codeparrot_training - Step 22046: {'lr': 0.00029765023954972624, 'samples': 11288064, 'steps': 22046, 'batch_loss/train': 0.6562916664406657} 12/25/2021 21:03:11 - INFO - codeparrot_training - Step 22047: {'lr': 0.00029763476529687846, 'samples': 11288576, 'steps': 22047, 'batch_loss/train': 1.1505462434142828} 12/25/2021 21:03:25 - INFO - codeparrot_training - Step 22048: {'lr': 0.0002976192908546528, 'samples': 11289088, 'steps': 22048, 'batch_loss/train': 0.6665334498975426} 12/25/2021 21:03:36 - 
INFO - codeparrot_training - Step 22049: {'lr': 0.00029760381622311087, 'samples': 11289600, 'steps': 22049, 'batch_loss/train': 0.870866360142827} 12/25/2021 21:03:46 - INFO - codeparrot_training - Step 22050: {'lr': 0.00029758834140231413, 'samples': 11290112, 'steps': 22050, 'batch_loss/train': 0.8436749074608088} 12/25/2021 21:03:59 - INFO - codeparrot_training - Step 22051: {'lr': 0.000297572866392324, 'samples': 11290624, 'steps': 22051, 'batch_loss/train': 0.7882513192016631} 12/25/2021 21:04:10 - INFO - codeparrot_training - Step 22052: {'lr': 0.00029755739119320223, 'samples': 11291136, 'steps': 22052, 'batch_loss/train': 0.8061377820558846} 12/25/2021 21:04:20 - INFO - codeparrot_training - Step 22053: {'lr': 0.00029754191580501024, 'samples': 11291648, 'steps': 22053, 'batch_loss/train': 0.8242687620222569} 12/25/2021 21:04:34 - INFO - codeparrot_training - Step 22054: {'lr': 0.0002975264402278095, 'samples': 11292160, 'steps': 22054, 'batch_loss/train': 0.5056304153986275} 12/25/2021 21:04:45 - INFO - codeparrot_training - Step 22055: {'lr': 0.0002975109644616616, 'samples': 11292672, 'steps': 22055, 'batch_loss/train': 0.7971480693668127} 12/25/2021 21:04:55 - INFO - codeparrot_training - Step 22056: {'lr': 0.0002974954885066281, 'samples': 11293184, 'steps': 22056, 'batch_loss/train': 0.812669194303453} 12/25/2021 21:05:06 - INFO - codeparrot_training - Step 22057: {'lr': 0.00029748001236277034, 'samples': 11293696, 'steps': 22057, 'batch_loss/train': 0.720562799833715} 12/25/2021 21:05:18 - INFO - codeparrot_training - Step 22058: {'lr': 0.0002974645360301501, 'samples': 11294208, 'steps': 22058, 'batch_loss/train': 0.759139584377408} 12/25/2021 21:05:28 - INFO - codeparrot_training - Step 22059: {'lr': 0.0002974490595088287, 'samples': 11294720, 'steps': 22059, 'batch_loss/train': 0.7936163749545813} 12/25/2021 21:05:39 - INFO - codeparrot_training - Step 22060: {'lr': 0.0002974335827988678, 'samples': 11295232, 'steps': 22060, 'batch_loss/train': 
0.7589264390990138} 12/25/2021 21:05:51 - INFO - codeparrot_training - Step 22061: {'lr': 0.00029741810590032885, 'samples': 11295744, 'steps': 22061, 'batch_loss/train': 0.7385094119235873} 12/25/2021 21:06:02 - INFO - codeparrot_training - Step 22062: {'lr': 0.00029740262881327353, 'samples': 11296256, 'steps': 22062, 'batch_loss/train': 0.8361120205372572} 12/25/2021 21:06:12 - INFO - codeparrot_training - Step 22063: {'lr': 0.0002973871515377631, 'samples': 11296768, 'steps': 22063, 'batch_loss/train': 0.7545133056119084} 12/25/2021 21:06:25 - INFO - codeparrot_training - Step 22064: {'lr': 0.0002973716740738593, 'samples': 11297280, 'steps': 22064, 'batch_loss/train': 0.7288760374067351} 12/25/2021 21:06:35 - INFO - codeparrot_training - Step 22065: {'lr': 0.0002973561964216237, 'samples': 11297792, 'steps': 22065, 'batch_loss/train': 0.9423211924731731} 12/25/2021 21:06:46 - INFO - codeparrot_training - Step 22066: {'lr': 0.0002973407185811177, 'samples': 11298304, 'steps': 22066, 'batch_loss/train': 0.8178501231595874} 12/25/2021 21:06:57 - INFO - codeparrot_training - Step 22067: {'lr': 0.0002973252405524028, 'samples': 11298816, 'steps': 22067, 'batch_loss/train': 0.45729261776432395} 12/25/2021 21:07:11 - INFO - codeparrot_training - Step 22068: {'lr': 0.0002973097623355406, 'samples': 11299328, 'steps': 22068, 'batch_loss/train': 0.6451270510442555} 12/25/2021 21:07:22 - INFO - codeparrot_training - Step 22069: {'lr': 0.00029729428393059277, 'samples': 11299840, 'steps': 22069, 'batch_loss/train': 0.8510776464827359} 12/25/2021 21:07:32 - INFO - codeparrot_training - Step 22070: {'lr': 0.00029727880533762064, 'samples': 11300352, 'steps': 22070, 'batch_loss/train': 0.981548666022718} 12/25/2021 21:07:44 - INFO - codeparrot_training - Step 22071: {'lr': 0.00029726332655668586, 'samples': 11300864, 'steps': 22071, 'batch_loss/train': 0.8131085643544793} 12/25/2021 21:07:55 - INFO - codeparrot_training - Step 22072: {'lr': 0.00029724784758784994, 'samples': 
11301376, 'steps': 22072, 'batch_loss/train': 0.7526795566082001} 12/25/2021 21:08:06 - INFO - codeparrot_training - Step 22073: {'lr': 0.0002972323684311744, 'samples': 11301888, 'steps': 22073, 'batch_loss/train': 0.7533376486971974} 12/25/2021 21:08:16 - INFO - codeparrot_training - Step 22074: {'lr': 0.00029721688908672085, 'samples': 11302400, 'steps': 22074, 'batch_loss/train': 0.8139493875205517} 12/25/2021 21:08:29 - INFO - codeparrot_training - Step 22075: {'lr': 0.00029720140955455073, 'samples': 11302912, 'steps': 22075, 'batch_loss/train': 0.8622165704146028} 12/25/2021 21:08:39 - INFO - codeparrot_training - Step 22076: {'lr': 0.0002971859298347257, 'samples': 11303424, 'steps': 22076, 'batch_loss/train': 0.7373439576476812} 12/25/2021 21:08:50 - INFO - codeparrot_training - Step 22077: {'lr': 0.0002971704499273071, 'samples': 11303936, 'steps': 22077, 'batch_loss/train': 0.7751726969145238} 12/25/2021 21:09:04 - INFO - codeparrot_training - Step 22078: {'lr': 0.00029715496983235667, 'samples': 11304448, 'steps': 22078, 'batch_loss/train': 0.8139997906982899} 12/25/2021 21:09:14 - INFO - codeparrot_training - Step 22079: {'lr': 0.00029713948954993594, 'samples': 11304960, 'steps': 22079, 'batch_loss/train': 0.6659461129456758} 12/25/2021 21:09:25 - INFO - codeparrot_training - Step 22080: {'lr': 0.0002971240090801063, 'samples': 11305472, 'steps': 22080, 'batch_loss/train': 0.7261949746171013} 12/25/2021 21:09:37 - INFO - codeparrot_training - Step 22081: {'lr': 0.0002971085284229294, 'samples': 11305984, 'steps': 22081, 'batch_loss/train': 0.8184836044674739} 12/25/2021 21:09:48 - INFO - codeparrot_training - Step 22082: {'lr': 0.0002970930475784669, 'samples': 11306496, 'steps': 22082, 'batch_loss/train': 0.7142219906672835} 12/25/2021 21:09:58 - INFO - codeparrot_training - Step 22083: {'lr': 0.0002970775665467801, 'samples': 11307008, 'steps': 22083, 'batch_loss/train': 0.7493349011056125} 12/25/2021 21:10:10 - INFO - codeparrot_training - Step 
22084: {'lr': 0.0002970620853279307, 'samples': 11307520, 'steps': 22084, 'batch_loss/train': 0.7208233410492539} 12/25/2021 21:10:21 - INFO - codeparrot_training - Step 22085: {'lr': 0.0002970466039219802, 'samples': 11308032, 'steps': 22085, 'batch_loss/train': 0.7388404398225248} 12/25/2021 21:10:32 - INFO - codeparrot_training - Step 22086: {'lr': 0.0002970311223289902, 'samples': 11308544, 'steps': 22086, 'batch_loss/train': 1.0550996381789446} 12/25/2021 21:10:42 - INFO - codeparrot_training - Step 22087: {'lr': 0.00029701564054902227, 'samples': 11309056, 'steps': 22087, 'batch_loss/train': 0.7270454328972846} 12/25/2021 21:10:56 - INFO - codeparrot_training - Step 22088: {'lr': 0.00029700015858213783, 'samples': 11309568, 'steps': 22088, 'batch_loss/train': 0.7885928889736533} 12/25/2021 21:11:07 - INFO - codeparrot_training - Step 22089: {'lr': 0.00029698467642839857, 'samples': 11310080, 'steps': 22089, 'batch_loss/train': 0.7988956132903695} 12/25/2021 21:11:17 - INFO - codeparrot_training - Step 22090: {'lr': 0.0002969691940878658, 'samples': 11310592, 'steps': 22090, 'batch_loss/train': 0.7443089811131358} 12/25/2021 21:11:29 - INFO - codeparrot_training - Step 22091: {'lr': 0.0002969537115606015, 'samples': 11311104, 'steps': 22091, 'batch_loss/train': 0.8108170060440898} 12/25/2021 21:11:40 - INFO - codeparrot_training - Step 22092: {'lr': 0.00029693822884666683, 'samples': 11311616, 'steps': 22092, 'batch_loss/train': 0.7695204019546509} 12/25/2021 21:11:51 - INFO - codeparrot_training - Step 22093: {'lr': 0.00029692274594612353, 'samples': 11312128, 'steps': 22093, 'batch_loss/train': 0.7235771622508764} 12/25/2021 21:12:04 - INFO - codeparrot_training - Step 22094: {'lr': 0.000296907262859033, 'samples': 11312640, 'steps': 22094, 'batch_loss/train': 0.7780353657435626} 12/25/2021 21:12:15 - INFO - codeparrot_training - Step 22095: {'lr': 0.0002968917795854571, 'samples': 11313152, 'steps': 22095, 'batch_loss/train': 0.7891260916367173} 12/25/2021 
21:12:25 - INFO - codeparrot_training - Step 22096: {'lr': 0.00029687629612545695, 'samples': 11313664, 'steps': 22096, 'batch_loss/train': 0.7938113678246737} 12/25/2021 21:12:36 - INFO - codeparrot_training - Step 22097: {'lr': 0.00029686081247909456, 'samples': 11314176, 'steps': 22097, 'batch_loss/train': 0.7620980851352215} 12/25/2021 21:12:49 - INFO - codeparrot_training - Step 22098: {'lr': 0.0002968453286464312, 'samples': 11314688, 'steps': 22098, 'batch_loss/train': 0.7837013332173228} 12/25/2021 21:12:59 - INFO - codeparrot_training - Step 22099: {'lr': 0.0002968298446275285, 'samples': 11315200, 'steps': 22099, 'batch_loss/train': 0.8371008625254035} 12/25/2021 21:13:10 - INFO - codeparrot_training - Step 22100: {'lr': 0.00029681436042244797, 'samples': 11315712, 'steps': 22100, 'batch_loss/train': 0.6592375431209803} 12/25/2021 21:13:22 - INFO - codeparrot_training - Step 22101: {'lr': 0.00029679887603125126, 'samples': 11316224, 'steps': 22101, 'batch_loss/train': 0.48285796819254756} 12/25/2021 21:13:33 - INFO - codeparrot_training - Step 22102: {'lr': 0.0002967833914539999, 'samples': 11316736, 'steps': 22102, 'batch_loss/train': 0.7510678227990866} 12/25/2021 21:13:43 - INFO - codeparrot_training - Step 22103: {'lr': 0.00029676790669075543, 'samples': 11317248, 'steps': 22103, 'batch_loss/train': 0.7663668720051646} 12/25/2021 21:13:54 - INFO - codeparrot_training - Step 22104: {'lr': 0.00029675242174157944, 'samples': 11317760, 'steps': 22104, 'batch_loss/train': 0.7859143177047372} 12/25/2021 21:14:06 - INFO - codeparrot_training - Step 22105: {'lr': 0.00029673693660653346, 'samples': 11318272, 'steps': 22105, 'batch_loss/train': 0.8181799566373229} 12/25/2021 21:14:17 - INFO - codeparrot_training - Step 22106: {'lr': 0.0002967214512856791, 'samples': 11318784, 'steps': 22106, 'batch_loss/train': 0.7649841662496328} 12/25/2021 21:14:27 - INFO - codeparrot_training - Step 22107: {'lr': 0.00029670596577907783, 'samples': 11319296, 'steps': 22107, 
'batch_loss/train': 0.8446947745978832} 12/25/2021 21:14:41 - INFO - codeparrot_training - Step 22108: {'lr': 0.00029669048008679135, 'samples': 11319808, 'steps': 22108, 'batch_loss/train': 0.7504089325666428} 12/25/2021 21:14:52 - INFO - codeparrot_training - Step 22109: {'lr': 0.000296674994208881, 'samples': 11320320, 'steps': 22109, 'batch_loss/train': 0.7948618875816464} 12/25/2021 21:15:02 - INFO - codeparrot_training - Step 22110: {'lr': 0.0002966595081454087, 'samples': 11320832, 'steps': 22110, 'batch_loss/train': 0.7900525431614369} 12/25/2021 21:15:14 - INFO - codeparrot_training - Step 22111: {'lr': 0.0002966440218964357, 'samples': 11321344, 'steps': 22111, 'batch_loss/train': 0.7420944073237479} 12/25/2021 21:15:25 - INFO - codeparrot_training - Step 22112: {'lr': 0.0002966285354620237, 'samples': 11321856, 'steps': 22112, 'batch_loss/train': 0.8217758107930422} 12/25/2021 21:15:36 - INFO - codeparrot_training - Step 22113: {'lr': 0.00029661304884223417, 'samples': 11322368, 'steps': 22113, 'batch_loss/train': 0.7158040548674762} 12/25/2021 21:15:48 - INFO - codeparrot_training - Step 22114: {'lr': 0.00029659756203712887, 'samples': 11322880, 'steps': 22114, 'batch_loss/train': 0.7191231548786163} 12/25/2021 21:15:59 - INFO - codeparrot_training - Step 22115: {'lr': 0.0002965820750467692, 'samples': 11323392, 'steps': 22115, 'batch_loss/train': 0.8076449055224657} 12/25/2021 21:16:09 - INFO - codeparrot_training - Step 22116: {'lr': 0.0002965665878712168, 'samples': 11323904, 'steps': 22116, 'batch_loss/train': 0.7229673112742603} 12/25/2021 21:16:20 - INFO - codeparrot_training - Step 22117: {'lr': 0.00029655110051053326, 'samples': 11324416, 'steps': 22117, 'batch_loss/train': 0.755992078455165} 12/25/2021 21:16:34 - INFO - codeparrot_training - Step 22118: {'lr': 0.0002965356129647801, 'samples': 11324928, 'steps': 22118, 'batch_loss/train': 0.7975305998697877} 12/25/2021 21:16:44 - INFO - codeparrot_training - Step 22119: {'lr': 
0.0002965201252340189, 'samples': 11325440, 'steps': 22119, 'batch_loss/train': 0.7545004701241851} 12/25/2021 21:16:55 - INFO - codeparrot_training - Step 22120: {'lr': 0.00029650463731831125, 'samples': 11325952, 'steps': 22120, 'batch_loss/train': 0.6934104524552822} 12/25/2021 21:17:07 - INFO - codeparrot_training - Step 22121: {'lr': 0.0002964891492177188, 'samples': 11326464, 'steps': 22121, 'batch_loss/train': 0.7390397228300571} 12/25/2021 21:17:18 - INFO - codeparrot_training - Step 22122: {'lr': 0.00029647366093230287, 'samples': 11326976, 'steps': 22122, 'batch_loss/train': 0.8114252544473857} 12/25/2021 21:17:28 - INFO - codeparrot_training - Step 22123: {'lr': 0.00029645817246212535, 'samples': 11327488, 'steps': 22123, 'batch_loss/train': 0.8263290422037244} 12/25/2021 21:17:42 - INFO - codeparrot_training - Step 22124: {'lr': 0.0002964426838072477, 'samples': 11328000, 'steps': 22124, 'batch_loss/train': 0.7098111850209534} 12/25/2021 21:17:53 - INFO - codeparrot_training - Step 22125: {'lr': 0.00029642719496773134, 'samples': 11328512, 'steps': 22125, 'batch_loss/train': 0.7641259506344795} 12/25/2021 21:18:03 - INFO - codeparrot_training - Step 22126: {'lr': 0.0002964117059436381, 'samples': 11329024, 'steps': 22126, 'batch_loss/train': 0.76228994037956} 12/25/2021 21:18:15 - INFO - codeparrot_training - Step 22127: {'lr': 0.00029639621673502943, 'samples': 11329536, 'steps': 22127, 'batch_loss/train': 0.7955307234078646} 12/25/2021 21:18:26 - INFO - codeparrot_training - Step 22128: {'lr': 0.0002963807273419669, 'samples': 11330048, 'steps': 22128, 'batch_loss/train': 0.7888769172132015} 12/25/2021 21:18:36 - INFO - codeparrot_training - Step 22129: {'lr': 0.00029636523776451206, 'samples': 11330560, 'steps': 22129, 'batch_loss/train': 0.736938089132309} 12/25/2021 21:18:47 - INFO - codeparrot_training - Step 22130: {'lr': 0.0002963497480027266, 'samples': 11331072, 'steps': 22130, 'batch_loss/train': 0.7620709342882037} 12/25/2021 21:18:59 - INFO 
- codeparrot_training - Step 22131: {'lr': 0.0002963342580566721, 'samples': 11331584, 'steps': 22131, 'batch_loss/train': 0.6651463403832167} 12/25/2021 21:19:10 - INFO - codeparrot_training - Step 22132: {'lr': 0.00029631876792641, 'samples': 11332096, 'steps': 22132, 'batch_loss/train': 0.7786721549928188} 12/25/2021 21:19:20 - INFO - codeparrot_training - Step 22133: {'lr': 0.000296303277612002, 'samples': 11332608, 'steps': 22133, 'batch_loss/train': 0.788565129507333} 12/25/2021 21:19:34 - INFO - codeparrot_training - Step 22134: {'lr': 0.0002962877871135097, 'samples': 11333120, 'steps': 22134, 'batch_loss/train': 0.7715727146714926} 12/25/2021 21:19:45 - INFO - codeparrot_training - Step 22135: {'lr': 0.0002962722964309945, 'samples': 11333632, 'steps': 22135, 'batch_loss/train': 0.8867979934439063} 12/25/2021 21:19:56 - INFO - codeparrot_training - Step 22136: {'lr': 0.0002962568055645182, 'samples': 11334144, 'steps': 22136, 'batch_loss/train': 0.7554229213856161} 12/25/2021 21:20:08 - INFO - codeparrot_training - Step 22137: {'lr': 0.0002962413145141423, 'samples': 11334656, 'steps': 22137, 'batch_loss/train': 0.9293147390708327} 12/25/2021 21:20:18 - INFO - codeparrot_training - Step 22138: {'lr': 0.00029622582327992834, 'samples': 11335168, 'steps': 22138, 'batch_loss/train': 0.6802843925543129} 12/25/2021 21:20:29 - INFO - codeparrot_training - Step 22139: {'lr': 0.0002962103318619381, 'samples': 11335680, 'steps': 22139, 'batch_loss/train': 0.8221925802063197} 12/25/2021 21:20:40 - INFO - codeparrot_training - Step 22140: {'lr': 0.0002961948402602329, 'samples': 11336192, 'steps': 22140, 'batch_loss/train': 0.7905230566393584} 12/25/2021 21:20:52 - INFO - codeparrot_training - Step 22141: {'lr': 0.0002961793484748745, 'samples': 11336704, 'steps': 22141, 'batch_loss/train': 0.8060246733948588} 12/25/2021 21:21:02 - INFO - codeparrot_training - Step 22142: {'lr': 0.0002961638565059244, 'samples': 11337216, 'steps': 22142, 'batch_loss/train': 
0.8842751737684011} 12/25/2021 21:21:13 - INFO - codeparrot_training - Step 22143: {'lr': 0.00029614836435344437, 'samples': 11337728, 'steps': 22143, 'batch_loss/train': 0.5426089996472001} 12/25/2021 21:21:26 - INFO - codeparrot_training - Step 22144: {'lr': 0.00029613287201749576, 'samples': 11338240, 'steps': 22144, 'batch_loss/train': 0.8321570735424757} 12/25/2021 21:21:37 - INFO - codeparrot_training - Step 22145: {'lr': 0.0002961173794981403, 'samples': 11338752, 'steps': 22145, 'batch_loss/train': 0.8598777763545513} 12/25/2021 21:21:47 - INFO - codeparrot_training - Step 22146: {'lr': 0.00029610188679543955, 'samples': 11339264, 'steps': 22146, 'batch_loss/train': 1.3905519744148478} 12/25/2021 21:22:01 - INFO - codeparrot_training - Step 22147: {'lr': 0.00029608639390945506, 'samples': 11339776, 'steps': 22147, 'batch_loss/train': 0.8227771804085933} 12/25/2021 21:22:12 - INFO - codeparrot_training - Step 22148: {'lr': 0.0002960709008402486, 'samples': 11340288, 'steps': 22148, 'batch_loss/train': 0.7426302572712302} 12/25/2021 21:22:22 - INFO - codeparrot_training - Step 22149: {'lr': 0.00029605540758788145, 'samples': 11340800, 'steps': 22149, 'batch_loss/train': 0.7785696536302567} 12/25/2021 21:22:33 - INFO - codeparrot_training - Step 22150: {'lr': 0.00029603991415241555, 'samples': 11341312, 'steps': 22150, 'batch_loss/train': 0.8362264009192586} 12/25/2021 21:22:45 - INFO - codeparrot_training - Step 22151: {'lr': 0.0002960244205339122, 'samples': 11341824, 'steps': 22151, 'batch_loss/train': 0.9230991937220097} 12/25/2021 21:22:56 - INFO - codeparrot_training - Step 22152: {'lr': 0.0002960089267324333, 'samples': 11342336, 'steps': 22152, 'batch_loss/train': 0.7337138652801514} 12/25/2021 21:23:06 - INFO - codeparrot_training - Step 22153: {'lr': 0.00029599343274804013, 'samples': 11342848, 'steps': 22153, 'batch_loss/train': 0.7258608187548816} 12/25/2021 21:23:20 - INFO - codeparrot_training - Step 22154: {'lr': 0.0002959779385807945, 
'samples': 11343360, 'steps': 22154, 'batch_loss/train': 0.7836427078582346} 12/25/2021 21:23:31 - INFO - codeparrot_training - Step 22155: {'lr': 0.0002959624442307579, 'samples': 11343872, 'steps': 22155, 'batch_loss/train': 0.5990435815765522} 12/25/2021 21:23:42 - INFO - codeparrot_training - Step 22156: {'lr': 0.00029594694969799207, 'samples': 11344384, 'steps': 22156, 'batch_loss/train': 0.7667881017550826} 12/25/2021 21:23:53 - INFO - codeparrot_training - Step 22157: {'lr': 0.00029593145498255845, 'samples': 11344896, 'steps': 22157, 'batch_loss/train': 0.9021463599056005} 12/25/2021 21:24:04 - INFO - codeparrot_training - Step 22158: {'lr': 0.0002959159600845188, 'samples': 11345408, 'steps': 22158, 'batch_loss/train': 0.702936201938428} 12/25/2021 21:24:15 - INFO - codeparrot_training - Step 22159: {'lr': 0.0002959004650039345, 'samples': 11345920, 'steps': 22159, 'batch_loss/train': 0.8023376115597785} 12/25/2021 21:24:25 - INFO - codeparrot_training - Step 22160: {'lr': 0.0002958849697408673, 'samples': 11346432, 'steps': 22160, 'batch_loss/train': 0.7925940291024745} 12/25/2021 21:24:37 - INFO - codeparrot_training - Step 22161: {'lr': 0.0002958694742953788, 'samples': 11346944, 'steps': 22161, 'batch_loss/train': 0.7165907539892942} 12/25/2021 21:24:48 - INFO - codeparrot_training - Step 22162: {'lr': 0.0002958539786675306, 'samples': 11347456, 'steps': 22162, 'batch_loss/train': 0.7942480007186532} 12/25/2021 21:24:59 - INFO - codeparrot_training - Step 22163: {'lr': 0.0002958384828573844, 'samples': 11347968, 'steps': 22163, 'batch_loss/train': 0.8549206899479032} 12/25/2021 21:25:12 - INFO - codeparrot_training - Step 22164: {'lr': 0.00029582298686500156, 'samples': 11348480, 'steps': 22164, 'batch_loss/train': 0.8218197100795805} 12/25/2021 21:25:23 - INFO - codeparrot_training - Step 22165: {'lr': 0.0002958074906904439, 'samples': 11348992, 'steps': 22165, 'batch_loss/train': 0.7264859154820442} 12/25/2021 21:25:34 - INFO - codeparrot_training - 
Step 22166: {'lr': 0.0002957919943337729, 'samples': 11349504, 'steps': 22166, 'batch_loss/train': 0.7229066994041204} 12/25/2021 21:25:46 - INFO - codeparrot_training - Step 22167: {'lr': 0.0002957764977950502, 'samples': 11350016, 'steps': 22167, 'batch_loss/train': 0.7429835060029291} 12/25/2021 21:25:56 - INFO - codeparrot_training - Step 22168: {'lr': 0.00029576100107433747, 'samples': 11350528, 'steps': 22168, 'batch_loss/train': 0.7758061997592449} 12/25/2021 21:26:07 - INFO - codeparrot_training - Step 22169: {'lr': 0.00029574550417169634, 'samples': 11351040, 'steps': 22169, 'batch_loss/train': 0.7424534680321813} 12/25/2021 21:26:19 - INFO - codeparrot_training - Step 22170: {'lr': 0.0002957300070871883, 'samples': 11351552, 'steps': 22170, 'batch_loss/train': 0.7597155557014048} 12/25/2021 21:26:30 - INFO - codeparrot_training - Step 22171: {'lr': 0.000295714509820875, 'samples': 11352064, 'steps': 22171, 'batch_loss/train': 0.7326581892557442} 12/25/2021 21:26:40 - INFO - codeparrot_training - Step 22172: {'lr': 0.0002956990123728181, 'samples': 11352576, 'steps': 22172, 'batch_loss/train': 0.6468077157624066} 12/25/2021 21:26:51 - INFO - codeparrot_training - Step 22173: {'lr': 0.00029568351474307914, 'samples': 11353088, 'steps': 22173, 'batch_loss/train': 0.7324984185397625} 12/25/2021 21:27:04 - INFO - codeparrot_training - Step 22174: {'lr': 0.0002956680169317198, 'samples': 11353600, 'steps': 22174, 'batch_loss/train': 0.7669546157121658} 12/25/2021 21:27:15 - INFO - codeparrot_training - Step 22175: {'lr': 0.00029565251893880163, 'samples': 11354112, 'steps': 22175, 'batch_loss/train': 0.9060348561033607} 12/25/2021 21:27:26 - INFO - codeparrot_training - Step 22176: {'lr': 0.00029563702076438643, 'samples': 11354624, 'steps': 22176, 'batch_loss/train': 0.7567542782053351} 12/25/2021 21:27:38 - INFO - codeparrot_training - Step 22177: {'lr': 0.00029562152240853546, 'samples': 11355136, 'steps': 22177, 'batch_loss/train': 0.8289438895881176} 
12/25/2021 21:27:49 - INFO - codeparrot_training - Step 22178: {'lr': 0.0002956060238713107, 'samples': 11355648, 'steps': 22178, 'batch_loss/train': 0.789284220431} 12/25/2021 21:28:00 - INFO - codeparrot_training - Step 22179: {'lr': 0.00029559052515277357, 'samples': 11356160, 'steps': 22179, 'batch_loss/train': 0.8767473557963967} 12/25/2021 21:28:12 - INFO - codeparrot_training - Step 22180: {'lr': 0.0002955750262529857, 'samples': 11356672, 'steps': 22180, 'batch_loss/train': 0.8018565932288766} 12/25/2021 21:28:22 - INFO - codeparrot_training - Step 22181: {'lr': 0.0002955595271720087, 'samples': 11357184, 'steps': 22181, 'batch_loss/train': 0.7276516314595938} 12/25/2021 21:28:33 - INFO - codeparrot_training - Step 22182: {'lr': 0.00029554402790990433, 'samples': 11357696, 'steps': 22182, 'batch_loss/train': 0.8482695361599326} 12/25/2021 21:28:43 - INFO - codeparrot_training - Step 22183: {'lr': 0.000295528528466734, 'samples': 11358208, 'steps': 22183, 'batch_loss/train': 0.7905238019302487} 12/25/2021 21:28:57 - INFO - codeparrot_training - Step 22184: {'lr': 0.0002955130288425595, 'samples': 11358720, 'steps': 22184, 'batch_loss/train': 0.7924653240479529} 12/25/2021 21:29:08 - INFO - codeparrot_training - Step 22185: {'lr': 0.00029549752903744236, 'samples': 11359232, 'steps': 22185, 'batch_loss/train': 0.712323572486639} 12/25/2021 21:29:18 - INFO - codeparrot_training - Step 22186: {'lr': 0.00029548202905144415, 'samples': 11359744, 'steps': 22186, 'batch_loss/train': 0.7883026697672904} 12/25/2021 21:29:30 - INFO - codeparrot_training - Step 22187: {'lr': 0.00029546652888462667, 'samples': 11360256, 'steps': 22187, 'batch_loss/train': 0.6026322129182518} 12/25/2021 21:29:41 - INFO - codeparrot_training - Step 22188: {'lr': 0.0002954510285370514, 'samples': 11360768, 'steps': 22188, 'batch_loss/train': 0.7618662258610129} 12/25/2021 21:29:52 - INFO - codeparrot_training - Step 22189: {'lr': 0.00029543552800878, 'samples': 11361280, 'steps': 22189, 
'batch_loss/train': 0.8529558498412371} 12/25/2021 21:30:04 - INFO - codeparrot_training - Step 22190: {'lr': 0.0002954200272998741, 'samples': 11361792, 'steps': 22190, 'batch_loss/train': 0.811741417273879} 12/25/2021 21:30:14 - INFO - codeparrot_training - Step 22191: {'lr': 0.00029540452641039534, 'samples': 11362304, 'steps': 22191, 'batch_loss/train': 0.8056045505218208} 12/25/2021 21:30:25 - INFO - codeparrot_training - Step 22192: {'lr': 0.0002953890253404053, 'samples': 11362816, 'steps': 22192, 'batch_loss/train': 0.7673900281079113} 12/25/2021 21:30:36 - INFO - codeparrot_training - Step 22193: {'lr': 0.00029537352408996563, 'samples': 11363328, 'steps': 22193, 'batch_loss/train': 0.746892215218395} 12/25/2021 21:30:49 - INFO - codeparrot_training - Step 22194: {'lr': 0.000295358022659138, 'samples': 11363840, 'steps': 22194, 'batch_loss/train': 0.7120127016678452} 12/25/2021 21:31:00 - INFO - codeparrot_training - Step 22195: {'lr': 0.000295342521047984, 'samples': 11364352, 'steps': 22195, 'batch_loss/train': 0.8657074803486466} 12/25/2021 21:31:11 - INFO - codeparrot_training - Step 22196: {'lr': 0.0002953270192565652, 'samples': 11364864, 'steps': 22196, 'batch_loss/train': 0.8051307189743966} 12/25/2021 21:31:23 - INFO - codeparrot_training - Step 22197: {'lr': 0.0002953115172849433, 'samples': 11365376, 'steps': 22197, 'batch_loss/train': 0.8192352512851357} 12/25/2021 21:31:33 - INFO - codeparrot_training - Step 22198: {'lr': 0.00029529601513317993, 'samples': 11365888, 'steps': 22198, 'batch_loss/train': 0.864076990634203} 12/25/2021 21:31:44 - INFO - codeparrot_training - Step 22199: {'lr': 0.00029528051280133675, 'samples': 11366400, 'steps': 22199, 'batch_loss/train': 0.8469741400331259} 12/25/2021 21:31:56 - INFO - codeparrot_training - Step 22200: {'lr': 0.0002952650102894753, 'samples': 11366912, 'steps': 22200, 'batch_loss/train': 0.840233689174056} 12/25/2021 21:32:06 - INFO - codeparrot_training - Step 22201: {'lr': 
0.00029524950759765724, 'samples': 11367424, 'steps': 22201, 'batch_loss/train': 0.7904029469937086} 12/25/2021 21:32:17 - INFO - codeparrot_training - Step 22202: {'lr': 0.0002952340047259443, 'samples': 11367936, 'steps': 22202, 'batch_loss/train': 0.8038267553783953} 12/25/2021 21:32:31 - INFO - codeparrot_training - Step 22203: {'lr': 0.0002952185016743979, 'samples': 11368448, 'steps': 22203, 'batch_loss/train': 0.7143188174813986} 12/25/2021 21:32:41 - INFO - codeparrot_training - Step 22204: {'lr': 0.00029520299844307995, 'samples': 11368960, 'steps': 22204, 'batch_loss/train': 0.770482023479417} 12/25/2021 21:32:52 - INFO - codeparrot_training - Step 22205: {'lr': 0.0002951874950320519, 'samples': 11369472, 'steps': 22205, 'batch_loss/train': 0.7140742829069495} 12/25/2021 21:33:03 - INFO - codeparrot_training - Step 22206: {'lr': 0.0002951719914413754, 'samples': 11369984, 'steps': 22206, 'batch_loss/train': 0.5453434707014821} 12/25/2021 21:33:15 - INFO - codeparrot_training - Step 22207: {'lr': 0.00029515648767111213, 'samples': 11370496, 'steps': 22207, 'batch_loss/train': 0.7695685995277017} 12/25/2021 21:33:25 - INFO - codeparrot_training - Step 22208: {'lr': 0.0002951409837213238, 'samples': 11371008, 'steps': 22208, 'batch_loss/train': 0.7283351237419993} 12/25/2021 21:33:36 - INFO - codeparrot_training - Step 22209: {'lr': 0.0002951254795920718, 'samples': 11371520, 'steps': 22209, 'batch_loss/train': 0.7854876695200801} 12/25/2021 21:33:50 - INFO - codeparrot_training - Step 22210: {'lr': 0.00029510997528341804, 'samples': 11372032, 'steps': 22210, 'batch_loss/train': 0.8136844420805573} 12/25/2021 21:34:00 - INFO - codeparrot_training - Step 22211: {'lr': 0.00029509447079542405, 'samples': 11372544, 'steps': 22211, 'batch_loss/train': 0.769421509001404} 12/25/2021 21:34:11 - INFO - codeparrot_training - Step 22212: {'lr': 0.0002950789661281515, 'samples': 11373056, 'steps': 22212, 'batch_loss/train': 0.6698614954948425} 12/25/2021 21:34:23 - INFO 
- codeparrot_training - Step 22213: {'lr': 0.00029506346128166193, 'samples': 11373568, 'steps': 22213, 'batch_loss/train': 0.7517057205550373} 12/25/2021 21:34:33 - INFO - codeparrot_training - Step 22214: {'lr': 0.00029504795625601713, 'samples': 11374080, 'steps': 22214, 'batch_loss/train': 0.7794475620612502} 12/25/2021 21:34:44 - INFO - codeparrot_training - Step 22215: {'lr': 0.0002950324510512786, 'samples': 11374592, 'steps': 22215, 'batch_loss/train': 0.740033270791173} 12/25/2021 21:34:55 - INFO - codeparrot_training - Step 22216: {'lr': 0.00029501694566750806, 'samples': 11375104, 'steps': 22216, 'batch_loss/train': 0.757935716479551} 12/25/2021 21:35:07 - INFO - codeparrot_training - Step 22217: {'lr': 0.00029500144010476717, 'samples': 11375616, 'steps': 22217, 'batch_loss/train': 0.8094654037849978} 12/25/2021 21:35:17 - INFO - codeparrot_training - Step 22218: {'lr': 0.00029498593436311747, 'samples': 11376128, 'steps': 22218, 'batch_loss/train': 0.6749091842211783} 12/25/2021 21:35:28 - INFO - codeparrot_training - Step 22219: {'lr': 0.00029497042844262077, 'samples': 11376640, 'steps': 22219, 'batch_loss/train': 0.6933455760590732} 12/25/2021 21:35:40 - INFO - codeparrot_training - Step 22220: {'lr': 0.0002949549223433386, 'samples': 11377152, 'steps': 22220, 'batch_loss/train': 0.6041102843591943} 12/25/2021 21:35:51 - INFO - codeparrot_training - Step 22221: {'lr': 0.00029493941606533266, 'samples': 11377664, 'steps': 22221, 'batch_loss/train': 0.7330736400326714} 12/25/2021 21:36:01 - INFO - codeparrot_training - Step 22222: {'lr': 0.0002949239096086644, 'samples': 11378176, 'steps': 22222, 'batch_loss/train': 0.6761364149861038} 12/25/2021 21:36:15 - INFO - codeparrot_training - Step 22223: {'lr': 0.00029490840297339584, 'samples': 11378688, 'steps': 22223, 'batch_loss/train': 0.6974620616529137} 12/25/2021 21:36:26 - INFO - codeparrot_training - Step 22224: {'lr': 0.00029489289615958837, 'samples': 11379200, 'steps': 22224, 'batch_loss/train': 
0.7304641276714392} 12/25/2021 21:36:37 - INFO - codeparrot_training - Step 22225: {'lr': 0.00029487738916730366, 'samples': 11379712, 'steps': 22225, 'batch_loss/train': 0.651855481788516} 12/25/2021 21:36:49 - INFO - codeparrot_training - Step 22226: {'lr': 0.00029486188199660346, 'samples': 11380224, 'steps': 22226, 'batch_loss/train': 0.6460146023891866} 12/25/2021 21:36:59 - INFO - codeparrot_training - Step 22227: {'lr': 0.0002948463746475493, 'samples': 11380736, 'steps': 22227, 'batch_loss/train': 0.7762334770523012} 12/25/2021 21:37:10 - INFO - codeparrot_training - Step 22228: {'lr': 0.0002948308671202029, 'samples': 11381248, 'steps': 22228, 'batch_loss/train': 0.8039872306399047} 12/25/2021 21:37:21 - INFO - codeparrot_training - Step 22229: {'lr': 0.0002948153594146259, 'samples': 11381760, 'steps': 22229, 'batch_loss/train': 0.7545407302677631} 12/25/2021 21:37:33 - INFO - codeparrot_training - Step 22230: {'lr': 0.00029479985153088004, 'samples': 11382272, 'steps': 22230, 'batch_loss/train': 0.7604701190721244} 12/25/2021 21:37:43 - INFO - codeparrot_training - Step 22231: {'lr': 0.0002947843434690268, 'samples': 11382784, 'steps': 22231, 'batch_loss/train': 0.7529627792537212} 12/25/2021 21:37:54 - INFO - codeparrot_training - Step 22232: {'lr': 0.0002947688352291278, 'samples': 11383296, 'steps': 22232, 'batch_loss/train': 0.8030155319720507} 12/25/2021 21:38:08 - INFO - codeparrot_training - Step 22233: {'lr': 0.000294753326811245, 'samples': 11383808, 'steps': 22233, 'batch_loss/train': 0.7460269598523155} 12/25/2021 21:38:18 - INFO - codeparrot_training - Step 22234: {'lr': 0.0002947378182154399, 'samples': 11384320, 'steps': 22234, 'batch_loss/train': 0.8136478066444397} 12/25/2021 21:38:29 - INFO - codeparrot_training - Step 22235: {'lr': 0.00029472230944177394, 'samples': 11384832, 'steps': 22235, 'batch_loss/train': 0.7669894676655531} 12/25/2021 21:38:41 - INFO - codeparrot_training - Step 22236: {'lr': 0.00029470680049030907, 'samples': 
11385344, 'steps': 22236, 'batch_loss/train': 0.7081611845642328} 12/25/2021 21:38:52 - INFO - codeparrot_training - Step 22237: {'lr': 0.0002946912913611068, 'samples': 11385856, 'steps': 22237, 'batch_loss/train': 0.7585403341799974} 12/25/2021 21:39:02 - INFO - codeparrot_training - Step 22238: {'lr': 0.0002946757820542289, 'samples': 11386368, 'steps': 22238, 'batch_loss/train': 0.9279745845124125} 12/25/2021 21:39:16 - INFO - codeparrot_training - Step 22239: {'lr': 0.00029466027256973694, 'samples': 11386880, 'steps': 22239, 'batch_loss/train': 0.7748716566711664} 12/25/2021 21:39:27 - INFO - codeparrot_training - Step 22240: {'lr': 0.0002946447629076925, 'samples': 11387392, 'steps': 22240, 'batch_loss/train': 0.6158071325626224} 12/25/2021 21:39:38 - INFO - codeparrot_training - Step 22241: {'lr': 0.00029462925306815744, 'samples': 11387904, 'steps': 22241, 'batch_loss/train': 0.774535074364394} 12/25/2021 21:39:48 - INFO - codeparrot_training - Step 22242: {'lr': 0.0002946137430511933, 'samples': 11388416, 'steps': 22242, 'batch_loss/train': 0.7370615322142839} 12/25/2021 21:40:00 - INFO - codeparrot_training - Step 22243: {'lr': 0.0002945982328568618, 'samples': 11388928, 'steps': 22243, 'batch_loss/train': 0.702051981119439} 12/25/2021 21:40:11 - INFO - codeparrot_training - Step 22244: {'lr': 0.0002945827224852245, 'samples': 11389440, 'steps': 22244, 'batch_loss/train': 0.7791671315208077} 12/25/2021 21:40:21 - INFO - codeparrot_training - Step 22245: {'lr': 0.00029456721193634323, 'samples': 11389952, 'steps': 22245, 'batch_loss/train': 0.7497405065223575} 12/25/2021 21:40:33 - INFO - codeparrot_training - Step 22246: {'lr': 0.00029455170121027945, 'samples': 11390464, 'steps': 22246, 'batch_loss/train': 0.7762665033806115} 12/25/2021 21:40:44 - INFO - codeparrot_training - Step 22247: {'lr': 0.0002945361903070949, 'samples': 11390976, 'steps': 22247, 'batch_loss/train': 0.8482675803825259} 12/25/2021 21:40:55 - INFO - codeparrot_training - Step 
22248: {'lr': 0.0002945206792268514, 'samples': 11391488, 'steps': 22248, 'batch_loss/train': 0.844690048135817} 12/25/2021 21:41:08 - INFO - codeparrot_training - Step 22249: {'lr': 0.0002945051679696104, 'samples': 11392000, 'steps': 22249, 'batch_loss/train': 0.7928196229040623} 12/25/2021 21:41:19 - INFO - codeparrot_training - Step 22250: {'lr': 0.00029448965653543376, 'samples': 11392512, 'steps': 22250, 'batch_loss/train': 0.7383961920859292} 12/25/2021 21:41:30 - INFO - codeparrot_training - Step 22251: {'lr': 0.0002944741449243829, 'samples': 11393024, 'steps': 22251, 'batch_loss/train': 0.7459934356156737} 12/25/2021 21:41:40 - INFO - codeparrot_training - Step 22252: {'lr': 0.0002944586331365197, 'samples': 11393536, 'steps': 22252, 'batch_loss/train': 0.8017344241961837} 12/25/2021 21:41:52 - INFO - codeparrot_training - Step 22253: {'lr': 0.0002944431211719058, 'samples': 11394048, 'steps': 22253, 'batch_loss/train': 0.712973814457655} 12/25/2021 21:42:03 - INFO - codeparrot_training - Step 22254: {'lr': 0.0002944276090306028, 'samples': 11394560, 'steps': 22254, 'batch_loss/train': 0.7723849294707179} 12/25/2021 21:42:14 - INFO - codeparrot_training - Step 22255: {'lr': 0.00029441209671267235, 'samples': 11395072, 'steps': 22255, 'batch_loss/train': 0.847613962367177} 12/25/2021 21:42:26 - INFO - codeparrot_training - Step 22256: {'lr': 0.00029439658421817627, 'samples': 11395584, 'steps': 22256, 'batch_loss/train': 0.7085958905518055} 12/25/2021 21:42:36 - INFO - codeparrot_training - Step 22257: {'lr': 0.0002943810715471761, 'samples': 11396096, 'steps': 22257, 'batch_loss/train': 0.6929484326392412} 12/25/2021 21:42:47 - INFO - codeparrot_training - Step 22258: {'lr': 0.0002943655586997335, 'samples': 11396608, 'steps': 22258, 'batch_loss/train': 0.8518830025568604} 12/25/2021 21:42:59 - INFO - codeparrot_training - Step 22259: {'lr': 0.0002943500456759102, 'samples': 11397120, 'steps': 22259, 'batch_loss/train': 0.8117611669003963} 12/25/2021 
21:43:10 - INFO - codeparrot_training - Step 22260: {'lr': 0.00029433453247576784, 'samples': 11397632, 'steps': 22260, 'batch_loss/train': 0.77253383025527} 12/25/2021 21:43:20 - INFO - codeparrot_training - Step 22261: {'lr': 0.0002943190190993682, 'samples': 11398144, 'steps': 22261, 'batch_loss/train': 0.78887660196051} 12/25/2021 21:43:31 - INFO - codeparrot_training - Step 22262: {'lr': 0.00029430350554677284, 'samples': 11398656, 'steps': 22262, 'batch_loss/train': 0.7847904050722718} 12/25/2021 21:43:45 - INFO - codeparrot_training - Step 22263: {'lr': 0.0002942879918180435, 'samples': 11399168, 'steps': 22263, 'batch_loss/train': 0.7811557184904814} 12/25/2021 21:43:55 - INFO - codeparrot_training - Step 22264: {'lr': 0.00029427247791324174, 'samples': 11399680, 'steps': 22264, 'batch_loss/train': 0.8124916097149253} 12/25/2021 21:44:06 - INFO - codeparrot_training - Step 22265: {'lr': 0.0002942569638324294, 'samples': 11400192, 'steps': 22265, 'batch_loss/train': 0.9438964149449021} 12/25/2021 21:44:18 - INFO - codeparrot_training - Step 22266: {'lr': 0.0002942414495756681, 'samples': 11400704, 'steps': 22266, 'batch_loss/train': 0.8336822558194399} 12/25/2021 21:44:29 - INFO - codeparrot_training - Step 22267: {'lr': 0.00029422593514301945, 'samples': 11401216, 'steps': 22267, 'batch_loss/train': 0.690023232717067} 12/25/2021 21:44:39 - INFO - codeparrot_training - Step 22268: {'lr': 0.00029421042053454517, 'samples': 11401728, 'steps': 22268, 'batch_loss/train': 0.7988741733133793} 12/25/2021 21:44:53 - INFO - codeparrot_training - Step 22269: {'lr': 0.000294194905750307, 'samples': 11402240, 'steps': 22269, 'batch_loss/train': 0.6925316092092544} 12/25/2021 21:45:04 - INFO - codeparrot_training - Step 22270: {'lr': 0.00029417939079036653, 'samples': 11402752, 'steps': 22270, 'batch_loss/train': 0.6832829122431576} 12/25/2021 21:45:14 - INFO - codeparrot_training - Step 22271: {'lr': 0.0002941638756547855, 'samples': 11403264, 'steps': 22271, 
'batch_loss/train': 0.7806738493964076} 12/25/2021 21:45:25 - INFO - codeparrot_training - Step 22272: {'lr': 0.0002941483603436256, 'samples': 11403776, 'steps': 22272, 'batch_loss/train': 0.6804816080257297} 12/25/2021 21:45:37 - INFO - codeparrot_training - Step 22273: {'lr': 0.0002941328448569484, 'samples': 11404288, 'steps': 22273, 'batch_loss/train': 0.7418511789292097} 12/25/2021 21:45:48 - INFO - codeparrot_training - Step 22274: {'lr': 0.00029411732919481576, 'samples': 11404800, 'steps': 22274, 'batch_loss/train': 0.7103938469663262} 12/25/2021 21:45:58 - INFO - codeparrot_training - Step 22275: {'lr': 0.00029410181335728924, 'samples': 11405312, 'steps': 22275, 'batch_loss/train': 0.7734964878764004} 12/25/2021 21:46:10 - INFO - codeparrot_training - Step 22276: {'lr': 0.00029408629734443064, 'samples': 11405824, 'steps': 22276, 'batch_loss/train': 0.7589947059750557} 12/25/2021 21:46:21 - INFO - codeparrot_training - Step 22277: {'lr': 0.00029407078115630144, 'samples': 11406336, 'steps': 22277, 'batch_loss/train': 0.7279424800071865} 12/25/2021 21:46:31 - INFO - codeparrot_training - Step 22278: {'lr': 0.0002940552647929636, 'samples': 11406848, 'steps': 22278, 'batch_loss/train': 0.7218433672678657} 12/25/2021 21:46:45 - INFO - codeparrot_training - Step 22279: {'lr': 0.00029403974825447857, 'samples': 11407360, 'steps': 22279, 'batch_loss/train': 0.8001904846169055} 12/25/2021 21:46:56 - INFO - codeparrot_training - Step 22280: {'lr': 0.00029402423154090807, 'samples': 11407872, 'steps': 22280, 'batch_loss/train': 0.7793540772981942} 12/25/2021 21:47:07 - INFO - codeparrot_training - Step 22281: {'lr': 0.00029400871465231395, 'samples': 11408384, 'steps': 22281, 'batch_loss/train': 0.7942153685726225} 12/25/2021 21:47:19 - INFO - codeparrot_training - Step 22282: {'lr': 0.0002939931975887578, 'samples': 11408896, 'steps': 22282, 'batch_loss/train': 0.746082472614944} 12/25/2021 21:47:29 - INFO - codeparrot_training - Step 22283: {'lr': 
0.0002939776803503013, 'samples': 11409408, 'steps': 22283, 'batch_loss/train': 0.7987143669161014} 12/25/2021 21:47:40 - INFO - codeparrot_training - Step 22284: {'lr': 0.00029396216293700605, 'samples': 11409920, 'steps': 22284, 'batch_loss/train': 0.7581591000780463} 12/25/2021 21:47:50 - INFO - codeparrot_training - Step 22285: {'lr': 0.00029394664534893397, 'samples': 11410432, 'steps': 22285, 'batch_loss/train': 0.7489065676927567} 12/25/2021 21:48:02 - INFO - codeparrot_training - Step 22286: {'lr': 0.00029393112758614655, 'samples': 11410944, 'steps': 22286, 'batch_loss/train': 0.6857182602398098} 12/25/2021 21:48:13 - INFO - codeparrot_training - Step 22287: {'lr': 0.00029391560964870555, 'samples': 11411456, 'steps': 22287, 'batch_loss/train': 0.8239051653072238} 12/25/2021 21:48:24 - INFO - codeparrot_training - Step 22288: {'lr': 0.0002939000915366727, 'samples': 11411968, 'steps': 22288, 'batch_loss/train': 0.7461269206833094} 12/25/2021 21:48:38 - INFO - codeparrot_training - Step 22289: {'lr': 0.00029388457325010966, 'samples': 11412480, 'steps': 22289, 'batch_loss/train': 0.6881381021812558} 12/25/2021 21:48:48 - INFO - codeparrot_training - Step 22290: {'lr': 0.00029386905478907805, 'samples': 11412992, 'steps': 22290, 'batch_loss/train': 0.8052917812019587} 12/25/2021 21:48:59 - INFO - codeparrot_training - Step 22291: {'lr': 0.0002938535361536397, 'samples': 11413504, 'steps': 22291, 'batch_loss/train': 0.6950148303876631} 12/25/2021 21:49:11 - INFO - codeparrot_training - Step 22292: {'lr': 0.00029383801734385627, 'samples': 11414016, 'steps': 22292, 'batch_loss/train': 0.7868494400754571} 12/25/2021 21:49:21 - INFO - codeparrot_training - Step 22293: {'lr': 0.0002938224983597895, 'samples': 11414528, 'steps': 22293, 'batch_loss/train': 0.8388803154230118} 12/25/2021 21:49:32 - INFO - codeparrot_training - Step 22294: {'lr': 0.0002938069792015009, 'samples': 11415040, 'steps': 22294, 'batch_loss/train': 0.8127014273777604} 12/25/2021 21:49:43 - 
INFO - codeparrot_training - Step 22295: {'lr': 0.00029379145986905235, 'samples': 11415552, 'steps': 22295, 'batch_loss/train': 0.7714569778181612} 12/25/2021 21:49:55 - INFO - codeparrot_training - Step 22296: {'lr': 0.00029377594036250545, 'samples': 11416064, 'steps': 22296, 'batch_loss/train': 0.766478500328958} 12/25/2021 21:50:06 - INFO - codeparrot_training - Step 22297: {'lr': 0.0002937604206819219, 'samples': 11416576, 'steps': 22297, 'batch_loss/train': 0.8176273992285132} 12/25/2021 21:50:17 - INFO - codeparrot_training - Step 22298: {'lr': 0.00029374490082736363, 'samples': 11417088, 'steps': 22298, 'batch_loss/train': 0.7349557466804981} 12/25/2021 21:50:31 - INFO - codeparrot_training - Step 22299: {'lr': 0.000293729380798892, 'samples': 11417600, 'steps': 22299, 'batch_loss/train': 0.9795792391523719} 12/25/2021 21:50:41 - INFO - codeparrot_training - Step 22300: {'lr': 0.0002937138605965689, 'samples': 11418112, 'steps': 22300, 'batch_loss/train': 0.7744721602648497} 12/25/2021 21:50:52 - INFO - codeparrot_training - Step 22301: {'lr': 0.000293698340220456, 'samples': 11418624, 'steps': 22301, 'batch_loss/train': 0.7716164386365563} 12/25/2021 21:51:04 - INFO - codeparrot_training - Step 22302: {'lr': 0.000293682819670615, 'samples': 11419136, 'steps': 22302, 'batch_loss/train': 0.7316620848141611} 12/25/2021 21:51:14 - INFO - codeparrot_training - Step 22303: {'lr': 0.0002936672989471075, 'samples': 11419648, 'steps': 22303, 'batch_loss/train': 0.762268993537873} 12/25/2021 21:51:25 - INFO - codeparrot_training - Step 22304: {'lr': 0.0002936517780499955, 'samples': 11420160, 'steps': 22304, 'batch_loss/train': 0.8327423939481378} 12/25/2021 21:51:36 - INFO - codeparrot_training - Step 22305: {'lr': 0.0002936362569793404, 'samples': 11420672, 'steps': 22305, 'batch_loss/train': 0.8281129340175539} 12/25/2021 21:51:48 - INFO - codeparrot_training - Step 22306: {'lr': 0.000293620735735204, 'samples': 11421184, 'steps': 22306, 'batch_loss/train': 
0.7807534756138921} 12/25/2021 21:51:59 - INFO - codeparrot_training - Step 22307: {'lr': 0.0002936052143176481, 'samples': 11421696, 'steps': 22307, 'batch_loss/train': 0.9293919297633693} 12/25/2021 21:52:10 - INFO - codeparrot_training - Step 22308: {'lr': 0.0002935896927267343, 'samples': 11422208, 'steps': 22308, 'batch_loss/train': 1.0091528166085482} 12/25/2021 21:52:23 - INFO - codeparrot_training - Step 22309: {'lr': 0.00029357417096252425, 'samples': 11422720, 'steps': 22309, 'batch_loss/train': 0.7788063930347562} 12/25/2021 21:52:34 - INFO - codeparrot_training - Step 22310: {'lr': 0.00029355864902507985, 'samples': 11423232, 'steps': 22310, 'batch_loss/train': 0.7684405520558357} 12/25/2021 21:52:45 - INFO - codeparrot_training - Step 22311: {'lr': 0.0002935431269144627, 'samples': 11423744, 'steps': 22311, 'batch_loss/train': 0.7449143836274743} 12/25/2021 21:52:57 - INFO - codeparrot_training - Step 22312: {'lr': 0.0002935276046307345, 'samples': 11424256, 'steps': 22312, 'batch_loss/train': 0.7945584300905466} 12/25/2021 21:53:07 - INFO - codeparrot_training - Step 22313: {'lr': 0.000293512082173957, 'samples': 11424768, 'steps': 22313, 'batch_loss/train': 0.791794691933319} 12/25/2021 21:53:18 - INFO - codeparrot_training - Step 22314: {'lr': 0.0002934965595441919, 'samples': 11425280, 'steps': 22314, 'batch_loss/train': 0.7184874906670302} 12/25/2021 21:53:29 - INFO - codeparrot_training - Step 22315: {'lr': 0.0002934810367415009, 'samples': 11425792, 'steps': 22315, 'batch_loss/train': 0.7022757374215871} 12/25/2021 21:53:41 - INFO - codeparrot_training - Step 22316: {'lr': 0.00029346551376594566, 'samples': 11426304, 'steps': 22316, 'batch_loss/train': 0.7309253136627376} 12/25/2021 21:53:51 - INFO - codeparrot_training - Step 22317: {'lr': 0.000293449990617588, 'samples': 11426816, 'steps': 22317, 'batch_loss/train': 0.7190146604552865} 12/25/2021 21:54:02 - INFO - codeparrot_training - Step 22318: {'lr': 0.00029343446729648956, 'samples': 
11427328, 'steps': 22318, 'batch_loss/train': 0.6253703542752191} 12/25/2021 21:54:16 - INFO - codeparrot_training - Step 22319: {'lr': 0.0002934189438027121, 'samples': 11427840, 'steps': 22319, 'batch_loss/train': 0.7739206803962588} 12/25/2021 21:54:26 - INFO - codeparrot_training - Step 22320: {'lr': 0.00029340342013631725, 'samples': 11428352, 'steps': 22320, 'batch_loss/train': 0.6589360688813031} 12/25/2021 21:54:37 - INFO - codeparrot_training - Step 22321: {'lr': 0.0002933878962973668, 'samples': 11428864, 'steps': 22321, 'batch_loss/train': 0.7065866990014911} 12/25/2021 21:54:49 - INFO - codeparrot_training - Step 22322: {'lr': 0.00029337237228592254, 'samples': 11429376, 'steps': 22322, 'batch_loss/train': 0.7583631183952093} 12/25/2021 21:55:00 - INFO - codeparrot_training - Step 22323: {'lr': 0.00029335684810204604, 'samples': 11429888, 'steps': 22323, 'batch_loss/train': 0.939838691148907} 12/25/2021 21:55:10 - INFO - codeparrot_training - Step 22324: {'lr': 0.0002933413237457991, 'samples': 11430400, 'steps': 22324, 'batch_loss/train': 0.7958524134010077} 12/25/2021 21:55:21 - INFO - codeparrot_training - Step 22325: {'lr': 0.0002933257992172434, 'samples': 11430912, 'steps': 22325, 'batch_loss/train': 0.8204242642968893} 12/25/2021 21:55:35 - INFO - codeparrot_training - Step 22326: {'lr': 0.00029331027451644074, 'samples': 11431424, 'steps': 22326, 'batch_loss/train': 0.8302336540073156} 12/25/2021 21:55:45 - INFO - codeparrot_training - Step 22327: {'lr': 0.00029329474964345277, 'samples': 11431936, 'steps': 22327, 'batch_loss/train': 0.7849760772660375} 12/25/2021 21:55:56 - INFO - codeparrot_training - Step 22328: {'lr': 0.00029327922459834125, 'samples': 11432448, 'steps': 22328, 'batch_loss/train': 0.6631498676724732} 12/25/2021 21:56:08 - INFO - codeparrot_training - Step 22329: {'lr': 0.00029326369938116777, 'samples': 11432960, 'steps': 22329, 'batch_loss/train': 0.8803491600556299} 12/25/2021 21:56:19 - INFO - codeparrot_training - Step 
22330: {'lr': 0.0002932481739919943, 'samples': 11433472, 'steps': 22330, 'batch_loss/train': 0.9798602862283587} 12/25/2021 21:56:30 - INFO - codeparrot_training - Step 22331: {'lr': 0.00029323264843088227, 'samples': 11433984, 'steps': 22331, 'batch_loss/train': 0.8770339447073638} 12/25/2021 21:56:42 - INFO - codeparrot_training - Step 22332: {'lr': 0.00029321712269789364, 'samples': 11434496, 'steps': 22332, 'batch_loss/train': 0.7508688105735928} 12/25/2021 21:56:52 - INFO - codeparrot_training - Step 22333: {'lr': 0.00029320159679309, 'samples': 11435008, 'steps': 22333, 'batch_loss/train': 0.7940190425142646} 12/25/2021 21:57:03 - INFO - codeparrot_training - Step 22334: {'lr': 0.00029318607071653313, 'samples': 11435520, 'steps': 22334, 'batch_loss/train': 0.6786330053582788} 12/25/2021 21:57:15 - INFO - codeparrot_training - Step 22335: {'lr': 0.0002931705444682848, 'samples': 11436032, 'steps': 22335, 'batch_loss/train': 0.7567642459180206} 12/25/2021 21:57:25 - INFO - codeparrot_training - Step 22336: {'lr': 0.0002931550180484066, 'samples': 11436544, 'steps': 22336, 'batch_loss/train': 0.7489972738549113} 12/25/2021 21:57:36 - INFO - codeparrot_training - Step 22337: {'lr': 0.00029313949145696045, 'samples': 11437056, 'steps': 22337, 'batch_loss/train': 0.7340102033340372} 12/25/2021 21:57:47 - INFO - codeparrot_training - Step 22338: {'lr': 0.00029312396469400795, 'samples': 11437568, 'steps': 22338, 'batch_loss/train': 0.7355988098424859} 12/25/2021 21:58:00 - INFO - codeparrot_training - Step 22339: {'lr': 0.0002931084377596108, 'samples': 11438080, 'steps': 22339, 'batch_loss/train': 0.6721504182787612} 12/25/2021 21:58:11 - INFO - codeparrot_training - Step 22340: {'lr': 0.0002930929106538308, 'samples': 11438592, 'steps': 22340, 'batch_loss/train': 0.7666071783751249} 12/25/2021 21:58:21 - INFO - codeparrot_training - Step 22341: {'lr': 0.0002930773833767296, 'samples': 11439104, 'steps': 22341, 'batch_loss/train': 0.7081027729436755} 12/25/2021 
21:58:34 - INFO - codeparrot_training - Step 22342: {'lr': 0.0002930618559283691, 'samples': 11439616, 'steps': 22342, 'batch_loss/train': 0.8174073596019298} 12/25/2021 21:58:44 - INFO - codeparrot_training - Step 22343: {'lr': 0.0002930463283088109, 'samples': 11440128, 'steps': 22343, 'batch_loss/train': 0.7907624850049615} 12/25/2021 21:58:55 - INFO - codeparrot_training - Step 22344: {'lr': 0.0002930308005181167, 'samples': 11440640, 'steps': 22344, 'batch_loss/train': 0.819174918346107} 12/25/2021 21:59:09 - INFO - codeparrot_training - Step 22345: {'lr': 0.0002930152725563483, 'samples': 11441152, 'steps': 22345, 'batch_loss/train': 0.6909157419577241} 12/25/2021 21:59:19 - INFO - codeparrot_training - Step 22346: {'lr': 0.00029299974442356734, 'samples': 11441664, 'steps': 22346, 'batch_loss/train': 0.8499745484441519} 12/25/2021 21:59:30 - INFO - codeparrot_training - Step 22347: {'lr': 0.0002929842161198357, 'samples': 11442176, 'steps': 22347, 'batch_loss/train': 0.6865137103013694} 12/25/2021 21:59:42 - INFO - codeparrot_training - Step 22348: {'lr': 0.00029296868764521513, 'samples': 11442688, 'steps': 22348, 'batch_loss/train': 0.788300970220007} 12/25/2021 21:59:53 - INFO - codeparrot_training - Step 22349: {'lr': 0.00029295315899976717, 'samples': 11443200, 'steps': 22349, 'batch_loss/train': 0.820764665491879} 12/25/2021 22:00:03 - INFO - codeparrot_training - Step 22350: {'lr': 0.00029293763018355373, 'samples': 11443712, 'steps': 22350, 'batch_loss/train': 0.8168650846928358} 12/25/2021 22:00:14 - INFO - codeparrot_training - Step 22351: {'lr': 0.0002929221011966365, 'samples': 11444224, 'steps': 22351, 'batch_loss/train': 0.8007001085206866} 12/25/2021 22:00:26 - INFO - codeparrot_training - Step 22352: {'lr': 0.0002929065720390772, 'samples': 11444736, 'steps': 22352, 'batch_loss/train': 1.1451955898664892} 12/25/2021 22:00:37 - INFO - codeparrot_training - Step 22353: {'lr': 0.0002928910427109375, 'samples': 11445248, 'steps': 22353, 
'batch_loss/train': 0.9151349980384111} 12/25/2021 22:00:47 - INFO - codeparrot_training - Step 22354: {'lr': 0.0002928755132122792, 'samples': 11445760, 'steps': 22354, 'batch_loss/train': 0.6918607649859041} 12/25/2021 22:01:01 - INFO - codeparrot_training - Step 22355: {'lr': 0.00029285998354316414, 'samples': 11446272, 'steps': 22355, 'batch_loss/train': 0.7831362262368202} 12/25/2021 22:01:12 - INFO - codeparrot_training - Step 22356: {'lr': 0.0002928444537036539, 'samples': 11446784, 'steps': 22356, 'batch_loss/train': 0.7433864865452051} 12/25/2021 22:01:22 - INFO - codeparrot_training - Step 22357: {'lr': 0.0002928289236938103, 'samples': 11447296, 'steps': 22357, 'batch_loss/train': 0.7808529660105705} 12/25/2021 22:01:34 - INFO - codeparrot_training - Step 22358: {'lr': 0.00029281339351369505, 'samples': 11447808, 'steps': 22358, 'batch_loss/train': 0.7538015441969037} 12/25/2021 22:01:45 - INFO - codeparrot_training - Step 22359: {'lr': 0.00029279786316336996, 'samples': 11448320, 'steps': 22359, 'batch_loss/train': 0.7271251403726637} 12/25/2021 22:01:56 - INFO - codeparrot_training - Step 22360: {'lr': 0.0002927823326428968, 'samples': 11448832, 'steps': 22360, 'batch_loss/train': 0.7539786621928215} 12/25/2021 22:02:06 - INFO - codeparrot_training - Step 22361: {'lr': 0.0002927668019523371, 'samples': 11449344, 'steps': 22361, 'batch_loss/train': 0.6394691012101248} 12/25/2021 22:02:18 - INFO - codeparrot_training - Step 22362: {'lr': 0.00029275127109175284, 'samples': 11449856, 'steps': 22362, 'batch_loss/train': 0.8726989051792771} 12/25/2021 22:02:29 - INFO - codeparrot_training - Step 22363: {'lr': 0.00029273574006120564, 'samples': 11450368, 'steps': 22363, 'batch_loss/train': 0.8709156177937984} 12/25/2021 22:02:40 - INFO - codeparrot_training - Step 22364: {'lr': 0.0002927202088607572, 'samples': 11450880, 'steps': 22364, 'batch_loss/train': 0.7217475689249113} 12/25/2021 22:02:54 - INFO - codeparrot_training - Step 22365: {'lr': 
0.0002927046774904694, 'samples': 11451392, 'steps': 22365, 'batch_loss/train': 0.7403629192849621} 12/25/2021 22:03:04 - INFO - codeparrot_training - Step 22366: {'lr': 0.0002926891459504039, 'samples': 11451904, 'steps': 22366, 'batch_loss/train': 0.7473937133327127} 12/25/2021 22:03:15 - INFO - codeparrot_training - Step 22367: {'lr': 0.0002926736142406226, 'samples': 11452416, 'steps': 22367, 'batch_loss/train': 0.737530717626214} 12/25/2021 22:03:27 - INFO - codeparrot_training - Step 22368: {'lr': 0.00029265808236118694, 'samples': 11452928, 'steps': 22368, 'batch_loss/train': 0.7396635469049215} 12/25/2021 22:03:38 - INFO - codeparrot_training - Step 22369: {'lr': 0.000292642550312159, 'samples': 11453440, 'steps': 22369, 'batch_loss/train': 0.9025275148451328} 12/25/2021 22:03:49 - INFO - codeparrot_training - Step 22370: {'lr': 0.00029262701809360024, 'samples': 11453952, 'steps': 22370, 'batch_loss/train': 0.8377049220725894} 12/25/2021 22:03:59 - INFO - codeparrot_training - Step 22371: {'lr': 0.0002926114857055726, 'samples': 11454464, 'steps': 22371, 'batch_loss/train': 1.5810059010982513} 12/25/2021 22:04:13 - INFO - codeparrot_training - Step 22372: {'lr': 0.0002925959531481379, 'samples': 11454976, 'steps': 22372, 'batch_loss/train': 0.8180748503655195} 12/25/2021 22:04:24 - INFO - codeparrot_training - Step 22373: {'lr': 0.00029258042042135763, 'samples': 11455488, 'steps': 22373, 'batch_loss/train': 0.680892696371302} 12/25/2021 22:04:34 - INFO - codeparrot_training - Step 22374: {'lr': 0.00029256488752529375, 'samples': 11456000, 'steps': 22374, 'batch_loss/train': 0.7793665440985933} 12/25/2021 22:04:46 - INFO - codeparrot_training - Step 22375: {'lr': 0.0002925493544600079, 'samples': 11456512, 'steps': 22375, 'batch_loss/train': 0.8505263775587082} 12/25/2021 22:04:57 - INFO - codeparrot_training - Step 22376: {'lr': 0.000292533821225562, 'samples': 11457024, 'steps': 22376, 'batch_loss/train': 0.8231178680434823} 12/25/2021 22:05:08 - INFO - 
codeparrot_training - Step 22377: {'lr': 0.00029251828782201754, 'samples': 11457536, 'steps': 22377, 'batch_loss/train': 0.6828471370972693} 12/25/2021 22:05:18 - INFO - codeparrot_training - Step 22378: {'lr': 0.0002925027542494365, 'samples': 11458048, 'steps': 22378, 'batch_loss/train': 0.6613948624581099} 12/25/2021 22:05:30 - INFO - codeparrot_training - Step 22379: {'lr': 0.0002924872205078806, 'samples': 11458560, 'steps': 22379, 'batch_loss/train': 0.7983357543125749} 12/25/2021 22:05:41 - INFO - codeparrot_training - Step 22380: {'lr': 0.00029247168659741145, 'samples': 11459072, 'steps': 22380, 'batch_loss/train': 0.8209223626181483} 12/25/2021 22:05:51 - INFO - codeparrot_training - Step 22381: {'lr': 0.000292456152518091, 'samples': 11459584, 'steps': 22381, 'batch_loss/train': 0.817181948106736} 12/25/2021 22:06:04 - INFO - codeparrot_training - Step 22382: {'lr': 0.00029244061826998093, 'samples': 11460096, 'steps': 22382, 'batch_loss/train': 0.7188315847888589} 12/25/2021 22:06:14 - INFO - codeparrot_training - Step 22383: {'lr': 0.00029242508385314293, 'samples': 11460608, 'steps': 22383, 'batch_loss/train': 1.0400725631043315} 12/25/2021 22:06:25 - INFO - codeparrot_training - Step 22384: {'lr': 0.00029240954926763874, 'samples': 11461120, 'steps': 22384, 'batch_loss/train': 0.8085907138884068} 12/25/2021 22:06:39 - INFO - codeparrot_training - Step 22385: {'lr': 0.0002923940145135303, 'samples': 11461632, 'steps': 22385, 'batch_loss/train': 2.7624530205503106} 12/25/2021 22:06:49 - INFO - codeparrot_training - Step 22386: {'lr': 0.00029237847959087926, 'samples': 11462144, 'steps': 22386, 'batch_loss/train': 0.7377194934524596} 12/25/2021 22:07:00 - INFO - codeparrot_training - Step 22387: {'lr': 0.00029236294449974735, 'samples': 11462656, 'steps': 22387, 'batch_loss/train': 0.7280341479927301} 12/25/2021 22:07:11 - INFO - codeparrot_training - Step 22388: {'lr': 0.00029234740924019635, 'samples': 11463168, 'steps': 22388, 'batch_loss/train': 
0.7655809228308499} 12/25/2021 22:07:23 - INFO - codeparrot_training - Step 22389: {'lr': 0.00029233187381228816, 'samples': 11463680, 'steps': 22389, 'batch_loss/train': 1.0217845682054758} 12/25/2021 22:07:34 - INFO - codeparrot_training - Step 22390: {'lr': 0.0002923163382160843, 'samples': 11464192, 'steps': 22390, 'batch_loss/train': 0.8486738409847021} 12/25/2021 22:07:44 - INFO - codeparrot_training - Step 22391: {'lr': 0.00029230080245164675, 'samples': 11464704, 'steps': 22391, 'batch_loss/train': 0.8618145606014878} 12/25/2021 22:07:56 - INFO - codeparrot_training - Step 22392: {'lr': 0.0002922852665190371, 'samples': 11465216, 'steps': 22392, 'batch_loss/train': 0.7458419697359204} 12/25/2021 22:08:07 - INFO - codeparrot_training - Step 22393: {'lr': 0.00029226973041831725, 'samples': 11465728, 'steps': 22393, 'batch_loss/train': 0.8076468482613564} 12/25/2021 22:08:17 - INFO - codeparrot_training - Step 22394: {'lr': 0.00029225419414954885, 'samples': 11466240, 'steps': 22394, 'batch_loss/train': 0.7597702266648412} 12/25/2021 22:08:28 - INFO - codeparrot_training - Step 22395: {'lr': 0.0002922386577127938, 'samples': 11466752, 'steps': 22395, 'batch_loss/train': 1.5598547542467713} 12/25/2021 22:08:42 - INFO - codeparrot_training - Step 22396: {'lr': 0.00029222312110811376, 'samples': 11467264, 'steps': 22396, 'batch_loss/train': 0.8875997122377157} 12/25/2021 22:08:52 - INFO - codeparrot_training - Step 22397: {'lr': 0.0002922075843355705, 'samples': 11467776, 'steps': 22397, 'batch_loss/train': 0.753533803857863} 12/25/2021 22:09:03 - INFO - codeparrot_training - Step 22398: {'lr': 0.0002921920473952259, 'samples': 11468288, 'steps': 22398, 'batch_loss/train': 0.7378589734435081} 12/25/2021 22:09:15 - INFO - codeparrot_training - Step 22399: {'lr': 0.0002921765102871416, 'samples': 11468800, 'steps': 22399, 'batch_loss/train': 0.9023699825629592} 12/25/2021 22:09:26 - INFO - codeparrot_training - Step 22400: {'lr': 0.00029216097301137944, 'samples': 
11469312, 'steps': 22400, 'batch_loss/train': 0.7349898349493742} 12/25/2021 22:09:36 - INFO - codeparrot_training - Step 22401: {'lr': 0.00029214543556800116, 'samples': 11469824, 'steps': 22401, 'batch_loss/train': 0.8179921817500144} 12/25/2021 22:09:49 - INFO - codeparrot_training - Step 22402: {'lr': 0.0002921298979570686, 'samples': 11470336, 'steps': 22402, 'batch_loss/train': 0.7478098692372441} 12/25/2021 22:10:00 - INFO - codeparrot_training - Step 22403: {'lr': 0.00029211436017864337, 'samples': 11470848, 'steps': 22403, 'batch_loss/train': 0.8492470039054751} 12/25/2021 22:10:10 - INFO - codeparrot_training - Step 22404: {'lr': 0.0002920988222327874, 'samples': 11471360, 'steps': 22404, 'batch_loss/train': 0.7553394129499793} 12/25/2021 22:10:25 - INFO - codeparrot_training - Step 22405: {'lr': 0.00029208328411956235, 'samples': 11471872, 'steps': 22405, 'batch_loss/train': 0.7248665071092546} 12/25/2021 22:10:36 - INFO - codeparrot_training - Step 22406: {'lr': 0.0002920677458390301, 'samples': 11472384, 'steps': 22406, 'batch_loss/train': 0.8152395710349083} 12/25/2021 22:10:46 - INFO - codeparrot_training - Step 22407: {'lr': 0.0002920522073912524, 'samples': 11472896, 'steps': 22407, 'batch_loss/train': 0.7521290024742484} 12/25/2021 22:10:57 - INFO - codeparrot_training - Step 22408: {'lr': 0.000292036668776291, 'samples': 11473408, 'steps': 22408, 'batch_loss/train': 0.754476991656702} 12/25/2021 22:11:09 - INFO - codeparrot_training - Step 22409: {'lr': 0.0002920211299942076, 'samples': 11473920, 'steps': 22409, 'batch_loss/train': 0.7535992003977299} 12/25/2021 22:11:19 - INFO - codeparrot_training - Step 22410: {'lr': 0.00029200559104506415, 'samples': 11474432, 'steps': 22410, 'batch_loss/train': 0.7818746960256249} 12/25/2021 22:11:30 - INFO - codeparrot_training - Step 22411: {'lr': 0.0002919900519289223, 'samples': 11474944, 'steps': 22411, 'batch_loss/train': 0.8379852699581534} 12/25/2021 22:11:42 - INFO - codeparrot_training - Step 
22412: {'lr': 0.0002919745126458439, 'samples': 11475456, 'steps': 22412, 'batch_loss/train': 0.7555496275890619} 12/25/2021 22:11:53 - INFO - codeparrot_training - Step 22413: {'lr': 0.0002919589731958906, 'samples': 11475968, 'steps': 22413, 'batch_loss/train': 0.8162155905738473} 12/25/2021 22:12:03 - INFO - codeparrot_training - Step 22414: {'lr': 0.0002919434335791243, 'samples': 11476480, 'steps': 22414, 'batch_loss/train': 0.7292387019842863} 12/25/2021 22:12:18 - INFO - codeparrot_training - Step 22415: {'lr': 0.0002919278937956068, 'samples': 11476992, 'steps': 22415, 'batch_loss/train': 0.7449687686748803} 12/25/2021 22:12:28 - INFO - codeparrot_training - Step 22416: {'lr': 0.0002919123538453997, 'samples': 11477504, 'steps': 22416, 'batch_loss/train': 0.7272937535308301} 12/25/2021 22:12:39 - INFO - codeparrot_training - Step 22417: {'lr': 0.00029189681372856504, 'samples': 11478016, 'steps': 22417, 'batch_loss/train': 0.6536435708403587} 12/25/2021 22:12:49 - INFO - codeparrot_training - Step 22418: {'lr': 0.0002918812734451644, 'samples': 11478528, 'steps': 22418, 'batch_loss/train': 0.8274161610752344} 12/25/2021 22:13:02 - INFO - codeparrot_training - Step 22419: {'lr': 0.00029186573299525963, 'samples': 11479040, 'steps': 22419, 'batch_loss/train': 0.7895049427170306} 12/25/2021 22:13:12 - INFO - codeparrot_training - Step 22420: {'lr': 0.0002918501923789125, 'samples': 11479552, 'steps': 22420, 'batch_loss/train': 0.8775885002687573} 12/25/2021 22:13:23 - INFO - codeparrot_training - Step 22421: {'lr': 0.0002918346515961848, 'samples': 11480064, 'steps': 22421, 'batch_loss/train': 0.8051958568394184} 12/25/2021 22:13:35 - INFO - codeparrot_training - Step 22422: {'lr': 0.0002918191106471384, 'samples': 11480576, 'steps': 22422, 'batch_loss/train': 0.8051766261924058} 12/25/2021 22:13:46 - INFO - codeparrot_training - Step 22423: {'lr': 0.000291803569531835, 'samples': 11481088, 'steps': 22423, 'batch_loss/train': 0.8006840106099844} 12/25/2021 
22:13:56 - INFO - codeparrot_training - Step 22424: {'lr': 0.00029178802825033633, 'samples': 11481600, 'steps': 22424, 'batch_loss/train': 0.6724308920092881} 12/25/2021 22:14:10 - INFO - codeparrot_training - Step 22425: {'lr': 0.00029177248680270425, 'samples': 11482112, 'steps': 22425, 'batch_loss/train': 0.7974609322845936} 12/25/2021 22:14:21 - INFO - codeparrot_training - Step 22426: {'lr': 0.00029175694518900053, 'samples': 11482624, 'steps': 22426, 'batch_loss/train': 0.8121379474177957} 12/25/2021 22:14:31 - INFO - codeparrot_training - Step 22427: {'lr': 0.000291741403409287, 'samples': 11483136, 'steps': 22427, 'batch_loss/train': 0.7320245306473225} 12/25/2021 22:14:42 - INFO - codeparrot_training - Step 22428: {'lr': 0.00029172586146362537, 'samples': 11483648, 'steps': 22428, 'batch_loss/train': 0.6990344589576125} 12/25/2021 22:14:54 - INFO - codeparrot_training - Step 22429: {'lr': 0.0002917103193520775, 'samples': 11484160, 'steps': 22429, 'batch_loss/train': 0.7071261606179178} 12/25/2021 22:15:04 - INFO - codeparrot_training - Step 22430: {'lr': 0.00029169477707470515, 'samples': 11484672, 'steps': 22430, 'batch_loss/train': 0.7372042122296989} 12/25/2021 22:15:15 - INFO - codeparrot_training - Step 22431: {'lr': 0.00029167923463157007, 'samples': 11485184, 'steps': 22431, 'batch_loss/train': 0.7329146629199386} 12/25/2021 22:15:29 - INFO - codeparrot_training - Step 22432: {'lr': 0.0002916636920227341, 'samples': 11485696, 'steps': 22432, 'batch_loss/train': 0.8535065511241555} 12/25/2021 22:15:39 - INFO - codeparrot_training - Step 22433: {'lr': 0.0002916481492482591, 'samples': 11486208, 'steps': 22433, 'batch_loss/train': 0.8225420953240246} 12/25/2021 22:15:50 - INFO - codeparrot_training - Step 22434: {'lr': 0.0002916326063082067, 'samples': 11486720, 'steps': 22434, 'batch_loss/train': 0.8913018635939807} 12/25/2021 22:16:02 - INFO - codeparrot_training - Step 22435: {'lr': 0.00029161706320263874, 'samples': 11487232, 'steps': 22435, 
'batch_loss/train': 1.1391207806300372} 12/25/2021 22:16:13 - INFO - codeparrot_training - Step 22436: {'lr': 0.00029160151993161713, 'samples': 11487744, 'steps': 22436, 'batch_loss/train': 0.7639469262212515} 12/25/2021 22:16:23 - INFO - codeparrot_training - Step 22437: {'lr': 0.0002915859764952036, 'samples': 11488256, 'steps': 22437, 'batch_loss/train': 0.4219433782855049} 12/25/2021 22:16:34 - INFO - codeparrot_training - Step 22438: {'lr': 0.0002915704328934599, 'samples': 11488768, 'steps': 22438, 'batch_loss/train': 0.7660202314145863} 12/25/2021 22:16:46 - INFO - codeparrot_training - Step 22439: {'lr': 0.00029155488912644784, 'samples': 11489280, 'steps': 22439, 'batch_loss/train': 0.841741562820971} 12/25/2021 22:16:57 - INFO - codeparrot_training - Step 22440: {'lr': 0.00029153934519422927, 'samples': 11489792, 'steps': 22440, 'batch_loss/train': 0.8249387219548225} 12/25/2021 22:17:07 - INFO - codeparrot_training - Step 22441: {'lr': 0.00029152380109686595, 'samples': 11490304, 'steps': 22441, 'batch_loss/train': 0.7381645776331425} 12/25/2021 22:17:19 - INFO - codeparrot_training - Step 22442: {'lr': 0.0002915082568344196, 'samples': 11490816, 'steps': 22442, 'batch_loss/train': 0.8217377196997404} 12/25/2021 22:17:30 - INFO - codeparrot_training - Step 22443: {'lr': 0.0002914927124069522, 'samples': 11491328, 'steps': 22443, 'batch_loss/train': 0.706901156809181} 12/25/2021 22:17:41 - INFO - codeparrot_training - Step 22444: {'lr': 0.00029147716781452535, 'samples': 11491840, 'steps': 22444, 'batch_loss/train': 0.7377086789347231} 12/25/2021 22:17:54 - INFO - codeparrot_training - Step 22445: {'lr': 0.0002914616230572009, 'samples': 11492352, 'steps': 22445, 'batch_loss/train': 0.7012139782309532} 12/25/2021 22:18:05 - INFO - codeparrot_training - Step 22446: {'lr': 0.0002914460781350408, 'samples': 11492864, 'steps': 22446, 'batch_loss/train': 0.7716860177461058} 12/25/2021 22:18:16 - INFO - codeparrot_training - Step 22447: {'lr': 
0.00029143053304810676, 'samples': 11493376, 'steps': 22447, 'batch_loss/train': 0.8296086066402495} 12/25/2021 22:18:28 - INFO - codeparrot_training - Step 22448: {'lr': 0.0002914149877964605, 'samples': 11493888, 'steps': 22448, 'batch_loss/train': 0.8469988717697561} 12/25/2021 22:18:38 - INFO - codeparrot_training - Step 22449: {'lr': 0.0002913994423801639, 'samples': 11494400, 'steps': 22449, 'batch_loss/train': 0.8320638602599502} 12/25/2021 22:18:49 - INFO - codeparrot_training - Step 22450: {'lr': 0.0002913838967992788, 'samples': 11494912, 'steps': 22450, 'batch_loss/train': 0.8406568677164614} 12/25/2021 22:19:00 - INFO - codeparrot_training - Step 22451: {'lr': 0.0002913683510538668, 'samples': 11495424, 'steps': 22451, 'batch_loss/train': 0.795872185844928} 12/25/2021 22:19:11 - INFO - codeparrot_training - Step 22452: {'lr': 0.00029135280514399, 'samples': 11495936, 'steps': 22452, 'batch_loss/train': 0.8623833348974586} 12/25/2021 22:19:22 - INFO - codeparrot_training - Step 22453: {'lr': 0.00029133725906971004, 'samples': 11496448, 'steps': 22453, 'batch_loss/train': 0.7753232140094042} 12/25/2021 22:19:33 - INFO - codeparrot_training - Step 22454: {'lr': 0.00029132171283108874, 'samples': 11496960, 'steps': 22454, 'batch_loss/train': 0.7256174422800541} 12/25/2021 22:19:47 - INFO - codeparrot_training - Step 22455: {'lr': 0.0002913061664281879, 'samples': 11497472, 'steps': 22455, 'batch_loss/train': 0.8787005068734288} 12/25/2021 22:19:57 - INFO - codeparrot_training - Step 22456: {'lr': 0.0002912906198610693, 'samples': 11497984, 'steps': 22456, 'batch_loss/train': 0.8387977730017155} 12/25/2021 22:20:08 - INFO - codeparrot_training - Step 22457: {'lr': 0.00029127507312979483, 'samples': 11498496, 'steps': 22457, 'batch_loss/train': 0.8154427879489958} 12/25/2021 22:20:20 - INFO - codeparrot_training - Step 22458: {'lr': 0.0002912595262344262, 'samples': 11499008, 'steps': 22458, 'batch_loss/train': 0.6582831258419901} 12/25/2021 22:20:30 - INFO - 
codeparrot_training - Step 22459: {'lr': 0.00029124397917502535, 'samples': 11499520, 'steps': 22459, 'batch_loss/train': 0.8005900057032704} 12/25/2021 22:20:41 - INFO - codeparrot_training - Step 22460: {'lr': 0.000291228431951654, 'samples': 11500032, 'steps': 22460, 'batch_loss/train': 0.7672402837779373} 12/25/2021 22:20:52 - INFO - codeparrot_training - Step 22461: {'lr': 0.00029121288456437395, 'samples': 11500544, 'steps': 22461, 'batch_loss/train': 0.8112371381139383} 12/25/2021 22:21:12 - INFO - codeparrot_training - Step 22462: {'lr': 0.00029119733701324705, 'samples': 11501056, 'steps': 22462, 'batch_loss/train': 0.7010616939514875} 12/25/2021 22:21:23 - INFO - codeparrot_training - Step 22463: {'lr': 0.00029118178929833503, 'samples': 11501568, 'steps': 22463, 'batch_loss/train': 0.7791273416951299} 12/25/2021 22:21:33 - INFO - codeparrot_training - Step 22464: {'lr': 0.0002911662414196998, 'samples': 11502080, 'steps': 22464, 'batch_loss/train': 0.8231499581597745} 12/25/2021 22:21:45 - INFO - codeparrot_training - Step 22465: {'lr': 0.0002911506933774032, 'samples': 11502592, 'steps': 22465, 'batch_loss/train': 0.75712223444134} 12/25/2021 22:21:56 - INFO - codeparrot_training - Step 22466: {'lr': 0.0002911351451715069, 'samples': 11503104, 'steps': 22466, 'batch_loss/train': 0.843675862532109} 12/25/2021 22:22:07 - INFO - codeparrot_training - Step 22467: {'lr': 0.0002911195968020728, 'samples': 11503616, 'steps': 22467, 'batch_loss/train': 0.9464013818651438} 12/25/2021 22:22:19 - INFO - codeparrot_training - Step 22468: {'lr': 0.00029110404826916274, 'samples': 11504128, 'steps': 22468, 'batch_loss/train': 0.8692403705790639} 12/25/2021 22:22:29 - INFO - codeparrot_training - Step 22469: {'lr': 0.00029108849957283856, 'samples': 11504640, 'steps': 22469, 'batch_loss/train': 0.8691574614495039} 12/25/2021 22:22:40 - INFO - codeparrot_training - Step 22470: {'lr': 0.00029107295071316197, 'samples': 11505152, 'steps': 22470, 'batch_loss/train': 
0.9426748035475612} 12/25/2021 22:22:50 - INFO - codeparrot_training - Step 22471: {'lr': 0.0002910574016901948, 'samples': 11505664, 'steps': 22471, 'batch_loss/train': 0.7091703237965703} 12/25/2021 22:23:04 - INFO - codeparrot_training - Step 22472: {'lr': 0.00029104185250399895, 'samples': 11506176, 'steps': 22472, 'batch_loss/train': 0.8468801770359278} 12/25/2021 22:23:15 - INFO - codeparrot_training - Step 22473: {'lr': 0.00029102630315463616, 'samples': 11506688, 'steps': 22473, 'batch_loss/train': 0.727248040959239} 12/25/2021 22:23:26 - INFO - codeparrot_training - Step 22474: {'lr': 0.0002910107536421683, 'samples': 11507200, 'steps': 22474, 'batch_loss/train': 0.8057080656290054} 12/25/2021 22:23:38 - INFO - codeparrot_training - Step 22475: {'lr': 0.00029099520396665715, 'samples': 11507712, 'steps': 22475, 'batch_loss/train': 0.7223521661944687} 12/25/2021 22:23:48 - INFO - codeparrot_training - Step 22476: {'lr': 0.00029097965412816463, 'samples': 11508224, 'steps': 22476, 'batch_loss/train': 0.7890830771066248} 12/25/2021 22:23:59 - INFO - codeparrot_training - Step 22477: {'lr': 0.00029096410412675236, 'samples': 11508736, 'steps': 22477, 'batch_loss/train': 0.7205200279131532} 12/25/2021 22:24:11 - INFO - codeparrot_training - Step 22478: {'lr': 0.0002909485539624823, 'samples': 11509248, 'steps': 22478, 'batch_loss/train': 0.6907717802096158} 12/25/2021 22:24:22 - INFO - codeparrot_training - Step 22479: {'lr': 0.0002909330036354163, 'samples': 11509760, 'steps': 22479, 'batch_loss/train': 0.7262306273914874} 12/25/2021 22:24:32 - INFO - codeparrot_training - Step 22480: {'lr': 0.0002909174531456161, 'samples': 11510272, 'steps': 22480, 'batch_loss/train': 0.9038449237123132} 12/25/2021 22:24:47 - INFO - codeparrot_training - Step 22481: {'lr': 0.00029090190249314357, 'samples': 11510784, 'steps': 22481, 'batch_loss/train': 0.7810009885579348} 12/25/2021 22:24:57 - INFO - codeparrot_training - Step 22482: {'lr': 0.0002908863516780605, 'samples': 
11511296, 'steps': 22482, 'batch_loss/train': 0.7447903365828097} 12/25/2021 22:25:08 - INFO - codeparrot_training - Step 22483: {'lr': 0.0002908708007004288, 'samples': 11511808, 'steps': 22483, 'batch_loss/train': 0.8314897026866674} 12/25/2021 22:25:19 - INFO - codeparrot_training - Step 22484: {'lr': 0.0002908552495603101, 'samples': 11512320, 'steps': 22484, 'batch_loss/train': 1.369688082486391} 12/25/2021 22:25:31 - INFO - codeparrot_training - Step 22485: {'lr': 0.00029083969825776653, 'samples': 11512832, 'steps': 22485, 'batch_loss/train': 0.7927352422848344} 12/25/2021 22:25:41 - INFO - codeparrot_training - Step 22486: {'lr': 0.0002908241467928596, 'samples': 11513344, 'steps': 22486, 'batch_loss/train': 0.8670884612947702} 12/25/2021 22:25:52 - INFO - codeparrot_training - Step 22487: {'lr': 0.00029080859516565134, 'samples': 11513856, 'steps': 22487, 'batch_loss/train': 0.8312327787280083} 12/25/2021 22:26:04 - INFO - codeparrot_training - Step 22488: {'lr': 0.0002907930433762035, 'samples': 11514368, 'steps': 22488, 'batch_loss/train': 0.7630816674791276} 12/25/2021 22:26:15 - INFO - codeparrot_training - Step 22489: {'lr': 0.00029077749142457794, 'samples': 11514880, 'steps': 22489, 'batch_loss/train': 0.7622831119224429} 12/25/2021 22:26:25 - INFO - codeparrot_training - Step 22490: {'lr': 0.00029076193931083636, 'samples': 11515392, 'steps': 22490, 'batch_loss/train': 0.755025374237448} 12/25/2021 22:26:36 - INFO - codeparrot_training - Step 22491: {'lr': 0.0002907463870350409, 'samples': 11515904, 'steps': 22491, 'batch_loss/train': 0.6529655153863132} 12/25/2021 22:26:50 - INFO - codeparrot_training - Step 22492: {'lr': 0.000290730834597253, 'samples': 11516416, 'steps': 22492, 'batch_loss/train': 0.7724090414121747} 12/25/2021 22:27:01 - INFO - codeparrot_training - Step 22493: {'lr': 0.0002907152819975348, 'samples': 11516928, 'steps': 22493, 'batch_loss/train': 0.7867165864445269} 12/25/2021 22:27:11 - INFO - codeparrot_training - Step 22494: 
{'lr': 0.00029069972923594794, 'samples': 11517440, 'steps': 22494, 'batch_loss/train': 0.7399246459826827} 12/25/2021 22:27:24 - INFO - codeparrot_training - Step 22495: {'lr': 0.00029068417631255433, 'samples': 11517952, 'steps': 22495, 'batch_loss/train': 0.7437068829312921} 12/25/2021 22:27:35 - INFO - codeparrot_training - Step 22496: {'lr': 0.00029066862322741586, 'samples': 11518464, 'steps': 22496, 'batch_loss/train': 0.7334594102576375} 12/25/2021 22:27:45 - INFO - codeparrot_training - Step 22497: {'lr': 0.0002906530699805943, 'samples': 11518976, 'steps': 22497, 'batch_loss/train': 0.650805938988924} 12/25/2021 22:27:56 - INFO - codeparrot_training - Step 22498: {'lr': 0.00029063751657215143, 'samples': 11519488, 'steps': 22498, 'batch_loss/train': 0.7684131776914} 12/25/2021 22:28:08 - INFO - codeparrot_training - Step 22499: {'lr': 0.0002906219630021492, 'samples': 11520000, 'steps': 22499, 'batch_loss/train': 0.8037868030369282} 12/25/2021 22:28:08 - INFO - codeparrot_training - Evaluating and saving model checkpoint 12/25/2021 22:31:32 - INFO - codeparrot_training - Step 22500: {'loss/eval': 0.7933947443962097, 'perplexity': 2.2108891010284424} 12/25/2021 22:31:50 - WARNING - huggingface_hub.repository - Several commits (7) will be pushed upstream. 
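The eval record at step 22500 reports both `loss/eval` and `perplexity`, and the two are consistent with perplexity being the exponential of the evaluation cross-entropy loss. A quick sanity check, using the values copied verbatim from the record above (this is an external check, not part of the training script):

```python
import math

# Values from the step-22500 eval record in the log above.
eval_loss = 0.7933947443962097
reported_perplexity = 2.2108891010284424

# Perplexity is conventionally exp(cross-entropy loss);
# the tiny residual comes from float32 rounding in the logged value.
computed = math.exp(eval_loss)
print(computed)  # ≈ 2.2108891
assert abs(computed - reported_perplexity) < 1e-4
```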
12/25/2021 22:32:03 - INFO - codeparrot_training - Step 22500: {'lr': 0.0002906064092706493, 'samples': 11520512, 'steps': 22500, 'batch_loss/train': 0.731876933015883} 12/25/2021 22:32:14 - INFO - codeparrot_training - Step 22501: {'lr': 0.00029059085537771367, 'samples': 11521024, 'steps': 22501, 'batch_loss/train': 0.6752322958782315} 12/25/2021 22:32:28 - INFO - codeparrot_training - Step 22502: {'lr': 0.00029057530132340416, 'samples': 11521536, 'steps': 22502, 'batch_loss/train': 0.7905173257458955} 12/25/2021 22:32:38 - INFO - codeparrot_training - Step 22503: {'lr': 0.0002905597471077825, 'samples': 11522048, 'steps': 22503, 'batch_loss/train': 0.8250600909814239} 12/25/2021 22:32:49 - INFO - codeparrot_training - Step 22504: {'lr': 0.0002905441927309107, 'samples': 11522560, 'steps': 22504, 'batch_loss/train': 0.8056585932499729} 12/25/2021 22:33:01 - INFO - codeparrot_training - Step 22505: {'lr': 0.0002905286381928505, 'samples': 11523072, 'steps': 22505, 'batch_loss/train': 0.7987608900293708} 12/25/2021 22:33:12 - INFO - codeparrot_training - Step 22506: {'lr': 0.0002905130834936637, 'samples': 11523584, 'steps': 22506, 'batch_loss/train': 0.6468577198684216} 12/25/2021 22:33:23 - INFO - codeparrot_training - Step 22507: {'lr': 0.00029049752863341213, 'samples': 11524096, 'steps': 22507, 'batch_loss/train': 0.9125505015254021} 12/25/2021 22:33:36 - INFO - codeparrot_training - Step 22508: {'lr': 0.0002904819736121577, 'samples': 11524608, 'steps': 22508, 'batch_loss/train': 0.9048069077543914} 12/25/2021 22:33:47 - INFO - codeparrot_training - Step 22509: {'lr': 0.0002904664184299623, 'samples': 11525120, 'steps': 22509, 'batch_loss/train': 0.7067717665340751} 12/25/2021 22:33:58 - INFO - codeparrot_training - Step 22510: {'lr': 0.0002904508630868876, 'samples': 11525632, 'steps': 22510, 'batch_loss/train': 0.7382813454605639} 12/25/2021 22:34:08 - INFO - codeparrot_training - Step 22511: {'lr': 0.00029043530758299555, 'samples': 11526144, 'steps': 
22511, 'batch_loss/train': 0.8087953021749854} 12/25/2021 22:34:21 - INFO - codeparrot_training - Step 22512: {'lr': 0.0002904197519183481, 'samples': 11526656, 'steps': 22512, 'batch_loss/train': 0.7335150390863419} 12/25/2021 22:34:31 - INFO - codeparrot_training - Step 22513: {'lr': 0.0002904041960930068, 'samples': 11527168, 'steps': 22513, 'batch_loss/train': 0.731562715023756} 12/25/2021 22:34:42 - INFO - codeparrot_training - Step 22514: {'lr': 0.0002903886401070338, 'samples': 11527680, 'steps': 22514, 'batch_loss/train': 0.6349573009647429} 12/25/2021 22:34:54 - INFO - codeparrot_training - Step 22515: {'lr': 0.0002903730839604908, 'samples': 11528192, 'steps': 22515, 'batch_loss/train': 0.8675979422405362} 12/25/2021 22:35:05 - INFO - codeparrot_training - Step 22516: {'lr': 0.00029035752765343964, 'samples': 11528704, 'steps': 22516, 'batch_loss/train': 0.8940113410353661} 12/25/2021 22:35:15 - INFO - codeparrot_training - Step 22517: {'lr': 0.0002903419711859422, 'samples': 11529216, 'steps': 22517, 'batch_loss/train': 0.817077605985105} 12/25/2021 22:35:26 - INFO - codeparrot_training - Step 22518: {'lr': 0.00029032641455806034, 'samples': 11529728, 'steps': 22518, 'batch_loss/train': 0.7266962258145213} 12/25/2021 22:35:38 - INFO - codeparrot_training - Step 22519: {'lr': 0.0002903108577698559, 'samples': 11530240, 'steps': 22519, 'batch_loss/train': 0.7875604552682489} 12/25/2021 22:35:49 - INFO - codeparrot_training - Step 22520: {'lr': 0.0002902953008213907, 'samples': 11530752, 'steps': 22520, 'batch_loss/train': 0.7444559149444103} 12/25/2021 22:35:59 - INFO - codeparrot_training - Step 22521: {'lr': 0.0002902797437127265, 'samples': 11531264, 'steps': 22521, 'batch_loss/train': 0.8279942332301289} 12/25/2021 22:36:13 - INFO - codeparrot_training - Step 22522: {'lr': 0.00029026418644392533, 'samples': 11531776, 'steps': 22522, 'batch_loss/train': 0.7580461390316486} 12/25/2021 22:36:24 - INFO - codeparrot_training - Step 22523: {'lr': 
0.00029024862901504887, 'samples': 11532288, 'steps': 22523, 'batch_loss/train': 0.9210230398457497} 12/25/2021 22:36:35 - INFO - codeparrot_training - Step 22524: {'lr': 0.00029023307142615924, 'samples': 11532800, 'steps': 22524, 'batch_loss/train': 0.630040701944381} 12/25/2021 22:36:47 - INFO - codeparrot_training - Step 22525: {'lr': 0.00029021751367731797, 'samples': 11533312, 'steps': 22525, 'batch_loss/train': 0.7014949573203921} 12/25/2021 22:36:58 - INFO - codeparrot_training - Step 22526: {'lr': 0.0002902019557685871, 'samples': 11533824, 'steps': 22526, 'batch_loss/train': 0.8157917251810431} 12/25/2021 22:37:08 - INFO - codeparrot_training - Step 22527: {'lr': 0.0002901863977000284, 'samples': 11534336, 'steps': 22527, 'batch_loss/train': 0.6733855294296518} 12/25/2021 22:37:19 - INFO - codeparrot_training - Step 22528: {'lr': 0.0002901708394717038, 'samples': 11534848, 'steps': 22528, 'batch_loss/train': 0.7687690377933905} 12/25/2021 22:37:31 - INFO - codeparrot_training - Step 22529: {'lr': 0.000290155281083675, 'samples': 11535360, 'steps': 22529, 'batch_loss/train': 0.7772164409980178} 12/25/2021 22:37:42 - INFO - codeparrot_training - Step 22530: {'lr': 0.00029013972253600414, 'samples': 11535872, 'steps': 22530, 'batch_loss/train': 0.6501620800700039} 12/25/2021 22:37:52 - INFO - codeparrot_training - Step 22531: {'lr': 0.00029012416382875276, 'samples': 11536384, 'steps': 22531, 'batch_loss/train': 0.7191083473153412} 12/25/2021 22:38:06 - INFO - codeparrot_training - Step 22532: {'lr': 0.0002901086049619829, 'samples': 11536896, 'steps': 22532, 'batch_loss/train': 0.6919164133723825} 12/25/2021 22:38:17 - INFO - codeparrot_training - Step 22533: {'lr': 0.00029009304593575634, 'samples': 11537408, 'steps': 22533, 'batch_loss/train': 0.8890760280191898} 12/25/2021 22:38:27 - INFO - codeparrot_training - Step 22534: {'lr': 0.000290077486750135, 'samples': 11537920, 'steps': 22534, 'batch_loss/train': 0.7353801401332021} 12/25/2021 22:38:40 - INFO 
- codeparrot_training - Step 22535: {'lr': 0.00029006192740518074, 'samples': 11538432, 'steps': 22535, 'batch_loss/train': 0.8271610587835312} 12/25/2021 22:38:50 - INFO - codeparrot_training - Step 22536: {'lr': 0.0002900463679009553, 'samples': 11538944, 'steps': 22536, 'batch_loss/train': 0.8102096766233444} 12/25/2021 22:39:01 - INFO - codeparrot_training - Step 22537: {'lr': 0.00029003080823752073, 'samples': 11539456, 'steps': 22537, 'batch_loss/train': 0.6797134866937995} 12/25/2021 22:39:12 - INFO - codeparrot_training - Step 22538: {'lr': 0.00029001524841493876, 'samples': 11539968, 'steps': 22538, 'batch_loss/train': 0.79766272357665} 12/25/2021 22:39:24 - INFO - codeparrot_training - Step 22539: {'lr': 0.0002899996884332712, 'samples': 11540480, 'steps': 22539, 'batch_loss/train': 0.8734957799315453} 12/25/2021 22:39:34 - INFO - codeparrot_training - Step 22540: {'lr': 0.00028998412829258003, 'samples': 11540992, 'steps': 22540, 'batch_loss/train': 0.8172866092063487} 12/25/2021 22:39:45 - INFO - codeparrot_training - Step 22541: {'lr': 0.0002899685679929271, 'samples': 11541504, 'steps': 22541, 'batch_loss/train': 0.804289854131639} 12/25/2021 22:39:59 - INFO - codeparrot_training - Step 22542: {'lr': 0.0002899530075343741, 'samples': 11542016, 'steps': 22542, 'batch_loss/train': 0.7013025199994445} 12/25/2021 22:40:10 - INFO - codeparrot_training - Step 22543: {'lr': 0.0002899374469169831, 'samples': 11542528, 'steps': 22543, 'batch_loss/train': 0.7547705117613077} 12/25/2021 22:40:20 - INFO - codeparrot_training - Step 22544: {'lr': 0.00028992188614081595, 'samples': 11543040, 'steps': 22544, 'batch_loss/train': 0.7976263165473938} 12/25/2021 22:40:32 - INFO - codeparrot_training - Step 22545: {'lr': 0.0002899063252059344, 'samples': 11543552, 'steps': 22545, 'batch_loss/train': 0.7333331741392612} 12/25/2021 22:40:43 - INFO - codeparrot_training - Step 22546: {'lr': 0.00028989076411240037, 'samples': 11544064, 'steps': 22546, 'batch_loss/train': 
0.7703526047989726} 12/25/2021 22:40:53 - INFO - codeparrot_training - Step 22547: {'lr': 0.00028987520286027576, 'samples': 11544576, 'steps': 22547, 'batch_loss/train': 0.7828838904388249} 12/25/2021 22:41:07 - INFO - codeparrot_training - Step 22548: {'lr': 0.00028985964144962237, 'samples': 11545088, 'steps': 22548, 'batch_loss/train': 0.7737613562494516} 12/25/2021 22:41:18 - INFO - codeparrot_training - Step 22549: {'lr': 0.00028984407988050205, 'samples': 11545600, 'steps': 22549, 'batch_loss/train': 0.7216802150942385} 12/25/2021 22:41:28 - INFO - codeparrot_training - Step 22550: {'lr': 0.0002898285181529768, 'samples': 11546112, 'steps': 22550, 'batch_loss/train': 0.8561648959293962} 12/25/2021 22:41:39 - INFO - codeparrot_training - Step 22551: {'lr': 0.00028981295626710833, 'samples': 11546624, 'steps': 22551, 'batch_loss/train': 0.7115662582218647} 12/25/2021 22:41:51 - INFO - codeparrot_training - Step 22552: {'lr': 0.00028979739422295864, 'samples': 11547136, 'steps': 22552, 'batch_loss/train': 0.6965090539306402} 12/25/2021 22:42:02 - INFO - codeparrot_training - Step 22553: {'lr': 0.0002897818320205895, 'samples': 11547648, 'steps': 22553, 'batch_loss/train': 0.8146997131407261} 12/25/2021 22:42:12 - INFO - codeparrot_training - Step 22554: {'lr': 0.00028976626966006284, 'samples': 11548160, 'steps': 22554, 'batch_loss/train': 0.59282660856843} 12/25/2021 22:42:24 - INFO - codeparrot_training - Step 22555: {'lr': 0.0002897507071414405, 'samples': 11548672, 'steps': 22555, 'batch_loss/train': 0.8167472770437598} 12/25/2021 22:42:35 - INFO - codeparrot_training - Step 22556: {'lr': 0.00028973514446478427, 'samples': 11549184, 'steps': 22556, 'batch_loss/train': 0.7541185421869159} 12/25/2021 22:42:46 - INFO - codeparrot_training - Step 22557: {'lr': 0.0002897195816301563, 'samples': 11549696, 'steps': 22557, 'batch_loss/train': 0.8380925673991442} 12/25/2021 22:42:58 - INFO - codeparrot_training - Step 22558: {'lr': 0.00028970401863761804, 'samples': 
11550208, 'steps': 22558, 'batch_loss/train': 0.7546316096559167} 12/25/2021 22:43:08 - INFO - codeparrot_training - Step 22559: {'lr': 0.0002896884554872318, 'samples': 11550720, 'steps': 22559, 'batch_loss/train': 0.7588437125086784} 12/25/2021 22:43:19 - INFO - codeparrot_training - Step 22560: {'lr': 0.00028967289217905913, 'samples': 11551232, 'steps': 22560, 'batch_loss/train': 0.8776461132802069} 12/25/2021 22:43:30 - INFO - codeparrot_training - Step 22561: {'lr': 0.000289657328713162, 'samples': 11551744, 'steps': 22561, 'batch_loss/train': 0.7561854107771069} 12/25/2021 22:43:44 - INFO - codeparrot_training - Step 22562: {'lr': 0.0002896417650896024, 'samples': 11552256, 'steps': 22562, 'batch_loss/train': 0.7899459849577397} 12/25/2021 22:43:54 - INFO - codeparrot_training - Step 22563: {'lr': 0.0002896262013084421, 'samples': 11552768, 'steps': 22563, 'batch_loss/train': 0.802915905136615} 12/25/2021 22:44:05 - INFO - codeparrot_training - Step 22564: {'lr': 0.0002896106373697429, 'samples': 11553280, 'steps': 22564, 'batch_loss/train': 0.7849812228232622} 12/25/2021 22:44:17 - INFO - codeparrot_training - Step 22565: {'lr': 0.0002895950732735668, 'samples': 11553792, 'steps': 22565, 'batch_loss/train': 0.7608536332845688} 12/25/2021 22:44:28 - INFO - codeparrot_training - Step 22566: {'lr': 0.0002895795090199756, 'samples': 11554304, 'steps': 22566, 'batch_loss/train': 0.760546701727435} 12/25/2021 22:44:38 - INFO - codeparrot_training - Step 22567: {'lr': 0.00028956394460903135, 'samples': 11554816, 'steps': 22567, 'batch_loss/train': 0.678451188839972} 12/25/2021 22:44:50 - INFO - codeparrot_training - Step 22568: {'lr': 0.0002895483800407957, 'samples': 11555328, 'steps': 22568, 'batch_loss/train': 0.5676359019125812} 12/25/2021 22:45:01 - INFO - codeparrot_training - Step 22569: {'lr': 0.0002895328153153306, 'samples': 11555840, 'steps': 22569, 'batch_loss/train': 0.7782120369374752} 12/25/2021 22:45:12 - INFO - codeparrot_training - Step 22570: 
{'lr': 0.00028951725043269806, 'samples': 11556352, 'steps': 22570, 'batch_loss/train': 0.8312361221760511} 12/25/2021 22:45:22 - INFO - codeparrot_training - Step 22571: {'lr': 0.00028950168539295973, 'samples': 11556864, 'steps': 22571, 'batch_loss/train': 0.7735875102225691} 12/25/2021 22:45:36 - INFO - codeparrot_training - Step 22572: {'lr': 0.0002894861201961777, 'samples': 11557376, 'steps': 22572, 'batch_loss/train': 0.8554398091509938} 12/25/2021 22:45:47 - INFO - codeparrot_training - Step 22573: {'lr': 0.00028947055484241377, 'samples': 11557888, 'steps': 22573, 'batch_loss/train': 0.825327561236918} 12/25/2021 22:45:57 - INFO - codeparrot_training - Step 22574: {'lr': 0.00028945498933172974, 'samples': 11558400, 'steps': 22574, 'batch_loss/train': 0.7273888634517789} 12/25/2021 22:46:10 - INFO - codeparrot_training - Step 22575: {'lr': 0.00028943942366418756, 'samples': 11558912, 'steps': 22575, 'batch_loss/train': 0.780704993288964} 12/25/2021 22:46:20 - INFO - codeparrot_training - Step 22576: {'lr': 0.00028942385783984926, 'samples': 11559424, 'steps': 22576, 'batch_loss/train': 0.7240602523088455} 12/25/2021 22:46:31 - INFO - codeparrot_training - Step 22577: {'lr': 0.00028940829185877646, 'samples': 11559936, 'steps': 22577, 'batch_loss/train': 0.7148347630864009} 12/25/2021 22:46:45 - INFO - codeparrot_training - Step 22578: {'lr': 0.00028939272572103125, 'samples': 11560448, 'steps': 22578, 'batch_loss/train': 0.9579357048496604} 12/25/2021 22:46:55 - INFO - codeparrot_training - Step 22579: {'lr': 0.00028937715942667537, 'samples': 11560960, 'steps': 22579, 'batch_loss/train': 0.9891076618805528} 12/25/2021 22:47:06 - INFO - codeparrot_training - Step 22580: {'lr': 0.00028936159297577083, 'samples': 11561472, 'steps': 22580, 'batch_loss/train': 0.7872866913676262} 12/25/2021 22:47:17 - INFO - codeparrot_training - Step 22581: {'lr': 0.0002893460263683794, 'samples': 11561984, 'steps': 22581, 'batch_loss/train': 0.7913382423575968} 12/25/2021 
22:47:29 - INFO - codeparrot_training - Step 22582: {'lr': 0.00028933045960456304, 'samples': 11562496, 'steps': 22582, 'batch_loss/train': 0.783328770659864} 12/25/2021 22:47:39 - INFO - codeparrot_training - Step 22583: {'lr': 0.0002893148926843837, 'samples': 11563008, 'steps': 22583, 'batch_loss/train': 0.8111116541549563} 12/25/2021 22:47:50 - INFO - codeparrot_training - Step 22584: {'lr': 0.0002892993256079031, 'samples': 11563520, 'steps': 22584, 'batch_loss/train': 0.7271372756222263} 12/25/2021 22:48:02 - INFO - codeparrot_training - Step 22585: {'lr': 0.0002892837583751833, 'samples': 11564032, 'steps': 22585, 'batch_loss/train': 0.6915243421681225} 12/25/2021 22:48:13 - INFO - codeparrot_training - Step 22586: {'lr': 0.000289268190986286, 'samples': 11564544, 'steps': 22586, 'batch_loss/train': 0.789411747828126} 12/25/2021 22:48:24 - INFO - codeparrot_training - Step 22587: {'lr': 0.00028925262344127326, 'samples': 11565056, 'steps': 22587, 'batch_loss/train': 0.7941379160620272} 12/25/2021 22:48:38 - INFO - codeparrot_training - Step 22588: {'lr': 0.0002892370557402069, 'samples': 11565568, 'steps': 22588, 'batch_loss/train': 0.8013846278190613} 12/25/2021 22:48:48 - INFO - codeparrot_training - Step 22589: {'lr': 0.0002892214878831489, 'samples': 11566080, 'steps': 22589, 'batch_loss/train': 0.7648911383002996} 12/25/2021 22:48:59 - INFO - codeparrot_training - Step 22590: {'lr': 0.000289205919870161, 'samples': 11566592, 'steps': 22590, 'batch_loss/train': 0.9587020091712475} 12/25/2021 22:49:10 - INFO - codeparrot_training - Step 22591: {'lr': 0.00028919035170130516, 'samples': 11567104, 'steps': 22591, 'batch_loss/train': 1.075473445467651} 12/25/2021 22:49:22 - INFO - codeparrot_training - Step 22592: {'lr': 0.0002891747833766433, 'samples': 11567616, 'steps': 22592, 'batch_loss/train': 0.791536582633853} 12/25/2021 22:49:32 - INFO - codeparrot_training - Step 22593: {'lr': 0.00028915921489623724, 'samples': 11568128, 'steps': 22593, 
'batch_loss/train': 0.7404154553078115} 12/25/2021 22:49:43 - INFO - codeparrot_training - Step 22594: {'lr': 0.00028914364626014897, 'samples': 11568640, 'steps': 22594, 'batch_loss/train': 0.8398541938513517} 12/25/2021 22:49:55 - INFO - codeparrot_training - Step 22595: {'lr': 0.00028912807746844034, 'samples': 11569152, 'steps': 22595, 'batch_loss/train': 0.82836246304214} 12/25/2021 22:50:06 - INFO - codeparrot_training - Step 22596: {'lr': 0.0002891125085211733, 'samples': 11569664, 'steps': 22596, 'batch_loss/train': 0.9407192161306739} 12/25/2021 22:50:16 - INFO - codeparrot_training - Step 22597: {'lr': 0.00028909693941840957, 'samples': 11570176, 'steps': 22597, 'batch_loss/train': 0.7545405109412968} 12/25/2021 22:50:29 - INFO - codeparrot_training - Step 22598: {'lr': 0.0002890813701602113, 'samples': 11570688, 'steps': 22598, 'batch_loss/train': 0.5335975931957364} 12/25/2021 22:50:39 - INFO - codeparrot_training - Step 22599: {'lr': 0.0002890658007466402, 'samples': 11571200, 'steps': 22599, 'batch_loss/train': 0.7869768179953098} 12/25/2021 22:50:50 - INFO - codeparrot_training - Step 22600: {'lr': 0.00028905023117775826, 'samples': 11571712, 'steps': 22600, 'batch_loss/train': 0.8344895902555436} 12/25/2021 22:51:04 - INFO - codeparrot_training - Step 22601: {'lr': 0.0002890346614536273, 'samples': 11572224, 'steps': 22601, 'batch_loss/train': 0.677282553864643} 12/25/2021 22:51:14 - INFO - codeparrot_training - Step 22602: {'lr': 0.00028901909157430934, 'samples': 11572736, 'steps': 22602, 'batch_loss/train': 0.877029987052083} 12/25/2021 22:51:25 - INFO - codeparrot_training - Step 22603: {'lr': 0.0002890035215398661, 'samples': 11573248, 'steps': 22603, 'batch_loss/train': 0.783524970524013} 12/25/2021 22:51:35 - INFO - codeparrot_training - Step 22604: {'lr': 0.00028898795135035965, 'samples': 11573760, 'steps': 22604, 'batch_loss/train': 0.7129255067557096} 12/25/2021 22:51:48 - INFO - codeparrot_training - Step 22605: {'lr': 
0.0002889723810058518, 'samples': 11574272, 'steps': 22605, 'batch_loss/train': 0.8192126646172255} 12/25/2021 22:51:58 - INFO - codeparrot_training - Step 22606: {'lr': 0.0002889568105064045, 'samples': 11574784, 'steps': 22606, 'batch_loss/train': 0.7735885842703283} 12/25/2021 22:52:09 - INFO - codeparrot_training - Step 22607: {'lr': 0.0002889412398520796, 'samples': 11575296, 'steps': 22607, 'batch_loss/train': 0.8505070018582046} 12/25/2021 22:52:23 - INFO - codeparrot_training - Step 22608: {'lr': 0.00028892566904293904, 'samples': 11575808, 'steps': 22608, 'batch_loss/train': 0.927826102823019} 12/25/2021 22:52:34 - INFO - codeparrot_training - Step 22609: {'lr': 0.0002889100980790448, 'samples': 11576320, 'steps': 22609, 'batch_loss/train': 0.7502163872122765} 12/25/2021 22:52:44 - INFO - codeparrot_training - Step 22610: {'lr': 0.00028889452696045855, 'samples': 11576832, 'steps': 22610, 'batch_loss/train': 0.796833947300911} 12/25/2021 22:52:57 - INFO - codeparrot_training - Step 22611: {'lr': 0.00028887895568724246, 'samples': 11577344, 'steps': 22611, 'batch_loss/train': 0.8218054948374629} 12/25/2021 22:53:07 - INFO - codeparrot_training - Step 22612: {'lr': 0.0002888633842594583, 'samples': 11577856, 'steps': 22612, 'batch_loss/train': 0.8048449642956257} 12/25/2021 22:53:18 - INFO - codeparrot_training - Step 22613: {'lr': 0.000288847812677168, 'samples': 11578368, 'steps': 22613, 'batch_loss/train': 0.6835604743100703} 12/25/2021 22:53:28 - INFO - codeparrot_training - Step 22614: {'lr': 0.0002888322409404335, 'samples': 11578880, 'steps': 22614, 'batch_loss/train': 0.7912579060066491} 12/25/2021 22:53:41 - INFO - codeparrot_training - Step 22615: {'lr': 0.00028881666904931666, 'samples': 11579392, 'steps': 22615, 'batch_loss/train': 0.8850737160537392} 12/25/2021 22:53:51 - INFO - codeparrot_training - Step 22616: {'lr': 0.0002888010970038793, 'samples': 11579904, 'steps': 22616, 'batch_loss/train': 0.7360147880390286} 12/25/2021 22:54:02 - INFO - 
codeparrot_training - Step 22617: {'lr': 0.00028878552480418346, 'samples': 11580416, 'steps': 22617, 'batch_loss/train': 0.7336496529169381} 12/25/2021 22:54:16 - INFO - codeparrot_training - Step 22618: {'lr': 0.00028876995245029113, 'samples': 11580928, 'steps': 22618, 'batch_loss/train': 0.847789709456265} 12/25/2021 22:54:26 - INFO - codeparrot_training - Step 22619: {'lr': 0.000288754379942264, 'samples': 11581440, 'steps': 22619, 'batch_loss/train': 0.8349748337641358} 12/25/2021 22:54:37 - INFO - codeparrot_training - Step 22620: {'lr': 0.00028873880728016413, 'samples': 11581952, 'steps': 22620, 'batch_loss/train': 0.8147660186514258} 12/25/2021 22:54:49 - INFO - codeparrot_training - Step 22621: {'lr': 0.0002887232344640534, 'samples': 11582464, 'steps': 22621, 'batch_loss/train': 0.7102863152395003} 12/25/2021 22:55:00 - INFO - codeparrot_training - Step 22622: {'lr': 0.0002887076614939937, 'samples': 11582976, 'steps': 22622, 'batch_loss/train': 0.8993275321554393} 12/25/2021 22:55:10 - INFO - codeparrot_training - Step 22623: {'lr': 0.0002886920883700469, 'samples': 11583488, 'steps': 22623, 'batch_loss/train': 0.8575993310660124} 12/25/2021 22:55:21 - INFO - codeparrot_training - Step 22624: {'lr': 0.00028867651509227507, 'samples': 11584000, 'steps': 22624, 'batch_loss/train': 0.8392338077537715} 12/25/2021 22:55:35 - INFO - codeparrot_training - Step 22625: {'lr': 0.00028866094166074, 'samples': 11584512, 'steps': 22625, 'batch_loss/train': 0.8480507433414459} 12/25/2021 22:55:45 - INFO - codeparrot_training - Step 22626: {'lr': 0.00028864536807550353, 'samples': 11585024, 'steps': 22626, 'batch_loss/train': 0.747677031904459} 12/25/2021 22:55:56 - INFO - codeparrot_training - Step 22627: {'lr': 0.0002886297943366277, 'samples': 11585536, 'steps': 22627, 'batch_loss/train': 0.7372208768501878} 12/25/2021 22:56:08 - INFO - codeparrot_training - Step 22628: {'lr': 0.0002886142204441745, 'samples': 11586048, 'steps': 22628, 'batch_loss/train': 
0.7312235485296696} 12/25/2021 22:56:19 - INFO - codeparrot_training - Step 22629: {'lr': 0.0002885986463982056, 'samples': 11586560, 'steps': 22629, 'batch_loss/train': 0.6850813094060868} 12/25/2021 22:56:29 - INFO - codeparrot_training - Step 22630: {'lr': 0.0002885830721987831, 'samples': 11587072, 'steps': 22630, 'batch_loss/train': 0.7227105868514627} 12/25/2021 22:56:41 - INFO - codeparrot_training - Step 22631: {'lr': 0.0002885674978459689, 'samples': 11587584, 'steps': 22631, 'batch_loss/train': 0.7868105770321563} 12/25/2021 22:56:52 - INFO - codeparrot_training - Step 22632: {'lr': 0.0002885519233398249, 'samples': 11588096, 'steps': 22632, 'batch_loss/train': 0.7760381484404206} 12/25/2021 22:57:02 - INFO - codeparrot_training - Step 22633: {'lr': 0.000288536348680413, 'samples': 11588608, 'steps': 22633, 'batch_loss/train': 0.838157984893769} 12/25/2021 22:57:14 - INFO - codeparrot_training - Step 22634: {'lr': 0.00028852077386779505, 'samples': 11589120, 'steps': 22634, 'batch_loss/train': 0.7168388168793172} 12/25/2021 22:57:25 - INFO - codeparrot_training - Step 22635: {'lr': 0.0002885051989020332, 'samples': 11589632, 'steps': 22635, 'batch_loss/train': 0.6786867990158498} 12/25/2021 22:57:36 - INFO - codeparrot_training - Step 22636: {'lr': 0.0002884896237831891, 'samples': 11590144, 'steps': 22636, 'batch_loss/train': 0.810909453779459} 12/25/2021 22:57:46 - INFO - codeparrot_training - Step 22637: {'lr': 0.0002884740485113249, 'samples': 11590656, 'steps': 22637, 'batch_loss/train': 0.7727569881826639} 12/25/2021 22:58:00 - INFO - codeparrot_training - Step 22638: {'lr': 0.00028845847308650233, 'samples': 11591168, 'steps': 22638, 'batch_loss/train': 0.7216015672311187} 12/25/2021 22:58:11 - INFO - codeparrot_training - Step 22639: {'lr': 0.0002884428975087833, 'samples': 11591680, 'steps': 22639, 'batch_loss/train': 0.7988315913826227} 12/25/2021 22:58:21 - INFO - codeparrot_training - Step 22640: {'lr': 0.00028842732177822994, 'samples': 
11592192, 'steps': 22640, 'batch_loss/train': 0.7630127858137712} 12/25/2021 22:58:34 - INFO - codeparrot_training - Step 22641: {'lr': 0.0002884117458949041, 'samples': 11592704, 'steps': 22641, 'batch_loss/train': 0.8487800229340792} 12/25/2021 22:58:45 - INFO - codeparrot_training - Step 22642: {'lr': 0.00028839616985886754, 'samples': 11593216, 'steps': 22642, 'batch_loss/train': 0.6614226748934016} 12/25/2021 22:58:55 - INFO - codeparrot_training - Step 22643: {'lr': 0.0002883805936701824, 'samples': 11593728, 'steps': 22643, 'batch_loss/train': 0.7162802158854902} 12/25/2021 22:59:06 - INFO - codeparrot_training - Step 22644: {'lr': 0.00028836501732891045, 'samples': 11594240, 'steps': 22644, 'batch_loss/train': 0.7846701284870505} 12/25/2021 22:59:18 - INFO - codeparrot_training - Step 22645: {'lr': 0.0002883494408351137, 'samples': 11594752, 'steps': 22645, 'batch_loss/train': 0.5040800446877256} 12/25/2021 22:59:29 - INFO - codeparrot_training - Step 22646: {'lr': 0.0002883338641888541, 'samples': 11595264, 'steps': 22646, 'batch_loss/train': 0.7005868293344975} 12/25/2021 22:59:39 - INFO - codeparrot_training - Step 22647: {'lr': 0.00028831828739019343, 'samples': 11595776, 'steps': 22647, 'batch_loss/train': 0.8263139305636287} 12/25/2021 22:59:53 - INFO - codeparrot_training - Step 22648: {'lr': 0.0002883027104391938, 'samples': 11596288, 'steps': 22648, 'batch_loss/train': 0.8841979522258043} 12/25/2021 23:00:04 - INFO - codeparrot_training - Step 22649: {'lr': 0.00028828713333591704, 'samples': 11596800, 'steps': 22649, 'batch_loss/train': 0.7616319274529815} 12/25/2021 23:00:14 - INFO - codeparrot_training - Step 22650: {'lr': 0.0002882715560804251, 'samples': 11597312, 'steps': 22650, 'batch_loss/train': 0.7818006412126124} 12/25/2021 23:00:27 - INFO - codeparrot_training - Step 22651: {'lr': 0.0002882559786727799, 'samples': 11597824, 'steps': 22651, 'batch_loss/train': 0.65132936835289} 12/25/2021 23:00:37 - INFO - codeparrot_training - Step 
22652: {'lr': 0.0002882404011130434, 'samples': 11598336, 'steps': 22652, 'batch_loss/train': 0.7407915049698204} 12/25/2021 23:00:48 - INFO - codeparrot_training - Step 22653: {'lr': 0.0002882248234012774, 'samples': 11598848, 'steps': 22653, 'batch_loss/train': 0.7474587978795171} 12/25/2021 23:00:58 - INFO - codeparrot_training - Step 22654: {'lr': 0.000288209245537544, 'samples': 11599360, 'steps': 22654, 'batch_loss/train': 0.7652021199464798} 12/25/2021 23:01:12 - INFO - codeparrot_training - Step 22655: {'lr': 0.0002881936675219051, 'samples': 11599872, 'steps': 22655, 'batch_loss/train': 0.7753912089392543} 12/25/2021 23:01:23 - INFO - codeparrot_training - Step 22656: {'lr': 0.00028817808935442256, 'samples': 11600384, 'steps': 22656, 'batch_loss/train': 0.7828365787863731} 12/25/2021 23:01:34 - INFO - codeparrot_training - Step 22657: {'lr': 0.0002881625110351584, 'samples': 11600896, 'steps': 22657, 'batch_loss/train': 0.6823383702430874} 12/25/2021 23:01:46 - INFO - codeparrot_training - Step 22658: {'lr': 0.00028814693256417446, 'samples': 11601408, 'steps': 22658, 'batch_loss/train': 0.6489226231351495} 12/25/2021 23:01:57 - INFO - codeparrot_training - Step 22659: {'lr': 0.0002881313539415327, 'samples': 11601920, 'steps': 22659, 'batch_loss/train': 0.7597264745272696} 12/25/2021 23:02:07 - INFO - codeparrot_training - Step 22660: {'lr': 0.00028811577516729513, 'samples': 11602432, 'steps': 22660, 'batch_loss/train': 0.7267479719594121} 12/25/2021 23:02:19 - INFO - codeparrot_training - Step 22661: {'lr': 0.00028810019624152364, 'samples': 11602944, 'steps': 22661, 'batch_loss/train': 0.8131173737347126} 12/25/2021 23:02:30 - INFO - codeparrot_training - Step 22662: {'lr': 0.0002880846171642801, 'samples': 11603456, 'steps': 22662, 'batch_loss/train': 0.7565549514256418} 12/25/2021 23:02:41 - INFO - codeparrot_training - Step 22663: {'lr': 0.00028806903793562655, 'samples': 11603968, 'steps': 22663, 'batch_loss/train': 1.2308233343064785} 12/25/2021 
23:02:51 - INFO - codeparrot_training - Step 22664: {'lr': 0.0002880534585556248, 'samples': 11604480, 'steps': 22664, 'batch_loss/train': 1.730246068444103} 12/25/2021 23:03:05 - INFO - codeparrot_training - Step 22665: {'lr': 0.000288037879024337, 'samples': 11604992, 'steps': 22665, 'batch_loss/train': 0.8028713609091938} 12/25/2021 23:03:16 - INFO - codeparrot_training - Step 22666: {'lr': 0.00028802229934182485, 'samples': 11605504, 'steps': 22666, 'batch_loss/train': 0.7280339323915541} 12/25/2021 23:03:26 - INFO - codeparrot_training - Step 22667: {'lr': 0.0002880067195081504, 'samples': 11606016, 'steps': 22667, 'batch_loss/train': 0.686282972805202} 12/25/2021 23:03:38 - INFO - codeparrot_training - Step 22668: {'lr': 0.0002879911395233756, 'samples': 11606528, 'steps': 22668, 'batch_loss/train': 0.7599216639064252} 12/25/2021 23:03:49 - INFO - codeparrot_training - Step 22669: {'lr': 0.00028797555938756236, 'samples': 11607040, 'steps': 22669, 'batch_loss/train': 0.8810764895752072} 12/25/2021 23:04:00 - INFO - codeparrot_training - Step 22670: {'lr': 0.00028795997910077264, 'samples': 11607552, 'steps': 22670, 'batch_loss/train': 0.807412906549871} 12/25/2021 23:04:12 - INFO - codeparrot_training - Step 22671: {'lr': 0.00028794439866306837, 'samples': 11608064, 'steps': 22671, 'batch_loss/train': 0.7357877856120467} 12/25/2021 23:04:22 - INFO - codeparrot_training - Step 22672: {'lr': 0.0002879288180745115, 'samples': 11608576, 'steps': 22672, 'batch_loss/train': 0.7690488192019984} 12/25/2021 23:04:33 - INFO - codeparrot_training - Step 22673: {'lr': 0.00028791323733516405, 'samples': 11609088, 'steps': 22673, 'batch_loss/train': 0.7603060370311141} 12/25/2021 23:04:44 - INFO - codeparrot_training - Step 22674: {'lr': 0.0002878976564450878, 'samples': 11609600, 'steps': 22674, 'batch_loss/train': 0.8097135657444596} 12/25/2021 23:04:56 - INFO - codeparrot_training - Step 22675: {'lr': 0.0002878820754043448, 'samples': 11610112, 'steps': 22675, 
'batch_loss/train': 0.7661784649826586} 12/25/2021 23:05:06 - INFO - codeparrot_training - Step 22676: {'lr': 0.000287866494212997, 'samples': 11610624, 'steps': 22676, 'batch_loss/train': 0.8019483559764922} 12/25/2021 23:05:17 - INFO - codeparrot_training - Step 22677: {'lr': 0.0002878509128711063, 'samples': 11611136, 'steps': 22677, 'batch_loss/train': 0.740979160182178} 12/25/2021 23:05:31 - INFO - codeparrot_training - Step 22678: {'lr': 0.00028783533137873463, 'samples': 11611648, 'steps': 22678, 'batch_loss/train': 0.8194541092962027} 12/25/2021 23:05:41 - INFO - codeparrot_training - Step 22679: {'lr': 0.000287819749735944, 'samples': 11612160, 'steps': 22679, 'batch_loss/train': 0.7935027340427041} 12/25/2021 23:05:52 - INFO - codeparrot_training - Step 22680: {'lr': 0.0002878041679427963, 'samples': 11612672, 'steps': 22680, 'batch_loss/train': 0.7603221898898482} 12/25/2021 23:06:04 - INFO - codeparrot_training - Step 22681: {'lr': 0.00028778858599935346, 'samples': 11613184, 'steps': 22681, 'batch_loss/train': 0.827242580242455} 12/25/2021 23:06:15 - INFO - codeparrot_training - Step 22682: {'lr': 0.00028777300390567755, 'samples': 11613696, 'steps': 22682, 'batch_loss/train': 0.9213060177862644} 12/25/2021 23:06:26 - INFO - codeparrot_training - Step 22683: {'lr': 0.0002877574216618305, 'samples': 11614208, 'steps': 22683, 'batch_loss/train': 0.7106915228068829} 12/25/2021 23:06:38 - INFO - codeparrot_training - Step 22684: {'lr': 0.000287741839267874, 'samples': 11614720, 'steps': 22684, 'batch_loss/train': 0.810517406091094} 12/25/2021 23:06:49 - INFO - codeparrot_training - Step 22685: {'lr': 0.0002877262567238704, 'samples': 11615232, 'steps': 22685, 'batch_loss/train': 1.1057825172320008} 12/25/2021 23:06:59 - INFO - codeparrot_training - Step 22686: {'lr': 0.00028771067402988133, 'samples': 11615744, 'steps': 22686, 'batch_loss/train': 0.9961601570248604} 12/25/2021 23:07:10 - INFO - codeparrot_training - Step 22687: {'lr': 
0.00028769509118596885, 'samples': 11616256, 'steps': 22687, 'batch_loss/train': 0.7200200306251645} 12/25/2021 23:07:24 - INFO - codeparrot_training - Step 22688: {'lr': 0.00028767950819219497, 'samples': 11616768, 'steps': 22688, 'batch_loss/train': 0.8963401613291353} 12/25/2021 23:07:35 - INFO - codeparrot_training - Step 22689: {'lr': 0.00028766392504862165, 'samples': 11617280, 'steps': 22689, 'batch_loss/train': 0.849960739724338} 12/25/2021 23:07:45 - INFO - codeparrot_training - Step 22690: {'lr': 0.0002876483417553107, 'samples': 11617792, 'steps': 22690, 'batch_loss/train': 0.6958077941089869} 12/25/2021 23:07:58 - INFO - codeparrot_training - Step 22691: {'lr': 0.0002876327583123241, 'samples': 11618304, 'steps': 22691, 'batch_loss/train': 0.7664078781381249} 12/25/2021 23:08:08 - INFO - codeparrot_training - Step 22692: {'lr': 0.000287617174719724, 'samples': 11618816, 'steps': 22692, 'batch_loss/train': 0.7586711540352553} 12/25/2021 23:08:19 - INFO - codeparrot_training - Step 22693: {'lr': 0.0002876015909775722, 'samples': 11619328, 'steps': 22693, 'batch_loss/train': 1.0794225810095668} 12/25/2021 23:08:30 - INFO - codeparrot_training - Step 22694: {'lr': 0.0002875860070859306, 'samples': 11619840, 'steps': 22694, 'batch_loss/train': 0.8031741306185722} 12/25/2021 23:08:44 - INFO - codeparrot_training - Step 22695: {'lr': 0.0002875704230448612, 'samples': 11620352, 'steps': 22695, 'batch_loss/train': 0.8100519729778171} 12/25/2021 23:08:54 - INFO - codeparrot_training - Step 22696: {'lr': 0.0002875548388544261, 'samples': 11620864, 'steps': 22696, 'batch_loss/train': 0.768339378759265} 12/25/2021 23:09:05 - INFO - codeparrot_training - Step 22697: {'lr': 0.000287539254514687, 'samples': 11621376, 'steps': 22697, 'batch_loss/train': 0.7752310214564204} 12/25/2021 23:09:17 - INFO - codeparrot_training - Step 22698: {'lr': 0.0002875236700257061, 'samples': 11621888, 'steps': 22698, 'batch_loss/train': 0.7287883423268795} 12/25/2021 23:09:28 - INFO - 
codeparrot_training - Step 22699: {'lr': 0.00028750808538754525, 'samples': 11622400, 'steps': 22699, 'batch_loss/train': 0.6575910673709586} 12/25/2021 23:09:38 - INFO - codeparrot_training - Step 22700: {'lr': 0.00028749250060026646, 'samples': 11622912, 'steps': 22700, 'batch_loss/train': 0.7221923461183906} 12/25/2021 23:09:50 - INFO - codeparrot_training - Step 22701: {'lr': 0.0002874769156639315, 'samples': 11623424, 'steps': 22701, 'batch_loss/train': 0.5973530805204064} 12/25/2021 23:10:01 - INFO - codeparrot_training - Step 22702: {'lr': 0.0002874613305786026, 'samples': 11623936, 'steps': 22702, 'batch_loss/train': 0.7618457144126296} 12/25/2021 23:10:12 - INFO - codeparrot_training - Step 22703: {'lr': 0.0002874457453443416, 'samples': 11624448, 'steps': 22703, 'batch_loss/train': 0.7893717242404819} 12/25/2021 23:10:22 - INFO - codeparrot_training - Step 22704: {'lr': 0.0002874301599612103, 'samples': 11624960, 'steps': 22704, 'batch_loss/train': 0.8233353169634938} 12/25/2021 23:10:36 - INFO - codeparrot_training - Step 22705: {'lr': 0.000287414574429271, 'samples': 11625472, 'steps': 22705, 'batch_loss/train': 0.8059598326217383} 12/25/2021 23:10:46 - INFO - codeparrot_training - Step 22706: {'lr': 0.0002873989887485854, 'samples': 11625984, 'steps': 22706, 'batch_loss/train': 0.8026668813545257} 12/25/2021 23:10:57 - INFO - codeparrot_training - Step 22707: {'lr': 0.00028738340291921557, 'samples': 11626496, 'steps': 22707, 'batch_loss/train': 0.7523185117170215} 12/25/2021 23:11:09 - INFO - codeparrot_training - Step 22708: {'lr': 0.0002873678169412234, 'samples': 11627008, 'steps': 22708, 'batch_loss/train': 0.9519589422270656} 12/25/2021 23:11:20 - INFO - codeparrot_training - Step 22709: {'lr': 0.0002873522308146709, 'samples': 11627520, 'steps': 22709, 'batch_loss/train': 0.8051616447046399} 12/25/2021 23:11:30 - INFO - codeparrot_training - Step 22710: {'lr': 0.00028733664453962004, 'samples': 11628032, 'steps': 22710, 'batch_loss/train': 
0.7912095179781318} 12/25/2021 23:11:42 - INFO - codeparrot_training - Step 22711: {'lr': 0.00028732105811613283, 'samples': 11628544, 'steps': 22711, 'batch_loss/train': 1.033081231173128} 12/25/2021 23:11:53 - INFO - codeparrot_training - Step 22712: {'lr': 0.00028730547154427115, 'samples': 11629056, 'steps': 22712, 'batch_loss/train': 0.7318410044535995} 12/25/2021 23:12:04 - INFO - codeparrot_training - Step 22713: {'lr': 0.000287289884824097, 'samples': 11629568, 'steps': 22713, 'batch_loss/train': 0.7415193002671003} 12/25/2021 23:12:16 - INFO - codeparrot_training - Step 22714: {'lr': 0.00028727429795567236, 'samples': 11630080, 'steps': 22714, 'batch_loss/train': 0.7919669463299215} 12/25/2021 23:12:26 - INFO - codeparrot_training - Step 22715: {'lr': 0.0002872587109390592, 'samples': 11630592, 'steps': 22715, 'batch_loss/train': 0.8071297500282526} 12/25/2021 23:12:37 - INFO - codeparrot_training - Step 22716: {'lr': 0.00028724312377431937, 'samples': 11631104, 'steps': 22716, 'batch_loss/train': 0.6749767662258819} 12/25/2021 23:12:48 - INFO - codeparrot_training - Step 22717: {'lr': 0.000287227536461515, 'samples': 11631616, 'steps': 22717, 'batch_loss/train': 0.7035721950232983} 12/25/2021 23:13:02 - INFO - codeparrot_training - Step 22718: {'lr': 0.0002872119490007081, 'samples': 11632128, 'steps': 22718, 'batch_loss/train': 0.7593029183335602} 12/25/2021 23:13:12 - INFO - codeparrot_training - Step 22719: {'lr': 0.00028719636139196044, 'samples': 11632640, 'steps': 22719, 'batch_loss/train': 0.7748465207405388} 12/25/2021 23:13:23 - INFO - codeparrot_training - Step 22720: {'lr': 0.0002871807736353341, 'samples': 11633152, 'steps': 22720, 'batch_loss/train': 0.844079724047333} 12/25/2021 23:13:35 - INFO - codeparrot_training - Step 22721: {'lr': 0.00028716518573089104, 'samples': 11633664, 'steps': 22721, 'batch_loss/train': 0.697682443074882} 12/25/2021 23:13:46 - INFO - codeparrot_training - Step 22722: {'lr': 0.0002871495976786933, 'samples': 
11634176, 'steps': 22722, 'batch_loss/train': 0.6974532483145595} 12/25/2021 23:13:56 - INFO - codeparrot_training - Step 22723: {'lr': 0.0002871340094788027, 'samples': 11634688, 'steps': 22723, 'batch_loss/train': 1.4087011171504855} 12/25/2021 23:14:10 - INFO - codeparrot_training - Step 22724: {'lr': 0.0002871184211312814, 'samples': 11635200, 'steps': 22724, 'batch_loss/train': 0.8686610572040081} 12/25/2021 23:14:21 - INFO - codeparrot_training - Step 22725: {'lr': 0.0002871028326361912, 'samples': 11635712, 'steps': 22725, 'batch_loss/train': 0.8114334233105183} 12/25/2021 23:14:32 - INFO - codeparrot_training - Step 22726: {'lr': 0.00028708724399359415, 'samples': 11636224, 'steps': 22726, 'batch_loss/train': 0.7606218680739403} 12/25/2021 23:14:42 - INFO - codeparrot_training - Step 22727: {'lr': 0.00028707165520355224, 'samples': 11636736, 'steps': 22727, 'batch_loss/train': 0.8342324811965227} 12/25/2021 23:14:54 - INFO - codeparrot_training - Step 22728: {'lr': 0.0002870560662661274, 'samples': 11637248, 'steps': 22728, 'batch_loss/train': 0.779297161847353} 12/25/2021 23:15:05 - INFO - codeparrot_training - Step 22729: {'lr': 0.0002870404771813816, 'samples': 11637760, 'steps': 22729, 'batch_loss/train': 0.8608004916459322} 12/25/2021 23:15:16 - INFO - codeparrot_training - Step 22730: {'lr': 0.00028702488794937693, 'samples': 11638272, 'steps': 22730, 'batch_loss/train': 0.8234828477725387} 12/25/2021 23:15:28 - INFO - codeparrot_training - Step 22731: {'lr': 0.00028700929857017527, 'samples': 11638784, 'steps': 22731, 'batch_loss/train': 0.673342252150178} 12/25/2021 23:15:38 - INFO - codeparrot_training - Step 22732: {'lr': 0.0002869937090438386, 'samples': 11639296, 'steps': 22732, 'batch_loss/train': 0.7828654758632183} 12/25/2021 23:15:49 - INFO - codeparrot_training - Step 22733: {'lr': 0.00028697811937042887, 'samples': 11639808, 'steps': 22733, 'batch_loss/train': 0.8065787908853963} 12/25/2021 23:16:03 - INFO - codeparrot_training - Step 
22734: {'lr': 0.00028696252955000815, 'samples': 11640320, 'steps': 22734, 'batch_loss/train': 0.7678607115522027} 12/25/2021 23:16:14 - INFO - codeparrot_training - Step 22735: {'lr': 0.0002869469395826384, 'samples': 11640832, 'steps': 22735, 'batch_loss/train': 0.991163088940084} 12/25/2021 23:16:24 - INFO - codeparrot_training - Step 22736: {'lr': 0.00028693134946838146, 'samples': 11641344, 'steps': 22736, 'batch_loss/train': 0.7349060741253197} 12/25/2021 23:16:36 - INFO - codeparrot_training - Step 22737: {'lr': 0.00028691575920729957, 'samples': 11641856, 'steps': 22737, 'batch_loss/train': 0.7147160468157381} 12/25/2021 23:16:47 - INFO - codeparrot_training - Step 22738: {'lr': 0.0002869001687994545, 'samples': 11642368, 'steps': 22738, 'batch_loss/train': 0.7942384416237473} 12/25/2021 23:16:58 - INFO - codeparrot_training - Step 22739: {'lr': 0.00028688457824490823, 'samples': 11642880, 'steps': 22739, 'batch_loss/train': 0.8044413393363357} 12/25/2021 23:17:08 - INFO - codeparrot_training - Step 22740: {'lr': 0.0002868689875437228, 'samples': 11643392, 'steps': 22740, 'batch_loss/train': 0.7582202889025211} 12/25/2021 23:17:22 - INFO - codeparrot_training - Step 22741: {'lr': 0.0002868533966959602, 'samples': 11643904, 'steps': 22741, 'batch_loss/train': 0.7382998975226656} 12/25/2021 23:17:33 - INFO - codeparrot_training - Step 22742: {'lr': 0.0002868378057016825, 'samples': 11644416, 'steps': 22742, 'batch_loss/train': 0.8406156823039055} 12/25/2021 23:17:43 - INFO - codeparrot_training - Step 22743: {'lr': 0.00028682221456095147, 'samples': 11644928, 'steps': 22743, 'batch_loss/train': 0.812654941342771} 12/25/2021 23:17:56 - INFO - codeparrot_training - Step 22744: {'lr': 0.0002868066232738294, 'samples': 11645440, 'steps': 22744, 'batch_loss/train': 0.7889196258038282} 12/25/2021 23:18:06 - INFO - codeparrot_training - Step 22745: {'lr': 0.000286791031840378, 'samples': 11645952, 'steps': 22745, 'batch_loss/train': 0.6243782564997673} 12/25/2021 
23:18:17 - INFO - codeparrot_training - Step 22746: {'lr': 0.0002867754402606593, 'samples': 11646464, 'steps': 22746, 'batch_loss/train': 0.7297148206271231} 12/25/2021 23:18:29 - INFO - codeparrot_training - Step 22747: {'lr': 0.00028675984853473535, 'samples': 11646976, 'steps': 22747, 'batch_loss/train': 0.6772284409380518} 12/25/2021 23:18:40 - INFO - codeparrot_training - Step 22748: {'lr': 0.0002867442566626682, 'samples': 11647488, 'steps': 22748, 'batch_loss/train': 0.8160465070977807} 12/25/2021 23:18:50 - INFO - codeparrot_training - Step 22749: {'lr': 0.00028672866464451963, 'samples': 11648000, 'steps': 22749, 'batch_loss/train': 0.752157786861062} 12/25/2021 23:19:01 - INFO - codeparrot_training - Step 22750: {'lr': 0.0002867130724803518, 'samples': 11648512, 'steps': 22750, 'batch_loss/train': 0.7606161213479936} 12/25/2021 23:19:13 - INFO - codeparrot_training - Step 22751: {'lr': 0.00028669748017022675, 'samples': 11649024, 'steps': 22751, 'batch_loss/train': 0.6687278090976179} 12/25/2021 23:19:23 - INFO - codeparrot_training - Step 22752: {'lr': 0.0002866818877142063, 'samples': 11649536, 'steps': 22752, 'batch_loss/train': 0.791243112180382} 12/25/2021 23:19:34 - INFO - codeparrot_training - Step 22753: {'lr': 0.0002866662951123525, 'samples': 11650048, 'steps': 22753, 'batch_loss/train': 0.7673757877200842} 12/25/2021 23:19:48 - INFO - codeparrot_training - Step 22754: {'lr': 0.00028665070236472736, 'samples': 11650560, 'steps': 22754, 'batch_loss/train': 0.7791612762957811} 12/25/2021 23:19:59 - INFO - codeparrot_training - Step 22755: {'lr': 0.00028663510947139287, 'samples': 11651072, 'steps': 22755, 'batch_loss/train': 0.8183229109272361} 12/25/2021 23:20:09 - INFO - codeparrot_training - Step 22756: {'lr': 0.000286619516432411, 'samples': 11651584, 'steps': 22756, 'batch_loss/train': 0.8237623758614063} 12/25/2021 23:20:21 - INFO - codeparrot_training - Step 22757: {'lr': 0.0002866039232478439, 'samples': 11652096, 'steps': 22757, 
'batch_loss/train': 0.7353719680104405} 12/25/2021 23:20:32 - INFO - codeparrot_training - Step 22758: {'lr': 0.0002865883299177533, 'samples': 11652608, 'steps': 22758, 'batch_loss/train': 0.9229310899972916} 12/25/2021 23:20:43 - INFO - codeparrot_training - Step 22759: {'lr': 0.0002865727364422013, 'samples': 11653120, 'steps': 22759, 'batch_loss/train': 0.7338707798626274} 12/25/2021 23:20:53 - INFO - codeparrot_training - Step 22760: {'lr': 0.00028655714282124994, 'samples': 11653632, 'steps': 22760, 'batch_loss/train': 0.7830730825662613} 12/25/2021 23:21:06 - INFO - codeparrot_training - Step 22761: {'lr': 0.0002865415490549612, 'samples': 11654144, 'steps': 22761, 'batch_loss/train': 0.8613532744348049} 12/25/2021 23:21:16 - INFO - codeparrot_training - Step 22762: {'lr': 0.00028652595514339703, 'samples': 11654656, 'steps': 22762, 'batch_loss/train': 0.7755487756803632} 12/25/2021 23:21:27 - INFO - codeparrot_training - Step 22763: {'lr': 0.0002865103610866195, 'samples': 11655168, 'steps': 22763, 'batch_loss/train': 0.6229619951918721} 12/25/2021 23:21:41 - INFO - codeparrot_training - Step 22764: {'lr': 0.0002864947668846905, 'samples': 11655680, 'steps': 22764, 'batch_loss/train': 0.9034191463142633} 12/25/2021 23:21:51 - INFO - codeparrot_training - Step 22765: {'lr': 0.00028647917253767205, 'samples': 11656192, 'steps': 22765, 'batch_loss/train': 0.7879372499883175} 12/25/2021 23:22:02 - INFO - codeparrot_training - Step 22766: {'lr': 0.00028646357804562627, 'samples': 11656704, 'steps': 22766, 'batch_loss/train': 0.7685629143379629} 12/25/2021 23:22:14 - INFO - codeparrot_training - Step 22767: {'lr': 0.00028644798340861507, 'samples': 11657216, 'steps': 22767, 'batch_loss/train': 0.7797918124124408} 12/25/2021 23:22:24 - INFO - codeparrot_training - Step 22768: {'lr': 0.0002864323886267004, 'samples': 11657728, 'steps': 22768, 'batch_loss/train': 0.7791535258293152} 12/25/2021 23:22:35 - INFO - codeparrot_training - Step 22769: {'lr': 
0.0002864167936999443, 'samples': 11658240, 'steps': 22769, 'batch_loss/train': 0.8850528271868825} 12/25/2021 23:22:49 - INFO - codeparrot_training - Step 22770: {'lr': 0.0002864011986284089, 'samples': 11658752, 'steps': 22770, 'batch_loss/train': 0.8410675378981978} 12/25/2021 23:23:00 - INFO - codeparrot_training - Step 22771: {'lr': 0.00028638560341215597, 'samples': 11659264, 'steps': 22771, 'batch_loss/train': 0.9124205419793725} 12/25/2021 23:23:10 - INFO - codeparrot_training - Step 22772: {'lr': 0.00028637000805124764, 'samples': 11659776, 'steps': 22772, 'batch_loss/train': 0.7448339769616723} 12/25/2021 23:23:21 - INFO - codeparrot_training - Step 22773: {'lr': 0.0002863544125457459, 'samples': 11660288, 'steps': 22773, 'batch_loss/train': 0.7884682465228252} 12/25/2021 23:23:33 - INFO - codeparrot_training - Step 22774: {'lr': 0.0002863388168957127, 'samples': 11660800, 'steps': 22774, 'batch_loss/train': 0.8000397868454456} 12/25/2021 23:23:43 - INFO - codeparrot_training - Step 22775: {'lr': 0.0002863232211012101, 'samples': 11661312, 'steps': 22775, 'batch_loss/train': 0.9602551604621112} 12/25/2021 23:23:54 - INFO - codeparrot_training - Step 22776: {'lr': 0.0002863076251623002, 'samples': 11661824, 'steps': 22776, 'batch_loss/train': 0.6673741596750915} 12/25/2021 23:24:06 - INFO - codeparrot_training - Step 22777: {'lr': 0.0002862920290790448, 'samples': 11662336, 'steps': 22777, 'batch_loss/train': 0.6969202617183328} 12/25/2021 23:24:17 - INFO - codeparrot_training - Step 22778: {'lr': 0.00028627643285150596, 'samples': 11662848, 'steps': 22778, 'batch_loss/train': 0.7588463267311454} 12/25/2021 23:24:27 - INFO - codeparrot_training - Step 22779: {'lr': 0.00028626083647974575, 'samples': 11663360, 'steps': 22779, 'batch_loss/train': 0.7725378451868892} 12/25/2021 23:24:41 - INFO - codeparrot_training - Step 22780: {'lr': 0.0002862452399638262, 'samples': 11663872, 'steps': 22780, 'batch_loss/train': 0.7325524524785578} 12/25/2021 23:24:52 - 
INFO - codeparrot_training - Step 22781: {'lr': 0.0002862296433038092, 'samples': 11664384, 'steps': 22781, 'batch_loss/train': 0.7699169325642288} 12/25/2021 23:25:02 - INFO - codeparrot_training - Step 22782: {'lr': 0.0002862140464997568, 'samples': 11664896, 'steps': 22782, 'batch_loss/train': 0.7524497536942363} 12/25/2021 23:25:13 - INFO - codeparrot_training - Step 22783: {'lr': 0.0002861984495517311, 'samples': 11665408, 'steps': 22783, 'batch_loss/train': 0.8902403376996517} 12/25/2021 23:25:25 - INFO - codeparrot_training - Step 22784: {'lr': 0.0002861828524597939, 'samples': 11665920, 'steps': 22784, 'batch_loss/train': 0.7711325013078749} 12/25/2021 23:25:36 - INFO - codeparrot_training - Step 22785: {'lr': 0.00028616725522400755, 'samples': 11666432, 'steps': 22785, 'batch_loss/train': 0.7903984829317778} 12/25/2021 23:25:46 - INFO - codeparrot_training - Step 22786: {'lr': 0.0002861516578444337, 'samples': 11666944, 'steps': 22786, 'batch_loss/train': 0.8251183114480227} 12/25/2021 23:25:58 - INFO - codeparrot_training - Step 22787: {'lr': 0.0002861360603211345, 'samples': 11667456, 'steps': 22787, 'batch_loss/train': 0.8476281291805208} 12/25/2021 23:26:09 - INFO - codeparrot_training - Step 22788: {'lr': 0.00028612046265417203, 'samples': 11667968, 'steps': 22788, 'batch_loss/train': 0.8337519373744726} 12/25/2021 23:26:20 - INFO - codeparrot_training - Step 22789: {'lr': 0.00028610486484360826, 'samples': 11668480, 'steps': 22789, 'batch_loss/train': 0.8531601568683982} 12/25/2021 23:26:34 - INFO - codeparrot_training - Step 22790: {'lr': 0.00028608926688950513, 'samples': 11668992, 'steps': 22790, 'batch_loss/train': 0.9835077878087759} 12/25/2021 23:26:44 - INFO - codeparrot_training - Step 22791: {'lr': 0.00028607366879192464, 'samples': 11669504, 'steps': 22791, 'batch_loss/train': 0.8028755087871104} 12/25/2021 23:26:55 - INFO - codeparrot_training - Step 22792: {'lr': 0.000286058070550929, 'samples': 11670016, 'steps': 22792, 
'batch_loss/train': 0.6278268243768252} 12/25/2021 23:27:06 - INFO - codeparrot_training - Step 22793: {'lr': 0.00028604247216658, 'samples': 11670528, 'steps': 22793, 'batch_loss/train': 0.8015437973663211} 12/25/2021 23:27:18 - INFO - codeparrot_training - Step 22794: {'lr': 0.00028602687363893977, 'samples': 11671040, 'steps': 22794, 'batch_loss/train': 0.8316723620519042} 12/25/2021 23:27:28 - INFO - codeparrot_training - Step 22795: {'lr': 0.00028601127496807025, 'samples': 11671552, 'steps': 22795, 'batch_loss/train': 0.818841028958559} 12/25/2021 23:27:39 - INFO - codeparrot_training - Step 22796: {'lr': 0.0002859956761540336, 'samples': 11672064, 'steps': 22796, 'batch_loss/train': 0.8307846616953611} 12/25/2021 23:27:51 - INFO - codeparrot_training - Step 22797: {'lr': 0.0002859800771968916, 'samples': 11672576, 'steps': 22797, 'batch_loss/train': 0.6841642372310162} 12/25/2021 23:28:02 - INFO - codeparrot_training - Step 22798: {'lr': 0.0002859644780967065, 'samples': 11673088, 'steps': 22798, 'batch_loss/train': 0.8059050440788269} 12/25/2021 23:28:12 - INFO - codeparrot_training - Step 22799: {'lr': 0.00028594887885354013, 'samples': 11673600, 'steps': 22799, 'batch_loss/train': 0.7611659709364176} 12/25/2021 23:28:26 - INFO - codeparrot_training - Step 22800: {'lr': 0.0002859332794674547, 'samples': 11674112, 'steps': 22800, 'batch_loss/train': 0.7830361146479845} 12/25/2021 23:28:37 - INFO - codeparrot_training - Step 22801: {'lr': 0.000285917679938512, 'samples': 11674624, 'steps': 22801, 'batch_loss/train': 0.8055256535299122} 12/25/2021 23:28:47 - INFO - codeparrot_training - Step 22802: {'lr': 0.0002859020802667742, 'samples': 11675136, 'steps': 22802, 'batch_loss/train': 0.7397708771750331} 12/25/2021 23:28:58 - INFO - codeparrot_training - Step 22803: {'lr': 0.0002858864804523034, 'samples': 11675648, 'steps': 22803, 'batch_loss/train': 0.7886830344796181} 12/25/2021 23:29:10 - INFO - codeparrot_training - Step 22804: {'lr': 
0.00028587088049516135, 'samples': 11676160, 'steps': 22804, 'batch_loss/train': 0.9112768545746803} 12/25/2021 23:29:21 - INFO - codeparrot_training - Step 22805: {'lr': 0.0002858552803954103, 'samples': 11676672, 'steps': 22805, 'batch_loss/train': 1.7059993650764227} 12/25/2021 23:29:31 - INFO - codeparrot_training - Step 22806: {'lr': 0.0002858396801531122, 'samples': 11677184, 'steps': 22806, 'batch_loss/train': 0.7832849649712443} 12/25/2021 23:29:44 - INFO - codeparrot_training - Step 22807: {'lr': 0.00028582407976832905, 'samples': 11677696, 'steps': 22807, 'batch_loss/train': 0.6069464476313442} 12/25/2021 23:29:54 - INFO - codeparrot_training - Step 22808: {'lr': 0.00028580847924112283, 'samples': 11678208, 'steps': 22808, 'batch_loss/train': 0.7083950388478115} 12/25/2021 23:30:05 - INFO - codeparrot_training - Step 22809: {'lr': 0.0002857928785715557, 'samples': 11678720, 'steps': 22809, 'batch_loss/train': 0.7829598570242524} 12/25/2021 23:30:19 - INFO - codeparrot_training - Step 22810: {'lr': 0.0002857772777596895, 'samples': 11679232, 'steps': 22810, 'batch_loss/train': 0.7497753947973251} 12/25/2021 23:30:30 - INFO - codeparrot_training - Step 22811: {'lr': 0.0002857616768055865, 'samples': 11679744, 'steps': 22811, 'batch_loss/train': 0.7335493732243776} 12/25/2021 23:30:40 - INFO - codeparrot_training - Step 22812: {'lr': 0.0002857460757093085, 'samples': 11680256, 'steps': 22812, 'batch_loss/train': 0.7587293356773444} 12/25/2021 23:30:51 - INFO - codeparrot_training - Step 22813: {'lr': 0.0002857304744709176, 'samples': 11680768, 'steps': 22813, 'batch_loss/train': 0.8964983578771353} 12/25/2021 23:31:03 - INFO - codeparrot_training - Step 22814: {'lr': 0.0002857148730904758, 'samples': 11681280, 'steps': 22814, 'batch_loss/train': 0.7944794171489775} 12/25/2021 23:31:14 - INFO - codeparrot_training - Step 22815: {'lr': 0.00028569927156804517, 'samples': 11681792, 'steps': 22815, 'batch_loss/train': 0.7006518458947539} 12/25/2021 23:31:24 - 
INFO - codeparrot_training - Step 22816: {'lr': 0.00028568366990368776, 'samples': 11682304, 'steps': 22816, 'batch_loss/train': 0.7685105549171567}
12/25/2021 23:31:37 - INFO - codeparrot_training - Step 22817: {'lr': 0.00028566806809746543, 'samples': 11682816, 'steps': 22817, 'batch_loss/train': 0.7362245307303965}
12/25/2021 23:31:47 - INFO - codeparrot_training - Step 22818: {'lr': 0.00028565246614944045, 'samples': 11683328, 'steps': 22818, 'batch_loss/train': 0.7967825559899211}
12/25/2021 23:31:58 - INFO - codeparrot_training - Step 22819: {'lr': 0.0002856368640596747, 'samples': 11683840, 'steps': 22819, 'batch_loss/train': 1.0592094701714814}
12/25/2021 23:32:12 - INFO - codeparrot_training - Step 22820: {'lr': 0.0002856212618282302, 'samples': 11684352, 'steps': 22820, 'batch_loss/train': 0.7734560738317668}
12/25/2021 23:32:22 - INFO - codeparrot_training - Step 22821: {'lr': 0.00028560565945516905, 'samples': 11684864, 'steps': 22821, 'batch_loss/train': 0.7302447131369263}
12/25/2021 23:32:33 - INFO - codeparrot_training - Step 22822: {'lr': 0.00028559005694055325, 'samples': 11685376, 'steps': 22822, 'batch_loss/train': 0.851352158933878}
12/25/2021 23:32:43 - INFO - codeparrot_training - Step 22823: {'lr': 0.0002855744542844447, 'samples': 11685888, 'steps': 22823, 'batch_loss/train': 0.754659203113988}
12/25/2021 23:32:56 - INFO - codeparrot_training - Step 22824: {'lr': 0.00028555885148690577, 'samples': 11686400, 'steps': 22824, 'batch_loss/train': 0.6748933289200068}
12/25/2021 23:33:06 - INFO - codeparrot_training - Step 22825: {'lr': 0.0002855432485479981, 'samples': 11686912, 'steps': 22825, 'batch_loss/train': 0.7933426629751921}
12/25/2021 23:33:17 - INFO - codeparrot_training - Step 22826: {'lr': 0.0002855276454677839, 'samples': 11687424, 'steps': 22826, 'batch_loss/train': 0.8886462622322142}
12/25/2021 23:33:30 - INFO - codeparrot_training - Step 22827: {'lr': 0.0002855120422463253, 'samples': 11687936, 'steps': 22827, 'batch_loss/train': 0.828724472085014}
12/25/2021 23:33:41 - INFO - codeparrot_training - Step 22828: {'lr': 0.00028549643888368416, 'samples': 11688448, 'steps': 22828, 'batch_loss/train': 0.7817890704609454}
12/25/2021 23:33:52 - INFO - codeparrot_training - Step 22829: {'lr': 0.00028548083537992253, 'samples': 11688960, 'steps': 22829, 'batch_loss/train': 1.261810746975243}
12/25/2021 23:34:04 - INFO - codeparrot_training - Step 22830: {'lr': 0.00028546523173510256, 'samples': 11689472, 'steps': 22830, 'batch_loss/train': 0.8268581782467663}
12/25/2021 23:34:14 - INFO - codeparrot_training - Step 22831: {'lr': 0.00028544962794928633, 'samples': 11689984, 'steps': 22831, 'batch_loss/train': 0.8155853769276291}
12/25/2021 23:34:25 - INFO - codeparrot_training - Step 22832: {'lr': 0.0002854340240225356, 'samples': 11690496, 'steps': 22832, 'batch_loss/train': 0.8250502031296492}
12/25/2021 23:34:37 - INFO - codeparrot_training - Step 22833: {'lr': 0.00028541841995491265, 'samples': 11691008, 'steps': 22833, 'batch_loss/train': 0.789120402187109}
12/25/2021 23:34:48 - INFO - codeparrot_training - Step 22834: {'lr': 0.00028540281574647935, 'samples': 11691520, 'steps': 22834, 'batch_loss/train': 0.7054685952607542}
12/25/2021 23:34:59 - INFO - codeparrot_training - Step 22835: {'lr': 0.00028538721139729795, 'samples': 11692032, 'steps': 22835, 'batch_loss/train': 0.7220412730239332}
12/25/2021 23:35:09 - INFO - codeparrot_training - Step 22836: {'lr': 0.0002853716069074302, 'samples': 11692544, 'steps': 22836, 'batch_loss/train': 0.6199153647758067}
12/25/2021 23:35:23 - INFO - codeparrot_training - Step 22837: {'lr': 0.00028535600227693847, 'samples': 11693056, 'steps': 22837, 'batch_loss/train': 0.7100943038240075}
12/25/2021 23:35:34 - INFO - codeparrot_training - Step 22838: {'lr': 0.0002853403975058845, 'samples': 11693568, 'steps': 22838, 'batch_loss/train': 0.7754880990833044}
12/25/2021 23:35:45 - INFO - codeparrot_training - Step 22839: {'lr': 0.0002853247925943305, 'samples': 11694080, 'steps': 22839, 'batch_loss/train': 0.8560304753482342}
12/25/2021 23:35:57 - INFO - codeparrot_training - Step 22840: {'lr': 0.0002853091875423384, 'samples': 11694592, 'steps': 22840, 'batch_loss/train': 0.7796773426234722}
12/25/2021 23:36:07 - INFO - codeparrot_training - Step 22841: {'lr': 0.0002852935823499704, 'samples': 11695104, 'steps': 22841, 'batch_loss/train': 0.8569992417469621}
12/25/2021 23:36:18 - INFO - codeparrot_training - Step 22842: {'lr': 0.00028527797701728843, 'samples': 11695616, 'steps': 22842, 'batch_loss/train': 0.7797699933871627}
12/25/2021 23:36:32 - INFO - codeparrot_training - Step 22843: {'lr': 0.00028526237154435447, 'samples': 11696128, 'steps': 22843, 'batch_loss/train': 0.797282085288316}
12/25/2021 23:36:43 - INFO - codeparrot_training - Step 22844: {'lr': 0.0002852467659312307, 'samples': 11696640, 'steps': 22844, 'batch_loss/train': 0.6698739142157137}
12/25/2021 23:36:54 - INFO - codeparrot_training - Step 22845: {'lr': 0.0002852311601779791, 'samples': 11697152, 'steps': 22845, 'batch_loss/train': 0.699108007363975}
12/25/2021 23:37:04 - INFO - codeparrot_training - Step 22846: {'lr': 0.0002852155542846617, 'samples': 11697664, 'steps': 22846, 'batch_loss/train': 0.8513077525421977}
12/25/2021 23:37:16 - INFO - codeparrot_training - Step 22847: {'lr': 0.0002851999482513405, 'samples': 11698176, 'steps': 22847, 'batch_loss/train': 0.7541447784751654}
12/25/2021 23:37:27 - INFO - codeparrot_training - Step 22848: {'lr': 0.0002851843420780777, 'samples': 11698688, 'steps': 22848, 'batch_loss/train': 0.7592371702194214}
12/25/2021 23:37:37 - INFO - codeparrot_training - Step 22849: {'lr': 0.0002851687357649351, 'samples': 11699200, 'steps': 22849, 'batch_loss/train': 0.7755671320483088}
12/25/2021 23:37:50 - INFO - codeparrot_training - Step 22850: {'lr': 0.000285153129311975, 'samples': 11699712, 'steps': 22850, 'batch_loss/train': 0.7630519485101104}
12/25/2021 23:38:00 - INFO - codeparrot_training - Step 22851: {'lr': 0.0002851375227192593, 'samples': 11700224, 'steps': 22851, 'batch_loss/train': 0.8298108661547303}
12/25/2021 23:38:11 - INFO - codeparrot_training - Step 22852: {'lr': 0.00028512191598685, 'samples': 11700736, 'steps': 22852, 'batch_loss/train': 0.8012980381026864}
12/25/2021 23:38:25 - INFO - codeparrot_training - Step 22853: {'lr': 0.0002851063091148094, 'samples': 11701248, 'steps': 22853, 'batch_loss/train': 0.7396845412440598}
12/25/2021 23:38:35 - INFO - codeparrot_training - Step 22854: {'lr': 0.0002850907021031992, 'samples': 11701760, 'steps': 22854, 'batch_loss/train': 0.7829332505352795}
12/25/2021 23:38:46 - INFO - codeparrot_training - Step 22855: {'lr': 0.0002850750949520817, 'samples': 11702272, 'steps': 22855, 'batch_loss/train': 0.8051557894796133}
12/25/2021 23:38:56 - INFO - codeparrot_training - Step 22856: {'lr': 0.00028505948766151886, 'samples': 11702784, 'steps': 22856, 'batch_loss/train': 0.7204445693641901}
12/25/2021 23:39:09 - INFO - codeparrot_training - Step 22857: {'lr': 0.0002850438802315728, 'samples': 11703296, 'steps': 22857, 'batch_loss/train': 0.713272325694561}
12/25/2021 23:39:19 - INFO - codeparrot_training - Step 22858: {'lr': 0.00028502827266230544, 'samples': 11703808, 'steps': 22858, 'batch_loss/train': 0.8444036925211549}
12/25/2021 23:39:30 - INFO - codeparrot_training - Step 22859: {'lr': 0.0002850126649537788, 'samples': 11704320, 'steps': 22859, 'batch_loss/train': 0.797051535686478}
12/25/2021 23:39:42 - INFO - codeparrot_training - Step 22860: {'lr': 0.0002849970571060551, 'samples': 11704832, 'steps': 22860, 'batch_loss/train': 0.8727524606510997}
12/25/2021 23:39:53 - INFO - codeparrot_training - Step 22861: {'lr': 0.0002849814491191964, 'samples': 11705344, 'steps': 22861, 'batch_loss/train': 0.7844264460727572}
12/25/2021 23:40:03 - INFO - codeparrot_training - Step 22862: {'lr': 0.0002849658409932645, 'samples': 11705856, 'steps': 22862, 'batch_loss/train': 0.8551429905928671}
12/25/2021 23:40:17 - INFO - codeparrot_training - Step 22863: {'lr': 0.00028495023272832176, 'samples': 11706368, 'steps': 22863, 'batch_loss/train': 0.7664404367096722}
12/25/2021 23:40:28 - INFO - codeparrot_training - Step 22864: {'lr': 0.0002849346243244301, 'samples': 11706880, 'steps': 22864, 'batch_loss/train': 0.8327184887602925}
12/25/2021 23:40:38 - INFO - codeparrot_training - Step 22865: {'lr': 0.0002849190157816514, 'samples': 11707392, 'steps': 22865, 'batch_loss/train': 0.8036573766730726}
12/25/2021 23:40:49 - INFO - codeparrot_training - Step 22866: {'lr': 0.00028490340710004804, 'samples': 11707904, 'steps': 22866, 'batch_loss/train': 0.8225077427923679}
12/25/2021 23:41:01 - INFO - codeparrot_training - Step 22867: {'lr': 0.0002848877982796819, 'samples': 11708416, 'steps': 22867, 'batch_loss/train': 0.7708262247033417}
12/25/2021 23:41:11 - INFO - codeparrot_training - Step 22868: {'lr': 0.00028487218932061494, 'samples': 11708928, 'steps': 22868, 'batch_loss/train': 0.8558713525999337}
12/25/2021 23:41:22 - INFO - codeparrot_training - Step 22869: {'lr': 0.0002848565802229094, 'samples': 11709440, 'steps': 22869, 'batch_loss/train': 0.8383657851954922}
12/25/2021 23:41:36 - INFO - codeparrot_training - Step 22870: {'lr': 0.0002848409709866273, 'samples': 11709952, 'steps': 22870, 'batch_loss/train': 0.8722461238503456}
12/25/2021 23:41:47 - INFO - codeparrot_training - Step 22871: {'lr': 0.0002848253616118306, 'samples': 11710464, 'steps': 22871, 'batch_loss/train': 0.7663159286603332}
12/25/2021 23:41:57 - INFO - codeparrot_training - Step 22872: {'lr': 0.0002848097520985814, 'samples': 11710976, 'steps': 22872, 'batch_loss/train': 0.800886926241219}
12/25/2021 23:42:09 - INFO - codeparrot_training - Step 22873: {'lr': 0.0002847941424469417, 'samples': 11711488, 'steps': 22873, 'batch_loss/train': 0.8862506072036922}
12/25/2021 23:42:20 - INFO - codeparrot_training - Step 22874: {'lr': 0.0002847785326569737, 'samples': 11712000, 'steps': 22874, 'batch_loss/train': 0.8721833946183324}
12/25/2021 23:42:30 - INFO - codeparrot_training - Step 22875: {'lr': 0.0002847629227287394, 'samples': 11712512, 'steps': 22875, 'batch_loss/train': 0.8850316451862454}
12/25/2021 23:42:43 - INFO - codeparrot_training - Step 22876: {'lr': 0.00028474731266230085, 'samples': 11713024, 'steps': 22876, 'batch_loss/train': 0.8989844429306686}
12/25/2021 23:42:54 - INFO - codeparrot_training - Step 22877: {'lr': 0.0002847317024577202, 'samples': 11713536, 'steps': 22877, 'batch_loss/train': 0.7187798428349197}
12/25/2021 23:43:04 - INFO - codeparrot_training - Step 22878: {'lr': 0.00028471609211505925, 'samples': 11714048, 'steps': 22878, 'batch_loss/train': 0.7799412766471505}
12/25/2021 23:43:15 - INFO - codeparrot_training - Step 22879: {'lr': 0.0002847004816343803, 'samples': 11714560, 'steps': 22879, 'batch_loss/train': 1.2420529609080404}
12/25/2021 23:43:27 - INFO - codeparrot_training - Step 22880: {'lr': 0.0002846848710157453, 'samples': 11715072, 'steps': 22880, 'batch_loss/train': 0.7525795958936214}
12/25/2021 23:43:38 - INFO - codeparrot_training - Step 22881: {'lr': 0.00028466926025921634, 'samples': 11715584, 'steps': 22881, 'batch_loss/train': 0.734275178052485}
12/25/2021 23:43:48 - INFO - codeparrot_training - Step 22882: {'lr': 0.00028465364936485556, 'samples': 11716096, 'steps': 22882, 'batch_loss/train': 0.7166895202826709}
12/25/2021 23:44:02 - INFO - codeparrot_training - Step 22883: {'lr': 0.00028463803833272494, 'samples': 11716608, 'steps': 22883, 'batch_loss/train': 0.8020922080613673}
12/25/2021 23:44:13 - INFO - codeparrot_training - Step 22884: {'lr': 0.00028462242716288655, 'samples': 11717120, 'steps': 22884, 'batch_loss/train': 0.7038993565365672}
12/25/2021 23:44:23 - INFO - codeparrot_training - Step 22885: {'lr': 0.00028460681585540246, 'samples': 11717632, 'steps': 22885, 'batch_loss/train': 0.8348811436444521}
12/25/2021 23:44:35 - INFO - codeparrot_training - Step 22886: {'lr': 0.0002845912044103347, 'samples': 11718144, 'steps': 22886, 'batch_loss/train': 0.8855151706375182}
12/25/2021 23:44:46 - INFO - codeparrot_training - Step 22887: {'lr': 0.00028457559282774535, 'samples': 11718656, 'steps': 22887, 'batch_loss/train': 0.860087919048965}
12/25/2021 23:44:57 - INFO - codeparrot_training - Step 22888: {'lr': 0.0002845599811076966, 'samples': 11719168, 'steps': 22888, 'batch_loss/train': 0.8303109752014279}
12/25/2021 23:45:07 - INFO - codeparrot_training - Step 22889: {'lr': 0.00028454436925025035, 'samples': 11719680, 'steps': 22889, 'batch_loss/train': 0.8711606815923005}
12/25/2021 23:45:19 - INFO - codeparrot_training - Step 22890: {'lr': 0.00028452875725546876, 'samples': 11720192, 'steps': 22890, 'batch_loss/train': 0.8616287773475051}
12/25/2021 23:45:30 - INFO - codeparrot_training - Step 22891: {'lr': 0.0002845131451234138, 'samples': 11720704, 'steps': 22891, 'batch_loss/train': 0.8392760576680303}
12/25/2021 23:45:41 - INFO - codeparrot_training - Step 22892: {'lr': 0.0002844975328541477, 'samples': 11721216, 'steps': 22892, 'batch_loss/train': 0.8179442780092359}
12/25/2021 23:45:54 - INFO - codeparrot_training - Step 22893: {'lr': 0.0002844819204477324, 'samples': 11721728, 'steps': 22893, 'batch_loss/train': 0.8015766181051731}
12/25/2021 23:46:05 - INFO - codeparrot_training - Step 22894: {'lr': 0.00028446630790423, 'samples': 11722240, 'steps': 22894, 'batch_loss/train': 0.7619913143571466}
12/25/2021 23:46:16 - INFO - codeparrot_training - Step 22895: {'lr': 0.00028445069522370256, 'samples': 11722752, 'steps': 22895, 'batch_loss/train': 0.7233127481304109}
12/25/2021 23:46:28 - INFO - codeparrot_training - Step 22896: {'lr': 0.0002844350824062122, 'samples': 11723264, 'steps': 22896, 'batch_loss/train': 0.748143513686955}
12/25/2021 23:46:38 - INFO - codeparrot_training - Step 22897: {'lr': 0.0002844194694518209, 'samples': 11723776, 'steps': 22897, 'batch_loss/train': 0.7062367564067245}
12/25/2021 23:46:49 - INFO - codeparrot_training - Step 22898: {'lr': 0.00028440385636059086, 'samples': 11724288, 'steps': 22898, 'batch_loss/train': 0.7911274712532759}
12/25/2021 23:46:59 - INFO - codeparrot_training - Step 22899: {'lr': 0.000284388243132584, 'samples': 11724800, 'steps': 22899, 'batch_loss/train': 0.8318849038332701}
12/25/2021 23:47:13 - INFO - codeparrot_training - Step 22900: {'lr': 0.0002843726297678625, 'samples': 11725312, 'steps': 22900, 'batch_loss/train': 0.9266887200064957}
12/25/2021 23:47:24 - INFO - codeparrot_training - Step 22901: {'lr': 0.00028435701626648843, 'samples': 11725824, 'steps': 22901, 'batch_loss/train': 0.8460259446874261}
12/25/2021 23:47:35 - INFO - codeparrot_training - Step 22902: {'lr': 0.00028434140262852375, 'samples': 11726336, 'steps': 22902, 'batch_loss/train': 0.8430066267028451}
12/25/2021 23:47:47 - INFO - codeparrot_training - Step 22903: {'lr': 0.00028432578885403075, 'samples': 11726848, 'steps': 22903, 'batch_loss/train': 0.7824923880398273}
12/25/2021 23:47:57 - INFO - codeparrot_training - Step 22904: {'lr': 0.0002843101749430713, 'samples': 11727360, 'steps': 22904, 'batch_loss/train': 0.7654899153858423}
12/25/2021 23:48:08 - INFO - codeparrot_training - Step 22905: {'lr': 0.0002842945608957075, 'samples': 11727872, 'steps': 22905, 'batch_loss/train': 0.7745076101273298}
12/25/2021 23:48:20 - INFO - codeparrot_training - Step 22906: {'lr': 0.00028427894671200155, 'samples': 11728384, 'steps': 22906, 'batch_loss/train': 0.82650550827384}
12/25/2021 23:48:31 - INFO - codeparrot_training - Step 22907: {'lr': 0.0002842633323920155, 'samples': 11728896, 'steps': 22907, 'batch_loss/train': 0.8656845698133111}
12/25/2021 23:48:41 - INFO - codeparrot_training - Step 22908: {'lr': 0.0002842477179358113, 'samples': 11729408, 'steps': 22908, 'batch_loss/train': 0.7980579948052764}
12/25/2021 23:48:54 - INFO - codeparrot_training - Step 22909: {'lr': 0.00028423210334345113, 'samples': 11729920, 'steps': 22909, 'batch_loss/train': 0.6985888709314167}
12/25/2021 23:49:05 - INFO - codeparrot_training - Step 22910: {'lr': 0.00028421648861499694, 'samples': 11730432, 'steps': 22910, 'batch_loss/train': 0.8217173619195819}
12/25/2021 23:49:15 - INFO - codeparrot_training - Step 22911: {'lr': 0.0002842008737505111, 'samples': 11730944, 'steps': 22911, 'batch_loss/train': 0.9165426855906844}
12/25/2021 23:49:26 - INFO - codeparrot_training - Step 22912: {'lr': 0.00028418525875005535, 'samples': 11731456, 'steps': 22912, 'batch_loss/train': 0.9582013799808919}
12/25/2021 23:49:40 - INFO - codeparrot_training - Step 22913: {'lr': 0.000284169643613692, 'samples': 11731968, 'steps': 22913, 'batch_loss/train': 0.8806985961273313}
12/25/2021 23:49:50 - INFO - codeparrot_training - Step 22914: {'lr': 0.00028415402834148297, 'samples': 11732480, 'steps': 22914, 'batch_loss/train': 0.7500365898013115}
12/25/2021 23:50:01 - INFO - codeparrot_training - Step 22915: {'lr': 0.0002841384129334905, 'samples': 11732992, 'steps': 22915, 'batch_loss/train': 0.7894020490348339}
12/25/2021 23:50:13 - INFO - codeparrot_training - Step 22916: {'lr': 0.00028412279738977654, 'samples': 11733504, 'steps': 22916, 'batch_loss/train': 0.8459528433158994}
12/25/2021 23:50:24 - INFO - codeparrot_training - Step 22917: {'lr': 0.00028410718171040315, 'samples': 11734016, 'steps': 22917, 'batch_loss/train': 0.7578030589502305}
12/25/2021 23:50:34 - INFO - codeparrot_training - Step 22918: {'lr': 0.00028409156589543266, 'samples': 11734528, 'steps': 22918, 'batch_loss/train': 0.7767371307127178}
12/25/2021 23:50:45 - INFO - codeparrot_training - Step 22919: {'lr': 0.00028407594994492687, 'samples': 11735040, 'steps': 22919, 'batch_loss/train': 0.9375887177884579}
12/25/2021 23:50:57 - INFO - codeparrot_training - Step 22920: {'lr': 0.00028406033385894796, 'samples': 11735552, 'steps': 22920, 'batch_loss/train': 1.094339010247495}
12/25/2021 23:51:08 - INFO - codeparrot_training - Step 22921: {'lr': 0.00028404471763755805, 'samples': 11736064, 'steps': 22921, 'batch_loss/train': 0.8608358651399612}
12/25/2021 23:51:19 - INFO - codeparrot_training - Step 22922: {'lr': 0.0002840291012808193, 'samples': 11736576, 'steps': 22922, 'batch_loss/train': 0.7489016959443688}
12/25/2021 23:51:33 - INFO - codeparrot_training - Step 22923: {'lr': 0.00028401348478879353, 'samples': 11737088, 'steps': 22923, 'batch_loss/train': 0.7159366857958958}
12/25/2021 23:51:43 - INFO - codeparrot_training - Step 22924: {'lr': 0.00028399786816154306, 'samples': 11737600, 'steps': 22924, 'batch_loss/train': 0.7821020744740963}
12/25/2021 23:51:54 - INFO - codeparrot_training - Step 22925: {'lr': 0.00028398225139912987, 'samples': 11738112, 'steps': 22925, 'batch_loss/train': 0.7396532204002142}
12/25/2021 23:52:06 - INFO - codeparrot_training - Step 22926: {'lr': 0.0002839666345016161, 'samples': 11738624, 'steps': 22926, 'batch_loss/train': 0.7715218206867576}
12/25/2021 23:52:17 - INFO - codeparrot_training - Step 22927: {'lr': 0.0002839510174690638, 'samples': 11739136, 'steps': 22927, 'batch_loss/train': 0.7118120633531362}
12/25/2021 23:52:27 - INFO - codeparrot_training - Step 22928: {'lr': 0.0002839354003015351, 'samples': 11739648, 'steps': 22928, 'batch_loss/train': 0.8438824880868196}
12/25/2021 23:52:38 - INFO - codeparrot_training - Step 22929: {'lr': 0.0002839197829990921, 'samples': 11740160, 'steps': 22929, 'batch_loss/train': 0.7999511635862291}
12/25/2021 23:52:52 - INFO - codeparrot_training - Step 22930: {'lr': 0.0002839041655617968, 'samples': 11740672, 'steps': 22930, 'batch_loss/train': 0.739462866447866}
12/25/2021 23:53:03 - INFO - codeparrot_training - Step 22931: {'lr': 0.0002838885479897113, 'samples': 11741184, 'steps': 22931, 'batch_loss/train': 0.6296893493272364}
12/25/2021 23:53:13 - INFO - codeparrot_training - Step 22932: {'lr': 0.0002838729302828978, 'samples': 11741696, 'steps': 22932, 'batch_loss/train': 0.679339271504432}
12/25/2021 23:53:26 - INFO - codeparrot_training - Step 22933: {'lr': 0.00028385731244141827, 'samples': 11742208, 'steps': 22933, 'batch_loss/train': 0.802637133281678}
12/25/2021 23:53:37 - INFO - codeparrot_training - Step 22934: {'lr': 0.00028384169446533486, 'samples': 11742720, 'steps': 22934, 'batch_loss/train': 0.6242829756811261}
12/25/2021 23:53:47 - INFO - codeparrot_training - Step 22935: {'lr': 0.0002838260763547097, 'samples': 11743232, 'steps': 22935, 'batch_loss/train': 0.7090091602876782}
12/25/2021 23:53:58 - INFO - codeparrot_training - Step 22936: {'lr': 0.0002838104581096048, 'samples': 11743744, 'steps': 22936, 'batch_loss/train': 0.8377788234502077}
12/25/2021 23:54:10 - INFO - codeparrot_training - Step 22937: {'lr': 0.0002837948397300823, 'samples': 11744256, 'steps': 22937, 'batch_loss/train': 0.7838440579362214}
12/25/2021 23:54:21 - INFO - codeparrot_training - Step 22938: {'lr': 0.00028377922121620427, 'samples': 11744768, 'steps': 22938, 'batch_loss/train': 0.8141137501224875}
12/25/2021 23:54:31 - INFO - codeparrot_training - Step 22939: {'lr': 0.0002837636025680327, 'samples': 11745280, 'steps': 22939, 'batch_loss/train': 0.850969797000289}
12/25/2021 23:54:45 - INFO - codeparrot_training - Step 22940: {'lr': 0.00028374798378562994, 'samples': 11745792, 'steps': 22940, 'batch_loss/train': 0.9203365631401539}
12/25/2021 23:54:56 - INFO - codeparrot_training - Step 22941: {'lr': 0.00028373236486905785, 'samples': 11746304, 'steps': 22941, 'batch_loss/train': 0.7896318868733943}
12/25/2021 23:55:07 - INFO - codeparrot_training - Step 22942: {'lr': 0.0002837167458183787, 'samples': 11746816, 'steps': 22942, 'batch_loss/train': 0.7848297134041786}
12/25/2021 23:55:19 - INFO - codeparrot_training - Step 22943: {'lr': 0.00028370112663365444, 'samples': 11747328, 'steps': 22943, 'batch_loss/train': 0.8391597541049123}
12/25/2021 23:55:30 - INFO - codeparrot_training - Step 22944: {'lr': 0.00028368550731494726, 'samples': 11747840, 'steps': 22944, 'batch_loss/train': 0.7871019118465483}
12/25/2021 23:55:40 - INFO - codeparrot_training - Step 22945: {'lr': 0.00028366988786231916, 'samples': 11748352, 'steps': 22945, 'batch_loss/train': 0.8347399462945759}
12/25/2021 23:55:51 - INFO - codeparrot_training - Step 22946: {'lr': 0.0002836542682758324, 'samples': 11748864, 'steps': 22946, 'batch_loss/train': 0.7068718745140359}
12/25/2021 23:56:03 - INFO - codeparrot_training - Step 22947: {'lr': 0.0002836386485555489, 'samples': 11749376, 'steps': 22947, 'batch_loss/train': 0.659879497892689}
12/25/2021 23:56:14 - INFO - codeparrot_training - Step 22948: {'lr': 0.0002836230287015309, 'samples': 11749888, 'steps': 22948, 'batch_loss/train': 0.8015023544430733}
12/25/2021 23:56:24 - INFO - codeparrot_training - Step 22949: {'lr': 0.00028360740871384034, 'samples': 11750400, 'steps': 22949, 'batch_loss/train': 0.7505411390447989}
12/25/2021 23:56:36 - INFO - codeparrot_training - Step 22950: {'lr': 0.00028359178859253944, 'samples': 11750912, 'steps': 22950, 'batch_loss/train': 0.737059417180717}
12/25/2021 23:56:47 - INFO - codeparrot_training - Step 22951: {'lr': 0.00028357616833769037, 'samples': 11751424, 'steps': 22951, 'batch_loss/train': 0.9390788078308105}
12/25/2021 23:56:57 - INFO - codeparrot_training - Step 22952: {'lr': 0.000283560547949355, 'samples': 11751936, 'steps': 22952, 'batch_loss/train': 0.9165498630609363}
12/25/2021 23:57:11 - INFO - codeparrot_training - Step 22953: {'lr': 0.00028354492742759565, 'samples': 11752448, 'steps': 22953, 'batch_loss/train': 0.7771570570766926}
12/25/2021 23:57:22 - INFO - codeparrot_training - Step 22954: {'lr': 0.00028352930677247426, 'samples': 11752960, 'steps': 22954, 'batch_loss/train': 0.8430391508154571}
12/25/2021 23:57:32 - INFO - codeparrot_training - Step 22955: {'lr': 0.00028351368598405303, 'samples': 11753472, 'steps': 22955, 'batch_loss/train': 0.7241167858010158}
12/25/2021 23:57:44 - INFO - codeparrot_training - Step 22956: {'lr': 0.0002834980650623941, 'samples': 11753984, 'steps': 22956, 'batch_loss/train': 0.7215806343592703}
12/25/2021 23:57:55 - INFO - codeparrot_training - Step 22957: {'lr': 0.0002834824440075595, 'samples': 11754496, 'steps': 22957, 'batch_loss/train': 0.718522178940475}
12/25/2021 23:58:06 - INFO - codeparrot_training - Step 22958: {'lr': 0.0002834668228196113, 'samples': 11755008, 'steps': 22958, 'batch_loss/train': 0.8135932441800833}
12/25/2021 23:58:16 - INFO - codeparrot_training - Step 22959: {'lr': 0.0002834512014986116, 'samples': 11755520, 'steps': 22959, 'batch_loss/train': 0.7395861772820354}
12/25/2021 23:58:30 - INFO - codeparrot_training - Step 22960: {'lr': 0.00028343558004462264, 'samples': 11756032, 'steps': 22960, 'batch_loss/train': 0.8317090580239892}
12/25/2021 23:58:41 - INFO - codeparrot_training - Step 22961: {'lr': 0.00028341995845770646, 'samples': 11756544, 'steps': 22961, 'batch_loss/train': 0.7670582295395434}
12/25/2021 23:58:52 - INFO - codeparrot_training - Step 22962: {'lr': 0.00028340433673792505, 'samples': 11757056, 'steps': 22962, 'batch_loss/train': 0.7545466567389667}
12/25/2021 23:59:04 - INFO - codeparrot_training - Step 22963: {'lr': 0.0002833887148853407, 'samples': 11757568, 'steps': 22963, 'batch_loss/train': 0.7849928084760904}
12/25/2021 23:59:14 - INFO - codeparrot_training - Step 22964: {'lr': 0.0002833730929000154, 'samples': 11758080, 'steps': 22964, 'batch_loss/train': 0.7079033984337002}
12/25/2021 23:59:25 - INFO - codeparrot_training - Step 22965: {'lr': 0.00028335747078201117, 'samples': 11758592, 'steps': 22965, 'batch_loss/train': 0.6855805818340741}
12/25/2021 23:59:37 - INFO - codeparrot_training - Step 22966: {'lr': 0.0002833418485313904, 'samples': 11759104, 'steps': 22966, 'batch_loss/train': 0.8904683521250263}
12/25/2021 23:59:48 - INFO - codeparrot_training - Step 22967: {'lr': 0.000283326226148215, 'samples': 11759616, 'steps': 22967, 'batch_loss/train': 0.7369827162474394}
12/25/2021 23:59:58 - INFO - codeparrot_training - Step 22968: {'lr': 0.000283310603632547, 'samples': 11760128, 'steps': 22968, 'batch_loss/train': 0.9093068391084671}
12/26/2021 00:00:09 - INFO - codeparrot_training - Step 22969: {'lr': 0.00028329498098444874, 'samples': 11760640, 'steps': 22969, 'batch_loss/train': 0.8137955814599991}
12/26/2021 00:00:23 - INFO - codeparrot_training - Step 22970: {'lr': 0.00028327935820398215, 'samples': 11761152, 'steps': 22970, 'batch_loss/train': 0.7489854285959154}
12/26/2021 00:00:33 - INFO - codeparrot_training - Step 22971: {'lr': 0.0002832637352912094, 'samples': 11761664, 'steps': 22971, 'batch_loss/train': 0.7223568921908736}
12/26/2021 00:00:44 - INFO - codeparrot_training - Step 22972: {'lr': 0.00028324811224619264, 'samples': 11762176, 'steps': 22972, 'batch_loss/train': 0.7654169127345085}
12/26/2021 00:00:56 - INFO - codeparrot_training - Step 22973: {'lr': 0.00028323248906899387, 'samples': 11762688, 'steps': 22973, 'batch_loss/train': 0.8378715440630913}
12/26/2021 00:01:07 - INFO - codeparrot_training - Step 22974: {'lr': 0.0002832168657596753, 'samples': 11763200, 'steps': 22974, 'batch_loss/train': 0.912125782109797}
12/26/2021 00:01:17 - INFO - codeparrot_training - Step 22975: {'lr': 0.000283201242318299, 'samples': 11763712, 'steps': 22975, 'batch_loss/train': 0.7623408623039722}
12/26/2021 00:01:31 - INFO - codeparrot_training - Step 22976: {'lr': 0.0002831856187449271, 'samples': 11764224, 'steps': 22976, 'batch_loss/train': 0.7644222230883315}
12/26/2021 00:01:42 - INFO - codeparrot_training - Step 22977: {'lr': 0.00028316999503962177, 'samples': 11764736, 'steps': 22977, 'batch_loss/train': 0.8176748901605606}
12/26/2021 00:01:53 - INFO - codeparrot_training - Step 22978: {'lr': 0.00028315437120244496, 'samples': 11765248, 'steps': 22978, 'batch_loss/train': 0.7761293087387457}
12/26/2021 00:02:03 - INFO - codeparrot_training - Step 22979: {'lr': 0.000283138747233459, 'samples': 11765760, 'steps': 22979, 'batch_loss/train': 0.7632299195975065}
12/26/2021 00:02:15 - INFO - codeparrot_training - Step 22980: {'lr': 0.00028312312313272584, 'samples': 11766272, 'steps': 22980, 'batch_loss/train': 0.8629665821790695}
12/26/2021 00:02:26 - INFO - codeparrot_training - Step 22981: {'lr': 0.0002831074989003076, 'samples': 11766784, 'steps': 22981, 'batch_loss/train': 0.8316745469346642}
12/26/2021 00:02:36 - INFO - codeparrot_training - Step 22982: {'lr': 0.0002830918745362665, 'samples': 11767296, 'steps': 22982, 'batch_loss/train': 0.7477247240021825}
12/26/2021 00:02:48 - INFO - codeparrot_training - Step 22983: {'lr': 0.00028307625004066466, 'samples': 11767808, 'steps': 22983, 'batch_loss/train': 0.8248623069375753}
12/26/2021 00:02:59 - INFO - codeparrot_training - Step 22984: {'lr': 0.00028306062541356405, 'samples': 11768320, 'steps': 22984, 'batch_loss/train': 0.8748845001682639}
12/26/2021 00:03:10 - INFO - codeparrot_training - Step 22985: {'lr': 0.0002830450006550269, 'samples': 11768832, 'steps': 22985, 'batch_loss/train': 0.7559065241366625}
12/26/2021 00:03:24 - INFO - codeparrot_training - Step 22986: {'lr': 0.0002830293757651153, 'samples': 11769344, 'steps': 22986, 'batch_loss/train': 0.8359022974036634}
12/26/2021 00:03:34 - INFO - codeparrot_training - Step 22987: {'lr': 0.00028301375074389135, 'samples': 11769856, 'steps': 22987, 'batch_loss/train': 0.631339572602883}
12/26/2021 00:03:45 - INFO - codeparrot_training - Step 22988: {'lr': 0.0002829981255914172, 'samples': 11770368, 'steps': 22988, 'batch_loss/train': 0.7706364756450057}
12/26/2021 00:03:57 - INFO - codeparrot_training - Step 22989: {'lr': 0.000282982500307755, 'samples': 11770880, 'steps': 22989, 'batch_loss/train': 0.679085505194962}
12/26/2021 00:04:08 - INFO - codeparrot_training - Step 22990: {'lr': 0.0002829668748929669, 'samples': 11771392, 'steps': 22990, 'batch_loss/train': 1.0074464809149504}
12/26/2021 00:04:18 - INFO - codeparrot_training - Step 22991: {'lr': 0.0002829512493471148, 'samples': 11771904, 'steps': 22991, 'batch_loss/train': 1.316607205895707}
12/26/2021 00:04:29 - INFO - codeparrot_training - Step 22992: {'lr': 0.0002829356236702611, 'samples': 11772416, 'steps': 22992, 'batch_loss/train': 0.6880015020724386}
12/26/2021 00:04:41 - INFO - codeparrot_training - Step 22993: {'lr': 0.0002829199978624677, 'samples': 11772928, 'steps': 22993, 'batch_loss/train': 0.8148300684988499}
12/26/2021 00:04:52 - INFO - codeparrot_training - Step 22994: {'lr': 0.0002829043719237969, 'samples': 11773440, 'steps': 22994, 'batch_loss/train': 0.7312390860170126}
12/26/2021 00:05:02 - INFO - codeparrot_training - Step 22995: {'lr': 0.0002828887458543107, 'samples': 11773952, 'steps': 22995, 'batch_loss/train': 0.5974282054230571}
12/26/2021 00:05:15 - INFO - codeparrot_training - Step 22996: {'lr': 0.00028287311965407127, 'samples': 11774464, 'steps': 22996, 'batch_loss/train': 0.7724219285883009}
12/26/2021 00:05:25 - INFO - codeparrot_training - Step 22997: {'lr': 0.0002828574933231407, 'samples': 11774976, 'steps': 22997, 'batch_loss/train': 0.7799809414427727}
12/26/2021 00:05:36 - INFO - codeparrot_training - Step 22998: {'lr': 0.0002828418668615812, 'samples': 11775488, 'steps': 22998, 'batch_loss/train': 0.7906396677717566}
12/26/2021 00:05:47 - INFO - codeparrot_training - Step 22999: {'lr': 0.0002828262402694548, 'samples': 11776000, 'steps': 22999, 'batch_loss/train': 0.7981626498512924}
12/26/2021 00:06:02 - INFO - codeparrot_training - Step 23000: {'lr': 0.00028281061354682363, 'samples': 11776512, 'steps': 23000, 'batch_loss/train': 0.76317322649993}
12/26/2021 00:06:12 - INFO - codeparrot_training - Step 23001: {'lr': 0.00028279498669374993, 'samples': 11777024, 'steps': 23001, 'batch_loss/train': 0.6343052922748029}
12/26/2021 00:06:23 - INFO - codeparrot_training - Step 23002: {'lr': 0.00028277935971029566, 'samples': 11777536, 'steps': 23002, 'batch_loss/train': 0.7760816793888807}
12/26/2021 00:06:35 - INFO - codeparrot_training - Step 23003: {'lr': 0.0002827637325965231, 'samples': 11778048, 'steps': 23003, 'batch_loss/train': 1.4349511521868408}
12/26/2021 00:06:46 - INFO - codeparrot_training - Step 23004: {'lr': 0.00028274810535249424, 'samples': 11778560, 'steps': 23004, 'batch_loss/train': 0.8349822014570236}
12/26/2021 00:06:56 - INFO - codeparrot_training - Step 23005: {'lr': 0.0002827324779782714, 'samples': 11779072, 'steps': 23005, 'batch_loss/train': 0.8200382590293884}
12/26/2021 00:07:07 - INFO - codeparrot_training - Step 23006: {'lr': 0.00028271685047391647, 'samples': 11779584, 'steps': 23006, 'batch_loss/train': 0.8470713701099157}
12/26/2021 00:07:19 - INFO - codeparrot_training - Step 23007: {'lr': 0.0002827012228394917, 'samples': 11780096, 'steps': 23007, 'batch_loss/train': 0.7588253091089427}
12/26/2021 00:07:30 - INFO - codeparrot_training - Step 23008: {'lr': 0.0002826855950750593, 'samples': 11780608, 'steps': 23008, 'batch_loss/train': 0.7727799946442246}
12/26/2021 00:07:40 - INFO - codeparrot_training - Step 23009: {'lr': 0.00028266996718068134, 'samples': 11781120, 'steps': 23009, 'batch_loss/train': 0.7396661597304046}
12/26/2021 00:07:54 - INFO - codeparrot_training - Step 23010: {'lr': 0.0002826543391564198, 'samples': 11781632, 'steps': 23010, 'batch_loss/train': 0.811903472058475}
12/26/2021 00:08:05 - INFO - codeparrot_training - Step 23011: {'lr': 0.00028263871100233696, 'samples': 11782144, 'steps': 23011, 'batch_loss/train': 0.7978397245751694}
12/26/2021 00:08:15 - INFO - codeparrot_training - Step 23012: {'lr': 0.000282623082718495, 'samples': 11782656, 'steps': 23012, 'batch_loss/train': 0.6889348243130371}
12/26/2021 00:08:27 - INFO - codeparrot_training - Step 23013: {'lr': 0.00028260745430495595, 'samples': 11783168, 'steps': 23013, 'batch_loss/train': 0.6936791655607522}
12/26/2021 00:08:38 - INFO - codeparrot_training - Step 23014: {'lr': 0.00028259182576178193, 'samples': 11783680, 'steps': 23014, 'batch_loss/train': 0.8502820217981935}
12/26/2021 00:08:49 - INFO - codeparrot_training - Step 23015: {'lr': 0.00028257619708903513, 'samples': 11784192, 'steps': 23015, 'batch_loss/train': 0.8573304419405758}
12/26/2021 00:08:59 - INFO - codeparrot_training - Step 23016: {'lr': 0.0002825605682867778, 'samples': 11784704, 'steps': 23016, 'batch_loss/train': 0.8447695514187217}
12/26/2021 00:09:14 - INFO - codeparrot_training - Step 23017: {'lr': 0.0002825449393550717, 'samples': 11785216, 'steps': 23017, 'batch_loss/train': 0.8260899679735303}
12/26/2021 00:09:24 - INFO - codeparrot_training - Step 23018: {'lr': 0.0002825293102939794, 'samples': 11785728, 'steps': 23018, 'batch_loss/train': 0.7567014172673225}
12/26/2021 00:09:35 - INFO - codeparrot_training - Step 23019: {'lr': 0.00028251368110356284, 'samples': 11786240, 'steps': 23019, 'batch_loss/train': 0.5757274585776031}
12/26/2021 00:09:47 - INFO - codeparrot_training - Step 23020: {'lr': 0.0002824980517838841, 'samples': 11786752, 'steps': 23020, 'batch_loss/train': 0.8035537851974368}
12/26/2021 00:09:58 - INFO - codeparrot_training - Step 23021: {'lr': 0.00028248242233500537, 'samples': 11787264, 'steps': 23021, 'batch_loss/train': 0.5816692975349724}
12/26/2021 00:10:08 - INFO - codeparrot_training - Step 23022: {'lr': 0.0002824667927569889, 'samples': 11787776, 'steps': 23022, 'batch_loss/train': 0.7676068586297333}
12/26/2021 00:10:22 - INFO - codeparrot_training - Step 23023: {'lr': 0.0002824511630498965, 'samples': 11788288, 'steps': 23023, 'batch_loss/train': 0.8211015923880041}
12/26/2021 00:10:33 - INFO - codeparrot_training - Step 23024: {'lr': 0.0002824355332137907, 'samples': 11788800, 'steps': 23024, 'batch_loss/train': 0.9350028410553932}
12/26/2021 00:10:43 - INFO - codeparrot_training - Step 23025: {'lr': 0.00028241990324873347, 'samples': 11789312, 'steps': 23025, 'batch_loss/train': 1.678536046296358}
12/26/2021 00:10:54 - INFO - codeparrot_training - Step 23026: {'lr': 0.0002824042731547869, 'samples': 11789824, 'steps': 23026, 'batch_loss/train': 0.7874395148828626}
12/26/2021 00:11:06 - INFO - codeparrot_training - Step 23027: {'lr': 0.00028238864293201314, 'samples': 11790336, 'steps': 23027, 'batch_loss/train': 0.8274010326713324}
12/26/2021 00:11:17 - INFO - codeparrot_training - Step 23028: {'lr': 0.0002823730125804744, 'samples': 11790848, 'steps': 23028, 'batch_loss/train': 0.8275850731879473}
12/26/2021 00:11:28 - INFO - codeparrot_training - Step 23029: {'lr': 0.00028235738210023286, 'samples': 11791360, 'steps': 23029, 'batch_loss/train': 0.6912427898496389}
12/26/2021 00:11:40 - INFO - codeparrot_training - Step 23030: {'lr': 0.0002823417514913505, 'samples': 11791872, 'steps': 23030, 'batch_loss/train': 0.8752988800406456}
12/26/2021 00:11:50 - INFO - codeparrot_training - Step 23031: {'lr': 0.00028232612075388964, 'samples': 11792384, 'steps': 23031, 'batch_loss/train': 0.753049731021747}
12/26/2021 00:12:01 - INFO - codeparrot_training - Step 23032: {'lr': 0.0002823104898879122, 'samples': 11792896, 'steps': 23032, 'batch_loss/train': 0.7521274808095768}
12/26/2021 00:12:13 - INFO - codeparrot_training - Step 23033: {'lr': 0.0002822948588934805, 'samples': 11793408, 'steps': 23033, 'batch_loss/train': 0.825523647479713}
12/26/2021 00:12:24 - INFO - codeparrot_training - Step 23034: {'lr': 0.0002822792277706567, 'samples': 11793920, 'steps': 23034, 'batch_loss/train': 0.7428838389459997}
12/26/2021 00:12:34 - INFO - codeparrot_training - Step 23035: {'lr': 0.00028226359651950284, 'samples': 11794432, 'steps': 23035, 'batch_loss/train': 0.8507320052012801}
12/26/2021 00:12:46 - INFO - codeparrot_training - Step 23036: {'lr': 0.000282247965140081, 'samples': 11794944, 'steps': 23036, 'batch_loss/train': 0.7414539975579828}
12/26/2021 00:12:57 - INFO - codeparrot_training - Step 23037: {'lr': 0.0002822323336324535, 'samples': 11795456, 'steps': 23037, 'batch_loss/train': 0.7556518245255575}
12/26/2021 00:13:08 - INFO - codeparrot_training - Step 23038: {'lr': 0.0002822167019966825, 'samples': 11795968, 'steps': 23038, 'batch_loss/train': 0.8345126267522573}
12/26/2021 00:13:18 - INFO - codeparrot_training - Step 23039: {'lr': 0.00028220107023282996, 'samples': 11796480, 'steps': 23039, 'batch_loss/train': 0.8000642145052552}
12/26/2021 00:13:32 - INFO - codeparrot_training - Step 23040: {'lr': 0.00028218543834095816, 'samples': 11796992, 'steps': 23040, 'batch_loss/train': 0.8521457482129335}
12/26/2021 00:13:43 - INFO - codeparrot_training - Step 23041: {'lr': 0.00028216980632112914, 'samples': 11797504, 'steps': 23041, 'batch_loss/train': 0.7961190894711763}
12/26/2021 00:13:53 - INFO - codeparrot_training - Step 23042: {'lr': 0.00028215417417340525, 'samples': 11798016, 'steps': 23042, 'batch_loss/train': 0.7750822314992547}
12/26/2021 00:14:05 - INFO - codeparrot_training - Step 23043: {'lr': 0.0002821385418978484, 'samples': 11798528, 'steps': 23043, 'batch_loss/train': 0.7943742126226425}
12/26/2021 00:14:16 - INFO - codeparrot_training - Step 23044: {'lr': 0.0002821229094945209, 'samples': 11799040, 'steps': 23044, 'batch_loss/train': 0.8087818305939436}
12/26/2021 00:14:27 - INFO - codeparrot_training - Step 23045: {'lr': 0.00028210727696348484, 'samples': 11799552, 'steps': 23045, 'batch_loss/train': 0.7704830523580313}
12/26/2021 00:14:41 - INFO - codeparrot_training - Step 23046: {'lr': 0.00028209164430480227, 'samples': 11800064, 'steps': 23046, 'batch_loss/train': 0.768349064164795}
12/26/2021 00:14:51 - INFO - codeparrot_training - Step 23047: {'lr': 0.0002820760115185355, 'samples': 11800576, 'steps': 23047, 'batch_loss/train': 0.7910656630992889}
12/26/2021 00:15:02 - INFO - codeparrot_training - Step 23048: {'lr': 0.00028206037860474667, 'samples': 11801088, 'steps': 23048, 'batch_loss/train': 0.7920029386878014}
12/26/2021 00:15:14 - INFO - codeparrot_training - Step 23049: {'lr': 0.0002820447455634978, 'samples': 11801600, 'steps': 23049, 'batch_loss/train': 0.7841405430808663}
12/26/2021 00:15:25 - INFO - codeparrot_training - Step 23050: {'lr': 0.0002820291123948512, 'samples': 11802112, 'steps': 23050, 'batch_loss/train': 0.7489347551018}
12/26/2021 00:15:35 - INFO - codeparrot_training - Step 23051: {'lr': 0.0002820134790988689, 'samples': 11802624, 'steps': 23051, 'batch_loss/train': 0.8217605445533991}
12/26/2021 00:15:46 - INFO - codeparrot_training - Step 23052: {'lr': 0.0002819978456756131, 'samples': 11803136, 'steps': 23052, 'batch_loss/train': 0.6474874448031187}
12/26/2021 00:15:58 - INFO - codeparrot_training - Step 23053: {'lr': 0.00028198221212514597, 'samples': 11803648, 'steps': 23053, 'batch_loss/train': 0.8445652965456247}
12/26/2021 00:16:09 - INFO - codeparrot_training - Step 23054: {'lr': 0.0002819665784475296, 'samples': 11804160, 'steps': 23054, 'batch_loss/train': 0.8439766075462103}
12/26/2021 00:16:19 - INFO - codeparrot_training - Step 23055: {'lr': 0.0002819509446428262, 'samples': 11804672, 'steps': 23055, 'batch_loss/train': 0.86589833535254}
12/26/2021 00:16:34 - INFO - codeparrot_training - Step 23056: {'lr': 0.00028193531071109783, 'samples': 11805184, 'steps': 23056, 'batch_loss/train': 0.7792807659134269}
12/26/2021 00:16:44 - INFO - codeparrot_training - Step 23057: {'lr': 0.0002819196766524069, 'samples': 11805696, 'steps': 23057, 'batch_loss/train': 0.8986649019643664}
12/26/2021 00:16:55 - INFO - codeparrot_training - Step 23058: {'lr': 0.0002819040424668153, 'samples': 11806208, 'steps': 23058, 'batch_loss/train': 1.0036458221729845}
12/26/2021 00:17:06 - INFO - codeparrot_training - Step 23059: {'lr': 0.00028188840815438523, 'samples': 11806720, 'steps': 23059, 'batch_loss/train': 0.8631334851961583}
12/26/2021 00:17:18 - INFO - codeparrot_training - Step 23060: {'lr': 0.00028187277371517896, 'samples': 11807232, 'steps': 23060, 'batch_loss/train': 0.9407301701139659}
12/26/2021 00:17:29 - INFO - codeparrot_training - Step 23061: {'lr': 0.00028185713914925846, 'samples': 11807744, 'batch_loss/train':
0.8133819364011288} 12/26/2021 00:17:39 - INFO - codeparrot_training - Step 23062: {'lr': 0.00028184150445668613, 'samples': 11808256, 'steps': 23062, 'batch_loss/train': 0.8469499489292502} 12/26/2021 00:17:54 - INFO - codeparrot_training - Step 23063: {'lr': 0.0002818258696375239, 'samples': 11808768, 'steps': 23063, 'batch_loss/train': 0.8991511263884604} 12/26/2021 00:18:05 - INFO - codeparrot_training - Step 23064: {'lr': 0.0002818102346918341, 'samples': 11809280, 'steps': 23064, 'batch_loss/train': 1.2374584432691336} 12/26/2021 00:18:15 - INFO - codeparrot_training - Step 23065: {'lr': 0.00028179459961967887, 'samples': 11809792, 'steps': 23065, 'batch_loss/train': 0.8817651150748134} 12/26/2021 00:18:27 - INFO - codeparrot_training - Step 23066: {'lr': 0.0002817789644211202, 'samples': 11810304, 'steps': 23066, 'batch_loss/train': 0.871191430836916} 12/26/2021 00:18:38 - INFO - codeparrot_training - Step 23067: {'lr': 0.0002817633290962204, 'samples': 11810816, 'steps': 23067, 'batch_loss/train': 0.8310620114207268} 12/26/2021 00:18:49 - INFO - codeparrot_training - Step 23068: {'lr': 0.0002817476936450417, 'samples': 11811328, 'steps': 23068, 'batch_loss/train': 0.8191601596772671} 12/26/2021 00:18:59 - INFO - codeparrot_training - Step 23069: {'lr': 0.000281732058067646, 'samples': 11811840, 'steps': 23069, 'batch_loss/train': 0.7220044387504458} 12/26/2021 00:19:11 - INFO - codeparrot_training - Step 23070: {'lr': 0.00028171642236409574, 'samples': 11812352, 'steps': 23070, 'batch_loss/train': 0.7274291792418808} 12/26/2021 00:19:22 - INFO - codeparrot_training - Step 23071: {'lr': 0.00028170078653445293, 'samples': 11812864, 'steps': 23071, 'batch_loss/train': 0.7187632387503982} 12/26/2021 00:19:33 - INFO - codeparrot_training - Step 23072: {'lr': 0.0002816851505787797, 'samples': 11813376, 'steps': 23072, 'batch_loss/train': 0.7547244105953723} 12/26/2021 00:19:45 - INFO - codeparrot_training - Step 23073: {'lr': 0.00028166951449713827, 'samples': 
11813888, 'steps': 23073, 'batch_loss/train': 0.8979866849258542} 12/26/2021 00:19:55 - INFO - codeparrot_training - Step 23074: {'lr': 0.0002816538782895909, 'samples': 11814400, 'steps': 23074, 'batch_loss/train': 0.8563010999932885} 12/26/2021 00:20:06 - INFO - codeparrot_training - Step 23075: {'lr': 0.00028163824195619956, 'samples': 11814912, 'steps': 23075, 'batch_loss/train': 0.7979606809094548} 12/26/2021 00:20:20 - INFO - codeparrot_training - Step 23076: {'lr': 0.00028162260549702663, 'samples': 11815424, 'steps': 23076, 'batch_loss/train': 0.805886129848659} 12/26/2021 00:20:31 - INFO - codeparrot_training - Step 23077: {'lr': 0.0002816069689121341, 'samples': 11815936, 'steps': 23077, 'batch_loss/train': 0.6928500789217651} 12/26/2021 00:20:41 - INFO - codeparrot_training - Step 23078: {'lr': 0.0002815913322015842, 'samples': 11816448, 'steps': 23078, 'batch_loss/train': 0.8017547447234392} 12/26/2021 00:20:52 - INFO - codeparrot_training - Step 23079: {'lr': 0.0002815756953654391, 'samples': 11816960, 'steps': 23079, 'batch_loss/train': 0.7942862962372601} 12/26/2021 00:21:04 - INFO - codeparrot_training - Step 23080: {'lr': 0.000281560058403761, 'samples': 11817472, 'steps': 23080, 'batch_loss/train': 0.7249151316937059} 12/26/2021 00:21:14 - INFO - codeparrot_training - Step 23081: {'lr': 0.0002815444213166119, 'samples': 11817984, 'steps': 23081, 'batch_loss/train': 0.9743170440196991} 12/26/2021 00:21:25 - INFO - codeparrot_training - Step 23082: {'lr': 0.00028152878410405417, 'samples': 11818496, 'steps': 23082, 'batch_loss/train': 0.7200592557201162} 12/26/2021 00:21:37 - INFO - codeparrot_training - Step 23083: {'lr': 0.0002815131467661499, 'samples': 11819008, 'steps': 23083, 'batch_loss/train': 0.7438251852290705} 12/26/2021 00:21:48 - INFO - codeparrot_training - Step 23084: {'lr': 0.0002814975093029613, 'samples': 11819520, 'steps': 23084, 'batch_loss/train': 0.7967121287947521} 12/26/2021 00:21:58 - INFO - codeparrot_training - Step 23085: 
{'lr': 0.00028148187171455035, 'samples': 11820032, 'steps': 23085, 'batch_loss/train': 0.7423981392057613} 12/26/2021 00:22:12 - INFO - codeparrot_training - Step 23086: {'lr': 0.00028146623400097953, 'samples': 11820544, 'steps': 23086, 'batch_loss/train': 0.8216413790360093} 12/26/2021 00:22:23 - INFO - codeparrot_training - Step 23087: {'lr': 0.0002814505961623108, 'samples': 11821056, 'steps': 23087, 'batch_loss/train': 0.794077163329348} 12/26/2021 00:22:33 - INFO - codeparrot_training - Step 23088: {'lr': 0.0002814349581986063, 'samples': 11821568, 'steps': 23088, 'batch_loss/train': 0.7692649611271918} 12/26/2021 00:22:44 - INFO - codeparrot_training - Step 23089: {'lr': 0.00028141932010992835, 'samples': 11822080, 'steps': 23089, 'batch_loss/train': 0.7783062090165913} 12/26/2021 00:22:56 - INFO - codeparrot_training - Step 23090: {'lr': 0.0002814036818963391, 'samples': 11822592, 'steps': 23090, 'batch_loss/train': 0.7848640854936093} 12/26/2021 00:23:07 - INFO - codeparrot_training - Step 23091: {'lr': 0.0002813880435579005, 'samples': 11823104, 'steps': 23091, 'batch_loss/train': 0.7781894393265247} 12/26/2021 00:23:17 - INFO - codeparrot_training - Step 23092: {'lr': 0.00028137240509467504, 'samples': 11823616, 'steps': 23092, 'batch_loss/train': 0.8275690930895507} 12/26/2021 00:23:31 - INFO - codeparrot_training - Step 23093: {'lr': 0.0002813567665067247, 'samples': 11824128, 'steps': 23093, 'batch_loss/train': 0.795526921749115} 12/26/2021 00:23:42 - INFO - codeparrot_training - Step 23094: {'lr': 0.00028134112779411167, 'samples': 11824640, 'steps': 23094, 'batch_loss/train': 0.721221552696079} 12/26/2021 00:23:52 - INFO - codeparrot_training - Step 23095: {'lr': 0.00028132548895689816, 'samples': 11825152, 'steps': 23095, 'batch_loss/train': 0.7977741789072752} 12/26/2021 00:24:04 - INFO - codeparrot_training - Step 23096: {'lr': 0.00028130984999514636, 'samples': 11825664, 'steps': 23096, 'batch_loss/train': 0.8322766730561852} 12/26/2021 
00:24:15 - INFO - codeparrot_training - Step 23097: {'lr': 0.0002812942109089184, 'samples': 11826176, 'steps': 23097, 'batch_loss/train': 0.827536822296679} 12/26/2021 00:24:25 - INFO - codeparrot_training - Step 23098: {'lr': 0.00028127857169827643, 'samples': 11826688, 'steps': 23098, 'batch_loss/train': 0.7676050066947937} 12/26/2021 00:24:38 - INFO - codeparrot_training - Step 23099: {'lr': 0.0002812629323632827, 'samples': 11827200, 'steps': 23099, 'batch_loss/train': 0.7497500609606504} 12/26/2021 00:24:48 - INFO - codeparrot_training - Step 23100: {'lr': 0.0002812472929039994, 'samples': 11827712, 'steps': 23100, 'batch_loss/train': 0.8820607522502542} 12/26/2021 00:24:59 - INFO - codeparrot_training - Step 23101: {'lr': 0.0002812316533204886, 'samples': 11828224, 'steps': 23101, 'batch_loss/train': 0.7996002666186541} 12/26/2021 00:25:10 - INFO - codeparrot_training - Step 23102: {'lr': 0.0002812160136128126, 'samples': 11828736, 'steps': 23102, 'batch_loss/train': 0.7318589240312576} 12/26/2021 00:25:24 - INFO - codeparrot_training - Step 23103: {'lr': 0.0002812003737810335, 'samples': 11829248, 'steps': 23103, 'batch_loss/train': 0.8280566371977329} 12/26/2021 00:25:34 - INFO - codeparrot_training - Step 23104: {'lr': 0.0002811847338252134, 'samples': 11829760, 'steps': 23104, 'batch_loss/train': 0.7823999067768455} 12/26/2021 00:25:45 - INFO - codeparrot_training - Step 23105: {'lr': 0.0002811690937454147, 'samples': 11830272, 'steps': 23105, 'batch_loss/train': 0.8156650857999921} 12/26/2021 00:25:57 - INFO - codeparrot_training - Step 23106: {'lr': 0.0002811534535416994, 'samples': 11830784, 'steps': 23106, 'batch_loss/train': 0.8919063610956073} 12/26/2021 00:26:08 - INFO - codeparrot_training - Step 23107: {'lr': 0.00028113781321412976, 'samples': 11831296, 'steps': 23107, 'batch_loss/train': 0.8033899897709489} 12/26/2021 00:26:18 - INFO - codeparrot_training - Step 23108: {'lr': 0.0002811221727627679, 'samples': 11831808, 'steps': 23108, 
'batch_loss/train': 0.788266190327704} 12/26/2021 00:26:30 - INFO - codeparrot_training - Step 23109: {'lr': 0.0002811065321876761, 'samples': 11832320, 'steps': 23109, 'batch_loss/train': 0.7108899122104049} 12/26/2021 00:26:41 - INFO - codeparrot_training - Step 23110: {'lr': 0.0002810908914889164, 'samples': 11832832, 'steps': 23110, 'batch_loss/train': 0.842780566366855} 12/26/2021 00:26:52 - INFO - codeparrot_training - Step 23111: {'lr': 0.0002810752506665509, 'samples': 11833344, 'steps': 23111, 'batch_loss/train': 0.7843918125145137} 12/26/2021 00:27:02 - INFO - codeparrot_training - Step 23112: {'lr': 0.0002810596097206422, 'samples': 11833856, 'steps': 23112, 'batch_loss/train': 0.7668485697358847} 12/26/2021 00:27:14 - INFO - codeparrot_training - Step 23113: {'lr': 0.0002810439686512521, 'samples': 11834368, 'steps': 23113, 'batch_loss/train': 0.7744994305539876} 12/26/2021 00:27:25 - INFO - codeparrot_training - Step 23114: {'lr': 0.0002810283274584429, 'samples': 11834880, 'steps': 23114, 'batch_loss/train': 0.7410010264720768} 12/26/2021 00:27:35 - INFO - codeparrot_training - Step 23115: {'lr': 0.0002810126861422768, 'samples': 11835392, 'steps': 23115, 'batch_loss/train': 0.8001371864229441} 12/26/2021 00:27:49 - INFO - codeparrot_training - Step 23116: {'lr': 0.00028099704470281605, 'samples': 11835904, 'steps': 23116, 'batch_loss/train': 0.7914700275287032} 12/26/2021 00:28:00 - INFO - codeparrot_training - Step 23117: {'lr': 0.00028098140314012266, 'samples': 11836416, 'steps': 23117, 'batch_loss/train': 0.8086609113961458} 12/26/2021 00:28:10 - INFO - codeparrot_training - Step 23118: {'lr': 0.00028096576145425896, 'samples': 11836928, 'steps': 23118, 'batch_loss/train': 0.7782302964478731} 12/26/2021 00:28:23 - INFO - codeparrot_training - Step 23119: {'lr': 0.00028095011964528703, 'samples': 11837440, 'steps': 23119, 'batch_loss/train': 0.7403862681239843} 12/26/2021 00:28:33 - INFO - codeparrot_training - Step 23120: {'lr': 
0.0002809344777132691, 'samples': 11837952, 'steps': 23120, 'batch_loss/train': 0.791516543366015} 12/26/2021 00:28:44 - INFO - codeparrot_training - Step 23121: {'lr': 0.0002809188356582675, 'samples': 11838464, 'steps': 23121, 'batch_loss/train': 0.7910115926060826} 12/26/2021 00:28:54 - INFO - codeparrot_training - Step 23122: {'lr': 0.00028090319348034415, 'samples': 11838976, 'steps': 23122, 'batch_loss/train': 0.8595472332090139} 12/26/2021 00:29:06 - INFO - codeparrot_training - Step 23123: {'lr': 0.0002808875511795614, 'samples': 11839488, 'steps': 23123, 'batch_loss/train': 0.7776784244924784} 12/26/2021 00:29:17 - INFO - codeparrot_training - Step 23124: {'lr': 0.0002808719087559814, 'samples': 11840000, 'steps': 23124, 'batch_loss/train': 0.7296408447436988} 12/26/2021 00:29:28 - INFO - codeparrot_training - Step 23125: {'lr': 0.0002808562662096664, 'samples': 11840512, 'steps': 23125, 'batch_loss/train': 0.7991088218986988} 12/26/2021 00:29:42 - INFO - codeparrot_training - Step 23126: {'lr': 0.00028084062354067854, 'samples': 11841024, 'steps': 23126, 'batch_loss/train': 0.8012881111353636} 12/26/2021 00:29:52 - INFO - codeparrot_training - Step 23127: {'lr': 0.0002808249807490799, 'samples': 11841536, 'steps': 23127, 'batch_loss/train': 0.8271879246458411} 12/26/2021 00:30:03 - INFO - codeparrot_training - Step 23128: {'lr': 0.0002808093378349329, 'samples': 11842048, 'steps': 23128, 'batch_loss/train': 1.008365323767066} 12/26/2021 00:30:15 - INFO - codeparrot_training - Step 23129: {'lr': 0.00028079369479829956, 'samples': 11842560, 'steps': 23129, 'batch_loss/train': 0.9261332154273987} 12/26/2021 00:30:26 - INFO - codeparrot_training - Step 23130: {'lr': 0.00028077805163924206, 'samples': 11843072, 'steps': 23130, 'batch_loss/train': 0.7387353633530438} 12/26/2021 00:30:36 - INFO - codeparrot_training - Step 23131: {'lr': 0.00028076240835782276, 'samples': 11843584, 'steps': 23131, 'batch_loss/train': 0.7804186604917049} 12/26/2021 00:30:50 - INFO 
- codeparrot_training - Step 23132: {'lr': 0.00028074676495410365, 'samples': 11844096, 'steps': 23132, 'batch_loss/train': 0.702414333820343} 12/26/2021 00:31:01 - INFO - codeparrot_training - Step 23133: {'lr': 0.00028073112142814704, 'samples': 11844608, 'steps': 23133, 'batch_loss/train': 0.7275197973940521} 12/26/2021 00:31:11 - INFO - codeparrot_training - Step 23134: {'lr': 0.0002807154777800151, 'samples': 11845120, 'steps': 23134, 'batch_loss/train': 0.868368580006063} 12/26/2021 00:31:22 - INFO - codeparrot_training - Step 23135: {'lr': 0.00028069983400977, 'samples': 11845632, 'steps': 23135, 'batch_loss/train': 0.7955821086652577} 12/26/2021 00:31:34 - INFO - codeparrot_training - Step 23136: {'lr': 0.0002806841901174739, 'samples': 11846144, 'steps': 23136, 'batch_loss/train': 0.8544492581859231} 12/26/2021 00:31:45 - INFO - codeparrot_training - Step 23137: {'lr': 0.0002806685461031891, 'samples': 11846656, 'steps': 23137, 'batch_loss/train': 0.7710573222720996} 12/26/2021 00:31:55 - INFO - codeparrot_training - Step 23138: {'lr': 0.00028065290196697783, 'samples': 11847168, 'steps': 23138, 'batch_loss/train': 0.7278528772294521} 12/26/2021 00:32:07 - INFO - codeparrot_training - Step 23139: {'lr': 0.00028063725770890215, 'samples': 11847680, 'steps': 23139, 'batch_loss/train': 0.8434038930572569} 12/26/2021 00:32:18 - INFO - codeparrot_training - Step 23140: {'lr': 0.00028062161332902423, 'samples': 11848192, 'steps': 23140, 'batch_loss/train': 0.8471660725772381} 12/26/2021 00:32:29 - INFO - codeparrot_training - Step 23141: {'lr': 0.00028060596882740634, 'samples': 11848704, 'steps': 23141, 'batch_loss/train': 0.6874834499321878} 12/26/2021 00:32:43 - INFO - codeparrot_training - Step 23142: {'lr': 0.0002805903242041108, 'samples': 11849216, 'steps': 23142, 'batch_loss/train': 0.8005610890686512} 12/26/2021 00:32:53 - INFO - codeparrot_training - Step 23143: {'lr': 0.00028057467945919953, 'samples': 11849728, 'steps': 23143, 'batch_loss/train': 
0.7644010279327631} 12/26/2021 00:33:04 - INFO - codeparrot_training - Step 23144: {'lr': 0.000280559034592735, 'samples': 11850240, 'steps': 23144, 'batch_loss/train': 0.7590059963986278} 12/26/2021 00:33:14 - INFO - codeparrot_training - Step 23145: {'lr': 0.0002805433896047792, 'samples': 11850752, 'steps': 23145, 'batch_loss/train': 0.8611915804212913} 12/26/2021 00:33:27 - INFO - codeparrot_training - Step 23146: {'lr': 0.0002805277444953944, 'samples': 11851264, 'steps': 23146, 'batch_loss/train': 0.799786057556048} 12/26/2021 00:33:37 - INFO - codeparrot_training - Step 23147: {'lr': 0.0002805120992646429, 'samples': 11851776, 'steps': 23147, 'batch_loss/train': 0.8980307998135686} 12/26/2021 00:33:48 - INFO - codeparrot_training - Step 23148: {'lr': 0.00028049645391258675, 'samples': 11852288, 'steps': 23148, 'batch_loss/train': 0.7847370943054557} 12/26/2021 00:34:00 - INFO - codeparrot_training - Step 23149: {'lr': 0.0002804808084392882, 'samples': 11852800, 'steps': 23149, 'batch_loss/train': 0.8149297968484461} 12/26/2021 00:34:11 - INFO - codeparrot_training - Step 23150: {'lr': 0.0002804651628448095, 'samples': 11853312, 'steps': 23150, 'batch_loss/train': 0.6276693930849433} 12/26/2021 00:34:21 - INFO - codeparrot_training - Step 23151: {'lr': 0.00028044951712921293, 'samples': 11853824, 'steps': 23151, 'batch_loss/train': 0.7916813566116616} 12/26/2021 00:34:33 - INFO - codeparrot_training - Step 23152: {'lr': 0.0002804338712925604, 'samples': 11854336, 'steps': 23152, 'batch_loss/train': 0.8752655340358615} 12/26/2021 00:34:44 - INFO - codeparrot_training - Step 23153: {'lr': 0.00028041822533491435, 'samples': 11854848, 'steps': 23153, 'batch_loss/train': 0.7527293274179101} 12/26/2021 00:34:55 - INFO - codeparrot_training - Step 23154: {'lr': 0.0002804025792563369, 'samples': 11855360, 'steps': 23154, 'batch_loss/train': 0.8222551732324064} 12/26/2021 00:35:05 - INFO - codeparrot_training - Step 23155: {'lr': 0.00028038693305689036, 'samples': 
11855872, 'steps': 23155, 'batch_loss/train': 1.0682476181536913} 12/26/2021 00:35:19 - INFO - codeparrot_training - Step 23156: {'lr': 0.00028037128673663673, 'samples': 11856384, 'steps': 23156, 'batch_loss/train': 0.7045553661882877} 12/26/2021 00:35:30 - INFO - codeparrot_training - Step 23157: {'lr': 0.00028035564029563843, 'samples': 11856896, 'steps': 23157, 'batch_loss/train': 0.7441664482466877} 12/26/2021 00:35:40 - INFO - codeparrot_training - Step 23158: {'lr': 0.00028033999373395754, 'samples': 11857408, 'steps': 23158, 'batch_loss/train': 0.7870584349147975} 12/26/2021 00:35:53 - INFO - codeparrot_training - Step 23159: {'lr': 0.0002803243470516562, 'samples': 11857920, 'steps': 23159, 'batch_loss/train': 0.8771508568897843} 12/26/2021 00:36:03 - INFO - codeparrot_training - Step 23160: {'lr': 0.00028030870024879683, 'samples': 11858432, 'steps': 23160, 'batch_loss/train': 0.7846183720976114} 12/26/2021 00:36:14 - INFO - codeparrot_training - Step 23161: {'lr': 0.0002802930533254415, 'samples': 11858944, 'steps': 23161, 'batch_loss/train': 0.8764031287282705} 12/26/2021 00:36:28 - INFO - codeparrot_training - Step 23162: {'lr': 0.0002802774062816523, 'samples': 11859456, 'steps': 23162, 'batch_loss/train': 0.8210340915247798} 12/26/2021 00:36:39 - INFO - codeparrot_training - Step 23163: {'lr': 0.0002802617591174917, 'samples': 11859968, 'steps': 23163, 'batch_loss/train': 0.8693714225664735} 12/26/2021 00:36:49 - INFO - codeparrot_training - Step 23164: {'lr': 0.00028024611183302174, 'samples': 11860480, 'steps': 23164, 'batch_loss/train': 0.7723776930943131} 12/26/2021 00:37:00 - INFO - codeparrot_training - Step 23165: {'lr': 0.0002802304644283047, 'samples': 11860992, 'steps': 23165, 'batch_loss/train': 0.7144739730283618} 12/26/2021 00:37:12 - INFO - codeparrot_training - Step 23166: {'lr': 0.00028021481690340265, 'samples': 11861504, 'steps': 23166, 'batch_loss/train': 0.7527876874664798} 12/26/2021 00:37:22 - INFO - codeparrot_training - Step 
23167: {'lr': 0.00028019916925837795, 'samples': 11862016, 'steps': 23167, 'batch_loss/train': 0.8657200485467911} 12/26/2021 00:37:33 - INFO - codeparrot_training - Step 23168: {'lr': 0.0002801835214932928, 'samples': 11862528, 'steps': 23168, 'batch_loss/train': 0.8244232819415629} 12/26/2021 00:37:45 - INFO - codeparrot_training - Step 23169: {'lr': 0.0002801678736082092, 'samples': 11863040, 'steps': 23169, 'batch_loss/train': 0.7706435495056212} 12/26/2021 00:37:56 - INFO - codeparrot_training - Step 23170: {'lr': 0.0002801522256031897, 'samples': 11863552, 'steps': 23170, 'batch_loss/train': 0.7148971380665898} 12/26/2021 00:38:06 - INFO - codeparrot_training - Step 23171: {'lr': 0.0002801365774782963, 'samples': 11864064, 'steps': 23171, 'batch_loss/train': 0.748736719135195} 12/26/2021 00:38:20 - INFO - codeparrot_training - Step 23172: {'lr': 0.0002801209292335912, 'samples': 11864576, 'steps': 23172, 'batch_loss/train': 0.7847757069393992} 12/26/2021 00:38:31 - INFO - codeparrot_training - Step 23173: {'lr': 0.00028010528086913665, 'samples': 11865088, 'steps': 23173, 'batch_loss/train': 0.5828233773936518} 12/26/2021 00:38:42 - INFO - codeparrot_training - Step 23174: {'lr': 0.0002800896323849949, 'samples': 11865600, 'steps': 23174, 'batch_loss/train': 0.7355363818351179} 12/26/2021 00:38:54 - INFO - codeparrot_training - Step 23175: {'lr': 0.0002800739837812282, 'samples': 11866112, 'steps': 23175, 'batch_loss/train': 0.9922311413101852} 12/26/2021 00:39:04 - INFO - codeparrot_training - Step 23176: {'lr': 0.00028005833505789855, 'samples': 11866624, 'steps': 23176, 'batch_loss/train': 1.1243310933932662} 12/26/2021 00:39:15 - INFO - codeparrot_training - Step 23177: {'lr': 0.0002800426862150685, 'samples': 11867136, 'steps': 23177, 'batch_loss/train': 0.7991575705818832} 12/26/2021 00:39:26 - INFO - codeparrot_training - Step 23178: {'lr': 0.00028002703725279993, 'samples': 11867648, 'steps': 23178, 'batch_loss/train': 0.762890649959445} 12/26/2021 
00:39:38 - INFO - codeparrot_training - Step 23179: {'lr': 0.00028001138817115525, 'samples': 11868160, 'steps': 23179, 'batch_loss/train': 0.7989374259486794} 12/26/2021 00:39:48 - INFO - codeparrot_training - Step 23180: {'lr': 0.0002799957389701965, 'samples': 11868672, 'steps': 23180, 'batch_loss/train': 0.8761671157553792} 12/26/2021 00:39:59 - INFO - codeparrot_training - Step 23181: {'lr': 0.00027998008964998615, 'samples': 11869184, 'steps': 23181, 'batch_loss/train': 0.6713178711943328} 12/26/2021 00:40:13 - INFO - codeparrot_training - Step 23182: {'lr': 0.00027996444021058625, 'samples': 11869696, 'steps': 23182, 'batch_loss/train': 0.865111107006669} 12/26/2021 00:40:23 - INFO - codeparrot_training - Step 23183: {'lr': 0.00027994879065205915, 'samples': 11870208, 'steps': 23183, 'batch_loss/train': 0.8408334469422698} 12/26/2021 00:40:34 - INFO - codeparrot_training - Step 23184: {'lr': 0.0002799331409744668, 'samples': 11870720, 'steps': 23184, 'batch_loss/train': 0.7369464980438352} 12/26/2021 00:40:46 - INFO - codeparrot_training - Step 23185: {'lr': 0.0002799174911778716, 'samples': 11871232, 'steps': 23185, 'batch_loss/train': 0.7920277803204954} 12/26/2021 00:40:57 - INFO - codeparrot_training - Step 23186: {'lr': 0.0002799018412623359, 'samples': 11871744, 'steps': 23186, 'batch_loss/train': 0.8432175759226084} 12/26/2021 00:41:07 - INFO - codeparrot_training - Step 23187: {'lr': 0.00027988619122792165, 'samples': 11872256, 'steps': 23187, 'batch_loss/train': 0.7452403092756867} 12/26/2021 00:41:18 - INFO - codeparrot_training - Step 23188: {'lr': 0.0002798705410746912, 'samples': 11872768, 'steps': 23188, 'batch_loss/train': 0.7667288482189178} 12/26/2021 00:41:32 - INFO - codeparrot_training - Step 23189: {'lr': 0.0002798548908027068, 'samples': 11873280, 'steps': 23189, 'batch_loss/train': 0.8939135745167732} 12/26/2021 00:41:42 - INFO - codeparrot_training - Step 23190: {'lr': 0.0002798392404120306, 'samples': 11873792, 'steps': 23190, 
'batch_loss/train': 0.6897194348275661} 12/26/2021 00:41:53 - INFO - codeparrot_training - Step 23191: {'lr': 0.0002798235899027249, 'samples': 11874304, 'steps': 23191, 'batch_loss/train': 0.8177107675001025} 12/26/2021 00:42:06 - INFO - codeparrot_training - Step 23192: {'lr': 0.00027980793927485177, 'samples': 11874816, 'steps': 23192, 'batch_loss/train': 0.8616468170657754} 12/26/2021 00:42:16 - INFO - codeparrot_training - Step 23193: {'lr': 0.00027979228852847353, 'samples': 11875328, 'steps': 23193, 'batch_loss/train': 0.6007422818802297} 12/26/2021 00:42:27 - INFO - codeparrot_training - Step 23194: {'lr': 0.0002797766376636525, 'samples': 11875840, 'steps': 23194, 'batch_loss/train': 0.7864629942923784} 12/26/2021 00:42:39 - INFO - codeparrot_training - Step 23195: {'lr': 0.0002797609866804507, 'samples': 11876352, 'steps': 23195, 'batch_loss/train': 0.8038440402597189} 12/26/2021 00:42:49 - INFO - codeparrot_training - Step 23196: {'lr': 0.0002797453355789305, 'samples': 11876864, 'steps': 23196, 'batch_loss/train': 0.7346160998567939} 12/26/2021 00:43:00 - INFO - codeparrot_training - Step 23197: {'lr': 0.00027972968435915414, 'samples': 11877376, 'steps': 23197, 'batch_loss/train': 0.766542062163353} 12/26/2021 00:43:12 - INFO - codeparrot_training - Step 23198: {'lr': 0.00027971403302118363, 'samples': 11877888, 'steps': 23198, 'batch_loss/train': 0.6652499260380864} 12/26/2021 00:43:23 - INFO - codeparrot_training - Step 23199: {'lr': 0.0002796983815650815, 'samples': 11878400, 'steps': 23199, 'batch_loss/train': 0.8355684578418732} 12/26/2021 00:43:33 - INFO - codeparrot_training - Step 23200: {'lr': 0.0002796827299909097, 'samples': 11878912, 'steps': 23200, 'batch_loss/train': 1.5141442473977804} 12/26/2021 00:43:44 - INFO - codeparrot_training - Step 23201: {'lr': 0.0002796670782987306, 'samples': 11879424, 'steps': 23201, 'batch_loss/train': 0.6913182269781828} 12/26/2021 00:43:59 - INFO - codeparrot_training - Step 23202: {'lr': 
0.00027965142648860643, 'samples': 11879936, 'steps': 23202, 'batch_loss/train': 0.7405187962576747} 12/26/2021 00:44:10 - INFO - codeparrot_training - Step 23203: {'lr': 0.0002796357745605994, 'samples': 11880448, 'steps': 23203, 'batch_loss/train': 0.8818025216460228} 12/26/2021 00:44:20 - INFO - codeparrot_training - Step 23204: {'lr': 0.00027962012251477167, 'samples': 11880960, 'steps': 23204, 'batch_loss/train': 0.8277168478816748} 12/26/2021 00:44:31 - INFO - codeparrot_training - Step 23205: {'lr': 0.0002796044703511856, 'samples': 11881472, 'steps': 23205, 'batch_loss/train': 1.1113705253228545} 12/26/2021 00:44:43 - INFO - codeparrot_training - Step 23206: {'lr': 0.00027958881806990326, 'samples': 11881984, 'steps': 23206, 'batch_loss/train': 1.0927487630397081} 12/26/2021 00:44:53 - INFO - codeparrot_training - Step 23207: {'lr': 0.00027957316567098694, 'samples': 11882496, 'steps': 23207, 'batch_loss/train': 0.7919991426169872} 12/26/2021 00:45:04 - INFO - codeparrot_training - Step 23208: {'lr': 0.0002795575131544989, 'samples': 11883008, 'steps': 23208, 'batch_loss/train': 0.8270514023024589} 12/26/2021 00:45:16 - INFO - codeparrot_training - Step 23209: {'lr': 0.00027954186052050133, 'samples': 11883520, 'steps': 23209, 'batch_loss/train': 0.7780773798003793} 12/26/2021 00:45:27 - INFO - codeparrot_training - Step 23210: {'lr': 0.00027952620776905656, 'samples': 11884032, 'steps': 23210, 'batch_loss/train': 0.8397815639618784} 12/26/2021 00:45:37 - INFO - codeparrot_training - Step 23211: {'lr': 0.00027951055490022664, 'samples': 11884544, 'steps': 23211, 'batch_loss/train': 0.7921435823664069} 12/26/2021 00:45:51 - INFO - codeparrot_training - Step 23212: {'lr': 0.000279494901914074, 'samples': 11885056, 'steps': 23212, 'batch_loss/train': 0.779039834626019} 12/26/2021 00:46:02 - INFO - codeparrot_training - Step 23213: {'lr': 0.00027947924881066067, 'samples': 11885568, 'steps': 23213, 'batch_loss/train': 0.7324139424599707} 12/26/2021 00:46:13 - 
INFO - codeparrot_training - Step 23214: {'lr': 0.00027946359559004904, 'samples': 11886080, 'steps': 23214, 'batch_loss/train': 0.6549821142107248} 12/26/2021 00:46:23 - INFO - codeparrot_training - Step 23215: {'lr': 0.0002794479422523012, 'samples': 11886592, 'steps': 23215, 'batch_loss/train': 0.7802901030518115} 12/26/2021 00:46:35 - INFO - codeparrot_training - Step 23216: {'lr': 0.00027943228879747956, 'samples': 11887104, 'steps': 23216, 'batch_loss/train': 0.7544633448123932} 12/26/2021 00:46:46 - INFO - codeparrot_training - Step 23217: {'lr': 0.00027941663522564617, 'samples': 11887616, 'steps': 23217, 'batch_loss/train': 0.6767203826457262} 12/26/2021 00:46:57 - INFO - codeparrot_training - Step 23218: {'lr': 0.00027940098153686347, 'samples': 11888128, 'steps': 23218, 'batch_loss/train': 0.6800219749566168} 12/26/2021 00:47:10 - INFO - codeparrot_training - Step 23219: {'lr': 0.0002793853277311934, 'samples': 11888640, 'steps': 23219, 'batch_loss/train': 0.7464511235011742} 12/26/2021 00:47:21 - INFO - codeparrot_training - Step 23220: {'lr': 0.0002793696738086984, 'samples': 11889152, 'steps': 23220, 'batch_loss/train': 0.8533939461922273} 12/26/2021 00:47:32 - INFO - codeparrot_training - Step 23221: {'lr': 0.00027935401976944075, 'samples': 11889664, 'steps': 23221, 'batch_loss/train': 0.7542439056560397} 12/26/2021 00:47:44 - INFO - codeparrot_training - Step 23222: {'lr': 0.00027933836561348254, 'samples': 11890176, 'steps': 23222, 'batch_loss/train': 0.8584248855477199} 12/26/2021 00:47:54 - INFO - codeparrot_training - Step 23223: {'lr': 0.0002793227113408862, 'samples': 11890688, 'steps': 23223, 'batch_loss/train': 0.744585189037025} 12/26/2021 00:48:05 - INFO - codeparrot_training - Step 23224: {'lr': 0.0002793070569517136, 'samples': 11891200, 'steps': 23224, 'batch_loss/train': 0.8456938955932856} 12/26/2021 00:48:15 - INFO - codeparrot_training - Step 23225: {'lr': 0.00027929140244602733, 'samples': 11891712, 'steps': 23225, 
'batch_loss/train': 0.8019077312201262} 12/26/2021 00:48:27 - INFO - codeparrot_training - Step 23226: {'lr': 0.00027927574782388956, 'samples': 11892224, 'steps': 23226, 'batch_loss/train': 0.8649665527045727} 12/26/2021 00:48:38 - INFO - codeparrot_training - Step 23227: {'lr': 0.00027926009308536237, 'samples': 11892736, 'steps': 23227, 'batch_loss/train': 0.7549257599748671} 12/26/2021 00:48:49 - INFO - codeparrot_training - Step 23228: {'lr': 0.0002792444382305081, 'samples': 11893248, 'steps': 23228, 'batch_loss/train': 0.7360675150994211} 12/26/2021 00:49:03 - INFO - codeparrot_training - Step 23229: {'lr': 0.0002792287832593891, 'samples': 11893760, 'steps': 23229, 'batch_loss/train': 0.8772824396146461} 12/26/2021 00:49:13 - INFO - codeparrot_training - Step 23230: {'lr': 0.00027921312817206736, 'samples': 11894272, 'steps': 23230, 'batch_loss/train': 0.595988929271698} 12/26/2021 00:49:24 - INFO - codeparrot_training - Step 23231: {'lr': 0.0002791974729686053, 'samples': 11894784, 'steps': 23231, 'batch_loss/train': 0.7587460116483271} 12/26/2021 00:49:36 - INFO - codeparrot_training - Step 23232: {'lr': 0.00027918181764906514, 'samples': 11895296, 'steps': 23232, 'batch_loss/train': 0.9391450718976557} 12/26/2021 00:49:47 - INFO - codeparrot_training - Step 23233: {'lr': 0.000279166162213509, 'samples': 11895808, 'steps': 23233, 'batch_loss/train': 0.7663805857300758} 12/26/2021 00:49:57 - INFO - codeparrot_training - Step 23234: {'lr': 0.0002791505066619994, 'samples': 11896320, 'steps': 23234, 'batch_loss/train': 0.8074011048302054} 12/26/2021 00:50:08 - INFO - codeparrot_training - Step 23235: {'lr': 0.00027913485099459825, 'samples': 11896832, 'steps': 23235, 'batch_loss/train': 0.800727978348732} 12/26/2021 00:50:21 - INFO - codeparrot_training - Step 23236: {'lr': 0.000279119195211368, 'samples': 11897344, 'steps': 23236, 'batch_loss/train': 0.6968033933080733} 12/26/2021 00:50:31 - INFO - codeparrot_training - Step 23237: {'lr': 
0.0002791035393123707, 'samples': 11897856, 'steps': 23237, 'batch_loss/train': 0.7893575234338641} 12/26/2021 00:50:42 - INFO - codeparrot_training - Step 23238: {'lr': 0.0002790878832976688, 'samples': 11898368, 'steps': 23238, 'batch_loss/train': 0.7405964583158493} 12/26/2021 00:50:56 - INFO - codeparrot_training - Step 23239: {'lr': 0.0002790722271673245, 'samples': 11898880, 'steps': 23239, 'batch_loss/train': 0.6556997052393854} 12/26/2021 00:51:06 - INFO - codeparrot_training - Step 23240: {'lr': 0.00027905657092139997, 'samples': 11899392, 'steps': 23240, 'batch_loss/train': 0.8167375609045848} 12/26/2021 00:51:17 - INFO - codeparrot_training - Step 23241: {'lr': 0.0002790409145599574, 'samples': 11899904, 'steps': 23241, 'batch_loss/train': 0.832971403375268} 12/26/2021 00:51:29 - INFO - codeparrot_training - Step 23242: {'lr': 0.0002790252580830593, 'samples': 11900416, 'steps': 23242, 'batch_loss/train': 0.6437104376964271} 12/26/2021 00:51:39 - INFO - codeparrot_training - Step 23243: {'lr': 0.00027900960149076755, 'samples': 11900928, 'steps': 23243, 'batch_loss/train': 0.695595377124846} 12/26/2021 00:51:50 - INFO - codeparrot_training - Step 23244: {'lr': 0.00027899394478314473, 'samples': 11901440, 'steps': 23244, 'batch_loss/train': 0.7374607271049172} 12/26/2021 00:52:01 - INFO - codeparrot_training - Step 23245: {'lr': 0.0002789782879602528, 'samples': 11901952, 'steps': 23245, 'batch_loss/train': 0.7734442274086177} 12/26/2021 00:52:13 - INFO - codeparrot_training - Step 23246: {'lr': 0.00027896263102215424, 'samples': 11902464, 'steps': 23246, 'batch_loss/train': 0.7565448465757072} 12/26/2021 00:52:23 - INFO - codeparrot_training - Step 23247: {'lr': 0.00027894697396891113, 'samples': 11902976, 'steps': 23247, 'batch_loss/train': 0.7903776364400983} 12/26/2021 00:52:34 - INFO - codeparrot_training - Step 23248: {'lr': 0.0002789313168005858, 'samples': 11903488, 'steps': 23248, 'batch_loss/train': 0.8112636776641011} 12/26/2021 00:52:48 - INFO 
- codeparrot_training - Step 23249: {'lr': 0.0002789156595172405, 'samples': 11904000, 'steps': 23249, 'batch_loss/train': 0.9021211881190538} 12/26/2021 00:52:59 - INFO - codeparrot_training - Step 23250: {'lr': 0.0002789000021189374, 'samples': 11904512, 'steps': 23250, 'batch_loss/train': 0.7179891127161682} 12/26/2021 00:53:09 - INFO - codeparrot_training - Step 23251: {'lr': 0.00027888434460573893, 'samples': 11905024, 'steps': 23251, 'batch_loss/train': 0.5890559410909191} 12/26/2021 00:53:21 - INFO - codeparrot_training - Step 23252: {'lr': 0.00027886868697770713, 'samples': 11905536, 'steps': 23252, 'batch_loss/train': 0.7671810074243695} 12/26/2021 00:53:32 - INFO - codeparrot_training - Step 23253: {'lr': 0.0002788530292349043, 'samples': 11906048, 'steps': 23253, 'batch_loss/train': 0.6868524263845757} 12/26/2021 00:53:43 - INFO - codeparrot_training - Step 23254: {'lr': 0.00027883737137739275, 'samples': 11906560, 'steps': 23254, 'batch_loss/train': 0.808042274788022} 12/26/2021 00:53:53 - INFO - codeparrot_training - Step 23255: {'lr': 0.0002788217134052347, 'samples': 11907072, 'steps': 23255, 'batch_loss/train': 0.8594955718144774} 12/26/2021 00:54:05 - INFO - codeparrot_training - Step 23256: {'lr': 0.00027880605531849234, 'samples': 11907584, 'steps': 23256, 'batch_loss/train': 0.6692996195051819} 12/26/2021 00:54:16 - INFO - codeparrot_training - Step 23257: {'lr': 0.0002787903971172281, 'samples': 11908096, 'steps': 23257, 'batch_loss/train': 0.7364826761186123} 12/26/2021 00:54:26 - INFO - codeparrot_training - Step 23258: {'lr': 0.00027877473880150405, 'samples': 11908608, 'steps': 23258, 'batch_loss/train': 0.7413246737560257} 12/26/2021 00:54:40 - INFO - codeparrot_training - Step 23259: {'lr': 0.00027875908037138246, 'samples': 11909120, 'steps': 23259, 'batch_loss/train': 0.6629244199721143} 12/26/2021 00:54:51 - INFO - codeparrot_training - Step 23260: {'lr': 0.0002787434218269257, 'samples': 11909632, 'steps': 23260, 'batch_loss/train': 
0.75048982584849} 12/26/2021 00:55:01 - INFO - codeparrot_training - Step 23261: {'lr': 0.00027872776316819595, 'samples': 11910144, 'steps': 23261, 'batch_loss/train': 0.7878203829750419} 12/26/2021 00:55:14 - INFO - codeparrot_training - Step 23262: {'lr': 0.0002787121043952554, 'samples': 11910656, 'steps': 23262, 'batch_loss/train': 0.8338066693395376} 12/26/2021 00:55:24 - INFO - codeparrot_training - Step 23263: {'lr': 0.00027869644550816637, 'samples': 11911168, 'steps': 23263, 'batch_loss/train': 0.7585459321271628} 12/26/2021 00:55:35 - INFO - codeparrot_training - Step 23264: {'lr': 0.0002786807865069912, 'samples': 11911680, 'steps': 23264, 'batch_loss/train': 0.7501217634417117} 12/26/2021 00:55:45 - INFO - codeparrot_training - Step 23265: {'lr': 0.00027866512739179193, 'samples': 11912192, 'steps': 23265, 'batch_loss/train': 0.744173907674849} 12/26/2021 00:55:58 - INFO - codeparrot_training - Step 23266: {'lr': 0.00027864946816263096, 'samples': 11912704, 'steps': 23266, 'batch_loss/train': 0.6254024575464427} 12/26/2021 00:56:08 - INFO - codeparrot_training - Step 23267: {'lr': 0.0002786338088195706, 'samples': 11913216, 'steps': 23267, 'batch_loss/train': 0.7287915945053101} 12/26/2021 00:56:19 - INFO - codeparrot_training - Step 23268: {'lr': 0.000278618149362673, 'samples': 11913728, 'steps': 23268, 'batch_loss/train': 0.5485541936941445} 12/26/2021 00:56:31 - INFO - codeparrot_training - Step 23269: {'lr': 0.0002786024897920004, 'samples': 11914240, 'steps': 23269, 'batch_loss/train': 0.8311105528846383} 12/26/2021 00:56:42 - INFO - codeparrot_training - Step 23270: {'lr': 0.00027858683010761507, 'samples': 11914752, 'steps': 23270, 'batch_loss/train': 0.842506795655936} 12/26/2021 00:56:52 - INFO - codeparrot_training - Step 23271: {'lr': 0.0002785711703095794, 'samples': 11915264, 'steps': 23271, 'batch_loss/train': 0.8406906640157104} 12/26/2021 00:57:03 - INFO - codeparrot_training - Step 23272: {'lr': 0.00027855551039795545, 'samples': 
11915776, 'steps': 23272, 'batch_loss/train': 0.7668054993264377} 12/26/2021 00:57:03 - INFO - codeparrot_training - Dataset epoch: 1 12/26/2021 00:57:17 - INFO - codeparrot_training - Step 23273: {'lr': 0.00027853985037280566, 'samples': 11916288, 'steps': 23273, 'batch_loss/train': 0.8069598358124495} 12/26/2021 00:57:28 - INFO - codeparrot_training - Step 23274: {'lr': 0.00027852419023419216, 'samples': 11916800, 'steps': 23274, 'batch_loss/train': 0.7895133243873715} 12/26/2021 00:57:38 - INFO - codeparrot_training - Step 23275: {'lr': 0.00027850852998217724, 'samples': 11917312, 'steps': 23275, 'batch_loss/train': 0.793035376816988} 12/26/2021 00:57:52 - INFO - codeparrot_training - Step 23276: {'lr': 0.0002784928696168232, 'samples': 11917824, 'steps': 23276, 'batch_loss/train': 0.7622334994375706} 12/26/2021 00:58:03 - INFO - codeparrot_training - Step 23277: {'lr': 0.00027847720913819223, 'samples': 11918336, 'steps': 23277, 'batch_loss/train': 0.7839227234944701} 12/26/2021 00:58:14 - INFO - codeparrot_training - Step 23278: {'lr': 0.0002784615485463467, 'samples': 11918848, 'steps': 23278, 'batch_loss/train': 0.6161008626804687} 12/26/2021 00:58:26 - INFO - codeparrot_training - Step 23279: {'lr': 0.0002784458878413487, 'samples': 11919360, 'steps': 23279, 'batch_loss/train': 0.8617753330618143} 12/26/2021 00:58:36 - INFO - codeparrot_training - Step 23280: {'lr': 0.0002784302270232606, 'samples': 11919872, 'steps': 23280, 'batch_loss/train': 0.7510284167947248} 12/26/2021 00:58:47 - INFO - codeparrot_training - Step 23281: {'lr': 0.0002784145660921448, 'samples': 11920384, 'steps': 23281, 'batch_loss/train': 0.6353056472726166} 12/26/2021 00:58:59 - INFO - codeparrot_training - Step 23282: {'lr': 0.00027839890504806325, 'samples': 11920896, 'steps': 23282, 'batch_loss/train': 0.7653108432423323} 12/26/2021 00:59:09 - INFO - codeparrot_training - Step 23283: {'lr': 0.0002783832438910784, 'samples': 11921408, 'steps': 23283, 'batch_loss/train': 
0.7789828535169363} 12/26/2021 00:59:20 - INFO - codeparrot_training - Step 23284: {'lr': 0.0002783675826212526, 'samples': 11921920, 'steps': 23284, 'batch_loss/train': 0.7763008587062359} 12/26/2021 00:59:31 - INFO - codeparrot_training - Step 23285: {'lr': 0.00027835192123864784, 'samples': 11922432, 'steps': 23285, 'batch_loss/train': 0.7302978427615017} 12/26/2021 00:59:45 - INFO - codeparrot_training - Step 23286: {'lr': 0.00027833625974332663, 'samples': 11922944, 'steps': 23286, 'batch_loss/train': 0.7573129897937179} 12/26/2021 00:59:55 - INFO - codeparrot_training - Step 23287: {'lr': 0.0002783205981353512, 'samples': 11923456, 'steps': 23287, 'batch_loss/train': 0.6600630389293656} 12/26/2021 01:00:06 - INFO - codeparrot_training - Step 23288: {'lr': 0.0002783049364147837, 'samples': 11923968, 'steps': 23288, 'batch_loss/train': 0.8310535922646523} 12/26/2021 01:00:18 - INFO - codeparrot_training - Step 23289: {'lr': 0.00027828927458168647, 'samples': 11924480, 'steps': 23289, 'batch_loss/train': 0.8861141856759787} 12/26/2021 01:00:29 - INFO - codeparrot_training - Step 23290: {'lr': 0.00027827361263612184, 'samples': 11924992, 'steps': 23290, 'batch_loss/train': 0.7984616262838244} 12/26/2021 01:00:39 - INFO - codeparrot_training - Step 23291: {'lr': 0.0002782579505781519, 'samples': 11925504, 'steps': 23291, 'batch_loss/train': 0.5748229324817657} 12/26/2021 01:00:53 - INFO - codeparrot_training - Step 23292: {'lr': 0.00027824228840783906, 'samples': 11926016, 'steps': 23292, 'batch_loss/train': 0.7984738489612937} 12/26/2021 01:01:04 - INFO - codeparrot_training - Step 23293: {'lr': 0.0002782266261252455, 'samples': 11926528, 'steps': 23293, 'batch_loss/train': 0.7941192761063576} 12/26/2021 01:01:14 - INFO - codeparrot_training - Step 23294: {'lr': 0.0002782109637304336, 'samples': 11927040, 'steps': 23294, 'batch_loss/train': 0.8497237457195297} 12/26/2021 01:01:26 - INFO - codeparrot_training - Step 23295: {'lr': 0.00027819530122346546, 'samples': 
11927552, 'steps': 23295, 'batch_loss/train': 1.1710154917091131} 12/26/2021 01:01:37 - INFO - codeparrot_training - Step 23296: {'lr': 0.00027817963860440355, 'samples': 11928064, 'steps': 23296, 'batch_loss/train': 0.8133914209902287} 12/26/2021 01:01:48 - INFO - codeparrot_training - Step 23297: {'lr': 0.00027816397587331006, 'samples': 11928576, 'steps': 23297, 'batch_loss/train': 0.761879407800734} 12/26/2021 01:01:58 - INFO - codeparrot_training - Step 23298: {'lr': 0.0002781483130302471, 'samples': 11929088, 'steps': 23298, 'batch_loss/train': 0.7374392673373222} 12/26/2021 01:02:10 - INFO - codeparrot_training - Step 23299: {'lr': 0.0002781326500752772, 'samples': 11929600, 'steps': 23299, 'batch_loss/train': 0.8420839812606573} 12/26/2021 01:02:21 - INFO - codeparrot_training - Step 23300: {'lr': 0.00027811698700846245, 'samples': 11930112, 'steps': 23300, 'batch_loss/train': 0.7493344387039542} 12/26/2021 01:02:32 - INFO - codeparrot_training - Step 23301: {'lr': 0.0002781013238298652, 'samples': 11930624, 'steps': 23301, 'batch_loss/train': 0.8298217039555311} 12/26/2021 01:02:45 - INFO - codeparrot_training - Step 23302: {'lr': 0.00027808566053954766, 'samples': 11931136, 'steps': 23302, 'batch_loss/train': 0.7591109778732061} 12/26/2021 01:02:56 - INFO - codeparrot_training - Step 23303: {'lr': 0.0002780699971375722, 'samples': 11931648, 'steps': 23303, 'batch_loss/train': 0.7481443998403847} 12/26/2021 01:03:07 - INFO - codeparrot_training - Step 23304: {'lr': 0.000278054333624001, 'samples': 11932160, 'steps': 23304, 'batch_loss/train': 0.8484637527726591} 12/26/2021 01:03:19 - INFO - codeparrot_training - Step 23305: {'lr': 0.00027803866999889635, 'samples': 11932672, 'steps': 23305, 'batch_loss/train': 0.7567237159237266} 12/26/2021 01:03:29 - INFO - codeparrot_training - Step 23306: {'lr': 0.00027802300626232054, 'samples': 11933184, 'steps': 23306, 'batch_loss/train': 0.7406396558508277} 12/26/2021 01:03:40 - INFO - codeparrot_training - Step 
23307: {'lr': 0.00027800734241433585, 'samples': 11933696, 'steps': 23307, 'batch_loss/train': 0.7654835488647223} 12/26/2021 01:03:50 - INFO - codeparrot_training - Step 23308: {'lr': 0.0002779916784550045, 'samples': 11934208, 'steps': 23308, 'batch_loss/train': 0.773169451393187} 12/26/2021 01:04:02 - INFO - codeparrot_training - Step 23309: {'lr': 0.00027797601438438884, 'samples': 11934720, 'steps': 23309, 'batch_loss/train': 0.6919496459886432} 12/26/2021 01:04:13 - INFO - codeparrot_training - Step 23310: {'lr': 0.00027796035020255115, 'samples': 11935232, 'steps': 23310, 'batch_loss/train': 0.7512339977547526} 12/26/2021 01:04:23 - INFO - codeparrot_training - Step 23311: {'lr': 0.00027794468590955357, 'samples': 11935744, 'steps': 23311, 'batch_loss/train': 0.8037000731565058} 12/26/2021 01:04:35 - INFO - codeparrot_training - Step 23312: {'lr': 0.00027792902150545855, 'samples': 11936256, 'steps': 23312, 'batch_loss/train': 0.6955882711336017} 12/26/2021 01:04:46 - INFO - codeparrot_training - Step 23313: {'lr': 0.00027791335699032827, 'samples': 11936768, 'steps': 23313, 'batch_loss/train': 0.7686519762501121} 12/26/2021 01:04:57 - INFO - codeparrot_training - Step 23314: {'lr': 0.000277897692364225, 'samples': 11937280, 'steps': 23314, 'batch_loss/train': 0.7950316164642572} 12/26/2021 01:05:11 - INFO - codeparrot_training - Step 23315: {'lr': 0.00027788202762721103, 'samples': 11937792, 'steps': 23315, 'batch_loss/train': 0.6945577841252089} 12/26/2021 01:05:21 - INFO - codeparrot_training - Step 23316: {'lr': 0.00027786636277934876, 'samples': 11938304, 'steps': 23316, 'batch_loss/train': 0.7764926380477846} 12/26/2021 01:05:32 - INFO - codeparrot_training - Step 23317: {'lr': 0.00027785069782070015, 'samples': 11938816, 'steps': 23317, 'batch_loss/train': 0.8353419778868556} 12/26/2021 01:05:43 - INFO - codeparrot_training - Step 23318: {'lr': 0.0002778350327513279, 'samples': 11939328, 'steps': 23318, 'batch_loss/train': 0.811739657074213} 
12/26/2021 01:05:55 - INFO - codeparrot_training - Step 23319: {'lr': 0.0002778193675712939, 'samples': 11939840, 'steps': 23319, 'batch_loss/train': 0.8046791143715382} 12/26/2021 01:06:06 - INFO - codeparrot_training - Step 23320: {'lr': 0.0002778037022806607, 'samples': 11940352, 'steps': 23320, 'batch_loss/train': 0.7135768877342343} 12/26/2021 01:06:16 - INFO - codeparrot_training - Step 23321: {'lr': 0.0002777880368794904, 'samples': 11940864, 'steps': 23321, 'batch_loss/train': 0.8245390225201845} 12/26/2021 01:06:30 - INFO - codeparrot_training - Step 23322: {'lr': 0.0002777723713678454, 'samples': 11941376, 'steps': 23322, 'batch_loss/train': 0.7527716676704586} 12/26/2021 01:06:41 - INFO - codeparrot_training - Step 23323: {'lr': 0.00027775670574578797, 'samples': 11941888, 'steps': 23323, 'batch_loss/train': 0.7230929229408503} 12/26/2021 01:06:51 - INFO - codeparrot_training - Step 23324: {'lr': 0.0002777410400133802, 'samples': 11942400, 'steps': 23324, 'batch_loss/train': 0.7111250348389149} 12/26/2021 01:07:03 - INFO - codeparrot_training - Step 23325: {'lr': 0.00027772537417068477, 'samples': 11942912, 'steps': 23325, 'batch_loss/train': 0.7264134585857391} 12/26/2021 01:07:14 - INFO - codeparrot_training - Step 23326: {'lr': 0.0002777097082177636, 'samples': 11943424, 'steps': 23326, 'batch_loss/train': 1.512011852581054} 12/26/2021 01:07:25 - INFO - codeparrot_training - Step 23327: {'lr': 0.0002776940421546791, 'samples': 11943936, 'steps': 23327, 'batch_loss/train': 0.7959496139083058} 12/26/2021 01:07:35 - INFO - codeparrot_training - Step 23328: {'lr': 0.0002776783759814935, 'samples': 11944448, 'steps': 23328, 'batch_loss/train': 0.7226737039163709} 12/26/2021 01:07:47 - INFO - codeparrot_training - Step 23329: {'lr': 0.00027766270969826925, 'samples': 11944960, 'steps': 23329, 'batch_loss/train': 0.7894274108111858} 12/26/2021 01:07:58 - INFO - codeparrot_training - Step 23330: {'lr': 0.00027764704330506836, 'samples': 11945472, 'steps': 
23330, 'batch_loss/train': 0.6943417573347688} 12/26/2021 01:08:09 - INFO - codeparrot_training - Step 23331: {'lr': 0.0002776313768019534, 'samples': 11945984, 'steps': 23331, 'batch_loss/train': 0.818920835852623} 12/26/2021 01:08:23 - INFO - codeparrot_training - Step 23332: {'lr': 0.0002776157101889864, 'samples': 11946496, 'steps': 23332, 'batch_loss/train': 0.7194341151043773} 12/26/2021 01:08:33 - INFO - codeparrot_training - Step 23333: {'lr': 0.00027760004346622983, 'samples': 11947008, 'steps': 23333, 'batch_loss/train': 0.6551992308814079} 12/26/2021 01:08:44 - INFO - codeparrot_training - Step 23334: {'lr': 0.0002775843766337459, 'samples': 11947520, 'steps': 23334, 'batch_loss/train': 0.7554569477215409} 12/26/2021 01:08:56 - INFO - codeparrot_training - Step 23335: {'lr': 0.00027756870969159684, 'samples': 11948032, 'steps': 23335, 'batch_loss/train': 0.7987104090861976} 12/26/2021 01:09:06 - INFO - codeparrot_training - Step 23336: {'lr': 0.00027755304263984515, 'samples': 11948544, 'steps': 23336, 'batch_loss/train': 0.7219437584280968} 12/26/2021 01:09:17 - INFO - codeparrot_training - Step 23337: {'lr': 0.0002775373754785528, 'samples': 11949056, 'steps': 23337, 'batch_loss/train': 0.7584366453811526} 12/26/2021 01:09:28 - INFO - codeparrot_training - Step 23338: {'lr': 0.0002775217082077824, 'samples': 11949568, 'steps': 23338, 'batch_loss/train': 0.7620759727433324} 12/26/2021 01:09:40 - INFO - codeparrot_training - Step 23339: {'lr': 0.0002775060408275959, 'samples': 11950080, 'steps': 23339, 'batch_loss/train': 0.7033872483298182} 12/26/2021 01:09:50 - INFO - codeparrot_training - Step 23340: {'lr': 0.0002774903733380559, 'samples': 11950592, 'steps': 23340, 'batch_loss/train': 0.7190898172557354} 12/26/2021 01:10:01 - INFO - codeparrot_training - Step 23341: {'lr': 0.0002774747057392245, 'samples': 11951104, 'steps': 23341, 'batch_loss/train': 0.8105793250724673} 12/26/2021 01:10:15 - INFO - codeparrot_training - Step 23342: {'lr': 
0.00027745903803116413, 'samples': 11951616, 'steps': 23342, 'batch_loss/train': 0.7630026498809457} 12/26/2021 01:10:25 - INFO - codeparrot_training - Step 23343: {'lr': 0.0002774433702139369, 'samples': 11952128, 'steps': 23343, 'batch_loss/train': 0.8174910871312022} 12/26/2021 01:10:36 - INFO - codeparrot_training - Step 23344: {'lr': 0.0002774277022876052, 'samples': 11952640, 'steps': 23344, 'batch_loss/train': 0.802331511862576} 12/26/2021 01:10:48 - INFO - codeparrot_training - Step 23345: {'lr': 0.00027741203425223136, 'samples': 11953152, 'steps': 23345, 'batch_loss/train': 0.8047509146854281} 12/26/2021 01:10:59 - INFO - codeparrot_training - Step 23346: {'lr': 0.0002773963661078776, 'samples': 11953664, 'steps': 23346, 'batch_loss/train': 0.8087655953131616} 12/26/2021 01:11:09 - INFO - codeparrot_training - Step 23347: {'lr': 0.0002773806978546062, 'samples': 11954176, 'steps': 23347, 'batch_loss/train': 0.7603583104209974} 12/26/2021 01:11:24 - INFO - codeparrot_training - Step 23348: {'lr': 0.00027736502949247954, 'samples': 11954688, 'steps': 23348, 'batch_loss/train': 0.7691974081099033} 12/26/2021 01:11:35 - INFO - codeparrot_training - Step 23349: {'lr': 0.0002773493610215599, 'samples': 11955200, 'steps': 23349, 'batch_loss/train': 0.827497900929302} 12/26/2021 01:11:45 - INFO - codeparrot_training - Step 23350: {'lr': 0.00027733369244190936, 'samples': 11955712, 'steps': 23350, 'batch_loss/train': 0.8898344850167632} 12/26/2021 01:11:56 - INFO - codeparrot_training - Step 23351: {'lr': 0.0002773180237535905, 'samples': 11956224, 'steps': 23351, 'batch_loss/train': 0.8194239153526723} 12/26/2021 01:12:08 - INFO - codeparrot_training - Step 23352: {'lr': 0.0002773023549566655, 'samples': 11956736, 'steps': 23352, 'batch_loss/train': 1.6170968981459737} 12/26/2021 01:12:18 - INFO - codeparrot_training - Step 23353: {'lr': 0.00027728668605119655, 'samples': 11957248, 'steps': 23353, 'batch_loss/train': 0.8100717626512051} 12/26/2021 01:12:29 - INFO 
- codeparrot_training - Step 23354: {'lr': 0.0002772710170372461, 'samples': 11957760, 'steps': 23354, 'batch_loss/train': 0.8133782893419266} 12/26/2021 01:12:41 - INFO - codeparrot_training - Step 23355: {'lr': 0.00027725534791487645, 'samples': 11958272, 'steps': 23355, 'batch_loss/train': 0.8055543410591781} 12/26/2021 01:12:52 - INFO - codeparrot_training - Step 23356: {'lr': 0.0002772396786841497, 'samples': 11958784, 'steps': 23356, 'batch_loss/train': 0.8791079844813794} 12/26/2021 01:13:02 - INFO - codeparrot_training - Step 23357: {'lr': 0.00027722400934512836, 'samples': 11959296, 'steps': 23357, 'batch_loss/train': 0.796000774949789} 12/26/2021 01:13:13 - INFO - codeparrot_training - Step 23358: {'lr': 0.00027720833989787464, 'samples': 11959808, 'steps': 23358, 'batch_loss/train': 0.8592907739803195} 12/26/2021 01:13:25 - INFO - codeparrot_training - Step 23359: {'lr': 0.0002771926703424508, 'samples': 11960320, 'steps': 23359, 'batch_loss/train': 0.8208424388431013} 12/26/2021 01:13:35 - INFO - codeparrot_training - Step 23360: {'lr': 0.00027717700067891914, 'samples': 11960832, 'steps': 23360, 'batch_loss/train': 0.8536047786474228} 12/26/2021 01:13:46 - INFO - codeparrot_training - Step 23361: {'lr': 0.00027716133090734205, 'samples': 11961344, 'steps': 23361, 'batch_loss/train': 0.7373025538399816} 12/26/2021 01:14:00 - INFO - codeparrot_training - Step 23362: {'lr': 0.00027714566102778174, 'samples': 11961856, 'steps': 23362, 'batch_loss/train': 0.7583923251368105} 12/26/2021 01:14:11 - INFO - codeparrot_training - Step 23363: {'lr': 0.00027712999104030053, 'samples': 11962368, 'steps': 23363, 'batch_loss/train': 0.7412408720701933} 12/26/2021 01:14:21 - INFO - codeparrot_training - Step 23364: {'lr': 0.00027711432094496073, 'samples': 11962880, 'steps': 23364, 'batch_loss/train': 0.7111959108151495} 12/26/2021 01:14:34 - INFO - codeparrot_training - Step 23365: {'lr': 0.00027709865074182467, 'samples': 11963392, 'steps': 23365, 
'batch_loss/train': 0.8642839742824435} 12/26/2021 01:14:44 - INFO - codeparrot_training - Step 23366: {'lr': 0.0002770829804309545, 'samples': 11963904, 'steps': 23366, 'batch_loss/train': 0.6563065109075978} 12/26/2021 01:14:55 - INFO - codeparrot_training - Step 23367: {'lr': 0.0002770673100124127, 'samples': 11964416, 'steps': 23367, 'batch_loss/train': 0.7037709271535277} 12/26/2021 01:15:06 - INFO - codeparrot_training - Step 23368: {'lr': 0.00027705163948626147, 'samples': 11964928, 'steps': 23368, 'batch_loss/train': 0.7310488494113088} 12/26/2021 01:15:19 - INFO - codeparrot_training - Step 23369: {'lr': 0.0002770359688525631, 'samples': 11965440, 'steps': 23369, 'batch_loss/train': 0.817218023352325} 12/26/2021 01:15:29 - INFO - codeparrot_training - Step 23370: {'lr': 0.00027702029811138, 'samples': 11965952, 'steps': 23370, 'batch_loss/train': 0.6979732785839587} 12/26/2021 01:15:40 - INFO - codeparrot_training - Step 23371: {'lr': 0.0002770046272627744, 'samples': 11966464, 'steps': 23371, 'batch_loss/train': 0.8542010863311589} 12/26/2021 01:15:52 - INFO - codeparrot_training - Step 23372: {'lr': 0.0002769889563068085, 'samples': 11966976, 'steps': 23372, 'batch_loss/train': 0.7635251255705953} 12/26/2021 01:16:02 - INFO - codeparrot_training - Step 23373: {'lr': 0.00027697328524354484, 'samples': 11967488, 'steps': 23373, 'batch_loss/train': 0.8144772090017796} 12/26/2021 01:16:13 - INFO - codeparrot_training - Step 23374: {'lr': 0.0002769576140730455, 'samples': 11968000, 'steps': 23374, 'batch_loss/train': 0.6655683570425026} 12/26/2021 01:16:25 - INFO - codeparrot_training - Step 23375: {'lr': 0.0002769419427953728, 'samples': 11968512, 'steps': 23375, 'batch_loss/train': 0.8276047939434648} 12/26/2021 01:16:36 - INFO - codeparrot_training - Step 23376: {'lr': 0.00027692627141058925, 'samples': 11969024, 'steps': 23376, 'batch_loss/train': 0.7467125048860908} 12/26/2021 01:16:46 - INFO - codeparrot_training - Step 23377: {'lr': 
0.00027691059991875693, 'samples': 11969536, 'steps': 23377, 'batch_loss/train': 0.737321563065052} 12/26/2021 01:16:57 - INFO - codeparrot_training - Step 23378: {'lr': 0.00027689492831993825, 'samples': 11970048, 'steps': 23378, 'batch_loss/train': 0.787389162927866} 12/26/2021 01:17:11 - INFO - codeparrot_training - Step 23379: {'lr': 0.0002768792566141955, 'samples': 11970560, 'steps': 23379, 'batch_loss/train': 0.7519342433661222} 12/26/2021 01:17:22 - INFO - codeparrot_training - Step 23380: {'lr': 0.0002768635848015909, 'samples': 11971072, 'steps': 23380, 'batch_loss/train': 0.7086167689412832} 12/26/2021 01:17:33 - INFO - codeparrot_training - Step 23381: {'lr': 0.00027684791288218693, 'samples': 11971584, 'steps': 23381, 'batch_loss/train': 0.7456059083342552} 12/26/2021 01:17:45 - INFO - codeparrot_training - Step 23382: {'lr': 0.0002768322408560457, 'samples': 11972096, 'steps': 23382, 'batch_loss/train': 0.702561246114783} 12/26/2021 01:17:55 - INFO - codeparrot_training - Step 23383: {'lr': 0.0002768165687232297, 'samples': 11972608, 'steps': 23383, 'batch_loss/train': 0.7037491393275559} 12/26/2021 01:18:06 - INFO - codeparrot_training - Step 23384: {'lr': 0.00027680089648380115, 'samples': 11973120, 'steps': 23384, 'batch_loss/train': 0.7770692626945674} 12/26/2021 01:18:18 - INFO - codeparrot_training - Step 23385: {'lr': 0.0002767852241378223, 'samples': 11973632, 'steps': 23385, 'batch_loss/train': 0.6156971375457942} 12/26/2021 01:18:29 - INFO - codeparrot_training - Step 23386: {'lr': 0.0002767695516853555, 'samples': 11974144, 'steps': 23386, 'batch_loss/train': 0.7839812012389302} 12/26/2021 01:18:39 - INFO - codeparrot_training - Step 23387: {'lr': 0.0002767538791264631, 'samples': 11974656, 'steps': 23387, 'batch_loss/train': 0.7906738696619868} 12/26/2021 01:18:51 - INFO - codeparrot_training - Step 23388: {'lr': 0.00027673820646120735, 'samples': 11975168, 'steps': 23388, 'batch_loss/train': 0.7462250087410212} 12/26/2021 01:19:02 - INFO 
- codeparrot_training - Step 23389: {'lr': 0.0002767225336896506, 'samples': 11975680, 'steps': 23389, 'batch_loss/train': 0.8128387024626136} 12/26/2021 01:19:12 - INFO - codeparrot_training - Step 23390: {'lr': 0.0002767068608118552, 'samples': 11976192, 'steps': 23390, 'batch_loss/train': 0.7501739866565913} 12/26/2021 01:19:23 - INFO - codeparrot_training - Step 23391: {'lr': 0.00027669118782788337, 'samples': 11976704, 'steps': 23391, 'batch_loss/train': 0.7761119310744107} 12/26/2021 01:19:37 - INFO - codeparrot_training - Step 23392: {'lr': 0.00027667551473779746, 'samples': 11977216, 'steps': 23392, 'batch_loss/train': 0.8087962022982538} 12/26/2021 01:19:48 - INFO - codeparrot_training - Step 23393: {'lr': 0.0002766598415416597, 'samples': 11977728, 'steps': 23393, 'batch_loss/train': 0.7970524225383997} 12/26/2021 01:19:58 - INFO - codeparrot_training - Step 23394: {'lr': 0.0002766441682395326, 'samples': 11978240, 'steps': 23394, 'batch_loss/train': 0.7243164079263806} 12/26/2021 01:20:10 - INFO - codeparrot_training - Step 23395: {'lr': 0.00027662849483147826, 'samples': 11978752, 'steps': 23395, 'batch_loss/train': 0.7725281110033393} 12/26/2021 01:20:21 - INFO - codeparrot_training - Step 23396: {'lr': 0.00027661282131755913, 'samples': 11979264, 'steps': 23396, 'batch_loss/train': 0.792465727776289} 12/26/2021 01:20:32 - INFO - codeparrot_training - Step 23397: {'lr': 0.00027659714769783753, 'samples': 11979776, 'steps': 23397, 'batch_loss/train': 0.7654272271320224} 12/26/2021 01:20:44 - INFO - codeparrot_training - Step 23398: {'lr': 0.00027658147397237554, 'samples': 11980288, 'steps': 23398, 'batch_loss/train': 0.8350076450733468} 12/26/2021 01:20:54 - INFO - codeparrot_training - Step 23399: {'lr': 0.00027656580014123576, 'samples': 11980800, 'steps': 23399, 'batch_loss/train': 0.7569049745798111} 12/26/2021 01:21:05 - INFO - codeparrot_training - Step 23400: {'lr': 0.0002765501262044804, 'samples': 11981312, 'steps': 23400, 'batch_loss/train': 
0.8012146474793553} 12/26/2021 01:21:16 - INFO - codeparrot_training - Step 23401: {'lr': 0.00027653445216217174, 'samples': 11981824, 'steps': 23401, 'batch_loss/train': 0.8440980757586658} 12/26/2021 01:21:30 - INFO - codeparrot_training - Step 23402: {'lr': 0.00027651877801437207, 'samples': 11982336, 'steps': 23402, 'batch_loss/train': 0.6958742188289762} 12/26/2021 01:21:40 - INFO - codeparrot_training - Step 23403: {'lr': 0.0002765031037611439, 'samples': 11982848, 'steps': 23403, 'batch_loss/train': 0.8558380892500281} 12/26/2021 01:21:51 - INFO - codeparrot_training - Step 23404: {'lr': 0.00027648742940254923, 'samples': 11983360, 'steps': 23404, 'batch_loss/train': 1.1711355713196099} 12/26/2021 01:22:03 - INFO - codeparrot_training - Step 23405: {'lr': 0.0002764717549386506, 'samples': 11983872, 'steps': 23405, 'batch_loss/train': 0.8384177223779261} 12/26/2021 01:22:14 - INFO - codeparrot_training - Step 23406: {'lr': 0.0002764560803695102, 'samples': 11984384, 'steps': 23406, 'batch_loss/train': 0.7477490613237023} 12/26/2021 01:22:24 - INFO - codeparrot_training - Step 23407: {'lr': 0.0002764404056951905, 'samples': 11984896, 'steps': 23407, 'batch_loss/train': 0.7579464907757938} 12/26/2021 01:22:35 - INFO - codeparrot_training - Step 23408: {'lr': 0.0002764247309157537, 'samples': 11985408, 'steps': 23408, 'batch_loss/train': 0.7478477564873174} 12/26/2021 01:22:48 - INFO - codeparrot_training - Step 23409: {'lr': 0.0002764090560312621, 'samples': 11985920, 'steps': 23409, 'batch_loss/train': 0.6986880437470973} 12/26/2021 01:22:58 - INFO - codeparrot_training - Step 23410: {'lr': 0.00027639338104177826, 'samples': 11986432, 'steps': 23410, 'batch_loss/train': 0.6289715261664242} 12/26/2021 01:23:09 - INFO - codeparrot_training - Step 23411: {'lr': 0.00027637770594736406, 'samples': 11986944, 'steps': 23411, 'batch_loss/train': 0.7691820953041315} 12/26/2021 01:23:21 - INFO - codeparrot_training - Step 23412: {'lr': 0.00027636203074808224, 'samples': 
11987456, 'steps': 23412, 'batch_loss/train': 0.9203179981559515} 12/26/2021 01:23:32 - INFO - codeparrot_training - Step 23413: {'lr': 0.00027634635544399483, 'samples': 11987968, 'steps': 23413, 'batch_loss/train': 0.7711040480062366} 12/26/2021 01:23:42 - INFO - codeparrot_training - Step 23414: {'lr': 0.00027633068003516433, 'samples': 11988480, 'steps': 23414, 'batch_loss/train': 0.8275698879733682} 12/26/2021 01:23:54 - INFO - codeparrot_training - Step 23415: {'lr': 0.0002763150045216529, 'samples': 11988992, 'steps': 23415, 'batch_loss/train': 0.7850630679167807} 12/26/2021 01:24:05 - INFO - codeparrot_training - Step 23416: {'lr': 0.00027629932890352305, 'samples': 11989504, 'steps': 23416, 'batch_loss/train': 0.6936139953322709} 12/26/2021 01:24:15 - INFO - codeparrot_training - Step 23417: {'lr': 0.000276283653180837, 'samples': 11990016, 'steps': 23417, 'batch_loss/train': 0.7093727644532919} 12/26/2021 01:24:28 - INFO - codeparrot_training - Step 23418: {'lr': 0.00027626797735365695, 'samples': 11990528, 'steps': 23418, 'batch_loss/train': 0.9791971361264586} 12/26/2021 01:24:39 - INFO - codeparrot_training - Step 23419: {'lr': 0.0002762523014220454, 'samples': 11991040, 'steps': 23419, 'batch_loss/train': 0.6942548980005085} 12/26/2021 01:24:49 - INFO - codeparrot_training - Step 23420: {'lr': 0.0002762366253860646, 'samples': 11991552, 'steps': 23420, 'batch_loss/train': 0.8074172530323267} 12/26/2021 01:25:00 - INFO - codeparrot_training - Step 23421: {'lr': 0.00027622094924577696, 'samples': 11992064, 'steps': 23421, 'batch_loss/train': 0.7944452408701181} 12/26/2021 01:25:12 - INFO - codeparrot_training - Step 23422: {'lr': 0.0002762052730012447, 'samples': 11992576, 'steps': 23422, 'batch_loss/train': 0.7439493965357542} 12/26/2021 01:25:23 - INFO - codeparrot_training - Step 23423: {'lr': 0.00027618959665253015, 'samples': 11993088, 'steps': 23423, 'batch_loss/train': 0.629825651878491} 12/26/2021 01:25:33 - INFO - codeparrot_training - Step 
23424: {'lr': 0.0002761739201996956, 'samples': 11993600, 'steps': 23424, 'batch_loss/train': 0.7444166257046163} 12/26/2021 01:25:46 - INFO - codeparrot_training - Step 23425: {'lr': 0.0002761582436428035, 'samples': 11994112, 'steps': 23425, 'batch_loss/train': 0.8555805026553571} 12/26/2021 01:25:57 - INFO - codeparrot_training - Step 23426: {'lr': 0.00027614256698191606, 'samples': 11994624, 'steps': 23426, 'batch_loss/train': 0.7448477759025991} 12/26/2021 01:26:07 - INFO - codeparrot_training - Step 23427: {'lr': 0.00027612689021709564, 'samples': 11995136, 'steps': 23427, 'batch_loss/train': 0.7991004642099142} 12/26/2021 01:26:19 - INFO - codeparrot_training - Step 23428: {'lr': 0.0002761112133484046, 'samples': 11995648, 'steps': 23428, 'batch_loss/train': 1.0293505566660315} 12/26/2021 01:26:30 - INFO - codeparrot_training - Step 23429: {'lr': 0.0002760955363759053, 'samples': 11996160, 'steps': 23429, 'batch_loss/train': 0.7620790917426348} 12/26/2021 01:26:40 - INFO - codeparrot_training - Step 23430: {'lr': 0.0002760798592996599, 'samples': 11996672, 'steps': 23430, 'batch_loss/train': 0.7817711783573031} 12/26/2021 01:26:51 - INFO - codeparrot_training - Step 23431: {'lr': 0.0002760641821197308, 'samples': 11997184, 'steps': 23431, 'batch_loss/train': 0.741628441028297} 12/26/2021 01:27:03 - INFO - codeparrot_training - Step 23432: {'lr': 0.00027604850483618044, 'samples': 11997696, 'steps': 23432, 'batch_loss/train': 0.8111198721453547} 12/26/2021 01:27:14 - INFO - codeparrot_training - Step 23433: {'lr': 0.000276032827449071, 'samples': 11998208, 'steps': 23433, 'batch_loss/train': 0.6678508706972934} 12/26/2021 01:27:24 - INFO - codeparrot_training - Step 23434: {'lr': 0.0002760171499584649, 'samples': 11998720, 'steps': 23434, 'batch_loss/train': 0.8014120701700449} 12/26/2021 01:27:37 - INFO - codeparrot_training - Step 23435: {'lr': 0.00027600147236442443, 'samples': 11999232, 'steps': 23435, 'batch_loss/train': 0.7754638269543648} 12/26/2021 
01:27:47 - INFO - codeparrot_training - Step 23436: {'lr': 0.0002759857946670119, 'samples': 11999744, 'steps': 23436, 'batch_loss/train': 0.7603265875950456} 12/26/2021 01:27:58 - INFO - codeparrot_training - Step 23437: {'lr': 0.00027597011686628965, 'samples': 12000256, 'steps': 23437, 'batch_loss/train': 0.8470420390367508} 12/26/2021 01:28:11 - INFO - codeparrot_training - Step 23438: {'lr': 0.0002759544389623201, 'samples': 12000768, 'steps': 23438, 'batch_loss/train': 0.6794829424470663} 12/26/2021 01:28:21 - INFO - codeparrot_training - Step 23439: {'lr': 0.00027593876095516546, 'samples': 12001280, 'steps': 23439, 'batch_loss/train': 0.6836934462189674} 12/26/2021 01:28:32 - INFO - codeparrot_training - Step 23440: {'lr': 0.0002759230828448881, 'samples': 12001792, 'steps': 23440, 'batch_loss/train': 0.824966668151319} 12/26/2021 01:28:44 - INFO - codeparrot_training - Step 23441: {'lr': 0.0002759074046315503, 'samples': 12002304, 'steps': 23441, 'batch_loss/train': 0.6950786281377077} 12/26/2021 01:28:55 - INFO - codeparrot_training - Step 23442: {'lr': 0.0002758917263152145, 'samples': 12002816, 'steps': 23442, 'batch_loss/train': 0.6250717635266483} 12/26/2021 01:29:06 - INFO - codeparrot_training - Step 23443: {'lr': 0.00027587604789594295, 'samples': 12003328, 'steps': 23443, 'batch_loss/train': 0.7326618954539299} 12/26/2021 01:29:16 - INFO - codeparrot_training - Step 23444: {'lr': 0.000275860369373798, 'samples': 12003840, 'steps': 23444, 'batch_loss/train': 0.7928802110254765} 12/26/2021 01:29:29 - INFO - codeparrot_training - Step 23445: {'lr': 0.00027584469074884205, 'samples': 12004352, 'steps': 23445, 'batch_loss/train': 0.7776487614028156} 12/26/2021 01:29:39 - INFO - codeparrot_training - Step 23446: {'lr': 0.0002758290120211373, 'samples': 12004864, 'steps': 23446, 'batch_loss/train': 0.5783950777258724} 12/26/2021 01:29:50 - INFO - codeparrot_training - Step 23447: {'lr': 0.0002758133331907461, 'samples': 12005376, 'steps': 23447, 
'batch_loss/train': 0.8952086353674531} 12/26/2021 01:30:02 - INFO - codeparrot_training - Step 23448: {'lr': 0.00027579765425773093, 'samples': 12005888, 'steps': 23448, 'batch_loss/train': 0.769868077011779} 12/26/2021 01:30:13 - INFO - codeparrot_training - Step 23449: {'lr': 0.00027578197522215407, 'samples': 12006400, 'steps': 23449, 'batch_loss/train': 0.7720072029042058} 12/26/2021 01:30:24 - INFO - codeparrot_training - Step 23450: {'lr': 0.00027576629608407765, 'samples': 12006912, 'steps': 23450, 'batch_loss/train': 0.8170340809738263} 12/26/2021 01:30:36 - INFO - codeparrot_training - Step 23451: {'lr': 0.00027575061684356435, 'samples': 12007424, 'steps': 23451, 'batch_loss/train': 0.7468596410471946} 12/26/2021 01:30:47 - INFO - codeparrot_training - Step 23452: {'lr': 0.0002757349375006762, 'samples': 12007936, 'steps': 23452, 'batch_loss/train': 0.799640589626506} 12/26/2021 01:30:57 - INFO - codeparrot_training - Step 23453: {'lr': 0.00027571925805547565, 'samples': 12008448, 'steps': 23453, 'batch_loss/train': 0.5855240705423057} 12/26/2021 01:31:08 - INFO - codeparrot_training - Step 23454: {'lr': 0.0002757035785080251, 'samples': 12008960, 'steps': 23454, 'batch_loss/train': 0.7417498347349465} 12/26/2021 01:31:20 - INFO - codeparrot_training - Step 23455: {'lr': 0.0002756878988583868, 'samples': 12009472, 'steps': 23455, 'batch_loss/train': 0.8207940515130758} 12/26/2021 01:31:30 - INFO - codeparrot_training - Step 23456: {'lr': 0.0002756722191066231, 'samples': 12009984, 'steps': 23456, 'batch_loss/train': 0.7539149078074843} 12/26/2021 01:31:41 - INFO - codeparrot_training - Step 23457: {'lr': 0.0002756565392527963, 'samples': 12010496, 'steps': 23457, 'batch_loss/train': 0.8774367910809815} 12/26/2021 01:31:54 - INFO - codeparrot_training - Step 23458: {'lr': 0.00027564085929696893, 'samples': 12011008, 'steps': 23458, 'batch_loss/train': 0.6221006224513985} 12/26/2021 01:32:05 - INFO - codeparrot_training - Step 23459: {'lr': 
0.0002756251792392031, 'samples': 12011520, 'steps': 23459, 'batch_loss/train': 0.7392712973523885} 12/26/2021 01:32:15 - INFO - codeparrot_training - Step 23460: {'lr': 0.0002756094990795612, 'samples': 12012032, 'steps': 23460, 'batch_loss/train': 0.7752074273303151} 12/26/2021 01:32:26 - INFO - codeparrot_training - Step 23461: {'lr': 0.0002755938188181057, 'samples': 12012544, 'steps': 23461, 'batch_loss/train': 0.8927123453468084} 12/26/2021 01:32:38 - INFO - codeparrot_training - Step 23462: {'lr': 0.0002755781384548988, 'samples': 12013056, 'steps': 23462, 'batch_loss/train': 0.9649471016600728} 12/26/2021 01:32:49 - INFO - codeparrot_training - Step 23463: {'lr': 0.0002755624579900028, 'samples': 12013568, 'steps': 23463, 'batch_loss/train': 0.6831152834929526} 12/26/2021 01:32:59 - INFO - codeparrot_training - Step 23464: {'lr': 0.00027554677742348024, 'samples': 12014080, 'steps': 23464, 'batch_loss/train': 0.7677428843453526} 12/26/2021 01:33:12 - INFO - codeparrot_training - Step 23465: {'lr': 0.0002755310967553933, 'samples': 12014592, 'steps': 23465, 'batch_loss/train': 0.8442425718531013} 12/26/2021 01:33:22 - INFO - codeparrot_training - Step 23466: {'lr': 0.00027551541598580434, 'samples': 12015104, 'steps': 23466, 'batch_loss/train': 0.48286119813565165} 12/26/2021 01:33:33 - INFO - codeparrot_training - Step 23467: {'lr': 0.00027549973511477565, 'samples': 12015616, 'steps': 23467, 'batch_loss/train': 0.7786768320947886} 12/26/2021 01:33:46 - INFO - codeparrot_training - Step 23468: {'lr': 0.0002754840541423697, 'samples': 12016128, 'steps': 23468, 'batch_loss/train': 0.8167119713034481} 12/26/2021 01:33:57 - INFO - codeparrot_training - Step 23469: {'lr': 0.0002754683730686488, 'samples': 12016640, 'steps': 23469, 'batch_loss/train': 0.8048767833970487} 12/26/2021 01:34:07 - INFO - codeparrot_training - Step 23470: {'lr': 0.0002754526918936752, 'samples': 12017152, 'steps': 23470, 'batch_loss/train': 0.6506520337425172} 12/26/2021 01:34:18 - 
INFO - codeparrot_training - Step 23471: {'lr': 0.00027543701061751144, 'samples': 12017664, 'steps': 23471, 'batch_loss/train': 0.8059054100885987} 12/26/2021 01:34:30 - INFO - codeparrot_training - Step 23472: {'lr': 0.00027542132924021965, 'samples': 12018176, 'steps': 23472, 'batch_loss/train': 0.7676869706483558} 12/26/2021 01:34:40 - INFO - codeparrot_training - Step 23473: {'lr': 0.0002754056477618622, 'samples': 12018688, 'steps': 23473, 'batch_loss/train': 0.8763722600415349} 12/26/2021 01:34:51 - INFO - codeparrot_training - Step 23474: {'lr': 0.0002753899661825016, 'samples': 12019200, 'steps': 23474, 'batch_loss/train': 0.7381751388311386} 12/26/2021 01:35:03 - INFO - codeparrot_training - Step 23475: {'lr': 0.00027537428450220006, 'samples': 12019712, 'steps': 23475, 'batch_loss/train': 0.790801364229992} 12/26/2021 01:35:14 - INFO - codeparrot_training - Step 23476: {'lr': 0.0002753586027210198, 'samples': 12020224, 'steps': 23476, 'batch_loss/train': 0.8001634292304516} 12/26/2021 01:35:24 - INFO - codeparrot_training - Step 23477: {'lr': 0.00027534292083902345, 'samples': 12020736, 'steps': 23477, 'batch_loss/train': 0.6951576620340347} 12/26/2021 01:35:37 - INFO - codeparrot_training - Step 23478: {'lr': 0.0002753272388562732, 'samples': 12021248, 'steps': 23478, 'batch_loss/train': 0.7727571856230497} 12/26/2021 01:35:48 - INFO - codeparrot_training - Step 23479: {'lr': 0.0002753115567728314, 'samples': 12021760, 'steps': 23479, 'batch_loss/train': 0.7386952508240938} 12/26/2021 01:35:58 - INFO - codeparrot_training - Step 23480: {'lr': 0.00027529587458876035, 'samples': 12022272, 'steps': 23480, 'batch_loss/train': 0.8528391160070896} 12/26/2021 01:36:10 - INFO - codeparrot_training - Step 23481: {'lr': 0.00027528019230412253, 'samples': 12022784, 'steps': 23481, 'batch_loss/train': 0.8260260848328471} 12/26/2021 01:36:21 - INFO - codeparrot_training - Step 23482: {'lr': 0.00027526450991898014, 'samples': 12023296, 'steps': 23482, 
'batch_loss/train': 0.7422072342596948} 12/26/2021 01:36:31 - INFO - codeparrot_training - Step 23483: {'lr': 0.0002752488274333957, 'samples': 12023808, 'steps': 23483, 'batch_loss/train': 0.7719581718556583} 12/26/2021 01:36:42 - INFO - codeparrot_training - Step 23484: {'lr': 0.0002752331448474314, 'samples': 12024320, 'steps': 23484, 'batch_loss/train': 0.769119349133689} 12/26/2021 01:36:55 - INFO - codeparrot_training - Step 23485: {'lr': 0.00027521746216114965, 'samples': 12024832, 'steps': 23485, 'batch_loss/train': 0.8067571073770523} 12/26/2021 01:37:05 - INFO - codeparrot_training - Step 23486: {'lr': 0.0002752017793746128, 'samples': 12025344, 'steps': 23486, 'batch_loss/train': 0.7136743990704417} 12/26/2021 01:37:16 - INFO - codeparrot_training - Step 23487: {'lr': 0.00027518609648788313, 'samples': 12025856, 'steps': 23487, 'batch_loss/train': 0.682912808842957} 12/26/2021 01:37:28 - INFO - codeparrot_training - Step 23488: {'lr': 0.0002751704135010231, 'samples': 12026368, 'steps': 23488, 'batch_loss/train': 0.7158128181472421} 12/26/2021 01:37:38 - INFO - codeparrot_training - Step 23489: {'lr': 0.000275154730414095, 'samples': 12026880, 'steps': 23489, 'batch_loss/train': 0.7342220079153776} 12/26/2021 01:37:49 - INFO - codeparrot_training - Step 23490: {'lr': 0.0002751390472271613, 'samples': 12027392, 'steps': 23490, 'batch_loss/train': 0.730783543549478} 12/26/2021 01:38:01 - INFO - codeparrot_training - Step 23491: {'lr': 0.0002751233639402841, 'samples': 12027904, 'steps': 23491, 'batch_loss/train': 0.6254870577249676} 12/26/2021 01:38:12 - INFO - codeparrot_training - Step 23492: {'lr': 0.0002751076805535259, 'samples': 12028416, 'steps': 23492, 'batch_loss/train': 0.7009410974569619} 12/26/2021 01:38:23 - INFO - codeparrot_training - Step 23493: {'lr': 0.00027509199706694906, 'samples': 12028928, 'steps': 23493, 'batch_loss/train': 0.7119410177692771} 12/26/2021 01:38:33 - INFO - codeparrot_training - Step 23494: {'lr': 
0.0002750763134806159, 'samples': 12029440, 'steps': 23494, 'batch_loss/train': 0.9831846188753843} 12/26/2021 01:38:45 - INFO - codeparrot_training - Step 23495: {'lr': 0.0002750606297945888, 'samples': 12029952, 'steps': 23495, 'batch_loss/train': 0.7536455038934946} 12/26/2021 01:38:56 - INFO - codeparrot_training - Step 23496: {'lr': 0.0002750449460089301, 'samples': 12030464, 'steps': 23496, 'batch_loss/train': 0.8048032419756055} 12/26/2021 01:39:07 - INFO - codeparrot_training - Step 23497: {'lr': 0.00027502926212370223, 'samples': 12030976, 'steps': 23497, 'batch_loss/train': 0.8268554444657639} 12/26/2021 01:39:19 - INFO - codeparrot_training - Step 23498: {'lr': 0.00027501357813896737, 'samples': 12031488, 'steps': 23498, 'batch_loss/train': 0.7491114316508174} 12/26/2021 01:39:30 - INFO - codeparrot_training - Step 23499: {'lr': 0.00027499789405478796, 'samples': 12032000, 'steps': 23499, 'batch_loss/train': 0.8056773887947202} 12/26/2021 01:39:41 - INFO - codeparrot_training - Step 23500: {'lr': 0.0002749822098712264, 'samples': 12032512, 'steps': 23500, 'batch_loss/train': 0.7680508680641651} 12/26/2021 01:39:53 - INFO - codeparrot_training - Step 23501: {'lr': 0.000274966525588345, 'samples': 12033024, 'steps': 23501, 'batch_loss/train': 0.7969144973903894} 12/26/2021 01:40:03 - INFO - codeparrot_training - Step 23502: {'lr': 0.00027495084120620613, 'samples': 12033536, 'steps': 23502, 'batch_loss/train': 0.7348198248073459} 12/26/2021 01:40:14 - INFO - codeparrot_training - Step 23503: {'lr': 0.0002749351567248722, 'samples': 12034048, 'steps': 23503, 'batch_loss/train': 0.7942602885887027} 12/26/2021 01:40:24 - INFO - codeparrot_training - Step 23504: {'lr': 0.0002749194721444054, 'samples': 12034560, 'steps': 23504, 'batch_loss/train': 0.7483995566144586} 12/26/2021 01:40:36 - INFO - codeparrot_training - Step 23505: {'lr': 0.00027490378746486817, 'samples': 12035072, 'steps': 23505, 'batch_loss/train': 0.7870098575949669} 12/26/2021 01:40:47 - 
INFO - codeparrot_training - Step 23506: {'lr': 0.000274888102686323, 'samples': 12035584, 'steps': 23506, 'batch_loss/train': 0.8527116104960442} 12/26/2021 01:40:58 - INFO - codeparrot_training - Step 23507: {'lr': 0.000274872417808832, 'samples': 12036096, 'steps': 23507, 'batch_loss/train': 0.7932345867156982} 12/26/2021 01:41:11 - INFO - codeparrot_training - Step 23508: {'lr': 0.0002748567328324577, 'samples': 12036608, 'steps': 23508, 'batch_loss/train': 0.7804672471247613} 12/26/2021 01:41:21 - INFO - codeparrot_training - Step 23509: {'lr': 0.0002748410477572625, 'samples': 12037120, 'steps': 23509, 'batch_loss/train': 0.6975005264393985} 12/26/2021 01:41:32 - INFO - codeparrot_training - Step 23510: {'lr': 0.00027482536258330867, 'samples': 12037632, 'steps': 23510, 'batch_loss/train': 1.5230974350124598} 12/26/2021 01:41:44 - INFO - codeparrot_training - Step 23511: {'lr': 0.00027480967731065845, 'samples': 12038144, 'steps': 23511, 'batch_loss/train': 0.851087499409914} 12/26/2021 01:41:55 - INFO - codeparrot_training - Step 23512: {'lr': 0.0002747939919393744, 'samples': 12038656, 'steps': 23512, 'batch_loss/train': 0.7021890981122851} 12/26/2021 01:42:05 - INFO - codeparrot_training - Step 23513: {'lr': 0.00027477830646951875, 'samples': 12039168, 'steps': 23513, 'batch_loss/train': 0.7838164325803518} 12/26/2021 01:42:16 - INFO - codeparrot_training - Step 23514: {'lr': 0.00027476262090115396, 'samples': 12039680, 'steps': 23514, 'batch_loss/train': 0.7746264925226569} 12/26/2021 01:42:28 - INFO - codeparrot_training - Step 23515: {'lr': 0.0002747469352343423, 'samples': 12040192, 'steps': 23515, 'batch_loss/train': 0.7288443436846137} 12/26/2021 01:42:38 - INFO - codeparrot_training - Step 23516: {'lr': 0.00027473124946914625, 'samples': 12040704, 'steps': 23516, 'batch_loss/train': 0.6757853268645704} 12/26/2021 01:42:49 - INFO - codeparrot_training - Step 23517: {'lr': 0.000274715563605628, 'samples': 12041216, 'steps': 23517, 'batch_loss/train': 
0.7517348130932078} 12/26/2021 01:43:02 - INFO - codeparrot_training - Step 23518: {'lr': 0.00027469987764385, 'samples': 12041728, 'steps': 23518, 'batch_loss/train': 0.8300437885336578} 12/26/2021 01:43:13 - INFO - codeparrot_training - Step 23519: {'lr': 0.0002746841915838747, 'samples': 12042240, 'steps': 23519, 'batch_loss/train': 0.7450947922188789} 12/26/2021 01:43:23 - INFO - codeparrot_training - Step 23520: {'lr': 0.00027466850542576433, 'samples': 12042752, 'steps': 23520, 'batch_loss/train': 0.7912795250304043} 12/26/2021 01:43:35 - INFO - codeparrot_training - Step 23521: {'lr': 0.00027465281916958125, 'samples': 12043264, 'steps': 23521, 'batch_loss/train': 1.562438557855785} 12/26/2021 01:43:46 - INFO - codeparrot_training - Step 23522: {'lr': 0.00027463713281538786, 'samples': 12043776, 'steps': 23522, 'batch_loss/train': 0.7266819886863232} 12/26/2021 01:43:56 - INFO - codeparrot_training - Step 23523: {'lr': 0.00027462144636324667, 'samples': 12044288, 'steps': 23523, 'batch_loss/train': 0.7607665928080678} 12/26/2021 01:44:07 - INFO - codeparrot_training - Step 23524: {'lr': 0.0002746057598132198, 'samples': 12044800, 'steps': 23524, 'batch_loss/train': 0.746965330094099} 12/26/2021 01:44:20 - INFO - codeparrot_training - Step 23525: {'lr': 0.00027459007316536977, 'samples': 12045312, 'steps': 23525, 'batch_loss/train': 0.7551025515422225} 12/26/2021 01:44:31 - INFO - codeparrot_training - Step 23526: {'lr': 0.00027457438641975885, 'samples': 12045824, 'steps': 23526, 'batch_loss/train': 0.6970622763037682} 12/26/2021 01:44:41 - INFO - codeparrot_training - Step 23527: {'lr': 0.0002745586995764494, 'samples': 12046336, 'steps': 23527, 'batch_loss/train': 0.8109558490104973} 12/26/2021 01:44:53 - INFO - codeparrot_training - Step 23528: {'lr': 0.0002745430126355039, 'samples': 12046848, 'steps': 23528, 'batch_loss/train': 0.8518787883222103} 12/26/2021 01:45:04 - INFO - codeparrot_training - Step 23529: {'lr': 0.00027452732559698465, 'samples': 
12047360, 'steps': 23529, 'batch_loss/train': 0.7142540076747537} 12/26/2021 01:45:14 - INFO - codeparrot_training - Step 23530: {'lr': 0.00027451163846095404, 'samples': 12047872, 'steps': 23530, 'batch_loss/train': 0.7803433053195477} 12/26/2021 01:45:27 - INFO - codeparrot_training - Step 23531: {'lr': 0.0002744959512274743, 'samples': 12048384, 'steps': 23531, 'batch_loss/train': 0.7476359470747411} 12/26/2021 01:45:37 - INFO - codeparrot_training - Step 23532: {'lr': 0.0002744802638966081, 'samples': 12048896, 'steps': 23532, 'batch_loss/train': 0.8055145125836134} 12/26/2021 01:45:48 - INFO - codeparrot_training - Step 23533: {'lr': 0.0002744645764684174, 'samples': 12049408, 'steps': 23533, 'batch_loss/train': 0.7544957040809095} 12/26/2021 01:46:00 - INFO - codeparrot_training - Step 23534: {'lr': 0.0002744488889429649, 'samples': 12049920, 'steps': 23534, 'batch_loss/train': 0.8251630971208215} 12/26/2021 01:46:11 - INFO - codeparrot_training - Step 23535: {'lr': 0.00027443320132031283, 'samples': 12050432, 'steps': 23535, 'batch_loss/train': 0.6766740682069212} 12/26/2021 01:46:22 - INFO - codeparrot_training - Step 23536: {'lr': 0.00027441751360052365, 'samples': 12050944, 'steps': 23536, 'batch_loss/train': 0.7000371133908629} 12/26/2021 01:46:32 - INFO - codeparrot_training - Step 23537: {'lr': 0.0002744018257836595, 'samples': 12051456, 'steps': 23537, 'batch_loss/train': 0.8739042433444411} 12/26/2021 01:46:44 - INFO - codeparrot_training - Step 23538: {'lr': 0.0002743861378697831, 'samples': 12051968, 'steps': 23538, 'batch_loss/train': 0.7651578364893794} 12/26/2021 01:46:55 - INFO - codeparrot_training - Step 23539: {'lr': 0.0002743704498589564, 'samples': 12052480, 'steps': 23539, 'batch_loss/train': 0.737056608311832} 12/26/2021 01:47:06 - INFO - codeparrot_training - Step 23540: {'lr': 0.00027435476175124214, 'samples': 12052992, 'steps': 23540, 'batch_loss/train': 0.7383203767240047} 12/26/2021 01:47:18 - INFO - codeparrot_training - Step 
23541: {'lr': 0.0002743390735467025, 'samples': 12053504, 'steps': 23541, 'batch_loss/train': 0.7646727436222136} 12/26/2021 01:47:28 - INFO - codeparrot_training - Step 23542: {'lr': 0.0002743233852453999, 'samples': 12054016, 'steps': 23542, 'batch_loss/train': 0.6873707165941596} 12/26/2021 01:47:39 - INFO - codeparrot_training - Step 23543: {'lr': 0.00027430769684739674, 'samples': 12054528, 'steps': 23543, 'batch_loss/train': 0.8470668029040098} 12/26/2021 01:47:51 - INFO - codeparrot_training - Step 23544: {'lr': 0.00027429200835275526, 'samples': 12055040, 'steps': 23544, 'batch_loss/train': 0.8204845627769828} 12/26/2021 01:48:02 - INFO - codeparrot_training - Step 23545: {'lr': 0.000274276319761538, 'samples': 12055552, 'steps': 23545, 'batch_loss/train': 0.8448652373626828} 12/26/2021 01:48:12 - INFO - codeparrot_training - Step 23546: {'lr': 0.00027426063107380727, 'samples': 12056064, 'steps': 23546, 'batch_loss/train': 0.7745516989380121} 12/26/2021 01:48:23 - INFO - codeparrot_training - Step 23547: {'lr': 0.0002742449422896254, 'samples': 12056576, 'steps': 23547, 'batch_loss/train': 0.7709155101329088} 12/26/2021 01:48:36 - INFO - codeparrot_training - Step 23548: {'lr': 0.00027422925340905483, 'samples': 12057088, 'steps': 23548, 'batch_loss/train': 0.7706423555500805} 12/26/2021 01:48:46 - INFO - codeparrot_training - Step 23549: {'lr': 0.0002742135644321579, 'samples': 12057600, 'steps': 23549, 'batch_loss/train': 0.8033277527429163} 12/26/2021 01:48:57 - INFO - codeparrot_training - Step 23550: {'lr': 0.0002741978753589969, 'samples': 12058112, 'steps': 23550, 'batch_loss/train': 1.8041399815119803} 12/26/2021 01:49:09 - INFO - codeparrot_training - Step 23551: {'lr': 0.00027418218618963433, 'samples': 12058624, 'steps': 23551, 'batch_loss/train': 0.8097545029595494} 12/26/2021 01:49:20 - INFO - codeparrot_training - Step 23552: {'lr': 0.00027416649692413254, 'samples': 12059136, 'steps': 23552, 'batch_loss/train': 0.6366275949403644} 12/26/2021 
01:49:31 - INFO - codeparrot_training - Step 23553: {'lr': 0.00027415080756255385, 'samples': 12059648, 'steps': 23553, 'batch_loss/train': 0.8387943413108587} 12/26/2021 01:49:44 - INFO - codeparrot_training - Step 23554: {'lr': 0.00027413511810496066, 'samples': 12060160, 'steps': 23554, 'batch_loss/train': 0.827556966803968} 12/26/2021 01:49:54 - INFO - codeparrot_training - Step 23555: {'lr': 0.00027411942855141535, 'samples': 12060672, 'steps': 23555, 'batch_loss/train': 0.796554266475141} 12/26/2021 01:50:05 - INFO - codeparrot_training - Step 23556: {'lr': 0.00027410373890198033, 'samples': 12061184, 'steps': 23556, 'batch_loss/train': 0.7684158990159631} 12/26/2021 01:50:15 - INFO - codeparrot_training - Step 23557: {'lr': 0.00027408804915671794, 'samples': 12061696, 'steps': 23557, 'batch_loss/train': 0.9154341304674745} 12/26/2021 01:50:27 - INFO - codeparrot_training - Step 23558: {'lr': 0.00027407235931569063, 'samples': 12062208, 'steps': 23558, 'batch_loss/train': 0.6485136995906942} 12/26/2021 01:50:38 - INFO - codeparrot_training - Step 23559: {'lr': 0.00027405666937896063, 'samples': 12062720, 'steps': 23559, 'batch_loss/train': 0.7300959397107363} 12/26/2021 01:50:49 - INFO - codeparrot_training - Step 23560: {'lr': 0.0002740409793465904, 'samples': 12063232, 'steps': 23560, 'batch_loss/train': 0.9331448646262288} 12/26/2021 01:51:01 - INFO - codeparrot_training - Step 23561: {'lr': 0.00027402528921864226, 'samples': 12063744, 'steps': 23561, 'batch_loss/train': 0.8090773457661271} 12/26/2021 01:51:11 - INFO - codeparrot_training - Step 23562: {'lr': 0.0002740095989951788, 'samples': 12064256, 'steps': 23562, 'batch_loss/train': 0.7015138622373343} 12/26/2021 01:51:22 - INFO - codeparrot_training - Step 23563: {'lr': 0.00027399390867626205, 'samples': 12064768, 'steps': 23563, 'batch_loss/train': 0.7476536268368363} 12/26/2021 01:51:35 - INFO - codeparrot_training - Step 23564: {'lr': 0.00027397821826195476, 'samples': 12065280, 'steps': 23564, 
'batch_loss/train': 0.6512306830845773} 12/26/2021 01:51:45 - INFO - codeparrot_training - Step 23565: {'lr': 0.000273962527752319, 'samples': 12065792, 'steps': 23565, 'batch_loss/train': 0.7428092444315553} 12/26/2021 01:51:56 - INFO - codeparrot_training - Step 23566: {'lr': 0.0002739468371474173, 'samples': 12066304, 'steps': 23566, 'batch_loss/train': 0.7149891541339457} 12/26/2021 01:52:06 - INFO - codeparrot_training - Step 23567: {'lr': 0.00027393114644731206, 'samples': 12066816, 'steps': 23567, 'batch_loss/train': 0.7870610998943448} 12/26/2021 01:52:18 - INFO - codeparrot_training - Step 23568: {'lr': 0.00027391545565206557, 'samples': 12067328, 'steps': 23568, 'batch_loss/train': 0.7565038371831179} 12/26/2021 01:52:29 - INFO - codeparrot_training - Step 23569: {'lr': 0.0002738997647617403, 'samples': 12067840, 'steps': 23569, 'batch_loss/train': 0.6923675802536309} 12/26/2021 01:52:40 - INFO - codeparrot_training - Step 23570: {'lr': 0.0002738840737763985, 'samples': 12068352, 'steps': 23570, 'batch_loss/train': 0.7983898769598454} 12/26/2021 01:52:53 - INFO - codeparrot_training - Step 23571: {'lr': 0.0002738683826961028, 'samples': 12068864, 'steps': 23571, 'batch_loss/train': 0.7332077082246542} 12/26/2021 01:53:03 - INFO - codeparrot_training - Step 23572: {'lr': 0.0002738526915209153, 'samples': 12069376, 'steps': 23572, 'batch_loss/train': 0.7836702251806855} 12/26/2021 01:53:14 - INFO - codeparrot_training - Step 23573: {'lr': 0.0002738370002508985, 'samples': 12069888, 'steps': 23573, 'batch_loss/train': 0.7678982121869922} 12/26/2021 01:53:26 - INFO - codeparrot_training - Step 23574: {'lr': 0.0002738213088861149, 'samples': 12070400, 'steps': 23574, 'batch_loss/train': 0.8217535139992833} 12/26/2021 01:53:37 - INFO - codeparrot_training - Step 23575: {'lr': 0.00027380561742662676, 'samples': 12070912, 'steps': 23575, 'batch_loss/train': 0.763305775821209} 12/26/2021 01:53:47 - INFO - codeparrot_training - Step 23576: {'lr': 
0.00027378992587249627, 'samples': 12071424, 'steps': 23576, 'batch_loss/train': 0.7957917815074325} 12/26/2021 01:53:59 - INFO - codeparrot_training - Step 23577: {'lr': 0.0002737742342237862, 'samples': 12071936, 'steps': 23577, 'batch_loss/train': 0.746207210700959} 12/26/2021 01:54:10 - INFO - codeparrot_training - Step 23578: {'lr': 0.00027375854248055877, 'samples': 12072448, 'steps': 23578, 'batch_loss/train': 0.8073832327499986} 12/26/2021 01:54:21 - INFO - codeparrot_training - Step 23579: {'lr': 0.0002737428506428762, 'samples': 12072960, 'steps': 23579, 'batch_loss/train': 0.8968251347541809} 12/26/2021 01:54:31 - INFO - codeparrot_training - Step 23580: {'lr': 0.00027372715871080115, 'samples': 12073472, 'steps': 23580, 'batch_loss/train': 0.681794379837811} 12/26/2021 01:54:43 - INFO - codeparrot_training - Step 23581: {'lr': 0.0002737114666843958, 'samples': 12073984, 'steps': 23581, 'batch_loss/train': 0.8813878055661917} 12/26/2021 01:54:54 - INFO - codeparrot_training - Step 23582: {'lr': 0.0002736957745637227, 'samples': 12074496, 'steps': 23582, 'batch_loss/train': 0.7585161151364446} 12/26/2021 01:55:05 - INFO - codeparrot_training - Step 23583: {'lr': 0.00027368008234884405, 'samples': 12075008, 'steps': 23583, 'batch_loss/train': 0.7247462328523397} 12/26/2021 01:55:17 - INFO - codeparrot_training - Step 23584: {'lr': 0.0002736643900398224, 'samples': 12075520, 'steps': 23584, 'batch_loss/train': 0.9653375744819641} 12/26/2021 01:55:27 - INFO - codeparrot_training - Step 23585: {'lr': 0.0002736486976367201, 'samples': 12076032, 'steps': 23585, 'batch_loss/train': 0.6400500448071398} 12/26/2021 01:55:38 - INFO - codeparrot_training - Step 23586: {'lr': 0.0002736330051395994, 'samples': 12076544, 'steps': 23586, 'batch_loss/train': 0.9147171434015036} 12/26/2021 01:55:51 - INFO - codeparrot_training - Step 23587: {'lr': 0.00027361731254852285, 'samples': 12077056, 'steps': 23587, 'batch_loss/train': 0.7764735692180693} 12/26/2021 01:56:01 - INFO 
- codeparrot_training - Step 23588: {'lr': 0.0002736016198635528, 'samples': 12077568, 'steps': 23588, 'batch_loss/train': 0.8130071660270914} 12/26/2021 01:56:12 - INFO - codeparrot_training - Step 23589: {'lr': 0.00027358592708475154, 'samples': 12078080, 'steps': 23589, 'batch_loss/train': 0.7811527960002422} 12/26/2021 01:56:23 - INFO - codeparrot_training - Step 23590: {'lr': 0.0002735702342121817, 'samples': 12078592, 'steps': 23590, 'batch_loss/train': 0.7178149643586949} 12/26/2021 01:56:35 - INFO - codeparrot_training - Step 23591: {'lr': 0.00027355454124590537, 'samples': 12079104, 'steps': 23591, 'batch_loss/train': 0.8303779577836394} 12/26/2021 01:56:45 - INFO - codeparrot_training - Step 23592: {'lr': 0.0002735388481859851, 'samples': 12079616, 'steps': 23592, 'batch_loss/train': 0.7364679416641593} 12/26/2021 01:56:56 - INFO - codeparrot_training - Step 23593: {'lr': 0.0002735231550324833, 'samples': 12080128, 'steps': 23593, 'batch_loss/train': 0.8163228314369917} 12/26/2021 01:57:09 - INFO - codeparrot_training - Step 23594: {'lr': 0.0002735074617854623, 'samples': 12080640, 'steps': 23594, 'batch_loss/train': 0.8897084333002567} 12/26/2021 01:57:19 - INFO - codeparrot_training - Step 23595: {'lr': 0.00027349176844498446, 'samples': 12081152, 'steps': 23595, 'batch_loss/train': 0.7302141329273582} 12/26/2021 01:57:30 - INFO - codeparrot_training - Step 23596: {'lr': 0.00027347607501111227, 'samples': 12081664, 'steps': 23596, 'batch_loss/train': 0.7333616362884641} 12/26/2021 01:57:42 - INFO - codeparrot_training - Step 23597: {'lr': 0.0002734603814839081, 'samples': 12082176, 'steps': 23597, 'batch_loss/train': 0.7700341725721955} 12/26/2021 01:57:53 - INFO - codeparrot_training - Step 23598: {'lr': 0.00027344468786343426, 'samples': 12082688, 'steps': 23598, 'batch_loss/train': 0.7821780666708946} 12/26/2021 01:58:03 - INFO - codeparrot_training - Step 23599: {'lr': 0.00027342899414975317, 'samples': 12083200, 'steps': 23599, 'batch_loss/train': 
0.8675961144035682} 12/26/2021 01:58:14 - INFO - codeparrot_training - Step 23600: {'lr': 0.0002734133003429273, 'samples': 12083712, 'steps': 23600, 'batch_loss/train': 0.7395716342143714} 12/26/2021 01:58:26 - INFO - codeparrot_training - Step 23601: {'lr': 0.000273397606443019, 'samples': 12084224, 'steps': 23601, 'batch_loss/train': 0.7373599617276341} 12/26/2021 01:58:36 - INFO - codeparrot_training - Step 23602: {'lr': 0.0002733819124500906, 'samples': 12084736, 'steps': 23602, 'batch_loss/train': 0.7212954903952777} 12/26/2021 01:58:47 - INFO - codeparrot_training - Step 23603: {'lr': 0.0002733662183642046, 'samples': 12085248, 'steps': 23603, 'batch_loss/train': 0.8221955122426152} 12/26/2021 01:59:00 - INFO - codeparrot_training - Step 23604: {'lr': 0.00027335052418542326, 'samples': 12085760, 'steps': 23604, 'batch_loss/train': 0.8325273413211107} 12/26/2021 01:59:11 - INFO - codeparrot_training - Step 23605: {'lr': 0.0002733348299138091, 'samples': 12086272, 'steps': 23605, 'batch_loss/train': 0.7738930629566312} 12/26/2021 01:59:21 - INFO - codeparrot_training - Step 23606: {'lr': 0.00027331913554942444, 'samples': 12086784, 'steps': 23606, 'batch_loss/train': 0.7866178015246987} 12/26/2021 01:59:34 - INFO - codeparrot_training - Step 23607: {'lr': 0.00027330344109233174, 'samples': 12087296, 'steps': 23607, 'batch_loss/train': 0.8205328518524766} 12/26/2021 01:59:44 - INFO - codeparrot_training - Step 23608: {'lr': 0.00027328774654259336, 'samples': 12087808, 'steps': 23608, 'batch_loss/train': 0.8273351481184363} 12/26/2021 01:59:55 - INFO - codeparrot_training - Step 23609: {'lr': 0.00027327205190027165, 'samples': 12088320, 'steps': 23609, 'batch_loss/train': 0.8743466606829315} 12/26/2021 02:00:08 - INFO - codeparrot_training - Step 23610: {'lr': 0.0002732563571654291, 'samples': 12088832, 'steps': 23610, 'batch_loss/train': 0.5806890358217061} 12/26/2021 02:00:18 - INFO - codeparrot_training - Step 23611: {'lr': 0.00027324066233812803, 'samples': 
12089344, 'steps': 23611, 'batch_loss/train': 0.8088189177215099} 12/26/2021 02:00:29 - INFO - codeparrot_training - Step 23612: {'lr': 0.00027322496741843087, 'samples': 12089856, 'steps': 23612, 'batch_loss/train': 1.0072387158870697} 12/26/2021 02:00:40 - INFO - codeparrot_training - Step 23613: {'lr': 0.00027320927240639996, 'samples': 12090368, 'steps': 23613, 'batch_loss/train': 0.8971014274284244} 12/26/2021 02:00:52 - INFO - codeparrot_training - Step 23614: {'lr': 0.0002731935773020978, 'samples': 12090880, 'steps': 23614, 'batch_loss/train': 0.8390287412330508} 12/26/2021 02:01:02 - INFO - codeparrot_training - Step 23615: {'lr': 0.00027317788210558675, 'samples': 12091392, 'steps': 23615, 'batch_loss/train': 1.4792565274983644} 12/26/2021 02:01:13 - INFO - codeparrot_training - Step 23616: {'lr': 0.00027316218681692914, 'samples': 12091904, 'steps': 23616, 'batch_loss/train': 0.7833453700877726} 12/26/2021 02:01:25 - INFO - codeparrot_training - Step 23617: {'lr': 0.00027314649143618747, 'samples': 12092416, 'steps': 23617, 'batch_loss/train': 0.9033493073657155} 12/26/2021 02:01:36 - INFO - codeparrot_training - Step 23618: {'lr': 0.00027313079596342403, 'samples': 12092928, 'steps': 23618, 'batch_loss/train': 0.7315458357334137} 12/26/2021 02:01:46 - INFO - codeparrot_training - Step 23619: {'lr': 0.00027311510039870133, 'samples': 12093440, 'steps': 23619, 'batch_loss/train': 0.8216312751173973} 12/26/2021 02:01:57 - INFO - codeparrot_training - Step 23620: {'lr': 0.0002730994047420817, 'samples': 12093952, 'steps': 23620, 'batch_loss/train': 0.7462798824999481} 12/26/2021 02:02:10 - INFO - codeparrot_training - Step 23621: {'lr': 0.00027308370899362756, 'samples': 12094464, 'steps': 23621, 'batch_loss/train': 0.8180109348613769} 12/26/2021 02:02:20 - INFO - codeparrot_training - Step 23622: {'lr': 0.0002730680131534013, 'samples': 12094976, 'steps': 23622, 'batch_loss/train': 0.9082625145092607} 12/26/2021 02:02:31 - INFO - codeparrot_training - Step 
23623: {'lr': 0.0002730523172214654, 'samples': 12095488, 'steps': 23623, 'batch_loss/train': 0.7370505980215967} 12/26/2021 02:02:43 - INFO - codeparrot_training - Step 23624: {'lr': 0.0002730366211978821, 'samples': 12096000, 'steps': 23624, 'batch_loss/train': 0.7447583378525451} 12/26/2021 02:02:54 - INFO - codeparrot_training - Step 23625: {'lr': 0.00027302092508271395, 'samples': 12096512, 'steps': 23625, 'batch_loss/train': 0.8183071138337255} 12/26/2021 02:03:04 - INFO - codeparrot_training - Step 23626: {'lr': 0.00027300522887602327, 'samples': 12097024, 'steps': 23626, 'batch_loss/train': 0.7382275182753801} 12/26/2021 02:03:17 - INFO - codeparrot_training - Step 23627: {'lr': 0.00027298953257787244, 'samples': 12097536, 'steps': 23627, 'batch_loss/train': 0.7530308701097965} 12/26/2021 02:03:27 - INFO - codeparrot_training - Step 23628: {'lr': 0.00027297383618832396, 'samples': 12098048, 'steps': 23628, 'batch_loss/train': 0.7500350496266037} 12/26/2021 02:03:38 - INFO - codeparrot_training - Step 23629: {'lr': 0.0002729581397074402, 'samples': 12098560, 'steps': 23629, 'batch_loss/train': 0.8227336211130023} 12/26/2021 02:03:50 - INFO - codeparrot_training - Step 23630: {'lr': 0.00027294244313528354, 'samples': 12099072, 'steps': 23630, 'batch_loss/train': 0.7698770838323981} 12/26/2021 02:04:00 - INFO - codeparrot_training - Step 23631: {'lr': 0.0002729267464719163, 'samples': 12099584, 'steps': 23631, 'batch_loss/train': 0.8258179682306945} 12/26/2021 02:04:11 - INFO - codeparrot_training - Step 23632: {'lr': 0.0002729110497174011, 'samples': 12100096, 'steps': 23632, 'batch_loss/train': 0.7492228420451283} 12/26/2021 02:04:22 - INFO - codeparrot_training - Step 23633: {'lr': 0.0002728953528718002, 'samples': 12100608, 'steps': 23633, 'batch_loss/train': 0.8088077800348401} 12/26/2021 02:04:34 - INFO - codeparrot_training - Step 23634: {'lr': 0.0002728796559351759, 'samples': 12101120, 'steps': 23634, 'batch_loss/train': 0.7832729844376445} 12/26/2021 
02:04:45 - INFO - codeparrot_training - Step 23635: {'lr': 0.0002728639589075908, 'samples': 12101632, 'steps': 23635, 'batch_loss/train': 0.7452402836643159} 12/26/2021 02:04:56 - INFO - codeparrot_training - Step 23636: {'lr': 0.00027284826178910733, 'samples': 12102144, 'steps': 23636, 'batch_loss/train': 0.8465833533555269} 12/26/2021 02:05:08 - INFO - codeparrot_training - Step 23637: {'lr': 0.00027283256457978763, 'samples': 12102656, 'steps': 23637, 'batch_loss/train': 0.7781181940808892} 12/26/2021 02:05:18 - INFO - codeparrot_training - Step 23638: {'lr': 0.0002728168672796944, 'samples': 12103168, 'steps': 23638, 'batch_loss/train': 0.94480718113482} 12/26/2021 02:05:29 - INFO - codeparrot_training - Step 23639: {'lr': 0.0002728011698888899, 'samples': 12103680, 'steps': 23639, 'batch_loss/train': 0.6965735517442226} 12/26/2021 02:05:42 - INFO - codeparrot_training - Step 23640: {'lr': 0.0002727854724074365, 'samples': 12104192, 'steps': 23640, 'batch_loss/train': 0.8458122904412448} 12/26/2021 02:05:53 - INFO - codeparrot_training - Step 23641: {'lr': 0.00027276977483539665, 'samples': 12104704, 'steps': 23641, 'batch_loss/train': 0.6411277344450355} 12/26/2021 02:06:03 - INFO - codeparrot_training - Step 23642: {'lr': 0.00027275407717283285, 'samples': 12105216, 'steps': 23642, 'batch_loss/train': 0.8022592095658183} 12/26/2021 02:06:14 - INFO - codeparrot_training - Step 23643: {'lr': 0.00027273837941980744, 'samples': 12105728, 'steps': 23643, 'batch_loss/train': 0.8478907402604818} 12/26/2021 02:06:26 - INFO - codeparrot_training - Step 23644: {'lr': 0.0002727226815763827, 'samples': 12106240, 'steps': 23644, 'batch_loss/train': 0.8294340749271214} 12/26/2021 02:06:36 - INFO - codeparrot_training - Step 23645: {'lr': 0.0002727069836426213, 'samples': 12106752, 'steps': 23645, 'batch_loss/train': 0.6946185138076544} 12/26/2021 02:06:47 - INFO - codeparrot_training - Step 23646: {'lr': 0.0002726912856185854, 'samples': 12107264, 'steps': 23646, 
'batch_loss/train': 0.8492622608318925} 12/26/2021 02:06:59 - INFO - codeparrot_training - Step 23647: {'lr': 0.00027267558750433755, 'samples': 12107776, 'steps': 23647, 'batch_loss/train': 0.9154989521484822} 12/26/2021 02:07:10 - INFO - codeparrot_training - Step 23648: {'lr': 0.0002726598892999402, 'samples': 12108288, 'steps': 23648, 'batch_loss/train': 0.8043311078799888} 12/26/2021 02:07:20 - INFO - codeparrot_training - Step 23649: {'lr': 0.00027264419100545565, 'samples': 12108800, 'steps': 23649, 'batch_loss/train': 0.7129065445624292} 12/26/2021 02:07:33 - INFO - codeparrot_training - Step 23650: {'lr': 0.0002726284926209462, 'samples': 12109312, 'steps': 23650, 'batch_loss/train': 0.6750867078080773} 12/26/2021 02:07:44 - INFO - codeparrot_training - Step 23651: {'lr': 0.0002726127941464746, 'samples': 12109824, 'steps': 23651, 'batch_loss/train': 0.6980985747650266} 12/26/2021 02:07:54 - INFO - codeparrot_training - Step 23652: {'lr': 0.000272597095582103, 'samples': 12110336, 'steps': 23652, 'batch_loss/train': 0.7780188396573067} 12/26/2021 02:08:05 - INFO - codeparrot_training - Step 23653: {'lr': 0.00027258139692789386, 'samples': 12110848, 'steps': 23653, 'batch_loss/train': 0.8276205793954432} 12/26/2021 02:08:17 - INFO - codeparrot_training - Step 23654: {'lr': 0.0002725656981839096, 'samples': 12111360, 'steps': 23654, 'batch_loss/train': 0.7524007093161345} 12/26/2021 02:08:28 - INFO - codeparrot_training - Step 23655: {'lr': 0.0002725499993502127, 'samples': 12111872, 'steps': 23655, 'batch_loss/train': 0.781556417234242} 12/26/2021 02:08:38 - INFO - codeparrot_training - Step 23656: {'lr': 0.00027253430042686557, 'samples': 12112384, 'steps': 23656, 'batch_loss/train': 0.778690718812868} 12/26/2021 02:08:51 - INFO - codeparrot_training - Step 23657: {'lr': 0.0002725186014139304, 'samples': 12112896, 'steps': 23657, 'batch_loss/train': 0.6074317044112831} 12/26/2021 02:09:02 - INFO - codeparrot_training - Step 23658: {'lr': 
0.0002725029023114699, 'samples': 12113408, 'steps': 23658, 'batch_loss/train': 0.7774368785321712} 12/26/2021 02:09:12 - INFO - codeparrot_training - Step 23659: {'lr': 0.0002724872031195463, 'samples': 12113920, 'steps': 23659, 'batch_loss/train': 0.8981534373015165} 12/26/2021 02:09:24 - INFO - codeparrot_training - Step 23660: {'lr': 0.0002724715038382221, 'samples': 12114432, 'steps': 23660, 'batch_loss/train': 0.8379524294286966} 12/26/2021 02:09:35 - INFO - codeparrot_training - Step 23661: {'lr': 0.0002724558044675597, 'samples': 12114944, 'steps': 23661, 'batch_loss/train': 0.8403698150068521} 12/26/2021 02:09:45 - INFO - codeparrot_training - Step 23662: {'lr': 0.0002724401050076215, 'samples': 12115456, 'steps': 23662, 'batch_loss/train': 0.7381491279229522} 12/26/2021 02:09:58 - INFO - codeparrot_training - Step 23663: {'lr': 0.00027242440545846986, 'samples': 12115968, 'steps': 23663, 'batch_loss/train': 0.7942961610388011} 12/26/2021 02:10:08 - INFO - codeparrot_training - Step 23664: {'lr': 0.0002724087058201673, 'samples': 12116480, 'steps': 23664, 'batch_loss/train': 0.7449138639494777} 12/26/2021 02:10:19 - INFO - codeparrot_training - Step 23665: {'lr': 0.0002723930060927762, 'samples': 12116992, 'steps': 23665, 'batch_loss/train': 0.8972467840649188} 12/26/2021 02:10:29 - INFO - codeparrot_training - Step 23666: {'lr': 0.0002723773062763589, 'samples': 12117504, 'steps': 23666, 'batch_loss/train': 0.7805159315466881} 12/26/2021 02:10:42 - INFO - codeparrot_training - Step 23667: {'lr': 0.0002723616063709779, 'samples': 12118016, 'steps': 23667, 'batch_loss/train': 0.9050420764833689} 12/26/2021 02:10:52 - INFO - codeparrot_training - Step 23668: {'lr': 0.0002723459063766956, 'samples': 12118528, 'steps': 23668, 'batch_loss/train': 0.805857764557004} 12/26/2021 02:11:03 - INFO - codeparrot_training - Step 23669: {'lr': 0.0002723302062935744, 'samples': 12119040, 'steps': 23669, 'batch_loss/train': 0.770706981420517} 12/26/2021 02:11:15 - INFO - 
codeparrot_training - Step 23670: {'lr': 0.0002723145061216768, 'samples': 12119552, 'steps': 23670, 'batch_loss/train': 0.8305456675589085} 12/26/2021 02:11:26 - INFO - codeparrot_training - Step 23671: {'lr': 0.00027229880586106514, 'samples': 12120064, 'steps': 23671, 'batch_loss/train': 0.6800441418308765} 12/26/2021 02:11:37 - INFO - codeparrot_training - Step 23672: {'lr': 0.0002722831055118018, 'samples': 12120576, 'steps': 23672, 'batch_loss/train': 0.7595237958012149} 12/26/2021 02:11:49 - INFO - codeparrot_training - Step 23673: {'lr': 0.00027226740507394924, 'samples': 12121088, 'steps': 23673, 'batch_loss/train': 0.8069628784433007} 12/26/2021 02:11:59 - INFO - codeparrot_training - Step 23674: {'lr': 0.0002722517045475699, 'samples': 12121600, 'steps': 23674, 'batch_loss/train': 0.7310784601722844} 12/26/2021 02:12:10 - INFO - codeparrot_training - Step 23675: {'lr': 0.00027223600393272624, 'samples': 12122112, 'steps': 23675, 'batch_loss/train': 0.8411143678240478} 12/26/2021 02:12:21 - INFO - codeparrot_training - Step 23676: {'lr': 0.00027222030322948047, 'samples': 12122624, 'steps': 23676, 'batch_loss/train': 0.6379128687549382} 12/26/2021 02:12:33 - INFO - codeparrot_training - Step 23677: {'lr': 0.0002722046024378953, 'samples': 12123136, 'steps': 23677, 'batch_loss/train': 0.6165774374967441} 12/26/2021 02:12:43 - INFO - codeparrot_training - Step 23678: {'lr': 0.00027218890155803307, 'samples': 12123648, 'steps': 23678, 'batch_loss/train': 0.7706226212903857} 12/26/2021 02:12:54 - INFO - codeparrot_training - Step 23679: {'lr': 0.00027217320058995606, 'samples': 12124160, 'steps': 23679, 'batch_loss/train': 0.8494082223623991} 12/26/2021 02:13:07 - INFO - codeparrot_training - Step 23680: {'lr': 0.0002721574995337267, 'samples': 12124672, 'steps': 23680, 'batch_loss/train': 0.8233963469974697} 12/26/2021 02:13:17 - INFO - codeparrot_training - Step 23681: {'lr': 0.00027214179838940753, 'samples': 12125184, 'steps': 23681, 'batch_loss/train': 
0.7544592659687623} 12/26/2021 02:13:28 - INFO - codeparrot_training - Step 23682: {'lr': 0.000272126097157061, 'samples': 12125696, 'steps': 23682, 'batch_loss/train': 0.8297600215300918} 12/26/2021 02:13:40 - INFO - codeparrot_training - Step 23683: {'lr': 0.0002721103958367494, 'samples': 12126208, 'steps': 23683, 'batch_loss/train': 0.8254067255184054} 12/26/2021 02:13:51 - INFO - codeparrot_training - Step 23684: {'lr': 0.0002720946944285353, 'samples': 12126720, 'steps': 23684, 'batch_loss/train': 0.8003612160682678} 12/26/2021 02:14:02 - INFO - codeparrot_training - Step 23685: {'lr': 0.00027207899293248095, 'samples': 12127232, 'steps': 23685, 'batch_loss/train': 0.6773913842625916} 12/26/2021 02:14:12 - INFO - codeparrot_training - Step 23686: {'lr': 0.00027206329134864894, 'samples': 12127744, 'steps': 23686, 'batch_loss/train': 0.5913285263814032} 12/26/2021 02:14:24 - INFO - codeparrot_training - Step 23687: {'lr': 0.0002720475896771015, 'samples': 12128256, 'steps': 23687, 'batch_loss/train': 0.7785123446956277} 12/26/2021 02:14:35 - INFO - codeparrot_training - Step 23688: {'lr': 0.0002720318879179013, 'samples': 12128768, 'steps': 23688, 'batch_loss/train': 0.7952416399493814} 12/26/2021 02:14:45 - INFO - codeparrot_training - Step 23689: {'lr': 0.0002720161860711105, 'samples': 12129280, 'steps': 23689, 'batch_loss/train': 0.7468397342599928} 12/26/2021 02:14:58 - INFO - codeparrot_training - Step 23690: {'lr': 0.00027200048413679174, 'samples': 12129792, 'steps': 23690, 'batch_loss/train': 0.751601055264473} 12/26/2021 02:15:09 - INFO - codeparrot_training - Step 23691: {'lr': 0.0002719847821150074, 'samples': 12130304, 'steps': 23691, 'batch_loss/train': 0.7806841959245503} 12/26/2021 02:15:19 - INFO - codeparrot_training - Step 23692: {'lr': 0.00027196908000581973, 'samples': 12130816, 'steps': 23692, 'batch_loss/train': 0.8169412286952138} 12/26/2021 02:15:31 - INFO - codeparrot_training - Step 23693: {'lr': 0.0002719533778092914, 'samples': 
12131328, 'steps': 23693, 'batch_loss/train': 0.761223672889173} 12/26/2021 02:15:42 - INFO - codeparrot_training - Step 23694: {'lr': 0.0002719376755254847, 'samples': 12131840, 'steps': 23694, 'batch_loss/train': 0.7160444865003228} 12/26/2021 02:15:52 - INFO - codeparrot_training - Step 23695: {'lr': 0.0002719219731544621, 'samples': 12132352, 'steps': 23695, 'batch_loss/train': 0.7989846188575029} 12/26/2021 02:16:03 - INFO - codeparrot_training - Step 23696: {'lr': 0.000271906270696286, 'samples': 12132864, 'steps': 23696, 'batch_loss/train': 0.7980837856885046} 12/26/2021 02:16:16 - INFO - codeparrot_training - Step 23697: {'lr': 0.00027189056815101885, 'samples': 12133376, 'steps': 23697, 'batch_loss/train': 0.7815585434436798} 12/26/2021 02:16:26 - INFO - codeparrot_training - Step 23698: {'lr': 0.0002718748655187231, 'samples': 12133888, 'steps': 23698, 'batch_loss/train': 0.9081819579005241} 12/26/2021 02:16:37 - INFO - codeparrot_training - Step 23699: {'lr': 0.00027185916279946103, 'samples': 12134400, 'steps': 23699, 'batch_loss/train': 0.6852232383098453} 12/26/2021 02:16:49 - INFO - codeparrot_training - Step 23700: {'lr': 0.0002718434599932953, 'samples': 12134912, 'steps': 23700, 'batch_loss/train': 0.7009800322121009} 12/26/2021 02:16:59 - INFO - codeparrot_training - Step 23701: {'lr': 0.00027182775710028815, 'samples': 12135424, 'steps': 23701, 'batch_loss/train': 0.6941496334038675} 12/26/2021 02:17:10 - INFO - codeparrot_training - Step 23702: {'lr': 0.0002718120541205021, 'samples': 12135936, 'steps': 23702, 'batch_loss/train': 0.7909628455527127} 12/26/2021 02:17:22 - INFO - codeparrot_training - Step 23703: {'lr': 0.00027179635105399955, 'samples': 12136448, 'steps': 23703, 'batch_loss/train': 0.9238306181505322} 12/26/2021 02:17:33 - INFO - codeparrot_training - Step 23704: {'lr': 0.000271780647900843, 'samples': 12136960, 'steps': 23704, 'batch_loss/train': 0.7842456065118313} 12/26/2021 02:17:43 - INFO - codeparrot_training - Step 23705: 
{'lr': 0.0002717649446610947, 'samples': 12137472, 'steps': 23705, 'batch_loss/train': 0.7458759937435389} 12/26/2021 02:17:54 - INFO - codeparrot_training - Step 23706: {'lr': 0.0002717492413348173, 'samples': 12137984, 'steps': 23706, 'batch_loss/train': 0.7356897611171007} 12/26/2021 02:18:06 - INFO - codeparrot_training - Step 23707: {'lr': 0.0002717335379220731, 'samples': 12138496, 'steps': 23707, 'batch_loss/train': 0.8124767765402794} 12/26/2021 02:18:17 - INFO - codeparrot_training - Step 23708: {'lr': 0.0002717178344229245, 'samples': 12139008, 'steps': 23708, 'batch_loss/train': 0.7592251421883702} 12/26/2021 02:18:28 - INFO - codeparrot_training - Step 23709: {'lr': 0.00027170213083743406, 'samples': 12139520, 'steps': 23709, 'batch_loss/train': 0.9276531292125583} 12/26/2021 02:18:41 - INFO - codeparrot_training - Step 23710: {'lr': 0.00027168642716566416, 'samples': 12140032, 'steps': 23710, 'batch_loss/train': 0.6932559069246054} 12/26/2021 02:18:52 - INFO - codeparrot_training - Step 23711: {'lr': 0.0002716707234076772, 'samples': 12140544, 'steps': 23711, 'batch_loss/train': 0.7629570142598823} 12/26/2021 02:19:02 - INFO - codeparrot_training - Step 23712: {'lr': 0.0002716550195635356, 'samples': 12141056, 'steps': 23712, 'batch_loss/train': 0.7698326539248228} 12/26/2021 02:19:14 - INFO - codeparrot_training - Step 23713: {'lr': 0.0002716393156333018, 'samples': 12141568, 'steps': 23713, 'batch_loss/train': 0.7604164176154882} 12/26/2021 02:19:25 - INFO - codeparrot_training - Step 23714: {'lr': 0.0002716236116170383, 'samples': 12142080, 'steps': 23714, 'batch_loss/train': 0.8205617871135473} 12/26/2021 02:19:36 - INFO - codeparrot_training - Step 23715: {'lr': 0.0002716079075148075, 'samples': 12142592, 'steps': 23715, 'batch_loss/train': 0.8077642526477575} 12/26/2021 02:19:49 - INFO - codeparrot_training - Step 23716: {'lr': 0.0002715922033266717, 'samples': 12143104, 'steps': 23716, 'batch_loss/train': 0.7464509569108486} 12/26/2021 02:19:59 
- INFO - codeparrot_training - Step 23717: {'lr': 0.0002715764990526936, 'samples': 12143616, 'steps': 23717, 'batch_loss/train': 0.8402319275774062} 12/26/2021 02:20:10 - INFO - codeparrot_training - Step 23718: {'lr': 0.0002715607946929354, 'samples': 12144128, 'steps': 23718, 'batch_loss/train': 0.5328313150675967} 12/26/2021 02:20:20 - INFO - codeparrot_training - Step 23719: {'lr': 0.00027154509024745967, 'samples': 12144640, 'steps': 23719, 'batch_loss/train': 0.7826389279216528} 12/26/2021 02:20:33 - INFO - codeparrot_training - Step 23720: {'lr': 0.00027152938571632873, 'samples': 12145152, 'steps': 23720, 'batch_loss/train': 0.6723735686391592} 12/26/2021 02:20:43 - INFO - codeparrot_training - Step 23721: {'lr': 0.00027151368109960513, 'samples': 12145664, 'steps': 23721, 'batch_loss/train': 0.810585199855268} 12/26/2021 02:20:54 - INFO - codeparrot_training - Step 23722: {'lr': 0.00027149797639735126, 'samples': 12146176, 'steps': 23722, 'batch_loss/train': 0.4920903609599918} 12/26/2021 02:21:06 - INFO - codeparrot_training - Step 23723: {'lr': 0.00027148227160962956, 'samples': 12146688, 'steps': 23723, 'batch_loss/train': 0.7973637899849564} 12/26/2021 02:21:17 - INFO - codeparrot_training - Step 23724: {'lr': 0.00027146656673650245, 'samples': 12147200, 'steps': 23724, 'batch_loss/train': 0.6369677358306944} 12/26/2021 02:21:27 - INFO - codeparrot_training - Step 23725: {'lr': 0.00027145086177803236, 'samples': 12147712, 'steps': 23725, 'batch_loss/train': 0.7175169228576124} 12/26/2021 02:21:40 - INFO - codeparrot_training - Step 23726: {'lr': 0.0002714351567342817, 'samples': 12148224, 'steps': 23726, 'batch_loss/train': 0.7637719940394163} 12/26/2021 02:21:51 - INFO - codeparrot_training - Step 23727: {'lr': 0.000271419451605313, 'samples': 12148736, 'steps': 23727, 'batch_loss/train': 0.7026260909624398} 12/26/2021 02:22:01 - INFO - codeparrot_training - Step 23728: {'lr': 0.00027140374639118863, 'samples': 12149248, 'steps': 23728, 
'batch_loss/train': 0.740792129188776} 12/26/2021 02:22:12 - INFO - codeparrot_training - Step 23729: {'lr': 0.0002713880410919711, 'samples': 12149760, 'steps': 23729, 'batch_loss/train': 0.7437371406704187} 12/26/2021 02:22:24 - INFO - codeparrot_training - Step 23730: {'lr': 0.00027137233570772277, 'samples': 12150272, 'steps': 23730, 'batch_loss/train': 0.7316655004397035} 12/26/2021 02:22:35 - INFO - codeparrot_training - Step 23731: {'lr': 0.000271356630238506, 'samples': 12150784, 'steps': 23731, 'batch_loss/train': 0.7948650931939483} 12/26/2021 02:22:45 - INFO - codeparrot_training - Step 23732: {'lr': 0.00027134092468438355, 'samples': 12151296, 'steps': 23732, 'batch_loss/train': 0.7671318463981152} 12/26/2021 02:22:58 - INFO - codeparrot_training - Step 23733: {'lr': 0.00027132521904541743, 'samples': 12151808, 'steps': 23733, 'batch_loss/train': 0.6928978147916496} 12/26/2021 02:23:09 - INFO - codeparrot_training - Step 23734: {'lr': 0.0002713095133216704, 'samples': 12152320, 'steps': 23734, 'batch_loss/train': 0.8600930152460933} 12/26/2021 02:23:20 - INFO - codeparrot_training - Step 23735: {'lr': 0.0002712938075132048, 'samples': 12152832, 'steps': 23735, 'batch_loss/train': 0.7149732802063227} 12/26/2021 02:23:31 - INFO - codeparrot_training - Step 23736: {'lr': 0.000271278101620083, 'samples': 12153344, 'steps': 23736, 'batch_loss/train': 0.5961905783042312} 12/26/2021 02:23:42 - INFO - codeparrot_training - Step 23737: {'lr': 0.00027126239564236753, 'samples': 12153856, 'steps': 23737, 'batch_loss/train': 0.7412620456889272} 12/26/2021 02:23:53 - INFO - codeparrot_training - Step 23738: {'lr': 0.0002712466895801208, 'samples': 12154368, 'steps': 23738, 'batch_loss/train': 0.7061233459971845} 12/26/2021 02:24:03 - INFO - codeparrot_training - Step 23739: {'lr': 0.0002712309834334053, 'samples': 12154880, 'steps': 23739, 'batch_loss/train': 0.7038009157404304} 12/26/2021 02:24:15 - INFO - codeparrot_training - Step 23740: {'lr': 
0.00027121527720228337, 'samples': 12155392, 'steps': 23740, 'batch_loss/train': 0.8028465146198869} 12/26/2021 02:24:26 - INFO - codeparrot_training - Step 23741: {'lr': 0.0002711995708868175, 'samples': 12155904, 'steps': 23741, 'batch_loss/train': 0.7786060930229723} 12/26/2021 02:24:37 - INFO - codeparrot_training - Step 23742: {'lr': 0.0002711838644870702, 'samples': 12156416, 'steps': 23742, 'batch_loss/train': 0.8321281094104052} 12/26/2021 02:24:50 - INFO - codeparrot_training - Step 23743: {'lr': 0.00027116815800310384, 'samples': 12156928, 'steps': 23743, 'batch_loss/train': 0.7634395544882864} 12/26/2021 02:25:01 - INFO - codeparrot_training - Step 23744: {'lr': 0.0002711524514349808, 'samples': 12157440, 'steps': 23744, 'batch_loss/train': 0.7643976947292686} 12/26/2021 02:25:11 - INFO - codeparrot_training - Step 23745: {'lr': 0.0002711367447827637, 'samples': 12157952, 'steps': 23745, 'batch_loss/train': 0.7269439096562564} 12/26/2021 02:25:23 - INFO - codeparrot_training - Step 23746: {'lr': 0.0002711210380465148, 'samples': 12158464, 'steps': 23746, 'batch_loss/train': 0.7290366133674979} 12/26/2021 02:25:34 - INFO - codeparrot_training - Step 23747: {'lr': 0.00027110533122629666, 'samples': 12158976, 'steps': 23747, 'batch_loss/train': 0.7398995846160688} 12/26/2021 02:25:44 - INFO - codeparrot_training - Step 23748: {'lr': 0.0002710896243221717, 'samples': 12159488, 'steps': 23748, 'batch_loss/train': 0.7292347885668278} 12/26/2021 02:25:56 - INFO - codeparrot_training - Step 23749: {'lr': 0.0002710739173342024, 'samples': 12160000, 'steps': 23749, 'batch_loss/train': 0.8155121566960588} 12/26/2021 02:26:07 - INFO - codeparrot_training - Step 23750: {'lr': 0.0002710582102624511, 'samples': 12160512, 'steps': 23750, 'batch_loss/train': 0.7643560746219009} 12/26/2021 02:26:18 - INFO - codeparrot_training - Step 23751: {'lr': 0.0002710425031069803, 'samples': 12161024, 'steps': 23751, 'batch_loss/train': 0.7366753187961876} 12/26/2021 02:26:28 - INFO 
- codeparrot_training - Step 23752: {'lr': 0.00027102679586785253, 'samples': 12161536, 'steps': 23752, 'batch_loss/train': 0.7526702508330345} 12/26/2021 02:26:40 - INFO - codeparrot_training - Step 23753: {'lr': 0.0002710110885451301, 'samples': 12162048, 'steps': 23753, 'batch_loss/train': 0.6517527475371026} 12/26/2021 02:26:51 - INFO - codeparrot_training - Step 23754: {'lr': 0.0002709953811388755, 'samples': 12162560, 'steps': 23754, 'batch_loss/train': 0.6984351354185492} 12/26/2021 02:27:01 - INFO - codeparrot_training - Step 23755: {'lr': 0.00027097967364915124, 'samples': 12163072, 'steps': 23755, 'batch_loss/train': 0.7002329952083528} 12/26/2021 02:27:14 - INFO - codeparrot_training - Step 23756: {'lr': 0.0002709639660760197, 'samples': 12163584, 'steps': 23756, 'batch_loss/train': 0.7197877997532487} 12/26/2021 02:27:25 - INFO - codeparrot_training - Step 23757: {'lr': 0.0002709482584195433, 'samples': 12164096, 'steps': 23757, 'batch_loss/train': 0.7813808581558987} 12/26/2021 02:27:35 - INFO - codeparrot_training - Step 23758: {'lr': 0.00027093255067978464, 'samples': 12164608, 'steps': 23758, 'batch_loss/train': 0.7642560647800565} 12/26/2021 02:27:48 - INFO - codeparrot_training - Step 23759: {'lr': 0.00027091684285680595, 'samples': 12165120, 'steps': 23759, 'batch_loss/train': 0.7960188357828883} 12/26/2021 02:27:59 - INFO - codeparrot_training - Step 23760: {'lr': 0.0002709011349506698, 'samples': 12165632, 'steps': 23760, 'batch_loss/train': 0.9274584762752056} 12/26/2021 02:28:09 - INFO - codeparrot_training - Step 23761: {'lr': 0.0002708854269614387, 'samples': 12166144, 'steps': 23761, 'batch_loss/train': 0.7445228463038802} 12/26/2021 02:28:20 - INFO - codeparrot_training - Step 23762: {'lr': 0.00027086971888917505, 'samples': 12166656, 'steps': 23762, 'batch_loss/train': 0.7802053731866181} 12/26/2021 02:28:32 - INFO - codeparrot_training - Step 23763: {'lr': 0.00027085401073394114, 'samples': 12167168, 'steps': 23763, 'batch_loss/train': 
0.8129862728528678} 12/26/2021 02:28:43 - INFO - codeparrot_training - Step 23764: {'lr': 0.00027083830249579965, 'samples': 12167680, 'steps': 23764, 'batch_loss/train': 0.8055959707126021} 12/26/2021 02:28:54 - INFO - codeparrot_training - Step 23765: {'lr': 0.00027082259417481297, 'samples': 12168192, 'steps': 23765, 'batch_loss/train': 0.8542152214795351} 12/26/2021 02:29:06 - INFO - codeparrot_training - Step 23766: {'lr': 0.00027080688577104344, 'samples': 12168704, 'steps': 23766, 'batch_loss/train': 0.7005505217239261} 12/26/2021 02:29:16 - INFO - codeparrot_training - Step 23767: {'lr': 0.00027079117728455367, 'samples': 12169216, 'steps': 23767, 'batch_loss/train': 0.4769416024792008} 12/26/2021 02:29:27 - INFO - codeparrot_training - Step 23768: {'lr': 0.0002707754687154059, 'samples': 12169728, 'steps': 23768, 'batch_loss/train': 0.575733043893706} 12/26/2021 02:29:39 - INFO - codeparrot_training - Step 23769: {'lr': 0.00027075976006366286, 'samples': 12170240, 'steps': 23769, 'batch_loss/train': 0.7737466115504503} 12/26/2021 02:29:50 - INFO - codeparrot_training - Step 23770: {'lr': 0.0002707440513293867, 'samples': 12170752, 'steps': 23770, 'batch_loss/train': 0.855733482632786} 12/26/2021 02:30:01 - INFO - codeparrot_training - Step 23771: {'lr': 0.00027072834251264024, 'samples': 12171264, 'steps': 23771, 'batch_loss/train': 0.6837480125832371} 12/26/2021 02:30:13 - INFO - codeparrot_training - Step 23772: {'lr': 0.0002707126336134855, 'samples': 12171776, 'steps': 23772, 'batch_loss/train': 0.7576639908947982} 12/26/2021 02:30:24 - INFO - codeparrot_training - Step 23773: {'lr': 0.00027069692463198525, 'samples': 12172288, 'steps': 23773, 'batch_loss/train': 0.7106786924414337} 12/26/2021 02:30:35 - INFO - codeparrot_training - Step 23774: {'lr': 0.00027068121556820184, 'samples': 12172800, 'steps': 23774, 'batch_loss/train': 0.7313105864450336} 12/26/2021 02:30:45 - INFO - codeparrot_training - Step 23775: {'lr': 0.0002706655064221978, 'samples': 
12173312, 'steps': 23775, 'batch_loss/train': 0.7687882091850042} 12/26/2021 02:30:57 - INFO - codeparrot_training - Step 23776: {'lr': 0.0002706497971940353, 'samples': 12173824, 'steps': 23776, 'batch_loss/train': 0.774144371971488} 12/26/2021 02:31:08 - INFO - codeparrot_training - Step 23777: {'lr': 0.00027063408788377714, 'samples': 12174336, 'steps': 23777, 'batch_loss/train': 0.6904884884133935} 12/26/2021 02:31:19 - INFO - codeparrot_training - Step 23778: {'lr': 0.0002706183784914857, 'samples': 12174848, 'steps': 23778, 'batch_loss/train': 0.599519734736532} 12/26/2021 02:31:31 - INFO - codeparrot_training - Step 23779: {'lr': 0.0002706026690172233, 'samples': 12175360, 'steps': 23779, 'batch_loss/train': 0.7918418007902801} 12/26/2021 02:31:42 - INFO - codeparrot_training - Step 23780: {'lr': 0.00027058695946105245, 'samples': 12175872, 'steps': 23780, 'batch_loss/train': 0.711731469957158} 12/26/2021 02:31:52 - INFO - codeparrot_training - Step 23781: {'lr': 0.0002705712498230356, 'samples': 12176384, 'steps': 23781, 'batch_loss/train': 0.7127048464026302} 12/26/2021 02:32:04 - INFO - codeparrot_training - Step 23782: {'lr': 0.0002705555401032353, 'samples': 12176896, 'steps': 23782, 'batch_loss/train': 0.6301251733675599} 12/26/2021 02:32:15 - INFO - codeparrot_training - Step 23783: {'lr': 0.00027053983030171386, 'samples': 12177408, 'steps': 23783, 'batch_loss/train': 0.737501663621515} 12/26/2021 02:32:25 - INFO - codeparrot_training - Step 23784: {'lr': 0.00027052412041853394, 'samples': 12177920, 'steps': 23784, 'batch_loss/train': 0.7101232465356588} 12/26/2021 02:32:36 - INFO - codeparrot_training - Step 23785: {'lr': 0.0002705084104537577, 'samples': 12178432, 'steps': 23785, 'batch_loss/train': 0.7256726073101163} 12/26/2021 02:32:49 - INFO - codeparrot_training - Step 23786: {'lr': 0.0002704927004074479, 'samples': 12178944, 'steps': 23786, 'batch_loss/train': 0.7267292840406299} 12/26/2021 02:33:00 - INFO - codeparrot_training - Step 23787: 
{'lr': 0.00027047699027966677, 'samples': 12179456, 'steps': 23787, 'batch_loss/train': 0.6582185300649144} 12/26/2021 02:33:10 - INFO - codeparrot_training - Step 23788: {'lr': 0.0002704612800704769, 'samples': 12179968, 'steps': 23788, 'batch_loss/train': 0.7265866505913436} 12/26/2021 02:33:22 - INFO - codeparrot_training - Step 23789: {'lr': 0.0002704455697799407, 'samples': 12180480, 'steps': 23789, 'batch_loss/train': 0.6892374800518155} 12/26/2021 02:33:33 - INFO - codeparrot_training - Step 23790: {'lr': 0.0002704298594081206, 'samples': 12180992, 'steps': 23790, 'batch_loss/train': 0.7054252056404948} 12/26/2021 02:33:43 - INFO - codeparrot_training - Step 23791: {'lr': 0.00027041414895507914, 'samples': 12181504, 'steps': 23791, 'batch_loss/train': 0.8088436527177691} 12/26/2021 02:33:56 - INFO - codeparrot_training - Step 23792: {'lr': 0.00027039843842087874, 'samples': 12182016, 'steps': 23792, 'batch_loss/train': 0.829925658996217} 12/26/2021 02:34:06 - INFO - codeparrot_training - Step 23793: {'lr': 0.0002703827278055818, 'samples': 12182528, 'steps': 23793, 'batch_loss/train': 0.6490280759753659} 12/26/2021 02:34:17 - INFO - codeparrot_training - Step 23794: {'lr': 0.0002703670171092509, 'samples': 12183040, 'steps': 23794, 'batch_loss/train': 0.7702986402437091} 12/26/2021 02:34:27 - INFO - codeparrot_training - Step 23795: {'lr': 0.00027035130633194836, 'samples': 12183552, 'steps': 23795, 'batch_loss/train': 0.526829449518118} 12/26/2021 02:34:40 - INFO - codeparrot_training - Step 23796: {'lr': 0.0002703355954737367, 'samples': 12184064, 'steps': 23796, 'batch_loss/train': 0.6569107777904719} 12/26/2021 02:34:51 - INFO - codeparrot_training - Step 23797: {'lr': 0.0002703198845346785, 'samples': 12184576, 'steps': 23797, 'batch_loss/train': 0.7874029902741313} 12/26/2021 02:35:01 - INFO - codeparrot_training - Step 23798: {'lr': 0.00027030417351483604, 'samples': 12185088, 'steps': 23798, 'batch_loss/train': 0.7738820174708962} 12/26/2021 02:35:13 
- INFO - codeparrot_training - Step 23799: {'lr': 0.00027028846241427184, 'samples': 12185600, 'steps': 23799, 'batch_loss/train': 0.7086332887411118} 12/26/2021 02:35:24 - INFO - codeparrot_training - Step 23800: {'lr': 0.0002702727512330484, 'samples': 12186112, 'steps': 23800, 'batch_loss/train': 0.7018428689334542} 12/26/2021 02:35:35 - INFO - codeparrot_training - Step 23801: {'lr': 0.00027025703997122813, 'samples': 12186624, 'steps': 23801, 'batch_loss/train': 0.6998837785795331} 12/26/2021 02:35:47 - INFO - codeparrot_training - Step 23802: {'lr': 0.0002702413286288735, 'samples': 12187136, 'steps': 23802, 'batch_loss/train': 0.6533944989787415} 12/26/2021 02:35:58 - INFO - codeparrot_training - Step 23803: {'lr': 0.00027022561720604704, 'samples': 12187648, 'steps': 23803, 'batch_loss/train': 0.7316401209682226} 12/26/2021 02:36:09 - INFO - codeparrot_training - Step 23804: {'lr': 0.0002702099057028112, 'samples': 12188160, 'steps': 23804, 'batch_loss/train': 0.7067997803678736} 12/26/2021 02:36:21 - INFO - codeparrot_training - Step 23805: {'lr': 0.0002701941941192283, 'samples': 12188672, 'steps': 23805, 'batch_loss/train': 0.863318907096982} 12/26/2021 02:36:31 - INFO - codeparrot_training - Step 23806: {'lr': 0.00027017848245536097, 'samples': 12189184, 'steps': 23806, 'batch_loss/train': 0.763079565949738} 12/26/2021 02:36:42 - INFO - codeparrot_training - Step 23807: {'lr': 0.0002701627707112716, 'samples': 12189696, 'steps': 23807, 'batch_loss/train': 0.6145706493407488} 12/26/2021 02:36:53 - INFO - codeparrot_training - Step 23808: {'lr': 0.0002701470588870227, 'samples': 12190208, 'steps': 23808, 'batch_loss/train': 0.8007351374253631} 12/26/2021 02:37:05 - INFO - codeparrot_training - Step 23809: {'lr': 0.0002701313469826766, 'samples': 12190720, 'steps': 23809, 'batch_loss/train': 0.812605001963675} 12/26/2021 02:37:15 - INFO - codeparrot_training - Step 23810: {'lr': 0.000270115634998296, 'samples': 12191232, 'steps': 23810, 'batch_loss/train': 
0.6912050757091492} 12/26/2021 02:37:26 - INFO - codeparrot_training - Step 23811: {'lr': 0.00027009992293394313, 'samples': 12191744, 'steps': 23811, 'batch_loss/train': 0.6305028903589118} 12/26/2021 02:37:39 - INFO - codeparrot_training - Step 23812: {'lr': 0.0002700842107896806, 'samples': 12192256, 'steps': 23812, 'batch_loss/train': 0.7344811783405021} 12/26/2021 02:37:49 - INFO - codeparrot_training - Step 23813: {'lr': 0.00027006849856557087, 'samples': 12192768, 'steps': 23813, 'batch_loss/train': 0.5378684840397909} 12/26/2021 02:38:00 - INFO - codeparrot_training - Step 23814: {'lr': 0.00027005278626167627, 'samples': 12193280, 'steps': 23814, 'batch_loss/train': 0.8063180814497173} 12/26/2021 02:38:12 - INFO - codeparrot_training - Step 23815: {'lr': 0.0002700370738780594, 'samples': 12193792, 'steps': 23815, 'batch_loss/train': 0.6802814214024693} 12/26/2021 02:38:23 - INFO - codeparrot_training - Step 23816: {'lr': 0.0002700213614147827, 'samples': 12194304, 'steps': 23816, 'batch_loss/train': 0.7493752781301737} 12/26/2021 02:38:33 - INFO - codeparrot_training - Step 23817: {'lr': 0.0002700056488719087, 'samples': 12194816, 'steps': 23817, 'batch_loss/train': 0.6706100450828671} 12/26/2021 02:38:45 - INFO - codeparrot_training - Step 23818: {'lr': 0.0002699899362494997, 'samples': 12195328, 'steps': 23818, 'batch_loss/train': 0.78957111062482} 12/26/2021 02:38:56 - INFO - codeparrot_training - Step 23819: {'lr': 0.0002699742235476183, 'samples': 12195840, 'steps': 23819, 'batch_loss/train': 0.7238314026035368} 12/26/2021 02:39:07 - INFO - codeparrot_training - Step 23820: {'lr': 0.00026995851076632696, 'samples': 12196352, 'steps': 23820, 'batch_loss/train': 0.6364754941314459} 12/26/2021 02:39:17 - INFO - codeparrot_training - Step 23821: {'lr': 0.000269942797905688, 'samples': 12196864, 'steps': 23821, 'batch_loss/train': 0.7533042505383492} 12/26/2021 02:39:29 - INFO - codeparrot_training - Step 23822: {'lr': 0.00026992708496576407, 'samples': 
12197376, 'steps': 23822, 'batch_loss/train': 0.6692596413195133} 12/26/2021 02:39:40 - INFO - codeparrot_training - Step 23823: {'lr': 0.0002699113719466176, 'samples': 12197888, 'steps': 23823, 'batch_loss/train': 0.7016361355781555} 12/26/2021 02:39:50 - INFO - codeparrot_training - Step 23824: {'lr': 0.00026989565884831106, 'samples': 12198400, 'steps': 23824, 'batch_loss/train': 0.7427449875394814} 12/26/2021 02:40:03 - INFO - codeparrot_training - Step 23825: {'lr': 0.0002698799456709068, 'samples': 12198912, 'steps': 23825, 'batch_loss/train': 0.8220412977971137} 12/26/2021 02:40:14 - INFO - codeparrot_training - Step 23826: {'lr': 0.0002698642324144674, 'samples': 12199424, 'steps': 23826, 'batch_loss/train': 0.8992576878517866} 12/26/2021 02:40:25 - INFO - codeparrot_training - Step 23827: {'lr': 0.0002698485190790554, 'samples': 12199936, 'steps': 23827, 'batch_loss/train': 0.6795046527404338} 12/26/2021 02:40:37 - INFO - codeparrot_training - Step 23828: {'lr': 0.00026983280566473313, 'samples': 12200448, 'steps': 23828, 'batch_loss/train': 0.7295305547304451} 12/26/2021 02:40:47 - INFO - codeparrot_training - Step 23829: {'lr': 0.0002698170921715631, 'samples': 12200960, 'steps': 23829, 'batch_loss/train': 0.8296682434156537} 12/26/2021 02:40:58 - INFO - codeparrot_training - Step 23830: {'lr': 0.00026980137859960784, 'samples': 12201472, 'steps': 23830, 'batch_loss/train': 0.6493604289134964} 12/26/2021 02:41:09 - INFO - codeparrot_training - Step 23831: {'lr': 0.0002697856649489297, 'samples': 12201984, 'steps': 23831, 'batch_loss/train': 0.5701369096059352} 12/26/2021 02:41:22 - INFO - codeparrot_training - Step 23832: {'lr': 0.0002697699512195913, 'samples': 12202496, 'steps': 23832, 'batch_loss/train': 0.740330136497505} 12/26/2021 02:41:33 - INFO - codeparrot_training - Step 23833: {'lr': 0.000269754237411655, 'samples': 12203008, 'steps': 23833, 'batch_loss/train': 0.5996837002458051} 12/26/2021 02:41:43 - INFO - codeparrot_training - Step 23834: 
{'lr': 0.00026973852352518327, 'samples': 12203520, 'steps': 23834, 'batch_loss/train': 0.8558751433156431} 12/26/2021 02:41:55 - INFO - codeparrot_training - Step 23835: {'lr': 0.0002697228095602387, 'samples': 12204032, 'steps': 23835, 'batch_loss/train': 1.1807092567905784} 12/26/2021 02:42:06 - INFO - codeparrot_training - Step 23836: {'lr': 0.00026970709551688373, 'samples': 12204544, 'steps': 23836, 'batch_loss/train': 0.6530178619432263} 12/26/2021 02:42:16 - INFO - codeparrot_training - Step 23837: {'lr': 0.0002696913813951807, 'samples': 12205056, 'steps': 23837, 'batch_loss/train': 0.6001652446575463} 12/26/2021 02:42:27 - INFO - codeparrot_training - Step 23838: {'lr': 0.0002696756671951921, 'samples': 12205568, 'steps': 23838, 'batch_loss/train': 0.8161939596757293} 12/26/2021 02:42:39 - INFO - codeparrot_training - Step 23839: {'lr': 0.00026965995291698057, 'samples': 12206080, 'steps': 23839, 'batch_loss/train': 0.9173251329921186} 12/26/2021 02:42:50 - INFO - codeparrot_training - Step 23840: {'lr': 0.00026964423856060853, 'samples': 12206592, 'steps': 23840, 'batch_loss/train': 0.6773529460188001} 12/26/2021 02:43:00 - INFO - codeparrot_training - Step 23841: {'lr': 0.0002696285241261384, 'samples': 12207104, 'steps': 23841, 'batch_loss/train': 0.8182593258097768} 12/26/2021 02:43:13 - INFO - codeparrot_training - Step 23842: {'lr': 0.0002696128096136326, 'samples': 12207616, 'steps': 23842, 'batch_loss/train': 0.6873036208562553} 12/26/2021 02:43:23 - INFO - codeparrot_training - Step 23843: {'lr': 0.0002695970950231538, 'samples': 12208128, 'steps': 23843, 'batch_loss/train': 0.6686318409629166} 12/26/2021 02:43:34 - INFO - codeparrot_training - Step 23844: {'lr': 0.00026958138035476414, 'samples': 12208640, 'steps': 23844, 'batch_loss/train': 0.7337457006797194} 12/26/2021 02:43:46 - INFO - codeparrot_training - Step 23845: {'lr': 0.00026956566560852645, 'samples': 12209152, 'steps': 23845, 'batch_loss/train': 0.7418147884309292} 12/26/2021 
02:43:57 - INFO - codeparrot_training - Step 23846: {'lr': 0.00026954995078450305, 'samples': 12209664, 'steps': 23846, 'batch_loss/train': 0.7505991416983306} 12/26/2021 02:44:07 - INFO - codeparrot_training - Step 23847: {'lr': 0.00026953423588275645, 'samples': 12210176, 'steps': 23847, 'batch_loss/train': 0.8241482535377145} 12/26/2021 02:44:19 - INFO - codeparrot_training - Step 23848: {'lr': 0.00026951852090334905, 'samples': 12210688, 'steps': 23848, 'batch_loss/train': 0.7862004423514009} 12/26/2021 02:44:30 - INFO - codeparrot_training - Step 23849: {'lr': 0.0002695028058463434, 'samples': 12211200, 'steps': 23849, 'batch_loss/train': 0.7530065132305026} 12/26/2021 02:44:41 - INFO - codeparrot_training - Step 23850: {'lr': 0.000269487090711802, 'samples': 12211712, 'steps': 23850, 'batch_loss/train': 0.7964350981637836} 12/26/2021 02:44:51 - INFO - codeparrot_training - Step 23851: {'lr': 0.00026947137549978716, 'samples': 12212224, 'steps': 23851, 'batch_loss/train': 0.9091683770529926} 12/26/2021 02:45:04 - INFO - codeparrot_training - Step 23852: {'lr': 0.00026945566021036163, 'samples': 12212736, 'steps': 23852, 'batch_loss/train': 0.7684813942760229} 12/26/2021 02:45:14 - INFO - codeparrot_training - Step 23853: {'lr': 0.0002694399448435877, 'samples': 12213248, 'steps': 23853, 'batch_loss/train': 0.7180752505082637} 12/26/2021 02:45:25 - INFO - codeparrot_training - Step 23854: {'lr': 0.00026942422939952784, 'samples': 12213760, 'steps': 23854, 'batch_loss/train': 0.7561991113470867} 12/26/2021 02:45:38 - INFO - codeparrot_training - Step 23855: {'lr': 0.0002694085138782446, 'samples': 12214272, 'steps': 23855, 'batch_loss/train': 0.7546890657395124} 12/26/2021 02:45:48 - INFO - codeparrot_training - Step 23856: {'lr': 0.0002693927982798005, 'samples': 12214784, 'steps': 23856, 'batch_loss/train': 0.7511208048090339} 12/26/2021 02:45:59 - INFO - codeparrot_training - Step 23857: {'lr': 0.00026937708260425786, 'samples': 12215296, 'steps': 23857, 
'batch_loss/train': 0.6223729336634278} 12/26/2021 02:46:11 - INFO - codeparrot_training - Step 23858: {'lr': 0.0002693613668516793, 'samples': 12215808, 'steps': 23858, 'batch_loss/train': 0.7970398524776101} 12/26/2021 02:46:22 - INFO - codeparrot_training - Step 23859: {'lr': 0.00026934565102212726, 'samples': 12216320, 'steps': 23859, 'batch_loss/train': 0.7314677941612899} 12/26/2021 02:46:32 - INFO - codeparrot_training - Step 23860: {'lr': 0.0002693299351156642, 'samples': 12216832, 'steps': 23860, 'batch_loss/train': 0.7255671233870089} 12/26/2021 02:46:43 - INFO - codeparrot_training - Step 23861: {'lr': 0.0002693142191323526, 'samples': 12217344, 'steps': 23861, 'batch_loss/train': 0.7374288954306394} 12/26/2021 02:46:56 - INFO - codeparrot_training - Step 23862: {'lr': 0.00026929850307225503, 'samples': 12217856, 'steps': 23862, 'batch_loss/train': 0.7292991280555725} 12/26/2021 02:47:06 - INFO - codeparrot_training - Step 23863: {'lr': 0.00026928278693543384, 'samples': 12218368, 'steps': 23863, 'batch_loss/train': 0.7713172789663076} 12/26/2021 02:47:17 - INFO - codeparrot_training - Step 23864: {'lr': 0.0002692670707219516, 'samples': 12218880, 'steps': 23864, 'batch_loss/train': 0.6726994663476944} 12/26/2021 02:47:29 - INFO - codeparrot_training - Step 23865: {'lr': 0.00026925135443187076, 'samples': 12219392, 'steps': 23865, 'batch_loss/train': 0.7164955572225153} 12/26/2021 02:47:39 - INFO - codeparrot_training - Step 23866: {'lr': 0.00026923563806525373, 'samples': 12219904, 'steps': 23866, 'batch_loss/train': 0.7427028985694051} 12/26/2021 02:47:50 - INFO - codeparrot_training - Step 23867: {'lr': 0.00026921992162216315, 'samples': 12220416, 'steps': 23867, 'batch_loss/train': 0.6874166331253946} 12/26/2021 02:48:02 - INFO - codeparrot_training - Step 23868: {'lr': 0.0002692042051026614, 'samples': 12220928, 'steps': 23868, 'batch_loss/train': 0.7916098702698946} 12/26/2021 02:48:13 - INFO - codeparrot_training - Step 23869: {'lr': 
0.00026918848850681097, 'samples': 12221440, 'steps': 23869, 'batch_loss/train': 0.764061818132177} 12/26/2021 02:48:23 - INFO - codeparrot_training - Step 23870: {'lr': 0.00026917277183467433, 'samples': 12221952, 'steps': 23870, 'batch_loss/train': 0.6375604264903814} 12/26/2021 02:48:34 - INFO - codeparrot_training - Step 23871: {'lr': 0.00026915705508631405, 'samples': 12222464, 'steps': 23871, 'batch_loss/train': 0.7626490877009928} 12/26/2021 02:48:47 - INFO - codeparrot_training - Step 23872: {'lr': 0.0002691413382617925, 'samples': 12222976, 'steps': 23872, 'batch_loss/train': 0.7967728646472096} 12/26/2021 02:48:57 - INFO - codeparrot_training - Step 23873: {'lr': 0.00026912562136117225, 'samples': 12223488, 'steps': 23873, 'batch_loss/train': 0.7092539228033274} 12/26/2021 02:49:08 - INFO - codeparrot_training - Step 23874: {'lr': 0.00026910990438451575, 'samples': 12224000, 'steps': 23874, 'batch_loss/train': 0.7915964161511511} 12/26/2021 02:49:20 - INFO - codeparrot_training - Step 23875: {'lr': 0.00026909418733188546, 'samples': 12224512, 'steps': 23875, 'batch_loss/train': 0.7792313575046137} 12/26/2021 02:49:31 - INFO - codeparrot_training - Step 23876: {'lr': 0.00026907847020334394, 'samples': 12225024, 'steps': 23876, 'batch_loss/train': 0.8118751798756421} 12/26/2021 02:49:41 - INFO - codeparrot_training - Step 23877: {'lr': 0.00026906275299895353, 'samples': 12225536, 'steps': 23877, 'batch_loss/train': 0.7653199788182974} 12/26/2021 02:49:54 - INFO - codeparrot_training - Step 23878: {'lr': 0.0002690470357187769, 'samples': 12226048, 'steps': 23878, 'batch_loss/train': 0.7778528044000268} 12/26/2021 02:50:05 - INFO - codeparrot_training - Step 23879: {'lr': 0.0002690313183628764, 'samples': 12226560, 'steps': 23879, 'batch_loss/train': 0.644041319261305} 12/26/2021 02:50:15 - INFO - codeparrot_training - Step 23880: {'lr': 0.0002690156009313146, 'samples': 12227072, 'steps': 23880, 'batch_loss/train': 0.6542246355675161} 12/26/2021 02:50:27 - 
INFO - codeparrot_training - Step 23881: {'lr': 0.00026899988342415394, 'samples': 12227584, 'steps': 23881, 'batch_loss/train': 0.426884492742829} 12/26/2021 02:50:38 - INFO - codeparrot_training - Step 23882: {'lr': 0.00026898416584145694, 'samples': 12228096, 'steps': 23882, 'batch_loss/train': 0.684939150873106} 12/26/2021 02:50:49 - INFO - codeparrot_training - Step 23883: {'lr': 0.000268968448183286, 'samples': 12228608, 'steps': 23883, 'batch_loss/train': 0.8362769829109311} 12/26/2021 02:50:59 - INFO - codeparrot_training - Step 23884: {'lr': 0.00026895273044970374, 'samples': 12229120, 'steps': 23884, 'batch_loss/train': 0.7031074874103069} 12/26/2021 02:51:11 - INFO - codeparrot_training - Step 23885: {'lr': 0.00026893701264077256, 'samples': 12229632, 'steps': 23885, 'batch_loss/train': 0.5589426420920063} 12/26/2021 02:51:22 - INFO - codeparrot_training - Step 23886: {'lr': 0.0002689212947565549, 'samples': 12230144, 'steps': 23886, 'batch_loss/train': 0.8473818628117442} 12/26/2021 02:51:33 - INFO - codeparrot_training - Step 23887: {'lr': 0.00026890557679711344, 'samples': 12230656, 'steps': 23887, 'batch_loss/train': 0.7771607339382172} 12/26/2021 02:51:45 - INFO - codeparrot_training - Step 23888: {'lr': 0.0002688898587625105, 'samples': 12231168, 'steps': 23888, 'batch_loss/train': 0.7478873413056135} 12/26/2021 02:51:55 - INFO - codeparrot_training - Step 23889: {'lr': 0.0002688741406528086, 'samples': 12231680, 'steps': 23889, 'batch_loss/train': 0.5337316628720146} 12/26/2021 02:52:06 - INFO - codeparrot_training - Step 23890: {'lr': 0.0002688584224680702, 'samples': 12232192, 'steps': 23890, 'batch_loss/train': 0.7996529107913375} 12/26/2021 02:52:19 - INFO - codeparrot_training - Step 23891: {'lr': 0.0002688427042083579, 'samples': 12232704, 'steps': 23891, 'batch_loss/train': 0.6848178063519299} 12/26/2021 02:52:29 - INFO - codeparrot_training - Step 23892: {'lr': 0.0002688269858737341, 'samples': 12233216, 'steps': 23892, 'batch_loss/train': 
0.7680944257881492} 12/26/2021 02:52:40 - INFO - codeparrot_training - Step 23893: {'lr': 0.0002688112674642613, 'samples': 12233728, 'steps': 23893, 'batch_loss/train': 0.6607063782284968} 12/26/2021 02:52:51 - INFO - codeparrot_training - Step 23894: {'lr': 0.000268795548980002, 'samples': 12234240, 'steps': 23894, 'batch_loss/train': 0.6685192319564521} 12/26/2021 02:53:03 - INFO - codeparrot_training - Step 23895: {'lr': 0.0002687798304210187, 'samples': 12234752, 'steps': 23895, 'batch_loss/train': 0.769274735590443} 12/26/2021 02:53:13 - INFO - codeparrot_training - Step 23896: {'lr': 0.00026876411178737385, 'samples': 12235264, 'steps': 23896, 'batch_loss/train': 0.6377217945992015} 12/26/2021 02:53:24 - INFO - codeparrot_training - Step 23897: {'lr': 0.0002687483930791301, 'samples': 12235776, 'steps': 23897, 'batch_loss/train': 0.7048011253355071} 12/26/2021 02:53:36 - INFO - codeparrot_training - Step 23898: {'lr': 0.00026873267429634976, 'samples': 12236288, 'steps': 23898, 'batch_loss/train': 0.8883842369541526} 12/26/2021 02:53:47 - INFO - codeparrot_training - Step 23899: {'lr': 0.00026871695543909525, 'samples': 12236800, 'steps': 23899, 'batch_loss/train': 0.7350655766204} 12/26/2021 02:53:57 - INFO - codeparrot_training - Step 23900: {'lr': 0.00026870123650742935, 'samples': 12237312, 'steps': 23900, 'batch_loss/train': 0.6988838913384825} 12/26/2021 02:54:10 - INFO - codeparrot_training - Step 23901: {'lr': 0.0002686855175014143, 'samples': 12237824, 'steps': 23901, 'batch_loss/train': 0.6864289129152894} 12/26/2021 02:54:21 - INFO - codeparrot_training - Step 23902: {'lr': 0.0002686697984211127, 'samples': 12238336, 'steps': 23902, 'batch_loss/train': 0.7716113016940653} 12/26/2021 02:54:32 - INFO - codeparrot_training - Step 23903: {'lr': 0.00026865407926658706, 'samples': 12238848, 'steps': 23903, 'batch_loss/train': 0.70516446698457} 12/26/2021 02:54:44 - INFO - codeparrot_training - Step 23904: {'lr': 0.0002686383600378999, 'samples': 
12239360, 'steps': 23904, 'batch_loss/train': 0.5356957412441261} 12/26/2021 02:54:55 - INFO - codeparrot_training - Step 23905: {'lr': 0.0002686226407351136, 'samples': 12239872, 'steps': 23905, 'batch_loss/train': 0.7301292000338435} 12/26/2021 02:55:05 - INFO - codeparrot_training - Step 23906: {'lr': 0.0002686069213582907, 'samples': 12240384, 'steps': 23906, 'batch_loss/train': 0.7691755504347384} 12/26/2021 02:55:16 - INFO - codeparrot_training - Step 23907: {'lr': 0.00026859120190749375, 'samples': 12240896, 'steps': 23907, 'batch_loss/train': 0.5726347929448821} 12/26/2021 02:55:28 - INFO - codeparrot_training - Step 23908: {'lr': 0.0002685754823827852, 'samples': 12241408, 'steps': 23908, 'batch_loss/train': 0.8782839076593518} 12/26/2021 02:55:39 - INFO - codeparrot_training - Step 23909: {'lr': 0.00026855976278422747, 'samples': 12241920, 'steps': 23909, 'batch_loss/train': 0.7433117814362049} 12/26/2021 02:55:50 - INFO - codeparrot_training - Step 23910: {'lr': 0.00026854404311188327, 'samples': 12242432, 'steps': 23910, 'batch_loss/train': 0.7458272175863385} 12/26/2021 02:56:02 - INFO - codeparrot_training - Step 23911: {'lr': 0.0002685283233658148, 'samples': 12242944, 'steps': 23911, 'batch_loss/train': 0.7802470331080258} 12/26/2021 02:56:13 - INFO - codeparrot_training - Step 23912: {'lr': 0.0002685126035460847, 'samples': 12243456, 'steps': 23912, 'batch_loss/train': 0.6542746620252728} 12/26/2021 02:56:23 - INFO - codeparrot_training - Step 23913: {'lr': 0.0002684968836527556, 'samples': 12243968, 'steps': 23913, 'batch_loss/train': 0.8074692594818771} 12/26/2021 02:56:35 - INFO - codeparrot_training - Step 23914: {'lr': 0.00026848116368588987, 'samples': 12244480, 'steps': 23914, 'batch_loss/train': 0.8683657739311457} 12/26/2021 02:56:46 - INFO - codeparrot_training - Step 23915: {'lr': 0.00026846544364554994, 'samples': 12244992, 'steps': 23915, 'batch_loss/train': 0.642020417843014} 12/26/2021 02:56:57 - INFO - codeparrot_training - Step 
23916: {'lr': 0.0002684497235317984, 'samples': 12245504, 'steps': 23916, 'batch_loss/train': 0.6745743616484106} 12/26/2021 02:57:07 - INFO - codeparrot_training - Step 23917: {'lr': 0.0002684340033446978, 'samples': 12246016, 'steps': 23917, 'batch_loss/train': 0.7531937947496772} 12/26/2021 02:57:20 - INFO - codeparrot_training - Step 23918: {'lr': 0.00026841828308431047, 'samples': 12246528, 'steps': 23918, 'batch_loss/train': 0.6242791637196206} 12/26/2021 02:57:30 - INFO - codeparrot_training - Step 23919: {'lr': 0.000268402562750699, 'samples': 12247040, 'steps': 23919, 'batch_loss/train': 0.4651821998413652} 12/26/2021 02:57:41 - INFO - codeparrot_training - Step 23920: {'lr': 0.00026838684234392596, 'samples': 12247552, 'steps': 23920, 'batch_loss/train': 0.7878462886437774} 12/26/2021 02:57:54 - INFO - codeparrot_training - Step 23921: {'lr': 0.00026837112186405373, 'samples': 12248064, 'steps': 23921, 'batch_loss/train': 0.7429698351770639} 12/26/2021 02:58:04 - INFO - codeparrot_training - Step 23922: {'lr': 0.0002683554013111448, 'samples': 12248576, 'steps': 23922, 'batch_loss/train': 0.7480844943784177} 12/26/2021 02:58:15 - INFO - codeparrot_training - Step 23923: {'lr': 0.0002683396806852618, 'samples': 12249088, 'steps': 23923, 'batch_loss/train': 0.7320802365429699} 12/26/2021 02:58:27 - INFO - codeparrot_training - Step 23924: {'lr': 0.0002683239599864672, 'samples': 12249600, 'steps': 23924, 'batch_loss/train': 0.7992077516391873} 12/26/2021 02:58:38 - INFO - codeparrot_training - Step 23925: {'lr': 0.00026830823921482333, 'samples': 12250112, 'steps': 23925, 'batch_loss/train': 0.8687211913056672} 12/26/2021 02:58:48 - INFO - codeparrot_training - Step 23926: {'lr': 0.00026829251837039294, 'samples': 12250624, 'steps': 23926, 'batch_loss/train': 0.7457404397428036} 12/26/2021 02:59:00 - INFO - codeparrot_training - Step 23927: {'lr': 0.00026827679745323834, 'samples': 12251136, 'steps': 23927, 'batch_loss/train': 0.634655658621341} 12/26/2021 
02:59:11 - INFO - codeparrot_training - Step 23928: {'lr': 0.00026826107646342207, 'samples': 12251648, 'steps': 23928, 'batch_loss/train': 0.6112652139272541} 12/26/2021 02:59:22 - INFO - codeparrot_training - Step 23929: {'lr': 0.0002682453554010067, 'samples': 12252160, 'steps': 23929, 'batch_loss/train': 0.8159925690852106} 12/26/2021 02:59:32 - INFO - codeparrot_training - Step 23930: {'lr': 0.00026822963426605475, 'samples': 12252672, 'steps': 23930, 'batch_loss/train': 0.7280419319868088} 12/26/2021 02:59:45 - INFO - codeparrot_training - Step 23931: {'lr': 0.0002682139130586285, 'samples': 12253184, 'steps': 23931, 'batch_loss/train': 0.7111246050335467} 12/26/2021 02:59:55 - INFO - codeparrot_training - Step 23932: {'lr': 0.00026819819177879075, 'samples': 12253696, 'steps': 23932, 'batch_loss/train': 0.7485021650791168} 12/26/2021 03:00:06 - INFO - codeparrot_training - Step 23933: {'lr': 0.00026818247042660376, 'samples': 12254208, 'steps': 23933, 'batch_loss/train': 0.7187814489006996} 12/26/2021 03:00:18 - INFO - codeparrot_training - Step 23934: {'lr': 0.0002681667490021302, 'samples': 12254720, 'steps': 23934, 'batch_loss/train': 0.665980392601341} 12/26/2021 03:00:29 - INFO - codeparrot_training - Step 23935: {'lr': 0.0002681510275054324, 'samples': 12255232, 'steps': 23935, 'batch_loss/train': 0.7717397366650403} 12/26/2021 03:00:39 - INFO - codeparrot_training - Step 23936: {'lr': 0.000268135305936573, 'samples': 12255744, 'steps': 23936, 'batch_loss/train': 0.743482893332839} 12/26/2021 03:00:51 - INFO - codeparrot_training - Step 23937: {'lr': 0.0002681195842956146, 'samples': 12256256, 'steps': 23937, 'batch_loss/train': 0.7178376474184915} 12/26/2021 03:01:02 - INFO - codeparrot_training - Step 23938: {'lr': 0.00026810386258261943, 'samples': 12256768, 'steps': 23938, 'batch_loss/train': 0.681353495689109} 12/26/2021 03:01:13 - INFO - codeparrot_training - Step 23939: {'lr': 0.00026808814079765027, 'samples': 12257280, 'steps': 23939, 
'batch_loss/train': 0.6539976033382118} 12/26/2021 03:01:23 - INFO - codeparrot_training - Step 23940: {'lr': 0.0002680724189407694, 'samples': 12257792, 'steps': 23940, 'batch_loss/train': 0.7528631342574954} 12/26/2021 03:01:36 - INFO - codeparrot_training - Step 23941: {'lr': 0.0002680566970120394, 'samples': 12258304, 'steps': 23941, 'batch_loss/train': 0.9095144048333168} 12/26/2021 03:01:47 - INFO - codeparrot_training - Step 23942: {'lr': 0.00026804097501152284, 'samples': 12258816, 'steps': 23942, 'batch_loss/train': 0.746099871583283} 12/26/2021 03:01:57 - INFO - codeparrot_training - Step 23943: {'lr': 0.0002680252529392822, 'samples': 12259328, 'steps': 23943, 'batch_loss/train': 0.7391518221702427} 12/26/2021 03:02:10 - INFO - codeparrot_training - Step 23944: {'lr': 0.00026800953079537986, 'samples': 12259840, 'steps': 23944, 'batch_loss/train': 0.7234034966677427} 12/26/2021 03:02:20 - INFO - codeparrot_training - Step 23945: {'lr': 0.0002679938085798785, 'samples': 12260352, 'steps': 23945, 'batch_loss/train': 0.7723863613791764} 12/26/2021 03:02:31 - INFO - codeparrot_training - Step 23946: {'lr': 0.0002679780862928406, 'samples': 12260864, 'steps': 23946, 'batch_loss/train': 0.7185704708099365} 12/26/2021 03:02:43 - INFO - codeparrot_training - Step 23947: {'lr': 0.00026796236393432854, 'samples': 12261376, 'steps': 23947, 'batch_loss/train': 0.7856760029681027} 12/26/2021 03:02:54 - INFO - codeparrot_training - Step 23948: {'lr': 0.0002679466415044049, 'samples': 12261888, 'steps': 23948, 'batch_loss/train': 0.5721216876991093} 12/26/2021 03:03:05 - INFO - codeparrot_training - Step 23949: {'lr': 0.0002679309190031322, 'samples': 12262400, 'steps': 23949, 'batch_loss/train': 0.7200281876139343} 12/26/2021 03:03:15 - INFO - codeparrot_training - Step 23950: {'lr': 0.000267915196430573, 'samples': 12262912, 'steps': 23950, 'batch_loss/train': 0.826831434853375} 12/26/2021 03:03:27 - INFO - codeparrot_training - Step 23951: {'lr': 
0.00026789947378678963, 'samples': 12263424, 'steps': 23951, 'batch_loss/train': 0.5956999430200085} 12/26/2021 03:03:38 - INFO - codeparrot_training - Step 23952: {'lr': 0.0002678837510718448, 'samples': 12263936, 'steps': 23952, 'batch_loss/train': 0.7134434593608603} 12/26/2021 03:03:48 - INFO - codeparrot_training - Step 23953: {'lr': 0.00026786802828580094, 'samples': 12264448, 'steps': 23953, 'batch_loss/train': 0.7420103773474693} 12/26/2021 03:04:00 - INFO - codeparrot_training - Step 23954: {'lr': 0.00026785230542872047, 'samples': 12264960, 'steps': 23954, 'batch_loss/train': 0.6081424816511571} 12/26/2021 03:04:11 - INFO - codeparrot_training - Step 23955: {'lr': 0.000267836582500666, 'samples': 12265472, 'steps': 23955, 'batch_loss/train': 0.7089472273364663} 12/26/2021 03:04:22 - INFO - codeparrot_training - Step 23956: {'lr': 0.00026782085950170006, 'samples': 12265984, 'steps': 23956, 'batch_loss/train': 0.7311242432333529} 12/26/2021 03:04:34 - INFO - codeparrot_training - Step 23957: {'lr': 0.000267805136431885, 'samples': 12266496, 'steps': 23957, 'batch_loss/train': 0.7063015031162649} 12/26/2021 03:04:44 - INFO - codeparrot_training - Step 23958: {'lr': 0.00026778941329128357, 'samples': 12267008, 'steps': 23958, 'batch_loss/train': 0.6740615351591259} 12/26/2021 03:04:55 - INFO - codeparrot_training - Step 23959: {'lr': 0.0002677736900799581, 'samples': 12267520, 'steps': 23959, 'batch_loss/train': 0.668417424603831} 12/26/2021 03:05:08 - INFO - codeparrot_training - Step 23960: {'lr': 0.00026775796679797114, 'samples': 12268032, 'steps': 23960, 'batch_loss/train': 0.6412272236775607} 12/26/2021 03:05:19 - INFO - codeparrot_training - Step 23961: {'lr': 0.00026774224344538523, 'samples': 12268544, 'steps': 23961, 'batch_loss/train': 0.7340745734982193} 12/26/2021 03:05:29 - INFO - codeparrot_training - Step 23962: {'lr': 0.00026772652002226275, 'samples': 12269056, 'steps': 23962, 'batch_loss/train': 0.8125288840383291} 12/26/2021 03:05:40 - 
INFO - codeparrot_training - Step 23963: {'lr': 0.00026771079652866653, 'samples': 12269568, 'steps': 23963, 'batch_loss/train': 0.8055520560592413} 12/26/2021 03:05:52 - INFO - codeparrot_training - Step 23964: {'lr': 0.00026769507296465865, 'samples': 12270080, 'steps': 23964, 'batch_loss/train': 0.7493863725103438} 12/26/2021 03:06:02 - INFO - codeparrot_training - Step 23965: {'lr': 0.00026767934933030204, 'samples': 12270592, 'steps': 23965, 'batch_loss/train': 0.6940721767023206} 12/26/2021 03:06:13 - INFO - codeparrot_training - Step 23966: {'lr': 0.0002676636256256589, 'samples': 12271104, 'steps': 23966, 'batch_loss/train': 0.7023185146972537} 12/26/2021 03:06:25 - INFO - codeparrot_training - Step 23967: {'lr': 0.00026764790185079183, 'samples': 12271616, 'steps': 23967, 'batch_loss/train': 0.7641827580519021} 12/26/2021 03:06:36 - INFO - codeparrot_training - Step 23968: {'lr': 0.0002676321780057634, 'samples': 12272128, 'steps': 23968, 'batch_loss/train': 0.5946956319385208} 12/26/2021 03:06:46 - INFO - codeparrot_training - Step 23969: {'lr': 0.00026761645409063617, 'samples': 12272640, 'steps': 23969, 'batch_loss/train': 0.821041633374989} 12/26/2021 03:06:59 - INFO - codeparrot_training - Step 23970: {'lr': 0.00026760073010547244, 'samples': 12273152, 'steps': 23970, 'batch_loss/train': 0.8112223001662642} 12/26/2021 03:07:10 - INFO - codeparrot_training - Step 23971: {'lr': 0.0002675850060503349, 'samples': 12273664, 'steps': 23971, 'batch_loss/train': 0.7125394949689507} 12/26/2021 03:07:20 - INFO - codeparrot_training - Step 23972: {'lr': 0.000267569281925286, 'samples': 12274176, 'steps': 23972, 'batch_loss/train': 0.8160026599653065} 12/26/2021 03:07:33 - INFO - codeparrot_training - Step 23973: {'lr': 0.0002675535577303883, 'samples': 12274688, 'steps': 23973, 'batch_loss/train': 0.7869997639209032} 12/26/2021 03:07:43 - INFO - codeparrot_training - Step 23974: {'lr': 0.00026753783346570426, 'samples': 12275200, 'steps': 23974, 
'batch_loss/train': 0.673953972524032} 12/26/2021 03:07:54 - INFO - codeparrot_training - Step 23975: {'lr': 0.0002675221091312964, 'samples': 12275712, 'steps': 23975, 'batch_loss/train': 0.724899782333523} 12/26/2021 03:08:04 - INFO - codeparrot_training - Step 23976: {'lr': 0.0002675063847272273, 'samples': 12276224, 'steps': 23976, 'batch_loss/train': 0.719934887252748} 12/26/2021 03:08:17 - INFO - codeparrot_training - Step 23977: {'lr': 0.0002674906602535594, 'samples': 12276736, 'steps': 23977, 'batch_loss/train': 0.7583187683485448} 12/26/2021 03:08:28 - INFO - codeparrot_training - Step 23978: {'lr': 0.0002674749357103553, 'samples': 12277248, 'steps': 23978, 'batch_loss/train': 0.868863744661212} 12/26/2021 03:08:38 - INFO - codeparrot_training - Step 23979: {'lr': 0.0002674592110976774, 'samples': 12277760, 'steps': 23979, 'batch_loss/train': 0.9082894255407155} 12/26/2021 03:08:50 - INFO - codeparrot_training - Step 23980: {'lr': 0.0002674434864155882, 'samples': 12278272, 'steps': 23980, 'batch_loss/train': 0.7469810014590621} 12/26/2021 03:09:01 - INFO - codeparrot_training - Step 23981: {'lr': 0.0002674277616641504, 'samples': 12278784, 'steps': 23981, 'batch_loss/train': 0.7271029447438195} 12/26/2021 03:09:12 - INFO - codeparrot_training - Step 23982: {'lr': 0.0002674120368434264, 'samples': 12279296, 'steps': 23982, 'batch_loss/train': 0.7033652883255854} 12/26/2021 03:09:24 - INFO - codeparrot_training - Step 23983: {'lr': 0.00026739631195347855, 'samples': 12279808, 'steps': 23983, 'batch_loss/train': 0.7808762360364199} 12/26/2021 03:09:34 - INFO - codeparrot_training - Step 23984: {'lr': 0.00026738058699436973, 'samples': 12280320, 'steps': 23984, 'batch_loss/train': 0.726496544200927} 12/26/2021 03:09:45 - INFO - codeparrot_training - Step 23985: {'lr': 0.00026736486196616215, 'samples': 12280832, 'steps': 23985, 'batch_loss/train': 0.7101682960055768} 12/26/2021 03:09:56 - INFO - codeparrot_training - Step 23986: {'lr': 
0.0002673491368689185, 'samples': 12281344, 'steps': 23986, 'batch_loss/train': 0.7036900464445353} 12/26/2021 03:10:08 - INFO - codeparrot_training - Step 23987: {'lr': 0.0002673334117027012, 'samples': 12281856, 'steps': 23987, 'batch_loss/train': 0.8013535970821977} 12/26/2021 03:10:19 - INFO - codeparrot_training - Step 23988: {'lr': 0.00026731768646757277, 'samples': 12282368, 'steps': 23988, 'batch_loss/train': 0.736212270334363} 12/26/2021 03:10:29 - INFO - codeparrot_training - Step 23989: {'lr': 0.00026730196116359577, 'samples': 12282880, 'steps': 23989, 'batch_loss/train': 0.7955430056899786} 12/26/2021 03:10:42 - INFO - codeparrot_training - Step 23990: {'lr': 0.0002672862357908327, 'samples': 12283392, 'steps': 23990, 'batch_loss/train': 0.8147940617054701} 12/26/2021 03:10:53 - INFO - codeparrot_training - Step 23991: {'lr': 0.0002672705103493462, 'samples': 12283904, 'steps': 23991, 'batch_loss/train': 0.7186896037310362} 12/26/2021 03:11:03 - INFO - codeparrot_training - Step 23992: {'lr': 0.00026725478483919854, 'samples': 12284416, 'steps': 23992, 'batch_loss/train': 0.6284587585832924} 12/26/2021 03:11:15 - INFO - codeparrot_training - Step 23993: {'lr': 0.0002672390592604524, 'samples': 12284928, 'steps': 23993, 'batch_loss/train': 0.7625699490308762} 12/26/2021 03:11:26 - INFO - codeparrot_training - Step 23994: {'lr': 0.00026722333361317026, 'samples': 12285440, 'steps': 23994, 'batch_loss/train': 0.7860132697969675} 12/26/2021 03:11:37 - INFO - codeparrot_training - Step 23995: {'lr': 0.00026720760789741474, 'samples': 12285952, 'steps': 23995, 'batch_loss/train': 0.8226245427504182} 12/26/2021 03:11:49 - INFO - codeparrot_training - Step 23996: {'lr': 0.00026719188211324807, 'samples': 12286464, 'steps': 23996, 'batch_loss/train': 0.6419153939932585} 12/26/2021 03:11:59 - INFO - codeparrot_training - Step 23997: {'lr': 0.00026717615626073313, 'samples': 12286976, 'steps': 23997, 'batch_loss/train': 0.7747430494055152} 12/26/2021 03:12:10 - 
INFO - codeparrot_training - Step 23998: {'lr': 0.00026716043033993227, 'samples': 12287488, 'steps': 23998, 'batch_loss/train': 0.7036835020408034} 12/26/2021 03:12:21 - INFO - codeparrot_training - Step 23999: {'lr': 0.0002671447043509079, 'samples': 12288000, 'steps': 23999, 'batch_loss/train': 0.8106211759150028} 12/26/2021 03:12:33 - INFO - codeparrot_training - Step 24000: {'lr': 0.00026712897829372277, 'samples': 12288512, 'steps': 24000, 'batch_loss/train': 0.7700261734426022} 12/26/2021 03:12:44 - INFO - codeparrot_training - Step 24001: {'lr': 0.0002671132521684393, 'samples': 12289024, 'steps': 24001, 'batch_loss/train': 0.7162701524794102} 12/26/2021 03:12:55 - INFO - codeparrot_training - Step 24002: {'lr': 0.00026709752597511995, 'samples': 12289536, 'steps': 24002, 'batch_loss/train': 0.8021867079660296} 12/26/2021 03:13:07 - INFO - codeparrot_training - Step 24003: {'lr': 0.0002670817997138273, 'samples': 12290048, 'steps': 24003, 'batch_loss/train': 0.7536216545850039} 12/26/2021 03:13:17 - INFO - codeparrot_training - Step 24004: {'lr': 0.0002670660733846239, 'samples': 12290560, 'steps': 24004, 'batch_loss/train': 0.7771802917122841} 12/26/2021 03:13:28 - INFO - codeparrot_training - Step 24005: {'lr': 0.00026705034698757226, 'samples': 12291072, 'steps': 24005, 'batch_loss/train': 0.8177287550643086} 12/26/2021 03:13:41 - INFO - codeparrot_training - Step 24006: {'lr': 0.0002670346205227348, 'samples': 12291584, 'steps': 24006, 'batch_loss/train': 0.7423866083845496} 12/26/2021 03:13:52 - INFO - codeparrot_training - Step 24007: {'lr': 0.00026701889399017415, 'samples': 12292096, 'steps': 24007, 'batch_loss/train': 0.7135963728651404} 12/26/2021 03:14:02 - INFO - codeparrot_training - Step 24008: {'lr': 0.00026700316738995284, 'samples': 12292608, 'steps': 24008, 'batch_loss/train': 0.7338184749241918} 12/26/2021 03:14:14 - INFO - codeparrot_training - Step 24009: {'lr': 0.00026698744072213326, 'samples': 12293120, 'steps': 24009, 
'batch_loss/train': 0.7234223762061447} 12/26/2021 03:14:25 - INFO - codeparrot_training - Step 24010: {'lr': 0.00026697171398677803, 'samples': 12293632, 'steps': 24010, 'batch_loss/train': 0.689813026576303} 12/26/2021 03:14:35 - INFO - codeparrot_training - Step 24011: {'lr': 0.0002669559871839498, 'samples': 12294144, 'steps': 24011, 'batch_loss/train': 0.8566519429441541} 12/26/2021 03:14:46 - INFO - codeparrot_training - Step 24012: {'lr': 0.00026694026031371087, 'samples': 12294656, 'steps': 24012, 'batch_loss/train': 0.7171919355168939} 12/26/2021 03:14:58 - INFO - codeparrot_training - Step 24013: {'lr': 0.00026692453337612396, 'samples': 12295168, 'steps': 24013, 'batch_loss/train': 0.7724543807562441} 12/26/2021 03:15:09 - INFO - codeparrot_training - Step 24014: {'lr': 0.00026690880637125135, 'samples': 12295680, 'steps': 24014, 'batch_loss/train': 0.8205408193171024} 12/26/2021 03:15:19 - INFO - codeparrot_training - Step 24015: {'lr': 0.0002668930792991558, 'samples': 12296192, 'steps': 24015, 'batch_loss/train': 0.7658146433532238} 12/26/2021 03:15:32 - INFO - codeparrot_training - Step 24016: {'lr': 0.0002668773521598997, 'samples': 12296704, 'steps': 24016, 'batch_loss/train': 0.8037578193470836} 12/26/2021 03:15:43 - INFO - codeparrot_training - Step 24017: {'lr': 0.0002668616249535457, 'samples': 12297216, 'steps': 24017, 'batch_loss/train': 0.749703805660829} 12/26/2021 03:15:54 - INFO - codeparrot_training - Step 24018: {'lr': 0.0002668458976801562, 'samples': 12297728, 'steps': 24018, 'batch_loss/train': 0.673424219712615} 12/26/2021 03:16:06 - INFO - codeparrot_training - Step 24019: {'lr': 0.0002668301703397938, 'samples': 12298240, 'steps': 24019, 'batch_loss/train': 0.7171848006546497} 12/26/2021 03:16:16 - INFO - codeparrot_training - Step 24020: {'lr': 0.00026681444293252094, 'samples': 12298752, 'steps': 24020, 'batch_loss/train': 0.725233574397862} 12/26/2021 03:16:27 - INFO - codeparrot_training - Step 24021: {'lr': 
0.0002667987154584002, 'samples': 12299264, 'steps': 24021, 'batch_loss/train': 0.7477351557463408} 12/26/2021 03:16:38 - INFO - codeparrot_training - Step 24022: {'lr': 0.00026678298791749414, 'samples': 12299776, 'steps': 24022, 'batch_loss/train': 0.6685355122899637} 12/26/2021 03:16:50 - INFO - codeparrot_training - Step 24023: {'lr': 0.0002667672603098653, 'samples': 12300288, 'steps': 24023, 'batch_loss/train': 0.7167196171358228} 12/26/2021 03:17:00 - INFO - codeparrot_training - Step 24024: {'lr': 0.0002667515326355762, 'samples': 12300800, 'steps': 24024, 'batch_loss/train': 0.7288212962448597} 12/26/2021 03:17:11 - INFO - codeparrot_training - Step 24025: {'lr': 0.00026673580489468913, 'samples': 12301312, 'steps': 24025, 'batch_loss/train': 0.735278666834347} 12/26/2021 03:17:24 - INFO - codeparrot_training - Step 24026: {'lr': 0.000266720077087267, 'samples': 12301824, 'steps': 24026, 'batch_loss/train': 0.672090174164623} 12/26/2021 03:17:34 - INFO - codeparrot_training - Step 24027: {'lr': 0.00026670434921337213, 'samples': 12302336, 'steps': 24027, 'batch_loss/train': 0.745156372897327} 12/26/2021 03:17:45 - INFO - codeparrot_training - Step 24028: {'lr': 0.000266688621273067, 'samples': 12302848, 'steps': 24028, 'batch_loss/train': 0.7688738065771759} 12/26/2021 03:17:57 - INFO - codeparrot_training - Step 24029: {'lr': 0.00026667289326641426, 'samples': 12303360, 'steps': 24029, 'batch_loss/train': 0.8076397082768381} 12/26/2021 03:18:08 - INFO - codeparrot_training - Step 24030: {'lr': 0.0002666571651934764, 'samples': 12303872, 'steps': 24030, 'batch_loss/train': 0.7521133981645107} 12/26/2021 03:18:18 - INFO - codeparrot_training - Step 24031: {'lr': 0.000266641437054316, 'samples': 12304384, 'steps': 24031, 'batch_loss/train': 0.6957426466979086} 12/26/2021 03:18:29 - INFO - codeparrot_training - Step 24032: {'lr': 0.0002666257088489954, 'samples': 12304896, 'steps': 24032, 'batch_loss/train': 0.8054479421116412} 12/26/2021 03:18:41 - INFO - 
codeparrot_training - Step 24033: {'lr': 0.00026660998057757737, 'samples': 12305408, 'steps': 24033, 'batch_loss/train': 0.7518982156179845} 12/26/2021 03:18:52 - INFO - codeparrot_training - Step 24034: {'lr': 0.00026659425224012423, 'samples': 12305920, 'steps': 24034, 'batch_loss/train': 0.6839588881703094} 12/26/2021 03:19:02 - INFO - codeparrot_training - Step 24035: {'lr': 0.00026657852383669864, 'samples': 12306432, 'steps': 24035, 'batch_loss/train': 0.7629478583112359} 12/26/2021 03:19:15 - INFO - codeparrot_training - Step 24036: {'lr': 0.0002665627953673631, 'samples': 12306944, 'steps': 24036, 'batch_loss/train': 0.8174893888644874} 12/26/2021 03:19:26 - INFO - codeparrot_training - Step 24037: {'lr': 0.0002665470668321802, 'samples': 12307456, 'steps': 24037, 'batch_loss/train': 0.6462090051500127} 12/26/2021 03:19:36 - INFO - codeparrot_training - Step 24038: {'lr': 0.0002665313382312123, 'samples': 12307968, 'steps': 24038, 'batch_loss/train': 0.736491119954735} 12/26/2021 03:19:49 - INFO - codeparrot_training - Step 24039: {'lr': 0.00026651560956452217, 'samples': 12308480, 'steps': 24039, 'batch_loss/train': 1.172623666934669} 12/26/2021 03:20:00 - INFO - codeparrot_training - Step 24040: {'lr': 0.00026649988083217213, 'samples': 12308992, 'steps': 24040, 'batch_loss/train': 0.6711311452672817} 12/26/2021 03:20:10 - INFO - codeparrot_training - Step 24041: {'lr': 0.00026648415203422483, 'samples': 12309504, 'steps': 24041, 'batch_loss/train': 0.8414095267653465} 12/26/2021 03:20:21 - INFO - codeparrot_training - Step 24042: {'lr': 0.0002664684231707427, 'samples': 12310016, 'steps': 24042, 'batch_loss/train': 0.7974354168400168} 12/26/2021 03:20:33 - INFO - codeparrot_training - Step 24043: {'lr': 0.0002664526942417885, 'samples': 12310528, 'steps': 24043, 'batch_loss/train': 0.7097762692719698} 12/26/2021 03:20:44 - INFO - codeparrot_training - Step 24044: {'lr': 0.0002664369652474244, 'samples': 12311040, 'steps': 24044, 'batch_loss/train': 
0.6377007244154811} 12/26/2021 03:20:54 - INFO - codeparrot_training - Step 24045: {'lr': 0.00026642123618771324, 'samples': 12311552, 'steps': 24045, 'batch_loss/train': 0.7129405424930155} 12/26/2021 03:21:07 - INFO - codeparrot_training - Step 24046: {'lr': 0.00026640550706271744, 'samples': 12312064, 'steps': 24046, 'batch_loss/train': 0.6638314937590621} 12/26/2021 03:21:18 - INFO - codeparrot_training - Step 24047: {'lr': 0.0002663897778724995, 'samples': 12312576, 'steps': 24047, 'batch_loss/train': 0.8192258048802614} 12/26/2021 03:21:28 - INFO - codeparrot_training - Step 24048: {'lr': 0.000266374048617122, 'samples': 12313088, 'steps': 24048, 'batch_loss/train': 0.7875416846945882} 12/26/2021 03:21:41 - INFO - codeparrot_training - Step 24049: {'lr': 0.0002663583192966475, 'samples': 12313600, 'steps': 24049, 'batch_loss/train': 0.6399554965901189} 12/26/2021 03:21:51 - INFO - codeparrot_training - Step 24050: {'lr': 0.0002663425899111385, 'samples': 12314112, 'steps': 24050, 'batch_loss/train': 0.7711009094491601} 12/26/2021 03:22:02 - INFO - codeparrot_training - Step 24051: {'lr': 0.0002663268604606575, 'samples': 12314624, 'steps': 24051, 'batch_loss/train': 0.4585186010808684} 12/26/2021 03:22:14 - INFO - codeparrot_training - Step 24052: {'lr': 0.0002663111309452671, 'samples': 12315136, 'steps': 24052, 'batch_loss/train': 0.8323307037353516} 12/26/2021 03:22:25 - INFO - codeparrot_training - Step 24053: {'lr': 0.00026629540136502975, 'samples': 12315648, 'steps': 24053, 'batch_loss/train': 0.7280478542670608} 12/26/2021 03:22:35 - INFO - codeparrot_training - Step 24054: {'lr': 0.0002662796717200081, 'samples': 12316160, 'steps': 24054, 'batch_loss/train': 0.7408524393104017} 12/26/2021 03:22:46 - INFO - codeparrot_training - Step 24055: {'lr': 0.00026626394201026453, 'samples': 12316672, 'steps': 24055, 'batch_loss/train': 0.6788173532113433} 12/26/2021 03:22:59 - INFO - codeparrot_training - Step 24056: {'lr': 0.0002662482122358618, 'samples': 
12317184, 'steps': 24056, 'batch_loss/train': 0.7594407871365547} 12/26/2021 03:23:09 - INFO - codeparrot_training - Step 24057: {'lr': 0.0002662324823968622, 'samples': 12317696, 'steps': 24057, 'batch_loss/train': 0.8274411465972662} 12/26/2021 03:23:20 - INFO - codeparrot_training - Step 24058: {'lr': 0.00026621675249332845, 'samples': 12318208, 'steps': 24058, 'batch_loss/train': 0.7105059021851048} 12/26/2021 03:23:32 - INFO - codeparrot_training - Step 24059: {'lr': 0.00026620102252532295, 'samples': 12318720, 'steps': 24059, 'batch_loss/train': 0.7751764906570315} 12/26/2021 03:23:43 - INFO - codeparrot_training - Step 24060: {'lr': 0.00026618529249290837, 'samples': 12319232, 'steps': 24060, 'batch_loss/train': 0.840142625849694} 12/26/2021 03:23:53 - INFO - codeparrot_training - Step 24061: {'lr': 0.00026616956239614714, 'samples': 12319744, 'steps': 24061, 'batch_loss/train': 0.7624926706776023} 12/26/2021 03:24:05 - INFO - codeparrot_training - Step 24062: {'lr': 0.00026615383223510184, 'samples': 12320256, 'steps': 24062, 'batch_loss/train': 0.7082548686885275} 12/26/2021 03:24:16 - INFO - codeparrot_training - Step 24063: {'lr': 0.00026613810200983506, 'samples': 12320768, 'steps': 24063, 'batch_loss/train': 0.8085290566086769} 12/26/2021 03:24:27 - INFO - codeparrot_training - Step 24064: {'lr': 0.00026612237172040916, 'samples': 12321280, 'steps': 24064, 'batch_loss/train': 0.8023824915289879} 12/26/2021 03:24:39 - INFO - codeparrot_training - Step 24065: {'lr': 0.0002661066413668869, 'samples': 12321792, 'steps': 24065, 'batch_loss/train': 0.7316072564572096} 12/26/2021 03:24:50 - INFO - codeparrot_training - Step 24066: {'lr': 0.0002660909109493307, 'samples': 12322304, 'steps': 24066, 'batch_loss/train': 0.7866976060904562} 12/26/2021 03:25:01 - INFO - codeparrot_training - Step 24067: {'lr': 0.00026607518046780303, 'samples': 12322816, 'steps': 24067, 'batch_loss/train': 0.8223931882530451} 12/26/2021 03:25:11 - INFO - codeparrot_training - Step 
24068: {'lr': 0.0002660594499223666, 'samples': 12323328, 'steps': 24068, 'batch_loss/train': 0.8009086949750781} 12/26/2021 03:25:23 - INFO - codeparrot_training - Step 24069: {'lr': 0.00026604371931308386, 'samples': 12323840, 'steps': 24069, 'batch_loss/train': 0.8183803725987673} 12/26/2021 03:25:34 - INFO - codeparrot_training - Step 24070: {'lr': 0.00026602798864001724, 'samples': 12324352, 'steps': 24070, 'batch_loss/train': 0.7271354543045163} 12/26/2021 03:25:45 - INFO - codeparrot_training - Step 24071: {'lr': 0.0002660122579032295, 'samples': 12324864, 'steps': 24071, 'batch_loss/train': 0.7890078909695148} 12/26/2021 03:25:57 - INFO - codeparrot_training - Step 24072: {'lr': 0.0002659965271027831, 'samples': 12325376, 'steps': 24072, 'batch_loss/train': 0.7234503915533423} 12/26/2021 03:26:07 - INFO - codeparrot_training - Step 24073: {'lr': 0.0002659807962387405, 'samples': 12325888, 'steps': 24073, 'batch_loss/train': 0.7152709532529116} 12/26/2021 03:26:18 - INFO - codeparrot_training - Step 24074: {'lr': 0.00026596506531116425, 'samples': 12326400, 'steps': 24074, 'batch_loss/train': 0.7909625703468919} 12/26/2021 03:26:31 - INFO - codeparrot_training - Step 24075: {'lr': 0.0002659493343201169, 'samples': 12326912, 'steps': 24075, 'batch_loss/train': 0.7166873030364513} 12/26/2021 03:26:41 - INFO - codeparrot_training - Step 24076: {'lr': 0.0002659336032656612, 'samples': 12327424, 'steps': 24076, 'batch_loss/train': 0.7546114583965391} 12/26/2021 03:26:52 - INFO - codeparrot_training - Step 24077: {'lr': 0.0002659178721478593, 'samples': 12327936, 'steps': 24077, 'batch_loss/train': 0.7519114576280117} 12/26/2021 03:27:03 - INFO - codeparrot_training - Step 24078: {'lr': 0.00026590214096677405, 'samples': 12328448, 'steps': 24078, 'batch_loss/train': 0.6599904932081699} 12/26/2021 03:27:15 - INFO - codeparrot_training - Step 24079: {'lr': 0.00026588640972246793, 'samples': 12328960, 'steps': 24079, 'batch_loss/train': 0.7633470920845866} 12/26/2021 
03:27:25 - INFO - codeparrot_training - Step 24080: {'lr': 0.00026587067841500335, 'samples': 12329472, 'steps': 24080, 'batch_loss/train': 0.9951289119198918} 12/26/2021 03:27:36 - INFO - codeparrot_training - Step 24081: {'lr': 0.00026585494704444297, 'samples': 12329984, 'steps': 24081, 'batch_loss/train': 0.78276133723557} 12/26/2021 03:27:48 - INFO - codeparrot_training - Step 24082: {'lr': 0.0002658392156108494, 'samples': 12330496, 'steps': 24082, 'batch_loss/train': 0.7510974255856127} 12/26/2021 03:27:59 - INFO - codeparrot_training - Step 24083: {'lr': 0.0002658234841142849, 'samples': 12331008, 'steps': 24083, 'batch_loss/train': 0.747178147546947} 12/26/2021 03:28:09 - INFO - codeparrot_training - Step 24084: {'lr': 0.0002658077525548123, 'samples': 12331520, 'steps': 24084, 'batch_loss/train': 0.7177193183451891} 12/26/2021 03:28:22 - INFO - codeparrot_training - Step 24085: {'lr': 0.0002657920209324942, 'samples': 12332032, 'steps': 24085, 'batch_loss/train': 0.7368159883189946} 12/26/2021 03:28:33 - INFO - codeparrot_training - Step 24086: {'lr': 0.00026577628924739276, 'samples': 12332544, 'steps': 24086, 'batch_loss/train': 0.7461348278447986} 12/26/2021 03:28:43 - INFO - codeparrot_training - Step 24087: {'lr': 0.00026576055749957084, 'samples': 12333056, 'steps': 24087, 'batch_loss/train': 0.7408963683992624} 12/26/2021 03:28:54 - INFO - codeparrot_training - Step 24088: {'lr': 0.00026574482568909093, 'samples': 12333568, 'steps': 24088, 'batch_loss/train': 0.6962389969266951} 12/26/2021 03:29:06 - INFO - codeparrot_training - Step 24089: {'lr': 0.0002657290938160155, 'samples': 12334080, 'steps': 24089, 'batch_loss/train': 0.9889562726020813} 12/26/2021 03:29:17 - INFO - codeparrot_training - Step 24090: {'lr': 0.00026571336188040715, 'samples': 12334592, 'steps': 24090, 'batch_loss/train': 0.786803288385272} 12/26/2021 03:29:27 - INFO - codeparrot_training - Step 24091: {'lr': 0.0002656976298823284, 'samples': 12335104, 'steps': 24091, 
'batch_loss/train': 0.7019246844574809} 12/26/2021 03:29:40 - INFO - codeparrot_training - Step 24092: {'lr': 0.00026568189782184175, 'samples': 12335616, 'steps': 24092, 'batch_loss/train': 0.7276613619178534} 12/26/2021 03:29:51 - INFO - codeparrot_training - Step 24093: {'lr': 0.00026566616569900987, 'samples': 12336128, 'steps': 24093, 'batch_loss/train': 0.6254687830805779} 12/26/2021 03:30:01 - INFO - codeparrot_training - Step 24094: {'lr': 0.00026565043351389517, 'samples': 12336640, 'steps': 24094, 'batch_loss/train': 0.8171444311738014} 12/26/2021 03:30:13 - INFO - codeparrot_training - Step 24095: {'lr': 0.0002656347012665603, 'samples': 12337152, 'steps': 24095, 'batch_loss/train': 0.7725715078413486} 12/26/2021 03:30:24 - INFO - codeparrot_training - Step 24096: {'lr': 0.0002656189689570677, 'samples': 12337664, 'steps': 24096, 'batch_loss/train': 0.8437703065574169} 12/26/2021 03:30:34 - INFO - codeparrot_training - Step 24097: {'lr': 0.00026560323658548005, 'samples': 12338176, 'steps': 24097, 'batch_loss/train': 0.7522661769762635} 12/26/2021 03:30:45 - INFO - codeparrot_training - Step 24098: {'lr': 0.0002655875041518598, 'samples': 12338688, 'steps': 24098, 'batch_loss/train': 0.8068787138909101} 12/26/2021 03:30:57 - INFO - codeparrot_training - Step 24099: {'lr': 0.0002655717716562695, 'samples': 12339200, 'steps': 24099, 'batch_loss/train': 0.5220098026038613} 12/26/2021 03:31:08 - INFO - codeparrot_training - Step 24100: {'lr': 0.00026555603909877176, 'samples': 12339712, 'steps': 24100, 'batch_loss/train': 0.634413119405508} 12/26/2021 03:31:19 - INFO - codeparrot_training - Step 24101: {'lr': 0.000265540306479429, 'samples': 12340224, 'steps': 24101, 'batch_loss/train': 0.7044965168461204} 12/26/2021 03:31:31 - INFO - codeparrot_training - Step 24102: {'lr': 0.0002655245737983039, 'samples': 12340736, 'steps': 24102, 'batch_loss/train': 0.676186946220696} 12/26/2021 03:31:42 - INFO - codeparrot_training - Step 24103: {'lr': 
0.00026550884105545897, 'samples': 12341248, 'steps': 24103, 'batch_loss/train': 0.7312718727625906} 12/26/2021 03:31:53 - INFO - codeparrot_training - Step 24104: {'lr': 0.00026549310825095676, 'samples': 12341760, 'steps': 24104, 'batch_loss/train': 0.6774343959987164} 12/26/2021 03:32:05 - INFO - codeparrot_training - Step 24105: {'lr': 0.00026547737538485977, 'samples': 12342272, 'steps': 24105, 'batch_loss/train': 0.6899732984602451} 12/26/2021 03:32:15 - INFO - codeparrot_training - Step 24106: {'lr': 0.0002654616424572306, 'samples': 12342784, 'steps': 24106, 'batch_loss/train': 0.8088113706326112} 12/26/2021 03:32:26 - INFO - codeparrot_training - Step 24107: {'lr': 0.0002654459094681317, 'samples': 12343296, 'steps': 24107, 'batch_loss/train': 0.8610356580466032} 12/26/2021 03:32:38 - INFO - codeparrot_training - Step 24108: {'lr': 0.00026543017641762577, 'samples': 12343808, 'steps': 24108, 'batch_loss/train': 0.6761779375374317} 12/26/2021 03:32:49 - INFO - codeparrot_training - Step 24109: {'lr': 0.00026541444330577523, 'samples': 12344320, 'steps': 24109, 'batch_loss/train': 0.6406078711152077} 12/26/2021 03:32:59 - INFO - codeparrot_training - Step 24110: {'lr': 0.0002653987101326427, 'samples': 12344832, 'steps': 24110, 'batch_loss/train': 0.7639828883111477} 12/26/2021 03:33:10 - INFO - codeparrot_training - Step 24111: {'lr': 0.0002653829768982908, 'samples': 12345344, 'steps': 24111, 'batch_loss/train': 0.6526622076635249} 12/26/2021 03:33:22 - INFO - codeparrot_training - Step 24112: {'lr': 0.0002653672436027819, 'samples': 12345856, 'steps': 24112, 'batch_loss/train': 0.5780565703753382} 12/26/2021 03:33:33 - INFO - codeparrot_training - Step 24113: {'lr': 0.0002653515102461787, 'samples': 12346368, 'steps': 24113, 'batch_loss/train': 0.7467015320435166} 12/26/2021 03:33:43 - INFO - codeparrot_training - Step 24114: {'lr': 0.00026533577682854366, 'samples': 12346880, 'steps': 24114, 'batch_loss/train': 0.7537469719536602} 12/26/2021 03:33:56 - 
INFO - codeparrot_training - Step 24115: {'lr': 0.00026532004334993935, 'samples': 12347392, 'steps': 24115, 'batch_loss/train': 0.7124916003085673} 12/26/2021 03:34:07 - INFO - codeparrot_training - Step 24116: {'lr': 0.00026530430981042835, 'samples': 12347904, 'steps': 24116, 'batch_loss/train': 0.7883493136614561} 12/26/2021 03:34:17 - INFO - codeparrot_training - Step 24117: {'lr': 0.0002652885762100732, 'samples': 12348416, 'steps': 24117, 'batch_loss/train': 0.8601377233862877} 12/26/2021 03:34:29 - INFO - codeparrot_training - Step 24118: {'lr': 0.0002652728425489365, 'samples': 12348928, 'steps': 24118, 'batch_loss/train': 0.9009463051334023} 12/26/2021 03:34:40 - INFO - codeparrot_training - Step 24119: {'lr': 0.0002652571088270807, 'samples': 12349440, 'steps': 24119, 'batch_loss/train': 0.7306875621434301} 12/26/2021 03:34:50 - INFO - codeparrot_training - Step 24120: {'lr': 0.0002652413750445683, 'samples': 12349952, 'steps': 24120, 'batch_loss/train': 0.6672104671597481} 12/26/2021 03:35:01 - INFO - codeparrot_training - Step 24121: {'lr': 0.00026522564120146205, 'samples': 12350464, 'steps': 24121, 'batch_loss/train': 0.8311997083947062} 12/26/2021 03:35:13 - INFO - codeparrot_training - Step 24122: {'lr': 0.00026520990729782436, 'samples': 12350976, 'steps': 24122, 'batch_loss/train': 0.8404696425423026} 12/26/2021 03:35:24 - INFO - codeparrot_training - Step 24123: {'lr': 0.0002651941733337178, 'samples': 12351488, 'steps': 24123, 'batch_loss/train': 0.8488993737846613} 12/26/2021 03:35:35 - INFO - codeparrot_training - Step 24124: {'lr': 0.000265178439309205, 'samples': 12352000, 'steps': 24124, 'batch_loss/train': 0.5452037315699272} 12/26/2021 03:35:47 - INFO - codeparrot_training - Step 24125: {'lr': 0.00026516270522434837, 'samples': 12352512, 'steps': 24125, 'batch_loss/train': 0.6142962947487831} 12/26/2021 03:35:58 - INFO - codeparrot_training - Step 24126: {'lr': 0.00026514697107921065, 'samples': 12353024, 'steps': 24126, 
'batch_loss/train': 0.7198857287294231} 12/26/2021 03:36:08 - INFO - codeparrot_training - Step 24127: {'lr': 0.0002651312368738543, 'samples': 12353536, 'steps': 24127, 'batch_loss/train': 0.7487843539565802} 12/26/2021 03:36:21 - INFO - codeparrot_training - Step 24128: {'lr': 0.0002651155026083418, 'samples': 12354048, 'steps': 24128, 'batch_loss/train': 0.8041360611096025} 12/26/2021 03:36:31 - INFO - codeparrot_training - Step 24129: {'lr': 0.00026509976828273575, 'samples': 12354560, 'steps': 24129, 'batch_loss/train': 0.6863161879591644} 12/26/2021 03:36:42 - INFO - codeparrot_training - Step 24130: {'lr': 0.00026508403389709884, 'samples': 12355072, 'steps': 24130, 'batch_loss/train': 0.8109495425596833} 12/26/2021 03:36:54 - INFO - codeparrot_training - Step 24131: {'lr': 0.0002650682994514934, 'samples': 12355584, 'steps': 24131, 'batch_loss/train': 0.7583562070503831} 12/26/2021 03:37:05 - INFO - codeparrot_training - Step 24132: {'lr': 0.00026505256494598214, 'samples': 12356096, 'steps': 24132, 'batch_loss/train': 0.6744588050059974} 12/26/2021 03:37:16 - INFO - codeparrot_training - Step 24133: {'lr': 0.00026503683038062756, 'samples': 12356608, 'steps': 24133, 'batch_loss/train': 0.7441625078208745} 12/26/2021 03:37:26 - INFO - codeparrot_training - Step 24134: {'lr': 0.0002650210957554922, 'samples': 12357120, 'steps': 24134, 'batch_loss/train': 0.7788871452212334} 12/26/2021 03:37:39 - INFO - codeparrot_training - Step 24135: {'lr': 0.00026500536107063857, 'samples': 12357632, 'steps': 24135, 'batch_loss/train': 1.6057816911488771} 12/26/2021 03:37:50 - INFO - codeparrot_training - Step 24136: {'lr': 0.0002649896263261294, 'samples': 12358144, 'steps': 24136, 'batch_loss/train': 0.9459422938525677} 12/26/2021 03:38:01 - INFO - codeparrot_training - Step 24137: {'lr': 0.0002649738915220271, 'samples': 12358656, 'steps': 24137, 'batch_loss/train': 1.2402194016613066} 12/26/2021 03:38:13 - INFO - codeparrot_training - Step 24138: {'lr': 
0.0002649581566583942, 'samples': 12359168, 'steps': 24138, 'batch_loss/train': 1.1973367216996849} 12/26/2021 03:38:23 - INFO - codeparrot_training - Step 24139: {'lr': 0.00026494242173529344, 'samples': 12359680, 'steps': 24139, 'batch_loss/train': 0.7787886364385486} 12/26/2021 03:38:34 - INFO - codeparrot_training - Step 24140: {'lr': 0.00026492668675278714, 'samples': 12360192, 'steps': 24140, 'batch_loss/train': 0.7153587606735528} 12/26/2021 03:38:45 - INFO - codeparrot_training - Step 24141: {'lr': 0.00026491095171093807, 'samples': 12360704, 'steps': 24141, 'batch_loss/train': 0.6851588864810765} 12/26/2021 03:38:57 - INFO - codeparrot_training - Step 24142: {'lr': 0.0002648952166098086, 'samples': 12361216, 'steps': 24142, 'batch_loss/train': 0.7883409112691879} 12/26/2021 03:39:08 - INFO - codeparrot_training - Step 24143: {'lr': 0.00026487948144946144, 'samples': 12361728, 'steps': 24143, 'batch_loss/train': 0.735982843209058} 12/26/2021 03:39:18 - INFO - codeparrot_training - Step 24144: {'lr': 0.000264863746229959, 'samples': 12362240, 'steps': 24144, 'batch_loss/train': 0.8104106672108173} 12/26/2021 03:39:31 - INFO - codeparrot_training - Step 24145: {'lr': 0.000264848010951364, 'samples': 12362752, 'steps': 24145, 'batch_loss/train': 0.7752484753727913} 12/26/2021 03:39:41 - INFO - codeparrot_training - Step 24146: {'lr': 0.00026483227561373886, 'samples': 12363264, 'steps': 24146, 'batch_loss/train': 0.5723049994558096} 12/26/2021 03:39:52 - INFO - codeparrot_training - Step 24147: {'lr': 0.0002648165402171462, 'samples': 12363776, 'steps': 24147, 'batch_loss/train': 0.7617819551378489} 12/26/2021 03:40:04 - INFO - codeparrot_training - Step 24148: {'lr': 0.0002648008047616486, 'samples': 12364288, 'steps': 24148, 'batch_loss/train': 1.2597915325313807} 12/26/2021 03:40:15 - INFO - codeparrot_training - Step 24149: {'lr': 0.00026478506924730854, 'samples': 12364800, 'steps': 24149, 'batch_loss/train': 0.6224832464940846} 12/26/2021 03:40:25 - INFO 
- codeparrot_training - Step 24150: {'lr': 0.0002647693336741887, 'samples': 12365312, 'steps': 24150, 'batch_loss/train': 0.7157901283353567} 12/26/2021 03:40:36 - INFO - codeparrot_training - Step 24151: {'lr': 0.00026475359804235143, 'samples': 12365824, 'steps': 24151, 'batch_loss/train': 0.8667580047622323} 12/26/2021 03:40:48 - INFO - codeparrot_training - Step 24152: {'lr': 0.0002647378623518596, 'samples': 12366336, 'steps': 24152, 'batch_loss/train': 0.8369651167886332} 12/26/2021 03:40:59 - INFO - codeparrot_training - Step 24153: {'lr': 0.0002647221266027755, 'samples': 12366848, 'steps': 24153, 'batch_loss/train': 0.7834032252430916} 12/26/2021 03:41:09 - INFO - codeparrot_training - Step 24154: {'lr': 0.00026470639079516177, 'samples': 12367360, 'steps': 24154, 'batch_loss/train': 0.7572539141401649} 12/26/2021 03:41:22 - INFO - codeparrot_training - Step 24155: {'lr': 0.000264690654929081, 'samples': 12367872, 'steps': 24155, 'batch_loss/train': 0.7969219489023089} 12/26/2021 03:41:33 - INFO - codeparrot_training - Step 24156: {'lr': 0.00026467491900459585, 'samples': 12368384, 'steps': 24156, 'batch_loss/train': 0.6040849655983038} 12/26/2021 03:41:43 - INFO - codeparrot_training - Step 24157: {'lr': 0.00026465918302176857, 'samples': 12368896, 'steps': 24157, 'batch_loss/train': 0.7987648174166679} 12/26/2021 03:41:56 - INFO - codeparrot_training - Step 24158: {'lr': 0.00026464344698066196, 'samples': 12369408, 'steps': 24158, 'batch_loss/train': 0.714255410246551} 12/26/2021 03:42:06 - INFO - codeparrot_training - Step 24159: {'lr': 0.00026462771088133863, 'samples': 12369920, 'steps': 24159, 'batch_loss/train': 0.7853555716574192} 12/26/2021 03:42:17 - INFO - codeparrot_training - Step 24160: {'lr': 0.000264611974723861, 'samples': 12370432, 'steps': 24160, 'batch_loss/train': 0.8026291662827134} 12/26/2021 03:42:30 - INFO - codeparrot_training - Step 24161: {'lr': 0.0002645962385082916, 'samples': 12370944, 'steps': 24161, 'batch_loss/train': 
0.679711083881557} 12/26/2021 03:42:41 - INFO - codeparrot_training - Step 24162: {'lr': 0.00026458050223469317, 'samples': 12371456, 'steps': 24162, 'batch_loss/train': 0.7587752755498514} 12/26/2021 03:42:52 - INFO - codeparrot_training - Step 24163: {'lr': 0.00026456476590312815, 'samples': 12371968, 'steps': 24163, 'batch_loss/train': 0.8010887848213315} 12/26/2021 03:43:02 - INFO - codeparrot_training - Step 24164: {'lr': 0.00026454902951365903, 'samples': 12372480, 'steps': 24164, 'batch_loss/train': 0.8594133835285902} 12/26/2021 03:43:14 - INFO - codeparrot_training - Step 24165: {'lr': 0.0002645332930663486, 'samples': 12372992, 'steps': 24165, 'batch_loss/train': 0.7824878008104861} 12/26/2021 03:43:25 - INFO - codeparrot_training - Step 24166: {'lr': 0.0002645175565612592, 'samples': 12373504, 'steps': 24166, 'batch_loss/train': 0.8203016258776188} 12/26/2021 03:43:36 - INFO - codeparrot_training - Step 24167: {'lr': 0.0002645018199984535, 'samples': 12374016, 'steps': 24167, 'batch_loss/train': 0.7888778080232441} 12/26/2021 03:43:48 - INFO - codeparrot_training - Step 24168: {'lr': 0.000264486083377994, 'samples': 12374528, 'steps': 24168, 'batch_loss/train': 0.7023189896717668} 12/26/2021 03:43:58 - INFO - codeparrot_training - Step 24169: {'lr': 0.00026447034669994343, 'samples': 12375040, 'steps': 24169, 'batch_loss/train': 0.8222827771678567} 12/26/2021 03:44:09 - INFO - codeparrot_training - Step 24170: {'lr': 0.000264454609964364, 'samples': 12375552, 'steps': 24170, 'batch_loss/train': 0.8437664499506354} 12/26/2021 03:44:23 - INFO - codeparrot_training - Step 24171: {'lr': 0.00026443887317131865, 'samples': 12376064, 'steps': 24171, 'batch_loss/train': 0.8262660605832934} 12/26/2021 03:44:34 - INFO - codeparrot_training - Step 24172: {'lr': 0.00026442313632086986, 'samples': 12376576, 'steps': 24172, 'batch_loss/train': 0.8351597795262933} 12/26/2021 03:44:44 - INFO - codeparrot_training - Step 24173: {'lr': 0.00026440739941308, 'samples': 
12377088, 'steps': 24173, 'batch_loss/train': 0.8385199373587966} 12/26/2021 03:44:56 - INFO - codeparrot_training - Step 24174: {'lr': 0.00026439166244801175, 'samples': 12377600, 'steps': 24174, 'batch_loss/train': 0.7275387207046151} 12/26/2021 03:45:07 - INFO - codeparrot_training - Step 24175: {'lr': 0.00026437592542572776, 'samples': 12378112, 'steps': 24175, 'batch_loss/train': 0.7051706295460463} 12/26/2021 03:45:17 - INFO - codeparrot_training - Step 24176: {'lr': 0.0002643601883462905, 'samples': 12378624, 'steps': 24176, 'batch_loss/train': 0.7300188606604934} 12/26/2021 03:45:28 - INFO - codeparrot_training - Step 24177: {'lr': 0.00026434445120976255, 'samples': 12379136, 'steps': 24177, 'batch_loss/train': 0.7084232671186328} 12/26/2021 03:45:41 - INFO - codeparrot_training - Step 24178: {'lr': 0.00026432871401620644, 'samples': 12379648, 'steps': 24178, 'batch_loss/train': 0.820458748145029} 12/26/2021 03:45:52 - INFO - codeparrot_training - Step 24179: {'lr': 0.00026431297676568483, 'samples': 12380160, 'steps': 24179, 'batch_loss/train': 0.6119709721533582} 12/26/2021 03:46:02 - INFO - codeparrot_training - Step 24180: {'lr': 0.00026429723945826017, 'samples': 12380672, 'steps': 24180, 'batch_loss/train': 0.8753724629059434} 12/26/2021 03:46:14 - INFO - codeparrot_training - Step 24181: {'lr': 0.0002642815020939952, 'samples': 12381184, 'steps': 24181, 'batch_loss/train': 0.7886434700340033} 12/26/2021 03:46:25 - INFO - codeparrot_training - Step 24182: {'lr': 0.0002642657646729522, 'samples': 12381696, 'steps': 24182, 'batch_loss/train': 0.9008859839523211} 12/26/2021 03:46:35 - INFO - codeparrot_training - Step 24183: {'lr': 0.00026425002719519396, 'samples': 12382208, 'steps': 24183, 'batch_loss/train': 0.7863618200644851} 12/26/2021 03:46:47 - INFO - codeparrot_training - Step 24184: {'lr': 0.00026423428966078306, 'samples': 12382720, 'steps': 24184, 'batch_loss/train': 0.807410798035562} 12/26/2021 03:46:58 - INFO - codeparrot_training - Step 
24185: {'lr': 0.000264218552069782, 'samples': 12383232, 'steps': 24185, 'batch_loss/train': 0.6815986200235784} 12/26/2021 03:47:09 - INFO - codeparrot_training - Step 24186: {'lr': 0.0002642028144222533, 'samples': 12383744, 'steps': 24186, 'batch_loss/train': 0.75282729184255} 12/26/2021 03:47:19 - INFO - codeparrot_training - Step 24187: {'lr': 0.0002641870767182596, 'samples': 12384256, 'steps': 24187, 'batch_loss/train': 0.7983377486816607} 12/26/2021 03:47:32 - INFO - codeparrot_training - Step 24188: {'lr': 0.0002641713389578634, 'samples': 12384768, 'steps': 24188, 'batch_loss/train': 0.88313062582165} 12/26/2021 03:47:43 - INFO - codeparrot_training - Step 24189: {'lr': 0.00026415560114112735, 'samples': 12385280, 'steps': 24189, 'batch_loss/train': 0.7526192115619779} 12/26/2021 03:47:53 - INFO - codeparrot_training - Step 24190: {'lr': 0.0002641398632681139, 'samples': 12385792, 'steps': 24190, 'batch_loss/train': 0.7736091604456306} 12/26/2021 03:48:05 - INFO - codeparrot_training - Step 24191: {'lr': 0.0002641241253388858, 'samples': 12386304, 'steps': 24191, 'batch_loss/train': 0.7309709670953453} 12/26/2021 03:48:16 - INFO - codeparrot_training - Step 24192: {'lr': 0.00026410838735350543, 'samples': 12386816, 'steps': 24192, 'batch_loss/train': 0.7198444465175271} 12/26/2021 03:48:27 - INFO - codeparrot_training - Step 24193: {'lr': 0.00026409264931203546, 'samples': 12387328, 'steps': 24193, 'batch_loss/train': 0.8105953619815409} 12/26/2021 03:48:39 - INFO - codeparrot_training - Step 24194: {'lr': 0.00026407691121453847, 'samples': 12387840, 'steps': 24194, 'batch_loss/train': 0.7041775486432016} 12/26/2021 03:48:50 - INFO - codeparrot_training - Step 24195: {'lr': 0.000264061173061077, 'samples': 12388352, 'steps': 24195, 'batch_loss/train': 0.7297064088052139} 12/26/2021 03:49:00 - INFO - codeparrot_training - Step 24196: {'lr': 0.0002640454348517135, 'samples': 12388864, 'steps': 24196, 'batch_loss/train': 0.7086197499884292} 12/26/2021 
03:49:11 - INFO - codeparrot_training - Step 24197: {'lr': 0.0002640296965865108, 'samples': 12389376, 'steps': 24197, 'batch_loss/train': 0.6893927868222818} 12/26/2021 03:49:23 - INFO - codeparrot_training - Step 24198: {'lr': 0.0002640139582655313, 'samples': 12389888, 'steps': 24198, 'batch_loss/train': 0.46320539887528867} 12/26/2021 03:49:34 - INFO - codeparrot_training - Step 24199: {'lr': 0.00026399821988883754, 'samples': 12390400, 'steps': 24199, 'batch_loss/train': 0.748939136043191} 12/26/2021 03:49:44 - INFO - codeparrot_training - Step 24200: {'lr': 0.00026398248145649214, 'samples': 12390912, 'steps': 24200, 'batch_loss/train': 0.8247944705653936} 12/26/2021 03:49:56 - INFO - codeparrot_training - Step 24201: {'lr': 0.00026396674296855763, 'samples': 12391424, 'steps': 24201, 'batch_loss/train': 0.8525107055902481} 12/26/2021 03:50:07 - INFO - codeparrot_training - Step 24202: {'lr': 0.00026395100442509674, 'samples': 12391936, 'steps': 24202, 'batch_loss/train': 0.6919948891736567} 12/26/2021 03:50:18 - INFO - codeparrot_training - Step 24203: {'lr': 0.0002639352658261718, 'samples': 12392448, 'steps': 24203, 'batch_loss/train': 0.8297556525794789} 12/26/2021 03:50:30 - INFO - codeparrot_training - Step 24204: {'lr': 0.0002639195271718457, 'samples': 12392960, 'steps': 24204, 'batch_loss/train': 0.6632494947407395} 12/26/2021 03:50:40 - INFO - codeparrot_training - Step 24205: {'lr': 0.00026390378846218065, 'samples': 12393472, 'steps': 24205, 'batch_loss/train': 0.6725142048671842} 12/26/2021 03:50:51 - INFO - codeparrot_training - Step 24206: {'lr': 0.0002638880496972394, 'samples': 12393984, 'steps': 24206, 'batch_loss/train': 0.7961290255188942} 12/26/2021 03:51:04 - INFO - codeparrot_training - Step 24207: {'lr': 0.00026387231087708456, 'samples': 12394496, 'steps': 24207, 'batch_loss/train': 0.8933230866678059} 12/26/2021 03:51:14 - INFO - codeparrot_training - Step 24208: {'lr': 0.00026385657200177865, 'samples': 12395008, 'steps': 24208, 
'batch_loss/train': 0.743944589048624} 12/26/2021 03:51:25 - INFO - codeparrot_training - Step 24209: {'lr': 0.00026384083307138417, 'samples': 12395520, 'steps': 24209, 'batch_loss/train': 0.9051188910380006} 12/26/2021 03:51:35 - INFO - codeparrot_training - Step 24210: {'lr': 0.0002638250940859638, 'samples': 12396032, 'steps': 24210, 'batch_loss/train': 0.7674044566228986} 12/26/2021 03:51:48 - INFO - codeparrot_training - Step 24211: {'lr': 0.00026380935504558017, 'samples': 12396544, 'steps': 24211, 'batch_loss/train': 0.7860603677108884} 12/26/2021 03:51:58 - INFO - codeparrot_training - Step 24212: {'lr': 0.00026379361595029566, 'samples': 12397056, 'steps': 24212, 'batch_loss/train': 0.743692426243797} 12/26/2021 03:52:09 - INFO - codeparrot_training - Step 24213: {'lr': 0.00026377787680017297, 'samples': 12397568, 'steps': 24213, 'batch_loss/train': 0.704804833047092} 12/26/2021 03:52:21 - INFO - codeparrot_training - Step 24214: {'lr': 0.0002637621375952746, 'samples': 12398080, 'steps': 24214, 'batch_loss/train': 0.7339328392408788} 12/26/2021 03:52:31 - INFO - codeparrot_training - Step 24215: {'lr': 0.0002637463983356632, 'samples': 12398592, 'steps': 24215, 'batch_loss/train': 0.7959686564281583} 12/26/2021 03:52:42 - INFO - codeparrot_training - Step 24216: {'lr': 0.0002637306590214013, 'samples': 12399104, 'steps': 24216, 'batch_loss/train': 0.7580053005367517} 12/26/2021 03:52:53 - INFO - codeparrot_training - Step 24217: {'lr': 0.00026371491965255156, 'samples': 12399616, 'steps': 24217, 'batch_loss/train': 0.8735746033489704} 12/26/2021 03:53:05 - INFO - codeparrot_training - Step 24218: {'lr': 0.00026369918022917636, 'samples': 12400128, 'steps': 24218, 'batch_loss/train': 0.7508383328095078} 12/26/2021 03:53:16 - INFO - codeparrot_training - Step 24219: {'lr': 0.00026368344075133844, 'samples': 12400640, 'steps': 24219, 'batch_loss/train': 0.8250716016627848} 12/26/2021 03:53:27 - INFO - codeparrot_training - Step 24220: {'lr': 
0.00026366770121910034, 'samples': 12401152, 'steps': 24220, 'batch_loss/train': 0.8264617910608649} 12/26/2021 03:53:39 - INFO - codeparrot_training - Step 24221: {'lr': 0.00026365196163252464, 'samples': 12401664, 'steps': 24221, 'batch_loss/train': 0.7407372258603573} 12/26/2021 03:53:49 - INFO - codeparrot_training - Step 24222: {'lr': 0.00026363622199167383, 'samples': 12402176, 'steps': 24222, 'batch_loss/train': 0.7349729957641102} 12/26/2021 03:54:00 - INFO - codeparrot_training - Step 24223: {'lr': 0.0002636204822966106, 'samples': 12402688, 'steps': 24223, 'batch_loss/train': 1.443144123069942} 12/26/2021 03:54:12 - INFO - codeparrot_training - Step 24224: {'lr': 0.00026360474254739745, 'samples': 12403200, 'steps': 24224, 'batch_loss/train': 0.7024149470962584} 12/26/2021 03:54:23 - INFO - codeparrot_training - Step 24225: {'lr': 0.000263589002744097, 'samples': 12403712, 'steps': 24225, 'batch_loss/train': 0.7311221016570926} 12/26/2021 03:54:33 - INFO - codeparrot_training - Step 24226: {'lr': 0.0002635732628867718, 'samples': 12404224, 'steps': 24226, 'batch_loss/train': 0.7382782925851643} 12/26/2021 03:54:44 - INFO - codeparrot_training - Step 24227: {'lr': 0.00026355752297548446, 'samples': 12404736, 'steps': 24227, 'batch_loss/train': 0.8278144244104624} 12/26/2021 03:54:57 - INFO - codeparrot_training - Step 24228: {'lr': 0.00026354178301029745, 'samples': 12405248, 'steps': 24228, 'batch_loss/train': 0.9004626418463886} 12/26/2021 03:55:08 - INFO - codeparrot_training - Step 24229: {'lr': 0.00026352604299127345, 'samples': 12405760, 'steps': 24229, 'batch_loss/train': 0.8242793688550591} 12/26/2021 03:55:18 - INFO - codeparrot_training - Step 24230: {'lr': 0.0002635103029184751, 'samples': 12406272, 'steps': 24230, 'batch_loss/train': 0.6299900506273843} 12/26/2021 03:55:30 - INFO - codeparrot_training - Step 24231: {'lr': 0.0002634945627919648, 'samples': 12406784, 'steps': 24231, 'batch_loss/train': 0.9010738991200924} 12/26/2021 03:55:41 - 
INFO - codeparrot_training - Step 24232: {'lr': 0.0002634788226118052, 'samples': 12407296, 'steps': 24232, 'batch_loss/train': 0.8365028547123075} 12/26/2021 03:55:52 - INFO - codeparrot_training - Step 24233: {'lr': 0.000263463082378059, 'samples': 12407808, 'steps': 24233, 'batch_loss/train': 0.718419580720365} 12/26/2021 03:56:04 - INFO - codeparrot_training - Step 24234: {'lr': 0.00026344734209078857, 'samples': 12408320, 'steps': 24234, 'batch_loss/train': 0.7997917737811804} 12/26/2021 03:56:15 - INFO - codeparrot_training - Step 24235: {'lr': 0.0002634316017500566, 'samples': 12408832, 'steps': 24235, 'batch_loss/train': 1.0490104826167226} 12/26/2021 03:56:26 - INFO - codeparrot_training - Step 24236: {'lr': 0.0002634158613559257, 'samples': 12409344, 'steps': 24236, 'batch_loss/train': 0.7256466683465987} 12/26/2021 03:56:38 - INFO - codeparrot_training - Step 24237: {'lr': 0.0002634001209084583, 'samples': 12409856, 'steps': 24237, 'batch_loss/train': 0.8419632203876972} 12/26/2021 03:56:48 - INFO - codeparrot_training - Step 24238: {'lr': 0.0002633843804077171, 'samples': 12410368, 'steps': 24238, 'batch_loss/train': 0.7707486958242953} 12/26/2021 03:56:59 - INFO - codeparrot_training - Step 24239: {'lr': 0.0002633686398537648, 'samples': 12410880, 'steps': 24239, 'batch_loss/train': 0.7036211093654856} 12/26/2021 03:57:10 - INFO - codeparrot_training - Step 24240: {'lr': 0.00026335289924666374, 'samples': 12411392, 'steps': 24240, 'batch_loss/train': 0.7634105999022722} 12/26/2021 03:57:22 - INFO - codeparrot_training - Step 24241: {'lr': 0.00026333715858647657, 'samples': 12411904, 'steps': 24241, 'batch_loss/train': 1.369511493947357} 12/26/2021 03:57:32 - INFO - codeparrot_training - Step 24242: {'lr': 0.00026332141787326584, 'samples': 12412416, 'steps': 24242, 'batch_loss/train': 0.792382228653878} 12/26/2021 03:57:43 - INFO - codeparrot_training - Step 24243: {'lr': 0.0002633056771070944, 'samples': 12412928, 'steps': 24243, 'batch_loss/train': 
0.7702798722311854} 12/26/2021 03:57:55 - INFO - codeparrot_training - Step 24244: {'lr': 0.00026328993628802437, 'samples': 12413440, 'steps': 24244, 'batch_loss/train': 0.7588488636538386} 12/26/2021 03:58:06 - INFO - codeparrot_training - Step 24245: {'lr': 0.00026327419541611864, 'samples': 12413952, 'steps': 24245, 'batch_loss/train': 0.830697288736701} 12/26/2021 03:58:17 - INFO - codeparrot_training - Step 24246: {'lr': 0.0002632584544914398, 'samples': 12414464, 'steps': 24246, 'batch_loss/train': 0.6778009689296596} 12/26/2021 03:58:29 - INFO - codeparrot_training - Step 24247: {'lr': 0.00026324271351405026, 'samples': 12414976, 'steps': 24247, 'batch_loss/train': 0.8169110603630543} 12/26/2021 03:58:40 - INFO - codeparrot_training - Step 24248: {'lr': 0.00026322697248401273, 'samples': 12415488, 'steps': 24248, 'batch_loss/train': 0.7311533987522125} 12/26/2021 03:58:51 - INFO - codeparrot_training - Step 24249: {'lr': 0.00026321123140138976, 'samples': 12416000, 'steps': 24249, 'batch_loss/train': 0.890121828764677} 12/26/2021 03:59:01 - INFO - codeparrot_training - Step 24250: {'lr': 0.000263195490266244, 'samples': 12416512, 'steps': 24250, 'batch_loss/train': 0.7267941134050488} 12/26/2021 03:59:14 - INFO - codeparrot_training - Step 24251: {'lr': 0.00026317974907863776, 'samples': 12417024, 'steps': 24251, 'batch_loss/train': 0.6437487308867276} 12/26/2021 03:59:24 - INFO - codeparrot_training - Step 24252: {'lr': 0.00026316400783863396, 'samples': 12417536, 'steps': 24252, 'batch_loss/train': 0.7641361835412681} 12/26/2021 03:59:35 - INFO - codeparrot_training - Step 24253: {'lr': 0.00026314826654629496, 'samples': 12418048, 'steps': 24253, 'batch_loss/train': 0.877677833661437} 12/26/2021 03:59:48 - INFO - codeparrot_training - Step 24254: {'lr': 0.00026313252520168354, 'samples': 12418560, 'steps': 24254, 'batch_loss/train': 0.8187601659446955} 12/26/2021 03:59:58 - INFO - codeparrot_training - Step 24255: {'lr': 0.00026311678380486203, 'samples': 
12419072, 'steps': 24255, 'batch_loss/train': 0.7856522584334016} 12/26/2021 04:00:09 - INFO - codeparrot_training - Step 24256: {'lr': 0.00026310104235589317, 'samples': 12419584, 'steps': 24256, 'batch_loss/train': 0.7414655056782067} 12/26/2021 04:00:21 - INFO - codeparrot_training - Step 24257: {'lr': 0.00026308530085483956, 'samples': 12420096, 'steps': 24257, 'batch_loss/train': 0.7282205121591687} 12/26/2021 04:00:32 - INFO - codeparrot_training - Step 24258: {'lr': 0.00026306955930176365, 'samples': 12420608, 'steps': 24258, 'batch_loss/train': 0.8018740946426988} 12/26/2021 04:00:42 - INFO - codeparrot_training - Step 24259: {'lr': 0.0002630538176967282, 'samples': 12421120, 'steps': 24259, 'batch_loss/train': 0.7954439027234912} 12/26/2021 04:00:53 - INFO - codeparrot_training - Step 24260: {'lr': 0.0002630380760397956, 'samples': 12421632, 'steps': 24260, 'batch_loss/train': 0.9528080932796001} 12/26/2021 04:01:05 - INFO - codeparrot_training - Step 24261: {'lr': 0.00026302233433102865, 'samples': 12422144, 'steps': 24261, 'batch_loss/train': 0.7756420323858038} 12/26/2021 04:01:15 - INFO - codeparrot_training - Step 24262: {'lr': 0.0002630065925704897, 'samples': 12422656, 'steps': 24262, 'batch_loss/train': 0.7174595147371292} 12/26/2021 04:01:26 - INFO - codeparrot_training - Step 24263: {'lr': 0.0002629908507582416, 'samples': 12423168, 'steps': 24263, 'batch_loss/train': 0.824497246183455} 12/26/2021 04:01:39 - INFO - codeparrot_training - Step 24264: {'lr': 0.0002629751088943466, 'samples': 12423680, 'steps': 24264, 'batch_loss/train': 0.7637620507739484} 12/26/2021 04:01:49 - INFO - codeparrot_training - Step 24265: {'lr': 0.0002629593669788676, 'samples': 12424192, 'steps': 24265, 'batch_loss/train': 0.8133238302543759} 12/26/2021 04:02:00 - INFO - codeparrot_training - Step 24266: {'lr': 0.000262943625011867, 'samples': 12424704, 'steps': 24266, 'batch_loss/train': 0.8369548788759857} 12/26/2021 04:02:12 - INFO - codeparrot_training - Step 
24267: {'lr': 0.00026292788299340743, 'samples': 12425216, 'steps': 24267, 'batch_loss/train': 0.8054542625322938} 12/26/2021 04:02:23 - INFO - codeparrot_training - Step 24268: {'lr': 0.0002629121409235515, 'samples': 12425728, 'steps': 24268, 'batch_loss/train': 0.8798506846651435} 12/26/2021 04:02:33 - INFO - codeparrot_training - Step 24269: {'lr': 0.0002628963988023617, 'samples': 12426240, 'steps': 24269, 'batch_loss/train': 0.7268474167212844} 12/26/2021 04:02:44 - INFO - codeparrot_training - Step 24270: {'lr': 0.00026288065662990085, 'samples': 12426752, 'steps': 24270, 'batch_loss/train': 0.7152440506033599} 12/26/2021 04:02:56 - INFO - codeparrot_training - Step 24271: {'lr': 0.00026286491440623115, 'samples': 12427264, 'steps': 24271, 'batch_loss/train': 0.7233748300932348} 12/26/2021 04:03:07 - INFO - codeparrot_training - Step 24272: {'lr': 0.0002628491721314156, 'samples': 12427776, 'steps': 24272, 'batch_loss/train': 0.7295412148814648} 12/26/2021 04:03:17 - INFO - codeparrot_training - Step 24273: {'lr': 0.0002628334298055165, 'samples': 12428288, 'steps': 24273, 'batch_loss/train': 0.8175826682709157} 12/26/2021 04:03:30 - INFO - codeparrot_training - Step 24274: {'lr': 0.00026281768742859656, 'samples': 12428800, 'steps': 24274, 'batch_loss/train': 0.8048542756587267} 12/26/2021 04:03:41 - INFO - codeparrot_training - Step 24275: {'lr': 0.0002628019450007183, 'samples': 12429312, 'steps': 24275, 'batch_loss/train': 0.7456812368473038} 12/26/2021 04:03:51 - INFO - codeparrot_training - Step 24276: {'lr': 0.0002627862025219444, 'samples': 12429824, 'steps': 24276, 'batch_loss/train': 0.708570291288197} 12/26/2021 04:04:03 - INFO - codeparrot_training - Step 24277: {'lr': 0.00026277045999233736, 'samples': 12430336, 'steps': 24277, 'batch_loss/train': 0.7501995386555791} 12/26/2021 04:04:14 - INFO - codeparrot_training - Step 24278: {'lr': 0.0002627547174119598, 'samples': 12430848, 'steps': 24278, 'batch_loss/train': 0.7907567471265793} 12/26/2021 
04:04:25 - INFO - codeparrot_training - Step 24279: {'lr': 0.0002627389747808743, 'samples': 12431360, 'steps': 24279, 'batch_loss/train': 0.7633549878373742} 12/26/2021 04:04:35 - INFO - codeparrot_training - Step 24280: {'lr': 0.00026272323209914337, 'samples': 12431872, 'steps': 24280, 'batch_loss/train': 1.5185099337249994} 12/26/2021 04:04:47 - INFO - codeparrot_training - Step 24281: {'lr': 0.00026270748936682976, 'samples': 12432384, 'steps': 24281, 'batch_loss/train': 0.7703813090920448} 12/26/2021 04:04:58 - INFO - codeparrot_training - Step 24282: {'lr': 0.0002626917465839959, 'samples': 12432896, 'steps': 24282, 'batch_loss/train': 0.74152269680053} 12/26/2021 04:05:09 - INFO - codeparrot_training - Step 24283: {'lr': 0.0002626760037507045, 'samples': 12433408, 'steps': 24283, 'batch_loss/train': 0.6854548677802086} 12/26/2021 04:05:21 - INFO - codeparrot_training - Step 24284: {'lr': 0.0002626602608670181, 'samples': 12433920, 'steps': 24284, 'batch_loss/train': 0.7946512182243168} 12/26/2021 04:05:32 - INFO - codeparrot_training - Step 24285: {'lr': 0.0002626445179329993, 'samples': 12434432, 'steps': 24285, 'batch_loss/train': 0.5989061021246016} 12/26/2021 04:05:42 - INFO - codeparrot_training - Step 24286: {'lr': 0.0002626287749487106, 'samples': 12434944, 'steps': 24286, 'batch_loss/train': 0.8163961241953075} 12/26/2021 04:05:55 - INFO - codeparrot_training - Step 24287: {'lr': 0.0002626130319142147, 'samples': 12435456, 'steps': 24287, 'batch_loss/train': 0.7342808968387544} 12/26/2021 04:06:06 - INFO - codeparrot_training - Step 24288: {'lr': 0.0002625972888295742, 'samples': 12435968, 'steps': 24288, 'batch_loss/train': 0.8487263098359108} 12/26/2021 04:06:16 - INFO - codeparrot_training - Step 24289: {'lr': 0.0002625815456948516, 'samples': 12436480, 'steps': 24289, 'batch_loss/train': 0.7576503762975335} 12/26/2021 04:06:28 - INFO - codeparrot_training - Step 24290: {'lr': 0.00026256580251010945, 'samples': 12436992, 'steps': 24290, 
'batch_loss/train': 0.7411453367676586} 12/26/2021 04:06:39 - INFO - codeparrot_training - Step 24291: {'lr': 0.00026255005927541055, 'samples': 12437504, 'steps': 24291, 'batch_loss/train': 0.7875437824986875} 12/26/2021 04:06:50 - INFO - codeparrot_training - Step 24292: {'lr': 0.00026253431599081726, 'samples': 12438016, 'steps': 24292, 'batch_loss/train': 0.6934568136930466} 12/26/2021 04:07:00 - INFO - codeparrot_training - Step 24293: {'lr': 0.00026251857265639225, 'samples': 12438528, 'steps': 24293, 'batch_loss/train': 0.8716298369690776} 12/26/2021 04:07:13 - INFO - codeparrot_training - Step 24294: {'lr': 0.0002625028292721981, 'samples': 12439040, 'steps': 24294, 'batch_loss/train': 0.7648211540654302} 12/26/2021 04:07:24 - INFO - codeparrot_training - Step 24295: {'lr': 0.0002624870858382975, 'samples': 12439552, 'steps': 24295, 'batch_loss/train': 1.0164421070367098} 12/26/2021 04:07:35 - INFO - codeparrot_training - Step 24296: {'lr': 0.00026247134235475296, 'samples': 12440064, 'steps': 24296, 'batch_loss/train': 0.8266219291836023} 12/26/2021 04:07:47 - INFO - codeparrot_training - Step 24297: {'lr': 0.00026245559882162703, 'samples': 12440576, 'steps': 24297, 'batch_loss/train': 0.7278693788684905} 12/26/2021 04:07:57 - INFO - codeparrot_training - Step 24298: {'lr': 0.0002624398552389824, 'samples': 12441088, 'steps': 24298, 'batch_loss/train': 0.7484458647668362} 12/26/2021 04:08:08 - INFO - codeparrot_training - Step 24299: {'lr': 0.00026242411160688154, 'samples': 12441600, 'steps': 24299, 'batch_loss/train': 0.744323889259249} 12/26/2021 04:08:20 - INFO - codeparrot_training - Step 24300: {'lr': 0.00026240836792538715, 'samples': 12442112, 'steps': 24300, 'batch_loss/train': 0.8173370484728366} 12/26/2021 04:08:31 - INFO - codeparrot_training - Step 24301: {'lr': 0.0002623926241945617, 'samples': 12442624, 'steps': 24301, 'batch_loss/train': 0.8023723969236016} 12/26/2021 04:08:42 - INFO - codeparrot_training - Step 24302: {'lr': 
0.00026237688041446793, 'samples': 12443136, 'steps': 24302, 'batch_loss/train': 0.7755596209317446} 12/26/2021 04:08:52 - INFO - codeparrot_training - Step 24303: {'lr': 0.0002623611365851683, 'samples': 12443648, 'steps': 24303, 'batch_loss/train': 0.8161757467314601} 12/26/2021 04:09:05 - INFO - codeparrot_training - Step 24304: {'lr': 0.0002623453927067255, 'samples': 12444160, 'steps': 24304, 'batch_loss/train': 0.706203700043261} 12/26/2021 04:09:16 - INFO - codeparrot_training - Step 24305: {'lr': 0.00026232964877920213, 'samples': 12444672, 'steps': 24305, 'batch_loss/train': 0.8333033081144094} 12/26/2021 04:09:26 - INFO - codeparrot_training - Step 24306: {'lr': 0.0002623139048026606, 'samples': 12445184, 'steps': 24306, 'batch_loss/train': 0.6841475191758946} 12/26/2021 04:09:38 - INFO - codeparrot_training - Step 24307: {'lr': 0.0002622981607771637, 'samples': 12445696, 'steps': 24307, 'batch_loss/train': 0.8363904552534223} 12/26/2021 04:09:49 - INFO - codeparrot_training - Step 24308: {'lr': 0.000262282416702774, 'samples': 12446208, 'steps': 24308, 'batch_loss/train': 0.7467161570675671} 12/26/2021 04:10:00 - INFO - codeparrot_training - Step 24309: {'lr': 0.000262266672579554, 'samples': 12446720, 'steps': 24309, 'batch_loss/train': 0.7799351420253515} 12/26/2021 04:10:12 - INFO - codeparrot_training - Step 24310: {'lr': 0.0002622509284075663, 'samples': 12447232, 'steps': 24310, 'batch_loss/train': 0.7413571961224079} 12/26/2021 04:10:23 - INFO - codeparrot_training - Step 24311: {'lr': 0.00026223518418687364, 'samples': 12447744, 'steps': 24311, 'batch_loss/train': 0.6987017383798957} 12/26/2021 04:10:33 - INFO - codeparrot_training - Step 24312: {'lr': 0.0002622194399175385, 'samples': 12448256, 'steps': 24312, 'batch_loss/train': 0.6712638238677755} 12/26/2021 04:10:44 - INFO - codeparrot_training - Step 24313: {'lr': 0.00026220369559962344, 'samples': 12448768, 'steps': 24313, 'batch_loss/train': 0.8239492131397128} 12/26/2021 04:10:56 - INFO - 
codeparrot_training - Step 24314: {'lr': 0.00026218795123319105, 'samples': 12449280, 'steps': 24314, 'batch_loss/train': 0.7037721239030361} 12/26/2021 04:11:07 - INFO - codeparrot_training - Step 24315: {'lr': 0.00026217220681830404, 'samples': 12449792, 'steps': 24315, 'batch_loss/train': 0.7279941216111183} 12/26/2021 04:11:17 - INFO - codeparrot_training - Step 24316: {'lr': 0.0002621564623550248, 'samples': 12450304, 'steps': 24316, 'batch_loss/train': 0.7922841748222709} 12/26/2021 04:11:30 - INFO - codeparrot_training - Step 24317: {'lr': 0.0002621407178434163, 'samples': 12450816, 'steps': 24317, 'batch_loss/train': 0.7180841574445367} 12/26/2021 04:11:40 - INFO - codeparrot_training - Step 24318: {'lr': 0.0002621249732835407, 'samples': 12451328, 'steps': 24318, 'batch_loss/train': 0.6837794308084995} 12/26/2021 04:11:51 - INFO - codeparrot_training - Step 24319: {'lr': 0.00026210922867546074, 'samples': 12451840, 'steps': 24319, 'batch_loss/train': 0.7895013713277876} 12/26/2021 04:12:03 - INFO - codeparrot_training - Step 24320: {'lr': 0.0002620934840192392, 'samples': 12452352, 'steps': 24320, 'batch_loss/train': 0.8467632727697492} 12/26/2021 04:12:13 - INFO - codeparrot_training - Step 24321: {'lr': 0.0002620777393149385, 'samples': 12452864, 'steps': 24321, 'batch_loss/train': 0.747168839443475} 12/26/2021 04:12:24 - INFO - codeparrot_training - Step 24322: {'lr': 0.0002620619945626212, 'samples': 12453376, 'steps': 24322, 'batch_loss/train': 0.7901446567848325} 12/26/2021 04:12:35 - INFO - codeparrot_training - Step 24323: {'lr': 0.00026204624976235, 'samples': 12453888, 'steps': 24323, 'batch_loss/train': 0.8048334857448936} 12/26/2021 04:12:49 - INFO - codeparrot_training - Step 24324: {'lr': 0.00026203050491418756, 'samples': 12454400, 'steps': 24324, 'batch_loss/train': 0.7767462914343923} 12/26/2021 04:12:59 - INFO - codeparrot_training - Step 24325: {'lr': 0.00026201476001819625, 'samples': 12454912, 'steps': 24325, 'batch_loss/train': 
0.4931553423521109} 12/26/2021 04:13:10 - INFO - codeparrot_training - Step 24326: {'lr': 0.00026199901507443883, 'samples': 12455424, 'steps': 24326, 'batch_loss/train': 0.7978348261676729} 12/26/2021 04:13:22 - INFO - codeparrot_training - Step 24327: {'lr': 0.0002619832700829778, 'samples': 12455936, 'steps': 24327, 'batch_loss/train': 0.7563943453133106} 12/26/2021 04:13:33 - INFO - codeparrot_training - Step 24328: {'lr': 0.00026196752504387594, 'samples': 12456448, 'steps': 24328, 'batch_loss/train': 0.6562422877177596} 12/26/2021 04:13:43 - INFO - codeparrot_training - Step 24329: {'lr': 0.0002619517799571956, 'samples': 12456960, 'steps': 24329, 'batch_loss/train': 0.7233076598495245} 12/26/2021 04:13:55 - INFO - codeparrot_training - Step 24330: {'lr': 0.00026193603482299957, 'samples': 12457472, 'steps': 24330, 'batch_loss/train': 0.7514431662857533} 12/26/2021 04:14:06 - INFO - codeparrot_training - Step 24331: {'lr': 0.0002619202896413504, 'samples': 12457984, 'steps': 24331, 'batch_loss/train': 0.6686060479842126} 12/26/2021 04:14:17 - INFO - codeparrot_training - Step 24332: {'lr': 0.00026190454441231057, 'samples': 12458496, 'steps': 24332, 'batch_loss/train': 0.8607394946739078} 12/26/2021 04:14:29 - INFO - codeparrot_training - Step 24333: {'lr': 0.0002618887991359428, 'samples': 12459008, 'steps': 24333, 'batch_loss/train': 0.7276725918054581} 12/26/2021 04:14:40 - INFO - codeparrot_training - Step 24334: {'lr': 0.0002618730538123096, 'samples': 12459520, 'steps': 24334, 'batch_loss/train': 0.7434635199606419} 12/26/2021 04:14:51 - INFO - codeparrot_training - Step 24335: {'lr': 0.0002618573084414737, 'samples': 12460032, 'steps': 24335, 'batch_loss/train': 0.8208486838266253} 12/26/2021 04:15:01 - INFO - codeparrot_training - Step 24336: {'lr': 0.0002618415630234976, 'samples': 12460544, 'steps': 24336, 'batch_loss/train': 0.6971485684625804} 12/26/2021 04:15:13 - INFO - codeparrot_training - Step 24337: {'lr': 0.000261825817558444, 'samples': 
12461056, 'steps': 24337, 'batch_loss/train': 0.7884511733427644} 12/26/2021 04:15:24 - INFO - codeparrot_training - Step 24338: {'lr': 0.0002618100720463753, 'samples': 12461568, 'steps': 24338, 'batch_loss/train': 0.7218810748308897} 12/26/2021 04:15:35 - INFO - codeparrot_training - Step 24339: {'lr': 0.0002617943264873542, 'samples': 12462080, 'steps': 24339, 'batch_loss/train': 0.843672776594758} 12/26/2021 04:15:48 - INFO - codeparrot_training - Step 24340: {'lr': 0.00026177858088144337, 'samples': 12462592, 'steps': 24340, 'batch_loss/train': 0.8628441607579589} 12/26/2021 04:15:58 - INFO - codeparrot_training - Step 24341: {'lr': 0.00026176283522870534, 'samples': 12463104, 'steps': 24341, 'batch_loss/train': 0.8000437659211457} 12/26/2021 04:16:09 - INFO - codeparrot_training - Step 24342: {'lr': 0.0002617470895292027, 'samples': 12463616, 'steps': 24342, 'batch_loss/train': 0.7856088718399405} 12/26/2021 04:16:21 - INFO - codeparrot_training - Step 24343: {'lr': 0.0002617313437829981, 'samples': 12464128, 'steps': 24343, 'batch_loss/train': 0.9080814169719815} 12/26/2021 04:16:32 - INFO - codeparrot_training - Step 24344: {'lr': 0.0002617155979901541, 'samples': 12464640, 'steps': 24344, 'batch_loss/train': 0.7963168376591057} 12/26/2021 04:16:42 - INFO - codeparrot_training - Step 24345: {'lr': 0.0002616998521507332, 'samples': 12465152, 'steps': 24345, 'batch_loss/train': 0.7491561281494796} 12/26/2021 04:16:54 - INFO - codeparrot_training - Step 24346: {'lr': 0.00026168410626479827, 'samples': 12465664, 'steps': 24346, 'batch_loss/train': 0.7960793115198612} 12/26/2021 04:17:05 - INFO - codeparrot_training - Step 24347: {'lr': 0.00026166836033241164, 'samples': 12466176, 'steps': 24347, 'batch_loss/train': 0.7316403491422534} 12/26/2021 04:17:16 - INFO - codeparrot_training - Step 24348: {'lr': 0.000261652614353636, 'samples': 12466688, 'steps': 24348, 'batch_loss/train': 0.8310020440258086} 12/26/2021 04:17:26 - INFO - codeparrot_training - Step 
24349: {'lr': 0.00026163686832853406, 'samples': 12467200, 'steps': 24349, 'batch_loss/train': 0.6553899724967778} 12/26/2021 04:17:39 - INFO - codeparrot_training - Step 24350: {'lr': 0.00026162112225716833, 'samples': 12467712, 'steps': 24350, 'batch_loss/train': 0.8346798922866583} 12/26/2021 04:17:50 - INFO - codeparrot_training - Step 24351: {'lr': 0.0002616053761396013, 'samples': 12468224, 'steps': 24351, 'batch_loss/train': 0.7467241808772087} 12/26/2021 04:18:00 - INFO - codeparrot_training - Step 24352: {'lr': 0.00026158962997589573, 'samples': 12468736, 'steps': 24352, 'batch_loss/train': 0.7341080075129867} 12/26/2021 04:18:12 - INFO - codeparrot_training - Step 24353: {'lr': 0.0002615738837661142, 'samples': 12469248, 'steps': 24353, 'batch_loss/train': 0.6245857583126053} 12/26/2021 04:18:23 - INFO - codeparrot_training - Step 24354: {'lr': 0.0002615581375103192, 'samples': 12469760, 'steps': 24354, 'batch_loss/train': 0.7845178795978427} 12/26/2021 04:18:34 - INFO - codeparrot_training - Step 24355: {'lr': 0.0002615423912085735, 'samples': 12470272, 'steps': 24355, 'batch_loss/train': 0.6768304361030459} 12/26/2021 04:18:46 - INFO - codeparrot_training - Step 24356: {'lr': 0.00026152664486093954, 'samples': 12470784, 'steps': 24356, 'batch_loss/train': 0.6194917977554724} 12/26/2021 04:18:56 - INFO - codeparrot_training - Step 24357: {'lr': 0.00026151089846748006, 'samples': 12471296, 'steps': 24357, 'batch_loss/train': 0.7334783198311925} 12/26/2021 04:19:07 - INFO - codeparrot_training - Step 24358: {'lr': 0.0002614951520282575, 'samples': 12471808, 'steps': 24358, 'batch_loss/train': 0.7584430314600468} 12/26/2021 04:19:18 - INFO - codeparrot_training - Step 24359: {'lr': 0.0002614794055433346, 'samples': 12472320, 'steps': 24359, 'batch_loss/train': 0.7652744110673666} 12/26/2021 04:19:30 - INFO - codeparrot_training - Step 24360: {'lr': 0.0002614636590127739, 'samples': 12472832, 'steps': 24360, 'batch_loss/train': 0.7269698367454112} 12/26/2021 
04:19:40 - INFO - codeparrot_training - Step 24361: {'lr': 0.00026144791243663806, 'samples': 12473344, 'steps': 24361, 'batch_loss/train': 0.7169515346176922} 12/26/2021 04:19:51 - INFO - codeparrot_training - Step 24362: {'lr': 0.00026143216581498964, 'samples': 12473856, 'steps': 24362, 'batch_loss/train': 0.8012137282639742} 12/26/2021 04:20:04 - INFO - codeparrot_training - Step 24363: {'lr': 0.00026141641914789125, 'samples': 12474368, 'steps': 24363, 'batch_loss/train': 0.6933570886030793} 12/26/2021 04:20:15 - INFO - codeparrot_training - Step 24364: {'lr': 0.00026140067243540544, 'samples': 12474880, 'steps': 24364, 'batch_loss/train': 0.705326497554779} 12/26/2021 04:20:25 - INFO - codeparrot_training - Step 24365: {'lr': 0.00026138492567759494, 'samples': 12475392, 'steps': 24365, 'batch_loss/train': 0.7359706228598952} 12/26/2021 04:20:38 - INFO - codeparrot_training - Step 24366: {'lr': 0.0002613691788745222, 'samples': 12475904, 'steps': 24366, 'batch_loss/train': 0.8696985564893112} 12/26/2021 04:20:48 - INFO - codeparrot_training - Step 24367: {'lr': 0.00026135343202624993, 'samples': 12476416, 'steps': 24367, 'batch_loss/train': 0.7941517531871796} 12/26/2021 04:20:59 - INFO - codeparrot_training - Step 24368: {'lr': 0.0002613376851328407, 'samples': 12476928, 'steps': 24368, 'batch_loss/train': 0.7538403822109103} 12/26/2021 04:21:12 - INFO - codeparrot_training - Step 24369: {'lr': 0.0002613219381943571, 'samples': 12477440, 'steps': 24369, 'batch_loss/train': 0.5315733241732232} 12/26/2021 04:21:22 - INFO - codeparrot_training - Step 24370: {'lr': 0.0002613061912108618, 'samples': 12477952, 'steps': 24370, 'batch_loss/train': 0.7293824730440974} 12/26/2021 04:21:33 - INFO - codeparrot_training - Step 24371: {'lr': 0.00026129044418241726, 'samples': 12478464, 'steps': 24371, 'batch_loss/train': 0.7724309768527746} 12/26/2021 04:21:44 - INFO - codeparrot_training - Step 24372: {'lr': 0.00026127469710908626, 'samples': 12478976, 'steps': 24372, 
'batch_loss/train': 0.8018658314831555} 12/26/2021 04:21:57 - INFO - codeparrot_training - Step 24373: {'lr': 0.0002612589499909313, 'samples': 12479488, 'steps': 24373, 'batch_loss/train': 0.8069038158282638} 12/26/2021 04:22:07 - INFO - codeparrot_training - Step 24374: {'lr': 0.000261243202828015, 'samples': 12480000, 'steps': 24374, 'batch_loss/train': 0.7256604796275496} 12/26/2021 04:22:18 - INFO - codeparrot_training - Step 24375: {'lr': 0.00026122745562039997, 'samples': 12480512, 'steps': 24375, 'batch_loss/train': 0.7879851208999753} 12/26/2021 04:22:29 - INFO - codeparrot_training - Step 24376: {'lr': 0.0002612117083681488, 'samples': 12481024, 'steps': 24376, 'batch_loss/train': 1.296536241658032} 12/26/2021 04:22:41 - INFO - codeparrot_training - Step 24377: {'lr': 0.0002611959610713241, 'samples': 12481536, 'steps': 24377, 'batch_loss/train': 1.7319757724180818} 12/26/2021 04:22:51 - INFO - codeparrot_training - Step 24378: {'lr': 0.0002611802137299886, 'samples': 12482048, 'steps': 24378, 'batch_loss/train': 0.7993536960566416} 12/26/2021 04:23:02 - INFO - codeparrot_training - Step 24379: {'lr': 0.00026116446634420466, 'samples': 12482560, 'steps': 24379, 'batch_loss/train': 0.7445895420387387} 12/26/2021 04:23:15 - INFO - codeparrot_training - Step 24380: {'lr': 0.000261148718914035, 'samples': 12483072, 'steps': 24380, 'batch_loss/train': 0.7522597182542086} 12/26/2021 04:23:25 - INFO - codeparrot_training - Step 24381: {'lr': 0.0002611329714395423, 'samples': 12483584, 'steps': 24381, 'batch_loss/train': 0.8319993987679482} 12/26/2021 04:23:36 - INFO - codeparrot_training - Step 24382: {'lr': 0.00026111722392078914, 'samples': 12484096, 'steps': 24382, 'batch_loss/train': 0.7874619215726852} 12/26/2021 04:23:48 - INFO - codeparrot_training - Step 24383: {'lr': 0.0002611014763578381, 'samples': 12484608, 'steps': 24383, 'batch_loss/train': 0.7580625966656953} 12/26/2021 04:23:59 - INFO - codeparrot_training - Step 24384: {'lr': 
0.00026108572875075173, 'samples': 12485120, 'steps': 24384, 'batch_loss/train': 0.8259860016405582} 12/26/2021 04:24:09 - INFO - codeparrot_training - Step 24385: {'lr': 0.00026106998109959274, 'samples': 12485632, 'steps': 24385, 'batch_loss/train': 0.620099009887781} 12/26/2021 04:24:22 - INFO - codeparrot_training - Step 24386: {'lr': 0.00026105423340442365, 'samples': 12486144, 'steps': 24386, 'batch_loss/train': 0.7424088283441961} 12/26/2021 04:24:33 - INFO - codeparrot_training - Step 24387: {'lr': 0.0002610384856653071, 'samples': 12486656, 'steps': 24387, 'batch_loss/train': 0.738864557351917} 12/26/2021 04:24:43 - INFO - codeparrot_training - Step 24388: {'lr': 0.0002610227378823057, 'samples': 12487168, 'steps': 24388, 'batch_loss/train': 0.6766291009262204} 12/26/2021 04:24:54 - INFO - codeparrot_training - Step 24389: {'lr': 0.0002610069900554821, 'samples': 12487680, 'steps': 24389, 'batch_loss/train': 0.7632136675529182} 12/26/2021 04:25:07 - INFO - codeparrot_training - Step 24390: {'lr': 0.00026099124218489873, 'samples': 12488192, 'steps': 24390, 'batch_loss/train': 0.6433476622332819} 12/26/2021 04:25:17 - INFO - codeparrot_training - Step 24391: {'lr': 0.00026097549427061845, 'samples': 12488704, 'steps': 24391, 'batch_loss/train': 0.759806509129703} 12/26/2021 04:25:28 - INFO - codeparrot_training - Step 24392: {'lr': 0.00026095974631270375, 'samples': 12489216, 'steps': 24392, 'batch_loss/train': 1.1953932885080576} 12/26/2021 04:25:40 - INFO - codeparrot_training - Step 24393: {'lr': 0.0002609439983112172, 'samples': 12489728, 'steps': 24393, 'batch_loss/train': 0.791868700645864} 12/26/2021 04:25:51 - INFO - codeparrot_training - Step 24394: {'lr': 0.0002609282502662215, 'samples': 12490240, 'steps': 24394, 'batch_loss/train': 0.7100272318348289} 12/26/2021 04:26:01 - INFO - codeparrot_training - Step 24395: {'lr': 0.00026091250217777904, 'samples': 12490752, 'steps': 24395, 'batch_loss/train': 0.8026741263456643} 12/26/2021 04:26:13 - INFO 
- codeparrot_training - Step 24396: {'lr': 0.00026089675404595273, 'samples': 12491264, 'steps': 24396, 'batch_loss/train': 0.7531523825600743} 12/26/2021 04:26:24 - INFO - codeparrot_training - Step 24397: {'lr': 0.00026088100587080506, 'samples': 12491776, 'steps': 24397, 'batch_loss/train': 0.5777760982746258} 12/26/2021 04:26:35 - INFO - codeparrot_training - Step 24398: {'lr': 0.0002608652576523986, 'samples': 12492288, 'steps': 24398, 'batch_loss/train': 0.7717648902907968} 12/26/2021 04:26:45 - INFO - codeparrot_training - Step 24399: {'lr': 0.00026084950939079594, 'samples': 12492800, 'steps': 24399, 'batch_loss/train': 0.843291993252933} 12/26/2021 04:26:59 - INFO - codeparrot_training - Step 24400: {'lr': 0.00026083376108605976, 'samples': 12493312, 'steps': 24400, 'batch_loss/train': 0.7887421799823642} 12/26/2021 04:27:10 - INFO - codeparrot_training - Step 24401: {'lr': 0.00026081801273825263, 'samples': 12493824, 'steps': 24401, 'batch_loss/train': 0.8063834146596491} 12/26/2021 04:27:20 - INFO - codeparrot_training - Step 24402: {'lr': 0.0002608022643474372, 'samples': 12494336, 'steps': 24402, 'batch_loss/train': 0.78527104947716} 12/26/2021 04:27:33 - INFO - codeparrot_training - Step 24403: {'lr': 0.0002607865159136759, 'samples': 12494848, 'steps': 24403, 'batch_loss/train': 0.7716476246714592} 12/26/2021 04:27:43 - INFO - codeparrot_training - Step 24404: {'lr': 0.0002607707674370316, 'samples': 12495360, 'steps': 24404, 'batch_loss/train': 0.7049837792292237} 12/26/2021 04:27:54 - INFO - codeparrot_training - Step 24405: {'lr': 0.00026075501891756683, 'samples': 12495872, 'steps': 24405, 'batch_loss/train': 0.6692988709546626} 12/26/2021 04:28:06 - INFO - codeparrot_training - Step 24406: {'lr': 0.00026073927035534414, 'samples': 12496384, 'steps': 24406, 'batch_loss/train': 0.8509313566610217} 12/26/2021 04:28:17 - INFO - codeparrot_training - Step 24407: {'lr': 0.00026072352175042615, 'samples': 12496896, 'steps': 24407, 'batch_loss/train': 
0.8678071973845363} 12/26/2021 04:28:27 - INFO - codeparrot_training - Step 24408: {'lr': 0.0002607077731028754, 'samples': 12497408, 'steps': 24408, 'batch_loss/train': 0.7684484054334462} 12/26/2021 04:28:38 - INFO - codeparrot_training - Step 24409: {'lr': 0.00026069202441275467, 'samples': 12497920, 'steps': 24409, 'batch_loss/train': 0.6790153291076422} 12/26/2021 04:28:51 - INFO - codeparrot_training - Step 24410: {'lr': 0.0002606762756801265, 'samples': 12498432, 'steps': 24410, 'batch_loss/train': 0.8377882586792111} 12/26/2021 04:29:01 - INFO - codeparrot_training - Step 24411: {'lr': 0.00026066052690505357, 'samples': 12498944, 'steps': 24411, 'batch_loss/train': 0.74969796044752} 12/26/2021 04:29:12 - INFO - codeparrot_training - Step 24412: {'lr': 0.0002606447780875983, 'samples': 12499456, 'steps': 24412, 'batch_loss/train': 0.7427612249739468} 12/26/2021 04:29:24 - INFO - codeparrot_training - Step 24413: {'lr': 0.0002606290292278234, 'samples': 12499968, 'steps': 24413, 'batch_loss/train': 0.6387062050926033} 12/26/2021 04:29:35 - INFO - codeparrot_training - Step 24414: {'lr': 0.00026061328032579153, 'samples': 12500480, 'steps': 24414, 'batch_loss/train': 0.6906402740860358} 12/26/2021 04:29:45 - INFO - codeparrot_training - Step 24415: {'lr': 0.00026059753138156527, 'samples': 12500992, 'steps': 24415, 'batch_loss/train': 0.7649144099559635} 12/26/2021 04:29:57 - INFO - codeparrot_training - Step 24416: {'lr': 0.0002605817823952072, 'samples': 12501504, 'steps': 24416, 'batch_loss/train': 0.7330304719507694} 12/26/2021 04:30:08 - INFO - codeparrot_training - Step 24417: {'lr': 0.00026056603336677993, 'samples': 12502016, 'steps': 24417, 'batch_loss/train': 0.7593798190355301} 12/26/2021 04:30:19 - INFO - codeparrot_training - Step 24418: {'lr': 0.0002605502842963462, 'samples': 12502528, 'steps': 24418, 'batch_loss/train': 0.7420353735797107} 12/26/2021 04:30:29 - INFO - codeparrot_training - Step 24419: {'lr': 0.0002605345351839684, 'samples': 
12503040, 'steps': 24419, 'batch_loss/train': 0.7104260690975934} 12/26/2021 04:30:42 - INFO - codeparrot_training - Step 24420: {'lr': 0.00026051878602970937, 'samples': 12503552, 'steps': 24420, 'batch_loss/train': 0.7526672845706344} 12/26/2021 04:30:53 - INFO - codeparrot_training - Step 24421: {'lr': 0.00026050303683363156, 'samples': 12504064, 'steps': 24421, 'batch_loss/train': 0.8564398726448417} 12/26/2021 04:31:03 - INFO - codeparrot_training - Step 24422: {'lr': 0.00026048728759579766, 'samples': 12504576, 'steps': 24422, 'batch_loss/train': 0.7566099874675274} 12/26/2021 04:31:15 - INFO - codeparrot_training - Step 24423: {'lr': 0.00026047153831627027, 'samples': 12505088, 'steps': 24423, 'batch_loss/train': 0.7524530163500458} 12/26/2021 04:31:26 - INFO - codeparrot_training - Step 24424: {'lr': 0.00026045578899511207, 'samples': 12505600, 'steps': 24424, 'batch_loss/train': 0.8689310131594539} 12/26/2021 04:31:36 - INFO - codeparrot_training - Step 24425: {'lr': 0.0002604400396323855, 'samples': 12506112, 'steps': 24425, 'batch_loss/train': 0.7552080224268138} 12/26/2021 04:31:49 - INFO - codeparrot_training - Step 24426: {'lr': 0.0002604242902281533, 'samples': 12506624, 'steps': 24426, 'batch_loss/train': 0.7898838389664888} 12/26/2021 04:32:00 - INFO - codeparrot_training - Step 24427: {'lr': 0.00026040854078247805, 'samples': 12507136, 'steps': 24427, 'batch_loss/train': 0.7745008273050189} 12/26/2021 04:32:10 - INFO - codeparrot_training - Step 24428: {'lr': 0.00026039279129542244, 'samples': 12507648, 'steps': 24428, 'batch_loss/train': 0.7594612357206643} 12/26/2021 04:32:24 - INFO - codeparrot_training - Step 24429: {'lr': 0.00026037704176704886, 'samples': 12508160, 'steps': 24429, 'batch_loss/train': 0.8316993750631809} 12/26/2021 04:32:34 - INFO - codeparrot_training - Step 24430: {'lr': 0.00026036129219742015, 'samples': 12508672, 'steps': 24430, 'batch_loss/train': 0.7936038472689688} 12/26/2021 04:32:45 - INFO - codeparrot_training - 
Step 24431: {'lr': 0.0002603455425865989, 'samples': 12509184, 'steps': 24431, 'batch_loss/train': 0.7336687077768147} 12/26/2021 04:32:55 - INFO - codeparrot_training - Step 24432: {'lr': 0.00026032979293464764, 'samples': 12509696, 'steps': 24432, 'batch_loss/train': 1.21577291470021} 12/26/2021 04:33:07 - INFO - codeparrot_training - Step 24433: {'lr': 0.0002603140432416291, 'samples': 12510208, 'steps': 24433, 'batch_loss/train': 0.7428657505661249} 12/26/2021 04:33:18 - INFO - codeparrot_training - Step 24434: {'lr': 0.00026029829350760576, 'samples': 12510720, 'steps': 24434, 'batch_loss/train': 0.6658086306415498} 12/26/2021 04:33:29 - INFO - codeparrot_training - Step 24435: {'lr': 0.00026028254373264026, 'samples': 12511232, 'steps': 24435, 'batch_loss/train': 0.7199438009411097} 12/26/2021 04:33:41 - INFO - codeparrot_training - Step 24436: {'lr': 0.0002602667939167953, 'samples': 12511744, 'steps': 24436, 'batch_loss/train': 0.8801689455285668} 12/26/2021 04:33:51 - INFO - codeparrot_training - Step 24437: {'lr': 0.00026025104406013344, 'samples': 12512256, 'steps': 24437, 'batch_loss/train': 0.7531734742224216} 12/26/2021 04:34:02 - INFO - codeparrot_training - Step 24438: {'lr': 0.0002602352941627173, 'samples': 12512768, 'steps': 24438, 'batch_loss/train': 0.7462096852250397} 12/26/2021 04:34:13 - INFO - codeparrot_training - Step 24439: {'lr': 0.00026021954422460945, 'samples': 12513280, 'steps': 24439, 'batch_loss/train': 0.862000438850373} 12/26/2021 04:34:25 - INFO - codeparrot_training - Step 24440: {'lr': 0.0002602037942458726, 'samples': 12513792, 'steps': 24440, 'batch_loss/train': 0.7249176856130362} 12/26/2021 04:34:36 - INFO - codeparrot_training - Step 24441: {'lr': 0.00026018804422656936, 'samples': 12514304, 'steps': 24441, 'batch_loss/train': 0.8200888615101576} 12/26/2021 04:34:47 - INFO - codeparrot_training - Step 24442: {'lr': 0.0002601722941667622, 'samples': 12514816, 'steps': 24442, 'batch_loss/train': 0.8115698149194941} 
12/26/2021 04:34:59 - INFO - codeparrot_training - Step 24443: {'lr': 0.00026015654406651384, 'samples': 12515328, 'steps': 24443, 'batch_loss/train': 0.7734621642157435} 12/26/2021 04:35:09 - INFO - codeparrot_training - Step 24444: {'lr': 0.000260140793925887, 'samples': 12515840, 'steps': 24444, 'batch_loss/train': 0.8447852858807892} 12/26/2021 04:35:20 - INFO - codeparrot_training - Step 24445: {'lr': 0.0002601250437449441, 'samples': 12516352, 'steps': 24445, 'batch_loss/train': 0.8082983372732997} 12/26/2021 04:35:32 - INFO - codeparrot_training - Step 24446: {'lr': 0.0002601092935237479, 'samples': 12516864, 'steps': 24446, 'batch_loss/train': 0.7475041067227721} 12/26/2021 04:35:43 - INFO - codeparrot_training - Step 24447: {'lr': 0.00026009354326236096, 'samples': 12517376, 'steps': 24447, 'batch_loss/train': 0.756893147714436} 12/26/2021 04:35:54 - INFO - codeparrot_training - Step 24448: {'lr': 0.0002600777929608459, 'samples': 12517888, 'steps': 24448, 'batch_loss/train': 0.6028378600021824} 12/26/2021 04:36:07 - INFO - codeparrot_training - Step 24449: {'lr': 0.00026006204261926525, 'samples': 12518400, 'steps': 24449, 'batch_loss/train': 0.8153072046115994} 12/26/2021 04:36:17 - INFO - codeparrot_training - Step 24450: {'lr': 0.0002600462922376819, 'samples': 12518912, 'steps': 24450, 'batch_loss/train': 0.7415425754152238} 12/26/2021 04:36:28 - INFO - codeparrot_training - Step 24451: {'lr': 0.00026003054181615814, 'samples': 12519424, 'steps': 24451, 'batch_loss/train': 0.7593413922004402} 12/26/2021 04:36:38 - INFO - codeparrot_training - Step 24452: {'lr': 0.00026001479135475677, 'samples': 12519936, 'steps': 24452, 'batch_loss/train': 1.0658466899767518} 12/26/2021 04:36:51 - INFO - codeparrot_training - Step 24453: {'lr': 0.00025999904085354035, 'samples': 12520448, 'steps': 24453, 'batch_loss/train': 0.6205470818094909} 12/26/2021 04:37:01 - INFO - codeparrot_training - Step 24454: {'lr': 0.00025998329031257154, 'samples': 12520960, 'steps': 
24454, 'batch_loss/train': 0.8376160934567451} 12/26/2021 04:37:12 - INFO - codeparrot_training - Step 24455: {'lr': 0.00025996753973191297, 'samples': 12521472, 'steps': 24455, 'batch_loss/train': 0.7075994741171598} 12/26/2021 04:37:25 - INFO - codeparrot_training - Step 24456: {'lr': 0.0002599517891116272, 'samples': 12521984, 'steps': 24456, 'batch_loss/train': 0.6785772170405835} 12/26/2021 04:37:36 - INFO - codeparrot_training - Step 24457: {'lr': 0.00025993603845177686, 'samples': 12522496, 'steps': 24457, 'batch_loss/train': 0.8984597055241466} 12/26/2021 04:37:46 - INFO - codeparrot_training - Step 24458: {'lr': 0.00025992028775242455, 'samples': 12523008, 'steps': 24458, 'batch_loss/train': 0.7800621173810214} 12/26/2021 04:37:58 - INFO - codeparrot_training - Step 24459: {'lr': 0.00025990453701363304, 'samples': 12523520, 'steps': 24459, 'batch_loss/train': 0.6733065387234092} 12/26/2021 04:38:09 - INFO - codeparrot_training - Step 24460: {'lr': 0.00025988878623546476, 'samples': 12524032, 'steps': 24460, 'batch_loss/train': 0.8094044672325253} 12/26/2021 04:38:19 - INFO - codeparrot_training - Step 24461: {'lr': 0.00025987303541798233, 'samples': 12524544, 'steps': 24461, 'batch_loss/train': 0.83066730434075} 12/26/2021 04:38:30 - INFO - codeparrot_training - Step 24462: {'lr': 0.00025985728456124854, 'samples': 12525056, 'steps': 24462, 'batch_loss/train': 0.6786110354587436} 12/26/2021 04:38:42 - INFO - codeparrot_training - Step 24463: {'lr': 0.00025984153366532593, 'samples': 12525568, 'steps': 24463, 'batch_loss/train': 0.7571978019550443} 12/26/2021 04:38:53 - INFO - codeparrot_training - Step 24464: {'lr': 0.00025982578273027696, 'samples': 12526080, 'steps': 24464, 'batch_loss/train': 0.932153420522809} 12/26/2021 04:39:03 - INFO - codeparrot_training - Step 24465: {'lr': 0.0002598100317561645, 'samples': 12526592, 'steps': 24465, 'batch_loss/train': 0.7192723783664405} 12/26/2021 04:39:16 - INFO - codeparrot_training - Step 24466: {'lr': 
0.0002597942807430511, 'samples': 12527104, 'steps': 24466, 'batch_loss/train': 0.7061386266723275} 12/26/2021 04:39:27 - INFO - codeparrot_training - Step 24467: {'lr': 0.00025977852969099924, 'samples': 12527616, 'steps': 24467, 'batch_loss/train': 0.7207601796835661} 12/26/2021 04:39:37 - INFO - codeparrot_training - Step 24468: {'lr': 0.00025976277860007165, 'samples': 12528128, 'steps': 24468, 'batch_loss/train': 0.7795482352375984} 12/26/2021 04:39:49 - INFO - codeparrot_training - Step 24469: {'lr': 0.0002597470274703309, 'samples': 12528640, 'steps': 24469, 'batch_loss/train': 0.731992099666968} 12/26/2021 04:40:00 - INFO - codeparrot_training - Step 24470: {'lr': 0.0002597312763018398, 'samples': 12529152, 'steps': 24470, 'batch_loss/train': 0.7364111677743495} 12/26/2021 04:40:11 - INFO - codeparrot_training - Step 24471: {'lr': 0.0002597155250946606, 'samples': 12529664, 'steps': 24471, 'batch_loss/train': 0.6642644917592406} 12/26/2021 04:40:21 - INFO - codeparrot_training - Step 24472: {'lr': 0.00025969977384885637, 'samples': 12530176, 'steps': 24472, 'batch_loss/train': 0.7856981913791969} 12/26/2021 04:40:34 - INFO - codeparrot_training - Step 24473: {'lr': 0.0002596840225644894, 'samples': 12530688, 'steps': 24473, 'batch_loss/train': 0.7873612754046917} 12/26/2021 04:40:44 - INFO - codeparrot_training - Step 24474: {'lr': 0.0002596682712416223, 'samples': 12531200, 'steps': 24474, 'batch_loss/train': 0.8184491544961929} 12/26/2021 04:40:55 - INFO - codeparrot_training - Step 24475: {'lr': 0.00025965251988031797, 'samples': 12531712, 'steps': 24475, 'batch_loss/train': 0.7079719454050064} 12/26/2021 04:41:07 - INFO - codeparrot_training - Step 24476: {'lr': 0.00025963676848063887, 'samples': 12532224, 'steps': 24476, 'batch_loss/train': 0.7603966514579952} 12/26/2021 04:41:18 - INFO - codeparrot_training - Step 24477: {'lr': 0.00025962101704264743, 'samples': 12532736, 'steps': 24477, 'batch_loss/train': 0.69917484652251} 12/26/2021 04:41:28 - INFO 
- codeparrot_training - Step 24478: {'lr': 0.00025960526556640664, 'samples': 12533248, 'steps': 24478, 'batch_loss/train': 0.7320597218349576} 12/26/2021 04:41:41 - INFO - codeparrot_training - Step 24479: {'lr': 0.00025958951405197895, 'samples': 12533760, 'steps': 24479, 'batch_loss/train': 0.7581772031262517} 12/26/2021 04:41:52 - INFO - codeparrot_training - Step 24480: {'lr': 0.00025957376249942686, 'samples': 12534272, 'steps': 24480, 'batch_loss/train': 0.6501117665902711} 12/26/2021 04:42:02 - INFO - codeparrot_training - Step 24481: {'lr': 0.00025955801090881314, 'samples': 12534784, 'steps': 24481, 'batch_loss/train': 0.814669584389776} 12/26/2021 04:42:14 - INFO - codeparrot_training - Step 24482: {'lr': 0.0002595422592802004, 'samples': 12535296, 'steps': 24482, 'batch_loss/train': 0.6478766347281635} 12/26/2021 04:42:25 - INFO - codeparrot_training - Step 24483: {'lr': 0.00025952650761365125, 'samples': 12535808, 'steps': 24483, 'batch_loss/train': 0.726258925627917} 12/26/2021 04:42:36 - INFO - codeparrot_training - Step 24484: {'lr': 0.00025951075590922824, 'samples': 12536320, 'steps': 24484, 'batch_loss/train': 0.8257781052961946} 12/26/2021 04:42:46 - INFO - codeparrot_training - Step 24485: {'lr': 0.00025949500416699416, 'samples': 12536832, 'steps': 24485, 'batch_loss/train': 0.736421980895102} 12/26/2021 04:43:00 - INFO - codeparrot_training - Step 24486: {'lr': 0.0002594792523870115, 'samples': 12537344, 'steps': 24486, 'batch_loss/train': 1.1823461095336825} 12/26/2021 04:43:10 - INFO - codeparrot_training - Step 24487: {'lr': 0.0002594635005693428, 'samples': 12537856, 'steps': 24487, 'batch_loss/train': 0.8365129297599196} 12/26/2021 04:43:21 - INFO - codeparrot_training - Step 24488: {'lr': 0.00025944774871405093, 'samples': 12538368, 'steps': 24488, 'batch_loss/train': 0.8014644905924797} 12/26/2021 04:43:33 - INFO - codeparrot_training - Step 24489: {'lr': 0.00025943199682119836, 'samples': 12538880, 'steps': 24489, 'batch_loss/train': 
0.7227055327966809} 12/26/2021 04:43:44 - INFO - codeparrot_training - Step 24490: {'lr': 0.00025941624489084756, 'samples': 12539392, 'steps': 24490, 'batch_loss/train': 0.756045151501894} 12/26/2021 04:43:54 - INFO - codeparrot_training - Step 24491: {'lr': 0.0002594004929230615, 'samples': 12539904, 'steps': 24491, 'batch_loss/train': 0.6392104011029005} 12/26/2021 04:44:05 - INFO - codeparrot_training - Step 24492: {'lr': 0.00025938474091790264, 'samples': 12540416, 'steps': 24492, 'batch_loss/train': 0.7702749958261847} 12/26/2021 04:44:17 - INFO - codeparrot_training - Step 24493: {'lr': 0.0002593689888754335, 'samples': 12540928, 'steps': 24493, 'batch_loss/train': 0.7104033639188856} 12/26/2021 04:44:28 - INFO - codeparrot_training - Step 24494: {'lr': 0.00025935323679571686, 'samples': 12541440, 'steps': 24494, 'batch_loss/train': 0.8581746355630457} 12/26/2021 04:44:38 - INFO - codeparrot_training - Step 24495: {'lr': 0.00025933748467881526, 'samples': 12541952, 'steps': 24495, 'batch_loss/train': 0.809442390804179} 12/26/2021 04:44:51 - INFO - codeparrot_training - Step 24496: {'lr': 0.0002593217325247914, 'samples': 12542464, 'steps': 24496, 'batch_loss/train': 0.8045449573546648} 12/26/2021 04:45:01 - INFO - codeparrot_training - Step 24497: {'lr': 0.00025930598033370775, 'samples': 12542976, 'steps': 24497, 'batch_loss/train': 0.7387470467947423} 12/26/2021 04:45:12 - INFO - codeparrot_training - Step 24498: {'lr': 0.0002592902281056271, 'samples': 12543488, 'steps': 24498, 'batch_loss/train': 0.8471988150849938} 12/26/2021 04:45:24 - INFO - codeparrot_training - Step 24499: {'lr': 0.000259274475840612, 'samples': 12544000, 'steps': 24499, 'batch_loss/train': 0.7998448247089982} 12/26/2021 04:45:35 - INFO - codeparrot_training - Step 24500: {'lr': 0.0002592587235387251, 'samples': 12544512, 'steps': 24500, 'batch_loss/train': 0.9036931935697794} 12/26/2021 04:45:45 - INFO - codeparrot_training - Step 24501: {'lr': 0.000259242971200029, 'samples': 
12545024, 'steps': 24501, 'batch_loss/train': 0.7333262814208865} 12/26/2021 04:45:57 - INFO - codeparrot_training - Step 24502: {'lr': 0.0002592272188245864, 'samples': 12545536, 'steps': 24502, 'batch_loss/train': 0.678092084126547} 12/26/2021 04:46:08 - INFO - codeparrot_training - Step 24503: {'lr': 0.0002592114664124597, 'samples': 12546048, 'steps': 24503, 'batch_loss/train': 0.6364370915107429} 12/26/2021 04:46:19 - INFO - codeparrot_training - Step 24504: {'lr': 0.00025919571396371176, 'samples': 12546560, 'steps': 24504, 'batch_loss/train': 0.7651012381538749} 12/26/2021 04:46:29 - INFO - codeparrot_training - Step 24505: {'lr': 0.0002591799614784052, 'samples': 12547072, 'steps': 24505, 'batch_loss/train': 0.7669191136956215} 12/26/2021 04:46:41 - INFO - codeparrot_training - Step 24506: {'lr': 0.00025916420895660255, 'samples': 12547584, 'steps': 24506, 'batch_loss/train': 0.6900456319563091} 12/26/2021 04:46:52 - INFO - codeparrot_training - Step 24507: {'lr': 0.0002591484563983664, 'samples': 12548096, 'steps': 24507, 'batch_loss/train': 0.6274277110933326} 12/26/2021 04:47:03 - INFO - codeparrot_training - Step 24508: {'lr': 0.00025913270380375955, 'samples': 12548608, 'steps': 24508, 'batch_loss/train': 0.7771940398961306} 12/26/2021 04:47:16 - INFO - codeparrot_training - Step 24509: {'lr': 0.0002591169511728444, 'samples': 12549120, 'steps': 24509, 'batch_loss/train': 0.77340177167207} 12/26/2021 04:47:26 - INFO - codeparrot_training - Step 24510: {'lr': 0.00025910119850568376, 'samples': 12549632, 'steps': 24510, 'batch_loss/train': 0.6656945976428688} 12/26/2021 04:47:37 - INFO - codeparrot_training - Step 24511: {'lr': 0.00025908544580234024, 'samples': 12550144, 'steps': 24511, 'batch_loss/train': 0.7023273161612451} 12/26/2021 04:47:49 - INFO - codeparrot_training - Step 24512: {'lr': 0.00025906969306287636, 'samples': 12550656, 'steps': 24512, 'batch_loss/train': 0.649564066901803} 12/26/2021 04:48:00 - INFO - codeparrot_training - Step 
24513: {'lr': 0.0002590539402873548, 'samples': 12551168, 'steps': 24513, 'batch_loss/train': 0.5284965289174579} 12/26/2021 04:48:10 - INFO - codeparrot_training - Step 24514: {'lr': 0.0002590381874758382, 'samples': 12551680, 'steps': 24514, 'batch_loss/train': 0.9158943668007851} 12/26/2021 04:48:21 - INFO - codeparrot_training - Step 24515: {'lr': 0.00025902243462838913, 'samples': 12552192, 'steps': 24515, 'batch_loss/train': 0.7926941737532616} 12/26/2021 04:48:35 - INFO - codeparrot_training - Step 24516: {'lr': 0.0002590066817450704, 'samples': 12552704, 'steps': 24516, 'batch_loss/train': 0.7392660807818174} 12/26/2021 04:48:45 - INFO - codeparrot_training - Step 24517: {'lr': 0.00025899092882594434, 'samples': 12553216, 'steps': 24517, 'batch_loss/train': 0.7929608384147286} 12/26/2021 04:48:56 - INFO - codeparrot_training - Step 24518: {'lr': 0.0002589751758710739, 'samples': 12553728, 'steps': 24518, 'batch_loss/train': 0.7458990290760994} 12/26/2021 04:49:08 - INFO - codeparrot_training - Step 24519: {'lr': 0.0002589594228805214, 'samples': 12554240, 'steps': 24519, 'batch_loss/train': 0.7450795089825988} 12/26/2021 04:49:18 - INFO - codeparrot_training - Step 24520: {'lr': 0.00025894366985434975, 'samples': 12554752, 'steps': 24520, 'batch_loss/train': 0.860472084954381} 12/26/2021 04:49:29 - INFO - codeparrot_training - Step 24521: {'lr': 0.00025892791679262136, 'samples': 12555264, 'steps': 24521, 'batch_loss/train': 0.6876695179380476} 12/26/2021 04:49:41 - INFO - codeparrot_training - Step 24522: {'lr': 0.00025891216369539896, 'samples': 12555776, 'steps': 24522, 'batch_loss/train': 0.6424939180724323} 12/26/2021 04:49:52 - INFO - codeparrot_training - Step 24523: {'lr': 0.00025889641056274515, 'samples': 12556288, 'steps': 24523, 'batch_loss/train': 0.7288590623065829} 12/26/2021 04:50:02 - INFO - codeparrot_training - Step 24524: {'lr': 0.00025888065739472266, 'samples': 12556800, 'steps': 24524, 'batch_loss/train': 0.801297961268574} 12/26/2021 
04:50:15 - INFO - codeparrot_training - Step 24525: {'lr': 0.000258864904191394, 'samples': 12557312, 'steps': 24525, 'batch_loss/train': 0.7368679903447628} 12/26/2021 04:50:26 - INFO - codeparrot_training - Step 24526: {'lr': 0.00025884915095282176, 'samples': 12557824, 'steps': 24526, 'batch_loss/train': 0.7722405451349914} 12/26/2021 04:50:36 - INFO - codeparrot_training - Step 24527: {'lr': 0.00025883339767906864, 'samples': 12558336, 'steps': 24527, 'batch_loss/train': 0.7213435231242329} 12/26/2021 04:50:47 - INFO - codeparrot_training - Step 24528: {'lr': 0.0002588176443701973, 'samples': 12558848, 'steps': 24528, 'batch_loss/train': 0.9031217484734952} 12/26/2021 04:50:59 - INFO - codeparrot_training - Step 24529: {'lr': 0.00025880189102627025, 'samples': 12559360, 'steps': 24529, 'batch_loss/train': 0.6989138596691191} 12/26/2021 04:51:10 - INFO - codeparrot_training - Step 24530: {'lr': 0.00025878613764735033, 'samples': 12559872, 'steps': 24530, 'batch_loss/train': 0.8091344865970314} 12/26/2021 04:51:20 - INFO - codeparrot_training - Step 24531: {'lr': 0.0002587703842335, 'samples': 12560384, 'steps': 24531, 'batch_loss/train': 0.6646284745074809} 12/26/2021 04:51:32 - INFO - codeparrot_training - Step 24532: {'lr': 0.0002587546307847819, 'samples': 12560896, 'steps': 24532, 'batch_loss/train': 0.7378972136648372} 12/26/2021 04:51:43 - INFO - codeparrot_training - Step 24533: {'lr': 0.00025873887730125877, 'samples': 12561408, 'steps': 24533, 'batch_loss/train': 0.6560434475541115} 12/26/2021 04:51:54 - INFO - codeparrot_training - Step 24534: {'lr': 0.0002587231237829931, 'samples': 12561920, 'steps': 24534, 'batch_loss/train': 0.702593338675797} 12/26/2021 04:52:06 - INFO - codeparrot_training - Step 24535: {'lr': 0.00025870737023004763, 'samples': 12562432, 'steps': 24535, 'batch_loss/train': 0.785601356998086} 12/26/2021 04:52:17 - INFO - codeparrot_training - Step 24536: {'lr': 0.0002586916166424849, 'samples': 12562944, 'steps': 24536, 
'batch_loss/train': 0.9106679521501064} 12/26/2021 04:52:28 - INFO - codeparrot_training - Step 24537: {'lr': 0.00025867586302036767, 'samples': 12563456, 'steps': 24537, 'batch_loss/train': 0.7565264001023024} 12/26/2021 04:52:38 - INFO - codeparrot_training - Step 24538: {'lr': 0.0002586601093637584, 'samples': 12563968, 'steps': 24538, 'batch_loss/train': 0.7741681346669793} 12/26/2021 04:52:50 - INFO - codeparrot_training - Step 24539: {'lr': 0.00025864435567271983, 'samples': 12564480, 'steps': 24539, 'batch_loss/train': 0.7714349094312638} 12/26/2021 04:53:01 - INFO - codeparrot_training - Step 24540: {'lr': 0.00025862860194731454, 'samples': 12564992, 'steps': 24540, 'batch_loss/train': 0.7192964344285429} 12/26/2021 04:53:12 - INFO - codeparrot_training - Step 24541: {'lr': 0.0002586128481876052, 'samples': 12565504, 'steps': 24541, 'batch_loss/train': 0.7517085894942284} 12/26/2021 04:53:24 - INFO - codeparrot_training - Step 24542: {'lr': 0.0002585970943936544, 'samples': 12566016, 'steps': 24542, 'batch_loss/train': 0.651538745034486} 12/26/2021 04:53:35 - INFO - codeparrot_training - Step 24543: {'lr': 0.0002585813405655248, 'samples': 12566528, 'steps': 24543, 'batch_loss/train': 1.4325401280075312} 12/26/2021 04:53:46 - INFO - codeparrot_training - Step 24544: {'lr': 0.0002585655867032791, 'samples': 12567040, 'steps': 24544, 'batch_loss/train': 0.7926759719848633} 12/26/2021 04:53:56 - INFO - codeparrot_training - Step 24545: {'lr': 0.0002585498328069798, 'samples': 12567552, 'steps': 24545, 'batch_loss/train': 0.7974076019600034} 12/26/2021 04:54:10 - INFO - codeparrot_training - Step 24546: {'lr': 0.0002585340788766896, 'samples': 12568064, 'steps': 24546, 'batch_loss/train': 0.6995574366301298} 12/26/2021 04:54:21 - INFO - codeparrot_training - Step 24547: {'lr': 0.0002585183249124712, 'samples': 12568576, 'steps': 24547, 'batch_loss/train': 0.7069050003774464} 12/26/2021 04:54:31 - INFO - codeparrot_training - Step 24548: {'lr': 
0.0002585025709143871, 'samples': 12569088, 'steps': 24548, 'batch_loss/train': 0.7327839946374297} 12/26/2021 04:54:43 - INFO - codeparrot_training - Step 24549: {'lr': 0.0002584868168824999, 'samples': 12569600, 'steps': 24549, 'batch_loss/train': 0.8540340662002563} 12/26/2021 04:54:54 - INFO - codeparrot_training - Step 24550: {'lr': 0.0002584710628168725, 'samples': 12570112, 'steps': 24550, 'batch_loss/train': 0.8096044629346579} 12/26/2021 04:55:05 - INFO - codeparrot_training - Step 24551: {'lr': 0.00025845530871756723, 'samples': 12570624, 'steps': 24551, 'batch_loss/train': 0.738484367262572} 12/26/2021 04:55:17 - INFO - codeparrot_training - Step 24552: {'lr': 0.00025843955458464677, 'samples': 12571136, 'steps': 24552, 'batch_loss/train': 0.7808530111797154} 12/26/2021 04:55:27 - INFO - codeparrot_training - Step 24553: {'lr': 0.00025842380041817404, 'samples': 12571648, 'steps': 24553, 'batch_loss/train': 0.6980779693694785} 12/26/2021 04:55:38 - INFO - codeparrot_training - Step 24554: {'lr': 0.00025840804621821134, 'samples': 12572160, 'steps': 24554, 'batch_loss/train': 0.7177352989092469} 12/26/2021 04:55:49 - INFO - codeparrot_training - Step 24555: {'lr': 0.0002583922919848214, 'samples': 12572672, 'steps': 24555, 'batch_loss/train': 0.7532275826670229} 12/26/2021 04:56:01 - INFO - codeparrot_training - Step 24556: {'lr': 0.0002583765377180669, 'samples': 12573184, 'steps': 24556, 'batch_loss/train': 0.7655749330297112} 12/26/2021 04:56:12 - INFO - codeparrot_training - Step 24557: {'lr': 0.0002583607834180105, 'samples': 12573696, 'steps': 24557, 'batch_loss/train': 0.6791203133761883} 12/26/2021 04:56:23 - INFO - codeparrot_training - Step 24558: {'lr': 0.00025834502908471467, 'samples': 12574208, 'steps': 24558, 'batch_loss/train': 0.7322657210752368} 12/26/2021 04:56:35 - INFO - codeparrot_training - Step 24559: {'lr': 0.0002583292747182423, 'samples': 12574720, 'steps': 24559, 'batch_loss/train': 0.8271223828196526} 12/26/2021 04:56:45 - 
INFO - codeparrot_training - Step 24560: {'lr': 0.0002583135203186558, 'samples': 12575232, 'steps': 24560, 'batch_loss/train': 0.8010352323763072} 12/26/2021 04:56:56 - INFO - codeparrot_training - Step 24561: {'lr': 0.0002582977658860179, 'samples': 12575744, 'steps': 24561, 'batch_loss/train': 0.7706021531485021} 12/26/2021 04:57:08 - INFO - codeparrot_training - Step 24562: {'lr': 0.00025828201142039114, 'samples': 12576256, 'steps': 24562, 'batch_loss/train': 0.7319929231889546} 12/26/2021 04:57:19 - INFO - codeparrot_training - Step 24563: {'lr': 0.00025826625692183835, 'samples': 12576768, 'steps': 24563, 'batch_loss/train': 0.748727036640048} 12/26/2021 04:57:29 - INFO - codeparrot_training - Step 24564: {'lr': 0.000258250502390422, 'samples': 12577280, 'steps': 24564, 'batch_loss/train': 0.8215664529707283} 12/26/2021 04:57:42 - INFO - codeparrot_training - Step 24565: {'lr': 0.00025823474782620464, 'samples': 12577792, 'steps': 24565, 'batch_loss/train': 0.7707380009815097} 12/26/2021 04:57:53 - INFO - codeparrot_training - Step 24566: {'lr': 0.0002582189932292492, 'samples': 12578304, 'steps': 24566, 'batch_loss/train': 0.6479353735921904} 12/26/2021 04:58:03 - INFO - codeparrot_training - Step 24567: {'lr': 0.0002582032385996181, 'samples': 12578816, 'steps': 24567, 'batch_loss/train': 0.7993826507590711} 12/26/2021 04:58:14 - INFO - codeparrot_training - Step 24568: {'lr': 0.000258187483937374, 'samples': 12579328, 'steps': 24568, 'batch_loss/train': 0.794258089736104} 12/26/2021 04:58:26 - INFO - codeparrot_training - Step 24569: {'lr': 0.0002581717292425796, 'samples': 12579840, 'steps': 24569, 'batch_loss/train': 0.7546780470293015} 12/26/2021 04:58:37 - INFO - codeparrot_training - Step 24570: {'lr': 0.0002581559745152975, 'samples': 12580352, 'steps': 24570, 'batch_loss/train': 0.7248472864739597} 12/26/2021 04:58:47 - INFO - codeparrot_training - Step 24571: {'lr': 0.0002581402197555902, 'samples': 12580864, 'steps': 24571, 'batch_loss/train': 
0.7852852744981647} 12/26/2021 04:59:01 - INFO - codeparrot_training - Step 24572: {'lr': 0.0002581244649635206, 'samples': 12581376, 'steps': 24572, 'batch_loss/train': 0.7024202225729823} 12/26/2021 04:59:11 - INFO - codeparrot_training - Step 24573: {'lr': 0.0002581087101391511, 'samples': 12581888, 'steps': 24573, 'batch_loss/train': 0.7937242705374956} 12/26/2021 04:59:22 - INFO - codeparrot_training - Step 24574: {'lr': 0.0002580929552825445, 'samples': 12582400, 'steps': 24574, 'batch_loss/train': 0.7713799979537725} 12/26/2021 04:59:34 - INFO - codeparrot_training - Step 24575: {'lr': 0.0002580772003937633, 'samples': 12582912, 'steps': 24575, 'batch_loss/train': 0.7377945062471554} 12/26/2021 04:59:45 - INFO - codeparrot_training - Step 24576: {'lr': 0.0002580614454728703, 'samples': 12583424, 'steps': 24576, 'batch_loss/train': 0.7646213886328042} 12/26/2021 04:59:55 - INFO - codeparrot_training - Step 24577: {'lr': 0.0002580456905199279, 'samples': 12583936, 'steps': 24577, 'batch_loss/train': 0.7270173297147267} 12/26/2021 05:00:06 - INFO - codeparrot_training - Step 24578: {'lr': 0.0002580299355349989, 'samples': 12584448, 'steps': 24578, 'batch_loss/train': 0.7074451318476349} 12/26/2021 05:00:19 - INFO - codeparrot_training - Step 24579: {'lr': 0.000258014180518146, 'samples': 12584960, 'steps': 24579, 'batch_loss/train': 0.7962935687974095} 12/26/2021 05:00:29 - INFO - codeparrot_training - Step 24580: {'lr': 0.0002579984254694317, 'samples': 12585472, 'steps': 24580, 'batch_loss/train': 0.6783601748757064} 12/26/2021 05:00:40 - INFO - codeparrot_training - Step 24581: {'lr': 0.0002579826703889187, 'samples': 12585984, 'steps': 24581, 'batch_loss/train': 0.5240168084856123} 12/26/2021 05:00:52 - INFO - codeparrot_training - Step 24582: {'lr': 0.00025796691527666955, 'samples': 12586496, 'steps': 24582, 'batch_loss/train': 0.7542747028637677} 12/26/2021 05:01:02 - INFO - codeparrot_training - Step 24583: {'lr': 0.00025795116013274704, 'samples': 
12587008, 'steps': 24583, 'batch_loss/train': 0.711622804403305} 12/26/2021 05:01:13 - INFO - codeparrot_training - Step 24584: {'lr': 0.00025793540495721364, 'samples': 12587520, 'steps': 24584, 'batch_loss/train': 0.7884965091943741} 12/26/2021 05:01:26 - INFO - codeparrot_training - Step 24585: {'lr': 0.0002579196497501322, 'samples': 12588032, 'steps': 24585, 'batch_loss/train': 0.7520243157632649} 12/26/2021 05:01:37 - INFO - codeparrot_training - Step 24586: {'lr': 0.00025790389451156505, 'samples': 12588544, 'steps': 24586, 'batch_loss/train': 0.7843633526936173} 12/26/2021 05:01:47 - INFO - codeparrot_training - Step 24587: {'lr': 0.00025788813924157504, 'samples': 12589056, 'steps': 24587, 'batch_loss/train': 0.6803429583087564} 12/26/2021 05:01:58 - INFO - codeparrot_training - Step 24588: {'lr': 0.00025787238394022486, 'samples': 12589568, 'steps': 24588, 'batch_loss/train': 0.7133637607330456} 12/26/2021 05:02:10 - INFO - codeparrot_training - Step 24589: {'lr': 0.00025785662860757695, 'samples': 12590080, 'steps': 24589, 'batch_loss/train': 0.5765447111334652} 12/26/2021 05:02:21 - INFO - codeparrot_training - Step 24590: {'lr': 0.0002578408732436941, 'samples': 12590592, 'steps': 24590, 'batch_loss/train': 0.7262111655436456} 12/26/2021 05:02:32 - INFO - codeparrot_training - Step 24591: {'lr': 0.0002578251178486389, 'samples': 12591104, 'steps': 24591, 'batch_loss/train': 0.7058718167245388} 12/26/2021 05:02:44 - INFO - codeparrot_training - Step 24592: {'lr': 0.000257809362422474, 'samples': 12591616, 'steps': 24592, 'batch_loss/train': 0.8266466548666358} 12/26/2021 05:02:54 - INFO - codeparrot_training - Step 24593: {'lr': 0.000257793606965262, 'samples': 12592128, 'steps': 24593, 'batch_loss/train': 0.8068954964983277} 12/26/2021 05:03:05 - INFO - codeparrot_training - Step 24594: {'lr': 0.00025777785147706555, 'samples': 12592640, 'steps': 24594, 'batch_loss/train': 0.8794616835657507} 12/26/2021 05:03:18 - INFO - codeparrot_training - Step 
24595: {'lr': 0.0002577620959579473, 'samples': 12593152, 'steps': 24595, 'batch_loss/train': 0.7614762978628278} 12/26/2021 05:03:28 - INFO - codeparrot_training - Step 24596: {'lr': 0.00025774634040797, 'samples': 12593664, 'steps': 24596, 'batch_loss/train': 0.7096756724640727} 12/26/2021 05:03:39 - INFO - codeparrot_training - Step 24597: {'lr': 0.00025773058482719596, 'samples': 12594176, 'steps': 24597, 'batch_loss/train': 0.837405820377171} 12/26/2021 05:03:49 - INFO - codeparrot_training - Step 24598: {'lr': 0.0002577148292156882, 'samples': 12594688, 'steps': 24598, 'batch_loss/train': 0.7528905011713505} 12/26/2021 05:04:02 - INFO - codeparrot_training - Step 24599: {'lr': 0.00025769907357350917, 'samples': 12595200, 'steps': 24599, 'batch_loss/train': 0.6772667253389955} 12/26/2021 05:04:12 - INFO - codeparrot_training - Step 24600: {'lr': 0.00025768331790072146, 'samples': 12595712, 'steps': 24600, 'batch_loss/train': 0.7022361136041582} 12/26/2021 05:04:23 - INFO - codeparrot_training - Step 24601: {'lr': 0.00025766756219738787, 'samples': 12596224, 'steps': 24601, 'batch_loss/train': 0.7822330698836595} 12/26/2021 05:04:35 - INFO - codeparrot_training - Step 24602: {'lr': 0.0002576518064635709, 'samples': 12596736, 'steps': 24602, 'batch_loss/train': 0.7337400689721107} 12/26/2021 05:04:45 - INFO - codeparrot_training - Step 24603: {'lr': 0.00025763605069933324, 'samples': 12597248, 'steps': 24603, 'batch_loss/train': 0.7280078071635216} 12/26/2021 05:04:56 - INFO - codeparrot_training - Step 24604: {'lr': 0.0002576202949047375, 'samples': 12597760, 'steps': 24604, 'batch_loss/train': 0.713528276886791} 12/26/2021 05:05:09 - INFO - codeparrot_training - Step 24605: {'lr': 0.0002576045390798464, 'samples': 12598272, 'steps': 24605, 'batch_loss/train': 0.7472376562654972} 12/26/2021 05:05:19 - INFO - codeparrot_training - Step 24606: {'lr': 0.00025758878322472244, 'samples': 12598784, 'steps': 24606, 'batch_loss/train': 0.8259155945852399} 12/26/2021 
05:05:30 - INFO - codeparrot_training - Step 24607: {'lr': 0.00025757302733942845, 'samples': 12599296, 'steps': 24607, 'batch_loss/train': 0.7855778345838189}
12/26/2021 05:05:43 - INFO - codeparrot_training - Step 24608: {'lr': 0.0002575572714240269, 'samples': 12599808, 'steps': 24608, 'batch_loss/train': 0.7844617776572704}
12/26/2021 05:05:54 - INFO - codeparrot_training - Step 24609: {'lr': 0.0002575415154785805, 'samples': 12600320, 'steps': 24609, 'batch_loss/train': 0.8076389487832785}
12/26/2021 05:06:05 - INFO - codeparrot_training - Step 24610: {'lr': 0.0002575257595031518, 'samples': 12600832, 'steps': 24610, 'batch_loss/train': 0.7861655987799168}
12/26/2021 05:06:15 - INFO - codeparrot_training - Step 24611: {'lr': 0.0002575100034978037, 'samples': 12601344, 'steps': 24611, 'batch_loss/train': 0.7694622543640435}
12/26/2021 05:06:27 - INFO - codeparrot_training - Step 24612: {'lr': 0.0002574942474625986, 'samples': 12601856, 'steps': 24612, 'batch_loss/train': 0.8865730962716043}
12/26/2021 05:06:38 - INFO - codeparrot_training - Step 24613: {'lr': 0.0002574784913975991, 'samples': 12602368, 'steps': 24613, 'batch_loss/train': 0.7323419684544206}
12/26/2021 05:06:49 - INFO - codeparrot_training - Step 24614: {'lr': 0.000257462735302868, 'samples': 12602880, 'steps': 24614, 'batch_loss/train': 0.7531196838244796}
12/26/2021 05:07:01 - INFO - codeparrot_training - Step 24615: {'lr': 0.00025744697917846795, 'samples': 12603392, 'steps': 24615, 'batch_loss/train': 0.8390682181343436}
12/26/2021 05:07:12 - INFO - codeparrot_training - Step 24616: {'lr': 0.00025743122302446144, 'samples': 12603904, 'steps': 24616, 'batch_loss/train': 0.8484016573056579}
12/26/2021 05:07:23 - INFO - codeparrot_training - Step 24617: {'lr': 0.00025741546684091124, 'samples': 12604416, 'steps': 24617, 'batch_loss/train': 0.9048740500584245}
12/26/2021 05:07:35 - INFO - codeparrot_training - Step 24618: {'lr': 0.0002573997106278799, 'samples': 12604928, 'steps': 24618, 'batch_loss/train': 0.781878337264061}
12/26/2021 05:07:45 - INFO - codeparrot_training - Step 24619: {'lr': 0.00025738395438543014, 'samples': 12605440, 'steps': 24619, 'batch_loss/train': 0.7860085507854819}
12/26/2021 05:07:56 - INFO - codeparrot_training - Step 24620: {'lr': 0.0002573681981136246, 'samples': 12605952, 'steps': 24620, 'batch_loss/train': 0.793979718349874}
12/26/2021 05:08:06 - INFO - codeparrot_training - Step 24621: {'lr': 0.00025735244181252577, 'samples': 12606464, 'steps': 24621, 'batch_loss/train': 0.8499921793118119}
12/26/2021 05:08:19 - INFO - codeparrot_training - Step 24622: {'lr': 0.00025733668548219644, 'samples': 12606976, 'steps': 24622, 'batch_loss/train': 0.8570392606779933}
12/26/2021 05:08:29 - INFO - codeparrot_training - Step 24623: {'lr': 0.00025732092912269925, 'samples': 12607488, 'steps': 24623, 'batch_loss/train': 0.767259432701394}
12/26/2021 05:08:40 - INFO - codeparrot_training - Step 24624: {'lr': 0.0002573051727340969, 'samples': 12608000, 'steps': 24624, 'batch_loss/train': 0.7101960902800784}
12/26/2021 05:08:52 - INFO - codeparrot_training - Step 24625: {'lr': 0.00025728941631645186, 'samples': 12608512, 'steps': 24625, 'batch_loss/train': 0.7515264106914401}
12/26/2021 05:09:03 - INFO - codeparrot_training - Step 24626: {'lr': 0.0002572736598698268, 'samples': 12609024, 'steps': 24626, 'batch_loss/train': 0.6962940199300647}
12/26/2021 05:09:14 - INFO - codeparrot_training - Step 24627: {'lr': 0.0002572579033942845, 'samples': 12609536, 'steps': 24627, 'batch_loss/train': 0.7284952569752932}
12/26/2021 05:09:26 - INFO - codeparrot_training - Step 24628: {'lr': 0.00025724214688988743, 'samples': 12610048, 'steps': 24628, 'batch_loss/train': 0.8281800346449018}
12/26/2021 05:09:36 - INFO - codeparrot_training - Step 24629: {'lr': 0.00025722639035669835, 'samples': 12610560, 'steps': 24629, 'batch_loss/train': 0.7438796814531088}
12/26/2021 05:09:47 - INFO - codeparrot_training - Step 24630: {'lr': 0.0002572106337947799, 'samples': 12611072, 'steps': 24630, 'batch_loss/train': 0.8495824206620455}
12/26/2021 05:09:58 - INFO - codeparrot_training - Step 24631: {'lr': 0.0002571948772041948, 'samples': 12611584, 'steps': 24631, 'batch_loss/train': 0.7761679636314511}
12/26/2021 05:10:11 - INFO - codeparrot_training - Step 24632: {'lr': 0.0002571791205850054, 'samples': 12612096, 'steps': 24632, 'batch_loss/train': 0.7834113065619022}
12/26/2021 05:10:21 - INFO - codeparrot_training - Step 24633: {'lr': 0.00025716336393727455, 'samples': 12612608, 'steps': 24633, 'batch_loss/train': 0.88243463402614}
12/26/2021 05:10:32 - INFO - codeparrot_training - Step 24634: {'lr': 0.00025714760726106494, 'samples': 12613120, 'steps': 24634, 'batch_loss/train': 0.8195967841893435}
12/26/2021 05:10:44 - INFO - codeparrot_training - Step 24635: {'lr': 0.0002571318505564391, 'samples': 12613632, 'steps': 24635, 'batch_loss/train': 0.7014586832374334}
12/26/2021 05:10:55 - INFO - codeparrot_training - Step 24636: {'lr': 0.00025711609382345974, 'samples': 12614144, 'steps': 24636, 'batch_loss/train': 0.7627136894734576}
12/26/2021 05:11:05 - INFO - codeparrot_training - Step 24637: {'lr': 0.00025710033706218954, 'samples': 12614656, 'steps': 24637, 'batch_loss/train': 0.8394261822104454}
12/26/2021 05:11:17 - INFO - codeparrot_training - Step 24638: {'lr': 0.000257084580272691, 'samples': 12615168, 'steps': 24638, 'batch_loss/train': 0.7971817213110626}
12/26/2021 05:11:28 - INFO - codeparrot_training - Step 24639: {'lr': 0.0002570688234550268, 'samples': 12615680, 'steps': 24639, 'batch_loss/train': 0.9139517862349749}
12/26/2021 05:11:38 - INFO - codeparrot_training - Step 24640: {'lr': 0.0002570530666092597, 'samples': 12616192, 'steps': 24640, 'batch_loss/train': 0.7665028083138168}
12/26/2021 05:11:49 - INFO - codeparrot_training - Step 24641: {'lr': 0.00025703730973545224, 'samples': 12616704, 'steps': 24641, 'batch_loss/train': 0.707723175175488}
12/26/2021 05:12:02 - INFO - codeparrot_training - Step 24642: {'lr': 0.0002570215528336671, 'samples': 12617216, 'steps': 24642, 'batch_loss/train': 0.7873019650578499}
12/26/2021 05:12:12 - INFO - codeparrot_training - Step 24643: {'lr': 0.00025700579590396696, 'samples': 12617728, 'steps': 24643, 'batch_loss/train': 0.7697812551632524}
12/26/2021 05:12:23 - INFO - codeparrot_training - Step 24644: {'lr': 0.0002569900389464144, 'samples': 12618240, 'steps': 24644, 'batch_loss/train': 0.7513761166483164}
12/26/2021 05:12:35 - INFO - codeparrot_training - Step 24645: {'lr': 0.00025697428196107204, 'samples': 12618752, 'steps': 24645, 'batch_loss/train': 0.7761450232937932}
12/26/2021 05:12:46 - INFO - codeparrot_training - Step 24646: {'lr': 0.0002569585249480026, 'samples': 12619264, 'steps': 24646, 'batch_loss/train': 0.802833121502772}
12/26/2021 05:12:56 - INFO - codeparrot_training - Step 24647: {'lr': 0.00025694276790726866, 'samples': 12619776, 'steps': 24647, 'batch_loss/train': 0.7823742795735598}
12/26/2021 05:13:08 - INFO - codeparrot_training - Step 24648: {'lr': 0.00025692701083893294, 'samples': 12620288, 'steps': 24648, 'batch_loss/train': 0.7866355697624385}
12/26/2021 05:13:19 - INFO - codeparrot_training - Step 24649: {'lr': 0.000256911253743058, 'samples': 12620800, 'steps': 24649, 'batch_loss/train': 0.8790517430752516}
12/26/2021 05:13:29 - INFO - codeparrot_training - Step 24650: {'lr': 0.0002568954966197066, 'samples': 12621312, 'steps': 24650, 'batch_loss/train': 0.7735757995396852}
12/26/2021 05:13:42 - INFO - codeparrot_training - Step 24651: {'lr': 0.00025687973946894127, 'samples': 12621824, 'steps': 24651, 'batch_loss/train': 0.7715418621664867}
12/26/2021 05:13:52 - INFO - codeparrot_training - Step 24652: {'lr': 0.00025686398229082466, 'samples': 12622336, 'steps': 24652, 'batch_loss/train': 0.7366729355417192}
12/26/2021 05:14:03 - INFO - codeparrot_training - Step 24653: {'lr': 0.0002568482250854195, 'samples': 12622848, 'steps': 24653, 'batch_loss/train': 0.780398890725337}
12/26/2021 05:14:14 - INFO - codeparrot_training - Step 24654: {'lr': 0.0002568324678527883, 'samples': 12623360, 'steps': 24654, 'batch_loss/train': 0.7733602807857096}
12/26/2021 05:14:27 - INFO - codeparrot_training - Step 24655: {'lr': 0.0002568167105929939, 'samples': 12623872, 'steps': 24655, 'batch_loss/train': 0.7468733573332429}
12/26/2021 05:14:38 - INFO - codeparrot_training - Step 24656: {'lr': 0.00025680095330609873, 'samples': 12624384, 'steps': 24656, 'batch_loss/train': 0.8209639727137983}
12/26/2021 05:14:49 - INFO - codeparrot_training - Step 24657: {'lr': 0.00025678519599216563, 'samples': 12624896, 'steps': 24657, 'batch_loss/train': 0.5870973531273194}
12/26/2021 05:15:01 - INFO - codeparrot_training - Step 24658: {'lr': 0.00025676943865125704, 'samples': 12625408, 'steps': 24658, 'batch_loss/train': 0.7225414065178484}
12/26/2021 05:15:11 - INFO - codeparrot_training - Step 24659: {'lr': 0.00025675368128343583, 'samples': 12625920, 'steps': 24659, 'batch_loss/train': 0.7802560646086931}
12/26/2021 05:15:22 - INFO - codeparrot_training - Step 24660: {'lr': 0.0002567379238887645, 'samples': 12626432, 'steps': 24660, 'batch_loss/train': 0.8403300447389483}
12/26/2021 05:15:34 - INFO - codeparrot_training - Step 24661: {'lr': 0.0002567221664673057, 'samples': 12626944, 'steps': 24661, 'batch_loss/train': 0.6539810462272726}
12/26/2021 05:15:45 - INFO - codeparrot_training - Step 24662: {'lr': 0.0002567064090191221, 'samples': 12627456, 'steps': 24662, 'batch_loss/train': 1.0984913492575288}
12/26/2021 05:15:55 - INFO - codeparrot_training - Step 24663: {'lr': 0.00025669065154427633, 'samples': 12627968, 'steps': 24663, 'batch_loss/train': 1.2250546859577298}
12/26/2021 05:16:06 - INFO - codeparrot_training - Step 24664: {'lr': 0.00025667489404283117, 'samples': 12628480, 'steps': 24664, 'batch_loss/train': 0.7778789512813091}
12/26/2021 05:16:19 - INFO - codeparrot_training - Step 24665: {'lr': 0.000256659136514849, 'samples': 12628992, 'steps': 24665, 'batch_loss/train': 0.787547460757196}
12/26/2021 05:16:29 - INFO - codeparrot_training - Step 24666: {'lr': 0.00025664337896039277, 'samples': 12629504, 'steps': 24666, 'batch_loss/train': 0.8012138828635216}
12/26/2021 05:16:40 - INFO - codeparrot_training - Step 24667: {'lr': 0.00025662762137952487, 'samples': 12630016, 'steps': 24667, 'batch_loss/train': 0.8225661339238286}
12/26/2021 05:16:52 - INFO - codeparrot_training - Step 24668: {'lr': 0.00025661186377230803, 'samples': 12630528, 'steps': 24668, 'batch_loss/train': 0.7362922420725226}
12/26/2021 05:17:03 - INFO - codeparrot_training - Step 24669: {'lr': 0.00025659610613880504, 'samples': 12631040, 'steps': 24669, 'batch_loss/train': 0.741578639484942}
12/26/2021 05:17:13 - INFO - codeparrot_training - Step 24670: {'lr': 0.00025658034847907834, 'samples': 12631552, 'steps': 24670, 'batch_loss/train': 0.7346204479690641}
12/26/2021 05:17:27 - INFO - codeparrot_training - Step 24671: {'lr': 0.0002565645907931906, 'samples': 12632064, 'steps': 24671, 'batch_loss/train': 0.7355074500665069}
12/26/2021 05:17:37 - INFO - codeparrot_training - Step 24672: {'lr': 0.00025654883308120466, 'samples': 12632576, 'steps': 24672, 'batch_loss/train': 1.1905341874808073}
12/26/2021 05:17:48 - INFO - codeparrot_training - Step 24673: {'lr': 0.00025653307534318295, 'samples': 12633088, 'steps': 24673, 'batch_loss/train': 0.7337606213986874}
12/26/2021 05:17:58 - INFO - codeparrot_training - Step 24674: {'lr': 0.0002565173175791882, 'samples': 12633600, 'steps': 24674, 'batch_loss/train': 0.7868120959028602}
12/26/2021 05:18:11 - INFO - codeparrot_training - Step 24675: {'lr': 0.0002565015597892831, 'samples': 12634112, 'steps': 24675, 'batch_loss/train': 0.7403049855493009}
12/26/2021 05:18:21 - INFO - codeparrot_training - Step 24676: {'lr': 0.00025648580197353015, 'samples': 12634624, 'steps': 24676, 'batch_loss/train': 0.9209697465412319}
12/26/2021 05:18:32 - INFO - codeparrot_training - Step 24677: {'lr': 0.00025647004413199224, 'samples': 12635136, 'steps': 24677, 'batch_loss/train': 0.873860090970993}
12/26/2021 05:18:44 - INFO - codeparrot_training - Step 24678: {'lr': 0.0002564542862647318, 'samples': 12635648, 'steps': 24678, 'batch_loss/train': 0.8480294719338417}
12/26/2021 05:18:54 - INFO - codeparrot_training - Step 24679: {'lr': 0.0002564385283718116, 'samples': 12636160, 'steps': 24679, 'batch_loss/train': 0.7908150162547827}
12/26/2021 05:19:05 - INFO - codeparrot_training - Step 24680: {'lr': 0.0002564227704532942, 'samples': 12636672, 'steps': 24680, 'batch_loss/train': 0.8706714268773794}
12/26/2021 05:19:18 - INFO - codeparrot_training - Step 24681: {'lr': 0.00025640701250924223, 'samples': 12637184, 'steps': 24681, 'batch_loss/train': 0.7076734998263419}
12/26/2021 05:19:29 - INFO - codeparrot_training - Step 24682: {'lr': 0.00025639125453971853, 'samples': 12637696, 'steps': 24682, 'batch_loss/train': 0.7502752309665084}
12/26/2021 05:19:40 - INFO - codeparrot_training - Step 24683: {'lr': 0.0002563754965447855, 'samples': 12638208, 'steps': 24683, 'batch_loss/train': 0.538047157227993}
12/26/2021 05:19:50 - INFO - codeparrot_training - Step 24684: {'lr': 0.00025635973852450593, 'samples': 12638720, 'steps': 24684, 'batch_loss/train': 0.7595200217328966}
12/26/2021 05:20:02 - INFO - codeparrot_training - Step 24685: {'lr': 0.0002563439804789425, 'samples': 12639232, 'steps': 24685, 'batch_loss/train': 0.895031422318425}
12/26/2021 05:20:13 - INFO - codeparrot_training - Step 24686: {'lr': 0.0002563282224081578, 'samples': 12639744, 'steps': 24686, 'batch_loss/train': 0.7234373199753463}
12/26/2021 05:20:24 - INFO - codeparrot_training - Step 24687: {'lr': 0.0002563124643122144, 'samples': 12640256, 'steps': 24687, 'batch_loss/train': 0.7515299746301025}
12/26/2021 05:20:36 - INFO - codeparrot_training - Step 24688: {'lr': 0.0002562967061911751, 'samples': 12640768, 'steps': 24688, 'batch_loss/train': 0.7948528658598661}
12/26/2021 05:20:46 - INFO - codeparrot_training - Step 24689: {'lr': 0.0002562809480451024, 'samples': 12641280, 'steps': 24689, 'batch_loss/train': 0.657786400988698}
12/26/2021 05:20:57 - INFO - codeparrot_training - Step 24690: {'lr': 0.0002562651898740591, 'samples': 12641792, 'steps': 24690, 'batch_loss/train': 0.8148117763921618}
12/26/2021 05:21:09 - INFO - codeparrot_training - Step 24691: {'lr': 0.00025624943167810765, 'samples': 12642304, 'steps': 24691, 'batch_loss/train': 0.8236131649464369}
12/26/2021 05:21:20 - INFO - codeparrot_training - Step 24692: {'lr': 0.000256233673457311, 'samples': 12642816, 'steps': 24692, 'batch_loss/train': 0.7581471453886479}
12/26/2021 05:21:30 - INFO - codeparrot_training - Step 24693: {'lr': 0.0002562179152117315, 'samples': 12643328, 'steps': 24693, 'batch_loss/train': 0.898688274435699}
12/26/2021 05:21:41 - INFO - codeparrot_training - Step 24694: {'lr': 0.0002562021569414319, 'samples': 12643840, 'steps': 24694, 'batch_loss/train': 0.7786192875355482}
12/26/2021 05:21:54 - INFO - codeparrot_training - Step 24695: {'lr': 0.0002561863986464749, 'samples': 12644352, 'steps': 24695, 'batch_loss/train': 0.8348541585728526}
12/26/2021 05:22:04 - INFO - codeparrot_training - Step 24696: {'lr': 0.00025617064032692317, 'samples': 12644864, 'steps': 24696, 'batch_loss/train': 0.7899759067222476}
12/26/2021 05:22:15 - INFO - codeparrot_training - Step 24697: {'lr': 0.0002561548819828392, 'samples': 12645376, 'steps': 24697, 'batch_loss/train': 0.7564282525563613}
12/26/2021 05:22:27 - INFO - codeparrot_training - Step 24698: {'lr': 0.0002561391236142858, 'samples': 12645888, 'steps': 24698, 'batch_loss/train': 0.821175433928147}
12/26/2021 05:22:38 - INFO - codeparrot_training - Step 24699: {'lr': 0.00025612336522132554, 'samples': 12646400, 'steps': 24699, 'batch_loss/train': 0.6168084944365546}
12/26/2021 05:22:48 - INFO - codeparrot_training - Step 24700: {'lr': 0.00025610760680402114, 'samples': 12646912, 'steps': 24700, 'batch_loss/train': 0.8069386091083288}
12/26/2021 05:23:01 - INFO - codeparrot_training - Step 24701: {'lr': 0.00025609184836243516, 'samples': 12647424, 'steps': 24701, 'batch_loss/train': 0.7312137568369508}
12/26/2021 05:23:12 - INFO - codeparrot_training - Step 24702: {'lr': 0.0002560760898966303, 'samples': 12647936, 'steps': 24702, 'batch_loss/train': 0.7186475903727114}
12/26/2021 05:23:22 - INFO - codeparrot_training - Step 24703: {'lr': 0.00025606033140666915, 'samples': 12648448, 'steps': 24703, 'batch_loss/train': 0.7310049035586417}
12/26/2021 05:23:33 - INFO - codeparrot_training - Step 24704: {'lr': 0.0002560445728926145, 'samples': 12648960, 'steps': 24704, 'batch_loss/train': 0.7670473344624043}
12/26/2021 05:23:45 - INFO - codeparrot_training - Step 24705: {'lr': 0.0002560288143545289, 'samples': 12649472, 'steps': 24705, 'batch_loss/train': 0.6426041317172348}
12/26/2021 05:23:56 - INFO - codeparrot_training - Step 24706: {'lr': 0.00025601305579247495, 'samples': 12649984, 'steps': 24706, 'batch_loss/train': 0.7075512812007219}
12/26/2021 05:24:06 - INFO - codeparrot_training - Step 24707: {'lr': 0.0002559972972065154, 'samples': 12650496, 'steps': 24707, 'batch_loss/train': 0.7913460079580545}
12/26/2021 05:24:19 - INFO - codeparrot_training - Step 24708: {'lr': 0.00025598153859671287, 'samples': 12651008, 'steps': 24708, 'batch_loss/train': 0.598215508973226}
12/26/2021 05:24:29 - INFO - codeparrot_training - Step 24709: {'lr': 0.00025596577996313003, 'samples': 12651520, 'steps': 24709, 'batch_loss/train': 0.7886152677237988}
12/26/2021 05:24:40 - INFO - codeparrot_training - Step 24710: {'lr': 0.0002559500213058294, 'samples': 12652032, 'steps': 24710, 'batch_loss/train': 0.7481160876341164}
12/26/2021 05:24:52 - INFO - codeparrot_training - Step 24711: {'lr': 0.00025593426262487387, 'samples': 12652544, 'steps': 24711, 'batch_loss/train': 0.8544176612049341}
12/26/2021 05:25:03 - INFO - codeparrot_training - Step 24712: {'lr': 0.0002559185039203258, 'samples': 12653056, 'steps': 24712, 'batch_loss/train': 0.7633374645374715}
12/26/2021 05:25:14 - INFO - codeparrot_training - Step 24713: {'lr': 0.0002559027451922481, 'samples': 12653568, 'steps': 24713, 'batch_loss/train': 0.6971714505925775}
12/26/2021 05:25:24 - INFO - codeparrot_training - Step 24714: {'lr': 0.00025588698644070325, 'samples': 12654080, 'steps': 24714, 'batch_loss/train': 0.8543379697948694}
12/26/2021 05:25:37 - INFO - codeparrot_training - Step 24715: {'lr': 0.000255871227665754, 'samples': 12654592, 'steps': 24715, 'batch_loss/train': 0.7174140950664878}
12/26/2021 05:25:47 - INFO - codeparrot_training - Step 24716: {'lr': 0.00025585546886746296, 'samples': 12655104, 'steps': 24716, 'batch_loss/train': 0.7005695845000446}
12/26/2021 05:25:58 - INFO - codeparrot_training - Step 24717: {'lr': 0.00025583971004589274, 'samples': 12655616, 'steps': 24717, 'batch_loss/train': 0.8408903167583048}
12/26/2021 05:26:10 - INFO - codeparrot_training - Step 24718: {'lr': 0.00025582395120110613, 'samples': 12656128, 'steps': 24718, 'batch_loss/train': 0.7024385733529925}
12/26/2021 05:26:21 - INFO - codeparrot_training - Step 24719: {'lr': 0.00025580819233316567, 'samples': 12656640, 'steps': 24719, 'batch_loss/train': 0.7568119294010103}
12/26/2021 05:26:32 - INFO - codeparrot_training - Step 24720: {'lr': 0.000255792433442134, 'samples': 12657152, 'steps': 24720, 'batch_loss/train': 0.8768127146176994}
12/26/2021 05:26:44 - INFO - codeparrot_training - Step 24721: {'lr': 0.0002557766745280738, 'samples': 12657664, 'steps': 24721, 'batch_loss/train': 0.6818820191547275}
12/26/2021 05:26:54 - INFO - codeparrot_training - Step 24722: {'lr': 0.00025576091559104775, 'samples': 12658176, 'steps': 24722, 'batch_loss/train': 0.9160810112953186}
12/26/2021 05:27:05 - INFO - codeparrot_training - Step 24723: {'lr': 0.00025574515663111843, 'samples': 12658688, 'steps': 24723, 'batch_loss/train': 0.7234834914561361}
12/26/2021 05:27:15 - INFO - codeparrot_training - Step 24724: {'lr': 0.0002557293976483485, 'samples': 12659200, 'steps': 24724, 'batch_loss/train': 0.7598385578021407}
12/26/2021 05:27:28 - INFO - codeparrot_training - Step 24725: {'lr': 0.0002557136386428008, 'samples': 12659712, 'steps': 24725, 'batch_loss/train': 0.7903746226802468}
12/26/2021 05:27:38 - INFO - codeparrot_training - Step 24726: {'lr': 0.00025569787961453766, 'samples': 12660224, 'steps': 24726, 'batch_loss/train': 0.7755082487128675}
12/26/2021 05:27:49 - INFO - codeparrot_training - Step 24727: {'lr': 0.000255682120563622, 'samples': 12660736, 'steps': 24727, 'batch_loss/train': 0.7279764884151518}
12/26/2021 05:28:01 - INFO - codeparrot_training - Step 24728: {'lr': 0.00025566636149011637, 'samples': 12661248, 'steps': 24728, 'batch_loss/train': 0.827735613915138}
12/26/2021 05:28:12 - INFO - codeparrot_training - Step 24729: {'lr': 0.0002556506023940834, 'samples': 12661760, 'steps': 24729, 'batch_loss/train': 0.8544190553948283}
12/26/2021 05:28:22 - INFO - codeparrot_training - Step 24730: {'lr': 0.0002556348432755858, 'samples': 12662272, 'steps': 24730, 'batch_loss/train': 0.7259723236784339}
12/26/2021 05:28:35 - INFO - codeparrot_training - Step 24731: {'lr': 0.0002556190841346862, 'samples': 12662784, 'steps': 24731, 'batch_loss/train': 0.8697008546441793}
12/26/2021 05:28:46 - INFO - codeparrot_training - Step 24732: {'lr': 0.0002556033249714472, 'samples': 12663296, 'steps': 24732, 'batch_loss/train': 0.762541419826448}
12/26/2021 05:28:56 - INFO - codeparrot_training - Step 24733: {'lr': 0.0002555875657859315, 'samples': 12663808, 'steps': 24733, 'batch_loss/train': 0.7563776737079024}
12/26/2021 05:29:07 - INFO - codeparrot_training - Step 24734: {'lr': 0.0002555718065782017, 'samples': 12664320, 'steps': 24734, 'batch_loss/train': 0.7660170895978808}
12/26/2021 05:29:19 - INFO - codeparrot_training - Step 24735: {'lr': 0.0002555560473483207, 'samples': 12664832, 'steps': 24735, 'batch_loss/train': 0.8463534507900476}
12/26/2021 05:29:30 - INFO - codeparrot_training - Step 24736: {'lr': 0.0002555402880963507, 'samples': 12665344, 'steps': 24736, 'batch_loss/train': 0.7797690229490399}
12/26/2021 05:29:40 - INFO - codeparrot_training - Step 24737: {'lr': 0.0002555245288223547, 'samples': 12665856, 'steps': 24737, 'batch_loss/train': 0.7230483195744455}
12/26/2021 05:29:52 - INFO - codeparrot_training - Step 24738: {'lr': 0.00025550876952639534, 'samples': 12666368, 'steps': 24738, 'batch_loss/train': 0.7379548056051135}
12/26/2021 05:30:03 - INFO - codeparrot_training - Step 24739: {'lr': 0.00025549301020853506, 'samples': 12666880, 'steps': 24739, 'batch_loss/train': 0.8074141268152744}
12/26/2021 05:30:13 - INFO - codeparrot_training - Step 24740: {'lr': 0.00025547725086883677, 'samples': 12667392, 'steps': 24740, 'batch_loss/train': 0.6755846857558936}
12/26/2021 05:30:27 - INFO - codeparrot_training - Step 24741: {'lr': 0.0002554614915073629, 'samples': 12667904, 'steps': 24741, 'batch_loss/train': 0.797629501670599}
12/26/2021 05:30:37 - INFO - codeparrot_training - Step 24742: {'lr': 0.00025544573212417626, 'samples': 12668416, 'steps': 24742, 'batch_loss/train': 0.8673700029030442}
12/26/2021 05:30:48 - INFO - codeparrot_training - Step 24743: {'lr': 0.0002554299727193394, 'samples': 12668928, 'steps': 24743, 'batch_loss/train': 0.7977843410335481}
12/26/2021 05:30:58 - INFO - codeparrot_training - Step 24744: {'lr': 0.00025541421329291515, 'samples': 12669440, 'steps': 24744, 'batch_loss/train': 0.6634957320056856}
12/26/2021 05:31:11 - INFO - codeparrot_training - Step 24745: {'lr': 0.0002553984538449659, 'samples': 12669952, 'steps': 24745, 'batch_loss/train': 0.49711026530712843}
12/26/2021 05:31:21 - INFO - codeparrot_training - Step 24746: {'lr': 0.00025538269437555447, 'samples': 12670464, 'steps': 24746, 'batch_loss/train': 0.7981108175590634}
12/26/2021 05:31:32 - INFO - codeparrot_training - Step 24747: {'lr': 0.00025536693488474344, 'samples': 12670976, 'steps': 24747, 'batch_loss/train': 0.78650545142591}
12/26/2021 05:31:45 - INFO - codeparrot_training - Step 24748: {'lr': 0.00025535117537259553, 'samples': 12671488, 'steps': 24748, 'batch_loss/train': 0.738091747276485}
12/26/2021 05:31:55 - INFO - codeparrot_training - Step 24749: {'lr': 0.0002553354158391734, 'samples': 12672000, 'steps': 24749, 'batch_loss/train': 0.8319613989442587}
12/26/2021 05:32:06 - INFO - codeparrot_training - Step 24750: {'lr': 0.0002553196562845396, 'samples': 12672512, 'steps': 24750, 'batch_loss/train': 0.753526259213686}
12/26/2021 05:32:18 - INFO - codeparrot_training - Step 24751: {'lr': 0.00025530389670875696, 'samples': 12673024, 'steps': 24751, 'batch_loss/train': 0.7988036307506263}
12/26/2021 05:32:29 - INFO - codeparrot_training - Step 24752: {'lr': 0.0002552881371118879, 'samples': 12673536, 'steps': 24752, 'batch_loss/train': 0.818331210874021}
12/26/2021 05:32:39 - INFO - codeparrot_training - Step 24753: {'lr': 0.00025527237749399534, 'samples': 12674048, 'steps': 24753, 'batch_loss/train': 0.655671872664243}
12/26/2021 05:32:50 - INFO - codeparrot_training - Step 24754: {'lr': 0.00025525661785514176, 'samples': 12674560, 'steps': 24754, 'batch_loss/train': 0.8211173843592405}
12/26/2021 05:33:03 - INFO - codeparrot_training - Step 24755: {'lr': 0.00025524085819538975, 'samples': 12675072, 'steps': 24755, 'batch_loss/train': 1.2091620871797204}
12/26/2021 05:33:13 - INFO - codeparrot_training - Step 24756: {'lr': 0.0002552250985148022, 'samples': 12675584, 'steps': 24756, 'batch_loss/train': 0.7218361678533256}
12/26/2021 05:33:24 - INFO - codeparrot_training - Step 24757: {'lr': 0.0002552093388134416, 'samples': 12676096, 'steps': 24757, 'batch_loss/train': 0.824179801158607}
12/26/2021 05:33:37 - INFO - codeparrot_training - Step 24758: {'lr': 0.00025519357909137063, 'samples': 12676608, 'steps': 24758, 'batch_loss/train': 0.794149967841804}
12/26/2021 05:33:47 - INFO - codeparrot_training - Step 24759: {'lr': 0.0002551778193486519, 'samples': 12677120, 'steps': 24759, 'batch_loss/train': 0.7065903437323868}
12/26/2021 05:33:58 - INFO - codeparrot_training - Step 24760: {'lr': 0.0002551620595853482, 'samples': 12677632, 'steps': 24760, 'batch_loss/train': 0.8053747424855828}
12/26/2021 05:34:09 - INFO - codeparrot_training - Step 24761: {'lr': 0.000255146299801522, 'samples': 12678144, 'steps': 24761, 'batch_loss/train': 0.7969287196174264}
12/26/2021 05:34:21 - INFO - codeparrot_training - Step 24762: {'lr': 0.00025513053999723606, 'samples': 12678656, 'steps': 24762, 'batch_loss/train': 0.713251169770956}
12/26/2021 05:34:31 - INFO - codeparrot_training - Step 24763: {'lr': 0.00025511478017255305, 'samples': 12679168, 'steps': 24763, 'batch_loss/train': 0.8374856663867831}
12/26/2021 05:34:42 - INFO - codeparrot_training - Step 24764: {'lr': 0.0002550990203275357, 'samples': 12679680, 'steps': 24764, 'batch_loss/train': 0.7786852568387985}
12/26/2021 05:34:54 - INFO - codeparrot_training - Step 24765: {'lr': 0.00025508326046224647, 'samples': 12680192, 'steps': 24765, 'batch_loss/train': 0.8574534356594086}
12/26/2021 05:35:05 - INFO - codeparrot_training - Step 24766: {'lr': 0.0002550675005767482, 'samples': 12680704, 'steps': 24766, 'batch_loss/train': 0.8978059841319919}
12/26/2021 05:35:15 - INFO - codeparrot_training - Step 24767: {'lr': 0.0002550517406711034, 'samples': 12681216, 'steps': 24767, 'batch_loss/train': 0.7548536700196564}
12/26/2021 05:35:28 - INFO - codeparrot_training - Step 24768: {'lr': 0.00025503598074537475, 'samples': 12681728, 'steps': 24768, 'batch_loss/train': 1.0225301885511726}
12/26/2021 05:35:39 - INFO - codeparrot_training - Step 24769: {'lr': 0.000255020220799625, 'samples': 12682240, 'steps': 24769, 'batch_loss/train': 0.7670152187347412}
12/26/2021 05:35:49 - INFO - codeparrot_training - Step 24770: {'lr': 0.0002550044608339168, 'samples': 12682752, 'steps': 24770, 'batch_loss/train': 0.7833386156708002}
12/26/2021 05:36:01 - INFO - codeparrot_training - Step 24771: {'lr': 0.0002549887008483126, 'samples': 12683264, 'steps': 24771, 'batch_loss/train': 0.7815100997686386}
12/26/2021 05:36:12 - INFO - codeparrot_training - Step 24772: {'lr': 0.0002549729408428753, 'samples': 12683776, 'steps': 24772, 'batch_loss/train': 0.786096353083849}
12/26/2021 05:36:23 - INFO - codeparrot_training - Step 24773: {'lr': 0.0002549571808176675, 'samples': 12684288, 'steps': 24773, 'batch_loss/train': 0.7283882130868733}
12/26/2021 05:36:33 - INFO - codeparrot_training - Step 24774: {'lr': 0.0002549414207727517, 'samples': 12684800, 'steps': 24774, 'batch_loss/train': 0.7651997716166079}
12/26/2021 05:36:46 - INFO - codeparrot_training - Step 24775: {'lr': 0.00025492566070819077, 'samples': 12685312, 'steps': 24775, 'batch_loss/train': 0.819559252820909}
12/26/2021 05:36:56 - INFO - codeparrot_training - Step 24776: {'lr': 0.0002549099006240473, 'samples': 12685824, 'steps': 24776, 'batch_loss/train': 0.7394249765202403}
12/26/2021 05:37:07 - INFO - codeparrot_training - Step 24777: {'lr': 0.0002548941405203839, 'samples': 12686336, 'steps': 24777, 'batch_loss/train': 0.8865533801726997}
12/26/2021 05:37:19 - INFO - codeparrot_training - Step 24778: {'lr': 0.00025487838039726316, 'samples': 12686848, 'steps': 24778, 'batch_loss/train': 0.70123983733356}
12/26/2021 05:37:30 - INFO - codeparrot_training - Step 24779: {'lr': 0.0002548626202547479, 'samples': 12687360, 'steps': 24779, 'batch_loss/train': 1.012038927525282}
12/26/2021 05:37:41 - INFO - codeparrot_training - Step 24780: {'lr': 0.0002548468600929007, 'samples': 12687872, 'steps': 24780, 'batch_loss/train': 0.9954049079678953}
12/26/2021 05:37:51 - INFO - codeparrot_training - Step 24781: {'lr': 0.0002548310999117842, 'samples': 12688384, 'steps': 24781, 'batch_loss/train': 0.8005169504322112}
12/26/2021 05:38:04 - INFO - codeparrot_training - Step 24782: {'lr': 0.0002548153397114611, 'samples': 12688896, 'steps': 24782, 'batch_loss/train': 0.7025910969241522}
12/26/2021 05:38:15 - INFO - codeparrot_training - Step 24783: {'lr': 0.000254799579491994, 'samples': 12689408, 'steps': 24783, 'batch_loss/train': 0.7090844810009003}
12/26/2021 05:38:25 - INFO - codeparrot_training - Step 24784: {'lr': 0.00025478381925344554, 'samples': 12689920, 'steps': 24784, 'batch_loss/train': 0.6746053448878229}
12/26/2021 05:38:37 - INFO - codeparrot_training - Step 24785: {'lr': 0.0002547680589958785, 'samples': 12690432, 'steps': 24785, 'batch_loss/train': 0.6542738055577502}
12/26/2021 05:38:48 - INFO - codeparrot_training - Step 24786: {'lr': 0.0002547522987193554, 'samples': 12690944, 'steps': 24786, 'batch_loss/train': 0.5534851890988648}
12/26/2021 05:38:59 - INFO - codeparrot_training - Step 24787: {'lr': 0.00025473653842393893, 'samples': 12691456, 'steps': 24787, 'batch_loss/train': 0.9180043209344149}
12/26/2021 05:39:11 - INFO - codeparrot_training - Step 24788: {'lr': 0.00025472077810969186, 'samples': 12691968, 'steps': 24788, 'batch_loss/train': 0.6623535859398544}
12/26/2021 05:39:22 - INFO - codeparrot_training - Step 24789: {'lr': 0.0002547050177766767, 'samples': 12692480, 'steps': 24789, 'batch_loss/train': 0.7546729068271816}
12/26/2021 05:39:32 - INFO - codeparrot_training - Step 24790: {'lr': 0.00025468925742495624, 'samples': 12692992, 'steps': 24790, 'batch_loss/train': 0.7421357975108549}
12/26/2021 05:39:43 - INFO - codeparrot_training - Step 24791: {'lr': 0.0002546734970545929, 'samples': 12693504, 'steps': 24791, 'batch_loss/train': 0.7835542320972309}
12/26/2021 05:39:55 - INFO - codeparrot_training - Step 24792: {'lr': 0.0002546577366656497, 'samples': 12694016, 'steps': 24792, 'batch_loss/train': 0.7549600312486291}
12/26/2021 05:40:06 - INFO - codeparrot_training - Step 24793: {'lr': 0.000254641976258189, 'samples': 12694528, 'steps': 24793, 'batch_loss/train': 0.744335220195353}
12/26/2021 05:40:16 - INFO - codeparrot_training - Step 24794: {'lr': 0.00025462621583227353, 'samples': 12695040, 'steps': 24794, 'batch_loss/train': 0.7348001729696989}
12/26/2021 05:40:29 - INFO - codeparrot_training - Step 24795: {'lr': 0.00025461045538796595, 'samples': 12695552, 'steps': 24795, 'batch_loss/train': 0.8275785590521991}
12/26/2021 05:40:39 - INFO - codeparrot_training - Step 24796: {'lr': 0.00025459469492532905, 'samples': 12696064, 'steps': 24796, 'batch_loss/train': 0.7030670889653265}
12/26/2021 05:40:50 - INFO - codeparrot_training - Step 24797: {'lr': 0.00025457893444442525, 'samples': 12696576, 'steps': 24797, 'batch_loss/train': 0.8175119045190513}
12/26/2021 05:41:03 - INFO - codeparrot_training - Step 24798: {'lr': 0.00025456317394531743, 'samples': 12697088, 'steps': 24798, 'batch_loss/train': 0.7298067624215037}
12/26/2021 05:41:13 - INFO - codeparrot_training - Step 24799: {'lr': 0.00025454741342806813, 'samples': 12697600, 'steps': 24799, 'batch_loss/train': 0.57056612073211}
12/26/2021 05:41:24 - INFO - codeparrot_training - Step 24800: {'lr': 0.00025453165289274004, 'samples': 12698112, 'steps': 24800, 'batch_loss/train': 0.6374956779764034}
12/26/2021 05:41:36 - INFO - codeparrot_training - Step 24801: {'lr': 0.0002545158923393958, 'samples': 12698624, 'steps': 24801, 'batch_loss/train': 0.7027817512862384}
12/26/2021 05:41:47 - INFO - codeparrot_training - Step 24802: {'lr': 0.00025450013176809807, 'samples': 12699136, 'steps': 24802, 'batch_loss/train': 0.665481727104634}
12/26/2021 05:41:57 - INFO - codeparrot_training - Step 24803: {'lr': 0.0002544843711789095, 'samples': 12699648, 'steps': 24803, 'batch_loss/train': 0.8422335525974631}
12/26/2021 05:42:08 - INFO - codeparrot_training - Step 24804: {'lr': 0.0002544686105718928, 'samples': 12700160, 'steps': 24804, 'batch_loss/train': 0.7167910803109407}
12/26/2021 05:42:20 - INFO - codeparrot_training - Step 24805: {'lr': 0.0002544528499471106, 'samples': 12700672, 'steps': 24805, 'batch_loss/train': 0.8025215687230229}
12/26/2021 05:42:31 - INFO - codeparrot_training - Step 24806: {'lr': 0.0002544370893046255, 'samples': 12701184, 'steps': 24806, 'batch_loss/train': 0.7583416078705341}
12/26/2021 05:42:41 - INFO - codeparrot_training - Step 24807: {'lr': 0.00025442132864450024, 'samples': 12701696, 'steps': 24807, 'batch_loss/train': 0.8098235905636102}
12/26/2021 05:42:53 - INFO - codeparrot_training - Step 24808: {'lr': 0.00025440556796679743, 'samples': 12702208, 'steps': 24808, 'batch_loss/train': 0.8956469330005348}
12/26/2021 05:43:04 - INFO - codeparrot_training - Step 24809: {'lr': 0.00025438980727157976, 'samples': 12702720, 'steps': 24809, 'batch_loss/train': 0.74257457931526}
12/26/2021 05:43:15 - INFO - codeparrot_training - Step 24810: {'lr': 0.00025437404655890986, 'samples': 12703232, 'steps': 24810, 'batch_loss/train': 0.8089384841732681}
12/26/2021 05:43:27 - INFO - codeparrot_training - Step 24811: {'lr': 0.0002543582858288504, 'samples': 12703744, 'steps': 24811, 'batch_loss/train': 0.6931500188075006}
12/26/2021 05:43:38 - INFO - codeparrot_training - Step 24812: {'lr': 0.0002543425250814641, 'samples': 12704256, 'steps': 24812, 'batch_loss/train': 0.8336990200914443}
12/26/2021 05:43:49 - INFO - codeparrot_training - Step 24813: {'lr': 0.0002543267643168135, 'samples': 12704768, 'steps': 24813, 'batch_loss/train': 0.7499790098518133}
12/26/2021 05:43:59 - INFO - codeparrot_training - Step 24814: {'lr': 0.00025431100353496134, 'samples': 12705280, 'steps': 24814, 'batch_loss/train': 0.6259290443267673}
12/26/2021 05:44:11 - INFO - codeparrot_training - Step 24815: {'lr': 0.0002542952427359702, 'samples': 12705792, 'steps': 24815, 'batch_loss/train': 0.8916226299479604}
12/26/2021 05:44:22 - INFO - codeparrot_training - Step 24816: {'lr': 0.00025427948191990285, 'samples': 12706304, 'steps': 24816, 'batch_loss/train': 0.8152557210996747}
12/26/2021 05:44:32 - INFO - codeparrot_training - Step 24817: {'lr': 0.0002542637210868219, 'samples': 12706816, 'steps': 24817, 'batch_loss/train': 0.8335425960831344}
12/26/2021 05:44:46 - INFO - codeparrot_training - Step 24818: {'lr': 0.00025424796023679, 'samples': 12707328, 'steps': 24818, 'batch_loss/train': 0.770266481442377}
12/26/2021 05:44:56 - INFO - codeparrot_training - Step 24819: {'lr': 0.00025423219936986986, 'samples': 12707840, 'steps': 24819, 'batch_loss/train': 0.7866546767763793}
12/26/2021 05:45:07 - INFO - codeparrot_training - Step 24820: {'lr': 0.000254216438486124, 'samples': 12708352, 'steps': 24820, 'batch_loss/train': 0.8059912258759141}
12/26/2021 05:45:19 - INFO - codeparrot_training - Step 24821: {'lr': 0.0002542006775856152, 'samples': 12708864, 'steps': 24821, 'batch_loss/train': 0.632319662719965}
12/26/2021 05:45:30 - INFO - codeparrot_training - Step 24822: {'lr': 0.00025418491666840613, 'samples': 12709376, 'steps': 24822, 'batch_loss/train': 0.8364198748022318}
12/26/2021 05:45:40 - INFO - codeparrot_training - Step 24823: {'lr': 0.00025416915573455936, 'samples': 12709888, 'steps': 24823, 'batch_loss/train': 0.8469332968816161}
12/26/2021 05:45:51 - INFO - codeparrot_training - Step 24824: {'lr': 0.00025415339478413764, 'samples': 12710400, 'steps': 24824, 'batch_loss/train': 0.7710581766441464}
12/26/2021 05:46:04 - INFO - codeparrot_training - Step 24825: {'lr': 0.0002541376338172036, 'samples': 12710912, 'steps': 24825, 'batch_loss/train': 0.7759105265140533}
12/26/2021 05:46:14 - INFO - codeparrot_training - Step 24826: {'lr': 0.00025412187283381987, 'samples': 12711424, 'steps': 24826, 'batch_loss/train': 0.7531522125937045}
12/26/2021 05:46:25 - INFO - codeparrot_training - Step 24827: {'lr': 0.0002541061118340491, 'samples': 12711936, 'steps': 24827, 'batch_loss/train': 0.5193799533881247}
12/26/2021 05:46:38 - INFO - codeparrot_training - Step 24828: {'lr': 0.000254090350817954, 'samples': 12712448, 'steps': 24828, 'batch_loss/train': 0.8577850451692939}
12/26/2021 05:46:49 - INFO - codeparrot_training - Step 24829: {'lr': 0.0002540745897855972, 'samples': 12712960, 'steps': 24829, 'batch_loss/train': 0.72273777006194}
12/26/2021 05:46:59 - INFO - codeparrot_training - Step 24830: {'lr': 0.00025405882873704143, 'samples': 12713472, 'steps': 24830, 'batch_loss/train': 0.8764648716896772}
12/26/2021 05:47:10 - INFO - codeparrot_training - Step 24831: {'lr': 0.0002540430676723493, 'samples': 12713984, 'steps': 24831, 'batch_loss/train': 0.7850155821070075}
12/26/2021 05:47:22 - INFO - codeparrot_training - Step 24832: {'lr': 0.00025402730659158343, 'samples': 12714496, 'steps': 24832, 'batch_loss/train': 0.7559387876535766}
12/26/2021 05:47:33 - INFO - codeparrot_training - Step 24833: {'lr': 0.00025401154549480645, 'samples': 12715008, 'steps': 24833, 'batch_loss/train': 0.7516218551900238}
12/26/2021 05:47:43 - INFO - codeparrot_training - Step 24834: {'lr': 0.0002539957843820811, 'samples': 12715520, 'steps': 24834, 'batch_loss/train': 0.6683099751826376}
12/26/2021 05:47:56 - INFO - codeparrot_training - Step 24835: {'lr': 0.00025398002325347, 'samples': 12716032, 'steps': 24835, 'batch_loss/train': 0.8348655686713755}
12/26/2021 05:48:07 - INFO - codeparrot_training - Step 24836: {'lr': 0.0002539642621090359, 'samples': 12716544, 'steps': 24836, 'batch_loss/train': 0.7455194353824481}
12/26/2021 05:48:18 - INFO - codeparrot_training - Step 24837: {'lr': 0.00025394850094884137, 'samples': 12717056, 'steps': 24837, 'batch_loss/train': 0.7650479483418167}
12/26/2021 05:48:30 - INFO - codeparrot_training - Step 24838: {'lr': 0.0002539327397729491, 'samples': 12717568, 'steps': 24838, 'batch_loss/train': 0.8525168588384986}
12/26/2021 05:48:41 - INFO - codeparrot_training - Step 24839: {'lr': 0.00025391697858142165, 'samples': 12718080, 'steps': 24839, 'batch_loss/train': 0.8439169367775321}
12/26/2021 05:48:51 - INFO - codeparrot_training - Step 24840: {'lr': 0.00025390121737432195, 'samples': 12718592, 'steps': 24840, 'batch_loss/train': 0.6286340090446174}
12/26/2021 05:49:02 - INFO
- codeparrot_training - Step 24841: {'lr': 0.00025388545615171235, 'samples': 12719104, 'steps': 24841, 'batch_loss/train': 0.9734370978549123} 12/26/2021 05:49:14 - INFO - codeparrot_training - Step 24842: {'lr': 0.00025386969491365564, 'samples': 12719616, 'steps': 24842, 'batch_loss/train': 0.7834780147532001} 12/26/2021 05:49:24 - INFO - codeparrot_training - Step 24843: {'lr': 0.00025385393366021453, 'samples': 12720128, 'steps': 24843, 'batch_loss/train': 0.707332843914628} 12/26/2021 05:49:35 - INFO - codeparrot_training - Step 24844: {'lr': 0.00025383817239145164, 'samples': 12720640, 'steps': 24844, 'batch_loss/train': 0.8442684458568692} 12/26/2021 05:49:47 - INFO - codeparrot_training - Step 24845: {'lr': 0.0002538224111074296, 'samples': 12721152, 'steps': 24845, 'batch_loss/train': 0.8457775479182601} 12/26/2021 05:49:58 - INFO - codeparrot_training - Step 24846: {'lr': 0.0002538066498082111, 'samples': 12721664, 'steps': 24846, 'batch_loss/train': 0.7965913405641913} 12/26/2021 05:50:08 - INFO - codeparrot_training - Step 24847: {'lr': 0.0002537908884938588, 'samples': 12722176, 'steps': 24847, 'batch_loss/train': 0.764515726827085} 12/26/2021 05:50:21 - INFO - codeparrot_training - Step 24848: {'lr': 0.00025377512716443537, 'samples': 12722688, 'steps': 24848, 'batch_loss/train': 0.7434773377608508} 12/26/2021 05:50:32 - INFO - codeparrot_training - Step 24849: {'lr': 0.0002537593658200034, 'samples': 12723200, 'steps': 24849, 'batch_loss/train': 0.7742030834779143} 12/26/2021 05:50:42 - INFO - codeparrot_training - Step 24850: {'lr': 0.00025374360446062566, 'samples': 12723712, 'steps': 24850, 'batch_loss/train': 0.8531427709385753} 12/26/2021 05:50:53 - INFO - codeparrot_training - Step 24851: {'lr': 0.0002537278430863648, 'samples': 12724224, 'steps': 24851, 'batch_loss/train': 0.7071935213170946} 12/26/2021 05:51:05 - INFO - codeparrot_training - Step 24852: {'lr': 0.00025371208169728334, 'samples': 12724736, 'steps': 24852, 'batch_loss/train': 
0.6994953672401607} 12/26/2021 05:51:16 - INFO - codeparrot_training - Step 24853: {'lr': 0.0002536963202934442, 'samples': 12725248, 'steps': 24853, 'batch_loss/train': 0.8151614060625434} 12/26/2021 05:51:26 - INFO - codeparrot_training - Step 24854: {'lr': 0.0002536805588749098, 'samples': 12725760, 'steps': 24854, 'batch_loss/train': 0.7625072570517659} 12/26/2021 05:51:39 - INFO - codeparrot_training - Step 24855: {'lr': 0.0002536647974417428, 'samples': 12726272, 'steps': 24855, 'batch_loss/train': 0.789273485308513} 12/26/2021 05:51:49 - INFO - codeparrot_training - Step 24856: {'lr': 0.00025364903599400606, 'samples': 12726784, 'steps': 24856, 'batch_loss/train': 0.7327688289806247} 12/26/2021 05:52:00 - INFO - codeparrot_training - Step 24857: {'lr': 0.00025363327453176214, 'samples': 12727296, 'steps': 24857, 'batch_loss/train': 0.8243811214342713} 12/26/2021 05:52:13 - INFO - codeparrot_training - Step 24858: {'lr': 0.0002536175130550736, 'samples': 12727808, 'steps': 24858, 'batch_loss/train': 0.8242974560707808} 12/26/2021 05:52:23 - INFO - codeparrot_training - Step 24859: {'lr': 0.0002536017515640033, 'samples': 12728320, 'steps': 24859, 'batch_loss/train': 0.8168273975607008} 12/26/2021 05:52:34 - INFO - codeparrot_training - Step 24860: {'lr': 0.0002535859900586137, 'samples': 12728832, 'steps': 24860, 'batch_loss/train': 0.8572876937687397} 12/26/2021 05:52:44 - INFO - codeparrot_training - Step 24861: {'lr': 0.0002535702285389676, 'samples': 12729344, 'steps': 24861, 'batch_loss/train': 0.8749248571693897} 12/26/2021 05:52:57 - INFO - codeparrot_training - Step 24862: {'lr': 0.0002535544670051276, 'samples': 12729856, 'steps': 24862, 'batch_loss/train': 0.7821863740682602} 12/26/2021 05:53:07 - INFO - codeparrot_training - Step 24863: {'lr': 0.00025353870545715643, 'samples': 12730368, 'steps': 24863, 'batch_loss/train': 0.7601056310813874} 12/26/2021 05:53:18 - INFO - codeparrot_training - Step 24864: {'lr': 0.0002535229438951167, 'samples': 
12730880, 'steps': 24864, 'batch_loss/train': 0.7015132133383304} 12/26/2021 05:53:31 - INFO - codeparrot_training - Step 24865: {'lr': 0.00025350718231907107, 'samples': 12731392, 'steps': 24865, 'batch_loss/train': 0.7504838709719479} 12/26/2021 05:53:41 - INFO - codeparrot_training - Step 24866: {'lr': 0.0002534914207290822, 'samples': 12731904, 'steps': 24866, 'batch_loss/train': 0.7471401863731444} 12/26/2021 05:53:52 - INFO - codeparrot_training - Step 24867: {'lr': 0.00025347565912521276, 'samples': 12732416, 'steps': 24867, 'batch_loss/train': 0.7502192946267314} 12/26/2021 05:54:04 - INFO - codeparrot_training - Step 24868: {'lr': 0.00025345989750752537, 'samples': 12732928, 'steps': 24868, 'batch_loss/train': 0.6881888145580888} 12/26/2021 05:54:15 - INFO - codeparrot_training - Step 24869: {'lr': 0.0002534441358760827, 'samples': 12733440, 'steps': 24869, 'batch_loss/train': 0.7167257564142346} 12/26/2021 05:54:25 - INFO - codeparrot_training - Step 24870: {'lr': 0.0002534283742309476, 'samples': 12733952, 'steps': 24870, 'batch_loss/train': 0.7470928034745157} 12/26/2021 05:54:36 - INFO - codeparrot_training - Step 24871: {'lr': 0.0002534126125721825, 'samples': 12734464, 'steps': 24871, 'batch_loss/train': 0.8065883535891771} 12/26/2021 05:54:48 - INFO - codeparrot_training - Step 24872: {'lr': 0.00025339685089985006, 'samples': 12734976, 'steps': 24872, 'batch_loss/train': 0.7389662237837911} 12/26/2021 05:54:58 - INFO - codeparrot_training - Step 24873: {'lr': 0.00025338108921401314, 'samples': 12735488, 'steps': 24873, 'batch_loss/train': 0.802092912606895} 12/26/2021 05:55:09 - INFO - codeparrot_training - Step 24874: {'lr': 0.0002533653275147342, 'samples': 12736000, 'steps': 24874, 'batch_loss/train': 0.757572234608233} 12/26/2021 05:55:22 - INFO - codeparrot_training - Step 24875: {'lr': 0.0002533495658020761, 'samples': 12736512, 'steps': 24875, 'batch_loss/train': 0.7643684624927118} 12/26/2021 05:55:33 - INFO - codeparrot_training - Step 
24876: {'lr': 0.0002533338040761013, 'samples': 12737024, 'steps': 24876, 'batch_loss/train': 0.8258011350408196} 12/26/2021 05:55:43 - INFO - codeparrot_training - Step 24877: {'lr': 0.0002533180423368726, 'samples': 12737536, 'steps': 24877, 'batch_loss/train': 0.7520536052761599} 12/26/2021 05:55:55 - INFO - codeparrot_training - Step 24878: {'lr': 0.00025330228058445254, 'samples': 12738048, 'steps': 24878, 'batch_loss/train': 0.8195007985923439} 12/26/2021 05:56:06 - INFO - codeparrot_training - Step 24879: {'lr': 0.00025328651881890394, 'samples': 12738560, 'steps': 24879, 'batch_loss/train': 0.7087985116522759} 12/26/2021 05:56:17 - INFO - codeparrot_training - Step 24880: {'lr': 0.00025327075704028934, 'samples': 12739072, 'steps': 24880, 'batch_loss/train': 0.759241652674973} 12/26/2021 05:56:29 - INFO - codeparrot_training - Step 24881: {'lr': 0.00025325499524867147, 'samples': 12739584, 'steps': 24881, 'batch_loss/train': 0.7818289627321064} 12/26/2021 05:56:39 - INFO - codeparrot_training - Step 24882: {'lr': 0.00025323923344411294, 'samples': 12740096, 'steps': 24882, 'batch_loss/train': 0.7774083404801786} 12/26/2021 05:56:50 - INFO - codeparrot_training - Step 24883: {'lr': 0.00025322347162667647, 'samples': 12740608, 'steps': 24883, 'batch_loss/train': 0.7756681148894131} 12/26/2021 05:57:01 - INFO - codeparrot_training - Step 24884: {'lr': 0.0002532077097964246, 'samples': 12741120, 'steps': 24884, 'batch_loss/train': 0.8423928088741377} 12/26/2021 05:57:13 - INFO - codeparrot_training - Step 24885: {'lr': 0.0002531919479534202, 'samples': 12741632, 'steps': 24885, 'batch_loss/train': 0.7067006956785917} 12/26/2021 05:57:23 - INFO - codeparrot_training - Step 24886: {'lr': 0.0002531761860977258, 'samples': 12742144, 'steps': 24886, 'batch_loss/train': 0.732774970587343} 12/26/2021 05:57:34 - INFO - codeparrot_training - Step 24887: {'lr': 0.0002531604242294041, 'samples': 12742656, 'steps': 24887, 'batch_loss/train': 0.8360012983903289} 12/26/2021 
05:57:47 - INFO - codeparrot_training - Step 24888: {'lr': 0.0002531446623485177, 'samples': 12743168, 'steps': 24888, 'batch_loss/train': 0.825798858422786} 12/26/2021 05:57:58 - INFO - codeparrot_training - Step 24889: {'lr': 0.00025312890045512933, 'samples': 12743680, 'steps': 24889, 'batch_loss/train': 0.8748963233083487} 12/26/2021 05:58:08 - INFO - codeparrot_training - Step 24890: {'lr': 0.00025311313854930166, 'samples': 12744192, 'steps': 24890, 'batch_loss/train': 0.6782165131298825} 12/26/2021 05:58:20 - INFO - codeparrot_training - Step 24891: {'lr': 0.0002530973766310972, 'samples': 12744704, 'steps': 24891, 'batch_loss/train': 0.7861191406846046} 12/26/2021 05:58:31 - INFO - codeparrot_training - Step 24892: {'lr': 0.0002530816147005789, 'samples': 12745216, 'steps': 24892, 'batch_loss/train': 0.7295623617246747} 12/26/2021 05:58:42 - INFO - codeparrot_training - Step 24893: {'lr': 0.0002530658527578092, 'samples': 12745728, 'steps': 24893, 'batch_loss/train': 0.7584611852653325} 12/26/2021 05:58:52 - INFO - codeparrot_training - Step 24894: {'lr': 0.0002530500908028508, 'samples': 12746240, 'steps': 24894, 'batch_loss/train': 0.7412572540342808} 12/26/2021 05:59:05 - INFO - codeparrot_training - Step 24895: {'lr': 0.0002530343288357664, 'samples': 12746752, 'steps': 24895, 'batch_loss/train': 0.8282511346042156} 12/26/2021 05:59:16 - INFO - codeparrot_training - Step 24896: {'lr': 0.0002530185668566187, 'samples': 12747264, 'steps': 24896, 'batch_loss/train': 0.8184828478842974} 12/26/2021 05:59:26 - INFO - codeparrot_training - Step 24897: {'lr': 0.0002530028048654703, 'samples': 12747776, 'steps': 24897, 'batch_loss/train': 0.8400929868221283} 12/26/2021 05:59:39 - INFO - codeparrot_training - Step 24898: {'lr': 0.0002529870428623838, 'samples': 12748288, 'steps': 24898, 'batch_loss/train': 0.7661148821935058} 12/26/2021 05:59:50 - INFO - codeparrot_training - Step 24899: {'lr': 0.0002529712808474221, 'samples': 12748800, 'steps': 24899, 
'batch_loss/train': 0.7959339926019311} 12/26/2021 06:00:00 - INFO - codeparrot_training - Step 24900: {'lr': 0.0002529555188206476, 'samples': 12749312, 'steps': 24900, 'batch_loss/train': 0.7462967769242823} 12/26/2021 06:00:11 - INFO - codeparrot_training - Step 24901: {'lr': 0.0002529397567821231, 'samples': 12749824, 'steps': 24901, 'batch_loss/train': 0.9048593118786812} 12/26/2021 06:00:23 - INFO - codeparrot_training - Step 24902: {'lr': 0.00025292399473191127, 'samples': 12750336, 'steps': 24902, 'batch_loss/train': 0.6108003952540457} 12/26/2021 06:00:34 - INFO - codeparrot_training - Step 24903: {'lr': 0.00025290823267007476, 'samples': 12750848, 'steps': 24903, 'batch_loss/train': 0.81755590159446} 12/26/2021 06:00:45 - INFO - codeparrot_training - Step 24904: {'lr': 0.0002528924705966761, 'samples': 12751360, 'steps': 24904, 'batch_loss/train': 0.8680335852550343} 12/26/2021 06:00:57 - INFO - codeparrot_training - Step 24905: {'lr': 0.00025287670851177813, 'samples': 12751872, 'steps': 24905, 'batch_loss/train': 0.87167979311198} 12/26/2021 06:01:08 - INFO - codeparrot_training - Step 24906: {'lr': 0.00025286094641544345, 'samples': 12752384, 'steps': 24906, 'batch_loss/train': 0.8352724257856607} 12/26/2021 06:01:18 - INFO - codeparrot_training - Step 24907: {'lr': 0.00025284518430773473, 'samples': 12752896, 'steps': 24907, 'batch_loss/train': 0.7800577040761709} 12/26/2021 06:01:31 - INFO - codeparrot_training - Step 24908: {'lr': 0.00025282942218871464, 'samples': 12753408, 'steps': 24908, 'batch_loss/train': 0.8316229488700628} 12/26/2021 06:01:41 - INFO - codeparrot_training - Step 24909: {'lr': 0.0002528136600584458, 'samples': 12753920, 'steps': 24909, 'batch_loss/train': 0.7849643416702747} 12/26/2021 06:01:52 - INFO - codeparrot_training - Step 24910: {'lr': 0.00025279789791699095, 'samples': 12754432, 'steps': 24910, 'batch_loss/train': 0.7660470437258482} 12/26/2021 06:02:02 - INFO - codeparrot_training - Step 24911: {'lr': 
0.0002527821357644127, 'samples': 12754944, 'steps': 24911, 'batch_loss/train': 0.7566336310119368} 12/26/2021 06:02:15 - INFO - codeparrot_training - Step 24912: {'lr': 0.00025276637360077377, 'samples': 12755456, 'steps': 24912, 'batch_loss/train': 0.781073790974915} 12/26/2021 06:02:25 - INFO - codeparrot_training - Step 24913: {'lr': 0.0002527506114261367, 'samples': 12755968, 'steps': 24913, 'batch_loss/train': 0.8079828722402453} 12/26/2021 06:02:36 - INFO - codeparrot_training - Step 24914: {'lr': 0.0002527348492405643, 'samples': 12756480, 'steps': 24914, 'batch_loss/train': 0.7559020039625466} 12/26/2021 06:02:49 - INFO - codeparrot_training - Step 24915: {'lr': 0.0002527190870441192, 'samples': 12756992, 'steps': 24915, 'batch_loss/train': 0.792154278839007} 12/26/2021 06:02:59 - INFO - codeparrot_training - Step 24916: {'lr': 0.00025270332483686396, 'samples': 12757504, 'steps': 24916, 'batch_loss/train': 0.7274012072011828} 12/26/2021 06:03:10 - INFO - codeparrot_training - Step 24917: {'lr': 0.00025268756261886136, 'samples': 12758016, 'steps': 24917, 'batch_loss/train': 0.8191934069618583} 12/26/2021 06:03:22 - INFO - codeparrot_training - Step 24918: {'lr': 0.000252671800390174, 'samples': 12758528, 'steps': 24918, 'batch_loss/train': 0.8026571208611131} 12/26/2021 06:03:32 - INFO - codeparrot_training - Step 24919: {'lr': 0.00025265603815086455, 'samples': 12759040, 'steps': 24919, 'batch_loss/train': 0.7854236150160432} 12/26/2021 06:03:43 - INFO - codeparrot_training - Step 24920: {'lr': 0.00025264027590099573, 'samples': 12759552, 'steps': 24920, 'batch_loss/train': 0.8413013285025954} 12/26/2021 06:03:54 - INFO - codeparrot_training - Step 24921: {'lr': 0.0002526245136406302, 'samples': 12760064, 'steps': 24921, 'batch_loss/train': 0.7875333988340572} 12/26/2021 06:04:06 - INFO - codeparrot_training - Step 24922: {'lr': 0.0002526087513698306, 'samples': 12760576, 'steps': 24922, 'batch_loss/train': 0.9172204099595547} 12/26/2021 06:04:17 - INFO 
- codeparrot_training - Step 24923: {'lr': 0.0002525929890886595, 'samples': 12761088, 'steps': 24923, 'batch_loss/train': 0.7568469538819045} 12/26/2021 06:04:27 - INFO - codeparrot_training - Step 24924: {'lr': 0.0002525772267971797, 'samples': 12761600, 'steps': 24924, 'batch_loss/train': 0.7513985894620419} 12/26/2021 06:04:40 - INFO - codeparrot_training - Step 24925: {'lr': 0.0002525614644954539, 'samples': 12762112, 'steps': 24925, 'batch_loss/train': 1.0861274019116536} 12/26/2021 06:04:50 - INFO - codeparrot_training - Step 24926: {'lr': 0.0002525457021835446, 'samples': 12762624, 'steps': 24926, 'batch_loss/train': 0.7734331274405122} 12/26/2021 06:05:01 - INFO - codeparrot_training - Step 24927: {'lr': 0.00025252993986151455, 'samples': 12763136, 'steps': 24927, 'batch_loss/train': 0.7920874648261815} 12/26/2021 06:05:14 - INFO - codeparrot_training - Step 24928: {'lr': 0.00025251417752942645, 'samples': 12763648, 'steps': 24928, 'batch_loss/train': 0.7129593743011355} 12/26/2021 06:05:24 - INFO - codeparrot_training - Step 24929: {'lr': 0.0002524984151873429, 'samples': 12764160, 'steps': 24929, 'batch_loss/train': 0.7809911179356277} 12/26/2021 06:05:35 - INFO - codeparrot_training - Step 24930: {'lr': 0.00025248265283532654, 'samples': 12764672, 'steps': 24930, 'batch_loss/train': 0.8087263517081738} 12/26/2021 06:05:46 - INFO - codeparrot_training - Step 24931: {'lr': 0.00025246689047344023, 'samples': 12765184, 'steps': 24931, 'batch_loss/train': 0.8801111923530698} 12/26/2021 06:05:58 - INFO - codeparrot_training - Step 24932: {'lr': 0.00025245112810174644, 'samples': 12765696, 'steps': 24932, 'batch_loss/train': 0.7760742919053882} 12/26/2021 06:06:08 - INFO - codeparrot_training - Step 24933: {'lr': 0.00025243536572030784, 'samples': 12766208, 'steps': 24933, 'batch_loss/train': 0.7522114105522633} 12/26/2021 06:06:19 - INFO - codeparrot_training - Step 24934: {'lr': 0.00025241960332918716, 'samples': 12766720, 'steps': 24934, 'batch_loss/train': 
0.7440211274661124} 12/26/2021 06:06:32 - INFO - codeparrot_training - Step 24935: {'lr': 0.0002524038409284471, 'samples': 12767232, 'steps': 24935, 'batch_loss/train': 0.7764529390260577} 12/26/2021 06:06:42 - INFO - codeparrot_training - Step 24936: {'lr': 0.0002523880785181502, 'samples': 12767744, 'steps': 24936, 'batch_loss/train': 0.8382568196393549} 12/26/2021 06:06:53 - INFO - codeparrot_training - Step 24937: {'lr': 0.00025237231609835924, 'samples': 12768256, 'steps': 24937, 'batch_loss/train': 0.8763061226345599} 12/26/2021 06:07:05 - INFO - codeparrot_training - Step 24938: {'lr': 0.0002523565536691369, 'samples': 12768768, 'steps': 24938, 'batch_loss/train': 0.7934449673630297} 12/26/2021 06:07:16 - INFO - codeparrot_training - Step 24939: {'lr': 0.00025234079123054575, 'samples': 12769280, 'steps': 24939, 'batch_loss/train': 0.6966619701124728} 12/26/2021 06:07:27 - INFO - codeparrot_training - Step 24940: {'lr': 0.00025232502878264843, 'samples': 12769792, 'steps': 24940, 'batch_loss/train': 0.8250888055190444} 12/26/2021 06:07:37 - INFO - codeparrot_training - Step 24941: {'lr': 0.00025230926632550775, 'samples': 12770304, 'steps': 24941, 'batch_loss/train': 0.7745797107927501} 12/26/2021 06:07:49 - INFO - codeparrot_training - Step 24942: {'lr': 0.0002522935038591863, 'samples': 12770816, 'steps': 24942, 'batch_loss/train': 0.7593968524597585} 12/26/2021 06:08:00 - INFO - codeparrot_training - Step 24943: {'lr': 0.0002522777413837467, 'samples': 12771328, 'steps': 24943, 'batch_loss/train': 0.7549207163974643} 12/26/2021 06:08:10 - INFO - codeparrot_training - Step 24944: {'lr': 0.00025226197889925174, 'samples': 12771840, 'steps': 24944, 'batch_loss/train': 0.828864075243473} 12/26/2021 06:08:23 - INFO - codeparrot_training - Step 24945: {'lr': 0.00025224621640576394, 'samples': 12772352, 'steps': 24945, 'batch_loss/train': 0.8526098616421223} 12/26/2021 06:08:34 - INFO - codeparrot_training - Step 24946: {'lr': 0.000252230453903346, 'samples': 
12772864, 'steps': 24946, 'batch_loss/train': 0.8371884357184172} 12/26/2021 06:08:44 - INFO - codeparrot_training - Step 24947: {'lr': 0.0002522146913920607, 'samples': 12773376, 'steps': 24947, 'batch_loss/train': 0.8664449695497751} 12/26/2021 06:08:57 - INFO - codeparrot_training - Step 24948: {'lr': 0.0002521989288719706, 'samples': 12773888, 'steps': 24948, 'batch_loss/train': 0.9425729105714709} 12/26/2021 06:09:07 - INFO - codeparrot_training - Step 24949: {'lr': 0.0002521831663431384, 'samples': 12774400, 'steps': 24949, 'batch_loss/train': 1.1620691744610667} 12/26/2021 06:09:18 - INFO - codeparrot_training - Step 24950: {'lr': 0.0002521674038056267, 'samples': 12774912, 'steps': 24950, 'batch_loss/train': 0.7355386849376373} 12/26/2021 06:09:31 - INFO - codeparrot_training - Step 24951: {'lr': 0.00025215164125949824, 'samples': 12775424, 'steps': 24951, 'batch_loss/train': 0.7205757927149534} 12/26/2021 06:09:41 - INFO - codeparrot_training - Step 24952: {'lr': 0.0002521358787048156, 'samples': 12775936, 'steps': 24952, 'batch_loss/train': 0.7853892892599106} 12/26/2021 06:09:52 - INFO - codeparrot_training - Step 24953: {'lr': 0.00025212011614164165, 'samples': 12776448, 'steps': 24953, 'batch_loss/train': 0.7172204428352416} 12/26/2021 06:10:03 - INFO - codeparrot_training - Step 24954: {'lr': 0.00025210435357003887, 'samples': 12776960, 'steps': 24954, 'batch_loss/train': 0.714069064706564} 12/26/2021 06:10:15 - INFO - codeparrot_training - Step 24955: {'lr': 0.00025208859099006995, 'samples': 12777472, 'steps': 24955, 'batch_loss/train': 0.7088814722374082} 12/26/2021 06:10:26 - INFO - codeparrot_training - Step 24956: {'lr': 0.0002520728284017976, 'samples': 12777984, 'steps': 24956, 'batch_loss/train': 1.079192447476089} 12/26/2021 06:10:36 - INFO - codeparrot_training - Step 24957: {'lr': 0.00025205706580528454, 'samples': 12778496, 'steps': 24957, 'batch_loss/train': 0.7800593730062246} 12/26/2021 06:10:48 - INFO - codeparrot_training - Step 
24958: {'lr': 0.00025204130320059326, 'samples': 12779008, 'steps': 24958, 'batch_loss/train': 0.7372098822379485} 12/26/2021 06:10:59 - INFO - codeparrot_training - Step 24959: {'lr': 0.00025202554058778646, 'samples': 12779520, 'steps': 24959, 'batch_loss/train': 0.7312946673482656} 12/26/2021 06:11:09 - INFO - codeparrot_training - Step 24960: {'lr': 0.000252009777966927, 'samples': 12780032, 'steps': 24960, 'batch_loss/train': 0.8163196686655283} 12/26/2021 06:11:20 - INFO - codeparrot_training - Step 24961: {'lr': 0.00025199401533807745, 'samples': 12780544, 'steps': 24961, 'batch_loss/train': 0.8564392561092973} 12/26/2021 06:11:32 - INFO - codeparrot_training - Step 24962: {'lr': 0.00025197825270130035, 'samples': 12781056, 'steps': 24962, 'batch_loss/train': 0.8676671423017979} 12/26/2021 06:11:43 - INFO - codeparrot_training - Step 24963: {'lr': 0.0002519624900566585, 'samples': 12781568, 'steps': 24963, 'batch_loss/train': 0.7247524475678802} 12/26/2021 06:11:53 - INFO - codeparrot_training - Step 24964: {'lr': 0.00025194672740421457, 'samples': 12782080, 'steps': 24964, 'batch_loss/train': 0.7637684643268585} 12/26/2021 06:12:06 - INFO - codeparrot_training - Step 24965: {'lr': 0.0002519309647440311, 'samples': 12782592, 'steps': 24965, 'batch_loss/train': 0.7046522088348866} 12/26/2021 06:12:17 - INFO - codeparrot_training - Step 24966: {'lr': 0.0002519152020761709, 'samples': 12783104, 'steps': 24966, 'batch_loss/train': 0.8126272782683372} 12/26/2021 06:12:27 - INFO - codeparrot_training - Step 24967: {'lr': 0.0002518994394006966, 'samples': 12783616, 'steps': 24967, 'batch_loss/train': 0.6298179760342464} 12/26/2021 06:12:40 - INFO - codeparrot_training - Step 24968: {'lr': 0.00025188367671767083, 'samples': 12784128, 'steps': 24968, 'batch_loss/train': 0.837307934591081} 12/26/2021 06:12:50 - INFO - codeparrot_training - Step 24969: {'lr': 0.00025186791402715623, 'samples': 12784640, 'steps': 24969, 'batch_loss/train': 0.9084839589777403} 12/26/2021 
06:13:01 - INFO - codeparrot_training - Step 24970: {'lr': 0.0002518521513292156, 'samples': 12785152, 'steps': 24970, 'batch_loss/train': 0.7200919932802208} 12/26/2021 06:13:12 - INFO - codeparrot_training - Step 24971: {'lr': 0.00025183638862391145, 'samples': 12785664, 'steps': 24971, 'batch_loss/train': 0.9629106279462576} 12/26/2021 06:13:24 - INFO - codeparrot_training - Step 24972: {'lr': 0.00025182062591130644, 'samples': 12786176, 'steps': 24972, 'batch_loss/train': 0.7621615240350366} 12/26/2021 06:13:34 - INFO - codeparrot_training - Step 24973: {'lr': 0.00025180486319146346, 'samples': 12786688, 'steps': 24973, 'batch_loss/train': 0.8849503519013524} 12/26/2021 06:13:45 - INFO - codeparrot_training - Step 24974: {'lr': 0.0002517891004644449, 'samples': 12787200, 'steps': 24974, 'batch_loss/train': 0.8366238446906209} 12/26/2021 06:13:58 - INFO - codeparrot_training - Step 24975: {'lr': 0.0002517733377303136, 'samples': 12787712, 'steps': 24975, 'batch_loss/train': 0.7338991165161133} 12/26/2021 06:14:08 - INFO - codeparrot_training - Step 24976: {'lr': 0.00025175757498913223, 'samples': 12788224, 'steps': 24976, 'batch_loss/train': 1.1689689265331253} 12/26/2021 06:14:19 - INFO - codeparrot_training - Step 24977: {'lr': 0.0002517418122409634, 'samples': 12788736, 'steps': 24977, 'batch_loss/train': 0.7449292679084465} 12/26/2021 06:14:31 - INFO - codeparrot_training - Step 24978: {'lr': 0.00025172604948586963, 'samples': 12789248, 'steps': 24978, 'batch_loss/train': 0.7883643393870443} 12/26/2021 06:14:42 - INFO - codeparrot_training - Step 24979: {'lr': 0.0002517102867239139, 'samples': 12789760, 'steps': 24979, 'batch_loss/train': 0.7421724731102586} 12/26/2021 06:14:52 - INFO - codeparrot_training - Step 24980: {'lr': 0.00025169452395515867, 'samples': 12790272, 'steps': 24980, 'batch_loss/train': 0.7577595540788025} 12/26/2021 06:15:03 - INFO - codeparrot_training - Step 24981: {'lr': 0.00025167876117966663, 'samples': 12790784, 'steps': 24981, 
'batch_loss/train': 0.7252539885230362} 12/26/2021 06:15:16 - INFO - codeparrot_training - Step 24982: {'lr': 0.0002516629983975005, 'samples': 12791296, 'steps': 24982, 'batch_loss/train': 0.6751779410988092} 12/26/2021 06:15:26 - INFO - codeparrot_training - Step 24983: {'lr': 0.0002516472356087229, 'samples': 12791808, 'steps': 24983, 'batch_loss/train': 0.8975117113441229} 12/26/2021 06:15:37 - INFO - codeparrot_training - Step 24984: {'lr': 0.00025163147281339655, 'samples': 12792320, 'steps': 24984, 'batch_loss/train': 0.7967911204323173} 12/26/2021 06:15:49 - INFO - codeparrot_training - Step 24985: {'lr': 0.000251615710011584, 'samples': 12792832, 'steps': 24985, 'batch_loss/train': 0.7215148480609059} 12/26/2021 06:16:00 - INFO - codeparrot_training - Step 24986: {'lr': 0.0002515999472033481, 'samples': 12793344, 'steps': 24986, 'batch_loss/train': 0.7019456325215288} 12/26/2021 06:16:10 - INFO - codeparrot_training - Step 24987: {'lr': 0.00025158418438875135, 'samples': 12793856, 'steps': 24987, 'batch_loss/train': 0.8186142910271883} 12/26/2021 06:16:23 - INFO - codeparrot_training - Step 24988: {'lr': 0.0002515684215678565, 'samples': 12794368, 'steps': 24988, 'batch_loss/train': 0.8317175814881921} 12/26/2021 06:16:33 - INFO - codeparrot_training - Step 24989: {'lr': 0.00025155265874072613, 'samples': 12794880, 'steps': 24989, 'batch_loss/train': 0.8865574533119798} 12/26/2021 06:16:44 - INFO - codeparrot_training - Step 24990: {'lr': 0.00025153689590742313, 'samples': 12795392, 'steps': 24990, 'batch_loss/train': 1.066456414759159} 12/26/2021 06:16:55 - INFO - codeparrot_training - Step 24991: {'lr': 0.00025152113306800984, 'samples': 12795904, 'steps': 24991, 'batch_loss/train': 0.7189703504554927} 12/26/2021 06:17:07 - INFO - codeparrot_training - Step 24992: {'lr': 0.0002515053702225492, 'samples': 12796416, 'steps': 24992, 'batch_loss/train': 0.7852033929666504} 12/26/2021 06:17:18 - INFO - codeparrot_training - Step 24993: {'lr': 
0.00025148960737110375, 'samples': 12796928, 'steps': 24993, 'batch_loss/train': 0.8078479496762156} 12/26/2021 06:17:29 - INFO - codeparrot_training - Step 24994: {'lr': 0.0002514738445137362, 'samples': 12797440, 'steps': 24994, 'batch_loss/train': 0.7249171605799347} 12/26/2021 06:17:41 - INFO - codeparrot_training - Step 24995: {'lr': 0.00025145808165050914, 'samples': 12797952, 'steps': 24995, 'batch_loss/train': 0.7669657953083515} 12/26/2021 06:17:51 - INFO - codeparrot_training - Step 24996: {'lr': 0.00025144231878148536, 'samples': 12798464, 'steps': 24996, 'batch_loss/train': 0.684463432058692} 12/26/2021 06:18:02 - INFO - codeparrot_training - Step 24997: {'lr': 0.00025142655590672746, 'samples': 12798976, 'steps': 24997, 'batch_loss/train': 0.8065405888482928} 12/26/2021 06:18:14 - INFO - codeparrot_training - Step 24998: {'lr': 0.0002514107930262981, 'samples': 12799488, 'steps': 24998, 'batch_loss/train': 0.8291336880065501} 12/26/2021 06:18:25 - INFO - codeparrot_training - Step 24999: {'lr': 0.00025139503014026, 'samples': 12800000, 'steps': 24999, 'batch_loss/train': 0.850140267983079} 12/26/2021 06:18:25 - INFO - codeparrot_training - Evaluating and saving model checkpoint 12/26/2021 06:21:47 - INFO - codeparrot_training - Step 25000: {'loss/eval': 0.7836087942123413, 'perplexity': 2.189358949661255} 12/26/2021 06:22:05 - WARNING - huggingface_hub.repository - Several commits (8) will be pushed upstream. 
12/26/2021 06:22:18 - INFO - codeparrot_training - Step 25000: {'lr': 0.0002513792672486757, 'samples': 12800512, 'steps': 25000, 'batch_loss/train': 0.7486031828448176}
12/26/2021 06:22:30 - INFO - codeparrot_training - Step 25001: {'lr': 0.00025136350435160806, 'samples': 12801024, 'steps': 25001, 'batch_loss/train': 0.7181321987882257}
12/26/2021 06:22:41 - INFO - codeparrot_training - Step 25002: {'lr': 0.0002513477414491195, 'samples': 12801536, 'steps': 25002, 'batch_loss/train': 0.7192767392843962}
12/26/2021 06:22:51 - INFO - codeparrot_training - Step 25003: {'lr': 0.000251331978541273, 'samples': 12802048, 'steps': 25003, 'batch_loss/train': 0.8425130904652178}
12/26/2021 06:23:02 - INFO - codeparrot_training - Step 25004: {'lr': 0.0002513162156281309, 'samples': 12802560, 'steps': 25004, 'batch_loss/train': 0.826509122736752}
12/26/2021 06:23:15 - INFO - codeparrot_training - Step 25005: {'lr': 0.00025130045270975617, 'samples': 12803072, 'steps': 25005, 'batch_loss/train': 0.7450620988383889}
12/26/2021 06:23:25 - INFO - codeparrot_training - Step 25006: {'lr': 0.0002512846897862112, 'samples': 12803584, 'steps': 25006, 'batch_loss/train': 0.793431447353214}
12/26/2021 06:23:36 - INFO - codeparrot_training - Step 25007: {'lr': 0.0002512689268575589, 'samples': 12804096, 'steps': 25007, 'batch_loss/train': 0.7859456650912762}
12/26/2021 06:23:48 - INFO - codeparrot_training - Step 25008: {'lr': 0.00025125316392386174, 'samples': 12804608, 'steps': 25008, 'batch_loss/train': 0.7776422398164868}
12/26/2021 06:23:59 - INFO - codeparrot_training - Step 25009: {'lr': 0.0002512374009851824, 'samples': 12805120, 'steps': 25009, 'batch_loss/train': 0.7480822606012225}
12/26/2021 06:24:09 - INFO - codeparrot_training - Step 25010: {'lr': 0.00025122163804158375, 'samples': 12805632, 'steps': 25010, 'batch_loss/train': 0.8441228782758117}
12/26/2021 06:24:21 - INFO - codeparrot_training - Step 25011: {'lr': 0.00025120587509312834, 'samples': 12806144, 'steps': 25011, 'batch_loss/train': 0.7649983889423311}
12/26/2021 06:24:32 - INFO - codeparrot_training - Step 25012: {'lr': 0.0002511901121398788, 'samples': 12806656, 'steps': 25012, 'batch_loss/train': 0.730508170556277}
12/26/2021 06:24:42 - INFO - codeparrot_training - Step 25013: {'lr': 0.0002511743491818978, 'samples': 12807168, 'steps': 25013, 'batch_loss/train': 0.84923015255481}
12/26/2021 06:24:53 - INFO - codeparrot_training - Step 25014: {'lr': 0.000251158586219248, 'samples': 12807680, 'steps': 25014, 'batch_loss/train': 0.6510013522347435}
12/26/2021 06:25:06 - INFO - codeparrot_training - Step 25015: {'lr': 0.0002511428232519922, 'samples': 12808192, 'steps': 25015, 'batch_loss/train': 0.774962754920125}
12/26/2021 06:25:17 - INFO - codeparrot_training - Step 25016: {'lr': 0.0002511270602801929, 'samples': 12808704, 'steps': 25016, 'batch_loss/train': 0.7743064537644386}
12/26/2021 06:25:27 - INFO - codeparrot_training - Step 25017: {'lr': 0.0002511112973039128, 'samples': 12809216, 'steps': 25017, 'batch_loss/train': 0.7886867765337229}
12/26/2021 06:25:40 - INFO - codeparrot_training - Step 25018: {'lr': 0.0002510955343232147, 'samples': 12809728, 'steps': 25018, 'batch_loss/train': 0.7100114235654473}
12/26/2021 06:25:50 - INFO - codeparrot_training - Step 25019: {'lr': 0.0002510797713381611, 'samples': 12810240, 'steps': 25019, 'batch_loss/train': 0.7704456942155957}
12/26/2021 06:26:01 - INFO - codeparrot_training - Step 25020: {'lr': 0.0002510640083488147, 'samples': 12810752, 'steps': 25020, 'batch_loss/train': 0.6934929229319096}
12/26/2021 06:26:14 - INFO - codeparrot_training - Step 25021: {'lr': 0.00025104824535523827, 'samples': 12811264, 'steps': 25021, 'batch_loss/train': 0.8820293630706146}
12/26/2021 06:26:24 - INFO - codeparrot_training - Step 25022: {'lr': 0.00025103248235749443, 'samples': 12811776, 'steps': 25022, 'batch_loss/train': 0.8309427043423057}
12/26/2021 06:26:35 - INFO - codeparrot_training - Step 25023: {'lr': 0.00025101671935564575, 'samples': 12812288, 'steps': 25023, 'batch_loss/train': 0.6776512796059251}
12/26/2021 06:26:46 - INFO - codeparrot_training - Step 25024: {'lr': 0.000251000956349755, 'samples': 12812800, 'steps': 25024, 'batch_loss/train': 0.7076684992644005}
12/26/2021 06:26:58 - INFO - codeparrot_training - Step 25025: {'lr': 0.0002509851933398849, 'samples': 12813312, 'steps': 25025, 'batch_loss/train': 0.8165128990076482}
12/26/2021 06:27:09 - INFO - codeparrot_training - Step 25026: {'lr': 0.00025096943032609797, 'samples': 12813824, 'steps': 25026, 'batch_loss/train': 0.6828194671543315}
12/26/2021 06:27:19 - INFO - codeparrot_training - Step 25027: {'lr': 0.00025095366730845697, 'samples': 12814336, 'steps': 25027, 'batch_loss/train': 1.1038661771453917}
12/26/2021 06:27:31 - INFO - codeparrot_training - Step 25028: {'lr': 0.00025093790428702456, 'samples': 12814848, 'steps': 25028, 'batch_loss/train': 0.89796376042068}
12/26/2021 06:27:42 - INFO - codeparrot_training - Step 25029: {'lr': 0.0002509221412618635, 'samples': 12815360, 'steps': 25029, 'batch_loss/train': 0.7610187320969999}
12/26/2021 06:27:52 - INFO - codeparrot_training - Step 25030: {'lr': 0.00025090637823303616, 'samples': 12815872, 'steps': 25030, 'batch_loss/train': 0.9010326601564884}
12/26/2021 06:28:05 - INFO - codeparrot_training - Step 25031: {'lr': 0.0002508906152006055, 'samples': 12816384, 'steps': 25031, 'batch_loss/train': 0.7293249936774373}
12/26/2021 06:28:16 - INFO - codeparrot_training - Step 25032: {'lr': 0.00025087485216463405, 'samples': 12816896, 'steps': 25032, 'batch_loss/train': 0.7494586976245046}
12/26/2021 06:28:27 - INFO - codeparrot_training - Step 25033: {'lr': 0.00025085908912518453, 'samples': 12817408, 'steps': 25033, 'batch_loss/train': 0.6448300629854202}
12/26/2021 06:28:37 - INFO - codeparrot_training - Step 25034: {'lr': 0.0002508433260823196, 'samples': 12817920, 'steps': 25034, 'batch_loss/train': 0.6479422824922949}
12/26/2021 06:28:49 - INFO - codeparrot_training - Step 25035: {'lr': 0.0002508275630361019, 'samples': 12818432, 'steps': 25035, 'batch_loss/train': 0.8090747715905309}
12/26/2021 06:29:00 - INFO - codeparrot_training - Step 25036: {'lr': 0.0002508117999865942, 'samples': 12818944, 'steps': 25036, 'batch_loss/train': 0.8346334446687251}
12/26/2021 06:29:11 - INFO - codeparrot_training - Step 25037: {'lr': 0.00025079603693385904, 'samples': 12819456, 'steps': 25037, 'batch_loss/train': 0.7675484083592892}
12/26/2021 06:29:25 - INFO - codeparrot_training - Step 25038: {'lr': 0.00025078027387795914, 'samples': 12819968, 'steps': 25038, 'batch_loss/train': 0.7722112098708749}
12/26/2021 06:29:35 - INFO - codeparrot_training - Step 25039: {'lr': 0.0002507645108189572, 'samples': 12820480, 'steps': 25039, 'batch_loss/train': 0.8204971980303526}
12/26/2021 06:29:46 - INFO - codeparrot_training - Step 25040: {'lr': 0.0002507487477569158, 'samples': 12820992, 'steps': 25040, 'batch_loss/train': 0.8561209291219711}
12/26/2021 06:29:58 - INFO - codeparrot_training - Step 25041: {'lr': 0.0002507329846918977, 'samples': 12821504, 'steps': 25041, 'batch_loss/train': 0.8045361461117864}
12/26/2021 06:30:09 - INFO - codeparrot_training - Step 25042: {'lr': 0.0002507172216239655, 'samples': 12822016, 'steps': 25042, 'batch_loss/train': 0.8016167664900422}
12/26/2021 06:30:19 - INFO - codeparrot_training - Step 25043: {'lr': 0.0002507014585531819, 'samples': 12822528, 'steps': 25043, 'batch_loss/train': 1.4926398913376033}
12/26/2021 06:30:30 - INFO - codeparrot_training - Step 25044: {'lr': 0.0002506856954796096, 'samples': 12823040, 'steps': 25044, 'batch_loss/train': 0.7300555454567075}
12/26/2021 06:30:42 - INFO - codeparrot_training - Step 25045: {'lr': 0.0002506699324033112, 'samples': 12823552, 'steps': 25045, 'batch_loss/train': 0.7691471693105996}
12/26/2021 06:30:53 - INFO - codeparrot_training - Step 25046: {'lr': 0.00025065416932434936, 'samples': 12824064, 'steps': 25046, 'batch_loss/train': 0.7339524396229535}
12/26/2021 06:31:03 - INFO - codeparrot_training - Step 25047: {'lr': 0.0002506384062427869, 'samples': 12824576, 'steps': 25047, 'batch_loss/train': 0.8020094148814678}
12/26/2021 06:31:17 - INFO - codeparrot_training - Step 25048: {'lr': 0.00025062264315868634, 'samples': 12825088, 'steps': 25048, 'batch_loss/train': 0.8317256150767207}
12/26/2021 06:31:27 - INFO - codeparrot_training - Step 25049: {'lr': 0.0002506068800721103, 'samples': 12825600, 'steps': 25049, 'batch_loss/train': 0.7649759287014604}
12/26/2021 06:31:38 - INFO - codeparrot_training - Step 25050: {'lr': 0.0002505911169831216, 'samples': 12826112, 'steps': 25050, 'batch_loss/train': 0.7396461516618729}
12/26/2021 06:31:50 - INFO - codeparrot_training - Step 25051: {'lr': 0.00025057535389178285, 'samples': 12826624, 'steps': 25051, 'batch_loss/train': 0.68362484453246}
12/26/2021 06:32:01 - INFO - codeparrot_training - Step 25052: {'lr': 0.0002505595907981568, 'samples': 12827136, 'steps': 25052, 'batch_loss/train': 0.7610949499066919}
12/26/2021 06:32:11 - INFO - codeparrot_training - Step 25053: {'lr': 0.0002505438277023059, 'samples': 12827648, 'steps': 25053, 'batch_loss/train': 0.7911650519818068}
12/26/2021 06:32:22 - INFO - codeparrot_training - Step 25054: {'lr': 0.00025052806460429297, 'samples': 12828160, 'steps': 25054, 'batch_loss/train': 0.8082429440692067}
12/26/2021 06:32:36 - INFO - codeparrot_training - Step 25055: {'lr': 0.0002505123015041806, 'samples': 12828672, 'steps': 25055, 'batch_loss/train': 0.7483472656458616}
12/26/2021 06:32:47 - INFO - codeparrot_training - Step 25056: {'lr': 0.00025049653840203157, 'samples': 12829184, 'steps': 25056, 'batch_loss/train': 0.7184061212465167}
12/26/2021 06:32:57 - INFO - codeparrot_training - Step 25057: {'lr': 0.0002504807752979085, 'samples': 12829696, 'steps': 25057, 'batch_loss/train': 0.7280578874051571}
12/26/2021 06:33:10 - INFO - codeparrot_training - Step 25058: {'lr': 0.00025046501219187404, 'samples': 12830208, 'steps': 25058, 'batch_loss/train': 0.7821520059369504}
12/26/2021 06:33:20 - INFO - codeparrot_training - Step 25059: {'lr': 0.0002504492490839909, 'samples': 12830720, 'steps': 25059, 'batch_loss/train': 0.7844573194161057}
12/26/2021 06:33:31 - INFO - codeparrot_training - Step 25060: {'lr': 0.0002504334859743217, 'samples': 12831232, 'steps': 25060, 'batch_loss/train': 0.6604930923786014}
12/26/2021 06:33:43 - INFO - codeparrot_training - Step 25061: {'lr': 0.0002504177228629291, 'samples': 12831744, 'steps': 25061, 'batch_loss/train': 0.6798931136727333}
12/26/2021 06:33:54 - INFO - codeparrot_training - Step 25062: {'lr': 0.0002504019597498758, 'samples': 12832256, 'steps': 25062, 'batch_loss/train': 0.7029382777400315}
12/26/2021 06:34:04 - INFO - codeparrot_training - Step 25063: {'lr': 0.00025038619663522444, 'samples': 12832768, 'steps': 25063, 'batch_loss/train': 0.8242754535749555}
12/26/2021 06:34:15 - INFO - codeparrot_training - Step 25064: {'lr': 0.00025037043351903776, 'samples': 12833280, 'steps': 25064, 'batch_loss/train': 0.762474010232836}
12/26/2021 06:34:28 - INFO - codeparrot_training - Step 25065: {'lr': 0.0002503546704013783, 'samples': 12833792, 'steps': 25065, 'batch_loss/train': 0.8155861888080835}
12/26/2021 06:34:39 - INFO - codeparrot_training - Step 25066: {'lr': 0.0002503389072823089, 'samples': 12834304, 'steps': 25066, 'batch_loss/train': 0.8117594355717301}
12/26/2021 06:34:49 - INFO - codeparrot_training - Step 25067: {'lr': 0.00025032314416189205, 'samples': 12834816, 'steps': 25067, 'batch_loss/train': 0.6958929365500808}
12/26/2021 06:35:01 - INFO - codeparrot_training - Step 25068: {'lr': 0.0002503073810401905, 'samples': 12835328, 'steps': 25068, 'batch_loss/train': 0.8559302669018507}
12/26/2021 06:35:12 - INFO - codeparrot_training - Step 25069: {'lr': 0.000250291617917267, 'samples': 12835840, 'steps': 25069, 'batch_loss/train': 0.7728042753878981}
12/26/2021 06:35:23 - INFO - codeparrot_training - Step 25070: {'lr': 0.00025027585479318404, 'samples': 12836352, 'steps': 25070, 'batch_loss/train': 0.6847943116445094}
12/26/2021 06:35:35 - INFO - codeparrot_training - Step 25071: {'lr': 0.0002502600916680045, 'samples': 12836864, 'steps': 25071, 'batch_loss/train': 0.7892649853602052}
12/26/2021 06:35:45 - INFO - codeparrot_training - Step 25072: {'lr': 0.0002502443285417908, 'samples': 12837376, 'steps': 25072, 'batch_loss/train': 0.8463014224544168}
12/26/2021 06:35:56 - INFO - codeparrot_training - Step 25073: {'lr': 0.0002502285654146059, 'samples': 12837888, 'steps': 25073, 'batch_loss/train': 0.8712224217306357}
12/26/2021 06:36:07 - INFO - codeparrot_training - Step 25074: {'lr': 0.00025021280228651216, 'samples': 12838400, 'steps': 25074, 'batch_loss/train': 0.8430936792865396}
12/26/2021 06:36:19 - INFO - codeparrot_training - Step 25075: {'lr': 0.0002501970391575725, 'samples': 12838912, 'steps': 25075, 'batch_loss/train': 0.8203224837779999}
12/26/2021 06:36:29 - INFO - codeparrot_training - Step 25076: {'lr': 0.0002501812760278494, 'samples': 12839424, 'steps': 25076, 'batch_loss/train': 0.8165245950222015}
12/26/2021 06:36:40 - INFO - codeparrot_training - Step 25077: {'lr': 0.0002501655128974057, 'samples': 12839936, 'steps': 25077, 'batch_loss/train': 0.8602843016851693}
12/26/2021 06:36:53 - INFO - codeparrot_training - Step 25078: {'lr': 0.0002501497497663039, 'samples': 12840448, 'steps': 25078, 'batch_loss/train': 0.8666546577587724}
12/26/2021 06:37:04 - INFO - codeparrot_training - Step 25079: {'lr': 0.0002501339866346068, 'samples': 12840960, 'steps': 25079, 'batch_loss/train': 0.8109909556806087}
12/26/2021 06:37:14 - INFO - codeparrot_training - Step 25080: {'lr': 0.000250118223502377, 'samples': 12841472, 'steps': 25080, 'batch_loss/train': 0.8109851409681141}
12/26/2021 06:37:26 - INFO - codeparrot_training - Step 25081: {'lr': 0.0002501024603696772, 'samples': 12841984, 'steps': 25081, 'batch_loss/train': 0.8219738341867924}
12/26/2021 06:37:37 - INFO - codeparrot_training - Step 25082: {'lr': 0.0002500866972365701, 'samples': 12842496, 'steps': 25082, 'batch_loss/train': 0.824177891947329}
12/26/2021 06:37:48 - INFO - codeparrot_training - Step 25083: {'lr': 0.0002500709341031183, 'samples': 12843008, 'steps': 25083, 'batch_loss/train': 0.7640428580343723}
12/26/2021 06:38:02 - INFO - codeparrot_training - Step 25084: {'lr': 0.0002500551709693845, 'samples': 12843520, 'steps': 25084, 'batch_loss/train': 0.8038536254316568}
12/26/2021 06:38:12 - INFO - codeparrot_training - Step 25085: {'lr': 0.0002500394078354312, 'samples': 12844032, 'steps': 25085, 'batch_loss/train': 0.7495944211259484}
12/26/2021 06:38:23 - INFO - codeparrot_training - Step 25086: {'lr': 0.00025002364470132145, 'samples': 12844544, 'steps': 25086, 'batch_loss/train': 0.7743529570288956}
12/26/2021 06:38:34 - INFO - codeparrot_training - Step 25087: {'lr': 0.0002500078815671176, 'samples': 12845056, 'steps': 25087, 'batch_loss/train': 0.7381405476480722}
12/26/2021 06:38:46 - INFO - codeparrot_training - Step 25088: {'lr': 0.00024999211843288247, 'samples': 12845568, 'steps': 25088, 'batch_loss/train': 0.8159225331619382}
12/26/2021 06:38:56 - INFO - codeparrot_training - Step 25089: {'lr': 0.0002499763552986786, 'samples': 12846080, 'steps': 25089, 'batch_loss/train': 0.7501355865970254}
12/26/2021 06:39:07 - INFO - codeparrot_training - Step 25090: {'lr': 0.0002499605921645688, 'samples': 12846592, 'steps': 25090, 'batch_loss/train': 0.8090883437544107}
12/26/2021 06:39:19 - INFO - codeparrot_training - Step 25091: {'lr': 0.00024994482903061563, 'samples': 12847104, 'steps': 25091, 'batch_loss/train': 0.8196591201703995}
12/26/2021 06:39:30 - INFO - codeparrot_training - Step 25092: {'lr': 0.0002499290658968817, 'samples': 12847616, 'steps': 25092, 'batch_loss/train': 0.6432130634784698}
12/26/2021 06:39:40 - INFO - codeparrot_training - Step 25093: {'lr': 0.00024991330276343, 'samples': 12848128, 'steps': 25093, 'batch_loss/train': 0.7690113387070596}
12/26/2021 06:39:53 - INFO - codeparrot_training - Step 25094: {'lr': 0.0002498975396303228, 'samples': 12848640, 'steps': 25094, 'batch_loss/train': 0.8569069574587047}
12/26/2021 06:40:04 - INFO - codeparrot_training - Step 25095: {'lr': 0.00024988177649762296, 'samples': 12849152, 'steps': 25095, 'batch_loss/train': 0.6780301304534078}
12/26/2021 06:40:15 - INFO - codeparrot_training - Step 25096: {'lr': 0.0002498660133653932, 'samples': 12849664, 'steps': 25096, 'batch_loss/train': 0.8248353535309434}
12/26/2021 06:40:25 - INFO - codeparrot_training - Step 25097: {'lr': 0.00024985025023369615, 'samples': 12850176, 'steps': 25097, 'batch_loss/train': 0.8412895491346717}
12/26/2021 06:40:37 - INFO - codeparrot_training - Step 25098: {'lr': 0.0002498344871025944, 'samples': 12850688, 'steps': 25098, 'batch_loss/train': 0.7135917525738478}
12/26/2021 06:40:48 - INFO - codeparrot_training - Step 25099: {'lr': 0.00024981872397215063, 'samples': 12851200, 'steps': 25099, 'batch_loss/train': 0.7319100480526686}
12/26/2021 06:40:59 - INFO - codeparrot_training - Step 25100: {'lr': 0.00024980296084242757, 'samples': 12851712, 'steps': 25100, 'batch_loss/train': 0.8136328440159559}
12/26/2021 06:41:12 - INFO - codeparrot_training - Step 25101: {'lr': 0.00024978719771348785, 'samples': 12852224, 'steps': 25101, 'batch_loss/train': 0.8135204399004579}
12/26/2021 06:41:22 - INFO - codeparrot_training - Step 25102: {'lr': 0.00024977143458539423, 'samples': 12852736, 'steps': 25102, 'batch_loss/train': 0.8123887374531478}
12/26/2021 06:41:33 - INFO - codeparrot_training - Step 25103: {'lr': 0.0002497556714582092, 'samples': 12853248, 'steps': 25103, 'batch_loss/train': 0.928447462618351}
12/26/2021 06:41:45 - INFO - codeparrot_training - Step 25104: {'lr': 0.00024973990833199557, 'samples': 12853760, 'steps': 25104, 'batch_loss/train': 0.8239318253472447}
12/26/2021 06:41:56 - INFO - codeparrot_training - Step 25105: {'lr': 0.00024972414520681597, 'samples': 12854272, 'steps': 25105, 'batch_loss/train': 0.7832867875695229}
12/26/2021 06:42:06 - INFO - codeparrot_training - Step 25106: {'lr': 0.00024970838208273304, 'samples': 12854784, 'steps': 25106, 'batch_loss/train': 0.8070274833589792}
12/26/2021 06:42:17 - INFO - codeparrot_training - Step 25107: {'lr': 0.0002496926189598095, 'samples': 12855296, 'steps': 25107, 'batch_loss/train': 0.8040147256106138}
12/26/2021 06:42:29 - INFO - codeparrot_training - Step 25108: {'lr': 0.00024967685583810796, 'samples': 12855808, 'steps': 25108, 'batch_loss/train': 0.6460763679351658}
12/26/2021 06:42:40 - INFO - codeparrot_training - Step 25109: {'lr': 0.0002496610927176912, 'samples': 12856320, 'steps': 25109, 'batch_loss/train': 0.7477070325985551}
12/26/2021 06:42:50 - INFO - codeparrot_training - Step 25110: {'lr': 0.0002496453295986217, 'samples': 12856832, 'steps': 25110, 'batch_loss/train': 0.8281011367216706}
12/26/2021 06:43:04 - INFO - codeparrot_training - Step 25111: {'lr': 0.00024962956648096225, 'samples': 12857344, 'steps': 25111, 'batch_loss/train': 0.6982504073530436}
12/26/2021 06:43:14 - INFO - codeparrot_training - Step 25112: {'lr': 0.0002496138033647756, 'samples': 12857856, 'steps': 25112, 'batch_loss/train': 0.7600904619321227}
12/26/2021 06:43:25 - INFO - codeparrot_training - Step 25113: {'lr': 0.00024959804025012427, 'samples': 12858368, 'steps': 25113, 'batch_loss/train': 0.7655067087616771}
12/26/2021 06:43:37 - INFO - codeparrot_training - Step 25114: {'lr': 0.0002495822771370709, 'samples': 12858880, 'steps': 25114, 'batch_loss/train': 0.8308038571849465}
12/26/2021 06:43:48 - INFO - codeparrot_training - Step 25115: {'lr': 0.00024956651402567836, 'samples': 12859392, 'steps': 25115, 'batch_loss/train': 0.6360157998278737}
12/26/2021 06:43:58 - INFO - codeparrot_training - Step 25116: {'lr': 0.0002495507509160092, 'samples': 12859904, 'steps': 25116, 'batch_loss/train': 0.7919766688719392}
12/26/2021 06:44:10 - INFO - codeparrot_training - Step 25117: {'lr': 0.000249534987808126, 'samples': 12860416, 'steps': 25117, 'batch_loss/train': 0.6969919735565782}
12/26/2021 06:44:21 - INFO - codeparrot_training - Step 25118: {'lr': 0.0002495192247020915, 'samples': 12860928, 'steps': 25118, 'batch_loss/train': 0.7754886150360107}
12/26/2021 06:44:32 - INFO - codeparrot_training - Step 25119: {'lr': 0.0002495034615979685, 'samples': 12861440, 'steps': 25119, 'batch_loss/train': 0.7734495415352285}
12/26/2021 06:44:42 - INFO - codeparrot_training - Step 25120: {'lr': 0.0002494876984958194, 'samples': 12861952, 'steps': 25120, 'batch_loss/train': 0.9501345902681351}
12/26/2021 06:44:54 - INFO - codeparrot_training - Step 25121: {'lr': 0.00024947193539570704, 'samples': 12862464, 'steps': 25121, 'batch_loss/train': 0.6086930404417217}
12/26/2021 06:45:05 - INFO - codeparrot_training - Step 25122: {'lr': 0.0002494561722976942, 'samples': 12862976, 'steps': 25122, 'batch_loss/train': 0.6249937597895041}
12/26/2021 06:45:16 - INFO - codeparrot_training - Step 25123: {'lr': 0.0002494404092018433, 'samples': 12863488, 'steps': 25123, 'batch_loss/train': 0.8264021137729287}
12/26/2021 06:45:29 - INFO - codeparrot_training - Step 25124: {'lr': 0.0002494246461082172, 'samples': 12864000, 'steps': 25124, 'batch_loss/train': 0.7581469791475683}
12/26/2021 06:45:40 - INFO - codeparrot_training - Step 25125: {'lr': 0.0002494088830168784, 'samples': 12864512, 'steps': 25125, 'batch_loss/train': 0.7333409847924486}
12/26/2021 06:45:51 - INFO - codeparrot_training - Step 25126: {'lr': 0.0002493931199278897, 'samples': 12865024, 'steps': 25126, 'batch_loss/train': 0.7781597096472979}
12/26/2021 06:46:01 - INFO - codeparrot_training - Step 25127: {'lr': 0.0002493773568413137, 'samples': 12865536, 'steps': 25127, 'batch_loss/train': 0.8451296836137772}
12/26/2021 06:46:14 - INFO - codeparrot_training - Step 25128: {'lr': 0.0002493615937572132, 'samples': 12866048, 'steps': 25128, 'batch_loss/train': 0.8305618958547711}
12/26/2021 06:46:24 - INFO - codeparrot_training - Step 25129: {'lr': 0.0002493458306756506, 'samples': 12866560, 'steps': 25129, 'batch_loss/train': 0.7346009504981339}
12/26/2021 06:46:35 - INFO - codeparrot_training - Step 25130: {'lr': 0.00024933006759668885, 'samples': 12867072, 'steps': 25130, 'batch_loss/train': 0.8255310375243425}
12/26/2021 06:46:49 - INFO - codeparrot_training - Step 25131: {'lr': 0.00024931430452039045, 'samples': 12867584, 'steps': 25131, 'batch_loss/train': 0.8082596897147596}
12/26/2021 06:46:59 - INFO - codeparrot_training - Step 25132: {'lr': 0.0002492985414468181, 'samples': 12868096, 'steps': 25132, 'batch_loss/train': 0.7954719422850758}
12/26/2021 06:47:10 - INFO - codeparrot_training - Step 25133: {'lr': 0.00024928277837603456, 'samples': 12868608, 'steps': 25133, 'batch_loss/train': 0.8195606218650937}
12/26/2021 06:47:22 - INFO - codeparrot_training - Step 25134: {'lr': 0.00024926701530810234, 'samples': 12869120, 'steps': 25134, 'batch_loss/train': 0.7434855596511625}
12/26/2021 06:47:33 - INFO - codeparrot_training - Step 25135: {'lr': 0.0002492512522430843, 'samples': 12869632, 'steps': 25135, 'batch_loss/train': 0.7310621296055615}
12/26/2021 06:47:43 - INFO - codeparrot_training - Step 25136: {'lr': 0.00024923548918104284, 'samples': 12870144, 'steps': 25136, 'batch_loss/train': 0.8429504497908056}
12/26/2021 06:47:56 - INFO - codeparrot_training - Step 25137: {'lr': 0.00024921972612204087, 'samples': 12870656, 'steps': 25137, 'batch_loss/train': 0.5821855196263641}
12/26/2021 06:48:07 - INFO - codeparrot_training - Step 25138: {'lr': 0.00024920396306614103, 'samples': 12871168, 'steps': 25138, 'batch_loss/train': 0.6593735608039424}
12/26/2021 06:48:17 - INFO - codeparrot_training - Step 25139: {'lr': 0.00024918820001340586, 'samples': 12871680, 'steps': 25139, 'batch_loss/train': 0.852902359329164}
12/26/2021 06:48:28 - INFO - codeparrot_training - Step 25140: {'lr': 0.00024917243696389804, 'samples': 12872192, 'steps': 25140, 'batch_loss/train': 0.7759818159975111}
12/26/2021 06:48:42 - INFO - codeparrot_training - Step 25141: {'lr': 0.00024915667391768045, 'samples': 12872704, 'steps': 25141, 'batch_loss/train': 0.7678036973811686}
12/26/2021 06:48:53 - INFO - codeparrot_training - Step 25142: {'lr': 0.00024914091087481553, 'samples': 12873216, 'steps': 25142, 'batch_loss/train': 0.7463054198306054}
12/26/2021 06:49:03 - INFO - codeparrot_training - Step 25143: {'lr': 0.00024912514783536596, 'samples': 12873728, 'steps': 25143, 'batch_loss/train': 0.7507673408836126}
12/26/2021 06:49:15 - INFO - codeparrot_training - Step 25144: {'lr': 0.00024910938479939457, 'samples': 12874240, 'steps': 25144, 'batch_loss/train': 0.8576031904667616}
12/26/2021 06:49:26 - INFO - codeparrot_training - Step 25145: {'lr': 0.0002490936217669639, 'samples': 12874752, 'steps': 25145, 'batch_loss/train': 0.8438692688941956}
12/26/2021 06:49:37 - INFO - codeparrot_training - Step 25146: {'lr': 0.00024907785873813664, 'samples': 12875264, 'steps': 25146, 'batch_loss/train': 0.7079036742215976}
12/26/2021 06:49:49 - INFO - codeparrot_training - Step 25147: {'lr': 0.0002490620957129754, 'samples': 12875776, 'steps': 25147, 'batch_loss/train': 0.8087858906947076}
12/26/2021 06:49:59 - INFO - codeparrot_training - Step 25148: {'lr': 0.00024904633269154304, 'samples': 12876288, 'steps': 25148, 'batch_loss/train': 0.7815684415400028}
12/26/2021 06:50:10 - INFO - codeparrot_training - Step 25149: {'lr': 0.00024903056967390204, 'samples': 12876800, 'steps': 25149, 'batch_loss/train': 0.7710384301608428}
12/26/2021 06:50:20 - INFO - codeparrot_training - Step 25150: {'lr': 0.00024901480666011515, 'samples': 12877312, 'steps': 25150, 'batch_loss/train': 0.70039829518646}
12/26/2021 06:50:33 - INFO - codeparrot_training - Step 25151: {'lr': 0.00024899904365024495, 'samples': 12877824, 'steps': 25151, 'batch_loss/train': 0.7677603964693844}
12/26/2021 06:50:43 - INFO - codeparrot_training - Step 25152: {'lr': 0.00024898328064435426, 'samples': 12878336, 'steps': 25152, 'batch_loss/train': 0.710736081469804}
12/26/2021 06:50:54 - INFO - codeparrot_training - Step 25153: {'lr': 0.00024896751764250563, 'samples': 12878848, 'steps': 25153, 'batch_loss/train': 0.8221833747811615}
12/26/2021 06:51:08 - INFO - codeparrot_training - Step 25154: {'lr': 0.0002489517546447618, 'samples': 12879360, 'steps': 25154, 'batch_loss/train': 0.7117111319676042}
12/26/2021 06:51:18 - INFO - codeparrot_training - Step 25155: {'lr': 0.0002489359916511853, 'samples': 12879872, 'steps': 25155, 'batch_loss/train': 0.6154463244602084}
12/26/2021 06:51:29 - INFO - codeparrot_training - Step 25156: {'lr': 0.00024892022866183894, 'samples': 12880384, 'steps': 25156, 'batch_loss/train': 0.7968365116976202}
12/26/2021 06:51:40 - INFO - codeparrot_training - Step 25157: {'lr': 0.0002489044656767854, 'samples': 12880896, 'steps': 25157, 'batch_loss/train': 0.7369305486790836}
12/26/2021 06:51:52 - INFO - codeparrot_training - Step 25158: {'lr': 0.0002488887026960872, 'samples': 12881408, 'steps': 25158, 'batch_loss/train': 0.8034469457343221}
12/26/2021 06:52:02 - INFO - codeparrot_training - Step 25159: {'lr': 0.0002488729397198071, 'samples': 12881920, 'steps': 25159, 'batch_loss/train': 0.8813261836767197}
12/26/2021 06:52:13 - INFO - codeparrot_training - Step 25160: {'lr': 0.0002488571767480079, 'samples': 12882432, 'steps': 25160, 'batch_loss/train': 0.7850959744537249}
12/26/2021 06:52:25 - INFO - codeparrot_training - Step 25161: {'lr': 0.00024884141378075204, 'samples': 12882944, 'steps': 25161, 'batch_loss/train': 0.7900625499896705}
12/26/2021 06:52:36 - INFO - codeparrot_training - Step 25162: {'lr': 0.0002488256508181022, 'samples': 12883456, 'steps': 25162, 'batch_loss/train': 0.7361171576194465}
12/26/2021 06:52:46 - INFO - codeparrot_training - Step 25163: {'lr': 0.00024880988786012126, 'samples': 12883968, 'steps': 25163, 'batch_loss/train': 0.7151773679070175}
12/26/2021 06:53:00 - INFO - codeparrot_training - Step 25164: {'lr': 0.00024879412490687173, 'samples': 12884480, 'steps': 25164, 'batch_loss/train': 0.7105048722587526}
12/26/2021 06:53:11 - INFO - codeparrot_training - Step 25165: {'lr': 0.00024877836195841627, 'samples': 12884992, 'steps': 25165, 'batch_loss/train': 0.7667989786714315}
12/26/2021 06:53:21 - INFO - codeparrot_training - Step 25166: {'lr': 0.00024876259901481754, 'samples': 12885504, 'steps': 25166, 'batch_loss/train': 0.7538462462835014}
12/26/2021 06:53:34 - INFO - codeparrot_training - Step 25167: {'lr': 0.00024874683607613833, 'samples': 12886016, 'steps': 25167, 'batch_loss/train': 0.7253105510026217}
12/26/2021 06:53:44 - INFO - codeparrot_training - Step 25168: {'lr': 0.0002487310731424412, 'samples': 12886528, 'steps': 25168, 'batch_loss/train': 0.5715466148685664}
12/26/2021 06:53:55 - INFO - codeparrot_training - Step 25169: {'lr': 0.0002487153102137888, 'samples': 12887040, 'steps': 25169, 'batch_loss/train': 0.7002727170474827}
12/26/2021 06:54:05 - INFO - codeparrot_training - Step 25170: {'lr': 0.0002486995472902439, 'samples': 12887552, 'steps': 25170, 'batch_loss/train': 0.8182336259633303}
12/26/2021 06:54:19 - INFO - codeparrot_training - Step 25171: {'lr': 0.00024868378437186914, 'samples': 12888064, 'steps': 25171, 'batch_loss/train': 0.7659212308935821}
12/26/2021 06:54:30 - INFO - codeparrot_training - Step 25172: {'lr': 0.0002486680214587271, 'samples': 12888576, 'steps': 25172, 'batch_loss/train': 0.7304374193772674}
12/26/2021 06:54:40 - INFO - codeparrot_training - Step 25173: {'lr': 0.00024865225855088044, 'samples': 12889088, 'steps': 25173, 'batch_loss/train': 0.8240761044435203}
12/26/2021 06:54:53 - INFO - codeparrot_training - Step 25174: {'lr': 0.000248636495648392, 'samples': 12889600, 'steps': 25174, 'batch_loss/train': 0.8811975559219718}
12/26/2021 06:55:03 - INFO - codeparrot_training - Step 25175: {'lr': 0.0002486207327513243, 'samples': 12890112, 'steps': 25175, 'batch_loss/train': 0.7502259267494082}
12/26/2021 06:55:14 - INFO - codeparrot_training - Step 25176: {'lr': 0.0002486049698597401, 'samples': 12890624, 'steps': 25176, 'batch_loss/train': 0.758040162269026}
12/26/2021 06:55:24 - INFO - codeparrot_training - Step 25177: {'lr': 0.0002485892069737019, 'samples': 12891136, 'steps': 25177, 'batch_loss/train': 0.7791593314614147}
12/26/2021 06:55:37 - INFO - codeparrot_training - Step 25178: {'lr': 0.0002485734440932726, 'samples': 12891648, 'steps': 25178, 'batch_loss/train': 0.7071819272823632}
12/26/2021 06:55:47 - INFO - codeparrot_training - Step 25179: {'lr': 0.0002485576812185147, 'samples': 12892160, 'steps': 25179, 'batch_loss/train': 0.7747503183782101}
12/26/2021 06:55:58 - INFO - codeparrot_training - Step 25180: {'lr': 0.0002485419183494908, 'samples': 12892672, 'steps': 25180, 'batch_loss/train': 0.8664773162454367}
12/26/2021 06:56:11 - INFO - codeparrot_training - Step 25181: {'lr': 0.00024852615548626384, 'samples': 12893184, 'steps': 25181, 'batch_loss/train': 0.8101812009699643}
12/26/2021 06:56:21 - INFO - codeparrot_training - Step 25182: {'lr': 0.00024851039262889626, 'samples': 12893696, 'steps': 25182, 'batch_loss/train': 0.6996027354616672}
12/26/2021 06:56:32 - INFO - codeparrot_training - Step 25183: {'lr': 0.0002484946297774509, 'samples': 12894208, 'steps': 25183, 'batch_loss/train': 0.6872691279277205}
12/26/2021 06:56:44 - INFO - codeparrot_training - Step 25184: {'lr': 0.0002484788669319902, 'samples': 12894720, 'steps': 25184, 'batch_loss/train': 0.7989580417051911}
12/26/2021 06:56:55 - INFO - codeparrot_training - Step 25185: {'lr': 0.00024846310409257693, 'samples': 12895232, 'steps': 25185, 'batch_loss/train': 0.6227215928956866}
12/26/2021 06:57:05 - INFO - codeparrot_training - Step 25186: {'lr': 0.0002484473412592739, 'samples': 12895744, 'steps': 25186, 'batch_loss/train': 0.847647812217474}
12/26/2021 06:57:16 - INFO - codeparrot_training - Step 25187: {'lr': 0.0002484315784321436, 'samples': 12896256, 'steps': 25187, 'batch_loss/train': 0.8474017744883895}
12/26/2021 06:57:28 - INFO - codeparrot_training - Step 25188: {'lr': 0.00024841581561124866, 'samples': 12896768, 'steps': 25188, 'batch_loss/train': 0.7684995625168085}
12/26/2021 06:57:39 - INFO - codeparrot_training - Step 25189: {'lr': 0.00024840005279665195, 'samples': 12897280, 'steps': 25189, 'batch_loss/train': 0.8822960983961821}
12/26/2021 06:57:49 - INFO - codeparrot_training - Step 25190: {'lr': 0.000248384289988416, 'samples': 12897792, 'steps': 25190, 'batch_loss/train': 0.7692431528121233}
12/26/2021 06:58:02 - INFO - codeparrot_training - Step 25191: {'lr': 0.0002483685271866035, 'samples': 12898304, 'steps': 25191, 'batch_loss/train': 0.7712049782276154}
12/26/2021 06:58:13 - INFO - codeparrot_training - Step 25192: {'lr': 0.0002483527643912771, 'samples': 12898816, 'steps': 25192, 'batch_loss/train': 0.7923592193983495}
12/26/2021 06:58:23 - INFO - codeparrot_training - Step 25193: {'lr': 0.00024833700160249955, 'samples': 12899328, 'steps': 25193, 'batch_loss/train': 0.7270699713844806}
12/26/2021 06:58:35 - INFO - codeparrot_training - Step 25194: {'lr': 0.0002483212388203334, 'samples': 12899840, 'steps': 25194, 'batch_loss/train': 0.6735094654140994}
12/26/2021 06:58:46 - INFO - codeparrot_training - Step 25195: {'lr': 0.0002483054760448413, 'samples': 12900352, 'steps': 25195, 'batch_loss/train': 0.8068630769848824}
12/26/2021 06:58:57 - INFO - codeparrot_training - Step 25196: {'lr': 0.00024828971327608615, 'samples': 12900864, 'steps': 25196, 'batch_loss/train': 0.7272345228120685}
12/26/2021 06:59:07 - INFO - codeparrot_training - Step 25197: {'lr': 0.0002482739505141304, 'samples': 12901376, 'steps': 25197, 'batch_loss/train': 0.7435132982209325}
12/26/2021 06:59:19 - INFO - codeparrot_training - Step 25198: {'lr': 0.00024825818775903673, 'samples': 12901888, 'steps': 25198, 'batch_loss/train': 1.0053739519789815}
12/26/2021 06:59:30 - INFO - codeparrot_training - Step 25199: {'lr': 0.0002482424250108678, 'samples': 12902400, 'steps': 25199, 'batch_loss/train': 0.8014832139015198}
12/26/2021 06:59:40 - INFO - codeparrot_training - Step 25200: {'lr': 0.0002482266622696864, 'samples': 12902912, 'steps': 25200, 'batch_loss/train': 0.7613943028263748}
12/26/2021 06:59:54 - INFO - codeparrot_training - Step 25201: {'lr': 0.0002482108995355551, 'samples': 12903424, 'steps': 25201, 'batch_loss/train': 0.6531531689688563}
12/26/2021 07:00:04 - INFO - codeparrot_training - Step 25202: {'lr': 0.00024819513680853666, 'samples': 12903936, 'steps': 25202, 'batch_loss/train': 0.8714314689859748}
12/26/2021 07:00:15 - INFO - codeparrot_training - Step 25203: {'lr': 0.0002481793740886935, 'samples': 12904448, 'steps': 25203, 'batch_loss/train': 0.4329336302471347}
12/26/2021 07:00:28 - INFO - codeparrot_training - Step 25204: {'lr': 0.00024816361137608856, 'samples': 12904960, 'steps': 25204, 'batch_loss/train': 0.715070887003094}
12/26/2021 07:00:39 - INFO - codeparrot_training - Step 25205: {'lr': 0.00024814784867078447, 'samples': 12905472, 'steps': 25205, 'batch_loss/train': 0.633795278146863}
12/26/2021 07:00:49 - INFO - codeparrot_training - Step 25206: {'lr': 0.0002481320859728438, 'samples': 12905984, 'steps': 25206, 'batch_loss/train': 0.7792302202433348}
12/26/2021 07:01:00 - INFO - codeparrot_training - Step 25207: {'lr': 0.00024811632328232923, 'samples': 12906496, 'steps': 25207, 'batch_loss/train': 1.1545892343856394}
12/26/2021 07:01:12 - INFO - codeparrot_training - Step 25208: {'lr': 0.00024810056059930343, 'samples': 12907008, 'steps': 25208, 'batch_loss/train': 0.8924387460574508}
12/26/2021 07:01:23 - INFO - codeparrot_training - Step 25209: {'lr': 0.00024808479792382915, 'samples': 12907520, 'steps': 25209, 'batch_loss/train': 0.7427454718854278}
12/26/2021 07:01:33 - INFO - codeparrot_training - Step 25210: {'lr': 0.0002480690352559689, 'samples': 12908032, 'steps': 25210, 'batch_loss/train': 0.912970220670104}
12/26/2021 07:01:46 - INFO - codeparrot_training - Step 25211: {'lr': 0.0002480532725957855, 'samples': 12908544, 'steps': 25211, 'batch_loss/train': 0.8438984034582973}
12/26/2021 07:01:57 - INFO - codeparrot_training - Step 25212: {'lr': 0.00024803750994334157, 'samples': 12909056, 'steps': 25212, 'batch_loss/train': 0.8386733194347471}
12/26/2021 07:02:07 - INFO - codeparrot_training - Step 25213: {'lr': 0.0002480217472986997, 'samples': 12909568, 'steps': 25213, 'batch_loss/train': 0.818516792729497}
12/26/2021 07:02:19 - INFO - codeparrot_training - Step 25214: {'lr': 0.00024800598466192256, 'samples': 12910080, 'steps': 25214, 'batch_loss/train': 0.5850690361112356}
12/26/2021 07:02:30 - INFO - codeparrot_training - Step 25215: {'lr': 0.000247990222033073, 'samples': 12910592, 'steps': 25215, 'batch_loss/train': 0.8122101118788123}
12/26/2021 07:02:41 - INFO - codeparrot_training - Step 25216: {'lr': 0.00024797445941221355, 'samples': 12911104, 'steps': 25216, 'batch_loss/train': 0.8072720803320408}
12/26/2021 07:02:51 - INFO - codeparrot_training - Step 25217: {'lr': 0.0002479586967994068, 'samples': 12911616, 'steps': 25217, 'batch_loss/train': 0.7236932758241892}
12/26/2021 07:03:05 - INFO - codeparrot_training - Step 25218: {'lr': 0.00024794293419471553, 'samples': 12912128, 'steps': 25218, 'batch_loss/train': 0.7611323604360223}
12/26/2021 07:03:15 - INFO - codeparrot_training - Step 25219: {'lr': 0.00024792717159820244, 'samples': 12912640, 'steps': 25219, 'batch_loss/train': 0.6436840840615332}
12/26/2021 07:03:26 - INFO - codeparrot_training - Step 25220: {'lr': 0.0002479114090099301, 'samples': 12913152, 'steps': 25220, 'batch_loss/train': 0.7485068994574249}
12/26/2021 07:03:38 - INFO - codeparrot_training - Step 25221: {'lr': 0.0002478956464299611, 'samples': 12913664, 'steps': 25221, 'batch_loss/train': 0.8766025807708502}
12/26/2021 07:03:49 - INFO - codeparrot_training - Step 25222: {'lr': 0.00024787988385835836, 'samples': 12914176, 'steps': 25222, 'batch_loss/train': 0.7820430193096399}
12/26/2021 07:03:59 - INFO - codeparrot_training - Step 25223: {'lr': 0.0002478641212951844, 'samples': 12914688, 'steps': 25223, 'batch_loss/train': 0.86290272930637}
12/26/2021 07:04:11 - INFO - codeparrot_training - Step 25224: {'lr': 0.0002478483587405018, 'samples': 12915200, 'steps': 25224, 'batch_loss/train': 0.7425351138226688}
12/26/2021 07:04:22 - INFO - codeparrot_training - Step 25225: {'lr': 0.0002478325961943733, 'samples': 12915712, 'steps': 25225, 'batch_loss/train': 0.7603598572313786}
12/26/2021 07:04:33 - INFO - codeparrot_training - Step 25226: {'lr': 0.0002478168336568617, 'samples': 12916224, 'steps': 25226, 'batch_loss/train': 0.7730655688792467}
12/26/2021 07:04:43 - INFO - codeparrot_training - Step 25227: {'lr': 0.0002478010711280295, 'samples': 12916736, 'steps': 25227, 'batch_loss/train': 0.781137814745307}
12/26/2021 07:04:55 - INFO - codeparrot_training - Step 25228: {'lr': 0.00024778530860793943, 'samples': 12917248, 'steps': 25228, 'batch_loss/train': 0.8078898503445089}
12/26/2021 07:05:06 - INFO - codeparrot_training - Step 25229: {'lr': 0.000247769546096654, 'samples': 12917760, 'steps': 25229, 'batch_loss/train': 0.7216541916131973}
12/26/2021 07:05:17 - INFO - codeparrot_training - Step 25230: {'lr': 0.00024775378359423607, 'samples': 12918272, 'steps': 25230, 'batch_loss/train': 0.8242227379232645}
12/26/2021 07:05:30 - INFO - codeparrot_training - Step 25231: {'lr': 0.0002477380211007484, 'samples': 12918784, 'steps': 25231, 'batch_loss/train': 0.676161892246455}
12/26/2021 07:05:40 - INFO - codeparrot_training - Step 25232: {'lr': 0.0002477222586162533, 'samples': 12919296, 'steps': 25232, 'batch_loss/train': 0.49211621849099174}
12/26/2021 07:05:51 - INFO - codeparrot_training - Step 25233: {'lr': 0.00024770649614081375, 'samples': 12919808, 'steps': 25233, 'batch_loss/train': 0.7874887166544795}
12/26/2021 07:06:03 - INFO - codeparrot_training - Step 25234: {'lr': 0.00024769073367449226, 'samples': 12920320, 'steps': 25234, 'batch_loss/train': 0.8523913593962789}
12/26/2021 07:06:14 - INFO - codeparrot_training - Step 25235: {'lr': 0.00024767497121735163, 'samples': 12920832, 'steps': 25235, 'batch_loss/train': 0.7446892810985446}
12/26/2021 07:06:25 - INFO - codeparrot_training - Step 25236: {'lr': 0.0002476592087694543, 'samples': 12921344, 'steps': 25236, 'batch_loss/train': 0.7439605798572302}
12/26/2021 07:06:35 - INFO - codeparrot_training - Step 25237: {'lr': 0.00024764344633086314, 'samples': 12921856, 'steps': 25237, 'batch_loss/train': 0.8222996983677149}
12/26/2021 07:06:47 - INFO - codeparrot_training - Step 25238: {'lr': 0.00024762768390164077, 'samples': 12922368, 'steps': 25238, 'batch_loss/train': 0.7443396970629692}
12/26/2021 07:06:58 - INFO - codeparrot_training - Step 25239: {'lr': 0.0002476119214818498, 'samples': 12922880, 'steps': 25239, 'batch_loss/train': 0.8450577547773719}
12/26/2021 07:07:09 - INFO - codeparrot_training - Step 25240: {'lr': 0.0002475961590715529, 'samples': 12923392, 'steps': 25240, 'batch_loss/train': 0.7906779842451215}
12/26/2021 07:07:22 - INFO - codeparrot_training - Step 25241: {'lr': 0.00024758039667081285, 'samples': 12923904, 'steps': 25241, 'batch_loss/train': 0.7624347559176385}
12/26/2021 07:07:32 - INFO - codeparrot_training - Step 25242: {'lr': 0.0002475646342796923, 'samples': 12924416, 'steps': 25242, 'batch_loss/train': 0.9943456850014627}
12/26/2021 07:07:43 - INFO - codeparrot_training - Step 25243: {'lr': 0.0002475488718982536, 'samples': 12924928, 'steps': 25243, 'batch_loss/train': 0.7388753894483671}
12/26/2021 07:07:55 - INFO - codeparrot_training - Step 25244: {'lr': 0.0002475331095265598, 'samples': 12925440, 'steps': 25244, 'batch_loss/train': 0.8480263967067003}
12/26/2021 07:08:06 - INFO - codeparrot_training - Step 25245: {'lr': 0.00024751734716467347, 'samples': 12925952, 'steps': 25245,
'batch_loss/train': 0.7194511773996055} 12/26/2021 07:08:16 - INFO - codeparrot_training - Step 25246: {'lr': 0.00024750158481265714, 'samples': 12926464, 'steps': 25246, 'batch_loss/train': 0.7120249404106289} 12/26/2021 07:08:27 - INFO - codeparrot_training - Step 25247: {'lr': 0.00024748582247057356, 'samples': 12926976, 'steps': 25247, 'batch_loss/train': 0.754669664427638} 12/26/2021 07:08:40 - INFO - codeparrot_training - Step 25248: {'lr': 0.00024747006013848546, 'samples': 12927488, 'steps': 25248, 'batch_loss/train': 0.7864838857203722} 12/26/2021 07:08:51 - INFO - codeparrot_training - Step 25249: {'lr': 0.00024745429781645545, 'samples': 12928000, 'steps': 25249, 'batch_loss/train': 0.7990676984190941} 12/26/2021 07:09:01 - INFO - codeparrot_training - Step 25250: {'lr': 0.0002474385355045462, 'samples': 12928512, 'steps': 25250, 'batch_loss/train': 0.677190980874002} 12/26/2021 07:09:14 - INFO - codeparrot_training - Step 25251: {'lr': 0.00024742277320282026, 'samples': 12929024, 'steps': 25251, 'batch_loss/train': 0.777458910655696} 12/26/2021 07:09:24 - INFO - codeparrot_training - Step 25252: {'lr': 0.0002474070109113405, 'samples': 12929536, 'steps': 25252, 'batch_loss/train': 0.7513727061450481} 12/26/2021 07:09:35 - INFO - codeparrot_training - Step 25253: {'lr': 0.0002473912486301695, 'samples': 12930048, 'steps': 25253, 'batch_loss/train': 0.4703896758146584} 12/26/2021 07:09:47 - INFO - codeparrot_training - Step 25254: {'lr': 0.00024737548635936975, 'samples': 12930560, 'steps': 25254, 'batch_loss/train': 0.7433117341715842} 12/26/2021 07:09:58 - INFO - codeparrot_training - Step 25255: {'lr': 0.0002473597240990043, 'samples': 12931072, 'steps': 25255, 'batch_loss/train': 0.7909235283732414} 12/26/2021 07:10:08 - INFO - codeparrot_training - Step 25256: {'lr': 0.00024734396184913546, 'samples': 12931584, 'steps': 25256, 'batch_loss/train': 0.7707237731665373} 12/26/2021 07:10:19 - INFO - codeparrot_training - Step 25257: {'lr': 
0.00024732819960982606, 'samples': 12932096, 'steps': 25257, 'batch_loss/train': 0.824686860665679} 12/26/2021 07:10:31 - INFO - codeparrot_training - Step 25258: {'lr': 0.00024731243738113865, 'samples': 12932608, 'steps': 25258, 'batch_loss/train': 0.7180286394432187} 12/26/2021 07:10:42 - INFO - codeparrot_training - Step 25259: {'lr': 0.00024729667516313605, 'samples': 12933120, 'steps': 25259, 'batch_loss/train': 0.8182902811095119} 12/26/2021 07:10:52 - INFO - codeparrot_training - Step 25260: {'lr': 0.0002472809129558809, 'samples': 12933632, 'steps': 25260, 'batch_loss/train': 0.6827348773367703} 12/26/2021 07:11:05 - INFO - codeparrot_training - Step 25261: {'lr': 0.0002472651507594358, 'samples': 12934144, 'steps': 25261, 'batch_loss/train': 0.8835940267890692} 12/26/2021 07:11:16 - INFO - codeparrot_training - Step 25262: {'lr': 0.0002472493885738633, 'samples': 12934656, 'steps': 25262, 'batch_loss/train': 0.8211430353112519} 12/26/2021 07:11:26 - INFO - codeparrot_training - Step 25263: {'lr': 0.0002472336263992263, 'samples': 12935168, 'steps': 25263, 'batch_loss/train': 0.8292129868641496} 12/26/2021 07:11:38 - INFO - codeparrot_training - Step 25264: {'lr': 0.0002472178642355874, 'samples': 12935680, 'steps': 25264, 'batch_loss/train': 0.7430075723677874} 12/26/2021 07:11:49 - INFO - codeparrot_training - Step 25265: {'lr': 0.0002472021020830091, 'samples': 12936192, 'steps': 25265, 'batch_loss/train': 0.7389349334407598} 12/26/2021 07:11:59 - INFO - codeparrot_training - Step 25266: {'lr': 0.00024718633994155417, 'samples': 12936704, 'steps': 25266, 'batch_loss/train': 0.7853728900663555} 12/26/2021 07:12:10 - INFO - codeparrot_training - Step 25267: {'lr': 0.0002471705778112854, 'samples': 12937216, 'steps': 25267, 'batch_loss/train': 0.6541175544261932} 12/26/2021 07:12:22 - INFO - codeparrot_training - Step 25268: {'lr': 0.0002471548156922654, 'samples': 12937728, 'steps': 25268, 'batch_loss/train': 0.8168372639920563} 12/26/2021 07:12:33 - INFO 
- codeparrot_training - Step 25269: {'lr': 0.00024713905358455656, 'samples': 12938240, 'steps': 25269, 'batch_loss/train': 0.8073833100497723} 12/26/2021 07:12:43 - INFO - codeparrot_training - Step 25270: {'lr': 0.00024712329148822193, 'samples': 12938752, 'steps': 25270, 'batch_loss/train': 0.8600629041902721} 12/26/2021 07:12:56 - INFO - codeparrot_training - Step 25271: {'lr': 0.00024710752940332396, 'samples': 12939264, 'steps': 25271, 'batch_loss/train': 0.8163968287408352} 12/26/2021 07:13:07 - INFO - codeparrot_training - Step 25272: {'lr': 0.0002470917673299254, 'samples': 12939776, 'steps': 25272, 'batch_loss/train': 0.7171615394763649} 12/26/2021 07:13:17 - INFO - codeparrot_training - Step 25273: {'lr': 0.00024707600526808874, 'samples': 12940288, 'steps': 25273, 'batch_loss/train': 0.8787297820672393} 12/26/2021 07:13:30 - INFO - codeparrot_training - Step 25274: {'lr': 0.00024706024321787695, 'samples': 12940800, 'steps': 25274, 'batch_loss/train': 0.7769947946071625} 12/26/2021 07:13:40 - INFO - codeparrot_training - Step 25275: {'lr': 0.0002470444811793524, 'samples': 12941312, 'steps': 25275, 'batch_loss/train': 0.7589322440326214} 12/26/2021 07:13:51 - INFO - codeparrot_training - Step 25276: {'lr': 0.000247028719152578, 'samples': 12941824, 'steps': 25276, 'batch_loss/train': 0.7731988416053355} 12/26/2021 07:14:04 - INFO - codeparrot_training - Step 25277: {'lr': 0.00024701295713761614, 'samples': 12942336, 'steps': 25277, 'batch_loss/train': 0.7145546902902424} 12/26/2021 07:14:14 - INFO - codeparrot_training - Step 25278: {'lr': 0.00024699719513452973, 'samples': 12942848, 'steps': 25278, 'batch_loss/train': 0.9208776755258441} 12/26/2021 07:14:25 - INFO - codeparrot_training - Step 25279: {'lr': 0.00024698143314338136, 'samples': 12943360, 'steps': 25279, 'batch_loss/train': 0.8304054532200098} 12/26/2021 07:14:36 - INFO - codeparrot_training - Step 25280: {'lr': 0.00024696567116423354, 'samples': 12943872, 'steps': 25280, 
'batch_loss/train': 0.7006958639249206} 12/26/2021 07:14:48 - INFO - codeparrot_training - Step 25281: {'lr': 0.0002469499091971492, 'samples': 12944384, 'steps': 25281, 'batch_loss/train': 0.8345130595844239} 12/26/2021 07:14:58 - INFO - codeparrot_training - Step 25282: {'lr': 0.00024693414724219086, 'samples': 12944896, 'steps': 25282, 'batch_loss/train': 0.7493009883910418} 12/26/2021 07:15:09 - INFO - codeparrot_training - Step 25283: {'lr': 0.00024691838529942116, 'samples': 12945408, 'steps': 25283, 'batch_loss/train': 0.7530277851037681} 12/26/2021 07:15:21 - INFO - codeparrot_training - Step 25284: {'lr': 0.00024690262336890274, 'samples': 12945920, 'steps': 25284, 'batch_loss/train': 0.8222754523158073} 12/26/2021 07:15:32 - INFO - codeparrot_training - Step 25285: {'lr': 0.0002468868614506984, 'samples': 12946432, 'steps': 25285, 'batch_loss/train': 0.6642490981612355} 12/26/2021 07:15:42 - INFO - codeparrot_training - Step 25286: {'lr': 0.00024687109954487073, 'samples': 12946944, 'steps': 25286, 'batch_loss/train': 1.4412244409322739} 12/26/2021 07:15:55 - INFO - codeparrot_training - Step 25287: {'lr': 0.0002468553376514824, 'samples': 12947456, 'steps': 25287, 'batch_loss/train': 1.214528679382056} 12/26/2021 07:16:06 - INFO - codeparrot_training - Step 25288: {'lr': 0.00024683957577059597, 'samples': 12947968, 'steps': 25288, 'batch_loss/train': 0.8030426502227783} 12/26/2021 07:16:17 - INFO - codeparrot_training - Step 25289: {'lr': 0.0002468238139022742, 'samples': 12948480, 'steps': 25289, 'batch_loss/train': 0.5838584009325132} 12/26/2021 07:16:27 - INFO - codeparrot_training - Step 25290: {'lr': 0.0002468080520465799, 'samples': 12948992, 'steps': 25290, 'batch_loss/train': 0.8264772808179259} 12/26/2021 07:16:40 - INFO - codeparrot_training - Step 25291: {'lr': 0.00024679229020357533, 'samples': 12949504, 'steps': 25291, 'batch_loss/train': 0.7577890955726616} 12/26/2021 07:16:51 - INFO - codeparrot_training - Step 25292: {'lr': 
0.0002467765283733236, 'samples': 12950016, 'steps': 25292, 'batch_loss/train': 0.829571285052225} 12/26/2021 07:17:01 - INFO - codeparrot_training - Step 25293: {'lr': 0.00024676076655588713, 'samples': 12950528, 'steps': 25293, 'batch_loss/train': 0.935324274469167} 12/26/2021 07:17:13 - INFO - codeparrot_training - Step 25294: {'lr': 0.0002467450047513286, 'samples': 12951040, 'steps': 25294, 'batch_loss/train': 0.7920938823372126} 12/26/2021 07:17:24 - INFO - codeparrot_training - Step 25295: {'lr': 0.0002467292429597107, 'samples': 12951552, 'steps': 25295, 'batch_loss/train': 0.7854979118565097} 12/26/2021 07:17:34 - INFO - codeparrot_training - Step 25296: {'lr': 0.0002467134811810961, 'samples': 12952064, 'steps': 25296, 'batch_loss/train': 0.822670885361731} 12/26/2021 07:17:46 - INFO - codeparrot_training - Step 25297: {'lr': 0.0002466977194155475, 'samples': 12952576, 'steps': 25297, 'batch_loss/train': 0.770421254914254} 12/26/2021 07:17:57 - INFO - codeparrot_training - Step 25298: {'lr': 0.0002466819576631275, 'samples': 12953088, 'steps': 25298, 'batch_loss/train': 0.7045935597270727} 12/26/2021 07:18:08 - INFO - codeparrot_training - Step 25299: {'lr': 0.0002466661959238987, 'samples': 12953600, 'steps': 25299, 'batch_loss/train': 0.7364079630933702} 12/26/2021 07:18:18 - INFO - codeparrot_training - Step 25300: {'lr': 0.000246650434197924, 'samples': 12954112, 'steps': 25300, 'batch_loss/train': 0.6296161271166056} 12/26/2021 07:18:31 - INFO - codeparrot_training - Step 25301: {'lr': 0.0002466346724852658, 'samples': 12954624, 'steps': 25301, 'batch_loss/train': 0.8039627065882087} 12/26/2021 07:18:42 - INFO - codeparrot_training - Step 25302: {'lr': 0.00024661891078598693, 'samples': 12955136, 'steps': 25302, 'batch_loss/train': 0.6877897237427533} 12/26/2021 07:18:52 - INFO - codeparrot_training - Step 25303: {'lr': 0.0002466031491001499, 'samples': 12955648, 'steps': 25303, 'batch_loss/train': 0.7613191846758127} 12/26/2021 07:19:05 - INFO - 
codeparrot_training - Step 25304: {'lr': 0.00024658738742781757, 'samples': 12956160, 'steps': 25304, 'batch_loss/train': 0.8813451244495809} 12/26/2021 07:19:16 - INFO - codeparrot_training - Step 25305: {'lr': 0.00024657162576905247, 'samples': 12956672, 'steps': 25305, 'batch_loss/train': 0.7283953651203774} 12/26/2021 07:19:26 - INFO - codeparrot_training - Step 25306: {'lr': 0.00024655586412391725, 'samples': 12957184, 'steps': 25306, 'batch_loss/train': 1.2224285434931517} 12/26/2021 07:19:38 - INFO - codeparrot_training - Step 25307: {'lr': 0.0002465401024924747, 'samples': 12957696, 'steps': 25307, 'batch_loss/train': 0.7949550431221724} 12/26/2021 07:19:49 - INFO - codeparrot_training - Step 25308: {'lr': 0.0002465243408747873, 'samples': 12958208, 'steps': 25308, 'batch_loss/train': 0.863728646421805} 12/26/2021 07:19:59 - INFO - codeparrot_training - Step 25309: {'lr': 0.0002465085792709179, 'samples': 12958720, 'steps': 25309, 'batch_loss/train': 0.9161113444715738} 12/26/2021 07:20:10 - INFO - codeparrot_training - Step 25310: {'lr': 0.00024649281768092895, 'samples': 12959232, 'steps': 25310, 'batch_loss/train': 0.8621358126401901} 12/26/2021 07:20:23 - INFO - codeparrot_training - Step 25311: {'lr': 0.0002464770561048833, 'samples': 12959744, 'steps': 25311, 'batch_loss/train': 0.7718858229927719} 12/26/2021 07:20:33 - INFO - codeparrot_training - Step 25312: {'lr': 0.00024646129454284363, 'samples': 12960256, 'steps': 25312, 'batch_loss/train': 0.8593959324061871} 12/26/2021 07:20:44 - INFO - codeparrot_training - Step 25313: {'lr': 0.00024644553299487244, 'samples': 12960768, 'steps': 25313, 'batch_loss/train': 0.8555699987336993} 12/26/2021 07:20:56 - INFO - codeparrot_training - Step 25314: {'lr': 0.0002464297714610324, 'samples': 12961280, 'steps': 25314, 'batch_loss/train': 0.8807461345568299} 12/26/2021 07:21:07 - INFO - codeparrot_training - Step 25315: {'lr': 0.0002464140099413863, 'samples': 12961792, 'steps': 25315, 'batch_loss/train': 
0.8216862613335252} 12/26/2021 07:21:17 - INFO - codeparrot_training - Step 25316: {'lr': 0.0002463982484359968, 'samples': 12962304, 'steps': 25316, 'batch_loss/train': 0.7079996354877949} 12/26/2021 07:21:30 - INFO - codeparrot_training - Step 25317: {'lr': 0.0002463824869449264, 'samples': 12962816, 'steps': 25317, 'batch_loss/train': 0.7702031619846821} 12/26/2021 07:21:41 - INFO - codeparrot_training - Step 25318: {'lr': 0.0002463667254682379, 'samples': 12963328, 'steps': 25318, 'batch_loss/train': 0.7269221104215831} 12/26/2021 07:21:51 - INFO - codeparrot_training - Step 25319: {'lr': 0.000246350964005994, 'samples': 12963840, 'steps': 25319, 'batch_loss/train': 0.8442498408257961} 12/26/2021 07:22:02 - INFO - codeparrot_training - Step 25320: {'lr': 0.0002463352025582572, 'samples': 12964352, 'steps': 25320, 'batch_loss/train': 0.7899448396638036} 12/26/2021 07:22:14 - INFO - codeparrot_training - Step 25321: {'lr': 0.00024631944112509023, 'samples': 12964864, 'steps': 25321, 'batch_loss/train': 0.7599393334239721} 12/26/2021 07:22:25 - INFO - codeparrot_training - Step 25322: {'lr': 0.00024630367970655587, 'samples': 12965376, 'steps': 25322, 'batch_loss/train': 0.8126710345968604} 12/26/2021 07:22:35 - INFO - codeparrot_training - Step 25323: {'lr': 0.0002462879183027167, 'samples': 12965888, 'steps': 25323, 'batch_loss/train': 0.7138885674066842} 12/26/2021 07:22:48 - INFO - codeparrot_training - Step 25324: {'lr': 0.00024627215691363526, 'samples': 12966400, 'steps': 25324, 'batch_loss/train': 0.7698466056026518} 12/26/2021 07:22:58 - INFO - codeparrot_training - Step 25325: {'lr': 0.0002462563955393743, 'samples': 12966912, 'steps': 25325, 'batch_loss/train': 0.7943705562502146} 12/26/2021 07:23:09 - INFO - codeparrot_training - Step 25326: {'lr': 0.0002462406341799966, 'samples': 12967424, 'steps': 25326, 'batch_loss/train': 0.7365490397205576} 12/26/2021 07:23:22 - INFO - codeparrot_training - Step 25327: {'lr': 0.0002462248728355647, 'samples': 
12967936, 'steps': 25327, 'batch_loss/train': 0.47858711751177907} 12/26/2021 07:23:32 - INFO - codeparrot_training - Step 25328: {'lr': 0.0002462091115061412, 'samples': 12968448, 'steps': 25328, 'batch_loss/train': 0.7393220420926809} 12/26/2021 07:23:43 - INFO - codeparrot_training - Step 25329: {'lr': 0.00024619335019178894, 'samples': 12968960, 'steps': 25329, 'batch_loss/train': 0.7532531747128814} 12/26/2021 07:23:55 - INFO - codeparrot_training - Step 25330: {'lr': 0.00024617758889257043, 'samples': 12969472, 'steps': 25330, 'batch_loss/train': 0.8638168655452318} 12/26/2021 07:24:06 - INFO - codeparrot_training - Step 25331: {'lr': 0.0002461618276085485, 'samples': 12969984, 'steps': 25331, 'batch_loss/train': 0.799006418325007} 12/26/2021 07:24:16 - INFO - codeparrot_training - Step 25332: {'lr': 0.0002461460663397855, 'samples': 12970496, 'steps': 25332, 'batch_loss/train': 0.7566322372294962} 12/26/2021 07:24:27 - INFO - codeparrot_training - Step 25333: {'lr': 0.00024613030508634443, 'samples': 12971008, 'steps': 25333, 'batch_loss/train': 0.6381411827169359} 12/26/2021 07:24:39 - INFO - codeparrot_training - Step 25334: {'lr': 0.0002461145438482877, 'samples': 12971520, 'steps': 25334, 'batch_loss/train': 0.7141017545945942} 12/26/2021 07:24:50 - INFO - codeparrot_training - Step 25335: {'lr': 0.00024609878262567817, 'samples': 12972032, 'steps': 25335, 'batch_loss/train': 0.7904504425823689} 12/26/2021 07:25:00 - INFO - codeparrot_training - Step 25336: {'lr': 0.0002460830214185783, 'samples': 12972544, 'steps': 25336, 'batch_loss/train': 0.7841252505313605} 12/26/2021 07:25:12 - INFO - codeparrot_training - Step 25337: {'lr': 0.00024606726022705094, 'samples': 12973056, 'steps': 25337, 'batch_loss/train': 0.798270296305418} 12/26/2021 07:25:23 - INFO - codeparrot_training - Step 25338: {'lr': 0.0002460514990511587, 'samples': 12973568, 'steps': 25338, 'batch_loss/train': 0.7787195895798504} 12/26/2021 07:25:34 - INFO - codeparrot_training - Step 
25339: {'lr': 0.0002460357378909641, 'samples': 12974080, 'steps': 25339, 'batch_loss/train': 0.8543125931173563} 12/26/2021 07:25:46 - INFO - codeparrot_training - Step 25340: {'lr': 0.00024601997674652995, 'samples': 12974592, 'steps': 25340, 'batch_loss/train': 0.7798041058704257} 12/26/2021 07:25:57 - INFO - codeparrot_training - Step 25341: {'lr': 0.00024600421561791894, 'samples': 12975104, 'steps': 25341, 'batch_loss/train': 0.8071639854460955} 12/26/2021 07:26:08 - INFO - codeparrot_training - Step 25342: {'lr': 0.0002459884545051936, 'samples': 12975616, 'steps': 25342, 'batch_loss/train': 0.6946186758577824} 12/26/2021 07:26:18 - INFO - codeparrot_training - Step 25343: {'lr': 0.00024597269340841663, 'samples': 12976128, 'steps': 25343, 'batch_loss/train': 0.7040374320931733} 12/26/2021 07:26:30 - INFO - codeparrot_training - Step 25344: {'lr': 0.0002459569323276507, 'samples': 12976640, 'steps': 25344, 'batch_loss/train': 0.6986176986247301} 12/26/2021 07:26:41 - INFO - codeparrot_training - Step 25345: {'lr': 0.00024594117126295863, 'samples': 12977152, 'steps': 25345, 'batch_loss/train': 0.8554142005741596} 12/26/2021 07:26:52 - INFO - codeparrot_training - Step 25346: {'lr': 0.0002459254102144028, 'samples': 12977664, 'steps': 25346, 'batch_loss/train': 0.815017644315958} 12/26/2021 07:27:04 - INFO - codeparrot_training - Step 25347: {'lr': 0.00024590964918204595, 'samples': 12978176, 'steps': 25347, 'batch_loss/train': 0.7652826309204102} 12/26/2021 07:27:15 - INFO - codeparrot_training - Step 25348: {'lr': 0.0002458938881659509, 'samples': 12978688, 'steps': 25348, 'batch_loss/train': 0.782454433850944} 12/26/2021 07:27:26 - INFO - codeparrot_training - Step 25349: {'lr': 0.00024587812716618014, 'samples': 12979200, 'steps': 25349, 'batch_loss/train': 0.7266142340376973} 12/26/2021 07:27:38 - INFO - codeparrot_training - Step 25350: {'lr': 0.0002458623661827965, 'samples': 12979712, 'steps': 25350, 'batch_loss/train': 0.7426616088487208} 12/26/2021 
07:27:49 - INFO - codeparrot_training - Step 25351: {'lr': 0.0002458466052158624, 'samples': 12980224, 'steps': 25351, 'batch_loss/train': 0.7766115609556437} 12/26/2021 07:27:59 - INFO - codeparrot_training - Step 25352: {'lr': 0.00024583084426544065, 'samples': 12980736, 'steps': 25352, 'batch_loss/train': 0.8152451182249933} 12/26/2021 07:28:10 - INFO - codeparrot_training - Step 25353: {'lr': 0.00024581508333159393, 'samples': 12981248, 'steps': 25353, 'batch_loss/train': 0.8470707144588232} 12/26/2021 07:28:22 - INFO - codeparrot_training - Step 25354: {'lr': 0.0002457993224143848, 'samples': 12981760, 'steps': 25354, 'batch_loss/train': 0.7705894717946649} 12/26/2021 07:28:32 - INFO - codeparrot_training - Step 25355: {'lr': 0.00024578356151387605, 'samples': 12982272, 'steps': 25355, 'batch_loss/train': 0.7544023925438523} 12/26/2021 07:28:43 - INFO - codeparrot_training - Step 25356: {'lr': 0.0002457678006301302, 'samples': 12982784, 'steps': 25356, 'batch_loss/train': 0.781107165850699} 12/26/2021 07:28:56 - INFO - codeparrot_training - Step 25357: {'lr': 0.00024575203976321005, 'samples': 12983296, 'steps': 25357, 'batch_loss/train': 0.9001335194334388} 12/26/2021 07:29:07 - INFO - codeparrot_training - Step 25358: {'lr': 0.0002457362789131781, 'samples': 12983808, 'steps': 25358, 'batch_loss/train': 0.5494246690068394} 12/26/2021 07:29:17 - INFO - codeparrot_training - Step 25359: {'lr': 0.00024572051808009716, 'samples': 12984320, 'steps': 25359, 'batch_loss/train': 0.7335309130139649} 12/26/2021 07:29:30 - INFO - codeparrot_training - Step 25360: {'lr': 0.0002457047572640298, 'samples': 12984832, 'steps': 25360, 'batch_loss/train': 0.860393347509671} 12/26/2021 07:29:40 - INFO - codeparrot_training - Step 25361: {'lr': 0.0002456889964650388, 'samples': 12985344, 'steps': 25361, 'batch_loss/train': 0.8578019645065069} 12/26/2021 07:29:51 - INFO - codeparrot_training - Step 25362: {'lr': 0.0002456732356831866, 'samples': 12985856, 'steps': 25362, 
'batch_loss/train': 0.7850818978622556} 12/26/2021 07:30:02 - INFO - codeparrot_training - Step 25363: {'lr': 0.00024565747491853596, 'samples': 12986368, 'steps': 25363, 'batch_loss/train': 0.8587602023035288} 12/26/2021 07:30:14 - INFO - codeparrot_training - Step 25364: {'lr': 0.00024564171417114963, 'samples': 12986880, 'steps': 25364, 'batch_loss/train': 0.9332420630380511} 12/26/2021 07:30:25 - INFO - codeparrot_training - Step 25365: {'lr': 0.00024562595344109015, 'samples': 12987392, 'steps': 25365, 'batch_loss/train': 1.3789934376254678} 12/26/2021 07:30:36 - INFO - codeparrot_training - Step 25366: {'lr': 0.00024561019272842025, 'samples': 12987904, 'steps': 25366, 'batch_loss/train': 0.8737056590616703} 12/26/2021 07:30:48 - INFO - codeparrot_training - Step 25367: {'lr': 0.00024559443203320264, 'samples': 12988416, 'steps': 25367, 'batch_loss/train': 0.7742352578788996} 12/26/2021 07:30:58 - INFO - codeparrot_training - Step 25368: {'lr': 0.0002455786713554999, 'samples': 12988928, 'steps': 25368, 'batch_loss/train': 0.8010924705304205} 12/26/2021 07:31:09 - INFO - codeparrot_training - Step 25369: {'lr': 0.00024556291069537453, 'samples': 12989440, 'steps': 25369, 'batch_loss/train': 0.7124667521566153} 12/26/2021 07:31:21 - INFO - codeparrot_training - Step 25370: {'lr': 0.0002455471500528894, 'samples': 12989952, 'steps': 25370, 'batch_loss/train': 0.871288635302335} 12/26/2021 07:31:32 - INFO - codeparrot_training - Step 25371: {'lr': 0.0002455313894281073, 'samples': 12990464, 'steps': 25371, 'batch_loss/train': 0.8234588205814362} 12/26/2021 07:31:42 - INFO - codeparrot_training - Step 25372: {'lr': 0.00024551562882109055, 'samples': 12990976, 'steps': 25372, 'batch_loss/train': 0.7400173996575177} 12/26/2021 07:31:53 - INFO - codeparrot_training - Step 25373: {'lr': 0.00024549986823190194, 'samples': 12991488, 'steps': 25373, 'batch_loss/train': 0.693841791478917} 12/26/2021 07:32:05 - INFO - codeparrot_training - Step 25374: {'lr': 
0.00024548410766060425, 'samples': 12992000, 'steps': 25374, 'batch_loss/train': 0.5898843656759709} 12/26/2021 07:32:16 - INFO - codeparrot_training - Step 25375: {'lr': 0.00024546834710725997, 'samples': 12992512, 'steps': 25375, 'batch_loss/train': 0.7365167494863272} 12/26/2021 07:32:26 - INFO - codeparrot_training - Step 25376: {'lr': 0.00024545258657193193, 'samples': 12993024, 'steps': 25376, 'batch_loss/train': 0.8548072943231091} 12/26/2021 07:32:39 - INFO - codeparrot_training - Step 25377: {'lr': 0.0002454368260546826, 'samples': 12993536, 'steps': 25377, 'batch_loss/train': 0.8361872429959476} 12/26/2021 07:32:50 - INFO - codeparrot_training - Step 25378: {'lr': 0.00024542106555557476, 'samples': 12994048, 'steps': 25378, 'batch_loss/train': 0.8396130031906068} 12/26/2021 07:33:00 - INFO - codeparrot_training - Step 25379: {'lr': 0.00024540530507467106, 'samples': 12994560, 'steps': 25379, 'batch_loss/train': 0.7665313724428415} 12/26/2021 07:33:12 - INFO - codeparrot_training - Step 25380: {'lr': 0.000245389544612034, 'samples': 12995072, 'steps': 25380, 'batch_loss/train': 0.7999862506985664} 12/26/2021 07:33:23 - INFO - codeparrot_training - Step 25381: {'lr': 0.00024537378416772653, 'samples': 12995584, 'steps': 25381, 'batch_loss/train': 0.871319143101573} 12/26/2021 07:33:34 - INFO - codeparrot_training - Step 25382: {'lr': 0.0002453580237418111, 'samples': 12996096, 'steps': 25382, 'batch_loss/train': 0.8008547341451049} 12/26/2021 07:33:44 - INFO - codeparrot_training - Step 25383: {'lr': 0.00024534226333435043, 'samples': 12996608, 'steps': 25383, 'batch_loss/train': 0.7540412547532469} 12/26/2021 07:33:56 - INFO - codeparrot_training - Step 25384: {'lr': 0.0002453265029454071, 'samples': 12997120, 'steps': 25384, 'batch_loss/train': 0.9050165433436632} 12/26/2021 07:34:07 - INFO - codeparrot_training - Step 25385: {'lr': 0.0002453107425750438, 'samples': 12997632, 'steps': 25385, 'batch_loss/train': 0.7634796863421798} 12/26/2021 07:34:17 - 
INFO - codeparrot_training - Step 25386: {'lr': 0.0002452949822233233, 'samples': 12998144, 'steps': 25386, 'batch_loss/train': 0.7556217219680548} 12/26/2021 07:34:30 - INFO - codeparrot_training - Step 25387: {'lr': 0.00024527922189030826, 'samples': 12998656, 'steps': 25387, 'batch_loss/train': 0.6770554697141051} 12/26/2021 07:34:41 - INFO - codeparrot_training - Step 25388: {'lr': 0.0002452634615760611, 'samples': 12999168, 'steps': 25388, 'batch_loss/train': 0.8456044206395745} 12/26/2021 07:34:52 - INFO - codeparrot_training - Step 25389: {'lr': 0.00024524770128064464, 'samples': 12999680, 'steps': 25389, 'batch_loss/train': 0.7407592302188277} 12/26/2021 07:35:04 - INFO - codeparrot_training - Step 25390: {'lr': 0.0002452319410041216, 'samples': 13000192, 'steps': 25390, 'batch_loss/train': 0.6251990406308323} 12/26/2021 07:35:14 - INFO - codeparrot_training - Step 25391: {'lr': 0.0002452161807465545, 'samples': 13000704, 'steps': 25391, 'batch_loss/train': 0.7039190374780446} 12/26/2021 07:35:25 - INFO - codeparrot_training - Step 25392: {'lr': 0.000245200420508006, 'samples': 13001216, 'steps': 25392, 'batch_loss/train': 0.7306415108032525} 12/26/2021 07:35:36 - INFO - codeparrot_training - Step 25393: {'lr': 0.000245184660288539, 'samples': 13001728, 'steps': 25393, 'batch_loss/train': 0.9142991360276937} 12/26/2021 07:35:48 - INFO - codeparrot_training - Step 25394: {'lr': 0.0002451689000882158, 'samples': 13002240, 'steps': 25394, 'batch_loss/train': 0.6952954871812835} 12/26/2021 07:35:59 - INFO - codeparrot_training - Step 25395: {'lr': 0.00024515313990709927, 'samples': 13002752, 'steps': 25395, 'batch_loss/train': 0.6990523962303996} 12/26/2021 07:36:09 - INFO - codeparrot_training - Step 25396: {'lr': 0.0002451373797452521, 'samples': 13003264, 'steps': 25396, 'batch_loss/train': 1.174984791316092} 12/26/2021 07:36:22 - INFO - codeparrot_training - Step 25397: {'lr': 0.0002451216196027369, 'samples': 13003776, 'steps': 25397, 'batch_loss/train': 
0.8036088240332901} 12/26/2021 07:36:33 - INFO - codeparrot_training - Step 25398: {'lr': 0.0002451058594796162, 'samples': 13004288, 'steps': 25398, 'batch_loss/train': 0.8061168324202299} 12/26/2021 07:36:43 - INFO - codeparrot_training - Step 25399: {'lr': 0.00024509009937595267, 'samples': 13004800, 'steps': 25399, 'batch_loss/train': 0.822550181299448} 12/26/2021 07:36:56 - INFO - codeparrot_training - Step 25400: {'lr': 0.00024507433929180924, 'samples': 13005312, 'steps': 25400, 'batch_loss/train': 0.7763899373821914} 12/26/2021 07:37:06 - INFO - codeparrot_training - Step 25401: {'lr': 0.0002450585792272483, 'samples': 13005824, 'steps': 25401, 'batch_loss/train': 1.0820992450753693} 12/26/2021 07:37:17 - INFO - codeparrot_training - Step 25402: {'lr': 0.0002450428191823325, 'samples': 13006336, 'steps': 25402, 'batch_loss/train': 0.8203055632766336} 12/26/2021 07:37:28 - INFO - codeparrot_training - Step 25403: {'lr': 0.0002450270591571247, 'samples': 13006848, 'steps': 25403, 'batch_loss/train': 0.8230154486373067} 12/26/2021 07:37:41 - INFO - codeparrot_training - Step 25404: {'lr': 0.0002450112991516875, 'samples': 13007360, 'steps': 25404, 'batch_loss/train': 0.7468617481645197} 12/26/2021 07:37:51 - INFO - codeparrot_training - Step 25405: {'lr': 0.00024499553916608333, 'samples': 13007872, 'steps': 25405, 'batch_loss/train': 0.8325383504852653} 12/26/2021 07:38:02 - INFO - codeparrot_training - Step 25406: {'lr': 0.000244979779200375, 'samples': 13008384, 'steps': 25406, 'batch_loss/train': 0.8488864479586482} 12/26/2021 07:38:14 - INFO - codeparrot_training - Step 25407: {'lr': 0.00024496401925462526, 'samples': 13008896, 'steps': 25407, 'batch_loss/train': 0.7771522486582398} 12/26/2021 07:38:25 - INFO - codeparrot_training - Step 25408: {'lr': 0.00024494825932889667, 'samples': 13009408, 'steps': 25408, 'batch_loss/train': 0.7495356234721839} 12/26/2021 07:38:35 - INFO - codeparrot_training - Step 25409: {'lr': 0.0002449324994232519, 'samples': 
13009920, 'steps': 25409, 'batch_loss/train': 0.8241537779103965} 12/26/2021 07:38:46 - INFO - codeparrot_training - Step 25410: {'lr': 0.00024491673953775354, 'samples': 13010432, 'steps': 25410, 'batch_loss/train': 1.1986556188203394} 12/26/2021 07:38:58 - INFO - codeparrot_training - Step 25411: {'lr': 0.00024490097967246434, 'samples': 13010944, 'steps': 25411, 'batch_loss/train': 0.8233243371360004} 12/26/2021 07:39:09 - INFO - codeparrot_training - Step 25412: {'lr': 0.00024488521982744696, 'samples': 13011456, 'steps': 25412, 'batch_loss/train': 0.6863092781277373} 12/26/2021 07:39:19 - INFO - codeparrot_training - Step 25413: {'lr': 0.000244869460002764, 'samples': 13011968, 'steps': 25413, 'batch_loss/train': 0.8579651885665953} 12/26/2021 07:39:31 - INFO - codeparrot_training - Step 25414: {'lr': 0.00024485370019847804, 'samples': 13012480, 'steps': 25414, 'batch_loss/train': 0.7258054243866354} 12/26/2021 07:39:42 - INFO - codeparrot_training - Step 25415: {'lr': 0.0002448379404146519, 'samples': 13012992, 'steps': 25415, 'batch_loss/train': 0.8513416517525911} 12/26/2021 07:39:53 - INFO - codeparrot_training - Step 25416: {'lr': 0.00024482218065134816, 'samples': 13013504, 'steps': 25416, 'batch_loss/train': 0.9008687753230333} 12/26/2021 07:40:05 - INFO - codeparrot_training - Step 25417: {'lr': 0.00024480642090862943, 'samples': 13014016, 'steps': 25417, 'batch_loss/train': 0.7531140232458711} 12/26/2021 07:40:16 - INFO - codeparrot_training - Step 25418: {'lr': 0.00024479066118655846, 'samples': 13014528, 'steps': 25418, 'batch_loss/train': 0.8334431545808911} 12/26/2021 07:40:27 - INFO - codeparrot_training - Step 25419: {'lr': 0.00024477490148519786, 'samples': 13015040, 'steps': 25419, 'batch_loss/train': 0.7741233939304948} 12/26/2021 07:40:39 - INFO - codeparrot_training - Step 25420: {'lr': 0.00024475914180461026, 'samples': 13015552, 'steps': 25420, 'batch_loss/train': 0.8121870262548327} 12/26/2021 07:40:49 - INFO - codeparrot_training - Step 
25421: {'lr': 0.00024474338214485825, 'samples': 13016064, 'steps': 25421, 'batch_loss/train': 0.8603701549582183} 12/26/2021 07:41:00 - INFO - codeparrot_training - Step 25422: {'lr': 0.0002447276225060047, 'samples': 13016576, 'steps': 25422, 'batch_loss/train': 0.7138027236796916} 12/26/2021 07:41:10 - INFO - codeparrot_training - Step 25423: {'lr': 0.00024471186288811214, 'samples': 13017088, 'steps': 25423, 'batch_loss/train': 0.7466053767129779} 12/26/2021 07:41:23 - INFO - codeparrot_training - Step 25424: {'lr': 0.0002446961032912431, 'samples': 13017600, 'steps': 25424, 'batch_loss/train': 0.9878677083179355} 12/26/2021 07:41:34 - INFO - codeparrot_training - Step 25425: {'lr': 0.00024468034371546033, 'samples': 13018112, 'steps': 25425, 'batch_loss/train': 1.1608747923746705} 12/26/2021 07:41:44 - INFO - codeparrot_training - Step 25426: {'lr': 0.00024466458416082666, 'samples': 13018624, 'steps': 25426, 'batch_loss/train': 0.761301476508379} 12/26/2021 07:41:57 - INFO - codeparrot_training - Step 25427: {'lr': 0.0002446488246274045, 'samples': 13019136, 'steps': 25427, 'batch_loss/train': 0.8101494791917503} 12/26/2021 07:42:08 - INFO - codeparrot_training - Step 25428: {'lr': 0.0002446330651152565, 'samples': 13019648, 'steps': 25428, 'batch_loss/train': 0.712806137278676} 12/26/2021 07:42:18 - INFO - codeparrot_training - Step 25429: {'lr': 0.0002446173056244456, 'samples': 13020160, 'steps': 25429, 'batch_loss/train': 0.8665047744289041} 12/26/2021 07:42:31 - INFO - codeparrot_training - Step 25430: {'lr': 0.0002446015461550341, 'samples': 13020672, 'steps': 25430, 'batch_loss/train': 0.7810465442016721} 12/26/2021 07:42:41 - INFO - codeparrot_training - Step 25431: {'lr': 0.00024458578670708496, 'samples': 13021184, 'steps': 25431, 'batch_loss/train': 0.8129059700295329} 12/26/2021 07:42:52 - INFO - codeparrot_training - Step 25432: {'lr': 0.00024457002728066053, 'samples': 13021696, 'steps': 25432, 'batch_loss/train': 0.7552985725924373} 12/26/2021 
07:43:04 - INFO - codeparrot_training - Step 25433: {'lr': 0.00024455426787582375, 'samples': 13022208, 'steps': 25433, 'batch_loss/train': 0.6449605815578252} 12/26/2021 07:43:15 - INFO - codeparrot_training - Step 25434: {'lr': 0.00024453850849263714, 'samples': 13022720, 'steps': 25434, 'batch_loss/train': 0.8152423533611} 12/26/2021 07:43:26 - INFO - codeparrot_training - Step 25435: {'lr': 0.00024452274913116335, 'samples': 13023232, 'steps': 25435, 'batch_loss/train': 0.7941770548932254} 12/26/2021 07:43:36 - INFO - codeparrot_training - Step 25436: {'lr': 0.00024450698979146496, 'samples': 13023744, 'steps': 25436, 'batch_loss/train': 0.8146755611523986} 12/26/2021 07:43:49 - INFO - codeparrot_training - Step 25437: {'lr': 0.0002444912304736047, 'samples': 13024256, 'steps': 25437, 'batch_loss/train': 0.8922433909028769} 12/26/2021 07:43:59 - INFO - codeparrot_training - Step 25438: {'lr': 0.00024447547117764537, 'samples': 13024768, 'steps': 25438, 'batch_loss/train': 0.7938921856693923} 12/26/2021 07:44:10 - INFO - codeparrot_training - Step 25439: {'lr': 0.0002444597119036493, 'samples': 13025280, 'steps': 25439, 'batch_loss/train': 0.900546652264893} 12/26/2021 07:44:22 - INFO - codeparrot_training - Step 25440: {'lr': 0.0002444439526516794, 'samples': 13025792, 'steps': 25440, 'batch_loss/train': 0.7754444655729458} 12/26/2021 07:44:33 - INFO - codeparrot_training - Step 25441: {'lr': 0.0002444281934217983, 'samples': 13026304, 'steps': 25441, 'batch_loss/train': 0.7852080389857292} 12/26/2021 07:44:44 - INFO - codeparrot_training - Step 25442: {'lr': 0.0002444124342140686, 'samples': 13026816, 'steps': 25442, 'batch_loss/train': 0.783188997185789} 12/26/2021 07:44:54 - INFO - codeparrot_training - Step 25443: {'lr': 0.0002443966750285528, 'samples': 13027328, 'steps': 25443, 'batch_loss/train': 1.3487502527423203} 12/26/2021 07:45:07 - INFO - codeparrot_training - Step 25444: {'lr': 0.00024438091586531386, 'samples': 13027840, 'steps': 25444, 
'batch_loss/train': 0.8172194515354931} 12/26/2021 07:45:17 - INFO - codeparrot_training - Step 25445: {'lr': 0.0002443651567244143, 'samples': 13028352, 'steps': 25445, 'batch_loss/train': 0.7570237312465906} 12/26/2021 07:45:28 - INFO - codeparrot_training - Step 25446: {'lr': 0.00024434939760591666, 'samples': 13028864, 'steps': 25446, 'batch_loss/train': 0.8262944389134645} 12/26/2021 07:45:40 - INFO - codeparrot_training - Step 25447: {'lr': 0.00024433363850988364, 'samples': 13029376, 'steps': 25447, 'batch_loss/train': 0.8412288976833224} 12/26/2021 07:45:51 - INFO - codeparrot_training - Step 25448: {'lr': 0.000244317879436378, 'samples': 13029888, 'steps': 25448, 'batch_loss/train': 0.708147298777476} 12/26/2021 07:46:01 - INFO - codeparrot_training - Step 25449: {'lr': 0.0002443021203854624, 'samples': 13030400, 'steps': 25449, 'batch_loss/train': 0.7583611465524882} 12/26/2021 07:46:14 - INFO - codeparrot_training - Step 25450: {'lr': 0.0002442863613571993, 'samples': 13030912, 'steps': 25450, 'batch_loss/train': 0.7533436673693359} 12/26/2021 07:46:25 - INFO - codeparrot_training - Step 25451: {'lr': 0.00024427060235165146, 'samples': 13031424, 'steps': 25451, 'batch_loss/train': 0.729934114497155} 12/26/2021 07:46:35 - INFO - codeparrot_training - Step 25452: {'lr': 0.00024425484336888163, 'samples': 13031936, 'steps': 25452, 'batch_loss/train': 0.8378202025778592} 12/26/2021 07:46:48 - INFO - codeparrot_training - Step 25453: {'lr': 0.00024423908440895237, 'samples': 13032448, 'steps': 25453, 'batch_loss/train': 0.6327522803330794} 12/26/2021 07:46:59 - INFO - codeparrot_training - Step 25454: {'lr': 0.0002442233254719262, 'samples': 13032960, 'steps': 25454, 'batch_loss/train': 0.6477282037958503} 12/26/2021 07:47:09 - INFO - codeparrot_training - Step 25455: {'lr': 0.000244207566557866, 'samples': 13033472, 'steps': 25455, 'batch_loss/train': 0.7823877381160855} 12/26/2021 07:47:20 - INFO - codeparrot_training - Step 25456: {'lr': 
0.0002441918076668344, 'samples': 13033984, 'steps': 25456, 'batch_loss/train': 0.7434593327343464} 12/26/2021 07:47:32 - INFO - codeparrot_training - Step 25457: {'lr': 0.00024417604879889394, 'samples': 13034496, 'steps': 25457, 'batch_loss/train': 0.7816774221137166} 12/26/2021 07:47:43 - INFO - codeparrot_training - Step 25458: {'lr': 0.0002441602899541072, 'samples': 13035008, 'steps': 25458, 'batch_loss/train': 1.0845389538444579} 12/26/2021 07:47:53 - INFO - codeparrot_training - Step 25459: {'lr': 0.0002441445311325371, 'samples': 13035520, 'steps': 25459, 'batch_loss/train': 0.9729837044142187} 12/26/2021 07:48:05 - INFO - codeparrot_training - Step 25460: {'lr': 0.00024412877233424604, 'samples': 13036032, 'steps': 25460, 'batch_loss/train': 0.7763194935396314} 12/26/2021 07:48:16 - INFO - codeparrot_training - Step 25461: {'lr': 0.00024411301355929685, 'samples': 13036544, 'steps': 25461, 'batch_loss/train': 0.7907136203721166} 12/26/2021 07:48:27 - INFO - codeparrot_training - Step 25462: {'lr': 0.00024409725480775197, 'samples': 13037056, 'steps': 25462, 'batch_loss/train': 0.8247030223719776} 12/26/2021 07:48:40 - INFO - codeparrot_training - Step 25463: {'lr': 0.00024408149607967422, 'samples': 13037568, 'steps': 25463, 'batch_loss/train': 0.7964211283251643} 12/26/2021 07:48:51 - INFO - codeparrot_training - Step 25464: {'lr': 0.00024406573737512628, 'samples': 13038080, 'steps': 25464, 'batch_loss/train': 0.7149649448692799} 12/26/2021 07:49:01 - INFO - codeparrot_training - Step 25465: {'lr': 0.00024404997869417058, 'samples': 13038592, 'steps': 25465, 'batch_loss/train': 0.5932726527098566} 12/26/2021 07:49:12 - INFO - codeparrot_training - Step 25466: {'lr': 0.00024403422003687, 'samples': 13039104, 'steps': 25466, 'batch_loss/train': 0.730967917945236} 12/26/2021 07:49:24 - INFO - codeparrot_training - Step 25467: {'lr': 0.00024401846140328717, 'samples': 13039616, 'steps': 25467, 'batch_loss/train': 0.774234626442194} 12/26/2021 07:49:35 - 
INFO - codeparrot_training - Step 25468: {'lr': 0.00024400270279348468, 'samples': 13040128, 'steps': 25468, 'batch_loss/train': 0.7984460713341832} 12/26/2021 07:49:45 - INFO - codeparrot_training - Step 25469: {'lr': 0.00024398694420752504, 'samples': 13040640, 'steps': 25469, 'batch_loss/train': 0.7120322035625577} 12/26/2021 07:49:59 - INFO - codeparrot_training - Step 25470: {'lr': 0.00024397118564547114, 'samples': 13041152, 'steps': 25470, 'batch_loss/train': 0.785693955142051} 12/26/2021 07:50:09 - INFO - codeparrot_training - Step 25471: {'lr': 0.00024395542710738556, 'samples': 13041664, 'steps': 25471, 'batch_loss/train': 1.5290119735873304} 12/26/2021 07:50:20 - INFO - codeparrot_training - Step 25472: {'lr': 0.00024393966859333086, 'samples': 13042176, 'steps': 25472, 'batch_loss/train': 0.8014979155850597} 12/26/2021 07:50:30 - INFO - codeparrot_training - Step 25473: {'lr': 0.0002439239101033697, 'samples': 13042688, 'steps': 25473, 'batch_loss/train': 1.46103885024786} 12/26/2021 07:50:42 - INFO - codeparrot_training - Step 25474: {'lr': 0.00024390815163756487, 'samples': 13043200, 'steps': 25474, 'batch_loss/train': 1.1255553355440497} 12/26/2021 07:50:53 - INFO - codeparrot_training - Step 25475: {'lr': 0.00024389239319597892, 'samples': 13043712, 'steps': 25475, 'batch_loss/train': 0.7214068797184154} 12/26/2021 07:51:04 - INFO - codeparrot_training - Step 25476: {'lr': 0.00024387663477867442, 'samples': 13044224, 'steps': 25476, 'batch_loss/train': 0.7377722505480051} 12/26/2021 07:51:16 - INFO - codeparrot_training - Step 25477: {'lr': 0.00024386087638571423, 'samples': 13044736, 'steps': 25477, 'batch_loss/train': 0.8908010569866747} 12/26/2021 07:51:26 - INFO - codeparrot_training - Step 25478: {'lr': 0.00024384511801716087, 'samples': 13045248, 'steps': 25478, 'batch_loss/train': 0.7320580258965492} 12/26/2021 07:51:37 - INFO - codeparrot_training - Step 25479: {'lr': 0.00024382935967307692, 'samples': 13045760, 'steps': 25479, 
'batch_loss/train': 0.6817730193724856} 12/26/2021 07:51:50 - INFO - codeparrot_training - Step 25480: {'lr': 0.00024381360135352509, 'samples': 13046272, 'steps': 25480, 'batch_loss/train': 0.7772970146033913} 12/26/2021 07:52:01 - INFO - codeparrot_training - Step 25481: {'lr': 0.00024379784305856813, 'samples': 13046784, 'steps': 25481, 'batch_loss/train': 0.7166177356848493} 12/26/2021 07:52:11 - INFO - codeparrot_training - Step 25482: {'lr': 0.00024378208478826856, 'samples': 13047296, 'steps': 25482, 'batch_loss/train': 0.6999594018561766} 12/26/2021 07:52:22 - INFO - codeparrot_training - Step 25483: {'lr': 0.00024376632654268913, 'samples': 13047808, 'steps': 25483, 'batch_loss/train': 0.7762803776422516} 12/26/2021 07:52:34 - INFO - codeparrot_training - Step 25484: {'lr': 0.00024375056832189233, 'samples': 13048320, 'steps': 25484, 'batch_loss/train': 0.7487888978794217} 12/26/2021 07:52:45 - INFO - codeparrot_training - Step 25485: {'lr': 0.00024373481012594092, 'samples': 13048832, 'steps': 25485, 'batch_loss/train': 0.7476593474857509} 12/26/2021 07:52:55 - INFO - codeparrot_training - Step 25486: {'lr': 0.00024371905195489765, 'samples': 13049344, 'steps': 25486, 'batch_loss/train': 0.750846590846777} 12/26/2021 07:53:08 - INFO - codeparrot_training - Step 25487: {'lr': 0.00024370329380882501, 'samples': 13049856, 'steps': 25487, 'batch_loss/train': 0.7131106555461884} 12/26/2021 07:53:18 - INFO - codeparrot_training - Step 25488: {'lr': 0.00024368753568778563, 'samples': 13050368, 'steps': 25488, 'batch_loss/train': 0.7658937918022275} 12/26/2021 07:53:29 - INFO - codeparrot_training - Step 25489: {'lr': 0.00024367177759184225, 'samples': 13050880, 'steps': 25489, 'batch_loss/train': 0.7303954078815877} 12/26/2021 07:53:41 - INFO - codeparrot_training - Step 25490: {'lr': 0.00024365601952105757, 'samples': 13051392, 'steps': 25490, 'batch_loss/train': 0.7632824928732589} 12/26/2021 07:53:52 - INFO - codeparrot_training - Step 25491: {'lr': 
0.00024364026147549403, 'samples': 13051904, 'steps': 25491, 'batch_loss/train': 0.7004146655090153} 12/26/2021 07:54:03 - INFO - codeparrot_training - Step 25492: {'lr': 0.0002436245034552145, 'samples': 13052416, 'steps': 25492, 'batch_loss/train': 0.8936387524008751} 12/26/2021 07:54:13 - INFO - codeparrot_training - Step 25493: {'lr': 0.00024360874546028156, 'samples': 13052928, 'steps': 25493, 'batch_loss/train': 0.8933940138667822} 12/26/2021 07:54:26 - INFO - codeparrot_training - Step 25494: {'lr': 0.00024359298749075783, 'samples': 13053440, 'steps': 25494, 'batch_loss/train': 0.7924457369372249} 12/26/2021 07:54:36 - INFO - codeparrot_training - Step 25495: {'lr': 0.00024357722954670586, 'samples': 13053952, 'steps': 25495, 'batch_loss/train': 0.8117532152682543} 12/26/2021 07:54:47 - INFO - codeparrot_training - Step 25496: {'lr': 0.00024356147162818848, 'samples': 13054464, 'steps': 25496, 'batch_loss/train': 0.8411374874413013} 12/26/2021 07:54:59 - INFO - codeparrot_training - Step 25497: {'lr': 0.0002435457137352683, 'samples': 13054976, 'steps': 25497, 'batch_loss/train': 0.5607083175564185} 12/26/2021 07:55:10 - INFO - codeparrot_training - Step 25498: {'lr': 0.00024352995586800785, 'samples': 13055488, 'steps': 25498, 'batch_loss/train': 0.4539109641336836} 12/26/2021 07:55:20 - INFO - codeparrot_training - Step 25499: {'lr': 0.0002435141980264698, 'samples': 13056000, 'steps': 25499, 'batch_loss/train': 0.7429040626157075} 12/26/2021 07:55:33 - INFO - codeparrot_training - Step 25500: {'lr': 0.00024349844021071698, 'samples': 13056512, 'steps': 25500, 'batch_loss/train': 0.8486783965490758} 12/26/2021 07:55:44 - INFO - codeparrot_training - Step 25501: {'lr': 0.00024348268242081183, 'samples': 13057024, 'steps': 25501, 'batch_loss/train': 0.5929707678733394} 12/26/2021 07:55:54 - INFO - codeparrot_training - Step 25502: {'lr': 0.00024346692465681703, 'samples': 13057536, 'steps': 25502, 'batch_loss/train': 0.7385403085500002} 12/26/2021 07:56:05 
- INFO - codeparrot_training - Step 25503: {'lr': 0.00024345116691879538, 'samples': 13058048, 'steps': 25503, 'batch_loss/train': 0.6885278676636517} 12/26/2021 07:56:17 - INFO - codeparrot_training - Step 25504: {'lr': 0.00024343540920680943, 'samples': 13058560, 'steps': 25504, 'batch_loss/train': 0.7813744312152267} 12/26/2021 07:56:28 - INFO - codeparrot_training - Step 25505: {'lr': 0.00024341965152092175, 'samples': 13059072, 'steps': 25505, 'batch_loss/train': 0.7755492059513927} 12/26/2021 07:56:38 - INFO - codeparrot_training - Step 25506: {'lr': 0.000243403893861195, 'samples': 13059584, 'steps': 25506, 'batch_loss/train': 0.797609587199986} 12/26/2021 07:56:50 - INFO - codeparrot_training - Step 25507: {'lr': 0.00024338813622769195, 'samples': 13060096, 'steps': 25507, 'batch_loss/train': 0.6656316828448325} 12/26/2021 07:57:01 - INFO - codeparrot_training - Step 25508: {'lr': 0.00024337237862047517, 'samples': 13060608, 'steps': 25508, 'batch_loss/train': 0.7350138546898961} 12/26/2021 07:57:12 - INFO - codeparrot_training - Step 25509: {'lr': 0.00024335662103960735, 'samples': 13061120, 'steps': 25509, 'batch_loss/train': 0.6631193880457431} 12/26/2021 07:57:25 - INFO - codeparrot_training - Step 25510: {'lr': 0.00024334086348515096, 'samples': 13061632, 'steps': 25510, 'batch_loss/train': 0.7939260378479958} 12/26/2021 07:57:35 - INFO - codeparrot_training - Step 25511: {'lr': 0.00024332510595716887, 'samples': 13062144, 'steps': 25511, 'batch_loss/train': 0.567274977802299} 12/26/2021 07:57:46 - INFO - codeparrot_training - Step 25512: {'lr': 0.0002433093484557237, 'samples': 13062656, 'steps': 25512, 'batch_loss/train': 0.8364546671509743} 12/26/2021 07:57:57 - INFO - codeparrot_training - Step 25513: {'lr': 0.0002432935909808779, 'samples': 13063168, 'steps': 25513, 'batch_loss/train': 0.7746496293693781} 12/26/2021 07:58:09 - INFO - codeparrot_training - Step 25514: {'lr': 0.00024327783353269436, 'samples': 13063680, 'steps': 25514, 
'batch_loss/train': 0.6773244286887348} 12/26/2021 07:58:19 - INFO - codeparrot_training - Step 25515: {'lr': 0.00024326207611123556, 'samples': 13064192, 'steps': 25515, 'batch_loss/train': 0.7349844751879573} 12/26/2021 07:58:30 - INFO - codeparrot_training - Step 25516: {'lr': 0.00024324631871656428, 'samples': 13064704, 'steps': 25516, 'batch_loss/train': 0.6271973676048219} 12/26/2021 07:58:43 - INFO - codeparrot_training - Step 25517: {'lr': 0.00024323056134874295, 'samples': 13065216, 'steps': 25517, 'batch_loss/train': 0.7165581309236586} 12/26/2021 07:58:54 - INFO - codeparrot_training - Step 25518: {'lr': 0.0002432148040078344, 'samples': 13065728, 'steps': 25518, 'batch_loss/train': 0.7643301365897059} 12/26/2021 07:59:04 - INFO - codeparrot_training - Step 25519: {'lr': 0.0002431990466939013, 'samples': 13066240, 'steps': 25519, 'batch_loss/train': 0.7881850986741483} 12/26/2021 07:59:16 - INFO - codeparrot_training - Step 25520: {'lr': 0.00024318328940700618, 'samples': 13066752, 'steps': 25520, 'batch_loss/train': 0.6828729519620538} 12/26/2021 07:59:27 - INFO - codeparrot_training - Step 25521: {'lr': 0.00024316753214721166, 'samples': 13067264, 'steps': 25521, 'batch_loss/train': 0.6760604626033455} 12/26/2021 07:59:37 - INFO - codeparrot_training - Step 25522: {'lr': 0.00024315177491458057, 'samples': 13067776, 'steps': 25522, 'batch_loss/train': 0.7200795961543918} 12/26/2021 07:59:48 - INFO - codeparrot_training - Step 25523: {'lr': 0.00024313601770917543, 'samples': 13068288, 'steps': 25523, 'batch_loss/train': 0.8379979776218534} 12/26/2021 08:00:00 - INFO - codeparrot_training - Step 25524: {'lr': 0.00024312026053105882, 'samples': 13068800, 'steps': 25524, 'batch_loss/train': 0.7415991751477122} 12/26/2021 08:00:11 - INFO - codeparrot_training - Step 25525: {'lr': 0.0002431045033802934, 'samples': 13069312, 'steps': 25525, 'batch_loss/train': 0.8626988641917706} 12/26/2021 08:00:21 - INFO - codeparrot_training - Step 25526: {'lr': 
0.000243088746256942, 'samples': 13069824, 'steps': 25526, 'batch_loss/train': 0.7116710502887145} 12/26/2021 08:00:34 - INFO - codeparrot_training - Step 25527: {'lr': 0.0002430729891610671, 'samples': 13070336, 'steps': 25527, 'batch_loss/train': 0.7741246717050672} 12/26/2021 08:00:45 - INFO - codeparrot_training - Step 25528: {'lr': 0.0002430572320927313, 'samples': 13070848, 'steps': 25528, 'batch_loss/train': 0.7867710562422872} 12/26/2021 08:00:55 - INFO - codeparrot_training - Step 25529: {'lr': 0.0002430414750519974, 'samples': 13071360, 'steps': 25529, 'batch_loss/train': 0.7652080627158284} 12/26/2021 08:01:07 - INFO - codeparrot_training - Step 25530: {'lr': 0.00024302571803892802, 'samples': 13071872, 'steps': 25530, 'batch_loss/train': 0.7033906318247318} 12/26/2021 08:01:18 - INFO - codeparrot_training - Step 25531: {'lr': 0.0002430099610535857, 'samples': 13072384, 'steps': 25531, 'batch_loss/train': 0.7764322590082884} 12/26/2021 08:01:29 - INFO - codeparrot_training - Step 25532: {'lr': 0.00024299420409603305, 'samples': 13072896, 'steps': 25532, 'batch_loss/train': 0.7878284929320216} 12/26/2021 08:01:39 - INFO - codeparrot_training - Step 25533: {'lr': 0.00024297844716633294, 'samples': 13073408, 'steps': 25533, 'batch_loss/train': 0.7328105643391609} 12/26/2021 08:01:51 - INFO - codeparrot_training - Step 25534: {'lr': 0.0002429626902645478, 'samples': 13073920, 'steps': 25534, 'batch_loss/train': 0.7411608172114938} 12/26/2021 08:02:02 - INFO - codeparrot_training - Step 25535: {'lr': 0.0002429469333907404, 'samples': 13074432, 'steps': 25535, 'batch_loss/train': 0.7631489369086921} 12/26/2021 08:02:13 - INFO - codeparrot_training - Step 25536: {'lr': 0.00024293117654497322, 'samples': 13074944, 'steps': 25536, 'batch_loss/train': 0.738086523488164} 12/26/2021 08:02:25 - INFO - codeparrot_training - Step 25537: {'lr': 0.00024291541972730905, 'samples': 13075456, 'steps': 25537, 'batch_loss/train': 0.7535044949036092} 12/26/2021 08:02:35 - INFO 
- codeparrot_training - Step 25538: {'lr': 0.00024289966293781058, 'samples': 13075968, 'steps': 25538, 'batch_loss/train': 0.7353521944023669} 12/26/2021 08:02:46 - INFO - codeparrot_training - Step 25539: {'lr': 0.00024288390617654024, 'samples': 13076480, 'steps': 25539, 'batch_loss/train': 0.7471957316156477} 12/26/2021 08:02:59 - INFO - codeparrot_training - Step 25540: {'lr': 0.0002428681494435609, 'samples': 13076992, 'steps': 25540, 'batch_loss/train': 1.4619121607393026} 12/26/2021 08:03:10 - INFO - codeparrot_training - Step 25541: {'lr': 0.0002428523927389351, 'samples': 13077504, 'steps': 25541, 'batch_loss/train': 1.5842913356609643} 12/26/2021 08:03:20 - INFO - codeparrot_training - Step 25542: {'lr': 0.00024283663606272548, 'samples': 13078016, 'steps': 25542, 'batch_loss/train': 0.7480158703401685} 12/26/2021 08:03:31 - INFO - codeparrot_training - Step 25543: {'lr': 0.00024282087941499464, 'samples': 13078528, 'steps': 25543, 'batch_loss/train': 0.7749808831140399} 12/26/2021 08:03:43 - INFO - codeparrot_training - Step 25544: {'lr': 0.0002428051227958053, 'samples': 13079040, 'steps': 25544, 'batch_loss/train': 0.6896400721743703} 12/26/2021 08:03:54 - INFO - codeparrot_training - Step 25545: {'lr': 0.00024278936620522013, 'samples': 13079552, 'steps': 25545, 'batch_loss/train': 0.8794459467753768} 12/26/2021 08:04:04 - INFO - codeparrot_training - Step 25546: {'lr': 0.00024277360964330164, 'samples': 13080064, 'steps': 25546, 'batch_loss/train': 0.8534476819913834} 12/26/2021 08:04:17 - INFO - codeparrot_training - Step 25547: {'lr': 0.00024275785311011252, 'samples': 13080576, 'steps': 25547, 'batch_loss/train': 0.6423857440240681} 12/26/2021 08:04:27 - INFO - codeparrot_training - Step 25548: {'lr': 0.00024274209660571556, 'samples': 13081088, 'steps': 25548, 'batch_loss/train': 0.8141310745850205} 12/26/2021 08:04:38 - INFO - codeparrot_training - Step 25549: {'lr': 0.0002427263401301733, 'samples': 13081600, 'steps': 25549, 
'batch_loss/train': 0.7233685841783881} 12/26/2021 08:04:51 - INFO - codeparrot_training - Step 25550: {'lr': 0.0002427105836835482, 'samples': 13082112, 'steps': 25550, 'batch_loss/train': 0.7557749408297241} 12/26/2021 08:05:01 - INFO - codeparrot_training - Step 25551: {'lr': 0.00024269482726590314, 'samples': 13082624, 'steps': 25551, 'batch_loss/train': 0.7225532999727875} 12/26/2021 08:05:12 - INFO - codeparrot_training - Step 25552: {'lr': 0.00024267907087730076, 'samples': 13083136, 'steps': 25552, 'batch_loss/train': 0.8197799436748028} 12/26/2021 08:05:24 - INFO - codeparrot_training - Step 25553: {'lr': 0.00024266331451780357, 'samples': 13083648, 'steps': 25553, 'batch_loss/train': 0.719443621346727} 12/26/2021 08:05:34 - INFO - codeparrot_training - Step 25554: {'lr': 0.0002426475581874742, 'samples': 13084160, 'steps': 25554, 'batch_loss/train': 0.7277134156320244} 12/26/2021 08:05:45 - INFO - codeparrot_training - Step 25555: {'lr': 0.00024263180188637549, 'samples': 13084672, 'steps': 25555, 'batch_loss/train': 0.7578696487471461} 12/26/2021 08:05:56 - INFO - codeparrot_training - Step 25556: {'lr': 0.0002426160456145699, 'samples': 13085184, 'steps': 25556, 'batch_loss/train': 0.8113365648314357} 12/26/2021 08:06:09 - INFO - codeparrot_training - Step 25557: {'lr': 0.00024260028937212017, 'samples': 13085696, 'steps': 25557, 'batch_loss/train': 0.7272019472438842} 12/26/2021 08:06:19 - INFO - codeparrot_training - Step 25558: {'lr': 0.0002425845331590888, 'samples': 13086208, 'steps': 25558, 'batch_loss/train': 1.7482003327459097} 12/26/2021 08:06:30 - INFO - codeparrot_training - Step 25559: {'lr': 0.00024256877697553862, 'samples': 13086720, 'steps': 25559, 'batch_loss/train': 0.6849408550187945} 12/26/2021 08:06:42 - INFO - codeparrot_training - Step 25560: {'lr': 0.00024255302082153214, 'samples': 13087232, 'steps': 25560, 'batch_loss/train': 0.8043036805465817} 12/26/2021 08:06:52 - INFO - codeparrot_training - Step 25561: {'lr': 
0.00024253726469713208, 'samples': 13087744, 'steps': 25561, 'batch_loss/train': 0.7244565216824412} 12/26/2021 08:07:03 - INFO - codeparrot_training - Step 25562: {'lr': 0.0002425215086024009, 'samples': 13088256, 'steps': 25562, 'batch_loss/train': 0.7623887090012431} 12/26/2021 08:07:15 - INFO - codeparrot_training - Step 25563: {'lr': 0.0002425057525374015, 'samples': 13088768, 'steps': 25563, 'batch_loss/train': 0.7684190995059907} 12/26/2021 08:07:26 - INFO - codeparrot_training - Step 25564: {'lr': 0.00024248999650219638, 'samples': 13089280, 'steps': 25564, 'batch_loss/train': 0.8243449740111828} 12/26/2021 08:07:36 - INFO - codeparrot_training - Step 25565: {'lr': 0.00024247424049684814, 'samples': 13089792, 'steps': 25565, 'batch_loss/train': 0.7946715466678143} 12/26/2021 08:07:47 - INFO - codeparrot_training - Step 25566: {'lr': 0.0002424584845214195, 'samples': 13090304, 'steps': 25566, 'batch_loss/train': 0.8640458779409528} 12/26/2021 08:08:00 - INFO - codeparrot_training - Step 25567: {'lr': 0.00024244272857597315, 'samples': 13090816, 'steps': 25567, 'batch_loss/train': 0.800279809627682} 12/26/2021 08:08:10 - INFO - codeparrot_training - Step 25568: {'lr': 0.00024242697266057167, 'samples': 13091328, 'steps': 25568, 'batch_loss/train': 0.7536684954538941} 12/26/2021 08:08:21 - INFO - codeparrot_training - Step 25569: {'lr': 0.00024241121677527755, 'samples': 13091840, 'steps': 25569, 'batch_loss/train': 0.7600521584972739} 12/26/2021 08:08:33 - INFO - codeparrot_training - Step 25570: {'lr': 0.00024239546092015362, 'samples': 13092352, 'steps': 25570, 'batch_loss/train': 0.9224528511986136} 12/26/2021 08:08:44 - INFO - codeparrot_training - Step 25571: {'lr': 0.00024237970509526256, 'samples': 13092864, 'steps': 25571, 'batch_loss/train': 0.7183505739085376} 12/26/2021 08:08:55 - INFO - codeparrot_training - Step 25572: {'lr': 0.00024236394930066683, 'samples': 13093376, 'steps': 25572, 'batch_loss/train': 0.8200003032106906} 12/26/2021 08:09:07 - 
INFO - codeparrot_training - Step 25573: {'lr': 0.00024234819353642913, 'samples': 13093888, 'steps': 25573, 'batch_loss/train': 0.710227167699486} 12/26/2021 08:09:18 - INFO - codeparrot_training - Step 25574: {'lr': 0.0002423324378026122, 'samples': 13094400, 'steps': 25574, 'batch_loss/train': 0.798586618155241} 12/26/2021 08:09:29 - INFO - codeparrot_training - Step 25575: {'lr': 0.0002423166820992786, 'samples': 13094912, 'steps': 25575, 'batch_loss/train': 0.7765833102166653} 12/26/2021 08:09:39 - INFO - codeparrot_training - Step 25576: {'lr': 0.00024230092642649087, 'samples': 13095424, 'steps': 25576, 'batch_loss/train': 0.8258945439010859} 12/26/2021 08:09:51 - INFO - codeparrot_training - Step 25577: {'lr': 0.0002422851707843118, 'samples': 13095936, 'steps': 25577, 'batch_loss/train': 0.8323613165412098} 12/26/2021 08:10:02 - INFO - codeparrot_training - Step 25578: {'lr': 0.00024226941517280405, 'samples': 13096448, 'steps': 25578, 'batch_loss/train': 0.8673857040703297} 12/26/2021 08:10:12 - INFO - codeparrot_training - Step 25579: {'lr': 0.0002422536595920301, 'samples': 13096960, 'steps': 25579, 'batch_loss/train': 0.922815608792007} 12/26/2021 08:10:25 - INFO - codeparrot_training - Step 25580: {'lr': 0.00024223790404205264, 'samples': 13097472, 'steps': 25580, 'batch_loss/train': 0.7278592064976692} 12/26/2021 08:10:35 - INFO - codeparrot_training - Step 25581: {'lr': 0.00024222214852293448, 'samples': 13097984, 'steps': 25581, 'batch_loss/train': 0.6751248650252819} 12/26/2021 08:10:46 - INFO - codeparrot_training - Step 25582: {'lr': 0.00024220639303473805, 'samples': 13098496, 'steps': 25582, 'batch_loss/train': 0.7653814805671573} 12/26/2021 08:10:58 - INFO - codeparrot_training - Step 25583: {'lr': 0.00024219063757752608, 'samples': 13099008, 'steps': 25583, 'batch_loss/train': 0.7525776531547308} 12/26/2021 08:11:09 - INFO - codeparrot_training - Step 25584: {'lr': 0.0002421748821513611, 'samples': 13099520, 'steps': 25584, 
'batch_loss/train': 0.8457002146169543} 12/26/2021 08:11:19 - INFO - codeparrot_training - Step 25585: {'lr': 0.00024215912675630593, 'samples': 13100032, 'steps': 25585, 'batch_loss/train': 0.9936649659648538} 12/26/2021 08:11:30 - INFO - codeparrot_training - Step 25586: {'lr': 0.00024214337139242308, 'samples': 13100544, 'steps': 25586, 'batch_loss/train': 0.787125249626115} 12/26/2021 08:11:44 - INFO - codeparrot_training - Step 25587: {'lr': 0.00024212761605977516, 'samples': 13101056, 'steps': 25587, 'batch_loss/train': 0.753306852420792} 12/26/2021 08:11:54 - INFO - codeparrot_training - Step 25588: {'lr': 0.00024211186075842494, 'samples': 13101568, 'steps': 25588, 'batch_loss/train': 0.8850392820313573} 12/26/2021 08:12:05 - INFO - codeparrot_training - Step 25589: {'lr': 0.00024209610548843496, 'samples': 13102080, 'steps': 25589, 'batch_loss/train': 0.8164111841470003} 12/26/2021 08:12:17 - INFO - codeparrot_training - Step 25590: {'lr': 0.00024208035024986797, 'samples': 13102592, 'steps': 25590, 'batch_loss/train': 0.7593295760452747} 12/26/2021 08:12:28 - INFO - codeparrot_training - Step 25591: {'lr': 0.00024206459504278634, 'samples': 13103104, 'steps': 25591, 'batch_loss/train': 0.6801232043653727} 12/26/2021 08:12:38 - INFO - codeparrot_training - Step 25592: {'lr': 0.00024204883986725297, 'samples': 13103616, 'steps': 25592, 'batch_loss/train': 0.8775069303810596} 12/26/2021 08:12:51 - INFO - codeparrot_training - Step 25593: {'lr': 0.00024203308472333046, 'samples': 13104128, 'steps': 25593, 'batch_loss/train': 0.6741062770597637} 12/26/2021 08:13:01 - INFO - codeparrot_training - Step 25594: {'lr': 0.00024201732961108143, 'samples': 13104640, 'steps': 25594, 'batch_loss/train': 0.790467849932611} 12/26/2021 08:13:12 - INFO - codeparrot_training - Step 25595: {'lr': 0.00024200157453056832, 'samples': 13105152, 'steps': 25595, 'batch_loss/train': 0.8367215236648917} 12/26/2021 08:13:23 - INFO - codeparrot_training - Step 25596: {'lr': 
0.00024198581948185402, 'samples': 13105664, 'steps': 25596, 'batch_loss/train': 0.44364497589413077}
12/26/2021 08:13:35 - INFO - codeparrot_training - Step 25597: {'lr': 0.00024197006446500112, 'samples': 13106176, 'steps': 25597, 'batch_loss/train': 0.7509783655405045}
12/26/2021 08:13:46 - INFO - codeparrot_training - Step 25598: {'lr': 0.00024195430948007217, 'samples': 13106688, 'steps': 25598, 'batch_loss/train': 0.6636368716135621}
12/26/2021 08:13:57 - INFO - codeparrot_training - Step 25599: {'lr': 0.00024193855452712975, 'samples': 13107200, 'steps': 25599, 'batch_loss/train': 0.6974060162901878}
12/26/2021 08:14:09 - INFO - codeparrot_training - Step 25600: {'lr': 0.00024192279960623674, 'samples': 13107712, 'steps': 25600, 'batch_loss/train': 1.0024978760629892}
12/26/2021 08:14:19 - INFO - codeparrot_training - Step 25601: {'lr': 0.00024190704471745557, 'samples': 13108224, 'steps': 25601, 'batch_loss/train': 0.8435797831043601}
12/26/2021 08:14:30 - INFO - codeparrot_training - Step 25602: {'lr': 0.00024189128986084888, 'samples': 13108736, 'steps': 25602, 'batch_loss/train': 0.7231028815731406}
12/26/2021 08:14:43 - INFO - codeparrot_training - Step 25603: {'lr': 0.00024187553503647946, 'samples': 13109248, 'steps': 25603, 'batch_loss/train': 0.618755720090121}
12/26/2021 08:14:54 - INFO - codeparrot_training - Step 25604: {'lr': 0.00024185978024440986, 'samples': 13109760, 'steps': 25604, 'batch_loss/train': 0.7936783460900187}
12/26/2021 08:15:04 - INFO - codeparrot_training - Step 25605: {'lr': 0.0002418440254847026, 'samples': 13110272, 'steps': 25605, 'batch_loss/train': 0.6919245980679989}
12/26/2021 08:15:15 - INFO - codeparrot_training - Step 25606: {'lr': 0.00024182827075742042, 'samples': 13110784, 'steps': 25606, 'batch_loss/train': 0.7902758882846683}
12/26/2021 08:15:27 - INFO - codeparrot_training - Step 25607: {'lr': 0.00024181251606262603, 'samples': 13111296, 'steps': 25607, 'batch_loss/train': 0.7506070043891668}
12/26/2021 08:15:37 - INFO - codeparrot_training - Step 25608: {'lr': 0.00024179676140038194, 'samples': 13111808, 'steps': 25608, 'batch_loss/train': 0.7658517234958708}
12/26/2021 08:15:48 - INFO - codeparrot_training - Step 25609: {'lr': 0.00024178100677075087, 'samples': 13112320, 'steps': 25609, 'batch_loss/train': 0.7822896768338978}
12/26/2021 08:16:00 - INFO - codeparrot_training - Step 25610: {'lr': 0.00024176525217379531, 'samples': 13112832, 'steps': 25610, 'batch_loss/train': 0.852112396620214}
12/26/2021 08:16:11 - INFO - codeparrot_training - Step 25611: {'lr': 0.00024174949760957806, 'samples': 13113344, 'steps': 25611, 'batch_loss/train': 0.7652851152233779}
12/26/2021 08:16:21 - INFO - codeparrot_training - Step 25612: {'lr': 0.00024173374307816174, 'samples': 13113856, 'steps': 25612, 'batch_loss/train': 0.8033472802489996}
12/26/2021 08:16:34 - INFO - codeparrot_training - Step 25613: {'lr': 0.00024171798857960884, 'samples': 13114368, 'steps': 25613, 'batch_loss/train': 0.8066268293187022}
12/26/2021 08:16:45 - INFO - codeparrot_training - Step 25614: {'lr': 0.00024170223411398219, 'samples': 13114880, 'steps': 25614, 'batch_loss/train': 0.8279240019619465}
12/26/2021 08:16:55 - INFO - codeparrot_training - Step 25615: {'lr': 0.00024168647968134425, 'samples': 13115392, 'steps': 25615, 'batch_loss/train': 0.7942458707839251}
12/26/2021 08:17:07 - INFO - codeparrot_training - Step 25616: {'lr': 0.0002416707252817578, 'samples': 13115904, 'steps': 25616, 'batch_loss/train': 0.8525265846401453}
12/26/2021 08:17:18 - INFO - codeparrot_training - Step 25617: {'lr': 0.0002416549709152853, 'samples': 13116416, 'steps': 25617, 'batch_loss/train': 0.7663184478878975}
12/26/2021 08:17:29 - INFO - codeparrot_training - Step 25618: {'lr': 0.00024163921658198955, 'samples': 13116928, 'steps': 25618, 'batch_loss/train': 0.7457417282275856}
12/26/2021 08:17:39 - INFO - codeparrot_training - Step 25619: {'lr': 0.00024162346228193314, 'samples': 13117440, 'steps': 25619, 'batch_loss/train': 0.8245335537940264}
12/26/2021 08:17:52 - INFO - codeparrot_training - Step 25620: {'lr': 0.0002416077080151787, 'samples': 13117952, 'steps': 25620, 'batch_loss/train': 1.127496127039194}
12/26/2021 08:18:02 - INFO - codeparrot_training - Step 25621: {'lr': 0.00024159195378178872, 'samples': 13118464, 'steps': 25621, 'batch_loss/train': 0.7195271225646138}
12/26/2021 08:18:13 - INFO - codeparrot_training - Step 25622: {'lr': 0.00024157619958182602, 'samples': 13118976, 'steps': 25622, 'batch_loss/train': 0.7661439492367208}
12/26/2021 08:18:25 - INFO - codeparrot_training - Step 25623: {'lr': 0.00024156044541535324, 'samples': 13119488, 'steps': 25623, 'batch_loss/train': 0.8303496846929193}
12/26/2021 08:18:36 - INFO - codeparrot_training - Step 25624: {'lr': 0.0002415446912824328, 'samples': 13120000, 'steps': 25624, 'batch_loss/train': 0.6610793732106686}
12/26/2021 08:18:47 - INFO - codeparrot_training - Step 25625: {'lr': 0.00024152893718312754, 'samples': 13120512, 'steps': 25625, 'batch_loss/train': 0.803855866426602}
12/26/2021 08:18:59 - INFO - codeparrot_training - Step 25626: {'lr': 0.00024151318311750012, 'samples': 13121024, 'steps': 25626, 'batch_loss/train': 0.895193855278194}
12/26/2021 08:19:10 - INFO - codeparrot_training - Step 25627: {'lr': 0.00024149742908561298, 'samples': 13121536, 'steps': 25627, 'batch_loss/train': 0.7403868939727545}
12/26/2021 08:19:20 - INFO - codeparrot_training - Step 25628: {'lr': 0.00024148167508752883, 'samples': 13122048, 'steps': 25628, 'batch_loss/train': 0.7678479219321162}
12/26/2021 08:19:31 - INFO - codeparrot_training - Step 25629: {'lr': 0.0002414659211233104, 'samples': 13122560, 'steps': 25629, 'batch_loss/train': 0.7657164447009563}
12/26/2021 08:19:43 - INFO - codeparrot_training - Step 25630: {'lr': 0.00024145016719302027, 'samples': 13123072, 'steps': 25630, 'batch_loss/train': 0.7660337975248694}
12/26/2021 08:19:54 - INFO - codeparrot_training - Step 25631: {'lr': 0.00024143441329672096, 'samples': 13123584, 'steps': 25631, 'batch_loss/train': 0.8475355925038457}
12/26/2021 08:20:04 - INFO - codeparrot_training - Step 25632: {'lr': 0.00024141865943447515, 'samples': 13124096, 'steps': 25632, 'batch_loss/train': 0.7750682914629579}
12/26/2021 08:20:16 - INFO - codeparrot_training - Step 25633: {'lr': 0.00024140290560634565, 'samples': 13124608, 'steps': 25633, 'batch_loss/train': 0.7848186660557985}
12/26/2021 08:20:27 - INFO - codeparrot_training - Step 25634: {'lr': 0.00024138715181239487, 'samples': 13125120, 'steps': 25634, 'batch_loss/train': 0.686711935326457}
12/26/2021 08:20:38 - INFO - codeparrot_training - Step 25635: {'lr': 0.00024137139805268556, 'samples': 13125632, 'steps': 25635, 'batch_loss/train': 0.7536587476497516}
12/26/2021 08:20:50 - INFO - codeparrot_training - Step 25636: {'lr': 0.00024135564432728023, 'samples': 13126144, 'steps': 25636, 'batch_loss/train': 0.8220540774054825}
12/26/2021 08:21:01 - INFO - codeparrot_training - Step 25637: {'lr': 0.00024133989063624165, 'samples': 13126656, 'steps': 25637, 'batch_loss/train': 0.8285356843844056}
12/26/2021 08:21:12 - INFO - codeparrot_training - Step 25638: {'lr': 0.00024132413697963245, 'samples': 13127168, 'steps': 25638, 'batch_loss/train': 0.8406379446387291}
12/26/2021 08:21:22 - INFO - codeparrot_training - Step 25639: {'lr': 0.00024130838335751509, 'samples': 13127680, 'steps': 25639, 'batch_loss/train': 0.7448771416675299}
12/26/2021 08:21:34 - INFO - codeparrot_training - Step 25640: {'lr': 0.0002412926297699524, 'samples': 13128192, 'steps': 25640, 'batch_loss/train': 0.7736082436749712}
12/26/2021 08:21:45 - INFO - codeparrot_training - Step 25641: {'lr': 0.00024127687621700692, 'samples': 13128704, 'steps': 25641, 'batch_loss/train': 0.7507635420188308}
12/26/2021 08:21:56 - INFO - codeparrot_training - Step 25642: {'lr': 0.00024126112269874132, 'samples': 13129216, 'steps': 25642, 'batch_loss/train': 0.8442133297212422}
12/26/2021 08:22:08 - INFO - codeparrot_training - Step 25643: {'lr': 0.0002412453692152181, 'samples': 13129728, 'steps': 25643, 'batch_loss/train': 0.7651243424043059}
12/26/2021 08:22:19 - INFO - codeparrot_training - Step 25644: {'lr': 0.0002412296157665, 'samples': 13130240, 'steps': 25644, 'batch_loss/train': 0.7152375257574022}
12/26/2021 08:22:30 - INFO - codeparrot_training - Step 25645: {'lr': 0.00024121386235264973, 'samples': 13130752, 'steps': 25645, 'batch_loss/train': 0.800598953384906}
12/26/2021 08:22:42 - INFO - codeparrot_training - Step 25646: {'lr': 0.00024119810897372973, 'samples': 13131264, 'steps': 25646, 'batch_loss/train': 0.7278207242488861}
12/26/2021 08:22:53 - INFO - codeparrot_training - Step 25647: {'lr': 0.0002411823556298027, 'samples': 13131776, 'steps': 25647, 'batch_loss/train': 0.7406101655215025}
12/26/2021 08:23:03 - INFO - codeparrot_training - Step 25648: {'lr': 0.0002411666023209314, 'samples': 13132288, 'steps': 25648, 'batch_loss/train': 0.7755062775686383}
12/26/2021 08:23:15 - INFO - codeparrot_training - Step 25649: {'lr': 0.00024115084904717836, 'samples': 13132800, 'steps': 25649, 'batch_loss/train': 0.6750445789657533}
12/26/2021 08:23:26 - INFO - codeparrot_training - Step 25650: {'lr': 0.0002411350958086061, 'samples': 13133312, 'steps': 25650, 'batch_loss/train': 0.8506113239564002}
12/26/2021 08:23:37 - INFO - codeparrot_training - Step 25651: {'lr': 0.00024111934260527738, 'samples': 13133824, 'steps': 25651, 'batch_loss/train': 0.7383399084210396}
12/26/2021 08:23:47 - INFO - codeparrot_training - Step 25652: {'lr': 0.0002411035894372549, 'samples': 13134336, 'steps': 25652, 'batch_loss/train': 0.8143528802320361}
12/26/2021 08:24:00 - INFO - codeparrot_training - Step 25653: {'lr': 0.00024108783630460108, 'samples': 13134848, 'steps': 25653, 'batch_loss/train': 0.8548581749200821}
12/26/2021 08:24:11 - INFO - codeparrot_training - Step 25654: {'lr': 0.00024107208320737862, 'samples': 13135360, 'steps': 25654, 'batch_loss/train': 0.7694508843123913}
12/26/2021 08:24:22 - INFO - codeparrot_training - Step 25655: {'lr': 0.0002410563301456503, 'samples': 13135872, 'steps': 25655, 'batch_loss/train': 0.6568312533199787}
12/26/2021 08:24:34 - INFO - codeparrot_training - Step 25656: {'lr': 0.00024104057711947857, 'samples': 13136384, 'steps': 25656, 'batch_loss/train': 0.7584679732099175}
12/26/2021 08:24:44 - INFO - codeparrot_training - Step 25657: {'lr': 0.0002410248241289262, 'samples': 13136896, 'steps': 25657, 'batch_loss/train': 0.7533673010766506}
12/26/2021 08:24:55 - INFO - codeparrot_training - Step 25658: {'lr': 0.00024100907117405562, 'samples': 13137408, 'steps': 25658, 'batch_loss/train': 1.001582687254995}
12/26/2021 08:25:08 - INFO - codeparrot_training - Step 25659: {'lr': 0.0002409933182549297, 'samples': 13137920, 'steps': 25659, 'batch_loss/train': 0.7891017072834074}
12/26/2021 08:25:19 - INFO - codeparrot_training - Step 25660: {'lr': 0.0002409775653716109, 'samples': 13138432, 'steps': 25660, 'batch_loss/train': 0.8817252553999424}
12/26/2021 08:25:29 - INFO - codeparrot_training - Step 25661: {'lr': 0.0002409618125241618, 'samples': 13138944, 'steps': 25661, 'batch_loss/train': 0.5441136875888333}
12/26/2021 08:25:40 - INFO - codeparrot_training - Step 25662: {'lr': 0.00024094605971264525, 'samples': 13139456, 'steps': 25662, 'batch_loss/train': 0.517150899511762}
12/26/2021 08:25:52 - INFO - codeparrot_training - Step 25663: {'lr': 0.00024093030693712368, 'samples': 13139968, 'steps': 25663, 'batch_loss/train': 0.7682713996618986}
12/26/2021 08:26:02 - INFO - codeparrot_training - Step 25664: {'lr': 0.00024091455419765988, 'samples': 13140480, 'steps': 25664, 'batch_loss/train': 0.8016022838419303}
12/26/2021 08:26:13 - INFO - codeparrot_training - Step 25665: {'lr': 0.00024089880149431622, 'samples': 13140992, 'steps': 25665, 'batch_loss/train': 0.7519746003672481}
12/26/2021 08:26:26 - INFO - codeparrot_training - Step 25666: {'lr': 0.00024088304882715562, 'samples': 13141504, 'steps': 25666, 'batch_loss/train': 0.7858880301937461}
12/26/2021 08:26:36 - INFO - codeparrot_training - Step 25667: {'lr': 0.00024086729619624055, 'samples': 13142016, 'steps': 25667, 'batch_loss/train': 0.6504157148301601}
12/26/2021 08:26:47 - INFO - codeparrot_training - Step 25668: {'lr': 0.00024085154360163367, 'samples': 13142528, 'steps': 25668, 'batch_loss/train': 0.8548199199140072}
12/26/2021 08:26:59 - INFO - codeparrot_training - Step 25669: {'lr': 0.00024083579104339751, 'samples': 13143040, 'steps': 25669, 'batch_loss/train': 0.6942332901526242}
12/26/2021 08:27:10 - INFO - codeparrot_training - Step 25670: {'lr': 0.00024082003852159484, 'samples': 13143552, 'steps': 25670, 'batch_loss/train': 0.7351198044489138}
12/26/2021 08:27:20 - INFO - codeparrot_training - Step 25671: {'lr': 0.00024080428603628825, 'samples': 13144064, 'steps': 25671, 'batch_loss/train': 0.7527138320729136}
12/26/2021 08:27:33 - INFO - codeparrot_training - Step 25672: {'lr': 0.00024078853358754036, 'samples': 13144576, 'steps': 25672, 'batch_loss/train': 0.8404470253735781}
12/26/2021 08:27:44 - INFO - codeparrot_training - Step 25673: {'lr': 0.00024077278117541368, 'samples': 13145088, 'steps': 25673, 'batch_loss/train': 0.692695727571845}
12/26/2021 08:27:54 - INFO - codeparrot_training - Step 25674: {'lr': 0.00024075702879997106, 'samples': 13145600, 'steps': 25674, 'batch_loss/train': 0.6913813119754195}
12/26/2021 08:28:05 - INFO - codeparrot_training - Step 25675: {'lr': 0.000240741276461275, 'samples': 13146112, 'steps': 25675, 'batch_loss/train': 0.7882913071662188}
12/26/2021 08:28:17 - INFO - codeparrot_training - Step 25676: {'lr': 0.00024072552415938802, 'samples': 13146624, 'steps': 25676, 'batch_loss/train': 0.7930404988583177}
12/26/2021 08:28:28 - INFO - codeparrot_training - Step 25677: {'lr': 0.0002407097718943729, 'samples': 13147136, 'steps': 25677, 'batch_loss/train': 1.2901873313821852}
12/26/2021 08:28:38 - INFO - codeparrot_training - Step 25678: {'lr': 0.00024069401966629229, 'samples': 13147648, 'steps': 25678, 'batch_loss/train': 0.702502197586}
12/26/2021 08:28:51 - INFO - codeparrot_training - Step 25679: {'lr': 0.00024067826747520868, 'samples': 13148160, 'steps': 25679, 'batch_loss/train': 0.7899191472679377}
12/26/2021 08:29:01 - INFO - codeparrot_training - Step 25680: {'lr': 0.00024066251532118473, 'samples': 13148672, 'steps': 25680, 'batch_loss/train': 0.7923524482175708}
12/26/2021 08:29:12 - INFO - codeparrot_training - Step 25681: {'lr': 0.00024064676320428318, 'samples': 13149184, 'steps': 25681, 'batch_loss/train': 0.6240557470591739}
12/26/2021 08:29:25 - INFO - codeparrot_training - Step 25682: {'lr': 0.0002406310111245665, 'samples': 13149696, 'steps': 25682, 'batch_loss/train': 0.8028945070691407}
12/26/2021 08:29:35 - INFO - codeparrot_training - Step 25683: {'lr': 0.00024061525908209748, 'samples': 13150208, 'steps': 25683, 'batch_loss/train': 1.1717665675096214}
12/26/2021 08:29:46 - INFO - codeparrot_training - Step 25684: {'lr': 0.00024059950707693847, 'samples': 13150720, 'steps': 25684, 'batch_loss/train': 1.4787830892018974}
12/26/2021 08:29:57 - INFO - codeparrot_training - Step 25685: {'lr': 0.00024058375510915242, 'samples': 13151232, 'steps': 25685, 'batch_loss/train': 0.9157687267288566}
12/26/2021 08:30:09 - INFO - codeparrot_training - Step 25686: {'lr': 0.00024056800317880176, 'samples': 13151744, 'steps': 25686, 'batch_loss/train': 0.7809232342988253}
12/26/2021 08:30:19 - INFO - codeparrot_training - Step 25687: {'lr': 0.0002405522512859491, 'samples': 13152256, 'steps': 25687, 'batch_loss/train': 0.8665598444640636}
12/26/2021 08:30:30 - INFO - codeparrot_training - Step 25688: {'lr': 0.00024053649943065722, 'samples': 13152768, 'steps': 25688, 'batch_loss/train': 0.7689689036924392}
12/26/2021 08:30:43 - INFO - codeparrot_training - Step 25689: {'lr': 0.0002405207476129886, 'samples': 13153280, 'steps': 25689, 'batch_loss/train': 0.789722110144794}
12/26/2021 08:30:53 - INFO - codeparrot_training - Step 25690: {'lr': 0.00024050499583300595, 'samples': 13153792, 'steps': 25690, 'batch_loss/train': 0.8239286374300718}
12/26/2021 08:31:04 - INFO - codeparrot_training - Step 25691: {'lr': 0.00024048924409077175, 'samples': 13154304, 'steps': 25691, 'batch_loss/train': 0.5402957508340478}
12/26/2021 08:31:16 - INFO - codeparrot_training - Step 25692: {'lr': 0.00024047349238634876, 'samples': 13154816, 'steps': 25692, 'batch_loss/train': 0.8345676432363689}
12/26/2021 08:31:27 - INFO - codeparrot_training - Step 25693: {'lr': 0.00024045774071979964, 'samples': 13155328, 'steps': 25693, 'batch_loss/train': 0.7936202669516206}
12/26/2021 08:31:37 - INFO - codeparrot_training - Step 25694: {'lr': 0.00024044198909118695, 'samples': 13155840, 'steps': 25694, 'batch_loss/train': 0.7949494943022728}
12/26/2021 08:31:48 - INFO - codeparrot_training - Step 25695: {'lr': 0.00024042623750057315, 'samples': 13156352, 'steps': 25695, 'batch_loss/train': 0.8667920362204313}
12/26/2021 08:32:00 - INFO - codeparrot_training - Step 25696: {'lr': 0.00024041048594802112, 'samples': 13156864, 'steps': 25696, 'batch_loss/train': 0.78974160249345}
12/26/2021 08:32:11 - INFO - codeparrot_training - Step 25697: {'lr': 0.00024039473443359343, 'samples': 13157376, 'steps': 25697, 'batch_loss/train': 0.8008343577384949}
12/26/2021 08:32:21 - INFO - codeparrot_training - Step 25698: {'lr': 0.0002403789829573525, 'samples': 13157888, 'steps': 25698, 'batch_loss/train': 0.7565186815336347}
12/26/2021 08:32:34 - INFO - codeparrot_training - Step 25699: {'lr': 0.0002403632315193612, 'samples': 13158400, 'steps': 25699, 'batch_loss/train': 0.7636766098439693}
12/26/2021 08:32:45 - INFO - codeparrot_training - Step 25700: {'lr': 0.00024034748011968207, 'samples': 13158912, 'steps': 25700, 'batch_loss/train': 0.8322267290204763}
12/26/2021 08:32:55 - INFO - codeparrot_training - Step 25701: {'lr': 0.00024033172875837766, 'samples': 13159424, 'steps': 25701, 'batch_loss/train': 0.8310914281755686}
12/26/2021 08:33:07 - INFO - codeparrot_training - Step 25702: {'lr': 0.00024031597743551063, 'samples': 13159936, 'steps': 25702, 'batch_loss/train': 0.8530555604957044}
12/26/2021 08:33:18 - INFO - codeparrot_training - Step 25703: {'lr': 0.0002403002261511437, 'samples': 13160448, 'steps': 25703, 'batch_loss/train': 0.8359483107924461}
12/26/2021 08:33:28 - INFO - codeparrot_training - Step 25704: {'lr': 0.0002402844749053394, 'samples': 13160960, 'steps': 25704, 'batch_loss/train': 0.7300987932831049}
12/26/2021 08:33:40 - INFO - codeparrot_training - Step 25705: {'lr': 0.00024026872369816032, 'samples': 13161472, 'steps': 25705, 'batch_loss/train': 0.7146923262625933}
12/26/2021 08:33:51 - INFO - codeparrot_training - Step 25706: {'lr': 0.00024025297252966904, 'samples': 13161984, 'steps': 25706, 'batch_loss/train': 0.7606488345190883}
12/26/2021 08:34:02 - INFO - codeparrot_training - Step 25707: {'lr': 0.0002402372213999284, 'samples': 13162496, 'steps': 25707, 'batch_loss/train': 0.851669903844595}
12/26/2021 08:34:12 - INFO - codeparrot_training - Step 25708: {'lr': 0.00024022147030900083, 'samples': 13163008, 'steps': 25708, 'batch_loss/train': 0.743648887379095}
12/26/2021 08:34:24 - INFO - codeparrot_training - Step 25709: {'lr': 0.00024020571925694904, 'samples': 13163520, 'steps': 25709, 'batch_loss/train': 0.861418925691396}
12/26/2021 08:34:35 - INFO - codeparrot_training - Step 25710: {'lr': 0.0002401899682438355, 'samples': 13164032, 'steps': 25710, 'batch_loss/train': 0.7006671763956547}
12/26/2021 08:34:46 - INFO - codeparrot_training - Step 25711: {'lr': 0.00024017421726972305, 'samples': 13164544, 'steps': 25711, 'batch_loss/train': 0.7296668440103531}
12/26/2021 08:34:58 - INFO - codeparrot_training - Step 25712: {'lr': 0.00024015846633467416, 'samples': 13165056, 'steps': 25712, 'batch_loss/train': 0.7852527648210526}
12/26/2021 08:35:09 - INFO - codeparrot_training - Step 25713: {'lr': 0.00024014271543875145, 'samples': 13165568, 'steps': 25713, 'batch_loss/train': 0.8590681655332446}
12/26/2021 08:35:20 - INFO - codeparrot_training - Step 25714: {'lr': 0.00024012696458201765, 'samples': 13166080, 'steps': 25714, 'batch_loss/train': 0.7225767835043371}
12/26/2021 08:35:32 - INFO - codeparrot_training - Step 25715: {'lr': 0.00024011121376453528, 'samples': 13166592, 'steps': 25715, 'batch_loss/train': 0.7293378482572734}
12/26/2021 08:35:42 - INFO - codeparrot_training - Step 25716: {'lr': 0.00024009546298636706, 'samples': 13167104, 'steps': 25716, 'batch_loss/train': 0.7592993788421154}
12/26/2021 08:35:53 - INFO - codeparrot_training - Step 25717: {'lr': 0.0002400797122475754, 'samples': 13167616, 'steps': 25717, 'batch_loss/train': 0.7924859141930938}
12/26/2021 08:36:03 - INFO - codeparrot_training - Step 25718: {'lr': 0.00024006396154822312, 'samples': 13168128, 'steps': 25718, 'batch_loss/train': 0.6861547706648707}
12/26/2021 08:36:15 - INFO - codeparrot_training - Step 25719: {'lr': 0.00024004821088837288, 'samples': 13168640, 'steps': 25719, 'batch_loss/train': 0.7934864368289709}
12/26/2021 08:36:26 - INFO - codeparrot_training - Step 25720: {'lr': 0.00024003246026808715, 'samples': 13169152, 'steps': 25720, 'batch_loss/train': 0.7968322103843093}
12/26/2021 08:36:37 - INFO - codeparrot_training - Step 25721: {'lr': 0.00024001670968742847, 'samples': 13169664, 'steps': 25721, 'batch_loss/train': 0.7081986400298774}
12/26/2021 08:36:50 - INFO - codeparrot_training - Step 25722: {'lr': 0.00024000095914645966, 'samples': 13170176, 'steps': 25722, 'batch_loss/train': 0.7227232349105179}
12/26/2021 08:37:00 - INFO - codeparrot_training - Step 25723: {'lr': 0.00023998520864524332, 'samples': 13170688, 'steps': 25723, 'batch_loss/train': 0.6848343564197421}
12/26/2021 08:37:11 - INFO - codeparrot_training - Step 25724: {'lr': 0.0002399694581838419, 'samples': 13171200, 'steps': 25724, 'batch_loss/train': 0.7336029461584985}
12/26/2021 08:37:23 - INFO - codeparrot_training - Step 25725: {'lr': 0.00023995370776231818, 'samples': 13171712, 'steps': 25725, 'batch_loss/train': 0.6575454506091774}
12/26/2021 08:37:34 - INFO - codeparrot_training - Step 25726: {'lr': 0.00023993795738073477, 'samples': 13172224, 'steps': 25726, 'batch_loss/train': 0.7492589924950153}
12/26/2021 08:37:45 - INFO - codeparrot_training - Step 25727: {'lr': 0.00023992220703915418, 'samples': 13172736, 'steps': 25727, 'batch_loss/train': 0.7203901568427682}
12/26/2021 08:37:57 - INFO - codeparrot_training - Step 25728: {'lr': 0.00023990645673763905, 'samples': 13173248, 'steps': 25728, 'batch_loss/train': 0.7796481673722155}
12/26/2021 08:38:08 - INFO - codeparrot_training - Step 25729: {'lr': 0.00023989070647625213, 'samples': 13173760, 'steps': 25729, 'batch_loss/train': 0.7440155036747456}
12/26/2021 08:38:19 - INFO - codeparrot_training - Step 25730: {'lr': 0.000239874956255056, 'samples': 13174272, 'steps': 25730, 'batch_loss/train': 0.7862240625545382}
12/26/2021 08:38:29 - INFO - codeparrot_training - Step 25731: {'lr': 0.0002398592060741131, 'samples': 13174784, 'steps': 25731, 'batch_loss/train': 0.7974992776289582}
12/26/2021 08:38:41 - INFO - codeparrot_training - Step 25732: {'lr': 0.00023984345593348614, 'samples': 13175296, 'steps': 25732, 'batch_loss/train': 0.6980411889962852}
12/26/2021 08:38:52 - INFO - codeparrot_training - Step 25733: {'lr': 0.00023982770583323785, 'samples': 13175808, 'steps': 25733, 'batch_loss/train': 0.7091995552182198}
12/26/2021 08:39:03 - INFO - codeparrot_training - Step 25734: {'lr': 0.00023981195577343073, 'samples': 13176320, 'steps': 25734, 'batch_loss/train': 0.872275173664093}
12/26/2021 08:39:15 - INFO - codeparrot_training - Step 25735: {'lr': 0.00023979620575412735, 'samples': 13176832, 'steps': 25735, 'batch_loss/train': 0.7051110533066094}
12/26/2021 08:39:25 - INFO - codeparrot_training - Step 25736: {'lr': 0.00023978045577539053, 'samples': 13177344, 'steps': 25736, 'batch_loss/train': 0.7495652941288427}
12/26/2021 08:39:36 - INFO - codeparrot_training - Step 25737: {'lr': 0.00023976470583728272, 'samples': 13177856, 'steps': 25737, 'batch_loss/train': 1.0153612131252885}
12/26/2021 08:39:49 - INFO - codeparrot_training - Step 25738: {'lr': 0.00023974895593986662, 'samples': 13178368, 'steps': 25738, 'batch_loss/train': 0.7433601086959243}
12/26/2021 08:39:59 - INFO - codeparrot_training - Step 25739: {'lr': 0.00023973320608320468, 'samples': 13178880, 'steps': 25739, 'batch_loss/train': 0.6806671074591577}
12/26/2021 08:40:10 - INFO - codeparrot_training - Step 25740: {'lr': 0.00023971745626735975, 'samples': 13179392, 'steps': 25740, 'batch_loss/train': 1.0124928411096334}
12/26/2021 08:40:21 - INFO - codeparrot_training - Step 25741: {'lr': 0.00023970170649239428, 'samples': 13179904, 'steps': 25741, 'batch_loss/train': 0.7584285056218505}
12/26/2021 08:40:33 - INFO - codeparrot_training - Step 25742: {'lr': 0.000239685956758371, 'samples': 13180416, 'steps': 25742, 'batch_loss/train': 1.0049521503970027}
12/26/2021 08:40:44 - INFO - codeparrot_training - Step 25743: {'lr': 0.00023967020706535235, 'samples': 13180928, 'steps': 25743, 'batch_loss/train': 0.7149849545676261}
12/26/2021 08:40:54 - INFO - codeparrot_training - Step 25744: {'lr': 0.00023965445741340108, 'samples': 13181440, 'steps': 25744, 'batch_loss/train': 0.835332652553916}
12/26/2021 08:41:07 - INFO - codeparrot_training - Step 25745: {'lr': 0.00023963870780257986, 'samples': 13181952, 'steps': 25745, 'batch_loss/train': 0.7892172420397401}
12/26/2021 08:41:17 - INFO - codeparrot_training - Step 25746: {'lr': 0.00023962295823295115, 'samples': 13182464, 'steps': 25746, 'batch_loss/train': 0.6809890040312894}
12/26/2021 08:41:28 - INFO - codeparrot_training - Step 25747: {'lr': 0.00023960720870457765, 'samples': 13182976, 'steps': 25747, 'batch_loss/train': 0.6467705350369215}
12/26/2021 08:41:40 - INFO - codeparrot_training - Step 25748: {'lr': 0.00023959145921752202, 'samples': 13183488, 'steps': 25748, 'batch_loss/train': 0.7079080189578235}
12/26/2021 08:41:51 - INFO - codeparrot_training - Step 25749: {'lr': 0.0002395757097718468, 'samples': 13184000, 'steps': 25749, 'batch_loss/train': 0.7931945724412799}
12/26/2021 08:42:02 - INFO - codeparrot_training - Step 25750: {'lr': 0.00023955996036761454, 'samples': 13184512, 'steps': 25750, 'batch_loss/train': 0.7731549628078938}
12/26/2021 08:42:12 - INFO - codeparrot_training - Step 25751: {'lr': 0.000239544211004888, 'samples': 13185024, 'steps': 25751, 'batch_loss/train': 0.7282419791445136}
12/26/2021 08:42:25 - INFO - codeparrot_training - Step 25752: {'lr': 0.00023952846168372977, 'samples': 13185536, 'steps': 25752, 'batch_loss/train': 0.851398472674191}
12/26/2021 08:42:35 - INFO - codeparrot_training - Step 25753: {'lr': 0.00023951271240420235, 'samples': 13186048, 'steps': 25753, 'batch_loss/train': 0.7699234905885532}
12/26/2021 08:42:46 - INFO - codeparrot_training - Step 25754: {'lr': 0.0002394969631663684, 'samples': 13186560, 'steps': 25754, 'batch_loss/train': 0.6961660687811673}
12/26/2021 08:42:58 - INFO - codeparrot_training - Step 25755: {'lr': 0.00023948121397029067, 'samples': 13187072, 'steps': 25755, 'batch_loss/train': 0.487789791659452}
12/26/2021 08:43:09 - INFO - codeparrot_training - Step 25756: {'lr': 0.00023946546481603165, 'samples': 13187584, 'steps': 25756, 'batch_loss/train': 0.9084380222484469}
12/26/2021 08:43:19 - INFO - codeparrot_training - Step 25757: {'lr': 0.0002394497157036539, 'samples': 13188096, 'steps': 25757, 'batch_loss/train': 0.7611568029969931}
12/26/2021 08:43:33 - INFO - codeparrot_training - Step 25758: {'lr': 0.00023943396663322005, 'samples': 13188608, 'steps': 25758, 'batch_loss/train': 0.7315767034888268}
12/26/2021 08:43:43 - INFO - codeparrot_training - Step 25759: {'lr': 0.00023941821760479289, 'samples': 13189120, 'steps': 25759, 'batch_loss/train': 0.7697617029771209}
12/26/2021 08:43:54 - INFO - codeparrot_training - Step 25760: {'lr': 0.00023940246861843482, 'samples': 13189632, 'steps': 25760, 'batch_loss/train': 0.630233125470113}
12/26/2021 08:44:05 - INFO - codeparrot_training - Step 25761: {'lr': 0.00023938671967420848, 'samples': 13190144, 'steps': 25761, 'batch_loss/train': 0.5011241778265685}
12/26/2021 08:44:17 - INFO - codeparrot_training - Step 25762: {'lr': 0.00023937097077217665, 'samples': 13190656, 'steps': 25762, 'batch_loss/train': 0.6652533113956451}
12/26/2021 08:44:28 - INFO - codeparrot_training - Step 25763: {'lr': 0.00023935522191240177, 'samples': 13191168, 'steps': 25763, 'batch_loss/train': 0.6427395481150597}
12/26/2021 08:44:38 - INFO - codeparrot_training - Step 25764: {'lr': 0.00023933947309494655, 'samples': 13191680, 'steps': 25764, 'batch_loss/train': 0.807606347836554}
12/26/2021 08:44:50 - INFO - codeparrot_training - Step 25765: {'lr': 0.00023932372431987347, 'samples': 13192192, 'steps': 25765, 'batch_loss/train': 0.7295577796176076}
12/26/2021 08:45:01 - INFO - codeparrot_training - Step 25766: {'lr': 0.00023930797558724534, 'samples': 13192704, 'steps': 25766, 'batch_loss/train': 0.7307689823210239}
12/26/2021 08:45:12 - INFO - codeparrot_training - Step 25767: {'lr': 0.0002392922268971246, 'samples': 13193216, 'steps': 25767, 'batch_loss/train': 0.5970146679319441}
12/26/2021 08:45:24 - INFO - codeparrot_training - Step 25768: {'lr': 0.000239276478249574, 'samples': 13193728, 'steps': 25768, 'batch_loss/train': 0.7473883754573762}
12/26/2021 08:45:35 - INFO - codeparrot_training - Step 25769: {'lr': 0.00023926072964465593, 'samples': 13194240, 'steps': 25769, 'batch_loss/train': 0.9938931385986507}
12/26/2021 08:45:46 - INFO - codeparrot_training - Step 25770: {'lr': 0.0002392449810824332, 'samples': 13194752, 'steps': 25770, 'batch_loss/train': 0.9848896004259586}
12/26/2021 08:45:56 - INFO - codeparrot_training - Step 25771: {'lr': 0.00023922923256296843, 'samples': 13195264, 'steps': 25771, 'batch_loss/train': 0.8097131885588169}
12/26/2021 08:46:08 - INFO - codeparrot_training - Step 25772: {'lr': 0.00023921348408632406, 'samples': 13195776, 'steps': 25772, 'batch_loss/train': 0.772377263288945}
12/26/2021 08:46:19 - INFO - codeparrot_training - Step 25773: {'lr': 0.00023919773565256287, 'samples': 13196288, 'steps': 25773, 'batch_loss/train': 0.7120174127630889}
12/26/2021 08:46:30 - INFO - codeparrot_training - Step 25774: {'lr': 0.00023918198726174744, 'samples': 13196800, 'steps': 25774, 'batch_loss/train': 0.723727410659194}
12/26/2021 08:46:42 - INFO - codeparrot_training - Step 25775: {'lr': 0.00023916623891394036, 'samples': 13197312, 'steps': 25775, 'batch_loss/train': 0.7646713042631745}
12/26/2021 08:46:52 - INFO - codeparrot_training - Step 25776: {'lr': 0.00023915049060920407, 'samples': 13197824, 'steps': 25776, 'batch_loss/train': 0.7677471470087767}
12/26/2021 08:47:03 - INFO - codeparrot_training - Step 25777: {'lr': 0.00023913474234760144, 'samples': 13198336, 'steps': 25777, 'batch_loss/train': 0.7229087149025872}
12/26/2021 08:47:13 - INFO - codeparrot_training - Step 25778: {'lr': 0.000239118994129195, 'samples': 13198848, 'steps': 25778, 'batch_loss/train': 0.7394619351252913}
12/26/2021 08:47:26 - INFO - codeparrot_training - Step 25779: {'lr': 0.00023910324595404728, 'samples': 13199360, 'steps': 25779, 'batch_loss/train': 0.7986871553584933}
12/26/2021 08:47:36 - INFO - codeparrot_training - Step 25780: {'lr': 0.0002390874978222209, 'samples': 13199872, 'steps': 25780, 'batch_loss/train': 0.8150971699506044}
12/26/2021 08:47:47 - INFO - codeparrot_training - Step 25781: {'lr': 0.0002390717497337786, 'samples': 13200384, 'steps': 25781, 'batch_loss/train': 0.8640148262493312}
12/26/2021 08:47:59 - INFO - codeparrot_training - Step 25782: {'lr': 0.00023905600168878287, 'samples': 13200896, 'steps': 25782, 'batch_loss/train': 0.7916410621255636}
12/26/2021 08:48:10 - INFO - codeparrot_training - Step 25783: {'lr': 0.00023904025368729634, 'samples': 13201408, 'steps': 25783, 'batch_loss/train': 0.8282918771728873}
12/26/2021 08:48:21 - INFO - codeparrot_training - Step 25784: {'lr': 0.00023902450572938156, 'samples': 13201920, 'steps': 25784, 'batch_loss/train': 0.7876042947173119}
12/26/2021 08:48:33 - INFO - codeparrot_training - Step 25785: {'lr': 0.0002390087578151013, 'samples': 13202432, 'steps': 25785, 'batch_loss/train': 0.8113891645334661}
12/26/2021 08:48:43 - INFO - codeparrot_training - Step 25786: {'lr': 0.00023899300994451804, 'samples': 13202944, 'steps': 25786, 'batch_loss/train': 0.77538335416466}
12/26/2021 08:48:54 - INFO - codeparrot_training - Step 25787: {'lr': 0.00023897726211769433, 'samples': 13203456, 'steps': 25787, 'batch_loss/train': 0.7206165418028831}
12/26/2021 08:49:07 - INFO - codeparrot_training - Step 25788: {'lr': 0.00023896151433469296, 'samples': 13203968, 'steps': 25788, 'batch_loss/train': 0.7424169920850545}
12/26/2021 08:49:17 - INFO - codeparrot_training - Step 25789: {'lr': 0.00023894576659557644, 'samples': 13204480, 'steps': 25789, 'batch_loss/train': 0.7823118790984154}
12/26/2021 08:49:28 - INFO - codeparrot_training - Step 25790: {'lr': 0.00023893001890040738, 'samples': 13204992, 'steps': 25790, 'batch_loss/train': 0.8082421176368371}
12/26/2021 08:49:39 - INFO - codeparrot_training - Step 25791: {'lr': 0.00023891427124924828, 'samples': 13205504, 'steps': 25791, 'batch_loss/train': 0.7648593168705702}
12/26/2021 08:49:51 - INFO - codeparrot_training - Step 25792: {'lr': 0.00023889852364216194, 'samples': 13206016, 'steps': 25792, 'batch_loss/train': 0.8382271435111761}
12/26/2021 08:50:01 - INFO - codeparrot_training - Step 25793: {'lr': 0.0002388827760792109, 'samples': 13206528, 'steps': 25793, 'batch_loss/train': 0.7468791650608182}
12/26/2021 08:50:12 - INFO - codeparrot_training - Step 25794: {'lr': 0.00023886702856045777, 'samples': 13207040, 'steps': 25794, 'batch_loss/train': 0.72382235317491}
12/26/2021 08:50:24 - INFO - codeparrot_training - Step 25795: {'lr': 0.00023885128108596502, 'samples': 13207552, 'steps': 25795, 'batch_loss/train': 0.7529358807951212}
12/26/2021 08:50:35 - INFO - codeparrot_training - Step 25796: {'lr': 0.00023883553365579538, 'samples': 13208064, 'steps': 25796, 'batch_loss/train': 0.7787098162807524}
12/26/2021 08:50:45 - INFO - codeparrot_training - Step 25797: {'lr': 0.00023881978627001156, 'samples': 13208576, 'steps': 25797, 'batch_loss/train': 0.8660402758978307}
12/26/2021 08:50:58 - INFO - codeparrot_training - Step 25798: {'lr': 0.0002388040389286759, 'samples': 13209088, 'steps': 25798, 'batch_loss/train': 0.7861004546284676}
12/26/2021 08:51:09 - INFO - codeparrot_training - Step 25799: {'lr': 0.00023878829163185121, 'samples': 13209600, 'steps': 25799, 'batch_loss/train': 0.7302084341645241}
12/26/2021 08:51:19 - INFO - codeparrot_training - Step 25800: {'lr': 0.0002387725443796001, 'samples': 13210112, 'steps': 25800, 'batch_loss/train': 0.7636531237512827}
12/26/2021 08:51:30 - INFO - codeparrot_training - Step 25801: {'lr': 0.00023875679717198512, 'samples': 13210624, 'steps': 25801, 'batch_loss/train': 0.7649942100979388}
12/26/2021 08:51:42 - INFO - codeparrot_training - Step 25802: {'lr': 0.00023874105000906877, 'samples': 13211136, 'steps': 25802, 'batch_loss/train': 0.782144655007869}
12/26/2021 08:51:53 - INFO - codeparrot_training - Step 25803: {'lr': 0.00023872530289091375, 'samples': 13211648, 'steps': 25803, 'batch_loss/train': 0.8992117829620838}
12/26/2021 08:52:03 - INFO - codeparrot_training - Step 25804: {'lr': 0.0002387095558175828, 'samples': 13212160, 'steps': 25804, 'batch_loss/train': 0.7227761764079332}
12/26/2021 08:52:16 - INFO - codeparrot_training - Step 25805: {'lr': 0.00023869380878913828, 'samples': 13212672, 'steps': 25805, 'batch_loss/train': 0.7792542732786387}
12/26/2021 08:52:26 - INFO - codeparrot_training - Step 25806: {'lr': 0.0002386780618056429, 'samples': 13213184, 'steps': 25806, 'batch_loss/train': 0.7791832450311631}
12/26/2021 08:52:37 - INFO - codeparrot_training - Step 25807: {'lr': 0.00023866231486715934, 'samples': 13213696, 'steps': 25807, 'batch_loss/train': 0.7875376876909286}
12/26/2021 08:52:49 - INFO - codeparrot_training - Step 25808: {'lr': 0.0002386465679737501, 'samples': 13214208, 'steps': 25808, 'batch_loss/train': 0.731395436450839}
12/26/2021 08:53:00 - INFO - codeparrot_training - Step 25809: {'lr': 0.00023863082112547776, 'samples': 13214720, 'steps': 25809, 'batch_loss/train': 0.8849211251363158}
12/26/2021 08:53:10 - INFO - codeparrot_training - Step 25810: {'lr': 0.0002386150743224051, 'samples': 13215232, 'steps': 25810, 'batch_loss/train': 0.7273187150713056}
12/26/2021 08:53:22 - INFO - codeparrot_training - Step 25811: {'lr': 0.0002385993275645946, 'samples': 13215744, 'steps': 25811, 'batch_loss/train': 0.719786380417645}
12/26/2021 08:53:33 - INFO - codeparrot_training - Step 25812: {'lr': 0.00023858358085210882, 'samples': 13216256, 'steps': 25812, 'batch_loss/train': 0.7613806575536728}
12/26/2021 08:53:44 - INFO - codeparrot_training - Step 25813: {'lr': 0.00023856783418501034, 'samples': 13216768, 'steps': 25813, 'batch_loss/train': 0.8490590648725629}
12/26/2021 08:53:54 - INFO - codeparrot_training - Step 25814: {'lr': 0.00023855208756336197, 'samples': 13217280, 'steps': 25814, 'batch_loss/train': 0.7466192059218884}
12/26/2021 08:54:06 - INFO - codeparrot_training - Step 25815: {'lr': 0.0002385363409872261, 'samples': 13217792, 'steps': 25815, 'batch_loss/train': 0.8462620107457042}
12/26/2021 08:54:17 - INFO - codeparrot_training - Step 25816: {'lr': 0.00023852059445666547, 'samples': 13218304, 'steps': 25816, 'batch_loss/train': 0.8822855362668633}
12/26/2021 08:54:28 - INFO - codeparrot_training - Step 25817: {'lr': 0.00023850484797174253, 'samples': 13218816, 'steps': 25817, 'batch_loss/train': 0.6205046270042658}
12/26/2021 08:54:40 - INFO - codeparrot_training - Step 25818: {'lr': 0.00023848910153252, 'samples': 13219328, 'steps': 25818, 'batch_loss/train': 0.6986976908519864}
12/26/2021 08:54:51 - INFO - codeparrot_training - Step 25819: {'lr': 0.00023847335513906053, 'samples': 13219840, 'steps': 25819, 'batch_loss/train': 0.7838326701894403}
12/26/2021 08:55:02 - INFO - codeparrot_training - Step 25820: {'lr': 0.00023845760879142664, 'samples': 13220352, 'steps': 25820, 'batch_loss/train': 0.7708400280680507}
12/26/2021 08:55:14 - INFO - codeparrot_training - Step 25821: {'lr': 0.00023844186248968082, 'samples': 13220864, 'steps': 25821, 'batch_loss/train': 0.5903663435019553}
12/26/2021 08:55:24 - INFO - codeparrot_training - Step 25822: {'lr': 0.00023842611623388585, 'samples': 13221376, 'steps': 25822, 'batch_loss/train': 0.7859106743708253}
12/26/2021 08:55:35 - INFO - codeparrot_training - Step 25823: {'lr': 0.00023841037002410434, 'samples': 13221888, 'steps': 25823, 'batch_loss/train': 0.7558905808255076}
12/26/2021 08:55:46 - INFO - codeparrot_training - Step 25824: {'lr': 0.0002383946238603987, 'samples': 13222400, 'steps': 25824, 'batch_loss/train': 0.7717930786311626}
12/26/2021 08:55:58 - INFO - codeparrot_training - Step 25825: {'lr': 0.00023837887774283173, 'samples': 13222912, 'steps': 25825, 'batch_loss/train': 0.8487730016931891}
12/26/2021 08:56:08 - INFO - codeparrot_training - Step 25826: {'lr': 0.000238363131671466, 'samples': 13223424, 'steps': 25826, 'batch_loss/train': 0.7383453054353595}
12/26/2021 08:56:19 - INFO - codeparrot_training - Step 25827: {'lr': 0.000238347385646364, 'samples': 13223936, 'steps': 25827, 'batch_loss/train': 0.7573372656479478}
12/26/2021 08:56:32 - INFO - codeparrot_training - Step 25828: {'lr': 0.00023833163966758834, 'samples': 13224448, 'steps': 25828, 'batch_loss/train': 0.7068978953175247}
12/26/2021 08:56:42 - INFO - codeparrot_training - Step 25829: {'lr': 0.0002383158937352018, 'samples': 13224960, 'steps': 25829, 'batch_loss/train': 0.7752053984440863}
12/26/2021 08:56:53 - INFO - codeparrot_training - Step 25830: {'lr': 0.0002383001478492668, 'samples': 13225472, 'steps': 25830, 'batch_loss/train': 0.678603402338922}
12/26/2021 08:57:05 - INFO - codeparrot_training - Step 25831: {'lr': 0.00023828440200984602, 'samples': 13225984, 'steps': 25831, 'batch_loss/train': 0.6619744659401476}
12/26/2021 08:57:16 - INFO - codeparrot_training - Step 25832: {'lr': 0.00023826865621700194, 'samples': 13226496, 'steps': 25832, 'batch_loss/train': 0.7074231074657291}
12/26/2021 08:57:26 - INFO - codeparrot_training - Step 25833: {'lr': 0.00023825291047079736, 'samples': 13227008, 'steps': 25833, 'batch_loss/train': 0.7026512789307162}
12/26/2021 08:57:37 - INFO - codeparrot_training - Step 25834: {'lr': 0.00023823716477129473, 'samples': 13227520, 'steps': 25834, 'batch_loss/train': 0.7529293678235263}
12/26/2021 08:57:50 - INFO - codeparrot_training - Step 25835: {'lr': 0.00023822141911855664, 'samples': 13228032, 'steps': 25835, 'batch_loss/train': 0.7667767368257046}
12/26/2021 08:58:00 - INFO - codeparrot_training - Step 25836: {'lr': 0.00023820567351264582, 'samples': 13228544, 'steps': 25836, 'batch_loss/train': 0.7498479164205492}
12/26/2021 08:58:11 - INFO - codeparrot_training - Step 25837: {'lr': 0.00023818992795362475, 'samples': 13229056, 'steps': 25837, 'batch_loss/train': 0.751997341401875}
12/26/2021 08:58:23 - INFO - codeparrot_training - Step 25838: {'lr': 0.00023817418244155613, 'samples': 13229568, 'steps': 25838, 'batch_loss/train': 0.8761371374130249}
12/26/2021 08:58:34 - INFO - codeparrot_training - Step 25839: {'lr': 0.00023815843697650238, 'samples': 13230080, 'steps': 25839, 'batch_loss/train': 0.8168506021611392}
12/26/2021 08:58:44 - INFO - codeparrot_training - Step 25840: {'lr': 0.00023814269155852634, 'samples': 13230592, 'steps': 25840, 'batch_loss/train': 0.7627277803840116}
12/26/2021 08:58:57 - INFO - codeparrot_training - Step 25841: {'lr':
0.00023812694618769036, 'samples': 13231104, 'steps': 25841, 'batch_loss/train': 0.6352652052883059} 12/26/2021 08:59:07 - INFO - codeparrot_training - Step 25842: {'lr': 0.0002381112008640573, 'samples': 13231616, 'steps': 25842, 'batch_loss/train': 0.769064961001277} 12/26/2021 08:59:18 - INFO - codeparrot_training - Step 25843: {'lr': 0.00023809545558768947, 'samples': 13232128, 'steps': 25843, 'batch_loss/train': 0.7181727685965598} 12/26/2021 08:59:28 - INFO - codeparrot_training - Step 25844: {'lr': 0.00023807971035864968, 'samples': 13232640, 'steps': 25844, 'batch_loss/train': 0.8240876458585262} 12/26/2021 08:59:41 - INFO - codeparrot_training - Step 25845: {'lr': 0.0002380639651770005, 'samples': 13233152, 'steps': 25845, 'batch_loss/train': 0.8549575665965676} 12/26/2021 08:59:52 - INFO - codeparrot_training - Step 25846: {'lr': 0.00023804822004280435, 'samples': 13233664, 'steps': 25846, 'batch_loss/train': 0.7134215317200869} 12/26/2021 09:00:02 - INFO - codeparrot_training - Step 25847: {'lr': 0.00023803247495612413, 'samples': 13234176, 'steps': 25847, 'batch_loss/train': 0.8119685752317309} 12/26/2021 09:00:14 - INFO - codeparrot_training - Step 25848: {'lr': 0.0002380167299170222, 'samples': 13234688, 'steps': 25848, 'batch_loss/train': 0.7307017343118787} 12/26/2021 09:00:25 - INFO - codeparrot_training - Step 25849: {'lr': 0.00023800098492556126, 'samples': 13235200, 'steps': 25849, 'batch_loss/train': 0.7084963535889983} 12/26/2021 09:00:36 - INFO - codeparrot_training - Step 25850: {'lr': 0.00023798523998180376, 'samples': 13235712, 'steps': 25850, 'batch_loss/train': 0.7355574136599898} 12/26/2021 09:00:48 - INFO - codeparrot_training - Step 25851: {'lr': 0.0002379694950858125, 'samples': 13236224, 'steps': 25851, 'batch_loss/train': 0.813627430703491} 12/26/2021 09:00:58 - INFO - codeparrot_training - Step 25852: {'lr': 0.00023795375023765, 'samples': 13236736, 'steps': 25852, 'batch_loss/train': 0.7095377296209335} 12/26/2021 09:01:09 - INFO 
- codeparrot_training - Step 25853: {'lr': 0.0002379380054373788, 'samples': 13237248, 'steps': 25853, 'batch_loss/train': 0.7851838804781437} 12/26/2021 09:01:21 - INFO - codeparrot_training - Step 25854: {'lr': 0.0002379222606850615, 'samples': 13237760, 'steps': 25854, 'batch_loss/train': 0.7029807581566274} 12/26/2021 09:01:32 - INFO - codeparrot_training - Step 25855: {'lr': 0.00023790651598076085, 'samples': 13238272, 'steps': 25855, 'batch_loss/train': 0.7082315306179225} 12/26/2021 09:01:42 - INFO - codeparrot_training - Step 25856: {'lr': 0.0002378907713245393, 'samples': 13238784, 'steps': 25856, 'batch_loss/train': 0.7536236932501197} 12/26/2021 09:01:53 - INFO - codeparrot_training - Step 25857: {'lr': 0.00023787502671645942, 'samples': 13239296, 'steps': 25857, 'batch_loss/train': 0.9224378038197756} 12/26/2021 09:02:06 - INFO - codeparrot_training - Step 25858: {'lr': 0.0002378592821565838, 'samples': 13239808, 'steps': 25858, 'batch_loss/train': 0.8491421900689602} 12/26/2021 09:02:16 - INFO - codeparrot_training - Step 25859: {'lr': 0.0002378435376449752, 'samples': 13240320, 'steps': 25859, 'batch_loss/train': 0.7726580509915948} 12/26/2021 09:02:27 - INFO - codeparrot_training - Step 25860: {'lr': 0.00023782779318169605, 'samples': 13240832, 'steps': 25860, 'batch_loss/train': 0.8032284388318658} 12/26/2021 09:02:39 - INFO - codeparrot_training - Step 25861: {'lr': 0.00023781204876680896, 'samples': 13241344, 'steps': 25861, 'batch_loss/train': 0.7835508196149021} 12/26/2021 09:02:50 - INFO - codeparrot_training - Step 25862: {'lr': 0.00023779630440037665, 'samples': 13241856, 'steps': 25862, 'batch_loss/train': 0.6777599202468991} 12/26/2021 09:03:00 - INFO - codeparrot_training - Step 25863: {'lr': 0.00023778056008246157, 'samples': 13242368, 'steps': 25863, 'batch_loss/train': 0.6783703099936247} 12/26/2021 09:03:13 - INFO - codeparrot_training - Step 25864: {'lr': 0.00023776481581312643, 'samples': 13242880, 'steps': 25864, 'batch_loss/train': 
0.7382772710407153} 12/26/2021 09:03:24 - INFO - codeparrot_training - Step 25865: {'lr': 0.00023774907159243363, 'samples': 13243392, 'steps': 25865, 'batch_loss/train': 0.7440869733691216} 12/26/2021 09:03:35 - INFO - codeparrot_training - Step 25866: {'lr': 0.00023773332742044607, 'samples': 13243904, 'steps': 25866, 'batch_loss/train': 0.6469409526325762} 12/26/2021 09:03:45 - INFO - codeparrot_training - Step 25867: {'lr': 0.00023771758329722606, 'samples': 13244416, 'steps': 25867, 'batch_loss/train': 0.8313260339200497} 12/26/2021 09:03:57 - INFO - codeparrot_training - Step 25868: {'lr': 0.00023770183922283637, 'samples': 13244928, 'steps': 25868, 'batch_loss/train': 0.6554475215962157} 12/26/2021 09:04:08 - INFO - codeparrot_training - Step 25869: {'lr': 0.0002376860951973394, 'samples': 13245440, 'steps': 25869, 'batch_loss/train': 0.7330520222894847} 12/26/2021 09:04:19 - INFO - codeparrot_training - Step 25870: {'lr': 0.00023767035122079793, 'samples': 13245952, 'steps': 25870, 'batch_loss/train': 0.7343311649747193} 12/26/2021 09:04:31 - INFO - codeparrot_training - Step 25871: {'lr': 0.00023765460729327457, 'samples': 13246464, 'steps': 25871, 'batch_loss/train': 0.8181927315890789} 12/26/2021 09:04:41 - INFO - codeparrot_training - Step 25872: {'lr': 0.0002376388634148317, 'samples': 13246976, 'steps': 25872, 'batch_loss/train': 0.7867958163842559} 12/26/2021 09:04:52 - INFO - codeparrot_training - Step 25873: {'lr': 0.00023762311958553205, 'samples': 13247488, 'steps': 25873, 'batch_loss/train': 0.7787940478883684} 12/26/2021 09:05:05 - INFO - codeparrot_training - Step 25874: {'lr': 0.00023760737580543834, 'samples': 13248000, 'steps': 25874, 'batch_loss/train': 0.7329162256792188} 12/26/2021 09:05:16 - INFO - codeparrot_training - Step 25875: {'lr': 0.000237591632074613, 'samples': 13248512, 'steps': 25875, 'batch_loss/train': 0.712734192609787} 12/26/2021 09:05:26 - INFO - codeparrot_training - Step 25876: {'lr': 0.0002375758883931185, 'samples': 
13249024, 'steps': 25876, 'batch_loss/train': 0.9029819583520293} 12/26/2021 09:05:37 - INFO - codeparrot_training - Step 25877: {'lr': 0.00023756014476101765, 'samples': 13249536, 'steps': 25877, 'batch_loss/train': 0.7182923723012209} 12/26/2021 09:05:49 - INFO - codeparrot_training - Step 25878: {'lr': 0.00023754440117837303, 'samples': 13250048, 'steps': 25878, 'batch_loss/train': 0.7883221320807934} 12/26/2021 09:06:00 - INFO - codeparrot_training - Step 25879: {'lr': 0.0002375286576452471, 'samples': 13250560, 'steps': 25879, 'batch_loss/train': 0.7622475530952215} 12/26/2021 09:06:10 - INFO - codeparrot_training - Step 25880: {'lr': 0.00023751291416170247, 'samples': 13251072, 'steps': 25880, 'batch_loss/train': 0.850754152983427} 12/26/2021 09:06:23 - INFO - codeparrot_training - Step 25881: {'lr': 0.00023749717072780188, 'samples': 13251584, 'steps': 25881, 'batch_loss/train': 0.6617951535154134} 12/26/2021 09:06:34 - INFO - codeparrot_training - Step 25882: {'lr': 0.0002374814273436078, 'samples': 13252096, 'steps': 25882, 'batch_loss/train': 0.7948501383652911} 12/26/2021 09:06:44 - INFO - codeparrot_training - Step 25883: {'lr': 0.00023746568400918275, 'samples': 13252608, 'steps': 25883, 'batch_loss/train': 0.7859710892662406} 12/26/2021 09:06:55 - INFO - codeparrot_training - Step 25884: {'lr': 0.00023744994072458952, 'samples': 13253120, 'steps': 25884, 'batch_loss/train': 0.7411525533534586} 12/26/2021 09:07:07 - INFO - codeparrot_training - Step 25885: {'lr': 0.0002374341974898906, 'samples': 13253632, 'steps': 25885, 'batch_loss/train': 0.6558716711588204} 12/26/2021 09:07:18 - INFO - codeparrot_training - Step 25886: {'lr': 0.0002374184543051485, 'samples': 13254144, 'steps': 25886, 'batch_loss/train': 0.7384914932772517} 12/26/2021 09:07:28 - INFO - codeparrot_training - Step 25887: {'lr': 0.0002374027111704258, 'samples': 13254656, 'steps': 25887, 'batch_loss/train': 0.8627694901078939} 12/26/2021 09:07:40 - INFO - codeparrot_training - Step 
25888: {'lr': 0.0002373869680857853, 'samples': 13255168, 'steps': 25888, 'batch_loss/train': 0.7836383730173111} 12/26/2021 09:07:51 - INFO - codeparrot_training - Step 25889: {'lr': 0.0002373712250512894, 'samples': 13255680, 'steps': 25889, 'batch_loss/train': 0.8167732283473015} 12/26/2021 09:08:02 - INFO - codeparrot_training - Step 25890: {'lr': 0.00023735548206700079, 'samples': 13256192, 'steps': 25890, 'batch_loss/train': 0.8129525622352958} 12/26/2021 09:08:14 - INFO - codeparrot_training - Step 25891: {'lr': 0.0002373397391329819, 'samples': 13256704, 'steps': 25891, 'batch_loss/train': 0.7314910553395748} 12/26/2021 09:08:24 - INFO - codeparrot_training - Step 25892: {'lr': 0.00023732399624929555, 'samples': 13257216, 'steps': 25892, 'batch_loss/train': 0.7945600261446089} 12/26/2021 09:08:35 - INFO - codeparrot_training - Step 25893: {'lr': 0.00023730825341600412, 'samples': 13257728, 'steps': 25893, 'batch_loss/train': 0.8338272827677429} 12/26/2021 09:08:48 - INFO - codeparrot_training - Step 25894: {'lr': 0.00023729251063317033, 'samples': 13258240, 'steps': 25894, 'batch_loss/train': 0.9272001590579748} 12/26/2021 09:08:58 - INFO - codeparrot_training - Step 25895: {'lr': 0.00023727676790085664, 'samples': 13258752, 'steps': 25895, 'batch_loss/train': 0.7245870530605316} 12/26/2021 09:09:09 - INFO - codeparrot_training - Step 25896: {'lr': 0.00023726102521912576, 'samples': 13259264, 'steps': 25896, 'batch_loss/train': 0.7669850520323962} 12/26/2021 09:09:19 - INFO - codeparrot_training - Step 25897: {'lr': 0.0002372452825880403, 'samples': 13259776, 'steps': 25897, 'batch_loss/train': 0.7699030581861734} 12/26/2021 09:09:34 - INFO - codeparrot_training - Step 25898: {'lr': 0.00023722954000766268, 'samples': 13260288, 'steps': 25898, 'batch_loss/train': 0.7679054755717516} 12/26/2021 09:09:45 - INFO - codeparrot_training - Step 25899: {'lr': 0.0002372137974780556, 'samples': 13260800, 'steps': 25899, 'batch_loss/train': 0.7533565675839782} 
12/26/2021 09:09:55 - INFO - codeparrot_training - Step 25900: {'lr': 0.0002371980549992817, 'samples': 13261312, 'steps': 25900, 'batch_loss/train': 0.697374670766294} 12/26/2021 09:10:07 - INFO - codeparrot_training - Step 25901: {'lr': 0.00023718231257140355, 'samples': 13261824, 'steps': 25901, 'batch_loss/train': 0.7607488394714892} 12/26/2021 09:10:18 - INFO - codeparrot_training - Step 25902: {'lr': 0.0002371665701944835, 'samples': 13262336, 'steps': 25902, 'batch_loss/train': 0.670893486472778} 12/26/2021 09:10:29 - INFO - codeparrot_training - Step 25903: {'lr': 0.00023715082786858444, 'samples': 13262848, 'steps': 25903, 'batch_loss/train': 0.7957008937373757} 12/26/2021 09:10:41 - INFO - codeparrot_training - Step 25904: {'lr': 0.00023713508559376886, 'samples': 13263360, 'steps': 25904, 'batch_loss/train': 0.7625911957584321} 12/26/2021 09:10:51 - INFO - codeparrot_training - Step 25905: {'lr': 0.00023711934337009927, 'samples': 13263872, 'steps': 25905, 'batch_loss/train': 0.7578310491517186} 12/26/2021 09:11:02 - INFO - codeparrot_training - Step 25906: {'lr': 0.00023710360119763826, 'samples': 13264384, 'steps': 25906, 'batch_loss/train': 0.6637306208722293} 12/26/2021 09:11:16 - INFO - codeparrot_training - Step 25907: {'lr': 0.00023708785907644857, 'samples': 13264896, 'steps': 25907, 'batch_loss/train': 0.7968278415501118} 12/26/2021 09:11:27 - INFO - codeparrot_training - Step 25908: {'lr': 0.00023707211700659263, 'samples': 13265408, 'steps': 25908, 'batch_loss/train': 0.7519544120877981} 12/26/2021 09:11:37 - INFO - codeparrot_training - Step 25909: {'lr': 0.000237056374988133, 'samples': 13265920, 'steps': 25909, 'batch_loss/train': 0.674628120730631} 12/26/2021 09:11:48 - INFO - codeparrot_training - Step 25910: {'lr': 0.00023704063302113244, 'samples': 13266432, 'steps': 25910, 'batch_loss/train': 0.8337775748805143} 12/26/2021 09:12:00 - INFO - codeparrot_training - Step 25911: {'lr': 0.00023702489110565345, 'samples': 13266944, 'steps': 
25911, 'batch_loss/train': 0.7792388579109684} 12/26/2021 09:12:10 - INFO - codeparrot_training - Step 25912: {'lr': 0.0002370091492417585, 'samples': 13267456, 'steps': 25912, 'batch_loss/train': 0.675048332195729} 12/26/2021 09:12:21 - INFO - codeparrot_training - Step 25913: {'lr': 0.00023699340742951025, 'samples': 13267968, 'steps': 25913, 'batch_loss/train': 0.8270281180739403} 12/26/2021 09:12:35 - INFO - codeparrot_training - Step 25914: {'lr': 0.00023697766566897142, 'samples': 13268480, 'steps': 25914, 'batch_loss/train': 0.7361671247635968} 12/26/2021 09:12:46 - INFO - codeparrot_training - Step 25915: {'lr': 0.00023696192396020438, 'samples': 13268992, 'steps': 25915, 'batch_loss/train': 0.7710491635371} 12/26/2021 09:12:56 - INFO - codeparrot_training - Step 25916: {'lr': 0.0002369461823032719, 'samples': 13269504, 'steps': 25916, 'batch_loss/train': 0.7704849233850837} 12/26/2021 09:13:08 - INFO - codeparrot_training - Step 25917: {'lr': 0.00023693044069823636, 'samples': 13270016, 'steps': 25917, 'batch_loss/train': 0.7608193596825004} 12/26/2021 09:13:19 - INFO - codeparrot_training - Step 25918: {'lr': 0.00023691469914516048, 'samples': 13270528, 'steps': 25918, 'batch_loss/train': 0.7649617120623589} 12/26/2021 09:13:29 - INFO - codeparrot_training - Step 25919: {'lr': 0.00023689895764410687, 'samples': 13271040, 'steps': 25919, 'batch_loss/train': 0.7301172437146306} 12/26/2021 09:13:40 - INFO - codeparrot_training - Step 25920: {'lr': 0.00023688321619513798, 'samples': 13271552, 'steps': 25920, 'batch_loss/train': 0.8097895570099354} 12/26/2021 09:13:52 - INFO - codeparrot_training - Step 25921: {'lr': 0.00023686747479831656, 'samples': 13272064, 'steps': 25921, 'batch_loss/train': 0.7223944151774049} 12/26/2021 09:14:03 - INFO - codeparrot_training - Step 25922: {'lr': 0.00023685173345370502, 'samples': 13272576, 'steps': 25922, 'batch_loss/train': 0.826703604310751} 12/26/2021 09:14:13 - INFO - codeparrot_training - Step 25923: {'lr': 
0.0002368359921613661, 'samples': 13273088, 'steps': 25923, 'batch_loss/train': 0.7204307373613119} 12/26/2021 09:14:28 - INFO - codeparrot_training - Step 25924: {'lr': 0.00023682025092136222, 'samples': 13273600, 'steps': 25924, 'batch_loss/train': 0.6914122509770095} 12/26/2021 09:14:38 - INFO - codeparrot_training - Step 25925: {'lr': 0.0002368045097337561, 'samples': 13274112, 'steps': 25925, 'batch_loss/train': 0.7630156809464097} 12/26/2021 09:14:49 - INFO - codeparrot_training - Step 25926: {'lr': 0.00023678876859861028, 'samples': 13274624, 'steps': 25926, 'batch_loss/train': 0.8214589990675449} 12/26/2021 09:15:01 - INFO - codeparrot_training - Step 25927: {'lr': 0.0002367730275159873, 'samples': 13275136, 'steps': 25927, 'batch_loss/train': 0.7388475835323334} 12/26/2021 09:15:11 - INFO - codeparrot_training - Step 25928: {'lr': 0.00023675728648594977, 'samples': 13275648, 'steps': 25928, 'batch_loss/train': 0.7671865252777934} 12/26/2021 09:15:22 - INFO - codeparrot_training - Step 25929: {'lr': 0.00023674154550856025, 'samples': 13276160, 'steps': 25929, 'batch_loss/train': 0.7363605930004269} 12/26/2021 09:15:33 - INFO - codeparrot_training - Step 25930: {'lr': 0.0002367258045838814, 'samples': 13276672, 'steps': 25930, 'batch_loss/train': 0.7293666321784258} 12/26/2021 09:15:45 - INFO - codeparrot_training - Step 25931: {'lr': 0.0002367100637119757, 'samples': 13277184, 'steps': 25931, 'batch_loss/train': 0.7950822487473488} 12/26/2021 09:15:55 - INFO - codeparrot_training - Step 25932: {'lr': 0.0002366943228929057, 'samples': 13277696, 'steps': 25932, 'batch_loss/train': 0.6805221736431122} 12/26/2021 09:16:06 - INFO - codeparrot_training - Step 25933: {'lr': 0.00023667858212673415, 'samples': 13278208, 'steps': 25933, 'batch_loss/train': 0.6444833055138588} 12/26/2021 09:16:18 - INFO - codeparrot_training - Step 25934: {'lr': 0.00023666284141352352, 'samples': 13278720, 'steps': 25934, 'batch_loss/train': 0.7770900866016746} 12/26/2021 09:16:29 - 
INFO - codeparrot_training - Step 25935: {'lr': 0.0002366471007533363, 'samples': 13279232, 'steps': 25935, 'batch_loss/train': 0.5607092686695978} 12/26/2021 09:16:39 - INFO - codeparrot_training - Step 25936: {'lr': 0.00023663136014623527, 'samples': 13279744, 'steps': 25936, 'batch_loss/train': 0.7606370359426364} 12/26/2021 09:16:54 - INFO - codeparrot_training - Step 25937: {'lr': 0.0002366156195922829, 'samples': 13280256, 'steps': 25937, 'batch_loss/train': 0.7363438149914145} 12/26/2021 09:17:04 - INFO - codeparrot_training - Step 25938: {'lr': 0.00023659987909154174, 'samples': 13280768, 'steps': 25938, 'batch_loss/train': 0.8617282379418612} 12/26/2021 09:17:15 - INFO - codeparrot_training - Step 25939: {'lr': 0.00023658413864407435, 'samples': 13281280, 'steps': 25939, 'batch_loss/train': 1.059578905813396} 12/26/2021 09:17:25 - INFO - codeparrot_training - Step 25940: {'lr': 0.00023656839824994346, 'samples': 13281792, 'steps': 25940, 'batch_loss/train': 0.7155702370218933} 12/26/2021 09:17:38 - INFO - codeparrot_training - Step 25941: {'lr': 0.0002365526579092115, 'samples': 13282304, 'steps': 25941, 'batch_loss/train': 0.7176461787894368} 12/26/2021 09:17:48 - INFO - codeparrot_training - Step 25942: {'lr': 0.00023653691762194113, 'samples': 13282816, 'steps': 25942, 'batch_loss/train': 0.831014025490731} 12/26/2021 09:17:59 - INFO - codeparrot_training - Step 25943: {'lr': 0.0002365211773881948, 'samples': 13283328, 'steps': 25943, 'batch_loss/train': 0.7375680790282786} 12/26/2021 09:18:13 - INFO - codeparrot_training - Step 25944: {'lr': 0.0002365054372080352, 'samples': 13283840, 'steps': 25944, 'batch_loss/train': 0.8618763815611601} 12/26/2021 09:18:23 - INFO - codeparrot_training - Step 25945: {'lr': 0.00023648969708152498, 'samples': 13284352, 'steps': 25945, 'batch_loss/train': 0.746865161228925} 12/26/2021 09:18:34 - INFO - codeparrot_training - Step 25946: {'lr': 0.00023647395700872648, 'samples': 13284864, 'steps': 25946, 
'batch_loss/train': 0.7379702075850219} 12/26/2021 09:18:46 - INFO - codeparrot_training - Step 25947: {'lr': 0.00023645821698970256, 'samples': 13285376, 'steps': 25947, 'batch_loss/train': 0.7563121132552624} 12/26/2021 09:18:57 - INFO - codeparrot_training - Step 25948: {'lr': 0.0002364424770245156, 'samples': 13285888, 'steps': 25948, 'batch_loss/train': 0.7647718004882336} 12/26/2021 09:19:07 - INFO - codeparrot_training - Step 25949: {'lr': 0.00023642673711322826, 'samples': 13286400, 'steps': 25949, 'batch_loss/train': 0.7338793538510799} 12/26/2021 09:19:18 - INFO - codeparrot_training - Step 25950: {'lr': 0.00023641099725590301, 'samples': 13286912, 'steps': 25950, 'batch_loss/train': 0.7441261947387829} 12/26/2021 09:19:30 - INFO - codeparrot_training - Step 25951: {'lr': 0.00023639525745260253, 'samples': 13287424, 'steps': 25951, 'batch_loss/train': 0.8491028444841504} 12/26/2021 09:19:41 - INFO - codeparrot_training - Step 25952: {'lr': 0.00023637951770338946, 'samples': 13287936, 'steps': 25952, 'batch_loss/train': 1.0665181400254369} 12/26/2021 09:19:52 - INFO - codeparrot_training - Step 25953: {'lr': 0.0002363637780083262, 'samples': 13288448, 'steps': 25953, 'batch_loss/train': 1.3581379484385252} 12/26/2021 09:20:05 - INFO - codeparrot_training - Step 25954: {'lr': 0.00023634803836747537, 'samples': 13288960, 'steps': 25954, 'batch_loss/train': 0.7752924566157162} 12/26/2021 09:20:16 - INFO - codeparrot_training - Step 25955: {'lr': 0.0002363322987808997, 'samples': 13289472, 'steps': 25955, 'batch_loss/train': 0.7328214468434453} 12/26/2021 09:20:27 - INFO - codeparrot_training - Step 25956: {'lr': 0.00023631655924866163, 'samples': 13289984, 'steps': 25956, 'batch_loss/train': 0.7878120760433376} 12/26/2021 09:20:39 - INFO - codeparrot_training - Step 25957: {'lr': 0.00023630081977082365, 'samples': 13290496, 'steps': 25957, 'batch_loss/train': 0.8095325324684381} 12/26/2021 09:20:49 - INFO - codeparrot_training - Step 25958: {'lr': 
0.0002362850803474485, 'samples': 13291008, 'steps': 25958, 'batch_loss/train': 0.7506893844110891} 12/26/2021 09:21:00 - INFO - codeparrot_training - Step 25959: {'lr': 0.00023626934097859873, 'samples': 13291520, 'steps': 25959, 'batch_loss/train': 0.8044554013758898} 12/26/2021 09:21:11 - INFO - codeparrot_training - Step 25960: {'lr': 0.00023625360166433686, 'samples': 13292032, 'steps': 25960, 'batch_loss/train': 0.7885770266875625} 12/26/2021 09:21:23 - INFO - codeparrot_training - Step 25961: {'lr': 0.00023623786240472539, 'samples': 13292544, 'steps': 25961, 'batch_loss/train': 0.8216549977660179} 12/26/2021 09:21:34 - INFO - codeparrot_training - Step 25962: {'lr': 0.0002362221231998271, 'samples': 13293056, 'steps': 25962, 'batch_loss/train': 0.7208397784270346} 12/26/2021 09:21:44 - INFO - codeparrot_training - Step 25963: {'lr': 0.00023620638404970437, 'samples': 13293568, 'steps': 25963, 'batch_loss/train': 0.7875577285885811} 12/26/2021 09:21:58 - INFO - codeparrot_training - Step 25964: {'lr': 0.00023619064495441995, 'samples': 13294080, 'steps': 25964, 'batch_loss/train': 0.7668369561433792} 12/26/2021 09:22:09 - INFO - codeparrot_training - Step 25965: {'lr': 0.0002361749059140362, 'samples': 13294592, 'steps': 25965, 'batch_loss/train': 0.7741670468822122} 12/26/2021 09:22:19 - INFO - codeparrot_training - Step 25966: {'lr': 0.00023615916692861587, 'samples': 13295104, 'steps': 25966, 'batch_loss/train': 0.7739833360537887} 12/26/2021 09:22:31 - INFO - codeparrot_training - Step 25967: {'lr': 0.0002361434279982214, 'samples': 13295616, 'steps': 25967, 'batch_loss/train': 0.7451002122834325} 12/26/2021 09:22:42 - INFO - codeparrot_training - Step 25968: {'lr': 0.0002361276891229155, 'samples': 13296128, 'steps': 25968, 'batch_loss/train': 0.8506747931241989} 12/26/2021 09:22:53 - INFO - codeparrot_training - Step 25969: {'lr': 0.0002361119503027606, 'samples': 13296640, 'steps': 25969, 'batch_loss/train': 0.9083570409566164} 12/26/2021 09:23:07 - 
INFO - codeparrot_training - Step 25970: {'lr': 0.0002360962115378194, 'samples': 13297152, 'steps': 25970, 'batch_loss/train': 0.8969914205372334} 12/26/2021 09:23:17 - INFO - codeparrot_training - Step 25971: {'lr': 0.00023608047282815443, 'samples': 13297664, 'steps': 25971, 'batch_loss/train': 0.8583004020620137} 12/26/2021 09:23:28 - INFO - codeparrot_training - Step 25972: {'lr': 0.00023606473417382812, 'samples': 13298176, 'steps': 25972, 'batch_loss/train': 0.7221879176795483} 12/26/2021 09:23:38 - INFO - codeparrot_training - Step 25973: {'lr': 0.00023604899557490332, 'samples': 13298688, 'steps': 25973, 'batch_loss/train': 0.6990219606086612} 12/26/2021 09:23:50 - INFO - codeparrot_training - Step 25974: {'lr': 0.00023603325703144235, 'samples': 13299200, 'steps': 25974, 'batch_loss/train': 0.8094442496076226} 12/26/2021 09:24:01 - INFO - codeparrot_training - Step 25975: {'lr': 0.00023601751854350795, 'samples': 13299712, 'steps': 25975, 'batch_loss/train': 0.7896806411445141} 12/26/2021 09:24:12 - INFO - codeparrot_training - Step 25976: {'lr': 0.00023600178011116252, 'samples': 13300224, 'steps': 25976, 'batch_loss/train': 0.714960609562695} 12/26/2021 09:24:24 - INFO - codeparrot_training - Step 25977: {'lr': 0.00023598604173446879, 'samples': 13300736, 'steps': 25977, 'batch_loss/train': 0.7862966675311327} 12/26/2021 09:24:34 - INFO - codeparrot_training - Step 25978: {'lr': 0.00023597030341348927, 'samples': 13301248, 'steps': 25978, 'batch_loss/train': 0.8549827896058559} 12/26/2021 09:24:45 - INFO - codeparrot_training - Step 25979: {'lr': 0.0002359545651482865, 'samples': 13301760, 'steps': 25979, 'batch_loss/train': 0.7247585998848081} 12/26/2021 09:24:58 - INFO - codeparrot_training - Step 25980: {'lr': 0.000235938826938923, 'samples': 13302272, 'steps': 25980, 'batch_loss/train': 0.7524679191410542} 12/26/2021 09:25:08 - INFO - codeparrot_training - Step 25981: {'lr': 0.0002359230887854616, 'samples': 13302784, 'steps': 25981, 
'batch_loss/train': 0.5509963142685592} 12/26/2021 09:25:19 - INFO - codeparrot_training - Step 25982: {'lr': 0.00023590735068796463, 'samples': 13303296, 'steps': 25982, 'batch_loss/train': 0.6975228009978309} 12/26/2021 09:25:33 - INFO - codeparrot_training - Step 25983: {'lr': 0.00023589161264649458, 'samples': 13303808, 'steps': 25983, 'batch_loss/train': 0.6339583437656984} 12/26/2021 09:25:44 - INFO - codeparrot_training - Step 25984: {'lr': 0.00023587587466111423, 'samples': 13304320, 'steps': 25984, 'batch_loss/train': 0.8780158488079906} 12/26/2021 09:25:54 - INFO - codeparrot_training - Step 25985: {'lr': 0.00023586013673188613, 'samples': 13304832, 'steps': 25985, 'batch_loss/train': 0.7409118125215173} 12/26/2021 09:26:05 - INFO - codeparrot_training - Step 25986: {'lr': 0.00023584439885887272, 'samples': 13305344, 'steps': 25986, 'batch_loss/train': 0.6298027783632278} 12/26/2021 09:26:17 - INFO - codeparrot_training - Step 25987: {'lr': 0.0002358286610421366, 'samples': 13305856, 'steps': 25987, 'batch_loss/train': 0.7174604749307036} 12/26/2021 09:26:27 - INFO - codeparrot_training - Step 25988: {'lr': 0.00023581292328174046, 'samples': 13306368, 'steps': 25988, 'batch_loss/train': 0.6211636929656379} 12/26/2021 09:26:38 - INFO - codeparrot_training - Step 25989: {'lr': 0.00023579718557774674, 'samples': 13306880, 'steps': 25989, 'batch_loss/train': 0.7626046957448125} 12/26/2021 09:26:50 - INFO - codeparrot_training - Step 25990: {'lr': 0.0002357814479302181, 'samples': 13307392, 'steps': 25990, 'batch_loss/train': 0.7834404250606894} 12/26/2021 09:27:01 - INFO - codeparrot_training - Step 25991: {'lr': 0.00023576571033921692, 'samples': 13307904, 'steps': 25991, 'batch_loss/train': 1.0344557026401162} 12/26/2021 09:27:12 - INFO - codeparrot_training - Step 25992: {'lr': 0.00023574997280480602, 'samples': 13308416, 'steps': 25992, 'batch_loss/train': 0.7838697091210634} 12/26/2021 09:27:25 - INFO - codeparrot_training - Step 25993: {'lr': 
0.00023573423532704784, 'samples': 13308928, 'steps': 25993, 'batch_loss/train': 0.8149074874818325}
12/26/2021 09:27:36 - INFO - codeparrot_training - Step 25994: {'lr': 0.00023571849790600486, 'samples': 13309440, 'steps': 25994, 'batch_loss/train': 0.8304463140666485}
12/26/2021 09:27:47 - INFO - codeparrot_training - Step 25995: {'lr': 0.00023570276054173984, 'samples': 13309952, 'steps': 25995, 'batch_loss/train': 0.7654845351353288}
12/26/2021 09:27:57 - INFO - codeparrot_training - Step 25996: {'lr': 0.0002356870232343152, 'samples': 13310464, 'steps': 25996, 'batch_loss/train': 0.679148836992681}
12/26/2021 09:28:09 - INFO - codeparrot_training - Step 25997: {'lr': 0.00023567128598379363, 'samples': 13310976, 'steps': 25997, 'batch_loss/train': 0.6932398732751608}
12/26/2021 09:28:20 - INFO - codeparrot_training - Step 25998: {'lr': 0.0002356555487902375, 'samples': 13311488, 'steps': 25998, 'batch_loss/train': 0.7532311007380486}
12/26/2021 09:28:31 - INFO - codeparrot_training - Step 25999: {'lr': 0.00023563981165370952, 'samples': 13312000, 'steps': 25999, 'batch_loss/train': 0.7428278885781765}
12/26/2021 09:28:45 - INFO - codeparrot_training - Step 26000: {'lr': 0.0002356240745742723, 'samples': 13312512, 'steps': 26000, 'batch_loss/train': 0.7819683216512203}
12/26/2021 09:28:55 - INFO - codeparrot_training - Step 26001: {'lr': 0.0002356083375519883, 'samples': 13313024, 'steps': 26001, 'batch_loss/train': 1.5837554666213691}
12/26/2021 09:29:06 - INFO - codeparrot_training - Step 26002: {'lr': 0.00023559260058692006, 'samples': 13313536, 'steps': 26002, 'batch_loss/train': 1.6329371524043381}
12/26/2021 09:29:18 - INFO - codeparrot_training - Step 26003: {'lr': 0.00023557686367913023, 'samples': 13314048, 'steps': 26003, 'batch_loss/train': 0.7240577936172485}
12/26/2021 09:29:29 - INFO - codeparrot_training - Step 26004: {'lr': 0.0002355611268286814, 'samples': 13314560, 'steps': 26004, 'batch_loss/train': 0.8276777900755405}
12/26/2021 09:29:39 - INFO - codeparrot_training - Step 26005: {'lr': 0.000235545390035636, 'samples': 13315072, 'steps': 26005, 'batch_loss/train': 0.8475555153563619}
12/26/2021 09:29:50 - INFO - codeparrot_training - Step 26006: {'lr': 0.00023552965330005666, 'samples': 13315584, 'steps': 26006, 'batch_loss/train': 0.8010385651141405}
12/26/2021 09:30:02 - INFO - codeparrot_training - Step 26007: {'lr': 0.00023551391662200602, 'samples': 13316096, 'steps': 26007, 'batch_loss/train': 0.831780334468931}
12/26/2021 09:30:12 - INFO - codeparrot_training - Step 26008: {'lr': 0.00023549818000154655, 'samples': 13316608, 'steps': 26008, 'batch_loss/train': 0.8252571113407612}
12/26/2021 09:30:23 - INFO - codeparrot_training - Step 26009: {'lr': 0.0002354824434387408, 'samples': 13317120, 'steps': 26009, 'batch_loss/train': 0.8603667933493853}
12/26/2021 09:30:37 - INFO - codeparrot_training - Step 26010: {'lr': 0.00023546670693365144, 'samples': 13317632, 'steps': 26010, 'batch_loss/train': 0.8298094915226102}
12/26/2021 09:30:47 - INFO - codeparrot_training - Step 26011: {'lr': 0.00023545097048634098, 'samples': 13318144, 'steps': 26011, 'batch_loss/train': 0.7826192174106836}
12/26/2021 09:30:58 - INFO - codeparrot_training - Step 26012: {'lr': 0.00023543523409687192, 'samples': 13318656, 'steps': 26012, 'batch_loss/train': 0.7158940387889743}
12/26/2021 09:31:10 - INFO - codeparrot_training - Step 26013: {'lr': 0.00023541949776530684, 'samples': 13319168, 'steps': 26013, 'batch_loss/train': 0.7622501326259226}
12/26/2021 09:31:21 - INFO - codeparrot_training - Step 26014: {'lr': 0.0002354037614917084, 'samples': 13319680, 'steps': 26014, 'batch_loss/train': 0.8237082115374506}
12/26/2021 09:31:31 - INFO - codeparrot_training - Step 26015: {'lr': 0.00023538802527613907, 'samples': 13320192, 'steps': 26015, 'batch_loss/train': 0.678959489800036}
12/26/2021 09:31:42 - INFO - codeparrot_training - Step 26016: {'lr': 0.00023537228911866146, 'samples': 13320704, 'steps': 26016, 'batch_loss/train': 0.8331500729545951}
12/26/2021 09:31:56 - INFO - codeparrot_training - Step 26017: {'lr': 0.000235356553019338, 'samples': 13321216, 'steps': 26017, 'batch_loss/train': 0.7271498269401491}
12/26/2021 09:32:07 - INFO - codeparrot_training - Step 26018: {'lr': 0.0002353408169782315, 'samples': 13321728, 'steps': 26018, 'batch_loss/train': 0.7428227672353387}
12/26/2021 09:32:17 - INFO - codeparrot_training - Step 26019: {'lr': 0.00023532508099540427, 'samples': 13322240, 'steps': 26019, 'batch_loss/train': 0.7161113130860031}
12/26/2021 09:32:29 - INFO - codeparrot_training - Step 26020: {'lr': 0.00023530934507091898, 'samples': 13322752, 'steps': 26020, 'batch_loss/train': 0.6813697521574795}
12/26/2021 09:32:40 - INFO - codeparrot_training - Step 26021: {'lr': 0.00023529360920483824, 'samples': 13323264, 'steps': 26021, 'batch_loss/train': 0.7968770060688257}
12/26/2021 09:32:51 - INFO - codeparrot_training - Step 26022: {'lr': 0.00023527787339722454, 'samples': 13323776, 'steps': 26022, 'batch_loss/train': 0.6951539788860828}
12/26/2021 09:33:03 - INFO - codeparrot_training - Step 26023: {'lr': 0.0002352621376481405, 'samples': 13324288, 'steps': 26023, 'batch_loss/train': 0.8986020470038056}
12/26/2021 09:33:13 - INFO - codeparrot_training - Step 26024: {'lr': 0.00023524640195764853, 'samples': 13324800, 'steps': 26024, 'batch_loss/train': 0.789143449626863}
12/26/2021 09:33:24 - INFO - codeparrot_training - Step 26025: {'lr': 0.00023523066632581132, 'samples': 13325312, 'steps': 26025, 'batch_loss/train': 0.6362986410968006}
12/26/2021 09:33:36 - INFO - codeparrot_training - Step 26026: {'lr': 0.0002352149307526915, 'samples': 13325824, 'steps': 26026, 'batch_loss/train': 0.8891735468059778}
12/26/2021 09:33:47 - INFO - codeparrot_training - Step 26027: {'lr': 0.0002351991952383515, 'samples': 13326336, 'steps': 26027, 'batch_loss/train': 0.7402698043733835}
12/26/2021 09:33:57 - INFO - codeparrot_training - Step 26028: {'lr': 0.00023518345978285383, 'samples': 13326848, 'steps': 26028, 'batch_loss/train': 0.9123258572071791}
12/26/2021 09:34:08 - INFO - codeparrot_training - Step 26029: {'lr': 0.0002351677243862612, 'samples': 13327360, 'steps': 26029, 'batch_loss/train': 0.6906130453571677}
12/26/2021 09:34:22 - INFO - codeparrot_training - Step 26030: {'lr': 0.0002351519890486361, 'samples': 13327872, 'steps': 26030, 'batch_loss/train': 0.7377040348947048}
12/26/2021 09:34:32 - INFO - codeparrot_training - Step 26031: {'lr': 0.000235136253770041, 'samples': 13328384, 'steps': 26031, 'batch_loss/train': 0.7830670922994614}
12/26/2021 09:34:43 - INFO - codeparrot_training - Step 26032: {'lr': 0.00023512051855053862, 'samples': 13328896, 'steps': 26032, 'batch_loss/train': 0.7655929527245462}
12/26/2021 09:34:55 - INFO - codeparrot_training - Step 26033: {'lr': 0.00023510478339019145, 'samples': 13329408, 'steps': 26033, 'batch_loss/train': 0.7757431399077177}
12/26/2021 09:35:06 - INFO - codeparrot_training - Step 26034: {'lr': 0.00023508904828906202, 'samples': 13329920, 'steps': 26034, 'batch_loss/train': 0.6988633163273335}
12/26/2021 09:35:16 - INFO - codeparrot_training - Step 26035: {'lr': 0.0002350733132472128, 'samples': 13330432, 'steps': 26035, 'batch_loss/train': 0.8000454464927316}
12/26/2021 09:35:29 - INFO - codeparrot_training - Step 26036: {'lr': 0.00023505757826470662, 'samples': 13330944, 'steps': 26036, 'batch_loss/train': 0.749291087500751}
12/26/2021 09:35:39 - INFO - codeparrot_training - Step 26037: {'lr': 0.00023504184334160582, 'samples': 13331456, 'steps': 26037, 'batch_loss/train': 0.8156044669449329}
12/26/2021 09:35:50 - INFO - codeparrot_training - Step 26038: {'lr': 0.000235026108477973, 'samples': 13331968, 'steps': 26038, 'batch_loss/train': 0.6475465341936797}
12/26/2021 09:36:00 - INFO - codeparrot_training - Step 26039: {'lr': 0.00023501037367387062, 'samples': 13332480, 'steps': 26039, 'batch_loss/train': 0.7890656762756407}
12/26/2021 09:36:15 - INFO - codeparrot_training - Step 26040: {'lr': 0.00023499463892936147, 'samples': 13332992, 'steps': 26040, 'batch_loss/train': 0.7555847116746008}
12/26/2021 09:36:25 - INFO - codeparrot_training - Step 26041: {'lr': 0.0002349789042445079, 'samples': 13333504, 'steps': 26041, 'batch_loss/train': 0.7939869952388108}
12/26/2021 09:36:36 - INFO - codeparrot_training - Step 26042: {'lr': 0.0002349631696193726, 'samples': 13334016, 'steps': 26042, 'batch_loss/train': 0.7804129258729517}
12/26/2021 09:36:48 - INFO - codeparrot_training - Step 26043: {'lr': 0.00023494743505401796, 'samples': 13334528, 'steps': 26043, 'batch_loss/train': 0.779556754976511}
12/26/2021 09:36:59 - INFO - codeparrot_training - Step 26044: {'lr': 0.00023493170054850665, 'samples': 13335040, 'steps': 26044, 'batch_loss/train': 0.6590242731035687}
12/26/2021 09:37:09 - INFO - codeparrot_training - Step 26045: {'lr': 0.00023491596610290127, 'samples': 13335552, 'steps': 26045, 'batch_loss/train': 0.8654554346576333}
12/26/2021 09:37:21 - INFO - codeparrot_training - Step 26046: {'lr': 0.0002349002317172642, 'samples': 13336064, 'steps': 26046, 'batch_loss/train': 0.7215089579112828}
12/26/2021 09:37:32 - INFO - codeparrot_training - Step 26047: {'lr': 0.00023488449739165825, 'samples': 13336576, 'steps': 26047, 'batch_loss/train': 0.5504820933565497}
12/26/2021 09:37:43 - INFO - codeparrot_training - Step 26048: {'lr': 0.00023486876312614577, 'samples': 13337088, 'steps': 26048, 'batch_loss/train': 0.8309954688884318}
12/26/2021 09:37:53 - INFO - codeparrot_training - Step 26049: {'lr': 0.0002348530289207894, 'samples': 13337600, 'steps': 26049, 'batch_loss/train': 0.7403324274346232}
12/26/2021 09:38:07 - INFO - codeparrot_training - Step 26050: {'lr': 0.00023483729477565162, 'samples': 13338112, 'steps': 26050, 'batch_loss/train': 0.6441928073763847}
12/26/2021 09:38:17 - INFO - codeparrot_training - Step 26051: {'lr': 0.00023482156069079503, 'samples': 13338624, 'steps': 26051, 'batch_loss/train': 0.7687936997972429}
12/26/2021 09:38:28 - INFO - codeparrot_training - Step 26052: {'lr': 0.00023480582666628226, 'samples': 13339136, 'steps': 26052, 'batch_loss/train': 0.7480375675368123}
12/26/2021 09:38:40 - INFO - codeparrot_training - Step 26053: {'lr': 0.0002347900927021757, 'samples': 13339648, 'steps': 26053, 'batch_loss/train': 0.7344648099970073}
12/26/2021 09:38:51 - INFO - codeparrot_training - Step 26054: {'lr': 0.00023477435879853798, 'samples': 13340160, 'steps': 26054, 'batch_loss/train': 0.7587902992963791}
12/26/2021 09:39:01 - INFO - codeparrot_training - Step 26055: {'lr': 0.00023475862495543173, 'samples': 13340672, 'steps': 26055, 'batch_loss/train': 0.6749400754924864}
12/26/2021 09:39:13 - INFO - codeparrot_training - Step 26056: {'lr': 0.00023474289117291947, 'samples': 13341184, 'steps': 26056, 'batch_loss/train': 0.7667188689811155}
12/26/2021 09:39:24 - INFO - codeparrot_training - Step 26057: {'lr': 0.00023472715745106358, 'samples': 13341696, 'steps': 26057, 'batch_loss/train': 0.8194458084180951}
12/26/2021 09:39:35 - INFO - codeparrot_training - Step 26058: {'lr': 0.00023471142378992679, 'samples': 13342208, 'steps': 26058, 'batch_loss/train': 0.853099474683404}
12/26/2021 09:39:45 - INFO - codeparrot_training - Step 26059: {'lr': 0.0002346956901895717, 'samples': 13342720, 'steps': 26059, 'batch_loss/train': 0.8446651850827038}
12/26/2021 09:39:59 - INFO - codeparrot_training - Step 26060: {'lr': 0.00023467995665006066, 'samples': 13343232, 'steps': 26060, 'batch_loss/train': 0.7404141239821911}
12/26/2021 09:40:10 - INFO - codeparrot_training - Step 26061: {'lr': 0.00023466422317145632, 'samples': 13343744, 'steps': 26061, 'batch_loss/train': 0.7121214317157865}
12/26/2021 09:40:20 - INFO - codeparrot_training - Step 26062: {'lr': 0.00023464848975382133, 'samples': 13344256, 'steps': 26062, 'batch_loss/train': 0.7340856790542603}
12/26/2021 09:40:32 - INFO - codeparrot_training - Step 26063: {'lr': 0.00023463275639721816, 'samples': 13344768, 'steps': 26063, 'batch_loss/train': 0.7486339574679732}
12/26/2021 09:40:43 - INFO - codeparrot_training - Step 26064: {'lr': 0.00023461702310170927, 'samples': 13345280, 'steps': 26064, 'batch_loss/train': 0.729817243758589}
12/26/2021 09:40:54 - INFO - codeparrot_training - Step 26065: {'lr': 0.00023460128986735725, 'samples': 13345792, 'steps': 26065, 'batch_loss/train': 0.6613902288954705}
12/26/2021 09:41:06 - INFO - codeparrot_training - Step 26066: {'lr': 0.0002345855566942248, 'samples': 13346304, 'steps': 26066, 'batch_loss/train': 0.7246125438250601}
12/26/2021 09:41:16 - INFO - codeparrot_training - Step 26067: {'lr': 0.00023456982358237427, 'samples': 13346816, 'steps': 26067, 'batch_loss/train': 0.6446796543896198}
12/26/2021 09:41:27 - INFO - codeparrot_training - Step 26068: {'lr': 0.00023455409053186826, 'samples': 13347328, 'steps': 26068, 'batch_loss/train': 0.7702003885060549}
12/26/2021 09:41:37 - INFO - codeparrot_training - Step 26069: {'lr': 0.00023453835754276946, 'samples': 13347840, 'steps': 26069, 'batch_loss/train': 0.7305349982343614}
12/26/2021 09:41:51 - INFO - codeparrot_training - Step 26070: {'lr': 0.00023452262461514027, 'samples': 13348352, 'steps': 26070, 'batch_loss/train': 0.7061587243806571}
12/26/2021 09:42:02 - INFO - codeparrot_training - Step 26071: {'lr': 0.00023450689174904334, 'samples': 13348864, 'steps': 26071, 'batch_loss/train': 0.8305439404211938}
12/26/2021 09:42:13 - INFO - codeparrot_training - Step 26072: {'lr': 0.00023449115894454104, 'samples': 13349376, 'steps': 26072, 'batch_loss/train': 0.597798774484545}
12/26/2021 09:42:25 - INFO - codeparrot_training - Step 26073: {'lr': 0.00023447542620169614, 'samples': 13349888, 'steps': 26073, 'batch_loss/train': 0.7754335361532867}
12/26/2021 09:42:35 - INFO - codeparrot_training - Step 26074: {'lr': 0.00023445969352057104, 'samples': 13350400, 'steps': 26074, 'batch_loss/train': 0.7986839694203809}
12/26/2021 09:42:46 - INFO - codeparrot_training - Step 26075: {'lr': 0.00023444396090122836, 'samples': 13350912, 'steps': 26075, 'batch_loss/train': 0.7641231298912317}
12/26/2021 09:43:00 - INFO - codeparrot_training - Step 26076: {'lr': 0.00023442822834373055, 'samples': 13351424, 'steps': 26076, 'batch_loss/train': 0.8496071497211233}
12/26/2021 09:43:11 - INFO - codeparrot_training - Step 26077: {'lr': 0.00023441249584814022, 'samples': 13351936, 'steps': 26077, 'batch_loss/train': 0.800064499489963}
12/26/2021 09:43:21 - INFO - codeparrot_training - Step 26078: {'lr': 0.00023439676341452002, 'samples': 13352448, 'steps': 26078, 'batch_loss/train': 0.7726354412734509}
12/26/2021 09:43:33 - INFO - codeparrot_training - Step 26079: {'lr': 0.00023438103104293234, 'samples': 13352960, 'steps': 26079, 'batch_loss/train': 0.8180374763906002}
12/26/2021 09:43:44 - INFO - codeparrot_training - Step 26080: {'lr': 0.0002343652987334397, 'samples': 13353472, 'steps': 26080, 'batch_loss/train': 0.7575042750686407}
12/26/2021 09:43:54 - INFO - codeparrot_training - Step 26081: {'lr': 0.00023434956648610487, 'samples': 13353984, 'steps': 26081, 'batch_loss/train': 0.8903489392250776}
12/26/2021 09:44:05 - INFO - codeparrot_training - Step 26082: {'lr': 0.00023433383430099025, 'samples': 13354496, 'steps': 26082, 'batch_loss/train': 0.7336709103547037}
12/26/2021 09:44:17 - INFO - codeparrot_training - Step 26083: {'lr': 0.00023431810217815828, 'samples': 13355008, 'steps': 26083, 'batch_loss/train': 0.8542579677887261}
12/26/2021 09:44:28 - INFO - codeparrot_training - Step 26084: {'lr': 0.00023430237011767165, 'samples': 13355520, 'steps': 26084, 'batch_loss/train': 0.6742868754081428}
12/26/2021 09:44:38 - INFO - codeparrot_training - Step 26085: {'lr': 0.00023428663811959295, 'samples': 13356032, 'steps': 26085, 'batch_loss/train': 0.6988393776118755}
12/26/2021 09:44:52 - INFO - codeparrot_training - Step 26086: {'lr': 0.00023427090618398456, 'samples': 13356544, 'steps': 26086, 'batch_loss/train': 0.7064858681987971}
12/26/2021 09:45:03 - INFO - codeparrot_training - Step 26087: {'lr': 0.00023425517431090908, 'samples': 13357056, 'steps': 26087, 'batch_loss/train': 0.811085226945579}
12/26/2021 09:45:13 - INFO - codeparrot_training - Step 26088: {'lr': 0.00023423944250042917, 'samples': 13357568, 'steps': 26088, 'batch_loss/train': 0.7687523546628654}
12/26/2021 09:45:25 - INFO - codeparrot_training - Step 26089: {'lr': 0.00023422371075260722, 'samples': 13358080, 'steps': 26089, 'batch_loss/train': 0.758080936037004}
12/26/2021 09:45:36 - INFO - codeparrot_training - Step 26090: {'lr': 0.00023420797906750594, 'samples': 13358592, 'steps': 26090, 'batch_loss/train': 0.7081567756831646}
12/26/2021 09:45:47 - INFO - codeparrot_training - Step 26091: {'lr': 0.00023419224744518764, 'samples': 13359104, 'steps': 26091, 'batch_loss/train': 0.6830850336700678}
12/26/2021 09:45:57 - INFO - codeparrot_training - Step 26092: {'lr': 0.0002341765158857151, 'samples': 13359616, 'steps': 26092, 'batch_loss/train': 0.6773157633724622}
12/26/2021 09:46:09 - INFO - codeparrot_training - Step 26093: {'lr': 0.0002341607843891507, 'samples': 13360128, 'steps': 26093, 'batch_loss/train': 0.8158318211790174}
12/26/2021 09:46:20 - INFO - codeparrot_training - Step 26094: {'lr': 0.000234145052955557, 'samples': 13360640, 'steps': 26094, 'batch_loss/train': 0.6335561532177962}
12/26/2021 09:46:31 - INFO - codeparrot_training - Step 26095: {'lr': 0.0002341293215849967, 'samples': 13361152, 'steps': 26095, 'batch_loss/train': 0.6869053670670837}
12/26/2021 09:46:43 - INFO - codeparrot_training - Step 26096: {'lr': 0.00023411359027753216, 'samples': 13361664, 'steps': 26096, 'batch_loss/train': 0.740988768171519}
12/26/2021 09:46:54 - INFO - codeparrot_training - Step 26097: {'lr': 0.00023409785903322604, 'samples': 13362176, 'steps': 26097, 'batch_loss/train': 0.7090284409932792}
12/26/2021 09:47:04 - INFO - codeparrot_training - Step 26098: {'lr': 0.0002340821278521407, 'samples': 13362688, 'steps': 26098, 'batch_loss/train': 0.6597109218128026}
12/26/2021 09:47:18 - INFO - codeparrot_training - Step 26099: {'lr': 0.00023406639673433888, 'samples': 13363200, 'steps': 26099, 'batch_loss/train': 1.1865664194338024}
12/26/2021 09:47:29 - INFO - codeparrot_training - Step 26100: {'lr': 0.0002340506656798831, 'samples': 13363712, 'steps': 26100, 'batch_loss/train': 0.7233932334929705}
12/26/2021 09:47:39 - INFO - codeparrot_training - Step 26101: {'lr': 0.00023403493468883587, 'samples': 13364224, 'steps': 26101, 'batch_loss/train': 0.6929620201699436}
12/26/2021 09:47:51 - INFO - codeparrot_training - Step 26102: {'lr': 0.00023401920376125957, 'samples': 13364736, 'steps': 26102, 'batch_loss/train': 0.7442413624376059}
12/26/2021 09:48:02 - INFO - codeparrot_training - Step 26103: {'lr': 0.00023400347289721695, 'samples': 13365248, 'steps': 26103, 'batch_loss/train': 0.6916662137955427}
12/26/2021 09:48:13 - INFO - codeparrot_training - Step 26104: {'lr': 0.00023398774209677054, 'samples': 13365760, 'steps': 26104, 'batch_loss/train': 0.8064131606370211}
12/26/2021 09:48:23 - INFO - codeparrot_training - Step 26105: {'lr': 0.00023397201135998269, 'samples': 13366272, 'steps': 26105, 'batch_loss/train': 0.7870007515884936}
12/26/2021 09:48:35 - INFO - codeparrot_training - Step 26106: {'lr': 0.00023395628068691615, 'samples': 13366784, 'steps': 26106, 'batch_loss/train': 0.7503221407532692}
12/26/2021 09:48:46 - INFO - codeparrot_training - Step 26107: {'lr': 0.00023394055007763345, 'samples': 13367296, 'steps': 26107, 'batch_loss/train': 0.7024766054237261}
12/26/2021 09:48:57 - INFO - codeparrot_training - Step 26108: {'lr': 0.00023392481953219703, 'samples': 13367808, 'steps': 26108, 'batch_loss/train': 0.9176308214082383}
12/26/2021 09:49:11 - INFO - codeparrot_training - Step 26109: {'lr': 0.00023390908905066937, 'samples': 13368320, 'steps': 26109, 'batch_loss/train': 0.7125283456407487}
12/26/2021 09:49:21 - INFO - codeparrot_training - Step 26110: {'lr': 0.00023389335863311313, 'samples': 13368832, 'steps': 26110, 'batch_loss/train': 0.8522607963532209}
12/26/2021 09:49:32 - INFO - codeparrot_training - Step 26111: {'lr': 0.00023387762827959088, 'samples': 13369344, 'steps': 26111, 'batch_loss/train': 0.8357624094933271}
12/26/2021 09:49:44 - INFO - codeparrot_training - Step 26112: {'lr': 0.00023386189799016506, 'samples': 13369856, 'steps': 26112, 'batch_loss/train': 0.6610377198085189}
12/26/2021 09:49:55 - INFO - codeparrot_training - Step 26113: {'lr': 0.00023384616776489815, 'samples': 13370368, 'steps': 26113, 'batch_loss/train': 0.8656371487304568}
12/26/2021 09:50:05 - INFO - codeparrot_training - Step 26114: {'lr': 0.0002338304376038529, 'samples': 13370880, 'steps': 26114, 'batch_loss/train': 0.8674363326281309}
12/26/2021 09:50:16 - INFO - codeparrot_training - Step 26115: {'lr': 0.00023381470750709167, 'samples': 13371392, 'steps': 26115, 'batch_loss/train': 0.8453086540102959}
12/26/2021 09:50:30 - INFO - codeparrot_training - Step 26116: {'lr': 0.0002337989774746771, 'samples': 13371904, 'steps': 26116, 'batch_loss/train': 0.7349693961441517}
12/26/2021 09:50:40 - INFO - codeparrot_training - Step 26117: {'lr': 0.00023378324750667156, 'samples': 13372416, 'steps': 26117, 'batch_loss/train': 0.8253106623888016}
12/26/2021 09:50:51 - INFO - codeparrot_training - Step 26118: {'lr': 0.00023376751760313785, 'samples': 13372928, 'steps': 26118, 'batch_loss/train': 0.8211213289760053}
12/26/2021 09:51:03 - INFO - codeparrot_training - Step 26119: {'lr': 0.00023375178776413827, 'samples': 13373440, 'steps': 26119, 'batch_loss/train': 0.7695653578266501}
12/26/2021 09:51:14 - INFO - codeparrot_training - Step 26120: {'lr': 0.0002337360579897354, 'samples': 13373952, 'steps': 26120, 'batch_loss/train': 0.7769173076376319}
12/26/2021 09:51:24 - INFO - codeparrot_training - Step 26121: {'lr': 0.00023372032827999195, 'samples': 13374464, 'steps': 26121, 'batch_loss/train': 0.8235906669870019}
12/26/2021 09:51:36 - INFO - codeparrot_training - Step 26122: {'lr': 0.00023370459863497026, 'samples': 13374976, 'steps': 26122, 'batch_loss/train': 0.7366911545395851}
12/26/2021 09:51:47 - INFO - codeparrot_training - Step 26123: {'lr': 0.000233688869054733, 'samples': 13375488, 'steps': 26123, 'batch_loss/train': 0.721054854337126}
12/26/2021 09:51:57 - INFO - codeparrot_training - Step 26124: {'lr': 0.00023367313953934253, 'samples': 13376000, 'steps': 26124, 'batch_loss/train': 0.7429492813535035}
12/26/2021 09:52:08 - INFO - codeparrot_training - Step 26125: {'lr': 0.0002336574100888615, 'samples': 13376512, 'steps': 26125, 'batch_loss/train': 0.8412161711603403}
12/26/2021 09:52:22 - INFO - codeparrot_training - Step 26126: {'lr': 0.00023364168070335255, 'samples': 13377024, 'steps': 26126, 'batch_loss/train': 0.7101463684812188}
12/26/2021 09:52:33 - INFO - codeparrot_training - Step 26127: {'lr': 0.00023362595138287808, 'samples': 13377536, 'steps': 26127, 'batch_loss/train': 0.7577155167236924}
12/26/2021 09:52:43 - INFO - codeparrot_training - Step 26128: {'lr': 0.0002336102221275005, 'samples': 13378048, 'steps': 26128, 'batch_loss/train': 0.8114373916760087}
12/26/2021 09:52:56 - INFO - codeparrot_training - Step 26129: {'lr': 0.0002335944929372826, 'samples': 13378560, 'steps': 26129, 'batch_loss/train': 0.75711471401155}
12/26/2021 09:53:07 - INFO - codeparrot_training - Step 26130: {'lr': 0.00023357876381228685, 'samples': 13379072, 'steps': 26130, 'batch_loss/train': 1.3573623867705464}
12/26/2021 09:53:18 - INFO - codeparrot_training - Step 26131: {'lr': 0.0002335630347525756, 'samples': 13379584, 'steps': 26131, 'batch_loss/train': 0.7126883128657937}
12/26/2021 09:53:30 - INFO - codeparrot_training - Step 26132: {'lr': 0.00023354730575821157, 'samples': 13380096, 'steps': 26132, 'batch_loss/train': 0.7333954321220517}
12/26/2021 09:53:40 - INFO - codeparrot_training - Step 26133: {'lr': 0.0002335315768292573, 'samples': 13380608, 'steps': 26133, 'batch_loss/train': 0.8198731241282076}
12/26/2021 09:53:51 - INFO - codeparrot_training - Step 26134: {'lr': 0.0002335158479657752, 'samples': 13381120, 'steps': 26134, 'batch_loss/train': 0.7400715332478285}
12/26/2021 09:54:01 - INFO - codeparrot_training - Step 26135: {'lr': 0.00023350011916782785, 'samples': 13381632, 'steps': 26135, 'batch_loss/train': 0.7217797404155135}
12/26/2021 09:54:16 - INFO - codeparrot_training - Step 26136: {'lr': 0.00023348439043547787, 'samples': 13382144, 'steps': 26136, 'batch_loss/train': 0.7197559922933578}
12/26/2021 09:54:27 - INFO - codeparrot_training - Step 26137: {'lr': 0.00023346866176878774, 'samples': 13382656, 'steps': 26137, 'batch_loss/train': 0.754384396597743}
12/26/2021 09:54:37 - INFO - codeparrot_training - Step 26138: {'lr': 0.00023345293316781989, 'samples': 13383168, 'steps': 26138, 'batch_loss/train': 0.6137346024625003}
12/26/2021 09:54:49 - INFO - codeparrot_training - Step 26139: {'lr': 0.00023343720463263687, 'samples': 13383680, 'steps': 26139, 'batch_loss/train': 0.7604665872640908}
12/26/2021 09:55:00 - INFO - codeparrot_training - Step 26140: {'lr': 0.0002334214761633014, 'samples': 13384192, 'steps': 26140, 'batch_loss/train': 0.7813973668962717}
12/26/2021 09:55:11 - INFO - codeparrot_training - Step 26141: {'lr': 0.00023340574775987583, 'samples': 13384704, 'steps': 26141, 'batch_loss/train': 0.8254727032035589}
12/26/2021 09:55:23 - INFO - codeparrot_training - Step 26142: {'lr': 0.00023339001942242267, 'samples': 13385216, 'steps': 26142, 'batch_loss/train': 0.761193691752851}
12/26/2021 09:55:33 - INFO - codeparrot_training - Step 26143: {'lr': 0.00023337429115100465, 'samples': 13385728, 'steps': 26143, 'batch_loss/train': 0.8571447683498263}
12/26/2021 09:55:44 - INFO - codeparrot_training - Step 26144: {'lr': 0.00023335856294568407, 'samples': 13386240, 'steps': 26144, 'batch_loss/train': 0.7415509894490242}
12/26/2021 09:55:55 - INFO - codeparrot_training - Step 26145: {'lr': 0.00023334283480652365, 'samples': 13386752, 'steps': 26145, 'batch_loss/train': 0.8814348978921771}
12/26/2021 09:56:09 - INFO - codeparrot_training - Step 26146: {'lr': 0.00023332710673358573, 'samples': 13387264, 'steps': 26146, 'batch_loss/train': 1.092443136498332}
12/26/2021 09:56:19 - INFO - codeparrot_training - Step 26147: {'lr': 0.00023331137872693302, 'samples': 13387776, 'steps': 26147, 'batch_loss/train': 0.7659530355595052}
12/26/2021 09:56:30 - INFO - codeparrot_training - Step 26148: {'lr': 0.00023329565078662796, 'samples': 13388288, 'steps': 26148, 'batch_loss/train': 0.7718757493421435}
12/26/2021 09:56:42 - INFO - codeparrot_training - Step 26149: {'lr': 0.00023327992291273308, 'samples': 13388800, 'steps': 26149, 'batch_loss/train': 0.752516588428989}
12/26/2021 09:56:52 - INFO - codeparrot_training - Step 26150: {'lr': 0.00023326419510531082, 'samples': 13389312, 'steps': 26150, 'batch_loss/train': 0.8019016687758267}
12/26/2021 09:57:03 - INFO - codeparrot_training - Step 26151: {'lr': 0.0002332484673644239, 'samples': 13389824, 'steps': 26151, 'batch_loss/train': 0.8583212997764349}
12/26/2021 09:57:15 - INFO - codeparrot_training - Step 26152: {'lr': 0.00023323273969013476, 'samples': 13390336, 'steps': 26152, 'batch_loss/train': 0.8662036973983049}
12/26/2021 09:57:26 - INFO - codeparrot_training - Step 26153: {'lr': 0.00023321701208250592, 'samples': 13390848, 'steps': 26153, 'batch_loss/train': 0.8236299380660057}
12/26/2021 09:57:37 - INFO - codeparrot_training - Step 26154: {'lr': 0.0002332012845415998, 'samples': 13391360, 'steps': 26154, 'batch_loss/train': 0.6762142558582127}
12/26/2021 09:57:47 - INFO - codeparrot_training - Step 26155: {'lr': 0.0002331855570674791, 'samples': 13391872, 'steps': 26155, 'batch_loss/train': 0.6285018532071263}
12/26/2021 09:58:01 - INFO - codeparrot_training - Step 26156: {'lr': 0.00023316982966020632, 'samples': 13392384, 'steps': 26156, 'batch_loss/train': 0.7696425663307309}
12/26/2021 09:58:12 - INFO - codeparrot_training - Step 26157: {'lr': 0.0002331541023198438, 'samples': 13392896, 'steps': 26157, 'batch_loss/train': 0.8106358023360372}
12/26/2021 09:58:23 - INFO - codeparrot_training - Step 26158: {'lr': 0.0002331383750464543, 'samples': 13393408, 'steps': 26158, 'batch_loss/train': 0.7853950676508248}
12/26/2021 09:58:35 - INFO - codeparrot_training - Step 26159: {'lr': 0.0002331226478401003, 'samples': 13393920, 'steps': 26159, 'batch_loss/train': 0.5293515503872186}
12/26/2021 09:58:45 - INFO - codeparrot_training - Step 26160: {'lr': 0.00023310692070084422, 'samples': 13394432, 'steps': 26160, 'batch_loss/train': 0.7831840617582202}
12/26/2021 09:58:56 - INFO - codeparrot_training - Step 26161: {'lr': 0.00023309119362874858, 'samples': 13394944, 'steps': 26161, 'batch_loss/train': 0.7708150092512369}
12/26/2021 09:59:08 - INFO - codeparrot_training - Step 26162: {'lr': 0.0002330754666238761, 'samples': 13395456, 'steps': 26162, 'batch_loss/train': 0.7340543884783983}
12/26/2021 09:59:19 - INFO - codeparrot_training - Step 26163: {'lr': 0.0002330597396862892, 'samples': 13395968, 'steps': 26163, 'batch_loss/train': 0.7614497162867337}
12/26/2021 09:59:29 - INFO - codeparrot_training - Step 26164: {'lr': 0.0002330440128160503, 'samples': 13396480, 'steps': 26164, 'batch_loss/train': 0.7221005982137285}
12/26/2021 09:59:43 - INFO - codeparrot_training - Step 26165: {'lr': 0.0002330282860132219, 'samples': 13396992, 'steps': 26165, 'batch_loss/train': 0.7697407845407724}
12/26/2021 09:59:54 - INFO - codeparrot_training - Step 26166: {'lr': 0.0002330125592778668, 'samples': 13397504, 'steps': 26166, 'batch_loss/train': 0.7741874142084271}
12/26/2021 10:00:05 - INFO - codeparrot_training - Step 26167: {'lr': 0.00023299683261004725, 'samples': 13398016, 'steps': 26167, 'batch_loss/train': 0.8488091416656971}
12/26/2021 10:00:15 - INFO - codeparrot_training - Step 26168: {'lr': 0.00023298110600982586, 'samples': 13398528, 'steps': 26168, 'batch_loss/train': 0.771504445001483}
12/26/2021 10:00:27 - INFO - codeparrot_training - Step 26169: {'lr': 0.00023296537947726525, 'samples': 13399040, 'steps': 26169, 'batch_loss/train': 0.6329452376812696}
12/26/2021 10:00:38 - INFO - codeparrot_training - Step 26170: {'lr': 0.00023294965301242783, 'samples': 13399552, 'steps': 26170, 'batch_loss/train': 0.7216535108163953}
12/26/2021 10:00:49 - INFO - codeparrot_training - Step 26171: {'lr': 0.00023293392661537615, 'samples': 13400064, 'steps': 26171, 'batch_loss/train': 0.7001497126184404}
12/26/2021 10:01:02 - INFO - codeparrot_training - Step 26172: {'lr': 0.00023291820028617267, 'samples': 13400576, 'steps': 26172, 'batch_loss/train': 0.6967901097377762}
12/26/2021 10:01:13 - INFO - codeparrot_training - Step 26173: {'lr': 0.00023290247402488009, 'samples': 13401088, 'steps': 26173, 'batch_loss/train': 0.7278851922601461}
12/26/2021 10:01:24 - INFO - codeparrot_training - Step 26174: {'lr': 0.00023288674783156075, 'samples': 13401600, 'steps': 26174, 'batch_loss/train': 0.7752253534272313}
12/26/2021 10:01:36 - INFO - codeparrot_training - Step 26175: {'lr': 0.00023287102170627727, 'samples': 13402112, 'steps': 26175, 'batch_loss/train': 0.7098590465029702}
12/26/2021 10:01:46 - INFO - codeparrot_training - Step 26176: {'lr': 0.0002328552956490921, 'samples': 13402624, 'steps': 26176, 'batch_loss/train': 0.7505971062928438}
12/26/2021 10:01:57 - INFO - codeparrot_training - Step 26177: {'lr': 0.00023283956966006777, 'samples': 13403136, 'steps': 26177, 'batch_loss/train': 0.730323847848922}
12/26/2021 10:02:07 - INFO - codeparrot_training - Step 26178: {'lr': 0.00023282384373926693, 'samples': 13403648, 'steps': 26178, 'batch_loss/train': 0.8090007407590747}
12/26/2021 10:02:19 - INFO - codeparrot_training - Step 26179: {'lr': 0.00023280811788675189, 'samples': 13404160, 'steps': 26179, 'batch_loss/train': 0.7712122884113342}
12/26/2021 10:02:30 - INFO - codeparrot_training - Step 26180: {'lr': 0.00023279239210258532, 'samples': 13404672, 'steps': 26180, 'batch_loss/train': 0.7078364416956902}
12/26/2021 10:02:41 - INFO - codeparrot_training - Step 26181: {'lr': 0.0002327766663868298, 'samples': 13405184, 'steps': 26181, 'batch_loss/train': 0.8172575352946296}
12/26/2021 10:02:53 - INFO - codeparrot_training - Step 26182: {'lr': 0.00023276094073954772, 'samples': 13405696, 'steps': 26182, 'batch_loss/train': 0.7967284377664328}
12/26/2021 10:03:04 - INFO - codeparrot_training - Step 26183: {'lr': 0.0002327452151608015, 'samples': 13406208, 'steps': 26183, 'batch_loss/train': 0.7490208093076944}
12/26/2021 10:03:14 - INFO - codeparrot_training - Step 26184: {'lr': 0.00023272948965065389, 'samples': 13406720, 'steps': 26184, 'batch_loss/train': 0.7314787735231221}
12/26/2021 10:03:28 - INFO - codeparrot_training - Step 26185: {'lr': 0.0002327137642091673, 'samples': 13407232, 'steps': 26185, 'batch_loss/train': 0.7268679328262806}
12/26/2021 10:03:39 - INFO - codeparrot_training - Step 26186: {'lr': 0.00023269803883640424, 'samples': 13407744, 'steps': 26186, 'batch_loss/train': 0.7260758108459413}
12/26/2021 10:03:49 - INFO - codeparrot_training - Step 26187: {'lr': 0.0002326823135324272, 'samples': 13408256, 'steps': 26187, 'batch_loss/train': 0.669847896322608}
12/26/2021 10:04:00 - INFO - codeparrot_training - Step 26188: {'lr': 0.00023266658829729887, 'samples': 13408768, 'steps': 26188, 'batch_loss/train': 0.7562977713532746}
12/26/2021 10:04:12 - INFO - codeparrot_training - Step 26189: {'lr': 0.00023265086313108154, 'samples': 13409280, 'steps': 26189, 'batch_loss/train': 0.695277632214129}
12/26/2021 10:04:23 - INFO - codeparrot_training - Step 26190: {'lr': 0.0002326351380338379, 'samples': 13409792, 'steps': 26190, 'batch_loss/train': 0.8684347374364734}
12/26/2021 10:04:33 - INFO - codeparrot_training - Step 26191: {'lr': 0.0002326194130056303, 'samples': 13410304, 'steps': 26191, 'batch_loss/train': 0.8933281591162086}
12/26/2021 10:04:45 - INFO - codeparrot_training - Step 26192: {'lr': 0.00023260368804652143, 'samples': 13410816, 'steps': 26192, 'batch_loss/train': 0.7761816796846688}
12/26/2021 10:04:56 - INFO - codeparrot_training - Step 26193: {'lr': 0.0002325879631565737, 'samples': 13411328, 'steps': 26193, 'batch_loss/train': 0.7965296604670584}
12/26/2021 10:05:07 - INFO - codeparrot_training - Step 26194: {'lr': 0.0002325722383358496, 'samples': 13411840, 'steps': 26194, 'batch_loss/train': 0.8024264238774776}
12/26/2021 10:05:20 - INFO - codeparrot_training - Step 26195: {'lr': 0.0002325565135844118, 'samples': 13412352, 'steps': 26195, 'batch_loss/train': 0.798708651214838}
12/26/2021 10:05:31 - INFO - codeparrot_training - Step 26196: {'lr': 0.00023254078890232264, 'samples': 13412864, 'steps': 26196, 'batch_loss/train': 0.651177337160334}
12/26/2021 10:05:42 - INFO - codeparrot_training - Step 26197: {'lr': 0.0002325250642896448, 'samples': 13413376, 'steps': 26197, 'batch_loss/train': 0.7126981876790524}
12/26/2021 10:05:52 - INFO - codeparrot_training - Step 26198: {'lr': 0.00023250933974644055, 'samples': 13413888, 'steps': 26198, 'batch_loss/train': 0.7137439642101526}
12/26/2021 10:06:05 - INFO - codeparrot_training - Step 26199: {'lr': 0.0002324936152727727, 'samples': 13414400, 'steps': 26199, 'batch_loss/train': 0.7095487420447171}
12/26/2021 10:06:16 - INFO - codeparrot_training - Step 26200: {'lr': 0.00023247789086870357, 'samples': 13414912, 'steps': 26200, 'batch_loss/train': 0.6828748772386461}
12/26/2021 10:06:26 - INFO - codeparrot_training - Step 26201: {'lr': 0.0002324621665342958, 'samples': 13415424, 'steps': 26201, 'batch_loss/train': 0.5886604227125645}
12/26/2021 10:06:40 - INFO - codeparrot_training - Step 26202: {'lr': 0.00023244644226961172, 'samples': 13415936, 'steps': 26202, 'batch_loss/train': 0.6795970303937793}
12/26/2021 10:06:51 - INFO - codeparrot_training - Step 26203: {'lr': 0.000232430718074714, 'samples': 13416448, 'steps': 26203, 'batch_loss/train': 0.6193226324976422}
12/26/2021 10:07:02 - INFO - codeparrot_training - Step 26204: {'lr': 0.00023241499394966518, 'samples': 13416960, 'steps': 26204, 'batch_loss/train': 0.8341347416862845}
12/26/2021 10:07:14 - INFO - codeparrot_training - Step 26205: {'lr': 0.0002323992698945276, 'samples': 13417472, 'steps': 26205, 'batch_loss/train': 0.7065936243161559}
12/26/2021 10:07:24 - INFO - codeparrot_training - Step 26206: {'lr': 0.0002323835459093639, 'samples': 13417984, 'steps': 26206, 'batch_loss/train': 0.7896255105733871}
12/26/2021 10:07:35 - INFO - codeparrot_training - Step 26207: {'lr': 0.00023236782199423665, 'samples': 13418496, 'steps': 26207, 'batch_loss/train': 0.7958039687946439}
12/26/2021 10:07:47 - INFO - codeparrot_training - Step 26208: {'lr': 0.00023235209814920826, 'samples': 13419008, 'steps': 26208, 'batch_loss/train': 0.6834685415960848}
12/26/2021 10:07:58 - INFO - codeparrot_training - Step 26209: {'lr': 0.00023233637437434116, 'samples': 13419520, 'steps': 26209, 'batch_loss/train': 0.7255755057558417}
12/26/2021 10:08:09 - INFO - codeparrot_training - Step 26210: {'lr': 0.00023232065066969805, 'samples': 13420032, 'steps': 26210, 'batch_loss/train': 0.8582848599180579}
12/26/2021 10:08:19 - INFO - codeparrot_training - Step 26211: {'lr': 0.00023230492703534136, 'samples': 13420544, 'steps': 26211, 'batch_loss/train': 0.7397546181455255}
12/26/2021 10:08:33 - INFO - codeparrot_training - Step 26212: {'lr': 0.00023228920347133358, 'samples': 13421056, 'steps': 26212, 'batch_loss/train': 0.7721071289852262}
12/26/2021 10:08:44 - INFO - codeparrot_training - Step 26213: {'lr': 0.00023227347997773718, 'samples': 13421568, 'steps': 26213, 'batch_loss/train': 0.8341947020962834}
12/26/2021 10:08:54 - INFO - codeparrot_training - Step 26214: {'lr': 0.00023225775655461484, 'samples': 13422080, 'steps': 26214, 'batch_loss/train': 0.7484061205759645}
12/26/2021 10:09:07 - INFO - codeparrot_training - Step 26215: {'lr': 0.00023224203320202893, 'samples': 13422592, 'steps': 26215, 'batch_loss/train': 0.7408265848644078}
12/26/2021 10:09:17 - INFO - codeparrot_training - Step 26216: {'lr': 0.0002322263099200419, 'samples': 13423104, 'steps': 26216, 'batch_loss/train': 0.8158200024627149}
12/26/2021 10:09:28 - INFO - codeparrot_training - Step 26217: {'lr': 0.00023221058670871644, 'samples': 13423616, 'steps': 26217, 'batch_loss/train': 0.7274777055718005}
12/26/2021 10:09:40 - INFO - codeparrot_training - Step 26218: {'lr': 0.00023219486356811504, 'samples': 13424128, 'steps': 26218, 'batch_loss/train': 0.6769930575974286}
12/26/2021 10:09:51 - INFO - codeparrot_training - Step 26219: {'lr': 0.00023217914049830003, 'samples': 13424640, 'steps': 26219, 'batch_loss/train': 0.7066082265228033}
12/26/2021 10:10:01 - INFO - codeparrot_training - Step 26220: {'lr': 0.00023216341749933398, 'samples': 13425152, 'steps': 26220, 'batch_loss/train': 0.7364610154181719}
12/26/2021 10:10:12 - INFO - codeparrot_training - Step 26221: {'lr': 0.00023214769457127957, 'samples': 13425664, 'steps': 26221, 'batch_loss/train': 0.7984052752144635}
12/26/2021 10:10:24 - INFO - codeparrot_training - Step 26222: {'lr': 0.00023213197171419915, 'samples': 13426176, 'steps': 26222, 'batch_loss/train': 0.8635175498202443}
12/26/2021 10:10:35 - INFO - codeparrot_training - Step 26223: {'lr': 0.00023211624892815528, 'samples': 13426688, 'steps': 26223, 'batch_loss/train': 0.8087545540183783}
12/26/2021 10:10:45 - INFO - codeparrot_training - Step 26224: {'lr': 0.00023210052621321035, 'samples': 13427200, 'steps': 26224, 'batch_loss/train': 1.0036026174202561}
12/26/2021 10:10:59 - INFO - codeparrot_training - Step 26225: {'lr': 0.00023208480356942703, 'samples': 13427712, 'steps': 26225, 'batch_loss/train': 0.7077645503450185}
12/26/2021 10:11:10 - INFO - codeparrot_training - Step 26226: {'lr': 0.00023206908099686783, 'samples': 13428224, 'steps': 26226, 'batch_loss/train': 0.8674269486218691}
12/26/2021 10:11:21 - INFO - codeparrot_training - Step 26227: {'lr': 0.00023205335849559518, 'samples': 13428736, 'steps': 26227, 'batch_loss/train': 0.6503856899216771}
12/26/2021 10:11:33 - INFO - codeparrot_training - Step 26228: {'lr': 0.0002320376360656715, 'samples': 13429248, 'steps': 26228, 'batch_loss/train': 0.7031209585256875}
12/26/2021 10:11:43 - INFO - codeparrot_training - Step 26229: {'lr': 0.00023202191370715945, 'samples': 13429760, 'steps': 26229, 'batch_loss/train': 0.860745926387608}
12/26/2021 10:11:54 - INFO - codeparrot_training - Step 26230: {'lr': 0.00023200619142012152, 'samples': 13430272, 'steps': 26230, 'batch_loss/train': 0.7567672715522349}
12/26/2021 10:12:04 - INFO - codeparrot_training - Step 26231: {'lr': 0.0002319904692046201, 'samples': 13430784, 'steps': 26231, 'batch_loss/train': 0.901835210621357}
12/26/2021 10:12:18 - INFO - codeparrot_training - Step 26232: {'lr': 0.0002319747470607178, 'samples': 13431296, 'steps': 26232, 'batch_loss/train': 0.7115955180488527}
12/26/2021 10:12:29 - INFO - codeparrot_training - Step 26233: {'lr': 0.00023195902498847717, 'samples': 13431808, 'steps': 26233, 'batch_loss/train': 0.7389371367171407}
12/26/2021 10:12:39 - INFO - codeparrot_training - Step 26234: {'lr': 0.0002319433029879606, 'samples': 13432320, 'steps': 26234, 'batch_loss/train': 0.7826099436497316}
12/26/2021 10:12:51 - INFO - codeparrot_training - Step 26235: {'lr': 0.0002319275810592306, 'samples': 13432832, 'steps': 26235, 'batch_loss/train': 0.771770715713501}
12/26/2021 10:13:02 - INFO - codeparrot_training - Step 26236: {'lr': 0.00023191185920234977, 'samples': 13433344, 'steps': 26236, 'batch_loss/train': 0.7349578198045492}
12/26/2021 10:13:13 - INFO - codeparrot_training - Step 26237: {'lr': 0.00023189613741738058, 'samples': 13433856, 'steps': 26237, 'batch_loss/train': 0.7713393960148096}
12/26/2021 10:13:25 - INFO - codeparrot_training - Step 26238: {'lr': 
0.00023188041570438547, 'samples': 13434368, 'steps': 26238, 'batch_loss/train': 0.7941971952095628} 12/26/2021 10:13:35 - INFO - codeparrot_training - Step 26239: {'lr': 0.0002318646940634269, 'samples': 13434880, 'steps': 26239, 'batch_loss/train': 0.7853837659931742} 12/26/2021 10:13:46 - INFO - codeparrot_training - Step 26240: {'lr': 0.00023184897249456762, 'samples': 13435392, 'steps': 26240, 'batch_loss/train': 0.7345951870083809} 12/26/2021 10:13:59 - INFO - codeparrot_training - Step 26241: {'lr': 0.0002318332509978699, 'samples': 13435904, 'steps': 26241, 'batch_loss/train': 0.7863642331212759} 12/26/2021 10:14:10 - INFO - codeparrot_training - Step 26242: {'lr': 0.00023181752957339625, 'samples': 13436416, 'steps': 26242, 'batch_loss/train': 0.7956974180415273} 12/26/2021 10:14:21 - INFO - codeparrot_training - Step 26243: {'lr': 0.00023180180822120934, 'samples': 13436928, 'steps': 26243, 'batch_loss/train': 0.8023375663906336} 12/26/2021 10:14:31 - INFO - codeparrot_training - Step 26244: {'lr': 0.00023178608694137158, 'samples': 13437440, 'steps': 26244, 'batch_loss/train': 0.7816259292885661} 12/26/2021 10:14:43 - INFO - codeparrot_training - Step 26245: {'lr': 0.0002317703657339454, 'samples': 13437952, 'steps': 26245, 'batch_loss/train': 0.8157386677339673} 12/26/2021 10:14:54 - INFO - codeparrot_training - Step 26246: {'lr': 0.00023175464459899327, 'samples': 13438464, 'steps': 26246, 'batch_loss/train': 0.7845200505107641} 12/26/2021 10:15:05 - INFO - codeparrot_training - Step 26247: {'lr': 0.00023173892353657794, 'samples': 13438976, 'steps': 26247, 'batch_loss/train': 0.6888174738269299} 12/26/2021 10:15:17 - INFO - codeparrot_training - Step 26248: {'lr': 0.0002317232025467617, 'samples': 13439488, 'steps': 26248, 'batch_loss/train': 0.6556850392371416} 12/26/2021 10:15:28 - INFO - codeparrot_training - Step 26249: {'lr': 0.00023170748162960715, 'samples': 13440000, 'steps': 26249, 'batch_loss/train': 0.8230776488780975} 12/26/2021 10:15:38 - 
INFO - codeparrot_training - Step 26250: {'lr': 0.00023169176078517665, 'samples': 13440512, 'steps': 26250, 'batch_loss/train': 0.7073634809348732} 12/26/2021 10:15:52 - INFO - codeparrot_training - Step 26251: {'lr': 0.00023167604001353282, 'samples': 13441024, 'steps': 26251, 'batch_loss/train': 0.4805913553573191} 12/26/2021 10:16:03 - INFO - codeparrot_training - Step 26252: {'lr': 0.0002316603193147382, 'samples': 13441536, 'steps': 26252, 'batch_loss/train': 0.7876241244375706} 12/26/2021 10:16:14 - INFO - codeparrot_training - Step 26253: {'lr': 0.00023164459868885514, 'samples': 13442048, 'steps': 26253, 'batch_loss/train': 0.7730848230421543} 12/26/2021 10:16:24 - INFO - codeparrot_training - Step 26254: {'lr': 0.00023162887813594634, 'samples': 13442560, 'steps': 26254, 'batch_loss/train': 0.6762714367359877} 12/26/2021 10:16:36 - INFO - codeparrot_training - Step 26255: {'lr': 0.0002316131576560741, 'samples': 13443072, 'steps': 26255, 'batch_loss/train': 0.70591601729393} 12/26/2021 10:16:47 - INFO - codeparrot_training - Step 26256: {'lr': 0.00023159743724930108, 'samples': 13443584, 'steps': 26256, 'batch_loss/train': 0.7158820340409875} 12/26/2021 10:16:58 - INFO - codeparrot_training - Step 26257: {'lr': 0.00023158171691568954, 'samples': 13444096, 'steps': 26257, 'batch_loss/train': 0.8887211736291647} 12/26/2021 10:17:10 - INFO - codeparrot_training - Step 26258: {'lr': 0.00023156599665530227, 'samples': 13444608, 'steps': 26258, 'batch_loss/train': 0.8374080625362694} 12/26/2021 10:17:20 - INFO - codeparrot_training - Step 26259: {'lr': 0.00023155027646820165, 'samples': 13445120, 'steps': 26259, 'batch_loss/train': 0.8579171122983098} 12/26/2021 10:17:31 - INFO - codeparrot_training - Step 26260: {'lr': 0.0002315345563544501, 'samples': 13445632, 'steps': 26260, 'batch_loss/train': 0.7153625497594476} 12/26/2021 10:17:43 - INFO - codeparrot_training - Step 26261: {'lr': 0.00023151883631411014, 'samples': 13446144, 'steps': 26261, 
'batch_loss/train': 0.7794278869405389} 12/26/2021 10:17:54 - INFO - codeparrot_training - Step 26262: {'lr': 0.0002315031163472444, 'samples': 13446656, 'steps': 26262, 'batch_loss/train': 0.49711071548517793} 12/26/2021 10:18:04 - INFO - codeparrot_training - Step 26263: {'lr': 0.0002314873964539153, 'samples': 13447168, 'steps': 26263, 'batch_loss/train': 0.8196134075988084} 12/26/2021 10:18:15 - INFO - codeparrot_training - Step 26264: {'lr': 0.00023147167663418528, 'samples': 13447680, 'steps': 26264, 'batch_loss/train': 0.814304718747735} 12/26/2021 10:18:29 - INFO - codeparrot_training - Step 26265: {'lr': 0.00023145595688811682, 'samples': 13448192, 'steps': 26265, 'batch_loss/train': 0.6988451024517417} 12/26/2021 10:18:40 - INFO - codeparrot_training - Step 26266: {'lr': 0.0002314402372157726, 'samples': 13448704, 'steps': 26266, 'batch_loss/train': 0.718789691105485} 12/26/2021 10:18:50 - INFO - codeparrot_training - Step 26267: {'lr': 0.0002314245176172149, 'samples': 13449216, 'steps': 26267, 'batch_loss/train': 0.8405181029811502} 12/26/2021 10:19:02 - INFO - codeparrot_training - Step 26268: {'lr': 0.00023140879809250626, 'samples': 13449728, 'steps': 26268, 'batch_loss/train': 0.7701915372163057} 12/26/2021 10:19:13 - INFO - codeparrot_training - Step 26269: {'lr': 0.00023139307864170934, 'samples': 13450240, 'steps': 26269, 'batch_loss/train': 0.634485955350101} 12/26/2021 10:19:24 - INFO - codeparrot_training - Step 26270: {'lr': 0.00023137735926488643, 'samples': 13450752, 'steps': 26270, 'batch_loss/train': 0.761842031031847} 12/26/2021 10:19:37 - INFO - codeparrot_training - Step 26271: {'lr': 0.00023136163996210018, 'samples': 13451264, 'steps': 26271, 'batch_loss/train': 0.8154570814222097} 12/26/2021 10:19:48 - INFO - codeparrot_training - Step 26272: {'lr': 0.0002313459207334129, 'samples': 13451776, 'steps': 26272, 'batch_loss/train': 0.7310328297317028} 12/26/2021 10:19:59 - INFO - codeparrot_training - Step 26273: {'lr': 
0.0002313302015788873, 'samples': 13452288, 'steps': 26273, 'batch_loss/train': 0.930685460800305} 12/26/2021 10:20:09 - INFO - codeparrot_training - Step 26274: {'lr': 0.00023131448249858572, 'samples': 13452800, 'steps': 26274, 'batch_loss/train': 0.7044536017347127} 12/26/2021 10:20:22 - INFO - codeparrot_training - Step 26275: {'lr': 0.00023129876349257077, 'samples': 13453312, 'steps': 26275, 'batch_loss/train': 0.7879202831536531} 12/26/2021 10:20:32 - INFO - codeparrot_training - Step 26276: {'lr': 0.00023128304456090476, 'samples': 13453824, 'steps': 26276, 'batch_loss/train': 0.7817599903792143} 12/26/2021 10:20:43 - INFO - codeparrot_training - Step 26277: {'lr': 0.00023126732570365034, 'samples': 13454336, 'steps': 26277, 'batch_loss/train': 0.7262054271996021} 12/26/2021 10:20:55 - INFO - codeparrot_training - Step 26278: {'lr': 0.00023125160692087003, 'samples': 13454848, 'steps': 26278, 'batch_loss/train': 0.6550607895478606} 12/26/2021 10:21:06 - INFO - codeparrot_training - Step 26279: {'lr': 0.0002312358882126261, 'samples': 13455360, 'steps': 26279, 'batch_loss/train': 0.787059148773551} 12/26/2021 10:21:17 - INFO - codeparrot_training - Step 26280: {'lr': 0.0002312201695789813, 'samples': 13455872, 'steps': 26280, 'batch_loss/train': 0.7623834004625678} 12/26/2021 10:21:30 - INFO - codeparrot_training - Step 26281: {'lr': 0.00023120445101999803, 'samples': 13456384, 'steps': 26281, 'batch_loss/train': 0.8411388462409377} 12/26/2021 10:21:41 - INFO - codeparrot_training - Step 26282: {'lr': 0.0002311887325357388, 'samples': 13456896, 'steps': 26282, 'batch_loss/train': 0.8290560180321336} 12/26/2021 10:21:52 - INFO - codeparrot_training - Step 26283: {'lr': 0.00023117301412626596, 'samples': 13457408, 'steps': 26283, 'batch_loss/train': 1.7429401744157076} 12/26/2021 10:22:02 - INFO - codeparrot_training - Step 26284: {'lr': 0.00023115729579164214, 'samples': 13457920, 'steps': 26284, 'batch_loss/train': 0.832346623763442} 12/26/2021 10:22:15 - 
INFO - codeparrot_training - Step 26285: {'lr': 0.00023114157753192986, 'samples': 13458432, 'steps': 26285, 'batch_loss/train': 0.8304202407598495} 12/26/2021 10:22:25 - INFO - codeparrot_training - Step 26286: {'lr': 0.0002311258593471915, 'samples': 13458944, 'steps': 26286, 'batch_loss/train': 0.7541996911168098} 12/26/2021 10:22:36 - INFO - codeparrot_training - Step 26287: {'lr': 0.00023111014123748953, 'samples': 13459456, 'steps': 26287, 'batch_loss/train': 0.7054243567399681} 12/26/2021 10:22:50 - INFO - codeparrot_training - Step 26288: {'lr': 0.0002310944232028866, 'samples': 13459968, 'steps': 26288, 'batch_loss/train': 0.7156015376094729} 12/26/2021 10:23:01 - INFO - codeparrot_training - Step 26289: {'lr': 0.00023107870524344512, 'samples': 13460480, 'steps': 26289, 'batch_loss/train': 0.7899724338203669} 12/26/2021 10:23:11 - INFO - codeparrot_training - Step 26290: {'lr': 0.00023106298735922748, 'samples': 13460992, 'steps': 26290, 'batch_loss/train': 0.8630540855228901} 12/26/2021 10:23:23 - INFO - codeparrot_training - Step 26291: {'lr': 0.00023104726955029627, 'samples': 13461504, 'steps': 26291, 'batch_loss/train': 0.7461704602465034} 12/26/2021 10:23:34 - INFO - codeparrot_training - Step 26292: {'lr': 0.00023103155181671404, 'samples': 13462016, 'steps': 26292, 'batch_loss/train': 0.6873550089076161} 12/26/2021 10:23:45 - INFO - codeparrot_training - Step 26293: {'lr': 0.00023101583415854313, 'samples': 13462528, 'steps': 26293, 'batch_loss/train': 0.779627209994942} 12/26/2021 10:23:55 - INFO - codeparrot_training - Step 26294: {'lr': 0.000231000116575846, 'samples': 13463040, 'steps': 26294, 'batch_loss/train': 0.7277526403777301} 12/26/2021 10:24:07 - INFO - codeparrot_training - Step 26295: {'lr': 0.0002309843990686854, 'samples': 13463552, 'steps': 26295, 'batch_loss/train': 0.6764907971955836} 12/26/2021 10:24:18 - INFO - codeparrot_training - Step 26296: {'lr': 0.00023096868163712358, 'samples': 13464064, 'steps': 26296, 
'batch_loss/train': 0.7934698138851672} 12/26/2021 10:24:28 - INFO - codeparrot_training - Step 26297: {'lr': 0.00023095296428122318, 'samples': 13464576, 'steps': 26297, 'batch_loss/train': 0.7428763592615724} 12/26/2021 10:24:41 - INFO - codeparrot_training - Step 26298: {'lr': 0.00023093724700104642, 'samples': 13465088, 'steps': 26298, 'batch_loss/train': 0.7903498820960522} 12/26/2021 10:24:51 - INFO - codeparrot_training - Step 26299: {'lr': 0.00023092152979665615, 'samples': 13465600, 'steps': 26299, 'batch_loss/train': 0.8566628266125917} 12/26/2021 10:25:02 - INFO - codeparrot_training - Step 26300: {'lr': 0.00023090581266811458, 'samples': 13466112, 'steps': 26300, 'batch_loss/train': 0.7581474827602506} 12/26/2021 10:25:15 - INFO - codeparrot_training - Step 26301: {'lr': 0.00023089009561548434, 'samples': 13466624, 'steps': 26301, 'batch_loss/train': 0.8490446079522371} 12/26/2021 10:25:26 - INFO - codeparrot_training - Step 26302: {'lr': 0.00023087437863882776, 'samples': 13467136, 'steps': 26302, 'batch_loss/train': 0.8227810487151146} 12/26/2021 10:25:37 - INFO - codeparrot_training - Step 26303: {'lr': 0.0002308586617382075, 'samples': 13467648, 'steps': 26303, 'batch_loss/train': 0.6958869432564825} 12/26/2021 10:25:49 - INFO - codeparrot_training - Step 26304: {'lr': 0.00023084294491368602, 'samples': 13468160, 'steps': 26304, 'batch_loss/train': 0.7977168820798397} 12/26/2021 10:26:00 - INFO - codeparrot_training - Step 26305: {'lr': 0.00023082722816532565, 'samples': 13468672, 'steps': 26305, 'batch_loss/train': 0.7480731443502009} 12/26/2021 10:26:10 - INFO - codeparrot_training - Step 26306: {'lr': 0.00023081151149318901, 'samples': 13469184, 'steps': 26306, 'batch_loss/train': 0.820012026000768} 12/26/2021 10:26:21 - INFO - codeparrot_training - Step 26307: {'lr': 0.00023079579489733865, 'samples': 13469696, 'steps': 26307, 'batch_loss/train': 0.7486750222742558} 12/26/2021 10:26:33 - INFO - codeparrot_training - Step 26308: {'lr': 
0.00023078007837783695, 'samples': 13470208, 'steps': 26308, 'batch_loss/train': 0.8535340204834938} 12/26/2021 10:26:43 - INFO - codeparrot_training - Step 26309: {'lr': 0.00023076436193474628, 'samples': 13470720, 'steps': 26309, 'batch_loss/train': 0.6990560758858919} 12/26/2021 10:26:54 - INFO - codeparrot_training - Step 26310: {'lr': 0.0002307486455681293, 'samples': 13471232, 'steps': 26310, 'batch_loss/train': 0.7054136539809406} 12/26/2021 10:27:08 - INFO - codeparrot_training - Step 26311: {'lr': 0.00023073292927804849, 'samples': 13471744, 'steps': 26311, 'batch_loss/train': 0.7281540732365102} 12/26/2021 10:27:19 - INFO - codeparrot_training - Step 26312: {'lr': 0.00023071721306456623, 'samples': 13472256, 'steps': 26312, 'batch_loss/train': 0.6711474629119039} 12/26/2021 10:27:29 - INFO - codeparrot_training - Step 26313: {'lr': 0.00023070149692774498, 'samples': 13472768, 'steps': 26313, 'batch_loss/train': 0.6797333809081465} 12/26/2021 10:27:41 - INFO - codeparrot_training - Step 26314: {'lr': 0.0002306857808676474, 'samples': 13473280, 'steps': 26314, 'batch_loss/train': 0.7671933649107814} 12/26/2021 10:27:52 - INFO - codeparrot_training - Step 26315: {'lr': 0.00023067006488433585, 'samples': 13473792, 'steps': 26315, 'batch_loss/train': 0.737089947797358} 12/26/2021 10:28:03 - INFO - codeparrot_training - Step 26316: {'lr': 0.0002306543489778727, 'samples': 13474304, 'steps': 26316, 'batch_loss/train': 0.7085490059107542} 12/26/2021 10:28:14 - INFO - codeparrot_training - Step 26317: {'lr': 0.00023063863314832074, 'samples': 13474816, 'steps': 26317, 'batch_loss/train': 0.7392154978588223} 12/26/2021 10:28:28 - INFO - codeparrot_training - Step 26318: {'lr': 0.0002306229173957422, 'samples': 13475328, 'steps': 26318, 'batch_loss/train': 0.7095002988353372} 12/26/2021 10:28:38 - INFO - codeparrot_training - Step 26319: {'lr': 0.00023060720172019962, 'samples': 13475840, 'steps': 26319, 'batch_loss/train': 0.5098478496074677} 12/26/2021 10:28:49 - 
INFO - codeparrot_training - Step 26320: {'lr': 0.00023059148612175541, 'samples': 13476352, 'steps': 26320, 'batch_loss/train': 0.8426100062206388} 12/26/2021 10:29:01 - INFO - codeparrot_training - Step 26321: {'lr': 0.0002305757706004722, 'samples': 13476864, 'steps': 26321, 'batch_loss/train': 0.7349335672333837} 12/26/2021 10:29:12 - INFO - codeparrot_training - Step 26322: {'lr': 0.00023056005515641238, 'samples': 13477376, 'steps': 26322, 'batch_loss/train': 0.7051230588112958} 12/26/2021 10:29:22 - INFO - codeparrot_training - Step 26323: {'lr': 0.00023054433978963846, 'samples': 13477888, 'steps': 26323, 'batch_loss/train': 0.68692380504217} 12/26/2021 10:29:35 - INFO - codeparrot_training - Step 26324: {'lr': 0.0002305286245002128, 'samples': 13478400, 'steps': 26324, 'batch_loss/train': 0.7896018885076046} 12/26/2021 10:29:46 - INFO - codeparrot_training - Step 26325: {'lr': 0.00023051290928819803, 'samples': 13478912, 'steps': 26325, 'batch_loss/train': 0.618038518470712} 12/26/2021 10:29:56 - INFO - codeparrot_training - Step 26326: {'lr': 0.00023049719415365667, 'samples': 13479424, 'steps': 26326, 'batch_loss/train': 0.8173243775963783} 12/26/2021 10:30:07 - INFO - codeparrot_training - Step 26327: {'lr': 0.00023048147909665096, 'samples': 13479936, 'steps': 26327, 'batch_loss/train': 1.0979345124214888} 12/26/2021 10:30:20 - INFO - codeparrot_training - Step 26328: {'lr': 0.00023046576411724364, 'samples': 13480448, 'steps': 26328, 'batch_loss/train': 0.7573218373581767} 12/26/2021 10:30:30 - INFO - codeparrot_training - Step 26329: {'lr': 0.00023045004921549698, 'samples': 13480960, 'steps': 26329, 'batch_loss/train': 0.7515492091188207} 12/26/2021 10:30:41 - INFO - codeparrot_training - Step 26330: {'lr': 0.0002304343343914736, 'samples': 13481472, 'steps': 26330, 'batch_loss/train': 0.7778468923643231} 12/26/2021 10:30:55 - INFO - codeparrot_training - Step 26331: {'lr': 0.00023041861964523584, 'samples': 13481984, 'steps': 26331, 
'batch_loss/train': 0.7588240881450474} 12/26/2021 10:31:06 - INFO - codeparrot_training - Step 26332: {'lr': 0.0002304029049768463, 'samples': 13482496, 'steps': 26332, 'batch_loss/train': 0.7967244577594101} 12/26/2021 10:31:16 - INFO - codeparrot_training - Step 26333: {'lr': 0.00023038719038636745, 'samples': 13483008, 'steps': 26333, 'batch_loss/train': 0.6590705807320774} 12/26/2021 10:31:28 - INFO - codeparrot_training - Step 26334: {'lr': 0.00023037147587386177, 'samples': 13483520, 'steps': 26334, 'batch_loss/train': 0.6673543965443969} 12/26/2021 10:31:39 - INFO - codeparrot_training - Step 26335: {'lr': 0.00023035576143939153, 'samples': 13484032, 'steps': 26335, 'batch_loss/train': 0.6809943860862404} 12/26/2021 10:31:50 - INFO - codeparrot_training - Step 26336: {'lr': 0.00023034004708301942, 'samples': 13484544, 'steps': 26336, 'batch_loss/train': 0.7674567709909752} 12/26/2021 10:32:00 - INFO - codeparrot_training - Step 26337: {'lr': 0.00023032433280480795, 'samples': 13485056, 'steps': 26337, 'batch_loss/train': 0.8203878425993025} 12/26/2021 10:32:12 - INFO - codeparrot_training - Step 26338: {'lr': 0.00023030861860481942, 'samples': 13485568, 'steps': 26338, 'batch_loss/train': 0.7808148190379143} 12/26/2021 10:32:23 - INFO - codeparrot_training - Step 26339: {'lr': 0.00023029290448311636, 'samples': 13486080, 'steps': 26339, 'batch_loss/train': 0.699969279114157} 12/26/2021 10:32:34 - INFO - codeparrot_training - Step 26340: {'lr': 0.00023027719043976132, 'samples': 13486592, 'steps': 26340, 'batch_loss/train': 0.7212994303554296} 12/26/2021 10:32:48 - INFO - codeparrot_training - Step 26341: {'lr': 0.00023026147647481674, 'samples': 13487104, 'steps': 26341, 'batch_loss/train': 0.7416748786345124} 12/26/2021 10:32:58 - INFO - codeparrot_training - Step 26342: {'lr': 0.00023024576258834497, 'samples': 13487616, 'steps': 26342, 'batch_loss/train': 0.5976918248925358} 12/26/2021 10:33:09 - INFO - codeparrot_training - Step 26343: {'lr': 
0.00023023004878040875, 'samples': 13488128, 'steps': 26343, 'batch_loss/train': 0.7260345639660954} 12/26/2021 10:33:21 - INFO - codeparrot_training - Step 26344: {'lr': 0.00023021433505107037, 'samples': 13488640, 'steps': 26344, 'batch_loss/train': 0.8050139779224992} 12/26/2021 10:33:32 - INFO - codeparrot_training - Step 26345: {'lr': 0.00023019862140039225, 'samples': 13489152, 'steps': 26345, 'batch_loss/train': 0.8074117798241787} 12/26/2021 10:33:42 - INFO - codeparrot_training - Step 26346: {'lr': 0.0002301829078284369, 'samples': 13489664, 'steps': 26346, 'batch_loss/train': 0.5613970353151672} 12/26/2021 10:33:53 - INFO - codeparrot_training - Step 26347: {'lr': 0.00023016719433526693, 'samples': 13490176, 'steps': 26347, 'batch_loss/train': 0.6612517446046695} 12/26/2021 10:34:07 - INFO - codeparrot_training - Step 26348: {'lr': 0.00023015148092094464, 'samples': 13490688, 'steps': 26348, 'batch_loss/train': 0.8071663295850158} 12/26/2021 10:34:18 - INFO - codeparrot_training - Step 26349: {'lr': 0.00023013576758553262, 'samples': 13491200, 'steps': 26349, 'batch_loss/train': 0.73896240349859} 12/26/2021 10:34:28 - INFO - codeparrot_training - Step 26350: {'lr': 0.0002301200543290932, 'samples': 13491712, 'steps': 26350, 'batch_loss/train': 0.7885100422427058} 12/26/2021 10:34:40 - INFO - codeparrot_training - Step 26351: {'lr': 0.000230104341151689, 'samples': 13492224, 'steps': 26351, 'batch_loss/train': 0.7444465421140194} 12/26/2021 10:34:51 - INFO - codeparrot_training - Step 26352: {'lr': 0.00023008862805338245, 'samples': 13492736, 'steps': 26352, 'batch_loss/train': 0.769721882417798} 12/26/2021 10:35:02 - INFO - codeparrot_training - Step 26353: {'lr': 0.00023007291503423588, 'samples': 13493248, 'steps': 26353, 'batch_loss/train': 0.7396321590058506} 12/26/2021 10:35:15 - INFO - codeparrot_training - Step 26354: {'lr': 0.00023005720209431203, 'samples': 13493760, 'steps': 26354, 'batch_loss/train': 0.8253393084742129} 12/26/2021 10:35:26 - 
INFO - codeparrot_training - Step 26355: {'lr': 0.00023004148923367313, 'samples': 13494272, 'steps': 26355, 'batch_loss/train': 0.8429589280858636} 12/26/2021 10:35:37 - INFO - codeparrot_training - Step 26356: {'lr': 0.0002300257764523818, 'samples': 13494784, 'steps': 26356, 'batch_loss/train': 0.7415354885160923} 12/26/2021 10:35:47 - INFO - codeparrot_training - Step 26357: {'lr': 0.00023001006375050032, 'samples': 13495296, 'steps': 26357, 'batch_loss/train': 0.7677646148949862} 12/26/2021 10:35:59 - INFO - codeparrot_training - Step 26358: {'lr': 0.00022999435112809133, 'samples': 13495808, 'steps': 26358, 'batch_loss/train': 0.7555956775322556} 12/26/2021 10:36:10 - INFO - codeparrot_training - Step 26359: {'lr': 0.0002299786385852173, 'samples': 13496320, 'steps': 26359, 'batch_loss/train': 0.6908195983851328} 12/26/2021 10:36:21 - INFO - codeparrot_training - Step 26360: {'lr': 0.00022996292612194062, 'samples': 13496832, 'steps': 26360, 'batch_loss/train': 0.6932414500042796} 12/26/2021 10:36:33 - INFO - codeparrot_training - Step 26361: {'lr': 0.00022994721373832368, 'samples': 13497344, 'steps': 26361, 'batch_loss/train': 0.8173355339094996} 12/26/2021 10:36:44 - INFO - codeparrot_training - Step 26362: {'lr': 0.0002299315014344292, 'samples': 13497856, 'steps': 26362, 'batch_loss/train': 0.730623094830662} 12/26/2021 10:36:54 - INFO - codeparrot_training - Step 26363: {'lr': 0.00022991578921031946, 'samples': 13498368, 'steps': 26363, 'batch_loss/train': 0.6293481294997036} 12/26/2021 10:37:06 - INFO - codeparrot_training - Step 26364: {'lr': 0.00022990007706605686, 'samples': 13498880, 'steps': 26364, 'batch_loss/train': 0.7629873035475612} 12/26/2021 10:37:17 - INFO - codeparrot_training - Step 26365: {'lr': 0.00022988436500170404, 'samples': 13499392, 'steps': 26365, 'batch_loss/train': 0.818441272713244} 12/26/2021 10:37:28 - INFO - codeparrot_training - Step 26366: {'lr': 0.0002298686530173234, 'samples': 13499904, 'steps': 26366, 
'batch_loss/train': 0.7308141118846834} 12/26/2021 10:37:40 - INFO - codeparrot_training - Step 26367: {'lr': 0.00022985294111297738, 'samples': 13500416, 'steps': 26367, 'batch_loss/train': 0.7774022333323956} 12/26/2021 10:37:51 - INFO - codeparrot_training - Step 26368: {'lr': 0.00022983722928872837, 'samples': 13500928, 'steps': 26368, 'batch_loss/train': 0.6795993172563612} 12/26/2021 10:38:01 - INFO - codeparrot_training - Step 26369: {'lr': 0.00022982151754463907, 'samples': 13501440, 'steps': 26369, 'batch_loss/train': 0.7356705875135958} 12/26/2021 10:38:12 - INFO - codeparrot_training - Step 26370: {'lr': 0.00022980580588077176, 'samples': 13501952, 'steps': 26370, 'batch_loss/train': 0.7589896535500884} 12/26/2021 10:38:26 - INFO - codeparrot_training - Step 26371: {'lr': 0.0002297900942971889, 'samples': 13502464, 'steps': 26371, 'batch_loss/train': 0.7688696039840579} 12/26/2021 10:38:37 - INFO - codeparrot_training - Step 26372: {'lr': 0.00022977438279395297, 'samples': 13502976, 'steps': 26372, 'batch_loss/train': 0.8107136506587267} 12/26/2021 10:38:47 - INFO - codeparrot_training - Step 26373: {'lr': 0.00022975867137112652, 'samples': 13503488, 'steps': 26373, 'batch_loss/train': 0.8367642010562122} 12/26/2021 10:38:59 - INFO - codeparrot_training - Step 26374: {'lr': 0.00022974296002877193, 'samples': 13504000, 'steps': 26374, 'batch_loss/train': 0.7689884295687079} 12/26/2021 10:39:10 - INFO - codeparrot_training - Step 26375: {'lr': 0.00022972724876695172, 'samples': 13504512, 'steps': 26375, 'batch_loss/train': 0.6898940037935972} 12/26/2021 10:39:21 - INFO - codeparrot_training - Step 26376: {'lr': 0.0002297115375857282, 'samples': 13505024, 'steps': 26376, 'batch_loss/train': 0.735118858050555} 12/26/2021 10:39:34 - INFO - codeparrot_training - Step 26377: {'lr': 0.00022969582648516403, 'samples': 13505536, 'steps': 26377, 'batch_loss/train': 0.8226137352176011} 12/26/2021 10:39:45 - INFO - codeparrot_training - Step 26378: {'lr': 
0.00022968011546532159, 'samples': 13506048, 'steps': 26378, 'batch_loss/train': 0.7421882287599146} 12/26/2021 10:39:55 - INFO - codeparrot_training - Step 26379: {'lr': 0.00022966440452626324, 'samples': 13506560, 'steps': 26379, 'batch_loss/train': 0.662020830437541} 12/26/2021 10:40:06 - INFO - codeparrot_training - Step 26380: {'lr': 0.00022964869366805168, 'samples': 13507072, 'steps': 26380, 'batch_loss/train': 0.7234824607148767} 12/26/2021 10:40:18 - INFO - codeparrot_training - Step 26381: {'lr': 0.00022963298289074916, 'samples': 13507584, 'steps': 26381, 'batch_loss/train': 0.8299737265333533} 12/26/2021 10:40:28 - INFO - codeparrot_training - Step 26382: {'lr': 0.00022961727219441828, 'samples': 13508096, 'steps': 26382, 'batch_loss/train': 0.8051335453055799} 12/26/2021 10:40:39 - INFO - codeparrot_training - Step 26383: {'lr': 0.0002296015615791213, 'samples': 13508608, 'steps': 26383, 'batch_loss/train': 0.7475555017590523} 12/26/2021 10:40:51 - INFO - codeparrot_training - Step 26384: {'lr': 0.00022958585104492085, 'samples': 13509120, 'steps': 26384, 'batch_loss/train': 0.6993989022448659} 12/26/2021 10:41:02 - INFO - codeparrot_training - Step 26385: {'lr': 0.00022957014059187943, 'samples': 13509632, 'steps': 26385, 'batch_loss/train': 0.7126201307401061} 12/26/2021 10:41:12 - INFO - codeparrot_training - Step 26386: {'lr': 0.00022955443022005937, 'samples': 13510144, 'steps': 26386, 'batch_loss/train': 1.0187919670715928} 12/26/2021 10:41:26 - INFO - codeparrot_training - Step 26387: {'lr': 0.0002295387199295231, 'samples': 13510656, 'steps': 26387, 'batch_loss/train': 0.8527535377070308} 12/26/2021 10:41:37 - INFO - codeparrot_training - Step 26388: {'lr': 0.00022952300972033327, 'samples': 13511168, 'steps': 26388, 'batch_loss/train': 0.76249875780195} 12/26/2021 10:41:47 - INFO - codeparrot_training - Step 26389: {'lr': 0.00022950729959255223, 'samples': 13511680, 'steps': 26389, 'batch_loss/train': 0.7615324249491096} 12/26/2021 10:41:58 - 
INFO - codeparrot_training - Step 26390: {'lr': 0.0002294915895462423, 'samples': 13512192, 'steps': 26390, 'batch_loss/train': 0.7242501545697451} 12/26/2021 10:42:10 - INFO - codeparrot_training - Step 26391: {'lr': 0.00022947587958146613, 'samples': 13512704, 'steps': 26391, 'batch_loss/train': 0.7946090510813519} 12/26/2021 10:42:21 - INFO - codeparrot_training - Step 26392: {'lr': 0.00022946016969828617, 'samples': 13513216, 'steps': 26392, 'batch_loss/train': 0.8139668125659227} 12/26/2021 10:42:32 - INFO - codeparrot_training - Step 26393: {'lr': 0.00022944445989676478, 'samples': 13513728, 'steps': 26393, 'batch_loss/train': 0.9001809414476156} 12/26/2021 10:42:45 - INFO - codeparrot_training - Step 26394: {'lr': 0.00022942875017696436, 'samples': 13514240, 'steps': 26394, 'batch_loss/train': 0.8559263704810292} 12/26/2021 10:42:56 - INFO - codeparrot_training - Step 26395: {'lr': 0.00022941304053894759, 'samples': 13514752, 'steps': 26395, 'batch_loss/train': 0.7711971933022141} 12/26/2021 10:43:07 - INFO - codeparrot_training - Step 26396: {'lr': 0.00022939733098277678, 'samples': 13515264, 'steps': 26396, 'batch_loss/train': 0.6844325183192268} 12/26/2021 10:43:19 - INFO - codeparrot_training - Step 26397: {'lr': 0.0002293816215085144, 'samples': 13515776, 'steps': 26397, 'batch_loss/train': 0.7345785340294242} 12/26/2021 10:43:29 - INFO - codeparrot_training - Step 26398: {'lr': 0.0002293659121162228, 'samples': 13516288, 'steps': 26398, 'batch_loss/train': 0.7602941747754812} 12/26/2021 10:43:40 - INFO - codeparrot_training - Step 26399: {'lr': 0.0002293502028059647, 'samples': 13516800, 'steps': 26399, 'batch_loss/train': 0.8566155969165266} 12/26/2021 10:43:52 - INFO - codeparrot_training - Step 26400: {'lr': 0.00022933449357780234, 'samples': 13517312, 'steps': 26400, 'batch_loss/train': 0.7915117861703038} 12/26/2021 10:44:03 - INFO - codeparrot_training - Step 26401: {'lr': 0.00022931878443179814, 'samples': 13517824, 'steps': 26401, 
'batch_loss/train': 0.7434261776506901} 12/26/2021 10:44:13 - INFO - codeparrot_training - Step 26402: {'lr': 0.00022930307536801476, 'samples': 13518336, 'steps': 26402, 'batch_loss/train': 0.8669106159359217} 12/26/2021 10:44:24 - INFO - codeparrot_training - Step 26403: {'lr': 0.00022928736638651447, 'samples': 13518848, 'steps': 26403, 'batch_loss/train': 0.8835240163025446} 12/26/2021 10:44:36 - INFO - codeparrot_training - Step 26404: {'lr': 0.0002292716574873599, 'samples': 13519360, 'steps': 26404, 'batch_loss/train': 0.7323910668492317} 12/26/2021 10:44:47 - INFO - codeparrot_training - Step 26405: {'lr': 0.00022925594867061324, 'samples': 13519872, 'steps': 26405, 'batch_loss/train': 0.6597953522577882} 12/26/2021 10:44:57 - INFO - codeparrot_training - Step 26406: {'lr': 0.00022924023993633718, 'samples': 13520384, 'steps': 26406, 'batch_loss/train': 0.7838473506271839} 12/26/2021 10:45:12 - INFO - codeparrot_training - Step 26407: {'lr': 0.0002292245312845941, 'samples': 13520896, 'steps': 26407, 'batch_loss/train': 0.7996250081341714} 12/26/2021 10:45:22 - INFO - codeparrot_training - Step 26408: {'lr': 0.00022920882271544647, 'samples': 13521408, 'steps': 26408, 'batch_loss/train': 0.8739825012162328} 12/26/2021 10:45:33 - INFO - codeparrot_training - Step 26409: {'lr': 0.00022919311422895657, 'samples': 13521920, 'steps': 26409, 'batch_loss/train': 0.6460348220425658} 12/26/2021 10:45:45 - INFO - codeparrot_training - Step 26410: {'lr': 0.0002291774058251871, 'samples': 13522432, 'steps': 26410, 'batch_loss/train': 0.7829908579587936} 12/26/2021 10:45:55 - INFO - codeparrot_training - Step 26411: {'lr': 0.00022916169750420042, 'samples': 13522944, 'steps': 26411, 'batch_loss/train': 0.7151548396795988} 12/26/2021 10:46:06 - INFO - codeparrot_training - Step 26412: {'lr': 0.0002291459892660589, 'samples': 13523456, 'steps': 26412, 'batch_loss/train': 0.7838390865363181} 12/26/2021 10:46:17 - INFO - codeparrot_training - Step 26413: {'lr': 
0.00022913028111082502, 'samples': 13523968, 'steps': 26413, 'batch_loss/train': 0.7295641186647117} 12/26/2021 10:46:29 - INFO - codeparrot_training - Step 26414: {'lr': 0.00022911457303856133, 'samples': 13524480, 'steps': 26414, 'batch_loss/train': 0.7370542755816132} 12/26/2021 10:46:40 - INFO - codeparrot_training - Step 26415: {'lr': 0.00022909886504933024, 'samples': 13524992, 'steps': 26415, 'batch_loss/train': 0.6185264443629421} 12/26/2021 10:46:50 - INFO - codeparrot_training - Step 26416: {'lr': 0.0002290831571431941, 'samples': 13525504, 'steps': 26416, 'batch_loss/train': 0.7544240991119295} 12/26/2021 10:47:04 - INFO - codeparrot_training - Step 26417: {'lr': 0.00022906744932021545, 'samples': 13526016, 'steps': 26417, 'batch_loss/train': 0.723677912261337} 12/26/2021 10:47:15 - INFO - codeparrot_training - Step 26418: {'lr': 0.00022905174158045676, 'samples': 13526528, 'steps': 26418, 'batch_loss/train': 0.783828909508884} 12/26/2021 10:47:26 - INFO - codeparrot_training - Step 26419: {'lr': 0.00022903603392398039, 'samples': 13527040, 'steps': 26419, 'batch_loss/train': 0.7433942770585418} 12/26/2021 10:47:38 - INFO - codeparrot_training - Step 26420: {'lr': 0.0002290203263508488, 'samples': 13527552, 'steps': 26420, 'batch_loss/train': 0.6123960199183784} 12/26/2021 10:47:48 - INFO - codeparrot_training - Step 26421: {'lr': 0.00022900461886112456, 'samples': 13528064, 'steps': 26421, 'batch_loss/train': 0.6970004476606846} 12/26/2021 10:47:59 - INFO - codeparrot_training - Step 26422: {'lr': 0.00022898891145486998, 'samples': 13528576, 'steps': 26422, 'batch_loss/train': 0.7805371802533045} 12/26/2021 10:48:10 - INFO - codeparrot_training - Step 26423: {'lr': 0.0002289732041321476, 'samples': 13529088, 'steps': 26423, 'batch_loss/train': 0.7916224440559745} 12/26/2021 10:48:24 - INFO - codeparrot_training - Step 26424: {'lr': 0.0002289574968930197, 'samples': 13529600, 'steps': 26424, 'batch_loss/train': 1.1662898873910308} 12/26/2021 10:48:34 - 
INFO - codeparrot_training - Step 26425: {'lr': 0.00022894178973754899, 'samples': 13530112, 'steps': 26425, 'batch_loss/train': 0.713077993132174} 12/26/2021 10:48:45 - INFO - codeparrot_training - Step 26426: {'lr': 0.00022892608266579767, 'samples': 13530624, 'steps': 26426, 'batch_loss/train': 0.7753490759059787} 12/26/2021 10:48:57 - INFO - codeparrot_training - Step 26427: {'lr': 0.00022891037567782826, 'samples': 13531136, 'steps': 26427, 'batch_loss/train': 0.7921460054349154} 12/26/2021 10:49:08 - INFO - codeparrot_training - Step 26428: {'lr': 0.00022889466877370335, 'samples': 13531648, 'steps': 26428, 'batch_loss/train': 0.6522731338627636} 12/26/2021 10:49:18 - INFO - codeparrot_training - Step 26429: {'lr': 0.0002288789619534852, 'samples': 13532160, 'steps': 26429, 'batch_loss/train': 0.673866315279156} 12/26/2021 10:49:29 - INFO - codeparrot_training - Step 26430: {'lr': 0.0002288632552172364, 'samples': 13532672, 'steps': 26430, 'batch_loss/train': 0.7447085967287421} 12/26/2021 10:49:41 - INFO - codeparrot_training - Step 26431: {'lr': 0.0002288475485650192, 'samples': 13533184, 'steps': 26431, 'batch_loss/train': 0.7540964977815747} 12/26/2021 10:49:52 - INFO - codeparrot_training - Step 26432: {'lr': 0.0002288318419968962, 'samples': 13533696, 'steps': 26432, 'batch_loss/train': 0.6712830968899652} 12/26/2021 10:50:02 - INFO - codeparrot_training - Step 26433: {'lr': 0.00022881613551292988, 'samples': 13534208, 'steps': 26433, 'batch_loss/train': 0.6730381542583928} 12/26/2021 10:50:16 - INFO - codeparrot_training - Step 26434: {'lr': 0.00022880042911318257, 'samples': 13534720, 'steps': 26434, 'batch_loss/train': 0.7790232702391222} 12/26/2021 10:50:27 - INFO - codeparrot_training - Step 26435: {'lr': 0.00022878472279771667, 'samples': 13535232, 'steps': 26435, 'batch_loss/train': 0.770393610233441} 12/26/2021 10:50:38 - INFO - codeparrot_training - Step 26436: {'lr': 0.00022876901656659475, 'samples': 13535744, 'steps': 26436, 
'batch_loss/train': 0.7680381932295859} 12/26/2021 10:50:50 - INFO - codeparrot_training - Step 26437: {'lr': 0.00022875331041987925, 'samples': 13536256, 'steps': 26437, 'batch_loss/train': 0.8224147474393249} 12/26/2021 10:51:00 - INFO - codeparrot_training - Step 26438: {'lr': 0.00022873760435763248, 'samples': 13536768, 'steps': 26438, 'batch_loss/train': 0.7256800676696002} 12/26/2021 10:51:11 - INFO - codeparrot_training - Step 26439: {'lr': 0.000228721898379917, 'samples': 13537280, 'steps': 26439, 'batch_loss/train': 0.6698136394843459} 12/26/2021 10:51:23 - INFO - codeparrot_training - Step 26440: {'lr': 0.00022870619248679526, 'samples': 13537792, 'steps': 26440, 'batch_loss/train': 0.7587280552834272} 12/26/2021 10:51:34 - INFO - codeparrot_training - Step 26441: {'lr': 0.00022869048667832962, 'samples': 13538304, 'steps': 26441, 'batch_loss/train': 0.6718999044969678} 12/26/2021 10:51:44 - INFO - codeparrot_training - Step 26442: {'lr': 0.0002286747809545825, 'samples': 13538816, 'steps': 26442, 'batch_loss/train': 0.704154082108289} 12/26/2021 10:51:55 - INFO - codeparrot_training - Step 26443: {'lr': 0.00022865907531561652, 'samples': 13539328, 'steps': 26443, 'batch_loss/train': 0.7484538580756634} 12/26/2021 10:52:07 - INFO - codeparrot_training - Step 26444: {'lr': 0.000228643369761494, 'samples': 13539840, 'steps': 26444, 'batch_loss/train': 0.7924815067090094} 12/26/2021 10:52:18 - INFO - codeparrot_training - Step 26445: {'lr': 0.0002286276642922773, 'samples': 13540352, 'steps': 26445, 'batch_loss/train': 0.7965269037522376} 12/26/2021 10:52:29 - INFO - codeparrot_training - Step 26446: {'lr': 0.00022861195890802892, 'samples': 13540864, 'steps': 26446, 'batch_loss/train': 0.6562613951973617} 12/26/2021 10:52:42 - INFO - codeparrot_training - Step 26447: {'lr': 0.00022859625360881138, 'samples': 13541376, 'steps': 26447, 'batch_loss/train': 0.8279432775452733} 12/26/2021 10:52:53 - INFO - codeparrot_training - Step 26448: {'lr': 
0.00022858054839468704, 'samples': 13541888, 'steps': 26448, 'batch_loss/train': 0.7995092160999775} 12/26/2021 10:53:04 - INFO - codeparrot_training - Step 26449: {'lr': 0.00022856484326571835, 'samples': 13542400, 'steps': 26449, 'batch_loss/train': 0.6666943458840251} 12/26/2021 10:53:16 - INFO - codeparrot_training - Step 26450: {'lr': 0.00022854913822196768, 'samples': 13542912, 'steps': 26450, 'batch_loss/train': 0.7700927336700261} 12/26/2021 10:53:26 - INFO - codeparrot_training - Step 26451: {'lr': 0.0002285334332634976, 'samples': 13543424, 'steps': 26451, 'batch_loss/train': 0.7685429723933339} 12/26/2021 10:53:37 - INFO - codeparrot_training - Step 26452: {'lr': 0.00022851772839037053, 'samples': 13543936, 'steps': 26452, 'batch_loss/train': 0.8479480971582234} 12/26/2021 10:53:48 - INFO - codeparrot_training - Step 26453: {'lr': 0.00022850202360264873, 'samples': 13544448, 'steps': 26453, 'batch_loss/train': 0.8691898789256811} 12/26/2021 10:54:00 - INFO - codeparrot_training - Step 26454: {'lr': 0.00022848631890039494, 'samples': 13544960, 'steps': 26454, 'batch_loss/train': 0.7406675467500463} 12/26/2021 10:54:10 - INFO - codeparrot_training - Step 26455: {'lr': 0.0002284706142836713, 'samples': 13545472, 'steps': 26455, 'batch_loss/train': 0.7297307663247921} 12/26/2021 10:54:21 - INFO - codeparrot_training - Step 26456: {'lr': 0.00022845490975254042, 'samples': 13545984, 'steps': 26456, 'batch_loss/train': 0.8356396723538637} 12/26/2021 10:54:35 - INFO - codeparrot_training - Step 26457: {'lr': 0.00022843920530706463, 'samples': 13546496, 'steps': 26457, 'batch_loss/train': 0.6902227960526943} 12/26/2021 10:54:45 - INFO - codeparrot_training - Step 26458: {'lr': 0.00022842350094730643, 'samples': 13547008, 'steps': 26458, 'batch_loss/train': 0.7863526865839958} 12/26/2021 10:54:56 - INFO - codeparrot_training - Step 26459: {'lr': 0.00022840779667332832, 'samples': 13547520, 'steps': 26459, 'batch_loss/train': 0.8528694165870547} 12/26/2021 10:55:08 
- INFO - codeparrot_training - Step 26460: {'lr': 0.00022839209248519266, 'samples': 13548032, 'steps': 26460, 'batch_loss/train': 0.8173585655167699} 12/26/2021 10:55:19 - INFO - codeparrot_training - Step 26461: {'lr': 0.00022837638838296175, 'samples': 13548544, 'steps': 26461, 'batch_loss/train': 0.7685276011470705} 12/26/2021 10:55:29 - INFO - codeparrot_training - Step 26462: {'lr': 0.0002283606843666982, 'samples': 13549056, 'steps': 26462, 'batch_loss/train': 0.6973487767390907} 12/26/2021 10:55:43 - INFO - codeparrot_training - Step 26463: {'lr': 0.00022834498043646447, 'samples': 13549568, 'steps': 26463, 'batch_loss/train': 0.762430218514055} 12/26/2021 10:55:54 - INFO - codeparrot_training - Step 26464: {'lr': 0.00022832927659232282, 'samples': 13550080, 'steps': 26464, 'batch_loss/train': 0.7232519425451756} 12/26/2021 10:56:05 - INFO - codeparrot_training - Step 26465: {'lr': 0.00022831357283433585, 'samples': 13550592, 'steps': 26465, 'batch_loss/train': 0.8598402077332139} 12/26/2021 10:56:15 - INFO - codeparrot_training - Step 26466: {'lr': 0.00022829786916256595, 'samples': 13551104, 'steps': 26466, 'batch_loss/train': 0.7833577319979668} 12/26/2021 10:56:27 - INFO - codeparrot_training - Step 26467: {'lr': 0.00022828216557707552, 'samples': 13551616, 'steps': 26467, 'batch_loss/train': 0.7480811825953424} 12/26/2021 10:56:38 - INFO - codeparrot_training - Step 26468: {'lr': 0.0002282664620779269, 'samples': 13552128, 'steps': 26468, 'batch_loss/train': 0.768919677939266} 12/26/2021 10:56:49 - INFO - codeparrot_training - Step 26469: {'lr': 0.00022825075866518275, 'samples': 13552640, 'steps': 26469, 'batch_loss/train': 0.7537871890235692} 12/26/2021 10:57:01 - INFO - codeparrot_training - Step 26470: {'lr': 0.00022823505533890533, 'samples': 13553152, 'steps': 26470, 'batch_loss/train': 0.7461477373726666} 12/26/2021 10:57:11 - INFO - codeparrot_training - Step 26471: {'lr': 0.0002282193520991571, 'samples': 13553664, 'steps': 26471, 
'batch_loss/train': 0.6697752242907882} 12/26/2021 10:57:22 - INFO - codeparrot_training - Step 26472: {'lr': 0.00022820364894600046, 'samples': 13554176, 'steps': 26472, 'batch_loss/train': 0.7280890559777617} 12/26/2021 10:57:36 - INFO - codeparrot_training - Step 26473: {'lr': 0.00022818794587949797, 'samples': 13554688, 'steps': 26473, 'batch_loss/train': 0.7584407897666097} 12/26/2021 10:57:46 - INFO - codeparrot_training - Step 26474: {'lr': 0.00022817224289971188, 'samples': 13555200, 'steps': 26474, 'batch_loss/train': 0.743313170503825} 12/26/2021 10:57:57 - INFO - codeparrot_training - Step 26475: {'lr': 0.00022815654000670472, 'samples': 13555712, 'steps': 26475, 'batch_loss/train': 0.7699325624853373} 12/26/2021 10:58:08 - INFO - codeparrot_training - Step 26476: {'lr': 0.000228140837200539, 'samples': 13556224, 'steps': 26476, 'batch_loss/train': 0.7444582832977176} 12/26/2021 10:58:20 - INFO - codeparrot_training - Step 26477: {'lr': 0.00022812513448127697, 'samples': 13556736, 'steps': 26477, 'batch_loss/train': 0.7263819458894432} 12/26/2021 10:58:30 - INFO - codeparrot_training - Step 26478: {'lr': 0.00022810943184898124, 'samples': 13557248, 'steps': 26478, 'batch_loss/train': 0.6607896109344438} 12/26/2021 10:58:41 - INFO - codeparrot_training - Step 26479: {'lr': 0.000228093729303714, 'samples': 13557760, 'steps': 26479, 'batch_loss/train': 0.7659159004688263} 12/26/2021 10:58:53 - INFO - codeparrot_training - Step 26480: {'lr': 0.00022807802684553797, 'samples': 13558272, 'steps': 26480, 'batch_loss/train': 0.6615124945528805} 12/26/2021 10:59:04 - INFO - codeparrot_training - Step 26481: {'lr': 0.00022806232447451533, 'samples': 13558784, 'steps': 26481, 'batch_loss/train': 0.8849879186600447} 12/26/2021 10:59:14 - INFO - codeparrot_training - Step 26482: {'lr': 0.00022804662219070868, 'samples': 13559296, 'steps': 26482, 'batch_loss/train': 0.8220686316490173} 12/26/2021 10:59:26 - INFO - codeparrot_training - Step 26483: {'lr': 
0.00022803091999418028, 'samples': 13559808, 'steps': 26483, 'batch_loss/train': 0.7542761135846376} 12/26/2021 10:59:37 - INFO - codeparrot_training - Step 26484: {'lr': 0.00022801521788499266, 'samples': 13560320, 'steps': 26484, 'batch_loss/train': 0.7142631788738072} 12/26/2021 10:59:48 - INFO - codeparrot_training - Step 26485: {'lr': 0.00022799951586320832, 'samples': 13560832, 'steps': 26485, 'batch_loss/train': 0.7874725563451648} 12/26/2021 10:59:58 - INFO - codeparrot_training - Step 26486: {'lr': 0.00022798381392888953, 'samples': 13561344, 'steps': 26486, 'batch_loss/train': 0.76888239197433} 12/26/2021 11:00:12 - INFO - codeparrot_training - Step 26487: {'lr': 0.00022796811208209873, 'samples': 13561856, 'steps': 26487, 'batch_loss/train': 0.7201489664148539} 12/26/2021 11:00:23 - INFO - codeparrot_training - Step 26488: {'lr': 0.0002279524103228985, 'samples': 13562368, 'steps': 26488, 'batch_loss/train': 0.8560398947447538} 12/26/2021 11:00:33 - INFO - codeparrot_training - Step 26489: {'lr': 0.00022793670865135118, 'samples': 13562880, 'steps': 26489, 'batch_loss/train': 0.7499525765888393} 12/26/2021 11:00:46 - INFO - codeparrot_training - Step 26490: {'lr': 0.00022792100706751906, 'samples': 13563392, 'steps': 26490, 'batch_loss/train': 0.7551544792950153} 12/26/2021 11:00:56 - INFO - codeparrot_training - Step 26491: {'lr': 0.0002279053055714647, 'samples': 13563904, 'steps': 26491, 'batch_loss/train': 0.7085969555191696} 12/26/2021 11:01:07 - INFO - codeparrot_training - Step 26492: {'lr': 0.0002278896041632506, 'samples': 13564416, 'steps': 26492, 'batch_loss/train': 0.7493948573246598} 12/26/2021 11:01:22 - INFO - codeparrot_training - Step 26493: {'lr': 0.00022787390284293905, 'samples': 13564928, 'steps': 26493, 'batch_loss/train': 0.925752729177475} 12/26/2021 11:01:32 - INFO - codeparrot_training - Step 26494: {'lr': 0.0002278582016105924, 'samples': 13565440, 'steps': 26494, 'batch_loss/train': 0.7138329073786736} 12/26/2021 11:01:43 - 
INFO - codeparrot_training - Step 26495: {'lr': 0.00022784250046627331, 'samples': 13565952, 'steps': 26495, 'batch_loss/train': 1.5136159602552652} 12/26/2021 11:01:54 - INFO - codeparrot_training - Step 26496: {'lr': 0.00022782679941004403, 'samples': 13566464, 'steps': 26496, 'batch_loss/train': 0.6118942002067342} 12/26/2021 11:02:06 - INFO - codeparrot_training - Step 26497: {'lr': 0.00022781109844196708, 'samples': 13566976, 'steps': 26497, 'batch_loss/train': 0.7639608486206271} 12/26/2021 11:02:16 - INFO - codeparrot_training - Step 26498: {'lr': 0.0002277953975621047, 'samples': 13567488, 'steps': 26498, 'batch_loss/train': 0.6714569074683823} 12/26/2021 11:02:27 - INFO - codeparrot_training - Step 26499: {'lr': 0.0002277796967705195, 'samples': 13568000, 'steps': 26499, 'batch_loss/train': 0.6376433650148101} 12/26/2021 11:02:39 - INFO - codeparrot_training - Step 26500: {'lr': 0.00022776399606727385, 'samples': 13568512, 'steps': 26500, 'batch_loss/train': 0.7740937070921063} 12/26/2021 11:02:50 - INFO - codeparrot_training - Step 26501: {'lr': 0.00022774829545243007, 'samples': 13569024, 'steps': 26501, 'batch_loss/train': 0.6921401573345065} 12/26/2021 11:03:00 - INFO - codeparrot_training - Step 26502: {'lr': 0.00022773259492605077, 'samples': 13569536, 'steps': 26502, 'batch_loss/train': 0.7668122593313456} 12/26/2021 11:03:11 - INFO - codeparrot_training - Step 26503: {'lr': 0.00022771689448819823, 'samples': 13570048, 'steps': 26503, 'batch_loss/train': 0.7857512142509222} 12/26/2021 11:03:25 - INFO - codeparrot_training - Step 26504: {'lr': 0.00022770119413893498, 'samples': 13570560, 'steps': 26504, 'batch_loss/train': 0.7736318088136613} 12/26/2021 11:03:36 - INFO - codeparrot_training - Step 26505: {'lr': 0.0002276854938783232, 'samples': 13571072, 'steps': 26505, 'batch_loss/train': 1.2800234400201589} 12/26/2021 11:03:46 - INFO - codeparrot_training - Step 26506: {'lr': 0.0002276697937064256, 'samples': 13571584, 'steps': 26506, 
'batch_loss/train': 0.7278192150406539} 12/26/2021 11:03:58 - INFO - codeparrot_training - Step 26507: {'lr': 0.0002276540936233044, 'samples': 13572096, 'steps': 26507, 'batch_loss/train': 0.864836947992444} 12/26/2021 11:04:09 - INFO - codeparrot_training - Step 26508: {'lr': 0.0002276383936290222, 'samples': 13572608, 'steps': 26508, 'batch_loss/train': 0.9033354455605149} 12/26/2021 11:04:20 - INFO - codeparrot_training - Step 26509: {'lr': 0.00022762269372364114, 'samples': 13573120, 'steps': 26509, 'batch_loss/train': 0.6352962167002261} 12/26/2021 11:04:33 - INFO - codeparrot_training - Step 26510: {'lr': 0.00022760699390722387, 'samples': 13573632, 'steps': 26510, 'batch_loss/train': 0.7612182899611071} 12/26/2021 11:04:44 - INFO - codeparrot_training - Step 26511: {'lr': 0.00022759129417983278, 'samples': 13574144, 'steps': 26511, 'batch_loss/train': 0.745813270797953} 12/26/2021 11:04:55 - INFO - codeparrot_training - Step 26512: {'lr': 0.00022757559454153015, 'samples': 13574656, 'steps': 26512, 'batch_loss/train': 0.7687159804627299} 12/26/2021 11:05:05 - INFO - codeparrot_training - Step 26513: {'lr': 0.00022755989499237853, 'samples': 13575168, 'steps': 26513, 'batch_loss/train': 0.785527334548533} 12/26/2021 11:05:17 - INFO - codeparrot_training - Step 26514: {'lr': 0.00022754419553244035, 'samples': 13575680, 'steps': 26514, 'batch_loss/train': 0.8516025329008698} 12/26/2021 11:05:28 - INFO - codeparrot_training - Step 26515: {'lr': 0.00022752849616177798, 'samples': 13576192, 'steps': 26515, 'batch_loss/train': 0.7767160208895802} 12/26/2021 11:05:39 - INFO - codeparrot_training - Step 26516: {'lr': 0.0002275127968804537, 'samples': 13576704, 'steps': 26516, 'batch_loss/train': 0.8032967804465443} 12/26/2021 11:05:51 - INFO - codeparrot_training - Step 26517: {'lr': 0.00022749709768853012, 'samples': 13577216, 'steps': 26517, 'batch_loss/train': 0.70370087120682} 12/26/2021 11:06:01 - INFO - codeparrot_training - Step 26518: {'lr': 
0.00022748139858606964, 'samples': 13577728, 'steps': 26518, 'batch_loss/train': 0.7096171411685646} 12/26/2021 11:06:12 - INFO - codeparrot_training - Step 26519: {'lr': 0.00022746569957313455, 'samples': 13578240, 'steps': 26519, 'batch_loss/train': 0.7398445997387171} 12/26/2021 11:06:24 - INFO - codeparrot_training - Step 26520: {'lr': 0.00022745000064978727, 'samples': 13578752, 'steps': 26520, 'batch_loss/train': 0.7186222551390529} 12/26/2021 11:06:35 - INFO - codeparrot_training - Step 26521: {'lr': 0.0002274343018160904, 'samples': 13579264, 'steps': 26521, 'batch_loss/train': 0.7609301079064608} 12/26/2021 11:06:45 - INFO - codeparrot_training - Step 26522: {'lr': 0.00022741860307210615, 'samples': 13579776, 'steps': 26522, 'batch_loss/train': 0.7549688220024109} 12/26/2021 11:06:59 - INFO - codeparrot_training - Step 26523: {'lr': 0.00022740290441789708, 'samples': 13580288, 'steps': 26523, 'batch_loss/train': 0.7560655237175524} 12/26/2021 11:07:10 - INFO - codeparrot_training - Step 26524: {'lr': 0.0002273872058535254, 'samples': 13580800, 'steps': 26524, 'batch_loss/train': 0.7806376731023192} 12/26/2021 11:07:21 - INFO - codeparrot_training - Step 26525: {'lr': 0.00022737150737905379, 'samples': 13581312, 'steps': 26525, 'batch_loss/train': 0.8453270504251122} 12/26/2021 11:07:31 - INFO - codeparrot_training - Step 26526: {'lr': 0.00022735580899454444, 'samples': 13581824, 'steps': 26526, 'batch_loss/train': 0.7303385250270367} 12/26/2021 11:07:44 - INFO - codeparrot_training - Step 26527: {'lr': 0.0002273401107000598, 'samples': 13582336, 'steps': 26527, 'batch_loss/train': 0.7048224434256554} 12/26/2021 11:07:54 - INFO - codeparrot_training - Step 26528: {'lr': 0.00022732441249566243, 'samples': 13582848, 'steps': 26528, 'batch_loss/train': 0.7646640362218022} 12/26/2021 11:08:05 - INFO - codeparrot_training - Step 26529: {'lr': 0.0002273087143814146, 'samples': 13583360, 'steps': 26529, 'batch_loss/train': 0.7545663490891457} 12/26/2021 11:08:17 - 
INFO - codeparrot_training - Step 26530: {'lr': 0.0002272930163573788, 'samples': 13583872, 'steps': 26530, 'batch_loss/train': 0.6981466608121991} 12/26/2021 11:08:28 - INFO - codeparrot_training - Step 26531: {'lr': 0.00022727731842361727, 'samples': 13584384, 'steps': 26531, 'batch_loss/train': 0.7109900552313775} 12/26/2021 11:08:38 - INFO - codeparrot_training - Step 26532: {'lr': 0.0002272616205801926, 'samples': 13584896, 'steps': 26532, 'batch_loss/train': 0.7645823499187827} 12/26/2021 11:08:49 - INFO - codeparrot_training - Step 26533: {'lr': 0.0002272459228271672, 'samples': 13585408, 'steps': 26533, 'batch_loss/train': 0.8640383742749691} 12/26/2021 11:09:03 - INFO - codeparrot_training - Step 26534: {'lr': 0.00022723022516460344, 'samples': 13585920, 'steps': 26534, 'batch_loss/train': 0.623320407117717} 12/26/2021 11:09:13 - INFO - codeparrot_training - Step 26535: {'lr': 0.00022721452759256355, 'samples': 13586432, 'steps': 26535, 'batch_loss/train': 0.7744660452008247} 12/26/2021 11:09:24 - INFO - codeparrot_training - Step 26536: {'lr': 0.00022719883011111018, 'samples': 13586944, 'steps': 26536, 'batch_loss/train': 0.730665713082999} 12/26/2021 11:09:36 - INFO - codeparrot_training - Step 26537: {'lr': 0.0002271831327203057, 'samples': 13587456, 'steps': 26537, 'batch_loss/train': 0.776857624296099} 12/26/2021 11:09:47 - INFO - codeparrot_training - Step 26538: {'lr': 0.00022716743542021238, 'samples': 13587968, 'steps': 26538, 'batch_loss/train': 0.864201239310205} 12/26/2021 11:09:57 - INFO - codeparrot_training - Step 26539: {'lr': 0.00022715173821089273, 'samples': 13588480, 'steps': 26539, 'batch_loss/train': 0.6712399171665311} 12/26/2021 11:10:11 - INFO - codeparrot_training - Step 26540: {'lr': 0.0002271360410924092, 'samples': 13588992, 'steps': 26540, 'batch_loss/train': 0.8223260752856731} 12/26/2021 11:10:22 - INFO - codeparrot_training - Step 26541: {'lr': 0.00022712034406482414, 'samples': 13589504, 'steps': 26541, 
'batch_loss/train': 0.6726165945874527} 12/26/2021 11:10:33 - INFO - codeparrot_training - Step 26542: {'lr': 0.00022710464712819985, 'samples': 13590016, 'steps': 26542, 'batch_loss/train': 0.7840409353375435} 12/26/2021 11:10:45 - INFO - codeparrot_training - Step 26543: {'lr': 0.00022708895028259894, 'samples': 13590528, 'steps': 26543, 'batch_loss/train': 0.8387900311499834} 12/26/2021 11:10:55 - INFO - codeparrot_training - Step 26544: {'lr': 0.00022707325352808373, 'samples': 13591040, 'steps': 26544, 'batch_loss/train': 0.7366657396778464} 12/26/2021 11:11:06 - INFO - codeparrot_training - Step 26545: {'lr': 0.00022705755686471655, 'samples': 13591552, 'steps': 26545, 'batch_loss/train': 0.7931143064051867} 12/26/2021 11:11:17 - INFO - codeparrot_training - Step 26546: {'lr': 0.0002270418602925598, 'samples': 13592064, 'steps': 26546, 'batch_loss/train': 0.7599034733138978} 12/26/2021 11:11:29 - INFO - codeparrot_training - Step 26547: {'lr': 0.00022702616381167607, 'samples': 13592576, 'steps': 26547, 'batch_loss/train': 0.6938266381621361} 12/26/2021 11:11:40 - INFO - codeparrot_training - Step 26548: {'lr': 0.00022701046742212757, 'samples': 13593088, 'steps': 26548, 'batch_loss/train': 0.6176656854804605} 12/26/2021 11:11:50 - INFO - codeparrot_training - Step 26549: {'lr': 0.00022699477112397674, 'samples': 13593600, 'steps': 26549, 'batch_loss/train': 0.7595840459689498} 12/26/2021 11:12:02 - INFO - codeparrot_training - Step 26550: {'lr': 0.0002269790749172861, 'samples': 13594112, 'steps': 26550, 'batch_loss/train': 0.6899700262583792} 12/26/2021 11:12:13 - INFO - codeparrot_training - Step 26551: {'lr': 0.00022696337880211798, 'samples': 13594624, 'steps': 26551, 'batch_loss/train': 0.6756789352511987} 12/26/2021 11:12:24 - INFO - codeparrot_training - Step 26552: {'lr': 0.0002269476827785347, 'samples': 13595136, 'steps': 26552, 'batch_loss/train': 0.7096376288682222} 12/26/2021 11:12:38 - INFO - codeparrot_training - Step 26553: {'lr': 
0.00022693198684659865, 'samples': 13595648, 'steps': 26553, 'batch_loss/train': 0.695090712280944} 12/26/2021 11:12:48 - INFO - codeparrot_training - Step 26554: {'lr': 0.00022691629100637245, 'samples': 13596160, 'steps': 26554, 'batch_loss/train': 0.9113818516489118} 12/26/2021 11:12:59 - INFO - codeparrot_training - Step 26555: {'lr': 0.0002269005952579183, 'samples': 13596672, 'steps': 26555, 'batch_loss/train': 0.8460070549044758} 12/26/2021 11:13:09 - INFO - codeparrot_training - Step 26556: {'lr': 0.00022688489960129873, 'samples': 13597184, 'steps': 26556, 'batch_loss/train': 0.7651677643880248} 12/26/2021 11:13:21 - INFO - codeparrot_training - Step 26557: {'lr': 0.00022686920403657595, 'samples': 13597696, 'steps': 26557, 'batch_loss/train': 0.6803085539722815} 12/26/2021 11:13:32 - INFO - codeparrot_training - Step 26558: {'lr': 0.00022685350856381254, 'samples': 13598208, 'steps': 26558, 'batch_loss/train': 0.7868963205255568} 12/26/2021 11:13:43 - INFO - codeparrot_training - Step 26559: {'lr': 0.0002268378131830709, 'samples': 13598720, 'steps': 26559, 'batch_loss/train': 0.7616252226289362} 12/26/2021 11:13:55 - INFO - codeparrot_training - Step 26560: {'lr': 0.00022682211789441337, 'samples': 13599232, 'steps': 26560, 'batch_loss/train': 0.6687498043756932} 12/26/2021 11:14:05 - INFO - codeparrot_training - Step 26561: {'lr': 0.00022680642269790223, 'samples': 13599744, 'steps': 26561, 'batch_loss/train': 0.8808200443163514} 12/26/2021 11:14:16 - INFO - codeparrot_training - Step 26562: {'lr': 0.00022679072759360002, 'samples': 13600256, 'steps': 26562, 'batch_loss/train': 0.7626407607458532} 12/26/2021 11:14:30 - INFO - codeparrot_training - Step 26563: {'lr': 0.00022677503258156922, 'samples': 13600768, 'steps': 26563, 'batch_loss/train': 0.7620911984704435} 12/26/2021 11:14:40 - INFO - codeparrot_training - Step 26564: {'lr': 0.00022675933766187198, 'samples': 13601280, 'steps': 26564, 'batch_loss/train': 0.7069011814892292} 12/26/2021 11:14:51 
- INFO - codeparrot_training - Step 26565: {'lr': 0.0002267436428345709, 'samples': 13601792, 'steps': 26565, 'batch_loss/train': 0.6958777036052197} 12/26/2021 11:15:02 - INFO - codeparrot_training - Step 26566: {'lr': 0.0002267279480997284, 'samples': 13602304, 'steps': 26566, 'batch_loss/train': 0.7242936184629798} 12/26/2021 11:15:14 - INFO - codeparrot_training - Step 26567: {'lr': 0.0002267122534574067, 'samples': 13602816, 'steps': 26567, 'batch_loss/train': 0.8843832355923951} 12/26/2021 11:15:24 - INFO - codeparrot_training - Step 26568: {'lr': 0.00022669655890766825, 'samples': 13603328, 'steps': 26568, 'batch_loss/train': 0.6611906029284} 12/26/2021 11:15:35 - INFO - codeparrot_training - Step 26569: {'lr': 0.00022668086445057557, 'samples': 13603840, 'steps': 26569, 'batch_loss/train': 0.7161659366684034} 12/26/2021 11:15:49 - INFO - codeparrot_training - Step 26570: {'lr': 0.00022666517008619098, 'samples': 13604352, 'steps': 26570, 'batch_loss/train': 0.7489704107865691} 12/26/2021 11:16:00 - INFO - codeparrot_training - Step 26571: {'lr': 0.0002266494758145768, 'samples': 13604864, 'steps': 26571, 'batch_loss/train': 1.5984522392973304} 12/26/2021 11:16:10 - INFO - codeparrot_training - Step 26572: {'lr': 0.00022663378163579546, 'samples': 13605376, 'steps': 26572, 'batch_loss/train': 0.8257952490821481} 12/26/2021 11:16:22 - INFO - codeparrot_training - Step 26573: {'lr': 0.00022661808754990945, 'samples': 13605888, 'steps': 26573, 'batch_loss/train': 0.9459136473014951} 12/26/2021 11:16:33 - INFO - codeparrot_training - Step 26574: {'lr': 0.0002266023935569811, 'samples': 13606400, 'steps': 26574, 'batch_loss/train': 0.85934903845191} 12/26/2021 11:16:43 - INFO - codeparrot_training - Step 26575: {'lr': 0.00022658669965707268, 'samples': 13606912, 'steps': 26575, 'batch_loss/train': 0.7798464661464095} 12/26/2021 11:16:55 - INFO - codeparrot_training - Step 26576: {'lr': 0.00022657100585024684, 'samples': 13607424, 'steps': 26576, 
'batch_loss/train': 0.7788379229605198} 12/26/2021 11:17:06 - INFO - codeparrot_training - Step 26577: {'lr': 0.00022655531213656578, 'samples': 13607936, 'steps': 26577, 'batch_loss/train': 0.841097554191947} 12/26/2021 11:17:17 - INFO - codeparrot_training - Step 26578: {'lr': 0.00022653961851609202, 'samples': 13608448, 'steps': 26578, 'batch_loss/train': 0.7865645289421082} 12/26/2021 11:17:27 - INFO - codeparrot_training - Step 26579: {'lr': 0.00022652392498888772, 'samples': 13608960, 'steps': 26579, 'batch_loss/train': 0.7991254054941237} 12/26/2021 11:17:41 - INFO - codeparrot_training - Step 26580: {'lr': 0.00022650823155501558, 'samples': 13609472, 'steps': 26580, 'batch_loss/train': 0.7274698519613594} 12/26/2021 11:17:52 - INFO - codeparrot_training - Step 26581: {'lr': 0.00022649253821453778, 'samples': 13609984, 'steps': 26581, 'batch_loss/train': 0.8712316863238811} 12/26/2021 11:18:03 - INFO - codeparrot_training - Step 26582: {'lr': 0.0002264768449675168, 'samples': 13610496, 'steps': 26582, 'batch_loss/train': 0.7444622442126274} 12/26/2021 11:18:15 - INFO - codeparrot_training - Step 26583: {'lr': 0.0002264611518140149, 'samples': 13611008, 'steps': 26583, 'batch_loss/train': 0.8177956002764404} 12/26/2021 11:18:25 - INFO - codeparrot_training - Step 26584: {'lr': 0.00022644545875409467, 'samples': 13611520, 'steps': 26584, 'batch_loss/train': 0.807553025893867} 12/26/2021 11:18:36 - INFO - codeparrot_training - Step 26585: {'lr': 0.00022642976578781844, 'samples': 13612032, 'steps': 26585, 'batch_loss/train': 0.6557619000668637} 12/26/2021 11:18:48 - INFO - codeparrot_training - Step 26586: {'lr': 0.00022641407291524844, 'samples': 13612544, 'steps': 26586, 'batch_loss/train': 0.6432518749497831} 12/26/2021 11:18:59 - INFO - codeparrot_training - Step 26587: {'lr': 0.0002263983801364472, 'samples': 13613056, 'steps': 26587, 'batch_loss/train': 0.7612902484834194} 12/26/2021 11:19:09 - INFO - codeparrot_training - Step 26588: {'lr': 
0.0002263826874514772, 'samples': 13613568, 'steps': 26588, 'batch_loss/train': 0.6587547059170902} 12/26/2021 11:19:20 - INFO - codeparrot_training - Step 26589: {'lr': 0.0002263669948604007, 'samples': 13614080, 'steps': 26589, 'batch_loss/train': 0.7823539674282074} 12/26/2021 11:19:34 - INFO - codeparrot_training - Step 26590: {'lr': 0.00022635130236327996, 'samples': 13614592, 'steps': 26590, 'batch_loss/train': 0.7286267848103307} 12/26/2021 11:19:45 - INFO - codeparrot_training - Step 26591: {'lr': 0.00022633560996017764, 'samples': 13615104, 'steps': 26591, 'batch_loss/train': 0.7290628448245116} 12/26/2021 11:19:56 - INFO - codeparrot_training - Step 26592: {'lr': 0.00022631991765115602, 'samples': 13615616, 'steps': 26592, 'batch_loss/train': 0.6645289361476898} 12/26/2021 11:20:08 - INFO - codeparrot_training - Step 26593: {'lr': 0.00022630422543627737, 'samples': 13616128, 'steps': 26593, 'batch_loss/train': 0.7166994784492999} 12/26/2021 11:20:18 - INFO - codeparrot_training - Step 26594: {'lr': 0.00022628853331560416, 'samples': 13616640, 'steps': 26594, 'batch_loss/train': 0.7923867651261389} 12/26/2021 11:20:29 - INFO - codeparrot_training - Step 26595: {'lr': 0.0002262728412891989, 'samples': 13617152, 'steps': 26595, 'batch_loss/train': 0.8837126903235912} 12/26/2021 11:20:41 - INFO - codeparrot_training - Step 26596: {'lr': 0.00022625714935712386, 'samples': 13617664, 'steps': 26596, 'batch_loss/train': 0.6655160079244524} 12/26/2021 11:20:52 - INFO - codeparrot_training - Step 26597: {'lr': 0.00022624145751944135, 'samples': 13618176, 'steps': 26597, 'batch_loss/train': 0.7083052834495902} 12/26/2021 11:21:02 - INFO - codeparrot_training - Step 26598: {'lr': 0.0002262257657762138, 'samples': 13618688, 'steps': 26598, 'batch_loss/train': 0.8130045523867011} 12/26/2021 11:21:13 - INFO - codeparrot_training - Step 26599: {'lr': 0.00022621007412750374, 'samples': 13619200, 'steps': 26599, 'batch_loss/train': 0.718199753202498} 12/26/2021 11:21:27 - 
INFO - codeparrot_training - Step 26600: {'lr': 0.00022619438257337338, 'samples': 13619712, 'steps': 26600, 'batch_loss/train': 0.8561946467962116} 12/26/2021 11:21:38 - INFO - codeparrot_training - Step 26601: {'lr': 0.00022617869111388512, 'samples': 13620224, 'steps': 26601, 'batch_loss/train': 0.6388814570382237} 12/26/2021 11:21:48 - INFO - codeparrot_training - Step 26602: {'lr': 0.00022616299974910148, 'samples': 13620736, 'steps': 26602, 'batch_loss/train': 0.8357412731274962} 12/26/2021 11:22:00 - INFO - codeparrot_training - Step 26603: {'lr': 0.00022614730847908473, 'samples': 13621248, 'steps': 26603, 'batch_loss/train': 0.746918854303658} 12/26/2021 11:22:11 - INFO - codeparrot_training - Step 26604: {'lr': 0.0002261316173038973, 'samples': 13621760, 'steps': 26604, 'batch_loss/train': 0.7787495879456401} 12/26/2021 11:22:21 - INFO - codeparrot_training - Step 26605: {'lr': 0.00022611592622360144, 'samples': 13622272, 'steps': 26605, 'batch_loss/train': 0.7462692093104124} 12/26/2021 11:22:33 - INFO - codeparrot_training - Step 26606: {'lr': 0.00022610023523825977, 'samples': 13622784, 'steps': 26606, 'batch_loss/train': 0.84342792769894} 12/26/2021 11:22:44 - INFO - codeparrot_training - Step 26607: {'lr': 0.00022608454434793447, 'samples': 13623296, 'steps': 26607, 'batch_loss/train': 0.7468075631186366} 12/26/2021 11:22:55 - INFO - codeparrot_training - Step 26608: {'lr': 0.00022606885355268806, 'samples': 13623808, 'steps': 26608, 'batch_loss/train': 0.6925795243587345} 12/26/2021 11:23:05 - INFO - codeparrot_training - Step 26609: {'lr': 0.0002260531628525827, 'samples': 13624320, 'steps': 26609, 'batch_loss/train': 0.7717750249430537} 12/26/2021 11:23:19 - INFO - codeparrot_training - Step 26610: {'lr': 0.00022603747224768104, 'samples': 13624832, 'steps': 26610, 'batch_loss/train': 0.7392938584089279} 12/26/2021 11:23:30 - INFO - codeparrot_training - Step 26611: {'lr': 0.00022602178173804536, 'samples': 13625344, 'steps': 26611, 
'batch_loss/train': 0.8145723892375827} 12/26/2021 11:23:41 - INFO - codeparrot_training - Step 26612: {'lr': 0.0002260060913237379, 'samples': 13625856, 'steps': 26612, 'batch_loss/train': 0.7459759095218033} 12/26/2021 11:23:53 - INFO - codeparrot_training - Step 26613: {'lr': 0.00022599040100482128, 'samples': 13626368, 'steps': 26613, 'batch_loss/train': 0.7151271470356733} 12/26/2021 11:24:04 - INFO - codeparrot_training - Step 26614: {'lr': 0.00022597471078135772, 'samples': 13626880, 'steps': 26614, 'batch_loss/train': 0.4973503132350743} 12/26/2021 11:24:14 - INFO - codeparrot_training - Step 26615: {'lr': 0.0002259590206534097, 'samples': 13627392, 'steps': 26615, 'batch_loss/train': 0.6818685787147842} 12/26/2021 11:24:26 - INFO - codeparrot_training - Step 26616: {'lr': 0.0002259433306210394, 'samples': 13627904, 'steps': 26616, 'batch_loss/train': 0.6848952546715736} 12/26/2021 11:24:37 - INFO - codeparrot_training - Step 26617: {'lr': 0.00022592764068430943, 'samples': 13628416, 'steps': 26617, 'batch_loss/train': 0.8665241161361337} 12/26/2021 11:24:47 - INFO - codeparrot_training - Step 26618: {'lr': 0.0002259119508432821, 'samples': 13628928, 'steps': 26618, 'batch_loss/train': 0.7802820652723312} 12/26/2021 11:24:58 - INFO - codeparrot_training - Step 26619: {'lr': 0.0002258962610980197, 'samples': 13629440, 'steps': 26619, 'batch_loss/train': 0.8334549386054277} 12/26/2021 11:25:12 - INFO - codeparrot_training - Step 26620: {'lr': 0.00022588057144858463, 'samples': 13629952, 'steps': 26620, 'batch_loss/train': 0.6809018300846219} 12/26/2021 11:25:23 - INFO - codeparrot_training - Step 26621: {'lr': 0.00022586488189503938, 'samples': 13630464, 'steps': 26621, 'batch_loss/train': 0.7751767290756106} 12/26/2021 11:25:33 - INFO - codeparrot_training - Step 26622: {'lr': 0.0002258491924374462, 'samples': 13630976, 'steps': 26622, 'batch_loss/train': 0.8207745882682502} 12/26/2021 11:25:45 - INFO - codeparrot_training - Step 26623: {'lr': 
0.00022583350307586747, 'samples': 13631488, 'steps': 26623, 'batch_loss/train': 0.7057975311763585} 12/26/2021 11:25:56 - INFO - codeparrot_training - Step 26624: {'lr': 0.0002258178138103657, 'samples': 13632000, 'steps': 26624, 'batch_loss/train': 0.8451768991071731} 12/26/2021 11:26:07 - INFO - codeparrot_training - Step 26625: {'lr': 0.00022580212464100318, 'samples': 13632512, 'steps': 26625, 'batch_loss/train': 0.8224200764670968} 12/26/2021 11:26:19 - INFO - codeparrot_training - Step 26626: {'lr': 0.00022578643556784222, 'samples': 13633024, 'steps': 26626, 'batch_loss/train': 0.638866692665033} 12/26/2021 11:26:29 - INFO - codeparrot_training - Step 26627: {'lr': 0.0002257707465909452, 'samples': 13633536, 'steps': 26627, 'batch_loss/train': 0.8850031257607043} 12/26/2021 11:26:40 - INFO - codeparrot_training - Step 26628: {'lr': 0.00022575505771037466, 'samples': 13634048, 'steps': 26628, 'batch_loss/train': 0.8070859452709556} 12/26/2021 11:26:51 - INFO - codeparrot_training - Step 26629: {'lr': 0.0002257393689261928, 'samples': 13634560, 'steps': 26629, 'batch_loss/train': 0.8175071962177753} 12/26/2021 11:27:03 - INFO - codeparrot_training - Step 26630: {'lr': 0.00022572368023846207, 'samples': 13635072, 'steps': 26630, 'batch_loss/train': 1.5824670451693237} 12/26/2021 11:27:13 - INFO - codeparrot_training - Step 26631: {'lr': 0.00022570799164724472, 'samples': 13635584, 'steps': 26631, 'batch_loss/train': 0.595354356628377} 12/26/2021 11:27:24 - INFO - codeparrot_training - Step 26632: {'lr': 0.0002256923031526033, 'samples': 13636096, 'steps': 26632, 'batch_loss/train': 0.8010539840906858} 12/26/2021 11:27:38 - INFO - codeparrot_training - Step 26633: {'lr': 0.00022567661475460016, 'samples': 13636608, 'steps': 26633, 'batch_loss/train': 0.7642057281918824} 12/26/2021 11:27:49 - INFO - codeparrot_training - Step 26634: {'lr': 0.0002256609264532976, 'samples': 13637120, 'steps': 26634, 'batch_loss/train': 0.7622585487551987} 12/26/2021 11:27:59 - 
INFO - codeparrot_training - Step 26635: {'lr': 0.0002256452382487579, 'samples': 13637632, 'steps': 26635, 'batch_loss/train': 0.7689282284118235} 12/26/2021 11:28:11 - INFO - codeparrot_training - Step 26636: {'lr': 0.00022562955014104362, 'samples': 13638144, 'steps': 26636, 'batch_loss/train': 0.8065306320786476} 12/26/2021 11:28:22 - INFO - codeparrot_training - Step 26637: {'lr': 0.00022561386213021707, 'samples': 13638656, 'steps': 26637, 'batch_loss/train': 0.829422214999795} 12/26/2021 11:28:32 - INFO - codeparrot_training - Step 26638: {'lr': 0.0002255981742163405, 'samples': 13639168, 'steps': 26638, 'batch_loss/train': 0.8035866566933692} 12/26/2021 11:28:43 - INFO - codeparrot_training - Step 26639: {'lr': 0.0002255824863994764, 'samples': 13639680, 'steps': 26639, 'batch_loss/train': 0.7132150302059017} 12/26/2021 11:28:58 - INFO - codeparrot_training - Step 26640: {'lr': 0.0002255667986796872, 'samples': 13640192, 'steps': 26640, 'batch_loss/train': 0.7334228330291808} 12/26/2021 11:29:08 - INFO - codeparrot_training - Step 26641: {'lr': 0.00022555111105703517, 'samples': 13640704, 'steps': 26641, 'batch_loss/train': 0.6450730999931693} 12/26/2021 11:29:19 - INFO - codeparrot_training - Step 26642: {'lr': 0.0002255354235315826, 'samples': 13641216, 'steps': 26642, 'batch_loss/train': 0.7570244250819087} 12/26/2021 11:29:31 - INFO - codeparrot_training - Step 26643: {'lr': 0.00022551973610339198, 'samples': 13641728, 'steps': 26643, 'batch_loss/train': 0.6819639666937292} 12/26/2021 11:29:42 - INFO - codeparrot_training - Step 26644: {'lr': 0.00022550404877252572, 'samples': 13642240, 'steps': 26644, 'batch_loss/train': 0.7604140918701887} 12/26/2021 11:29:52 - INFO - codeparrot_training - Step 26645: {'lr': 0.00022548836153904605, 'samples': 13642752, 'steps': 26645, 'batch_loss/train': 0.7297080112621188} 12/26/2021 11:30:04 - INFO - codeparrot_training - Step 26646: {'lr': 0.00022547267440301534, 'samples': 13643264, 'steps': 26646, 
'batch_loss/train': 0.7699466068297625} 12/26/2021 11:30:15 - INFO - codeparrot_training - Step 26647: {'lr': 0.00022545698736449612, 'samples': 13643776, 'steps': 26647, 'batch_loss/train': 0.7487853011116385} 12/26/2021 11:30:26 - INFO - codeparrot_training - Step 26648: {'lr': 0.00022544130042355062, 'samples': 13644288, 'steps': 26648, 'batch_loss/train': 0.8196489382535219} 12/26/2021 11:30:36 - INFO - codeparrot_training - Step 26649: {'lr': 0.00022542561358024116, 'samples': 13644800, 'steps': 26649, 'batch_loss/train': 0.7440110882744193} 12/26/2021 11:30:50 - INFO - codeparrot_training - Step 26650: {'lr': 0.0002254099268346303, 'samples': 13645312, 'steps': 26650, 'batch_loss/train': 0.6838004253804684} 12/26/2021 11:31:01 - INFO - codeparrot_training - Step 26651: {'lr': 0.00022539424018678029, 'samples': 13645824, 'steps': 26651, 'batch_loss/train': 0.9201954072341323} 12/26/2021 11:31:12 - INFO - codeparrot_training - Step 26652: {'lr': 0.00022537855363675342, 'samples': 13646336, 'steps': 26652, 'batch_loss/train': 0.8465639613568783} 12/26/2021 11:31:24 - INFO - codeparrot_training - Step 26653: {'lr': 0.00022536286718461207, 'samples': 13646848, 'steps': 26653, 'batch_loss/train': 0.7024266084190458} 12/26/2021 11:31:34 - INFO - codeparrot_training - Step 26654: {'lr': 0.00022534718083041876, 'samples': 13647360, 'steps': 26654, 'batch_loss/train': 0.7138100657612085} 12/26/2021 11:31:45 - INFO - codeparrot_training - Step 26655: {'lr': 0.00022533149457423573, 'samples': 13647872, 'steps': 26655, 'batch_loss/train': 0.7176647423766553} 12/26/2021 11:31:59 - INFO - codeparrot_training - Step 26656: {'lr': 0.00022531580841612543, 'samples': 13648384, 'steps': 26656, 'batch_loss/train': 0.7364725302322768} 12/26/2021 11:32:09 - INFO - codeparrot_training - Step 26657: {'lr': 0.00022530012235615, 'samples': 13648896, 'steps': 26657, 'batch_loss/train': 0.7039200107101351} 12/26/2021 11:32:20 - INFO - codeparrot_training - Step 26658: {'lr': 
0.00022528443639437202, 'samples': 13649408, 'steps': 26658, 'batch_loss/train': 0.7581022684462368} 12/26/2021 11:32:32 - INFO - codeparrot_training - Step 26659: {'lr': 0.00022526875053085387, 'samples': 13649920, 'steps': 26659, 'batch_loss/train': 0.7669555004686117} 12/26/2021 11:32:43 - INFO - codeparrot_training - Step 26660: {'lr': 0.00022525306476565767, 'samples': 13650432, 'steps': 26660, 'batch_loss/train': 0.9768393528647721} 12/26/2021 11:32:53 - INFO - codeparrot_training - Step 26661: {'lr': 0.00022523737909884608, 'samples': 13650944, 'steps': 26661, 'batch_loss/train': 0.7221408439800143} 12/26/2021 11:33:04 - INFO - codeparrot_training - Step 26662: {'lr': 0.00022522169353048128, 'samples': 13651456, 'steps': 26662, 'batch_loss/train': 0.6863712761551142} 12/26/2021 11:33:16 - INFO - codeparrot_training - Step 26663: {'lr': 0.00022520600806062572, 'samples': 13651968, 'steps': 26663, 'batch_loss/train': 0.7765477066859603} 12/26/2021 11:33:27 - INFO - codeparrot_training - Step 26664: {'lr': 0.0002251903226893416, 'samples': 13652480, 'steps': 26664, 'batch_loss/train': 0.7658288162201643} 12/26/2021 11:33:37 - INFO - codeparrot_training - Step 26665: {'lr': 0.00022517463741669142, 'samples': 13652992, 'steps': 26665, 'batch_loss/train': 0.681660661008209} 12/26/2021 11:33:50 - INFO - codeparrot_training - Step 26666: {'lr': 0.00022515895224273758, 'samples': 13653504, 'steps': 26666, 'batch_loss/train': 0.8009716200176626} 12/26/2021 11:34:00 - INFO - codeparrot_training - Step 26667: {'lr': 0.0002251432671675423, 'samples': 13654016, 'steps': 26667, 'batch_loss/train': 0.7883411543443799} 12/26/2021 11:34:11 - INFO - codeparrot_training - Step 26668: {'lr': 0.00022512758219116796, 'samples': 13654528, 'steps': 26668, 'batch_loss/train': 0.7048510778695345} 12/26/2021 11:34:25 - INFO - codeparrot_training - Step 26669: {'lr': 0.00022511189731367706, 'samples': 13655040, 'steps': 26669, 'batch_loss/train': 0.804938163375482} 12/26/2021 11:34:36 - 
INFO - codeparrot_training - Step 26670: {'lr': 0.00022509621253513187, 'samples': 13655552, 'steps': 26670, 'batch_loss/train': 0.6188855724176392} 12/26/2021 11:34:47 - INFO - codeparrot_training - Step 26671: {'lr': 0.0002250805278555947, 'samples': 13656064, 'steps': 26671, 'batch_loss/train': 0.8471671580336988} 12/26/2021 11:34:57 - INFO - codeparrot_training - Step 26672: {'lr': 0.00022506484327512787, 'samples': 13656576, 'steps': 26672, 'batch_loss/train': 0.8000881494954228} 12/26/2021 11:35:10 - INFO - codeparrot_training - Step 26673: {'lr': 0.00022504915879379393, 'samples': 13657088, 'steps': 26673, 'batch_loss/train': 0.7564933103858493} 12/26/2021 11:35:20 - INFO - codeparrot_training - Step 26674: {'lr': 0.00022503347441165503, 'samples': 13657600, 'steps': 26674, 'batch_loss/train': 0.6293267777946312} 12/26/2021 11:35:31 - INFO - codeparrot_training - Step 26675: {'lr': 0.00022501779012877357, 'samples': 13658112, 'steps': 26675, 'batch_loss/train': 0.7573028784245253} 12/26/2021 11:35:43 - INFO - codeparrot_training - Step 26676: {'lr': 0.00022500210594521205, 'samples': 13658624, 'steps': 26676, 'batch_loss/train': 0.779009803198278} 12/26/2021 11:35:53 - INFO - codeparrot_training - Step 26677: {'lr': 0.00022498642186103267, 'samples': 13659136, 'steps': 26677, 'batch_loss/train': 0.8309233764884993} 12/26/2021 11:36:04 - INFO - codeparrot_training - Step 26678: {'lr': 0.0002249707378762979, 'samples': 13659648, 'steps': 26678, 'batch_loss/train': 0.7156615585554391} 12/26/2021 11:36:18 - INFO - codeparrot_training - Step 26679: {'lr': 0.00022495505399106985, 'samples': 13660160, 'steps': 26679, 'batch_loss/train': 0.8159506050869823} 12/26/2021 11:36:29 - INFO - codeparrot_training - Step 26680: {'lr': 0.00022493937020541123, 'samples': 13660672, 'steps': 26680, 'batch_loss/train': 1.4519646824337542} 12/26/2021 11:36:39 - INFO - codeparrot_training - Step 26681: {'lr': 0.00022492368651938415, 'samples': 13661184, 'steps': 26681, 
'batch_loss/train': 0.5447830036282539} 12/26/2021 11:36:50 - INFO - codeparrot_training - Step 26682: {'lr': 0.00022490800293305105, 'samples': 13661696, 'steps': 26682, 'batch_loss/train': 0.7201319967862219} 12/26/2021 11:37:02 - INFO - codeparrot_training - Step 26683: {'lr': 0.00022489231944647416, 'samples': 13662208, 'steps': 26683, 'batch_loss/train': 0.7191772379446775} 12/26/2021 11:37:13 - INFO - codeparrot_training - Step 26684: {'lr': 0.00022487663605971598, 'samples': 13662720, 'steps': 26684, 'batch_loss/train': 0.6983474022708833} 12/26/2021 11:37:23 - INFO - codeparrot_training - Step 26685: {'lr': 0.00022486095277283885, 'samples': 13663232, 'steps': 26685, 'batch_loss/train': 0.7209075298160315} 12/26/2021 11:37:36 - INFO - codeparrot_training - Step 26686: {'lr': 0.00022484526958590497, 'samples': 13663744, 'steps': 26686, 'batch_loss/train': 0.7649381848168559} 12/26/2021 11:37:46 - INFO - codeparrot_training - Step 26687: {'lr': 0.00022482958649897695, 'samples': 13664256, 'steps': 26687, 'batch_loss/train': 0.7653354583308101} 12/26/2021 11:37:57 - INFO - codeparrot_training - Step 26688: {'lr': 0.00022481390351211688, 'samples': 13664768, 'steps': 26688, 'batch_loss/train': 0.7406483432278037} 12/26/2021 11:38:11 - INFO - codeparrot_training - Step 26689: {'lr': 0.00022479822062538733, 'samples': 13665280, 'steps': 26689, 'batch_loss/train': 0.5849062938941643} 12/26/2021 11:38:22 - INFO - codeparrot_training - Step 26690: {'lr': 0.0002247825378388504, 'samples': 13665792, 'steps': 26690, 'batch_loss/train': 0.8757971944287419} 12/26/2021 11:38:32 - INFO - codeparrot_training - Step 26691: {'lr': 0.00022476685515256863, 'samples': 13666304, 'steps': 26691, 'batch_loss/train': 0.6977055622264743} 12/26/2021 11:38:44 - INFO - codeparrot_training - Step 26692: {'lr': 0.00022475117256660437, 'samples': 13666816, 'steps': 26692, 'batch_loss/train': 0.700093568302691} 12/26/2021 11:38:55 - INFO - codeparrot_training - Step 26693: {'lr': 
0.00022473549008101987, 'samples': 13667328, 'steps': 26693, 'batch_loss/train': 0.7777432007715106} 12/26/2021 11:39:05 - INFO - codeparrot_training - Step 26694: {'lr': 0.00022471980769587745, 'samples': 13667840, 'steps': 26694, 'batch_loss/train': 0.7593953646719456} 12/26/2021 11:39:16 - INFO - codeparrot_training - Step 26695: {'lr': 0.00022470412541123966, 'samples': 13668352, 'steps': 26695, 'batch_loss/train': 0.7959694918245077} 12/26/2021 11:39:30 - INFO - codeparrot_training - Step 26696: {'lr': 0.0002246884432271687, 'samples': 13668864, 'steps': 26696, 'batch_loss/train': 0.9149260297417641} 12/26/2021 11:39:41 - INFO - codeparrot_training - Step 26697: {'lr': 0.00022467276114372683, 'samples': 13669376, 'steps': 26697, 'batch_loss/train': 0.8135582394897938} 12/26/2021 11:39:51 - INFO - codeparrot_training - Step 26698: {'lr': 0.0002246570791609766, 'samples': 13669888, 'steps': 26698, 'batch_loss/train': 0.7809630110859871} 12/26/2021 11:40:03 - INFO - codeparrot_training - Step 26699: {'lr': 0.00022464139727898023, 'samples': 13670400, 'steps': 26699, 'batch_loss/train': 0.7451390996575356} 12/26/2021 11:40:14 - INFO - codeparrot_training - Step 26700: {'lr': 0.00022462571549780006, 'samples': 13670912, 'steps': 26700, 'batch_loss/train': 0.7020670101046562} 12/26/2021 11:40:25 - INFO - codeparrot_training - Step 26701: {'lr': 0.00022461003381749842, 'samples': 13671424, 'steps': 26701, 'batch_loss/train': 0.7398744001984596} 12/26/2021 11:40:37 - INFO - codeparrot_training - Step 26702: {'lr': 0.00022459435223813783, 'samples': 13671936, 'steps': 26702, 'batch_loss/train': 0.6029589595273137} 12/26/2021 11:40:47 - INFO - codeparrot_training - Step 26703: {'lr': 0.0002245786707597804, 'samples': 13672448, 'steps': 26703, 'batch_loss/train': 0.8532277485355735} 12/26/2021 11:40:58 - INFO - codeparrot_training - Step 26704: {'lr': 0.00022456298938248865, 'samples': 13672960, 'steps': 26704, 'batch_loss/train': 0.8129167756997049} 12/26/2021 11:41:09 
- INFO - codeparrot_training - Step 26705: {'lr': 0.00022454730810632476, 'samples': 13673472, 'steps': 26705, 'batch_loss/train': 0.7387951135169715} 12/26/2021 11:41:21 - INFO - codeparrot_training - Step 26706: {'lr': 0.00022453162693135124, 'samples': 13673984, 'steps': 26706, 'batch_loss/train': 0.7568303346633911} 12/26/2021 11:41:31 - INFO - codeparrot_training - Step 26707: {'lr': 0.00022451594585763034, 'samples': 13674496, 'steps': 26707, 'batch_loss/train': 0.8917929581366479} 12/26/2021 11:41:42 - INFO - codeparrot_training - Step 26708: {'lr': 0.00022450026488522444, 'samples': 13675008, 'steps': 26708, 'batch_loss/train': 0.769305182620883} 12/26/2021 11:41:56 - INFO - codeparrot_training - Step 26709: {'lr': 0.00022448458401419576, 'samples': 13675520, 'steps': 26709, 'batch_loss/train': 0.9500228678807616} 12/26/2021 11:42:06 - INFO - codeparrot_training - Step 26710: {'lr': 0.0002244689032446068, 'samples': 13676032, 'steps': 26710, 'batch_loss/train': 0.6765034454874694} 12/26/2021 11:42:17 - INFO - codeparrot_training - Step 26711: {'lr': 0.00022445322257651988, 'samples': 13676544, 'steps': 26711, 'batch_loss/train': 0.8417146757710725} 12/26/2021 11:42:29 - INFO - codeparrot_training - Step 26712: {'lr': 0.00022443754200999718, 'samples': 13677056, 'steps': 26712, 'batch_loss/train': 0.7223571226932108} 12/26/2021 11:42:40 - INFO - codeparrot_training - Step 26713: {'lr': 0.00022442186154510125, 'samples': 13677568, 'steps': 26713, 'batch_loss/train': 0.8219403270632029} 12/26/2021 11:42:50 - INFO - codeparrot_training - Step 26714: {'lr': 0.00022440618118189438, 'samples': 13678080, 'steps': 26714, 'batch_loss/train': 0.7166420973371714} 12/26/2021 11:43:02 - INFO - codeparrot_training - Step 26715: {'lr': 0.00022439050092043887, 'samples': 13678592, 'steps': 26715, 'batch_loss/train': 0.7734335358254611} 12/26/2021 11:43:13 - INFO - codeparrot_training - Step 26716: {'lr': 0.00022437482076079692, 'samples': 13679104, 'steps': 26716, 
'batch_loss/train': 0.8076036614365876} 12/26/2021 11:43:24 - INFO - codeparrot_training - Step 26717: {'lr': 0.0002243591407030311, 'samples': 13679616, 'steps': 26717, 'batch_loss/train': 0.7045322360936552} 12/26/2021 11:43:34 - INFO - codeparrot_training - Step 26718: {'lr': 0.00022434346074720369, 'samples': 13680128, 'steps': 26718, 'batch_loss/train': 0.675122354645282} 12/26/2021 11:43:48 - INFO - codeparrot_training - Step 26719: {'lr': 0.00022432778089337698, 'samples': 13680640, 'steps': 26719, 'batch_loss/train': 0.7730833645910025} 12/26/2021 11:43:59 - INFO - codeparrot_training - Step 26720: {'lr': 0.0002243121011416132, 'samples': 13681152, 'steps': 26720, 'batch_loss/train': 0.7504207687452435} 12/26/2021 11:44:09 - INFO - codeparrot_training - Step 26721: {'lr': 0.00022429642149197497, 'samples': 13681664, 'steps': 26721, 'batch_loss/train': 0.7857560955453664} 12/26/2021 11:44:21 - INFO - codeparrot_training - Step 26722: {'lr': 0.00022428074194452436, 'samples': 13682176, 'steps': 26722, 'batch_loss/train': 0.7278433381579816} 12/26/2021 11:44:32 - INFO - codeparrot_training - Step 26723: {'lr': 0.0002242650624993238, 'samples': 13682688, 'steps': 26723, 'batch_loss/train': 0.7392128771170974} 12/26/2021 11:44:43 - INFO - codeparrot_training - Step 26724: {'lr': 0.00022424938315643574, 'samples': 13683200, 'steps': 26724, 'batch_loss/train': 0.7947757253423333} 12/26/2021 11:44:56 - INFO - codeparrot_training - Step 26725: {'lr': 0.00022423370391592238, 'samples': 13683712, 'steps': 26725, 'batch_loss/train': 0.698815044015646} 12/26/2021 11:45:07 - INFO - codeparrot_training - Step 26726: {'lr': 0.00022421802477784602, 'samples': 13684224, 'steps': 26726, 'batch_loss/train': 0.7945384818594903} 12/26/2021 11:45:17 - INFO - codeparrot_training - Step 26727: {'lr': 0.00022420234574226902, 'samples': 13684736, 'steps': 26727, 'batch_loss/train': 0.672198268584907} 12/26/2021 11:45:28 - INFO - codeparrot_training - Step 26728: {'lr': 
0.0002241866668092539, 'samples': 13685248, 'steps': 26728, 'batch_loss/train': 0.7528452281258069} 12/26/2021 11:45:40 - INFO - codeparrot_training - Step 26729: {'lr': 0.00022417098797886277, 'samples': 13685760, 'steps': 26729, 'batch_loss/train': 0.8097270308062434} 12/26/2021 11:45:51 - INFO - codeparrot_training - Step 26730: {'lr': 0.00022415530925115807, 'samples': 13686272, 'steps': 26730, 'batch_loss/train': 0.6633328453171998} 12/26/2021 11:46:01 - INFO - codeparrot_training - Step 26731: {'lr': 0.000224139630626202, 'samples': 13686784, 'steps': 26731, 'batch_loss/train': 0.6709648980759084} 12/26/2021 11:46:13 - INFO - codeparrot_training - Step 26732: {'lr': 0.00022412395210405712, 'samples': 13687296, 'steps': 26732, 'batch_loss/train': 0.574543856899254} 12/26/2021 11:46:24 - INFO - codeparrot_training - Step 26733: {'lr': 0.00022410827368478557, 'samples': 13687808, 'steps': 26733, 'batch_loss/train': 0.711560832336545} 12/26/2021 11:46:35 - INFO - codeparrot_training - Step 26734: {'lr': 0.00022409259536844967, 'samples': 13688320, 'steps': 26734, 'batch_loss/train': 0.6947666211053729} 12/26/2021 11:46:48 - INFO - codeparrot_training - Step 26735: {'lr': 0.00022407691715511198, 'samples': 13688832, 'steps': 26735, 'batch_loss/train': 0.8241451736539602} 12/26/2021 11:46:59 - INFO - codeparrot_training - Step 26736: {'lr': 0.0002240612390448346, 'samples': 13689344, 'steps': 26736, 'batch_loss/train': 0.8094647875986993} 12/26/2021 11:47:10 - INFO - codeparrot_training - Step 26737: {'lr': 0.00022404556103768002, 'samples': 13689856, 'steps': 26737, 'batch_loss/train': 0.728834742680192} 12/26/2021 11:47:20 - INFO - codeparrot_training - Step 26738: {'lr': 0.00022402988313371033, 'samples': 13690368, 'steps': 26738, 'batch_loss/train': 0.7604713602922857} 12/26/2021 11:47:32 - INFO - codeparrot_training - Step 26739: {'lr': 0.0002240142053329881, 'samples': 13690880, 'steps': 26739, 'batch_loss/train': 0.672586562926881} 12/26/2021 11:47:43 - INFO 
- codeparrot_training - Step 26740: {'lr': 0.00022399852763557566, 'samples': 13691392, 'steps': 26740, 'batch_loss/train': 0.8304130518808961} 12/26/2021 11:47:54 - INFO - codeparrot_training - Step 26741: {'lr': 0.00022398285004153525, 'samples': 13691904, 'steps': 26741, 'batch_loss/train': 0.7769549109507352} 12/26/2021 11:48:06 - INFO - codeparrot_training - Step 26742: {'lr': 0.00022396717255092905, 'samples': 13692416, 'steps': 26742, 'batch_loss/train': 0.6626744035165757} 12/26/2021 11:48:17 - INFO - codeparrot_training - Step 26743: {'lr': 0.00022395149516381963, 'samples': 13692928, 'steps': 26743, 'batch_loss/train': 0.46685352141503245} 12/26/2021 11:48:27 - INFO - codeparrot_training - Step 26744: {'lr': 0.0002239358178802693, 'samples': 13693440, 'steps': 26744, 'batch_loss/train': 0.7280251104384661} 12/26/2021 11:48:40 - INFO - codeparrot_training - Step 26745: {'lr': 0.00022392014070034022, 'samples': 13693952, 'steps': 26745, 'batch_loss/train': 0.7345063565298915} 12/26/2021 11:48:50 - INFO - codeparrot_training - Step 26746: {'lr': 0.00022390446362409477, 'samples': 13694464, 'steps': 26746, 'batch_loss/train': 0.9260724182240665} 12/26/2021 11:49:01 - INFO - codeparrot_training - Step 26747: {'lr': 0.00022388878665159545, 'samples': 13694976, 'steps': 26747, 'batch_loss/train': 0.8287439672276378} 12/26/2021 11:49:11 - INFO - codeparrot_training - Step 26748: {'lr': 0.00022387310978290437, 'samples': 13695488, 'steps': 26748, 'batch_loss/train': 0.6539232429349795} 12/26/2021 11:49:26 - INFO - codeparrot_training - Step 26749: {'lr': 0.00022385743301808392, 'samples': 13696000, 'steps': 26749, 'batch_loss/train': 0.7139667572919279} 12/26/2021 11:49:36 - INFO - codeparrot_training - Step 26750: {'lr': 0.00022384175635719653, 'samples': 13696512, 'steps': 26750, 'batch_loss/train': 1.0195169858634472} 12/26/2021 11:49:47 - INFO - codeparrot_training - Step 26751: {'lr': 0.00022382607980030446, 'samples': 13697024, 'steps': 26751, 
'batch_loss/train': 0.7679197704419494} 12/26/2021 11:49:59 - INFO - codeparrot_training - Step 26752: {'lr': 0.00022381040334746994, 'samples': 13697536, 'steps': 26752, 'batch_loss/train': 0.8465987355448306} 12/26/2021 11:50:09 - INFO - codeparrot_training - Step 26753: {'lr': 0.00022379472699875534, 'samples': 13698048, 'steps': 26753, 'batch_loss/train': 0.7641827864572406} 12/26/2021 11:50:20 - INFO - codeparrot_training - Step 26754: {'lr': 0.00022377905075422308, 'samples': 13698560, 'steps': 26754, 'batch_loss/train': 0.813064175657928} 12/26/2021 11:50:34 - INFO - codeparrot_training - Step 26755: {'lr': 0.0002237633746139354, 'samples': 13699072, 'steps': 26755, 'batch_loss/train': 0.7422439004294574} 12/26/2021 11:50:45 - INFO - codeparrot_training - Step 26756: {'lr': 0.00022374769857795465, 'samples': 13699584, 'steps': 26756, 'batch_loss/train': 0.7358318055048585} 12/26/2021 11:50:55 - INFO - codeparrot_training - Step 26757: {'lr': 0.00022373202264634306, 'samples': 13700096, 'steps': 26757, 'batch_loss/train': 0.6917812356259674} 12/26/2021 11:51:06 - INFO - codeparrot_training - Step 26758: {'lr': 0.00022371634681916307, 'samples': 13700608, 'steps': 26758, 'batch_loss/train': 0.7568380394950509} 12/26/2021 11:51:18 - INFO - codeparrot_training - Step 26759: {'lr': 0.00022370067109647702, 'samples': 13701120, 'steps': 26759, 'batch_loss/train': 0.7143628867343068} 12/26/2021 11:51:28 - INFO - codeparrot_training - Step 26760: {'lr': 0.00022368499547834706, 'samples': 13701632, 'steps': 26760, 'batch_loss/train': 0.8075491851195693} 12/26/2021 11:51:39 - INFO - codeparrot_training - Step 26761: {'lr': 0.00022366931996483573, 'samples': 13702144, 'steps': 26761, 'batch_loss/train': 0.7242178930900991} 12/26/2021 11:51:51 - INFO - codeparrot_training - Step 26762: {'lr': 0.00022365364455600518, 'samples': 13702656, 'steps': 26762, 'batch_loss/train': 0.8126704003661871} 12/26/2021 11:52:02 - INFO - codeparrot_training - Step 26763: {'lr': 
0.00022363796925191788, 'samples': 13703168, 'steps': 26763, 'batch_loss/train': 0.8090120700653642} 12/26/2021 11:52:12 - INFO - codeparrot_training - Step 26764: {'lr': 0.0002236222940526359, 'samples': 13703680, 'steps': 26764, 'batch_loss/train': 0.7950878329575062} 12/26/2021 11:52:26 - INFO - codeparrot_training - Step 26765: {'lr': 0.0002236066189582218, 'samples': 13704192, 'steps': 26765, 'batch_loss/train': 0.7440197328105569} 12/26/2021 11:52:36 - INFO - codeparrot_training - Step 26766: {'lr': 0.00022359094396873789, 'samples': 13704704, 'steps': 26766, 'batch_loss/train': 0.8267139233648777} 12/26/2021 11:52:47 - INFO - codeparrot_training - Step 26767: {'lr': 0.00022357526908424634, 'samples': 13705216, 'steps': 26767, 'batch_loss/train': 0.7268555879127234} 12/26/2021 11:53:00 - INFO - codeparrot_training - Step 26768: {'lr': 0.00022355959430480954, 'samples': 13705728, 'steps': 26768, 'batch_loss/train': 0.7351593929342926} 12/26/2021 11:53:11 - INFO - codeparrot_training - Step 26769: {'lr': 0.0002235439196304898, 'samples': 13706240, 'steps': 26769, 'batch_loss/train': 0.6706378739327192} 12/26/2021 11:53:21 - INFO - codeparrot_training - Step 26770: {'lr': 0.00022352824506134951, 'samples': 13706752, 'steps': 26770, 'batch_loss/train': 0.9205794259905815} 12/26/2021 11:53:32 - INFO - codeparrot_training - Step 26771: {'lr': 0.00022351257059745084, 'samples': 13707264, 'steps': 26771, 'batch_loss/train': 0.7534484402276576} 12/26/2021 11:53:44 - INFO - codeparrot_training - Step 26772: {'lr': 0.00022349689623885622, 'samples': 13707776, 'steps': 26772, 'batch_loss/train': 0.8009936893358827} 12/26/2021 11:53:54 - INFO - codeparrot_training - Step 26773: {'lr': 0.00022348122198562797, 'samples': 13708288, 'steps': 26773, 'batch_loss/train': 0.8281614175066352} 12/26/2021 11:54:05 - INFO - codeparrot_training - Step 26774: {'lr': 0.00022346554783782835, 'samples': 13708800, 'steps': 26774, 'batch_loss/train': 0.7124905314994976} 12/26/2021 11:54:17 
- INFO - codeparrot_training - Step 26775: {'lr': 0.00022344987379551963, 'samples': 13709312, 'steps': 26775, 'batch_loss/train': 0.802395431092009} 12/26/2021 11:54:28 - INFO - codeparrot_training - Step 26776: {'lr': 0.00022343419985876428, 'samples': 13709824, 'steps': 26776, 'batch_loss/train': 0.7627839820925146} 12/26/2021 11:54:38 - INFO - codeparrot_training - Step 26777: {'lr': 0.00022341852602762453, 'samples': 13710336, 'steps': 26777, 'batch_loss/train': 0.8239222292322665} 12/26/2021 11:54:51 - INFO - codeparrot_training - Step 26778: {'lr': 0.00022340285230216262, 'samples': 13710848, 'steps': 26778, 'batch_loss/train': 0.8001159750856459} 12/26/2021 11:55:02 - INFO - codeparrot_training - Step 26779: {'lr': 0.00022338717868244088, 'samples': 13711360, 'steps': 26779, 'batch_loss/train': 0.8617224283516407} 12/26/2021 11:55:12 - INFO - codeparrot_training - Step 26780: {'lr': 0.00022337150516852177, 'samples': 13711872, 'steps': 26780, 'batch_loss/train': 0.735003415029496} 12/26/2021 11:55:23 - INFO - codeparrot_training - Step 26781: {'lr': 0.00022335583176046745, 'samples': 13712384, 'steps': 26781, 'batch_loss/train': 0.7173898508772254} 12/26/2021 11:55:35 - INFO - codeparrot_training - Step 26782: {'lr': 0.00022334015845834035, 'samples': 13712896, 'steps': 26782, 'batch_loss/train': 0.7272152462974191} 12/26/2021 11:55:46 - INFO - codeparrot_training - Step 26783: {'lr': 0.00022332448526220258, 'samples': 13713408, 'steps': 26783, 'batch_loss/train': 0.8619802165776491} 12/26/2021 11:55:57 - INFO - codeparrot_training - Step 26784: {'lr': 0.00022330881217211667, 'samples': 13713920, 'steps': 26784, 'batch_loss/train': 0.7869294080883265} 12/26/2021 11:56:09 - INFO - codeparrot_training - Step 26785: {'lr': 0.00022329313918814487, 'samples': 13714432, 'steps': 26785, 'batch_loss/train': 0.7647852562367916} 12/26/2021 11:56:19 - INFO - codeparrot_training - Step 26786: {'lr': 0.00022327746631034938, 'samples': 13714944, 'steps': 26786, 
'batch_loss/train': 0.7661078488454223} 12/26/2021 11:56:30 - INFO - codeparrot_training - Step 26787: {'lr': 0.00022326179353879266, 'samples': 13715456, 'steps': 26787, 'batch_loss/train': 0.7528681075200438} 12/26/2021 11:56:43 - INFO - codeparrot_training - Step 26788: {'lr': 0.00022324612087353695, 'samples': 13715968, 'steps': 26788, 'batch_loss/train': 0.7136106750695035} 12/26/2021 11:56:53 - INFO - codeparrot_training - Step 26789: {'lr': 0.0002232304483146446, 'samples': 13716480, 'steps': 26789, 'batch_loss/train': 0.7386757186613977} 12/26/2021 11:57:04 - INFO - codeparrot_training - Step 26790: {'lr': 0.00022321477586217776, 'samples': 13716992, 'steps': 26790, 'batch_loss/train': 0.8724896181374788} 12/26/2021 11:57:14 - INFO - codeparrot_training - Step 26791: {'lr': 0.00022319910351619894, 'samples': 13717504, 'steps': 26791, 'batch_loss/train': 0.6794714301358908} 12/26/2021 11:57:27 - INFO - codeparrot_training - Step 26792: {'lr': 0.00022318343127677038, 'samples': 13718016, 'steps': 26792, 'batch_loss/train': 2.850297162309289} 12/26/2021 11:57:38 - INFO - codeparrot_training - Step 26793: {'lr': 0.00022316775914395432, 'samples': 13718528, 'steps': 26793, 'batch_loss/train': 0.7929796017706394} 12/26/2021 11:57:48 - INFO - codeparrot_training - Step 26794: {'lr': 0.0002231520871178131, 'samples': 13719040, 'steps': 26794, 'batch_loss/train': 0.8509122906252742} 12/26/2021 11:58:01 - INFO - codeparrot_training - Step 26795: {'lr': 0.0002231364151984091, 'samples': 13719552, 'steps': 26795, 'batch_loss/train': 0.7834293236956} 12/26/2021 11:58:11 - INFO - codeparrot_training - Step 26796: {'lr': 0.00022312074338580462, 'samples': 13720064, 'steps': 26796, 'batch_loss/train': 0.8372762817889452} 12/26/2021 11:58:22 - INFO - codeparrot_training - Step 26797: {'lr': 0.0002231050716800618, 'samples': 13720576, 'steps': 26797, 'batch_loss/train': 0.7053414818365127} 12/26/2021 11:58:34 - INFO - codeparrot_training - Step 26798: {'lr': 
0.00022308940008124308, 'samples': 13721088, 'steps': 26798, 'batch_loss/train': 0.6875747516751289} 12/26/2021 11:58:45 - INFO - codeparrot_training - Step 26799: {'lr': 0.00022307372858941082, 'samples': 13721600, 'steps': 26799, 'batch_loss/train': 0.8327781856060028} 12/26/2021 11:58:56 - INFO - codeparrot_training - Step 26800: {'lr': 0.00022305805720462718, 'samples': 13722112, 'steps': 26800, 'batch_loss/train': 0.8501297235488892} 12/26/2021 11:59:06 - INFO - codeparrot_training - Step 26801: {'lr': 0.0002230423859269545, 'samples': 13722624, 'steps': 26801, 'batch_loss/train': 0.7178016903344542} 12/26/2021 11:59:18 - INFO - codeparrot_training - Step 26802: {'lr': 0.00022302671475645522, 'samples': 13723136, 'steps': 26802, 'batch_loss/train': 0.7639355631545186} 12/26/2021 11:59:29 - INFO - codeparrot_training - Step 26803: {'lr': 0.00022301104369319146, 'samples': 13723648, 'steps': 26803, 'batch_loss/train': 0.6907674754038453} 12/26/2021 11:59:39 - INFO - codeparrot_training - Step 26804: {'lr': 0.0002229953727372257, 'samples': 13724160, 'steps': 26804, 'batch_loss/train': 0.8076934600248933} 12/26/2021 11:59:52 - INFO - codeparrot_training - Step 26805: {'lr': 0.00022297970188862, 'samples': 13724672, 'steps': 26805, 'batch_loss/train': 1.6821157545782626} 12/26/2021 12:00:03 - INFO - codeparrot_training - Step 26806: {'lr': 0.0002229640311474369, 'samples': 13725184, 'steps': 26806, 'batch_loss/train': 0.7718113958835602} 12/26/2021 12:00:14 - INFO - codeparrot_training - Step 26807: {'lr': 0.00022294836051373857, 'samples': 13725696, 'steps': 26807, 'batch_loss/train': 0.773678632453084} 12/26/2021 12:00:26 - INFO - codeparrot_training - Step 26808: {'lr': 0.0002229326899875873, 'samples': 13726208, 'steps': 26808, 'batch_loss/train': 0.7622329127043486} 12/26/2021 12:00:36 - INFO - codeparrot_training - Step 26809: {'lr': 0.00022291701956904554, 'samples': 13726720, 'steps': 26809, 'batch_loss/train': 0.8234318783506751} 12/26/2021 12:00:47 - 
INFO - codeparrot_training - Step 26810: {'lr': 0.0002229013492581754, 'samples': 13727232, 'steps': 26810, 'batch_loss/train': 0.7606566408649087} 12/26/2021 12:00:58 - INFO - codeparrot_training - Step 26811: {'lr': 0.00022288567905503934, 'samples': 13727744, 'steps': 26811, 'batch_loss/train': 0.8458014251664281} 12/26/2021 12:01:10 - INFO - codeparrot_training - Step 26812: {'lr': 0.00022287000895969948, 'samples': 13728256, 'steps': 26812, 'batch_loss/train': 0.8317914055660367} 12/26/2021 12:01:20 - INFO - codeparrot_training - Step 26813: {'lr': 0.0002228543389722183, 'samples': 13728768, 'steps': 26813, 'batch_loss/train': 0.7181926819030195} 12/26/2021 12:01:31 - INFO - codeparrot_training - Step 26814: {'lr': 0.00022283866909265798, 'samples': 13729280, 'steps': 26814, 'batch_loss/train': 0.7828273391351104} 12/26/2021 12:01:43 - INFO - codeparrot_training - Step 26815: {'lr': 0.0002228229993210809, 'samples': 13729792, 'steps': 26815, 'batch_loss/train': 0.8542177546769381} 12/26/2021 12:01:54 - INFO - codeparrot_training - Step 26816: {'lr': 0.00022280732965754925, 'samples': 13730304, 'steps': 26816, 'batch_loss/train': 0.7761054458096623} 12/26/2021 12:02:05 - INFO - codeparrot_training - Step 26817: {'lr': 0.0002227916601021254, 'samples': 13730816, 'steps': 26817, 'batch_loss/train': 0.45147480204468593} 12/26/2021 12:02:17 - INFO - codeparrot_training - Step 26818: {'lr': 0.0002227759906548717, 'samples': 13731328, 'steps': 26818, 'batch_loss/train': 0.780628441600129} 12/26/2021 12:02:28 - INFO - codeparrot_training - Step 26819: {'lr': 0.00022276032131585027, 'samples': 13731840, 'steps': 26819, 'batch_loss/train': 0.584349409211427} 12/26/2021 12:02:39 - INFO - codeparrot_training - Step 26820: {'lr': 0.00022274465208512356, 'samples': 13732352, 'steps': 26820, 'batch_loss/train': 0.8435436283471063} 12/26/2021 12:02:49 - INFO - codeparrot_training - Step 26821: {'lr': 0.0002227289829627539, 'samples': 13732864, 'steps': 26821, 
'batch_loss/train': 0.7436623591929674} 12/26/2021 12:03:01 - INFO - codeparrot_training - Step 26822: {'lr': 0.0002227133139488035, 'samples': 13733376, 'steps': 26822, 'batch_loss/train': 0.7117613069713116} 12/26/2021 12:03:12 - INFO - codeparrot_training - Step 26823: {'lr': 0.00022269764504333455, 'samples': 13733888, 'steps': 26823, 'batch_loss/train': 0.747182690538466} 12/26/2021 12:03:23 - INFO - codeparrot_training - Step 26824: {'lr': 0.00022268197624640952, 'samples': 13734400, 'steps': 26824, 'batch_loss/train': 0.7623042110353708} 12/26/2021 12:03:35 - INFO - codeparrot_training - Step 26825: {'lr': 0.00022266630755809067, 'samples': 13734912, 'steps': 26825, 'batch_loss/train': 0.7098951985244639} 12/26/2021 12:03:46 - INFO - codeparrot_training - Step 26826: {'lr': 0.00022265063897844025, 'samples': 13735424, 'steps': 26826, 'batch_loss/train': 0.8082050601951778} 12/26/2021 12:03:57 - INFO - codeparrot_training - Step 26827: {'lr': 0.00022263497050752047, 'samples': 13735936, 'steps': 26827, 'batch_loss/train': 0.6800264082849026} 12/26/2021 12:04:09 - INFO - codeparrot_training - Step 26828: {'lr': 0.0002226193021453938, 'samples': 13736448, 'steps': 26828, 'batch_loss/train': 0.7263939605327323} 12/26/2021 12:04:19 - INFO - codeparrot_training - Step 26829: {'lr': 0.00022260363389212242, 'samples': 13736960, 'steps': 26829, 'batch_loss/train': 0.8407434308901429} 12/26/2021 12:04:30 - INFO - codeparrot_training - Step 26830: {'lr': 0.0002225879657477687, 'samples': 13737472, 'steps': 26830, 'batch_loss/train': 0.8158363876864314} 12/26/2021 12:04:40 - INFO - codeparrot_training - Step 26831: {'lr': 0.00022257229771239479, 'samples': 13737984, 'steps': 26831, 'batch_loss/train': 0.6993914027698338} 12/26/2021 12:04:53 - INFO - codeparrot_training - Step 26832: {'lr': 0.00022255662978606317, 'samples': 13738496, 'steps': 26832, 'batch_loss/train': 0.8400634983554482} 12/26/2021 12:05:03 - INFO - codeparrot_training - Step 26833: {'lr': 
0.00022254096196883594, 'samples': 13739008, 'steps': 26833, 'batch_loss/train': 0.766527112107724} 12/26/2021 12:05:14 - INFO - codeparrot_training - Step 26834: {'lr': 0.00022252529426077544, 'samples': 13739520, 'steps': 26834, 'batch_loss/train': 0.773870796430856} 12/26/2021 12:05:27 - INFO - codeparrot_training - Step 26835: {'lr': 0.00022250962666194412, 'samples': 13740032, 'steps': 26835, 'batch_loss/train': 1.0740570342168212} 12/26/2021 12:05:37 - INFO - codeparrot_training - Step 26836: {'lr': 0.00022249395917240409, 'samples': 13740544, 'steps': 26836, 'batch_loss/train': 0.7754856683313847} 12/26/2021 12:05:48 - INFO - codeparrot_training - Step 26837: {'lr': 0.00022247829179221774, 'samples': 13741056, 'steps': 26837, 'batch_loss/train': 0.7529646942857653} 12/26/2021 12:06:00 - INFO - codeparrot_training - Step 26838: {'lr': 0.00022246262452144718, 'samples': 13741568, 'steps': 26838, 'batch_loss/train': 0.7397330235689878} 12/26/2021 12:06:11 - INFO - codeparrot_training - Step 26839: {'lr': 0.00022244695736015491, 'samples': 13742080, 'steps': 26839, 'batch_loss/train': 0.767320801387541} 12/26/2021 12:06:21 - INFO - codeparrot_training - Step 26840: {'lr': 0.00022243129030840317, 'samples': 13742592, 'steps': 26840, 'batch_loss/train': 0.8042468512430787} 12/26/2021 12:06:32 - INFO - codeparrot_training - Step 26841: {'lr': 0.0002224156233662542, 'samples': 13743104, 'steps': 26841, 'batch_loss/train': 0.784406702965498} 12/26/2021 12:06:45 - INFO - codeparrot_training - Step 26842: {'lr': 0.0002223999565337702, 'samples': 13743616, 'steps': 26842, 'batch_loss/train': 0.8355715675279498} 12/26/2021 12:06:55 - INFO - codeparrot_training - Step 26843: {'lr': 0.00022238428981101362, 'samples': 13744128, 'steps': 26843, 'batch_loss/train': 0.8051990633830428} 12/26/2021 12:07:06 - INFO - codeparrot_training - Step 26844: {'lr': 0.00022236862319804674, 'samples': 13744640, 'steps': 26844, 'batch_loss/train': 0.7813010402023792} 12/26/2021 12:07:19 - 
INFO - codeparrot_training - Step 26845: {'lr': 0.00022235295669493165, 'samples': 13745152, 'steps': 26845, 'batch_loss/train': 0.7996572474949062} 12/26/2021 12:07:30 - INFO - codeparrot_training - Step 26846: {'lr': 0.00022233729030173084, 'samples': 13745664, 'steps': 26846, 'batch_loss/train': 0.6430337727069855} 12/26/2021 12:07:41 - INFO - codeparrot_training - Step 26847: {'lr': 0.00022232162401850653, 'samples': 13746176, 'steps': 26847, 'batch_loss/train': 1.0970399267971516} 12/26/2021 12:07:51 - INFO - codeparrot_training - Step 26848: {'lr': 0.00022230595784532097, 'samples': 13746688, 'steps': 26848, 'batch_loss/train': 1.1000310322269797} 12/26/2021 12:08:03 - INFO - codeparrot_training - Step 26849: {'lr': 0.00022229029178223638, 'samples': 13747200, 'steps': 26849, 'batch_loss/train': 0.8026437042281032} 12/26/2021 12:08:14 - INFO - codeparrot_training - Step 26850: {'lr': 0.0002222746258293153, 'samples': 13747712, 'steps': 26850, 'batch_loss/train': 0.7355893284548074} 12/26/2021 12:08:24 - INFO - codeparrot_training - Step 26851: {'lr': 0.0002222589599866198, 'samples': 13748224, 'steps': 26851, 'batch_loss/train': 0.7877711039036512} 12/26/2021 12:08:37 - INFO - codeparrot_training - Step 26852: {'lr': 0.00022224329425421215, 'samples': 13748736, 'steps': 26852, 'batch_loss/train': 0.6617656713351607} 12/26/2021 12:08:47 - INFO - codeparrot_training - Step 26853: {'lr': 0.0002222276286321546, 'samples': 13749248, 'steps': 26853, 'batch_loss/train': 0.7899280283600092} 12/26/2021 12:08:58 - INFO - codeparrot_training - Step 26854: {'lr': 0.00022221196312050966, 'samples': 13749760, 'steps': 26854, 'batch_loss/train': 0.7865721830166876} 12/26/2021 12:09:11 - INFO - codeparrot_training - Step 26855: {'lr': 0.0002221962977193394, 'samples': 13750272, 'steps': 26855, 'batch_loss/train': 0.5480280756019056} 12/26/2021 12:09:21 - INFO - codeparrot_training - Step 26856: {'lr': 0.00022218063242870607, 'samples': 13750784, 'steps': 26856, 
'batch_loss/train': 0.7746148332953453} 12/26/2021 12:09:32 - INFO - codeparrot_training - Step 26857: {'lr': 0.0002221649672486722, 'samples': 13751296, 'steps': 26857, 'batch_loss/train': 0.757344757206738} 12/26/2021 12:09:44 - INFO - codeparrot_training - Step 26858: {'lr': 0.00022214930217929988, 'samples': 13751808, 'steps': 26858, 'batch_loss/train': 0.7879700260236859} 12/26/2021 12:09:55 - INFO - codeparrot_training - Step 26859: {'lr': 0.00022213363722065136, 'samples': 13752320, 'steps': 26859, 'batch_loss/train': 0.7987445006147027} 12/26/2021 12:10:05 - INFO - codeparrot_training - Step 26860: {'lr': 0.00022211797237278895, 'samples': 13752832, 'steps': 26860, 'batch_loss/train': 0.8768093446269631} 12/26/2021 12:10:16 - INFO - codeparrot_training - Step 26861: {'lr': 0.00022210230763577504, 'samples': 13753344, 'steps': 26861, 'batch_loss/train': 0.7112472879234701} 12/26/2021 12:10:28 - INFO - codeparrot_training - Step 26862: {'lr': 0.00022208664300967177, 'samples': 13753856, 'steps': 26862, 'batch_loss/train': 0.6738438033498824} 12/26/2021 12:10:38 - INFO - codeparrot_training - Step 26863: {'lr': 0.00022207097849454152, 'samples': 13754368, 'steps': 26863, 'batch_loss/train': 0.692893841303885} 12/26/2021 12:10:49 - INFO - codeparrot_training - Step 26864: {'lr': 0.00022205531409044642, 'samples': 13754880, 'steps': 26864, 'batch_loss/train': 0.8456500321626663} 12/26/2021 12:11:02 - INFO - codeparrot_training - Step 26865: {'lr': 0.0002220396497974489, 'samples': 13755392, 'steps': 26865, 'batch_loss/train': 0.6859007223974913} 12/26/2021 12:11:12 - INFO - codeparrot_training - Step 26866: {'lr': 0.00022202398561561122, 'samples': 13755904, 'steps': 26866, 'batch_loss/train': 0.7807025155052543} 12/26/2021 12:11:23 - INFO - codeparrot_training - Step 26867: {'lr': 0.0002220083215449956, 'samples': 13756416, 'steps': 26867, 'batch_loss/train': 0.7742145815864205} 12/26/2021 12:11:35 - INFO - codeparrot_training - Step 26868: {'lr': 
0.0002219926575856642, 'samples': 13756928, 'steps': 26868, 'batch_loss/train': 0.7385332202538848} 12/26/2021 12:11:46 - INFO - codeparrot_training - Step 26869: {'lr': 0.0002219769937376795, 'samples': 13757440, 'steps': 26869, 'batch_loss/train': 0.7771449470892549} 12/26/2021 12:11:56 - INFO - codeparrot_training - Step 26870: {'lr': 0.00022196133000110374, 'samples': 13757952, 'steps': 26870, 'batch_loss/train': 0.7919516563415527} 12/26/2021 12:12:07 - INFO - codeparrot_training - Step 26871: {'lr': 0.00022194566637599904, 'samples': 13758464, 'steps': 26871, 'batch_loss/train': 0.7733302777633071} 12/26/2021 12:12:20 - INFO - codeparrot_training - Step 26872: {'lr': 0.0002219300028624278, 'samples': 13758976, 'steps': 26872, 'batch_loss/train': 1.1943327018525451} 12/26/2021 12:12:30 - INFO - codeparrot_training - Step 26873: {'lr': 0.00022191433946045237, 'samples': 13759488, 'steps': 26873, 'batch_loss/train': 0.8430076334625483} 12/26/2021 12:12:41 - INFO - codeparrot_training - Step 26874: {'lr': 0.00022189867617013485, 'samples': 13760000, 'steps': 26874, 'batch_loss/train': 0.7361236289143562} 12/26/2021 12:12:53 - INFO - codeparrot_training - Step 26875: {'lr': 0.0002218830129915375, 'samples': 13760512, 'steps': 26875, 'batch_loss/train': 0.777714729309082} 12/26/2021 12:13:04 - INFO - codeparrot_training - Step 26876: {'lr': 0.00022186734992472285, 'samples': 13761024, 'steps': 26876, 'batch_loss/train': 0.9971666326746345} 12/26/2021 12:13:14 - INFO - codeparrot_training - Step 26877: {'lr': 0.00022185168696975293, 'samples': 13761536, 'steps': 26877, 'batch_loss/train': 0.977806318551302} 12/26/2021 12:13:26 - INFO - codeparrot_training - Step 26878: {'lr': 0.00022183602412669003, 'samples': 13762048, 'steps': 26878, 'batch_loss/train': 1.143821744248271} 12/26/2021 12:13:37 - INFO - codeparrot_training - Step 26879: {'lr': 0.00022182036139559643, 'samples': 13762560, 'steps': 26879, 'batch_loss/train': 0.814595403149724} 12/26/2021 12:13:48 - 
INFO - codeparrot_training - Step 26880: {'lr': 0.00022180469877653455, 'samples': 13763072, 'steps': 26880, 'batch_loss/train': 0.7866146340966225} 12/26/2021 12:13:58 - INFO - codeparrot_training - Step 26881: {'lr': 0.0002217890362695665, 'samples': 13763584, 'steps': 26881, 'batch_loss/train': 0.6592147927731276} 12/26/2021 12:14:16 - INFO - codeparrot_training - Step 26882: {'lr': 0.00022177337387475449, 'samples': 13764096, 'steps': 26882, 'batch_loss/train': 0.7920673051849008} 12/26/2021 12:14:26 - INFO - codeparrot_training - Step 26883: {'lr': 0.000221757711592161, 'samples': 13764608, 'steps': 26883, 'batch_loss/train': 0.819673668127507} 12/26/2021 12:14:37 - INFO - codeparrot_training - Step 26884: {'lr': 0.00022174204942184813, 'samples': 13765120, 'steps': 26884, 'batch_loss/train': 0.8063693894073367} 12/26/2021 12:14:49 - INFO - codeparrot_training - Step 26885: {'lr': 0.0002217263873638783, 'samples': 13765632, 'steps': 26885, 'batch_loss/train': 0.9522474677069113} 12/26/2021 12:15:00 - INFO - codeparrot_training - Step 26886: {'lr': 0.00022171072541831352, 'samples': 13766144, 'steps': 26886, 'batch_loss/train': 0.7217081125709228} 12/26/2021 12:15:10 - INFO - codeparrot_training - Step 26887: {'lr': 0.00022169506358521635, 'samples': 13766656, 'steps': 26887, 'batch_loss/train': 0.7735015712678432} 12/26/2021 12:15:23 - INFO - codeparrot_training - Step 26888: {'lr': 0.00022167940186464886, 'samples': 13767168, 'steps': 26888, 'batch_loss/train': 0.6956087579019368} 12/26/2021 12:15:34 - INFO - codeparrot_training - Step 26889: {'lr': 0.00022166374025667343, 'samples': 13767680, 'steps': 26889, 'batch_loss/train': 0.7983135674148798} 12/26/2021 12:15:44 - INFO - codeparrot_training - Step 26890: {'lr': 0.00022164807876135214, 'samples': 13768192, 'steps': 26890, 'batch_loss/train': 0.6301028807647526} 12/26/2021 12:15:55 - INFO - codeparrot_training - Step 26891: {'lr': 0.00022163241737874748, 'samples': 13768704, 'steps': 26891, 
'batch_loss/train': 0.8184030428528786} 12/26/2021 12:16:07 - INFO - codeparrot_training - Step 26892: {'lr': 0.00022161675610892162, 'samples': 13769216, 'steps': 26892, 'batch_loss/train': 0.8600868084467947} 12/26/2021 12:16:18 - INFO - codeparrot_training - Step 26893: {'lr': 0.00022160109495193676, 'samples': 13769728, 'steps': 26893, 'batch_loss/train': 1.0054925084114075} 12/26/2021 12:16:28 - INFO - codeparrot_training - Step 26894: {'lr': 0.00022158543390785523, 'samples': 13770240, 'steps': 26894, 'batch_loss/train': 0.6680535636842251} 12/26/2021 12:16:41 - INFO - codeparrot_training - Step 26895: {'lr': 0.00022156977297673938, 'samples': 13770752, 'steps': 26895, 'batch_loss/train': 0.6908709881827235} 12/26/2021 12:16:51 - INFO - codeparrot_training - Step 26896: {'lr': 0.00022155411215865137, 'samples': 13771264, 'steps': 26896, 'batch_loss/train': 0.6544442421291023} 12/26/2021 12:17:02 - INFO - codeparrot_training - Step 26897: {'lr': 0.00022153845145365337, 'samples': 13771776, 'steps': 26897, 'batch_loss/train': 0.7569029438309371} 12/26/2021 12:17:14 - INFO - codeparrot_training - Step 26898: {'lr': 0.00022152279086180778, 'samples': 13772288, 'steps': 26898, 'batch_loss/train': 0.8283262569457293} 12/26/2021 12:17:25 - INFO - codeparrot_training - Step 26899: {'lr': 0.00022150713038317688, 'samples': 13772800, 'steps': 26899, 'batch_loss/train': 0.6689525116235018} 12/26/2021 12:17:35 - INFO - codeparrot_training - Step 26900: {'lr': 0.0002214914700178228, 'samples': 13773312, 'steps': 26900, 'batch_loss/train': 0.675526135135442} 12/26/2021 12:17:46 - INFO - codeparrot_training - Step 26901: {'lr': 0.00022147580976580782, 'samples': 13773824, 'steps': 26901, 'batch_loss/train': 0.7552426075562835} 12/26/2021 12:17:58 - INFO - codeparrot_training - Step 26902: {'lr': 0.00022146014962719438, 'samples': 13774336, 'steps': 26902, 'batch_loss/train': 0.8152258079499006} 12/26/2021 12:18:09 - INFO - codeparrot_training - Step 26903: {'lr': 
0.00022144448960204462, 'samples': 13774848, 'steps': 26903, 'batch_loss/train': 0.6719792718067765} 12/26/2021 12:18:19 - INFO - codeparrot_training - Step 26904: {'lr': 0.0002214288296904207, 'samples': 13775360, 'steps': 26904, 'batch_loss/train': 0.7214266150258482} 12/26/2021 12:18:32 - INFO - codeparrot_training - Step 26905: {'lr': 0.0002214131698923849, 'samples': 13775872, 'steps': 26905, 'batch_loss/train': 0.820925259962678} 12/26/2021 12:18:43 - INFO - codeparrot_training - Step 26906: {'lr': 0.00022139751020799968, 'samples': 13776384, 'steps': 26906, 'batch_loss/train': 0.7527413037605584} 12/26/2021 12:18:53 - INFO - codeparrot_training - Step 26907: {'lr': 0.00022138185063732708, 'samples': 13776896, 'steps': 26907, 'batch_loss/train': 0.6894842847250402} 12/26/2021 12:19:05 - INFO - codeparrot_training - Step 26908: {'lr': 0.00022136619118042942, 'samples': 13777408, 'steps': 26908, 'batch_loss/train': 0.7620302494615316} 12/26/2021 12:19:16 - INFO - codeparrot_training - Step 26909: {'lr': 0.00022135053183736905, 'samples': 13777920, 'steps': 26909, 'batch_loss/train': 1.0828076107427478} 12/26/2021 12:19:27 - INFO - codeparrot_training - Step 26910: {'lr': 0.00022133487260820808, 'samples': 13778432, 'steps': 26910, 'batch_loss/train': 0.8808481618762016} 12/26/2021 12:19:38 - INFO - codeparrot_training - Step 26911: {'lr': 0.00022131921349300893, 'samples': 13778944, 'steps': 26911, 'batch_loss/train': 0.7959563378244638} 12/26/2021 12:19:50 - INFO - codeparrot_training - Step 26912: {'lr': 0.0002213035544918336, 'samples': 13779456, 'steps': 26912, 'batch_loss/train': 0.7652749232947826} 12/26/2021 12:20:01 - INFO - codeparrot_training - Step 26913: {'lr': 0.00022128789560474466, 'samples': 13779968, 'steps': 26913, 'batch_loss/train': 0.8190188645385206} 12/26/2021 12:20:11 - INFO - codeparrot_training - Step 26914: {'lr': 0.00022127223683180415, 'samples': 13780480, 'steps': 26914, 'batch_loss/train': 0.7522309748455882} 12/26/2021 12:20:24 - 
INFO - codeparrot_training - Step 26915: {'lr': 0.0002212565781730744, 'samples': 13780992, 'steps': 26915, 'batch_loss/train': 0.7729719779454172} 12/26/2021 12:20:34 - INFO - codeparrot_training - Step 26916: {'lr': 0.00022124091962861755, 'samples': 13781504, 'steps': 26916, 'batch_loss/train': 0.8246688917279243} 12/26/2021 12:20:45 - INFO - codeparrot_training - Step 26917: {'lr': 0.00022122526119849602, 'samples': 13782016, 'steps': 26917, 'batch_loss/train': 0.7731840568594635} 12/26/2021 12:20:57 - INFO - codeparrot_training - Step 26918: {'lr': 0.00022120960288277201, 'samples': 13782528, 'steps': 26918, 'batch_loss/train': 0.7323749829083681} 12/26/2021 12:21:08 - INFO - codeparrot_training - Step 26919: {'lr': 0.00022119394468150764, 'samples': 13783040, 'steps': 26919, 'batch_loss/train': 0.59223563503474} 12/26/2021 12:21:19 - INFO - codeparrot_training - Step 26920: {'lr': 0.00022117828659476532, 'samples': 13783552, 'steps': 26920, 'batch_loss/train': 0.7884430647827685} 12/26/2021 12:21:31 - INFO - codeparrot_training - Step 26921: {'lr': 0.0002211626286226073, 'samples': 13784064, 'steps': 26921, 'batch_loss/train': 1.1021805945783854} 12/26/2021 12:21:42 - INFO - codeparrot_training - Step 26922: {'lr': 0.00022114697076509583, 'samples': 13784576, 'steps': 26922, 'batch_loss/train': 0.7746934378519654} 12/26/2021 12:21:53 - INFO - codeparrot_training - Step 26923: {'lr': 0.00022113131302229294, 'samples': 13785088, 'steps': 26923, 'batch_loss/train': 0.8097754147602245} 12/26/2021 12:22:03 - INFO - codeparrot_training - Step 26924: {'lr': 0.00022111565539426113, 'samples': 13785600, 'steps': 26924, 'batch_loss/train': 0.6941768364049494} 12/26/2021 12:22:15 - INFO - codeparrot_training - Step 26925: {'lr': 0.00022109999788106263, 'samples': 13786112, 'steps': 26925, 'batch_loss/train': 0.7548412219621241} 12/26/2021 12:22:26 - INFO - codeparrot_training - Step 26926: {'lr': 0.00022108434048275955, 'samples': 13786624, 'steps': 26926, 
'batch_loss/train': 0.748389377258718} 12/26/2021 12:22:36 - INFO - codeparrot_training - Step 26927: {'lr': 0.00022106868319941416, 'samples': 13787136, 'steps': 26927, 'batch_loss/train': 0.7457506963983178} 12/26/2021 12:22:50 - INFO - codeparrot_training - Step 26928: {'lr': 0.0002210530260310889, 'samples': 13787648, 'steps': 26928, 'batch_loss/train': 0.7677046461030841} 12/26/2021 12:23:00 - INFO - codeparrot_training - Step 26929: {'lr': 0.00022103736897784583, 'samples': 13788160, 'steps': 26929, 'batch_loss/train': 0.7903059343807399} 12/26/2021 12:23:11 - INFO - codeparrot_training - Step 26930: {'lr': 0.00022102171203974713, 'samples': 13788672, 'steps': 26930, 'batch_loss/train': 0.5825669071637094} 12/26/2021 12:23:21 - INFO - codeparrot_training - Step 26931: {'lr': 0.00022100605521685533, 'samples': 13789184, 'steps': 26931, 'batch_loss/train': 1.4838593173772097} 12/26/2021 12:23:34 - INFO - codeparrot_training - Step 26932: {'lr': 0.00022099039850923246, 'samples': 13789696, 'steps': 26932, 'batch_loss/train': 0.8256744435057044} 12/26/2021 12:23:44 - INFO - codeparrot_training - Step 26933: {'lr': 0.0002209747419169408, 'samples': 13790208, 'steps': 26933, 'batch_loss/train': 0.9038686603307724} 12/26/2021 12:23:55 - INFO - codeparrot_training - Step 26934: {'lr': 0.00022095908544004253, 'samples': 13790720, 'steps': 26934, 'batch_loss/train': 0.8370175184682012} 12/26/2021 12:24:07 - INFO - codeparrot_training - Step 26935: {'lr': 0.0002209434290786001, 'samples': 13791232, 'steps': 26935, 'batch_loss/train': 0.8107154318131506} 12/26/2021 12:24:17 - INFO - codeparrot_training - Step 26936: {'lr': 0.00022092777283267553, 'samples': 13791744, 'steps': 26936, 'batch_loss/train': 0.678151513915509} 12/26/2021 12:24:28 - INFO - codeparrot_training - Step 26937: {'lr': 0.00022091211670233123, 'samples': 13792256, 'steps': 26937, 'batch_loss/train': 0.8012394467368722} 12/26/2021 12:24:40 - INFO - codeparrot_training - Step 26938: {'lr': 
0.00022089646068762931, 'samples': 13792768, 'steps': 26938, 'batch_loss/train': 0.7890409119427204} 12/26/2021 12:24:51 - INFO - codeparrot_training - Step 26939: {'lr': 0.00022088080478863207, 'samples': 13793280, 'steps': 26939, 'batch_loss/train': 0.7706225030124187} 12/26/2021 12:25:02 - INFO - codeparrot_training - Step 26940: {'lr': 0.00022086514900540184, 'samples': 13793792, 'steps': 26940, 'batch_loss/train': 0.7992990287020802} 12/26/2021 12:25:14 - INFO - codeparrot_training - Step 26941: {'lr': 0.00022084949333800075, 'samples': 13794304, 'steps': 26941, 'batch_loss/train': 0.9773413995280862} 12/26/2021 12:25:25 - INFO - codeparrot_training - Step 26942: {'lr': 0.00022083383778649096, 'samples': 13794816, 'steps': 26942, 'batch_loss/train': 0.7390576009638608} 12/26/2021 12:25:36 - INFO - codeparrot_training - Step 26943: {'lr': 0.0002208181823509349, 'samples': 13795328, 'steps': 26943, 'batch_loss/train': 0.8162013106048107} 12/26/2021 12:25:48 - INFO - codeparrot_training - Step 26944: {'lr': 0.00022080252703139474, 'samples': 13795840, 'steps': 26944, 'batch_loss/train': 0.7922199815511703} 12/26/2021 12:25:58 - INFO - codeparrot_training - Step 26945: {'lr': 0.00022078687182793265, 'samples': 13796352, 'steps': 26945, 'batch_loss/train': 0.8028006684035063} 12/26/2021 12:26:09 - INFO - codeparrot_training - Step 26946: {'lr': 0.00022077121674061095, 'samples': 13796864, 'steps': 26946, 'batch_loss/train': 0.7663517042528838} 12/26/2021 12:26:19 - INFO - codeparrot_training - Step 26947: {'lr': 0.00022075556176949192, 'samples': 13797376, 'steps': 26947, 'batch_loss/train': 0.8363300557248294} 12/26/2021 12:26:32 - INFO - codeparrot_training - Step 26948: {'lr': 0.00022073990691463775, 'samples': 13797888, 'steps': 26948, 'batch_loss/train': 0.7961086360737681} 12/26/2021 12:26:42 - INFO - codeparrot_training - Step 26949: {'lr': 0.00022072425217611053, 'samples': 13798400, 'steps': 26949, 'batch_loss/train': 0.9001552425324917} 12/26/2021 
12:26:53 - INFO - codeparrot_training - Step 26950: {'lr': 0.00022070859755397268, 'samples': 13798912, 'steps': 26950, 'batch_loss/train': 0.81758813560009} 12/26/2021 12:27:06 - INFO - codeparrot_training - Step 26951: {'lr': 0.00022069294304828647, 'samples': 13799424, 'steps': 26951, 'batch_loss/train': 0.8528676414862275} 12/26/2021 12:27:16 - INFO - codeparrot_training - Step 26952: {'lr': 0.00022067728865911396, 'samples': 13799936, 'steps': 26952, 'batch_loss/train': 0.8304093349725008} 12/26/2021 12:27:27 - INFO - codeparrot_training - Step 26953: {'lr': 0.00022066163438651742, 'samples': 13800448, 'steps': 26953, 'batch_loss/train': 0.7794530354440212} 12/26/2021 12:27:39 - INFO - codeparrot_training - Step 26954: {'lr': 0.0002206459802305593, 'samples': 13800960, 'steps': 26954, 'batch_loss/train': 0.7005040431395173} 12/26/2021 12:27:50 - INFO - codeparrot_training - Step 26955: {'lr': 0.0002206303261913016, 'samples': 13801472, 'steps': 26955, 'batch_loss/train': 0.743810866959393} 12/26/2021 12:28:00 - INFO - codeparrot_training - Step 26956: {'lr': 0.00022061467226880657, 'samples': 13801984, 'steps': 26956, 'batch_loss/train': 0.6777687013964169} 12/26/2021 12:28:11 - INFO - codeparrot_training - Step 26957: {'lr': 0.00022059901846313663, 'samples': 13802496, 'steps': 26957, 'batch_loss/train': 0.7706391140818596} 12/26/2021 12:28:24 - INFO - codeparrot_training - Step 26958: {'lr': 0.0002205833647743539, 'samples': 13803008, 'steps': 26958, 'batch_loss/train': 0.7633831053972244} 12/26/2021 12:28:34 - INFO - codeparrot_training - Step 26959: {'lr': 0.00022056771120252053, 'samples': 13803520, 'steps': 26959, 'batch_loss/train': 0.7450809633592144} 12/26/2021 12:28:45 - INFO - codeparrot_training - Step 26960: {'lr': 0.00022055205774769877, 'samples': 13804032, 'steps': 26960, 'batch_loss/train': 0.7906965763540938} 12/26/2021 12:28:57 - INFO - codeparrot_training - Step 26961: {'lr': 0.00022053640440995102, 'samples': 13804544, 'steps': 26961, 
'batch_loss/train': 0.7080852688523009} 12/26/2021 12:29:07 - INFO - codeparrot_training - Step 26962: {'lr': 0.00022052075118933937, 'samples': 13805056, 'steps': 26962, 'batch_loss/train': 0.8069335306063294} 12/26/2021 12:29:18 - INFO - codeparrot_training - Step 26963: {'lr': 0.00022050509808592612, 'samples': 13805568, 'steps': 26963, 'batch_loss/train': 0.7631840375252068} 12/26/2021 12:29:30 - INFO - codeparrot_training - Step 26964: {'lr': 0.00022048944509977335, 'samples': 13806080, 'steps': 26964, 'batch_loss/train': 0.7285250294953585} 12/26/2021 12:29:41 - INFO - codeparrot_training - Step 26965: {'lr': 0.00022047379223094347, 'samples': 13806592, 'steps': 26965, 'batch_loss/train': 0.6962743575568311} 12/26/2021 12:29:52 - INFO - codeparrot_training - Step 26966: {'lr': 0.00022045813947949865, 'samples': 13807104, 'steps': 26966, 'batch_loss/train': 0.876103950664401} 12/26/2021 12:30:02 - INFO - codeparrot_training - Step 26967: {'lr': 0.00022044248684550106, 'samples': 13807616, 'steps': 26967, 'batch_loss/train': 0.8048037430271506} 12/26/2021 12:30:15 - INFO - codeparrot_training - Step 26968: {'lr': 0.0002204268343290131, 'samples': 13808128, 'steps': 26968, 'batch_loss/train': 0.739431219175458} 12/26/2021 12:30:26 - INFO - codeparrot_training - Step 26969: {'lr': 0.0002204111819300968, 'samples': 13808640, 'steps': 26969, 'batch_loss/train': 1.096088821592275} 12/26/2021 12:30:37 - INFO - codeparrot_training - Step 26970: {'lr': 0.00022039552964881452, 'samples': 13809152, 'steps': 26970, 'batch_loss/train': 0.8523690355941653} 12/26/2021 12:30:49 - INFO - codeparrot_training - Step 26971: {'lr': 0.00022037987748522832, 'samples': 13809664, 'steps': 26971, 'batch_loss/train': 0.9263286064378917} 12/26/2021 12:30:59 - INFO - codeparrot_training - Step 26972: {'lr': 0.0002203642254394006, 'samples': 13810176, 'steps': 26972, 'batch_loss/train': 0.7456491100601852} 12/26/2021 12:31:10 - INFO - codeparrot_training - Step 26973: {'lr': 
0.00022034857351139358, 'samples': 13810688, 'steps': 26973, 'batch_loss/train': 0.63673903234303} 12/26/2021 12:31:22 - INFO - codeparrot_training - Step 26974: {'lr': 0.0002203329217012694, 'samples': 13811200, 'steps': 26974, 'batch_loss/train': 0.7526087742298841} 12/26/2021 12:31:33 - INFO - codeparrot_training - Step 26975: {'lr': 0.00022031727000909027, 'samples': 13811712, 'steps': 26975, 'batch_loss/train': 0.84923948161304} 12/26/2021 12:31:43 - INFO - codeparrot_training - Step 26976: {'lr': 0.00022030161843491857, 'samples': 13812224, 'steps': 26976, 'batch_loss/train': 0.8585964189842343} 12/26/2021 12:31:54 - INFO - codeparrot_training - Step 26977: {'lr': 0.00022028596697881643, 'samples': 13812736, 'steps': 26977, 'batch_loss/train': 0.7541988678276539} 12/26/2021 12:32:06 - INFO - codeparrot_training - Step 26978: {'lr': 0.00022027031564084598, 'samples': 13813248, 'steps': 26978, 'batch_loss/train': 0.7926886482164264} 12/26/2021 12:32:17 - INFO - codeparrot_training - Step 26979: {'lr': 0.0002202546644210695, 'samples': 13813760, 'steps': 26979, 'batch_loss/train': 0.7955847587436438} 12/26/2021 12:32:28 - INFO - codeparrot_training - Step 26980: {'lr': 0.00022023901331954935, 'samples': 13814272, 'steps': 26980, 'batch_loss/train': 0.7680556392297149} 12/26/2021 12:32:41 - INFO - codeparrot_training - Step 26981: {'lr': 0.00022022336233634758, 'samples': 13814784, 'steps': 26981, 'batch_loss/train': 0.9030226496979594} 12/26/2021 12:32:51 - INFO - codeparrot_training - Step 26982: {'lr': 0.00022020771147152643, 'samples': 13815296, 'steps': 26982, 'batch_loss/train': 0.7730351767968386} 12/26/2021 12:33:02 - INFO - codeparrot_training - Step 26983: {'lr': 0.00022019206072514827, 'samples': 13815808, 'steps': 26983, 'batch_loss/train': 0.7203350877389312} 12/26/2021 12:33:14 - INFO - codeparrot_training - Step 26984: {'lr': 0.0002201764100972752, 'samples': 13816320, 'steps': 26984, 'batch_loss/train': 0.8322451766580343} 12/26/2021 12:33:25 - 
INFO - codeparrot_training - Step 26985: {'lr': 0.00022016075958796946, 'samples': 13816832, 'steps': 26985, 'batch_loss/train': 0.8818021677434444} 12/26/2021 12:33:35 - INFO - codeparrot_training - Step 26986: {'lr': 0.00022014510919729323, 'samples': 13817344, 'steps': 26986, 'batch_loss/train': 0.7441869508475065} 12/26/2021 12:33:46 - INFO - codeparrot_training - Step 26987: {'lr': 0.00022012945892530884, 'samples': 13817856, 'steps': 26987, 'batch_loss/train': 0.8289701147004962} 12/26/2021 12:33:59 - INFO - codeparrot_training - Step 26988: {'lr': 0.0002201138087720784, 'samples': 13818368, 'steps': 26988, 'batch_loss/train': 0.6286798673681915} 12/26/2021 12:34:10 - INFO - codeparrot_training - Step 26989: {'lr': 0.00022009815873766421, 'samples': 13818880, 'steps': 26989, 'batch_loss/train': 0.7761516403406858} 12/26/2021 12:34:21 - INFO - codeparrot_training - Step 26990: {'lr': 0.00022008250882212836, 'samples': 13819392, 'steps': 26990, 'batch_loss/train': 0.7179976042825729} 12/26/2021 12:34:33 - INFO - codeparrot_training - Step 26991: {'lr': 0.0002200668590255332, 'samples': 13819904, 'steps': 26991, 'batch_loss/train': 0.6701143195386976} 12/26/2021 12:34:43 - INFO - codeparrot_training - Step 26992: {'lr': 0.00022005120934794097, 'samples': 13820416, 'steps': 26992, 'batch_loss/train': 0.7933606458827853} 12/26/2021 12:34:54 - INFO - codeparrot_training - Step 26993: {'lr': 0.00022003555978941373, 'samples': 13820928, 'steps': 26993, 'batch_loss/train': 0.7773209363222122} 12/26/2021 12:35:07 - INFO - codeparrot_training - Step 26994: {'lr': 0.00022001991035001386, 'samples': 13821440, 'steps': 26994, 'batch_loss/train': 0.7822571038268507} 12/26/2021 12:35:17 - INFO - codeparrot_training - Step 26995: {'lr': 0.00022000426102980347, 'samples': 13821952, 'steps': 26995, 'batch_loss/train': 1.0510398243786767} 12/26/2021 12:35:28 - INFO - codeparrot_training - Step 26996: {'lr': 0.0002199886118288449, 'samples': 13822464, 'steps': 26996, 
'batch_loss/train': 0.8635897166095674} 12/26/2021 12:35:39 - INFO - codeparrot_training - Step 26997: {'lr': 0.00021997296274720013, 'samples': 13822976, 'steps': 26997, 'batch_loss/train': 0.725394444540143} 12/26/2021 12:35:52 - INFO - codeparrot_training - Step 26998: {'lr': 0.00021995731378493159, 'samples': 13823488, 'steps': 26998, 'batch_loss/train': 1.0469727548770607} 12/26/2021 12:36:03 - INFO - codeparrot_training - Step 26999: {'lr': 0.00021994166494210146, 'samples': 13824000, 'steps': 26999, 'batch_loss/train': 1.1291398098692298} 12/26/2021 12:36:13 - INFO - codeparrot_training - Step 27000: {'lr': 0.00021992601621877188, 'samples': 13824512, 'steps': 27000, 'batch_loss/train': 0.6775836439337581} 12/26/2021 12:36:26 - INFO - codeparrot_training - Step 27001: {'lr': 0.0002199103676150051, 'samples': 13825024, 'steps': 27001, 'batch_loss/train': 0.7505150595679879} 12/26/2021 12:36:36 - INFO - codeparrot_training - Step 27002: {'lr': 0.00021989471913086336, 'samples': 13825536, 'steps': 27002, 'batch_loss/train': 0.6063736071810126} 12/26/2021 12:36:47 - INFO - codeparrot_training - Step 27003: {'lr': 0.0002198790707664089, 'samples': 13826048, 'steps': 27003, 'batch_loss/train': 0.7102157928748056} 12/26/2021 12:36:59 - INFO - codeparrot_training - Step 27004: {'lr': 0.00021986342252170373, 'samples': 13826560, 'steps': 27004, 'batch_loss/train': 2.4254546919837594} 12/26/2021 12:37:10 - INFO - codeparrot_training - Step 27005: {'lr': 0.0002198477743968103, 'samples': 13827072, 'steps': 27005, 'batch_loss/train': 0.793127171928063} 12/26/2021 12:37:20 - INFO - codeparrot_training - Step 27006: {'lr': 0.00021983212639179082, 'samples': 13827584, 'steps': 27006, 'batch_loss/train': 0.7204168578609824} 12/26/2021 12:37:31 - INFO - codeparrot_training - Step 27007: {'lr': 0.00021981647850670733, 'samples': 13828096, 'steps': 27007, 'batch_loss/train': 0.838599038310349} 12/26/2021 12:37:44 - INFO - codeparrot_training - Step 27008: {'lr': 
0.00021980083074162206, 'samples': 13828608, 'steps': 27008, 'batch_loss/train': 0.8126303539611399} 12/26/2021 12:37:54 - INFO - codeparrot_training - Step 27009: {'lr': 0.00021978518309659738, 'samples': 13829120, 'steps': 27009, 'batch_loss/train': 0.7605645968578756} 12/26/2021 12:38:05 - INFO - codeparrot_training - Step 27010: {'lr': 0.00021976953557169537, 'samples': 13829632, 'steps': 27010, 'batch_loss/train': 0.8270937083289027} 12/26/2021 12:38:17 - INFO - codeparrot_training - Step 27011: {'lr': 0.00021975388816697833, 'samples': 13830144, 'steps': 27011, 'batch_loss/train': 0.8533422630280256} 12/26/2021 12:38:27 - INFO - codeparrot_training - Step 27012: {'lr': 0.0002197382408825083, 'samples': 13830656, 'steps': 27012, 'batch_loss/train': 0.8064243197441101} 12/26/2021 12:38:38 - INFO - codeparrot_training - Step 27013: {'lr': 0.0002197225937183477, 'samples': 13831168, 'steps': 27013, 'batch_loss/train': 0.8330345312133431} 12/26/2021 12:38:51 - INFO - codeparrot_training - Step 27014: {'lr': 0.00021970694667455858, 'samples': 13831680, 'steps': 27014, 'batch_loss/train': 0.8708139136433601} 12/26/2021 12:39:02 - INFO - codeparrot_training - Step 27015: {'lr': 0.00021969129975120326, 'samples': 13832192, 'steps': 27015, 'batch_loss/train': 0.7529397895559669} 12/26/2021 12:39:12 - INFO - codeparrot_training - Step 27016: {'lr': 0.0002196756529483438, 'samples': 13832704, 'steps': 27016, 'batch_loss/train': 0.7541760453023016} 12/26/2021 12:39:23 - INFO - codeparrot_training - Step 27017: {'lr': 0.00021966000626604253, 'samples': 13833216, 'steps': 27017, 'batch_loss/train': 0.7897997591644526} 12/26/2021 12:39:35 - INFO - codeparrot_training - Step 27018: {'lr': 0.00021964435970436166, 'samples': 13833728, 'steps': 27018, 'batch_loss/train': 0.8469058303162456} 12/26/2021 12:39:45 - INFO - codeparrot_training - Step 27019: {'lr': 0.00021962871326336325, 'samples': 13834240, 'steps': 27019, 'batch_loss/train': 0.7272483045235276} 12/26/2021 12:39:56 
- INFO - codeparrot_training - Step 27020: {'lr': 0.00021961306694310968, 'samples': 13834752, 'steps': 27020, 'batch_loss/train': 0.7006875933147967} 12/26/2021 12:40:08 - INFO - codeparrot_training - Step 27021: {'lr': 0.00021959742074366311, 'samples': 13835264, 'steps': 27021, 'batch_loss/train': 0.7315167980268598} 12/26/2021 12:40:19 - INFO - codeparrot_training - Step 27022: {'lr': 0.00021958177466508574, 'samples': 13835776, 'steps': 27022, 'batch_loss/train': 0.7836270860861987} 12/26/2021 12:40:29 - INFO - codeparrot_training - Step 27023: {'lr': 0.00021956612870743965, 'samples': 13836288, 'steps': 27023, 'batch_loss/train': 1.781600458547473} 12/26/2021 12:40:42 - INFO - codeparrot_training - Step 27024: {'lr': 0.00021955048287078716, 'samples': 13836800, 'steps': 27024, 'batch_loss/train': 0.7434368012472987} 12/26/2021 12:40:52 - INFO - codeparrot_training - Step 27025: {'lr': 0.0002195348371551905, 'samples': 13837312, 'steps': 27025, 'batch_loss/train': 0.8770156526006758} 12/26/2021 12:41:03 - INFO - codeparrot_training - Step 27026: {'lr': 0.00021951919156071182, 'samples': 13837824, 'steps': 27026, 'batch_loss/train': 0.7966187102720141} 12/26/2021 12:41:13 - INFO - codeparrot_training - Step 27027: {'lr': 0.00021950354608741323, 'samples': 13838336, 'steps': 27027, 'batch_loss/train': 0.787079163827002} 12/26/2021 12:41:26 - INFO - codeparrot_training - Step 27028: {'lr': 0.00021948790073535716, 'samples': 13838848, 'steps': 27028, 'batch_loss/train': 0.7692256653681397} 12/26/2021 12:41:37 - INFO - codeparrot_training - Step 27029: {'lr': 0.0002194722555046056, 'samples': 13839360, 'steps': 27029, 'batch_loss/train': 0.8831538306549191} 12/26/2021 12:41:47 - INFO - codeparrot_training - Step 27030: {'lr': 0.0002194566103952208, 'samples': 13839872, 'steps': 27030, 'batch_loss/train': 0.8083289889618754} 12/26/2021 12:42:00 - INFO - codeparrot_training - Step 27031: {'lr': 0.00021944096540726506, 'samples': 13840384, 'steps': 27031, 
'batch_loss/train': 0.562446370255202} 12/26/2021 12:42:10 - INFO - codeparrot_training - Step 27032: {'lr': 0.00021942532054080053, 'samples': 13840896, 'steps': 27032, 'batch_loss/train': 0.6989931841380894} 12/26/2021 12:42:21 - INFO - codeparrot_training - Step 27033: {'lr': 0.0002194096757958893, 'samples': 13841408, 'steps': 27033, 'batch_loss/train': 0.8117665471509099} 12/26/2021 12:42:33 - INFO - codeparrot_training - Step 27034: {'lr': 0.0002193940311725936, 'samples': 13841920, 'steps': 27034, 'batch_loss/train': 0.7164223592262715} 12/26/2021 12:42:44 - INFO - codeparrot_training - Step 27035: {'lr': 0.0002193783866709758, 'samples': 13842432, 'steps': 27035, 'batch_loss/train': 1.6427463321015239} 12/26/2021 12:42:55 - INFO - codeparrot_training - Step 27036: {'lr': 0.00021936274229109794, 'samples': 13842944, 'steps': 27036, 'batch_loss/train': 0.8429020373150706} 12/26/2021 12:43:05 - INFO - codeparrot_training - Step 27037: {'lr': 0.00021934709803302224, 'samples': 13843456, 'steps': 27037, 'batch_loss/train': 0.9469894208014011} 12/26/2021 12:43:18 - INFO - codeparrot_training - Step 27038: {'lr': 0.00021933145389681083, 'samples': 13843968, 'steps': 27038, 'batch_loss/train': 0.8318827801849693} 12/26/2021 12:43:29 - INFO - codeparrot_training - Step 27039: {'lr': 0.0002193158098825261, 'samples': 13844480, 'steps': 27039, 'batch_loss/train': 0.7812955733388662} 12/26/2021 12:43:39 - INFO - codeparrot_training - Step 27040: {'lr': 0.00021930016599023007, 'samples': 13844992, 'steps': 27040, 'batch_loss/train': 0.7668367670848966} 12/26/2021 12:43:51 - INFO - codeparrot_training - Step 27041: {'lr': 0.0002192845222199849, 'samples': 13845504, 'steps': 27041, 'batch_loss/train': 0.7606734749861062} 12/26/2021 12:44:02 - INFO - codeparrot_training - Step 27042: {'lr': 0.00021926887857185302, 'samples': 13846016, 'steps': 27042, 'batch_loss/train': 0.7399984924122691} 12/26/2021 12:44:13 - INFO - codeparrot_training - Step 27043: {'lr': 
0.00021925323504589638, 'samples': 13846528, 'steps': 27043, 'batch_loss/train': 0.6473904829472303} 12/26/2021 12:44:25 - INFO - codeparrot_training - Step 27044: {'lr': 0.00021923759164217736, 'samples': 13847040, 'steps': 27044, 'batch_loss/train': 0.8271099627017975} 12/26/2021 12:44:36 - INFO - codeparrot_training - Step 27045: {'lr': 0.00021922194836075792, 'samples': 13847552, 'steps': 27045, 'batch_loss/train': 0.7703213612549007} 12/26/2021 12:44:47 - INFO - codeparrot_training - Step 27046: {'lr': 0.00021920630520170045, 'samples': 13848064, 'steps': 27046, 'batch_loss/train': 0.779523809440434} 12/26/2021 12:44:59 - INFO - codeparrot_training - Step 27047: {'lr': 0.00021919066216506716, 'samples': 13848576, 'steps': 27047, 'batch_loss/train': 0.7274111602455378} 12/26/2021 12:45:09 - INFO - codeparrot_training - Step 27048: {'lr': 0.00021917501925092014, 'samples': 13849088, 'steps': 27048, 'batch_loss/train': 0.7483838414773345} 12/26/2021 12:45:20 - INFO - codeparrot_training - Step 27049: {'lr': 0.00021915937645932153, 'samples': 13849600, 'steps': 27049, 'batch_loss/train': 0.7487923111766577} 12/26/2021 12:45:30 - INFO - codeparrot_training - Step 27050: {'lr': 0.0002191437337903336, 'samples': 13850112, 'steps': 27050, 'batch_loss/train': 0.8818279793485999} 12/26/2021 12:45:43 - INFO - codeparrot_training - Step 27051: {'lr': 0.00021912809124401863, 'samples': 13850624, 'steps': 27051, 'batch_loss/train': 0.8660522634163499} 12/26/2021 12:45:54 - INFO - codeparrot_training - Step 27052: {'lr': 0.00021911244882043864, 'samples': 13851136, 'steps': 27052, 'batch_loss/train': 0.8110558970365673} 12/26/2021 12:46:04 - INFO - codeparrot_training - Step 27053: {'lr': 0.00021909680651965586, 'samples': 13851648, 'steps': 27053, 'batch_loss/train': 1.012129764072597} 12/26/2021 12:46:18 - INFO - codeparrot_training - Step 27054: {'lr': 0.0002190811643417326, 'samples': 13852160, 'steps': 27054, 'batch_loss/train': 0.7084256396628916} 12/26/2021 12:46:28 - 
INFO - codeparrot_training - Step 27055: {'lr': 0.00021906552228673088, 'samples': 13852672, 'steps': 27055, 'batch_loss/train': 1.434898822568357} 12/26/2021 12:46:39 - INFO - codeparrot_training - Step 27056: {'lr': 0.00021904988035471293, 'samples': 13853184, 'steps': 27056, 'batch_loss/train': 1.0496738450601697} 12/26/2021 12:46:50 - INFO - codeparrot_training - Step 27057: {'lr': 0.00021903423854574108, 'samples': 13853696, 'steps': 27057, 'batch_loss/train': 1.036179605871439} 12/26/2021 12:47:02 - INFO - codeparrot_training - Step 27058: {'lr': 0.0002190185968598774, 'samples': 13854208, 'steps': 27058, 'batch_loss/train': 0.8004870277363807} 12/26/2021 12:47:12 - INFO - codeparrot_training - Step 27059: {'lr': 0.00021900295529718404, 'samples': 13854720, 'steps': 27059, 'batch_loss/train': 0.588340733665973} 12/26/2021 12:47:23 - INFO - codeparrot_training - Step 27060: {'lr': 0.00021898731385772315, 'samples': 13855232, 'steps': 27060, 'batch_loss/train': 0.7730168383568525} 12/26/2021 12:47:35 - INFO - codeparrot_training - Step 27061: {'lr': 0.00021897167254155713, 'samples': 13855744, 'steps': 27061, 'batch_loss/train': 0.809382401406765} 12/26/2021 12:47:46 - INFO - codeparrot_training - Step 27062: {'lr': 0.00021895603134874792, 'samples': 13856256, 'steps': 27062, 'batch_loss/train': 0.7451087338849902} 12/26/2021 12:47:56 - INFO - codeparrot_training - Step 27063: {'lr': 0.0002189403902793579, 'samples': 13856768, 'steps': 27063, 'batch_loss/train': 0.7126258290372789} 12/26/2021 12:48:07 - INFO - codeparrot_training - Step 27064: {'lr': 0.00021892474933344903, 'samples': 13857280, 'steps': 27064, 'batch_loss/train': 0.7846895977854729} 12/26/2021 12:48:19 - INFO - codeparrot_training - Step 27065: {'lr': 0.0002189091085110837, 'samples': 13857792, 'steps': 27065, 'batch_loss/train': 0.7160997511819005} 12/26/2021 12:48:30 - INFO - codeparrot_training - Step 27066: {'lr': 0.00021889346781232404, 'samples': 13858304, 'steps': 27066, 
'batch_loss/train': 0.7794391997158527} 12/26/2021 12:48:40 - INFO - codeparrot_training - Step 27067: {'lr': 0.00021887782723723214, 'samples': 13858816, 'steps': 27067, 'batch_loss/train': 0.8421553252264857} 12/26/2021 12:48:53 - INFO - codeparrot_training - Step 27068: {'lr': 0.00021886218678587033, 'samples': 13859328, 'steps': 27068, 'batch_loss/train': 0.7937546847388148} 12/26/2021 12:49:04 - INFO - codeparrot_training - Step 27069: {'lr': 0.00021884654645830065, 'samples': 13859840, 'steps': 27069, 'batch_loss/train': 0.745559782255441} 12/26/2021 12:49:14 - INFO - codeparrot_training - Step 27070: {'lr': 0.00021883090625458542, 'samples': 13860352, 'steps': 27070, 'batch_loss/train': 0.781346004921943} 12/26/2021 12:49:27 - INFO - codeparrot_training - Step 27071: {'lr': 0.00021881526617478658, 'samples': 13860864, 'steps': 27071, 'batch_loss/train': 0.8135878723114729} 12/26/2021 12:49:37 - INFO - codeparrot_training - Step 27072: {'lr': 0.00021879962621896654, 'samples': 13861376, 'steps': 27072, 'batch_loss/train': 0.7833338738419116} 12/26/2021 12:49:48 - INFO - codeparrot_training - Step 27073: {'lr': 0.0002187839863871875, 'samples': 13861888, 'steps': 27073, 'batch_loss/train': 0.7291790745221078} 12/26/2021 12:50:01 - INFO - codeparrot_training - Step 27074: {'lr': 0.00021876834667951145, 'samples': 13862400, 'steps': 27074, 'batch_loss/train': 0.8123081708326936} 12/26/2021 12:50:11 - INFO - codeparrot_training - Step 27075: {'lr': 0.0002187527070960006, 'samples': 13862912, 'steps': 27075, 'batch_loss/train': 0.7491811811923981} 12/26/2021 12:50:22 - INFO - codeparrot_training - Step 27076: {'lr': 0.0002187370676367173, 'samples': 13863424, 'steps': 27076, 'batch_loss/train': 0.781294490210712} 12/26/2021 12:50:33 - INFO - codeparrot_training - Step 27077: {'lr': 0.00021872142830172366, 'samples': 13863936, 'steps': 27077, 'batch_loss/train': 0.793626700527966} 12/26/2021 12:50:45 - INFO - codeparrot_training - Step 27078: {'lr': 
0.00021870578909108165, 'samples': 13864448, 'steps': 27078, 'batch_loss/train': 0.6436593462713063} 12/26/2021 12:50:55 - INFO - codeparrot_training - Step 27079: {'lr': 0.00021869015000485368, 'samples': 13864960, 'steps': 27079, 'batch_loss/train': 0.7208319630008191} 12/26/2021 12:51:06 - INFO - codeparrot_training - Step 27080: {'lr': 0.0002186745110431019, 'samples': 13865472, 'steps': 27080, 'batch_loss/train': 0.7263997981790453} 12/26/2021 12:51:18 - INFO - codeparrot_training - Step 27081: {'lr': 0.0002186588722058884, 'samples': 13865984, 'steps': 27081, 'batch_loss/train': 0.9427215843461454} 12/26/2021 12:51:29 - INFO - codeparrot_training - Step 27082: {'lr': 0.0002186432334932753, 'samples': 13866496, 'steps': 27082, 'batch_loss/train': 0.8414271725341678} 12/26/2021 12:51:39 - INFO - codeparrot_training - Step 27083: {'lr': 0.00021862759490532503, 'samples': 13867008, 'steps': 27083, 'batch_loss/train': 0.8108413126319647} 12/26/2021 12:51:52 - INFO - codeparrot_training - Step 27084: {'lr': 0.00021861195644209952, 'samples': 13867520, 'steps': 27084, 'batch_loss/train': 0.7967210188508034} 12/26/2021 12:52:03 - INFO - codeparrot_training - Step 27085: {'lr': 0.00021859631810366105, 'samples': 13868032, 'steps': 27085, 'batch_loss/train': 0.7856118390336633} 12/26/2021 12:52:14 - INFO - codeparrot_training - Step 27086: {'lr': 0.00021858067989007164, 'samples': 13868544, 'steps': 27086, 'batch_loss/train': 0.7972414479590952} 12/26/2021 12:52:24 - INFO - codeparrot_training - Step 27087: {'lr': 0.00021856504180139372, 'samples': 13869056, 'steps': 27087, 'batch_loss/train': 0.8154855170287192} 12/26/2021 12:52:36 - INFO - codeparrot_training - Step 27088: {'lr': 0.00021854940383768927, 'samples': 13869568, 'steps': 27088, 'batch_loss/train': 0.8462633369490504} 12/26/2021 12:52:47 - INFO - codeparrot_training - Step 27089: {'lr': 0.00021853376599902056, 'samples': 13870080, 'steps': 27089, 'batch_loss/train': 0.7942904438823462} 12/26/2021 12:52:58 
- INFO - codeparrot_training - Step 27090: {'lr': 0.0002185181282854496, 'samples': 13870592, 'steps': 27090, 'batch_loss/train': 0.7880741157568991} 12/26/2021 12:53:10 - INFO - codeparrot_training - Step 27091: {'lr': 0.00021850249069703875, 'samples': 13871104, 'steps': 27091, 'batch_loss/train': 0.7183612436638214} 12/26/2021 12:53:21 - INFO - codeparrot_training - Step 27092: {'lr': 0.00021848685323385016, 'samples': 13871616, 'steps': 27092, 'batch_loss/train': 0.8345005924347788} 12/26/2021 12:53:32 - INFO - codeparrot_training - Step 27093: {'lr': 0.00021847121589594582, 'samples': 13872128, 'steps': 27093, 'batch_loss/train': 0.8004102967679501} 12/26/2021 12:53:44 - INFO - codeparrot_training - Step 27094: {'lr': 0.00021845557868338812, 'samples': 13872640, 'steps': 27094, 'batch_loss/train': 0.7121539828367531} 12/26/2021 12:53:54 - INFO - codeparrot_training - Step 27095: {'lr': 0.0002184399415962391, 'samples': 13873152, 'steps': 27095, 'batch_loss/train': 0.7235813681036234} 12/26/2021 12:54:05 - INFO - codeparrot_training - Step 27096: {'lr': 0.00021842430463456098, 'samples': 13873664, 'steps': 27096, 'batch_loss/train': 0.7366345808841288} 12/26/2021 12:54:16 - INFO - codeparrot_training - Step 27097: {'lr': 0.0002184086677984158, 'samples': 13874176, 'steps': 27097, 'batch_loss/train': 0.8298656931146979} 12/26/2021 12:54:28 - INFO - codeparrot_training - Step 27098: {'lr': 0.0002183930310878659, 'samples': 13874688, 'steps': 27098, 'batch_loss/train': 0.7614655075594783} 12/26/2021 12:54:38 - INFO - codeparrot_training - Step 27099: {'lr': 0.0002183773945029734, 'samples': 13875200, 'steps': 27099, 'batch_loss/train': 1.5876053732354194} 12/26/2021 12:54:49 - INFO - codeparrot_training - Step 27100: {'lr': 0.00021836175804380042, 'samples': 13875712, 'steps': 27100, 'batch_loss/train': 0.7100616097450256} 12/26/2021 12:55:01 - INFO - codeparrot_training - Step 27101: {'lr': 0.0002183461217104091, 'samples': 13876224, 'steps': 27101, 
'batch_loss/train': 0.6580079422565177} 12/26/2021 12:55:12 - INFO - codeparrot_training - Step 27102: {'lr': 0.00021833048550286171, 'samples': 13876736, 'steps': 27102, 'batch_loss/train': 0.7946018704678863} 12/26/2021 12:55:22 - INFO - codeparrot_training - Step 27103: {'lr': 0.00021831484942122037, 'samples': 13877248, 'steps': 27103, 'batch_loss/train': 0.7266000350937247} 12/26/2021 12:55:35 - INFO - codeparrot_training - Step 27104: {'lr': 0.00021829921346554716, 'samples': 13877760, 'steps': 27104, 'batch_loss/train': 0.8187220357358456} 12/26/2021 12:55:46 - INFO - codeparrot_training - Step 27105: {'lr': 0.00021828357763590427, 'samples': 13878272, 'steps': 27105, 'batch_loss/train': 0.8175321104936302} 12/26/2021 12:55:56 - INFO - codeparrot_training - Step 27106: {'lr': 0.00021826794193235404, 'samples': 13878784, 'steps': 27106, 'batch_loss/train': 0.8699501026421785} 12/26/2021 12:56:07 - INFO - codeparrot_training - Step 27107: {'lr': 0.0002182523063549584, 'samples': 13879296, 'steps': 27107, 'batch_loss/train': 0.6648680404759943} 12/26/2021 12:56:19 - INFO - codeparrot_training - Step 27108: {'lr': 0.00021823667090377954, 'samples': 13879808, 'steps': 27108, 'batch_loss/train': 0.8258742885664105} 12/26/2021 12:56:30 - INFO - codeparrot_training - Step 27109: {'lr': 0.00021822103557887982, 'samples': 13880320, 'steps': 27109, 'batch_loss/train': 0.7412862260825932} 12/26/2021 12:56:40 - INFO - codeparrot_training - Step 27110: {'lr': 0.0002182054003803212, 'samples': 13880832, 'steps': 27110, 'batch_loss/train': 0.9426129348576069} 12/26/2021 12:56:52 - INFO - codeparrot_training - Step 27111: {'lr': 0.00021818976530816591, 'samples': 13881344, 'steps': 27111, 'batch_loss/train': 0.9827627884224057} 12/26/2021 12:57:03 - INFO - codeparrot_training - Step 27112: {'lr': 0.00021817413036247606, 'samples': 13881856, 'steps': 27112, 'batch_loss/train': 0.811134030809626} 12/26/2021 12:57:14 - INFO - codeparrot_training - Step 27113: {'lr': 
0.00021815849554331394, 'samples': 13882368, 'steps': 27113, 'batch_loss/train': 0.7723307642154396} 12/26/2021 12:57:26 - INFO - codeparrot_training - Step 27114: {'lr': 0.00021814286085074155, 'samples': 13882880, 'steps': 27114, 'batch_loss/train': 0.7148374242242426} 12/26/2021 12:57:37 - INFO - codeparrot_training - Step 27115: {'lr': 0.00021812722628482107, 'samples': 13883392, 'steps': 27115, 'batch_loss/train': 0.6243322346126661} 12/26/2021 12:57:48 - INFO - codeparrot_training - Step 27116: {'lr': 0.0002181115918456148, 'samples': 13883904, 'steps': 27116, 'batch_loss/train': 0.8197688583750278} 12/26/2021 12:58:00 - INFO - codeparrot_training - Step 27117: {'lr': 0.00021809595753318476, 'samples': 13884416, 'steps': 27117, 'batch_loss/train': 0.7262420190963894} 12/26/2021 12:58:10 - INFO - codeparrot_training - Step 27118: {'lr': 0.0002180803233475932, 'samples': 13884928, 'steps': 27118, 'batch_loss/train': 0.821019237395376} 12/26/2021 12:58:21 - INFO - codeparrot_training - Step 27119: {'lr': 0.0002180646892889021, 'samples': 13885440, 'steps': 27119, 'batch_loss/train': 0.7356160450726748} 12/26/2021 12:58:31 - INFO - codeparrot_training - Step 27120: {'lr': 0.0002180490553571738, 'samples': 13885952, 'steps': 27120, 'batch_loss/train': 0.7506029678042978} 12/26/2021 12:58:44 - INFO - codeparrot_training - Step 27121: {'lr': 0.00021803342155247047, 'samples': 13886464, 'steps': 27121, 'batch_loss/train': 0.7654706547036767} 12/26/2021 12:58:55 - INFO - codeparrot_training - Step 27122: {'lr': 0.00021801778787485415, 'samples': 13886976, 'steps': 27122, 'batch_loss/train': 0.7212541177868843} 12/26/2021 12:59:06 - INFO - codeparrot_training - Step 27123: {'lr': 0.00021800215432438695, 'samples': 13887488, 'steps': 27123, 'batch_loss/train': 0.7998914867639542} 12/26/2021 12:59:18 - INFO - codeparrot_training - Step 27124: {'lr': 0.00021798652090113113, 'samples': 13888000, 'steps': 27124, 'batch_loss/train': 0.840723967179656} 12/26/2021 12:59:28 - 
INFO - codeparrot_training - Step 27125: {'lr': 0.0002179708876051489, 'samples': 13888512, 'steps': 27125, 'batch_loss/train': 0.7694286247715354} 12/26/2021 12:59:39 - INFO - codeparrot_training - Step 27126: {'lr': 0.00021795525443650223, 'samples': 13889024, 'steps': 27126, 'batch_loss/train': 0.6716906378278509} 12/26/2021 12:59:51 - INFO - codeparrot_training - Step 27127: {'lr': 0.00021793962139525337, 'samples': 13889536, 'steps': 27127, 'batch_loss/train': 0.8311259895563126} 12/26/2021 13:00:02 - INFO - codeparrot_training - Step 27128: {'lr': 0.00021792398848146454, 'samples': 13890048, 'steps': 27128, 'batch_loss/train': 0.7572008902207017} 12/26/2021 13:00:13 - INFO - codeparrot_training - Step 27129: {'lr': 0.00021790835569519783, 'samples': 13890560, 'steps': 27129, 'batch_loss/train': 0.5487365419976413} 12/26/2021 13:00:23 - INFO - codeparrot_training - Step 27130: {'lr': 0.00021789272303651526, 'samples': 13891072, 'steps': 27130, 'batch_loss/train': 0.7727763800648972} 12/26/2021 13:00:36 - INFO - codeparrot_training - Step 27131: {'lr': 0.00021787709050547917, 'samples': 13891584, 'steps': 27131, 'batch_loss/train': 0.722957078833133} 12/26/2021 13:00:47 - INFO - codeparrot_training - Step 27132: {'lr': 0.0002178614581021517, 'samples': 13892096, 'steps': 27132, 'batch_loss/train': 0.7817056505009532} 12/26/2021 13:00:57 - INFO - codeparrot_training - Step 27133: {'lr': 0.00021784582582659484, 'samples': 13892608, 'steps': 27133, 'batch_loss/train': 0.7707389332354069} 12/26/2021 13:01:10 - INFO - codeparrot_training - Step 27134: {'lr': 0.0002178301936788708, 'samples': 13893120, 'steps': 27134, 'batch_loss/train': 0.7625471208011732} 12/26/2021 13:01:20 - INFO - codeparrot_training - Step 27135: {'lr': 0.0002178145616590419, 'samples': 13893632, 'steps': 27135, 'batch_loss/train': 0.8191851545125246} 12/26/2021 13:01:31 - INFO - codeparrot_training - Step 27136: {'lr': 0.00021779892976717008, 'samples': 13894144, 'steps': 27136, 
'batch_loss/train': 0.7463194709271193} 12/26/2021 13:01:43 - INFO - codeparrot_training - Step 27137: {'lr': 0.00021778329800331762, 'samples': 13894656, 'steps': 27137, 'batch_loss/train': 0.7906886297278106} 12/26/2021 13:01:54 - INFO - codeparrot_training - Step 27138: {'lr': 0.00021776766636754646, 'samples': 13895168, 'steps': 27138, 'batch_loss/train': 0.8154989648610353} 12/26/2021 13:02:04 - INFO - codeparrot_training - Step 27139: {'lr': 0.000217752034859919, 'samples': 13895680, 'steps': 27139, 'batch_loss/train': 0.8336400042753667} 12/26/2021 13:02:15 - INFO - codeparrot_training - Step 27140: {'lr': 0.00021773640348049728, 'samples': 13896192, 'steps': 27140, 'batch_loss/train': 0.7406328469514847} 12/26/2021 13:02:27 - INFO - codeparrot_training - Step 27141: {'lr': 0.00021772077222934335, 'samples': 13896704, 'steps': 27141, 'batch_loss/train': 0.5720992251299322} 12/26/2021 13:02:37 - INFO - codeparrot_training - Step 27142: {'lr': 0.0002177051411065195, 'samples': 13897216, 'steps': 27142, 'batch_loss/train': 0.7950016250833869} 12/26/2021 13:02:48 - INFO - codeparrot_training - Step 27143: {'lr': 0.00021768951011208782, 'samples': 13897728, 'steps': 27143, 'batch_loss/train': 0.7916628890670836} 12/26/2021 13:03:01 - INFO - codeparrot_training - Step 27144: {'lr': 0.00021767387924611048, 'samples': 13898240, 'steps': 27144, 'batch_loss/train': 0.67946937517263} 12/26/2021 13:03:12 - INFO - codeparrot_training - Step 27145: {'lr': 0.0002176582485086495, 'samples': 13898752, 'steps': 27145, 'batch_loss/train': 0.7375022559426725} 12/26/2021 13:03:22 - INFO - codeparrot_training - Step 27146: {'lr': 0.00021764261789976715, 'samples': 13899264, 'steps': 27146, 'batch_loss/train': 0.8013742389157414} 12/26/2021 13:03:34 - INFO - codeparrot_training - Step 27147: {'lr': 0.00021762698741952557, 'samples': 13899776, 'steps': 27147, 'batch_loss/train': 0.7948544169776142} 12/26/2021 13:03:45 - INFO - codeparrot_training - Step 27148: {'lr': 
0.0002176113570679869, 'samples': 13900288, 'steps': 27148, 'batch_loss/train': 0.7543308027088642} 12/26/2021 13:03:55 - INFO - codeparrot_training - Step 27149: {'lr': 0.00021759572684521312, 'samples': 13900800, 'steps': 27149, 'batch_loss/train': 0.7966751968488097} 12/26/2021 13:04:08 - INFO - codeparrot_training - Step 27150: {'lr': 0.00021758009675126654, 'samples': 13901312, 'steps': 27150, 'batch_loss/train': 0.682664658408612} 12/26/2021 13:04:18 - INFO - codeparrot_training - Step 27151: {'lr': 0.00021756446678620933, 'samples': 13901824, 'steps': 27151, 'batch_loss/train': 0.7275812588632107} 12/26/2021 13:04:29 - INFO - codeparrot_training - Step 27152: {'lr': 0.00021754883695010343, 'samples': 13902336, 'steps': 27152, 'batch_loss/train': 0.7778588319197297} 12/26/2021 13:04:40 - INFO - codeparrot_training - Step 27153: {'lr': 0.00021753320724301117, 'samples': 13902848, 'steps': 27153, 'batch_loss/train': 0.8127177143469453} 12/26/2021 13:04:52 - INFO - codeparrot_training - Step 27154: {'lr': 0.00021751757766499467, 'samples': 13903360, 'steps': 27154, 'batch_loss/train': 0.6921047195792198} 12/26/2021 13:05:03 - INFO - codeparrot_training - Step 27155: {'lr': 0.00021750194821611594, 'samples': 13903872, 'steps': 27155, 'batch_loss/train': 0.8137591991107911} 12/26/2021 13:05:14 - INFO - codeparrot_training - Step 27156: {'lr': 0.00021748631889643717, 'samples': 13904384, 'steps': 27156, 'batch_loss/train': 0.5954364044591784} 12/26/2021 13:05:26 - INFO - codeparrot_training - Step 27157: {'lr': 0.0002174706897060206, 'samples': 13904896, 'steps': 27157, 'batch_loss/train': 0.7736724596470594} 12/26/2021 13:05:36 - INFO - codeparrot_training - Step 27158: {'lr': 0.00021745506064492829, 'samples': 13905408, 'steps': 27158, 'batch_loss/train': 0.8878686367534101} 12/26/2021 13:05:47 - INFO - codeparrot_training - Step 27159: {'lr': 0.00021743943171322234, 'samples': 13905920, 'steps': 27159, 'batch_loss/train': 0.729830470867455} 12/26/2021 13:05:59 - 
INFO - codeparrot_training - Step 27160: {'lr': 0.0002174238029109648, 'samples': 13906432, 'steps': 27160, 'batch_loss/train': 0.7685317504219711} 12/26/2021 13:06:10 - INFO - codeparrot_training - Step 27161: {'lr': 0.0002174081742382181, 'samples': 13906944, 'steps': 27161, 'batch_loss/train': 0.7988546201959252} 12/26/2021 13:06:21 - INFO - codeparrot_training - Step 27162: {'lr': 0.0002173925456950441, 'samples': 13907456, 'steps': 27162, 'batch_loss/train': 0.7338656370993704} 12/26/2021 13:06:31 - INFO - codeparrot_training - Step 27163: {'lr': 0.0002173769172815051, 'samples': 13907968, 'steps': 27163, 'batch_loss/train': 0.8197123003192246} 12/26/2021 13:06:44 - INFO - codeparrot_training - Step 27164: {'lr': 0.00021736128899766302, 'samples': 13908480, 'steps': 27164, 'batch_loss/train': 1.5042116409167647} 12/26/2021 13:06:54 - INFO - codeparrot_training - Step 27165: {'lr': 0.00021734566084358022, 'samples': 13908992, 'steps': 27165, 'batch_loss/train': 0.5574335539713502} 12/26/2021 13:07:05 - INFO - codeparrot_training - Step 27166: {'lr': 0.0002173300328193188, 'samples': 13909504, 'steps': 27166, 'batch_loss/train': 0.6928272880613804} 12/26/2021 13:07:17 - INFO - codeparrot_training - Step 27167: {'lr': 0.0002173144049249407, 'samples': 13910016, 'steps': 27167, 'batch_loss/train': 0.7365514598786831} 12/26/2021 13:07:27 - INFO - codeparrot_training - Step 27168: {'lr': 0.00021729877716050828, 'samples': 13910528, 'steps': 27168, 'batch_loss/train': 0.7901913430541754} 12/26/2021 13:07:38 - INFO - codeparrot_training - Step 27169: {'lr': 0.00021728314952608354, 'samples': 13911040, 'steps': 27169, 'batch_loss/train': 0.7468491735635325} 12/26/2021 13:07:51 - INFO - codeparrot_training - Step 27170: {'lr': 0.00021726752202172873, 'samples': 13911552, 'steps': 27170, 'batch_loss/train': 0.8004070268943906} 12/26/2021 13:08:01 - INFO - codeparrot_training - Step 27171: {'lr': 0.00021725189464750572, 'samples': 13912064, 'steps': 27171, 
'batch_loss/train': 0.770708717405796} 12/26/2021 13:08:12 - INFO - codeparrot_training - Step 27172: {'lr': 0.0002172362674034769, 'samples': 13912576, 'steps': 27172, 'batch_loss/train': 0.5760654994519427} 12/26/2021 13:08:23 - INFO - codeparrot_training - Step 27173: {'lr': 0.0002172206402897044, 'samples': 13913088, 'steps': 27173, 'batch_loss/train': 0.8182404469698668} 12/26/2021 13:08:35 - INFO - codeparrot_training - Step 27174: {'lr': 0.0002172050133062502, 'samples': 13913600, 'steps': 27174, 'batch_loss/train': 0.7858980847522616} 12/26/2021 13:08:46 - INFO - codeparrot_training - Step 27175: {'lr': 0.0002171893864531764, 'samples': 13914112, 'steps': 27175, 'batch_loss/train': 0.6581841219449416} 12/26/2021 13:08:56 - INFO - codeparrot_training - Step 27176: {'lr': 0.00021717375973054527, 'samples': 13914624, 'steps': 27176, 'batch_loss/train': 0.8197712432593107} 12/26/2021 13:09:08 - INFO - codeparrot_training - Step 27177: {'lr': 0.00021715813313841892, 'samples': 13915136, 'steps': 27177, 'batch_loss/train': 0.719864524435252} 12/26/2021 13:09:19 - INFO - codeparrot_training - Step 27178: {'lr': 0.0002171425066768593, 'samples': 13915648, 'steps': 27178, 'batch_loss/train': 0.5885572559782304} 12/26/2021 13:09:30 - INFO - codeparrot_training - Step 27179: {'lr': 0.00021712688034592875, 'samples': 13916160, 'steps': 27179, 'batch_loss/train': 0.8334966888651252} 12/26/2021 13:09:42 - INFO - codeparrot_training - Step 27180: {'lr': 0.00021711125414568939, 'samples': 13916672, 'steps': 27180, 'batch_loss/train': 0.7438372685573995} 12/26/2021 13:09:52 - INFO - codeparrot_training - Step 27181: {'lr': 0.00021709562807620318, 'samples': 13917184, 'steps': 27181, 'batch_loss/train': 0.7454559940379113} 12/26/2021 13:10:03 - INFO - codeparrot_training - Step 27182: {'lr': 0.00021708000213753226, 'samples': 13917696, 'steps': 27182, 'batch_loss/train': 0.825958036351949} 12/26/2021 13:10:16 - INFO - codeparrot_training - Step 27183: {'lr': 
0.00021706437632973896, 'samples': 13918208, 'steps': 27183, 'batch_loss/train': 0.7852405402809381} 12/26/2021 13:10:26 - INFO - codeparrot_training - Step 27184: {'lr': 0.00021704875065288527, 'samples': 13918720, 'steps': 27184, 'batch_loss/train': 0.7660761141451076} 12/26/2021 13:10:37 - INFO - codeparrot_training - Step 27185: {'lr': 0.0002170331251070332, 'samples': 13919232, 'steps': 27185, 'batch_loss/train': 0.9318880988284945} 12/26/2021 13:10:47 - INFO - codeparrot_training - Step 27186: {'lr': 0.00021701749969224495, 'samples': 13919744, 'steps': 27186, 'batch_loss/train': 0.7673905971460044} 12/26/2021 13:10:59 - INFO - codeparrot_training - Step 27187: {'lr': 0.0002170018744085828, 'samples': 13920256, 'steps': 27187, 'batch_loss/train': 0.8371842717751861} 12/26/2021 13:11:10 - INFO - codeparrot_training - Step 27188: {'lr': 0.00021698624925610866, 'samples': 13920768, 'steps': 27188, 'batch_loss/train': 0.8226602403447032} 12/26/2021 13:11:21 - INFO - codeparrot_training - Step 27189: {'lr': 0.00021697062423488468, 'samples': 13921280, 'steps': 27189, 'batch_loss/train': 0.8650006866082549} 12/26/2021 13:11:33 - INFO - codeparrot_training - Step 27190: {'lr': 0.00021695499934497314, 'samples': 13921792, 'steps': 27190, 'batch_loss/train': 0.7525924732908607} 12/26/2021 13:11:44 - INFO - codeparrot_training - Step 27191: {'lr': 0.00021693937458643598, 'samples': 13922304, 'steps': 27191, 'batch_loss/train': 0.7396597884362563} 12/26/2021 13:11:55 - INFO - codeparrot_training - Step 27192: {'lr': 0.00021692374995933546, 'samples': 13922816, 'steps': 27192, 'batch_loss/train': 0.7689241981133819} 12/26/2021 13:12:07 - INFO - codeparrot_training - Step 27193: {'lr': 0.00021690812546373348, 'samples': 13923328, 'steps': 27193, 'batch_loss/train': 0.6986834856215864} 12/26/2021 13:12:17 - INFO - codeparrot_training - Step 27194: {'lr': 0.00021689250109969242, 'samples': 13923840, 'steps': 27194, 'batch_loss/train': 0.7162663314957172} 12/26/2021 13:12:28 
- INFO - codeparrot_training - Step 27195: {'lr': 0.00021687687686727423, 'samples': 13924352, 'steps': 27195, 'batch_loss/train': 0.7787895798683167} 12/26/2021 13:12:40 - INFO - codeparrot_training - Step 27196: {'lr': 0.00021686125276654112, 'samples': 13924864, 'steps': 27196, 'batch_loss/train': 0.7925372673198581} 12/26/2021 13:12:51 - INFO - codeparrot_training - Step 27197: {'lr': 0.00021684562879755503, 'samples': 13925376, 'steps': 27197, 'batch_loss/train': 0.7754449415951967} 12/26/2021 13:13:01 - INFO - codeparrot_training - Step 27198: {'lr': 0.00021683000496037827, 'samples': 13925888, 'steps': 27198, 'batch_loss/train': 0.6875720005482435} 12/26/2021 13:13:12 - INFO - codeparrot_training - Step 27199: {'lr': 0.00021681438125507294, 'samples': 13926400, 'steps': 27199, 'batch_loss/train': 0.7437633546069264} 12/26/2021 13:13:25 - INFO - codeparrot_training - Step 27200: {'lr': 0.00021679875768170105, 'samples': 13926912, 'steps': 27200, 'batch_loss/train': 0.7322270143777132} 12/26/2021 13:13:35 - INFO - codeparrot_training - Step 27201: {'lr': 0.0002167831342403247, 'samples': 13927424, 'steps': 27201, 'batch_loss/train': 0.7838300270959735} 12/26/2021 13:13:46 - INFO - codeparrot_training - Step 27202: {'lr': 0.00021676751093100617, 'samples': 13927936, 'steps': 27202, 'batch_loss/train': 0.716859613719862} 12/26/2021 13:13:58 - INFO - codeparrot_training - Step 27203: {'lr': 0.0002167518877538075, 'samples': 13928448, 'steps': 27203, 'batch_loss/train': 0.7750191250815988} 12/26/2021 13:14:08 - INFO - codeparrot_training - Step 27204: {'lr': 0.00021673626470879062, 'samples': 13928960, 'steps': 27204, 'batch_loss/train': 0.6689856229932047} 12/26/2021 13:14:19 - INFO - codeparrot_training - Step 27205: {'lr': 0.00021672064179601786, 'samples': 13929472, 'steps': 27205, 'batch_loss/train': 0.8102872051531449} 12/26/2021 13:14:31 - INFO - codeparrot_training - Step 27206: {'lr': 0.00021670501901555135, 'samples': 13929984, 'steps': 27206, 
'batch_loss/train': 0.7617429653182626} 12/26/2021 13:14:42 - INFO - codeparrot_training - Step 27207: {'lr': 0.000216689396367453, 'samples': 13930496, 'steps': 27207, 'batch_loss/train': 0.6580883885035291} 12/26/2021 13:14:52 - INFO - codeparrot_training - Step 27208: {'lr': 0.00021667377385178502, 'samples': 13931008, 'steps': 27208, 'batch_loss/train': 0.5714455780107528} 12/26/2021 13:15:03 - INFO - codeparrot_training - Step 27209: {'lr': 0.00021665815146860963, 'samples': 13931520, 'steps': 27209, 'batch_loss/train': 0.7166253803297877} 12/26/2021 13:15:17 - INFO - codeparrot_training - Step 27210: {'lr': 0.00021664252921798884, 'samples': 13932032, 'steps': 27210, 'batch_loss/train': 0.7633572886697948} 12/26/2021 13:15:27 - INFO - codeparrot_training - Step 27211: {'lr': 0.0002166269070999847, 'samples': 13932544, 'steps': 27211, 'batch_loss/train': 0.7582415184006095} 12/26/2021 13:15:38 - INFO - codeparrot_training - Step 27212: {'lr': 0.00021661128511465933, 'samples': 13933056, 'steps': 27212, 'batch_loss/train': 1.2208243769127876} 12/26/2021 13:15:50 - INFO - codeparrot_training - Step 27213: {'lr': 0.00021659566326207496, 'samples': 13933568, 'steps': 27213, 'batch_loss/train': 0.7081458461470902} 12/26/2021 13:16:01 - INFO - codeparrot_training - Step 27214: {'lr': 0.0002165800415422936, 'samples': 13934080, 'steps': 27214, 'batch_loss/train': 0.8444514228031039} 12/26/2021 13:16:11 - INFO - codeparrot_training - Step 27215: {'lr': 0.00021656441995537734, 'samples': 13934592, 'steps': 27215, 'batch_loss/train': 0.7960063051432371} 12/26/2021 13:16:24 - INFO - codeparrot_training - Step 27216: {'lr': 0.00021654879850138838, 'samples': 13935104, 'steps': 27216, 'batch_loss/train': 0.6514921197667718} 12/26/2021 13:16:35 - INFO - codeparrot_training - Step 27217: {'lr': 0.00021653317718038875, 'samples': 13935616, 'steps': 27217, 'batch_loss/train': 0.7056590393185616} 12/26/2021 13:16:45 - INFO - codeparrot_training - Step 27218: {'lr': 
0.00021651755599244062, 'samples': 13936128, 'steps': 27218, 'batch_loss/train': 0.7420390956103802} 12/26/2021 13:16:56 - INFO - codeparrot_training - Step 27219: {'lr': 0.00021650193493760592, 'samples': 13936640, 'steps': 27219, 'batch_loss/train': 0.8523889114148915} 12/26/2021 13:17:08 - INFO - codeparrot_training - Step 27220: {'lr': 0.00021648631401594698, 'samples': 13937152, 'steps': 27220, 'batch_loss/train': 0.6987639035796747} 12/26/2021 13:17:19 - INFO - codeparrot_training - Step 27221: {'lr': 0.00021647069322752578, 'samples': 13937664, 'steps': 27221, 'batch_loss/train': 0.8319571511819959} 12/26/2021 13:17:30 - INFO - codeparrot_training - Step 27222: {'lr': 0.0002164550725724045, 'samples': 13938176, 'steps': 27222, 'batch_loss/train': 0.7248290125280619} 12/26/2021 13:17:42 - INFO - codeparrot_training - Step 27223: {'lr': 0.00021643945205064504, 'samples': 13938688, 'steps': 27223, 'batch_loss/train': 0.7175038680434227} 12/26/2021 13:17:53 - INFO - codeparrot_training - Step 27224: {'lr': 0.00021642383166230972, 'samples': 13939200, 'steps': 27224, 'batch_loss/train': 0.7992959674447775} 12/26/2021 13:18:03 - INFO - codeparrot_training - Step 27225: {'lr': 0.00021640821140746062, 'samples': 13939712, 'steps': 27225, 'batch_loss/train': 0.7977613476105034} 12/26/2021 13:18:16 - INFO - codeparrot_training - Step 27226: {'lr': 0.00021639259128615967, 'samples': 13940224, 'steps': 27226, 'batch_loss/train': 0.5686143399798311} 12/26/2021 13:18:27 - INFO - codeparrot_training - Step 27227: {'lr': 0.00021637697129846915, 'samples': 13940736, 'steps': 27227, 'batch_loss/train': 0.7828495283611119} 12/26/2021 13:18:37 - INFO - codeparrot_training - Step 27228: {'lr': 0.00021636135144445116, 'samples': 13941248, 'steps': 27228, 'batch_loss/train': 0.769777289358899} 12/26/2021 13:18:48 - INFO - codeparrot_training - Step 27229: {'lr': 0.00021634573172416772, 'samples': 13941760, 'steps': 27229, 'batch_loss/train': 0.6886118231341243} 12/26/2021 13:19:00 
- INFO - codeparrot_training - Step 27230: {'lr': 0.00021633011213768083, 'samples': 13942272, 'steps': 27230, 'batch_loss/train': 0.7176682483404875} 12/26/2021 13:19:11 - INFO - codeparrot_training - Step 27231: {'lr': 0.00021631449268505278, 'samples': 13942784, 'steps': 27231, 'batch_loss/train': 0.6014870899380185} 12/26/2021 13:19:21 - INFO - codeparrot_training - Step 27232: {'lr': 0.00021629887336634563, 'samples': 13943296, 'steps': 27232, 'batch_loss/train': 0.7714622672647238} 12/26/2021 13:19:35 - INFO - codeparrot_training - Step 27233: {'lr': 0.00021628325418162136, 'samples': 13943808, 'steps': 27233, 'batch_loss/train': 0.8038583733141422} 12/26/2021 13:19:45 - INFO - codeparrot_training - Step 27234: {'lr': 0.0002162676351309421, 'samples': 13944320, 'steps': 27234, 'batch_loss/train': 0.8422315088100731} 12/26/2021 13:19:56 - INFO - codeparrot_training - Step 27235: {'lr': 0.0002162520162143701, 'samples': 13944832, 'steps': 27235, 'batch_loss/train': 0.6787443356588483} 12/26/2021 13:20:08 - INFO - codeparrot_training - Step 27236: {'lr': 0.0002162363974319673, 'samples': 13945344, 'steps': 27236, 'batch_loss/train': 0.5963813392445445} 12/26/2021 13:20:18 - INFO - codeparrot_training - Step 27237: {'lr': 0.00021622077878379587, 'samples': 13945856, 'steps': 27237, 'batch_loss/train': 0.6830056535545737} 12/26/2021 13:20:29 - INFO - codeparrot_training - Step 27238: {'lr': 0.00021620516026991777, 'samples': 13946368, 'steps': 27238, 'batch_loss/train': 0.6639145319350064} 12/26/2021 13:20:41 - INFO - codeparrot_training - Step 27239: {'lr': 0.0002161895418903953, 'samples': 13946880, 'steps': 27239, 'batch_loss/train': 0.6495601636124775} 12/26/2021 13:20:52 - INFO - codeparrot_training - Step 27240: {'lr': 0.00021617392364529038, 'samples': 13947392, 'steps': 27240, 'batch_loss/train': 0.6775653185322881} 12/26/2021 13:21:02 - INFO - codeparrot_training - Step 27241: {'lr': 0.00021615830553466513, 'samples': 13947904, 'steps': 27241, 
'batch_loss/train': 0.7794518987648189} 12/26/2021 13:21:13 - INFO - codeparrot_training - Step 27242: {'lr': 0.00021614268755858177, 'samples': 13948416, 'steps': 27242, 'batch_loss/train': 0.7755796652054414} 12/26/2021 13:21:25 - INFO - codeparrot_training - Step 27243: {'lr': 0.00021612706971710225, 'samples': 13948928, 'steps': 27243, 'batch_loss/train': 0.7977935504168272} 12/26/2021 13:21:36 - INFO - codeparrot_training - Step 27244: {'lr': 0.00021611145201028877, 'samples': 13949440, 'steps': 27244, 'batch_loss/train': 0.75275884475559} 12/26/2021 13:21:46 - INFO - codeparrot_training - Step 27245: {'lr': 0.00021609583443820324, 'samples': 13949952, 'steps': 27245, 'batch_loss/train': 0.6399912384222262} 12/26/2021 13:21:59 - INFO - codeparrot_training - Step 27246: {'lr': 0.00021608021700090794, 'samples': 13950464, 'steps': 27246, 'batch_loss/train': 0.7015347694978118} 12/26/2021 13:22:10 - INFO - codeparrot_training - Step 27247: {'lr': 0.00021606459969846496, 'samples': 13950976, 'steps': 27247, 'batch_loss/train': 0.7046430506743491} 12/26/2021 13:22:20 - INFO - codeparrot_training - Step 27248: {'lr': 0.0002160489825309363, 'samples': 13951488, 'steps': 27248, 'batch_loss/train': 0.6316765104420483} 12/26/2021 13:22:32 - INFO - codeparrot_training - Step 27249: {'lr': 0.00021603336549838392, 'samples': 13952000, 'steps': 27249, 'batch_loss/train': 0.7750612022355199} 12/26/2021 13:22:43 - INFO - codeparrot_training - Step 27250: {'lr': 0.00021601774860087014, 'samples': 13952512, 'steps': 27250, 'batch_loss/train': 0.7403640220873058} 12/26/2021 13:22:54 - INFO - codeparrot_training - Step 27251: {'lr': 0.00021600213183845703, 'samples': 13953024, 'steps': 27251, 'batch_loss/train': 0.8010115143842995} 12/26/2021 13:23:04 - INFO - codeparrot_training - Step 27252: {'lr': 0.0002159865152112065, 'samples': 13953536, 'steps': 27252, 'batch_loss/train': 0.7791456738486886} 12/26/2021 13:23:16 - INFO - codeparrot_training - Step 27253: {'lr': 
0.00021597089871918076, 'samples': 13954048, 'steps': 27253, 'batch_loss/train': 0.7514928611926734} 12/26/2021 13:23:27 - INFO - codeparrot_training - Step 27254: {'lr': 0.00021595528236244196, 'samples': 13954560, 'steps': 27254, 'batch_loss/train': 0.7237394331023097} 12/26/2021 13:23:38 - INFO - codeparrot_training - Step 27255: {'lr': 0.0002159396661410521, 'samples': 13955072, 'steps': 27255, 'batch_loss/train': 0.7462077271193266} 12/26/2021 13:23:51 - INFO - codeparrot_training - Step 27256: {'lr': 0.00021592405005507314, 'samples': 13955584, 'steps': 27256, 'batch_loss/train': 0.7743469309061766} 12/26/2021 13:24:02 - INFO - codeparrot_training - Step 27257: {'lr': 0.0002159084341045674, 'samples': 13956096, 'steps': 27257, 'batch_loss/train': 0.7982581984251738} 12/26/2021 13:24:12 - INFO - codeparrot_training - Step 27258: {'lr': 0.00021589281828959683, 'samples': 13956608, 'steps': 27258, 'batch_loss/train': 0.8406065278686583} 12/26/2021 13:24:24 - INFO - codeparrot_training - Step 27259: {'lr': 0.00021587720261022353, 'samples': 13957120, 'steps': 27259, 'batch_loss/train': 0.462055156705901} 12/26/2021 13:24:35 - INFO - codeparrot_training - Step 27260: {'lr': 0.00021586158706650955, 'samples': 13957632, 'steps': 27260, 'batch_loss/train': 0.6888150159502402} 12/26/2021 13:24:46 - INFO - codeparrot_training - Step 27261: {'lr': 0.00021584597165851706, 'samples': 13958144, 'steps': 27261, 'batch_loss/train': 0.6735057001933455} 12/26/2021 13:24:58 - INFO - codeparrot_training - Step 27262: {'lr': 0.0002158303563863081, 'samples': 13958656, 'steps': 27262, 'batch_loss/train': 0.8011096132104285} 12/26/2021 13:25:08 - INFO - codeparrot_training - Step 27263: {'lr': 0.00021581474124994466, 'samples': 13959168, 'steps': 27263, 'batch_loss/train': 0.6348018195712939} 12/26/2021 13:25:19 - INFO - codeparrot_training - Step 27264: {'lr': 0.000215799126249489, 'samples': 13959680, 'steps': 27264, 'batch_loss/train': 0.7797064799815416} 12/26/2021 13:25:30 - 
INFO - codeparrot_training - Step 27265: {'lr': 0.0002157835113850031, 'samples': 13960192, 'steps': 27265, 'batch_loss/train': 0.7268147207796574} 12/26/2021 13:25:42 - INFO - codeparrot_training - Step 27266: {'lr': 0.00021576789665654896, 'samples': 13960704, 'steps': 27266, 'batch_loss/train': 0.7752494341693819} 12/26/2021 13:25:53 - INFO - codeparrot_training - Step 27267: {'lr': 0.00021575228206418873, 'samples': 13961216, 'steps': 27267, 'batch_loss/train': 0.6358629220048897} 12/26/2021 13:26:04 - INFO - codeparrot_training - Step 27268: {'lr': 0.0002157366676079846, 'samples': 13961728, 'steps': 27268, 'batch_loss/train': 0.7664312198758125} 12/26/2021 13:26:16 - INFO - codeparrot_training - Step 27269: {'lr': 0.00021572105328799846, 'samples': 13962240, 'steps': 27269, 'batch_loss/train': 0.6792155737057328} 12/26/2021 13:26:26 - INFO - codeparrot_training - Step 27270: {'lr': 0.00021570543910429254, 'samples': 13962752, 'steps': 27270, 'batch_loss/train': 0.7756683919578791} 12/26/2021 13:26:37 - INFO - codeparrot_training - Step 27271: {'lr': 0.00021568982505692876, 'samples': 13963264, 'steps': 27271, 'batch_loss/train': 0.8135379033628851} 12/26/2021 13:26:50 - INFO - codeparrot_training - Step 27272: {'lr': 0.0002156742111459693, 'samples': 13963776, 'steps': 27272, 'batch_loss/train': 0.568958580144681} 12/26/2021 13:27:01 - INFO - codeparrot_training - Step 27273: {'lr': 0.00021565859737147626, 'samples': 13964288, 'steps': 27273, 'batch_loss/train': 0.7800338980741799} 12/26/2021 13:27:11 - INFO - codeparrot_training - Step 27274: {'lr': 0.00021564298373351172, 'samples': 13964800, 'steps': 27274, 'batch_loss/train': 1.727379010990262} 12/26/2021 13:27:22 - INFO - codeparrot_training - Step 27275: {'lr': 0.00021562737023213754, 'samples': 13965312, 'steps': 27275, 'batch_loss/train': 0.7799513605423272} 12/26/2021 13:27:34 - INFO - codeparrot_training - Step 27276: {'lr': 0.000215611756867416, 'samples': 13965824, 'steps': 27276, 
'batch_loss/train': 0.7438491573557258} 12/26/2021 13:27:45 - INFO - codeparrot_training - Step 27277: {'lr': 0.00021559614363940926, 'samples': 13966336, 'steps': 27277, 'batch_loss/train': 0.7821432148339227} 12/26/2021 13:27:55 - INFO - codeparrot_training - Step 27278: {'lr': 0.0002155805305481791, 'samples': 13966848, 'steps': 27278, 'batch_loss/train': 0.7806840687990189} 12/26/2021 13:28:07 - INFO - codeparrot_training - Step 27279: {'lr': 0.00021556491759378783, 'samples': 13967360, 'steps': 27279, 'batch_loss/train': 0.7307316893711686} 12/26/2021 13:28:18 - INFO - codeparrot_training - Step 27280: {'lr': 0.0002155493047762975, 'samples': 13967872, 'steps': 27280, 'batch_loss/train': 0.9692784659564495} 12/26/2021 13:28:29 - INFO - codeparrot_training - Step 27281: {'lr': 0.00021553369209577006, 'samples': 13968384, 'steps': 27281, 'batch_loss/train': 0.7906979527324438} 12/26/2021 13:28:41 - INFO - codeparrot_training - Step 27282: {'lr': 0.0002155180795522676, 'samples': 13968896, 'steps': 27282, 'batch_loss/train': 0.7414973890408874} 12/26/2021 13:28:52 - INFO - codeparrot_training - Step 27283: {'lr': 0.00021550246714585234, 'samples': 13969408, 'steps': 27283, 'batch_loss/train': 0.7714155362918973} 12/26/2021 13:29:03 - INFO - codeparrot_training - Step 27284: {'lr': 0.00021548685487658624, 'samples': 13969920, 'steps': 27284, 'batch_loss/train': 0.7386543401516974} 12/26/2021 13:29:13 - INFO - codeparrot_training - Step 27285: {'lr': 0.00021547124274453133, 'samples': 13970432, 'steps': 27285, 'batch_loss/train': 0.9896892933174968} 12/26/2021 13:29:25 - INFO - codeparrot_training - Step 27286: {'lr': 0.00021545563074974966, 'samples': 13970944, 'steps': 27286, 'batch_loss/train': 0.8365190960466862} 12/26/2021 13:29:36 - INFO - codeparrot_training - Step 27287: {'lr': 0.00021544001889230348, 'samples': 13971456, 'steps': 27287, 'batch_loss/train': 0.8277174970135093} 12/26/2021 13:29:46 - INFO - codeparrot_training - Step 27288: {'lr': 
0.00021542440717225466, 'samples': 13971968, 'steps': 27288, 'batch_loss/train': 0.721927460283041} 12/26/2021 13:29:59 - INFO - codeparrot_training - Step 27289: {'lr': 0.0002154087955896653, 'samples': 13972480, 'steps': 27289, 'batch_loss/train': 0.6463753036223352} 12/26/2021 13:30:09 - INFO - codeparrot_training - Step 27290: {'lr': 0.0002153931841445976, 'samples': 13972992, 'steps': 27290, 'batch_loss/train': 0.7168538812547922} 12/26/2021 13:30:20 - INFO - codeparrot_training - Step 27291: {'lr': 0.0002153775728371135, 'samples': 13973504, 'steps': 27291, 'batch_loss/train': 0.6692867192905396} 12/26/2021 13:30:32 - INFO - codeparrot_training - Step 27292: {'lr': 0.00021536196166727515, 'samples': 13974016, 'steps': 27292, 'batch_loss/train': 0.83882902469486} 12/26/2021 13:30:43 - INFO - codeparrot_training - Step 27293: {'lr': 0.00021534635063514442, 'samples': 13974528, 'steps': 27293, 'batch_loss/train': 0.7865318530239165} 12/26/2021 13:30:53 - INFO - codeparrot_training - Step 27294: {'lr': 0.00021533073974078367, 'samples': 13975040, 'steps': 27294, 'batch_loss/train': 0.7282458022236824} 12/26/2021 13:31:04 - INFO - codeparrot_training - Step 27295: {'lr': 0.00021531512898425472, 'samples': 13975552, 'steps': 27295, 'batch_loss/train': 0.9085077131167054} 12/26/2021 13:31:17 - INFO - codeparrot_training - Step 27296: {'lr': 0.0002152995183656198, 'samples': 13976064, 'steps': 27296, 'batch_loss/train': 0.7575462497770786} 12/26/2021 13:31:27 - INFO - codeparrot_training - Step 27297: {'lr': 0.0002152839078849408, 'samples': 13976576, 'steps': 27297, 'batch_loss/train': 0.780556540004909} 12/26/2021 13:31:38 - INFO - codeparrot_training - Step 27298: {'lr': 0.0002152682975422799, 'samples': 13977088, 'steps': 27298, 'batch_loss/train': 1.1239968948066235} 12/26/2021 13:31:50 - INFO - codeparrot_training - Step 27299: {'lr': 0.00021525268733769919, 'samples': 13977600, 'steps': 27299, 'batch_loss/train': 0.7615221943706274} 12/26/2021 13:32:00 - INFO 
- codeparrot_training - Step 27300: {'lr': 0.00021523707727126056, 'samples': 13978112, 'steps': 27300, 'batch_loss/train': 0.7327823960222304} 12/26/2021 13:32:11 - INFO - codeparrot_training - Step 27301: {'lr': 0.0002152214673430263, 'samples': 13978624, 'steps': 27301, 'batch_loss/train': 0.6219471669755876} 12/26/2021 13:32:24 - INFO - codeparrot_training - Step 27302: {'lr': 0.0002152058575530583, 'samples': 13979136, 'steps': 27302, 'batch_loss/train': 0.7311607603915036} 12/26/2021 13:32:35 - INFO - codeparrot_training - Step 27303: {'lr': 0.00021519024790141872, 'samples': 13979648, 'steps': 27303, 'batch_loss/train': 0.7338210600428283} 12/26/2021 13:32:45 - INFO - codeparrot_training - Step 27304: {'lr': 0.00021517463838816947, 'samples': 13980160, 'steps': 27304, 'batch_loss/train': 0.7143482840619981} 12/26/2021 13:32:57 - INFO - codeparrot_training - Step 27305: {'lr': 0.00021515902901337276, 'samples': 13980672, 'steps': 27305, 'batch_loss/train': 0.795198455452919} 12/26/2021 13:33:08 - INFO - codeparrot_training - Step 27306: {'lr': 0.00021514341977709066, 'samples': 13981184, 'steps': 27306, 'batch_loss/train': 0.7574572199955583} 12/26/2021 13:33:19 - INFO - codeparrot_training - Step 27307: {'lr': 0.00021512781067938507, 'samples': 13981696, 'steps': 27307, 'batch_loss/train': 0.7312184358015656} 12/26/2021 13:33:29 - INFO - codeparrot_training - Step 27308: {'lr': 0.00021511220172031815, 'samples': 13982208, 'steps': 27308, 'batch_loss/train': 0.6593869583448395} 12/26/2021 13:33:41 - INFO - codeparrot_training - Step 27309: {'lr': 0.000215096592899952, 'samples': 13982720, 'steps': 27309, 'batch_loss/train': 0.8406616430729628} 12/26/2021 13:33:52 - INFO - codeparrot_training - Step 27310: {'lr': 0.00021508098421834862, 'samples': 13983232, 'steps': 27310, 'batch_loss/train': 0.6879583867266774} 12/26/2021 13:34:02 - INFO - codeparrot_training - Step 27311: {'lr': 0.00021506537567557, 'samples': 13983744, 'steps': 27311, 'batch_loss/train': 
0.9272620435804129} 12/26/2021 13:34:15 - INFO - codeparrot_training - Step 27312: {'lr': 0.00021504976727167825, 'samples': 13984256, 'steps': 27312, 'batch_loss/train': 0.8133239904418588} 12/26/2021 13:34:26 - INFO - codeparrot_training - Step 27313: {'lr': 0.0002150341590067355, 'samples': 13984768, 'steps': 27313, 'batch_loss/train': 0.7298646392300725} 12/26/2021 13:34:37 - INFO - codeparrot_training - Step 27314: {'lr': 0.00021501855088080368, 'samples': 13985280, 'steps': 27314, 'batch_loss/train': 0.6734913196414709} 12/26/2021 13:34:49 - INFO - codeparrot_training - Step 27315: {'lr': 0.00021500294289394483, 'samples': 13985792, 'steps': 27315, 'batch_loss/train': 0.7497005555778742} 12/26/2021 13:34:59 - INFO - codeparrot_training - Step 27316: {'lr': 0.0002149873350462212, 'samples': 13986304, 'steps': 27316, 'batch_loss/train': 0.7670929473824799} 12/26/2021 13:35:10 - INFO - codeparrot_training - Step 27317: {'lr': 0.00021497172733769466, 'samples': 13986816, 'steps': 27317, 'batch_loss/train': 0.7528836000710726} 12/26/2021 13:35:20 - INFO - codeparrot_training - Step 27318: {'lr': 0.00021495611976842733, 'samples': 13987328, 'steps': 27318, 'batch_loss/train': 0.7487867968156934} 12/26/2021 13:35:33 - INFO - codeparrot_training - Step 27319: {'lr': 0.00021494051233848113, 'samples': 13987840, 'steps': 27319, 'batch_loss/train': 0.6081765084527433} 12/26/2021 13:35:44 - INFO - codeparrot_training - Step 27320: {'lr': 0.00021492490504791833, 'samples': 13988352, 'steps': 27320, 'batch_loss/train': 0.982107064453885} 12/26/2021 13:35:55 - INFO - codeparrot_training - Step 27321: {'lr': 0.0002149092978968008, 'samples': 13988864, 'steps': 27321, 'batch_loss/train': 0.7989742499776185} 12/26/2021 13:36:07 - INFO - codeparrot_training - Step 27322: {'lr': 0.00021489369088519074, 'samples': 13989376, 'steps': 27322, 'batch_loss/train': 0.7249123575165868} 12/26/2021 13:36:17 - INFO - codeparrot_training - Step 27323: {'lr': 0.00021487808401314995, 
'samples': 13989888, 'steps': 27323, 'batch_loss/train': 0.903818441554904} 12/26/2021 13:36:28 - INFO - codeparrot_training - Step 27324: {'lr': 0.00021486247728074072, 'samples': 13990400, 'steps': 27324, 'batch_loss/train': 0.7822946012020111} 12/26/2021 13:36:40 - INFO - codeparrot_training - Step 27325: {'lr': 0.00021484687068802505, 'samples': 13990912, 'steps': 27325, 'batch_loss/train': 0.7887480761855841} 12/26/2021 13:36:51 - INFO - codeparrot_training - Step 27326: {'lr': 0.0002148312642350649, 'samples': 13991424, 'steps': 27326, 'batch_loss/train': 0.7507677581161261} 12/26/2021 13:37:01 - INFO - codeparrot_training - Step 27327: {'lr': 0.00021481565792192236, 'samples': 13991936, 'steps': 27327, 'batch_loss/train': 0.7593656322569586} 12/26/2021 13:37:12 - INFO - codeparrot_training - Step 27328: {'lr': 0.00021480005174865952, 'samples': 13992448, 'steps': 27328, 'batch_loss/train': 0.7345631305361167} 12/26/2021 13:37:24 - INFO - codeparrot_training - Step 27329: {'lr': 0.00021478444571533843, 'samples': 13992960, 'steps': 27329, 'batch_loss/train': 0.7742540074978024} 12/26/2021 13:37:35 - INFO - codeparrot_training - Step 27330: {'lr': 0.00021476883982202095, 'samples': 13993472, 'steps': 27330, 'batch_loss/train': 0.7496516294777393} 12/26/2021 13:37:45 - INFO - codeparrot_training - Step 27331: {'lr': 0.0002147532340687693, 'samples': 13993984, 'steps': 27331, 'batch_loss/train': 0.7428221050649881} 12/26/2021 13:37:58 - INFO - codeparrot_training - Step 27332: {'lr': 0.00021473762845564557, 'samples': 13994496, 'steps': 27332, 'batch_loss/train': 0.7683177078142762} 12/26/2021 13:38:09 - INFO - codeparrot_training - Step 27333: {'lr': 0.00021472202298271166, 'samples': 13995008, 'steps': 27333, 'batch_loss/train': 0.6582209975458682} 12/26/2021 13:38:19 - INFO - codeparrot_training - Step 27334: {'lr': 0.00021470641765002958, 'samples': 13995520, 'steps': 27334, 'batch_loss/train': 0.7059986557578668} 12/26/2021 13:38:32 - INFO - 
codeparrot_training - Step 27335: {'lr': 0.00021469081245766158, 'samples': 13996032, 'steps': 27335, 'batch_loss/train': 0.6611622218042612} 12/26/2021 13:38:42 - INFO - codeparrot_training - Step 27336: {'lr': 0.00021467520740566954, 'samples': 13996544, 'steps': 27336, 'batch_loss/train': 0.6859114109538496} 12/26/2021 13:38:53 - INFO - codeparrot_training - Step 27337: {'lr': 0.0002146596024941155, 'samples': 13997056, 'steps': 27337, 'batch_loss/train': 0.837390573695302} 12/26/2021 13:39:05 - INFO - codeparrot_training - Step 27338: {'lr': 0.0002146439977230616, 'samples': 13997568, 'steps': 27338, 'batch_loss/train': 0.8234539162367582} 12/26/2021 13:39:15 - INFO - codeparrot_training - Step 27339: {'lr': 0.00021462839309256983, 'samples': 13998080, 'steps': 27339, 'batch_loss/train': 0.7875824719667435} 12/26/2021 13:39:26 - INFO - codeparrot_training - Step 27340: {'lr': 0.00021461278860270217, 'samples': 13998592, 'steps': 27340, 'batch_loss/train': 0.8308965414762497} 12/26/2021 13:39:37 - INFO - codeparrot_training - Step 27341: {'lr': 0.0002145971842535206, 'samples': 13999104, 'steps': 27341, 'batch_loss/train': 0.6995325153693557} 12/26/2021 13:39:49 - INFO - codeparrot_training - Step 27342: {'lr': 0.00021458158004508744, 'samples': 13999616, 'steps': 27342, 'batch_loss/train': 0.7458971869200468} 12/26/2021 13:40:00 - INFO - codeparrot_training - Step 27343: {'lr': 0.00021456597597746443, 'samples': 14000128, 'steps': 27343, 'batch_loss/train': 0.7875542612746358} 12/26/2021 13:40:10 - INFO - codeparrot_training - Step 27344: {'lr': 0.00021455037205071382, 'samples': 14000640, 'steps': 27344, 'batch_loss/train': 0.7613002881407738} 12/26/2021 13:40:23 - INFO - codeparrot_training - Step 27345: {'lr': 0.0002145347682648974, 'samples': 14001152, 'steps': 27345, 'batch_loss/train': 0.6832750528119504} 12/26/2021 13:40:33 - INFO - codeparrot_training - Step 27346: {'lr': 0.00021451916462007748, 'samples': 14001664, 'steps': 27346, 'batch_loss/train': 
0.7787856794893742} 12/26/2021 13:40:44 - INFO - codeparrot_training - Step 27347: {'lr': 0.0002145035611163159, 'samples': 14002176, 'steps': 27347, 'batch_loss/train': 0.6276735719293356} 12/26/2021 13:40:57 - INFO - codeparrot_training - Step 27348: {'lr': 0.00021448795775367484, 'samples': 14002688, 'steps': 27348, 'batch_loss/train': 0.7256981087848544} 12/26/2021 13:41:07 - INFO - codeparrot_training - Step 27349: {'lr': 0.00021447235453221611, 'samples': 14003200, 'steps': 27349, 'batch_loss/train': 0.7397038876079023} 12/26/2021 13:41:18 - INFO - codeparrot_training - Step 27350: {'lr': 0.00021445675145200194, 'samples': 14003712, 'steps': 27350, 'batch_loss/train': 0.834719086997211} 12/26/2021 13:41:29 - INFO - codeparrot_training - Step 27351: {'lr': 0.00021444114851309438, 'samples': 14004224, 'steps': 27351, 'batch_loss/train': 0.8950158753432333} 12/26/2021 13:41:41 - INFO - codeparrot_training - Step 27352: {'lr': 0.00021442554571555526, 'samples': 14004736, 'steps': 27352, 'batch_loss/train': 0.7443508412688971} 12/26/2021 13:41:51 - INFO - codeparrot_training - Step 27353: {'lr': 0.00021440994305944676, 'samples': 14005248, 'steps': 27353, 'batch_loss/train': 0.7121713790111244} 12/26/2021 13:42:02 - INFO - codeparrot_training - Step 27354: {'lr': 0.00021439434054483102, 'samples': 14005760, 'steps': 27354, 'batch_loss/train': 0.7105630626901984} 12/26/2021 13:42:14 - INFO - codeparrot_training - Step 27355: {'lr': 0.0002143787381717699, 'samples': 14006272, 'steps': 27355, 'batch_loss/train': 0.698831470683217} 12/26/2021 13:42:25 - INFO - codeparrot_training - Step 27356: {'lr': 0.00021436313594032537, 'samples': 14006784, 'steps': 27356, 'batch_loss/train': 0.7169869374483824} 12/26/2021 13:42:35 - INFO - codeparrot_training - Step 27357: {'lr': 0.00021434753385055958, 'samples': 14007296, 'steps': 27357, 'batch_loss/train': 0.7608541250228882} 12/26/2021 13:42:48 - INFO - codeparrot_training - Step 27358: {'lr': 0.00021433193190253458, 
'samples': 14007808, 'steps': 27358, 'batch_loss/train': 0.6376302824355662} 12/26/2021 13:42:59 - INFO - codeparrot_training - Step 27359: {'lr': 0.00021431633009631236, 'samples': 14008320, 'steps': 27359, 'batch_loss/train': 0.8062478937208652} 12/26/2021 13:43:09 - INFO - codeparrot_training - Step 27360: {'lr': 0.00021430072843195482, 'samples': 14008832, 'steps': 27360, 'batch_loss/train': 0.6433372311294079} 12/26/2021 13:43:20 - INFO - codeparrot_training - Step 27361: {'lr': 0.00021428512690952424, 'samples': 14009344, 'steps': 27361, 'batch_loss/train': 0.6805545720271766} 12/26/2021 13:43:32 - INFO - codeparrot_training - Step 27362: {'lr': 0.00021426952552908247, 'samples': 14009856, 'steps': 27362, 'batch_loss/train': 0.7056457819417119} 12/26/2021 13:43:43 - INFO - codeparrot_training - Step 27363: {'lr': 0.00021425392429069148, 'samples': 14010368, 'steps': 27363, 'batch_loss/train': 0.7411837382242084} 12/26/2021 13:43:53 - INFO - codeparrot_training - Step 27364: {'lr': 0.00021423832319441354, 'samples': 14010880, 'steps': 27364, 'batch_loss/train': 0.7282661478966475} 12/26/2021 13:44:05 - INFO - codeparrot_training - Step 27365: {'lr': 0.0002142227222403105, 'samples': 14011392, 'steps': 27365, 'batch_loss/train': 0.7000159891322255} 12/26/2021 13:44:16 - INFO - codeparrot_training - Step 27366: {'lr': 0.00021420712142844434, 'samples': 14011904, 'steps': 27366, 'batch_loss/train': 0.7185401599854231} 12/26/2021 13:44:27 - INFO - codeparrot_training - Step 27367: {'lr': 0.00021419152075887712, 'samples': 14012416, 'steps': 27367, 'batch_loss/train': 0.8292774418368936} 12/26/2021 13:44:39 - INFO - codeparrot_training - Step 27368: {'lr': 0.00021417592023167102, 'samples': 14012928, 'steps': 27368, 'batch_loss/train': 0.775236853864044} 12/26/2021 13:44:49 - INFO - codeparrot_training - Step 27369: {'lr': 0.00021416031984688782, 'samples': 14013440, 'steps': 27369, 'batch_loss/train': 0.7054244875907898} 12/26/2021 13:45:00 - INFO - 
codeparrot_training - Step 27370: {'lr': 0.00021414471960458974, 'samples': 14013952, 'steps': 27370, 'batch_loss/train': 0.793754521291703} 12/26/2021 13:45:13 - INFO - codeparrot_training - Step 27371: {'lr': 0.00021412911950483866, 'samples': 14014464, 'steps': 27371, 'batch_loss/train': 0.7846129550598562} 12/26/2021 13:45:23 - INFO - codeparrot_training - Step 27372: {'lr': 0.00021411351954769665, 'samples': 14014976, 'steps': 27372, 'batch_loss/train': 0.6844465694739483} 12/26/2021 13:45:34 - INFO - codeparrot_training - Step 27373: {'lr': 0.00021409791973322578, 'samples': 14015488, 'steps': 27373, 'batch_loss/train': 0.7256845971569419} 12/26/2021 13:45:44 - INFO - codeparrot_training - Step 27374: {'lr': 0.000214082320061488, 'samples': 14016000, 'steps': 27374, 'batch_loss/train': 0.757585521787405} 12/26/2021 13:45:57 - INFO - codeparrot_training - Step 27375: {'lr': 0.00021406672053254536, 'samples': 14016512, 'steps': 27375, 'batch_loss/train': 0.7790449121966958} 12/26/2021 13:46:07 - INFO - codeparrot_training - Step 27376: {'lr': 0.00021405112114645985, 'samples': 14017024, 'steps': 27376, 'batch_loss/train': 0.6959700677543879} 12/26/2021 13:46:18 - INFO - codeparrot_training - Step 27377: {'lr': 0.0002140355219032936, 'samples': 14017536, 'steps': 27377, 'batch_loss/train': 0.4756632625358179} 12/26/2021 13:46:31 - INFO - codeparrot_training - Step 27378: {'lr': 0.0002140199228031084, 'samples': 14018048, 'steps': 27378, 'batch_loss/train': 0.7108594734454527} 12/26/2021 13:46:42 - INFO - codeparrot_training - Step 27379: {'lr': 0.00021400432384596646, 'samples': 14018560, 'steps': 27379, 'batch_loss/train': 0.854108223807998} 12/26/2021 13:46:52 - INFO - codeparrot_training - Step 27380: {'lr': 0.00021398872503192978, 'samples': 14019072, 'steps': 27380, 'batch_loss/train': 0.7838931204751134} 12/26/2021 13:47:04 - INFO - codeparrot_training - Step 27381: {'lr': 0.0002139731263610603, 'samples': 14019584, 'steps': 27381, 'batch_loss/train': 
0.8206744994968176} 12/26/2021 13:47:15 - INFO - codeparrot_training - Step 27382: {'lr': 0.00021395752783341998, 'samples': 14020096, 'steps': 27382, 'batch_loss/train': 0.8776143128052354} 12/26/2021 13:47:25 - INFO - codeparrot_training - Step 27383: {'lr': 0.00021394192944907103, 'samples': 14020608, 'steps': 27383, 'batch_loss/train': 0.7816627025604248} 12/26/2021 13:47:36 - INFO - codeparrot_training - Step 27384: {'lr': 0.0002139263312080754, 'samples': 14021120, 'steps': 27384, 'batch_loss/train': 0.7261860747821629} 12/26/2021 13:47:48 - INFO - codeparrot_training - Step 27385: {'lr': 0.00021391073311049496, 'samples': 14021632, 'steps': 27385, 'batch_loss/train': 0.7626169072464108} 12/26/2021 13:47:59 - INFO - codeparrot_training - Step 27386: {'lr': 0.00021389513515639177, 'samples': 14022144, 'steps': 27386, 'batch_loss/train': 0.6116462723584846} 12/26/2021 13:48:10 - INFO - codeparrot_training - Step 27387: {'lr': 0.000213879537345828, 'samples': 14022656, 'steps': 27387, 'batch_loss/train': 0.8628039751201868} 12/26/2021 13:48:22 - INFO - codeparrot_training - Step 27388: {'lr': 0.00021386393967886552, 'samples': 14023168, 'steps': 27388, 'batch_loss/train': 0.7532918192446232} 12/26/2021 13:48:33 - INFO - codeparrot_training - Step 27389: {'lr': 0.0002138483421555663, 'samples': 14023680, 'steps': 27389, 'batch_loss/train': 0.7642232980579138} 12/26/2021 13:48:44 - INFO - codeparrot_training - Step 27390: {'lr': 0.00021383274477599251, 'samples': 14024192, 'steps': 27390, 'batch_loss/train': 0.7356998829636723} 12/26/2021 13:48:56 - INFO - codeparrot_training - Step 27391: {'lr': 0.0002138171475402061, 'samples': 14024704, 'steps': 27391, 'batch_loss/train': 0.7197974873706698} 12/26/2021 13:49:06 - INFO - codeparrot_training - Step 27392: {'lr': 0.00021380155044826898, 'samples': 14025216, 'steps': 27392, 'batch_loss/train': 0.7219215063378215} 12/26/2021 13:49:17 - INFO - codeparrot_training - Step 27393: {'lr': 0.0002137859535002432, 'samples': 
14025728, 'steps': 27393, 'batch_loss/train': 0.715513383038342} 12/26/2021 13:49:28 - INFO - codeparrot_training - Step 27394: {'lr': 0.00021377035669619085, 'samples': 14026240, 'steps': 27394, 'batch_loss/train': 0.6809238884598017} 12/26/2021 13:49:40 - INFO - codeparrot_training - Step 27395: {'lr': 0.00021375476003617386, 'samples': 14026752, 'steps': 27395, 'batch_loss/train': 0.725053469534032} 12/26/2021 13:49:50 - INFO - codeparrot_training - Step 27396: {'lr': 0.00021373916352025434, 'samples': 14027264, 'steps': 27396, 'batch_loss/train': 0.8133892444893718} 12/26/2021 13:50:01 - INFO - codeparrot_training - Step 27397: {'lr': 0.00021372356714849405, 'samples': 14027776, 'steps': 27397, 'batch_loss/train': 0.767081051133573} 12/26/2021 13:50:14 - INFO - codeparrot_training - Step 27398: {'lr': 0.00021370797092095523, 'samples': 14028288, 'steps': 27398, 'batch_loss/train': 0.815512964501977} 12/26/2021 13:50:24 - INFO - codeparrot_training - Step 27399: {'lr': 0.0002136923748376999, 'samples': 14028800, 'steps': 27399, 'batch_loss/train': 0.7347484575584531} 12/26/2021 13:50:35 - INFO - codeparrot_training - Step 27400: {'lr': 0.00021367677889878983, 'samples': 14029312, 'steps': 27400, 'batch_loss/train': 0.7425196873955429} 12/26/2021 13:50:47 - INFO - codeparrot_training - Step 27401: {'lr': 0.00021366118310428732, 'samples': 14029824, 'steps': 27401, 'batch_loss/train': 0.6715479507111013} 12/26/2021 13:50:58 - INFO - codeparrot_training - Step 27402: {'lr': 0.00021364558745425412, 'samples': 14030336, 'steps': 27402, 'batch_loss/train': 0.7701253080740571} 12/26/2021 13:51:08 - INFO - codeparrot_training - Step 27403: {'lr': 0.00021362999194875242, 'samples': 14030848, 'steps': 27403, 'batch_loss/train': 0.698065287899226} 12/26/2021 13:51:21 - INFO - codeparrot_training - Step 27404: {'lr': 0.00021361439658784404, 'samples': 14031360, 'steps': 27404, 'batch_loss/train': 0.7713523618876934} 12/26/2021 13:51:31 - INFO - codeparrot_training - Step 
27405: {'lr': 0.00021359880137159117, 'samples': 14031872, 'steps': 27405, 'batch_loss/train': 0.7367908740416169} 12/26/2021 13:51:42 - INFO - codeparrot_training - Step 27406: {'lr': 0.0002135832063000557, 'samples': 14032384, 'steps': 27406, 'batch_loss/train': 0.723023840226233} 12/26/2021 13:51:53 - INFO - codeparrot_training - Step 27407: {'lr': 0.00021356761137329963, 'samples': 14032896, 'steps': 27407, 'batch_loss/train': 0.5365917473100126} 12/26/2021 13:52:05 - INFO - codeparrot_training - Step 27408: {'lr': 0.00021355201659138494, 'samples': 14033408, 'steps': 27408, 'batch_loss/train': 0.7949535623774864} 12/26/2021 13:52:16 - INFO - codeparrot_training - Step 27409: {'lr': 0.00021353642195437377, 'samples': 14033920, 'steps': 27409, 'batch_loss/train': 0.7375349141657352} 12/26/2021 13:52:27 - INFO - codeparrot_training - Step 27410: {'lr': 0.000213520827462328, 'samples': 14034432, 'steps': 27410, 'batch_loss/train': 0.7822269746102393} 12/26/2021 13:52:39 - INFO - codeparrot_training - Step 27411: {'lr': 0.00021350523311530956, 'samples': 14034944, 'steps': 27411, 'batch_loss/train': 0.8198210634291172} 12/26/2021 13:52:49 - INFO - codeparrot_training - Step 27412: {'lr': 0.00021348963891338056, 'samples': 14035456, 'steps': 27412, 'batch_loss/train': 0.6518556969240308} 12/26/2021 13:53:00 - INFO - codeparrot_training - Step 27413: {'lr': 0.00021347404485660303, 'samples': 14035968, 'steps': 27413, 'batch_loss/train': 0.625104307604488} 12/26/2021 13:53:12 - INFO - codeparrot_training - Step 27414: {'lr': 0.00021345845094503888, 'samples': 14036480, 'steps': 27414, 'batch_loss/train': 0.6648952420218848} 12/26/2021 13:53:23 - INFO - codeparrot_training - Step 27415: {'lr': 0.00021344285717875007, 'samples': 14036992, 'steps': 27415, 'batch_loss/train': 0.5767999533563852} 12/26/2021 13:53:33 - INFO - codeparrot_training - Step 27416: {'lr': 0.00021342726355779874, 'samples': 14037504, 'steps': 27416, 'batch_loss/train': 0.8002425227314234} 
12/26/2021 13:53:44 - INFO - codeparrot_training - Step 27417: {'lr': 0.00021341167008224676, 'samples': 14038016, 'steps': 27417, 'batch_loss/train': 0.7636952525936067} 12/26/2021 13:53:57 - INFO - codeparrot_training - Step 27418: {'lr': 0.00021339607675215623, 'samples': 14038528, 'steps': 27418, 'batch_loss/train': 0.7894334369339049} 12/26/2021 13:54:07 - INFO - codeparrot_training - Step 27419: {'lr': 0.00021338048356758896, 'samples': 14039040, 'steps': 27419, 'batch_loss/train': 0.751232647919096} 12/26/2021 13:54:18 - INFO - codeparrot_training - Step 27420: {'lr': 0.00021336489052860717, 'samples': 14039552, 'steps': 27420, 'batch_loss/train': 0.6808071164414287} 12/26/2021 13:54:30 - INFO - codeparrot_training - Step 27421: {'lr': 0.00021334929763527268, 'samples': 14040064, 'steps': 27421, 'batch_loss/train': 0.8142749294638634} 12/26/2021 13:54:41 - INFO - codeparrot_training - Step 27422: {'lr': 0.0002133337048876476, 'samples': 14040576, 'steps': 27422, 'batch_loss/train': 0.8602611385285854} 12/26/2021 13:54:52 - INFO - codeparrot_training - Step 27423: {'lr': 0.00021331811228579375, 'samples': 14041088, 'steps': 27423, 'batch_loss/train': 0.7152037741616368} 12/26/2021 13:55:04 - INFO - codeparrot_training - Step 27424: {'lr': 0.00021330251982977334, 'samples': 14041600, 'steps': 27424, 'batch_loss/train': 0.6925508691929281} 12/26/2021 13:55:14 - INFO - codeparrot_training - Step 27425: {'lr': 0.00021328692751964822, 'samples': 14042112, 'steps': 27425, 'batch_loss/train': 0.7436532387509942} 12/26/2021 13:55:25 - INFO - codeparrot_training - Step 27426: {'lr': 0.00021327133535548038, 'samples': 14042624, 'steps': 27426, 'batch_loss/train': 0.46217688638716936} 12/26/2021 13:55:38 - INFO - codeparrot_training - Step 27427: {'lr': 0.00021325574333733188, 'samples': 14043136, 'steps': 27427, 'batch_loss/train': 0.6827804450877011} 12/26/2021 13:55:48 - INFO - codeparrot_training - Step 27428: {'lr': 0.0002132401514652647, 'samples': 14043648, 
'steps': 27428, 'batch_loss/train': 0.8060486786998808} 12/26/2021 13:55:59 - INFO - codeparrot_training - Step 27429: {'lr': 0.00021322455973934082, 'samples': 14044160, 'steps': 27429, 'batch_loss/train': 0.7341833962127566} 12/26/2021 13:56:10 - INFO - codeparrot_training - Step 27430: {'lr': 0.0002132089681596221, 'samples': 14044672, 'steps': 27430, 'batch_loss/train': 0.7362618185579777} 12/26/2021 13:56:22 - INFO - codeparrot_training - Step 27431: {'lr': 0.00021319337672617066, 'samples': 14045184, 'steps': 27431, 'batch_loss/train': 0.758668229333125} 12/26/2021 13:56:33 - INFO - codeparrot_training - Step 27432: {'lr': 0.00021317778543904854, 'samples': 14045696, 'steps': 27432, 'batch_loss/train': 0.6630243993131444} 12/26/2021 13:56:43 - INFO - codeparrot_training - Step 27433: {'lr': 0.00021316219429831756, 'samples': 14046208, 'steps': 27433, 'batch_loss/train': 0.7078460613265634} 12/26/2021 13:56:56 - INFO - codeparrot_training - Step 27434: {'lr': 0.00021314660330403975, 'samples': 14046720, 'steps': 27434, 'batch_loss/train': 0.7466688342392445} 12/26/2021 13:57:07 - INFO - codeparrot_training - Step 27435: {'lr': 0.00021313101245627725, 'samples': 14047232, 'steps': 27435, 'batch_loss/train': 0.7231939774937928} 12/26/2021 13:57:17 - INFO - codeparrot_training - Step 27436: {'lr': 0.00021311542175509192, 'samples': 14047744, 'steps': 27436, 'batch_loss/train': 0.8363915020599961} 12/26/2021 13:57:28 - INFO - codeparrot_training - Step 27437: {'lr': 0.0002130998312005456, 'samples': 14048256, 'steps': 27437, 'batch_loss/train': 0.7254895297810435} 12/26/2021 13:57:40 - INFO - codeparrot_training - Step 27438: {'lr': 0.0002130842407927005, 'samples': 14048768, 'steps': 27438, 'batch_loss/train': 0.6996960295364261} 12/26/2021 13:57:51 - INFO - codeparrot_training - Step 27439: {'lr': 0.00021306865053161855, 'samples': 14049280, 'steps': 27439, 'batch_loss/train': 0.8057536543346941} 12/26/2021 13:58:01 - INFO - codeparrot_training - Step 27440: 
{'lr': 0.0002130530604173617, 'samples': 14049792, 'steps': 27440, 'batch_loss/train': 0.6870970940217376} 12/26/2021 13:58:14 - INFO - codeparrot_training - Step 27441: {'lr': 0.00021303747044999183, 'samples': 14050304, 'steps': 27441, 'batch_loss/train': 0.7393595790490508} 12/26/2021 13:58:24 - INFO - codeparrot_training - Step 27442: {'lr': 0.00021302188062957114, 'samples': 14050816, 'steps': 27442, 'batch_loss/train': 0.7588392328470945} 12/26/2021 13:58:35 - INFO - codeparrot_training - Step 27443: {'lr': 0.00021300629095616144, 'samples': 14051328, 'steps': 27443, 'batch_loss/train': 0.7193299857899547} 12/26/2021 13:58:47 - INFO - codeparrot_training - Step 27444: {'lr': 0.00021299070142982482, 'samples': 14051840, 'steps': 27444, 'batch_loss/train': 0.8250915680546314} 12/26/2021 13:58:58 - INFO - codeparrot_training - Step 27445: {'lr': 0.00021297511205062308, 'samples': 14052352, 'steps': 27445, 'batch_loss/train': 1.2071020039729774} 12/26/2021 13:59:09 - INFO - codeparrot_training - Step 27446: {'lr': 0.0002129595228186184, 'samples': 14052864, 'steps': 27446, 'batch_loss/train': 0.6906975638121367} 12/26/2021 13:59:19 - INFO - codeparrot_training - Step 27447: {'lr': 0.00021294393373387266, 'samples': 14053376, 'steps': 27447, 'batch_loss/train': 0.8235173812136054} 12/26/2021 13:59:32 - INFO - codeparrot_training - Step 27448: {'lr': 0.0002129283447964478, 'samples': 14053888, 'steps': 27448, 'batch_loss/train': 0.759198906365782} 12/26/2021 13:59:43 - INFO - codeparrot_training - Step 27449: {'lr': 0.0002129127560064059, 'samples': 14054400, 'steps': 27449, 'batch_loss/train': 0.7453299295157194} 12/26/2021 13:59:53 - INFO - codeparrot_training - Step 27450: {'lr': 0.00021289716736380885, 'samples': 14054912, 'steps': 27450, 'batch_loss/train': 0.803541692905128} 12/26/2021 14:00:05 - INFO - codeparrot_training - Step 27451: {'lr': 0.00021288157886871872, 'samples': 14055424, 'steps': 27451, 'batch_loss/train': 0.7352191489189863} 12/26/2021 
14:00:16 - INFO - codeparrot_training - Step 27452: {'lr': 0.00021286599052119728, 'samples': 14055936, 'steps': 27452, 'batch_loss/train': 0.6392095913179219} 12/26/2021 14:00:26 - INFO - codeparrot_training - Step 27453: {'lr': 0.00021285040232130672, 'samples': 14056448, 'steps': 27453, 'batch_loss/train': 0.7259219083935022} 12/26/2021 14:00:39 - INFO - codeparrot_training - Step 27454: {'lr': 0.00021283481426910897, 'samples': 14056960, 'steps': 27454, 'batch_loss/train': 0.7768657645210624} 12/26/2021 14:00:49 - INFO - codeparrot_training - Step 27455: {'lr': 0.000212819226364666, 'samples': 14057472, 'steps': 27455, 'batch_loss/train': 0.7818382414989173} 12/26/2021 14:01:00 - INFO - codeparrot_training - Step 27456: {'lr': 0.0002128036386080396, 'samples': 14057984, 'steps': 27456, 'batch_loss/train': 0.7406553188338876} 12/26/2021 14:01:13 - INFO - codeparrot_training - Step 27457: {'lr': 0.00021278805099929194, 'samples': 14058496, 'steps': 27457, 'batch_loss/train': 0.8145861243829131} 12/26/2021 14:01:23 - INFO - codeparrot_training - Step 27458: {'lr': 0.000212772463538485, 'samples': 14059008, 'steps': 27458, 'batch_loss/train': 0.7333097690716386} 12/26/2021 14:01:34 - INFO - codeparrot_training - Step 27459: {'lr': 0.00021275687622568064, 'samples': 14059520, 'steps': 27459, 'batch_loss/train': 0.7677144100889564} 12/26/2021 14:01:45 - INFO - codeparrot_training - Step 27460: {'lr': 0.00021274128906094083, 'samples': 14060032, 'steps': 27460, 'batch_loss/train': 0.7111782357096672} 12/26/2021 14:01:57 - INFO - codeparrot_training - Step 27461: {'lr': 0.0002127257020443277, 'samples': 14060544, 'steps': 27461, 'batch_loss/train': 0.7449412643909454} 12/26/2021 14:02:07 - INFO - codeparrot_training - Step 27462: {'lr': 0.00021271011517590306, 'samples': 14061056, 'steps': 27462, 'batch_loss/train': 0.7833515191450715} 12/26/2021 14:02:18 - INFO - codeparrot_training - Step 27463: {'lr': 0.00021269452845572884, 'samples': 14061568, 'steps': 27463, 
'batch_loss/train': 0.8655624212697148} 12/26/2021 14:02:31 - INFO - codeparrot_training - Step 27464: {'lr': 0.0002126789418838672, 'samples': 14062080, 'steps': 27464, 'batch_loss/train': 0.7563475044444203} 12/26/2021 14:02:42 - INFO - codeparrot_training - Step 27465: {'lr': 0.00021266335546038002, 'samples': 14062592, 'steps': 27465, 'batch_loss/train': 0.7239959571743384} 12/26/2021 14:02:52 - INFO - codeparrot_training - Step 27466: {'lr': 0.00021264776918532917, 'samples': 14063104, 'steps': 27466, 'batch_loss/train': 0.7107904488220811} 12/26/2021 14:03:05 - INFO - codeparrot_training - Step 27467: {'lr': 0.0002126321830587766, 'samples': 14063616, 'steps': 27467, 'batch_loss/train': 0.8896985407918692} 12/26/2021 14:03:15 - INFO - codeparrot_training - Step 27468: {'lr': 0.00021261659708078453, 'samples': 14064128, 'steps': 27468, 'batch_loss/train': 1.0209396263526287} 12/26/2021 14:03:26 - INFO - codeparrot_training - Step 27469: {'lr': 0.00021260101125141466, 'samples': 14064640, 'steps': 27469, 'batch_loss/train': 0.7232308965176344} 12/26/2021 14:03:37 - INFO - codeparrot_training - Step 27470: {'lr': 0.0002125854255707291, 'samples': 14065152, 'steps': 27470, 'batch_loss/train': 0.7214729031547904} 12/26/2021 14:03:49 - INFO - codeparrot_training - Step 27471: {'lr': 0.00021256984003878966, 'samples': 14065664, 'steps': 27471, 'batch_loss/train': 0.7331533394753933} 12/26/2021 14:04:00 - INFO - codeparrot_training - Step 27472: {'lr': 0.00021255425465565848, 'samples': 14066176, 'steps': 27472, 'batch_loss/train': 0.7935335780493915} 12/26/2021 14:04:10 - INFO - codeparrot_training - Step 27473: {'lr': 0.00021253866942139746, 'samples': 14066688, 'steps': 27473, 'batch_loss/train': 0.7163341604173183} 12/26/2021 14:04:22 - INFO - codeparrot_training - Step 27474: {'lr': 0.00021252308433606845, 'samples': 14067200, 'steps': 27474, 'batch_loss/train': 0.8058529701083899} 12/26/2021 14:04:33 - INFO - codeparrot_training - Step 27475: {'lr': 
0.00021250749939973363, 'samples': 14067712, 'steps': 27475, 'batch_loss/train': 0.8155926428735256} 12/26/2021 14:04:44 - INFO - codeparrot_training - Step 27476: {'lr': 0.00021249191461245476, 'samples': 14068224, 'steps': 27476, 'batch_loss/train': 0.7240930711850524} 12/26/2021 14:04:57 - INFO - codeparrot_training - Step 27477: {'lr': 0.00021247632997429395, 'samples': 14068736, 'steps': 27477, 'batch_loss/train': 0.7854027720168233} 12/26/2021 14:05:07 - INFO - codeparrot_training - Step 27478: {'lr': 0.00021246074548531298, 'samples': 14069248, 'steps': 27478, 'batch_loss/train': 0.6895593856461346} 12/26/2021 14:05:18 - INFO - codeparrot_training - Step 27479: {'lr': 0.00021244516114557395, 'samples': 14069760, 'steps': 27479, 'batch_loss/train': 0.7933257790282369} 12/26/2021 14:05:29 - INFO - codeparrot_training - Step 27480: {'lr': 0.0002124295769551388, 'samples': 14070272, 'steps': 27480, 'batch_loss/train': 0.7846604458754882} 12/26/2021 14:05:41 - INFO - codeparrot_training - Step 27481: {'lr': 0.00021241399291406954, 'samples': 14070784, 'steps': 27481, 'batch_loss/train': 0.6970454277470708} 12/26/2021 14:05:51 - INFO - codeparrot_training - Step 27482: {'lr': 0.00021239840902242792, 'samples': 14071296, 'steps': 27482, 'batch_loss/train': 0.7796653024852276} 12/26/2021 14:06:02 - INFO - codeparrot_training - Step 27483: {'lr': 0.00021238282528027603, 'samples': 14071808, 'steps': 27483, 'batch_loss/train': 0.764684995636344} 12/26/2021 14:06:14 - INFO - codeparrot_training - Step 27484: {'lr': 0.00021236724168767593, 'samples': 14072320, 'steps': 27484, 'batch_loss/train': 0.7123764264397323} 12/26/2021 14:06:25 - INFO - codeparrot_training - Step 27485: {'lr': 0.00021235165824468933, 'samples': 14072832, 'steps': 27485, 'batch_loss/train': 0.6907447972334921} 12/26/2021 14:06:35 - INFO - codeparrot_training - Step 27486: {'lr': 0.00021233607495137842, 'samples': 14073344, 'steps': 27486, 'batch_loss/train': 0.527449149871245} 12/26/2021 14:06:48 
- INFO - codeparrot_training - Step 27487: {'lr': 0.00021232049180780504, 'samples': 14073856, 'steps': 27487, 'batch_loss/train': 0.7105193589814007} 12/26/2021 14:06:59 - INFO - codeparrot_training - Step 27488: {'lr': 0.00021230490881403116, 'samples': 14074368, 'steps': 27488, 'batch_loss/train': 0.7007274520583451} 12/26/2021 14:07:09 - INFO - codeparrot_training - Step 27489: {'lr': 0.00021228932597011865, 'samples': 14074880, 'steps': 27489, 'batch_loss/train': 0.7285040142014623} 12/26/2021 14:07:21 - INFO - codeparrot_training - Step 27490: {'lr': 0.00021227374327612966, 'samples': 14075392, 'steps': 27490, 'batch_loss/train': 0.7185186264105141} 12/26/2021 14:07:32 - INFO - codeparrot_training - Step 27491: {'lr': 0.000212258160732126, 'samples': 14075904, 'steps': 27491, 'batch_loss/train': 0.8200795892626047} 12/26/2021 14:07:43 - INFO - codeparrot_training - Step 27492: {'lr': 0.0002122425783381696, 'samples': 14076416, 'steps': 27492, 'batch_loss/train': 0.7352728857658803} 12/26/2021 14:07:53 - INFO - codeparrot_training - Step 27493: {'lr': 0.00021222699609432243, 'samples': 14076928, 'steps': 27493, 'batch_loss/train': 0.7604974801652133} 12/26/2021 14:08:06 - INFO - codeparrot_training - Step 27494: {'lr': 0.00021221141400064653, 'samples': 14077440, 'steps': 27494, 'batch_loss/train': 0.7580869700759649} 12/26/2021 14:08:17 - INFO - codeparrot_training - Step 27495: {'lr': 0.00021219583205720375, 'samples': 14077952, 'steps': 27495, 'batch_loss/train': 0.7945199115201831} 12/26/2021 14:08:27 - INFO - codeparrot_training - Step 27496: {'lr': 0.0002121802502640561, 'samples': 14078464, 'steps': 27496, 'batch_loss/train': 0.5193465107586235} 12/26/2021 14:08:40 - INFO - codeparrot_training - Step 27497: {'lr': 0.0002121646686212654, 'samples': 14078976, 'steps': 27497, 'batch_loss/train': 0.7732472894713283} 12/26/2021 14:08:50 - INFO - codeparrot_training - Step 27498: {'lr': 0.00021214908712889374, 'samples': 14079488, 'steps': 27498, 
'batch_loss/train': 0.5588060114532709} 12/26/2021 14:09:01 - INFO - codeparrot_training - Step 27499: {'lr': 0.00021213350578700308, 'samples': 14080000, 'steps': 27499, 'batch_loss/train': 0.7931059999391437} 12/26/2021 14:09:01 - INFO - codeparrot_training - Evaluating and saving model checkpoint 12/26/2021 14:12:22 - INFO - codeparrot_training - Step 27500: {'loss/eval': 0.7743321061134338, 'perplexity': 2.169142961502075} 12/26/2021 14:12:40 - WARNING - huggingface_hub.repository - Several commits (9) will be pushed upstream. 12/26/2021 14:12:55 - INFO - codeparrot_training - Step 27500: {'lr': 0.0002121179245956552, 'samples': 14080512, 'steps': 27500, 'batch_loss/train': 0.7501500835642219} 12/26/2021 14:13:06 - INFO - codeparrot_training - Step 27501: {'lr': 0.00021210234355491223, 'samples': 14081024, 'steps': 27501, 'batch_loss/train': 0.6287045693024993} 12/26/2021 14:13:16 - INFO - codeparrot_training - Step 27502: {'lr': 0.00021208676266483602, 'samples': 14081536, 'steps': 27502, 'batch_loss/train': 0.6537073054350913} 12/26/2021 14:13:27 - INFO - codeparrot_training - Step 27503: {'lr': 0.00021207118192548852, 'samples': 14082048, 'steps': 27503, 'batch_loss/train': 0.7052658870816231} 12/26/2021 14:13:40 - INFO - codeparrot_training - Step 27504: {'lr': 0.0002120556013369316, 'samples': 14082560, 'steps': 27504, 'batch_loss/train': 0.7651253077201545} 12/26/2021 14:13:50 - INFO - codeparrot_training - Step 27505: {'lr': 0.00021204002089922737, 'samples': 14083072, 'steps': 27505, 'batch_loss/train': 0.6481801920454018} 12/26/2021 14:14:01 - INFO - codeparrot_training - Step 27506: {'lr': 0.0002120244406124377, 'samples': 14083584, 'steps': 27506, 'batch_loss/train': 0.7515624025836587} 12/26/2021 14:14:13 - INFO - codeparrot_training - Step 27507: {'lr': 0.00021200886047662446, 'samples': 14084096, 'steps': 27507, 'batch_loss/train': 0.785441261716187} 12/26/2021 14:14:24 - INFO - codeparrot_training - Step 27508: {'lr': 0.0002119932804918496, 
'samples': 14084608, 'steps': 27508, 'batch_loss/train': 0.7582257804460824} 12/26/2021 14:14:34 - INFO - codeparrot_training - Step 27509: {'lr': 0.00021197770065817524, 'samples': 14085120, 'steps': 27509, 'batch_loss/train': 0.7562439958564937} 12/26/2021 14:14:46 - INFO - codeparrot_training - Step 27510: {'lr': 0.00021196212097566314, 'samples': 14085632, 'steps': 27510, 'batch_loss/train': 0.7310663629323244} 12/26/2021 14:14:57 - INFO - codeparrot_training - Step 27511: {'lr': 0.0002119465414443752, 'samples': 14086144, 'steps': 27511, 'batch_loss/train': 0.6543893008492887} 12/26/2021 14:15:08 - INFO - codeparrot_training - Step 27512: {'lr': 0.00021193096206437352, 'samples': 14086656, 'steps': 27512, 'batch_loss/train': 0.699976917821914} 12/26/2021 14:15:18 - INFO - codeparrot_training - Step 27513: {'lr': 0.00021191538283572, 'samples': 14087168, 'steps': 27513, 'batch_loss/train': 0.7801557397469878} 12/26/2021 14:15:31 - INFO - codeparrot_training - Step 27514: {'lr': 0.00021189980375847645, 'samples': 14087680, 'steps': 27514, 'batch_loss/train': 0.7867098273709416} 12/26/2021 14:15:41 - INFO - codeparrot_training - Step 27515: {'lr': 0.00021188422483270486, 'samples': 14088192, 'steps': 27515, 'batch_loss/train': 0.7358463294804096} 12/26/2021 14:15:52 - INFO - codeparrot_training - Step 27516: {'lr': 0.0002118686460584673, 'samples': 14088704, 'steps': 27516, 'batch_loss/train': 0.6726901499787346} 12/26/2021 14:16:05 - INFO - codeparrot_training - Step 27517: {'lr': 0.00021185306743582555, 'samples': 14089216, 'steps': 27517, 'batch_loss/train': 0.817061965353787} 12/26/2021 14:16:15 - INFO - codeparrot_training - Step 27518: {'lr': 0.00021183748896484166, 'samples': 14089728, 'steps': 27518, 'batch_loss/train': 0.7141010884661227} 12/26/2021 14:16:26 - INFO - codeparrot_training - Step 27519: {'lr': 0.00021182191064557742, 'samples': 14090240, 'steps': 27519, 'batch_loss/train': 0.7157449368387461} 12/26/2021 14:16:38 - INFO - codeparrot_training 
- Step 27520: {'lr': 0.00021180633247809495, 'samples': 14090752, 'steps': 27520, 'batch_loss/train': 0.6706008006585762} 12/26/2021 14:16:49 - INFO - codeparrot_training - Step 27521: {'lr': 0.000211790754462456, 'samples': 14091264, 'steps': 27521, 'batch_loss/train': 0.7615755551960319} 12/26/2021 14:16:59 - INFO - codeparrot_training - Step 27522: {'lr': 0.00021177517659872255, 'samples': 14091776, 'steps': 27522, 'batch_loss/train': 0.7805274743586779} 12/26/2021 14:17:11 - INFO - codeparrot_training - Step 27523: {'lr': 0.00021175959888695669, 'samples': 14092288, 'steps': 27523, 'batch_loss/train': 0.7641057977452874} 12/26/2021 14:17:22 - INFO - codeparrot_training - Step 27524: {'lr': 0.0002117440213272201, 'samples': 14092800, 'steps': 27524, 'batch_loss/train': 0.710121869109571} 12/26/2021 14:17:33 - INFO - codeparrot_training - Step 27525: {'lr': 0.00021172844391957496, 'samples': 14093312, 'steps': 27525, 'batch_loss/train': 0.6181605016463436} 12/26/2021 14:17:43 - INFO - codeparrot_training - Step 27526: {'lr': 0.00021171286666408297, 'samples': 14093824, 'steps': 27526, 'batch_loss/train': 1.0406199810095131} 12/26/2021 14:17:56 - INFO - codeparrot_training - Step 27527: {'lr': 0.00021169728956080622, 'samples': 14094336, 'steps': 27527, 'batch_loss/train': 0.721488778013736} 12/26/2021 14:18:07 - INFO - codeparrot_training - Step 27528: {'lr': 0.00021168171260980655, 'samples': 14094848, 'steps': 27528, 'batch_loss/train': 0.7203105529770255} 12/26/2021 14:18:17 - INFO - codeparrot_training - Step 27529: {'lr': 0.00021166613581114602, 'samples': 14095360, 'steps': 27529, 'batch_loss/train': 0.7227115486748517} 12/26/2021 14:18:30 - INFO - codeparrot_training - Step 27530: {'lr': 0.0002116505591648863, 'samples': 14095872, 'steps': 27530, 'batch_loss/train': 0.7627002252265811} 12/26/2021 14:18:40 - INFO - codeparrot_training - Step 27531: {'lr': 0.00021163498267108956, 'samples': 14096384, 'steps': 27531, 'batch_loss/train': 0.6590668156277388} 
12/26/2021 14:18:51 - INFO - codeparrot_training - Step 27532: {'lr': 0.0002116194063298177, 'samples': 14096896, 'steps': 27532, 'batch_loss/train': 0.6932403198443353} 12/26/2021 14:19:02 - INFO - codeparrot_training - Step 27533: {'lr': 0.0002116038301411325, 'samples': 14097408, 'steps': 27533, 'batch_loss/train': 0.7263728873804212} 12/26/2021 14:19:14 - INFO - codeparrot_training - Step 27534: {'lr': 0.00021158825410509593, 'samples': 14097920, 'steps': 27534, 'batch_loss/train': 0.7154150407295674} 12/26/2021 14:19:25 - INFO - codeparrot_training - Step 27535: {'lr': 0.0002115726782217701, 'samples': 14098432, 'steps': 27535, 'batch_loss/train': 0.7721927734091878} 12/26/2021 14:19:36 - INFO - codeparrot_training - Step 27536: {'lr': 0.00021155710249121676, 'samples': 14098944, 'steps': 27536, 'batch_loss/train': 0.847007698379457} 12/26/2021 14:19:48 - INFO - codeparrot_training - Step 27537: {'lr': 0.00021154152691349776, 'samples': 14099456, 'steps': 27537, 'batch_loss/train': 0.7508865632116795} 12/26/2021 14:19:58 - INFO - codeparrot_training - Step 27538: {'lr': 0.00021152595148867518, 'samples': 14099968, 'steps': 27538, 'batch_loss/train': 0.7436033431440592} 12/26/2021 14:20:09 - INFO - codeparrot_training - Step 27539: {'lr': 0.00021151037621681097, 'samples': 14100480, 'steps': 27539, 'batch_loss/train': 0.8251396073028445} 12/26/2021 14:20:21 - INFO - codeparrot_training - Step 27540: {'lr': 0.00021149480109796688, 'samples': 14100992, 'steps': 27540, 'batch_loss/train': 0.7750688521191478} 12/26/2021 14:20:32 - INFO - codeparrot_training - Step 27541: {'lr': 0.00021147922613220488, 'samples': 14101504, 'steps': 27541, 'batch_loss/train': 0.7637317853514105} 12/26/2021 14:20:42 - INFO - codeparrot_training - Step 27542: {'lr': 0.00021146365131958705, 'samples': 14102016, 'steps': 27542, 'batch_loss/train': 0.7481366326101124} 12/26/2021 14:20:55 - INFO - codeparrot_training - Step 27543: {'lr': 0.00021144807666017513, 'samples': 14102528, 
'steps': 27543, 'batch_loss/train': 0.7272480772808194} 12/26/2021 14:21:06 - INFO - codeparrot_training - Step 27544: {'lr': 0.00021143250215403114, 'samples': 14103040, 'steps': 27544, 'batch_loss/train': 0.7826724965125322} 12/26/2021 14:21:16 - INFO - codeparrot_training - Step 27545: {'lr': 0.0002114169278012169, 'samples': 14103552, 'steps': 27545, 'batch_loss/train': 0.9113104501739144} 12/26/2021 14:21:27 - INFO - codeparrot_training - Step 27546: {'lr': 0.00021140135360179447, 'samples': 14104064, 'steps': 27546, 'batch_loss/train': 0.8260812237858772} 12/26/2021 14:21:39 - INFO - codeparrot_training - Step 27547: {'lr': 0.0002113857795558256, 'samples': 14104576, 'steps': 27547, 'batch_loss/train': 0.7192181558348238} 12/26/2021 14:21:50 - INFO - codeparrot_training - Step 27548: {'lr': 0.00021137020566337227, 'samples': 14105088, 'steps': 27548, 'batch_loss/train': 0.809067590162158} 12/26/2021 14:22:00 - INFO - codeparrot_training - Step 27549: {'lr': 0.0002113546319244965, 'samples': 14105600, 'steps': 27549, 'batch_loss/train': 0.798805913887918} 12/26/2021 14:22:12 - INFO - codeparrot_training - Step 27550: {'lr': 0.0002113390583392601, 'samples': 14106112, 'steps': 27550, 'batch_loss/train': 0.7570283580571413} 12/26/2021 14:22:23 - INFO - codeparrot_training - Step 27551: {'lr': 0.00021132348490772503, 'samples': 14106624, 'steps': 27551, 'batch_loss/train': 0.67045072093606} 12/26/2021 14:22:34 - INFO - codeparrot_training - Step 27552: {'lr': 0.00021130791162995307, 'samples': 14107136, 'steps': 27552, 'batch_loss/train': 0.6567427041009068} 12/26/2021 14:22:46 - INFO - codeparrot_training - Step 27553: {'lr': 0.00021129233850600632, 'samples': 14107648, 'steps': 27553, 'batch_loss/train': 0.7499889913015068} 12/26/2021 14:22:56 - INFO - codeparrot_training - Step 27554: {'lr': 0.00021127676553594666, 'samples': 14108160, 'steps': 27554, 'batch_loss/train': 0.7809158088639379} 12/26/2021 14:23:07 - INFO - codeparrot_training - Step 27555: {'lr': 
0.00021126119271983596, 'samples': 14108672, 'steps': 27555, 'batch_loss/train': 0.7424639565870166} 12/26/2021 14:23:18 - INFO - codeparrot_training - Step 27556: {'lr': 0.00021124562005773605, 'samples': 14109184, 'steps': 27556, 'batch_loss/train': 0.7746769711375237} 12/26/2021 14:23:31 - INFO - codeparrot_training - Step 27557: {'lr': 0.00021123004754970893, 'samples': 14109696, 'steps': 27557, 'batch_loss/train': 0.7910131211392581} 12/26/2021 14:23:41 - INFO - codeparrot_training - Step 27558: {'lr': 0.00021121447519581658, 'samples': 14110208, 'steps': 27558, 'batch_loss/train': 0.953006993047893} 12/26/2021 14:23:52 - INFO - codeparrot_training - Step 27559: {'lr': 0.0002111989029961207, 'samples': 14110720, 'steps': 27559, 'batch_loss/train': 0.7312466171570122} 12/26/2021 14:24:04 - INFO - codeparrot_training - Step 27560: {'lr': 0.00021118333095068343, 'samples': 14111232, 'steps': 27560, 'batch_loss/train': 0.404830858373316} 12/26/2021 14:24:15 - INFO - codeparrot_training - Step 27561: {'lr': 0.0002111677590595666, 'samples': 14111744, 'steps': 27561, 'batch_loss/train': 0.7535343355266377} 12/26/2021 14:24:25 - INFO - codeparrot_training - Step 27562: {'lr': 0.00021115218732283204, 'samples': 14112256, 'steps': 27562, 'batch_loss/train': 0.758615031838417} 12/26/2021 14:24:38 - INFO - codeparrot_training - Step 27563: {'lr': 0.00021113661574054165, 'samples': 14112768, 'steps': 27563, 'batch_loss/train': 1.4893466383218765} 12/26/2021 14:24:49 - INFO - codeparrot_training - Step 27564: {'lr': 0.00021112104431275755, 'samples': 14113280, 'steps': 27564, 'batch_loss/train': 0.6859097774140537} 12/26/2021 14:25:00 - INFO - codeparrot_training - Step 27565: {'lr': 0.00021110547303954146, 'samples': 14113792, 'steps': 27565, 'batch_loss/train': 0.7399031138047576} 12/26/2021 14:25:10 - INFO - codeparrot_training - Step 27566: {'lr': 0.00021108990192095532, 'samples': 14114304, 'steps': 27566, 'batch_loss/train': 0.6911023528082296} 12/26/2021 14:25:23 - 
INFO - codeparrot_training - Step 27567: {'lr': 0.00021107433095706094, 'samples': 14114816, 'steps': 27567, 'batch_loss/train': 0.6797867463901639} 12/26/2021 14:25:33 - INFO - codeparrot_training - Step 27568: {'lr': 0.00021105876014792044, 'samples': 14115328, 'steps': 27568, 'batch_loss/train': 0.6897832923568785} 12/26/2021 14:25:44 - INFO - codeparrot_training - Step 27569: {'lr': 0.00021104318949359554, 'samples': 14115840, 'steps': 27569, 'batch_loss/train': 0.7601068229414523} 12/26/2021 14:25:56 - INFO - codeparrot_training - Step 27570: {'lr': 0.0002110276189941483, 'samples': 14116352, 'steps': 27570, 'batch_loss/train': 0.7616737699136138} 12/26/2021 14:26:06 - INFO - codeparrot_training - Step 27571: {'lr': 0.00021101204864964042, 'samples': 14116864, 'steps': 27571, 'batch_loss/train': 0.6893642873037606} 12/26/2021 14:26:17 - INFO - codeparrot_training - Step 27572: {'lr': 0.00021099647846013398, 'samples': 14117376, 'steps': 27572, 'batch_loss/train': 0.6159504386596382} 12/26/2021 14:26:30 - INFO - codeparrot_training - Step 27573: {'lr': 0.00021098090842569078, 'samples': 14117888, 'steps': 27573, 'batch_loss/train': 0.7787097608670592} 12/26/2021 14:26:40 - INFO - codeparrot_training - Step 27574: {'lr': 0.0002109653385463727, 'samples': 14118400, 'steps': 27574, 'batch_loss/train': 0.7455521943047643} 12/26/2021 14:26:51 - INFO - codeparrot_training - Step 27575: {'lr': 0.0002109497688222418, 'samples': 14118912, 'steps': 27575, 'batch_loss/train': 0.7557007735595107} 12/26/2021 14:27:03 - INFO - codeparrot_training - Step 27576: {'lr': 0.00021093419925335982, 'samples': 14119424, 'steps': 27576, 'batch_loss/train': 0.7440016232430935} 12/26/2021 14:27:14 - INFO - codeparrot_training - Step 27577: {'lr': 0.0002109186298397888, 'samples': 14119936, 'steps': 27577, 'batch_loss/train': 0.6842752430820838} 12/26/2021 14:27:24 - INFO - codeparrot_training - Step 27578: {'lr': 0.0002109030605815904, 'samples': 14120448, 'steps': 27578, 
'batch_loss/train': 0.7253090525045991}
12/26/2021 14:27:35 - INFO - codeparrot_training - Step 27579: {'lr': 0.00021088749147882675, 'samples': 14120960, 'steps': 27579, 'batch_loss/train': 0.7098385989665985}
12/26/2021 14:27:48 - INFO - codeparrot_training - Step 27580: {'lr': 0.0002108719225315597, 'samples': 14121472, 'steps': 27580, 'batch_loss/train': 0.7422689044615254}
12/26/2021 14:27:58 - INFO - codeparrot_training - Step 27581: {'lr': 0.00021085635373985112, 'samples': 14121984, 'steps': 27581, 'batch_loss/train': 0.7699539018794894}
12/26/2021 14:28:09 - INFO - codeparrot_training - Step 27582: {'lr': 0.0002108407851037628, 'samples': 14122496, 'steps': 27582, 'batch_loss/train': 0.8159037809818983}
12/26/2021 14:28:21 - INFO - codeparrot_training - Step 27583: {'lr': 0.00021082521662335677, 'samples': 14123008, 'steps': 27583, 'batch_loss/train': 0.6779626086354256}
12/26/2021 14:28:32 - INFO - codeparrot_training - Step 27584: {'lr': 0.00021080964829869495, 'samples': 14123520, 'steps': 27584, 'batch_loss/train': 0.663820852059871}
12/26/2021 14:28:42 - INFO - codeparrot_training - Step 27585: {'lr': 0.00021079408012983906, 'samples': 14124032, 'steps': 27585, 'batch_loss/train': 0.7470366843044758}
12/26/2021 14:28:54 - INFO - codeparrot_training - Step 27586: {'lr': 0.00021077851211685118, 'samples': 14124544, 'steps': 27586, 'batch_loss/train': 0.7711049523204565}
12/26/2021 14:29:05 - INFO - codeparrot_training - Step 27587: {'lr': 0.00021076294425979313, 'samples': 14125056, 'steps': 27587, 'batch_loss/train': 0.7888738224282861}
12/26/2021 14:29:16 - INFO - codeparrot_training - Step 27588: {'lr': 0.00021074737655872678, 'samples': 14125568, 'steps': 27588, 'batch_loss/train': 0.7683989377692342}
12/26/2021 14:29:26 - INFO - codeparrot_training - Step 27589: {'lr': 0.00021073180901371398, 'samples': 14126080, 'steps': 27589, 'batch_loss/train': 0.7203455371782184}
12/26/2021 14:29:38 - INFO - codeparrot_training - Step 27590: {'lr': 0.0002107162416248168, 'samples': 14126592, 'steps': 27590, 'batch_loss/train': 0.7477819859050214}
12/26/2021 14:29:49 - INFO - codeparrot_training - Step 27591: {'lr': 0.00021070067439209702, 'samples': 14127104, 'steps': 27591, 'batch_loss/train': 0.7435187539085746}
12/26/2021 14:30:00 - INFO - codeparrot_training - Step 27592: {'lr': 0.00021068510731561641, 'samples': 14127616, 'steps': 27592, 'batch_loss/train': 0.6960709955310449}
12/26/2021 14:30:12 - INFO - codeparrot_training - Step 27593: {'lr': 0.00021066954039543695, 'samples': 14128128, 'steps': 27593, 'batch_loss/train': 1.0054265391081572}
12/26/2021 14:30:23 - INFO - codeparrot_training - Step 27594: {'lr': 0.00021065397363162066, 'samples': 14128640, 'steps': 27594, 'batch_loss/train': 0.7617818489670753}
12/26/2021 14:30:34 - INFO - codeparrot_training - Step 27595: {'lr': 0.00021063840702422926, 'samples': 14129152, 'steps': 27595, 'batch_loss/train': 0.79028026945889}
12/26/2021 14:30:46 - INFO - codeparrot_training - Step 27596: {'lr': 0.0002106228405733246, 'samples': 14129664, 'steps': 27596, 'batch_loss/train': 0.7842209190130234}
12/26/2021 14:30:56 - INFO - codeparrot_training - Step 27597: {'lr': 0.0002106072742789688, 'samples': 14130176, 'steps': 27597, 'batch_loss/train': 0.7633971995674074}
12/26/2021 14:31:07 - INFO - codeparrot_training - Step 27598: {'lr': 0.00021059170814122355, 'samples': 14130688, 'steps': 27598, 'batch_loss/train': 0.7469310080632567}
12/26/2021 14:31:19 - INFO - codeparrot_training - Step 27599: {'lr': 0.00021057614216015084, 'samples': 14131200, 'steps': 27599, 'batch_loss/train': 0.7683088127523661}
12/26/2021 14:31:30 - INFO - codeparrot_training - Step 27600: {'lr': 0.00021056057633581237, 'samples': 14131712, 'steps': 27600, 'batch_loss/train': 0.8607469741255045}
12/26/2021 14:31:40 - INFO - codeparrot_training - Step 27601: {'lr': 0.0002105450106682703, 'samples': 14132224, 'steps': 27601, 'batch_loss/train': 0.7295112116262317}
12/26/2021 14:31:51 - INFO - codeparrot_training - Step 27602: {'lr': 0.00021052944515758632, 'samples': 14132736, 'steps': 27602, 'batch_loss/train': 0.8057273502927274}
12/26/2021 14:32:04 - INFO - codeparrot_training - Step 27603: {'lr': 0.0002105138798038224, 'samples': 14133248, 'steps': 27603, 'batch_loss/train': 0.8096237871795893}
12/26/2021 14:32:14 - INFO - codeparrot_training - Step 27604: {'lr': 0.00021049831460704028, 'samples': 14133760, 'steps': 27604, 'batch_loss/train': 0.7079956568777561}
12/26/2021 14:32:25 - INFO - codeparrot_training - Step 27605: {'lr': 0.00021048274956730198, 'samples': 14134272, 'steps': 27605, 'batch_loss/train': 0.7223728969693184}
12/26/2021 14:32:37 - INFO - codeparrot_training - Step 27606: {'lr': 0.0002104671846846694, 'samples': 14134784, 'steps': 27606, 'batch_loss/train': 0.8157735811546445}
12/26/2021 14:32:48 - INFO - codeparrot_training - Step 27607: {'lr': 0.00021045161995920437, 'samples': 14135296, 'steps': 27607, 'batch_loss/train': 0.7583822181914002}
12/26/2021 14:32:59 - INFO - codeparrot_training - Step 27608: {'lr': 0.0002104360553909687, 'samples': 14135808, 'steps': 27608, 'batch_loss/train': 0.7112089404836297}
12/26/2021 14:33:09 - INFO - codeparrot_training - Step 27609: {'lr': 0.00021042049098002436, 'samples': 14136320, 'steps': 27609, 'batch_loss/train': 0.6853420312982053}
12/26/2021 14:33:22 - INFO - codeparrot_training - Step 27610: {'lr': 0.00021040492672643328, 'samples': 14136832, 'steps': 27610, 'batch_loss/train': 0.7863056384958327}
12/26/2021 14:33:33 - INFO - codeparrot_training - Step 27611: {'lr': 0.00021038936263025713, 'samples': 14137344, 'steps': 27611, 'batch_loss/train': 0.6853002645075321}
12/26/2021 14:33:43 - INFO - codeparrot_training - Step 27612: {'lr': 0.00021037379869155796, 'samples': 14137856, 'steps': 27612, 'batch_loss/train': 0.7867793887853622}
12/26/2021 14:33:55 - INFO - codeparrot_training - Step 27613: {'lr': 0.00021035823491039767, 'samples': 14138368, 'steps': 27613, 'batch_loss/train': 0.6490345196798444}
12/26/2021 14:34:06 - INFO - codeparrot_training - Step 27614: {'lr': 0.00021034267128683802, 'samples': 14138880, 'steps': 27614, 'batch_loss/train': 0.7424994828761555}
12/26/2021 14:34:16 - INFO - codeparrot_training - Step 27615: {'lr': 0.00021032710782094088, 'samples': 14139392, 'steps': 27615, 'batch_loss/train': 0.7978864964097738}
12/26/2021 14:34:28 - INFO - codeparrot_training - Step 27616: {'lr': 0.0002103115445127683, 'samples': 14139904, 'steps': 27616, 'batch_loss/train': 0.7154992021387443}
12/26/2021 14:34:39 - INFO - codeparrot_training - Step 27617: {'lr': 0.00021029598136238197, 'samples': 14140416, 'steps': 27617, 'batch_loss/train': 0.7648752583190799}
12/26/2021 14:34:50 - INFO - codeparrot_training - Step 27618: {'lr': 0.00021028041836984384, 'samples': 14140928, 'steps': 27618, 'batch_loss/train': 0.7051179325208068}
12/26/2021 14:35:03 - INFO - codeparrot_training - Step 27619: {'lr': 0.00021026485553521571, 'samples': 14141440, 'steps': 27619, 'batch_loss/train': 0.7901749247685075}
12/26/2021 14:35:14 - INFO - codeparrot_training - Step 27620: {'lr': 0.00021024929285855959, 'samples': 14141952, 'steps': 27620, 'batch_loss/train': 0.7616702385712415}
12/26/2021 14:35:24 - INFO - codeparrot_training - Step 27621: {'lr': 0.00021023373033993725, 'samples': 14142464, 'steps': 27621, 'batch_loss/train': 0.6864696755073965}
12/26/2021 14:35:35 - INFO - codeparrot_training - Step 27622: {'lr': 0.00021021816797941052, 'samples': 14142976, 'steps': 27622, 'batch_loss/train': 0.5162265023682266}
12/26/2021 14:35:47 - INFO - codeparrot_training - Step 27623: {'lr': 0.00021020260577704142, 'samples': 14143488, 'steps': 27623, 'batch_loss/train': 0.760600071400404}
12/26/2021 14:35:58 - INFO - codeparrot_training - Step 27624: {'lr': 0.00021018704373289168, 'samples': 14144000, 'steps': 27624, 'batch_loss/train': 0.5835103713907301}
12/26/2021 14:36:08 - INFO - codeparrot_training - Step 27625: {'lr': 0.0002101714818470233, 'samples': 14144512, 'steps': 27625, 'batch_loss/train': 0.6820217375643551}
12/26/2021 14:36:20 - INFO - codeparrot_training - Step 27626: {'lr': 0.00021015592011949794, 'samples': 14145024, 'steps': 27626, 'batch_loss/train': 0.8094063615426421}
12/26/2021 14:36:31 - INFO - codeparrot_training - Step 27627: {'lr': 0.00021014035855037772, 'samples': 14145536, 'steps': 27627, 'batch_loss/train': 0.785059011541307}
12/26/2021 14:36:42 - INFO - codeparrot_training - Step 27628: {'lr': 0.00021012479713972433, 'samples': 14146048, 'steps': 27628, 'batch_loss/train': 0.7597140651196241}
12/26/2021 14:36:54 - INFO - codeparrot_training - Step 27629: {'lr': 0.00021010923588759973, 'samples': 14146560, 'steps': 27629, 'batch_loss/train': 0.757861484773457}
12/26/2021 14:37:04 - INFO - codeparrot_training - Step 27630: {'lr': 0.00021009367479406563, 'samples': 14147072, 'steps': 27630, 'batch_loss/train': 0.755395251326263}
12/26/2021 14:37:15 - INFO - codeparrot_training - Step 27631: {'lr': 0.0002100781138591841, 'samples': 14147584, 'steps': 27631, 'batch_loss/train': 0.745545799843967}
12/26/2021 14:37:26 - INFO - codeparrot_training - Step 27632: {'lr': 0.00021006255308301695, 'samples': 14148096, 'steps': 27632, 'batch_loss/train': 0.728732363670133}
12/26/2021 14:37:39 - INFO - codeparrot_training - Step 27633: {'lr': 0.00021004699246562588, 'samples': 14148608, 'steps': 27633, 'batch_loss/train': 0.8027369827032089}
12/26/2021 14:37:49 - INFO - codeparrot_training - Step 27634: {'lr': 0.00021003143200707297, 'samples': 14149120, 'steps': 27634, 'batch_loss/train': 0.6535264351405203}
12/26/2021 14:38:00 - INFO - codeparrot_training - Step 27635: {'lr': 0.00021001587170742003, 'samples': 14149632, 'steps': 27635, 'batch_loss/train': 0.6892382260411978}
12/26/2021 14:38:12 - INFO - codeparrot_training - Step 27636: {'lr': 0.0002100003115667289, 'samples': 14150144, 'steps': 27636, 'batch_loss/train': 0.7501916717737913}
12/26/2021 14:38:23 - INFO - codeparrot_training - Step 27637: {'lr': 0.0002099847515850613, 'samples': 14150656, 'steps': 27637, 'batch_loss/train': 0.7278710817918181}
12/26/2021 14:38:33 - INFO - codeparrot_training - Step 27638: {'lr': 0.0002099691917624793, 'samples': 14151168, 'steps': 27638, 'batch_loss/train': 0.6974751381203532}
12/26/2021 14:38:46 - INFO - codeparrot_training - Step 27639: {'lr': 0.0002099536320990447, 'samples': 14151680, 'steps': 27639, 'batch_loss/train': 0.7700713938102126}
12/26/2021 14:38:57 - INFO - codeparrot_training - Step 27640: {'lr': 0.00020993807259481932, 'samples': 14152192, 'steps': 27640, 'batch_loss/train': 0.7482502534985542}
12/26/2021 14:39:07 - INFO - codeparrot_training - Step 27641: {'lr': 0.00020992251324986494, 'samples': 14152704, 'steps': 27641, 'batch_loss/train': 0.8204301388468593}
12/26/2021 14:39:19 - INFO - codeparrot_training - Step 27642: {'lr': 0.00020990695406424364, 'samples': 14153216, 'steps': 27642, 'batch_loss/train': 0.71233785856748}
12/26/2021 14:39:30 - INFO - codeparrot_training - Step 27643: {'lr': 0.00020989139503801714, 'samples': 14153728, 'steps': 27643, 'batch_loss/train': 0.7476673787459731}
12/26/2021 14:39:41 - INFO - codeparrot_training - Step 27644: {'lr': 0.0002098758361712473, 'samples': 14154240, 'steps': 27644, 'batch_loss/train': 0.7546227853745222}
12/26/2021 14:39:51 - INFO - codeparrot_training - Step 27645: {'lr': 0.00020986027746399593, 'samples': 14154752, 'steps': 27645, 'batch_loss/train': 0.761287915345747}
12/26/2021 14:40:03 - INFO - codeparrot_training - Step 27646: {'lr': 0.000209844718916325, 'samples': 14155264, 'steps': 27646, 'batch_loss/train': 0.8896274352446198}
12/26/2021 14:40:14 - INFO - codeparrot_training - Step 27647: {'lr': 0.00020982916052829628, 'samples': 14155776, 'steps': 27647, 'batch_loss/train': 0.8909128848463297}
12/26/2021 14:40:25 - INFO - codeparrot_training - Step 27648: {'lr': 0.0002098136022999716, 'samples': 14156288, 'steps': 27648, 'batch_loss/train': 0.8046596343629062}
12/26/2021 14:40:39 - INFO - codeparrot_training - Step 27649: {'lr': 0.00020979804423141297, 'samples': 14156800, 'steps': 27649, 'batch_loss/train': 0.8352566850371659}
12/26/2021 14:40:50 - INFO - codeparrot_training - Step 27650: {'lr': 0.00020978248632268207, 'samples': 14157312, 'steps': 27650, 'batch_loss/train': 0.7206275281496346}
12/26/2021 14:41:00 - INFO - codeparrot_training - Step 27651: {'lr': 0.00020976692857384088, 'samples': 14157824, 'steps': 27651, 'batch_loss/train': 0.6213419880950823}
12/26/2021 14:41:12 - INFO - codeparrot_training - Step 27652: {'lr': 0.00020975137098495106, 'samples': 14158336, 'steps': 27652, 'batch_loss/train': 0.6728599280468188}
12/26/2021 14:41:23 - INFO - codeparrot_training - Step 27653: {'lr': 0.00020973581355607473, 'samples': 14158848, 'steps': 27653, 'batch_loss/train': 0.7469377133529633}
12/26/2021 14:41:33 - INFO - codeparrot_training - Step 27654: {'lr': 0.00020972025628727354, 'samples': 14159360, 'steps': 27654, 'batch_loss/train': 0.6434839549474418}
12/26/2021 14:41:44 - INFO - codeparrot_training - Step 27655: {'lr': 0.00020970469917860947, 'samples': 14159872, 'steps': 27655, 'batch_loss/train': 0.7449108799919486}
12/26/2021 14:41:56 - INFO - codeparrot_training - Step 27656: {'lr': 0.00020968914223014418, 'samples': 14160384, 'steps': 27656, 'batch_loss/train': 0.7464056648313999}
12/26/2021 14:42:07 - INFO - codeparrot_training - Step 27657: {'lr': 0.00020967358544193967, 'samples': 14160896, 'steps': 27657, 'batch_loss/train': 0.7860474763438106}
12/26/2021 14:42:17 - INFO - codeparrot_training - Step 27658: {'lr': 0.00020965802881405782, 'samples': 14161408, 'steps': 27658, 'batch_loss/train': 0.8051136182621121}
12/26/2021 14:42:31 - INFO - codeparrot_training - Step 27659: {'lr': 0.00020964247234656034, 'samples': 14161920, 'steps': 27659, 'batch_loss/train': 0.7748130485415459}
12/26/2021 14:42:42 - INFO - codeparrot_training - Step 27660: {'lr': 0.00020962691603950917, 'samples': 14162432, 'steps': 27660, 'batch_loss/train': 0.7629405884072185}
12/26/2021 14:42:53 - INFO - codeparrot_training - Step 27661: {'lr': 0.00020961135989296621, 'samples': 14162944, 'steps': 27661, 'batch_loss/train': 0.8317570993676782}
12/26/2021 14:43:05 - INFO - codeparrot_training - Step 27662: {'lr': 0.00020959580390699322, 'samples': 14163456, 'steps': 27662, 'batch_loss/train': 1.197047833353281}
12/26/2021 14:43:15 - INFO - codeparrot_training - Step 27663: {'lr': 0.000209580248081652, 'samples': 14163968, 'steps': 27663, 'batch_loss/train': 0.7296739537268877}
12/26/2021 14:43:26 - INFO - codeparrot_training - Step 27664: {'lr': 0.00020956469241700443, 'samples': 14164480, 'steps': 27664, 'batch_loss/train': 0.7398018138483167}
12/26/2021 14:43:37 - INFO - codeparrot_training - Step 27665: {'lr': 0.00020954913691311248, 'samples': 14164992, 'steps': 27665, 'batch_loss/train': 0.9252873151563108}
12/26/2021 14:43:49 - INFO - codeparrot_training - Step 27666: {'lr': 0.00020953358157003782, 'samples': 14165504, 'steps': 27666, 'batch_loss/train': 0.7157985880039632}
12/26/2021 14:44:00 - INFO - codeparrot_training - Step 27667: {'lr': 0.00020951802638784228, 'samples': 14166016, 'steps': 27667, 'batch_loss/train': 0.6243002901319414}
12/26/2021 14:44:10 - INFO - codeparrot_training - Step 27668: {'lr': 0.00020950247136658793, 'samples': 14166528, 'steps': 27668, 'batch_loss/train': 0.5945429326966405}
12/26/2021 14:44:24 - INFO - codeparrot_training - Step 27669: {'lr': 0.0002094869165063364, 'samples': 14167040, 'steps': 27669, 'batch_loss/train': 0.787390767596662}
12/26/2021 14:44:35 - INFO - codeparrot_training - Step 27670: {'lr': 0.0002094713618071495, 'samples': 14167552, 'steps': 27670, 'batch_loss/train': 0.8081588980276138}
12/26/2021 14:44:46 - INFO - codeparrot_training - Step 27671: {'lr': 0.00020945580726908932, 'samples': 14168064, 'steps': 27671, 'batch_loss/train': 0.7409259155392647}
12/26/2021 14:44:58 - INFO - codeparrot_training - Step 27672: {'lr': 0.00020944025289221753, 'samples': 14168576, 'steps': 27672, 'batch_loss/train': 0.7763032661750913}
12/26/2021 14:45:08 - INFO - codeparrot_training - Step 27673: {'lr': 0.00020942469867659593, 'samples': 14169088, 'steps': 27673, 'batch_loss/train': 0.7286341767758131}
12/26/2021 14:45:19 - INFO - codeparrot_training - Step 27674: {'lr': 0.00020940914462228634, 'samples': 14169600, 'steps': 27674, 'batch_loss/train': 0.7602169709280133}
12/26/2021 14:45:30 - INFO - codeparrot_training - Step 27675: {'lr': 0.00020939359072935076, 'samples': 14170112, 'steps': 27675, 'batch_loss/train': 0.7562604257836938}
12/26/2021 14:45:43 - INFO - codeparrot_training - Step 27676: {'lr': 0.0002093780369978509, 'samples': 14170624, 'steps': 27676, 'batch_loss/train': 0.6834850986488163}
12/26/2021 14:45:54 - INFO - codeparrot_training - Step 27677: {'lr': 0.00020936248342784866, 'samples': 14171136, 'steps': 27677, 'batch_loss/train': 0.7499008299782872}
12/26/2021 14:46:05 - INFO - codeparrot_training - Step 27678: {'lr': 0.00020934693001940574, 'samples': 14171648, 'steps': 27678, 'batch_loss/train': 0.8767004366964102}
12/26/2021 14:46:17 - INFO - codeparrot_training - Step 27679: {'lr': 0.00020933137677258415, 'samples': 14172160, 'steps': 27679, 'batch_loss/train': 0.7795315105468035}
12/26/2021 14:46:27 - INFO - codeparrot_training - Step 27680: {'lr': 0.00020931582368744568, 'samples': 14172672, 'steps': 27680, 'batch_loss/train': 0.7646876465296373}
12/26/2021 14:46:38 - INFO - codeparrot_training - Step 27681: {'lr': 0.00020930027076405215, 'samples': 14173184, 'steps': 27681, 'batch_loss/train': 0.7652696818113327}
12/26/2021 14:46:50 - INFO - codeparrot_training - Step 27682: {'lr': 0.00020928471800246527, 'samples': 14173696, 'steps': 27682, 'batch_loss/train': 0.738478478975594}
12/26/2021 14:47:01 - INFO - codeparrot_training - Step 27683: {'lr': 0.00020926916540274702, 'samples': 14174208, 'steps': 27683, 'batch_loss/train': 0.8068592720665038}
12/26/2021 14:47:11 - INFO - codeparrot_training - Step 27684: {'lr': 0.00020925361296495924, 'samples': 14174720, 'steps': 27684, 'batch_loss/train': 0.8093541553243995}
12/26/2021 14:47:22 - INFO - codeparrot_training - Step 27685: {'lr': 0.00020923806068916362, 'samples': 14175232, 'steps': 27685, 'batch_loss/train': 0.7910610623657703}
12/26/2021 14:47:36 - INFO - codeparrot_training - Step 27686: {'lr': 0.0002092225085754221, 'samples': 14175744, 'steps': 27686, 'batch_loss/train': 0.6875989998225123}
12/26/2021 14:47:47 - INFO - codeparrot_training - Step 27687: {'lr': 0.00020920695662379656, 'samples': 14176256, 'steps': 27687, 'batch_loss/train': 0.9227752024307847}
12/26/2021 14:47:57 - INFO - codeparrot_training - Step 27688: {'lr': 0.00020919140483434873, 'samples': 14176768, 'steps': 27688, 'batch_loss/train': 0.8182422001846135}
12/26/2021 14:48:10 - INFO - codeparrot_training - Step 27689: {'lr': 0.00020917585320714038, 'samples': 14177280, 'steps': 27689, 'batch_loss/train': 0.7202201448380947}
12/26/2021 14:48:20 - INFO - codeparrot_training - Step 27690: {'lr': 0.00020916030174223356, 'samples': 14177792, 'steps': 27690, 'batch_loss/train': 0.7577060153707862}
12/26/2021 14:48:31 - INFO - codeparrot_training - Step 27691: {'lr': 0.0002091447504396899, 'samples': 14178304, 'steps': 27691, 'batch_loss/train': 0.705867285374552}
12/26/2021 14:48:43 - INFO - codeparrot_training - Step 27692: {'lr': 0.0002091291992995713, 'samples': 14178816, 'steps': 27692, 'batch_loss/train': 0.8137333914637566}
12/26/2021 14:48:54 - INFO - codeparrot_training - Step 27693: {'lr': 0.00020911364832193946, 'samples': 14179328, 'steps': 27693, 'batch_loss/train': 0.6816135756671429}
12/26/2021 14:49:05 - INFO - codeparrot_training - Step 27694: {'lr': 0.0002090980975068565, 'samples': 14179840, 'steps': 27694, 'batch_loss/train': 0.684831578284502}
12/26/2021 14:49:15 - INFO - codeparrot_training - Step 27695: {'lr': 0.00020908254685438393, 'samples': 14180352, 'steps': 27695, 'batch_loss/train': 0.8009596772026271}
12/26/2021 14:49:29 - INFO - codeparrot_training - Step 27696: {'lr': 0.0002090669963645837, 'samples': 14180864, 'steps': 27696, 'batch_loss/train': 0.641822278383188}
12/26/2021 14:49:40 - INFO - codeparrot_training - Step 27697: {'lr': 0.0002090514460375177, 'samples': 14181376, 'steps': 27697, 'batch_loss/train': 0.6912596710026264}
12/26/2021 14:49:51 - INFO - codeparrot_training - Step 27698: {'lr': 0.00020903589587324773, 'samples': 14181888, 'steps': 27698, 'batch_loss/train': 0.7942612124606967}
12/26/2021 14:50:03 - INFO - codeparrot_training - Step 27699: {'lr': 0.00020902034587183549, 'samples': 14182400, 'steps': 27699, 'batch_loss/train': 0.7438774001784623}
12/26/2021 14:50:13 - INFO - codeparrot_training - Step 27700: {'lr': 0.00020900479603334283, 'samples': 14182912, 'steps': 27700, 'batch_loss/train': 0.7452274039387703}
12/26/2021 14:50:24 - INFO - codeparrot_training - Step 27701: {'lr': 0.00020898924635783176, 'samples': 14183424, 'steps': 27701, 'batch_loss/train': 0.7451680991798639}
12/26/2021 14:50:36 - INFO - codeparrot_training - Step 27702: {'lr': 0.00020897369684536388, 'samples': 14183936, 'steps': 27702, 'batch_loss/train': 0.7791845794417895}
12/26/2021 14:50:47 - INFO - codeparrot_training - Step 27703: {'lr': 0.00020895814749600114, 'samples': 14184448, 'steps': 27703, 'batch_loss/train': 0.7343801674433053}
12/26/2021 14:50:57 - INFO - codeparrot_training - Step 27704: {'lr': 0.00020894259830980523, 'samples': 14184960, 'steps': 27704, 'batch_loss/train': 0.7637212537229061}
12/26/2021 14:51:08 - INFO - codeparrot_training - Step 27705: {'lr': 0.00020892704928683807, 'samples': 14185472, 'steps': 27705, 'batch_loss/train': 0.8328684666194022}
12/26/2021 14:51:21 - INFO - codeparrot_training - Step 27706: {'lr': 0.0002089115004271615, 'samples': 14185984, 'steps': 27706, 'batch_loss/train': 0.8262239721079823}
12/26/2021 14:51:32 - INFO - codeparrot_training - Step 27707: {'lr': 0.0002088959517308372, 'samples': 14186496, 'steps': 27707, 'batch_loss/train': 0.739138949662447}
12/26/2021 14:51:42 - INFO - codeparrot_training - Step 27708: {'lr': 0.00020888040319792719, 'samples': 14187008, 'steps': 27708, 'batch_loss/train': 0.8482620194554329}
12/26/2021 14:51:55 - INFO - codeparrot_training - Step 27709: {'lr': 0.0002088648548284931, 'samples': 14187520, 'steps': 27709, 'batch_loss/train': 0.7190131640527397}
12/26/2021 14:52:06 - INFO - codeparrot_training - Step 27710: {'lr': 0.00020884930662259688, 'samples': 14188032, 'steps': 27710, 'batch_loss/train': 0.6803100551478565}
12/26/2021 14:52:16 - INFO - codeparrot_training - Step 27711: {'lr': 0.00020883375858030018, 'samples': 14188544, 'steps': 27711, 'batch_loss/train': 0.6325806332752109}
12/26/2021 14:52:28 - INFO - codeparrot_training - Step 27712: {'lr': 0.00020881821070166496, 'samples': 14189056, 'steps': 27712, 'batch_loss/train': 0.7842146879411303}
12/26/2021 14:52:39 - INFO - codeparrot_training - Step 27713: {'lr': 0.00020880266298675305, 'samples': 14189568, 'steps': 27713, 'batch_loss/train': 0.5847971141338348}
12/26/2021 14:52:49 - INFO - codeparrot_training - Step 27714: {'lr': 0.0002087871154356261, 'samples': 14190080, 'steps': 27714, 'batch_loss/train': 0.7327817864716053}
12/26/2021 14:53:00 - INFO - codeparrot_training - Step 27715: {'lr': 0.00020877156804834603, 'samples': 14190592, 'steps': 27715, 'batch_loss/train': 0.7473599074874073}
12/26/2021 14:53:14 - INFO - codeparrot_training - Step 27716: {'lr': 0.00020875602082497468, 'samples': 14191104, 'steps': 27716, 'batch_loss/train': 0.6987007195129991}
12/26/2021 14:53:25 - INFO - codeparrot_training - Step 27717: {'lr': 0.00020874047376557385, 'samples': 14191616, 'steps': 27717, 'batch_loss/train': 0.599352138582617}
12/26/2021 14:53:35 - INFO - codeparrot_training - Step 27718: {'lr': 0.00020872492687020526, 'samples': 14192128, 'steps': 27718, 'batch_loss/train': 0.8027881570160389}
12/26/2021 14:53:47 - INFO - codeparrot_training - Step 27719: {'lr': 0.0002087093801389307, 'samples': 14192640, 'steps': 27719, 'batch_loss/train': 0.8515580398961902}
12/26/2021 14:53:58 - INFO - codeparrot_training - Step 27720: {'lr': 0.00020869383357181222, 'samples': 14193152, 'steps': 27720, 'batch_loss/train': 0.8519781799986959}
12/26/2021 14:54:09 - INFO - codeparrot_training - Step 27721: {'lr': 0.00020867828716891135, 'samples': 14193664, 'steps': 27721, 'batch_loss/train': 0.6811021417379379}
12/26/2021 14:54:21 - INFO - codeparrot_training - Step 27722: {'lr': 0.00020866274093028997, 'samples': 14194176, 'steps': 27722, 'batch_loss/train': 0.7536508099874482}
12/26/2021 14:54:31 - INFO - codeparrot_training - Step 27723: {'lr': 0.00020864719485601003, 'samples': 14194688, 'steps': 27723, 'batch_loss/train': 0.8954402673989534}
12/26/2021 14:54:42 - INFO - codeparrot_training - Step 27724: {'lr': 0.0002086316489461332, 'samples': 14195200, 'steps': 27724, 'batch_loss/train': 0.7974963993765414}
12/26/2021 14:54:56 - INFO - codeparrot_training - Step 27725: {'lr': 0.00020861610320072133, 'samples': 14195712, 'steps': 27725, 'batch_loss/train': 0.7704443465918303}
12/26/2021 14:55:07 - INFO - codeparrot_training - Step 27726: {'lr': 0.00020860055761983613, 'samples': 14196224, 'steps': 27726, 'batch_loss/train': 0.8084644614718854}
12/26/2021 14:55:17 - INFO - codeparrot_training - Step 27727: {'lr': 0.00020858501220353957, 'samples': 14196736, 'steps': 27727, 'batch_loss/train': 0.7679045172408223}
12/26/2021 14:55:28 - INFO - codeparrot_training - Step 27728: {'lr': 0.00020856946695189333, 'samples': 14197248, 'steps': 27728, 'batch_loss/train': 0.7416025726124644}
12/26/2021 14:55:40 - INFO - codeparrot_training - Step 27729: {'lr': 0.00020855392186495928, 'samples': 14197760, 'steps': 27729, 'batch_loss/train': 0.8011772483587265}
12/26/2021 14:55:50 - INFO - codeparrot_training - Step 27730: {'lr': 0.00020853837694279908, 'samples': 14198272, 'steps': 27730, 'batch_loss/train': 0.6797637855634093}
12/26/2021 14:56:01 - INFO - codeparrot_training - Step 27731: {'lr': 0.00020852283218547469, 'samples': 14198784, 'steps': 27731, 'batch_loss/train': 0.6607327219098806}
12/26/2021 14:56:13 - INFO - codeparrot_training - Step 27732: {'lr': 0.00020850728759304793, 'samples': 14199296, 'steps': 27732, 'batch_loss/train': 0.7487048609182239}
12/26/2021 14:56:24 - INFO - codeparrot_training - Step 27733: {'lr': 0.0002084917431655804, 'samples': 14199808, 'steps': 27733, 'batch_loss/train': 0.7051220387220383}
12/26/2021 14:56:34 - INFO - codeparrot_training - Step 27734: {'lr': 0.00020847619890313406, 'samples': 14200320, 'steps': 27734, 'batch_loss/train': 0.7478117430582643}
12/26/2021 14:56:48 - INFO - codeparrot_training - Step 27735: {'lr': 0.0002084606548057708, 'samples': 14200832, 'steps': 27735, 'batch_loss/train': 0.667149170069024}
12/26/2021 14:56:59 - INFO - codeparrot_training - Step 27736: {'lr': 0.00020844511087355223, 'samples': 14201344, 'steps': 27736, 'batch_loss/train': 0.8189079691655934}
12/26/2021 14:57:10 - INFO - codeparrot_training - Step 27737: {'lr': 0.00020842956710654013, 'samples': 14201856, 'steps': 27737, 'batch_loss/train': 0.6157654598355293}
12/26/2021 14:57:22 - INFO - codeparrot_training - Step 27738: {'lr': 0.00020841402350479642, 'samples': 14202368, 'steps': 27738, 'batch_loss/train': 0.7937519955448806}
12/26/2021 14:57:33 - INFO - codeparrot_training - Step 27739: {'lr': 0.0002083984800683829, 'samples': 14202880, 'steps': 27739, 'batch_loss/train': 0.7191409803926945}
12/26/2021 14:57:43 - INFO - codeparrot_training - Step 27740: {'lr': 0.00020838293679736127, 'samples': 14203392, 'steps': 27740, 'batch_loss/train': 0.7191137811169028}
12/26/2021 14:57:54 - INFO - codeparrot_training - Step 27741: {'lr': 0.0002083673936917933, 'samples': 14203904, 'steps': 27741, 'batch_loss/train': 0.7602440549526364}
12/26/2021 14:58:06 - INFO - codeparrot_training - Step 27742: {'lr': 0.00020835185075174096, 'samples': 14204416, 'steps': 27742, 'batch_loss/train': 0.7209223655518144}
12/26/2021 14:58:16 - INFO - codeparrot_training - Step 27743: {'lr': 0.00020833630797726597, 'samples': 14204928, 'steps': 27743, 'batch_loss/train': 0.8056470714509487}
12/26/2021 14:58:27 - INFO - codeparrot_training - Step 27744: {'lr': 0.00020832076536842994, 'samples': 14205440, 'steps': 27744, 'batch_loss/train': 0.7431069668382406}
12/26/2021 14:58:42 - INFO - codeparrot_training - Step 27745: {'lr': 0.0002083052229252949, 'samples': 14205952, 'steps': 27745, 'batch_loss/train': 0.800712214782834}
12/26/2021 14:58:52 - INFO - codeparrot_training - Step 27746: {'lr': 0.00020828968064792256, 'samples': 14206464, 'steps': 27746, 'batch_loss/train': 0.7512138774618506}
12/26/2021 14:59:03 - INFO - codeparrot_training - Step 27747: {'lr': 0.00020827413853637467, 'samples': 14206976, 'steps': 27747, 'batch_loss/train': 0.847097004763782}
12/26/2021 14:59:14 - INFO - codeparrot_training - Step 27748: {'lr': 0.000208258596590713, 'samples': 14207488, 'steps': 27748, 'batch_loss/train': 1.2106011230498552}
12/26/2021 14:59:26 - INFO - codeparrot_training - Step 27749: {'lr': 0.0002082430548109995, 'samples': 14208000, 'steps': 27749, 'batch_loss/train': 0.8829543022438884}
12/26/2021 14:59:37 - INFO - codeparrot_training - Step 27750: {'lr': 0.00020822751319729576, 'samples': 14208512, 'steps': 27750, 'batch_loss/train': 0.8011102364398539}
12/26/2021 14:59:47 - INFO - codeparrot_training - Step 27751: {'lr': 0.00020821197174966378, 'samples': 14209024, 'steps': 27751, 'batch_loss/train': 0.7879505855962634}
12/26/2021 15:00:01 - INFO - codeparrot_training - Step 27752: {'lr': 0.00020819643046816506, 'samples': 14209536, 'steps': 27752, 'batch_loss/train': 0.7723037763498724}
12/26/2021 15:00:12 - INFO - codeparrot_training - Step 27753: {'lr': 0.00020818088935286168, 'samples': 14210048, 'steps': 27753, 'batch_loss/train': 0.7470193789340556}
12/26/2021 15:00:23 - INFO - codeparrot_training - Step 27754: {'lr': 0.00020816534840381522, 'samples': 14210560, 'steps': 27754, 'batch_loss/train': 0.774458090774715}
12/26/2021 15:00:35 - INFO - codeparrot_training - Step 27755: {'lr': 0.0002081498076210876, 'samples': 14211072, 'steps': 27755, 'batch_loss/train': 0.7964037861675024}
12/26/2021 15:00:45 - INFO - codeparrot_training - Step 27756: {'lr': 0.00020813426700474043, 'samples': 14211584, 'steps': 27756, 'batch_loss/train': 0.8824387593194842}
12/26/2021 15:00:56 - INFO - codeparrot_training - Step 27757: {'lr': 0.00020811872655483567, 'samples': 14212096, 'steps': 27757, 'batch_loss/train': 0.7522658221423626}
12/26/2021 15:01:08 - INFO - codeparrot_training - Step 27758: {'lr': 0.00020810318627143508, 'samples': 14212608, 'steps': 27758, 'batch_loss/train': 0.8307864852249622}
12/26/2021 15:01:19 - INFO - codeparrot_training - Step 27759: {'lr': 0.00020808764615460026, 'samples': 14213120, 'steps': 27759, 'batch_loss/train': 0.879539804533124}
12/26/2021 15:01:29 - INFO - codeparrot_training - Step 27760: {'lr': 0.00020807210620439324, 'samples': 14213632, 'steps': 27760, 'batch_loss/train': 0.7185667549492791}
12/26/2021 15:01:40 - INFO - codeparrot_training - Step 27761: {'lr': 0.00020805656642087573, 'samples': 14214144, 'steps': 27761, 'batch_loss/train': 0.7466942020691931}
12/26/2021 15:01:52 - INFO - codeparrot_training - Step 27762: {'lr': 0.0002080410268041095, 'samples': 14214656, 'steps': 27762, 'batch_loss/train': 0.7520222803577781}
12/26/2021 15:02:03 - INFO - codeparrot_training - Step 27763: {'lr': 0.00020802548735415616, 'samples': 14215168, 'steps': 27763, 'batch_loss/train': 0.7119854227639735}
12/26/2021 15:02:13 - INFO - codeparrot_training - Step 27764: {'lr': 0.0002080099480710777, 'samples': 14215680, 'steps': 27764, 'batch_loss/train': 0.7554630599915981}
12/26/2021 15:02:28 - INFO - codeparrot_training - Step 27765: {'lr': 0.00020799440895493592, 'samples': 14216192, 'steps': 27765, 'batch_loss/train': 0.8572999630123377}
12/26/2021 15:02:38 - INFO - codeparrot_training - Step 27766: {'lr': 0.00020797887000579242, 'samples': 14216704, 'steps': 27766, 'batch_loss/train': 0.7304784124717116}
12/26/2021 15:02:49 - INFO - codeparrot_training - Step 27767: {'lr': 0.000207963331223709, 'samples': 14217216, 'steps': 27767, 'batch_loss/train': 0.6861512362957001}
12/26/2021 15:03:01 - INFO - codeparrot_training - Step 27768: {'lr': 0.00020794779260874764, 'samples': 14217728, 'steps': 27768, 'batch_loss/train': 0.5567816663533449}
12/26/2021 15:03:11 - INFO - codeparrot_training - Step 27769: {'lr': 0.00020793225416096989, 'samples': 14218240, 'steps': 27769, 'batch_loss/train': 0.7832265691831708}
12/26/2021 15:03:22 - INFO - codeparrot_training - Step 27770: {'lr': 0.00020791671588043758, 'samples': 14218752, 'steps': 27770, 'batch_loss/train': 0.7415005257353187}
12/26/2021 15:03:33 - INFO - codeparrot_training - Step 27771: {'lr': 0.00020790117776721265, 'samples': 14219264, 'steps': 27771, 'batch_loss/train': 0.8000236265361309}
12/26/2021 15:03:45 - INFO - codeparrot_training - Step 27772: {'lr': 0.00020788563982135672, 'samples': 14219776, 'steps': 27772, 'batch_loss/train': 0.7654853463172913}
12/26/2021 15:03:56 - INFO - codeparrot_training - Step 27773: {'lr': 0.00020787010204293153, 'samples': 14220288, 'steps': 27773, 'batch_loss/train': 0.9977322281338274}
12/26/2021 15:04:06 - INFO - codeparrot_training - Step 27774: {'lr': 0.00020785456443199885, 'samples': 14220800, 'steps': 27774, 'batch_loss/train': 0.7668840661644936}
12/26/2021 15:04:20 - INFO - codeparrot_training - Step 27775: {'lr': 0.00020783902698862062, 'samples': 14221312, 'steps': 27775, 'batch_loss/train': 0.7095707692787983}
12/26/2021 15:04:31 - INFO - codeparrot_training - Step 27776: {'lr': 0.00020782348971285845, 'samples': 14221824, 'steps': 27776, 'batch_loss/train': 0.8122523603960872}
12/26/2021 15:04:42 - INFO - codeparrot_training - Step 27777: {'lr': 0.0002078079526047742, 'samples': 14222336, 'steps': 27777, 'batch_loss/train': 0.816333282738924}
12/26/2021 15:04:54 - INFO - codeparrot_training - Step 27778: {'lr': 0.0002077924156644295, 'samples': 14222848, 'steps': 27778, 'batch_loss/train': 0.7479749470949173}
12/26/2021 15:05:04 - INFO - codeparrot_training - Step 27779: {'lr': 0.00020777687889188625, 'samples': 14223360, 'steps': 27779, 'batch_loss/train': 0.7243672562763095}
12/26/2021 15:05:15 - INFO - codeparrot_training - Step 27780: {'lr': 0.00020776134228720627, 'samples': 14223872, 'steps': 27780, 'batch_loss/train': 0.7612899858504534}
12/26/2021 15:05:26 - INFO - codeparrot_training - Step 27781: {'lr': 0.00020774580585045113, 'samples': 14224384, 'steps': 27781, 'batch_loss/train': 0.7965971259400249}
12/26/2021 15:05:39 - INFO - codeparrot_training - Step 27782: {'lr': 0.00020773026958168281, 'samples': 14224896, 'steps': 27782, 'batch_loss/train': 0.8512439215555787}
12/26/2021 15:05:50 - INFO - codeparrot_training - Step 27783: {'lr': 0.00020771473348096293, 'samples': 14225408, 'steps': 27783, 'batch_loss/train': 0.8330768244341016}
12/26/2021 15:06:01 - INFO - codeparrot_training - Step 27784: {'lr': 0.00020769919754835337, 'samples': 14225920, 'steps': 27784, 'batch_loss/train': 0.8002216499298811}
12/26/2021 15:06:13 - INFO - codeparrot_training - Step 27785: {'lr': 0.0002076836617839157, 'samples': 14226432, 'steps': 27785, 'batch_loss/train': 0.7469473538221791}
12/26/2021 15:06:23 - INFO - codeparrot_training - Step 27786: {'lr': 0.00020766812618771188, 'samples': 14226944, 'steps': 27786, 'batch_loss/train': 0.7649592030793428}
12/26/2021 15:06:34 - INFO - codeparrot_training - Step 27787: {'lr': 0.00020765259075980363, 'samples': 14227456, 'steps': 27787, 'batch_loss/train': 0.772639843984507}
12/26/2021 15:06:46 - INFO - codeparrot_training - Step 27788: {'lr': 0.00020763705550025274, 'samples': 14227968, 'steps': 27788, 'batch_loss/train': 0.7377035496756434}
12/26/2021 15:06:57 - INFO - codeparrot_training - Step 27789: {'lr': 0.0002076215204091208, 'samples': 14228480, 'steps': 27789, 'batch_loss/train': 0.8426229832693934}
12/26/2021 15:07:07 - INFO - codeparrot_training - Step 27790: {'lr': 0.00020760598548646972, 'samples': 14228992, 'steps': 27790, 'batch_loss/train': 0.7892408855259418}
12/26/2021 15:07:18 - INFO - codeparrot_training - Step 27791: {'lr': 0.00020759045073236133, 'samples': 14229504, 'steps': 27791, 'batch_loss/train': 0.735875504091382}
12/26/2021 15:07:32 - INFO - codeparrot_training - Step 27792: {'lr': 0.0002075749161468572, 'samples': 14230016, 'steps': 27792, 'batch_loss/train': 0.8037335490807891}
12/26/2021 15:07:43 - INFO - codeparrot_training - Step 27793: {'lr': 0.00020755938173001916, 'samples': 14230528, 'steps': 27793, 'batch_loss/train': 0.7048098603263497}
12/26/2021 15:07:53 - INFO - codeparrot_training - Step 27794: {'lr': 0.0002075438474819091, 'samples': 14231040, 'steps': 27794, 'batch_loss/train': 0.6178658349672332}
12/26/2021 15:08:06 - INFO - codeparrot_training - Step 27795: {'lr': 0.00020752831340258856, 'samples': 14231552, 'steps': 27795, 'batch_loss/train': 0.7242889916524291}
12/26/2021 15:08:16 - INFO - codeparrot_training - Step 27796: {'lr': 0.00020751277949211943, 'samples': 14232064, 'steps': 27796, 'batch_loss/train': 0.6773590706288815}
12/26/2021 15:08:27 - INFO - codeparrot_training - Step 27797: {'lr': 0.0002074972457505635, 'samples': 14232576, 'steps': 27797, 'batch_loss/train': 0.7828621985390782}
12/26/2021 15:08:39 - INFO - codeparrot_training - Step 27798: {'lr': 0.0002074817121779825, 'samples': 14233088, 'steps': 27798, 'batch_loss/train': 0.6962936678901315}
12/26/2021 15:08:50 - INFO - codeparrot_training - Step 27799: {'lr': 0.0002074661787744381, 'samples': 14233600, 'steps': 27799, 'batch_loss/train': 0.7222713191295043}
12/26/2021 15:09:00 - INFO - codeparrot_training - Step 27800: {'lr': 0.00020745064553999204, 'samples': 14234112, 'steps': 27800, 'batch_loss/train': 0.8447078624740243}
12/26/2021 15:09:11 - INFO - codeparrot_training - Step 27801: {'lr': 0.00020743511247470632, 'samples': 14234624, 'steps': 27801, 'batch_loss/train': 0.7816819315776229}
12/26/2021 15:09:23 - INFO - codeparrot_training - Step 27802: {'lr': 0.00020741957957864238, 'samples': 14235136, 'steps': 27802, 'batch_loss/train': 0.6777187059633434}
12/26/2021 15:09:33 - INFO - codeparrot_training - Step 27803: {'lr': 0.00020740404685186224, 'samples': 14235648, 'steps': 27803, 'batch_loss/train': 0.7069247583858669}
12/26/2021 15:09:44 - INFO - codeparrot_training - Step 27804: {'lr': 0.00020738851429442739, 'samples': 14236160, 'steps': 27804, 'batch_loss/train': 0.736131978686899}
12/26/2021 15:09:58 - INFO - codeparrot_training - Step 27805: {'lr': 0.00020737298190639974, 'samples': 14236672, 'steps': 27805, 'batch_loss/train': 0.7229613135568798}
12/26/2021 15:10:09 - INFO - codeparrot_training - Step 27806: {'lr': 0.0002073574496878411, 'samples': 14237184, 'steps': 27806, 'batch_loss/train': 0.6973473923280835}
12/26/2021 15:10:19 - INFO - codeparrot_training - Step 27807: {'lr': 0.00020734191763881305, 'samples': 14237696, 'steps': 27807, 'batch_loss/train': 0.8161609414964914}
12/26/2021 15:10:32 - INFO - codeparrot_training - Step 27808: {'lr': 0.0002073263857593775, 'samples': 14238208, 'steps': 27808, 'batch_loss/train': 0.8636017730459571}
12/26/2021 15:10:42 - INFO - codeparrot_training - Step 27809: {'lr': 0.0002073108540495961, 'samples': 14238720, 'steps': 27809, 'batch_loss/train': 0.7605816945433617}
12/26/2021 15:10:53 - INFO - codeparrot_training - Step 27810: {'lr': 0.00020729532250953066, 'samples': 14239232, 'steps': 27810, 'batch_loss/train': 0.7289687364827842}
12/26/2021 15:11:03 - INFO - codeparrot_training - Step 27811: {'lr': 0.00020727979113924281, 'samples': 14239744, 'steps': 27811, 'batch_loss/train': 0.8212592890486121}
12/26/2021 15:11:16 -
INFO - codeparrot_training - Step 27812: {'lr': 0.00020726425993879445, 'samples': 14240256, 'steps': 27812, 'batch_loss/train': 0.9140670490451157} 12/26/2021 15:11:27 - INFO - codeparrot_training - Step 27813: {'lr': 0.00020724872890824726, 'samples': 14240768, 'steps': 27813, 'batch_loss/train': 0.7651581149548292} 12/26/2021 15:11:37 - INFO - codeparrot_training - Step 27814: {'lr': 0.00020723319804766293, 'samples': 14241280, 'steps': 27814, 'batch_loss/train': 0.567731540708337} 12/26/2021 15:11:51 - INFO - codeparrot_training - Step 27815: {'lr': 0.00020721766735710325, 'samples': 14241792, 'steps': 27815, 'batch_loss/train': 0.8450314737856388} 12/26/2021 15:12:02 - INFO - codeparrot_training - Step 27816: {'lr': 0.00020720213683663003, 'samples': 14242304, 'steps': 27816, 'batch_loss/train': 0.7858416484668851} 12/26/2021 15:12:12 - INFO - codeparrot_training - Step 27817: {'lr': 0.000207186606486305, 'samples': 14242816, 'steps': 27817, 'batch_loss/train': 0.7799141549039632} 12/26/2021 15:12:25 - INFO - codeparrot_training - Step 27818: {'lr': 0.0002071710763061897, 'samples': 14243328, 'steps': 27818, 'batch_loss/train': 0.7327669076621532} 12/26/2021 15:12:36 - INFO - codeparrot_training - Step 27819: {'lr': 0.00020715554629634612, 'samples': 14243840, 'steps': 27819, 'batch_loss/train': 0.7431902969256043} 12/26/2021 15:12:46 - INFO - codeparrot_training - Step 27820: {'lr': 0.00020714001645683595, 'samples': 14244352, 'steps': 27820, 'batch_loss/train': 0.9140037158504128} 12/26/2021 15:12:57 - INFO - codeparrot_training - Step 27821: {'lr': 0.00020712448678772083, 'samples': 14244864, 'steps': 27821, 'batch_loss/train': 0.8383500212803483} 12/26/2021 15:13:11 - INFO - codeparrot_training - Step 27822: {'lr': 0.0002071089572890625, 'samples': 14245376, 'steps': 27822, 'batch_loss/train': 0.798218137351796} 12/26/2021 15:13:21 - INFO - codeparrot_training - Step 27823: {'lr': 0.0002070934279609229, 'samples': 14245888, 'steps': 27823, 
'batch_loss/train': 0.760783470235765} 12/26/2021 15:13:32 - INFO - codeparrot_training - Step 27824: {'lr': 0.00020707789880336356, 'samples': 14246400, 'steps': 27824, 'batch_loss/train': 0.837350984569639} 12/26/2021 15:13:44 - INFO - codeparrot_training - Step 27825: {'lr': 0.00020706236981644634, 'samples': 14246912, 'steps': 27825, 'batch_loss/train': 0.7957609109580517} 12/26/2021 15:13:55 - INFO - codeparrot_training - Step 27826: {'lr': 0.0002070468410002328, 'samples': 14247424, 'steps': 27826, 'batch_loss/train': 0.7650497835129499} 12/26/2021 15:14:05 - INFO - codeparrot_training - Step 27827: {'lr': 0.00020703131235478496, 'samples': 14247936, 'steps': 27827, 'batch_loss/train': 0.7824153557885438} 12/26/2021 15:14:17 - INFO - codeparrot_training - Step 27828: {'lr': 0.0002070157838801643, 'samples': 14248448, 'steps': 27828, 'batch_loss/train': 0.7681783221196383} 12/26/2021 15:14:28 - INFO - codeparrot_training - Step 27829: {'lr': 0.0002070002555764327, 'samples': 14248960, 'steps': 27829, 'batch_loss/train': 0.7468631663359702} 12/26/2021 15:14:39 - INFO - codeparrot_training - Step 27830: {'lr': 0.00020698472744365177, 'samples': 14249472, 'steps': 27830, 'batch_loss/train': 0.7946378169581294} 12/26/2021 15:14:49 - INFO - codeparrot_training - Step 27831: {'lr': 0.00020696919948188337, 'samples': 14249984, 'steps': 27831, 'batch_loss/train': 0.6724686864763498} 12/26/2021 15:15:03 - INFO - codeparrot_training - Step 27832: {'lr': 0.00020695367169118922, 'samples': 14250496, 'steps': 27832, 'batch_loss/train': 0.7204487712588161} 12/26/2021 15:15:14 - INFO - codeparrot_training - Step 27833: {'lr': 0.00020693814407163092, 'samples': 14251008, 'steps': 27833, 'batch_loss/train': 0.8801147872582078} 12/26/2021 15:15:25 - INFO - codeparrot_training - Step 27834: {'lr': 0.0002069226166232704, 'samples': 14251520, 'steps': 27834, 'batch_loss/train': 0.8248907411471009} 12/26/2021 15:15:37 - INFO - codeparrot_training - Step 27835: {'lr': 
0.00020690708934616926, 'samples': 14252032, 'steps': 27835, 'batch_loss/train': 0.7204786692745984} 12/26/2021 15:15:47 - INFO - codeparrot_training - Step 27836: {'lr': 0.0002068915622403893, 'samples': 14252544, 'steps': 27836, 'batch_loss/train': 0.8027448579668999} 12/26/2021 15:15:58 - INFO - codeparrot_training - Step 27837: {'lr': 0.00020687603530599211, 'samples': 14253056, 'steps': 27837, 'batch_loss/train': 0.73909717425704} 12/26/2021 15:16:10 - INFO - codeparrot_training - Step 27838: {'lr': 0.00020686050854303956, 'samples': 14253568, 'steps': 27838, 'batch_loss/train': 0.7588854050263762} 12/26/2021 15:16:21 - INFO - codeparrot_training - Step 27839: {'lr': 0.00020684498195159342, 'samples': 14254080, 'steps': 27839, 'batch_loss/train': 0.8103754040785134} 12/26/2021 15:16:31 - INFO - codeparrot_training - Step 27840: {'lr': 0.00020682945553171525, 'samples': 14254592, 'steps': 27840, 'batch_loss/train': 0.8030640273354948} 12/26/2021 15:16:42 - INFO - codeparrot_training - Step 27841: {'lr': 0.00020681392928346685, 'samples': 14255104, 'steps': 27841, 'batch_loss/train': 0.911394702270627} 12/26/2021 15:16:54 - INFO - codeparrot_training - Step 27842: {'lr': 0.00020679840320691003, 'samples': 14255616, 'steps': 27842, 'batch_loss/train': 0.6950009111315012} 12/26/2021 15:17:05 - INFO - codeparrot_training - Step 27843: {'lr': 0.00020678287730210645, 'samples': 14256128, 'steps': 27843, 'batch_loss/train': 0.8326733924914151} 12/26/2021 15:17:15 - INFO - codeparrot_training - Step 27844: {'lr': 0.00020676735156911774, 'samples': 14256640, 'steps': 27844, 'batch_loss/train': 0.8308369317092001} 12/26/2021 15:17:29 - INFO - codeparrot_training - Step 27845: {'lr': 0.00020675182600800575, 'samples': 14257152, 'steps': 27845, 'batch_loss/train': 0.8155151759274304} 12/26/2021 15:17:40 - INFO - codeparrot_training - Step 27846: {'lr': 0.00020673630061883224, 'samples': 14257664, 'steps': 27846, 'batch_loss/train': 0.8054457595571876} 12/26/2021 15:17:51 - 
INFO - codeparrot_training - Step 27847: {'lr': 0.00020672077540165884, 'samples': 14258176, 'steps': 27847, 'batch_loss/train': 0.8469916405156255} 12/26/2021 15:18:03 - INFO - codeparrot_training - Step 27848: {'lr': 0.0002067052503565472, 'samples': 14258688, 'steps': 27848, 'batch_loss/train': 0.7779812067747116} 12/26/2021 15:18:13 - INFO - codeparrot_training - Step 27849: {'lr': 0.00020668972548355924, 'samples': 14259200, 'steps': 27849, 'batch_loss/train': 0.7624202657025307} 12/26/2021 15:18:24 - INFO - codeparrot_training - Step 27850: {'lr': 0.00020667420078275656, 'samples': 14259712, 'steps': 27850, 'batch_loss/train': 0.7460895404219627} 12/26/2021 15:18:38 - INFO - codeparrot_training - Step 27851: {'lr': 0.00020665867625420094, 'samples': 14260224, 'steps': 27851, 'batch_loss/train': 0.7737601725384593} 12/26/2021 15:18:48 - INFO - codeparrot_training - Step 27852: {'lr': 0.00020664315189795394, 'samples': 14260736, 'steps': 27852, 'batch_loss/train': 0.7464798982255161} 12/26/2021 15:18:59 - INFO - codeparrot_training - Step 27853: {'lr': 0.0002066276277140775, 'samples': 14261248, 'steps': 27853, 'batch_loss/train': 0.6853617490269244} 12/26/2021 15:19:10 - INFO - codeparrot_training - Step 27854: {'lr': 0.00020661210370263318, 'samples': 14261760, 'steps': 27854, 'batch_loss/train': 0.7171490075998008} 12/26/2021 15:19:22 - INFO - codeparrot_training - Step 27855: {'lr': 0.0002065965798636827, 'samples': 14262272, 'steps': 27855, 'batch_loss/train': 0.8142361673526466} 12/26/2021 15:19:32 - INFO - codeparrot_training - Step 27856: {'lr': 0.00020658105619728796, 'samples': 14262784, 'steps': 27856, 'batch_loss/train': 0.6940608820877969} 12/26/2021 15:19:43 - INFO - codeparrot_training - Step 27857: {'lr': 0.00020656553270351048, 'samples': 14263296, 'steps': 27857, 'batch_loss/train': 0.6576668210327625} 12/26/2021 15:19:55 - INFO - codeparrot_training - Step 27858: {'lr': 0.0002065500093824121, 'samples': 14263808, 'steps': 27858, 
'batch_loss/train': 0.7497142433421686} 12/26/2021 15:20:06 - INFO - codeparrot_training - Step 27859: {'lr': 0.00020653448623405435, 'samples': 14264320, 'steps': 27859, 'batch_loss/train': 0.6929541896097362} 12/26/2021 15:20:16 - INFO - codeparrot_training - Step 27860: {'lr': 0.00020651896325849913, 'samples': 14264832, 'steps': 27860, 'batch_loss/train': 0.8624072605744004} 12/26/2021 15:20:30 - INFO - codeparrot_training - Step 27861: {'lr': 0.00020650344045580816, 'samples': 14265344, 'steps': 27861, 'batch_loss/train': 0.8178667537868023} 12/26/2021 15:20:41 - INFO - codeparrot_training - Step 27862: {'lr': 0.0002064879178260431, 'samples': 14265856, 'steps': 27862, 'batch_loss/train': 0.7680172072723508} 12/26/2021 15:20:52 - INFO - codeparrot_training - Step 27863: {'lr': 0.0002064723953692655, 'samples': 14266368, 'steps': 27863, 'batch_loss/train': 0.7511024801060557} 12/26/2021 15:21:02 - INFO - codeparrot_training - Step 27864: {'lr': 0.00020645687308553732, 'samples': 14266880, 'steps': 27864, 'batch_loss/train': 0.7561353966593742} 12/26/2021 15:21:14 - INFO - codeparrot_training - Step 27865: {'lr': 0.0002064413509749202, 'samples': 14267392, 'steps': 27865, 'batch_loss/train': 0.845715475268662} 12/26/2021 15:21:25 - INFO - codeparrot_training - Step 27866: {'lr': 0.00020642582903747578, 'samples': 14267904, 'steps': 27866, 'batch_loss/train': 0.8829315458424389} 12/26/2021 15:21:36 - INFO - codeparrot_training - Step 27867: {'lr': 0.00020641030727326575, 'samples': 14268416, 'steps': 27867, 'batch_loss/train': 0.730442448053509} 12/26/2021 15:21:49 - INFO - codeparrot_training - Step 27868: {'lr': 0.00020639478568235199, 'samples': 14268928, 'steps': 27868, 'batch_loss/train': 0.787900757510215} 12/26/2021 15:22:00 - INFO - codeparrot_training - Step 27869: {'lr': 0.00020637926426479602, 'samples': 14269440, 'steps': 27869, 'batch_loss/train': 0.7806104430928826} 12/26/2021 15:22:10 - INFO - codeparrot_training - Step 27870: {'lr': 
0.00020636374302065958, 'samples': 14269952, 'steps': 27870, 'batch_loss/train': 0.7561289330478758} 12/26/2021 15:22:22 - INFO - codeparrot_training - Step 27871: {'lr': 0.00020634822195000454, 'samples': 14270464, 'steps': 27871, 'batch_loss/train': 0.8287205370143056} 12/26/2021 15:22:33 - INFO - codeparrot_training - Step 27872: {'lr': 0.0002063327010528925, 'samples': 14270976, 'steps': 27872, 'batch_loss/train': 0.7277302821166813} 12/26/2021 15:22:44 - INFO - codeparrot_training - Step 27873: {'lr': 0.00020631718032938507, 'samples': 14271488, 'steps': 27873, 'batch_loss/train': 0.7602472431026399} 12/26/2021 15:22:54 - INFO - codeparrot_training - Step 27874: {'lr': 0.000206301659779544, 'samples': 14272000, 'steps': 27874, 'batch_loss/train': 0.9806874166242778} 12/26/2021 15:23:07 - INFO - codeparrot_training - Step 27875: {'lr': 0.00020628613940343115, 'samples': 14272512, 'steps': 27875, 'batch_loss/train': 0.7382661318406463} 12/26/2021 15:23:17 - INFO - codeparrot_training - Step 27876: {'lr': 0.00020627061920110805, 'samples': 14273024, 'steps': 27876, 'batch_loss/train': 0.621831082738936} 12/26/2021 15:23:28 - INFO - codeparrot_training - Step 27877: {'lr': 0.0002062550991726365, 'samples': 14273536, 'steps': 27877, 'batch_loss/train': 0.7788197263143957} 12/26/2021 15:23:40 - INFO - codeparrot_training - Step 27878: {'lr': 0.000206239579318078, 'samples': 14274048, 'steps': 27878, 'batch_loss/train': 0.7248810483142734} 12/26/2021 15:23:50 - INFO - codeparrot_training - Step 27879: {'lr': 0.0002062240596374946, 'samples': 14274560, 'steps': 27879, 'batch_loss/train': 0.7566384497331455} 12/26/2021 15:24:01 - INFO - codeparrot_training - Step 27880: {'lr': 0.00020620854013094771, 'samples': 14275072, 'steps': 27880, 'batch_loss/train': 0.7656654920428991} 12/26/2021 15:24:15 - INFO - codeparrot_training - Step 27881: {'lr': 0.00020619302079849907, 'samples': 14275584, 'steps': 27881, 'batch_loss/train': 0.7753443904221058} 12/26/2021 15:24:26 - 
INFO - codeparrot_training - Step 27882: {'lr': 0.0002061775016402106, 'samples': 14276096, 'steps': 27882, 'batch_loss/train': 0.7746650255285203} 12/26/2021 15:24:36 - INFO - codeparrot_training - Step 27883: {'lr': 0.00020616198265614372, 'samples': 14276608, 'steps': 27883, 'batch_loss/train': 0.8260157834738493} 12/26/2021 15:24:47 - INFO - codeparrot_training - Step 27884: {'lr': 0.00020614646384636032, 'samples': 14277120, 'steps': 27884, 'batch_loss/train': 0.7656291689490899} 12/26/2021 15:24:59 - INFO - codeparrot_training - Step 27885: {'lr': 0.00020613094521092194, 'samples': 14277632, 'steps': 27885, 'batch_loss/train': 0.7873460641130805} 12/26/2021 15:25:09 - INFO - codeparrot_training - Step 27886: {'lr': 0.00020611542674989038, 'samples': 14278144, 'steps': 27886, 'batch_loss/train': 0.8174056056886911} 12/26/2021 15:25:20 - INFO - codeparrot_training - Step 27887: {'lr': 0.00020609990846332738, 'samples': 14278656, 'steps': 27887, 'batch_loss/train': 0.831601676531136} 12/26/2021 15:25:32 - INFO - codeparrot_training - Step 27888: {'lr': 0.00020608439035129457, 'samples': 14279168, 'steps': 27888, 'batch_loss/train': 0.809136065421626} 12/26/2021 15:25:43 - INFO - codeparrot_training - Step 27889: {'lr': 0.0002060688724138535, 'samples': 14279680, 'steps': 27889, 'batch_loss/train': 0.8173207202926278} 12/26/2021 15:25:53 - INFO - codeparrot_training - Step 27890: {'lr': 0.00020605335465106612, 'samples': 14280192, 'steps': 27890, 'batch_loss/train': 0.7978219538927078} 12/26/2021 15:26:07 - INFO - codeparrot_training - Step 27891: {'lr': 0.000206037837062994, 'samples': 14280704, 'steps': 27891, 'batch_loss/train': 0.6247050501406193} 12/26/2021 15:26:18 - INFO - codeparrot_training - Step 27892: {'lr': 0.0002060223196496988, 'samples': 14281216, 'steps': 27892, 'batch_loss/train': 0.7999415900558233} 12/26/2021 15:26:28 - INFO - codeparrot_training - Step 27893: {'lr': 0.00020600680241124222, 'samples': 14281728, 'steps': 27893, 
'batch_loss/train': 0.9668789440765977} 12/26/2021 15:26:39 - INFO - codeparrot_training - Step 27894: {'lr': 0.0002059912853476861, 'samples': 14282240, 'steps': 27894, 'batch_loss/train': 1.8593963952735066} 12/26/2021 15:26:51 - INFO - codeparrot_training - Step 27895: {'lr': 0.00020597576845909194, 'samples': 14282752, 'steps': 27895, 'batch_loss/train': 0.8183635259047151} 12/26/2021 15:27:02 - INFO - codeparrot_training - Step 27896: {'lr': 0.00020596025174552147, 'samples': 14283264, 'steps': 27896, 'batch_loss/train': 0.7608779529109597} 12/26/2021 15:27:12 - INFO - codeparrot_training - Step 27897: {'lr': 0.0002059447352070365, 'samples': 14283776, 'steps': 27897, 'batch_loss/train': 0.6460706837242469} 12/26/2021 15:27:27 - INFO - codeparrot_training - Step 27898: {'lr': 0.0002059292188436986, 'samples': 14284288, 'steps': 27898, 'batch_loss/train': 0.7278365539386868} 12/26/2021 15:27:38 - INFO - codeparrot_training - Step 27899: {'lr': 0.00020591370265556948, 'samples': 14284800, 'steps': 27899, 'batch_loss/train': 1.4494325751438737} 12/26/2021 15:27:48 - INFO - codeparrot_training - Step 27900: {'lr': 0.00020589818664271074, 'samples': 14285312, 'steps': 27900, 'batch_loss/train': 0.6938904100097716} 12/26/2021 15:27:59 - INFO - codeparrot_training - Step 27901: {'lr': 0.00020588267080518427, 'samples': 14285824, 'steps': 27901, 'batch_loss/train': 0.5818638566415757} 12/26/2021 15:28:11 - INFO - codeparrot_training - Step 27902: {'lr': 0.0002058671551430516, 'samples': 14286336, 'steps': 27902, 'batch_loss/train': 0.7995214741677046} 12/26/2021 15:28:22 - INFO - codeparrot_training - Step 27903: {'lr': 0.00020585163965637453, 'samples': 14286848, 'steps': 27903, 'batch_loss/train': 0.7696234900504351} 12/26/2021 15:28:32 - INFO - codeparrot_training - Step 27904: {'lr': 0.00020583612434521453, 'samples': 14287360, 'steps': 27904, 'batch_loss/train': 0.7609989326447248} 12/26/2021 15:28:44 - INFO - codeparrot_training - Step 27905: {'lr': 
0.0002058206092096335, 'samples': 14287872, 'steps': 27905, 'batch_loss/train': 0.7406320367008448} 12/26/2021 15:28:55 - INFO - codeparrot_training - Step 27906: {'lr': 0.0002058050942496931, 'samples': 14288384, 'steps': 27906, 'batch_loss/train': 0.775563299190253} 12/26/2021 15:29:06 - INFO - codeparrot_training - Step 27907: {'lr': 0.00020578957946545485, 'samples': 14288896, 'steps': 27907, 'batch_loss/train': 0.6989005198702216} 12/26/2021 15:29:20 - INFO - codeparrot_training - Step 27908: {'lr': 0.00020577406485698064, 'samples': 14289408, 'steps': 27908, 'batch_loss/train': 0.7847793279215693} 12/26/2021 15:29:31 - INFO - codeparrot_training - Step 27909: {'lr': 0.00020575855042433199, 'samples': 14289920, 'steps': 27909, 'batch_loss/train': 0.5959484709892422} 12/26/2021 15:29:41 - INFO - codeparrot_training - Step 27910: {'lr': 0.0002057430361675707, 'samples': 14290432, 'steps': 27910, 'batch_loss/train': 0.8091585021466017} 12/26/2021 15:29:52 - INFO - codeparrot_training - Step 27911: {'lr': 0.00020572752208675827, 'samples': 14290944, 'steps': 27911, 'batch_loss/train': 0.7815408334136009} 12/26/2021 15:30:04 - INFO - codeparrot_training - Step 27912: {'lr': 0.00020571200818195653, 'samples': 14291456, 'steps': 27912, 'batch_loss/train': 0.820476814522408} 12/26/2021 15:30:15 - INFO - codeparrot_training - Step 27913: {'lr': 0.00020569649445322723, 'samples': 14291968, 'steps': 27913, 'batch_loss/train': 0.6732230298221111} 12/26/2021 15:30:25 - INFO - codeparrot_training - Step 27914: {'lr': 0.00020568098090063187, 'samples': 14292480, 'steps': 27914, 'batch_loss/train': 0.7294207386439666} 12/26/2021 15:30:37 - INFO - codeparrot_training - Step 27915: {'lr': 0.0002056654675242321, 'samples': 14292992, 'steps': 27915, 'batch_loss/train': 0.7049047295004129} 12/26/2021 15:30:48 - INFO - codeparrot_training - Step 27916: {'lr': 0.00020564995432408983, 'samples': 14293504, 'steps': 27916, 'batch_loss/train': 0.7792744701728225} 12/26/2021 15:30:59 - 
INFO - codeparrot_training - Step 27917: {'lr': 0.0002056344413002666, 'samples': 14294016, 'steps': 27917, 'batch_loss/train': 0.7969039333984256} 12/26/2021 15:31:12 - INFO - codeparrot_training - Step 27918: {'lr': 0.00020561892845282398, 'samples': 14294528, 'steps': 27918, 'batch_loss/train': 0.6457296019652858} 12/26/2021 15:31:23 - INFO - codeparrot_training - Step 27919: {'lr': 0.0002056034157818238, 'samples': 14295040, 'steps': 27919, 'batch_loss/train': 0.7654048809781671} 12/26/2021 15:31:33 - INFO - codeparrot_training - Step 27920: {'lr': 0.0002055879032873277, 'samples': 14295552, 'steps': 27920, 'batch_loss/train': 0.797148042358458} 12/26/2021 15:31:44 - INFO - codeparrot_training - Step 27921: {'lr': 0.0002055723909693973, 'samples': 14296064, 'steps': 27921, 'batch_loss/train': 0.6794094119686633} 12/26/2021 15:31:56 - INFO - codeparrot_training - Step 27922: {'lr': 0.0002055568788280942, 'samples': 14296576, 'steps': 27922, 'batch_loss/train': 0.84367102291435} 12/26/2021 15:32:07 - INFO - codeparrot_training - Step 27923: {'lr': 0.0002055413668634803, 'samples': 14297088, 'steps': 27923, 'batch_loss/train': 0.8640804439783096} 12/26/2021 15:32:17 - INFO - codeparrot_training - Step 27924: {'lr': 0.00020552585507561716, 'samples': 14297600, 'steps': 27924, 'batch_loss/train': 0.7331417555687949} 12/26/2021 15:32:29 - INFO - codeparrot_training - Step 27925: {'lr': 0.00020551034346456633, 'samples': 14298112, 'steps': 27925, 'batch_loss/train': 0.7884908285923302} 12/26/2021 15:32:40 - INFO - codeparrot_training - Step 27926: {'lr': 0.00020549483203038957, 'samples': 14298624, 'steps': 27926, 'batch_loss/train': 0.7967120971879922} 12/26/2021 15:32:51 - INFO - codeparrot_training - Step 27927: {'lr': 0.00020547932077314866, 'samples': 14299136, 'steps': 27927, 'batch_loss/train': 0.7463819850236177} 12/26/2021 15:33:05 - INFO - codeparrot_training - Step 27928: {'lr': 0.00020546380969290506, 'samples': 14299648, 'steps': 27928, 
'batch_loss/train': 0.6924917682772502} 12/26/2021 15:33:15 - INFO - codeparrot_training - Step 27929: {'lr': 0.0002054482987897205, 'samples': 14300160, 'steps': 27929, 'batch_loss/train': 0.790398271754384} 12/26/2021 15:33:26 - INFO - codeparrot_training - Step 27930: {'lr': 0.00020543278806365683, 'samples': 14300672, 'steps': 27930, 'batch_loss/train': 0.6757573913782835} 12/26/2021 15:33:36 - INFO - codeparrot_training - Step 27931: {'lr': 0.00020541727751477548, 'samples': 14301184, 'steps': 27931, 'batch_loss/train': 0.8531942707486451} 12/26/2021 15:33:49 - INFO - codeparrot_training - Step 27932: {'lr': 0.00020540176714313826, 'samples': 14301696, 'steps': 27932, 'batch_loss/train': 0.797578286495991} 12/26/2021 15:33:59 - INFO - codeparrot_training - Step 27933: {'lr': 0.00020538625694880666, 'samples': 14302208, 'steps': 27933, 'batch_loss/train': 0.7522048987448215} 12/26/2021 15:34:10 - INFO - codeparrot_training - Step 27934: {'lr': 0.00020537074693184257, 'samples': 14302720, 'steps': 27934, 'batch_loss/train': 1.0416754391044378} 12/26/2021 15:34:22 - INFO - codeparrot_training - Step 27935: {'lr': 0.00020535523709230749, 'samples': 14303232, 'steps': 27935, 'batch_loss/train': 1.0006388374604285} 12/26/2021 15:34:33 - INFO - codeparrot_training - Step 27936: {'lr': 0.00020533972743026318, 'samples': 14303744, 'steps': 27936, 'batch_loss/train': 0.81527831684798} 12/26/2021 15:34:44 - INFO - codeparrot_training - Step 27937: {'lr': 0.00020532421794577113, 'samples': 14304256, 'steps': 27937, 'batch_loss/train': 0.8230981675442308} 12/26/2021 15:34:58 - INFO - codeparrot_training - Step 27938: {'lr': 0.0002053087086388932, 'samples': 14304768, 'steps': 27938, 'batch_loss/train': 1.2210607873275876} 12/26/2021 15:35:08 - INFO - codeparrot_training - Step 27939: {'lr': 0.000205293199509691, 'samples': 14305280, 'steps': 27939, 'batch_loss/train': 0.8245847085490823} 12/26/2021 15:35:19 - INFO - codeparrot_training - Step 27940: {'lr': 
0.00020527769055822613, 'samples': 14305792, 'steps': 27940, 'batch_loss/train': 0.7634757356718183} 12/26/2021 15:35:29 - INFO - codeparrot_training - Step 27941: {'lr': 0.0002052621817845602, 'samples': 14306304, 'steps': 27941, 'batch_loss/train': 0.860941544175148} 12/26/2021 15:35:42 - INFO - codeparrot_training - Step 27942: {'lr': 0.00020524667318875506, 'samples': 14306816, 'steps': 27942, 'batch_loss/train': 0.7919116886332631} 12/26/2021 15:35:52 - INFO - codeparrot_training - Step 27943: {'lr': 0.0002052311647708722, 'samples': 14307328, 'steps': 27943, 'batch_loss/train': 0.7624813984148204} 12/26/2021 15:36:03 - INFO - codeparrot_training - Step 27944: {'lr': 0.0002052156565309733, 'samples': 14307840, 'steps': 27944, 'batch_loss/train': 0.7583670066669583} 12/26/2021 15:36:15 - INFO - codeparrot_training - Step 27945: {'lr': 0.00020520014846912006, 'samples': 14308352, 'steps': 27945, 'batch_loss/train': 0.7869910513982177} 12/26/2021 15:36:26 - INFO - codeparrot_training - Step 27946: {'lr': 0.00020518464058537415, 'samples': 14308864, 'steps': 27946, 'batch_loss/train': 0.7823977824300528} 12/26/2021 15:36:36 - INFO - codeparrot_training - Step 27947: {'lr': 0.00020516913287979717, 'samples': 14309376, 'steps': 27947, 'batch_loss/train': 0.786237149965018} 12/26/2021 15:36:50 - INFO - codeparrot_training - Step 27948: {'lr': 0.0002051536253524507, 'samples': 14309888, 'steps': 27948, 'batch_loss/train': 0.6869414958637208} 12/26/2021 15:37:01 - INFO - codeparrot_training - Step 27949: {'lr': 0.0002051381180033966, 'samples': 14310400, 'steps': 27949, 'batch_loss/train': 0.8346917051821947} 12/26/2021 15:37:11 - INFO - codeparrot_training - Step 27950: {'lr': 0.00020512261083269635, 'samples': 14310912, 'steps': 27950, 'batch_loss/train': 0.8603220535442233} 12/26/2021 15:37:22 - INFO - codeparrot_training - Step 27951: {'lr': 0.0002051071038404117, 'samples': 14311424, 'steps': 27951, 'batch_loss/train': 0.7615056489594281} 12/26/2021 15:37:34 - 
INFO - codeparrot_training - Step 27952: {'lr': 0.00020509159702660417, 'samples': 14311936, 'steps': 27952, 'batch_loss/train': 1.27291239798069} 12/26/2021 15:37:45 - INFO - codeparrot_training - Step 27953: {'lr': 0.00020507609039133558, 'samples': 14312448, 'steps': 27953, 'batch_loss/train': 0.764736601151526} 12/26/2021 15:37:55 - INFO - codeparrot_training - Step 27954: {'lr': 0.00020506058393466746, 'samples': 14312960, 'steps': 27954, 'batch_loss/train': 0.8342042565345764} 12/26/2021 15:38:09 - INFO - codeparrot_training - Step 27955: {'lr': 0.00020504507765666143, 'samples': 14313472, 'steps': 27955, 'batch_loss/train': 0.7591792729217559} 12/26/2021 15:38:20 - INFO - codeparrot_training - Step 27956: {'lr': 0.00020502957155737927, 'samples': 14313984, 'steps': 27956, 'batch_loss/train': 0.664636459609028} 12/26/2021 15:38:30 - INFO - codeparrot_training - Step 27957: {'lr': 0.00020501406563688254, 'samples': 14314496, 'steps': 27957, 'batch_loss/train': 0.7947249296121299} 12/26/2021 15:38:43 - INFO - codeparrot_training - Step 27958: {'lr': 0.00020499855989523295, 'samples': 14315008, 'steps': 27958, 'batch_loss/train': 0.888318716082722} 12/26/2021 15:38:54 - INFO - codeparrot_training - Step 27959: {'lr': 0.00020498305433249198, 'samples': 14315520, 'steps': 27959, 'batch_loss/train': 0.8140057679265738} 12/26/2021 15:39:04 - INFO - codeparrot_training - Step 27960: {'lr': 0.00020496754894872143, 'samples': 14316032, 'steps': 27960, 'batch_loss/train': 0.819312178529799} 12/26/2021 15:39:15 - INFO - codeparrot_training - Step 27961: {'lr': 0.00020495204374398296, 'samples': 14316544, 'steps': 27961, 'batch_loss/train': 0.7707278998568654} 12/26/2021 15:39:27 - INFO - codeparrot_training - Step 27962: {'lr': 0.00020493653871833819, 'samples': 14317056, 'steps': 27962, 'batch_loss/train': 0.7655764585360885} 12/26/2021 15:39:38 - INFO - codeparrot_training - Step 27963: {'lr': 0.00020492103387184858, 'samples': 14317568, 'steps': 27963, 
'batch_loss/train': 0.7991070812568069} 12/26/2021 15:39:48 - INFO - codeparrot_training - Step 27964: {'lr': 0.00020490552920457596, 'samples': 14318080, 'steps': 27964, 'batch_loss/train': 0.7566548241302371} 12/26/2021 15:40:00 - INFO - codeparrot_training - Step 27965: {'lr': 0.000204890024716582, 'samples': 14318592, 'steps': 27965, 'batch_loss/train': 0.6628879094496369} 12/26/2021 15:40:11 - INFO - codeparrot_training - Step 27966: {'lr': 0.00020487452040792814, 'samples': 14319104, 'steps': 27966, 'batch_loss/train': 0.7844561086967587} 12/26/2021 15:40:21 - INFO - codeparrot_training - Step 27967: {'lr': 0.00020485901627867627, 'samples': 14319616, 'steps': 27967, 'batch_loss/train': 0.7373288301751018} 12/26/2021 15:40:35 - INFO - codeparrot_training - Step 27968: {'lr': 0.0002048435123288879, 'samples': 14320128, 'steps': 27968, 'batch_loss/train': 0.9023939273320138} 12/26/2021 15:40:46 - INFO - codeparrot_training - Step 27969: {'lr': 0.00020482800855862467, 'samples': 14320640, 'steps': 27969, 'batch_loss/train': 0.7544047082774341} 12/26/2021 15:40:57 - INFO - codeparrot_training - Step 27970: {'lr': 0.00020481250496794814, 'samples': 14321152, 'steps': 27970, 'batch_loss/train': 0.6515917931683362} 12/26/2021 15:41:07 - INFO - codeparrot_training - Step 27971: {'lr': 0.00020479700155692006, 'samples': 14321664, 'steps': 27971, 'batch_loss/train': 0.8263399321585894} 12/26/2021 15:41:19 - INFO - codeparrot_training - Step 27972: {'lr': 0.00020478149832560213, 'samples': 14322176, 'steps': 27972, 'batch_loss/train': 0.8293336797505617} 12/26/2021 15:41:30 - INFO - codeparrot_training - Step 27973: {'lr': 0.0002047659952740558, 'samples': 14322688, 'steps': 27973, 'batch_loss/train': 0.7778996434062719} 12/26/2021 15:41:41 - INFO - codeparrot_training - Step 27974: {'lr': 0.00020475049240234277, 'samples': 14323200, 'steps': 27974, 'batch_loss/train': 0.7546304506249726} 12/26/2021 15:41:54 - INFO - codeparrot_training - Step 27975: {'lr': 
0.00020473498971052475, 'samples': 14323712, 'steps': 27975, 'batch_loss/train': 0.8114886367693543} 12/26/2021 15:42:05 - INFO - codeparrot_training - Step 27976: {'lr': 0.00020471948719866332, 'samples': 14324224, 'steps': 27976, 'batch_loss/train': 0.7506611421704292} 12/26/2021 15:42:16 - INFO - codeparrot_training - Step 27977: {'lr': 0.00020470398486682013, 'samples': 14324736, 'steps': 27977, 'batch_loss/train': 0.826446395367384} 12/26/2021 15:42:28 - INFO - codeparrot_training - Step 27978: {'lr': 0.00020468848271505672, 'samples': 14325248, 'steps': 27978, 'batch_loss/train': 0.8175917807966471} 12/26/2021 15:42:38 - INFO - codeparrot_training - Step 27979: {'lr': 0.00020467298074343488, 'samples': 14325760, 'steps': 27979, 'batch_loss/train': 0.8223967798985541} 12/26/2021 15:42:49 - INFO - codeparrot_training - Step 27980: {'lr': 0.00020465747895201608, 'samples': 14326272, 'steps': 27980, 'batch_loss/train': 0.8042724328115582} 12/26/2021 15:43:02 - INFO - codeparrot_training - Step 27981: {'lr': 0.00020464197734086198, 'samples': 14326784, 'steps': 27981, 'batch_loss/train': 0.701460259500891} 12/26/2021 15:43:12 - INFO - codeparrot_training - Step 27982: {'lr': 0.00020462647591003438, 'samples': 14327296, 'steps': 27982, 'batch_loss/train': 0.805311057716608} 12/26/2021 15:43:23 - INFO - codeparrot_training - Step 27983: {'lr': 0.0002046109746595947, 'samples': 14327808, 'steps': 27983, 'batch_loss/train': 0.7892286572605371} 12/26/2021 15:43:33 - INFO - codeparrot_training - Step 27984: {'lr': 0.00020459547358960472, 'samples': 14328320, 'steps': 27984, 'batch_loss/train': 1.1685624024830759} 12/26/2021 15:43:47 - INFO - codeparrot_training - Step 27985: {'lr': 0.0002045799727001259, 'samples': 14328832, 'steps': 27985, 'batch_loss/train': 0.7646554252132773} 12/26/2021 15:43:58 - INFO - codeparrot_training - Step 27986: {'lr': 0.00020456447199122, 'samples': 14329344, 'steps': 27986, 'batch_loss/train': 0.80214180983603} 12/26/2021 15:44:08 - INFO 
- codeparrot_training - Step 27987: {'lr': 0.00020454897146294864, 'samples': 14329856, 'steps': 27987, 'batch_loss/train': 0.7365728756412864} 12/26/2021 15:44:21 - INFO - codeparrot_training - Step 27988: {'lr': 0.00020453347111537343, 'samples': 14330368, 'steps': 27988, 'batch_loss/train': 0.7360031111165881} 12/26/2021 15:44:31 - INFO - codeparrot_training - Step 27989: {'lr': 0.00020451797094855583, 'samples': 14330880, 'steps': 27989, 'batch_loss/train': 0.8115032012574375} 12/26/2021 15:44:42 - INFO - codeparrot_training - Step 27990: {'lr': 0.0002045024709625577, 'samples': 14331392, 'steps': 27990, 'batch_loss/train': 0.7742088809609413} 12/26/2021 15:44:52 - INFO - codeparrot_training - Step 27991: {'lr': 0.00020448697115744058, 'samples': 14331904, 'steps': 27991, 'batch_loss/train': 0.7269682651385665} 12/26/2021 15:45:05 - INFO - codeparrot_training - Step 27992: {'lr': 0.000204471471533266, 'samples': 14332416, 'steps': 27992, 'batch_loss/train': 0.8224097620695829} 12/26/2021 15:45:15 - INFO - codeparrot_training - Step 27993: {'lr': 0.0002044559720900957, 'samples': 14332928, 'steps': 27993, 'batch_loss/train': 0.8634993000887334} 12/26/2021 15:45:26 - INFO - codeparrot_training - Step 27994: {'lr': 0.00020444047282799132, 'samples': 14333440, 'steps': 27994, 'batch_loss/train': 0.7665125583298504} 12/26/2021 15:45:40 - INFO - codeparrot_training - Step 27995: {'lr': 0.00020442497374701434, 'samples': 14333952, 'steps': 27995, 'batch_loss/train': 0.779724758467637} 12/26/2021 15:45:51 - INFO - codeparrot_training - Step 27996: {'lr': 0.00020440947484722644, 'samples': 14334464, 'steps': 27996, 'batch_loss/train': 0.943311708047986} 12/26/2021 15:46:01 - INFO - codeparrot_training - Step 27997: {'lr': 0.00020439397612868936, 'samples': 14334976, 'steps': 27997, 'batch_loss/train': 0.9169247774407268} 12/26/2021 15:46:13 - INFO - codeparrot_training - Step 27998: {'lr': 0.00020437847759146455, 'samples': 14335488, 'steps': 27998, 'batch_loss/train': 
0.7836429523304105} 12/26/2021 15:46:24 - INFO - codeparrot_training - Step 27999: {'lr': 0.0002043629792356137, 'samples': 14336000, 'steps': 27999, 'batch_loss/train': 0.7627217760309577} 12/26/2021 15:46:35 - INFO - codeparrot_training - Step 28000: {'lr': 0.00020434748106119832, 'samples': 14336512, 'steps': 28000, 'batch_loss/train': 0.7483104458078742} 12/26/2021 15:46:47 - INFO - codeparrot_training - Step 28001: {'lr': 0.00020433198306828024, 'samples': 14337024, 'steps': 28001, 'batch_loss/train': 0.8215581336989999} 12/26/2021 15:46:57 - INFO - codeparrot_training - Step 28002: {'lr': 0.0002043164852569209, 'samples': 14337536, 'steps': 28002, 'batch_loss/train': 0.7414274425245821} 12/26/2021 15:47:08 - INFO - codeparrot_training - Step 28003: {'lr': 0.00020430098762718192, 'samples': 14338048, 'steps': 28003, 'batch_loss/train': 0.7563275219872594} 12/26/2021 15:47:19 - INFO - codeparrot_training - Step 28004: {'lr': 0.00020428549017912503, 'samples': 14338560, 'steps': 28004, 'batch_loss/train': 0.8049240168184042} 12/26/2021 15:47:31 - INFO - codeparrot_training - Step 28005: {'lr': 0.00020426999291281176, 'samples': 14339072, 'steps': 28005, 'batch_loss/train': 0.7017099398653954} 12/26/2021 15:47:41 - INFO - codeparrot_training - Step 28006: {'lr': 0.00020425449582830375, 'samples': 14339584, 'steps': 28006, 'batch_loss/train': 1.5528801255859435} 12/26/2021 15:47:52 - INFO - codeparrot_training - Step 28007: {'lr': 0.0002042389989256625, 'samples': 14340096, 'steps': 28007, 'batch_loss/train': 0.7387940087355673} 12/26/2021 15:48:06 - INFO - codeparrot_training - Step 28008: {'lr': 0.00020422350220494984, 'samples': 14340608, 'steps': 28008, 'batch_loss/train': 0.8102814999874681} 12/26/2021 15:48:16 - INFO - codeparrot_training - Step 28009: {'lr': 0.00020420800566622717, 'samples': 14341120, 'steps': 28009, 'batch_loss/train': 0.8417176785878837} 12/26/2021 15:48:27 - INFO - codeparrot_training - Step 28010: {'lr': 0.00020419250930955624, 
'samples': 14341632, 'steps': 28010, 'batch_loss/train': 0.7542319446802139} 12/26/2021 15:48:40 - INFO - codeparrot_training - Step 28011: {'lr': 0.00020417701313499848, 'samples': 14342144, 'steps': 28011, 'batch_loss/train': 0.7420757371000946} 12/26/2021 15:48:50 - INFO - codeparrot_training - Step 28012: {'lr': 0.00020416151714261566, 'samples': 14342656, 'steps': 28012, 'batch_loss/train': 0.7122769613051787} 12/26/2021 15:49:01 - INFO - codeparrot_training - Step 28013: {'lr': 0.00020414602133246942, 'samples': 14343168, 'steps': 28013, 'batch_loss/train': 0.9612216836540028} 12/26/2021 15:49:11 - INFO - codeparrot_training - Step 28014: {'lr': 0.00020413052570462126, 'samples': 14343680, 'steps': 28014, 'batch_loss/train': 0.8852046057581902} 12/26/2021 15:49:25 - INFO - codeparrot_training - Step 28015: {'lr': 0.00020411503025913273, 'samples': 14344192, 'steps': 28015, 'batch_loss/train': 0.7754485998302698} 12/26/2021 15:49:36 - INFO - codeparrot_training - Step 28016: {'lr': 0.00020409953499606553, 'samples': 14344704, 'steps': 28016, 'batch_loss/train': 0.745221380610019} 12/26/2021 15:49:47 - INFO - codeparrot_training - Step 28017: {'lr': 0.00020408403991548137, 'samples': 14345216, 'steps': 28017, 'batch_loss/train': 0.780676809605211} 12/26/2021 15:49:59 - INFO - codeparrot_training - Step 28018: {'lr': 0.00020406854501744154, 'samples': 14345728, 'steps': 28018, 'batch_loss/train': 0.7579767219722271} 12/26/2021 15:50:09 - INFO - codeparrot_training - Step 28019: {'lr': 0.00020405305030200794, 'samples': 14346240, 'steps': 28019, 'batch_loss/train': 0.8376777470111847} 12/26/2021 15:50:20 - INFO - codeparrot_training - Step 28020: {'lr': 0.0002040375557692421, 'samples': 14346752, 'steps': 28020, 'batch_loss/train': 0.7896686522290111} 12/26/2021 15:50:32 - INFO - codeparrot_training - Step 28021: {'lr': 0.00020402206141920554, 'samples': 14347264, 'steps': 28021, 'batch_loss/train': 0.718835195992142} 12/26/2021 15:50:43 - INFO - 
codeparrot_training - Step 28022: {'lr': 0.00020400656725195983, 'samples': 14347776, 'steps': 28022, 'batch_loss/train': 0.7979372562840581} 12/26/2021 15:50:53 - INFO - codeparrot_training - Step 28023: {'lr': 0.00020399107326756678, 'samples': 14348288, 'steps': 28023, 'batch_loss/train': 0.7566459542140365} 12/26/2021 15:51:04 - INFO - codeparrot_training - Step 28024: {'lr': 0.0002039755794660878, 'samples': 14348800, 'steps': 28024, 'batch_loss/train': 0.8020465085282922} 12/26/2021 15:51:18 - INFO - codeparrot_training - Step 28025: {'lr': 0.0002039600858475845, 'samples': 14349312, 'steps': 28025, 'batch_loss/train': 0.7895832816138864} 12/26/2021 15:51:28 - INFO - codeparrot_training - Step 28026: {'lr': 0.0002039445924121185, 'samples': 14349824, 'steps': 28026, 'batch_loss/train': 0.8079499078448862} 12/26/2021 15:51:39 - INFO - codeparrot_training - Step 28027: {'lr': 0.0002039290991597515, 'samples': 14350336, 'steps': 28027, 'batch_loss/train': 0.7855296574998647} 12/26/2021 15:51:51 - INFO - codeparrot_training - Step 28028: {'lr': 0.00020391360609054492, 'samples': 14350848, 'steps': 28028, 'batch_loss/train': 0.8357692873105407} 12/26/2021 15:52:01 - INFO - codeparrot_training - Step 28029: {'lr': 0.0002038981132045604, 'samples': 14351360, 'steps': 28029, 'batch_loss/train': 0.7458403753116727} 12/26/2021 15:52:12 - INFO - codeparrot_training - Step 28030: {'lr': 0.00020388262050185973, 'samples': 14351872, 'steps': 28030, 'batch_loss/train': 0.7299795777653344} 12/26/2021 15:52:25 - INFO - codeparrot_training - Step 28031: {'lr': 0.00020386712798250425, 'samples': 14352384, 'steps': 28031, 'batch_loss/train': 0.8194664185866714} 12/26/2021 15:52:36 - INFO - codeparrot_training - Step 28032: {'lr': 0.00020385163564655575, 'samples': 14352896, 'steps': 28032, 'batch_loss/train': 0.717535519041121} 12/26/2021 15:52:47 - INFO - codeparrot_training - Step 28033: {'lr': 0.00020383614349407553, 'samples': 14353408, 'steps': 28033, 'batch_loss/train': 
1.2321873512119055} 12/26/2021 15:52:57 - INFO - codeparrot_training - Step 28034: {'lr': 0.00020382065152512555, 'samples': 14353920, 'steps': 28034, 'batch_loss/train': 0.8310663597658277} 12/26/2021 15:53:09 - INFO - codeparrot_training - Step 28035: {'lr': 0.00020380515973976717, 'samples': 14354432, 'steps': 28035, 'batch_loss/train': 0.8150992994196713} 12/26/2021 15:53:20 - INFO - codeparrot_training - Step 28036: {'lr': 0.00020378966813806206, 'samples': 14354944, 'steps': 28036, 'batch_loss/train': 0.7302363156341016} 12/26/2021 15:53:30 - INFO - codeparrot_training - Step 28037: {'lr': 0.00020377417672007165, 'samples': 14355456, 'steps': 28037, 'batch_loss/train': 0.8116129012778401} 12/26/2021 15:53:42 - INFO - codeparrot_training - Step 28038: {'lr': 0.00020375868548585773, 'samples': 14355968, 'steps': 28038, 'batch_loss/train': 0.7429215325973928} 12/26/2021 15:53:53 - INFO - codeparrot_training - Step 28039: {'lr': 0.0002037431944354819, 'samples': 14356480, 'steps': 28039, 'batch_loss/train': 0.7639273041859269} 12/26/2021 15:54:04 - INFO - codeparrot_training - Step 28040: {'lr': 0.00020372770356900553, 'samples': 14356992, 'steps': 28040, 'batch_loss/train': 0.7079367088153958} 12/26/2021 15:54:16 - INFO - codeparrot_training - Step 28041: {'lr': 0.00020371221288649038, 'samples': 14357504, 'steps': 28041, 'batch_loss/train': 0.7276076397392899} 12/26/2021 15:54:27 - INFO - codeparrot_training - Step 28042: {'lr': 0.0002036967223879981, 'samples': 14358016, 'steps': 28042, 'batch_loss/train': 0.8766822852194309} 12/26/2021 15:54:37 - INFO - codeparrot_training - Step 28043: {'lr': 0.0002036812320735901, 'samples': 14358528, 'steps': 28043, 'batch_loss/train': 0.7773112878203392} 12/26/2021 15:54:48 - INFO - codeparrot_training - Step 28044: {'lr': 0.00020366574194332797, 'samples': 14359040, 'steps': 28044, 'batch_loss/train': 0.7951172590255737} 12/26/2021 15:55:02 - INFO - codeparrot_training - Step 28045: {'lr': 0.00020365025199727338, 
'samples': 14359552, 'steps': 28045, 'batch_loss/train': 0.8112081000581384} 12/26/2021 15:55:13 - INFO - codeparrot_training - Step 28046: {'lr': 0.00020363476223548795, 'samples': 14360064, 'steps': 28046, 'batch_loss/train': 0.8604815565049648} 12/26/2021 15:55:23 - INFO - codeparrot_training - Step 28047: {'lr': 0.00020361927265803316, 'samples': 14360576, 'steps': 28047, 'batch_loss/train': 0.8228194816038013} 12/26/2021 15:55:35 - INFO - codeparrot_training - Step 28048: {'lr': 0.00020360378326497055, 'samples': 14361088, 'steps': 28048, 'batch_loss/train': 0.6522153890691698} 12/26/2021 15:55:46 - INFO - codeparrot_training - Step 28049: {'lr': 0.00020358829405636193, 'samples': 14361600, 'steps': 28049, 'batch_loss/train': 0.8837688816711307} 12/26/2021 15:55:56 - INFO - codeparrot_training - Step 28050: {'lr': 0.0002035728050322687, 'samples': 14362112, 'steps': 28050, 'batch_loss/train': 0.8246542541310191} 12/26/2021 15:56:09 - INFO - codeparrot_training - Step 28051: {'lr': 0.00020355731619275242, 'samples': 14362624, 'steps': 28051, 'batch_loss/train': 0.8487098021432757} 12/26/2021 15:56:20 - INFO - codeparrot_training - Step 28052: {'lr': 0.00020354182753787466, 'samples': 14363136, 'steps': 28052, 'batch_loss/train': 0.7632872983813286} 12/26/2021 15:56:31 - INFO - codeparrot_training - Step 28053: {'lr': 0.00020352633906769714, 'samples': 14363648, 'steps': 28053, 'batch_loss/train': 0.7529297778382897} 12/26/2021 15:56:41 - INFO - codeparrot_training - Step 28054: {'lr': 0.00020351085078228133, 'samples': 14364160, 'steps': 28054, 'batch_loss/train': 1.3938939645886421} 12/26/2021 15:56:55 - INFO - codeparrot_training - Step 28055: {'lr': 0.00020349536268168873, 'samples': 14364672, 'steps': 28055, 'batch_loss/train': 0.7077732650795951} 12/26/2021 15:57:06 - INFO - codeparrot_training - Step 28056: {'lr': 0.00020347987476598115, 'samples': 14365184, 'steps': 28056, 'batch_loss/train': 1.0120750768110156} 12/26/2021 15:57:16 - INFO - 
codeparrot_training - Step 28057: {'lr': 0.00020346438703521994, 'samples': 14365696, 'steps': 28057, 'batch_loss/train': 0.8026904882863164} 12/26/2021 15:57:28 - INFO - codeparrot_training - Step 28058: {'lr': 0.00020344889948946683, 'samples': 14366208, 'steps': 28058, 'batch_loss/train': 0.7458560192026198} 12/26/2021 15:57:39 - INFO - codeparrot_training - Step 28059: {'lr': 0.00020343341212878318, 'samples': 14366720, 'steps': 28059, 'batch_loss/train': 0.795136830303818} 12/26/2021 15:57:50 - INFO - codeparrot_training - Step 28060: {'lr': 0.00020341792495323085, 'samples': 14367232, 'steps': 28060, 'batch_loss/train': 0.893522179685533} 12/26/2021 15:58:00 - INFO - codeparrot_training - Step 28061: {'lr': 0.0002034024379628712, 'samples': 14367744, 'steps': 28061, 'batch_loss/train': 0.8011572491377592} 12/26/2021 15:58:14 - INFO - codeparrot_training - Step 28062: {'lr': 0.0002033869511577659, 'samples': 14368256, 'steps': 28062, 'batch_loss/train': 0.7613540170714259} 12/26/2021 15:58:25 - INFO - codeparrot_training - Step 28063: {'lr': 0.00020337146453797638, 'samples': 14368768, 'steps': 28063, 'batch_loss/train': 0.8288173694163561} 12/26/2021 15:58:35 - INFO - codeparrot_training - Step 28064: {'lr': 0.00020335597810356438, 'samples': 14369280, 'steps': 28064, 'batch_loss/train': 0.8339309031143785} 12/26/2021 15:58:48 - INFO - codeparrot_training - Step 28065: {'lr': 0.00020334049185459144, 'samples': 14369792, 'steps': 28065, 'batch_loss/train': 0.6911505851894617} 12/26/2021 15:58:58 - INFO - codeparrot_training - Step 28066: {'lr': 0.00020332500579111896, 'samples': 14370304, 'steps': 28066, 'batch_loss/train': 0.8614558910485357} 12/26/2021 15:59:09 - INFO - codeparrot_training - Step 28067: {'lr': 0.00020330951991320872, 'samples': 14370816, 'steps': 28067, 'batch_loss/train': 0.7706584604457021} 12/26/2021 15:59:21 - INFO - codeparrot_training - Step 28068: {'lr': 0.00020329403422092223, 'samples': 14371328, 'steps': 28068, 'batch_loss/train': 
0.7592059838352725} 12/26/2021 15:59:32 - INFO - codeparrot_training - Step 28069: {'lr': 0.00020327854871432107, 'samples': 14371840, 'steps': 28069, 'batch_loss/train': 0.7791324098943733} 12/26/2021 15:59:42 - INFO - codeparrot_training - Step 28070: {'lr': 0.00020326306339346655, 'samples': 14372352, 'steps': 28070, 'batch_loss/train': 0.8439992787316442} 12/26/2021 15:59:57 - INFO - codeparrot_training - Step 28071: {'lr': 0.0002032475782584206, 'samples': 14372864, 'steps': 28071, 'batch_loss/train': 0.7752312896773219} 12/26/2021 16:00:07 - INFO - codeparrot_training - Step 28072: {'lr': 0.00020323209330924463, 'samples': 14373376, 'steps': 28072, 'batch_loss/train': 0.7875804626382887} 12/26/2021 16:00:18 - INFO - codeparrot_training - Step 28073: {'lr': 0.00020321660854600017, 'samples': 14373888, 'steps': 28073, 'batch_loss/train': 1.2900229846127331} 12/26/2021 16:00:28 - INFO - codeparrot_training - Step 28074: {'lr': 0.00020320112396874873, 'samples': 14374400, 'steps': 28074, 'batch_loss/train': 0.7265465902164578} 12/26/2021 16:00:41 - INFO - codeparrot_training - Step 28075: {'lr': 0.00020318563957755207, 'samples': 14374912, 'steps': 28075, 'batch_loss/train': 0.8037894954904914} 12/26/2021 16:00:51 - INFO - codeparrot_training - Step 28076: {'lr': 0.00020317015537247158, 'samples': 14375424, 'steps': 28076, 'batch_loss/train': 0.7204141956754029} 12/26/2021 16:01:02 - INFO - codeparrot_training - Step 28077: {'lr': 0.0002031546713535688, 'samples': 14375936, 'steps': 28077, 'batch_loss/train': 0.8740958031266928} 12/26/2021 16:01:14 - INFO - codeparrot_training - Step 28078: {'lr': 0.00020313918752090548, 'samples': 14376448, 'steps': 28078, 'batch_loss/train': 0.7971820998936892} 12/26/2021 16:01:25 - INFO - codeparrot_training - Step 28079: {'lr': 0.00020312370387454303, 'samples': 14376960, 'steps': 28079, 'batch_loss/train': 0.7801911784335971} 12/26/2021 16:01:35 - INFO - codeparrot_training - Step 28080: {'lr': 0.00020310822041454303, 
'samples': 14377472, 'steps': 28080, 'batch_loss/train': 1.37445754930377} 12/26/2021 16:01:46 - INFO - codeparrot_training - Step 28081: {'lr': 0.00020309273714096696, 'samples': 14377984, 'steps': 28081, 'batch_loss/train': 0.7902945997193456} 12/26/2021 16:01:58 - INFO - codeparrot_training - Step 28082: {'lr': 0.00020307725405387656, 'samples': 14378496, 'steps': 28082, 'batch_loss/train': 0.6719877910800278} 12/26/2021 16:02:09 - INFO - codeparrot_training - Step 28083: {'lr': 0.00020306177115333323, 'samples': 14379008, 'steps': 28083, 'batch_loss/train': 0.7675317549146712} 12/26/2021 16:02:19 - INFO - codeparrot_training - Step 28084: {'lr': 0.00020304628843939863, 'samples': 14379520, 'steps': 28084, 'batch_loss/train': 0.7558521020691842} 12/26/2021 16:02:33 - INFO - codeparrot_training - Step 28085: {'lr': 0.00020303080591213413, 'samples': 14380032, 'steps': 28085, 'batch_loss/train': 0.866883484646678} 12/26/2021 16:02:44 - INFO - codeparrot_training - Step 28086: {'lr': 0.0002030153235716015, 'samples': 14380544, 'steps': 28086, 'batch_loss/train': 0.6886656126007438} 12/26/2021 16:02:55 - INFO - codeparrot_training - Step 28087: {'lr': 0.00020299984141786224, 'samples': 14381056, 'steps': 28087, 'batch_loss/train': 0.761703793425113} 12/26/2021 16:03:07 - INFO - codeparrot_training - Step 28088: {'lr': 0.00020298435945097787, 'samples': 14381568, 'steps': 28088, 'batch_loss/train': 0.7518025012686849} 12/26/2021 16:03:17 - INFO - codeparrot_training - Step 28089: {'lr': 0.0002029688776710098, 'samples': 14382080, 'steps': 28089, 'batch_loss/train': 0.8357833279296756} 12/26/2021 16:03:28 - INFO - codeparrot_training - Step 28090: {'lr': 0.0002029533960780198, 'samples': 14382592, 'steps': 28090, 'batch_loss/train': 0.7605722937732935} 12/26/2021 16:03:42 - INFO - codeparrot_training - Step 28091: {'lr': 0.00020293791467206938, 'samples': 14383104, 'steps': 28091, 'batch_loss/train': 0.7711320538073778} 12/26/2021 16:03:52 - INFO - codeparrot_training 
- Step 28092: {'lr': 0.00020292243345321992, 'samples': 14383616, 'steps': 28092, 'batch_loss/train': 0.6535938326269388} 12/26/2021 16:04:03 - INFO - codeparrot_training - Step 28093: {'lr': 0.00020290695242153315, 'samples': 14384128, 'steps': 28093, 'batch_loss/train': 0.7257642196491361} 12/26/2021 16:04:14 - INFO - codeparrot_training - Step 28094: {'lr': 0.0002028914715770706, 'samples': 14384640, 'steps': 28094, 'batch_loss/train': 0.807749230414629} 12/26/2021 16:04:26 - INFO - codeparrot_training - Step 28095: {'lr': 0.0002028759909198938, 'samples': 14385152, 'steps': 28095, 'batch_loss/train': 0.8330813348293304} 12/26/2021 16:04:36 - INFO - codeparrot_training - Step 28096: {'lr': 0.00020286051045006415, 'samples': 14385664, 'steps': 28096, 'batch_loss/train': 0.7671700939536095} 12/26/2021 16:04:47 - INFO - codeparrot_training - Step 28097: {'lr': 0.00020284503016764337, 'samples': 14386176, 'steps': 28097, 'batch_loss/train': 0.7962335906922817} 12/26/2021 16:05:01 - INFO - codeparrot_training - Step 28098: {'lr': 0.00020282955007269295, 'samples': 14386688, 'steps': 28098, 'batch_loss/train': 0.7080691214650869} 12/26/2021 16:05:12 - INFO - codeparrot_training - Step 28099: {'lr': 0.00020281407016527442, 'samples': 14387200, 'steps': 28099, 'batch_loss/train': 0.7734066862612963} 12/26/2021 16:05:22 - INFO - codeparrot_training - Step 28100: {'lr': 0.00020279859044544925, 'samples': 14387712, 'steps': 28100, 'batch_loss/train': 0.5923100580112077} 12/26/2021 16:05:34 - INFO - codeparrot_training - Step 28101: {'lr': 0.0002027831109132792, 'samples': 14388224, 'steps': 28101, 'batch_loss/train': 0.6833058381453156} 12/26/2021 16:05:45 - INFO - codeparrot_training - Step 28102: {'lr': 0.0002027676315688256, 'samples': 14388736, 'steps': 28102, 'batch_loss/train': 0.7105962289497256} 12/26/2021 16:05:56 - INFO - codeparrot_training - Step 28103: {'lr': 0.00020275215241215004, 'samples': 14389248, 'steps': 28103, 'batch_loss/train': 0.6786678051576018} 
12/26/2021 16:06:08 - INFO - codeparrot_training - Step 28104: {'lr': 0.00020273667344331415, 'samples': 14389760, 'steps': 28104, 'batch_loss/train': 0.7928958083502948} 12/26/2021 16:06:19 - INFO - codeparrot_training - Step 28105: {'lr': 0.00020272119466237942, 'samples': 14390272, 'steps': 28105, 'batch_loss/train': 0.7133950097486377} 12/26/2021 16:06:29 - INFO - codeparrot_training - Step 28106: {'lr': 0.0002027057160694073, 'samples': 14390784, 'steps': 28106, 'batch_loss/train': 0.7143124882131815} 12/26/2021 16:06:40 - INFO - codeparrot_training - Step 28107: {'lr': 0.00020269023766445935, 'samples': 14391296, 'steps': 28107, 'batch_loss/train': 0.8669933080673218} 12/26/2021 16:06:54 - INFO - codeparrot_training - Step 28108: {'lr': 0.00020267475944759725, 'samples': 14391808, 'steps': 28108, 'batch_loss/train': 0.7319311341270804} 12/26/2021 16:07:04 - INFO - codeparrot_training - Step 28109: {'lr': 0.0002026592814188824, 'samples': 14392320, 'steps': 28109, 'batch_loss/train': 0.6840508147142828} 12/26/2021 16:07:15 - INFO - codeparrot_training - Step 28110: {'lr': 0.0002026438035783764, 'samples': 14392832, 'steps': 28110, 'batch_loss/train': 0.7776940576732159} 12/26/2021 16:07:27 - INFO - codeparrot_training - Step 28111: {'lr': 0.0002026283259261407, 'samples': 14393344, 'steps': 28111, 'batch_loss/train': 0.8883359972387552} 12/26/2021 16:07:38 - INFO - codeparrot_training - Step 28112: {'lr': 0.00020261284846223687, 'samples': 14393856, 'steps': 28112, 'batch_loss/train': 0.7616898701526225} 12/26/2021 16:07:49 - INFO - codeparrot_training - Step 28113: {'lr': 0.0002025973711867266, 'samples': 14394368, 'steps': 28113, 'batch_loss/train': 0.6846965766744688} 12/26/2021 16:08:01 - INFO - codeparrot_training - Step 28114: {'lr': 0.00020258189409967108, 'samples': 14394880, 'steps': 28114, 'batch_loss/train': 0.7635707082226872} 12/26/2021 16:08:11 - INFO - codeparrot_training - Step 28115: {'lr': 0.00020256641720113223, 'samples': 14395392, 'steps': 
28115, 'batch_loss/train': 0.5354825129033998} 12/26/2021 16:08:22 - INFO - codeparrot_training - Step 28116: {'lr': 0.00020255094049117131, 'samples': 14395904, 'steps': 28116, 'batch_loss/train': 0.6455010959180072} 12/26/2021 16:08:32 - INFO - codeparrot_training - Step 28117: {'lr': 0.00020253546396985, 'samples': 14396416, 'steps': 28117, 'batch_loss/train': 0.676925893407315} 12/26/2021 16:08:45 - INFO - codeparrot_training - Step 28118: {'lr': 0.00020251998763722964, 'samples': 14396928, 'steps': 28118, 'batch_loss/train': 0.8029940063133836} 12/26/2021 16:08:56 - INFO - codeparrot_training - Step 28119: {'lr': 0.00020250451149337196, 'samples': 14397440, 'steps': 28119, 'batch_loss/train': 0.7134903145488352} 12/26/2021 16:09:06 - INFO - codeparrot_training - Step 28120: {'lr': 0.00020248903553833845, 'samples': 14397952, 'steps': 28120, 'batch_loss/train': 0.6651008778717369} 12/26/2021 16:09:20 - INFO - codeparrot_training - Step 28121: {'lr': 0.0002024735597721905, 'samples': 14398464, 'steps': 28121, 'batch_loss/train': 0.7246415207628161} 12/26/2021 16:09:31 - INFO - codeparrot_training - Step 28122: {'lr': 0.00020245808419498974, 'samples': 14398976, 'steps': 28122, 'batch_loss/train': 0.7169730132445693} 12/26/2021 16:09:41 - INFO - codeparrot_training - Step 28123: {'lr': 0.00020244260880679778, 'samples': 14399488, 'steps': 28123, 'batch_loss/train': 0.9488050248473883} 12/26/2021 16:09:54 - INFO - codeparrot_training - Step 28124: {'lr': 0.00020242713360767603, 'samples': 14400000, 'steps': 28124, 'batch_loss/train': 0.7717329822480679} 12/26/2021 16:10:04 - INFO - codeparrot_training - Step 28125: {'lr': 0.000202411658597686, 'samples': 14400512, 'steps': 28125, 'batch_loss/train': 0.4792600699001923} 12/26/2021 16:10:15 - INFO - codeparrot_training - Step 28126: {'lr': 0.00020239618377688917, 'samples': 14401024, 'steps': 28126, 'batch_loss/train': 0.7979043857194483} 12/26/2021 16:10:26 - INFO - codeparrot_training - Step 28127: {'lr': 
0.00020238070914534726, 'samples': 14401536, 'steps': 28127, 'batch_loss/train': 0.8391327313147485} 12/26/2021 16:10:38 - INFO - codeparrot_training - Step 28128: {'lr': 0.00020236523470312163, 'samples': 14402048, 'steps': 28128, 'batch_loss/train': 0.771250780671835} 12/26/2021 16:10:48 - INFO - codeparrot_training - Step 28129: {'lr': 0.00020234976045027377, 'samples': 14402560, 'steps': 28129, 'batch_loss/train': 0.9993329821154475} 12/26/2021 16:10:59 - INFO - codeparrot_training - Step 28130: {'lr': 0.00020233428638686534, 'samples': 14403072, 'steps': 28130, 'batch_loss/train': 0.7363728191703558} 12/26/2021 16:11:13 - INFO - codeparrot_training - Step 28131: {'lr': 0.00020231881251295782, 'samples': 14403584, 'steps': 28131, 'batch_loss/train': 0.7248766757547855} 12/26/2021 16:11:23 - INFO - codeparrot_training - Step 28132: {'lr': 0.0002023033388286127, 'samples': 14404096, 'steps': 28132, 'batch_loss/train': 0.6808200068771839} 12/26/2021 16:11:34 - INFO - codeparrot_training - Step 28133: {'lr': 0.00020228786533389137, 'samples': 14404608, 'steps': 28133, 'batch_loss/train': 0.6523865330964327} 12/26/2021 16:11:46 - INFO - codeparrot_training - Step 28134: {'lr': 0.00020227239202885562, 'samples': 14405120, 'steps': 28134, 'batch_loss/train': 0.6611377978697419} 12/26/2021 16:11:57 - INFO - codeparrot_training - Step 28135: {'lr': 0.00020225691891356674, 'samples': 14405632, 'steps': 28135, 'batch_loss/train': 0.6561337327584624} 12/26/2021 16:12:07 - INFO - codeparrot_training - Step 28136: {'lr': 0.0002022414459880864, 'samples': 14406144, 'steps': 28136, 'batch_loss/train': 0.8394280932843685} 12/26/2021 16:12:21 - INFO - codeparrot_training - Step 28137: {'lr': 0.00020222597325247592, 'samples': 14406656, 'steps': 28137, 'batch_loss/train': 0.6358731780201197} 12/26/2021 16:12:32 - INFO - codeparrot_training - Step 28138: {'lr': 0.00020221050070679696, 'samples': 14407168, 'steps': 28138, 'batch_loss/train': 0.6892099911347032} 12/26/2021 16:12:42 
- INFO - codeparrot_training - Step 28139: {'lr': 0.00020219502835111108, 'samples': 14407680, 'steps': 28139, 'batch_loss/train': 0.7372373656835407} 12/26/2021 16:12:53 - INFO - codeparrot_training - Step 28140: {'lr': 0.00020217955618547962, 'samples': 14408192, 'steps': 28140, 'batch_loss/train': 0.7898669308051467} 12/26/2021 16:13:05 - INFO - codeparrot_training - Step 28141: {'lr': 0.00020216408420996426, 'samples': 14408704, 'steps': 28141, 'batch_loss/train': 0.7599951447919011} 12/26/2021 16:13:16 - INFO - codeparrot_training - Step 28142: {'lr': 0.00020214861242462642, 'samples': 14409216, 'steps': 28142, 'batch_loss/train': 0.7067931508645415} 12/26/2021 16:13:26 - INFO - codeparrot_training - Step 28143: {'lr': 0.00020213314082952766, 'samples': 14409728, 'steps': 28143, 'batch_loss/train': 0.7090926878154278} 12/26/2021 16:13:38 - INFO - codeparrot_training - Step 28144: {'lr': 0.00020211766942472936, 'samples': 14410240, 'steps': 28144, 'batch_loss/train': 0.7670314777642488} 12/26/2021 16:13:49 - INFO - codeparrot_training - Step 28145: {'lr': 0.0002021021982102932, 'samples': 14410752, 'steps': 28145, 'batch_loss/train': 0.9060358749702573} 12/26/2021 16:14:00 - INFO - codeparrot_training - Step 28146: {'lr': 0.00020208672718628065, 'samples': 14411264, 'steps': 28146, 'batch_loss/train': 0.7181963747134432} 12/26/2021 16:14:15 - INFO - codeparrot_training - Step 28147: {'lr': 0.00020207125635275314, 'samples': 14411776, 'steps': 28147, 'batch_loss/train': 0.781381094828248} 12/26/2021 16:14:25 - INFO - codeparrot_training - Step 28148: {'lr': 0.00020205578570977215, 'samples': 14412288, 'steps': 28148, 'batch_loss/train': 0.7796414820477366} 12/26/2021 16:14:36 - INFO - codeparrot_training - Step 28149: {'lr': 0.00020204031525739935, 'samples': 14412800, 'steps': 28149, 'batch_loss/train': 0.790671662427485} 12/26/2021 16:14:48 - INFO - codeparrot_training - Step 28150: {'lr': 0.00020202484499569617, 'samples': 14413312, 'steps': 28150, 
'batch_loss/train': 0.7525641340762377} 12/26/2021 16:14:59 - INFO - codeparrot_training - Step 28151: {'lr': 0.00020200937492472398, 'samples': 14413824, 'steps': 28151, 'batch_loss/train': 0.6323952130915131} 12/26/2021 16:15:09 - INFO - codeparrot_training - Step 28152: {'lr': 0.00020199390504454446, 'samples': 14414336, 'steps': 28152, 'batch_loss/train': 0.8563869399949908} 12/26/2021 16:15:20 - INFO - codeparrot_training - Step 28153: {'lr': 0.00020197843535521906, 'samples': 14414848, 'steps': 28153, 'batch_loss/train': 0.7783543257974088} 12/26/2021 16:15:34 - INFO - codeparrot_training - Step 28154: {'lr': 0.00020196296585680922, 'samples': 14415360, 'steps': 28154, 'batch_loss/train': 0.6884720282396302} 12/26/2021 16:15:44 - INFO - codeparrot_training - Step 28155: {'lr': 0.0002019474965493765, 'samples': 14415872, 'steps': 28155, 'batch_loss/train': 0.7761835041455925} 12/26/2021 16:15:55 - INFO - codeparrot_training - Step 28156: {'lr': 0.00020193202743298244, 'samples': 14416384, 'steps': 28156, 'batch_loss/train': 0.7577800145372748} 12/26/2021 16:16:07 - INFO - codeparrot_training - Step 28157: {'lr': 0.00020191655850768846, 'samples': 14416896, 'steps': 28157, 'batch_loss/train': 0.7973240772262216} 12/26/2021 16:16:18 - INFO - codeparrot_training - Step 28158: {'lr': 0.00020190108977355612, 'samples': 14417408, 'steps': 28158, 'batch_loss/train': 0.7802397646009922} 12/26/2021 16:16:29 - INFO - codeparrot_training - Step 28159: {'lr': 0.00020188562123064677, 'samples': 14417920, 'steps': 28159, 'batch_loss/train': 0.7143372157588601} 12/26/2021 16:16:41 - INFO - codeparrot_training - Step 28160: {'lr': 0.00020187015287902215, 'samples': 14418432, 'steps': 28160, 'batch_loss/train': 0.7633674871176481} 12/26/2021 16:16:51 - INFO - codeparrot_training - Step 28161: {'lr': 0.00020185468471874357, 'samples': 14418944, 'steps': 28161, 'batch_loss/train': 0.7502046232111752} 12/26/2021 16:17:02 - INFO - codeparrot_training - Step 28162: {'lr': 
0.00020183921674987265, 'samples': 14419456, 'steps': 28162, 'batch_loss/train': 0.7105682254768908} 12/26/2021 16:17:13 - INFO - codeparrot_training - Step 28163: {'lr': 0.00020182374897247072, 'samples': 14419968, 'steps': 28163, 'batch_loss/train': 0.7883646218106151} 12/26/2021 16:17:25 - INFO - codeparrot_training - Step 28164: {'lr': 0.0002018082813865994, 'samples': 14420480, 'steps': 28164, 'batch_loss/train': 0.7272940268740058} 12/26/2021 16:17:35 - INFO - codeparrot_training - Step 28165: {'lr': 0.00020179281399232023, 'samples': 14420992, 'steps': 28165, 'batch_loss/train': 0.7100314563140273} 12/26/2021 16:17:46 - INFO - codeparrot_training - Step 28166: {'lr': 0.0002017773467896945, 'samples': 14421504, 'steps': 28166, 'batch_loss/train': 0.7013611821457744} 12/26/2021 16:18:00 - INFO - codeparrot_training - Step 28167: {'lr': 0.0002017618797787839, 'samples': 14422016, 'steps': 28167, 'batch_loss/train': 0.6884188014082611} 12/26/2021 16:18:10 - INFO - codeparrot_training - Step 28168: {'lr': 0.00020174641295964992, 'samples': 14422528, 'steps': 28168, 'batch_loss/train': 0.8513194564729929} 12/26/2021 16:18:21 - INFO - codeparrot_training - Step 28169: {'lr': 0.00020173094633235396, 'samples': 14423040, 'steps': 28169, 'batch_loss/train': 0.7871040152385831} 12/26/2021 16:18:33 - INFO - codeparrot_training - Step 28170: {'lr': 0.00020171547989695742, 'samples': 14423552, 'steps': 28170, 'batch_loss/train': 0.6998287367168814} 12/26/2021 16:18:44 - INFO - codeparrot_training - Step 28171: {'lr': 0.00020170001365352197, 'samples': 14424064, 'steps': 28171, 'batch_loss/train': 0.7394145056605339} 12/26/2021 16:18:54 - INFO - codeparrot_training - Step 28172: {'lr': 0.00020168454760210907, 'samples': 14424576, 'steps': 28172, 'batch_loss/train': 0.7177937291562557} 12/26/2021 16:19:06 - INFO - codeparrot_training - Step 28173: {'lr': 0.00020166908174278013, 'samples': 14425088, 'steps': 28173, 'batch_loss/train': 0.6798444045707583} 12/26/2021 16:19:17 
- INFO - codeparrot_training - Step 28174: {'lr': 0.00020165361607559659, 'samples': 14425600, 'steps': 28174, 'batch_loss/train': 0.7819133009761572} 12/26/2021 16:19:28 - INFO - codeparrot_training - Step 28175: {'lr': 0.00020163815060062016, 'samples': 14426112, 'steps': 28175, 'batch_loss/train': 0.7078034956939518} 12/26/2021 16:19:38 - INFO - codeparrot_training - Step 28176: {'lr': 0.0002016226853179121, 'samples': 14426624, 'steps': 28176, 'batch_loss/train': 0.7973883198574185} 12/26/2021 16:19:53 - INFO - codeparrot_training - Step 28177: {'lr': 0.0002016072202275339, 'samples': 14427136, 'steps': 28177, 'batch_loss/train': 0.7283846237696707} 12/26/2021 16:20:03 - INFO - codeparrot_training - Step 28178: {'lr': 0.0002015917553295473, 'samples': 14427648, 'steps': 28178, 'batch_loss/train': 0.6114825359545648} 12/26/2021 16:20:14 - INFO - codeparrot_training - Step 28179: {'lr': 0.00020157629062401357, 'samples': 14428160, 'steps': 28179, 'batch_loss/train': 0.7921419683843851} 12/26/2021 16:20:27 - INFO - codeparrot_training - Step 28180: {'lr': 0.00020156082611099413, 'samples': 14428672, 'steps': 28180, 'batch_loss/train': 1.0480195647105575} 12/26/2021 16:20:37 - INFO - codeparrot_training - Step 28181: {'lr': 0.00020154536179055056, 'samples': 14429184, 'steps': 28181, 'batch_loss/train': 0.6133511461084709} 12/26/2021 16:20:48 - INFO - codeparrot_training - Step 28182: {'lr': 0.00020152989766274442, 'samples': 14429696, 'steps': 28182, 'batch_loss/train': 1.670193829573691} 12/26/2021 16:20:58 - INFO - codeparrot_training - Step 28183: {'lr': 0.00020151443372763705, 'samples': 14430208, 'steps': 28183, 'batch_loss/train': 0.7326623535482213} 12/26/2021 16:21:12 - INFO - codeparrot_training - Step 28184: {'lr': 0.00020149896998529005, 'samples': 14430720, 'steps': 28184, 'batch_loss/train': 0.739276465959847} 12/26/2021 16:21:22 - INFO - codeparrot_training - Step 28185: {'lr': 0.00020148350643576468, 'samples': 14431232, 'steps': 28185, 
'batch_loss/train': 0.6381610920652747} 12/26/2021 16:21:33 - INFO - codeparrot_training - Step 28186: {'lr': 0.00020146804307912274, 'samples': 14431744, 'steps': 28186, 'batch_loss/train': 0.6293160546920262} 12/26/2021 16:21:45 - INFO - codeparrot_training - Step 28187: {'lr': 0.00020145257991542544, 'samples': 14432256, 'steps': 28187, 'batch_loss/train': 0.6896253018639982} 12/26/2021 16:21:55 - INFO - codeparrot_training - Step 28188: {'lr': 0.00020143711694473428, 'samples': 14432768, 'steps': 28188, 'batch_loss/train': 0.7160709258168936} 12/26/2021 16:22:06 - INFO - codeparrot_training - Step 28189: {'lr': 0.00020142165416711096, 'samples': 14433280, 'steps': 28189, 'batch_loss/train': 0.7177507225424051} 12/26/2021 16:22:18 - INFO - codeparrot_training - Step 28190: {'lr': 0.00020140619158261674, 'samples': 14433792, 'steps': 28190, 'batch_loss/train': 0.8731233151629567} 12/26/2021 16:22:29 - INFO - codeparrot_training - Step 28191: {'lr': 0.00020139072919131317, 'samples': 14434304, 'steps': 28191, 'batch_loss/train': 0.7771028177812696} 12/26/2021 16:22:39 - INFO - codeparrot_training - Step 28192: {'lr': 0.00020137526699326165, 'samples': 14434816, 'steps': 28192, 'batch_loss/train': 0.7811991553753614} 12/26/2021 16:22:53 - INFO - codeparrot_training - Step 28193: {'lr': 0.00020135980498852375, 'samples': 14435328, 'steps': 28193, 'batch_loss/train': 0.7438433289644308} 12/26/2021 16:23:03 - INFO - codeparrot_training - Step 28194: {'lr': 0.00020134434317716093, 'samples': 14435840, 'steps': 28194, 'batch_loss/train': 0.7184502041200176} 12/26/2021 16:23:14 - INFO - codeparrot_training - Step 28195: {'lr': 0.00020132888155923465, 'samples': 14436352, 'steps': 28195, 'batch_loss/train': 0.7145004980266094} 12/26/2021 16:23:24 - INFO - codeparrot_training - Step 28196: {'lr': 0.00020131342013480626, 'samples': 14436864, 'steps': 28196, 'batch_loss/train': 0.7742519282037392} 12/26/2021 16:23:37 - INFO - codeparrot_training - Step 28197: {'lr': 
0.00020129795890393734, 'samples': 14437376, 'steps': 28197, 'batch_loss/train': 0.7310623368248343} 12/26/2021 16:23:47 - INFO - codeparrot_training - Step 28198: {'lr': 0.00020128249786668942, 'samples': 14437888, 'steps': 28198, 'batch_loss/train': 0.7822711849585176} 12/26/2021 16:23:58 - INFO - codeparrot_training - Step 28199: {'lr': 0.00020126703702312386, 'samples': 14438400, 'steps': 28199, 'batch_loss/train': 0.48361872998066247} 12/26/2021 16:24:10 - INFO - codeparrot_training - Step 28200: {'lr': 0.00020125157637330211, 'samples': 14438912, 'steps': 28200, 'batch_loss/train': 0.8950129239819944} 12/26/2021 16:24:21 - INFO - codeparrot_training - Step 28201: {'lr': 0.00020123611591728573, 'samples': 14439424, 'steps': 28201, 'batch_loss/train': 0.7892736960202456} 12/26/2021 16:24:32 - INFO - codeparrot_training - Step 28202: {'lr': 0.00020122065565513614, 'samples': 14439936, 'steps': 28202, 'batch_loss/train': 0.7129106456413865} 12/26/2021 16:24:44 - INFO - codeparrot_training - Step 28203: {'lr': 0.00020120519558691476, 'samples': 14440448, 'steps': 28203, 'batch_loss/train': 0.6999909887090325} 12/26/2021 16:24:54 - INFO - codeparrot_training - Step 28204: {'lr': 0.00020118973571268319, 'samples': 14440960, 'steps': 28204, 'batch_loss/train': 0.8013690281659365} 12/26/2021 16:25:05 - INFO - codeparrot_training - Step 28205: {'lr': 0.0002011742760325028, 'samples': 14441472, 'steps': 28205, 'batch_loss/train': 0.7398876883089542} 12/26/2021 16:25:19 - INFO - codeparrot_training - Step 28206: {'lr': 0.000201158816546435, 'samples': 14441984, 'steps': 28206, 'batch_loss/train': 0.6953416124451905} 12/26/2021 16:25:30 - INFO - codeparrot_training - Step 28207: {'lr': 0.00020114335725454125, 'samples': 14442496, 'steps': 28207, 'batch_loss/train': 1.068994275876321} 12/26/2021 16:25:40 - INFO - codeparrot_training - Step 28208: {'lr': 0.00020112789815688315, 'samples': 14443008, 'steps': 28208, 'batch_loss/train': 0.915267014876008} 12/26/2021 16:25:51 - 
INFO - codeparrot_training - Step 28209: {'lr': 0.00020111243925352205, 'samples': 14443520, 'steps': 28209, 'batch_loss/train': 0.8183322604745626} 12/26/2021 16:26:03 - INFO - codeparrot_training - Step 28210: {'lr': 0.00020109698054451947, 'samples': 14444032, 'steps': 28210, 'batch_loss/train': 0.7210367131046951} 12/26/2021 16:26:14 - INFO - codeparrot_training - Step 28211: {'lr': 0.00020108152202993673, 'samples': 14444544, 'steps': 28211, 'batch_loss/train': 0.7659230785211548} 12/26/2021 16:26:24 - INFO - codeparrot_training - Step 28212: {'lr': 0.00020106606370983544, 'samples': 14445056, 'steps': 28212, 'batch_loss/train': 0.6724621098255739} 12/26/2021 16:26:38 - INFO - codeparrot_training - Step 28213: {'lr': 0.00020105060558427707, 'samples': 14445568, 'steps': 28213, 'batch_loss/train': 0.6875525779323652} 12/26/2021 16:26:48 - INFO - codeparrot_training - Step 28214: {'lr': 0.00020103514765332288, 'samples': 14446080, 'steps': 28214, 'batch_loss/train': 0.7238426960539073} 12/26/2021 16:26:59 - INFO - codeparrot_training - Step 28215: {'lr': 0.00020101968991703456, 'samples': 14446592, 'steps': 28215, 'batch_loss/train': 0.8577379824710079} 12/26/2021 16:27:11 - INFO - codeparrot_training - Step 28216: {'lr': 0.00020100423237547345, 'samples': 14447104, 'steps': 28216, 'batch_loss/train': 0.6521470351144671} 12/26/2021 16:27:22 - INFO - codeparrot_training - Step 28217: {'lr': 0.000200988775028701, 'samples': 14447616, 'steps': 28217, 'batch_loss/train': 0.740623970516026} 12/26/2021 16:27:32 - INFO - codeparrot_training - Step 28218: {'lr': 0.0002009733178767786, 'samples': 14448128, 'steps': 28218, 'batch_loss/train': 0.6856782411341555} 12/26/2021 16:27:43 - INFO - codeparrot_training - Step 28219: {'lr': 0.00020095786091976783, 'samples': 14448640, 'steps': 28219, 'batch_loss/train': 0.7477124845609069} 12/26/2021 16:27:55 - INFO - codeparrot_training - Step 28220: {'lr': 0.00020094240415773014, 'samples': 14449152, 'steps': 28220, 
'batch_loss/train': 0.8095623021945357} 12/26/2021 16:28:06 - INFO - codeparrot_training - Step 28221: {'lr': 0.0002009269475907269, 'samples': 14449664, 'steps': 28221, 'batch_loss/train': 0.6171881398477126} 12/26/2021 16:28:16 - INFO - codeparrot_training - Step 28222: {'lr': 0.0002009114912188195, 'samples': 14450176, 'steps': 28222, 'batch_loss/train': 0.720734576578252} 12/26/2021 16:28:30 - INFO - codeparrot_training - Step 28223: {'lr': 0.00020089603504206958, 'samples': 14450688, 'steps': 28223, 'batch_loss/train': 0.7667789915576577} 12/26/2021 16:28:41 - INFO - codeparrot_training - Step 28224: {'lr': 0.0002008805790605385, 'samples': 14451200, 'steps': 28224, 'batch_loss/train': 0.7562628753948957} 12/26/2021 16:28:51 - INFO - codeparrot_training - Step 28225: {'lr': 0.00020086512327428758, 'samples': 14451712, 'steps': 28225, 'batch_loss/train': 0.7180177383124828} 12/26/2021 16:29:03 - INFO - codeparrot_training - Step 28226: {'lr': 0.00020084966768337844, 'samples': 14452224, 'steps': 28226, 'batch_loss/train': 0.751040211878717} 12/26/2021 16:29:14 - INFO - codeparrot_training - Step 28227: {'lr': 0.0002008342122878725, 'samples': 14452736, 'steps': 28227, 'batch_loss/train': 0.8215373521670699} 12/26/2021 16:29:25 - INFO - codeparrot_training - Step 28228: {'lr': 0.0002008187570878311, 'samples': 14453248, 'steps': 28228, 'batch_loss/train': 0.7437055958434939} 12/26/2021 16:29:37 - INFO - codeparrot_training - Step 28229: {'lr': 0.00020080330208331571, 'samples': 14453760, 'steps': 28229, 'batch_loss/train': 0.6899830610491335} 12/26/2021 16:29:47 - INFO - codeparrot_training - Step 28230: {'lr': 0.00020078784727438792, 'samples': 14454272, 'steps': 28230, 'batch_loss/train': 0.7573310937732458} 12/26/2021 16:29:58 - INFO - codeparrot_training - Step 28231: {'lr': 0.00020077239266110908, 'samples': 14454784, 'steps': 28231, 'batch_loss/train': 0.636060803080909} 12/26/2021 16:30:09 - INFO - codeparrot_training - Step 28232: {'lr': 
0.00020075693824354057, 'samples': 14455296, 'steps': 28232, 'batch_loss/train': 0.799221052788198} 12/26/2021 16:30:22 - INFO - codeparrot_training - Step 28233: {'lr': 0.00020074148402174382, 'samples': 14455808, 'steps': 28233, 'batch_loss/train': 0.6606189988087863} 12/26/2021 16:30:33 - INFO - codeparrot_training - Step 28234: {'lr': 0.00020072602999578044, 'samples': 14456320, 'steps': 28234, 'batch_loss/train': 1.6557370265945792} 12/26/2021 16:30:43 - INFO - codeparrot_training - Step 28235: {'lr': 0.0002007105761657117, 'samples': 14456832, 'steps': 28235, 'batch_loss/train': 1.0865877913311124} 12/26/2021 16:30:55 - INFO - codeparrot_training - Step 28236: {'lr': 0.00020069512253159915, 'samples': 14457344, 'steps': 28236, 'batch_loss/train': 0.8338604024611413} 12/26/2021 16:31:06 - INFO - codeparrot_training - Step 28237: {'lr': 0.00020067966909350407, 'samples': 14457856, 'steps': 28237, 'batch_loss/train': 0.7316479086875916} 12/26/2021 16:31:17 - INFO - codeparrot_training - Step 28238: {'lr': 0.00020066421585148808, 'samples': 14458368, 'steps': 28238, 'batch_loss/train': 0.8615976241417229} 12/26/2021 16:31:31 - INFO - codeparrot_training - Step 28239: {'lr': 0.00020064876280561256, 'samples': 14458880, 'steps': 28239, 'batch_loss/train': 0.8354452438652515} 12/26/2021 16:31:41 - INFO - codeparrot_training - Step 28240: {'lr': 0.00020063330995593885, 'samples': 14459392, 'steps': 28240, 'batch_loss/train': 0.7655255626887083} 12/26/2021 16:31:52 - INFO - codeparrot_training - Step 28241: {'lr': 0.00020061785730252855, 'samples': 14459904, 'steps': 28241, 'batch_loss/train': 0.6714155981317163} 12/26/2021 16:32:02 - INFO - codeparrot_training - Step 28242: {'lr': 0.00020060240484544292, 'samples': 14460416, 'steps': 28242, 'batch_loss/train': 0.7937309024855494} 12/26/2021 16:32:15 - INFO - codeparrot_training - Step 28243: {'lr': 0.00020058695258474358, 'samples': 14460928, 'steps': 28243, 'batch_loss/train': 0.8257672907784581} 12/26/2021 16:32:25 
- INFO - codeparrot_training - Step 28244: {'lr': 0.00020057150052049173, 'samples': 14461440, 'steps': 28244, 'batch_loss/train': 0.7013311916962266} 12/26/2021 16:32:36 - INFO - codeparrot_training - Step 28245: {'lr': 0.00020055604865274897, 'samples': 14461952, 'steps': 28245, 'batch_loss/train': 0.6990266088396311} 12/26/2021 16:32:48 - INFO - codeparrot_training - Step 28246: {'lr': 0.00020054059698157675, 'samples': 14462464, 'steps': 28246, 'batch_loss/train': 0.8020619358867407} 12/26/2021 16:32:59 - INFO - codeparrot_training - Step 28247: {'lr': 0.00020052514550703641, 'samples': 14462976, 'steps': 28247, 'batch_loss/train': 0.801116137765348} 12/26/2021 16:33:10 - INFO - codeparrot_training - Step 28248: {'lr': 0.00020050969422918933, 'samples': 14463488, 'steps': 28248, 'batch_loss/train': 0.800477865152061} 12/26/2021 16:33:22 - INFO - codeparrot_training - Step 28249: {'lr': 0.00020049424314809715, 'samples': 14464000, 'steps': 28249, 'batch_loss/train': 0.5089704277925193} 12/26/2021 16:33:33 - INFO - codeparrot_training - Step 28250: {'lr': 0.00020047879226382114, 'samples': 14464512, 'steps': 28250, 'batch_loss/train': 0.7083578190067783} 12/26/2021 16:33:43 - INFO - codeparrot_training - Step 28251: {'lr': 0.00020046334157642265, 'samples': 14465024, 'steps': 28251, 'batch_loss/train': 0.5535345842363313} 12/26/2021 16:33:54 - INFO - codeparrot_training - Step 28252: {'lr': 0.0002004478910859633, 'samples': 14465536, 'steps': 28252, 'batch_loss/train': 0.5615333300374914} 12/26/2021 16:34:08 - INFO - codeparrot_training - Step 28253: {'lr': 0.00020043244079250445, 'samples': 14466048, 'steps': 28253, 'batch_loss/train': 0.731469021178782} 12/26/2021 16:34:18 - INFO - codeparrot_training - Step 28254: {'lr': 0.00020041699069610745, 'samples': 14466560, 'steps': 28254, 'batch_loss/train': 0.8637183541432023} 12/26/2021 16:34:29 - INFO - codeparrot_training - Step 28255: {'lr': 0.0002004015407968337, 'samples': 14467072, 'steps': 28255, 
'batch_loss/train': 0.9383465310093015} 12/26/2021 16:34:41 - INFO - codeparrot_training - Step 28256: {'lr': 0.00020038609109474485, 'samples': 14467584, 'steps': 28256, 'batch_loss/train': 0.5423218674841337} 12/26/2021 16:34:52 - INFO - codeparrot_training - Step 28257: {'lr': 0.00020037064158990206, 'samples': 14468096, 'steps': 28257, 'batch_loss/train': 0.6621981277130544} 12/26/2021 16:35:02 - INFO - codeparrot_training - Step 28258: {'lr': 0.00020035519228236696, 'samples': 14468608, 'steps': 28258, 'batch_loss/train': 0.7631215024739504} 12/26/2021 16:35:16 - INFO - codeparrot_training - Step 28259: {'lr': 0.0002003397431722007, 'samples': 14469120, 'steps': 28259, 'batch_loss/train': 0.6087325811386108} 12/26/2021 16:35:27 - INFO - codeparrot_training - Step 28260: {'lr': 0.00020032429425946506, 'samples': 14469632, 'steps': 28260, 'batch_loss/train': 0.6853994957637042} 12/26/2021 16:35:38 - INFO - codeparrot_training - Step 28261: {'lr': 0.00020030884554422115, 'samples': 14470144, 'steps': 28261, 'batch_loss/train': 0.7125919489189982} 12/26/2021 16:35:48 - INFO - codeparrot_training - Step 28262: {'lr': 0.00020029339702653048, 'samples': 14470656, 'steps': 28262, 'batch_loss/train': 0.6826217463240027} 12/26/2021 16:36:00 - INFO - codeparrot_training - Step 28263: {'lr': 0.0002002779487064546, 'samples': 14471168, 'steps': 28263, 'batch_loss/train': 0.610639673192054} 12/26/2021 16:36:11 - INFO - codeparrot_training - Step 28264: {'lr': 0.0002002625005840548, 'samples': 14471680, 'steps': 28264, 'batch_loss/train': 0.8690589750185609} 12/26/2021 16:36:22 - INFO - codeparrot_training - Step 28265: {'lr': 0.00020024705265939252, 'samples': 14472192, 'steps': 28265, 'batch_loss/train': 0.5794024942442775} 12/26/2021 16:36:34 - INFO - codeparrot_training - Step 28266: {'lr': 0.0002002316049325291, 'samples': 14472704, 'steps': 28266, 'batch_loss/train': 0.6965260216966271} 12/26/2021 16:36:44 - INFO - codeparrot_training - Step 28267: {'lr': 
0.00020021615740352605, 'samples': 14473216, 'steps': 28267, 'batch_loss/train': 0.6161865703761578} 12/26/2021 16:36:55 - INFO - codeparrot_training - Step 28268: {'lr': 0.00020020071007244484, 'samples': 14473728, 'steps': 28268, 'batch_loss/train': 0.7436985364183784} 12/26/2021 16:37:09 - INFO - codeparrot_training - Step 28269: {'lr': 0.0002001852629393468, 'samples': 14474240, 'steps': 28269, 'batch_loss/train': 0.6436311079887673} 12/26/2021 16:37:20 - INFO - codeparrot_training - Step 28270: {'lr': 0.00020016981600429323, 'samples': 14474752, 'steps': 28270, 'batch_loss/train': 0.6332069442141801} 12/26/2021 16:37:30 - INFO - codeparrot_training - Step 28271: {'lr': 0.0002001543692673457, 'samples': 14475264, 'steps': 28271, 'batch_loss/train': 0.5623399732867256} 12/26/2021 16:37:41 - INFO - codeparrot_training - Step 28272: {'lr': 0.00020013892272856566, 'samples': 14475776, 'steps': 28272, 'batch_loss/train': 0.5841127429157495} 12/26/2021 16:37:53 - INFO - codeparrot_training - Step 28273: {'lr': 0.00020012347638801437, 'samples': 14476288, 'steps': 28273, 'batch_loss/train': 0.6221873313188553} 12/26/2021 16:38:03 - INFO - codeparrot_training - Step 28274: {'lr': 0.00020010803024575325, 'samples': 14476800, 'steps': 28274, 'batch_loss/train': 0.6499955235049129} 12/26/2021 16:38:14 - INFO - codeparrot_training - Step 28275: {'lr': 0.00020009258430184386, 'samples': 14477312, 'steps': 28275, 'batch_loss/train': 0.7536264914087951} 12/26/2021 16:38:26 - INFO - codeparrot_training - Step 28276: {'lr': 0.00020007713855634752, 'samples': 14477824, 'steps': 28276, 'batch_loss/train': 0.7171207349747419} 12/26/2021 16:38:37 - INFO - codeparrot_training - Step 28277: {'lr': 0.0002000616930093255, 'samples': 14478336, 'steps': 28277, 'batch_loss/train': 0.6065933278296143} 12/26/2021 16:38:47 - INFO - codeparrot_training - Step 28278: {'lr': 0.0002000462476608394, 'samples': 14478848, 'steps': 28278, 'batch_loss/train': 0.6214920068159699} 12/26/2021 16:39:00 - 
INFO - codeparrot_training - Step 28279: {'lr': 0.00020003080251095062, 'samples': 14479360, 'steps': 28279, 'batch_loss/train': 0.6496583162806928} 12/26/2021 16:39:11 - INFO - codeparrot_training - Step 28280: {'lr': 0.00020001535755972044, 'samples': 14479872, 'steps': 28280, 'batch_loss/train': 0.7001287713646889} 12/26/2021 16:39:21 - INFO - codeparrot_training - Step 28281: {'lr': 0.00019999991280721026, 'samples': 14480384, 'steps': 28281, 'batch_loss/train': 0.6681334329769015} 12/26/2021 16:39:35 - INFO - codeparrot_training - Step 28282: {'lr': 0.0001999844682534817, 'samples': 14480896, 'steps': 28282, 'batch_loss/train': 0.6928243085276335} 12/26/2021 16:39:45 - INFO - codeparrot_training - Step 28283: {'lr': 0.00019996902389859592, 'samples': 14481408, 'steps': 28283, 'batch_loss/train': 0.7478963090106845} 12/26/2021 16:39:56 - INFO - codeparrot_training - Step 28284: {'lr': 0.00019995357974261446, 'samples': 14481920, 'steps': 28284, 'batch_loss/train': 0.7671526116319001} 12/26/2021 16:40:07 - INFO - codeparrot_training - Step 28285: {'lr': 0.00019993813578559857, 'samples': 14482432, 'steps': 28285, 'batch_loss/train': 0.6594103914685547} 12/26/2021 16:40:19 - INFO - codeparrot_training - Step 28286: {'lr': 0.0001999226920276099, 'samples': 14482944, 'steps': 28286, 'batch_loss/train': 0.6280629704706371} 12/26/2021 16:40:30 - INFO - codeparrot_training - Step 28287: {'lr': 0.00019990724846870956, 'samples': 14483456, 'steps': 28287, 'batch_loss/train': 0.697022489272058} 12/26/2021 16:40:40 - INFO - codeparrot_training - Step 28288: {'lr': 0.00019989180510895908, 'samples': 14483968, 'steps': 28288, 'batch_loss/train': 0.6845835656858981} 12/26/2021 16:40:52 - INFO - codeparrot_training - Step 28289: {'lr': 0.00019987636194841995, 'samples': 14484480, 'steps': 28289, 'batch_loss/train': 0.7265902352519333} 12/26/2021 16:41:03 - INFO - codeparrot_training - Step 28290: {'lr': 0.00019986091898715347, 'samples': 14484992, 'steps': 28290, 
'batch_loss/train': 0.6842303173616529} 12/26/2021 16:41:13 - INFO - codeparrot_training - Step 28291: {'lr': 0.00019984547622522105, 'samples': 14485504, 'steps': 28291, 'batch_loss/train': 0.784081919118762} 12/26/2021 16:41:27 - INFO - codeparrot_training - Step 28292: {'lr': 0.00019983003366268398, 'samples': 14486016, 'steps': 28292, 'batch_loss/train': 0.74699816852808} 12/26/2021 16:41:38 - INFO - codeparrot_training - Step 28293: {'lr': 0.00019981459129960383, 'samples': 14486528, 'steps': 28293, 'batch_loss/train': 0.6974333096295595} 12/26/2021 16:41:48 - INFO - codeparrot_training - Step 28294: {'lr': 0.00019979914913604197, 'samples': 14487040, 'steps': 28294, 'batch_loss/train': 0.7821853142231703} 12/26/2021 16:42:00 - INFO - codeparrot_training - Step 28295: {'lr': 0.00019978370717205972, 'samples': 14487552, 'steps': 28295, 'batch_loss/train': 0.7114443145692348} 12/26/2021 16:42:11 - INFO - codeparrot_training - Step 28296: {'lr': 0.00019976826540771836, 'samples': 14488064, 'steps': 28296, 'batch_loss/train': 0.7512726243585348} 12/26/2021 16:42:22 - INFO - codeparrot_training - Step 28297: {'lr': 0.00019975282384307948, 'samples': 14488576, 'steps': 28297, 'batch_loss/train': 0.7353146048262715} 12/26/2021 16:42:32 - INFO - codeparrot_training - Step 28298: {'lr': 0.0001997373824782045, 'samples': 14489088, 'steps': 28298, 'batch_loss/train': 0.7872107482980937} 12/26/2021 16:42:46 - INFO - codeparrot_training - Step 28299: {'lr': 0.00019972194131315455, 'samples': 14489600, 'steps': 28299, 'batch_loss/train': 1.4601714694872499} 12/26/2021 16:42:57 - INFO - codeparrot_training - Step 28300: {'lr': 0.00019970650034799124, 'samples': 14490112, 'steps': 28300, 'batch_loss/train': 0.6606960976496339} 12/26/2021 16:43:08 - INFO - codeparrot_training - Step 28301: {'lr': 0.00019969105958277594, 'samples': 14490624, 'steps': 28301, 'batch_loss/train': 0.7852666862308979} 12/26/2021 16:43:20 - INFO - codeparrot_training - Step 28302: {'lr': 
0.00019967561901756997, 'samples': 14491136, 'steps': 28302, 'batch_loss/train': 0.6910918925423175} 12/26/2021 16:43:31 - INFO - codeparrot_training - Step 28303: {'lr': 0.00019966017865243467, 'samples': 14491648, 'steps': 28303, 'batch_loss/train': 0.7856837892904878} 12/26/2021 16:43:41 - INFO - codeparrot_training - Step 28304: {'lr': 0.0001996447384874316, 'samples': 14492160, 'steps': 28304, 'batch_loss/train': 0.6911931880749762} 12/26/2021 16:43:52 - INFO - codeparrot_training - Step 28305: {'lr': 0.000199629298522622, 'samples': 14492672, 'steps': 28305, 'batch_loss/train': 0.5920543578977231} 12/26/2021 16:44:04 - INFO - codeparrot_training - Step 28306: {'lr': 0.00019961385875806725, 'samples': 14493184, 'steps': 28306, 'batch_loss/train': 0.6428976822644472} 12/26/2021 16:44:15 - INFO - codeparrot_training - Step 28307: {'lr': 0.00019959841919382876, 'samples': 14493696, 'steps': 28307, 'batch_loss/train': 0.7127876500599086} 12/26/2021 16:44:25 - INFO - codeparrot_training - Step 28308: {'lr': 0.00019958297982996802, 'samples': 14494208, 'steps': 28308, 'batch_loss/train': 0.7691130274906754} 12/26/2021 16:44:39 - INFO - codeparrot_training - Step 28309: {'lr': 0.00019956754066654622, 'samples': 14494720, 'steps': 28309, 'batch_loss/train': 0.7152816448360682} 12/26/2021 16:44:49 - INFO - codeparrot_training - Step 28310: {'lr': 0.0001995521017036249, 'samples': 14495232, 'steps': 28310, 'batch_loss/train': 0.7912977365776896} 12/26/2021 16:45:00 - INFO - codeparrot_training - Step 28311: {'lr': 0.00019953666294126533, 'samples': 14495744, 'steps': 28311, 'batch_loss/train': 0.7219408263918012} 12/26/2021 16:45:12 - INFO - codeparrot_training - Step 28312: {'lr': 0.0001995212243795289, 'samples': 14496256, 'steps': 28312, 'batch_loss/train': 0.8306973921135068} 12/26/2021 16:45:23 - INFO - codeparrot_training - Step 28313: {'lr': 0.00019950578601847713, 'samples': 14496768, 'steps': 28313, 'batch_loss/train': 0.7424301104620099} 12/26/2021 16:45:33 - 
INFO - codeparrot_training - Step 28314: {'lr': 0.0001994903478581712, 'samples': 14497280, 'steps': 28314, 'batch_loss/train': 0.6586959031410515} 12/26/2021 16:45:46 - INFO - codeparrot_training - Step 28315: {'lr': 0.00019947490989867262, 'samples': 14497792, 'steps': 28315, 'batch_loss/train': 0.7341237957589328} 12/26/2021 16:45:56 - INFO - codeparrot_training - Step 28316: {'lr': 0.00019945947214004272, 'samples': 14498304, 'steps': 28316, 'batch_loss/train': 0.7372870752587914} 12/26/2021 16:46:07 - INFO - codeparrot_training - Step 28317: {'lr': 0.00019944403458234289, 'samples': 14498816, 'steps': 28317, 'batch_loss/train': 0.639121512271231} 12/26/2021 16:46:17 - INFO - codeparrot_training - Step 28318: {'lr': 0.00019942859722563443, 'samples': 14499328, 'steps': 28318, 'batch_loss/train': 0.7954975813627243} 12/26/2021 16:46:30 - INFO - codeparrot_training - Step 28319: {'lr': 0.0001994131600699788, 'samples': 14499840, 'steps': 28319, 'batch_loss/train': 0.6494522643042728} 12/26/2021 16:46:40 - INFO - codeparrot_training - Step 28320: {'lr': 0.00019939772311543738, 'samples': 14500352, 'steps': 28320, 'batch_loss/train': 0.8030212819576263} 12/26/2021 16:46:51 - INFO - codeparrot_training - Step 28321: {'lr': 0.00019938228636207152, 'samples': 14500864, 'steps': 28321, 'batch_loss/train': 0.8209607200697064} 12/26/2021 16:47:05 - INFO - codeparrot_training - Step 28322: {'lr': 0.00019936684980994247, 'samples': 14501376, 'steps': 28322, 'batch_loss/train': 0.7040592180565} 12/26/2021 16:47:15 - INFO - codeparrot_training - Step 28323: {'lr': 0.00019935141345911177, 'samples': 14501888, 'steps': 28323, 'batch_loss/train': 0.7464016219601035} 12/26/2021 16:47:26 - INFO - codeparrot_training - Step 28324: {'lr': 0.00019933597730964077, 'samples': 14502400, 'steps': 28324, 'batch_loss/train': 0.7193362843245268} 12/26/2021 16:47:38 - INFO - codeparrot_training - Step 28325: {'lr': 0.00019932054136159066, 'samples': 14502912, 'steps': 28325, 
'batch_loss/train': 0.7520026443526149} 12/26/2021 16:47:49 - INFO - codeparrot_training - Step 28326: {'lr': 0.00019930510561502304, 'samples': 14503424, 'steps': 28326, 'batch_loss/train': 0.7296976606012322} 12/26/2021 16:47:59 - INFO - codeparrot_training - Step 28327: {'lr': 0.0001992896700699992, 'samples': 14503936, 'steps': 28327, 'batch_loss/train': 0.7032198612578213} 12/26/2021 16:48:13 - INFO - codeparrot_training - Step 28328: {'lr': 0.00019927423472658047, 'samples': 14504448, 'steps': 28328, 'batch_loss/train': 0.7849400835111737} 12/26/2021 16:48:24 - INFO - codeparrot_training - Step 28329: {'lr': 0.00019925879958482815, 'samples': 14504960, 'steps': 28329, 'batch_loss/train': 0.5236970485420898} 12/26/2021 16:48:35 - INFO - codeparrot_training - Step 28330: {'lr': 0.0001992433646448038, 'samples': 14505472, 'steps': 28330, 'batch_loss/train': 0.6512665343470871} 12/26/2021 16:48:45 - INFO - codeparrot_training - Step 28331: {'lr': 0.00019922792990656864, 'samples': 14505984, 'steps': 28331, 'batch_loss/train': 0.8199945176020265} 12/26/2021 16:48:58 - INFO - codeparrot_training - Step 28332: {'lr': 0.00019921249537018404, 'samples': 14506496, 'steps': 28332, 'batch_loss/train': 0.7751430105417967} 12/26/2021 16:49:08 - INFO - codeparrot_training - Step 28333: {'lr': 0.0001991970610357113, 'samples': 14507008, 'steps': 28333, 'batch_loss/train': 0.7728591673076153} 12/26/2021 16:49:19 - INFO - codeparrot_training - Step 28334: {'lr': 0.00019918162690321198, 'samples': 14507520, 'steps': 28334, 'batch_loss/train': 0.5981133785098791} 12/26/2021 16:49:31 - INFO - codeparrot_training - Step 28335: {'lr': 0.0001991661929727473, 'samples': 14508032, 'steps': 28335, 'batch_loss/train': 0.6155802759749349} 12/26/2021 16:49:41 - INFO - codeparrot_training - Step 28336: {'lr': 0.00019915075924437854, 'samples': 14508544, 'steps': 28336, 'batch_loss/train': 0.7271556748892181} 12/26/2021 16:49:52 - INFO - codeparrot_training - Step 28337: {'lr': 
0.00019913532571816728, 'samples': 14509056, 'steps': 28337, 'batch_loss/train': 0.8830009391531348} 12/26/2021 16:50:06 - INFO - codeparrot_training - Step 28338: {'lr': 0.0001991198923941747, 'samples': 14509568, 'steps': 28338, 'batch_loss/train': 0.7377223381772637} 12/26/2021 16:50:17 - INFO - codeparrot_training - Step 28339: {'lr': 0.00019910445927246232, 'samples': 14510080, 'steps': 28339, 'batch_loss/train': 0.6780234561301768} 12/26/2021 16:50:27 - INFO - codeparrot_training - Step 28340: {'lr': 0.00019908902635309126, 'samples': 14510592, 'steps': 28340, 'batch_loss/train': 0.7365970155224204} 12/26/2021 16:50:39 - INFO - codeparrot_training - Step 28341: {'lr': 0.00019907359363612308, 'samples': 14511104, 'steps': 28341, 'batch_loss/train': 0.8133370336145163} 12/26/2021 16:50:50 - INFO - codeparrot_training - Step 28342: {'lr': 0.00019905816112161906, 'samples': 14511616, 'steps': 28342, 'batch_loss/train': 0.6433242075145245} 12/26/2021 16:51:01 - INFO - codeparrot_training - Step 28343: {'lr': 0.0001990427288096406, 'samples': 14512128, 'steps': 28343, 'batch_loss/train': 0.7408520015887916} 12/26/2021 16:51:11 - INFO - codeparrot_training - Step 28344: {'lr': 0.00019902729670024887, 'samples': 14512640, 'steps': 28344, 'batch_loss/train': 0.7538707507774234} 12/26/2021 16:51:23 - INFO - codeparrot_training - Step 28345: {'lr': 0.00019901186479350544, 'samples': 14513152, 'steps': 28345, 'batch_loss/train': 0.6896988233784214} 12/26/2021 16:51:34 - INFO - codeparrot_training - Step 28346: {'lr': 0.00019899643308947162, 'samples': 14513664, 'steps': 28346, 'batch_loss/train': 0.7096565193496644} 12/26/2021 16:51:44 - INFO - codeparrot_training - Step 28347: {'lr': 0.0001989810015882087, 'samples': 14514176, 'steps': 28347, 'batch_loss/train': 0.7771140523254871} 12/26/2021 16:51:58 - INFO - codeparrot_training - Step 28348: {'lr': 0.00019896557028977797, 'samples': 14514688, 'steps': 28348, 'batch_loss/train': 0.7391493339091539} 12/26/2021 16:52:09 
- INFO - codeparrot_training - Step 28349: {'lr': 0.000198950139194241, 'samples': 14515200, 'steps': 28349, 'batch_loss/train': 0.7515907888300717} 12/26/2021 16:52:19 - INFO - codeparrot_training - Step 28350: {'lr': 0.00019893470830165897, 'samples': 14515712, 'steps': 28350, 'batch_loss/train': 0.7372662806883454} 12/26/2021 16:52:31 - INFO - codeparrot_training - Step 28351: {'lr': 0.00019891927761209316, 'samples': 14516224, 'steps': 28351, 'batch_loss/train': 0.7647080365568399} 12/26/2021 16:52:42 - INFO - codeparrot_training - Step 28352: {'lr': 0.00019890384712560505, 'samples': 14516736, 'steps': 28352, 'batch_loss/train': 0.7460291064344347} 12/26/2021 16:52:53 - INFO - codeparrot_training - Step 28353: {'lr': 0.00019888841684225604, 'samples': 14517248, 'steps': 28353, 'batch_loss/train': 0.7269822652451694} 12/26/2021 16:53:05 - INFO - codeparrot_training - Step 28354: {'lr': 0.00019887298676210733, 'samples': 14517760, 'steps': 28354, 'batch_loss/train': 0.7466765819117427} 12/26/2021 16:53:16 - INFO - codeparrot_training - Step 28355: {'lr': 0.0001988575568852202, 'samples': 14518272, 'steps': 28355, 'batch_loss/train': 0.7677166741341352} 12/26/2021 16:53:26 - INFO - codeparrot_training - Step 28356: {'lr': 0.00019884212721165632, 'samples': 14518784, 'steps': 28356, 'batch_loss/train': 0.7229429874569178} 12/26/2021 16:53:37 - INFO - codeparrot_training - Step 28357: {'lr': 0.00019882669774147672, 'samples': 14519296, 'steps': 28357, 'batch_loss/train': 0.6831890735775232} 12/26/2021 16:53:51 - INFO - codeparrot_training - Step 28358: {'lr': 0.0001988112684747429, 'samples': 14519808, 'steps': 28358, 'batch_loss/train': 0.7593267923220992} 12/26/2021 16:54:02 - INFO - codeparrot_training - Step 28359: {'lr': 0.00019879583941151605, 'samples': 14520320, 'steps': 28359, 'batch_loss/train': 0.7068719766102731} 12/26/2021 16:54:12 - INFO - codeparrot_training - Step 28360: {'lr': 0.00019878041055185772, 'samples': 14520832, 'steps': 28360, 
'batch_loss/train': 0.6952394023537636} 12/26/2021 16:54:24 - INFO - codeparrot_training - Step 28361: {'lr': 0.00019876498189582909, 'samples': 14521344, 'steps': 28361, 'batch_loss/train': 0.7528784195892513} 12/26/2021 16:54:35 - INFO - codeparrot_training - Step 28362: {'lr': 0.00019874955344349147, 'samples': 14521856, 'steps': 28362, 'batch_loss/train': 0.7461749324575067} 12/26/2021 16:54:45 - INFO - codeparrot_training - Step 28363: {'lr': 0.00019873412519490641, 'samples': 14522368, 'steps': 28363, 'batch_loss/train': 0.686074455967173} 12/26/2021 16:54:57 - INFO - codeparrot_training - Step 28364: {'lr': 0.00019871869715013503, 'samples': 14522880, 'steps': 28364, 'batch_loss/train': 0.7267489405348897} 12/26/2021 16:55:08 - INFO - codeparrot_training - Step 28365: {'lr': 0.00019870326930923883, 'samples': 14523392, 'steps': 28365, 'batch_loss/train': 0.8027730723842978} 12/26/2021 16:55:19 - INFO - codeparrot_training - Step 28366: {'lr': 0.0001986878416722789, 'samples': 14523904, 'steps': 28366, 'batch_loss/train': 0.7162584969191812} 12/26/2021 16:55:29 - INFO - codeparrot_training - Step 28367: {'lr': 0.0001986724142393169, 'samples': 14524416, 'steps': 28367, 'batch_loss/train': 0.7253934317268431} 12/26/2021 16:55:43 - INFO - codeparrot_training - Step 28368: {'lr': 0.00019865698701041393, 'samples': 14524928, 'steps': 28368, 'batch_loss/train': 0.7820893051102757} 12/26/2021 16:55:54 - INFO - codeparrot_training - Step 28369: {'lr': 0.00019864155998563143, 'samples': 14525440, 'steps': 28369, 'batch_loss/train': 0.6369628815446049} 12/26/2021 16:56:04 - INFO - codeparrot_training - Step 28370: {'lr': 0.0001986261331650306, 'samples': 14525952, 'steps': 28370, 'batch_loss/train': 0.7493159114383161} 12/26/2021 16:56:16 - INFO - codeparrot_training - Step 28371: {'lr': 0.00019861070654867292, 'samples': 14526464, 'steps': 28371, 'batch_loss/train': 0.8027165583334863} 12/26/2021 16:56:27 - INFO - codeparrot_training - Step 28372: {'lr': 
0.00019859528013661974, 'samples': 14526976, 'steps': 28372, 'batch_loss/train': 0.7353244211990386} 12/26/2021 16:56:38 - INFO - codeparrot_training - Step 28373: {'lr': 0.00019857985392893217, 'samples': 14527488, 'steps': 28373, 'batch_loss/train': 0.7358415480703115} 12/26/2021 16:56:50 - INFO - codeparrot_training - Step 28374: {'lr': 0.0001985644279256718, 'samples': 14528000, 'steps': 28374, 'batch_loss/train': 0.7092074751853943} 12/26/2021 16:57:00 - INFO - codeparrot_training - Step 28375: {'lr': 0.00019854900212689982, 'samples': 14528512, 'steps': 28375, 'batch_loss/train': 0.7618554756045341} 12/26/2021 16:57:11 - INFO - codeparrot_training - Step 28376: {'lr': 0.00019853357653267761, 'samples': 14529024, 'steps': 28376, 'batch_loss/train': 0.7866869922727346} 12/26/2021 16:57:25 - INFO - codeparrot_training - Step 28377: {'lr': 0.0001985181511430664, 'samples': 14529536, 'steps': 28377, 'batch_loss/train': 0.6936252359300852} 12/26/2021 16:57:35 - INFO - codeparrot_training - Step 28378: {'lr': 0.0001985027259581276, 'samples': 14530048, 'steps': 28378, 'batch_loss/train': 0.7567786071449518} 12/26/2021 16:57:46 - INFO - codeparrot_training - Step 28379: {'lr': 0.00019848730097792256, 'samples': 14530560, 'steps': 28379, 'batch_loss/train': 0.6983888102695346} 12/26/2021 16:57:57 - INFO - codeparrot_training - Step 28380: {'lr': 0.00019847187620251254, 'samples': 14531072, 'steps': 28380, 'batch_loss/train': 0.7338542372453958} 12/26/2021 16:58:09 - INFO - codeparrot_training - Step 28381: {'lr': 0.0001984564516319588, 'samples': 14531584, 'steps': 28381, 'batch_loss/train': 0.7807473028078675} 12/26/2021 16:58:19 - INFO - codeparrot_training - Step 28382: {'lr': 0.00019844102726632287, 'samples': 14532096, 'steps': 28382, 'batch_loss/train': 0.7335687736049294} 12/26/2021 16:58:30 - INFO - codeparrot_training - Step 28383: {'lr': 0.00019842560310566587, 'samples': 14532608, 'steps': 28383, 'batch_loss/train': 0.6022010782617144} 12/26/2021 16:58:44 - 
INFO - codeparrot_training - Step 28384: {'lr': 0.00019841017915004928, 'samples': 14533120, 'steps': 28384, 'batch_loss/train': 0.7090776129625738}
12/26/2021 16:58:55 - INFO - codeparrot_training - Step 28385: {'lr': 0.00019839475539953422, 'samples': 14533632, 'steps': 28385, 'batch_loss/train': 0.737313361139968}
12/26/2021 16:59:05 - INFO - codeparrot_training - Step 28386: {'lr': 0.00019837933185418226, 'samples': 14534144, 'steps': 28386, 'batch_loss/train': 0.768639927264303}
12/26/2021 16:59:18 - INFO - codeparrot_training - Step 28387: {'lr': 0.0001983639085140545, 'samples': 14534656, 'steps': 28387, 'batch_loss/train': 0.7904574824497104}
12/26/2021 16:59:28 - INFO - codeparrot_training - Step 28388: {'lr': 0.0001983484853792123, 'samples': 14535168, 'steps': 28388, 'batch_loss/train': 0.7526710042729974}
12/26/2021 16:59:39 - INFO - codeparrot_training - Step 28389: {'lr': 0.0001983330624497171, 'samples': 14535680, 'steps': 28389, 'batch_loss/train': 0.749922773684375}
12/26/2021 16:59:51 - INFO - codeparrot_training - Step 28390: {'lr': 0.0001983176397256301, 'samples': 14536192, 'steps': 28390, 'batch_loss/train': 0.673965553403832}
12/26/2021 17:00:02 - INFO - codeparrot_training - Step 28391: {'lr': 0.0001983022172070127, 'samples': 14536704, 'steps': 28391, 'batch_loss/train': 0.7084409571252763}
12/26/2021 17:00:12 - INFO - codeparrot_training - Step 28392: {'lr': 0.0001982867948939261, 'samples': 14537216, 'steps': 28392, 'batch_loss/train': 0.7930462614167482}
12/26/2021 17:00:23 - INFO - codeparrot_training - Step 28393: {'lr': 0.00019827137278643171, 'samples': 14537728, 'steps': 28393, 'batch_loss/train': 0.7257548104971647}
12/26/2021 17:00:35 - INFO - codeparrot_training - Step 28394: {'lr': 0.00019825595088459084, 'samples': 14538240, 'steps': 28394, 'batch_loss/train': 0.6423619519919157}
12/26/2021 17:00:46 - INFO - codeparrot_training - Step 28395: {'lr': 0.00019824052918846478, 'samples': 14538752, 'steps': 28395, 'batch_loss/train': 0.7967724744230509}
12/26/2021 17:00:56 - INFO - codeparrot_training - Step 28396: {'lr': 0.00019822510769811473, 'samples': 14539264, 'steps': 28396, 'batch_loss/train': 0.6064401601906866}
12/26/2021 17:01:10 - INFO - codeparrot_training - Step 28397: {'lr': 0.00019820968641360215, 'samples': 14539776, 'steps': 28397, 'batch_loss/train': 0.6943439194001257}
12/26/2021 17:01:21 - INFO - codeparrot_training - Step 28398: {'lr': 0.00019819426533498836, 'samples': 14540288, 'steps': 28398, 'batch_loss/train': 0.7175009371712804}
12/26/2021 17:01:31 - INFO - codeparrot_training - Step 28399: {'lr': 0.00019817884446233448, 'samples': 14540800, 'steps': 28399, 'batch_loss/train': 0.7117712006438524}
12/26/2021 17:01:43 - INFO - codeparrot_training - Step 28400: {'lr': 0.00019816342379570202, 'samples': 14541312, 'steps': 28400, 'batch_loss/train': 0.8215343793854117}
12/26/2021 17:01:54 - INFO - codeparrot_training - Step 28401: {'lr': 0.0001981480033351523, 'samples': 14541824, 'steps': 28401, 'batch_loss/train': 0.8469418431632221}
12/26/2021 17:02:05 - INFO - codeparrot_training - Step 28402: {'lr': 0.00019813258308074645, 'samples': 14542336, 'steps': 28402, 'batch_loss/train': 0.8290585931390524}
12/26/2021 17:02:17 - INFO - codeparrot_training - Step 28403: {'lr': 0.00019811716303254578, 'samples': 14542848, 'steps': 28403, 'batch_loss/train': 0.8047391921281815}
12/26/2021 17:02:27 - INFO - codeparrot_training - Step 28404: {'lr': 0.00019810174319061187, 'samples': 14543360, 'steps': 28404, 'batch_loss/train': 0.7209714502096176}
12/26/2021 17:02:38 - INFO - codeparrot_training - Step 28405: {'lr': 0.00019808632355500577, 'samples': 14543872, 'steps': 28405, 'batch_loss/train': 0.7396137099713087}
12/26/2021 17:02:49 - INFO - codeparrot_training - Step 28406: {'lr': 0.0001980709041257888, 'samples': 14544384, 'steps': 28406, 'batch_loss/train': 0.7552049286896363}
12/26/2021 17:03:02 - INFO - codeparrot_training - Step 28407: {'lr': 0.00019805548490302226, 'samples': 14544896, 'steps': 28407, 'batch_loss/train': 0.6703898496925831}
12/26/2021 17:03:13 - INFO - codeparrot_training - Step 28408: {'lr': 0.00019804006588676763, 'samples': 14545408, 'steps': 28408, 'batch_loss/train': 0.7300611343234777}
12/26/2021 17:03:24 - INFO - codeparrot_training - Step 28409: {'lr': 0.00019802464707708602, 'samples': 14545920, 'steps': 28409, 'batch_loss/train': 0.6965848931577057}
12/26/2021 17:03:36 - INFO - codeparrot_training - Step 28410: {'lr': 0.00019800922847403867, 'samples': 14546432, 'steps': 28410, 'batch_loss/train': 0.585809848678764}
12/26/2021 17:03:46 - INFO - codeparrot_training - Step 28411: {'lr': 0.00019799381007768718, 'samples': 14546944, 'steps': 28411, 'batch_loss/train': 0.6151928776234854}
12/26/2021 17:03:57 - INFO - codeparrot_training - Step 28412: {'lr': 0.00019797839188809264, 'samples': 14547456, 'steps': 28412, 'batch_loss/train': 0.6638466134900227}
12/26/2021 17:04:11 - INFO - codeparrot_training - Step 28413: {'lr': 0.00019796297390531632, 'samples': 14547968, 'steps': 28413, 'batch_loss/train': 1.3552523711696267}
12/26/2021 17:04:22 - INFO - codeparrot_training - Step 28414: {'lr': 0.0001979475561294195, 'samples': 14548480, 'steps': 28414, 'batch_loss/train': 0.6725126802921295}
12/26/2021 17:04:32 - INFO - codeparrot_training - Step 28415: {'lr': 0.00019793213856046367, 'samples': 14548992, 'steps': 28415, 'batch_loss/train': 0.7383492803201079}
12/26/2021 17:04:43 - INFO - codeparrot_training - Step 28416: {'lr': 0.00019791672119850992, 'samples': 14549504, 'steps': 28416, 'batch_loss/train': 0.7278774045407772}
12/26/2021 17:04:55 - INFO - codeparrot_training - Step 28417: {'lr': 0.00019790130404361966, 'samples': 14550016, 'steps': 28417, 'batch_loss/train': 0.5332841936033219}
12/26/2021 17:05:06 - INFO - codeparrot_training - Step 28418: {'lr': 0.0001978858870958541, 'samples': 14550528, 'steps': 28418, 'batch_loss/train': 0.7519927239045501}
12/26/2021 17:05:16 - INFO - codeparrot_training - Step 28419: {'lr': 0.0001978704703552746, 'samples': 14551040, 'steps': 28419, 'batch_loss/train': 1.6311757802031934}
12/26/2021 17:05:28 - INFO - codeparrot_training - Step 28420: {'lr': 0.00019785505382194244, 'samples': 14551552, 'steps': 28420, 'batch_loss/train': 0.7630253732204437}
12/26/2021 17:05:39 - INFO - codeparrot_training - Step 28421: {'lr': 0.00019783963749591893, 'samples': 14552064, 'steps': 28421, 'batch_loss/train': 0.6071820383658633}
12/26/2021 17:05:50 - INFO - codeparrot_training - Step 28422: {'lr': 0.00019782422137726522, 'samples': 14552576, 'steps': 28422, 'batch_loss/train': 0.687840623781085}
12/26/2021 17:06:03 - INFO - codeparrot_training - Step 28423: {'lr': 0.00019780880546604274, 'samples': 14553088, 'steps': 28423, 'batch_loss/train': 0.8057790498714894}
12/26/2021 17:06:14 - INFO - codeparrot_training - Step 28424: {'lr': 0.00019779338976231282, 'samples': 14553600, 'steps': 28424, 'batch_loss/train': 0.7613225746899843}
12/26/2021 17:06:25 - INFO - codeparrot_training - Step 28425: {'lr': 0.00019777797426613654, 'samples': 14554112, 'steps': 28425, 'batch_loss/train': 0.830081831663847}
12/26/2021 17:06:35 - INFO - codeparrot_training - Step 28426: {'lr': 0.0001977625589775754, 'samples': 14554624, 'steps': 28426, 'batch_loss/train': 0.8736344827339053}
12/26/2021 17:06:48 - INFO - codeparrot_training - Step 28427: {'lr': 0.00019774714389669067, 'samples': 14555136, 'steps': 28427, 'batch_loss/train': 0.8064987659454346}
12/26/2021 17:06:58 - INFO - codeparrot_training - Step 28428: {'lr': 0.00019773172902354347, 'samples': 14555648, 'steps': 28428, 'batch_loss/train': 0.7013810412026942}
12/26/2021 17:07:09 - INFO - codeparrot_training - Step 28429: {'lr': 0.00019771631435819512, 'samples': 14556160, 'steps': 28429, 'batch_loss/train': 0.5545438312692568}
12/26/2021 17:07:21 - INFO - codeparrot_training - Step 28430: {'lr': 0.0001977008999007071, 'samples': 14556672, 'steps': 28430, 'batch_loss/train': 0.6390353775350377}
12/26/2021 17:07:32 - INFO - codeparrot_training - Step 28431: {'lr': 0.00019768548565114052, 'samples': 14557184, 'steps': 28431, 'batch_loss/train': 0.6019071657792665}
12/26/2021 17:07:42 - INFO - codeparrot_training - Step 28432: {'lr': 0.0001976700716095567, 'samples': 14557696, 'steps': 28432, 'batch_loss/train': 0.7342198668047786}
12/26/2021 17:07:56 - INFO - codeparrot_training - Step 28433: {'lr': 0.00019765465777601683, 'samples': 14558208, 'steps': 28433, 'batch_loss/train': 0.8027631305158138}
12/26/2021 17:08:07 - INFO - codeparrot_training - Step 28434: {'lr': 0.00019763924415058237, 'samples': 14558720, 'steps': 28434, 'batch_loss/train': 1.098695752210915}
12/26/2021 17:08:18 - INFO - codeparrot_training - Step 28435: {'lr': 0.00019762383073331443, 'samples': 14559232, 'steps': 28435, 'batch_loss/train': 0.7261646715924144}
12/26/2021 17:08:30 - INFO - codeparrot_training - Step 28436: {'lr': 0.00019760841752427435, 'samples': 14559744, 'steps': 28436, 'batch_loss/train': 0.8921501329168677}
12/26/2021 17:08:40 - INFO - codeparrot_training - Step 28437: {'lr': 0.00019759300452352351, 'samples': 14560256, 'steps': 28437, 'batch_loss/train': 0.6489450134686194}
12/26/2021 17:08:51 - INFO - codeparrot_training - Step 28438: {'lr': 0.00019757759173112305, 'samples': 14560768, 'steps': 28438, 'batch_loss/train': 0.698183189611882}
12/26/2021 17:09:02 - INFO - codeparrot_training - Step 28439: {'lr': 0.00019756217914713436, 'samples': 14561280, 'steps': 28439, 'batch_loss/train': 0.6932605183683336}
12/26/2021 17:09:14 - INFO - codeparrot_training - Step 28440: {'lr': 0.00019754676677161853, 'samples': 14561792, 'steps': 28440, 'batch_loss/train': 1.4190707080997527}
12/26/2021 17:09:25 - INFO - codeparrot_training - Step 28441: {'lr': 0.00019753135460463705, 'samples': 14562304, 'steps': 28441, 'batch_loss/train': 0.7771067945286632}
12/26/2021 17:09:35 - INFO - codeparrot_training - Step 28442: {'lr': 0.00019751594264625104, 'samples': 14562816, 'steps': 28442, 'batch_loss/train': 0.7189513761550188}
12/26/2021 17:09:49 - INFO - codeparrot_training - Step 28443: {'lr': 0.00019750053089652187, 'samples': 14563328, 'steps': 28443, 'batch_loss/train': 0.7296917014755309}
12/26/2021 17:10:00 - INFO - codeparrot_training - Step 28444: {'lr': 0.00019748511935551067, 'samples': 14563840, 'steps': 28444, 'batch_loss/train': 0.6802052254788578}
12/26/2021 17:10:10 - INFO - codeparrot_training - Step 28445: {'lr': 0.00019746970802327884, 'samples': 14564352, 'steps': 28445, 'batch_loss/train': 0.6426775180734694}
12/26/2021 17:10:23 - INFO - codeparrot_training - Step 28446: {'lr': 0.0001974542968998877, 'samples': 14564864, 'steps': 28446, 'batch_loss/train': 0.6868233281420544}
12/26/2021 17:10:33 - INFO - codeparrot_training - Step 28447: {'lr': 0.00019743888598539833, 'samples': 14565376, 'steps': 28447, 'batch_loss/train': 0.7412563776597381}
12/26/2021 17:10:44 - INFO - codeparrot_training - Step 28448: {'lr': 0.00019742347527987216, 'samples': 14565888, 'steps': 28448, 'batch_loss/train': 0.6475686936173588}
12/26/2021 17:10:56 - INFO - codeparrot_training - Step 28449: {'lr': 0.0001974080647833704, 'samples': 14566400, 'steps': 28449, 'batch_loss/train': 0.6622724253684282}
12/26/2021 17:11:06 - INFO - codeparrot_training - Step 28450: {'lr': 0.00019739265449595435, 'samples': 14566912, 'steps': 28450, 'batch_loss/train': 0.7579816998913884}
12/26/2021 17:11:17 - INFO - codeparrot_training - Step 28451: {'lr': 0.00019737724441768514, 'samples': 14567424, 'steps': 28451, 'batch_loss/train': 0.7457962576299906}
12/26/2021 17:11:28 - INFO - codeparrot_training - Step 28452: {'lr': 0.0001973618345486242, 'samples': 14567936, 'steps': 28452, 'batch_loss/train': 0.7553109843283892}
12/26/2021 17:11:42 - INFO - codeparrot_training - Step 28453: {'lr': 0.00019734642488883277, 'samples': 14568448, 'steps': 28453, 'batch_loss/train': 0.8358058901503682}
12/26/2021 17:11:52 - INFO - codeparrot_training - Step 28454: {'lr': 0.00019733101543837201, 'samples': 14568960, 'steps': 28454, 'batch_loss/train': 0.6730291000567377}
12/26/2021 17:12:03 - INFO - codeparrot_training - Step 28455: {'lr': 0.0001973156061973032, 'samples': 14569472, 'steps': 28455, 'batch_loss/train': 0.6968229811172932}
12/26/2021 17:12:15 - INFO - codeparrot_training - Step 28456: {'lr': 0.00019730019716568775, 'samples': 14569984, 'steps': 28456, 'batch_loss/train': 0.8309386353939772}
12/26/2021 17:12:26 - INFO - codeparrot_training - Step 28457: {'lr': 0.00019728478834358683, 'samples': 14570496, 'steps': 28457, 'batch_loss/train': 0.6849480794044212}
12/26/2021 17:12:36 - INFO - codeparrot_training - Step 28458: {'lr': 0.00019726937973106162, 'samples': 14571008, 'steps': 28458, 'batch_loss/train': 0.7569841127842665}
12/26/2021 17:12:48 - INFO - codeparrot_training - Step 28459: {'lr': 0.0001972539713281734, 'samples': 14571520, 'steps': 28459, 'batch_loss/train': 0.636200291570276}
12/26/2021 17:12:59 - INFO - codeparrot_training - Step 28460: {'lr': 0.0001972385631349836, 'samples': 14572032, 'steps': 28460, 'batch_loss/train': 0.5934783148986753}
12/26/2021 17:13:10 - INFO - codeparrot_training - Step 28461: {'lr': 0.00019722315515155326, 'samples': 14572544, 'steps': 28461, 'batch_loss/train': 0.718625834910199}
12/26/2021 17:13:20 - INFO - codeparrot_training - Step 28462: {'lr': 0.0001972077473779437, 'samples': 14573056, 'steps': 28462, 'batch_loss/train': 0.7266781227663159}
12/26/2021 17:13:34 - INFO - codeparrot_training - Step 28463: {'lr': 0.0001971923398142163, 'samples': 14573568, 'steps': 28463, 'batch_loss/train': 0.7173395799472928}
12/26/2021 17:13:45 - INFO - codeparrot_training - Step 28464: {'lr': 0.00019717693246043216, 'samples': 14574080, 'steps': 28464, 'batch_loss/train': 0.7603187819477171}
12/26/2021 17:13:55 - INFO - codeparrot_training - Step 28465: {'lr': 0.00019716152531665265, 'samples': 14574592, 'steps': 28465, 'batch_loss/train': 0.8160930946469307}
12/26/2021 17:14:07 - INFO - codeparrot_training - Step 28466: {'lr': 0.00019714611838293885, 'samples': 14575104, 'steps': 28466, 'batch_loss/train': 0.700460938969627}
12/26/2021 17:14:18 - INFO - codeparrot_training - Step 28467: {'lr': 0.00019713071165935225, 'samples': 14575616, 'steps': 28467, 'batch_loss/train': 0.7045608395710588}
12/26/2021 17:14:29 - INFO - codeparrot_training - Step 28468: {'lr': 0.0001971153051459539, 'samples': 14576128, 'steps': 28468, 'batch_loss/train': 0.7449497971683741}
12/26/2021 17:14:43 - INFO - codeparrot_training - Step 28469: {'lr': 0.0001970998988428052, 'samples': 14576640, 'steps': 28469, 'batch_loss/train': 0.6941979071125388}
12/26/2021 17:14:53 - INFO - codeparrot_training - Step 28470: {'lr': 0.00019708449274996723, 'samples': 14577152, 'steps': 28470, 'batch_loss/train': 0.6737669559661299}
12/26/2021 17:15:04 - INFO - codeparrot_training - Step 28471: {'lr': 0.00019706908686750139, 'samples': 14577664, 'steps': 28471, 'batch_loss/train': 0.7212108438834548}
12/26/2021 17:15:16 - INFO - codeparrot_training - Step 28472: {'lr': 0.00019705368119546891, 'samples': 14578176, 'steps': 28472, 'batch_loss/train': 0.7175689530558884}
12/26/2021 17:15:26 - INFO - codeparrot_training - Step 28473: {'lr': 0.0001970382757339309, 'samples': 14578688, 'steps': 28473, 'batch_loss/train': 0.7534589413553476}
12/26/2021 17:15:37 - INFO - codeparrot_training - Step 28474: {'lr': 0.00019702287048294876, 'samples': 14579200, 'steps': 28474, 'batch_loss/train': 0.6276882640086114}
12/26/2021 17:15:48 - INFO - codeparrot_training - Step 28475: {'lr': 0.00019700746544258375, 'samples': 14579712, 'steps': 28475, 'batch_loss/train': 0.772015979513526}
12/26/2021 17:16:00 - INFO - codeparrot_training - Step 28476: {'lr': 0.00019699206061289703, 'samples': 14580224, 'steps': 28476, 'batch_loss/train': 0.7209635032340884}
12/26/2021 17:16:10 - INFO - codeparrot_training - Step 28477: {'lr': 0.00019697665599394977, 'samples': 14580736, 'steps': 28477, 'batch_loss/train': 0.7036194852553308}
12/26/2021 17:16:21 - INFO - codeparrot_training - Step 28478: {'lr': 0.00019696125158580335, 'samples': 14581248, 'steps': 28478, 'batch_loss/train': 0.7125172629021108}
12/26/2021 17:16:33 - INFO - codeparrot_training - Step 28479: {'lr': 0.000196945847388519, 'samples': 14581760, 'steps': 28479, 'batch_loss/train': 0.7493171838577837}
12/26/2021 17:16:44 - INFO - codeparrot_training - Step 28480: {'lr': 0.0001969304434021579, 'samples': 14582272, 'steps': 28480, 'batch_loss/train': 0.7685047127306461}
12/26/2021 17:16:54 - INFO - codeparrot_training - Step 28481: {'lr': 0.00019691503962678125, 'samples': 14582784, 'steps': 28481, 'batch_loss/train': 0.6424147692741826}
12/26/2021 17:17:08 - INFO - codeparrot_training - Step 28482: {'lr': 0.0001968996360624505, 'samples': 14583296, 'steps': 28482, 'batch_loss/train': 0.779590118676424}
12/26/2021 17:17:19 - INFO - codeparrot_training - Step 28483: {'lr': 0.00019688423270922668, 'samples': 14583808, 'steps': 28483, 'batch_loss/train': 0.7278484208509326}
12/26/2021 17:17:29 - INFO - codeparrot_training - Step 28484: {'lr': 0.00019686882956717103, 'samples': 14584320, 'steps': 28484, 'batch_loss/train': 0.7571703125722706}
12/26/2021 17:17:42 - INFO - codeparrot_training - Step 28485: {'lr': 0.00019685342663634494, 'samples': 14584832, 'steps': 28485, 'batch_loss/train': 0.727528098039329}
12/26/2021 17:17:52 - INFO - codeparrot_training - Step 28486: {'lr': 0.0001968380239168096, 'samples': 14585344, 'steps': 28486, 'batch_loss/train': 0.6950079533271492}
12/26/2021 17:18:03 - INFO - codeparrot_training - Step 28487: {'lr': 0.00019682262140862611, 'samples': 14585856, 'steps': 28487, 'batch_loss/train': 0.6920137316919863}
12/26/2021 17:18:13 - INFO - codeparrot_training - Step 28488: {'lr': 0.0001968072191118558, 'samples': 14586368, 'steps': 28488, 'batch_loss/train': 0.7101616784930229}
12/26/2021 17:18:26 - INFO - codeparrot_training - Step 28489: {'lr': 0.00019679181702655997, 'samples': 14586880, 'steps': 28489, 'batch_loss/train': 0.6871916384552605}
12/26/2021 17:18:36 - INFO - codeparrot_training - Step 28490: {'lr': 0.0001967764151527997, 'samples': 14587392, 'steps': 28490, 'batch_loss/train': 0.6511565733235329}
12/26/2021 17:18:47 - INFO - codeparrot_training - Step 28491: {'lr': 0.00019676101349063642, 'samples': 14587904, 'steps': 28491, 'batch_loss/train': 0.7669527777470648}
12/26/2021 17:19:01 - INFO - codeparrot_training - Step 28492: {'lr': 0.0001967456120401311, 'samples': 14588416, 'steps': 28492, 'batch_loss/train': 0.732826401013881}
12/26/2021 17:19:11 - INFO - codeparrot_training - Step 28493: {'lr': 0.00019673021080134526, 'samples': 14588928, 'steps': 28493, 'batch_loss/train': 0.7157269832678139}
12/26/2021 17:19:22 - INFO - codeparrot_training - Step 28494: {'lr': 0.0001967148097743399, 'samples': 14589440, 'steps': 28494, 'batch_loss/train': 0.8293955838307738}
12/26/2021 17:19:34 - INFO - codeparrot_training - Step 28495: {'lr': 0.00019669940895917644, 'samples': 14589952, 'steps': 28495, 'batch_loss/train': 0.8938689690548927}
12/26/2021 17:19:45 - INFO - codeparrot_training - Step 28496: {'lr': 0.00019668400835591584, 'samples': 14590464, 'steps': 28496, 'batch_loss/train': 0.6897400394082069}
12/26/2021 17:19:55 - INFO - codeparrot_training - Step 28497: {'lr': 0.0001966686079646196, 'samples': 14590976, 'steps': 28497, 'batch_loss/train': 0.7373067745938897}
12/26/2021 17:20:09 - INFO - codeparrot_training - Step 28498: {'lr': 0.00019665320778534885, 'samples': 14591488, 'steps': 28498, 'batch_loss/train': 0.7624897425994277}
12/26/2021 17:20:20 - INFO - codeparrot_training - Step 28499: {'lr': 0.0001966378078181647, 'samples': 14592000, 'steps': 28499, 'batch_loss/train': 1.4312162324786186}
12/26/2021 17:20:31 - INFO - codeparrot_training - Step 28500: {'lr': 0.00019662240806312854, 'samples': 14592512, 'steps': 28500, 'batch_loss/train': 0.5655800714157522}
12/26/2021 17:20:41 - INFO - codeparrot_training - Step 28501: {'lr': 0.00019660700852030155, 'samples': 14593024, 'steps': 28501, 'batch_loss/train': 0.6880248577799648}
12/26/2021 17:20:53 - INFO - codeparrot_training - Step 28502: {'lr': 0.00019659160918974498, 'samples': 14593536, 'steps': 28502, 'batch_loss/train': 0.8083331100642681}
12/26/2021 17:21:04 - INFO - codeparrot_training - Step 28503: {'lr': 0.00019657621007151987, 'samples': 14594048, 'steps': 28503, 'batch_loss/train': 0.7356174681335688}
12/26/2021 17:21:15 - INFO - codeparrot_training - Step 28504: {'lr': 0.00019656081116568765, 'samples': 14594560, 'steps': 28504, 'batch_loss/train': 0.6992828967049718}
12/26/2021 17:21:27 - INFO - codeparrot_training - Step 28505: {'lr': 0.00019654541247230952, 'samples': 14595072, 'steps': 28505, 'batch_loss/train': 0.7585580912418664}
12/26/2021 17:21:37 - INFO - codeparrot_training - Step 28506: {'lr': 0.0001965300139914466, 'samples': 14595584, 'steps': 28506, 'batch_loss/train': 0.6294507598504424}
12/26/2021 17:21:48 - INFO - codeparrot_training - Step 28507: {'lr': 0.0001965146157231601, 'samples': 14596096, 'steps': 28507, 'batch_loss/train': 0.6994029558263719}
12/26/2021 17:22:00 - INFO - codeparrot_training - Step 28508: {'lr': 0.00019649921766751137, 'samples': 14596608, 'steps': 28508, 'batch_loss/train': 0.7840819777920842}
12/26/2021 17:22:11 - INFO - codeparrot_training - Step 28509: {'lr': 0.00019648381982456152, 'samples': 14597120, 'steps': 28509, 'batch_loss/train': 0.7352719579357654}
12/26/2021 17:22:21 - INFO - codeparrot_training - Step 28510: {'lr': 0.0001964684221943717, 'samples': 14597632, 'steps': 28510, 'batch_loss/train': 0.7151364037999883}
12/26/2021 17:22:32 - INFO - codeparrot_training - Step 28511: {'lr': 0.0001964530247770034, 'samples': 14598144, 'steps': 28511, 'batch_loss/train': 0.8135200026445091}
12/26/2021 17:22:45 - INFO - codeparrot_training - Step 28512: {'lr': 0.0001964376275725176, 'samples': 14598656, 'steps': 28512, 'batch_loss/train': 0.7333195814862847}
12/26/2021 17:22:56 - INFO - codeparrot_training - Step 28513: {'lr': 0.00019642223058097556, 'samples': 14599168, 'steps': 28513, 'batch_loss/train': 0.76789675001055}
12/26/2021 17:23:06 - INFO - codeparrot_training - Step 28514: {'lr': 0.0001964068338024384, 'samples': 14599680, 'steps': 28514, 'batch_loss/train': 0.7127366594504565}
12/26/2021 17:23:19 - INFO - codeparrot_training - Step 28515: {'lr': 0.00019639143723696755, 'samples': 14600192, 'steps': 28515, 'batch_loss/train': 0.6286013252101839}
12/26/2021 17:23:29 - INFO - codeparrot_training - Step 28516: {'lr': 0.0001963760408846241, 'samples': 14600704, 'steps': 28516, 'batch_loss/train': 0.6918374570086598}
12/26/2021 17:23:40 - INFO - codeparrot_training - Step 28517: {'lr': 0.00019636064474546928, 'samples': 14601216, 'steps': 28517, 'batch_loss/train': 0.6739319339394569}
12/26/2021 17:23:52 - INFO - codeparrot_training - Step 28518: {'lr': 0.00019634524881956413, 'samples': 14601728, 'steps': 28518, 'batch_loss/train': 0.6882084798999131}
12/26/2021 17:24:03 - INFO - codeparrot_training - Step 28519: {'lr': 0.00019632985310697013, 'samples': 14602240, 'steps': 28519, 'batch_loss/train': 0.7207975093042478}
12/26/2021 17:24:13 - INFO - codeparrot_training - Step 28520: {'lr': 0.0001963144576077484, 'samples': 14602752, 'steps': 28520, 'batch_loss/train': 0.68716906523332}
12/26/2021 17:24:24 - INFO - codeparrot_training - Step 28521: {'lr': 0.00019629906232196002, 'samples': 14603264, 'steps': 28521, 'batch_loss/train': 0.6541986728552729}
12/26/2021 17:24:38 - INFO - codeparrot_training - Step 28522: {'lr': 0.00019628366724966639, 'samples': 14603776, 'steps': 28522, 'batch_loss/train': 0.8017210643738508}
12/26/2021 17:24:49 - INFO - codeparrot_training - Step 28523: {'lr': 0.00019626827239092856, 'samples': 14604288, 'steps': 28523, 'batch_loss/train': 0.8870516326278448}
12/26/2021 17:24:59 - INFO - codeparrot_training - Step 28524: {'lr': 0.00019625287774580784, 'samples': 14604800, 'steps': 28524, 'batch_loss/train': 0.718442274548579}
12/26/2021 17:25:11 - INFO - codeparrot_training - Step 28525: {'lr': 0.00019623748331436527, 'samples': 14605312, 'steps': 28525, 'batch_loss/train': 0.82170738466084}
12/26/2021 17:25:22 - INFO - codeparrot_training - Step 28526: {'lr': 0.00019622208909666225, 'samples': 14605824, 'steps': 28526, 'batch_loss/train': 0.5754794192616828}
12/26/2021 17:25:33 - INFO - codeparrot_training - Step 28527: {'lr': 0.00019620669509275992, 'samples': 14606336, 'steps': 28527, 'batch_loss/train': 0.6961271585896611}
12/26/2021 17:25:45 - INFO - codeparrot_training - Step 28528: {'lr': 0.00019619130130271945, 'samples': 14606848, 'steps': 28528, 'batch_loss/train': 0.7344663329422474}
12/26/2021 17:25:56 - INFO - codeparrot_training - Step 28529: {'lr': 0.00019617590772660193, 'samples': 14607360, 'steps': 28529, 'batch_loss/train': 0.6770186526700854}
12/26/2021 17:26:06 - INFO - codeparrot_training - Step 28530: {'lr': 0.00019616051436446884, 'samples': 14607872, 'steps': 28530, 'batch_loss/train': 0.647043232107535}
12/26/2021 17:26:17 - INFO - codeparrot_training - Step 28531: {'lr': 0.0001961451212163812, 'samples': 14608384, 'steps': 28531, 'batch_loss/train': 0.6737773194909096}
12/26/2021 17:26:31 - INFO - codeparrot_training - Step 28532: {'lr': 0.00019612972828240012, 'samples': 14608896, 'steps': 28532, 'batch_loss/train': 0.6653413102030754}
12/26/2021 17:26:42 - INFO - codeparrot_training - Step 28533: {'lr': 0.0001961143355625869, 'samples': 14609408, 'steps': 28533, 'batch_loss/train': 0.7545466665178537}
12/26/2021 17:26:53 - INFO - codeparrot_training - Step 28534: {'lr': 0.00019609894305700287, 'samples': 14609920, 'steps': 28534, 'batch_loss/train': 0.7000186685472727}
12/26/2021 17:27:05 - INFO - codeparrot_training - Step 28535: {'lr': 0.000196083550765709, 'samples': 14610432, 'steps': 28535, 'batch_loss/train': 0.7036452952306718}
12/26/2021 17:27:15 - INFO - codeparrot_training - Step 28536: {'lr': 0.0001960681586887665, 'samples': 14610944, 'steps': 28536, 'batch_loss/train': 0.5722558326087892}
12/26/2021 17:27:26 - INFO - codeparrot_training - Step 28537: {'lr': 0.00019605276682623679, 'samples': 14611456, 'steps': 28537, 'batch_loss/train': 0.7074326537549496}
12/26/2021 17:27:38 - INFO - codeparrot_training - Step 28538: {'lr': 0.00019603737517818087, 'samples': 14611968, 'steps': 28538, 'batch_loss/train': 0.8823860511183739}
12/26/2021 17:27:49 - INFO - codeparrot_training - Step 28539: {'lr': 0.00019602198374465995, 'samples': 14612480, 'steps': 28539, 'batch_loss/train': 0.7319696880877018}
12/26/2021 17:27:59 - INFO - codeparrot_training - Step 28540: {'lr': 0.00019600659252573516, 'samples': 14612992, 'steps': 28540, 'batch_loss/train': 0.7608768865466118}
12/26/2021 17:28:13 - INFO - codeparrot_training - Step 28541: {'lr': 0.00019599120152146788, 'samples': 14613504, 'steps': 28541, 'batch_loss/train': 0.7296656109392643}
12/26/2021 17:28:24 - INFO - codeparrot_training - Step 28542: {'lr': 0.00019597581073191916, 'samples': 14614016, 'steps': 28542, 'batch_loss/train': 0.7484386786818504}
12/26/2021 17:28:35 - INFO - codeparrot_training - Step 28543: {'lr': 0.00019596042015715027, 'samples': 14614528, 'steps': 28543, 'batch_loss/train': 0.7347254569176584}
12/26/2021 17:28:45 - INFO - codeparrot_training - Step 28544: {'lr': 0.0001959450297972222, 'samples': 14615040, 'steps': 28544, 'batch_loss/train': 0.7332305922172964}
12/26/2021 17:28:58 - INFO - codeparrot_training - Step 28545: {'lr': 0.00019592963965219635, 'samples': 14615552, 'steps': 28545, 'batch_loss/train': 0.6666085154283792}
12/26/2021 17:29:08 - INFO - codeparrot_training - Step 28546: {'lr': 0.0001959142497221339, 'samples': 14616064, 'steps': 28546, 'batch_loss/train': 0.7610935127013363}
12/26/2021 17:29:19 - INFO - codeparrot_training - Step 28547: {'lr': 0.00019589886000709585, 'samples': 14616576, 'steps': 28547, 'batch_loss/train': 0.6351356967352331}
12/26/2021 17:29:31 - INFO - codeparrot_training - Step 28548: {'lr': 0.00019588347050714363, 'samples': 14617088, 'steps': 28548, 'batch_loss/train': 0.9669383261352777}
12/26/2021 17:29:42 - INFO - codeparrot_training - Step 28549: {'lr': 0.00019586808122233827, 'samples': 14617600, 'steps': 28549, 'batch_loss/train': 0.7997334082610905}
12/26/2021 17:29:53 - INFO - codeparrot_training - Step 28550: {'lr': 0.000195852692152741, 'samples': 14618112, 'steps': 28550, 'batch_loss/train': 1.0598834906704724}
12/26/2021 17:30:03 - INFO - codeparrot_training - Step 28551: {'lr': 0.00019583730329841286, 'samples': 14618624, 'steps': 28551, 'batch_loss/train': 0.8780245715752244}
12/26/2021 17:30:17 - INFO - codeparrot_training - Step 28552: {'lr': 0.00019582191465941523, 'samples': 14619136, 'steps': 28552, 'batch_loss/train': 0.7445703390985727}
12/26/2021 17:30:28 - INFO - codeparrot_training - Step 28553: {'lr': 0.00019580652623580924, 'samples': 14619648, 'steps': 28553, 'batch_loss/train': 0.7191633377224207}
12/26/2021 17:30:38 - INFO - codeparrot_training - Step 28554: {'lr': 0.000195791138027656, 'samples': 14620160, 'steps': 28554, 'batch_loss/train': 0.6948897582478821}
12/26/2021 17:30:51 - INFO - codeparrot_training - Step 28555: {'lr': 0.00019577575003501668, 'samples': 14620672, 'steps': 28555, 'batch_loss/train': 0.7490716697648168}
12/26/2021 17:31:01 - INFO - codeparrot_training - Step 28556: {'lr': 0.0001957603622579526, 'samples': 14621184, 'steps': 28556, 'batch_loss/train': 0.6005049808882177}
12/26/2021 17:31:12 - INFO - codeparrot_training - Step 28557: {'lr': 0.00019574497469652485, 'samples': 14621696, 'steps': 28557, 'batch_loss/train': 0.4385479733464308}
12/26/2021 17:31:26 - INFO - codeparrot_training - Step 28558: {'lr': 0.00019572958735079448, 'samples': 14622208, 'steps': 28558, 'batch_loss/train': 0.6805204562842846}
12/26/2021 17:31:37 - INFO - codeparrot_training - Step 28559: {'lr': 0.00019571420022082282, 'samples': 14622720, 'steps': 28559, 'batch_loss/train': 0.6986300919670612}
12/26/2021 17:31:47 - INFO - codeparrot_training - Step 28560: {'lr': 0.00019569881330667107, 'samples': 14623232, 'steps': 28560, 'batch_loss/train': 0.6612874008715153}
12/26/2021 17:31:58 - INFO - codeparrot_training - Step 28561: {'lr': 0.00019568342660840026, 'samples': 14623744, 'steps': 28561, 'batch_loss/train': 0.6752206180244684}
12/26/2021 17:32:10 - INFO - codeparrot_training - Step 28562: {'lr': 0.00019566804012607165, 'samples': 14624256, 'steps': 28562, 'batch_loss/train': 0.6877455383073539}
12/26/2021 17:32:21 - INFO - codeparrot_training - Step 28563: {'lr': 0.00019565265385974645, 'samples': 14624768, 'steps': 28563, 'batch_loss/train': 0.7486745775677264}
12/26/2021 17:32:31 - INFO - codeparrot_training - Step 28564: {'lr': 0.0001956372678094857, 'samples': 14625280, 'steps': 28564, 'batch_loss/train': 0.7710723159834743}
12/26/2021 17:32:43 - INFO - codeparrot_training - Step 28565: {'lr': 0.00019562188197535073, 'samples': 14625792, 'steps': 28565, 'batch_loss/train': 0.8904760172590613}
12/26/2021 17:32:54 - INFO - codeparrot_training - Step 28566: {'lr': 0.00019560649635740255, 'samples': 14626304, 'steps': 28566, 'batch_loss/train': 0.7850501136854291}
12/26/2021 17:33:05 - INFO - codeparrot_training - Step 28567: {'lr': 0.00019559111095570248, 'samples': 14626816, 'steps': 28567, 'batch_loss/train': 0.7077950278762728}
12/26/2021 17:33:18 - INFO - codeparrot_training - Step 28568: {'lr': 0.00019557572577031155, 'samples': 14627328, 'steps': 28568, 'batch_loss/train': 0.7811137377284467}
12/26/2021 17:33:29 - INFO - codeparrot_training - Step 28569: {'lr': 0.00019556034080129092, 'samples': 14627840, 'steps': 28569, 'batch_loss/train': 0.8302909741178155}
12/26/2021 17:33:40 - INFO - codeparrot_training - Step 28570: {'lr': 0.00019554495604870195, 'samples': 14628352, 'steps': 28570, 'batch_loss/train': 0.7018389902077615}
12/26/2021 17:33:50 - INFO - codeparrot_training - Step 28571: {'lr': 0.0001955295715126056, 'samples': 14628864, 'steps': 28571, 'batch_loss/train': 0.733825467992574}
12/26/2021 17:34:03 - INFO - codeparrot_training - Step 28572: {'lr': 0.0001955141871930632, 'samples': 14629376, 'steps': 28572, 'batch_loss/train': 0.73560979985632}
12/26/2021 17:34:13 - INFO - codeparrot_training - Step 28573: {'lr': 0.00019549880309013568, 'samples': 14629888, 'steps': 28573, 'batch_loss/train': 0.7370066880248487}
12/26/2021 17:34:24 - INFO - codeparrot_training - Step 28574: {'lr': 0.00019548341920388437, 'samples': 14630400, 'steps': 28574, 'batch_loss/train': 1.1078692306764424}
12/26/2021 17:34:37 - INFO - codeparrot_training - Step 28575: {'lr': 0.00019546803553437047, 'samples': 14630912, 'steps': 28575, 'batch_loss/train': 0.7281323950737715}
12/26/2021 17:34:47 - INFO - codeparrot_training - Step 28576: {'lr': 0.0001954526520816551, 'samples': 14631424, 'steps': 28576, 'batch_loss/train': 0.7606618460267782}
12/26/2021 17:34:58 - INFO - codeparrot_training - Step 28577: {'lr': 0.00019543726884579927, 'samples': 14631936, 'steps': 28577, 'batch_loss/train': 0.9497397812083364}
12/26/2021 17:35:10 - INFO - codeparrot_training - Step 28578: {'lr': 0.00019542188582686432, 'samples': 14632448, 'steps': 28578, 'batch_loss/train': 0.7872471096925437}
12/26/2021 17:35:20 - INFO - codeparrot_training - Step 28579: {'lr': 0.00019540650302491138, 'samples': 14632960, 'steps': 28579, 'batch_loss/train': 0.7384167811833322}
12/26/2021 17:35:31 - INFO - codeparrot_training - Step 28580: {'lr': 0.0001953911204400015, 'samples': 14633472, 'steps': 28580, 'batch_loss/train': 0.7544575622305274}
12/26/2021 17:35:45 - INFO - codeparrot_training - Step 28581: {'lr': 0.00019537573807219588, 'samples': 14633984, 'steps': 28581, 'batch_loss/train': 0.7089071606751531}
12/26/2021 17:35:56 - INFO - codeparrot_training - Step 28582: {'lr': 0.0001953603559215558, 'samples': 14634496, 'steps': 28582, 'batch_loss/train': 0.7730206474661827}
12/26/2021 17:36:06 - INFO - codeparrot_training - Step 28583: {'lr': 0.0001953449739881423, 'samples': 14635008, 'steps': 28583, 'batch_loss/train': 0.7918029213324189}
12/26/2021 17:36:17 - INFO - codeparrot_training - Step 28584: {'lr': 0.00019532959227201648, 'samples': 14635520, 'steps': 28584, 'batch_loss/train': 0.7660950971767306}
12/26/2021 17:36:29 - INFO - codeparrot_training - Step 28585: {'lr': 0.00019531421077323955, 'samples': 14636032, 'steps': 28585, 'batch_loss/train': 0.8232097513973713}
12/26/2021 17:36:40 - INFO - codeparrot_training - Step 28586: {'lr': 0.00019529882949187279, 'samples': 14636544, 'steps': 28586, 'batch_loss/train': 0.5757196554914117}
12/26/2021 17:36:50 - INFO - codeparrot_training - Step 28587: {'lr': 0.00019528344842797713, 'samples': 14637056, 'steps': 28587, 'batch_loss/train': 0.8673792611807585}
12/26/2021 17:37:03 - INFO - codeparrot_training - Step 28588: {'lr': 0.00019526806758161374, 'samples': 14637568, 'steps': 28588, 'batch_loss/train': 0.8303702753037214}
12/26/2021 17:37:13 - INFO - codeparrot_training - Step 28589: {'lr': 0.000195252686952844, 'samples': 14638080, 'steps': 28589, 'batch_loss/train': 0.7695662016049027}
12/26/2021 17:37:24 - INFO - codeparrot_training - Step 28590: {'lr': 0.00019523730654172886, 'samples': 14638592, 'steps': 28590, 'batch_loss/train': 0.7070718351751566}
12/26/2021 17:37:38 - INFO - codeparrot_training - Step 28591: {'lr': 0.00019522192634832954, 'samples': 14639104, 'steps': 28591, 'batch_loss/train': 0.7494062413461506}
12/26/2021 17:37:49 - INFO - codeparrot_training - Step 28592: {'lr': 0.00019520654637270707, 'samples': 14639616, 'steps': 28592, 'batch_loss/train': 0.6743326532305218}
12/26/2021 17:37:59 - INFO - codeparrot_training - Step 28593: {'lr': 0.00019519116661492275, 'samples': 14640128, 'steps': 28593, 'batch_loss/train': 0.7434562835842371}
12/26/2021 17:38:11 - INFO - codeparrot_training - Step 28594: {'lr': 0.00019517578707503765, 'samples': 14640640, 'steps': 28594, 'batch_loss/train': 0.7279641653876752}
12/26/2021 17:38:22 - INFO - codeparrot_training - Step 28595: {'lr': 0.00019516040775311284, 'samples': 14641152, 'steps': 28595, 'batch_loss/train': 0.7439208514988422}
12/26/2021 17:38:33 - INFO - codeparrot_training - Step 28596: {'lr': 0.00019514502864920966, 'samples': 14641664, 'steps': 28596, 'batch_loss/train': 0.8576284972950816}
12/26/2021 17:38:43 - INFO - codeparrot_training - Step 28597: {'lr': 0.00019512964976338906, 'samples': 14642176, 'steps': 28597, 'batch_loss/train': 0.7848966047167778}
12/26/2021 17:38:57 - INFO - codeparrot_training - Step 28598: {'lr': 0.00019511427109571234, 'samples': 14642688, 'steps': 28598, 'batch_loss/train': 0.7541611478663981}
12/26/2021 17:39:08 - INFO - codeparrot_training - Step 28599: {'lr': 0.00019509889264624042, 'samples': 14643200, 'steps': 28599, 'batch_loss/train': 0.8525032578036189}
12/26/2021 17:39:18 - INFO - codeparrot_training - Step 28600: {'lr': 0.00019508351441503463, 'samples': 14643712, 'steps': 28600, 'batch_loss/train': 1.3777821853291243}
12/26/2021 17:39:30 - INFO - codeparrot_training - Step 28601: {'lr': 0.00019506813640215613, 'samples': 14644224, 'steps': 28601, 'batch_loss/train': 0.7065868522040546}
12/26/2021 17:39:41 - INFO - codeparrot_training - Step 28602: {'lr': 0.00019505275860766597, 'samples': 14644736, 'steps': 28602, 'batch_loss/train': 0.8364956779405475}
12/26/2021 17:39:52 - INFO - codeparrot_training - Step 28603: {'lr': 0.00019503738103162515, 'samples': 14645248, 'steps': 28603, 'batch_loss/train': 0.7296672160737216}
12/26/2021 17:40:04 - INFO - codeparrot_training - Step 28604: {'lr': 0.00019502200367409506, 'samples': 14645760, 'steps': 28604, 'batch_loss/train': 0.8000224651768804}
12/26/2021 17:40:14 - INFO - codeparrot_training - Step 28605: {'lr': 0.00019500662653513674, 'samples': 14646272, 'steps': 28605, 'batch_loss/train': 0.7662012157961726}
12/26/2021 17:40:25 - INFO - codeparrot_training - Step 28606: {'lr': 0.0001949912496148112, 'samples': 14646784, 'steps': 28606, 'batch_loss/train': 0.6632688422687352}
12/26/2021 17:40:36 - INFO - codeparrot_training - Step 28607: {'lr': 0.00019497587291317976, 'samples': 14647296, 'steps': 28607, 'batch_loss/train': 0.736060299910605}
12/26/2021 17:40:48 - INFO - codeparrot_training - Step 28608: {'lr': 0.0001949604964303035, 'samples': 14647808, 'steps': 28608, 'batch_loss/train': 0.7321093678474426}
12/26/2021 17:40:58 - INFO - codeparrot_training - Step 28609: {'lr': 0.0001949451201662435, 'samples': 14648320, 'steps': 28609, 'batch_loss/train': 0.5872481202241033}
12/26/2021 17:41:09 - INFO - codeparrot_training - Step 28610: {'lr': 0.00019492974412106082, 'samples': 14648832, 'steps': 28610, 'batch_loss/train': 0.7017876571044326}
12/26/2021 17:41:23 - INFO - codeparrot_training - Step 28611: {'lr': 0.00019491436829481681, 'samples': 14649344, 'steps': 28611, 'batch_loss/train': 0.8087169798091054}
12/26/2021 17:41:34 - INFO - codeparrot_training - Step 28612: {'lr': 0.00019489899268757249, 'samples': 14649856, 'steps': 28612, 'batch_loss/train': 0.7239278144552372}
12/26/2021 17:41:44 - INFO - codeparrot_training - Step 28613: {'lr': 0.0001948836172993889, 'samples': 14650368, 'steps': 28613, 'batch_loss/train': 0.7678657621145248}
12/26/2021 17:41:56 - INFO - codeparrot_training - Step 28614: {'lr': 0.0001948682421303272, 'samples': 14650880, 'steps': 28614, 'batch_loss/train': 0.6171729632187635}
12/26/2021 17:42:07 - INFO - codeparrot_training - Step 28615: {'lr': 0.00019485286718044865, 'samples': 14651392, 'steps': 28615, 'batch_loss/train': 0.7319986410439014}
12/26/2021 17:42:18 - INFO - codeparrot_training - Step 28616: {'lr': 0.0001948374924498142, 'samples': 14651904, 'steps': 28616, 'batch_loss/train': 0.7485248306766152}
12/26/2021 17:42:30 - INFO - codeparrot_training - Step 28617: {'lr': 0.0001948221179384851, 'samples': 14652416, 'steps': 28617,
'batch_loss/train': 0.6986413383856416} 12/26/2021 17:42:40 - INFO - codeparrot_training - Step 28618: {'lr': 0.00019480674364652235, 'samples': 14652928, 'steps': 28618, 'batch_loss/train': 0.7796358950436115} 12/26/2021 17:42:51 - INFO - codeparrot_training - Step 28619: {'lr': 0.00019479136957398718, 'samples': 14653440, 'steps': 28619, 'batch_loss/train': 0.7487779534421861} 12/26/2021 17:43:02 - INFO - codeparrot_training - Step 28620: {'lr': 0.00019477599572094076, 'samples': 14653952, 'steps': 28620, 'batch_loss/train': 0.6997857093811035} 12/26/2021 17:43:15 - INFO - codeparrot_training - Step 28621: {'lr': 0.000194760622087444, 'samples': 14654464, 'steps': 28621, 'batch_loss/train': 0.7843897622078657} 12/26/2021 17:43:26 - INFO - codeparrot_training - Step 28622: {'lr': 0.00019474524867355827, 'samples': 14654976, 'steps': 28622, 'batch_loss/train': 0.6618954334408045} 12/26/2021 17:43:37 - INFO - codeparrot_training - Step 28623: {'lr': 0.00019472987547934452, 'samples': 14655488, 'steps': 28623, 'batch_loss/train': 0.742802687571384} 12/26/2021 17:43:49 - INFO - codeparrot_training - Step 28624: {'lr': 0.00019471450250486394, 'samples': 14656000, 'steps': 28624, 'batch_loss/train': 0.7241597464308143} 12/26/2021 17:44:00 - INFO - codeparrot_training - Step 28625: {'lr': 0.00019469912975017754, 'samples': 14656512, 'steps': 28625, 'batch_loss/train': 0.7603635713458061} 12/26/2021 17:44:10 - INFO - codeparrot_training - Step 28626: {'lr': 0.00019468375721534657, 'samples': 14657024, 'steps': 28626, 'batch_loss/train': 0.7933821640908718} 12/26/2021 17:44:24 - INFO - codeparrot_training - Step 28627: {'lr': 0.00019466838490043214, 'samples': 14657536, 'steps': 28627, 'batch_loss/train': 0.7266235039569438} 12/26/2021 17:44:35 - INFO - codeparrot_training - Step 28628: {'lr': 0.0001946530128054953, 'samples': 14658048, 'steps': 28628, 'batch_loss/train': 0.6315502361394465} 12/26/2021 17:44:45 - INFO - codeparrot_training - Step 28629: {'lr': 
0.00019463764093059712, 'samples': 14658560, 'steps': 28629, 'batch_loss/train': 0.83855105843395} 12/26/2021 17:44:56 - INFO - codeparrot_training - Step 28630: {'lr': 0.0001946222692757988, 'samples': 14659072, 'steps': 28630, 'batch_loss/train': 0.8401459446176887} 12/26/2021 17:45:08 - INFO - codeparrot_training - Step 28631: {'lr': 0.00019460689784116147, 'samples': 14659584, 'steps': 28631, 'batch_loss/train': 0.7586721666157246} 12/26/2021 17:45:19 - INFO - codeparrot_training - Step 28632: {'lr': 0.00019459152662674613, 'samples': 14660096, 'steps': 28632, 'batch_loss/train': 0.7600305918604136} 12/26/2021 17:45:29 - INFO - codeparrot_training - Step 28633: {'lr': 0.000194576155632614, 'samples': 14660608, 'steps': 28633, 'batch_loss/train': 0.8297273202333599} 12/26/2021 17:45:42 - INFO - codeparrot_training - Step 28634: {'lr': 0.00019456078485882616, 'samples': 14661120, 'steps': 28634, 'batch_loss/train': 0.7523971926420927} 12/26/2021 17:45:52 - INFO - codeparrot_training - Step 28635: {'lr': 0.0001945454143054437, 'samples': 14661632, 'steps': 28635, 'batch_loss/train': 0.804642147384584} 12/26/2021 17:46:03 - INFO - codeparrot_training - Step 28636: {'lr': 0.00019453004397252765, 'samples': 14662144, 'steps': 28636, 'batch_loss/train': 0.8386686630547047} 12/26/2021 17:46:17 - INFO - codeparrot_training - Step 28637: {'lr': 0.00019451467386013936, 'samples': 14662656, 'steps': 28637, 'batch_loss/train': 0.7480500931851566} 12/26/2021 17:46:27 - INFO - codeparrot_training - Step 28638: {'lr': 0.00019449930396833973, 'samples': 14663168, 'steps': 28638, 'batch_loss/train': 0.7762655266560614} 12/26/2021 17:46:38 - INFO - codeparrot_training - Step 28639: {'lr': 0.00019448393429718987, 'samples': 14663680, 'steps': 28639, 'batch_loss/train': 0.7793066184967756} 12/26/2021 17:46:49 - INFO - codeparrot_training - Step 28640: {'lr': 0.00019446856484675086, 'samples': 14664192, 'steps': 28640, 'batch_loss/train': 0.8265590309165418} 12/26/2021 17:47:01 - 
INFO - codeparrot_training - Step 28641: {'lr': 0.000194453195617084, 'samples': 14664704, 'steps': 28641, 'batch_loss/train': 1.356794593972154} 12/26/2021 17:47:12 - INFO - codeparrot_training - Step 28642: {'lr': 0.0001944378266082502, 'samples': 14665216, 'steps': 28642, 'batch_loss/train': 0.7218953799456358} 12/26/2021 17:47:23 - INFO - codeparrot_training - Step 28643: {'lr': 0.00019442245782031053, 'samples': 14665728, 'steps': 28643, 'batch_loss/train': 0.8594150899443775} 12/26/2021 17:47:35 - INFO - codeparrot_training - Step 28644: {'lr': 0.00019440708925332634, 'samples': 14666240, 'steps': 28644, 'batch_loss/train': 0.537929801736027} 12/26/2021 17:47:45 - INFO - codeparrot_training - Step 28645: {'lr': 0.0001943917209073585, 'samples': 14666752, 'steps': 28645, 'batch_loss/train': 0.7578949487651698} 12/26/2021 17:47:56 - INFO - codeparrot_training - Step 28646: {'lr': 0.00019437635278246823, 'samples': 14667264, 'steps': 28646, 'batch_loss/train': 0.6456727371551096} 12/26/2021 17:48:08 - INFO - codeparrot_training - Step 28647: {'lr': 0.00019436098487871645, 'samples': 14667776, 'steps': 28647, 'batch_loss/train': 0.8713216101750731} 12/26/2021 17:48:19 - INFO - codeparrot_training - Step 28648: {'lr': 0.00019434561719616452, 'samples': 14668288, 'steps': 28648, 'batch_loss/train': 0.7361934543587267} 12/26/2021 17:48:29 - INFO - codeparrot_training - Step 28649: {'lr': 0.00019433024973487334, 'samples': 14668800, 'steps': 28649, 'batch_loss/train': 0.6970552513375878} 12/26/2021 17:48:40 - INFO - codeparrot_training - Step 28650: {'lr': 0.00019431488249490415, 'samples': 14669312, 'steps': 28650, 'batch_loss/train': 0.7805483974516392} 12/26/2021 17:48:54 - INFO - codeparrot_training - Step 28651: {'lr': 0.00019429951547631782, 'samples': 14669824, 'steps': 28651, 'batch_loss/train': 0.7091061356477439} 12/26/2021 17:49:04 - INFO - codeparrot_training - Step 28652: {'lr': 0.00019428414867917566, 'samples': 14670336, 'steps': 28652, 
'batch_loss/train': 0.7450491599738598} 12/26/2021 17:49:15 - INFO - codeparrot_training - Step 28653: {'lr': 0.00019426878210353872, 'samples': 14670848, 'steps': 28653, 'batch_loss/train': 0.8089707652106881} 12/26/2021 17:49:27 - INFO - codeparrot_training - Step 28654: {'lr': 0.000194253415749468, 'samples': 14671360, 'steps': 28654, 'batch_loss/train': 0.6489042779430747} 12/26/2021 17:49:38 - INFO - codeparrot_training - Step 28655: {'lr': 0.00019423804961702463, 'samples': 14671872, 'steps': 28655, 'batch_loss/train': 0.8033319842070341} 12/26/2021 17:49:48 - INFO - codeparrot_training - Step 28656: {'lr': 0.0001942226837062698, 'samples': 14672384, 'steps': 28656, 'batch_loss/train': 0.78509482415393} 12/26/2021 17:50:02 - INFO - codeparrot_training - Step 28657: {'lr': 0.00019420731801726455, 'samples': 14672896, 'steps': 28657, 'batch_loss/train': 0.7386748516000807} 12/26/2021 17:50:13 - INFO - codeparrot_training - Step 28658: {'lr': 0.00019419195255006977, 'samples': 14673408, 'steps': 28658, 'batch_loss/train': 0.7667226712219417} 12/26/2021 17:50:23 - INFO - codeparrot_training - Step 28659: {'lr': 0.00019417658730474683, 'samples': 14673920, 'steps': 28659, 'batch_loss/train': 0.7310227607376873} 12/26/2021 17:50:34 - INFO - codeparrot_training - Step 28660: {'lr': 0.00019416122228135672, 'samples': 14674432, 'steps': 28660, 'batch_loss/train': 0.7370076715014875} 12/26/2021 17:50:46 - INFO - codeparrot_training - Step 28661: {'lr': 0.00019414585747996047, 'samples': 14674944, 'steps': 28661, 'batch_loss/train': 0.7056998861953616} 12/26/2021 17:50:57 - INFO - codeparrot_training - Step 28662: {'lr': 0.00019413049290061912, 'samples': 14675456, 'steps': 28662, 'batch_loss/train': 0.670166238618549} 12/26/2021 17:51:07 - INFO - codeparrot_training - Step 28663: {'lr': 0.00019411512854339397, 'samples': 14675968, 'steps': 28663, 'batch_loss/train': 0.7529711527749896} 12/26/2021 17:51:19 - INFO - codeparrot_training - Step 28664: {'lr': 
0.0001940997644083459, 'samples': 14676480, 'steps': 28664, 'batch_loss/train': 0.7372318753041327} 12/26/2021 17:51:30 - INFO - codeparrot_training - Step 28665: {'lr': 0.00019408440049553612, 'samples': 14676992, 'steps': 28665, 'batch_loss/train': 0.9251271151006222} 12/26/2021 17:51:40 - INFO - codeparrot_training - Step 28666: {'lr': 0.00019406903680502555, 'samples': 14677504, 'steps': 28666, 'batch_loss/train': 0.7955580456182361} 12/26/2021 17:51:55 - INFO - codeparrot_training - Step 28667: {'lr': 0.00019405367333687545, 'samples': 14678016, 'steps': 28667, 'batch_loss/train': 0.6971463138470426} 12/26/2021 17:52:05 - INFO - codeparrot_training - Step 28668: {'lr': 0.0001940383100911468, 'samples': 14678528, 'steps': 28668, 'batch_loss/train': 0.7919919129926711} 12/26/2021 17:52:16 - INFO - codeparrot_training - Step 28669: {'lr': 0.00019402294706790063, 'samples': 14679040, 'steps': 28669, 'batch_loss/train': 0.8081417083740234} 12/26/2021 17:52:27 - INFO - codeparrot_training - Step 28670: {'lr': 0.00019400758426719818, 'samples': 14679552, 'steps': 28670, 'batch_loss/train': 0.706329985987395} 12/26/2021 17:52:39 - INFO - codeparrot_training - Step 28671: {'lr': 0.0001939922216891004, 'samples': 14680064, 'steps': 28671, 'batch_loss/train': 0.839406113140285} 12/26/2021 17:52:50 - INFO - codeparrot_training - Step 28672: {'lr': 0.00019397685933366846, 'samples': 14680576, 'steps': 28672, 'batch_loss/train': 0.7910663210204802} 12/26/2021 17:53:00 - INFO - codeparrot_training - Step 28673: {'lr': 0.00019396149720096327, 'samples': 14681088, 'steps': 28673, 'batch_loss/train': 0.592012262030039} 12/26/2021 17:53:14 - INFO - codeparrot_training - Step 28674: {'lr': 0.00019394613529104607, 'samples': 14681600, 'steps': 28674, 'batch_loss/train': 0.6841868198243901} 12/26/2021 17:53:25 - INFO - codeparrot_training - Step 28675: {'lr': 0.00019393077360397787, 'samples': 14682112, 'steps': 28675, 'batch_loss/train': 0.6565966250491329} 12/26/2021 17:53:35 - 
INFO - codeparrot_training - Step 28676: {'lr': 0.00019391541213981977, 'samples': 14682624, 'steps': 28676, 'batch_loss/train': 0.6943702340358868} 12/26/2021 17:53:48 - INFO - codeparrot_training - Step 28677: {'lr': 0.0001939000508986327, 'samples': 14683136, 'steps': 28677, 'batch_loss/train': 0.6769044320099056} 12/26/2021 17:53:58 - INFO - codeparrot_training - Step 28678: {'lr': 0.00019388468988047793, 'samples': 14683648, 'steps': 28678, 'batch_loss/train': 0.7981105982325971} 12/26/2021 17:54:09 - INFO - codeparrot_training - Step 28679: {'lr': 0.00019386932908541647, 'samples': 14684160, 'steps': 28679, 'batch_loss/train': 0.6761978566646576} 12/26/2021 17:54:19 - INFO - codeparrot_training - Step 28680: {'lr': 0.00019385396851350926, 'samples': 14684672, 'steps': 28680, 'batch_loss/train': 0.7272938750102185} 12/26/2021 17:54:32 - INFO - codeparrot_training - Step 28681: {'lr': 0.00019383860816481752, 'samples': 14685184, 'steps': 28681, 'batch_loss/train': 0.8011681791394949} 12/26/2021 17:54:42 - INFO - codeparrot_training - Step 28682: {'lr': 0.00019382324803940234, 'samples': 14685696, 'steps': 28682, 'batch_loss/train': 0.730545949190855} 12/26/2021 17:54:53 - INFO - codeparrot_training - Step 28683: {'lr': 0.0001938078881373247, 'samples': 14686208, 'steps': 28683, 'batch_loss/train': 0.7927467431873083} 12/26/2021 17:55:05 - INFO - codeparrot_training - Step 28684: {'lr': 0.00019379252845864555, 'samples': 14686720, 'steps': 28684, 'batch_loss/train': 0.7928187772631645} 12/26/2021 17:55:16 - INFO - codeparrot_training - Step 28685: {'lr': 0.00019377716900342617, 'samples': 14687232, 'steps': 28685, 'batch_loss/train': 0.5270058177411556} 12/26/2021 17:55:26 - INFO - codeparrot_training - Step 28686: {'lr': 0.0001937618097717276, 'samples': 14687744, 'steps': 28686, 'batch_loss/train': 0.7391961654648185} 12/26/2021 17:55:40 - INFO - codeparrot_training - Step 28687: {'lr': 0.00019374645076361078, 'samples': 14688256, 'steps': 28687, 
'batch_loss/train': 0.8052281038835645} 12/26/2021 17:55:51 - INFO - codeparrot_training - Step 28688: {'lr': 0.00019373109197913673, 'samples': 14688768, 'steps': 28688, 'batch_loss/train': 0.6182756014459301} 12/26/2021 17:56:02 - INFO - codeparrot_training - Step 28689: {'lr': 0.0001937157334183668, 'samples': 14689280, 'steps': 28689, 'batch_loss/train': 0.8219330906867981} 12/26/2021 17:56:14 - INFO - codeparrot_training - Step 28690: {'lr': 0.00019370037508136174, 'samples': 14689792, 'steps': 28690, 'batch_loss/train': 0.8249612655490637} 12/26/2021 17:56:24 - INFO - codeparrot_training - Step 28691: {'lr': 0.00019368501696818283, 'samples': 14690304, 'steps': 28691, 'batch_loss/train': 0.8515000040642917} 12/26/2021 17:56:35 - INFO - codeparrot_training - Step 28692: {'lr': 0.0001936696590788909, 'samples': 14690816, 'steps': 28692, 'batch_loss/train': 0.7286515445448458} 12/26/2021 17:56:46 - INFO - codeparrot_training - Step 28693: {'lr': 0.00019365430141354722, 'samples': 14691328, 'steps': 28693, 'batch_loss/train': 0.8729002784239128} 12/26/2021 17:56:58 - INFO - codeparrot_training - Step 28694: {'lr': 0.00019363894397221278, 'samples': 14691840, 'steps': 28694, 'batch_loss/train': 0.9724415047094226} 12/26/2021 17:57:09 - INFO - codeparrot_training - Step 28695: {'lr': 0.00019362358675494847, 'samples': 14692352, 'steps': 28695, 'batch_loss/train': 0.9153937208466232} 12/26/2021 17:57:19 - INFO - codeparrot_training - Step 28696: {'lr': 0.00019360822976181567, 'samples': 14692864, 'steps': 28696, 'batch_loss/train': 0.7533593862317502} 12/26/2021 17:57:33 - INFO - codeparrot_training - Step 28697: {'lr': 0.00019359287299287517, 'samples': 14693376, 'steps': 28697, 'batch_loss/train': 0.7931512724608183} 12/26/2021 17:57:44 - INFO - codeparrot_training - Step 28698: {'lr': 0.00019357751644818815, 'samples': 14693888, 'steps': 28698, 'batch_loss/train': 0.7952718995511532} 12/26/2021 17:57:54 - INFO - codeparrot_training - Step 28699: {'lr': 
0.0001935621601278156, 'samples': 14694400, 'steps': 28699, 'batch_loss/train': 0.7978557744063437} 12/26/2021 17:58:06 - INFO - codeparrot_training - Step 28700: {'lr': 0.00019354680403181853, 'samples': 14694912, 'steps': 28700, 'batch_loss/train': 0.7810436161234975} 12/26/2021 17:58:17 - INFO - codeparrot_training - Step 28701: {'lr': 0.00019353144816025816, 'samples': 14695424, 'steps': 28701, 'batch_loss/train': 0.8060703007504344} 12/26/2021 17:58:28 - INFO - codeparrot_training - Step 28702: {'lr': 0.00019351609251319542, 'samples': 14695936, 'steps': 28702, 'batch_loss/train': 0.6867650533095002} 12/26/2021 17:58:38 - INFO - codeparrot_training - Step 28703: {'lr': 0.00019350073709069126, 'samples': 14696448, 'steps': 28703, 'batch_loss/train': 0.7436612262390554} 12/26/2021 17:58:51 - INFO - codeparrot_training - Step 28704: {'lr': 0.0001934853818928069, 'samples': 14696960, 'steps': 28704, 'batch_loss/train': 0.7049092594534159} 12/26/2021 17:59:01 - INFO - codeparrot_training - Step 28705: {'lr': 0.00019347002691960338, 'samples': 14697472, 'steps': 28705, 'batch_loss/train': 0.7865999657660723} 12/26/2021 17:59:12 - INFO - codeparrot_training - Step 28706: {'lr': 0.00019345467217114158, 'samples': 14697984, 'steps': 28706, 'batch_loss/train': 0.7452824972569942} 12/26/2021 17:59:26 - INFO - codeparrot_training - Step 28707: {'lr': 0.00019343931764748267, 'samples': 14698496, 'steps': 28707, 'batch_loss/train': 0.8194791208952665} 12/26/2021 17:59:36 - INFO - codeparrot_training - Step 28708: {'lr': 0.0001934239633486878, 'samples': 14699008, 'steps': 28708, 'batch_loss/train': 0.8185717845335603} 12/26/2021 17:59:47 - INFO - codeparrot_training - Step 28709: {'lr': 0.0001934086092748178, 'samples': 14699520, 'steps': 28709, 'batch_loss/train': 0.9321533795446157} 12/26/2021 17:59:59 - INFO - codeparrot_training - Step 28710: {'lr': 0.00019339325542593372, 'samples': 14700032, 'steps': 28710, 'batch_loss/train': 0.7378026377409697} 12/26/2021 18:00:10 - 
INFO - codeparrot_training - Step 28711: {'lr': 0.0001933779018020968, 'samples': 14700544, 'steps': 28711, 'batch_loss/train': 0.6978579494170845} 12/26/2021 18:00:20 - INFO - codeparrot_training - Step 28712: {'lr': 0.00019336254840336797, 'samples': 14701056, 'steps': 28712, 'batch_loss/train': 0.6621587567497045} 12/26/2021 18:00:31 - INFO - codeparrot_training - Step 28713: {'lr': 0.00019334719522980822, 'samples': 14701568, 'steps': 28713, 'batch_loss/train': 0.7425093026831746} 12/26/2021 18:00:43 - INFO - codeparrot_training - Step 28714: {'lr': 0.0001933318422814786, 'samples': 14702080, 'steps': 28714, 'batch_loss/train': 0.7755943769589067} 12/26/2021 18:00:54 - INFO - codeparrot_training - Step 28715: {'lr': 0.00019331648955844024, 'samples': 14702592, 'steps': 28715, 'batch_loss/train': 0.8695763731375337} 12/26/2021 18:01:04 - INFO - codeparrot_training - Step 28716: {'lr': 0.00019330113706075407, 'samples': 14703104, 'steps': 28716, 'batch_loss/train': 0.6905450699850917} 12/26/2021 18:01:19 - INFO - codeparrot_training - Step 28717: {'lr': 0.00019328578478848113, 'samples': 14703616, 'steps': 28717, 'batch_loss/train': 0.6740006050094962} 12/26/2021 18:01:29 - INFO - codeparrot_training - Step 28718: {'lr': 0.0001932704327416826, 'samples': 14704128, 'steps': 28718, 'batch_loss/train': 0.7686606030911207} 12/26/2021 18:01:40 - INFO - codeparrot_training - Step 28719: {'lr': 0.0001932550809204194, 'samples': 14704640, 'steps': 28719, 'batch_loss/train': 0.7825174406170845} 12/26/2021 18:01:50 - INFO - codeparrot_training - Step 28720: {'lr': 0.00019323972932475253, 'samples': 14705152, 'steps': 28720, 'batch_loss/train': 0.7102540591731668} 12/26/2021 18:02:02 - INFO - codeparrot_training - Step 28721: {'lr': 0.000193224377954743, 'samples': 14705664, 'steps': 28721, 'batch_loss/train': 0.7098911474458873} 12/26/2021 18:02:13 - INFO - codeparrot_training - Step 28722: {'lr': 0.000193209026810452, 'samples': 14706176, 'steps': 28722, 
'batch_loss/train': 0.7456584931351244} 12/26/2021 18:02:24 - INFO - codeparrot_training - Step 28723: {'lr': 0.00019319367589194046, 'samples': 14706688, 'steps': 28723, 'batch_loss/train': 0.7127905883826315} 12/26/2021 18:02:36 - INFO - codeparrot_training - Step 28724: {'lr': 0.00019317832519926944, 'samples': 14707200, 'steps': 28724, 'batch_loss/train': 0.7912019945215434} 12/26/2021 18:02:47 - INFO - codeparrot_training - Step 28725: {'lr': 0.00019316297473249982, 'samples': 14707712, 'steps': 28725, 'batch_loss/train': 0.7223835131153464} 12/26/2021 18:02:57 - INFO - codeparrot_training - Step 28726: {'lr': 0.00019314762449169283, 'samples': 14708224, 'steps': 28726, 'batch_loss/train': 0.8267314685508609} 12/26/2021 18:03:11 - INFO - codeparrot_training - Step 28727: {'lr': 0.00019313227447690946, 'samples': 14708736, 'steps': 28727, 'batch_loss/train': 0.8197776125743985} 12/26/2021 18:03:22 - INFO - codeparrot_training - Step 28728: {'lr': 0.00019311692468821067, 'samples': 14709248, 'steps': 28728, 'batch_loss/train': 0.8357994053512812} 12/26/2021 18:03:32 - INFO - codeparrot_training - Step 28729: {'lr': 0.00019310157512565746, 'samples': 14709760, 'steps': 28729, 'batch_loss/train': 0.72469327528961} 12/26/2021 18:03:43 - INFO - codeparrot_training - Step 28730: {'lr': 0.00019308622578931091, 'samples': 14710272, 'steps': 28730, 'batch_loss/train': 0.7779374942183495} 12/26/2021 18:03:55 - INFO - codeparrot_training - Step 28731: {'lr': 0.00019307087667923213, 'samples': 14710784, 'steps': 28731, 'batch_loss/train': 0.7035580235533416} 12/26/2021 18:04:06 - INFO - codeparrot_training - Step 28732: {'lr': 0.0001930555277954819, 'samples': 14711296, 'steps': 28732, 'batch_loss/train': 0.7686000168323517} 12/26/2021 18:04:16 - INFO - codeparrot_training - Step 28733: {'lr': 0.00019304017913812147, 'samples': 14711808, 'steps': 28733, 'batch_loss/train': 0.7678870647214353} 12/26/2021 18:04:30 - INFO - codeparrot_training - Step 28734: {'lr': 
0.00019302483070721184, 'samples': 14712320, 'steps': 28734, 'batch_loss/train': 0.8130239914171398} 12/26/2021 18:04:40 - INFO - codeparrot_training - Step 28735: {'lr': 0.0001930094825028139, 'samples': 14712832, 'steps': 28735, 'batch_loss/train': 0.8126279758289456} 12/26/2021 18:04:51 - INFO - codeparrot_training - Step 28736: {'lr': 0.00019299413452498869, 'samples': 14713344, 'steps': 28736, 'batch_loss/train': 0.682959204306826} 12/26/2021 18:05:03 - INFO - codeparrot_training - Step 28737: {'lr': 0.00019297878677379738, 'samples': 14713856, 'steps': 28737, 'batch_loss/train': 0.7634520540013909} 12/26/2021 18:05:14 - INFO - codeparrot_training - Step 28738: {'lr': 0.00019296343924930087, 'samples': 14714368, 'steps': 28738, 'batch_loss/train': 0.725237775943242} 12/26/2021 18:05:24 - INFO - codeparrot_training - Step 28739: {'lr': 0.00019294809195156016, 'samples': 14714880, 'steps': 28739, 'batch_loss/train': 0.7127964505343698} 12/26/2021 18:05:37 - INFO - codeparrot_training - Step 28740: {'lr': 0.00019293274488063622, 'samples': 14715392, 'steps': 28740, 'batch_loss/train': 0.7828685846179724} 12/26/2021 18:05:47 - INFO - codeparrot_training - Step 28741: {'lr': 0.00019291739803659026, 'samples': 14715904, 'steps': 28741, 'batch_loss/train': 0.7573726263362914} 12/26/2021 18:05:58 - INFO - codeparrot_training - Step 28742: {'lr': 0.00019290205141948312, 'samples': 14716416, 'steps': 28742, 'batch_loss/train': 0.8406480811536312} 12/26/2021 18:06:09 - INFO - codeparrot_training - Step 28743: {'lr': 0.0001928867050293758, 'samples': 14716928, 'steps': 28743, 'batch_loss/train': 0.7775236200541258} 12/26/2021 18:06:22 - INFO - codeparrot_training - Step 28744: {'lr': 0.00019287135886632948, 'samples': 14717440, 'steps': 28744, 'batch_loss/train': 0.727316465228796} 12/26/2021 18:06:33 - INFO - codeparrot_training - Step 28745: {'lr': 0.00019285601293040502, 'samples': 14717952, 'steps': 28745, 'batch_loss/train': 0.7845144155435264} 12/26/2021 18:06:44 - 
INFO - codeparrot_training - Step 28746: {'lr': 0.00019284066722166356, 'samples': 14718464, 'steps': 28746, 'batch_loss/train': 0.8846686948090792} 12/26/2021 18:06:56 - INFO - codeparrot_training - Step 28747: {'lr': 0.00019282532174016588, 'samples': 14718976, 'steps': 28747, 'batch_loss/train': 0.7348457709886134} 12/26/2021 18:07:06 - INFO - codeparrot_training - Step 28748: {'lr': 0.00019280997648597322, 'samples': 14719488, 'steps': 28748, 'batch_loss/train': 0.7149964119307697} 12/26/2021 18:07:17 - INFO - codeparrot_training - Step 28749: {'lr': 0.00019279463145914648, 'samples': 14720000, 'steps': 28749, 'batch_loss/train': 0.8450490874238312} 12/26/2021 18:07:29 - INFO - codeparrot_training - Step 28750: {'lr': 0.00019277928665974674, 'samples': 14720512, 'steps': 28750, 'batch_loss/train': 0.7072492563165724} 12/26/2021 18:07:40 - INFO - codeparrot_training - Step 28751: {'lr': 0.0001927639420878348, 'samples': 14721024, 'steps': 28751, 'batch_loss/train': 0.7880728729069233} 12/26/2021 18:07:50 - INFO - codeparrot_training - Step 28752: {'lr': 0.00019274859774347193, 'samples': 14721536, 'steps': 28752, 'batch_loss/train': 0.8037770162336528} 12/26/2021 18:08:01 - INFO - codeparrot_training - Step 28753: {'lr': 0.00019273325362671907, 'samples': 14722048, 'steps': 28753, 'batch_loss/train': 0.7403878243640065} 12/26/2021 18:08:13 - INFO - codeparrot_training - Step 28754: {'lr': 0.00019271790973763702, 'samples': 14722560, 'steps': 28754, 'batch_loss/train': 0.7382677872665226} 12/26/2021 18:08:24 - INFO - codeparrot_training - Step 28755: {'lr': 0.000192702566076287, 'samples': 14723072, 'steps': 28755, 'batch_loss/train': 0.6362285087816417} 12/26/2021 18:08:34 - INFO - codeparrot_training - Step 28756: {'lr': 0.00019268722264273, 'samples': 14723584, 'steps': 28756, 'batch_loss/train': 0.8097973922267556} 12/26/2021 18:08:48 - INFO - codeparrot_training - Step 28757: {'lr': 0.00019267187943702697, 'samples': 14724096, 'steps': 28757, 
'batch_loss/train': 0.7118729278445244} 12/26/2021 18:08:59 - INFO - codeparrot_training - Step 28758: {'lr': 0.00019265653645923877, 'samples': 14724608, 'steps': 28758, 'batch_loss/train': 0.7725752266123891} 12/26/2021 18:09:09 - INFO - codeparrot_training - Step 28759: {'lr': 0.00019264119370942663, 'samples': 14725120, 'steps': 28759, 'batch_loss/train': 0.810988234821707} 12/26/2021 18:09:21 - INFO - codeparrot_training - Step 28760: {'lr': 0.00019262585118765146, 'samples': 14725632, 'steps': 28760, 'batch_loss/train': 0.8141375156119466} 12/26/2021 18:09:32 - INFO - codeparrot_training - Step 28761: {'lr': 0.0001926105088939742, 'samples': 14726144, 'steps': 28761, 'batch_loss/train': 0.8768849922344089} 12/26/2021 18:09:43 - INFO - codeparrot_training - Step 28762: {'lr': 0.00019259516682845583, 'samples': 14726656, 'steps': 28762, 'batch_loss/train': 0.8090008068829775} 12/26/2021 18:09:55 - INFO - codeparrot_training - Step 28763: {'lr': 0.00019257982499115752, 'samples': 14727168, 'steps': 28763, 'batch_loss/train': 0.7238240852020681} 12/26/2021 18:10:06 - INFO - codeparrot_training - Step 28764: {'lr': 0.0001925644833821401, 'samples': 14727680, 'steps': 28764, 'batch_loss/train': 0.880010683555156} 12/26/2021 18:10:16 - INFO - codeparrot_training - Step 28765: {'lr': 0.00019254914200146463, 'samples': 14728192, 'steps': 28765, 'batch_loss/train': 0.8422191608697176} 12/26/2021 18:10:27 - INFO - codeparrot_training - Step 28766: {'lr': 0.00019253380084919197, 'samples': 14728704, 'steps': 28766, 'batch_loss/train': 0.5176485569099896} 12/26/2021 18:10:41 - INFO - codeparrot_training - Step 28767: {'lr': 0.00019251845992538334, 'samples': 14729216, 'steps': 28767, 'batch_loss/train': 0.8130417740903795} 12/26/2021 18:10:51 - INFO - codeparrot_training - Step 28768: {'lr': 0.00019250311923009954, 'samples': 14729728, 'steps': 28768, 'batch_loss/train': 0.8011182798072696} 12/26/2021 18:11:02 - INFO - codeparrot_training - Step 28769: {'lr': 
0.0001924877787634016, 'samples': 14730240, 'steps': 28769, 'batch_loss/train': 0.7073418628424406} 12/26/2021 18:11:14 - INFO - codeparrot_training - Step 28770: {'lr': 0.00019247243852535062, 'samples': 14730752, 'steps': 28770, 'batch_loss/train': 0.7555410284548998} 12/26/2021 18:11:25 - INFO - codeparrot_training - Step 28771: {'lr': 0.0001924570985160075, 'samples': 14731264, 'steps': 28771, 'batch_loss/train': 0.7458280478604138} 12/26/2021 18:11:35 - INFO - codeparrot_training - Step 28772: {'lr': 0.00019244175873543324, 'samples': 14731776, 'steps': 28772, 'batch_loss/train': 0.6236544076818973} 12/26/2021 18:11:49 - INFO - codeparrot_training - Step 28773: {'lr': 0.0001924264191836887, 'samples': 14732288, 'steps': 28773, 'batch_loss/train': 0.7136423941701651} 12/26/2021 18:12:00 - INFO - codeparrot_training - Step 28774: {'lr': 0.0001924110798608351, 'samples': 14732800, 'steps': 28774, 'batch_loss/train': 0.6844655689783394} 12/26/2021 18:12:11 - INFO - codeparrot_training - Step 28775: {'lr': 0.00019239574076693322, 'samples': 14733312, 'steps': 28775, 'batch_loss/train': 1.5722718695178628} 12/26/2021 18:12:21 - INFO - codeparrot_training - Step 28776: {'lr': 0.00019238040190204422, 'samples': 14733824, 'steps': 28776, 'batch_loss/train': 0.6512492053443566} 12/26/2021 18:12:33 - INFO - codeparrot_training - Step 28777: {'lr': 0.00019236506326622885, 'samples': 14734336, 'steps': 28777, 'batch_loss/train': 0.7186239557340741} 12/26/2021 18:12:44 - INFO - codeparrot_training - Step 28778: {'lr': 0.0001923497248595483, 'samples': 14734848, 'steps': 28778, 'batch_loss/train': 0.7882868517190218} 12/26/2021 18:12:55 - INFO - codeparrot_training - Step 28779: {'lr': 0.00019233438668206352, 'samples': 14735360, 'steps': 28779, 'batch_loss/train': 0.8289854899048805} 12/26/2021 18:13:07 - INFO - codeparrot_training - Step 28780: {'lr': 0.00019231904873383534, 'samples': 14735872, 'steps': 28780, 'batch_loss/train': 0.81979248393327} 12/26/2021 18:13:17 - 
INFO - codeparrot_training - Step 28781: {'lr': 0.0001923037110149249, 'samples': 14736384, 'steps': 28781, 'batch_loss/train': 0.7205801231320947} 12/26/2021 18:13:28 - INFO - codeparrot_training - Step 28782: {'lr': 0.00019228837352539316, 'samples': 14736896, 'steps': 28782, 'batch_loss/train': 0.9566602297127247} 12/26/2021 18:13:42 - INFO - codeparrot_training - Step 28783: {'lr': 0.00019227303626530106, 'samples': 14737408, 'steps': 28783, 'batch_loss/train': 0.7802565032616258} 12/26/2021 18:13:52 - INFO - codeparrot_training - Step 28784: {'lr': 0.00019225769923470948, 'samples': 14737920, 'steps': 28784, 'batch_loss/train': 0.7795551540330052} 12/26/2021 18:14:03 - INFO - codeparrot_training - Step 28785: {'lr': 0.00019224236243367955, 'samples': 14738432, 'steps': 28785, 'batch_loss/train': 0.7861004539299756} 12/26/2021 18:14:14 - INFO - codeparrot_training - Step 28786: {'lr': 0.0001922270258622722, 'samples': 14738944, 'steps': 28786, 'batch_loss/train': 0.6997666638344526} 12/26/2021 18:14:26 - INFO - codeparrot_training - Step 28787: {'lr': 0.00019221168952054832, 'samples': 14739456, 'steps': 28787, 'batch_loss/train': 0.821659863460809} 12/26/2021 18:14:36 - INFO - codeparrot_training - Step 28788: {'lr': 0.00019219635340856893, 'samples': 14739968, 'steps': 28788, 'batch_loss/train': 0.8513305047526956} 12/26/2021 18:14:47 - INFO - codeparrot_training - Step 28789: {'lr': 0.00019218101752639508, 'samples': 14740480, 'steps': 28789, 'batch_loss/train': 0.6464167585363612} 12/26/2021 18:14:59 - INFO - codeparrot_training - Step 28790: {'lr': 0.0001921656818740876, 'samples': 14740992, 'steps': 28790, 'batch_loss/train': 0.7396839456632733} 12/26/2021 18:15:10 - INFO - codeparrot_training - Step 28791: {'lr': 0.00019215034645170753, 'samples': 14741504, 'steps': 28791, 'batch_loss/train': 0.7054159790277481} 12/26/2021 18:15:20 - INFO - codeparrot_training - Step 28792: {'lr': 0.00019213501125931592, 'samples': 14742016, 'steps': 28792, 
'batch_loss/train': 0.788894347846508} 12/26/2021 18:15:33 - INFO - codeparrot_training - Step 28793: {'lr': 0.0001921196762969737, 'samples': 14742528, 'steps': 28793, 'batch_loss/train': 0.6960763363167644} 12/26/2021 18:15:43 - INFO - codeparrot_training - Step 28794: {'lr': 0.00019210434156474168, 'samples': 14743040, 'steps': 28794, 'batch_loss/train': 0.5842119893059134} 12/26/2021 18:15:54 - INFO - codeparrot_training - Step 28795: {'lr': 0.00019208900706268092, 'samples': 14743552, 'steps': 28795, 'batch_loss/train': 0.8777117258869112} 12/26/2021 18:16:08 - INFO - codeparrot_training - Step 28796: {'lr': 0.0001920736727908525, 'samples': 14744064, 'steps': 28796, 'batch_loss/train': 0.6929879436502233} 12/26/2021 18:16:18 - INFO - codeparrot_training - Step 28797: {'lr': 0.00019205833874931722, 'samples': 14744576, 'steps': 28797, 'batch_loss/train': 0.6713511710986495} 12/26/2021 18:16:29 - INFO - codeparrot_training - Step 28798: {'lr': 0.00019204300493813617, 'samples': 14745088, 'steps': 28798, 'batch_loss/train': 0.7407299326732755} 12/26/2021 18:16:40 - INFO - codeparrot_training - Step 28799: {'lr': 0.00019202767135737014, 'samples': 14745600, 'steps': 28799, 'batch_loss/train': 0.7417241677176207} 12/26/2021 18:16:52 - INFO - codeparrot_training - Step 28800: {'lr': 0.00019201233800708026, 'samples': 14746112, 'steps': 28800, 'batch_loss/train': 0.7568565969122574} 12/26/2021 18:17:02 - INFO - codeparrot_training - Step 28801: {'lr': 0.00019199700488732745, 'samples': 14746624, 'steps': 28801, 'batch_loss/train': 0.7823487743735313} 12/26/2021 18:17:13 - INFO - codeparrot_training - Step 28802: {'lr': 0.00019198167199817271, 'samples': 14747136, 'steps': 28802, 'batch_loss/train': 0.7897088145837188} 12/26/2021 18:17:27 - INFO - codeparrot_training - Step 28803: {'lr': 0.00019196633933967679, 'samples': 14747648, 'steps': 28803, 'batch_loss/train': 0.7545152269303799} 12/26/2021 18:17:37 - INFO - codeparrot_training - Step 28804: {'lr': 
0.00019195100691190084, 'samples': 14748160, 'steps': 28804, 'batch_loss/train': 0.7488670060411096} 12/26/2021 18:17:48 - INFO - codeparrot_training - Step 28805: {'lr': 0.00019193567471490584, 'samples': 14748672, 'steps': 28805, 'batch_loss/train': 0.8145812656730413} 12/26/2021 18:18:00 - INFO - codeparrot_training - Step 28806: {'lr': 0.00019192034274875252, 'samples': 14749184, 'steps': 28806, 'batch_loss/train': 0.7740697250701487} 12/26/2021 18:18:11 - INFO - codeparrot_training - Step 28807: {'lr': 0.0001919050110135021, 'samples': 14749696, 'steps': 28807, 'batch_loss/train': 0.7431804714724422} 12/26/2021 18:18:21 - INFO - codeparrot_training - Step 28808: {'lr': 0.00019188967950921543, 'samples': 14750208, 'steps': 28808, 'batch_loss/train': 0.7883133028517477} 12/26/2021 18:18:32 - INFO - codeparrot_training - Step 28809: {'lr': 0.00019187434823595347, 'samples': 14750720, 'steps': 28809, 'batch_loss/train': 0.7444129325449467} 12/26/2021 18:18:44 - INFO - codeparrot_training - Step 28810: {'lr': 0.00019185901719377706, 'samples': 14751232, 'steps': 28810, 'batch_loss/train': 0.8035651203244925} 12/26/2021 18:18:55 - INFO - codeparrot_training - Step 28811: {'lr': 0.0001918436863827473, 'samples': 14751744, 'steps': 28811, 'batch_loss/train': 0.7120178279001266} 12/26/2021 18:19:05 - INFO - codeparrot_training - Step 28812: {'lr': 0.00019182835580292513, 'samples': 14752256, 'steps': 28812, 'batch_loss/train': 0.6966295433230698} 12/26/2021 18:19:19 - INFO - codeparrot_training - Step 28813: {'lr': 0.00019181302545437137, 'samples': 14752768, 'steps': 28813, 'batch_loss/train': 0.724967050133273} 12/26/2021 18:19:30 - INFO - codeparrot_training - Step 28814: {'lr': 0.00019179769533714703, 'samples': 14753280, 'steps': 28814, 'batch_loss/train': 0.8643670161254704} 12/26/2021 18:19:40 - INFO - codeparrot_training - Step 28815: {'lr': 0.00019178236545131316, 'samples': 14753792, 'steps': 28815, 'batch_loss/train': 0.6494614267721772} 12/26/2021 18:19:52 
- INFO - codeparrot_training - Step 28816: {'lr': 0.0001917670357969306, 'samples': 14754304, 'steps': 28816, 'batch_loss/train': 0.8234179066494107} 12/26/2021 18:20:03 - INFO - codeparrot_training - Step 28817: {'lr': 0.00019175170637406025, 'samples': 14754816, 'steps': 28817, 'batch_loss/train': 0.6544703906401992} 12/26/2021 18:20:14 - INFO - codeparrot_training - Step 28818: {'lr': 0.0001917363771827632, 'samples': 14755328, 'steps': 28818, 'batch_loss/train': 0.7662621980998665} 12/26/2021 18:20:28 - INFO - codeparrot_training - Step 28819: {'lr': 0.00019172104822310037, 'samples': 14755840, 'steps': 28819, 'batch_loss/train': 0.7261045607738197} 12/26/2021 18:20:39 - INFO - codeparrot_training - Step 28820: {'lr': 0.00019170571949513255, 'samples': 14756352, 'steps': 28820, 'batch_loss/train': 0.7289380710572004} 12/26/2021 18:20:49 - INFO - codeparrot_training - Step 28821: {'lr': 0.00019169039099892074, 'samples': 14756864, 'steps': 28821, 'batch_loss/train': 0.7386451305355877} 12/26/2021 18:21:00 - INFO - codeparrot_training - Step 28822: {'lr': 0.00019167506273452603, 'samples': 14757376, 'steps': 28822, 'batch_loss/train': 1.1156321316957474} 12/26/2021 18:21:12 - INFO - codeparrot_training - Step 28823: {'lr': 0.0001916597347020092, 'samples': 14757888, 'steps': 28823, 'batch_loss/train': 0.7606810587458313} 12/26/2021 18:21:23 - INFO - codeparrot_training - Step 28824: {'lr': 0.00019164440690143126, 'samples': 14758400, 'steps': 28824, 'batch_loss/train': 0.7418347586644813} 12/26/2021 18:21:33 - INFO - codeparrot_training - Step 28825: {'lr': 0.00019162907933285305, 'samples': 14758912, 'steps': 28825, 'batch_loss/train': 0.7768002087250352} 12/26/2021 18:21:45 - INFO - codeparrot_training - Step 28826: {'lr': 0.00019161375199633563, 'samples': 14759424, 'steps': 28826, 'batch_loss/train': 0.8038216773420572} 12/26/2021 18:21:56 - INFO - codeparrot_training - Step 28827: {'lr': 0.0001915984248919399, 'samples': 14759936, 'steps': 28827, 
'batch_loss/train': 0.7997958874329925} 12/26/2021 18:22:07 - INFO - codeparrot_training - Step 28828: {'lr': 0.00019158309801972672, 'samples': 14760448, 'steps': 28828, 'batch_loss/train': 0.806415265891701} 12/26/2021 18:22:19 - INFO - codeparrot_training - Step 28829: {'lr': 0.00019156777137975718, 'samples': 14760960, 'steps': 28829, 'batch_loss/train': 0.6549482829868793} 12/26/2021 18:22:29 - INFO - codeparrot_training - Step 28830: {'lr': 0.00019155244497209208, 'samples': 14761472, 'steps': 28830, 'batch_loss/train': 0.7467505937675014} 12/26/2021 18:22:40 - INFO - codeparrot_training - Step 28831: {'lr': 0.00019153711879679242, 'samples': 14761984, 'steps': 28831, 'batch_loss/train': 0.7547303028404713} 12/26/2021 18:22:51 - INFO - codeparrot_training - Step 28832: {'lr': 0.000191521792853919, 'samples': 14762496, 'steps': 28832, 'batch_loss/train': 0.9874953432008624} 12/26/2021 18:23:05 - INFO - codeparrot_training - Step 28833: {'lr': 0.00019150646714353287, 'samples': 14763008, 'steps': 28833, 'batch_loss/train': 0.8366861873073503} 12/26/2021 18:23:15 - INFO - codeparrot_training - Step 28834: {'lr': 0.00019149114166569503, 'samples': 14763520, 'steps': 28834, 'batch_loss/train': 0.9724292531609535} 12/26/2021 18:23:26 - INFO - codeparrot_training - Step 28835: {'lr': 0.00019147581642046626, 'samples': 14764032, 'steps': 28835, 'batch_loss/train': 0.7817200543358922} 12/26/2021 18:23:38 - INFO - codeparrot_training - Step 28836: {'lr': 0.0001914604914079075, 'samples': 14764544, 'steps': 28836, 'batch_loss/train': 0.8287384053692222} 12/26/2021 18:23:49 - INFO - codeparrot_training - Step 28837: {'lr': 0.00019144516662807982, 'samples': 14765056, 'steps': 28837, 'batch_loss/train': 0.8629812244325876} 12/26/2021 18:23:59 - INFO - codeparrot_training - Step 28838: {'lr': 0.00019142984208104404, 'samples': 14765568, 'steps': 28838, 'batch_loss/train': 0.6482482036226429} 12/26/2021 18:24:11 - INFO - codeparrot_training - Step 28839: {'lr': 
0.0001914145177668611, 'samples': 14766080, 'steps': 28839, 'batch_loss/train': 0.8632886158302426} 12/26/2021 18:24:22 - INFO - codeparrot_training - Step 28840: {'lr': 0.00019139919368559176, 'samples': 14766592, 'steps': 28840, 'batch_loss/train': 1.0271872533485293} 12/26/2021 18:24:33 - INFO - codeparrot_training - Step 28841: {'lr': 0.00019138386983729726, 'samples': 14767104, 'steps': 28841, 'batch_loss/train': 0.824548882432282} 12/26/2021 18:24:43 - INFO - codeparrot_training - Step 28842: {'lr': 0.00019136854622203833, 'samples': 14767616, 'steps': 28842, 'batch_loss/train': 0.7905222978442907} 12/26/2021 18:24:57 - INFO - codeparrot_training - Step 28843: {'lr': 0.00019135322283987582, 'samples': 14768128, 'steps': 28843, 'batch_loss/train': 0.7775276340544224} 12/26/2021 18:25:08 - INFO - codeparrot_training - Step 28844: {'lr': 0.0001913378996908709, 'samples': 14768640, 'steps': 28844, 'batch_loss/train': 0.7617025999352336} 12/26/2021 18:25:19 - INFO - codeparrot_training - Step 28845: {'lr': 0.00019132257677508427, 'samples': 14769152, 'steps': 28845, 'batch_loss/train': 0.7218767411541194} 12/26/2021 18:25:32 - INFO - codeparrot_training - Step 28846: {'lr': 0.000191307254092577, 'samples': 14769664, 'steps': 28846, 'batch_loss/train': 0.7982128318399191} 12/26/2021 18:25:42 - INFO - codeparrot_training - Step 28847: {'lr': 0.00019129193164340978, 'samples': 14770176, 'steps': 28847, 'batch_loss/train': 1.1095893802121282} 12/26/2021 18:25:53 - INFO - codeparrot_training - Step 28848: {'lr': 0.0001912766094276438, 'samples': 14770688, 'steps': 28848, 'batch_loss/train': 0.8650993397459388} 12/26/2021 18:26:07 - INFO - codeparrot_training - Step 28849: {'lr': 0.00019126128744533977, 'samples': 14771200, 'steps': 28849, 'batch_loss/train': 0.8025443674996495} 12/26/2021 18:26:18 - INFO - codeparrot_training - Step 28850: {'lr': 0.0001912459656965588, 'samples': 14771712, 'steps': 28850, 'batch_loss/train': 0.7960072727873921} 12/26/2021 18:26:28 - 
INFO - codeparrot_training - Step 28851: {'lr': 0.0001912306441813615, 'samples': 14772224, 'steps': 28851, 'batch_loss/train': 0.7277800649753772} 12/26/2021 18:26:39 - INFO - codeparrot_training - Step 28852: {'lr': 0.00019121532289980903, 'samples': 14772736, 'steps': 28852, 'batch_loss/train': 0.6388340697158128} 12/26/2021 18:26:51 - INFO - codeparrot_training - Step 28853: {'lr': 0.00019120000185196234, 'samples': 14773248, 'steps': 28853, 'batch_loss/train': 0.6851030760444701} 12/26/2021 18:27:02 - INFO - codeparrot_training - Step 28854: {'lr': 0.00019118468103788213, 'samples': 14773760, 'steps': 28854, 'batch_loss/train': 0.7512936139246449} 12/26/2021 18:27:12 - INFO - codeparrot_training - Step 28855: {'lr': 0.00019116936045762947, 'samples': 14774272, 'steps': 28855, 'batch_loss/train': 0.7747121844440699} 12/26/2021 18:27:24 - INFO - codeparrot_training - Step 28856: {'lr': 0.00019115404011126521, 'samples': 14774784, 'steps': 28856, 'batch_loss/train': 0.7909134458750486} 12/26/2021 18:27:35 - INFO - codeparrot_training - Step 28857: {'lr': 0.00019113871999885032, 'samples': 14775296, 'steps': 28857, 'batch_loss/train': 0.6895775995217264} 12/26/2021 18:27:46 - INFO - codeparrot_training - Step 28858: {'lr': 0.00019112340012044556, 'samples': 14775808, 'steps': 28858, 'batch_loss/train': 0.7895464971661568} 12/26/2021 18:27:59 - INFO - codeparrot_training - Step 28859: {'lr': 0.00019110808047611194, 'samples': 14776320, 'steps': 28859, 'batch_loss/train': 0.882242931984365} 12/26/2021 18:28:10 - INFO - codeparrot_training - Step 28860: {'lr': 0.00019109276106591045, 'samples': 14776832, 'steps': 28860, 'batch_loss/train': 0.7342558233067393} 12/26/2021 18:28:21 - INFO - codeparrot_training - Step 28861: {'lr': 0.0001910774418899018, 'samples': 14777344, 'steps': 28861, 'batch_loss/train': 0.7311646785819903} 12/26/2021 18:28:31 - INFO - codeparrot_training - Step 28862: {'lr': 0.00019106212294814698, 'samples': 14777856, 'steps': 28862, 
'batch_loss/train': 0.8247773159528151} 12/26/2021 18:28:44 - INFO - codeparrot_training - Step 28863: {'lr': 0.000191046804240707, 'samples': 14778368, 'steps': 28863, 'batch_loss/train': 0.8388363057747483} 12/26/2021 18:28:54 - INFO - codeparrot_training - Step 28864: {'lr': 0.00019103148576764267, 'samples': 14778880, 'steps': 28864, 'batch_loss/train': 0.679317669942975} 12/26/2021 18:29:05 - INFO - codeparrot_training - Step 28865: {'lr': 0.0001910161675290148, 'samples': 14779392, 'steps': 28865, 'batch_loss/train': 0.6217195494100451} 12/26/2021 18:29:17 - INFO - codeparrot_training - Step 28866: {'lr': 0.0001910008495248844, 'samples': 14779904, 'steps': 28866, 'batch_loss/train': 0.6389568728627637} 12/26/2021 18:29:27 - INFO - codeparrot_training - Step 28867: {'lr': 0.00019098553175531238, 'samples': 14780416, 'steps': 28867, 'batch_loss/train': 0.6949313378427178} 12/26/2021 18:29:38 - INFO - codeparrot_training - Step 28868: {'lr': 0.0001909702142203596, 'samples': 14780928, 'steps': 28868, 'batch_loss/train': 0.7914582472294569} 12/26/2021 18:29:52 - INFO - codeparrot_training - Step 28869: {'lr': 0.00019095489692008688, 'samples': 14781440, 'steps': 28869, 'batch_loss/train': 0.8417007122188807} 12/26/2021 18:30:03 - INFO - codeparrot_training - Step 28870: {'lr': 0.0001909395798545553, 'samples': 14781952, 'steps': 28870, 'batch_loss/train': 0.8357813470065594} 12/26/2021 18:30:13 - INFO - codeparrot_training - Step 28871: {'lr': 0.0001909242630238256, 'samples': 14782464, 'steps': 28871, 'batch_loss/train': 0.7284276690334082} 12/26/2021 18:30:25 - INFO - codeparrot_training - Step 28872: {'lr': 0.00019090894642795879, 'samples': 14782976, 'steps': 28872, 'batch_loss/train': 0.800046875141561} 12/26/2021 18:30:36 - INFO - codeparrot_training - Step 28873: {'lr': 0.00019089363006701555, 'samples': 14783488, 'steps': 28873, 'batch_loss/train': 0.7315299184992909} 12/26/2021 18:30:47 - INFO - codeparrot_training - Step 28874: {'lr': 
0.00019087831394105705, 'samples': 14784000, 'steps': 28874, 'batch_loss/train': 0.8409914830699563} 12/26/2021 18:30:57 - INFO - codeparrot_training - Step 28875: {'lr': 0.000190862998050144, 'samples': 14784512, 'steps': 28875, 'batch_loss/train': 0.6984170526266098} 12/26/2021 18:31:10 - INFO - codeparrot_training - Step 28876: {'lr': 0.00019084768239433742, 'samples': 14785024, 'steps': 28876, 'batch_loss/train': 0.8344348985701799} 12/26/2021 18:31:20 - INFO - codeparrot_training - Step 28877: {'lr': 0.000190832366973698, 'samples': 14785536, 'steps': 28877, 'batch_loss/train': 0.726136508397758} 12/26/2021 18:31:31 - INFO - codeparrot_training - Step 28878: {'lr': 0.00019081705178828677, 'samples': 14786048, 'steps': 28878, 'batch_loss/train': 0.6184897013590671} 12/26/2021 18:31:43 - INFO - codeparrot_training - Step 28879: {'lr': 0.00019080173683816464, 'samples': 14786560, 'steps': 28879, 'batch_loss/train': 0.8599983248859644} 12/26/2021 18:31:54 - INFO - codeparrot_training - Step 28880: {'lr': 0.0001907864221233924, 'samples': 14787072, 'steps': 28880, 'batch_loss/train': 0.7315550204366446} 12/26/2021 18:32:04 - INFO - codeparrot_training - Step 28881: {'lr': 0.00019077110764403098, 'samples': 14787584, 'steps': 28881, 'batch_loss/train': 0.6952398886787705} 12/26/2021 18:32:18 - INFO - codeparrot_training - Step 28882: {'lr': 0.00019075579340014137, 'samples': 14788096, 'steps': 28882, 'batch_loss/train': 0.6962053552269936} 12/26/2021 18:32:29 - INFO - codeparrot_training - Step 28883: {'lr': 0.00019074047939178436, 'samples': 14788608, 'steps': 28883, 'batch_loss/train': 0.8663780642673373} 12/26/2021 18:32:39 - INFO - codeparrot_training - Step 28884: {'lr': 0.0001907251656190207, 'samples': 14789120, 'steps': 28884, 'batch_loss/train': 0.689507931470871} 12/26/2021 18:32:50 - INFO - codeparrot_training - Step 28885: {'lr': 0.00019070985208191146, 'samples': 14789632, 'steps': 28885, 'batch_loss/train': 0.7200067583471537} 12/26/2021 18:33:02 - 
INFO - codeparrot_training - Step 28886: {'lr': 0.0001906945387805175, 'samples': 14790144, 'steps': 28886, 'batch_loss/train': 0.6659502943803091} 12/26/2021 18:33:13 - INFO - codeparrot_training - Step 28887: {'lr': 0.00019067922571489965, 'samples': 14790656, 'steps': 28887, 'batch_loss/train': 0.7171438671648502} 12/26/2021 18:33:23 - INFO - codeparrot_training - Step 28888: {'lr': 0.00019066391288511872, 'samples': 14791168, 'steps': 28888, 'batch_loss/train': 0.6639409000054002} 12/26/2021 18:33:37 - INFO - codeparrot_training - Step 28889: {'lr': 0.00019064860029123577, 'samples': 14791680, 'steps': 28889, 'batch_loss/train': 0.8615855919197202} 12/26/2021 18:33:48 - INFO - codeparrot_training - Step 28890: {'lr': 0.00019063328793331158, 'samples': 14792192, 'steps': 28890, 'batch_loss/train': 0.8163237040862441} 12/26/2021 18:33:58 - INFO - codeparrot_training - Step 28891: {'lr': 0.0001906179758114069, 'samples': 14792704, 'steps': 28891, 'batch_loss/train': 0.633960684761405} 12/26/2021 18:34:11 - INFO - codeparrot_training - Step 28892: {'lr': 0.0001906026639255828, 'samples': 14793216, 'steps': 28892, 'batch_loss/train': 0.7564778616651893} 12/26/2021 18:34:21 - INFO - codeparrot_training - Step 28893: {'lr': 0.00019058735227590014, 'samples': 14793728, 'steps': 28893, 'batch_loss/train': 0.8553700018674135} 12/26/2021 18:34:32 - INFO - codeparrot_training - Step 28894: {'lr': 0.00019057204086241967, 'samples': 14794240, 'steps': 28894, 'batch_loss/train': 0.8595909820869565} 12/26/2021 18:34:44 - INFO - codeparrot_training - Step 28895: {'lr': 0.00019055672968520227, 'samples': 14794752, 'steps': 28895, 'batch_loss/train': 0.7632999671623111} 12/26/2021 18:34:54 - INFO - codeparrot_training - Step 28896: {'lr': 0.00019054141874430897, 'samples': 14795264, 'steps': 28896, 'batch_loss/train': 0.7203769063344225} 12/26/2021 18:35:05 - INFO - codeparrot_training - Step 28897: {'lr': 0.00019052610803980047, 'samples': 14795776, 'steps': 28897, 
'batch_loss/train': 0.7393959381151944} 12/26/2021 18:35:15 - INFO - codeparrot_training - Step 28898: {'lr': 0.00019051079757173778, 'samples': 14796288, 'steps': 28898, 'batch_loss/train': 0.6836082092486322} 12/26/2021 18:35:29 - INFO - codeparrot_training - Step 28899: {'lr': 0.0001904954873401816, 'samples': 14796800, 'steps': 28899, 'batch_loss/train': 0.8138849614188075} 12/26/2021 18:35:40 - INFO - codeparrot_training - Step 28900: {'lr': 0.00019048017734519298, 'samples': 14797312, 'steps': 28900, 'batch_loss/train': 0.7801123852841556} 12/26/2021 18:35:51 - INFO - codeparrot_training - Step 28901: {'lr': 0.00019046486758683267, 'samples': 14797824, 'steps': 28901, 'batch_loss/train': 0.6871439050883055} 12/26/2021 18:36:03 - INFO - codeparrot_training - Step 28902: {'lr': 0.0001904495580651615, 'samples': 14798336, 'steps': 28902, 'batch_loss/train': 0.7137213931418955} 12/26/2021 18:36:13 - INFO - codeparrot_training - Step 28903: {'lr': 0.0001904342487802405, 'samples': 14798848, 'steps': 28903, 'batch_loss/train': 0.6664562090300024} 12/26/2021 18:36:24 - INFO - codeparrot_training - Step 28904: {'lr': 0.00019041893973213041, 'samples': 14799360, 'steps': 28904, 'batch_loss/train': 0.7074668887071311} 12/26/2021 18:36:36 - INFO - codeparrot_training - Step 28905: {'lr': 0.00019040363092089215, 'samples': 14799872, 'steps': 28905, 'batch_loss/train': 0.7164854984730482} 12/26/2021 18:36:46 - INFO - codeparrot_training - Step 28906: {'lr': 0.00019038832234658646, 'samples': 14800384, 'steps': 28906, 'batch_loss/train': 0.7241043737158179} 12/26/2021 18:36:57 - INFO - codeparrot_training - Step 28907: {'lr': 0.00019037301400927435, 'samples': 14800896, 'steps': 28907, 'batch_loss/train': 0.7898507858626544} 12/26/2021 18:37:08 - INFO - codeparrot_training - Step 28908: {'lr': 0.00019035770590901667, 'samples': 14801408, 'steps': 28908, 'batch_loss/train': 0.8802862036973238} 12/26/2021 18:37:22 - INFO - codeparrot_training - Step 28909: {'lr': 
0.00019034239804587423, 'samples': 14801920, 'steps': 28909, 'batch_loss/train': 0.7548566181212664} 12/26/2021 18:37:32 - INFO - codeparrot_training - Step 28910: {'lr': 0.00019032709041990776, 'samples': 14802432, 'steps': 28910, 'batch_loss/train': 0.826603383757174} 12/26/2021 18:37:43 - INFO - codeparrot_training - Step 28911: {'lr': 0.0001903117830311783, 'samples': 14802944, 'steps': 28911, 'batch_loss/train': 0.7879684381186962} 12/26/2021 18:37:55 - INFO - codeparrot_training - Step 28912: {'lr': 0.00019029647587974672, 'samples': 14803456, 'steps': 28912, 'batch_loss/train': 0.7227597156306729} 12/26/2021 18:38:06 - INFO - codeparrot_training - Step 28913: {'lr': 0.00019028116896567377, 'samples': 14803968, 'steps': 28913, 'batch_loss/train': 0.7031620861962438} 12/26/2021 18:38:16 - INFO - codeparrot_training - Step 28914: {'lr': 0.00019026586228902025, 'samples': 14804480, 'steps': 28914, 'batch_loss/train': 0.6552574415691197} 12/26/2021 18:38:29 - INFO - codeparrot_training - Step 28915: {'lr': 0.00019025055584984726, 'samples': 14804992, 'steps': 28915, 'batch_loss/train': 0.6910053688334301} 12/26/2021 18:38:39 - INFO - codeparrot_training - Step 28916: {'lr': 0.00019023524964821543, 'samples': 14805504, 'steps': 28916, 'batch_loss/train': 0.7285337867215276} 12/26/2021 18:38:50 - INFO - codeparrot_training - Step 28917: {'lr': 0.00019021994368418563, 'samples': 14806016, 'steps': 28917, 'batch_loss/train': 0.7733558274339885} 12/26/2021 18:39:00 - INFO - codeparrot_training - Step 28918: {'lr': 0.00019020463795781884, 'samples': 14806528, 'steps': 28918, 'batch_loss/train': 0.6014138204045594} 12/26/2021 18:39:15 - INFO - codeparrot_training - Step 28919: {'lr': 0.00019018933246917587, 'samples': 14807040, 'steps': 28919, 'batch_loss/train': 0.7798480824567378} 12/26/2021 18:39:25 - INFO - codeparrot_training - Step 28920: {'lr': 0.00019017402721831746, 'samples': 14807552, 'steps': 28920, 'batch_loss/train': 0.9349698959849775} 12/26/2021 18:39:36 
- INFO - codeparrot_training - Step 28921: {'lr': 0.00019015872220530446, 'samples': 14808064, 'steps': 28921, 'batch_loss/train': 0.7350397827103734} 12/26/2021 18:39:48 - INFO - codeparrot_training - Step 28922: {'lr': 0.00019014341743019792, 'samples': 14808576, 'steps': 28922, 'batch_loss/train': 0.9683388532139361} 12/26/2021 18:39:59 - INFO - codeparrot_training - Step 28923: {'lr': 0.00019012811289305847, 'samples': 14809088, 'steps': 28923, 'batch_loss/train': 0.7625720240175724} 12/26/2021 18:40:09 - INFO - codeparrot_training - Step 28924: {'lr': 0.00019011280859394713, 'samples': 14809600, 'steps': 28924, 'batch_loss/train': 0.7457737261429429} 12/26/2021 18:40:21 - INFO - codeparrot_training - Step 28925: {'lr': 0.00019009750453292452, 'samples': 14810112, 'steps': 28925, 'batch_loss/train': 0.6421864966396242} 12/26/2021 18:40:32 - INFO - codeparrot_training - Step 28926: {'lr': 0.00019008220071005166, 'samples': 14810624, 'steps': 28926, 'batch_loss/train': 0.7281994836521335} 12/26/2021 18:40:43 - INFO - codeparrot_training - Step 28927: {'lr': 0.0001900668971253894, 'samples': 14811136, 'steps': 28927, 'batch_loss/train': 0.6723268392961472} 12/26/2021 18:40:54 - INFO - codeparrot_training - Step 28928: {'lr': 0.00019005159377899844, 'samples': 14811648, 'steps': 28928, 'batch_loss/train': 0.7403420526534319} 12/26/2021 18:41:07 - INFO - codeparrot_training - Step 28929: {'lr': 0.0001900362906709398, 'samples': 14812160, 'steps': 28929, 'batch_loss/train': 0.7797406665049493} 12/26/2021 18:41:18 - INFO - codeparrot_training - Step 28930: {'lr': 0.00019002098780127418, 'samples': 14812672, 'steps': 28930, 'batch_loss/train': 0.6813872284255922} 12/26/2021 18:41:29 - INFO - codeparrot_training - Step 28931: {'lr': 0.0001900056851700625, 'samples': 14813184, 'steps': 28931, 'batch_loss/train': 0.7933759158477187} 12/26/2021 18:41:41 - INFO - codeparrot_training - Step 28932: {'lr': 0.00018999038277736546, 'samples': 14813696, 'steps': 28932, 
'batch_loss/train': 0.756232550367713} 12/26/2021 18:41:52 - INFO - codeparrot_training - Step 28933: {'lr': 0.00018997508062324408, 'samples': 14814208, 'steps': 28933, 'batch_loss/train': 1.1592211741954088} 12/26/2021 18:42:03 - INFO - codeparrot_training - Step 28934: {'lr': 0.00018995977870775915, 'samples': 14814720, 'steps': 28934, 'batch_loss/train': 0.7207265356555581} 12/26/2021 18:42:13 - INFO - codeparrot_training - Step 28935: {'lr': 0.00018994447703097147, 'samples': 14815232, 'steps': 28935, 'batch_loss/train': 0.5600909854983911} 12/26/2021 18:42:27 - INFO - codeparrot_training - Step 28936: {'lr': 0.00018992917559294175, 'samples': 14815744, 'steps': 28936, 'batch_loss/train': 0.7018185124034062} 12/26/2021 18:42:38 - INFO - codeparrot_training - Step 28937: {'lr': 0.00018991387439373103, 'samples': 14816256, 'steps': 28937, 'batch_loss/train': 0.6411270787939429} 12/26/2021 18:42:48 - INFO - codeparrot_training - Step 28938: {'lr': 0.0001898985734334001, 'samples': 14816768, 'steps': 28938, 'batch_loss/train': 0.7116763283265755} 12/26/2021 18:43:00 - INFO - codeparrot_training - Step 28939: {'lr': 0.00018988327271200964, 'samples': 14817280, 'steps': 28939, 'batch_loss/train': 0.6482496517710388} 12/26/2021 18:43:11 - INFO - codeparrot_training - Step 28940: {'lr': 0.00018986797222962065, 'samples': 14817792, 'steps': 28940, 'batch_loss/train': 0.7738343281671405} 12/26/2021 18:43:22 - INFO - codeparrot_training - Step 28941: {'lr': 0.00018985267198629393, 'samples': 14818304, 'steps': 28941, 'batch_loss/train': 0.8456416008993983} 12/26/2021 18:43:34 - INFO - codeparrot_training - Step 28942: {'lr': 0.00018983737198209022, 'samples': 14818816, 'steps': 28942, 'batch_loss/train': 0.748426147736609} 12/26/2021 18:43:45 - INFO - codeparrot_training - Step 28943: {'lr': 0.00018982207221707038, 'samples': 14819328, 'steps': 28943, 'batch_loss/train': 0.7693975912407041} 12/26/2021 18:43:55 - INFO - codeparrot_training - Step 28944: {'lr': 
0.00018980677269129535, 'samples': 14819840, 'steps': 28944, 'batch_loss/train': 0.7774847998516634} 12/26/2021 18:44:07 - INFO - codeparrot_training - Step 28945: {'lr': 0.00018979147340482586, 'samples': 14820352, 'steps': 28945, 'batch_loss/train': 0.8675110847689211} 12/26/2021 18:44:18 - INFO - codeparrot_training - Step 28946: {'lr': 0.00018977617435772268, 'samples': 14820864, 'steps': 28946, 'batch_loss/train': 0.5733527820557356} 12/26/2021 18:44:29 - INFO - codeparrot_training - Step 28947: {'lr': 0.00018976087555004663, 'samples': 14821376, 'steps': 28947, 'batch_loss/train': 0.7822811836376786} 12/26/2021 18:44:39 - INFO - codeparrot_training - Step 28948: {'lr': 0.00018974557698185873, 'samples': 14821888, 'steps': 28948, 'batch_loss/train': 0.8882116731256247} 12/26/2021 18:44:53 - INFO - codeparrot_training - Step 28949: {'lr': 0.0001897302786532196, 'samples': 14822400, 'steps': 28949, 'batch_loss/train': 0.820354851312004} 12/26/2021 18:45:04 - INFO - codeparrot_training - Step 28950: {'lr': 0.00018971498056419018, 'samples': 14822912, 'steps': 28950, 'batch_loss/train': 0.778687619138509} 12/26/2021 18:45:14 - INFO - codeparrot_training - Step 28951: {'lr': 0.00018969968271483113, 'samples': 14823424, 'steps': 28951, 'batch_loss/train': 0.7581178434193134} 12/26/2021 18:45:27 - INFO - codeparrot_training - Step 28952: {'lr': 0.00018968438510520342, 'samples': 14823936, 'steps': 28952, 'batch_loss/train': 0.75962835829705} 12/26/2021 18:45:37 - INFO - codeparrot_training - Step 28953: {'lr': 0.00018966908773536783, 'samples': 14824448, 'steps': 28953, 'batch_loss/train': 0.840753513853997} 12/26/2021 18:45:48 - INFO - codeparrot_training - Step 28954: {'lr': 0.00018965379060538514, 'samples': 14824960, 'steps': 28954, 'batch_loss/train': 0.8234098274260759} 12/26/2021 18:46:00 - INFO - codeparrot_training - Step 28955: {'lr': 0.00018963849371531625, 'samples': 14825472, 'steps': 28955, 'batch_loss/train': 0.6122718791011721} 12/26/2021 18:46:10 - 
INFO - codeparrot_training - Step 28956: {'lr': 0.00018962319706522187, 'samples': 14825984, 'steps': 28956, 'batch_loss/train': 0.8657764159142971} 12/26/2021 18:46:21 - INFO - codeparrot_training - Step 28957: {'lr': 0.00018960790065516292, 'samples': 14826496, 'steps': 28957, 'batch_loss/train': 0.8106424817815423} 12/26/2021 18:46:32 - INFO - codeparrot_training - Step 28958: {'lr': 0.00018959260448520005, 'samples': 14827008, 'steps': 28958, 'batch_loss/train': 0.7983764284290373} 12/26/2021 18:46:45 - INFO - codeparrot_training - Step 28959: {'lr': 0.00018957730855539423, 'samples': 14827520, 'steps': 28959, 'batch_loss/train': 0.8016838021576405} 12/26/2021 18:46:56 - INFO - codeparrot_training - Step 28960: {'lr': 0.00018956201286580627, 'samples': 14828032, 'steps': 28960, 'batch_loss/train': 0.7622413949575275} 12/26/2021 18:47:07 - INFO - codeparrot_training - Step 28961: {'lr': 0.00018954671741649686, 'samples': 14828544, 'steps': 28961, 'batch_loss/train': 0.6710925581865013} 12/26/2021 18:47:20 - INFO - codeparrot_training - Step 28962: {'lr': 0.00018953142220752683, 'samples': 14829056, 'steps': 28962, 'batch_loss/train': 0.687056164140813} 12/26/2021 18:47:31 - INFO - codeparrot_training - Step 28963: {'lr': 0.00018951612723895714, 'samples': 14829568, 'steps': 28963, 'batch_loss/train': 0.4698373844148591} 12/26/2021 18:47:41 - INFO - codeparrot_training - Step 28964: {'lr': 0.00018950083251084848, 'samples': 14830080, 'steps': 28964, 'batch_loss/train': 1.096222042106092} 12/26/2021 18:47:52 - INFO - codeparrot_training - Step 28965: {'lr': 0.00018948553802326158, 'samples': 14830592, 'steps': 28965, 'batch_loss/train': 1.0298641081899405} 12/26/2021 18:48:05 - INFO - codeparrot_training - Step 28966: {'lr': 0.0001894702437762574, 'samples': 14831104, 'steps': 28966, 'batch_loss/train': 0.6548490058630705} 12/26/2021 18:48:16 - INFO - codeparrot_training - Step 28967: {'lr': 0.0001894549497698967, 'samples': 14831616, 'steps': 28967, 
'batch_loss/train': 0.6100396247929893} 12/26/2021 18:48:27 - INFO - codeparrot_training - Step 28968: {'lr': 0.0001894396560042402, 'samples': 14832128, 'steps': 28968, 'batch_loss/train': 0.8558042412623763} 12/26/2021 18:48:39 - INFO - codeparrot_training - Step 28969: {'lr': 0.00018942436247934872, 'samples': 14832640, 'steps': 28969, 'batch_loss/train': 0.865026465151459} 12/26/2021 18:48:49 - INFO - codeparrot_training - Step 28970: {'lr': 0.00018940906919528319, 'samples': 14833152, 'steps': 28970, 'batch_loss/train': 0.6998450751416385} 12/26/2021 18:49:00 - INFO - codeparrot_training - Step 28971: {'lr': 0.00018939377615210428, 'samples': 14833664, 'steps': 28971, 'batch_loss/train': 0.7666260665282607} 12/26/2021 18:49:14 - INFO - codeparrot_training - Step 28972: {'lr': 0.0001893784833498729, 'samples': 14834176, 'steps': 28972, 'batch_loss/train': 0.7638696758076549} 12/26/2021 18:49:24 - INFO - codeparrot_training - Step 28973: {'lr': 0.00018936319078864967, 'samples': 14834688, 'steps': 28973, 'batch_loss/train': 0.7996690911240876} 12/26/2021 18:49:35 - INFO - codeparrot_training - Step 28974: {'lr': 0.0001893478984684956, 'samples': 14835200, 'steps': 28974, 'batch_loss/train': 0.7926881071180105} 12/26/2021 18:49:47 - INFO - codeparrot_training - Step 28975: {'lr': 0.00018933260638947133, 'samples': 14835712, 'steps': 28975, 'batch_loss/train': 0.7183591821230948} 12/26/2021 18:49:57 - INFO - codeparrot_training - Step 28976: {'lr': 0.00018931731455163768, 'samples': 14836224, 'steps': 28976, 'batch_loss/train': 0.6923348060809076} 12/26/2021 18:50:08 - INFO - codeparrot_training - Step 28977: {'lr': 0.00018930202295505555, 'samples': 14836736, 'steps': 28977, 'batch_loss/train': 0.6932151750661433} 12/26/2021 18:50:19 - INFO - codeparrot_training - Step 28978: {'lr': 0.0001892867315997856, 'samples': 14837248, 'steps': 28978, 'batch_loss/train': 0.7241353108547628} 12/26/2021 18:50:31 - INFO - codeparrot_training - Step 28979: {'lr': 
0.00018927144048588876, 'samples': 14837760, 'steps': 28979, 'batch_loss/train': 0.8030118467286229} 12/26/2021 18:50:42 - INFO - codeparrot_training - Step 28980: {'lr': 0.00018925614961342563, 'samples': 14838272, 'steps': 28980, 'batch_loss/train': 0.7329310327768326} 12/26/2021 18:50:52 - INFO - codeparrot_training - Step 28981: {'lr': 0.00018924085898245718, 'samples': 14838784, 'steps': 28981, 'batch_loss/train': 0.6035568428924307} 12/26/2021 18:51:06 - INFO - codeparrot_training - Step 28982: {'lr': 0.00018922556859304413, 'samples': 14839296, 'steps': 28982, 'batch_loss/train': 0.758103009313345} 12/26/2021 18:51:16 - INFO - codeparrot_training - Step 28983: {'lr': 0.0001892102784452473, 'samples': 14839808, 'steps': 28983, 'batch_loss/train': 0.7860074061900377} 12/26/2021 18:51:27 - INFO - codeparrot_training - Step 28984: {'lr': 0.00018919498853912738, 'samples': 14840320, 'steps': 28984, 'batch_loss/train': 0.7114294543862343} 12/26/2021 18:51:39 - INFO - codeparrot_training - Step 28985: {'lr': 0.00018917969887474522, 'samples': 14840832, 'steps': 28985, 'batch_loss/train': 0.6754752709530294} 12/26/2021 18:51:50 - INFO - codeparrot_training - Step 28986: {'lr': 0.0001891644094521617, 'samples': 14841344, 'steps': 28986, 'batch_loss/train': 0.8180391388013959} 12/26/2021 18:52:00 - INFO - codeparrot_training - Step 28987: {'lr': 0.00018914912027143745, 'samples': 14841856, 'steps': 28987, 'batch_loss/train': 0.8050239961594343} 12/26/2021 18:52:11 - INFO - codeparrot_training - Step 28988: {'lr': 0.0001891338313326333, 'samples': 14842368, 'steps': 28988, 'batch_loss/train': 0.5393848712556064} 12/26/2021 18:52:23 - INFO - codeparrot_training - Step 28989: {'lr': 0.00018911854263581012, 'samples': 14842880, 'steps': 28989, 'batch_loss/train': 0.7595449036452919} 12/26/2021 18:52:34 - INFO - codeparrot_training - Step 28990: {'lr': 0.00018910325418102866, 'samples': 14843392, 'steps': 28990, 'batch_loss/train': 0.7389806937426329} 12/26/2021 18:52:44 - 
INFO - codeparrot_training - Step 28991: {'lr': 0.00018908796596834955, 'samples': 14843904, 'steps': 28991, 'batch_loss/train': 0.8145812731236219} 12/26/2021 18:52:58 - INFO - codeparrot_training - Step 28992: {'lr': 0.00018907267799783372, 'samples': 14844416, 'steps': 28992, 'batch_loss/train': 0.7459706030786037} 12/26/2021 18:53:09 - INFO - codeparrot_training - Step 28993: {'lr': 0.000189057390269542, 'samples': 14844928, 'steps': 28993, 'batch_loss/train': 0.6344381775707006} 12/26/2021 18:53:19 - INFO - codeparrot_training - Step 28994: {'lr': 0.00018904210278353502, 'samples': 14845440, 'steps': 28994, 'batch_loss/train': 0.8024381194263697} 12/26/2021 18:53:31 - INFO - codeparrot_training - Step 28995: {'lr': 0.00018902681553987355, 'samples': 14845952, 'steps': 28995, 'batch_loss/train': 0.8127606576308608} 12/26/2021 18:53:42 - INFO - codeparrot_training - Step 28996: {'lr': 0.00018901152853861858, 'samples': 14846464, 'steps': 28996, 'batch_loss/train': 0.6951859355904162} 12/26/2021 18:53:53 - INFO - codeparrot_training - Step 28997: {'lr': 0.0001889962417798307, 'samples': 14846976, 'steps': 28997, 'batch_loss/train': 0.8431024523451924} 12/26/2021 18:54:03 - INFO - codeparrot_training - Step 28998: {'lr': 0.00018898095526357074, 'samples': 14847488, 'steps': 28998, 'batch_loss/train': 0.7615198912099004} 12/26/2021 18:54:17 - INFO - codeparrot_training - Step 28999: {'lr': 0.00018896566898989937, 'samples': 14848000, 'steps': 28999, 'batch_loss/train': 0.8015931891277432} 12/26/2021 18:54:27 - INFO - codeparrot_training - Step 29000: {'lr': 0.00018895038295887757, 'samples': 14848512, 'steps': 29000, 'batch_loss/train': 0.7867150767706335} 12/26/2021 18:54:38 - INFO - codeparrot_training - Step 29001: {'lr': 0.0001889350971705659, 'samples': 14849024, 'steps': 29001, 'batch_loss/train': 0.6955624767579138} 12/26/2021 18:54:50 - INFO - codeparrot_training - Step 29002: {'lr': 0.00018891981162502526, 'samples': 14849536, 'steps': 29002, 
'batch_loss/train': 0.7817931994795799} 12/26/2021 18:55:01 - INFO - codeparrot_training - Step 29003: {'lr': 0.00018890452632231642, 'samples': 14850048, 'steps': 29003, 'batch_loss/train': 0.6109686140553094} 12/26/2021 18:55:11 - INFO - codeparrot_training - Step 29004: {'lr': 0.0001888892412625001, 'samples': 14850560, 'steps': 29004, 'batch_loss/train': 0.7627303544431925} 12/26/2021 18:55:23 - INFO - codeparrot_training - Step 29005: {'lr': 0.0001888739564456371, 'samples': 14851072, 'steps': 29005, 'batch_loss/train': 0.6994947935454547} 12/26/2021 18:55:34 - INFO - codeparrot_training - Step 29006: {'lr': 0.0001888586718717881, 'samples': 14851584, 'steps': 29006, 'batch_loss/train': 0.7345377129968256} 12/26/2021 18:55:45 - INFO - codeparrot_training - Step 29007: {'lr': 0.00018884338754101396, 'samples': 14852096, 'steps': 29007, 'batch_loss/train': 0.7360991984605789} 12/26/2021 18:55:57 - INFO - codeparrot_training - Step 29008: {'lr': 0.00018882810345337547, 'samples': 14852608, 'steps': 29008, 'batch_loss/train': 0.6838936619460583} 12/26/2021 18:56:08 - INFO - codeparrot_training - Step 29009: {'lr': 0.00018881281960893338, 'samples': 14853120, 'steps': 29009, 'batch_loss/train': 0.7774518541991711} 12/26/2021 18:56:18 - INFO - codeparrot_training - Step 29010: {'lr': 0.00018879753600774827, 'samples': 14853632, 'steps': 29010, 'batch_loss/train': 0.8308789683505893} 12/26/2021 18:56:29 - INFO - codeparrot_training - Step 29011: {'lr': 0.0001887822526498811, 'samples': 14854144, 'steps': 29011, 'batch_loss/train': 0.8141008000820875} 12/26/2021 18:56:44 - INFO - codeparrot_training - Step 29012: {'lr': 0.00018876696953539266, 'samples': 14854656, 'steps': 29012, 'batch_loss/train': 0.6573784542270005} 12/26/2021 18:56:54 - INFO - codeparrot_training - Step 29013: {'lr': 0.0001887516866643435, 'samples': 14855168, 'steps': 29013, 'batch_loss/train': 0.7441991148516536} 12/26/2021 18:57:05 - INFO - codeparrot_training - Step 29014: {'lr': 
0.00018873640403679456, 'samples': 14855680, 'steps': 29014, 'batch_loss/train': 0.665341017767787} 12/26/2021 18:57:17 - INFO - codeparrot_training - Step 29015: {'lr': 0.00018872112165280663, 'samples': 14856192, 'steps': 29015, 'batch_loss/train': 0.7043049461208284} 12/26/2021 18:57:28 - INFO - codeparrot_training - Step 29016: {'lr': 0.0001887058395124403, 'samples': 14856704, 'steps': 29016, 'batch_loss/train': 0.7396035115234554} 12/26/2021 18:57:38 - INFO - codeparrot_training - Step 29017: {'lr': 0.00018869055761575637, 'samples': 14857216, 'steps': 29017, 'batch_loss/train': 0.8017915841192007} 12/26/2021 18:57:50 - INFO - codeparrot_training - Step 29018: {'lr': 0.00018867527596281571, 'samples': 14857728, 'steps': 29018, 'batch_loss/train': 0.7249186602421105} 12/26/2021 18:58:01 - INFO - codeparrot_training - Step 29019: {'lr': 0.00018865999455367906, 'samples': 14858240, 'steps': 29019, 'batch_loss/train': 0.7687780177220702} 12/26/2021 18:58:12 - INFO - codeparrot_training - Step 29020: {'lr': 0.000188644713388407, 'samples': 14858752, 'steps': 29020, 'batch_loss/train': 0.667943978914991} 12/26/2021 18:58:22 - INFO - codeparrot_training - Step 29021: {'lr': 0.00018862943246706034, 'samples': 14859264, 'steps': 29021, 'batch_loss/train': 0.5987191246822476} 12/26/2021 18:58:36 - INFO - codeparrot_training - Step 29022: {'lr': 0.0001886141517897, 'samples': 14859776, 'steps': 29022, 'batch_loss/train': 0.815562434727326} 12/26/2021 18:58:47 - INFO - codeparrot_training - Step 29023: {'lr': 0.0001885988713563866, 'samples': 14860288, 'steps': 29023, 'batch_loss/train': 1.0082009881734848} 12/26/2021 18:58:58 - INFO - codeparrot_training - Step 29024: {'lr': 0.00018858359116718092, 'samples': 14860800, 'steps': 29024, 'batch_loss/train': 1.771474645473063} 12/26/2021 18:59:10 - INFO - codeparrot_training - Step 29025: {'lr': 0.00018856831122214358, 'samples': 14861312, 'steps': 29025, 'batch_loss/train': 0.7532705797348171} 12/26/2021 18:59:20 - INFO - 
codeparrot_training - Step 29026: {'lr': 0.00018855303152133553, 'samples': 14861824, 'steps': 29026, 'batch_loss/train': 0.7630802430212498} 12/26/2021 18:59:31 - INFO - codeparrot_training - Step 29027: {'lr': 0.00018853775206481742, 'samples': 14862336, 'steps': 29027, 'batch_loss/train': 0.7770846514031291} 12/26/2021 18:59:45 - INFO - codeparrot_training - Step 29028: {'lr': 0.00018852247285264993, 'samples': 14862848, 'steps': 29028, 'batch_loss/train': 0.7671567182987928} 12/26/2021 18:59:55 - INFO - codeparrot_training - Step 29029: {'lr': 0.00018850719388489396, 'samples': 14863360, 'steps': 29029, 'batch_loss/train': 0.7224739035591483} 12/26/2021 19:00:06 - INFO - codeparrot_training - Step 29030: {'lr': 0.00018849191516161012, 'samples': 14863872, 'steps': 29030, 'batch_loss/train': 0.720713131595403} 12/26/2021 19:00:17 - INFO - codeparrot_training - Step 29031: {'lr': 0.00018847663668285927, 'samples': 14864384, 'steps': 29031, 'batch_loss/train': 0.8137127719819546} 12/26/2021 19:00:29 - INFO - codeparrot_training - Step 29032: {'lr': 0.000188461358448702, 'samples': 14864896, 'steps': 29032, 'batch_loss/train': 0.7303304960951209} 12/26/2021 19:00:39 - INFO - codeparrot_training - Step 29033: {'lr': 0.00018844608045919914, 'samples': 14865408, 'steps': 29033, 'batch_loss/train': 0.5848789477604441} 12/26/2021 19:00:50 - INFO - codeparrot_training - Step 29034: {'lr': 0.0001884308027144115, 'samples': 14865920, 'steps': 29034, 'batch_loss/train': 0.7952748765237629} 12/26/2021 19:01:02 - INFO - codeparrot_training - Step 29035: {'lr': 0.0001884155252143997, 'samples': 14866432, 'steps': 29035, 'batch_loss/train': 0.7444836581125855} 12/26/2021 19:01:13 - INFO - codeparrot_training - Step 29036: {'lr': 0.00018840024795922444, 'samples': 14866944, 'steps': 29036, 'batch_loss/train': 0.7654100181534886} 12/26/2021 19:01:23 - INFO - codeparrot_training - Step 29037: {'lr': 0.00018838497094894661, 'samples': 14867456, 'steps': 29037, 'batch_loss/train': 
0.7388808443211019} 12/26/2021 19:01:37 - INFO - codeparrot_training - Step 29038: {'lr': 0.00018836969418362692, 'samples': 14867968, 'steps': 29038, 'batch_loss/train': 0.7642673552036285} 12/26/2021 19:01:48 - INFO - codeparrot_training - Step 29039: {'lr': 0.00018835441766332591, 'samples': 14868480, 'steps': 29039, 'batch_loss/train': 0.717974629253149} 12/26/2021 19:01:59 - INFO - codeparrot_training - Step 29040: {'lr': 0.00018833914138810455, 'samples': 14868992, 'steps': 29040, 'batch_loss/train': 0.683283924125135} 12/26/2021 19:02:09 - INFO - codeparrot_training - Step 29041: {'lr': 0.00018832386535802353, 'samples': 14869504, 'steps': 29041, 'batch_loss/train': 0.7658702889457345} 12/26/2021 19:02:22 - INFO - codeparrot_training - Step 29042: {'lr': 0.00018830858957314344, 'samples': 14870016, 'steps': 29042, 'batch_loss/train': 0.8783514369279146} 12/26/2021 19:02:32 - INFO - codeparrot_training - Step 29043: {'lr': 0.00018829331403352512, 'samples': 14870528, 'steps': 29043, 'batch_loss/train': 0.777228806167841} 12/26/2021 19:02:43 - INFO - codeparrot_training - Step 29044: {'lr': 0.00018827803873922933, 'samples': 14871040, 'steps': 29044, 'batch_loss/train': 0.7610284658148885} 12/26/2021 19:02:55 - INFO - codeparrot_training - Step 29045: {'lr': 0.0001882627636903168, 'samples': 14871552, 'steps': 29045, 'batch_loss/train': 0.6933040800504386} 12/26/2021 19:03:06 - INFO - codeparrot_training - Step 29046: {'lr': 0.0001882474888868481, 'samples': 14872064, 'steps': 29046, 'batch_loss/train': 0.6871023094281554} 12/26/2021 19:03:16 - INFO - codeparrot_training - Step 29047: {'lr': 0.00018823221432888408, 'samples': 14872576, 'steps': 29047, 'batch_loss/train': 0.5375638029072434} 12/26/2021 19:03:28 - INFO - codeparrot_training - Step 29048: {'lr': 0.0001882169400164855, 'samples': 14873088, 'steps': 29048, 'batch_loss/train': 0.7301394315436482} 12/26/2021 19:03:39 - INFO - codeparrot_training - Step 29049: {'lr': 0.00018820166594971303, 'samples': 
14873600, 'steps': 29049, 'batch_loss/train': 0.7452266523614526} 12/26/2021 19:03:50 - INFO - codeparrot_training - Step 29050: {'lr': 0.0001881863921286273, 'samples': 14874112, 'steps': 29050, 'batch_loss/train': 0.6627343450672925} 12/26/2021 19:04:03 - INFO - codeparrot_training - Step 29051: {'lr': 0.0001881711185532893, 'samples': 14874624, 'steps': 29051, 'batch_loss/train': 0.7704428285360336} 12/26/2021 19:04:14 - INFO - codeparrot_training - Step 29052: {'lr': 0.00018815584522375952, 'samples': 14875136, 'steps': 29052, 'batch_loss/train': 0.6900751469656825} 12/26/2021 19:04:24 - INFO - codeparrot_training - Step 29053: {'lr': 0.00018814057214009878, 'samples': 14875648, 'steps': 29053, 'batch_loss/train': 0.6791513320058584} 12/26/2021 19:04:35 - INFO - codeparrot_training - Step 29054: {'lr': 0.00018812529930236764, 'samples': 14876160, 'steps': 29054, 'batch_loss/train': 0.7685759230516851} 12/26/2021 19:04:47 - INFO - codeparrot_training - Step 29055: {'lr': 0.0001881100267106271, 'samples': 14876672, 'steps': 29055, 'batch_loss/train': 0.7951775323599577} 12/26/2021 19:04:58 - INFO - codeparrot_training - Step 29056: {'lr': 0.00018809475436493768, 'samples': 14877184, 'steps': 29056, 'batch_loss/train': 0.7529133753851056} 12/26/2021 19:05:08 - INFO - codeparrot_training - Step 29057: {'lr': 0.00018807948226536015, 'samples': 14877696, 'steps': 29057, 'batch_loss/train': 0.7721777241677046} 12/26/2021 19:05:21 - INFO - codeparrot_training - Step 29058: {'lr': 0.00018806421041195513, 'samples': 14878208, 'steps': 29058, 'batch_loss/train': 0.7019713828340173} 12/26/2021 19:05:32 - INFO - codeparrot_training - Step 29059: {'lr': 0.00018804893880478346, 'samples': 14878720, 'steps': 29059, 'batch_loss/train': 0.8232401767745614} 12/26/2021 19:05:43 - INFO - codeparrot_training - Step 29060: {'lr': 0.0001880336674439059, 'samples': 14879232, 'steps': 29060, 'batch_loss/train': 0.6621752844657749} 12/26/2021 19:05:55 - INFO - codeparrot_training - Step 
29061: {'lr': 0.000188018396329383, 'samples': 14879744, 'steps': 29061, 'batch_loss/train': 0.7060574935749173} 12/26/2021 19:06:06 - INFO - codeparrot_training - Step 29062: {'lr': 0.00018800312546127552, 'samples': 14880256, 'steps': 29062, 'batch_loss/train': 0.7135654538869858} 12/26/2021 19:06:16 - INFO - codeparrot_training - Step 29063: {'lr': 0.0001879878548396443, 'samples': 14880768, 'steps': 29063, 'batch_loss/train': 0.6550926039344631} 12/26/2021 19:06:27 - INFO - codeparrot_training - Step 29064: {'lr': 0.00018797258446454998, 'samples': 14881280, 'steps': 29064, 'batch_loss/train': 0.7275648741051555} 12/26/2021 19:06:39 - INFO - codeparrot_training - Step 29065: {'lr': 0.0001879573143360531, 'samples': 14881792, 'steps': 29065, 'batch_loss/train': 0.8038358548656106} 12/26/2021 19:06:50 - INFO - codeparrot_training - Step 29066: {'lr': 0.00018794204445421459, 'samples': 14882304, 'steps': 29066, 'batch_loss/train': 0.7996502388268709} 12/26/2021 19:07:00 - INFO - codeparrot_training - Step 29067: {'lr': 0.00018792677481909513, 'samples': 14882816, 'steps': 29067, 'batch_loss/train': 0.6749419439584017} 12/26/2021 19:07:13 - INFO - codeparrot_training - Step 29068: {'lr': 0.0001879115054307553, 'samples': 14883328, 'steps': 29068, 'batch_loss/train': 0.7122602434828877} 12/26/2021 19:07:24 - INFO - codeparrot_training - Step 29069: {'lr': 0.00018789623628925585, 'samples': 14883840, 'steps': 29069, 'batch_loss/train': 0.7703879817854613} 12/26/2021 19:07:35 - INFO - codeparrot_training - Step 29070: {'lr': 0.00018788096739465762, 'samples': 14884352, 'steps': 29070, 'batch_loss/train': 0.7306543639861047} 12/26/2021 19:07:47 - INFO - codeparrot_training - Step 29071: {'lr': 0.00018786569874702119, 'samples': 14884864, 'steps': 29071, 'batch_loss/train': 0.7423250402789563} 12/26/2021 19:07:57 - INFO - codeparrot_training - Step 29072: {'lr': 0.00018785043034640726, 'samples': 14885376, 'steps': 29072, 'batch_loss/train': 0.8123738905414939} 
12/26/2021 19:08:08 - INFO - codeparrot_training - Step 29073: {'lr': 0.00018783516219287647, 'samples': 14885888, 'steps': 29073, 'batch_loss/train': 0.7134749845135957} 12/26/2021 19:08:19 - INFO - codeparrot_training - Step 29074: {'lr': 0.0001878198942864897, 'samples': 14886400, 'steps': 29074, 'batch_loss/train': 0.7446766961365938} 12/26/2021 19:08:32 - INFO - codeparrot_training - Step 29075: {'lr': 0.00018780462662730754, 'samples': 14886912, 'steps': 29075, 'batch_loss/train': 0.7512941488530487} 12/26/2021 19:08:42 - INFO - codeparrot_training - Step 29076: {'lr': 0.00018778935921539064, 'samples': 14887424, 'steps': 29076, 'batch_loss/train': 0.7083654664456844} 12/26/2021 19:08:53 - INFO - codeparrot_training - Step 29077: {'lr': 0.00018777409205079985, 'samples': 14887936, 'steps': 29077, 'batch_loss/train': 0.7666064855293371} 12/26/2021 19:09:05 - INFO - codeparrot_training - Step 29078: {'lr': 0.0001877588251335957, 'samples': 14888448, 'steps': 29078, 'batch_loss/train': 0.7489375080913305} 12/26/2021 19:09:16 - INFO - codeparrot_training - Step 29079: {'lr': 0.00018774355846383906, 'samples': 14888960, 'steps': 29079, 'batch_loss/train': 0.7392769875004888} 12/26/2021 19:09:27 - INFO - codeparrot_training - Step 29080: {'lr': 0.00018772829204159036, 'samples': 14889472, 'steps': 29080, 'batch_loss/train': 0.7271011353004724} 12/26/2021 19:09:39 - INFO - codeparrot_training - Step 29081: {'lr': 0.0001877130258669106, 'samples': 14889984, 'steps': 29081, 'batch_loss/train': 0.6918102940544486} 12/26/2021 19:09:49 - INFO - codeparrot_training - Step 29082: {'lr': 0.0001876977599398603, 'samples': 14890496, 'steps': 29082, 'batch_loss/train': 0.7289239293895662} 12/26/2021 19:10:00 - INFO - codeparrot_training - Step 29083: {'lr': 0.00018768249426050023, 'samples': 14891008, 'steps': 29083, 'batch_loss/train': 0.7334441601997241} 12/26/2021 19:10:12 - INFO - codeparrot_training - Step 29084: {'lr': 0.00018766722882889087, 'samples': 14891520, 
'steps': 29084, 'batch_loss/train': 0.9558575209230185} 12/26/2021 19:10:23 - INFO - codeparrot_training - Step 29085: {'lr': 0.00018765196364509316, 'samples': 14892032, 'steps': 29085, 'batch_loss/train': 0.7638478712178767} 12/26/2021 19:10:33 - INFO - codeparrot_training - Step 29086: {'lr': 0.00018763669870916778, 'samples': 14892544, 'steps': 29086, 'batch_loss/train': 0.7259996579959989} 12/26/2021 19:10:44 - INFO - codeparrot_training - Step 29087: {'lr': 0.00018762143402117519, 'samples': 14893056, 'steps': 29087, 'batch_loss/train': 0.719693630002439} 12/26/2021 19:10:57 - INFO - codeparrot_training - Step 29088: {'lr': 0.0001876061695811763, 'samples': 14893568, 'steps': 29088, 'batch_loss/train': 0.8107340540736914} 12/26/2021 19:11:08 - INFO - codeparrot_training - Step 29089: {'lr': 0.00018759090538923175, 'samples': 14894080, 'steps': 29089, 'batch_loss/train': 0.8464975850656629} 12/26/2021 19:11:19 - INFO - codeparrot_training - Step 29090: {'lr': 0.00018757564144540224, 'samples': 14894592, 'steps': 29090, 'batch_loss/train': 0.7429689546115696} 12/26/2021 19:11:31 - INFO - codeparrot_training - Step 29091: {'lr': 0.00018756037774974826, 'samples': 14895104, 'steps': 29091, 'batch_loss/train': 0.6830819873139262} 12/26/2021 19:11:41 - INFO - codeparrot_training - Step 29092: {'lr': 0.0001875451143023307, 'samples': 14895616, 'steps': 29092, 'batch_loss/train': 0.72918985132128} 12/26/2021 19:11:52 - INFO - codeparrot_training - Step 29093: {'lr': 0.00018752985110321023, 'samples': 14896128, 'steps': 29093, 'batch_loss/train': 0.7359472964890301} 12/26/2021 19:12:04 - INFO - codeparrot_training - Step 29094: {'lr': 0.0001875145881524475, 'samples': 14896640, 'steps': 29094, 'batch_loss/train': 0.8522191476076841} 12/26/2021 19:12:15 - INFO - codeparrot_training - Step 29095: {'lr': 0.00018749932545010305, 'samples': 14897152, 'steps': 29095, 'batch_loss/train': 0.6955152899026871} 12/26/2021 19:12:25 - INFO - codeparrot_training - Step 29096: 
{'lr': 0.00018748406299623783, 'samples': 14897664, 'steps': 29096, 'batch_loss/train': 0.7729864290449768} 12/26/2021 19:12:36 - INFO - codeparrot_training - Step 29097: {'lr': 0.00018746880079091227, 'samples': 14898176, 'steps': 29097, 'batch_loss/train': 0.7263462780974805} 12/26/2021 19:12:49 - INFO - codeparrot_training - Step 29098: {'lr': 0.00018745353883418725, 'samples': 14898688, 'steps': 29098, 'batch_loss/train': 0.828366358182393} 12/26/2021 19:13:00 - INFO - codeparrot_training - Step 29099: {'lr': 0.00018743827712612321, 'samples': 14899200, 'steps': 29099, 'batch_loss/train': 0.7092323621036485} 12/26/2021 19:13:10 - INFO - codeparrot_training - Step 29100: {'lr': 0.00018742301566678106, 'samples': 14899712, 'steps': 29100, 'batch_loss/train': 0.674156315100845} 12/26/2021 19:13:23 - INFO - codeparrot_training - Step 29101: {'lr': 0.00018740775445622132, 'samples': 14900224, 'steps': 29101, 'batch_loss/train': 0.7751687942072749} 12/26/2021 19:13:33 - INFO - codeparrot_training - Step 29102: {'lr': 0.00018739249349450466, 'samples': 14900736, 'steps': 29102, 'batch_loss/train': 0.7605547625571489} 12/26/2021 19:13:44 - INFO - codeparrot_training - Step 29103: {'lr': 0.00018737723278169194, 'samples': 14901248, 'steps': 29103, 'batch_loss/train': 0.7640549065545201} 12/26/2021 19:13:57 - INFO - codeparrot_training - Step 29104: {'lr': 0.00018736197231784363, 'samples': 14901760, 'steps': 29104, 'batch_loss/train': 0.8279954849276692} 12/26/2021 19:14:08 - INFO - codeparrot_training - Step 29105: {'lr': 0.0001873467121030205, 'samples': 14902272, 'steps': 29105, 'batch_loss/train': 0.8165611801669002} 12/26/2021 19:14:18 - INFO - codeparrot_training - Step 29106: {'lr': 0.0001873314521372831, 'samples': 14902784, 'steps': 29106, 'batch_loss/train': 0.5290177186252549} 12/26/2021 19:14:29 - INFO - codeparrot_training - Step 29107: {'lr': 0.00018731619242069223, 'samples': 14903296, 'steps': 29107, 'batch_loss/train': 0.7936170413158834} 12/26/2021 
19:14:41 - INFO - codeparrot_training - Step 29108: {'lr': 0.00018730093295330855, 'samples': 14903808, 'steps': 29108, 'batch_loss/train': 0.7861821250990033} 12/26/2021 19:14:52 - INFO - codeparrot_training - Step 29109: {'lr': 0.0001872856737351927, 'samples': 14904320, 'steps': 29109, 'batch_loss/train': 0.7718688161112368} 12/26/2021 19:15:02 - INFO - codeparrot_training - Step 29110: {'lr': 0.0001872704147664052, 'samples': 14904832, 'steps': 29110, 'batch_loss/train': 0.7336207628250122} 12/26/2021 19:15:14 - INFO - codeparrot_training - Step 29111: {'lr': 0.00018725515604700694, 'samples': 14905344, 'steps': 29111, 'batch_loss/train': 0.6067606764263473} 12/26/2021 19:15:25 - INFO - codeparrot_training - Step 29112: {'lr': 0.00018723989757705854, 'samples': 14905856, 'steps': 29112, 'batch_loss/train': 0.8319720160216093} 12/26/2021 19:15:36 - INFO - codeparrot_training - Step 29113: {'lr': 0.00018722463935662048, 'samples': 14906368, 'steps': 29113, 'batch_loss/train': 0.7840174646116793} 12/26/2021 19:15:50 - INFO - codeparrot_training - Step 29114: {'lr': 0.0001872093813857536, 'samples': 14906880, 'steps': 29114, 'batch_loss/train': 1.8012216938659549} 12/26/2021 19:16:00 - INFO - codeparrot_training - Step 29115: {'lr': 0.0001871941236645186, 'samples': 14907392, 'steps': 29115, 'batch_loss/train': 2.7565476140007377} 12/26/2021 19:16:11 - INFO - codeparrot_training - Step 29116: {'lr': 0.00018717886619297602, 'samples': 14907904, 'steps': 29116, 'batch_loss/train': 0.7240079673938453} 12/26/2021 19:16:22 - INFO - codeparrot_training - Step 29117: {'lr': 0.00018716360897118645, 'samples': 14908416, 'steps': 29117, 'batch_loss/train': 0.8302714684978127} 12/26/2021 19:16:34 - INFO - codeparrot_training - Step 29118: {'lr': 0.00018714835199921072, 'samples': 14908928, 'steps': 29118, 'batch_loss/train': 0.8441948768449947} 12/26/2021 19:16:45 - INFO - codeparrot_training - Step 29119: {'lr': 0.00018713309527710942, 'samples': 14909440, 'steps': 29119, 
'batch_loss/train': 0.734609343460761} 12/26/2021 19:16:55 - INFO - codeparrot_training - Step 29120: {'lr': 0.00018711783880494317, 'samples': 14909952, 'steps': 29120, 'batch_loss/train': 0.6222301808884367} 12/26/2021 19:17:07 - INFO - codeparrot_training - Step 29121: {'lr': 0.00018710258258277258, 'samples': 14910464, 'steps': 29121, 'batch_loss/train': 0.6881431760266423} 12/26/2021 19:17:18 - INFO - codeparrot_training - Step 29122: {'lr': 0.00018708732661065847, 'samples': 14910976, 'steps': 29122, 'batch_loss/train': 0.46315847284859046} 12/26/2021 19:17:29 - INFO - codeparrot_training - Step 29123: {'lr': 0.00018707207088866136, 'samples': 14911488, 'steps': 29123, 'batch_loss/train': 0.7930748630315065} 12/26/2021 19:17:41 - INFO - codeparrot_training - Step 29124: {'lr': 0.00018705681541684188, 'samples': 14912000, 'steps': 29124, 'batch_loss/train': 0.7856074566952884} 12/26/2021 19:17:52 - INFO - codeparrot_training - Step 29125: {'lr': 0.00018704156019526085, 'samples': 14912512, 'steps': 29125, 'batch_loss/train': 0.7022804426960647} 12/26/2021 19:18:02 - INFO - codeparrot_training - Step 29126: {'lr': 0.0001870263052239788, 'samples': 14913024, 'steps': 29126, 'batch_loss/train': 0.7237678440287709} 12/26/2021 19:18:16 - INFO - codeparrot_training - Step 29127: {'lr': 0.00018701105050305633, 'samples': 14913536, 'steps': 29127, 'batch_loss/train': 0.7541758469305933} 12/26/2021 19:18:27 - INFO - codeparrot_training - Step 29128: {'lr': 0.00018699579603255408, 'samples': 14914048, 'steps': 29128, 'batch_loss/train': 0.7024395726621151} 12/26/2021 19:18:37 - INFO - codeparrot_training - Step 29129: {'lr': 0.00018698054181253286, 'samples': 14914560, 'steps': 29129, 'batch_loss/train': 0.7273759711533785} 12/26/2021 19:18:48 - INFO - codeparrot_training - Step 29130: {'lr': 0.00018696528784305317, 'samples': 14915072, 'steps': 29130, 'batch_loss/train': 0.7299644565209746} 12/26/2021 19:19:00 - INFO - codeparrot_training - Step 29131: {'lr': 
0.00018695003412417573, 'samples': 14915584, 'steps': 29131, 'batch_loss/train': 0.4778471114113927} 12/26/2021 19:19:11 - INFO - codeparrot_training - Step 29132: {'lr': 0.00018693478065596103, 'samples': 14916096, 'steps': 29132, 'batch_loss/train': 0.7302698823623359} 12/26/2021 19:19:21 - INFO - codeparrot_training - Step 29133: {'lr': 0.0001869195274384699, 'samples': 14916608, 'steps': 29133, 'batch_loss/train': 0.5233756238012575} 12/26/2021 19:19:34 - INFO - codeparrot_training - Step 29134: {'lr': 0.00018690427447176297, 'samples': 14917120, 'steps': 29134, 'batch_loss/train': 0.7049937276169658} 12/26/2021 19:19:44 - INFO - codeparrot_training - Step 29135: {'lr': 0.00018688902175590082, 'samples': 14917632, 'steps': 29135, 'batch_loss/train': 0.565937063191086} 12/26/2021 19:19:55 - INFO - codeparrot_training - Step 29136: {'lr': 0.000186873769290944, 'samples': 14918144, 'steps': 29136, 'batch_loss/train': 0.6481486322591081} 12/26/2021 19:20:09 - INFO - codeparrot_training - Step 29137: {'lr': 0.00018685851707695325, 'samples': 14918656, 'steps': 29137, 'batch_loss/train': 0.8052274724468589} 12/26/2021 19:20:20 - INFO - codeparrot_training - Step 29138: {'lr': 0.0001868432651139893, 'samples': 14919168, 'steps': 29138, 'batch_loss/train': 0.6105230491375551} 12/26/2021 19:20:30 - INFO - codeparrot_training - Step 29139: {'lr': 0.00018682801340211253, 'samples': 14919680, 'steps': 29139, 'batch_loss/train': 0.7534569530980662} 12/26/2021 19:20:41 - INFO - codeparrot_training - Step 29140: {'lr': 0.0001868127619413838, 'samples': 14920192, 'steps': 29140, 'batch_loss/train': 0.5379987657652237} 12/26/2021 19:20:53 - INFO - codeparrot_training - Step 29141: {'lr': 0.0001867975107318637, 'samples': 14920704, 'steps': 29141, 'batch_loss/train': 0.7523747679078951} 12/26/2021 19:21:04 - INFO - codeparrot_training - Step 29142: {'lr': 0.0001867822597736128, 'samples': 14921216, 'steps': 29142, 'batch_loss/train': 0.7538403049111366} 12/26/2021 19:21:14 - 
INFO - codeparrot_training - Step 29143: {'lr': 0.0001867670090666917, 'samples': 14921728, 'steps': 29143, 'batch_loss/train': 0.6271502864547074} 12/26/2021 19:21:28 - INFO - codeparrot_training - Step 29144: {'lr': 0.00018675175861116122, 'samples': 14922240, 'steps': 29144, 'batch_loss/train': 0.7846818490652367} 12/26/2021 19:21:38 - INFO - codeparrot_training - Step 29145: {'lr': 0.00018673650840708185, 'samples': 14922752, 'steps': 29145, 'batch_loss/train': 0.7439807094633579} 12/26/2021 19:21:49 - INFO - codeparrot_training - Step 29146: {'lr': 0.0001867212584545142, 'samples': 14923264, 'steps': 29146, 'batch_loss/train': 0.6569117931649089} 12/26/2021 19:22:01 - INFO - codeparrot_training - Step 29147: {'lr': 0.00018670600875351885, 'samples': 14923776, 'steps': 29147, 'batch_loss/train': 0.8608679189346731} 12/26/2021 19:22:12 - INFO - codeparrot_training - Step 29148: {'lr': 0.00018669075930415662, 'samples': 14924288, 'steps': 29148, 'batch_loss/train': 0.7430944419465959} 12/26/2021 19:22:22 - INFO - codeparrot_training - Step 29149: {'lr': 0.000186675510106488, 'samples': 14924800, 'steps': 29149, 'batch_loss/train': 0.7841005511581898} 12/26/2021 19:22:34 - INFO - codeparrot_training - Step 29150: {'lr': 0.00018666026116057361, 'samples': 14925312, 'steps': 29150, 'batch_loss/train': 0.824350954964757} 12/26/2021 19:22:45 - INFO - codeparrot_training - Step 29151: {'lr': 0.00018664501246647413, 'samples': 14925824, 'steps': 29151, 'batch_loss/train': 0.6863889247179031} 12/26/2021 19:22:56 - INFO - codeparrot_training - Step 29152: {'lr': 0.0001866297640242502, 'samples': 14926336, 'steps': 29152, 'batch_loss/train': 0.70354831404984} 12/26/2021 19:23:06 - INFO - codeparrot_training - Step 29153: {'lr': 0.0001866145158339624, 'samples': 14926848, 'steps': 29153, 'batch_loss/train': 0.718204386997968} 12/26/2021 19:23:21 - INFO - codeparrot_training - Step 29154: {'lr': 0.00018659926789567122, 'samples': 14927360, 'steps': 29154, 'batch_loss/train': 
0.7281481185927987} 12/26/2021 19:23:32 - INFO - codeparrot_training - Step 29155: {'lr': 0.00018658402020943756, 'samples': 14927872, 'steps': 29155, 'batch_loss/train': 0.7554166307672858} 12/26/2021 19:23:42 - INFO - codeparrot_training - Step 29156: {'lr': 0.00018656877277532186, 'samples': 14928384, 'steps': 29156, 'batch_loss/train': 0.612966782762669} 12/26/2021 19:23:55 - INFO - codeparrot_training - Step 29157: {'lr': 0.0001865535255933848, 'samples': 14928896, 'steps': 29157, 'batch_loss/train': 0.6294105320703238} 12/26/2021 19:24:05 - INFO - codeparrot_training - Step 29158: {'lr': 0.00018653827866368683, 'samples': 14929408, 'steps': 29158, 'batch_loss/train': 0.7924152193590999} 12/26/2021 19:24:16 - INFO - codeparrot_training - Step 29159: {'lr': 0.00018652303198628877, 'samples': 14929920, 'steps': 29159, 'batch_loss/train': 0.7378929546102881} 12/26/2021 19:24:28 - INFO - codeparrot_training - Step 29160: {'lr': 0.00018650778556125123, 'samples': 14930432, 'steps': 29160, 'batch_loss/train': 0.9366162478690967} 12/26/2021 19:24:39 - INFO - codeparrot_training - Step 29161: {'lr': 0.00018649253938863463, 'samples': 14930944, 'steps': 29161, 'batch_loss/train': 0.724897887208499} 12/26/2021 19:24:49 - INFO - codeparrot_training - Step 29162: {'lr': 0.00018647729346849985, 'samples': 14931456, 'steps': 29162, 'batch_loss/train': 0.6673558859620243} 12/26/2021 19:25:00 - INFO - codeparrot_training - Step 29163: {'lr': 0.0001864620478009073, 'samples': 14931968, 'steps': 29163, 'batch_loss/train': 0.6328043635003269} 12/26/2021 19:25:12 - INFO - codeparrot_training - Step 29164: {'lr': 0.00018644680238591772, 'samples': 14932480, 'steps': 29164, 'batch_loss/train': 0.7253497736528516} 12/26/2021 19:25:23 - INFO - codeparrot_training - Step 29165: {'lr': 0.00018643155722359155, 'samples': 14932992, 'steps': 29165, 'batch_loss/train': 0.7772086830809712} 12/26/2021 19:25:34 - INFO - codeparrot_training - Step 29166: {'lr': 0.00018641631231398954, 
'samples': 14933504, 'steps': 29166, 'batch_loss/train': 0.8955100369639695} 12/26/2021 19:25:47 - INFO - codeparrot_training - Step 29167: {'lr': 0.0001864010676571723, 'samples': 14934016, 'steps': 29167, 'batch_loss/train': 0.5567328815814108} 12/26/2021 19:25:58 - INFO - codeparrot_training - Step 29168: {'lr': 0.0001863858232532004, 'samples': 14934528, 'steps': 29168, 'batch_loss/train': 0.7658443944528699} 12/26/2021 19:26:08 - INFO - codeparrot_training - Step 29169: {'lr': 0.00018637057910213438, 'samples': 14935040, 'steps': 29169, 'batch_loss/train': 0.7630638303235173} 12/26/2021 19:26:20 - INFO - codeparrot_training - Step 29170: {'lr': 0.000186355335204035, 'samples': 14935552, 'steps': 29170, 'batch_loss/train': 0.6974576776847243} 12/26/2021 19:26:31 - INFO - codeparrot_training - Step 29171: {'lr': 0.0001863400915589628, 'samples': 14936064, 'steps': 29171, 'batch_loss/train': 0.8041325518861413} 12/26/2021 19:26:42 - INFO - codeparrot_training - Step 29172: {'lr': 0.00018632484816697827, 'samples': 14936576, 'steps': 29172, 'batch_loss/train': 0.5575836074422114} 12/26/2021 19:26:52 - INFO - codeparrot_training - Step 29173: {'lr': 0.0001863096050281421, 'samples': 14937088, 'steps': 29173, 'batch_loss/train': 0.8276977306231856} 12/26/2021 19:27:06 - INFO - codeparrot_training - Step 29174: {'lr': 0.00018629436214251494, 'samples': 14937600, 'steps': 29174, 'batch_loss/train': 0.7276134872809052} 12/26/2021 19:27:17 - INFO - codeparrot_training - Step 29175: {'lr': 0.0001862791195101573, 'samples': 14938112, 'steps': 29175, 'batch_loss/train': 0.7444876916706562} 12/26/2021 19:27:27 - INFO - codeparrot_training - Step 29176: {'lr': 0.0001862638771311298, 'samples': 14938624, 'steps': 29176, 'batch_loss/train': 0.7252026638016105} 12/26/2021 19:27:39 - INFO - codeparrot_training - Step 29177: {'lr': 0.00018624863500549317, 'samples': 14939136, 'steps': 29177, 'batch_loss/train': 0.8102161400020123} 12/26/2021 19:27:50 - INFO - codeparrot_training 
- Step 29178: {'lr': 0.00018623339313330779, 'samples': 14939648, 'steps': 29178, 'batch_loss/train': 0.7631073463708162} 12/26/2021 19:28:01 - INFO - codeparrot_training - Step 29179: {'lr': 0.00018621815151463446, 'samples': 14940160, 'steps': 29179, 'batch_loss/train': 0.6793875070288777} 12/26/2021 19:28:13 - INFO - codeparrot_training - Step 29180: {'lr': 0.00018620291014953354, 'samples': 14940672, 'steps': 29180, 'batch_loss/train': 0.7579261660575867} 12/26/2021 19:28:24 - INFO - codeparrot_training - Step 29181: {'lr': 0.0001861876690380659, 'samples': 14941184, 'steps': 29181, 'batch_loss/train': 0.6545192018384114} 12/26/2021 19:28:34 - INFO - codeparrot_training - Step 29182: {'lr': 0.00018617242818029191, 'samples': 14941696, 'steps': 29182, 'batch_loss/train': 0.67413950903574} 12/26/2021 19:28:45 - INFO - codeparrot_training - Step 29183: {'lr': 0.0001861571875762723, 'samples': 14942208, 'steps': 29183, 'batch_loss/train': 0.670218006009236} 12/26/2021 19:28:59 - INFO - codeparrot_training - Step 29184: {'lr': 0.00018614194722606754, 'samples': 14942720, 'steps': 29184, 'batch_loss/train': 0.8542136088944972} 12/26/2021 19:29:10 - INFO - codeparrot_training - Step 29185: {'lr': 0.00018612670712973828, 'samples': 14943232, 'steps': 29185, 'batch_loss/train': 0.8422044543549418} 12/26/2021 19:29:20 - INFO - codeparrot_training - Step 29186: {'lr': 0.00018611146728734522, 'samples': 14943744, 'steps': 29186, 'batch_loss/train': 0.7026969166472554} 12/26/2021 19:29:32 - INFO - codeparrot_training - Step 29187: {'lr': 0.00018609622769894871, 'samples': 14944256, 'steps': 29187, 'batch_loss/train': 0.7546647908166051} 12/26/2021 19:29:43 - INFO - codeparrot_training - Step 29188: {'lr': 0.0001860809883646095, 'samples': 14944768, 'steps': 29188, 'batch_loss/train': 0.7152145002037287} 12/26/2021 19:29:54 - INFO - codeparrot_training - Step 29189: {'lr': 0.00018606574928438824, 'samples': 14945280, 'steps': 29189, 'batch_loss/train': 0.7730560591444373} 
12/26/2021 19:30:06 - INFO - codeparrot_training - Step 29190: {'lr': 0.00018605051045834543, 'samples': 14945792, 'steps': 29190, 'batch_loss/train': 0.8425534218549728} 12/26/2021 19:30:16 - INFO - codeparrot_training - Step 29191: {'lr': 0.00018603527188654152, 'samples': 14946304, 'steps': 29191, 'batch_loss/train': 0.7859860118478537} 12/26/2021 19:30:27 - INFO - codeparrot_training - Step 29192: {'lr': 0.00018602003356903724, 'samples': 14946816, 'steps': 29192, 'batch_loss/train': 0.7450678111054003} 12/26/2021 19:30:38 - INFO - codeparrot_training - Step 29193: {'lr': 0.00018600479550589323, 'samples': 14947328, 'steps': 29193, 'batch_loss/train': 0.7332323547452688} 12/26/2021 19:30:50 - INFO - codeparrot_training - Step 29194: {'lr': 0.0001859895576971699, 'samples': 14947840, 'steps': 29194, 'batch_loss/train': 0.7752347011119127} 12/26/2021 19:31:00 - INFO - codeparrot_training - Step 29195: {'lr': 0.00018597432014292792, 'samples': 14948352, 'steps': 29195, 'batch_loss/train': 0.7251407401636243} 12/26/2021 19:31:11 - INFO - codeparrot_training - Step 29196: {'lr': 0.00018595908284322796, 'samples': 14948864, 'steps': 29196, 'batch_loss/train': 0.6691336320945993} 12/26/2021 19:31:24 - INFO - codeparrot_training - Step 29197: {'lr': 0.00018594384579813045, 'samples': 14949376, 'steps': 29197, 'batch_loss/train': 0.7198557993397117} 12/26/2021 19:31:35 - INFO - codeparrot_training - Step 29198: {'lr': 0.00018592860900769594, 'samples': 14949888, 'steps': 29198, 'batch_loss/train': 0.7591204270720482} 12/26/2021 19:31:45 - INFO - codeparrot_training - Step 29199: {'lr': 0.00018591337247198523, 'samples': 14950400, 'steps': 29199, 'batch_loss/train': 0.7034087930805981} 12/26/2021 19:31:57 - INFO - codeparrot_training - Step 29200: {'lr': 0.00018589813619105874, 'samples': 14950912, 'steps': 29200, 'batch_loss/train': 0.7055465806042776} 12/26/2021 19:32:08 - INFO - codeparrot_training - Step 29201: {'lr': 0.000185882900164977, 'samples': 14951424, 
'steps': 29201, 'batch_loss/train': 0.7592547321692109} 12/26/2021 19:32:19 - INFO - codeparrot_training - Step 29202: {'lr': 0.0001858676643938006, 'samples': 14951936, 'steps': 29202, 'batch_loss/train': 0.7023523934185505} 12/26/2021 19:32:32 - INFO - codeparrot_training - Step 29203: {'lr': 0.00018585242887759023, 'samples': 14952448, 'steps': 29203, 'batch_loss/train': 0.6036471205297858} 12/26/2021 19:32:43 - INFO - codeparrot_training - Step 29204: {'lr': 0.00018583719361640636, 'samples': 14952960, 'steps': 29204, 'batch_loss/train': 0.7936435006558895} 12/26/2021 19:32:54 - INFO - codeparrot_training - Step 29205: {'lr': 0.00018582195861030962, 'samples': 14953472, 'steps': 29205, 'batch_loss/train': 1.2002252598758787} 12/26/2021 19:33:04 - INFO - codeparrot_training - Step 29206: {'lr': 0.00018580672385936042, 'samples': 14953984, 'steps': 29206, 'batch_loss/train': 0.6630340199917555} 12/26/2021 19:33:16 - INFO - codeparrot_training - Step 29207: {'lr': 0.0001857914893636196, 'samples': 14954496, 'steps': 29207, 'batch_loss/train': 0.6854504396906123} 12/26/2021 19:33:27 - INFO - codeparrot_training - Step 29208: {'lr': 0.00018577625512314748, 'samples': 14955008, 'steps': 29208, 'batch_loss/train': 0.8126350808888674} 12/26/2021 19:33:38 - INFO - codeparrot_training - Step 29209: {'lr': 0.00018576102113800478, 'samples': 14955520, 'steps': 29209, 'batch_loss/train': 0.6846467722207308} 12/26/2021 19:33:50 - INFO - codeparrot_training - Step 29210: {'lr': 0.00018574578740825187, 'samples': 14956032, 'steps': 29210, 'batch_loss/train': 0.5902923117391765} 12/26/2021 19:34:00 - INFO - codeparrot_training - Step 29211: {'lr': 0.00018573055393394954, 'samples': 14956544, 'steps': 29211, 'batch_loss/train': 0.6637571649625897} 12/26/2021 19:34:11 - INFO - codeparrot_training - Step 29212: {'lr': 0.00018571532071515832, 'samples': 14957056, 'steps': 29212, 'batch_loss/train': 0.7925565778277814} 12/26/2021 19:34:22 - INFO - codeparrot_training - Step 29213: 
{'lr': 0.0001857000877519386, 'samples': 14957568, 'steps': 29213, 'batch_loss/train': 0.7496738824993372} 12/26/2021 19:34:35 - INFO - codeparrot_training - Step 29214: {'lr': 0.00018568485504435106, 'samples': 14958080, 'steps': 29214, 'batch_loss/train': 0.7405330901965499} 12/26/2021 19:34:45 - INFO - codeparrot_training - Step 29215: {'lr': 0.00018566962259245634, 'samples': 14958592, 'steps': 29215, 'batch_loss/train': 0.7069830963155255} 12/26/2021 19:34:56 - INFO - codeparrot_training - Step 29216: {'lr': 0.00018565439039631493, 'samples': 14959104, 'steps': 29216, 'batch_loss/train': 0.7071241098456085} 12/26/2021 19:35:08 - INFO - codeparrot_training - Step 29217: {'lr': 0.0001856391584559872, 'samples': 14959616, 'steps': 29217, 'batch_loss/train': 0.9659808035939932} 12/26/2021 19:35:19 - INFO - codeparrot_training - Step 29218: {'lr': 0.000185623926771534, 'samples': 14960128, 'steps': 29218, 'batch_loss/train': 0.7583301970735192} 12/26/2021 19:35:30 - INFO - codeparrot_training - Step 29219: {'lr': 0.00018560869534301578, 'samples': 14960640, 'steps': 29219, 'batch_loss/train': 0.747757613658905} 12/26/2021 19:35:42 - INFO - codeparrot_training - Step 29220: {'lr': 0.00018559346417049304, 'samples': 14961152, 'steps': 29220, 'batch_loss/train': 0.7180228745564818} 12/26/2021 19:35:52 - INFO - codeparrot_training - Step 29221: {'lr': 0.00018557823325402627, 'samples': 14961664, 'steps': 29221, 'batch_loss/train': 0.7479404639452696} 12/26/2021 19:36:03 - INFO - codeparrot_training - Step 29222: {'lr': 0.00018556300259367625, 'samples': 14962176, 'steps': 29222, 'batch_loss/train': 0.8411390352994204} 12/26/2021 19:36:17 - INFO - codeparrot_training - Step 29223: {'lr': 0.00018554777218950336, 'samples': 14962688, 'steps': 29223, 'batch_loss/train': 0.6908114831894636} 12/26/2021 19:36:27 - INFO - codeparrot_training - Step 29224: {'lr': 0.00018553254204156818, 'samples': 14963200, 'steps': 29224, 'batch_loss/train': 0.800765675958246} 12/26/2021 
19:36:38 - INFO - codeparrot_training - Step 29225: {'lr': 0.0001855173121499313, 'samples': 14963712, 'steps': 29225, 'batch_loss/train': 0.7823479194194078} 12/26/2021 19:36:48 - INFO - codeparrot_training - Step 29226: {'lr': 0.00018550208251465327, 'samples': 14964224, 'steps': 29226, 'batch_loss/train': 0.8184566022828221} 12/26/2021 19:37:01 - INFO - codeparrot_training - Step 29227: {'lr': 0.00018548685313579455, 'samples': 14964736, 'steps': 29227, 'batch_loss/train': 0.6023158826865256} 12/26/2021 19:37:11 - INFO - codeparrot_training - Step 29228: {'lr': 0.00018547162401341566, 'samples': 14965248, 'steps': 29228, 'batch_loss/train': 0.6868251701816916} 12/26/2021 19:37:22 - INFO - codeparrot_training - Step 29229: {'lr': 0.0001854563951475774, 'samples': 14965760, 'steps': 29229, 'batch_loss/train': 0.7306693256832659} 12/26/2021 19:37:34 - INFO - codeparrot_training - Step 29230: {'lr': 0.00018544116653834003, 'samples': 14966272, 'steps': 29230, 'batch_loss/train': 0.6817761493148282} 12/26/2021 19:37:45 - INFO - codeparrot_training - Step 29231: {'lr': 0.0001854259381857643, 'samples': 14966784, 'steps': 29231, 'batch_loss/train': 0.7413899814710021} 12/26/2021 19:37:55 - INFO - codeparrot_training - Step 29232: {'lr': 0.0001854107100899105, 'samples': 14967296, 'steps': 29232, 'batch_loss/train': 0.7686789419967681} 12/26/2021 19:38:08 - INFO - codeparrot_training - Step 29233: {'lr': 0.00018539548225083942, 'samples': 14967808, 'steps': 29233, 'batch_loss/train': 0.8147133989259601} 12/26/2021 19:38:19 - INFO - codeparrot_training - Step 29234: {'lr': 0.00018538025466861157, 'samples': 14968320, 'steps': 29234, 'batch_loss/train': 0.7611170168966055} 12/26/2021 19:38:29 - INFO - codeparrot_training - Step 29235: {'lr': 0.0001853650273432873, 'samples': 14968832, 'steps': 29235, 'batch_loss/train': 0.9032534109428525} 12/26/2021 19:38:40 - INFO - codeparrot_training - Step 29236: {'lr': 0.0001853498002749274, 'samples': 14969344, 'steps': 29236, 
'batch_loss/train': 0.7891938989050686} 12/26/2021 19:38:52 - INFO - codeparrot_training - Step 29237: {'lr': 0.00018533457346359223, 'samples': 14969856, 'steps': 29237, 'batch_loss/train': 0.756927453156095} 12/26/2021 19:39:03 - INFO - codeparrot_training - Step 29238: {'lr': 0.0001853193469093424, 'samples': 14970368, 'steps': 29238, 'batch_loss/train': 0.6842794176191092} 12/26/2021 19:39:14 - INFO - codeparrot_training - Step 29239: {'lr': 0.00018530412061223834, 'samples': 14970880, 'steps': 29239, 'batch_loss/train': 0.7454768838360906} 12/26/2021 19:39:26 - INFO - codeparrot_training - Step 29240: {'lr': 0.00018528889457234072, 'samples': 14971392, 'steps': 29240, 'batch_loss/train': 0.7490538032725453} 12/26/2021 19:39:36 - INFO - codeparrot_training - Step 29241: {'lr': 0.0001852736687897101, 'samples': 14971904, 'steps': 29241, 'batch_loss/train': 0.7640344379469752} 12/26/2021 19:39:47 - INFO - codeparrot_training - Step 29242: {'lr': 0.00018525844326440684, 'samples': 14972416, 'steps': 29242, 'batch_loss/train': 0.639614877582062} 12/26/2021 19:40:01 - INFO - codeparrot_training - Step 29243: {'lr': 0.00018524321799649152, 'samples': 14972928, 'steps': 29243, 'batch_loss/train': 0.7185011408291757} 12/26/2021 19:40:12 - INFO - codeparrot_training - Step 29244: {'lr': 0.00018522799298602483, 'samples': 14973440, 'steps': 29244, 'batch_loss/train': 0.617464056937024} 12/26/2021 19:40:22 - INFO - codeparrot_training - Step 29245: {'lr': 0.0001852127682330672, 'samples': 14973952, 'steps': 29245, 'batch_loss/train': 0.7062159581109881} 12/26/2021 19:40:34 - INFO - codeparrot_training - Step 29246: {'lr': 0.00018519754373767905, 'samples': 14974464, 'steps': 29246, 'batch_loss/train': 0.7654188452288508} 12/26/2021 19:40:45 - INFO - codeparrot_training - Step 29247: {'lr': 0.00018518231949992098, 'samples': 14974976, 'steps': 29247, 'batch_loss/train': 0.8206505440175533} 12/26/2021 19:40:56 - INFO - codeparrot_training - Step 29248: {'lr': 
0.0001851670955198536, 'samples': 14975488, 'steps': 29248, 'batch_loss/train': 0.7175465505570173} 12/26/2021 19:41:06 - INFO - codeparrot_training - Step 29249: {'lr': 0.00018515187179753733, 'samples': 14976000, 'steps': 29249, 'batch_loss/train': 0.7136018797755241} 12/26/2021 19:41:18 - INFO - codeparrot_training - Step 29250: {'lr': 0.0001851366483330327, 'samples': 14976512, 'steps': 29250, 'batch_loss/train': 0.7064055893570185} 12/26/2021 19:41:29 - INFO - codeparrot_training - Step 29251: {'lr': 0.00018512142512640034, 'samples': 14977024, 'steps': 29251, 'batch_loss/train': 0.529373513389146} 12/26/2021 19:41:40 - INFO - codeparrot_training - Step 29252: {'lr': 0.00018510620217770076, 'samples': 14977536, 'steps': 29252, 'batch_loss/train': 0.7822412587120198} 12/26/2021 19:41:54 - INFO - codeparrot_training - Step 29253: {'lr': 0.0001850909794869943, 'samples': 14978048, 'steps': 29253, 'batch_loss/train': 0.7910803006961942} 12/26/2021 19:42:04 - INFO - codeparrot_training - Step 29254: {'lr': 0.00018507575705434154, 'samples': 14978560, 'steps': 29254, 'batch_loss/train': 0.6929427017457783} 12/26/2021 19:42:15 - INFO - codeparrot_training - Step 29255: {'lr': 0.0001850605348798032, 'samples': 14979072, 'steps': 29255, 'batch_loss/train': 0.7264988073147833} 12/26/2021 19:42:27 - INFO - codeparrot_training - Step 29256: {'lr': 0.00018504531296343956, 'samples': 14979584, 'steps': 29256, 'batch_loss/train': 0.7817624977324158} 12/26/2021 19:42:38 - INFO - codeparrot_training - Step 29257: {'lr': 0.00018503009130531136, 'samples': 14980096, 'steps': 29257, 'batch_loss/train': 0.7537989341653883} 12/26/2021 19:42:48 - INFO - codeparrot_training - Step 29258: {'lr': 0.0001850148699054788, 'samples': 14980608, 'steps': 29258, 'batch_loss/train': 0.6836387482471764} 12/26/2021 19:42:59 - INFO - codeparrot_training - Step 29259: {'lr': 0.00018499964876400265, 'samples': 14981120, 'steps': 29259, 'batch_loss/train': 0.7425828797277063} 12/26/2021 19:43:13 - 
INFO - codeparrot_training - Step 29260: {'lr': 0.00018498442788094345, 'samples': 14981632, 'steps': 29260, 'batch_loss/train': 0.7400846537202597} 12/26/2021 19:43:23 - INFO - codeparrot_training - Step 29261: {'lr': 0.00018496920725636146, 'samples': 14982144, 'steps': 29261, 'batch_loss/train': 0.7246720243711025} 12/26/2021 19:43:34 - INFO - codeparrot_training - Step 29262: {'lr': 0.00018495398689031745, 'samples': 14982656, 'steps': 29262, 'batch_loss/train': 0.7827224382199347} 12/26/2021 19:43:46 - INFO - codeparrot_training - Step 29263: {'lr': 0.0001849387667828718, 'samples': 14983168, 'steps': 29263, 'batch_loss/train': 0.7937769680283964} 12/26/2021 19:43:57 - INFO - codeparrot_training - Step 29264: {'lr': 0.00018492354693408508, 'samples': 14983680, 'steps': 29264, 'batch_loss/train': 0.7513242637505755} 12/26/2021 19:44:07 - INFO - codeparrot_training - Step 29265: {'lr': 0.00018490832734401768, 'samples': 14984192, 'steps': 29265, 'batch_loss/train': 0.6643599898088723} 12/26/2021 19:44:19 - INFO - codeparrot_training - Step 29266: {'lr': 0.00018489310801273024, 'samples': 14984704, 'steps': 29266, 'batch_loss/train': 0.7928058607503772} 12/26/2021 19:44:30 - INFO - codeparrot_training - Step 29267: {'lr': 0.00018487788894028327, 'samples': 14985216, 'steps': 29267, 'batch_loss/train': 0.7059884567279369} 12/26/2021 19:44:41 - INFO - codeparrot_training - Step 29268: {'lr': 0.0001848626701267372, 'samples': 14985728, 'steps': 29268, 'batch_loss/train': 0.6703144302591681} 12/26/2021 19:44:53 - INFO - codeparrot_training - Step 29269: {'lr': 0.00018484745157215245, 'samples': 14986240, 'steps': 29269, 'batch_loss/train': 0.6043781274929643} 12/26/2021 19:45:04 - INFO - codeparrot_training - Step 29270: {'lr': 0.00018483223327658976, 'samples': 14986752, 'steps': 29270, 'batch_loss/train': 0.7368250507861376} 12/26/2021 19:45:14 - INFO - codeparrot_training - Step 29271: {'lr': 0.0001848170152401095, 'samples': 14987264, 'steps': 29271, 
'batch_loss/train': 0.7555766110308468} 12/26/2021 19:45:25 - INFO - codeparrot_training - Step 29272: {'lr': 0.0001848017974627721, 'samples': 14987776, 'steps': 29272, 'batch_loss/train': 0.7079053651541471} 12/26/2021 19:45:39 - INFO - codeparrot_training - Step 29273: {'lr': 0.00018478657994463815, 'samples': 14988288, 'steps': 29273, 'batch_loss/train': 0.6955216927453876} 12/26/2021 19:45:49 - INFO - codeparrot_training - Step 29274: {'lr': 0.00018477136268576821, 'samples': 14988800, 'steps': 29274, 'batch_loss/train': 0.7643552676890977} 12/26/2021 19:46:00 - INFO - codeparrot_training - Step 29275: {'lr': 0.00018475614568622264, 'samples': 14989312, 'steps': 29275, 'batch_loss/train': 0.8100583883933723} 12/26/2021 19:46:12 - INFO - codeparrot_training - Step 29276: {'lr': 0.00018474092894606194, 'samples': 14989824, 'steps': 29276, 'batch_loss/train': 0.727871049195528} 12/26/2021 19:46:23 - INFO - codeparrot_training - Step 29277: {'lr': 0.00018472571246534681, 'samples': 14990336, 'steps': 29277, 'batch_loss/train': 0.7748595979064703} 12/26/2021 19:46:33 - INFO - codeparrot_training - Step 29278: {'lr': 0.00018471049624413748, 'samples': 14990848, 'steps': 29278, 'batch_loss/train': 0.7790953125804663} 12/26/2021 19:46:46 - INFO - codeparrot_training - Step 29279: {'lr': 0.0001846952802824947, 'samples': 14991360, 'steps': 29279, 'batch_loss/train': 0.7538457913324237} 12/26/2021 19:46:56 - INFO - codeparrot_training - Step 29280: {'lr': 0.00018468006458047863, 'samples': 14991872, 'steps': 29280, 'batch_loss/train': 0.6660760673694313} 12/26/2021 19:47:07 - INFO - codeparrot_training - Step 29281: {'lr': 0.0001846648491381501, 'samples': 14992384, 'steps': 29281, 'batch_loss/train': 0.722271513484884} 12/26/2021 19:47:18 - INFO - codeparrot_training - Step 29282: {'lr': 0.00018464963395556946, 'samples': 14992896, 'steps': 29282, 'batch_loss/train': 0.7272719452157617} 12/26/2021 19:47:32 - INFO - codeparrot_training - Step 29283: {'lr': 
0.0001846344190327972, 'samples': 14993408, 'steps': 29283, 'batch_loss/train': 0.6805802932940423} 12/26/2021 19:47:42 - INFO - codeparrot_training - Step 29284: {'lr': 0.00018461920436989375, 'samples': 14993920, 'steps': 29284, 'batch_loss/train': 0.814880246296525} 12/26/2021 19:47:53 - INFO - codeparrot_training - Step 29285: {'lr': 0.00018460398996691967, 'samples': 14994432, 'steps': 29285, 'batch_loss/train': 0.5177009967155755} 12/26/2021 19:48:05 - INFO - codeparrot_training - Step 29286: {'lr': 0.0001845887758239355, 'samples': 14994944, 'steps': 29286, 'batch_loss/train': 0.7434278004802763} 12/26/2021 19:48:16 - INFO - codeparrot_training - Step 29287: {'lr': 0.00018457356194100154, 'samples': 14995456, 'steps': 29287, 'batch_loss/train': 0.6613997952081263} 12/26/2021 19:48:27 - INFO - codeparrot_training - Step 29288: {'lr': 0.0001845583483181785, 'samples': 14995968, 'steps': 29288, 'batch_loss/train': 0.7023501477669924} 12/26/2021 19:48:40 - INFO - codeparrot_training - Step 29289: {'lr': 0.00018454313495552672, 'samples': 14996480, 'steps': 29289, 'batch_loss/train': 0.7507486371323466} 12/26/2021 19:48:51 - INFO - codeparrot_training - Step 29290: {'lr': 0.00018452792185310674, 'samples': 14996992, 'steps': 29290, 'batch_loss/train': 0.9440536135807633} 12/26/2021 19:49:01 - INFO - codeparrot_training - Step 29291: {'lr': 0.00018451270901097895, 'samples': 14997504, 'steps': 29291, 'batch_loss/train': 0.7133143553510308} 12/26/2021 19:49:13 - INFO - codeparrot_training - Step 29292: {'lr': 0.00018449749642920394, 'samples': 14998016, 'steps': 29292, 'batch_loss/train': 0.6499471208080649} 12/26/2021 19:49:24 - INFO - codeparrot_training - Step 29293: {'lr': 0.0001844822841078422, 'samples': 14998528, 'steps': 29293, 'batch_loss/train': 0.7385220288415439} 12/26/2021 19:49:35 - INFO - codeparrot_training - Step 29294: {'lr': 0.00018446707204695413, 'samples': 14999040, 'steps': 29294, 'batch_loss/train': 0.6927884416654706} 12/26/2021 19:49:45 - 
INFO - codeparrot_training - Step 29295: {'lr': 0.00018445186024660016, 'samples': 14999552, 'steps': 29295, 'batch_loss/train': 0.679480604827404} 12/26/2021 19:49:57 - INFO - codeparrot_training - Step 29296: {'lr': 0.00018443664870684096, 'samples': 15000064, 'steps': 29296, 'batch_loss/train': 0.7465457846410573} 12/26/2021 19:50:08 - INFO - codeparrot_training - Step 29297: {'lr': 0.0001844214374277369, 'samples': 15000576, 'steps': 29297, 'batch_loss/train': 0.564886922511505} 12/26/2021 19:50:19 - INFO - codeparrot_training - Step 29298: {'lr': 0.00018440622640934832, 'samples': 15001088, 'steps': 29298, 'batch_loss/train': 0.7506997780874372} 12/26/2021 19:50:32 - INFO - codeparrot_training - Step 29299: {'lr': 0.00018439101565173588, 'samples': 15001600, 'steps': 29299, 'batch_loss/train': 0.8070874218828976} 12/26/2021 19:50:43 - INFO - codeparrot_training - Step 29300: {'lr': 0.00018437580515496003, 'samples': 15002112, 'steps': 29300, 'batch_loss/train': 0.6719499786850065} 12/26/2021 19:50:53 - INFO - codeparrot_training - Step 29301: {'lr': 0.00018436059491908112, 'samples': 15002624, 'steps': 29301, 'batch_loss/train': 0.7675778795965016} 12/26/2021 19:51:06 - INFO - codeparrot_training - Step 29302: {'lr': 0.0001843453849441597, 'samples': 15003136, 'steps': 29302, 'batch_loss/train': 0.7268214449286461} 12/26/2021 19:51:16 - INFO - codeparrot_training - Step 29303: {'lr': 0.00018433017523025632, 'samples': 15003648, 'steps': 29303, 'batch_loss/train': 0.6791963176801801} 12/26/2021 19:51:27 - INFO - codeparrot_training - Step 29304: {'lr': 0.00018431496577743135, 'samples': 15004160, 'steps': 29304, 'batch_loss/train': 0.7740727365016937} 12/26/2021 19:51:38 - INFO - codeparrot_training - Step 29305: {'lr': 0.00018429975658574527, 'samples': 15004672, 'steps': 29305, 'batch_loss/train': 1.139757472090423} 12/26/2021 19:51:50 - INFO - codeparrot_training - Step 29306: {'lr': 0.0001842845476552585, 'samples': 15005184, 'steps': 29306, 
'batch_loss/train': 0.8341427864506841} 12/26/2021 19:52:00 - INFO - codeparrot_training - Step 29307: {'lr': 0.0001842693389860316, 'samples': 15005696, 'steps': 29307, 'batch_loss/train': 0.6841003423323855} 12/26/2021 19:52:11 - INFO - codeparrot_training - Step 29308: {'lr': 0.000184254130578125, 'samples': 15006208, 'steps': 29308, 'batch_loss/train': 0.7697333917021751} 12/26/2021 19:52:23 - INFO - codeparrot_training - Step 29309: {'lr': 0.00018423892243159907, 'samples': 15006720, 'steps': 29309, 'batch_loss/train': 0.900514516979456} 12/26/2021 19:52:34 - INFO - codeparrot_training - Step 29310: {'lr': 0.00018422371454651442, 'samples': 15007232, 'steps': 29310, 'batch_loss/train': 0.7023361828178167} 12/26/2021 19:52:44 - INFO - codeparrot_training - Step 29311: {'lr': 0.00018420850692293144, 'samples': 15007744, 'steps': 29311, 'batch_loss/train': 0.7163371930364519} 12/26/2021 19:52:58 - INFO - codeparrot_training - Step 29312: {'lr': 0.00018419329956091064, 'samples': 15008256, 'steps': 29312, 'batch_loss/train': 1.0210760636255145} 12/26/2021 19:53:08 - INFO - codeparrot_training - Step 29313: {'lr': 0.00018417809246051235, 'samples': 15008768, 'steps': 29313, 'batch_loss/train': 0.9352679032599553} 12/26/2021 19:53:19 - INFO - codeparrot_training - Step 29314: {'lr': 0.00018416288562179712, 'samples': 15009280, 'steps': 29314, 'batch_loss/train': 0.7281372342258692} 12/26/2021 19:53:31 - INFO - codeparrot_training - Step 29315: {'lr': 0.00018414767904482548, 'samples': 15009792, 'steps': 29315, 'batch_loss/train': 0.7476647263392806} 12/26/2021 19:53:42 - INFO - codeparrot_training - Step 29316: {'lr': 0.00018413247272965776, 'samples': 15010304, 'steps': 29316, 'batch_loss/train': 0.6325158022809774} 12/26/2021 19:53:52 - INFO - codeparrot_training - Step 29317: {'lr': 0.00018411726667635442, 'samples': 15010816, 'steps': 29317, 'batch_loss/train': 0.6968571676407009} 12/26/2021 19:54:03 - INFO - codeparrot_training - Step 29318: {'lr': 
0.00018410206088497594, 'samples': 15011328, 'steps': 29318, 'batch_loss/train': 0.49371032579801977} 12/26/2021 19:54:15 - INFO - codeparrot_training - Step 29319: {'lr': 0.00018408685535558288, 'samples': 15011840, 'steps': 29319, 'batch_loss/train': 0.7851938875392079} 12/26/2021 19:54:26 - INFO - codeparrot_training - Step 29320: {'lr': 0.00018407165008823553, 'samples': 15012352, 'steps': 29320, 'batch_loss/train': 0.5611892146989703} 12/26/2021 19:54:36 - INFO - codeparrot_training - Step 29321: {'lr': 0.00018405644508299435, 'samples': 15012864, 'steps': 29321, 'batch_loss/train': 0.7218407094478607} 12/26/2021 19:54:50 - INFO - codeparrot_training - Step 29322: {'lr': 0.00018404124033991994, 'samples': 15013376, 'steps': 29322, 'batch_loss/train': 0.7021914939396083} 12/26/2021 19:55:00 - INFO - codeparrot_training - Step 29323: {'lr': 0.00018402603585907263, 'samples': 15013888, 'steps': 29323, 'batch_loss/train': 0.7037624064832926} 12/26/2021 19:55:11 - INFO - codeparrot_training - Step 29324: {'lr': 0.0001840108316405128, 'samples': 15014400, 'steps': 29324, 'batch_loss/train': 0.6882556304335594} 12/26/2021 19:55:23 - INFO - codeparrot_training - Step 29325: {'lr': 0.0001839956276843011, 'samples': 15014912, 'steps': 29325, 'batch_loss/train': 0.8169593986822292} 12/26/2021 19:55:34 - INFO - codeparrot_training - Step 29326: {'lr': 0.0001839804239904979, 'samples': 15015424, 'steps': 29326, 'batch_loss/train': 0.6855558301322162} 12/26/2021 19:55:44 - INFO - codeparrot_training - Step 29327: {'lr': 0.00018396522055916353, 'samples': 15015936, 'steps': 29327, 'batch_loss/train': 0.7209611814469099} 12/26/2021 19:55:59 - INFO - codeparrot_training - Step 29328: {'lr': 0.00018395001739035844, 'samples': 15016448, 'steps': 29328, 'batch_loss/train': 0.6317989643430337} 12/26/2021 19:56:09 - INFO - codeparrot_training - Step 29329: {'lr': 0.0001839348144841433, 'samples': 15016960, 'steps': 29329, 'batch_loss/train': 0.7239980250597} 12/26/2021 19:56:20 - 
INFO - codeparrot_training - Step 29330: {'lr': 0.00018391961184057825, 'samples': 15017472, 'steps': 29330, 'batch_loss/train': 0.535315633751452} 12/26/2021 19:56:30 - INFO - codeparrot_training - Step 29331: {'lr': 0.000183904409459724, 'samples': 15017984, 'steps': 29331, 'batch_loss/train': 0.7241338128224015} 12/26/2021 19:56:42 - INFO - codeparrot_training - Step 29332: {'lr': 0.00018388920734164072, 'samples': 15018496, 'steps': 29332, 'batch_loss/train': 0.6518481806851923} 12/26/2021 19:56:53 - INFO - codeparrot_training - Step 29333: {'lr': 0.00018387400548638912, 'samples': 15019008, 'steps': 29333, 'batch_loss/train': 0.7341278875246644} 12/26/2021 19:57:04 - INFO - codeparrot_training - Step 29334: {'lr': 0.00018385880389402943, 'samples': 15019520, 'steps': 29334, 'batch_loss/train': 0.7407878171652555} 12/26/2021 19:57:16 - INFO - codeparrot_training - Step 29335: {'lr': 0.00018384360256462214, 'samples': 15020032, 'steps': 29335, 'batch_loss/train': 0.7657009325921535} 12/26/2021 19:57:26 - INFO - codeparrot_training - Step 29336: {'lr': 0.00018382840149822776, 'samples': 15020544, 'steps': 29336, 'batch_loss/train': 0.8058838192373514} 12/26/2021 19:57:37 - INFO - codeparrot_training - Step 29337: {'lr': 0.00018381320069490667, 'samples': 15021056, 'steps': 29337, 'batch_loss/train': 0.6526731927879155} 12/26/2021 19:57:51 - INFO - codeparrot_training - Step 29338: {'lr': 0.00018379800015471935, 'samples': 15021568, 'steps': 29338, 'batch_loss/train': 0.7700016899034381} 12/26/2021 19:58:01 - INFO - codeparrot_training - Step 29339: {'lr': 0.00018378279987772605, 'samples': 15022080, 'steps': 29339, 'batch_loss/train': 0.710616949480027} 12/26/2021 19:58:12 - INFO - codeparrot_training - Step 29340: {'lr': 0.00018376759986398739, 'samples': 15022592, 'steps': 29340, 'batch_loss/train': 0.843558706343174} 12/26/2021 19:58:23 - INFO - codeparrot_training - Step 29341: {'lr': 0.0001837524001135638, 'samples': 15023104, 'steps': 29341, 
'batch_loss/train': 0.7504093092866242} 12/26/2021 19:58:35 - INFO - codeparrot_training - Step 29342: {'lr': 0.00018373720062651568, 'samples': 15023616, 'steps': 29342, 'batch_loss/train': 0.601063736830838} 12/26/2021 19:58:45 - INFO - codeparrot_training - Step 29343: {'lr': 0.0001837220014029033, 'samples': 15024128, 'steps': 29343, 'batch_loss/train': 0.8402445763349533} 12/26/2021 19:58:56 - INFO - codeparrot_training - Step 29344: {'lr': 0.00018370680244278725, 'samples': 15024640, 'steps': 29344, 'batch_loss/train': 0.6747677307575941} 12/26/2021 19:59:08 - INFO - codeparrot_training - Step 29345: {'lr': 0.00018369160374622802, 'samples': 15025152, 'steps': 29345, 'batch_loss/train': 0.7578208241611719} 12/26/2021 19:59:19 - INFO - codeparrot_training - Step 29346: {'lr': 0.00018367640531328583, 'samples': 15025664, 'steps': 29346, 'batch_loss/train': 0.7358803998213261} 12/26/2021 19:59:30 - INFO - codeparrot_training - Step 29347: {'lr': 0.00018366120714402123, 'samples': 15026176, 'steps': 29347, 'batch_loss/train': 1.0346967773512006} 12/26/2021 19:59:41 - INFO - codeparrot_training - Step 29348: {'lr': 0.00018364600923849466, 'samples': 15026688, 'steps': 29348, 'batch_loss/train': 0.7108281515538692} 12/26/2021 19:59:52 - INFO - codeparrot_training - Step 29349: {'lr': 0.00018363081159676652, 'samples': 15027200, 'steps': 29349, 'batch_loss/train': 0.7902201851829886} 12/26/2021 20:00:03 - INFO - codeparrot_training - Step 29350: {'lr': 0.0001836156142188971, 'samples': 15027712, 'steps': 29350, 'batch_loss/train': 0.76576026994735} 12/26/2021 20:00:16 - INFO - codeparrot_training - Step 29351: {'lr': 0.00018360041710494706, 'samples': 15028224, 'steps': 29351, 'batch_loss/train': 0.7054296997375786} 12/26/2021 20:00:27 - INFO - codeparrot_training - Step 29352: {'lr': 0.0001835852202549767, 'samples': 15028736, 'steps': 29352, 'batch_loss/train': 0.7800502281170338} 12/26/2021 20:00:37 - INFO - codeparrot_training - Step 29353: {'lr': 
0.00018357002366904636, 'samples': 15029248, 'steps': 29353, 'batch_loss/train': 0.7650780687108636} 12/26/2021 20:00:48 - INFO - codeparrot_training - Step 29354: {'lr': 0.00018355482734721645, 'samples': 15029760, 'steps': 29354, 'batch_loss/train': 0.6982818664982915} 12/26/2021 20:01:01 - INFO - codeparrot_training - Step 29355: {'lr': 0.00018353963128954758, 'samples': 15030272, 'steps': 29355, 'batch_loss/train': 0.6313811067957431} 12/26/2021 20:01:11 - INFO - codeparrot_training - Step 29356: {'lr': 0.00018352443549610002, 'samples': 15030784, 'steps': 29356, 'batch_loss/train': 0.7536057038232684} 12/26/2021 20:01:22 - INFO - codeparrot_training - Step 29357: {'lr': 0.00018350923996693423, 'samples': 15031296, 'steps': 29357, 'batch_loss/train': 0.46205990808084607} 12/26/2021 20:01:35 - INFO - codeparrot_training - Step 29358: {'lr': 0.00018349404470211048, 'samples': 15031808, 'steps': 29358, 'batch_loss/train': 0.4742947685299441} 12/26/2021 20:01:46 - INFO - codeparrot_training - Step 29359: {'lr': 0.00018347884970168937, 'samples': 15032320, 'steps': 29359, 'batch_loss/train': 0.6554435817524791} 12/26/2021 20:01:56 - INFO - codeparrot_training - Step 29360: {'lr': 0.00018346365496573126, 'samples': 15032832, 'steps': 29360, 'batch_loss/train': 0.7114973003044724} 12/26/2021 20:02:07 - INFO - codeparrot_training - Step 29361: {'lr': 0.00018344846049429646, 'samples': 15033344, 'steps': 29361, 'batch_loss/train': 0.6987928133457899} 12/26/2021 20:02:19 - INFO - codeparrot_training - Step 29362: {'lr': 0.00018343326628744556, 'samples': 15033856, 'steps': 29362, 'batch_loss/train': 0.7260821992531419} 12/26/2021 20:02:30 - INFO - codeparrot_training - Step 29363: {'lr': 0.00018341807234523877, 'samples': 15034368, 'steps': 29363, 'batch_loss/train': 0.6299302261322737} 12/26/2021 20:02:41 - INFO - codeparrot_training - Step 29364: {'lr': 0.00018340287866773665, 'samples': 15034880, 'steps': 29364, 'batch_loss/train': 1.1004385566338897} 12/26/2021 
20:02:53 - INFO - codeparrot_training - Step 29365: {'lr': 0.00018338768525499944, 'samples': 15035392, 'steps': 29365, 'batch_loss/train': 0.7875461089424789} 12/26/2021 20:03:03 - INFO - codeparrot_training - Step 29366: {'lr': 0.0001833724921070877, 'samples': 15035904, 'steps': 29366, 'batch_loss/train': 0.7364997072145343} 12/26/2021 20:03:14 - INFO - codeparrot_training - Step 29367: {'lr': 0.00018335729922406183, 'samples': 15036416, 'steps': 29367, 'batch_loss/train': 0.7613590955734253} 12/26/2021 20:03:28 - INFO - codeparrot_training - Step 29368: {'lr': 0.00018334210660598212, 'samples': 15036928, 'steps': 29368, 'batch_loss/train': 0.686823177151382} 12/26/2021 20:03:39 - INFO - codeparrot_training - Step 29369: {'lr': 0.00018332691425290896, 'samples': 15037440, 'steps': 29369, 'batch_loss/train': 1.121473771519959} 12/26/2021 20:03:49 - INFO - codeparrot_training - Step 29370: {'lr': 0.00018331172216490293, 'samples': 15037952, 'steps': 29370, 'batch_loss/train': 0.7852112147957087} 12/26/2021 20:04:00 - INFO - codeparrot_training - Step 29371: {'lr': 0.0001832965303420243, 'samples': 15038464, 'steps': 29371, 'batch_loss/train': 0.7573683643713593} 12/26/2021 20:04:12 - INFO - codeparrot_training - Step 29372: {'lr': 0.0001832813387843334, 'samples': 15038976, 'steps': 29372, 'batch_loss/train': 0.7553761871531606} 12/26/2021 20:04:22 - INFO - codeparrot_training - Step 29373: {'lr': 0.0001832661474918907, 'samples': 15039488, 'steps': 29373, 'batch_loss/train': 0.7377723213285208} 12/26/2021 20:04:33 - INFO - codeparrot_training - Step 29374: {'lr': 0.0001832509564647567, 'samples': 15040000, 'steps': 29374, 'batch_loss/train': 0.7676837472245097} 12/26/2021 20:04:45 - INFO - codeparrot_training - Step 29375: {'lr': 0.00018323576570299167, 'samples': 15040512, 'steps': 29375, 'batch_loss/train': 0.8213342232629657} 12/26/2021 20:04:56 - INFO - codeparrot_training - Step 29376: {'lr': 0.0001832205752066559, 'samples': 15041024, 'steps': 29376, 
'batch_loss/train': 0.774943214841187} 12/26/2021 20:05:06 - INFO - codeparrot_training - Step 29377: {'lr': 0.00018320538497581008, 'samples': 15041536, 'steps': 29377, 'batch_loss/train': 0.8047779947519302} 12/26/2021 20:05:18 - INFO - codeparrot_training - Step 29378: {'lr': 0.00018319019501051437, 'samples': 15042048, 'steps': 29378, 'batch_loss/train': 0.699884407222271} 12/26/2021 20:05:29 - INFO - codeparrot_training - Step 29379: {'lr': 0.0001831750053108292, 'samples': 15042560, 'steps': 29379, 'batch_loss/train': 0.7594386897981167} 12/26/2021 20:05:40 - INFO - codeparrot_training - Step 29380: {'lr': 0.0001831598158768149, 'samples': 15043072, 'steps': 29380, 'batch_loss/train': 0.8308869050815701} 12/26/2021 20:05:50 - INFO - codeparrot_training - Step 29381: {'lr': 0.0001831446267085321, 'samples': 15043584, 'steps': 29381, 'batch_loss/train': 0.7201192700304091} 12/26/2021 20:06:04 - INFO - codeparrot_training - Step 29382: {'lr': 0.0001831294378060409, 'samples': 15044096, 'steps': 29382, 'batch_loss/train': 0.6535228216089308} 12/26/2021 20:06:15 - INFO - codeparrot_training - Step 29383: {'lr': 0.00018311424916940178, 'samples': 15044608, 'steps': 29383, 'batch_loss/train': 0.6860429071821272} 12/26/2021 20:06:26 - INFO - codeparrot_training - Step 29384: {'lr': 0.00018309906079867527, 'samples': 15045120, 'steps': 29384, 'batch_loss/train': 0.8067872803658247} 12/26/2021 20:06:38 - INFO - codeparrot_training - Step 29385: {'lr': 0.00018308387269392159, 'samples': 15045632, 'steps': 29385, 'batch_loss/train': 0.7800549725070596} 12/26/2021 20:06:49 - INFO - codeparrot_training - Step 29386: {'lr': 0.00018306868485520118, 'samples': 15046144, 'steps': 29386, 'batch_loss/train': 0.7206719680689275} 12/26/2021 20:06:59 - INFO - codeparrot_training - Step 29387: {'lr': 0.0001830534972825743, 'samples': 15046656, 'steps': 29387, 'batch_loss/train': 0.7360675754025578} 12/26/2021 20:07:12 - INFO - codeparrot_training - Step 29388: {'lr': 
0.00018303830997610154, 'samples': 15047168, 'steps': 29388, 'batch_loss/train': 0.7599272085353732} 12/26/2021 20:07:23 - INFO - codeparrot_training - Step 29389: {'lr': 0.00018302312293584316, 'samples': 15047680, 'steps': 29389, 'batch_loss/train': 0.7266435016645119} 12/26/2021 20:07:34 - INFO - codeparrot_training - Step 29390: {'lr': 0.00018300793616185957, 'samples': 15048192, 'steps': 29390, 'batch_loss/train': 0.7954293694347143} 12/26/2021 20:07:46 - INFO - codeparrot_training - Step 29391: {'lr': 0.00018299274965421106, 'samples': 15048704, 'steps': 29391, 'batch_loss/train': 0.7998521206900477} 12/26/2021 20:07:57 - INFO - codeparrot_training - Step 29392: {'lr': 0.00018297756341295813, 'samples': 15049216, 'steps': 29392, 'batch_loss/train': 0.6365894065238535} 12/26/2021 20:08:07 - INFO - codeparrot_training - Step 29393: {'lr': 0.0001829623774381611, 'samples': 15049728, 'steps': 29393, 'batch_loss/train': 0.5892092649301048} 12/26/2021 20:08:18 - INFO - codeparrot_training - Step 29394: {'lr': 0.00018294719172988033, 'samples': 15050240, 'steps': 29394, 'batch_loss/train': 0.7869109278544784} 12/26/2021 20:08:30 - INFO - codeparrot_training - Step 29395: {'lr': 0.00018293200628817616, 'samples': 15050752, 'steps': 29395, 'batch_loss/train': 0.8666739584878087} 12/26/2021 20:08:41 - INFO - codeparrot_training - Step 29396: {'lr': 0.0001829168211131091, 'samples': 15051264, 'steps': 29396, 'batch_loss/train': 0.7010350488126278} 12/26/2021 20:08:51 - INFO - codeparrot_training - Step 29397: {'lr': 0.00018290163620473943, 'samples': 15051776, 'steps': 29397, 'batch_loss/train': 1.065645594149828} 12/26/2021 20:09:05 - INFO - codeparrot_training - Step 29398: {'lr': 0.00018288645156312744, 'samples': 15052288, 'steps': 29398, 'batch_loss/train': 0.8420988619327545} 12/26/2021 20:09:15 - INFO - codeparrot_training - Step 29399: {'lr': 0.0001828712671883336, 'samples': 15052800, 'steps': 29399, 'batch_loss/train': 0.544541264011059} 12/26/2021 20:09:26 - 
INFO - codeparrot_training - Step 29400: {'lr': 0.00018285608308041832, 'samples': 15053312, 'steps': 29400, 'batch_loss/train': 0.8205985510721803} 12/26/2021 20:09:38 - INFO - codeparrot_training - Step 29401: {'lr': 0.00018284089923944187, 'samples': 15053824, 'steps': 29401, 'batch_loss/train': 0.818356042727828} 12/26/2021 20:09:49 - INFO - codeparrot_training - Step 29402: {'lr': 0.00018282571566546454, 'samples': 15054336, 'steps': 29402, 'batch_loss/train': 0.6886188657954335} 12/26/2021 20:09:59 - INFO - codeparrot_training - Step 29403: {'lr': 0.00018281053235854694, 'samples': 15054848, 'steps': 29403, 'batch_loss/train': 0.9089459432289004} 12/26/2021 20:10:10 - INFO - codeparrot_training - Step 29404: {'lr': 0.00018279534931874925, 'samples': 15055360, 'steps': 29404, 'batch_loss/train': 0.7297272691503167} 12/26/2021 20:10:24 - INFO - codeparrot_training - Step 29405: {'lr': 0.00018278016654613187, 'samples': 15055872, 'steps': 29405, 'batch_loss/train': 0.712934123002924} 12/26/2021 20:10:34 - INFO - codeparrot_training - Step 29406: {'lr': 0.00018276498404075513, 'samples': 15056384, 'steps': 29406, 'batch_loss/train': 0.7093519468326122} 12/26/2021 20:10:45 - INFO - codeparrot_training - Step 29407: {'lr': 0.00018274980180267952, 'samples': 15056896, 'steps': 29407, 'batch_loss/train': 0.5811639297753572} 12/26/2021 20:10:57 - INFO - codeparrot_training - Step 29408: {'lr': 0.00018273461983196523, 'samples': 15057408, 'steps': 29408, 'batch_loss/train': 0.759200943633914} 12/26/2021 20:11:08 - INFO - codeparrot_training - Step 29409: {'lr': 0.00018271943812867263, 'samples': 15057920, 'steps': 29409, 'batch_loss/train': 0.7654959145002067} 12/26/2021 20:11:18 - INFO - codeparrot_training - Step 29410: {'lr': 0.00018270425669286228, 'samples': 15058432, 'steps': 29410, 'batch_loss/train': 0.6992937512695789} 12/26/2021 20:11:31 - INFO - codeparrot_training - Step 29411: {'lr': 0.00018268907552459435, 'samples': 15058944, 'steps': 29411, 
'batch_loss/train': 0.7521948236972094} 12/26/2021 20:11:41 - INFO - codeparrot_training - Step 29412: {'lr': 0.00018267389462392925, 'samples': 15059456, 'steps': 29412, 'batch_loss/train': 0.7448057988658547} 12/26/2021 20:11:52 - INFO - codeparrot_training - Step 29413: {'lr': 0.00018265871399092727, 'samples': 15059968, 'steps': 29413, 'batch_loss/train': 0.8183764694258571} 12/26/2021 20:12:03 - INFO - codeparrot_training - Step 29414: {'lr': 0.00018264353362564884, 'samples': 15060480, 'steps': 29414, 'batch_loss/train': 0.7485010476084426} 12/26/2021 20:12:15 - INFO - codeparrot_training - Step 29415: {'lr': 0.00018262835352815432, 'samples': 15060992, 'steps': 29415, 'batch_loss/train': 0.7488073776476085} 12/26/2021 20:12:25 - INFO - codeparrot_training - Step 29416: {'lr': 0.00018261317369850405, 'samples': 15061504, 'steps': 29416, 'batch_loss/train': 0.7830801708623767} 12/26/2021 20:12:36 - INFO - codeparrot_training - Step 29417: {'lr': 0.00018259799413675827, 'samples': 15062016, 'steps': 29417, 'batch_loss/train': 0.7380760163068771} 12/26/2021 20:12:49 - INFO - codeparrot_training - Step 29418: {'lr': 0.00018258281484297742, 'samples': 15062528, 'steps': 29418, 'batch_loss/train': 0.774574376642704} 12/26/2021 20:13:00 - INFO - codeparrot_training - Step 29419: {'lr': 0.0001825676358172219, 'samples': 15063040, 'steps': 29419, 'batch_loss/train': 0.7392964283935726} 12/26/2021 20:13:10 - INFO - codeparrot_training - Step 29420: {'lr': 0.00018255245705955193, 'samples': 15063552, 'steps': 29420, 'batch_loss/train': 0.7349558919668198} 12/26/2021 20:13:24 - INFO - codeparrot_training - Step 29421: {'lr': 0.00018253727857002797, 'samples': 15064064, 'steps': 29421, 'batch_loss/train': 1.258681422099471} 12/26/2021 20:13:35 - INFO - codeparrot_training - Step 29422: {'lr': 0.00018252210034871036, 'samples': 15064576, 'steps': 29422, 'batch_loss/train': 0.8102200524881482} 12/26/2021 20:13:46 - INFO - codeparrot_training - Step 29423: {'lr': 
0.00018250692239565942, 'samples': 15065088, 'steps': 29423, 'batch_loss/train': 0.8253629468381405} 12/26/2021 20:13:56 - INFO - codeparrot_training - Step 29424: {'lr': 0.00018249174471093532, 'samples': 15065600, 'steps': 29424, 'batch_loss/train': 0.823579702526331} 12/26/2021 20:14:08 - INFO - codeparrot_training - Step 29425: {'lr': 0.00018247656729459866, 'samples': 15066112, 'steps': 29425, 'batch_loss/train': 0.5945186277385801} 12/26/2021 20:14:19 - INFO - codeparrot_training - Step 29426: {'lr': 0.00018246139014670967, 'samples': 15066624, 'steps': 29426, 'batch_loss/train': 0.8197789965197444} 12/26/2021 20:14:29 - INFO - codeparrot_training - Step 29427: {'lr': 0.0001824462132673287, 'samples': 15067136, 'steps': 29427, 'batch_loss/train': 0.7019421320874244} 12/26/2021 20:14:41 - INFO - codeparrot_training - Step 29428: {'lr': 0.00018243103665651597, 'samples': 15067648, 'steps': 29428, 'batch_loss/train': 0.8461860483512282} 12/26/2021 20:14:52 - INFO - codeparrot_training - Step 29429: {'lr': 0.00018241586031433204, 'samples': 15068160, 'steps': 29429, 'batch_loss/train': 0.6587701826938428} 12/26/2021 20:15:03 - INFO - codeparrot_training - Step 29430: {'lr': 0.0001824006842408371, 'samples': 15068672, 'steps': 29430, 'batch_loss/train': 0.7016040984308347} 12/26/2021 20:15:17 - INFO - codeparrot_training - Step 29431: {'lr': 0.0001823855084360916, 'samples': 15069184, 'steps': 29431, 'batch_loss/train': 0.6744162612594664} 12/26/2021 20:15:27 - INFO - codeparrot_training - Step 29432: {'lr': 0.0001823703329001556, 'samples': 15069696, 'steps': 29432, 'batch_loss/train': 0.7611066568642855} 12/26/2021 20:15:38 - INFO - codeparrot_training - Step 29433: {'lr': 0.0001823551576330898, 'samples': 15070208, 'steps': 29433, 'batch_loss/train': 0.7471401738002896} 12/26/2021 20:15:50 - INFO - codeparrot_training - Step 29434: {'lr': 0.00018233998263495423, 'samples': 15070720, 'steps': 29434, 'batch_loss/train': 0.7767917071469128} 12/26/2021 20:16:01 - 
INFO - codeparrot_training - Step 29435: {'lr': 0.00018232480790580935, 'samples': 15071232, 'steps': 29435, 'batch_loss/train': 0.6524727009236813} 12/26/2021 20:16:11 - INFO - codeparrot_training - Step 29436: {'lr': 0.00018230963344571555, 'samples': 15071744, 'steps': 29436, 'batch_loss/train': 0.7011385797522962} 12/26/2021 20:16:22 - INFO - codeparrot_training - Step 29437: {'lr': 0.00018229445925473304, 'samples': 15072256, 'steps': 29437, 'batch_loss/train': 0.8157746409997344} 12/26/2021 20:16:36 - INFO - codeparrot_training - Step 29438: {'lr': 0.00018227928533292225, 'samples': 15072768, 'steps': 29438, 'batch_loss/train': 0.7369438204914331} 12/26/2021 20:16:47 - INFO - codeparrot_training - Step 29439: {'lr': 0.00018226411168034337, 'samples': 15073280, 'steps': 29439, 'batch_loss/train': 0.8239617794752121} 12/26/2021 20:16:57 - INFO - codeparrot_training - Step 29440: {'lr': 0.00018224893829705687, 'samples': 15073792, 'steps': 29440, 'batch_loss/train': 0.7820285959169269} 12/26/2021 20:17:09 - INFO - codeparrot_training - Step 29441: {'lr': 0.00018223376518312302, 'samples': 15074304, 'steps': 29441, 'batch_loss/train': 0.7317733876407146} 12/26/2021 20:17:20 - INFO - codeparrot_training - Step 29442: {'lr': 0.00018221859233860213, 'samples': 15074816, 'steps': 29442, 'batch_loss/train': 0.5964555526152253} 12/26/2021 20:17:31 - INFO - codeparrot_training - Step 29443: {'lr': 0.00018220341976355443, 'samples': 15075328, 'steps': 29443, 'batch_loss/train': 0.8309764182195067} 12/26/2021 20:17:43 - INFO - codeparrot_training - Step 29444: {'lr': 0.00018218824745804043, 'samples': 15075840, 'steps': 29444, 'batch_loss/train': 0.6868021057453007} 12/26/2021 20:17:54 - INFO - codeparrot_training - Step 29445: {'lr': 0.00018217307542212034, 'samples': 15076352, 'steps': 29445, 'batch_loss/train': 0.7362331738695502} 12/26/2021 20:18:04 - INFO - codeparrot_training - Step 29446: {'lr': 0.0001821579036558544, 'samples': 15076864, 'steps': 29446, 
'batch_loss/train': 1.330175288952887} 12/26/2021 20:18:15 - INFO - codeparrot_training - Step 29447: {'lr': 0.0001821427321593031, 'samples': 15077376, 'steps': 29447, 'batch_loss/train': 0.7768626627512276} 12/26/2021 20:18:29 - INFO - codeparrot_training - Step 29448: {'lr': 0.0001821275609325267, 'samples': 15077888, 'steps': 29448, 'batch_loss/train': 0.740162459667772} 12/26/2021 20:18:40 - INFO - codeparrot_training - Step 29449: {'lr': 0.00018211238997558544, 'samples': 15078400, 'steps': 29449, 'batch_loss/train': 0.9176278857048601} 12/26/2021 20:18:50 - INFO - codeparrot_training - Step 29450: {'lr': 0.00018209721928853965, 'samples': 15078912, 'steps': 29450, 'batch_loss/train': 0.6389922072994523} 12/26/2021 20:19:03 - INFO - codeparrot_training - Step 29451: {'lr': 0.00018208204887144977, 'samples': 15079424, 'steps': 29451, 'batch_loss/train': 0.7239634962752461} 12/26/2021 20:19:13 - INFO - codeparrot_training - Step 29452: {'lr': 0.00018206687872437602, 'samples': 15079936, 'steps': 29452, 'batch_loss/train': 0.7705500908195972} 12/26/2021 20:19:24 - INFO - codeparrot_training - Step 29453: {'lr': 0.00018205170884737866, 'samples': 15080448, 'steps': 29453, 'batch_loss/train': 0.6651444472372532} 12/26/2021 20:19:38 - INFO - codeparrot_training - Step 29454: {'lr': 0.000182036539240518, 'samples': 15080960, 'steps': 29454, 'batch_loss/train': 0.7356877418933436} 12/26/2021 20:19:48 - INFO - codeparrot_training - Step 29455: {'lr': 0.0001820213699038545, 'samples': 15081472, 'steps': 29455, 'batch_loss/train': 0.6273898049257696} 12/26/2021 20:19:59 - INFO - codeparrot_training - Step 29456: {'lr': 0.00018200620083744828, 'samples': 15081984, 'steps': 29456, 'batch_loss/train': 0.9178623715415597} 12/26/2021 20:20:11 - INFO - codeparrot_training - Step 29457: {'lr': 0.00018199103204135975, 'samples': 15082496, 'steps': 29457, 'batch_loss/train': 0.8117393963038921} 12/26/2021 20:20:22 - INFO - codeparrot_training - Step 29458: {'lr': 
0.00018197586351564928, 'samples': 15083008, 'steps': 29458, 'batch_loss/train': 0.7603393811732531} 12/26/2021 20:20:32 - INFO - codeparrot_training - Step 29459: {'lr': 0.00018196069526037702, 'samples': 15083520, 'steps': 29459, 'batch_loss/train': 0.7583584887906909} 12/26/2021 20:20:43 - INFO - codeparrot_training - Step 29460: {'lr': 0.00018194552727560337, 'samples': 15084032, 'steps': 29460, 'batch_loss/train': 0.6471896097064018} 12/26/2021 20:20:55 - INFO - codeparrot_training - Step 29461: {'lr': 0.00018193035956138854, 'samples': 15084544, 'steps': 29461, 'batch_loss/train': 0.8073206092230976} 12/26/2021 20:21:06 - INFO - codeparrot_training - Step 29462: {'lr': 0.000181915192117793, 'samples': 15085056, 'steps': 29462, 'batch_loss/train': 0.7612242344766855} 12/26/2021 20:21:16 - INFO - codeparrot_training - Step 29463: {'lr': 0.00018190002494487687, 'samples': 15085568, 'steps': 29463, 'batch_loss/train': 0.5989924511814024} 12/26/2021 20:21:29 - INFO - codeparrot_training - Step 29464: {'lr': 0.0001818848580427006, 'samples': 15086080, 'steps': 29464, 'batch_loss/train': 0.6614830121397972} 12/26/2021 20:21:39 - INFO - codeparrot_training - Step 29465: {'lr': 0.0001818696914113243, 'samples': 15086592, 'steps': 29465, 'batch_loss/train': 0.6784335100091994} 12/26/2021 20:21:50 - INFO - codeparrot_training - Step 29466: {'lr': 0.00018185452505080845, 'samples': 15087104, 'steps': 29466, 'batch_loss/train': 0.6830206890590489} 12/26/2021 20:22:04 - INFO - codeparrot_training - Step 29467: {'lr': 0.00018183935896121334, 'samples': 15087616, 'steps': 29467, 'batch_loss/train': 0.7026498499326408} 12/26/2021 20:22:14 - INFO - codeparrot_training - Step 29468: {'lr': 0.0001818241931425992, 'samples': 15088128, 'steps': 29468, 'batch_loss/train': 0.73049839399755} 12/26/2021 20:22:25 - INFO - codeparrot_training - Step 29469: {'lr': 0.0001818090275950262, 'samples': 15088640, 'steps': 29469, 'batch_loss/train': 0.8739034966565669} 12/26/2021 20:22:36 - 
INFO - codeparrot_training - Step 29470: {'lr': 0.00018179386231855483, 'samples': 15089152, 'steps': 29470, 'batch_loss/train': 0.7769928216002882} 12/26/2021 20:22:48 - INFO - codeparrot_training - Step 29471: {'lr': 0.00018177869731324535, 'samples': 15089664, 'steps': 29471, 'batch_loss/train': 0.7920779141131788} 12/26/2021 20:22:58 - INFO - codeparrot_training - Step 29472: {'lr': 0.0001817635325791579, 'samples': 15090176, 'steps': 29472, 'batch_loss/train': 0.7599543705582619} 12/26/2021 20:23:09 - INFO - codeparrot_training - Step 29473: {'lr': 0.00018174836811635293, 'samples': 15090688, 'steps': 29473, 'batch_loss/train': 0.7331667953403667} 12/26/2021 20:23:21 - INFO - codeparrot_training - Step 29474: {'lr': 0.00018173320392489073, 'samples': 15091200, 'steps': 29474, 'batch_loss/train': 0.6883227629587054} 12/26/2021 20:23:32 - INFO - codeparrot_training - Step 29475: {'lr': 0.0001817180400048315, 'samples': 15091712, 'steps': 29475, 'batch_loss/train': 0.6901211449876428} 12/26/2021 20:23:42 - INFO - codeparrot_training - Step 29476: {'lr': 0.0001817028763562355, 'samples': 15092224, 'steps': 29476, 'batch_loss/train': 0.7026248192414641} 12/26/2021 20:23:56 - INFO - codeparrot_training - Step 29477: {'lr': 0.0001816877129791632, 'samples': 15092736, 'steps': 29477, 'batch_loss/train': 0.6894436730071902} 12/26/2021 20:24:07 - INFO - codeparrot_training - Step 29478: {'lr': 0.00018167254987367472, 'samples': 15093248, 'steps': 29478, 'batch_loss/train': 0.803079416975379} 12/26/2021 20:24:17 - INFO - codeparrot_training - Step 29479: {'lr': 0.00018165738703983036, 'samples': 15093760, 'steps': 29479, 'batch_loss/train': 0.6339846006594598} 12/26/2021 20:24:30 - INFO - codeparrot_training - Step 29480: {'lr': 0.00018164222447769035, 'samples': 15094272, 'steps': 29480, 'batch_loss/train': 0.7105831308290362} 12/26/2021 20:24:40 - INFO - codeparrot_training - Step 29481: {'lr': 0.00018162706218731518, 'samples': 15094784, 'steps': 29481, 
'batch_loss/train': 0.7107014873763546} 12/26/2021 20:24:51 - INFO - codeparrot_training - Step 29482: {'lr': 0.0001816119001687649, 'samples': 15095296, 'steps': 29482, 'batch_loss/train': 0.6597651811316609} 12/26/2021 20:25:01 - INFO - codeparrot_training - Step 29483: {'lr': 0.00018159673842209985, 'samples': 15095808, 'steps': 29483, 'batch_loss/train': 0.7493708143010736} 12/26/2021 20:25:15 - INFO - codeparrot_training - Step 29484: {'lr': 0.00018158157694738046, 'samples': 15096320, 'steps': 29484, 'batch_loss/train': 0.7169660604558885} 12/26/2021 20:25:26 - INFO - codeparrot_training - Step 29485: {'lr': 0.0001815664157446668, 'samples': 15096832, 'steps': 29485, 'batch_loss/train': 0.5514483695151284} 12/26/2021 20:25:37 - INFO - codeparrot_training - Step 29486: {'lr': 0.00018155125481401931, 'samples': 15097344, 'steps': 29486, 'batch_loss/train': 0.7724679079838097} 12/26/2021 20:25:49 - INFO - codeparrot_training - Step 29487: {'lr': 0.0001815360941554981, 'samples': 15097856, 'steps': 29487, 'batch_loss/train': 0.7288124114274979} 12/26/2021 20:26:00 - INFO - codeparrot_training - Step 29488: {'lr': 0.0001815209337691636, 'samples': 15098368, 'steps': 29488, 'batch_loss/train': 0.7465467769652605} 12/26/2021 20:26:10 - INFO - codeparrot_training - Step 29489: {'lr': 0.00018150577365507597, 'samples': 15098880, 'steps': 29489, 'batch_loss/train': 0.5982748395763338} 12/26/2021 20:26:22 - INFO - codeparrot_training - Step 29490: {'lr': 0.00018149061381329557, 'samples': 15099392, 'steps': 29490, 'batch_loss/train': 0.5090405191294849} 12/26/2021 20:26:33 - INFO - codeparrot_training - Step 29491: {'lr': 0.0001814754542438825, 'samples': 15099904, 'steps': 29491, 'batch_loss/train': 0.7774674734100699} 12/26/2021 20:26:43 - INFO - codeparrot_training - Step 29492: {'lr': 0.00018146029494689722, 'samples': 15100416, 'steps': 29492, 'batch_loss/train': 0.6591949058347382} 12/26/2021 20:26:54 - INFO - codeparrot_training - Step 29493: {'lr': 
0.00018144513592239998, 'samples': 15100928, 'steps': 29493, 'batch_loss/train': 0.7993801767006516} 12/26/2021 20:27:06 - INFO - codeparrot_training - Step 29494: {'lr': 0.0001814299771704509, 'samples': 15101440, 'steps': 29494, 'batch_loss/train': 0.7106831497512758} 12/26/2021 20:27:17 - INFO - codeparrot_training - Step 29495: {'lr': 0.00018141481869111037, 'samples': 15101952, 'steps': 29495, 'batch_loss/train': 0.7736831959336996} 12/26/2021 20:27:27 - INFO - codeparrot_training - Step 29496: {'lr': 0.00018139966048443868, 'samples': 15102464, 'steps': 29496, 'batch_loss/train': 0.7462375616887584} 12/26/2021 20:27:39 - INFO - codeparrot_training - Step 29497: {'lr': 0.00018138450255049605, 'samples': 15102976, 'steps': 29497, 'batch_loss/train': 0.7059293105266988} 12/26/2021 20:27:50 - INFO - codeparrot_training - Step 29498: {'lr': 0.00018136934488934263, 'samples': 15103488, 'steps': 29498, 'batch_loss/train': 0.7768711494281888} 12/26/2021 20:28:01 - INFO - codeparrot_training - Step 29499: {'lr': 0.00018135418750103884, 'samples': 15104000, 'steps': 29499, 'batch_loss/train': 0.7237832453101873} 12/26/2021 20:28:15 - INFO - codeparrot_training - Step 29500: {'lr': 0.0001813390303856449, 'samples': 15104512, 'steps': 29500, 'batch_loss/train': 0.6007846295833588} 12/26/2021 20:28:25 - INFO - codeparrot_training - Step 29501: {'lr': 0.00018132387354322102, 'samples': 15105024, 'steps': 29501, 'batch_loss/train': 1.0098745180293918} 12/26/2021 20:28:36 - INFO - codeparrot_training - Step 29502: {'lr': 0.00018130871697382746, 'samples': 15105536, 'steps': 29502, 'batch_loss/train': 0.7082922959234565} 12/26/2021 20:28:48 - INFO - codeparrot_training - Step 29503: {'lr': 0.00018129356067752455, 'samples': 15106048, 'steps': 29503, 'batch_loss/train': 0.6982089173980057} 12/26/2021 20:28:59 - INFO - codeparrot_training - Step 29504: {'lr': 0.00018127840465437248, 'samples': 15106560, 'steps': 29504, 'batch_loss/train': 0.7254513124935329} 12/26/2021 20:29:10 
- INFO - codeparrot_training - Step 29505: {'lr': 0.00018126324890443157, 'samples': 15107072, 'steps': 29505, 'batch_loss/train': 0.7326964614912868} 12/26/2021 20:29:20 - INFO - codeparrot_training - Step 29506: {'lr': 0.00018124809342776193, 'samples': 15107584, 'steps': 29506, 'batch_loss/train': 0.8362217247486115} 12/26/2021 20:29:34 - INFO - codeparrot_training - Step 29507: {'lr': 0.000181232938224424, 'samples': 15108096, 'steps': 29507, 'batch_loss/train': 0.700530054513365} 12/26/2021 20:29:45 - INFO - codeparrot_training - Step 29508: {'lr': 0.0001812177832944779, 'samples': 15108608, 'steps': 29508, 'batch_loss/train': 0.7054906307021156} 12/26/2021 20:29:55 - INFO - codeparrot_training - Step 29509: {'lr': 0.00018120262863798386, 'samples': 15109120, 'steps': 29509, 'batch_loss/train': 0.7296452987939119} 12/26/2021 20:30:08 - INFO - codeparrot_training - Step 29510: {'lr': 0.00018118747425500232, 'samples': 15109632, 'steps': 29510, 'batch_loss/train': 0.8069966603070498} 12/26/2021 20:30:18 - INFO - codeparrot_training - Step 29511: {'lr': 0.00018117232014559336, 'samples': 15110144, 'steps': 29511, 'batch_loss/train': 0.7012800918892026} 12/26/2021 20:30:29 - INFO - codeparrot_training - Step 29512: {'lr': 0.0001811571663098173, 'samples': 15110656, 'steps': 29512, 'batch_loss/train': 0.7272154157981277} 12/26/2021 20:30:41 - INFO - codeparrot_training - Step 29513: {'lr': 0.00018114201274773423, 'samples': 15111168, 'steps': 29513, 'batch_loss/train': 1.0252323532477021} 12/26/2021 20:30:52 - INFO - codeparrot_training - Step 29514: {'lr': 0.00018112685945940465, 'samples': 15111680, 'steps': 29514, 'batch_loss/train': 0.7804261818528175} 12/26/2021 20:31:02 - INFO - codeparrot_training - Step 29515: {'lr': 0.0001811117064448886, 'samples': 15112192, 'steps': 29515, 'batch_loss/train': 0.7774937283247709} 12/26/2021 20:31:13 - INFO - codeparrot_training - Step 29516: {'lr': 0.00018109655370424644, 'samples': 15112704, 'steps': 29516, 
'batch_loss/train': 0.7446258051786572} 12/26/2021 20:31:27 - INFO - codeparrot_training - Step 29517: {'lr': 0.00018108140123753827, 'samples': 15113216, 'steps': 29517, 'batch_loss/train': 0.6585931708104908} 12/26/2021 20:31:38 - INFO - codeparrot_training - Step 29518: {'lr': 0.0001810662490448245, 'samples': 15113728, 'steps': 29518, 'batch_loss/train': 0.7350449892692268} 12/26/2021 20:31:48 - INFO - codeparrot_training - Step 29519: {'lr': 0.0001810510971261653, 'samples': 15114240, 'steps': 29519, 'batch_loss/train': 0.6649607690051198} 12/26/2021 20:32:01 - INFO - codeparrot_training - Step 29520: {'lr': 0.00018103594548162083, 'samples': 15114752, 'steps': 29520, 'batch_loss/train': 0.7289614817127585} 12/26/2021 20:32:11 - INFO - codeparrot_training - Step 29521: {'lr': 0.00018102079411125147, 'samples': 15115264, 'steps': 29521, 'batch_loss/train': 0.6963574616238475} 12/26/2021 20:32:22 - INFO - codeparrot_training - Step 29522: {'lr': 0.0001810056430151174, 'samples': 15115776, 'steps': 29522, 'batch_loss/train': 0.38139686023350805} 12/26/2021 20:32:36 - INFO - codeparrot_training - Step 29523: {'lr': 0.0001809904921932789, 'samples': 15116288, 'steps': 29523, 'batch_loss/train': 0.9846505830064416} 12/26/2021 20:32:47 - INFO - codeparrot_training - Step 29524: {'lr': 0.00018097534164579599, 'samples': 15116800, 'steps': 29524, 'batch_loss/train': 1.488784173852764} 12/26/2021 20:32:57 - INFO - codeparrot_training - Step 29525: {'lr': 0.00018096019137272913, 'samples': 15117312, 'steps': 29525, 'batch_loss/train': 1.5291337333619595} 12/26/2021 20:33:08 - INFO - codeparrot_training - Step 29526: {'lr': 0.00018094504137413853, 'samples': 15117824, 'steps': 29526, 'batch_loss/train': 0.7359403185546398} 12/26/2021 20:33:20 - INFO - codeparrot_training - Step 29527: {'lr': 0.0001809298916500843, 'samples': 15118336, 'steps': 29527, 'batch_loss/train': 0.991396764293313} 12/26/2021 20:33:31 - INFO - codeparrot_training - Step 29528: {'lr': 
0.0001809147422006267, 'samples': 15118848, 'steps': 29528, 'batch_loss/train': 0.76436831895262} 12/26/2021 20:33:41 - INFO - codeparrot_training - Step 29529: {'lr': 0.0001808995930258261, 'samples': 15119360, 'steps': 29529, 'batch_loss/train': 0.7696453774115071} 12/26/2021 20:33:54 - INFO - codeparrot_training - Step 29530: {'lr': 0.0001808844441257426, 'samples': 15119872, 'steps': 29530, 'batch_loss/train': 0.7495295060798526} 12/26/2021 20:34:04 - INFO - codeparrot_training - Step 29531: {'lr': 0.0001808692955004364, 'samples': 15120384, 'steps': 29531, 'batch_loss/train': 0.7539323994424194} 12/26/2021 20:34:15 - INFO - codeparrot_training - Step 29532: {'lr': 0.00018085414714996784, 'samples': 15120896, 'steps': 29532, 'batch_loss/train': 0.7404245655052364} 12/26/2021 20:34:27 - INFO - codeparrot_training - Step 29533: {'lr': 0.00018083899907439714, 'samples': 15121408, 'steps': 29533, 'batch_loss/train': 0.8711012909188867} 12/26/2021 20:34:37 - INFO - codeparrot_training - Step 29534: {'lr': 0.00018082385127378436, 'samples': 15121920, 'steps': 29534, 'batch_loss/train': 0.7634481689892709} 12/26/2021 20:34:48 - INFO - codeparrot_training - Step 29535: {'lr': 0.00018080870374818982, 'samples': 15122432, 'steps': 29535, 'batch_loss/train': 0.8031928678974509} 12/26/2021 20:34:59 - INFO - codeparrot_training - Step 29536: {'lr': 0.00018079355649767382, 'samples': 15122944, 'steps': 29536, 'batch_loss/train': 0.742884828010574} 12/26/2021 20:35:13 - INFO - codeparrot_training - Step 29537: {'lr': 0.00018077840952229645, 'samples': 15123456, 'steps': 29537, 'batch_loss/train': 0.6811107750982046} 12/26/2021 20:35:23 - INFO - codeparrot_training - Step 29538: {'lr': 0.0001807632628221181, 'samples': 15123968, 'steps': 29538, 'batch_loss/train': 0.6567497226351406} 12/26/2021 20:35:34 - INFO - codeparrot_training - Step 29539: {'lr': 0.00018074811639719868, 'samples': 15124480, 'steps': 29539, 'batch_loss/train': 0.7559782918542624} 12/26/2021 20:35:46 - 
INFO - codeparrot_training - Step 29540: {'lr': 0.00018073297024759873, 'samples': 15124992, 'steps': 29540, 'batch_loss/train': 0.7042343660723418} 12/26/2021 20:35:57 - INFO - codeparrot_training - Step 29541: {'lr': 0.00018071782437337837, 'samples': 15125504, 'steps': 29541, 'batch_loss/train': 0.8397961757145822} 12/26/2021 20:36:07 - INFO - codeparrot_training - Step 29542: {'lr': 0.00018070267877459773, 'samples': 15126016, 'steps': 29542, 'batch_loss/train': 0.7063688896596432} 12/26/2021 20:36:19 - INFO - codeparrot_training - Step 29543: {'lr': 0.00018068753345131701, 'samples': 15126528, 'steps': 29543, 'batch_loss/train': 0.8069231193512678} 12/26/2021 20:36:30 - INFO - codeparrot_training - Step 29544: {'lr': 0.00018067238840359658, 'samples': 15127040, 'steps': 29544, 'batch_loss/train': 0.6505186728900298} 12/26/2021 20:36:41 - INFO - codeparrot_training - Step 29545: {'lr': 0.00018065724363149654, 'samples': 15127552, 'steps': 29545, 'batch_loss/train': 0.6549482829868793} 12/26/2021 20:36:55 - INFO - codeparrot_training - Step 29546: {'lr': 0.000180642099135077, 'samples': 15128064, 'steps': 29546, 'batch_loss/train': 0.6621566670946777} 12/26/2021 20:37:06 - INFO - codeparrot_training - Step 29547: {'lr': 0.0001806269549143984, 'samples': 15128576, 'steps': 29547, 'batch_loss/train': 0.6838188851252198} 12/26/2021 20:37:16 - INFO - codeparrot_training - Step 29548: {'lr': 0.00018061181096952085, 'samples': 15129088, 'steps': 29548, 'batch_loss/train': 0.6727215490536764} 12/26/2021 20:37:27 - INFO - codeparrot_training - Step 29549: {'lr': 0.0001805966673005045, 'samples': 15129600, 'steps': 29549, 'batch_loss/train': 0.8125472478568554} 12/26/2021 20:37:39 - INFO - codeparrot_training - Step 29550: {'lr': 0.00018058152390740953, 'samples': 15130112, 'steps': 29550, 'batch_loss/train': 0.7138918647542596} 12/26/2021 20:37:50 - INFO - codeparrot_training - Step 29551: {'lr': 0.00018056638079029633, 'samples': 15130624, 'steps': 29551, 
'batch_loss/train': 0.6644319084007293} 12/26/2021 20:38:00 - INFO - codeparrot_training - Step 29552: {'lr': 0.00018055123794922498, 'samples': 15131136, 'steps': 29552, 'batch_loss/train': 0.724754280410707} 12/26/2021 20:38:14 - INFO - codeparrot_training - Step 29553: {'lr': 0.00018053609538425564, 'samples': 15131648, 'steps': 29553, 'batch_loss/train': 0.7844232451170683} 12/26/2021 20:38:25 - INFO - codeparrot_training - Step 29554: {'lr': 0.00018052095309544847, 'samples': 15132160, 'steps': 29554, 'batch_loss/train': 0.7654858767054975} 12/26/2021 20:38:36 - INFO - codeparrot_training - Step 29555: {'lr': 0.0001805058110828639, 'samples': 15132672, 'steps': 29555, 'batch_loss/train': 0.9296467797830701} 12/26/2021 20:38:48 - INFO - codeparrot_training - Step 29556: {'lr': 0.0001804906693465619, 'samples': 15133184, 'steps': 29556, 'batch_loss/train': 0.6815310064703226} 12/26/2021 20:38:58 - INFO - codeparrot_training - Step 29557: {'lr': 0.0001804755278866027, 'samples': 15133696, 'steps': 29557, 'batch_loss/train': 0.7211178615689278} 12/26/2021 20:39:09 - INFO - codeparrot_training - Step 29558: {'lr': 0.00018046038670304672, 'samples': 15134208, 'steps': 29558, 'batch_loss/train': 0.7227323004044592} 12/26/2021 20:39:19 - INFO - codeparrot_training - Step 29559: {'lr': 0.0001804452457959539, 'samples': 15134720, 'steps': 29559, 'batch_loss/train': 1.3926651617512107} 12/26/2021 20:39:32 - INFO - codeparrot_training - Step 29560: {'lr': 0.00018043010516538452, 'samples': 15135232, 'steps': 29560, 'batch_loss/train': 0.7398084653541446} 12/26/2021 20:39:43 - INFO - codeparrot_training - Step 29561: {'lr': 0.00018041496481139869, 'samples': 15135744, 'steps': 29561, 'batch_loss/train': 1.2034906828776002} 12/26/2021 20:39:54 - INFO - codeparrot_training - Step 29562: {'lr': 0.0001803998247340568, 'samples': 15136256, 'steps': 29562, 'batch_loss/train': 0.7086104209301993} 12/26/2021 20:40:08 - INFO - codeparrot_training - Step 29563: {'lr': 
0.0001803846849334189, 'samples': 15136768, 'steps': 29563, 'batch_loss/train': 0.7628834433853626} 12/26/2021 20:40:18 - INFO - codeparrot_training - Step 29564: {'lr': 0.00018036954540954527, 'samples': 15137280, 'steps': 29564, 'batch_loss/train': 0.7547719427966513} 12/26/2021 20:40:29 - INFO - codeparrot_training - Step 29565: {'lr': 0.00018035440616249586, 'samples': 15137792, 'steps': 29565, 'batch_loss/train': 0.6390295896562748} 12/26/2021 20:40:41 - INFO - codeparrot_training - Step 29566: {'lr': 0.00018033926719233112, 'samples': 15138304, 'steps': 29566, 'batch_loss/train': 0.689622319303453} 12/26/2021 20:40:51 - INFO - codeparrot_training - Step 29567: {'lr': 0.00018032412849911122, 'samples': 15138816, 'steps': 29567, 'batch_loss/train': 0.7661242326721549} 12/26/2021 20:41:02 - INFO - codeparrot_training - Step 29568: {'lr': 0.00018030899008289616, 'samples': 15139328, 'steps': 29568, 'batch_loss/train': 0.6900628609582782} 12/26/2021 20:41:13 - INFO - codeparrot_training - Step 29569: {'lr': 0.00018029385194374636, 'samples': 15139840, 'steps': 29569, 'batch_loss/train': 0.8513529202900827} 12/26/2021 20:41:25 - INFO - codeparrot_training - Step 29570: {'lr': 0.0001802787140817218, 'samples': 15140352, 'steps': 29570, 'batch_loss/train': 0.6939413412474096} 12/26/2021 20:41:35 - INFO - codeparrot_training - Step 29571: {'lr': 0.00018026357649688282, 'samples': 15140864, 'steps': 29571, 'batch_loss/train': 0.7915012175217271} 12/26/2021 20:41:46 - INFO - codeparrot_training - Step 29572: {'lr': 0.00018024843918928944, 'samples': 15141376, 'steps': 29572, 'batch_loss/train': 0.7447158200666308} 12/26/2021 20:42:00 - INFO - codeparrot_training - Step 29573: {'lr': 0.00018023330215900197, 'samples': 15141888, 'steps': 29573, 'batch_loss/train': 0.7563488064333797} 12/26/2021 20:42:10 - INFO - codeparrot_training - Step 29574: {'lr': 0.00018021816540608057, 'samples': 15142400, 'steps': 29574, 'batch_loss/train': 0.6312541803345084} 12/26/2021 20:42:21 
- INFO - codeparrot_training - Step 29575: {'lr': 0.00018020302893058537, 'samples': 15142912, 'steps': 29575, 'batch_loss/train': 0.7830754118040204} 12/26/2021 20:42:33 - INFO - codeparrot_training - Step 29576: {'lr': 0.00018018789273257653, 'samples': 15143424, 'steps': 29576, 'batch_loss/train': 0.680125625920482} 12/26/2021 20:42:44 - INFO - codeparrot_training - Step 29577: {'lr': 0.00018017275681211437, 'samples': 15143936, 'steps': 29577, 'batch_loss/train': 0.7282328768633306} 12/26/2021 20:42:54 - INFO - codeparrot_training - Step 29578: {'lr': 0.00018015762116925892, 'samples': 15144448, 'steps': 29578, 'batch_loss/train': 0.8229903085157275} 12/26/2021 20:43:06 - INFO - codeparrot_training - Step 29579: {'lr': 0.0001801424858040704, 'samples': 15144960, 'steps': 29579, 'batch_loss/train': 0.7097624689340591} 12/26/2021 20:43:17 - INFO - codeparrot_training - Step 29580: {'lr': 0.00018012735071660892, 'samples': 15145472, 'steps': 29580, 'batch_loss/train': 0.7718259897083044} 12/26/2021 20:43:27 - INFO - codeparrot_training - Step 29581: {'lr': 0.00018011221590693479, 'samples': 15145984, 'steps': 29581, 'batch_loss/train': 0.8035184107720852} 12/26/2021 20:43:38 - INFO - codeparrot_training - Step 29582: {'lr': 0.00018009708137510806, 'samples': 15146496, 'steps': 29582, 'batch_loss/train': 0.7793425307609141} 12/26/2021 20:43:52 - INFO - codeparrot_training - Step 29583: {'lr': 0.00018008194712118886, 'samples': 15147008, 'steps': 29583, 'batch_loss/train': 0.8231644965708256} 12/26/2021 20:44:03 - INFO - codeparrot_training - Step 29584: {'lr': 0.0001800668131452376, 'samples': 15147520, 'steps': 29584, 'batch_loss/train': 0.6420911168679595} 12/26/2021 20:44:13 - INFO - codeparrot_training - Step 29585: {'lr': 0.00018005167944731422, 'samples': 15148032, 'steps': 29585, 'batch_loss/train': 0.7652848232537508} 12/26/2021 20:44:25 - INFO - codeparrot_training - Step 29586: {'lr': 0.00018003654602747896, 'samples': 15148544, 'steps': 29586, 
'batch_loss/train': 0.7090628650039434} 12/26/2021 20:44:36 - INFO - codeparrot_training - Step 29587: {'lr': 0.00018002141288579187, 'samples': 15149056, 'steps': 29587, 'batch_loss/train': 0.7455866867676377} 12/26/2021 20:44:46 - INFO - codeparrot_training - Step 29588: {'lr': 0.00018000628002231336, 'samples': 15149568, 'steps': 29588, 'batch_loss/train': 0.7194213727489114} 12/26/2021 20:44:59 - INFO - codeparrot_training - Step 29589: {'lr': 0.0001799911474371034, 'samples': 15150080, 'steps': 29589, 'batch_loss/train': 0.7254469748586416} 12/26/2021 20:45:09 - INFO - codeparrot_training - Step 29590: {'lr': 0.00017997601513022228, 'samples': 15150592, 'steps': 29590, 'batch_loss/train': 0.7571539958007634} 12/26/2021 20:45:20 - INFO - codeparrot_training - Step 29591: {'lr': 0.00017996088310172993, 'samples': 15151104, 'steps': 29591, 'batch_loss/train': 0.5433278759010136} 12/26/2021 20:45:31 - INFO - codeparrot_training - Step 29592: {'lr': 0.00017994575135168672, 'samples': 15151616, 'steps': 29592, 'batch_loss/train': 0.708873110357672} 12/26/2021 20:45:44 - INFO - codeparrot_training - Step 29593: {'lr': 0.0001799306198801528, 'samples': 15152128, 'steps': 29593, 'batch_loss/train': 0.7971278037875891} 12/26/2021 20:45:55 - INFO - codeparrot_training - Step 29594: {'lr': 0.0001799154886871882, 'samples': 15152640, 'steps': 29594, 'batch_loss/train': 0.7676974260248244} 12/26/2021 20:46:06 - INFO - codeparrot_training - Step 29595: {'lr': 0.00017990035777285318, 'samples': 15153152, 'steps': 29595, 'batch_loss/train': 0.7252565019298345} 12/26/2021 20:46:18 - INFO - codeparrot_training - Step 29596: {'lr': 0.00017988522713720796, 'samples': 15153664, 'steps': 29596, 'batch_loss/train': 0.7159147481434047} 12/26/2021 20:46:29 - INFO - codeparrot_training - Step 29597: {'lr': 0.00017987009678031258, 'samples': 15154176, 'steps': 29597, 'batch_loss/train': 0.6444325125776231} 12/26/2021 20:46:39 - INFO - codeparrot_training - Step 29598: {'lr': 
0.00017985496670222713, 'samples': 15154688, 'steps': 29598, 'batch_loss/train': 0.7847897168248892} 12/26/2021 20:46:51 - INFO - codeparrot_training - Step 29599: {'lr': 0.00017983983690301187, 'samples': 15155200, 'steps': 29599, 'batch_loss/train': 0.7552827259059995} 12/26/2021 20:47:02 - INFO - codeparrot_training - Step 29600: {'lr': 0.00017982470738272697, 'samples': 15155712, 'steps': 29600, 'batch_loss/train': 0.7577668926678598} 12/26/2021 20:47:13 - INFO - codeparrot_training - Step 29601: {'lr': 0.0001798095781414325, 'samples': 15156224, 'steps': 29601, 'batch_loss/train': 0.6764092976227403} 12/26/2021 20:47:26 - INFO - codeparrot_training - Step 29602: {'lr': 0.00017979444917918857, 'samples': 15156736, 'steps': 29602, 'batch_loss/train': 0.8097494579851627} 12/26/2021 20:47:37 - INFO - codeparrot_training - Step 29603: {'lr': 0.00017977932049605556, 'samples': 15157248, 'steps': 29603, 'batch_loss/train': 0.6499298823182471} 12/26/2021 20:47:48 - INFO - codeparrot_training - Step 29604: {'lr': 0.00017976419209209343, 'samples': 15157760, 'steps': 29604, 'batch_loss/train': 0.7044934388250113} 12/26/2021 20:47:58 - INFO - codeparrot_training - Step 29605: {'lr': 0.00017974906396736224, 'samples': 15158272, 'steps': 29605, 'batch_loss/train': 0.7434879837092012} 12/26/2021 20:48:11 - INFO - codeparrot_training - Step 29606: {'lr': 0.00017973393612192228, 'samples': 15158784, 'steps': 29606, 'batch_loss/train': 0.8177911685779691} 12/26/2021 20:48:21 - INFO - codeparrot_training - Step 29607: {'lr': 0.00017971880855583374, 'samples': 15159296, 'steps': 29607, 'batch_loss/train': 0.7092501665465534} 12/26/2021 20:48:32 - INFO - codeparrot_training - Step 29608: {'lr': 0.00017970368126915665, 'samples': 15159808, 'steps': 29608, 'batch_loss/train': 0.682644460350275} 12/26/2021 20:48:46 - INFO - codeparrot_training - Step 29609: {'lr': 0.00017968855426195106, 'samples': 15160320, 'steps': 29609, 'batch_loss/train': 0.7077808645553887} 12/26/2021 20:48:56 
- INFO - codeparrot_training - Step 29610: {'lr': 0.0001796734275342774, 'samples': 15160832, 'steps': 29610, 'batch_loss/train': 0.727285388391465} 12/26/2021 20:49:07 - INFO - codeparrot_training - Step 29611: {'lr': 0.00017965830108619555, 'samples': 15161344, 'steps': 29611, 'batch_loss/train': 0.7426012796349823} 12/26/2021 20:49:19 - INFO - codeparrot_training - Step 29612: {'lr': 0.0001796431749177658, 'samples': 15161856, 'steps': 29612, 'batch_loss/train': 0.6529114842414856} 12/26/2021 20:49:30 - INFO - codeparrot_training - Step 29613: {'lr': 0.0001796280490290481, 'samples': 15162368, 'steps': 29613, 'batch_loss/train': 0.7550078118219972} 12/26/2021 20:49:40 - INFO - codeparrot_training - Step 29614: {'lr': 0.00017961292342010283, 'samples': 15162880, 'steps': 29614, 'batch_loss/train': 0.7693848805502057} 12/26/2021 20:49:51 - INFO - codeparrot_training - Step 29615: {'lr': 0.00017959779809098995, 'samples': 15163392, 'steps': 29615, 'batch_loss/train': 0.7594064432196319} 12/26/2021 20:50:03 - INFO - codeparrot_training - Step 29616: {'lr': 0.00017958267304176969, 'samples': 15163904, 'steps': 29616, 'batch_loss/train': 0.7438688543625176} 12/26/2021 20:50:14 - INFO - codeparrot_training - Step 29617: {'lr': 0.00017956754827250202, 'samples': 15164416, 'steps': 29617, 'batch_loss/train': 0.7509451289661229} 12/26/2021 20:50:24 - INFO - codeparrot_training - Step 29618: {'lr': 0.00017955242378324724, 'samples': 15164928, 'steps': 29618, 'batch_loss/train': 0.6930852252990007} 12/26/2021 20:50:37 - INFO - codeparrot_training - Step 29619: {'lr': 0.00017953729957406544, 'samples': 15165440, 'steps': 29619, 'batch_loss/train': 0.7592165293172002} 12/26/2021 20:50:47 - INFO - codeparrot_training - Step 29620: {'lr': 0.0001795221756450167, 'samples': 15165952, 'steps': 29620, 'batch_loss/train': 0.5067366661969572} 12/26/2021 20:50:58 - INFO - codeparrot_training - Step 29621: {'lr': 0.00017950705199616118, 'samples': 15166464, 'steps': 29621, 
'batch_loss/train': 0.7696105185896158} 12/26/2021 20:51:12 - INFO - codeparrot_training - Step 29622: {'lr': 0.0001794919286275591, 'samples': 15166976, 'steps': 29622, 'batch_loss/train': 0.7436200035735965} 12/26/2021 20:51:23 - INFO - codeparrot_training - Step 29623: {'lr': 0.00017947680553927043, 'samples': 15167488, 'steps': 29623, 'batch_loss/train': 0.5560043861623853} 12/26/2021 20:51:33 - INFO - codeparrot_training - Step 29624: {'lr': 0.00017946168273135526, 'samples': 15168000, 'steps': 29624, 'batch_loss/train': 0.8125783000141382} 12/26/2021 20:51:44 - INFO - codeparrot_training - Step 29625: {'lr': 0.0001794465602038739, 'samples': 15168512, 'steps': 29625, 'batch_loss/train': 0.6734315886860713} 12/26/2021 20:51:56 - INFO - codeparrot_training - Step 29626: {'lr': 0.00017943143795688638, 'samples': 15169024, 'steps': 29626, 'batch_loss/train': 0.6235421304591} 12/26/2021 20:52:07 - INFO - codeparrot_training - Step 29627: {'lr': 0.0001794163159904528, 'samples': 15169536, 'steps': 29627, 'batch_loss/train': 0.7639632653445005} 12/26/2021 20:52:17 - INFO - codeparrot_training - Step 29628: {'lr': 0.00017940119430463321, 'samples': 15170048, 'steps': 29628, 'batch_loss/train': 0.7538903066888452} 12/26/2021 20:52:29 - INFO - codeparrot_training - Step 29629: {'lr': 0.00017938607289948795, 'samples': 15170560, 'steps': 29629, 'batch_loss/train': 0.7285287389531732} 12/26/2021 20:52:40 - INFO - codeparrot_training - Step 29630: {'lr': 0.00017937095177507695, 'samples': 15171072, 'steps': 29630, 'batch_loss/train': 0.7399794794619083} 12/26/2021 20:52:50 - INFO - codeparrot_training - Step 29631: {'lr': 0.0001793558309314603, 'samples': 15171584, 'steps': 29631, 'batch_loss/train': 0.7108875485137105} 12/26/2021 20:53:05 - INFO - codeparrot_training - Step 29632: {'lr': 0.0001793407103686983, 'samples': 15172096, 'steps': 29632, 'batch_loss/train': 0.753778182901442} 12/26/2021 20:53:15 - INFO - codeparrot_training - Step 29633: {'lr': 
0.00017932559008685097, 'samples': 15172608, 'steps': 29633, 'batch_loss/train': 1.3935723463073373} 12/26/2021 20:53:26 - INFO - codeparrot_training - Step 29634: {'lr': 0.00017931047008597838, 'samples': 15173120, 'steps': 29634, 'batch_loss/train': 0.7075900464551523} 12/26/2021 20:53:38 - INFO - codeparrot_training - Step 29635: {'lr': 0.00017929535036614058, 'samples': 15173632, 'steps': 29635, 'batch_loss/train': 0.7863331628032029} 12/26/2021 20:53:49 - INFO - codeparrot_training - Step 29636: {'lr': 0.00017928023092739787, 'samples': 15174144, 'steps': 29636, 'batch_loss/train': 0.761295082513243} 12/26/2021 20:54:00 - INFO - codeparrot_training - Step 29637: {'lr': 0.00017926511176981022, 'samples': 15174656, 'steps': 29637, 'batch_loss/train': 0.4355146289453842} 12/26/2021 20:54:10 - INFO - codeparrot_training - Step 29638: {'lr': 0.00017924999289343786, 'samples': 15175168, 'steps': 29638, 'batch_loss/train': 0.6843239014269784} 12/26/2021 20:54:24 - INFO - codeparrot_training - Step 29639: {'lr': 0.00017923487429834066, 'samples': 15175680, 'steps': 29639, 'batch_loss/train': 0.6382572720758617} 12/26/2021 20:54:35 - INFO - codeparrot_training - Step 29640: {'lr': 0.00017921975598457897, 'samples': 15176192, 'steps': 29640, 'batch_loss/train': 0.701817971188575} 12/26/2021 20:54:45 - INFO - codeparrot_training - Step 29641: {'lr': 0.0001792046379522128, 'samples': 15176704, 'steps': 29641, 'batch_loss/train': 0.6986011555418372} 12/26/2021 20:54:58 - INFO - codeparrot_training - Step 29642: {'lr': 0.00017918952020130218, 'samples': 15177216, 'steps': 29642, 'batch_loss/train': 0.8584015257656574} 12/26/2021 20:55:08 - INFO - codeparrot_training - Step 29643: {'lr': 0.00017917440273190744, 'samples': 15177728, 'steps': 29643, 'batch_loss/train': 0.708758034161292} 12/26/2021 20:55:19 - INFO - codeparrot_training - Step 29644: {'lr': 0.00017915928554408845, 'samples': 15178240, 'steps': 29644, 'batch_loss/train': 0.7056364852469414} 12/26/2021 20:55:31 - 
INFO - codeparrot_training - Step 29645: {'lr': 0.00017914416863790546, 'samples': 15178752, 'steps': 29645, 'batch_loss/train': 0.7648824052885175} 12/26/2021 20:55:42 - INFO - codeparrot_training - Step 29646: {'lr': 0.00017912905201341832, 'samples': 15179264, 'steps': 29646, 'batch_loss/train': 0.7848113200161606} 12/26/2021 20:55:52 - INFO - codeparrot_training - Step 29647: {'lr': 0.00017911393567068745, 'samples': 15179776, 'steps': 29647, 'batch_loss/train': 0.8631074137520045} 12/26/2021 20:56:03 - INFO - codeparrot_training - Step 29648: {'lr': 0.0001790988196097728, 'samples': 15180288, 'steps': 29648, 'batch_loss/train': 0.573206604924053} 12/26/2021 20:56:17 - INFO - codeparrot_training - Step 29649: {'lr': 0.00017908370383073452, 'samples': 15180800, 'steps': 29649, 'batch_loss/train': 0.7302935002371669} 12/26/2021 20:56:28 - INFO - codeparrot_training - Step 29650: {'lr': 0.00017906858833363254, 'samples': 15181312, 'steps': 29650, 'batch_loss/train': 0.7492532031610608} 12/26/2021 20:56:38 - INFO - codeparrot_training - Step 29651: {'lr': 0.0001790534731185271, 'samples': 15181824, 'steps': 29651, 'batch_loss/train': 0.580860091606155} 12/26/2021 20:56:50 - INFO - codeparrot_training - Step 29652: {'lr': 0.00017903835818547834, 'samples': 15182336, 'steps': 29652, 'batch_loss/train': 0.7545688347890973} 12/26/2021 20:57:01 - INFO - codeparrot_training - Step 29653: {'lr': 0.0001790232435345462, 'samples': 15182848, 'steps': 29653, 'batch_loss/train': 0.7479949868284166} 12/26/2021 20:57:12 - INFO - codeparrot_training - Step 29654: {'lr': 0.0001790081291657908, 'samples': 15183360, 'steps': 29654, 'batch_loss/train': 0.7146472637541592} 12/26/2021 20:57:24 - INFO - codeparrot_training - Step 29655: {'lr': 0.0001789930150792724, 'samples': 15183872, 'steps': 29655, 'batch_loss/train': 0.7849845038726926} 12/26/2021 20:57:34 - INFO - codeparrot_training - Step 29656: {'lr': 0.00017897790127505088, 'samples': 15184384, 'steps': 29656, 
'batch_loss/train': 0.7287924033589661} 12/26/2021 20:57:45 - INFO - codeparrot_training - Step 29657: {'lr': 0.00017896278775318637, 'samples': 15184896, 'steps': 29657, 'batch_loss/train': 0.778289781883359} 12/26/2021 20:57:57 - INFO - codeparrot_training - Step 29658: {'lr': 0.0001789476745137391, 'samples': 15185408, 'steps': 29658, 'batch_loss/train': 0.8169345194473863} 12/26/2021 20:58:08 - INFO - codeparrot_training - Step 29659: {'lr': 0.00017893256155676905, 'samples': 15185920, 'steps': 29659, 'batch_loss/train': 0.701937053585425} 12/26/2021 20:58:18 - INFO - codeparrot_training - Step 29660: {'lr': 0.00017891744888233623, 'samples': 15186432, 'steps': 29660, 'batch_loss/train': 0.6329903678852133} 12/26/2021 20:58:29 - INFO - codeparrot_training - Step 29661: {'lr': 0.00017890233649050076, 'samples': 15186944, 'steps': 29661, 'batch_loss/train': 0.7313764030113816} 12/26/2021 20:58:43 - INFO - codeparrot_training - Step 29662: {'lr': 0.00017888722438132282, 'samples': 15187456, 'steps': 29662, 'batch_loss/train': 1.0299854923505336} 12/26/2021 20:58:54 - INFO - codeparrot_training - Step 29663: {'lr': 0.0001788721125548624, 'samples': 15187968, 'steps': 29663, 'batch_loss/train': 0.7148416875279509} 12/26/2021 20:59:04 - INFO - codeparrot_training - Step 29664: {'lr': 0.00017885700101117968, 'samples': 15188480, 'steps': 29664, 'batch_loss/train': 0.7185075676534325} 12/26/2021 20:59:17 - INFO - codeparrot_training - Step 29665: {'lr': 0.00017884188975033453, 'samples': 15188992, 'steps': 29665, 'batch_loss/train': 0.7330107768066227} 12/26/2021 20:59:27 - INFO - codeparrot_training - Step 29666: {'lr': 0.0001788267787723872, 'samples': 15189504, 'steps': 29666, 'batch_loss/train': 0.7693793922662735} 12/26/2021 20:59:38 - INFO - codeparrot_training - Step 29667: {'lr': 0.00017881166807739778, 'samples': 15190016, 'steps': 29667, 'batch_loss/train': 0.7381952940486372} 12/26/2021 20:59:52 - INFO - codeparrot_training - Step 29668: {'lr': 
0.0001787965576654262, 'samples': 15190528, 'steps': 29668, 'batch_loss/train': 0.7376005128026009} 12/26/2021 21:00:02 - INFO - codeparrot_training - Step 29669: {'lr': 0.00017878144753653274, 'samples': 15191040, 'steps': 29669, 'batch_loss/train': 0.7467221412807703} 12/26/2021 21:00:13 - INFO - codeparrot_training - Step 29670: {'lr': 0.00017876633769077727, 'samples': 15191552, 'steps': 29670, 'batch_loss/train': 0.7991220252588391} 12/26/2021 21:00:24 - INFO - codeparrot_training - Step 29671: {'lr': 0.00017875122812822003, 'samples': 15192064, 'steps': 29671, 'batch_loss/train': 0.6939474060200155} 12/26/2021 21:00:36 - INFO - codeparrot_training - Step 29672: {'lr': 0.0001787361188489209, 'samples': 15192576, 'steps': 29672, 'batch_loss/train': 0.7130822315812111} 12/26/2021 21:00:46 - INFO - codeparrot_training - Step 29673: {'lr': 0.00017872100985294004, 'samples': 15193088, 'steps': 29673, 'batch_loss/train': 0.7551730945706367} 12/26/2021 21:00:57 - INFO - codeparrot_training - Step 29674: {'lr': 0.00017870590114033765, 'samples': 15193600, 'steps': 29674, 'batch_loss/train': 0.86601497605443} 12/26/2021 21:01:09 - INFO - codeparrot_training - Step 29675: {'lr': 0.00017869079271117357, 'samples': 15194112, 'steps': 29675, 'batch_loss/train': 0.7087095631286502} 12/26/2021 21:01:20 - INFO - codeparrot_training - Step 29676: {'lr': 0.00017867568456550798, 'samples': 15194624, 'steps': 29676, 'batch_loss/train': 0.5067882637958974} 12/26/2021 21:01:30 - INFO - codeparrot_training - Step 29677: {'lr': 0.00017866057670340102, 'samples': 15195136, 'steps': 29677, 'batch_loss/train': 0.689250466413796} 12/26/2021 21:01:44 - INFO - codeparrot_training - Step 29678: {'lr': 0.00017864546912491264, 'samples': 15195648, 'steps': 29678, 'batch_loss/train': 0.7704201778396964} 12/26/2021 21:01:55 - INFO - codeparrot_training - Step 29679: {'lr': 0.00017863036183010287, 'samples': 15196160, 'steps': 29679, 'batch_loss/train': 0.7245671267155558} 12/26/2021 21:02:05 - 
INFO - codeparrot_training - Step 29680: {'lr': 0.00017861525481903184, 'samples': 15196672, 'steps': 29680, 'batch_loss/train': 0.7383296331390738} 12/26/2021 21:02:16 - INFO - codeparrot_training - Step 29681: {'lr': 0.0001786001480917597, 'samples': 15197184, 'steps': 29681, 'batch_loss/train': 0.7163495135027915} 12/26/2021 21:02:28 - INFO - codeparrot_training - Step 29682: {'lr': 0.00017858504164834635, 'samples': 15197696, 'steps': 29682, 'batch_loss/train': 0.7962837032973766} 12/26/2021 21:02:39 - INFO - codeparrot_training - Step 29683: {'lr': 0.00017856993548885186, 'samples': 15198208, 'steps': 29683, 'batch_loss/train': 0.6359259015880525} 12/26/2021 21:02:49 - INFO - codeparrot_training - Step 29684: {'lr': 0.0001785548296133364, 'samples': 15198720, 'steps': 29684, 'batch_loss/train': 0.9397498592734337} 12/26/2021 21:03:04 - INFO - codeparrot_training - Step 29685: {'lr': 0.00017853972402185998, 'samples': 15199232, 'steps': 29685, 'batch_loss/train': 0.7654268213082105} 12/26/2021 21:03:15 - INFO - codeparrot_training - Step 29686: {'lr': 0.00017852461871448262, 'samples': 15199744, 'steps': 29686, 'batch_loss/train': 0.6986118946224451} 12/26/2021 21:03:26 - INFO - codeparrot_training - Step 29687: {'lr': 0.0001785095136912643, 'samples': 15200256, 'steps': 29687, 'batch_loss/train': 1.1710496442392468} 12/26/2021 21:03:36 - INFO - codeparrot_training - Step 29688: {'lr': 0.00017849440895226527, 'samples': 15200768, 'steps': 29688, 'batch_loss/train': 0.9345581203233451} 12/26/2021 21:03:48 - INFO - codeparrot_training - Step 29689: {'lr': 0.00017847930449754542, 'samples': 15201280, 'steps': 29689, 'batch_loss/train': 0.7292365282773972} 12/26/2021 21:03:59 - INFO - codeparrot_training - Step 29690: {'lr': 0.00017846420032716494, 'samples': 15201792, 'steps': 29690, 'batch_loss/train': 0.7823678432032466} 12/26/2021 21:04:09 - INFO - codeparrot_training - Step 29691: {'lr': 0.0001784490964411836, 'samples': 15202304, 'steps': 29691, 
'batch_loss/train': 0.9048747168853879} 12/26/2021 21:04:21 - INFO - codeparrot_training - Step 29692: {'lr': 0.00017843399283966178, 'samples': 15202816, 'steps': 29692, 'batch_loss/train': 0.6254646209999919} 12/26/2021 21:04:32 - INFO - codeparrot_training - Step 29693: {'lr': 0.0001784188895226594, 'samples': 15203328, 'steps': 29693, 'batch_loss/train': 0.6731319250538945} 12/26/2021 21:04:43 - INFO - codeparrot_training - Step 29694: {'lr': 0.00017840378649023637, 'samples': 15203840, 'steps': 29694, 'batch_loss/train': 0.6890071713132784} 12/26/2021 21:04:55 - INFO - codeparrot_training - Step 29695: {'lr': 0.00017838868374245293, 'samples': 15204352, 'steps': 29695, 'batch_loss/train': 0.6373655169736594} 12/26/2021 21:05:06 - INFO - codeparrot_training - Step 29696: {'lr': 0.00017837358127936905, 'samples': 15204864, 'steps': 29696, 'batch_loss/train': 0.766672651283443} 12/26/2021 21:05:16 - INFO - codeparrot_training - Step 29697: {'lr': 0.00017835847910104474, 'samples': 15205376, 'steps': 29697, 'batch_loss/train': 0.724757494404912} 12/26/2021 21:05:30 - INFO - codeparrot_training - Step 29698: {'lr': 0.00017834337720754005, 'samples': 15205888, 'steps': 29698, 'batch_loss/train': 1.836168650072068} 12/26/2021 21:05:41 - INFO - codeparrot_training - Step 29699: {'lr': 0.00017832827559891503, 'samples': 15206400, 'steps': 29699, 'batch_loss/train': 0.7940390715375543} 12/26/2021 21:05:51 - INFO - codeparrot_training - Step 29700: {'lr': 0.0001783131742752298, 'samples': 15206912, 'steps': 29700, 'batch_loss/train': 0.7666420452296734} 12/26/2021 21:06:02 - INFO - codeparrot_training - Step 29701: {'lr': 0.00017829807323654426, 'samples': 15207424, 'steps': 29701, 'batch_loss/train': 0.7974228262901306} 12/26/2021 21:06:14 - INFO - codeparrot_training - Step 29702: {'lr': 0.00017828297248291848, 'samples': 15207936, 'steps': 29702, 'batch_loss/train': 0.7130425381474197} 12/26/2021 21:06:24 - INFO - codeparrot_training - Step 29703: {'lr': 
0.00017826787201441258, 'samples': 15208448, 'steps': 29703, 'batch_loss/train': 0.7641797817777842} 12/26/2021 21:06:35 - INFO - codeparrot_training - Step 29704: {'lr': 0.00017825277183108657, 'samples': 15208960, 'steps': 29704, 'batch_loss/train': 0.6184000663924962} 12/26/2021 21:06:48 - INFO - codeparrot_training - Step 29705: {'lr': 0.00017823767193300032, 'samples': 15209472, 'steps': 29705, 'batch_loss/train': 0.7248103357851505} 12/26/2021 21:06:59 - INFO - codeparrot_training - Step 29706: {'lr': 0.00017822257232021408, 'samples': 15209984, 'steps': 29706, 'batch_loss/train': 0.9362311852164567} 12/26/2021 21:07:09 - INFO - codeparrot_training - Step 29707: {'lr': 0.0001782074729927878, 'samples': 15210496, 'steps': 29707, 'batch_loss/train': 0.9197434047237039} 12/26/2021 21:07:20 - INFO - codeparrot_training - Step 29708: {'lr': 0.00017819237395078148, 'samples': 15211008, 'steps': 29708, 'batch_loss/train': 0.6724425796419382} 12/26/2021 21:07:34 - INFO - codeparrot_training - Step 29709: {'lr': 0.0001781772751942551, 'samples': 15211520, 'steps': 29709, 'batch_loss/train': 0.647759853862226} 12/26/2021 21:07:45 - INFO - codeparrot_training - Step 29710: {'lr': 0.00017816217672326888, 'samples': 15212032, 'steps': 29710, 'batch_loss/train': 0.8446121293818578} 12/26/2021 21:07:55 - INFO - codeparrot_training - Step 29711: {'lr': 0.00017814707853788263, 'samples': 15212544, 'steps': 29711, 'batch_loss/train': 0.7221333491615951} 12/26/2021 21:08:07 - INFO - codeparrot_training - Step 29712: {'lr': 0.00017813198063815655, 'samples': 15213056, 'steps': 29712, 'batch_loss/train': 0.6587330442853272} 12/26/2021 21:08:18 - INFO - codeparrot_training - Step 29713: {'lr': 0.00017811688302415046, 'samples': 15213568, 'steps': 29713, 'batch_loss/train': 0.7822092124260962} 12/26/2021 21:08:29 - INFO - codeparrot_training - Step 29714: {'lr': 0.00017810178569592462, 'samples': 15214080, 'steps': 29714, 'batch_loss/train': 0.754624355584383} 12/26/2021 21:08:42 - 
INFO - codeparrot_training - Step 29715: {'lr': 0.00017808668865353883, 'samples': 15214592, 'steps': 29715, 'batch_loss/train': 0.7035689102485776} 12/26/2021 21:08:53 - INFO - codeparrot_training - Step 29716: {'lr': 0.0001780715918970532, 'samples': 15215104, 'steps': 29716, 'batch_loss/train': 0.6992707264143974} 12/26/2021 21:09:04 - INFO - codeparrot_training - Step 29717: {'lr': 0.00017805649542652788, 'samples': 15215616, 'steps': 29717, 'batch_loss/train': 0.822845202870667} 12/26/2021 21:09:16 - INFO - codeparrot_training - Step 29718: {'lr': 0.0001780413992420227, 'samples': 15216128, 'steps': 29718, 'batch_loss/train': 0.7714179074391723} 12/26/2021 21:09:26 - INFO - codeparrot_training - Step 29719: {'lr': 0.00017802630334359782, 'samples': 15216640, 'steps': 29719, 'batch_loss/train': 0.8180420557036996} 12/26/2021 21:09:37 - INFO - codeparrot_training - Step 29720: {'lr': 0.00017801120773131303, 'samples': 15217152, 'steps': 29720, 'batch_loss/train': 1.1545129073783755} 12/26/2021 21:09:48 - INFO - codeparrot_training - Step 29721: {'lr': 0.00017799611240522858, 'samples': 15217664, 'steps': 29721, 'batch_loss/train': 0.7370078752283007} 12/26/2021 21:10:00 - INFO - codeparrot_training - Step 29722: {'lr': 0.00017798101736540446, 'samples': 15218176, 'steps': 29722, 'batch_loss/train': 0.7013291604816914} 12/26/2021 21:10:10 - INFO - codeparrot_training - Step 29723: {'lr': 0.0001779659226119006, 'samples': 15218688, 'steps': 29723, 'batch_loss/train': 0.8354972628876567} 12/26/2021 21:10:21 - INFO - codeparrot_training - Step 29724: {'lr': 0.00017795082814477692, 'samples': 15219200, 'steps': 29724, 'batch_loss/train': 0.8392063092906028} 12/26/2021 21:10:35 - INFO - codeparrot_training - Step 29725: {'lr': 0.00017793573396409363, 'samples': 15219712, 'steps': 29725, 'batch_loss/train': 1.0087231392390095} 12/26/2021 21:10:46 - INFO - codeparrot_training - Step 29726: {'lr': 0.00017792064006991068, 'samples': 15220224, 'steps': 29726, 
'batch_loss/train': 0.5467325369827449} 12/26/2021 21:10:56 - INFO - codeparrot_training - Step 29727: {'lr': 0.000177905546462288, 'samples': 15220736, 'steps': 29727, 'batch_loss/train': 1.0930876466445625} 12/26/2021 21:11:08 - INFO - codeparrot_training - Step 29728: {'lr': 0.0001778904531412856, 'samples': 15221248, 'steps': 29728, 'batch_loss/train': 0.7145019532181323} 12/26/2021 21:11:19 - INFO - codeparrot_training - Step 29729: {'lr': 0.00017787536010696364, 'samples': 15221760, 'steps': 29729, 'batch_loss/train': 0.6379437855212018} 12/26/2021 21:11:30 - INFO - codeparrot_training - Step 29730: {'lr': 0.00017786026735938198, 'samples': 15222272, 'steps': 29730, 'batch_loss/train': 0.7128448490984738} 12/26/2021 21:11:40 - INFO - codeparrot_training - Step 29731: {'lr': 0.00017784517489860058, 'samples': 15222784, 'steps': 29731, 'batch_loss/train': 0.8043301086872816} 12/26/2021 21:11:52 - INFO - codeparrot_training - Step 29732: {'lr': 0.0001778300827246796, 'samples': 15223296, 'steps': 29732, 'batch_loss/train': 0.7735797865316272} 12/26/2021 21:12:03 - INFO - codeparrot_training - Step 29733: {'lr': 0.00017781499083767896, 'samples': 15223808, 'steps': 29733, 'batch_loss/train': 0.7445667330175638} 12/26/2021 21:12:13 - INFO - codeparrot_training - Step 29734: {'lr': 0.00017779989923765867, 'samples': 15224320, 'steps': 29734, 'batch_loss/train': 0.6594809321686625} 12/26/2021 21:12:26 - INFO - codeparrot_training - Step 29735: {'lr': 0.0001777848079246786, 'samples': 15224832, 'steps': 29735, 'batch_loss/train': 0.6562718388158828} 12/26/2021 21:12:36 - INFO - codeparrot_training - Step 29736: {'lr': 0.00017776971689879902, 'samples': 15225344, 'steps': 29736, 'batch_loss/train': 0.7203908623196185} 12/26/2021 21:12:47 - INFO - codeparrot_training - Step 29737: {'lr': 0.00017775462616007967, 'samples': 15225856, 'steps': 29737, 'batch_loss/train': 0.7575616184622049} 12/26/2021 21:13:01 - INFO - codeparrot_training - Step 29738: {'lr': 
0.00017773953570858074, 'samples': 15226368, 'steps': 29738, 'batch_loss/train': 0.7687734123319387} 12/26/2021 21:13:11 - INFO - codeparrot_training - Step 29739: {'lr': 0.00017772444554436202, 'samples': 15226880, 'steps': 29739, 'batch_loss/train': 0.6847659447230399} 12/26/2021 21:13:22 - INFO - codeparrot_training - Step 29740: {'lr': 0.0001777093556674837, 'samples': 15227392, 'steps': 29740, 'batch_loss/train': 0.717803961946629} 12/26/2021 21:13:33 - INFO - codeparrot_training - Step 29741: {'lr': 0.00017769426607800565, 'samples': 15227904, 'steps': 29741, 'batch_loss/train': 0.7359814870869741} 12/26/2021 21:13:45 - INFO - codeparrot_training - Step 29742: {'lr': 0.00017767917677598787, 'samples': 15228416, 'steps': 29742, 'batch_loss/train': 0.765425420133397} 12/26/2021 21:13:55 - INFO - codeparrot_training - Step 29743: {'lr': 0.00017766408776149042, 'samples': 15228928, 'steps': 29743, 'batch_loss/train': 0.7202131934463978} 12/26/2021 21:14:06 - INFO - codeparrot_training - Step 29744: {'lr': 0.00017764899903457323, 'samples': 15229440, 'steps': 29744, 'batch_loss/train': 0.8419054283294827} 12/26/2021 21:14:18 - INFO - codeparrot_training - Step 29745: {'lr': 0.0001776339105952964, 'samples': 15229952, 'steps': 29745, 'batch_loss/train': 0.7406213735230267} 12/26/2021 21:14:28 - INFO - codeparrot_training - Step 29746: {'lr': 0.00017761882244371962, 'samples': 15230464, 'steps': 29746, 'batch_loss/train': 0.7965186933288351} 12/26/2021 21:14:39 - INFO - codeparrot_training - Step 29747: {'lr': 0.00017760373457990317, 'samples': 15230976, 'steps': 29747, 'batch_loss/train': 0.8059532735496759} 12/26/2021 21:14:53 - INFO - codeparrot_training - Step 29748: {'lr': 0.00017758864700390697, 'samples': 15231488, 'steps': 29748, 'batch_loss/train': 0.7907090443186462} 12/26/2021 21:15:03 - INFO - codeparrot_training - Step 29749: {'lr': 0.00017757355971579098, 'samples': 15232000, 'steps': 29749, 'batch_loss/train': 0.7048477688804269} 12/26/2021 21:15:14 - 
INFO - codeparrot_training - Step 29750: {'lr': 0.0001775584727156151, 'samples': 15232512, 'steps': 29750, 'batch_loss/train': 0.7450990159995854} 12/26/2021 21:15:26 - INFO - codeparrot_training - Step 29751: {'lr': 0.0001775433860034394, 'samples': 15233024, 'steps': 29751, 'batch_loss/train': 0.767734594643116} 12/26/2021 21:15:37 - INFO - codeparrot_training - Step 29752: {'lr': 0.00017752829957932387, 'samples': 15233536, 'steps': 29752, 'batch_loss/train': 1.497092628851533} 12/26/2021 21:15:47 - INFO - codeparrot_training - Step 29753: {'lr': 0.00017751321344332838, 'samples': 15234048, 'steps': 29753, 'batch_loss/train': 0.7268079044297338} 12/26/2021 21:15:58 - INFO - codeparrot_training - Step 29754: {'lr': 0.000177498127595513, 'samples': 15234560, 'steps': 29754, 'batch_loss/train': 0.7672124113887548} 12/26/2021 21:16:12 - INFO - codeparrot_training - Step 29755: {'lr': 0.00017748304203593778, 'samples': 15235072, 'steps': 29755, 'batch_loss/train': 0.7670661862939596} 12/26/2021 21:16:23 - INFO - codeparrot_training - Step 29756: {'lr': 0.00017746795676466253, 'samples': 15235584, 'steps': 29756, 'batch_loss/train': 0.7181652672588825} 12/26/2021 21:16:33 - INFO - codeparrot_training - Step 29757: {'lr': 0.00017745287178174728, 'samples': 15236096, 'steps': 29757, 'batch_loss/train': 0.7047363338060677} 12/26/2021 21:16:45 - INFO - codeparrot_training - Step 29758: {'lr': 0.00017743778708725208, 'samples': 15236608, 'steps': 29758, 'batch_loss/train': 0.841586678288877} 12/26/2021 21:16:56 - INFO - codeparrot_training - Step 29759: {'lr': 0.00017742270268123687, 'samples': 15237120, 'steps': 29759, 'batch_loss/train': 0.7608906719833612} 12/26/2021 21:17:07 - INFO - codeparrot_training - Step 29760: {'lr': 0.00017740761856376152, 'samples': 15237632, 'steps': 29760, 'batch_loss/train': 0.8075438421219587} 12/26/2021 21:17:19 - INFO - codeparrot_training - Step 29761: {'lr': 0.00017739253473488605, 'samples': 15238144, 'steps': 29761, 
'batch_loss/train': 0.6667342744767666} 12/26/2021 21:17:29 - INFO - codeparrot_training - Step 29762: {'lr': 0.00017737745119467049, 'samples': 15238656, 'steps': 29762, 'batch_loss/train': 0.7216117060743272} 12/26/2021 21:17:40 - INFO - codeparrot_training - Step 29763: {'lr': 0.00017736236794317474, 'samples': 15239168, 'steps': 29763, 'batch_loss/train': 0.6196448701957706} 12/26/2021 21:17:51 - INFO - codeparrot_training - Step 29764: {'lr': 0.00017734728498045886, 'samples': 15239680, 'steps': 29764, 'batch_loss/train': 0.661348304245621} 12/26/2021 21:18:05 - INFO - codeparrot_training - Step 29765: {'lr': 0.0001773322023065826, 'samples': 15240192, 'steps': 29765, 'batch_loss/train': 0.8506433255970478} 12/26/2021 21:18:15 - INFO - codeparrot_training - Step 29766: {'lr': 0.00017731711992160614, 'samples': 15240704, 'steps': 29766, 'batch_loss/train': 0.7465312890708447} 12/26/2021 21:18:26 - INFO - codeparrot_training - Step 29767: {'lr': 0.00017730203782558943, 'samples': 15241216, 'steps': 29767, 'batch_loss/train': 0.5415058767539449} 12/26/2021 21:18:38 - INFO - codeparrot_training - Step 29768: {'lr': 0.00017728695601859225, 'samples': 15241728, 'steps': 29768, 'batch_loss/train': 0.6421274122549221} 12/26/2021 21:18:49 - INFO - codeparrot_training - Step 29769: {'lr': 0.0001772718745006748, 'samples': 15242240, 'steps': 29769, 'batch_loss/train': 0.908165967091918} 12/26/2021 21:18:59 - INFO - codeparrot_training - Step 29770: {'lr': 0.00017725679327189683, 'samples': 15242752, 'steps': 29770, 'batch_loss/train': 0.6793364896439016} 12/26/2021 21:19:11 - INFO - codeparrot_training - Step 29771: {'lr': 0.00017724171233231845, 'samples': 15243264, 'steps': 29771, 'batch_loss/train': 0.6477412055246532} 12/26/2021 21:19:22 - INFO - codeparrot_training - Step 29772: {'lr': 0.00017722663168199946, 'samples': 15243776, 'steps': 29772, 'batch_loss/train': 0.7519195666536689} 12/26/2021 21:19:33 - INFO - codeparrot_training - Step 29773: {'lr': 
0.00017721155132099996, 'samples': 15244288, 'steps': 29773, 'batch_loss/train': 0.6490724398754537} 12/26/2021 21:19:45 - INFO - codeparrot_training - Step 29774: {'lr': 0.00017719647124937987, 'samples': 15244800, 'steps': 29774, 'batch_loss/train': 0.717858221149072} 12/26/2021 21:19:55 - INFO - codeparrot_training - Step 29775: {'lr': 0.00017718139146719915, 'samples': 15245312, 'steps': 29775, 'batch_loss/train': 0.5901306597806979} 12/26/2021 21:20:06 - INFO - codeparrot_training - Step 29776: {'lr': 0.0001771663119745176, 'samples': 15245824, 'steps': 29776, 'batch_loss/train': 0.8338520433753729} 12/26/2021 21:20:17 - INFO - codeparrot_training - Step 29777: {'lr': 0.00017715123277139532, 'samples': 15246336, 'steps': 29777, 'batch_loss/train': 0.6693906339351088} 12/26/2021 21:20:30 - INFO - codeparrot_training - Step 29778: {'lr': 0.00017713615385789235, 'samples': 15246848, 'steps': 29778, 'batch_loss/train': 0.7747744023799896} 12/26/2021 21:20:41 - INFO - codeparrot_training - Step 29779: {'lr': 0.00017712107523406835, 'samples': 15247360, 'steps': 29779, 'batch_loss/train': 0.670726303011179} 12/26/2021 21:20:52 - INFO - codeparrot_training - Step 29780: {'lr': 0.00017710599689998354, 'samples': 15247872, 'steps': 29780, 'batch_loss/train': 0.7073358246125281} 12/26/2021 21:21:04 - INFO - codeparrot_training - Step 29781: {'lr': 0.00017709091885569777, 'samples': 15248384, 'steps': 29781, 'batch_loss/train': 0.8879480259492993} 12/26/2021 21:21:14 - INFO - codeparrot_training - Step 29782: {'lr': 0.00017707584110127094, 'samples': 15248896, 'steps': 29782, 'batch_loss/train': 0.6912606945261359} 12/26/2021 21:21:25 - INFO - codeparrot_training - Step 29783: {'lr': 0.000177060763636763, 'samples': 15249408, 'steps': 29783, 'batch_loss/train': 0.8109342055395246} 12/26/2021 21:21:39 - INFO - codeparrot_training - Step 29784: {'lr': 0.000177045686462234, 'samples': 15249920, 'steps': 29784, 'batch_loss/train': 0.7142260905820876} 12/26/2021 21:21:50 - 
INFO - codeparrot_training - Step 29785: {'lr': 0.00017703060957774384, 'samples': 15250432, 'steps': 29785, 'batch_loss/train': 0.7161621139384806} 12/26/2021 21:22:00 - INFO - codeparrot_training - Step 29786: {'lr': 0.00017701553298335232, 'samples': 15250944, 'steps': 29786, 'batch_loss/train': 0.7578260721638799} 12/26/2021 21:22:11 - INFO - codeparrot_training - Step 29787: {'lr': 0.00017700045667911948, 'samples': 15251456, 'steps': 29787, 'batch_loss/train': 0.6908773821778595} 12/26/2021 21:22:23 - INFO - codeparrot_training - Step 29788: {'lr': 0.00017698538066510534, 'samples': 15251968, 'steps': 29788, 'batch_loss/train': 0.6509663172182627} 12/26/2021 21:22:34 - INFO - codeparrot_training - Step 29789: {'lr': 0.00017697030494136974, 'samples': 15252480, 'steps': 29789, 'batch_loss/train': 0.7308741137385368} 12/26/2021 21:22:44 - INFO - codeparrot_training - Step 29790: {'lr': 0.0001769552295079725, 'samples': 15252992, 'steps': 29790, 'batch_loss/train': 0.6962286802008748} 12/26/2021 21:22:56 - INFO - codeparrot_training - Step 29791: {'lr': 0.00017694015436497384, 'samples': 15253504, 'steps': 29791, 'batch_loss/train': 0.8124657655134797} 12/26/2021 21:23:07 - INFO - codeparrot_training - Step 29792: {'lr': 0.00017692507951243353, 'samples': 15254016, 'steps': 29792, 'batch_loss/train': 0.6933006341569126} 12/26/2021 21:23:18 - INFO - codeparrot_training - Step 29793: {'lr': 0.0001769100049504115, 'samples': 15254528, 'steps': 29793, 'batch_loss/train': 0.6943378762807697} 12/26/2021 21:23:32 - INFO - codeparrot_training - Step 29794: {'lr': 0.00017689493067896763, 'samples': 15255040, 'steps': 29794, 'batch_loss/train': 0.7321571917855181} 12/26/2021 21:23:42 - INFO - codeparrot_training - Step 29795: {'lr': 0.00017687985669816197, 'samples': 15255552, 'steps': 29795, 'batch_loss/train': 0.6485957640688866} 12/26/2021 21:23:53 - INFO - codeparrot_training - Step 29796: {'lr': 0.00017686478300805438, 'samples': 15256064, 'steps': 29796, 
'batch_loss/train': 0.77204604819417} 12/26/2021 21:24:05 - INFO - codeparrot_training - Step 29797: {'lr': 0.00017684970960870483, 'samples': 15256576, 'steps': 29797, 'batch_loss/train': 0.7205553418025374} 12/26/2021 21:24:16 - INFO - codeparrot_training - Step 29798: {'lr': 0.0001768346365001731, 'samples': 15257088, 'steps': 29798, 'batch_loss/train': 0.7515976997092366} 12/26/2021 21:24:26 - INFO - codeparrot_training - Step 29799: {'lr': 0.00017681956368251932, 'samples': 15257600, 'steps': 29799, 'batch_loss/train': 0.7840972621925175} 12/26/2021 21:24:37 - INFO - codeparrot_training - Step 29800: {'lr': 0.0001768044911558034, 'samples': 15258112, 'steps': 29800, 'batch_loss/train': 0.7755493959411979} 12/26/2021 21:24:51 - INFO - codeparrot_training - Step 29801: {'lr': 0.00017678941892008504, 'samples': 15258624, 'steps': 29801, 'batch_loss/train': 0.7939862255007029} 12/26/2021 21:25:01 - INFO - codeparrot_training - Step 29802: {'lr': 0.00017677434697542432, 'samples': 15259136, 'steps': 29802, 'batch_loss/train': 0.7384286595042795} 12/26/2021 21:25:12 - INFO - codeparrot_training - Step 29803: {'lr': 0.00017675927532188125, 'samples': 15259648, 'steps': 29803, 'batch_loss/train': 0.7758098868653178} 12/26/2021 21:25:24 - INFO - codeparrot_training - Step 29804: {'lr': 0.00017674420395951562, 'samples': 15260160, 'steps': 29804, 'batch_loss/train': 0.7142388643696904} 12/26/2021 21:25:35 - INFO - codeparrot_training - Step 29805: {'lr': 0.00017672913288838726, 'samples': 15260672, 'steps': 29805, 'batch_loss/train': 0.8118705498054624} 12/26/2021 21:25:45 - INFO - codeparrot_training - Step 29806: {'lr': 0.0001767140621085563, 'samples': 15261184, 'steps': 29806, 'batch_loss/train': 0.6469806218519807} 12/26/2021 21:25:57 - INFO - codeparrot_training - Step 29807: {'lr': 0.00017669899162008258, 'samples': 15261696, 'steps': 29807, 'batch_loss/train': 0.6903760358691216} 12/26/2021 21:26:08 - INFO - codeparrot_training - Step 29808: {'lr': 
0.00017668392142302592, 'samples': 15262208, 'steps': 29808, 'batch_loss/train': 0.7140622437000275} 12/26/2021 21:26:19 - INFO - codeparrot_training - Step 29809: {'lr': 0.00017666885151744626, 'samples': 15262720, 'steps': 29809, 'batch_loss/train': 0.7086870260536671} 12/26/2021 21:26:29 - INFO - codeparrot_training - Step 29810: {'lr': 0.0001766537819034037, 'samples': 15263232, 'steps': 29810, 'batch_loss/train': 0.7812087014317513} 12/26/2021 21:26:41 - INFO - codeparrot_training - Step 29811: {'lr': 0.00017663871258095793, 'samples': 15263744, 'steps': 29811, 'batch_loss/train': 0.7920146733522415} 12/26/2021 21:26:52 - INFO - codeparrot_training - Step 29812: {'lr': 0.000176623643550169, 'samples': 15264256, 'steps': 29812, 'batch_loss/train': 0.7499558385461569} 12/26/2021 21:27:03 - INFO - codeparrot_training - Step 29813: {'lr': 0.00017660857481109664, 'samples': 15264768, 'steps': 29813, 'batch_loss/train': 0.8641089433804154} 12/26/2021 21:27:16 - INFO - codeparrot_training - Step 29814: {'lr': 0.000176593506363801, 'samples': 15265280, 'steps': 29814, 'batch_loss/train': 0.717272749170661} 12/26/2021 21:27:27 - INFO - codeparrot_training - Step 29815: {'lr': 0.0001765784382083418, 'samples': 15265792, 'steps': 29815, 'batch_loss/train': 0.6271995665156282} 12/26/2021 21:27:38 - INFO - codeparrot_training - Step 29816: {'lr': 0.00017656337034477896, 'samples': 15266304, 'steps': 29816, 'batch_loss/train': 0.545910691464087} 12/26/2021 21:27:50 - INFO - codeparrot_training - Step 29817: {'lr': 0.00017654830277317253, 'samples': 15266816, 'steps': 29817, 'batch_loss/train': 0.6521773245476652} 12/26/2021 21:28:00 - INFO - codeparrot_training - Step 29818: {'lr': 0.00017653323549358225, 'samples': 15267328, 'steps': 29818, 'batch_loss/train': 0.7856312040239573} 12/26/2021 21:28:11 - INFO - codeparrot_training - Step 29819: {'lr': 0.00017651816850606814, 'samples': 15267840, 'steps': 29819, 'batch_loss/train': 0.5438176231109537} 12/26/2021 21:28:22 - 
INFO - codeparrot_training - Step 29820: {'lr': 0.00017650310181068995, 'samples': 15268352, 'steps': 29820, 'batch_loss/train': 0.6613616212271154} 12/26/2021 21:28:34 - INFO - codeparrot_training - Step 29821: {'lr': 0.0001764880354075078, 'samples': 15268864, 'steps': 29821, 'batch_loss/train': 0.792014092206955} 12/26/2021 21:28:44 - INFO - codeparrot_training - Step 29822: {'lr': 0.0001764729692965814, 'samples': 15269376, 'steps': 29822, 'batch_loss/train': 0.7469424307346344} 12/26/2021 21:28:55 - INFO - codeparrot_training - Step 29823: {'lr': 0.00017645790347797075, 'samples': 15269888, 'steps': 29823, 'batch_loss/train': 0.6481342851184309} 12/26/2021 21:29:09 - INFO - codeparrot_training - Step 29824: {'lr': 0.0001764428379517356, 'samples': 15270400, 'steps': 29824, 'batch_loss/train': 0.7187524428591132} 12/26/2021 21:29:19 - INFO - codeparrot_training - Step 29825: {'lr': 0.000176427772717936, 'samples': 15270912, 'steps': 29825, 'batch_loss/train': 0.7392551805824041} 12/26/2021 21:29:30 - INFO - codeparrot_training - Step 29826: {'lr': 0.00017641270777663188, 'samples': 15271424, 'steps': 29826, 'batch_loss/train': 0.7651397986337543} 12/26/2021 21:29:43 - INFO - codeparrot_training - Step 29827: {'lr': 0.0001763976431278829, 'samples': 15271936, 'steps': 29827, 'batch_loss/train': 0.7482382113230415} 12/26/2021 21:29:53 - INFO - codeparrot_training - Step 29828: {'lr': 0.0001763825787717492, 'samples': 15272448, 'steps': 29828, 'batch_loss/train': 0.727389445528388} 12/26/2021 21:30:04 - INFO - codeparrot_training - Step 29829: {'lr': 0.0001763675147082906, 'samples': 15272960, 'steps': 29829, 'batch_loss/train': 0.7932992819696665} 12/26/2021 21:30:16 - INFO - codeparrot_training - Step 29830: {'lr': 0.00017635245093756694, 'samples': 15273472, 'steps': 29830, 'batch_loss/train': 0.7442169757559896} 12/26/2021 21:30:27 - INFO - codeparrot_training - Step 29831: {'lr': 0.00017633738745963804, 'samples': 15273984, 'steps': 29831, 'batch_loss/train': 
0.6056583978352137} 12/26/2021 21:30:37 - INFO - codeparrot_training - Step 29832: {'lr': 0.0001763223242745639, 'samples': 15274496, 'steps': 29832, 'batch_loss/train': 0.7146063231630251} 12/26/2021 21:30:48 - INFO - codeparrot_training - Step 29833: {'lr': 0.0001763072613824045, 'samples': 15275008, 'steps': 29833, 'batch_loss/train': 0.7634139342117123} 12/26/2021 21:31:02 - INFO - codeparrot_training - Step 29834: {'lr': 0.0001762921987832195, 'samples': 15275520, 'steps': 29834, 'batch_loss/train': 0.7500520991161466} 12/26/2021 21:31:13 - INFO - codeparrot_training - Step 29835: {'lr': 0.00017627713647706884, 'samples': 15276032, 'steps': 29835, 'batch_loss/train': 0.5898754580994137} 12/26/2021 21:31:24 - INFO - codeparrot_training - Step 29836: {'lr': 0.00017626207446401254, 'samples': 15276544, 'steps': 29836, 'batch_loss/train': 0.7751809675246477} 12/26/2021 21:31:36 - INFO - codeparrot_training - Step 29837: {'lr': 0.00017624701274411038, 'samples': 15277056, 'steps': 29837, 'batch_loss/train': 0.7746949347201735} 12/26/2021 21:31:46 - INFO - codeparrot_training - Step 29838: {'lr': 0.00017623195131742224, 'samples': 15277568, 'steps': 29838, 'batch_loss/train': 0.7797921928577125} 12/26/2021 21:31:57 - INFO - codeparrot_training - Step 29839: {'lr': 0.00017621689018400795, 'samples': 15278080, 'steps': 29839, 'batch_loss/train': 0.6334669766947627} 12/26/2021 21:32:11 - INFO - codeparrot_training - Step 29840: {'lr': 0.00017620182934392753, 'samples': 15278592, 'steps': 29840, 'batch_loss/train': 0.7396857347339392} 12/26/2021 21:32:21 - INFO - codeparrot_training - Step 29841: {'lr': 0.0001761867687972407, 'samples': 15279104, 'steps': 29841, 'batch_loss/train': 0.7625320474617183} 12/26/2021 21:32:32 - INFO - codeparrot_training - Step 29842: {'lr': 0.00017617170854400743, 'samples': 15279616, 'steps': 29842, 'batch_loss/train': 0.8146515507251024} 12/26/2021 21:32:43 - INFO - codeparrot_training - Step 29843: {'lr': 0.00017615664858428758, 
'samples': 15280128, 'steps': 29843, 'batch_loss/train': 0.7647698260843754} 12/26/2021 21:32:55 - INFO - codeparrot_training - Step 29844: {'lr': 0.00017614158891814103, 'samples': 15280640, 'steps': 29844, 'batch_loss/train': 0.6433966923505068} 12/26/2021 21:33:05 - INFO - codeparrot_training - Step 29845: {'lr': 0.0001761265295456277, 'samples': 15281152, 'steps': 29845, 'batch_loss/train': 0.7238161331042647} 12/26/2021 21:33:16 - INFO - codeparrot_training - Step 29846: {'lr': 0.00017611147046680723, 'samples': 15281664, 'steps': 29846, 'batch_loss/train': 0.6239673933014274} 12/26/2021 21:33:28 - INFO - codeparrot_training - Step 29847: {'lr': 0.00017609641168173974, 'samples': 15282176, 'steps': 29847, 'batch_loss/train': 0.5686751308385283} 12/26/2021 21:33:39 - INFO - codeparrot_training - Step 29848: {'lr': 0.00017608135319048505, 'samples': 15282688, 'steps': 29848, 'batch_loss/train': 0.7340925596654415} 12/26/2021 21:33:49 - INFO - codeparrot_training - Step 29849: {'lr': 0.000176066294993103, 'samples': 15283200, 'steps': 29849, 'batch_loss/train': 0.9923207573592663} 12/26/2021 21:34:01 - INFO - codeparrot_training - Step 29850: {'lr': 0.00017605123708965332, 'samples': 15283712, 'steps': 29850, 'batch_loss/train': 0.7833455274812877} 12/26/2021 21:34:12 - INFO - codeparrot_training - Step 29851: {'lr': 0.00017603617948019608, 'samples': 15284224, 'steps': 29851, 'batch_loss/train': 0.7291613980196416} 12/26/2021 21:34:23 - INFO - codeparrot_training - Step 29852: {'lr': 0.00017602112216479111, 'samples': 15284736, 'steps': 29852, 'batch_loss/train': 0.701761634554714} 12/26/2021 21:34:37 - INFO - codeparrot_training - Step 29853: {'lr': 0.0001760060651434981, 'samples': 15285248, 'steps': 29853, 'batch_loss/train': 0.7850781306624413} 12/26/2021 21:34:47 - INFO - codeparrot_training - Step 29854: {'lr': 0.0001759910084163771, 'samples': 15285760, 'steps': 29854, 'batch_loss/train': 0.7267654367024079} 12/26/2021 21:34:58 - INFO - 
codeparrot_training - Step 29855: {'lr': 0.00017597595198348795, 'samples': 15286272, 'steps': 29855, 'batch_loss/train': 0.7279439377598464} 12/26/2021 21:35:08 - INFO - codeparrot_training - Step 29856: {'lr': 0.00017596089584489042, 'samples': 15286784, 'steps': 29856, 'batch_loss/train': 0.7236279817298055} 12/26/2021 21:35:20 - INFO - codeparrot_training - Step 29857: {'lr': 0.00017594584000064434, 'samples': 15287296, 'steps': 29857, 'batch_loss/train': 0.6707192519679666} 12/26/2021 21:35:31 - INFO - codeparrot_training - Step 29858: {'lr': 0.0001759307844508098, 'samples': 15287808, 'steps': 29858, 'batch_loss/train': 0.7214372185990214} 12/26/2021 21:35:42 - INFO - codeparrot_training - Step 29859: {'lr': 0.00017591572919544644, 'samples': 15288320, 'steps': 29859, 'batch_loss/train': 0.7886618408374488} 12/26/2021 21:35:54 - INFO - codeparrot_training - Step 29860: {'lr': 0.00017590067423461415, 'samples': 15288832, 'steps': 29860, 'batch_loss/train': 0.7576619000174105} 12/26/2021 21:36:04 - INFO - codeparrot_training - Step 29861: {'lr': 0.0001758856195683727, 'samples': 15289344, 'steps': 29861, 'batch_loss/train': 0.7954837922006845} 12/26/2021 21:36:15 - INFO - codeparrot_training - Step 29862: {'lr': 0.00017587056519678218, 'samples': 15289856, 'steps': 29862, 'batch_loss/train': 0.710962432436645} 12/26/2021 21:36:29 - INFO - codeparrot_training - Step 29863: {'lr': 0.00017585551111990225, 'samples': 15290368, 'steps': 29863, 'batch_loss/train': 0.766776273958385} 12/26/2021 21:36:39 - INFO - codeparrot_training - Step 29864: {'lr': 0.00017584045733779274, 'samples': 15290880, 'steps': 29864, 'batch_loss/train': 0.7955705337226391} 12/26/2021 21:36:50 - INFO - codeparrot_training - Step 29865: {'lr': 0.00017582540385051372, 'samples': 15291392, 'steps': 29865, 'batch_loss/train': 0.7784337396733463} 12/26/2021 21:37:00 - INFO - codeparrot_training - Step 29866: {'lr': 0.00017581035065812484, 'samples': 15291904, 'steps': 29866, 'batch_loss/train': 
0.7520672967657447} 12/26/2021 21:37:13 - INFO - codeparrot_training - Step 29867: {'lr': 0.00017579529776068597, 'samples': 15292416, 'steps': 29867, 'batch_loss/train': 0.8317349073477089} 12/26/2021 21:37:23 - INFO - codeparrot_training - Step 29868: {'lr': 0.00017578024515825692, 'samples': 15292928, 'steps': 29868, 'batch_loss/train': 0.6763775246217847} 12/26/2021 21:37:34 - INFO - codeparrot_training - Step 29869: {'lr': 0.00017576519285089767, 'samples': 15293440, 'steps': 29869, 'batch_loss/train': 0.7823966074211057} 12/26/2021 21:37:48 - INFO - codeparrot_training - Step 29870: {'lr': 0.00017575014083866797, 'samples': 15293952, 'steps': 29870, 'batch_loss/train': 0.7214104114100337} 12/26/2021 21:37:59 - INFO - codeparrot_training - Step 29871: {'lr': 0.0001757350891216277, 'samples': 15294464, 'steps': 29871, 'batch_loss/train': 0.7659154273569584} 12/26/2021 21:38:09 - INFO - codeparrot_training - Step 29872: {'lr': 0.00017572003769983658, 'samples': 15294976, 'steps': 29872, 'batch_loss/train': 0.7361147091723979} 12/26/2021 21:38:21 - INFO - codeparrot_training - Step 29873: {'lr': 0.0001757049865733546, 'samples': 15295488, 'steps': 29873, 'batch_loss/train': 0.6872147074900568} 12/26/2021 21:38:32 - INFO - codeparrot_training - Step 29874: {'lr': 0.00017568993574224156, 'samples': 15296000, 'steps': 29874, 'batch_loss/train': 0.7198687493801117} 12/26/2021 21:38:43 - INFO - codeparrot_training - Step 29875: {'lr': 0.00017567488520655732, 'samples': 15296512, 'steps': 29875, 'batch_loss/train': 0.6901395455934107} 12/26/2021 21:38:55 - INFO - codeparrot_training - Step 29876: {'lr': 0.00017565983496636153, 'samples': 15297024, 'steps': 29876, 'batch_loss/train': 0.6735962443053722} 12/26/2021 21:39:06 - INFO - codeparrot_training - Step 29877: {'lr': 0.00017564478502171425, 'samples': 15297536, 'steps': 29877, 'batch_loss/train': 0.8112838086672127} 12/26/2021 21:39:16 - INFO - codeparrot_training - Step 29878: {'lr': 0.00017562973537267525, 
'samples': 15298048, 'steps': 29878, 'batch_loss/train': 0.7579631190747023} 12/26/2021 21:39:27 - INFO - codeparrot_training - Step 29879: {'lr': 0.0001756146860193043, 'samples': 15298560, 'steps': 29879, 'batch_loss/train': 0.7173398863524199} 12/26/2021 21:39:41 - INFO - codeparrot_training - Step 29880: {'lr': 0.00017559963696166126, 'samples': 15299072, 'steps': 29880, 'batch_loss/train': 0.8212463073432446} 12/26/2021 21:39:51 - INFO - codeparrot_training - Step 29881: {'lr': 0.00017558458819980606, 'samples': 15299584, 'steps': 29881, 'batch_loss/train': 0.6758187059313059} 12/26/2021 21:40:02 - INFO - codeparrot_training - Step 29882: {'lr': 0.00017556953973379836, 'samples': 15300096, 'steps': 29882, 'batch_loss/train': 0.7743541933596134} 12/26/2021 21:40:14 - INFO - codeparrot_training - Step 29883: {'lr': 0.00017555449156369802, 'samples': 15300608, 'steps': 29883, 'batch_loss/train': 0.9544348432682455} 12/26/2021 21:40:25 - INFO - codeparrot_training - Step 29884: {'lr': 0.00017553944368956502, 'samples': 15301120, 'steps': 29884, 'batch_loss/train': 0.7028228780254722} 12/26/2021 21:40:35 - INFO - codeparrot_training - Step 29885: {'lr': 0.00017552439611145907, 'samples': 15301632, 'steps': 29885, 'batch_loss/train': 0.9280639151111245} 12/26/2021 21:40:49 - INFO - codeparrot_training - Step 29886: {'lr': 0.00017550934882944, 'samples': 15302144, 'steps': 29886, 'batch_loss/train': 0.8013779390603304} 12/26/2021 21:40:59 - INFO - codeparrot_training - Step 29887: {'lr': 0.0001754943018435675, 'samples': 15302656, 'steps': 29887, 'batch_loss/train': 0.7891852082684636} 12/26/2021 21:41:10 - INFO - codeparrot_training - Step 29888: {'lr': 0.00017547925515390167, 'samples': 15303168, 'steps': 29888, 'batch_loss/train': 0.6404763609170914} 12/26/2021 21:41:21 - INFO - codeparrot_training - Step 29889: {'lr': 0.00017546420876050212, 'samples': 15303680, 'steps': 29889, 'batch_loss/train': 0.7671295367181301} 12/26/2021 21:41:33 - INFO - 
codeparrot_training - Step 29890: {'lr': 0.0001754491626634287, 'samples': 15304192, 'steps': 29890, 'batch_loss/train': 0.9025059817358851} 12/26/2021 21:41:43 - INFO - codeparrot_training - Step 29891: {'lr': 0.00017543411686274133, 'samples': 15304704, 'steps': 29891, 'batch_loss/train': 0.7852827720344067} 12/26/2021 21:41:54 - INFO - codeparrot_training - Step 29892: {'lr': 0.0001754190713584997, 'samples': 15305216, 'steps': 29892, 'batch_loss/train': 0.8115841491380706} 12/26/2021 21:42:06 - INFO - codeparrot_training - Step 29893: {'lr': 0.00017540402615076378, 'samples': 15305728, 'steps': 29893, 'batch_loss/train': 0.783857923001051} 12/26/2021 21:42:16 - INFO - codeparrot_training - Step 29894: {'lr': 0.00017538898123959314, 'samples': 15306240, 'steps': 29894, 'batch_loss/train': 0.737752164597623} 12/26/2021 21:42:27 - INFO - codeparrot_training - Step 29895: {'lr': 0.00017537393662504788, 'samples': 15306752, 'steps': 29895, 'batch_loss/train': 0.7501233089715242} 12/26/2021 21:42:42 - INFO - codeparrot_training - Step 29896: {'lr': 0.0001753588923071876, 'samples': 15307264, 'steps': 29896, 'batch_loss/train': 0.737302296794951} 12/26/2021 21:42:52 - INFO - codeparrot_training - Step 29897: {'lr': 0.00017534384828607225, 'samples': 15307776, 'steps': 29897, 'batch_loss/train': 0.8081900216639042} 12/26/2021 21:43:03 - INFO - codeparrot_training - Step 29898: {'lr': 0.0001753288045617614, 'samples': 15308288, 'steps': 29898, 'batch_loss/train': 0.6901397455949336} 12/26/2021 21:43:15 - INFO - codeparrot_training - Step 29899: {'lr': 0.00017531376113431515, 'samples': 15308800, 'steps': 29899, 'batch_loss/train': 0.7841885900124907} 12/26/2021 21:43:25 - INFO - codeparrot_training - Step 29900: {'lr': 0.00017529871800379322, 'samples': 15309312, 'steps': 29900, 'batch_loss/train': 0.7823585774749517} 12/26/2021 21:43:36 - INFO - codeparrot_training - Step 29901: {'lr': 0.0001752836751702553, 'samples': 15309824, 'steps': 29901, 'batch_loss/train': 
0.8728428818285465} 12/26/2021 21:43:47 - INFO - codeparrot_training - Step 29902: {'lr': 0.00017526863263376128, 'samples': 15310336, 'steps': 29902, 'batch_loss/train': 0.6931925537064672} 12/26/2021 21:43:59 - INFO - codeparrot_training - Step 29903: {'lr': 0.00017525359039437106, 'samples': 15310848, 'steps': 29903, 'batch_loss/train': 0.7468297928571701} 12/26/2021 21:44:10 - INFO - codeparrot_training - Step 29904: {'lr': 0.0001752385484521443, 'samples': 15311360, 'steps': 29904, 'batch_loss/train': 0.5199747987790033} 12/26/2021 21:44:20 - INFO - codeparrot_training - Step 29905: {'lr': 0.00017522350680714076, 'samples': 15311872, 'steps': 29905, 'batch_loss/train': 0.6493317577987909} 12/26/2021 21:44:32 - INFO - codeparrot_training - Step 29906: {'lr': 0.00017520846545942035, 'samples': 15312384, 'steps': 29906, 'batch_loss/train': 0.7693142448551953} 12/26/2021 21:44:43 - INFO - codeparrot_training - Step 29907: {'lr': 0.0001751934244090429, 'samples': 15312896, 'steps': 29907, 'batch_loss/train': 0.7766768138390034} 12/26/2021 21:44:54 - INFO - codeparrot_training - Step 29908: {'lr': 0.0001751783836560681, 'samples': 15313408, 'steps': 29908, 'batch_loss/train': 0.6933397594839334} 12/26/2021 21:45:07 - INFO - codeparrot_training - Step 29909: {'lr': 0.00017516334320055577, 'samples': 15313920, 'steps': 29909, 'batch_loss/train': 0.740471285302192} 12/26/2021 21:45:18 - INFO - codeparrot_training - Step 29910: {'lr': 0.00017514830304256578, 'samples': 15314432, 'steps': 29910, 'batch_loss/train': 0.7773136971518397} 12/26/2021 21:45:29 - INFO - codeparrot_training - Step 29911: {'lr': 0.00017513326318215793, 'samples': 15314944, 'steps': 29911, 'batch_loss/train': 0.7704294142313302} 12/26/2021 21:45:41 - INFO - codeparrot_training - Step 29912: {'lr': 0.0001751182236193919, 'samples': 15315456, 'steps': 29912, 'batch_loss/train': 0.7521948392386548} 12/26/2021 21:45:52 - INFO - codeparrot_training - Step 29913: {'lr': 0.00017510318435432745, 
'samples': 15315968, 'steps': 29913, 'batch_loss/train': 0.7948013143613935} 12/26/2021 21:46:02 - INFO - codeparrot_training - Step 29914: {'lr': 0.0001750881453870246, 'samples': 15316480, 'steps': 29914, 'batch_loss/train': 0.7653692420572042} 12/26/2021 21:46:13 - INFO - codeparrot_training - Step 29915: {'lr': 0.00017507310671754293, 'samples': 15316992, 'steps': 29915, 'batch_loss/train': 0.8040840933099389} 12/26/2021 21:46:27 - INFO - codeparrot_training - Step 29916: {'lr': 0.00017505806834594223, 'samples': 15317504, 'steps': 29916, 'batch_loss/train': 0.757386795245111} 12/26/2021 21:46:38 - INFO - codeparrot_training - Step 29917: {'lr': 0.00017504303027228245, 'samples': 15318016, 'steps': 29917, 'batch_loss/train': 0.6550556881120428} 12/26/2021 21:46:48 - INFO - codeparrot_training - Step 29918: {'lr': 0.00017502799249662326, 'samples': 15318528, 'steps': 29918, 'batch_loss/train': 0.6616125628352165} 12/26/2021 21:47:00 - INFO - codeparrot_training - Step 29919: {'lr': 0.00017501295501902453, 'samples': 15319040, 'steps': 29919, 'batch_loss/train': 0.7918259552679956} 12/26/2021 21:47:11 - INFO - codeparrot_training - Step 29920: {'lr': 0.00017499791783954583, 'samples': 15319552, 'steps': 29920, 'batch_loss/train': 0.8005525730550289} 12/26/2021 21:47:21 - INFO - codeparrot_training - Step 29921: {'lr': 0.00017498288095824723, 'samples': 15320064, 'steps': 29921, 'batch_loss/train': 0.6757664622273296} 12/26/2021 21:47:33 - INFO - codeparrot_training - Step 29922: {'lr': 0.00017496784437518828, 'samples': 15320576, 'steps': 29922, 'batch_loss/train': 0.7448168182745576} 12/26/2021 21:47:44 - INFO - codeparrot_training - Step 29923: {'lr': 0.0001749528080904289, 'samples': 15321088, 'steps': 29923, 'batch_loss/train': 0.8278609858825803} 12/26/2021 21:47:55 - INFO - codeparrot_training - Step 29924: {'lr': 0.00017493777210402875, 'samples': 15321600, 'steps': 29924, 'batch_loss/train': 0.6878920391900465} 12/26/2021 21:48:05 - INFO - 
codeparrot_training - Step 29925: {'lr': 0.00017492273641604773, 'samples': 15322112, 'steps': 29925, 'batch_loss/train': 0.8161553139798343} 12/26/2021 21:48:19 - INFO - codeparrot_training - Step 29926: {'lr': 0.00017490770102654557, 'samples': 15322624, 'steps': 29926, 'batch_loss/train': 0.7768269460648298} 12/26/2021 21:48:30 - INFO - codeparrot_training - Step 29927: {'lr': 0.00017489266593558194, 'samples': 15323136, 'steps': 29927, 'batch_loss/train': 0.7631095605902374} 12/26/2021 21:48:40 - INFO - codeparrot_training - Step 29928: {'lr': 0.0001748776311432168, 'samples': 15323648, 'steps': 29928, 'batch_loss/train': 0.8902457421645522} 12/26/2021 21:48:53 - INFO - codeparrot_training - Step 29929: {'lr': 0.00017486259664950988, 'samples': 15324160, 'steps': 29929, 'batch_loss/train': 0.7653959030285478} 12/26/2021 21:49:03 - INFO - codeparrot_training - Step 29930: {'lr': 0.0001748475624545209, 'samples': 15324672, 'steps': 29930, 'batch_loss/train': 0.7122709998511709} 12/26/2021 21:49:14 - INFO - codeparrot_training - Step 29931: {'lr': 0.0001748325285583095, 'samples': 15325184, 'steps': 29931, 'batch_loss/train': 0.7065404746681452} 12/26/2021 21:49:26 - INFO - codeparrot_training - Step 29932: {'lr': 0.0001748174949609357, 'samples': 15325696, 'steps': 29932, 'batch_loss/train': 0.6513059518765658} 12/26/2021 21:49:37 - INFO - codeparrot_training - Step 29933: {'lr': 0.00017480246166245913, 'samples': 15326208, 'steps': 29933, 'batch_loss/train': 0.6802951442077756} 12/26/2021 21:49:47 - INFO - codeparrot_training - Step 29934: {'lr': 0.00017478742866293957, 'samples': 15326720, 'steps': 29934, 'batch_loss/train': 0.7164548886939883} 12/26/2021 21:50:01 - INFO - codeparrot_training - Step 29935: {'lr': 0.0001747723959624367, 'samples': 15327232, 'steps': 29935, 'batch_loss/train': 0.7590101277455688} 12/26/2021 21:50:12 - INFO - codeparrot_training - Step 29936: {'lr': 0.00017475736356101056, 'samples': 15327744, 'steps': 29936, 'batch_loss/train': 
1.017740837763995} 12/26/2021 21:50:23 - INFO - codeparrot_training - Step 29937: {'lr': 0.0001747423314587206, 'samples': 15328256, 'steps': 29937, 'batch_loss/train': 0.7966646729037166} 12/26/2021 21:50:33 - INFO - codeparrot_training - Step 29938: {'lr': 0.00017472729965562674, 'samples': 15328768, 'steps': 29938, 'batch_loss/train': 0.7844056347385049} 12/26/2021 21:50:45 - INFO - codeparrot_training - Step 29939: {'lr': 0.00017471226815178877, 'samples': 15329280, 'steps': 29939, 'batch_loss/train': 0.717587104300037} 12/26/2021 21:50:56 - INFO - codeparrot_training - Step 29940: {'lr': 0.00017469723694726636, 'samples': 15329792, 'steps': 29940, 'batch_loss/train': 0.772905747871846} 12/26/2021 21:51:07 - INFO - codeparrot_training - Step 29941: {'lr': 0.00017468220604211933, 'samples': 15330304, 'steps': 29941, 'batch_loss/train': 0.7545631709508598} 12/26/2021 21:51:21 - INFO - codeparrot_training - Step 29942: {'lr': 0.0001746671754364073, 'samples': 15330816, 'steps': 29942, 'batch_loss/train': 0.6628169436007738} 12/26/2021 21:51:31 - INFO - codeparrot_training - Step 29943: {'lr': 0.00017465214513019025, 'samples': 15331328, 'steps': 29943, 'batch_loss/train': 0.7875103671103716} 12/26/2021 21:51:42 - INFO - codeparrot_training - Step 29944: {'lr': 0.0001746371151235278, 'samples': 15331840, 'steps': 29944, 'batch_loss/train': 0.8706015152856708} 12/26/2021 21:51:54 - INFO - codeparrot_training - Step 29945: {'lr': 0.00017462208541647973, 'samples': 15332352, 'steps': 29945, 'batch_loss/train': 0.7087336871773005} 12/26/2021 21:52:05 - INFO - codeparrot_training - Step 29946: {'lr': 0.0001746070560091057, 'samples': 15332864, 'steps': 29946, 'batch_loss/train': 0.7838950206059963} 12/26/2021 21:52:15 - INFO - codeparrot_training - Step 29947: {'lr': 0.00017459202690146562, 'samples': 15333376, 'steps': 29947, 'batch_loss/train': 0.8370181024074554} 12/26/2021 21:52:26 - INFO - codeparrot_training - Step 29948: {'lr': 0.00017457699809361922, 'samples': 
15333888, 'steps': 29948, 'batch_loss/train': 0.7857787392567843} 12/26/2021 21:52:38 - INFO - codeparrot_training - Step 29949: {'lr': 0.00017456196958562617, 'samples': 15334400, 'steps': 29949, 'batch_loss/train': 0.8094277903437614} 12/26/2021 21:52:48 - INFO - codeparrot_training - Step 29950: {'lr': 0.00017454694137754615, 'samples': 15334912, 'steps': 29950, 'batch_loss/train': 0.7135200668126345} 12/26/2021 21:52:59 - INFO - codeparrot_training - Step 29951: {'lr': 0.00017453191346943905, 'samples': 15335424, 'steps': 29951, 'batch_loss/train': 0.7859143372625113} 12/26/2021 21:53:11 - INFO - codeparrot_training - Step 29952: {'lr': 0.00017451688586136464, 'samples': 15335936, 'steps': 29952, 'batch_loss/train': 0.7892568032257259} 12/26/2021 21:53:22 - INFO - codeparrot_training - Step 29953: {'lr': 0.00017450185855338247, 'samples': 15336448, 'steps': 29953, 'batch_loss/train': 1.2738578631542623} 12/26/2021 21:53:33 - INFO - codeparrot_training - Step 29954: {'lr': 0.00017448683154555245, 'samples': 15336960, 'steps': 29954, 'batch_loss/train': 1.057082451414317} 12/26/2021 21:53:47 - INFO - codeparrot_training - Step 29955: {'lr': 0.00017447180483793435, 'samples': 15337472, 'steps': 29955, 'batch_loss/train': 0.7235221231821924} 12/26/2021 21:53:58 - INFO - codeparrot_training - Step 29956: {'lr': 0.00017445677843058784, 'samples': 15337984, 'steps': 29956, 'batch_loss/train': 0.8809539619833231} 12/26/2021 21:54:08 - INFO - codeparrot_training - Step 29957: {'lr': 0.0001744417523235725, 'samples': 15338496, 'steps': 29957, 'batch_loss/train': 0.796815378125757} 12/26/2021 21:54:19 - INFO - codeparrot_training - Step 29958: {'lr': 0.00017442672651694829, 'samples': 15339008, 'steps': 29958, 'batch_loss/train': 0.7455943282693624} 12/26/2021 21:54:31 - INFO - codeparrot_training - Step 29959: {'lr': 0.00017441170101077498, 'samples': 15339520, 'steps': 29959, 'batch_loss/train': 0.8075656741857529} 12/26/2021 21:54:42 - INFO - codeparrot_training - Step 
29960: {'lr': 0.0001743966758051121, 'samples': 15340032, 'steps': 29960, 'batch_loss/train': 0.6835260484367609} 12/26/2021 21:54:52 - INFO - codeparrot_training - Step 29961: {'lr': 0.00017438165090001944, 'samples': 15340544, 'steps': 29961, 'batch_loss/train': 0.8512129643931985} 12/26/2021 21:55:04 - INFO - codeparrot_training - Step 29962: {'lr': 0.00017436662629555694, 'samples': 15341056, 'steps': 29962, 'batch_loss/train': 0.7435447024181485} 12/26/2021 21:55:15 - INFO - codeparrot_training - Step 29963: {'lr': 0.00017435160199178408, 'samples': 15341568, 'steps': 29963, 'batch_loss/train': 0.8048975355923176} 12/26/2021 21:55:25 - INFO - codeparrot_training - Step 29964: {'lr': 0.00017433657798876065, 'samples': 15342080, 'steps': 29964, 'batch_loss/train': 0.7956162337213755} 12/26/2021 21:55:40 - INFO - codeparrot_training - Step 29965: {'lr': 0.0001743215542865465, 'samples': 15342592, 'steps': 29965, 'batch_loss/train': 0.7601553788408637} 12/26/2021 21:55:51 - INFO - codeparrot_training - Step 29966: {'lr': 0.00017430653088520133, 'samples': 15343104, 'steps': 29966, 'batch_loss/train': 0.8080777684226632} 12/26/2021 21:56:01 - INFO - codeparrot_training - Step 29967: {'lr': 0.0001742915077847847, 'samples': 15343616, 'steps': 29967, 'batch_loss/train': 0.7813421457540244} 12/26/2021 21:56:12 - INFO - codeparrot_training - Step 29968: {'lr': 0.0001742764849853564, 'samples': 15344128, 'steps': 29968, 'batch_loss/train': 0.696130739786895} 12/26/2021 21:56:24 - INFO - codeparrot_training - Step 29969: {'lr': 0.00017426146248697633, 'samples': 15344640, 'steps': 29969, 'batch_loss/train': 0.6331873859744519} 12/26/2021 21:56:35 - INFO - codeparrot_training - Step 29970: {'lr': 0.00017424644028970405, 'samples': 15345152, 'steps': 29970, 'batch_loss/train': 0.7722918381914496} 12/26/2021 21:56:45 - INFO - codeparrot_training - Step 29971: {'lr': 0.00017423141839359934, 'samples': 15345664, 'steps': 29971, 'batch_loss/train': 0.8369016856886446} 
12/26/2021 21:56:59 - INFO - codeparrot_training - Step 29972: {'lr': 0.00017421639679872182, 'samples': 15346176, 'steps': 29972, 'batch_loss/train': 0.8403258072212338}
12/26/2021 21:57:10 - INFO - codeparrot_training - Step 29973: {'lr': 0.00017420137550513135, 'samples': 15346688, 'steps': 29973, 'batch_loss/train': 0.7371348300948739}
12/26/2021 21:57:21 - INFO - codeparrot_training - Step 29974: {'lr': 0.0001741863545128876, 'samples': 15347200, 'steps': 29974, 'batch_loss/train': 0.796187971252948}
12/26/2021 21:57:33 - INFO - codeparrot_training - Step 29975: {'lr': 0.00017417133382205023, 'samples': 15347712, 'steps': 29975, 'batch_loss/train': 0.7288419667165726}
12/26/2021 21:57:43 - INFO - codeparrot_training - Step 29976: {'lr': 0.00017415631343267908, 'samples': 15348224, 'steps': 29976, 'batch_loss/train': 0.7522234462667257}
12/26/2021 21:57:54 - INFO - codeparrot_training - Step 29977: {'lr': 0.00017414129334483375, 'samples': 15348736, 'steps': 29977, 'batch_loss/train': 0.7646721145138144}
12/26/2021 21:58:04 - INFO - codeparrot_training - Step 29978: {'lr': 0.00017412627355857402, 'samples': 15349248, 'steps': 29978, 'batch_loss/train': 0.7512490912340581}
12/26/2021 21:58:17 - INFO - codeparrot_training - Step 29979: {'lr': 0.0001741112540739595, 'samples': 15349760, 'steps': 29979, 'batch_loss/train': 0.8031727815978229}
12/26/2021 21:58:27 - INFO - codeparrot_training - Step 29980: {'lr': 0.00017409623489105, 'samples': 15350272, 'steps': 29980, 'batch_loss/train': 0.7787335691973567}
12/26/2021 21:58:38 - INFO - codeparrot_training - Step 29981: {'lr': 0.00017408121600990528, 'samples': 15350784, 'steps': 29981, 'batch_loss/train': 0.5157217895612121}
12/26/2021 21:58:52 - INFO - codeparrot_training - Step 29982: {'lr': 0.00017406619743058494, 'samples': 15351296, 'steps': 29982, 'batch_loss/train': 1.5815662414534017}
12/26/2021 21:59:02 - INFO - codeparrot_training - Step 29983: {'lr': 0.00017405117915314866, 'samples': 15351808, 'steps': 29983, 'batch_loss/train': 0.7331077791750431}
12/26/2021 21:59:13 - INFO - codeparrot_training - Step 29984: {'lr': 0.00017403616117765634, 'samples': 15352320, 'steps': 29984, 'batch_loss/train': 0.7295268788002431}
12/26/2021 21:59:25 - INFO - codeparrot_training - Step 29985: {'lr': 0.00017402114350416754, 'samples': 15352832, 'steps': 29985, 'batch_loss/train': 0.7106801010668278}
12/26/2021 21:59:36 - INFO - codeparrot_training - Step 29986: {'lr': 0.0001740061261327419, 'samples': 15353344, 'steps': 29986, 'batch_loss/train': 0.6638474719366059}
12/26/2021 21:59:46 - INFO - codeparrot_training - Step 29987: {'lr': 0.00017399110906343924, 'samples': 15353856, 'steps': 29987, 'batch_loss/train': 0.8789615430869162}
12/26/2021 21:59:57 - INFO - codeparrot_training - Step 29988: {'lr': 0.00017397609229631927, 'samples': 15354368, 'steps': 29988, 'batch_loss/train': 0.6692727757617831}
12/26/2021 22:00:10 - INFO - codeparrot_training - Step 29989: {'lr': 0.00017396107583144161, 'samples': 15354880, 'steps': 29989, 'batch_loss/train': 0.7368970743846148}
12/26/2021 22:00:20 - INFO - codeparrot_training - Step 29990: {'lr': 0.00017394605966886595, 'samples': 15355392, 'steps': 29990, 'batch_loss/train': 0.6781292000669055}
12/26/2021 22:00:31 - INFO - codeparrot_training - Step 29991: {'lr': 0.00017393104380865215, 'samples': 15355904, 'steps': 29991, 'batch_loss/train': 0.8083555214107037}
12/26/2021 22:00:43 - INFO - codeparrot_training - Step 29992: {'lr': 0.00017391602825085977, 'samples': 15356416, 'steps': 29992, 'batch_loss/train': 0.6322332786512561}
12/26/2021 22:00:53 - INFO - codeparrot_training - Step 29993: {'lr': 0.00017390101299554855, 'samples': 15356928, 'steps': 29993, 'batch_loss/train': 0.8048429619520903}
12/26/2021 22:01:04 - INFO - codeparrot_training - Step 29994: {'lr': 0.00017388599804277807, 'samples': 15357440, 'steps': 29994, 'batch_loss/train': 0.7752991076558828}
12/26/2021 22:01:18 - INFO - codeparrot_training - Step 29995: {'lr': 0.00017387098339260827, 'samples': 15357952, 'steps': 29995, 'batch_loss/train': 0.7732452815398574}
12/26/2021 22:01:29 - INFO - codeparrot_training - Step 29996: {'lr': 0.00017385596904509858, 'samples': 15358464, 'steps': 29996, 'batch_loss/train': 0.7817917163483799}
12/26/2021 22:01:39 - INFO - codeparrot_training - Step 29997: {'lr': 0.0001738409550003089, 'samples': 15358976, 'steps': 29997, 'batch_loss/train': 0.6202339010778815}
12/26/2021 22:01:50 - INFO - codeparrot_training - Step 29998: {'lr': 0.0001738259412582987, 'samples': 15359488, 'steps': 29998, 'batch_loss/train': 0.753645206335932}
12/26/2021 22:02:02 - INFO - codeparrot_training - Step 29999: {'lr': 0.00017381092781912783, 'samples': 15360000, 'steps': 29999, 'batch_loss/train': 0.6627156580798328}
12/26/2021 22:02:02 - INFO - codeparrot_training - Evaluating and saving model checkpoint
12/26/2021 22:05:23 - INFO - codeparrot_training - Step 30000: {'loss/eval': 0.7671898603439331, 'perplexity': 2.153705596923828}
12/26/2021 22:05:41 - WARNING - huggingface_hub.repository - Several commits (10) will be pushed upstream.
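As an aside, the 'perplexity' reported at the evaluation step is simply the exponential of 'loss/eval'. A minimal sketch of that check against the record above (the regex-based parsing is illustrative, not part of the training script):

```python
import ast
import math
import re

# The evaluation record as it appears in the log above.
record = ("12/26/2021 22:05:23 - INFO - codeparrot_training - "
          "Step 30000: {'loss/eval': 0.7671898603439331, 'perplexity': 2.153705596923828}")

# Extract the metrics dict literal from the record and parse it safely.
metrics = ast.literal_eval(re.search(r"\{.*\}", record).group(0))

# perplexity == exp(loss/eval), up to float32 rounding in the logged values.
assert math.isclose(math.exp(metrics['loss/eval']), metrics['perplexity'], rel_tol=1e-5)
```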
12/26/2021 22:05:54 - INFO - codeparrot_training - Step 30000: {'lr': 0.000173795914682856, 'samples': 15360512, 'steps': 30000, 'batch_loss/train': 0.6741299910936505}
12/26/2021 22:06:05 - INFO - codeparrot_training - Step 30001: {'lr': 0.00017378090184954275, 'samples': 15361024, 'steps': 30001, 'batch_loss/train': 0.734086737036705}
12/26/2021 22:06:19 - INFO - codeparrot_training - Step 30002: {'lr': 0.000173765889319248, 'samples': 15361536, 'steps': 30002, 'batch_loss/train': 0.738006106344983}
12/26/2021 22:06:29 - INFO - codeparrot_training - Step 30003: {'lr': 0.0001737508770920312, 'samples': 15362048, 'steps': 30003, 'batch_loss/train': 0.9215702582150698}
12/26/2021 22:06:40 - INFO - codeparrot_training - Step 30004: {'lr': 0.00017373586516795218, 'samples': 15362560, 'steps': 30004, 'batch_loss/train': 0.7753298892639577}
12/26/2021 22:06:52 - INFO - codeparrot_training - Step 30005: {'lr': 0.00017372085354707046, 'samples': 15363072, 'steps': 30005, 'batch_loss/train': 0.6623600504826754}
12/26/2021 22:07:03 - INFO - codeparrot_training - Step 30006: {'lr': 0.00017370584222944584, 'samples': 15363584, 'steps': 30006, 'batch_loss/train': 1.3678796836175025}
12/26/2021 22:07:13 - INFO - codeparrot_training - Step 30007: {'lr': 0.00017369083121513806, 'samples': 15364096, 'steps': 30007, 'batch_loss/train': 0.7491974392905831}
12/26/2021 22:07:26 - INFO - codeparrot_training - Step 30008: {'lr': 0.00017367582050420669, 'samples': 15364608, 'steps': 30008, 'batch_loss/train': 0.621832494798582}
12/26/2021 22:07:37 - INFO - codeparrot_training - Step 30009: {'lr': 0.00017366081009671135, 'samples': 15365120, 'steps': 30009, 'batch_loss/train': 0.7666941303759813}
12/26/2021 22:07:47 - INFO - codeparrot_training - Step 30010: {'lr': 0.00017364579999271196, 'samples': 15365632, 'steps': 30010, 'batch_loss/train': 0.7723866398446262}
12/26/2021 22:07:58 - INFO - codeparrot_training - Step 30011: {'lr': 0.000173630790192268, 'samples': 15366144, 'steps': 30011, 'batch_loss/train': 0.6190134166972712}
12/26/2021 22:08:12 - INFO - codeparrot_training - Step 30012: {'lr': 0.0001736157806954391, 'samples': 15366656, 'steps': 30012, 'batch_loss/train': 0.7219861653866246}
12/26/2021 22:08:23 - INFO - codeparrot_training - Step 30013: {'lr': 0.00017360077150228504, 'samples': 15367168, 'steps': 30013, 'batch_loss/train': 0.7434927065623924}
12/26/2021 22:08:33 - INFO - codeparrot_training - Step 30014: {'lr': 0.00017358576261286554, 'samples': 15367680, 'steps': 30014, 'batch_loss/train': 0.7640785025432706}
12/26/2021 22:08:45 - INFO - codeparrot_training - Step 30015: {'lr': 0.00017357075402724015, 'samples': 15368192, 'steps': 30015, 'batch_loss/train': 0.9167928621172905}
12/26/2021 22:08:56 - INFO - codeparrot_training - Step 30016: {'lr': 0.0001735557457454685, 'samples': 15368704, 'steps': 30016, 'batch_loss/train': 0.8416248792782426}
12/26/2021 22:09:07 - INFO - codeparrot_training - Step 30017: {'lr': 0.00017354073776761048, 'samples': 15369216, 'steps': 30017, 'batch_loss/train': 0.8706389958970249}
12/26/2021 22:09:19 - INFO - codeparrot_training - Step 30018: {'lr': 0.00017352573009372552, 'samples': 15369728, 'steps': 30018, 'batch_loss/train': 0.79141865670681}
12/26/2021 22:09:29 - INFO - codeparrot_training - Step 30019: {'lr': 0.00017351072272387346, 'samples': 15370240, 'steps': 30019, 'batch_loss/train': 0.8110535629093647}
12/26/2021 22:09:40 - INFO - codeparrot_training - Step 30020: {'lr': 0.0001734957156581138, 'samples': 15370752, 'steps': 30020, 'batch_loss/train': 0.7980521563440561}
12/26/2021 22:09:50 - INFO - codeparrot_training - Step 30021: {'lr': 0.00017348070889650636, 'samples': 15371264, 'steps': 30021, 'batch_loss/train': 0.8003891687840223}
12/26/2021 22:10:04 - INFO - codeparrot_training - Step 30022: {'lr': 0.00017346570243911068, 'samples': 15371776, 'steps': 30022, 'batch_loss/train': 0.766948038013652}
12/26/2021 22:10:15 - INFO - codeparrot_training - Step 30023: {'lr': 0.00017345069628598654, 'samples': 15372288, 'steps': 30023, 'batch_loss/train': 0.7598970904946327}
12/26/2021 22:10:25 - INFO - codeparrot_training - Step 30024: {'lr': 0.00017343569043719342, 'samples': 15372800, 'steps': 30024, 'batch_loss/train': 0.6691847261972725}
12/26/2021 22:10:38 - INFO - codeparrot_training - Step 30025: {'lr': 0.0001734206848927911, 'samples': 15373312, 'steps': 30025, 'batch_loss/train': 0.7270841198042035}
12/26/2021 22:10:48 - INFO - codeparrot_training - Step 30026: {'lr': 0.0001734056796528393, 'samples': 15373824, 'steps': 30026, 'batch_loss/train': 0.7853860724717379}
12/26/2021 22:10:59 - INFO - codeparrot_training - Step 30027: {'lr': 0.00017339067471739753, 'samples': 15374336, 'steps': 30027, 'batch_loss/train': 0.7010464081540704}
12/26/2021 22:11:11 - INFO - codeparrot_training - Step 30028: {'lr': 0.00017337567008652555, 'samples': 15374848, 'steps': 30028, 'batch_loss/train': 0.8457832271233201}
12/26/2021 22:11:22 - INFO - codeparrot_training - Step 30029: {'lr': 0.00017336066576028298, 'samples': 15375360, 'steps': 30029, 'batch_loss/train': 0.8015181734226644}
12/26/2021 22:11:33 - INFO - codeparrot_training - Step 30030: {'lr': 0.0001733456617387295, 'samples': 15375872, 'steps': 30030, 'batch_loss/train': 0.7640859708189964}
12/26/2021 22:11:43 - INFO - codeparrot_training - Step 30031: {'lr': 0.0001733306580219246, 'samples': 15376384, 'steps': 30031, 'batch_loss/train': 0.6884589660912752}
12/26/2021 22:11:57 - INFO - codeparrot_training - Step 30032: {'lr': 0.00017331565460992812, 'samples': 15376896, 'steps': 30032, 'batch_loss/train': 0.7282989639788866}
12/26/2021 22:12:08 - INFO - codeparrot_training - Step 30033: {'lr': 0.00017330065150279972, 'samples': 15377408, 'steps': 30033, 'batch_loss/train': 0.8217993099242449}
12/26/2021 22:12:19 - INFO - codeparrot_training - Step 30034: {'lr': 0.00017328564870059887, 'samples': 15377920, 'steps': 30034, 'batch_loss/train': 0.8053731303662062}
12/26/2021 22:12:31 - INFO - codeparrot_training - Step 30035: {'lr': 0.0001732706462033853, 'samples': 15378432, 'steps': 30035, 'batch_loss/train': 0.804332653991878}
12/26/2021 22:12:41 - INFO - codeparrot_training - Step 30036: {'lr': 0.00017325564401121874, 'samples': 15378944, 'steps': 30036, 'batch_loss/train': 0.7638776050880551}
12/26/2021 22:12:52 - INFO - codeparrot_training - Step 30037: {'lr': 0.00017324064212415874, 'samples': 15379456, 'steps': 30037, 'batch_loss/train': 0.6693581594154239}
12/26/2021 22:13:04 - INFO - codeparrot_training - Step 30038: {'lr': 0.0001732256405422649, 'samples': 15379968, 'steps': 30038, 'batch_loss/train': 0.6732131012249738}
12/26/2021 22:13:15 - INFO - codeparrot_training - Step 30039: {'lr': 0.000173210639265597, 'samples': 15380480, 'steps': 30039, 'batch_loss/train': 0.7626086929813027}
12/26/2021 22:13:26 - INFO - codeparrot_training - Step 30040: {'lr': 0.00017319563829421464, 'samples': 15380992, 'steps': 30040, 'batch_loss/train': 0.6998994001187384}
12/26/2021 22:13:36 - INFO - codeparrot_training - Step 30041: {'lr': 0.00017318063762817738, 'samples': 15381504, 'steps': 30041, 'batch_loss/train': 0.7567367167212069}
12/26/2021 22:13:50 - INFO - codeparrot_training - Step 30042: {'lr': 0.0001731656372675448, 'samples': 15382016, 'steps': 30042, 'batch_loss/train': 0.8409196278080344}
12/26/2021 22:14:01 - INFO - codeparrot_training - Step 30043: {'lr': 0.00017315063721237678, 'samples': 15382528, 'steps': 30043, 'batch_loss/train': 0.6762333190999925}
12/26/2021 22:14:12 - INFO - codeparrot_training - Step 30044: {'lr': 0.00017313563746273273, 'samples': 15383040, 'steps': 30044, 'batch_loss/train': 0.7392363771796227}
12/26/2021 22:14:24 - INFO - codeparrot_training - Step 30045: {'lr': 0.00017312063801867242, 'samples': 15383552, 'steps': 30045, 'batch_loss/train': 0.9839538093656301}
12/26/2021 22:14:34 - INFO - codeparrot_training - Step 30046: {'lr': 0.00017310563888025532, 'samples': 15384064, 'steps': 30046, 'batch_loss/train': 0.7850812714314088}
12/26/2021 22:14:45 - INFO - codeparrot_training - Step 30047: {'lr': 0.0001730906400475413, 'samples': 15384576, 'steps': 30047, 'batch_loss/train': 0.6211493818555027}
12/26/2021 22:14:57 - INFO - codeparrot_training - Step 30048: {'lr': 0.00017307564152058976, 'samples': 15385088, 'steps': 30048, 'batch_loss/train': 0.8396371905691922}
12/26/2021 22:15:08 - INFO - codeparrot_training - Step 30049: {'lr': 0.0001730606432994604, 'samples': 15385600, 'steps': 30049, 'batch_loss/train': 0.7791494959965348}
12/26/2021 22:15:19 - INFO - codeparrot_training - Step 30050: {'lr': 0.00017304564538421296, 'samples': 15386112, 'steps': 30050, 'batch_loss/train': 0.8510519838891923}
12/26/2021 22:15:32 - INFO - codeparrot_training - Step 30051: {'lr': 0.00017303064777490694, 'samples': 15386624, 'steps': 30051, 'batch_loss/train': 0.7835797714069486}
12/26/2021 22:15:43 - INFO - codeparrot_training - Step 30052: {'lr': 0.00017301565047160206, 'samples': 15387136, 'steps': 30052, 'batch_loss/train': 0.8498518075793982}
12/26/2021 22:15:54 - INFO - codeparrot_training - Step 30053: {'lr': 0.00017300065347435775, 'samples': 15387648, 'steps': 30053, 'batch_loss/train': 0.8227023310028017}
12/26/2021 22:16:04 - INFO - codeparrot_training - Step 30054: {'lr': 0.00017298565678323385, 'samples': 15388160, 'steps': 30054, 'batch_loss/train': 0.7760849432088435}
12/26/2021 22:16:17 - INFO - codeparrot_training - Step 30055: {'lr': 0.00017297066039828992, 'samples': 15388672, 'steps': 30055, 'batch_loss/train': 0.6745460024103522}
12/26/2021 22:16:27 - INFO - codeparrot_training - Step 30056: {'lr': 0.0001729556643195856, 'samples': 15389184, 'steps': 30056, 'batch_loss/train': 0.7962581180036068}
12/26/2021 22:16:38 - INFO - codeparrot_training - Step 30057: {'lr': 0.00017294066854718032, 'samples': 15389696, 'steps': 30057, 'batch_loss/train': 0.7219424182549119}
12/26/2021 22:16:52 - INFO - codeparrot_training - Step 30058: {'lr': 0.00017292567308113395, 'samples': 15390208, 'steps': 30058, 'batch_loss/train': 0.8424452901817858}
12/26/2021 22:17:02 - INFO - codeparrot_training - Step 30059: {'lr': 0.000172910677921506, 'samples': 15390720, 'steps': 30059, 'batch_loss/train': 0.74047726765275}
12/26/2021 22:17:13 - INFO - codeparrot_training - Step 30060: {'lr': 0.00017289568306835606, 'samples': 15391232, 'steps': 30060, 'batch_loss/train': 0.7741753468289971}
12/26/2021 22:17:25 - INFO - codeparrot_training - Step 30061: {'lr': 0.0001728806885217437, 'samples': 15391744, 'steps': 30061, 'batch_loss/train': 0.7326560192741454}
12/26/2021 22:17:36 - INFO - codeparrot_training - Step 30062: {'lr': 0.00017286569428172875, 'samples': 15392256, 'steps': 30062, 'batch_loss/train': 0.9826996740885079}
12/26/2021 22:17:46 - INFO - codeparrot_training - Step 30063: {'lr': 0.00017285070034837058, 'samples': 15392768, 'steps': 30063, 'batch_loss/train': 0.8089829701930285}
12/26/2021 22:17:57 - INFO - codeparrot_training - Step 30064: {'lr': 0.0001728357067217289, 'samples': 15393280, 'steps': 30064, 'batch_loss/train': 0.8493360662832856}
12/26/2021 22:18:09 - INFO - codeparrot_training - Step 30065: {'lr': 0.00017282071340186334, 'samples': 15393792, 'steps': 30065, 'batch_loss/train': 0.7486019311472774}
12/26/2021 22:18:20 - INFO - codeparrot_training - Step 30066: {'lr': 0.00017280572038883352, 'samples': 15394304, 'steps': 30066, 'batch_loss/train': 1.0539588844403625}
12/26/2021 22:18:30 - INFO - codeparrot_training - Step 30067: {'lr': 0.00017279072768269896, 'samples': 15394816, 'steps': 30067, 'batch_loss/train': 0.6918194266036153}
12/26/2021 22:18:43 - INFO - codeparrot_training - Step 30068: {'lr': 0.00017277573528351925, 'samples': 15395328, 'steps': 30068, 'batch_loss/train': 0.7978560430929065}
12/26/2021 22:18:53 - INFO - codeparrot_training - Step 30069: {'lr': 0.00017276074319135418, 'samples': 15395840, 'steps': 30069, 'batch_loss/train': 0.7362247407436371}
12/26/2021 22:19:04 - INFO - codeparrot_training - Step 30070: {'lr': 0.00017274575140626317, 'samples': 15396352, 'steps': 30070, 'batch_loss/train': 0.7372299400158226}
12/26/2021 22:19:18 - INFO - codeparrot_training - Step 30071: {'lr': 0.00017273075992830595, 'samples': 15396864, 'steps': 30071, 'batch_loss/train': 0.7451476529240608}
12/26/2021 22:19:28 - INFO - codeparrot_training - Step 30072: {'lr': 0.0001727157687575419, 'samples': 15397376, 'steps': 30072, 'batch_loss/train': 0.7476471541449428}
12/26/2021 22:19:39 - INFO - codeparrot_training - Step 30073: {'lr': 0.00017270077789403087, 'samples': 15397888, 'steps': 30073, 'batch_loss/train': 0.7450636811554432}
12/26/2021 22:19:50 - INFO - codeparrot_training - Step 30074: {'lr': 0.00017268578733783239, 'samples': 15398400, 'steps': 30074, 'batch_loss/train': 0.7568363696336746}
12/26/2021 22:20:02 - INFO - codeparrot_training - Step 30075: {'lr': 0.00017267079708900592, 'samples': 15398912, 'steps': 30075, 'batch_loss/train': 0.6913242950104177}
12/26/2021 22:20:13 - INFO - codeparrot_training - Step 30076: {'lr': 0.00017265580714761127, 'samples': 15399424, 'steps': 30076, 'batch_loss/train': 0.8014904021983966}
12/26/2021 22:20:23 - INFO - codeparrot_training - Step 30077: {'lr': 0.0001726408175137079, 'samples': 15399936, 'steps': 30077, 'batch_loss/train': 0.7292408989742398}
12/26/2021 22:20:35 - INFO - codeparrot_training - Step 30078: {'lr': 0.00017262582818735545, 'samples': 15400448, 'steps': 30078, 'batch_loss/train': 0.8398302784189582}
12/26/2021 22:20:46 - INFO - codeparrot_training - Step 30079: {'lr': 0.0001726108391686134, 'samples': 15400960, 'steps': 30079, 'batch_loss/train': 0.6699753239518031}
12/26/2021 22:20:57 - INFO - codeparrot_training - Step 30080: {'lr': 0.0001725958504575415, 'samples': 15401472, 'steps': 30080, 'batch_loss/train': 0.7022940395399928}
12/26/2021 22:21:10 - INFO - codeparrot_training - Step 30081: {'lr': 0.00017258086205419932, 'samples': 15401984, 'steps': 30081, 'batch_loss/train': 0.7528568576090038}
12/26/2021 22:21:21 - INFO - codeparrot_training - Step 30082: {'lr': 0.00017256587395864632, 'samples': 15402496, 'steps': 30082, 'batch_loss/train': 0.8106545330956578}
12/26/2021 22:21:32 - INFO - codeparrot_training - Step 30083: {'lr': 0.00017255088617094212, 'samples': 15403008, 'steps': 30083, 'batch_loss/train': 0.8102855207398534}
12/26/2021 22:21:42 - INFO - codeparrot_training - Step 30084: {'lr': 0.00017253589869114648, 'samples': 15403520, 'steps': 30084, 'batch_loss/train': 0.8678870680741966}
12/26/2021 22:21:55 - INFO - codeparrot_training - Step 30085: {'lr': 0.00017252091151931883, 'samples': 15404032, 'steps': 30085, 'batch_loss/train': 0.7671860898844898}
12/26/2021 22:22:05 - INFO - codeparrot_training - Step 30086: {'lr': 0.0001725059246555187, 'samples': 15404544, 'steps': 30086, 'batch_loss/train': 0.6489957813173532}
12/26/2021 22:22:16 - INFO - codeparrot_training - Step 30087: {'lr': 0.00017249093809980581, 'samples': 15405056, 'steps': 30087, 'batch_loss/train': 0.7567091472446918}
12/26/2021 22:22:30 - INFO - codeparrot_training - Step 30088: {'lr': 0.00017247595185223972, 'samples': 15405568, 'steps': 30088, 'batch_loss/train': 0.7141801677644253}
12/26/2021 22:22:41 - INFO - codeparrot_training - Step 30089: {'lr': 0.0001724609659128799, 'samples': 15406080, 'steps': 30089, 'batch_loss/train': 0.849186778999865}
12/26/2021 22:22:51 - INFO - codeparrot_training - Step 30090: {'lr': 0.000172445980281786, 'samples': 15406592, 'steps': 30090, 'batch_loss/train': 0.7257679365575314}
12/26/2021 22:23:03 - INFO - codeparrot_training - Step 30091: {'lr': 0.00017243099495901768, 'samples': 15407104, 'steps': 30091, 'batch_loss/train': 0.6953356581507251}
12/26/2021 22:23:14 - INFO - codeparrot_training - Step 30092: {'lr': 0.00017241600994463444, 'samples': 15407616, 'steps': 30092, 'batch_loss/train': 0.7381041520275176}
12/26/2021 22:23:25 - INFO - codeparrot_training - Step 30093: {'lr': 0.00017240102523869577, 'samples': 15408128, 'steps': 30093, 'batch_loss/train': 0.7892778916284442}
12/26/2021 22:23:35 - INFO - codeparrot_training - Step 30094: {'lr': 0.0001723860408412613, 'samples': 15408640, 'steps': 30094, 'batch_loss/train': 0.7443556408397853}
12/26/2021 22:23:47 - INFO - codeparrot_training - Step 30095: {'lr': 0.00017237105675239073, 'samples': 15409152, 'steps': 30095, 'batch_loss/train': 0.7885005749994889}
12/26/2021 22:23:58 - INFO - codeparrot_training - Step 30096: {'lr': 0.00017235607297214344, 'samples': 15409664, 'steps': 30096, 'batch_loss/train': 0.6891096513718367}
12/26/2021 22:24:09 - INFO - codeparrot_training - Step 30097: {'lr': 0.00017234108950057918, 'samples': 15410176, 'steps': 30097, 'batch_loss/train': 0.7976120794191957}
12/26/2021 22:24:22 - INFO - codeparrot_training - Step 30098: {'lr': 0.0001723261063377573, 'samples': 15410688, 'steps': 30098, 'batch_loss/train': 0.8144163526594639}
12/26/2021 22:24:33 - INFO - codeparrot_training - Step 30099: {'lr': 0.00017231112348373756, 'samples': 15411200, 'steps': 30099, 'batch_loss/train': 0.7131247983779758}
12/26/2021 22:24:43 - INFO - codeparrot_training - Step 30100: {'lr': 0.00017229614093857952, 'samples': 15411712, 'steps': 30100, 'batch_loss/train': 0.6972510032355785}
12/26/2021 22:24:56 - INFO - codeparrot_training - Step 30101: {'lr': 0.00017228115870234257, 'samples': 15412224, 'steps': 30101, 'batch_loss/train': 0.7893682749127038}
12/26/2021 22:25:06 - INFO - codeparrot_training - Step 30102: {'lr': 0.00017226617677508648, 'samples': 15412736, 'steps': 30102, 'batch_loss/train': 0.7767848498187959}
12/26/2021 22:25:17 - INFO - codeparrot_training - Step 30103: {'lr': 0.00017225119515687065, 'samples': 15413248, 'steps': 30103, 'batch_loss/train': 0.7849394087679684}
12/26/2021 22:25:29 - INFO - codeparrot_training - Step 30104: {'lr': 0.00017223621384775478, 'samples': 15413760, 'steps': 30104, 'batch_loss/train': 0.837081694509834}
12/26/2021 22:25:40 - INFO - codeparrot_training - Step 30105: {'lr': 0.00017222123284779827, 'samples': 15414272, 'steps': 30105, 'batch_loss/train': 0.8439938924275339}
12/26/2021 22:25:50 - INFO - codeparrot_training - Step 30106: {'lr': 0.00017220625215706083, 'samples': 15414784, 'steps': 30106, 'batch_loss/train': 0.7008415329619311}
12/26/2021 22:26:01 - INFO - codeparrot_training - Step 30107: {'lr': 0.000172191271775602, 'samples': 15415296, 'steps': 30107, 'batch_loss/train': 0.7134687842335552}
12/26/2021 22:26:13 - INFO - codeparrot_training - Step 30108: {'lr': 0.00017217629170348123, 'samples': 15415808, 'steps': 30108, 'batch_loss/train': 0.8981254743412137}
12/26/2021 22:26:24 - INFO - codeparrot_training - Step 30109: {'lr': 0.0001721613119407581, 'samples': 15416320, 'steps': 30109, 'batch_loss/train': 0.7550781164318323}
12/26/2021 22:26:34 - INFO - codeparrot_training - Step 30110: {'lr': 0.0001721463324874923, 'samples': 15416832, 'steps': 30110, 'batch_loss/train': 0.8040883932262659}
12/26/2021 22:26:48 - INFO - codeparrot_training - Step 30111: {'lr': 0.0001721313533437433, 'samples': 15417344, 'steps': 30111, 'batch_loss/train': 0.6994348773732781}
12/26/2021 22:26:59 - INFO - codeparrot_training - Step 30112: {'lr': 0.0001721163745095705, 'samples': 15417856, 'steps': 30112, 'batch_loss/train': 0.7171628852374852}
12/26/2021 22:27:09 - INFO - codeparrot_training - Step 30113: {'lr': 0.00017210139598503366, 'samples': 15418368, 'steps': 30113, 'batch_loss/train': 0.8213838301599026}
12/26/2021 22:27:22 - INFO - codeparrot_training - Step 30114: {'lr': 0.0001720864177701923, 'samples': 15418880, 'steps': 30114, 'batch_loss/train': 0.793201123829931}
12/26/2021 22:27:32 - INFO - codeparrot_training - Step 30115: {'lr': 0.00017207143986510586, 'samples': 15419392, 'steps': 30115, 'batch_loss/train': 0.7220549727790058}
12/26/2021 22:27:43 - INFO - codeparrot_training - Step 30116: {'lr': 0.00017205646226983395, 'samples': 15419904, 'steps': 30116, 'batch_loss/train': 0.7124692564830184}
12/26/2021 22:27:54 - INFO - codeparrot_training - Step 30117: {'lr': 0.00017204148498443616, 'samples': 15420416, 'steps': 30117, 'batch_loss/train': 0.7452107379212976}
12/26/2021 22:28:08 - INFO - codeparrot_training - Step 30118: {'lr': 0.00017202650800897198, 'samples': 15420928, 'steps': 30118, 'batch_loss/train': 0.7143907658755779}
12/26/2021 22:28:19 - INFO - codeparrot_training - Step 30119: {'lr': 0.000172011531343501, 'samples': 15421440, 'steps': 30119, 'batch_loss/train': 1.0864731697365642}
12/26/2021 22:28:29 - INFO - codeparrot_training - Step 30120: {'lr': 0.00017199655498808262, 'samples': 15421952, 'steps': 30120, 'batch_loss/train': 0.7790592920500785}
12/26/2021 22:28:41 - INFO - codeparrot_training - Step 30121: {'lr': 0.00017198157894277662, 'samples': 15422464, 'steps': 30121, 'batch_loss/train': 0.8677251469343901}
12/26/2021 22:28:52 - INFO - codeparrot_training - Step 30122: {'lr': 0.0001719666032076423, 'samples': 15422976, 'steps': 30122, 'batch_loss/train': 0.6991434679366648}
12/26/2021 22:29:03 - INFO - codeparrot_training - Step 30123: {'lr': 0.00017195162778273925, 'samples': 15423488, 'steps': 30123, 'batch_loss/train': 0.6238324551377445}
12/26/2021 22:29:15 - INFO - codeparrot_training - Step 30124: {'lr': 0.0001719366526681272, 'samples': 15424000, 'steps': 30124, 'batch_loss/train': 0.7442104904912412}
12/26/2021 22:29:25 - INFO - codeparrot_training - Step 30125: {'lr': 0.00017192167786386544, 'samples': 15424512, 'steps': 30125, 'batch_loss/train': 0.7090409314259887}
12/26/2021 22:29:36 - INFO - codeparrot_training - Step 30126: {'lr': 0.0001719067033700137, 'samples': 15425024, 'steps': 30126, 'batch_loss/train': 0.7438466749154031}
12/26/2021 22:29:47 - INFO - codeparrot_training - Step 30127: {'lr': 0.00017189172918663132, 'samples': 15425536, 'steps': 30127, 'batch_loss/train': 0.7487726705148816}
12/26/2021 22:30:01 - INFO - codeparrot_training - Step 30128: {'lr': 0.000171876755313778, 'samples': 15426048, 'steps': 30128, 'batch_loss/train': 0.7240990828722715}
12/26/2021 22:30:11 - INFO - codeparrot_training - Step 30129: {'lr': 0.00017186178175151317, 'samples': 15426560, 'steps': 30129, 'batch_loss/train': 0.8044574223458767}
12/26/2021 22:30:22 - INFO - codeparrot_training - Step 30130: {'lr': 0.00017184680849989647, 'samples': 15427072, 'steps': 30130, 'batch_loss/train': 0.7138062501326203}
12/26/2021 22:30:34 - INFO - codeparrot_training - Step 30131: {'lr': 0.00017183183555898724, 'samples': 15427584, 'steps': 30131, 'batch_loss/train': 0.7305234321393073}
12/26/2021 22:30:45 - INFO - codeparrot_training - Step 30132: {'lr': 0.00017181686292884514, 'samples': 15428096, 'steps': 30132, 'batch_loss/train': 0.6973096441943198}
12/26/2021 22:30:55 - INFO - codeparrot_training - Step 30133: {'lr': 0.00017180189060952972, 'samples': 15428608, 'steps': 30133, 'batch_loss/train': 0.7673983685672283}
12/26/2021 22:31:09 - INFO - codeparrot_training - Step 30134: {'lr': 0.00017178691860110043, 'samples': 15429120, 'steps': 30134, 'batch_loss/train': 0.6476702867075801}
12/26/2021 22:31:20 - INFO - codeparrot_training - Step 30135: {'lr': 0.00017177194690361675, 'samples': 15429632, 'steps': 30135, 'batch_loss/train': 0.7682532276958227}
12/26/2021 22:31:30 - INFO - codeparrot_training - Step 30136: {'lr': 0.0001717569755171384, 'samples': 15430144, 'steps': 30136, 'batch_loss/train': 0.8033249387517571}
12/26/2021 22:31:41 - INFO - codeparrot_training - Step 30137: {'lr': 0.00017174200444172478, 'samples': 15430656, 'steps': 30137, 'batch_loss/train': 0.8136797357583418}
12/26/2021 22:31:53 - INFO - codeparrot_training - Step 30138: {'lr': 0.00017172703367743526, 'samples': 15431168, 'steps': 30138, 'batch_loss/train': 0.7859202641993761}
12/26/2021 22:32:04 - INFO - codeparrot_training - Step 30139: {'lr': 0.00017171206322432955, 'samples': 15431680, 'steps': 30139, 'batch_loss/train': 0.7577421879395843}
12/26/2021 22:32:14 - INFO - codeparrot_training - Step 30140: {'lr': 0.0001716970930824672, 'samples': 15432192, 'steps': 30140, 'batch_loss/train': 0.8617171067744493}
12/26/2021 22:32:26 - INFO - codeparrot_training - Step 30141: {'lr': 0.00017168212325190758, 'samples': 15432704, 'steps': 30141, 'batch_loss/train': 1.3861628780141473}
12/26/2021 22:32:37 - INFO - codeparrot_training - Step 30142: {'lr': 0.00017166715373271018, 'samples': 15433216, 'steps': 30142, 'batch_loss/train': 0.7858304166002199}
12/26/2021 22:32:48 - INFO - codeparrot_training - Step 30143: {'lr': 0.00017165218452493475, 'samples': 15433728, 'steps': 30143, 'batch_loss/train': 0.731003588414751}
12/26/2021 22:33:00 - INFO - codeparrot_training - Step 30144: {'lr': 0.00017163721562864058, 'samples': 15434240, 'steps': 30144, 'batch_loss/train': 0.7497860310832039}
12/26/2021 22:33:10 - INFO - codeparrot_training - Step 30145: {'lr': 0.0001716222470438873, 'samples': 15434752, 'steps': 30145, 'batch_loss/train': 0.8268007491715252}
12/26/2021 22:33:21 - INFO - codeparrot_training - Step 30146: {'lr': 0.00017160727877073428, 'samples': 15435264, 'steps': 30146, 'batch_loss/train': 0.7636476019397378}
12/26/2021 22:33:32 - INFO - codeparrot_training - Step 30147: {'lr': 0.0001715923108092412, 'samples': 15435776, 'steps': 30147, 'batch_loss/train': 0.7605560589581728}
12/26/2021 22:33:46 - INFO - codeparrot_training - Step 30148: {'lr': 0.00017157734315946746, 'samples': 15436288, 'steps': 30148, 'batch_loss/train': 0.7837179200723767}
12/26/2021 22:33:56 - INFO - codeparrot_training - Step 30149: {'lr': 0.00017156237582147252, 'samples': 15436800, 'steps': 30149, 'batch_loss/train': 0.8282950611319393}
12/26/2021 22:34:07 - INFO - codeparrot_training - Step 30150: {'lr': 0.00017154740879531607, 'samples': 15437312, 'steps': 30150, 'batch_loss/train': 0.7807195065543056}
12/26/2021 22:34:19 - INFO - codeparrot_training - Step 30151: {'lr': 0.00017153244208105745, 'samples': 15437824, 'steps': 30151, 'batch_loss/train': 0.7551262527704239}
12/26/2021 22:34:30 - INFO - codeparrot_training - Step 30152: {'lr': 0.00017151747567875625, 'samples': 15438336, 'steps': 30152, 'batch_loss/train': 0.7440205405000597}
12/26/2021 22:34:40 - INFO - codeparrot_training - Step 30153: {'lr': 0.00017150250958847187, 'samples': 15438848, 'steps': 30153, 'batch_loss/train': 0.7390148888807744}
12/26/2021 22:34:52 - INFO - codeparrot_training - Step 30154: {'lr': 0.0001714875438102639, 'samples': 15439360, 'steps': 30154, 'batch_loss/train': 0.7206031922250986}
12/26/2021 22:35:03 - INFO - codeparrot_training - Step 30155: {'lr': 0.00017147257834419183, 'samples': 15439872, 'steps': 30155, 'batch_loss/train': 0.8251425996422768}
12/26/2021 22:35:14 - INFO - codeparrot_training - Step 30156: {'lr': 0.0001714576131903152, 'samples': 15440384, 'steps': 30156, 'batch_loss/train': 0.7348489970900118}
12/26/2021 22:35:27 - INFO - codeparrot_training - Step 30157: {'lr': 0.00017144264834869328, 'samples': 15440896, 'steps': 30157, 'batch_loss/train': 0.7628682423382998}
12/26/2021 22:35:38 - INFO - codeparrot_training - Step 30158: {'lr': 0.00017142768381938584, 'samples': 15441408, 'steps': 30158, 'batch_loss/train': 0.8132486666436307}
12/26/2021 22:35:49 - INFO - codeparrot_training - Step 30159: {'lr': 0.00017141271960245224, 'samples': 15441920, 'steps': 30159, 'batch_loss/train': 0.7918633022345603}
12/26/2021 22:35:59 - INFO - codeparrot_training - Step 30160: {'lr': 0.00017139775569795197, 'samples': 15442432, 'steps': 30160, 'batch_loss/train': 0.6538411750807427}
12/26/2021 22:36:11 - INFO - codeparrot_training - Step 30161: {'lr': 0.00017138279210594454, 'samples': 15442944, 'steps': 30161, 'batch_loss/train': 0.7439510705880821}
12/26/2021 22:36:22 - INFO - codeparrot_training - Step 30162: {'lr': 0.0001713678288264895, 'samples': 15443456, 'steps': 30162, 'batch_loss/train': 0.7983887683367357}
12/26/2021 22:36:33 - INFO - codeparrot_training - Step 30163: {'lr': 0.00017135286585964627, 'samples': 15443968, 'steps': 30163, 'batch_loss/train': 0.829889852553606}
12/26/2021 22:36:45 - INFO - codeparrot_training - Step 30164: {'lr': 0.00017133790320547425, 'samples': 15444480, 'steps': 30164, 'batch_loss/train': 0.8182187136262655}
12/26/2021 22:36:55 - INFO - codeparrot_training - Step 30165: {'lr': 0.00017132294086403315, 'samples': 15444992, 'steps': 30165, 'batch_loss/train': 0.8457002304494381}
12/26/2021 22:37:06 - INFO - codeparrot_training - Step 30166: {'lr': 0.00017130797883538231, 'samples': 15445504, 'steps': 30166, 'batch_loss/train': 0.788476224988699}
12/26/2021 22:37:20 - INFO - codeparrot_training - Step 30167: {'lr': 0.00017129301711958124, 'samples': 15446016, 'steps': 30167, 'batch_loss/train': 0.7380972653627396}
12/26/2021 22:37:31 - INFO - codeparrot_training - Step 30168: {'lr': 0.00017127805571668931, 'samples': 15446528, 'steps': 30168, 'batch_loss/train': 0.6068593673408031}
12/26/2021 22:37:41 - INFO - codeparrot_training - Step 30169: {'lr': 0.0001712630946267662, 'samples': 15447040, 'steps': 30169, 'batch_loss/train': 0.7379289530217648}
12/26/2021 22:37:52 - INFO - codeparrot_training - Step 30170: {'lr': 0.00017124813384987125, 'samples': 15447552, 'steps': 30170, 'batch_loss/train': 0.7468526707962155}
12/26/2021 22:38:04 - INFO - codeparrot_training - Step 30171: {'lr': 0.0001712331733860641, 'samples': 15448064, 'steps': 30171, 'batch_loss/train': 0.7052089986391366}
12/26/2021 22:38:15 - INFO - codeparrot_training - Step 30172: {'lr': 0.00017121821323540394, 'samples': 15448576, 'steps': 30172, 'batch_loss/train': 0.7874707216396928}
12/26/2021 22:38:25 - INFO - codeparrot_training - Step 30173: {'lr': 0.00017120325339795052, 'samples': 15449088, 'steps': 30173, 'batch_loss/train': 0.8724192650988698}
12/26/2021 22:38:39 - INFO - codeparrot_training - Step 30174: {'lr': 0.00017118829387376323, 'samples': 15449600, 'steps': 30174, 'batch_loss/train': 0.7766593573614955}
12/26/2021 22:38:50 - INFO - codeparrot_training - Step 30175: {'lr': 0.00017117333466290142, 'samples': 15450112, 'steps': 30175, 'batch_loss/train': 0.8448414076119661}
12/26/2021 22:39:01 - INFO - codeparrot_training - Step 30176: {'lr': 0.00017115837576542475, 'samples': 15450624, 'steps': 30176, 'batch_loss/train': 0.7914546653628349}
12/26/2021 22:39:13 - INFO - codeparrot_training - Step 30177: {'lr': 0.00017114341718139258, 'samples': 15451136, 'steps': 30177, 'batch_loss/train': 0.7464643048588187}
12/26/2021 22:39:23 - INFO - codeparrot_training - Step 30178: {'lr': 0.0001711284589108645, 'samples': 15451648, 'steps': 30178, 'batch_loss/train': 0.8491006214171648}
12/26/2021 22:39:34 - INFO - codeparrot_training - Step 30179: {'lr': 0.00017111350095389973, 'samples': 15452160, 'steps': 30179, 'batch_loss/train': 0.7780843521468341}
12/26/2021 22:39:46 - INFO - codeparrot_training - Step 30180: {'lr': 0.00017109854331055795, 'samples': 15452672, 'steps': 30180, 'batch_loss/train': 0.8505377480760217}
12/26/2021 22:39:57 - INFO - codeparrot_training - Step 30181: {'lr': 0.00017108358598089864, 'samples': 15453184, 'steps': 30181, 'batch_loss/train': 0.6658239359967411}
12/26/2021 22:40:07 - INFO - codeparrot_training - Step 30182: {'lr': 0.0001710686289649812, 'samples': 15453696, 'steps': 30182, 'batch_loss/train': 0.8088411800563335}
12/26/2021 22:40:18 - INFO - codeparrot_training - Step 30183: {'lr': 0.00017105367226286495, 'samples': 15454208, 'steps': 30183, 'batch_loss/train': 0.6911014914512634}
12/26/2021 22:40:30 - INFO - codeparrot_training - Step 30184: {'lr': 0.00017103871587460958, 'samples': 15454720, 'steps': 30184, 'batch_loss/train': 0.7968331421725452}
12/26/2021 22:40:41 - INFO - codeparrot_training - Step 30185: {'lr': 0.00017102375980027447, 'samples': 15455232, 'steps': 30185, 'batch_loss/train': 0.774537842720747}
12/26/2021 22:40:51 - INFO - codeparrot_training - Step 30186: {'lr': 0.00017100880403991904, 'samples': 15455744, 'steps': 30186, 'batch_loss/train': 0.7611840171739459}
12/26/2021 22:41:05 - INFO - codeparrot_training - Step 30187: {'lr': 0.00017099384859360276, 'samples': 15456256, 'steps': 30187, 'batch_loss/train': 0.5353598645306192}
12/26/2021 22:41:16 - INFO - codeparrot_training - Step 30188: {'lr': 0.0001709788934613852, 'samples': 15456768, 'steps': 30188, 'batch_loss/train': 0.613200630992651}
12/26/2021 22:41:27 - INFO - codeparrot_training - Step 30189: {'lr': 0.00017096393864332566, 'samples': 15457280, 'steps': 30189, 'batch_loss/train': 0.7450653046835214}
12/26/2021 22:41:39 - INFO - codeparrot_training - Step 30190: {'lr': 0.00017094898413948356, 'samples': 15457792, 'steps': 30190, 'batch_loss/train': 0.8485537096858025}
12/26/2021 22:41:49 - INFO - codeparrot_training - Step 30191: {'lr': 0.0001709340299499186, 'samples': 15458304, 'steps': 30191, 'batch_loss/train': 0.7984926765784621}
12/26/2021 22:42:00 - INFO - codeparrot_training - Step 30192: {'lr': 0.0001709190760746901, 'samples': 15458816, 'steps': 30192, 'batch_loss/train': 0.7859768802300096}
12/26/2021 22:42:11 - INFO - codeparrot_training - Step 30193: {'lr': 0.0001709041225138574, 'samples': 15459328, 'steps': 30193, 'batch_loss/train': 0.8240900293458253}
12/26/2021 22:42:23 - INFO - codeparrot_training - Step 30194: {'lr': 0.00017088916926748, 'samples': 15459840, 'steps': 30194, 'batch_loss/train': 0.752422631951049}
12/26/2021 22:42:33 - INFO - codeparrot_training - Step 30195: {'lr': 0.0001708742163356175, 'samples': 15460352, 'steps': 30195, 'batch_loss/train': 0.6084944438189268}
12/26/2021 22:42:44 - INFO - codeparrot_training - Step 30196: {'lr': 0.00017085926371832922, 'samples': 15460864, 'steps': 30196, 'batch_loss/train': 0.7535654362291098}
12/26/2021 22:42:57 - INFO - codeparrot_training - Step 30197: {'lr': 0.00017084431141567452, 'samples': 15461376, 'steps': 30197, 'batch_loss/train': 0.7973830096889287}
12/26/2021 22:43:08 - INFO - codeparrot_training - Step 30198: {'lr': 
0.0001708293594277131, 'samples': 15461888, 'steps': 30198, 'batch_loss/train': 0.7898166389204562} 12/26/2021 22:43:18 - INFO - codeparrot_training - Step 30199: {'lr': 0.0001708144077545042, 'samples': 15462400, 'steps': 30199, 'batch_loss/train': 0.6952607102575712} 12/26/2021 22:43:30 - INFO - codeparrot_training - Step 30200: {'lr': 0.00017079945639610734, 'samples': 15462912, 'steps': 30200, 'batch_loss/train': 0.7315352740697563} 12/26/2021 22:43:41 - INFO - codeparrot_training - Step 30201: {'lr': 0.00017078450535258183, 'samples': 15463424, 'steps': 30201, 'batch_loss/train': 0.7384460099274293} 12/26/2021 22:43:52 - INFO - codeparrot_training - Step 30202: {'lr': 0.00017076955462398734, 'samples': 15463936, 'steps': 30202, 'batch_loss/train': 0.780335180927068} 12/26/2021 22:44:05 - INFO - codeparrot_training - Step 30203: {'lr': 0.0001707546042103831, 'samples': 15464448, 'steps': 30203, 'batch_loss/train': 0.6779464709106833} 12/26/2021 22:44:16 - INFO - codeparrot_training - Step 30204: {'lr': 0.00017073965411182874, 'samples': 15464960, 'steps': 30204, 'batch_loss/train': 0.8255715789273381} 12/26/2021 22:44:26 - INFO - codeparrot_training - Step 30205: {'lr': 0.00017072470432838343, 'samples': 15465472, 'steps': 30205, 'batch_loss/train': 0.8533568065613508} 12/26/2021 22:44:37 - INFO - codeparrot_training - Step 30206: {'lr': 0.0001707097548601068, 'samples': 15465984, 'steps': 30206, 'batch_loss/train': 0.7405768660828471} 12/26/2021 22:44:49 - INFO - codeparrot_training - Step 30207: {'lr': 0.00017069480570705835, 'samples': 15466496, 'steps': 30207, 'batch_loss/train': 0.811709119938314} 12/26/2021 22:45:00 - INFO - codeparrot_training - Step 30208: {'lr': 0.00017067985686929736, 'samples': 15467008, 'steps': 30208, 'batch_loss/train': 0.7277418617159128} 12/26/2021 22:45:10 - INFO - codeparrot_training - Step 30209: {'lr': 0.00017066490834688326, 'samples': 15467520, 'steps': 30209, 'batch_loss/train': 0.7405705503188074} 12/26/2021 22:45:22 - 
INFO - codeparrot_training - Step 30210: {'lr': 0.00017064996013987558, 'samples': 15468032, 'steps': 30210, 'batch_loss/train': 0.8084484711289406} 12/26/2021 22:45:33 - INFO - codeparrot_training - Step 30211: {'lr': 0.00017063501224833373, 'samples': 15468544, 'steps': 30211, 'batch_loss/train': 0.8048828365281224} 12/26/2021 22:45:44 - INFO - codeparrot_training - Step 30212: {'lr': 0.000170620064672317, 'samples': 15469056, 'steps': 30212, 'batch_loss/train': 0.7058506393805146} 12/26/2021 22:45:58 - INFO - codeparrot_training - Step 30213: {'lr': 0.00017060511741188495, 'samples': 15469568, 'steps': 30213, 'batch_loss/train': 0.8646960565820336} 12/26/2021 22:46:08 - INFO - codeparrot_training - Step 30214: {'lr': 0.00017059017046709707, 'samples': 15470080, 'steps': 30214, 'batch_loss/train': 0.5803473198320717} 12/26/2021 22:46:19 - INFO - codeparrot_training - Step 30215: {'lr': 0.00017057522383801262, 'samples': 15470592, 'steps': 30215, 'batch_loss/train': 0.5834137692581862} 12/26/2021 22:46:29 - INFO - codeparrot_training - Step 30216: {'lr': 0.000170560277524691, 'samples': 15471104, 'steps': 30216, 'batch_loss/train': 0.7416657693684101} 12/26/2021 22:46:41 - INFO - codeparrot_training - Step 30217: {'lr': 0.00017054533152719188, 'samples': 15471616, 'steps': 30217, 'batch_loss/train': 0.6904004840180278} 12/26/2021 22:46:52 - INFO - codeparrot_training - Step 30218: {'lr': 0.0001705303858455745, 'samples': 15472128, 'steps': 30218, 'batch_loss/train': 0.7197460364550352} 12/26/2021 22:47:03 - INFO - codeparrot_training - Step 30219: {'lr': 0.00017051544047989823, 'samples': 15472640, 'steps': 30219, 'batch_loss/train': 0.8537957901135087} 12/26/2021 22:47:15 - INFO - codeparrot_training - Step 30220: {'lr': 0.0001705004954302225, 'samples': 15473152, 'steps': 30220, 'batch_loss/train': 0.6804219400510192} 12/26/2021 22:47:26 - INFO - codeparrot_training - Step 30221: {'lr': 0.00017048555069660692, 'samples': 15473664, 'steps': 30221, 
'batch_loss/train': 0.7659509279765189} 12/26/2021 22:47:36 - INFO - codeparrot_training - Step 30222: {'lr': 0.0001704706062791107, 'samples': 15474176, 'steps': 30222, 'batch_loss/train': 0.7897807424888015} 12/26/2021 22:47:50 - INFO - codeparrot_training - Step 30223: {'lr': 0.00017045566217779325, 'samples': 15474688, 'steps': 30223, 'batch_loss/train': 0.7224198309704661} 12/26/2021 22:48:01 - INFO - codeparrot_training - Step 30224: {'lr': 0.0001704407183927142, 'samples': 15475200, 'steps': 30224, 'batch_loss/train': 0.8045733901672065} 12/26/2021 22:48:11 - INFO - codeparrot_training - Step 30225: {'lr': 0.0001704257749239327, 'samples': 15475712, 'steps': 30225, 'batch_loss/train': 0.6225101761519909} 12/26/2021 22:48:22 - INFO - codeparrot_training - Step 30226: {'lr': 0.00017041083177150838, 'samples': 15476224, 'steps': 30226, 'batch_loss/train': 0.7898276168853045} 12/26/2021 22:48:34 - INFO - codeparrot_training - Step 30227: {'lr': 0.0001703958889355004, 'samples': 15476736, 'steps': 30227, 'batch_loss/train': 0.7054000068455935} 12/26/2021 22:48:45 - INFO - codeparrot_training - Step 30228: {'lr': 0.00017038094641596847, 'samples': 15477248, 'steps': 30228, 'batch_loss/train': 0.5856531300232746} 12/26/2021 22:48:55 - INFO - codeparrot_training - Step 30229: {'lr': 0.00017036600421297172, 'samples': 15477760, 'steps': 30229, 'batch_loss/train': 0.5302227712236345} 12/26/2021 22:49:07 - INFO - codeparrot_training - Step 30230: {'lr': 0.00017035106232656972, 'samples': 15478272, 'steps': 30230, 'batch_loss/train': 0.8924488946795464} 12/26/2021 22:49:18 - INFO - codeparrot_training - Step 30231: {'lr': 0.00017033612075682177, 'samples': 15478784, 'steps': 30231, 'batch_loss/train': 0.7294901590794325} 12/26/2021 22:49:29 - INFO - codeparrot_training - Step 30232: {'lr': 0.00017032117950378735, 'samples': 15479296, 'steps': 30232, 'batch_loss/train': 0.7680020816624165} 12/26/2021 22:49:43 - INFO - codeparrot_training - Step 30233: {'lr': 
0.00017030623856752587, 'samples': 15479808, 'steps': 30233, 'batch_loss/train': 0.7604910479858518} 12/26/2021 22:49:53 - INFO - codeparrot_training - Step 30234: {'lr': 0.0001702912979480966, 'samples': 15480320, 'steps': 30234, 'batch_loss/train': 0.7879368830472231} 12/26/2021 22:50:04 - INFO - codeparrot_training - Step 30235: {'lr': 0.0001702763576455591, 'samples': 15480832, 'steps': 30235, 'batch_loss/train': 0.6841073200048413} 12/26/2021 22:50:15 - INFO - codeparrot_training - Step 30236: {'lr': 0.00017026141765997272, 'samples': 15481344, 'steps': 30236, 'batch_loss/train': 0.7507742196321487} 12/26/2021 22:50:27 - INFO - codeparrot_training - Step 30237: {'lr': 0.00017024647799139685, 'samples': 15481856, 'steps': 30237, 'batch_loss/train': 0.7999563859775662} 12/26/2021 22:50:38 - INFO - codeparrot_training - Step 30238: {'lr': 0.00017023153863989076, 'samples': 15482368, 'steps': 30238, 'batch_loss/train': 0.8174186162650585} 12/26/2021 22:50:48 - INFO - codeparrot_training - Step 30239: {'lr': 0.00017021659960551403, 'samples': 15482880, 'steps': 30239, 'batch_loss/train': 0.5905561039689928} 12/26/2021 22:51:00 - INFO - codeparrot_training - Step 30240: {'lr': 0.000170201660888326, 'samples': 15483392, 'steps': 30240, 'batch_loss/train': 0.6552972113713622} 12/26/2021 22:51:11 - INFO - codeparrot_training - Step 30241: {'lr': 0.00017018672248838603, 'samples': 15483904, 'steps': 30241, 'batch_loss/train': 0.7872807821258903} 12/26/2021 22:51:22 - INFO - codeparrot_training - Step 30242: {'lr': 0.0001701717844057534, 'samples': 15484416, 'steps': 30242, 'batch_loss/train': 0.8112925784662366} 12/26/2021 22:51:36 - INFO - codeparrot_training - Step 30243: {'lr': 0.00017015684664048775, 'samples': 15484928, 'steps': 30243, 'batch_loss/train': 0.6270186617039144} 12/26/2021 22:51:46 - INFO - codeparrot_training - Step 30244: {'lr': 0.00017014190919264826, 'samples': 15485440, 'steps': 30244, 'batch_loss/train': 0.7511256919242442} 12/26/2021 22:51:57 - 
INFO - codeparrot_training - Step 30245: {'lr': 0.00017012697206229436, 'samples': 15485952, 'steps': 30245, 'batch_loss/train': 0.7961480235680938} 12/26/2021 22:52:08 - INFO - codeparrot_training - Step 30246: {'lr': 0.00017011203524948555, 'samples': 15486464, 'steps': 30246, 'batch_loss/train': 0.7098890114575624} 12/26/2021 22:52:20 - INFO - codeparrot_training - Step 30247: {'lr': 0.00017009709875428114, 'samples': 15486976, 'steps': 30247, 'batch_loss/train': 0.42586302175186574} 12/26/2021 22:52:31 - INFO - codeparrot_training - Step 30248: {'lr': 0.00017008216257674043, 'samples': 15487488, 'steps': 30248, 'batch_loss/train': 0.7573311687447131} 12/26/2021 22:52:41 - INFO - codeparrot_training - Step 30249: {'lr': 0.0001700672267169228, 'samples': 15488000, 'steps': 30249, 'batch_loss/train': 0.7416502069681883} 12/26/2021 22:52:53 - INFO - codeparrot_training - Step 30250: {'lr': 0.0001700522911748878, 'samples': 15488512, 'steps': 30250, 'batch_loss/train': 0.709998045116663} 12/26/2021 22:53:04 - INFO - codeparrot_training - Step 30251: {'lr': 0.00017003735595069467, 'samples': 15489024, 'steps': 30251, 'batch_loss/train': 0.8271262859925628} 12/26/2021 22:53:15 - INFO - codeparrot_training - Step 30252: {'lr': 0.00017002242104440285, 'samples': 15489536, 'steps': 30252, 'batch_loss/train': 0.7761383070610464} 12/26/2021 22:53:25 - INFO - codeparrot_training - Step 30253: {'lr': 0.00017000748645607162, 'samples': 15490048, 'steps': 30253, 'batch_loss/train': 0.73658312484622} 12/26/2021 22:53:37 - INFO - codeparrot_training - Step 30254: {'lr': 0.00016999255218576044, 'samples': 15490560, 'steps': 30254, 'batch_loss/train': 0.8929464863613248} 12/26/2021 22:53:48 - INFO - codeparrot_training - Step 30255: {'lr': 0.00016997761823352872, 'samples': 15491072, 'steps': 30255, 'batch_loss/train': 0.7805014001205564} 12/26/2021 22:53:59 - INFO - codeparrot_training - Step 30256: {'lr': 0.00016996268459943577, 'samples': 15491584, 'steps': 30256, 
'batch_loss/train': 0.8488701209425926} 12/26/2021 22:54:13 - INFO - codeparrot_training - Step 30257: {'lr': 0.0001699477512835409, 'samples': 15492096, 'steps': 30257, 'batch_loss/train': 0.8698488622903824} 12/26/2021 22:54:23 - INFO - codeparrot_training - Step 30258: {'lr': 0.00016993281828590353, 'samples': 15492608, 'steps': 30258, 'batch_loss/train': 0.7501571774482727} 12/26/2021 22:54:34 - INFO - codeparrot_training - Step 30259: {'lr': 0.00016991788560658314, 'samples': 15493120, 'steps': 30259, 'batch_loss/train': 1.1061141779646277} 12/26/2021 22:54:45 - INFO - codeparrot_training - Step 30260: {'lr': 0.00016990295324563888, 'samples': 15493632, 'steps': 30260, 'batch_loss/train': 0.8007212905213237} 12/26/2021 22:54:57 - INFO - codeparrot_training - Step 30261: {'lr': 0.0001698880212031303, 'samples': 15494144, 'steps': 30261, 'batch_loss/train': 0.7447588900104165} 12/26/2021 22:55:08 - INFO - codeparrot_training - Step 30262: {'lr': 0.00016987308947911673, 'samples': 15494656, 'steps': 30262, 'batch_loss/train': 0.7841519140638411} 12/26/2021 22:55:18 - INFO - codeparrot_training - Step 30263: {'lr': 0.00016985815807365752, 'samples': 15495168, 'steps': 30263, 'batch_loss/train': 0.7313377915415913} 12/26/2021 22:55:32 - INFO - codeparrot_training - Step 30264: {'lr': 0.0001698432269868119, 'samples': 15495680, 'steps': 30264, 'batch_loss/train': 0.7227810202166438} 12/26/2021 22:55:43 - INFO - codeparrot_training - Step 30265: {'lr': 0.0001698282962186394, 'samples': 15496192, 'steps': 30265, 'batch_loss/train': 0.7078095120377839} 12/26/2021 22:55:53 - INFO - codeparrot_training - Step 30266: {'lr': 0.00016981336576919937, 'samples': 15496704, 'steps': 30266, 'batch_loss/train': 0.7943510878831148} 12/26/2021 22:56:05 - INFO - codeparrot_training - Step 30267: {'lr': 0.0001697984356385511, 'samples': 15497216, 'steps': 30267, 'batch_loss/train': 0.7222013265127316} 12/26/2021 22:56:16 - INFO - codeparrot_training - Step 30268: {'lr': 
0.00016978350582675384, 'samples': 15497728, 'steps': 30268, 'batch_loss/train': 0.800003373529762} 12/26/2021 22:56:27 - INFO - codeparrot_training - Step 30269: {'lr': 0.00016976857633386723, 'samples': 15498240, 'steps': 30269, 'batch_loss/train': 0.751267846673727} 12/26/2021 22:56:39 - INFO - codeparrot_training - Step 30270: {'lr': 0.00016975364715995038, 'samples': 15498752, 'steps': 30270, 'batch_loss/train': 0.7717712474986911} 12/26/2021 22:56:49 - INFO - codeparrot_training - Step 30271: {'lr': 0.00016973871830506274, 'samples': 15499264, 'steps': 30271, 'batch_loss/train': 0.8753399020060897} 12/26/2021 22:57:00 - INFO - codeparrot_training - Step 30272: {'lr': 0.00016972378976926373, 'samples': 15499776, 'steps': 30272, 'batch_loss/train': 0.784490155056119} 12/26/2021 22:57:11 - INFO - codeparrot_training - Step 30273: {'lr': 0.00016970886155261259, 'samples': 15500288, 'steps': 30273, 'batch_loss/train': 0.8296043896116316} 12/26/2021 22:57:25 - INFO - codeparrot_training - Step 30274: {'lr': 0.00016969393365516867, 'samples': 15500800, 'steps': 30274, 'batch_loss/train': 0.7989545548334718} 12/26/2021 22:57:35 - INFO - codeparrot_training - Step 30275: {'lr': 0.00016967900607699126, 'samples': 15501312, 'steps': 30275, 'batch_loss/train': 0.7553271669894457} 12/26/2021 22:57:46 - INFO - codeparrot_training - Step 30276: {'lr': 0.00016966407881813996, 'samples': 15501824, 'steps': 30276, 'batch_loss/train': 0.7328490964137018} 12/26/2021 22:57:58 - INFO - codeparrot_training - Step 30277: {'lr': 0.00016964915187867385, 'samples': 15502336, 'steps': 30277, 'batch_loss/train': 0.7438050553901121} 12/26/2021 22:58:08 - INFO - codeparrot_training - Step 30278: {'lr': 0.00016963422525865247, 'samples': 15502848, 'steps': 30278, 'batch_loss/train': 0.7820229983190075} 12/26/2021 22:58:19 - INFO - codeparrot_training - Step 30279: {'lr': 0.00016961929895813492, 'samples': 15503360, 'steps': 30279, 'batch_loss/train': 0.7693208917044103} 12/26/2021 22:58:33 
- INFO - codeparrot_training - Step 30280: {'lr': 0.00016960437297718073, 'samples': 15503872, 'steps': 30280, 'batch_loss/train': 0.719751438125968} 12/26/2021 22:58:44 - INFO - codeparrot_training - Step 30281: {'lr': 0.0001695894473158493, 'samples': 15504384, 'steps': 30281, 'batch_loss/train': 0.8101354754762724} 12/26/2021 22:58:54 - INFO - codeparrot_training - Step 30282: {'lr': 0.0001695745219741997, 'samples': 15504896, 'steps': 30282, 'batch_loss/train': 0.9863415732979774} 12/26/2021 22:59:05 - INFO - codeparrot_training - Step 30283: {'lr': 0.0001695595969522916, 'samples': 15505408, 'steps': 30283, 'batch_loss/train': 0.8348743710666895} 12/26/2021 22:59:17 - INFO - codeparrot_training - Step 30284: {'lr': 0.00016954467225018406, 'samples': 15505920, 'steps': 30284, 'batch_loss/train': 0.5992396213114262} 12/26/2021 22:59:28 - INFO - codeparrot_training - Step 30285: {'lr': 0.00016952974786793663, 'samples': 15506432, 'steps': 30285, 'batch_loss/train': 0.7405585134401917} 12/26/2021 22:59:39 - INFO - codeparrot_training - Step 30286: {'lr': 0.0001695148238056084, 'samples': 15506944, 'steps': 30286, 'batch_loss/train': 0.7648942917585373} 12/26/2021 22:59:52 - INFO - codeparrot_training - Step 30287: {'lr': 0.00016949990006325888, 'samples': 15507456, 'steps': 30287, 'batch_loss/train': 0.7963621281087399} 12/26/2021 23:00:02 - INFO - codeparrot_training - Step 30288: {'lr': 0.00016948497664094741, 'samples': 15507968, 'steps': 30288, 'batch_loss/train': 0.8158303452655673} 12/26/2021 23:00:13 - INFO - codeparrot_training - Step 30289: {'lr': 0.00016947005353873328, 'samples': 15508480, 'steps': 30289, 'batch_loss/train': 0.7588244881480932} 12/26/2021 23:00:24 - INFO - codeparrot_training - Step 30290: {'lr': 0.00016945513075667572, 'samples': 15508992, 'steps': 30290, 'batch_loss/train': 1.328205400146544} 12/26/2021 23:00:36 - INFO - codeparrot_training - Step 30291: {'lr': 0.00016944020829483427, 'samples': 15509504, 'steps': 30291, 
'batch_loss/train': 0.7488730769837275} 12/26/2021 23:00:46 - INFO - codeparrot_training - Step 30292: {'lr': 0.00016942528615326808, 'samples': 15510016, 'steps': 30292, 'batch_loss/train': 0.7088473225012422} 12/26/2021 23:00:57 - INFO - codeparrot_training - Step 30293: {'lr': 0.0001694103643320365, 'samples': 15510528, 'steps': 30293, 'batch_loss/train': 0.6422260177787393} 12/26/2021 23:01:10 - INFO - codeparrot_training - Step 30294: {'lr': 0.00016939544283119885, 'samples': 15511040, 'steps': 30294, 'batch_loss/train': 0.7294866153970361} 12/26/2021 23:01:21 - INFO - codeparrot_training - Step 30295: {'lr': 0.00016938052165081457, 'samples': 15511552, 'steps': 30295, 'batch_loss/train': 0.8018245710991323} 12/26/2021 23:01:31 - INFO - codeparrot_training - Step 30296: {'lr': 0.00016936560079094286, 'samples': 15512064, 'steps': 30296, 'batch_loss/train': 0.6985211223363876} 12/26/2021 23:01:43 - INFO - codeparrot_training - Step 30297: {'lr': 0.00016935068025164302, 'samples': 15512576, 'steps': 30297, 'batch_loss/train': 0.8240173105150461} 12/26/2021 23:01:54 - INFO - codeparrot_training - Step 30298: {'lr': 0.00016933576003297452, 'samples': 15513088, 'steps': 30298, 'batch_loss/train': 0.7196063422597945} 12/26/2021 23:02:05 - INFO - codeparrot_training - Step 30299: {'lr': 0.00016932084013499655, 'samples': 15513600, 'steps': 30299, 'batch_loss/train': 0.6981802491936833} 12/26/2021 23:02:15 - INFO - codeparrot_training - Step 30300: {'lr': 0.0001693059205577685, 'samples': 15514112, 'steps': 30300, 'batch_loss/train': 0.6931999516673386} 12/26/2021 23:02:27 - INFO - codeparrot_training - Step 30301: {'lr': 0.0001692910013013495, 'samples': 15514624, 'steps': 30301, 'batch_loss/train': 0.7850580052472651} 12/26/2021 23:02:38 - INFO - codeparrot_training - Step 30302: {'lr': 0.00016927608236579916, 'samples': 15515136, 'steps': 30302, 'batch_loss/train': 0.7290357548044994} 12/26/2021 23:02:49 - INFO - codeparrot_training - Step 30303: {'lr': 
0.00016926116375117655, 'samples': 15515648, 'steps': 30303, 'batch_loss/train': 0.7403982819523662} 12/26/2021 23:03:03 - INFO - codeparrot_training - Step 30304: {'lr': 0.00016924624545754113, 'samples': 15516160, 'steps': 30304, 'batch_loss/train': 0.6724852081388235} 12/26/2021 23:03:13 - INFO - codeparrot_training - Step 30305: {'lr': 0.00016923132748495205, 'samples': 15516672, 'steps': 30305, 'batch_loss/train': 0.8020163818728179} 12/26/2021 23:03:24 - INFO - codeparrot_training - Step 30306: {'lr': 0.00016921640983346876, 'samples': 15517184, 'steps': 30306, 'batch_loss/train': 0.777906057657674} 12/26/2021 23:03:36 - INFO - codeparrot_training - Step 30307: {'lr': 0.00016920149250315055, 'samples': 15517696, 'steps': 30307, 'batch_loss/train': 0.85644262842834} 12/26/2021 23:03:47 - INFO - codeparrot_training - Step 30308: {'lr': 0.00016918657549405664, 'samples': 15518208, 'steps': 30308, 'batch_loss/train': 0.7868274850770831} 12/26/2021 23:03:57 - INFO - codeparrot_training - Step 30309: {'lr': 0.00016917165880624645, 'samples': 15518720, 'steps': 30309, 'batch_loss/train': 0.737827654927969} 12/26/2021 23:04:08 - INFO - codeparrot_training - Step 30310: {'lr': 0.00016915674243977918, 'samples': 15519232, 'steps': 30310, 'batch_loss/train': 0.7763118771836162} 12/26/2021 23:04:22 - INFO - codeparrot_training - Step 30311: {'lr': 0.00016914182639471426, 'samples': 15519744, 'steps': 30311, 'batch_loss/train': 0.8282372038811445} 12/26/2021 23:04:33 - INFO - codeparrot_training - Step 30312: {'lr': 0.0001691269106711108, 'samples': 15520256, 'steps': 30312, 'batch_loss/train': 0.7324341060593724} 12/26/2021 23:04:43 - INFO - codeparrot_training - Step 30313: {'lr': 0.00016911199526902824, 'samples': 15520768, 'steps': 30313, 'batch_loss/train': 0.8489788360893726} 12/26/2021 23:04:55 - INFO - codeparrot_training - Step 30314: {'lr': 0.0001690970801885259, 'samples': 15521280, 'steps': 30314, 'batch_loss/train': 0.8108050134032965} 12/26/2021 23:05:06 - 
INFO - codeparrot_training - Step 30315: {'lr': 0.00016908216542966297, 'samples': 15521792, 'steps': 30315, 'batch_loss/train': 0.776255832053721} 12/26/2021 23:05:17 - INFO - codeparrot_training - Step 30316: {'lr': 0.00016906725099249874, 'samples': 15522304, 'steps': 30316, 'batch_loss/train': 0.7113209548406303} 12/26/2021 23:05:29 - INFO - codeparrot_training - Step 30317: {'lr': 0.00016905233687709268, 'samples': 15522816, 'steps': 30317, 'batch_loss/train': 0.7220376411569305} 12/26/2021 23:05:40 - INFO - codeparrot_training - Step 30318: {'lr': 0.00016903742308350394, 'samples': 15523328, 'steps': 30318, 'batch_loss/train': 0.7391798719763756} 12/26/2021 23:05:50 - INFO - codeparrot_training - Step 30319: {'lr': 0.00016902250961179177, 'samples': 15523840, 'steps': 30319, 'batch_loss/train': 0.7085179686546326} 12/26/2021 23:06:01 - INFO - codeparrot_training - Step 30320: {'lr': 0.00016900759646201554, 'samples': 15524352, 'steps': 30320, 'batch_loss/train': 0.7047441517934203} 12/26/2021 23:06:15 - INFO - codeparrot_training - Step 30321: {'lr': 0.0001689926836342346, 'samples': 15524864, 'steps': 30321, 'batch_loss/train': 0.7973251051735133} 12/26/2021 23:06:25 - INFO - codeparrot_training - Step 30322: {'lr': 0.00016897777112850809, 'samples': 15525376, 'steps': 30322, 'batch_loss/train': 0.7721666395664215} 12/26/2021 23:06:36 - INFO - codeparrot_training - Step 30323: {'lr': 0.00016896285894489534, 'samples': 15525888, 'steps': 30323, 'batch_loss/train': 0.7789962785318494} 12/26/2021 23:06:48 - INFO - codeparrot_training - Step 30324: {'lr': 0.00016894794708345576, 'samples': 15526400, 'steps': 30324, 'batch_loss/train': 0.7131933788768947} 12/26/2021 23:06:59 - INFO - codeparrot_training - Step 30325: {'lr': 0.0001689330355442485, 'samples': 15526912, 'steps': 30325, 'batch_loss/train': 0.776974068954587} 12/26/2021 23:07:09 - INFO - codeparrot_training - Step 30326: {'lr': 0.00016891812432733288, 'samples': 15527424, 'steps': 30326, 
'batch_loss/train': 0.7034314731135964} 12/26/2021 23:07:23 - INFO - codeparrot_training - Step 30327: {'lr': 0.00016890321343276816, 'samples': 15527936, 'steps': 30327, 'batch_loss/train': 0.7511392049491405} 12/26/2021 23:07:34 - INFO - codeparrot_training - Step 30328: {'lr': 0.0001688883028606137, 'samples': 15528448, 'steps': 30328, 'batch_loss/train': 0.7517673717811704} 12/26/2021 23:07:44 - INFO - codeparrot_training - Step 30329: {'lr': 0.00016887339261092865, 'samples': 15528960, 'steps': 30329, 'batch_loss/train': 0.7526908391155303} 12/26/2021 23:07:56 - INFO - codeparrot_training - Step 30330: {'lr': 0.00016885848268377243, 'samples': 15529472, 'steps': 30330, 'batch_loss/train': 0.734469898045063} 12/26/2021 23:08:07 - INFO - codeparrot_training - Step 30331: {'lr': 0.00016884357307920414, 'samples': 15529984, 'steps': 30331, 'batch_loss/train': 0.7318243077024817} 12/26/2021 23:08:18 - INFO - codeparrot_training - Step 30332: {'lr': 0.00016882866379728324, 'samples': 15530496, 'steps': 30332, 'batch_loss/train': 0.8138715210370719} 12/26/2021 23:08:28 - INFO - codeparrot_training - Step 30333: {'lr': 0.00016881375483806892, 'samples': 15531008, 'steps': 30333, 'batch_loss/train': 0.7649172402452677} 12/26/2021 23:08:41 - INFO - codeparrot_training - Step 30334: {'lr': 0.0001687988462016204, 'samples': 15531520, 'steps': 30334, 'batch_loss/train': 0.7446907706325874} 12/26/2021 23:08:52 - INFO - codeparrot_training - Step 30335: {'lr': 0.00016878393788799704, 'samples': 15532032, 'steps': 30335, 'batch_loss/train': 0.8156741689890623} 12/26/2021 23:09:02 - INFO - codeparrot_training - Step 30336: {'lr': 0.00016876902989725813, 'samples': 15532544, 'steps': 30336, 'batch_loss/train': 0.8527269996702671} 12/26/2021 23:09:14 - INFO - codeparrot_training - Step 30337: {'lr': 0.0001687541222294629, 'samples': 15533056, 'steps': 30337, 'batch_loss/train': 0.4707522098906338} 12/26/2021 23:09:25 - INFO - codeparrot_training - Step 30338: {'lr': 
0.00016873921488467047, 'samples': 15533568, 'steps': 30338, 'batch_loss/train': 0.7123753689229488} 12/26/2021 23:09:36 - INFO - codeparrot_training - Step 30339: {'lr': 0.0001687243078629403, 'samples': 15534080, 'steps': 30339, 'batch_loss/train': 0.7300927881151438} 12/26/2021 23:09:46 - INFO - codeparrot_training - Step 30340: {'lr': 0.00016870940116433164, 'samples': 15534592, 'steps': 30340, 'batch_loss/train': 0.7324778120964766} 12/26/2021 23:10:00 - INFO - codeparrot_training - Step 30341: {'lr': 0.0001686944947889037, 'samples': 15535104, 'steps': 30341, 'batch_loss/train': 0.7074102280894294} 12/26/2021 23:10:11 - INFO - codeparrot_training - Step 30342: {'lr': 0.00016867958873671563, 'samples': 15535616, 'steps': 30342, 'batch_loss/train': 0.7517969692125916} 12/26/2021 23:10:21 - INFO - codeparrot_training - Step 30343: {'lr': 0.00016866468300782696, 'samples': 15536128, 'steps': 30343, 'batch_loss/train': 0.6890024549793452} 12/26/2021 23:10:34 - INFO - codeparrot_training - Step 30344: {'lr': 0.00016864977760229673, 'samples': 15536640, 'steps': 30344, 'batch_loss/train': 0.6734271494206041} 12/26/2021 23:10:44 - INFO - codeparrot_training - Step 30345: {'lr': 0.00016863487252018422, 'samples': 15537152, 'steps': 30345, 'batch_loss/train': 0.7394433710724115} 12/26/2021 23:10:55 - INFO - codeparrot_training - Step 30346: {'lr': 0.00016861996776154883, 'samples': 15537664, 'steps': 30346, 'batch_loss/train': 0.5770099712535739} 12/26/2021 23:11:08 - INFO - codeparrot_training - Step 30347: {'lr': 0.00016860506332644977, 'samples': 15538176, 'steps': 30347, 'batch_loss/train': 0.6582098465878516} 12/26/2021 23:11:18 - INFO - codeparrot_training - Step 30348: {'lr': 0.00016859015921494614, 'samples': 15538688, 'steps': 30348, 'batch_loss/train': 0.7780614672228694} 12/26/2021 23:11:29 - INFO - codeparrot_training - Step 30349: {'lr': 0.00016857525542709722, 'samples': 15539200, 'steps': 30349, 'batch_loss/train': 0.7679492551833391} 12/26/2021 23:11:40 
- INFO - codeparrot_training - Step 30350: {'lr': 0.0001685603519629625, 'samples': 15539712, 'steps': 30350, 'batch_loss/train': 0.6860130322165787}
12/26/2021 23:11:53 - INFO - codeparrot_training - Step 30351: {'lr': 0.000168545448822601, 'samples': 15540224, 'steps': 30351, 'batch_loss/train': 0.5020563171710819}
12/26/2021 23:12:04 - INFO - codeparrot_training - Step 30352: {'lr': 0.0001685305460060721, 'samples': 15540736, 'steps': 30352, 'batch_loss/train': 0.7705603120848536}
12/26/2021 23:12:14 - INFO - codeparrot_training - Step 30353: {'lr': 0.00016851564351343484, 'samples': 15541248, 'steps': 30353, 'batch_loss/train': 0.7334735421463847}
12/26/2021 23:12:25 - INFO - codeparrot_training - Step 30354: {'lr': 0.00016850074134474878, 'samples': 15541760, 'steps': 30354, 'batch_loss/train': 0.7996252104640007}
12/26/2021 23:12:37 - INFO - codeparrot_training - Step 30355: {'lr': 0.0001684858395000729, 'samples': 15542272, 'steps': 30355, 'batch_loss/train': 0.7450212640687823}
12/26/2021 23:12:48 - INFO - codeparrot_training - Step 30356: {'lr': 0.00016847093797946652, 'samples': 15542784, 'steps': 30356, 'batch_loss/train': 0.7170653584762476}
12/26/2021 23:12:58 - INFO - codeparrot_training - Step 30357: {'lr': 0.00016845603678298902, 'samples': 15543296, 'steps': 30357, 'batch_loss/train': 0.8102821819484234}
12/26/2021 23:13:13 - INFO - codeparrot_training - Step 30358: {'lr': 0.00016844113591069944, 'samples': 15543808, 'steps': 30358, 'batch_loss/train': 0.7498789317905903}
12/26/2021 23:13:23 - INFO - codeparrot_training - Step 30359: {'lr': 0.00016842623536265716, 'samples': 15544320, 'steps': 30359, 'batch_loss/train': 0.7817348474636674}
12/26/2021 23:13:34 - INFO - codeparrot_training - Step 30360: {'lr': 0.00016841133513892132, 'samples': 15544832, 'steps': 30360, 'batch_loss/train': 1.0102986991405487}
12/26/2021 23:13:46 - INFO - codeparrot_training - Step 30361: {'lr': 0.0001683964352395512, 'samples': 15545344, 'steps': 30361, 'batch_loss/train': 0.7489525116980076}
12/26/2021 23:13:57 - INFO - codeparrot_training - Step 30362: {'lr': 0.00016838153566460612, 'samples': 15545856, 'steps': 30362, 'batch_loss/train': 0.7904896242544055}
12/26/2021 23:14:07 - INFO - codeparrot_training - Step 30363: {'lr': 0.00016836663641414524, 'samples': 15546368, 'steps': 30363, 'batch_loss/train': 0.7904106536880136}
12/26/2021 23:14:18 - INFO - codeparrot_training - Step 30364: {'lr': 0.00016835173748822765, 'samples': 15546880, 'steps': 30364, 'batch_loss/train': 0.7847065171226859}
12/26/2021 23:14:30 - INFO - codeparrot_training - Step 30365: {'lr': 0.0001683368388869128, 'samples': 15547392, 'steps': 30365, 'batch_loss/train': 0.8193763294257224}
12/26/2021 23:14:41 - INFO - codeparrot_training - Step 30366: {'lr': 0.0001683219406102599, 'samples': 15547904, 'steps': 30366, 'batch_loss/train': 0.8148705440107733}
12/26/2021 23:14:51 - INFO - codeparrot_training - Step 30367: {'lr': 0.00016830704265832804, 'samples': 15548416, 'steps': 30367, 'batch_loss/train': 0.8123387405648828}
12/26/2021 23:15:05 - INFO - codeparrot_training - Step 30368: {'lr': 0.0001682921450311765, 'samples': 15548928, 'steps': 30368, 'batch_loss/train': 0.7505918517708778}
12/26/2021 23:15:16 - INFO - codeparrot_training - Step 30369: {'lr': 0.00016827724772886465, 'samples': 15549440, 'steps': 30369, 'batch_loss/train': 0.7471540346741676}
12/26/2021 23:15:27 - INFO - codeparrot_training - Step 30370: {'lr': 0.0001682623507514516, 'samples': 15549952, 'steps': 30370, 'batch_loss/train': 0.8285234281793237}
12/26/2021 23:15:39 - INFO - codeparrot_training - Step 30371: {'lr': 0.00016824745409899645, 'samples': 15550464, 'steps': 30371, 'batch_loss/train': 0.7738095661625266}
12/26/2021 23:15:49 - INFO - codeparrot_training - Step 30372: {'lr': 0.00016823255777155872, 'samples': 15550976, 'steps': 30372, 'batch_loss/train': 0.8617900181561708}
12/26/2021 23:16:00 - INFO - codeparrot_training - Step 30373: {'lr': 0.0001682176617691974, 'samples': 15551488, 'steps': 30373, 'batch_loss/train': 0.8115465901792049}
12/26/2021 23:16:12 - INFO - codeparrot_training - Step 30374: {'lr': 0.00016820276609197176, 'samples': 15552000, 'steps': 30374, 'batch_loss/train': 0.7183247555512935}
12/26/2021 23:16:23 - INFO - codeparrot_training - Step 30375: {'lr': 0.000168187870739941, 'samples': 15552512, 'steps': 30375, 'batch_loss/train': 0.7912985803559422}
12/26/2021 23:16:33 - INFO - codeparrot_training - Step 30376: {'lr': 0.00016817297571316447, 'samples': 15553024, 'steps': 30376, 'batch_loss/train': 0.681115326937288}
12/26/2021 23:16:44 - INFO - codeparrot_training - Step 30377: {'lr': 0.0001681580810117012, 'samples': 15553536, 'steps': 30377, 'batch_loss/train': 0.8016816191375256}
12/26/2021 23:16:56 - INFO - codeparrot_training - Step 30378: {'lr': 0.00016814318663561063, 'samples': 15554048, 'steps': 30378, 'batch_loss/train': 0.8214237892534584}
12/26/2021 23:17:07 - INFO - codeparrot_training - Step 30379: {'lr': 0.00016812829258495166, 'samples': 15554560, 'steps': 30379, 'batch_loss/train': 0.9423373104073107}
12/26/2021 23:17:17 - INFO - codeparrot_training - Step 30380: {'lr': 0.00016811339885978378, 'samples': 15555072, 'steps': 30380, 'batch_loss/train': 0.7057053507305682}
12/26/2021 23:17:32 - INFO - codeparrot_training - Step 30381: {'lr': 0.00016809850546016614, 'samples': 15555584, 'steps': 30381, 'batch_loss/train': 0.8247639765031636}
12/26/2021 23:17:42 - INFO - codeparrot_training - Step 30382: {'lr': 0.0001680836123861578, 'samples': 15556096, 'steps': 30382, 'batch_loss/train': 0.787279230542481}
12/26/2021 23:17:53 - INFO - codeparrot_training - Step 30383: {'lr': 0.00016806871963781823, 'samples': 15556608, 'steps': 30383, 'batch_loss/train': 0.7488685818389058}
12/26/2021 23:18:05 - INFO - codeparrot_training - Step 30384: {'lr': 0.00016805382721520642, 'samples': 15557120, 'steps': 30384, 'batch_loss/train': 0.7944280819501728}
12/26/2021 23:18:16 - INFO - codeparrot_training - Step 30385: {'lr': 0.0001680389351183817, 'samples': 15557632, 'steps': 30385, 'batch_loss/train': 0.712493835715577}
12/26/2021 23:18:26 - INFO - codeparrot_training - Step 30386: {'lr': 0.00016802404334740316, 'samples': 15558144, 'steps': 30386, 'batch_loss/train': 0.8355588223785162}
12/26/2021 23:18:37 - INFO - codeparrot_training - Step 30387: {'lr': 0.00016800915190233007, 'samples': 15558656, 'steps': 30387, 'batch_loss/train': 0.5986833041533828}
12/26/2021 23:18:51 - INFO - codeparrot_training - Step 30388: {'lr': 0.0001679942607832217, 'samples': 15559168, 'steps': 30388, 'batch_loss/train': 0.6737103778868914}
12/26/2021 23:19:02 - INFO - codeparrot_training - Step 30389: {'lr': 0.00016797936999013717, 'samples': 15559680, 'steps': 30389, 'batch_loss/train': 0.7401702785864472}
12/26/2021 23:19:12 - INFO - codeparrot_training - Step 30390: {'lr': 0.0001679644795231356, 'samples': 15560192, 'steps': 30390, 'batch_loss/train': 1.0108242947608232}
12/26/2021 23:19:24 - INFO - codeparrot_training - Step 30391: {'lr': 0.00016794958938227638, 'samples': 15560704, 'steps': 30391, 'batch_loss/train': 0.5847743379417807}
12/26/2021 23:19:35 - INFO - codeparrot_training - Step 30392: {'lr': 0.00016793469956761863, 'samples': 15561216, 'steps': 30392, 'batch_loss/train': 0.8208916211733595}
12/26/2021 23:19:46 - INFO - codeparrot_training - Step 30393: {'lr': 0.00016791981007922147, 'samples': 15561728, 'steps': 30393, 'batch_loss/train': 0.6985751749016345}
12/26/2021 23:19:56 - INFO - codeparrot_training - Step 30394: {'lr': 0.00016790492091714416, 'samples': 15562240, 'steps': 30394, 'batch_loss/train': 0.7737347148358822}
12/26/2021 23:20:09 - INFO - codeparrot_training - Step 30395: {'lr': 0.00016789003208144592, 'samples': 15562752, 'steps': 30395, 'batch_loss/train': 0.8400062536820769}
12/26/2021 23:20:19 - INFO - codeparrot_training - Step 30396: {'lr': 0.00016787514357218588, 'samples': 15563264, 'steps': 30396, 'batch_loss/train': 0.7292717797681689}
12/26/2021 23:20:30 - INFO - codeparrot_training - Step 30397: {'lr': 0.00016786025538942324, 'samples': 15563776, 'steps': 30397, 'batch_loss/train': 0.7670926628634334}
12/26/2021 23:20:44 - INFO - codeparrot_training - Step 30398: {'lr': 0.00016784536753321726, 'samples': 15564288, 'steps': 30398, 'batch_loss/train': 0.7345486702397466}
12/26/2021 23:20:55 - INFO - codeparrot_training - Step 30399: {'lr': 0.0001678304800036271, 'samples': 15564800, 'steps': 30399, 'batch_loss/train': 0.5688985696760938}
12/26/2021 23:21:05 - INFO - codeparrot_training - Step 30400: {'lr': 0.00016781559280071184, 'samples': 15565312, 'steps': 30400, 'batch_loss/train': 0.7763326806016266}
12/26/2021 23:21:17 - INFO - codeparrot_training - Step 30401: {'lr': 0.00016780070592453076, 'samples': 15565824, 'steps': 30401, 'batch_loss/train': 0.712107615545392}
12/26/2021 23:21:28 - INFO - codeparrot_training - Step 30402: {'lr': 0.0001677858193751431, 'samples': 15566336, 'steps': 30402, 'batch_loss/train': 0.8241373589262366}
12/26/2021 23:21:39 - INFO - codeparrot_training - Step 30403: {'lr': 0.00016777093315260795, 'samples': 15566848, 'steps': 30403, 'batch_loss/train': 0.7559393253177404}
12/26/2021 23:21:51 - INFO - codeparrot_training - Step 30404: {'lr': 0.00016775604725698456, 'samples': 15567360, 'steps': 30404, 'batch_loss/train': 0.7411087965592742}
12/26/2021 23:22:02 - INFO - codeparrot_training - Step 30405: {'lr': 0.00016774116168833197, 'samples': 15567872, 'steps': 30405, 'batch_loss/train': 0.6853297797497362}
12/26/2021 23:22:12 - INFO - codeparrot_training - Step 30406: {'lr': 0.0001677262764467095, 'samples': 15568384, 'steps': 30406, 'batch_loss/train': 0.726391093339771}
12/26/2021 23:22:23 - INFO - codeparrot_training - Step 30407: {'lr': 0.00016771139153217636, 'samples': 15568896, 'steps': 30407, 'batch_loss/train': 0.666979237110354}
12/26/2021 23:22:37 - INFO - codeparrot_training - Step 30408: {'lr': 0.00016769650694479154, 'samples': 15569408, 'steps': 30408, 'batch_loss/train': 0.8520151400007308}
12/26/2021 23:22:47 - INFO - codeparrot_training - Step 30409: {'lr': 0.00016768162268461447, 'samples': 15569920, 'steps': 30409, 'batch_loss/train': 0.7287234761752188}
12/26/2021 23:22:58 - INFO - codeparrot_training - Step 30410: {'lr': 0.00016766673875170406, 'samples': 15570432, 'steps': 30410, 'batch_loss/train': 0.710415106615983}
12/26/2021 23:23:10 - INFO - codeparrot_training - Step 30411: {'lr': 0.00016765185514611974, 'samples': 15570944, 'steps': 30411, 'batch_loss/train': 0.7998373750597239}
12/26/2021 23:23:21 - INFO - codeparrot_training - Step 30412: {'lr': 0.00016763697186792043, 'samples': 15571456, 'steps': 30412, 'batch_loss/train': 0.8001595344394445}
12/26/2021 23:23:31 - INFO - codeparrot_training - Step 30413: {'lr': 0.00016762208891716543, 'samples': 15571968, 'steps': 30413, 'batch_loss/train': 0.6736700907349586}
12/26/2021 23:23:43 - INFO - codeparrot_training - Step 30414: {'lr': 0.00016760720629391398, 'samples': 15572480, 'steps': 30414, 'batch_loss/train': 0.6813257988542318}
12/26/2021 23:23:54 - INFO - codeparrot_training - Step 30415: {'lr': 0.0001675923239982251, 'samples': 15572992, 'steps': 30415, 'batch_loss/train': 0.7733433274552226}
12/26/2021 23:24:05 - INFO - codeparrot_training - Step 30416: {'lr': 0.000167577442030158, 'samples': 15573504, 'steps': 30416, 'batch_loss/train': 0.7530504707247019}
12/26/2021 23:24:15 - INFO - codeparrot_training - Step 30417: {'lr': 0.00016756256038977197, 'samples': 15574016, 'steps': 30417, 'batch_loss/train': 0.7939586602151394}
12/26/2021 23:24:27 - INFO - codeparrot_training - Step 30418: {'lr': 0.00016754767907712606, 'samples': 15574528, 'steps': 30418, 'batch_loss/train': 0.78902786411345}
12/26/2021 23:24:38 - INFO - codeparrot_training - Step 30419: {'lr': 0.00016753279809227932, 'samples': 15575040, 'steps': 30419, 'batch_loss/train': 0.7901883223094046}
12/26/2021 23:24:48 - INFO - codeparrot_training - Step 30420: {'lr': 0.00016751791743529112, 'samples': 15575552, 'steps': 30420, 'batch_loss/train': 0.7544458326883614}
12/26/2021 23:25:02 - INFO - codeparrot_training - Step 30421: {'lr': 0.00016750303710622057, 'samples': 15576064, 'steps': 30421, 'batch_loss/train': 0.7267407760955393}
12/26/2021 23:25:13 - INFO - codeparrot_training - Step 30422: {'lr': 0.00016748815710512677, 'samples': 15576576, 'steps': 30422, 'batch_loss/train': 0.7640627641230822}
12/26/2021 23:25:23 - INFO - codeparrot_training - Step 30423: {'lr': 0.00016747327743206884, 'samples': 15577088, 'steps': 30423, 'batch_loss/train': 0.7210181271657348}
12/26/2021 23:25:36 - INFO - codeparrot_training - Step 30424: {'lr': 0.00016745839808710608, 'samples': 15577600, 'steps': 30424, 'batch_loss/train': 0.8005913989618421}
12/26/2021 23:25:46 - INFO - codeparrot_training - Step 30425: {'lr': 0.00016744351907029754, 'samples': 15578112, 'steps': 30425, 'batch_loss/train': 0.7619810588657856}
12/26/2021 23:25:57 - INFO - codeparrot_training - Step 30426: {'lr': 0.0001674286403817024, 'samples': 15578624, 'steps': 30426, 'batch_loss/train': 0.6967713190242648}
12/26/2021 23:26:11 - INFO - codeparrot_training - Step 30427: {'lr': 0.00016741376202137974, 'samples': 15579136, 'steps': 30427, 'batch_loss/train': 0.7998430440202355}
12/26/2021 23:26:22 - INFO - codeparrot_training - Step 30428: {'lr': 0.0001673988839893889, 'samples': 15579648, 'steps': 30428, 'batch_loss/train': 0.7589839361608028}
12/26/2021 23:26:32 - INFO - codeparrot_training - Step 30429: {'lr': 0.00016738400628578883, 'samples': 15580160, 'steps': 30429, 'batch_loss/train': 0.6821065288968384}
12/26/2021 23:26:43 - INFO - codeparrot_training - Step 30430: {'lr': 0.0001673691289106387, 'samples': 15580672, 'steps': 30430, 'batch_loss/train': 0.8097657379694283}
12/26/2021 23:26:55 - INFO - codeparrot_training - Step 30431: {'lr': 0.00016735425186399782, 'samples': 15581184, 'steps': 30431, 'batch_loss/train': 0.751464348519221}
12/26/2021 23:27:06 - INFO - codeparrot_training - Step 30432: {'lr': 0.00016733937514592522, 'samples': 15581696, 'steps': 30432, 'batch_loss/train': 0.7272158656269312}
12/26/2021 23:27:16 - INFO - codeparrot_training - Step 30433: {'lr': 0.00016732449875648007, 'samples': 15582208, 'steps': 30433, 'batch_loss/train': 0.8663795292377472}
12/26/2021 23:27:28 - INFO - codeparrot_training - Step 30434: {'lr': 0.00016730962269572138, 'samples': 15582720, 'steps': 30434, 'batch_loss/train': 0.7927469653077424}
12/26/2021 23:27:39 - INFO - codeparrot_training - Step 30435: {'lr': 0.00016729474696370849, 'samples': 15583232, 'steps': 30435, 'batch_loss/train': 0.8029588959179819}
12/26/2021 23:27:50 - INFO - codeparrot_training - Step 30436: {'lr': 0.00016727987156050047, 'samples': 15583744, 'steps': 30436, 'batch_loss/train': 0.776198542676866}
12/26/2021 23:28:03 - INFO - codeparrot_training - Step 30437: {'lr': 0.00016726499648615648, 'samples': 15584256, 'steps': 30437, 'batch_loss/train': 0.8480096980929375}
12/26/2021 23:28:14 - INFO - codeparrot_training - Step 30438: {'lr': 0.0001672501217407355, 'samples': 15584768, 'steps': 30438, 'batch_loss/train': 0.7654904290102422}
12/26/2021 23:28:25 - INFO - codeparrot_training - Step 30439: {'lr': 0.00016723524732429685, 'samples': 15585280, 'steps': 30439, 'batch_loss/train': 0.7022222420200706}
12/26/2021 23:28:35 - INFO - codeparrot_training - Step 30440: {'lr': 0.00016722037323689965, 'samples': 15585792, 'steps': 30440, 'batch_loss/train': 0.6496235635131598}
12/26/2021 23:28:47 - INFO - codeparrot_training - Step 30441: {'lr': 0.00016720549947860298, 'samples': 15586304, 'steps': 30441, 'batch_loss/train': 0.6743149543181062}
12/26/2021 23:28:58 - INFO - codeparrot_training - Step 30442: {'lr': 0.00016719062604946588, 'samples': 15586816, 'steps': 30442, 'batch_loss/train': 0.64466555591207}
12/26/2021 23:29:09 - INFO - codeparrot_training - Step 30443: {'lr': 0.0001671757529495477, 'samples': 15587328, 'steps': 30443, 'batch_loss/train': 0.7495721997693181}
12/26/2021 23:29:23 - INFO - codeparrot_training - Step 30444: {'lr': 0.00016716088017890745, 'samples': 15587840, 'steps': 30444, 'batch_loss/train': 0.7318567750044167}
12/26/2021 23:29:33 - INFO - codeparrot_training - Step 30445: {'lr': 0.0001671460077376042, 'samples': 15588352, 'steps': 30445, 'batch_loss/train': 0.642926600528881}
12/26/2021 23:29:44 - INFO - codeparrot_training - Step 30446: {'lr': 0.0001671311356256971, 'samples': 15588864, 'steps': 30446, 'batch_loss/train': 0.7831369498744607}
12/26/2021 23:29:56 - INFO - codeparrot_training - Step 30447: {'lr': 0.00016711626384324546, 'samples': 15589376, 'steps': 30447, 'batch_loss/train': 0.8400950599461794}
12/26/2021 23:30:07 - INFO - codeparrot_training - Step 30448: {'lr': 0.00016710139239030816, 'samples': 15589888, 'steps': 30448, 'batch_loss/train': 0.8443474831874482}
12/26/2021 23:30:17 - INFO - codeparrot_training - Step 30449: {'lr': 0.00016708652126694437, 'samples': 15590400, 'steps': 30449, 'batch_loss/train': 1.0454582823440433}
12/26/2021 23:30:28 - INFO - codeparrot_training - Step 30450: {'lr': 0.00016707165047321338, 'samples': 15590912, 'steps': 30450, 'batch_loss/train': 0.7445238791406155}
12/26/2021 23:30:40 - INFO - codeparrot_training - Step 30451: {'lr': 0.00016705678000917415, 'samples': 15591424, 'steps': 30451, 'batch_loss/train': 0.9690962387248874}
12/26/2021 23:30:51 - INFO - codeparrot_training - Step 30452: {'lr': 0.00016704190987488587, 'samples': 15591936, 'steps': 30452, 'batch_loss/train': 0.7988529978320003}
12/26/2021 23:31:01 - INFO - codeparrot_training - Step 30453: {'lr': 0.00016702704007040756, 'samples': 15592448, 'steps': 30453, 'batch_loss/train': 0.7540526804514229}
12/26/2021 23:31:14 - INFO - codeparrot_training - Step 30454: {'lr': 0.0001670121705957985, 'samples': 15592960, 'steps': 30454, 'batch_loss/train': 0.7865419210866094}
12/26/2021 23:31:24 - INFO - codeparrot_training - Step 30455: {'lr': 0.00016699730145111765, 'samples': 15593472, 'steps': 30455, 'batch_loss/train': 0.6928483406081796}
12/26/2021 23:31:35 - INFO - codeparrot_training - Step 30456: {'lr': 0.00016698243263642414, 'samples': 15593984, 'steps': 30456, 'batch_loss/train': 0.7714017319958657}
12/26/2021 23:31:49 - INFO - codeparrot_training - Step 30457: {'lr': 0.00016696756415177728, 'samples': 15594496, 'steps': 30457, 'batch_loss/train': 0.6110899240011349}
12/26/2021 23:31:59 - INFO - codeparrot_training - Step 30458: {'lr': 0.00016695269599723594, 'samples': 15595008, 'steps': 30458, 'batch_loss/train': 0.726873129606247}
12/26/2021 23:32:10 - INFO - codeparrot_training - Step 30459: {'lr': 0.00016693782817285937, 'samples': 15595520, 'steps': 30459, 'batch_loss/train': 0.7129095469135791}
12/26/2021 23:32:20 - INFO - codeparrot_training - Step 30460: {'lr': 0.00016692296067870655, 'samples': 15596032, 'steps': 30460, 'batch_loss/train': 0.8127227071672678}
12/26/2021 23:32:32 - INFO - codeparrot_training - Step 30461: {'lr': 0.00016690809351483672, 'samples': 15596544, 'steps': 30461, 'batch_loss/train': 0.7900870870798826}
12/26/2021 23:32:43 - INFO - codeparrot_training - Step 30462: {'lr': 0.00016689322668130897, 'samples': 15597056, 'steps': 30462, 'batch_loss/train': 0.7312082047574222}
12/26/2021 23:32:54 - INFO - codeparrot_training - Step 30463: {'lr': 0.00016687836017818236, 'samples': 15597568, 'steps': 30463, 'batch_loss/train': 0.7938750358298421}
12/26/2021 23:33:06 - INFO - codeparrot_training - Step 30464: {'lr': 0.00016686349400551592, 'samples': 15598080, 'steps': 30464, 'batch_loss/train': 0.5672712509986013}
12/26/2021 23:33:16 - INFO - codeparrot_training - Step 30465: {'lr': 0.0001668486281633689, 'samples': 15598592, 'steps': 30465, 'batch_loss/train': 0.8273929050192237}
12/26/2021 23:33:27 - INFO - codeparrot_training - Step 30466: {'lr': 0.0001668337626518004, 'samples': 15599104, 'steps': 30466, 'batch_loss/train': 0.677532846108079}
12/26/2021 23:33:41 - INFO - codeparrot_training - Step 30467: {'lr': 0.00016681889747086932, 'samples': 15599616, 'steps': 30467, 'batch_loss/train': 0.9333339212462306}
12/26/2021 23:33:51 - INFO - codeparrot_training - Step 30468: {'lr': 0.00016680403262063495, 'samples': 15600128, 'steps': 30468, 'batch_loss/train': 0.8461117194965482}
12/26/2021 23:34:02 - INFO - codeparrot_training - Step 30469: {'lr': 0.0001667891681011564, 'samples': 15600640, 'steps': 30469, 'batch_loss/train': 0.8172977520152926}
12/26/2021 23:34:13 - INFO - codeparrot_training - Step 30470: {'lr': 0.00016677430391249265, 'samples': 15601152, 'steps': 30470, 'batch_loss/train': 0.7199551428493578}
12/26/2021 23:34:25 - INFO - codeparrot_training - Step 30471: {'lr': 0.0001667594400547028, 'samples': 15601664, 'steps': 30471, 'batch_loss/train': 0.7095055533573031}
12/26/2021 23:34:35 - INFO - codeparrot_training - Step 30472: {'lr': 0.00016674457652784605, 'samples': 15602176, 'steps': 30472, 'batch_loss/train': 0.7600930836051702}
12/26/2021 23:34:46 - INFO - codeparrot_training - Step 30473: {'lr': 0.00016672971333198144, 'samples': 15602688, 'steps': 30473, 'batch_loss/train': 0.885420442558825}
12/26/2021 23:35:00 - INFO - codeparrot_training - Step 30474: {'lr': 0.000166714850467168, 'samples': 15603200, 'steps': 30474, 'batch_loss/train': 0.7727569499984384}
12/26/2021 23:35:11 - INFO - codeparrot_training - Step 30475: {'lr': 0.00016669998793346482, 'samples': 15603712, 'steps': 30475, 'batch_loss/train': 0.7679472751915455}
12/26/2021 23:35:21 - INFO - codeparrot_training - Step 30476: {'lr': 0.00016668512573093114, 'samples': 15604224, 'steps': 30476, 'batch_loss/train': 0.9210673263296485}
12/26/2021 23:35:34 - INFO - codeparrot_training - Step 30477: {'lr': 0.0001666702638596259, 'samples': 15604736, 'steps': 30477, 'batch_loss/train': 0.7888710550032556}
12/26/2021 23:35:44 - INFO - codeparrot_training - Step 30478: {'lr': 0.00016665540231960827, 'samples': 15605248, 'steps': 30478, 'batch_loss/train': 0.6968195559456944}
12/26/2021 23:35:55 - INFO - codeparrot_training - Step 30479: {'lr': 0.0001666405411109372, 'samples': 15605760, 'steps': 30479, 'batch_loss/train': 0.8031116253696382}
12/26/2021 23:36:05 - INFO - codeparrot_training - Step 30480: {'lr': 0.00016662568023367188, 'samples': 15606272, 'steps': 30480, 'batch_loss/train': 0.7141737155616283}
12/26/2021 23:36:18 - INFO - codeparrot_training - Step 30481: {'lr': 0.00016661081968787146, 'samples': 15606784, 'steps': 30481, 'batch_loss/train': 0.7285800736863166}
12/26/2021 23:36:28 - INFO - codeparrot_training - Step 30482: {'lr': 0.0001665959594735948, 'samples': 15607296, 'steps': 30482, 'batch_loss/train': 0.7097166208550334}
12/26/2021 23:36:39 - INFO - codeparrot_training - Step 30483: {'lr': 0.00016658109959090124, 'samples': 15607808, 'steps': 30483, 'batch_loss/train': 0.7637602312024683}
12/26/2021 23:36:53 - INFO - codeparrot_training - Step 30484: {'lr': 0.00016656624003984967, 'samples': 15608320, 'steps': 30484, 'batch_loss/train': 0.752293320838362}
12/26/2021 23:37:03 - INFO - codeparrot_training - Step 30485: {'lr': 0.0001665513808204993, 'samples': 15608832, 'steps': 30485, 'batch_loss/train': 0.7566982920980081}
12/26/2021 23:37:14 - INFO - codeparrot_training - Step 30486: {'lr': 0.000166536521932909, 'samples': 15609344, 'steps': 30486, 'batch_loss/train': 0.7562190787866712}
12/26/2021 23:37:26 - INFO - codeparrot_training - Step 30487: {'lr': 0.00016652166337713804, 'samples': 15609856, 'steps': 30487, 'batch_loss/train': 0.7546584326773882}
12/26/2021 23:37:37 - INFO - codeparrot_training - Step 30488: {'lr': 0.00016650680515324545, 'samples': 15610368, 'steps': 30488, 'batch_loss/train': 0.7720594676211476}
12/26/2021 23:37:47 - INFO - codeparrot_training - Step 30489: {'lr': 0.0001664919472612903, 'samples': 15610880, 'steps': 30489, 'batch_loss/train': 0.8496150011196733}
12/26/2021 23:37:59 - INFO - codeparrot_training - Step 30490: {'lr': 0.00016647708970133154, 'samples': 15611392, 'steps': 30490, 'batch_loss/train': 0.7078951571602374}
12/26/2021 23:38:10 - INFO - codeparrot_training - Step 30491: {'lr': 0.00016646223247342837, 'samples': 15611904, 'steps': 30491, 'batch_loss/train': 0.8080749213695526}
12/26/2021 23:38:21 - INFO - codeparrot_training - Step 30492: {'lr': 0.00016644737557763988, 'samples': 15612416, 'steps': 30492, 'batch_loss/train': 0.738821295555681}
12/26/2021 23:38:31 - INFO - codeparrot_training - Step 30493: {'lr': 0.00016643251901402498, 'samples': 15612928, 'steps': 30493, 'batch_loss/train': 0.7825767518952489}
12/26/2021 23:38:43 - INFO - codeparrot_training - Step 30494: {'lr': 0.00016641766278264287, 'samples': 15613440, 'steps': 30494, 'batch_loss/train': 0.7771296501159668}
12/26/2021 23:38:54 - INFO - codeparrot_training - Step 30495: {'lr': 0.00016640280688355263, 'samples': 15613952, 'steps': 30495, 'batch_loss/train': 0.7018483718857169}
12/26/2021 23:39:05 - INFO - codeparrot_training - Step 30496: {'lr': 0.0001663879513168132, 'samples': 15614464, 'steps': 30496, 'batch_loss/train': 0.7980988165363669}
12/26/2021 23:39:19 - INFO - codeparrot_training - Step 30497: {'lr': 0.0001663730960824837, 'samples': 15614976, 'steps': 30497, 'batch_loss/train': 0.6297385348298121}
12/26/2021 23:39:29 - INFO - codeparrot_training - Step 30498: {'lr': 0.00016635824118062327, 'samples': 15615488, 'steps': 30498, 'batch_loss/train': 0.7581208101473749}
12/26/2021 23:39:40 - INFO - codeparrot_training - Step 30499: {'lr': 0.00016634338661129086, 'samples': 15616000, 'steps': 30499, 'batch_loss/train': 0.8157514980994165}
12/26/2021 23:39:52 - INFO - codeparrot_training - Step 30500: {'lr': 0.00016632853237454553, 'samples': 15616512, 'steps': 30500, 'batch_loss/train': 0.7329970365390182}
12/26/2021 23:40:03 - INFO - codeparrot_training - Step 30501: {'lr': 0.00016631367847044632, 'samples': 15617024, 'steps': 30501, 'batch_loss/train': 0.7616530945524573}
12/26/2021 23:40:13 - INFO - codeparrot_training - Step 30502: {'lr': 0.0001662988248990524, 'samples': 15617536, 'steps': 30502, 'batch_loss/train': 0.8316608276218176}
12/26/2021 23:40:27 - INFO - codeparrot_training - Step 30503: {'lr': 0.00016628397166042276, 'samples': 15618048, 'steps': 30503, 'batch_loss/train': 0.7109505590051413}
12/26/2021 23:40:38 - INFO - codeparrot_training - Step 30504: {'lr': 0.0001662691187546163, 'samples': 15618560, 'steps': 30504, 'batch_loss/train': 0.7271362829487771}
12/26/2021 23:40:49 - INFO - codeparrot_training - Step 30505: {'lr': 0.00016625426618169237, 'samples': 15619072, 'steps': 30505, 'batch_loss/train': 0.7892314530909061}
12/26/2021 23:40:59 - INFO - codeparrot_training - Step 30506: {'lr': 0.0001662394139417098, 'samples': 15619584, 'steps': 30506, 'batch_loss/train': 0.7760427799075842}
12/26/2021 23:41:11 - INFO - codeparrot_training - Step 30507: {'lr': 0.00016622456203472773, 'samples': 15620096, 'steps': 30507, 'batch_loss/train': 0.7651226622983813}
12/26/2021 23:41:22 - INFO - codeparrot_training - Step 30508: {'lr': 0.00016620971046080508, 'samples': 15620608, 'steps': 30508, 'batch_loss/train': 0.8186481632292271}
12/26/2021 23:41:33 - INFO - codeparrot_training - Step 30509: {'lr': 0.00016619485922000104, 'samples': 15621120, 'steps': 30509, 'batch_loss/train': 0.7571031767874956}
12/26/2021 23:41:45 - INFO - codeparrot_training - Step 30510: {'lr': 0.00016618000831237462, 'samples': 15621632, 'steps': 30510, 'batch_loss/train': 0.723425735719502}
12/26/2021 23:41:55 - INFO - codeparrot_training - Step 30511: {'lr': 0.00016616515773798486, 'samples': 15622144, 'steps': 30511, 'batch_loss/train': 0.8259640750475228}
12/26/2021 23:42:06 - INFO - codeparrot_training - Step 30512: {'lr': 0.00016615030749689063, 'samples': 15622656, 'steps': 30512, 'batch_loss/train': 0.707457063253969}
12/26/2021 23:42:20 - INFO - codeparrot_training - Step 30513: {'lr': 0.00016613545758915123, 'samples': 15623168, 'steps': 30513, 'batch_loss/train': 0.6972046853043139}
12/26/2021 23:42:31 - INFO - codeparrot_training - Step 30514: {'lr': 0.0001661206080148256, 'samples': 15623680, 'steps': 30514, 'batch_loss/train': 0.8264711005613208}
12/26/2021 23:42:41 - INFO - codeparrot_training - Step 30515: {'lr': 0.0001661057587739727, 'samples': 15624192, 'steps': 30515, 'batch_loss/train': 0.6457669390365481}
12/26/2021 23:42:52 - INFO - codeparrot_training - Step 30516: {'lr': 0.0001660909098666516, 'samples': 15624704, 'steps': 30516, 'batch_loss/train': 0.7880122531205416}
12/26/2021 23:43:04 - INFO - codeparrot_training - Step 30517: {'lr': 0.00016607606129292148, 'samples': 15625216, 'steps': 30517, 'batch_loss/train': 0.8323951596394181}
12/26/2021 23:43:15 - INFO - codeparrot_training - Step 30518: {'lr': 0.00016606121305284122, 'samples': 15625728, 'steps': 30518, 'batch_loss/train': 0.7508040834218264}
12/26/2021 23:43:25 - INFO - codeparrot_training - Step 30519: {'lr': 0.0001660463651464698, 'samples': 15626240, 'steps': 30519, 'batch_loss/train': 0.7554784929379821}
12/26/2021 23:43:37 - INFO - codeparrot_training - Step 30520: {'lr': 0.00016603151757386636, 'samples': 15626752, 'steps': 30520, 'batch_loss/train': 0.7923707459121943}
12/26/2021 23:43:48 - INFO - codeparrot_training - Step 30521: {'lr': 0.00016601667033508996, 'samples': 15627264, 'steps': 30521, 'batch_loss/train': 0.6494198588188738}
12/26/2021 23:43:59 - INFO - codeparrot_training - Step 30522: {'lr': 0.00016600182343019953, 'samples': 15627776, 'steps': 30522, 'batch_loss/train': 0.8244441758142784}
12/26/2021 23:44:13 - INFO - codeparrot_training - Step 30523: {'lr': 0.00016598697685925407, 'samples': 15628288, 'steps': 30523, 'batch_loss/train': 0.7291449401527643}
12/26/2021 23:44:23 - INFO - codeparrot_training - Step 30524: {'lr': 0.00016597213062231276, 'samples': 15628800, 'steps': 30524, 'batch_loss/train': 0.8263470539823174}
12/26/2021 23:44:34 - INFO - codeparrot_training - Step 30525: {'lr': 0.0001659572847194345, 'samples': 15629312, 'steps': 30525, 'batch_loss/train': 0.7884634928777814}
12/26/2021 23:44:45 - INFO - codeparrot_training - Step 30526: {'lr': 0.00016594243915067842, 'samples': 15629824, 'steps': 30526, 'batch_loss/train': 0.7860724618658423}
12/26/2021 23:44:57 - INFO - codeparrot_training - Step 30527: {'lr': 0.0001659275939161033, 'samples': 15630336, 'steps': 30527, 'batch_loss/train': 0.975614276714623}
12/26/2021 23:45:08 - INFO - codeparrot_training - Step 30528: {'lr': 0.00016591274901576848, 'samples': 15630848, 'steps': 30528, 'batch_loss/train': 0.7630142383277416}
12/26/2021 23:45:18 - INFO - codeparrot_training - Step 30529: {'lr': 0.00016589790444973272, 'samples': 15631360, 'steps': 30529, 'batch_loss/train': 0.7673402444925159}
12/26/2021 23:45:30 - INFO - codeparrot_training - Step 30530: {'lr': 0.0001658830602180551, 'samples': 15631872, 'steps': 30530, 'batch_loss/train': 0.754888859577477}
12/26/2021 23:45:41 - INFO - codeparrot_training - Step 30531: {'lr': 0.0001658682163207948, 'samples': 15632384, 'steps': 30531, 'batch_loss/train': 0.6859216887387447}
12/26/2021 23:45:52 - INFO - codeparrot_training - Step 30532: {'lr': 0.0001658533727580106, 'samples': 15632896, 'steps': 30532, 'batch_loss/train': 0.7559486776590347}
12/26/2021 23:46:05 - INFO - codeparrot_training - Step 30533: {'lr': 0.00016583852952976172, 'samples': 15633408, 'steps': 30533, 'batch_loss/train': 0.7516771778464317}
12/26/2021 23:46:16 - INFO - codeparrot_training - Step 30534: {'lr': 0.00016582368663610694, 'samples': 15633920, 'steps': 30534, 'batch_loss/train': 0.8068866138346493}
12/26/2021 23:46:27 - INFO - codeparrot_training - Step 30535: {'lr': 0.0001658088440771055, 'samples': 15634432, 'steps': 30535, 'batch_loss/train': 0.6918726083822548}
12/26/2021 23:46:37 - INFO - codeparrot_training - Step 30536: {'lr': 0.0001657940018528163, 'samples': 15634944, 'steps': 30536, 'batch_loss/train': 0.7127158502116799}
12/26/2021 23:46:49 - INFO - codeparrot_training - Step 30537: {'lr': 0.00016577915996329834, 'samples': 15635456, 'steps': 30537, 'batch_loss/train': 0.7626612549647689}
12/26/2021 23:47:00 - INFO - codeparrot_training - Step 30538: {'lr': 0.00016576431840861056, 'samples': 15635968, 'steps': 30538, 'batch_loss/train': 0.7229328724788502}
12/26/2021 23:47:11 - INFO - codeparrot_training - Step 30539: {'lr': 0.00016574947718881212, 'samples': 15636480, 'steps': 30539, 'batch_loss/train': 0.6478317846776918}
12/26/2021 23:47:23 - INFO - codeparrot_training - Step 30540: {'lr': 0.00016573463630396195, 'samples': 15636992, 'steps': 30540, 'batch_loss/train': 0.7974648009985685}
12/26/2021 23:47:33 - INFO - codeparrot_training - Step 30541: {'lr': 0.000165719795754119, 'samples': 15637504, 'steps': 30541, 'batch_loss/train': 0.6562700891518034}
12/26/2021 23:47:44 - INFO - codeparrot_training - Step 30542: {'lr': 0.00016570495553934234, 'samples': 15638016, 'steps': 30542, 'batch_loss/train': 0.6532367204781622}
12/26/2021 23:47:58 - INFO - codeparrot_training - Step 30543: {'lr': 0.000165690115659691, 'samples': 15638528, 'steps': 30543, 'batch_loss/train': 0.714112572837621}
12/26/2021 23:48:09 - INFO - codeparrot_training - Step 30544: {'lr': 0.00016567527611522392, 'samples': 15639040, 'steps': 30544, 'batch_loss/train': 0.7337025988381356}
12/26/2021 23:48:19 - INFO - codeparrot_training - Step 30545: {'lr': 0.00016566043690600002, 'samples': 15639552, 'steps': 30545, 'batch_loss/train': 0.7547266306355596}
12/26/2021 23:48:30 - INFO - codeparrot_training - Step 30546: {'lr': 0.00016564559803207844, 'samples': 15640064, 'steps': 30546, 'batch_loss/train': 0.6096406517899595}
12/26/2021 23:48:42 - INFO - codeparrot_training - Step 30547: {'lr': 0.00016563075949351812, 'samples': 15640576, 'steps': 30547, 'batch_loss/train': 0.7833166387281381}
12/26/2021 23:48:53 - INFO - codeparrot_training - Step 30548: {'lr': 0.00016561592129037803, 'samples': 15641088, 'steps': 30548, 'batch_loss/train': 0.7509008585475385}
12/26/2021 23:49:03 - INFO - codeparrot_training - Step 30549: {'lr': 0.00016560108342271711, 'samples': 15641600, 'steps': 30549, 'batch_loss/train': 0.8010374333243817}
12/26/2021 23:49:15 - INFO - codeparrot_training - Step 30550: {'lr': 0.00016558624589059452, 'samples': 15642112, 'steps': 30550, 'batch_loss/train': 0.78069659601897}
12/26/2021 23:49:26 - INFO - codeparrot_training - Step 30551: {'lr': 0.00016557140869406908, 'samples': 15642624, 'steps': 30551, 'batch_loss/train': 0.7584557815571316}
12/26/2021 23:49:37 - INFO - codeparrot_training - Step 30552: {'lr': 0.00016555657183319988, 'samples': 15643136, 'steps': 30552, 'batch_loss/train': 0.7605905525851995}
12/26/2021 23:49:51 - INFO - codeparrot_training - Step 30553: {'lr': 0.00016554173530804582, 'samples': 15643648, 'steps': 30553, 'batch_loss/train': 0.7879759510979056}
12/26/2021 23:50:01 - INFO - codeparrot_training - Step 30554: {'lr': 0.00016552689911866598, 'samples': 15644160, 'steps': 30554, 'batch_loss/train': 0.7540164571255445}
12/26/2021 23:50:12 - INFO - codeparrot_training - Step 30555: {'lr': 0.00016551206326511926, 'samples': 15644672, 'steps': 30555, 'batch_loss/train': 0.7237034541321918}
12/26/2021 23:50:24 - INFO - codeparrot_training - Step 30556: {'lr': 0.00016549722774746462, 'samples': 15645184, 'steps': 30556, 'batch_loss/train': 0.7816982325166464}
12/26/2021 23:50:35 - INFO - codeparrot_training - Step 30557: {'lr': 0.0001654823925657612, 'samples': 15645696, 'steps': 30557, 'batch_loss/train': 0.748059808742255}
12/26/2021 23:50:45 - INFO - codeparrot_training - Step 30558: {'lr': 0.00016546755772006783, 'samples': 15646208, 'steps': 30558, 'batch_loss/train': 0.8597570583224297}
12/26/2021 23:50:56 - INFO - codeparrot_training - Step 30559: {'lr': 0.00016545272321044359, 'samples': 15646720, 'steps': 30559, 'batch_loss/train': 0.8099596640095115}
12/26/2021 23:51:10 - INFO - codeparrot_training - Step 30560: {'lr': 0.00016543788903694725, 'samples': 15647232, 'steps': 30560, 'batch_loss/train': 0.6863399036228657}
12/26/2021 23:51:21 - INFO - codeparrot_training - Step 30561: {'lr': 0.000165423055199638, 'samples': 15647744, 'steps': 30561, 'batch_loss/train': 0.825993447448127}
12/26/2021 23:51:31 - INFO - codeparrot_training - Step 30562: {'lr': 0.00016540822169857483, 'samples': 15648256, 'steps': 30562, 'batch_loss/train': 0.8933410085737705}
12/26/2021 23:51:43 - INFO - codeparrot_training - Step 30563: {'lr': 0.00016539338853381664, 'samples': 15648768, 'steps': 30563, 'batch_loss/train': 0.9282791432924569}
12/26/2021 23:51:54 - INFO - codeparrot_training - Step 30564: {'lr': 0.00016537855570542226, 'samples': 15649280, 'steps': 30564, 'batch_loss/train': 0.7838929882273078}
12/26/2021 23:52:05 - INFO - codeparrot_training - Step 30565: {'lr': 0.00016536372321345085, 'samples': 15649792, 'steps': 30565, 'batch_loss/train': 0.7055733827874064}
12/26/2021 23:52:17 - INFO - codeparrot_training - Step 30566: {'lr': 0.00016534889105796136, 'samples': 15650304, 'steps': 30566, 'batch_loss/train': 0.7812752788886428}
12/26/2021 23:52:27 - INFO - codeparrot_training - Step 30567: {'lr': 0.00016533405923901262, 'samples': 15650816, 'steps': 30567, 'batch_loss/train': 0.836799256503582}
12/26/2021 23:52:38 - INFO - codeparrot_training - Step 30568: {'lr': 0.00016531922775666374, 'samples': 15651328, 'steps': 30568, 'batch_loss/train': 0.7327574114315212}
12/26/2021 23:52:48 - INFO - codeparrot_training - Step 30569: {'lr': 0.00016530439661097368, 'samples': 15651840, 'steps': 30569, 'batch_loss/train': 0.62836500024423}
12/26/2021 23:53:01 - INFO - codeparrot_training - Step 30570: {'lr': 0.00016528956580200138, 'samples': 15652352, 'steps': 30570, 'batch_loss/train': 0.8019597912207246}
12/26/2021 23:53:11 - INFO - codeparrot_training - Step 30571: {'lr': 0.00016527473532980565, 'samples': 15652864, 'steps': 30571, 'batch_loss/train': 0.8015092797577381}
12/26/2021 23:53:22 - INFO - codeparrot_training - Step 30572: {'lr': 0.00016525990519444567, 'samples': 15653376, 'steps': 30572, 'batch_loss/train': 0.7039557038806379}
12/26/2021 23:53:36 - INFO - codeparrot_training - Step 30573: {'lr': 0.00016524507539598032, 'samples': 15653888, 'steps': 30573, 'batch_loss/train': 0.7812421293929219}
12/26/2021 23:53:46 - INFO - codeparrot_training - Step 30574: {'lr': 0.00016523024593446853, 'samples': 15654400, 'steps': 30574, 'batch_loss/train': 0.767178145237267}
12/26/2021 23:53:57 - INFO - codeparrot_training - Step 30575: {'lr': 0.0001652154168099692, 'samples': 15654912, 'steps': 30575, 'batch_loss/train': 0.83501638751477}
12/26/2021 23:54:09 - INFO - codeparrot_training - Step 30576: {'lr': 0.00016520058802254147, 'samples': 15655424, 'steps': 30576, 'batch_loss/train': 0.6458901287987828}
12/26/2021 23:54:20 - INFO - codeparrot_training - Step 30577: {'lr': 0.00016518575957224414, 'samples': 15655936, 'steps': 30577, 'batch_loss/train': 0.8007586263120174}
12/26/2021 23:54:30 - INFO - codeparrot_training - Step 30578: {'lr': 0.00016517093145913613, 'samples': 15656448, 'steps': 30578, 'batch_loss/train': 0.7980684586800635}
12/26/2021 23:54:41 - INFO - codeparrot_training - Step 30579: {'lr': 0.0001651561036832766, 'samples': 15656960, 'steps': 30579, 'batch_loss/train': 0.7237235470674932}
12/26/2021 23:54:53 - INFO - codeparrot_training - Step 30580: {'lr': 0.00016514127624472436, 'samples': 15657472, 'steps': 30580, 'batch_loss/train': 0.7461907309480011}
12/26/2021 23:55:03 - INFO - codeparrot_training - Step 30581: {'lr': 0.0001651264491435383, 'samples': 15657984, 'steps': 30581, 'batch_loss/train': 0.7080123984487727}
12/26/2021 23:55:14 - INFO - codeparrot_training - Step 30582: {'lr': 0.0001651116223797774, 'samples': 15658496, 'steps': 30582, 'batch_loss/train': 1.0292422394268215}
12/26/2021 23:55:27 - INFO - codeparrot_training - Step 30583: {'lr': 0.00016509679595350075, 'samples': 15659008, 'steps': 30583, 'batch_loss/train': 0.7324570929631591}
12/26/2021 23:55:38 - INFO - codeparrot_training - Step 30584: {'lr': 0.00016508196986476708, 'samples': 15659520, 'steps': 30584, 'batch_loss/train': 0.6899360418319702}
12/26/2021 23:55:49 - INFO - codeparrot_training - Step 30585: {'lr': 0.00016506714411363552, 'samples': 15660032, 'steps': 30585, 'batch_loss/train': 0.7563235173001885}
12/26/2021 23:56:01 - INFO - codeparrot_training - Step 30586: {'lr': 0.0001650523187001648, 'samples': 15660544, 'steps': 30586, 'batch_loss/train': 0.7276992332190275}
12/26/2021 23:56:11 - INFO - codeparrot_training - Step 30587: {'lr': 0.00016503749362441406, 'samples': 15661056, 'steps': 30587, 'batch_loss/train': 0.7898718304932117}
12/26/2021 23:56:22 - INFO - codeparrot_training - Step 30588: {'lr': 0.0001650226688864422, 'samples': 15661568, 'steps': 30588, 'batch_loss/train': 0.7913267826661468}
12/26/2021 23:56:32 - INFO - codeparrot_training - Step 30589: {'lr': 0.0001650078444863081, 'samples': 15662080, 'steps': 30589, 'batch_loss/train': 0.7904033865779638}
12/26/2021 23:56:46 - INFO - codeparrot_training - Step 30590: {'lr': 0.0001649930204240707, 'samples': 15662592, 'steps': 30590, 'batch_loss/train': 0.7426213724538684}
12/26/2021 23:56:57 - INFO - codeparrot_training - Step 30591: {'lr': 0.00016497819669978892, 'samples': 15663104, 'steps': 30591, 'batch_loss/train': 0.7175329250749201}
12/26/2021 23:57:08 - INFO - codeparrot_training - Step 30592: {'lr': 0.00016496337331352185, 'samples': 15663616, 'steps': 30592, 'batch_loss/train': 0.6877839625813067}
12/26/2021 23:57:20 - INFO - codeparrot_training - Step 30593: {'lr': 0.00016494855026532812, 'samples': 15664128, 'steps': 30593, 'batch_loss/train': 0.7173084244132042}
12/26/2021 23:57:30 - INFO - codeparrot_training - Step 30594: {'lr': 0.00016493372755526693, 'samples': 15664640, 'steps': 30594, 'batch_loss/train': 0.8038392621092498}
12/26/2021 23:57:41 - 
INFO - codeparrot_training - Step 30595: {'lr': 0.00016491890518339719, 'samples': 15665152, 'steps': 30595, 'batch_loss/train': 0.8194242818281054} 12/26/2021 23:57:53 - INFO - codeparrot_training - Step 30596: {'lr': 0.00016490408314977768, 'samples': 15665664, 'steps': 30596, 'batch_loss/train': 0.8014969825744629} 12/26/2021 23:58:04 - INFO - codeparrot_training - Step 30597: {'lr': 0.00016488926145446736, 'samples': 15666176, 'steps': 30597, 'batch_loss/train': 0.7432472105138004} 12/26/2021 23:58:14 - INFO - codeparrot_training - Step 30598: {'lr': 0.00016487444009752528, 'samples': 15666688, 'steps': 30598, 'batch_loss/train': 0.7665765850106254} 12/26/2021 23:58:25 - INFO - codeparrot_training - Step 30599: {'lr': 0.0001648596190790103, 'samples': 15667200, 'steps': 30599, 'batch_loss/train': 0.8175449501723051} 12/26/2021 23:58:39 - INFO - codeparrot_training - Step 30600: {'lr': 0.0001648447983989813, 'samples': 15667712, 'steps': 30600, 'batch_loss/train': 0.6754542253911495} 12/26/2021 23:58:50 - INFO - codeparrot_training - Step 30601: {'lr': 0.00016482997805749715, 'samples': 15668224, 'steps': 30601, 'batch_loss/train': 0.48433887981809676} 12/26/2021 23:59:00 - INFO - codeparrot_training - Step 30602: {'lr': 0.00016481515805461693, 'samples': 15668736, 'steps': 30602, 'batch_loss/train': 0.597506923484616} 12/26/2021 23:59:13 - INFO - codeparrot_training - Step 30603: {'lr': 0.0001648003383903995, 'samples': 15669248, 'steps': 30603, 'batch_loss/train': 0.7370040656533092} 12/26/2021 23:59:23 - INFO - codeparrot_training - Step 30604: {'lr': 0.00016478551906490364, 'samples': 15669760, 'steps': 30604, 'batch_loss/train': 0.645355440035928} 12/26/2021 23:59:34 - INFO - codeparrot_training - Step 30605: {'lr': 0.00016477070007818852, 'samples': 15670272, 'steps': 30605, 'batch_loss/train': 0.8864666409790516} 12/26/2021 23:59:46 - INFO - codeparrot_training - Step 30606: {'lr': 0.00016475588143031282, 'samples': 15670784, 'steps': 30606, 
'batch_loss/train': 0.808967188000679} 12/26/2021 23:59:57 - INFO - codeparrot_training - Step 30607: {'lr': 0.00016474106312133562, 'samples': 15671296, 'steps': 30607, 'batch_loss/train': 0.9342397348955274} 12/27/2021 00:00:07 - INFO - codeparrot_training - Step 30608: {'lr': 0.00016472624515131566, 'samples': 15671808, 'steps': 30608, 'batch_loss/train': 0.7565833814442158} 12/27/2021 00:00:18 - INFO - codeparrot_training - Step 30609: {'lr': 0.00016471142752031204, 'samples': 15672320, 'steps': 30609, 'batch_loss/train': 0.7412891276180744} 12/27/2021 00:00:32 - INFO - codeparrot_training - Step 30610: {'lr': 0.00016469661022838357, 'samples': 15672832, 'steps': 30610, 'batch_loss/train': 0.7930377947632223} 12/27/2021 00:00:42 - INFO - codeparrot_training - Step 30611: {'lr': 0.0001646817932755892, 'samples': 15673344, 'steps': 30611, 'batch_loss/train': 0.7553656827658415} 12/27/2021 00:00:53 - INFO - codeparrot_training - Step 30612: {'lr': 0.00016466697666198766, 'samples': 15673856, 'steps': 30612, 'batch_loss/train': 0.6915016373386607} 12/27/2021 00:01:05 - INFO - codeparrot_training - Step 30613: {'lr': 0.0001646521603876381, 'samples': 15674368, 'steps': 30613, 'batch_loss/train': 0.819216487929225} 12/27/2021 00:01:16 - INFO - codeparrot_training - Step 30614: {'lr': 0.00016463734445259937, 'samples': 15674880, 'steps': 30614, 'batch_loss/train': 0.7304168334230781} 12/27/2021 00:01:26 - INFO - codeparrot_training - Step 30615: {'lr': 0.00016462252885693018, 'samples': 15675392, 'steps': 30615, 'batch_loss/train': 0.7064919555559754} 12/27/2021 00:01:39 - INFO - codeparrot_training - Step 30616: {'lr': 0.00016460771360068973, 'samples': 15675904, 'steps': 30616, 'batch_loss/train': 0.6777270098682493} 12/27/2021 00:01:49 - INFO - codeparrot_training - Step 30617: {'lr': 0.00016459289868393673, 'samples': 15676416, 'steps': 30617, 'batch_loss/train': 0.7308497936464846} 12/27/2021 00:02:00 - INFO - codeparrot_training - Step 30618: {'lr': 
0.00016457808410673015, 'samples': 15676928, 'steps': 30618, 'batch_loss/train': 0.8203815794549882} 12/27/2021 00:02:14 - INFO - codeparrot_training - Step 30619: {'lr': 0.00016456326986912873, 'samples': 15677440, 'steps': 30619, 'batch_loss/train': 0.716071140486747} 12/27/2021 00:02:24 - INFO - codeparrot_training - Step 30620: {'lr': 0.00016454845597119156, 'samples': 15677952, 'steps': 30620, 'batch_loss/train': 0.6438308507204056} 12/27/2021 00:02:35 - INFO - codeparrot_training - Step 30621: {'lr': 0.00016453364241297746, 'samples': 15678464, 'steps': 30621, 'batch_loss/train': 0.7448357627727091} 12/27/2021 00:02:46 - INFO - codeparrot_training - Step 30622: {'lr': 0.00016451882919454536, 'samples': 15678976, 'steps': 30622, 'batch_loss/train': 0.6836877157911658} 12/27/2021 00:02:58 - INFO - codeparrot_training - Step 30623: {'lr': 0.000164504016315954, 'samples': 15679488, 'steps': 30623, 'batch_loss/train': 0.6715410231263377} 12/27/2021 00:03:08 - INFO - codeparrot_training - Step 30624: {'lr': 0.00016448920377726256, 'samples': 15680000, 'steps': 30624, 'batch_loss/train': 0.7180749094113708} 12/27/2021 00:03:19 - INFO - codeparrot_training - Step 30625: {'lr': 0.00016447439157852973, 'samples': 15680512, 'steps': 30625, 'batch_loss/train': 0.9870774019509554} 12/27/2021 00:03:31 - INFO - codeparrot_training - Step 30626: {'lr': 0.00016445957971981435, 'samples': 15681024, 'steps': 30626, 'batch_loss/train': 0.8315054774284363} 12/27/2021 00:03:42 - INFO - codeparrot_training - Step 30627: {'lr': 0.00016444476820117533, 'samples': 15681536, 'steps': 30627, 'batch_loss/train': 0.689970220439136} 12/27/2021 00:03:52 - INFO - codeparrot_training - Step 30628: {'lr': 0.00016442995702267173, 'samples': 15682048, 'steps': 30628, 'batch_loss/train': 0.7069191108457744} 12/27/2021 00:04:06 - INFO - codeparrot_training - Step 30629: {'lr': 0.00016441514618436225, 'samples': 15682560, 'steps': 30629, 'batch_loss/train': 0.7637327616102993} 12/27/2021 00:04:17 - 
INFO - codeparrot_training - Step 30630: {'lr': 0.0001644003356863058, 'samples': 15683072, 'steps': 30630, 'batch_loss/train': 0.7515502593014389} 12/27/2021 00:04:28 - INFO - codeparrot_training - Step 30631: {'lr': 0.0001643855255285614, 'samples': 15683584, 'steps': 30631, 'batch_loss/train': 0.8343311632052064} 12/27/2021 00:04:38 - INFO - codeparrot_training - Step 30632: {'lr': 0.00016437071571118772, 'samples': 15684096, 'steps': 30632, 'batch_loss/train': 0.7845033146440983} 12/27/2021 00:04:50 - INFO - codeparrot_training - Step 30633: {'lr': 0.00016435590623424385, 'samples': 15684608, 'steps': 30633, 'batch_loss/train': 0.7781083029694855} 12/27/2021 00:05:01 - INFO - codeparrot_training - Step 30634: {'lr': 0.00016434109709778845, 'samples': 15685120, 'steps': 30634, 'batch_loss/train': 0.7876761127263308} 12/27/2021 00:05:12 - INFO - codeparrot_training - Step 30635: {'lr': 0.00016432628830188056, 'samples': 15685632, 'steps': 30635, 'batch_loss/train': 0.7634504833258688} 12/27/2021 00:05:26 - INFO - codeparrot_training - Step 30636: {'lr': 0.00016431147984657902, 'samples': 15686144, 'steps': 30636, 'batch_loss/train': 0.6915192960295826} 12/27/2021 00:05:36 - INFO - codeparrot_training - Step 30637: {'lr': 0.0001642966717319427, 'samples': 15686656, 'steps': 30637, 'batch_loss/train': 0.8007270535454154} 12/27/2021 00:05:47 - INFO - codeparrot_training - Step 30638: {'lr': 0.0001642818639580303, 'samples': 15687168, 'steps': 30638, 'batch_loss/train': 0.973015614785254} 12/27/2021 00:05:59 - INFO - codeparrot_training - Step 30639: {'lr': 0.00016426705652490095, 'samples': 15687680, 'steps': 30639, 'batch_loss/train': 0.920708023943007} 12/27/2021 00:06:10 - INFO - codeparrot_training - Step 30640: {'lr': 0.00016425224943261347, 'samples': 15688192, 'steps': 30640, 'batch_loss/train': 0.8099111579358578} 12/27/2021 00:06:20 - INFO - codeparrot_training - Step 30641: {'lr': 0.00016423744268122652, 'samples': 15688704, 'steps': 30641, 
'batch_loss/train': 0.8282647514715791} 12/27/2021 00:06:31 - INFO - codeparrot_training - Step 30642: {'lr': 0.0001642226362707992, 'samples': 15689216, 'steps': 30642, 'batch_loss/train': 0.6294436082243919} 12/27/2021 00:06:43 - INFO - codeparrot_training - Step 30643: {'lr': 0.00016420783020139036, 'samples': 15689728, 'steps': 30643, 'batch_loss/train': 0.8696349109522998} 12/27/2021 00:06:54 - INFO - codeparrot_training - Step 30644: {'lr': 0.00016419302447305876, 'samples': 15690240, 'steps': 30644, 'batch_loss/train': 0.8386941161006689} 12/27/2021 00:07:04 - INFO - codeparrot_training - Step 30645: {'lr': 0.0001641782190858632, 'samples': 15690752, 'steps': 30645, 'batch_loss/train': 0.7935496317222714} 12/27/2021 00:07:17 - INFO - codeparrot_training - Step 30646: {'lr': 0.0001641634140398627, 'samples': 15691264, 'steps': 30646, 'batch_loss/train': 0.7868967191316187} 12/27/2021 00:07:27 - INFO - codeparrot_training - Step 30647: {'lr': 0.00016414860933511608, 'samples': 15691776, 'steps': 30647, 'batch_loss/train': 0.8516339175403118} 12/27/2021 00:07:38 - INFO - codeparrot_training - Step 30648: {'lr': 0.00016413380497168212, 'samples': 15692288, 'steps': 30648, 'batch_loss/train': 0.7968182768672705} 12/27/2021 00:07:52 - INFO - codeparrot_training - Step 30649: {'lr': 0.0001641190009496197, 'samples': 15692800, 'steps': 30649, 'batch_loss/train': 0.7777708092471585} 12/27/2021 00:08:02 - INFO - codeparrot_training - Step 30650: {'lr': 0.00016410419726898778, 'samples': 15693312, 'steps': 30650, 'batch_loss/train': 0.7401506868191063} 12/27/2021 00:08:13 - INFO - codeparrot_training - Step 30651: {'lr': 0.00016408939392984512, 'samples': 15693824, 'steps': 30651, 'batch_loss/train': 0.8122078664600849} 12/27/2021 00:08:23 - INFO - codeparrot_training - Step 30652: {'lr': 0.00016407459093225056, 'samples': 15694336, 'steps': 30652, 'batch_loss/train': 0.7899655774235725} 12/27/2021 00:08:36 - INFO - codeparrot_training - Step 30653: {'lr': 
0.00016405978827626307, 'samples': 15694848, 'steps': 30653, 'batch_loss/train': 0.9921443085186183} 12/27/2021 00:08:46 - INFO - codeparrot_training - Step 30654: {'lr': 0.00016404498596194137, 'samples': 15695360, 'steps': 30654, 'batch_loss/train': 0.6335900048725307} 12/27/2021 00:08:57 - INFO - codeparrot_training - Step 30655: {'lr': 0.00016403018398934434, 'samples': 15695872, 'steps': 30655, 'batch_loss/train': 0.7562210966134444} 12/27/2021 00:09:09 - INFO - codeparrot_training - Step 30656: {'lr': 0.00016401538235853076, 'samples': 15696384, 'steps': 30656, 'batch_loss/train': 0.711921798530966} 12/27/2021 00:09:20 - INFO - codeparrot_training - Step 30657: {'lr': 0.00016400058106955969, 'samples': 15696896, 'steps': 30657, 'batch_loss/train': 0.808349093887955} 12/27/2021 00:09:30 - INFO - codeparrot_training - Step 30658: {'lr': 0.00016398578012248973, 'samples': 15697408, 'steps': 30658, 'batch_loss/train': 0.6860288185998797} 12/27/2021 00:09:45 - INFO - codeparrot_training - Step 30659: {'lr': 0.00016397097951737993, 'samples': 15697920, 'steps': 30659, 'batch_loss/train': 0.7258026488125324} 12/27/2021 00:09:55 - INFO - codeparrot_training - Step 30660: {'lr': 0.00016395617925428892, 'samples': 15698432, 'steps': 30660, 'batch_loss/train': 0.7601099401945248} 12/27/2021 00:10:06 - INFO - codeparrot_training - Step 30661: {'lr': 0.00016394137933327575, 'samples': 15698944, 'steps': 30661, 'batch_loss/train': 0.6781169390305877} 12/27/2021 00:10:18 - INFO - codeparrot_training - Step 30662: {'lr': 0.00016392657975439909, 'samples': 15699456, 'steps': 30662, 'batch_loss/train': 0.795491547207348} 12/27/2021 00:10:29 - INFO - codeparrot_training - Step 30663: {'lr': 0.00016391178051771793, 'samples': 15699968, 'steps': 30663, 'batch_loss/train': 0.7274784548208117} 12/27/2021 00:10:39 - INFO - codeparrot_training - Step 30664: {'lr': 0.00016389698162329092, 'samples': 15700480, 'steps': 30664, 'batch_loss/train': 0.8066432783380151} 12/27/2021 00:10:50 
- INFO - codeparrot_training - Step 30665: {'lr': 0.00016388218307117702, 'samples': 15700992, 'steps': 30665, 'batch_loss/train': 0.606018669786863} 12/27/2021 00:11:02 - INFO - codeparrot_training - Step 30666: {'lr': 0.00016386738486143511, 'samples': 15701504, 'steps': 30666, 'batch_loss/train': 0.5948637470137328} 12/27/2021 00:11:13 - INFO - codeparrot_training - Step 30667: {'lr': 0.00016385258699412386, 'samples': 15702016, 'steps': 30667, 'batch_loss/train': 0.6891128541901708} 12/27/2021 00:11:23 - INFO - codeparrot_training - Step 30668: {'lr': 0.00016383778946930228, 'samples': 15702528, 'steps': 30668, 'batch_loss/train': 0.6824625106528401} 12/27/2021 00:11:37 - INFO - codeparrot_training - Step 30669: {'lr': 0.00016382299228702912, 'samples': 15703040, 'steps': 30669, 'batch_loss/train': 0.8419821290299296} 12/27/2021 00:11:48 - INFO - codeparrot_training - Step 30670: {'lr': 0.00016380819544736322, 'samples': 15703552, 'steps': 30670, 'batch_loss/train': 0.777962270192802} 12/27/2021 00:11:58 - INFO - codeparrot_training - Step 30671: {'lr': 0.00016379339895036327, 'samples': 15704064, 'steps': 30671, 'batch_loss/train': 0.752974616130814} 12/27/2021 00:12:11 - INFO - codeparrot_training - Step 30672: {'lr': 0.0001637786027960883, 'samples': 15704576, 'steps': 30672, 'batch_loss/train': 0.7670457467902452} 12/27/2021 00:12:21 - INFO - codeparrot_training - Step 30673: {'lr': 0.00016376380698459708, 'samples': 15705088, 'steps': 30673, 'batch_loss/train': 0.714107563951984} 12/27/2021 00:12:32 - INFO - codeparrot_training - Step 30674: {'lr': 0.00016374901151594833, 'samples': 15705600, 'steps': 30674, 'batch_loss/train': 0.7320686941966414} 12/27/2021 00:12:42 - INFO - codeparrot_training - Step 30675: {'lr': 0.00016373421639020094, 'samples': 15706112, 'steps': 30675, 'batch_loss/train': 0.7192562588024884} 12/27/2021 00:12:56 - INFO - codeparrot_training - Step 30676: {'lr': 0.00016371942160741383, 'samples': 15706624, 'steps': 30676, 
'batch_loss/train': 0.7603801421355456} 12/27/2021 00:13:07 - INFO - codeparrot_training - Step 30677: {'lr': 0.00016370462716764565, 'samples': 15707136, 'steps': 30677, 'batch_loss/train': 0.9963803472928703} 12/27/2021 00:13:18 - INFO - codeparrot_training - Step 30678: {'lr': 0.00016368983307095522, 'samples': 15707648, 'steps': 30678, 'batch_loss/train': 0.7094229804351926} 12/27/2021 00:13:30 - INFO - codeparrot_training - Step 30679: {'lr': 0.00016367503931740158, 'samples': 15708160, 'steps': 30679, 'batch_loss/train': 0.6663944693282247} 12/27/2021 00:13:41 - INFO - codeparrot_training - Step 30680: {'lr': 0.0001636602459070433, 'samples': 15708672, 'steps': 30680, 'batch_loss/train': 0.8590273009613156} 12/27/2021 00:13:51 - INFO - codeparrot_training - Step 30681: {'lr': 0.00016364545283993932, 'samples': 15709184, 'steps': 30681, 'batch_loss/train': 0.8413809025660157} 12/27/2021 00:14:03 - INFO - codeparrot_training - Step 30682: {'lr': 0.00016363066011614832, 'samples': 15709696, 'steps': 30682, 'batch_loss/train': 0.8472917936742306} 12/27/2021 00:14:14 - INFO - codeparrot_training - Step 30683: {'lr': 0.0001636158677357293, 'samples': 15710208, 'steps': 30683, 'batch_loss/train': 0.7693531550467014} 12/27/2021 00:14:25 - INFO - codeparrot_training - Step 30684: {'lr': 0.00016360107569874093, 'samples': 15710720, 'steps': 30684, 'batch_loss/train': 0.6100525700021535} 12/27/2021 00:14:37 - INFO - codeparrot_training - Step 30685: {'lr': 0.00016358628400524214, 'samples': 15711232, 'steps': 30685, 'batch_loss/train': 0.7896861517801881} 12/27/2021 00:14:48 - INFO - codeparrot_training - Step 30686: {'lr': 0.00016357149265529153, 'samples': 15711744, 'steps': 30686, 'batch_loss/train': 0.8218743037432432} 12/27/2021 00:14:58 - INFO - codeparrot_training - Step 30687: {'lr': 0.0001635567016489481, 'samples': 15712256, 'steps': 30687, 'batch_loss/train': 0.7702992677222937} 12/27/2021 00:15:09 - INFO - codeparrot_training - Step 30688: {'lr': 
0.0001635419109862706, 'samples': 15712768, 'steps': 30688, 'batch_loss/train': 0.7788527719676495} 12/27/2021 00:15:22 - INFO - codeparrot_training - Step 30689: {'lr': 0.00016352712066731773, 'samples': 15713280, 'steps': 30689, 'batch_loss/train': 0.7354790144599974} 12/27/2021 00:15:33 - INFO - codeparrot_training - Step 30690: {'lr': 0.00016351233069214849, 'samples': 15713792, 'steps': 30690, 'batch_loss/train': 0.7381614069454372} 12/27/2021 00:15:44 - INFO - codeparrot_training - Step 30691: {'lr': 0.0001634975410608215, 'samples': 15714304, 'steps': 30691, 'batch_loss/train': 0.7937191287055612} 12/27/2021 00:15:56 - INFO - codeparrot_training - Step 30692: {'lr': 0.00016348275177339566, 'samples': 15714816, 'steps': 30692, 'batch_loss/train': 0.7495193808572367} 12/27/2021 00:16:06 - INFO - codeparrot_training - Step 30693: {'lr': 0.00016346796282992965, 'samples': 15715328, 'steps': 30693, 'batch_loss/train': 0.6835753032937646} 12/27/2021 00:16:17 - INFO - codeparrot_training - Step 30694: {'lr': 0.00016345317423048243, 'samples': 15715840, 'steps': 30694, 'batch_loss/train': 0.7349508409388363} 12/27/2021 00:16:29 - INFO - codeparrot_training - Step 30695: {'lr': 0.00016343838597511273, 'samples': 15716352, 'steps': 30695, 'batch_loss/train': 0.7574555147439241} 12/27/2021 00:16:40 - INFO - codeparrot_training - Step 30696: {'lr': 0.00016342359806387923, 'samples': 15716864, 'steps': 30696, 'batch_loss/train': 0.7284579984843731} 12/27/2021 00:16:50 - INFO - codeparrot_training - Step 30697: {'lr': 0.00016340881049684084, 'samples': 15717376, 'steps': 30697, 'batch_loss/train': 0.6621048506349325} 12/27/2021 00:17:01 - INFO - codeparrot_training - Step 30698: {'lr': 0.0001633940232740564, 'samples': 15717888, 'steps': 30698, 'batch_loss/train': 0.6218259930610657} 12/27/2021 00:17:15 - INFO - codeparrot_training - Step 30699: {'lr': 0.0001633792363955846, 'samples': 15718400, 'steps': 30699, 'batch_loss/train': 0.7929580514319241} 12/27/2021 00:17:25 - 
INFO - codeparrot_training - Step 30700: {'lr': 0.0001633644498614842, 'samples': 15718912, 'steps': 30700, 'batch_loss/train': 0.7750376597978175} 12/27/2021 00:17:36 - INFO - codeparrot_training - Step 30701: {'lr': 0.00016334966367181404, 'samples': 15719424, 'steps': 30701, 'batch_loss/train': 0.9509856104850769} 12/27/2021 00:17:48 - INFO - codeparrot_training - Step 30702: {'lr': 0.00016333487782663292, 'samples': 15719936, 'steps': 30702, 'batch_loss/train': 0.8648104295134544} 12/27/2021 00:17:59 - INFO - codeparrot_training - Step 30703: {'lr': 0.0001633200923259996, 'samples': 15720448, 'steps': 30703, 'batch_loss/train': 0.7686479613184929} 12/27/2021 00:18:10 - INFO - codeparrot_training - Step 30704: {'lr': 0.0001633053071699728, 'samples': 15720960, 'steps': 30704, 'batch_loss/train': 0.664530704729259} 12/27/2021 00:18:23 - INFO - codeparrot_training - Step 30705: {'lr': 0.00016329052235861146, 'samples': 15721472, 'steps': 30705, 'batch_loss/train': 0.839724762365222} 12/27/2021 00:18:34 - INFO - codeparrot_training - Step 30706: {'lr': 0.00016327573789197427, 'samples': 15721984, 'steps': 30706, 'batch_loss/train': 0.6984200719743967} 12/27/2021 00:18:45 - INFO - codeparrot_training - Step 30707: {'lr': 0.00016326095377011996, 'samples': 15722496, 'steps': 30707, 'batch_loss/train': 0.7811967087909579} 12/27/2021 00:18:55 - INFO - codeparrot_training - Step 30708: {'lr': 0.0001632461699931073, 'samples': 15723008, 'steps': 30708, 'batch_loss/train': 0.6740872678346932} 12/27/2021 00:19:07 - INFO - codeparrot_training - Step 30709: {'lr': 0.00016323138656099516, 'samples': 15723520, 'steps': 30709, 'batch_loss/train': 0.7913325112313032} 12/27/2021 00:19:18 - INFO - codeparrot_training - Step 30710: {'lr': 0.00016321660347384226, 'samples': 15724032, 'steps': 30710, 'batch_loss/train': 0.7781145954504609} 12/27/2021 00:19:29 - INFO - codeparrot_training - Step 30711: {'lr': 0.00016320182073170742, 'samples': 15724544, 'steps': 30711, 
'batch_loss/train': 0.8135796496644616} 12/27/2021 00:19:41 - INFO - codeparrot_training - Step 30712: {'lr': 0.00016318703833464925, 'samples': 15725056, 'steps': 30712, 'batch_loss/train': 0.9736356204375625} 12/27/2021 00:19:51 - INFO - codeparrot_training - Step 30713: {'lr': 0.00016317225628272676, 'samples': 15725568, 'steps': 30713, 'batch_loss/train': 0.6943339044228196} 12/27/2021 00:20:02 - INFO - codeparrot_training - Step 30714: {'lr': 0.00016315747457599855, 'samples': 15726080, 'steps': 30714, 'batch_loss/train': 0.8116454565897584} 12/27/2021 00:20:16 - INFO - codeparrot_training - Step 30715: {'lr': 0.00016314269321452337, 'samples': 15726592, 'steps': 30715, 'batch_loss/train': 0.6794492965564132} 12/27/2021 00:20:27 - INFO - codeparrot_training - Step 30716: {'lr': 0.00016312791219836014, 'samples': 15727104, 'steps': 30716, 'batch_loss/train': 0.8066911827772856} 12/27/2021 00:20:38 - INFO - codeparrot_training - Step 30717: {'lr': 0.00016311313152756748, 'samples': 15727616, 'steps': 30717, 'batch_loss/train': 0.7490399368107319} 12/27/2021 00:20:48 - INFO - codeparrot_training - Step 30718: {'lr': 0.00016309835120220428, 'samples': 15728128, 'steps': 30718, 'batch_loss/train': 0.5833171107806265} 12/27/2021 00:21:00 - INFO - codeparrot_training - Step 30719: {'lr': 0.00016308357122232908, 'samples': 15728640, 'steps': 30719, 'batch_loss/train': 0.8767593218944967} 12/27/2021 00:21:11 - INFO - codeparrot_training - Step 30720: {'lr': 0.00016306879158800087, 'samples': 15729152, 'steps': 30720, 'batch_loss/train': 0.7668367850128561} 12/27/2021 00:21:22 - INFO - codeparrot_training - Step 30721: {'lr': 0.00016305401229927834, 'samples': 15729664, 'steps': 30721, 'batch_loss/train': 0.769505399512127} 12/27/2021 00:21:34 - INFO - codeparrot_training - Step 30722: {'lr': 0.00016303923335622017, 'samples': 15730176, 'steps': 30722, 'batch_loss/train': 0.6662010657601058} 12/27/2021 00:21:44 - INFO - codeparrot_training - Step 30723: {'lr': 
0.00016302445475888517, 'samples': 15730688, 'steps': 30723, 'batch_loss/train': 0.4862563656643033} 12/27/2021 00:21:55 - INFO - codeparrot_training - Step 30724: {'lr': 0.00016300967650733215, 'samples': 15731200, 'steps': 30724, 'batch_loss/train': 0.7248493866063654} 12/27/2021 00:22:07 - INFO - codeparrot_training - Step 30725: {'lr': 0.0001629948986016198, 'samples': 15731712, 'steps': 30725, 'batch_loss/train': 0.753311253618449} 12/27/2021 00:22:18 - INFO - codeparrot_training - Step 30726: {'lr': 0.00016298012104180683, 'samples': 15732224, 'steps': 30726, 'batch_loss/train': 0.6851329994387925} 12/27/2021 00:22:28 - INFO - codeparrot_training - Step 30727: {'lr': 0.00016296534382795208, 'samples': 15732736, 'steps': 30727, 'batch_loss/train': 0.7255883254110813} 12/27/2021 00:22:39 - INFO - codeparrot_training - Step 30728: {'lr': 0.00016295056696011428, 'samples': 15733248, 'steps': 30728, 'batch_loss/train': 0.8874179003760219} 12/27/2021 00:22:53 - INFO - codeparrot_training - Step 30729: {'lr': 0.00016293579043835215, 'samples': 15733760, 'steps': 30729, 'batch_loss/train': 0.8322876477614045} 12/27/2021 00:23:03 - INFO - codeparrot_training - Step 30730: {'lr': 0.00016292101426272436, 'samples': 15734272, 'steps': 30730, 'batch_loss/train': 0.7694784186314791} 12/27/2021 00:23:14 - INFO - codeparrot_training - Step 30731: {'lr': 0.00016290623843328984, 'samples': 15734784, 'steps': 30731, 'batch_loss/train': 0.6983974915929139} 12/27/2021 00:23:26 - INFO - codeparrot_training - Step 30732: {'lr': 0.0001628914629501072, 'samples': 15735296, 'steps': 30732, 'batch_loss/train': 0.6290706023573875} 12/27/2021 00:23:37 - INFO - codeparrot_training - Step 30733: {'lr': 0.00016287668781323524, 'samples': 15735808, 'steps': 30733, 'batch_loss/train': 1.1383631052449346} 12/27/2021 00:23:48 - INFO - codeparrot_training - Step 30734: {'lr': 0.0001628619130227326, 'samples': 15736320, 'steps': 30734, 'batch_loss/train': 0.7600749852135777} 12/27/2021 00:23:58 - 
INFO - codeparrot_training - Step 30735: {'lr': 0.0001628471385786582, 'samples': 15736832, 'steps': 30735, 'batch_loss/train': 0.6916469964198768} 12/27/2021 00:24:10 - INFO - codeparrot_training - Step 30736: {'lr': 0.00016283236448107058, 'samples': 15737344, 'steps': 30736, 'batch_loss/train': 0.8044495219364762} 12/27/2021 00:24:21 - INFO - codeparrot_training - Step 30737: {'lr': 0.0001628175907300286, 'samples': 15737856, 'steps': 30737, 'batch_loss/train': 0.8046136777848005} 12/27/2021 00:24:31 - INFO - codeparrot_training - Step 30738: {'lr': 0.00016280281732559092, 'samples': 15738368, 'steps': 30738, 'batch_loss/train': 0.6870739919831976} 12/27/2021 00:24:46 - INFO - codeparrot_training - Step 30739: {'lr': 0.00016278804426781633, 'samples': 15738880, 'steps': 30739, 'batch_loss/train': 0.7778442117851228} 12/27/2021 00:24:56 - INFO - codeparrot_training - Step 30740: {'lr': 0.00016277327155676358, 'samples': 15739392, 'steps': 30740, 'batch_loss/train': 0.7426081290468574} 12/27/2021 00:25:07 - INFO - codeparrot_training - Step 30741: {'lr': 0.00016275849919249131, 'samples': 15739904, 'steps': 30741, 'batch_loss/train': 0.6952892089029774} 12/27/2021 00:25:20 - INFO - codeparrot_training - Step 30742: {'lr': 0.0001627437271750583, 'samples': 15740416, 'steps': 30742, 'batch_loss/train': 0.6948827924206853} 12/27/2021 00:25:31 - INFO - codeparrot_training - Step 30743: {'lr': 0.00016272895550452338, 'samples': 15740928, 'steps': 30743, 'batch_loss/train': 0.7573662605136633} 12/27/2021 00:25:41 - INFO - codeparrot_training - Step 30744: {'lr': 0.00016271418418094518, 'samples': 15741440, 'steps': 30744, 'batch_loss/train': 1.0494047132087871} 12/27/2021 00:25:52 - INFO - codeparrot_training - Step 30745: {'lr': 0.0001626994132043823, 'samples': 15741952, 'steps': 30745, 'batch_loss/train': 1.9626185595989227} 12/27/2021 00:26:06 - INFO - codeparrot_training - Step 30746: {'lr': 0.00016268464257489367, 'samples': 15742464, 'steps': 30746, 
'batch_loss/train': 1.1384495692327619} 12/27/2021 00:26:17 - INFO - codeparrot_training - Step 30747: {'lr': 0.00016266987229253798, 'samples': 15742976, 'steps': 30747, 'batch_loss/train': 0.7080499958246946} 12/27/2021 00:26:27 - INFO - codeparrot_training - Step 30748: {'lr': 0.00016265510235737382, 'samples': 15743488, 'steps': 30748, 'batch_loss/train': 0.8030439291615039} 12/27/2021 00:26:39 - INFO - codeparrot_training - Step 30749: {'lr': 0.00016264033276945997, 'samples': 15744000, 'steps': 30749, 'batch_loss/train': 0.7787617789581418} 12/27/2021 00:26:50 - INFO - codeparrot_training - Step 30750: {'lr': 0.00016262556352885526, 'samples': 15744512, 'steps': 30750, 'batch_loss/train': 0.7038614861667156} 12/27/2021 00:27:01 - INFO - codeparrot_training - Step 30751: {'lr': 0.00016261079463561834, 'samples': 15745024, 'steps': 30751, 'batch_loss/train': 0.7629413893446326} 12/27/2021 00:27:11 - INFO - codeparrot_training - Step 30752: {'lr': 0.0001625960260898078, 'samples': 15745536, 'steps': 30752, 'batch_loss/train': 0.7356947157531977} 12/27/2021 00:27:23 - INFO - codeparrot_training - Step 30753: {'lr': 0.00016258125789148244, 'samples': 15746048, 'steps': 30753, 'batch_loss/train': 0.8345602694898844} 12/27/2021 00:27:34 - INFO - codeparrot_training - Step 30754: {'lr': 0.0001625664900407011, 'samples': 15746560, 'steps': 30754, 'batch_loss/train': 0.7484917985275388} 12/27/2021 00:27:45 - INFO - codeparrot_training - Step 30755: {'lr': 0.00016255172253752234, 'samples': 15747072, 'steps': 30755, 'batch_loss/train': 0.7298173727467656} 12/27/2021 00:27:59 - INFO - codeparrot_training - Step 30756: {'lr': 0.00016253695538200486, 'samples': 15747584, 'steps': 30756, 'batch_loss/train': 0.8583465861156583} 12/27/2021 00:28:09 - INFO - codeparrot_training - Step 30757: {'lr': 0.00016252218857420748, 'samples': 15748096, 'steps': 30757, 'batch_loss/train': 0.8300075652077794} 12/27/2021 00:28:20 - INFO - codeparrot_training - Step 30758: {'lr': 
0.00016250742211418884, 'samples': 15748608, 'steps': 30758, 'batch_loss/train': 0.8139081031549722} 12/27/2021 00:28:32 - INFO - codeparrot_training - Step 30759: {'lr': 0.00016249265600200768, 'samples': 15749120, 'steps': 30759, 'batch_loss/train': 0.7976779462769628} 12/27/2021 00:28:43 - INFO - codeparrot_training - Step 30760: {'lr': 0.00016247789023772258, 'samples': 15749632, 'steps': 30760, 'batch_loss/train': 0.795784049667418} 12/27/2021 00:28:53 - INFO - codeparrot_training - Step 30761: {'lr': 0.00016246312482139246, 'samples': 15750144, 'steps': 30761, 'batch_loss/train': 0.776810348033905} 12/27/2021 00:29:04 - INFO - codeparrot_training - Step 30762: {'lr': 0.00016244835975307587, 'samples': 15750656, 'steps': 30762, 'batch_loss/train': 0.8015978625044227} 12/27/2021 00:29:18 - INFO - codeparrot_training - Step 30763: {'lr': 0.00016243359503283146, 'samples': 15751168, 'steps': 30763, 'batch_loss/train': 0.8141813455149531} 12/27/2021 00:29:29 - INFO - codeparrot_training - Step 30764: {'lr': 0.0001624188306607181, 'samples': 15751680, 'steps': 30764, 'batch_loss/train': 0.6414330622646958} 12/27/2021 00:29:40 - INFO - codeparrot_training - Step 30765: {'lr': 0.00016240406663679436, 'samples': 15752192, 'steps': 30765, 'batch_loss/train': 0.7387345200404525} 12/27/2021 00:29:52 - INFO - codeparrot_training - Step 30766: {'lr': 0.00016238930296111903, 'samples': 15752704, 'steps': 30766, 'batch_loss/train': 1.0650580124929547} 12/27/2021 00:30:03 - INFO - codeparrot_training - Step 30767: {'lr': 0.00016237453963375065, 'samples': 15753216, 'steps': 30767, 'batch_loss/train': 0.7682269169017673} 12/27/2021 00:30:13 - INFO - codeparrot_training - Step 30768: {'lr': 0.0001623597766547481, 'samples': 15753728, 'steps': 30768, 'batch_loss/train': 0.7155738370493054} 12/27/2021 00:30:24 - INFO - codeparrot_training - Step 30769: {'lr': 0.00016234501402417, 'samples': 15754240, 'steps': 30769, 'batch_loss/train': 0.8091745898127556} 12/27/2021 00:30:36 - 
INFO - codeparrot_training - Step 30770: {'lr': 0.00016233025174207505, 'samples': 15754752, 'steps': 30770, 'batch_loss/train': 0.8596173236146569} 12/27/2021 00:30:47 - INFO - codeparrot_training - Step 30771: {'lr': 0.0001623154898085218, 'samples': 15755264, 'steps': 30771, 'batch_loss/train': 0.7708400972187519} 12/27/2021 00:30:57 - INFO - codeparrot_training - Step 30772: {'lr': 0.00016230072822356908, 'samples': 15755776, 'steps': 30772, 'batch_loss/train': 0.7056989758275449} 12/27/2021 00:31:11 - INFO - codeparrot_training - Step 30773: {'lr': 0.00016228596698727562, 'samples': 15756288, 'steps': 30773, 'batch_loss/train': 0.7650093724951148} 12/27/2021 00:31:22 - INFO - codeparrot_training - Step 30774: {'lr': 0.00016227120609970002, 'samples': 15756800, 'steps': 30774, 'batch_loss/train': 0.7275451808236539} 12/27/2021 00:31:32 - INFO - codeparrot_training - Step 30775: {'lr': 0.00016225644556090086, 'samples': 15757312, 'steps': 30775, 'batch_loss/train': 0.8059558421373367} 12/27/2021 00:31:45 - INFO - codeparrot_training - Step 30776: {'lr': 0.0001622416853709371, 'samples': 15757824, 'steps': 30776, 'batch_loss/train': 0.6486248095170595} 12/27/2021 00:31:55 - INFO - codeparrot_training - Step 30777: {'lr': 0.00016222692552986717, 'samples': 15758336, 'steps': 30777, 'batch_loss/train': 0.8518314836546779} 12/27/2021 00:32:06 - INFO - codeparrot_training - Step 30778: {'lr': 0.00016221216603774986, 'samples': 15758848, 'steps': 30778, 'batch_loss/train': 0.8023763683158904} 12/27/2021 00:32:20 - INFO - codeparrot_training - Step 30779: {'lr': 0.00016219740689464384, 'samples': 15759360, 'steps': 30779, 'batch_loss/train': 0.8458361905068159} 12/27/2021 00:32:30 - INFO - codeparrot_training - Step 30780: {'lr': 0.00016218264810060785, 'samples': 15759872, 'steps': 30780, 'batch_loss/train': 0.8002205602824688} 12/27/2021 00:32:41 - INFO - codeparrot_training - Step 30781: {'lr': 0.0001621678896557004, 'samples': 15760384, 'steps': 30781, 
'batch_loss/train': 0.716746041085571} 12/27/2021 00:32:51 - INFO - codeparrot_training - Step 30782: {'lr': 0.00016215313155998023, 'samples': 15760896, 'steps': 30782, 'batch_loss/train': 0.8024417269043624} 12/27/2021 00:33:03 - INFO - codeparrot_training - Step 30783: {'lr': 0.00016213837381350615, 'samples': 15761408, 'steps': 30783, 'batch_loss/train': 0.6863022106699646} 12/27/2021 00:33:14 - INFO - codeparrot_training - Step 30784: {'lr': 0.00016212361641633664, 'samples': 15761920, 'steps': 30784, 'batch_loss/train': 0.7047581584192812} 12/27/2021 00:33:25 - INFO - codeparrot_training - Step 30785: {'lr': 0.00016210885936853052, 'samples': 15762432, 'steps': 30785, 'batch_loss/train': 0.77451375965029} 12/27/2021 00:33:37 - INFO - codeparrot_training - Step 30786: {'lr': 0.00016209410267014628, 'samples': 15762944, 'steps': 30786, 'batch_loss/train': 0.7358229744131677} 12/27/2021 00:33:48 - INFO - codeparrot_training - Step 30787: {'lr': 0.00016207934632124274, 'samples': 15763456, 'steps': 30787, 'batch_loss/train': 0.7033231714740396} 12/27/2021 00:33:58 - INFO - codeparrot_training - Step 30788: {'lr': 0.0001620645903218786, 'samples': 15763968, 'steps': 30788, 'batch_loss/train': 0.7743355995044112} 12/27/2021 00:34:10 - INFO - codeparrot_training - Step 30789: {'lr': 0.00016204983467211232, 'samples': 15764480, 'steps': 30789, 'batch_loss/train': 0.7461689347401261} 12/27/2021 00:34:21 - INFO - codeparrot_training - Step 30790: {'lr': 0.0001620350793720028, 'samples': 15764992, 'steps': 30790, 'batch_loss/train': 0.9490162804722786} 12/27/2021 00:34:32 - INFO - codeparrot_training - Step 30791: {'lr': 0.00016202032442160853, 'samples': 15765504, 'steps': 30791, 'batch_loss/train': 0.7693260200321674} 12/27/2021 00:34:42 - INFO - codeparrot_training - Step 30792: {'lr': 0.00016200556982098828, 'samples': 15766016, 'steps': 30792, 'batch_loss/train': 0.7252944828942418} 12/27/2021 00:34:56 - INFO - codeparrot_training - Step 30793: {'lr': 
0.00016199081557020055, 'samples': 15766528, 'steps': 30793, 'batch_loss/train': 0.7324328916147351} 12/27/2021 00:35:07 - INFO - codeparrot_training - Step 30794: {'lr': 0.00016197606166930417, 'samples': 15767040, 'steps': 30794, 'batch_loss/train': 0.717556088231504} 12/27/2021 00:35:17 - INFO - codeparrot_training - Step 30795: {'lr': 0.0001619613081183578, 'samples': 15767552, 'steps': 30795, 'batch_loss/train': 0.7924542678520083} 12/27/2021 00:35:29 - INFO - codeparrot_training - Step 30796: {'lr': 0.00016194655491742, 'samples': 15768064, 'steps': 30796, 'batch_loss/train': 0.7862041450571269} 12/27/2021 00:35:40 - INFO - codeparrot_training - Step 30797: {'lr': 0.00016193180206654933, 'samples': 15768576, 'steps': 30797, 'batch_loss/train': 0.7801267672330141} 12/27/2021 00:35:51 - INFO - codeparrot_training - Step 30798: {'lr': 0.00016191704956580467, 'samples': 15769088, 'steps': 30798, 'batch_loss/train': 0.829815779812634} 12/27/2021 00:36:03 - INFO - codeparrot_training - Step 30799: {'lr': 0.00016190229741524457, 'samples': 15769600, 'steps': 30799, 'batch_loss/train': 0.5454260711267125} 12/27/2021 00:36:14 - INFO - codeparrot_training - Step 30800: {'lr': 0.00016188754561492758, 'samples': 15770112, 'steps': 30800, 'batch_loss/train': 0.8294669974711724} 12/27/2021 00:36:25 - INFO - codeparrot_training - Step 30801: {'lr': 0.0001618727941649125, 'samples': 15770624, 'steps': 30801, 'batch_loss/train': 1.990128057077527} 12/27/2021 00:36:35 - INFO - codeparrot_training - Step 30802: {'lr': 0.00016185804306525795, 'samples': 15771136, 'steps': 30802, 'batch_loss/train': 0.8283108780160546} 12/27/2021 00:36:49 - INFO - codeparrot_training - Step 30803: {'lr': 0.0001618432923160225, 'samples': 15771648, 'steps': 30803, 'batch_loss/train': 0.833123330026865} 12/27/2021 00:37:00 - INFO - codeparrot_training - Step 30804: {'lr': 0.00016182854191726476, 'samples': 15772160, 'steps': 30804, 'batch_loss/train': 0.7487641358748078} 12/27/2021 00:37:10 - INFO 
- codeparrot_training - Step 30805: {'lr': 0.00016181379186904358, 'samples': 15772672, 'steps': 30805, 'batch_loss/train': 0.7339253854006529} 12/27/2021 00:37:23 - INFO - codeparrot_training - Step 30806: {'lr': 0.00016179904217141744, 'samples': 15773184, 'steps': 30806, 'batch_loss/train': 0.7528280671685934} 12/27/2021 00:37:33 - INFO - codeparrot_training - Step 30807: {'lr': 0.00016178429282444494, 'samples': 15773696, 'steps': 30807, 'batch_loss/train': 0.7540730383479968} 12/27/2021 00:37:44 - INFO - codeparrot_training - Step 30808: {'lr': 0.0001617695438281848, 'samples': 15774208, 'steps': 30808, 'batch_loss/train': 0.7612901595421135} 12/27/2021 00:37:56 - INFO - codeparrot_training - Step 30809: {'lr': 0.00016175479518269571, 'samples': 15774720, 'steps': 30809, 'batch_loss/train': 0.8382108071818948} 12/27/2021 00:38:07 - INFO - codeparrot_training - Step 30810: {'lr': 0.00016174004688803616, 'samples': 15775232, 'steps': 30810, 'batch_loss/train': 0.6808730756747536} 12/27/2021 00:38:17 - INFO - codeparrot_training - Step 30811: {'lr': 0.00016172529894426492, 'samples': 15775744, 'steps': 30811, 'batch_loss/train': 0.6773623232729733} 12/27/2021 00:38:28 - INFO - codeparrot_training - Step 30812: {'lr': 0.00016171055135144044, 'samples': 15776256, 'steps': 30812, 'batch_loss/train': 0.6327231962932274} 12/27/2021 00:38:42 - INFO - codeparrot_training - Step 30813: {'lr': 0.00016169580410962153, 'samples': 15776768, 'steps': 30813, 'batch_loss/train': 0.8337628608569503} 12/27/2021 00:38:52 - INFO - codeparrot_training - Step 30814: {'lr': 0.00016168105721886682, 'samples': 15777280, 'steps': 30814, 'batch_loss/train': 0.8047929042950273} 12/27/2021 00:39:03 - INFO - codeparrot_training - Step 30815: {'lr': 0.0001616663106792348, 'samples': 15777792, 'steps': 30815, 'batch_loss/train': 0.730521603487432} 12/27/2021 00:39:15 - INFO - codeparrot_training - Step 30816: {'lr': 0.00016165156449078424, 'samples': 15778304, 'steps': 30816, 
'batch_loss/train': 0.7217117673717439} 12/27/2021 00:39:26 - INFO - codeparrot_training - Step 30817: {'lr': 0.00016163681865357366, 'samples': 15778816, 'steps': 30817, 'batch_loss/train': 0.8659359049052} 12/27/2021 00:39:36 - INFO - codeparrot_training - Step 30818: {'lr': 0.0001616220731676618, 'samples': 15779328, 'steps': 30818, 'batch_loss/train': 0.6854172823950648} 12/27/2021 00:39:50 - INFO - codeparrot_training - Step 30819: {'lr': 0.00016160732803310708, 'samples': 15779840, 'steps': 30819, 'batch_loss/train': 0.7937643304467201} 12/27/2021 00:40:01 - INFO - codeparrot_training - Step 30820: {'lr': 0.0001615925832499683, 'samples': 15780352, 'steps': 30820, 'batch_loss/train': 0.7309375293552876} 12/27/2021 00:40:11 - INFO - codeparrot_training - Step 30821: {'lr': 0.00016157783881830406, 'samples': 15780864, 'steps': 30821, 'batch_loss/train': 0.736822965554893} 12/27/2021 00:40:22 - INFO - codeparrot_training - Step 30822: {'lr': 0.00016156309473817294, 'samples': 15781376, 'steps': 30822, 'batch_loss/train': 0.7149216327816248} 12/27/2021 00:40:34 - INFO - codeparrot_training - Step 30823: {'lr': 0.0001615483510096335, 'samples': 15781888, 'steps': 30823, 'batch_loss/train': 0.7472240459173918} 12/27/2021 00:40:45 - INFO - codeparrot_training - Step 30824: {'lr': 0.0001615336076327445, 'samples': 15782400, 'steps': 30824, 'batch_loss/train': 0.8456062995828688} 12/27/2021 00:40:55 - INFO - codeparrot_training - Step 30825: {'lr': 0.00016151886460756448, 'samples': 15782912, 'steps': 30825, 'batch_loss/train': 0.7915514297783375} 12/27/2021 00:41:07 - INFO - codeparrot_training - Step 30826: {'lr': 0.00016150412193415193, 'samples': 15783424, 'steps': 30826, 'batch_loss/train': 0.7283979726489633} 12/27/2021 00:41:18 - INFO - codeparrot_training - Step 30827: {'lr': 0.00016148937961256565, 'samples': 15783936, 'steps': 30827, 'batch_loss/train': 0.6758827373851091} 12/27/2021 00:41:28 - INFO - codeparrot_training - Step 30828: {'lr': 
0.00016147463764286423, 'samples': 15784448, 'steps': 30828, 'batch_loss/train': 0.6607715612044558} 12/27/2021 00:41:41 - INFO - codeparrot_training - Step 30829: {'lr': 0.00016145989602510616, 'samples': 15784960, 'steps': 30829, 'batch_loss/train': 0.7137435795739293} 12/27/2021 00:41:51 - INFO - codeparrot_training - Step 30830: {'lr': 0.00016144515475935005, 'samples': 15785472, 'steps': 30830, 'batch_loss/train': 0.7008698731660843} 12/27/2021 00:42:02 - INFO - codeparrot_training - Step 30831: {'lr': 0.00016143041384565472, 'samples': 15785984, 'steps': 30831, 'batch_loss/train': 0.6983389793895185} 12/27/2021 00:42:16 - INFO - codeparrot_training - Step 30832: {'lr': 0.0001614156732840785, 'samples': 15786496, 'steps': 30832, 'batch_loss/train': 0.8212960474193096} 12/27/2021 00:42:26 - INFO - codeparrot_training - Step 30833: {'lr': 0.00016140093307468023, 'samples': 15787008, 'steps': 30833, 'batch_loss/train': 0.7239554298575968} 12/27/2021 00:42:37 - INFO - codeparrot_training - Step 30834: {'lr': 0.00016138619321751823, 'samples': 15787520, 'steps': 30834, 'batch_loss/train': 0.7665661289356649} 12/27/2021 00:42:48 - INFO - codeparrot_training - Step 30835: {'lr': 0.00016137145371265143, 'samples': 15788032, 'steps': 30835, 'batch_loss/train': 0.7112649756018072} 12/27/2021 00:43:00 - INFO - codeparrot_training - Step 30836: {'lr': 0.00016135671456013817, 'samples': 15788544, 'steps': 30836, 'batch_loss/train': 0.819144593551755} 12/27/2021 00:43:11 - INFO - codeparrot_training - Step 30837: {'lr': 0.00016134197576003713, 'samples': 15789056, 'steps': 30837, 'batch_loss/train': 0.8204434663057327} 12/27/2021 00:43:21 - INFO - codeparrot_training - Step 30838: {'lr': 0.000161327237312407, 'samples': 15789568, 'steps': 30838, 'batch_loss/train': 0.8251678440719843} 12/27/2021 00:43:33 - INFO - codeparrot_training - Step 30839: {'lr': 0.00016131249921730624, 'samples': 15790080, 'steps': 30839, 'batch_loss/train': 0.6814272035844624} 12/27/2021 00:43:44 - 
INFO - codeparrot_training - Step 30840: {'lr': 0.00016129776147479356, 'samples': 15790592, 'steps': 30840, 'batch_loss/train': 0.7852231403812766} 12/27/2021 00:43:54 - INFO - codeparrot_training - Step 30841: {'lr': 0.00016128302408492738, 'samples': 15791104, 'steps': 30841, 'batch_loss/train': 0.8152106599882245} 12/27/2021 00:44:08 - INFO - codeparrot_training - Step 30842: {'lr': 0.00016126828704776647, 'samples': 15791616, 'steps': 30842, 'batch_loss/train': 0.7442397754639387} 12/27/2021 00:44:19 - INFO - codeparrot_training - Step 30843: {'lr': 0.00016125355036336934, 'samples': 15792128, 'steps': 30843, 'batch_loss/train': 0.7873721913201734} 12/27/2021 00:44:30 - INFO - codeparrot_training - Step 30844: {'lr': 0.0001612388140317946, 'samples': 15792640, 'steps': 30844, 'batch_loss/train': 0.7654439928010106} 12/27/2021 00:44:42 - INFO - codeparrot_training - Step 30845: {'lr': 0.00016122407805310075, 'samples': 15793152, 'steps': 30845, 'batch_loss/train': 0.8660440240055323} 12/27/2021 00:44:52 - INFO - codeparrot_training - Step 30846: {'lr': 0.00016120934242734648, 'samples': 15793664, 'steps': 30846, 'batch_loss/train': 0.7122303145006299} 12/27/2021 00:45:03 - INFO - codeparrot_training - Step 30847: {'lr': 0.0001611946071545904, 'samples': 15794176, 'steps': 30847, 'batch_loss/train': 0.7616346888244152} 12/27/2021 00:45:14 - INFO - codeparrot_training - Step 30848: {'lr': 0.00016117987223489093, 'samples': 15794688, 'steps': 30848, 'batch_loss/train': 0.6411690707318485} 12/27/2021 00:45:28 - INFO - codeparrot_training - Step 30849: {'lr': 0.00016116513766830677, 'samples': 15795200, 'steps': 30849, 'batch_loss/train': 0.7486068871803582} 12/27/2021 00:45:38 - INFO - codeparrot_training - Step 30850: {'lr': 0.00016115040345489654, 'samples': 15795712, 'steps': 30850, 'batch_loss/train': 0.7476525702513754} 12/27/2021 00:45:49 - INFO - codeparrot_training - Step 30851: {'lr': 0.00016113566959471875, 'samples': 15796224, 'steps': 30851, 
'batch_loss/train': 0.8739882661029696} 12/27/2021 00:46:01 - INFO - codeparrot_training - Step 30852: {'lr': 0.0001611209360878319, 'samples': 15796736, 'steps': 30852, 'batch_loss/train': 0.7135303956456482} 12/27/2021 00:46:12 - INFO - codeparrot_training - Step 30853: {'lr': 0.00016110620293429472, 'samples': 15797248, 'steps': 30853, 'batch_loss/train': 1.4244527341797948} 12/27/2021 00:46:22 - INFO - codeparrot_training - Step 30854: {'lr': 0.00016109147013416576, 'samples': 15797760, 'steps': 30854, 'batch_loss/train': 0.8129576286301017} 12/27/2021 00:46:33 - INFO - codeparrot_training - Step 30855: {'lr': 0.00016107673768750346, 'samples': 15798272, 'steps': 30855, 'batch_loss/train': 0.6903572764713317} 12/27/2021 00:46:45 - INFO - codeparrot_training - Step 30856: {'lr': 0.00016106200559436645, 'samples': 15798784, 'steps': 30856, 'batch_loss/train': 0.665244656149298} 12/27/2021 00:46:55 - INFO - codeparrot_training - Step 30857: {'lr': 0.0001610472738548134, 'samples': 15799296, 'steps': 30857, 'batch_loss/train': 0.7328792768530548} 12/27/2021 00:47:06 - INFO - codeparrot_training - Step 30858: {'lr': 0.00016103254246890278, 'samples': 15799808, 'steps': 30858, 'batch_loss/train': 0.7520218235440552} 12/27/2021 00:47:20 - INFO - codeparrot_training - Step 30859: {'lr': 0.00016101781143669324, 'samples': 15800320, 'steps': 30859, 'batch_loss/train': 0.7866185316815972} 12/27/2021 00:47:31 - INFO - codeparrot_training - Step 30860: {'lr': 0.00016100308075824314, 'samples': 15800832, 'steps': 30860, 'batch_loss/train': 0.6666816789656878} 12/27/2021 00:47:41 - INFO - codeparrot_training - Step 30861: {'lr': 0.0001609883504336113, 'samples': 15801344, 'steps': 30861, 'batch_loss/train': 0.7371662762016058} 12/27/2021 00:47:54 - INFO - codeparrot_training - Step 30862: {'lr': 0.00016097362046285613, 'samples': 15801856, 'steps': 30862, 'batch_loss/train': 0.6850643646903336} 12/27/2021 00:48:04 - INFO - codeparrot_training - Step 30863: {'lr': 
0.00016095889084603615, 'samples': 15802368, 'steps': 30863, 'batch_loss/train': 0.6809002519585192} 12/27/2021 00:48:15 - INFO - codeparrot_training - Step 30864: {'lr': 0.00016094416158321016, 'samples': 15802880, 'steps': 30864, 'batch_loss/train': 0.7334612235426903} 12/27/2021 00:48:25 - INFO - codeparrot_training - Step 30865: {'lr': 0.00016092943267443646, 'samples': 15803392, 'steps': 30865, 'batch_loss/train': 0.879188084974885} 12/27/2021 00:48:38 - INFO - codeparrot_training - Step 30866: {'lr': 0.00016091470411977376, 'samples': 15803904, 'steps': 30866, 'batch_loss/train': 0.7382848744746298} 12/27/2021 00:48:48 - INFO - codeparrot_training - Step 30867: {'lr': 0.00016089997591928047, 'samples': 15804416, 'steps': 30867, 'batch_loss/train': 0.6886414904147387} 12/27/2021 00:48:59 - INFO - codeparrot_training - Step 30868: {'lr': 0.0001608852480730153, 'samples': 15804928, 'steps': 30868, 'batch_loss/train': 0.6908066663891077} 12/27/2021 00:49:11 - INFO - codeparrot_training - Step 30869: {'lr': 0.00016087052058103674, 'samples': 15805440, 'steps': 30869, 'batch_loss/train': 0.7658298150636256} 12/27/2021 00:49:22 - INFO - codeparrot_training - Step 30870: {'lr': 0.00016085579344340338, 'samples': 15805952, 'steps': 30870, 'batch_loss/train': 0.7357853669673204} 12/27/2021 00:49:33 - INFO - codeparrot_training - Step 30871: {'lr': 0.00016084106666017358, 'samples': 15806464, 'steps': 30871, 'batch_loss/train': 0.7952696345746517} 12/27/2021 00:49:46 - INFO - codeparrot_training - Step 30872: {'lr': 0.00016082634023140613, 'samples': 15806976, 'steps': 30872, 'batch_loss/train': 0.5207338219042867} 12/27/2021 00:49:57 - INFO - codeparrot_training - Step 30873: {'lr': 0.0001608116141571595, 'samples': 15807488, 'steps': 30873, 'batch_loss/train': 0.8028752068057656} 12/27/2021 00:50:08 - INFO - codeparrot_training - Step 30874: {'lr': 0.00016079688843749214, 'samples': 15808000, 'steps': 30874, 'batch_loss/train': 0.678417258663103} 12/27/2021 00:50:18 - 
INFO - codeparrot_training - Step 30875: {'lr': 0.00016078216307246267, 'samples': 15808512, 'steps': 30875, 'batch_loss/train': 0.6784301674924791} 12/27/2021 00:50:30 - INFO - codeparrot_training - Step 30876: {'lr': 0.00016076743806212973, 'samples': 15809024, 'steps': 30876, 'batch_loss/train': 0.69429388968274} 12/27/2021 00:50:41 - INFO - codeparrot_training - Step 30877: {'lr': 0.00016075271340655172, 'samples': 15809536, 'steps': 30877, 'batch_loss/train': 0.7691114619374275} 12/27/2021 00:50:52 - INFO - codeparrot_training - Step 30878: {'lr': 0.00016073798910578712, 'samples': 15810048, 'steps': 30878, 'batch_loss/train': 0.7139703719876707} 12/27/2021 00:51:06 - INFO - codeparrot_training - Step 30879: {'lr': 0.0001607232651598947, 'samples': 15810560, 'steps': 30879, 'batch_loss/train': 0.8478558673523366} 12/27/2021 00:51:17 - INFO - codeparrot_training - Step 30880: {'lr': 0.0001607085415689329, 'samples': 15811072, 'steps': 30880, 'batch_loss/train': 1.3108900068327785} 12/27/2021 00:51:27 - INFO - codeparrot_training - Step 30881: {'lr': 0.00016069381833296015, 'samples': 15811584, 'steps': 30881, 'batch_loss/train': 1.0161505173891783} 12/27/2021 00:51:38 - INFO - codeparrot_training - Step 30882: {'lr': 0.00016067909545203503, 'samples': 15812096, 'steps': 30882, 'batch_loss/train': 0.8232877617701888} 12/27/2021 00:51:50 - INFO - codeparrot_training - Step 30883: {'lr': 0.00016066437292621617, 'samples': 15812608, 'steps': 30883, 'batch_loss/train': 0.6942473677918315} 12/27/2021 00:52:01 - INFO - codeparrot_training - Step 30884: {'lr': 0.00016064965075556197, 'samples': 15813120, 'steps': 30884, 'batch_loss/train': 0.8192414678633213} 12/27/2021 00:52:11 - INFO - codeparrot_training - Step 30885: {'lr': 0.0001606349289401311, 'samples': 15813632, 'steps': 30885, 'batch_loss/train': 0.7604305576533079} 12/27/2021 00:52:24 - INFO - codeparrot_training - Step 30886: {'lr': 0.0001606202074799819, 'samples': 15814144, 'steps': 30886, 
'batch_loss/train': 0.7469745650887489} 12/27/2021 00:52:34 - INFO - codeparrot_training - Step 30887: {'lr': 0.00016060548637517307, 'samples': 15814656, 'steps': 30887, 'batch_loss/train': 0.7873178661684506} 12/27/2021 00:52:45 - INFO - codeparrot_training - Step 30888: {'lr': 0.00016059076562576307, 'samples': 15815168, 'steps': 30888, 'batch_loss/train': 0.7897944343276322} 12/27/2021 00:52:59 - INFO - codeparrot_training - Step 30889: {'lr': 0.00016057604523181035, 'samples': 15815680, 'steps': 30889, 'batch_loss/train': 0.8216747650876641} 12/27/2021 00:53:09 - INFO - codeparrot_training - Step 30890: {'lr': 0.00016056132519337362, 'samples': 15816192, 'steps': 30890, 'batch_loss/train': 0.7547458657063544} 12/27/2021 00:53:20 - INFO - codeparrot_training - Step 30891: {'lr': 0.00016054660551051127, 'samples': 15816704, 'steps': 30891, 'batch_loss/train': 0.6870410484261811} 12/27/2021 00:53:31 - INFO - codeparrot_training - Step 30892: {'lr': 0.00016053188618328184, 'samples': 15817216, 'steps': 30892, 'batch_loss/train': 0.8114374820142984} 12/27/2021 00:53:43 - INFO - codeparrot_training - Step 30893: {'lr': 0.00016051716721174377, 'samples': 15817728, 'steps': 30893, 'batch_loss/train': 0.6409935094416142} 12/27/2021 00:53:53 - INFO - codeparrot_training - Step 30894: {'lr': 0.00016050244859595569, 'samples': 15818240, 'steps': 30894, 'batch_loss/train': 0.7669994505122304} 12/27/2021 00:54:04 - INFO - codeparrot_training - Step 30895: {'lr': 0.0001604877303359762, 'samples': 15818752, 'steps': 30895, 'batch_loss/train': 0.7880333252251148} 12/27/2021 00:54:18 - INFO - codeparrot_training - Step 30896: {'lr': 0.00016047301243186362, 'samples': 15819264, 'steps': 30896, 'batch_loss/train': 0.8211307800374925} 12/27/2021 00:54:28 - INFO - codeparrot_training - Step 30897: {'lr': 0.00016045829488367647, 'samples': 15819776, 'steps': 30897, 'batch_loss/train': 0.7276456663385034} 12/27/2021 00:54:39 - INFO - codeparrot_training - Step 30898: {'lr': 
0.00016044357769147335, 'samples': 15820288, 'steps': 30898, 'batch_loss/train': 0.6687818779610097} 12/27/2021 00:54:51 - INFO - codeparrot_training - Step 30899: {'lr': 0.00016042886085531282, 'samples': 15820800, 'steps': 30899, 'batch_loss/train': 0.719536473043263} 12/27/2021 00:55:02 - INFO - codeparrot_training - Step 30900: {'lr': 0.00016041414437525323, 'samples': 15821312, 'steps': 30900, 'batch_loss/train': 0.9306612424552441} 12/27/2021 00:55:12 - INFO - codeparrot_training - Step 30901: {'lr': 0.00016039942825135322, 'samples': 15821824, 'steps': 30901, 'batch_loss/train': 0.7413059789687395} 12/27/2021 00:55:25 - INFO - codeparrot_training - Step 30902: {'lr': 0.0001603847124836713, 'samples': 15822336, 'steps': 30902, 'batch_loss/train': 0.8339521875604987} 12/27/2021 00:55:35 - INFO - codeparrot_training - Step 30903: {'lr': 0.00016036999707226587, 'samples': 15822848, 'steps': 30903, 'batch_loss/train': 0.6866914411075413} 12/27/2021 00:55:46 - INFO - codeparrot_training - Step 30904: {'lr': 0.00016035528201719545, 'samples': 15823360, 'steps': 30904, 'batch_loss/train': 0.7364457731600851} 12/27/2021 00:55:57 - INFO - codeparrot_training - Step 30905: {'lr': 0.00016034056731851866, 'samples': 15823872, 'steps': 30905, 'batch_loss/train': 0.7413009302690625} 12/27/2021 00:56:09 - INFO - codeparrot_training - Step 30906: {'lr': 0.00016032585297629393, 'samples': 15824384, 'steps': 30906, 'batch_loss/train': 0.47756398253841326} 12/27/2021 00:56:20 - INFO - codeparrot_training - Step 30907: {'lr': 0.0001603111389905797, 'samples': 15824896, 'steps': 30907, 'batch_loss/train': 0.824321505613625} 12/27/2021 00:56:30 - INFO - codeparrot_training - Step 30908: {'lr': 0.00016029642536143445, 'samples': 15825408, 'steps': 30908, 'batch_loss/train': 0.7077257158234715} 12/27/2021 00:56:44 - INFO - codeparrot_training - Step 30909: {'lr': 0.0001602817120889169, 'samples': 15825920, 'steps': 30909, 'batch_loss/train': 0.7675416581332684} 12/27/2021 00:56:55 - 
INFO - codeparrot_training - Step 30910: {'lr': 0.00016026699917308528, 'samples': 15826432, 'steps': 30910, 'batch_loss/train': 0.7258037290303037} 12/27/2021 00:57:05 - INFO - codeparrot_training - Step 30911: {'lr': 0.00016025228661399816, 'samples': 15826944, 'steps': 30911, 'batch_loss/train': 0.7361184869660065} 12/27/2021 00:57:17 - INFO - codeparrot_training - Step 30912: {'lr': 0.00016023757441171416, 'samples': 15827456, 'steps': 30912, 'batch_loss/train': 0.7204616377130151} 12/27/2021 00:57:28 - INFO - codeparrot_training - Step 30913: {'lr': 0.0001602228625662916, 'samples': 15827968, 'steps': 30913, 'batch_loss/train': 0.6740610487759113} 12/27/2021 00:57:39 - INFO - codeparrot_training - Step 30914: {'lr': 0.00016020815107778912, 'samples': 15828480, 'steps': 30914, 'batch_loss/train': 0.8304150709882379} 12/27/2021 00:57:49 - INFO - codeparrot_training - Step 30915: {'lr': 0.000160193439946265, 'samples': 15828992, 'steps': 30915, 'batch_loss/train': 0.8102163830772042} 12/27/2021 00:58:01 - INFO - codeparrot_training - Step 30916: {'lr': 0.00016017872917177797, 'samples': 15829504, 'steps': 30916, 'batch_loss/train': 0.7572794952429831} 12/27/2021 00:58:12 - INFO - codeparrot_training - Step 30917: {'lr': 0.00016016401875438636, 'samples': 15830016, 'steps': 30917, 'batch_loss/train': 0.8058102810755372} 12/27/2021 00:58:23 - INFO - codeparrot_training - Step 30918: {'lr': 0.00016014930869414872, 'samples': 15830528, 'steps': 30918, 'batch_loss/train': 0.7368296808563173} 12/27/2021 00:58:37 - INFO - codeparrot_training - Step 30919: {'lr': 0.0001601345989911234, 'samples': 15831040, 'steps': 30919, 'batch_loss/train': 0.49313009955221787} 12/27/2021 00:58:47 - INFO - codeparrot_training - Step 30920: {'lr': 0.00016011988964536905, 'samples': 15831552, 'steps': 30920, 'batch_loss/train': 0.7266494273208082} 12/27/2021 00:58:58 - INFO - codeparrot_training - Step 30921: {'lr': 0.00016010518065694412, 'samples': 15832064, 'steps': 30921, 
'batch_loss/train': 0.823487208224833} 12/27/2021 00:59:10 - INFO - codeparrot_training - Step 30922: {'lr': 0.000160090472025907, 'samples': 15832576, 'steps': 30922, 'batch_loss/train': 0.736624522600323} 12/27/2021 00:59:21 - INFO - codeparrot_training - Step 30923: {'lr': 0.00016007576375231617, 'samples': 15833088, 'steps': 30923, 'batch_loss/train': 0.6849812790751457} 12/27/2021 00:59:31 - INFO - codeparrot_training - Step 30924: {'lr': 0.00016006105583623022, 'samples': 15833600, 'steps': 30924, 'batch_loss/train': 0.7833189701195806} 12/27/2021 00:59:42 - INFO - codeparrot_training - Step 30925: {'lr': 0.0001600463482777076, 'samples': 15834112, 'steps': 30925, 'batch_loss/train': 0.7481093539390713} 12/27/2021 00:59:56 - INFO - codeparrot_training - Step 30926: {'lr': 0.00016003164107680662, 'samples': 15834624, 'steps': 30926, 'batch_loss/train': 0.9055466183344834} 12/27/2021 01:00:07 - INFO - codeparrot_training - Step 30927: {'lr': 0.00016001693423358589, 'samples': 15835136, 'steps': 30927, 'batch_loss/train': 0.5188769396045245} 12/27/2021 01:00:17 - INFO - codeparrot_training - Step 30928: {'lr': 0.00016000222774810392, 'samples': 15835648, 'steps': 30928, 'batch_loss/train': 0.7371681754011661} 12/27/2021 01:00:30 - INFO - codeparrot_training - Step 30929: {'lr': 0.00015998752162041908, 'samples': 15836160, 'steps': 30929, 'batch_loss/train': 0.7774777887389064} 12/27/2021 01:00:41 - INFO - codeparrot_training - Step 30930: {'lr': 0.00015997281585058976, 'samples': 15836672, 'steps': 30930, 'batch_loss/train': 0.7090618601068854} 12/27/2021 01:00:51 - INFO - codeparrot_training - Step 30931: {'lr': 0.0001599581104386747, 'samples': 15837184, 'steps': 30931, 'batch_loss/train': 0.6594168236479163} 12/27/2021 01:01:02 - INFO - codeparrot_training - Step 30932: {'lr': 0.00015994340538473214, 'samples': 15837696, 'steps': 30932, 'batch_loss/train': 0.8727131478954107} 12/27/2021 01:01:14 - INFO - codeparrot_training - Step 30933: {'lr': 
0.00015992870068882057, 'samples': 15838208, 'steps': 30933, 'batch_loss/train': 0.846006229519844} 12/27/2021 01:01:25 - INFO - codeparrot_training - Step 30934: {'lr': 0.0001599139963509984, 'samples': 15838720, 'steps': 30934, 'batch_loss/train': 0.7020608629100025} 12/27/2021 01:01:35 - INFO - codeparrot_training - Step 30935: {'lr': 0.00015989929237132433, 'samples': 15839232, 'steps': 30935, 'batch_loss/train': 0.7135394860524684} 12/27/2021 01:01:48 - INFO - codeparrot_training - Step 30936: {'lr': 0.00015988458874985657, 'samples': 15839744, 'steps': 30936, 'batch_loss/train': 0.6663407463347539} 12/27/2021 01:01:58 - INFO - codeparrot_training - Step 30937: {'lr': 0.00015986988548665358, 'samples': 15840256, 'steps': 30937, 'batch_loss/train': 0.7518970984965563} 12/27/2021 01:02:09 - INFO - codeparrot_training - Step 30938: {'lr': 0.000159855182581774, 'samples': 15840768, 'steps': 30938, 'batch_loss/train': 0.5478270481107756} 12/27/2021 01:02:23 - INFO - codeparrot_training - Step 30939: {'lr': 0.00015984048003527617, 'samples': 15841280, 'steps': 30939, 'batch_loss/train': 0.6976400790736079} 12/27/2021 01:02:33 - INFO - codeparrot_training - Step 30940: {'lr': 0.00015982577784721854, 'samples': 15841792, 'steps': 30940, 'batch_loss/train': 0.6728284736163914} 12/27/2021 01:02:44 - INFO - codeparrot_training - Step 30941: {'lr': 0.00015981107601765953, 'samples': 15842304, 'steps': 30941, 'batch_loss/train': 0.7502237651497126} 12/27/2021 01:02:55 - INFO - codeparrot_training - Step 30942: {'lr': 0.00015979637454665769, 'samples': 15842816, 'steps': 30942, 'batch_loss/train': 0.7428133487701416} 12/27/2021 01:03:07 - INFO - codeparrot_training - Step 30943: {'lr': 0.00015978167343427135, 'samples': 15843328, 'steps': 30943, 'batch_loss/train': 0.730098273139447} 12/27/2021 01:03:18 - INFO - codeparrot_training - Step 30944: {'lr': 0.00015976697268055905, 'samples': 15843840, 'steps': 30944, 'batch_loss/train': 0.7331927660852671} 12/27/2021 01:03:28 - 
INFO - codeparrot_training - Step 30945: {'lr': 0.00015975227228557916, 'samples': 15844352, 'steps': 30945, 'batch_loss/train': 0.5741670379065908}
12/27/2021 01:03:40 - INFO - codeparrot_training - Step 30946: {'lr': 0.00015973757224939015, 'samples': 15844864, 'steps': 30946, 'batch_loss/train': 0.7537467768415809}
12/27/2021 01:03:51 - INFO - codeparrot_training - Step 30947: {'lr': 0.00015972287257205053, 'samples': 15845376, 'steps': 30947, 'batch_loss/train': 0.7854438167996705}
12/27/2021 01:04:02 - INFO - codeparrot_training - Step 30948: {'lr': 0.0001597081732536186, 'samples': 15845888, 'steps': 30948, 'batch_loss/train': 0.7557224370539188}
12/27/2021 01:04:15 - INFO - codeparrot_training - Step 30949: {'lr': 0.00015969347429415292, 'samples': 15846400, 'steps': 30949, 'batch_loss/train': 0.7996171759441495}
12/27/2021 01:04:26 - INFO - codeparrot_training - Step 30950: {'lr': 0.0001596787756937119, 'samples': 15846912, 'steps': 30950, 'batch_loss/train': 0.7828426696360111}
12/27/2021 01:04:37 - INFO - codeparrot_training - Step 30951: {'lr': 0.00015966407745235405, 'samples': 15847424, 'steps': 30951, 'batch_loss/train': 0.7825040945317596}
12/27/2021 01:04:49 - INFO - codeparrot_training - Step 30952: {'lr': 0.00015964937957013757, 'samples': 15847936, 'steps': 30952, 'batch_loss/train': 0.7346310596913099}
12/27/2021 01:05:00 - INFO - codeparrot_training - Step 30953: {'lr': 0.0001596346820471211, 'samples': 15848448, 'steps': 30953, 'batch_loss/train': 1.738952654413879}
12/27/2021 01:05:10 - INFO - codeparrot_training - Step 30954: {'lr': 0.00015961998488336305, 'samples': 15848960, 'steps': 30954, 'batch_loss/train': 0.6818685964681208}
12/27/2021 01:05:21 - INFO - codeparrot_training - Step 30955: {'lr': 0.00015960528807892178, 'samples': 15849472, 'steps': 30955, 'batch_loss/train': 0.7927789213135839}
12/27/2021 01:05:35 - INFO - codeparrot_training - Step 30956: {'lr': 0.00015959059163385566, 'samples': 15849984, 'steps': 30956, 'batch_loss/train': 0.6690448473673314}
12/27/2021 01:05:46 - INFO - codeparrot_training - Step 30957: {'lr': 0.00015957589554822335, 'samples': 15850496, 'steps': 30957, 'batch_loss/train': 0.5124068564036861}
12/27/2021 01:05:56 - INFO - codeparrot_training - Step 30958: {'lr': 0.00015956119982208307, 'samples': 15851008, 'steps': 30958, 'batch_loss/train': 0.9607853747438639}
12/27/2021 01:06:09 - INFO - codeparrot_training - Step 30959: {'lr': 0.0001595465044554934, 'samples': 15851520, 'steps': 30959, 'batch_loss/train': 0.8304897211492062}
12/27/2021 01:06:19 - INFO - codeparrot_training - Step 30960: {'lr': 0.00015953180944851253, 'samples': 15852032, 'steps': 30960, 'batch_loss/train': 0.862902779597789}
12/27/2021 01:06:30 - INFO - codeparrot_training - Step 30961: {'lr': 0.00015951711480119912, 'samples': 15852544, 'steps': 30961, 'batch_loss/train': 0.4621347379870713}
12/27/2021 01:06:40 - INFO - codeparrot_training - Step 30962: {'lr': 0.00015950242051361149, 'samples': 15853056, 'steps': 30962, 'batch_loss/train': 0.8457566620782018}
12/27/2021 01:06:53 - INFO - codeparrot_training - Step 30963: {'lr': 0.00015948772658580803, 'samples': 15853568, 'steps': 30963, 'batch_loss/train': 0.7852991092950106}
12/27/2021 01:07:03 - INFO - codeparrot_training - Step 30964: {'lr': 0.0001594730330178472, 'samples': 15854080, 'steps': 30964, 'batch_loss/train': 0.7082541394047439}
12/27/2021 01:07:14 - INFO - codeparrot_training - Step 30965: {'lr': 0.00015945833980978745, 'samples': 15854592, 'steps': 30965, 'batch_loss/train': 0.8534339899197221}
12/27/2021 01:07:28 - INFO - codeparrot_training - Step 30966: {'lr': 0.00015944364696168718, 'samples': 15855104, 'steps': 30966, 'batch_loss/train': 0.7428147620521486}
12/27/2021 01:07:38 - INFO - codeparrot_training - Step 30967: {'lr': 0.00015942895447360465, 'samples': 15855616, 'steps': 30967, 'batch_loss/train': 0.7817545551806688}
12/27/2021 01:07:49 - INFO - codeparrot_training - Step 30968: {'lr': 0.00015941426234559853, 'samples': 15856128, 'steps': 30968, 'batch_loss/train': 0.8148624440655112}
12/27/2021 01:08:01 - INFO - codeparrot_training - Step 30969: {'lr': 0.00015939957057772703, 'samples': 15856640, 'steps': 30969, 'batch_loss/train': 0.8212556215003133}
12/27/2021 01:08:12 - INFO - codeparrot_training - Step 30970: {'lr': 0.00015938487917004868, 'samples': 15857152, 'steps': 30970, 'batch_loss/train': 0.8303335739765316}
12/27/2021 01:08:22 - INFO - codeparrot_training - Step 30971: {'lr': 0.00015937018812262172, 'samples': 15857664, 'steps': 30971, 'batch_loss/train': 0.7534277830272913}
12/27/2021 01:08:33 - INFO - codeparrot_training - Step 30972: {'lr': 0.00015935549743550476, 'samples': 15858176, 'steps': 30972, 'batch_loss/train': 0.7711659184424207}
12/27/2021 01:08:45 - INFO - codeparrot_training - Step 30973: {'lr': 0.0001593408071087561, 'samples': 15858688, 'steps': 30973, 'batch_loss/train': 0.7370801018550992}
12/27/2021 01:08:56 - INFO - codeparrot_training - Step 30974: {'lr': 0.00015932611714243413, 'samples': 15859200, 'steps': 30974, 'batch_loss/train': 0.7597909113392234}
12/27/2021 01:09:07 - INFO - codeparrot_training - Step 30975: {'lr': 0.0001593114275365973, 'samples': 15859712, 'steps': 30975, 'batch_loss/train': 0.682334499899298}
12/27/2021 01:09:19 - INFO - codeparrot_training - Step 30976: {'lr': 0.00015929673829130403, 'samples': 15860224, 'steps': 30976, 'batch_loss/train': 0.549232435412705}
12/27/2021 01:09:29 - INFO - codeparrot_training - Step 30977: {'lr': 0.00015928204940661274, 'samples': 15860736, 'steps': 30977, 'batch_loss/train': 0.7913648537360132}
12/27/2021 01:09:40 - INFO - codeparrot_training - Step 30978: {'lr': 0.0001592673608825816, 'samples': 15861248, 'steps': 30978, 'batch_loss/train': 0.8050692370161414}
12/27/2021 01:09:54 - INFO - codeparrot_training - Step 30979: {'lr': 0.00015925267271926923, 'samples': 15861760, 'steps': 30979, 'batch_loss/train': 0.8311370899900794}
12/27/2021 01:10:05 - INFO - codeparrot_training - Step 30980: {'lr': 0.0001592379849167341, 'samples': 15862272, 'steps': 30980, 'batch_loss/train': 0.6875535580329597}
12/27/2021 01:10:15 - INFO - codeparrot_training - Step 30981: {'lr': 0.00015922329747503433, 'samples': 15862784, 'steps': 30981, 'batch_loss/train': 0.7542576342821121}
12/27/2021 01:10:27 - INFO - codeparrot_training - Step 30982: {'lr': 0.0001592086103942284, 'samples': 15863296, 'steps': 30982, 'batch_loss/train': 0.7516611418686807}
12/27/2021 01:10:38 - INFO - codeparrot_training - Step 30983: {'lr': 0.0001591939236743749, 'samples': 15863808, 'steps': 30983, 'batch_loss/train': 0.8088318761438131}
12/27/2021 01:10:49 - INFO - codeparrot_training - Step 30984: {'lr': 0.00015917923731553201, 'samples': 15864320, 'steps': 30984, 'batch_loss/train': 0.8027662266977131}
12/27/2021 01:10:59 - INFO - codeparrot_training - Step 30985: {'lr': 0.00015916455131775815, 'samples': 15864832, 'steps': 30985, 'batch_loss/train': 0.7696922766044736}
12/27/2021 01:11:13 - INFO - codeparrot_training - Step 30986: {'lr': 0.0001591498656811118, 'samples': 15865344, 'steps': 30986, 'batch_loss/train': 0.8583994824439287}
12/27/2021 01:11:24 - INFO - codeparrot_training - Step 30987: {'lr': 0.00015913518040565127, 'samples': 15865856, 'steps': 30987, 'batch_loss/train': 0.7270805677399039}
12/27/2021 01:11:34 - INFO - codeparrot_training - Step 30988: {'lr': 0.00015912049549143495, 'samples': 15866368, 'steps': 30988, 'batch_loss/train': 0.852509081363678}
12/27/2021 01:11:46 - INFO - codeparrot_training - Step 30989: {'lr': 0.00015910581093852116, 'samples': 15866880, 'steps': 30989, 'batch_loss/train': 0.764291487634182}
12/27/2021 01:11:57 - INFO - codeparrot_training - Step 30990: {'lr': 0.00015909112674696843, 'samples': 15867392, 'steps': 30990, 'batch_loss/train': 0.7696849587373435}
12/27/2021 01:12:08 - INFO - codeparrot_training - Step 30991: {'lr': 0.000159076442916835, 'samples': 15867904, 'steps': 30991, 'batch_loss/train': 0.7885394496843219}
12/27/2021 01:12:20 - INFO - codeparrot_training - Step 30992: {'lr': 0.0001590617594481794, 'samples': 15868416, 'steps': 30992, 'batch_loss/train': 0.8254323662258685}
12/27/2021 01:12:30 - INFO - codeparrot_training - Step 30993: {'lr': 0.00015904707634105975, 'samples': 15868928, 'steps': 30993, 'batch_loss/train': 0.6878603957593441}
12/27/2021 01:12:41 - INFO - codeparrot_training - Step 30994: {'lr': 0.00015903239359553464, 'samples': 15869440, 'steps': 30994, 'batch_loss/train': 0.7539136596024036}
12/27/2021 01:12:52 - INFO - codeparrot_training - Step 30995: {'lr': 0.00015901771121166247, 'samples': 15869952, 'steps': 30995, 'batch_loss/train': 0.6513620978221297}
12/27/2021 01:13:06 - INFO - codeparrot_training - Step 30996: {'lr': 0.0001590030291895015, 'samples': 15870464, 'steps': 30996, 'batch_loss/train': 0.7256746403872967}
12/27/2021 01:13:16 - INFO - codeparrot_training - Step 30997: {'lr': 0.00015898834752911, 'samples': 15870976, 'steps': 30997, 'batch_loss/train': 0.7668946729972959}
12/27/2021 01:13:27 - INFO - codeparrot_training - Step 30998: {'lr': 0.00015897366623054656, 'samples': 15871488, 'steps': 30998, 'batch_loss/train': 0.7572177378460765}
12/27/2021 01:13:39 - INFO - codeparrot_training - Step 30999: {'lr': 0.0001589589852938695, 'samples': 15872000, 'steps': 30999, 'batch_loss/train': 0.8741669611772522}
12/27/2021 01:13:50 - INFO - codeparrot_training - Step 31000: {'lr': 0.00015894430471913705, 'samples': 15872512, 'steps': 31000, 'batch_loss/train': 0.7697173319756985}
12/27/2021 01:14:00 - INFO - codeparrot_training - Step 31001: {'lr': 0.0001589296245064077, 'samples': 15873024, 'steps': 31001, 'batch_loss/train': 0.7213050005957484}
12/27/2021 01:14:12 - INFO - codeparrot_training - Step 31002: {'lr': 0.0001589149446557398, 'samples': 15873536, 'steps': 31002, 'batch_loss/train': 0.8183126980438828}
12/27/2021 01:14:23 - INFO - codeparrot_training - Step 31003: {'lr': 0.00015890026516719163, 'samples': 15874048, 'steps': 31003, 'batch_loss/train': 0.9556475915014744}
12/27/2021 01:14:34 - INFO - codeparrot_training - Step 31004: {'lr': 0.0001588855860408216, 'samples': 15874560, 'steps': 31004, 'batch_loss/train': 0.7682319204322994}
12/27/2021 01:14:44 - INFO - codeparrot_training - Step 31005: {'lr': 0.0001588709072766882, 'samples': 15875072, 'steps': 31005, 'batch_loss/train': 0.758257175097242}
12/27/2021 01:14:58 - INFO - codeparrot_training - Step 31006: {'lr': 0.00015885622887484957, 'samples': 15875584, 'steps': 31006, 'batch_loss/train': 0.7207337520085275}
12/27/2021 01:15:09 - INFO - codeparrot_training - Step 31007: {'lr': 0.00015884155083536417, 'samples': 15876096, 'steps': 31007, 'batch_loss/train': 0.7772744530811906}
12/27/2021 01:15:19 - INFO - codeparrot_training - Step 31008: {'lr': 0.00015882687315829031, 'samples': 15876608, 'steps': 31008, 'batch_loss/train': 0.801705852150917}
12/27/2021 01:15:31 - INFO - codeparrot_training - Step 31009: {'lr': 0.00015881219584368644, 'samples': 15877120, 'steps': 31009, 'batch_loss/train': 0.7727651530876756}
12/27/2021 01:15:42 - INFO - codeparrot_training - Step 31010: {'lr': 0.00015879751889161082, 'samples': 15877632, 'steps': 31010, 'batch_loss/train': 0.7331004424486309}
12/27/2021 01:15:53 - INFO - codeparrot_training - Step 31011: {'lr': 0.00015878284230212177, 'samples': 15878144, 'steps': 31011, 'batch_loss/train': 0.7579820598475635}
12/27/2021 01:16:05 - INFO - codeparrot_training - Step 31012: {'lr': 0.00015876816607527777, 'samples': 15878656, 'steps': 31012, 'batch_loss/train': 0.7889898489229381}
12/27/2021 01:16:15 - INFO - codeparrot_training - Step 31013: {'lr': 0.00015875349021113718, 'samples': 15879168, 'steps': 31013, 'batch_loss/train': 0.8108031437732279}
12/27/2021 01:16:26 - INFO - codeparrot_training - Step 31014: {'lr': 0.0001587388147097581, 'samples': 15879680, 'steps': 31014, 'batch_loss/train': 0.6751331547275186}
12/27/2021 01:16:38 - INFO - codeparrot_training - Step 31015: {'lr': 0.00015872413957119906, 'samples': 15880192, 'steps': 31015, 'batch_loss/train': 0.732411848148331}
12/27/2021 01:16:49 - INFO - codeparrot_training - Step 31016: {'lr': 0.00015870946479551847, 'samples': 15880704, 'steps': 31016, 'batch_loss/train': 0.7265564613044262}
12/27/2021 01:16:59 - INFO - codeparrot_training - Step 31017: {'lr': 0.0001586947903827745, 'samples': 15881216, 'steps': 31017, 'batch_loss/train': 0.8278833264485002}
12/27/2021 01:17:10 - INFO - codeparrot_training - Step 31018: {'lr': 0.0001586801163330256, 'samples': 15881728, 'steps': 31018, 'batch_loss/train': 0.5832267695805058}
12/27/2021 01:17:24 - INFO - codeparrot_training - Step 31019: {'lr': 0.00015866544264633002, 'samples': 15882240, 'steps': 31019, 'batch_loss/train': 0.7990945195779204}
12/27/2021 01:17:35 - INFO - codeparrot_training - Step 31020: {'lr': 0.0001586507693227462, 'samples': 15882752, 'steps': 31020, 'batch_loss/train': 0.7605624864809215}
12/27/2021 01:17:45 - INFO - codeparrot_training - Step 31021: {'lr': 0.00015863609636233245, 'samples': 15883264, 'steps': 31021, 'batch_loss/train': 0.7524601130280644}
12/27/2021 01:17:57 - INFO - codeparrot_training - Step 31022: {'lr': 0.00015862142376514698, 'samples': 15883776, 'steps': 31022, 'batch_loss/train': 0.6837199702858925}
12/27/2021 01:18:08 - INFO - codeparrot_training - Step 31023: {'lr': 0.00015860675153124839, 'samples': 15884288, 'steps': 31023, 'batch_loss/train': 0.6880010787863284}
12/27/2021 01:18:19 - INFO - codeparrot_training - Step 31024: {'lr': 0.00015859207966069472, 'samples': 15884800, 'steps': 31024, 'batch_loss/train': 0.7273413834627718}
12/27/2021 01:18:33 - INFO - codeparrot_training - Step 31025: {'lr': 0.0001585774081535445, 'samples': 15885312, 'steps': 31025, 'batch_loss/train': 0.7179328170605004}
12/27/2021 01:18:43 - INFO - codeparrot_training - Step 31026: {'lr': 0.0001585627370098559, 'samples': 15885824, 'steps': 31026, 'batch_loss/train': 0.700245360378176}
12/27/2021 01:18:54 - INFO - codeparrot_training - Step 31027: {'lr': 0.00015854806622968737, 'samples': 15886336, 'steps': 31027, 'batch_loss/train': 0.7488870299421251}
12/27/2021 01:19:05 - INFO - codeparrot_training - Step 31028: {'lr': 0.00015853339581309729, 'samples': 15886848, 'steps': 31028, 'batch_loss/train': 1.5918220954481512}
12/27/2021 01:19:17 - INFO - codeparrot_training - Step 31029: {'lr': 0.00015851872576014378, 'samples': 15887360, 'steps': 31029, 'batch_loss/train': 1.8955766917206347}
12/27/2021 01:19:27 - INFO - codeparrot_training - Step 31030: {'lr': 0.00015850405607088525, 'samples': 15887872, 'steps': 31030, 'batch_loss/train': 0.7191026518121362}
12/27/2021 01:19:38 - INFO - codeparrot_training - Step 31031: {'lr': 0.00015848938674538015, 'samples': 15888384, 'steps': 31031, 'batch_loss/train': 0.6433155378326774}
12/27/2021 01:19:50 - INFO - codeparrot_training - Step 31032: {'lr': 0.00015847471778368668, 'samples': 15888896, 'steps': 31032, 'batch_loss/train': 0.712498658336699}
12/27/2021 01:20:01 - INFO - codeparrot_training - Step 31033: {'lr': 0.00015846004918586314, 'samples': 15889408, 'steps': 31033, 'batch_loss/train': 0.7916928296908736}
12/27/2021 01:20:11 - INFO - codeparrot_training - Step 31034: {'lr': 0.0001584453809519678, 'samples': 15889920, 'steps': 31034, 'batch_loss/train': 0.7859434729907662}
12/27/2021 01:20:22 - INFO - codeparrot_training - Step 31035: {'lr': 0.0001584307130820592, 'samples': 15890432, 'steps': 31035, 'batch_loss/train': 0.7204895308241248}
12/27/2021 01:20:36 - INFO - codeparrot_training - Step 31036: {'lr': 0.00015841604557619544, 'samples': 15890944, 'steps': 31036, 'batch_loss/train': 0.7029059105552733}
12/27/2021 01:20:47 - INFO - codeparrot_training - Step 31037: {'lr': 0.00015840137843443487, 'samples': 15891456, 'steps': 31037, 'batch_loss/train': 0.7469586906954646}
12/27/2021 01:20:57 - INFO - codeparrot_training - Step 31038: {'lr': 0.00015838671165683592, 'samples': 15891968, 'steps': 31038, 'batch_loss/train': 0.7669969452545047}
12/27/2021 01:21:09 - INFO - codeparrot_training - Step 31039: {'lr': 0.00015837204524345673, 'samples': 15892480, 'steps': 31039, 'batch_loss/train': 0.8146183213684708}
12/27/2021 01:21:20 - INFO - codeparrot_training - Step 31040: {'lr': 0.00015835737919435577, 'samples': 15892992, 'steps': 31040, 'batch_loss/train': 0.7469966653734446}
12/27/2021 01:21:31 - INFO - codeparrot_training - Step 31041: {'lr': 0.00015834271350959118, 'samples': 15893504, 'steps': 31041, 'batch_loss/train': 0.8506253575906157}
12/27/2021 01:21:45 - INFO - codeparrot_training - Step 31042: {'lr': 0.00015832804818922142, 'samples': 15894016, 'steps': 31042, 'batch_loss/train': 0.9154888219200075}
12/27/2021 01:21:55 - INFO - codeparrot_training - Step 31043: {'lr': 0.00015831338323330472, 'samples': 15894528, 'steps': 31043, 'batch_loss/train': 0.730831696651876}
12/27/2021 01:22:06 - INFO - codeparrot_training - Step 31044: {'lr': 0.0001582987186418994, 'samples': 15895040, 'steps': 31044, 'batch_loss/train': 0.7899353769607842}
12/27/2021 01:22:18 - INFO - codeparrot_training - Step 31045: {'lr': 0.00015828405441506367, 'samples': 15895552, 'steps': 31045, 'batch_loss/train': 1.3094531260430813}
12/27/2021 01:22:29 - INFO - codeparrot_training - Step 31046: {'lr': 0.00015826939055285595, 'samples': 15896064, 'steps': 31046, 'batch_loss/train': 0.8033779300749302}
12/27/2021 01:22:39 - INFO - codeparrot_training - Step 31047: {'lr': 0.00015825472705533456, 'samples': 15896576, 'steps': 31047, 'batch_loss/train': 0.7734782425686717}
12/27/2021 01:22:50 - INFO - codeparrot_training - Step 31048: {'lr': 0.00015824006392255763, 'samples': 15897088, 'steps': 31048, 'batch_loss/train': 0.7622476993128657}
12/27/2021 01:23:02 - INFO - codeparrot_training - Step 31049: {'lr': 0.00015822540115458357, 'samples': 15897600, 'steps': 31049, 'batch_loss/train': 0.7161011388525367}
12/27/2021 01:23:13 - INFO - codeparrot_training - Step 31050: {'lr': 0.00015821073875147074, 'samples': 15898112, 'steps': 31050, 'batch_loss/train': 0.779757421463728}
12/27/2021 01:23:23 - INFO - codeparrot_training - Step 31051: {'lr': 0.00015819607671327737, 'samples': 15898624, 'steps': 31051, 'batch_loss/train': 0.7649510558694601}
12/27/2021 01:23:36 - INFO - codeparrot_training - Step 31052: {'lr': 0.0001581814150400616, 'samples': 15899136, 'steps': 31052, 'batch_loss/train': 0.8092801524326205}
12/27/2021 01:23:46 - INFO - codeparrot_training - Step 31053: {'lr': 0.00015816675373188195, 'samples': 15899648, 'steps': 31053, 'batch_loss/train': 0.7197168027050793}
12/27/2021 01:23:57 - INFO - codeparrot_training - Step 31054: {'lr': 0.00015815209278879667, 'samples': 15900160, 'steps': 31054, 'batch_loss/train': 0.7696050852537155}
12/27/2021 01:24:11 - INFO - codeparrot_training - Step 31055: {'lr': 0.0001581374322108639, 'samples': 15900672, 'steps': 31055, 'batch_loss/train': 0.8303708722814918}
12/27/2021 01:24:22 - INFO - codeparrot_training - Step 31056: {'lr': 0.00015812277199814197, 'samples': 15901184, 'steps': 31056, 'batch_loss/train': 0.784689954482019}
12/27/2021 01:24:32 - INFO - codeparrot_training - Step 31057: {'lr': 0.00015810811215068933, 'samples': 15901696, 'steps': 31057, 'batch_loss/train': 0.7952025244012475}
12/27/2021 01:24:43 - INFO - codeparrot_training - Step 31058: {'lr': 0.00015809345266856408, 'samples': 15902208, 'steps': 31058, 'batch_loss/train': 0.6548025303054601}
12/27/2021 01:24:55 - INFO - codeparrot_training - Step 31059: {'lr': 0.0001580787935518245, 'samples': 15902720, 'steps': 31059, 'batch_loss/train': 0.7742264997214079}
12/27/2021 01:25:06 - INFO - codeparrot_training - Step 31060: {'lr': 0.000158064134800529, 'samples': 15903232, 'steps': 31060, 'batch_loss/train': 0.5705209944862872}
12/27/2021 01:25:16 - INFO - codeparrot_training - Step 31061: {'lr': 0.0001580494764147358, 'samples': 15903744, 'steps': 31061, 'batch_loss/train': 0.7259818213060498}
12/27/2021 01:25:29 - INFO - codeparrot_training - Step 31062: {'lr': 0.0001580348183945031, 'samples': 15904256, 'steps': 31062, 'batch_loss/train': 0.7433553063310683}
12/27/2021 01:25:39 - INFO - codeparrot_training - Step 31063: {'lr': 0.00015802016073988923, 'samples': 15904768, 'steps': 31063, 'batch_loss/train': 0.7317376820137724}
12/27/2021 01:25:50 - INFO - codeparrot_training - Step 31064: {'lr': 0.00015800550345095255, 'samples': 15905280, 'steps': 31064, 'batch_loss/train': 0.684568609111011}
12/27/2021 01:26:04 - INFO - codeparrot_training - Step 31065: {'lr': 0.0001579908465277512, 'samples': 15905792, 'steps': 31065, 'batch_loss/train': 0.7620658784871921}
12/27/2021 01:26:14 - INFO - codeparrot_training - Step 31066: {'lr': 0.00015797618997034357, 'samples': 15906304, 'steps': 31066, 'batch_loss/train': 0.711801796220243}
12/27/2021 01:26:25 - INFO - codeparrot_training - Step 31067: {'lr': 0.00015796153377878775, 'samples': 15906816, 'steps': 31067, 'batch_loss/train': 0.7405852377414703}
12/27/2021 01:26:36 - INFO - codeparrot_training - Step 31068: {'lr': 0.00015794687795314222, 'samples': 15907328, 'steps': 31068, 'batch_loss/train': 0.6956771545810625}
12/27/2021 01:26:48 - INFO - codeparrot_training - Step 31069: {'lr': 0.00015793222249346512, 'samples': 15907840, 'steps': 31069, 'batch_loss/train': 0.7211243249475956}
12/27/2021 01:26:58 - INFO - codeparrot_training - Step 31070: {'lr': 0.00015791756739981475, 'samples': 15908352, 'steps': 31070, 'batch_loss/train': 0.7164490483701229}
12/27/2021 01:27:09 - INFO - codeparrot_training - Step 31071: {'lr': 0.00015790291267224927, 'samples': 15908864, 'steps': 31071, 'batch_loss/train': 0.8350698305293918}
12/27/2021 01:27:23 - INFO - codeparrot_training - Step 31072: {'lr': 0.0001578882583108271, 'samples': 15909376, 'steps': 31072, 'batch_loss/train': 0.6381482316646725}
12/27/2021 01:27:34 - INFO - codeparrot_training - Step 31073: {'lr': 0.00015787360431560648, 'samples': 15909888, 'steps': 31073, 'batch_loss/train': 0.6942046768963337}
12/27/2021 01:27:44 - INFO - codeparrot_training - Step 31074: {'lr': 0.00015785895068664551, 'samples': 15910400, 'steps': 31074, 'batch_loss/train': 0.6580778206698596}
12/27/2021 01:27:56 - INFO - codeparrot_training - Step 31075: {'lr': 0.00015784429742400264, 'samples': 15910912, 'steps': 31075, 'batch_loss/train': 0.8217186350375414}
12/27/2021 01:28:07 - INFO - codeparrot_training - Step 31076: {'lr': 0.00015782964452773607, 'samples': 15911424, 'steps': 31076, 'batch_loss/train': 0.7915335008874536}
12/27/2021 01:28:18 - INFO - codeparrot_training - Step 31077: {'lr': 0.00015781499199790406, 'samples': 15911936, 'steps': 31077, 'batch_loss/train': 0.6789219682104886}
12/27/2021 01:28:28 - INFO - codeparrot_training - Step 31078: {'lr': 0.00015780033983456472, 'samples': 15912448, 'steps': 31078, 'batch_loss/train': 0.7360159286763519}
12/27/2021 01:28:40 - INFO - codeparrot_training - Step 31079: {'lr': 0.00015778568803777643, 'samples': 15912960, 'steps': 31079, 'batch_loss/train': 1.6666910154744983}
12/27/2021 01:28:51 - INFO - codeparrot_training - Step 31080: {'lr': 0.0001577710366075975, 'samples': 15913472, 'steps': 31080, 'batch_loss/train': 0.7727972315624356}
12/27/2021 01:29:02 - INFO - codeparrot_training - Step 31081: {'lr': 0.00015775638554408608, 'samples': 15913984, 'steps': 31081, 'batch_loss/train': 0.6767517095431685}
12/27/2021 01:29:15 - INFO - codeparrot_training - Step 31082: {'lr': 0.00015774173484730037, 'samples': 15914496, 'steps': 31082, 'batch_loss/train': 0.6733672328991815}
12/27/2021 01:29:26 - INFO - codeparrot_training - Step 31083: {'lr': 0.0001577270845172988, 'samples': 15915008, 'steps': 31083, 'batch_loss/train': 0.787358509725891}
12/27/2021 01:29:37 - INFO - codeparrot_training - Step 31084: {'lr': 0.00015771243455413943, 'samples': 15915520, 'steps': 31084, 'batch_loss/train': 0.7835597335360944}
12/27/2021 01:29:49 - INFO - codeparrot_training - Step 31085: {'lr': 0.00015769778495788053, 'samples': 15916032, 'steps': 31085, 'batch_loss/train': 0.6757879303768277}
12/27/2021 01:30:00 - INFO - codeparrot_training - Step 31086: {'lr': 0.00015768313572858052, 'samples': 15916544, 'steps': 31086, 'batch_loss/train': 0.6648219339549541}
12/27/2021 01:30:10 - INFO - codeparrot_training - Step 31087: {'lr': 0.00015766848686629746, 'samples': 15917056, 'steps': 31087, 'batch_loss/train': 0.7525088414549828}
12/27/2021 01:30:21 - INFO - codeparrot_training - Step 31088: {'lr': 0.0001576538383710896, 'samples': 15917568, 'steps': 31088, 'batch_loss/train': 0.6650817911140621}
12/27/2021 01:30:33 - INFO - codeparrot_training - Step 31089: {'lr': 0.00015763919024301516, 'samples': 15918080, 'steps': 31089, 'batch_loss/train': 0.7717553027905524}
12/27/2021 01:30:44 - INFO - codeparrot_training - Step 31090: {'lr': 0.00015762454248213252, 'samples': 15918592, 'steps': 31090, 'batch_loss/train': 0.7929167272523046}
12/27/2021 01:30:54 - INFO - codeparrot_training - Step 31091: {'lr': 0.00015760989508849975, 'samples': 15919104, 'steps': 31091, 'batch_loss/train': 0.7265625628642738}
12/27/2021 01:31:06 - INFO - codeparrot_training - Step 31092: {'lr': 0.00015759524806217522, 'samples': 15919616, 'steps': 31092, 'batch_loss/train': 0.8614545809105039}
12/27/2021 01:31:17 - INFO - codeparrot_training - Step 31093: {'lr': 0.00015758060140321701, 'samples': 15920128, 'steps': 31093, 'batch_loss/train': 0.7450016504153609}
12/27/2021 01:31:28 - INFO - codeparrot_training - Step 31094: {'lr': 0.00015756595511168348, 'samples': 15920640, 'steps': 31094, 'batch_loss/train': 0.7025021978188306}
12/27/2021 01:31:41 - INFO - codeparrot_training - Step 31095: {'lr': 0.00015755130918763282, 'samples': 15921152, 'steps': 31095, 'batch_loss/train': 0.8859447566792369}
12/27/2021 01:31:52 - INFO - codeparrot_training - Step 31096: {'lr': 0.00015753666363112318, 'samples': 15921664, 'steps': 31096, 'batch_loss/train': 0.7691970434971154}
12/27/2021 01:32:02 - INFO - codeparrot_training - Step 31097: {'lr': 0.00015752201844221298, 'samples': 15922176, 'steps': 31097, 'batch_loss/train': 0.7445485973730683}
12/27/2021 01:32:14 - INFO - codeparrot_training - Step 31098: {'lr': 0.00015750737362096024, 'samples': 15922688, 'steps': 31098, 'batch_loss/train': 0.7723862552084029}
12/27/2021 01:32:25 - INFO - codeparrot_training - Step 31099: {'lr': 0.0001574927291674233, 'samples': 15923200, 'steps': 31099, 'batch_loss/train': 0.6814604771789163}
12/27/2021 01:32:36 - INFO - codeparrot_training - Step 31100: {'lr': 0.00015747808508166024, 'samples': 15923712, 'steps': 31100, 'batch_loss/train': 0.6531933178193867}
12/27/2021 01:32:46 - INFO - codeparrot_training - Step 31101: {'lr': 0.0001574634413637294, 'samples': 15924224, 'steps': 31101, 'batch_loss/train': 0.7394403466023505}
12/27/2021 01:33:01 - INFO - codeparrot_training - Step 31102: {'lr': 0.0001574487980136891, 'samples': 15924736, 'steps': 31102, 'batch_loss/train': 0.7644406957551837}
12/27/2021 01:33:11 - INFO - codeparrot_training - Step 31103: {'lr': 0.00015743415503159737, 'samples': 15925248, 'steps': 31103, 'batch_loss/train': 0.8644211678765714}
12/27/2021 01:33:22 - INFO - codeparrot_training - Step 31104: {'lr': 0.0001574195124175124, 'samples': 15925760, 'steps': 31104, 'batch_loss/train': 0.7159216939471662}
12/27/2021 01:33:34 - INFO - codeparrot_training - Step 31105: {'lr': 0.00015740487017149257, 'samples': 15926272, 'steps': 31105, 'batch_loss/train': 0.7707074121572077}
12/27/2021 01:33:44 - INFO - codeparrot_training - Step 31106: {'lr': 0.00015739022829359606, 'samples': 15926784, 'steps': 31106, 'batch_loss/train': 0.7401857399381697}
12/27/2021 01:33:55 - INFO - codeparrot_training - Step 31107: {'lr': 0.00015737558678388097, 'samples': 15927296, 'steps': 31107, 'batch_loss/train': 0.7640858640661463}
12/27/2021 01:34:07 - INFO - codeparrot_training - Step 31108: {'lr': 0.0001573609456424055, 'samples': 15927808, 'steps': 31108, 'batch_loss/train': 0.7620543912053108}
12/27/2021 01:34:18 - INFO - codeparrot_training - Step 31109: {'lr': 0.00015734630486922808, 'samples': 15928320, 'steps': 31109, 'batch_loss/train': 0.6650875471532345}
12/27/2021 01:34:28 - INFO - codeparrot_training - Step 31110: {'lr': 0.00015733166446440668, 'samples': 15928832, 'steps': 31110, 'batch_loss/train': 0.7645098334178329}
12/27/2021 01:34:39 - INFO - codeparrot_training - Step 31111: {'lr': 0.00015731702442799955, 'samples': 15929344, 'steps': 31111, 'batch_loss/train': 0.8596219988539815}
12/27/2021 01:34:53 - INFO - codeparrot_training - Step 31112: {'lr': 0.00015730238476006504, 'samples': 15929856, 'steps': 31112, 'batch_loss/train': 0.6845508366823196}
12/27/2021 01:35:03 - INFO - codeparrot_training - Step 31113: {'lr': 0.00015728774546066123, 'samples': 15930368, 'steps': 31113, 'batch_loss/train': 0.7467604796402156}
12/27/2021 01:35:14 - INFO - codeparrot_training - Step 31114: {'lr': 0.00015727310652984634, 'samples': 15930880, 'steps': 31114, 'batch_loss/train': 0.8268289444968104}
12/27/2021 01:35:26 - INFO - codeparrot_training - Step 31115: {'lr': 0.00015725846796767846, 'samples': 15931392, 'steps': 31115, 'batch_loss/train': 0.7537598800845444}
12/27/2021 01:35:37 - INFO - codeparrot_training - Step 31116: {'lr': 0.00015724382977421602, 'samples': 15931904, 'steps': 31116, 'batch_loss/train': 0.7371206237003207}
12/27/2021 01:35:47 - INFO - codeparrot_training - Step 31117: {'lr': 0.000157229191949517, 'samples': 15932416, 'steps': 31117, 'batch_loss/train': 0.7436909079551697}
12/27/2021 01:35:59 - INFO - codeparrot_training - Step 31118: {'lr': 0.0001572145544936398, 'samples': 15932928, 'steps': 31118, 'batch_loss/train': 0.7120484746992588}
12/27/2021 01:36:10 - INFO - codeparrot_training - Step 31119: {'lr': 0.00015719991740664236, 'samples': 15933440, 'steps': 31119, 'batch_loss/train': 0.7515906589105725}
12/27/2021 01:36:21 - INFO - codeparrot_training - Step 31120: {'lr': 0.00015718528068858306, 'samples': 15933952, 'steps': 31120, 'batch_loss/train': 0.6975909236352891}
12/27/2021 01:36:35 - INFO - codeparrot_training - Step 31121: {'lr': 0.00015717064433952007, 'samples': 15934464, 'steps': 31121, 'batch_loss/train': 0.7856069798581302}
12/27/2021 01:36:45 - INFO - codeparrot_training - Step 31122: {'lr': 0.0001571560083595115, 'samples': 15934976, 'steps': 31122, 'batch_loss/train': 0.7270895512774587}
12/27/2021 01:36:56 - INFO - codeparrot_training - Step 31123: {'lr': 0.00015714137274861563, 'samples': 15935488, 'steps': 31123, 'batch_loss/train': 0.7848384184762836}
12/27/2021 01:37:06 - INFO - codeparrot_training - Step 31124: {'lr': 0.00015712673750689057, 'samples': 15936000, 'steps': 31124, 'batch_loss/train': 0.8055107239633799}
12/27/2021 01:37:19 - INFO - codeparrot_training - Step 31125: {'lr': 0.00015711210263439458, 'samples': 15936512, 'steps': 31125, 'batch_loss/train': 0.7868521343916655}
12/27/2021 01:37:30 - INFO - codeparrot_training - Step 31126: {'lr': 0.0001570974681311857, 'samples': 15937024, 'steps': 31126, 'batch_loss/train': 0.781474323477596}
12/27/2021 01:37:40 - INFO - codeparrot_training - Step 31127: {'lr': 0.00015708283399732225, 'samples': 15937536, 'steps': 31127, 'batch_loss/train': 0.7688420478370972}
12/27/2021 01:37:52 - INFO - codeparrot_training - Step 31128: {'lr': 0.00015706820023286244, 'samples': 15938048, 'steps': 31128, 'batch_loss/train': 0.5665921655599959}
12/27/2021 01:38:03 - INFO - codeparrot_training - Step 31129: {'lr': 0.00015705356683786431, 'samples': 15938560, 'steps': 31129, 'batch_loss/train': 0.7205194933339953}
12/27/2021 01:38:14 - INFO - codeparrot_training - Step 31130: {'lr': 0.00015703893381238608, 'samples': 15939072, 'steps': 31130, 'batch_loss/train': 0.8002591617405415}
12/27/2021 01:38:28 - INFO - codeparrot_training - Step 31131: {'lr': 0.00015702430115648603, 'samples': 15939584, 'steps': 31131, 'batch_loss/train': 0.7140631170477718}
12/27/2021 01:38:38 - INFO - codeparrot_training - Step 31132: {'lr': 0.00015700966887022223, 'samples': 15940096, 'steps': 31132, 'batch_loss/train': 0.7988921827636659}
12/27/2021 01:38:49 - INFO - codeparrot_training - Step 31133: {'lr': 0.0001569950369536528, 'samples': 15940608, 'steps': 31133, 'batch_loss/train': 0.7371317185461521}
12/27/2021 01:38:59 - INFO - codeparrot_training - Step 31134: {'lr': 0.00015698040540683604, 'samples': 15941120, 'steps': 31134, 'batch_loss/train': 0.7326097607146949}
12/27/2021 01:39:11 - INFO - codeparrot_training - Step 31135: {'lr': 0.0001569657742298301, 'samples': 15941632, 'steps': 31135, 'batch_loss/train': 0.7098896112293005}
12/27/2021 01:39:22 - INFO - codeparrot_training - Step 31136: {'lr': 0.00015695114342269306, 'samples': 15942144, 'steps': 31136, 'batch_loss/train': 0.7653290807502344}
12/27/2021 01:39:33 - INFO - codeparrot_training - Step 31137: {'lr': 0.0001569365129854831, 'samples': 15942656, 'steps': 31137, 'batch_loss/train': 0.8244258612394333}
12/27/2021 01:39:45 - INFO - codeparrot_training - Step 31138: {'lr': 0.00015692188291825852, 'samples': 15943168, 'steps': 31138, 'batch_loss/train': 0.6960319671779871}
12/27/2021 01:39:56 - INFO - codeparrot_training - Step 31139: {'lr': 0.00015690725322107735, 'samples': 15943680, 'steps': 31139, 'batch_loss/train': 0.694034461863339}
12/27/2021 01:40:06 - INFO - codeparrot_training - Step 31140: {'lr': 0.00015689262389399783, 'samples': 15944192, 'steps': 31140, 'batch_loss/train': 0.7197387469932437}
12/27/2021 01:40:20 - INFO - codeparrot_training - Step 31141: {'lr': 0.000156877994937078, 'samples': 15944704, 'steps': 31141, 'batch_loss/train': 0.7380871213972569}
12/27/2021 01:40:31 - INFO - codeparrot_training - Step 31142: {'lr': 0.00015686336635037615, 'samples': 15945216, 'steps': 31142, 'batch_loss/train': 0.7774759051389992}
12/27/2021 01:40:41 - INFO - codeparrot_training - Step 31143: {'lr': 0.0001568487381339504, 'samples': 15945728, 'steps': 31143, 'batch_loss/train': 0.9161382341990247}
12/27/2021 01:40:52 - INFO - codeparrot_training - Step 31144: {'lr': 0.00015683411028785888, 'samples': 15946240, 'steps': 31144, 'batch_loss/train': 0.828970336355269}
12/27/2021 01:41:04 - INFO - codeparrot_training - Step 31145: {'lr': 0.0001568194828121597, 'samples': 15946752, 'steps': 31145, 'batch_loss/train': 0.8322256682440639}
12/27/2021 01:41:15 - INFO - codeparrot_training - Step 31146: {'lr': 0.00015680485570691114, 'samples': 15947264, 'steps': 31146, 'batch_loss/train': 0.6286868585739285}
12/27/2021 01:41:25 - INFO - codeparrot_training - Step 31147: {'lr': 0.00015679022897217132, 'samples': 15947776, 'steps': 31147, 'batch_loss/train': 0.7433146121911705}
12/27/2021 01:41:37 - INFO - codeparrot_training - Step 31148: {'lr': 0.00015677560260799823, 'samples': 15948288, 'steps': 31148, 'batch_loss/train': 0.7570709661813453}
12/27/2021 01:41:48 - INFO - codeparrot_training - Step 31149: {'lr': 0.00015676097661445026, 'samples': 15948800, 'steps': 31149, 'batch_loss/train': 0.7714064354076982}
12/27/2021 01:41:59 - INFO - codeparrot_training - Step 31150: {'lr': 0.0001567463509915854, 'samples': 15949312, 'steps': 31150, 'batch_loss/train': 0.6722631268203259}
12/27/2021 01:42:13 - INFO - codeparrot_training - Step 31151: {'lr': 0.00015673172573946187, 'samples': 15949824, 'steps': 31151, 'batch_loss/train': 0.7876556804403663}
12/27/2021 01:42:23 - INFO - codeparrot_training - Step 31152: {'lr': 0.0001567171008581377, 'samples': 15950336, 'steps': 31152, 'batch_loss/train': 0.8222541594877839}
12/27/2021 01:42:34 - INFO - codeparrot_training - Step 31153: {'lr': 0.00015670247634767115, 'samples': 15950848, 'steps': 31153, 'batch_loss/train': 0.7600433398038149}
12/27/2021 01:42:46 - INFO - codeparrot_training - Step 31154: {'lr': 0.00015668785220812037, 'samples': 15951360, 'steps': 31154, 'batch_loss/train': 0.7501985551789403}
12/27/2021 01:42:57 - INFO - codeparrot_training - Step 31155: {'lr': 0.00015667322843954345, 'samples': 15951872, 'steps': 31155, 'batch_loss/train': 0.6373902545310557}
12/27/2021 01:43:07 - INFO - codeparrot_training - Step 31156: {'lr': 0.00015665860504199844, 'samples': 15952384, 'steps': 31156, 'batch_loss/train': 0.773982867365703}
12/27/2021 01:43:18 - INFO - codeparrot_training - Step 31157: {'lr': 0.0001566439820155437, 'samples': 15952896, 'steps': 31157, 'batch_loss/train': 0.7516657533124089}
12/27/2021 01:43:31 - INFO - codeparrot_training - Step 31158: {'lr': 0.0001566293593602372, 'samples': 15953408, 'steps': 31158, 'batch_loss/train': 0.7858112789690495}
12/27/2021 01:43:41 - INFO - codeparrot_training - Step 31159: {'lr': 0.00015661473707613704, 'samples': 15953920, 'steps': 31159, 'batch_loss/train': 0.8427554285153747}
12/27/2021 01:43:52 - INFO - codeparrot_training - Step 31160: {'lr': 0.00015660011516330146, 'samples': 15954432, 'steps': 31160, 'batch_loss/train': 0.7671202216297388}
12/27/2021 01:44:04 - INFO - codeparrot_training - Step 31161: {'lr': 0.00015658549362178865, 'samples': 15954944, 'steps': 31161, 'batch_loss/train': 0.7890508873388171}
12/27/2021 01:44:14 - INFO - codeparrot_training - Step 31162: {'lr': 0.00015657087245165655, 'samples': 15955456, 'steps': 31162, 'batch_loss/train': 0.7019515744177625}
12/27/2021 01:44:25 - INFO - codeparrot_training - Step 31163: {'lr': 0.00015655625165296333, 'samples': 15955968, 'steps': 31163, 'batch_loss/train': 0.7778482546564192}
12/27/2021 01:44:37 - INFO - codeparrot_training - Step 31164: {'lr': 0.0001565416312257673, 'samples': 15956480, 'steps': 31164, 'batch_loss/train': 0.6850563366897404}
12/27/2021 01:44:48 - INFO - codeparrot_training - Step 31165: {'lr': 0.0001565270111701264, 'samples': 15956992, 'steps': 31165, 'batch_loss/train': 0.7352451207116246}
12/27/2021 01:44:58 - INFO - codeparrot_training - Step 31166: {'lr': 0.00015651239148609886, 'samples': 15957504, 'steps': 31166, 'batch_loss/train': 0.7835873500443995}
12/27/2021 01:45:09 - INFO - codeparrot_training - Step 31167: {'lr': 0.0001564977721737426, 'samples': 15958016, 'steps': 31167, 'batch_loss/train': 0.741925360634923}
12/27/2021 01:45:21 - INFO - codeparrot_training - Step 31168: {'lr': 0.00015648315323311607, 'samples': 15958528, 'steps': 31168, 'batch_loss/train': 0.7515094624832273}
12/27/2021 01:45:32 - INFO - codeparrot_training - Step 31169: {'lr': 0.00015646853466427712, 'samples': 15959040, 'steps': 31169, 'batch_loss/train': 0.6235052628908306}
12/27/2021 01:45:43 - INFO - codeparrot_training - Step 31170: {'lr': 0.0001564539164672839, 'samples': 15959552, 'steps': 31170, 'batch_loss/train': 1.0512652667239308}
12/27/2021 01:45:55 - INFO - codeparrot_training - Step 31171: {'lr': 0.0001564392986421947, 'samples': 15960064, 'steps': 31171, 'batch_loss/train': 0.6890686347614974}
12/27/2021 01:46:06 - INFO - codeparrot_training - Step 31172: {'lr': 0.0001564246811890675, 'samples': 15960576, 'steps': 31172, 'batch_loss/train': 1.433008930645883}
12/27/2021 01:46:17 - INFO - codeparrot_training - Step 31173: {'lr': 0.00015641006410796045, 'samples': 15961088, 'steps': 31173, 'batch_loss/train': 0.9381017759442329}
12/27/2021 01:46:27 - INFO - codeparrot_training - Step 31174: {'lr': 0.00015639544739893158, 'samples': 15961600, 'steps': 31174, 'batch_loss/train': 0.8422728944569826}
12/27/2021 01:46:39 - INFO - codeparrot_training - Step 31175: {'lr': 0.0001563808310620391, 'samples': 15962112, 'steps': 31175, 'batch_loss/train': 0.7596401730552316}
12/27/2021 01:46:50 - INFO - codeparrot_training - Step 31176: {'lr': 0.0001563662150973411, 'samples': 15962624, 'steps': 31176, 'batch_loss/train': 0.8254927447997034}
12/27/2021 01:47:01 - INFO - codeparrot_training - Step 31177: {'lr': 0.00015635159950489573, 'samples': 15963136, 'steps': 31177, 'batch_loss/train': 0.7170295035466552}
12/27/2021 01:47:15 - INFO - codeparrot_training - Step 31178: {'lr': 0.0001563369842847609, 'samples': 15963648, 'steps': 31178, 'batch_loss/train': 1.0568512352183461}
12/27/2021 01:47:26 - INFO - codeparrot_training - Step 31179: {'lr': 0.0001563223694369949, 'samples': 15964160, 'steps': 31179, 'batch_loss/train': 0.7001821435987949}
12/27/2021 01:47:36 - INFO - codeparrot_training - Step 31180: {'lr': 0.0001563077549616559, 'samples': 15964672, 'steps': 31180, 'batch_loss/train': 0.6935852691531181}
12/27/2021 01:47:48 - INFO - codeparrot_training - Step 31181: {'lr': 0.00015629314085880178, 'samples': 15965184, 'steps': 31181, 'batch_loss/train': 0.5917715223622508}
12/27/2021 01:47:59 - INFO - codeparrot_training - Step 31182: {'lr': 0.00015627852712849072, 'samples': 15965696, 'steps': 31182, 'batch_loss/train': 0.7616368811577559}
12/27/2021 01:48:10 - INFO - codeparrot_training - Step 31183: {'lr': 0.00015626391377078096, 'samples': 15966208, 'steps': 31183, 'batch_loss/train': 0.8391030067577958}
12/27/2021 01:48:23 - INFO - codeparrot_training - Step 31184: {'lr': 0.0001562493007857304, 'samples': 15966720, 'steps': 31184, 'batch_loss/train': 0.7480639342684299}
12/27/2021 01:48:34 - INFO - codeparrot_training - Step 31185: {'lr': 0.0001562346881733972, 'samples': 15967232, 'steps': 31185, 'batch_loss/train': 0.8480459693819284}
12/27/2021 01:48:45 - INFO - codeparrot_training - Step 31186: {'lr': 0.00015622007593383958, 'samples': 15967744, 'steps': 31186, 'batch_loss/train': 0.7168179647997022}
12/27/2021 01:48:55 - INFO - codeparrot_training - Step 31187: {'lr': 0.0001562054640671155, 'samples': 15968256, 'steps': 31187, 'batch_loss/train': 0.75635592546314}
12/27/2021 01:49:08 - INFO - codeparrot_training - Step 31188: {'lr': 0.00015619085257328303, 'samples': 15968768, 'steps': 31188, 'batch_loss/train': 0.6197016406804323}
12/27/2021 01:49:18 - INFO - codeparrot_training - Step 31189: {'lr': 0.00015617624145240029, 'samples': 15969280, 'steps': 31189, 'batch_loss/train': 0.770565687911585}
12/27/2021 01:49:29 - INFO
- codeparrot_training - Step 31190: {'lr': 0.00015616163070452548, 'samples': 15969792, 'steps': 31190, 'batch_loss/train': 0.7461546855047345} 12/27/2021 01:49:41 - INFO - codeparrot_training - Step 31191: {'lr': 0.00015614702032971654, 'samples': 15970304, 'steps': 31191, 'batch_loss/train': 0.6935714257415384} 12/27/2021 01:49:52 - INFO - codeparrot_training - Step 31192: {'lr': 0.00015613241032803166, 'samples': 15970816, 'steps': 31192, 'batch_loss/train': 0.6667155781760812} 12/27/2021 01:50:02 - INFO - codeparrot_training - Step 31193: {'lr': 0.00015611780069952876, 'samples': 15971328, 'steps': 31193, 'batch_loss/train': 0.7096760114654899} 12/27/2021 01:50:16 - INFO - codeparrot_training - Step 31194: {'lr': 0.00015610319144426617, 'samples': 15971840, 'steps': 31194, 'batch_loss/train': 0.708776168525219} 12/27/2021 01:50:27 - INFO - codeparrot_training - Step 31195: {'lr': 0.00015608858256230173, 'samples': 15972352, 'steps': 31195, 'batch_loss/train': 0.7705657761543989} 12/27/2021 01:50:37 - INFO - codeparrot_training - Step 31196: {'lr': 0.00015607397405369361, 'samples': 15972864, 'steps': 31196, 'batch_loss/train': 0.6498663043603301} 12/27/2021 01:50:49 - INFO - codeparrot_training - Step 31197: {'lr': 0.0001560593659185, 'samples': 15973376, 'steps': 31197, 'batch_loss/train': 0.7073149774223566} 12/27/2021 01:51:00 - INFO - codeparrot_training - Step 31198: {'lr': 0.00015604475815677882, 'samples': 15973888, 'steps': 31198, 'batch_loss/train': 0.8223716779612005} 12/27/2021 01:51:11 - INFO - codeparrot_training - Step 31199: {'lr': 0.0001560301507685883, 'samples': 15974400, 'steps': 31199, 'batch_loss/train': 0.7116389106959105} 12/27/2021 01:51:21 - INFO - codeparrot_training - Step 31200: {'lr': 0.00015601554375398627, 'samples': 15974912, 'steps': 31200, 'batch_loss/train': 0.658361317589879} 12/27/2021 01:51:33 - INFO - codeparrot_training - Step 31201: {'lr': 0.000156000937113031, 'samples': 15975424, 'steps': 31201, 'batch_loss/train': 
0.5731234189588577} 12/27/2021 01:51:44 - INFO - codeparrot_training - Step 31202: {'lr': 0.0001559863308457806, 'samples': 15975936, 'steps': 31202, 'batch_loss/train': 0.6893653818406165} 12/27/2021 01:51:55 - INFO - codeparrot_training - Step 31203: {'lr': 0.00015597172495229302, 'samples': 15976448, 'steps': 31203, 'batch_loss/train': 0.734826902858913} 12/27/2021 01:52:09 - INFO - codeparrot_training - Step 31204: {'lr': 0.00015595711943262628, 'samples': 15976960, 'steps': 31204, 'batch_loss/train': 0.7738064546138048} 12/27/2021 01:52:19 - INFO - codeparrot_training - Step 31205: {'lr': 0.00015594251428683857, 'samples': 15977472, 'steps': 31205, 'batch_loss/train': 0.6332427490269765} 12/27/2021 01:52:30 - INFO - codeparrot_training - Step 31206: {'lr': 0.00015592790951498791, 'samples': 15977984, 'steps': 31206, 'batch_loss/train': 0.7848239261657} 12/27/2021 01:52:42 - INFO - codeparrot_training - Step 31207: {'lr': 0.0001559133051171323, 'samples': 15978496, 'steps': 31207, 'batch_loss/train': 0.7676130421459675} 12/27/2021 01:52:52 - INFO - codeparrot_training - Step 31208: {'lr': 0.00015589870109332992, 'samples': 15979008, 'steps': 31208, 'batch_loss/train': 0.7439386900514364} 12/27/2021 01:53:03 - INFO - codeparrot_training - Step 31209: {'lr': 0.00015588409744363884, 'samples': 15979520, 'steps': 31209, 'batch_loss/train': 0.7905589742586017} 12/27/2021 01:53:17 - INFO - codeparrot_training - Step 31210: {'lr': 0.00015586949416811696, 'samples': 15980032, 'steps': 31210, 'batch_loss/train': 0.6851076884195209} 12/27/2021 01:53:27 - INFO - codeparrot_training - Step 31211: {'lr': 0.00015585489126682242, 'samples': 15980544, 'steps': 31211, 'batch_loss/train': 0.7202304974198341} 12/27/2021 01:53:38 - INFO - codeparrot_training - Step 31212: {'lr': 0.00015584028873981336, 'samples': 15981056, 'steps': 31212, 'batch_loss/train': 0.6640660092234612} 12/27/2021 01:53:49 - INFO - codeparrot_training - Step 31213: {'lr': 0.00015582568658714777, 'samples': 
15981568, 'steps': 31213, 'batch_loss/train': 0.6995602557435632} 12/27/2021 01:54:01 - INFO - codeparrot_training - Step 31214: {'lr': 0.0001558110848088836, 'samples': 15982080, 'steps': 31214, 'batch_loss/train': 0.7191937863826752} 12/27/2021 01:54:11 - INFO - codeparrot_training - Step 31215: {'lr': 0.00015579648340507902, 'samples': 15982592, 'steps': 31215, 'batch_loss/train': 0.7133359448052943} 12/27/2021 01:54:22 - INFO - codeparrot_training - Step 31216: {'lr': 0.0001557818823757921, 'samples': 15983104, 'steps': 31216, 'batch_loss/train': 0.7548217801377177} 12/27/2021 01:54:34 - INFO - codeparrot_training - Step 31217: {'lr': 0.00015576728172108078, 'samples': 15983616, 'steps': 31217, 'batch_loss/train': 0.7344898218289018} 12/27/2021 01:54:45 - INFO - codeparrot_training - Step 31218: {'lr': 0.00015575268144100323, 'samples': 15984128, 'steps': 31218, 'batch_loss/train': 0.7182461442425847} 12/27/2021 01:54:55 - INFO - codeparrot_training - Step 31219: {'lr': 0.00015573808153561736, 'samples': 15984640, 'steps': 31219, 'batch_loss/train': 0.6519049028865993} 12/27/2021 01:55:09 - INFO - codeparrot_training - Step 31220: {'lr': 0.0001557234820049813, 'samples': 15985152, 'steps': 31220, 'batch_loss/train': 0.6969852454494685} 12/27/2021 01:55:20 - INFO - codeparrot_training - Step 31221: {'lr': 0.00015570888284915314, 'samples': 15985664, 'steps': 31221, 'batch_loss/train': 0.720598938409239} 12/27/2021 01:55:30 - INFO - codeparrot_training - Step 31222: {'lr': 0.00015569428406819074, 'samples': 15986176, 'steps': 31222, 'batch_loss/train': 0.7881099372170866} 12/27/2021 01:55:42 - INFO - codeparrot_training - Step 31223: {'lr': 0.0001556796856621524, 'samples': 15986688, 'steps': 31223, 'batch_loss/train': 0.7833489300683141} 12/27/2021 01:55:53 - INFO - codeparrot_training - Step 31224: {'lr': 0.00015566508763109594, 'samples': 15987200, 'steps': 31224, 'batch_loss/train': 0.7094577087555081} 12/27/2021 01:56:04 - INFO - codeparrot_training - Step 
31225: {'lr': 0.00015565048997507951, 'samples': 15987712, 'steps': 31225, 'batch_loss/train': 0.6724015465006232} 12/27/2021 01:56:14 - INFO - codeparrot_training - Step 31226: {'lr': 0.00015563589269416107, 'samples': 15988224, 'steps': 31226, 'batch_loss/train': 0.6971373530104756} 12/27/2021 01:56:26 - INFO - codeparrot_training - Step 31227: {'lr': 0.00015562129578839868, 'samples': 15988736, 'steps': 31227, 'batch_loss/train': 0.7767620651284233} 12/27/2021 01:56:37 - INFO - codeparrot_training - Step 31228: {'lr': 0.00015560669925785044, 'samples': 15989248, 'steps': 31228, 'batch_loss/train': 0.7260140459984541} 12/27/2021 01:56:48 - INFO - codeparrot_training - Step 31229: {'lr': 0.00015559210310257427, 'samples': 15989760, 'steps': 31229, 'batch_loss/train': 0.6977221020497382} 12/27/2021 01:57:02 - INFO - codeparrot_training - Step 31230: {'lr': 0.00015557750732262822, 'samples': 15990272, 'steps': 31230, 'batch_loss/train': 0.8050493160262704} 12/27/2021 01:57:12 - INFO - codeparrot_training - Step 31231: {'lr': 0.00015556291191807042, 'samples': 15990784, 'steps': 31231, 'batch_loss/train': 0.7101543580647558} 12/27/2021 01:57:23 - INFO - codeparrot_training - Step 31232: {'lr': 0.00015554831688895885, 'samples': 15991296, 'steps': 31232, 'batch_loss/train': 0.7424016129225492} 12/27/2021 01:57:35 - INFO - codeparrot_training - Step 31233: {'lr': 0.0001555337222353514, 'samples': 15991808, 'steps': 31233, 'batch_loss/train': 0.6792374821379781} 12/27/2021 01:57:46 - INFO - codeparrot_training - Step 31234: {'lr': 0.00015551912795730625, 'samples': 15992320, 'steps': 31234, 'batch_loss/train': 0.8984327954240143} 12/27/2021 01:57:56 - INFO - codeparrot_training - Step 31235: {'lr': 0.00015550453405488142, 'samples': 15992832, 'steps': 31235, 'batch_loss/train': 0.7790385910193436} 12/27/2021 01:58:07 - INFO - codeparrot_training - Step 31236: {'lr': 0.00015548994052813484, 'samples': 15993344, 'steps': 31236, 'batch_loss/train': 0.7173456251621246} 
12/27/2021 01:58:19 - INFO - codeparrot_training - Step 31237: {'lr': 0.00015547534737712449, 'samples': 15993856, 'steps': 31237, 'batch_loss/train': 0.5910374777740799} 12/27/2021 01:58:30 - INFO - codeparrot_training - Step 31238: {'lr': 0.0001554607546019086, 'samples': 15994368, 'steps': 31238, 'batch_loss/train': 0.6181629025377333} 12/27/2021 01:58:40 - INFO - codeparrot_training - Step 31239: {'lr': 0.00015544616220254507, 'samples': 15994880, 'steps': 31239, 'batch_loss/train': 0.6964071444235742} 12/27/2021 01:58:54 - INFO - codeparrot_training - Step 31240: {'lr': 0.00015543157017909183, 'samples': 15995392, 'steps': 31240, 'batch_loss/train': 0.6350541392457671} 12/27/2021 01:59:05 - INFO - codeparrot_training - Step 31241: {'lr': 0.0001554169785316069, 'samples': 15995904, 'steps': 31241, 'batch_loss/train': 0.7360039995983243} 12/27/2021 01:59:15 - INFO - codeparrot_training - Step 31242: {'lr': 0.00015540238726014844, 'samples': 15996416, 'steps': 31242, 'batch_loss/train': 0.6966207064688206} 12/27/2021 01:59:27 - INFO - codeparrot_training - Step 31243: {'lr': 0.00015538779636477436, 'samples': 15996928, 'steps': 31243, 'batch_loss/train': 0.7341013413388282} 12/27/2021 01:59:38 - INFO - codeparrot_training - Step 31244: {'lr': 0.00015537320584554256, 'samples': 15997440, 'steps': 31244, 'batch_loss/train': 0.7187804756686091} 12/27/2021 01:59:49 - INFO - codeparrot_training - Step 31245: {'lr': 0.0001553586157025113, 'samples': 15997952, 'steps': 31245, 'batch_loss/train': 0.7283897968009114} 12/27/2021 02:00:01 - INFO - codeparrot_training - Step 31246: {'lr': 0.00015534402593573844, 'samples': 15998464, 'steps': 31246, 'batch_loss/train': 0.6330820638686419} 12/27/2021 02:00:11 - INFO - codeparrot_training - Step 31247: {'lr': 0.00015532943654528197, 'samples': 15998976, 'steps': 31247, 'batch_loss/train': 0.6694566374644637} 12/27/2021 02:00:22 - INFO - codeparrot_training - Step 31248: {'lr': 0.00015531484753119988, 'samples': 15999488, 
'steps': 31248, 'batch_loss/train': 0.7139387684874237} 12/27/2021 02:00:36 - INFO - codeparrot_training - Step 31249: {'lr': 0.00015530025889355027, 'samples': 16000000, 'steps': 31249, 'batch_loss/train': 0.8985593896359205} 12/27/2021 02:00:46 - INFO - codeparrot_training - Step 31250: {'lr': 0.00015528567063239102, 'samples': 16000512, 'steps': 31250, 'batch_loss/train': 0.7342694578692317} 12/27/2021 02:00:57 - INFO - codeparrot_training - Step 31251: {'lr': 0.00015527108274778024, 'samples': 16001024, 'steps': 31251, 'batch_loss/train': 0.6817827164195478} 12/27/2021 02:01:08 - INFO - codeparrot_training - Step 31252: {'lr': 0.00015525649523977575, 'samples': 16001536, 'steps': 31252, 'batch_loss/train': 0.6423605300078634} 12/27/2021 02:01:20 - INFO - codeparrot_training - Step 31253: {'lr': 0.00015524190810843575, 'samples': 16002048, 'steps': 31253, 'batch_loss/train': 0.726797659881413} 12/27/2021 02:01:31 - INFO - codeparrot_training - Step 31254: {'lr': 0.00015522732135381817, 'samples': 16002560, 'steps': 31254, 'batch_loss/train': 0.7936034593731165} 12/27/2021 02:01:41 - INFO - codeparrot_training - Step 31255: {'lr': 0.00015521273497598093, 'samples': 16003072, 'steps': 31255, 'batch_loss/train': 0.7979706954210997} 12/27/2021 02:01:55 - INFO - codeparrot_training - Step 31256: {'lr': 0.00015519814897498198, 'samples': 16003584, 'steps': 31256, 'batch_loss/train': 0.76047316507902} 12/27/2021 02:02:06 - INFO - codeparrot_training - Step 31257: {'lr': 0.00015518356335087952, 'samples': 16004096, 'steps': 31257, 'batch_loss/train': 0.625092735979706} 12/27/2021 02:02:16 - INFO - codeparrot_training - Step 31258: {'lr': 0.00015516897810373143, 'samples': 16004608, 'steps': 31258, 'batch_loss/train': 0.8180119409225881} 12/27/2021 02:02:28 - INFO - codeparrot_training - Step 31259: {'lr': 0.00015515439323359557, 'samples': 16005120, 'steps': 31259, 'batch_loss/train': 0.7036831108853221} 12/27/2021 02:02:39 - INFO - codeparrot_training - Step 31260: 
{'lr': 0.00015513980874053008, 'samples': 16005632, 'steps': 31260, 'batch_loss/train': 0.7192430375143886} 12/27/2021 02:02:49 - INFO - codeparrot_training - Step 31261: {'lr': 0.00015512522462459295, 'samples': 16006144, 'steps': 31261, 'batch_loss/train': 0.7916339715011418} 12/27/2021 02:03:02 - INFO - codeparrot_training - Step 31262: {'lr': 0.000155110640885842, 'samples': 16006656, 'steps': 31262, 'batch_loss/train': 0.7462668046355247} 12/27/2021 02:03:12 - INFO - codeparrot_training - Step 31263: {'lr': 0.0001550960575243353, 'samples': 16007168, 'steps': 31263, 'batch_loss/train': 0.8187322812154889} 12/27/2021 02:03:23 - INFO - codeparrot_training - Step 31264: {'lr': 0.00015508147454013093, 'samples': 16007680, 'steps': 31264, 'batch_loss/train': 0.708149014506489} 12/27/2021 02:03:33 - INFO - codeparrot_training - Step 31265: {'lr': 0.00015506689193328676, 'samples': 16008192, 'steps': 31265, 'batch_loss/train': 0.7305104015395045} 12/27/2021 02:03:47 - INFO - codeparrot_training - Step 31266: {'lr': 0.00015505230970386082, 'samples': 16008704, 'steps': 31266, 'batch_loss/train': 0.6202785763889551} 12/27/2021 02:03:58 - INFO - codeparrot_training - Step 31267: {'lr': 0.00015503772785191094, 'samples': 16009216, 'steps': 31267, 'batch_loss/train': 0.6569737754762173} 12/27/2021 02:04:08 - INFO - codeparrot_training - Step 31268: {'lr': 0.00015502314637749529, 'samples': 16009728, 'steps': 31268, 'batch_loss/train': 0.7104355180636048} 12/27/2021 02:04:21 - INFO - codeparrot_training - Step 31269: {'lr': 0.0001550085652806717, 'samples': 16010240, 'steps': 31269, 'batch_loss/train': 0.6751037536887452} 12/27/2021 02:04:31 - INFO - codeparrot_training - Step 31270: {'lr': 0.00015499398456149816, 'samples': 16010752, 'steps': 31270, 'batch_loss/train': 0.7389004686847329} 12/27/2021 02:04:42 - INFO - codeparrot_training - Step 31271: {'lr': 0.00015497940422003274, 'samples': 16011264, 'steps': 31271, 'batch_loss/train': 0.47883270122110844} 12/27/2021 
02:04:54 - INFO - codeparrot_training - Step 31272: {'lr': 0.0001549648242563333, 'samples': 16011776, 'steps': 31272, 'batch_loss/train': 0.66883041895926} 12/27/2021 02:05:05 - INFO - codeparrot_training - Step 31273: {'lr': 0.00015495024467045786, 'samples': 16012288, 'steps': 31273, 'batch_loss/train': 0.7301327046006918} 12/27/2021 02:05:15 - INFO - codeparrot_training - Step 31274: {'lr': 0.00015493566546246425, 'samples': 16012800, 'steps': 31274, 'batch_loss/train': 0.4981325544649735} 12/27/2021 02:05:27 - INFO - codeparrot_training - Step 31275: {'lr': 0.0001549210866324106, 'samples': 16013312, 'steps': 31275, 'batch_loss/train': 0.5584777336916886} 12/27/2021 02:05:38 - INFO - codeparrot_training - Step 31276: {'lr': 0.0001549065081803549, 'samples': 16013824, 'steps': 31276, 'batch_loss/train': 0.7164248167537153} 12/27/2021 02:05:48 - INFO - codeparrot_training - Step 31277: {'lr': 0.00015489193010635496, 'samples': 16014336, 'steps': 31277, 'batch_loss/train': 0.7296941699460149} 12/27/2021 02:05:59 - INFO - codeparrot_training - Step 31278: {'lr': 0.00015487735241046874, 'samples': 16014848, 'steps': 31278, 'batch_loss/train': 0.6835520723834634} 12/27/2021 02:06:13 - INFO - codeparrot_training - Step 31279: {'lr': 0.00015486277509275427, 'samples': 16015360, 'steps': 31279, 'batch_loss/train': 0.6912011979147792} 12/27/2021 02:06:24 - INFO - codeparrot_training - Step 31280: {'lr': 0.00015484819815326953, 'samples': 16015872, 'steps': 31280, 'batch_loss/train': 0.7223558188416064} 12/27/2021 02:06:34 - INFO - codeparrot_training - Step 31281: {'lr': 0.00015483362159207236, 'samples': 16016384, 'steps': 31281, 'batch_loss/train': 0.6753521482460201} 12/27/2021 02:06:46 - INFO - codeparrot_training - Step 31282: {'lr': 0.00015481904540922079, 'samples': 16016896, 'steps': 31282, 'batch_loss/train': 0.7678639376536012} 12/27/2021 02:06:57 - INFO - codeparrot_training - Step 31283: {'lr': 0.00015480446960477282, 'samples': 16017408, 'steps': 31283, 
'batch_loss/train': 0.5597793461638503} 12/27/2021 02:07:08 - INFO - codeparrot_training - Step 31284: {'lr': 0.00015478989417878636, 'samples': 16017920, 'steps': 31284, 'batch_loss/train': 0.6410588688449934} 12/27/2021 02:07:20 - INFO - codeparrot_training - Step 31285: {'lr': 0.0001547753191313192, 'samples': 16018432, 'steps': 31285, 'batch_loss/train': 0.798741802573204} 12/27/2021 02:07:30 - INFO - codeparrot_training - Step 31286: {'lr': 0.00015476074446242946, 'samples': 16018944, 'steps': 31286, 'batch_loss/train': 0.7210889346897602} 12/27/2021 02:07:41 - INFO - codeparrot_training - Step 31287: {'lr': 0.0001547461701721751, 'samples': 16019456, 'steps': 31287, 'batch_loss/train': 0.6415602487977594} 12/27/2021 02:07:55 - INFO - codeparrot_training - Step 31288: {'lr': 0.00015473159626061396, 'samples': 16019968, 'steps': 31288, 'batch_loss/train': 0.6473033728543669} 12/27/2021 02:08:05 - INFO - codeparrot_training - Step 31289: {'lr': 0.00015471702272780398, 'samples': 16020480, 'steps': 31289, 'batch_loss/train': 0.7110807308927178} 12/27/2021 02:08:16 - INFO - codeparrot_training - Step 31290: {'lr': 0.0001547024495738032, 'samples': 16020992, 'steps': 31290, 'batch_loss/train': 0.6674098647199571} 12/27/2021 02:08:27 - INFO - codeparrot_training - Step 31291: {'lr': 0.00015468787679866948, 'samples': 16021504, 'steps': 31291, 'batch_loss/train': 0.6123711504042149} 12/27/2021 02:08:39 - INFO - codeparrot_training - Step 31292: {'lr': 0.0001546733044024608, 'samples': 16022016, 'steps': 31292, 'batch_loss/train': 0.7036696026334539} 12/27/2021 02:08:49 - INFO - codeparrot_training - Step 31293: {'lr': 0.000154658732385235, 'samples': 16022528, 'steps': 31293, 'batch_loss/train': 0.674368146341294} 12/27/2021 02:09:00 - INFO - codeparrot_training - Step 31294: {'lr': 0.00015464416074705017, 'samples': 16023040, 'steps': 31294, 'batch_loss/train': 0.6091129976557568} 12/27/2021 02:09:14 - INFO - codeparrot_training - Step 31295: {'lr': 
0.00015462958948796407, 'samples': 16023552, 'steps': 31295, 'batch_loss/train': 0.742903359234333} 12/27/2021 02:09:24 - INFO - codeparrot_training - Step 31296: {'lr': 0.0001546150186080347, 'samples': 16024064, 'steps': 31296, 'batch_loss/train': 0.8050195770338178} 12/27/2021 02:09:35 - INFO - codeparrot_training - Step 31297: {'lr': 0.0001546004481073201, 'samples': 16024576, 'steps': 31297, 'batch_loss/train': 0.728780671954155} 12/27/2021 02:09:47 - INFO - codeparrot_training - Step 31298: {'lr': 0.00015458587798587804, 'samples': 16025088, 'steps': 31298, 'batch_loss/train': 0.7994769895449281} 12/27/2021 02:09:58 - INFO - codeparrot_training - Step 31299: {'lr': 0.00015457130824376652, 'samples': 16025600, 'steps': 31299, 'batch_loss/train': 0.7149194912053645} 12/27/2021 02:10:08 - INFO - codeparrot_training - Step 31300: {'lr': 0.00015455673888104337, 'samples': 16026112, 'steps': 31300, 'batch_loss/train': 0.7383509390056133} 12/27/2021 02:10:20 - INFO - codeparrot_training - Step 31301: {'lr': 0.00015454216989776664, 'samples': 16026624, 'steps': 31301, 'batch_loss/train': 0.7110766796395183} 12/27/2021 02:10:31 - INFO - codeparrot_training - Step 31302: {'lr': 0.00015452760129399428, 'samples': 16027136, 'steps': 31302, 'batch_loss/train': 0.7304291408509016} 12/27/2021 02:10:41 - INFO - codeparrot_training - Step 31303: {'lr': 0.00015451303306978408, 'samples': 16027648, 'steps': 31303, 'batch_loss/train': 0.7224130425602198} 12/27/2021 02:10:52 - INFO - codeparrot_training - Step 31304: {'lr': 0.00015449846522519396, 'samples': 16028160, 'steps': 31304, 'batch_loss/train': 0.7913840226829052} 12/27/2021 02:11:06 - INFO - codeparrot_training - Step 31305: {'lr': 0.00015448389776028188, 'samples': 16028672, 'steps': 31305, 'batch_loss/train': 0.7463649660348892} 12/27/2021 02:11:16 - INFO - codeparrot_training - Step 31306: {'lr': 0.00015446933067510583, 'samples': 16029184, 'steps': 31306, 'batch_loss/train': 0.6852632639929652} 12/27/2021 02:11:27 - 
INFO - codeparrot_training - Step 31307: {'lr': 0.00015445476396972352, 'samples': 16029696, 'steps': 31307, 'batch_loss/train': 0.7481912723742425} 12/27/2021 02:11:39 - INFO - codeparrot_training - Step 31308: {'lr': 0.0001544401976441931, 'samples': 16030208, 'steps': 31308, 'batch_loss/train': 0.661697531118989} 12/27/2021 02:11:50 - INFO - codeparrot_training - Step 31309: {'lr': 0.00015442563169857238, 'samples': 16030720, 'steps': 31309, 'batch_loss/train': 0.6916296659037471} 12/27/2021 02:12:00 - INFO - codeparrot_training - Step 31310: {'lr': 0.00015441106613291923, 'samples': 16031232, 'steps': 31310, 'batch_loss/train': 0.6466857155319303} 12/27/2021 02:12:13 - INFO - codeparrot_training - Step 31311: {'lr': 0.0001543965009472915, 'samples': 16031744, 'steps': 31311, 'batch_loss/train': 0.6556397834792733} 12/27/2021 02:12:24 - INFO - codeparrot_training - Step 31312: {'lr': 0.0001543819361417473, 'samples': 16032256, 'steps': 31312, 'batch_loss/train': 0.6169125941814855} 12/27/2021 02:12:34 - INFO - codeparrot_training - Step 31313: {'lr': 0.00015436737171634443, 'samples': 16032768, 'steps': 31313, 'batch_loss/train': 0.9026902765035629} 12/27/2021 02:12:47 - INFO - codeparrot_training - Step 31314: {'lr': 0.00015435280767114073, 'samples': 16033280, 'steps': 31314, 'batch_loss/train': 0.7764902161434293} 12/27/2021 02:12:57 - INFO - codeparrot_training - Step 31315: {'lr': 0.00015433824400619411, 'samples': 16033792, 'steps': 31315, 'batch_loss/train': 0.7373222464229912} 12/27/2021 02:13:08 - INFO - codeparrot_training - Step 31316: {'lr': 0.0001543236807215626, 'samples': 16034304, 'steps': 31316, 'batch_loss/train': 0.7882818104699254} 12/27/2021 02:13:20 - INFO - codeparrot_training - Step 31317: {'lr': 0.00015430911781730394, 'samples': 16034816, 'steps': 31317, 'batch_loss/train': 0.7285517184063792} 12/27/2021 02:13:30 - INFO - codeparrot_training - Step 31318: {'lr': 0.00015429455529347607, 'samples': 16035328, 'steps': 31318, 
'batch_loss/train': 0.722235044464469} 12/27/2021 02:13:41 - INFO - codeparrot_training - Step 31319: {'lr': 0.00015427999315013703, 'samples': 16035840, 'steps': 31319, 'batch_loss/train': 0.8063041428104043} 12/27/2021 02:13:52 - INFO - codeparrot_training - Step 31320: {'lr': 0.0001542654313873445, 'samples': 16036352, 'steps': 31320, 'batch_loss/train': 0.6599353859201074} 12/27/2021 02:14:04 - INFO - codeparrot_training - Step 31321: {'lr': 0.00015425087000515655, 'samples': 16036864, 'steps': 31321, 'batch_loss/train': 0.6970173018053174} 12/27/2021 02:14:15 - INFO - codeparrot_training - Step 31322: {'lr': 0.0001542363090036309, 'samples': 16037376, 'steps': 31322, 'batch_loss/train': 0.8440124550834298} 12/27/2021 02:14:25 - INFO - codeparrot_training - Step 31323: {'lr': 0.0001542217483828256, 'samples': 16037888, 'steps': 31323, 'batch_loss/train': 0.8674872359260917} 12/27/2021 02:14:38 - INFO - codeparrot_training - Step 31324: {'lr': 0.00015420718814279842, 'samples': 16038400, 'steps': 31324, 'batch_loss/train': 0.5953059657767881} 12/27/2021 02:14:49 - INFO - codeparrot_training - Step 31325: {'lr': 0.00015419262828360734, 'samples': 16038912, 'steps': 31325, 'batch_loss/train': 0.9329207418486476} 12/27/2021 02:15:00 - INFO - codeparrot_training - Step 31326: {'lr': 0.0001541780688053101, 'samples': 16039424, 'steps': 31326, 'batch_loss/train': 0.7426380440592766} 12/27/2021 02:15:12 - INFO - codeparrot_training - Step 31327: {'lr': 0.0001541635097079647, 'samples': 16039936, 'steps': 31327, 'batch_loss/train': 0.721134283579886} 12/27/2021 02:15:22 - INFO - codeparrot_training - Step 31328: {'lr': 0.00015414895099162906, 'samples': 16040448, 'steps': 31328, 'batch_loss/train': 0.7194748157635331} 12/27/2021 02:15:33 - INFO - codeparrot_training - Step 31329: {'lr': 0.00015413439265636104, 'samples': 16040960, 'steps': 31329, 'batch_loss/train': 0.7242075130343437} 12/27/2021 02:15:43 - INFO - codeparrot_training - Step 31330: {'lr': 
0.00015411983470221832, 'samples': 16041472, 'steps': 31330, 'batch_loss/train': 0.6902520656585693} 12/27/2021 02:15:56 - INFO - codeparrot_training - Step 31331: {'lr': 0.00015410527712925902, 'samples': 16041984, 'steps': 31331, 'batch_loss/train': 0.7494861204177141} 12/27/2021 02:16:06 - INFO - codeparrot_training - Step 31332: {'lr': 0.000154090719937541, 'samples': 16042496, 'steps': 31332, 'batch_loss/train': 0.8218966205604374} 12/27/2021 02:16:17 - INFO - codeparrot_training - Step 31333: {'lr': 0.00015407616312712192, 'samples': 16043008, 'steps': 31333, 'batch_loss/train': 0.7168535124510527} 12/27/2021 02:16:30 - INFO - codeparrot_training - Step 31334: {'lr': 0.00015406160669805986, 'samples': 16043520, 'steps': 31334, 'batch_loss/train': 0.5433488389244303} 12/27/2021 02:16:40 - INFO - codeparrot_training - Step 31335: {'lr': 0.00015404705065041264, 'samples': 16044032, 'steps': 31335, 'batch_loss/train': 0.6933231581933796} 12/27/2021 02:16:51 - INFO - codeparrot_training - Step 31336: {'lr': 0.0001540324949842381, 'samples': 16044544, 'steps': 31336, 'batch_loss/train': 0.8995597707107663} 12/27/2021 02:17:03 - INFO - codeparrot_training - Step 31337: {'lr': 0.0001540179396995941, 'samples': 16045056, 'steps': 31337, 'batch_loss/train': 0.659222003800096} 12/27/2021 02:17:14 - INFO - codeparrot_training - Step 31338: {'lr': 0.00015400338479653857, 'samples': 16045568, 'steps': 31338, 'batch_loss/train': 0.8208300874102861} 12/27/2021 02:17:24 - INFO - codeparrot_training - Step 31339: {'lr': 0.0001539888302751294, 'samples': 16046080, 'steps': 31339, 'batch_loss/train': 0.7539726081886329} 12/27/2021 02:17:37 - INFO - codeparrot_training - Step 31340: {'lr': 0.00015397427613542431, 'samples': 16046592, 'steps': 31340, 'batch_loss/train': 0.7268832204863429} 12/27/2021 02:17:48 - INFO - codeparrot_training - Step 31341: {'lr': 0.0001539597223774812, 'samples': 16047104, 'steps': 31341, 'batch_loss/train': 0.6770823698025197} 12/27/2021 02:17:58 - 
INFO - codeparrot_training - Step 31342: {'lr': 0.00015394516900135805, 'samples': 16047616, 'steps': 31342, 'batch_loss/train': 0.8147261301055551} 12/27/2021 02:18:09 - INFO - codeparrot_training - Step 31343: {'lr': 0.0001539306160071126, 'samples': 16048128, 'steps': 31343, 'batch_loss/train': 0.6636506328359246} 12/27/2021 02:18:21 - INFO - codeparrot_training - Step 31344: {'lr': 0.00015391606339480268, 'samples': 16048640, 'steps': 31344, 'batch_loss/train': 0.6456280023267027} 12/27/2021 02:18:32 - INFO - codeparrot_training - Step 31345: {'lr': 0.00015390151116448635, 'samples': 16049152, 'steps': 31345, 'batch_loss/train': 0.6903989776037633} 12/27/2021 02:18:42 - INFO - codeparrot_training - Step 31346: {'lr': 0.00015388695931622127, 'samples': 16049664, 'steps': 31346, 'batch_loss/train': 0.6778666623868048} 12/27/2021 02:18:54 - INFO - codeparrot_training - Step 31347: {'lr': 0.00015387240785006537, 'samples': 16050176, 'steps': 31347, 'batch_loss/train': 0.7661455273628235} 12/27/2021 02:19:05 - INFO - codeparrot_training - Step 31348: {'lr': 0.00015385785676607638, 'samples': 16050688, 'steps': 31348, 'batch_loss/train': 0.7072465685196221} 12/27/2021 02:19:16 - INFO - codeparrot_training - Step 31349: {'lr': 0.00015384330606431237, 'samples': 16051200, 'steps': 31349, 'batch_loss/train': 0.7317456197924912} 12/27/2021 02:19:28 - INFO - codeparrot_training - Step 31350: {'lr': 0.000153828755744831, 'samples': 16051712, 'steps': 31350, 'batch_loss/train': 0.6996677408460528} 12/27/2021 02:19:39 - INFO - codeparrot_training - Step 31351: {'lr': 0.00015381420580769025, 'samples': 16052224, 'steps': 31351, 'batch_loss/train': 0.6903347400948405} 12/27/2021 02:19:50 - INFO - codeparrot_training - Step 31352: {'lr': 0.00015379965625294777, 'samples': 16052736, 'steps': 31352, 'batch_loss/train': 0.8197744432836771} 12/27/2021 02:20:02 - INFO - codeparrot_training - Step 31353: {'lr': 0.00015378510708066162, 'samples': 16053248, 'steps': 31353, 
'batch_loss/train': 0.557360382808838} 12/27/2021 02:20:12 - INFO - codeparrot_training - Step 31354: {'lr': 0.0001537705582908896, 'samples': 16053760, 'steps': 31354, 'batch_loss/train': 0.6941123176948167} 12/27/2021 02:20:23 - INFO - codeparrot_training - Step 31355: {'lr': 0.0001537560098836894, 'samples': 16054272, 'steps': 31355, 'batch_loss/train': 0.6945116906426847} 12/27/2021 02:20:34 - INFO - codeparrot_training - Step 31356: {'lr': 0.00015374146185911903, 'samples': 16054784, 'steps': 31356, 'batch_loss/train': 0.7458361135795712} 12/27/2021 02:20:46 - INFO - codeparrot_training - Step 31357: {'lr': 0.00015372691421723628, 'samples': 16055296, 'steps': 31357, 'batch_loss/train': 0.7057629227638245} 12/27/2021 02:20:56 - INFO - codeparrot_training - Step 31358: {'lr': 0.000153712366958099, 'samples': 16055808, 'steps': 31358, 'batch_loss/train': 0.6619936747010797} 12/27/2021 02:21:07 - INFO - codeparrot_training - Step 31359: {'lr': 0.00015369782008176487, 'samples': 16056320, 'steps': 31359, 'batch_loss/train': 0.843850277364254} 12/27/2021 02:21:20 - INFO - codeparrot_training - Step 31360: {'lr': 0.00015368327358829186, 'samples': 16056832, 'steps': 31360, 'batch_loss/train': 0.7112798197194934} 12/27/2021 02:21:30 - INFO - codeparrot_training - Step 31361: {'lr': 0.00015366872747773791, 'samples': 16057344, 'steps': 31361, 'batch_loss/train': 0.7124923411756754} 12/27/2021 02:21:41 - INFO - codeparrot_training - Step 31362: {'lr': 0.00015365418175016063, 'samples': 16057856, 'steps': 31362, 'batch_loss/train': 0.7313026748597622} 12/27/2021 02:21:53 - INFO - codeparrot_training - Step 31363: {'lr': 0.0001536396364056179, 'samples': 16058368, 'steps': 31363, 'batch_loss/train': 0.7527172064874321} 12/27/2021 02:22:04 - INFO - codeparrot_training - Step 31364: {'lr': 0.0001536250914441677, 'samples': 16058880, 'steps': 31364, 'batch_loss/train': 0.6772037725895643} 12/27/2021 02:22:15 - INFO - codeparrot_training - Step 31365: {'lr': 
0.0001536105468658677, 'samples': 16059392, 'steps': 31365, 'batch_loss/train': 0.6888483893126249} 12/27/2021 02:22:27 - INFO - codeparrot_training - Step 31366: {'lr': 0.00015359600267077583, 'samples': 16059904, 'steps': 31366, 'batch_loss/train': 0.7323987488634884} 12/27/2021 02:22:38 - INFO - codeparrot_training - Step 31367: {'lr': 0.00015358145885894972, 'samples': 16060416, 'steps': 31367, 'batch_loss/train': 0.6252516122185625} 12/27/2021 02:22:48 - INFO - codeparrot_training - Step 31368: {'lr': 0.00015356691543044743, 'samples': 16060928, 'steps': 31368, 'batch_loss/train': 0.6793262840947136} 12/27/2021 02:22:59 - INFO - codeparrot_training - Step 31369: {'lr': 0.00015355237238532665, 'samples': 16061440, 'steps': 31369, 'batch_loss/train': 0.7993590449914336} 12/27/2021 02:23:11 - INFO - codeparrot_training - Step 31370: {'lr': 0.00015353782972364515, 'samples': 16061952, 'steps': 31370, 'batch_loss/train': 0.753980353474617} 12/27/2021 02:23:22 - INFO - codeparrot_training - Step 31371: {'lr': 0.0001535232874454609, 'samples': 16062464, 'steps': 31371, 'batch_loss/train': 0.6860061786137521} 12/27/2021 02:23:32 - INFO - codeparrot_training - Step 31372: {'lr': 0.0001535087455508316, 'samples': 16062976, 'steps': 31372, 'batch_loss/train': 0.7205528421327472} 12/27/2021 02:23:44 - INFO - codeparrot_training - Step 31373: {'lr': 0.00015349420403981519, 'samples': 16063488, 'steps': 31373, 'batch_loss/train': 0.7876288061961532} 12/27/2021 02:23:55 - INFO - codeparrot_training - Step 31374: {'lr': 0.00015347966291246923, 'samples': 16064000, 'steps': 31374, 'batch_loss/train': 0.7264855918474495} 12/27/2021 02:24:06 - INFO - codeparrot_training - Step 31375: {'lr': 0.00015346512216885179, 'samples': 16064512, 'steps': 31375, 'batch_loss/train': 0.7497520996257663} 12/27/2021 02:24:18 - INFO - codeparrot_training - Step 31376: {'lr': 0.00015345058180902052, 'samples': 16065024, 'steps': 31376, 'batch_loss/train': 0.7570459204725921} 12/27/2021 02:24:29 - 
INFO - codeparrot_training - Step 31377: {'lr': 0.00015343604183303333, 'samples': 16065536, 'steps': 31377, 'batch_loss/train': 0.506314558268059} 12/27/2021 02:24:40 - INFO - codeparrot_training - Step 31378: {'lr': 0.0001534215022409479, 'samples': 16066048, 'steps': 31378, 'batch_loss/train': 0.6373367641645018} 12/27/2021 02:24:52 - INFO - codeparrot_training - Step 31379: {'lr': 0.00015340696303282214, 'samples': 16066560, 'steps': 31379, 'batch_loss/train': 0.6185096751432866} 12/27/2021 02:25:03 - INFO - codeparrot_training - Step 31380: {'lr': 0.0001533924242087139, 'samples': 16067072, 'steps': 31380, 'batch_loss/train': 0.6869913963600993} 12/27/2021 02:25:13 - INFO - codeparrot_training - Step 31381: {'lr': 0.0001533778857686808, 'samples': 16067584, 'steps': 31381, 'batch_loss/train': 0.7948316056281328} 12/27/2021 02:25:24 - INFO - codeparrot_training - Step 31382: {'lr': 0.00015336334771278077, 'samples': 16068096, 'steps': 31382, 'batch_loss/train': 0.730217142845504} 12/27/2021 02:25:36 - INFO - codeparrot_training - Step 31383: {'lr': 0.00015334881004107164, 'samples': 16068608, 'steps': 31383, 'batch_loss/train': 0.8126309998333454} 12/27/2021 02:25:47 - INFO - codeparrot_training - Step 31384: {'lr': 0.00015333427275361113, 'samples': 16069120, 'steps': 31384, 'batch_loss/train': 0.7286208756268024} 12/27/2021 02:25:57 - INFO - codeparrot_training - Step 31385: {'lr': 0.00015331973585045694, 'samples': 16069632, 'steps': 31385, 'batch_loss/train': 0.8926768624223769} 12/27/2021 02:26:09 - INFO - codeparrot_training - Step 31386: {'lr': 0.0001533051993316671, 'samples': 16070144, 'steps': 31386, 'batch_loss/train': 0.7544528073631227} 12/27/2021 02:26:20 - INFO - codeparrot_training - Step 31387: {'lr': 0.00015329066319729926, 'samples': 16070656, 'steps': 31387, 'batch_loss/train': 0.8013615668751299} 12/27/2021 02:26:31 - INFO - codeparrot_training - Step 31388: {'lr': 0.0001532761274474112, 'samples': 16071168, 'steps': 31388, 
'batch_loss/train': 0.7397567750886083} 12/27/2021 02:26:44 - INFO - codeparrot_training - Step 31389: {'lr': 0.00015326159208206068, 'samples': 16071680, 'steps': 31389, 'batch_loss/train': 0.7619536137208343} 12/27/2021 02:26:54 - INFO - codeparrot_training - Step 31390: {'lr': 0.00015324705710130566, 'samples': 16072192, 'steps': 31390, 'batch_loss/train': 0.6652904450893402} 12/27/2021 02:27:05 - INFO - codeparrot_training - Step 31391: {'lr': 0.00015323252250520374, 'samples': 16072704, 'steps': 31391, 'batch_loss/train': 1.5972239151597023} 12/27/2021 02:27:17 - INFO - codeparrot_training - Step 31392: {'lr': 0.00015321798829381273, 'samples': 16073216, 'steps': 31392, 'batch_loss/train': 0.7530344566330314} 12/27/2021 02:27:28 - INFO - codeparrot_training - Step 31393: {'lr': 0.00015320345446719053, 'samples': 16073728, 'steps': 31393, 'batch_loss/train': 0.6935775491874665} 12/27/2021 02:27:38 - INFO - codeparrot_training - Step 31394: {'lr': 0.00015318892102539488, 'samples': 16074240, 'steps': 31394, 'batch_loss/train': 0.7033118134131655} 12/27/2021 02:27:49 - INFO - codeparrot_training - Step 31395: {'lr': 0.00015317438796848346, 'samples': 16074752, 'steps': 31395, 'batch_loss/train': 0.7913850070908666} 12/27/2021 02:28:02 - INFO - codeparrot_training - Step 31396: {'lr': 0.00015315985529651405, 'samples': 16075264, 'steps': 31396, 'batch_loss/train': 0.6549664112972096} 12/27/2021 02:28:12 - INFO - codeparrot_training - Step 31397: {'lr': 0.0001531453230095446, 'samples': 16075776, 'steps': 31397, 'batch_loss/train': 0.6659903349354863} 12/27/2021 02:28:23 - INFO - codeparrot_training - Step 31398: {'lr': 0.00015313079110763274, 'samples': 16076288, 'steps': 31398, 'batch_loss/train': 0.7683227695524693} 12/27/2021 02:28:35 - INFO - codeparrot_training - Step 31399: {'lr': 0.00015311625959083634, 'samples': 16076800, 'steps': 31399, 'batch_loss/train': 0.7181908823549747} 12/27/2021 02:28:46 - INFO - codeparrot_training - Step 31400: {'lr': 
0.000153101728459213, 'samples': 16077312, 'steps': 31400, 'batch_loss/train': 0.5962661541998386} 12/27/2021 02:28:56 - INFO - codeparrot_training - Step 31401: {'lr': 0.0001530871977128206, 'samples': 16077824, 'steps': 31401, 'batch_loss/train': 0.5788312088989187} 12/27/2021 02:29:09 - INFO - codeparrot_training - Step 31402: {'lr': 0.00015307266735171702, 'samples': 16078336, 'steps': 31402, 'batch_loss/train': 0.740003545768559} 12/27/2021 02:29:19 - INFO - codeparrot_training - Step 31403: {'lr': 0.0001530581373759599, 'samples': 16078848, 'steps': 31403, 'batch_loss/train': 0.6374310236424208} 12/27/2021 02:29:30 - INFO - codeparrot_training - Step 31404: {'lr': 0.00015304360778560692, 'samples': 16079360, 'steps': 31404, 'batch_loss/train': 0.6790181607939303} 12/27/2021 02:29:40 - INFO - codeparrot_training - Step 31405: {'lr': 0.00015302907858071596, 'samples': 16079872, 'steps': 31405, 'batch_loss/train': 0.657290754839778} 12/27/2021 02:29:53 - INFO - codeparrot_training - Step 31406: {'lr': 0.00015301454976134488, 'samples': 16080384, 'steps': 31406, 'batch_loss/train': 0.7152929436415434} 12/27/2021 02:30:04 - INFO - codeparrot_training - Step 31407: {'lr': 0.0001530000213275512, 'samples': 16080896, 'steps': 31407, 'batch_loss/train': 0.7268308158963919} 12/27/2021 02:30:15 - INFO - codeparrot_training - Step 31408: {'lr': 0.00015298549327939288, 'samples': 16081408, 'steps': 31408, 'batch_loss/train': 0.7238501030951738} 12/27/2021 02:30:27 - INFO - codeparrot_training - Step 31409: {'lr': 0.0001529709656169276, 'samples': 16081920, 'steps': 31409, 'batch_loss/train': 0.5563080223510042} 12/27/2021 02:30:38 - INFO - codeparrot_training - Step 31410: {'lr': 0.00015295643834021316, 'samples': 16082432, 'steps': 31410, 'batch_loss/train': 0.715206197462976} 12/27/2021 02:30:48 - INFO - codeparrot_training - Step 31411: {'lr': 0.0001529419114493072, 'samples': 16082944, 'steps': 31411, 'batch_loss/train': 0.7437752243131399} 12/27/2021 02:31:01 - INFO 
- codeparrot_training - Step 31412: {'lr': 0.0001529273849442676, 'samples': 16083456, 'steps': 31412, 'batch_loss/train': 0.699783431366086} 12/27/2021 02:31:11 - INFO - codeparrot_training - Step 31413: {'lr': 0.0001529128588251521, 'samples': 16083968, 'steps': 31413, 'batch_loss/train': 0.7374707804992795} 12/27/2021 02:31:22 - INFO - codeparrot_training - Step 31414: {'lr': 0.00015289833309201835, 'samples': 16084480, 'steps': 31414, 'batch_loss/train': 0.6016094719816465} 12/27/2021 02:31:34 - INFO - codeparrot_training - Step 31415: {'lr': 0.0001528838077449241, 'samples': 16084992, 'steps': 31415, 'batch_loss/train': 0.6190495310293045} 12/27/2021 02:31:45 - INFO - codeparrot_training - Step 31416: {'lr': 0.0001528692827839273, 'samples': 16085504, 'steps': 31416, 'batch_loss/train': 0.7923814784735441} 12/27/2021 02:31:56 - INFO - codeparrot_training - Step 31417: {'lr': 0.0001528547582090855, 'samples': 16086016, 'steps': 31417, 'batch_loss/train': 0.652852586004883} 12/27/2021 02:32:06 - INFO - codeparrot_training - Step 31418: {'lr': 0.00015284023402045645, 'samples': 16086528, 'steps': 31418, 'batch_loss/train': 0.6484762988984585} 12/27/2021 02:32:19 - INFO - codeparrot_training - Step 31419: {'lr': 0.00015282571021809805, 'samples': 16087040, 'steps': 31419, 'batch_loss/train': 0.7060039136558771} 12/27/2021 02:32:30 - INFO - codeparrot_training - Step 31420: {'lr': 0.0001528111868020679, 'samples': 16087552, 'steps': 31420, 'batch_loss/train': 0.6574824941344559} 12/27/2021 02:32:40 - INFO - codeparrot_training - Step 31421: {'lr': 0.00015279666377242374, 'samples': 16088064, 'steps': 31421, 'batch_loss/train': 0.6841356151271611} 12/27/2021 02:32:52 - INFO - codeparrot_training - Step 31422: {'lr': 0.0001527821411292233, 'samples': 16088576, 'steps': 31422, 'batch_loss/train': 0.667212750588078} 12/27/2021 02:33:03 - INFO - codeparrot_training - Step 31423: {'lr': 0.00015276761887252446, 'samples': 16089088, 'steps': 31423, 'batch_loss/train': 
0.7037957014981657} 12/27/2021 02:33:14 - INFO - codeparrot_training - Step 31424: {'lr': 0.00015275309700238482, 'samples': 16089600, 'steps': 31424, 'batch_loss/train': 0.728162637911737} 12/27/2021 02:33:27 - INFO - codeparrot_training - Step 31425: {'lr': 0.00015273857551886217, 'samples': 16090112, 'steps': 31425, 'batch_loss/train': 0.7412622040137649} 12/27/2021 02:33:37 - INFO - codeparrot_training - Step 31426: {'lr': 0.00015272405442201415, 'samples': 16090624, 'steps': 31426, 'batch_loss/train': 0.6198385075258557} 12/27/2021 02:33:48 - INFO - codeparrot_training - Step 31427: {'lr': 0.00015270953371189855, 'samples': 16091136, 'steps': 31427, 'batch_loss/train': 0.7468491579056717} 12/27/2021 02:33:59 - INFO - codeparrot_training - Step 31428: {'lr': 0.0001526950133885732, 'samples': 16091648, 'steps': 31428, 'batch_loss/train': 0.7116808767896146} 12/27/2021 02:34:11 - INFO - codeparrot_training - Step 31429: {'lr': 0.00015268049345209565, 'samples': 16092160, 'steps': 31429, 'batch_loss/train': 0.8158577280119061} 12/27/2021 02:34:21 - INFO - codeparrot_training - Step 31430: {'lr': 0.0001526659739025238, 'samples': 16092672, 'steps': 31430, 'batch_loss/train': 0.6291690769139677} 12/27/2021 02:34:32 - INFO - codeparrot_training - Step 31431: {'lr': 0.0001526514547399152, 'samples': 16093184, 'steps': 31431, 'batch_loss/train': 0.6905692927539349} 12/27/2021 02:34:44 - INFO - codeparrot_training - Step 31432: {'lr': 0.00015263693596432776, 'samples': 16093696, 'steps': 31432, 'batch_loss/train': 0.624697970226407} 12/27/2021 02:34:55 - INFO - codeparrot_training - Step 31433: {'lr': 0.00015262241757581896, 'samples': 16094208, 'steps': 31433, 'batch_loss/train': 0.47775574878323823} 12/27/2021 02:35:05 - INFO - codeparrot_training - Step 31434: {'lr': 0.00015260789957444675, 'samples': 16094720, 'steps': 31434, 'batch_loss/train': 0.8996003028005362} 12/27/2021 02:35:19 - INFO - codeparrot_training - Step 31435: {'lr': 0.00015259338196026877, 
'samples': 16095232, 'steps': 31435, 'batch_loss/train': 0.7121553122997284} 12/27/2021 02:35:29 - INFO - codeparrot_training - Step 31436: {'lr': 0.0001525788647333427, 'samples': 16095744, 'steps': 31436, 'batch_loss/train': 0.6188963684253395} 12/27/2021 02:35:40 - INFO - codeparrot_training - Step 31437: {'lr': 0.0001525643478937262, 'samples': 16096256, 'steps': 31437, 'batch_loss/train': 0.7426007548347116} 12/27/2021 02:35:52 - INFO - codeparrot_training - Step 31438: {'lr': 0.0001525498314414772, 'samples': 16096768, 'steps': 31438, 'batch_loss/train': 0.7272120942361653} 12/27/2021 02:36:03 - INFO - codeparrot_training - Step 31439: {'lr': 0.00015253531537665323, 'samples': 16097280, 'steps': 31439, 'batch_loss/train': 0.9965296518057585} 12/27/2021 02:36:14 - INFO - codeparrot_training - Step 31440: {'lr': 0.00015252079969931204, 'samples': 16097792, 'steps': 31440, 'batch_loss/train': 0.5572548629716039} 12/27/2021 02:36:24 - INFO - codeparrot_training - Step 31441: {'lr': 0.00015250628440951125, 'samples': 16098304, 'steps': 31441, 'batch_loss/train': 0.6187423735391349} 12/27/2021 02:36:36 - INFO - codeparrot_training - Step 31442: {'lr': 0.0001524917695073088, 'samples': 16098816, 'steps': 31442, 'batch_loss/train': 0.6476672918652184} 12/27/2021 02:36:47 - INFO - codeparrot_training - Step 31443: {'lr': 0.0001524772549927622, 'samples': 16099328, 'steps': 31443, 'batch_loss/train': 0.7253275644034147} 12/27/2021 02:36:58 - INFO - codeparrot_training - Step 31444: {'lr': 0.00015246274086592914, 'samples': 16099840, 'steps': 31444, 'batch_loss/train': 0.752673065289855} 12/27/2021 02:37:11 - INFO - codeparrot_training - Step 31445: {'lr': 0.0001524482271268675, 'samples': 16100352, 'steps': 31445, 'batch_loss/train': 0.6029241657233797} 12/27/2021 02:37:21 - INFO - codeparrot_training - Step 31446: {'lr': 0.00015243371377563487, 'samples': 16100864, 'steps': 31446, 'batch_loss/train': 0.6811741283163428} 12/27/2021 02:37:32 - INFO - codeparrot_training 
- Step 31447: {'lr': 0.00015241920081228894, 'samples': 16101376, 'steps': 31447, 'batch_loss/train': 0.6016291776613798} 12/27/2021 02:37:44 - INFO - codeparrot_training - Step 31448: {'lr': 0.00015240468823688735, 'samples': 16101888, 'steps': 31448, 'batch_loss/train': 0.7493951926007867} 12/27/2021 02:37:55 - INFO - codeparrot_training - Step 31449: {'lr': 0.00015239017604948797, 'samples': 16102400, 'steps': 31449, 'batch_loss/train': 0.8252062508836389} 12/27/2021 02:38:05 - INFO - codeparrot_training - Step 31450: {'lr': 0.00015237566425014832, 'samples': 16102912, 'steps': 31450, 'batch_loss/train': 0.7388798296451569} 12/27/2021 02:38:16 - INFO - codeparrot_training - Step 31451: {'lr': 0.00015236115283892628, 'samples': 16103424, 'steps': 31451, 'batch_loss/train': 0.7227443149313331} 12/27/2021 02:38:29 - INFO - codeparrot_training - Step 31452: {'lr': 0.00015234664181587926, 'samples': 16103936, 'steps': 31452, 'batch_loss/train': 0.7169483460020274} 12/27/2021 02:38:39 - INFO - codeparrot_training - Step 31453: {'lr': 0.0001523321311810652, 'samples': 16104448, 'steps': 31453, 'batch_loss/train': 0.7166472496464849} 12/27/2021 02:38:50 - INFO - codeparrot_training - Step 31454: {'lr': 0.00015231762093454178, 'samples': 16104960, 'steps': 31454, 'batch_loss/train': 0.5630936808884144} 12/27/2021 02:39:02 - INFO - codeparrot_training - Step 31455: {'lr': 0.00015230311107636646, 'samples': 16105472, 'steps': 31455, 'batch_loss/train': 0.6408872394822538} 12/27/2021 02:39:13 - INFO - codeparrot_training - Step 31456: {'lr': 0.0001522886016065972, 'samples': 16105984, 'steps': 31456, 'batch_loss/train': 0.7261606520041823} 12/27/2021 02:39:23 - INFO - codeparrot_training - Step 31457: {'lr': 0.00015227409252529156, 'samples': 16106496, 'steps': 31457, 'batch_loss/train': 0.7113102530129254} 12/27/2021 02:39:35 - INFO - codeparrot_training - Step 31458: {'lr': 0.00015225958383250724, 'samples': 16107008, 'steps': 31458, 'batch_loss/train': 
0.5090793297276832} 12/27/2021 02:39:46 - INFO - codeparrot_training - Step 31459: {'lr': 0.0001522450755283018, 'samples': 16107520, 'steps': 31459, 'batch_loss/train': 0.7473622761899605} 12/27/2021 02:39:57 - INFO - codeparrot_training - Step 31460: {'lr': 0.0001522305676127331, 'samples': 16108032, 'steps': 31460, 'batch_loss/train': 0.6404742567101493} 12/27/2021 02:40:09 - INFO - codeparrot_training - Step 31461: {'lr': 0.0001522160600858588, 'samples': 16108544, 'steps': 31461, 'batch_loss/train': 0.7027882996480912} 12/27/2021 02:40:19 - INFO - codeparrot_training - Step 31462: {'lr': 0.00015220155294773646, 'samples': 16109056, 'steps': 31462, 'batch_loss/train': 0.6904643606394529} 12/27/2021 02:40:30 - INFO - codeparrot_training - Step 31463: {'lr': 0.00015218704619842377, 'samples': 16109568, 'steps': 31463, 'batch_loss/train': 0.7433326467871666} 12/27/2021 02:40:41 - INFO - codeparrot_training - Step 31464: {'lr': 0.00015217253983797858, 'samples': 16110080, 'steps': 31464, 'batch_loss/train': 0.7341283932328224} 12/27/2021 02:40:54 - INFO - codeparrot_training - Step 31465: {'lr': 0.0001521580338664584, 'samples': 16110592, 'steps': 31465, 'batch_loss/train': 0.6866295222425833} 12/27/2021 02:41:04 - INFO - codeparrot_training - Step 31466: {'lr': 0.0001521435282839208, 'samples': 16111104, 'steps': 31466, 'batch_loss/train': 0.7785090822726488} 12/27/2021 02:41:15 - INFO - codeparrot_training - Step 31467: {'lr': 0.0001521290230904237, 'samples': 16111616, 'steps': 31467, 'batch_loss/train': 0.6328826197423041} 12/27/2021 02:41:27 - INFO - codeparrot_training - Step 31468: {'lr': 0.00015211451828602468, 'samples': 16112128, 'steps': 31468, 'batch_loss/train': 0.6729566007852554} 12/27/2021 02:41:38 - INFO - codeparrot_training - Step 31469: {'lr': 0.00015210001387078132, 'samples': 16112640, 'steps': 31469, 'batch_loss/train': 0.7594414316117764} 12/27/2021 02:41:48 - INFO - codeparrot_training - Step 31470: {'lr': 0.0001520855098447513, 'samples': 
16113152, 'steps': 31470, 'batch_loss/train': 0.7550814980641007} 12/27/2021 02:42:01 - INFO - codeparrot_training - Step 31471: {'lr': 0.0001520710062079924, 'samples': 16113664, 'steps': 31471, 'batch_loss/train': 0.7066844720393419} 12/27/2021 02:42:11 - INFO - codeparrot_training - Step 31472: {'lr': 0.00015205650296056216, 'samples': 16114176, 'steps': 31472, 'batch_loss/train': 0.6995956678874791} 12/27/2021 02:42:22 - INFO - codeparrot_training - Step 31473: {'lr': 0.00015204200010251833, 'samples': 16114688, 'steps': 31473, 'batch_loss/train': 0.7848382368683815} 12/27/2021 02:42:34 - INFO - codeparrot_training - Step 31474: {'lr': 0.00015202749763391842, 'samples': 16115200, 'steps': 31474, 'batch_loss/train': 0.6401720058638602} 12/27/2021 02:42:45 - INFO - codeparrot_training - Step 31475: {'lr': 0.00015201299555482028, 'samples': 16115712, 'steps': 31475, 'batch_loss/train': 0.6189368153281976} 12/27/2021 02:42:56 - INFO - codeparrot_training - Step 31476: {'lr': 0.00015199849386528144, 'samples': 16116224, 'steps': 31476, 'batch_loss/train': 0.5850978442467749} 12/27/2021 02:43:06 - INFO - codeparrot_training - Step 31477: {'lr': 0.00015198399256535964, 'samples': 16116736, 'steps': 31477, 'batch_loss/train': 0.768613045103848} 12/27/2021 02:43:19 - INFO - codeparrot_training - Step 31478: {'lr': 0.00015196949165511236, 'samples': 16117248, 'steps': 31478, 'batch_loss/train': 0.7738383379764855} 12/27/2021 02:43:29 - INFO - codeparrot_training - Step 31479: {'lr': 0.00015195499113459742, 'samples': 16117760, 'steps': 31479, 'batch_loss/train': 0.6964362990111113} 12/27/2021 02:43:40 - INFO - codeparrot_training - Step 31480: {'lr': 0.00015194049100387247, 'samples': 16118272, 'steps': 31480, 'batch_loss/train': 0.7328276028856635} 12/27/2021 02:43:53 - INFO - codeparrot_training - Step 31481: {'lr': 0.000151925991262995, 'samples': 16118784, 'steps': 31481, 'batch_loss/train': 0.7583038443699479} 12/27/2021 02:44:03 - INFO - codeparrot_training - Step 
31482: {'lr': 0.0001519114919120228, 'samples': 16119296, 'steps': 31482, 'batch_loss/train': 0.6309347879141569} 12/27/2021 02:44:14 - INFO - codeparrot_training - Step 31483: {'lr': 0.0001518969929510135, 'samples': 16119808, 'steps': 31483, 'batch_loss/train': 0.703783311881125} 12/27/2021 02:44:26 - INFO - codeparrot_training - Step 31484: {'lr': 0.00015188249438002476, 'samples': 16120320, 'steps': 31484, 'batch_loss/train': 0.6927805099403486} 12/27/2021 02:44:37 - INFO - codeparrot_training - Step 31485: {'lr': 0.00015186799619911403, 'samples': 16120832, 'steps': 31485, 'batch_loss/train': 0.8748905076645315} 12/27/2021 02:44:47 - INFO - codeparrot_training - Step 31486: {'lr': 0.00015185349840833917, 'samples': 16121344, 'steps': 31486, 'batch_loss/train': 0.8289650953374803} 12/27/2021 02:44:59 - INFO - codeparrot_training - Step 31487: {'lr': 0.0001518390010077578, 'samples': 16121856, 'steps': 31487, 'batch_loss/train': 0.7433943618088961} 12/27/2021 02:45:10 - INFO - codeparrot_training - Step 31488: {'lr': 0.00015182450399742743, 'samples': 16122368, 'steps': 31488, 'batch_loss/train': 0.7184185329824686} 12/27/2021 02:45:20 - INFO - codeparrot_training - Step 31489: {'lr': 0.00015181000737740573, 'samples': 16122880, 'steps': 31489, 'batch_loss/train': 0.6129598161205649} 12/27/2021 02:45:31 - INFO - codeparrot_training - Step 31490: {'lr': 0.00015179551114775046, 'samples': 16123392, 'steps': 31490, 'batch_loss/train': 0.7467375372070819} 12/27/2021 02:45:44 - INFO - codeparrot_training - Step 31491: {'lr': 0.0001517810153085191, 'samples': 16123904, 'steps': 31491, 'batch_loss/train': 0.7727363593876362} 12/27/2021 02:45:54 - INFO - codeparrot_training - Step 31492: {'lr': 0.00015176651985976928, 'samples': 16124416, 'steps': 31492, 'batch_loss/train': 0.7054386329837143} 12/27/2021 02:46:05 - INFO - codeparrot_training - Step 31493: {'lr': 0.00015175202480155882, 'samples': 16124928, 'steps': 31493, 'batch_loss/train': 0.7159635750576854} 
12/27/2021 02:46:17 - INFO - codeparrot_training - Step 31494: {'lr': 0.00015173753013394514, 'samples': 16125440, 'steps': 31494, 'batch_loss/train': 0.6832621422363445} 12/27/2021 02:46:28 - INFO - codeparrot_training - Step 31495: {'lr': 0.00015172303585698596, 'samples': 16125952, 'steps': 31495, 'batch_loss/train': 0.7718106820248067} 12/27/2021 02:46:38 - INFO - codeparrot_training - Step 31496: {'lr': 0.0001517085419707388, 'samples': 16126464, 'steps': 31496, 'batch_loss/train': 0.5990631942404434} 12/27/2021 02:46:51 - INFO - codeparrot_training - Step 31497: {'lr': 0.00015169404847526147, 'samples': 16126976, 'steps': 31497, 'batch_loss/train': 0.7343331323936582} 12/27/2021 02:47:02 - INFO - codeparrot_training - Step 31498: {'lr': 0.00015167955537061145, 'samples': 16127488, 'steps': 31498, 'batch_loss/train': 0.7331145964562893} 12/27/2021 02:47:12 - INFO - codeparrot_training - Step 31499: {'lr': 0.00015166506265684642, 'samples': 16128000, 'steps': 31499, 'batch_loss/train': 0.6760934421326965} 12/27/2021 02:47:24 - INFO - codeparrot_training - Step 31500: {'lr': 0.00015165057033402388, 'samples': 16128512, 'steps': 31500, 'batch_loss/train': 0.7103553544729948} 12/27/2021 02:47:35 - INFO - codeparrot_training - Step 31501: {'lr': 0.0001516360784022016, 'samples': 16129024, 'steps': 31501, 'batch_loss/train': 0.706904532853514} 12/27/2021 02:47:45 - INFO - codeparrot_training - Step 31502: {'lr': 0.00015162158686143716, 'samples': 16129536, 'steps': 31502, 'batch_loss/train': 0.5841919060912915} 12/27/2021 02:47:56 - INFO - codeparrot_training - Step 31503: {'lr': 0.00015160709571178805, 'samples': 16130048, 'steps': 31503, 'batch_loss/train': 0.6139028932084329} 12/27/2021 02:48:08 - INFO - codeparrot_training - Step 31504: {'lr': 0.00015159260495331206, 'samples': 16130560, 'steps': 31504, 'batch_loss/train': 0.6896583186462522} 12/27/2021 02:48:19 - INFO - codeparrot_training - Step 31505: {'lr': 0.0001515781145860667, 'samples': 16131072, 
'steps': 31505, 'batch_loss/train': 0.6065133341471665} 12/27/2021 02:48:30 - INFO - codeparrot_training - Step 31506: {'lr': 0.0001515636246101096, 'samples': 16131584, 'steps': 31506, 'batch_loss/train': 0.5704595501883887} 12/27/2021 02:48:42 - INFO - codeparrot_training - Step 31507: {'lr': 0.0001515491350254983, 'samples': 16132096, 'steps': 31507, 'batch_loss/train': 0.6552599861170165} 12/27/2021 02:48:52 - INFO - codeparrot_training - Step 31508: {'lr': 0.0001515346458322905, 'samples': 16132608, 'steps': 31508, 'batch_loss/train': 0.686503151897341} 12/27/2021 02:49:03 - INFO - codeparrot_training - Step 31509: {'lr': 0.00015152015703054383, 'samples': 16133120, 'steps': 31509, 'batch_loss/train': 0.7183635057881474} 12/27/2021 02:49:16 - INFO - codeparrot_training - Step 31510: {'lr': 0.00015150566862031585, 'samples': 16133632, 'steps': 31510, 'batch_loss/train': 0.7410596385598183} 12/27/2021 02:49:26 - INFO - codeparrot_training - Step 31511: {'lr': 0.000151491180601664, 'samples': 16134144, 'steps': 31511, 'batch_loss/train': 0.6397136354353279} 12/27/2021 02:49:37 - INFO - codeparrot_training - Step 31512: {'lr': 0.00015147669297464606, 'samples': 16134656, 'steps': 31512, 'batch_loss/train': 0.7214510682970285} 12/27/2021 02:49:49 - INFO - codeparrot_training - Step 31513: {'lr': 0.00015146220573931967, 'samples': 16135168, 'steps': 31513, 'batch_loss/train': 0.737498658709228} 12/27/2021 02:49:59 - INFO - codeparrot_training - Step 31514: {'lr': 0.00015144771889574228, 'samples': 16135680, 'steps': 31514, 'batch_loss/train': 0.6294438873883337} 12/27/2021 02:50:10 - INFO - codeparrot_training - Step 31515: {'lr': 0.0001514332324439715, 'samples': 16136192, 'steps': 31515, 'batch_loss/train': 0.6383352223783731} 12/27/2021 02:50:21 - INFO - codeparrot_training - Step 31516: {'lr': 0.00015141874638406507, 'samples': 16136704, 'steps': 31516, 'batch_loss/train': 0.6964890863746405} 12/27/2021 02:50:33 - INFO - codeparrot_training - Step 31517: {'lr': 
0.00015140426071608043, 'samples': 16137216, 'steps': 31517, 'batch_loss/train': 0.7416737475432456} 12/27/2021 02:50:43 - INFO - codeparrot_training - Step 31518: {'lr': 0.0001513897754400752, 'samples': 16137728, 'steps': 31518, 'batch_loss/train': 0.7244674609974027} 12/27/2021 02:50:54 - INFO - codeparrot_training - Step 31519: {'lr': 0.00015137529055610704, 'samples': 16138240, 'steps': 31519, 'batch_loss/train': 0.7813706323504448} 12/27/2021 02:51:07 - INFO - codeparrot_training - Step 31520: {'lr': 0.00015136080606423352, 'samples': 16138752, 'steps': 31520, 'batch_loss/train': 0.6413423191988841} 12/27/2021 02:51:17 - INFO - codeparrot_training - Step 31521: {'lr': 0.00015134632196451214, 'samples': 16139264, 'steps': 31521, 'batch_loss/train': 0.5576433222740889} 12/27/2021 02:51:28 - INFO - codeparrot_training - Step 31522: {'lr': 0.00015133183825700048, 'samples': 16139776, 'steps': 31522, 'batch_loss/train': 0.6848308630287647} 12/27/2021 02:51:40 - INFO - codeparrot_training - Step 31523: {'lr': 0.00015131735494175629, 'samples': 16140288, 'steps': 31523, 'batch_loss/train': 0.6837809784337878} 12/27/2021 02:51:51 - INFO - codeparrot_training - Step 31524: {'lr': 0.00015130287201883696, 'samples': 16140800, 'steps': 31524, 'batch_loss/train': 0.7010832452215254} 12/27/2021 02:52:01 - INFO - codeparrot_training - Step 31525: {'lr': 0.0001512883894883002, 'samples': 16141312, 'steps': 31525, 'batch_loss/train': 0.690398509032093} 12/27/2021 02:52:14 - INFO - codeparrot_training - Step 31526: {'lr': 0.0001512739073502034, 'samples': 16141824, 'steps': 31526, 'batch_loss/train': 0.7113946229219437} 12/27/2021 02:52:25 - INFO - codeparrot_training - Step 31527: {'lr': 0.00015125942560460437, 'samples': 16142336, 'steps': 31527, 'batch_loss/train': 0.7750695394352078} 12/27/2021 02:52:35 - INFO - codeparrot_training - Step 31528: {'lr': 0.0001512449442515606, 'samples': 16142848, 'steps': 31528, 'batch_loss/train': 0.827846510335803} 12/27/2021 02:52:46 - 
INFO - codeparrot_training - Step 31529: {'lr': 0.00015123046329112955, 'samples': 16143360, 'steps': 31529, 'batch_loss/train': 0.6886011816095561} 12/27/2021 02:52:58 - INFO - codeparrot_training - Step 31530: {'lr': 0.000151215982723369, 'samples': 16143872, 'steps': 31530, 'batch_loss/train': 0.7595715820789337} 12/27/2021 02:53:09 - INFO - codeparrot_training - Step 31531: {'lr': 0.00015120150254833634, 'samples': 16144384, 'steps': 31531, 'batch_loss/train': 0.6852353151189163} 12/27/2021 02:53:19 - INFO - codeparrot_training - Step 31532: {'lr': 0.00015118702276608925, 'samples': 16144896, 'steps': 31532, 'batch_loss/train': 0.7253406811505556} 12/27/2021 02:53:32 - INFO - codeparrot_training - Step 31533: {'lr': 0.00015117254337668515, 'samples': 16145408, 'steps': 31533, 'batch_loss/train': 0.7702100090391468} 12/27/2021 02:53:43 - INFO - codeparrot_training - Step 31534: {'lr': 0.00015115806438018175, 'samples': 16145920, 'steps': 31534, 'batch_loss/train': 0.671945808455348} 12/27/2021 02:53:53 - INFO - codeparrot_training - Step 31535: {'lr': 0.00015114358577663663, 'samples': 16146432, 'steps': 31535, 'batch_loss/train': 0.750348842702806} 12/27/2021 02:54:06 - INFO - codeparrot_training - Step 31536: {'lr': 0.0001511291075661072, 'samples': 16146944, 'steps': 31536, 'batch_loss/train': 0.65590674552368} 12/27/2021 02:54:17 - INFO - codeparrot_training - Step 31537: {'lr': 0.00015111462974865108, 'samples': 16147456, 'steps': 31537, 'batch_loss/train': 0.7227046792395413} 12/27/2021 02:54:27 - INFO - codeparrot_training - Step 31538: {'lr': 0.00015110015232432594, 'samples': 16147968, 'steps': 31538, 'batch_loss/train': 0.592062541982159} 12/27/2021 02:54:38 - INFO - codeparrot_training - Step 31539: {'lr': 0.00015108567529318928, 'samples': 16148480, 'steps': 31539, 'batch_loss/train': 0.5439181718975306} 12/27/2021 02:54:50 - INFO - codeparrot_training - Step 31540: {'lr': 0.0001510711986552985, 'samples': 16148992, 'steps': 31540, 
'batch_loss/train': 0.7469077873975039} 12/27/2021 02:55:01 - INFO - codeparrot_training - Step 31541: {'lr': 0.00015105672241071133, 'samples': 16149504, 'steps': 31541, 'batch_loss/train': 0.7096933517605066} 12/27/2021 02:55:11 - INFO - codeparrot_training - Step 31542: {'lr': 0.00015104224655948533, 'samples': 16150016, 'steps': 31542, 'batch_loss/train': 0.6750988978892565} 12/27/2021 02:55:23 - INFO - codeparrot_training - Step 31543: {'lr': 0.00015102777110167798, 'samples': 16150528, 'steps': 31543, 'batch_loss/train': 0.7328909016214311} 12/27/2021 02:55:34 - INFO - codeparrot_training - Step 31544: {'lr': 0.00015101329603734676, 'samples': 16151040, 'steps': 31544, 'batch_loss/train': 0.7439754949882627} 12/27/2021 02:55:45 - INFO - codeparrot_training - Step 31545: {'lr': 0.00015099882136654938, 'samples': 16151552, 'steps': 31545, 'batch_loss/train': 0.7246581744402647} 12/27/2021 02:55:57 - INFO - codeparrot_training - Step 31546: {'lr': 0.00015098434708934333, 'samples': 16152064, 'steps': 31546, 'batch_loss/train': 0.7958050114102662} 12/27/2021 02:56:07 - INFO - codeparrot_training - Step 31547: {'lr': 0.00015096987320578606, 'samples': 16152576, 'steps': 31547, 'batch_loss/train': 0.6317842246498913} 12/27/2021 02:56:18 - INFO - codeparrot_training - Step 31548: {'lr': 0.00015095539971593514, 'samples': 16153088, 'steps': 31548, 'batch_loss/train': 0.6659115398069844} 12/27/2021 02:56:32 - INFO - codeparrot_training - Step 31549: {'lr': 0.00015094092661984825, 'samples': 16153600, 'steps': 31549, 'batch_loss/train': 0.6573078613728285} 12/27/2021 02:56:42 - INFO - codeparrot_training - Step 31550: {'lr': 0.0001509264539175828, 'samples': 16154112, 'steps': 31550, 'batch_loss/train': 0.5794845618656836} 12/27/2021 02:56:53 - INFO - codeparrot_training - Step 31551: {'lr': 0.0001509119816091964, 'samples': 16154624, 'steps': 31551, 'batch_loss/train': 0.7186567434109747} 12/27/2021 02:57:04 - INFO - codeparrot_training - Step 31552: {'lr': 
0.0001508975096947464, 'samples': 16155136, 'steps': 31552, 'batch_loss/train': 0.6341042334679514} 12/27/2021 02:57:16 - INFO - codeparrot_training - Step 31553: {'lr': 0.00015088303817429062, 'samples': 16155648, 'steps': 31553, 'batch_loss/train': 0.7286252825288102} 12/27/2021 02:57:26 - INFO - codeparrot_training - Step 31554: {'lr': 0.00015086856704788642, 'samples': 16156160, 'steps': 31554, 'batch_loss/train': 0.5001787610235624} 12/27/2021 02:57:37 - INFO - codeparrot_training - Step 31555: {'lr': 0.00015085409631559132, 'samples': 16156672, 'steps': 31555, 'batch_loss/train': 0.6694228057749569} 12/27/2021 02:57:50 - INFO - codeparrot_training - Step 31556: {'lr': 0.00015083962597746298, 'samples': 16157184, 'steps': 31556, 'batch_loss/train': 0.668015081435442} 12/27/2021 02:58:01 - INFO - codeparrot_training - Step 31557: {'lr': 0.00015082515603355882, 'samples': 16157696, 'steps': 31557, 'batch_loss/train': 0.7443045871332288} 12/27/2021 02:58:11 - INFO - codeparrot_training - Step 31558: {'lr': 0.00015081068648393643, 'samples': 16158208, 'steps': 31558, 'batch_loss/train': 0.7621028728317469} 12/27/2021 02:58:23 - INFO - codeparrot_training - Step 31559: {'lr': 0.00015079621732865317, 'samples': 16158720, 'steps': 31559, 'batch_loss/train': 0.6233878751809243} 12/27/2021 02:58:34 - INFO - codeparrot_training - Step 31560: {'lr': 0.00015078174856776678, 'samples': 16159232, 'steps': 31560, 'batch_loss/train': 0.7353176893666387} 12/27/2021 02:58:45 - INFO - codeparrot_training - Step 31561: {'lr': 0.00015076728020133474, 'samples': 16159744, 'steps': 31561, 'batch_loss/train': 0.6346353432163596} 12/27/2021 02:58:55 - INFO - codeparrot_training - Step 31562: {'lr': 0.00015075281222941446, 'samples': 16160256, 'steps': 31562, 'batch_loss/train': 0.8508788356557488} 12/27/2021 02:59:08 - INFO - codeparrot_training - Step 31563: {'lr': 0.00015073834465206346, 'samples': 16160768, 'steps': 31563, 'batch_loss/train': 0.5597165746148676} 12/27/2021 02:59:18 
- INFO - codeparrot_training - Step 31564: {'lr': 0.00015072387746933944, 'samples': 16161280, 'steps': 31564, 'batch_loss/train': 0.6651992439292371} 12/27/2021 02:59:29 - INFO - codeparrot_training - Step 31565: {'lr': 0.0001507094106812998, 'samples': 16161792, 'steps': 31565, 'batch_loss/train': 0.6714837965555489} 12/27/2021 02:59:42 - INFO - codeparrot_training - Step 31566: {'lr': 0.00015069494428800195, 'samples': 16162304, 'steps': 31566, 'batch_loss/train': 0.7104943976737559} 12/27/2021 02:59:52 - INFO - codeparrot_training - Step 31567: {'lr': 0.00015068047828950353, 'samples': 16162816, 'steps': 31567, 'batch_loss/train': 0.7190822437405586} 12/27/2021 03:00:03 - INFO - codeparrot_training - Step 31568: {'lr': 0.0001506660126858621, 'samples': 16163328, 'steps': 31568, 'batch_loss/train': 0.7021235595457256} 12/27/2021 03:00:15 - INFO - codeparrot_training - Step 31569: {'lr': 0.00015065154747713504, 'samples': 16163840, 'steps': 31569, 'batch_loss/train': 0.7148955315351486} 12/27/2021 03:00:26 - INFO - codeparrot_training - Step 31570: {'lr': 0.0001506370826633799, 'samples': 16164352, 'steps': 31570, 'batch_loss/train': 0.745745268650353} 12/27/2021 03:00:37 - INFO - codeparrot_training - Step 31571: {'lr': 0.00015062261824465428, 'samples': 16164864, 'steps': 31571, 'batch_loss/train': 0.7024156204424798} 12/27/2021 03:00:49 - INFO - codeparrot_training - Step 31572: {'lr': 0.00015060815422101553, 'samples': 16165376, 'steps': 31572, 'batch_loss/train': 0.7353322450071573} 12/27/2021 03:00:59 - INFO - codeparrot_training - Step 31573: {'lr': 0.00015059369059252132, 'samples': 16165888, 'steps': 31573, 'batch_loss/train': 0.5243236672831699} 12/27/2021 03:01:10 - INFO - codeparrot_training - Step 31574: {'lr': 0.00015057922735922893, 'samples': 16166400, 'steps': 31574, 'batch_loss/train': 1.3861316861584783} 12/27/2021 03:01:21 - INFO - codeparrot_training - Step 31575: {'lr': 0.0001505647645211961, 'samples': 16166912, 'steps': 31575, 
'batch_loss/train': 1.0312389680184424} 12/27/2021 03:01:33 - INFO - codeparrot_training - Step 31576: {'lr': 0.0001505503020784802, 'samples': 16167424, 'steps': 31576, 'batch_loss/train': 0.6958817914128304} 12/27/2021 03:01:44 - INFO - codeparrot_training - Step 31577: {'lr': 0.00015053584003113865, 'samples': 16167936, 'steps': 31577, 'batch_loss/train': 0.7642549141310155} 12/27/2021 03:01:54 - INFO - codeparrot_training - Step 31578: {'lr': 0.00015052137837922918, 'samples': 16168448, 'steps': 31578, 'batch_loss/train': 0.6340053707244806} 12/27/2021 03:02:07 - INFO - codeparrot_training - Step 31579: {'lr': 0.00015050691712280905, 'samples': 16168960, 'steps': 31579, 'batch_loss/train': 0.7128059146925807} 12/27/2021 03:02:17 - INFO - codeparrot_training - Step 31580: {'lr': 0.00015049245626193598, 'samples': 16169472, 'steps': 31580, 'batch_loss/train': 0.7530551175586879} 12/27/2021 03:02:28 - INFO - codeparrot_training - Step 31581: {'lr': 0.00015047799579666715, 'samples': 16169984, 'steps': 31581, 'batch_loss/train': 0.7077461765147746} 12/27/2021 03:02:40 - INFO - codeparrot_training - Step 31582: {'lr': 0.00015046353572706033, 'samples': 16170496, 'steps': 31582, 'batch_loss/train': 0.6558126052841544} 12/27/2021 03:02:51 - INFO - codeparrot_training - Step 31583: {'lr': 0.00015044907605317296, 'samples': 16171008, 'steps': 31583, 'batch_loss/train': 0.7271980112418532} 12/27/2021 03:03:02 - INFO - codeparrot_training - Step 31584: {'lr': 0.00015043461677506246, 'samples': 16171520, 'steps': 31584, 'batch_loss/train': 0.6928527434356511} 12/27/2021 03:03:12 - INFO - codeparrot_training - Step 31585: {'lr': 0.00015042015789278623, 'samples': 16172032, 'steps': 31585, 'batch_loss/train': 0.5899099740781821} 12/27/2021 03:03:25 - INFO - codeparrot_training - Step 31586: {'lr': 0.0001504056994064019, 'samples': 16172544, 'steps': 31586, 'batch_loss/train': 0.8173958896659315} 12/27/2021 03:03:36 - INFO - codeparrot_training - Step 31587: {'lr': 
0.00015039124131596693, 'samples': 16173056, 'steps': 31587, 'batch_loss/train': 0.6790745826438069} 12/27/2021 03:03:46 - INFO - codeparrot_training - Step 31588: {'lr': 0.00015037678362153878, 'samples': 16173568, 'steps': 31588, 'batch_loss/train': 0.7820536214858294} 12/27/2021 03:03:58 - INFO - codeparrot_training - Step 31589: {'lr': 0.0001503623263231748, 'samples': 16174080, 'steps': 31589, 'batch_loss/train': 0.7491732453927398} 12/27/2021 03:04:09 - INFO - codeparrot_training - Step 31590: {'lr': 0.00015034786942093277, 'samples': 16174592, 'steps': 31590, 'batch_loss/train': 0.7252370032365434} 12/27/2021 03:04:19 - INFO - codeparrot_training - Step 31591: {'lr': 0.00015033341291486994, 'samples': 16175104, 'steps': 31591, 'batch_loss/train': 0.707928170915693} 12/27/2021 03:04:30 - INFO - codeparrot_training - Step 31592: {'lr': 0.00015031895680504373, 'samples': 16175616, 'steps': 31592, 'batch_loss/train': 0.7474225480109453} 12/27/2021 03:04:42 - INFO - codeparrot_training - Step 31593: {'lr': 0.00015030450109151177, 'samples': 16176128, 'steps': 31593, 'batch_loss/train': 0.6776969088241458} 12/27/2021 03:04:53 - INFO - codeparrot_training - Step 31594: {'lr': 0.0001502900457743315, 'samples': 16176640, 'steps': 31594, 'batch_loss/train': 0.7043181923218071} 12/27/2021 03:05:03 - INFO - codeparrot_training - Step 31595: {'lr': 0.0001502755908535603, 'samples': 16177152, 'steps': 31595, 'batch_loss/train': 0.6038160275202245} 12/27/2021 03:05:16 - INFO - codeparrot_training - Step 31596: {'lr': 0.0001502611363292557, 'samples': 16177664, 'steps': 31596, 'batch_loss/train': 0.7187776565551758} 12/27/2021 03:05:27 - INFO - codeparrot_training - Step 31597: {'lr': 0.00015024668220147523, 'samples': 16178176, 'steps': 31597, 'batch_loss/train': 0.6339523384522181} 12/27/2021 03:05:37 - INFO - codeparrot_training - Step 31598: {'lr': 0.00015023222847027623, 'samples': 16178688, 'steps': 31598, 'batch_loss/train': 0.7122092814533971} 12/27/2021 03:05:50 - 
INFO - codeparrot_training - Step 31599: {'lr': 0.00015021777513571626, 'samples': 16179200, 'steps': 31599, 'batch_loss/train': 0.721508227288723} 12/27/2021 03:06:00 - INFO - codeparrot_training - Step 31600: {'lr': 0.00015020332219785267, 'samples': 16179712, 'steps': 31600, 'batch_loss/train': 0.675104558467865} 12/27/2021 03:06:11 - INFO - codeparrot_training - Step 31601: {'lr': 0.00015018886965674304, 'samples': 16180224, 'steps': 31601, 'batch_loss/train': 0.6837186724878848} 12/27/2021 03:06:23 - INFO - codeparrot_training - Step 31602: {'lr': 0.0001501744175124448, 'samples': 16180736, 'steps': 31602, 'batch_loss/train': 0.7423969674855471} 12/27/2021 03:06:34 - INFO - codeparrot_training - Step 31603: {'lr': 0.0001501599657650153, 'samples': 16181248, 'steps': 31603, 'batch_loss/train': 0.7026722128503025} 12/27/2021 03:06:45 - INFO - codeparrot_training - Step 31604: {'lr': 0.00015014551441451212, 'samples': 16181760, 'steps': 31604, 'batch_loss/train': 0.5620977662038058} 12/27/2021 03:06:55 - INFO - codeparrot_training - Step 31605: {'lr': 0.0001501310634609927, 'samples': 16182272, 'steps': 31605, 'batch_loss/train': 0.6969141322188079} 12/27/2021 03:07:08 - INFO - codeparrot_training - Step 31606: {'lr': 0.0001501166129045145, 'samples': 16182784, 'steps': 31606, 'batch_loss/train': 0.7348775779828429} 12/27/2021 03:07:18 - INFO - codeparrot_training - Step 31607: {'lr': 0.0001501021627451348, 'samples': 16183296, 'steps': 31607, 'batch_loss/train': 0.7926487075164914} 12/27/2021 03:07:29 - INFO - codeparrot_training - Step 31608: {'lr': 0.00015008771298291124, 'samples': 16183808, 'steps': 31608, 'batch_loss/train': 0.7718122014775872} 12/27/2021 03:07:41 - INFO - codeparrot_training - Step 31609: {'lr': 0.00015007326361790127, 'samples': 16184320, 'steps': 31609, 'batch_loss/train': 0.6521721258759499} 12/27/2021 03:07:52 - INFO - codeparrot_training - Step 31610: {'lr': 0.00015005881465016228, 'samples': 16184832, 'steps': 31610, 
'batch_loss/train': 0.6884867339394987} 12/27/2021 03:08:02 - INFO - codeparrot_training - Step 31611: {'lr': 0.00015004436607975158, 'samples': 16185344, 'steps': 31611, 'batch_loss/train': 0.7342792619019747} 12/27/2021 03:08:14 - INFO - codeparrot_training - Step 31612: {'lr': 0.00015002991790672682, 'samples': 16185856, 'steps': 31612, 'batch_loss/train': 0.5459339949884452} 12/27/2021 03:08:25 - INFO - codeparrot_training - Step 31613: {'lr': 0.0001500154701311454, 'samples': 16186368, 'steps': 31613, 'batch_loss/train': 0.6966313547454774} 12/27/2021 03:08:36 - INFO - codeparrot_training - Step 31614: {'lr': 0.0001500010227530646, 'samples': 16186880, 'steps': 31614, 'batch_loss/train': 0.7025516852736473} 12/27/2021 03:08:49 - INFO - codeparrot_training - Step 31615: {'lr': 0.00014998657577254204, 'samples': 16187392, 'steps': 31615, 'batch_loss/train': 0.8505585929378867} 12/27/2021 03:09:00 - INFO - codeparrot_training - Step 31616: {'lr': 0.00014997212918963515, 'samples': 16187904, 'steps': 31616, 'batch_loss/train': 0.7607985464856029} 12/27/2021 03:09:11 - INFO - codeparrot_training - Step 31617: {'lr': 0.0001499576830044012, 'samples': 16188416, 'steps': 31617, 'batch_loss/train': 0.7168494660872966} 12/27/2021 03:09:21 - INFO - codeparrot_training - Step 31618: {'lr': 0.00014994323721689772, 'samples': 16188928, 'steps': 31618, 'batch_loss/train': 0.8339604260399938} 12/27/2021 03:09:33 - INFO - codeparrot_training - Step 31619: {'lr': 0.00014992879182718226, 'samples': 16189440, 'steps': 31619, 'batch_loss/train': 0.850530524738133} 12/27/2021 03:09:44 - INFO - codeparrot_training - Step 31620: {'lr': 0.00014991434683531214, 'samples': 16189952, 'steps': 31620, 'batch_loss/train': 0.6366600766777992} 12/27/2021 03:09:54 - INFO - codeparrot_training - Step 31621: {'lr': 0.00014989990224134476, 'samples': 16190464, 'steps': 31621, 'batch_loss/train': 0.670459725195542} 12/27/2021 03:10:06 - INFO - codeparrot_training - Step 31622: {'lr': 
0.00014988545804533746, 'samples': 16190976, 'steps': 31622, 'batch_loss/train': 0.7775247478857636} 12/27/2021 03:10:17 - INFO - codeparrot_training - Step 31623: {'lr': 0.0001498710142473479, 'samples': 16191488, 'steps': 31623, 'batch_loss/train': 0.7579837804660201} 12/27/2021 03:10:28 - INFO - codeparrot_training - Step 31624: {'lr': 0.00014985657084743334, 'samples': 16192000, 'steps': 31624, 'batch_loss/train': 0.7948171626776457} 12/27/2021 03:10:42 - INFO - codeparrot_training - Step 31625: {'lr': 0.00014984212784565127, 'samples': 16192512, 'steps': 31625, 'batch_loss/train': 0.7122698193415999} 12/27/2021 03:10:52 - INFO - codeparrot_training - Step 31626: {'lr': 0.00014982768524205903, 'samples': 16193024, 'steps': 31626, 'batch_loss/train': 0.8272983375936747} 12/27/2021 03:11:03 - INFO - codeparrot_training - Step 31627: {'lr': 0.0001498132430367141, 'samples': 16193536, 'steps': 31627, 'batch_loss/train': 0.6876924876123667} 12/27/2021 03:11:15 - INFO - codeparrot_training - Step 31628: {'lr': 0.00014979880122967392, 'samples': 16194048, 'steps': 31628, 'batch_loss/train': 0.8077275669202209} 12/27/2021 03:11:26 - INFO - codeparrot_training - Step 31629: {'lr': 0.0001497843598209958, 'samples': 16194560, 'steps': 31629, 'batch_loss/train': 0.7503146370872855} 12/27/2021 03:11:36 - INFO - codeparrot_training - Step 31630: {'lr': 0.00014976991881073733, 'samples': 16195072, 'steps': 31630, 'batch_loss/train': 0.8021339299157262} 12/27/2021 03:11:47 - INFO - codeparrot_training - Step 31631: {'lr': 0.00014975547819895575, 'samples': 16195584, 'steps': 31631, 'batch_loss/train': 0.7491267486475408} 12/27/2021 03:12:01 - INFO - codeparrot_training - Step 31632: {'lr': 0.0001497410379857086, 'samples': 16196096, 'steps': 31632, 'batch_loss/train': 0.7394359367899597} 12/27/2021 03:12:11 - INFO - codeparrot_training - Step 31633: {'lr': 0.00014972659817105312, 'samples': 16196608, 'steps': 31633, 'batch_loss/train': 0.7092770081944764} 12/27/2021 03:12:22 - 
INFO - codeparrot_training - Step 31634: {'lr': 0.0001497121587550469, 'samples': 16197120, 'steps': 31634, 'batch_loss/train': 0.6842481223866343} 12/27/2021 03:12:34 - INFO - codeparrot_training - Step 31635: {'lr': 0.00014969771973774725, 'samples': 16197632, 'steps': 31635, 'batch_loss/train': 0.787144185975194} 12/27/2021 03:12:45 - INFO - codeparrot_training - Step 31636: {'lr': 0.00014968328111921166, 'samples': 16198144, 'steps': 31636, 'batch_loss/train': 0.7320106661645696} 12/27/2021 03:12:55 - INFO - codeparrot_training - Step 31637: {'lr': 0.00014966884289949734, 'samples': 16198656, 'steps': 31637, 'batch_loss/train': 0.7585328221321106} 12/27/2021 03:13:07 - INFO - codeparrot_training - Step 31638: {'lr': 0.0001496544050786619, 'samples': 16199168, 'steps': 31638, 'batch_loss/train': 0.7887696549296379} 12/27/2021 03:13:18 - INFO - codeparrot_training - Step 31639: {'lr': 0.00014963996765676268, 'samples': 16199680, 'steps': 31639, 'batch_loss/train': 0.7991688712500036} 12/27/2021 03:13:29 - INFO - codeparrot_training - Step 31640: {'lr': 0.00014962553063385694, 'samples': 16200192, 'steps': 31640, 'batch_loss/train': 0.8053356576710939} 12/27/2021 03:13:39 - INFO - codeparrot_training - Step 31641: {'lr': 0.00014961109401000224, 'samples': 16200704, 'steps': 31641, 'batch_loss/train': 0.8155677216127515} 12/27/2021 03:13:53 - INFO - codeparrot_training - Step 31642: {'lr': 0.000149596657785256, 'samples': 16201216, 'steps': 31642, 'batch_loss/train': 0.709837771137245} 12/27/2021 03:14:04 - INFO - codeparrot_training - Step 31643: {'lr': 0.00014958222195967546, 'samples': 16201728, 'steps': 31643, 'batch_loss/train': 0.7332812553504482} 12/27/2021 03:14:14 - INFO - codeparrot_training - Step 31644: {'lr': 0.0001495677865333181, 'samples': 16202240, 'steps': 31644, 'batch_loss/train': 0.7402179503696971} 12/27/2021 03:14:27 - INFO - codeparrot_training - Step 31645: {'lr': 0.00014955335150624132, 'samples': 16202752, 'steps': 31645, 
'batch_loss/train': 0.8030292051844299} 12/27/2021 03:14:37 - INFO - codeparrot_training - Step 31646: {'lr': 0.00014953891687850251, 'samples': 16203264, 'steps': 31646, 'batch_loss/train': 0.7822386980988085} 12/27/2021 03:14:48 - INFO - codeparrot_training - Step 31647: {'lr': 0.00014952448265015902, 'samples': 16203776, 'steps': 31647, 'batch_loss/train': 0.6835565872024745} 12/27/2021 03:15:00 - INFO - codeparrot_training - Step 31648: {'lr': 0.00014951004882126818, 'samples': 16204288, 'steps': 31648, 'batch_loss/train': 0.783615087158978} 12/27/2021 03:15:11 - INFO - codeparrot_training - Step 31649: {'lr': 0.00014949561539188755, 'samples': 16204800, 'steps': 31649, 'batch_loss/train': 0.8961152657866478} 12/27/2021 03:15:21 - INFO - codeparrot_training - Step 31650: {'lr': 0.0001494811823620743, 'samples': 16205312, 'steps': 31650, 'batch_loss/train': 0.7123422183794901} 12/27/2021 03:15:32 - INFO - codeparrot_training - Step 31651: {'lr': 0.00014946674973188593, 'samples': 16205824, 'steps': 31651, 'batch_loss/train': 0.7148691010661423} 12/27/2021 03:15:46 - INFO - codeparrot_training - Step 31652: {'lr': 0.00014945231750137987, 'samples': 16206336, 'steps': 31652, 'batch_loss/train': 0.7745263804681599} 12/27/2021 03:15:56 - INFO - codeparrot_training - Step 31653: {'lr': 0.00014943788567061339, 'samples': 16206848, 'steps': 31653, 'batch_loss/train': 0.7594144493341446} 12/27/2021 03:16:07 - INFO - codeparrot_training - Step 31654: {'lr': 0.00014942345423964395, 'samples': 16207360, 'steps': 31654, 'batch_loss/train': 0.7718704855069518} 12/27/2021 03:16:19 - INFO - codeparrot_training - Step 31655: {'lr': 0.00014940902320852877, 'samples': 16207872, 'steps': 31655, 'batch_loss/train': 0.8346535703167319} 12/27/2021 03:16:30 - INFO - codeparrot_training - Step 31656: {'lr': 0.00014939459257732547, 'samples': 16208384, 'steps': 31656, 'batch_loss/train': 0.8215374713763595} 12/27/2021 03:16:40 - INFO - codeparrot_training - Step 31657: {'lr': 
0.0001493801623460912, 'samples': 16208896, 'steps': 31657, 'batch_loss/train': 0.7473504897207022} 12/27/2021 03:16:53 - INFO - codeparrot_training - Step 31658: {'lr': 0.0001493657325148835, 'samples': 16209408, 'steps': 31658, 'batch_loss/train': 0.7797026708722115} 12/27/2021 03:17:03 - INFO - codeparrot_training - Step 31659: {'lr': 0.0001493513030837595, 'samples': 16209920, 'steps': 31659, 'batch_loss/train': 0.7987078139558434} 12/27/2021 03:17:14 - INFO - codeparrot_training - Step 31660: {'lr': 0.0001493368740527768, 'samples': 16210432, 'steps': 31660, 'batch_loss/train': 0.7118460603524} 12/27/2021 03:17:28 - INFO - codeparrot_training - Step 31661: {'lr': 0.00014932244542199268, 'samples': 16210944, 'steps': 31661, 'batch_loss/train': 0.7848309138789773} 12/27/2021 03:17:39 - INFO - codeparrot_training - Step 31662: {'lr': 0.00014930801719146452, 'samples': 16211456, 'steps': 31662, 'batch_loss/train': 0.7449708124622703} 12/27/2021 03:17:49 - INFO - codeparrot_training - Step 31663: {'lr': 0.0001492935893612496, 'samples': 16211968, 'steps': 31663, 'batch_loss/train': 0.6416525086387992} 12/27/2021 03:18:00 - INFO - codeparrot_training - Step 31664: {'lr': 0.0001492791619314054, 'samples': 16212480, 'steps': 31664, 'batch_loss/train': 0.8016207222826779} 12/27/2021 03:18:12 - INFO - codeparrot_training - Step 31665: {'lr': 0.00014926473490198928, 'samples': 16212992, 'steps': 31665, 'batch_loss/train': 0.7805045119021088} 12/27/2021 03:18:23 - INFO - codeparrot_training - Step 31666: {'lr': 0.0001492503082730584, 'samples': 16213504, 'steps': 31666, 'batch_loss/train': 0.5185206040041521} 12/27/2021 03:18:33 - INFO - codeparrot_training - Step 31667: {'lr': 0.0001492358820446703, 'samples': 16214016, 'steps': 31667, 'batch_loss/train': 0.8296190658584237} 12/27/2021 03:18:46 - INFO - codeparrot_training - Step 31668: {'lr': 0.00014922145621688233, 'samples': 16214528, 'steps': 31668, 'batch_loss/train': 0.7656819587573409} 12/27/2021 03:18:56 - INFO - 
codeparrot_training - Step 31669: {'lr': 0.0001492070307897518, 'samples': 16215040, 'steps': 31669, 'batch_loss/train': 0.752427805098705} 12/27/2021 03:19:07 - INFO - codeparrot_training - Step 31670: {'lr': 0.00014919260576333593, 'samples': 16215552, 'steps': 31670, 'batch_loss/train': 0.9490494597703218} 12/27/2021 03:19:21 - INFO - codeparrot_training - Step 31671: {'lr': 0.00014917818113769233, 'samples': 16216064, 'steps': 31671, 'batch_loss/train': 0.6657813182100654} 12/27/2021 03:19:31 - INFO - codeparrot_training - Step 31672: {'lr': 0.0001491637569128782, 'samples': 16216576, 'steps': 31672, 'batch_loss/train': 0.7524215709418058} 12/27/2021 03:19:42 - INFO - codeparrot_training - Step 31673: {'lr': 0.0001491493330889509, 'samples': 16217088, 'steps': 31673, 'batch_loss/train': 0.7435899777337909} 12/27/2021 03:19:52 - INFO - codeparrot_training - Step 31674: {'lr': 0.00014913490966596767, 'samples': 16217600, 'steps': 31674, 'batch_loss/train': 0.7103992803022265} 12/27/2021 03:20:05 - INFO - codeparrot_training - Step 31675: {'lr': 0.00014912048664398608, 'samples': 16218112, 'steps': 31675, 'batch_loss/train': 0.6320135196438059} 12/27/2021 03:20:15 - INFO - codeparrot_training - Step 31676: {'lr': 0.0001491060640230633, 'samples': 16218624, 'steps': 31676, 'batch_loss/train': 0.7509701596572995} 12/27/2021 03:20:26 - INFO - codeparrot_training - Step 31677: {'lr': 0.00014909164180325663, 'samples': 16219136, 'steps': 31677, 'batch_loss/train': 0.7956889634951949} 12/27/2021 03:20:40 - INFO - codeparrot_training - Step 31678: {'lr': 0.00014907721998462365, 'samples': 16219648, 'steps': 31678, 'batch_loss/train': 0.8251870041713119} 12/27/2021 03:20:51 - INFO - codeparrot_training - Step 31679: {'lr': 0.00014906279856722143, 'samples': 16220160, 'steps': 31679, 'batch_loss/train': 0.5458209747448564} 12/27/2021 03:21:01 - INFO - codeparrot_training - Step 31680: {'lr': 0.00014904837755110747, 'samples': 16220672, 'steps': 31680, 'batch_loss/train': 
0.8088277231436223} 12/27/2021 03:21:13 - INFO - codeparrot_training - Step 31681: {'lr': 0.00014903395693633895, 'samples': 16221184, 'steps': 31681, 'batch_loss/train': 0.7035886598750949} 12/27/2021 03:21:24 - INFO - codeparrot_training - Step 31682: {'lr': 0.0001490195367229734, 'samples': 16221696, 'steps': 31682, 'batch_loss/train': 0.8019968699663877} 12/27/2021 03:21:35 - INFO - codeparrot_training - Step 31683: {'lr': 0.000149005116911068, 'samples': 16222208, 'steps': 31683, 'batch_loss/train': 0.9524455554783344} 12/27/2021 03:21:47 - INFO - codeparrot_training - Step 31684: {'lr': 0.00014899069750068015, 'samples': 16222720, 'steps': 31684, 'batch_loss/train': 0.9131941958330572} 12/27/2021 03:21:57 - INFO - codeparrot_training - Step 31685: {'lr': 0.00014897627849186706, 'samples': 16223232, 'steps': 31685, 'batch_loss/train': 0.7217303784564137} 12/27/2021 03:22:08 - INFO - codeparrot_training - Step 31686: {'lr': 0.00014896185988468619, 'samples': 16223744, 'steps': 31686, 'batch_loss/train': 0.8439891124144197} 12/27/2021 03:22:19 - INFO - codeparrot_training - Step 31687: {'lr': 0.00014894744167919487, 'samples': 16224256, 'steps': 31687, 'batch_loss/train': 0.6808542096987367} 12/27/2021 03:22:32 - INFO - codeparrot_training - Step 31688: {'lr': 0.00014893302387545025, 'samples': 16224768, 'steps': 31688, 'batch_loss/train': 0.7292704321444035} 12/27/2021 03:22:43 - INFO - codeparrot_training - Step 31689: {'lr': 0.0001489186064735098, 'samples': 16225280, 'steps': 31689, 'batch_loss/train': 0.668515489436686} 12/27/2021 03:22:54 - INFO - codeparrot_training - Step 31690: {'lr': 0.0001489041894734309, 'samples': 16225792, 'steps': 31690, 'batch_loss/train': 0.8133476856164634} 12/27/2021 03:23:06 - INFO - codeparrot_training - Step 31691: {'lr': 0.00014888977287527073, 'samples': 16226304, 'steps': 31691, 'batch_loss/train': 0.7447798843495548} 12/27/2021 03:23:16 - INFO - codeparrot_training - Step 31692: {'lr': 0.00014887535667908658, 'samples': 
16226816, 'steps': 31692, 'batch_loss/train': 0.6655381498858333} 12/27/2021 03:23:27 - INFO - codeparrot_training - Step 31693: {'lr': 0.00014886094088493583, 'samples': 16227328, 'steps': 31693, 'batch_loss/train': 1.056951131671667} 12/27/2021 03:23:39 - INFO - codeparrot_training - Step 31694: {'lr': 0.00014884652549287592, 'samples': 16227840, 'steps': 31694, 'batch_loss/train': 0.7120602265931666} 12/27/2021 03:23:50 - INFO - codeparrot_training - Step 31695: {'lr': 0.0001488321105029639, 'samples': 16228352, 'steps': 31695, 'batch_loss/train': 0.8751148879528046} 12/27/2021 03:24:00 - INFO - codeparrot_training - Step 31696: {'lr': 0.0001488176959152572, 'samples': 16228864, 'steps': 31696, 'batch_loss/train': 0.6865968115162104} 12/27/2021 03:24:11 - INFO - codeparrot_training - Step 31697: {'lr': 0.00014880328172981323, 'samples': 16229376, 'steps': 31697, 'batch_loss/train': 0.7452032589353621} 12/27/2021 03:24:23 - INFO - codeparrot_training - Step 31698: {'lr': 0.00014878886794668917, 'samples': 16229888, 'steps': 31698, 'batch_loss/train': 0.7323186611756682} 12/27/2021 03:24:34 - INFO - codeparrot_training - Step 31699: {'lr': 0.00014877445456594235, 'samples': 16230400, 'steps': 31699, 'batch_loss/train': 0.8630652893334627} 12/27/2021 03:24:44 - INFO - codeparrot_training - Step 31700: {'lr': 0.00014876004158763002, 'samples': 16230912, 'steps': 31700, 'batch_loss/train': 0.828066770453006} 12/27/2021 03:24:58 - INFO - codeparrot_training - Step 31701: {'lr': 0.00014874562901180967, 'samples': 16231424, 'steps': 31701, 'batch_loss/train': 0.7571679018437862} 12/27/2021 03:25:09 - INFO - codeparrot_training - Step 31702: {'lr': 0.00014873121683853835, 'samples': 16231936, 'steps': 31702, 'batch_loss/train': 0.8093816619366407} 12/27/2021 03:25:20 - INFO - codeparrot_training - Step 31703: {'lr': 0.00014871680506787344, 'samples': 16232448, 'steps': 31703, 'batch_loss/train': 0.7825202029198408} 12/27/2021 03:25:32 - INFO - codeparrot_training - Step 
31704: {'lr': 0.00014870239369987239, 'samples': 16232960, 'steps': 31704, 'batch_loss/train': 0.7030838737264276} 12/27/2021 03:25:42 - INFO - codeparrot_training - Step 31705: {'lr': 0.00014868798273459232, 'samples': 16233472, 'steps': 31705, 'batch_loss/train': 0.5823993457888719} 12/27/2021 03:25:53 - INFO - codeparrot_training - Step 31706: {'lr': 0.00014867357217209062, 'samples': 16233984, 'steps': 31706, 'batch_loss/train': 0.7130589609732851} 12/27/2021 03:26:04 - INFO - codeparrot_training - Step 31707: {'lr': 0.00014865916201242443, 'samples': 16234496, 'steps': 31707, 'batch_loss/train': 0.6426851185970008} 12/27/2021 03:26:16 - INFO - codeparrot_training - Step 31708: {'lr': 0.0001486447522556512, 'samples': 16235008, 'steps': 31708, 'batch_loss/train': 0.7915622061118484} 12/27/2021 03:26:26 - INFO - codeparrot_training - Step 31709: {'lr': 0.00014863034290182826, 'samples': 16235520, 'steps': 31709, 'batch_loss/train': 0.7649353607557714} 12/27/2021 03:26:37 - INFO - codeparrot_training - Step 31710: {'lr': 0.00014861593395101274, 'samples': 16236032, 'steps': 31710, 'batch_loss/train': 0.8822889509610832} 12/27/2021 03:26:51 - INFO - codeparrot_training - Step 31711: {'lr': 0.0001486015254032619, 'samples': 16236544, 'steps': 31711, 'batch_loss/train': 0.8615020159631968} 12/27/2021 03:27:01 - INFO - codeparrot_training - Step 31712: {'lr': 0.00014858711725863316, 'samples': 16237056, 'steps': 31712, 'batch_loss/train': 0.5967396515188739} 12/27/2021 03:27:12 - INFO - codeparrot_training - Step 31713: {'lr': 0.00014857270951718383, 'samples': 16237568, 'steps': 31713, 'batch_loss/train': 0.7788412794470787} 12/27/2021 03:27:24 - INFO - codeparrot_training - Step 31714: {'lr': 0.000148558302178971, 'samples': 16238080, 'steps': 31714, 'batch_loss/train': 0.8279923759400845} 12/27/2021 03:27:35 - INFO - codeparrot_training - Step 31715: {'lr': 0.00014854389524405207, 'samples': 16238592, 'steps': 31715, 'batch_loss/train': 0.7971275905147195} 
12/27/2021 03:27:45 - INFO - codeparrot_training - Step 31716: {'lr': 0.0001485294887124844, 'samples': 16239104, 'steps': 31716, 'batch_loss/train': 0.6763803726062179} 12/27/2021 03:27:59 - INFO - codeparrot_training - Step 31717: {'lr': 0.0001485150825843251, 'samples': 16239616, 'steps': 31717, 'batch_loss/train': 0.6625431397696957} 12/27/2021 03:28:10 - INFO - codeparrot_training - Step 31718: {'lr': 0.00014850067685963142, 'samples': 16240128, 'steps': 31718, 'batch_loss/train': 0.7458411380648613} 12/27/2021 03:28:20 - INFO - codeparrot_training - Step 31719: {'lr': 0.00014848627153846088, 'samples': 16240640, 'steps': 31719, 'batch_loss/train': 0.6833106358535588} 12/27/2021 03:28:31 - INFO - codeparrot_training - Step 31720: {'lr': 0.0001484718666208706, 'samples': 16241152, 'steps': 31720, 'batch_loss/train': 0.7540038451552391} 12/27/2021 03:28:43 - INFO - codeparrot_training - Step 31721: {'lr': 0.00014845746210691773, 'samples': 16241664, 'steps': 31721, 'batch_loss/train': 0.80637177079916} 12/27/2021 03:28:54 - INFO - codeparrot_training - Step 31722: {'lr': 0.00014844305799665967, 'samples': 16242176, 'steps': 31722, 'batch_loss/train': 0.8535362305119634} 12/27/2021 03:29:04 - INFO - codeparrot_training - Step 31723: {'lr': 0.00014842865429015372, 'samples': 16242688, 'steps': 31723, 'batch_loss/train': 0.8222441552206874} 12/27/2021 03:29:16 - INFO - codeparrot_training - Step 31724: {'lr': 0.00014841425098745706, 'samples': 16243200, 'steps': 31724, 'batch_loss/train': 0.6701649494934827} 12/27/2021 03:29:27 - INFO - codeparrot_training - Step 31725: {'lr': 0.00014839984808862693, 'samples': 16243712, 'steps': 31725, 'batch_loss/train': 0.7294413540512323} 12/27/2021 03:29:37 - INFO - codeparrot_training - Step 31726: {'lr': 0.00014838544559372076, 'samples': 16244224, 'steps': 31726, 'batch_loss/train': 0.777249657548964} 12/27/2021 03:29:51 - INFO - codeparrot_training - Step 31727: {'lr': 0.0001483710435027956, 'samples': 16244736, 'steps': 
31727, 'batch_loss/train': 0.7059426745399833} 12/27/2021 03:30:02 - INFO - codeparrot_training - Step 31728: {'lr': 0.00014835664181590886, 'samples': 16245248, 'steps': 31728, 'batch_loss/train': 0.7440544553101063} 12/27/2021 03:30:13 - INFO - codeparrot_training - Step 31729: {'lr': 0.00014834224053311763, 'samples': 16245760, 'steps': 31729, 'batch_loss/train': 0.8080921405926347} 12/27/2021 03:30:23 - INFO - codeparrot_training - Step 31730: {'lr': 0.00014832783965447937, 'samples': 16246272, 'steps': 31730, 'batch_loss/train': 0.7977205272763968} 12/27/2021 03:30:36 - INFO - codeparrot_training - Step 31731: {'lr': 0.0001483134391800512, 'samples': 16246784, 'steps': 31731, 'batch_loss/train': 0.9062856948003173} 12/27/2021 03:30:46 - INFO - codeparrot_training - Step 31732: {'lr': 0.00014829903910989039, 'samples': 16247296, 'steps': 31732, 'batch_loss/train': 1.1271203542128205} 12/27/2021 03:30:57 - INFO - codeparrot_training - Step 31733: {'lr': 0.00014828463944405418, 'samples': 16247808, 'steps': 31733, 'batch_loss/train': 0.7566627925261855} 12/27/2021 03:31:09 - INFO - codeparrot_training - Step 31734: {'lr': 0.00014827024018259982, 'samples': 16248320, 'steps': 31734, 'batch_loss/train': 1.00788001017645} 12/27/2021 03:31:20 - INFO - codeparrot_training - Step 31735: {'lr': 0.00014825584132558466, 'samples': 16248832, 'steps': 31735, 'batch_loss/train': 0.6892585186287761} 12/27/2021 03:31:30 - INFO - codeparrot_training - Step 31736: {'lr': 0.00014824144287306585, 'samples': 16249344, 'steps': 31736, 'batch_loss/train': 0.6940071559511125} 12/27/2021 03:31:42 - INFO - codeparrot_training - Step 31737: {'lr': 0.00014822704482510057, 'samples': 16249856, 'steps': 31737, 'batch_loss/train': 0.8178597348742187} 12/27/2021 03:31:53 - INFO - codeparrot_training - Step 31738: {'lr': 0.00014821264718174615, 'samples': 16250368, 'steps': 31738, 'batch_loss/train': 0.742019722238183} 12/27/2021 03:32:03 - INFO - codeparrot_training - Step 31739: {'lr': 
0.00014819824994305987, 'samples': 16250880, 'steps': 31739, 'batch_loss/train': 0.7447379338555038}
12/27/2021 03:32:14 - INFO - codeparrot_training - Step 31740: {'lr': 0.0001481838531090988, 'samples': 16251392, 'steps': 31740, 'batch_loss/train': 0.7930385824292898}
12/27/2021 03:32:28 - INFO - codeparrot_training - Step 31741: {'lr': 0.00014816945667992033, 'samples': 16251904, 'steps': 31741, 'batch_loss/train': 0.7752768516074866}
12/27/2021 03:32:39 - INFO - codeparrot_training - Step 31742: {'lr': 0.0001481550606555817, 'samples': 16252416, 'steps': 31742, 'batch_loss/train': 0.6725230333395302}
12/27/2021 03:32:49 - INFO - codeparrot_training - Step 31743: {'lr': 0.00014814066503614006, 'samples': 16252928, 'steps': 31743, 'batch_loss/train': 0.7175064971670508}
12/27/2021 03:33:01 - INFO - codeparrot_training - Step 31744: {'lr': 0.00014812626982165262, 'samples': 16253440, 'steps': 31744, 'batch_loss/train': 0.7106685219332576}
12/27/2021 03:33:12 - INFO - codeparrot_training - Step 31745: {'lr': 0.00014811187501217676, 'samples': 16253952, 'steps': 31745, 'batch_loss/train': 0.8797373194247484}
12/27/2021 03:33:22 - INFO - codeparrot_training - Step 31746: {'lr': 0.00014809748060776962, 'samples': 16254464, 'steps': 31746, 'batch_loss/train': 0.78371995780617}
12/27/2021 03:33:36 - INFO - codeparrot_training - Step 31747: {'lr': 0.0001480830866084884, 'samples': 16254976, 'steps': 31747, 'batch_loss/train': 0.836367683019489}
12/27/2021 03:33:47 - INFO - codeparrot_training - Step 31748: {'lr': 0.00014806869301439024, 'samples': 16255488, 'steps': 31748, 'batch_loss/train': 0.7841380201280117}
12/27/2021 03:33:57 - INFO - codeparrot_training - Step 31749: {'lr': 0.0001480542998255326, 'samples': 16256000, 'steps': 31749, 'batch_loss/train': 0.7750903917476535}
12/27/2021 03:34:10 - INFO - codeparrot_training - Step 31750: {'lr': 0.00014803990704197252, 'samples': 16256512, 'steps': 31750, 'batch_loss/train': 0.7492283419705927}
12/27/2021 03:34:20 - INFO - codeparrot_training - Step 31751: {'lr': 0.0001480255146637672, 'samples': 16257024, 'steps': 31751, 'batch_loss/train': 0.6940749790519476}
12/27/2021 03:34:31 - INFO - codeparrot_training - Step 31752: {'lr': 0.00014801112269097406, 'samples': 16257536, 'steps': 31752, 'batch_loss/train': 0.7272526025772095}
12/27/2021 03:34:42 - INFO - codeparrot_training - Step 31753: {'lr': 0.0001479967311236501, 'samples': 16258048, 'steps': 31753, 'batch_loss/train': 0.5883333316887729}
12/27/2021 03:34:54 - INFO - codeparrot_training - Step 31754: {'lr': 0.00014798233996185273, 'samples': 16258560, 'steps': 31754, 'batch_loss/train': 0.7717536836862564}
12/27/2021 03:35:05 - INFO - codeparrot_training - Step 31755: {'lr': 0.00014796794920563888, 'samples': 16259072, 'steps': 31755, 'batch_loss/train': 0.6680816453881562}
12/27/2021 03:35:15 - INFO - codeparrot_training - Step 31756: {'lr': 0.0001479535588550661, 'samples': 16259584, 'steps': 31756, 'batch_loss/train': 0.7600958133116364}
12/27/2021 03:35:29 - INFO - codeparrot_training - Step 31757: {'lr': 0.00014793916891019137, 'samples': 16260096, 'steps': 31757, 'batch_loss/train': 0.6905246968381107}
12/27/2021 03:35:40 - INFO - codeparrot_training - Step 31758: {'lr': 0.000147924779371072, 'samples': 16260608, 'steps': 31758, 'batch_loss/train': 0.6729814801365137}
12/27/2021 03:35:50 - INFO - codeparrot_training - Step 31759: {'lr': 0.0001479103902377651, 'samples': 16261120, 'steps': 31759, 'batch_loss/train': 0.7228317447006702}
12/27/2021 03:36:03 - INFO - codeparrot_training - Step 31760: {'lr': 0.00014789600151032795, 'samples': 16261632, 'steps': 31760, 'batch_loss/train': 0.6921065129572526}
12/27/2021 03:36:13 - INFO - codeparrot_training - Step 31761: {'lr': 0.00014788161318881783, 'samples': 16262144, 'steps': 31761, 'batch_loss/train': 0.7940711071714759}
12/27/2021 03:36:24 - INFO - codeparrot_training - Step 31762: {'lr': 0.0001478672252732917, 'samples': 16262656, 'steps': 31762, 'batch_loss/train': 0.7277129460126162}
12/27/2021 03:36:34 - INFO - codeparrot_training - Step 31763: {'lr': 0.00014785283776380702, 'samples': 16263168, 'steps': 31763, 'batch_loss/train': 0.9660350037738681}
12/27/2021 03:36:47 - INFO - codeparrot_training - Step 31764: {'lr': 0.00014783845066042094, 'samples': 16263680, 'steps': 31764, 'batch_loss/train': 0.7470192485488951}
12/27/2021 03:36:58 - INFO - codeparrot_training - Step 31765: {'lr': 0.00014782406396319058, 'samples': 16264192, 'steps': 31765, 'batch_loss/train': 0.8828006908297539}
12/27/2021 03:37:08 - INFO - codeparrot_training - Step 31766: {'lr': 0.0001478096776721731, 'samples': 16264704, 'steps': 31766, 'batch_loss/train': 0.6793643324635923}
12/27/2021 03:37:20 - INFO - codeparrot_training - Step 31767: {'lr': 0.00014779529178742578, 'samples': 16265216, 'steps': 31767, 'batch_loss/train': 0.8120659259147942}
12/27/2021 03:37:31 - INFO - codeparrot_training - Step 31768: {'lr': 0.00014778090630900584, 'samples': 16265728, 'steps': 31768, 'batch_loss/train': 0.6179751014860813}
12/27/2021 03:37:42 - INFO - codeparrot_training - Step 31769: {'lr': 0.00014776652123697038, 'samples': 16266240, 'steps': 31769, 'batch_loss/train': 0.8644246142357588}
12/27/2021 03:37:54 - INFO - codeparrot_training - Step 31770: {'lr': 0.00014775213657137657, 'samples': 16266752, 'steps': 31770, 'batch_loss/train': 0.8037289590574801}
12/27/2021 03:38:04 - INFO - codeparrot_training - Step 31771: {'lr': 0.00014773775231228175, 'samples': 16267264, 'steps': 31771, 'batch_loss/train': 0.7276228396221995}
12/27/2021 03:38:15 - INFO - codeparrot_training - Step 31772: {'lr': 0.00014772336845974304, 'samples': 16267776, 'steps': 31772, 'batch_loss/train': 0.8344570277258754}
12/27/2021 03:38:26 - INFO - codeparrot_training - Step 31773: {'lr': 0.00014770898501381753, 'samples': 16268288, 'steps': 31773, 'batch_loss/train': 0.7389940937282518}
12/27/2021 03:38:38 - INFO - codeparrot_training - Step 31774: {'lr': 0.00014769460197456243, 'samples': 16268800, 'steps': 31774, 'batch_loss/train': 0.7734876908361912}
12/27/2021 03:38:48 - INFO - codeparrot_training - Step 31775: {'lr': 0.00014768021934203504, 'samples': 16269312, 'steps': 31775, 'batch_loss/train': 0.7662241221405566}
12/27/2021 03:38:59 - INFO - codeparrot_training - Step 31776: {'lr': 0.00014766583711629244, 'samples': 16269824, 'steps': 31776, 'batch_loss/train': 0.7375223878771067}
12/27/2021 03:39:12 - INFO - codeparrot_training - Step 31777: {'lr': 0.00014765145529739177, 'samples': 16270336, 'steps': 31777, 'batch_loss/train': 0.6999512831680477}
12/27/2021 03:39:22 - INFO - codeparrot_training - Step 31778: {'lr': 0.00014763707388539034, 'samples': 16270848, 'steps': 31778, 'batch_loss/train': 0.7523617250844836}
12/27/2021 03:39:33 - INFO - codeparrot_training - Step 31779: {'lr': 0.00014762269288034525, 'samples': 16271360, 'steps': 31779, 'batch_loss/train': 0.8029182502068579}
12/27/2021 03:39:45 - INFO - codeparrot_training - Step 31780: {'lr': 0.00014760831228231372, 'samples': 16271872, 'steps': 31780, 'batch_loss/train': 0.8113296162337065}
12/27/2021 03:39:56 - INFO - codeparrot_training - Step 31781: {'lr': 0.00014759393209135276, 'samples': 16272384, 'steps': 31781, 'batch_loss/train': 0.7803838532418013}
12/27/2021 03:40:06 - INFO - codeparrot_training - Step 31782: {'lr': 0.0001475795523075198, 'samples': 16272896, 'steps': 31782, 'batch_loss/train': 0.6792642939835787}
12/27/2021 03:40:18 - INFO - codeparrot_training - Step 31783: {'lr': 0.00014756517293087177, 'samples': 16273408, 'steps': 31783, 'batch_loss/train': 0.8317977599799633}
12/27/2021 03:40:29 - INFO - codeparrot_training - Step 31784: {'lr': 0.00014755079396146597, 'samples': 16273920, 'steps': 31784, 'batch_loss/train': 0.8894432339584455}
12/27/2021 03:40:40 - INFO - codeparrot_training - Step 31785: {'lr': 0.00014753641539935947, 'samples': 16274432, 'steps': 31785, 'batch_loss/train': 0.7732839724048972}
12/27/2021 03:40:50 - INFO - codeparrot_training - Step 31786: {'lr': 0.00014752203724460955, 'samples': 16274944, 'steps': 31786, 'batch_loss/train': 0.7276667198166251}
12/27/2021 03:41:03 - INFO - codeparrot_training - Step 31787: {'lr': 0.00014750765949727334, 'samples': 16275456, 'steps': 31787, 'batch_loss/train': 0.5757452072575688}
12/27/2021 03:41:14 - INFO - codeparrot_training - Step 31788: {'lr': 0.00014749328215740788, 'samples': 16275968, 'steps': 31788, 'batch_loss/train': 0.7184634902514517}
12/27/2021 03:41:24 - INFO - codeparrot_training - Step 31789: {'lr': 0.00014747890522507048, 'samples': 16276480, 'steps': 31789, 'batch_loss/train': 0.7106884493259713}
12/27/2021 03:41:36 - INFO - codeparrot_training - Step 31790: {'lr': 0.0001474645287003183, 'samples': 16276992, 'steps': 31790, 'batch_loss/train': 0.711998981423676}
12/27/2021 03:41:47 - INFO - codeparrot_training - Step 31791: {'lr': 0.00014745015258320842, 'samples': 16277504, 'steps': 31791, 'batch_loss/train': 0.7504907064139843}
12/27/2021 03:41:58 - INFO - codeparrot_training - Step 31792: {'lr': 0.00014743577687379794, 'samples': 16278016, 'steps': 31792, 'batch_loss/train': 0.7860930059105158}
12/27/2021 03:42:10 - INFO - codeparrot_training - Step 31793: {'lr': 0.00014742140157214412, 'samples': 16278528, 'steps': 31793, 'batch_loss/train': 0.6986489426344633}
12/27/2021 03:42:21 - INFO - codeparrot_training - Step 31794: {'lr': 0.00014740702667830413, 'samples': 16279040, 'steps': 31794, 'batch_loss/train': 0.823252015048638}
12/27/2021 03:42:32 - INFO - codeparrot_training - Step 31795: {'lr': 0.000147392652192335, 'samples': 16279552, 'steps': 31795, 'batch_loss/train': 1.4886555215343833}
12/27/2021 03:42:42 - INFO - codeparrot_training - Step 31796: {'lr': 0.0001473782781142939, 'samples': 16280064, 'steps': 31796, 'batch_loss/train': 0.7669469905085862}
12/27/2021 03:42:54 - INFO - codeparrot_training - Step 31797: {'lr': 0.0001473639044442381, 'samples': 16280576, 'steps': 31797, 'batch_loss/train': 0.7204260984435678}
12/27/2021 03:43:05 - INFO - codeparrot_training - Step 31798: {'lr': 0.00014734953118222466, 'samples': 16281088, 'steps': 31798, 'batch_loss/train': 0.777967946138233}
12/27/2021 03:43:16 - INFO - codeparrot_training - Step 31799: {'lr': 0.00014733515832831062, 'samples': 16281600, 'steps': 31799, 'batch_loss/train': 0.7459743227809668}
12/27/2021 03:43:28 - INFO - codeparrot_training - Step 31800: {'lr': 0.00014732078588255336, 'samples': 16282112, 'steps': 31800, 'batch_loss/train': 0.6618986083194613}
12/27/2021 03:43:38 - INFO - codeparrot_training - Step 31801: {'lr': 0.00014730641384500987, 'samples': 16282624, 'steps': 31801, 'batch_loss/train': 0.7434786017984152}
12/27/2021 03:43:49 - INFO - codeparrot_training - Step 31802: {'lr': 0.00014729204221573727, 'samples': 16283136, 'steps': 31802, 'batch_loss/train': 0.8470667889341712}
12/27/2021 03:44:02 - INFO - codeparrot_training - Step 31803: {'lr': 0.00014727767099479268, 'samples': 16283648, 'steps': 31803, 'batch_loss/train': 0.7849399792030454}
12/27/2021 03:44:12 - INFO - codeparrot_training - Step 31804: {'lr': 0.00014726330018223333, 'samples': 16284160, 'steps': 31804, 'batch_loss/train': 0.6796897798776627}
12/27/2021 03:44:23 - INFO - codeparrot_training - Step 31805: {'lr': 0.0001472489297781163, 'samples': 16284672, 'steps': 31805, 'batch_loss/train': 0.6817664848640561}
12/27/2021 03:44:34 - INFO - codeparrot_training - Step 31806: {'lr': 0.00014723455978249878, 'samples': 16285184, 'steps': 31806, 'batch_loss/train': 0.6494614311959594}
12/27/2021 03:44:46 - INFO - codeparrot_training - Step 31807: {'lr': 0.00014722019019543776, 'samples': 16285696, 'steps': 31807, 'batch_loss/train': 0.8497034460306168}
12/27/2021 03:44:56 - INFO - codeparrot_training - Step 31808: {'lr': 0.00014720582101699053, 'samples': 16286208, 'steps': 31808, 'batch_loss/train': 0.8771047024056315}
12/27/2021 03:45:07 - INFO - codeparrot_training - Step 31809: {'lr': 0.0001471914522472141, 'samples': 16286720, 'steps': 31809, 'batch_loss/train': 0.8233612468466163}
12/27/2021 03:45:20 - INFO - codeparrot_training - Step 31810: {'lr': 0.0001471770838861657, 'samples': 16287232, 'steps': 31810, 'batch_loss/train': 0.6766634145751595}
12/27/2021 03:45:30 - INFO - codeparrot_training - Step 31811: {'lr': 0.00014716271593390228, 'samples': 16287744, 'steps': 31811, 'batch_loss/train': 0.7130514550954103}
12/27/2021 03:45:41 - INFO - codeparrot_training - Step 31812: {'lr': 0.0001471483483904811, 'samples': 16288256, 'steps': 31812, 'batch_loss/train': 0.7606725795194507}
12/27/2021 03:45:53 - INFO - codeparrot_training - Step 31813: {'lr': 0.00014713398125595933, 'samples': 16288768, 'steps': 31813, 'batch_loss/train': 0.7772129331715405}
12/27/2021 03:46:04 - INFO - codeparrot_training - Step 31814: {'lr': 0.0001471196145303939, 'samples': 16289280, 'steps': 31814, 'batch_loss/train': 0.6653079318348318}
12/27/2021 03:46:14 - INFO - codeparrot_training - Step 31815: {'lr': 0.00014710524821384208, 'samples': 16289792, 'steps': 31815, 'batch_loss/train': 0.9863593950867653}
12/27/2021 03:46:25 - INFO - codeparrot_training - Step 31816: {'lr': 0.000147090882306361, 'samples': 16290304, 'steps': 31816, 'batch_loss/train': 0.8456998120527714}
12/27/2021 03:46:37 - INFO - codeparrot_training - Step 31817: {'lr': 0.0001470765168080077, 'samples': 16290816, 'steps': 31817, 'batch_loss/train': 0.710507012438029}
12/27/2021 03:46:48 - INFO - codeparrot_training - Step 31818: {'lr': 0.00014706215171883926, 'samples': 16291328, 'steps': 31818, 'batch_loss/train': 0.6500339657068253}
12/27/2021 03:46:59 - INFO - codeparrot_training - Step 31819: {'lr': 0.0001470477870389128, 'samples': 16291840, 'steps': 31819, 'batch_loss/train': 0.7619247944094241}
12/27/2021 03:47:11 - INFO - codeparrot_training - Step 31820: {'lr': 0.00014703342276828558, 'samples': 16292352, 'steps': 31820, 'batch_loss/train': 0.8098590113222599}
12/27/2021 03:47:22 - INFO - codeparrot_training - Step 31821: {'lr': 0.0001470190589070145, 'samples': 16292864, 'steps': 31821, 'batch_loss/train': 0.5406850918370765}
12/27/2021 03:47:32 - INFO - codeparrot_training - Step 31822: {'lr': 0.00014700469545515676, 'samples': 16293376, 'steps': 31822, 'batch_loss/train': 0.6650030773598701}
12/27/2021 03:47:45 - INFO - codeparrot_training - Step 31823: {'lr': 0.00014699033241276954, 'samples': 16293888, 'steps': 31823, 'batch_loss/train': 0.7995863249525428}
12/27/2021 03:47:55 - INFO - codeparrot_training - Step 31824: {'lr': 0.00014697596977990979, 'samples': 16294400, 'steps': 31824, 'batch_loss/train': 0.7086032410152256}
12/27/2021 03:48:06 - INFO - codeparrot_training - Step 31825: {'lr': 0.00014696160755663465, 'samples': 16294912, 'steps': 31825, 'batch_loss/train': 0.6762685319408774}
12/27/2021 03:48:17 - INFO - codeparrot_training - Step 31826: {'lr': 0.00014694724574300136, 'samples': 16295424, 'steps': 31826, 'batch_loss/train': 0.7346733435988426}
12/27/2021 03:48:29 - INFO - codeparrot_training - Step 31827: {'lr': 0.0001469328843390669, 'samples': 16295936, 'steps': 31827, 'batch_loss/train': 0.8530834997072816}
12/27/2021 03:48:39 - INFO - codeparrot_training - Step 31828: {'lr': 0.00014691852334488832, 'samples': 16296448, 'steps': 31828, 'batch_loss/train': 0.7092279838398099}
12/27/2021 03:48:50 - INFO - codeparrot_training - Step 31829: {'lr': 0.00014690416276052273, 'samples': 16296960, 'steps': 31829, 'batch_loss/train': 0.7751464295433834}
12/27/2021 03:49:02 - INFO - codeparrot_training - Step 31830: {'lr': 0.00014688980258602735, 'samples': 16297472, 'steps': 31830, 'batch_loss/train': 0.6414663014002144}
12/27/2021 03:49:13 - INFO - codeparrot_training - Step 31831: {'lr': 0.00014687544282145914, 'samples': 16297984, 'steps': 31831, 'batch_loss/train': 0.8453071312978864}
12/27/2021 03:49:23 - INFO - codeparrot_training - Step 31832: {'lr': 0.00014686108346687529, 'samples': 16298496, 'steps': 31832, 'batch_loss/train': 0.7563894763588905}
12/27/2021 03:49:36 - INFO - codeparrot_training - Step 31833: {'lr': 0.0001468467245223327, 'samples': 16299008, 'steps': 31833, 'batch_loss/train': 0.8147904565557837}
12/27/2021 03:49:47 - INFO - codeparrot_training - Step 31834: {'lr': 0.00014683236598788868, 'samples': 16299520, 'steps': 31834, 'batch_loss/train': 0.8050429481081665}
12/27/2021 03:49:57 - INFO - codeparrot_training - Step 31835: {'lr': 0.00014681800786360026, 'samples': 16300032, 'steps': 31835, 'batch_loss/train': 0.7384370155632496}
12/27/2021 03:50:10 - INFO - codeparrot_training - Step 31836: {'lr': 0.00014680365014952434, 'samples': 16300544, 'steps': 31836, 'batch_loss/train': 0.7763364547863603}
12/27/2021 03:50:20 - INFO - codeparrot_training - Step 31837: {'lr': 0.0001467892928457183, 'samples': 16301056, 'steps': 31837, 'batch_loss/train': 0.7426579194143414}
12/27/2021 03:50:31 - INFO - codeparrot_training - Step 31838: {'lr': 0.00014677493595223896, 'samples': 16301568, 'steps': 31838, 'batch_loss/train': 0.7647074116393924}
12/27/2021 03:50:42 - INFO - codeparrot_training - Step 31839: {'lr': 0.00014676057946914355, 'samples': 16302080, 'steps': 31839, 'batch_loss/train': 0.8184208071324974}
12/27/2021 03:50:54 - INFO - codeparrot_training - Step 31840: {'lr': 0.00014674622339648903, 'samples': 16302592, 'steps': 31840, 'batch_loss/train': 0.8736024396494031}
12/27/2021 03:51:05 - INFO - codeparrot_training - Step 31841: {'lr': 0.00014673186773433256, 'samples': 16303104, 'steps': 31841, 'batch_loss/train': 0.8056481243111193}
12/27/2021 03:51:16 - INFO - codeparrot_training - Step 31842: {'lr': 0.00014671751248273123, 'samples': 16303616, 'steps': 31842, 'batch_loss/train': 0.682413000613451}
12/27/2021 03:51:28 - INFO - codeparrot_training - Step 31843: {'lr': 0.00014670315764174206, 'samples': 16304128, 'steps': 31843, 'batch_loss/train': 0.5934856978128664}
12/27/2021 03:51:38 - INFO - codeparrot_training - Step 31844: {'lr': 0.00014668880321142206, 'samples': 16304640, 'steps': 31844, 'batch_loss/train': 0.7754579829052091}
12/27/2021 03:51:49 - INFO - codeparrot_training - Step 31845: {'lr': 0.0001466744491918285, 'samples': 16305152, 'steps': 31845, 'batch_loss/train': 1.024993128143251}
12/27/2021 03:52:01 - INFO - codeparrot_training - Step 31846: {'lr': 0.00014666009558301823, 'samples': 16305664, 'steps': 31846, 'batch_loss/train': 0.7702944269403815}
12/27/2021 03:52:12 - INFO - codeparrot_training - Step 31847: {'lr': 0.00014664574238504844, 'samples': 16306176, 'steps': 31847, 'batch_loss/train': 0.7868547644466162}
12/27/2021 03:52:23 - INFO - codeparrot_training - Step 31848: {'lr': 0.00014663138959797607, 'samples': 16306688, 'steps': 31848, 'batch_loss/train': 0.6642484983312897}
12/27/2021 03:52:33 - INFO - codeparrot_training - Step 31849: {'lr': 0.00014661703722185836, 'samples': 16307200, 'steps': 31849, 'batch_loss/train': 0.6248986299615353}
12/27/2021 03:52:46 - INFO - codeparrot_training - Step 31850: {'lr': 0.00014660268525675224, 'samples': 16307712, 'steps': 31850, 'batch_loss/train': 0.6798126986250281}
12/27/2021 03:52:57 - INFO - codeparrot_training - Step 31851: {'lr': 0.00014658833370271473, 'samples': 16308224, 'steps': 31851, 'batch_loss/train': 0.7671913150697947}
12/27/2021 03:53:07 - INFO - codeparrot_training - Step 31852: {'lr': 0.0001465739825598031, 'samples': 16308736, 'steps': 31852, 'batch_loss/train': 0.772369921207428}
12/27/2021 03:53:20 - INFO - codeparrot_training - Step 31853: {'lr': 0.0001465596318280742, 'samples': 16309248, 'steps': 31853, 'batch_loss/train': 0.728679892141372}
12/27/2021 03:53:31 - INFO - codeparrot_training - Step 31854: {'lr': 0.00014654528150758513, 'samples': 16309760, 'steps': 31854, 'batch_loss/train': 0.7942998632788658}
12/27/2021 03:53:41 - INFO - codeparrot_training - Step 31855: {'lr': 0.00014653093159839293, 'samples': 16310272, 'steps': 31855, 'batch_loss/train': 0.706990098580718}
12/27/2021 03:53:53 - INFO - codeparrot_training - Step 31856: {'lr': 0.00014651658210055473, 'samples': 16310784, 'steps': 31856, 'batch_loss/train': 1.1624716392252594}
12/27/2021 03:54:04 - INFO - codeparrot_training - Step 31857: {'lr': 0.0001465022330141275, 'samples': 16311296, 'steps': 31857, 'batch_loss/train': 0.780224627815187}
12/27/2021 03:54:15 - INFO - codeparrot_training - Step 31858: {'lr': 0.00014648788433916838, 'samples': 16311808, 'steps': 31858, 'batch_loss/train': 0.8894189577549696}
12/27/2021 03:54:25 - INFO - codeparrot_training - Step 31859: {'lr': 0.00014647353607573422, 'samples': 16312320, 'steps': 31859, 'batch_loss/train': 0.7548011164180934}
12/27/2021 03:54:37 - INFO - codeparrot_training - Step 31860: {'lr': 0.00014645918822388227, 'samples': 16312832, 'steps': 31860, 'batch_loss/train': 0.7251311384607106}
12/27/2021 03:54:48 - INFO - codeparrot_training - Step 31861: {'lr': 0.00014644484078366952, 'samples': 16313344, 'steps': 31861, 'batch_loss/train': 0.7070825295522809}
12/27/2021 03:54:59 - INFO - codeparrot_training - Step 31862: {'lr': 0.00014643049375515289, 'samples': 16313856, 'steps': 31862, 'batch_loss/train': 0.8560344823636115}
12/27/2021 03:55:11 - INFO - codeparrot_training - Step 31863: {'lr': 0.00014641614713838957, 'samples': 16314368, 'steps': 31863, 'batch_loss/train': 0.8497227462939918}
12/27/2021 03:55:22 - INFO - codeparrot_training - Step 31864: {'lr': 0.00014640180093343654, 'samples': 16314880, 'steps': 31864, 'batch_loss/train': 0.7192728687077761}
12/27/2021 03:55:33 - INFO - codeparrot_training - Step 31865: {'lr': 0.00014638745514035088, 'samples': 16315392, 'steps': 31865, 'batch_loss/train': 0.7193884460721165}
12/27/2021 03:55:45 - INFO - codeparrot_training - Step 31866: {'lr': 0.00014637310975918944, 'samples': 16315904, 'steps': 31866, 'batch_loss/train': 0.711571104824543}
12/27/2021 03:55:56 - INFO - codeparrot_training - Step 31867: {'lr': 0.00014635876479000948, 'samples': 16316416, 'steps': 31867, 'batch_loss/train': 0.7574204029515386}
12/27/2021 03:56:06 - INFO - codeparrot_training - Step 31868: {'lr': 0.00014634442023286797, 'samples': 16316928, 'steps': 31868, 'batch_loss/train': 0.7270896129775792}
12/27/2021 03:56:17 - INFO - codeparrot_training - Step 31869: {'lr': 0.00014633007608782185, 'samples': 16317440, 'steps': 31869, 'batch_loss/train': 0.7042636479018256}
12/27/2021 03:56:29 - INFO - codeparrot_training - Step 31870: {'lr': 0.00014631573235492814, 'samples': 16317952, 'steps': 31870, 'batch_loss/train': 0.7848080629482865}
12/27/2021 03:56:40 - INFO - codeparrot_training - Step 31871: {'lr': 0.00014630138903424405, 'samples': 16318464, 'steps': 31871, 'batch_loss/train': 0.7599697194527835}
12/27/2021 03:56:50 - INFO - codeparrot_training - Step 31872: {'lr': 0.00014628704612582648, 'samples': 16318976, 'steps': 31872, 'batch_loss/train': 0.781647764146328}
12/27/2021 03:57:03 - INFO - codeparrot_training - Step 31873: {'lr': 0.00014627270362973236, 'samples': 16319488, 'steps': 31873, 'batch_loss/train': 0.7487312600715086}
12/27/2021 03:57:14 - INFO - codeparrot_training - Step 31874: {'lr': 0.00014625836154601884, 'samples': 16320000, 'steps': 31874, 'batch_loss/train': 0.7714458405971527}
12/27/2021 03:57:24 - INFO - codeparrot_training - Step 31875: {'lr': 0.00014624401987474296, 'samples': 16320512, 'steps': 31875, 'batch_loss/train': 0.9513601567596197}
12/27/2021 03:57:37 - INFO - codeparrot_training - Step 31876: {'lr': 0.00014622967861596165, 'samples': 16321024, 'steps': 31876, 'batch_loss/train': 0.7000431376509368}
12/27/2021 03:57:47 - INFO - codeparrot_training - Step 31877: {'lr': 0.00014621533776973186, 'samples': 16321536, 'steps': 31877, 'batch_loss/train': 0.7194643935654312}
12/27/2021 03:57:58 - INFO - codeparrot_training - Step 31878: {'lr': 0.00014620099733611084, 'samples': 16322048, 'steps': 31878, 'batch_loss/train': 0.7721019824966788}
12/27/2021 03:58:09 - INFO - codeparrot_training - Step 31879: {'lr': 0.0001461866573151554, 'samples': 16322560, 'steps': 31879, 'batch_loss/train': 0.5679656567517668}
12/27/2021 03:58:21 - INFO - codeparrot_training - Step 31880: {'lr': 0.00014617231770692264, 'samples': 16323072, 'steps': 31880, 'batch_loss/train': 0.7106166505254805}
12/27/2021 03:58:32 - INFO - codeparrot_training - Step 31881: {'lr': 0.00014615797851146945, 'samples': 16323584, 'steps': 31881, 'batch_loss/train': 0.7093262923881412}
12/27/2021 03:58:42 - INFO - codeparrot_training - Step 31882: {'lr': 0.00014614363972885306, 'samples': 16324096, 'steps': 31882, 'batch_loss/train': 0.7116166427731514}
12/27/2021 03:58:55 - INFO - codeparrot_training - Step 31883: {'lr': 0.00014612930135913027, 'samples': 16324608, 'steps': 31883, 'batch_loss/train': 0.7922893278300762}
12/27/2021 03:59:05 - INFO - codeparrot_training - Step 31884: {'lr': 0.0001461149634023582, 'samples': 16325120, 'steps': 31884, 'batch_loss/train': 0.7382288193330169}
12/27/2021 03:59:16 - INFO - codeparrot_training - Step 31885: {'lr': 0.00014610062585859374, 'samples': 16325632, 'steps': 31885, 'batch_loss/train': 0.8411637023091316}
12/27/2021 03:59:28 - INFO - codeparrot_training - Step 31886: {'lr': 0.00014608628872789397, 'samples': 16326144, 'steps': 31886, 'batch_loss/train': 0.8442542399279773}
12/27/2021 03:59:39 - INFO - codeparrot_training - Step 31887: {'lr': 0.00014607195201031596, 'samples': 16326656, 'steps': 31887, 'batch_loss/train': 0.8398930821567774}
12/27/2021 03:59:49 - INFO - codeparrot_training - Step 31888: {'lr': 0.00014605761570591654, 'samples': 16327168, 'steps': 31888, 'batch_loss/train': 0.763575054705143}
12/27/2021 04:00:00 - INFO - codeparrot_training - Step 31889: {'lr': 0.0001460432798147528, 'samples': 16327680, 'steps': 31889, 'batch_loss/train': 0.7098903129808605}
12/27/2021 04:00:12 - INFO - codeparrot_training - Step 31890: {'lr': 0.0001460289443368818, 'samples': 16328192, 'steps': 31890, 'batch_loss/train': 0.6790118010248989}
12/27/2021 04:00:23 - INFO - codeparrot_training - Step 31891: {'lr': 0.00014601460927236043, 'samples': 16328704, 'steps': 31891, 'batch_loss/train': 0.7472548792138696}
12/27/2021 04:00:33 - INFO - codeparrot_training - Step 31892: {'lr': 0.00014600027462124566, 'samples': 16329216, 'steps': 31892, 'batch_loss/train': 0.6992622511461377}
12/27/2021 04:00:46 - INFO - codeparrot_training - Step 31893: {'lr': 0.00014598594038359454, 'samples': 16329728, 'steps': 31893, 'batch_loss/train': 0.6979976161092054}
12/27/2021 04:00:57 - INFO - codeparrot_training - Step 31894: {'lr': 0.00014597160655946412, 'samples': 16330240, 'steps': 31894, 'batch_loss/train': 0.7447692109271884}
12/27/2021 04:01:07 - INFO - codeparrot_training - Step 31895: {'lr': 0.00014595727314891126, 'samples': 16330752, 'steps': 31895, 'batch_loss/train': 0.5916073407279328}
12/27/2021 04:01:19 - INFO - codeparrot_training - Step 31896: {'lr': 0.00014594294015199297, 'samples': 16331264, 'steps': 31896, 'batch_loss/train': 0.7128315868321806}
12/27/2021 04:01:30 - INFO - codeparrot_training - Step 31897: {'lr': 0.0001459286075687663, 'samples': 16331776, 'steps': 31897, 'batch_loss/train': 0.8143412685021758}
12/27/2021 04:01:41 - INFO - codeparrot_training - Step 31898: {'lr': 0.00014591427539928814, 'samples': 16332288, 'steps': 31898, 'batch_loss/train': 0.6799070099368691}
12/27/2021 04:01:53 - INFO - codeparrot_training - Step 31899: {'lr': 0.0001458999436436156, 'samples': 16332800, 'steps': 31899, 'batch_loss/train': 0.7802020125091076}
12/27/2021 04:02:03 - INFO - codeparrot_training - Step 31900: {'lr': 0.0001458856123018055, 'samples': 16333312, 'steps': 31900, 'batch_loss/train': 0.7217562827281654}
12/27/2021 04:02:14 - INFO - codeparrot_training - Step 31901: {'lr': 0.00014587128137391492, 'samples': 16333824, 'steps': 31901, 'batch_loss/train': 0.7216942110098898}
12/27/2021 04:02:25 - INFO - codeparrot_training - Step 31902: {'lr': 0.00014585695086000088, 'samples': 16334336, 'steps': 31902, 'batch_loss/train': 0.6065724173095077}
12/27/2021 04:02:37 - INFO - codeparrot_training - Step 31903: {'lr': 0.00014584262076012015, 'samples': 16334848, 'steps': 31903, 'batch_loss/train': 0.6974895792081952}
12/27/2021 04:02:48 - INFO - codeparrot_training - Step 31904: {'lr': 0.00014582829107432987, 'samples': 16335360, 'steps': 31904, 'batch_loss/train': 0.7183791995048523}
12/27/2021 04:02:58 - INFO - codeparrot_training - Step 31905: {'lr': 0.00014581396180268712, 'samples': 16335872, 'steps': 31905, 'batch_loss/train': 0.832825199700892}
12/27/2021 04:03:11 - INFO - codeparrot_training - Step 31906: {'lr': 0.00014579963294524857, 'samples': 16336384, 'steps': 31906, 'batch_loss/train': 0.7168078317772597}
12/27/2021 04:03:21 - INFO - codeparrot_training - Step 31907: {'lr': 0.0001457853045020713, 'samples': 16336896, 'steps': 31907, 'batch_loss/train': 0.7170864334329963}
12/27/2021 04:03:32 - INFO - codeparrot_training - Step 31908: {'lr': 0.0001457709764732124, 'samples': 16337408, 'steps': 31908, 'batch_loss/train': 0.7776830308139324}
12/27/2021 04:03:45 - INFO - codeparrot_training - Step 31909: {'lr': 0.00014575664885872874, 'samples': 16337920, 'steps': 31909, 'batch_loss/train': 0.7674722499214113}
12/27/2021 04:03:55 - INFO - codeparrot_training - Step 31910: {'lr': 0.00014574232165867723, 'samples': 16338432, 'steps': 31910, 'batch_loss/train': 0.7922011539340019}
12/27/2021 04:04:06 - INFO - codeparrot_training - Step 31911: {'lr': 0.00014572799487311492, 'samples': 16338944, 'steps': 31911, 'batch_loss/train': 0.6858602399006486}
12/27/2021 04:04:17 - INFO - codeparrot_training - Step 31912: {'lr': 0.0001457136685020987, 'samples': 16339456, 'steps': 31912, 'batch_loss/train': 0.6687997803092003}
12/27/2021 04:04:29 - INFO - codeparrot_training - Step 31913: {'lr': 0.00014569934254568558, 'samples': 16339968, 'steps': 31913, 'batch_loss/train': 0.8424835996702313}
12/27/2021 04:04:39 - INFO - codeparrot_training - Step 31914: {'lr': 0.0001456850170039325, 'samples': 16340480, 'steps': 31914, 'batch_loss/train': 0.7690926534123719}
12/27/2021 04:04:50 - INFO - codeparrot_training - Step 31915: {'lr': 0.00014567069187689638, 'samples': 16340992, 'steps': 31915, 'batch_loss/train': 0.67406477231998}
12/27/2021 04:05:03 - INFO - codeparrot_training - Step 31916: {'lr': 0.0001456563671646342, 'samples': 16341504, 'steps': 31916, 'batch_loss/train': 0.7282918945420533}
12/27/2021 04:05:13 - INFO - codeparrot_training - Step 31917: {'lr': 0.000145642042867203, 'samples': 16342016, 'steps': 31917, 'batch_loss/train': 0.7654187821317464}
12/27/2021 04:05:24 - INFO - codeparrot_training - Step 31918: {'lr': 0.00014562771898465947, 'samples': 16342528, 'steps': 31918, 'batch_loss/train': 0.8673186041414738}
12/27/2021 04:05:36 - INFO - codeparrot_training - Step 31919: {'lr': 0.00014561339551706086, 'samples': 16343040, 'steps': 31919, 'batch_loss/train': 0.5356267626048066}
12/27/2021 04:05:47 - INFO - codeparrot_training - Step 31920: {'lr': 0.0001455990724644639, 'samples': 16343552, 'steps': 31920, 'batch_loss/train': 0.7468445715494454}
12/27/2021 04:05:58 - INFO - codeparrot_training - Step 31921: {'lr': 0.00014558474982692566, 'samples': 16344064, 'steps': 31921, 'batch_loss/train': 0.7571802595630288}
12/27/2021 04:06:10 - INFO - codeparrot_training - Step 31922: {'lr': 0.00014557042760450294, 'samples': 16344576, 'steps': 31922, 'batch_loss/train': 0.7310304092243314}
12/27/2021 04:06:21 - INFO - codeparrot_training - Step 31923: {'lr': 0.00014555610579725285, 'samples': 16345088, 'steps': 31923, 'batch_loss/train': 0.7035038662143052}
12/27/2021 04:06:31 - INFO - codeparrot_training - Step 31924: {'lr': 0.00014554178440523218, 'samples': 16345600, 'steps': 31924, 'batch_loss/train': 0.7212524614296854}
12/27/2021 04:06:42 - INFO - codeparrot_training - Step 31925: {'lr': 0.00014552746342849797, 'samples': 16346112, 'steps': 31925, 'batch_loss/train': 0.8436366748064756}
12/27/2021 04:06:54 - INFO - codeparrot_training - Step 31926: {'lr': 0.00014551314286710712, 'samples': 16346624, 'steps': 31926, 'batch_loss/train': 0.7359624085947871}
12/27/2021 04:07:05 - INFO - codeparrot_training - Step 31927: {'lr': 0.00014549882272111656, 'samples': 16347136, 'steps': 31927, 'batch_loss/train': 0.6669484761077911}
12/27/2021 04:07:16 - INFO - codeparrot_training - Step 31928: {'lr': 0.0001454845029905833, 'samples': 16347648, 'steps': 31928, 'batch_loss/train': 0.6063989468966611}
12/27/2021 04:07:28 - INFO - codeparrot_training - Step 31929: {'lr': 0.00014547018367556402, 'samples': 16348160, 'steps': 31929, 'batch_loss/train': 0.718155836686492}
12/27/2021 04:07:38 - INFO - codeparrot_training - Step 31930: {'lr': 0.0001454558647761159, 'samples': 16348672, 'steps': 31930, 'batch_loss/train': 0.7551941247656941}
12/27/2021 04:07:49 - INFO - codeparrot_training - Step 31931: {'lr': 0.00014544154629229588, 'samples': 16349184, 'steps': 31931, 'batch_loss/train': 0.7434254363179207}
12/27/2021 04:08:01 - INFO - codeparrot_training - Step 31932: {'lr': 0.00014542722822416074, 'samples': 16349696, 'steps': 31932, 'batch_loss/train': 0.7211455209180713}
12/27/2021 04:08:12 - INFO - codeparrot_training - Step 31933: {'lr': 0.0001454129105717673, 'samples': 16350208, 'steps': 31933, 'batch_loss/train': 0.8136425279080868}
12/27/2021 04:08:23 - INFO - codeparrot_training - Step 31934: {'lr': 0.00014539859333517288, 'samples': 16350720, 'steps': 31934, 'batch_loss/train': 0.694523562444374}
12/27/2021 04:08:33 - INFO - codeparrot_training - Step 31935: {'lr': 0.00014538427651443402, 'samples': 16351232, 'steps': 31935, 'batch_loss/train': 0.7421758528798819}
12/27/2021 04:08:46 - INFO - codeparrot_training - Step 31936: {'lr': 0.00014536996010960782, 'samples': 16351744, 'steps': 31936, 'batch_loss/train': 0.852515017206315}
12/27/2021 04:08:56 - INFO - codeparrot_training - Step 31937: {'lr': 0.0001453556441207511, 'samples': 16352256, 'steps': 31937, 'batch_loss/train': 0.9033232172951102}
12/27/2021 04:09:07 - INFO - codeparrot_training - Step 31938: {'lr': 0.00014534132854792086, 'samples': 16352768, 'steps': 31938, 'batch_loss/train': 0.7894304152578115}
12/27/2021 04:09:19 - INFO - codeparrot_training - Step 31939: {'lr': 0.000145327013391174, 'samples': 16353280, 'steps': 31939, 'batch_loss/train': 0.8209690796211362}
12/27/2021 04:09:30 - INFO - codeparrot_training - Step 31940: {'lr': 0.00014531269865056737, 'samples': 16353792, 'steps': 31940, 'batch_loss/train': 0.8715091785416007}
12/27/2021 04:09:40 - INFO - codeparrot_training - Step 31941: {'lr': 0.00014529838432615795, 'samples': 16354304, 'steps': 31941, 'batch_loss/train': 0.7477897226344794}
12/27/2021 04:09:53 - INFO - codeparrot_training - Step 31942: {'lr': 0.00014528407041800257, 'samples': 16354816, 'steps': 31942, 'batch_loss/train': 0.7067928593605757}
12/27/2021 04:10:04 - INFO - codeparrot_training - Step 31943: {'lr': 0.00014526975692615834, 'samples': 16355328, 'steps': 31943, 'batch_loss/train': 0.6619329880923033}
12/27/2021 04:10:14 - INFO - codeparrot_training - Step 31944: {'lr': 0.0001452554438506818, 'samples': 16355840, 'steps': 31944, 'batch_loss/train': 0.8168338229879737}
12/27/2021 04:10:25 - INFO - codeparrot_training - Step 31945: {'lr': 0.00014524113119163023, 'samples': 16356352, 'steps': 31945, 'batch_loss/train': 0.7344493726268411}
12/27/2021 04:10:37 - INFO - codeparrot_training - Step 31946: {'lr': 0.00014522681894906032, 'samples': 16356864, 'steps': 31946, 'batch_loss/train': 0.6771369376219809}
12/27/2021 04:10:48 - INFO - codeparrot_training - Step 31947: {'lr': 0.00014521250712302888, 'samples': 16357376, 'steps': 31947, 'batch_loss/train': 0.8312897658906877}
12/27/2021 04:10:58 - INFO - codeparrot_training - Step 31948: {'lr': 0.00014519819571359317, 'samples': 16357888, 'steps': 31948, 'batch_loss/train': 0.6991229438572191}
12/27/2021 04:11:11 - INFO - codeparrot_training - Step 31949: {'lr': 0.00014518388472080972, 'samples': 16358400, 'steps': 31949, 'batch_loss/train': 0.721479722764343}
12/27/2021 04:11:22 - INFO - codeparrot_training - Step 31950: {'lr': 0.00014516957414473564, 'samples': 16358912, 'steps': 31950, 'batch_loss/train': 0.8277254961431026}
12/27/2021 04:11:32 - INFO - codeparrot_training - Step 31951: {'lr': 0.0001451552639854277, 'samples': 16359424, 'steps': 31951, 'batch_loss/train': 0.773600448621437}
12/27/2021 04:11:45 - INFO - codeparrot_training - Step 31952: {'lr': 0.00014514095424294287, 'samples': 16359936, 'steps': 31952, 'batch_loss/train': 0.6513679935596883}
12/27/2021 04:11:55 - INFO - codeparrot_training - Step 31953: {'lr': 0.00014512664491733796, 'samples': 16360448, 'steps': 31953, 'batch_loss/train': 0.7553668515756726}
12/27/2021 04:12:06 - INFO - codeparrot_training - Step 31954: {'lr': 0.00014511233600867008, 'samples': 16360960, 'steps': 31954, 'batch_loss/train': 0.7141644963994622}
12/27/2021 04:12:17 - INFO - codeparrot_training - Step 31955: {'lr': 0.00014509802751699568, 'samples': 16361472, 'steps': 31955, 'batch_loss/train': 1.023904161178507}
12/27/2021 04:12:29 - INFO - codeparrot_training - Step 31956: {'lr': 0.00014508371944237206, 'samples': 16361984, 'steps': 31956, 'batch_loss/train': 0.7519565192051232}
12/27/2021 04:12:40 - INFO - codeparrot_training - Step 31957: {'lr': 0.00014506941178485605, 'samples': 16362496, 'steps': 31957, 'batch_loss/train': 0.7882409440353513}
12/27/2021 04:12:51 - INFO - codeparrot_training - Step 31958: {'lr': 0.00014505510454450432, 'samples': 16363008, 'steps': 31958, 'batch_loss/train': 0.7754185041412711}
12/27/2021 04:13:03 - INFO - codeparrot_training - Step 31959: {'lr': 0.0001450407977213738, 'samples': 16363520, 'steps': 31959, 'batch_loss/train': 0.9079031273722649}
12/27/2021 04:13:14 - INFO - codeparrot_training - Step 31960: {'lr': 0.00014502649131552158, 'samples': 16364032, 'steps': 31960, 'batch_loss/train': 0.8797187586314976}
12/27/2021 04:13:24 - INFO - codeparrot_training - Step 31961: {'lr': 0.00014501218532700437, 'samples': 16364544, 'steps': 31961, 'batch_loss/train': 0.6584167894907296}
12/27/2021 04:13:36 - INFO - codeparrot_training - Step 31962: {'lr': 0.0001449978797558789, 'samples': 16365056, 'steps': 31962, 'batch_loss/train': 0.5918694599531591}
12/27/2021 04:13:47 - INFO - codeparrot_training - Step 31963: {'lr': 0.00014498357460220247, 'samples': 16365568, 'steps': 31963, 'batch_loss/train': 0.808854759670794}
12/27/2021 04:13:58 - INFO - codeparrot_training - Step 31964: {'lr': 0.00014496926986603154, 'samples': 16366080, 'steps': 31964, 'batch_loss/train': 0.7099562641233206}
12/27/2021 04:14:10 - INFO - codeparrot_training - Step 31965: {'lr': 0.0001449549655474232, 'samples': 16366592, 'steps': 31965, 'batch_loss/train': 0.7439971081912518}
12/27/2021 04:14:20 - INFO - codeparrot_training - Step 31966: {'lr': 0.0001449406616464342, 'samples': 16367104, 'steps': 31966, 'batch_loss/train': 0.6718647326342762}
12/27/2021 04:14:31 - INFO - codeparrot_training - Step 31967: {'lr': 0.0001449263581631215, 'samples': 16367616, 'steps': 31967, 'batch_loss/train': 0.8115277751348913}
12/27/2021 04:14:42 - INFO - codeparrot_training - Step 31968: {'lr': 0.00014491205509754187, 'samples': 16368128, 'steps': 31968, 'batch_loss/train': 0.6705477278446779}
12/27/2021 04:14:54 - INFO - codeparrot_training - Step 31969: {'lr': 0.00014489775244975236, 'samples': 16368640, 'steps': 31969, 'batch_loss/train': 0.5724111577728763}
12/27/2021 04:15:05 - INFO - codeparrot_training - Step 31970: {'lr': 0.00014488345021980954, 'samples': 16369152, 'steps': 31970, 'batch_loss/train': 0.7230505216866732}
12/27/2021 04:15:16 - INFO - codeparrot_training - Step 31971: {'lr': 0.00014486914840777058, 'samples': 16369664, 'steps': 31971, 'batch_loss/train': 0.8773024952970445}
12/27/2021 04:15:28 - INFO - codeparrot_training - Step 31972: {'lr': 0.00014485484701369213, 'samples': 16370176, 'steps': 31972, 'batch_loss/train': 0.7467029644176364}
12/27/2021 04:15:38 - INFO - codeparrot_training - Step 31973: {'lr': 0.00014484054603763094, 'samples': 16370688, 'steps': 31973, 'batch_loss/train': 0.7027703507337719}
12/27/2021 04:15:49 - INFO - codeparrot_training - Step 31974: {'lr': 0.0001448262454796443, 'samples': 16371200, 'steps': 31974, 'batch_loss/train': 0.5603175391443074}
12/27/2021 04:16:01 - INFO - codeparrot_training - Step 31975: {'lr': 0.00014481194533978863, 'samples': 16371712, 'steps': 31975, 'batch_loss/train': 0.762289198115468}
12/27/2021 04:16:12 - INFO - codeparrot_training - Step 31976: {'lr': 0.000144797645618121, 'samples': 16372224, 'steps': 31976, 'batch_loss/train': 0.8408209783956409}
12/27/2021 04:16:22 - INFO - codeparrot_training - Step 31977: {'lr': 0.0001447833463146982, 'samples': 16372736, 'steps': 31977, 'batch_loss/train': 0.7153506581671536}
12/27/2021 04:16:33 - INFO - codeparrot_training - Step 31978: {'lr': 0.00014476904742957708, 'samples': 16373248, 'steps': 31978, 'batch_loss/train': 0.6531502353027463}
12/27/2021 04:16:47 - INFO - codeparrot_training - Step 31979: {'lr': 0.0001447547489628145, 'samples': 16373760, 'steps': 31979, 'batch_loss/train': 0.8184443190693855}
12/27/2021 04:16:57 - INFO - codeparrot_training - Step 31980: {'lr': 0.0001447404509144674, 'samples': 16374272, 'steps': 31980, 'batch_loss/train': 0.6546603161841631}
12/27/2021 04:17:08 - INFO - codeparrot_training - Step 31981: {'lr': 0.00014472615328459232, 'samples': 16374784, 'steps': 31981, 'batch_loss/train': 0.6982567654922605}
12/27/2021 04:17:20 - INFO - codeparrot_training - Step 31982: {'lr': 0.0001447118560732464, 'samples': 16375296, 'steps': 31982, 'batch_loss/train': 0.7658149469643831}
12/27/2021 04:17:31 - INFO - codeparrot_training - Step 31983: {'lr': 0.0001446975592804865, 'samples': 16375808, 'steps': 31983, 'batch_loss/train': 0.6913008582778275}
12/27/2021 04:17:41 - INFO - codeparrot_training - Step 31984: {'lr':
0.00014468326290636912, 'samples': 16376320, 'steps': 31984, 'batch_loss/train': 0.7507428070530295} 12/27/2021 04:17:53 - INFO - codeparrot_training - Step 31985: {'lr': 0.00014466896695095144, 'samples': 16376832, 'steps': 31985, 'batch_loss/train': 0.9294863492250443} 12/27/2021 04:18:04 - INFO - codeparrot_training - Step 31986: {'lr': 0.0001446546714142903, 'samples': 16377344, 'steps': 31986, 'batch_loss/train': 0.828424443025142} 12/27/2021 04:18:15 - INFO - codeparrot_training - Step 31987: {'lr': 0.00014464037629644225, 'samples': 16377856, 'steps': 31987, 'batch_loss/train': 0.7560961013659835} 12/27/2021 04:18:27 - INFO - codeparrot_training - Step 31988: {'lr': 0.0001446260815974642, 'samples': 16378368, 'steps': 31988, 'batch_loss/train': 0.6332122723106295} 12/27/2021 04:18:38 - INFO - codeparrot_training - Step 31989: {'lr': 0.00014461178731741325, 'samples': 16378880, 'steps': 31989, 'batch_loss/train': 0.8544108366477303} 12/27/2021 04:18:49 - INFO - codeparrot_training - Step 31990: {'lr': 0.000144597493456346, 'samples': 16379392, 'steps': 31990, 'batch_loss/train': 0.7099901484325528} 12/27/2021 04:18:59 - INFO - codeparrot_training - Step 31991: {'lr': 0.00014458320001431926, 'samples': 16379904, 'steps': 31991, 'batch_loss/train': 0.832249200437218} 12/27/2021 04:19:12 - INFO - codeparrot_training - Step 31992: {'lr': 0.00014456890699138995, 'samples': 16380416, 'steps': 31992, 'batch_loss/train': 1.6357176955789328} 12/27/2021 04:19:22 - INFO - codeparrot_training - Step 31993: {'lr': 0.00014455461438761484, 'samples': 16380928, 'steps': 31993, 'batch_loss/train': 0.9067887160927057} 12/27/2021 04:19:33 - INFO - codeparrot_training - Step 31994: {'lr': 0.00014454032220305075, 'samples': 16381440, 'steps': 31994, 'batch_loss/train': 0.8224053429439664} 12/27/2021 04:19:46 - INFO - codeparrot_training - Step 31995: {'lr': 0.00014452603043775452, 'samples': 16381952, 'steps': 31995, 'batch_loss/train': 1.795152777864132} 12/27/2021 04:19:56 - 
INFO - codeparrot_training - Step 31996: {'lr': 0.000144511739091783, 'samples': 16382464, 'steps': 31996, 'batch_loss/train': 0.7361714583821595} 12/27/2021 04:20:07 - INFO - codeparrot_training - Step 31997: {'lr': 0.00014449744816519297, 'samples': 16382976, 'steps': 31997, 'batch_loss/train': 0.779631033539772} 12/27/2021 04:20:19 - INFO - codeparrot_training - Step 31998: {'lr': 0.00014448315765804132, 'samples': 16383488, 'steps': 31998, 'batch_loss/train': 0.8314580013975501} 12/27/2021 04:20:30 - INFO - codeparrot_training - Step 31999: {'lr': 0.00014446886757038462, 'samples': 16384000, 'steps': 31999, 'batch_loss/train': 0.8306775754317641} 12/27/2021 04:20:40 - INFO - codeparrot_training - Step 32000: {'lr': 0.00014445457790228006, 'samples': 16384512, 'steps': 32000, 'batch_loss/train': 0.7902027629315853} 12/27/2021 04:20:51 - INFO - codeparrot_training - Step 32001: {'lr': 0.0001444402886537841, 'samples': 16385024, 'steps': 32001, 'batch_loss/train': 0.7147384216077626} 12/27/2021 04:21:03 - INFO - codeparrot_training - Step 32002: {'lr': 0.00014442599982495376, 'samples': 16385536, 'steps': 32002, 'batch_loss/train': 0.7440563533455133} 12/27/2021 04:21:14 - INFO - codeparrot_training - Step 32003: {'lr': 0.00014441171141584575, 'samples': 16386048, 'steps': 32003, 'batch_loss/train': 0.8096453249454498} 12/27/2021 04:21:24 - INFO - codeparrot_training - Step 32004: {'lr': 0.0001443974234265169, 'samples': 16386560, 'steps': 32004, 'batch_loss/train': 0.7984539461322129} 12/27/2021 04:21:37 - INFO - codeparrot_training - Step 32005: {'lr': 0.00014438313585702403, 'samples': 16387072, 'steps': 32005, 'batch_loss/train': 0.7090642526745796} 12/27/2021 04:21:48 - INFO - codeparrot_training - Step 32006: {'lr': 0.00014436884870742394, 'samples': 16387584, 'steps': 32006, 'batch_loss/train': 0.811488610226661} 12/27/2021 04:21:59 - INFO - codeparrot_training - Step 32007: {'lr': 0.00014435456197777342, 'samples': 16388096, 'steps': 32007, 
'batch_loss/train': 0.7773114405572414} 12/27/2021 04:22:09 - INFO - codeparrot_training - Step 32008: {'lr': 0.00014434027566812924, 'samples': 16388608, 'steps': 32008, 'batch_loss/train': 0.7019176525063813} 12/27/2021 04:22:22 - INFO - codeparrot_training - Step 32009: {'lr': 0.00014432598977854834, 'samples': 16389120, 'steps': 32009, 'batch_loss/train': 0.7707016887143254} 12/27/2021 04:22:32 - INFO - codeparrot_training - Step 32010: {'lr': 0.00014431170430908726, 'samples': 16389632, 'steps': 32010, 'batch_loss/train': 0.7873198231682181} 12/27/2021 04:22:43 - INFO - codeparrot_training - Step 32011: {'lr': 0.00014429741925980297, 'samples': 16390144, 'steps': 32011, 'batch_loss/train': 0.8362345859641209} 12/27/2021 04:22:55 - INFO - codeparrot_training - Step 32012: {'lr': 0.00014428313463075236, 'samples': 16390656, 'steps': 32012, 'batch_loss/train': 0.9523573406040668} 12/27/2021 04:23:06 - INFO - codeparrot_training - Step 32013: {'lr': 0.00014426885042199196, 'samples': 16391168, 'steps': 32013, 'batch_loss/train': 0.9361483342945576} 12/27/2021 04:23:16 - INFO - codeparrot_training - Step 32014: {'lr': 0.00014425456663357862, 'samples': 16391680, 'steps': 32014, 'batch_loss/train': 0.7183908191509545} 12/27/2021 04:23:28 - INFO - codeparrot_training - Step 32015: {'lr': 0.00014424028326556938, 'samples': 16392192, 'steps': 32015, 'batch_loss/train': 0.7678525168448687} 12/27/2021 04:23:39 - INFO - codeparrot_training - Step 32016: {'lr': 0.00014422600031802074, 'samples': 16392704, 'steps': 32016, 'batch_loss/train': 0.7479820968583226} 12/27/2021 04:23:50 - INFO - codeparrot_training - Step 32017: {'lr': 0.0001442117177909896, 'samples': 16393216, 'steps': 32017, 'batch_loss/train': 0.7566901003010571} 12/27/2021 04:24:00 - INFO - codeparrot_training - Step 32018: {'lr': 0.00014419743568453268, 'samples': 16393728, 'steps': 32018, 'batch_loss/train': 0.8056121943518519} 12/27/2021 04:24:13 - INFO - codeparrot_training - Step 32019: {'lr': 
0.0001441831539987069, 'samples': 16394240, 'steps': 32019, 'batch_loss/train': 0.7711253399029374} 12/27/2021 04:24:24 - INFO - codeparrot_training - Step 32020: {'lr': 0.0001441688727335689, 'samples': 16394752, 'steps': 32020, 'batch_loss/train': 0.6398999774828553} 12/27/2021 04:24:34 - INFO - codeparrot_training - Step 32021: {'lr': 0.00014415459188917546, 'samples': 16395264, 'steps': 32021, 'batch_loss/train': 1.1589857544749975} 12/27/2021 04:24:47 - INFO - codeparrot_training - Step 32022: {'lr': 0.00014414031146558343, 'samples': 16395776, 'steps': 32022, 'batch_loss/train': 0.7305793170817196} 12/27/2021 04:24:57 - INFO - codeparrot_training - Step 32023: {'lr': 0.00014412603146284953, 'samples': 16396288, 'steps': 32023, 'batch_loss/train': 0.7443191993515939} 12/27/2021 04:25:08 - INFO - codeparrot_training - Step 32024: {'lr': 0.00014411175188103068, 'samples': 16396800, 'steps': 32024, 'batch_loss/train': 0.614909537951462} 12/27/2021 04:25:21 - INFO - codeparrot_training - Step 32025: {'lr': 0.0001440974727201833, 'samples': 16397312, 'steps': 32025, 'batch_loss/train': 0.7050760970450938} 12/27/2021 04:25:32 - INFO - codeparrot_training - Step 32026: {'lr': 0.0001440831939803646, 'samples': 16397824, 'steps': 32026, 'batch_loss/train': 1.2226259396411479} 12/27/2021 04:25:42 - INFO - codeparrot_training - Step 32027: {'lr': 0.00014406891566163103, 'samples': 16398336, 'steps': 32027, 'batch_loss/train': 0.4699362368846778} 12/27/2021 04:25:53 - INFO - codeparrot_training - Step 32028: {'lr': 0.00014405463776403943, 'samples': 16398848, 'steps': 32028, 'batch_loss/train': 0.7054315218701959} 12/27/2021 04:26:05 - INFO - codeparrot_training - Step 32029: {'lr': 0.00014404036028764662, 'samples': 16399360, 'steps': 32029, 'batch_loss/train': 0.8305226294323802} 12/27/2021 04:26:15 - INFO - codeparrot_training - Step 32030: {'lr': 0.0001440260832325093, 'samples': 16399872, 'steps': 32030, 'batch_loss/train': 0.7652666755020618} 12/27/2021 04:26:26 - 
INFO - codeparrot_training - Step 32031: {'lr': 0.00014401180659868423, 'samples': 16400384, 'steps': 32031, 'batch_loss/train': 0.797881587408483} 12/27/2021 04:26:38 - INFO - codeparrot_training - Step 32032: {'lr': 0.00014399753038622825, 'samples': 16400896, 'steps': 32032, 'batch_loss/train': 0.8490724666044116} 12/27/2021 04:26:49 - INFO - codeparrot_training - Step 32033: {'lr': 0.00014398325459519807, 'samples': 16401408, 'steps': 32033, 'batch_loss/train': 0.7470477148890495} 12/27/2021 04:26:59 - INFO - codeparrot_training - Step 32034: {'lr': 0.00014396897922565042, 'samples': 16401920, 'steps': 32034, 'batch_loss/train': 0.6756067825481296} 12/27/2021 04:27:12 - INFO - codeparrot_training - Step 32035: {'lr': 0.00014395470427764218, 'samples': 16402432, 'steps': 32035, 'batch_loss/train': 0.6628507194109261} 12/27/2021 04:27:23 - INFO - codeparrot_training - Step 32036: {'lr': 0.00014394042975122978, 'samples': 16402944, 'steps': 32036, 'batch_loss/train': 0.7749318163841963} 12/27/2021 04:27:33 - INFO - codeparrot_training - Step 32037: {'lr': 0.00014392615564647027, 'samples': 16403456, 'steps': 32037, 'batch_loss/train': 0.6599286147393286} 12/27/2021 04:27:44 - INFO - codeparrot_training - Step 32038: {'lr': 0.00014391188196342044, 'samples': 16403968, 'steps': 32038, 'batch_loss/train': 0.7620538743212819} 12/27/2021 04:27:56 - INFO - codeparrot_training - Step 32039: {'lr': 0.0001438976087021368, 'samples': 16404480, 'steps': 32039, 'batch_loss/train': 0.7309103289153427} 12/27/2021 04:28:07 - INFO - codeparrot_training - Step 32040: {'lr': 0.0001438833358626761, 'samples': 16404992, 'steps': 32040, 'batch_loss/train': 0.7300708154216409} 12/27/2021 04:28:17 - INFO - codeparrot_training - Step 32041: {'lr': 0.00014386906344509542, 'samples': 16405504, 'steps': 32041, 'batch_loss/train': 0.8374668788164854} 12/27/2021 04:28:30 - INFO - codeparrot_training - Step 32042: {'lr': 0.00014385479144945112, 'samples': 16406016, 'steps': 32042, 
'batch_loss/train': 0.9623431181535125} 12/27/2021 04:28:40 - INFO - codeparrot_training - Step 32043: {'lr': 0.00014384051987580016, 'samples': 16406528, 'steps': 32043, 'batch_loss/train': 0.7005334938876331} 12/27/2021 04:28:51 - INFO - codeparrot_training - Step 32044: {'lr': 0.00014382624872419914, 'samples': 16407040, 'steps': 32044, 'batch_loss/train': 0.6849117111414671} 12/27/2021 04:29:03 - INFO - codeparrot_training - Step 32045: {'lr': 0.00014381197799470492, 'samples': 16407552, 'steps': 32045, 'batch_loss/train': 0.8269471779931337} 12/27/2021 04:29:14 - INFO - codeparrot_training - Step 32046: {'lr': 0.00014379770768737412, 'samples': 16408064, 'steps': 32046, 'batch_loss/train': 0.6840331447310746} 12/27/2021 04:29:25 - INFO - codeparrot_training - Step 32047: {'lr': 0.00014378343780226357, 'samples': 16408576, 'steps': 32047, 'batch_loss/train': 0.7290927558206022} 12/27/2021 04:29:35 - INFO - codeparrot_training - Step 32048: {'lr': 0.00014376916833942999, 'samples': 16409088, 'steps': 32048, 'batch_loss/train': 0.6814772132784128} 12/27/2021 04:29:49 - INFO - codeparrot_training - Step 32049: {'lr': 0.00014375489929893004, 'samples': 16409600, 'steps': 32049, 'batch_loss/train': 0.7796170525252819} 12/27/2021 04:30:00 - INFO - codeparrot_training - Step 32050: {'lr': 0.0001437406306808206, 'samples': 16410112, 'steps': 32050, 'batch_loss/train': 0.7001647190190852} 12/27/2021 04:30:10 - INFO - codeparrot_training - Step 32051: {'lr': 0.00014372636248515813, 'samples': 16410624, 'steps': 32051, 'batch_loss/train': 0.6083197426050901} 12/27/2021 04:30:22 - INFO - codeparrot_training - Step 32052: {'lr': 0.00014371209471199966, 'samples': 16411136, 'steps': 32052, 'batch_loss/train': 0.7529731539543718} 12/27/2021 04:30:33 - INFO - codeparrot_training - Step 32053: {'lr': 0.0001436978273614017, 'samples': 16411648, 'steps': 32053, 'batch_loss/train': 0.5923964397516102} 12/27/2021 04:30:43 - INFO - codeparrot_training - Step 32054: {'lr': 
0.00014368356043342105, 'samples': 16412160, 'steps': 32054, 'batch_loss/train': 0.7596370121464133} 12/27/2021 04:30:54 - INFO - codeparrot_training - Step 32055: {'lr': 0.0001436692939281144, 'samples': 16412672, 'steps': 32055, 'batch_loss/train': 0.726387204718776} 12/27/2021 04:31:06 - INFO - codeparrot_training - Step 32056: {'lr': 0.0001436550278455385, 'samples': 16413184, 'steps': 32056, 'batch_loss/train': 1.0850694347172976} 12/27/2021 04:31:17 - INFO - codeparrot_training - Step 32057: {'lr': 0.00014364076218575006, 'samples': 16413696, 'steps': 32057, 'batch_loss/train': 0.6881679128855467} 12/27/2021 04:31:27 - INFO - codeparrot_training - Step 32058: {'lr': 0.00014362649694880575, 'samples': 16414208, 'steps': 32058, 'batch_loss/train': 0.7856850964017212} 12/27/2021 04:31:40 - INFO - codeparrot_training - Step 32059: {'lr': 0.0001436122321347624, 'samples': 16414720, 'steps': 32059, 'batch_loss/train': 0.7129621653584763} 12/27/2021 04:31:51 - INFO - codeparrot_training - Step 32060: {'lr': 0.00014359796774367656, 'samples': 16415232, 'steps': 32060, 'batch_loss/train': 0.7604674687609076} 12/27/2021 04:32:01 - INFO - codeparrot_training - Step 32061: {'lr': 0.00014358370377560517, 'samples': 16415744, 'steps': 32061, 'batch_loss/train': 0.7428207723423839} 12/27/2021 04:32:14 - INFO - codeparrot_training - Step 32062: {'lr': 0.0001435694402306046, 'samples': 16416256, 'steps': 32062, 'batch_loss/train': 0.6768989665433764} 12/27/2021 04:32:24 - INFO - codeparrot_training - Step 32063: {'lr': 0.00014355517710873183, 'samples': 16416768, 'steps': 32063, 'batch_loss/train': 0.7924026491236873} 12/27/2021 04:32:35 - INFO - codeparrot_training - Step 32064: {'lr': 0.00014354091441004359, 'samples': 16417280, 'steps': 32064, 'batch_loss/train': 0.7453712224960327} 12/27/2021 04:32:48 - INFO - codeparrot_training - Step 32065: {'lr': 0.00014352665213459637, 'samples': 16417792, 'steps': 32065, 'batch_loss/train': 0.8447694364003837} 12/27/2021 04:32:58 - 
INFO - codeparrot_training - Step 32066: {'lr': 0.0001435123902824469, 'samples': 16418304, 'steps': 32066, 'batch_loss/train': 0.9305233092163689} 12/27/2021 04:33:09 - INFO - codeparrot_training - Step 32067: {'lr': 0.0001434981288536521, 'samples': 16418816, 'steps': 32067, 'batch_loss/train': 0.7175712501630187} 12/27/2021 04:33:20 - INFO - codeparrot_training - Step 32068: {'lr': 0.00014348386784826848, 'samples': 16419328, 'steps': 32068, 'batch_loss/train': 0.7373849279247224} 12/27/2021 04:33:32 - INFO - codeparrot_training - Step 32069: {'lr': 0.00014346960726635267, 'samples': 16419840, 'steps': 32069, 'batch_loss/train': 0.7547628979664296} 12/27/2021 04:33:42 - INFO - codeparrot_training - Step 32070: {'lr': 0.0001434553471079617, 'samples': 16420352, 'steps': 32070, 'batch_loss/train': 1.294735343195498} 12/27/2021 04:33:53 - INFO - codeparrot_training - Step 32071: {'lr': 0.00014344108737315188, 'samples': 16420864, 'steps': 32071, 'batch_loss/train': 0.6694833161309361} 12/27/2021 04:34:05 - INFO - codeparrot_training - Step 32072: {'lr': 0.0001434268280619801, 'samples': 16421376, 'steps': 32072, 'batch_loss/train': 0.9559012735262513} 12/27/2021 04:34:16 - INFO - codeparrot_training - Step 32073: {'lr': 0.000143412569174503, 'samples': 16421888, 'steps': 32073, 'batch_loss/train': 0.7959235669113696} 12/27/2021 04:34:26 - INFO - codeparrot_training - Step 32074: {'lr': 0.0001433983107107773, 'samples': 16422400, 'steps': 32074, 'batch_loss/train': 0.675749787595123} 12/27/2021 04:34:39 - INFO - codeparrot_training - Step 32075: {'lr': 0.00014338405267085965, 'samples': 16422912, 'steps': 32075, 'batch_loss/train': 0.7334698680788279} 12/27/2021 04:34:50 - INFO - codeparrot_training - Step 32076: {'lr': 0.00014336979505480686, 'samples': 16423424, 'steps': 32076, 'batch_loss/train': 0.7126283580437303} 12/27/2021 04:35:00 - INFO - codeparrot_training - Step 32077: {'lr': 0.0001433555378626753, 'samples': 16423936, 'steps': 32077, 'batch_loss/train': 
0.759138165274635} 12/27/2021 04:35:11 - INFO - codeparrot_training - Step 32078: {'lr': 0.00014334128109452194, 'samples': 16424448, 'steps': 32078, 'batch_loss/train': 0.7521615056321025} 12/27/2021 04:35:23 - INFO - codeparrot_training - Step 32079: {'lr': 0.0001433270247504035, 'samples': 16424960, 'steps': 32079, 'batch_loss/train': 0.6326209069229662} 12/27/2021 04:35:34 - INFO - codeparrot_training - Step 32080: {'lr': 0.0001433127688303764, 'samples': 16425472, 'steps': 32080, 'batch_loss/train': 0.7256203959695995} 12/27/2021 04:35:44 - INFO - codeparrot_training - Step 32081: {'lr': 0.00014329851333449748, 'samples': 16425984, 'steps': 32081, 'batch_loss/train': 0.7578574204817414} 12/27/2021 04:35:57 - INFO - codeparrot_training - Step 32082: {'lr': 0.00014328425826282338, 'samples': 16426496, 'steps': 32082, 'batch_loss/train': 0.5367078290437348} 12/27/2021 04:36:07 - INFO - codeparrot_training - Step 32083: {'lr': 0.00014327000361541076, 'samples': 16427008, 'steps': 32083, 'batch_loss/train': 0.7548322509974241} 12/27/2021 04:36:18 - INFO - codeparrot_training - Step 32084: {'lr': 0.00014325574939231628, 'samples': 16427520, 'steps': 32084, 'batch_loss/train': 0.6794548742473125} 12/27/2021 04:36:31 - INFO - codeparrot_training - Step 32085: {'lr': 0.00014324149559359666, 'samples': 16428032, 'steps': 32085, 'batch_loss/train': 0.6510281367227435} 12/27/2021 04:36:41 - INFO - codeparrot_training - Step 32086: {'lr': 0.00014322724221930855, 'samples': 16428544, 'steps': 32086, 'batch_loss/train': 0.7874831389635801} 12/27/2021 04:36:52 - INFO - codeparrot_training - Step 32087: {'lr': 0.0001432129892695086, 'samples': 16429056, 'steps': 32087, 'batch_loss/train': 0.6643449139082804} 12/27/2021 04:37:04 - INFO - codeparrot_training - Step 32088: {'lr': 0.00014319873674425347, 'samples': 16429568, 'steps': 32088, 'batch_loss/train': 0.832916896790266} 12/27/2021 04:37:15 - INFO - codeparrot_training - Step 32089: {'lr': 0.00014318448464359989, 
'samples': 16430080, 'steps': 32089, 'batch_loss/train': 0.6767360335215926} 12/27/2021 04:37:26 - INFO - codeparrot_training - Step 32090: {'lr': 0.00014317023296760452, 'samples': 16430592, 'steps': 32090, 'batch_loss/train': 0.6681750847492367} 12/27/2021 04:37:36 - INFO - codeparrot_training - Step 32091: {'lr': 0.00014315598171632388, 'samples': 16431104, 'steps': 32091, 'batch_loss/train': 0.8043926213867962} 12/27/2021 04:37:48 - INFO - codeparrot_training - Step 32092: {'lr': 0.0001431417308898146, 'samples': 16431616, 'steps': 32092, 'batch_loss/train': 0.8229221580550075} 12/27/2021 04:37:59 - INFO - codeparrot_training - Step 32093: {'lr': 0.00014312748048813368, 'samples': 16432128, 'steps': 32093, 'batch_loss/train': 0.7497468311339617} 12/27/2021 04:38:09 - INFO - codeparrot_training - Step 32094: {'lr': 0.0001431132305113374, 'samples': 16432640, 'steps': 32094, 'batch_loss/train': 0.7468500826507807} 12/27/2021 04:38:22 - INFO - codeparrot_training - Step 32095: {'lr': 0.00014309898095948249, 'samples': 16433152, 'steps': 32095, 'batch_loss/train': 0.7528686467558146} 12/27/2021 04:38:32 - INFO - codeparrot_training - Step 32096: {'lr': 0.00014308473183262589, 'samples': 16433664, 'steps': 32096, 'batch_loss/train': 0.6899444735608995} 12/27/2021 04:38:43 - INFO - codeparrot_training - Step 32097: {'lr': 0.0001430704831308239, 'samples': 16434176, 'steps': 32097, 'batch_loss/train': 0.6677827476523817} 12/27/2021 04:38:56 - INFO - codeparrot_training - Step 32098: {'lr': 0.00014305623485413326, 'samples': 16434688, 'steps': 32098, 'batch_loss/train': 0.8713530809618533} 12/27/2021 04:39:06 - INFO - codeparrot_training - Step 32099: {'lr': 0.00014304198700261067, 'samples': 16435200, 'steps': 32099, 'batch_loss/train': 0.8635667869821191} 12/27/2021 04:39:17 - INFO - codeparrot_training - Step 32100: {'lr': 0.00014302773957631276, 'samples': 16435712, 'steps': 32100, 'batch_loss/train': 0.7125907139852643} 12/27/2021 04:39:27 - INFO - 
codeparrot_training - Step 32101: {'lr': 0.00014301349257529617, 'samples': 16436224, 'steps': 32101, 'batch_loss/train': 0.7214028714224696} 12/27/2021 04:39:39 - INFO - codeparrot_training - Step 32102: {'lr': 0.00014299924599961764, 'samples': 16436736, 'steps': 32102, 'batch_loss/train': 0.7524868054315448} 12/27/2021 04:39:50 - INFO - codeparrot_training - Step 32103: {'lr': 0.0001429849998493335, 'samples': 16437248, 'steps': 32103, 'batch_loss/train': 0.6344959632260725} 12/27/2021 04:40:01 - INFO - codeparrot_training - Step 32104: {'lr': 0.00014297075412450072, 'samples': 16437760, 'steps': 32104, 'batch_loss/train': 0.6653016227646731} 12/27/2021 04:40:13 - INFO - codeparrot_training - Step 32105: {'lr': 0.00014295650882517586, 'samples': 16438272, 'steps': 32105, 'batch_loss/train': 0.6578119631158188} 12/27/2021 04:40:24 - INFO - codeparrot_training - Step 32106: {'lr': 0.00014294226395141535, 'samples': 16438784, 'steps': 32106, 'batch_loss/train': 0.7194399596191943} 12/27/2021 04:40:35 - INFO - codeparrot_training - Step 32107: {'lr': 0.00014292801950327615, 'samples': 16439296, 'steps': 32107, 'batch_loss/train': 0.777302899514325} 12/27/2021 04:40:47 - INFO - codeparrot_training - Step 32108: {'lr': 0.00014291377548081462, 'samples': 16439808, 'steps': 32108, 'batch_loss/train': 0.7586999703198671} 12/27/2021 04:40:57 - INFO - codeparrot_training - Step 32109: {'lr': 0.00014289953188408746, 'samples': 16440320, 'steps': 32109, 'batch_loss/train': 0.9171146280132234} 12/27/2021 04:41:08 - INFO - codeparrot_training - Step 32110: {'lr': 0.00014288528871315138, 'samples': 16440832, 'steps': 32110, 'batch_loss/train': 0.6477636899799109} 12/27/2021 04:41:19 - INFO - codeparrot_training - Step 32111: {'lr': 0.0001428710459680629, 'samples': 16441344, 'steps': 32111, 'batch_loss/train': 0.662066325545311} 12/27/2021 04:41:31 - INFO - codeparrot_training - Step 32112: {'lr': 0.0001428568036488787, 'samples': 16441856, 'steps': 32112, 'batch_loss/train': 
0.7234137421473861} 12/27/2021 04:41:41 - INFO - codeparrot_training - Step 32113: {'lr': 0.00014284256175565538, 'samples': 16442368, 'steps': 32113, 'batch_loss/train': 0.7457964213099331} 12/27/2021 04:41:52 - INFO - codeparrot_training - Step 32114: {'lr': 0.0001428283202884496, 'samples': 16442880, 'steps': 32114, 'batch_loss/train': 0.7919500591233373} 12/27/2021 04:42:05 - INFO - codeparrot_training - Step 32115: {'lr': 0.00014281407924731794, 'samples': 16443392, 'steps': 32115, 'batch_loss/train': 0.6608889484778047} 12/27/2021 04:42:15 - INFO - codeparrot_training - Step 32116: {'lr': 0.00014279983863231711, 'samples': 16443904, 'steps': 32116, 'batch_loss/train': 0.6365120398113504} 12/27/2021 04:42:26 - INFO - codeparrot_training - Step 32117: {'lr': 0.00014278559844350358, 'samples': 16444416, 'steps': 32117, 'batch_loss/train': 0.7556113145546988} 12/27/2021 04:42:38 - INFO - codeparrot_training - Step 32118: {'lr': 0.00014277135868093392, 'samples': 16444928, 'steps': 32118, 'batch_loss/train': 0.6897818350698799} 12/27/2021 04:42:49 - INFO - codeparrot_training - Step 32119: {'lr': 0.00014275711934466504, 'samples': 16445440, 'steps': 32119, 'batch_loss/train': 0.6941314907744527} 12/27/2021 04:42:59 - INFO - codeparrot_training - Step 32120: {'lr': 0.00014274288043475324, 'samples': 16445952, 'steps': 32120, 'batch_loss/train': 0.7534670264576562} 12/27/2021 04:43:10 - INFO - codeparrot_training - Step 32121: {'lr': 0.00014272864195125518, 'samples': 16446464, 'steps': 32121, 'batch_loss/train': 0.7814805838279426} 12/27/2021 04:43:22 - INFO - codeparrot_training - Step 32122: {'lr': 0.00014271440389422774, 'samples': 16446976, 'steps': 32122, 'batch_loss/train': 0.6979315197095275} 12/27/2021 04:43:33 - INFO - codeparrot_training - Step 32123: {'lr': 0.0001427001662637272, 'samples': 16447488, 'steps': 32123, 'batch_loss/train': 0.6089500530215446} 12/27/2021 04:43:44 - INFO - codeparrot_training - Step 32124: {'lr': 0.0001426859290598103, 
'samples': 16448000, 'steps': 32124, 'batch_loss/train': 0.7540907924994826} 12/27/2021 04:43:56 - INFO - codeparrot_training - Step 32125: {'lr': 0.0001426716922825336, 'samples': 16448512, 'steps': 32125, 'batch_loss/train': 0.719371011480689} 12/27/2021 04:44:06 - INFO - codeparrot_training - Step 32126: {'lr': 0.00014265745593195376, 'samples': 16449024, 'steps': 32126, 'batch_loss/train': 1.042520219925791} 12/27/2021 04:44:17 - INFO - codeparrot_training - Step 32127: {'lr': 0.00014264322000812733, 'samples': 16449536, 'steps': 32127, 'batch_loss/train': 0.7057013250887394} 12/27/2021 04:44:29 - INFO - codeparrot_training - Step 32128: {'lr': 0.000142628984511111, 'samples': 16450048, 'steps': 32128, 'batch_loss/train': 0.6506889648735523} 12/27/2021 04:44:40 - INFO - codeparrot_training - Step 32129: {'lr': 0.00014261474944096109, 'samples': 16450560, 'steps': 32129, 'batch_loss/train': 0.781022671610117} 12/27/2021 04:44:51 - INFO - codeparrot_training - Step 32130: {'lr': 0.0001426005147977345, 'samples': 16451072, 'steps': 32130, 'batch_loss/train': 0.8003670573234558} 12/27/2021 04:45:03 - INFO - codeparrot_training - Step 32131: {'lr': 0.00014258628058148783, 'samples': 16451584, 'steps': 32131, 'batch_loss/train': 0.6593137104064226} 12/27/2021 04:45:13 - INFO - codeparrot_training - Step 32132: {'lr': 0.00014257204679227734, 'samples': 16452096, 'steps': 32132, 'batch_loss/train': 0.7181751960888505} 12/27/2021 04:45:24 - INFO - codeparrot_training - Step 32133: {'lr': 0.00014255781343015995, 'samples': 16452608, 'steps': 32133, 'batch_loss/train': 0.7596271568909287} 12/27/2021 04:45:35 - INFO - codeparrot_training - Step 32134: {'lr': 0.00014254358049519216, 'samples': 16453120, 'steps': 32134, 'batch_loss/train': 0.7634360967203975} 12/27/2021 04:45:47 - INFO - codeparrot_training - Step 32135: {'lr': 0.00014252934798743047, 'samples': 16453632, 'steps': 32135, 'batch_loss/train': 0.7818901548162103} 12/27/2021 04:45:58 - INFO - codeparrot_training 
- Step 32136: {'lr': 0.0001425151159069315, 'samples': 16454144, 'steps': 32136, 'batch_loss/train': 0.8862715968862176} 12/27/2021 04:46:09 - INFO - codeparrot_training - Step 32137: {'lr': 0.00014250088425375184, 'samples': 16454656, 'steps': 32137, 'batch_loss/train': 0.749318297021091} 12/27/2021 04:46:21 - INFO - codeparrot_training - Step 32138: {'lr': 0.00014248665302794809, 'samples': 16455168, 'steps': 32138, 'batch_loss/train': 0.7290945509448647} 12/27/2021 04:46:31 - INFO - codeparrot_training - Step 32139: {'lr': 0.00014247242222957678, 'samples': 16455680, 'steps': 32139, 'batch_loss/train': 0.7513828906230628} 12/27/2021 04:46:42 - INFO - codeparrot_training - Step 32140: {'lr': 0.00014245819185869458, 'samples': 16456192, 'steps': 32140, 'batch_loss/train': 0.7510615614010021} 12/27/2021 04:46:54 - INFO - codeparrot_training - Step 32141: {'lr': 0.00014244396191535793, 'samples': 16456704, 'steps': 32141, 'batch_loss/train': 0.7792044170200825} 12/27/2021 04:47:05 - INFO - codeparrot_training - Step 32142: {'lr': 0.00014242973239962363, 'samples': 16457216, 'steps': 32142, 'batch_loss/train': 0.674945145030506} 12/27/2021 04:47:16 - INFO - codeparrot_training - Step 32143: {'lr': 0.00014241550331154786, 'samples': 16457728, 'steps': 32143, 'batch_loss/train': 0.8262941930443048} 12/27/2021 04:47:26 - INFO - codeparrot_training - Step 32144: {'lr': 0.00014240127465118754, 'samples': 16458240, 'steps': 32144, 'batch_loss/train': 0.7043214305303991} 12/27/2021 04:47:39 - INFO - codeparrot_training - Step 32145: {'lr': 0.00014238704641859922, 'samples': 16458752, 'steps': 32145, 'batch_loss/train': 0.656287997495383} 12/27/2021 04:47:49 - INFO - codeparrot_training - Step 32146: {'lr': 0.0001423728186138393, 'samples': 16459264, 'steps': 32146, 'batch_loss/train': 0.670390164013952} 12/27/2021 04:48:00 - INFO - codeparrot_training - Step 32147: {'lr': 0.0001423585912369643, 'samples': 16459776, 'steps': 32147, 'batch_loss/train': 0.6942694594617933} 
12/27/2021 04:48:12 - INFO - codeparrot_training - Step 32148: {'lr': 0.0001423443642880311, 'samples': 16460288, 'steps': 32148, 'batch_loss/train': 0.7674919473938644}
12/27/2021 04:48:23 - INFO - codeparrot_training - Step 32149: {'lr': 0.0001423301377670959, 'samples': 16460800, 'steps': 32149, 'batch_loss/train': 0.7353104525245726}
12/27/2021 04:48:33 - INFO - codeparrot_training - Step 32150: {'lr': 0.00014231591167421549, 'samples': 16461312, 'steps': 32150, 'batch_loss/train': 0.6761416718363762}
12/27/2021 04:48:47 - INFO - codeparrot_training - Step 32151: {'lr': 0.00014230168600944633, 'samples': 16461824, 'steps': 32151, 'batch_loss/train': 0.6971013648435473}
12/27/2021 04:48:57 - INFO - codeparrot_training - Step 32152: {'lr': 0.00014228746077284498, 'samples': 16462336, 'steps': 32152, 'batch_loss/train': 0.752703882753849}
12/27/2021 04:49:08 - INFO - codeparrot_training - Step 32153: {'lr': 0.00014227323596446807, 'samples': 16462848, 'steps': 32153, 'batch_loss/train': 0.7803296949714422}
12/27/2021 04:49:18 - INFO - codeparrot_training - Step 32154: {'lr': 0.00014225901158437215, 'samples': 16463360, 'steps': 32154, 'batch_loss/train': 1.0385072557255626}
12/27/2021 04:49:31 - INFO - codeparrot_training - Step 32155: {'lr': 0.00014224478763261356, 'samples': 16463872, 'steps': 32155, 'batch_loss/train': 0.860835547093302}
12/27/2021 04:49:41 - INFO - codeparrot_training - Step 32156: {'lr': 0.00014223056410924906, 'samples': 16464384, 'steps': 32156, 'batch_loss/train': 0.7704247790388763}
12/27/2021 04:49:52 - INFO - codeparrot_training - Step 32157: {'lr': 0.00014221634101433528, 'samples': 16464896, 'steps': 32157, 'batch_loss/train': 0.7543237525969744}
12/27/2021 04:50:04 - INFO - codeparrot_training - Step 32158: {'lr': 0.0001422021183479284, 'samples': 16465408, 'steps': 32158, 'batch_loss/train': 0.8972427761182189}
12/27/2021 04:50:15 - INFO - codeparrot_training - Step 32159: {'lr': 0.0001421878961100853, 'samples': 16465920, 'steps': 32159, 'batch_loss/train': 0.6142801374662668}
12/27/2021 04:50:25 - INFO - codeparrot_training - Step 32160: {'lr': 0.00014217367430086254, 'samples': 16466432, 'steps': 32160, 'batch_loss/train': 0.8135367808863521}
12/27/2021 04:50:37 - INFO - codeparrot_training - Step 32161: {'lr': 0.00014215945292031642, 'samples': 16466944, 'steps': 32161, 'batch_loss/train': 0.7092428961768746}
12/27/2021 04:50:48 - INFO - codeparrot_training - Step 32162: {'lr': 0.00014214523196850358, 'samples': 16467456, 'steps': 32162, 'batch_loss/train': 0.6933655021712184}
12/27/2021 04:50:58 - INFO - codeparrot_training - Step 32163: {'lr': 0.00014213101144548062, 'samples': 16467968, 'steps': 32163, 'batch_loss/train': 0.7092152631375939}
12/27/2021 04:51:11 - INFO - codeparrot_training - Step 32164: {'lr': 0.000142116791351304, 'samples': 16468480, 'steps': 32164, 'batch_loss/train': 0.7825713781639934}
12/27/2021 04:51:22 - INFO - codeparrot_training - Step 32165: {'lr': 0.0001421025716860303, 'samples': 16468992, 'steps': 32165, 'batch_loss/train': 0.7320784656330943}
12/27/2021 04:51:32 - INFO - codeparrot_training - Step 32166: {'lr': 0.00014208835244971602, 'samples': 16469504, 'steps': 32166, 'batch_loss/train': 0.6672828306327574}
12/27/2021 04:51:43 - INFO - codeparrot_training - Step 32167: {'lr': 0.0001420741336424177, 'samples': 16470016, 'steps': 32167, 'batch_loss/train': 0.8390267267823219}
12/27/2021 04:51:56 - INFO - codeparrot_training - Step 32168: {'lr': 0.00014205991526419187, 'samples': 16470528, 'steps': 32168, 'batch_loss/train': 0.7378744222223759}
12/27/2021 04:52:07 - INFO - codeparrot_training - Step 32169: {'lr': 0.00014204569731509508, 'samples': 16471040, 'steps': 32169, 'batch_loss/train': 0.7298314590007067}
12/27/2021 04:52:17 - INFO - codeparrot_training - Step 32170: {'lr': 0.00014203147979518382, 'samples': 16471552, 'steps': 32170, 'batch_loss/train': 0.7943827286362648}
12/27/2021 04:52:29 - INFO - codeparrot_training - Step 32171: {'lr': 0.00014201726270451472, 'samples': 16472064, 'steps': 32171, 'batch_loss/train': 0.6417138166725636}
12/27/2021 04:52:40 - INFO - codeparrot_training - Step 32172: {'lr': 0.00014200304604314413, 'samples': 16472576, 'steps': 32172, 'batch_loss/train': 0.7467708699405193}
12/27/2021 04:52:51 - INFO - codeparrot_training - Step 32173: {'lr': 0.00014198882981112854, 'samples': 16473088, 'steps': 32173, 'batch_loss/train': 0.7656828830949962}
12/27/2021 04:53:01 - INFO - codeparrot_training - Step 32174: {'lr': 0.0001419746140085248, 'samples': 16473600, 'steps': 32174, 'batch_loss/train': 0.7040509782673325}
12/27/2021 04:53:14 - INFO - codeparrot_training - Step 32175: {'lr': 0.0001419603986353891, 'samples': 16474112, 'steps': 32175, 'batch_loss/train': 0.7360111903399229}
12/27/2021 04:53:25 - INFO - codeparrot_training - Step 32176: {'lr': 0.00014194618369177804, 'samples': 16474624, 'steps': 32176, 'batch_loss/train': 0.7857917854562402}
12/27/2021 04:53:35 - INFO - codeparrot_training - Step 32177: {'lr': 0.00014193196917774818, 'samples': 16475136, 'steps': 32177, 'batch_loss/train': 0.7192990938201547}
12/27/2021 04:53:48 - INFO - codeparrot_training - Step 32178: {'lr': 0.00014191775509335598, 'samples': 16475648, 'steps': 32178, 'batch_loss/train': 0.715227362466976}
12/27/2021 04:53:58 - INFO - codeparrot_training - Step 32179: {'lr': 0.00014190354143865796, 'samples': 16476160, 'steps': 32179, 'batch_loss/train': 0.6661192501778714}
12/27/2021 04:54:09 - INFO - codeparrot_training - Step 32180: {'lr': 0.0001418893282137107, 'samples': 16476672, 'steps': 32180, 'batch_loss/train': 0.7803647965192795}
12/27/2021 04:54:22 - INFO - codeparrot_training - Step 32181: {'lr': 0.0001418751154185706, 'samples': 16477184, 'steps': 32181, 'batch_loss/train': 1.347253710089717}
12/27/2021 04:54:33 - INFO - codeparrot_training - Step 32182: {'lr': 0.00014186090305329425, 'samples': 16477696, 'steps': 32182, 'batch_loss/train': 0.7609147471375763}
12/27/2021 04:54:43 - INFO - codeparrot_training - Step 32183: {'lr': 0.00014184669111793818, 'samples': 16478208, 'steps': 32183, 'batch_loss/train': 0.7311585622373968}
12/27/2021 04:54:54 - INFO - codeparrot_training - Step 32184: {'lr': 0.00014183247961255867, 'samples': 16478720, 'steps': 32184, 'batch_loss/train': 0.6708004353567958}
12/27/2021 04:55:06 - INFO - codeparrot_training - Step 32185: {'lr': 0.00014181826853721248, 'samples': 16479232, 'steps': 32185, 'batch_loss/train': 0.7673129066824913}
12/27/2021 04:55:17 - INFO - codeparrot_training - Step 32186: {'lr': 0.00014180405789195608, 'samples': 16479744, 'steps': 32186, 'batch_loss/train': 0.7414378123939969}
12/27/2021 04:55:27 - INFO - codeparrot_training - Step 32187: {'lr': 0.00014178984767684578, 'samples': 16480256, 'steps': 32187, 'batch_loss/train': 0.7140732952393591}
12/27/2021 04:55:39 - INFO - codeparrot_training - Step 32188: {'lr': 0.0001417756378919382, 'samples': 16480768, 'steps': 32188, 'batch_loss/train': 0.7050739550031722}
12/27/2021 04:55:50 - INFO - codeparrot_training - Step 32189: {'lr': 0.00014176142853728984, 'samples': 16481280, 'steps': 32189, 'batch_loss/train': 0.7750973305664957}
12/27/2021 04:56:00 - INFO - codeparrot_training - Step 32190: {'lr': 0.00014174721961295717, 'samples': 16481792, 'steps': 32190, 'batch_loss/train': 0.6859108693897724}
12/27/2021 04:56:13 - INFO - codeparrot_training - Step 32191: {'lr': 0.00014173301111899668, 'samples': 16482304, 'steps': 32191, 'batch_loss/train': 0.7529450310394168}
12/27/2021 04:56:24 - INFO - codeparrot_training - Step 32192: {'lr': 0.0001417188030554648, 'samples': 16482816, 'steps': 32192, 'batch_loss/train': 0.8447713954374194}
12/27/2021 04:56:34 - INFO - codeparrot_training - Step 32193: {'lr': 0.00014170459542241814, 'samples': 16483328, 'steps': 32193, 'batch_loss/train': 0.741950151976198}
12/27/2021 04:56:45 - INFO - codeparrot_training - Step 32194: {'lr': 0.0001416903882199131, 'samples': 16483840, 'steps': 32194, 'batch_loss/train': 0.7032777639105916}
12/27/2021 04:56:57 - INFO - codeparrot_training - Step 32195: {'lr': 0.00014167618144800617, 'samples': 16484352, 'steps': 32195, 'batch_loss/train': 0.8060387915465981}
12/27/2021 04:57:08 - INFO - codeparrot_training - Step 32196: {'lr': 0.00014166197510675384, 'samples': 16484864, 'steps': 32196, 'batch_loss/train': 0.6959010171703994}
12/27/2021 04:57:18 - INFO - codeparrot_training - Step 32197: {'lr': 0.0001416477691962127, 'samples': 16485376, 'steps': 32197, 'batch_loss/train': 0.7002528982702643}
12/27/2021 04:57:30 - INFO - codeparrot_training - Step 32198: {'lr': 0.00014163356371643904, 'samples': 16485888, 'steps': 32198, 'batch_loss/train': 0.8412300683557987}
12/27/2021 04:57:41 - INFO - codeparrot_training - Step 32199: {'lr': 0.00014161935866748926, 'samples': 16486400, 'steps': 32199, 'batch_loss/train': 0.7334463535808027}
12/27/2021 04:57:52 - INFO - codeparrot_training - Step 32200: {'lr': 0.00014160515404942024, 'samples': 16486912, 'steps': 32200, 'batch_loss/train': 0.7032351987436414}
12/27/2021 04:58:04 - INFO - codeparrot_training - Step 32201: {'lr': 0.00014159094986228803, 'samples': 16487424, 'steps': 32201, 'batch_loss/train': 0.7727161776274443}
12/27/2021 04:58:14 - INFO - codeparrot_training - Step 32202: {'lr': 0.00014157674610614936, 'samples': 16487936, 'steps': 32202, 'batch_loss/train': 0.652889487799257}
12/27/2021 04:58:25 - INFO - codeparrot_training - Step 32203: {'lr': 0.00014156254278106057, 'samples': 16488448, 'steps': 32203, 'batch_loss/train': 0.7186842178925872}
12/27/2021 04:58:38 - INFO - codeparrot_training - Step 32204: {'lr': 0.00014154833988707817, 'samples': 16488960, 'steps': 32204, 'batch_loss/train': 0.7048428151756525}
12/27/2021 04:58:48 - INFO - codeparrot_training - Step 32205: {'lr': 0.00014153413742425857, 'samples': 16489472, 'steps': 32205, 'batch_loss/train': 0.7167224232107401}
12/27/2021 04:58:59 - INFO - codeparrot_training - Step 32206: {'lr': 0.00014151993539265835, 'samples': 16489984, 'steps': 32206, 'batch_loss/train': 0.6723442273214459}
12/27/2021 04:59:09 - INFO - codeparrot_training - Step 32207: {'lr': 0.00014150573379233386, 'samples': 16490496, 'steps': 32207, 'batch_loss/train': 0.7613435061648488}
12/27/2021 04:59:22 - INFO - codeparrot_training - Step 32208: {'lr': 0.00014149153262334164, 'samples': 16491008, 'steps': 32208, 'batch_loss/train': 0.7208065101876855}
12/27/2021 04:59:32 - INFO - codeparrot_training - Step 32209: {'lr': 0.0001414773318857382, 'samples': 16491520, 'steps': 32209, 'batch_loss/train': 0.7008938565850258}
12/27/2021 04:59:43 - INFO - codeparrot_training - Step 32210: {'lr': 0.00014146313157957968, 'samples': 16492032, 'steps': 32210, 'batch_loss/train': 0.7205646671354771}
12/27/2021 04:59:56 - INFO - codeparrot_training - Step 32211: {'lr': 0.00014144893170492292, 'samples': 16492544, 'steps': 32211, 'batch_loss/train': 0.6695821348112077}
12/27/2021 05:00:06 - INFO - codeparrot_training - Step 32212: {'lr': 0.0001414347322618243, 'samples': 16493056, 'steps': 32212, 'batch_loss/train': 0.6424560244195163}
12/27/2021 05:00:17 - INFO - codeparrot_training - Step 32213: {'lr': 0.0001414205332503401, 'samples': 16493568, 'steps': 32213, 'batch_loss/train': 0.6752178238821216}
12/27/2021 05:00:29 - INFO - codeparrot_training - Step 32214: {'lr': 0.00014140633467052676, 'samples': 16494080, 'steps': 32214, 'batch_loss/train': 0.673171253874898}
12/27/2021 05:00:39 - INFO - codeparrot_training - Step 32215: {'lr': 0.000141392136522441, 'samples': 16494592, 'steps': 32215, 'batch_loss/train': 0.8013190235942602}
12/27/2021 05:00:50 - INFO - codeparrot_training - Step 32216: {'lr': 0.00014137793880613904, 'samples': 16495104, 'steps': 32216, 'batch_loss/train': 0.7517765467055142}
12/27/2021 05:01:01 - INFO - codeparrot_training - Step 32217: {'lr': 0.00014136374152167735, 'samples': 16495616, 'steps': 32217, 'batch_loss/train': 0.7360608587041497}
12/27/2021 05:01:13 - INFO - codeparrot_training - Step 32218: {'lr': 0.00014134954466911242, 'samples': 16496128, 'steps': 32218, 'batch_loss/train': 0.8172041983343661}
12/27/2021 05:01:23 - INFO - codeparrot_training - Step 32219: {'lr': 0.0001413353482485007, 'samples': 16496640, 'steps': 32219, 'batch_loss/train': 0.5885238312184811}
12/27/2021 05:01:34 - INFO - codeparrot_training - Step 32220: {'lr': 0.00014132115225989856, 'samples': 16497152, 'steps': 32220, 'batch_loss/train': 0.7939993436448276}
12/27/2021 05:01:47 - INFO - codeparrot_training - Step 32221: {'lr': 0.0001413069567033625, 'samples': 16497664, 'steps': 32221, 'batch_loss/train': 0.6865488516632468}
12/27/2021 05:01:57 - INFO - codeparrot_training - Step 32222: {'lr': 0.00014129276157894894, 'samples': 16498176, 'steps': 32222, 'batch_loss/train': 0.7393544414080679}
12/27/2021 05:02:08 - INFO - codeparrot_training - Step 32223: {'lr': 0.0001412785668867143, 'samples': 16498688, 'steps': 32223, 'batch_loss/train': 0.66373236104846}
12/27/2021 05:02:20 - INFO - codeparrot_training - Step 32224: {'lr': 0.00014126437262671515, 'samples': 16499200, 'steps': 32224, 'batch_loss/train': 0.6117088152095675}
12/27/2021 05:02:31 - INFO - codeparrot_training - Step 32225: {'lr': 0.0001412501787990076, 'samples': 16499712, 'steps': 32225, 'batch_loss/train': 0.7418463006615639}
12/27/2021 05:02:42 - INFO - codeparrot_training - Step 32226: {'lr': 0.00014123598540364846, 'samples': 16500224, 'steps': 32226, 'batch_loss/train': 0.5456150131067261}
12/27/2021 05:02:54 - INFO - codeparrot_training - Step 32227: {'lr': 0.00014122179244069387, 'samples': 16500736, 'steps': 32227, 'batch_loss/train': 0.7089443574659526}
12/27/2021 05:03:05 - INFO - codeparrot_training - Step 32228: {'lr': 0.0001412075999102004, 'samples': 16501248, 'steps': 32228, 'batch_loss/train': 0.8090895470231771}
12/27/2021 05:03:16 - INFO - codeparrot_training - Step 32229: {'lr': 0.00014119340781222438, 'samples': 16501760, 'steps': 32229, 'batch_loss/train': 0.7077463157474995}
12/27/2021 05:03:26 - INFO - codeparrot_training - Step 32230: {'lr': 0.00014117921614682227, 'samples': 16502272, 'steps': 32230, 'batch_loss/train': 0.7505989549681544}
12/27/2021 05:03:38 - INFO - codeparrot_training - Step 32231: {'lr': 0.0001411650249140506, 'samples': 16502784, 'steps': 32231, 'batch_loss/train': 0.6914443150162697}
12/27/2021 05:03:49 - INFO - codeparrot_training - Step 32232: {'lr': 0.00014115083411396563, 'samples': 16503296, 'steps': 32232, 'batch_loss/train': 0.6876529278233647}
12/27/2021 05:03:59 - INFO - codeparrot_training - Step 32233: {'lr': 0.00014113664374662384, 'samples': 16503808, 'steps': 32233, 'batch_loss/train': 0.7608127021230757}
12/27/2021 05:04:12 - INFO - codeparrot_training - Step 32234: {'lr': 0.00014112245381208167, 'samples': 16504320, 'steps': 32234, 'batch_loss/train': 0.8000100161880255}
12/27/2021 05:04:22 - INFO - codeparrot_training - Step 32235: {'lr': 0.00014110826431039562, 'samples': 16504832, 'steps': 32235, 'batch_loss/train': 0.6675173491239548}
12/27/2021 05:04:33 - INFO - codeparrot_training - Step 32236: {'lr': 0.00014109407524162177, 'samples': 16505344, 'steps': 32236, 'batch_loss/train': 0.6243846314027905}
12/27/2021 05:04:45 - INFO - codeparrot_training - Step 32237: {'lr': 0.00014107988660581688, 'samples': 16505856, 'steps': 32237, 'batch_loss/train': 0.6929899465758353}
12/27/2021 05:04:56 - INFO - codeparrot_training - Step 32238: {'lr': 0.0001410656984030373, 'samples': 16506368, 'steps': 32238, 'batch_loss/train': 0.753618320915848}
12/27/2021 05:05:06 - INFO - codeparrot_training - Step 32239: {'lr': 0.00014105151063333933, 'samples': 16506880, 'steps': 32239, 'batch_loss/train': 0.6955355550162494}
12/27/2021 05:05:18 - INFO - codeparrot_training - Step 32240: {'lr': 0.00014103732329677927, 'samples': 16507392, 'steps': 32240, 'batch_loss/train': 0.6712037334218621}
12/27/2021 05:05:29 - INFO - codeparrot_training - Step 32241: {'lr': 0.00014102313639341386, 'samples': 16507904, 'steps': 32241, 'batch_loss/train': 0.6204511274117976}
12/27/2021 05:05:40 - INFO - codeparrot_training - Step 32242: {'lr': 0.00014100894992329923, 'samples': 16508416, 'steps': 32242, 'batch_loss/train': 0.6209729824913666}
12/27/2021 05:05:50 - INFO - codeparrot_training - Step 32243: {'lr': 0.00014099476388649187, 'samples': 16508928, 'steps': 32243, 'batch_loss/train': 0.7053787927143276}
12/27/2021 05:06:03 - INFO - codeparrot_training - Step 32244: {'lr': 0.00014098057828304818, 'samples': 16509440, 'steps': 32244, 'batch_loss/train': 0.7640824615955353}
12/27/2021 05:06:14 - INFO - codeparrot_training - Step 32245: {'lr': 0.00014096639311302452, 'samples': 16509952, 'steps': 32245, 'batch_loss/train': 0.7382045714184642}
12/27/2021 05:06:24 - INFO - codeparrot_training - Step 32246: {'lr': 0.00014095220837647733, 'samples': 16510464, 'steps': 32246, 'batch_loss/train': 0.6433682451024652}
12/27/2021 05:06:36 - INFO - codeparrot_training - Step 32247: {'lr': 0.000140938024073463, 'samples': 16510976, 'steps': 32247, 'batch_loss/train': 0.7096298718824983}
12/27/2021 05:06:47 - INFO - codeparrot_training - Step 32248: {'lr': 0.00014092384020403787, 'samples': 16511488, 'steps': 32248, 'batch_loss/train': 0.6340147789160255}
12/27/2021 05:06:58 - INFO - codeparrot_training - Step 32249: {'lr': 0.00014090965676825836, 'samples': 16512000, 'steps': 32249, 'batch_loss/train': 0.7777726221829653}
12/27/2021 05:07:10 - INFO - codeparrot_training - Step 32250: {'lr': 0.000140895473766181, 'samples': 16512512, 'steps': 32250, 'batch_loss/train': 0.6772524868138134}
12/27/2021 05:07:21 - INFO - codeparrot_training - Step 32251: {'lr': 0.0001408812911978618, 'samples': 16513024, 'steps': 32251, 'batch_loss/train': 0.683428649790585}
12/27/2021 05:07:32 - INFO - codeparrot_training - Step 32252: {'lr': 0.00014086710906335763, 'samples': 16513536, 'steps': 32252, 'batch_loss/train': 0.7173400986939669}
12/27/2021 05:07:42 - INFO - codeparrot_training - Step 32253: {'lr': 0.00014085292736272452, 'samples': 16514048, 'steps': 32253, 'batch_loss/train': 0.851617312990129}
12/27/2021 05:07:54 - INFO - codeparrot_training - Step 32254: {'lr': 0.00014083874609601882, 'samples': 16514560, 'steps': 32254, 'batch_loss/train': 0.664437529630959}
12/27/2021 05:08:05 - INFO - codeparrot_training - Step 32255: {'lr': 0.00014082456526329728, 'samples': 16515072, 'steps': 32255, 'batch_loss/train': 0.8386290315538645}
12/27/2021 05:08:16 - INFO - codeparrot_training - Step 32256: {'lr': 0.00014081038486461595, 'samples': 16515584, 'steps': 32256, 'batch_loss/train': 0.7970191591884941}
12/27/2021 05:08:28 - INFO - codeparrot_training - Step 32257: {'lr': 0.00014079620490003127, 'samples': 16516096, 'steps': 32257, 'batch_loss/train': 0.8231175349210389}
12/27/2021 05:08:38 - INFO - codeparrot_training - Step 32258: {'lr': 0.00014078202536959967, 'samples': 16516608, 'steps': 32258, 'batch_loss/train': 0.6344115226529539}
12/27/2021 05:08:49 - INFO - codeparrot_training - Step 32259: {'lr': 0.0001407678462733775, 'samples': 16517120, 'steps': 32259, 'batch_loss/train': 0.6986809694208205}
12/27/2021 05:09:02 - INFO - codeparrot_training - Step 32260: {'lr': 0.00014075366761142111, 'samples': 16517632, 'steps': 32260, 'batch_loss/train': 0.6686365585774183}
12/27/2021 05:09:12 - INFO - codeparrot_training - Step 32261: {'lr': 0.000140739489383787, 'samples': 16518144, 'steps': 32261, 'batch_loss/train': 0.7574848776566796}
12/27/2021 05:09:23 - INFO - codeparrot_training - Step 32262: {'lr': 0.0001407253115905312, 'samples': 16518656, 'steps': 32262, 'batch_loss/train': 0.6909284089924768}
12/27/2021 05:09:35 - INFO - codeparrot_training - Step 32263: {'lr': 0.00014071113423171045, 'samples': 16519168, 'steps': 32263, 'batch_loss/train': 0.7538660462014377}
12/27/2021 05:09:46 - INFO - codeparrot_training - Step 32264: {'lr': 0.000140696957307381, 'samples': 16519680, 'steps': 32264, 'batch_loss/train': 0.6747748528141528}
12/27/2021 05:09:56 - INFO - codeparrot_training - Step 32265: {'lr': 0.00014068278081759915, 'samples': 16520192, 'steps': 32265, 'batch_loss/train': 0.7267934088595212}
12/27/2021 05:10:07 - INFO - codeparrot_training - Step 32266: {'lr': 0.00014066860476242115, 'samples': 16520704, 'steps': 32266, 'batch_loss/train': 0.653785171394702}
12/27/2021 05:10:20 - INFO - codeparrot_training - Step 32267: {'lr': 0.0001406544291419037, 'samples': 16521216, 'steps': 32267, 'batch_loss/train': 0.7216460141353309}
12/27/2021 05:10:31 - INFO - codeparrot_training - Step 32268: {'lr': 0.00014064025395610285, 'samples': 16521728, 'steps': 32268, 'batch_loss/train': 0.6871995311230421}
12/27/2021 05:10:41 - INFO - codeparrot_training - Step 32269: {'lr': 0.00014062607920507499, 'samples': 16522240, 'steps': 32269, 'batch_loss/train': 0.673477269243449}
12/27/2021 05:10:53 - INFO - codeparrot_training - Step 32270: {'lr': 0.00014061190488887668, 'samples': 16522752, 'steps': 32270, 'batch_loss/train': 0.6558797525940463}
12/27/2021 05:11:04 - INFO - codeparrot_training - Step 32271: {'lr': 0.00014059773100756406, 'samples': 16523264, 'steps': 32271, 'batch_loss/train': 1.4619604954496026}
12/27/2021 05:11:14 - INFO - codeparrot_training - Step 32272: {'lr': 0.00014058355756119356, 'samples': 16523776, 'steps': 32272, 'batch_loss/train': 0.7344227954745293}
12/27/2021 05:11:26 - INFO - codeparrot_training - Step 32273: {'lr': 0.00014056938454982149, 'samples': 16524288, 'steps': 32273, 'batch_loss/train': 0.8013887545093894}
12/27/2021 05:11:37 - INFO - codeparrot_training - Step 32274: {'lr': 0.00014055521197350426, 'samples': 16524800, 'steps': 32274, 'batch_loss/train': 0.7755100411595777}
12/27/2021 05:11:48 - INFO - codeparrot_training - Step 32275: {'lr': 0.00014054103983229818, 'samples': 16525312, 'steps': 32275, 'batch_loss/train': 0.8104966445825994}
12/27/2021 05:11:58 - INFO - codeparrot_training - Step 32276: {'lr': 0.0001405268681262597, 'samples': 16525824, 'steps': 32276, 'batch_loss/train': 0.6896214940352365}
12/27/2021 05:12:11 - INFO - codeparrot_training - Step 32277: {'lr': 0.00014051269685544485, 'samples': 16526336, 'steps': 32277, 'batch_loss/train': 0.7028019819408655}
12/27/2021 05:12:21 - INFO - codeparrot_training - Step 32278: {'lr': 0.00014049852601991038, 'samples': 16526848, 'steps': 32278, 'batch_loss/train': 0.7507393141277134}
12/27/2021 05:12:32 - INFO - codeparrot_training - Step 32279: {'lr': 0.0001404843556197123, 'samples': 16527360, 'steps': 32279, 'batch_loss/train': 0.6852635028772056}
12/27/2021 05:12:45 - INFO - codeparrot_training - Step 32280: {'lr': 0.000140470185654907, 'samples': 16527872, 'steps': 32280, 'batch_loss/train': 0.7707140631973743}
12/27/2021 05:12:55 - INFO - codeparrot_training - Step 32281: {'lr': 0.0001404560161255511, 'samples': 16528384, 'steps': 32281, 'batch_loss/train': 0.7048487275606021}
12/27/2021 05:13:06 - INFO - codeparrot_training - Step 32282: {'lr': 0.00014044184703170058, 'samples': 16528896, 'steps': 32282, 'batch_loss/train': 0.6588018386391923}
12/27/2021 05:13:18 - INFO - codeparrot_training - Step 32283: {'lr': 0.0001404276783734119, 'samples': 16529408, 'steps': 32283, 'batch_loss/train': 0.6386478452477604}
12/27/2021 05:13:29 - INFO - codeparrot_training - Step 32284: {'lr': 0.00014041351015074143, 'samples': 16529920, 'steps': 32284, 'batch_loss/train': 0.6294620789121836}
12/27/2021 05:13:39 - INFO - codeparrot_training - Step 32285: {'lr': 0.00014039934236374546, 'samples': 16530432, 'steps': 32285, 'batch_loss/train': 0.5020782520296052}
12/27/2021 05:13:50 - INFO - codeparrot_training - Step 32286: {'lr': 0.0001403851750124803, 'samples': 16530944, 'steps': 32286, 'batch_loss/train': 0.8720851298421621}
12/27/2021 05:14:02 - INFO - codeparrot_training - Step 32287: {'lr': 0.00014037100809700243, 'samples': 16531456, 'steps': 32287, 'batch_loss/train': 0.7483430681750178}
12/27/2021 05:14:13 - INFO - codeparrot_training - Step 32288: {'lr': 0.0001403568416173678, 'samples': 16531968, 'steps': 32288, 'batch_loss/train': 0.6981320797931403}
12/27/2021 05:14:23 - INFO - codeparrot_training - Step 32289: {'lr': 0.0001403426755736331, 'samples': 16532480, 'steps': 32289, 'batch_loss/train': 0.7333119437098503}
12/27/2021 05:14:36 - INFO - codeparrot_training - Step 32290: {'lr': 0.0001403285099658546, 'samples': 16532992, 'steps': 32290, 'batch_loss/train': 0.754728065803647}
12/27/2021 05:14:47 - INFO - codeparrot_training - Step 32291: {'lr': 0.00014031434479408833, 'samples': 16533504, 'steps': 32291, 'batch_loss/train': 0.731094847433269}
12/27/2021 05:14:57 - INFO - codeparrot_training - Step 32292: {'lr': 0.0001403001800583909, 'samples': 16534016, 'steps': 32292, 'batch_loss/train': 0.6082432887051255}
12/27/2021 05:15:10 - INFO - codeparrot_training - Step 32293: {'lr': 0.00014028601575881867, 'samples': 16534528, 'steps': 32293, 'batch_loss/train': 0.7096018446609378}
12/27/2021 05:15:20 - INFO - codeparrot_training - Step 32294: {'lr': 0.00014027185189542768, 'samples': 16535040, 'steps': 32294, 'batch_loss/train': 0.6473485995084047}
12/27/2021 05:15:31 - INFO - codeparrot_training - Step 32295: {'lr': 0.0001402576884682743, 'samples': 16535552, 'steps': 32295, 'batch_loss/train': 0.8452915567904711}
12/27/2021 05:15:41 - INFO - codeparrot_training - Step 32296: {'lr': 0.0001402435254774151, 'samples': 16536064, 'steps': 32296, 'batch_loss/train': 0.788308777846396}
12/27/2021 05:15:54 - INFO - codeparrot_training - Step 32297: {'lr': 0.0001402293629229061, 'samples': 16536576, 'steps': 32297, 'batch_loss/train': 0.7010481357574463}
12/27/2021 05:16:05 - INFO - codeparrot_training - Step 32298: {'lr': 0.0001402152008048037, 'samples': 16537088, 'steps': 32298, 'batch_loss/train': 0.7920894725248218}
12/27/2021 05:16:16 - INFO - codeparrot_training - Step 32299: {'lr': 0.00014020103912316424, 'samples': 16537600, 'steps': 32299, 'batch_loss/train': 0.6913965749554336}
12/27/2021 05:16:28 - INFO - codeparrot_training - Step 32300: {'lr': 0.00014018687787804396, 'samples': 16538112, 'steps': 32300, 'batch_loss/train': 0.7396945971995592}
12/27/2021 05:16:39 - INFO - codeparrot_training - Step 32301: {'lr': 0.0001401727170694992, 'samples': 16538624, 'steps': 32301, 'batch_loss/train': 0.6422419087029994}
12/27/2021 05:16:49 - INFO - codeparrot_training - Step 32302: {'lr': 0.00014015855669758634, 'samples': 16539136, 'steps': 32302, 'batch_loss/train': 0.8203388601541519}
12/27/2021 05:17:02 - INFO - codeparrot_training - Step 32303: {'lr': 0.00014014439676236141, 'samples': 16539648, 'steps': 32303, 'batch_loss/train': 0.6303406322840601}
12/27/2021 05:17:12 - INFO - codeparrot_training - Step 32304: {'lr': 0.00014013023726388093, 'samples': 16540160, 'steps': 32304, 'batch_loss/train': 0.7581426119431853}
12/27/2021 05:17:23 - INFO - codeparrot_training - Step 32305: {'lr': 0.00014011607820220125, 'samples': 16540672, 'steps': 32305, 'batch_loss/train': 0.6641548555344343}
12/27/2021 05:17:34 - INFO - codeparrot_training - Step 32306: {'lr': 0.0001401019195773784, 'samples': 16541184, 'steps': 32306, 'batch_loss/train': 0.7428359037730843}
12/27/2021 05:17:46 - INFO - codeparrot_training - Step 32307: {'lr': 0.000140087761389469, 'samples': 16541696, 'steps': 32307, 'batch_loss/train': 0.7058635083958507}
12/27/2021 05:17:57 - INFO - codeparrot_training - Step 32308: {'lr': 0.00014007360363852905, 'samples': 16542208, 'steps': 32308, 'batch_loss/train': 0.6930845654569566}
12/27/2021 05:18:07 - INFO - codeparrot_training - Step 32309: {'lr': 0.00014005944632461493, 'samples': 16542720, 'steps': 32309, 'batch_loss/train': 0.5091306239482947}
12/27/2021 05:18:20 - INFO - codeparrot_training - Step 32310: {'lr': 0.00014004528944778295, 'samples': 16543232, 'steps': 32310, 'batch_loss/train': 0.6600369763909839}
12/27/2021 05:18:31 - INFO - codeparrot_training - Step 32311: {'lr': 0.0001400311330080894, 'samples': 16543744, 'steps': 32311, 'batch_loss/train': 0.7436795388348401}
12/27/2021 05:18:41 - INFO - codeparrot_training - Step 32312: {'lr': 0.00014001697700559052, 'samples': 16544256, 'steps': 32312, 'batch_loss/train': 0.443354940158315}
12/27/2021 05:18:54 - INFO - codeparrot_training - Step 32313: {'lr': 0.00014000282144034266, 'samples': 16544768, 'steps': 32313, 'batch_loss/train': 0.6823590728454292}
12/27/2021 05:19:04 - INFO - codeparrot_training - Step 32314: {'lr': 0.00013998866631240197, 'samples': 16545280, 'steps': 32314, 'batch_loss/train': 0.7069672392681241}
12/27/2021 05:19:15 - INFO - codeparrot_training - Step 32315: {'lr': 0.00013997451162182486, 'samples': 16545792, 'steps': 32315, 'batch_loss/train': 0.7410523823928088}
12/27/2021 05:19:27 - INFO - codeparrot_training - Step 32316: {'lr': 0.00013996035736866763, 'samples': 16546304, 'steps': 32316, 'batch_loss/train': 0.7432721578516066}
12/27/2021 05:19:38 - INFO - codeparrot_training - Step 32317: {'lr': 0.00013994620355298627, 'samples': 16546816, 'steps': 32317, 'batch_loss/train': 0.7362597666215152}
12/27/2021 05:19:48 - INFO - codeparrot_training - Step 32318: {'lr': 0.00013993205017483735, 'samples': 16547328, 'steps': 32318, 'batch_loss/train': 0.8229704652912915}
12/27/2021 05:19:59 - INFO - codeparrot_training - Step 32319: {'lr': 0.00013991789723427717, 'samples': 16547840, 'steps': 32319, 'batch_loss/train': 0.8080765102058649}
12/27/2021 05:20:12 - INFO - codeparrot_training - Step 32320: {'lr': 0.0001399037447313617, 'samples': 16548352, 'steps': 32320, 'batch_loss/train': 0.848365587182343}
12/27/2021 05:20:22 - INFO - codeparrot_training - Step 32321: {'lr': 0.0001398895926661473, 'samples': 16548864, 'steps': 32321, 'batch_loss/train': 0.6218033956829458}
12/27/2021 05:20:33 - INFO - codeparrot_training - Step 32322: {'lr': 0.00013987544103869054, 'samples': 16549376, 'steps': 32322, 'batch_loss/train': 0.6698503410443664}
12/27/2021 05:20:45 - INFO - codeparrot_training - Step 32323: {'lr': 0.00013986128984904732, 'samples': 16549888, 'steps': 32323, 'batch_loss/train': 0.7402838496491313}
12/27/2021 05:20:56 - INFO - codeparrot_training - Step 32324: {'lr': 0.00013984713909727403, 'samples': 16550400, 'steps': 32324, 'batch_loss/train': 0.2110253483406268}
12/27/2021 05:21:07 - INFO - codeparrot_training - Step 32325: {'lr': 0.0001398329887834269, 'samples': 16550912, 'steps': 32325, 'batch_loss/train': 0.7804268598556519}
12/27/2021 05:21:19 - INFO - codeparrot_training - Step 32326: {'lr': 0.00013981883890756223, 'samples': 16551424, 'steps': 32326, 'batch_loss/train': 0.5265220301225781}
12/27/2021 05:21:30 - INFO - codeparrot_training - Step 32327: {'lr': 0.00013980468946973623, 'samples': 16551936, 'steps': 32327, 'batch_loss/train': 0.7191035840660334}
12/27/2021 05:21:41 - INFO - codeparrot_training - Step 32328: {'lr': 0.0001397905404700052, 'samples': 16552448, 'steps': 32328, 'batch_loss/train': 0.7231624163687229}
12/27/2021 05:21:52 - INFO - codeparrot_training - Step 32329: {'lr': 0.00013977639190842535, 'samples': 16552960, 'steps': 32329, 'batch_loss/train': 0.6932185934856534}
12/27/2021 05:22:03 - INFO - codeparrot_training - Step 32330: {'lr': 0.00013976224378505297, 'samples': 16553472, 'steps': 32330, 'batch_loss/train': 0.7417432125657797}
12/27/2021 05:22:14 - INFO - codeparrot_training - Step 32331: {'lr': 0.00013974809609994439, 'samples': 16553984, 'steps': 32331, 'batch_loss/train': 0.7439419236034155}
12/27/2021 05:22:24 - INFO - codeparrot_training - Step 32332: {'lr': 0.00013973394885315548, 'samples': 16554496, 'steps': 32332, 'batch_loss/train': 0.7626461442559958}
12/27/2021 05:22:37 - INFO - codeparrot_training - Step 32333: {'lr': 0.00013971980204474305, 'samples': 16555008, 'steps': 32333, 'batch_loss/train': 0.7175390813499689}
12/27/2021 05:22:47 - INFO - codeparrot_training - Step 32334: {'lr': 0.0001397056556747629, 'samples': 16555520, 'steps': 32334, 'batch_loss/train': 0.73565723747015}
12/27/2021 05:22:58 - INFO - codeparrot_training - Step 32335: {'lr': 0.0001396915097432714, 'samples': 16556032, 'steps': 32335, 'batch_loss/train': 0.6494126156903803}
12/27/2021 05:23:12 - INFO - codeparrot_training - Step 32336: {'lr': 0.00013967736425032483, 'samples': 16556544, 'steps': 32336, 'batch_loss/train': 0.7941952338442206}
12/27/2021 05:23:22 - INFO - codeparrot_training - Step 32337: {'lr': 0.0001396632191959794, 'samples': 16557056, 'steps': 32337, 'batch_loss/train': 0.7518828478641808}
12/27/2021 05:23:33 - INFO - codeparrot_training - Step 32338: {'lr': 0.00013964907458029137, 'samples': 16557568, 'steps': 32338, 'batch_loss/train': 0.8009118097834289}
12/27/2021 05:23:45 - INFO - codeparrot_training - Step 32339: {'lr': 0.00013963493040331692, 'samples': 16558080, 'steps': 32339, 'batch_loss/train': 0.6985837467946112}
12/27/2021 05:23:56 - INFO - codeparrot_training - Step 32340: {'lr': 0.00013962078666511233, 'samples': 16558592, 'steps': 32340, 'batch_loss/train': 0.7208583783358335}
12/27/2021 05:24:06 - INFO - codeparrot_training - Step 32341: {'lr': 0.0001396066433657338, 'samples': 16559104, 'steps': 32341, 'batch_loss/train': 0.7429032474756241}
12/27/2021 05:24:17 - INFO - codeparrot_training - Step 32342: {'lr': 0.0001395925005052377, 'samples': 16559616, 'steps': 32342, 'batch_loss/train': 0.6830298871500418}
12/27/2021 05:24:29 - INFO - codeparrot_training - Step 32343: {'lr': 0.0001395783580836799, 'samples': 16560128, 'steps': 32343, 'batch_loss/train': 0.7842979421839118}
12/27/2021 05:24:40 - INFO - codeparrot_training - Step 32344: {'lr': 0.00013956421610111696, 'samples': 16560640, 'steps': 32344, 'batch_loss/train': 0.699182879878208}
12/27/2021 05:24:50 - INFO - codeparrot_training - Step 32345: {'lr': 0.00013955007455760513, 'samples': 16561152, 'steps': 32345, 'batch_loss/train': 0.6695302715525031}
12/27/2021 05:25:03 - INFO - codeparrot_training - Step 32346: {'lr': 0.00013953593345320038, 'samples': 16561664, 'steps': 32346, 'batch_loss/train': 0.8325824243947864}
12/27/2021 05:25:13 - INFO - codeparrot_training - Step 32347: {'lr': 0.00013952179278795896, 'samples': 16562176, 'steps': 32347, 'batch_loss/train': 0.6639157008030452}
12/27/2021 05:25:24 - INFO - codeparrot_training - Step 32348: {'lr': 0.00013950765256193735, 'samples': 16562688, 'steps': 32348, 'batch_loss/train': 0.4879343523643911}
12/27/2021 05:25:37 - INFO - codeparrot_training - Step 32349: {'lr': 0.0001394935127751915, 'samples': 16563200, 'steps': 32349, 'batch_loss/train': 0.7681866786442697}
12/27/2021 05:25:47 - INFO - codeparrot_training - Step 32350: {'lr': 0.0001394793734277777, 'samples': 16563712, 'steps': 32350, 'batch_loss/train': 0.7373222107999027}
12/27/2021 05:25:58 - INFO - codeparrot_training - Step 32351: {'lr': 0.00013946523451975224, 'samples': 16564224, 'steps': 32351, 'batch_loss/train': 0.8112887050956488}
12/27/2021 05:26:10 - INFO - codeparrot_training - Step 32352: {'lr': 0.00013945109605117124, 'samples': 16564736, 'steps': 32352, 'batch_loss/train': 0.696663680661004}
12/27/2021 05:26:21 - INFO - codeparrot_training - Step 32353: {'lr': 0.00013943695802209088, 'samples': 16565248, 'steps': 32353, 'batch_loss/train': 0.6448397925123572}
12/27/2021 05:26:32 - INFO - codeparrot_training - Step 32354: {'lr': 0.00013942282043256753, 'samples': 16565760, 'steps': 32354, 'batch_loss/train': 0.683486474212259}
12/27/2021 05:26:42 - INFO - codeparrot_training - Step 32355: {'lr': 0.00013940868328265722, 'samples': 16566272, 'steps': 32355, 'batch_loss/train': 0.6210959147429094}
12/27/2021 05:26:54 - INFO - codeparrot_training - Step 32356: {'lr': 0.00013939454657241627, 'samples': 16566784, 'steps': 32356, 'batch_loss/train': 0.7643389538861811}
12/27/2021 05:27:05 - INFO - codeparrot_training - Step 32357: {'lr': 0.00013938041030190092, 'samples': 16567296, 'steps': 32357, 'batch_loss/train': 0.7217443864792585}
12/27/2021 05:27:15 - INFO - codeparrot_training - Step 32358: {'lr': 0.0001393662744711671, 'samples': 16567808, 'steps': 32358, 'batch_loss/train': 0.7136455690488219}
12/27/2021 05:27:28 - INFO - codeparrot_training - Step 32359: {'lr': 0.00013935213908027128, 'samples': 16568320, 'steps': 32359, 'batch_loss/train': 0.7066012807190418}
12/27/2021 05:27:39 - INFO - codeparrot_training - Step 32360: {'lr': 0.0001393380041292697, 'samples': 16568832, 'steps': 32360, 'batch_loss/train': 0.6906012678518891}
12/27/2021 05:27:49 - INFO - codeparrot_training - Step 32361: {'lr': 0.0001393238696182183, 'samples': 16569344, 'steps': 32361, 'batch_loss/train': 0.6133220409974456}
12/27/2021 05:28:02 - INFO - codeparrot_training - Step 32362: {'lr': 0.00013930973554717344, 'samples': 16569856, 'steps': 32362, 'batch_loss/train': 0.7559969611465931}
12/27/2021 05:28:12 - INFO - codeparrot_training - Step 32363: {'lr': 0.00013929560191619126, 'samples': 16570368, 'steps': 32363, 'batch_loss/train': 0.7405283318366855}
12/27/2021 05:28:23 - INFO - codeparrot_training - Step 32364: {'lr': 0.00013928146872532797, 'samples': 16570880, 'steps': 32364, 'batch_loss/train': 0.721204839181155}
12/27/2021 05:28:36 - INFO - codeparrot_training - Step 32365: {'lr': 0.00013926733597463976, 'samples': 16571392, 'steps': 32365, 'batch_loss/train': 0.7201142306439579}
12/27/2021 05:28:47 - INFO - codeparrot_training - Step 32366: {'lr': 0.00013925320366418278, 'samples': 16571904, 'steps': 32366, 'batch_loss/train': 0.6387135619297624}
12/27/2021 05:28:57 - INFO - codeparrot_training - Step 32367: {'lr': 0.0001392390717940133, 'samples': 16572416, 'steps': 32367, 'batch_loss/train': 0.7856559292413294}
12/27/2021 05:29:08 - INFO - codeparrot_training - Step 32368: {'lr': 0.00013922494036418747, 'samples': 16572928, 'steps': 32368, 'batch_loss/train': 0.768601288087666}
12/27/2021 05:29:20 - INFO - codeparrot_training - Step 32369: {'lr': 0.00013921080937476132, 'samples': 16573440, 'steps': 32369, 'batch_loss/train': 0.7374382824636996}
12/27/2021 05:29:31 - INFO - codeparrot_training - Step 32370: {'lr': 0.00013919667882579125, 'samples': 16573952, 'steps': 32370, 'batch_loss/train': 0.8219026615843177}
12/27/2021 05:29:41 - INFO - codeparrot_training - Step 32371: {'lr': 0.00013918254871733344, 'samples': 16574464, 'steps': 32371, 'batch_loss/train': 0.5401793254422955}
12/27/2021 05:29:53 - INFO - codeparrot_training - Step 32372: {'lr': 0.00013916841904944386, 'samples': 16574976, 'steps': 32372, 'batch_loss/train': 0.6908401395776309}
12/27/2021 05:30:04 - INFO - codeparrot_training - Step 32373: {'lr': 0.0001391542898221787, 'samples': 16575488, 'steps': 32373, 'batch_loss/train': 0.8346827785135247}
12/27/2021 05:30:15 - INFO - codeparrot_training - Step 32374: {'lr': 0.00013914016103559447, 'samples': 16576000, 'steps': 32374, 'batch_loss/train': 0.7283850042149425}
12/27/2021 05:30:27 - INFO - codeparrot_training - Step 32375: {'lr': 0.00013912603268974693, 'samples': 16576512, 'steps': 32375, 'batch_loss/train': 0.7789133954793215}
12/27/2021 05:30:38 - INFO - codeparrot_training - Step 32376: {'lr': 0.00013911190478469246, 'samples': 16577024, 'steps': 32376, 'batch_loss/train': 0.953929761890322}
12/27/2021 05:30:49 - INFO - codeparrot_training - Step 32377: {'lr': 0.00013909777732048717, 'samples': 16577536, 'steps': 32377, 'batch_loss/train': 0.7046423451974988}
12/27/2021 05:30:59 - INFO - codeparrot_training - Step 32378: {'lr': 0.00013908365029718727, 'samples': 16578048, 'steps': 32378, 'batch_loss/train': 0.6734427093761042}
12/27/2021 05:31:11 - INFO - codeparrot_training - Step 32379: {'lr': 0.00013906952371484888, 'samples': 16578560, 'steps': 32379, 'batch_loss/train': 0.7410535831004381}
12/27/2021 05:31:22 - INFO - codeparrot_training - Step 32380: {'lr': 0.0001390553975735282, 'samples': 16579072, 'steps': 32380, 'batch_loss/train': 0.6309030561242253}
12/27/2021 05:31:33 - INFO - codeparrot_training - Step 32381: {'lr': 0.00013904127187328133, 'samples': 16579584,
'steps': 32381, 'batch_loss/train': 0.7413820698857307} 12/27/2021 05:31:45 - INFO - codeparrot_training - Step 32382: {'lr': 0.0001390271466141645, 'samples': 16580096, 'steps': 32382, 'batch_loss/train': 0.7370264614000916} 12/27/2021 05:31:55 - INFO - codeparrot_training - Step 32383: {'lr': 0.0001390130217962339, 'samples': 16580608, 'steps': 32383, 'batch_loss/train': 0.5851664608344436} 12/27/2021 05:32:06 - INFO - codeparrot_training - Step 32384: {'lr': 0.00013899889741954543, 'samples': 16581120, 'steps': 32384, 'batch_loss/train': 0.6550396382808685} 12/27/2021 05:32:18 - INFO - codeparrot_training - Step 32385: {'lr': 0.00013898477348415556, 'samples': 16581632, 'steps': 32385, 'batch_loss/train': 0.8114882009103894} 12/27/2021 05:32:29 - INFO - codeparrot_training - Step 32386: {'lr': 0.00013897064999012038, 'samples': 16582144, 'steps': 32386, 'batch_loss/train': 0.7587218033149838} 12/27/2021 05:32:39 - INFO - codeparrot_training - Step 32387: {'lr': 0.0001389565269374959, 'samples': 16582656, 'steps': 32387, 'batch_loss/train': 0.7428464675322175} 12/27/2021 05:32:53 - INFO - codeparrot_training - Step 32388: {'lr': 0.0001389424043263383, 'samples': 16583168, 'steps': 32388, 'batch_loss/train': 0.8147524707019329} 12/27/2021 05:33:03 - INFO - codeparrot_training - Step 32389: {'lr': 0.00013892828215670379, 'samples': 16583680, 'steps': 32389, 'batch_loss/train': 0.7617266097804531} 12/27/2021 05:33:14 - INFO - codeparrot_training - Step 32390: {'lr': 0.0001389141604286485, 'samples': 16584192, 'steps': 32390, 'batch_loss/train': 0.8466346114873886} 12/27/2021 05:33:24 - INFO - codeparrot_training - Step 32391: {'lr': 0.00013890003914222854, 'samples': 16584704, 'steps': 32391, 'batch_loss/train': 0.7170351557433605} 12/27/2021 05:33:37 - INFO - codeparrot_training - Step 32392: {'lr': 0.00013888591829750008, 'samples': 16585216, 'steps': 32392, 'batch_loss/train': 0.495068369229557} 12/27/2021 05:33:47 - INFO - codeparrot_training - Step 32393: 
{'lr': 0.00013887179789451927, 'samples': 16585728, 'steps': 32393, 'batch_loss/train': 0.7930173818022013} 12/27/2021 05:33:58 - INFO - codeparrot_training - Step 32394: {'lr': 0.00013885767793334226, 'samples': 16586240, 'steps': 32394, 'batch_loss/train': 0.6963072149083018} 12/27/2021 05:34:10 - INFO - codeparrot_training - Step 32395: {'lr': 0.0001388435584140251, 'samples': 16586752, 'steps': 32395, 'batch_loss/train': 0.7602996407076716} 12/27/2021 05:34:21 - INFO - codeparrot_training - Step 32396: {'lr': 0.000138829439336624, 'samples': 16587264, 'steps': 32396, 'batch_loss/train': 0.7176498537883162} 12/27/2021 05:34:32 - INFO - codeparrot_training - Step 32397: {'lr': 0.00013881532070119518, 'samples': 16587776, 'steps': 32397, 'batch_loss/train': 0.7548332056030631} 12/27/2021 05:34:44 - INFO - codeparrot_training - Step 32398: {'lr': 0.00013880120250779455, 'samples': 16588288, 'steps': 32398, 'batch_loss/train': 1.5376355491462164} 12/27/2021 05:34:55 - INFO - codeparrot_training - Step 32399: {'lr': 0.00013878708475647828, 'samples': 16588800, 'steps': 32399, 'batch_loss/train': 0.789752337615937} 12/27/2021 05:35:05 - INFO - codeparrot_training - Step 32400: {'lr': 0.00013877296744730272, 'samples': 16589312, 'steps': 32400, 'batch_loss/train': 0.714497294742614} 12/27/2021 05:35:16 - INFO - codeparrot_training - Step 32401: {'lr': 0.0001387588505803238, 'samples': 16589824, 'steps': 32401, 'batch_loss/train': 0.7001133184821811} 12/27/2021 05:35:29 - INFO - codeparrot_training - Step 32402: {'lr': 0.00013874473415559753, 'samples': 16590336, 'steps': 32402, 'batch_loss/train': 0.6017447697231546} 12/27/2021 05:35:39 - INFO - codeparrot_training - Step 32403: {'lr': 0.00013873061817318045, 'samples': 16590848, 'steps': 32403, 'batch_loss/train': 0.7502701710909605} 12/27/2021 05:35:50 - INFO - codeparrot_training - Step 32404: {'lr': 0.00013871650263312828, 'samples': 16591360, 'steps': 32404, 'batch_loss/train': 1.0415718303993344} 12/27/2021 
05:36:03 - INFO - codeparrot_training - Step 32405: {'lr': 0.00013870238753549728, 'samples': 16591872, 'steps': 32405, 'batch_loss/train': 0.810288674198091} 12/27/2021 05:36:13 - INFO - codeparrot_training - Step 32406: {'lr': 0.00013868827288034355, 'samples': 16592384, 'steps': 32406, 'batch_loss/train': 1.4429555293172598} 12/27/2021 05:36:24 - INFO - codeparrot_training - Step 32407: {'lr': 0.00013867415866772325, 'samples': 16592896, 'steps': 32407, 'batch_loss/train': 0.8011913076043129} 12/27/2021 05:36:34 - INFO - codeparrot_training - Step 32408: {'lr': 0.00013866004489769242, 'samples': 16593408, 'steps': 32408, 'batch_loss/train': 0.5983671038411558} 12/27/2021 05:36:47 - INFO - codeparrot_training - Step 32409: {'lr': 0.00013864593157030737, 'samples': 16593920, 'steps': 32409, 'batch_loss/train': 0.8532659986522049} 12/27/2021 05:36:57 - INFO - codeparrot_training - Step 32410: {'lr': 0.0001386318186856238, 'samples': 16594432, 'steps': 32410, 'batch_loss/train': 0.5224180004152004} 12/27/2021 05:37:08 - INFO - codeparrot_training - Step 32411: {'lr': 0.00013861770624369818, 'samples': 16594944, 'steps': 32411, 'batch_loss/train': 0.781175076495856} 12/27/2021 05:37:21 - INFO - codeparrot_training - Step 32412: {'lr': 0.0001386035942445866, 'samples': 16595456, 'steps': 32412, 'batch_loss/train': 0.751599351875484} 12/27/2021 05:37:32 - INFO - codeparrot_training - Step 32413: {'lr': 0.000138589482688345, 'samples': 16595968, 'steps': 32413, 'batch_loss/train': 0.6962300566956401} 12/27/2021 05:37:42 - INFO - codeparrot_training - Step 32414: {'lr': 0.0001385753715750295, 'samples': 16596480, 'steps': 32414, 'batch_loss/train': 0.6928810020908713} 12/27/2021 05:37:54 - INFO - codeparrot_training - Step 32415: {'lr': 0.00013856126090469628, 'samples': 16596992, 'steps': 32415, 'batch_loss/train': 0.5085188393713906} 12/27/2021 05:38:05 - INFO - codeparrot_training - Step 32416: {'lr': 0.0001385471506774014, 'samples': 16597504, 'steps': 32416, 
'batch_loss/train': 0.7534496751613915} 12/27/2021 05:38:16 - INFO - codeparrot_training - Step 32417: {'lr': 0.00013853304089320094, 'samples': 16598016, 'steps': 32417, 'batch_loss/train': 0.704179780150298} 12/27/2021 05:38:26 - INFO - codeparrot_training - Step 32418: {'lr': 0.00013851893155215106, 'samples': 16598528, 'steps': 32418, 'batch_loss/train': 0.7507332526147366} 12/27/2021 05:38:39 - INFO - codeparrot_training - Step 32419: {'lr': 0.0001385048226543078, 'samples': 16599040, 'steps': 32419, 'batch_loss/train': 0.8217652537859976} 12/27/2021 05:38:49 - INFO - codeparrot_training - Step 32420: {'lr': 0.0001384907141997273, 'samples': 16599552, 'steps': 32420, 'batch_loss/train': 0.4734814161201939} 12/27/2021 05:39:00 - INFO - codeparrot_training - Step 32421: {'lr': 0.0001384766061884656, 'samples': 16600064, 'steps': 32421, 'batch_loss/train': 0.7484965904732235} 12/27/2021 05:39:12 - INFO - codeparrot_training - Step 32422: {'lr': 0.00013846249862057878, 'samples': 16600576, 'steps': 32422, 'batch_loss/train': 0.8086684960871935} 12/27/2021 05:39:23 - INFO - codeparrot_training - Step 32423: {'lr': 0.00013844839149612304, 'samples': 16601088, 'steps': 32423, 'batch_loss/train': 0.7137076877988875} 12/27/2021 05:39:33 - INFO - codeparrot_training - Step 32424: {'lr': 0.00013843428481515429, 'samples': 16601600, 'steps': 32424, 'batch_loss/train': 0.7386652203276753} 12/27/2021 05:39:46 - INFO - codeparrot_training - Step 32425: {'lr': 0.00013842017857772863, 'samples': 16602112, 'steps': 32425, 'batch_loss/train': 0.7639040334615856} 12/27/2021 05:39:57 - INFO - codeparrot_training - Step 32426: {'lr': 0.00013840607278390238, 'samples': 16602624, 'steps': 32426, 'batch_loss/train': 0.6574268760159612} 12/27/2021 05:40:07 - INFO - codeparrot_training - Step 32427: {'lr': 0.00013839196743373135, 'samples': 16603136, 'steps': 32427, 'batch_loss/train': 0.7259528385475278} 12/27/2021 05:40:18 - INFO - codeparrot_training - Step 32428: {'lr': 
0.00013837786252727163, 'samples': 16603648, 'steps': 32428, 'batch_loss/train': 0.8341601993888617} 12/27/2021 05:40:30 - INFO - codeparrot_training - Step 32429: {'lr': 0.00013836375806457958, 'samples': 16604160, 'steps': 32429, 'batch_loss/train': 0.8033705744892359} 12/27/2021 05:40:41 - INFO - codeparrot_training - Step 32430: {'lr': 0.00013834965404571091, 'samples': 16604672, 'steps': 32430, 'batch_loss/train': 0.766113568097353} 12/27/2021 05:40:51 - INFO - codeparrot_training - Step 32431: {'lr': 0.00013833555047072194, 'samples': 16605184, 'steps': 32431, 'batch_loss/train': 0.6236517236102372} 12/27/2021 05:41:04 - INFO - codeparrot_training - Step 32432: {'lr': 0.00013832144733966865, 'samples': 16605696, 'steps': 32432, 'batch_loss/train': 0.5811746741528623} 12/27/2021 05:41:14 - INFO - codeparrot_training - Step 32433: {'lr': 0.00013830734465260707, 'samples': 16606208, 'steps': 32433, 'batch_loss/train': 0.7484292471781373} 12/27/2021 05:41:25 - INFO - codeparrot_training - Step 32434: {'lr': 0.00013829324240959335, 'samples': 16606720, 'steps': 32434, 'batch_loss/train': 0.612607421935536} 12/27/2021 05:41:38 - INFO - codeparrot_training - Step 32435: {'lr': 0.00013827914061068363, 'samples': 16607232, 'steps': 32435, 'batch_loss/train': 0.705434713861905} 12/27/2021 05:41:48 - INFO - codeparrot_training - Step 32436: {'lr': 0.00013826503925593365, 'samples': 16607744, 'steps': 32436, 'batch_loss/train': 0.7532886583358049} 12/27/2021 05:41:59 - INFO - codeparrot_training - Step 32437: {'lr': 0.00013825093834539975, 'samples': 16608256, 'steps': 32437, 'batch_loss/train': 0.7443896476179361} 12/27/2021 05:42:11 - INFO - codeparrot_training - Step 32438: {'lr': 0.00013823683787913808, 'samples': 16608768, 'steps': 32438, 'batch_loss/train': 0.671644221059978} 12/27/2021 05:42:22 - INFO - codeparrot_training - Step 32439: {'lr': 0.00013822273785720428, 'samples': 16609280, 'steps': 32439, 'batch_loss/train': 0.6642408790066838} 12/27/2021 05:42:32 - 
INFO - codeparrot_training - Step 32440: {'lr': 0.00013820863827965484, 'samples': 16609792, 'steps': 32440, 'batch_loss/train': 0.7788570588454604} 12/27/2021 05:42:43 - INFO - codeparrot_training - Step 32441: {'lr': 0.00013819453914654573, 'samples': 16610304, 'steps': 32441, 'batch_loss/train': 0.728126322850585} 12/27/2021 05:42:56 - INFO - codeparrot_training - Step 32442: {'lr': 0.00013818044045793278, 'samples': 16610816, 'steps': 32442, 'batch_loss/train': 0.8227411694824696} 12/27/2021 05:43:07 - INFO - codeparrot_training - Step 32443: {'lr': 0.0001381663422138722, 'samples': 16611328, 'steps': 32443, 'batch_loss/train': 0.7742307484149933} 12/27/2021 05:43:17 - INFO - codeparrot_training - Step 32444: {'lr': 0.00013815224441442, 'samples': 16611840, 'steps': 32444, 'batch_loss/train': 0.7940793074667454} 12/27/2021 05:43:29 - INFO - codeparrot_training - Step 32445: {'lr': 0.00013813814705963224, 'samples': 16612352, 'steps': 32445, 'batch_loss/train': 0.7053366787731647} 12/27/2021 05:43:40 - INFO - codeparrot_training - Step 32446: {'lr': 0.00013812405014956497, 'samples': 16612864, 'steps': 32446, 'batch_loss/train': 0.676638004835695} 12/27/2021 05:43:51 - INFO - codeparrot_training - Step 32447: {'lr': 0.00013810995368427427, 'samples': 16613376, 'steps': 32447, 'batch_loss/train': 0.5985388248227537} 12/27/2021 05:44:03 - INFO - codeparrot_training - Step 32448: {'lr': 0.0001380958576638161, 'samples': 16613888, 'steps': 32448, 'batch_loss/train': 0.7153349751606584} 12/27/2021 05:44:13 - INFO - codeparrot_training - Step 32449: {'lr': 0.0001380817620882467, 'samples': 16614400, 'steps': 32449, 'batch_loss/train': 0.6915714247152209} 12/27/2021 05:44:24 - INFO - codeparrot_training - Step 32450: {'lr': 0.00013806766695762183, 'samples': 16614912, 'steps': 32450, 'batch_loss/train': 0.6816259303595871} 12/27/2021 05:44:36 - INFO - codeparrot_training - Step 32451: {'lr': 0.00013805357227199755, 'samples': 16615424, 'steps': 32451, 
'batch_loss/train': 0.6785509377950802} 12/27/2021 05:44:47 - INFO - codeparrot_training - Step 32452: {'lr': 0.00013803947803143018, 'samples': 16615936, 'steps': 32452, 'batch_loss/train': 0.7248928765766323} 12/27/2021 05:44:57 - INFO - codeparrot_training - Step 32453: {'lr': 0.00013802538423597548, 'samples': 16616448, 'steps': 32453, 'batch_loss/train': 0.9132445277646184} 12/27/2021 05:45:08 - INFO - codeparrot_training - Step 32454: {'lr': 0.00013801129088568948, 'samples': 16616960, 'steps': 32454, 'batch_loss/train': 0.6215811809524894} 12/27/2021 05:45:21 - INFO - codeparrot_training - Step 32455: {'lr': 0.00013799719798062847, 'samples': 16617472, 'steps': 32455, 'batch_loss/train': 0.6927280491217971} 12/27/2021 05:45:31 - INFO - codeparrot_training - Step 32456: {'lr': 0.00013798310552084827, 'samples': 16617984, 'steps': 32456, 'batch_loss/train': 0.7068889955990016} 12/27/2021 05:45:42 - INFO - codeparrot_training - Step 32457: {'lr': 0.00013796901350640492, 'samples': 16618496, 'steps': 32457, 'batch_loss/train': 0.7138373600319028} 12/27/2021 05:45:54 - INFO - codeparrot_training - Step 32458: {'lr': 0.00013795492193735448, 'samples': 16619008, 'steps': 32458, 'batch_loss/train': 0.6901852479204535} 12/27/2021 05:46:05 - INFO - codeparrot_training - Step 32459: {'lr': 0.00013794083081375294, 'samples': 16619520, 'steps': 32459, 'batch_loss/train': 0.7198220230638981} 12/27/2021 05:46:15 - INFO - codeparrot_training - Step 32460: {'lr': 0.00013792674013565637, 'samples': 16620032, 'steps': 32460, 'batch_loss/train': 0.6757111614570022} 12/27/2021 05:46:27 - INFO - codeparrot_training - Step 32461: {'lr': 0.00013791264990312084, 'samples': 16620544, 'steps': 32461, 'batch_loss/train': 0.6527447723783553} 12/27/2021 05:46:38 - INFO - codeparrot_training - Step 32462: {'lr': 0.0001378985601162021, 'samples': 16621056, 'steps': 32462, 'batch_loss/train': 0.7372813150286674} 12/27/2021 05:46:49 - INFO - codeparrot_training - Step 32463: {'lr': 
0.00013788447077495646, 'samples': 16621568, 'steps': 32463, 'batch_loss/train': 0.638128784543369} 12/27/2021 05:47:02 - INFO - codeparrot_training - Step 32464: {'lr': 0.00013787038187943991, 'samples': 16622080, 'steps': 32464, 'batch_loss/train': 0.7169531243853271} 12/27/2021 05:47:12 - INFO - codeparrot_training - Step 32465: {'lr': 0.00013785629342970824, 'samples': 16622592, 'steps': 32465, 'batch_loss/train': 0.7599275065585971} 12/27/2021 05:47:23 - INFO - codeparrot_training - Step 32466: {'lr': 0.00013784220542581766, 'samples': 16623104, 'steps': 32466, 'batch_loss/train': 0.7788556346204132} 12/27/2021 05:47:34 - INFO - codeparrot_training - Step 32467: {'lr': 0.00013782811786782422, 'samples': 16623616, 'steps': 32467, 'batch_loss/train': 0.553601078921929} 12/27/2021 05:47:46 - INFO - codeparrot_training - Step 32468: {'lr': 0.00013781403075578374, 'samples': 16624128, 'steps': 32468, 'batch_loss/train': 0.7659071842208505} 12/27/2021 05:47:57 - INFO - codeparrot_training - Step 32469: {'lr': 0.0001377999440897523, 'samples': 16624640, 'steps': 32469, 'batch_loss/train': 0.6591085472609848} 12/27/2021 05:48:07 - INFO - codeparrot_training - Step 32470: {'lr': 0.00013778585786978592, 'samples': 16625152, 'steps': 32470, 'batch_loss/train': 0.5613021455646958} 12/27/2021 05:48:20 - INFO - codeparrot_training - Step 32471: {'lr': 0.00013777177209594061, 'samples': 16625664, 'steps': 32471, 'batch_loss/train': 0.7297030960326083} 12/27/2021 05:48:31 - INFO - codeparrot_training - Step 32472: {'lr': 0.00013775768676827237, 'samples': 16626176, 'steps': 32472, 'batch_loss/train': 0.6705396577599458} 12/27/2021 05:48:41 - INFO - codeparrot_training - Step 32473: {'lr': 0.00013774360188683715, 'samples': 16626688, 'steps': 32473, 'batch_loss/train': 0.7812702525407076} 12/27/2021 05:48:53 - INFO - codeparrot_training - Step 32474: {'lr': 0.00013772951745169102, 'samples': 16627200, 'steps': 32474, 'batch_loss/train': 0.7529483459657058} 12/27/2021 05:49:04 
- INFO - codeparrot_training - Step 32475: {'lr': 0.0001377154334628899, 'samples': 16627712, 'steps': 32475, 'batch_loss/train': 0.7927341302856803} 12/27/2021 05:49:15 - INFO - codeparrot_training - Step 32476: {'lr': 0.00013770134992048986, 'samples': 16628224, 'steps': 32476, 'batch_loss/train': 0.7769243735820055} 12/27/2021 05:49:27 - INFO - codeparrot_training - Step 32477: {'lr': 0.00013768726682454683, 'samples': 16628736, 'steps': 32477, 'batch_loss/train': 0.6747968327254057} 12/27/2021 05:49:37 - INFO - codeparrot_training - Step 32478: {'lr': 0.00013767318417511694, 'samples': 16629248, 'steps': 32478, 'batch_loss/train': 0.7257094540400431} 12/27/2021 05:49:48 - INFO - codeparrot_training - Step 32479: {'lr': 0.00013765910197225593, 'samples': 16629760, 'steps': 32479, 'batch_loss/train': 0.831925161415711} 12/27/2021 05:49:59 - INFO - codeparrot_training - Step 32480: {'lr': 0.00013764502021601982, 'samples': 16630272, 'steps': 32480, 'batch_loss/train': 0.7881812911946326} 12/27/2021 05:50:12 - INFO - codeparrot_training - Step 32481: {'lr': 0.00013763093890646489, 'samples': 16630784, 'steps': 32481, 'batch_loss/train': 0.802687874995172} 12/27/2021 05:50:23 - INFO - codeparrot_training - Step 32482: {'lr': 0.00013761685804364681, 'samples': 16631296, 'steps': 32482, 'batch_loss/train': 0.8430116111412644} 12/27/2021 05:50:33 - INFO - codeparrot_training - Step 32483: {'lr': 0.00013760277762762165, 'samples': 16631808, 'steps': 32483, 'batch_loss/train': 0.6907042318489403} 12/27/2021 05:50:45 - INFO - codeparrot_training - Step 32484: {'lr': 0.00013758869765844546, 'samples': 16632320, 'steps': 32484, 'batch_loss/train': 0.5471556948905345} 12/27/2021 05:50:56 - INFO - codeparrot_training - Step 32485: {'lr': 0.00013757461813617412, 'samples': 16632832, 'steps': 32485, 'batch_loss/train': 0.827866998501122} 12/27/2021 05:51:07 - INFO - codeparrot_training - Step 32486: {'lr': 0.00013756053906086363, 'samples': 16633344, 'steps': 32486, 
'batch_loss/train': 0.7250346280634403} 12/27/2021 05:51:19 - INFO - codeparrot_training - Step 32487: {'lr': 0.00013754646043257014, 'samples': 16633856, 'steps': 32487, 'batch_loss/train': 0.6856287503615022} 12/27/2021 05:51:29 - INFO - codeparrot_training - Step 32488: {'lr': 0.00013753238225134922, 'samples': 16634368, 'steps': 32488, 'batch_loss/train': 0.7769987247884274} 12/27/2021 05:51:40 - INFO - codeparrot_training - Step 32489: {'lr': 0.00013751830451725722, 'samples': 16634880, 'steps': 32489, 'batch_loss/train': 0.6275716994423419} 12/27/2021 05:51:51 - INFO - codeparrot_training - Step 32490: {'lr': 0.00013750422723035003, 'samples': 16635392, 'steps': 32490, 'batch_loss/train': 0.5762830912135541} 12/27/2021 05:52:03 - INFO - codeparrot_training - Step 32491: {'lr': 0.00013749015039068335, 'samples': 16635904, 'steps': 32491, 'batch_loss/train': 0.8496203701943159} 12/27/2021 05:52:13 - INFO - codeparrot_training - Step 32492: {'lr': 0.00013747607399831348, 'samples': 16636416, 'steps': 32492, 'batch_loss/train': 0.7157357486430556} 12/27/2021 05:52:24 - INFO - codeparrot_training - Step 32493: {'lr': 0.00013746199805329636, 'samples': 16636928, 'steps': 32493, 'batch_loss/train': 0.7555425718892366} 12/27/2021 05:52:37 - INFO - codeparrot_training - Step 32494: {'lr': 0.00013744792255568773, 'samples': 16637440, 'steps': 32494, 'batch_loss/train': 0.7542116555268876} 12/27/2021 05:52:48 - INFO - codeparrot_training - Step 32495: {'lr': 0.00013743384750554367, 'samples': 16637952, 'steps': 32495, 'batch_loss/train': 0.6957337207859382} 12/27/2021 05:52:58 - INFO - codeparrot_training - Step 32496: {'lr': 0.00013741977290292008, 'samples': 16638464, 'steps': 32496, 'batch_loss/train': 0.7453595474362373} 12/27/2021 05:53:10 - INFO - codeparrot_training - Step 32497: {'lr': 0.00013740569874787302, 'samples': 16638976, 'steps': 32497, 'batch_loss/train': 0.7514802729710937} 12/27/2021 05:53:21 - INFO - codeparrot_training - Step 32498: {'lr': 
0.00013739162504045838, 'samples': 16639488, 'steps': 32498, 'batch_loss/train': 0.7322430461645126} 12/27/2021 05:53:31 - INFO - codeparrot_training - Step 32499: {'lr': 0.00013737755178073208, 'samples': 16640000, 'steps': 32499, 'batch_loss/train': 0.6620134976692498} 12/27/2021 05:53:31 - INFO - codeparrot_training - Evaluating and saving model checkpoint 12/27/2021 05:56:54 - INFO - codeparrot_training - Step 32500: {'loss/eval': 0.7624319195747375, 'perplexity': 2.1434826850891113} 12/27/2021 05:57:12 - WARNING - huggingface_hub.repository - Several commits (11) will be pushed upstream. 12/27/2021 05:57:26 - INFO - codeparrot_training - Step 32500: {'lr': 0.00013736347896875018, 'samples': 16640512, 'steps': 32500, 'batch_loss/train': 0.9208770589902997} 12/27/2021 05:57:38 - INFO - codeparrot_training - Step 32501: {'lr': 0.0001373494066045685, 'samples': 16641024, 'steps': 32501, 'batch_loss/train': 0.7277491930872202} 12/27/2021 05:57:48 - INFO - codeparrot_training - Step 32502: {'lr': 0.00013733533468824306, 'samples': 16641536, 'steps': 32502, 'batch_loss/train': 0.8004205152392387} 12/27/2021 05:57:59 - INFO - codeparrot_training - Step 32503: {'lr': 0.00013732126321982978, 'samples': 16642048, 'steps': 32503, 'batch_loss/train': 0.7976576206274331} 12/27/2021 05:58:12 - INFO - codeparrot_training - Step 32504: {'lr': 0.00013730719219938472, 'samples': 16642560, 'steps': 32504, 'batch_loss/train': 0.7614717176184058} 12/27/2021 05:58:22 - INFO - codeparrot_training - Step 32505: {'lr': 0.0001372931216269636, 'samples': 16643072, 'steps': 32505, 'batch_loss/train': 0.6506944610737264} 12/27/2021 05:58:33 - INFO - codeparrot_training - Step 32506: {'lr': 0.0001372790515026224, 'samples': 16643584, 'steps': 32506, 'batch_loss/train': 0.7521962930914015} 12/27/2021 05:58:47 - INFO - codeparrot_training - Step 32507: {'lr': 0.00013726498182641735, 'samples': 16644096, 'steps': 32507, 'batch_loss/train': 0.6022114474908449} 12/27/2021 05:58:58 - INFO - 
codeparrot_training - Step 32508: {'lr': 0.000137250912598404, 'samples': 16644608, 'steps': 32508, 'batch_loss/train': 0.6816643564961851} 12/27/2021 05:59:08 - INFO - codeparrot_training - Step 32509: {'lr': 0.0001372368438186385, 'samples': 16645120, 'steps': 32509, 'batch_loss/train': 0.9547348960768431} 12/27/2021 05:59:20 - INFO - codeparrot_training - Step 32510: {'lr': 0.00013722277548717672, 'samples': 16645632, 'steps': 32510, 'batch_loss/train': 0.9029780243290588} 12/27/2021 05:59:31 - INFO - codeparrot_training - Step 32511: {'lr': 0.00013720870760407456, 'samples': 16646144, 'steps': 32511, 'batch_loss/train': 0.6407713796943426} 12/27/2021 05:59:42 - INFO - codeparrot_training - Step 32512: {'lr': 0.00013719464016938803, 'samples': 16646656, 'steps': 32512, 'batch_loss/train': 0.7646068842150271} 12/27/2021 05:59:52 - INFO - codeparrot_training - Step 32513: {'lr': 0.000137180573183173, 'samples': 16647168, 'steps': 32513, 'batch_loss/train': 0.7238606144674122} 12/27/2021 06:00:06 - INFO - codeparrot_training - Step 32514: {'lr': 0.00013716650664548542, 'samples': 16647680, 'steps': 32514, 'batch_loss/train': 0.7730881958268583} 12/27/2021 06:00:17 - INFO - codeparrot_training - Step 32515: {'lr': 0.0001371524405563812, 'samples': 16648192, 'steps': 32515, 'batch_loss/train': 0.726102915708907} 12/27/2021 06:00:28 - INFO - codeparrot_training - Step 32516: {'lr': 0.00013713837491591637, 'samples': 16648704, 'steps': 32516, 'batch_loss/train': 0.8976170644164085} 12/27/2021 06:00:40 - INFO - codeparrot_training - Step 32517: {'lr': 0.00013712430972414658, 'samples': 16649216, 'steps': 32517, 'batch_loss/train': 0.8131613684818149} 12/27/2021 06:00:50 - INFO - codeparrot_training - Step 32518: {'lr': 0.00013711024498112796, 'samples': 16649728, 'steps': 32518, 'batch_loss/train': 0.7388673289678991} 12/27/2021 06:01:01 - INFO - codeparrot_training - Step 32519: {'lr': 0.0001370961806869165, 'samples': 16650240, 'steps': 32519, 'batch_loss/train': 
0.6491213105909992} 12/27/2021 06:01:13 - INFO - codeparrot_training - Step 32520: {'lr': 0.0001370821168415679, 'samples': 16650752, 'steps': 32520, 'batch_loss/train': 0.6831732010468841} 12/27/2021 06:01:24 - INFO - codeparrot_training - Step 32521: {'lr': 0.00013706805344513805, 'samples': 16651264, 'steps': 32521, 'batch_loss/train': 0.7790114857489243} 12/27/2021 06:01:34 - INFO - codeparrot_training - Step 32522: {'lr': 0.00013705399049768324, 'samples': 16651776, 'steps': 32522, 'batch_loss/train': 0.7368335635401309} 12/27/2021 06:01:45 - INFO - codeparrot_training - Step 32523: {'lr': 0.00013703992799925897, 'samples': 16652288, 'steps': 32523, 'batch_loss/train': 0.8816508278250694} 12/27/2021 06:01:59 - INFO - codeparrot_training - Step 32524: {'lr': 0.0001370258659499213, 'samples': 16652800, 'steps': 32524, 'batch_loss/train': 0.7236168710514903} 12/27/2021 06:02:10 - INFO - codeparrot_training - Step 32525: {'lr': 0.0001370118043497261, 'samples': 16653312, 'steps': 32525, 'batch_loss/train': 0.6383849661797285} 12/27/2021 06:02:20 - INFO - codeparrot_training - Step 32526: {'lr': 0.00013699774319872934, 'samples': 16653824, 'steps': 32526, 'batch_loss/train': 0.7170037319883704} 12/27/2021 06:02:32 - INFO - codeparrot_training - Step 32527: {'lr': 0.0001369836824969869, 'samples': 16654336, 'steps': 32527, 'batch_loss/train': 0.8057771902531385} 12/27/2021 06:02:43 - INFO - codeparrot_training - Step 32528: {'lr': 0.00013696962224455462, 'samples': 16654848, 'steps': 32528, 'batch_loss/train': 0.7851992808282375} 12/27/2021 06:02:54 - INFO - codeparrot_training - Step 32529: {'lr': 0.00013695556244148845, 'samples': 16655360, 'steps': 32529, 'batch_loss/train': 0.7391144232824445} 12/27/2021 06:03:06 - INFO - codeparrot_training - Step 32530: {'lr': 0.00013694150308784432, 'samples': 16655872, 'steps': 32530, 'batch_loss/train': 0.7468047142028809} 12/27/2021 06:03:17 - INFO - codeparrot_training - Step 32531: {'lr': 0.00013692744418367813, 
'samples': 16656384, 'steps': 32531, 'batch_loss/train': 0.7735141050070524}
12/27/2021 06:03:27 - INFO - codeparrot_training - Step 32532: {'lr': 0.00013691338572904556, 'samples': 16656896, 'steps': 32532, 'batch_loss/train': 0.7215976174920797}
12/27/2021 06:03:38 - INFO - codeparrot_training - Step 32533: {'lr': 0.0001368993277240029, 'samples': 16657408, 'steps': 32533, 'batch_loss/train': 0.8233411251567304}
12/27/2021 06:03:52 - INFO - codeparrot_training - Step 32534: {'lr': 0.00013688527016860564, 'samples': 16657920, 'steps': 32534, 'batch_loss/train': 0.7153285648673773}
12/27/2021 06:04:03 - INFO - codeparrot_training - Step 32535: {'lr': 0.00013687121306290984, 'samples': 16658432, 'steps': 32535, 'batch_loss/train': 0.6680723507888615}
12/27/2021 06:04:13 - INFO - codeparrot_training - Step 32536: {'lr': 0.00013685715640697143, 'samples': 16658944, 'steps': 32536, 'batch_loss/train': 0.7477466808632016}
12/27/2021 06:04:25 - INFO - codeparrot_training - Step 32537: {'lr': 0.00013684310020084623, 'samples': 16659456, 'steps': 32537, 'batch_loss/train': 0.8559379636426456}
12/27/2021 06:04:36 - INFO - codeparrot_training - Step 32538: {'lr': 0.0001368290444445901, 'samples': 16659968, 'steps': 32538, 'batch_loss/train': 0.8908551568165421}
12/27/2021 06:04:46 - INFO - codeparrot_training - Step 32539: {'lr': 0.000136814989138259, 'samples': 16660480, 'steps': 32539, 'batch_loss/train': 0.809152471832931}
12/27/2021 06:04:59 - INFO - codeparrot_training - Step 32540: {'lr': 0.00013680093428190877, 'samples': 16660992, 'steps': 32540, 'batch_loss/train': 0.717927694437094}
12/27/2021 06:05:10 - INFO - codeparrot_training - Step 32541: {'lr': 0.00013678687987559525, 'samples': 16661504, 'steps': 32541, 'batch_loss/train': 0.8054799647070467}
12/27/2021 06:05:20 - INFO - codeparrot_training - Step 32542: {'lr': 0.00013677282591937446, 'samples': 16662016, 'steps': 32542, 'batch_loss/train': 0.7506179292686284}
12/27/2021 06:05:31 - INFO - codeparrot_training - Step 32543: {'lr': 0.00013675877241330197, 'samples': 16662528, 'steps': 32543, 'batch_loss/train': 0.6891085966490209}
12/27/2021 06:05:45 - INFO - codeparrot_training - Step 32544: {'lr': 0.00013674471935743392, 'samples': 16663040, 'steps': 32544, 'batch_loss/train': 0.6505055141169578}
12/27/2021 06:05:55 - INFO - codeparrot_training - Step 32545: {'lr': 0.00013673066675182625, 'samples': 16663552, 'steps': 32545, 'batch_loss/train': 0.775784570723772}
12/27/2021 06:06:06 - INFO - codeparrot_training - Step 32546: {'lr': 0.00013671661459653455, 'samples': 16664064, 'steps': 32546, 'batch_loss/train': 0.7517792019061744}
12/27/2021 06:06:19 - INFO - codeparrot_training - Step 32547: {'lr': 0.00013670256289161471, 'samples': 16664576, 'steps': 32547, 'batch_loss/train': 0.7796813460299745}
12/27/2021 06:06:30 - INFO - codeparrot_training - Step 32548: {'lr': 0.0001366885116371229, 'samples': 16665088, 'steps': 32548, 'batch_loss/train': 1.1760967709124088}
12/27/2021 06:06:40 - INFO - codeparrot_training - Step 32549: {'lr': 0.00013667446083311468, 'samples': 16665600, 'steps': 32549, 'batch_loss/train': 0.6140146995894611}
12/27/2021 06:06:51 - INFO - codeparrot_training - Step 32550: {'lr': 0.00013666041047964605, 'samples': 16666112, 'steps': 32550, 'batch_loss/train': 0.8119958927854896}
12/27/2021 06:07:03 - INFO - codeparrot_training - Step 32551: {'lr': 0.0001366463605767728, 'samples': 16666624, 'steps': 32551, 'batch_loss/train': 0.7208533145021647}
12/27/2021 06:07:14 - INFO - codeparrot_training - Step 32552: {'lr': 0.00013663231112455082, 'samples': 16667136, 'steps': 32552, 'batch_loss/train': 0.7359112813137472}
12/27/2021 06:07:24 - INFO - codeparrot_training - Step 32553: {'lr': 0.00013661826212303598, 'samples': 16667648, 'steps': 32553, 'batch_loss/train': 0.803216964006424}
12/27/2021 06:07:38 - INFO - codeparrot_training - Step 32554: {'lr': 0.0001366042135722841, 'samples': 16668160, 'steps': 32554, 'batch_loss/train': 0.7511237720027566}
12/27/2021 06:07:49 - INFO - codeparrot_training - Step 32555: {'lr': 0.00013659016547235106, 'samples': 16668672, 'steps': 32555, 'batch_loss/train': 0.7691191625781357}
12/27/2021 06:07:59 - INFO - codeparrot_training - Step 32556: {'lr': 0.00013657611782329272, 'samples': 16669184, 'steps': 32556, 'batch_loss/train': 0.7709907139651477}
12/27/2021 06:08:12 - INFO - codeparrot_training - Step 32557: {'lr': 0.00013656207062516496, 'samples': 16669696, 'steps': 32557, 'batch_loss/train': 0.7587492881575599}
12/27/2021 06:08:22 - INFO - codeparrot_training - Step 32558: {'lr': 0.0001365480238780234, 'samples': 16670208, 'steps': 32558, 'batch_loss/train': 0.7285422254353762}
12/27/2021 06:08:33 - INFO - codeparrot_training - Step 32559: {'lr': 0.00013653397758192425, 'samples': 16670720, 'steps': 32559, 'batch_loss/train': 0.7243287321180105}
12/27/2021 06:08:43 - INFO - codeparrot_training - Step 32560: {'lr': 0.0001365199317369231, 'samples': 16671232, 'steps': 32560, 'batch_loss/train': 0.7901037954725325}
12/27/2021 06:08:57 - INFO - codeparrot_training - Step 32561: {'lr': 0.00013650588634307582, 'samples': 16671744, 'steps': 32561, 'batch_loss/train': 0.7630898766219616}
12/27/2021 06:09:08 - INFO - codeparrot_training - Step 32562: {'lr': 0.0001364918414004383, 'samples': 16672256, 'steps': 32562, 'batch_loss/train': 0.7892381921410561}
12/27/2021 06:09:19 - INFO - codeparrot_training - Step 32563: {'lr': 0.00013647779690906633, 'samples': 16672768, 'steps': 32563, 'batch_loss/train': 0.8381125312298536}
12/27/2021 06:09:31 - INFO - codeparrot_training - Step 32564: {'lr': 0.0001364637528690158, 'samples': 16673280, 'steps': 32564, 'batch_loss/train': 0.9810159387998283}
12/27/2021 06:09:41 - INFO - codeparrot_training - Step 32565: {'lr': 0.00013644970928034252, 'samples': 16673792, 'steps': 32565, 'batch_loss/train': 0.7529652249068022}
12/27/2021 06:09:52 - INFO - codeparrot_training - Step 32566: {'lr': 0.0001364356661431023, 'samples': 16674304, 'steps': 32566, 'batch_loss/train': 0.7369262338615954}
12/27/2021 06:10:04 - INFO - codeparrot_training - Step 32567: {'lr': 0.000136421623457351, 'samples': 16674816, 'steps': 32567, 'batch_loss/train': 0.779405455570668}
12/27/2021 06:10:15 - INFO - codeparrot_training - Step 32568: {'lr': 0.00013640758122314455, 'samples': 16675328, 'steps': 32568, 'batch_loss/train': 0.6816306877881289}
12/27/2021 06:10:25 - INFO - codeparrot_training - Step 32569: {'lr': 0.00013639353944053847, 'samples': 16675840, 'steps': 32569, 'batch_loss/train': 0.795666366815567}
12/27/2021 06:10:36 - INFO - codeparrot_training - Step 32570: {'lr': 0.00013637949810958887, 'samples': 16676352, 'steps': 32570, 'batch_loss/train': 0.7202076696557924}
12/27/2021 06:10:50 - INFO - codeparrot_training - Step 32571: {'lr': 0.0001363654572303516, 'samples': 16676864, 'steps': 32571, 'batch_loss/train': 0.7055088992929086}
12/27/2021 06:11:01 - INFO - codeparrot_training - Step 32572: {'lr': 0.00013635141680288224, 'samples': 16677376, 'steps': 32572, 'batch_loss/train': 0.835942623205483}
12/27/2021 06:11:12 - INFO - codeparrot_training - Step 32573: {'lr': 0.00013633737682723664, 'samples': 16677888, 'steps': 32573, 'batch_loss/train': 0.6557873468846083}
12/27/2021 06:11:24 - INFO - codeparrot_training - Step 32574: {'lr': 0.00013632333730347092, 'samples': 16678400, 'steps': 32574, 'batch_loss/train': 0.46327887708321214}
12/27/2021 06:11:35 - INFO - codeparrot_training - Step 32575: {'lr': 0.0001363092982316406, 'samples': 16678912, 'steps': 32575, 'batch_loss/train': 0.729751484002918}
12/27/2021 06:11:46 - INFO - codeparrot_training - Step 32576: {'lr': 0.00013629525961180144, 'samples': 16679424, 'steps': 32576, 'batch_loss/train': 1.6592428837902844}
12/27/2021 06:11:56 - INFO - codeparrot_training - Step 32577: {'lr': 0.00013628122144400965, 'samples': 16679936, 'steps': 32577, 'batch_loss/train': 0.7651403825730085}
12/27/2021 06:12:08 - INFO - codeparrot_training - Step 32578: {'lr': 0.00013626718372832064, 'samples': 16680448, 'steps': 32578, 'batch_loss/train': 0.8130198577418923}
12/27/2021 06:12:19 - INFO - codeparrot_training - Step 32579: {'lr': 0.0001362531464647904, 'samples': 16680960, 'steps': 32579, 'batch_loss/train': 0.7796061248518527}
12/27/2021 06:12:29 - INFO - codeparrot_training - Step 32580: {'lr': 0.0001362391096534747, 'samples': 16681472, 'steps': 32580, 'batch_loss/train': 0.8931382866576314}
12/27/2021 06:12:43 - INFO - codeparrot_training - Step 32581: {'lr': 0.00013622507329442933, 'samples': 16681984, 'steps': 32581, 'batch_loss/train': 0.744130621664226}
12/27/2021 06:12:54 - INFO - codeparrot_training - Step 32582: {'lr': 0.0001362110373877101, 'samples': 16682496, 'steps': 32582, 'batch_loss/train': 0.7437131032347679}
12/27/2021 06:13:04 - INFO - codeparrot_training - Step 32583: {'lr': 0.00013619700193337293, 'samples': 16683008, 'steps': 32583, 'batch_loss/train': 0.8006175816990435}
12/27/2021 06:13:17 - INFO - codeparrot_training - Step 32584: {'lr': 0.0001361829669314733, 'samples': 16683520, 'steps': 32584, 'batch_loss/train': 0.8395776944234967}
12/27/2021 06:13:28 - INFO - codeparrot_training - Step 32585: {'lr': 0.0001361689323820674, 'samples': 16684032, 'steps': 32585, 'batch_loss/train': 0.7933531990274787}
12/27/2021 06:13:39 - INFO - codeparrot_training - Step 32586: {'lr': 0.00013615489828521077, 'samples': 16684544, 'steps': 32586, 'batch_loss/train': 0.8378536021336913}
12/27/2021 06:13:49 - INFO - codeparrot_training - Step 32587: {'lr': 0.0001361408646409592, 'samples': 16685056, 'steps': 32587, 'batch_loss/train': 0.9439338333904743}
12/27/2021 06:14:01 - INFO - codeparrot_training - Step 32588: {'lr': 0.00013612683144936872, 'samples': 16685568, 'steps': 32588, 'batch_loss/train': 0.9193122866563499}
12/27/2021 06:14:12 - INFO - codeparrot_training - Step 32589: {'lr': 0.0001361127987104949, 'samples': 16686080, 'steps': 32589, 'batch_loss/train': 0.704955501947552}
12/27/2021 06:14:23 - INFO - codeparrot_training - Step 32590: {'lr': 0.00013609876642439357, 'samples': 16686592, 'steps': 32590, 'batch_loss/train': 0.6898243688046932}
12/27/2021 06:14:37 - INFO - codeparrot_training - Step 32591: {'lr': 0.00013608473459112054, 'samples': 16687104, 'steps': 32591, 'batch_loss/train': 1.0196059215813875}
12/27/2021 06:14:47 - INFO - codeparrot_training - Step 32592: {'lr': 0.00013607070321073155, 'samples': 16687616, 'steps': 32592, 'batch_loss/train': 0.8608577609993517}
12/27/2021 06:14:58 - INFO - codeparrot_training - Step 32593: {'lr': 0.0001360566722832825, 'samples': 16688128, 'steps': 32593, 'batch_loss/train': 0.6994039237033576}
12/27/2021 06:15:10 - INFO - codeparrot_training - Step 32594: {'lr': 0.00013604264180882916, 'samples': 16688640, 'steps': 32594, 'batch_loss/train': 0.8461646537762135}
12/27/2021 06:15:21 - INFO - codeparrot_training - Step 32595: {'lr': 0.00013602861178742704, 'samples': 16689152, 'steps': 32595, 'batch_loss/train': 0.7832591836340725}
12/27/2021 06:15:32 - INFO - codeparrot_training - Step 32596: {'lr': 0.00013601458221913226, 'samples': 16689664, 'steps': 32596, 'batch_loss/train': 0.7291111628292128}
12/27/2021 06:15:42 - INFO - codeparrot_training - Step 32597: {'lr': 0.00013600055310400053, 'samples': 16690176, 'steps': 32597, 'batch_loss/train': 0.7922502914443612}
12/27/2021 06:15:54 - INFO - codeparrot_training - Step 32598: {'lr': 0.0001359865244420875, 'samples': 16690688, 'steps': 32598, 'batch_loss/train': 0.8134510489180684}
12/27/2021 06:16:05 - INFO - codeparrot_training - Step 32599: {'lr': 0.00013597249623344883, 'samples': 16691200, 'steps': 32599, 'batch_loss/train': 0.7509931400418282}
12/27/2021 06:16:15 - INFO - codeparrot_training - Step 32600: {'lr': 0.0001359584684781407, 'samples': 16691712, 'steps': 32600, 'batch_loss/train': 0.9673413187265396}
12/27/2021 06:16:29 - INFO - codeparrot_training - Step 32601: {'lr': 0.0001359444411762185, 'samples': 16692224, 'steps': 32601, 'batch_loss/train': 0.7850387319922447}
12/27/2021 06:16:40 - INFO - codeparrot_training - Step 32602: {'lr': 0.0001359304143277381, 'samples': 16692736, 'steps': 32602, 'batch_loss/train': 0.8397468775510788}
12/27/2021 06:16:51 - INFO - codeparrot_training - Step 32603: {'lr': 0.00013591638793275545, 'samples': 16693248, 'steps': 32603, 'batch_loss/train': 0.6971222423017025}
12/27/2021 06:17:03 - INFO - codeparrot_training - Step 32604: {'lr': 0.0001359023619913261, 'samples': 16693760, 'steps': 32604, 'batch_loss/train': 0.744877474848181}
12/27/2021 06:17:13 - INFO - codeparrot_training - Step 32605: {'lr': 0.00013588833650350586, 'samples': 16694272, 'steps': 32605, 'batch_loss/train': 0.7669695378281176}
12/27/2021 06:17:24 - INFO - codeparrot_training - Step 32606: {'lr': 0.0001358743114693505, 'samples': 16694784, 'steps': 32606, 'batch_loss/train': 0.7725818287581205}
12/27/2021 06:17:35 - INFO - codeparrot_training - Step 32607: {'lr': 0.0001358602868889158, 'samples': 16695296, 'steps': 32607, 'batch_loss/train': 0.7733853453537449}
12/27/2021 06:17:47 - INFO - codeparrot_training - Step 32608: {'lr': 0.00013584626276225747, 'samples': 16695808, 'steps': 32608, 'batch_loss/train': 0.7137958949897438}
12/27/2021 06:17:57 - INFO - codeparrot_training - Step 32609: {'lr': 0.00013583223908943142, 'samples': 16696320, 'steps': 32609, 'batch_loss/train': 0.6824386022053659}
12/27/2021 06:18:08 - INFO - codeparrot_training - Step 32610: {'lr': 0.00013581821587049312, 'samples': 16696832, 'steps': 32610, 'batch_loss/train': 0.8190219451207668}
12/27/2021 06:18:22 - INFO - codeparrot_training - Step 32611: {'lr': 0.00013580419310549853, 'samples': 16697344, 'steps': 32611, 'batch_loss/train': 0.7606095052324235}
12/27/2021 06:18:32 - INFO - codeparrot_training - Step 32612: {'lr': 0.00013579017079450346, 'samples': 16697856, 'steps': 32612, 'batch_loss/train': 0.7194118397310376}
12/27/2021 06:18:43 - INFO - codeparrot_training - Step 32613: {'lr': 0.00013577614893756334, 'samples': 16698368, 'steps': 32613, 'batch_loss/train': 0.8947266712784767}
12/27/2021 06:18:55 - INFO - codeparrot_training - Step 32614: {'lr': 0.00013576212753473437, 'samples': 16698880, 'steps': 32614, 'batch_loss/train': 0.7381382440216839}
12/27/2021 06:19:06 - INFO - codeparrot_training - Step 32615: {'lr': 0.0001357481065860719, 'samples': 16699392, 'steps': 32615, 'batch_loss/train': 0.8210102603770792}
12/27/2021 06:19:16 - INFO - codeparrot_training - Step 32616: {'lr': 0.00013573408609163183, 'samples': 16699904, 'steps': 32616, 'batch_loss/train': 0.7281113397330046}
12/27/2021 06:19:27 - INFO - codeparrot_training - Step 32617: {'lr': 0.00013572006605146986, 'samples': 16700416, 'steps': 32617, 'batch_loss/train': 0.7739439969882369}
12/27/2021 06:19:41 - INFO - codeparrot_training - Step 32618: {'lr': 0.00013570604646564178, 'samples': 16700928, 'steps': 32618, 'batch_loss/train': 0.7704653842374682}
12/27/2021 06:19:52 - INFO - codeparrot_training - Step 32619: {'lr': 0.00013569202733420332, 'samples': 16701440, 'steps': 32619, 'batch_loss/train': 0.6685974323190749}
12/27/2021 06:20:02 - INFO - codeparrot_training - Step 32620: {'lr': 0.0001356780086572102, 'samples': 16701952, 'steps': 32620, 'batch_loss/train': 0.817153268493712}
12/27/2021 06:20:14 - INFO - codeparrot_training - Step 32621: {'lr': 0.0001356639904347181, 'samples': 16702464, 'steps': 32621, 'batch_loss/train': 0.7170449746772647}
12/27/2021 06:20:25 - INFO - codeparrot_training - Step 32622: {'lr': 0.00013564997266678287, 'samples': 16702976, 'steps': 32622, 'batch_loss/train': 0.7815780302044004}
12/27/2021 06:20:36 - INFO - codeparrot_training - Step 32623: {'lr': 0.00013563595535346024, 'samples': 16703488, 'steps': 32623, 'batch_loss/train': 0.7681208057329059}
12/27/2021 06:20:48 - INFO - codeparrot_training - Step 32624: {'lr': 0.00013562193849480565, 'samples': 16704000, 'steps': 32624, 'batch_loss/train': 0.7696133782155812}
12/27/2021 06:20:58 - INFO - codeparrot_training - Step 32625: {'lr': 0.00013560792209087516, 'samples': 16704512, 'steps': 32625, 'batch_loss/train': 0.812892927788198}
12/27/2021 06:21:09 - INFO - codeparrot_training - Step 32626: {'lr': 0.00013559390614172452, 'samples': 16705024, 'steps': 32626, 'batch_loss/train': 0.8293343847617507}
12/27/2021 06:21:20 - INFO - codeparrot_training - Step 32627: {'lr': 0.00013557989064740916, 'samples': 16705536, 'steps': 32627, 'batch_loss/train': 0.7247367657255381}
12/27/2021 06:21:32 - INFO - codeparrot_training - Step 32628: {'lr': 0.00013556587560798483, 'samples': 16706048, 'steps': 32628, 'batch_loss/train': 0.7878127079457045}
12/27/2021 06:21:42 - INFO - codeparrot_training - Step 32629: {'lr': 0.00013555186102350765, 'samples': 16706560, 'steps': 32629, 'batch_loss/train': 0.7730557257309556}
12/27/2021 06:21:53 - INFO - codeparrot_training - Step 32630: {'lr': 0.00013553784689403288, 'samples': 16707072, 'steps': 32630, 'batch_loss/train': 0.78091926779598}
12/27/2021 06:22:07 - INFO - codeparrot_training - Step 32631: {'lr': 0.00013552383321961645, 'samples': 16707584, 'steps': 32631, 'batch_loss/train': 0.7280416692374274}
12/27/2021 06:22:18 - INFO - codeparrot_training - Step 32632: {'lr': 0.00013550982000031404, 'samples': 16708096, 'steps': 32632, 'batch_loss/train': 0.6120895934291184}
12/27/2021 06:22:28 - INFO - codeparrot_training - Step 32633: {'lr': 0.00013549580723618132, 'samples': 16708608, 'steps': 32633, 'batch_loss/train': 0.6571884271688759}
12/27/2021 06:22:41 - INFO - codeparrot_training - Step 32634: {'lr': 0.00013548179492727408, 'samples': 16709120, 'steps': 32634, 'batch_loss/train': 0.7829571310430765}
12/27/2021 06:22:51 - INFO - codeparrot_training - Step 32635: {'lr': 0.00013546778307364804, 'samples': 16709632, 'steps': 32635, 'batch_loss/train': 0.8008013009093702}
12/27/2021 06:23:02 - INFO - codeparrot_training - Step 32636: {'lr': 0.00013545377167535868, 'samples': 16710144, 'steps': 32636, 'batch_loss/train': 1.0352077181451023}
12/27/2021 06:23:14 - INFO - codeparrot_training - Step 32637: {'lr': 0.00013543976073246197, 'samples': 16710656, 'steps': 32637, 'batch_loss/train': 0.6996814697049558}
12/27/2021 06:23:25 - INFO - codeparrot_training - Step 32638: {'lr': 0.0001354257502450136, 'samples': 16711168, 'steps': 32638, 'batch_loss/train': 0.8056741580367088}
12/27/2021 06:23:36 - INFO - codeparrot_training - Step 32639: {'lr': 0.000135411740213069, 'samples': 16711680, 'steps': 32639, 'batch_loss/train': 0.7955926386639476}
12/27/2021 06:23:46 - INFO - codeparrot_training - Step 32640: {'lr': 0.00013539773063668426, 'samples': 16712192, 'steps': 32640, 'batch_loss/train': 0.5619586588582024}
12/27/2021 06:24:00 - INFO - codeparrot_training - Step 32641: {'lr': 0.00013538372151591476, 'samples': 16712704, 'steps': 32641, 'batch_loss/train': 0.7830506367608905}
12/27/2021 06:24:11 - INFO - codeparrot_training - Step 32642: {'lr': 0.00013536971285081628, 'samples': 16713216, 'steps': 32642, 'batch_loss/train': 0.6912546558305621}
12/27/2021 06:24:21 - INFO - codeparrot_training - Step 32643: {'lr': 0.0001353557046414446, 'samples': 16713728, 'steps': 32643, 'batch_loss/train': 0.7582838817033917}
12/27/2021 06:24:33 - INFO - codeparrot_training - Step 32644: {'lr': 0.00013534169688785532, 'samples': 16714240, 'steps': 32644, 'batch_loss/train': 0.7725821938365698}
12/27/2021 06:24:44 - INFO - codeparrot_training - Step 32645: {'lr': 0.00013532768959010415, 'samples': 16714752, 'steps': 32645, 'batch_loss/train': 0.8289649356156588}
12/27/2021 06:24:55 - INFO - codeparrot_training - Step 32646: {'lr': 0.0001353136827482468, 'samples': 16715264, 'steps': 32646, 'batch_loss/train': 0.7723832260817289}
12/27/2021 06:25:07 - INFO - codeparrot_training - Step 32647: {'lr': 0.00013529967636233893, 'samples': 16715776, 'steps': 32647, 'batch_loss/train': 0.7551166850607842}
12/27/2021 06:25:18 - INFO - codeparrot_training - Step 32648: {'lr': 0.00013528567043243623, 'samples': 16716288, 'steps': 32648, 'batch_loss/train': 0.7736502029001713}
12/27/2021 06:25:28 - INFO - codeparrot_training - Step 32649: {'lr': 0.0001352716649585945, 'samples': 16716800, 'steps': 32649, 'batch_loss/train': 0.7655954677611589}
12/27/2021 06:25:39 - INFO - codeparrot_training - Step 32650: {'lr': 0.0001352576599408691, 'samples': 16717312, 'steps': 32650, 'batch_loss/train': 0.8094581051263958}
12/27/2021 06:25:51 - INFO - codeparrot_training - Step 32651: {'lr': 0.00013524365537931604, 'samples': 16717824, 'steps': 32651, 'batch_loss/train': 0.8401974653825164}
12/27/2021 06:26:02 - INFO - codeparrot_training - Step 32652: {'lr': 0.00013522965127399098, 'samples': 16718336, 'steps': 32652, 'batch_loss/train': 0.7553415801376104}
12/27/2021 06:26:13 - INFO - codeparrot_training - Step 32653: {'lr': 0.00013521564762494935, 'samples': 16718848, 'steps': 32653, 'batch_loss/train': 0.6970772529020905}
12/27/2021 06:26:25 - INFO - codeparrot_training - Step 32654: {'lr': 0.00013520164443224693, 'samples': 16719360, 'steps': 32654, 'batch_loss/train': 0.8268526066094637}
12/27/2021 06:26:35 - INFO - codeparrot_training - Step 32655: {'lr': 0.00013518764169593957, 'samples': 16719872, 'steps': 32655, 'batch_loss/train': 0.7346389316953719}
12/27/2021 06:26:46 - INFO - codeparrot_training - Step 32656: {'lr': 0.00013517363941608273, 'samples': 16720384, 'steps': 32656, 'batch_loss/train': 0.7635237360373139}
12/27/2021 06:26:59 - INFO - codeparrot_training - Step 32657: {'lr': 0.00013515963759273212, 'samples': 16720896, 'steps': 32657, 'batch_loss/train': 0.8900690455920994}
12/27/2021 06:27:09 - INFO - codeparrot_training - Step 32658: {'lr': 0.00013514563622594344, 'samples': 16721408, 'steps': 32658, 'batch_loss/train': 0.8826884208247066}
12/27/2021 06:27:20 - INFO - codeparrot_training - Step 32659: {'lr': 0.00013513163531577233, 'samples': 16721920, 'steps': 32659, 'batch_loss/train': 0.7913777409121394}
12/27/2021 06:27:31 - INFO - codeparrot_training - Step 32660: {'lr': 0.00013511763486227446, 'samples': 16722432, 'steps': 32660, 'batch_loss/train': 0.8280966011807323}
12/27/2021 06:27:43 - INFO - codeparrot_training - Step 32661: {'lr': 0.00013510363486550554, 'samples': 16722944, 'steps': 32661, 'batch_loss/train': 0.7678025905042887}
12/27/2021 06:27:53 - INFO - codeparrot_training - Step 32662: {'lr': 0.00013508963532552116, 'samples': 16723456, 'steps': 32662, 'batch_loss/train': 0.7837266121059656}
12/27/2021 06:28:04 - INFO - codeparrot_training - Step 32663: {'lr': 0.00013507563624237697, 'samples': 16723968, 'steps': 32663, 'batch_loss/train': 0.7130572777241468}
12/27/2021 06:28:16 - INFO - codeparrot_training - Step 32664: {'lr': 0.00013506163761612877, 'samples': 16724480, 'steps': 32664, 'batch_loss/train': 0.82179851597175}
12/27/2021 06:28:27 - INFO - codeparrot_training - Step 32665: {'lr': 0.00013504763944683192, 'samples': 16724992, 'steps': 32665, 'batch_loss/train': 0.7891265265643597}
12/27/2021 06:28:38 - INFO - codeparrot_training - Step 32666: {'lr': 0.00013503364173454236, 'samples': 16725504, 'steps': 32666, 'batch_loss/train': 0.6035030018538237}
12/27/2021 06:28:48 - INFO - codeparrot_training - Step 32667: {'lr': 0.0001350196444793157, 'samples': 16726016, 'steps': 32667, 'batch_loss/train': 0.7453144863247871}
12/27/2021 06:29:00 - INFO - codeparrot_training - Step 32668: {'lr': 0.00013500564768120744, 'samples': 16726528, 'steps': 32668, 'batch_loss/train': 0.773714548908174}
12/27/2021 06:29:11 - INFO - codeparrot_training - Step 32669: {'lr': 0.0001349916513402733, 'samples': 16727040, 'steps': 32669, 'batch_loss/train': 0.9301126692444086}
12/27/2021 06:29:22 - INFO - codeparrot_training - Step 32670: {'lr': 0.00013497765545656892, 'samples': 16727552, 'steps': 32670, 'batch_loss/train': 0.8563525388017297}
12/27/2021 06:29:34 - INFO - codeparrot_training - Step 32671: {'lr': 0.00013496366003015, 'samples': 16728064, 'steps': 32671, 'batch_loss/train': 0.7018402158282697}
12/27/2021 06:29:45 - INFO - codeparrot_training - Step 32672: {'lr': 0.00013494966506107206, 'samples': 16728576, 'steps': 32672, 'batch_loss/train': 0.7862717369571328}
12/27/2021 06:29:56 - INFO - codeparrot_training - Step 32673: {'lr': 0.00013493567054939082, 'samples': 16729088, 'steps': 32673, 'batch_loss/train': 0.8019796181470156}
12/27/2021 06:30:08 - INFO - codeparrot_training - Step 32674: {'lr': 0.00013492167649516195, 'samples': 16729600, 'steps': 32674, 'batch_loss/train': 0.7274872423149645}
12/27/2021 06:30:18 - INFO - codeparrot_training - Step 32675: {'lr': 0.00013490768289844114, 'samples': 16730112, 'steps': 32675, 'batch_loss/train': 0.7883129892870784}
12/27/2021 06:30:29 - INFO - codeparrot_training - Step 32676: {'lr': 0.00013489368975928368, 'samples': 16730624, 'steps': 32676, 'batch_loss/train': 0.7286571720615029}
12/27/2021 06:30:40 - INFO - codeparrot_training - Step 32677: {'lr': 0.0001348796970777456, 'samples': 16731136, 'steps': 32677, 'batch_loss/train': 0.7787547903135419}
12/27/2021 06:30:53 - INFO - codeparrot_training - Step 32678: {'lr': 0.00013486570485388244, 'samples': 16731648, 'steps': 32678, 'batch_loss/train': 0.7692036754451692}
12/27/2021 06:31:03 - INFO - codeparrot_training - Step 32679: {'lr': 0.00013485171308774964, 'samples': 16732160, 'steps': 32679, 'batch_loss/train': 0.775856252759695}
12/27/2021 06:31:14 - INFO - codeparrot_training - Step 32680: {'lr': 0.0001348377217794029, 'samples': 16732672, 'steps': 32680, 'batch_loss/train': 0.7423268547281623}
12/27/2021 06:31:26 - INFO - codeparrot_training - Step 32681: {'lr': 0.00013482373092889812, 'samples': 16733184, 'steps': 32681, 'batch_loss/train': 0.8435716181993484}
12/27/2021 06:31:37 - INFO - codeparrot_training - Step 32682: {'lr': 0.00013480974053629052, 'samples': 16733696, 'steps': 32682, 'batch_loss/train': 0.8554306114092469}
12/27/2021 06:31:47 - INFO - codeparrot_training - Step 32683: {'lr': 0.00013479575060163594, 'samples': 16734208, 'steps': 32683, 'batch_loss/train': 0.8749125255271792}
12/27/2021 06:31:59 - INFO - codeparrot_training - Step 32684: {'lr': 0.00013478176112498993, 'samples': 16734720, 'steps': 32684, 'batch_loss/train': 0.8240821836516261}
12/27/2021 06:32:10 - INFO - codeparrot_training - Step 32685: {'lr': 0.00013476777210640818, 'samples': 16735232, 'steps': 32685, 'batch_loss/train': 0.8056352459825575}
12/27/2021 06:32:21 - INFO - codeparrot_training - Step 32686: {'lr': 0.00013475378354594619, 'samples': 16735744, 'steps': 32686, 'batch_loss/train': 0.7268404038622975}
12/27/2021 06:32:31 - INFO - codeparrot_training - Step 32687: {'lr': 0.0001347397954436597, 'samples': 16736256, 'steps': 32687, 'batch_loss/train': 0.6655218000523746}
12/27/2021 06:32:45 - INFO - codeparrot_training - Step 32688: {'lr': 0.00013472580779960426, 'samples': 16736768, 'steps': 32688, 'batch_loss/train': 0.6370849395170808}
12/27/2021 06:32:56 - INFO - codeparrot_training - Step 32689: {'lr': 0.00013471182061383546, 'samples': 16737280, 'steps': 32689, 'batch_loss/train': 0.8450074503198266}
12/27/2021 06:33:06 - INFO - codeparrot_training - Step 32690: {'lr': 0.00013469783388640904, 'samples': 16737792, 'steps': 32690, 'batch_loss/train': 0.7429216605378315}
12/27/2021 06:33:18 - INFO - codeparrot_training - Step 32691: {'lr': 0.00013468384761738028, 'samples': 16738304, 'steps': 32691, 'batch_loss/train': 0.883596186991781}
12/27/2021 06:33:29 - INFO - codeparrot_training - Step 32692: {'lr': 0.00013466986180680514, 'samples': 16738816, 'steps': 32692, 'batch_loss/train': 0.7657459955662489}
12/27/2021 06:33:39 - INFO - codeparrot_training - Step 32693: {'lr': 0.00013465587645473915, 'samples': 16739328, 'steps': 32693, 'batch_loss/train': 0.760251902975142}
12/27/2021 06:33:51 - INFO - codeparrot_training - Step 32694: {'lr': 0.00013464189156123775, 'samples': 16739840, 'steps': 32694, 'batch_loss/train': 0.8475656267255545}
12/27/2021 06:34:02 - INFO - codeparrot_training - Step 32695: {'lr': 0.00013462790712635663, 'samples': 16740352, 'steps': 32695, 'batch_loss/train': 0.8174056708812714}
12/27/2021 06:34:13 - INFO - codeparrot_training - Step 32696: {'lr': 0.00013461392315015135, 'samples': 16740864, 'steps': 32696, 'batch_loss/train': 0.7889454774558544}
12/27/2021 06:34:23 - INFO - codeparrot_training - Step 32697: {'lr': 0.0001345999396326776, 'samples': 16741376, 'steps': 32697, 'batch_loss/train': 0.760672870092094}
12/27/2021 06:34:36 - INFO - codeparrot_training - Step 32698: {'lr': 0.00013458595657399092, 'samples': 16741888, 'steps': 32698, 'batch_loss/train': 0.7763969833031297}
12/27/2021 06:34:47 - INFO - codeparrot_training - Step 32699: {'lr': 0.00013457197397414685, 'samples': 16742400, 'steps': 32699, 'batch_loss/train': 0.6412800001562573}
12/27/2021 06:34:58 - INFO - codeparrot_training - Step 32700: {'lr': 0.00013455799183320107, 'samples': 16742912, 'steps': 32700, 'batch_loss/train': 0.8319384548813105}
12/27/2021 06:35:10 - INFO - codeparrot_training - Step 32701: {'lr': 0.0001345440101512091, 'samples': 16743424, 'steps': 32701, 'batch_loss/train': 0.7089679897762835}
12/27/2021 06:35:21 - INFO - codeparrot_training - Step 32702: {'lr': 0.00013453002892822657, 'samples': 16743936, 'steps': 32702, 'batch_loss/train': 0.8180472552776337}
12/27/2021 06:35:31 - INFO - codeparrot_training - Step 32703: {'lr': 0.00013451604816430905, 'samples': 16744448, 'steps': 32703, 'batch_loss/train': 0.7535415240563452}
12/27/2021 06:35:43 - INFO - codeparrot_training - Step 32704: {'lr': 0.0001345020678595122, 'samples': 16744960, 'steps': 32704, 'batch_loss/train': 0.5795410370919853}
12/27/2021 06:35:54 - INFO - codeparrot_training - Step 32705: {'lr': 0.00013448808801389139, 'samples': 16745472, 'steps': 32705, 'batch_loss/train': 0.8532279459759593}
12/27/2021 06:36:05 - INFO - codeparrot_training - Step 32706: {'lr': 0.00013447410862750227, 'samples': 16745984, 'steps': 32706, 'batch_loss/train': 0.776523360866122}
12/27/2021 06:36:15 - INFO - codeparrot_training - Step 32707: {'lr': 0.00013446012970040066, 'samples': 16746496, 'steps': 32707, 'batch_loss/train': 0.7919898554682732}
12/27/2021 06:36:28 - INFO - codeparrot_training - Step 32708: {'lr': 0.00013444615123264187, 'samples': 16747008, 'steps': 32708, 'batch_loss/train': 0.7618654686957598}
12/27/2021 06:36:39 - INFO - codeparrot_training - Step 32709: {'lr': 0.00013443217322428152, 'samples': 16747520, 'steps': 32709, 'batch_loss/train': 0.7407371969893575}
12/27/2021 06:36:49 - INFO - codeparrot_training - Step 32710: {'lr': 0.0001344181956753752, 'samples': 16748032, 'steps': 32710, 'batch_loss/train': 0.81173434574157}
12/27/2021 06:37:01 - INFO - codeparrot_training - Step 32711: {'lr': 0.00013440421858597855, 'samples': 16748544, 'steps': 32711, 'batch_loss/train': 0.6554187319707125}
12/27/2021 06:37:12 - INFO - codeparrot_training - Step 32712: {'lr': 0.00013439024195614703, 'samples': 16749056, 'steps': 32712, 'batch_loss/train': 0.7307939470629208}
12/27/2021 06:37:22 - INFO - codeparrot_training - Step 32713: {'lr': 0.00013437626578593627, 'samples': 16749568, 'steps': 32713, 'batch_loss/train': 0.8738134184386581}
12/27/2021 06:37:35 - INFO - codeparrot_training - Step 32714: {'lr': 0.00013436229007540183, 'samples': 16750080, 'steps': 32714, 'batch_loss/train': 0.8139468934386969}
12/27/2021 06:37:45 - INFO - codeparrot_training - Step 32715: {'lr': 0.00013434831482459926, 'samples': 16750592, 'steps': 32715, 'batch_loss/train': 0.7738958662375808}
12/27/2021 06:37:56 - INFO - codeparrot_training - Step 32716: {'lr': 0.0001343343400335842, 'samples': 16751104, 'steps': 32716, 'batch_loss/train': 0.7910852478817105}
12/27/2021 06:38:06 - INFO - codeparrot_training - Step 32717: {'lr': 0.00013432036570241196, 'samples': 16751616, 'steps': 32717, 'batch_loss/train': 0.7260676755104214}
12/27/2021 06:38:19 - INFO - codeparrot_training - Step 32718: {'lr': 0.00013430639183113836, 'samples': 16752128, 'steps': 32718, 'batch_loss/train': 0.7253924305550754}
12/27/2021 06:38:30 - INFO - codeparrot_training - Step 32719: {'lr': 0.00013429241841981894, 'samples': 16752640, 'steps': 32719, 'batch_loss/train': 0.8162380089052022}
12/27/2021 06:38:40 - INFO - codeparrot_training - Step 32720: {'lr': 0.0001342784454685091, 'samples': 16753152, 'steps': 32720, 'batch_loss/train': 0.7203994896262884}
12/27/2021 06:38:53 - INFO - codeparrot_training - Step 32721: {'lr': 0.0001342644729772644, 'samples': 16753664, 'steps': 32721, 'batch_loss/train': 0.6492763459682465}
12/27/2021 06:39:03 - INFO - codeparrot_training - Step 32722: {'lr': 0.00013425050094614056, 'samples': 16754176, 'steps': 32722, 'batch_loss/train': 0.8906774502247572}
12/27/2021 06:39:14 - INFO - codeparrot_training - Step 32723: {'lr': 0.00013423652937519294, 'samples': 16754688, 'steps': 32723, 'batch_loss/train': 0.76848304271698}
12/27/2021 06:39:27 - INFO - codeparrot_training - Step 32724: {'lr': 0.0001342225582644772, 'samples': 16755200, 'steps': 32724, 'batch_loss/train': 0.8176797893829644}
12/27/2021 06:39:38 - INFO - codeparrot_training - Step 32725: {'lr': 0.00013420858761404886, 'samples': 16755712, 'steps': 32725, 'batch_loss/train': 0.6490705152973533}
12/27/2021 06:39:49 - INFO - codeparrot_training - Step 32726: {'lr': 0.00013419461742396343, 'samples': 16756224, 'steps': 32726, 'batch_loss/train': 0.9231781605631113}
12/27/2021 06:39:59 - INFO - codeparrot_training - Step 32727: {'lr': 0.00013418064769427651, 'samples': 16756736, 'steps': 32727, 'batch_loss/train': 0.7411475986009464}
12/27/2021 06:40:11 - INFO - codeparrot_training - Step 32728: {'lr': 0.0001341666784250436, 'samples': 16757248, 'steps': 32728, 'batch_loss/train': 0.6043703886680305}
12/27/2021 06:40:22 - INFO - codeparrot_training - Step 32729: {'lr': 0.0001341527096163202, 'samples': 16757760, 'steps': 32729, 'batch_loss/train': 0.7027591918595135}
12/27/2021 06:40:33 - INFO - codeparrot_training - Step 32730: {'lr': 0.00013413874126816201, 'samples': 16758272, 'steps': 32730, 'batch_loss/train': 0.6580876442603767}
12/27/2021 06:40:45 - INFO - codeparrot_training - Step 32731: {'lr': 0.00013412477338062433, 'samples': 16758784, 'steps': 32731, 'batch_loss/train': 0.7522346866317093}
12/27/2021 06:40:56 - INFO - codeparrot_training - Step 32732: {'lr': 0.00013411080595376274, 'samples': 16759296, 'steps': 32732, 'batch_loss/train': 0.6310884900158271}
12/27/2021 06:41:06 - INFO - codeparrot_training - Step 32733: {'lr': 0.00013409683898763297, 'samples': 16759808, 'steps': 32733, 'batch_loss/train': 0.7731443103402853}
12/27/2021 06:41:17 - INFO - codeparrot_training - Step 32734: {'lr': 0.00013408287248229035, 'samples': 16760320, 'steps': 32734, 'batch_loss/train': 0.7404144543106668}
12/27/2021 06:41:29 - INFO - codeparrot_training - Step 32735: {'lr': 0.0001340689064377903, 'samples': 16760832, 'steps': 32735, 'batch_loss/train': 0.8536467514932156}
12/27/2021 06:41:40 - INFO - codeparrot_training - Step 32736: {'lr': 0.00013405494085418873, 'samples': 16761344, 'steps': 32736, 'batch_loss/train': 0.6836907556280494}
12/27/2021 06:41:50 - INFO - codeparrot_training - Step 32737: {'lr': 0.00013404097573154088, 'samples': 16761856, 'steps': 32737, 'batch_loss/train': 0.7512128492817283}
12/27/2021 06:42:03 - INFO - codeparrot_training - Step 32738: {'lr': 0.0001340270110699023, 'samples': 16762368, 'steps': 32738, 'batch_loss/train': 0.8137235520407557}
12/27/2021 06:42:14 - INFO - codeparrot_training - Step 32739: {'lr': 0.00013401304686932853, 'samples': 16762880, 'steps': 32739, 'batch_loss/train': 0.752755215857178}
12/27/2021 06:42:24 - INFO - codeparrot_training - Step 32740: {'lr': 0.00013399908312987507, 'samples': 16763392, 'steps': 32740, 'batch_loss/train': 0.6855339894536883}
12/27/2021 06:42:36 - INFO - codeparrot_training - Step 32741: {'lr': 0.00013398511985159746, 'samples': 16763904, 'steps': 32741, 'batch_loss/train': 0.8366607818752527}
12/27/2021 06:42:47 - INFO - codeparrot_training - Step 32742: {'lr': 0.0001339711570345513, 'samples': 16764416, 'steps': 32742, 'batch_loss/train': 0.7566195633262396}
12/27/2021 06:42:57 - INFO - codeparrot_training - Step 32743: {'lr': 0.00013395719467879186, 'samples': 16764928, 'steps': 32743, 'batch_loss/train': 0.7202742956578732}
12/27/2021 06:43:09 - INFO - codeparrot_training - Step 32744: {'lr': 0.00013394323278437487, 'samples': 16765440, 'steps': 32744, 'batch_loss/train': 0.7050728076137602}
12/27/2021 06:43:20 - INFO - codeparrot_training - Step 32745: {'lr': 0.00013392927135135585, 'samples': 16765952, 'steps': 32745, 'batch_loss/train': 0.7400916588376276}
12/27/2021 06:43:31 - INFO - codeparrot_training - Step 32746: {'lr': 0.00013391531037979017, 'samples': 16766464, 'steps': 32746, 'batch_loss/train': 0.7299794727005064}
12/27/2021 06:43:41 - INFO - codeparrot_training - Step 32747: {'lr': 0.00013390134986973325, 'samples': 16766976, 'steps': 32747, 'batch_loss/train': 0.7798452121205628}
12/27/2021 06:43:54 - INFO - codeparrot_training - Step 32748: {'lr': 0.0001338873898212409, 'samples': 16767488, 'steps': 32748, 'batch_loss/train': 0.7356701857643202}
12/27/2021 06:44:05 - INFO - codeparrot_training - Step 32749: {'lr': 0.00013387343023436838, 'samples': 16768000, 'steps': 32749, 'batch_loss/train': 0.8071040138602257}
12/27/2021 06:44:15 - INFO - codeparrot_training - Step 32750: {'lr': 0.00013385947110917124, 'samples': 16768512, 'steps': 32750, 'batch_loss/train': 0.8572183984797448}
12/27/2021 06:44:27 - INFO - codeparrot_training - Step 32751: {'lr': 0.000133845512445705, 'samples': 16769024, 'steps': 32751, 'batch_loss/train': 0.7555928197689354}
12/27/2021 06:44:38 - INFO - codeparrot_training - Step 32752: {'lr': 0.00013383155424402513, 'samples': 16769536, 'steps': 32752, 'batch_loss/train': 0.6650372943840921}
12/27/2021 06:44:49 - INFO - codeparrot_training - Step 32753: {'lr': 0.00013381759650418713, 'samples': 16770048, 'steps': 32753, 'batch_loss/train': 0.7659090614179149}
12/27/2021 06:45:01 - INFO - codeparrot_training - Step 32754: {'lr': 0.00013380363922624653, 'samples': 16770560, 'steps': 32754, 'batch_loss/train': 0.8223979854956269}
12/27/2021 06:45:11 - INFO - codeparrot_training - Step 32755: {'lr': 0.00013378968241025874, 'samples': 16771072, 'steps': 32755, 'batch_loss/train': 0.8414037274196744}
12/27/2021 06:45:22 - INFO - codeparrot_training - Step 32756: {'lr': 0.00013377572605627935, 'samples': 16771584, 'steps': 32756, 'batch_loss/train': 0.8041329109109938}
12/27/2021 06:45:32 - INFO - codeparrot_training - Step 32757: {'lr': 0.00013376177016436385, 'samples': 16772096, 'steps': 32757, 'batch_loss/train': 0.7418454741127789}
12/27/2021 06:45:45 - INFO - codeparrot_training - Step 32758: {'lr': 0.0001337478147345675, 'samples': 16772608, 'steps': 32758, 'batch_loss/train': 0.6755769047886133}
12/27/2021 06:45:56 - INFO - codeparrot_training - Step 32759: {'lr': 0.0001337338597669461, 'samples': 16773120, 'steps': 32759, 'batch_loss/train': 0.8184736855328083}
12/27/2021 06:46:06 - INFO - codeparrot_training - Step 32760: {'lr': 0.00013371990526155487, 'samples': 16773632, 'steps': 32760, 'batch_loss/train': 0.7100099958479404}
12/27/2021 06:46:19 - INFO - codeparrot_training - Step 32761: {'lr': 0.0001337059512184493, 'samples': 16774144, 'steps': 32761, 'batch_loss/train': 0.8023219469469041}
12/27/2021 06:46:29 - INFO - codeparrot_training - Step 32762: {'lr': 0.00013369199763768518, 'samples': 16774656, 'steps': 32762, 'batch_loss/train': 0.7484512184746563}
12/27/2021 06:46:40 - INFO - codeparrot_training - Step 32763: {'lr': 0.00013367804451931765, 'samples': 16775168, 'steps': 32763, 'batch_loss/train': 0.7915802029892802}
12/27/2021 06:46:53 - INFO - codeparrot_training - Step 32764: {'lr': 0.00013366409186340227, 'samples': 16775680, 'steps': 32764, 'batch_loss/train': 0.7177115753293037}
12/27/2021 06:47:03
- INFO - codeparrot_training - Step 32765: {'lr': 0.00013365013966999456, 'samples': 16776192, 'steps': 32765, 'batch_loss/train': 0.7223316761665046} 12/27/2021 06:47:14 - INFO - codeparrot_training - Step 32766: {'lr': 0.00013363618793914997, 'samples': 16776704, 'steps': 32766, 'batch_loss/train': 0.7315291278064251} 12/27/2021 06:47:26 - INFO - codeparrot_training - Step 32767: {'lr': 0.00013362223667092392, 'samples': 16777216, 'steps': 32767, 'batch_loss/train': 0.7299526971764863} 12/27/2021 06:47:37 - INFO - codeparrot_training - Step 32768: {'lr': 0.00013360828586537208, 'samples': 16777728, 'steps': 32768, 'batch_loss/train': 0.7381871929392219} 12/27/2021 06:47:47 - INFO - codeparrot_training - Step 32769: {'lr': 0.0001335943355225495, 'samples': 16778240, 'steps': 32769, 'batch_loss/train': 0.7325740380911157} 12/27/2021 06:47:58 - INFO - codeparrot_training - Step 32770: {'lr': 0.00013358038564251202, 'samples': 16778752, 'steps': 32770, 'batch_loss/train': 0.7231694385409355} 12/27/2021 06:48:10 - INFO - codeparrot_training - Step 32771: {'lr': 0.00013356643622531504, 'samples': 16779264, 'steps': 32771, 'batch_loss/train': 0.737535733205732} 12/27/2021 06:48:20 - INFO - codeparrot_training - Step 32772: {'lr': 0.00013355248727101374, 'samples': 16779776, 'steps': 32772, 'batch_loss/train': 0.731338219717145} 12/27/2021 06:48:31 - INFO - codeparrot_training - Step 32773: {'lr': 0.00013353853877966392, 'samples': 16780288, 'steps': 32773, 'batch_loss/train': 0.7916044453158975} 12/27/2021 06:48:43 - INFO - codeparrot_training - Step 32774: {'lr': 0.00013352459075132095, 'samples': 16780800, 'steps': 32774, 'batch_loss/train': 0.6556006222963333} 12/27/2021 06:48:54 - INFO - codeparrot_training - Step 32775: {'lr': 0.00013351064318604014, 'samples': 16781312, 'steps': 32775, 'batch_loss/train': 0.6147020643111318} 12/27/2021 06:49:05 - INFO - codeparrot_training - Step 32776: {'lr': 0.00013349669608387705, 'samples': 16781824, 'steps': 32776, 
'batch_loss/train': 0.8895056955516338} 12/27/2021 06:49:17 - INFO - codeparrot_training - Step 32777: {'lr': 0.0001334827494448871, 'samples': 16782336, 'steps': 32777, 'batch_loss/train': 0.8545149797573686} 12/27/2021 06:49:28 - INFO - codeparrot_training - Step 32778: {'lr': 0.00013346880326912573, 'samples': 16782848, 'steps': 32778, 'batch_loss/train': 0.7726743156090379} 12/27/2021 06:49:39 - INFO - codeparrot_training - Step 32779: {'lr': 0.0001334548575566484, 'samples': 16783360, 'steps': 32779, 'batch_loss/train': 0.7971946247853339} 12/27/2021 06:49:49 - INFO - codeparrot_training - Step 32780: {'lr': 0.00013344091230751054, 'samples': 16783872, 'steps': 32780, 'batch_loss/train': 0.7468161056749523} 12/27/2021 06:50:02 - INFO - codeparrot_training - Step 32781: {'lr': 0.00013342696752176764, 'samples': 16784384, 'steps': 32781, 'batch_loss/train': 0.7692144003231078} 12/27/2021 06:50:12 - INFO - codeparrot_training - Step 32782: {'lr': 0.00013341302319947506, 'samples': 16784896, 'steps': 32782, 'batch_loss/train': 0.7785242106765509} 12/27/2021 06:50:23 - INFO - codeparrot_training - Step 32783: {'lr': 0.0001333990793406884, 'samples': 16785408, 'steps': 32783, 'batch_loss/train': 0.7950089760124683} 12/27/2021 06:50:35 - INFO - codeparrot_training - Step 32784: {'lr': 0.00013338513594546276, 'samples': 16785920, 'steps': 32784, 'batch_loss/train': 0.6195867983624339} 12/27/2021 06:50:46 - INFO - codeparrot_training - Step 32785: {'lr': 0.00013337119301385397, 'samples': 16786432, 'steps': 32785, 'batch_loss/train': 0.6956012374721467} 12/27/2021 06:50:56 - INFO - codeparrot_training - Step 32786: {'lr': 0.00013335725054591723, 'samples': 16786944, 'steps': 32786, 'batch_loss/train': 0.801514831604436} 12/27/2021 06:51:09 - INFO - codeparrot_training - Step 32787: {'lr': 0.0001333433085417079, 'samples': 16787456, 'steps': 32787, 'batch_loss/train': 0.7638355502858758} 12/27/2021 06:51:20 - INFO - codeparrot_training - Step 32788: {'lr': 
0.00013332936700128172, 'samples': 16787968, 'steps': 32788, 'batch_loss/train': 0.7462296551093459} 12/27/2021 06:51:31 - INFO - codeparrot_training - Step 32789: {'lr': 0.00013331542592469382, 'samples': 16788480, 'steps': 32789, 'batch_loss/train': 0.8519874196499586} 12/27/2021 06:51:41 - INFO - codeparrot_training - Step 32790: {'lr': 0.00013330148531199974, 'samples': 16788992, 'steps': 32790, 'batch_loss/train': 0.8286263504996896} 12/27/2021 06:51:53 - INFO - codeparrot_training - Step 32791: {'lr': 0.00013328754516325487, 'samples': 16789504, 'steps': 32791, 'batch_loss/train': 0.7074433770030737} 12/27/2021 06:52:04 - INFO - codeparrot_training - Step 32792: {'lr': 0.00013327360547851468, 'samples': 16790016, 'steps': 32792, 'batch_loss/train': 0.722806919249706} 12/27/2021 06:52:14 - INFO - codeparrot_training - Step 32793: {'lr': 0.00013325966625783454, 'samples': 16790528, 'steps': 32793, 'batch_loss/train': 0.7212333809584379} 12/27/2021 06:52:27 - INFO - codeparrot_training - Step 32794: {'lr': 0.00013324572750127002, 'samples': 16791040, 'steps': 32794, 'batch_loss/train': 0.6434297049418092} 12/27/2021 06:52:38 - INFO - codeparrot_training - Step 32795: {'lr': 0.00013323178920887613, 'samples': 16791552, 'steps': 32795, 'batch_loss/train': 0.7212392066139728} 12/27/2021 06:52:48 - INFO - codeparrot_training - Step 32796: {'lr': 0.00013321785138070875, 'samples': 16792064, 'steps': 32796, 'batch_loss/train': 0.741472159512341} 12/27/2021 06:53:01 - INFO - codeparrot_training - Step 32797: {'lr': 0.00013320391401682314, 'samples': 16792576, 'steps': 32797, 'batch_loss/train': 0.7738949302583933} 12/27/2021 06:53:11 - INFO - codeparrot_training - Step 32798: {'lr': 0.00013318997711727446, 'samples': 16793088, 'steps': 32798, 'batch_loss/train': 0.8072568774223328} 12/27/2021 06:53:22 - INFO - codeparrot_training - Step 32799: {'lr': 0.00013317604068211844, 'samples': 16793600, 'steps': 32799, 'batch_loss/train': 0.7913613428827375} 12/27/2021 06:53:34 
- INFO - codeparrot_training - Step 32800: {'lr': 0.0001331621047114105, 'samples': 16794112, 'steps': 32800, 'batch_loss/train': 0.6802088441327214} 12/27/2021 06:53:45 - INFO - codeparrot_training - Step 32801: {'lr': 0.0001331481692052058, 'samples': 16794624, 'steps': 32801, 'batch_loss/train': 0.7696054391562939} 12/27/2021 06:53:55 - INFO - codeparrot_training - Step 32802: {'lr': 0.00013313423416355972, 'samples': 16795136, 'steps': 32802, 'batch_loss/train': 0.7473348379135132} 12/27/2021 06:54:06 - INFO - codeparrot_training - Step 32803: {'lr': 0.00013312029958652803, 'samples': 16795648, 'steps': 32803, 'batch_loss/train': 0.7180984066799283} 12/27/2021 06:54:19 - INFO - codeparrot_training - Step 32804: {'lr': 0.00013310636547416577, 'samples': 16796160, 'steps': 32804, 'batch_loss/train': 0.5049507222138345} 12/27/2021 06:54:29 - INFO - codeparrot_training - Step 32805: {'lr': 0.0001330924318265285, 'samples': 16796672, 'steps': 32805, 'batch_loss/train': 0.6206558221601881} 12/27/2021 06:54:40 - INFO - codeparrot_training - Step 32806: {'lr': 0.00013307849864367156, 'samples': 16797184, 'steps': 32806, 'batch_loss/train': 0.5996738448739052} 12/27/2021 06:54:53 - INFO - codeparrot_training - Step 32807: {'lr': 0.00013306456592565035, 'samples': 16797696, 'steps': 32807, 'batch_loss/train': 0.8286809651181102} 12/27/2021 06:55:03 - INFO - codeparrot_training - Step 32808: {'lr': 0.00013305063367252028, 'samples': 16798208, 'steps': 32808, 'batch_loss/train': 0.6549678630253766} 12/27/2021 06:55:14 - INFO - codeparrot_training - Step 32809: {'lr': 0.00013303670188433676, 'samples': 16798720, 'steps': 32809, 'batch_loss/train': 0.7726145309861749} 12/27/2021 06:55:25 - INFO - codeparrot_training - Step 32810: {'lr': 0.00013302277056115514, 'samples': 16799232, 'steps': 32810, 'batch_loss/train': 1.1665364727377892} 12/27/2021 06:55:37 - INFO - codeparrot_training - Step 32811: {'lr': 0.0001330088397030309, 'samples': 16799744, 'steps': 32811, 
'batch_loss/train': 0.7859709048643708} 12/27/2021 06:55:48 - INFO - codeparrot_training - Step 32812: {'lr': 0.00013299490931001923, 'samples': 16800256, 'steps': 32812, 'batch_loss/train': 0.7602070309221745} 12/27/2021 06:55:59 - INFO - codeparrot_training - Step 32813: {'lr': 0.00013298097938217558, 'samples': 16800768, 'steps': 32813, 'batch_loss/train': 0.7851401157677174} 12/27/2021 06:56:11 - INFO - codeparrot_training - Step 32814: {'lr': 0.0001329670499195555, 'samples': 16801280, 'steps': 32814, 'batch_loss/train': 0.7532081953249872} 12/27/2021 06:56:21 - INFO - codeparrot_training - Step 32815: {'lr': 0.0001329531209222142, 'samples': 16801792, 'steps': 32815, 'batch_loss/train': 0.6403605071827769} 12/27/2021 06:56:32 - INFO - codeparrot_training - Step 32816: {'lr': 0.00013293919239020703, 'samples': 16802304, 'steps': 32816, 'batch_loss/train': 0.6314808842726052} 12/27/2021 06:56:44 - INFO - codeparrot_training - Step 32817: {'lr': 0.00013292526432358948, 'samples': 16802816, 'steps': 32817, 'batch_loss/train': 0.6375460537092295} 12/27/2021 06:56:55 - INFO - codeparrot_training - Step 32818: {'lr': 0.00013291133672241684, 'samples': 16803328, 'steps': 32818, 'batch_loss/train': 0.8816806804388762} 12/27/2021 06:57:06 - INFO - codeparrot_training - Step 32819: {'lr': 0.00013289740958674452, 'samples': 16803840, 'steps': 32819, 'batch_loss/train': 0.7883828384801745} 12/27/2021 06:57:16 - INFO - codeparrot_training - Step 32820: {'lr': 0.000132883482916628, 'samples': 16804352, 'steps': 32820, 'batch_loss/train': 0.7584369946271181} 12/27/2021 06:57:28 - INFO - codeparrot_training - Step 32821: {'lr': 0.00013286955671212234, 'samples': 16804864, 'steps': 32821, 'batch_loss/train': 0.682821836322546} 12/27/2021 06:57:39 - INFO - codeparrot_training - Step 32822: {'lr': 0.00013285563097328317, 'samples': 16805376, 'steps': 32822, 'batch_loss/train': 0.7186226362828165} 12/27/2021 06:57:50 - INFO - codeparrot_training - Step 32823: {'lr': 
0.00013284170570016592, 'samples': 16805888, 'steps': 32823, 'batch_loss/train': 0.7671376443468034} 12/27/2021 06:58:02 - INFO - codeparrot_training - Step 32824: {'lr': 0.00013282778089282555, 'samples': 16806400, 'steps': 32824, 'batch_loss/train': 0.5213562897406518} 12/27/2021 06:58:13 - INFO - codeparrot_training - Step 32825: {'lr': 0.00013281385655131782, 'samples': 16806912, 'steps': 32825, 'batch_loss/train': 0.7861469546332955} 12/27/2021 06:58:24 - INFO - codeparrot_training - Step 32826: {'lr': 0.00013279993267569803, 'samples': 16807424, 'steps': 32826, 'batch_loss/train': 0.6877715641167015} 12/27/2021 06:58:36 - INFO - codeparrot_training - Step 32827: {'lr': 0.00013278600926602135, 'samples': 16807936, 'steps': 32827, 'batch_loss/train': 0.7591075574746355} 12/27/2021 06:58:47 - INFO - codeparrot_training - Step 32828: {'lr': 0.00013277208632234318, 'samples': 16808448, 'steps': 32828, 'batch_loss/train': 0.7818947965279222} 12/27/2021 06:58:57 - INFO - codeparrot_training - Step 32829: {'lr': 0.0001327581638447191, 'samples': 16808960, 'steps': 32829, 'batch_loss/train': 0.849380916915834} 12/27/2021 06:59:08 - INFO - codeparrot_training - Step 32830: {'lr': 0.0001327442418332042, 'samples': 16809472, 'steps': 32830, 'batch_loss/train': 0.7699391646310687} 12/27/2021 06:59:20 - INFO - codeparrot_training - Step 32831: {'lr': 0.00013273032028785392, 'samples': 16809984, 'steps': 32831, 'batch_loss/train': 0.7518552895635366} 12/27/2021 06:59:31 - INFO - codeparrot_training - Step 32832: {'lr': 0.0001327163992087236, 'samples': 16810496, 'steps': 32832, 'batch_loss/train': 0.8327233511954546} 12/27/2021 06:59:41 - INFO - codeparrot_training - Step 32833: {'lr': 0.0001327024785958686, 'samples': 16811008, 'steps': 32833, 'batch_loss/train': 0.7543230024166405} 12/27/2021 06:59:54 - INFO - codeparrot_training - Step 32834: {'lr': 0.00013268855844934425, 'samples': 16811520, 'steps': 32834, 'batch_loss/train': 0.641672482015565} 12/27/2021 07:00:05 - 
INFO - codeparrot_training - Step 32835: {'lr': 0.00013267463876920588, 'samples': 16812032, 'steps': 32835, 'batch_loss/train': 1.3990943033713847} 12/27/2021 07:00:15 - INFO - codeparrot_training - Step 32836: {'lr': 0.00013266071955550887, 'samples': 16812544, 'steps': 32836, 'batch_loss/train': 0.7200525314547122} 12/27/2021 07:00:26 - INFO - codeparrot_training - Step 32837: {'lr': 0.00013264680080830854, 'samples': 16813056, 'steps': 32837, 'batch_loss/train': 0.725541488151066} 12/27/2021 07:00:38 - INFO - codeparrot_training - Step 32838: {'lr': 0.0001326328825276603, 'samples': 16813568, 'steps': 32838, 'batch_loss/train': 0.6953508288133889} 12/27/2021 07:00:49 - INFO - codeparrot_training - Step 32839: {'lr': 0.0001326189647136192, 'samples': 16814080, 'steps': 32839, 'batch_loss/train': 0.7213190849870443} 12/27/2021 07:00:59 - INFO - codeparrot_training - Step 32840: {'lr': 0.000132605047366241, 'samples': 16814592, 'steps': 32840, 'batch_loss/train': 0.8122111177071929} 12/27/2021 07:01:12 - INFO - codeparrot_training - Step 32841: {'lr': 0.0001325911304855807, 'samples': 16815104, 'steps': 32841, 'batch_loss/train': 0.7857244269689545} 12/27/2021 07:01:23 - INFO - codeparrot_training - Step 32842: {'lr': 0.0001325772140716937, 'samples': 16815616, 'steps': 32842, 'batch_loss/train': 0.7669617924839258} 12/27/2021 07:01:33 - INFO - codeparrot_training - Step 32843: {'lr': 0.0001325632981246354, 'samples': 16816128, 'steps': 32843, 'batch_loss/train': 0.7675169124267995} 12/27/2021 07:01:46 - INFO - codeparrot_training - Step 32844: {'lr': 0.00013254938264446105, 'samples': 16816640, 'steps': 32844, 'batch_loss/train': 0.6544994059950113} 12/27/2021 07:01:56 - INFO - codeparrot_training - Step 32845: {'lr': 0.00013253546763122604, 'samples': 16817152, 'steps': 32845, 'batch_loss/train': 0.7869840445928276} 12/27/2021 07:02:07 - INFO - codeparrot_training - Step 32846: {'lr': 0.00013252155308498563, 'samples': 16817664, 'steps': 32846, 
'batch_loss/train': 0.7569695577258244} 12/27/2021 07:02:17 - INFO - codeparrot_training - Step 32847: {'lr': 0.00013250763900579516, 'samples': 16818176, 'steps': 32847, 'batch_loss/train': 0.8254059916362166} 12/27/2021 07:02:30 - INFO - codeparrot_training - Step 32848: {'lr': 0.00013249372539371, 'samples': 16818688, 'steps': 32848, 'batch_loss/train': 0.7751267636194825} 12/27/2021 07:02:40 - INFO - codeparrot_training - Step 32849: {'lr': 0.00013247981224878547, 'samples': 16819200, 'steps': 32849, 'batch_loss/train': 0.7831993447616696} 12/27/2021 07:02:51 - INFO - codeparrot_training - Step 32850: {'lr': 0.00013246589957107666, 'samples': 16819712, 'steps': 32850, 'batch_loss/train': 1.4230134515091777} 12/27/2021 07:03:04 - INFO - codeparrot_training - Step 32851: {'lr': 0.00013245198736063914, 'samples': 16820224, 'steps': 32851, 'batch_loss/train': 0.8569651544094086} 12/27/2021 07:03:14 - INFO - codeparrot_training - Step 32852: {'lr': 0.00013243807561752824, 'samples': 16820736, 'steps': 32852, 'batch_loss/train': 0.6583848823793232} 12/27/2021 07:03:25 - INFO - codeparrot_training - Step 32853: {'lr': 0.00013242416434179909, 'samples': 16821248, 'steps': 32853, 'batch_loss/train': 1.0567990555427969} 12/27/2021 07:03:37 - INFO - codeparrot_training - Step 32854: {'lr': 0.00013241025353350695, 'samples': 16821760, 'steps': 32854, 'batch_loss/train': 1.189592121168971} 12/27/2021 07:03:48 - INFO - codeparrot_training - Step 32855: {'lr': 0.00013239634319270742, 'samples': 16822272, 'steps': 32855, 'batch_loss/train': 0.8068193551152945} 12/27/2021 07:03:58 - INFO - codeparrot_training - Step 32856: {'lr': 0.00013238243331945555, 'samples': 16822784, 'steps': 32856, 'batch_loss/train': 0.7003826592117548} 12/27/2021 07:04:09 - INFO - codeparrot_training - Step 32857: {'lr': 0.00013236852391380668, 'samples': 16823296, 'steps': 32857, 'batch_loss/train': 0.7190386578440666} 12/27/2021 07:04:21 - INFO - codeparrot_training - Step 32858: {'lr': 
0.00013235461497581618, 'samples': 16823808, 'steps': 32858, 'batch_loss/train': 0.7823977656662464} 12/27/2021 07:04:32 - INFO - codeparrot_training - Step 32859: {'lr': 0.0001323407065055393, 'samples': 16824320, 'steps': 32859, 'batch_loss/train': 0.7252225491101854} 12/27/2021 07:04:42 - INFO - codeparrot_training - Step 32860: {'lr': 0.00013232679850303132, 'samples': 16824832, 'steps': 32860, 'batch_loss/train': 0.7935886327177286} 12/27/2021 07:04:54 - INFO - codeparrot_training - Step 32861: {'lr': 0.0001323128909683476, 'samples': 16825344, 'steps': 32861, 'batch_loss/train': 0.7360397204756737} 12/27/2021 07:05:05 - INFO - codeparrot_training - Step 32862: {'lr': 0.00013229898390154331, 'samples': 16825856, 'steps': 32862, 'batch_loss/train': 0.7599229793995619} 12/27/2021 07:05:16 - INFO - codeparrot_training - Step 32863: {'lr': 0.00013228507730267392, 'samples': 16826368, 'steps': 32863, 'batch_loss/train': 0.8208561611827463} 12/27/2021 07:05:29 - INFO - codeparrot_training - Step 32864: {'lr': 0.00013227117117179467, 'samples': 16826880, 'steps': 32864, 'batch_loss/train': 0.8334630457684398} 12/27/2021 07:05:39 - INFO - codeparrot_training - Step 32865: {'lr': 0.00013225726550896057, 'samples': 16827392, 'steps': 32865, 'batch_loss/train': 0.7453787387348711} 12/27/2021 07:05:50 - INFO - codeparrot_training - Step 32866: {'lr': 0.00013224336031422734, 'samples': 16827904, 'steps': 32866, 'batch_loss/train': 0.841276977211237} 12/27/2021 07:06:01 - INFO - codeparrot_training - Step 32867: {'lr': 0.00013222945558764995, 'samples': 16828416, 'steps': 32867, 'batch_loss/train': 0.8258506217971444} 12/27/2021 07:06:13 - INFO - codeparrot_training - Step 32868: {'lr': 0.00013221555132928377, 'samples': 16828928, 'steps': 32868, 'batch_loss/train': 0.6859809895977378} 12/27/2021 07:06:23 - INFO - codeparrot_training - Step 32869: {'lr': 0.0001322016475391841, 'samples': 16829440, 'steps': 32869, 'batch_loss/train': 0.8098117802292109} 12/27/2021 07:06:34 - 
INFO - codeparrot_training - Step 32870: {'lr': 0.0001321877442174062, 'samples': 16829952, 'steps': 32870, 'batch_loss/train': 0.8078249087557197} 12/27/2021 07:06:47 - INFO - codeparrot_training - Step 32871: {'lr': 0.00013217384136400535, 'samples': 16830464, 'steps': 32871, 'batch_loss/train': 0.9931638753041625} 12/27/2021 07:06:57 - INFO - codeparrot_training - Step 32872: {'lr': 0.00013215993897903677, 'samples': 16830976, 'steps': 32872, 'batch_loss/train': 0.7749100783839822} 12/27/2021 07:07:08 - INFO - codeparrot_training - Step 32873: {'lr': 0.00013214603706255583, 'samples': 16831488, 'steps': 32873, 'batch_loss/train': 0.7643644213676453} 12/27/2021 07:07:20 - INFO - codeparrot_training - Step 32874: {'lr': 0.00013213213561461774, 'samples': 16832000, 'steps': 32874, 'batch_loss/train': 0.7558420323766768} 12/27/2021 07:07:31 - INFO - codeparrot_training - Step 32875: {'lr': 0.00013211823463527784, 'samples': 16832512, 'steps': 32875, 'batch_loss/train': 0.611228501831647} 12/27/2021 07:07:41 - INFO - codeparrot_training - Step 32876: {'lr': 0.00013210433412459116, 'samples': 16833024, 'steps': 32876, 'batch_loss/train': 0.9048771616071463} 12/27/2021 07:07:52 - INFO - codeparrot_training - Step 32877: {'lr': 0.00013209043408261322, 'samples': 16833536, 'steps': 32877, 'batch_loss/train': 0.830366893671453} 12/27/2021 07:08:04 - INFO - codeparrot_training - Step 32878: {'lr': 0.00013207653450939927, 'samples': 16834048, 'steps': 32878, 'batch_loss/train': 0.7777322828769684} 12/27/2021 07:08:15 - INFO - codeparrot_training - Step 32879: {'lr': 0.0001320626354050044, 'samples': 16834560, 'steps': 32879, 'batch_loss/train': 0.7799492296762764} 12/27/2021 07:08:26 - INFO - codeparrot_training - Step 32880: {'lr': 0.00013204873676948392, 'samples': 16835072, 'steps': 32880, 'batch_loss/train': 0.6424266388639808} 12/27/2021 07:08:38 - INFO - codeparrot_training - Step 32881: {'lr': 0.0001320348386028933, 'samples': 16835584, 'steps': 32881, 
'batch_loss/train': 0.7581873973831534} 12/27/2021 07:08:49 - INFO - codeparrot_training - Step 32882: {'lr': 0.00013202094090528747, 'samples': 16836096, 'steps': 32882, 'batch_loss/train': 0.7074798932299018} 12/27/2021 07:09:00 - INFO - codeparrot_training - Step 32883: {'lr': 0.00013200704367672178, 'samples': 16836608, 'steps': 32883, 'batch_loss/train': 0.7400221014395356} 12/27/2021 07:09:10 - INFO - codeparrot_training - Step 32884: {'lr': 0.0001319931469172517, 'samples': 16837120, 'steps': 32884, 'batch_loss/train': 0.944164402782917} 12/27/2021 07:09:23 - INFO - codeparrot_training - Step 32885: {'lr': 0.00013197925062693222, 'samples': 16837632, 'steps': 32885, 'batch_loss/train': 0.7096533072181046} 12/27/2021 07:09:33 - INFO - codeparrot_training - Step 32886: {'lr': 0.00013196535480581868, 'samples': 16838144, 'steps': 32886, 'batch_loss/train': 0.81162931304425} 12/27/2021 07:09:44 - INFO - codeparrot_training - Step 32887: {'lr': 0.00013195145945396633, 'samples': 16838656, 'steps': 32887, 'batch_loss/train': 0.7095808517187834} 12/27/2021 07:09:56 - INFO - codeparrot_training - Step 32888: {'lr': 0.00013193756457143043, 'samples': 16839168, 'steps': 32888, 'batch_loss/train': 0.7700046300888062} 12/27/2021 07:10:07 - INFO - codeparrot_training - Step 32889: {'lr': 0.0001319236701582661, 'samples': 16839680, 'steps': 32889, 'batch_loss/train': 0.7903599864803255} 12/27/2021 07:10:17 - INFO - codeparrot_training - Step 32890: {'lr': 0.00013190977621452886, 'samples': 16840192, 'steps': 32890, 'batch_loss/train': 0.9045986557612196} 12/27/2021 07:10:30 - INFO - codeparrot_training - Step 32891: {'lr': 0.00013189588274027352, 'samples': 16840704, 'steps': 32891, 'batch_loss/train': 0.7401346350088716} 12/27/2021 07:10:41 - INFO - codeparrot_training - Step 32892: {'lr': 0.0001318819897355558, 'samples': 16841216, 'steps': 32892, 'batch_loss/train': 0.8345786537975073} 12/27/2021 07:10:51 - INFO - codeparrot_training - Step 32893: {'lr': 
0.00013186809720043058, 'samples': 16841728, 'steps': 32893, 'batch_loss/train': 0.7689736867323518} 12/27/2021 07:11:02 - INFO - codeparrot_training - Step 32894: {'lr': 0.00013185420513495317, 'samples': 16842240, 'steps': 32894, 'batch_loss/train': 0.6825482344720513} 12/27/2021 07:11:14 - INFO - codeparrot_training - Step 32895: {'lr': 0.00013184031353917887, 'samples': 16842752, 'steps': 32895, 'batch_loss/train': 0.7835796754807234} 12/27/2021 07:11:25 - INFO - codeparrot_training - Step 32896: {'lr': 0.0001318264224131628, 'samples': 16843264, 'steps': 32896, 'batch_loss/train': 0.8170063830912113} 12/27/2021 07:11:35 - INFO - codeparrot_training - Step 32897: {'lr': 0.0001318125317569603, 'samples': 16843776, 'steps': 32897, 'batch_loss/train': 0.8024990241974592} 12/27/2021 07:11:47 - INFO - codeparrot_training - Step 32898: {'lr': 0.00013179864157062654, 'samples': 16844288, 'steps': 32898, 'batch_loss/train': 0.6375899575068615} 12/27/2021 07:11:58 - INFO - codeparrot_training - Step 32899: {'lr': 0.00013178475185421677, 'samples': 16844800, 'steps': 32899, 'batch_loss/train': 0.7991022962378338} 12/27/2021 07:12:09 - INFO - codeparrot_training - Step 32900: {'lr': 0.00013177086260778614, 'samples': 16845312, 'steps': 32900, 'batch_loss/train': 0.7046256219036877} 12/27/2021 07:12:21 - INFO - codeparrot_training - Step 32901: {'lr': 0.00013175697383139008, 'samples': 16845824, 'steps': 32901, 'batch_loss/train': 0.7380210813134909} 12/27/2021 07:12:31 - INFO - codeparrot_training - Step 32902: {'lr': 0.0001317430855250834, 'samples': 16846336, 'steps': 32902, 'batch_loss/train': 0.7160908159567043} 12/27/2021 07:12:42 - INFO - codeparrot_training - Step 32903: {'lr': 0.00013172919768892168, 'samples': 16846848, 'steps': 32903, 'batch_loss/train': 0.7565447129309177} 12/27/2021 07:12:53 - INFO - codeparrot_training - Step 32904: {'lr': 0.00013171531032296014, 'samples': 16847360, 'steps': 32904, 'batch_loss/train': 0.6039606978883967} 12/27/2021 07:13:06 
- INFO - codeparrot_training - Step 32905: {'lr': 0.00013170142342725373, 'samples': 16847872, 'steps': 32905, 'batch_loss/train': 0.7086623213253915} 12/27/2021 07:13:16 - INFO - codeparrot_training - Step 32906: {'lr': 0.00013168753700185768, 'samples': 16848384, 'steps': 32906, 'batch_loss/train': 0.8464821884408593} 12/27/2021 07:13:27 - INFO - codeparrot_training - Step 32907: {'lr': 0.00013167365104682752, 'samples': 16848896, 'steps': 32907, 'batch_loss/train': 0.6724866447038949} 12/27/2021 07:13:39 - INFO - codeparrot_training - Step 32908: {'lr': 0.00013165976556221815, 'samples': 16849408, 'steps': 32908, 'batch_loss/train': 0.7982170293107629} 12/27/2021 07:13:50 - INFO - codeparrot_training - Step 32909: {'lr': 0.00013164588054808478, 'samples': 16849920, 'steps': 32909, 'batch_loss/train': 0.7635489697568119} 12/27/2021 07:14:01 - INFO - codeparrot_training - Step 32910: {'lr': 0.00013163199600448288, 'samples': 16850432, 'steps': 32910, 'batch_loss/train': 0.6116113001480699} 12/27/2021 07:14:13 - INFO - codeparrot_training - Step 32911: {'lr': 0.0001316181119314674, 'samples': 16850944, 'steps': 32911, 'batch_loss/train': 0.5897801085375249} 12/27/2021 07:14:24 - INFO - codeparrot_training - Step 32912: {'lr': 0.00013160422832909357, 'samples': 16851456, 'steps': 32912, 'batch_loss/train': 0.585562993073836} 12/27/2021 07:14:35 - INFO - codeparrot_training - Step 32913: {'lr': 0.00013159034519741664, 'samples': 16851968, 'steps': 32913, 'batch_loss/train': 0.6275815679691732} 12/27/2021 07:14:45 - INFO - codeparrot_training - Step 32914: {'lr': 0.0001315764625364918, 'samples': 16852480, 'steps': 32914, 'batch_loss/train': 0.5954747025389224} 12/27/2021 07:14:57 - INFO - codeparrot_training - Step 32915: {'lr': 0.0001315625803463742, 'samples': 16852992, 'steps': 32915, 'batch_loss/train': 0.6709097342099994} 12/27/2021 07:15:08 - INFO - codeparrot_training - Step 32916: {'lr': 0.00013154869862711918, 'samples': 16853504, 'steps': 32916, 
'batch_loss/train': 0.7578774122521281} 12/27/2021 07:15:19 - INFO - codeparrot_training - Step 32917: {'lr': 0.00013153481737878162, 'samples': 16854016, 'steps': 32917, 'batch_loss/train': 0.8105420222273096} 12/27/2021 07:15:31 - INFO - codeparrot_training - Step 32918: {'lr': 0.000131520936601417, 'samples': 16854528, 'steps': 32918, 'batch_loss/train': 0.748603736050427} 12/27/2021 07:15:41 - INFO - codeparrot_training - Step 32919: {'lr': 0.00013150705629508047, 'samples': 16855040, 'steps': 32919, 'batch_loss/train': 0.7303206343203783} 12/27/2021 07:15:52 - INFO - codeparrot_training - Step 32920: {'lr': 0.00013149317645982694, 'samples': 16855552, 'steps': 32920, 'batch_loss/train': 0.7241820120252669} 12/27/2021 07:16:05 - INFO - codeparrot_training - Step 32921: {'lr': 0.000131479297095712, 'samples': 16856064, 'steps': 32921, 'batch_loss/train': 0.750176539644599} 12/27/2021 07:16:15 - INFO - codeparrot_training - Step 32922: {'lr': 0.00013146541820279056, 'samples': 16856576, 'steps': 32922, 'batch_loss/train': 0.7128347128164023} 12/27/2021 07:16:26 - INFO - codeparrot_training - Step 32923: {'lr': 0.00013145153978111782, 'samples': 16857088, 'steps': 32923, 'batch_loss/train': 0.69252646015957} 12/27/2021 07:16:37 - INFO - codeparrot_training - Step 32924: {'lr': 0.000131437661830749, 'samples': 16857600, 'steps': 32924, 'batch_loss/train': 0.7317116083577275} 12/27/2021 07:16:49 - INFO - codeparrot_training - Step 32925: {'lr': 0.0001314237843517393, 'samples': 16858112, 'steps': 32925, 'batch_loss/train': 0.7128299549221992} 12/27/2021 07:16:59 - INFO - codeparrot_training - Step 32926: {'lr': 0.00013140990734414383, 'samples': 16858624, 'steps': 32926, 'batch_loss/train': 0.7499518163967878} 12/27/2021 07:17:10 - INFO - codeparrot_training - Step 32927: {'lr': 0.00013139603080801783, 'samples': 16859136, 'steps': 32927, 'batch_loss/train': 0.7891016928479075} 12/27/2021 07:17:23 - INFO - codeparrot_training - Step 32928: {'lr': 
0.00013138215474341636, 'samples': 16859648, 'steps': 32928, 'batch_loss/train': 0.8440863816067576} 12/27/2021 07:17:33 - INFO - codeparrot_training - Step 32929: {'lr': 0.0001313682791503947, 'samples': 16860160, 'steps': 32929, 'batch_loss/train': 0.7917776466347277} 12/27/2021 07:17:44 - INFO - codeparrot_training - Step 32930: {'lr': 0.00013135440402900806, 'samples': 16860672, 'steps': 32930, 'batch_loss/train': 0.7815025025047362} 12/27/2021 07:17:56 - INFO - codeparrot_training - Step 32931: {'lr': 0.0001313405293793114, 'samples': 16861184, 'steps': 32931, 'batch_loss/train': 0.7244967077858746} 12/27/2021 07:18:07 - INFO - codeparrot_training - Step 32932: {'lr': 0.0001313266552013599, 'samples': 16861696, 'steps': 32932, 'batch_loss/train': 0.7724434286355972} 12/27/2021 07:18:18 - INFO - codeparrot_training - Step 32933: {'lr': 0.00013131278149520903, 'samples': 16862208, 'steps': 32933, 'batch_loss/train': 0.7231005267240107} 12/27/2021 07:18:28 - INFO - codeparrot_training - Step 32934: {'lr': 0.00013129890826091362, 'samples': 16862720, 'steps': 32934, 'batch_loss/train': 0.7239410688634962} 12/27/2021 07:18:40 - INFO - codeparrot_training - Step 32935: {'lr': 0.0001312850354985288, 'samples': 16863232, 'steps': 32935, 'batch_loss/train': 0.6896098426077515} 12/27/2021 07:18:51 - INFO - codeparrot_training - Step 32936: {'lr': 0.00013127116320811008, 'samples': 16863744, 'steps': 32936, 'batch_loss/train': 0.7052327496930957} 12/27/2021 07:19:01 - INFO - codeparrot_training - Step 32937: {'lr': 0.00013125729138971227, 'samples': 16864256, 'steps': 32937, 'batch_loss/train': 0.7971372618339956} 12/27/2021 07:19:16 - INFO - codeparrot_training - Step 32938: {'lr': 0.00013124342004339066, 'samples': 16864768, 'steps': 32938, 'batch_loss/train': 0.7708628945983946} 12/27/2021 07:19:26 - INFO - codeparrot_training - Step 32939: {'lr': 0.00013122954916920034, 'samples': 16865280, 'steps': 32939, 'batch_loss/train': 0.7681284565478563} 12/27/2021 07:19:37 - 
INFO - codeparrot_training - Step 32940: {'lr': 0.0001312156787671965, 'samples': 16865792, 'steps': 32940, 'batch_loss/train': 0.8115362748503685} 12/27/2021 07:19:49 - INFO - codeparrot_training - Step 32941: {'lr': 0.00013120180883743426, 'samples': 16866304, 'steps': 32941, 'batch_loss/train': 0.6784637807868421} 12/27/2021 07:20:00 - INFO - codeparrot_training - Step 32942: {'lr': 0.0001311879393799689, 'samples': 16866816, 'steps': 32942, 'batch_loss/train': 0.8459016834385693} 12/27/2021 07:20:10 - INFO - codeparrot_training - Step 32943: {'lr': 0.00013117407039485522, 'samples': 16867328, 'steps': 32943, 'batch_loss/train': 0.6842795661650598} 12/27/2021 07:20:21 - INFO - codeparrot_training - Step 32944: {'lr': 0.00013116020188214868, 'samples': 16867840, 'steps': 32944, 'batch_loss/train': 0.7941965481732041} 12/27/2021 07:20:35 - INFO - codeparrot_training - Step 32945: {'lr': 0.00013114633384190435, 'samples': 16868352, 'steps': 32945, 'batch_loss/train': 0.8425123346969485} 12/27/2021 07:20:46 - INFO - codeparrot_training - Step 32946: {'lr': 0.00013113246627417715, 'samples': 16868864, 'steps': 32946, 'batch_loss/train': 0.747466066852212} 12/27/2021 07:20:56 - INFO - codeparrot_training - Step 32947: {'lr': 0.0001311185991790226, 'samples': 16869376, 'steps': 32947, 'batch_loss/train': 0.8171129857655615} 12/27/2021 07:21:08 - INFO - codeparrot_training - Step 32948: {'lr': 0.0001311047325564955, 'samples': 16869888, 'steps': 32948, 'batch_loss/train': 0.5929455370642245} 12/27/2021 07:21:19 - INFO - codeparrot_training - Step 32949: {'lr': 0.0001310908664066511, 'samples': 16870400, 'steps': 32949, 'batch_loss/train': 1.009910627733916} 12/27/2021 07:21:30 - INFO - codeparrot_training - Step 32950: {'lr': 0.0001310770007295445, 'samples': 16870912, 'steps': 32950, 'batch_loss/train': 0.6971886038663797} 12/27/2021 07:21:42 - INFO - codeparrot_training - Step 32951: {'lr': 0.00013106313552523085, 'samples': 16871424, 'steps': 32951, 
'batch_loss/train': 0.7196250692941248} 12/27/2021 07:21:53 - INFO - codeparrot_training - Step 32952: {'lr': 0.00013104927079376529, 'samples': 16871936, 'steps': 32952, 'batch_loss/train': 0.8056912841275334} 12/27/2021 07:22:03 - INFO - codeparrot_training - Step 32953: {'lr': 0.00013103540653520285, 'samples': 16872448, 'steps': 32953, 'batch_loss/train': 0.7046487461775541} 12/27/2021 07:22:14 - INFO - codeparrot_training - Step 32954: {'lr': 0.00013102154274959875, 'samples': 16872960, 'steps': 32954, 'batch_loss/train': 0.7229498573578894} 12/27/2021 07:22:28 - INFO - codeparrot_training - Step 32955: {'lr': 0.00013100767943700808, 'samples': 16873472, 'steps': 32955, 'batch_loss/train': 0.8072086186148226} 12/27/2021 07:22:38 - INFO - codeparrot_training - Step 32956: {'lr': 0.00013099381659748604, 'samples': 16873984, 'steps': 32956, 'batch_loss/train': 0.8204257772304118} 12/27/2021 07:22:49 - INFO - codeparrot_training - Step 32957: {'lr': 0.00013097995423108744, 'samples': 16874496, 'steps': 32957, 'batch_loss/train': 0.684420008212328} 12/27/2021 07:23:01 - INFO - codeparrot_training - Step 32958: {'lr': 0.0001309660923378677, 'samples': 16875008, 'steps': 32958, 'batch_loss/train': 0.883733082562685} 12/27/2021 07:23:12 - INFO - codeparrot_training - Step 32959: {'lr': 0.00013095223091788195, 'samples': 16875520, 'steps': 32959, 'batch_loss/train': 0.7058332515880466} 12/27/2021 07:23:22 - INFO - codeparrot_training - Step 32960: {'lr': 0.00013093836997118508, 'samples': 16876032, 'steps': 32960, 'batch_loss/train': 0.6654223224613816} 12/27/2021 07:23:34 - INFO - codeparrot_training - Step 32961: {'lr': 0.0001309245094978322, 'samples': 16876544, 'steps': 32961, 'batch_loss/train': 0.7436138014309108} 12/27/2021 07:23:45 - INFO - codeparrot_training - Step 32962: {'lr': 0.00013091064949787866, 'samples': 16877056, 'steps': 32962, 'batch_loss/train': 0.7887880485504866} 12/27/2021 07:23:55 - INFO - codeparrot_training - Step 32963: {'lr': 
0.00013089678997137938, 'samples': 16877568, 'steps': 32963, 'batch_loss/train': 0.7628467138856649} 12/27/2021 07:24:09 - INFO - codeparrot_training - Step 32964: {'lr': 0.00013088293091838946, 'samples': 16878080, 'steps': 32964, 'batch_loss/train': 0.6912886593490839} 12/27/2021 07:24:20 - INFO - codeparrot_training - Step 32965: {'lr': 0.00013086907233896404, 'samples': 16878592, 'steps': 32965, 'batch_loss/train': 0.74368069274351} 12/27/2021 07:24:31 - INFO - codeparrot_training - Step 32966: {'lr': 0.00013085521423315818, 'samples': 16879104, 'steps': 32966, 'batch_loss/train': 0.7081708614714444} 12/27/2021 07:24:41 - INFO - codeparrot_training - Step 32967: {'lr': 0.00013084135660102703, 'samples': 16879616, 'steps': 32967, 'batch_loss/train': 0.7706433702260256} 12/27/2021 07:24:54 - INFO - codeparrot_training - Step 32968: {'lr': 0.00013082749944262574, 'samples': 16880128, 'steps': 32968, 'batch_loss/train': 0.742536305449903} 12/27/2021 07:25:04 - INFO - codeparrot_training - Step 32969: {'lr': 0.0001308136427580091, 'samples': 16880640, 'steps': 32969, 'batch_loss/train': 0.8013458016794175} 12/27/2021 07:25:15 - INFO - codeparrot_training - Step 32970: {'lr': 0.0001307997865472325, 'samples': 16881152, 'steps': 32970, 'batch_loss/train': 0.7305372217670083} 12/27/2021 07:25:29 - INFO - codeparrot_training - Step 32971: {'lr': 0.00013078593081035112, 'samples': 16881664, 'steps': 32971, 'batch_loss/train': 0.6657405681908131} 12/27/2021 07:25:39 - INFO - codeparrot_training - Step 32972: {'lr': 0.0001307720755474196, 'samples': 16882176, 'steps': 32972, 'batch_loss/train': 0.6507818284444511} 12/27/2021 07:25:50 - INFO - codeparrot_training - Step 32973: {'lr': 0.0001307582207584934, 'samples': 16882688, 'steps': 32973, 'batch_loss/train': 0.7506435411050916} 12/27/2021 07:26:02 - INFO - codeparrot_training - Step 32974: {'lr': 0.00013074436644362758, 'samples': 16883200, 'steps': 32974, 'batch_loss/train': 0.6257905643433332} 12/27/2021 07:26:13 - 
INFO - codeparrot_training - Step 32975: {'lr': 0.00013073051260287705, 'samples': 16883712, 'steps': 32975, 'batch_loss/train': 0.8203939255326986} 12/27/2021 07:26:23 - INFO - codeparrot_training - Step 32976: {'lr': 0.00013071665923629697, 'samples': 16884224, 'steps': 32976, 'batch_loss/train': 0.8561685268068686} 12/27/2021 07:26:34 - INFO - codeparrot_training - Step 32977: {'lr': 0.0001307028063439424, 'samples': 16884736, 'steps': 32977, 'batch_loss/train': 0.8043449865654111} 12/27/2021 07:26:46 - INFO - codeparrot_training - Step 32978: {'lr': 0.00013068895392586842, 'samples': 16885248, 'steps': 32978, 'batch_loss/train': 0.7865976542234421} 12/27/2021 07:26:57 - INFO - codeparrot_training - Step 32979: {'lr': 0.00013067510198213012, 'samples': 16885760, 'steps': 32979, 'batch_loss/train': 0.8015189610887319} 12/27/2021 07:27:07 - INFO - codeparrot_training - Step 32980: {'lr': 0.00013066125051278257, 'samples': 16886272, 'steps': 32980, 'batch_loss/train': 0.775541786570102} 12/27/2021 07:27:19 - INFO - codeparrot_training - Step 32981: {'lr': 0.0001306473995178808, 'samples': 16886784, 'steps': 32981, 'batch_loss/train': 0.7290052995085716} 12/27/2021 07:27:30 - INFO - codeparrot_training - Step 32982: {'lr': 0.00013063354899748003, 'samples': 16887296, 'steps': 32982, 'batch_loss/train': 0.5464557166560553} 12/27/2021 07:27:41 - INFO - codeparrot_training - Step 32983: {'lr': 0.000130619698951635, 'samples': 16887808, 'steps': 32983, 'batch_loss/train': 0.9088236875832081} 12/27/2021 07:27:54 - INFO - codeparrot_training - Step 32984: {'lr': 0.00013060584938040106, 'samples': 16888320, 'steps': 32984, 'batch_loss/train': 0.7015139553695917} 12/27/2021 07:28:05 - INFO - codeparrot_training - Step 32985: {'lr': 0.00013059200028383327, 'samples': 16888832, 'steps': 32985, 'batch_loss/train': 0.7830542987212539} 12/27/2021 07:28:16 - INFO - codeparrot_training - Step 32986: {'lr': 0.0001305781516619865, 'samples': 16889344, 'steps': 32986, 
'batch_loss/train': 0.7371012014336884} 12/27/2021 07:28:28 - INFO - codeparrot_training - Step 32987: {'lr': 0.00013056430351491582, 'samples': 16889856, 'steps': 32987, 'batch_loss/train': 0.8012810312211514} 12/27/2021 07:28:38 - INFO - codeparrot_training - Step 32988: {'lr': 0.00013055045584267657, 'samples': 16890368, 'steps': 32988, 'batch_loss/train': 0.8047000712249428} 12/27/2021 07:28:49 - INFO - codeparrot_training - Step 32989: {'lr': 0.0001305366086453235, 'samples': 16890880, 'steps': 32989, 'batch_loss/train': 0.7526776068843901} 12/27/2021 07:28:59 - INFO - codeparrot_training - Step 32990: {'lr': 0.0001305227619229118, 'samples': 16891392, 'steps': 32990, 'batch_loss/train': 0.8138421294279397} 12/27/2021 07:29:11 - INFO - codeparrot_training - Step 32991: {'lr': 0.00013050891567549644, 'samples': 16891904, 'steps': 32991, 'batch_loss/train': 0.7221095897257328} 12/27/2021 07:29:22 - INFO - codeparrot_training - Step 32992: {'lr': 0.00013049506990313255, 'samples': 16892416, 'steps': 32992, 'batch_loss/train': 0.7731274319812655} 12/27/2021 07:29:33 - INFO - codeparrot_training - Step 32993: {'lr': 0.0001304812246058751, 'samples': 16892928, 'steps': 32993, 'batch_loss/train': 0.7860258971340954} 12/27/2021 07:29:47 - INFO - codeparrot_training - Step 32994: {'lr': 0.0001304673797837792, 'samples': 16893440, 'steps': 32994, 'batch_loss/train': 0.7741958638653159} 12/27/2021 07:29:57 - INFO - codeparrot_training - Step 32995: {'lr': 0.00013045353543689986, 'samples': 16893952, 'steps': 32995, 'batch_loss/train': 0.8239010591059923} 12/27/2021 07:30:08 - INFO - codeparrot_training - Step 32996: {'lr': 0.0001304396915652921, 'samples': 16894464, 'steps': 32996, 'batch_loss/train': 1.1938555287197232} 12/27/2021 07:30:20 - INFO - codeparrot_training - Step 32997: {'lr': 0.00013042584816901108, 'samples': 16894976, 'steps': 32997, 'batch_loss/train': 0.6675433318596333} 12/27/2021 07:30:31 - INFO - codeparrot_training - Step 32998: {'lr': 
0.0001304120052481116, 'samples': 16895488, 'steps': 32998, 'batch_loss/train': 0.6136237466707826} 12/27/2021 07:30:41 - INFO - codeparrot_training - Step 32999: {'lr': 0.00013039816280264888, 'samples': 16896000, 'steps': 32999, 'batch_loss/train': 0.6994451936334372} 12/27/2021 07:30:52 - INFO - codeparrot_training - Step 33000: {'lr': 0.00013038432083267804, 'samples': 16896512, 'steps': 33000, 'batch_loss/train': 0.87385423341766} 12/27/2021 07:31:06 - INFO - codeparrot_training - Step 33001: {'lr': 0.00013037047933825386, 'samples': 16897024, 'steps': 33001, 'batch_loss/train': 0.7837154581211507} 12/27/2021 07:31:17 - INFO - codeparrot_training - Step 33002: {'lr': 0.00013035663831943145, 'samples': 16897536, 'steps': 33002, 'batch_loss/train': 0.8125774306245148} 12/27/2021 07:31:27 - INFO - codeparrot_training - Step 33003: {'lr': 0.0001303427977762659, 'samples': 16898048, 'steps': 33003, 'batch_loss/train': 0.650711068417877} 12/27/2021 07:31:39 - INFO - codeparrot_training - Step 33004: {'lr': 0.00013032895770881221, 'samples': 16898560, 'steps': 33004, 'batch_loss/train': 0.6036619418300688} 12/27/2021 07:31:50 - INFO - codeparrot_training - Step 33005: {'lr': 0.00013031511811712536, 'samples': 16899072, 'steps': 33005, 'batch_loss/train': 0.7359100128524005} 12/27/2021 07:32:01 - INFO - codeparrot_training - Step 33006: {'lr': 0.00013030127900126045, 'samples': 16899584, 'steps': 33006, 'batch_loss/train': 0.7456070617772639} 12/27/2021 07:32:13 - INFO - codeparrot_training - Step 33007: {'lr': 0.00013028744036127247, 'samples': 16900096, 'steps': 33007, 'batch_loss/train': 0.7570881098508835} 12/27/2021 07:32:23 - INFO - codeparrot_training - Step 33008: {'lr': 0.00013027360219721641, 'samples': 16900608, 'steps': 33008, 'batch_loss/train': 0.7266429516021162} 12/27/2021 07:32:34 - INFO - codeparrot_training - Step 33009: {'lr': 0.0001302597645091473, 'samples': 16901120, 'steps': 33009, 'batch_loss/train': 0.7155283857136965} 12/27/2021 07:32:45 - 
INFO - codeparrot_training - Step 33010: {'lr': 0.0001302459272971201, 'samples': 16901632, 'steps': 33010, 'batch_loss/train': 0.6676379139535129} 12/27/2021 07:32:59 - INFO - codeparrot_training - Step 33011: {'lr': 0.00013023209056119007, 'samples': 16902144, 'steps': 33011, 'batch_loss/train': 0.6223365605110303} 12/27/2021 07:33:09 - INFO - codeparrot_training - Step 33012: {'lr': 0.00013021825430141188, 'samples': 16902656, 'steps': 33012, 'batch_loss/train': 0.7606517458334565} 12/27/2021 07:33:20 - INFO - codeparrot_training - Step 33013: {'lr': 0.00013020441851784056, 'samples': 16903168, 'steps': 33013, 'batch_loss/train': 0.8552352711558342} 12/27/2021 07:33:32 - INFO - codeparrot_training - Step 33014: {'lr': 0.00013019058321053146, 'samples': 16903680, 'steps': 33014, 'batch_loss/train': 0.7093945741653442} 12/27/2021 07:33:43 - INFO - codeparrot_training - Step 33015: {'lr': 0.00013017674837953924, 'samples': 16904192, 'steps': 33015, 'batch_loss/train': 0.7223528125323355} 12/27/2021 07:33:54 - INFO - codeparrot_training - Step 33016: {'lr': 0.00013016291402491908, 'samples': 16904704, 'steps': 33016, 'batch_loss/train': 0.7291096289409325} 12/27/2021 07:34:06 - INFO - codeparrot_training - Step 33017: {'lr': 0.00013014908014672588, 'samples': 16905216, 'steps': 33017, 'batch_loss/train': 0.7322072335518897} 12/27/2021 07:34:17 - INFO - codeparrot_training - Step 33018: {'lr': 0.0001301352467450147, 'samples': 16905728, 'steps': 33018, 'batch_loss/train': 0.7543410630896688} 12/27/2021 07:34:27 - INFO - codeparrot_training - Step 33019: {'lr': 0.00013012141381984053, 'samples': 16906240, 'steps': 33019, 'batch_loss/train': 0.7517090477049351} 12/27/2021 07:34:38 - INFO - codeparrot_training - Step 33020: {'lr': 0.00013010758137125834, 'samples': 16906752, 'steps': 33020, 'batch_loss/train': 0.6798605914227664} 12/27/2021 07:34:50 - INFO - codeparrot_training - Step 33021: {'lr': 0.0001300937493993231, 'samples': 16907264, 'steps': 33021, 
'batch_loss/train': 0.8052314859814942} 12/27/2021 07:35:01 - INFO - codeparrot_training - Step 33022: {'lr': 0.0001300799179040899, 'samples': 16907776, 'steps': 33022, 'batch_loss/train': 0.8029906190931797} 12/27/2021 07:35:11 - INFO - codeparrot_training - Step 33023: {'lr': 0.00013006608688561375, 'samples': 16908288, 'steps': 33023, 'batch_loss/train': 0.6626090835779905} 12/27/2021 07:35:25 - INFO - codeparrot_training - Step 33024: {'lr': 0.00013005225634394934, 'samples': 16908800, 'steps': 33024, 'batch_loss/train': 0.7219842579215765} 12/27/2021 07:35:36 - INFO - codeparrot_training - Step 33025: {'lr': 0.00013003842627915197, 'samples': 16909312, 'steps': 33025, 'batch_loss/train': 0.7397825348889455} 12/27/2021 07:35:46 - INFO - codeparrot_training - Step 33026: {'lr': 0.00013002459669127664, 'samples': 16909824, 'steps': 33026, 'batch_loss/train': 0.7907002130523324} 12/27/2021 07:35:58 - INFO - codeparrot_training - Step 33027: {'lr': 0.0001300107675803781, 'samples': 16910336, 'steps': 33027, 'batch_loss/train': 0.6942002438008785} 12/27/2021 07:36:09 - INFO - codeparrot_training - Step 33028: {'lr': 0.00012999693894651142, 'samples': 16910848, 'steps': 33028, 'batch_loss/train': 0.6600740498397499} 12/27/2021 07:36:20 - INFO - codeparrot_training - Step 33029: {'lr': 0.00012998311078973163, 'samples': 16911360, 'steps': 33029, 'batch_loss/train': 0.6810560352168977} 12/27/2021 07:36:34 - INFO - codeparrot_training - Step 33030: {'lr': 0.00012996928311009365, 'samples': 16911872, 'steps': 33030, 'batch_loss/train': 0.7965761981904507} 12/27/2021 07:36:44 - INFO - codeparrot_training - Step 33031: {'lr': 0.00012995545590765252, 'samples': 16912384, 'steps': 33031, 'batch_loss/train': 0.7545371102169156} 12/27/2021 07:36:55 - INFO - codeparrot_training - Step 33032: {'lr': 0.0001299416291824631, 'samples': 16912896, 'steps': 33032, 'batch_loss/train': 0.9648510161787271} 12/27/2021 07:37:06 - INFO - codeparrot_training - Step 33033: {'lr': 
0.0001299278029345805, 'samples': 16913408, 'steps': 33033, 'batch_loss/train': 0.700017096940428} 12/27/2021 07:37:18 - INFO - codeparrot_training - Step 33034: {'lr': 0.00012991397716405958, 'samples': 16913920, 'steps': 33034, 'batch_loss/train': 0.7247361443005502} 12/27/2021 07:37:28 - INFO - codeparrot_training - Step 33035: {'lr': 0.00012990015187095536, 'samples': 16914432, 'steps': 33035, 'batch_loss/train': 0.7559674242511392} 12/27/2021 07:37:39 - INFO - codeparrot_training - Step 33036: {'lr': 0.00012988632705532278, 'samples': 16914944, 'steps': 33036, 'batch_loss/train': 0.8078911155462265} 12/27/2021 07:37:51 - INFO - codeparrot_training - Step 33037: {'lr': 0.0001298725027172169, 'samples': 16915456, 'steps': 33037, 'batch_loss/train': 0.7260810890002176} 12/27/2021 07:38:02 - INFO - codeparrot_training - Step 33038: {'lr': 0.00012985867885669251, 'samples': 16915968, 'steps': 33038, 'batch_loss/train': 0.8561580339446664} 12/27/2021 07:38:12 - INFO - codeparrot_training - Step 33039: {'lr': 0.00012984485547380456, 'samples': 16916480, 'steps': 33039, 'batch_loss/train': 0.7124265702441335} 12/27/2021 07:38:26 - INFO - codeparrot_training - Step 33040: {'lr': 0.00012983103256860828, 'samples': 16916992, 'steps': 33040, 'batch_loss/train': 0.6895460924133658} 12/27/2021 07:38:37 - INFO - codeparrot_training - Step 33041: {'lr': 0.00012981721014115833, 'samples': 16917504, 'steps': 33041, 'batch_loss/train': 0.7078900029882789} 12/27/2021 07:38:48 - INFO - codeparrot_training - Step 33042: {'lr': 0.00012980338819150978, 'samples': 16918016, 'steps': 33042, 'batch_loss/train': 1.3127699922770262} 12/27/2021 07:38:58 - INFO - codeparrot_training - Step 33043: {'lr': 0.0001297895667197176, 'samples': 16918528, 'steps': 33043, 'batch_loss/train': 0.730896714143455} 12/27/2021 07:39:10 - INFO - codeparrot_training - Step 33044: {'lr': 0.00012977574572583666, 'samples': 16919040, 'steps': 33044, 'batch_loss/train': 0.8628409137018025} 12/27/2021 07:39:21 - 
INFO - codeparrot_training - Step 33045: {'lr': 0.000129761925209922, 'samples': 16919552, 'steps': 33045, 'batch_loss/train': 0.7782676494680345} 12/27/2021 07:39:31 - INFO - codeparrot_training - Step 33046: {'lr': 0.0001297481051720285, 'samples': 16920064, 'steps': 33046, 'batch_loss/train': 0.7206461448222399} 12/27/2021 07:39:45 - INFO - codeparrot_training - Step 33047: {'lr': 0.00012973428561221118, 'samples': 16920576, 'steps': 33047, 'batch_loss/train': 0.788434712216258} 12/27/2021 07:39:56 - INFO - codeparrot_training - Step 33048: {'lr': 0.00012972046653052488, 'samples': 16921088, 'steps': 33048, 'batch_loss/train': 0.7999296663329005} 12/27/2021 07:40:07 - INFO - codeparrot_training - Step 33049: {'lr': 0.0001297066479270247, 'samples': 16921600, 'steps': 33049, 'batch_loss/train': 0.7158998269587755} 12/27/2021 07:40:17 - INFO - codeparrot_training - Step 33050: {'lr': 0.00012969282980176527, 'samples': 16922112, 'steps': 33050, 'batch_loss/train': 0.8626576634123921} 12/27/2021 07:40:30 - INFO - codeparrot_training - Step 33051: {'lr': 0.00012967901215480183, 'samples': 16922624, 'steps': 33051, 'batch_loss/train': 0.7917128328699619} 12/27/2021 07:40:40 - INFO - codeparrot_training - Step 33052: {'lr': 0.00012966519498618928, 'samples': 16923136, 'steps': 33052, 'batch_loss/train': 0.7626309422776103} 12/27/2021 07:40:51 - INFO - codeparrot_training - Step 33053: {'lr': 0.0001296513782959824, 'samples': 16923648, 'steps': 33053, 'batch_loss/train': 0.7607382764108479} 12/27/2021 07:41:03 - INFO - codeparrot_training - Step 33054: {'lr': 0.0001296375620842361, 'samples': 16924160, 'steps': 33054, 'batch_loss/train': 0.9050036128610373} 12/27/2021 07:41:14 - INFO - codeparrot_training - Step 33055: {'lr': 0.0001296237463510056, 'samples': 16924672, 'steps': 33055, 'batch_loss/train': 0.7409554552286863} 12/27/2021 07:41:25 - INFO - codeparrot_training - Step 33056: {'lr': 0.00012960993109634552, 'samples': 16925184, 'steps': 33056, 
'batch_loss/train': 0.8402601880952716} 12/27/2021 07:41:37 - INFO - codeparrot_training - Step 33057: {'lr': 0.00012959611632031094, 'samples': 16925696, 'steps': 33057, 'batch_loss/train': 0.7532445134129375} 12/27/2021 07:41:47 - INFO - codeparrot_training - Step 33058: {'lr': 0.0001295823020229567, 'samples': 16926208, 'steps': 33058, 'batch_loss/train': 0.7237564516253769} 12/27/2021 07:41:58 - INFO - codeparrot_training - Step 33059: {'lr': 0.00012956848820433776, 'samples': 16926720, 'steps': 33059, 'batch_loss/train': 0.7907122885808349} 12/27/2021 07:42:09 - INFO - codeparrot_training - Step 33060: {'lr': 0.00012955467486450907, 'samples': 16927232, 'steps': 33060, 'batch_loss/train': 0.8517245100811124} 12/27/2021 07:42:22 - INFO - codeparrot_training - Step 33061: {'lr': 0.00012954086200352545, 'samples': 16927744, 'steps': 33061, 'batch_loss/train': 0.772533037699759} 12/27/2021 07:42:33 - INFO - codeparrot_training - Step 33062: {'lr': 0.00012952704962144193, 'samples': 16928256, 'steps': 33062, 'batch_loss/train': 0.7245263258228078} 12/27/2021 07:42:44 - INFO - codeparrot_training - Step 33063: {'lr': 0.00012951323771831337, 'samples': 16928768, 'steps': 33063, 'batch_loss/train': 0.7308290613582358} 12/27/2021 07:42:56 - INFO - codeparrot_training - Step 33064: {'lr': 0.00012949942629419476, 'samples': 16929280, 'steps': 33064, 'batch_loss/train': 0.8097431696951389} 12/27/2021 07:43:07 - INFO - codeparrot_training - Step 33065: {'lr': 0.00012948561534914074, 'samples': 16929792, 'steps': 33065, 'batch_loss/train': 0.7754724938422441} 12/27/2021 07:43:17 - INFO - codeparrot_training - Step 33066: {'lr': 0.00012947180488320661, 'samples': 16930304, 'steps': 33066, 'batch_loss/train': 0.7495453534647822} 12/27/2021 07:43:29 - INFO - codeparrot_training - Step 33067: {'lr': 0.000129457994896447, 'samples': 16930816, 'steps': 33067, 'batch_loss/train': 0.6831132606603205} 12/27/2021 07:43:40 - INFO - codeparrot_training - Step 33068: {'lr': 
0.00012944418538891675, 'samples': 16931328, 'steps': 33068, 'batch_loss/train': 0.7955010803416371} 12/27/2021 07:43:51 - INFO - codeparrot_training - Step 33069: {'lr': 0.00012943037636067113, 'samples': 16931840, 'steps': 33069, 'batch_loss/train': 0.7393618590431288} 12/27/2021 07:44:04 - INFO - codeparrot_training - Step 33070: {'lr': 0.00012941656781176468, 'samples': 16932352, 'steps': 33070, 'batch_loss/train': 0.767703376011923} 12/27/2021 07:44:15 - INFO - codeparrot_training - Step 33071: {'lr': 0.00012940275974225243, 'samples': 16932864, 'steps': 33071, 'batch_loss/train': 0.7113293400034308} 12/27/2021 07:44:26 - INFO - codeparrot_training - Step 33072: {'lr': 0.0001293889521521893, 'samples': 16933376, 'steps': 33072, 'batch_loss/train': 0.830701014958322} 12/27/2021 07:44:36 - INFO - codeparrot_training - Step 33073: {'lr': 0.00012937514504163011, 'samples': 16933888, 'steps': 33073, 'batch_loss/train': 0.7734907902777195} 12/27/2021 07:44:49 - INFO - codeparrot_training - Step 33074: {'lr': 0.00012936133841062982, 'samples': 16934400, 'steps': 33074, 'batch_loss/train': 0.6556865475140512} 12/27/2021 07:44:59 - INFO - codeparrot_training - Step 33075: {'lr': 0.0001293475322592434, 'samples': 16934912, 'steps': 33075, 'batch_loss/train': 0.7295609610155225} 12/27/2021 07:45:10 - INFO - codeparrot_training - Step 33076: {'lr': 0.00012933372658752545, 'samples': 16935424, 'steps': 33076, 'batch_loss/train': 0.6558588771149516} 12/27/2021 07:45:24 - INFO - codeparrot_training - Step 33077: {'lr': 0.0001293199213955311, 'samples': 16935936, 'steps': 33077, 'batch_loss/train': 1.0038532502949238} 12/27/2021 07:45:34 - INFO - codeparrot_training - Step 33078: {'lr': 0.0001293061166833153, 'samples': 16936448, 'steps': 33078, 'batch_loss/train': 0.7360701085999608} 12/27/2021 07:45:45 - INFO - codeparrot_training - Step 33079: {'lr': 0.00012929231245093273, 'samples': 16936960, 'steps': 33079, 'batch_loss/train': 0.9309135694056749} 12/27/2021 07:45:57 - 
INFO - codeparrot_training - Step 33080: {'lr': 0.00012927850869843825, 'samples': 16937472, 'steps': 33080, 'batch_loss/train': 0.8206805824302137} 12/27/2021 07:46:08 - INFO - codeparrot_training - Step 33081: {'lr': 0.000129264705425887, 'samples': 16937984, 'steps': 33081, 'batch_loss/train': 0.6883603981696069} 12/27/2021 07:46:18 - INFO - codeparrot_training - Step 33082: {'lr': 0.0001292509026333336, 'samples': 16938496, 'steps': 33082, 'batch_loss/train': 0.7245570551604033} 12/27/2021 07:46:29 - INFO - codeparrot_training - Step 33083: {'lr': 0.00012923710032083303, 'samples': 16939008, 'steps': 33083, 'batch_loss/train': 0.8935494674369693} 12/27/2021 07:46:41 - INFO - codeparrot_training - Step 33084: {'lr': 0.00012922329848844013, 'samples': 16939520, 'steps': 33084, 'batch_loss/train': 0.696276233298704} 12/27/2021 07:46:52 - INFO - codeparrot_training - Step 33085: {'lr': 0.00012920949713620983, 'samples': 16940032, 'steps': 33085, 'batch_loss/train': 0.7455052705481648} 12/27/2021 07:47:03 - INFO - codeparrot_training - Step 33086: {'lr': 0.0001291956962641969, 'samples': 16940544, 'steps': 33086, 'batch_loss/train': 0.8075180193409324} 12/27/2021 07:47:17 - INFO - codeparrot_training - Step 33087: {'lr': 0.00012918189587245632, 'samples': 16941056, 'steps': 33087, 'batch_loss/train': 0.8116055121645331} 12/27/2021 07:47:27 - INFO - codeparrot_training - Step 33088: {'lr': 0.00012916809596104288, 'samples': 16941568, 'steps': 33088, 'batch_loss/train': 0.6440099417231977} 12/27/2021 07:47:38 - INFO - codeparrot_training - Step 33089: {'lr': 0.0001291542965300115, 'samples': 16942080, 'steps': 33089, 'batch_loss/train': 0.7518593184649944} 12/27/2021 07:47:50 - INFO - codeparrot_training - Step 33090: {'lr': 0.0001291404975794171, 'samples': 16942592, 'steps': 33090, 'batch_loss/train': 0.6544325551949441} 12/27/2021 07:48:01 - INFO - codeparrot_training - Step 33091: {'lr': 0.00012912669910931422, 'samples': 16943104, 'steps': 33091, 
'batch_loss/train': 0.770304967649281} 12/27/2021 07:48:11 - INFO - codeparrot_training - Step 33092: {'lr': 0.00012911290111975815, 'samples': 16943616, 'steps': 33092, 'batch_loss/train': 0.7858962779864669} 12/27/2021 07:48:22 - INFO - codeparrot_training - Step 33093: {'lr': 0.00012909910361080345, 'samples': 16944128, 'steps': 33093, 'batch_loss/train': 0.8329202998429537} 12/27/2021 07:48:34 - INFO - codeparrot_training - Step 33094: {'lr': 0.00012908530658250494, 'samples': 16944640, 'steps': 33094, 'batch_loss/train': 0.713894066400826} 12/27/2021 07:48:45 - INFO - codeparrot_training - Step 33095: {'lr': 0.0001290715100349178, 'samples': 16945152, 'steps': 33095, 'batch_loss/train': 0.7025221968069673} 12/27/2021 07:48:55 - INFO - codeparrot_training - Step 33096: {'lr': 0.00012905771396809663, 'samples': 16945664, 'steps': 33096, 'batch_loss/train': 0.7781829186715186} 12/27/2021 07:49:07 - INFO - codeparrot_training - Step 33097: {'lr': 0.00012904391838209625, 'samples': 16946176, 'steps': 33097, 'batch_loss/train': 0.7241425681859255} 12/27/2021 07:49:18 - INFO - codeparrot_training - Step 33098: {'lr': 0.00012903012327697161, 'samples': 16946688, 'steps': 33098, 'batch_loss/train': 0.781009966507554} 12/27/2021 07:49:29 - INFO - codeparrot_training - Step 33099: {'lr': 0.00012901632865277752, 'samples': 16947200, 'steps': 33099, 'batch_loss/train': 0.7053238963708282} 12/27/2021 07:49:42 - INFO - codeparrot_training - Step 33100: {'lr': 0.00012900253450956884, 'samples': 16947712, 'steps': 33100, 'batch_loss/train': 0.7655641492456198} 12/27/2021 07:49:53 - INFO - codeparrot_training - Step 33101: {'lr': 0.00012898874084740048, 'samples': 16948224, 'steps': 33101, 'batch_loss/train': 1.742727359291166} 12/27/2021 07:50:04 - INFO - codeparrot_training - Step 33102: {'lr': 0.000128974947666327, 'samples': 16948736, 'steps': 33102, 'batch_loss/train': 0.7394894664175808} 12/27/2021 07:50:14 - INFO - codeparrot_training - Step 33103: {'lr': 
0.00012896115496640353, 'samples': 16949248, 'steps': 33103, 'batch_loss/train': 0.6919946609996259} 12/27/2021 07:50:27 - INFO - codeparrot_training - Step 33104: {'lr': 0.00012894736274768491, 'samples': 16949760, 'steps': 33104, 'batch_loss/train': 0.8675805944949389} 12/27/2021 07:50:37 - INFO - codeparrot_training - Step 33105: {'lr': 0.00012893357101022567, 'samples': 16950272, 'steps': 33105, 'batch_loss/train': 0.9119675680994987} 12/27/2021 07:50:48 - INFO - codeparrot_training - Step 33106: {'lr': 0.00012891977975408093, 'samples': 16950784, 'steps': 33106, 'batch_loss/train': 0.678970291512087} 12/27/2021 07:51:02 - INFO - codeparrot_training - Step 33107: {'lr': 0.00012890598897930553, 'samples': 16951296, 'steps': 33107, 'batch_loss/train': 0.7703231582418084} 12/27/2021 07:51:12 - INFO - codeparrot_training - Step 33108: {'lr': 0.00012889219868595408, 'samples': 16951808, 'steps': 33108, 'batch_loss/train': 0.7174799786880612} 12/27/2021 07:51:23 - INFO - codeparrot_training - Step 33109: {'lr': 0.00012887840887408145, 'samples': 16952320, 'steps': 33109, 'batch_loss/train': 0.6790923832450062} 12/27/2021 07:51:35 - INFO - codeparrot_training - Step 33110: {'lr': 0.00012886461954374274, 'samples': 16952832, 'steps': 33110, 'batch_loss/train': 0.7646987070329487} 12/27/2021 07:51:46 - INFO - codeparrot_training - Step 33111: {'lr': 0.00012885083069499244, 'samples': 16953344, 'steps': 33111, 'batch_loss/train': 0.6738058514893055} 12/27/2021 07:51:56 - INFO - codeparrot_training - Step 33112: {'lr': 0.00012883704232788545, 'samples': 16953856, 'steps': 33112, 'batch_loss/train': 0.7560677723959088} 12/27/2021 07:52:07 - INFO - codeparrot_training - Step 33113: {'lr': 0.0001288232544424767, 'samples': 16954368, 'steps': 33113, 'batch_loss/train': 0.5706308607477695} 12/27/2021 07:52:19 - INFO - codeparrot_training - Step 33114: {'lr': 0.0001288094670388209, 'samples': 16954880, 'steps': 33114, 'batch_loss/train': 0.7872146302834153} 12/27/2021 07:52:30 
- INFO - codeparrot_training - Step 33115: {'lr': 0.0001287956801169729, 'samples': 16955392, 'steps': 33115, 'batch_loss/train': 0.8261724393814802} 12/27/2021 07:52:40 - INFO - codeparrot_training - Step 33116: {'lr': 0.00012878189367698762, 'samples': 16955904, 'steps': 33116, 'batch_loss/train': 0.7601709621958435} 12/27/2021 07:52:54 - INFO - codeparrot_training - Step 33117: {'lr': 0.00012876810771891956, 'samples': 16956416, 'steps': 33117, 'batch_loss/train': 0.8299460653215647} 12/27/2021 07:53:05 - INFO - codeparrot_training - Step 33118: {'lr': 0.00012875432224282397, 'samples': 16956928, 'steps': 33118, 'batch_loss/train': 0.8039845414459705} 12/27/2021 07:53:16 - INFO - codeparrot_training - Step 33119: {'lr': 0.00012874053724875524, 'samples': 16957440, 'steps': 33119, 'batch_loss/train': 0.7374725565314293} 12/27/2021 07:53:28 - INFO - codeparrot_training - Step 33120: {'lr': 0.00012872675273676828, 'samples': 16957952, 'steps': 33120, 'batch_loss/train': 0.7205913653597236} 12/27/2021 07:53:39 - INFO - codeparrot_training - Step 33121: {'lr': 0.00012871296870691819, 'samples': 16958464, 'steps': 33121, 'batch_loss/train': 0.6902930539799854} 12/27/2021 07:53:49 - INFO - codeparrot_training - Step 33122: {'lr': 0.00012869918515925937, 'samples': 16958976, 'steps': 33122, 'batch_loss/train': 0.6392002500360832} 12/27/2021 07:54:00 - INFO - codeparrot_training - Step 33123: {'lr': 0.00012868540209384684, 'samples': 16959488, 'steps': 33123, 'batch_loss/train': 0.7761958627961576} 12/27/2021 07:54:12 - INFO - codeparrot_training - Step 33124: {'lr': 0.00012867161951073537, 'samples': 16960000, 'steps': 33124, 'batch_loss/train': 0.7446487564593554} 12/27/2021 07:54:22 - INFO - codeparrot_training - Step 33125: {'lr': 0.00012865783740997967, 'samples': 16960512, 'steps': 33125, 'batch_loss/train': 0.7210655994713306} 12/27/2021 07:54:33 - INFO - codeparrot_training - Step 33126: {'lr': 0.00012864405579163463, 'samples': 16961024, 'steps': 33126, 
'batch_loss/train': 0.7357093961909413} 12/27/2021 07:54:47 - INFO - codeparrot_training - Step 33127: {'lr': 0.00012863027465575506, 'samples': 16961536, 'steps': 33127, 'batch_loss/train': 0.7659977171570063} 12/27/2021 07:54:58 - INFO - codeparrot_training - Step 33128: {'lr': 0.00012861649400239554, 'samples': 16962048, 'steps': 33128, 'batch_loss/train': 0.7409145636484027} 12/27/2021 07:55:08 - INFO - codeparrot_training - Step 33129: {'lr': 0.00012860271383161108, 'samples': 16962560, 'steps': 33129, 'batch_loss/train': 0.7345027860719711} 12/27/2021 07:55:20 - INFO - codeparrot_training - Step 33130: {'lr': 0.0001285889341434565, 'samples': 16963072, 'steps': 33130, 'batch_loss/train': 0.7244695047847927} 12/27/2021 07:55:31 - INFO - codeparrot_training - Step 33131: {'lr': 0.00012857515493798626, 'samples': 16963584, 'steps': 33131, 'batch_loss/train': 0.7748554209247231} 12/27/2021 07:55:42 - INFO - codeparrot_training - Step 33132: {'lr': 0.0001285613762152555, 'samples': 16964096, 'steps': 33132, 'batch_loss/train': 0.5301023755746428} 12/27/2021 07:55:52 - INFO - codeparrot_training - Step 33133: {'lr': 0.0001285475979753189, 'samples': 16964608, 'steps': 33133, 'batch_loss/train': 0.7789658857509494} 12/27/2021 07:56:04 - INFO - codeparrot_training - Step 33134: {'lr': 0.0001285338202182311, 'samples': 16965120, 'steps': 33134, 'batch_loss/train': 0.8056446433765814} 12/27/2021 07:56:15 - INFO - codeparrot_training - Step 33135: {'lr': 0.0001285200429440469, 'samples': 16965632, 'steps': 33135, 'batch_loss/train': 0.7372410842217505} 12/27/2021 07:56:26 - INFO - codeparrot_training - Step 33136: {'lr': 0.0001285062661528213, 'samples': 16966144, 'steps': 33136, 'batch_loss/train': 0.775403177831322} 12/27/2021 07:56:39 - INFO - codeparrot_training - Step 33137: {'lr': 0.00012849248984460882, 'samples': 16966656, 'steps': 33137, 'batch_loss/train': 0.7197731994092464} 12/27/2021 07:56:50 - INFO - codeparrot_training - Step 33138: {'lr': 
0.00012847871401946433, 'samples': 16967168, 'steps': 33138, 'batch_loss/train': 0.7412359262816608} 12/27/2021 07:57:01 - INFO - codeparrot_training - Step 33139: {'lr': 0.00012846493867744261, 'samples': 16967680, 'steps': 33139, 'batch_loss/train': 0.7825950253754854} 12/27/2021 07:57:13 - INFO - codeparrot_training - Step 33140: {'lr': 0.00012845116381859838, 'samples': 16968192, 'steps': 33140, 'batch_loss/train': 0.8582210140302777} 12/27/2021 07:57:24 - INFO - codeparrot_training - Step 33141: {'lr': 0.00012843738944298645, 'samples': 16968704, 'steps': 33141, 'batch_loss/train': 0.8124270495027304} 12/27/2021 07:57:34 - INFO - codeparrot_training - Step 33142: {'lr': 0.00012842361555066154, 'samples': 16969216, 'steps': 33142, 'batch_loss/train': 0.6846115086227655} 12/27/2021 07:57:45 - INFO - codeparrot_training - Step 33143: {'lr': 0.00012840984214167842, 'samples': 16969728, 'steps': 33143, 'batch_loss/train': 0.4836359379114583} 12/27/2021 07:57:57 - INFO - codeparrot_training - Step 33144: {'lr': 0.00012839606921609186, 'samples': 16970240, 'steps': 33144, 'batch_loss/train': 0.6511605300474912} 12/27/2021 07:58:08 - INFO - codeparrot_training - Step 33145: {'lr': 0.00012838229677395675, 'samples': 16970752, 'steps': 33145, 'batch_loss/train': 0.6594616575166583} 12/27/2021 07:58:18 - INFO - codeparrot_training - Step 33146: {'lr': 0.00012836852481532747, 'samples': 16971264, 'steps': 33146, 'batch_loss/train': 0.6797140085836872} 12/27/2021 07:58:32 - INFO - codeparrot_training - Step 33147: {'lr': 0.0001283547533402592, 'samples': 16971776, 'steps': 33147, 'batch_loss/train': 0.7796323073562235} 12/27/2021 07:58:43 - INFO - codeparrot_training - Step 33148: {'lr': 0.00012834098234880642, 'samples': 16972288, 'steps': 33148, 'batch_loss/train': 0.7311640353873372} 12/27/2021 07:58:54 - INFO - codeparrot_training - Step 33149: {'lr': 0.00012832721184102396, 'samples': 16972800, 'steps': 33149, 'batch_loss/train': 0.662164937122725} 12/27/2021 07:59:06 
- INFO - codeparrot_training - Step 33150: {'lr': 0.00012831344181696655, 'samples': 16973312, 'steps': 33150, 'batch_loss/train': 0.597680979873985} 12/27/2021 07:59:16 - INFO - codeparrot_training - Step 33151: {'lr': 0.00012829967227668897, 'samples': 16973824, 'steps': 33151, 'batch_loss/train': 0.8119584312662482} 12/27/2021 07:59:27 - INFO - codeparrot_training - Step 33152: {'lr': 0.00012828590322024592, 'samples': 16974336, 'steps': 33152, 'batch_loss/train': 0.7855225214734674} 12/27/2021 07:59:39 - INFO - codeparrot_training - Step 33153: {'lr': 0.00012827213464769216, 'samples': 16974848, 'steps': 33153, 'batch_loss/train': 0.7891525099985301} 12/27/2021 07:59:50 - INFO - codeparrot_training - Step 33154: {'lr': 0.00012825836655908245, 'samples': 16975360, 'steps': 33154, 'batch_loss/train': 0.45265699620358646} 12/27/2021 08:00:01 - INFO - codeparrot_training - Step 33155: {'lr': 0.00012824459895447148, 'samples': 16975872, 'steps': 33155, 'batch_loss/train': 0.7317128768190742} 12/27/2021 08:00:11 - INFO - codeparrot_training - Step 33156: {'lr': 0.00012823083183391415, 'samples': 16976384, 'steps': 33156, 'batch_loss/train': 0.5846348591148853} 12/27/2021 08:00:25 - INFO - codeparrot_training - Step 33157: {'lr': 0.00012821706519746485, 'samples': 16976896, 'steps': 33157, 'batch_loss/train': 0.7402609731070697} 12/27/2021 08:00:36 - INFO - codeparrot_training - Step 33158: {'lr': 0.00012820329904517859, 'samples': 16977408, 'steps': 33158, 'batch_loss/train': 0.7019538506865501} 12/27/2021 08:00:46 - INFO - codeparrot_training - Step 33159: {'lr': 0.0001281895333771101, 'samples': 16977920, 'steps': 33159, 'batch_loss/train': 0.7914709579199553} 12/27/2021 08:00:58 - INFO - codeparrot_training - Step 33160: {'lr': 0.00012817576819331399, 'samples': 16978432, 'steps': 33160, 'batch_loss/train': 0.8972574071958661} 12/27/2021 08:01:09 - INFO - codeparrot_training - Step 33161: {'lr': 0.00012816200349384489, 'samples': 16978944, 'steps': 33161, 
'batch_loss/train': 0.9107339782640338} 12/27/2021 08:01:20 - INFO - codeparrot_training - Step 33162: {'lr': 0.00012814823927875785, 'samples': 16979456, 'steps': 33162, 'batch_loss/train': 0.7062789461342618} 12/27/2021 08:01:34 - INFO - codeparrot_training - Step 33163: {'lr': 0.0001281344755481073, 'samples': 16979968, 'steps': 33163, 'batch_loss/train': 0.7471469171578065} 12/27/2021 08:01:44 - INFO - codeparrot_training - Step 33164: {'lr': 0.0001281207123019481, 'samples': 16980480, 'steps': 33164, 'batch_loss/train': 0.5761290462687612} 12/27/2021 08:01:55 - INFO - codeparrot_training - Step 33165: {'lr': 0.0001281069495403349, 'samples': 16980992, 'steps': 33165, 'batch_loss/train': 0.8695084657520056} 12/27/2021 08:02:06 - INFO - codeparrot_training - Step 33166: {'lr': 0.00012809318726332247, 'samples': 16981504, 'steps': 33166, 'batch_loss/train': 0.9405432138592005} 12/27/2021 08:02:18 - INFO - codeparrot_training - Step 33167: {'lr': 0.00012807942547096547, 'samples': 16982016, 'steps': 33167, 'batch_loss/train': 0.6748232659883797} 12/27/2021 08:02:28 - INFO - codeparrot_training - Step 33168: {'lr': 0.00012806566416331865, 'samples': 16982528, 'steps': 33168, 'batch_loss/train': 0.7040325361303985} 12/27/2021 08:02:39 - INFO - codeparrot_training - Step 33169: {'lr': 0.0001280519033404367, 'samples': 16983040, 'steps': 33169, 'batch_loss/train': 0.7193037075921893} 12/27/2021 08:02:51 - INFO - codeparrot_training - Step 33170: {'lr': 0.00012803814300237437, 'samples': 16983552, 'steps': 33170, 'batch_loss/train': 0.7998020881786942} 12/27/2021 08:03:02 - INFO - codeparrot_training - Step 33171: {'lr': 0.00012802438314918642, 'samples': 16984064, 'steps': 33171, 'batch_loss/train': 0.7862382833845913} 12/27/2021 08:03:12 - INFO - codeparrot_training - Step 33172: {'lr': 0.00012801062378092724, 'samples': 16984576, 'steps': 33172, 'batch_loss/train': 0.8954963618889451} 12/27/2021 08:03:25 - INFO - codeparrot_training - Step 33173: {'lr': 
0.00012799686489765194, 'samples': 16985088, 'steps': 33173, 'batch_loss/train': 0.6538793237414211} 12/27/2021 08:03:35 - INFO - codeparrot_training - Step 33174: {'lr': 0.00012798310649941497, 'samples': 16985600, 'steps': 33174, 'batch_loss/train': 0.7562529128044844} 12/27/2021 08:03:46 - INFO - codeparrot_training - Step 33175: {'lr': 0.00012796934858627112, 'samples': 16986112, 'steps': 33175, 'batch_loss/train': 0.7540188156417571} 12/27/2021 08:03:57 - INFO - codeparrot_training - Step 33176: {'lr': 0.00012795559115827505, 'samples': 16986624, 'steps': 33176, 'batch_loss/train': 1.0102627966552973} 12/27/2021 08:04:11 - INFO - codeparrot_training - Step 33177: {'lr': 0.00012794183421548146, 'samples': 16987136, 'steps': 33177, 'batch_loss/train': 0.7547294432297349} 12/27/2021 08:04:21 - INFO - codeparrot_training - Step 33178: {'lr': 0.00012792807775794502, 'samples': 16987648, 'steps': 33178, 'batch_loss/train': 0.7997679505497217} 12/27/2021 08:04:32 - INFO - codeparrot_training - Step 33179: {'lr': 0.0001279143217857205, 'samples': 16988160, 'steps': 33179, 'batch_loss/train': 0.7996385553851724} 12/27/2021 08:04:44 - INFO - codeparrot_training - Step 33180: {'lr': 0.00012790056629886248, 'samples': 16988672, 'steps': 33180, 'batch_loss/train': 0.8737389040179551} 12/27/2021 08:04:55 - INFO - codeparrot_training - Step 33181: {'lr': 0.00012788681129742574, 'samples': 16989184, 'steps': 33181, 'batch_loss/train': 0.8237908594310284} 12/27/2021 08:05:05 - INFO - codeparrot_training - Step 33182: {'lr': 0.00012787305678146506, 'samples': 16989696, 'steps': 33182, 'batch_loss/train': 0.48739163420395926} 12/27/2021 08:05:17 - INFO - codeparrot_training - Step 33183: {'lr': 0.00012785930275103474, 'samples': 16990208, 'steps': 33183, 'batch_loss/train': 0.650899650529027} 12/27/2021 08:05:28 - INFO - codeparrot_training - Step 33184: {'lr': 0.00012784554920618985, 'samples': 16990720, 'steps': 33184, 'batch_loss/train': 0.7813009447418153} 12/27/2021 
08:05:39 - INFO - codeparrot_training - Step 33185: {'lr': 0.00012783179614698502, 'samples': 16991232, 'steps': 33185, 'batch_loss/train': 0.7019544090144336} 12/27/2021 08:05:49 - INFO - codeparrot_training - Step 33186: {'lr': 0.00012781804357347475, 'samples': 16991744, 'steps': 33186, 'batch_loss/train': 0.7247766219079494} 12/27/2021 08:06:03 - INFO - codeparrot_training - Step 33187: {'lr': 0.00012780429148571368, 'samples': 16992256, 'steps': 33187, 'batch_loss/train': 0.7246838212013245} 12/27/2021 08:06:14 - INFO - codeparrot_training - Step 33188: {'lr': 0.00012779053988375684, 'samples': 16992768, 'steps': 33188, 'batch_loss/train': 0.7019751390325837} 12/27/2021 08:06:24 - INFO - codeparrot_training - Step 33189: {'lr': 0.00012777678876765852, 'samples': 16993280, 'steps': 33189, 'batch_loss/train': 0.8075648169033229} 12/27/2021 08:06:37 - INFO - codeparrot_training - Step 33190: {'lr': 0.0001277630381374736, 'samples': 16993792, 'steps': 33190, 'batch_loss/train': 0.7192507050931454} 12/27/2021 08:06:47 - INFO - codeparrot_training - Step 33191: {'lr': 0.00012774928799325663, 'samples': 16994304, 'steps': 33191, 'batch_loss/train': 0.7661500513204373} 12/27/2021 08:06:58 - INFO - codeparrot_training - Step 33192: {'lr': 0.00012773553833506234, 'samples': 16994816, 'steps': 33192, 'batch_loss/train': 0.9891676348634064} 12/27/2021 08:07:12 - INFO - codeparrot_training - Step 33193: {'lr': 0.0001277217891629454, 'samples': 16995328, 'steps': 33193, 'batch_loss/train': 0.7666170986485668} 12/27/2021 08:07:23 - INFO - codeparrot_training - Step 33194: {'lr': 0.0001277080404769605, 'samples': 16995840, 'steps': 33194, 'batch_loss/train': 0.6699184933677316} 12/27/2021 08:07:33 - INFO - codeparrot_training - Step 33195: {'lr': 0.0001276942922771622, 'samples': 16996352, 'steps': 33195, 'batch_loss/train': 0.7317971037700772} 12/27/2021 08:07:45 - INFO - codeparrot_training - Step 33196: {'lr': 0.00012768054456360523, 'samples': 16996864, 'steps': 33196, 
'batch_loss/train': 0.6435625918675214} 12/27/2021 08:07:56 - INFO - codeparrot_training - Step 33197: {'lr': 0.0001276667973363443, 'samples': 16997376, 'steps': 33197, 'batch_loss/train': 0.7454748004674911} 12/27/2021 08:08:07 - INFO - codeparrot_training - Step 33198: {'lr': 0.0001276530505954338, 'samples': 16997888, 'steps': 33198, 'batch_loss/train': 0.784387371968478} 12/27/2021 08:08:17 - INFO - codeparrot_training - Step 33199: {'lr': 0.0001276393043409287, 'samples': 16998400, 'steps': 33199, 'batch_loss/train': 0.7435407121665776} 12/27/2021 08:08:30 - INFO - codeparrot_training - Step 33200: {'lr': 0.00012762555857288364, 'samples': 16998912, 'steps': 33200, 'batch_loss/train': 0.7965475139208138} 12/27/2021 08:08:40 - INFO - codeparrot_training - Step 33201: {'lr': 0.00012761181329135301, 'samples': 16999424, 'steps': 33201, 'batch_loss/train': 0.6976605379022658} 12/27/2021 08:08:51 - INFO - codeparrot_training - Step 33202: {'lr': 0.00012759806849639162, 'samples': 16999936, 'steps': 33202, 'batch_loss/train': 0.7369440525653772} 12/27/2021 08:09:05 - INFO - codeparrot_training - Step 33203: {'lr': 0.0001275843241880541, 'samples': 17000448, 'steps': 33203, 'batch_loss/train': 0.6155351309571415} 12/27/2021 08:09:15 - INFO - codeparrot_training - Step 33204: {'lr': 0.00012757058036639508, 'samples': 17000960, 'steps': 33204, 'batch_loss/train': 0.676407746039331} 12/27/2021 08:09:26 - INFO - codeparrot_training - Step 33205: {'lr': 0.0001275568370314692, 'samples': 17001472, 'steps': 33205, 'batch_loss/train': 0.6780748598976061} 12/27/2021 08:09:38 - INFO - codeparrot_training - Step 33206: {'lr': 0.00012754309418333106, 'samples': 17001984, 'steps': 33206, 'batch_loss/train': 0.7150826662546024} 12/27/2021 08:09:49 - INFO - codeparrot_training - Step 33207: {'lr': 0.0001275293518220354, 'samples': 17002496, 'steps': 33207, 'batch_loss/train': 0.8493102141655982} 12/27/2021 08:09:59 - INFO - codeparrot_training - Step 33208: {'lr': 
0.00012751560994763686, 'samples': 17003008, 'steps': 33208, 'batch_loss/train': 0.8411274394020438} 12/27/2021 08:10:10 - INFO - codeparrot_training - Step 33209: {'lr': 0.00012750186856018983, 'samples': 17003520, 'steps': 33209, 'batch_loss/train': 0.7624844592064619} 12/27/2021 08:10:22 - INFO - codeparrot_training - Step 33210: {'lr': 0.0001274881276597492, 'samples': 17004032, 'steps': 33210, 'batch_loss/train': 0.8049094802699983} 12/27/2021 08:10:33 - INFO - codeparrot_training - Step 33211: {'lr': 0.0001274743872463696, 'samples': 17004544, 'steps': 33211, 'batch_loss/train': 0.6033330522477627} 12/27/2021 08:10:43 - INFO - codeparrot_training - Step 33212: {'lr': 0.00012746064732010553, 'samples': 17005056, 'steps': 33212, 'batch_loss/train': 0.7864636881276965} 12/27/2021 08:10:55 - INFO - codeparrot_training - Step 33213: {'lr': 0.00012744690788101155, 'samples': 17005568, 'steps': 33213, 'batch_loss/train': 0.827596758492291} 12/27/2021 08:11:06 - INFO - codeparrot_training - Step 33214: {'lr': 0.00012743316892914252, 'samples': 17006080, 'steps': 33214, 'batch_loss/train': 0.7306958809494972} 12/27/2021 08:11:17 - INFO - codeparrot_training - Step 33215: {'lr': 0.0001274194304645529, 'samples': 17006592, 'steps': 33215, 'batch_loss/train': 0.7420821669511497} 12/27/2021 08:11:30 - INFO - codeparrot_training - Step 33216: {'lr': 0.00012740569248729722, 'samples': 17007104, 'steps': 33216, 'batch_loss/train': 0.739711781963706} 12/27/2021 08:11:41 - INFO - codeparrot_training - Step 33217: {'lr': 0.0001273919549974304, 'samples': 17007616, 'steps': 33217, 'batch_loss/train': 0.8770536333322525} 12/27/2021 08:11:51 - INFO - codeparrot_training - Step 33218: {'lr': 0.00012737821799500677, 'samples': 17008128, 'steps': 33218, 'batch_loss/train': 0.8476608069613576} 12/27/2021 08:12:02 - INFO - codeparrot_training - Step 33219: {'lr': 0.00012736448148008107, 'samples': 17008640, 'steps': 33219, 'batch_loss/train': 0.727387267164886} 12/27/2021 08:12:14 - 
INFO - codeparrot_training - Step 33220: {'lr': 0.00012735074545270787, 'samples': 17009152, 'steps': 33220, 'batch_loss/train': 0.6900227926671505} 12/27/2021 08:12:25 - INFO - codeparrot_training - Step 33221: {'lr': 0.00012733700991294177, 'samples': 17009664, 'steps': 33221, 'batch_loss/train': 0.7051559814717621} 12/27/2021 08:12:35 - INFO - codeparrot_training - Step 33222: {'lr': 0.00012732327486083745, 'samples': 17010176, 'steps': 33222, 'batch_loss/train': 0.7284213034436107} 12/27/2021 08:12:49 - INFO - codeparrot_training - Step 33223: {'lr': 0.00012730954029644955, 'samples': 17010688, 'steps': 33223, 'batch_loss/train': 0.7363107893615961} 12/27/2021 08:13:00 - INFO - codeparrot_training - Step 33224: {'lr': 0.00012729580621983238, 'samples': 17011200, 'steps': 33224, 'batch_loss/train': 0.6449431143701077} 12/27/2021 08:13:11 - INFO - codeparrot_training - Step 33225: {'lr': 0.00012728207263104088, 'samples': 17011712, 'steps': 33225, 'batch_loss/train': 0.7998420847579837} 12/27/2021 08:13:23 - INFO - codeparrot_training - Step 33226: {'lr': 0.0001272683395301296, 'samples': 17012224, 'steps': 33226, 'batch_loss/train': 0.7424523448571563} 12/27/2021 08:13:33 - INFO - codeparrot_training - Step 33227: {'lr': 0.00012725460691715297, 'samples': 17012736, 'steps': 33227, 'batch_loss/train': 0.7184363889973611} 12/27/2021 08:13:44 - INFO - codeparrot_training - Step 33228: {'lr': 0.00012724087479216566, 'samples': 17013248, 'steps': 33228, 'batch_loss/train': 0.7236185180954635} 12/27/2021 08:13:56 - INFO - codeparrot_training - Step 33229: {'lr': 0.00012722714315522232, 'samples': 17013760, 'steps': 33229, 'batch_loss/train': 0.9185291975736618} 12/27/2021 08:14:06 - INFO - codeparrot_training - Step 33230: {'lr': 0.0001272134120063775, 'samples': 17014272, 'steps': 33230, 'batch_loss/train': 0.7059773685177788} 12/27/2021 08:14:17 - INFO - codeparrot_training - Step 33231: {'lr': 0.00012719968134568575, 'samples': 17014784, 'steps': 33231, 
'batch_loss/train': 0.7449606629088521} 12/27/2021 08:14:28 - INFO - codeparrot_training - Step 33232: {'lr': 0.0001271859511732017, 'samples': 17015296, 'steps': 33232, 'batch_loss/train': 0.7514500427059829} 12/27/2021 08:14:42 - INFO - codeparrot_training - Step 33233: {'lr': 0.00012717222148897996, 'samples': 17015808, 'steps': 33233, 'batch_loss/train': 0.777956438716501} 12/27/2021 08:14:52 - INFO - codeparrot_training - Step 33234: {'lr': 0.00012715849229307508, 'samples': 17016320, 'steps': 33234, 'batch_loss/train': 0.8935815738514066} 12/27/2021 08:15:03 - INFO - codeparrot_training - Step 33235: {'lr': 0.00012714476358554168, 'samples': 17016832, 'steps': 33235, 'batch_loss/train': 0.8173327436670661} 12/27/2021 08:15:15 - INFO - codeparrot_training - Step 33236: {'lr': 0.00012713103536643425, 'samples': 17017344, 'steps': 33236, 'batch_loss/train': 0.7322745561832562} 12/27/2021 08:15:26 - INFO - codeparrot_training - Step 33237: {'lr': 0.00012711730763580758, 'samples': 17017856, 'steps': 33237, 'batch_loss/train': 0.6825598576106131} 12/27/2021 08:15:36 - INFO - codeparrot_training - Step 33238: {'lr': 0.00012710358039371596, 'samples': 17018368, 'steps': 33238, 'batch_loss/train': 0.7783363130874932} 12/27/2021 08:15:48 - INFO - codeparrot_training - Step 33239: {'lr': 0.00012708985364021403, 'samples': 17018880, 'steps': 33239, 'batch_loss/train': 0.7275638394057751} 12/27/2021 08:15:59 - INFO - codeparrot_training - Step 33240: {'lr': 0.00012707612737535663, 'samples': 17019392, 'steps': 33240, 'batch_loss/train': 1.0085422126576304} 12/27/2021 08:16:10 - INFO - codeparrot_training - Step 33241: {'lr': 0.000127062401599198, 'samples': 17019904, 'steps': 33241, 'batch_loss/train': 0.6883231217507273} 12/27/2021 08:16:20 - INFO - codeparrot_training - Step 33242: {'lr': 0.00012704867631179278, 'samples': 17020416, 'steps': 33242, 'batch_loss/train': 0.7325601618504152} 12/27/2021 08:16:34 - INFO - codeparrot_training - Step 33243: {'lr': 
0.0001270349515131958, 'samples': 17020928, 'steps': 33243, 'batch_loss/train': 1.34772441489622} 12/27/2021 08:16:45 - INFO - codeparrot_training - Step 33244: {'lr': 0.00012702122720346132, 'samples': 17021440, 'steps': 33244, 'batch_loss/train': 0.6630745925940573} 12/27/2021 08:16:55 - INFO - codeparrot_training - Step 33245: {'lr': 0.000127007503382644, 'samples': 17021952, 'steps': 33245, 'batch_loss/train': 0.6581558054313064} 12/27/2021 08:17:08 - INFO - codeparrot_training - Step 33246: {'lr': 0.00012699378005079836, 'samples': 17022464, 'steps': 33246, 'batch_loss/train': 0.785658644977957} 12/27/2021 08:17:19 - INFO - codeparrot_training - Step 33247: {'lr': 0.00012698005720797906, 'samples': 17022976, 'steps': 33247, 'batch_loss/train': 1.0278800437226892} 12/27/2021 08:17:29 - INFO - codeparrot_training - Step 33248: {'lr': 0.00012696633485424059, 'samples': 17023488, 'steps': 33248, 'batch_loss/train': 0.656552808592096} 12/27/2021 08:17:40 - INFO - codeparrot_training - Step 33249: {'lr': 0.00012695261298963762, 'samples': 17024000, 'steps': 33249, 'batch_loss/train': 0.7644529026001692} 12/27/2021 08:17:52 - INFO - codeparrot_training - Step 33250: {'lr': 0.0001269388916142244, 'samples': 17024512, 'steps': 33250, 'batch_loss/train': 0.6930557466112077} 12/27/2021 08:18:03 - INFO - codeparrot_training - Step 33251: {'lr': 0.00012692517072805578, 'samples': 17025024, 'steps': 33251, 'batch_loss/train': 0.777644571615383} 12/27/2021 08:18:13 - INFO - codeparrot_training - Step 33252: {'lr': 0.0001269114503311863, 'samples': 17025536, 'steps': 33252, 'batch_loss/train': 0.6874257849995047} 12/27/2021 08:18:27 - INFO - codeparrot_training - Step 33253: {'lr': 0.00012689773042367023, 'samples': 17026048, 'steps': 33253, 'batch_loss/train': 0.8282474214211106} 12/27/2021 08:18:38 - INFO - codeparrot_training - Step 33254: {'lr': 0.00012688401100556252, 'samples': 17026560, 'steps': 33254, 'batch_loss/train': 0.9119804478250444} 12/27/2021 08:18:49 - INFO 
- codeparrot_training - Step 33255: {'lr': 0.00012687029207691735, 'samples': 17027072, 'steps': 33255, 'batch_loss/train': 0.7972896047867835} 12/27/2021 08:19:01 - INFO - codeparrot_training - Step 33256: {'lr': 0.0001268565736377894, 'samples': 17027584, 'steps': 33256, 'batch_loss/train': 0.5801623021252453} 12/27/2021 08:19:11 - INFO - codeparrot_training - Step 33257: {'lr': 0.00012684285568823327, 'samples': 17028096, 'steps': 33257, 'batch_loss/train': 0.7925526043400168} 12/27/2021 08:19:22 - INFO - codeparrot_training - Step 33258: {'lr': 0.0001268291382283034, 'samples': 17028608, 'steps': 33258, 'batch_loss/train': 0.7636444959789515} 12/27/2021 08:19:34 - INFO - codeparrot_training - Step 33259: {'lr': 0.00012681542125805438, 'samples': 17029120, 'steps': 33259, 'batch_loss/train': 0.6928765210323036} 12/27/2021 08:19:45 - INFO - codeparrot_training - Step 33260: {'lr': 0.00012680170477754078, 'samples': 17029632, 'steps': 33260, 'batch_loss/train': 0.7215486927889287} 12/27/2021 08:19:55 - INFO - codeparrot_training - Step 33261: {'lr': 0.00012678798878681702, 'samples': 17030144, 'steps': 33261, 'batch_loss/train': 0.6837352956645191} 12/27/2021 08:20:06 - INFO - codeparrot_training - Step 33262: {'lr': 0.0001267742732859377, 'samples': 17030656, 'steps': 33262, 'batch_loss/train': 0.7689531245268881} 12/27/2021 08:20:20 - INFO - codeparrot_training - Step 33263: {'lr': 0.00012676055827495747, 'samples': 17031168, 'steps': 33263, 'batch_loss/train': 0.5532032356131822} 12/27/2021 08:20:30 - INFO - codeparrot_training - Step 33264: {'lr': 0.00012674684375393064, 'samples': 17031680, 'steps': 33264, 'batch_loss/train': 0.8237698394805193} 12/27/2021 08:20:41 - INFO - codeparrot_training - Step 33265: {'lr': 0.0001267331297229117, 'samples': 17032192, 'steps': 33265, 'batch_loss/train': 0.6628156579099596} 12/27/2021 08:20:53 - INFO - codeparrot_training - Step 33266: {'lr': 0.00012671941618195548, 'samples': 17032704, 'steps': 33266, 
'batch_loss/train': 0.7078875340521336} 12/27/2021 08:21:04 - INFO - codeparrot_training - Step 33267: {'lr': 0.0001267057031311162, 'samples': 17033216, 'steps': 33267, 'batch_loss/train': 0.6936724651604891} 12/27/2021 08:21:14 - INFO - codeparrot_training - Step 33268: {'lr': 0.0001266919905704484, 'samples': 17033728, 'steps': 33268, 'batch_loss/train': 0.7386924577876925} 12/27/2021 08:21:26 - INFO - codeparrot_training - Step 33269: {'lr': 0.0001266782785000069, 'samples': 17034240, 'steps': 33269, 'batch_loss/train': 0.7574680116958916} 12/27/2021 08:21:37 - INFO - codeparrot_training - Step 33270: {'lr': 0.0001266645669198459, 'samples': 17034752, 'steps': 33270, 'batch_loss/train': 0.69206952303648} 12/27/2021 08:21:48 - INFO - codeparrot_training - Step 33271: {'lr': 0.00012665085583002001, 'samples': 17035264, 'steps': 33271, 'batch_loss/train': 0.6832440253347158} 12/27/2021 08:21:58 - INFO - codeparrot_training - Step 33272: {'lr': 0.00012663714523058374, 'samples': 17035776, 'steps': 33272, 'batch_loss/train': 0.7364211871754378} 12/27/2021 08:22:12 - INFO - codeparrot_training - Step 33273: {'lr': 0.00012662343512159164, 'samples': 17036288, 'steps': 33273, 'batch_loss/train': 0.7150747878476977} 12/27/2021 08:22:23 - INFO - codeparrot_training - Step 33274: {'lr': 0.00012660972550309813, 'samples': 17036800, 'steps': 33274, 'batch_loss/train': 0.7956023281440139} 12/27/2021 08:22:33 - INFO - codeparrot_training - Step 33275: {'lr': 0.00012659601637515785, 'samples': 17037312, 'steps': 33275, 'batch_loss/train': 0.7549088420346379} 12/27/2021 08:22:46 - INFO - codeparrot_training - Step 33276: {'lr': 0.00012658230773782503, 'samples': 17037824, 'steps': 33276, 'batch_loss/train': 0.8518421328626573} 12/27/2021 08:22:56 - INFO - codeparrot_training - Step 33277: {'lr': 0.00012656859959115448, 'samples': 17038336, 'steps': 33277, 'batch_loss/train': 0.7883317479863763} 12/27/2021 08:23:07 - INFO - codeparrot_training - Step 33278: {'lr': 
0.00012655489193520063, 'samples': 17038848, 'steps': 33278, 'batch_loss/train': 0.6494303537765518} 12/27/2021 08:23:21 - INFO - codeparrot_training - Step 33279: {'lr': 0.00012654118477001776, 'samples': 17039360, 'steps': 33279, 'batch_loss/train': 0.8334643142297864} 12/27/2021 08:23:31 - INFO - codeparrot_training - Step 33280: {'lr': 0.00012652747809566056, 'samples': 17039872, 'steps': 33280, 'batch_loss/train': 0.6997067863121629} 12/27/2021 08:23:42 - INFO - codeparrot_training - Step 33281: {'lr': 0.00012651377191218367, 'samples': 17040384, 'steps': 33281, 'batch_loss/train': 0.751389142125845} 12/27/2021 08:23:53 - INFO - codeparrot_training - Step 33282: {'lr': 0.00012650006621964122, 'samples': 17040896, 'steps': 33282, 'batch_loss/train': 0.8080689539201558} 12/27/2021 08:24:05 - INFO - codeparrot_training - Step 33283: {'lr': 0.0001264863610180879, 'samples': 17041408, 'steps': 33283, 'batch_loss/train': 0.7569023268297315} 12/27/2021 08:24:16 - INFO - codeparrot_training - Step 33284: {'lr': 0.0001264726563075782, 'samples': 17041920, 'steps': 33284, 'batch_loss/train': 0.8081628698855639} 12/27/2021 08:24:26 - INFO - codeparrot_training - Step 33285: {'lr': 0.00012645895208816655, 'samples': 17042432, 'steps': 33285, 'batch_loss/train': 0.7201214001979679} 12/27/2021 08:24:38 - INFO - codeparrot_training - Step 33286: {'lr': 0.00012644524835990744, 'samples': 17042944, 'steps': 33286, 'batch_loss/train': 0.7780463062226772} 12/27/2021 08:24:49 - INFO - codeparrot_training - Step 33287: {'lr': 0.00012643154512285542, 'samples': 17043456, 'steps': 33287, 'batch_loss/train': 0.7494607828557491} 12/27/2021 08:24:59 - INFO - codeparrot_training - Step 33288: {'lr': 0.0001264178423770649, 'samples': 17043968, 'steps': 33288, 'batch_loss/train': 0.9146655931835994} 12/27/2021 08:25:12 - INFO - codeparrot_training - Step 33289: {'lr': 0.00012640414012259047, 'samples': 17044480, 'steps': 33289, 'batch_loss/train': 0.7812614347785711} 12/27/2021 08:25:22 - 
INFO - codeparrot_training - Step 33290: {'lr': 0.00012639043835948634, 'samples': 17044992, 'steps': 33290, 'batch_loss/train': 0.7705883160233498} 12/27/2021 08:25:33 - INFO - codeparrot_training - Step 33291: {'lr': 0.0001263767370878072, 'samples': 17045504, 'steps': 33291, 'batch_loss/train': 0.6593318707309663} 12/27/2021 08:25:47 - INFO - codeparrot_training - Step 33292: {'lr': 0.0001263630363076076, 'samples': 17046016, 'steps': 33292, 'batch_loss/train': 0.7526915315538645} 12/27/2021 08:25:58 - INFO - codeparrot_training - Step 33293: {'lr': 0.00012634933601894178, 'samples': 17046528, 'steps': 33293, 'batch_loss/train': 0.7454608390107751} 12/27/2021 08:26:08 - INFO - codeparrot_training - Step 33294: {'lr': 0.00012633563622186422, 'samples': 17047040, 'steps': 33294, 'batch_loss/train': 0.7852099831216037} 12/27/2021 08:26:19 - INFO - codeparrot_training - Step 33295: {'lr': 0.00012632193691642968, 'samples': 17047552, 'steps': 33295, 'batch_loss/train': 0.7722526574507356} 12/27/2021 08:26:31 - INFO - codeparrot_training - Step 33296: {'lr': 0.0001263082381026923, 'samples': 17048064, 'steps': 33296, 'batch_loss/train': 0.7282237554900348} 12/27/2021 08:26:42 - INFO - codeparrot_training - Step 33297: {'lr': 0.00012629453978070664, 'samples': 17048576, 'steps': 33297, 'batch_loss/train': 0.5772080908645876} 12/27/2021 08:26:53 - INFO - codeparrot_training - Step 33298: {'lr': 0.00012628084195052725, 'samples': 17049088, 'steps': 33298, 'batch_loss/train': 0.6091940272599459} 12/27/2021 08:27:05 - INFO - codeparrot_training - Step 33299: {'lr': 0.00012626714461220844, 'samples': 17049600, 'steps': 33299, 'batch_loss/train': 0.8026347937993705} 12/27/2021 08:27:15 - INFO - codeparrot_training - Step 33300: {'lr': 0.00012625344776580482, 'samples': 17050112, 'steps': 33300, 'batch_loss/train': 0.7336519481614232} 12/27/2021 08:27:26 - INFO - codeparrot_training - Step 33301: {'lr': 0.00012623975141137084, 'samples': 17050624, 'steps': 33301, 
'batch_loss/train': 0.8340768255293369} 12/27/2021 08:27:40 - INFO - codeparrot_training - Step 33302: {'lr': 0.00012622605554896065, 'samples': 17051136, 'steps': 33302, 'batch_loss/train': 0.7160103656351566} 12/27/2021 08:27:50 - INFO - codeparrot_training - Step 33303: {'lr': 0.00012621236017862906, 'samples': 17051648, 'steps': 33303, 'batch_loss/train': 0.7525390814989805} 12/27/2021 08:28:01 - INFO - codeparrot_training - Step 33304: {'lr': 0.00012619866530043044, 'samples': 17052160, 'steps': 33304, 'batch_loss/train': 0.8794094026088715} 12/27/2021 08:28:12 - INFO - codeparrot_training - Step 33305: {'lr': 0.00012618497091441906, 'samples': 17052672, 'steps': 33305, 'batch_loss/train': 0.7722370969131589} 12/27/2021 08:28:24 - INFO - codeparrot_training - Step 33306: {'lr': 0.0001261712770206495, 'samples': 17053184, 'steps': 33306, 'batch_loss/train': 0.748497748747468} 12/27/2021 08:28:34 - INFO - codeparrot_training - Step 33307: {'lr': 0.00012615758361917635, 'samples': 17053696, 'steps': 33307, 'batch_loss/train': 0.6576475873589516} 12/27/2021 08:28:45 - INFO - codeparrot_training - Step 33308: {'lr': 0.00012614389071005376, 'samples': 17054208, 'steps': 33308, 'batch_loss/train': 0.729582242667675} 12/27/2021 08:28:59 - INFO - codeparrot_training - Step 33309: {'lr': 0.00012613019829333627, 'samples': 17054720, 'steps': 33309, 'batch_loss/train': 0.6918291910551488} 12/27/2021 08:29:09 - INFO - codeparrot_training - Step 33310: {'lr': 0.0001261165063690784, 'samples': 17055232, 'steps': 33310, 'batch_loss/train': 0.7031409577466547} 12/27/2021 08:29:20 - INFO - codeparrot_training - Step 33311: {'lr': 0.00012610281493733447, 'samples': 17055744, 'steps': 33311, 'batch_loss/train': 0.7936238469555974} 12/27/2021 08:29:32 - INFO - codeparrot_training - Step 33312: {'lr': 0.000126089123998159, 'samples': 17056256, 'steps': 33312, 'batch_loss/train': 0.8274290170520544} 12/27/2021 08:29:43 - INFO - codeparrot_training - Step 33313: {'lr': 
0.00012607543355160636, 'samples': 17056768, 'steps': 33313, 'batch_loss/train': 0.8524833675473928} 12/27/2021 08:29:53 - INFO - codeparrot_training - Step 33314: {'lr': 0.00012606174359773104, 'samples': 17057280, 'steps': 33314, 'batch_loss/train': 0.7463145283982158} 12/27/2021 08:30:04 - INFO - codeparrot_training - Step 33315: {'lr': 0.00012604805413658738, 'samples': 17057792, 'steps': 33315, 'batch_loss/train': 0.8152685612440109} 12/27/2021 08:30:16 - INFO - codeparrot_training - Step 33316: {'lr': 0.00012603436516822988, 'samples': 17058304, 'steps': 33316, 'batch_loss/train': 0.715877746231854} 12/27/2021 08:30:27 - INFO - codeparrot_training - Step 33317: {'lr': 0.00012602067669271295, 'samples': 17058816, 'steps': 33317, 'batch_loss/train': 0.8110833214595914} 12/27/2021 08:30:38 - INFO - codeparrot_training - Step 33318: {'lr': 0.00012600698871009108, 'samples': 17059328, 'steps': 33318, 'batch_loss/train': 0.7054184461012483} 12/27/2021 08:30:52 - INFO - codeparrot_training - Step 33319: {'lr': 0.00012599330122041853, 'samples': 17059840, 'steps': 33319, 'batch_loss/train': 0.7382331090047956} 12/27/2021 08:31:02 - INFO - codeparrot_training - Step 33320: {'lr': 0.00012597961422374966, 'samples': 17060352, 'steps': 33320, 'batch_loss/train': 0.7427700400585309} 12/27/2021 08:31:13 - INFO - codeparrot_training - Step 33321: {'lr': 0.0001259659277201392, 'samples': 17060864, 'steps': 33321, 'batch_loss/train': 0.8144116420298815} 12/27/2021 08:31:25 - INFO - codeparrot_training - Step 33322: {'lr': 0.00012595224170964132, 'samples': 17061376, 'steps': 33322, 'batch_loss/train': 0.8375532124191523} 12/27/2021 08:31:36 - INFO - codeparrot_training - Step 33323: {'lr': 0.0001259385561923105, 'samples': 17061888, 'steps': 33323, 'batch_loss/train': 0.803545858245343} 12/27/2021 08:31:46 - INFO - codeparrot_training - Step 33324: {'lr': 0.00012592487116820108, 'samples': 17062400, 'steps': 33324, 'batch_loss/train': 0.6980224493891001} 12/27/2021 08:31:57 - 
INFO - codeparrot_training - Step 33325: {'lr': 0.00012591118663736757, 'samples': 17062912, 'steps': 33325, 'batch_loss/train': 0.7587078153155744} 12/27/2021 08:32:09 - INFO - codeparrot_training - Step 33326: {'lr': 0.0001258975025998643, 'samples': 17063424, 'steps': 33326, 'batch_loss/train': 0.7365450095385313} 12/27/2021 08:32:20 - INFO - codeparrot_training - Step 33327: {'lr': 0.0001258838190557457, 'samples': 17063936, 'steps': 33327, 'batch_loss/train': 0.7655960437841713} 12/27/2021 08:32:30 - INFO - codeparrot_training - Step 33328: {'lr': 0.0001258701360050662, 'samples': 17064448, 'steps': 33328, 'batch_loss/train': 0.7104736240580678} 12/27/2021 08:32:42 - INFO - codeparrot_training - Step 33329: {'lr': 0.00012585645344788017, 'samples': 17064960, 'steps': 33329, 'batch_loss/train': 0.7385570526821539} 12/27/2021 08:32:53 - INFO - codeparrot_training - Step 33330: {'lr': 0.00012584277138424205, 'samples': 17065472, 'steps': 33330, 'batch_loss/train': 0.8119165686657652} 12/27/2021 08:33:03 - INFO - codeparrot_training - Step 33331: {'lr': 0.00012582908981420603, 'samples': 17065984, 'steps': 33331, 'batch_loss/train': 0.7101295897737145} 12/27/2021 08:33:18 - INFO - codeparrot_training - Step 33332: {'lr': 0.00012581540873782678, 'samples': 17066496, 'steps': 33332, 'batch_loss/train': 0.7126870285719633} 12/27/2021 08:33:28 - INFO - codeparrot_training - Step 33333: {'lr': 0.00012580172815515863, 'samples': 17067008, 'steps': 33333, 'batch_loss/train': 0.6561231594532728} 12/27/2021 08:33:39 - INFO - codeparrot_training - Step 33334: {'lr': 0.00012578804806625586, 'samples': 17067520, 'steps': 33334, 'batch_loss/train': 0.8095725262537599} 12/27/2021 08:33:51 - INFO - codeparrot_training - Step 33335: {'lr': 0.0001257743684711729, 'samples': 17068032, 'steps': 33335, 'batch_loss/train': 0.7554369056597352} 12/27/2021 08:34:02 - INFO - codeparrot_training - Step 33336: {'lr': 0.00012576068936996412, 'samples': 17068544, 'steps': 33336, 
'batch_loss/train': 0.6642054263502359} 12/27/2021 08:34:12 - INFO - codeparrot_training - Step 33337: {'lr': 0.00012574701076268393, 'samples': 17069056, 'steps': 33337, 'batch_loss/train': 0.7285931650549173} 12/27/2021 08:34:23 - INFO - codeparrot_training - Step 33338: {'lr': 0.00012573333264938674, 'samples': 17069568, 'steps': 33338, 'batch_loss/train': 0.7152352221310139} 12/27/2021 08:34:37 - INFO - codeparrot_training - Step 33339: {'lr': 0.0001257196550301269, 'samples': 17070080, 'steps': 33339, 'batch_loss/train': 0.7485687788575888} 12/27/2021 08:34:47 - INFO - codeparrot_training - Step 33340: {'lr': 0.00012570597790495874, 'samples': 17070592, 'steps': 33340, 'batch_loss/train': 0.6471165153197944} 12/27/2021 08:34:58 - INFO - codeparrot_training - Step 33341: {'lr': 0.0001256923012739367, 'samples': 17071104, 'steps': 33341, 'batch_loss/train': 0.7143783736974001} 12/27/2021 08:35:10 - INFO - codeparrot_training - Step 33342: {'lr': 0.00012567862513711514, 'samples': 17071616, 'steps': 33342, 'batch_loss/train': 0.5480917831300758} 12/27/2021 08:35:21 - INFO - codeparrot_training - Step 33343: {'lr': 0.00012566494949454844, 'samples': 17072128, 'steps': 33343, 'batch_loss/train': 0.813124707667157} 12/27/2021 08:35:31 - INFO - codeparrot_training - Step 33344: {'lr': 0.00012565127434629103, 'samples': 17072640, 'steps': 33344, 'batch_loss/train': 0.9041546145454049} 12/27/2021 08:35:42 - INFO - codeparrot_training - Step 33345: {'lr': 0.00012563759969239714, 'samples': 17073152, 'steps': 33345, 'batch_loss/train': 0.9213990364223719} 12/27/2021 08:35:54 - INFO - codeparrot_training - Step 33346: {'lr': 0.00012562392553292103, 'samples': 17073664, 'steps': 33346, 'batch_loss/train': 0.7510574482148513} 12/27/2021 08:36:05 - INFO - codeparrot_training - Step 33347: {'lr': 0.00012561025186791746, 'samples': 17074176, 'steps': 33347, 'batch_loss/train': 0.7072713756933808} 12/27/2021 08:36:15 - INFO - codeparrot_training - Step 33348: {'lr': 
0.00012559657869744048, 'samples': 17074688, 'steps': 33348, 'batch_loss/train': 0.7440731008537114} 12/27/2021 08:36:30 - INFO - codeparrot_training - Step 33349: {'lr': 0.00012558290602154452, 'samples': 17075200, 'steps': 33349, 'batch_loss/train': 0.7064582789316773} 12/27/2021 08:36:41 - INFO - codeparrot_training - Step 33350: {'lr': 0.00012556923384028396, 'samples': 17075712, 'steps': 33350, 'batch_loss/train': 0.7097013723105192} 12/27/2021 08:36:51 - INFO - codeparrot_training - Step 33351: {'lr': 0.0001255555621537131, 'samples': 17076224, 'steps': 33351, 'batch_loss/train': 0.6435787541558966} 12/27/2021 08:37:04 - INFO - codeparrot_training - Step 33352: {'lr': 0.00012554189096188636, 'samples': 17076736, 'steps': 33352, 'batch_loss/train': 0.608508869074285} 12/27/2021 08:37:14 - INFO - codeparrot_training - Step 33353: {'lr': 0.00012552822026485808, 'samples': 17077248, 'steps': 33353, 'batch_loss/train': 0.6863819824066013} 12/27/2021 08:37:25 - INFO - codeparrot_training - Step 33354: {'lr': 0.0001255145500626826, 'samples': 17077760, 'steps': 33354, 'batch_loss/train': 0.6925772002432495} 12/27/2021 08:37:35 - INFO - codeparrot_training - Step 33355: {'lr': 0.00012550088035541423, 'samples': 17078272, 'steps': 33355, 'batch_loss/train': 0.7970162769779563} 12/27/2021 08:37:48 - INFO - codeparrot_training - Step 33356: {'lr': 0.0001254872111431075, 'samples': 17078784, 'steps': 33356, 'batch_loss/train': 0.7214827304705977} 12/27/2021 08:37:58 - INFO - codeparrot_training - Step 33357: {'lr': 0.0001254735424258164, 'samples': 17079296, 'steps': 33357, 'batch_loss/train': 0.7506187856197357} 12/27/2021 08:38:09 - INFO - codeparrot_training - Step 33358: {'lr': 0.0001254598742035955, 'samples': 17079808, 'steps': 33358, 'batch_loss/train': 0.7692099136766046} 12/27/2021 08:38:21 - INFO - codeparrot_training - Step 33359: {'lr': 0.0001254462064764993, 'samples': 17080320, 'steps': 33359, 'batch_loss/train': 0.7830345928668976} 12/27/2021 08:38:32 - 
INFO - codeparrot_training - Step 33360: {'lr': 0.00012543253924458182, 'samples': 17080832, 'steps': 33360, 'batch_loss/train': 0.773732999805361} 12/27/2021 08:38:42 - INFO - codeparrot_training - Step 33361: {'lr': 0.00012541887250789742, 'samples': 17081344, 'steps': 33361, 'batch_loss/train': 0.8256253553554416} 12/27/2021 08:38:56 - INFO - codeparrot_training - Step 33362: {'lr': 0.00012540520626650077, 'samples': 17081856, 'steps': 33362, 'batch_loss/train': 0.6794246542267501} 12/27/2021 08:39:07 - INFO - codeparrot_training - Step 33363: {'lr': 0.00012539154052044585, 'samples': 17082368, 'steps': 33363, 'batch_loss/train': 0.49189899931661785} 12/27/2021 08:39:17 - INFO - codeparrot_training - Step 33364: {'lr': 0.00012537787526978713, 'samples': 17082880, 'steps': 33364, 'batch_loss/train': 0.8777012741193175} 12/27/2021 08:39:28 - INFO - codeparrot_training - Step 33365: {'lr': 0.0001253642105145789, 'samples': 17083392, 'steps': 33365, 'batch_loss/train': 0.7544235925888643} 12/27/2021 08:39:40 - INFO - codeparrot_training - Step 33366: {'lr': 0.00012535054625487552, 'samples': 17083904, 'steps': 33366, 'batch_loss/train': 0.7369404300116003} 12/27/2021 08:39:51 - INFO - codeparrot_training - Step 33367: {'lr': 0.00012533688249073132, 'samples': 17084416, 'steps': 33367, 'batch_loss/train': 0.7830372918397188} 12/27/2021 08:40:01 - INFO - codeparrot_training - Step 33368: {'lr': 0.00012532321922220058, 'samples': 17084928, 'steps': 33368, 'batch_loss/train': 0.8285528691485524} 12/27/2021 08:40:13 - INFO - codeparrot_training - Step 33369: {'lr': 0.00012530955644933766, 'samples': 17085440, 'steps': 33369, 'batch_loss/train': 0.7107479926198721} 12/27/2021 08:40:24 - INFO - codeparrot_training - Step 33370: {'lr': 0.00012529589417219688, 'samples': 17085952, 'steps': 33370, 'batch_loss/train': 0.7641040813177824} 12/27/2021 08:40:35 - INFO - codeparrot_training - Step 33371: {'lr': 0.0001252822323908326, 'samples': 17086464, 'steps': 33371, 
'batch_loss/train': 0.7852833727374673} 12/27/2021 08:40:50 - INFO - codeparrot_training - Step 33372: {'lr': 0.00012526857110529887, 'samples': 17086976, 'steps': 33372, 'batch_loss/train': 0.7454089364036918} 12/27/2021 08:41:01 - INFO - codeparrot_training - Step 33373: {'lr': 0.00012525491031565046, 'samples': 17087488, 'steps': 33373, 'batch_loss/train': 0.7450890876352787} 12/27/2021 08:41:11 - INFO - codeparrot_training - Step 33374: {'lr': 0.00012524125002194128, 'samples': 17088000, 'steps': 33374, 'batch_loss/train': 0.7783585209399462} 12/27/2021 08:41:22 - INFO - codeparrot_training - Step 33375: {'lr': 0.0001252275902242258, 'samples': 17088512, 'steps': 33375, 'batch_loss/train': 0.6705425690161064} 12/27/2021 08:41:34 - INFO - codeparrot_training - Step 33376: {'lr': 0.00012521393092255827, 'samples': 17089024, 'steps': 33376, 'batch_loss/train': 0.8032257817685604} 12/27/2021 08:41:44 - INFO - codeparrot_training - Step 33377: {'lr': 0.0001252002721169931, 'samples': 17089536, 'steps': 33377, 'batch_loss/train': 0.7340726225520484} 12/27/2021 08:41:55 - INFO - codeparrot_training - Step 33378: {'lr': 0.0001251866138075845, 'samples': 17090048, 'steps': 33378, 'batch_loss/train': 0.7409337935969234} 12/27/2021 08:42:07 - INFO - codeparrot_training - Step 33379: {'lr': 0.00012517295599438676, 'samples': 17090560, 'steps': 33379, 'batch_loss/train': 0.8255807762034237} 12/27/2021 08:42:18 - INFO - codeparrot_training - Step 33380: {'lr': 0.00012515929867745423, 'samples': 17091072, 'steps': 33380, 'batch_loss/train': 0.7704330589622259} 12/27/2021 08:42:29 - INFO - codeparrot_training - Step 33381: {'lr': 0.00012514564185684118, 'samples': 17091584, 'steps': 33381, 'batch_loss/train': 0.823317923466675} 12/27/2021 08:42:42 - INFO - codeparrot_training - Step 33382: {'lr': 0.00012513198553260207, 'samples': 17092096, 'steps': 33382, 'batch_loss/train': 0.7967833112925291} 12/27/2021 08:42:52 - INFO - codeparrot_training - Step 33383: {'lr': 
0.0001251183297047908, 'samples': 17092608, 'steps': 33383, 'batch_loss/train': 0.6932556973770261} 12/27/2021 08:43:03 - INFO - codeparrot_training - Step 33384: {'lr': 0.00012510467437346198, 'samples': 17093120, 'steps': 33384, 'batch_loss/train': 0.8014408512972295} 12/27/2021 08:43:14 - INFO - codeparrot_training - Step 33385: {'lr': 0.00012509101953866994, 'samples': 17093632, 'steps': 33385, 'batch_loss/train': 0.7660858272574842} 12/27/2021 08:43:26 - INFO - codeparrot_training - Step 33386: {'lr': 0.00012507736520046874, 'samples': 17094144, 'steps': 33386, 'batch_loss/train': 0.8327371859923005} 12/27/2021 08:43:37 - INFO - codeparrot_training - Step 33387: {'lr': 0.00012506371135891268, 'samples': 17094656, 'steps': 33387, 'batch_loss/train': 0.6423678016290069} 12/27/2021 08:43:47 - INFO - codeparrot_training - Step 33388: {'lr': 0.00012505005801405632, 'samples': 17095168, 'steps': 33388, 'batch_loss/train': 0.7154705720022321} 12/27/2021 08:43:59 - INFO - codeparrot_training - Step 33389: {'lr': 0.0001250364051659536, 'samples': 17095680, 'steps': 33389, 'batch_loss/train': 0.7229575803503394} 12/27/2021 08:44:10 - INFO - codeparrot_training - Step 33390: {'lr': 0.00012502275281465902, 'samples': 17096192, 'steps': 33390, 'batch_loss/train': 0.6404558201320469} 12/27/2021 08:44:20 - INFO - codeparrot_training - Step 33391: {'lr': 0.0001250091009602268, 'samples': 17096704, 'steps': 33391, 'batch_loss/train': 0.7189935157075524} 12/27/2021 08:44:34 - INFO - codeparrot_training - Step 33392: {'lr': 0.0001249954496027111, 'samples': 17097216, 'steps': 33392, 'batch_loss/train': 0.71388851525262} 12/27/2021 08:44:45 - INFO - codeparrot_training - Step 33393: {'lr': 0.00012498179874216638, 'samples': 17097728, 'steps': 33393, 'batch_loss/train': 0.6993934446945786} 12/27/2021 08:44:55 - INFO - codeparrot_training - Step 33394: {'lr': 0.00012496814837864679, 'samples': 17098240, 'steps': 33394, 'batch_loss/train': 0.6613279073499143} 12/27/2021 08:45:06 - 
INFO - codeparrot_training - Step 33395: {'lr': 0.00012495449851220665, 'samples': 17098752, 'steps': 33395, 'batch_loss/train': 0.7202119631692767} 12/27/2021 08:45:18 - INFO - codeparrot_training - Step 33396: {'lr': 0.00012494084914290015, 'samples': 17099264, 'steps': 33396, 'batch_loss/train': 0.7663141619414091} 12/27/2021 08:45:29 - INFO - codeparrot_training - Step 33397: {'lr': 0.00012492720027078178, 'samples': 17099776, 'steps': 33397, 'batch_loss/train': 0.7419156413525343} 12/27/2021 08:45:40 - INFO - codeparrot_training - Step 33398: {'lr': 0.00012491355189590543, 'samples': 17100288, 'steps': 33398, 'batch_loss/train': 0.7637591328239068} 12/27/2021 08:45:53 - INFO - codeparrot_training - Step 33399: {'lr': 0.00012489990401832573, 'samples': 17100800, 'steps': 33399, 'batch_loss/train': 0.7121084435493685} 12/27/2021 08:46:03 - INFO - codeparrot_training - Step 33400: {'lr': 0.0001248862566380967, 'samples': 17101312, 'steps': 33400, 'batch_loss/train': 0.7246095983427949} 12/27/2021 08:46:14 - INFO - codeparrot_training - Step 33401: {'lr': 0.00012487260975527255, 'samples': 17101824, 'steps': 33401, 'batch_loss/train': 0.7376710157841444} 12/27/2021 08:46:26 - INFO - codeparrot_training - Step 33402: {'lr': 0.00012485896336990787, 'samples': 17102336, 'steps': 33402, 'batch_loss/train': 0.6633077855221927} 12/27/2021 08:46:37 - INFO - codeparrot_training - Step 33403: {'lr': 0.00012484531748205662, 'samples': 17102848, 'steps': 33403, 'batch_loss/train': 0.6815685438923538} 12/27/2021 08:46:47 - INFO - codeparrot_training - Step 33404: {'lr': 0.00012483167209177313, 'samples': 17103360, 'steps': 33404, 'batch_loss/train': 0.6585981999523938} 12/27/2021 08:46:59 - INFO - codeparrot_training - Step 33405: {'lr': 0.00012481802719911163, 'samples': 17103872, 'steps': 33405, 'batch_loss/train': 0.7232086225412786} 12/27/2021 08:47:10 - INFO - codeparrot_training - Step 33406: {'lr': 0.0001248043828041264, 'samples': 17104384, 'steps': 33406, 
'batch_loss/train': 0.7656743149273098} 12/27/2021 08:47:21 - INFO - codeparrot_training - Step 33407: {'lr': 0.00012479073890687165, 'samples': 17104896, 'steps': 33407, 'batch_loss/train': 0.7039416336920112} 12/27/2021 08:47:31 - INFO - codeparrot_training - Step 33408: {'lr': 0.00012477709550740178, 'samples': 17105408, 'steps': 33408, 'batch_loss/train': 0.6803116220980883} 12/27/2021 08:47:43 - INFO - codeparrot_training - Step 33409: {'lr': 0.00012476345260577066, 'samples': 17105920, 'steps': 33409, 'batch_loss/train': 0.583045085077174} 12/27/2021 08:47:54 - INFO - codeparrot_training - Step 33410: {'lr': 0.00012474981020203289, 'samples': 17106432, 'steps': 33410, 'batch_loss/train': 0.6946185994893312} 12/27/2021 08:48:05 - INFO - codeparrot_training - Step 33411: {'lr': 0.00012473616829624268, 'samples': 17106944, 'steps': 33411, 'batch_loss/train': 0.8337840521708131} 12/27/2021 08:48:18 - INFO - codeparrot_training - Step 33412: {'lr': 0.00012472252688845405, 'samples': 17107456, 'steps': 33412, 'batch_loss/train': 0.6468551817815751} 12/27/2021 08:48:28 - INFO - codeparrot_training - Step 33413: {'lr': 0.00012470888597872125, 'samples': 17107968, 'steps': 33413, 'batch_loss/train': 0.8346461178734899} 12/27/2021 08:48:39 - INFO - codeparrot_training - Step 33414: {'lr': 0.00012469524556709883, 'samples': 17108480, 'steps': 33414, 'batch_loss/train': 0.714808089658618} 12/27/2021 08:48:51 - INFO - codeparrot_training - Step 33415: {'lr': 0.0001246816056536407, 'samples': 17108992, 'steps': 33415, 'batch_loss/train': 0.8563279332593083} 12/27/2021 08:49:02 - INFO - codeparrot_training - Step 33416: {'lr': 0.00012466796623840108, 'samples': 17109504, 'steps': 33416, 'batch_loss/train': 0.7011562110856175} 12/27/2021 08:49:12 - INFO - codeparrot_training - Step 33417: {'lr': 0.0001246543273214345, 'samples': 17110016, 'steps': 33417, 'batch_loss/train': 0.7803645455278456} 12/27/2021 08:49:24 - INFO - codeparrot_training - Step 33418: {'lr': 
0.0001246406889027949, 'samples': 17110528, 'steps': 33418, 'batch_loss/train': 0.6085259669926018} 12/27/2021 08:49:35 - INFO - codeparrot_training - Step 33419: {'lr': 0.00012462705098253656, 'samples': 17111040, 'steps': 33419, 'batch_loss/train': 0.6710161259397864} 12/27/2021 08:49:46 - INFO - codeparrot_training - Step 33420: {'lr': 0.00012461341356071373, 'samples': 17111552, 'steps': 33420, 'batch_loss/train': 0.5643324476841372} 12/27/2021 08:49:56 - INFO - codeparrot_training - Step 33421: {'lr': 0.00012459977663738066, 'samples': 17112064, 'steps': 33421, 'batch_loss/train': 0.6499813792761415} 12/27/2021 08:50:10 - INFO - codeparrot_training - Step 33422: {'lr': 0.0001245861402125915, 'samples': 17112576, 'steps': 33422, 'batch_loss/train': 0.71575137891341} 12/27/2021 08:50:21 - INFO - codeparrot_training - Step 33423: {'lr': 0.0001245725042864006, 'samples': 17113088, 'steps': 33423, 'batch_loss/train': 0.7827451513148844} 12/27/2021 08:50:32 - INFO - codeparrot_training - Step 33424: {'lr': 0.00012455886885886187, 'samples': 17113600, 'steps': 33424, 'batch_loss/train': 0.7791705625131726} 12/27/2021 08:50:44 - INFO - codeparrot_training - Step 33425: {'lr': 0.00012454523393002988, 'samples': 17114112, 'steps': 33425, 'batch_loss/train': 0.6919752322137356} 12/27/2021 08:50:54 - INFO - codeparrot_training - Step 33426: {'lr': 0.0001245315994999586, 'samples': 17114624, 'steps': 33426, 'batch_loss/train': 0.7715926757082343} 12/27/2021 08:51:05 - INFO - codeparrot_training - Step 33427: {'lr': 0.0001245179655687022, 'samples': 17115136, 'steps': 33427, 'batch_loss/train': 0.6860812620725483} 12/27/2021 08:51:19 - INFO - codeparrot_training - Step 33428: {'lr': 0.00012450433213631517, 'samples': 17115648, 'steps': 33428, 'batch_loss/train': 0.8013630844070576} 12/27/2021 08:51:29 - INFO - codeparrot_training - Step 33429: {'lr': 0.00012449069920285148, 'samples': 17116160, 'steps': 33429, 'batch_loss/train': 0.6580577057320625} 12/27/2021 08:51:40 - 
INFO - codeparrot_training - Step 33430: {'lr': 0.0001244770667683653, 'samples': 17116672, 'steps': 33430, 'batch_loss/train': 0.8998440329451114} 12/27/2021 08:51:51 - INFO - codeparrot_training - Step 33431: {'lr': 0.00012446343483291097, 'samples': 17117184, 'steps': 33431, 'batch_loss/train': 0.7449102960526943} 12/27/2021 08:52:03 - INFO - codeparrot_training - Step 33432: {'lr': 0.0001244498033965426, 'samples': 17117696, 'steps': 33432, 'batch_loss/train': 0.7036013295874} 12/27/2021 08:52:13 - INFO - codeparrot_training - Step 33433: {'lr': 0.00012443617245931445, 'samples': 17118208, 'steps': 33433, 'batch_loss/train': 0.73085658531636} 12/27/2021 08:52:24 - INFO - codeparrot_training - Step 33434: {'lr': 0.0001244225420212807, 'samples': 17118720, 'steps': 33434, 'batch_loss/train': 0.7066642446443439} 12/27/2021 08:52:36 - INFO - codeparrot_training - Step 33435: {'lr': 0.00012440891208249534, 'samples': 17119232, 'steps': 33435, 'batch_loss/train': 0.8504440300166607} 12/27/2021 08:52:47 - INFO - codeparrot_training - Step 33436: {'lr': 0.00012439528264301283, 'samples': 17119744, 'steps': 33436, 'batch_loss/train': 0.6882747505442239} 12/27/2021 08:52:57 - INFO - codeparrot_training - Step 33437: {'lr': 0.00012438165370288733, 'samples': 17120256, 'steps': 33437, 'batch_loss/train': 0.6865176705177873} 12/27/2021 08:53:11 - INFO - codeparrot_training - Step 33438: {'lr': 0.00012436802526217275, 'samples': 17120768, 'steps': 33438, 'batch_loss/train': 0.7967762867920101} 12/27/2021 08:53:21 - INFO - codeparrot_training - Step 33439: {'lr': 0.00012435439732092358, 'samples': 17121280, 'steps': 33439, 'batch_loss/train': 0.7710772510617971} 12/27/2021 08:53:32 - INFO - codeparrot_training - Step 33440: {'lr': 0.00012434076987919397, 'samples': 17121792, 'steps': 33440, 'batch_loss/train': 0.7175537245348096} 12/27/2021 08:53:43 - INFO - codeparrot_training - Step 33441: {'lr': 0.0001243271429370379, 'samples': 17122304, 'steps': 33441, 
'batch_loss/train': 0.9351706192246638} 12/27/2021 08:53:55 - INFO - codeparrot_training - Step 33442: {'lr': 0.0001243135164945096, 'samples': 17122816, 'steps': 33442, 'batch_loss/train': 0.7909373296424747} 12/27/2021 08:54:05 - INFO - codeparrot_training - Step 33443: {'lr': 0.00012429989055166344, 'samples': 17123328, 'steps': 33443, 'batch_loss/train': 0.8024515877477825} 12/27/2021 08:54:16 - INFO - codeparrot_training - Step 33444: {'lr': 0.0001242862651085534, 'samples': 17123840, 'steps': 33444, 'batch_loss/train': 0.6903424276970327} 12/27/2021 08:54:28 - INFO - codeparrot_training - Step 33445: {'lr': 0.00012427264016523368, 'samples': 17124352, 'steps': 33445, 'batch_loss/train': 0.8089905399829149} 12/27/2021 08:54:39 - INFO - codeparrot_training - Step 33446: {'lr': 0.0001242590157217585, 'samples': 17124864, 'steps': 33446, 'batch_loss/train': 0.814534058328718} 12/27/2021 08:54:49 - INFO - codeparrot_training - Step 33447: {'lr': 0.00012424539177818196, 'samples': 17125376, 'steps': 33447, 'batch_loss/train': 0.7939398604794405} 12/27/2021 08:55:01 - INFO - codeparrot_training - Step 33448: {'lr': 0.00012423176833455826, 'samples': 17125888, 'steps': 33448, 'batch_loss/train': 0.6786286885617301} 12/27/2021 08:55:12 - INFO - codeparrot_training - Step 33449: {'lr': 0.00012421814539094169, 'samples': 17126400, 'steps': 33449, 'batch_loss/train': 0.7312705991789699} 12/27/2021 08:55:23 - INFO - codeparrot_training - Step 33450: {'lr': 0.00012420452294738604, 'samples': 17126912, 'steps': 33450, 'batch_loss/train': 0.8293235742021352} 12/27/2021 08:55:33 - INFO - codeparrot_training - Step 33451: {'lr': 0.00012419090100394584, 'samples': 17127424, 'steps': 33451, 'batch_loss/train': 0.7477775933220983} 12/27/2021 08:55:46 - INFO - codeparrot_training - Step 33452: {'lr': 0.0001241772795606752, 'samples': 17127936, 'steps': 33452, 'batch_loss/train': 0.7187223974615335} 12/27/2021 08:55:57 - INFO - codeparrot_training - Step 33453: {'lr': 
0.000124163658617628, 'samples': 17128448, 'steps': 33453, 'batch_loss/train': 0.6547086936188862} 12/27/2021 08:56:08 - INFO - codeparrot_training - Step 33454: {'lr': 0.00012415003817485876, 'samples': 17128960, 'steps': 33454, 'batch_loss/train': 0.7023826753720641} 12/27/2021 08:56:20 - INFO - codeparrot_training - Step 33455: {'lr': 0.00012413641823242136, 'samples': 17129472, 'steps': 33455, 'batch_loss/train': 0.6449115280993283} 12/27/2021 08:56:30 - INFO - codeparrot_training - Step 33456: {'lr': 0.00012412279879037006, 'samples': 17129984, 'steps': 33456, 'batch_loss/train': 0.7796670529060066} 12/27/2021 08:56:41 - INFO - codeparrot_training - Step 33457: {'lr': 0.00012410917984875895, 'samples': 17130496, 'steps': 33457, 'batch_loss/train': 0.7108461610041559} 12/27/2021 08:56:54 - INFO - codeparrot_training - Step 33458: {'lr': 0.00012409556140764223, 'samples': 17131008, 'steps': 33458, 'batch_loss/train': 0.653349397238344} 12/27/2021 08:57:05 - INFO - codeparrot_training - Step 33459: {'lr': 0.000124081943467074, 'samples': 17131520, 'steps': 33459, 'batch_loss/train': 0.7064296270837076} 12/27/2021 08:57:15 - INFO - codeparrot_training - Step 33460: {'lr': 0.0001240683260271084, 'samples': 17132032, 'steps': 33460, 'batch_loss/train': 0.8341723093762994} 12/27/2021 08:57:26 - INFO - codeparrot_training - Step 33461: {'lr': 0.00012405470908779961, 'samples': 17132544, 'steps': 33461, 'batch_loss/train': 0.7679789839312434} 12/27/2021 08:57:38 - INFO - codeparrot_training - Step 33462: {'lr': 0.00012404109264920173, 'samples': 17133056, 'steps': 33462, 'batch_loss/train': 0.7379900417290628} 12/27/2021 08:57:49 - INFO - codeparrot_training - Step 33463: {'lr': 0.00012402747671136903, 'samples': 17133568, 'steps': 33463, 'batch_loss/train': 0.8584918309934437} 12/27/2021 08:58:00 - INFO - codeparrot_training - Step 33464: {'lr': 0.00012401386127435532, 'samples': 17134080, 'steps': 33464, 'batch_loss/train': 0.6839073221199214} 12/27/2021 08:58:12 - 
INFO - codeparrot_training - Step 33465: {'lr': 0.000124000246338215, 'samples': 17134592, 'steps': 33465, 'batch_loss/train': 0.7454060520976782} 12/27/2021 08:58:22 - INFO - codeparrot_training - Step 33466: {'lr': 0.00012398663190300226, 'samples': 17135104, 'steps': 33466, 'batch_loss/train': 0.71779228921514} 12/27/2021 08:58:33 - INFO - codeparrot_training - Step 33467: {'lr': 0.00012397301796877094, 'samples': 17135616, 'steps': 33467, 'batch_loss/train': 0.7757304939441383} 12/27/2021 08:58:47 - INFO - codeparrot_training - Step 33468: {'lr': 0.00012395940453557524, 'samples': 17136128, 'steps': 33468, 'batch_loss/train': 0.8265653671696782} 12/27/2021 08:58:57 - INFO - codeparrot_training - Step 33469: {'lr': 0.00012394579160346955, 'samples': 17136640, 'steps': 33469, 'batch_loss/train': 0.6417932698968798} 12/27/2021 08:59:08 - INFO - codeparrot_training - Step 33470: {'lr': 0.00012393217917250771, 'samples': 17137152, 'steps': 33470, 'batch_loss/train': 0.7217447655275464} 12/27/2021 08:59:20 - INFO - codeparrot_training - Step 33471: {'lr': 0.00012391856724274392, 'samples': 17137664, 'steps': 33471, 'batch_loss/train': 0.7692301366478205} 12/27/2021 08:59:31 - INFO - codeparrot_training - Step 33472: {'lr': 0.0001239049558142323, 'samples': 17138176, 'steps': 33472, 'batch_loss/train': 0.737387275788933} 12/27/2021 08:59:41 - INFO - codeparrot_training - Step 33473: {'lr': 0.00012389134488702697, 'samples': 17138688, 'steps': 33473, 'batch_loss/train': 0.7079592915251851} 12/27/2021 08:59:52 - INFO - codeparrot_training - Step 33474: {'lr': 0.00012387773446118202, 'samples': 17139200, 'steps': 33474, 'batch_loss/train': 0.7450564904138446} 12/27/2021 09:00:06 - INFO - codeparrot_training - Step 33475: {'lr': 0.00012386412453675158, 'samples': 17139712, 'steps': 33475, 'batch_loss/train': 0.7446940297959372} 12/27/2021 09:00:17 - INFO - codeparrot_training - Step 33476: {'lr': 0.00012385051511378976, 'samples': 17140224, 'steps': 33476, 
'batch_loss/train': 0.6545617389492691} 12/27/2021 09:00:27 - INFO - codeparrot_training - Step 33477: {'lr': 0.00012383690619235068, 'samples': 17140736, 'steps': 33477, 'batch_loss/train': 0.6576047791168094} 12/27/2021 09:00:39 - INFO - codeparrot_training - Step 33478: {'lr': 0.00012382329777248845, 'samples': 17141248, 'steps': 33478, 'batch_loss/train': 0.6849452964961529} 12/27/2021 09:00:50 - INFO - codeparrot_training - Step 33479: {'lr': 0.00012380968985425703, 'samples': 17141760, 'steps': 33479, 'batch_loss/train': 0.7926631389418617} 12/27/2021 09:01:01 - INFO - codeparrot_training - Step 33480: {'lr': 0.00012379608243771078, 'samples': 17142272, 'steps': 33480, 'batch_loss/train': 0.8495056135579944} 12/27/2021 09:01:13 - INFO - codeparrot_training - Step 33481: {'lr': 0.0001237824755229036, 'samples': 17142784, 'steps': 33481, 'batch_loss/train': 0.7026007026433945} 12/27/2021 09:01:23 - INFO - codeparrot_training - Step 33482: {'lr': 0.0001237688691098896, 'samples': 17143296, 'steps': 33482, 'batch_loss/train': 0.8307538153603673} 12/27/2021 09:01:34 - INFO - codeparrot_training - Step 33483: {'lr': 0.00012375526319872293, 'samples': 17143808, 'steps': 33483, 'batch_loss/train': 0.8071939861401916} 12/27/2021 09:01:44 - INFO - codeparrot_training - Step 33484: {'lr': 0.00012374165778945766, 'samples': 17144320, 'steps': 33484, 'batch_loss/train': 0.7760596545413136} 12/27/2021 09:01:57 - INFO - codeparrot_training - Step 33485: {'lr': 0.00012372805288214788, 'samples': 17144832, 'steps': 33485, 'batch_loss/train': 0.685663228854537} 12/27/2021 09:02:07 - INFO - codeparrot_training - Step 33486: {'lr': 0.0001237144484768477, 'samples': 17145344, 'steps': 33486, 'batch_loss/train': 0.6988323016557842} 12/27/2021 09:02:18 - INFO - codeparrot_training - Step 33487: {'lr': 0.0001237008445736112, 'samples': 17145856, 'steps': 33487, 'batch_loss/train': 0.7201735923881643} 12/27/2021 09:02:32 - INFO - codeparrot_training - Step 33488: {'lr': 
0.00012368724117249241, 'samples': 17146368, 'steps': 33488, 'batch_loss/train': 0.7160018961876631} 12/27/2021 09:02:43 - INFO - codeparrot_training - Step 33489: {'lr': 0.00012367363827354557, 'samples': 17146880, 'steps': 33489, 'batch_loss/train': 0.7221929818624631} 12/27/2021 09:02:53 - INFO - codeparrot_training - Step 33490: {'lr': 0.00012366003587682448, 'samples': 17147392, 'steps': 33490, 'batch_loss/train': 0.7618108913302422} 12/27/2021 09:03:05 - INFO - codeparrot_training - Step 33491: {'lr': 0.00012364643398238347, 'samples': 17147904, 'steps': 33491, 'batch_loss/train': 0.6970920162275434} 12/27/2021 09:03:16 - INFO - codeparrot_training - Step 33492: {'lr': 0.0001236328325902766, 'samples': 17148416, 'steps': 33492, 'batch_loss/train': 0.6317453687079251} 12/27/2021 09:03:27 - INFO - codeparrot_training - Step 33493: {'lr': 0.00012361923170055777, 'samples': 17148928, 'steps': 33493, 'batch_loss/train': 0.8045411240309477} 12/27/2021 09:03:37 - INFO - codeparrot_training - Step 33494: {'lr': 0.0001236056313132811, 'samples': 17149440, 'steps': 33494, 'batch_loss/train': 0.7093127109110355} 12/27/2021 09:03:49 - INFO - codeparrot_training - Step 33495: {'lr': 0.0001235920314285009, 'samples': 17149952, 'steps': 33495, 'batch_loss/train': 0.7164455507881939} 12/27/2021 09:04:00 - INFO - codeparrot_training - Step 33496: {'lr': 0.0001235784320462709, 'samples': 17150464, 'steps': 33496, 'batch_loss/train': 0.7674536183476448} 12/27/2021 09:04:11 - INFO - codeparrot_training - Step 33497: {'lr': 0.0001235648331666454, 'samples': 17150976, 'steps': 33497, 'batch_loss/train': 0.8769416734576225} 12/27/2021 09:04:25 - INFO - codeparrot_training - Step 33498: {'lr': 0.00012355123478967832, 'samples': 17151488, 'steps': 33498, 'batch_loss/train': 0.771634578704834} 12/27/2021 09:04:35 - INFO - codeparrot_training - Step 33499: {'lr': 0.00012353763691542384, 'samples': 17152000, 'steps': 33499, 'batch_loss/train': 0.6979234656319022} 12/27/2021 09:04:46 - 
INFO - codeparrot_training - Step 33500: {'lr': 0.00012352403954393593, 'samples': 17152512, 'steps': 33500, 'batch_loss/train': 0.75095582799986} 12/27/2021 09:04:58 - INFO - codeparrot_training - Step 33501: {'lr': 0.0001235104426752687, 'samples': 17153024, 'steps': 33501, 'batch_loss/train': 0.7136511970311403} 12/27/2021 09:05:09 - INFO - codeparrot_training - Step 33502: {'lr': 0.00012349684630947615, 'samples': 17153536, 'steps': 33502, 'batch_loss/train': 0.6746913425158709} 12/27/2021 09:05:19 - INFO - codeparrot_training - Step 33503: {'lr': 0.00012348325044661242, 'samples': 17154048, 'steps': 33503, 'batch_loss/train': 0.7875054944306612} 12/27/2021 09:05:32 - INFO - codeparrot_training - Step 33504: {'lr': 0.0001234696550867316, 'samples': 17154560, 'steps': 33504, 'batch_loss/train': 0.9112852686084807} 12/27/2021 09:05:43 - INFO - codeparrot_training - Step 33505: {'lr': 0.00012345606022988746, 'samples': 17155072, 'steps': 33505, 'batch_loss/train': 0.6881375084631145} 12/27/2021 09:05:54 - INFO - codeparrot_training - Step 33506: {'lr': 0.00012344246587613433, 'samples': 17155584, 'steps': 33506, 'batch_loss/train': 0.7509131841361523} 12/27/2021 09:06:04 - INFO - codeparrot_training - Step 33507: {'lr': 0.00012342887202552628, 'samples': 17156096, 'steps': 33507, 'batch_loss/train': 0.6520713353529572} 12/27/2021 09:06:16 - INFO - codeparrot_training - Step 33508: {'lr': 0.00012341527867811713, 'samples': 17156608, 'steps': 33508, 'batch_loss/train': 0.7072147568687797} 12/27/2021 09:06:27 - INFO - codeparrot_training - Step 33509: {'lr': 0.00012340168583396105, 'samples': 17157120, 'steps': 33509, 'batch_loss/train': 0.709790509659797} 12/27/2021 09:06:38 - INFO - codeparrot_training - Step 33510: {'lr': 0.0001233880934931121, 'samples': 17157632, 'steps': 33510, 'batch_loss/train': 0.7384863682091236} 12/27/2021 09:06:50 - INFO - codeparrot_training - Step 33511: {'lr': 0.00012337450165562423, 'samples': 17158144, 'steps': 33511, 
'batch_loss/train': 0.8002522473107092} 12/27/2021 09:07:01 - INFO - codeparrot_training - Step 33512: {'lr': 0.00012336091032155152, 'samples': 17158656, 'steps': 33512, 'batch_loss/train': 0.8105689040385187} 12/27/2021 09:07:11 - INFO - codeparrot_training - Step 33513: {'lr': 0.00012334731949094804, 'samples': 17159168, 'steps': 33513, 'batch_loss/train': 0.765799936838448} 12/27/2021 09:07:25 - INFO - codeparrot_training - Step 33514: {'lr': 0.00012333372916386781, 'samples': 17159680, 'steps': 33514, 'batch_loss/train': 0.611436084844172} 12/27/2021 09:07:36 - INFO - codeparrot_training - Step 33515: {'lr': 0.00012332013934036494, 'samples': 17160192, 'steps': 33515, 'batch_loss/train': 0.8324446999467909} 12/27/2021 09:07:46 - INFO - codeparrot_training - Step 33516: {'lr': 0.00012330655002049313, 'samples': 17160704, 'steps': 33516, 'batch_loss/train': 0.7939453320577741} 12/27/2021 09:07:57 - INFO - codeparrot_training - Step 33517: {'lr': 0.00012329296120430677, 'samples': 17161216, 'steps': 33517, 'batch_loss/train': 0.7996385805308819} 12/27/2021 09:08:09 - INFO - codeparrot_training - Step 33518: {'lr': 0.00012327937289185984, 'samples': 17161728, 'steps': 33518, 'batch_loss/train': 0.7135487110354006} 12/27/2021 09:08:20 - INFO - codeparrot_training - Step 33519: {'lr': 0.0001232657850832062, 'samples': 17162240, 'steps': 33519, 'batch_loss/train': 0.7665968472138047} 12/27/2021 09:08:30 - INFO - codeparrot_training - Step 33520: {'lr': 0.00012325219777839982, 'samples': 17162752, 'steps': 33520, 'batch_loss/train': 0.791055463720113} 12/27/2021 09:08:42 - INFO - codeparrot_training - Step 33521: {'lr': 0.00012323861097749504, 'samples': 17163264, 'steps': 33521, 'batch_loss/train': 0.7621315978467464} 12/27/2021 09:08:53 - INFO - codeparrot_training - Step 33522: {'lr': 0.00012322502468054558, 'samples': 17163776, 'steps': 33522, 'batch_loss/train': 0.6185896208626218} 12/27/2021 09:09:04 - INFO - codeparrot_training - Step 33523: {'lr': 
0.0001232114388876056, 'samples': 17164288, 'steps': 33523, 'batch_loss/train': 0.7565558375790715}
12/27/2021 09:09:16 - INFO - codeparrot_training - Step 33524: {'lr': 0.00012319785359872902, 'samples': 17164800, 'steps': 33524, 'batch_loss/train': 1.3218139531090856}
12/27/2021 09:09:26 - INFO - codeparrot_training - Step 33525: {'lr': 0.00012318426881396987, 'samples': 17165312, 'steps': 33525, 'batch_loss/train': 0.7043977361172438}
12/27/2021 09:09:37 - INFO - codeparrot_training - Step 33526: {'lr': 0.00012317068453338224, 'samples': 17165824, 'steps': 33526, 'batch_loss/train': 0.7641646694391966}
12/27/2021 09:09:47 - INFO - codeparrot_training - Step 33527: {'lr': 0.00012315710075702006, 'samples': 17166336, 'steps': 33527, 'batch_loss/train': 0.75275260489434}
12/27/2021 09:10:01 - INFO - codeparrot_training - Step 33528: {'lr': 0.00012314351748493734, 'samples': 17166848, 'steps': 33528, 'batch_loss/train': 0.842103379778564}
12/27/2021 09:10:11 - INFO - codeparrot_training - Step 33529: {'lr': 0.00012312993471718812, 'samples': 17167360, 'steps': 33529, 'batch_loss/train': 0.7355476687662303}
12/27/2021 09:10:22 - INFO - codeparrot_training - Step 33530: {'lr': 0.00012311635245382642, 'samples': 17167872, 'steps': 33530, 'batch_loss/train': 0.7045552139170468}
12/27/2021 09:10:34 - INFO - codeparrot_training - Step 33531: {'lr': 0.00012310277069490605, 'samples': 17168384, 'steps': 33531, 'batch_loss/train': 0.6313399606151506}
12/27/2021 09:10:45 - INFO - codeparrot_training - Step 33532: {'lr': 0.00012308918944048126, 'samples': 17168896, 'steps': 33532, 'batch_loss/train': 0.714846458286047}
12/27/2021 09:10:55 - INFO - codeparrot_training - Step 33533: {'lr': 0.000123075608690606, 'samples': 17169408, 'steps': 33533, 'batch_loss/train': 0.783733013086021}
12/27/2021 09:11:07 - INFO - codeparrot_training - Step 33534: {'lr': 0.0001230620284453341, 'samples': 17169920, 'steps': 33534, 'batch_loss/train': 0.7763040540739894}
12/27/2021 09:11:18 - INFO - codeparrot_training - Step 33535: {'lr': 0.00012304844870471963, 'samples': 17170432, 'steps': 33535, 'batch_loss/train': 0.7151972164865583}
12/27/2021 09:11:29 - INFO - codeparrot_training - Step 33536: {'lr': 0.0001230348694688166, 'samples': 17170944, 'steps': 33536, 'batch_loss/train': 0.7423299942165613}
12/27/2021 09:11:39 - INFO - codeparrot_training - Step 33537: {'lr': 0.000123021290737679, 'samples': 17171456, 'steps': 33537, 'batch_loss/train': 0.7457756525836885}
12/27/2021 09:11:52 - INFO - codeparrot_training - Step 33538: {'lr': 0.00012300771251136082, 'samples': 17171968, 'steps': 33538, 'batch_loss/train': 0.7020561057142913}
12/27/2021 09:12:03 - INFO - codeparrot_training - Step 33539: {'lr': 0.000122994134789916, 'samples': 17172480, 'steps': 33539, 'batch_loss/train': 0.7608400741009973}
12/27/2021 09:12:14 - INFO - codeparrot_training - Step 33540: {'lr': 0.00012298055757339855, 'samples': 17172992, 'steps': 33540, 'batch_loss/train': 0.8136303089559078}
12/27/2021 09:12:26 - INFO - codeparrot_training - Step 33541: {'lr': 0.00012296698086186245, 'samples': 17173504, 'steps': 33541, 'batch_loss/train': 0.709506515879184}
12/27/2021 09:12:37 - INFO - codeparrot_training - Step 33542: {'lr': 0.00012295340465536165, 'samples': 17174016, 'steps': 33542, 'batch_loss/train': 0.7337254360318184}
12/27/2021 09:12:47 - INFO - codeparrot_training - Step 33543: {'lr': 0.00012293982895395018, 'samples': 17174528, 'steps': 33543, 'batch_loss/train': 0.6893136342987418}
12/27/2021 09:13:00 - INFO - codeparrot_training - Step 33544: {'lr': 0.000122926253757682, 'samples': 17175040, 'steps': 33544, 'batch_loss/train': 0.7119625946506858}
12/27/2021 09:13:11 - INFO - codeparrot_training - Step 33545: {'lr': 0.00012291267906661103, 'samples': 17175552, 'steps': 33545, 'batch_loss/train': 0.6524014691822231}
12/27/2021 09:13:22 - INFO - codeparrot_training - Step 33546: {'lr': 0.0001228991048807911, 'samples': 17176064, 'steps': 33546, 'batch_loss/train': 0.7239087005145848}
12/27/2021 09:13:32 - INFO - codeparrot_training - Step 33547: {'lr': 0.0001228855312002766, 'samples': 17176576, 'steps': 33547, 'batch_loss/train': 0.7593069663271308}
12/27/2021 09:13:44 - INFO - codeparrot_training - Step 33548: {'lr': 0.0001228719580251211, 'samples': 17177088, 'steps': 33548, 'batch_loss/train': 0.7231369463261217}
12/27/2021 09:13:55 - INFO - codeparrot_training - Step 33549: {'lr': 0.0001228583853553786, 'samples': 17177600, 'steps': 33549, 'batch_loss/train': 0.7472805865108967}
12/27/2021 09:14:05 - INFO - codeparrot_training - Step 33550: {'lr': 0.00012284481319110334, 'samples': 17178112, 'steps': 33550, 'batch_loss/train': 0.7312609772197902}
12/27/2021 09:14:18 - INFO - codeparrot_training - Step 33551: {'lr': 0.000122831241532349, 'samples': 17178624, 'steps': 33551, 'batch_loss/train': 0.8513247882947326}
12/27/2021 09:14:28 - INFO - codeparrot_training - Step 33552: {'lr': 0.00012281767037916963, 'samples': 17179136, 'steps': 33552, 'batch_loss/train': 0.8693388625979424}
12/27/2021 09:14:39 - INFO - codeparrot_training - Step 33553: {'lr': 0.0001228040997316192, 'samples': 17179648, 'steps': 33553, 'batch_loss/train': 0.6818877393379807}
12/27/2021 09:14:52 - INFO - codeparrot_training - Step 33554: {'lr': 0.0001227905295897516, 'samples': 17180160, 'steps': 33554, 'batch_loss/train': 0.8351497673429549}
12/27/2021 09:15:02 - INFO - codeparrot_training - Step 33555: {'lr': 0.00012277695995362083, 'samples': 17180672, 'steps': 33555, 'batch_loss/train': 0.6840871502645314}
12/27/2021 09:15:13 - INFO - codeparrot_training - Step 33556: {'lr': 0.00012276339082328095, 'samples': 17181184, 'steps': 33556, 'batch_loss/train': 0.7454669617582113}
12/27/2021 09:15:25 - INFO - codeparrot_training - Step 33557: {'lr': 0.00012274982219878555, 'samples': 17181696, 'steps': 33557, 'batch_loss/train': 0.7146436576731503}
12/27/2021 09:15:36 - INFO - codeparrot_training - Step 33558: {'lr': 0.00012273625408018896, 'samples': 17182208, 'steps': 33558, 'batch_loss/train': 0.7192826131358743}
12/27/2021 09:15:46 - INFO - codeparrot_training - Step 33559: {'lr': 0.00012272268646754502, 'samples': 17182720, 'steps': 33559, 'batch_loss/train': 0.7004066105000675}
12/27/2021 09:15:57 - INFO - codeparrot_training - Step 33560: {'lr': 0.00012270911936090756, 'samples': 17183232, 'steps': 33560, 'batch_loss/train': 0.6836717072874308}
12/27/2021 09:16:10 - INFO - codeparrot_training - Step 33561: {'lr': 0.00012269555276033057, 'samples': 17183744, 'steps': 33561, 'batch_loss/train': 0.817289536818862}
12/27/2021 09:16:20 - INFO - codeparrot_training - Step 33562: {'lr': 0.00012268198666586796, 'samples': 17184256, 'steps': 33562, 'batch_loss/train': 0.46666595787974074}
12/27/2021 09:16:31 - INFO - codeparrot_training - Step 33563: {'lr': 0.00012266842107757373, 'samples': 17184768, 'steps': 33563, 'batch_loss/train': 0.671758851443883}
12/27/2021 09:16:43 - INFO - codeparrot_training - Step 33564: {'lr': 0.00012265485599550178, 'samples': 17185280, 'steps': 33564, 'batch_loss/train': 0.8296099249273539}
12/27/2021 09:16:53 - INFO - codeparrot_training - Step 33565: {'lr': 0.00012264129141970604, 'samples': 17185792, 'steps': 33565, 'batch_loss/train': 0.7261538822203875}
12/27/2021 09:17:04 - INFO - codeparrot_training - Step 33566: {'lr': 0.0001226277273502404, 'samples': 17186304, 'steps': 33566, 'batch_loss/train': 0.7598761687986553}
12/27/2021 09:17:17 - INFO - codeparrot_training - Step 33567: {'lr': 0.00012261416378715886, 'samples': 17186816, 'steps': 33567, 'batch_loss/train': 0.7378274239599705}
12/27/2021 09:17:28 - INFO - codeparrot_training - Step 33568: {'lr': 0.00012260060073051533, 'samples': 17187328, 'steps': 33568, 'batch_loss/train': 0.8096839357167482}
12/27/2021 09:17:39 - INFO - codeparrot_training - Step 33569: {'lr': 0.00012258703818036364, 'samples': 17187840, 'steps': 33569, 'batch_loss/train': 0.7614660635590553}
12/27/2021 09:17:49 - INFO - codeparrot_training - Step 33570: {'lr': 0.00012257347613675793, 'samples': 17188352, 'steps': 33570, 'batch_loss/train': 0.7472274359315634}
12/27/2021 09:18:01 - INFO - codeparrot_training - Step 33571: {'lr': 0.00012255991459975187, 'samples': 17188864, 'steps': 33571, 'batch_loss/train': 0.6737453355453908}
12/27/2021 09:18:12 - INFO - codeparrot_training - Step 33572: {'lr': 0.00012254635356939933, 'samples': 17189376, 'steps': 33572, 'batch_loss/train': 0.7102553509175777}
12/27/2021 09:18:23 - INFO - codeparrot_training - Step 33573: {'lr': 0.00012253279304575457, 'samples': 17189888, 'steps': 33573, 'batch_loss/train': 0.6401355580892414}
12/27/2021 09:18:37 - INFO - codeparrot_training - Step 33574: {'lr': 0.0001225192330288712, 'samples': 17190400, 'steps': 33574, 'batch_loss/train': 0.8665533035527915}
12/27/2021 09:18:47 - INFO - codeparrot_training - Step 33575: {'lr': 0.00012250567351880314, 'samples': 17190912, 'steps': 33575, 'batch_loss/train': 0.8074024501256645}
12/27/2021 09:18:58 - INFO - codeparrot_training - Step 33576: {'lr': 0.00012249211451560458, 'samples': 17191424, 'steps': 33576, 'batch_loss/train': 0.6855621887370944}
12/27/2021 09:19:11 - INFO - codeparrot_training - Step 33577: {'lr': 0.0001224785560193291, 'samples': 17191936, 'steps': 33577, 'batch_loss/train': 1.0319385966286063}
12/27/2021 09:19:21 - INFO - codeparrot_training - Step 33578: {'lr': 0.00012246499803003074, 'samples': 17192448, 'steps': 33578, 'batch_loss/train': 1.3811567779630423}
12/27/2021 09:19:32 - INFO - codeparrot_training - Step 33579: {'lr': 0.00012245144054776343, 'samples': 17192960, 'steps': 33579, 'batch_loss/train': 0.7244279771111906}
12/27/2021 09:19:42 - INFO - codeparrot_training - Step 33580: {'lr': 0.00012243788357258102, 'samples': 17193472, 'steps': 33580, 'batch_loss/train': 0.7802684810012579}
12/27/2021 09:19:55 - INFO - codeparrot_training - Step 33581: {'lr': 0.00012242432710453738, 'samples': 17193984, 'steps': 33581, 'batch_loss/train': 0.7505679074674845}
12/27/2021 09:20:05 - INFO - codeparrot_training - Step 33582: {'lr': 0.00012241077114368657, 'samples': 17194496, 'steps': 33582, 'batch_loss/train': 0.7220660448074341}
12/27/2021 09:20:16 - INFO - codeparrot_training - Step 33583: {'lr': 0.00012239721569008217, 'samples': 17195008, 'steps': 33583, 'batch_loss/train': 0.7562598534859717}
12/27/2021 09:20:29 - INFO - codeparrot_training - Step 33584: {'lr': 0.00012238366074377832, 'samples': 17195520, 'steps': 33584, 'batch_loss/train': 0.7450878964737058}
12/27/2021 09:20:40 - INFO - codeparrot_training - Step 33585: {'lr': 0.000122370106304829, 'samples': 17196032, 'steps': 33585, 'batch_loss/train': 0.7651410810649395}
12/27/2021 09:20:50 - INFO - codeparrot_training - Step 33586: {'lr': 0.0001223565523732877, 'samples': 17196544, 'steps': 33586, 'batch_loss/train': 0.8125155279412866}
12/27/2021 09:21:03 - INFO - codeparrot_training - Step 33587: {'lr': 0.0001223429989492087, 'samples': 17197056, 'steps': 33587, 'batch_loss/train': 0.7881441114004701}
12/27/2021 09:21:13 - INFO - codeparrot_training - Step 33588: {'lr': 0.00012232944603264578, 'samples': 17197568, 'steps': 33588, 'batch_loss/train': 0.6978904902935028}
12/27/2021 09:21:24 - INFO - codeparrot_training - Step 33589: {'lr': 0.00012231589362365273, 'samples': 17198080, 'steps': 33589, 'batch_loss/train': 0.8565639615990222}
12/27/2021 09:21:34 - INFO - codeparrot_training - Step 33590: {'lr': 0.00012230234172228344, 'samples': 17198592, 'steps': 33590, 'batch_loss/train': 0.7257025828585029}
12/27/2021 09:21:48 - INFO - codeparrot_training - Step 33591: {'lr': 0.00012228879032859184, 'samples': 17199104, 'steps': 33591, 'batch_loss/train': 0.6781987408176064}
12/27/2021 09:21:59 - INFO - codeparrot_training - Step 33592: {'lr': 0.00012227523944263177, 'samples': 17199616, 'steps': 33592, 'batch_loss/train': 0.718682698905468}
12/27/2021 09:22:09 - INFO - codeparrot_training - Step 33593: {'lr': 0.0001222616890644571, 'samples': 17200128, 'steps': 33593, 'batch_loss/train': 0.7492301752790809}
12/27/2021 09:22:22 - INFO - codeparrot_training - Step 33594: {'lr': 0.0001222481391941218, 'samples': 17200640, 'steps': 33594, 'batch_loss/train': 0.7473631575703621}
12/27/2021 09:22:32 - INFO - codeparrot_training - Step 33595: {'lr': 0.0001222345898316796, 'samples': 17201152, 'steps': 33595, 'batch_loss/train': 0.6802428867667913}
12/27/2021 09:22:43 - INFO - codeparrot_training - Step 33596: {'lr': 0.00012222104097718445, 'samples': 17201664, 'steps': 33596, 'batch_loss/train': 0.6539855781884398}
12/27/2021 09:22:55 - INFO - codeparrot_training - Step 33597: {'lr': 0.00012220749263069027, 'samples': 17202176, 'steps': 33597, 'batch_loss/train': 0.8075946227181703}
12/27/2021 09:23:06 - INFO - codeparrot_training - Step 33598: {'lr': 0.00012219394479225066, 'samples': 17202688, 'steps': 33598, 'batch_loss/train': 0.7326373234391212}
12/27/2021 09:23:16 - INFO - codeparrot_training - Step 33599: {'lr': 0.00012218039746191988, 'samples': 17203200, 'steps': 33599, 'batch_loss/train': 0.6734733376652002}
12/27/2021 09:23:27 - INFO - codeparrot_training - Step 33600: {'lr': 0.00012216685063975148, 'samples': 17203712, 'steps': 33600, 'batch_loss/train': 0.7856307737529278}
12/27/2021 09:23:39 - INFO - codeparrot_training - Step 33601: {'lr': 0.0001221533043257993, 'samples': 17204224, 'steps': 33601, 'batch_loss/train': 0.8785126283764839}
12/27/2021 09:23:50 - INFO - codeparrot_training - Step 33602: {'lr': 0.0001221397585201175, 'samples': 17204736, 'steps': 33602, 'batch_loss/train': 0.74592927377671}
12/27/2021 09:24:00 - INFO - codeparrot_training - Step 33603: {'lr': 0.00012212621322275967, 'samples': 17205248, 'steps': 33603, 'batch_loss/train': 0.7486495254561305}
12/27/2021 09:24:13 - INFO - codeparrot_training - Step 33604: {'lr': 0.0001221126684337797, 'samples': 17205760, 'steps': 33604, 'batch_loss/train': 0.7554321838542819}
12/27/2021 09:24:24 - INFO - codeparrot_training - Step 33605: {'lr': 0.00012209912415323148, 'samples': 17206272, 'steps': 33605, 'batch_loss/train': 0.7150425942381844}
12/27/2021 09:24:35 - INFO - codeparrot_training - Step 33606: {'lr': 0.00012208558038116885, 'samples': 17206784, 'steps': 33606, 'batch_loss/train': 0.7577569149434566}
12/27/2021 09:24:47 - INFO - codeparrot_training - Step 33607: {'lr': 0.00012207203711764566, 'samples': 17207296, 'steps': 33607, 'batch_loss/train': 0.741596139036119}
12/27/2021 09:24:57 - INFO - codeparrot_training - Step 33608: {'lr': 0.00012205849436271585, 'samples': 17207808, 'steps': 33608, 'batch_loss/train': 0.7797032976523042}
12/27/2021 09:25:08 - INFO - codeparrot_training - Step 33609: {'lr': 0.00012204495211643296, 'samples': 17208320, 'steps': 33609, 'batch_loss/train': 0.7529685641638935}
12/27/2021 09:25:20 - INFO - codeparrot_training - Step 33610: {'lr': 0.00012203141037885112, 'samples': 17208832, 'steps': 33610, 'batch_loss/train': 0.8142484752461314}
12/27/2021 09:25:31 - INFO - codeparrot_training - Step 33611: {'lr': 0.00012201786915002416, 'samples': 17209344, 'steps': 33611, 'batch_loss/train': 0.7768986718729138}
12/27/2021 09:25:41 - INFO - codeparrot_training - Step 33612: {'lr': 0.00012200432843000564, 'samples': 17209856, 'steps': 33612, 'batch_loss/train': 0.8023775760084391}
12/27/2021 09:25:52 - INFO - codeparrot_training - Step 33613: {'lr': 0.0001219907882188497, 'samples': 17210368, 'steps': 33613, 'batch_loss/train': 0.7941501375753433}
12/27/2021 09:26:06 - INFO - codeparrot_training - Step 33614: {'lr': 0.00012197724851661012, 'samples': 17210880, 'steps': 33614, 'batch_loss/train': 0.779807589482516}
12/27/2021 09:26:16 - INFO - codeparrot_training - Step 33615: {'lr': 0.00012196370932334058, 'samples': 17211392, 'steps': 33615, 'batch_loss/train': 0.669136197771877}
12/27/2021 09:26:27 - INFO - codeparrot_training - Step 33616: {'lr': 0.00012195017063909497, 'samples': 17211904, 'steps': 33616, 'batch_loss/train': 0.6791223119944334}
12/27/2021 09:26:39 - INFO - codeparrot_training - Step 33617: {'lr': 0.00012193663246392717, 'samples': 17212416, 'steps': 33617, 'batch_loss/train': 0.6869940743781626}
12/27/2021 09:26:50 - INFO - codeparrot_training - Step 33618: {'lr': 0.00012192309479789094, 'samples': 17212928, 'steps': 33618, 'batch_loss/train': 0.801733406027779}
12/27/2021 09:27:00 - INFO - codeparrot_training - Step 33619: {'lr': 0.00012190955764104012, 'samples': 17213440, 'steps': 33619, 'batch_loss/train': 0.7430668296292424}
12/27/2021 09:27:14 - INFO - codeparrot_training - Step 33620: {'lr': 0.00012189602099342853, 'samples': 17213952, 'steps': 33620, 'batch_loss/train': 0.7521279118955135}
12/27/2021 09:27:24 - INFO - codeparrot_training - Step 33621: {'lr': 0.00012188248485511, 'samples': 17214464, 'steps': 33621, 'batch_loss/train': 0.7812659619376063}
12/27/2021 09:27:35 - INFO - codeparrot_training - Step 33622: {'lr': 0.00012186894922613833, 'samples': 17214976, 'steps': 33622, 'batch_loss/train': 0.7501036301255226}
12/27/2021 09:27:47 - INFO - codeparrot_training - Step 33623: {'lr': 0.00012185541410656733, 'samples': 17215488, 'steps': 33623, 'batch_loss/train': 0.7295154137536883}
12/27/2021 09:27:58 - INFO - codeparrot_training - Step 33624: {'lr': 0.00012184187949645085, 'samples': 17216000, 'steps': 33624, 'batch_loss/train': 0.6388683137483895}
12/27/2021 09:28:08 - INFO - codeparrot_training - Step 33625: {'lr': 0.00012182834539584273, 'samples': 17216512, 'steps': 33625, 'batch_loss/train': 0.6386727746576071}
12/27/2021 09:28:19 - INFO - codeparrot_training - Step 33626: {'lr': 0.00012181481180479661, 'samples': 17217024, 'steps': 33626, 'batch_loss/train': 0.7166255504125729}
12/27/2021 09:28:31 - INFO - codeparrot_training - Step 33627: {'lr': 0.00012180127872336632, 'samples': 17217536, 'steps': 33627, 'batch_loss/train': 0.7795331003144383}
12/27/2021 09:28:42 - INFO - codeparrot_training - Step 33628: {'lr': 0.00012178774615160593, 'samples': 17218048, 'steps': 33628, 'batch_loss/train': 0.7656715111806989}
12/27/2021 09:28:53 - INFO - codeparrot_training - Step 33629: {'lr': 0.00012177421408956895, 'samples': 17218560, 'steps': 33629, 'batch_loss/train': 0.7419651420786977}
12/27/2021 09:29:06 - INFO - codeparrot_training - Step 33630: {'lr': 0.00012176068253730929, 'samples': 17219072, 'steps': 33630, 'batch_loss/train': 1.057248909957707}
12/27/2021 09:29:17 - INFO - codeparrot_training - Step 33631: {'lr': 0.00012174715149488075, 'samples': 17219584, 'steps': 33631, 'batch_loss/train': 0.6733072097413242}
12/27/2021 09:29:27 - INFO - codeparrot_training - Step 33632: {'lr': 0.00012173362096233706, 'samples': 17220096, 'steps': 33632, 'batch_loss/train': 0.7630918493960053}
12/27/2021 09:29:38 - INFO - codeparrot_training - Step 33633: {'lr': 0.0001217200909397321, 'samples': 17220608, 'steps': 33633, 'batch_loss/train': 0.6869408269412816}
12/27/2021 09:29:51 - INFO - codeparrot_training - Step 33634: {'lr': 0.00012170656142711972, 'samples': 17221120, 'steps': 33634, 'batch_loss/train': 0.7850044465158135}
12/27/2021 09:30:02 - INFO - codeparrot_training - Step 33635: {'lr': 0.0001216930324245534, 'samples': 17221632, 'steps': 33635, 'batch_loss/train': 1.1367486882954836}
12/27/2021 09:30:13 - INFO - codeparrot_training - Step 33636: {'lr': 0.00012167950393208723, 'samples': 17222144, 'steps': 33636, 'batch_loss/train': 1.01500843744725}
12/27/2021 09:30:25 - INFO - codeparrot_training - Step 33637: {'lr': 0.000121665975949775, 'samples': 17222656, 'steps': 33637, 'batch_loss/train': 0.711737563367933}
12/27/2021 09:30:36 - INFO - codeparrot_training - Step 33638: {'lr': 0.00012165244847767021, 'samples': 17223168, 'steps': 33638, 'batch_loss/train': 0.5908395119477063}
12/27/2021 09:30:46 - INFO - codeparrot_training - Step 33639: {'lr': 0.00012163892151582692, 'samples': 17223680, 'steps': 33639, 'batch_loss/train': 0.6634682975709438}
12/27/2021 09:30:57 - INFO - codeparrot_training - Step 33640: {'lr': 0.0001216253950642989, 'samples': 17224192, 'steps': 33640, 'batch_loss/train': 0.5645267154322937}
12/27/2021 09:31:09 - INFO - codeparrot_training - Step 33641: {'lr': 0.00012161186912313973, 'samples': 17224704, 'steps': 33641, 'batch_loss/train': 0.7807382300961763}
12/27/2021 09:31:20 - INFO - codeparrot_training - Step 33642: {'lr': 0.00012159834369240322, 'samples': 17225216, 'steps': 33642, 'batch_loss/train': 0.833513387478888}
12/27/2021 09:31:30 - INFO - codeparrot_training - Step 33643: {'lr': 0.0001215848187721434, 'samples': 17225728, 'steps': 33643, 'batch_loss/train': 0.7465156111866236}
12/27/2021 09:31:44 - INFO - codeparrot_training - Step 33644: {'lr': 0.00012157129436241374, 'samples': 17226240, 'steps': 33644, 'batch_loss/train': 0.6733027333393693}
12/27/2021 09:31:55 - INFO - codeparrot_training - Step 33645: {'lr': 0.0001215577704632681, 'samples': 17226752, 'steps': 33645, 'batch_loss/train': 0.7758807288482785}
12/27/2021 09:32:06 - INFO - codeparrot_training - Step 33646: {'lr': 0.00012154424707476031, 'samples': 17227264, 'steps': 33646, 'batch_loss/train': 0.6185843421262689}
12/27/2021 09:32:18 - INFO - codeparrot_training - Step 33647: {'lr': 0.00012153072419694408, 'samples': 17227776, 'steps': 33647, 'batch_loss/train': 0.8008023481816053}
12/27/2021 09:32:28 - INFO - codeparrot_training - Step 33648: {'lr': 0.00012151720182987316, 'samples': 17228288, 'steps': 33648, 'batch_loss/train': 0.6382996004540473}
12/27/2021 09:32:39 - INFO - codeparrot_training - Step 33649: {'lr': 0.00012150367997360137, 'samples': 17228800, 'steps': 33649, 'batch_loss/train': 0.7505090404301882}
12/27/2021 09:32:51 - INFO - codeparrot_training - Step 33650: {'lr': 0.00012149015862818241, 'samples': 17229312, 'steps': 33650, 'batch_loss/train': 0.7493192609399557}
12/27/2021 09:33:02 - INFO - codeparrot_training - Step 33651: {'lr': 0.00012147663779367016, 'samples': 17229824, 'steps': 33651, 'batch_loss/train': 0.6702890095766634}
12/27/2021 09:33:12 - INFO - codeparrot_training - Step 33652: {'lr': 0.00012146311747011815, 'samples': 17230336, 'steps': 33652, 'batch_loss/train': 1.5167615814134479}
12/27/2021 09:33:23 - INFO - codeparrot_training - Step 33653: {'lr': 0.00012144959765758017, 'samples': 17230848, 'steps': 33653, 'batch_loss/train': 0.6904588937759399}
12/27/2021 09:33:37 - INFO - codeparrot_training - Step 33654: {'lr': 0.00012143607835611026, 'samples': 17231360, 'steps': 33654, 'batch_loss/train': 0.7036456752102822}
12/27/2021 09:33:47 - INFO - codeparrot_training - Step 33655: {'lr': 0.00012142255956576184, 'samples': 17231872, 'steps': 33655, 'batch_loss/train': 0.7162935547530651}
12/27/2021 09:33:58 - INFO - codeparrot_training - Step 33656: {'lr': 0.00012140904128658877, 'samples': 17232384, 'steps': 33656, 'batch_loss/train': 1.2514894073829055}
12/27/2021 09:34:10 - INFO - codeparrot_training - Step 33657: {'lr': 0.0001213955235186448, 'samples': 17232896, 'steps': 33657, 'batch_loss/train': 0.7327676978893578}
12/27/2021 09:34:21 - INFO - codeparrot_training - Step 33658: {'lr': 0.0001213820062619837, 'samples': 17233408, 'steps': 33658, 'batch_loss/train': 0.7870878376998007}
12/27/2021 09:34:31 - INFO - codeparrot_training - Step 33659: {'lr': 0.00012136848951665916, 'samples': 17233920, 'steps': 33659, 'batch_loss/train': 0.7749565159901977}
12/27/2021 09:34:45 - INFO - codeparrot_training - Step 33660: {'lr': 0.00012135497328272494, 'samples': 17234432, 'steps': 33660, 'batch_loss/train': 0.7284556729719043}
12/27/2021 09:34:55 - INFO - codeparrot_training - Step 33661: {'lr': 0.00012134145756023474, 'samples': 17234944, 'steps': 33661, 'batch_loss/train': 0.9031348926946521}
12/27/2021 09:35:06 - INFO - codeparrot_training - Step 33662: {'lr': 0.00012132794234924235, 'samples': 17235456, 'steps': 33662, 'batch_loss/train': 0.7426502429880202}
12/27/2021 09:35:17 - INFO - codeparrot_training - Step 33663: {'lr': 0.00012131442764980158, 'samples': 17235968, 'steps': 33663, 'batch_loss/train': 0.9251001584343612}
12/27/2021 09:35:29 - INFO - codeparrot_training - Step 33664: {'lr': 0.00012130091346196587, 'samples': 17236480, 'steps': 33664, 'batch_loss/train': 0.7501468537375331}
12/27/2021 09:35:39 - INFO - codeparrot_training - Step 33665: {'lr': 0.00012128739978578921, 'samples': 17236992, 'steps': 33665, 'batch_loss/train': 0.6832997901365161}
12/27/2021 09:35:50 - INFO - codeparrot_training - Step 33666: {'lr': 0.00012127388662132535, 'samples': 17237504, 'steps': 33666, 'batch_loss/train': 0.7433949699625373}
12/27/2021 09:36:02 - INFO - codeparrot_training - Step 33667: {'lr': 0.0001212603739686278, 'samples': 17238016, 'steps': 33667, 'batch_loss/train': 0.747029697522521}
12/27/2021 09:36:12 - INFO - codeparrot_training - Step 33668: {'lr': 0.00012124686182775033, 'samples': 17238528, 'steps': 33668, 'batch_loss/train': 0.6887284805998206}
12/27/2021 09:36:23 - INFO - codeparrot_training - Step 33669: {'lr': 0.0001212333501987469, 'samples': 17239040, 'steps': 33669, 'batch_loss/train': 0.7220185399055481}
12/27/2021 09:36:37 - INFO - codeparrot_training - Step 33670: {'lr': 0.00012121983908167094, 'samples': 17239552, 'steps': 33670, 'batch_loss/train': 0.7717766109853983}
12/27/2021 09:36:48 - INFO - codeparrot_training - Step 33671: {'lr': 0.00012120632847657628, 'samples': 17240064, 'steps': 33671, 'batch_loss/train': 0.7576645826920867}
12/27/2021 09:36:58 - INFO - codeparrot_training - Step 33672: {'lr': 0.00012119281838351665, 'samples': 17240576, 'steps': 33672, 'batch_loss/train': 0.752566778101027}
12/27/2021 09:37:10 - INFO - codeparrot_training - Step 33673: {'lr': 0.00012117930880254571, 'samples': 17241088, 'steps': 33673, 'batch_loss/train': 0.7561455620452762}
12/27/2021 09:37:21 - INFO - codeparrot_training - Step 33674: {'lr': 0.00012116579973371719, 'samples': 17241600, 'steps': 33674, 'batch_loss/train': 0.7438115868717432}
12/27/2021 09:37:31 - INFO - codeparrot_training - Step 33675: {'lr': 0.00012115229117708484, 'samples': 17242112, 'steps': 33675, 'batch_loss/train': 0.7543993936851621}
12/27/2021 09:37:42 - INFO - codeparrot_training - Step 33676: {'lr': 0.0001211387831327023, 'samples': 17242624, 'steps': 33676, 'batch_loss/train': 0.7431303020566702}
12/27/2021 09:37:56 - INFO - codeparrot_training - Step 33677: {'lr': 0.00012112527560062331, 'samples': 17243136, 'steps': 33677, 'batch_loss/train': 0.7742545469664037}
12/27/2021 09:38:07 - INFO - codeparrot_training - Step 33678: {'lr': 0.00012111176858090167, 'samples': 17243648, 'steps': 33678, 'batch_loss/train': 0.7520823967643082}
12/27/2021 09:38:17 - INFO - codeparrot_training - Step 33679: {'lr': 0.00012109826207359076, 'samples': 17244160, 'steps': 33679, 'batch_loss/train': 0.6776019870303571}
12/27/2021 09:38:29 - INFO - codeparrot_training - Step 33680: {'lr': 0.00012108475607874469, 'samples': 17244672, 'steps': 33680, 'batch_loss/train': 0.6552950851619244}
12/27/2021 09:38:40 - INFO - codeparrot_training - Step 33681: {'lr': 0.00012107125059641689, 'samples': 17245184, 'steps': 33681, 'batch_loss/train': 0.687204904621467}
12/27/2021 09:38:50 - INFO - codeparrot_training - Step 33682: {'lr': 0.00012105774562666108, 'samples': 17245696, 'steps': 33682, 'batch_loss/train': 0.6959978805389255}
12/27/2021 09:39:02 - INFO - codeparrot_training - Step 33683: {'lr': 0.00012104424116953098, 'samples': 17246208, 'steps': 33683, 'batch_loss/train': 0.7299808152019978}
12/27/2021 09:39:13 - INFO - codeparrot_training - Step 33684: {'lr': 0.00012103073722508028, 'samples': 17246720, 'steps': 33684, 'batch_loss/train': 0.7990842019207776}
12/27/2021 09:39:24 - INFO - codeparrot_training - Step 33685: {'lr': 0.00012101723379336272, 'samples': 17247232, 'steps': 33685, 'batch_loss/train': 0.8188069127500057}
12/27/2021 09:39:34 - INFO - codeparrot_training - Step 33686: {'lr': 0.00012100373087443189, 'samples': 17247744, 'steps': 33686, 'batch_loss/train': 0.637483493424952}
12/27/2021 09:39:47 - INFO - codeparrot_training - Step 33687: {'lr': 0.00012099022846834154, 'samples': 17248256, 'steps': 33687, 'batch_loss/train': 0.6240795544581488}
12/27/2021 09:39:57 - INFO - codeparrot_training - Step 33688: {'lr': 0.00012097672657514533, 'samples': 17248768, 'steps': 33688, 'batch_loss/train': 0.7387072835117579}
12/27/2021 09:40:08 - INFO - codeparrot_training - Step 33689: {'lr': 0.00012096322519489702, 'samples': 17249280, 'steps': 33689, 'batch_loss/train': 1.050299558788538}
12/27/2021 09:40:22 - INFO - codeparrot_training - Step 33690: {'lr': 0.00012094972432765003, 'samples': 17249792, 'steps': 33690, 'batch_loss/train': 1.377135592047125}
12/27/2021 09:40:33 - INFO - codeparrot_training - Step 33691: {'lr': 0.00012093622397345829, 'samples': 17250304, 'steps': 33691, 'batch_loss/train': 0.7095358436927199}
12/27/2021 09:40:43 - INFO - codeparrot_training - Step 33692: {'lr': 0.00012092272413237549, 'samples': 17250816, 'steps': 33692, 'batch_loss/train': 0.7742928555235267}
12/27/2021 09:40:54 - INFO - codeparrot_training - Step 33693: {'lr': 0.0001209092248044551, 'samples': 17251328, 'steps': 33693, 'batch_loss/train': 0.7255480187013745}
12/27/2021 09:41:06 - INFO - codeparrot_training - Step 33694: {'lr': 0.00012089572598975082, 'samples': 17251840, 'steps': 33694, 'batch_loss/train': 0.5071461270563304}
12/27/2021 09:41:17 - INFO - codeparrot_training - Step 33695: {'lr': 0.00012088222768831655, 'samples': 17252352, 'steps': 33695, 'batch_loss/train': 0.7335848963703029}
12/27/2021 09:41:27 - INFO - codeparrot_training - Step 33696: {'lr': 0.0001208687299002057, 'samples': 17252864, 'steps': 33696, 'batch_loss/train': 0.7169615915045142}
12/27/2021 09:41:39 - INFO - codeparrot_training - Step 33697: {'lr': 0.00012085523262547202, 'samples': 17253376, 'steps': 33697, 'batch_loss/train': 0.7524627838283777}
12/27/2021 09:41:50 - INFO - codeparrot_training - Step 33698: {'lr': 0.00012084173586416919, 'samples': 17253888, 'steps': 33698, 'batch_loss/train': 0.7298930949764326}
12/27/2021 09:42:01 - INFO - codeparrot_training - Step 33699: {'lr': 0.00012082823961635084, 'samples': 17254400, 'steps': 33699, 'batch_loss/train': 0.7308797352015972}
12/27/2021 09:42:14 - INFO - codeparrot_training - Step 33700: {'lr': 0.00012081474388207062, 'samples': 17254912, 'steps': 33700, 'batch_loss/train': 0.6798901000292972}
12/27/2021 09:42:25 - INFO - codeparrot_training - Step 33701: {'lr': 0.0001208012486613822, 'samples': 17255424, 'steps': 33701, 'batch_loss/train': 0.7616830673068762}
12/27/2021 09:42:35 - INFO - codeparrot_training - Step 33702: {'lr': 0.00012078775395433925, 'samples': 17255936, 'steps': 33702, 'batch_loss/train': 0.7871210370212793}
12/27/2021 09:42:47 - INFO - codeparrot_training - Step 33703: {'lr': 0.00012077425976099541, 'samples': 17256448, 'steps': 33703, 'batch_loss/train': 0.6546939482213929}
12/27/2021 09:42:58 - INFO - codeparrot_training - Step 33704: {'lr': 0.00012076076608140438, 'samples': 17256960, 'steps': 33704, 'batch_loss/train': 0.7527337337378412}
12/27/2021 09:43:09 - INFO - codeparrot_training - Step 33705: {'lr': 0.00012074727291561958, 'samples': 17257472, 'steps': 33705, 'batch_loss/train': 0.7203680672682822}
12/27/2021 09:43:19 - INFO - codeparrot_training - Step 33706: {'lr': 0.000120733780263695, 'samples': 17257984, 'steps': 33706, 'batch_loss/train': 0.6507476302795112}
12/27/2021 09:43:33 - INFO - codeparrot_training - Step 33707: {'lr': 0.00012072028812568403, 'samples': 17258496, 'steps': 33707, 'batch_loss/train': 0.7409752765670419}
12/27/2021 09:43:44 - INFO - codeparrot_training - Step 33708: {'lr': 0.00012070679650164027, 'samples': 17259008, 'steps': 33708, 'batch_loss/train': 0.7454460142180324}
12/27/2021 09:43:54 - INFO - codeparrot_training - Step 33709: {'lr': 0.00012069330539161763, 'samples': 17259520, 'steps': 33709, 'batch_loss/train': 0.655768904951401}
12/27/2021 09:44:06 - INFO - codeparrot_training - Step 33710: {'lr': 0.00012067981479566953, 'samples': 17260032, 'steps': 33710, 'batch_loss/train': 0.7313806200399995}
12/27/2021 09:44:17 - INFO - codeparrot_training - Step 33711: {'lr': 0.00012066632471384961, 'samples': 17260544, 'steps': 33711, 'batch_loss/train': 0.6952187730930746}
12/27/2021 09:44:28 - INFO - codeparrot_training - Step 33712: {'lr': 0.00012065283514621158, 'samples': 17261056, 'steps': 33712, 'batch_loss/train': 0.7280085794627666}
12/27/2021 09:44:40 - INFO - codeparrot_training - Step 33713: {'lr': 0.00012063934609280906, 'samples': 17261568, 'steps': 33713, 'batch_loss/train': 0.6960946666076779}
12/27/2021 09:44:50 - INFO - codeparrot_training - Step 33714: {'lr': 0.00012062585755369562, 'samples': 17262080, 'steps': 33714, 'batch_loss/train': 0.7407726999372244}
12/27/2021 09:45:01 - INFO - codeparrot_training - Step 33715: {'lr': 0.00012061236952892501, 'samples': 17262592, 'steps': 33715, 'batch_loss/train': 0.8130402527749538}
12/27/2021 09:45:12 - INFO - codeparrot_training - Step 33716: {'lr': 0.0001205988820185506, 'samples': 17263104, 'steps': 33716, 'batch_loss/train': 0.7957127904519439}
12/27/2021 09:45:24 - INFO - codeparrot_training - Step 33717: {'lr': 0.00012058539502262625, 'samples': 17263616, 'steps': 33717, 'batch_loss/train': 0.7383029754273593}
12/27/2021 09:45:34 - INFO - codeparrot_training - Step 33718: {'lr': 0.00012057190854120562, 'samples': 17264128, 'steps': 33718, 'batch_loss/train': 0.7564710509032011}
12/27/2021 09:45:45 - INFO - codeparrot_training - Step 33719: {'lr': 0.0001205584225743421, 'samples': 17264640, 'steps': 33719, 'batch_loss/train': 0.7795358086004853}
12/27/2021 09:45:59 - INFO - codeparrot_training - Step 33720: {'lr': 0.00012054493712208934, 'samples': 17265152, 'steps': 33720, 'batch_loss/train': 0.6211439569597133}
12/27/2021 09:46:10 - INFO - codeparrot_training - Step 33721: {'lr': 0.0001205314521845012, 'samples': 17265664, 'steps': 33721, 'batch_loss/train': 0.7207167628221214}
12/27/2021 09:46:20 - INFO - codeparrot_training - Step 33722: {'lr': 0.00012051796776163102, 'samples': 17266176, 'steps': 33722, 'batch_loss/train': 0.5361115167615935}
12/27/2021 09:46:33 - INFO - codeparrot_training - Step 33723: {'lr': 0.00012050448385353243, 'samples': 17266688, 'steps': 33723, 'batch_loss/train': 0.7084139953367412}
12/27/2021 09:46:43 - INFO - codeparrot_training - Step 33724: {'lr': 0.0001204910004602593, 'samples': 17267200, 'steps': 33724, 'batch_loss/train': 0.674851122777909}
12/27/2021 09:46:54 - INFO - codeparrot_training - Step 33725: {'lr': 0.00012047751758186496, 'samples': 17267712, 'steps': 33725, 'batch_loss/train': 0.7036031575407833}
12/27/2021 09:47:06 - INFO - codeparrot_training - Step 33726: {'lr': 0.0001204640352184031, 'samples': 17268224, 'steps': 33726, 'batch_loss/train': 0.9915058328770101}
12/27/2021 09:47:17 - INFO - codeparrot_training - Step 33727: {'lr': 0.00012045055336992733, 'samples': 17268736, 'steps': 33727, 'batch_loss/train': 0.7252625547698699}
12/27/2021 09:47:27 - INFO - codeparrot_training - Step 33728: {'lr': 0.00012043707203649123, 'samples': 17269248, 'steps': 33728, 'batch_loss/train': 0.72342265304178}
12/27/2021 09:47:38 - INFO - codeparrot_training - Step 33729: {'lr': 0.00012042359121814845, 'samples': 17269760, 'steps': 33729, 'batch_loss/train': 0.7206937773153186}
12/27/2021 09:47:52 - INFO - codeparrot_training - Step 33730: {'lr': 0.0001204101109149526, 'samples': 17270272, 'steps': 33730, 'batch_loss/train': 0.6936120027676225}
12/27/2021 09:48:02 - INFO - codeparrot_training - Step 33731: {'lr': 0.00012039663112695703, 'samples': 17270784, 'steps': 33731, 'batch_loss/train': 0.5930418060161173}
12/27/2021 09:48:13 - INFO - codeparrot_training - Step 33732: {'lr': 0.00012038315185421576, 'samples': 17271296, 'steps': 33732, 'batch_loss/train': 0.741578227840364}
12/27/2021 09:48:25 - INFO - codeparrot_training - Step 33733: {'lr': 0.00012036967309678198, 'samples': 17271808, 'steps': 33733, 'batch_loss/train': 0.5078476185444742}
12/27/2021 09:48:36 - INFO - codeparrot_training - Step 33734: {'lr': 0.00012035619485470936, 'samples': 17272320, 'steps': 33734, 'batch_loss/train': 0.6194896027154755}
12/27/2021 09:48:46 - INFO - codeparrot_training - Step 33735: {'lr': 0.00012034271712805176, 'samples': 17272832, 'steps': 33735, 'batch_loss/train': 0.7298682767432183}
12/27/2021 09:48:58 - INFO - codeparrot_training - Step 33736: {'lr': 0.00012032923991686242, 'samples': 17273344, 'steps': 33736, 'batch_loss/train': 0.6918439633445814}
12/27/2021 09:49:09 - INFO - codeparrot_training - Step 33737: {'lr': 0.00012031576322119511, 'samples': 17273856, 'steps': 33737, 'batch_loss/train': 0.7724747355096042}
12/27/2021 09:49:20 - INFO - codeparrot_training - Step 33738: {'lr': 0.00012030228704110336, 'samples': 17274368, 'steps': 33738, 'batch_loss/train': 0.6011626916006207}
12/27/2021 09:49:30 - INFO - codeparrot_training - Step 33739: {'lr': 0.00012028881137664072, 'samples': 17274880, 'steps': 33739, 'batch_loss/train': 0.6257424731738865}
12/27/2021 09:49:44 - INFO - codeparrot_training - Step 33740: {'lr': 0.00012027533622786078, 'samples': 17275392, 'steps': 33740, 'batch_loss/train': 0.6987008319701999}
12/27/2021 09:49:55 - INFO - codeparrot_training - Step 33741: {'lr': 0.00012026186159481723, 'samples': 17275904, 'steps': 33741, 'batch_loss/train': 0.7218255087500438}
12/27/2021 09:50:05 - INFO - codeparrot_training - Step 33742: {'lr': 0.00012024838747756333, 'samples': 17276416, 'steps': 33742, 'batch_loss/train': 0.6940088593401015}
12/27/2021 09:50:18 - INFO - codeparrot_training - Step 33743: {'lr': 0.00012023491387615296, 'samples': 17276928, 'steps': 33743, 'batch_loss/train': 0.7867160406894982}
12/27/2021 09:50:28 - INFO - codeparrot_training - Step 33744: {'lr': 0.00012022144079063968, 'samples': 17277440, 'steps': 33744, 'batch_loss/train': 0.7837965063517913}
12/27/2021 09:50:39 - INFO - codeparrot_training - Step 33745: {'lr': 0.00012020796822107674, 'samples': 17277952, 'steps': 33745, 'batch_loss/train': 0.5326165237929672}
12/27/2021 09:50:53 - INFO - codeparrot_training - Step 33746: {'lr': 0.00012019449616751801, 'samples': 17278464, 'steps': 33746, 'batch_loss/train': 0.7065875791013241}
12/27/2021 09:51:03 - INFO - codeparrot_training - Step 33747: {'lr': 0.00012018102463001704, 'samples': 17278976, 'steps': 33747, 'batch_loss/train': 0.7978595096501522}
12/27/2021 09:51:14 - INFO - codeparrot_training - Step 33748: {'lr': 0.00012016755360862722, 'samples': 17279488, 'steps': 33748, 'batch_loss/train': 0.8494005692191422}
12/27/2021 09:51:24 - INFO - codeparrot_training - Step 33749: {'lr': 0.00012015408310340209, 'samples': 17280000, 'steps': 33749, 'batch_loss/train': 0.7747149942442775}
12/27/2021 09:51:37 - INFO - codeparrot_training - Step 33750: {'lr': 0.00012014061311439547, 'samples': 17280512, 'steps': 33750, 'batch_loss/train': 0.754429361782968}
12/27/2021 09:51:47 - INFO - codeparrot_training - Step 33751: {'lr': 0.00012012714364166063, 'samples': 17281024, 'steps': 33751, 'batch_loss/train': 0.8654056675732136}
12/27/2021 09:51:58 - INFO - codeparrot_training - Step 33752: {'lr': 0.00012011367468525125, 'samples': 17281536, 'steps': 33752, 'batch_loss/train': 0.7935637559276074}
12/27/2021 09:52:10 - INFO - codeparrot_training - Step 33753: {'lr': 0.00012010020624522086, 'samples': 17282048, 'steps': 33753, 'batch_loss/train': 0.8169003166258335}
12/27/2021 09:52:21 - INFO - codeparrot_training - Step 33754: {'lr': 0.00012008673832162301, 'samples': 17282560, 'steps': 33754, 'batch_loss/train': 0.6179063479939941}
12/27/2021 09:52:31 - INFO - codeparrot_training - Step 33755: {'lr': 0.00012007327091451123, 'samples': 17283072, 'steps': 33755, 'batch_loss/train': 0.6402977826073766}
12/27/2021 09:52:43 - INFO - codeparrot_training - Step 33756: {'lr': 0.00012005980402393915, 'samples': 17283584, 'steps': 33756, 'batch_loss/train': 0.760567145422101}
12/27/2021 09:52:54
- INFO - codeparrot_training - Step 33757: {'lr': 0.00012004633764996003, 'samples': 17284096, 'steps': 33757, 'batch_loss/train': 0.6313295364379883} 12/27/2021 09:53:04 - INFO - codeparrot_training - Step 33758: {'lr': 0.0001200328717926277, 'samples': 17284608, 'steps': 33758, 'batch_loss/train': 0.6690980619750917} 12/27/2021 09:53:18 - INFO - codeparrot_training - Step 33759: {'lr': 0.00012001940645199572, 'samples': 17285120, 'steps': 33759, 'batch_loss/train': 0.6783102583140135} 12/27/2021 09:53:29 - INFO - codeparrot_training - Step 33760: {'lr': 0.00012000594162811729, 'samples': 17285632, 'steps': 33760, 'batch_loss/train': 0.719848264940083} 12/27/2021 09:53:40 - INFO - codeparrot_training - Step 33761: {'lr': 0.00011999247732104635, 'samples': 17286144, 'steps': 33761, 'batch_loss/train': 0.7214340814389288} 12/27/2021 09:53:50 - INFO - codeparrot_training - Step 33762: {'lr': 0.00011997901353083612, 'samples': 17286656, 'steps': 33762, 'batch_loss/train': 0.7255036681890488} 12/27/2021 09:54:03 - INFO - codeparrot_training - Step 33763: {'lr': 0.00011996555025754025, 'samples': 17287168, 'steps': 33763, 'batch_loss/train': 0.76879130769521} 12/27/2021 09:54:13 - INFO - codeparrot_training - Step 33764: {'lr': 0.00011995208750121226, 'samples': 17287680, 'steps': 33764, 'batch_loss/train': 1.7395339253125712} 12/27/2021 09:54:24 - INFO - codeparrot_training - Step 33765: {'lr': 0.00011993862526190564, 'samples': 17288192, 'steps': 33765, 'batch_loss/train': 0.7945230826735497} 12/27/2021 09:54:36 - INFO - codeparrot_training - Step 33766: {'lr': 0.00011992516353967395, 'samples': 17288704, 'steps': 33766, 'batch_loss/train': 0.5975441099144518} 12/27/2021 09:54:46 - INFO - codeparrot_training - Step 33767: {'lr': 0.00011991170233457072, 'samples': 17289216, 'steps': 33767, 'batch_loss/train': 0.7266815355978906} 12/27/2021 09:54:57 - INFO - codeparrot_training - Step 33768: {'lr': 0.00011989824164664942, 'samples': 17289728, 'steps': 33768, 
'batch_loss/train': 0.7379002394154668} 12/27/2021 09:55:11 - INFO - codeparrot_training - Step 33769: {'lr': 0.00011988478147596358, 'samples': 17290240, 'steps': 33769, 'batch_loss/train': 0.75599405891262} 12/27/2021 09:55:22 - INFO - codeparrot_training - Step 33770: {'lr': 0.00011987132182256682, 'samples': 17290752, 'steps': 33770, 'batch_loss/train': 0.8337046513333917} 12/27/2021 09:55:32 - INFO - codeparrot_training - Step 33771: {'lr': 0.00011985786268651236, 'samples': 17291264, 'steps': 33771, 'batch_loss/train': 0.6113178730010986} 12/27/2021 09:55:43 - INFO - codeparrot_training - Step 33772: {'lr': 0.00011984440406785399, 'samples': 17291776, 'steps': 33772, 'batch_loss/train': 0.7528033999260515} 12/27/2021 09:55:55 - INFO - codeparrot_training - Step 33773: {'lr': 0.00011983094596664524, 'samples': 17292288, 'steps': 33773, 'batch_loss/train': 0.6709207352250814} 12/27/2021 09:56:05 - INFO - codeparrot_training - Step 33774: {'lr': 0.00011981748838293938, 'samples': 17292800, 'steps': 33774, 'batch_loss/train': 0.6546078467799816} 12/27/2021 09:56:16 - INFO - codeparrot_training - Step 33775: {'lr': 0.00011980403131678993, 'samples': 17293312, 'steps': 33775, 'batch_loss/train': 0.7149500879459083} 12/27/2021 09:56:30 - INFO - codeparrot_training - Step 33776: {'lr': 0.00011979057476825072, 'samples': 17293824, 'steps': 33776, 'batch_loss/train': 0.7169255996122956} 12/27/2021 09:56:41 - INFO - codeparrot_training - Step 33777: {'lr': 0.0001197771187373749, 'samples': 17294336, 'steps': 33777, 'batch_loss/train': 0.8817254062741995} 12/27/2021 09:56:51 - INFO - codeparrot_training - Step 33778: {'lr': 0.00011976366322421609, 'samples': 17294848, 'steps': 33778, 'batch_loss/train': 0.7470169425942004} 12/27/2021 09:57:03 - INFO - codeparrot_training - Step 33779: {'lr': 0.00011975020822882776, 'samples': 17295360, 'steps': 33779, 'batch_loss/train': 0.7255746745504439} 12/27/2021 09:57:14 - INFO - codeparrot_training - Step 33780: {'lr': 
0.00011973675375126344, 'samples': 17295872, 'steps': 33780, 'batch_loss/train': 0.7635662169195712} 12/27/2021 09:57:25 - INFO - codeparrot_training - Step 33781: {'lr': 0.00011972329979157662, 'samples': 17296384, 'steps': 33781, 'batch_loss/train': 0.6934156161732972} 12/27/2021 09:57:35 - INFO - codeparrot_training - Step 33782: {'lr': 0.00011970984634982074, 'samples': 17296896, 'steps': 33782, 'batch_loss/train': 0.7307876846753061} 12/27/2021 09:57:48 - INFO - codeparrot_training - Step 33783: {'lr': 0.00011969639342604932, 'samples': 17297408, 'steps': 33783, 'batch_loss/train': 0.9243980348110199} 12/27/2021 09:57:58 - INFO - codeparrot_training - Step 33784: {'lr': 0.00011968294102031585, 'samples': 17297920, 'steps': 33784, 'batch_loss/train': 0.7622435232624412} 12/27/2021 09:58:09 - INFO - codeparrot_training - Step 33785: {'lr': 0.0001196694891326739, 'samples': 17298432, 'steps': 33785, 'batch_loss/train': 0.6953941164538264} 12/27/2021 09:58:22 - INFO - codeparrot_training - Step 33786: {'lr': 0.00011965603776317663, 'samples': 17298944, 'steps': 33786, 'batch_loss/train': 0.6611773939803243} 12/27/2021 09:58:33 - INFO - codeparrot_training - Step 33787: {'lr': 0.00011964258691187796, 'samples': 17299456, 'steps': 33787, 'batch_loss/train': 0.6441615275107324} 12/27/2021 09:58:43 - INFO - codeparrot_training - Step 33788: {'lr': 0.00011962913657883101, 'samples': 17299968, 'steps': 33788, 'batch_loss/train': 0.7380093294195831} 12/27/2021 09:58:55 - INFO - codeparrot_training - Step 33789: {'lr': 0.00011961568676408943, 'samples': 17300480, 'steps': 33789, 'batch_loss/train': 0.7269299941835925} 12/27/2021 09:59:06 - INFO - codeparrot_training - Step 33790: {'lr': 0.00011960223746770663, 'samples': 17300992, 'steps': 33790, 'batch_loss/train': 0.8052985752001405} 12/27/2021 09:59:17 - INFO - codeparrot_training - Step 33791: {'lr': 0.0001195887886897361, 'samples': 17301504, 'steps': 33791, 'batch_loss/train': 0.7691049083368853} 12/27/2021 09:59:27 
- INFO - codeparrot_training - Step 33792: {'lr': 0.00011957534043023129, 'samples': 17302016, 'steps': 33792, 'batch_loss/train': 0.7469819318503141} 12/27/2021 09:59:39 - INFO - codeparrot_training - Step 33793: {'lr': 0.00011956189268924569, 'samples': 17302528, 'steps': 33793, 'batch_loss/train': 0.6898347213864326} 12/27/2021 09:59:50 - INFO - codeparrot_training - Step 33794: {'lr': 0.00011954844546683277, 'samples': 17303040, 'steps': 33794, 'batch_loss/train': 0.6866682698018849} 12/27/2021 10:00:01 - INFO - codeparrot_training - Step 33795: {'lr': 0.00011953499876304597, 'samples': 17303552, 'steps': 33795, 'batch_loss/train': 0.6100697587244213} 12/27/2021 10:00:13 - INFO - codeparrot_training - Step 33796: {'lr': 0.00011952155257793882, 'samples': 17304064, 'steps': 33796, 'batch_loss/train': 0.7547424687072635} 12/27/2021 10:00:24 - INFO - codeparrot_training - Step 33797: {'lr': 0.00011950810691156456, 'samples': 17304576, 'steps': 33797, 'batch_loss/train': 0.7002913802862167} 12/27/2021 10:00:34 - INFO - codeparrot_training - Step 33798: {'lr': 0.00011949466176397686, 'samples': 17305088, 'steps': 33798, 'batch_loss/train': 0.7762186555191875} 12/27/2021 10:00:48 - INFO - codeparrot_training - Step 33799: {'lr': 0.00011948121713522922, 'samples': 17305600, 'steps': 33799, 'batch_loss/train': 0.7410642588511109} 12/27/2021 10:00:59 - INFO - codeparrot_training - Step 33800: {'lr': 0.00011946777302537487, 'samples': 17306112, 'steps': 33800, 'batch_loss/train': 0.7566625364124775} 12/27/2021 10:01:09 - INFO - codeparrot_training - Step 33801: {'lr': 0.00011945432943446727, 'samples': 17306624, 'steps': 33801, 'batch_loss/train': 0.7785346172749996} 12/27/2021 10:01:20 - INFO - codeparrot_training - Step 33802: {'lr': 0.00011944088636256018, 'samples': 17307136, 'steps': 33802, 'batch_loss/train': 0.7442473797127604} 12/27/2021 10:01:32 - INFO - codeparrot_training - Step 33803: {'lr': 0.00011942744380970668, 'samples': 17307648, 'steps': 33803, 
'batch_loss/train': 0.7390279611572623} 12/27/2021 10:01:43 - INFO - codeparrot_training - Step 33804: {'lr': 0.0001194140017759604, 'samples': 17308160, 'steps': 33804, 'batch_loss/train': 0.7257581620942801} 12/27/2021 10:01:53 - INFO - codeparrot_training - Step 33805: {'lr': 0.00011940056026137474, 'samples': 17308672, 'steps': 33805, 'batch_loss/train': 0.6789220452774316} 12/27/2021 10:02:05 - INFO - codeparrot_training - Step 33806: {'lr': 0.00011938711926600315, 'samples': 17309184, 'steps': 33806, 'batch_loss/train': 0.805217846762389} 12/27/2021 10:02:16 - INFO - codeparrot_training - Step 33807: {'lr': 0.00011937367878989904, 'samples': 17309696, 'steps': 33807, 'batch_loss/train': 0.7842119242995977} 12/27/2021 10:02:27 - INFO - codeparrot_training - Step 33808: {'lr': 0.00011936023883311586, 'samples': 17310208, 'steps': 33808, 'batch_loss/train': 0.7273971159011126} 12/27/2021 10:02:40 - INFO - codeparrot_training - Step 33809: {'lr': 0.00011934679939570705, 'samples': 17310720, 'steps': 33809, 'batch_loss/train': 0.7362692016176879} 12/27/2021 10:02:51 - INFO - codeparrot_training - Step 33810: {'lr': 0.00011933336047772603, 'samples': 17311232, 'steps': 33810, 'batch_loss/train': 0.8287498503923416} 12/27/2021 10:03:01 - INFO - codeparrot_training - Step 33811: {'lr': 0.00011931992207922634, 'samples': 17311744, 'steps': 33811, 'batch_loss/train': 0.7651020127814263} 12/27/2021 10:03:14 - INFO - codeparrot_training - Step 33812: {'lr': 0.00011930648420026108, 'samples': 17312256, 'steps': 33812, 'batch_loss/train': 0.911902760155499} 12/27/2021 10:03:24 - INFO - codeparrot_training - Step 33813: {'lr': 0.00011929304684088402, 'samples': 17312768, 'steps': 33813, 'batch_loss/train': 0.6822981345467269} 12/27/2021 10:03:35 - INFO - codeparrot_training - Step 33814: {'lr': 0.0001192796100011485, 'samples': 17313280, 'steps': 33814, 'batch_loss/train': 0.7480244413018227} 12/27/2021 10:03:45 - INFO - codeparrot_training - Step 33815: {'lr': 
0.00011926617368110785, 'samples': 17313792, 'steps': 33815, 'batch_loss/train': 0.7769664544612169} 12/27/2021 10:03:58 - INFO - codeparrot_training - Step 33816: {'lr': 0.0001192527378808155, 'samples': 17314304, 'steps': 33816, 'batch_loss/train': 0.7623172868043184} 12/27/2021 10:04:09 - INFO - codeparrot_training - Step 33817: {'lr': 0.0001192393026003249, 'samples': 17314816, 'steps': 33817, 'batch_loss/train': 0.7058389112353325} 12/27/2021 10:04:19 - INFO - codeparrot_training - Step 33818: {'lr': 0.00011922586783968947, 'samples': 17315328, 'steps': 33818, 'batch_loss/train': 0.732837725430727} 12/27/2021 10:04:34 - INFO - codeparrot_training - Step 33819: {'lr': 0.00011921243359896259, 'samples': 17315840, 'steps': 33819, 'batch_loss/train': 0.791741237975657} 12/27/2021 10:04:44 - INFO - codeparrot_training - Step 33820: {'lr': 0.00011919899987819771, 'samples': 17316352, 'steps': 33820, 'batch_loss/train': 0.6287955336738378} 12/27/2021 10:04:55 - INFO - codeparrot_training - Step 33821: {'lr': 0.0001191855666774482, 'samples': 17316864, 'steps': 33821, 'batch_loss/train': 0.49832793656969443} 12/27/2021 10:05:07 - INFO - codeparrot_training - Step 33822: {'lr': 0.0001191721339967676, 'samples': 17317376, 'steps': 33822, 'batch_loss/train': 0.8658256642520428} 12/27/2021 10:05:18 - INFO - codeparrot_training - Step 33823: {'lr': 0.000119158701836209, 'samples': 17317888, 'steps': 33823, 'batch_loss/train': 0.880134274251759} 12/27/2021 10:05:28 - INFO - codeparrot_training - Step 33824: {'lr': 0.00011914527019582608, 'samples': 17318400, 'steps': 33824, 'batch_loss/train': 0.7081228997558355} 12/27/2021 10:05:39 - INFO - codeparrot_training - Step 33825: {'lr': 0.00011913183907567226, 'samples': 17318912, 'steps': 33825, 'batch_loss/train': 0.8111833315342665} 12/27/2021 10:05:53 - INFO - codeparrot_training - Step 33826: {'lr': 0.00011911840847580074, 'samples': 17319424, 'steps': 33826, 'batch_loss/train': 0.6813600626774132} 12/27/2021 10:06:03 - 
INFO - codeparrot_training - Step 33827: {'lr': 0.00011910497839626492, 'samples': 17319936, 'steps': 33827, 'batch_loss/train': 0.6922825202345848} 12/27/2021 10:06:14 - INFO - codeparrot_training - Step 33828: {'lr': 0.00011909154883711845, 'samples': 17320448, 'steps': 33828, 'batch_loss/train': 0.7439530082046986} 12/27/2021 10:06:26 - INFO - codeparrot_training - Step 33829: {'lr': 0.00011907811979841446, 'samples': 17320960, 'steps': 33829, 'batch_loss/train': 0.7690217611379921} 12/27/2021 10:06:37 - INFO - codeparrot_training - Step 33830: {'lr': 0.00011906469128020644, 'samples': 17321472, 'steps': 33830, 'batch_loss/train': 0.7968953484669328} 12/27/2021 10:06:47 - INFO - codeparrot_training - Step 33831: {'lr': 0.00011905126328254778, 'samples': 17321984, 'steps': 33831, 'batch_loss/train': 0.7412772020325065} 12/27/2021 10:06:59 - INFO - codeparrot_training - Step 33832: {'lr': 0.00011903783580549185, 'samples': 17322496, 'steps': 33832, 'batch_loss/train': 0.754237093962729} 12/27/2021 10:07:10 - INFO - codeparrot_training - Step 33833: {'lr': 0.00011902440884909202, 'samples': 17323008, 'steps': 33833, 'batch_loss/train': 0.7622250020503998} 12/27/2021 10:07:20 - INFO - codeparrot_training - Step 33834: {'lr': 0.00011901098241340171, 'samples': 17323520, 'steps': 33834, 'batch_loss/train': 0.8429202670231462} 12/27/2021 10:07:34 - INFO - codeparrot_training - Step 33835: {'lr': 0.00011899755649847427, 'samples': 17324032, 'steps': 33835, 'batch_loss/train': 0.6826363741420209} 12/27/2021 10:07:45 - INFO - codeparrot_training - Step 33836: {'lr': 0.00011898413110436307, 'samples': 17324544, 'steps': 33836, 'batch_loss/train': 0.7391997752711177} 12/27/2021 10:07:56 - INFO - codeparrot_training - Step 33837: {'lr': 0.00011897070623112158, 'samples': 17325056, 'steps': 33837, 'batch_loss/train': 0.7776093380525708} 12/27/2021 10:08:06 - INFO - codeparrot_training - Step 33838: {'lr': 0.00011895728187880293, 'samples': 17325568, 'steps': 33838, 
'batch_loss/train': 0.6993017299100757} 12/27/2021 10:08:18 - INFO - codeparrot_training - Step 33839: {'lr': 0.0001189438580474607, 'samples': 17326080, 'steps': 33839, 'batch_loss/train': 0.7985516898334026} 12/27/2021 10:08:29 - INFO - codeparrot_training - Step 33840: {'lr': 0.00011893043473714831, 'samples': 17326592, 'steps': 33840, 'batch_loss/train': 0.734759614802897} 12/27/2021 10:08:40 - INFO - codeparrot_training - Step 33841: {'lr': 0.00011891701194791896, 'samples': 17327104, 'steps': 33841, 'batch_loss/train': 0.7667638412676752} 12/27/2021 10:08:54 - INFO - codeparrot_training - Step 33842: {'lr': 0.00011890358967982603, 'samples': 17327616, 'steps': 33842, 'batch_loss/train': 0.8547091628424823} 12/27/2021 10:09:04 - INFO - codeparrot_training - Step 33843: {'lr': 0.00011889016793292296, 'samples': 17328128, 'steps': 33843, 'batch_loss/train': 0.5893305789213628} 12/27/2021 10:09:15 - INFO - codeparrot_training - Step 33844: {'lr': 0.00011887674670726306, 'samples': 17328640, 'steps': 33844, 'batch_loss/train': 0.7391862012445927} 12/27/2021 10:09:26 - INFO - codeparrot_training - Step 33845: {'lr': 0.00011886332600289973, 'samples': 17329152, 'steps': 33845, 'batch_loss/train': 0.6945781419053674} 12/27/2021 10:09:38 - INFO - codeparrot_training - Step 33846: {'lr': 0.00011884990581988628, 'samples': 17329664, 'steps': 33846, 'batch_loss/train': 0.5319313517538831} 12/27/2021 10:09:49 - INFO - codeparrot_training - Step 33847: {'lr': 0.00011883648615827608, 'samples': 17330176, 'steps': 33847, 'batch_loss/train': 0.6347704823128879} 12/27/2021 10:09:59 - INFO - codeparrot_training - Step 33848: {'lr': 0.00011882306701812248, 'samples': 17330688, 'steps': 33848, 'batch_loss/train': 0.6431613573804498} 12/27/2021 10:10:11 - INFO - codeparrot_training - Step 33849: {'lr': 0.00011880964839947886, 'samples': 17331200, 'steps': 33849, 'batch_loss/train': 0.7778122080489993} 12/27/2021 10:10:22 - INFO - codeparrot_training - Step 33850: {'lr': 
0.00011879623030239852, 'samples': 17331712, 'steps': 33850, 'batch_loss/train': 0.7424142425879836} 12/27/2021 10:10:33 - INFO - codeparrot_training - Step 33851: {'lr': 0.00011878281272693495, 'samples': 17332224, 'steps': 33851, 'batch_loss/train': 0.7505213394761086} 12/27/2021 10:10:45 - INFO - codeparrot_training - Step 33852: {'lr': 0.00011876939567314127, 'samples': 17332736, 'steps': 33852, 'batch_loss/train': 0.7962704757228494} 12/27/2021 10:10:55 - INFO - codeparrot_training - Step 33853: {'lr': 0.0001187559791410708, 'samples': 17333248, 'steps': 33853, 'batch_loss/train': 1.1116665983572602} 12/27/2021 10:11:06 - INFO - codeparrot_training - Step 33854: {'lr': 0.00011874256313077722, 'samples': 17333760, 'steps': 33854, 'batch_loss/train': 0.8017313815653324} 12/27/2021 10:11:20 - INFO - codeparrot_training - Step 33855: {'lr': 0.00011872914764231352, 'samples': 17334272, 'steps': 33855, 'batch_loss/train': 0.6526933667482808} 12/27/2021 10:11:31 - INFO - codeparrot_training - Step 33856: {'lr': 0.0001187157326757331, 'samples': 17334784, 'steps': 33856, 'batch_loss/train': 0.7318984633311629} 12/27/2021 10:11:41 - INFO - codeparrot_training - Step 33857: {'lr': 0.00011870231823108954, 'samples': 17335296, 'steps': 33857, 'batch_loss/train': 0.7709608416771516} 12/27/2021 10:11:52 - INFO - codeparrot_training - Step 33858: {'lr': 0.00011868890430843587, 'samples': 17335808, 'steps': 33858, 'batch_loss/train': 0.7636653033550829} 12/27/2021 10:12:04 - INFO - codeparrot_training - Step 33859: {'lr': 0.00011867549090782556, 'samples': 17336320, 'steps': 33859, 'batch_loss/train': 0.5755413645529188} 12/27/2021 10:12:15 - INFO - codeparrot_training - Step 33860: {'lr': 0.00011866207802931189, 'samples': 17336832, 'steps': 33860, 'batch_loss/train': 0.7805722062475979} 12/27/2021 10:12:25 - INFO - codeparrot_training - Step 33861: {'lr': 0.00011864866567294825, 'samples': 17337344, 'steps': 33861, 'batch_loss/train': 0.6983417319133878} 12/27/2021 10:12:38 
- INFO - codeparrot_training - Step 33862: {'lr': 0.00011863525383878787, 'samples': 17337856, 'steps': 33862, 'batch_loss/train': 0.7029704144224524} 12/27/2021 10:12:48 - INFO - codeparrot_training - Step 33863: {'lr': 0.00011862184252688424, 'samples': 17338368, 'steps': 33863, 'batch_loss/train': 0.7345256195403636} 12/27/2021 10:12:59 - INFO - codeparrot_training - Step 33864: {'lr': 0.00011860843173729035, 'samples': 17338880, 'steps': 33864, 'batch_loss/train': 1.3020908003672957} 12/27/2021 10:13:09 - INFO - codeparrot_training - Step 33865: {'lr': 0.00011859502147005982, 'samples': 17339392, 'steps': 33865, 'batch_loss/train': 0.8198451949283481} 12/27/2021 10:13:24 - INFO - codeparrot_training - Step 33866: {'lr': 0.00011858161172524597, 'samples': 17339904, 'steps': 33866, 'batch_loss/train': 0.6823558593168855} 12/27/2021 10:13:34 - INFO - codeparrot_training - Step 33867: {'lr': 0.00011856820250290193, 'samples': 17340416, 'steps': 33867, 'batch_loss/train': 0.66835852060467} 12/27/2021 10:13:45 - INFO - codeparrot_training - Step 33868: {'lr': 0.00011855479380308109, 'samples': 17340928, 'steps': 33868, 'batch_loss/train': 0.7564546628855169} 12/27/2021 10:13:57 - INFO - codeparrot_training - Step 33869: {'lr': 0.00011854138562583674, 'samples': 17341440, 'steps': 33869, 'batch_loss/train': 0.7384128361009061} 12/27/2021 10:14:08 - INFO - codeparrot_training - Step 33870: {'lr': 0.00011852797797122219, 'samples': 17341952, 'steps': 33870, 'batch_loss/train': 0.776478489395231} 12/27/2021 10:14:18 - INFO - codeparrot_training - Step 33871: {'lr': 0.00011851457083929079, 'samples': 17342464, 'steps': 33871, 'batch_loss/train': 0.745958499610424} 12/27/2021 10:14:32 - INFO - codeparrot_training - Step 33872: {'lr': 0.0001185011642300958, 'samples': 17342976, 'steps': 33872, 'batch_loss/train': 0.6717157429084182} 12/27/2021 10:14:43 - INFO - codeparrot_training - Step 33873: {'lr': 0.00011848775814369053, 'samples': 17343488, 'steps': 33873, 
'batch_loss/train': 0.5814956168178469} 12/27/2021 10:14:53 - INFO - codeparrot_training - Step 33874: {'lr': 0.00011847435258012827, 'samples': 17344000, 'steps': 33874, 'batch_loss/train': 0.7188391521340236} 12/27/2021 10:15:06 - INFO - codeparrot_training - Step 33875: {'lr': 0.00011846094753946233, 'samples': 17344512, 'steps': 33875, 'batch_loss/train': 0.849487341940403} 12/27/2021 10:15:17 - INFO - codeparrot_training - Step 33876: {'lr': 0.00011844754302174599, 'samples': 17345024, 'steps': 33876, 'batch_loss/train': 1.0953899286687374} 12/27/2021 10:15:27 - INFO - codeparrot_training - Step 33877: {'lr': 0.00011843413902703263, 'samples': 17345536, 'steps': 33877, 'batch_loss/train': 0.6824210830382071} 12/27/2021 10:15:38 - INFO - codeparrot_training - Step 33878: {'lr': 0.00011842073555537539, 'samples': 17346048, 'steps': 33878, 'batch_loss/train': 0.9869576469063759} 12/27/2021 10:15:50 - INFO - codeparrot_training - Step 33879: {'lr': 0.00011840733260682752, 'samples': 17346560, 'steps': 33879, 'batch_loss/train': 0.7596635175868869} 12/27/2021 10:16:00 - INFO - codeparrot_training - Step 33880: {'lr': 0.00011839393018144256, 'samples': 17347072, 'steps': 33880, 'batch_loss/train': 0.7028494987171143} 12/27/2021 10:16:11 - INFO - codeparrot_training - Step 33881: {'lr': 0.00011838052827927359, 'samples': 17347584, 'steps': 33881, 'batch_loss/train': 0.7546853993553668} 12/27/2021 10:16:25 - INFO - codeparrot_training - Step 33882: {'lr': 0.00011836712690037382, 'samples': 17348096, 'steps': 33882, 'batch_loss/train': 0.9199228528887033} 12/27/2021 10:16:36 - INFO - codeparrot_training - Step 33883: {'lr': 0.00011835372604479686, 'samples': 17348608, 'steps': 33883, 'batch_loss/train': 0.9166366397403181} 12/27/2021 10:16:46 - INFO - codeparrot_training - Step 33884: {'lr': 0.00011834032571259567, 'samples': 17349120, 'steps': 33884, 'batch_loss/train': 0.6766468819696456} 12/27/2021 10:16:58 - INFO - codeparrot_training - Step 33885: {'lr': 
0.00011832692590382363, 'samples': 17349632, 'steps': 33885, 'batch_loss/train': 0.7206044507911429} 12/27/2021 10:17:09 - INFO - codeparrot_training - Step 33886: {'lr': 0.00011831352661853404, 'samples': 17350144, 'steps': 33886, 'batch_loss/train': 0.8276585564017296} 12/27/2021 10:17:20 - INFO - codeparrot_training - Step 33887: {'lr': 0.00011830012785678012, 'samples': 17350656, 'steps': 33887, 'batch_loss/train': 0.7397940154187381} 12/27/2021 10:17:30 - INFO - codeparrot_training - Step 33888: {'lr': 0.00011828672961861517, 'samples': 17351168, 'steps': 33888, 'batch_loss/train': 0.73490153811872} 12/27/2021 10:17:42 - INFO - codeparrot_training - Step 33889: {'lr': 0.00011827333190409253, 'samples': 17351680, 'steps': 33889, 'batch_loss/train': 0.6747851818799973} 12/27/2021 10:17:53 - INFO - codeparrot_training - Step 33890: {'lr': 0.00011825993471326521, 'samples': 17352192, 'steps': 33890, 'batch_loss/train': 0.6927500190213323} 12/27/2021 10:18:03 - INFO - codeparrot_training - Step 33891: {'lr': 0.00011824653804618677, 'samples': 17352704, 'steps': 33891, 'batch_loss/train': 0.7725526499561965} 12/27/2021 10:18:17 - INFO - codeparrot_training - Step 33892: {'lr': 0.0001182331419029104, 'samples': 17353216, 'steps': 33892, 'batch_loss/train': 0.7415281827561557} 12/27/2021 10:18:27 - INFO - codeparrot_training - Step 33893: {'lr': 0.00011821974628348911, 'samples': 17353728, 'steps': 33893, 'batch_loss/train': 0.7833922277204692} 12/27/2021 10:18:38 - INFO - codeparrot_training - Step 33894: {'lr': 0.00011820635118797649, 'samples': 17354240, 'steps': 33894, 'batch_loss/train': 0.7974936310201883} 12/27/2021 10:18:50 - INFO - codeparrot_training - Step 33895: {'lr': 0.0001181929566164257, 'samples': 17354752, 'steps': 33895, 'batch_loss/train': 0.7233473001979291} 12/27/2021 10:19:01 - INFO - codeparrot_training - Step 33896: {'lr': 0.00011817956256888986, 'samples': 17355264, 'steps': 33896, 'batch_loss/train': 0.7874010531231761} 12/27/2021 10:19:11 - 
INFO - codeparrot_training - Step 33897: {'lr': 0.00011816616904542235, 'samples': 17355776, 'steps': 33897, 'batch_loss/train': 0.6410708470502868} 12/27/2021 10:19:22 - INFO - codeparrot_training - Step 33898: {'lr': 0.00011815277604607632, 'samples': 17356288, 'steps': 33898, 'batch_loss/train': 0.7191496263258159} 12/27/2021 10:19:34 - INFO - codeparrot_training - Step 33899: {'lr': 0.00011813938357090512, 'samples': 17356800, 'steps': 33899, 'batch_loss/train': 1.3016128400340676} 12/27/2021 10:19:45 - INFO - codeparrot_training - Step 33900: {'lr': 0.00011812599161996191, 'samples': 17357312, 'steps': 33900, 'batch_loss/train': 0.7829555533826351} 12/27/2021 10:19:55 - INFO - codeparrot_training - Step 33901: {'lr': 0.00011811260019329995, 'samples': 17357824, 'steps': 33901, 'batch_loss/train': 0.7795250446069986} 12/27/2021 10:20:07 - INFO - codeparrot_training - Step 33902: {'lr': 0.00011809920929097251, 'samples': 17358336, 'steps': 33902, 'batch_loss/train': 0.7191805299371481} 12/27/2021 10:20:18 - INFO - codeparrot_training - Step 33903: {'lr': 0.00011808581891303282, 'samples': 17358848, 'steps': 33903, 'batch_loss/train': 0.7512554237619042} 12/27/2021 10:20:29 - INFO - codeparrot_training - Step 33904: {'lr': 0.00011807242905953419, 'samples': 17359360, 'steps': 33904, 'batch_loss/train': 0.7373046446591616} 12/27/2021 10:20:42 - INFO - codeparrot_training - Step 33905: {'lr': 0.00011805903973052959, 'samples': 17359872, 'steps': 33905, 'batch_loss/train': 0.7182495696470141} 12/27/2021 10:20:53 - INFO - codeparrot_training - Step 33906: {'lr': 0.00011804565092607259, 'samples': 17360384, 'steps': 33906, 'batch_loss/train': 0.7634653223212808} 12/27/2021 10:21:04 - INFO - codeparrot_training - Step 33907: {'lr': 0.00011803226264621619, 'samples': 17360896, 'steps': 33907, 'batch_loss/train': 0.6868831194005907} 12/27/2021 10:21:14 - INFO - codeparrot_training - Step 33908: {'lr': 0.00011801887489101357, 'samples': 17361408, 'steps': 33908, 
'batch_loss/train': 0.7052247142419219} 12/27/2021 10:21:26 - INFO - codeparrot_training - Step 33909: {'lr': 0.00011800548766051827, 'samples': 17361920, 'steps': 33909, 'batch_loss/train': 0.6337718986906111} 12/27/2021 10:21:37 - INFO - codeparrot_training - Step 33910: {'lr': 0.00011799210095478322, 'samples': 17362432, 'steps': 33910, 'batch_loss/train': 0.7100390347186476} 12/27/2021 10:21:48 - INFO - codeparrot_training - Step 33911: {'lr': 0.00011797871477386174, 'samples': 17362944, 'steps': 33911, 'batch_loss/train': 0.8223404567688704} 12/27/2021 10:22:02 - INFO - codeparrot_training - Step 33912: {'lr': 0.00011796532911780702, 'samples': 17363456, 'steps': 33912, 'batch_loss/train': 0.8393442649394274} 12/27/2021 10:22:12 - INFO - codeparrot_training - Step 33913: {'lr': 0.0001179519439866723, 'samples': 17363968, 'steps': 33913, 'batch_loss/train': 0.6720684689935297} 12/27/2021 10:22:23 - INFO - codeparrot_training - Step 33914: {'lr': 0.00011793855938051079, 'samples': 17364480, 'steps': 33914, 'batch_loss/train': 0.7615920407697558} 12/27/2021 10:22:35 - INFO - codeparrot_training - Step 33915: {'lr': 0.00011792517529937583, 'samples': 17364992, 'steps': 33915, 'batch_loss/train': 0.6845538062043488} 12/27/2021 10:22:46 - INFO - codeparrot_training - Step 33916: {'lr': 0.00011791179174332028, 'samples': 17365504, 'steps': 33916, 'batch_loss/train': 0.6996733634732664} 12/27/2021 10:22:56 - INFO - codeparrot_training - Step 33917: {'lr': 0.00011789840871239768, 'samples': 17366016, 'steps': 33917, 'batch_loss/train': 0.7339916655328125} 12/27/2021 10:23:07 - INFO - codeparrot_training - Step 33918: {'lr': 0.00011788502620666122, 'samples': 17366528, 'steps': 33918, 'batch_loss/train': 0.7256437372416258} 12/27/2021 10:23:19 - INFO - codeparrot_training - Step 33919: {'lr': 0.00011787164422616381, 'samples': 17367040, 'steps': 33919, 'batch_loss/train': 0.7060038587078452} 12/27/2021 10:23:30 - INFO - codeparrot_training - Step 33920: {'lr': 
0.00011785826277095896, 'samples': 17367552, 'steps': 33920, 'batch_loss/train': 0.7543929256498814} 12/27/2021 10:23:40 - INFO - codeparrot_training - Step 33921: {'lr': 0.00011784488184109985, 'samples': 17368064, 'steps': 33921, 'batch_loss/train': 0.5626818460877985} 12/27/2021 10:23:54 - INFO - codeparrot_training - Step 33922: {'lr': 0.00011783150143663952, 'samples': 17368576, 'steps': 33922, 'batch_loss/train': 0.733602155931294} 12/27/2021 10:24:05 - INFO - codeparrot_training - Step 33923: {'lr': 0.00011781812155763119, 'samples': 17369088, 'steps': 33923, 'batch_loss/train': 0.7005573036149144} 12/27/2021 10:24:15 - INFO - codeparrot_training - Step 33924: {'lr': 0.00011780474220412813, 'samples': 17369600, 'steps': 33924, 'batch_loss/train': 0.7866670414805412} 12/27/2021 10:24:27 - INFO - codeparrot_training - Step 33925: {'lr': 0.00011779136337618348, 'samples': 17370112, 'steps': 33925, 'batch_loss/train': 0.8054389599710703} 12/27/2021 10:24:38 - INFO - codeparrot_training - Step 33926: {'lr': 0.00011777798507385049, 'samples': 17370624, 'steps': 33926, 'batch_loss/train': 0.7211424960405566} 12/27/2021 10:24:49 - INFO - codeparrot_training - Step 33927: {'lr': 0.00011776460729718228, 'samples': 17371136, 'steps': 33927, 'batch_loss/train': 0.7673728158697486} 12/27/2021 10:25:03 - INFO - codeparrot_training - Step 33928: {'lr': 0.00011775123004623205, 'samples': 17371648, 'steps': 33928, 'batch_loss/train': 0.772672004532069} 12/27/2021 10:25:13 - INFO - codeparrot_training - Step 33929: {'lr': 0.00011773785332105299, 'samples': 17372160, 'steps': 33929, 'batch_loss/train': 0.7102061463519931} 12/27/2021 10:25:24 - INFO - codeparrot_training - Step 33930: {'lr': 0.00011772447712169831, 'samples': 17372672, 'steps': 33930, 'batch_loss/train': 0.7850354956462979} 12/27/2021 10:25:34 - INFO - codeparrot_training - Step 33931: {'lr': 0.00011771110144822117, 'samples': 17373184, 'steps': 33931, 'batch_loss/train': 0.7022560480982065} 12/27/2021 10:25:46 
- INFO - codeparrot_training - Step 33932: {'lr': 0.00011769772630067479, 'samples': 17373696, 'steps': 33932, 'batch_loss/train': 0.797917996533215} 12/27/2021 10:25:57 - INFO - codeparrot_training - Step 33933: {'lr': 0.00011768435167911223, 'samples': 17374208, 'steps': 33933, 'batch_loss/train': 0.7871909504756331} 12/27/2021 10:26:08 - INFO - codeparrot_training - Step 33934: {'lr': 0.00011767097758358663, 'samples': 17374720, 'steps': 33934, 'batch_loss/train': 0.6714598531834781} 12/27/2021 10:26:20 - INFO - codeparrot_training - Step 33935: {'lr': 0.00011765760401415145, 'samples': 17375232, 'steps': 33935, 'batch_loss/train': 0.7669158419594169} 12/27/2021 10:26:31 - INFO - codeparrot_training - Step 33936: {'lr': 0.00011764423097085957, 'samples': 17375744, 'steps': 33936, 'batch_loss/train': 0.6868281993083656} 12/27/2021 10:26:41 - INFO - codeparrot_training - Step 33937: {'lr': 0.00011763085845376426, 'samples': 17376256, 'steps': 33937, 'batch_loss/train': 0.8229753477498889} 12/27/2021 10:26:54 - INFO - codeparrot_training - Step 33938: {'lr': 0.00011761748646291868, 'samples': 17376768, 'steps': 33938, 'batch_loss/train': 0.7413459187373519} 12/27/2021 10:27:04 - INFO - codeparrot_training - Step 33939: {'lr': 0.00011760411499837598, 'samples': 17377280, 'steps': 33939, 'batch_loss/train': 0.7486117826774716} 12/27/2021 10:27:15 - INFO - codeparrot_training - Step 33940: {'lr': 0.00011759074406018932, 'samples': 17377792, 'steps': 33940, 'batch_loss/train': 0.7283268258906901} 12/27/2021 10:27:25 - INFO - codeparrot_training - Step 33941: {'lr': 0.00011757737364841195, 'samples': 17378304, 'steps': 33941, 'batch_loss/train': 0.9613920226693153} 12/27/2021 10:27:39 - INFO - codeparrot_training - Step 33942: {'lr': 0.00011756400376309678, 'samples': 17378816, 'steps': 33942, 'batch_loss/train': 0.7773984987288713} 12/27/2021 10:27:50 - INFO - codeparrot_training - Step 33943: {'lr': 0.0001175506344042972, 'samples': 17379328, 'steps': 33943, 
'batch_loss/train': 0.8481996795162559} 12/27/2021 10:28:01 - INFO - codeparrot_training - Step 33944: {'lr': 0.0001175372655720664, 'samples': 17379840, 'steps': 33944, 'batch_loss/train': 0.7102371221408248} 12/27/2021 10:28:12 - INFO - codeparrot_training - Step 33945: {'lr': 0.00011752389726645721, 'samples': 17380352, 'steps': 33945, 'batch_loss/train': 0.8669495629146695} 12/27/2021 10:28:23 - INFO - codeparrot_training - Step 33946: {'lr': 0.00011751052948752309, 'samples': 17380864, 'steps': 33946, 'batch_loss/train': 0.7090238514356315} 12/27/2021 10:28:34 - INFO - codeparrot_training - Step 33947: {'lr': 0.00011749716223531714, 'samples': 17381376, 'steps': 33947, 'batch_loss/train': 0.6789812841452658} 12/27/2021 10:28:46 - INFO - codeparrot_training - Step 33948: {'lr': 0.00011748379550989239, 'samples': 17381888, 'steps': 33948, 'batch_loss/train': 0.6758020455017686} 12/27/2021 10:28:57 - INFO - codeparrot_training - Step 33949: {'lr': 0.00011747042931130192, 'samples': 17382400, 'steps': 33949, 'batch_loss/train': 0.7503585736267269} 12/27/2021 10:29:07 - INFO - codeparrot_training - Step 33950: {'lr': 0.00011745706363959915, 'samples': 17382912, 'steps': 33950, 'batch_loss/train': 0.7507020519115031} 12/27/2021 10:29:18 - INFO - codeparrot_training - Step 33951: {'lr': 0.000117443698494837, 'samples': 17383424, 'steps': 33951, 'batch_loss/train': 0.7593994308263063} 12/27/2021 10:29:32 - INFO - codeparrot_training - Step 33952: {'lr': 0.00011743033387706862, 'samples': 17383936, 'steps': 33952, 'batch_loss/train': 0.7704563816078007} 12/27/2021 10:29:42 - INFO - codeparrot_training - Step 33953: {'lr': 0.00011741696978634719, 'samples': 17384448, 'steps': 33953, 'batch_loss/train': 0.7366810720413923} 12/27/2021 10:29:53 - INFO - codeparrot_training - Step 33954: {'lr': 0.00011740360622272583, 'samples': 17384960, 'steps': 33954, 'batch_loss/train': 0.6669680364429951} 12/27/2021 10:30:05 - INFO - codeparrot_training - Step 33955: {'lr': 
0.00011739024318625766, 'samples': 17385472, 'steps': 33955, 'batch_loss/train': 0.7851710920222104} 12/27/2021 10:30:16 - INFO - codeparrot_training - Step 33956: {'lr': 0.00011737688067699582, 'samples': 17385984, 'steps': 33956, 'batch_loss/train': 0.7075346640776843} 12/27/2021 10:30:26 - INFO - codeparrot_training - Step 33957: {'lr': 0.0001173635186949934, 'samples': 17386496, 'steps': 33957, 'batch_loss/train': 0.8174586500972509} 12/27/2021 10:30:38 - INFO - codeparrot_training - Step 33958: {'lr': 0.00011735015724030366, 'samples': 17387008, 'steps': 33958, 'batch_loss/train': 0.7087609805166721} 12/27/2021 10:30:49 - INFO - codeparrot_training - Step 33959: {'lr': 0.0001173367963129795, 'samples': 17387520, 'steps': 33959, 'batch_loss/train': 0.7622820264659822} 12/27/2021 10:31:00 - INFO - codeparrot_training - Step 33960: {'lr': 0.00011732343591307407, 'samples': 17388032, 'steps': 33960, 'batch_loss/train': 0.800880221882835} 12/27/2021 10:31:13 - INFO - codeparrot_training - Step 33961: {'lr': 0.00011731007604064073, 'samples': 17388544, 'steps': 33961, 'batch_loss/train': 0.848696970846504} 12/27/2021 10:31:24 - INFO - codeparrot_training - Step 33962: {'lr': 0.00011729671669573235, 'samples': 17389056, 'steps': 33962, 'batch_loss/train': 0.750405928120017} 12/27/2021 10:31:35 - INFO - codeparrot_training - Step 33963: {'lr': 0.0001172833578784021, 'samples': 17389568, 'steps': 33963, 'batch_loss/train': 0.5911105675622821} 12/27/2021 10:31:45 - INFO - codeparrot_training - Step 33964: {'lr': 0.00011726999958870313, 'samples': 17390080, 'steps': 33964, 'batch_loss/train': 0.7222705786116421} 12/27/2021 10:31:58 - INFO - codeparrot_training - Step 33965: {'lr': 0.00011725664182668852, 'samples': 17390592, 'steps': 33965, 'batch_loss/train': 0.7979919677600265} 12/27/2021 10:32:08 - INFO - codeparrot_training - Step 33966: {'lr': 0.00011724328459241138, 'samples': 17391104, 'steps': 33966, 'batch_loss/train': 0.7453906256705523} 12/27/2021 10:32:19 - 
INFO - codeparrot_training - Step 33967: {'lr': 0.00011722992788592482, 'samples': 17391616, 'steps': 33967, 'batch_loss/train': 0.7594524793094024} 12/27/2021 10:32:33 - INFO - codeparrot_training - Step 33968: {'lr': 0.00011721657170728192, 'samples': 17392128, 'steps': 33968, 'batch_loss/train': 0.6551342783495784} 12/27/2021 10:32:43 - INFO - codeparrot_training - Step 33969: {'lr': 0.0001172032160565358, 'samples': 17392640, 'steps': 33969, 'batch_loss/train': 0.7233587787486613} 12/27/2021 10:32:54 - INFO - codeparrot_training - Step 33970: {'lr': 0.00011718986093373963, 'samples': 17393152, 'steps': 33970, 'batch_loss/train': 0.6930067350622267} 12/27/2021 10:33:06 - INFO - codeparrot_training - Step 33971: {'lr': 0.00011717650633894625, 'samples': 17393664, 'steps': 33971, 'batch_loss/train': 0.7946643801406026} 12/27/2021 10:33:17 - INFO - codeparrot_training - Step 33972: {'lr': 0.00011716315227220903, 'samples': 17394176, 'steps': 33972, 'batch_loss/train': 0.7193469917401671} 12/27/2021 10:33:27 - INFO - codeparrot_training - Step 33973: {'lr': 0.00011714979873358102, 'samples': 17394688, 'steps': 33973, 'batch_loss/train': 0.6912002760218456} 12/27/2021 10:33:38 - INFO - codeparrot_training - Step 33974: {'lr': 0.00011713644572311522, 'samples': 17395200, 'steps': 33974, 'batch_loss/train': 0.788709744811058} 12/27/2021 10:33:50 - INFO - codeparrot_training - Step 33975: {'lr': 0.0001171230932408646, 'samples': 17395712, 'steps': 33975, 'batch_loss/train': 0.7891425378620625} 12/27/2021 10:34:01 - INFO - codeparrot_training - Step 33976: {'lr': 0.00011710974128688259, 'samples': 17396224, 'steps': 33976, 'batch_loss/train': 0.8184848569799215} 12/27/2021 10:34:11 - INFO - codeparrot_training - Step 33977: {'lr': 0.00011709638986122195, 'samples': 17396736, 'steps': 33977, 'batch_loss/train': 0.7153706093668006} 12/27/2021 10:34:25 - INFO - codeparrot_training - Step 33978: {'lr': 0.00011708303896393594, 'samples': 17397248, 'steps': 33978, 
'batch_loss/train': 0.687922241166234} 12/27/2021 10:34:36 - INFO - codeparrot_training - Step 33979: {'lr': 0.00011706968859507752, 'samples': 17397760, 'steps': 33979, 'batch_loss/train': 0.6754727587103844} 12/27/2021 10:34:47 - INFO - codeparrot_training - Step 33980: {'lr': 0.00011705633875469987, 'samples': 17398272, 'steps': 33980, 'batch_loss/train': 0.6927070040255785} 12/27/2021 10:35:00 - INFO - codeparrot_training - Step 33981: {'lr': 0.00011704298944285598, 'samples': 17398784, 'steps': 33981, 'batch_loss/train': 0.7885979991406202} 12/27/2021 10:35:10 - INFO - codeparrot_training - Step 33982: {'lr': 0.00011702964065959897, 'samples': 17399296, 'steps': 33982, 'batch_loss/train': 0.6606742413132451} 12/27/2021 10:35:21 - INFO - codeparrot_training - Step 33983: {'lr': 0.00011701629240498193, 'samples': 17399808, 'steps': 33983, 'batch_loss/train': 0.7343110097572207} 12/27/2021 10:35:32 - INFO - codeparrot_training - Step 33984: {'lr': 0.0001170029446790579, 'samples': 17400320, 'steps': 33984, 'batch_loss/train': 0.7967955060303211} 12/27/2021 10:35:44 - INFO - codeparrot_training - Step 33985: {'lr': 0.00011698959748188001, 'samples': 17400832, 'steps': 33985, 'batch_loss/train': 0.9872081702342257} 12/27/2021 10:35:54 - INFO - codeparrot_training - Step 33986: {'lr': 0.00011697625081350108, 'samples': 17401344, 'steps': 33986, 'batch_loss/train': 0.7210482591763139} 12/27/2021 10:36:05 - INFO - codeparrot_training - Step 33987: {'lr': 0.00011696290467397452, 'samples': 17401856, 'steps': 33987, 'batch_loss/train': 0.6285307538346387} 12/27/2021 10:36:19 - INFO - codeparrot_training - Step 33988: {'lr': 0.00011694955906335314, 'samples': 17402368, 'steps': 33988, 'batch_loss/train': 0.6838546991348267} 12/27/2021 10:36:29 - INFO - codeparrot_training - Step 33989: {'lr': 0.00011693621398169008, 'samples': 17402880, 'steps': 33989, 'batch_loss/train': 0.9002730612410232} 12/27/2021 10:36:40 - INFO - codeparrot_training - Step 33990: {'lr': 
0.0001169228694290384, 'samples': 17403392, 'steps': 33990, 'batch_loss/train': 0.7351534934714437} 12/27/2021 10:36:51 - INFO - codeparrot_training - Step 33991: {'lr': 0.00011690952540545114, 'samples': 17403904, 'steps': 33991, 'batch_loss/train': 0.7654801029711962} 12/27/2021 10:37:03 - INFO - codeparrot_training - Step 33992: {'lr': 0.00011689618191098134, 'samples': 17404416, 'steps': 33992, 'batch_loss/train': 0.7217498235404491} 12/27/2021 10:37:13 - INFO - codeparrot_training - Step 33993: {'lr': 0.00011688283894568205, 'samples': 17404928, 'steps': 33993, 'batch_loss/train': 0.587342000042554} 12/27/2021 10:37:24 - INFO - codeparrot_training - Step 33994: {'lr': 0.00011686949650960635, 'samples': 17405440, 'steps': 33994, 'batch_loss/train': 0.7982340343296528} 12/27/2021 10:37:36 - INFO - codeparrot_training - Step 33995: {'lr': 0.00011685615460280726, 'samples': 17405952, 'steps': 33995, 'batch_loss/train': 0.8518912405706942} 12/27/2021 10:37:47 - INFO - codeparrot_training - Step 33996: {'lr': 0.0001168428132253379, 'samples': 17406464, 'steps': 33996, 'batch_loss/train': 0.7357836728915572} 12/27/2021 10:37:57 - INFO - codeparrot_training - Step 33997: {'lr': 0.00011682947237725109, 'samples': 17406976, 'steps': 33997, 'batch_loss/train': 0.6641278394963592} 12/27/2021 10:38:11 - INFO - codeparrot_training - Step 33998: {'lr': 0.00011681613205860009, 'samples': 17407488, 'steps': 33998, 'batch_loss/train': 0.7284656278789043} 12/27/2021 10:38:22 - INFO - codeparrot_training - Step 33999: {'lr': 0.00011680279226943793, 'samples': 17408000, 'steps': 33999, 'batch_loss/train': 0.7103719851002097} 12/27/2021 10:38:33 - INFO - codeparrot_training - Step 34000: {'lr': 0.00011678945300981752, 'samples': 17408512, 'steps': 34000, 'batch_loss/train': 0.6950329735409468} 12/27/2021 10:38:45 - INFO - codeparrot_training - Step 34001: {'lr': 0.00011677611427979181, 'samples': 17409024, 'steps': 34001, 'batch_loss/train': 0.891688684001565} 12/27/2021 10:38:55 - 
INFO - codeparrot_training - Step 34002: {'lr': 0.00011676277607941418, 'samples': 17409536, 'steps': 34002, 'batch_loss/train': 0.7780493656173348} 12/27/2021 10:39:06 - INFO - codeparrot_training - Step 34003: {'lr': 0.00011674943840873736, 'samples': 17410048, 'steps': 34003, 'batch_loss/train': 0.7646835660561919} 12/27/2021 10:39:17 - INFO - codeparrot_training - Step 34004: {'lr': 0.00011673610126781445, 'samples': 17410560, 'steps': 34004, 'batch_loss/train': 0.7162669335957617} 12/27/2021 10:39:29 - INFO - codeparrot_training - Step 34005: {'lr': 0.00011672276465669849, 'samples': 17411072, 'steps': 34005, 'batch_loss/train': 0.8625990794971585} 12/27/2021 10:39:39 - INFO - codeparrot_training - Step 34006: {'lr': 0.00011670942857544248, 'samples': 17411584, 'steps': 34006, 'batch_loss/train': 0.6837930746842176} 12/27/2021 10:39:50 - INFO - codeparrot_training - Step 34007: {'lr': 0.00011669609302409945, 'samples': 17412096, 'steps': 34007, 'batch_loss/train': 0.7427199208177626} 12/27/2021 10:40:04 - INFO - codeparrot_training - Step 34008: {'lr': 0.00011668275800272243, 'samples': 17412608, 'steps': 34008, 'batch_loss/train': 0.8492109929211438} 12/27/2021 10:40:15 - INFO - codeparrot_training - Step 34009: {'lr': 0.00011666942351136442, 'samples': 17413120, 'steps': 34009, 'batch_loss/train': 0.7412005286896601} 12/27/2021 10:40:25 - INFO - codeparrot_training - Step 34010: {'lr': 0.00011665608955007844, 'samples': 17413632, 'steps': 34010, 'batch_loss/train': 0.7019436815753579} 12/27/2021 10:40:37 - INFO - codeparrot_training - Step 34011: {'lr': 0.00011664275611891761, 'samples': 17414144, 'steps': 34011, 'batch_loss/train': 0.7079215226694942} 12/27/2021 10:40:48 - INFO - codeparrot_training - Step 34012: {'lr': 0.00011662942321793463, 'samples': 17414656, 'steps': 34012, 'batch_loss/train': 0.8478155788034201} 12/27/2021 10:40:59 - INFO - codeparrot_training - Step 34013: {'lr': 0.00011661609084718288, 'samples': 17415168, 'steps': 34013, 
'batch_loss/train': 0.7946594685781747} 12/27/2021 10:41:12 - INFO - codeparrot_training - Step 34014: {'lr': 0.00011660275900671511, 'samples': 17415680, 'steps': 34014, 'batch_loss/train': 0.6448856287170202} 12/27/2021 10:41:23 - INFO - codeparrot_training - Step 34015: {'lr': 0.0001165894276965844, 'samples': 17416192, 'steps': 34015, 'batch_loss/train': 0.7045399914495647} 12/27/2021 10:41:33 - INFO - codeparrot_training - Step 34016: {'lr': 0.00011657609691684373, 'samples': 17416704, 'steps': 34016, 'batch_loss/train': 0.7438076832331717} 12/27/2021 10:41:44 - INFO - codeparrot_training - Step 34017: {'lr': 0.00011656276666754614, 'samples': 17417216, 'steps': 34017, 'batch_loss/train': 0.6864403053186834} 12/27/2021 10:41:56 - INFO - codeparrot_training - Step 34018: {'lr': 0.00011654943694874459, 'samples': 17417728, 'steps': 34018, 'batch_loss/train': 0.6939795592334121} 12/27/2021 10:42:07 - INFO - codeparrot_training - Step 34019: {'lr': 0.00011653610776049212, 'samples': 17418240, 'steps': 34019, 'batch_loss/train': 0.6796134472824633} 12/27/2021 10:42:17 - INFO - codeparrot_training - Step 34020: {'lr': 0.00011652277910284167, 'samples': 17418752, 'steps': 34020, 'batch_loss/train': 0.7701851362362504} 12/27/2021 10:42:29 - INFO - codeparrot_training - Step 34021: {'lr': 0.00011650945097584626, 'samples': 17419264, 'steps': 34021, 'batch_loss/train': 0.6826286753639579} 12/27/2021 10:42:40 - INFO - codeparrot_training - Step 34022: {'lr': 0.00011649612337955895, 'samples': 17419776, 'steps': 34022, 'batch_loss/train': 0.7985323127359152} 12/27/2021 10:42:51 - INFO - codeparrot_training - Step 34023: {'lr': 0.00011648279631403247, 'samples': 17420288, 'steps': 34023, 'batch_loss/train': 0.7045121910050511} 12/27/2021 10:43:03 - INFO - codeparrot_training - Step 34024: {'lr': 0.00011646946977932004, 'samples': 17420800, 'steps': 34024, 'batch_loss/train': 0.6477133170701563} 12/27/2021 10:43:14 - INFO - codeparrot_training - Step 34025: {'lr': 
0.00011645614377547473, 'samples': 17421312, 'steps': 34025, 'batch_loss/train': 0.8404611065634526} 12/27/2021 10:43:24 - INFO - codeparrot_training - Step 34026: {'lr': 0.00011644281830254924, 'samples': 17421824, 'steps': 34026, 'batch_loss/train': 0.5357027471181937} 12/27/2021 10:43:38 - INFO - codeparrot_training - Step 34027: {'lr': 0.00011642949336059658, 'samples': 17422336, 'steps': 34027, 'batch_loss/train': 0.7905428754165769} 12/27/2021 10:43:49 - INFO - codeparrot_training - Step 34028: {'lr': 0.00011641616894967, 'samples': 17422848, 'steps': 34028, 'batch_loss/train': 0.7851754324510694} 12/27/2021 10:43:59 - INFO - codeparrot_training - Step 34029: {'lr': 0.00011640284506982221, 'samples': 17423360, 'steps': 34029, 'batch_loss/train': 0.6347846928983927} 12/27/2021 10:44:10 - INFO - codeparrot_training - Step 34030: {'lr': 0.00011638952172110618, 'samples': 17423872, 'steps': 34030, 'batch_loss/train': 0.6489845728501678} 12/27/2021 10:44:23 - INFO - codeparrot_training - Step 34031: {'lr': 0.00011637619890357515, 'samples': 17424384, 'steps': 34031, 'batch_loss/train': 0.7026735870167613} 12/27/2021 10:44:33 - INFO - codeparrot_training - Step 34032: {'lr': 0.00011636287661728178, 'samples': 17424896, 'steps': 34032, 'batch_loss/train': 0.7287194977980107} 12/27/2021 10:44:44 - INFO - codeparrot_training - Step 34033: {'lr': 0.00011634955486227916, 'samples': 17425408, 'steps': 34033, 'batch_loss/train': 3.8185869557783008} 12/27/2021 10:44:56 - INFO - codeparrot_training - Step 34034: {'lr': 0.00011633623363862026, 'samples': 17425920, 'steps': 34034, 'batch_loss/train': 0.7006557397544384} 12/27/2021 10:45:07 - INFO - codeparrot_training - Step 34035: {'lr': 0.00011632291294635803, 'samples': 17426432, 'steps': 34035, 'batch_loss/train': 0.8409981327131391} 12/27/2021 10:45:17 - INFO - codeparrot_training - Step 34036: {'lr': 0.0001163095927855454, 'samples': 17426944, 'steps': 34036, 'batch_loss/train': 0.6176186651282478} 12/27/2021 10:45:28 - 
INFO - codeparrot_training - Step 34037: {'lr': 0.00011629627315623545, 'samples': 17427456, 'steps': 34037, 'batch_loss/train': 0.768959793727845} 12/27/2021 10:45:42 - INFO - codeparrot_training - Step 34038: {'lr': 0.00011628295405848085, 'samples': 17427968, 'steps': 34038, 'batch_loss/train': 1.5425396435894072} 12/27/2021 10:45:53 - INFO - codeparrot_training - Step 34039: {'lr': 0.00011626963549233486, 'samples': 17428480, 'steps': 34039, 'batch_loss/train': 0.749444596702233} 12/27/2021 10:46:03 - INFO - codeparrot_training - Step 34040: {'lr': 0.00011625631745785034, 'samples': 17428992, 'steps': 34040, 'batch_loss/train': 0.7672365512698889} 12/27/2021 10:46:15 - INFO - codeparrot_training - Step 34041: {'lr': 0.00011624299995508006, 'samples': 17429504, 'steps': 34041, 'batch_loss/train': 0.7537202453240752} 12/27/2021 10:46:26 - INFO - codeparrot_training - Step 34042: {'lr': 0.00011622968298407727, 'samples': 17430016, 'steps': 34042, 'batch_loss/train': 0.8466940997168422} 12/27/2021 10:46:37 - INFO - codeparrot_training - Step 34043: {'lr': 0.00011621636654489467, 'samples': 17430528, 'steps': 34043, 'batch_loss/train': 0.7830693982541561} 12/27/2021 10:46:51 - INFO - codeparrot_training - Step 34044: {'lr': 0.0001162030506375853, 'samples': 17431040, 'steps': 34044, 'batch_loss/train': 0.7016975954174995} 12/27/2021 10:47:01 - INFO - codeparrot_training - Step 34045: {'lr': 0.00011618973526220203, 'samples': 17431552, 'steps': 34045, 'batch_loss/train': 0.9429620271548629} 12/27/2021 10:47:12 - INFO - codeparrot_training - Step 34046: {'lr': 0.00011617642041879789, 'samples': 17432064, 'steps': 34046, 'batch_loss/train': 0.620185422943905} 12/27/2021 10:47:22 - INFO - codeparrot_training - Step 34047: {'lr': 0.00011616310610742575, 'samples': 17432576, 'steps': 34047, 'batch_loss/train': 0.6970572697464377} 12/27/2021 10:47:35 - INFO - codeparrot_training - Step 34048: {'lr': 0.00011614979232813869, 'samples': 17433088, 'steps': 34048, 
'batch_loss/train': 1.0168441776186228} 12/27/2021 10:47:45 - INFO - codeparrot_training - Step 34049: {'lr': 0.00011613647908098929, 'samples': 17433600, 'steps': 34049, 'batch_loss/train': 0.7742562331259251} 12/27/2021 10:47:56 - INFO - codeparrot_training - Step 34050: {'lr': 0.00011612316636603081, 'samples': 17434112, 'steps': 34050, 'batch_loss/train': 0.6762372492812574} 12/27/2021 10:48:08 - INFO - codeparrot_training - Step 34051: {'lr': 0.00011610985418331618, 'samples': 17434624, 'steps': 34051, 'batch_loss/train': 0.7212521960027516} 12/27/2021 10:48:19 - INFO - codeparrot_training - Step 34052: {'lr': 0.00011609654253289812, 'samples': 17435136, 'steps': 34052, 'batch_loss/train': 0.8058416731655598} 12/27/2021 10:48:29 - INFO - codeparrot_training - Step 34053: {'lr': 0.00011608323141482951, 'samples': 17435648, 'steps': 34053, 'batch_loss/train': 0.7505662012845278} 12/27/2021 10:48:43 - INFO - codeparrot_training - Step 34054: {'lr': 0.00011606992082916366, 'samples': 17436160, 'steps': 34054, 'batch_loss/train': 0.7710698493756354} 12/27/2021 10:48:54 - INFO - codeparrot_training - Step 34055: {'lr': 0.00011605661077595309, 'samples': 17436672, 'steps': 34055, 'batch_loss/train': 0.6428927350789309} 12/27/2021 10:49:04 - INFO - codeparrot_training - Step 34056: {'lr': 0.00011604330125525078, 'samples': 17437184, 'steps': 34056, 'batch_loss/train': 0.6913951493334025} 12/27/2021 10:49:15 - INFO - codeparrot_training - Step 34057: {'lr': 0.00011602999226710992, 'samples': 17437696, 'steps': 34057, 'batch_loss/train': 0.6246449770405889} 12/27/2021 10:49:27 - INFO - codeparrot_training - Step 34058: {'lr': 0.00011601668381158314, 'samples': 17438208, 'steps': 34058, 'batch_loss/train': 0.738254239782691} 12/27/2021 10:49:38 - INFO - codeparrot_training - Step 34059: {'lr': 0.00011600337588872342, 'samples': 17438720, 'steps': 34059, 'batch_loss/train': 0.69523951318115} 12/27/2021 10:49:48 - INFO - codeparrot_training - Step 34060: {'lr': 
0.00011599006849858368, 'samples': 17439232, 'steps': 34060, 'batch_loss/train': 0.7241844646632671} 12/27/2021 10:50:00 - INFO - codeparrot_training - Step 34061: {'lr': 0.00011597676164121684, 'samples': 17439744, 'steps': 34061, 'batch_loss/train': 0.7167152781039476} 12/27/2021 10:50:11 - INFO - codeparrot_training - Step 34062: {'lr': 0.00011596345531667579, 'samples': 17440256, 'steps': 34062, 'batch_loss/train': 0.6693869954906404} 12/27/2021 10:50:22 - INFO - codeparrot_training - Step 34063: {'lr': 0.0001159501495250135, 'samples': 17440768, 'steps': 34063, 'batch_loss/train': 0.6398297927808017} 12/27/2021 10:50:35 - INFO - codeparrot_training - Step 34064: {'lr': 0.00011593684426628265, 'samples': 17441280, 'steps': 34064, 'batch_loss/train': 0.7835142421536148} 12/27/2021 10:50:46 - INFO - codeparrot_training - Step 34065: {'lr': 0.00011592353954053636, 'samples': 17441792, 'steps': 34065, 'batch_loss/train': 0.6750458390451968} 12/27/2021 10:50:57 - INFO - codeparrot_training - Step 34066: {'lr': 0.00011591023534782755, 'samples': 17442304, 'steps': 34066, 'batch_loss/train': 0.6785541279241443} 12/27/2021 10:51:07 - INFO - codeparrot_training - Step 34067: {'lr': 0.00011589693168820883, 'samples': 17442816, 'steps': 34067, 'batch_loss/train': 0.799151404760778} 12/27/2021 10:51:20 - INFO - codeparrot_training - Step 34068: {'lr': 0.00011588362856173345, 'samples': 17443328, 'steps': 34068, 'batch_loss/train': 0.8241443475708365} 12/27/2021 10:51:30 - INFO - codeparrot_training - Step 34069: {'lr': 0.00011587032596845404, 'samples': 17443840, 'steps': 34069, 'batch_loss/train': 0.7643985785543919} 12/27/2021 10:51:41 - INFO - codeparrot_training - Step 34070: {'lr': 0.00011585702390842357, 'samples': 17444352, 'steps': 34070, 'batch_loss/train': 0.7539286087267101} 12/27/2021 10:51:53 - INFO - codeparrot_training - Step 34071: {'lr': 0.00011584372238169492, 'samples': 17444864, 'steps': 34071, 'batch_loss/train': 0.7585209039971232} 12/27/2021 10:52:03 
- INFO - codeparrot_training - Step 34072: {'lr': 0.00011583042138832095, 'samples': 17445376, 'steps': 34072, 'batch_loss/train': 0.7145013621775433} 12/27/2021 10:52:14 - INFO - codeparrot_training - Step 34073: {'lr': 0.00011581712092835459, 'samples': 17445888, 'steps': 34073, 'batch_loss/train': 0.8834557766094804} 12/27/2021 10:52:28 - INFO - codeparrot_training - Step 34074: {'lr': 0.00011580382100184869, 'samples': 17446400, 'steps': 34074, 'batch_loss/train': 0.7500525254290551} 12/27/2021 10:52:39 - INFO - codeparrot_training - Step 34075: {'lr': 0.00011579052160885611, 'samples': 17446912, 'steps': 34075, 'batch_loss/train': 0.8010377120226622} 12/27/2021 10:52:49 - INFO - codeparrot_training - Step 34076: {'lr': 0.00011577722274942975, 'samples': 17447424, 'steps': 34076, 'batch_loss/train': 0.7085711876861751} 12/27/2021 10:53:01 - INFO - codeparrot_training - Step 34077: {'lr': 0.00011576392442362258, 'samples': 17447936, 'steps': 34077, 'batch_loss/train': 0.6766306962817907} 12/27/2021 10:53:12 - INFO - codeparrot_training - Step 34078: {'lr': 0.00011575062663148717, 'samples': 17448448, 'steps': 34078, 'batch_loss/train': 0.7510093324817717} 12/27/2021 10:53:22 - INFO - codeparrot_training - Step 34079: {'lr': 0.00011573732937307666, 'samples': 17448960, 'steps': 34079, 'batch_loss/train': 0.6373726318124682} 12/27/2021 10:53:33 - INFO - codeparrot_training - Step 34080: {'lr': 0.00011572403264844392, 'samples': 17449472, 'steps': 34080, 'batch_loss/train': 0.7862180219963193} 12/27/2021 10:53:45 - INFO - codeparrot_training - Step 34081: {'lr': 0.00011571073645764166, 'samples': 17449984, 'steps': 34081, 'batch_loss/train': 0.8032226336654276} 12/27/2021 10:53:56 - INFO - codeparrot_training - Step 34082: {'lr': 0.00011569744080072269, 'samples': 17450496, 'steps': 34082, 'batch_loss/train': 0.6559518533758819} 12/27/2021 10:54:07 - INFO - codeparrot_training - Step 34083: {'lr': 0.00011568414567774016, 'samples': 17451008, 'steps': 34083, 
'batch_loss/train': 0.64466368034482} 12/27/2021 10:54:21 - INFO - codeparrot_training - Step 34084: {'lr': 0.0001156708510887467, 'samples': 17451520, 'steps': 34084, 'batch_loss/train': 0.5303924125619233} 12/27/2021 10:54:32 - INFO - codeparrot_training - Step 34085: {'lr': 0.00011565755703379518, 'samples': 17452032, 'steps': 34085, 'batch_loss/train': 0.724460612051189} 12/27/2021 10:54:42 - INFO - codeparrot_training - Step 34086: {'lr': 0.0001156442635129385, 'samples': 17452544, 'steps': 34086, 'batch_loss/train': 0.7471142441499978} 12/27/2021 10:54:54 - INFO - codeparrot_training - Step 34087: {'lr': 0.00011563097052622948, 'samples': 17453056, 'steps': 34087, 'batch_loss/train': 0.7464078534394503} 12/27/2021 10:55:05 - INFO - codeparrot_training - Step 34088: {'lr': 0.00011561767807372101, 'samples': 17453568, 'steps': 34088, 'batch_loss/train': 0.7987651182338595} 12/27/2021 10:55:16 - INFO - codeparrot_training - Step 34089: {'lr': 0.000115604386155466, 'samples': 17454080, 'steps': 34089, 'batch_loss/train': 0.8034741566516459} 12/27/2021 10:55:26 - INFO - codeparrot_training - Step 34090: {'lr': 0.000115591094771517, 'samples': 17454592, 'steps': 34090, 'batch_loss/train': 0.7554118791595101} 12/27/2021 10:55:38 - INFO - codeparrot_training - Step 34091: {'lr': 0.00011557780392192718, 'samples': 17455104, 'steps': 34091, 'batch_loss/train': 1.1995115717872977} 12/27/2021 10:55:49 - INFO - codeparrot_training - Step 34092: {'lr': 0.00011556451360674933, 'samples': 17455616, 'steps': 34092, 'batch_loss/train': 0.9452953017316759} 12/27/2021 10:55:59 - INFO - codeparrot_training - Step 34093: {'lr': 0.00011555122382603603, 'samples': 17456128, 'steps': 34093, 'batch_loss/train': 0.6530362332705408} 12/27/2021 10:56:13 - INFO - codeparrot_training - Step 34094: {'lr': 0.00011553793457984047, 'samples': 17456640, 'steps': 34094, 'batch_loss/train': 0.6286806010175496} 12/27/2021 10:56:24 - INFO - codeparrot_training - Step 34095: {'lr': 
0.00011552464586821523, 'samples': 17457152, 'steps': 34095, 'batch_loss/train': 0.7318586227484047} 12/27/2021 10:56:35 - INFO - codeparrot_training - Step 34096: {'lr': 0.00011551135769121324, 'samples': 17457664, 'steps': 34096, 'batch_loss/train': 0.757426178897731} 12/27/2021 10:56:47 - INFO - codeparrot_training - Step 34097: {'lr': 0.00011549807004888729, 'samples': 17458176, 'steps': 34097, 'batch_loss/train': 0.7515984772471711} 12/27/2021 10:56:57 - INFO - codeparrot_training - Step 34098: {'lr': 0.00011548478294129022, 'samples': 17458688, 'steps': 34098, 'batch_loss/train': 0.7327195247635245} 12/27/2021 10:57:08 - INFO - codeparrot_training - Step 34099: {'lr': 0.00011547149636847487, 'samples': 17459200, 'steps': 34099, 'batch_loss/train': 0.6837903345003724} 12/27/2021 10:57:19 - INFO - codeparrot_training - Step 34100: {'lr': 0.00011545821033049406, 'samples': 17459712, 'steps': 34100, 'batch_loss/train': 0.5985040664672852} 12/27/2021 10:57:32 - INFO - codeparrot_training - Step 34101: {'lr': 0.0001154449248274006, 'samples': 17460224, 'steps': 34101, 'batch_loss/train': 0.7965546608902514} 12/27/2021 10:57:43 - INFO - codeparrot_training - Step 34102: {'lr': 0.00011543163985924732, 'samples': 17460736, 'steps': 34102, 'batch_loss/train': 0.6632373770698905} 12/27/2021 10:57:54 - INFO - codeparrot_training - Step 34103: {'lr': 0.0001154183554260871, 'samples': 17461248, 'steps': 34103, 'batch_loss/train': 0.749190763104707} 12/27/2021 10:58:06 - INFO - codeparrot_training - Step 34104: {'lr': 0.00011540507152797252, 'samples': 17461760, 'steps': 34104, 'batch_loss/train': 0.7891435930505395} 12/27/2021 10:58:17 - INFO - codeparrot_training - Step 34105: {'lr': 0.00011539178816495666, 'samples': 17462272, 'steps': 34105, 'batch_loss/train': 0.6484754444099963} 12/27/2021 10:58:27 - INFO - codeparrot_training - Step 34106: {'lr': 0.0001153785053370923, 'samples': 17462784, 'steps': 34106, 'batch_loss/train': 0.7225831577088684} 12/27/2021 10:58:40 - 
INFO - codeparrot_training - Step 34107: {'lr': 0.00011536522304443209, 'samples': 17463296, 'steps': 34107, 'batch_loss/train': 0.6040927392896265} 12/27/2021 10:58:50 - INFO - codeparrot_training - Step 34108: {'lr': 0.0001153519412870288, 'samples': 17463808, 'steps': 34108, 'batch_loss/train': 0.6024424296338111} 12/27/2021 10:59:01 - INFO - codeparrot_training - Step 34109: {'lr': 0.00011533866006493554, 'samples': 17464320, 'steps': 34109, 'batch_loss/train': 0.6685375338420272} 12/27/2021 10:59:13 - INFO - codeparrot_training - Step 34110: {'lr': 0.00011532537937820483, 'samples': 17464832, 'steps': 34110, 'batch_loss/train': 0.7001435258425772} 12/27/2021 10:59:24 - INFO - codeparrot_training - Step 34111: {'lr': 0.00011531209922688957, 'samples': 17465344, 'steps': 34111, 'batch_loss/train': 0.6720527051948011} 12/27/2021 10:59:34 - INFO - codeparrot_training - Step 34112: {'lr': 0.00011529881961104255, 'samples': 17465856, 'steps': 34112, 'batch_loss/train': 0.8056582100689411} 12/27/2021 10:59:45 - INFO - codeparrot_training - Step 34113: {'lr': 0.00011528554053071655, 'samples': 17466368, 'steps': 34113, 'batch_loss/train': 0.6890071184607223} 12/27/2021 10:59:59 - INFO - codeparrot_training - Step 34114: {'lr': 0.00011527226198596438, 'samples': 17466880, 'steps': 34114, 'batch_loss/train': 0.7270772454794496} 12/27/2021 11:00:10 - INFO - codeparrot_training - Step 34115: {'lr': 0.00011525898397683884, 'samples': 17467392, 'steps': 34115, 'batch_loss/train': 0.5990047635277733} 12/27/2021 11:00:20 - INFO - codeparrot_training - Step 34116: {'lr': 0.0001152457065033927, 'samples': 17467904, 'steps': 34116, 'batch_loss/train': 0.8100906203035265} 12/27/2021 11:00:32 - INFO - codeparrot_training - Step 34117: {'lr': 0.00011523242956567872, 'samples': 17468416, 'steps': 34117, 'batch_loss/train': 0.8666781121864915} 12/27/2021 11:00:43 - INFO - codeparrot_training - Step 34118: {'lr': 0.00011521915316374984, 'samples': 17468928, 'steps': 34118, 
'batch_loss/train': 0.7412851462140679} 12/27/2021 11:00:54 - INFO - codeparrot_training - Step 34119: {'lr': 0.0001152058772976585, 'samples': 17469440, 'steps': 34119, 'batch_loss/train': 0.8074859455227852} 12/27/2021 11:01:04 - INFO - codeparrot_training - Step 34120: {'lr': 0.00011519260196745782, 'samples': 17469952, 'steps': 34120, 'batch_loss/train': 0.5904555935412645} 12/27/2021 11:01:17 - INFO - codeparrot_training - Step 34121: {'lr': 0.00011517932717320051, 'samples': 17470464, 'steps': 34121, 'batch_loss/train': 0.7013932447880507} 12/27/2021 11:01:28 - INFO - codeparrot_training - Step 34122: {'lr': 0.00011516605291493922, 'samples': 17470976, 'steps': 34122, 'batch_loss/train': 0.6893071912927553} 12/27/2021 11:01:38 - INFO - codeparrot_training - Step 34123: {'lr': 0.00011515277919272679, 'samples': 17471488, 'steps': 34123, 'batch_loss/train': 0.7225416097790003} 12/27/2021 11:01:52 - INFO - codeparrot_training - Step 34124: {'lr': 0.00011513950600661599, 'samples': 17472000, 'steps': 34124, 'batch_loss/train': 0.7478470755741} 12/27/2021 11:02:02 - INFO - codeparrot_training - Step 34125: {'lr': 0.00011512623335665962, 'samples': 17472512, 'steps': 34125, 'batch_loss/train': 0.7100564575521275} 12/27/2021 11:02:13 - INFO - codeparrot_training - Step 34126: {'lr': 0.00011511296124291037, 'samples': 17473024, 'steps': 34126, 'batch_loss/train': 0.7554535032249987} 12/27/2021 11:02:25 - INFO - codeparrot_training - Step 34127: {'lr': 0.00011509968966542109, 'samples': 17473536, 'steps': 34127, 'batch_loss/train': 0.677711671218276} 12/27/2021 11:02:36 - INFO - codeparrot_training - Step 34128: {'lr': 0.00011508641862424448, 'samples': 17474048, 'steps': 34128, 'batch_loss/train': 0.8194803167134523} 12/27/2021 11:02:47 - INFO - codeparrot_training - Step 34129: {'lr': 0.00011507314811943345, 'samples': 17474560, 'steps': 34129, 'batch_loss/train': 0.7069338648580015} 12/27/2021 11:02:57 - INFO - codeparrot_training - Step 34130: {'lr': 
0.00011505987815104043, 'samples': 17475072, 'steps': 34130, 'batch_loss/train': 0.5293915896909311} 12/27/2021 11:03:11 - INFO - codeparrot_training - Step 34131: {'lr': 0.00011504660871911851, 'samples': 17475584, 'steps': 34131, 'batch_loss/train': 0.7369211328914389} 12/27/2021 11:03:22 - INFO - codeparrot_training - Step 34132: {'lr': 0.00011503333982372039, 'samples': 17476096, 'steps': 34132, 'batch_loss/train': 0.6853235382586718} 12/27/2021 11:03:32 - INFO - codeparrot_training - Step 34133: {'lr': 0.00011502007146489862, 'samples': 17476608, 'steps': 34133, 'batch_loss/train': 0.6820362580474466} 12/27/2021 11:03:44 - INFO - codeparrot_training - Step 34134: {'lr': 0.00011500680364270604, 'samples': 17477120, 'steps': 34134, 'batch_loss/train': 0.7398380287922919} 12/27/2021 11:03:55 - INFO - codeparrot_training - Step 34135: {'lr': 0.00011499353635719558, 'samples': 17477632, 'steps': 34135, 'batch_loss/train': 0.7334660943597555} 12/27/2021 11:04:06 - INFO - codeparrot_training - Step 34136: {'lr': 0.00011498026960841979, 'samples': 17478144, 'steps': 34136, 'batch_loss/train': 0.8028545144479722} 12/27/2021 11:04:18 - INFO - codeparrot_training - Step 34137: {'lr': 0.00011496700339643146, 'samples': 17478656, 'steps': 34137, 'batch_loss/train': 0.743475039023906} 12/27/2021 11:04:28 - INFO - codeparrot_training - Step 34138: {'lr': 0.00011495373772128331, 'samples': 17479168, 'steps': 34138, 'batch_loss/train': 0.7813604921102524} 12/27/2021 11:04:39 - INFO - codeparrot_training - Step 34139: {'lr': 0.00011494047258302811, 'samples': 17479680, 'steps': 34139, 'batch_loss/train': 0.6531749879941344} 12/27/2021 11:04:53 - INFO - codeparrot_training - Step 34140: {'lr': 0.00011492720798171863, 'samples': 17480192, 'steps': 34140, 'batch_loss/train': 0.7405784642323852} 12/27/2021 11:05:04 - INFO - codeparrot_training - Step 34141: {'lr': 0.00011491394391740754, 'samples': 17480704, 'steps': 34141, 'batch_loss/train': 0.7282584281638265} 12/27/2021 
11:05:14 - INFO - codeparrot_training - Step 34142: {'lr': 0.00011490068039014761, 'samples': 17481216, 'steps': 34142, 'batch_loss/train': 0.6356704826466739} 12/27/2021 11:05:25 - INFO - codeparrot_training - Step 34143: {'lr': 0.00011488741739999156, 'samples': 17481728, 'steps': 34143, 'batch_loss/train': 0.7769743297249079} 12/27/2021 11:05:37 - INFO - codeparrot_training - Step 34144: {'lr': 0.00011487415494699222, 'samples': 17482240, 'steps': 34144, 'batch_loss/train': 0.7986161783337593} 12/27/2021 11:05:47 - INFO - codeparrot_training - Step 34145: {'lr': 0.00011486089303120204, 'samples': 17482752, 'steps': 34145, 'batch_loss/train': 0.7736176820471883} 12/27/2021 11:05:58 - INFO - codeparrot_training - Step 34146: {'lr': 0.000114847631652674, 'samples': 17483264, 'steps': 34146, 'batch_loss/train': 0.7290492253378034} 12/27/2021 11:06:10 - INFO - codeparrot_training - Step 34147: {'lr': 0.00011483437081146083, 'samples': 17483776, 'steps': 34147, 'batch_loss/train': 0.730656516738236} 12/27/2021 11:06:21 - INFO - codeparrot_training - Step 34148: {'lr': 0.0001148211105076151, 'samples': 17484288, 'steps': 34148, 'batch_loss/train': 0.7522284319857135} 12/27/2021 11:06:31 - INFO - codeparrot_training - Step 34149: {'lr': 0.00011480785074118957, 'samples': 17484800, 'steps': 34149, 'batch_loss/train': 0.7432182114571333} 12/27/2021 11:06:43 - INFO - codeparrot_training - Step 34150: {'lr': 0.000114794591512237, 'samples': 17485312, 'steps': 34150, 'batch_loss/train': 0.883565803989768} 12/27/2021 11:06:54 - INFO - codeparrot_training - Step 34151: {'lr': 0.00011478133282081008, 'samples': 17485824, 'steps': 34151, 'batch_loss/train': 0.767848608084023} 12/27/2021 11:07:05 - INFO - codeparrot_training - Step 34152: {'lr': 0.00011476807466696152, 'samples': 17486336, 'steps': 34152, 'batch_loss/train': 0.5912116440013051} 12/27/2021 11:07:15 - INFO - codeparrot_training - Step 34153: {'lr': 0.000114754817050744, 'samples': 17486848, 'steps': 34153, 
'batch_loss/train': 0.679541896097362} 12/27/2021 11:07:29 - INFO - codeparrot_training - Step 34154: {'lr': 0.0001147415599722103, 'samples': 17487360, 'steps': 34154, 'batch_loss/train': 0.6786409839987755} 12/27/2021 11:07:40 - INFO - codeparrot_training - Step 34155: {'lr': 0.00011472830343141308, 'samples': 17487872, 'steps': 34155, 'batch_loss/train': 0.6783469845540822} 12/27/2021 11:07:50 - INFO - codeparrot_training - Step 34156: {'lr': 0.00011471504742840505, 'samples': 17488384, 'steps': 34156, 'batch_loss/train': 0.7053794573293999} 12/27/2021 11:08:03 - INFO - codeparrot_training - Step 34157: {'lr': 0.00011470179196323891, 'samples': 17488896, 'steps': 34157, 'batch_loss/train': 0.7515245652757585} 12/27/2021 11:08:13 - INFO - codeparrot_training - Step 34158: {'lr': 0.00011468853703596744, 'samples': 17489408, 'steps': 34158, 'batch_loss/train': 0.7324486749712378} 12/27/2021 11:08:24 - INFO - codeparrot_training - Step 34159: {'lr': 0.00011467528264664315, 'samples': 17489920, 'steps': 34159, 'batch_loss/train': 0.6761658139294013} 12/27/2021 11:08:38 - INFO - codeparrot_training - Step 34160: {'lr': 0.00011466202879531876, 'samples': 17490432, 'steps': 34160, 'batch_loss/train': 0.6783292600885034} 12/27/2021 11:08:48 - INFO - codeparrot_training - Step 34161: {'lr': 0.00011464877548204725, 'samples': 17490944, 'steps': 34161, 'batch_loss/train': 0.6695960476063192} 12/27/2021 11:08:59 - INFO - codeparrot_training - Step 34162: {'lr': 0.000114635522706881, 'samples': 17491456, 'steps': 34162, 'batch_loss/train': 0.7698893349152058} 12/27/2021 11:09:11 - INFO - codeparrot_training - Step 34163: {'lr': 0.00011462227046987278, 'samples': 17491968, 'steps': 34163, 'batch_loss/train': 0.7224205336533487} 12/27/2021 11:09:22 - INFO - codeparrot_training - Step 34164: {'lr': 0.0001146090187710753, 'samples': 17492480, 'steps': 34164, 'batch_loss/train': 0.8087456054054201} 12/27/2021 11:09:32 - INFO - codeparrot_training - Step 34165: {'lr': 
0.00011459576761054125, 'samples': 17492992, 'steps': 34165, 'batch_loss/train': 0.7408385425806046} 12/27/2021 11:09:43 - INFO - codeparrot_training - Step 34166: {'lr': 0.00011458251698832331, 'samples': 17493504, 'steps': 34166, 'batch_loss/train': 0.9245522208511829} 12/27/2021 11:09:55 - INFO - codeparrot_training - Step 34167: {'lr': 0.00011456926690447414, 'samples': 17494016, 'steps': 34167, 'batch_loss/train': 0.7119072787463665} 12/27/2021 11:10:06 - INFO - codeparrot_training - Step 34168: {'lr': 0.00011455601735904642, 'samples': 17494528, 'steps': 34168, 'batch_loss/train': 0.7562544983811677} 12/27/2021 11:10:16 - INFO - codeparrot_training - Step 34169: {'lr': 0.00011454276835209285, 'samples': 17495040, 'steps': 34169, 'batch_loss/train': 0.6933785579167306} 12/27/2021 11:10:30 - INFO - codeparrot_training - Step 34170: {'lr': 0.00011452951988366616, 'samples': 17495552, 'steps': 34170, 'batch_loss/train': 0.7831014040857553} 12/27/2021 11:10:41 - INFO - codeparrot_training - Step 34171: {'lr': 0.00011451627195381876, 'samples': 17496064, 'steps': 34171, 'batch_loss/train': 0.7004914632998407} 12/27/2021 11:10:52 - INFO - codeparrot_training - Step 34172: {'lr': 0.00011450302456260362, 'samples': 17496576, 'steps': 34172, 'batch_loss/train': 0.8465925073251128} 12/27/2021 11:11:04 - INFO - codeparrot_training - Step 34173: {'lr': 0.00011448977771007338, 'samples': 17497088, 'steps': 34173, 'batch_loss/train': 0.8448006212711334} 12/27/2021 11:11:14 - INFO - codeparrot_training - Step 34174: {'lr': 0.0001144765313962805, 'samples': 17497600, 'steps': 34174, 'batch_loss/train': 0.7371525522321463} 12/27/2021 11:11:25 - INFO - codeparrot_training - Step 34175: {'lr': 0.00011446328562127777, 'samples': 17498112, 'steps': 34175, 'batch_loss/train': 0.8814127680379897} 12/27/2021 11:11:36 - INFO - codeparrot_training - Step 34176: {'lr': 0.00011445004038511786, 'samples': 17498624, 'steps': 34176, 'batch_loss/train': 0.7188597107306123} 12/27/2021 
11:11:50 - INFO - codeparrot_training - Step 34177: {'lr': 0.0001144367956878534, 'samples': 17499136, 'steps': 34177, 'batch_loss/train': 0.5878034085035324} 12/27/2021 11:12:00 - INFO - codeparrot_training - Step 34178: {'lr': 0.00011442355152953707, 'samples': 17499648, 'steps': 34178, 'batch_loss/train': 0.6557378848083317} 12/27/2021 11:12:11 - INFO - codeparrot_training - Step 34179: {'lr': 0.00011441030791022147, 'samples': 17500160, 'steps': 34179, 'batch_loss/train': 0.6698999437503517} 12/27/2021 11:12:23 - INFO - codeparrot_training - Step 34180: {'lr': 0.00011439706482995932, 'samples': 17500672, 'steps': 34180, 'batch_loss/train': 0.8959422181360424} 12/27/2021 11:12:34 - INFO - codeparrot_training - Step 34181: {'lr': 0.00011438382228880323, 'samples': 17501184, 'steps': 34181, 'batch_loss/train': 0.6928019057959318} 12/27/2021 11:12:44 - INFO - codeparrot_training - Step 34182: {'lr': 0.00011437058028680583, 'samples': 17501696, 'steps': 34182, 'batch_loss/train': 0.7927770176902413} 12/27/2021 11:12:56 - INFO - codeparrot_training - Step 34183: {'lr': 0.00011435733882401983, 'samples': 17502208, 'steps': 34183, 'batch_loss/train': 0.750440550968051} 12/27/2021 11:13:07 - INFO - codeparrot_training - Step 34184: {'lr': 0.0001143440979004979, 'samples': 17502720, 'steps': 34184, 'batch_loss/train': 0.6828184258192778} 12/27/2021 11:13:18 - INFO - codeparrot_training - Step 34185: {'lr': 0.00011433085751629252, 'samples': 17503232, 'steps': 34185, 'batch_loss/train': 0.6799761429429054} 12/27/2021 11:13:28 - INFO - codeparrot_training - Step 34186: {'lr': 0.00011431761767145635, 'samples': 17503744, 'steps': 34186, 'batch_loss/train': 0.7260796898044646} 12/27/2021 11:13:40 - INFO - codeparrot_training - Step 34187: {'lr': 0.0001143043783660423, 'samples': 17504256, 'steps': 34187, 'batch_loss/train': 0.7872015759348869} 12/27/2021 11:13:51 - INFO - codeparrot_training - Step 34188: {'lr': 0.00011429113960010268, 'samples': 17504768, 'steps': 34188, 
'batch_loss/train': 0.744773029233329} 12/27/2021 11:14:02 - INFO - codeparrot_training - Step 34189: {'lr': 0.00011427790137369016, 'samples': 17505280, 'steps': 34189, 'batch_loss/train': 0.5792866225820035} 12/27/2021 11:14:16 - INFO - codeparrot_training - Step 34190: {'lr': 0.00011426466368685764, 'samples': 17505792, 'steps': 34190, 'batch_loss/train': 0.7183326035737991} 12/27/2021 11:14:26 - INFO - codeparrot_training - Step 34191: {'lr': 0.0001142514265396575, 'samples': 17506304, 'steps': 34191, 'batch_loss/train': 0.7398158283904195} 12/27/2021 11:14:37 - INFO - codeparrot_training - Step 34192: {'lr': 0.00011423818993214241, 'samples': 17506816, 'steps': 34192, 'batch_loss/train': 0.6478135637007654} 12/27/2021 11:14:49 - INFO - codeparrot_training - Step 34193: {'lr': 0.00011422495386436501, 'samples': 17507328, 'steps': 34193, 'batch_loss/train': 0.8427016760688275} 12/27/2021 11:15:00 - INFO - codeparrot_training - Step 34194: {'lr': 0.00011421171833637795, 'samples': 17507840, 'steps': 34194, 'batch_loss/train': 0.7183151165954769} 12/27/2021 11:15:10 - INFO - codeparrot_training - Step 34195: {'lr': 0.0001141984833482338, 'samples': 17508352, 'steps': 34195, 'batch_loss/train': 0.771980288438499} 12/27/2021 11:15:21 - INFO - codeparrot_training - Step 34196: {'lr': 0.0001141852488999853, 'samples': 17508864, 'steps': 34196, 'batch_loss/train': 0.6981086507439613} 12/27/2021 11:15:33 - INFO - codeparrot_training - Step 34197: {'lr': 0.0001141720149916848, 'samples': 17509376, 'steps': 34197, 'batch_loss/train': 0.7382856942713261} 12/27/2021 11:15:44 - INFO - codeparrot_training - Step 34198: {'lr': 0.00011415878162338517, 'samples': 17509888, 'steps': 34198, 'batch_loss/train': 0.7437757849693298} 12/27/2021 11:15:54 - INFO - codeparrot_training - Step 34199: {'lr': 0.000114145548795139, 'samples': 17510400, 'steps': 34199, 'batch_loss/train': 0.733231820166111} 12/27/2021 11:16:08 - INFO - codeparrot_training - Step 34200: {'lr': 
0.00011413231650699876, 'samples': 17510912, 'steps': 34200, 'batch_loss/train': 0.6818895298056304} 12/27/2021 11:16:19 - INFO - codeparrot_training - Step 34201: {'lr': 0.00011411908475901706, 'samples': 17511424, 'steps': 34201, 'batch_loss/train': 0.748251766897738} 12/27/2021 11:16:29 - INFO - codeparrot_training - Step 34202: {'lr': 0.00011410585355124672, 'samples': 17511936, 'steps': 34202, 'batch_loss/train': 0.6051657168427482} 12/27/2021 11:16:41 - INFO - codeparrot_training - Step 34203: {'lr': 0.00011409262288374012, 'samples': 17512448, 'steps': 34203, 'batch_loss/train': 0.7640947988256812} 12/27/2021 11:16:52 - INFO - codeparrot_training - Step 34204: {'lr': 0.00011407939275654992, 'samples': 17512960, 'steps': 34204, 'batch_loss/train': 0.649901551194489} 12/27/2021 11:17:02 - INFO - codeparrot_training - Step 34205: {'lr': 0.00011406616316972873, 'samples': 17513472, 'steps': 34205, 'batch_loss/train': 0.8517696764320135} 12/27/2021 11:17:13 - INFO - codeparrot_training - Step 34206: {'lr': 0.00011405293412332918, 'samples': 17513984, 'steps': 34206, 'batch_loss/train': 0.7165851769968867} 12/27/2021 11:17:25 - INFO - codeparrot_training - Step 34207: {'lr': 0.00011403970561740379, 'samples': 17514496, 'steps': 34207, 'batch_loss/train': 0.7229187451303005} 12/27/2021 11:17:36 - INFO - codeparrot_training - Step 34208: {'lr': 0.00011402647765200521, 'samples': 17515008, 'steps': 34208, 'batch_loss/train': 0.666474896017462} 12/27/2021 11:17:46 - INFO - codeparrot_training - Step 34209: {'lr': 0.00011401325022718604, 'samples': 17515520, 'steps': 34209, 'batch_loss/train': 0.8010671420488507} 12/27/2021 11:18:00 - INFO - codeparrot_training - Step 34210: {'lr': 0.0001140000233429988, 'samples': 17516032, 'steps': 34210, 'batch_loss/train': 0.68050591298379} 12/27/2021 11:18:11 - INFO - codeparrot_training - Step 34211: {'lr': 0.00011398679699949624, 'samples': 17516544, 'steps': 34211, 'batch_loss/train': 0.7941946610808372} 12/27/2021 11:18:22 - 
INFO - codeparrot_training - Step 34212: {'lr': 0.00011397357119673064, 'samples': 17517056, 'steps': 34212, 'batch_loss/train': 1.0401432907674462} 12/27/2021 11:18:34 - INFO - codeparrot_training - Step 34213: {'lr': 0.00011396034593475491, 'samples': 17517568, 'steps': 34213, 'batch_loss/train': 0.7652652421966195} 12/27/2021 11:18:44 - INFO - codeparrot_training - Step 34214: {'lr': 0.00011394712121362141, 'samples': 17518080, 'steps': 34214, 'batch_loss/train': 0.7339724994963035} 12/27/2021 11:18:55 - INFO - codeparrot_training - Step 34215: {'lr': 0.00011393389703338267, 'samples': 17518592, 'steps': 34215, 'batch_loss/train': 0.7653038017451763} 12/27/2021 11:19:06 - INFO - codeparrot_training - Step 34216: {'lr': 0.0001139206733940916, 'samples': 17519104, 'steps': 34216, 'batch_loss/train': 0.7387474542483687} 12/27/2021 11:19:18 - INFO - codeparrot_training - Step 34217: {'lr': 0.00011390745029580041, 'samples': 17519616, 'steps': 34217, 'batch_loss/train': 0.709418663289398} 12/27/2021 11:19:28 - INFO - codeparrot_training - Step 34218: {'lr': 0.00011389422773856185, 'samples': 17520128, 'steps': 34218, 'batch_loss/train': 0.7468811124563217} 12/27/2021 11:19:39 - INFO - codeparrot_training - Step 34219: {'lr': 0.00011388100572242843, 'samples': 17520640, 'steps': 34219, 'batch_loss/train': 0.6763556562364101} 12/27/2021 11:19:51 - INFO - codeparrot_training - Step 34220: {'lr': 0.00011386778424745273, 'samples': 17521152, 'steps': 34220, 'batch_loss/train': 0.629079110280145} 12/27/2021 11:20:02 - INFO - codeparrot_training - Step 34221: {'lr': 0.00011385456331368735, 'samples': 17521664, 'steps': 34221, 'batch_loss/train': 0.6938898812513798} 12/27/2021 11:20:12 - INFO - codeparrot_training - Step 34222: {'lr': 0.0001138413429211849, 'samples': 17522176, 'steps': 34222, 'batch_loss/train': 0.8101713814539835} 12/27/2021 11:20:26 - INFO - codeparrot_training - Step 34223: {'lr': 0.00011382812306999766, 'samples': 17522688, 'steps': 34223, 
'batch_loss/train': 0.7609236589632928} 12/27/2021 11:20:37 - INFO - codeparrot_training - Step 34224: {'lr': 0.00011381490376017848, 'samples': 17523200, 'steps': 34224, 'batch_loss/train': 0.7095018485561013} 12/27/2021 11:20:47 - INFO - codeparrot_training - Step 34225: {'lr': 0.0001138016849917799, 'samples': 17523712, 'steps': 34225, 'batch_loss/train': 0.7353389002382755} 12/27/2021 11:20:59 - INFO - codeparrot_training - Step 34226: {'lr': 0.00011378846676485424, 'samples': 17524224, 'steps': 34226, 'batch_loss/train': 0.7183030070737004} 12/27/2021 11:21:10 - INFO - codeparrot_training - Step 34227: {'lr': 0.00011377524907945427, 'samples': 17524736, 'steps': 34227, 'batch_loss/train': 0.7962778450455517} 12/27/2021 11:21:21 - INFO - codeparrot_training - Step 34228: {'lr': 0.00011376203193563259, 'samples': 17525248, 'steps': 34228, 'batch_loss/train': 0.7613193690776825} 12/27/2021 11:21:31 - INFO - codeparrot_training - Step 34229: {'lr': 0.00011374881533344153, 'samples': 17525760, 'steps': 34229, 'batch_loss/train': 0.773185177706182} 12/27/2021 11:21:45 - INFO - codeparrot_training - Step 34230: {'lr': 0.00011373559927293372, 'samples': 17526272, 'steps': 34230, 'batch_loss/train': 0.7732184855267406} 12/27/2021 11:21:56 - INFO - codeparrot_training - Step 34231: {'lr': 0.00011372238375416171, 'samples': 17526784, 'steps': 34231, 'batch_loss/train': 0.7263234523124993} 12/27/2021 11:22:06 - INFO - codeparrot_training - Step 34232: {'lr': 0.00011370916877717807, 'samples': 17527296, 'steps': 34232, 'batch_loss/train': 0.8017815737985075} 12/27/2021 11:22:18 - INFO - codeparrot_training - Step 34233: {'lr': 0.0001136959543420353, 'samples': 17527808, 'steps': 34233, 'batch_loss/train': 0.7072553765028715} 12/27/2021 11:22:29 - INFO - codeparrot_training - Step 34234: {'lr': 0.00011368274044878596, 'samples': 17528320, 'steps': 34234, 'batch_loss/train': 0.7025429038330913} 12/27/2021 11:22:40 - INFO - codeparrot_training - Step 34235: {'lr': 
0.00011366952709748258, 'samples': 17528832, 'steps': 34235, 'batch_loss/train': 0.691580637358129} 12/27/2021 11:22:52 - INFO - codeparrot_training - Step 34236: {'lr': 0.00011365631428817766, 'samples': 17529344, 'steps': 34236, 'batch_loss/train': 0.746270983479917} 12/27/2021 11:23:02 - INFO - codeparrot_training - Step 34237: {'lr': 0.00011364310202092388, 'samples': 17529856, 'steps': 34237, 'batch_loss/train': 0.7660455945879221} 12/27/2021 11:23:13 - INFO - codeparrot_training - Step 34238: {'lr': 0.00011362989029577342, 'samples': 17530368, 'steps': 34238, 'batch_loss/train': 0.8220439720898867} 12/27/2021 11:23:23 - INFO - codeparrot_training - Step 34239: {'lr': 0.0001136166791127792, 'samples': 17530880, 'steps': 34239, 'batch_loss/train': 0.7162083038128912} 12/27/2021 11:23:37 - INFO - codeparrot_training - Step 34240: {'lr': 0.00011360346847199351, 'samples': 17531392, 'steps': 34240, 'batch_loss/train': 0.801388094201684} 12/27/2021 11:23:48 - INFO - codeparrot_training - Step 34241: {'lr': 0.00011359025837346881, 'samples': 17531904, 'steps': 34241, 'batch_loss/train': 0.7389375395141542} 12/27/2021 11:23:59 - INFO - codeparrot_training - Step 34242: {'lr': 0.00011357704881725795, 'samples': 17532416, 'steps': 34242, 'batch_loss/train': 0.6832262742100284} 12/27/2021 11:24:11 - INFO - codeparrot_training - Step 34243: {'lr': 0.00011356383980341311, 'samples': 17532928, 'steps': 34243, 'batch_loss/train': 0.7592129176482558} 12/27/2021 11:24:21 - INFO - codeparrot_training - Step 34244: {'lr': 0.00011355063133198696, 'samples': 17533440, 'steps': 34244, 'batch_loss/train': 1.0256860284134746} 12/27/2021 11:24:32 - INFO - codeparrot_training - Step 34245: {'lr': 0.00011353742340303195, 'samples': 17533952, 'steps': 34245, 'batch_loss/train': 0.7123698126524687} 12/27/2021 11:24:46 - INFO - codeparrot_training - Step 34246: {'lr': 0.00011352421601660062, 'samples': 17534464, 'steps': 34246, 'batch_loss/train': 0.7021027030423284} 12/27/2021 11:24:57 - 
INFO - codeparrot_training - Step 34247: {'lr': 0.00011351100917274548, 'samples': 17534976, 'steps': 34247, 'batch_loss/train': 0.7512540183961391} 12/27/2021 11:25:07 - INFO - codeparrot_training - Step 34248: {'lr': 0.00011349780287151912, 'samples': 17535488, 'steps': 34248, 'batch_loss/train': 0.7674631271511316} 12/27/2021 11:25:18 - INFO - codeparrot_training - Step 34249: {'lr': 0.0001134845971129738, 'samples': 17536000, 'steps': 34249, 'batch_loss/train': 0.7042293939739466} 12/27/2021 11:25:30 - INFO - codeparrot_training - Step 34250: {'lr': 0.00011347139189716224, 'samples': 17536512, 'steps': 34250, 'batch_loss/train': 0.8865181002765894} 12/27/2021 11:25:40 - INFO - codeparrot_training - Step 34251: {'lr': 0.00011345818722413695, 'samples': 17537024, 'steps': 34251, 'batch_loss/train': 0.7750534350052476} 12/27/2021 11:25:51 - INFO - codeparrot_training - Step 34252: {'lr': 0.0001134449830939502, 'samples': 17537536, 'steps': 34252, 'batch_loss/train': 0.9061699472367764} 12/27/2021 11:26:03 - INFO - codeparrot_training - Step 34253: {'lr': 0.0001134317795066547, 'samples': 17538048, 'steps': 34253, 'batch_loss/train': 0.7493865140713751} 12/27/2021 11:26:14 - INFO - codeparrot_training - Step 34254: {'lr': 0.000113418576462303, 'samples': 17538560, 'steps': 34254, 'batch_loss/train': 0.7392968963831663} 12/27/2021 11:26:25 - INFO - codeparrot_training - Step 34255: {'lr': 0.00011340537396094736, 'samples': 17539072, 'steps': 34255, 'batch_loss/train': 0.6077948492020369} 12/27/2021 11:26:37 - INFO - codeparrot_training - Step 34256: {'lr': 0.0001133921720026403, 'samples': 17539584, 'steps': 34256, 'batch_loss/train': 0.7980314125306904} 12/27/2021 11:26:48 - INFO - codeparrot_training - Step 34257: {'lr': 0.00011337897058743454, 'samples': 17540096, 'steps': 34257, 'batch_loss/train': 0.7031440599821508} 12/27/2021 11:26:58 - INFO - codeparrot_training - Step 34258: {'lr': 0.00011336576971538234, 'samples': 17540608, 'steps': 34258, 
'batch_loss/train': 0.6950552803464234} 12/27/2021 11:27:09 - INFO - codeparrot_training - Step 34259: {'lr': 0.00011335256938653623, 'samples': 17541120, 'steps': 34259, 'batch_loss/train': 0.7233436778187752} 12/27/2021 11:27:23 - INFO - codeparrot_training - Step 34260: {'lr': 0.00011333936960094873, 'samples': 17541632, 'steps': 34260, 'batch_loss/train': 0.750749321654439} 12/27/2021 11:27:34 - INFO - codeparrot_training - Step 34261: {'lr': 0.00011332617035867227, 'samples': 17542144, 'steps': 34261, 'batch_loss/train': 0.7337096249684691} 12/27/2021 11:27:44 - INFO - codeparrot_training - Step 34262: {'lr': 0.00011331297165975937, 'samples': 17542656, 'steps': 34262, 'batch_loss/train': 0.724211361259222} 12/27/2021 11:27:57 - INFO - codeparrot_training - Step 34263: {'lr': 0.00011329977350426246, 'samples': 17543168, 'steps': 34263, 'batch_loss/train': 0.7837884314358234} 12/27/2021 11:28:08 - INFO - codeparrot_training - Step 34264: {'lr': 0.00011328657589223407, 'samples': 17543680, 'steps': 34264, 'batch_loss/train': 0.8133101034909487} 12/27/2021 11:28:18 - INFO - codeparrot_training - Step 34265: {'lr': 0.00011327337882372671, 'samples': 17544192, 'steps': 34265, 'batch_loss/train': 0.6319944243878126} 12/27/2021 11:28:31 - INFO - codeparrot_training - Step 34266: {'lr': 0.00011326018229879268, 'samples': 17544704, 'steps': 34266, 'batch_loss/train': 0.6481868864502758} 12/27/2021 11:28:41 - INFO - codeparrot_training - Step 34267: {'lr': 0.00011324698631748443, 'samples': 17545216, 'steps': 34267, 'batch_loss/train': 0.6877959053963423} 12/27/2021 11:28:52 - INFO - codeparrot_training - Step 34268: {'lr': 0.00011323379087985472, 'samples': 17545728, 'steps': 34268, 'batch_loss/train': 0.8886834809090942} 12/27/2021 11:29:02 - INFO - codeparrot_training - Step 34269: {'lr': 0.00011322059598595571, 'samples': 17546240, 'steps': 34269, 'batch_loss/train': 0.8890545177273452} 12/27/2021 11:29:16 - INFO - codeparrot_training - Step 34270: {'lr': 
0.00011320740163583999, 'samples': 17546752, 'steps': 34270, 'batch_loss/train': 0.7706716293469071} 12/27/2021 11:29:27 - INFO - codeparrot_training - Step 34271: {'lr': 0.00011319420782955997, 'samples': 17547264, 'steps': 34271, 'batch_loss/train': 0.8373215962201357} 12/27/2021 11:29:37 - INFO - codeparrot_training - Step 34272: {'lr': 0.00011318101456716812, 'samples': 17547776, 'steps': 34272, 'batch_loss/train': 0.7156528965570033} 12/27/2021 11:29:50 - INFO - codeparrot_training - Step 34273: {'lr': 0.00011316782184871691, 'samples': 17548288, 'steps': 34273, 'batch_loss/train': 0.7909353990107775} 12/27/2021 11:30:00 - INFO - codeparrot_training - Step 34274: {'lr': 0.00011315462967425885, 'samples': 17548800, 'steps': 34274, 'batch_loss/train': 0.680889516021125} 12/27/2021 11:30:11 - INFO - codeparrot_training - Step 34275: {'lr': 0.00011314143804384613, 'samples': 17549312, 'steps': 34275, 'batch_loss/train': 0.8044012892059982} 12/27/2021 11:30:24 - INFO - codeparrot_training - Step 34276: {'lr': 0.00011312824695753148, 'samples': 17549824, 'steps': 34276, 'batch_loss/train': 0.9348953613080084} 12/27/2021 11:30:35 - INFO - codeparrot_training - Step 34277: {'lr': 0.00011311505641536735, 'samples': 17550336, 'steps': 34277, 'batch_loss/train': 0.7773891240358353} 12/27/2021 11:30:45 - INFO - codeparrot_training - Step 34278: {'lr': 0.00011310186641740585, 'samples': 17550848, 'steps': 34278, 'batch_loss/train': 0.7877403264865279} 12/27/2021 11:30:56 - INFO - codeparrot_training - Step 34279: {'lr': 0.00011308867696369976, 'samples': 17551360, 'steps': 34279, 'batch_loss/train': 0.8105154274962842} 12/27/2021 11:31:08 - INFO - codeparrot_training - Step 34280: {'lr': 0.00011307548805430145, 'samples': 17551872, 'steps': 34280, 'batch_loss/train': 0.690525321289897} 12/27/2021 11:31:19 - INFO - codeparrot_training - Step 34281: {'lr': 0.00011306229968926324, 'samples': 17552384, 'steps': 34281, 'batch_loss/train': 0.8020460130646825} 12/27/2021 11:31:29 
- INFO - codeparrot_training - Step 34282: {'lr': 0.00011304911186863753, 'samples': 17552896, 'steps': 34282, 'batch_loss/train': 0.6424347786232829} 12/27/2021 11:31:42 - INFO - codeparrot_training - Step 34283: {'lr': 0.000113035924592477, 'samples': 17553408, 'steps': 34283, 'batch_loss/train': 0.6697959741577506} 12/27/2021 11:31:52 - INFO - codeparrot_training - Step 34284: {'lr': 0.00011302273786083384, 'samples': 17553920, 'steps': 34284, 'batch_loss/train': 0.7417545805219561} 12/27/2021 11:32:03 - INFO - codeparrot_training - Step 34285: {'lr': 0.00011300955167376056, 'samples': 17554432, 'steps': 34285, 'batch_loss/train': 0.5827548510860652} 12/27/2021 11:32:16 - INFO - codeparrot_training - Step 34286: {'lr': 0.00011299636603130955, 'samples': 17554944, 'steps': 34286, 'batch_loss/train': 0.6574595173588023} 12/27/2021 11:32:27 - INFO - codeparrot_training - Step 34287: {'lr': 0.00011298318093353329, 'samples': 17555456, 'steps': 34287, 'batch_loss/train': 0.7790904324501753} 12/27/2021 11:32:37 - INFO - codeparrot_training - Step 34288: {'lr': 0.00011296999638048416, 'samples': 17555968, 'steps': 34288, 'batch_loss/train': 0.7258553802967072} 12/27/2021 11:32:48 - INFO - codeparrot_training - Step 34289: {'lr': 0.00011295681237221459, 'samples': 17556480, 'steps': 34289, 'batch_loss/train': 0.9141276073642075} 12/27/2021 11:33:00 - INFO - codeparrot_training - Step 34290: {'lr': 0.00011294362890877696, 'samples': 17556992, 'steps': 34290, 'batch_loss/train': 0.7146477841306478} 12/27/2021 11:33:11 - INFO - codeparrot_training - Step 34291: {'lr': 0.00011293044599022373, 'samples': 17557504, 'steps': 34291, 'batch_loss/train': 0.6752990956883878} 12/27/2021 11:33:21 - INFO - codeparrot_training - Step 34292: {'lr': 0.00011291726361660737, 'samples': 17558016, 'steps': 34292, 'batch_loss/train': 0.802982933819294} 12/27/2021 11:33:34 - INFO - codeparrot_training - Step 34293: {'lr': 0.00011290408178798004, 'samples': 17558528, 'steps': 34293, 
'batch_loss/train': 0.7696929699741304} 12/27/2021 11:33:44 - INFO - codeparrot_training - Step 34294: {'lr': 0.0001128909005043945, 'samples': 17559040, 'steps': 34294, 'batch_loss/train': 0.7466912129893899} 12/27/2021 11:33:55 - INFO - codeparrot_training - Step 34295: {'lr': 0.00011287771976590288, 'samples': 17559552, 'steps': 34295, 'batch_loss/train': 0.7174420608207583} 12/27/2021 11:34:07 - INFO - codeparrot_training - Step 34296: {'lr': 0.00011286453957255765, 'samples': 17560064, 'steps': 34296, 'batch_loss/train': 0.6929365303367376} 12/27/2021 11:34:18 - INFO - codeparrot_training - Step 34297: {'lr': 0.00011285135992441126, 'samples': 17560576, 'steps': 34297, 'batch_loss/train': 0.6484455280005932} 12/27/2021 11:34:28 - INFO - codeparrot_training - Step 34298: {'lr': 0.00011283818082151606, 'samples': 17561088, 'steps': 34298, 'batch_loss/train': 0.7272919262759387} 12/27/2021 11:34:39 - INFO - codeparrot_training - Step 34299: {'lr': 0.00011282500226392444, 'samples': 17561600, 'steps': 34299, 'batch_loss/train': 0.7679263697937131} 12/27/2021 11:34:54 - INFO - codeparrot_training - Step 34300: {'lr': 0.00011281182425168885, 'samples': 17562112, 'steps': 34300, 'batch_loss/train': 0.8113942793570459} 12/27/2021 11:35:04 - INFO - codeparrot_training - Step 34301: {'lr': 0.00011279864678486166, 'samples': 17562624, 'steps': 34301, 'batch_loss/train': 0.7705360271502286} 12/27/2021 11:35:15 - INFO - codeparrot_training - Step 34302: {'lr': 0.0001127854698634952, 'samples': 17563136, 'steps': 34302, 'batch_loss/train': 0.5000527070369571} 12/27/2021 11:35:27 - INFO - codeparrot_training - Step 34303: {'lr': 0.00011277229348764201, 'samples': 17563648, 'steps': 34303, 'batch_loss/train': 0.8017270509153605} 12/27/2021 11:35:38 - INFO - codeparrot_training - Step 34304: {'lr': 0.00011275911765735419, 'samples': 17564160, 'steps': 34304, 'batch_loss/train': 0.8739062175154686} 12/27/2021 11:35:48 - INFO - codeparrot_training - Step 34305: {'lr': 
0.0001127459423726844, 'samples': 17564672, 'steps': 34305, 'batch_loss/train': 0.817676137201488} 12/27/2021 11:36:03 - INFO - codeparrot_training - Step 34306: {'lr': 0.000112732767633685, 'samples': 17565184, 'steps': 34306, 'batch_loss/train': 0.7090346370823681} 12/27/2021 11:36:13 - INFO - codeparrot_training - Step 34307: {'lr': 0.00011271959344040819, 'samples': 17565696, 'steps': 34307, 'batch_loss/train': 0.7414208292029798} 12/27/2021 11:36:24 - INFO - codeparrot_training - Step 34308: {'lr': 0.00011270641979290636, 'samples': 17566208, 'steps': 34308, 'batch_loss/train': 0.9199577393010259} 12/27/2021 11:36:35 - INFO - codeparrot_training - Step 34309: {'lr': 0.00011269324669123213, 'samples': 17566720, 'steps': 34309, 'batch_loss/train': 0.9640889174770564} 12/27/2021 11:36:47 - INFO - codeparrot_training - Step 34310: {'lr': 0.00011268007413543763, 'samples': 17567232, 'steps': 34310, 'batch_loss/train': 0.758087648806395} 12/27/2021 11:36:58 - INFO - codeparrot_training - Step 34311: {'lr': 0.00011266690212557529, 'samples': 17567744, 'steps': 34311, 'batch_loss/train': 0.770239194855094} 12/27/2021 11:37:08 - INFO - codeparrot_training - Step 34312: {'lr': 0.00011265373066169751, 'samples': 17568256, 'steps': 34312, 'batch_loss/train': 0.8807571018114686} 12/27/2021 11:37:20 - INFO - codeparrot_training - Step 34313: {'lr': 0.00011264055974385662, 'samples': 17568768, 'steps': 34313, 'batch_loss/train': 0.8860008837655187} 12/27/2021 11:37:31 - INFO - codeparrot_training - Step 34314: {'lr': 0.00011262738937210501, 'samples': 17569280, 'steps': 34314, 'batch_loss/train': 0.7254746733233333} 12/27/2021 11:37:42 - INFO - codeparrot_training - Step 34315: {'lr': 0.00011261421954649503, 'samples': 17569792, 'steps': 34315, 'batch_loss/train': 0.7355977473780513} 12/27/2021 11:37:52 - INFO - codeparrot_training - Step 34316: {'lr': 0.00011260105026707903, 'samples': 17570304, 'steps': 34316, 'batch_loss/train': 0.7975166300311685} 12/27/2021 11:38:05 - 
INFO - codeparrot_training - Step 34317: {'lr': 0.00011258788153390937, 'samples': 17570816, 'steps': 34317, 'batch_loss/train': 0.7928054986987263} 12/27/2021 11:38:16 - INFO - codeparrot_training - Step 34318: {'lr': 0.00011257471334703853, 'samples': 17571328, 'steps': 34318, 'batch_loss/train': 0.6796704423613846} 12/27/2021 11:38:26 - INFO - codeparrot_training - Step 34319: {'lr': 0.00011256154570651853, 'samples': 17571840, 'steps': 34319, 'batch_loss/train': 0.7399089655373245} 12/27/2021 11:38:39 - INFO - codeparrot_training - Step 34320: {'lr': 0.00011254837861240211, 'samples': 17572352, 'steps': 34320, 'batch_loss/train': 0.7709735427051783} 12/27/2021 11:38:49 - INFO - codeparrot_training - Step 34321: {'lr': 0.00011253521206474134, 'samples': 17572864, 'steps': 34321, 'batch_loss/train': 0.8186138449236751} 12/27/2021 11:39:00 - INFO - codeparrot_training - Step 34322: {'lr': 0.00011252204606358871, 'samples': 17573376, 'steps': 34322, 'batch_loss/train': 0.8018608000129461} 12/27/2021 11:39:12 - INFO - codeparrot_training - Step 34323: {'lr': 0.00011250888060899648, 'samples': 17573888, 'steps': 34323, 'batch_loss/train': 0.8894385118037462} 12/27/2021 11:39:22 - INFO - codeparrot_training - Step 34324: {'lr': 0.00011249571570101702, 'samples': 17574400, 'steps': 34324, 'batch_loss/train': 0.7122834362089634} 12/27/2021 11:39:33 - INFO - codeparrot_training - Step 34325: {'lr': 0.00011248255133970268, 'samples': 17574912, 'steps': 34325, 'batch_loss/train': 0.811144913546741} 12/27/2021 11:39:46 - INFO - codeparrot_training - Step 34326: {'lr': 0.00011246938752510579, 'samples': 17575424, 'steps': 34326, 'batch_loss/train': 0.7488575209863484} 12/27/2021 11:39:57 - INFO - codeparrot_training - Step 34327: {'lr': 0.00011245622425727869, 'samples': 17575936, 'steps': 34327, 'batch_loss/train': 0.7794361463747919} 12/27/2021 11:40:08 - INFO - codeparrot_training - Step 34328: {'lr': 0.00011244306153627371, 'samples': 17576448, 'steps': 34328, 
'batch_loss/train': 1.502660074736923} 12/27/2021 11:40:18 - INFO - codeparrot_training - Step 34329: {'lr': 0.00011242989936214326, 'samples': 17576960, 'steps': 34329, 'batch_loss/train': 0.7087533511221409} 12/27/2021 11:40:30 - INFO - codeparrot_training - Step 34330: {'lr': 0.00011241673773493941, 'samples': 17577472, 'steps': 34330, 'batch_loss/train': 0.8931074952706695} 12/27/2021 11:40:41 - INFO - codeparrot_training - Step 34331: {'lr': 0.00011240357665471476, 'samples': 17577984, 'steps': 34331, 'batch_loss/train': 0.7553574610501528} 12/27/2021 11:40:52 - INFO - codeparrot_training - Step 34332: {'lr': 0.00011239041612152163, 'samples': 17578496, 'steps': 34332, 'batch_loss/train': 0.8043943420052528} 12/27/2021 11:41:06 - INFO - codeparrot_training - Step 34333: {'lr': 0.00011237725613541214, 'samples': 17579008, 'steps': 34333, 'batch_loss/train': 0.7183165708556771} 12/27/2021 11:41:16 - INFO - codeparrot_training - Step 34334: {'lr': 0.00011236409669643863, 'samples': 17579520, 'steps': 34334, 'batch_loss/train': 0.801636653020978} 12/27/2021 11:41:27 - INFO - codeparrot_training - Step 34335: {'lr': 0.00011235093780465367, 'samples': 17580032, 'steps': 34335, 'batch_loss/train': 0.7795010069385171} 12/27/2021 11:41:37 - INFO - codeparrot_training - Step 34336: {'lr': 0.00011233777946010934, 'samples': 17580544, 'steps': 34336, 'batch_loss/train': 0.7775899348780513} 12/27/2021 11:41:49 - INFO - codeparrot_training - Step 34337: {'lr': 0.0001123246216628579, 'samples': 17581056, 'steps': 34337, 'batch_loss/train': 0.8022198399994522} 12/27/2021 11:42:00 - INFO - codeparrot_training - Step 34338: {'lr': 0.00011231146441295199, 'samples': 17581568, 'steps': 34338, 'batch_loss/train': 0.6595314033329487} 12/27/2021 11:42:11 - INFO - codeparrot_training - Step 34339: {'lr': 0.00011229830771044361, 'samples': 17582080, 'steps': 34339, 'batch_loss/train': 0.7214246008079499} 12/27/2021 11:42:23 - INFO - codeparrot_training - Step 34340: {'lr': 
0.00011228515155538516, 'samples': 17582592, 'steps': 34340, 'batch_loss/train': 0.8421722147613764} 12/27/2021 11:42:33 - INFO - codeparrot_training - Step 34341: {'lr': 0.00011227199594782894, 'samples': 17583104, 'steps': 34341, 'batch_loss/train': 0.7499485625885427} 12/27/2021 11:42:44 - INFO - codeparrot_training - Step 34342: {'lr': 0.00011225884088782726, 'samples': 17583616, 'steps': 34342, 'batch_loss/train': 0.870111744850874} 12/27/2021 11:42:56 - INFO - codeparrot_training - Step 34343: {'lr': 0.00011224568637543245, 'samples': 17584128, 'steps': 34343, 'batch_loss/train': 0.6696781544014812} 12/27/2021 11:43:07 - INFO - codeparrot_training - Step 34344: {'lr': 0.00011223253241069683, 'samples': 17584640, 'steps': 34344, 'batch_loss/train': 0.7719886717386544} 12/27/2021 11:43:17 - INFO - codeparrot_training - Step 34345: {'lr': 0.00011221937899367248, 'samples': 17585152, 'steps': 34345, 'batch_loss/train': 0.7802622616291046} 12/27/2021 11:43:31 - INFO - codeparrot_training - Step 34346: {'lr': 0.00011220622612441195, 'samples': 17585664, 'steps': 34346, 'batch_loss/train': 0.551540503045544} 12/27/2021 11:43:41 - INFO - codeparrot_training - Step 34347: {'lr': 0.00011219307380296751, 'samples': 17586176, 'steps': 34347, 'batch_loss/train': 0.8188374959863722} 12/27/2021 11:43:52 - INFO - codeparrot_training - Step 34348: {'lr': 0.00011217992202939128, 'samples': 17586688, 'steps': 34348, 'batch_loss/train': 0.7391024529933929} 12/27/2021 11:44:03 - INFO - codeparrot_training - Step 34349: {'lr': 0.00011216677080373567, 'samples': 17587200, 'steps': 34349, 'batch_loss/train': 0.7972397701814771} 12/27/2021 11:44:15 - INFO - codeparrot_training - Step 34350: {'lr': 0.00011215362012605293, 'samples': 17587712, 'steps': 34350, 'batch_loss/train': 0.7020173803321086} 12/27/2021 11:44:26 - INFO - codeparrot_training - Step 34351: {'lr': 0.00011214046999639533, 'samples': 17588224, 'steps': 34351, 'batch_loss/train': 0.7172701060771942} 12/27/2021 11:44:36 
- INFO - codeparrot_training - Step 34352: {'lr': 0.00011212732041481518, 'samples': 17588736, 'steps': 34352, 'batch_loss/train': 0.7774214723613113} 12/27/2021 11:44:48 - INFO - codeparrot_training - Step 34353: {'lr': 0.00011211417138136474, 'samples': 17589248, 'steps': 34353, 'batch_loss/train': 0.7550576627254486} 12/27/2021 11:44:59 - INFO - codeparrot_training - Step 34354: {'lr': 0.00011210102289609628, 'samples': 17589760, 'steps': 34354, 'batch_loss/train': 0.8188633488025516} 12/27/2021 11:45:10 - INFO - codeparrot_training - Step 34355: {'lr': 0.00011208787495906219, 'samples': 17590272, 'steps': 34355, 'batch_loss/train': 0.7264107260853052} 12/27/2021 11:45:23 - INFO - codeparrot_training - Step 34356: {'lr': 0.00011207472757031445, 'samples': 17590784, 'steps': 34356, 'batch_loss/train': 0.6686734822578728} 12/27/2021 11:45:33 - INFO - codeparrot_training - Step 34357: {'lr': 0.00011206158072990561, 'samples': 17591296, 'steps': 34357, 'batch_loss/train': 0.7447055601514876} 12/27/2021 11:45:44 - INFO - codeparrot_training - Step 34358: {'lr': 0.00011204843443788792, 'samples': 17591808, 'steps': 34358, 'batch_loss/train': 0.7828083033673465} 12/27/2021 11:45:55 - INFO - codeparrot_training - Step 34359: {'lr': 0.00011203528869431348, 'samples': 17592320, 'steps': 34359, 'batch_loss/train': 0.7602322382153943} 12/27/2021 11:46:07 - INFO - codeparrot_training - Step 34360: {'lr': 0.00011202214349923457, 'samples': 17592832, 'steps': 34360, 'batch_loss/train': 0.8061895575374365} 12/27/2021 11:46:17 - INFO - codeparrot_training - Step 34361: {'lr': 0.00011200899885270369, 'samples': 17593344, 'steps': 34361, 'batch_loss/train': 0.8161413241177797} 12/27/2021 11:46:28 - INFO - codeparrot_training - Step 34362: {'lr': 0.00011199585475477281, 'samples': 17593856, 'steps': 34362, 'batch_loss/train': 0.7334710152354091} 12/27/2021 11:46:41 - INFO - codeparrot_training - Step 34363: {'lr': 0.00011198271120549425, 'samples': 17594368, 'steps': 34363, 
'batch_loss/train': 0.7382420627400279} 12/27/2021 11:46:52 - INFO - codeparrot_training - Step 34364: {'lr': 0.00011196956820492046, 'samples': 17594880, 'steps': 34364, 'batch_loss/train': 0.7629763740114868} 12/27/2021 11:47:02 - INFO - codeparrot_training - Step 34365: {'lr': 0.00011195642575310346, 'samples': 17595392, 'steps': 34365, 'batch_loss/train': 0.749763008672744} 12/27/2021 11:47:14 - INFO - codeparrot_training - Step 34366: {'lr': 0.00011194328385009565, 'samples': 17595904, 'steps': 34366, 'batch_loss/train': 0.6456042632344179} 12/27/2021 11:47:25 - INFO - codeparrot_training - Step 34367: {'lr': 0.00011193014249594916, 'samples': 17596416, 'steps': 34367, 'batch_loss/train': 0.7805469734594226} 12/27/2021 11:47:36 - INFO - codeparrot_training - Step 34368: {'lr': 0.00011191700169071629, 'samples': 17596928, 'steps': 34368, 'batch_loss/train': 0.6991258580237627} 12/27/2021 11:47:46 - INFO - codeparrot_training - Step 34369: {'lr': 0.0001119038614344493, 'samples': 17597440, 'steps': 34369, 'batch_loss/train': 0.6067713563097641} 12/27/2021 11:47:58 - INFO - codeparrot_training - Step 34370: {'lr': 0.00011189072172720049, 'samples': 17597952, 'steps': 34370, 'batch_loss/train': 0.8865242563188076} 12/27/2021 11:48:09 - INFO - codeparrot_training - Step 34371: {'lr': 0.00011187758256902186, 'samples': 17598464, 'steps': 34371, 'batch_loss/train': 0.7294702166691422} 12/27/2021 11:48:19 - INFO - codeparrot_training - Step 34372: {'lr': 0.00011186444395996586, 'samples': 17598976, 'steps': 34372, 'batch_loss/train': 0.7941305506974459} 12/27/2021 11:48:33 - INFO - codeparrot_training - Step 34373: {'lr': 0.00011185130590008483, 'samples': 17599488, 'steps': 34373, 'batch_loss/train': 0.7500801999121904} 12/27/2021 11:48:43 - INFO - codeparrot_training - Step 34374: {'lr': 0.00011183816838943062, 'samples': 17600000, 'steps': 34374, 'batch_loss/train': 0.7827339844079688} 12/27/2021 11:48:54 - INFO - codeparrot_training - Step 34375: {'lr': 
0.00011182503142805586, 'samples': 17600512, 'steps': 34375, 'batch_loss/train': 0.8251475454308093} 12/27/2021 11:49:06 - INFO - codeparrot_training - Step 34376: {'lr': 0.00011181189501601252, 'samples': 17601024, 'steps': 34376, 'batch_loss/train': 0.7385372584685683} 12/27/2021 11:49:17 - INFO - codeparrot_training - Step 34377: {'lr': 0.00011179875915335291, 'samples': 17601536, 'steps': 34377, 'batch_loss/train': 0.7090229282621294} 12/27/2021 11:49:28 - INFO - codeparrot_training - Step 34378: {'lr': 0.00011178562384012924, 'samples': 17602048, 'steps': 34378, 'batch_loss/train': 0.6663260182831436} 12/27/2021 11:49:40 - INFO - codeparrot_training - Step 34379: {'lr': 0.00011177248907639376, 'samples': 17602560, 'steps': 34379, 'batch_loss/train': 0.9029968720860779} 12/27/2021 11:49:50 - INFO - codeparrot_training - Step 34380: {'lr': 0.00011175935486219868, 'samples': 17603072, 'steps': 34380, 'batch_loss/train': 1.3807708900421858} 12/27/2021 11:50:01 - INFO - codeparrot_training - Step 34381: {'lr': 0.00011174622119759619, 'samples': 17603584, 'steps': 34381, 'batch_loss/train': 0.6582646326860413} 12/27/2021 11:50:12 - INFO - codeparrot_training - Step 34382: {'lr': 0.00011173308808263852, 'samples': 17604096, 'steps': 34382, 'batch_loss/train': 0.7257364373654127} 12/27/2021 11:50:24 - INFO - codeparrot_training - Step 34383: {'lr': 0.00011171995551737787, 'samples': 17604608, 'steps': 34383, 'batch_loss/train': 0.7607742846012115} 12/27/2021 11:50:34 - INFO - codeparrot_training - Step 34384: {'lr': 0.00011170682350186658, 'samples': 17605120, 'steps': 34384, 'batch_loss/train': 0.6951427937019616} 12/27/2021 11:50:45 - INFO - codeparrot_training - Step 34385: {'lr': 0.00011169369203615664, 'samples': 17605632, 'steps': 34385, 'batch_loss/train': 0.7159852553158998} 12/27/2021 11:50:59 - INFO - codeparrot_training - Step 34386: {'lr': 0.00011168056112030028, 'samples': 17606144, 'steps': 34386, 'batch_loss/train': 0.6541794454678893} 12/27/2021 
11:51:09 - INFO - codeparrot_training - Step 34387: {'lr': 0.00011166743075434995, 'samples': 17606656, 'steps': 34387, 'batch_loss/train': 0.8535569924861193} 12/27/2021 11:51:20 - INFO - codeparrot_training - Step 34388: {'lr': 0.0001116543009383576, 'samples': 17607168, 'steps': 34388, 'batch_loss/train': 0.7120179245248437} 12/27/2021 11:51:32 - INFO - codeparrot_training - Step 34389: {'lr': 0.00011164117167237542, 'samples': 17607680, 'steps': 34389, 'batch_loss/train': 0.7814611820504069} 12/27/2021 11:51:43 - INFO - codeparrot_training - Step 34390: {'lr': 0.00011162804295645589, 'samples': 17608192, 'steps': 34390, 'batch_loss/train': 0.7686850498430431} 12/27/2021 11:51:54 - INFO - codeparrot_training - Step 34391: {'lr': 0.00011161491479065092, 'samples': 17608704, 'steps': 34391, 'batch_loss/train': 0.7266185148619115} 12/27/2021 11:52:04 - INFO - codeparrot_training - Step 34392: {'lr': 0.00011160178717501279, 'samples': 17609216, 'steps': 34392, 'batch_loss/train': 0.7295046286890283} 12/27/2021 11:52:17 - INFO - codeparrot_training - Step 34393: {'lr': 0.00011158866010959367, 'samples': 17609728, 'steps': 34393, 'batch_loss/train': 0.7677854532375932} 12/27/2021 11:52:28 - INFO - codeparrot_training - Step 34394: {'lr': 0.00011157553359444581, 'samples': 17610240, 'steps': 34394, 'batch_loss/train': 0.7934687566012144} 12/27/2021 11:52:39 - INFO - codeparrot_training - Step 34395: {'lr': 0.00011156240762962136, 'samples': 17610752, 'steps': 34395, 'batch_loss/train': 0.771068749949336} 12/27/2021 11:52:51 - INFO - codeparrot_training - Step 34396: {'lr': 0.00011154928221517258, 'samples': 17611264, 'steps': 34396, 'batch_loss/train': 0.7926764898002148} 12/27/2021 11:53:01 - INFO - codeparrot_training - Step 34397: {'lr': 0.0001115361573511514, 'samples': 17611776, 'steps': 34397, 'batch_loss/train': 0.8068623510189354} 12/27/2021 11:53:12 - INFO - codeparrot_training - Step 34398: {'lr': 0.00011152303303761027, 'samples': 17612288, 'steps': 34398, 
'batch_loss/train': 0.7451186254620552} 12/27/2021 11:53:24 - INFO - codeparrot_training - Step 34399: {'lr': 0.00011150990927460139, 'samples': 17612800, 'steps': 34399, 'batch_loss/train': 0.7056184476241469} 12/27/2021 11:53:35 - INFO - codeparrot_training - Step 34400: {'lr': 0.00011149678606217662, 'samples': 17613312, 'steps': 34400, 'batch_loss/train': 0.5694431269075722} 12/27/2021 11:53:45 - INFO - codeparrot_training - Step 34401: {'lr': 0.00011148366340038849, 'samples': 17613824, 'steps': 34401, 'batch_loss/train': 0.8537648068740964} 12/27/2021 11:53:56 - INFO - codeparrot_training - Step 34402: {'lr': 0.00011147054128928893, 'samples': 17614336, 'steps': 34402, 'batch_loss/train': 0.6330867297947407} 12/27/2021 11:54:09 - INFO - codeparrot_training - Step 34403: {'lr': 0.0001114574197289302, 'samples': 17614848, 'steps': 34403, 'batch_loss/train': 0.7457211504224688} 12/27/2021 11:54:20 - INFO - codeparrot_training - Step 34404: {'lr': 0.00011144429871936446, 'samples': 17615360, 'steps': 34404, 'batch_loss/train': 0.7827552668750286} 12/27/2021 11:54:30 - INFO - codeparrot_training - Step 34405: {'lr': 0.00011143117826064386, 'samples': 17615872, 'steps': 34405, 'batch_loss/train': 0.833097848109901} 12/27/2021 11:54:43 - INFO - codeparrot_training - Step 34406: {'lr': 0.00011141805835282059, 'samples': 17616384, 'steps': 34406, 'batch_loss/train': 0.6936868131160736} 12/27/2021 11:54:53 - INFO - codeparrot_training - Step 34407: {'lr': 0.00011140493899594679, 'samples': 17616896, 'steps': 34407, 'batch_loss/train': 0.753643227275461} 12/27/2021 11:55:04 - INFO - codeparrot_training - Step 34408: {'lr': 0.00011139182019007462, 'samples': 17617408, 'steps': 34408, 'batch_loss/train': 0.823675749823451} 12/27/2021 11:55:18 - INFO - codeparrot_training - Step 34409: {'lr': 0.00011137870193525623, 'samples': 17617920, 'steps': 34409, 'batch_loss/train': 0.7089589911629446} 12/27/2021 11:55:28 - INFO - codeparrot_training - Step 34410: {'lr': 
0.00011136558423154388, 'samples': 17618432, 'steps': 34410, 'batch_loss/train': 0.8291750191710889} 12/27/2021 11:55:39 - INFO - codeparrot_training - Step 34411: {'lr': 0.00011135246707898945, 'samples': 17618944, 'steps': 34411, 'batch_loss/train': 0.7168448832817376} 12/27/2021 11:55:50 - INFO - codeparrot_training - Step 34412: {'lr': 0.00011133935047764532, 'samples': 17619456, 'steps': 34412, 'batch_loss/train': 0.7450834475457668} 12/27/2021 11:56:02 - INFO - codeparrot_training - Step 34413: {'lr': 0.00011132623442756368, 'samples': 17619968, 'steps': 34413, 'batch_loss/train': 0.690093171549961} 12/27/2021 11:56:12 - INFO - codeparrot_training - Step 34414: {'lr': 0.00011131311892879648, 'samples': 17620480, 'steps': 34414, 'batch_loss/train': 0.7436512159183621} 12/27/2021 11:56:23 - INFO - codeparrot_training - Step 34415: {'lr': 0.00011130000398139587, 'samples': 17620992, 'steps': 34415, 'batch_loss/train': 0.7705901693552732} 12/27/2021 11:56:35 - INFO - codeparrot_training - Step 34416: {'lr': 0.00011128688958541425, 'samples': 17621504, 'steps': 34416, 'batch_loss/train': 0.7996186652453616} 12/27/2021 11:56:46 - INFO - codeparrot_training - Step 34417: {'lr': 0.00011127377574090347, 'samples': 17622016, 'steps': 34417, 'batch_loss/train': 0.7787807071581483} 12/27/2021 11:56:57 - INFO - codeparrot_training - Step 34418: {'lr': 0.0001112606624479158, 'samples': 17622528, 'steps': 34418, 'batch_loss/train': 0.757507901173085} 12/27/2021 11:57:09 - INFO - codeparrot_training - Step 34419: {'lr': 0.00011124754970650335, 'samples': 17623040, 'steps': 34419, 'batch_loss/train': 1.4216135516762733} 12/27/2021 11:57:19 - INFO - codeparrot_training - Step 34420: {'lr': 0.00011123443751671825, 'samples': 17623552, 'steps': 34420, 'batch_loss/train': 0.7697980338707566} 12/27/2021 11:57:30 - INFO - codeparrot_training - Step 34421: {'lr': 0.00011122132587861261, 'samples': 17624064, 'steps': 34421, 'batch_loss/train': 0.7751721232198179} 12/27/2021 11:57:40 
- INFO - codeparrot_training - Step 34422: {'lr': 0.00011120821479223869, 'samples': 17624576, 'steps': 34422, 'batch_loss/train': 0.8093545148149133} 12/27/2021 11:57:54 - INFO - codeparrot_training - Step 34423: {'lr': 0.00011119510425764831, 'samples': 17625088, 'steps': 34423, 'batch_loss/train': 0.8573832348920405} 12/27/2021 11:58:05 - INFO - codeparrot_training - Step 34424: {'lr': 0.0001111819942748939, 'samples': 17625600, 'steps': 34424, 'batch_loss/train': 0.733364796731621} 12/27/2021 11:58:15 - INFO - codeparrot_training - Step 34425: {'lr': 0.00011116888484402753, 'samples': 17626112, 'steps': 34425, 'batch_loss/train': 0.7188626502174884} 12/27/2021 11:58:27 - INFO - codeparrot_training - Step 34426: {'lr': 0.00011115577596510107, 'samples': 17626624, 'steps': 34426, 'batch_loss/train': 1.107390302233398} 12/27/2021 11:58:38 - INFO - codeparrot_training - Step 34427: {'lr': 0.00011114266763816693, 'samples': 17627136, 'steps': 34427, 'batch_loss/train': 0.5734589085914195} 12/27/2021 11:58:49 - INFO - codeparrot_training - Step 34428: {'lr': 0.00011112955986327718, 'samples': 17627648, 'steps': 34428, 'batch_loss/train': 0.540024128742516} 12/27/2021 11:58:59 - INFO - codeparrot_training - Step 34429: {'lr': 0.00011111645264048378, 'samples': 17628160, 'steps': 34429, 'batch_loss/train': 0.6829158416949213} 12/27/2021 11:59:12 - INFO - codeparrot_training - Step 34430: {'lr': 0.00011110334596983896, 'samples': 17628672, 'steps': 34430, 'batch_loss/train': 0.7729882767889649} 12/27/2021 11:59:22 - INFO - codeparrot_training - Step 34431: {'lr': 0.00011109023985139474, 'samples': 17629184, 'steps': 34431, 'batch_loss/train': 0.731887960806489} 12/27/2021 11:59:33 - INFO - codeparrot_training - Step 34432: {'lr': 0.0001110771342852033, 'samples': 17629696, 'steps': 34432, 'batch_loss/train': 0.7073102365247905} 12/27/2021 11:59:46 - INFO - codeparrot_training - Step 34433: {'lr': 0.00011106402927131673, 'samples': 17630208, 'steps': 34433, 
'batch_loss/train': 0.7485315250232816} 12/27/2021 11:59:56 - INFO - codeparrot_training - Step 34434: {'lr': 0.0001110509248097871, 'samples': 17630720, 'steps': 34434, 'batch_loss/train': 0.7469553574919701} 12/27/2021 12:00:07 - INFO - codeparrot_training - Step 34435: {'lr': 0.00011103782090066652, 'samples': 17631232, 'steps': 34435, 'batch_loss/train': 0.7691894195741042} 12/27/2021 12:00:20 - INFO - codeparrot_training - Step 34436: {'lr': 0.00011102471754400711, 'samples': 17631744, 'steps': 34436, 'batch_loss/train': 0.754659435711801} 12/27/2021 12:00:30 - INFO - codeparrot_training - Step 34437: {'lr': 0.00011101161473986096, 'samples': 17632256, 'steps': 34437, 'batch_loss/train': 0.8118449952453375} 12/27/2021 12:00:41 - INFO - codeparrot_training - Step 34438: {'lr': 0.00011099851248828014, 'samples': 17632768, 'steps': 34438, 'batch_loss/train': 0.7451585801318288} 12/27/2021 12:00:51 - INFO - codeparrot_training - Step 34439: {'lr': 0.00011098541078931681, 'samples': 17633280, 'steps': 34439, 'batch_loss/train': 0.4463175958953798} 12/27/2021 12:01:05 - INFO - codeparrot_training - Step 34440: {'lr': 0.00011097230964302294, 'samples': 17633792, 'steps': 34440, 'batch_loss/train': 0.787126618437469} 12/27/2021 12:01:16 - INFO - codeparrot_training - Step 34441: {'lr': 0.0001109592090494506, 'samples': 17634304, 'steps': 34441, 'batch_loss/train': 0.8485484616830945} 12/27/2021 12:01:27 - INFO - codeparrot_training - Step 34442: {'lr': 0.00011094610900865207, 'samples': 17634816, 'steps': 34442, 'batch_loss/train': 0.8263475173152983} 12/27/2021 12:01:39 - INFO - codeparrot_training - Step 34443: {'lr': 0.00011093300952067925, 'samples': 17635328, 'steps': 34443, 'batch_loss/train': 0.7021220703609288} 12/27/2021 12:01:49 - INFO - codeparrot_training - Step 34444: {'lr': 0.00011091991058558426, 'samples': 17635840, 'steps': 34444, 'batch_loss/train': 0.8072323095984757} 12/27/2021 12:02:00 - INFO - codeparrot_training - Step 34445: {'lr': 
0.00011090681220341922, 'samples': 17636352, 'steps': 34445, 'batch_loss/train': 0.7588364602997899} 12/27/2021 12:02:12 - INFO - codeparrot_training - Step 34446: {'lr': 0.00011089371437423615, 'samples': 17636864, 'steps': 34446, 'batch_loss/train': 0.6936974721029401} 12/27/2021 12:02:23 - INFO - codeparrot_training - Step 34447: {'lr': 0.00011088061709808714, 'samples': 17637376, 'steps': 34447, 'batch_loss/train': 0.7717938302084804} 12/27/2021 12:02:33 - INFO - codeparrot_training - Step 34448: {'lr': 0.00011086752037502429, 'samples': 17637888, 'steps': 34448, 'batch_loss/train': 0.7196260408964008} 12/27/2021 12:02:44 - INFO - codeparrot_training - Step 34449: {'lr': 0.00011085442420509964, 'samples': 17638400, 'steps': 34449, 'batch_loss/train': 0.7783109345473349} 12/27/2021 12:02:57 - INFO - codeparrot_training - Step 34450: {'lr': 0.00011084132858836524, 'samples': 17638912, 'steps': 34450, 'batch_loss/train': 0.9205832807347178} 12/27/2021 12:03:08 - INFO - codeparrot_training - Step 34451: {'lr': 0.00011082823352487328, 'samples': 17639424, 'steps': 34451, 'batch_loss/train': 0.7624450428411365} 12/27/2021 12:03:18 - INFO - codeparrot_training - Step 34452: {'lr': 0.00011081513901467555, 'samples': 17639936, 'steps': 34452, 'batch_loss/train': 0.7058011302724481} 12/27/2021 12:03:30 - INFO - codeparrot_training - Step 34453: {'lr': 0.00011080204505782432, 'samples': 17640448, 'steps': 34453, 'batch_loss/train': 1.1255666255019605} 12/27/2021 12:03:41 - INFO - codeparrot_training - Step 34454: {'lr': 0.00011078895165437172, 'samples': 17640960, 'steps': 34454, 'batch_loss/train': 0.6851087450049818} 12/27/2021 12:03:52 - INFO - codeparrot_training - Step 34455: {'lr': 0.0001107758588043696, 'samples': 17641472, 'steps': 34455, 'batch_loss/train': 0.7061057786922902} 12/27/2021 12:04:04 - INFO - codeparrot_training - Step 34456: {'lr': 0.00011076276650787007, 'samples': 17641984, 'steps': 34456, 'batch_loss/train': 0.7002016892656684} 12/27/2021 
12:04:15 - INFO - codeparrot_training - Step 34457: {'lr': 0.00011074967476492523, 'samples': 17642496, 'steps': 34457, 'batch_loss/train': 0.836693181656301} 12/27/2021 12:04:25 - INFO - codeparrot_training - Step 34458: {'lr': 0.00011073658357558711, 'samples': 17643008, 'steps': 34458, 'batch_loss/train': 0.7738205986097455} 12/27/2021 12:04:36 - INFO - codeparrot_training - Step 34459: {'lr': 0.00011072349293990771, 'samples': 17643520, 'steps': 34459, 'batch_loss/train': 0.7164114250335842} 12/27/2021 12:04:50 - INFO - codeparrot_training - Step 34460: {'lr': 0.00011071040285793915, 'samples': 17644032, 'steps': 34460, 'batch_loss/train': 0.5990934326546267} 12/27/2021 12:05:00 - INFO - codeparrot_training - Step 34461: {'lr': 0.00011069731332973343, 'samples': 17644544, 'steps': 34461, 'batch_loss/train': 0.6914393571205437} 12/27/2021 12:05:11 - INFO - codeparrot_training - Step 34462: {'lr': 0.00011068422435534259, 'samples': 17645056, 'steps': 34462, 'batch_loss/train': 0.6518776593729854} 12/27/2021 12:05:23 - INFO - codeparrot_training - Step 34463: {'lr': 0.00011067113593481864, 'samples': 17645568, 'steps': 34463, 'batch_loss/train': 0.7943997364491224} 12/27/2021 12:05:34 - INFO - codeparrot_training - Step 34464: {'lr': 0.0001106580480682137, 'samples': 17646080, 'steps': 34464, 'batch_loss/train': 0.7462907219305634} 12/27/2021 12:05:44 - INFO - codeparrot_training - Step 34465: {'lr': 0.00011064496075557981, 'samples': 17646592, 'steps': 34465, 'batch_loss/train': 0.8367091305553913} 12/27/2021 12:05:56 - INFO - codeparrot_training - Step 34466: {'lr': 0.00011063187399696884, 'samples': 17647104, 'steps': 34466, 'batch_loss/train': 0.7945496346801519} 12/27/2021 12:06:07 - INFO - codeparrot_training - Step 34467: {'lr': 0.00011061878779243284, 'samples': 17647616, 'steps': 34467, 'batch_loss/train': 0.9108990207314491} 12/27/2021 12:06:18 - INFO - codeparrot_training - Step 34468: {'lr': 0.00011060570214202406, 'samples': 17648128, 'steps': 34468, 
'batch_loss/train': 0.7916782414540648} 12/27/2021 12:06:28 - INFO - codeparrot_training - Step 34469: {'lr': 0.00011059261704579432, 'samples': 17648640, 'steps': 34469, 'batch_loss/train': 0.70621133223176} 12/27/2021 12:06:40 - INFO - codeparrot_training - Step 34470: {'lr': 0.00011057953250379568, 'samples': 17649152, 'steps': 34470, 'batch_loss/train': 0.7101157903671265} 12/27/2021 12:06:51 - INFO - codeparrot_training - Step 34471: {'lr': 0.00011056644851608019, 'samples': 17649664, 'steps': 34471, 'batch_loss/train': 0.7228190060704947} 12/27/2021 12:07:02 - INFO - codeparrot_training - Step 34472: {'lr': 0.00011055336508269984, 'samples': 17650176, 'steps': 34472, 'batch_loss/train': 0.8664384288713336} 12/27/2021 12:07:15 - INFO - codeparrot_training - Step 34473: {'lr': 0.00011054028220370666, 'samples': 17650688, 'steps': 34473, 'batch_loss/train': 0.7054885742836632} 12/27/2021 12:07:25 - INFO - codeparrot_training - Step 34474: {'lr': 0.00011052719987915267, 'samples': 17651200, 'steps': 34474, 'batch_loss/train': 0.782352901995182} 12/27/2021 12:07:36 - INFO - codeparrot_training - Step 34475: {'lr': 0.00011051411810908987, 'samples': 17651712, 'steps': 34475, 'batch_loss/train': 0.6350780150387436} 12/27/2021 12:07:48 - INFO - codeparrot_training - Step 34476: {'lr': 0.00011050103689357025, 'samples': 17652224, 'steps': 34476, 'batch_loss/train': 0.7923905774950981} 12/27/2021 12:07:59 - INFO - codeparrot_training - Step 34477: {'lr': 0.00011048795623264594, 'samples': 17652736, 'steps': 34477, 'batch_loss/train': 0.7275201017037034} 12/27/2021 12:08:09 - INFO - codeparrot_training - Step 34478: {'lr': 0.00011047487612636864, 'samples': 17653248, 'steps': 34478, 'batch_loss/train': 0.6852035028859973} 12/27/2021 12:08:22 - INFO - codeparrot_training - Step 34479: {'lr': 0.00011046179657479067, 'samples': 17653760, 'steps': 34479, 'batch_loss/train': 0.7603341257199645} 12/27/2021 12:08:33 - INFO - codeparrot_training - Step 34480: {'lr': 
0.00011044871757796399, 'samples': 17654272, 'steps': 34480, 'batch_loss/train': 0.760087737813592} 12/27/2021 12:08:44 - INFO - codeparrot_training - Step 34481: {'lr': 0.0001104356391359404, 'samples': 17654784, 'steps': 34481, 'batch_loss/train': 0.9162117689847946} 12/27/2021 12:08:54 - INFO - codeparrot_training - Step 34482: {'lr': 0.00011042256124877195, 'samples': 17655296, 'steps': 34482, 'batch_loss/train': 0.8111310992389917} 12/27/2021 12:09:07 - INFO - codeparrot_training - Step 34483: {'lr': 0.00011040948391651087, 'samples': 17655808, 'steps': 34483, 'batch_loss/train': 0.7480237009003758} 12/27/2021 12:09:17 - INFO - codeparrot_training - Step 34484: {'lr': 0.00011039640713920887, 'samples': 17656320, 'steps': 34484, 'batch_loss/train': 0.6818011067807674} 12/27/2021 12:09:28 - INFO - codeparrot_training - Step 34485: {'lr': 0.00011038333091691804, 'samples': 17656832, 'steps': 34485, 'batch_loss/train': 0.8540500765666366} 12/27/2021 12:09:40 - INFO - codeparrot_training - Step 34486: {'lr': 0.00011037025524969038, 'samples': 17657344, 'steps': 34486, 'batch_loss/train': 0.8234434574842453} 12/27/2021 12:09:51 - INFO - codeparrot_training - Step 34487: {'lr': 0.00011035718013757787, 'samples': 17657856, 'steps': 34487, 'batch_loss/train': 0.6974821258336306} 12/27/2021 12:10:01 - INFO - codeparrot_training - Step 34488: {'lr': 0.00011034410558063249, 'samples': 17658368, 'steps': 34488, 'batch_loss/train': 0.733502296730876} 12/27/2021 12:10:16 - INFO - codeparrot_training - Step 34489: {'lr': 0.0001103310315789062, 'samples': 17658880, 'steps': 34489, 'batch_loss/train': 0.7139329197816551} 12/27/2021 12:10:26 - INFO - codeparrot_training - Step 34490: {'lr': 0.00011031795813245099, 'samples': 17659392, 'steps': 34490, 'batch_loss/train': 0.45639136992394924} 12/27/2021 12:10:37 - INFO - codeparrot_training - Step 34491: {'lr': 0.00011030488524131894, 'samples': 17659904, 'steps': 34491, 'batch_loss/train': 0.871691839536652} 12/27/2021 12:10:47 - 
INFO - codeparrot_training - Step 34492: {'lr': 0.00011029181290556181, 'samples': 17660416, 'steps': 34492, 'batch_loss/train': 0.7084211938781664} 12/27/2021 12:11:00 - INFO - codeparrot_training - Step 34493: {'lr': 0.0001102787411252316, 'samples': 17660928, 'steps': 34493, 'batch_loss/train': 0.8154762359336019} 12/27/2021 12:11:10 - INFO - codeparrot_training - Step 34494: {'lr': 0.00011026566990038053, 'samples': 17661440, 'steps': 34494, 'batch_loss/train': 0.9371199980378151} 12/27/2021 12:11:21 - INFO - codeparrot_training - Step 34495: {'lr': 0.00011025259923106032, 'samples': 17661952, 'steps': 34495, 'batch_loss/train': 0.6731951949186623} 12/27/2021 12:11:34 - INFO - codeparrot_training - Step 34496: {'lr': 0.00011023952911732302, 'samples': 17662464, 'steps': 34496, 'batch_loss/train': 0.8283979299012572} 12/27/2021 12:11:45 - INFO - codeparrot_training - Step 34497: {'lr': 0.00011022645955922056, 'samples': 17662976, 'steps': 34497, 'batch_loss/train': 0.920893969014287} 12/27/2021 12:11:55 - INFO - codeparrot_training - Step 34498: {'lr': 0.00011021339055680491, 'samples': 17663488, 'steps': 34498, 'batch_loss/train': 0.6766141084954143} 12/27/2021 12:12:07 - INFO - codeparrot_training - Step 34499: {'lr': 0.00011020032211012806, 'samples': 17664000, 'steps': 34499, 'batch_loss/train': 0.8069435150828212} 12/27/2021 12:12:18 - INFO - codeparrot_training - Step 34500: {'lr': 0.00011018725421924191, 'samples': 17664512, 'steps': 34500, 'batch_loss/train': 0.7781944358721375} 12/27/2021 12:12:29 - INFO - codeparrot_training - Step 34501: {'lr': 0.0001101741868841985, 'samples': 17665024, 'steps': 34501, 'batch_loss/train': 0.7545056464150548} 12/27/2021 12:12:39 - INFO - codeparrot_training - Step 34502: {'lr': 0.00011016112010504969, 'samples': 17665536, 'steps': 34502, 'batch_loss/train': 0.7911162581294775} 12/27/2021 12:12:52 - INFO - codeparrot_training - Step 34503: {'lr': 0.00011014805388184757, 'samples': 17666048, 'steps': 34503, 
'batch_loss/train': 0.7218083813786507} 12/27/2021 12:13:02 - INFO - codeparrot_training - Step 34504: {'lr': 0.0001101349882146438, 'samples': 17666560, 'steps': 34504, 'batch_loss/train': 0.8580451300367713} 12/27/2021 12:13:13 - INFO - codeparrot_training - Step 34505: {'lr': 0.00011012192310349056, 'samples': 17667072, 'steps': 34505, 'batch_loss/train': 0.7194064408540726} 12/27/2021 12:13:25 - INFO - codeparrot_training - Step 34506: {'lr': 0.00011010885854843988, 'samples': 17667584, 'steps': 34506, 'batch_loss/train': 0.9076379891484976} 12/27/2021 12:13:36 - INFO - codeparrot_training - Step 34507: {'lr': 0.00011009579454954344, 'samples': 17668096, 'steps': 34507, 'batch_loss/train': 0.630311927292496} 12/27/2021 12:13:46 - INFO - codeparrot_training - Step 34508: {'lr': 0.0001100827311068532, 'samples': 17668608, 'steps': 34508, 'batch_loss/train': 0.7519833026453853} 12/27/2021 12:13:57 - INFO - codeparrot_training - Step 34509: {'lr': 0.00011006966822042138, 'samples': 17669120, 'steps': 34509, 'batch_loss/train': 0.7711794301867485} 12/27/2021 12:14:11 - INFO - codeparrot_training - Step 34510: {'lr': 0.00011005660589029965, 'samples': 17669632, 'steps': 34510, 'batch_loss/train': 0.6612903666682541} 12/27/2021 12:14:21 - INFO - codeparrot_training - Step 34511: {'lr': 0.00011004354411654, 'samples': 17670144, 'steps': 34511, 'batch_loss/train': 0.6420267251087353} 12/27/2021 12:14:32 - INFO - codeparrot_training - Step 34512: {'lr': 0.00011003048289919437, 'samples': 17670656, 'steps': 34512, 'batch_loss/train': 0.8095030225813389} 12/27/2021 12:14:44 - INFO - codeparrot_training - Step 34513: {'lr': 0.00011001742223831468, 'samples': 17671168, 'steps': 34513, 'batch_loss/train': 0.6939255073666573} 12/27/2021 12:14:54 - INFO - codeparrot_training - Step 34514: {'lr': 0.00011000436213395287, 'samples': 17671680, 'steps': 34514, 'batch_loss/train': 0.840409170370549} 12/27/2021 12:15:05 - INFO - codeparrot_training - Step 34515: {'lr': 
0.00010999130258616089, 'samples': 17672192, 'steps': 34515, 'batch_loss/train': 0.8247584423515946} 12/27/2021 12:15:17 - INFO - codeparrot_training - Step 34516: {'lr': 0.00010997824359499058, 'samples': 17672704, 'steps': 34516, 'batch_loss/train': 0.6923349229618907} 12/27/2021 12:15:28 - INFO - codeparrot_training - Step 34517: {'lr': 0.00010996518516049392, 'samples': 17673216, 'steps': 34517, 'batch_loss/train': 0.8234707741066813} 12/27/2021 12:15:38 - INFO - codeparrot_training - Step 34518: {'lr': 0.00010995212728272292, 'samples': 17673728, 'steps': 34518, 'batch_loss/train': 0.802734799683094} 12/27/2021 12:15:49 - INFO - codeparrot_training - Step 34519: {'lr': 0.00010993906996172916, 'samples': 17674240, 'steps': 34519, 'batch_loss/train': 0.7585452233906835} 12/27/2021 12:16:02 - INFO - codeparrot_training - Step 34520: {'lr': 0.00010992601319756498, 'samples': 17674752, 'steps': 34520, 'batch_loss/train': 0.7040478931739926} 12/27/2021 12:16:13 - INFO - codeparrot_training - Step 34521: {'lr': 0.00010991295699028197, 'samples': 17675264, 'steps': 34521, 'batch_loss/train': 0.9323031837120652} 12/27/2021 12:16:23 - INFO - codeparrot_training - Step 34522: {'lr': 0.00010989990133993208, 'samples': 17675776, 'steps': 34522, 'batch_loss/train': 0.7196086379699409} 12/27/2021 12:16:35 - INFO - codeparrot_training - Step 34523: {'lr': 0.00010988684624656747, 'samples': 17676288, 'steps': 34523, 'batch_loss/train': 0.7550276983529329} 12/27/2021 12:16:46 - INFO - codeparrot_training - Step 34524: {'lr': 0.00010987379171023976, 'samples': 17676800, 'steps': 34524, 'batch_loss/train': 0.6702524540014565} 12/27/2021 12:16:57 - INFO - codeparrot_training - Step 34525: {'lr': 0.00010986073773100092, 'samples': 17677312, 'steps': 34525, 'batch_loss/train': 0.7106424056109972} 12/27/2021 12:17:10 - INFO - codeparrot_training - Step 34526: {'lr': 0.00010984768430890289, 'samples': 17677824, 'steps': 34526, 'batch_loss/train': 0.848352650180459} 12/27/2021 12:17:20 
- INFO - codeparrot_training - Step 34527: {'lr': 0.00010983463144399755, 'samples': 17678336, 'steps': 34527, 'batch_loss/train': 0.7703933278098702} 12/27/2021 12:17:31 - INFO - codeparrot_training - Step 34528: {'lr': 0.00010982157913633678, 'samples': 17678848, 'steps': 34528, 'batch_loss/train': 0.7566227237693965} 12/27/2021 12:17:42 - INFO - codeparrot_training - Step 34529: {'lr': 0.00010980852738597258, 'samples': 17679360, 'steps': 34529, 'batch_loss/train': 0.6941836497280747} 12/27/2021 12:17:54 - INFO - codeparrot_training - Step 34530: {'lr': 0.00010979547619295654, 'samples': 17679872, 'steps': 34530, 'batch_loss/train': 0.8163931798189878} 12/27/2021 12:18:04 - INFO - codeparrot_training - Step 34531: {'lr': 0.00010978242555734086, 'samples': 17680384, 'steps': 34531, 'batch_loss/train': 0.5660041217925027} 12/27/2021 12:18:15 - INFO - codeparrot_training - Step 34532: {'lr': 0.00010976937547917743, 'samples': 17680896, 'steps': 34532, 'batch_loss/train': 0.7691296692937613} 12/27/2021 12:18:27 - INFO - codeparrot_training - Step 34533: {'lr': 0.00010975632595851787, 'samples': 17681408, 'steps': 34533, 'batch_loss/train': 0.7639230187050998} 12/27/2021 12:18:38 - INFO - codeparrot_training - Step 34534: {'lr': 0.00010974327699541414, 'samples': 17681920, 'steps': 34534, 'batch_loss/train': 0.8031266871839762} 12/27/2021 12:18:49 - INFO - codeparrot_training - Step 34535: {'lr': 0.00010973022858991835, 'samples': 17682432, 'steps': 34535, 'batch_loss/train': 0.7467765277251601} 12/27/2021 12:19:01 - INFO - codeparrot_training - Step 34536: {'lr': 0.00010971718074208214, 'samples': 17682944, 'steps': 34536, 'batch_loss/train': 0.7682336331345141} 12/27/2021 12:19:11 - INFO - codeparrot_training - Step 34537: {'lr': 0.00010970413345195743, 'samples': 17683456, 'steps': 34537, 'batch_loss/train': 0.7083218600600958} 12/27/2021 12:19:22 - INFO - codeparrot_training - Step 34538: {'lr': 0.00010969108671959613, 'samples': 17683968, 'steps': 34538, 
'batch_loss/train': 0.757341573247686} 12/27/2021 12:19:32 - INFO - codeparrot_training - Step 34539: {'lr': 0.00010967804054505007, 'samples': 17684480, 'steps': 34539, 'batch_loss/train': 0.6798556151043158} 12/27/2021 12:19:46 - INFO - codeparrot_training - Step 34540: {'lr': 0.00010966499492837115, 'samples': 17684992, 'steps': 34540, 'batch_loss/train': 0.6732105643022805} 12/27/2021 12:19:56 - INFO - codeparrot_training - Step 34541: {'lr': 0.00010965194986961121, 'samples': 17685504, 'steps': 34541, 'batch_loss/train': 0.7210122034884989} 12/27/2021 12:20:07 - INFO - codeparrot_training - Step 34542: {'lr': 0.00010963890536882212, 'samples': 17686016, 'steps': 34542, 'batch_loss/train': 0.6911572618409991} 12/27/2021 12:20:19 - INFO - codeparrot_training - Step 34543: {'lr': 0.00010962586142605577, 'samples': 17686528, 'steps': 34543, 'batch_loss/train': 0.7879563439637423} 12/27/2021 12:20:30 - INFO - codeparrot_training - Step 34544: {'lr': 0.00010961281804136408, 'samples': 17687040, 'steps': 34544, 'batch_loss/train': 0.7782407468184829} 12/27/2021 12:20:40 - INFO - codeparrot_training - Step 34545: {'lr': 0.00010959977521479861, 'samples': 17687552, 'steps': 34545, 'batch_loss/train': 0.7393609555438161} 12/27/2021 12:20:52 - INFO - codeparrot_training - Step 34546: {'lr': 0.00010958673294641162, 'samples': 17688064, 'steps': 34546, 'batch_loss/train': 0.8288523443043232} 12/27/2021 12:21:03 - INFO - codeparrot_training - Step 34547: {'lr': 0.00010957369123625465, 'samples': 17688576, 'steps': 34547, 'batch_loss/train': 0.8230647621676326} 12/27/2021 12:21:14 - INFO - codeparrot_training - Step 34548: {'lr': 0.0001095606500843796, 'samples': 17689088, 'steps': 34548, 'batch_loss/train': 0.7536678430624306} 12/27/2021 12:21:24 - INFO - codeparrot_training - Step 34549: {'lr': 0.00010954760949083856, 'samples': 17689600, 'steps': 34549, 'batch_loss/train': 0.7326570590957999} 12/27/2021 12:21:38 - INFO - codeparrot_training - Step 34550: {'lr': 
0.00010953456945568307, 'samples': 17690112, 'steps': 34550, 'batch_loss/train': 0.8280820806976408} 12/27/2021 12:21:49 - INFO - codeparrot_training - Step 34551: {'lr': 0.00010952152997896512, 'samples': 17690624, 'steps': 34551, 'batch_loss/train': 0.7558524687774479} 12/27/2021 12:21:59 - INFO - codeparrot_training - Step 34552: {'lr': 0.00010950849106073652, 'samples': 17691136, 'steps': 34552, 'batch_loss/train': 0.6981484855059534} 12/27/2021 12:22:11 - INFO - codeparrot_training - Step 34553: {'lr': 0.0001094954527010491, 'samples': 17691648, 'steps': 34553, 'batch_loss/train': 0.8301624245941639} 12/27/2021 12:22:22 - INFO - codeparrot_training - Step 34554: {'lr': 0.00010948241489995472, 'samples': 17692160, 'steps': 34554, 'batch_loss/train': 0.8031542720273137} 12/27/2021 12:22:33 - INFO - codeparrot_training - Step 34555: {'lr': 0.00010946937765750528, 'samples': 17692672, 'steps': 34555, 'batch_loss/train': 0.7369141139788553} 12/27/2021 12:22:46 - INFO - codeparrot_training - Step 34556: {'lr': 0.00010945634097375237, 'samples': 17693184, 'steps': 34556, 'batch_loss/train': 0.8383744489401579} 12/27/2021 12:22:57 - INFO - codeparrot_training - Step 34557: {'lr': 0.00010944330484874804, 'samples': 17693696, 'steps': 34557, 'batch_loss/train': 0.7597375903278589} 12/27/2021 12:23:08 - INFO - codeparrot_training - Step 34558: {'lr': 0.00010943026928254418, 'samples': 17694208, 'steps': 34558, 'batch_loss/train': 0.6740828899201006} 12/27/2021 12:23:18 - INFO - codeparrot_training - Step 34559: {'lr': 0.00010941723427519229, 'samples': 17694720, 'steps': 34559, 'batch_loss/train': 1.050360678229481} 12/27/2021 12:23:31 - INFO - codeparrot_training - Step 34560: {'lr': 0.00010940419982674449, 'samples': 17695232, 'steps': 34560, 'batch_loss/train': 0.7361983307637274} 12/27/2021 12:23:41 - INFO - codeparrot_training - Step 34561: {'lr': 0.00010939116593725262, 'samples': 17695744, 'steps': 34561, 'batch_loss/train': 0.4695886396802962} 12/27/2021 12:23:52 
- INFO - codeparrot_training - Step 34562: {'lr': 0.00010937813260676827, 'samples': 17696256, 'steps': 34562, 'batch_loss/train': 0.786571147851646} 12/27/2021 12:24:04 - INFO - codeparrot_training - Step 34563: {'lr': 0.00010936509983534326, 'samples': 17696768, 'steps': 34563, 'batch_loss/train': 0.7038963069207966} 12/27/2021 12:24:15 - INFO - codeparrot_training - Step 34564: {'lr': 0.00010935206762302971, 'samples': 17697280, 'steps': 34564, 'batch_loss/train': 0.7926880968734622} 12/27/2021 12:24:25 - INFO - codeparrot_training - Step 34565: {'lr': 0.00010933903596987915, 'samples': 17697792, 'steps': 34565, 'batch_loss/train': 0.7639858229085803} 12/27/2021 12:24:36 - INFO - codeparrot_training - Step 34566: {'lr': 0.00010932600487594347, 'samples': 17698304, 'steps': 34566, 'batch_loss/train': 0.7884039806667715} 12/27/2021 12:24:50 - INFO - codeparrot_training - Step 34567: {'lr': 0.00010931297434127446, 'samples': 17698816, 'steps': 34567, 'batch_loss/train': 0.756989948451519} 12/27/2021 12:25:01 - INFO - codeparrot_training - Step 34568: {'lr': 0.00010929994436592397, 'samples': 17699328, 'steps': 34568, 'batch_loss/train': 0.7052311580628157} 12/27/2021 12:25:11 - INFO - codeparrot_training - Step 34569: {'lr': 0.00010928691494994375, 'samples': 17699840, 'steps': 34569, 'batch_loss/train': 0.7490589451044798} 12/27/2021 12:25:24 - INFO - codeparrot_training - Step 34570: {'lr': 0.00010927388609338573, 'samples': 17700352, 'steps': 34570, 'batch_loss/train': 0.6649514343589544} 12/27/2021 12:25:34 - INFO - codeparrot_training - Step 34571: {'lr': 0.00010926085779630143, 'samples': 17700864, 'steps': 34571, 'batch_loss/train': 0.821752411313355} 12/27/2021 12:25:45 - INFO - codeparrot_training - Step 34572: {'lr': 0.00010924783005874298, 'samples': 17701376, 'steps': 34572, 'batch_loss/train': 0.7452294593676925} 12/27/2021 12:25:57 - INFO - codeparrot_training - Step 34573: {'lr': 0.00010923480288076193, 'samples': 17701888, 'steps': 34573, 
'batch_loss/train': 0.8177369143813848} 12/27/2021 12:26:08 - INFO - codeparrot_training - Step 34574: {'lr': 0.00010922177626241005, 'samples': 17702400, 'steps': 34574, 'batch_loss/train': 1.6691797524690628} 12/27/2021 12:26:18 - INFO - codeparrot_training - Step 34575: {'lr': 0.00010920875020373944, 'samples': 17702912, 'steps': 34575, 'batch_loss/train': 1.021680610254407} 12/27/2021 12:26:29 - INFO - codeparrot_training - Step 34576: {'lr': 0.00010919572470480154, 'samples': 17703424, 'steps': 34576, 'batch_loss/train': 0.7829701625742018} 12/27/2021 12:26:42 - INFO - codeparrot_training - Step 34577: {'lr': 0.00010918269976564833, 'samples': 17703936, 'steps': 34577, 'batch_loss/train': 0.6804534120019525} 12/27/2021 12:26:53 - INFO - codeparrot_training - Step 34578: {'lr': 0.0001091696753863315, 'samples': 17704448, 'steps': 34578, 'batch_loss/train': 0.7165276035666466} 12/27/2021 12:27:03 - INFO - codeparrot_training - Step 34579: {'lr': 0.00010915665156690289, 'samples': 17704960, 'steps': 34579, 'batch_loss/train': 0.6613225908949971} 12/27/2021 12:27:15 - INFO - codeparrot_training - Step 34580: {'lr': 0.00010914362830741423, 'samples': 17705472, 'steps': 34580, 'batch_loss/train': 0.771237232722342} 12/27/2021 12:27:26 - INFO - codeparrot_training - Step 34581: {'lr': 0.0001091306056079174, 'samples': 17705984, 'steps': 34581, 'batch_loss/train': 0.8271016888320446} 12/27/2021 12:27:37 - INFO - codeparrot_training - Step 34582: {'lr': 0.00010911758346846393, 'samples': 17706496, 'steps': 34582, 'batch_loss/train': 0.7984769656322896} 12/27/2021 12:27:49 - INFO - codeparrot_training - Step 34583: {'lr': 0.00010910456188910583, 'samples': 17707008, 'steps': 34583, 'batch_loss/train': 0.6984512070193887} 12/27/2021 12:28:00 - INFO - codeparrot_training - Step 34584: {'lr': 0.00010909154086989492, 'samples': 17707520, 'steps': 34584, 'batch_loss/train': 0.8356533609330654} 12/27/2021 12:28:10 - INFO - codeparrot_training - Step 34585: {'lr': 
0.00010907852041088262, 'samples': 17708032, 'steps': 34585, 'batch_loss/train': 0.8357437313534319} 12/27/2021 12:28:21 - INFO - codeparrot_training - Step 34586: {'lr': 0.00010906550051212102, 'samples': 17708544, 'steps': 34586, 'batch_loss/train': 0.7830341113731265} 12/27/2021 12:28:33 - INFO - codeparrot_training - Step 34587: {'lr': 0.00010905248117366187, 'samples': 17709056, 'steps': 34587, 'batch_loss/train': 0.6872793866787106} 12/27/2021 12:28:44 - INFO - codeparrot_training - Step 34588: {'lr': 0.00010903946239555674, 'samples': 17709568, 'steps': 34588, 'batch_loss/train': 0.7415819368325174} 12/27/2021 12:28:54 - INFO - codeparrot_training - Step 34589: {'lr': 0.00010902644417785737, 'samples': 17710080, 'steps': 34589, 'batch_loss/train': 0.7190971504896879} 12/27/2021 12:29:07 - INFO - codeparrot_training - Step 34590: {'lr': 0.00010901342652061583, 'samples': 17710592, 'steps': 34590, 'batch_loss/train': 0.8199988703709096} 12/27/2021 12:29:18 - INFO - codeparrot_training - Step 34591: {'lr': 0.00010900040942388356, 'samples': 17711104, 'steps': 34591, 'batch_loss/train': 0.7897628601640463} 12/27/2021 12:29:29 - INFO - codeparrot_training - Step 34592: {'lr': 0.00010898739288771245, 'samples': 17711616, 'steps': 34592, 'batch_loss/train': 0.7701017186045647} 12/27/2021 12:29:41 - INFO - codeparrot_training - Step 34593: {'lr': 0.00010897437691215423, 'samples': 17712128, 'steps': 34593, 'batch_loss/train': 0.712110155262053} 12/27/2021 12:29:51 - INFO - codeparrot_training - Step 34594: {'lr': 0.00010896136149726061, 'samples': 17712640, 'steps': 34594, 'batch_loss/train': 0.7821227889508009} 12/27/2021 12:30:02 - INFO - codeparrot_training - Step 34595: {'lr': 0.00010894834664308337, 'samples': 17713152, 'steps': 34595, 'batch_loss/train': 0.8418583013117313} 12/27/2021 12:30:13 - INFO - codeparrot_training - Step 34596: {'lr': 0.00010893533234967423, 'samples': 17713664, 'steps': 34596, 'batch_loss/train': 0.7119698917958885} 12/27/2021 
12:30:26 - INFO - codeparrot_training - Step 34597: {'lr': 0.00010892231861708498, 'samples': 17714176, 'steps': 34597, 'batch_loss/train': 0.7187304712133482} 12/27/2021 12:30:36 - INFO - codeparrot_training - Step 34598: {'lr': 0.0001089093054453673, 'samples': 17714688, 'steps': 34598, 'batch_loss/train': 0.7776285158470273} 12/27/2021 12:30:47 - INFO - codeparrot_training - Step 34599: {'lr': 0.00010889629283457306, 'samples': 17715200, 'steps': 34599, 'batch_loss/train': 0.7739912439137697} 12/27/2021 12:30:59 - INFO - codeparrot_training - Step 34600: {'lr': 0.00010888328078475365, 'samples': 17715712, 'steps': 34600, 'batch_loss/train': 0.818750748410821} 12/27/2021 12:31:10 - INFO - codeparrot_training - Step 34601: {'lr': 0.00010887026929596126, 'samples': 17716224, 'steps': 34601, 'batch_loss/train': 0.748314491007477} 12/27/2021 12:31:20 - INFO - codeparrot_training - Step 34602: {'lr': 0.00010885725836824728, 'samples': 17716736, 'steps': 34602, 'batch_loss/train': 0.696022535674274} 12/27/2021 12:31:33 - INFO - codeparrot_training - Step 34603: {'lr': 0.00010884424800166356, 'samples': 17717248, 'steps': 34603, 'batch_loss/train': 0.7846635935129598} 12/27/2021 12:31:43 - INFO - codeparrot_training - Step 34604: {'lr': 0.00010883123819626182, 'samples': 17717760, 'steps': 34604, 'batch_loss/train': 0.7488360591232777} 12/27/2021 12:31:54 - INFO - codeparrot_training - Step 34605: {'lr': 0.00010881822895209376, 'samples': 17718272, 'steps': 34605, 'batch_loss/train': 0.710947087733075} 12/27/2021 12:32:07 - INFO - codeparrot_training - Step 34606: {'lr': 0.00010880522026921111, 'samples': 17718784, 'steps': 34606, 'batch_loss/train': 0.9015623810701072} 12/27/2021 12:32:18 - INFO - codeparrot_training - Step 34607: {'lr': 0.00010879221214766558, 'samples': 17719296, 'steps': 34607, 'batch_loss/train': 0.7463231855072081} 12/27/2021 12:32:28 - INFO - codeparrot_training - Step 34608: {'lr': 0.00010877920458750889, 'samples': 17719808, 'steps': 34608, 
'batch_loss/train': 1.4602861151797697} 12/27/2021 12:32:39 - INFO - codeparrot_training - Step 34609: {'lr': 0.0001087661975887928, 'samples': 17720320, 'steps': 34609, 'batch_loss/train': 0.7637053364887834} 12/27/2021 12:32:51 - INFO - codeparrot_training - Step 34610: {'lr': 0.00010875319115156904, 'samples': 17720832, 'steps': 34610, 'batch_loss/train': 1.5383484023623168} 12/27/2021 12:33:02 - INFO - codeparrot_training - Step 34611: {'lr': 0.0001087401852758891, 'samples': 17721344, 'steps': 34611, 'batch_loss/train': 0.6765186977572739} 12/27/2021 12:33:12 - INFO - codeparrot_training - Step 34612: {'lr': 0.00010872717996180492, 'samples': 17721856, 'steps': 34612, 'batch_loss/train': 0.8003728231415153} 12/27/2021 12:33:26 - INFO - codeparrot_training - Step 34613: {'lr': 0.00010871417520936822, 'samples': 17722368, 'steps': 34613, 'batch_loss/train': 0.7909423806704581} 12/27/2021 12:33:36 - INFO - codeparrot_training - Step 34614: {'lr': 0.00010870117101863053, 'samples': 17722880, 'steps': 34614, 'batch_loss/train': 0.7491868613287807} 12/27/2021 12:33:47 - INFO - codeparrot_training - Step 34615: {'lr': 0.00010868816738964354, 'samples': 17723392, 'steps': 34615, 'batch_loss/train': 0.7443511830642819} 12/27/2021 12:33:59 - INFO - codeparrot_training - Step 34616: {'lr': 0.00010867516432245922, 'samples': 17723904, 'steps': 34616, 'batch_loss/train': 0.6887220535427332} 12/27/2021 12:34:09 - INFO - codeparrot_training - Step 34617: {'lr': 0.00010866216181712899, 'samples': 17724416, 'steps': 34617, 'batch_loss/train': 0.7761344045866281} 12/27/2021 12:34:20 - INFO - codeparrot_training - Step 34618: {'lr': 0.00010864915987370463, 'samples': 17724928, 'steps': 34618, 'batch_loss/train': 0.8244894072413445} 12/27/2021 12:34:31 - INFO - codeparrot_training - Step 34619: {'lr': 0.00010863615849223784, 'samples': 17725440, 'steps': 34619, 'batch_loss/train': 0.616745154839009} 12/27/2021 12:34:43 - INFO - codeparrot_training - Step 34620: {'lr': 
0.00010862315767278033, 'samples': 17725952, 'steps': 34620, 'batch_loss/train': 0.7780471825972199} 12/27/2021 12:34:53 - INFO - codeparrot_training - Step 34621: {'lr': 0.00010861015741538374, 'samples': 17726464, 'steps': 34621, 'batch_loss/train': 0.7424298439873382} 12/27/2021 12:35:04 - INFO - codeparrot_training - Step 34622: {'lr': 0.00010859715772009979, 'samples': 17726976, 'steps': 34622, 'batch_loss/train': 0.7409383812919259} 12/27/2021 12:35:16 - INFO - codeparrot_training - Step 34623: {'lr': 0.00010858415858698015, 'samples': 17727488, 'steps': 34623, 'batch_loss/train': 0.6543269562534988} 12/27/2021 12:35:27 - INFO - codeparrot_training - Step 34624: {'lr': 0.00010857116001607651, 'samples': 17728000, 'steps': 34624, 'batch_loss/train': 0.6096712345024571} 12/27/2021 12:35:37 - INFO - codeparrot_training - Step 34625: {'lr': 0.0001085581620074406, 'samples': 17728512, 'steps': 34625, 'batch_loss/train': 0.7690672120079398} 12/27/2021 12:35:48 - INFO - codeparrot_training - Step 34626: {'lr': 0.00010854516456112384, 'samples': 17729024, 'steps': 34626, 'batch_loss/train': 0.8348924724850804} 12/27/2021 12:36:01 - INFO - codeparrot_training - Step 34627: {'lr': 0.0001085321676771783, 'samples': 17729536, 'steps': 34627, 'batch_loss/train': 0.7390170157887042} 12/27/2021 12:36:12 - INFO - codeparrot_training - Step 34628: {'lr': 0.00010851917135565537, 'samples': 17730048, 'steps': 34628, 'batch_loss/train': 1.0446325205266476} 12/27/2021 12:36:23 - INFO - codeparrot_training - Step 34629: {'lr': 0.00010850617559660678, 'samples': 17730560, 'steps': 34629, 'batch_loss/train': 0.815996044781059} 12/27/2021 12:36:35 - INFO - codeparrot_training - Step 34630: {'lr': 0.00010849318040008419, 'samples': 17731072, 'steps': 34630, 'batch_loss/train': 0.8164632823318243} 12/27/2021 12:36:46 - INFO - codeparrot_training - Step 34631: {'lr': 0.0001084801857661393, 'samples': 17731584, 'steps': 34631, 'batch_loss/train': 1.0382157913409173} 12/27/2021 12:36:56 - 
INFO - codeparrot_training - Step 34632: {'lr': 0.00010846719169482374, 'samples': 17732096, 'steps': 34632, 'batch_loss/train': 0.9294242179021239} 12/27/2021 12:37:07 - INFO - codeparrot_training - Step 34633: {'lr': 0.00010845419818618921, 'samples': 17732608, 'steps': 34633, 'batch_loss/train': 0.5957425811793655} 12/27/2021 12:37:19 - INFO - codeparrot_training - Step 34634: {'lr': 0.00010844120524028733, 'samples': 17733120, 'steps': 34634, 'batch_loss/train': 0.736432796344161} 12/27/2021 12:37:30 - INFO - codeparrot_training - Step 34635: {'lr': 0.00010842821285716978, 'samples': 17733632, 'steps': 34635, 'batch_loss/train': 0.8045229761628434} 12/27/2021 12:37:40 - INFO - codeparrot_training - Step 34636: {'lr': 0.00010841522103688828, 'samples': 17734144, 'steps': 34636, 'batch_loss/train': 0.8096612774534151} 12/27/2021 12:37:55 - INFO - codeparrot_training - Step 34637: {'lr': 0.00010840222977949422, 'samples': 17734656, 'steps': 34637, 'batch_loss/train': 0.8099394831806421} 12/27/2021 12:38:05 - INFO - codeparrot_training - Step 34638: {'lr': 0.00010838923908503951, 'samples': 17735168, 'steps': 34638, 'batch_loss/train': 0.7715016640722752} 12/27/2021 12:38:16 - INFO - codeparrot_training - Step 34639: {'lr': 0.0001083762489535758, 'samples': 17735680, 'steps': 34639, 'batch_loss/train': 0.7660728599876165} 12/27/2021 12:38:28 - INFO - codeparrot_training - Step 34640: {'lr': 0.0001083632593851546, 'samples': 17736192, 'steps': 34640, 'batch_loss/train': 0.48957786796381697} 12/27/2021 12:38:39 - INFO - codeparrot_training - Step 34641: {'lr': 0.00010835027037982747, 'samples': 17736704, 'steps': 34641, 'batch_loss/train': 0.7500161277130246} 12/27/2021 12:38:49 - INFO - codeparrot_training - Step 34642: {'lr': 0.00010833728193764636, 'samples': 17737216, 'steps': 34642, 'batch_loss/train': 0.7953072022646666} 12/27/2021 12:39:03 - INFO - codeparrot_training - Step 34643: {'lr': 0.00010832429405866268, 'samples': 17737728, 'steps': 34643, 
'batch_loss/train': 0.7877178522758186} 12/27/2021 12:39:14 - INFO - codeparrot_training - Step 34644: {'lr': 0.00010831130674292808, 'samples': 17738240, 'steps': 34644, 'batch_loss/train': 0.6145035493536852} 12/27/2021 12:39:24 - INFO - codeparrot_training - Step 34645: {'lr': 0.00010829831999049422, 'samples': 17738752, 'steps': 34645, 'batch_loss/train': 0.8127560894936323} 12/27/2021 12:39:35 - INFO - codeparrot_training - Step 34646: {'lr': 0.00010828533380141272, 'samples': 17739264, 'steps': 34646, 'batch_loss/train': 0.8187228164752014} 12/27/2021 12:39:47 - INFO - codeparrot_training - Step 34647: {'lr': 0.00010827234817573526, 'samples': 17739776, 'steps': 34647, 'batch_loss/train': 0.7122007822617888} 12/27/2021 12:39:58 - INFO - codeparrot_training - Step 34648: {'lr': 0.00010825936311351341, 'samples': 17740288, 'steps': 34648, 'batch_loss/train': 0.6729576732032001} 12/27/2021 12:40:08 - INFO - codeparrot_training - Step 34649: {'lr': 0.00010824637861479883, 'samples': 17740800, 'steps': 34649, 'batch_loss/train': 0.7296132147312164} 12/27/2021 12:40:20 - INFO - codeparrot_training - Step 34650: {'lr': 0.0001082333946796431, 'samples': 17741312, 'steps': 34650, 'batch_loss/train': 0.739288526121527} 12/27/2021 12:40:31 - INFO - codeparrot_training - Step 34651: {'lr': 0.00010822041130809798, 'samples': 17741824, 'steps': 34651, 'batch_loss/train': 0.7354347235523164} 12/27/2021 12:40:41 - INFO - codeparrot_training - Step 34652: {'lr': 0.00010820742850021478, 'samples': 17742336, 'steps': 34652, 'batch_loss/train': 0.7592067720834166} 12/27/2021 12:40:55 - INFO - codeparrot_training - Step 34653: {'lr': 0.00010819444625604538, 'samples': 17742848, 'steps': 34653, 'batch_loss/train': 0.7563045585993677} 12/27/2021 12:41:05 - INFO - codeparrot_training - Step 34654: {'lr': 0.00010818146457564143, 'samples': 17743360, 'steps': 34654, 'batch_loss/train': 0.7545416289940476} 12/27/2021 12:41:16 - INFO - codeparrot_training - Step 34655: {'lr': 
0.00010816848345905434, 'samples': 17743872, 'steps': 34655, 'batch_loss/train': 0.9308608849532902} 12/27/2021 12:41:26 - INFO - codeparrot_training - Step 34656: {'lr': 0.0001081555029063358, 'samples': 17744384, 'steps': 34656, 'batch_loss/train': 0.7801777264103293} 12/27/2021 12:41:39 - INFO - codeparrot_training - Step 34657: {'lr': 0.0001081425229175374, 'samples': 17744896, 'steps': 34657, 'batch_loss/train': 0.8468057848513126} 12/27/2021 12:41:49 - INFO - codeparrot_training - Step 34658: {'lr': 0.00010812954349271079, 'samples': 17745408, 'steps': 34658, 'batch_loss/train': 0.6960508581250906} 12/27/2021 12:42:00 - INFO - codeparrot_training - Step 34659: {'lr': 0.00010811656463190756, 'samples': 17745920, 'steps': 34659, 'batch_loss/train': 0.7539968099445105} 12/27/2021 12:42:13 - INFO - codeparrot_training - Step 34660: {'lr': 0.00010810358633517928, 'samples': 17746432, 'steps': 34660, 'batch_loss/train': 0.8067070813849568} 12/27/2021 12:42:24 - INFO - codeparrot_training - Step 34661: {'lr': 0.00010809060860257754, 'samples': 17746944, 'steps': 34661, 'batch_loss/train': 0.7711439705453813} 12/27/2021 12:42:34 - INFO - codeparrot_training - Step 34662: {'lr': 0.00010807763143415408, 'samples': 17747456, 'steps': 34662, 'batch_loss/train': 0.8172398949973285} 12/27/2021 12:42:47 - INFO - codeparrot_training - Step 34663: {'lr': 0.00010806465482996017, 'samples': 17747968, 'steps': 34663, 'batch_loss/train': 0.7995973993092775} 12/27/2021 12:42:57 - INFO - codeparrot_training - Step 34664: {'lr': 0.00010805167879004767, 'samples': 17748480, 'steps': 34664, 'batch_loss/train': 0.6847888482734561} 12/27/2021 12:43:08 - INFO - codeparrot_training - Step 34665: {'lr': 0.00010803870331446824, 'samples': 17748992, 'steps': 34665, 'batch_loss/train': 1.1733375864569098} 12/27/2021 12:43:19 - INFO - codeparrot_training - Step 34666: {'lr': 0.00010802572840327321, 'samples': 17749504, 'steps': 34666, 'batch_loss/train': 0.6965578724630177} 12/27/2021 12:43:31 
- INFO - codeparrot_training - Step 34667: {'lr': 0.00010801275405651418, 'samples': 17750016, 'steps': 34667, 'batch_loss/train': 1.9556647390127182} 12/27/2021 12:43:42 - INFO - codeparrot_training - Step 34668: {'lr': 0.00010799978027424301, 'samples': 17750528, 'steps': 34668, 'batch_loss/train': 1.9834051867946982} 12/27/2021 12:43:52 - INFO - codeparrot_training - Step 34669: {'lr': 0.00010798680705651101, 'samples': 17751040, 'steps': 34669, 'batch_loss/train': 0.8437837222591043} 12/27/2021 12:44:06 - INFO - codeparrot_training - Step 34670: {'lr': 0.00010797383440336975, 'samples': 17751552, 'steps': 34670, 'batch_loss/train': 0.7777055143378675} 12/27/2021 12:44:17 - INFO - codeparrot_training - Step 34671: {'lr': 0.00010796086231487107, 'samples': 17752064, 'steps': 34671, 'batch_loss/train': 0.7493256242014468} 12/27/2021 12:44:28 - INFO - codeparrot_training - Step 34672: {'lr': 0.00010794789079106629, 'samples': 17752576, 'steps': 34672, 'batch_loss/train': 0.8084011245518923} 12/27/2021 12:44:38 - INFO - codeparrot_training - Step 34673: {'lr': 0.00010793491983200707, 'samples': 17753088, 'steps': 34673, 'batch_loss/train': 1.7649640352465212} 12/27/2021 12:44:50 - INFO - codeparrot_training - Step 34674: {'lr': 0.00010792194943774495, 'samples': 17753600, 'steps': 34674, 'batch_loss/train': 0.7391906068660319} 12/27/2021 12:45:01 - INFO - codeparrot_training - Step 34675: {'lr': 0.00010790897960833152, 'samples': 17754112, 'steps': 34675, 'batch_loss/train': 0.7241857456974685} 12/27/2021 12:45:12 - INFO - codeparrot_training - Step 34676: {'lr': 0.00010789601034381833, 'samples': 17754624, 'steps': 34676, 'batch_loss/train': 0.765894562471658} 12/27/2021 12:45:24 - INFO - codeparrot_training - Step 34677: {'lr': 0.00010788304164425702, 'samples': 17755136, 'steps': 34677, 'batch_loss/train': 0.7900125226005912} 12/27/2021 12:45:34 - INFO - codeparrot_training - Step 34678: {'lr': 0.0001078700735096989, 'samples': 17755648, 'steps': 34678, 
'batch_loss/train': 0.787218248238787} 12/27/2021 12:45:45 - INFO - codeparrot_training - Step 34679: {'lr': 0.00010785710594019582, 'samples': 17756160, 'steps': 34679, 'batch_loss/train': 0.7864570086821914} 12/27/2021 12:45:57 - INFO - codeparrot_training - Step 34680: {'lr': 0.00010784413893579931, 'samples': 17756672, 'steps': 34680, 'batch_loss/train': 0.7998557407408953} 12/27/2021 12:46:08 - INFO - codeparrot_training - Step 34681: {'lr': 0.00010783117249656071, 'samples': 17757184, 'steps': 34681, 'batch_loss/train': 0.6883670655079186} 12/27/2021 12:46:18 - INFO - codeparrot_training - Step 34682: {'lr': 0.0001078182066225317, 'samples': 17757696, 'steps': 34682, 'batch_loss/train': 0.8720883871428668} 12/27/2021 12:46:29 - INFO - codeparrot_training - Step 34683: {'lr': 0.00010780524131376379, 'samples': 17758208, 'steps': 34683, 'batch_loss/train': 0.6194678008323535} 12/27/2021 12:46:42 - INFO - codeparrot_training - Step 34684: {'lr': 0.00010779227657030857, 'samples': 17758720, 'steps': 34684, 'batch_loss/train': 0.7247435310855508} 12/27/2021 12:46:53 - INFO - codeparrot_training - Step 34685: {'lr': 0.00010777931239221758, 'samples': 17759232, 'steps': 34685, 'batch_loss/train': 0.7232412076555192} 12/27/2021 12:47:03 - INFO - codeparrot_training - Step 34686: {'lr': 0.0001077663487795423, 'samples': 17759744, 'steps': 34686, 'batch_loss/train': 0.7788467863574624} 12/27/2021 12:47:15 - INFO - codeparrot_training - Step 34687: {'lr': 0.00010775338573233438, 'samples': 17760256, 'steps': 34687, 'batch_loss/train': 0.624512504786253} 12/27/2021 12:47:26 - INFO - codeparrot_training - Step 34688: {'lr': 0.00010774042325064523, 'samples': 17760768, 'steps': 34688, 'batch_loss/train': 0.759257273748517} 12/27/2021 12:47:37 - INFO - codeparrot_training - Step 34689: {'lr': 0.00010772746133452646, 'samples': 17761280, 'steps': 34689, 'batch_loss/train': 0.7733376757241786} 12/27/2021 12:47:51 - INFO - codeparrot_training - Step 34690: {'lr': 
0.00010771449998402958, 'samples': 17761792, 'steps': 34690, 'batch_loss/train': 0.7387908538803458} 12/27/2021 12:48:02 - INFO - codeparrot_training - Step 34691: {'lr': 0.00010770153919920622, 'samples': 17762304, 'steps': 34691, 'batch_loss/train': 0.8707666387781501} 12/27/2021 12:48:13 - INFO - codeparrot_training - Step 34692: {'lr': 0.0001076885789801077, 'samples': 17762816, 'steps': 34692, 'batch_loss/train': 0.7045112610794604} 12/27/2021 12:48:23 - INFO - codeparrot_training - Step 34693: {'lr': 0.00010767561932678558, 'samples': 17763328, 'steps': 34693, 'batch_loss/train': 0.7492075499612838} 12/27/2021 12:48:35 - INFO - codeparrot_training - Step 34694: {'lr': 0.00010766266023929164, 'samples': 17763840, 'steps': 34694, 'batch_loss/train': 0.7566900396486744} 12/27/2021 12:48:46 - INFO - codeparrot_training - Step 34695: {'lr': 0.00010764970171767713, 'samples': 17764352, 'steps': 34695, 'batch_loss/train': 0.7428379645571113} 12/27/2021 12:48:57 - INFO - codeparrot_training - Step 34696: {'lr': 0.00010763674376199356, 'samples': 17764864, 'steps': 34696, 'batch_loss/train': 0.7548131491057575} 12/27/2021 12:49:10 - INFO - codeparrot_training - Step 34697: {'lr': 0.00010762378637229273, 'samples': 17765376, 'steps': 34697, 'batch_loss/train': 0.6321203447878361} 12/27/2021 12:49:20 - INFO - codeparrot_training - Step 34698: {'lr': 0.00010761082954862586, 'samples': 17765888, 'steps': 34698, 'batch_loss/train': 0.8099675234407187} 12/27/2021 12:49:31 - INFO - codeparrot_training - Step 34699: {'lr': 0.00010759787329104459, 'samples': 17766400, 'steps': 34699, 'batch_loss/train': 0.7631302773952484} 12/27/2021 12:49:43 - INFO - codeparrot_training - Step 34700: {'lr': 0.00010758491759960041, 'samples': 17766912, 'steps': 34700, 'batch_loss/train': 0.6966197041328996} 12/27/2021 12:49:54 - INFO - codeparrot_training - Step 34701: {'lr': 0.00010757196247434482, 'samples': 17767424, 'steps': 34701, 'batch_loss/train': 0.7526759523898363} 12/27/2021 
12:50:04 - INFO - codeparrot_training - Step 34702: {'lr': 0.0001075590079153293, 'samples': 17767936, 'steps': 34702, 'batch_loss/train': 0.7179747186601162} 12/27/2021 12:50:15 - INFO - codeparrot_training - Step 34703: {'lr': 0.00010754605392260549, 'samples': 17768448, 'steps': 34703, 'batch_loss/train': 0.7417054851539433} 12/27/2021 12:50:27 - INFO - codeparrot_training - Step 34704: {'lr': 0.00010753310049622461, 'samples': 17768960, 'steps': 34704, 'batch_loss/train': 0.8079427923075855} 12/27/2021 12:50:38 - INFO - codeparrot_training - Step 34705: {'lr': 0.00010752014763623841, 'samples': 17769472, 'steps': 34705, 'batch_loss/train': 0.7629768126644194} 12/27/2021 12:50:48 - INFO - codeparrot_training - Step 34706: {'lr': 0.0001075071953426984, 'samples': 17769984, 'steps': 34706, 'batch_loss/train': 0.8269611629657447} 12/27/2021 12:51:03 - INFO - codeparrot_training - Step 34707: {'lr': 0.00010749424361565576, 'samples': 17770496, 'steps': 34707, 'batch_loss/train': 0.7226069942116737} 12/27/2021 12:51:13 - INFO - codeparrot_training - Step 34708: {'lr': 0.00010748129245516245, 'samples': 17771008, 'steps': 34708, 'batch_loss/train': 0.7173037598840892} 12/27/2021 12:51:24 - INFO - codeparrot_training - Step 34709: {'lr': 0.00010746834186126955, 'samples': 17771520, 'steps': 34709, 'batch_loss/train': 1.1994297448545694} 12/27/2021 12:51:34 - INFO - codeparrot_training - Step 34710: {'lr': 0.00010745539183402875, 'samples': 17772032, 'steps': 34710, 'batch_loss/train': 0.5124312825500965} 12/27/2021 12:51:46 - INFO - codeparrot_training - Step 34711: {'lr': 0.00010744244237349147, 'samples': 17772544, 'steps': 34711, 'batch_loss/train': 0.7146852011792362} 12/27/2021 12:51:57 - INFO - codeparrot_training - Step 34712: {'lr': 0.00010742949347970924, 'samples': 17773056, 'steps': 34712, 'batch_loss/train': 0.8529230123385787} 12/27/2021 12:52:08 - INFO - codeparrot_training - Step 34713: {'lr': 0.00010741654515273346, 'samples': 17773568, 'steps': 34713, 
'batch_loss/train': 0.7793862216640264}
12/27/2021 12:52:26 - INFO - codeparrot_training - Step 34714: {'lr': 0.00010740359739261568, 'samples': 17774080, 'steps': 34714, 'batch_loss/train': 0.6260794447152875}
12/27/2021 12:52:37 - INFO - codeparrot_training - Step 34715: {'lr': 0.00010739065019940739, 'samples': 17774592, 'steps': 34715, 'batch_loss/train': 0.733453297521919}
12/27/2021 12:52:47 - INFO - codeparrot_training - Step 34716: {'lr': 0.00010737770357315999, 'samples': 17775104, 'steps': 34716, 'batch_loss/train': 0.7124700085259974}
12/27/2021 12:52:59 - INFO - codeparrot_training - Step 34717: {'lr': 0.00010736475751392508, 'samples': 17775616, 'steps': 34717, 'batch_loss/train': 0.846117440611124}
12/27/2021 12:53:10 - INFO - codeparrot_training - Step 34718: {'lr': 0.00010735181202175395, 'samples': 17776128, 'steps': 34718, 'batch_loss/train': 0.7690116837620735}
12/27/2021 12:53:20 - INFO - codeparrot_training - Step 34719: {'lr': 0.00010733886709669807, 'samples': 17776640, 'steps': 34719, 'batch_loss/train': 0.6683000102639198}
12/27/2021 12:53:33 - INFO - codeparrot_training - Step 34720: {'lr': 0.00010732592273880917, 'samples': 17777152, 'steps': 34720, 'batch_loss/train': 0.7009621229954064}
12/27/2021 12:53:44 - INFO - codeparrot_training - Step 34721: {'lr': 0.00010731297894813843, 'samples': 17777664, 'steps': 34721, 'batch_loss/train': 0.8058420983143151}
12/27/2021 12:53:54 - INFO - codeparrot_training - Step 34722: {'lr': 0.00010730003572473732, 'samples': 17778176, 'steps': 34722, 'batch_loss/train': 0.6578275871579535}
12/27/2021 12:54:05 - INFO - codeparrot_training - Step 34723: {'lr': 0.00010728709306865758, 'samples': 17778688, 'steps': 34723, 'batch_loss/train': 0.5909517244435847}
12/27/2021 12:54:17 - INFO - codeparrot_training - Step 34724: {'lr': 0.00010727415097995039, 'samples': 17779200, 'steps': 34724, 'batch_loss/train': 0.5513973543420434}
12/27/2021 12:54:28 - INFO - codeparrot_training - Step 34725: {'lr': 0.00010726120945866724, 'samples': 17779712, 'steps': 34725, 'batch_loss/train': 0.7640963401645422}
12/27/2021 12:54:39 - INFO - codeparrot_training - Step 34726: {'lr': 0.00010724826850485966, 'samples': 17780224, 'steps': 34726, 'batch_loss/train': 0.7959780250675976}
12/27/2021 12:54:52 - INFO - codeparrot_training - Step 34727: {'lr': 0.00010723532811857906, 'samples': 17780736, 'steps': 34727, 'batch_loss/train': 0.5517269042902626}
12/27/2021 12:55:03 - INFO - codeparrot_training - Step 34728: {'lr': 0.00010722238829987688, 'samples': 17781248, 'steps': 34728, 'batch_loss/train': 0.7480586310848594}
12/27/2021 12:55:14 - INFO - codeparrot_training - Step 34729: {'lr': 0.00010720944904880467, 'samples': 17781760, 'steps': 34729, 'batch_loss/train': 0.7518724595429376}
12/27/2021 12:55:24 - INFO - codeparrot_training - Step 34730: {'lr': 0.00010719651036541358, 'samples': 17782272, 'steps': 34730, 'batch_loss/train': 0.7269265330396593}
12/27/2021 12:55:36 - INFO - codeparrot_training - Step 34731: {'lr': 0.00010718357224975536, 'samples': 17782784, 'steps': 34731, 'batch_loss/train': 0.8050529193133116}
12/27/2021 12:55:47 - INFO - codeparrot_training - Step 34732: {'lr': 0.0001071706347018814, 'samples': 17783296, 'steps': 34732, 'batch_loss/train': 0.7464966252446175}
12/27/2021 12:55:58 - INFO - codeparrot_training - Step 34733: {'lr': 0.00010715769772184291, 'samples': 17783808, 'steps': 34733, 'batch_loss/train': 0.7958787530660629}
12/27/2021 12:56:10 - INFO - codeparrot_training - Step 34734: {'lr': 0.00010714476130969156, 'samples': 17784320, 'steps': 34734, 'batch_loss/train': 0.6793803606997244}
12/27/2021 12:56:20 - INFO - codeparrot_training - Step 34735: {'lr': 0.00010713182546547878, 'samples': 17784832, 'steps': 34735, 'batch_loss/train': 0.6894175773486495}
12/27/2021 12:56:31 - INFO - codeparrot_training - Step 34736: {'lr': 0.00010711889018925583, 'samples': 17785344, 'steps': 34736, 'batch_loss/train': 0.8266613425221294}
12/27/2021 12:56:45 - INFO - codeparrot_training - Step 34737: {'lr': 0.00010710595548107421, 'samples': 17785856, 'steps': 34737, 'batch_loss/train': 0.7952640503644943}
12/27/2021 12:56:56 - INFO - codeparrot_training - Step 34738: {'lr': 0.00010709302134098539, 'samples': 17786368, 'steps': 34738, 'batch_loss/train': 0.7265946976840496}
12/27/2021 12:57:06 - INFO - codeparrot_training - Step 34739: {'lr': 0.00010708008776904072, 'samples': 17786880, 'steps': 34739, 'batch_loss/train': 0.7957381082233042}
12/27/2021 12:57:17 - INFO - codeparrot_training - Step 34740: {'lr': 0.0001070671547652917, 'samples': 17787392, 'steps': 34740, 'batch_loss/train': 0.8000319530256093}
12/27/2021 12:57:29 - INFO - codeparrot_training - Step 34741: {'lr': 0.00010705422232978965, 'samples': 17787904, 'steps': 34741, 'batch_loss/train': 0.7436034176498652}
12/27/2021 12:57:40 - INFO - codeparrot_training - Step 34742: {'lr': 0.00010704129046258607, 'samples': 17788416, 'steps': 34742, 'batch_loss/train': 0.7677653960417956}
12/27/2021 12:57:50 - INFO - codeparrot_training - Step 34743: {'lr': 0.00010702835916373233, 'samples': 17788928, 'steps': 34743, 'batch_loss/train': 0.7715262910351157}
12/27/2021 12:58:02 - INFO - codeparrot_training - Step 34744: {'lr': 0.00010701542843327985, 'samples': 17789440, 'steps': 34744, 'batch_loss/train': 0.7844451433047652}
12/27/2021 12:58:13 - INFO - codeparrot_training - Step 34745: {'lr': 0.00010700249827128006, 'samples': 17789952, 'steps': 34745, 'batch_loss/train': 0.7617988921701908}
12/27/2021 12:58:23 - INFO - codeparrot_training - Step 34746: {'lr': 0.00010698956867778439, 'samples': 17790464, 'steps': 34746, 'batch_loss/train': 0.7489893996389583}
12/27/2021 12:58:38 - INFO - codeparrot_training - Step 34747: {'lr': 0.00010697663965284416, 'samples': 17790976, 'steps': 34747, 'batch_loss/train': 0.8312839809805155}
12/27/2021 12:58:48 - INFO - codeparrot_training - Step 34748: {'lr': 0.00010696371119651069, 'samples': 17791488, 'steps': 34748, 'batch_loss/train': 0.771825313102454}
12/27/2021 12:58:59 - INFO - codeparrot_training - Step 34749: {'lr': 0.00010695078330883565, 'samples': 17792000, 'steps': 34749, 'batch_loss/train': 0.5672894455492496}
12/27/2021 12:59:11 - INFO - codeparrot_training - Step 34750: {'lr': 0.00010693785598987021, 'samples': 17792512, 'steps': 34750, 'batch_loss/train': 0.8352409871295094}
12/27/2021 12:59:22 - INFO - codeparrot_training - Step 34751: {'lr': 0.00010692492923966584, 'samples': 17793024, 'steps': 34751, 'batch_loss/train': 0.7140766149386764}
12/27/2021 12:59:32 - INFO - codeparrot_training - Step 34752: {'lr': 0.00010691200305827395, 'samples': 17793536, 'steps': 34752, 'batch_loss/train': 0.5252500658389181}
12/27/2021 12:59:43 - INFO - codeparrot_training - Step 34753: {'lr': 0.0001068990774457459, 'samples': 17794048, 'steps': 34753, 'batch_loss/train': 0.7578406487591565}
12/27/2021 12:59:57 - INFO - codeparrot_training - Step 34754: {'lr': 0.00010688615240213306, 'samples': 17794560, 'steps': 34754, 'batch_loss/train': 0.7417267495766282}
12/27/2021 13:00:08 - INFO - codeparrot_training - Step 34755: {'lr': 0.00010687322792748693, 'samples': 17795072, 'steps': 34755, 'batch_loss/train': 0.7195241260924377}
12/27/2021 13:00:19 - INFO - codeparrot_training - Step 34756: {'lr': 0.00010686030402185865, 'samples': 17795584, 'steps': 34756, 'batch_loss/train': 0.5227420385926962}
12/27/2021 13:00:31 - INFO - codeparrot_training - Step 34757: {'lr': 0.00010684738068529984, 'samples': 17796096, 'steps': 34757, 'batch_loss/train': 0.7642926312983036}
12/27/2021 13:00:41 - INFO - codeparrot_training - Step 34758: {'lr': 0.00010683445791786186, 'samples': 17796608, 'steps': 34758, 'batch_loss/train': 0.6872485694475472}
12/27/2021 13:00:52 - INFO - codeparrot_training - Step 34759: {'lr': 0.00010682153571959585, 'samples': 17797120, 'steps': 34759, 'batch_loss/train': 0.7641118243336678}
12/27/2021 13:01:04 - INFO - codeparrot_training - Step 34760: {'lr': 0.00010680861409055346, 'samples': 17797632, 'steps': 34760, 'batch_loss/train': 0.6849093921482563}
12/27/2021 13:01:15 - INFO - codeparrot_training - Step 34761: {'lr': 0.00010679569303078603, 'samples': 17798144, 'steps': 34761, 'batch_loss/train': 0.7101492434740067}
12/27/2021 13:01:25 - INFO - codeparrot_training - Step 34762: {'lr': 0.00010678277254034477, 'samples': 17798656, 'steps': 34762, 'batch_loss/train': 0.7454499988816679}
12/27/2021 13:01:36 - INFO - codeparrot_training - Step 34763: {'lr': 0.00010676985261928109, 'samples': 17799168, 'steps': 34763, 'batch_loss/train': 0.7156134457327425}
12/27/2021 13:01:48 - INFO - codeparrot_training - Step 34764: {'lr': 0.00010675693326764644, 'samples': 17799680, 'steps': 34764, 'batch_loss/train': 0.7232645156327635}
12/27/2021 13:01:59 - INFO - codeparrot_training - Step 34765: {'lr': 0.00010674401448549212, 'samples': 17800192, 'steps': 34765, 'batch_loss/train': 0.699061450897716}
12/27/2021 13:02:10 - INFO - codeparrot_training - Step 34766: {'lr': 0.00010673109627286951, 'samples': 17800704, 'steps': 34766, 'batch_loss/train': 0.5463937608583365}
12/27/2021 13:02:24 - INFO - codeparrot_training - Step 34767: {'lr': 0.00010671817862982997, 'samples': 17801216, 'steps': 34767, 'batch_loss/train': 0.8092751065269113}
12/27/2021 13:02:34 - INFO - codeparrot_training - Step 34768: {'lr': 0.00010670526155642485, 'samples': 17801728, 'steps': 34768, 'batch_loss/train': 0.5977837836835533}
12/27/2021 13:02:45 - INFO - codeparrot_training - Step 34769: {'lr': 0.00010669234505270547, 'samples': 17802240, 'steps': 34769, 'batch_loss/train': 0.66429414274171}
12/27/2021 13:02:57 - INFO - codeparrot_training - Step 34770: {'lr': 0.00010667942911872325, 'samples': 17802752, 'steps': 34770, 'batch_loss/train': 0.6995111061260104}
12/27/2021 13:03:08 - INFO - codeparrot_training - Step 34771: {'lr': 0.00010666651375452952, 'samples': 17803264, 'steps': 34771, 'batch_loss/train': 0.909400918520987}
12/27/2021 13:03:18 - INFO - codeparrot_training - Step 34772: {'lr': 0.00010665359896017566, 'samples': 17803776, 'steps': 34772, 'batch_loss/train': 0.7714871037751436}
12/27/2021 13:03:29 - INFO - codeparrot_training - Step 34773: {'lr': 0.0001066406847357129, 'samples': 17804288, 'steps': 34773, 'batch_loss/train': 0.7095309821888804}
12/27/2021 13:03:42 - INFO - codeparrot_training - Step 34774: {'lr': 0.00010662777108119254, 'samples': 17804800, 'steps': 34774, 'batch_loss/train': 0.852434559725225}
12/27/2021 13:03:52 - INFO - codeparrot_training - Step 34775: {'lr': 0.00010661485799666623, 'samples': 17805312, 'steps': 34775, 'batch_loss/train': 0.74526003934443}
12/27/2021 13:04:03 - INFO - codeparrot_training - Step 34776: {'lr': 0.000106601945482185, 'samples': 17805824, 'steps': 34776, 'batch_loss/train': 0.7230232413858175}
12/27/2021 13:04:17 - INFO - codeparrot_training - Step 34777: {'lr': 0.00010658903353780028, 'samples': 17806336, 'steps': 34777, 'batch_loss/train': 1.6783784059807658}
12/27/2021 13:04:27 - INFO - codeparrot_training - Step 34778: {'lr': 0.00010657612216356344, 'samples': 17806848, 'steps': 34778, 'batch_loss/train': 0.7808356084860861}
12/27/2021 13:04:38 - INFO - codeparrot_training - Step 34779: {'lr': 0.00010656321135952576, 'samples': 17807360, 'steps': 34779, 'batch_loss/train': 0.6492507376242429}
12/27/2021 13:04:50 - INFO - codeparrot_training - Step 34780: {'lr': 0.0001065503011257386, 'samples': 17807872, 'steps': 34780, 'batch_loss/train': 0.8584098853170872}
12/27/2021 13:05:01 - INFO - codeparrot_training - Step 34781: {'lr': 0.00010653739146225328, 'samples': 17808384, 'steps': 34781, 'batch_loss/train': 0.691904753446579}
12/27/2021 13:05:11 - INFO - codeparrot_training - Step 34782: {'lr': 0.0001065244823691211, 'samples': 17808896, 'steps': 34782, 'batch_loss/train': 0.7899387869983912}
12/27/2021 13:05:22 - INFO - codeparrot_training - Step 34783: {'lr': 0.00010651157384639343, 'samples': 17809408, 'steps': 34783, 'batch_loss/train': 0.7217552978545427}
12/27/2021 13:05:36 - INFO - codeparrot_training - Step 34784: {'lr': 0.00010649866589412166, 'samples': 17809920, 'steps': 34784, 'batch_loss/train': 0.827704768627882}
12/27/2021 13:05:46 - INFO - codeparrot_training - Step 34785: {'lr': 0.00010648575851235681, 'samples': 17810432, 'steps': 34785, 'batch_loss/train': 0.7254002639092505}
12/27/2021 13:05:57 - INFO - codeparrot_training - Step 34786: {'lr': 0.00010647285170115048, 'samples': 17810944, 'steps': 34786, 'batch_loss/train': 0.7550799702294171}
12/27/2021 13:06:09 - INFO - codeparrot_training - Step 34787: {'lr': 0.00010645994546055401, 'samples': 17811456, 'steps': 34787, 'batch_loss/train': 0.7321777837350965}
12/27/2021 13:06:20 - INFO - codeparrot_training - Step 34788: {'lr': 0.0001064470397906185, 'samples': 17811968, 'steps': 34788, 'batch_loss/train': 0.8069809409789741}
12/27/2021 13:06:30 - INFO - codeparrot_training - Step 34789: {'lr': 0.00010643413469139526, 'samples': 17812480, 'steps': 34789, 'batch_loss/train': 0.801814878359437}
12/27/2021 13:06:42 - INFO - codeparrot_training - Step 34790: {'lr': 0.0001064212301629359, 'samples': 17812992, 'steps': 34790, 'batch_loss/train': 0.540777264686767}
12/27/2021 13:06:53 - INFO - codeparrot_training - Step 34791: {'lr': 0.00010640832620529139, 'samples': 17813504, 'steps': 34791, 'batch_loss/train': 0.6982282213866711}
12/27/2021 13:07:04 - INFO - codeparrot_training - Step 34792: {'lr': 0.00010639542281851317, 'samples': 17814016, 'steps': 34792, 'batch_loss/train': 0.617376018781215}
12/27/2021 13:07:14 - INFO - codeparrot_training - Step 34793: {'lr': 0.00010638252000265252, 'samples': 17814528, 'steps': 34793, 'batch_loss/train': 0.7352919853292406}
12/27/2021 13:07:28 - INFO - codeparrot_training - Step 34794: {'lr': 0.00010636961775776074, 'samples': 17815040, 'steps': 34794, 'batch_loss/train': 0.7512033988023177}
12/27/2021 13:07:39 - INFO - codeparrot_training - Step 34795: {'lr': 0.00010635671608388913, 'samples': 17815552, 'steps': 34795, 'batch_loss/train': 0.7624257197603583}
12/27/2021 13:07:50 - INFO - codeparrot_training - Step 34796: {'lr': 0.000106343814981089, 'samples': 17816064, 'steps': 34796, 'batch_loss/train': 0.6872226749546826}
12/27/2021 13:08:02 - INFO - codeparrot_training - Step 34797: {'lr': 0.00010633091444941159, 'samples': 17816576, 'steps': 34797, 'batch_loss/train': 0.6986401295289397}
12/27/2021 13:08:13 - INFO - codeparrot_training - Step 34798: {'lr': 0.00010631801448890832, 'samples': 17817088, 'steps': 34798, 'batch_loss/train': 0.814217405859381}
12/27/2021 13:08:23 - INFO - codeparrot_training - Step 34799: {'lr': 0.00010630511509963029, 'samples': 17817600, 'steps': 34799, 'batch_loss/train': 0.7735987743362784}
12/27/2021 13:08:35 - INFO - codeparrot_training - Step 34800: {'lr': 0.00010629221628162877, 'samples': 17818112, 'steps': 34800, 'batch_loss/train': 0.7190221979981288}
12/27/2021 13:08:46 - INFO - codeparrot_training - Step 34801: {'lr': 0.00010627931803495534, 'samples': 17818624, 'steps': 34801, 'batch_loss/train': 0.7048407201655209}
12/27/2021 13:08:57 - INFO - codeparrot_training - Step 34802: {'lr': 0.00010626642035966094, 'samples': 17819136, 'steps': 34802, 'batch_loss/train': 0.8072689836844802}
12/27/2021 13:09:07 - INFO - codeparrot_training - Step 34803: {'lr': 0.00010625352325579698, 'samples': 17819648, 'steps': 34803, 'batch_loss/train': 0.606172438710928}
12/27/2021 13:09:19 - INFO - codeparrot_training - Step 34804: {'lr': 0.00010624062672341478, 'samples': 17820160, 'steps': 34804, 'batch_loss/train': 0.7140773823484778}
12/27/2021 13:09:30 - INFO - codeparrot_training - Step 34805: {'lr': 0.00010622773076256553, 'samples': 17820672, 'steps': 34805, 'batch_loss/train': 0.6453380449675024}
12/27/2021 13:09:41 - INFO - codeparrot_training - Step 34806: {'lr': 0.00010621483537330053, 'samples': 17821184, 'steps': 34806, 'batch_loss/train': 0.699277764884755}
12/27/2021 13:09:54 - INFO - codeparrot_training - Step 34807: {'lr': 0.00010620194055567109, 'samples': 17821696, 'steps': 34807, 'batch_loss/train': 0.6940719482372515}
12/27/2021 13:10:05 - INFO - codeparrot_training - Step 34808: {'lr': 0.00010618904630972843, 'samples': 17822208, 'steps': 34808, 'batch_loss/train': 0.7215255298651755}
12/27/2021 13:10:16 - INFO - codeparrot_training - Step 34809: {'lr': 0.0001061761526355238, 'samples': 17822720, 'steps': 34809, 'batch_loss/train': 0.7134733526036143}
12/27/2021 13:10:28 - INFO - codeparrot_training - Step 34810: {'lr': 0.00010616325953310857, 'samples': 17823232, 'steps': 34810, 'batch_loss/train': 0.6835008473135531}
12/27/2021 13:10:39 - INFO - codeparrot_training - Step 34811: {'lr': 0.00010615036700253374, 'samples': 17823744, 'steps': 34811, 'batch_loss/train': 0.9105136711150408}
12/27/2021 13:10:49 - INFO - codeparrot_training - Step 34812: {'lr': 0.00010613747504385084, 'samples': 17824256, 'steps': 34812, 'batch_loss/train': 0.7879514140659012}
12/27/2021 13:11:00 - INFO - codeparrot_training - Step 34813: {'lr': 0.00010612458365711109, 'samples': 17824768, 'steps': 34813, 'batch_loss/train': 0.8271903907880187}
12/27/2021 13:11:14 - INFO - codeparrot_training - Step 34814: {'lr': 0.00010611169284236558, 'samples': 17825280, 'steps': 34814, 'batch_loss/train': 0.7801403934136033}
12/27/2021 13:11:24 - INFO - codeparrot_training - Step 34815: {'lr': 0.00010609880259966556, 'samples': 17825792, 'steps': 34815, 'batch_loss/train': 0.7450990080833435}
12/27/2021 13:11:35 - INFO - codeparrot_training - Step 34816: {'lr': 0.00010608591292906256, 'samples': 17826304, 'steps': 34816, 'batch_loss/train': 0.9162503369152546}
12/27/2021 13:11:47 - INFO - codeparrot_training - Step 34817: {'lr': 0.00010607302383060754, 'samples': 17826816, 'steps': 34817, 'batch_loss/train': 0.770240974612534}
12/27/2021 13:11:58 - INFO - codeparrot_training - Step 34818: {'lr': 0.00010606013530435183, 'samples': 17827328, 'steps': 34818, 'batch_loss/train': 0.7581828851252794}
12/27/2021 13:12:08 - INFO - codeparrot_training - Step 34819: {'lr': 0.00010604724735034665, 'samples': 17827840, 'steps': 34819, 'batch_loss/train': 0.8074771640822291}
12/27/2021 13:12:21 - INFO - codeparrot_training - Step 34820: {'lr': 0.00010603435996864327, 'samples': 17828352, 'steps': 34820, 'batch_loss/train': 0.846005130559206}
12/27/2021 13:12:32 - INFO - codeparrot_training - Step 34821: {'lr': 0.00010602147315929292, 'samples': 17828864, 'steps': 34821, 'batch_loss/train': 0.6198613606393337}
12/27/2021 13:12:42 - INFO - codeparrot_training - Step 34822: {'lr': 0.00010600858692234683, 'samples': 17829376, 'steps': 34822, 'batch_loss/train': 0.7182377902790904}
12/27/2021 13:12:53 - INFO - codeparrot_training - Step 34823: {'lr': 0.00010599570125785623, 'samples': 17829888, 'steps': 34823, 'batch_loss/train': 0.6148078856058419}
12/27/2021 13:13:07 - INFO - codeparrot_training - Step 34824: {'lr': 0.00010598281616587236, 'samples': 17830400, 'steps': 34824, 'batch_loss/train': 0.810150726698339}
12/27/2021 13:13:17 - INFO - codeparrot_training - Step 34825: {'lr': 0.0001059699316464465, 'samples': 17830912, 'steps': 34825, 'batch_loss/train': 0.7295198389329016}
12/27/2021 13:13:28 - INFO - codeparrot_training - Step 34826: {'lr': 0.00010595704769962964, 'samples': 17831424, 'steps': 34826, 'batch_loss/train': 0.7497985064983368}
12/27/2021 13:13:40 - INFO - codeparrot_training - Step 34827: {'lr': 0.00010594416432547335, 'samples': 17831936, 'steps': 34827, 'batch_loss/train': 0.8521332964301109}
12/27/2021 13:13:51 - INFO - codeparrot_training - Step 34828: {'lr': 0.00010593128152402859, 'samples': 17832448, 'steps': 34828, 'batch_loss/train': 0.709184689912945}
12/27/2021 13:14:01 - INFO - codeparrot_training - Step 34829: {'lr': 0.00010591839929534664, 'samples': 17832960, 'steps': 34829, 'batch_loss/train': 0.7757574818097055}
12/27/2021 13:14:13 - INFO - codeparrot_training - Step 34830: {'lr': 0.00010590551763947872, 'samples': 17833472, 'steps': 34830, 'batch_loss/train': 0.7539284857921302}
12/27/2021 13:14:24 - INFO - codeparrot_training - Step 34831: {'lr': 0.00010589263655647608, 'samples': 17833984, 'steps': 34831, 'batch_loss/train': 0.7204312384128571}
12/27/2021 13:14:34 - INFO - codeparrot_training - Step 34832: {'lr': 0.00010587975604638989, 'samples': 17834496, 'steps': 34832, 'batch_loss/train': 0.7311399118043482}
12/27/2021 13:14:45 - INFO - codeparrot_training - Step 34833: {'lr': 0.00010586687610927137, 'samples': 17835008, 'steps': 34833, 'batch_loss/train': 0.7765808455878869}
12/27/2021 13:14:57 - INFO - codeparrot_training - Step 34834: {'lr': 0.00010585399674517174, 'samples': 17835520, 'steps': 34834, 'batch_loss/train': 0.7203957810997963}
12/27/2021 13:15:08 - INFO - codeparrot_training - Step 34835: {'lr': 0.0001058411179541422, 'samples': 17836032, 'steps': 34835, 'batch_loss/train': 0.7727825231850147}
12/27/2021 13:15:18 - INFO - codeparrot_training - Step 34836: {'lr': 0.00010582823973623402, 'samples': 17836544, 'steps': 34836, 'batch_loss/train': 0.7052954537793994}
12/27/2021 13:15:32 - INFO - codeparrot_training - Step 34837: {'lr': 0.00010581536209149814, 'samples': 17837056, 'steps': 34837, 'batch_loss/train': 0.8227670872583985}
12/27/2021 13:15:43 - INFO - codeparrot_training - Step 34838: {'lr': 0.00010580248501998602, 'samples': 17837568, 'steps': 34838, 'batch_loss/train': 0.7092393250204623}
12/27/2021 13:15:54 - INFO - codeparrot_training - Step 34839: {'lr': 0.00010578960852174887, 'samples': 17838080, 'steps': 34839, 'batch_loss/train': 0.7482771431095898}
12/27/2021 13:16:06 - INFO - codeparrot_training - Step 34840: {'lr': 0.00010577673259683771, 'samples': 17838592, 'steps': 34840, 'batch_loss/train': 0.8848178694024682}
12/27/2021 13:16:16 - INFO - codeparrot_training - Step 34841: {'lr': 0.00010576385724530368, 'samples': 17839104, 'steps': 34841, 'batch_loss/train': 0.7620724686421454}
12/27/2021 13:16:27 - INFO - codeparrot_training - Step 34842: {'lr': 0.0001057509824671983, 'samples': 17839616, 'steps': 34842, 'batch_loss/train': 0.7926408685743809}
12/27/2021 13:16:41 - INFO - codeparrot_training - Step 34843: {'lr': 0.00010573810826257246, 'samples': 17840128, 'steps': 34843, 'batch_loss/train': 0.8106785283889621}
12/27/2021 13:16:51 - INFO - codeparrot_training - Step 34844: {'lr': 0.0001057252346314774, 'samples': 17840640, 'steps': 34844, 'batch_loss/train': 0.8282133853062987}
12/27/2021 13:17:02 - INFO - codeparrot_training - Step 34845: {'lr': 0.00010571236157396435, 'samples': 17841152, 'steps': 34845, 'batch_loss/train': 0.7249978738836944}
12/27/2021 13:17:13 - INFO - codeparrot_training - Step 34846: {'lr': 0.00010569948909008448, 'samples': 17841664, 'steps': 34846, 'batch_loss/train': 0.5334751811460592}
12/27/2021 13:17:25 - INFO - codeparrot_training - Step 34847: {'lr': 0.00010568661717988895, 'samples': 17842176, 'steps': 34847, 'batch_loss/train': 0.7655878639779985}
12/27/2021 13:17:35 - INFO - codeparrot_training - Step 34848: {'lr': 0.00010567374584342893, 'samples': 17842688, 'steps': 34848, 'batch_loss/train': 0.8087822304805741}
12/27/2021 13:17:46 - INFO - codeparrot_training - Step 34849: {'lr': 0.00010566087508075561, 'samples': 17843200, 'steps': 34849, 'batch_loss/train': 0.7714110026136041}
12/27/2021 13:17:58 - INFO - codeparrot_training - Step 34850: {'lr': 0.00010564800489192014, 'samples': 17843712, 'steps': 34850, 'batch_loss/train': 0.72213319176808}
12/27/2021 13:18:09 - INFO - codeparrot_training - Step 34851: {'lr': 0.0001056351352769738, 'samples': 17844224, 'steps': 34851, 'batch_loss/train': 0.7116295006126165}
12/27/2021 13:18:19 - INFO - codeparrot_training - Step 34852: {'lr': 0.00010562226623596746, 'samples': 17844736, 'steps': 34852, 'batch_loss/train': 0.6502488641999662}
12/27/2021 13:18:33 - INFO - codeparrot_training - Step 34853: {'lr': 0.00010560939776895268, 'samples': 17845248, 'steps': 34853, 'batch_loss/train': 0.8415634827688336}
12/27/2021 13:18:44 - INFO - codeparrot_training - Step 34854: {'lr': 0.00010559652987598032, 'samples': 17845760, 'steps': 34854, 'batch_loss/train': 0.7101505654864013}
12/27/2021 13:18:54 - INFO - codeparrot_training - Step 34855: {'lr': 0.0001055836625571015, 'samples': 17846272, 'steps': 34855, 'batch_loss/train': 0.7740232220385224}
12/27/2021 13:19:06 - INFO - codeparrot_training - Step 34856: {'lr': 0.00010557079581236773, 'samples': 17846784, 'steps': 34856, 'batch_loss/train': 0.711734133772552}
12/27/2021 13:19:17 - INFO - codeparrot_training - Step 34857: {'lr': 0.00010555792964182983, 'samples': 17847296, 'steps': 34857, 'batch_loss/train': 1.105424894951284}
12/27/2021 13:19:28 - INFO - codeparrot_training - Step 34858: {'lr': 0.00010554506404553907, 'samples': 17847808, 'steps': 34858, 'batch_loss/train': 0.9194178534671664}
12/27/2021 13:19:38 - INFO - codeparrot_training - Step 34859: {'lr': 0.0001055321990235466, 'samples': 17848320, 'steps': 34859, 'batch_loss/train': 0.7129764701239765}
12/27/2021 13:19:52 - INFO - codeparrot_training - Step 34860: {'lr': 0.00010551933457590354, 'samples': 17848832, 'steps': 34860, 'batch_loss/train': 0.7328076269477606}
12/27/2021 13:20:03 - INFO - codeparrot_training - Step 34861: {'lr': 0.00010550647070266106, 'samples': 17849344, 'steps': 34861, 'batch_loss/train': 0.786354067735374}
12/27/2021 13:20:14 - INFO - codeparrot_training - Step 34862: {'lr': 0.0001054936074038704, 'samples': 17849856, 'steps': 34862, 'batch_loss/train': 0.7494520545005798}
12/27/2021 13:20:26 - INFO - codeparrot_training - Step 34863: {'lr': 0.0001054807446795824, 'samples': 17850368, 'steps': 34863, 'batch_loss/train': 0.7795712003717199}
12/27/2021 13:20:36 - INFO - codeparrot_training - Step 34864: {'lr': 0.00010546788252984849, 'samples': 17850880, 'steps': 34864, 'batch_loss/train': 0.7620726488530636}
12/27/2021 13:20:47 - INFO - codeparrot_training - Step 34865: {'lr': 0.00010545502095471976, 'samples': 17851392, 'steps': 34865, 'batch_loss/train': 0.7623041006736457}
12/27/2021 13:20:59 - INFO - codeparrot_training - Step 34866: {'lr': 0.00010544215995424725, 'samples': 17851904, 'steps': 34866, 'batch_loss/train': 0.8007836560718715}
12/27/2021 13:21:10 - INFO - codeparrot_training - Step 34867: {'lr': 0.00010542929952848198, 'samples': 17852416, 'steps': 34867, 'batch_loss/train': 0.8926377138122916}
12/27/2021 13:21:20 - INFO - codeparrot_training - Step 34868: {'lr': 0.00010541643967747547, 'samples': 17852928, 'steps': 34868, 'batch_loss/train': 0.8975223070010543}
12/27/2021 13:21:31 - INFO - codeparrot_training - Step 34869: {'lr': 0.00010540358040127848, 'samples': 17853440, 'steps': 34869, 'batch_loss/train': 0.7570589883252978}
12/27/2021 13:21:43 - INFO - codeparrot_training - Step 34870: {'lr': 0.00010539072169994215, 'samples': 17853952, 'steps': 34870, 'batch_loss/train': 0.8047476271167397}
12/27/2021 13:21:54 - INFO - codeparrot_training - Step 34871: {'lr': 0.00010537786357351792, 'samples': 17854464, 'steps': 34871, 'batch_loss/train': 0.6700820692349225}
12/27/2021 13:22:04 - INFO - codeparrot_training - Step 34872: {'lr': 0.0001053650060220566, 'samples': 17854976, 'steps': 34872, 'batch_loss/train': 0.8402840588241816}
12/27/2021 13:22:18 - INFO - codeparrot_training - Step 34873: {'lr': 0.00010535214904560941, 'samples': 17855488, 'steps': 34873, 'batch_loss/train': 0.7986355791799724}
12/27/2021 13:22:29 - INFO - codeparrot_training - Step 34874: {'lr': 0.00010533929264422746, 'samples': 17856000, 'steps': 34874, 'batch_loss/train': 0.7645367262884974}
12/27/2021 13:22:40 - INFO - codeparrot_training - Step 34875: {'lr': 0.00010532643681796183, 'samples': 17856512, 'steps': 34875, 'batch_loss/train': 0.6936098381411284}
12/27/2021 13:22:52 - INFO - codeparrot_training - Step 34876: {'lr': 0.00010531358156686369, 'samples': 17857024, 'steps': 34876, 'batch_loss/train': 0.7363324444741011}
12/27/2021 13:23:02 - INFO - codeparrot_training - Step 34877: {'lr': 0.00010530072689098418, 'samples': 17857536, 'steps': 34877, 'batch_loss/train': 0.7745208162814379}
12/27/2021 13:23:13 - INFO - codeparrot_training - Step 34878: {'lr': 0.00010528787279037419, 'samples': 17858048, 'steps': 34878, 'batch_loss/train': 0.7640098556876183}
12/27/2021 13:23:24 - INFO - codeparrot_training - Step 34879: {'lr': 0.00010527501926508506, 'samples': 17858560, 'steps': 34879, 'batch_loss/train': 0.8203553073108196}
12/27/2021 13:23:36 - INFO - codeparrot_training - Step 34880: {'lr': 0.00010526216631516788, 'samples': 17859072, 'steps': 34880, 'batch_loss/train': 0.7213585809804499}
12/27/2021 13:23:46 - INFO - codeparrot_training - Step 34881: {'lr': 0.00010524931394067353, 'samples': 17859584, 'steps': 34881, 'batch_loss/train': 0.7444583494216204}
12/27/2021 13:23:57 - INFO - codeparrot_training - Step 34882: {'lr': 0.00010523646214165342, 'samples': 17860096, 'steps': 34882, 'batch_loss/train': 0.7713646022602916}
12/27/2021 13:24:11 - INFO - codeparrot_training - Step 34883: {'lr': 0.00010522361091815838, 'samples': 17860608, 'steps': 34883, 'batch_loss/train': 0.723986248485744}
12/27/2021 13:24:21 - INFO - codeparrot_training - Step 34884: {'lr': 0.0001052107602702396, 'samples': 17861120, 'steps': 34884, 'batch_loss/train': 0.7593053942546248}
12/27/2021 13:24:32 - INFO - codeparrot_training - Step 34885: {'lr': 0.00010519791019794817, 'samples': 17861632, 'steps': 34885, 'batch_loss/train': 0.7159714419394732}
12/27/2021 13:24:44 - INFO - codeparrot_training - Step 34886: {'lr': 0.00010518506070133516, 'samples': 17862144, 'steps': 34886, 'batch_loss/train': 0.7806345857679844}
12/27/2021 13:24:55 - INFO - codeparrot_training - Step 34887: {'lr': 0.00010517221178045167, 'samples': 17862656, 'steps': 34887, 'batch_loss/train': 0.7800895469263196}
12/27/2021 13:25:06 - INFO - codeparrot_training - Step 34888: {'lr': 0.00010515936343534887, 'samples': 17863168, 'steps': 34888, 'batch_loss/train': 0.6912047136574984}
12/27/2021 13:25:16 - INFO - codeparrot_training - Step 34889: {'lr': 0.00010514651566607759, 'samples': 17863680, 'steps': 34889, 'batch_loss/train': 0.7563458271324635}
12/27/2021 13:25:30 - INFO - codeparrot_training - Step 34890: {'lr': 0.00010513366847268915, 'samples': 17864192, 'steps': 34890, 'batch_loss/train': 0.7882675817236304}
12/27/2021 13:25:41 - INFO - codeparrot_training - Step 34891: {'lr': 0.00010512082185523463, 'samples': 17864704, 'steps': 34891, 'batch_loss/train': 0.7290434967726469}
12/27/2021 13:25:51 - INFO - codeparrot_training - Step 34892: {'lr': 0.00010510797581376483, 'samples': 17865216, 'steps': 34892, 'batch_loss/train': 0.7212680652737617}
12/27/2021 13:26:03 - INFO - codeparrot_training - Step 34893: {'lr': 0.00010509513034833112, 'samples': 17865728, 'steps': 34893, 'batch_loss/train': 0.760289253666997}
12/27/2021 13:26:14 - INFO - codeparrot_training - Step 34894: {'lr': 0.00010508228545898454, 'samples': 17866240, 'steps': 34894, 'batch_loss/train': 0.7736317934468389}
12/27/2021 13:26:25 - INFO - codeparrot_training - Step 34895: {'lr': 0.00010506944114577599, 'samples': 17866752, 'steps': 34895, 'batch_loss/train': 0.7234819591976702}
12/27/2021 13:26:37 - INFO - codeparrot_training - Step 34896: {'lr': 0.0001050565974087565, 'samples': 17867264, 'steps': 34896, 'batch_loss/train': 0.7660175217315555}
12/27/2021 13:26:47 - INFO - codeparrot_training - Step 34897: {'lr': 0.00010504375424797746, 'samples': 17867776, 'steps': 34897, 'batch_loss/train': 0.6407424190547317}
12/27/2021 13:26:58 - INFO - codeparrot_training - Step 34898: {'lr': 0.00010503091166348963, 'samples': 17868288, 'steps': 34898, 'batch_loss/train': 0.7518193360883743}
12/27/2021 13:27:08 - INFO - codeparrot_training - Step 34899: {'lr': 0.00010501806965534413, 'samples': 17868800, 'steps': 34899, 'batch_loss/train': 0.7556787542998791}
12/27/2021 13:27:22 - INFO - codeparrot_training - Step 34900: {'lr': 0.00010500522822359207, 'samples': 17869312, 'steps': 34900, 'batch_loss/train': 0.7752258405089378}
12/27/2021 13:27:33 - INFO - codeparrot_training - Step 34901: {'lr': 0.00010499238736828445, 'samples': 17869824, 'steps': 34901, 'batch_loss/train': 0.8919923468492925}
12/27/2021 13:27:44 - INFO - codeparrot_training - Step 34902: {'lr': 0.00010497954708947238, 'samples': 17870336, 'steps': 34902, 'batch_loss/train': 0.6764085441827774}
12/27/2021 13:27:56 - INFO - codeparrot_training - Step 34903: {'lr': 0.00010496670738720693, 'samples': 17870848, 'steps': 34903, 'batch_loss/train': 0.814879197627306}
12/27/2021 13:28:06 - INFO - codeparrot_training - Step 34904: {'lr': 0.00010495386826153894, 'samples': 17871360, 'steps': 34904, 'batch_loss/train': 0.7644126154482365}
12/27/2021 13:28:17 - INFO - codeparrot_training - Step 34905: {'lr': 0.00010494102971251965, 'samples': 17871872, 'steps': 34905, 'batch_loss/train': 0.7264442276209593}
12/27/2021 13:28:29 - INFO - codeparrot_training - Step 34906: {'lr': 0.00010492819174020018, 'samples': 17872384, 'steps': 34906, 'batch_loss/train': 0.7394793098792434}
12/27/2021 13:28:40 - INFO - codeparrot_training - Step 34907: {'lr': 0.00010491535434463125, 'samples': 17872896, 'steps': 34907, 'batch_loss/train': 0.7394304377958179}
12/27/2021 13:28:50 - INFO - codeparrot_training - Step 34908: {'lr': 0.00010490251752586428, 'samples': 17873408, 'steps': 34908, 'batch_loss/train': 0.7649002987891436}
12/27/2021 13:29:02 - INFO - codeparrot_training - Step 34909: {'lr': 0.00010488968128395, 'samples': 17873920, 'steps': 34909, 'batch_loss/train': 0.7770976931788027}
12/27/2021 13:29:13 - INFO - codeparrot_training - Step 34910: {'lr': 0.00010487684561893957, 'samples': 17874432, 'steps': 34910, 'batch_loss/train': 0.8128609675914049}
12/27/2021 13:29:23 - INFO - codeparrot_training - Step 34911: {'lr': 0.00010486401053088402, 'samples': 17874944, 'steps': 34911, 'batch_loss/train': 0.7849518703296781}
12/27/2021 13:29:34 - INFO - codeparrot_training - Step 34912: {'lr': 0.00010485117601983432, 'samples': 17875456, 'steps': 34912, 'batch_loss/train': 0.7109293332323432}
12/27/2021 13:29:48 - INFO - codeparrot_training - Step 34913: {'lr': 0.00010483834208584159, 'samples': 17875968, 'steps': 34913, 'batch_loss/train': 0.729361486621201}
12/27/2021 13:29:59 - INFO - codeparrot_training - Step 34914: {'lr': 0.00010482550872895677, 'samples': 17876480, 'steps': 34914, 'batch_loss/train': 0.7339539880631492}
12/27/2021 13:30:09 - INFO - codeparrot_training - Step 34915: {'lr': 0.00010481267594923091, 'samples': 17876992, 'steps': 34915, 'batch_loss/train': 0.7480501271784306}
12/27/2021 13:30:21 - INFO - codeparrot_training - Step 34916: {'lr': 0.00010479984374671503, 'samples': 17877504, 'steps': 34916, 'batch_loss/train': 0.6948256120085716}
12/27/2021 13:30:32 - INFO - codeparrot_training - Step 34917: {'lr': 0.00010478701212146025, 'samples': 17878016, 'steps': 34917, 'batch_loss/train': 0.7271158704534173}
12/27/2021 13:30:42 - INFO - codeparrot_training - Step 34918: {'lr': 0.0001047741810735173, 'samples': 17878528, 'steps': 34918, 'batch_loss/train': 0.7357905982062221}
12/27/2021 13:30:56 - INFO - codeparrot_training - Step 34919: {'lr': 0.00010476135060293745, 'samples': 17879040, 'steps': 34919, 'batch_loss/train': 0.7156264942605048}
12/27/2021 13:31:07 - INFO - codeparrot_training - Step 34920: {'lr': 0.00010474852070977172, 'samples': 17879552, 'steps': 34920, 'batch_loss/train': 0.6458998252637684}
12/27/2021 13:31:18 - INFO - codeparrot_training - Step 34921: {'lr': 0.00010473569139407093, 'samples': 17880064, 'steps': 34921, 'batch_loss/train': 0.7413553656078875}
12/27/2021 13:31:28 - INFO - codeparrot_training - Step 34922: {'lr': 0.00010472286265588613, 'samples': 17880576, 'steps': 34922, 'batch_loss/train': 0.6307602804154158}
12/27/2021 13:31:40 - INFO - codeparrot_training - Step 34923: {'lr': 0.0001047100344952685, 'samples': 17881088, 'steps': 34923, 'batch_loss/train': 0.7269547982141376}
12/27/2021 13:31:51 - INFO - codeparrot_training - Step 34924: {'lr': 0.00010469720691226886, 'samples': 17881600, 'steps': 34924, 'batch_loss/train': 0.7826365502551198}
12/27/2021 13:32:01 - INFO - codeparrot_training - Step 34925: {'lr': 0.00010468437990693824, 'samples': 17882112, 'steps': 34925, 'batch_loss/train': 0.810771363787353}
12/27/2021 13:32:14 - INFO - codeparrot_training - Step 34926: {'lr': 0.00010467155347932766, 'samples': 17882624, 'steps': 34926, 'batch_loss/train': 0.6299776388332248}
12/27/2021 13:32:25 - INFO - codeparrot_training - Step 34927: {'lr': 0.00010465872762948814, 'samples': 17883136, 'steps': 34927, 'batch_loss/train': 0.7829691654769704}
12/27/2021 13:32:35 - INFO - codeparrot_training - Step 34928: {'lr': 0.00010464590235747059, 'samples': 17883648, 'steps': 34928, 'batch_loss/train': 0.7331445133313537}
12/27/2021 13:32:49 - INFO - codeparrot_training - Step 34929: {'lr': 0.00010463307766332609, 'samples': 17884160, 'steps': 34929, 'batch_loss/train': 0.7267533700214699}
12/27/2021 13:33:00 - INFO - codeparrot_training - Step 34930: {'lr': 0.00010462025354710558, 'samples': 17884672, 'steps': 34930, 'batch_loss/train': 0.7884490126743913}
12/27/2021 13:33:10 - INFO - codeparrot_training - Step 34931: {'lr': 0.00010460743000886003, 'samples': 17885184, 'steps': 34931, 'batch_loss/train': 0.8968729628250003}
12/27/2021 13:33:21 - INFO - codeparrot_training - Step 34932: {'lr': 0.00010459460704864054, 'samples': 17885696, 'steps': 34932, 'batch_loss/train': 0.8193860668689013}
12/27/2021 13:33:33 - INFO - codeparrot_training - Step 34933: {'lr': 0.00010458178466649781, 'samples': 17886208, 'steps': 34933, 'batch_loss/train': 0.8374904943630099}
12/27/2021 13:33:44 - INFO - codeparrot_training - Step 34934: {'lr': 0.00010456896286248319, 'samples': 17886720, 'steps': 34934, 'batch_loss/train': 0.8875973923131824}
12/27/2021 13:33:54 - INFO - codeparrot_training - Step 34935: {'lr': 0.00010455614163664734, 'samples': 17887232, 'steps': 34935, 'batch_loss/train': 0.7090799021534622}
12/27/2021 13:34:07 - INFO - codeparrot_training - Step 34936: {'lr': 0.00010454332098904137, 'samples': 17887744, 'steps': 34936, 'batch_loss/train': 0.6712753419415094}
12/27/2021 13:34:17 - INFO - codeparrot_training - Step 34937: {'lr': 0.00010453050091971622, 'samples': 17888256, 'steps': 34937, 'batch_loss/train': 0.7637683459324762}
12/27/2021 13:34:28 - INFO - codeparrot_training - Step 34938: {'lr': 0.0001045176814287229, 'samples': 17888768, 'steps': 34938, 'batch_loss/train': 0.7584273796528578}
12/27/2021 13:34:42 - INFO - codeparrot_training - Step 34939: {'lr': 0.00010450486251611233, 'samples': 17889280, 'steps': 34939, 'batch_loss/train': 0.7813053485588171}
12/27/2021 13:34:52 - INFO - codeparrot_training - Step 34940: {'lr': 0.00010449204418193547, 'samples': 17889792, 'steps': 34940, 'batch_loss/train': 0.7406063338275999}
12/27/2021 13:35:03 - INFO - codeparrot_training - Step 34941: {'lr': 0.00010447922642624333, 'samples': 17890304, 'steps': 34941, 'batch_loss/train': 0.6691484833136201}
12/27/2021 13:35:14 - INFO - codeparrot_training - Step 34942: {'lr': 0.00010446640924908682, 'samples': 17890816, 'steps': 34942, 'batch_loss/train': 0.7212092888657935}
12/27/2021 13:35:26 - INFO - codeparrot_training - Step 34943: {'lr': 0.00010445359265051701, 'samples': 17891328, 'steps': 34943, 'batch_loss/train': 0.5844553578644991}
12/27/2021 13:35:37 - INFO - codeparrot_training - Step 34944: {'lr': 0.0001044407766305846, 'samples': 17891840, 'steps': 34944, 'batch_loss/train': 0.6069673015736043}
12/27/2021 13:35:47 - INFO - codeparrot_training - Step 34945: {'lr': 0.00010442796118934077, 'samples': 17892352, 'steps': 34945, 'batch_loss/train': 0.7359448620118201}
12/27/2021 13:35:59 - INFO - codeparrot_training - Step 34946: {'lr': 0.0001044151463268365, 'samples': 17892864, 'steps': 34946, 'batch_loss/train': 0.7601041889283806}
12/27/2021 13:36:10
- INFO - codeparrot_training - Step 34947: {'lr': 0.00010440233204312255, 'samples': 17893376, 'steps': 34947, 'batch_loss/train': 0.7036808496341109} 12/27/2021 13:36:21 - INFO - codeparrot_training - Step 34948: {'lr': 0.00010438951833824986, 'samples': 17893888, 'steps': 34948, 'batch_loss/train': 0.6705270182574168} 12/27/2021 13:36:33 - INFO - codeparrot_training - Step 34949: {'lr': 0.00010437670521226963, 'samples': 17894400, 'steps': 34949, 'batch_loss/train': 0.6511285365559161} 12/27/2021 13:36:44 - INFO - codeparrot_training - Step 34950: {'lr': 0.00010436389266523255, 'samples': 17894912, 'steps': 34950, 'batch_loss/train': 0.809127994813025} 12/27/2021 13:36:54 - INFO - codeparrot_training - Step 34951: {'lr': 0.00010435108069718966, 'samples': 17895424, 'steps': 34951, 'batch_loss/train': 0.5521839512512088} 12/27/2021 13:37:08 - INFO - codeparrot_training - Step 34952: {'lr': 0.00010433826930819188, 'samples': 17895936, 'steps': 34952, 'batch_loss/train': 0.7023018975742161} 12/27/2021 13:37:19 - INFO - codeparrot_training - Step 34953: {'lr': 0.00010432545849829012, 'samples': 17896448, 'steps': 34953, 'batch_loss/train': 0.8033619662746787} 12/27/2021 13:37:30 - INFO - codeparrot_training - Step 34954: {'lr': 0.00010431264826753534, 'samples': 17896960, 'steps': 34954, 'batch_loss/train': 0.7145039644092321} 12/27/2021 13:37:40 - INFO - codeparrot_training - Step 34955: {'lr': 0.00010429983861597844, 'samples': 17897472, 'steps': 34955, 'batch_loss/train': 0.8260333896614611} 12/27/2021 13:37:52 - INFO - codeparrot_training - Step 34956: {'lr': 0.00010428702954367039, 'samples': 17897984, 'steps': 34956, 'batch_loss/train': 0.8708839043974876} 12/27/2021 13:38:03 - INFO - codeparrot_training - Step 34957: {'lr': 0.00010427422105066212, 'samples': 17898496, 'steps': 34957, 'batch_loss/train': 0.878498095087707} 12/27/2021 13:38:14 - INFO - codeparrot_training - Step 34958: {'lr': 0.00010426141313700457, 'samples': 17899008, 'steps': 34958, 
'batch_loss/train': 0.7346953861415386} 12/27/2021 13:38:27 - INFO - codeparrot_training - Step 34959: {'lr': 0.00010424860580274844, 'samples': 17899520, 'steps': 34959, 'batch_loss/train': 0.7721509411931038} 12/27/2021 13:38:38 - INFO - codeparrot_training - Step 34960: {'lr': 0.00010423579904794492, 'samples': 17900032, 'steps': 34960, 'batch_loss/train': 0.6792907901108265} 12/27/2021 13:38:49 - INFO - codeparrot_training - Step 34961: {'lr': 0.00010422299287264492, 'samples': 17900544, 'steps': 34961, 'batch_loss/train': 0.7776976581662893} 12/27/2021 13:39:01 - INFO - codeparrot_training - Step 34962: {'lr': 0.00010421018727689915, 'samples': 17901056, 'steps': 34962, 'batch_loss/train': 0.7674995632842183} 12/27/2021 13:39:12 - INFO - codeparrot_training - Step 34963: {'lr': 0.00010419738226075864, 'samples': 17901568, 'steps': 34963, 'batch_loss/train': 1.0075879199430346} 12/27/2021 13:39:22 - INFO - codeparrot_training - Step 34964: {'lr': 0.00010418457782427426, 'samples': 17902080, 'steps': 34964, 'batch_loss/train': 0.733496296685189} 12/27/2021 13:39:33 - INFO - codeparrot_training - Step 34965: {'lr': 0.00010417177396749697, 'samples': 17902592, 'steps': 34965, 'batch_loss/train': 0.8101309745106846} 12/27/2021 13:39:45 - INFO - codeparrot_training - Step 34966: {'lr': 0.00010415897069047764, 'samples': 17903104, 'steps': 34966, 'batch_loss/train': 0.7816418826114386} 12/27/2021 13:39:56 - INFO - codeparrot_training - Step 34967: {'lr': 0.00010414616799326718, 'samples': 17903616, 'steps': 34967, 'batch_loss/train': 0.7243243644479662} 12/27/2021 13:40:07 - INFO - codeparrot_training - Step 34968: {'lr': 0.00010413336587591646, 'samples': 17904128, 'steps': 34968, 'batch_loss/train': 0.8068562168627977} 12/27/2021 13:40:21 - INFO - codeparrot_training - Step 34969: {'lr': 0.0001041205643384765, 'samples': 17904640, 'steps': 34969, 'batch_loss/train': 0.8826045878231525} 12/27/2021 13:40:31 - INFO - codeparrot_training - Step 34970: {'lr': 
0.00010410776338099793, 'samples': 17905152, 'steps': 34970, 'batch_loss/train': 0.7670821687206626} 12/27/2021 13:40:42 - INFO - codeparrot_training - Step 34971: {'lr': 0.00010409496300353188, 'samples': 17905664, 'steps': 34971, 'batch_loss/train': 0.6387173898983747} 12/27/2021 13:40:54 - INFO - codeparrot_training - Step 34972: {'lr': 0.00010408216320612926, 'samples': 17906176, 'steps': 34972, 'batch_loss/train': 0.7381429662927985} 12/27/2021 13:41:05 - INFO - codeparrot_training - Step 34973: {'lr': 0.00010406936398884076, 'samples': 17906688, 'steps': 34973, 'batch_loss/train': 0.8212248254567385} 12/27/2021 13:41:15 - INFO - codeparrot_training - Step 34974: {'lr': 0.00010405656535171731, 'samples': 17907200, 'steps': 34974, 'batch_loss/train': 0.706029194581788} 12/27/2021 13:41:30 - INFO - codeparrot_training - Step 34975: {'lr': 0.00010404376729481, 'samples': 17907712, 'steps': 34975, 'batch_loss/train': 0.7911435039713979} 12/27/2021 13:41:41 - INFO - codeparrot_training - Step 34976: {'lr': 0.00010403096981816948, 'samples': 17908224, 'steps': 34976, 'batch_loss/train': 0.7030327829997987} 12/27/2021 13:41:52 - INFO - codeparrot_training - Step 34977: {'lr': 0.0001040181729218467, 'samples': 17908736, 'steps': 34977, 'batch_loss/train': 1.1061218418180943} 12/27/2021 13:42:02 - INFO - codeparrot_training - Step 34978: {'lr': 0.00010400537660589257, 'samples': 17909248, 'steps': 34978, 'batch_loss/train': 1.4127320146653801} 12/27/2021 13:42:14 - INFO - codeparrot_training - Step 34979: {'lr': 0.0001039925808703579, 'samples': 17909760, 'steps': 34979, 'batch_loss/train': 0.7216301918961108} 12/27/2021 13:42:25 - INFO - codeparrot_training - Step 34980: {'lr': 0.00010397978571529362, 'samples': 17910272, 'steps': 34980, 'batch_loss/train': 0.8733003647066653} 12/27/2021 13:42:36 - INFO - codeparrot_training - Step 34981: {'lr': 0.0001039669911407506, 'samples': 17910784, 'steps': 34981, 'batch_loss/train': 0.6230060249799863} 12/27/2021 13:42:48 - 
INFO - codeparrot_training - Step 34982: {'lr': 0.00010395419714677964, 'samples': 17911296, 'steps': 34982, 'batch_loss/train': 0.7719918750226498} 12/27/2021 13:42:58 - INFO - codeparrot_training - Step 34983: {'lr': 0.00010394140373343167, 'samples': 17911808, 'steps': 34983, 'batch_loss/train': 0.743846784112975} 12/27/2021 13:43:09 - INFO - codeparrot_training - Step 34984: {'lr': 0.00010392861090075762, 'samples': 17912320, 'steps': 34984, 'batch_loss/train': 0.7434520330280066} 12/27/2021 13:43:19 - INFO - codeparrot_training - Step 34985: {'lr': 0.00010391581864880806, 'samples': 17912832, 'steps': 34985, 'batch_loss/train': 0.7276052972301841} 12/27/2021 13:43:34 - INFO - codeparrot_training - Step 34986: {'lr': 0.00010390302697763415, 'samples': 17913344, 'steps': 34986, 'batch_loss/train': 0.6484629125334322} 12/27/2021 13:43:45 - INFO - codeparrot_training - Step 34987: {'lr': 0.00010389023588728674, 'samples': 17913856, 'steps': 34987, 'batch_loss/train': 0.6126240943558514} 12/27/2021 13:43:55 - INFO - codeparrot_training - Step 34988: {'lr': 0.00010387744537781649, 'samples': 17914368, 'steps': 34988, 'batch_loss/train': 0.7856767133343965} 12/27/2021 13:44:07 - INFO - codeparrot_training - Step 34989: {'lr': 0.00010386465544927432, 'samples': 17914880, 'steps': 34989, 'batch_loss/train': 0.7532626343891025} 12/27/2021 13:44:18 - INFO - codeparrot_training - Step 34990: {'lr': 0.00010385186610171113, 'samples': 17915392, 'steps': 34990, 'batch_loss/train': 0.6887447247281671} 12/27/2021 13:44:29 - INFO - codeparrot_training - Step 34991: {'lr': 0.00010383907733517772, 'samples': 17915904, 'steps': 34991, 'batch_loss/train': 0.7263561310246587} 12/27/2021 13:44:41 - INFO - codeparrot_training - Step 34992: {'lr': 0.00010382628914972495, 'samples': 17916416, 'steps': 34992, 'batch_loss/train': 0.6223537176847458} 12/27/2021 13:44:52 - INFO - codeparrot_training - Step 34993: {'lr': 0.00010381350154540367, 'samples': 17916928, 'steps': 34993, 
'batch_loss/train': 0.7556021632626653} 12/27/2021 13:45:02 - INFO - codeparrot_training - Step 34994: {'lr': 0.0001038007145222647, 'samples': 17917440, 'steps': 34994, 'batch_loss/train': 0.9482837039977312} 12/27/2021 13:45:13 - INFO - codeparrot_training - Step 34995: {'lr': 0.00010378792808035889, 'samples': 17917952, 'steps': 34995, 'batch_loss/train': 0.663043662905693} 12/27/2021 13:45:25 - INFO - codeparrot_training - Step 34996: {'lr': 0.00010377514221973707, 'samples': 17918464, 'steps': 34996, 'batch_loss/train': 0.7421654928475618} 12/27/2021 13:45:35 - INFO - codeparrot_training - Step 34997: {'lr': 0.00010376235694045008, 'samples': 17918976, 'steps': 34997, 'batch_loss/train': 0.8440078664571047} 12/27/2021 13:45:46 - INFO - codeparrot_training - Step 34998: {'lr': 0.00010374957224254882, 'samples': 17919488, 'steps': 34998, 'batch_loss/train': 0.7044084500521421} 12/27/2021 13:46:00 - INFO - codeparrot_training - Step 34999: {'lr': 0.00010373678812608398, 'samples': 17920000, 'steps': 34999, 'batch_loss/train': 0.7611512918956578} 12/27/2021 13:46:00 - INFO - codeparrot_training - Evaluating and saving model checkpoint 12/27/2021 13:49:24 - INFO - codeparrot_training - Step 35000: {'loss/eval': 0.7561725378036499, 'perplexity': 2.1301076412200928} 12/27/2021 13:49:43 - WARNING - huggingface_hub.repository - Several commits (12) will be pushed upstream. 
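The checkpoint record above logs `loss/eval` and `perplexity` side by side. Perplexity is conventionally the exponential of the cross-entropy loss, and the two logged values are consistent with that relation; the sketch below (an illustrative check, not part of the training script) reproduces the logged perplexity from the logged eval loss:

```python
import math

# Values copied from the Step 35000 eval record above.
eval_loss = 0.7561725378036499

# Perplexity as exp(cross-entropy loss); assumes the logging code uses
# this standard convention.
perplexity = math.exp(eval_loss)

print(perplexity)  # ≈ 2.1301, matching the logged 2.1301076412200928
```

Small discrepancies in the last digits are possible if the original value was computed in float32 on the accelerator before logging.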
12/27/2021 13:49:56 - INFO - codeparrot_training - Step 35000: {'lr': 0.00010372400459110632, 'samples': 17920512, 'steps': 35000, 'batch_loss/train': 0.7205137052806094} 12/27/2021 13:50:07 - INFO - codeparrot_training - Step 35001: {'lr': 0.00010371122163766697, 'samples': 17921024, 'steps': 35001, 'batch_loss/train': 0.7165432414039969} 12/27/2021 13:50:19 - INFO - codeparrot_training - Step 35002: {'lr': 0.00010369843926581648, 'samples': 17921536, 'steps': 35002, 'batch_loss/train': 0.7319596698507667} 12/27/2021 13:50:30 - INFO - codeparrot_training - Step 35003: {'lr': 0.00010368565747560565, 'samples': 17922048, 'steps': 35003, 'batch_loss/train': 0.6962243495509028} 12/27/2021 13:50:41 - INFO - codeparrot_training - Step 35004: {'lr': 0.0001036728762670856, 'samples': 17922560, 'steps': 35004, 'batch_loss/train': 0.6871249480172992} 12/27/2021 13:50:51 - INFO - codeparrot_training - Step 35005: {'lr': 0.00010366009564030688, 'samples': 17923072, 'steps': 35005, 'batch_loss/train': 0.7462858716025949} 12/27/2021 13:51:05 - INFO - codeparrot_training - Step 35006: {'lr': 0.00010364731559532031, 'samples': 17923584, 'steps': 35006, 'batch_loss/train': 0.8078965097665787} 12/27/2021 13:51:16 - INFO - codeparrot_training - Step 35007: {'lr': 0.00010363453613217677, 'samples': 17924096, 'steps': 35007, 'batch_loss/train': 0.6881614183075726} 12/27/2021 13:51:26 - INFO - codeparrot_training - Step 35008: {'lr': 0.00010362175725092707, 'samples': 17924608, 'steps': 35008, 'batch_loss/train': 0.7169940594467334} 12/27/2021 13:51:38 - INFO - codeparrot_training - Step 35009: {'lr': 0.00010360897895162197, 'samples': 17925120, 'steps': 35009, 'batch_loss/train': 0.7645176043733954} 12/27/2021 13:51:49 - INFO - codeparrot_training - Step 35010: {'lr': 0.00010359620123431238, 'samples': 17925632, 'steps': 35010, 'batch_loss/train': 0.6168565594125539} 12/27/2021 13:52:00 - INFO - codeparrot_training - Step 35011: {'lr': 0.00010358342409904886, 'samples': 17926144, 
'steps': 35011, 'batch_loss/train': 0.7328414891380817} 12/27/2021 13:52:12 - INFO - codeparrot_training - Step 35012: {'lr': 0.00010357064754588243, 'samples': 17926656, 'steps': 35012, 'batch_loss/train': 0.692734676413238} 12/27/2021 13:52:22 - INFO - codeparrot_training - Step 35013: {'lr': 0.0001035578715748639, 'samples': 17927168, 'steps': 35013, 'batch_loss/train': 0.7836133753880858} 12/27/2021 13:52:33 - INFO - codeparrot_training - Step 35014: {'lr': 0.00010354509618604391, 'samples': 17927680, 'steps': 35014, 'batch_loss/train': 0.8068492778111249} 12/27/2021 13:52:43 - INFO - codeparrot_training - Step 35015: {'lr': 0.0001035323213794733, 'samples': 17928192, 'steps': 35015, 'batch_loss/train': 0.7229512324556708} 12/27/2021 13:52:57 - INFO - codeparrot_training - Step 35016: {'lr': 0.0001035195471552029, 'samples': 17928704, 'steps': 35016, 'batch_loss/train': 0.7708474332466722} 12/27/2021 13:53:08 - INFO - codeparrot_training - Step 35017: {'lr': 0.00010350677351328344, 'samples': 17929216, 'steps': 35017, 'batch_loss/train': 0.7743860670598224} 12/27/2021 13:53:19 - INFO - codeparrot_training - Step 35018: {'lr': 0.00010349400045376575, 'samples': 17929728, 'steps': 35018, 'batch_loss/train': 0.8115608496591449} 12/27/2021 13:53:31 - INFO - codeparrot_training - Step 35019: {'lr': 0.00010348122797670062, 'samples': 17930240, 'steps': 35019, 'batch_loss/train': 0.7409935649484396} 12/27/2021 13:53:41 - INFO - codeparrot_training - Step 35020: {'lr': 0.00010346845608213879, 'samples': 17930752, 'steps': 35020, 'batch_loss/train': 0.7665671589784324} 12/27/2021 13:53:52 - INFO - codeparrot_training - Step 35021: {'lr': 0.00010345568477013101, 'samples': 17931264, 'steps': 35021, 'batch_loss/train': 0.757289836066775} 12/27/2021 13:54:04 - INFO - codeparrot_training - Step 35022: {'lr': 0.00010344291404072815, 'samples': 17931776, 'steps': 35022, 'batch_loss/train': 0.7701202901080251} 12/27/2021 13:54:14 - INFO - codeparrot_training - Step 35023: 
{'lr': 0.0001034301438939809, 'samples': 17932288, 'steps': 35023, 'batch_loss/train': 0.6754916394129395} 12/27/2021 13:54:25 - INFO - codeparrot_training - Step 35024: {'lr': 0.00010341737432994014, 'samples': 17932800, 'steps': 35024, 'batch_loss/train': 0.7442464847117662} 12/27/2021 13:54:39 - INFO - codeparrot_training - Step 35025: {'lr': 0.00010340460534865648, 'samples': 17933312, 'steps': 35025, 'batch_loss/train': 0.7829721244052052} 12/27/2021 13:54:50 - INFO - codeparrot_training - Step 35026: {'lr': 0.00010339183695018064, 'samples': 17933824, 'steps': 35026, 'batch_loss/train': 0.6782641354948282} 12/27/2021 13:55:00 - INFO - codeparrot_training - Step 35027: {'lr': 0.00010337906913456369, 'samples': 17934336, 'steps': 35027, 'batch_loss/train': 0.7339533641934395} 12/27/2021 13:55:11 - INFO - codeparrot_training - Step 35028: {'lr': 0.00010336630190185611, 'samples': 17934848, 'steps': 35028, 'batch_loss/train': 0.8075418423395604} 12/27/2021 13:55:23 - INFO - codeparrot_training - Step 35029: {'lr': 0.00010335353525210867, 'samples': 17935360, 'steps': 35029, 'batch_loss/train': 0.7610948509536684} 12/27/2021 13:55:34 - INFO - codeparrot_training - Step 35030: {'lr': 0.00010334076918537239, 'samples': 17935872, 'steps': 35030, 'batch_loss/train': 0.7750435802154243} 12/27/2021 13:55:44 - INFO - codeparrot_training - Step 35031: {'lr': 0.0001033280037016977, 'samples': 17936384, 'steps': 35031, 'batch_loss/train': 0.6640761061571538} 12/27/2021 13:55:56 - INFO - codeparrot_training - Step 35032: {'lr': 0.00010331523880113552, 'samples': 17936896, 'steps': 35032, 'batch_loss/train': 0.7813567277044058} 12/27/2021 13:56:07 - INFO - codeparrot_training - Step 35033: {'lr': 0.00010330247448373656, 'samples': 17937408, 'steps': 35033, 'batch_loss/train': 0.6472561890259385} 12/27/2021 13:56:18 - INFO - codeparrot_training - Step 35034: {'lr': 0.00010328971074955157, 'samples': 17937920, 'steps': 35034, 'batch_loss/train': 0.704840833786875} 12/27/2021 
13:56:32 - INFO - codeparrot_training - Step 35035: {'lr': 0.0001032769475986313, 'samples': 17938432, 'steps': 35035, 'batch_loss/train': 0.7618941748514771} 12/27/2021 13:56:42 - INFO - codeparrot_training - Step 35036: {'lr': 0.00010326418503102659, 'samples': 17938944, 'steps': 35036, 'batch_loss/train': 0.6802568396087736} 12/27/2021 13:56:53 - INFO - codeparrot_training - Step 35037: {'lr': 0.00010325142304678786, 'samples': 17939456, 'steps': 35037, 'batch_loss/train': 0.6603834294946864} 12/27/2021 13:57:03 - INFO - codeparrot_training - Step 35038: {'lr': 0.00010323866164596618, 'samples': 17939968, 'steps': 35038, 'batch_loss/train': 0.7560892654582858} 12/27/2021 13:57:16 - INFO - codeparrot_training - Step 35039: {'lr': 0.00010322590082861224, 'samples': 17940480, 'steps': 35039, 'batch_loss/train': 0.5951174583751708} 12/27/2021 13:57:26 - INFO - codeparrot_training - Step 35040: {'lr': 0.00010321314059477652, 'samples': 17940992, 'steps': 35040, 'batch_loss/train': 0.5240196478553116} 12/27/2021 13:57:37 - INFO - codeparrot_training - Step 35041: {'lr': 0.00010320038094451001, 'samples': 17941504, 'steps': 35041, 'batch_loss/train': 0.8101858645677567} 12/27/2021 13:57:49 - INFO - codeparrot_training - Step 35042: {'lr': 0.00010318762187786345, 'samples': 17942016, 'steps': 35042, 'batch_loss/train': 0.7254184992052615} 12/27/2021 13:58:00 - INFO - codeparrot_training - Step 35043: {'lr': 0.0001031748633948874, 'samples': 17942528, 'steps': 35043, 'batch_loss/train': 0.7122948868200183} 12/27/2021 13:58:10 - INFO - codeparrot_training - Step 35044: {'lr': 0.00010316210549563266, 'samples': 17943040, 'steps': 35044, 'batch_loss/train': 0.7330319723114371} 12/27/2021 13:58:21 - INFO - codeparrot_training - Step 35045: {'lr': 0.0001031493481801499, 'samples': 17943552, 'steps': 35045, 'batch_loss/train': 0.7443147468147799} 12/27/2021 13:58:35 - INFO - codeparrot_training - Step 35046: {'lr': 0.00010313659144848991, 'samples': 17944064, 'steps': 35046, 
'batch_loss/train': 0.775072043761611} 12/27/2021 13:58:45 - INFO - codeparrot_training - Step 35047: {'lr': 0.00010312383530070338, 'samples': 17944576, 'steps': 35047, 'batch_loss/train': 0.7673404871020466} 12/27/2021 13:58:56 - INFO - codeparrot_training - Step 35048: {'lr': 0.00010311107973684103, 'samples': 17945088, 'steps': 35048, 'batch_loss/train': 0.9207096234895289} 12/27/2021 13:59:08 - INFO - codeparrot_training - Step 35049: {'lr': 0.00010309832475695354, 'samples': 17945600, 'steps': 35049, 'batch_loss/train': 0.7915380760096014} 12/27/2021 13:59:19 - INFO - codeparrot_training - Step 35050: {'lr': 0.00010308557036109167, 'samples': 17946112, 'steps': 35050, 'batch_loss/train': 0.8194435713812709} 12/27/2021 13:59:29 - INFO - codeparrot_training - Step 35051: {'lr': 0.00010307281654930618, 'samples': 17946624, 'steps': 35051, 'batch_loss/train': 0.8134335273643956} 12/27/2021 13:59:41 - INFO - codeparrot_training - Step 35052: {'lr': 0.00010306006332164752, 'samples': 17947136, 'steps': 35052, 'batch_loss/train': 0.8242401364259422} 12/27/2021 13:59:52 - INFO - codeparrot_training - Step 35053: {'lr': 0.00010304731067816672, 'samples': 17947648, 'steps': 35053, 'batch_loss/train': 0.7834368215408176} 12/27/2021 14:00:02 - INFO - codeparrot_training - Step 35054: {'lr': 0.00010303455861891426, 'samples': 17948160, 'steps': 35054, 'batch_loss/train': 0.751283519435674} 12/27/2021 14:00:13 - INFO - codeparrot_training - Step 35055: {'lr': 0.0001030218071439408, 'samples': 17948672, 'steps': 35055, 'batch_loss/train': 0.7516632257029414} 12/27/2021 14:00:27 - INFO - codeparrot_training - Step 35056: {'lr': 0.00010300905625329731, 'samples': 17949184, 'steps': 35056, 'batch_loss/train': 0.6908175933058374} 12/27/2021 14:00:38 - INFO - codeparrot_training - Step 35057: {'lr': 0.00010299630594703424, 'samples': 17949696, 'steps': 35057, 'batch_loss/train': 0.7167799996677786} 12/27/2021 14:00:48 - INFO - codeparrot_training - Step 35058: {'lr': 
0.00010298355622520236, 'samples': 17950208, 'steps': 35058, 'batch_loss/train': 0.734885856276378} 12/27/2021 14:01:00 - INFO - codeparrot_training - Step 35059: {'lr': 0.00010297080708785233, 'samples': 17950720, 'steps': 35059, 'batch_loss/train': 0.7766203857026994} 12/27/2021 14:01:11 - INFO - codeparrot_training - Step 35060: {'lr': 0.00010295805853503486, 'samples': 17951232, 'steps': 35060, 'batch_loss/train': 0.6390976533293724} 12/27/2021 14:01:21 - INFO - codeparrot_training - Step 35061: {'lr': 0.00010294531056680063, 'samples': 17951744, 'steps': 35061, 'batch_loss/train': 0.6993512106128037} 12/27/2021 14:01:35 - INFO - codeparrot_training - Step 35062: {'lr': 0.0001029325631832004, 'samples': 17952256, 'steps': 35062, 'batch_loss/train': 0.762098265811801} 12/27/2021 14:01:46 - INFO - codeparrot_training - Step 35063: {'lr': 0.0001029198163842846, 'samples': 17952768, 'steps': 35063, 'batch_loss/train': 0.7518583734054118} 12/27/2021 14:01:57 - INFO - codeparrot_training - Step 35064: {'lr': 0.00010290707017010417, 'samples': 17953280, 'steps': 35064, 'batch_loss/train': 0.6618884379277006} 12/27/2021 14:02:09 - INFO - codeparrot_training - Step 35065: {'lr': 0.00010289432454070977, 'samples': 17953792, 'steps': 35065, 'batch_loss/train': 0.6904867370612919} 12/27/2021 14:02:19 - INFO - codeparrot_training - Step 35066: {'lr': 0.00010288157949615182, 'samples': 17954304, 'steps': 35066, 'batch_loss/train': 0.7965537076815963} 12/27/2021 14:02:30 - INFO - codeparrot_training - Step 35067: {'lr': 0.00010286883503648122, 'samples': 17954816, 'steps': 35067, 'batch_loss/train': 0.7719065472483635} 12/27/2021 14:02:40 - INFO - codeparrot_training - Step 35068: {'lr': 0.00010285609116174869, 'samples': 17955328, 'steps': 35068, 'batch_loss/train': 0.7793703284114599} 12/27/2021 14:02:53 - INFO - codeparrot_training - Step 35069: {'lr': 0.0001028433478720047, 'samples': 17955840, 'steps': 35069, 'batch_loss/train': 0.8876784685999155} 12/27/2021 14:03:04 - 
INFO - codeparrot_training - Step 35070: {'lr': 0.00010283060516729997, 'samples': 17956352, 'steps': 35070, 'batch_loss/train': 0.7857760358601809} 12/27/2021 14:03:14 - INFO - codeparrot_training - Step 35071: {'lr': 0.0001028178630476852, 'samples': 17956864, 'steps': 35071, 'batch_loss/train': 0.7838863311335444} 12/27/2021 14:03:26 - INFO - codeparrot_training - Step 35072: {'lr': 0.00010280512151321104, 'samples': 17957376, 'steps': 35072, 'batch_loss/train': 0.779925080947578} 12/27/2021 14:03:37 - INFO - codeparrot_training - Step 35073: {'lr': 0.00010279238056392812, 'samples': 17957888, 'steps': 35073, 'batch_loss/train': 0.7220672918483615} 12/27/2021 14:03:48 - INFO - codeparrot_training - Step 35074: {'lr': 0.00010277964019988715, 'samples': 17958400, 'steps': 35074, 'batch_loss/train': 0.7945122066885233} 12/27/2021 14:04:01 - INFO - codeparrot_training - Step 35075: {'lr': 0.0001027669004211387, 'samples': 17958912, 'steps': 35075, 'batch_loss/train': 0.8057098719291389} 12/27/2021 14:04:12 - INFO - codeparrot_training - Step 35076: {'lr': 0.00010275416122773346, 'samples': 17959424, 'steps': 35076, 'batch_loss/train': 0.7385892197489738} 12/27/2021 14:04:23 - INFO - codeparrot_training - Step 35077: {'lr': 0.00010274142261972211, 'samples': 17959936, 'steps': 35077, 'batch_loss/train': 0.7642313675023615} 12/27/2021 14:04:33 - INFO - codeparrot_training - Step 35078: {'lr': 0.00010272868459715523, 'samples': 17960448, 'steps': 35078, 'batch_loss/train': 0.6912565319798887} 12/27/2021 14:04:45 - INFO - codeparrot_training - Step 35079: {'lr': 0.0001027159471600836, 'samples': 17960960, 'steps': 35079, 'batch_loss/train': 0.7145729574840516} 12/27/2021 14:04:56 - INFO - codeparrot_training - Step 35080: {'lr': 0.00010270321030855765, 'samples': 17961472, 'steps': 35080, 'batch_loss/train': 0.745631193742156} 12/27/2021 14:05:07 - INFO - codeparrot_training - Step 35081: {'lr': 0.00010269047404262805, 'samples': 17961984, 'steps': 35081, 
'batch_loss/train': 0.6901947981677949} 12/27/2021 14:05:19 - INFO - codeparrot_training - Step 35082: {'lr': 0.00010267773836234568, 'samples': 17962496, 'steps': 35082, 'batch_loss/train': 0.7772010378539562} 12/27/2021 14:05:30 - INFO - codeparrot_training - Step 35083: {'lr': 0.00010266500326776088, 'samples': 17963008, 'steps': 35083, 'batch_loss/train': 0.7466885794419795} 12/27/2021 14:05:40 - INFO - codeparrot_training - Step 35084: {'lr': 0.00010265226875892442, 'samples': 17963520, 'steps': 35084, 'batch_loss/train': 0.5935227453592233} 12/27/2021 14:05:54 - INFO - codeparrot_training - Step 35085: {'lr': 0.00010263953483588687, 'samples': 17964032, 'steps': 35085, 'batch_loss/train': 0.7465480281971395} 12/27/2021 14:06:05 - INFO - codeparrot_training - Step 35086: {'lr': 0.0001026268014986989, 'samples': 17964544, 'steps': 35086, 'batch_loss/train': 0.7018557609990239} 12/27/2021 14:06:16 - INFO - codeparrot_training - Step 35087: {'lr': 0.00010261406874741117, 'samples': 17965056, 'steps': 35087, 'batch_loss/train': 0.7749092602171004} 12/27/2021 14:06:26 - INFO - codeparrot_training - Step 35088: {'lr': 0.0001026013365820743, 'samples': 17965568, 'steps': 35088, 'batch_loss/train': 0.7035460919141769} 12/27/2021 14:06:38 - INFO - codeparrot_training - Step 35089: {'lr': 0.00010258860500273868, 'samples': 17966080, 'steps': 35089, 'batch_loss/train': 0.8013028502464294} 12/27/2021 14:06:49 - INFO - codeparrot_training - Step 35090: {'lr': 0.0001025758740094552, 'samples': 17966592, 'steps': 35090, 'batch_loss/train': 0.7017069682478905} 12/27/2021 14:07:00 - INFO - codeparrot_training - Step 35091: {'lr': 0.00010256314360227448, 'samples': 17967104, 'steps': 35091, 'batch_loss/train': 0.6733329384587705} 12/27/2021 14:07:14 - INFO - codeparrot_training - Step 35092: {'lr': 0.00010255041378124683, 'samples': 17967616, 'steps': 35092, 'batch_loss/train': 0.7768855793401599} 12/27/2021 14:07:24 - INFO - codeparrot_training - Step 35093: {'lr': 
0.00010253768454642315, 'samples': 17968128, 'steps': 35093, 'batch_loss/train': 0.8107994580641389} 12/27/2021 14:07:35 - INFO - codeparrot_training - Step 35094: {'lr': 0.00010252495589785407, 'samples': 17968640, 'steps': 35094, 'batch_loss/train': 0.884938457980752} 12/27/2021 14:07:47 - INFO - codeparrot_training - Step 35095: {'lr': 0.00010251222783558998, 'samples': 17969152, 'steps': 35095, 'batch_loss/train': 0.7682624738663435} 12/27/2021 14:07:57 - INFO - codeparrot_training - Step 35096: {'lr': 0.0001024995003596815, 'samples': 17969664, 'steps': 35096, 'batch_loss/train': 0.8379402200225741} 12/27/2021 14:08:08 - INFO - codeparrot_training - Step 35097: {'lr': 0.00010248677347017949, 'samples': 17970176, 'steps': 35097, 'batch_loss/train': 0.7951873298734426} 12/27/2021 14:08:20 - INFO - codeparrot_training - Step 35098: {'lr': 0.00010247404716713426, 'samples': 17970688, 'steps': 35098, 'batch_loss/train': 0.77194878552109} 12/27/2021 14:08:31 - INFO - codeparrot_training - Step 35099: {'lr': 0.00010246132145059652, 'samples': 17971200, 'steps': 35099, 'batch_loss/train': 0.7537603685632348} 12/27/2021 14:08:41 - INFO - codeparrot_training - Step 35100: {'lr': 0.00010244859632061687, 'samples': 17971712, 'steps': 35100, 'batch_loss/train': 0.737706012558192} 12/27/2021 14:08:52 - INFO - codeparrot_training - Step 35101: {'lr': 0.0001024358717772459, 'samples': 17972224, 'steps': 35101, 'batch_loss/train': 0.7871341900900006} 12/27/2021 14:09:06 - INFO - codeparrot_training - Step 35102: {'lr': 0.00010242314782053417, 'samples': 17972736, 'steps': 35102, 'batch_loss/train': 0.7878256943076849} 12/27/2021 14:09:17 - INFO - codeparrot_training - Step 35103: {'lr': 0.00010241042445053227, 'samples': 17973248, 'steps': 35103, 'batch_loss/train': 0.631833887193352} 12/27/2021 14:09:27 - INFO - codeparrot_training - Step 35104: {'lr': 0.00010239770166729081, 'samples': 17973760, 'steps': 35104, 'batch_loss/train': 0.7162641575559974} 12/27/2021 14:09:40 - 
INFO - codeparrot_training - Step 35105: {'lr': 0.00010238497947086046, 'samples': 17974272, 'steps': 35105, 'batch_loss/train': 0.6024287405889481}
12/27/2021 14:09:50 - INFO - codeparrot_training - Step 35106: {'lr': 0.00010237225786129157, 'samples': 17974784, 'steps': 35106, 'batch_loss/train': 0.7566647790372372}
12/27/2021 14:10:01 - INFO - codeparrot_training - Step 35107: {'lr': 0.00010235953683863477, 'samples': 17975296, 'steps': 35107, 'batch_loss/train': 0.833279955200851}
12/27/2021 14:10:13 - INFO - codeparrot_training - Step 35108: {'lr': 0.00010234681640294088, 'samples': 17975808, 'steps': 35108, 'batch_loss/train': 0.732110982760787}
12/27/2021 14:10:24 - INFO - codeparrot_training - Step 35109: {'lr': 0.0001023340965542602, 'samples': 17976320, 'steps': 35109, 'batch_loss/train': 0.7526496809441596}
12/27/2021 14:10:34 - INFO - codeparrot_training - Step 35110: {'lr': 0.0001023213772926434, 'samples': 17976832, 'steps': 35110, 'batch_loss/train': 0.7001846767961979}
12/27/2021 14:10:46 - INFO - codeparrot_training - Step 35111: {'lr': 0.00010230865861814106, 'samples': 17977344, 'steps': 35111, 'batch_loss/train': 0.7389930346980691}
12/27/2021 14:10:57 - INFO - codeparrot_training - Step 35112: {'lr': 0.0001022959405308037, 'samples': 17977856, 'steps': 35112, 'batch_loss/train': 0.7654247339814901}
12/27/2021 14:11:07 - INFO - codeparrot_training - Step 35113: {'lr': 0.00010228322303068193, 'samples': 17978368, 'steps': 35113, 'batch_loss/train': 0.7173255477100611}
12/27/2021 14:11:18 - INFO - codeparrot_training - Step 35114: {'lr': 0.00010227050611782629, 'samples': 17978880, 'steps': 35114, 'batch_loss/train': 0.8423047575633973}
12/27/2021 14:11:32 - INFO - codeparrot_training - Step 35115: {'lr': 0.00010225778979228734, 'samples': 17979392, 'steps': 35115, 'batch_loss/train': 0.6646361444145441}
12/27/2021 14:11:43 - INFO - codeparrot_training - Step 35116: {'lr': 0.00010224507405411562, 'samples': 17979904, 'steps': 35116, 'batch_loss/train': 0.7580457171425223}
12/27/2021 14:11:53 - INFO - codeparrot_training - Step 35117: {'lr': 0.00010223235890336178, 'samples': 17980416, 'steps': 35117, 'batch_loss/train': 0.69229931011796}
12/27/2021 14:12:05 - INFO - codeparrot_training - Step 35118: {'lr': 0.0001022196443400761, 'samples': 17980928, 'steps': 35118, 'batch_loss/train': 0.8245159205980599}
12/27/2021 14:12:16 - INFO - codeparrot_training - Step 35119: {'lr': 0.00010220693036430941, 'samples': 17981440, 'steps': 35119, 'batch_loss/train': 0.677489148103632}
12/27/2021 14:12:27 - INFO - codeparrot_training - Step 35120: {'lr': 0.00010219421697611228, 'samples': 17981952, 'steps': 35120, 'batch_loss/train': 1.2650333531200886}
12/27/2021 14:12:40 - INFO - codeparrot_training - Step 35121: {'lr': 0.00010218150417553501, 'samples': 17982464, 'steps': 35121, 'batch_loss/train': 0.7898223418742418}
12/27/2021 14:12:51 - INFO - codeparrot_training - Step 35122: {'lr': 0.00010216879196262821, 'samples': 17982976, 'steps': 35122, 'batch_loss/train': 0.6896977999713272}
12/27/2021 14:13:02 - INFO - codeparrot_training - Step 35123: {'lr': 0.00010215608033744262, 'samples': 17983488, 'steps': 35123, 'batch_loss/train': 0.7305723044555634}
12/27/2021 14:13:12 - INFO - codeparrot_training - Step 35124: {'lr': 0.00010214336930002855, 'samples': 17984000, 'steps': 35124, 'batch_loss/train': 0.8634367166087031}
12/27/2021 14:13:24 - INFO - codeparrot_training - Step 35125: {'lr': 0.00010213065885043662, 'samples': 17984512, 'steps': 35125, 'batch_loss/train': 0.7233592895790935}
12/27/2021 14:13:35 - INFO - codeparrot_training - Step 35126: {'lr': 0.00010211794898871737, 'samples': 17985024, 'steps': 35126, 'batch_loss/train': 0.6743515896378085}
12/27/2021 14:13:45 - INFO - codeparrot_training - Step 35127: {'lr': 0.0001021052397149213, 'samples': 17985536, 'steps': 35127, 'batch_loss/train': 0.780108267441392}
12/27/2021 14:14:00 - INFO - codeparrot_training - Step 35128: {'lr': 0.00010209253102909896, 'samples': 17986048, 'steps': 35128, 'batch_loss/train': 0.8785143010318279}
12/27/2021 14:14:11 - INFO - codeparrot_training - Step 35129: {'lr': 0.00010207982293130086, 'samples': 17986560, 'steps': 35129, 'batch_loss/train': 0.7336140810512006}
12/27/2021 14:14:21 - INFO - codeparrot_training - Step 35130: {'lr': 0.00010206711542157754, 'samples': 17987072, 'steps': 35130, 'batch_loss/train': 0.7749552242457867}
12/27/2021 14:14:33 - INFO - codeparrot_training - Step 35131: {'lr': 0.00010205440849997948, 'samples': 17987584, 'steps': 35131, 'batch_loss/train': 0.7602553786709905}
12/27/2021 14:14:44 - INFO - codeparrot_training - Step 35132: {'lr': 0.00010204170216655737, 'samples': 17988096, 'steps': 35132, 'batch_loss/train': 0.5702526903478429}
12/27/2021 14:14:55 - INFO - codeparrot_training - Step 35133: {'lr': 0.00010202899642136138, 'samples': 17988608, 'steps': 35133, 'batch_loss/train': 0.7151809957576916}
12/27/2021 14:15:05 - INFO - codeparrot_training - Step 35134: {'lr': 0.00010201629126444242, 'samples': 17989120, 'steps': 35134, 'batch_loss/train': 0.6906768688932061}
12/27/2021 14:15:19 - INFO - codeparrot_training - Step 35135: {'lr': 0.00010200358669585072, 'samples': 17989632, 'steps': 35135, 'batch_loss/train': 0.7517832545563579}
12/27/2021 14:15:30 - INFO - codeparrot_training - Step 35136: {'lr': 0.00010199088271563686, 'samples': 17990144, 'steps': 35136, 'batch_loss/train': 0.6683237790130079}
12/27/2021 14:15:41 - INFO - codeparrot_training - Step 35137: {'lr': 0.00010197817932385137, 'samples': 17990656, 'steps': 35137, 'batch_loss/train': 0.7020743063185364}
12/27/2021 14:15:53 - INFO - codeparrot_training - Step 35138: {'lr': 0.00010196547652054475, 'samples': 17991168, 'steps': 35138, 'batch_loss/train': 0.6983291930519044}
12/27/2021 14:16:03 - INFO - codeparrot_training - Step 35139: {'lr': 0.0001019527743057675, 'samples': 17991680, 'steps': 35139, 'batch_loss/train': 0.7402449373621494}
12/27/2021 14:16:14 - INFO - codeparrot_training - Step 35140: {'lr': 0.00010194007267957012, 'samples': 17992192, 'steps': 35140, 'batch_loss/train': 0.6956094391644001}
12/27/2021 14:16:26 - INFO - codeparrot_training - Step 35141: {'lr': 0.0001019273716420031, 'samples': 17992704, 'steps': 35141, 'batch_loss/train': 0.7646424844861031}
12/27/2021 14:16:37 - INFO - codeparrot_training - Step 35142: {'lr': 0.00010191467119311696, 'samples': 17993216, 'steps': 35142, 'batch_loss/train': 0.7658475402276963}
12/27/2021 14:16:47 - INFO - codeparrot_training - Step 35143: {'lr': 0.00010190197133296225, 'samples': 17993728, 'steps': 35143, 'batch_loss/train': 0.740325310267508}
12/27/2021 14:17:01 - INFO - codeparrot_training - Step 35144: {'lr': 0.00010188927206158921, 'samples': 17994240, 'steps': 35144, 'batch_loss/train': 0.6224370603449643}
12/27/2021 14:17:12 - INFO - codeparrot_training - Step 35145: {'lr': 0.00010187657337904857, 'samples': 17994752, 'steps': 35145, 'batch_loss/train': 0.8157699769362807}
12/27/2021 14:17:22 - INFO - codeparrot_training - Step 35146: {'lr': 0.00010186387528539085, 'samples': 17995264, 'steps': 35146, 'batch_loss/train': 0.6703877621330321}
12/27/2021 14:17:33 - INFO - codeparrot_training - Step 35147: {'lr': 0.00010185117778066633, 'samples': 17995776, 'steps': 35147, 'batch_loss/train': 0.6297203365247697}
12/27/2021 14:17:45 - INFO - codeparrot_training - Step 35148: {'lr': 0.0001018384808649255, 'samples': 17996288, 'steps': 35148, 'batch_loss/train': 0.6834418375510722}
12/27/2021 14:17:56 - INFO - codeparrot_training - Step 35149: {'lr': 0.00010182578453821911, 'samples': 17996800, 'steps': 35149, 'batch_loss/train': 0.6168166999705136}
12/27/2021 14:18:06 - INFO - codeparrot_training - Step 35150: {'lr': 0.00010181308880059736, 'samples': 17997312, 'steps': 35150, 'batch_loss/train': 0.6210667092818767}
12/27/2021 14:18:21 - INFO - codeparrot_training - Step 35151: {'lr': 0.00010180039365211083, 'samples': 17997824, 'steps': 35151, 'batch_loss/train': 0.6926565235480666}
12/27/2021 14:18:31 - INFO - codeparrot_training - Step 35152: {'lr': 0.00010178769909280996, 'samples': 17998336, 'steps': 35152, 'batch_loss/train': 0.6893277075141668}
12/27/2021 14:18:42 - INFO - codeparrot_training - Step 35153: {'lr': 0.00010177500512274524, 'samples': 17998848, 'steps': 35153, 'batch_loss/train': 0.8051333408802748}
12/27/2021 14:18:54 - INFO - codeparrot_training - Step 35154: {'lr': 0.00010176231174196713, 'samples': 17999360, 'steps': 35154, 'batch_loss/train': 0.7609133371151984}
12/27/2021 14:19:05 - INFO - codeparrot_training - Step 35155: {'lr': 0.00010174961895052609, 'samples': 17999872, 'steps': 35155, 'batch_loss/train': 0.7026512483134866}
12/27/2021 14:19:15 - INFO - codeparrot_training - Step 35156: {'lr': 0.0001017369267484726, 'samples': 18000384, 'steps': 35156, 'batch_loss/train': 0.6776679652393796}
12/27/2021 14:19:26 - INFO - codeparrot_training - Step 35157: {'lr': 0.0001017242351358571, 'samples': 18000896, 'steps': 35157, 'batch_loss/train': 1.2867574952542782}
12/27/2021 14:19:38 - INFO - codeparrot_training - Step 35158: {'lr': 0.00010171154411273015, 'samples': 18001408, 'steps': 35158, 'batch_loss/train': 0.6794422026723623}
12/27/2021 14:19:49 - INFO - codeparrot_training - Step 35159: {'lr': 0.00010169885367914192, 'samples': 18001920, 'steps': 35159, 'batch_loss/train': 1.001959583722055}
12/27/2021 14:19:59 - INFO - codeparrot_training - Step 35160: {'lr': 0.00010168616383514326, 'samples': 18002432, 'steps': 35160, 'batch_loss/train': 0.7031163861975074}
12/27/2021 14:20:13 - INFO - codeparrot_training - Step 35161: {'lr': 0.0001016734745807843, 'samples': 18002944, 'steps': 35161, 'batch_loss/train': 0.672445182222873}
12/27/2021 14:20:24 - INFO - codeparrot_training - Step 35162: {'lr': 0.00010166078591611558, 'samples': 18003456, 'steps': 35162, 'batch_loss/train': 1.4400238329544663}
12/27/2021 14:20:35 - INFO - codeparrot_training - Step 35163: {'lr': 0.00010164809784118758, 'samples': 18003968, 'steps': 35163, 'batch_loss/train': 0.7443693978711963}
12/27/2021 14:20:47 - INFO - codeparrot_training - Step 35164: {'lr': 0.00010163541035605073, 'samples': 18004480, 'steps': 35164, 'batch_loss/train': 0.7068806299939752}
12/27/2021 14:20:57 - INFO - codeparrot_training - Step 35165: {'lr': 0.00010162272346075547, 'samples': 18004992, 'steps': 35165, 'batch_loss/train': 0.7518135700374842}
12/27/2021 14:21:08 - INFO - codeparrot_training - Step 35166: {'lr': 0.00010161003715535225, 'samples': 18005504, 'steps': 35166, 'batch_loss/train': 0.6888754717074335}
12/27/2021 14:21:19 - INFO - codeparrot_training - Step 35167: {'lr': 0.00010159735143989147, 'samples': 18006016, 'steps': 35167, 'batch_loss/train': 0.8309943862259388}
12/27/2021 14:21:31 - INFO - codeparrot_training - Step 35168: {'lr': 0.0001015846663144236, 'samples': 18006528, 'steps': 35168, 'batch_loss/train': 0.762924381531775}
12/27/2021 14:21:41 - INFO - codeparrot_training - Step 35169: {'lr': 0.00010157198177899913, 'samples': 18007040, 'steps': 35169, 'batch_loss/train': 0.650990879163146}
12/27/2021 14:21:52 - INFO - codeparrot_training - Step 35170: {'lr': 0.00010155929783366824, 'samples': 18007552, 'steps': 35170, 'batch_loss/train': 0.6757689109072089}
12/27/2021 14:22:04 - INFO - codeparrot_training - Step 35171: {'lr': 0.00010154661447848163, 'samples': 18008064, 'steps': 35171, 'batch_loss/train': 0.6893121431348845}
12/27/2021 14:22:15 - INFO - codeparrot_training - Step 35172: {'lr': 0.00010153393171348973, 'samples': 18008576, 'steps': 35172, 'batch_loss/train': 0.6766618406400084}
12/27/2021 14:22:25 - INFO - codeparrot_training - Step 35173: {'lr': 0.00010152124953874276, 'samples': 18009088, 'steps': 35173, 'batch_loss/train': 0.7527262098155916}
12/27/2021 14:22:39 - INFO - codeparrot_training - Step 35174: {'lr': 0.00010150856795429115, 'samples': 18009600, 'steps': 35174, 'batch_loss/train': 0.6340643847361207}
12/27/2021 14:22:50 - INFO - codeparrot_training - Step 35175: {'lr': 0.00010149588696018556, 'samples': 18010112, 'steps': 35175, 'batch_loss/train': 0.7714525461196899}
12/27/2021 14:23:01 - INFO - codeparrot_training - Step 35176: {'lr': 0.00010148320655647619, 'samples': 18010624, 'steps': 35176, 'batch_loss/train': 0.7312363380333409}
12/27/2021 14:23:13 - INFO - codeparrot_training - Step 35177: {'lr': 0.00010147052674321342, 'samples': 18011136, 'steps': 35177, 'batch_loss/train': 0.6664560092613101}
12/27/2021 14:23:24 - INFO - codeparrot_training - Step 35178: {'lr': 0.00010145784752044789, 'samples': 18011648, 'steps': 35178, 'batch_loss/train': 0.677890510938596}
12/27/2021 14:23:34 - INFO - codeparrot_training - Step 35179: {'lr': 0.00010144516888822983, 'samples': 18012160, 'steps': 35179, 'batch_loss/train': 0.6962851770222187}
12/27/2021 14:23:45 - INFO - codeparrot_training - Step 35180: {'lr': 0.00010143249084660968, 'samples': 18012672, 'steps': 35180, 'batch_loss/train': 0.7360342089086771}
12/27/2021 14:23:59 - INFO - codeparrot_training - Step 35181: {'lr': 0.00010141981339563786, 'samples': 18013184, 'steps': 35181, 'batch_loss/train': 1.430893034208566}
12/27/2021 14:24:09 - INFO - codeparrot_training - Step 35182: {'lr': 0.00010140713653536473, 'samples': 18013696, 'steps': 35182, 'batch_loss/train': 0.7693841503933072}
12/27/2021 14:24:20 - INFO - codeparrot_training - Step 35183: {'lr': 0.00010139446026584074, 'samples': 18014208, 'steps': 35183, 'batch_loss/train': 0.8550612232647836}
12/27/2021 14:24:32 - INFO - codeparrot_training - Step 35184: {'lr': 0.00010138178458711633, 'samples': 18014720, 'steps': 35184, 'batch_loss/train': 0.7252981560304761}
12/27/2021 14:24:43 - INFO - codeparrot_training - Step 35185: {'lr': 0.00010136910949924166, 'samples': 18015232, 'steps': 35185, 'batch_loss/train': 0.7882574438117445}
12/27/2021 14:24:53 - INFO - codeparrot_training - Step 35186: {'lr': 0.0001013564350022674, 'samples': 18015744, 'steps': 35186, 'batch_loss/train': 0.7008834099397063}
12/27/2021 14:25:05 - INFO - codeparrot_training - Step 35187: {'lr': 0.00010134376109624391, 'samples': 18016256, 'steps': 35187, 'batch_loss/train': 0.6405852260068059}
12/27/2021 14:25:16 - INFO - codeparrot_training - Step 35188: {'lr': 0.0001013310877812213, 'samples': 18016768, 'steps': 35188, 'batch_loss/train': 0.6906677866354585}
12/27/2021 14:25:27 - INFO - codeparrot_training - Step 35189: {'lr': 0.00010131841505725034, 'samples': 18017280, 'steps': 35189, 'batch_loss/train': 0.7556113878963515}
12/27/2021 14:25:37 - INFO - codeparrot_training - Step 35190: {'lr': 0.00010130574292438113, 'samples': 18017792, 'steps': 35190, 'batch_loss/train': 0.7164440928027034}
12/27/2021 14:25:51 - INFO - codeparrot_training - Step 35191: {'lr': 0.00010129307138266414, 'samples': 18018304, 'steps': 35191, 'batch_loss/train': 0.6467053182423115}
12/27/2021 14:26:02 - INFO - codeparrot_training - Step 35192: {'lr': 0.00010128040043214975, 'samples': 18018816, 'steps': 35192, 'batch_loss/train': 0.7722024852409959}
12/27/2021 14:26:12 - INFO - codeparrot_training - Step 35193: {'lr': 0.00010126773007288834, 'samples': 18019328, 'steps': 35193, 'batch_loss/train': 0.6909907374065369}
12/27/2021 14:26:25 - INFO - codeparrot_training - Step 35194: {'lr': 0.00010125506030493026, 'samples': 18019840, 'steps': 35194, 'batch_loss/train': 0.7761774896644056}
12/27/2021 14:26:35 - INFO - codeparrot_training - Step 35195: {'lr': 0.000101242391128326, 'samples': 18020352, 'steps': 35195, 'batch_loss/train': 0.8082669973373413}
12/27/2021 14:26:46 - INFO - codeparrot_training - Step 35196: {'lr': 0.00010122972254312562, 'samples': 18020864, 'steps': 35196, 'batch_loss/train': 0.6949225077405572}
12/27/2021 14:26:58 - INFO - codeparrot_training - Step 35197: {'lr': 0.00010121705454937983, 'samples': 18021376, 'steps': 35197, 'batch_loss/train': 0.6570263738685753}
12/27/2021 14:27:08 - INFO - codeparrot_training - Step 35198: {'lr': 0.0001012043871471389, 'samples': 18021888, 'steps': 35198, 'batch_loss/train': 0.6668422757647932}
12/27/2021 14:27:19 - INFO - codeparrot_training - Step 35199: {'lr': 0.00010119172033645307, 'samples': 18022400, 'steps': 35199, 'batch_loss/train': 0.6842587082646787}
12/27/2021 14:27:30 - INFO - codeparrot_training - Step 35200: {'lr': 0.0001011790541173727, 'samples': 18022912, 'steps': 35200, 'batch_loss/train': 0.6956548662856221}
12/27/2021 14:27:43 - INFO - codeparrot_training - Step 35201: {'lr': 0.00010116638848994835, 'samples': 18023424, 'steps': 35201, 'batch_loss/train': 0.652626617345959}
12/27/2021 14:27:54 - INFO - codeparrot_training - Step 35202: {'lr': 0.00010115372345423018, 'samples': 18023936, 'steps': 35202, 'batch_loss/train': 0.5884508807212114}
12/27/2021 14:28:05 - INFO - codeparrot_training - Step 35203: {'lr': 0.00010114105901026851, 'samples': 18024448, 'steps': 35203, 'batch_loss/train': 0.8194062849506736}
12/27/2021 14:28:17 - INFO - codeparrot_training - Step 35204: {'lr': 0.00010112839515811398, 'samples': 18024960, 'steps': 35204, 'batch_loss/train': 0.7487818356603384}
12/27/2021 14:28:27 - INFO - codeparrot_training - Step 35205: {'lr': 0.00010111573189781661, 'samples': 18025472, 'steps': 35205, 'batch_loss/train': 0.5917289052158594}
12/27/2021 14:28:38 - INFO - codeparrot_training - Step 35206: {'lr': 0.00010110306922942692, 'samples': 18025984, 'steps': 35206, 'batch_loss/train': 0.7149459421634674}
12/27/2021 14:28:52 - INFO - codeparrot_training - Step 35207: {'lr': 0.00010109040715299517, 'samples': 18026496, 'steps': 35207, 'batch_loss/train': 0.6708722609910183}
12/27/2021 14:29:02 - INFO - codeparrot_training - Step 35208: {'lr': 0.00010107774566857175, 'samples': 18027008, 'steps': 35208, 'batch_loss/train': 0.643710891250521}
12/27/2021 14:29:13 - INFO - codeparrot_training - Step 35209: {'lr': 0.00010106508477620699, 'samples': 18027520, 'steps': 35209, 'batch_loss/train': 0.6656074519269168}
12/27/2021 14:29:25 - INFO - codeparrot_training - Step 35210: {'lr': 0.00010105242447595128, 'samples': 18028032, 'steps': 35210, 'batch_loss/train': 0.7734567660372704}
12/27/2021 14:29:36 - INFO - codeparrot_training - Step 35211: {'lr': 0.00010103976476785474, 'samples': 18028544, 'steps': 35211, 'batch_loss/train': 0.7239289589342661}
12/27/2021 14:29:47 - INFO - codeparrot_training - Step 35212: {'lr': 0.00010102710565196793, 'samples': 18029056, 'steps': 35212, 'batch_loss/train': 0.7522062069620006}
12/27/2021 14:29:57 - INFO - codeparrot_training - Step 35213: {'lr': 0.00010101444712834118, 'samples': 18029568, 'steps': 35213, 'batch_loss/train': 0.6594577883370221}
12/27/2021 14:30:10 - INFO - codeparrot_training - Step 35214: {'lr': 0.00010100178919702455, 'samples': 18030080, 'steps': 35214, 'batch_loss/train': 1.0657052583992481}
12/27/2021 14:30:20 - INFO - codeparrot_training - Step 35215: {'lr': 0.00010098913185806874, 'samples': 18030592, 'steps': 35215, 'batch_loss/train': 0.7334190639958251}
12/27/2021 14:30:31 - INFO - codeparrot_training - Step 35216: {'lr': 0.00010097647511152378, 'samples': 18031104, 'steps': 35216, 'batch_loss/train': 0.8242321056313813}
12/27/2021 14:30:43 - INFO - codeparrot_training - Step 35217: {'lr': 0.00010096381895744009, 'samples': 18031616, 'steps': 35217, 'batch_loss/train': 0.6685341009870172}
12/27/2021 14:30:54 - INFO - codeparrot_training - Step 35218: {'lr': 0.00010095116339586799, 'samples': 18032128, 'steps': 35218, 'batch_loss/train': 0.7330226376652718}
12/27/2021 14:31:04 - INFO - codeparrot_training - Step 35219: {'lr': 0.00010093850842685779, 'samples': 18032640, 'steps': 35219, 'batch_loss/train': 0.6420453593600541}
12/27/2021 14:31:18 - INFO - codeparrot_training - Step 35220: {'lr': 0.00010092585405045976, 'samples': 18033152, 'steps': 35220, 'batch_loss/train': 0.6976306703872979}
12/27/2021 14:31:29 - INFO - codeparrot_training - Step 35221: {'lr': 0.00010091320026672427, 'samples': 18033664, 'steps': 35221, 'batch_loss/train': 0.629757251124829}
12/27/2021 14:31:40 - INFO - codeparrot_training - Step 35222: {'lr': 0.00010090054707570162, 'samples': 18034176, 'steps': 35222, 'batch_loss/train': 0.7255532341077924}
12/27/2021 14:31:50 - INFO - codeparrot_training - Step 35223: {'lr': 0.00010088789447744206, 'samples': 18034688, 'steps': 35223, 'batch_loss/train': 0.7115226164460182}
12/27/2021 14:32:03 - INFO - codeparrot_training - Step 35224: {'lr': 0.00010087524247199605, 'samples': 18035200, 'steps': 35224, 'batch_loss/train': 0.7043002606369555}
12/27/2021 14:32:13 - INFO - codeparrot_training - Step 35225: {'lr': 0.00010086259105941355, 'samples': 18035712, 'steps': 35225, 'batch_loss/train': 0.6634961240924895}
12/27/2021 14:32:24 - INFO - codeparrot_training - Step 35226: {'lr': 0.00010084994023974523, 'samples': 18036224, 'steps': 35226, 'batch_loss/train': 0.5223846236476675}
12/27/2021 14:32:36 - INFO - codeparrot_training - Step 35227: {'lr': 0.00010083729001304126, 'samples': 18036736, 'steps': 35227, 'batch_loss/train': 0.7400080189108849}
12/27/2021 14:32:46 - INFO - codeparrot_training - Step 35228: {'lr': 0.00010082464037935185, 'samples': 18037248, 'steps': 35228, 'batch_loss/train': 0.7281855912879109}
12/27/2021 14:32:57 - INFO - codeparrot_training - Step 35229: {'lr': 0.00010081199133872723, 'samples': 18037760, 'steps': 35229, 'batch_loss/train': 0.5384515738696791}
12/27/2021 14:33:11 - INFO - codeparrot_training - Step 35230: {'lr': 0.000100799342891218, 'samples': 18038272, 'steps': 35230, 'batch_loss/train': 0.7330238968133926}
12/27/2021 14:33:22 - INFO - codeparrot_training - Step 35231: {'lr': 0.00010078669503687416, 'samples': 18038784, 'steps': 35231, 'batch_loss/train': 0.7768237674608827}
12/27/2021 14:33:32 - INFO - codeparrot_training - Step 35232: {'lr': 0.00010077404777574605, 'samples': 18039296, 'steps': 35232, 'batch_loss/train': 0.693385686725378}
12/27/2021 14:33:44 - INFO - codeparrot_training - Step 35233: {'lr': 0.00010076140110788401, 'samples': 18039808, 'steps': 35233, 'batch_loss/train': 0.7404655311256647}
12/27/2021 14:33:55 - INFO - codeparrot_training - Step 35234: {'lr': 0.00010074875503333828, 'samples': 18040320, 'steps': 35234, 'batch_loss/train': 0.7472183112986386}
12/27/2021 14:34:06 - INFO - codeparrot_training - Step 35235: {'lr': 0.00010073610955215914, 'samples': 18040832, 'steps': 35235, 'batch_loss/train': 0.7656711479648948}
12/27/2021 14:34:16 - INFO - codeparrot_training - Step 35236: {'lr': 0.00010072346466439694, 'samples': 18041344, 'steps': 35236, 'batch_loss/train': 0.6300302390009165}
12/27/2021 14:34:30 - INFO - codeparrot_training - Step 35237: {'lr': 0.00010071082037010174, 'samples': 18041856, 'steps': 35237, 'batch_loss/train': 0.6931565012782812}
12/27/2021 14:34:41 - INFO - codeparrot_training - Step 35238: {'lr': 0.00010069817666932402, 'samples': 18042368, 'steps': 35238, 'batch_loss/train': 0.7310207369737327}
12/27/2021 14:34:52 - INFO - codeparrot_training - Step 35239: {'lr': 0.00010068553356211405, 'samples': 18042880, 'steps': 35239, 'batch_loss/train': 0.6543886200524867}
12/27/2021 14:35:04 - INFO - codeparrot_training - Step 35240: {'lr': 0.00010067289104852187, 'samples': 18043392, 'steps': 35240, 'batch_loss/train': 0.6918481877073646}
12/27/2021 14:35:14 - INFO - codeparrot_training - Step 35241: {'lr': 0.00010066024912859806, 'samples': 18043904, 'steps': 35241, 'batch_loss/train': 0.7321366276592016}
12/27/2021 14:35:25 - INFO - codeparrot_training - Step 35242: {'lr': 0.0001006476078023926, 'samples': 18044416, 'steps': 35242, 'batch_loss/train': 0.6612082719802856}
12/27/2021 14:35:37 - INFO - codeparrot_training - Step 35243: {'lr': 0.00010063496706995587, 'samples': 18044928, 'steps': 35243, 'batch_loss/train': 0.7500444948673248}
12/27/2021 14:35:48 - INFO - codeparrot_training - Step 35244: {'lr': 0.00010062232693133813, 'samples': 18045440, 'steps': 35244, 'batch_loss/train': 0.7143563642166555}
12/27/2021 14:35:58 - INFO - codeparrot_training - Step 35245: {'lr': 0.0001006096873865896, 'samples': 18045952, 'steps': 35245, 'batch_loss/train': 0.758274822961539}
12/27/2021 14:36:09 - INFO - codeparrot_training - Step 35246: {'lr': 0.00010059704843576057, 'samples': 18046464, 'steps': 35246, 'batch_loss/train': 0.7051347822416574}
12/27/2021 14:36:21 - INFO - codeparrot_training - Step 35247: {'lr': 0.00010058441007890124, 'samples': 18046976, 'steps': 35247, 'batch_loss/train': 0.7389779379591346}
12/27/2021 14:36:32 - INFO - codeparrot_training - Step 35248: {'lr': 0.00010057177231606185, 'samples': 18047488, 'steps': 35248, 'batch_loss/train': 0.6712966460036114}
12/27/2021 14:36:42 - INFO - codeparrot_training - Step 35249: {'lr': 0.0001005591351472927, 'samples': 18048000, 'steps': 35249, 'batch_loss/train': 0.6828851979225874}
12/27/2021 14:36:56 - INFO - codeparrot_training - Step 35250: {'lr': 0.00010054649857264412, 'samples': 18048512, 'steps': 35250, 'batch_loss/train': 0.7554211844690144}
12/27/2021 14:37:07 - INFO - codeparrot_training - Step 35251: {'lr': 0.00010053386259216601, 'samples': 18049024, 'steps': 35251, 'batch_loss/train': 0.6524878633208573}
12/27/2021 14:37:17 - INFO - codeparrot_training - Step 35252: {'lr': 0.00010052122720590892, 'samples': 18049536, 'steps': 35252, 'batch_loss/train': 0.6949171060696244}
12/27/2021 14:37:30 - INFO - codeparrot_training - Step 35253: {'lr': 0.0001005085924139231, 'samples': 18050048, 'steps': 35253, 'batch_loss/train': 0.6824738252907991}
12/27/2021 14:37:40 - INFO - codeparrot_training - Step 35254: {'lr': 0.00010049595821625857, 'samples': 18050560, 'steps': 35254, 'batch_loss/train': 0.7683088001795113}
12/27/2021 14:37:51 - INFO - codeparrot_training - Step 35255: {'lr': 0.00010048332461296558, 'samples': 18051072, 'steps': 35255, 'batch_loss/train': 0.569421346852323}
12/27/2021 14:38:03 - INFO - codeparrot_training - Step 35256: {'lr': 0.00010047069160409458, 'samples': 18051584, 'steps': 35256, 'batch_loss/train': 0.7755755205871537}
12/27/2021 14:38:14 - INFO - codeparrot_training - Step 35257: {'lr': 0.0001004580591896956, 'samples': 18052096, 'steps': 35257, 'batch_loss/train': 0.7147713117301464}
12/27/2021 14:38:24 - INFO - codeparrot_training - Step 35258: {'lr': 0.0001004454273698189, 'samples': 18052608, 'steps': 35258, 'batch_loss/train': 0.5795923029072583}
12/27/2021 14:38:35 - INFO - codeparrot_training - Step 35259: {'lr': 0.0001004327961445147, 'samples': 18053120, 'steps': 35259, 'batch_loss/train': 0.6824042607913725}
12/27/2021 14:38:49 - INFO - codeparrot_training - Step 35260: {'lr': 0.00010042016551383323, 'samples': 18053632, 'steps': 35260, 'batch_loss/train': 0.721378194342833}
12/27/2021 14:39:00 - INFO - codeparrot_training - Step 35261: {'lr': 0.00010040753547782472, 'samples': 18054144, 'steps': 35261, 'batch_loss/train': 0.68565181363374}
12/27/2021 14:39:10 - INFO - codeparrot_training - Step 35262: {'lr': 0.00010039490603653936, 'samples': 18054656, 'steps': 35262, 'batch_loss/train': 0.6519018667750061}
12/27/2021 14:39:22 - INFO - codeparrot_training - Step 35263: {'lr': 0.00010038227719002735, 'samples': 18055168, 'steps': 35263, 'batch_loss/train': 0.7457216419279575}
12/27/2021 14:39:33 - INFO - codeparrot_training - Step 35264: {'lr': 0.00010036964893833894, 'samples': 18055680, 'steps': 35264, 'batch_loss/train': 1.1511609707958996}
12/27/2021 14:39:44 - INFO - codeparrot_training - Step 35265: {'lr': 0.00010035702128152438, 'samples': 18056192, 'steps': 35265, 'batch_loss/train': 0.6296189457643777}
12/27/2021 14:39:58 - INFO - codeparrot_training - Step 35266: {'lr': 0.00010034439421963362, 'samples': 18056704, 'steps': 35266, 'batch_loss/train': 0.6718328902497888}
12/27/2021 14:40:08 - INFO - codeparrot_training - Step 35267: {'lr': 0.00010033176775271713, 'samples': 18057216, 'steps': 35267, 'batch_loss/train': 0.6684677107259631}
12/27/2021 14:40:19 - INFO - codeparrot_training - Step 35268: {'lr': 0.00010031914188082511, 'samples': 18057728, 'steps': 35268, 'batch_loss/train': 0.6149955110158771}
12/27/2021 14:40:31 - INFO - codeparrot_training - Step 35269: {'lr': 0.00010030651660400755, 'samples': 18058240, 'steps': 35269, 'batch_loss/train': 0.6121428828919306}
12/27/2021 14:40:42 - INFO - codeparrot_training - Step 35270: {'lr': 0.00010029389192231481, 'samples': 18058752, 'steps': 35270, 'batch_loss/train': 0.7831706092692912}
12/27/2021 14:40:52 - INFO - codeparrot_training - Step 35271: {'lr': 0.000100281267835797, 'samples': 18059264, 'steps': 35271, 'batch_loss/train': 0.7005628533661366}
12/27/2021 14:41:03 - INFO - codeparrot_training - Step 35272: {'lr': 0.00010026864434450433, 'samples': 18059776, 'steps': 35272, 'batch_loss/train': 0.8859023684635758}
12/27/2021 14:41:15 - INFO - codeparrot_training - Step 35273: {'lr': 0.000100256021448487, 'samples': 18060288, 'steps': 35273, 'batch_loss/train': 0.600060848519206}
12/27/2021 14:41:26 - INFO - codeparrot_training - Step 35274: {'lr': 0.00010024339914779517, 'samples': 18060800, 'steps': 35274, 'batch_loss/train': 0.7123131351545453}
12/27/2021 14:41:36 - INFO - codeparrot_training - Step 35275: {'lr': 0.00010023077744247908, 'samples': 18061312, 'steps': 35275, 'batch_loss/train': 0.6837335091549903}
12/27/2021 14:41:50 - INFO - codeparrot_training - Step 35276: {'lr': 0.00010021815633258885, 'samples': 18061824, 'steps': 35276, 'batch_loss/train': 0.6951230349950492}
12/27/2021 14:42:01 - INFO - codeparrot_training - Step 35277: {'lr': 0.00010020553581817468, 'samples': 18062336, 'steps': 35277, 'batch_loss/train': 0.5638306156615727}
12/27/2021 14:42:11 - INFO - codeparrot_training - Step 35278: {'lr': 0.00010019291589928673, 'samples': 18062848, 'steps': 35278, 'batch_loss/train': 0.8123090239241719}
12/27/2021 14:42:24 - INFO - codeparrot_training - Step 35279: {'lr': 0.00010018029657597528, 'samples': 18063360, 'steps': 35279, 'batch_loss/train': 0.7637016200460494}
12/27/2021 14:42:34 - INFO - codeparrot_training - Step 35280: {'lr': 0.00010016767784829028, 'samples': 18063872, 'steps': 35280, 'batch_loss/train': 0.699349164031446}
12/27/2021 14:42:45 - INFO - codeparrot_training - Step 35281: {'lr': 0.00010015505971628197, 'samples': 18064384, 'steps': 35281, 'batch_loss/train': 0.7190238307230175}
12/27/2021 14:42:55 - INFO - codeparrot_training - Step 35282: {'lr': 0.00010014244218000071, 'samples': 18064896, 'steps': 35282, 'batch_loss/train': 0.7472387235611677}
12/27/2021 14:43:09 - INFO - codeparrot_training - Step 35283: {'lr': 0.00010012982523949645, 'samples': 18065408, 'steps': 35283, 'batch_loss/train': 0.7688550995662808}
12/27/2021 14:43:20 - INFO - codeparrot_training - Step 35284: {'lr': 0.0001001172088948194, 'samples': 18065920, 'steps': 35284, 'batch_loss/train': 0.6756486142985523}
12/27/2021 14:43:31 - INFO - codeparrot_training - Step 35285: {'lr': 0.00010010459314601974, 'samples': 18066432, 'steps': 35285, 'batch_loss/train': 0.8074794927088078}
12/27/2021 14:43:43 - INFO - codeparrot_training - Step 35286: {'lr': 0.00010009197799314762, 'samples': 18066944, 'steps': 35286, 'batch_loss/train': 0.7317734067328274}
12/27/2021 14:43:53 - INFO - codeparrot_training - Step 35287: {'lr': 0.00010007936343625321, 'samples': 18067456, 'steps': 35287, 'batch_loss/train': 0.8028221328277141}
12/27/2021 14:44:04 - INFO - codeparrot_training - Step 35288: {'lr': 0.0001000667494753866, 'samples': 18067968, 'steps': 35288, 'batch_loss/train': 0.6981827190611511}
12/27/2021 14:44:16 - INFO - codeparrot_training - Step 35289: {'lr': 0.00010005413611059803, 'samples': 18068480, 'steps': 35289, 'batch_loss/train': 0.6649818157893606}
12/27/2021 14:44:27 - INFO - codeparrot_training - Step 35290: {'lr': 0.00010004152334193759, 'samples': 18068992, 'steps': 35290, 'batch_loss/train': 0.7353443233296275}
12/27/2021 14:44:37 - INFO - codeparrot_training - Step 35291: {'lr': 0.00010002891116945548, 'samples': 18069504, 'steps': 35291, 'batch_loss/train': 0.6674162708222866}
12/27/2021 14:44:50 - INFO - codeparrot_training - Step 35292: {'lr': 0.00010001629959320166, 'samples': 18070016, 'steps': 35292, 'batch_loss/train': 0.8006352744996548}
12/27/2021 14:45:01 - INFO - codeparrot_training - Step 35293: {'lr': 0.00010000368861322646, 'samples': 18070528, 'steps': 35293, 'batch_loss/train': 1.3164213189738803}
12/27/2021 14:45:12 - INFO - codeparrot_training - Step 35294: {'lr': 9.999107822958007e-05, 'samples': 18071040, 'steps': 35294, 'batch_loss/train': 0.7865543926891405}
12/27/2021 14:45:22 - INFO - codeparrot_training - Step 35295: {'lr': 9.99784684423124e-05, 'samples': 18071552, 'steps': 35295, 'batch_loss/train': 0.6517572054872289}
12/27/2021 14:45:36 - INFO - codeparrot_training - Step 35296: {'lr': 9.996585925147369e-05, 'samples': 18072064, 'steps': 35296, 'batch_loss/train': 0.6815078982617706}
12/27/2021 14:45:47 - INFO - codeparrot_training - Step 35297: {'lr': 9.995325065711408e-05, 'samples': 18072576, 'steps': 35297, 'batch_loss/train': 0.7779996902681887}
12/27/2021 14:45:57 - INFO - codeparrot_training - Step 35298: {'lr': 9.994064265928371e-05, 'samples': 18073088, 'steps': 35298, 'batch_loss/train': 0.7036052611656487}
12/27/2021 14:46:10 - INFO - codeparrot_training - Step 35299: {'lr': 9.992803525803268e-05, 'samples': 18073600, 'steps': 35299, 'batch_loss/train': 0.6950824586674571}
12/27/2021 14:46:20 - INFO - codeparrot_training - Step 35300: {'lr': 9.991542845341109e-05, 'samples': 18074112, 'steps': 35300, 'batch_loss/train': 0.5396225066506304}
12/27/2021 14:46:31 - INFO - codeparrot_training - Step 35301: {'lr': 9.990282224546912e-05, 'samples': 18074624, 'steps': 35301, 'batch_loss/train': 0.7752911895513535}
12/27/2021 14:46:43 - INFO - codeparrot_training - Step 35302: {'lr': 9.989021663425682e-05, 'samples': 18075136, 'steps': 35302, 'batch_loss/train': 0.7379099521785975}
12/27/2021 14:46:54 - INFO - codeparrot_training - Step 35303: {'lr': 9.987761161982436e-05, 'samples': 18075648, 'steps': 35303, 'batch_loss/train': 0.8158892849460244}
12/27/2021 14:47:04 - INFO - codeparrot_training - Step 35304: {'lr': 9.986500720222183e-05, 'samples': 18076160, 'steps': 35304, 'batch_loss/train': 0.6919714146060869}
12/27/2021 14:47:15 - INFO - codeparrot_training - Step 35305: {'lr': 9.985240338149945e-05, 'samples': 18076672, 'steps': 35305, 'batch_loss/train': 0.7439167248085141}
12/27/2021 14:47:29 - INFO - codeparrot_training - Step 35306: {'lr': 9.98398001577071e-05, 'samples': 18077184, 'steps': 35306, 'batch_loss/train': 0.7529320670291781}
12/27/2021 14:47:39 - INFO - codeparrot_training - Step 35307: {'lr': 9.982719753089493e-05, 'samples': 18077696, 'steps': 35307, 'batch_loss/train': 0.7367250209208578}
12/27/2021 14:47:50 - INFO - codeparrot_training - Step 35308: {'lr': 9.98145955011133e-05, 'samples': 18078208, 'steps': 35308, 'batch_loss/train': 0.7356796180829406}
12/27/2021 14:48:02 - INFO - codeparrot_training - Step 35309: {'lr': 9.980199406841203e-05, 'samples': 18078720, 'steps': 35309, 'batch_loss/train': 0.7831818873528391}
12/27/2021 14:48:13 - INFO - codeparrot_training - Step 35310: {'lr': 9.978939323284134e-05, 'samples': 18079232, 'steps': 35310, 'batch_loss/train': 0.74934440292418}
12/27/2021 14:48:23 - INFO - codeparrot_training - Step 35311: {'lr': 9.97767929944513e-05, 'samples': 18079744, 'steps': 35311, 'batch_loss/train': 0.7637182334437966}
12/27/2021 14:48:35 - INFO - codeparrot_training - Step 35312: {'lr': 9.976419335329198e-05, 'samples': 18080256, 'steps': 35312, 'batch_loss/train': 0.6872864303877577}
12/27/2021 14:48:46 - INFO - codeparrot_training - Step 35313: {'lr': 9.975159430941355e-05, 'samples': 18080768, 'steps': 35313, 'batch_loss/train': 0.6598122268915176}
12/27/2021 14:48:57 - INFO - codeparrot_training - Step 35314: {'lr': 9.9738995862866e-05, 'samples': 18081280, 'steps': 35314, 'batch_loss/train': 0.6530728358775377}
12/27/2021 14:49:11 - INFO - codeparrot_training - Step 35315: {'lr': 9.972639801369951e-05, 'samples': 18081792, 'steps': 35315, 'batch_loss/train': 0.610019098763587}
12/27/2021 14:49:21 - INFO - codeparrot_training - Step 35316: {'lr': 9.97138007619641e-05, 'samples': 18082304, 'steps': 35316, 'batch_loss/train': 0.6020709915028419}
12/27/2021 14:49:32 - INFO - codeparrot_training - Step 35317: {'lr': 9.970120410770994e-05, 'samples': 18082816, 'steps': 35317, 'batch_loss/train': 0.8092505256645381}
12/27/2021 14:49:43 - INFO - codeparrot_training - Step 35318: {'lr': 9.968860805098689e-05, 'samples': 18083328, 'steps': 35318, 'batch_loss/train': 0.7311658575199544}
12/27/2021 14:49:55 - INFO - codeparrot_training - Step 35319: {'lr': 9.967601259184528e-05, 'samples': 18083840, 'steps': 35319, 'batch_loss/train': 0.7616681288927794}
12/27/2021 14:50:05 - INFO - codeparrot_training - Step 35320: {'lr': 9.966341773033513e-05, 'samples': 18084352, 'steps': 35320, 'batch_loss/train': 0.7403996116481721}
12/27/2021 14:50:16 - INFO - codeparrot_training - Step 35321: {'lr': 9.965082346650642e-05, 'samples': 18084864, 'steps': 35321, 'batch_loss/train': 0.7707135640084743}
12/27/2021 14:50:30 - INFO - codeparrot_training - Step 35322: {'lr': 9.963822980040915e-05, 'samples': 18085376, 'steps': 35322, 'batch_loss/train': 0.6758685130625963}
12/27/2021 14:50:40 - INFO - codeparrot_training - Step 35323: {'lr': 9.962563673209368e-05, 'samples': 18085888, 'steps': 35323, 'batch_loss/train': 0.6414329058025032}
12/27/2021 14:50:51 - INFO - codeparrot_training - Step 35324: {'lr': 9.961304426160984e-05, 'samples': 18086400, 'steps': 35324, 'batch_loss/train': 0.6915494857821614}
12/27/2021 14:51:03 - INFO - codeparrot_training - Step 35325: {'lr': 9.960045238900772e-05, 'samples': 18086912, 'steps': 35325, 'batch_loss/train': 0.6948415162041783}
12/27/2021 14:51:14 - INFO - codeparrot_training - Step 35326: {'lr': 9.958786111433744e-05, 'samples': 18087424, 'steps': 35326, 'batch_loss/train': 0.6764901926508173}
12/27/2021 14:51:25 - INFO - codeparrot_training - Step 35327: {'lr': 9.957527043764902e-05, 'samples': 18087936, 'steps': 35327, 'batch_loss/train': 0.6377876867773011}
12/27/2021 14:51:36 - INFO - codeparrot_training - Step 35328: {'lr': 9.956268035899251e-05, 'samples': 18088448, 'steps': 35328, 'batch_loss/train': 0.583599598146975}
12/27/2021 14:51:47 - INFO - codeparrot_training - Step 35329: {'lr': 9.955009087841798e-05, 'samples': 18088960, 'steps': 35329, 'batch_loss/train': 0.7377487169578671}
12/27/2021 14:51:58 - INFO - codeparrot_training - Step 35330: {'lr': 9.953750199597553e-05, 'samples': 18089472, 'steps': 35330, 'batch_loss/train': 0.7277034360449761}
12/27/2021 14:52:08 - INFO - codeparrot_training - Step 35331: {'lr': 9.952491371171521e-05, 'samples': 18089984, 'steps': 35331, 'batch_loss/train': 0.8005177965387702}
12/27/2021 14:52:21 - INFO - codeparrot_training - Step 35332: {'lr': 9.951232602568691e-05, 'samples': 18090496, 'steps': 35332, 'batch_loss/train': 0.7062467392534018}
12/27/2021 14:52:31 - INFO - codeparrot_training - Step 35333: {'lr': 9.949973893794073e-05, 'samples': 18091008, 'steps': 35333, 'batch_loss/train': 0.7402542848140001}
12/27/2021 14:52:42 - INFO - codeparrot_training - Step 35334: {'lr': 9.948715244852691e-05, 'samples': 18091520, 'steps': 35334, 'batch_loss/train': 0.7161514302715659}
12/27/2021 14:52:56 - INFO - codeparrot_training - Step 35335: {'lr': 9.947456655749526e-05, 'samples': 18092032, 'steps': 35335, 'batch_loss/train': 0.7294700811617076}
12/27/2021 14:53:06 - INFO - codeparrot_training - Step 35336: {'lr': 9.946198126489581e-05, 'samples': 18092544, 'steps': 35336, 'batch_loss/train': 0.687269794754684}
12/27/2021 14:53:17 - INFO - codeparrot_training - Step 35337: {'lr': 9.944939657077886e-05, 'samples': 18093056, 'steps': 35337, 'batch_loss/train': 0.6943539921194315}
12/27/2021 14:53:29 - INFO - codeparrot_training - Step 35338: {'lr': 9.943681247519418e-05, 'samples': 18093568, 'steps': 35338, 'batch_loss/train': 0.7322459034621716}
12/27/2021 14:53:40 - INFO - codeparrot_training - Step 35339: {'lr': 9.942422897819186e-05, 'samples': 18094080, 'steps': 35339, 'batch_loss/train': 0.7081223567947745}
12/27/2021 14:53:50 - INFO - codeparrot_training - Step 35340: {'lr': 9.941164607982198e-05, 'samples': 18094592, 'steps': 35340, 'batch_loss/train': 0.7185774371027946}
12/27/2021 14:54:02 - INFO - codeparrot_training - Step 35341: {'lr': 9.939906378013452e-05, 'samples': 18095104, 'steps': 35341, 'batch_loss/train': 0.6834898805245757}
12/27/2021 14:54:13 - INFO - codeparrot_training - Step 35342: {'lr': 9.938648207917953e-05, 'samples': 18095616, 'steps': 35342, 'batch_loss/train': 0.7697728681378067}
12/27/2021 14:54:24 - INFO - codeparrot_training - Step 35343: {'lr': 9.937390097700707e-05, 'samples': 18096128, 'steps': 35343, 'batch_loss/train': 0.6995099810883403}
12/27/2021 14:54:34 - INFO - codeparrot_training - Step 35344: {'lr': 9.936132047366697e-05, 'samples': 18096640, 'steps': 35344, 'batch_loss/train': 0.6692955686303321}
12/27/2021 14:54:48 - INFO - codeparrot_training - Step 35345: {'lr': 9.934874056920946e-05, 'samples': 18097152, 'steps': 35345, 'batch_loss/train': 0.5318065100873355}
12/27/2021 14:54:59 - INFO - codeparrot_training - Step 35346: {'lr': 9.933616126368456e-05, 'samples': 18097664, 'steps': 35346, 'batch_loss/train': 0.6570159192197025}
12/27/2021 14:55:10 - INFO - codeparrot_training - Step 35347: {'lr': 9.93235825571421e-05, 'samples': 18098176, 'steps': 35347, 'batch_loss/train': 0.7109556109644473}
12/27/2021 14:55:22 - INFO - codeparrot_training - Step 35348: {'lr': 9.93110044496321e-05, 'samples': 18098688, 'steps': 35348, 'batch_loss/train': 0.6676415093243122}
12/27/2021 14:55:32 - INFO - codeparrot_training - Step 35349: {'lr': 9.929842694120481e-05, 'samples': 18099200, 'steps': 35349, 'batch_loss/train': 0.7356524630449712}
12/27/2021 14:55:43 - INFO - codeparrot_training - Step 35350:
{'lr': 9.928585003191001e-05, 'samples': 18099712, 'steps': 35350, 'batch_loss/train': 0.7153858626261353} 12/27/2021 14:55:57 - INFO - codeparrot_training - Step 35351: {'lr': 9.927327372179773e-05, 'samples': 18100224, 'steps': 35351, 'batch_loss/train': 0.6643556099152192} 12/27/2021 14:56:08 - INFO - codeparrot_training - Step 35352: {'lr': 9.926069801091802e-05, 'samples': 18100736, 'steps': 35352, 'batch_loss/train': 0.750063790474087} 12/27/2021 14:56:18 - INFO - codeparrot_training - Step 35353: {'lr': 9.924812289932087e-05, 'samples': 18101248, 'steps': 35353, 'batch_loss/train': 0.6659667859785259} 12/27/2021 14:56:30 - INFO - codeparrot_training - Step 35354: {'lr': 9.923554838705626e-05, 'samples': 18101760, 'steps': 35354, 'batch_loss/train': 0.6389588443562388} 12/27/2021 14:56:41 - INFO - codeparrot_training - Step 35355: {'lr': 9.922297447417417e-05, 'samples': 18102272, 'steps': 35355, 'batch_loss/train': 0.6636994602158666} 12/27/2021 14:56:51 - INFO - codeparrot_training - Step 35356: {'lr': 9.921040116072457e-05, 'samples': 18102784, 'steps': 35356, 'batch_loss/train': 0.7600496709346771} 12/27/2021 14:57:02 - INFO - codeparrot_training - Step 35357: {'lr': 9.919782844675754e-05, 'samples': 18103296, 'steps': 35357, 'batch_loss/train': 0.6912603698438033} 12/27/2021 14:57:14 - INFO - codeparrot_training - Step 35358: {'lr': 9.918525633232306e-05, 'samples': 18103808, 'steps': 35358, 'batch_loss/train': 0.8035588636994362} 12/27/2021 14:57:25 - INFO - codeparrot_training - Step 35359: {'lr': 9.917268481747088e-05, 'samples': 18104320, 'steps': 35359, 'batch_loss/train': 0.6676303688436747} 12/27/2021 14:57:35 - INFO - codeparrot_training - Step 35360: {'lr': 9.916011390225135e-05, 'samples': 18104832, 'steps': 35360, 'batch_loss/train': 0.7040091101080179} 12/27/2021 14:57:49 - INFO - codeparrot_training - Step 35361: {'lr': 9.914754358671413e-05, 'samples': 18105344, 'steps': 35361, 'batch_loss/train': 0.7993764765560627} 12/27/2021 14:58:00 - 
INFO - codeparrot_training - Step 35362: {'lr': 9.913497387090925e-05, 'samples': 18105856, 'steps': 35362, 'batch_loss/train': 0.6815368197858334} 12/27/2021 14:58:11 - INFO - codeparrot_training - Step 35363: {'lr': 9.912240475488693e-05, 'samples': 18106368, 'steps': 35363, 'batch_loss/train': 0.6788324285298586} 12/27/2021 14:58:23 - INFO - codeparrot_training - Step 35364: {'lr': 9.910983623869688e-05, 'samples': 18106880, 'steps': 35364, 'batch_loss/train': 0.6951253555016592} 12/27/2021 14:58:34 - INFO - codeparrot_training - Step 35365: {'lr': 9.909726832238913e-05, 'samples': 18107392, 'steps': 35365, 'batch_loss/train': 0.7079531345516443} 12/27/2021 14:58:44 - INFO - codeparrot_training - Step 35366: {'lr': 9.908470100601366e-05, 'samples': 18107904, 'steps': 35366, 'batch_loss/train': 0.5108254313818179} 12/27/2021 14:58:55 - INFO - codeparrot_training - Step 35367: {'lr': 9.907213428962044e-05, 'samples': 18108416, 'steps': 35367, 'batch_loss/train': 0.621818535233615} 12/27/2021 14:59:07 - INFO - codeparrot_training - Step 35368: {'lr': 9.905956817325942e-05, 'samples': 18108928, 'steps': 35368, 'batch_loss/train': 0.6042713559581898} 12/27/2021 14:59:18 - INFO - codeparrot_training - Step 35369: {'lr': 9.904700265698066e-05, 'samples': 18109440, 'steps': 35369, 'batch_loss/train': 0.6906820549629629} 12/27/2021 14:59:28 - INFO - codeparrot_training - Step 35370: {'lr': 9.903443774083384e-05, 'samples': 18109952, 'steps': 35370, 'batch_loss/train': 0.6536664948798716} 12/27/2021 14:59:41 - INFO - codeparrot_training - Step 35371: {'lr': 9.902187342486919e-05, 'samples': 18110464, 'steps': 35371, 'batch_loss/train': 0.6833102805539966} 12/27/2021 14:59:51 - INFO - codeparrot_training - Step 35372: {'lr': 9.900930970913663e-05, 'samples': 18110976, 'steps': 35372, 'batch_loss/train': 0.6387163868639618} 12/27/2021 15:00:02 - INFO - codeparrot_training - Step 35373: {'lr': 9.899674659368591e-05, 'samples': 18111488, 'steps': 35373, 'batch_loss/train': 
0.7136558922939003} 12/27/2021 15:00:16 - INFO - codeparrot_training - Step 35374: {'lr': 9.898418407856715e-05, 'samples': 18112000, 'steps': 35374, 'batch_loss/train': 0.9621209297329187} 12/27/2021 15:00:27 - INFO - codeparrot_training - Step 35375: {'lr': 9.897162216383038e-05, 'samples': 18112512, 'steps': 35375, 'batch_loss/train': 0.6702128741890192} 12/27/2021 15:00:37 - INFO - codeparrot_training - Step 35376: {'lr': 9.89590608495253e-05, 'samples': 18113024, 'steps': 35376, 'batch_loss/train': 0.7827315190806985} 12/27/2021 15:00:48 - INFO - codeparrot_training - Step 35377: {'lr': 9.894650013570197e-05, 'samples': 18113536, 'steps': 35377, 'batch_loss/train': 0.4143499215715565} 12/27/2021 15:01:00 - INFO - codeparrot_training - Step 35378: {'lr': 9.893394002241032e-05, 'samples': 18114048, 'steps': 35378, 'batch_loss/train': 0.7093989023705944} 12/27/2021 15:01:11 - INFO - codeparrot_training - Step 35379: {'lr': 9.892138050970028e-05, 'samples': 18114560, 'steps': 35379, 'batch_loss/train': 0.5996768460609019} 12/27/2021 15:01:21 - INFO - codeparrot_training - Step 35380: {'lr': 9.89088215976218e-05, 'samples': 18115072, 'steps': 35380, 'batch_loss/train': 0.5601140176877379} 12/27/2021 15:01:35 - INFO - codeparrot_training - Step 35381: {'lr': 9.889626328622478e-05, 'samples': 18115584, 'steps': 35381, 'batch_loss/train': 0.6113115968182683} 12/27/2021 15:01:46 - INFO - codeparrot_training - Step 35382: {'lr': 9.888370557555914e-05, 'samples': 18116096, 'steps': 35382, 'batch_loss/train': 0.8104824488982558} 12/27/2021 15:01:57 - INFO - codeparrot_training - Step 35383: {'lr': 9.887114846567488e-05, 'samples': 18116608, 'steps': 35383, 'batch_loss/train': 0.7156709237024188} 12/27/2021 15:02:09 - INFO - codeparrot_training - Step 35384: {'lr': 9.885859195662191e-05, 'samples': 18117120, 'steps': 35384, 'batch_loss/train': 0.7268116874620318} 12/27/2021 15:02:20 - INFO - codeparrot_training - Step 35385: {'lr': 9.884603604844997e-05, 'samples': 
18117632, 'steps': 35385, 'batch_loss/train': 0.6861888603307307} 12/27/2021 15:02:30 - INFO - codeparrot_training - Step 35386: {'lr': 9.883348074120929e-05, 'samples': 18118144, 'steps': 35386, 'batch_loss/train': 0.48222152231028304} 12/27/2021 15:02:42 - INFO - codeparrot_training - Step 35387: {'lr': 9.882092603494949e-05, 'samples': 18118656, 'steps': 35387, 'batch_loss/train': 0.6016679286258295} 12/27/2021 15:02:53 - INFO - codeparrot_training - Step 35388: {'lr': 9.880837192972056e-05, 'samples': 18119168, 'steps': 35388, 'batch_loss/train': 0.6994837666861713} 12/27/2021 15:03:03 - INFO - codeparrot_training - Step 35389: {'lr': 9.879581842557261e-05, 'samples': 18119680, 'steps': 35389, 'batch_loss/train': 0.7621412726584822} 12/27/2021 15:03:14 - INFO - codeparrot_training - Step 35390: {'lr': 9.878326552255531e-05, 'samples': 18120192, 'steps': 35390, 'batch_loss/train': 0.7321717932354659} 12/27/2021 15:03:28 - INFO - codeparrot_training - Step 35391: {'lr': 9.877071322071865e-05, 'samples': 18120704, 'steps': 35391, 'batch_loss/train': 0.6823509801179171} 12/27/2021 15:03:39 - INFO - codeparrot_training - Step 35392: {'lr': 9.875816152011252e-05, 'samples': 18121216, 'steps': 35392, 'batch_loss/train': 0.6781364037888125} 12/27/2021 15:03:49 - INFO - codeparrot_training - Step 35393: {'lr': 9.874561042078684e-05, 'samples': 18121728, 'steps': 35393, 'batch_loss/train': 0.7906313729472458} 12/27/2021 15:04:02 - INFO - codeparrot_training - Step 35394: {'lr': 9.873305992279149e-05, 'samples': 18122240, 'steps': 35394, 'batch_loss/train': 1.0669823233038187} 12/27/2021 15:04:13 - INFO - codeparrot_training - Step 35395: {'lr': 9.872051002617649e-05, 'samples': 18122752, 'steps': 35395, 'batch_loss/train': 0.7133671417832375} 12/27/2021 15:04:23 - INFO - codeparrot_training - Step 35396: {'lr': 9.870796073099144e-05, 'samples': 18123264, 'steps': 35396, 'batch_loss/train': 0.7502733948640525} 12/27/2021 15:04:34 - INFO - codeparrot_training - Step 35397: 
{'lr': 9.86954120372865e-05, 'samples': 18123776, 'steps': 35397, 'batch_loss/train': 0.6320596979930997} 12/27/2021 15:04:46 - INFO - codeparrot_training - Step 35398: {'lr': 9.868286394511155e-05, 'samples': 18124288, 'steps': 35398, 'batch_loss/train': 0.7091482719406486} 12/27/2021 15:04:57 - INFO - codeparrot_training - Step 35399: {'lr': 9.867031645451621e-05, 'samples': 18124800, 'steps': 35399, 'batch_loss/train': 0.7742438828572631} 12/27/2021 15:05:07 - INFO - codeparrot_training - Step 35400: {'lr': 9.865776956555068e-05, 'samples': 18125312, 'steps': 35400, 'batch_loss/train': 0.6984161641448736} 12/27/2021 15:05:19 - INFO - codeparrot_training - Step 35401: {'lr': 9.864522327826478e-05, 'samples': 18125824, 'steps': 35401, 'batch_loss/train': 0.6889519798569381} 12/27/2021 15:05:30 - INFO - codeparrot_training - Step 35402: {'lr': 9.863267759270822e-05, 'samples': 18126336, 'steps': 35402, 'batch_loss/train': 0.7390691991895437} 12/27/2021 15:05:41 - INFO - codeparrot_training - Step 35403: {'lr': 9.86201325089309e-05, 'samples': 18126848, 'steps': 35403, 'batch_loss/train': 0.6586427320726216} 12/27/2021 15:05:55 - INFO - codeparrot_training - Step 35404: {'lr': 9.860758802698297e-05, 'samples': 18127360, 'steps': 35404, 'batch_loss/train': 0.6580661139450967} 12/27/2021 15:06:05 - INFO - codeparrot_training - Step 35405: {'lr': 9.8595044146914e-05, 'samples': 18127872, 'steps': 35405, 'batch_loss/train': 0.49728294974192977} 12/27/2021 15:06:16 - INFO - codeparrot_training - Step 35406: {'lr': 9.858250086877396e-05, 'samples': 18128384, 'steps': 35406, 'batch_loss/train': 0.7264359733089805} 12/27/2021 15:06:26 - INFO - codeparrot_training - Step 35407: {'lr': 9.856995819261275e-05, 'samples': 18128896, 'steps': 35407, 'batch_loss/train': 0.814066301099956} 12/27/2021 15:06:39 - INFO - codeparrot_training - Step 35408: {'lr': 9.855741611848018e-05, 'samples': 18129408, 'steps': 35408, 'batch_loss/train': 0.7504044808447361} 12/27/2021 15:06:49 - INFO 
- codeparrot_training - Step 35409: {'lr': 9.854487464642614e-05, 'samples': 18129920, 'steps': 35409, 'batch_loss/train': 0.6450225813314319} 12/27/2021 15:07:00 - INFO - codeparrot_training - Step 35410: {'lr': 9.853233377650051e-05, 'samples': 18130432, 'steps': 35410, 'batch_loss/train': 0.6323932665400207} 12/27/2021 15:07:14 - INFO - codeparrot_training - Step 35411: {'lr': 9.851979350875311e-05, 'samples': 18130944, 'steps': 35411, 'batch_loss/train': 0.8104369323700666} 12/27/2021 15:07:24 - INFO - codeparrot_training - Step 35412: {'lr': 9.850725384323392e-05, 'samples': 18131456, 'steps': 35412, 'batch_loss/train': 0.639436059165746} 12/27/2021 15:07:35 - INFO - codeparrot_training - Step 35413: {'lr': 9.849471477999261e-05, 'samples': 18131968, 'steps': 35413, 'batch_loss/train': 0.7464611820760183} 12/27/2021 15:07:47 - INFO - codeparrot_training - Step 35414: {'lr': 9.848217631907899e-05, 'samples': 18132480, 'steps': 35414, 'batch_loss/train': 0.5940879021654837} 12/27/2021 15:07:58 - INFO - codeparrot_training - Step 35415: {'lr': 9.846963846054322e-05, 'samples': 18132992, 'steps': 35415, 'batch_loss/train': 0.47180364601081237} 12/27/2021 15:08:09 - INFO - codeparrot_training - Step 35416: {'lr': 9.845710120443486e-05, 'samples': 18133504, 'steps': 35416, 'batch_loss/train': 0.7403655741363764} 12/27/2021 15:08:21 - INFO - codeparrot_training - Step 35417: {'lr': 9.844456455080386e-05, 'samples': 18134016, 'steps': 35417, 'batch_loss/train': 0.5832559185801074} 12/27/2021 15:08:32 - INFO - codeparrot_training - Step 35418: {'lr': 9.84320284997e-05, 'samples': 18134528, 'steps': 35418, 'batch_loss/train': 0.7687792042270303} 12/27/2021 15:08:42 - INFO - codeparrot_training - Step 35419: {'lr': 9.841949305117321e-05, 'samples': 18135040, 'steps': 35419, 'batch_loss/train': 0.6975986105389893} 12/27/2021 15:08:53 - INFO - codeparrot_training - Step 35420: {'lr': 9.840695820527326e-05, 'samples': 18135552, 'steps': 35420, 'batch_loss/train': 
0.46094227861613035} 12/27/2021 15:09:07 - INFO - codeparrot_training - Step 35421: {'lr': 9.839442396205003e-05, 'samples': 18136064, 'steps': 35421, 'batch_loss/train': 0.756239078938961} 12/27/2021 15:09:17 - INFO - codeparrot_training - Step 35422: {'lr': 9.838189032155332e-05, 'samples': 18136576, 'steps': 35422, 'batch_loss/train': 0.8086400460451841} 12/27/2021 15:09:28 - INFO - codeparrot_training - Step 35423: {'lr': 9.836935728383292e-05, 'samples': 18137088, 'steps': 35423, 'batch_loss/train': 0.7847377583384514} 12/27/2021 15:09:40 - INFO - codeparrot_training - Step 35424: {'lr': 9.835682484893884e-05, 'samples': 18137600, 'steps': 35424, 'batch_loss/train': 0.7200559489428997} 12/27/2021 15:09:50 - INFO - codeparrot_training - Step 35425: {'lr': 9.834429301692057e-05, 'samples': 18138112, 'steps': 35425, 'batch_loss/train': 0.6370576359331608} 12/27/2021 15:10:01 - INFO - codeparrot_training - Step 35426: {'lr': 9.833176178782822e-05, 'samples': 18138624, 'steps': 35426, 'batch_loss/train': 0.7188997594639659} 12/27/2021 15:10:13 - INFO - codeparrot_training - Step 35427: {'lr': 9.83192311617116e-05, 'samples': 18139136, 'steps': 35427, 'batch_loss/train': 0.7836943726288155} 12/27/2021 15:10:24 - INFO - codeparrot_training - Step 35428: {'lr': 9.830670113862037e-05, 'samples': 18139648, 'steps': 35428, 'batch_loss/train': 0.672900979872793} 12/27/2021 15:10:35 - INFO - codeparrot_training - Step 35429: {'lr': 9.829417171860433e-05, 'samples': 18140160, 'steps': 35429, 'batch_loss/train': 0.7796721709892154} 12/27/2021 15:10:48 - INFO - codeparrot_training - Step 35430: {'lr': 9.828164290171354e-05, 'samples': 18140672, 'steps': 35430, 'batch_loss/train': 0.6853547543287277} 12/27/2021 15:10:59 - INFO - codeparrot_training - Step 35431: {'lr': 9.826911468799757e-05, 'samples': 18141184, 'steps': 35431, 'batch_loss/train': 0.7172858670819551} 12/27/2021 15:11:10 - INFO - codeparrot_training - Step 35432: {'lr': 9.825658707750632e-05, 'samples': 
18141696, 'steps': 35432, 'batch_loss/train': 0.7244613543152809} 12/27/2021 15:11:20 - INFO - codeparrot_training - Step 35433: {'lr': 9.824406007028958e-05, 'samples': 18142208, 'steps': 35433, 'batch_loss/train': 0.7689093784429133} 12/27/2021 15:11:32 - INFO - codeparrot_training - Step 35434: {'lr': 9.823153366639714e-05, 'samples': 18142720, 'steps': 35434, 'batch_loss/train': 0.8267568536102772} 12/27/2021 15:11:43 - INFO - codeparrot_training - Step 35435: {'lr': 9.821900786587884e-05, 'samples': 18143232, 'steps': 35435, 'batch_loss/train': 0.8835160555317998} 12/27/2021 15:11:54 - INFO - codeparrot_training - Step 35436: {'lr': 9.820648266878443e-05, 'samples': 18143744, 'steps': 35436, 'batch_loss/train': 0.6985153815476224} 12/27/2021 15:12:07 - INFO - codeparrot_training - Step 35437: {'lr': 9.819395807516373e-05, 'samples': 18144256, 'steps': 35437, 'batch_loss/train': 0.7880108645185828} 12/27/2021 15:12:18 - INFO - codeparrot_training - Step 35438: {'lr': 9.818143408506654e-05, 'samples': 18144768, 'steps': 35438, 'batch_loss/train': 0.7011078442446887} 12/27/2021 15:12:29 - INFO - codeparrot_training - Step 35439: {'lr': 9.816891069854275e-05, 'samples': 18145280, 'steps': 35439, 'batch_loss/train': 0.7881228178739548} 12/27/2021 15:12:41 - INFO - codeparrot_training - Step 35440: {'lr': 9.815638791564186e-05, 'samples': 18145792, 'steps': 35440, 'batch_loss/train': 0.663694835267961} 12/27/2021 15:12:51 - INFO - codeparrot_training - Step 35441: {'lr': 9.814386573641401e-05, 'samples': 18146304, 'steps': 35441, 'batch_loss/train': 0.684976439923048} 12/27/2021 15:13:02 - INFO - codeparrot_training - Step 35442: {'lr': 9.813134416090871e-05, 'samples': 18146816, 'steps': 35442, 'batch_loss/train': 0.7133680703118443} 12/27/2021 15:13:14 - INFO - codeparrot_training - Step 35443: {'lr': 9.811882318917587e-05, 'samples': 18147328, 'steps': 35443, 'batch_loss/train': 0.7059972868300974} 12/27/2021 15:13:25 - INFO - codeparrot_training - Step 35444: 
{'lr': 9.81063028212652e-05, 'samples': 18147840, 'steps': 35444, 'batch_loss/train': 0.6386971636675298} 12/27/2021 15:13:35 - INFO - codeparrot_training - Step 35445: {'lr': 9.809378305722655e-05, 'samples': 18148352, 'steps': 35445, 'batch_loss/train': 0.7103393310680985} 12/27/2021 15:13:46 - INFO - codeparrot_training - Step 35446: {'lr': 9.808126389710966e-05, 'samples': 18148864, 'steps': 35446, 'batch_loss/train': 0.7009616894647479} 12/27/2021 15:13:58 - INFO - codeparrot_training - Step 35447: {'lr': 9.80687453409643e-05, 'samples': 18149376, 'steps': 35447, 'batch_loss/train': 0.6667755572125316} 12/27/2021 15:14:09 - INFO - codeparrot_training - Step 35448: {'lr': 9.805622738884023e-05, 'samples': 18149888, 'steps': 35448, 'batch_loss/train': 0.8017380526289344} 12/27/2021 15:14:20 - INFO - codeparrot_training - Step 35449: {'lr': 9.804371004078724e-05, 'samples': 18150400, 'steps': 35449, 'batch_loss/train': 0.7177758305333555} 12/27/2021 15:14:34 - INFO - codeparrot_training - Step 35450: {'lr': 9.803119329685514e-05, 'samples': 18150912, 'steps': 35450, 'batch_loss/train': 0.4422272937663365} 12/27/2021 15:14:45 - INFO - codeparrot_training - Step 35451: {'lr': 9.801867715709351e-05, 'samples': 18151424, 'steps': 35451, 'batch_loss/train': 1.0119627434760332} 12/27/2021 15:14:55 - INFO - codeparrot_training - Step 35452: {'lr': 9.800616162155229e-05, 'samples': 18151936, 'steps': 35452, 'batch_loss/train': 0.6767908027395606} 12/27/2021 15:15:06 - INFO - codeparrot_training - Step 35453: {'lr': 9.799364669028126e-05, 'samples': 18152448, 'steps': 35453, 'batch_loss/train': 0.6226561279036105} 12/27/2021 15:15:18 - INFO - codeparrot_training - Step 35454: {'lr': 9.798113236333004e-05, 'samples': 18152960, 'steps': 35454, 'batch_loss/train': 0.5345325840462465} 12/27/2021 15:15:29 - INFO - codeparrot_training - Step 35455: {'lr': 9.796861864074832e-05, 'samples': 18153472, 'steps': 35455, 'batch_loss/train': 0.7206806265749037} 12/27/2021 15:15:39 - 
INFO - codeparrot_training - Step 35456: {'lr': 9.795610552258616e-05, 'samples': 18153984, 'steps': 35456, 'batch_loss/train': 0.6271525649353862} 12/27/2021 15:15:52 - INFO - codeparrot_training - Step 35457: {'lr': 9.7943593008893e-05, 'samples': 18154496, 'steps': 35457, 'batch_loss/train': 0.7037547463551164} 12/27/2021 15:16:02 - INFO - codeparrot_training - Step 35458: {'lr': 9.793108109971873e-05, 'samples': 18155008, 'steps': 35458, 'batch_loss/train': 0.765326554159401} 12/27/2021 15:16:13 - INFO - codeparrot_training - Step 35459: {'lr': 9.791856979511305e-05, 'samples': 18155520, 'steps': 35459, 'batch_loss/train': 0.48371899969060905} 12/27/2021 15:16:27 - INFO - codeparrot_training - Step 35460: {'lr': 9.79060590951257e-05, 'samples': 18156032, 'steps': 35460, 'batch_loss/train': 0.6889138026162982} 12/27/2021 15:16:37 - INFO - codeparrot_training - Step 35461: {'lr': 9.789354899980646e-05, 'samples': 18156544, 'steps': 35461, 'batch_loss/train': 0.7149478439241648} 12/27/2021 15:16:48 - INFO - codeparrot_training - Step 35462: {'lr': 9.788103950920501e-05, 'samples': 18157056, 'steps': 35462, 'batch_loss/train': 0.7133966367691755} 12/27/2021 15:16:59 - INFO - codeparrot_training - Step 35463: {'lr': 9.786853062337115e-05, 'samples': 18157568, 'steps': 35463, 'batch_loss/train': 0.7315489095635712} 12/27/2021 15:17:11 - INFO - codeparrot_training - Step 35464: {'lr': 9.785602234235452e-05, 'samples': 18158080, 'steps': 35464, 'batch_loss/train': 0.7597582172602415} 12/27/2021 15:17:21 - INFO - codeparrot_training - Step 35465: {'lr': 9.7843514666205e-05, 'samples': 18158592, 'steps': 35465, 'batch_loss/train': 0.7382345553487539} 12/27/2021 15:17:32 - INFO - codeparrot_training - Step 35466: {'lr': 9.783100759497205e-05, 'samples': 18159104, 'steps': 35466, 'batch_loss/train': 0.7181676011532545} 12/27/2021 15:17:46 - INFO - codeparrot_training - Step 35467: {'lr': 9.781850112870572e-05, 'samples': 18159616, 'steps': 35467, 'batch_loss/train': 
0.7646672232076526} 12/27/2021 15:17:56 - INFO - codeparrot_training - Step 35468: {'lr': 9.780599526745549e-05, 'samples': 18160128, 'steps': 35468, 'batch_loss/train': 0.612640761770308} 12/27/2021 15:18:07 - INFO - codeparrot_training - Step 35469: {'lr': 9.779349001127116e-05, 'samples': 18160640, 'steps': 35469, 'batch_loss/train': 0.6918682752875611} 12/27/2021 15:18:19 - INFO - codeparrot_training - Step 35470: {'lr': 9.778098536020244e-05, 'samples': 18161152, 'steps': 35470, 'batch_loss/train': 0.685822403524071} 12/27/2021 15:18:30 - INFO - codeparrot_training - Step 35471: {'lr': 9.776848131429905e-05, 'samples': 18161664, 'steps': 35471, 'batch_loss/train': 0.7434718809090555} 12/27/2021 15:18:41 - INFO - codeparrot_training - Step 35472: {'lr': 9.775597787361067e-05, 'samples': 18162176, 'steps': 35472, 'batch_loss/train': 0.8375572115182877} 12/27/2021 15:18:51 - INFO - codeparrot_training - Step 35473: {'lr': 9.774347503818707e-05, 'samples': 18162688, 'steps': 35473, 'batch_loss/train': 0.5002208927180618} 12/27/2021 15:19:03 - INFO - codeparrot_training - Step 35474: {'lr': 9.773097280807791e-05, 'samples': 18163200, 'steps': 35474, 'batch_loss/train': 0.6967266537249088} 12/27/2021 15:19:14 - INFO - codeparrot_training - Step 35475: {'lr': 9.771847118333291e-05, 'samples': 18163712, 'steps': 35475, 'batch_loss/train': 0.7327303336933255} 12/27/2021 15:19:25 - INFO - codeparrot_training - Step 35476: {'lr': 9.770597016400184e-05, 'samples': 18164224, 'steps': 35476, 'batch_loss/train': 0.8066165409982204} 12/27/2021 15:19:37 - INFO - codeparrot_training - Step 35477: {'lr': 9.769346975013418e-05, 'samples': 18164736, 'steps': 35477, 'batch_loss/train': 0.49059097407734953} 12/27/2021 15:19:47 - INFO - codeparrot_training - Step 35478: {'lr': 9.768096994177986e-05, 'samples': 18165248, 'steps': 35478, 'batch_loss/train': 0.8182439627125859} 12/27/2021 15:19:58 - INFO - codeparrot_training - Step 35479: {'lr': 9.766847073898858e-05, 'samples': 
18165760, 'steps': 35479, 'batch_loss/train': 0.5967488271126058} 12/27/2021 15:20:12 - INFO - codeparrot_training - Step 35480: {'lr': 9.765597214180983e-05, 'samples': 18166272, 'steps': 35480, 'batch_loss/train': 0.7390686720609665} 12/27/2021 15:20:23 - INFO - codeparrot_training - Step 35481: {'lr': 9.764347415029334e-05, 'samples': 18166784, 'steps': 35481, 'batch_loss/train': 0.8171263942494988} 12/27/2021 15:20:33 - INFO - codeparrot_training - Step 35482: {'lr': 9.763097676448903e-05, 'samples': 18167296, 'steps': 35482, 'batch_loss/train': 0.8705123318359256} 12/27/2021 15:20:45 - INFO - codeparrot_training - Step 35483: {'lr': 9.761847998444637e-05, 'samples': 18167808, 'steps': 35483, 'batch_loss/train': 0.5998294521123171} 12/27/2021 15:20:56 - INFO - codeparrot_training - Step 35484: {'lr': 9.760598381021499e-05, 'samples': 18168320, 'steps': 35484, 'batch_loss/train': 0.6761112194508314} 12/27/2021 15:21:07 - INFO - codeparrot_training - Step 35485: {'lr': 9.759348824184483e-05, 'samples': 18168832, 'steps': 35485, 'batch_loss/train': 0.7806179439648986} 12/27/2021 15:21:17 - INFO - codeparrot_training - Step 35486: {'lr': 9.758099327938536e-05, 'samples': 18169344, 'steps': 35486, 'batch_loss/train': 0.6668624226003885} 12/27/2021 15:21:29 - INFO - codeparrot_training - Step 35487: {'lr': 9.756849892288631e-05, 'samples': 18169856, 'steps': 35487, 'batch_loss/train': 0.6897558737546206} 12/27/2021 15:21:40 - INFO - codeparrot_training - Step 35488: {'lr': 9.755600517239733e-05, 'samples': 18170368, 'steps': 35488, 'batch_loss/train': 0.801075461320579} 12/27/2021 15:21:51 - INFO - codeparrot_training - Step 35489: {'lr': 9.75435120279681e-05, 'samples': 18170880, 'steps': 35489, 'batch_loss/train': 0.827396955806762} 12/27/2021 15:22:06 - INFO - codeparrot_training - Step 35490: {'lr': 9.753101948964835e-05, 'samples': 18171392, 'steps': 35490, 'batch_loss/train': 0.6647643157630228} 12/27/2021 15:22:17 - INFO - codeparrot_training - Step 35491: 
{'lr': 9.751852755748775e-05, 'samples': 18171904, 'steps': 35491, 'batch_loss/train': 0.8069110009819269} 12/27/2021 15:22:27 - INFO - codeparrot_training - Step 35492: {'lr': 9.750603623153573e-05, 'samples': 18172416, 'steps': 35492, 'batch_loss/train': 0.6537194871343672} 12/27/2021 15:22:39 - INFO - codeparrot_training - Step 35493: {'lr': 9.749354551184222e-05, 'samples': 18172928, 'steps': 35493, 'batch_loss/train': 0.7253622501157224} 12/27/2021 15:22:50 - INFO - codeparrot_training - Step 35494: {'lr': 9.74810553984569e-05, 'samples': 18173440, 'steps': 35494, 'batch_loss/train': 0.6836953521706164} 12/27/2021 15:23:01 - INFO - codeparrot_training - Step 35495: {'lr': 9.746856589142913e-05, 'samples': 18173952, 'steps': 35495, 'batch_loss/train': 0.7113713128492236} 12/27/2021 15:23:12 - INFO - codeparrot_training - Step 35496: {'lr': 9.745607699080891e-05, 'samples': 18174464, 'steps': 35496, 'batch_loss/train': 0.7261494256090373} 12/27/2021 15:23:26 - INFO - codeparrot_training - Step 35497: {'lr': 9.744358869664565e-05, 'samples': 18174976, 'steps': 35497, 'batch_loss/train': 0.7466550730168819} 12/27/2021 15:23:36 - INFO - codeparrot_training - Step 35498: {'lr': 9.743110100898908e-05, 'samples': 18175488, 'steps': 35498, 'batch_loss/train': 0.7204165023285896} 12/27/2021 15:23:47 - INFO - codeparrot_training - Step 35499: {'lr': 9.741861392788884e-05, 'samples': 18176000, 'steps': 35499, 'batch_loss/train': 0.9037775136530399} 12/27/2021 15:23:59 - INFO - codeparrot_training - Step 35500: {'lr': 9.740612745339458e-05, 'samples': 18176512, 'steps': 35500, 'batch_loss/train': 0.9558141669258475} 12/27/2021 15:24:10 - INFO - codeparrot_training - Step 35501: {'lr': 9.739364158555594e-05, 'samples': 18177024, 'steps': 35501, 'batch_loss/train': 0.6817358792759478} 12/27/2021 15:24:20 - INFO - codeparrot_training - Step 35502: {'lr': 9.738115632442265e-05, 'samples': 18177536, 'steps': 35502, 'batch_loss/train': 0.5801437149639241} 12/27/2021 15:24:32 - 
INFO - codeparrot_training - Step 35503: {'lr': 9.73686716700441e-05, 'samples': 18178048, 'steps': 35503, 'batch_loss/train': 0.7681184778921306} 12/27/2021 15:24:43 - INFO - codeparrot_training - Step 35504: {'lr': 9.735618762247014e-05, 'samples': 18178560, 'steps': 35504, 'batch_loss/train': 0.6651864496525377} 12/27/2021 15:24:54 - INFO - codeparrot_training - Step 35505: {'lr': 9.734370418175042e-05, 'samples': 18179072, 'steps': 35505, 'batch_loss/train': 0.7557421457022429} 12/27/2021 15:25:08 - INFO - codeparrot_training - Step 35506: {'lr': 9.733122134793443e-05, 'samples': 18179584, 'steps': 35506, 'batch_loss/train': 0.6186832704115659} 12/27/2021 15:25:18 - INFO - codeparrot_training - Step 35507: {'lr': 9.731873912107175e-05, 'samples': 18180096, 'steps': 35507, 'batch_loss/train': 0.6311852871440351} 12/27/2021 15:25:29 - INFO - codeparrot_training - Step 35508: {'lr': 9.730625750121231e-05, 'samples': 18180608, 'steps': 35508, 'batch_loss/train': 0.7005709728691727} 12/27/2021 15:25:39 - INFO - codeparrot_training - Step 35509: {'lr': 9.729377648840545e-05, 'samples': 18181120, 'steps': 35509, 'batch_loss/train': 0.7003309079445899} 12/27/2021 15:25:52 - INFO - codeparrot_training - Step 35510: {'lr': 9.728129608270075e-05, 'samples': 18181632, 'steps': 35510, 'batch_loss/train': 0.7284903093241155} 12/27/2021 15:26:02 - INFO - codeparrot_training - Step 35511: {'lr': 9.726881628414816e-05, 'samples': 18182144, 'steps': 35511, 'batch_loss/train': 0.9224564833566546} 12/27/2021 15:26:13 - INFO - codeparrot_training - Step 35512: {'lr': 9.725633709279699e-05, 'samples': 18182656, 'steps': 35512, 'batch_loss/train': 0.8231815886683762} 12/27/2021 15:26:25 - INFO - codeparrot_training - Step 35513: {'lr': 9.724385850869697e-05, 'samples': 18183168, 'steps': 35513, 'batch_loss/train': 0.7992125591263175} 12/27/2021 15:26:36 - INFO - codeparrot_training - Step 35514: {'lr': 9.723138053189768e-05, 'samples': 18183680, 'steps': 35514, 'batch_loss/train': 
0.7496729935519397} 12/27/2021 15:26:46 - INFO - codeparrot_training - Step 35515: {'lr': 9.721890316244872e-05, 'samples': 18184192, 'steps': 35515, 'batch_loss/train': 0.6808378836140037} 12/27/2021 15:26:58 - INFO - codeparrot_training - Step 35516: {'lr': 9.720642640039973e-05, 'samples': 18184704, 'steps': 35516, 'batch_loss/train': 0.5623439240443986} 12/27/2021 15:27:09 - INFO - codeparrot_training - Step 35517: {'lr': 9.719395024580036e-05, 'samples': 18185216, 'steps': 35517, 'batch_loss/train': 0.7202573209069669} 12/27/2021 15:27:20 - INFO - codeparrot_training - Step 35518: {'lr': 9.71814746987e-05, 'samples': 18185728, 'steps': 35518, 'batch_loss/train': 0.5608435517351609} 12/27/2021 15:27:30 - INFO - codeparrot_training - Step 35519: {'lr': 9.716899975914848e-05, 'samples': 18186240, 'steps': 35519, 'batch_loss/train': 0.7478171410039067} 12/27/2021 15:27:44 - INFO - codeparrot_training - Step 35520: {'lr': 9.715652542719538e-05, 'samples': 18186752, 'steps': 35520, 'batch_loss/train': 0.5487567765812855} 12/27/2021 15:27:55 - INFO - codeparrot_training - Step 35521: {'lr': 9.714405170289003e-05, 'samples': 18187264, 'steps': 35521, 'batch_loss/train': 0.6439988883212209} 12/27/2021 15:28:06 - INFO - codeparrot_training - Step 35522: {'lr': 9.713157858628244e-05, 'samples': 18187776, 'steps': 35522, 'batch_loss/train': 0.7431323518976569} 12/27/2021 15:28:18 - INFO - codeparrot_training - Step 35523: {'lr': 9.711910607742186e-05, 'samples': 18188288, 'steps': 35523, 'batch_loss/train': 0.7660336424596608} 12/27/2021 15:28:28 - INFO - codeparrot_training - Step 35524: {'lr': 9.710663417635798e-05, 'samples': 18188800, 'steps': 35524, 'batch_loss/train': 0.6581562338396907} 12/27/2021 15:28:39 - INFO - codeparrot_training - Step 35525: {'lr': 9.709416288314041e-05, 'samples': 18189312, 'steps': 35525, 'batch_loss/train': 0.662408035248518} 12/27/2021 15:28:53 - INFO - codeparrot_training - Step 35526: {'lr': 9.70816921978187e-05, 'samples': 18189824, 
'steps': 35526, 'batch_loss/train': 0.7836055532097816} 12/27/2021 15:29:04 - INFO - codeparrot_training - Step 35527: {'lr': 9.706922212044244e-05, 'samples': 18190336, 'steps': 35527, 'batch_loss/train': 0.6816866574808955} 12/27/2021 15:29:14 - INFO - codeparrot_training - Step 35528: {'lr': 9.705675265106123e-05, 'samples': 18190848, 'steps': 35528, 'batch_loss/train': 0.7653995170257986} 12/27/2021 15:29:25 - INFO - codeparrot_training - Step 35529: {'lr': 9.70442837897246e-05, 'samples': 18191360, 'steps': 35529, 'batch_loss/train': 0.6158125003566965} 12/27/2021 15:29:37 - INFO - codeparrot_training - Step 35530: {'lr': 9.703181553648216e-05, 'samples': 18191872, 'steps': 35530, 'batch_loss/train': 0.697675549890846} 12/27/2021 15:29:47 - INFO - codeparrot_training - Step 35531: {'lr': 9.701934789138353e-05, 'samples': 18192384, 'steps': 35531, 'batch_loss/train': 0.6880712807178497} 12/27/2021 15:29:58 - INFO - codeparrot_training - Step 35532: {'lr': 9.700688085447806e-05, 'samples': 18192896, 'steps': 35532, 'batch_loss/train': 0.6736629498191178} 12/27/2021 15:30:10 - INFO - codeparrot_training - Step 35533: {'lr': 9.699441442581552e-05, 'samples': 18193408, 'steps': 35533, 'batch_loss/train': 0.6610046438872814} 12/27/2021 15:30:21 - INFO - codeparrot_training - Step 35534: {'lr': 9.698194860544554e-05, 'samples': 18193920, 'steps': 35534, 'batch_loss/train': 0.6970259891822934} 12/27/2021 15:30:31 - INFO - codeparrot_training - Step 35535: {'lr': 9.696948339341743e-05, 'samples': 18194432, 'steps': 35535, 'batch_loss/train': 0.7663783794268966} 12/27/2021 15:30:45 - INFO - codeparrot_training - Step 35536: {'lr': 9.695701878978081e-05, 'samples': 18194944, 'steps': 35536, 'batch_loss/train': 0.7164968538563699} 12/27/2021 15:30:56 - INFO - codeparrot_training - Step 35537: {'lr': 9.694455479458547e-05, 'samples': 18195456, 'steps': 35537, 'batch_loss/train': 0.9238690464990214} 12/27/2021 15:31:07 - INFO - codeparrot_training - Step 35538: {'lr': 
9.693209140788071e-05, 'samples': 18195968, 'steps': 35538, 'batch_loss/train': 0.6733849281445146} 12/27/2021 15:31:19 - INFO - codeparrot_training - Step 35539: {'lr': 9.691962862971612e-05, 'samples': 18196480, 'steps': 35539, 'batch_loss/train': 0.728737635537982} 12/27/2021 15:31:30 - INFO - codeparrot_training - Step 35540: {'lr': 9.690716646014131e-05, 'samples': 18196992, 'steps': 35540, 'batch_loss/train': 0.7417499313596636} 12/27/2021 15:31:40 - INFO - codeparrot_training - Step 35541: {'lr': 9.689470489920582e-05, 'samples': 18197504, 'steps': 35541, 'batch_loss/train': 0.589667750406079} 12/27/2021 15:31:51 - INFO - codeparrot_training - Step 35542: {'lr': 9.688224394695913e-05, 'samples': 18198016, 'steps': 35542, 'batch_loss/train': 0.9115322440629825} 12/27/2021 15:32:03 - INFO - codeparrot_training - Step 35543: {'lr': 9.686978360345095e-05, 'samples': 18198528, 'steps': 35543, 'batch_loss/train': 0.7106851004064083} 12/27/2021 15:32:13 - INFO - codeparrot_training - Step 35544: {'lr': 9.68573238687305e-05, 'samples': 18199040, 'steps': 35544, 'batch_loss/train': 0.7219243310391903} 12/27/2021 15:32:24 - INFO - codeparrot_training - Step 35545: {'lr': 9.68448647428476e-05, 'samples': 18199552, 'steps': 35545, 'batch_loss/train': 0.6752308933064342} 12/27/2021 15:32:38 - INFO - codeparrot_training - Step 35546: {'lr': 9.683240622585177e-05, 'samples': 18200064, 'steps': 35546, 'batch_loss/train': 0.7025135289877653} 12/27/2021 15:32:49 - INFO - codeparrot_training - Step 35547: {'lr': 9.681994831779225e-05, 'samples': 18200576, 'steps': 35547, 'batch_loss/train': 0.7183220915030688} 12/27/2021 15:32:59 - INFO - codeparrot_training - Step 35548: {'lr': 9.6807491018719e-05, 'samples': 18201088, 'steps': 35548, 'batch_loss/train': 0.6360579254105687} 12/27/2021 15:33:12 - INFO - codeparrot_training - Step 35549: {'lr': 9.679503432868123e-05, 'samples': 18201600, 'steps': 35549, 'batch_loss/train': 0.7354193348437548} 12/27/2021 15:33:22 - INFO - 
codeparrot_training - Step 35550: {'lr': 9.678257824772855e-05, 'samples': 18202112, 'steps': 35550, 'batch_loss/train': 0.6954219555482268} 12/27/2021 15:33:33 - INFO - codeparrot_training - Step 35551: {'lr': 9.677012277591046e-05, 'samples': 18202624, 'steps': 35551, 'batch_loss/train': 0.6168061038479209} 12/27/2021 15:33:43 - INFO - codeparrot_training - Step 35552: {'lr': 9.675766791327653e-05, 'samples': 18203136, 'steps': 35552, 'batch_loss/train': 0.7216682573780417} 12/27/2021 15:33:56 - INFO - codeparrot_training - Step 35553: {'lr': 9.674521365987626e-05, 'samples': 18203648, 'steps': 35553, 'batch_loss/train': 0.7078867554664612} 12/27/2021 15:34:06 - INFO - codeparrot_training - Step 35554: {'lr': 9.673276001575912e-05, 'samples': 18204160, 'steps': 35554, 'batch_loss/train': 0.7522571426816285} 12/27/2021 15:34:17 - INFO - codeparrot_training - Step 35555: {'lr': 9.67203069809747e-05, 'samples': 18204672, 'steps': 35555, 'batch_loss/train': 0.7198847243562341} 12/27/2021 15:34:31 - INFO - codeparrot_training - Step 35556: {'lr': 9.670785455557243e-05, 'samples': 18205184, 'steps': 35556, 'batch_loss/train': 0.6960007445886731} 12/27/2021 15:34:41 - INFO - codeparrot_training - Step 35557: {'lr': 9.669540273960196e-05, 'samples': 18205696, 'steps': 35557, 'batch_loss/train': 0.7363639860413969} 12/27/2021 15:34:52 - INFO - codeparrot_training - Step 35558: {'lr': 9.668295153311246e-05, 'samples': 18206208, 'steps': 35558, 'batch_loss/train': 0.790697269141674} 12/27/2021 15:35:04 - INFO - codeparrot_training - Step 35559: {'lr': 9.667050093615379e-05, 'samples': 18206720, 'steps': 35559, 'batch_loss/train': 0.723160880850628} 12/27/2021 15:35:15 - INFO - codeparrot_training - Step 35560: {'lr': 9.665805094877539e-05, 'samples': 18207232, 'steps': 35560, 'batch_loss/train': 0.7867607213556767} 12/27/2021 15:35:25 - INFO - codeparrot_training - Step 35561: {'lr': 9.664560157102659e-05, 'samples': 18207744, 'steps': 35561, 'batch_loss/train': 
0.6233026956906542} 12/27/2021 15:35:38 - INFO - codeparrot_training - Step 35562: {'lr': 9.663315280295687e-05, 'samples': 18208256, 'steps': 35562, 'batch_loss/train': 0.7780787036754191} 12/27/2021 15:35:48 - INFO - codeparrot_training - Step 35563: {'lr': 9.662070464461603e-05, 'samples': 18208768, 'steps': 35563, 'batch_loss/train': 0.5659815210674424} 12/27/2021 15:35:59 - INFO - codeparrot_training - Step 35564: {'lr': 9.660825709605325e-05, 'samples': 18209280, 'steps': 35564, 'batch_loss/train': 0.7217256696312688} 12/27/2021 15:36:10 - INFO - codeparrot_training - Step 35565: {'lr': 9.659581015731811e-05, 'samples': 18209792, 'steps': 35565, 'batch_loss/train': 0.5777071579359472} 12/27/2021 15:36:24 - INFO - codeparrot_training - Step 35566: {'lr': 9.658336382846011e-05, 'samples': 18210304, 'steps': 35566, 'batch_loss/train': 0.6444688759511337} 12/27/2021 15:36:35 - INFO - codeparrot_training - Step 35567: {'lr': 9.657091810952876e-05, 'samples': 18210816, 'steps': 35567, 'batch_loss/train': 0.6979010275099427} 12/27/2021 15:36:45 - INFO - codeparrot_training - Step 35568: {'lr': 9.655847300057346e-05, 'samples': 18211328, 'steps': 35568, 'batch_loss/train': 0.6956471018493176} 12/27/2021 15:36:57 - INFO - codeparrot_training - Step 35569: {'lr': 9.654602850164376e-05, 'samples': 18211840, 'steps': 35569, 'batch_loss/train': 0.5879689613357186} 12/27/2021 15:37:08 - INFO - codeparrot_training - Step 35570: {'lr': 9.653358461278913e-05, 'samples': 18212352, 'steps': 35570, 'batch_loss/train': 0.73029179580044} 12/27/2021 15:37:18 - INFO - codeparrot_training - Step 35571: {'lr': 9.652114133405898e-05, 'samples': 18212864, 'steps': 35571, 'batch_loss/train': 0.68666703812778} 12/27/2021 15:37:31 - INFO - codeparrot_training - Step 35572: {'lr': 9.650869866550294e-05, 'samples': 18213376, 'steps': 35572, 'batch_loss/train': 0.7307100733742118} 12/27/2021 15:37:42 - INFO - codeparrot_training - Step 35573: {'lr': 9.649625660717016e-05, 'samples': 18213888, 
'steps': 35573, 'batch_loss/train': 0.729357200092636} 12/27/2021 15:37:52 - INFO - codeparrot_training - Step 35574: {'lr': 9.64838151591104e-05, 'samples': 18214400, 'steps': 35574, 'batch_loss/train': 0.6514373486861587} 12/27/2021 15:38:04 - INFO - codeparrot_training - Step 35575: {'lr': 9.647137432137312e-05, 'samples': 18214912, 'steps': 35575, 'batch_loss/train': 0.6121893967210781} 12/27/2021 15:38:15 - INFO - codeparrot_training - Step 35576: {'lr': 9.645893409400758e-05, 'samples': 18215424, 'steps': 35576, 'batch_loss/train': 0.7250184887088835} 12/27/2021 15:38:26 - INFO - codeparrot_training - Step 35577: {'lr': 9.644649447706336e-05, 'samples': 18215936, 'steps': 35577, 'batch_loss/train': 0.7030135265085846} 12/27/2021 15:38:36 - INFO - codeparrot_training - Step 35578: {'lr': 9.643405547058989e-05, 'samples': 18216448, 'steps': 35578, 'batch_loss/train': 0.67955855675973} 12/27/2021 15:38:50 - INFO - codeparrot_training - Step 35579: {'lr': 9.642161707463662e-05, 'samples': 18216960, 'steps': 35579, 'batch_loss/train': 0.7776515246368945} 12/27/2021 15:39:01 - INFO - codeparrot_training - Step 35580: {'lr': 9.640917928925302e-05, 'samples': 18217472, 'steps': 35580, 'batch_loss/train': 0.7416417011991143} 12/27/2021 15:39:12 - INFO - codeparrot_training - Step 35581: {'lr': 9.639674211448854e-05, 'samples': 18217984, 'steps': 35581, 'batch_loss/train': 0.6747794910334051} 12/27/2021 15:39:24 - INFO - codeparrot_training - Step 35582: {'lr': 9.638430555039262e-05, 'samples': 18218496, 'steps': 35582, 'batch_loss/train': 0.689960933290422} 12/27/2021 15:39:34 - INFO - codeparrot_training - Step 35583: {'lr': 9.637186959701469e-05, 'samples': 18219008, 'steps': 35583, 'batch_loss/train': 0.7357696536928415} 12/27/2021 15:39:45 - INFO - codeparrot_training - Step 35584: {'lr': 9.635943425440418e-05, 'samples': 18219520, 'steps': 35584, 'batch_loss/train': 0.7324078041128814} 12/27/2021 15:39:57 - INFO - codeparrot_training - Step 35585: {'lr': 
9.634699952261055e-05, 'samples': 18220032, 'steps': 35585, 'batch_loss/train': 0.7851353967562318} 12/27/2021 15:40:08 - INFO - codeparrot_training - Step 35586: {'lr': 9.633456540168336e-05, 'samples': 18220544, 'steps': 35586, 'batch_loss/train': 0.7934884941205382} 12/27/2021 15:40:18 - INFO - codeparrot_training - Step 35587: {'lr': 9.63221318916718e-05, 'samples': 18221056, 'steps': 35587, 'batch_loss/train': 0.7691303491592407} 12/27/2021 15:40:29 - INFO - codeparrot_training - Step 35588: {'lr': 9.630969899262531e-05, 'samples': 18221568, 'steps': 35588, 'batch_loss/train': 0.8146612048149109} 12/27/2021 15:40:43 - INFO - codeparrot_training - Step 35589: {'lr': 9.629726670459363e-05, 'samples': 18222080, 'steps': 35589, 'batch_loss/train': 1.3746132627129555} 12/27/2021 15:40:53 - INFO - codeparrot_training - Step 35590: {'lr': 9.628483502762589e-05, 'samples': 18222592, 'steps': 35590, 'batch_loss/train': 0.7258619498461485} 12/27/2021 15:41:04 - INFO - codeparrot_training - Step 35591: {'lr': 9.627240396177162e-05, 'samples': 18223104, 'steps': 35591, 'batch_loss/train': 0.7402160419151187} 12/27/2021 15:41:16 - INFO - codeparrot_training - Step 35592: {'lr': 9.625997350708018e-05, 'samples': 18223616, 'steps': 35592, 'batch_loss/train': 0.6435139612294734} 12/27/2021 15:41:27 - INFO - codeparrot_training - Step 35593: {'lr': 9.624754366360108e-05, 'samples': 18224128, 'steps': 35593, 'batch_loss/train': 0.6611620406038128} 12/27/2021 15:41:37 - INFO - codeparrot_training - Step 35594: {'lr': 9.623511443138367e-05, 'samples': 18224640, 'steps': 35594, 'batch_loss/train': 0.8209523316472769} 12/27/2021 15:41:52 - INFO - codeparrot_training - Step 35595: {'lr': 9.622268581047738e-05, 'samples': 18225152, 'steps': 35595, 'batch_loss/train': 0.60342166404007} 12/27/2021 15:42:02 - INFO - codeparrot_training - Step 35596: {'lr': 9.621025780093165e-05, 'samples': 18225664, 'steps': 35596, 'batch_loss/train': 0.605864395853132} 12/27/2021 15:42:13 - INFO - 
codeparrot_training - Step 35597: {'lr': 9.619783040279584e-05, 'samples': 18226176, 'steps': 35597, 'batch_loss/train': 0.8097067251801491} 12/27/2021 15:42:25 - INFO - codeparrot_training - Step 35598: {'lr': 9.618540361611949e-05, 'samples': 18226688, 'steps': 35598, 'batch_loss/train': 0.7258571973070502} 12/27/2021 15:42:36 - INFO - codeparrot_training - Step 35599: {'lr': 9.617297744095172e-05, 'samples': 18227200, 'steps': 35599, 'batch_loss/train': 0.6750585548579693} 12/27/2021 15:42:47 - INFO - codeparrot_training - Step 35600: {'lr': 9.616055187734218e-05, 'samples': 18227712, 'steps': 35600, 'batch_loss/train': 0.7413767608813941} 12/27/2021 15:42:57 - INFO - codeparrot_training - Step 35601: {'lr': 9.61481269253403e-05, 'samples': 18228224, 'steps': 35601, 'batch_loss/train': 0.7273547118529677} 12/27/2021 15:43:09 - INFO - codeparrot_training - Step 35602: {'lr': 9.61357025849953e-05, 'samples': 18228736, 'steps': 35602, 'batch_loss/train': 0.7191376346163452} 12/27/2021 15:43:20 - INFO - codeparrot_training - Step 35603: {'lr': 9.612327885635663e-05, 'samples': 18229248, 'steps': 35603, 'batch_loss/train': 0.7267008220078424} 12/27/2021 15:43:31 - INFO - codeparrot_training - Step 35604: {'lr': 9.611085573947371e-05, 'samples': 18229760, 'steps': 35604, 'batch_loss/train': 0.6983309015631676} 12/27/2021 15:43:44 - INFO - codeparrot_training - Step 35605: {'lr': 9.609843323439594e-05, 'samples': 18230272, 'steps': 35605, 'batch_loss/train': 0.7906979601830244} 12/27/2021 15:43:55 - INFO - codeparrot_training - Step 35606: {'lr': 9.608601134117264e-05, 'samples': 18230784, 'steps': 35606, 'batch_loss/train': 0.792594201862812} 12/27/2021 15:44:06 - INFO - codeparrot_training - Step 35607: {'lr': 9.607359005985327e-05, 'samples': 18231296, 'steps': 35607, 'batch_loss/train': 0.7693005632609129} 12/27/2021 15:44:18 - INFO - codeparrot_training - Step 35608: {'lr': 9.60611693904872e-05, 'samples': 18231808, 'steps': 35608, 'batch_loss/train': 
0.7961656730622053} 12/27/2021 15:44:29 - INFO - codeparrot_training - Step 35609: {'lr': 9.604874933312378e-05, 'samples': 18232320, 'steps': 35609, 'batch_loss/train': 0.69483657553792} 12/27/2021 15:44:39 - INFO - codeparrot_training - Step 35610: {'lr': 9.60363298878124e-05, 'samples': 18232832, 'steps': 35610, 'batch_loss/train': 0.7729988251812756} 12/27/2021 15:44:50 - INFO - codeparrot_training - Step 35611: {'lr': 9.602391105460246e-05, 'samples': 18233344, 'steps': 35611, 'batch_loss/train': 0.7151641380041838} 12/27/2021 15:45:04 - INFO - codeparrot_training - Step 35612: {'lr': 9.601149283354336e-05, 'samples': 18233856, 'steps': 35612, 'batch_loss/train': 0.7496756431646645} 12/27/2021 15:45:14 - INFO - codeparrot_training - Step 35613: {'lr': 9.599907522468435e-05, 'samples': 18234368, 'steps': 35613, 'batch_loss/train': 0.7205678271129727} 12/27/2021 15:45:25 - INFO - codeparrot_training - Step 35614: {'lr': 9.598665822807479e-05, 'samples': 18234880, 'steps': 35614, 'batch_loss/train': 0.7638664306141436} 12/27/2021 15:45:37 - INFO - codeparrot_training - Step 35615: {'lr': 9.59742418437643e-05, 'samples': 18235392, 'steps': 35615, 'batch_loss/train': 0.7500141374766827} 12/27/2021 15:45:48 - INFO - codeparrot_training - Step 35616: {'lr': 9.596182607180196e-05, 'samples': 18235904, 'steps': 35616, 'batch_loss/train': 0.6430748039856553} 12/27/2021 15:45:58 - INFO - codeparrot_training - Step 35617: {'lr': 9.594941091223725e-05, 'samples': 18236416, 'steps': 35617, 'batch_loss/train': 0.5088404106791131} 12/27/2021 15:46:10 - INFO - codeparrot_training - Step 35618: {'lr': 9.593699636511952e-05, 'samples': 18236928, 'steps': 35618, 'batch_loss/train': 0.7878969898447394} 12/27/2021 15:46:21 - INFO - codeparrot_training - Step 35619: {'lr': 9.592458243049812e-05, 'samples': 18237440, 'steps': 35619, 'batch_loss/train': 0.7256737682037055} 12/27/2021 15:46:32 - INFO - codeparrot_training - Step 35620: {'lr': 9.59121691084224e-05, 'samples': 18237952, 
'steps': 35620, 'batch_loss/train': 0.7165853483602405} 12/27/2021 15:46:42 - INFO - codeparrot_training - Step 35621: {'lr': 9.589975639894169e-05, 'samples': 18238464, 'steps': 35621, 'batch_loss/train': 0.6933795632794499} 12/27/2021 15:46:54 - INFO - codeparrot_training - Step 35622: {'lr': 9.58873443021054e-05, 'samples': 18238976, 'steps': 35622, 'batch_loss/train': 0.7949727084487677} 12/27/2021 15:47:05 - INFO - codeparrot_training - Step 35623: {'lr': 9.587493281796283e-05, 'samples': 18239488, 'steps': 35623, 'batch_loss/train': 0.8282196819782257} 12/27/2021 15:47:16 - INFO - codeparrot_training - Step 35624: {'lr': 9.586252194656339e-05, 'samples': 18240000, 'steps': 35624, 'batch_loss/train': 0.7324314792640507} 12/27/2021 15:47:30 - INFO - codeparrot_training - Step 35625: {'lr': 9.585011168795623e-05, 'samples': 18240512, 'steps': 35625, 'batch_loss/train': 0.697044868953526} 12/27/2021 15:47:41 - INFO - codeparrot_training - Step 35626: {'lr': 9.583770204219088e-05, 'samples': 18241024, 'steps': 35626, 'batch_loss/train': 0.7537901941686869} 12/27/2021 15:47:51 - INFO - codeparrot_training - Step 35627: {'lr': 9.582529300931672e-05, 'samples': 18241536, 'steps': 35627, 'batch_loss/train': 0.8480989269446582} 12/27/2021 15:48:04 - INFO - codeparrot_training - Step 35628: {'lr': 9.581288458938289e-05, 'samples': 18242048, 'steps': 35628, 'batch_loss/train': 0.6423485262785107} 12/27/2021 15:48:14 - INFO - codeparrot_training - Step 35629: {'lr': 9.580047678243872e-05, 'samples': 18242560, 'steps': 35629, 'batch_loss/train': 0.759436153806746} 12/27/2021 15:48:25 - INFO - codeparrot_training - Step 35630: {'lr': 9.578806958853381e-05, 'samples': 18243072, 'steps': 35630, 'batch_loss/train': 0.7105741994455457} 12/27/2021 15:48:36 - INFO - codeparrot_training - Step 35631: {'lr': 9.57756630077172e-05, 'samples': 18243584, 'steps': 35631, 'batch_loss/train': 0.6667629953008145} 12/27/2021 15:48:48 - INFO - codeparrot_training - Step 35632: {'lr': 
9.576325704003835e-05, 'samples': 18244096, 'steps': 35632, 'batch_loss/train': 0.7961804736405611} 12/27/2021 15:48:58 - INFO - codeparrot_training - Step 35633: {'lr': 9.575085168554653e-05, 'samples': 18244608, 'steps': 35633, 'batch_loss/train': 0.6901559010148048} 12/27/2021 15:49:09 - INFO - codeparrot_training - Step 35634: {'lr': 9.573844694429109e-05, 'samples': 18245120, 'steps': 35634, 'batch_loss/train': 0.7055669687688351} 12/27/2021 15:49:23 - INFO - codeparrot_training - Step 35635: {'lr': 9.572604281632135e-05, 'samples': 18245632, 'steps': 35635, 'batch_loss/train': 0.694852335145697} 12/27/2021 15:49:33 - INFO - codeparrot_training - Step 35636: {'lr': 9.57136393016866e-05, 'samples': 18246144, 'steps': 35636, 'batch_loss/train': 0.6428313995711505} 12/27/2021 15:49:44 - INFO - codeparrot_training - Step 35637: {'lr': 9.570123640043618e-05, 'samples': 18246656, 'steps': 35637, 'batch_loss/train': 0.6974210450425744} 12/27/2021 15:49:56 - INFO - codeparrot_training - Step 35638: {'lr': 9.568883411261944e-05, 'samples': 18247168, 'steps': 35638, 'batch_loss/train': 0.7915255133993924} 12/27/2021 15:50:07 - INFO - codeparrot_training - Step 35639: {'lr': 9.567643243828558e-05, 'samples': 18247680, 'steps': 35639, 'batch_loss/train': 0.721165825612843} 12/27/2021 15:50:17 - INFO - codeparrot_training - Step 35640: {'lr': 9.566403137748383e-05, 'samples': 18248192, 'steps': 35640, 'batch_loss/train': 0.799771080724895} 12/27/2021 15:50:28 - INFO - codeparrot_training - Step 35641: {'lr': 9.56516309302638e-05, 'samples': 18248704, 'steps': 35641, 'batch_loss/train': 0.6611841709818691} 12/27/2021 15:50:40 - INFO - codeparrot_training - Step 35642: {'lr': 9.563923109667455e-05, 'samples': 18249216, 'steps': 35642, 'batch_loss/train': 0.7214314287994057} 12/27/2021 15:50:50 - INFO - codeparrot_training - Step 35643: {'lr': 9.56268318767653e-05, 'samples': 18249728, 'steps': 35643, 'batch_loss/train': 0.6536036485340446} 12/27/2021 15:51:01 - INFO - 
codeparrot_training - Step 35644: {'lr': 9.561443327058569e-05, 'samples': 18250240, 'steps': 35644, 'batch_loss/train': 0.6839099964126945} 12/27/2021 15:51:15 - INFO - codeparrot_training - Step 35645: {'lr': 9.56020352781847e-05, 'samples': 18250752, 'steps': 35645, 'batch_loss/train': 0.8934956500306726} 12/27/2021 15:51:26 - INFO - codeparrot_training - Step 35646: {'lr': 9.55896378996117e-05, 'samples': 18251264, 'steps': 35646, 'batch_loss/train': 0.6077330549596809} 12/27/2021 15:51:36 - INFO - codeparrot_training - Step 35647: {'lr': 9.5577241134916e-05, 'samples': 18251776, 'steps': 35647, 'batch_loss/train': 0.7371204243972898} 12/27/2021 15:51:48 - INFO - codeparrot_training - Step 35648: {'lr': 9.556484498414691e-05, 'samples': 18252288, 'steps': 35648, 'batch_loss/train': 0.7309583388268948} 12/27/2021 15:51:59 - INFO - codeparrot_training - Step 35649: {'lr': 9.555244944735367e-05, 'samples': 18252800, 'steps': 35649, 'batch_loss/train': 0.7079448355361819} 12/27/2021 15:52:09 - INFO - codeparrot_training - Step 35650: {'lr': 9.554005452458564e-05, 'samples': 18253312, 'steps': 35650, 'batch_loss/train': 0.6819901498965919} 12/27/2021 15:52:21 - INFO - codeparrot_training - Step 35651: {'lr': 9.552766021589188e-05, 'samples': 18253824, 'steps': 35651, 'batch_loss/train': 0.7315757042961195} 12/27/2021 15:52:32 - INFO - codeparrot_training - Step 35652: {'lr': 9.551526652132192e-05, 'samples': 18254336, 'steps': 35652, 'batch_loss/train': 0.7260130830109119} 12/27/2021 15:52:43 - INFO - codeparrot_training - Step 35653: {'lr': 9.550287344092498e-05, 'samples': 18254848, 'steps': 35653, 'batch_loss/train': 0.7036185511387885} 12/27/2021 15:52:53 - INFO - codeparrot_training - Step 35654: {'lr': 9.549048097475022e-05, 'samples': 18255360, 'steps': 35654, 'batch_loss/train': 0.6274188556708395} 12/27/2021 15:53:08 - INFO - codeparrot_training - Step 35655: {'lr': 9.547808912284686e-05, 'samples': 18255872, 'steps': 35655, 'batch_loss/train': 
0.7371995132416487} 12/27/2021 15:53:18 - INFO - codeparrot_training - Step 35656: {'lr': 9.546569788526447e-05, 'samples': 18256384, 'steps': 35656, 'batch_loss/train': 0.7668287004344165} 12/27/2021 15:53:29 - INFO - codeparrot_training - Step 35657: {'lr': 9.545330726205198e-05, 'samples': 18256896, 'steps': 35657, 'batch_loss/train': 0.8106929142959416} 12/27/2021 15:53:41 - INFO - codeparrot_training - Step 35658: {'lr': 9.544091725325882e-05, 'samples': 18257408, 'steps': 35658, 'batch_loss/train': 0.5493429105263203} 12/27/2021 15:53:51 - INFO - codeparrot_training - Step 35659: {'lr': 9.542852785893421e-05, 'samples': 18257920, 'steps': 35659, 'batch_loss/train': 0.5683344712015241} 12/27/2021 15:54:02 - INFO - codeparrot_training - Step 35660: {'lr': 9.541613907912742e-05, 'samples': 18258432, 'steps': 35660, 'batch_loss/train': 1.336568096652627} 12/27/2021 15:54:14 - INFO - codeparrot_training - Step 35661: {'lr': 9.540375091388766e-05, 'samples': 18258944, 'steps': 35661, 'batch_loss/train': 0.740015187067911} 12/27/2021 15:54:25 - INFO - codeparrot_training - Step 35662: {'lr': 9.539136336326426e-05, 'samples': 18259456, 'steps': 35662, 'batch_loss/train': 0.7609488004818559} 12/27/2021 15:54:35 - INFO - codeparrot_training - Step 35663: {'lr': 9.53789764273064e-05, 'samples': 18259968, 'steps': 35663, 'batch_loss/train': 0.7535416530445218} 12/27/2021 15:54:46 - INFO - codeparrot_training - Step 35664: {'lr': 9.536659010606336e-05, 'samples': 18260480, 'steps': 35664, 'batch_loss/train': 0.752021380700171} 12/27/2021 15:55:00 - INFO - codeparrot_training - Step 35665: {'lr': 9.535420439958443e-05, 'samples': 18260992, 'steps': 35665, 'batch_loss/train': 0.6927860816940665} 12/27/2021 15:55:11 - INFO - codeparrot_training - Step 35666: {'lr': 9.534181930791864e-05, 'samples': 18261504, 'steps': 35666, 'batch_loss/train': 0.8016884569078684} 12/27/2021 15:55:22 - INFO - codeparrot_training - Step 35667: {'lr': 9.532943483111553e-05, 'samples': 18262016, 
'steps': 35667, 'batch_loss/train': 2.500639136880636} 12/27/2021 15:55:34 - INFO - codeparrot_training - Step 35668: {'lr': 9.53170509692241e-05, 'samples': 18262528, 'steps': 35668, 'batch_loss/train': 0.7404604065814056} 12/27/2021 15:55:45 - INFO - codeparrot_training - Step 35669: {'lr': 9.53046677222936e-05, 'samples': 18263040, 'steps': 35669, 'batch_loss/train': 0.36992755776736885} 12/27/2021 15:55:55 - INFO - codeparrot_training - Step 35670: {'lr': 9.529228509037349e-05, 'samples': 18263552, 'steps': 35670, 'batch_loss/train': 0.7738149808719754} 12/27/2021 15:56:06 - INFO - codeparrot_training - Step 35671: {'lr': 9.527990307351277e-05, 'samples': 18264064, 'steps': 35671, 'batch_loss/train': 0.745609617093578} 12/27/2021 15:56:18 - INFO - codeparrot_training - Step 35672: {'lr': 9.526752167176073e-05, 'samples': 18264576, 'steps': 35672, 'batch_loss/train': 0.754799920017831} 12/27/2021 15:56:29 - INFO - codeparrot_training - Step 35673: {'lr': 9.525514088516659e-05, 'samples': 18265088, 'steps': 35673, 'batch_loss/train': 0.7374648982658982} 12/27/2021 15:56:39 - INFO - codeparrot_training - Step 35674: {'lr': 9.52427607137796e-05, 'samples': 18265600, 'steps': 35674, 'batch_loss/train': 0.7739876779960468} 12/27/2021 15:56:53 - INFO - codeparrot_training - Step 35675: {'lr': 9.523038115764892e-05, 'samples': 18266112, 'steps': 35675, 'batch_loss/train': 0.7763915024697781} 12/27/2021 15:57:04 - INFO - codeparrot_training - Step 35676: {'lr': 9.521800221682394e-05, 'samples': 18266624, 'steps': 35676, 'batch_loss/train': 0.7750779259949923} 12/27/2021 15:57:14 - INFO - codeparrot_training - Step 35677: {'lr': 9.520562389135353e-05, 'samples': 18267136, 'steps': 35677, 'batch_loss/train': 0.6613026591949165} 12/27/2021 15:57:26 - INFO - codeparrot_training - Step 35678: {'lr': 9.519324618128722e-05, 'samples': 18267648, 'steps': 35678, 'batch_loss/train': 0.718902200460434} 12/27/2021 15:57:37 - INFO - codeparrot_training - Step 35679: {'lr': 
9.51808690866742e-05, 'samples': 18268160, 'steps': 35679, 'batch_loss/train': 0.7113848514854908} 12/27/2021 15:57:48 - INFO - codeparrot_training - Step 35680: {'lr': 9.516849260756338e-05, 'samples': 18268672, 'steps': 35680, 'batch_loss/train': 0.6745733609423041} 12/27/2021 15:58:02 - INFO - codeparrot_training - Step 35681: {'lr': 9.515611674400428e-05, 'samples': 18269184, 'steps': 35681, 'batch_loss/train': 0.7749028797261417} 12/27/2021 15:58:12 - INFO - codeparrot_training - Step 35682: {'lr': 9.51437414960461e-05, 'samples': 18269696, 'steps': 35682, 'batch_loss/train': 0.7750254916027188} 12/27/2021 15:58:23 - INFO - codeparrot_training - Step 35683: {'lr': 9.513136686373783e-05, 'samples': 18270208, 'steps': 35683, 'batch_loss/train': 0.8325651716440916} 12/27/2021 15:58:34 - INFO - codeparrot_training - Step 35684: {'lr': 9.511899284712875e-05, 'samples': 18270720, 'steps': 35684, 'batch_loss/train': 0.7987270504236221} 12/27/2021 15:58:46 - INFO - codeparrot_training - Step 35685: {'lr': 9.510661944626808e-05, 'samples': 18271232, 'steps': 35685, 'batch_loss/train': 0.7927672751247883} 12/27/2021 15:58:56 - INFO - codeparrot_training - Step 35686: {'lr': 9.509424666120503e-05, 'samples': 18271744, 'steps': 35686, 'batch_loss/train': 0.6457416755147278} 12/27/2021 15:59:07 - INFO - codeparrot_training - Step 35687: {'lr': 9.508187449198874e-05, 'samples': 18272256, 'steps': 35687, 'batch_loss/train': 0.7493279557675123} 12/27/2021 15:59:19 - INFO - codeparrot_training - Step 35688: {'lr': 9.506950293866845e-05, 'samples': 18272768, 'steps': 35688, 'batch_loss/train': 0.701217507943511} 12/27/2021 15:59:30 - INFO - codeparrot_training - Step 35689: {'lr': 9.505713200129329e-05, 'samples': 18273280, 'steps': 35689, 'batch_loss/train': 0.7445602286607027} 12/27/2021 15:59:41 - INFO - codeparrot_training - Step 35690: {'lr': 9.504476167991247e-05, 'samples': 18273792, 'steps': 35690, 'batch_loss/train': 0.5716936443932354} 12/27/2021 15:59:54 - INFO - 
codeparrot_training - Step 35691: {'lr': 9.503239197457528e-05, 'samples': 18274304, 'steps': 35691, 'batch_loss/train': 0.6222275698091835} 12/27/2021 16:00:05 - INFO - codeparrot_training - Step 35692: {'lr': 9.50200228853306e-05, 'samples': 18274816, 'steps': 35692, 'batch_loss/train': 0.6725667540449649} 12/27/2021 16:00:16 - INFO - codeparrot_training - Step 35693: {'lr': 9.500765441222795e-05, 'samples': 18275328, 'steps': 35693, 'batch_loss/train': 0.7168538435362279} 12/27/2021 16:00:26 - INFO - codeparrot_training - Step 35694: {'lr': 9.499528655531625e-05, 'samples': 18275840, 'steps': 35694, 'batch_loss/train': 0.7525026218499988} 12/27/2021 16:00:38 - INFO - codeparrot_training - Step 35695: {'lr': 9.49829193146447e-05, 'samples': 18276352, 'steps': 35695, 'batch_loss/train': 0.6513575068674982} 12/27/2021 16:00:49 - INFO - codeparrot_training - Step 35696: {'lr': 9.497055269026269e-05, 'samples': 18276864, 'steps': 35696, 'batch_loss/train': 0.7617508752737194} 12/27/2021 16:00:59 - INFO - codeparrot_training - Step 35697: {'lr': 9.495818668221917e-05, 'samples': 18277376, 'steps': 35697, 'batch_loss/train': 0.7052996219135821} 12/27/2021 16:01:12 - INFO - codeparrot_training - Step 35698: {'lr': 9.494582129056334e-05, 'samples': 18277888, 'steps': 35698, 'batch_loss/train': 0.7190717509947717} 12/27/2021 16:01:22 - INFO - codeparrot_training - Step 35699: {'lr': 9.493345651534435e-05, 'samples': 18278400, 'steps': 35699, 'batch_loss/train': 0.6710964934900403} 12/27/2021 16:01:33 - INFO - codeparrot_training - Step 35700: {'lr': 9.492109235661142e-05, 'samples': 18278912, 'steps': 35700, 'batch_loss/train': 0.7823557537049055} 12/27/2021 16:01:45 - INFO - codeparrot_training - Step 35701: {'lr': 9.490872881441365e-05, 'samples': 18279424, 'steps': 35701, 'batch_loss/train': 0.7286529110278934} 12/27/2021 16:01:56 - INFO - codeparrot_training - Step 35702: {'lr': 9.489636588880032e-05, 'samples': 18279936, 'steps': 35702, 'batch_loss/train': 
0.7457581073977053} 12/27/2021 16:02:06 - INFO - codeparrot_training - Step 35703: {'lr': 9.48840035798203e-05, 'samples': 18280448, 'steps': 35703, 'batch_loss/train': 0.7686090131755918} 12/27/2021 16:02:20 - INFO - codeparrot_training - Step 35704: {'lr': 9.487164188752299e-05, 'samples': 18280960, 'steps': 35704, 'batch_loss/train': 0.6947275931015611} 12/27/2021 16:02:31 - INFO - codeparrot_training - Step 35705: {'lr': 9.485928081195754e-05, 'samples': 18281472, 'steps': 35705, 'batch_loss/train': 0.815170569345355} 12/27/2021 16:02:41 - INFO - codeparrot_training - Step 35706: {'lr': 9.484692035317283e-05, 'samples': 18281984, 'steps': 35706, 'batch_loss/train': 0.7383122439496219} 12/27/2021 16:02:52 - INFO - codeparrot_training - Step 35707: {'lr': 9.483456051121827e-05, 'samples': 18282496, 'steps': 35707, 'batch_loss/train': 0.7351507560815662} 12/27/2021 16:03:04 - INFO - codeparrot_training - Step 35708: {'lr': 9.482220128614302e-05, 'samples': 18283008, 'steps': 35708, 'batch_loss/train': 0.7875766679644585} 12/27/2021 16:03:15 - INFO - codeparrot_training - Step 35709: {'lr': 9.4809842677996e-05, 'samples': 18283520, 'steps': 35709, 'batch_loss/train': 0.7091561360284686} 12/27/2021 16:03:25 - INFO - codeparrot_training - Step 35710: {'lr': 9.479748468682637e-05, 'samples': 18284032, 'steps': 35710, 'batch_loss/train': 0.7644245633855462} 12/27/2021 16:03:39 - INFO - codeparrot_training - Step 35711: {'lr': 9.478512731268352e-05, 'samples': 18284544, 'steps': 35711, 'batch_loss/train': 0.77675883367192} 12/27/2021 16:03:50 - INFO - codeparrot_training - Step 35712: {'lr': 9.477277055561628e-05, 'samples': 18285056, 'steps': 35712, 'batch_loss/train': 0.7597151789814234} 12/27/2021 16:04:00 - INFO - codeparrot_training - Step 35713: {'lr': 9.476041441567393e-05, 'samples': 18285568, 'steps': 35713, 'batch_loss/train': 0.7629708079621196} 12/27/2021 16:04:12 - INFO - codeparrot_training - Step 35714: {'lr': 9.474805889290552e-05, 'samples': 18286080, 
'steps': 35714, 'batch_loss/train': 0.6628235681564547} 12/27/2021 16:04:23 - INFO - codeparrot_training - Step 35715: {'lr': 9.473570398736022e-05, 'samples': 18286592, 'steps': 35715, 'batch_loss/train': 0.8158446401357651} 12/27/2021 16:04:34 - INFO - codeparrot_training - Step 35716: {'lr': 9.472334969908713e-05, 'samples': 18287104, 'steps': 35716, 'batch_loss/train': 0.791647111531347} 12/27/2021 16:04:44 - INFO - codeparrot_training - Step 35717: {'lr': 9.471099602813541e-05, 'samples': 18287616, 'steps': 35717, 'batch_loss/train': 0.6534002488479018} 12/27/2021 16:04:57 - INFO - codeparrot_training - Step 35718: {'lr': 9.46986429745541e-05, 'samples': 18288128, 'steps': 35718, 'batch_loss/train': 0.8196423188783228} 12/27/2021 16:05:07 - INFO - codeparrot_training - Step 35719: {'lr': 9.468629053839234e-05, 'samples': 18288640, 'steps': 35719, 'batch_loss/train': 0.8650286910124123} 12/27/2021 16:05:18 - INFO - codeparrot_training - Step 35720: {'lr': 9.467393871969934e-05, 'samples': 18289152, 'steps': 35720, 'batch_loss/train': 0.6800003745593131} 12/27/2021 16:05:32 - INFO - codeparrot_training - Step 35721: {'lr': 9.466158751852394e-05, 'samples': 18289664, 'steps': 35721, 'batch_loss/train': 0.7177582215517759} 12/27/2021 16:05:42 - INFO - codeparrot_training - Step 35722: {'lr': 9.464923693491558e-05, 'samples': 18290176, 'steps': 35722, 'batch_loss/train': 0.7528553856536746} 12/27/2021 16:05:53 - INFO - codeparrot_training - Step 35723: {'lr': 9.463688696892315e-05, 'samples': 18290688, 'steps': 35723, 'batch_loss/train': 0.7239911202341318} 12/27/2021 16:06:05 - INFO - codeparrot_training - Step 35724: {'lr': 9.462453762059575e-05, 'samples': 18291200, 'steps': 35724, 'batch_loss/train': 0.6830595908686519} 12/27/2021 16:06:16 - INFO - codeparrot_training - Step 35725: {'lr': 9.461218888998252e-05, 'samples': 18291712, 'steps': 35725, 'batch_loss/train': 0.6762294385116547} 12/27/2021 16:06:26 - INFO - codeparrot_training - Step 35726: {'lr': 
9.459984077713257e-05, 'samples': 18292224, 'steps': 35726, 'batch_loss/train': 0.7326400522142649} 12/27/2021 16:06:40 - INFO - codeparrot_training - Step 35727: {'lr': 9.458749328209499e-05, 'samples': 18292736, 'steps': 35727, 'batch_loss/train': 0.7625453942455351} 12/27/2021 16:06:51 - INFO - codeparrot_training - Step 35728: {'lr': 9.45751464049189e-05, 'samples': 18293248, 'steps': 35728, 'batch_loss/train': 0.7598450083751231} 12/27/2021 16:07:02 - INFO - codeparrot_training - Step 35729: {'lr': 9.456280014565322e-05, 'samples': 18293760, 'steps': 35729, 'batch_loss/train': 0.7125785937532783} 12/27/2021 16:07:12 - INFO - codeparrot_training - Step 35730: {'lr': 9.455045450434721e-05, 'samples': 18294272, 'steps': 35730, 'batch_loss/train': 0.8321837079711258} 12/27/2021 16:07:24 - INFO - codeparrot_training - Step 35731: {'lr': 9.453810948105e-05, 'samples': 18294784, 'steps': 35731, 'batch_loss/train': 0.7218844764865935} 12/27/2021 16:07:35 - INFO - codeparrot_training - Step 35732: {'lr': 9.452576507581037e-05, 'samples': 18295296, 'steps': 35732, 'batch_loss/train': 0.7024073214270175} 12/27/2021 16:07:45 - INFO - codeparrot_training - Step 35733: {'lr': 9.451342128867768e-05, 'samples': 18295808, 'steps': 35733, 'batch_loss/train': 0.7767377598211169} 12/27/2021 16:07:58 - INFO - codeparrot_training - Step 35734: {'lr': 9.450107811970101e-05, 'samples': 18296320, 'steps': 35734, 'batch_loss/train': 0.7785465810447931} 12/27/2021 16:08:08 - INFO - codeparrot_training - Step 35735: {'lr': 9.448873556892929e-05, 'samples': 18296832, 'steps': 35735, 'batch_loss/train': 0.7627498302608728} 12/27/2021 16:08:19 - INFO - codeparrot_training - Step 35736: {'lr': 9.44763936364115e-05, 'samples': 18297344, 'steps': 35736, 'batch_loss/train': 0.7221322765108198} 12/27/2021 16:08:33 - INFO - codeparrot_training - Step 35737: {'lr': 9.446405232219704e-05, 'samples': 18297856, 'steps': 35737, 'batch_loss/train': 0.7503625238314271} 12/27/2021 16:08:43 - INFO - 
codeparrot_training - Step 35738: {'lr': 9.44517116263347e-05, 'samples': 18298368, 'steps': 35738, 'batch_loss/train': 0.7871506931260228} 12/27/2021 16:08:54 - INFO - codeparrot_training - Step 35739: {'lr': 9.443937154887364e-05, 'samples': 18298880, 'steps': 35739, 'batch_loss/train': 0.7245736923068762} 12/27/2021 16:09:05 - INFO - codeparrot_training - Step 35740: {'lr': 9.442703208986287e-05, 'samples': 18299392, 'steps': 35740, 'batch_loss/train': 0.8231748500838876} 12/27/2021 16:09:17 - INFO - codeparrot_training - Step 35741: {'lr': 9.441469324935147e-05, 'samples': 18299904, 'steps': 35741, 'batch_loss/train': 0.7706712144427001} 12/27/2021 16:09:27 - INFO - codeparrot_training - Step 35742: {'lr': 9.440235502738853e-05, 'samples': 18300416, 'steps': 35742, 'batch_loss/train': 0.7810090528801084} 12/27/2021 16:09:38 - INFO - codeparrot_training - Step 35743: {'lr': 9.439001742402306e-05, 'samples': 18300928, 'steps': 35743, 'batch_loss/train': 0.6884624501690269} 12/27/2021 16:09:50 - INFO - codeparrot_training - Step 35744: {'lr': 9.437768043930414e-05, 'samples': 18301440, 'steps': 35744, 'batch_loss/train': 0.6265654810704291} 12/27/2021 16:10:01 - INFO - codeparrot_training - Step 35745: {'lr': 9.43653440732808e-05, 'samples': 18301952, 'steps': 35745, 'batch_loss/train': 0.7348963171243668} 12/27/2021 16:10:11 - INFO - codeparrot_training - Step 35746: {'lr': 9.435300832600213e-05, 'samples': 18302464, 'steps': 35746, 'batch_loss/train': 0.7207103059627116} 12/27/2021 16:10:24 - INFO - codeparrot_training - Step 35747: {'lr': 9.434067319751699e-05, 'samples': 18302976, 'steps': 35747, 'batch_loss/train': 0.7489693192765117} 12/27/2021 16:10:34 - INFO - codeparrot_training - Step 35748: {'lr': 9.432833868787474e-05, 'samples': 18303488, 'steps': 35748, 'batch_loss/train': 0.7758341196458787} 12/27/2021 16:10:45 - INFO - codeparrot_training - Step 35749: {'lr': 9.431600479712416e-05, 'samples': 18304000, 'steps': 35749, 'batch_loss/train': 
0.7305180067196488} 12/27/2021 16:10:59 - INFO - codeparrot_training - Step 35750: {'lr': 9.430367152531433e-05, 'samples': 18304512, 'steps': 35750, 'batch_loss/train': 0.6941888406872749} 12/27/2021 16:11:09 - INFO - codeparrot_training - Step 35751: {'lr': 9.429133887249434e-05, 'samples': 18305024, 'steps': 35751, 'batch_loss/train': 0.7528534047305584} 12/27/2021 16:11:20 - INFO - codeparrot_training - Step 35752: {'lr': 9.427900683871316e-05, 'samples': 18305536, 'steps': 35752, 'batch_loss/train': 0.7257074825465679} 12/27/2021 16:11:31 - INFO - codeparrot_training - Step 35753: {'lr': 9.426667542401988e-05, 'samples': 18306048, 'steps': 35753, 'batch_loss/train': 0.6948005063459277} 12/27/2021 16:11:43 - INFO - codeparrot_training - Step 35754: {'lr': 9.425434462846347e-05, 'samples': 18306560, 'steps': 35754, 'batch_loss/train': 0.7867658026516438} 12/27/2021 16:11:53 - INFO - codeparrot_training - Step 35755: {'lr': 9.424201445209299e-05, 'samples': 18307072, 'steps': 35755, 'batch_loss/train': 0.70413310918957} 12/27/2021 16:12:04 - INFO - codeparrot_training - Step 35756: {'lr': 9.422968489495745e-05, 'samples': 18307584, 'steps': 35756, 'batch_loss/train': 0.801765319891274} 12/27/2021 16:12:18 - INFO - codeparrot_training - Step 35757: {'lr': 9.421735595710595e-05, 'samples': 18308096, 'steps': 35757, 'batch_loss/train': 0.733747205697} 12/27/2021 16:12:29 - INFO - codeparrot_training - Step 35758: {'lr': 9.420502763858727e-05, 'samples': 18308608, 'steps': 35758, 'batch_loss/train': 0.7013317355886102} 12/27/2021 16:12:40 - INFO - codeparrot_training - Step 35759: {'lr': 9.419269993945065e-05, 'samples': 18309120, 'steps': 35759, 'batch_loss/train': 0.7695302513893694} 12/27/2021 16:12:52 - INFO - codeparrot_training - Step 35760: {'lr': 9.41803728597451e-05, 'samples': 18309632, 'steps': 35760, 'batch_loss/train': 0.7366391592659056} 12/27/2021 16:13:02 - INFO - codeparrot_training - Step 35761: {'lr': 9.416804639951948e-05, 'samples': 18310144, 
'steps': 35761, 'batch_loss/train': 0.7417934057302773} 12/27/2021 16:13:13 - INFO - codeparrot_training - Step 35762: {'lr': 9.415572055882277e-05, 'samples': 18310656, 'steps': 35762, 'batch_loss/train': 0.7201810032129288} 12/27/2021 16:13:25 - INFO - codeparrot_training - Step 35763: {'lr': 9.414339533770425e-05, 'samples': 18311168, 'steps': 35763, 'batch_loss/train': 0.6859910096973181} 12/27/2021 16:13:36 - INFO - codeparrot_training - Step 35764: {'lr': 9.413107073621263e-05, 'samples': 18311680, 'steps': 35764, 'batch_loss/train': 0.7138651246204972} 12/27/2021 16:13:46 - INFO - codeparrot_training - Step 35765: {'lr': 9.411874675439706e-05, 'samples': 18312192, 'steps': 35765, 'batch_loss/train': 0.8002516347914934} 12/27/2021 16:13:57 - INFO - codeparrot_training - Step 35766: {'lr': 9.410642339230647e-05, 'samples': 18312704, 'steps': 35766, 'batch_loss/train': 0.7573810154572129} 12/27/2021 16:14:11 - INFO - codeparrot_training - Step 35767: {'lr': 9.409410064998986e-05, 'samples': 18313216, 'steps': 35767, 'batch_loss/train': 0.7561194174923003} 12/27/2021 16:14:21 - INFO - codeparrot_training - Step 35768: {'lr': 9.408177852749628e-05, 'samples': 18313728, 'steps': 35768, 'batch_loss/train': 0.7610154435969889} 12/27/2021 16:14:32 - INFO - codeparrot_training - Step 35769: {'lr': 9.406945702487466e-05, 'samples': 18314240, 'steps': 35769, 'batch_loss/train': 0.8011788409203291} 12/27/2021 16:14:44 - INFO - codeparrot_training - Step 35770: {'lr': 9.405713614217399e-05, 'samples': 18314752, 'steps': 35770, 'batch_loss/train': 0.7190556833520532} 12/27/2021 16:14:55 - INFO - codeparrot_training - Step 35771: {'lr': 9.404481587944325e-05, 'samples': 18315264, 'steps': 35771, 'batch_loss/train': 0.6102609654190019} 12/27/2021 16:15:05 - INFO - codeparrot_training - Step 35772: {'lr': 9.403249623673154e-05, 'samples': 18315776, 'steps': 35772, 'batch_loss/train': 0.7217534468509257} 12/27/2021 16:15:17 - INFO - codeparrot_training - Step 35773: {'lr': 
9.402017721408756e-05, 'samples': 18316288, 'steps': 35773, 'batch_loss/train': 0.6967091779224575} 12/27/2021 16:15:28 - INFO - codeparrot_training - Step 35774: {'lr': 9.400785881156065e-05, 'samples': 18316800, 'steps': 35774, 'batch_loss/train': 0.8271795678883791} 12/27/2021 16:15:39 - INFO - codeparrot_training - Step 35775: {'lr': 9.399554102919949e-05, 'samples': 18317312, 'steps': 35775, 'batch_loss/train': 0.6656002919189632} 12/27/2021 16:15:49 - INFO - codeparrot_training - Step 35776: {'lr': 9.398322386705316e-05, 'samples': 18317824, 'steps': 35776, 'batch_loss/train': 0.6768188446294516} 12/27/2021 16:16:04 - INFO - codeparrot_training - Step 35777: {'lr': 9.397090732517064e-05, 'samples': 18318336, 'steps': 35777, 'batch_loss/train': 0.7030935598886572} 12/27/2021 16:16:14 - INFO - codeparrot_training - Step 35778: {'lr': 9.395859140360085e-05, 'samples': 18318848, 'steps': 35778, 'batch_loss/train': 0.7597207406070083} 12/27/2021 16:16:25 - INFO - codeparrot_training - Step 35779: {'lr': 9.394627610239279e-05, 'samples': 18319360, 'steps': 35779, 'batch_loss/train': 0.5046744914725423} 12/27/2021 16:16:37 - INFO - codeparrot_training - Step 35780: {'lr': 9.393396142159541e-05, 'samples': 18319872, 'steps': 35780, 'batch_loss/train': 0.757455793209374} 12/27/2021 16:16:48 - INFO - codeparrot_training - Step 35781: {'lr': 9.392164736125766e-05, 'samples': 18320384, 'steps': 35781, 'batch_loss/train': 0.7208975707180798} 12/27/2021 16:16:59 - INFO - codeparrot_training - Step 35782: {'lr': 9.390933392142852e-05, 'samples': 18320896, 'steps': 35782, 'batch_loss/train': 0.793000491335988} 12/27/2021 16:17:11 - INFO - codeparrot_training - Step 35783: {'lr': 9.389702110215704e-05, 'samples': 18321408, 'steps': 35783, 'batch_loss/train': 0.7782774735242128} 12/27/2021 16:17:22 - INFO - codeparrot_training - Step 35784: {'lr': 9.388470890349185e-05, 'samples': 18321920, 'steps': 35784, 'batch_loss/train': 0.8334526792168617} 12/27/2021 16:17:32 - INFO - 
codeparrot_training - Step 35785: {'lr': 9.387239732548222e-05, 'samples': 18322432, 'steps': 35785, 'batch_loss/train': 0.6838219109922647} 12/27/2021 16:17:43 - INFO - codeparrot_training - Step 35786: {'lr': 9.386008636817708e-05, 'samples': 18322944, 'steps': 35786, 'batch_loss/train': 0.7223830316215754} 12/27/2021 16:17:55 - INFO - codeparrot_training - Step 35787: {'lr': 9.384777603162518e-05, 'samples': 18323456, 'steps': 35787, 'batch_loss/train': 0.6918417797423899} 12/27/2021 16:18:06 - INFO - codeparrot_training - Step 35788: {'lr': 9.383546631587548e-05, 'samples': 18323968, 'steps': 35788, 'batch_loss/train': 0.6688162302598357} 12/27/2021 16:18:16 - INFO - codeparrot_training - Step 35789: {'lr': 9.382315722097715e-05, 'samples': 18324480, 'steps': 35789, 'batch_loss/train': 0.8242836077697575} 12/27/2021 16:18:30 - INFO - codeparrot_training - Step 35790: {'lr': 9.381084874697893e-05, 'samples': 18324992, 'steps': 35790, 'batch_loss/train': 0.7605531851295382} 12/27/2021 16:18:41 - INFO - codeparrot_training - Step 35791: {'lr': 9.37985408939297e-05, 'samples': 18325504, 'steps': 35791, 'batch_loss/train': 0.784853438148275} 12/27/2021 16:18:51 - INFO - codeparrot_training - Step 35792: {'lr': 9.378623366187864e-05, 'samples': 18326016, 'steps': 35792, 'batch_loss/train': 0.7782832938246429} 12/27/2021 16:19:03 - INFO - codeparrot_training - Step 35793: {'lr': 9.377392705087445e-05, 'samples': 18326528, 'steps': 35793, 'batch_loss/train': 0.7576638618484139} 12/27/2021 16:19:14 - INFO - codeparrot_training - Step 35794: {'lr': 9.376162106096616e-05, 'samples': 18327040, 'steps': 35794, 'batch_loss/train': 0.649323474499397} 12/27/2021 16:19:25 - INFO - codeparrot_training - Step 35795: {'lr': 9.374931569220266e-05, 'samples': 18327552, 'steps': 35795, 'batch_loss/train': 0.7502221141476184} 12/27/2021 16:19:35 - INFO - codeparrot_training - Step 35796: {'lr': 9.373701094463288e-05, 'samples': 18328064, 'steps': 35796, 'batch_loss/train': 
0.6958383927121758} 12/27/2021 16:19:49 - INFO - codeparrot_training - Step 35797: {'lr': 9.372470681830575e-05, 'samples': 18328576, 'steps': 35797, 'batch_loss/train': 0.8916422501206398} 12/27/2021 16:20:00 - INFO - codeparrot_training - Step 35798: {'lr': 9.371240331327027e-05, 'samples': 18329088, 'steps': 35798, 'batch_loss/train': 0.7836667564697564} 12/27/2021 16:20:10 - INFO - codeparrot_training - Step 35799: {'lr': 9.370010042957508e-05, 'samples': 18329600, 'steps': 35799, 'batch_loss/train': 0.7329276650561951} 12/27/2021 16:20:22 - INFO - codeparrot_training - Step 35800: {'lr': 9.368779816726935e-05, 'samples': 18330112, 'steps': 35800, 'batch_loss/train': 0.6720064125256613} 12/27/2021 16:20:33 - INFO - codeparrot_training - Step 35801: {'lr': 9.367549652640203e-05, 'samples': 18330624, 'steps': 35801, 'batch_loss/train': 0.7447382835671306} 12/27/2021 16:20:44 - INFO - codeparrot_training - Step 35802: {'lr': 9.366319550702181e-05, 'samples': 18331136, 'steps': 35802, 'batch_loss/train': 0.7116678198799491} 12/27/2021 16:20:56 - INFO - codeparrot_training - Step 35803: {'lr': 9.365089510917768e-05, 'samples': 18331648, 'steps': 35803, 'batch_loss/train': 0.7227783277630806} 12/27/2021 16:21:07 - INFO - codeparrot_training - Step 35804: {'lr': 9.363859533291857e-05, 'samples': 18332160, 'steps': 35804, 'batch_loss/train': 1.049550631083548} 12/27/2021 16:21:17 - INFO - codeparrot_training - Step 35805: {'lr': 9.362629617829336e-05, 'samples': 18332672, 'steps': 35805, 'batch_loss/train': 0.7391338367015123} 12/27/2021 16:21:28 - INFO - codeparrot_training - Step 35806: {'lr': 9.361399764535095e-05, 'samples': 18333184, 'steps': 35806, 'batch_loss/train': 0.7758481809869409} 12/27/2021 16:21:42 - INFO - codeparrot_training - Step 35807: {'lr': 9.360169973414023e-05, 'samples': 18333696, 'steps': 35807, 'batch_loss/train': 0.7499906283337623} 12/27/2021 16:21:52 - INFO - codeparrot_training - Step 35808: {'lr': 9.358940244471013e-05, 'samples': 
18334208, 'steps': 35808, 'batch_loss/train': 0.6931192786432803} 12/27/2021 16:22:03 - INFO - codeparrot_training - Step 35809: {'lr': 9.357710577710957e-05, 'samples': 18334720, 'steps': 35809, 'batch_loss/train': 0.7369458386674523} 12/27/2021 16:22:15 - INFO - codeparrot_training - Step 35810: {'lr': 9.356480973138721e-05, 'samples': 18335232, 'steps': 35810, 'batch_loss/train': 0.6766181541606784} 12/27/2021 16:22:26 - INFO - codeparrot_training - Step 35811: {'lr': 9.355251430759218e-05, 'samples': 18335744, 'steps': 35811, 'batch_loss/train': 0.7417276652995497} 12/27/2021 16:22:36 - INFO - codeparrot_training - Step 35812: {'lr': 9.354021950577336e-05, 'samples': 18336256, 'steps': 35812, 'batch_loss/train': 0.7773652719333768} 12/27/2021 16:22:48 - INFO - codeparrot_training - Step 35813: {'lr': 9.352792532597946e-05, 'samples': 18336768, 'steps': 35813, 'batch_loss/train': 0.7261496202554554} 12/27/2021 16:22:59 - INFO - codeparrot_training - Step 35814: {'lr': 9.351563176825936e-05, 'samples': 18337280, 'steps': 35814, 'batch_loss/train': 0.7358273607678711} 12/27/2021 16:23:10 - INFO - codeparrot_training - Step 35815: {'lr': 9.350333883266221e-05, 'samples': 18337792, 'steps': 35815, 'batch_loss/train': 0.6997850146144629} 12/27/2021 16:23:22 - INFO - codeparrot_training - Step 35816: {'lr': 9.349104651923663e-05, 'samples': 18338304, 'steps': 35816, 'batch_loss/train': 0.5819818017189391} 12/27/2021 16:23:32 - INFO - codeparrot_training - Step 35817: {'lr': 9.347875482803142e-05, 'samples': 18338816, 'steps': 35817, 'batch_loss/train': 0.6846872041933239} 12/27/2021 16:23:43 - INFO - codeparrot_training - Step 35818: {'lr': 9.346646375909576e-05, 'samples': 18339328, 'steps': 35818, 'batch_loss/train': 0.756682145409286} 12/27/2021 16:23:54 - INFO - codeparrot_training - Step 35819: {'lr': 9.345417331247826e-05, 'samples': 18339840, 'steps': 35819, 'batch_loss/train': 0.7152370215044357} 12/27/2021 16:24:08 - INFO - codeparrot_training - Step 35820: 
{'lr': 9.344188348822788e-05, 'samples': 18340352, 'steps': 35820, 'batch_loss/train': 0.7101870211772621} 12/27/2021 16:24:18 - INFO - codeparrot_training - Step 35821: {'lr': 9.342959428639344e-05, 'samples': 18340864, 'steps': 35821, 'batch_loss/train': 0.7052617170847952} 12/27/2021 16:24:29 - INFO - codeparrot_training - Step 35822: {'lr': 9.341730570702381e-05, 'samples': 18341376, 'steps': 35822, 'batch_loss/train': 0.8524156007915735} 12/27/2021 16:24:41 - INFO - codeparrot_training - Step 35823: {'lr': 9.340501775016785e-05, 'samples': 18341888, 'steps': 35823, 'batch_loss/train': 0.7085746815428138} 12/27/2021 16:24:52 - INFO - codeparrot_training - Step 35824: {'lr': 9.339273041587449e-05, 'samples': 18342400, 'steps': 35824, 'batch_loss/train': 0.7309046788141131} 12/27/2021 16:25:03 - INFO - codeparrot_training - Step 35825: {'lr': 9.338044370419233e-05, 'samples': 18342912, 'steps': 35825, 'batch_loss/train': 0.6905115297995508} 12/27/2021 16:25:16 - INFO - codeparrot_training - Step 35826: {'lr': 9.336815761517049e-05, 'samples': 18343424, 'steps': 35826, 'batch_loss/train': 0.8695872854441404} 12/27/2021 16:25:27 - INFO - codeparrot_training - Step 35827: {'lr': 9.335587214885779e-05, 'samples': 18343936, 'steps': 35827, 'batch_loss/train': 0.7400167901068926} 12/27/2021 16:25:38 - INFO - codeparrot_training - Step 35828: {'lr': 9.334358730530282e-05, 'samples': 18344448, 'steps': 35828, 'batch_loss/train': 0.6561607653275132} 12/27/2021 16:25:48 - INFO - codeparrot_training - Step 35829: {'lr': 9.333130308455479e-05, 'samples': 18344960, 'steps': 35829, 'batch_loss/train': 0.693413860630244} 12/27/2021 16:26:00 - INFO - codeparrot_training - Step 35830: {'lr': 9.331901948666222e-05, 'samples': 18345472, 'steps': 35830, 'batch_loss/train': 0.7794595463201404} 12/27/2021 16:26:11 - INFO - codeparrot_training - Step 35831: {'lr': 9.33067365116741e-05, 'samples': 18345984, 'steps': 35831, 'batch_loss/train': 0.7530205585062504} 12/27/2021 16:26:22 - 
INFO - codeparrot_training - Step 35832: {'lr': 9.329445415963922e-05, 'samples': 18346496, 'steps': 35832, 'batch_loss/train': 0.7926337849348783} 12/27/2021 16:26:34 - INFO - codeparrot_training - Step 35833: {'lr': 9.32821724306064e-05, 'samples': 18347008, 'steps': 35833, 'batch_loss/train': 0.6939398320391774} 12/27/2021 16:26:44 - INFO - codeparrot_training - Step 35834: {'lr': 9.326989132462452e-05, 'samples': 18347520, 'steps': 35834, 'batch_loss/train': 0.8237485978752375} 12/27/2021 16:26:55 - INFO - codeparrot_training - Step 35835: {'lr': 9.325761084174236e-05, 'samples': 18348032, 'steps': 35835, 'batch_loss/train': 0.7494704155251384} 12/27/2021 16:27:09 - INFO - codeparrot_training - Step 35836: {'lr': 9.324533098200877e-05, 'samples': 18348544, 'steps': 35836, 'batch_loss/train': 0.9015398491173983} 12/27/2021 16:27:20 - INFO - codeparrot_training - Step 35837: {'lr': 9.323305174547252e-05, 'samples': 18349056, 'steps': 35837, 'batch_loss/train': 1.5301216691732407} 12/27/2021 16:27:30 - INFO - codeparrot_training - Step 35838: {'lr': 9.322077313218257e-05, 'samples': 18349568, 'steps': 35838, 'batch_loss/train': 1.5688369646668434} 12/27/2021 16:27:41 - INFO - codeparrot_training - Step 35839: {'lr': 9.320849514218754e-05, 'samples': 18350080, 'steps': 35839, 'batch_loss/train': 0.6002498932357412} 12/27/2021 16:27:53 - INFO - codeparrot_training - Step 35840: {'lr': 9.319621777553627e-05, 'samples': 18350592, 'steps': 35840, 'batch_loss/train': 0.6517151952721179} 12/27/2021 16:28:04 - INFO - codeparrot_training - Step 35841: {'lr': 9.318394103227778e-05, 'samples': 18351104, 'steps': 35841, 'batch_loss/train': 0.7479615386109799} 12/27/2021 16:28:14 - INFO - codeparrot_training - Step 35842: {'lr': 9.317166491246063e-05, 'samples': 18351616, 'steps': 35842, 'batch_loss/train': 0.7475185105577111} 12/27/2021 16:28:28 - INFO - codeparrot_training - Step 35843: {'lr': 9.315938941613366e-05, 'samples': 18352128, 'steps': 35843, 'batch_loss/train': 
0.748023824300617} 12/27/2021 16:28:39 - INFO - codeparrot_training - Step 35844: {'lr': 9.31471145433459e-05, 'samples': 18352640, 'steps': 35844, 'batch_loss/train': 0.7708167957607657} 12/27/2021 16:28:49 - INFO - codeparrot_training - Step 35845: {'lr': 9.31348402941459e-05, 'samples': 18353152, 'steps': 35845, 'batch_loss/train': 0.7558589312247932} 12/27/2021 16:29:01 - INFO - codeparrot_training - Step 35846: {'lr': 9.312256666858255e-05, 'samples': 18353664, 'steps': 35846, 'batch_loss/train': 0.7562348362989724} 12/27/2021 16:29:12 - INFO - codeparrot_training - Step 35847: {'lr': 9.311029366670465e-05, 'samples': 18354176, 'steps': 35847, 'batch_loss/train': 0.7475359681993723} 12/27/2021 16:29:23 - INFO - codeparrot_training - Step 35848: {'lr': 9.309802128856096e-05, 'samples': 18354688, 'steps': 35848, 'batch_loss/train': 0.8199622317915782} 12/27/2021 16:29:33 - INFO - codeparrot_training - Step 35849: {'lr': 9.308574953420033e-05, 'samples': 18355200, 'steps': 35849, 'batch_loss/train': 0.6760741099715233} 12/27/2021 16:29:45 - INFO - codeparrot_training - Step 35850: {'lr': 9.307347840367156e-05, 'samples': 18355712, 'steps': 35850, 'batch_loss/train': 0.6332513664383441} 12/27/2021 16:29:56 - INFO - codeparrot_training - Step 35851: {'lr': 9.306120789702321e-05, 'samples': 18356224, 'steps': 35851, 'batch_loss/train': 0.7729025520384312} 12/27/2021 16:30:07 - INFO - codeparrot_training - Step 35852: {'lr': 9.304893801430437e-05, 'samples': 18356736, 'steps': 35852, 'batch_loss/train': 0.6770806439453736} 12/27/2021 16:30:21 - INFO - codeparrot_training - Step 35853: {'lr': 9.303666875556374e-05, 'samples': 18357248, 'steps': 35853, 'batch_loss/train': 0.7360367998480797} 12/27/2021 16:30:31 - INFO - codeparrot_training - Step 35854: {'lr': 9.302440012084987e-05, 'samples': 18357760, 'steps': 35854, 'batch_loss/train': 0.7073702001944184} 12/27/2021 16:30:42 - INFO - codeparrot_training - Step 35855: {'lr': 9.30121321102119e-05, 'samples': 18358272, 
'steps': 35855, 'batch_loss/train': 0.7397132758051157} 12/27/2021 16:30:54 - INFO - codeparrot_training - Step 35856: {'lr': 9.299986472369834e-05, 'samples': 18358784, 'steps': 35856, 'batch_loss/train': 0.6502852017292753} 12/27/2021 16:31:05 - INFO - codeparrot_training - Step 35857: {'lr': 9.298759796135802e-05, 'samples': 18359296, 'steps': 35857, 'batch_loss/train': 0.6013147972989827} 12/27/2021 16:31:15 - INFO - codeparrot_training - Step 35858: {'lr': 9.297533182323975e-05, 'samples': 18359808, 'steps': 35858, 'batch_loss/train': 0.7420669449493289} 12/27/2021 16:31:26 - INFO - codeparrot_training - Step 35859: {'lr': 9.296306630939225e-05, 'samples': 18360320, 'steps': 35859, 'batch_loss/train': 0.7413587113842368} 12/27/2021 16:31:38 - INFO - codeparrot_training - Step 35860: {'lr': 9.29508014198643e-05, 'samples': 18360832, 'steps': 35860, 'batch_loss/train': 0.6747321132570505} 12/27/2021 16:31:49 - INFO - codeparrot_training - Step 35861: {'lr': 9.293853715470465e-05, 'samples': 18361344, 'steps': 35861, 'batch_loss/train': 0.7697358718141913} 12/27/2021 16:32:00 - INFO - codeparrot_training - Step 35862: {'lr': 9.29262735139621e-05, 'samples': 18361856, 'steps': 35862, 'batch_loss/train': 0.7621150942286476} 12/27/2021 16:32:12 - INFO - codeparrot_training - Step 35863: {'lr': 9.291401049768539e-05, 'samples': 18362368, 'steps': 35863, 'batch_loss/train': 0.6827063788077794} 12/27/2021 16:32:22 - INFO - codeparrot_training - Step 35864: {'lr': 9.29017481059233e-05, 'samples': 18362880, 'steps': 35864, 'batch_loss/train': 0.7339436176698655} 12/27/2021 16:32:33 - INFO - codeparrot_training - Step 35865: {'lr': 9.28894863387244e-05, 'samples': 18363392, 'steps': 35865, 'batch_loss/train': 0.6288365484215319} 12/27/2021 16:32:47 - INFO - codeparrot_training - Step 35866: {'lr': 9.287722519613767e-05, 'samples': 18363904, 'steps': 35866, 'batch_loss/train': 0.7213640442350879} 12/27/2021 16:32:57 - INFO - codeparrot_training - Step 35867: {'lr': 
9.286496467821185e-05, 'samples': 18364416, 'steps': 35867, 'batch_loss/train': 0.7648705458268523} 12/27/2021 16:33:08 - INFO - codeparrot_training - Step 35868: {'lr': 9.285270478499553e-05, 'samples': 18364928, 'steps': 35868, 'batch_loss/train': 0.6633357927203178} 12/27/2021 16:33:19 - INFO - codeparrot_training - Step 35869: {'lr': 9.28404455165374e-05, 'samples': 18365440, 'steps': 35869, 'batch_loss/train': 0.7307779686525464} 12/27/2021 16:33:31 - INFO - codeparrot_training - Step 35870: {'lr': 9.28281868728865e-05, 'samples': 18365952, 'steps': 35870, 'batch_loss/train': 0.7786844093352556} 12/27/2021 16:33:41 - INFO - codeparrot_training - Step 35871: {'lr': 9.28159288540913e-05, 'samples': 18366464, 'steps': 35871, 'batch_loss/train': 0.593412114190869} 12/27/2021 16:33:52 - INFO - codeparrot_training - Step 35872: {'lr': 9.28036714602006e-05, 'samples': 18366976, 'steps': 35872, 'batch_loss/train': 0.6674199920962565} 12/27/2021 16:34:06 - INFO - codeparrot_training - Step 35873: {'lr': 9.279141469126315e-05, 'samples': 18367488, 'steps': 35873, 'batch_loss/train': 0.6402705167420208} 12/27/2021 16:34:17 - INFO - codeparrot_training - Step 35874: {'lr': 9.277915854732769e-05, 'samples': 18368000, 'steps': 35874, 'batch_loss/train': 0.5920178948435932} 12/27/2021 16:34:27 - INFO - codeparrot_training - Step 35875: {'lr': 9.276690302844293e-05, 'samples': 18368512, 'steps': 35875, 'batch_loss/train': 0.7177672091638669} 12/27/2021 16:34:39 - INFO - codeparrot_training - Step 35876: {'lr': 9.275464813465767e-05, 'samples': 18369024, 'steps': 35876, 'batch_loss/train': 0.765704121440649} 12/27/2021 16:34:50 - INFO - codeparrot_training - Step 35877: {'lr': 9.27423938660204e-05, 'samples': 18369536, 'steps': 35877, 'batch_loss/train': 0.6696891712490469} 12/27/2021 16:35:00 - INFO - codeparrot_training - Step 35878: {'lr': 9.273014022258009e-05, 'samples': 18370048, 'steps': 35878, 'batch_loss/train': 0.713665580842644} 12/27/2021 16:35:11 - INFO - 
codeparrot_training - Step 35879: {'lr': 9.27178872043854e-05, 'samples': 18370560, 'steps': 35879, 'batch_loss/train': 0.6946148288552649} 12/27/2021 16:35:23 - INFO - codeparrot_training - Step 35880: {'lr': 9.270563481148486e-05, 'samples': 18371072, 'steps': 35880, 'batch_loss/train': 0.7894155206158757} 12/27/2021 16:35:34 - INFO - codeparrot_training - Step 35881: {'lr': 9.269338304392742e-05, 'samples': 18371584, 'steps': 35881, 'batch_loss/train': 0.8375609591603279} 12/27/2021 16:35:44 - INFO - codeparrot_training - Step 35882: {'lr': 9.268113190176175e-05, 'samples': 18372096, 'steps': 35882, 'batch_loss/train': 0.7605648674070835} 12/27/2021 16:35:58 - INFO - codeparrot_training - Step 35883: {'lr': 9.266888138503644e-05, 'samples': 18372608, 'steps': 35883, 'batch_loss/train': 0.8028767956420779} 12/27/2021 16:36:09 - INFO - codeparrot_training - Step 35884: {'lr': 9.265663149380025e-05, 'samples': 18373120, 'steps': 35884, 'batch_loss/train': 0.7379666336346418} 12/27/2021 16:36:19 - INFO - codeparrot_training - Step 35885: {'lr': 9.264438222810187e-05, 'samples': 18373632, 'steps': 35885, 'batch_loss/train': 0.7334366487339139} 12/27/2021 16:36:32 - INFO - codeparrot_training - Step 35886: {'lr': 9.263213358799e-05, 'samples': 18374144, 'steps': 35886, 'batch_loss/train': 0.7612590561620891} 12/27/2021 16:36:42 - INFO - codeparrot_training - Step 35887: {'lr': 9.261988557351337e-05, 'samples': 18374656, 'steps': 35887, 'batch_loss/train': 0.6744334003888071} 12/27/2021 16:36:53 - INFO - codeparrot_training - Step 35888: {'lr': 9.260763818472065e-05, 'samples': 18375168, 'steps': 35888, 'batch_loss/train': 0.7528471513651311} 12/27/2021 16:37:04 - INFO - codeparrot_training - Step 35889: {'lr': 9.259539142166051e-05, 'samples': 18375680, 'steps': 35889, 'batch_loss/train': 0.7847860837355256} 12/27/2021 16:37:16 - INFO - codeparrot_training - Step 35890: {'lr': 9.258314528438167e-05, 'samples': 18376192, 'steps': 35890, 'batch_loss/train': 
0.6076535894535482} 12/27/2021 16:37:27 - INFO - codeparrot_training - Step 35891: {'lr': 9.257089977293281e-05, 'samples': 18376704, 'steps': 35891, 'batch_loss/train': 0.7064728597179055} 12/27/2021 16:37:37 - INFO - codeparrot_training - Step 35892: {'lr': 9.255865488736262e-05, 'samples': 18377216, 'steps': 35892, 'batch_loss/train': 0.7051107650622725} 12/27/2021 16:37:49 - INFO - codeparrot_training - Step 35893: {'lr': 9.254641062771985e-05, 'samples': 18377728, 'steps': 35893, 'batch_loss/train': 0.948260584846139} 12/27/2021 16:38:00 - INFO - codeparrot_training - Step 35894: {'lr': 9.253416699405298e-05, 'samples': 18378240, 'steps': 35894, 'batch_loss/train': 0.8976905979216099} 12/27/2021 16:38:11 - INFO - codeparrot_training - Step 35895: {'lr': 9.252192398641077e-05, 'samples': 18378752, 'steps': 35895, 'batch_loss/train': 0.6710013290867209} 12/27/2021 16:38:21 - INFO - codeparrot_training - Step 35896: {'lr': 9.250968160484207e-05, 'samples': 18379264, 'steps': 35896, 'batch_loss/train': 0.7484855153597891} 12/27/2021 16:38:35 - INFO - codeparrot_training - Step 35897: {'lr': 9.249743984939532e-05, 'samples': 18379776, 'steps': 35897, 'batch_loss/train': 0.766696116887033} 12/27/2021 16:38:46 - INFO - codeparrot_training - Step 35898: {'lr': 9.248519872011929e-05, 'samples': 18380288, 'steps': 35898, 'batch_loss/train': 0.7365012695081532} 12/27/2021 16:38:56 - INFO - codeparrot_training - Step 35899: {'lr': 9.247295821706264e-05, 'samples': 18380800, 'steps': 35899, 'batch_loss/train': 0.7461881805211306} 12/27/2021 16:39:09 - INFO - codeparrot_training - Step 35900: {'lr': 9.246071834027403e-05, 'samples': 18381312, 'steps': 35900, 'batch_loss/train': 0.7867414015345275} 12/27/2021 16:39:20 - INFO - codeparrot_training - Step 35901: {'lr': 9.244847908980211e-05, 'samples': 18381824, 'steps': 35901, 'batch_loss/train': 0.8200446469709277} 12/27/2021 16:39:30 - INFO - codeparrot_training - Step 35902: {'lr': 9.243624046569554e-05, 'samples': 
18382336, 'steps': 35902, 'batch_loss/train': 0.7145976321771741} 12/27/2021 16:39:41 - INFO - codeparrot_training - Step 35903: {'lr': 9.242400246800301e-05, 'samples': 18382848, 'steps': 35903, 'batch_loss/train': 1.658890904393047} 12/27/2021 16:39:55 - INFO - codeparrot_training - Step 35904: {'lr': 9.241176509677312e-05, 'samples': 18383360, 'steps': 35904, 'batch_loss/train': 1.4847643906250596} 12/27/2021 16:40:05 - INFO - codeparrot_training - Step 35905: {'lr': 9.239952835205467e-05, 'samples': 18383872, 'steps': 35905, 'batch_loss/train': 0.7459409926086664} 12/27/2021 16:40:16 - INFO - codeparrot_training - Step 35906: {'lr': 9.2387292233896e-05, 'samples': 18384384, 'steps': 35906, 'batch_loss/train': 0.7347575132735074} 12/27/2021 16:40:28 - INFO - codeparrot_training - Step 35907: {'lr': 9.2375056742346e-05, 'samples': 18384896, 'steps': 35907, 'batch_loss/train': 0.7921801237389445} 12/27/2021 16:40:39 - INFO - codeparrot_training - Step 35908: {'lr': 9.236282187745338e-05, 'samples': 18385408, 'steps': 35908, 'batch_loss/train': 0.6351915304549038} 12/27/2021 16:40:49 - INFO - codeparrot_training - Step 35909: {'lr': 9.235058763926655e-05, 'samples': 18385920, 'steps': 35909, 'batch_loss/train': 0.7195815045270137} 12/27/2021 16:41:02 - INFO - codeparrot_training - Step 35910: {'lr': 9.233835402783428e-05, 'samples': 18386432, 'steps': 35910, 'batch_loss/train': 0.671236539259553} 12/27/2021 16:41:12 - INFO - codeparrot_training - Step 35911: {'lr': 9.232612104320517e-05, 'samples': 18386944, 'steps': 35911, 'batch_loss/train': 0.760096506215632} 12/27/2021 16:41:23 - INFO - codeparrot_training - Step 35912: {'lr': 9.231388868542787e-05, 'samples': 18387456, 'steps': 35912, 'batch_loss/train': 0.7366385757923126} 12/27/2021 16:41:33 - INFO - codeparrot_training - Step 35913: {'lr': 9.2301656954551e-05, 'samples': 18387968, 'steps': 35913, 'batch_loss/train': 0.8343052100390196} 12/27/2021 16:41:47 - INFO - codeparrot_training - Step 35914: {'lr': 
9.22894258506232e-05, 'samples': 18388480, 'steps': 35914, 'batch_loss/train': 0.6924832430668175} 12/27/2021 16:41:58 - INFO - codeparrot_training - Step 35915: {'lr': 9.227719537369312e-05, 'samples': 18388992, 'steps': 35915, 'batch_loss/train': 0.727765423245728} 12/27/2021 16:42:09 - INFO - codeparrot_training - Step 35916: {'lr': 9.226496552380931e-05, 'samples': 18389504, 'steps': 35916, 'batch_loss/train': 0.7252728843595833} 12/27/2021 16:42:21 - INFO - codeparrot_training - Step 35917: {'lr': 9.225273630102046e-05, 'samples': 18390016, 'steps': 35917, 'batch_loss/train': 0.7251926630269736} 12/27/2021 16:42:31 - INFO - codeparrot_training - Step 35918: {'lr': 9.224050770537518e-05, 'samples': 18390528, 'steps': 35918, 'batch_loss/train': 0.7350163431838155} 12/27/2021 16:42:42 - INFO - codeparrot_training - Step 35919: {'lr': 9.222827973692216e-05, 'samples': 18391040, 'steps': 35919, 'batch_loss/train': 1.0767214216757566} 12/27/2021 16:42:56 - INFO - codeparrot_training - Step 35920: {'lr': 9.221605239570983e-05, 'samples': 18391552, 'steps': 35920, 'batch_loss/train': 0.7670302111655474} 12/27/2021 16:43:07 - INFO - codeparrot_training - Step 35921: {'lr': 9.220382568178682e-05, 'samples': 18392064, 'steps': 35921, 'batch_loss/train': 0.8578040767461061} 12/27/2021 16:43:17 - INFO - codeparrot_training - Step 35922: {'lr': 9.219159959520199e-05, 'samples': 18392576, 'steps': 35922, 'batch_loss/train': 0.7169256471097469} 12/27/2021 16:43:29 - INFO - codeparrot_training - Step 35923: {'lr': 9.217937413600366e-05, 'samples': 18393088, 'steps': 35923, 'batch_loss/train': 0.6693762438371778} 12/27/2021 16:43:40 - INFO - codeparrot_training - Step 35924: {'lr': 9.216714930424059e-05, 'samples': 18393600, 'steps': 35924, 'batch_loss/train': 0.7493136930279434} 12/27/2021 16:43:51 - INFO - codeparrot_training - Step 35925: {'lr': 9.215492509996132e-05, 'samples': 18394112, 'steps': 35925, 'batch_loss/train': 0.7424638569355011} 12/27/2021 16:44:01 - INFO - 
codeparrot_training - Step 35926: {'lr': 9.214270152321444e-05, 'samples': 18394624, 'steps': 35926, 'batch_loss/train': 0.6782142445445061} 12/27/2021 16:44:13 - INFO - codeparrot_training - Step 35927: {'lr': 9.213047857404861e-05, 'samples': 18395136, 'steps': 35927, 'batch_loss/train': 0.6442885841242969} 12/27/2021 16:44:24 - INFO - codeparrot_training - Step 35928: {'lr': 9.211825625251239e-05, 'samples': 18395648, 'steps': 35928, 'batch_loss/train': 0.6508145919069648} 12/27/2021 16:44:35 - INFO - codeparrot_training - Step 35929: {'lr': 9.210603455865435e-05, 'samples': 18396160, 'steps': 35929, 'batch_loss/train': 0.70736145315459} 12/27/2021 16:44:49 - INFO - codeparrot_training - Step 35930: {'lr': 9.20938134925231e-05, 'samples': 18396672, 'steps': 35930, 'batch_loss/train': 0.7929774820804596} 12/27/2021 16:44:59 - INFO - codeparrot_training - Step 35931: {'lr': 9.208159305416733e-05, 'samples': 18397184, 'steps': 35931, 'batch_loss/train': 0.715305800549686} 12/27/2021 16:45:10 - INFO - codeparrot_training - Step 35932: {'lr': 9.206937324363531e-05, 'samples': 18397696, 'steps': 35932, 'batch_loss/train': 0.7114293789491057} 12/27/2021 16:45:22 - INFO - codeparrot_training - Step 35933: {'lr': 9.205715406097595e-05, 'samples': 18398208, 'steps': 35933, 'batch_loss/train': 0.8433697372674942} 12/27/2021 16:45:33 - INFO - codeparrot_training - Step 35934: {'lr': 9.20449355062378e-05, 'samples': 18398720, 'steps': 35934, 'batch_loss/train': 0.6942071933299303} 12/27/2021 16:45:43 - INFO - codeparrot_training - Step 35935: {'lr': 9.203271757946924e-05, 'samples': 18399232, 'steps': 35935, 'batch_loss/train': 0.7236814685165882} 12/27/2021 16:45:54 - INFO - codeparrot_training - Step 35936: {'lr': 9.202050028071885e-05, 'samples': 18399744, 'steps': 35936, 'batch_loss/train': 0.7227483717724681} 12/27/2021 16:46:06 - INFO - codeparrot_training - Step 35937: {'lr': 9.20082836100355e-05, 'samples': 18400256, 'steps': 35937, 'batch_loss/train': 
0.7645319793373346} 12/27/2021 16:46:16 - INFO - codeparrot_training - Step 35938: {'lr': 9.199606756746745e-05, 'samples': 18400768, 'steps': 35938, 'batch_loss/train': 0.6689610728062689} 12/27/2021 16:46:27 - INFO - codeparrot_training - Step 35939: {'lr': 9.198385215306341e-05, 'samples': 18401280, 'steps': 35939, 'batch_loss/train': 0.7354599945247173} 12/27/2021 16:46:39 - INFO - codeparrot_training - Step 35940: {'lr': 9.197163736687186e-05, 'samples': 18401792, 'steps': 35940, 'batch_loss/train': 0.7084065359085798} 12/27/2021 16:46:50 - INFO - codeparrot_training - Step 35941: {'lr': 9.195942320894143e-05, 'samples': 18402304, 'steps': 35941, 'batch_loss/train': 0.7892806094023399} 12/27/2021 16:47:00 - INFO - codeparrot_training - Step 35942: {'lr': 9.194720967932069e-05, 'samples': 18402816, 'steps': 35942, 'batch_loss/train': 0.7284076795913279} 12/27/2021 16:47:14 - INFO - codeparrot_training - Step 35943: {'lr': 9.193499677805814e-05, 'samples': 18403328, 'steps': 35943, 'batch_loss/train': 0.8397909617051482} 12/27/2021 16:47:25 - INFO - codeparrot_training - Step 35944: {'lr': 9.192278450520236e-05, 'samples': 18403840, 'steps': 35944, 'batch_loss/train': 0.7389550264924765} 12/27/2021 16:47:35 - INFO - codeparrot_training - Step 35945: {'lr': 9.191057286080199e-05, 'samples': 18404352, 'steps': 35945, 'batch_loss/train': 0.7048103092238307} 12/27/2021 16:47:46 - INFO - codeparrot_training - Step 35946: {'lr': 9.189836184490538e-05, 'samples': 18404864, 'steps': 35946, 'batch_loss/train': 0.7272311430424452} 12/27/2021 16:47:58 - INFO - codeparrot_training - Step 35947: {'lr': 9.188615145756113e-05, 'samples': 18405376, 'steps': 35947, 'batch_loss/train': 0.8219953645020723} 12/27/2021 16:48:09 - INFO - codeparrot_training - Step 35948: {'lr': 9.187394169881802e-05, 'samples': 18405888, 'steps': 35948, 'batch_loss/train': 0.6801323393592611} 12/27/2021 16:48:19 - INFO - codeparrot_training - Step 35949: {'lr': 9.186173256872429e-05, 'samples': 
18406400, 'steps': 35949, 'batch_loss/train': 0.7060509594157338} 12/27/2021 16:48:32 - INFO - codeparrot_training - Step 35950: {'lr': 9.184952406732863e-05, 'samples': 18406912, 'steps': 35950, 'batch_loss/train': 0.8169520320370793} 12/27/2021 16:48:42 - INFO - codeparrot_training - Step 35951: {'lr': 9.183731619467952e-05, 'samples': 18407424, 'steps': 35951, 'batch_loss/train': 0.7230586013756692} 12/27/2021 16:48:53 - INFO - codeparrot_training - Step 35952: {'lr': 9.182510895082554e-05, 'samples': 18407936, 'steps': 35952, 'batch_loss/train': 0.6611084947362542} 12/27/2021 16:49:07 - INFO - codeparrot_training - Step 35953: {'lr': 9.18129023358152e-05, 'samples': 18408448, 'steps': 35953, 'batch_loss/train': 0.903043607249856} 12/27/2021 16:49:18 - INFO - codeparrot_training - Step 35954: {'lr': 9.180069634969704e-05, 'samples': 18408960, 'steps': 35954, 'batch_loss/train': 0.745842938311398} 12/27/2021 16:49:28 - INFO - codeparrot_training - Step 35955: {'lr': 9.178849099251957e-05, 'samples': 18409472, 'steps': 35955, 'batch_loss/train': 0.6656550848856568} 12/27/2021 16:49:40 - INFO - codeparrot_training - Step 35956: {'lr': 9.177628626433131e-05, 'samples': 18409984, 'steps': 35956, 'batch_loss/train': 0.512804765952751} 12/27/2021 16:49:51 - INFO - codeparrot_training - Step 35957: {'lr': 9.176408216518092e-05, 'samples': 18410496, 'steps': 35957, 'batch_loss/train': 0.7758266218006611} 12/27/2021 16:50:01 - INFO - codeparrot_training - Step 35958: {'lr': 9.175187869511658e-05, 'samples': 18411008, 'steps': 35958, 'batch_loss/train': 0.731005291454494} 12/27/2021 16:50:12 - INFO - codeparrot_training - Step 35959: {'lr': 9.173967585418713e-05, 'samples': 18411520, 'steps': 35959, 'batch_loss/train': 0.7145471461117268} 12/27/2021 16:50:26 - INFO - codeparrot_training - Step 35960: {'lr': 9.172747364244102e-05, 'samples': 18412032, 'steps': 35960, 'batch_loss/train': 0.6894087109249085} 12/27/2021 16:50:37 - INFO - codeparrot_training - Step 35961: 
{'lr': 9.171527205992667e-05, 'samples': 18412544, 'steps': 35961, 'batch_loss/train': 0.774477060418576} 12/27/2021 16:50:47 - INFO - codeparrot_training - Step 35962: {'lr': 9.170307110669251e-05, 'samples': 18413056, 'steps': 35962, 'batch_loss/train': 0.7942048613913357} 12/27/2021 16:51:00 - INFO - codeparrot_training - Step 35963: {'lr': 9.169087078278737e-05, 'samples': 18413568, 'steps': 35963, 'batch_loss/train': 0.7400437090545893} 12/27/2021 16:51:10 - INFO - codeparrot_training - Step 35964: {'lr': 9.167867108825945e-05, 'samples': 18414080, 'steps': 35964, 'batch_loss/train': 0.7359481258317828} 12/27/2021 16:51:21 - INFO - codeparrot_training - Step 35965: {'lr': 9.166647202315736e-05, 'samples': 18414592, 'steps': 35965, 'batch_loss/train': 0.9088961714878678} 12/27/2021 16:51:33 - INFO - codeparrot_training - Step 35966: {'lr': 9.165427358752958e-05, 'samples': 18415104, 'steps': 35966, 'batch_loss/train': 0.6665981789119542} 12/27/2021 16:51:43 - INFO - codeparrot_training - Step 35967: {'lr': 9.164207578142463e-05, 'samples': 18415616, 'steps': 35967, 'batch_loss/train': 0.6947928452864289} 12/27/2021 16:51:54 - INFO - codeparrot_training - Step 35968: {'lr': 9.1629878604891e-05, 'samples': 18416128, 'steps': 35968, 'batch_loss/train': 0.8481385558843613} 12/27/2021 16:52:05 - INFO - codeparrot_training - Step 35969: {'lr': 9.161768205797715e-05, 'samples': 18416640, 'steps': 35969, 'batch_loss/train': 0.6070574317127466} 12/27/2021 16:52:19 - INFO - codeparrot_training - Step 35970: {'lr': 9.160548614073163e-05, 'samples': 18417152, 'steps': 35970, 'batch_loss/train': 0.7396756880916655} 12/27/2021 16:52:29 - INFO - codeparrot_training - Step 35971: {'lr': 9.159329085320287e-05, 'samples': 18417664, 'steps': 35971, 'batch_loss/train': 0.6050330079160631} 12/27/2021 16:52:40 - INFO - codeparrot_training - Step 35972: {'lr': 9.158109619543945e-05, 'samples': 18418176, 'steps': 35972, 'batch_loss/train': 0.6331977180670947} 12/27/2021 16:52:52 - 
INFO - codeparrot_training - Step 35973: {'lr': 9.156890216748961e-05, 'samples': 18418688, 'steps': 35973, 'batch_loss/train': 0.8264415599405766} 12/27/2021 16:53:03 - INFO - codeparrot_training - Step 35974: {'lr': 9.155670876940217e-05, 'samples': 18419200, 'steps': 35974, 'batch_loss/train': 0.6353985696914606} 12/27/2021 16:53:13 - INFO - codeparrot_training - Step 35975: {'lr': 9.154451600122535e-05, 'samples': 18419712, 'steps': 35975, 'batch_loss/train': 0.7130425877403468} 12/27/2021 16:53:26 - INFO - codeparrot_training - Step 35976: {'lr': 9.153232386300761e-05, 'samples': 18420224, 'steps': 35976, 'batch_loss/train': 0.6663322867825627} 12/27/2021 16:53:36 - INFO - codeparrot_training - Step 35977: {'lr': 9.152013235479767e-05, 'samples': 18420736, 'steps': 35977, 'batch_loss/train': 0.8729102308861911} 12/27/2021 16:53:47 - INFO - codeparrot_training - Step 35978: {'lr': 9.150794147664379e-05, 'samples': 18421248, 'steps': 35978, 'batch_loss/train': 0.7903427975252271} 12/27/2021 16:53:58 - INFO - codeparrot_training - Step 35979: {'lr': 9.149575122859446e-05, 'samples': 18421760, 'steps': 35979, 'batch_loss/train': 0.7052617084700614} 12/27/2021 16:54:11 - INFO - codeparrot_training - Step 35980: {'lr': 9.14835616106982e-05, 'samples': 18422272, 'steps': 35980, 'batch_loss/train': 0.7116070128977299} 12/27/2021 16:54:22 - INFO - codeparrot_training - Step 35981: {'lr': 9.147137262300345e-05, 'samples': 18422784, 'steps': 35981, 'batch_loss/train': 0.7814454482868314} 12/27/2021 16:54:33 - INFO - codeparrot_training - Step 35982: {'lr': 9.145918426555863e-05, 'samples': 18423296, 'steps': 35982, 'batch_loss/train': 0.7092293999157846} 12/27/2021 16:54:45 - INFO - codeparrot_training - Step 35983: {'lr': 9.144699653841234e-05, 'samples': 18423808, 'steps': 35983, 'batch_loss/train': 0.5621551339281723} 12/27/2021 16:54:55 - INFO - codeparrot_training - Step 35984: {'lr': 9.143480944161276e-05, 'samples': 18424320, 'steps': 35984, 'batch_loss/train': 
0.7695483481511474} 12/27/2021 16:55:06 - INFO - codeparrot_training - Step 35985: {'lr': 9.142262297520856e-05, 'samples': 18424832, 'steps': 35985, 'batch_loss/train': 0.71873390651308} 12/27/2021 16:55:18 - INFO - codeparrot_training - Step 35986: {'lr': 9.141043713924827e-05, 'samples': 18425344, 'steps': 35986, 'batch_loss/train': 0.6354978996678255} 12/27/2021 16:55:29 - INFO - codeparrot_training - Step 35987: {'lr': 9.139825193378007e-05, 'samples': 18425856, 'steps': 35987, 'batch_loss/train': 0.5798768000677228} 12/27/2021 16:55:39 - INFO - codeparrot_training - Step 35988: {'lr': 9.138606735885246e-05, 'samples': 18426368, 'steps': 35988, 'batch_loss/train': 0.6515175793319941} 12/27/2021 16:55:53 - INFO - codeparrot_training - Step 35989: {'lr': 9.137388341451414e-05, 'samples': 18426880, 'steps': 35989, 'batch_loss/train': 0.7409391398541629} 12/27/2021 16:56:04 - INFO - codeparrot_training - Step 35990: {'lr': 9.136170010081327e-05, 'samples': 18427392, 'steps': 35990, 'batch_loss/train': 0.7645071893930435} 12/27/2021 16:56:15 - INFO - codeparrot_training - Step 35991: {'lr': 9.134951741779837e-05, 'samples': 18427904, 'steps': 35991, 'batch_loss/train': 0.7428895134944469} 12/27/2021 16:56:25 - INFO - codeparrot_training - Step 35992: {'lr': 9.133733536551792e-05, 'samples': 18428416, 'steps': 35992, 'batch_loss/train': 0.8070499980822206} 12/27/2021 16:56:38 - INFO - codeparrot_training - Step 35993: {'lr': 9.132515394402028e-05, 'samples': 18428928, 'steps': 35993, 'batch_loss/train': 0.6803680812008679} 12/27/2021 16:56:48 - INFO - codeparrot_training - Step 35994: {'lr': 9.131297315335393e-05, 'samples': 18429440, 'steps': 35994, 'batch_loss/train': 0.5302405016263947} 12/27/2021 16:56:59 - INFO - codeparrot_training - Step 35995: {'lr': 9.130079299356728e-05, 'samples': 18429952, 'steps': 35995, 'batch_loss/train': 0.6563731613568962} 12/27/2021 16:57:11 - INFO - codeparrot_training - Step 35996: {'lr': 9.128861346470876e-05, 'samples': 
18430464, 'steps': 35996, 'batch_loss/train': 0.6843470609746873} 12/27/2021 16:57:22 - INFO - codeparrot_training - Step 35997: {'lr': 9.127643456682678e-05, 'samples': 18430976, 'steps': 35997, 'batch_loss/train': 0.7058441592380404} 12/27/2021 16:57:32 - INFO - codeparrot_training - Step 35998: {'lr': 9.126425629996984e-05, 'samples': 18431488, 'steps': 35998, 'batch_loss/train': 0.7528585917316377} 12/27/2021 16:57:46 - INFO - codeparrot_training - Step 35999: {'lr': 9.125207866418611e-05, 'samples': 18432000, 'steps': 35999, 'batch_loss/train': 0.7576403806451708} 12/27/2021 16:57:57 - INFO - codeparrot_training - Step 36000: {'lr': 9.123990165952437e-05, 'samples': 18432512, 'steps': 36000, 'batch_loss/train': 0.6587061672471464} 12/27/2021 16:58:08 - INFO - codeparrot_training - Step 36001: {'lr': 9.122772528603272e-05, 'samples': 18433024, 'steps': 36001, 'batch_loss/train': 1.103378482919652} 12/27/2021 16:58:18 - INFO - codeparrot_training - Step 36002: {'lr': 9.12155495437596e-05, 'samples': 18433536, 'steps': 36002, 'batch_loss/train': 1.3236759062856436} 12/27/2021 16:58:31 - INFO - codeparrot_training - Step 36003: {'lr': 9.12033744327537e-05, 'samples': 18434048, 'steps': 36003, 'batch_loss/train': 0.8366704992949963} 12/27/2021 16:58:41 - INFO - codeparrot_training - Step 36004: {'lr': 9.119119995306313e-05, 'samples': 18434560, 'steps': 36004, 'batch_loss/train': 0.7492743129841983} 12/27/2021 16:58:52 - INFO - codeparrot_training - Step 36005: {'lr': 9.117902610473636e-05, 'samples': 18435072, 'steps': 36005, 'batch_loss/train': 0.6892930893227458} 12/27/2021 16:59:05 - INFO - codeparrot_training - Step 36006: {'lr': 9.116685288782184e-05, 'samples': 18435584, 'steps': 36006, 'batch_loss/train': 0.7948391800746322} 12/27/2021 16:59:16 - INFO - codeparrot_training - Step 36007: {'lr': 9.115468030236792e-05, 'samples': 18436096, 'steps': 36007, 'batch_loss/train': 0.6438982719555497} 12/27/2021 16:59:27 - INFO - codeparrot_training - Step 36008: 
{'lr': 9.114250834842302e-05, 'samples': 18436608, 'steps': 36008, 'batch_loss/train': 0.6917950809001923} 12/27/2021 16:59:39 - INFO - codeparrot_training - Step 36009: {'lr': 9.113033702603562e-05, 'samples': 18437120, 'steps': 36009, 'batch_loss/train': 0.7396313082426786} 12/27/2021 16:59:50 - INFO - codeparrot_training - Step 36010: {'lr': 9.111816633525386e-05, 'samples': 18437632, 'steps': 36010, 'batch_loss/train': 0.8417942244559526} 12/27/2021 17:00:00 - INFO - codeparrot_training - Step 36011: {'lr': 9.110599627612631e-05, 'samples': 18438144, 'steps': 36011, 'batch_loss/train': 0.7742339053656906} 12/27/2021 17:00:11 - INFO - codeparrot_training - Step 36012: {'lr': 9.109382684870146e-05, 'samples': 18438656, 'steps': 36012, 'batch_loss/train': 0.6785327787511051} 12/27/2021 17:00:23 - INFO - codeparrot_training - Step 36013: {'lr': 9.108165805302734e-05, 'samples': 18439168, 'steps': 36013, 'batch_loss/train': 0.8571742660133168} 12/27/2021 17:00:34 - INFO - codeparrot_training - Step 36014: {'lr': 9.106948988915267e-05, 'samples': 18439680, 'steps': 36014, 'batch_loss/train': 0.6754541876725852} 12/27/2021 17:00:44 - INFO - codeparrot_training - Step 36015: {'lr': 9.105732235712577e-05, 'samples': 18440192, 'steps': 36015, 'batch_loss/train': 0.5651668279897422} 12/27/2021 17:00:58 - INFO - codeparrot_training - Step 36016: {'lr': 9.104515545699484e-05, 'samples': 18440704, 'steps': 36016, 'batch_loss/train': 0.8314782325178385} 12/27/2021 17:01:09 - INFO - codeparrot_training - Step 36017: {'lr': 9.103298918880829e-05, 'samples': 18441216, 'steps': 36017, 'batch_loss/train': 0.740310219116509} 12/27/2021 17:01:19 - INFO - codeparrot_training - Step 36018: {'lr': 9.10208235526147e-05, 'samples': 18441728, 'steps': 36018, 'batch_loss/train': 0.8033771365880966} 12/27/2021 17:01:32 - INFO - codeparrot_training - Step 36019: {'lr': 9.100865854846222e-05, 'samples': 18442240, 'steps': 36019, 'batch_loss/train': 0.8135217193048447} 12/27/2021 17:01:42 - 
INFO - codeparrot_training - Step 36020: {'lr': 9.099649417639927e-05, 'samples': 18442752, 'steps': 36020, 'batch_loss/train': 0.8391652787104249} 12/27/2021 17:01:53 - INFO - codeparrot_training - Step 36021: {'lr': 9.098433043647423e-05, 'samples': 18443264, 'steps': 36021, 'batch_loss/train': 0.7744415495544672} 12/27/2021 17:02:04 - INFO - codeparrot_training - Step 36022: {'lr': 9.097216732873542e-05, 'samples': 18443776, 'steps': 36022, 'batch_loss/train': 0.6883067404851317} 12/27/2021 17:02:17 - INFO - codeparrot_training - Step 36023: {'lr': 9.096000485323124e-05, 'samples': 18444288, 'steps': 36023, 'batch_loss/train': 0.7433245623251423} 12/27/2021 17:02:28 - INFO - codeparrot_training - Step 36024: {'lr': 9.094784301001013e-05, 'samples': 18444800, 'steps': 36024, 'batch_loss/train': 0.7840862376615405} 12/27/2021 17:02:39 - INFO - codeparrot_training - Step 36025: {'lr': 9.093568179912018e-05, 'samples': 18445312, 'steps': 36025, 'batch_loss/train': 0.7186082829721272} 12/27/2021 17:02:51 - INFO - codeparrot_training - Step 36026: {'lr': 9.092352122060998e-05, 'samples': 18445824, 'steps': 36026, 'batch_loss/train': 0.8031856874004006} 12/27/2021 17:03:01 - INFO - codeparrot_training - Step 36027: {'lr': 9.091136127452787e-05, 'samples': 18446336, 'steps': 36027, 'batch_loss/train': 0.7578164383303374} 12/27/2021 17:03:12 - INFO - codeparrot_training - Step 36028: {'lr': 9.089920196092194e-05, 'samples': 18446848, 'steps': 36028, 'batch_loss/train': 0.6917106769979} 12/27/2021 17:03:24 - INFO - codeparrot_training - Step 36029: {'lr': 9.088704327984088e-05, 'samples': 18447360, 'steps': 36029, 'batch_loss/train': 0.5497882608906366} 12/27/2021 17:03:35 - INFO - codeparrot_training - Step 36030: {'lr': 9.087488523133277e-05, 'samples': 18447872, 'steps': 36030, 'batch_loss/train': 0.7154193632304668} 12/27/2021 17:03:45 - INFO - codeparrot_training - Step 36031: {'lr': 9.086272781544605e-05, 'samples': 18448384, 'steps': 36031, 'batch_loss/train': 
0.8176395012997091} 12/27/2021 17:03:56 - INFO - codeparrot_training - Step 36032: {'lr': 9.085057103222904e-05, 'samples': 18448896, 'steps': 36032, 'batch_loss/train': 0.7397639022674412} 12/27/2021 17:04:10 - INFO - codeparrot_training - Step 36033: {'lr': 9.083841488173006e-05, 'samples': 18449408, 'steps': 36033, 'batch_loss/train': 0.7577148135751486} 12/27/2021 17:04:20 - INFO - codeparrot_training - Step 36034: {'lr': 9.082625936399744e-05, 'samples': 18449920, 'steps': 36034, 'batch_loss/train': 0.5974983554333448} 12/27/2021 17:04:31 - INFO - codeparrot_training - Step 36035: {'lr': 9.081410447907959e-05, 'samples': 18450432, 'steps': 36035, 'batch_loss/train': 0.8223872678354383} 12/27/2021 17:04:43 - INFO - codeparrot_training - Step 36036: {'lr': 9.080195022702459e-05, 'samples': 18450944, 'steps': 36036, 'batch_loss/train': 0.9012853680178523} 12/27/2021 17:04:54 - INFO - codeparrot_training - Step 36037: {'lr': 9.0789796607881e-05, 'samples': 18451456, 'steps': 36037, 'batch_loss/train': 0.73198960442096} 12/27/2021 17:05:04 - INFO - codeparrot_training - Step 36038: {'lr': 9.077764362169716e-05, 'samples': 18451968, 'steps': 36038, 'batch_loss/train': 0.6980669111944735} 12/27/2021 17:05:15 - INFO - codeparrot_training - Step 36039: {'lr': 9.076549126852113e-05, 'samples': 18452480, 'steps': 36039, 'batch_loss/train': 0.7727489150129259} 12/27/2021 17:05:27 - INFO - codeparrot_training - Step 36040: {'lr': 9.075333954840146e-05, 'samples': 18452992, 'steps': 36040, 'batch_loss/train': 0.6747822370380163} 12/27/2021 17:05:38 - INFO - codeparrot_training - Step 36041: {'lr': 9.074118846138646e-05, 'samples': 18453504, 'steps': 36041, 'batch_loss/train': 0.7013267854345031} 12/27/2021 17:05:48 - INFO - codeparrot_training - Step 36042: {'lr': 9.072903800752427e-05, 'samples': 18454016, 'steps': 36042, 'batch_loss/train': 0.7426559859886765} 12/27/2021 17:06:02 - INFO - codeparrot_training - Step 36043: {'lr': 9.071688818686321e-05, 'samples': 18454528, 
'steps': 36043, 'batch_loss/train': 1.338055067230016} 12/27/2021 17:06:13 - INFO - codeparrot_training - Step 36044: {'lr': 9.070473899945183e-05, 'samples': 18455040, 'steps': 36044, 'batch_loss/train': 0.7464198800735176} 12/27/2021 17:06:23 - INFO - codeparrot_training - Step 36045: {'lr': 9.069259044533818e-05, 'samples': 18455552, 'steps': 36045, 'batch_loss/train': 0.6636642245575786} 12/27/2021 17:06:35 - INFO - codeparrot_training - Step 36046: {'lr': 9.068044252457061e-05, 'samples': 18456064, 'steps': 36046, 'batch_loss/train': 0.7398899300023913} 12/27/2021 17:06:46 - INFO - codeparrot_training - Step 36047: {'lr': 9.066829523719746e-05, 'samples': 18456576, 'steps': 36047, 'batch_loss/train': 0.7948079565539956} 12/27/2021 17:06:57 - INFO - codeparrot_training - Step 36048: {'lr': 9.065614858326704e-05, 'samples': 18457088, 'steps': 36048, 'batch_loss/train': 0.6996699394658208} 12/27/2021 17:07:07 - INFO - codeparrot_training - Step 36049: {'lr': 9.064400256282756e-05, 'samples': 18457600, 'steps': 36049, 'batch_loss/train': 0.8131546154618263} 12/27/2021 17:07:20 - INFO - codeparrot_training - Step 36050: {'lr': 9.06318571759274e-05, 'samples': 18458112, 'steps': 36050, 'batch_loss/train': 0.7528627370484173} 12/27/2021 17:07:30 - INFO - codeparrot_training - Step 36051: {'lr': 9.061971242261476e-05, 'samples': 18458624, 'steps': 36051, 'batch_loss/train': 0.5231990155880339} 12/27/2021 17:07:41 - INFO - codeparrot_training - Step 36052: {'lr': 9.060756830293799e-05, 'samples': 18459136, 'steps': 36052, 'batch_loss/train': 0.8295957716181874} 12/27/2021 17:07:55 - INFO - codeparrot_training - Step 36053: {'lr': 9.059542481694541e-05, 'samples': 18459648, 'steps': 36053, 'batch_loss/train': 0.8115637726150453} 12/27/2021 17:08:05 - INFO - codeparrot_training - Step 36054: {'lr': 9.058328196468508e-05, 'samples': 18460160, 'steps': 36054, 'batch_loss/train': 0.7435056869871914} 12/27/2021 17:08:16 - INFO - codeparrot_training - Step 36055: {'lr': 
9.05711397462056e-05, 'samples': 18460672, 'steps': 36055, 'batch_loss/train': 0.792408587411046} 12/27/2021 17:08:28 - INFO - codeparrot_training - Step 36056: {'lr': 9.055899816155497e-05, 'samples': 18461184, 'steps': 36056, 'batch_loss/train': 0.6202216993551701} 12/27/2021 17:08:39 - INFO - codeparrot_training - Step 36057: {'lr': 9.054685721078158e-05, 'samples': 18461696, 'steps': 36057, 'batch_loss/train': 0.6929899882525206} 12/27/2021 17:08:49 - INFO - codeparrot_training - Step 36058: {'lr': 9.053471689393363e-05, 'samples': 18462208, 'steps': 36058, 'batch_loss/train': 0.7951883925125003} 12/27/2021 17:09:02 - INFO - codeparrot_training - Step 36059: {'lr': 9.052257721105947e-05, 'samples': 18462720, 'steps': 36059, 'batch_loss/train': 0.7544402349740267} 12/27/2021 17:09:12 - INFO - codeparrot_training - Step 36060: {'lr': 9.051043816220734e-05, 'samples': 18463232, 'steps': 36060, 'batch_loss/train': 0.7425871840678155} 12/27/2021 17:09:23 - INFO - codeparrot_training - Step 36061: {'lr': 9.049829974742546e-05, 'samples': 18463744, 'steps': 36061, 'batch_loss/train': 0.8181298174895346} 12/27/2021 17:09:34 - INFO - codeparrot_training - Step 36062: {'lr': 9.048616196676212e-05, 'samples': 18464256, 'steps': 36062, 'batch_loss/train': 0.7040843605063856} 12/27/2021 17:09:48 - INFO - codeparrot_training - Step 36063: {'lr': 9.04740248202656e-05, 'samples': 18464768, 'steps': 36063, 'batch_loss/train': 0.6543816160410643} 12/27/2021 17:09:58 - INFO - codeparrot_training - Step 36064: {'lr': 9.046188830798419e-05, 'samples': 18465280, 'steps': 36064, 'batch_loss/train': 0.7158328904770315} 12/27/2021 17:10:09 - INFO - codeparrot_training - Step 36065: {'lr': 9.04497524299659e-05, 'samples': 18465792, 'steps': 36065, 'batch_loss/train': 0.849115077406168} 12/27/2021 17:10:21 - INFO - codeparrot_training - Step 36066: {'lr': 9.04376171862592e-05, 'samples': 18466304, 'steps': 36066, 'batch_loss/train': 0.5545320026576519} 12/27/2021 17:10:32 - INFO - 
codeparrot_training - Step 36067: {'lr': 9.04254825769124e-05, 'samples': 18466816, 'steps': 36067, 'batch_loss/train': 0.7646978562697768} 12/27/2021 17:10:42 - INFO - codeparrot_training - Step 36068: {'lr': 9.041334860197356e-05, 'samples': 18467328, 'steps': 36068, 'batch_loss/train': 0.8425108101218939} 12/27/2021 17:10:55 - INFO - codeparrot_training - Step 36069: {'lr': 9.040121526149089e-05, 'samples': 18467840, 'steps': 36069, 'batch_loss/train': 0.7645883227232844} 12/27/2021 17:11:06 - INFO - codeparrot_training - Step 36070: {'lr': 9.03890825555129e-05, 'samples': 18468352, 'steps': 36070, 'batch_loss/train': 0.78695446299389} 12/27/2021 17:11:17 - INFO - codeparrot_training - Step 36071: {'lr': 9.037695048408756e-05, 'samples': 18468864, 'steps': 36071, 'batch_loss/train': 0.7331822896376252} 12/27/2021 17:11:27 - INFO - codeparrot_training - Step 36072: {'lr': 9.03648190472632e-05, 'samples': 18469376, 'steps': 36072, 'batch_loss/train': 1.2554147709161043} 12/27/2021 17:11:41 - INFO - codeparrot_training - Step 36073: {'lr': 9.035268824508803e-05, 'samples': 18469888, 'steps': 36073, 'batch_loss/train': 0.8563771257176995} 12/27/2021 17:11:52 - INFO - codeparrot_training - Step 36074: {'lr': 9.03405580776103e-05, 'samples': 18470400, 'steps': 36074, 'batch_loss/train': 0.830563897266984} 12/27/2021 17:12:02 - INFO - codeparrot_training - Step 36075: {'lr': 9.032842854487822e-05, 'samples': 18470912, 'steps': 36075, 'batch_loss/train': 0.7339374436996877} 12/27/2021 17:12:15 - INFO - codeparrot_training - Step 36076: {'lr': 9.031629964694005e-05, 'samples': 18471424, 'steps': 36076, 'batch_loss/train': 0.8153471546247602} 12/27/2021 17:12:25 - INFO - codeparrot_training - Step 36077: {'lr': 9.030417138384395e-05, 'samples': 18471936, 'steps': 36077, 'batch_loss/train': 0.8424601489678025} 12/27/2021 17:12:36 - INFO - codeparrot_training - Step 36078: {'lr': 9.029204375563816e-05, 'samples': 18472448, 'steps': 36078, 'batch_loss/train': 
0.7023819675669074} 12/27/2021 17:12:47 - INFO - codeparrot_training - Step 36079: {'lr': 9.0279916762371e-05, 'samples': 18472960, 'steps': 36079, 'batch_loss/train': 0.8430490689352155} 12/27/2021 17:13:01 - INFO - codeparrot_training - Step 36080: {'lr': 9.026779040409039e-05, 'samples': 18473472, 'steps': 36080, 'batch_loss/train': 0.8837430160492659} 12/27/2021 17:13:11 - INFO - codeparrot_training - Step 36081: {'lr': 9.025566468084492e-05, 'samples': 18473984, 'steps': 36081, 'batch_loss/train': 0.639864539494738} 12/27/2021 17:13:22 - INFO - codeparrot_training - Step 36082: {'lr': 9.024353959268253e-05, 'samples': 18474496, 'steps': 36082, 'batch_loss/train': 0.8405742915347219} 12/27/2021 17:13:34 - INFO - codeparrot_training - Step 36083: {'lr': 9.02314151396515e-05, 'samples': 18475008, 'steps': 36083, 'batch_loss/train': 0.7777721291640773} 12/27/2021 17:13:44 - INFO - codeparrot_training - Step 36084: {'lr': 9.021929132180004e-05, 'samples': 18475520, 'steps': 36084, 'batch_loss/train': 0.6449859084095806} 12/27/2021 17:13:55 - INFO - codeparrot_training - Step 36085: {'lr': 9.020716813917634e-05, 'samples': 18476032, 'steps': 36085, 'batch_loss/train': 0.7632893566042185} 12/27/2021 17:14:07 - INFO - codeparrot_training - Step 36086: {'lr': 9.019504559182861e-05, 'samples': 18476544, 'steps': 36086, 'batch_loss/train': 0.6842827592045069} 12/27/2021 17:14:18 - INFO - codeparrot_training - Step 36087: {'lr': 9.018292367980504e-05, 'samples': 18477056, 'steps': 36087, 'batch_loss/train': 0.868853741325438} 12/27/2021 17:14:28 - INFO - codeparrot_training - Step 36088: {'lr': 9.017080240315381e-05, 'samples': 18477568, 'steps': 36088, 'batch_loss/train': 0.8012022031471133} 12/27/2021 17:14:41 - INFO - codeparrot_training - Step 36089: {'lr': 9.015868176192313e-05, 'samples': 18478080, 'steps': 36089, 'batch_loss/train': 0.8098978642374277} 12/27/2021 17:14:52 - INFO - codeparrot_training - Step 36090: {'lr': 9.014656175616126e-05, 'samples': 18478592, 
'steps': 36090, 'batch_loss/train': 0.6992093296721578} 12/27/2021 17:15:02 - INFO - codeparrot_training - Step 36091: {'lr': 9.013444238591615e-05, 'samples': 18479104, 'steps': 36091, 'batch_loss/train': 0.7248018519021571} 12/27/2021 17:15:13 - INFO - codeparrot_training - Step 36092: {'lr': 9.012232365123618e-05, 'samples': 18479616, 'steps': 36092, 'batch_loss/train': 0.7836943808943033} 12/27/2021 17:15:26 - INFO - codeparrot_training - Step 36093: {'lr': 9.01102055521696e-05, 'samples': 18480128, 'steps': 36093, 'batch_loss/train': 0.7369425385259092} 12/27/2021 17:15:37 - INFO - codeparrot_training - Step 36094: {'lr': 9.009808808876438e-05, 'samples': 18480640, 'steps': 36094, 'batch_loss/train': 0.7375108222477138} 12/27/2021 17:15:47 - INFO - codeparrot_training - Step 36095: {'lr': 9.008597126106869e-05, 'samples': 18481152, 'steps': 36095, 'batch_loss/train': 0.7523403688101098} 12/27/2021 17:16:00 - INFO - codeparrot_training - Step 36096: {'lr': 9.007385506913096e-05, 'samples': 18481664, 'steps': 36096, 'batch_loss/train': 0.6843339973129332} 12/27/2021 17:16:10 - INFO - codeparrot_training - Step 36097: {'lr': 9.006173951299914e-05, 'samples': 18482176, 'steps': 36097, 'batch_loss/train': 0.644876075675711} 12/27/2021 17:16:21 - INFO - codeparrot_training - Step 36098: {'lr': 9.004962459272142e-05, 'samples': 18482688, 'steps': 36098, 'batch_loss/train': 0.763948374427855} 12/27/2021 17:16:33 - INFO - codeparrot_training - Step 36099: {'lr': 9.003751030834603e-05, 'samples': 18483200, 'steps': 36099, 'batch_loss/train': 0.9090169267728925} 12/27/2021 17:16:43 - INFO - codeparrot_training - Step 36100: {'lr': 9.00253966599211e-05, 'samples': 18483712, 'steps': 36100, 'batch_loss/train': 0.7044116349425167} 12/27/2021 17:16:54 - INFO - codeparrot_training - Step 36101: {'lr': 9.001328364749475e-05, 'samples': 18484224, 'steps': 36101, 'batch_loss/train': 0.8065424135420471} 12/27/2021 17:17:05 - INFO - codeparrot_training - Step 36102: {'lr': 
9.00011712711152e-05, 'samples': 18484736, 'steps': 36102, 'batch_loss/train': 0.6786086533684283} 12/27/2021 17:17:18 - INFO - codeparrot_training - Step 36103: {'lr': 8.998905953083059e-05, 'samples': 18485248, 'steps': 36103, 'batch_loss/train': 0.8440176979638636} 12/27/2021 17:17:29 - INFO - codeparrot_training - Step 36104: {'lr': 8.997694842668905e-05, 'samples': 18485760, 'steps': 36104, 'batch_loss/train': 0.7469984982162714} 12/27/2021 17:17:39 - INFO - codeparrot_training - Step 36105: {'lr': 8.996483795873884e-05, 'samples': 18486272, 'steps': 36105, 'batch_loss/train': 0.579115404660115} 12/27/2021 17:17:51 - INFO - codeparrot_training - Step 36106: {'lr': 8.995272812702781e-05, 'samples': 18486784, 'steps': 36106, 'batch_loss/train': 0.7292935089208186} 12/27/2021 17:18:02 - INFO - codeparrot_training - Step 36107: {'lr': 8.994061893160441e-05, 'samples': 18487296, 'steps': 36107, 'batch_loss/train': 0.8053917018696666} 12/27/2021 17:18:13 - INFO - codeparrot_training - Step 36108: {'lr': 8.992851037251673e-05, 'samples': 18487808, 'steps': 36108, 'batch_loss/train': 0.7582925383467227} 12/27/2021 17:18:27 - INFO - codeparrot_training - Step 36109: {'lr': 8.991640244981281e-05, 'samples': 18488320, 'steps': 36109, 'batch_loss/train': 0.7983978447737172} 12/27/2021 17:18:37 - INFO - codeparrot_training - Step 36110: {'lr': 8.990429516354078e-05, 'samples': 18488832, 'steps': 36110, 'batch_loss/train': 0.721306256367825} 12/27/2021 17:18:48 - INFO - codeparrot_training - Step 36111: {'lr': 8.989218851374887e-05, 'samples': 18489344, 'steps': 36111, 'batch_loss/train': 0.4519781689159572} 12/27/2021 17:18:59 - INFO - codeparrot_training - Step 36112: {'lr': 8.988008250048512e-05, 'samples': 18489856, 'steps': 36112, 'batch_loss/train': 0.6875179710332304} 12/27/2021 17:19:11 - INFO - codeparrot_training - Step 36113: {'lr': 8.986797712379774e-05, 'samples': 18490368, 'steps': 36113, 'batch_loss/train': 0.6275169877335429} 12/27/2021 17:19:21 - INFO - 
codeparrot_training - Step 36114: {'lr': 8.98558723837348e-05, 'samples': 18490880, 'steps': 36114, 'batch_loss/train': 0.716976745869033} 12/27/2021 17:19:32 - INFO - codeparrot_training - Step 36115: {'lr': 8.984376828034444e-05, 'samples': 18491392, 'steps': 36115, 'batch_loss/train': 0.7398167797946371} 12/27/2021 17:19:44 - INFO - codeparrot_training - Step 36116: {'lr': 8.98316648136748e-05, 'samples': 18491904, 'steps': 36116, 'batch_loss/train': 0.8147556195035577} 12/27/2021 17:19:55 - INFO - codeparrot_training - Step 36117: {'lr': 8.981956198377397e-05, 'samples': 18492416, 'steps': 36117, 'batch_loss/train': 0.6951551134698093} 12/27/2021 17:20:05 - INFO - codeparrot_training - Step 36118: {'lr': 8.980745979069008e-05, 'samples': 18492928, 'steps': 36118, 'batch_loss/train': 0.7110632829135284} 12/27/2021 17:20:19 - INFO - codeparrot_training - Step 36119: {'lr': 8.979535823447135e-05, 'samples': 18493440, 'steps': 36119, 'batch_loss/train': 0.7232105205766857} 12/27/2021 17:20:30 - INFO - codeparrot_training - Step 36120: {'lr': 8.978325731516568e-05, 'samples': 18493952, 'steps': 36120, 'batch_loss/train': 0.5980607247911394} 12/27/2021 17:20:41 - INFO - codeparrot_training - Step 36121: {'lr': 8.977115703282119e-05, 'samples': 18494464, 'steps': 36121, 'batch_loss/train': 0.7974916020175442} 12/27/2021 17:20:51 - INFO - codeparrot_training - Step 36122: {'lr': 8.975905738748628e-05, 'samples': 18494976, 'steps': 36122, 'batch_loss/train': 0.7617523320950568} 12/27/2021 17:21:04 - INFO - codeparrot_training - Step 36123: {'lr': 8.974695837920873e-05, 'samples': 18495488, 'steps': 36123, 'batch_loss/train': 0.7836980456486344} 12/27/2021 17:21:15 - INFO - codeparrot_training - Step 36124: {'lr': 8.973486000803671e-05, 'samples': 18496000, 'steps': 36124, 'batch_loss/train': 0.7526051041204482} 12/27/2021 17:21:25 - INFO - codeparrot_training - Step 36125: {'lr': 8.972276227401852e-05, 'samples': 18496512, 'steps': 36125, 'batch_loss/train': 
0.9319805812556297} 12/27/2021 17:21:37 - INFO - codeparrot_training - Step 36126: {'lr': 8.971066517720202e-05, 'samples': 18497024, 'steps': 36126, 'batch_loss/train': 0.818349216831848} 12/27/2021 17:21:48 - INFO - codeparrot_training - Step 36127: {'lr': 8.969856871763542e-05, 'samples': 18497536, 'steps': 36127, 'batch_loss/train': 0.7584994928911328} 12/27/2021 17:21:58 - INFO - codeparrot_training - Step 36128: {'lr': 8.968647289536677e-05, 'samples': 18498048, 'steps': 36128, 'batch_loss/train': 0.8314163219183683} 12/27/2021 17:22:09 - INFO - codeparrot_training - Step 36129: {'lr': 8.967437771044418e-05, 'samples': 18498560, 'steps': 36129, 'batch_loss/train': 0.6998536111786962} 12/27/2021 17:22:21 - INFO - codeparrot_training - Step 36130: {'lr': 8.966228316291572e-05, 'samples': 18499072, 'steps': 36130, 'batch_loss/train': 0.7629427118226886} 12/27/2021 17:22:32 - INFO - codeparrot_training - Step 36131: {'lr': 8.965018925282958e-05, 'samples': 18499584, 'steps': 36131, 'batch_loss/train': 0.8461836110800505} 12/27/2021 17:22:42 - INFO - codeparrot_training - Step 36132: {'lr': 8.963809598023354e-05, 'samples': 18500096, 'steps': 36132, 'batch_loss/train': 0.7738512139767408} 12/27/2021 17:22:56 - INFO - codeparrot_training - Step 36133: {'lr': 8.962600334517598e-05, 'samples': 18500608, 'steps': 36133, 'batch_loss/train': 0.7343422593548894} 12/27/2021 17:23:07 - INFO - codeparrot_training - Step 36134: {'lr': 8.961391134770497e-05, 'samples': 18501120, 'steps': 36134, 'batch_loss/train': 0.7209643423557281} 12/27/2021 17:23:18 - INFO - codeparrot_training - Step 36135: {'lr': 8.96018199878684e-05, 'samples': 18501632, 'steps': 36135, 'batch_loss/train': 0.760447253473103} 12/27/2021 17:23:30 - INFO - codeparrot_training - Step 36136: {'lr': 8.958972926571443e-05, 'samples': 18502144, 'steps': 36136, 'batch_loss/train': 0.7890462027862668} 12/27/2021 17:23:41 - INFO - codeparrot_training - Step 36137: {'lr': 8.957763918129113e-05, 'samples': 
18502656, 'steps': 36137, 'batch_loss/train': 0.49168940231902525} 12/27/2021 17:23:51 - INFO - codeparrot_training - Step 36138: {'lr': 8.956554973464656e-05, 'samples': 18503168, 'steps': 36138, 'batch_loss/train': 0.6043767787341494} 12/27/2021 17:24:04 - INFO - codeparrot_training - Step 36139: {'lr': 8.955346092582878e-05, 'samples': 18503680, 'steps': 36139, 'batch_loss/train': 1.4171379439067096} 12/27/2021 17:24:15 - INFO - codeparrot_training - Step 36140: {'lr': 8.954137275488585e-05, 'samples': 18504192, 'steps': 36140, 'batch_loss/train': 0.6716613634489477} 12/27/2021 17:24:26 - INFO - codeparrot_training - Step 36141: {'lr': 8.952928522186588e-05, 'samples': 18504704, 'steps': 36141, 'batch_loss/train': 0.6590047245845199} 12/27/2021 17:24:36 - INFO - codeparrot_training - Step 36142: {'lr': 8.951719832681682e-05, 'samples': 18505216, 'steps': 36142, 'batch_loss/train': 0.6892561642453074} 12/27/2021 17:24:48 - INFO - codeparrot_training - Step 36143: {'lr': 8.950511206978682e-05, 'samples': 18505728, 'steps': 36143, 'batch_loss/train': 0.7012486725579947} 12/27/2021 17:24:59 - INFO - codeparrot_training - Step 36144: {'lr': 8.949302645082388e-05, 'samples': 18506240, 'steps': 36144, 'batch_loss/train': 0.7319827051833272} 12/27/2021 17:25:09 - INFO - codeparrot_training - Step 36145: {'lr': 8.948094146997615e-05, 'samples': 18506752, 'steps': 36145, 'batch_loss/train': 0.6677197851240635} 12/27/2021 17:25:22 - INFO - codeparrot_training - Step 36146: {'lr': 8.946885712729152e-05, 'samples': 18507264, 'steps': 36146, 'batch_loss/train': 0.7903059660457075} 12/27/2021 17:25:32 - INFO - codeparrot_training - Step 36147: {'lr': 8.9456773422818e-05, 'samples': 18507776, 'steps': 36147, 'batch_loss/train': 0.7517200615257025} 12/27/2021 17:25:43 - INFO - codeparrot_training - Step 36148: {'lr': 8.944469035660391e-05, 'samples': 18508288, 'steps': 36148, 'batch_loss/train': 0.7430882900953293} 12/27/2021 17:25:57 - INFO - codeparrot_training - Step 36149: 
{'lr': 8.943260792869701e-05, 'samples': 18508800, 'steps': 36149, 'batch_loss/train': 0.7268312820233405} 12/27/2021 17:26:08 - INFO - codeparrot_training - Step 36150: {'lr': 8.942052613914533e-05, 'samples': 18509312, 'steps': 36150, 'batch_loss/train': 0.8446393250487745} 12/27/2021 17:26:19 - INFO - codeparrot_training - Step 36151: {'lr': 8.940844498799722e-05, 'samples': 18509824, 'steps': 36151, 'batch_loss/train': 0.9267545575276017} 12/27/2021 17:26:29 - INFO - codeparrot_training - Step 36152: {'lr': 8.939636447530036e-05, 'samples': 18510336, 'steps': 36152, 'batch_loss/train': 0.7814588402397931} 12/27/2021 17:26:41 - INFO - codeparrot_training - Step 36153: {'lr': 8.938428460110295e-05, 'samples': 18510848, 'steps': 36153, 'batch_loss/train': 0.8097399575053714} 12/27/2021 17:26:52 - INFO - codeparrot_training - Step 36154: {'lr': 8.937220536545295e-05, 'samples': 18511360, 'steps': 36154, 'batch_loss/train': 0.7259005894884467} 12/27/2021 17:27:03 - INFO - codeparrot_training - Step 36155: {'lr': 8.936012676839844e-05, 'samples': 18511872, 'steps': 36155, 'batch_loss/train': 0.7127362390747294} 12/27/2021 17:27:15 - INFO - codeparrot_training - Step 36156: {'lr': 8.93480488099874e-05, 'samples': 18512384, 'steps': 36156, 'batch_loss/train': 0.7675464130006731} 12/27/2021 17:27:26 - INFO - codeparrot_training - Step 36157: {'lr': 8.933597149026795e-05, 'samples': 18512896, 'steps': 36157, 'batch_loss/train': 0.8007137980312109} 12/27/2021 17:27:36 - INFO - codeparrot_training - Step 36158: {'lr': 8.932389480928785e-05, 'samples': 18513408, 'steps': 36158, 'batch_loss/train': 0.7853765017352998} 12/27/2021 17:27:47 - INFO - codeparrot_training - Step 36159: {'lr': 8.931181876709531e-05, 'samples': 18513920, 'steps': 36159, 'batch_loss/train': 0.7431431184522808} 12/27/2021 17:27:59 - INFO - codeparrot_training - Step 36160: {'lr': 8.929974336373844e-05, 'samples': 18514432, 'steps': 36160, 'batch_loss/train': 0.5993664916604757} 12/27/2021 17:28:10 - 
INFO - codeparrot_training - Step 36161: {'lr': 8.928766859926493e-05, 'samples': 18514944, 'steps': 36161, 'batch_loss/train': 0.7202221509069204} 12/27/2021 17:28:20 - INFO - codeparrot_training - Step 36162: {'lr': 8.927559447372304e-05, 'samples': 18515456, 'steps': 36162, 'batch_loss/train': 0.8218893462326378} 12/27/2021 17:28:34 - INFO - codeparrot_training - Step 36163: {'lr': 8.926352098716082e-05, 'samples': 18515968, 'steps': 36163, 'batch_loss/train': 0.7443628925830126} 12/27/2021 17:28:45 - INFO - codeparrot_training - Step 36164: {'lr': 8.925144813962605e-05, 'samples': 18516480, 'steps': 36164, 'batch_loss/train': 0.832579149864614} 12/27/2021 17:28:55 - INFO - codeparrot_training - Step 36165: {'lr': 8.923937593116679e-05, 'samples': 18516992, 'steps': 36165, 'batch_loss/train': 0.6934674307703972} 12/27/2021 17:29:07 - INFO - codeparrot_training - Step 36166: {'lr': 8.92273043618311e-05, 'samples': 18517504, 'steps': 36166, 'batch_loss/train': 0.7366448652464896} 12/27/2021 17:29:18 - INFO - codeparrot_training - Step 36167: {'lr': 8.921523343166693e-05, 'samples': 18518016, 'steps': 36167, 'batch_loss/train': 0.7767114073503762} 12/27/2021 17:29:29 - INFO - codeparrot_training - Step 36168: {'lr': 8.92031631407223e-05, 'samples': 18518528, 'steps': 36168, 'batch_loss/train': 0.6714807320386171} 12/27/2021 17:29:39 - INFO - codeparrot_training - Step 36169: {'lr': 8.919109348904517e-05, 'samples': 18519040, 'steps': 36169, 'batch_loss/train': 0.7298551192507148} 12/27/2021 17:29:54 - INFO - codeparrot_training - Step 36170: {'lr': 8.917902447668352e-05, 'samples': 18519552, 'steps': 36170, 'batch_loss/train': 0.7735711545683444} 12/27/2021 17:30:04 - INFO - codeparrot_training - Step 36171: {'lr': 8.916695610368545e-05, 'samples': 18520064, 'steps': 36171, 'batch_loss/train': 0.8081440469250083} 12/27/2021 17:30:15 - INFO - codeparrot_training - Step 36172: {'lr': 8.915488837009875e-05, 'samples': 18520576, 'steps': 36172, 'batch_loss/train': 
0.7637980091385543} 12/27/2021 17:30:27 - INFO - codeparrot_training - Step 36173: {'lr': 8.914282127597136e-05, 'samples': 18521088, 'steps': 36173, 'batch_loss/train': 0.5922847206238657} 12/27/2021 17:30:38 - INFO - codeparrot_training - Step 36174: {'lr': 8.913075482135158e-05, 'samples': 18521600, 'steps': 36174, 'batch_loss/train': 0.6639114457648247} 12/27/2021 17:30:48 - INFO - codeparrot_training - Step 36175: {'lr': 8.911868900628708e-05, 'samples': 18522112, 'steps': 36175, 'batch_loss/train': 0.6853092147503048} 12/27/2021 17:30:59 - INFO - codeparrot_training - Step 36176: {'lr': 8.910662383082585e-05, 'samples': 18522624, 'steps': 36176, 'batch_loss/train': 0.7024936703965068} 12/27/2021 17:31:11 - INFO - codeparrot_training - Step 36177: {'lr': 8.909455929501611e-05, 'samples': 18523136, 'steps': 36177, 'batch_loss/train': 0.7655327083775774} 12/27/2021 17:31:22 - INFO - codeparrot_training - Step 36178: {'lr': 8.908249539890553e-05, 'samples': 18523648, 'steps': 36178, 'batch_loss/train': 0.7853464554063976} 12/27/2021 17:31:32 - INFO - codeparrot_training - Step 36179: {'lr': 8.907043214254224e-05, 'samples': 18524160, 'steps': 36179, 'batch_loss/train': 0.7615407384000719} 12/27/2021 17:31:46 - INFO - codeparrot_training - Step 36180: {'lr': 8.90583695259741e-05, 'samples': 18524672, 'steps': 36180, 'batch_loss/train': 0.8817721800878644} 12/27/2021 17:31:56 - INFO - codeparrot_training - Step 36181: {'lr': 8.904630754924914e-05, 'samples': 18525184, 'steps': 36181, 'batch_loss/train': 0.7602861734339967} 12/27/2021 17:32:07 - INFO - codeparrot_training - Step 36182: {'lr': 8.903424621241527e-05, 'samples': 18525696, 'steps': 36182, 'batch_loss/train': 0.7372881956398487} 12/27/2021 17:32:19 - INFO - codeparrot_training - Step 36183: {'lr': 8.902218551552055e-05, 'samples': 18526208, 'steps': 36183, 'batch_loss/train': 0.7869594707153738} 12/27/2021 17:32:30 - INFO - codeparrot_training - Step 36184: {'lr': 8.901012545861267e-05, 'samples': 
18526720, 'steps': 36184, 'batch_loss/train': 0.7902868436649442} 12/27/2021 17:32:40 - INFO - codeparrot_training - Step 36185: {'lr': 8.899806604173982e-05, 'samples': 18527232, 'steps': 36185, 'batch_loss/train': 0.7550549139268696} 12/27/2021 17:32:52 - INFO - codeparrot_training - Step 36186: {'lr': 8.898600726494996e-05, 'samples': 18527744, 'steps': 36186, 'batch_loss/train': 0.7581620290875435} 12/27/2021 17:33:03 - INFO - codeparrot_training - Step 36187: {'lr': 8.897394912829076e-05, 'samples': 18528256, 'steps': 36187, 'batch_loss/train': 0.7767098499462008} 12/27/2021 17:33:14 - INFO - codeparrot_training - Step 36188: {'lr': 8.896189163181042e-05, 'samples': 18528768, 'steps': 36188, 'batch_loss/train': 0.7622885908931494} 12/27/2021 17:33:24 - INFO - codeparrot_training - Step 36189: {'lr': 8.89498347755569e-05, 'samples': 18529280, 'steps': 36189, 'batch_loss/train': 0.6815024316310883} 12/27/2021 17:33:37 - INFO - codeparrot_training - Step 36190: {'lr': 8.893777855957794e-05, 'samples': 18529792, 'steps': 36190, 'batch_loss/train': 0.7479876982979476} 12/27/2021 17:33:48 - INFO - codeparrot_training - Step 36191: {'lr': 8.892572298392154e-05, 'samples': 18530304, 'steps': 36191, 'batch_loss/train': 0.783193496055901} 12/27/2021 17:33:59 - INFO - codeparrot_training - Step 36192: {'lr': 8.891366804863565e-05, 'samples': 18530816, 'steps': 36192, 'batch_loss/train': 0.756583817768842} 12/27/2021 17:34:11 - INFO - codeparrot_training - Step 36193: {'lr': 8.890161375376818e-05, 'samples': 18531328, 'steps': 36193, 'batch_loss/train': 0.7801409214735031} 12/27/2021 17:34:22 - INFO - codeparrot_training - Step 36194: {'lr': 8.888956009936708e-05, 'samples': 18531840, 'steps': 36194, 'batch_loss/train': 0.9775437316857278} 12/27/2021 17:34:32 - INFO - codeparrot_training - Step 36195: {'lr': 8.887750708548026e-05, 'samples': 18532352, 'steps': 36195, 'batch_loss/train': 0.82029348553624} 12/27/2021 17:34:45 - INFO - codeparrot_training - Step 36196: 
{'lr': 8.886545471215565e-05, 'samples': 18532864, 'steps': 36196, 'batch_loss/train': 0.776965088211} 12/27/2021 17:34:56 - INFO - codeparrot_training - Step 36197: {'lr': 8.885340297944111e-05, 'samples': 18533376, 'steps': 36197, 'batch_loss/train': 0.7346723913215101} 12/27/2021 17:35:07 - INFO - codeparrot_training - Step 36198: {'lr': 8.884135188738463e-05, 'samples': 18533888, 'steps': 36198, 'batch_loss/train': 0.7353086358634755} 12/27/2021 17:35:17 - INFO - codeparrot_training - Step 36199: {'lr': 8.882930143603406e-05, 'samples': 18534400, 'steps': 36199, 'batch_loss/train': 0.817681074142456} 12/27/2021 17:35:29 - INFO - codeparrot_training - Step 36200: {'lr': 8.88172516254374e-05, 'samples': 18534912, 'steps': 36200, 'batch_loss/train': 0.6825505536980927} 12/27/2021 17:35:40 - INFO - codeparrot_training - Step 36201: {'lr': 8.880520245564245e-05, 'samples': 18535424, 'steps': 36201, 'batch_loss/train': 0.7389611657708883} 12/27/2021 17:35:51 - INFO - codeparrot_training - Step 36202: {'lr': 8.879315392669705e-05, 'samples': 18535936, 'steps': 36202, 'batch_loss/train': 0.716506187338382} 12/27/2021 17:36:03 - INFO - codeparrot_training - Step 36203: {'lr': 8.878110603864937e-05, 'samples': 18536448, 'steps': 36203, 'batch_loss/train': 0.8892434248700738} 12/27/2021 17:36:13 - INFO - codeparrot_training - Step 36204: {'lr': 8.876905879154703e-05, 'samples': 18536960, 'steps': 36204, 'batch_loss/train': 0.7917441057506949} 12/27/2021 17:36:24 - INFO - codeparrot_training - Step 36205: {'lr': 8.875701218543805e-05, 'samples': 18537472, 'steps': 36205, 'batch_loss/train': 0.667993264796678} 12/27/2021 17:36:36 - INFO - codeparrot_training - Step 36206: {'lr': 8.874496622037032e-05, 'samples': 18537984, 'steps': 36206, 'batch_loss/train': 0.6939433310180902} 12/27/2021 17:36:47 - INFO - codeparrot_training - Step 36207: {'lr': 8.873292089639171e-05, 'samples': 18538496, 'steps': 36207, 'batch_loss/train': 0.6705075120553374} 12/27/2021 17:36:57 - INFO - 
codeparrot_training - Step 36208: {'lr': 8.87208762135501e-05, 'samples': 18539008, 'steps': 36208, 'batch_loss/train': 0.6816514628008008} 12/27/2021 17:37:08 - INFO - codeparrot_training - Step 36209: {'lr': 8.870883217189347e-05, 'samples': 18539520, 'steps': 36209, 'batch_loss/train': 0.7429502345621586} 12/27/2021 17:37:22 - INFO - codeparrot_training - Step 36210: {'lr': 8.869678877146948e-05, 'samples': 18540032, 'steps': 36210, 'batch_loss/train': 0.535171405877918} 12/27/2021 17:37:32 - INFO - codeparrot_training - Step 36211: {'lr': 8.868474601232621e-05, 'samples': 18540544, 'steps': 36211, 'batch_loss/train': 0.7222586560528725} 12/27/2021 17:37:43 - INFO - codeparrot_training - Step 36212: {'lr': 8.867270389451157e-05, 'samples': 18541056, 'steps': 36212, 'batch_loss/train': 0.7039627323392779} 12/27/2021 17:37:55 - INFO - codeparrot_training - Step 36213: {'lr': 8.866066241807319e-05, 'samples': 18541568, 'steps': 36213, 'batch_loss/train': 0.6768443384207785} 12/27/2021 17:38:06 - INFO - codeparrot_training - Step 36214: {'lr': 8.864862158305918e-05, 'samples': 18542080, 'steps': 36214, 'batch_loss/train': 0.7428876007907093} 12/27/2021 17:38:16 - INFO - codeparrot_training - Step 36215: {'lr': 8.863658138951739e-05, 'samples': 18542592, 'steps': 36215, 'batch_loss/train': 0.6951301582157612} 12/27/2021 17:38:30 - INFO - codeparrot_training - Step 36216: {'lr': 8.862454183749555e-05, 'samples': 18543104, 'steps': 36216, 'batch_loss/train': 0.6562350182794034} 12/27/2021 17:38:41 - INFO - codeparrot_training - Step 36217: {'lr': 8.861250292704159e-05, 'samples': 18543616, 'steps': 36217, 'batch_loss/train': 0.7956361197866499} 12/27/2021 17:38:51 - INFO - codeparrot_training - Step 36218: {'lr': 8.86004646582034e-05, 'samples': 18544128, 'steps': 36218, 'batch_loss/train': 0.7140537332743406} 12/27/2021 17:39:02 - INFO - codeparrot_training - Step 36219: {'lr': 8.858842703102882e-05, 'samples': 18544640, 'steps': 36219, 'batch_loss/train': 
0.8026652918197215} 12/27/2021 17:39:14 - INFO - codeparrot_training - Step 36220: {'lr': 8.85763900455657e-05, 'samples': 18545152, 'steps': 36220, 'batch_loss/train': 0.7680746708065271} 12/27/2021 17:39:25 - INFO - codeparrot_training - Step 36221: {'lr': 8.856435370186192e-05, 'samples': 18545664, 'steps': 36221, 'batch_loss/train': 0.7377539416775107} 12/27/2021 17:39:35 - INFO - codeparrot_training - Step 36222: {'lr': 8.855231799996527e-05, 'samples': 18546176, 'steps': 36222, 'batch_loss/train': 0.7335943579673767} 12/27/2021 17:39:48 - INFO - codeparrot_training - Step 36223: {'lr': 8.854028293992369e-05, 'samples': 18546688, 'steps': 36223, 'batch_loss/train': 0.9817127727437764} 12/27/2021 17:39:58 - INFO - codeparrot_training - Step 36224: {'lr': 8.852824852178496e-05, 'samples': 18547200, 'steps': 36224, 'batch_loss/train': 0.7182253682985902} 12/27/2021 17:40:09 - INFO - codeparrot_training - Step 36225: {'lr': 8.851621474559693e-05, 'samples': 18547712, 'steps': 36225, 'batch_loss/train': 0.8590211855480447} 12/27/2021 17:40:23 - INFO - codeparrot_training - Step 36226: {'lr': 8.850418161140756e-05, 'samples': 18548224, 'steps': 36226, 'batch_loss/train': 0.7136895274743438} 12/27/2021 17:40:34 - INFO - codeparrot_training - Step 36227: {'lr': 8.849214911926451e-05, 'samples': 18548736, 'steps': 36227, 'batch_loss/train': 0.7400990314781666} 12/27/2021 17:40:44 - INFO - codeparrot_training - Step 36228: {'lr': 8.84801172692156e-05, 'samples': 18549248, 'steps': 36228, 'batch_loss/train': 0.7833086862228811} 12/27/2021 17:40:56 - INFO - codeparrot_training - Step 36229: {'lr': 8.846808606130891e-05, 'samples': 18549760, 'steps': 36229, 'batch_loss/train': 0.7320269555784762} 12/27/2021 17:41:07 - INFO - codeparrot_training - Step 36230: {'lr': 8.845605549559202e-05, 'samples': 18550272, 'steps': 36230, 'batch_loss/train': 0.8755530556663871} 12/27/2021 17:41:18 - INFO - codeparrot_training - Step 36231: {'lr': 8.844402557211289e-05, 'samples': 
18550784, 'steps': 36231, 'batch_loss/train': 0.8209898872300982} 12/27/2021 17:41:28 - INFO - codeparrot_training - Step 36232: {'lr': 8.84319962909193e-05, 'samples': 18551296, 'steps': 36232, 'batch_loss/train': 0.7149300645105541} 12/27/2021 17:41:40 - INFO - codeparrot_training - Step 36233: {'lr': 8.841996765205909e-05, 'samples': 18551808, 'steps': 36233, 'batch_loss/train': 0.7421365668997169} 12/27/2021 17:41:51 - INFO - codeparrot_training - Step 36234: {'lr': 8.840793965558008e-05, 'samples': 18552320, 'steps': 36234, 'batch_loss/train': 0.8450429746881127} 12/27/2021 17:42:01 - INFO - codeparrot_training - Step 36235: {'lr': 8.839591230153008e-05, 'samples': 18552832, 'steps': 36235, 'batch_loss/train': 0.8102442352101207} 12/27/2021 17:42:15 - INFO - codeparrot_training - Step 36236: {'lr': 8.838388558995691e-05, 'samples': 18553344, 'steps': 36236, 'batch_loss/train': 0.7297140592709184} 12/27/2021 17:42:25 - INFO - codeparrot_training - Step 36237: {'lr': 8.83718595209084e-05, 'samples': 18553856, 'steps': 36237, 'batch_loss/train': 0.6745558613911271} 12/27/2021 17:42:36 - INFO - codeparrot_training - Step 36238: {'lr': 8.835983409443244e-05, 'samples': 18554368, 'steps': 36238, 'batch_loss/train': 0.7577584907412529} 12/27/2021 17:42:48 - INFO - codeparrot_training - Step 36239: {'lr': 8.834780931057656e-05, 'samples': 18554880, 'steps': 36239, 'batch_loss/train': 0.859069561585784} 12/27/2021 17:42:59 - INFO - codeparrot_training - Step 36240: {'lr': 8.833578516938884e-05, 'samples': 18555392, 'steps': 36240, 'batch_loss/train': 0.724005882628262} 12/27/2021 17:43:09 - INFO - codeparrot_training - Step 36241: {'lr': 8.832376167091708e-05, 'samples': 18555904, 'steps': 36241, 'batch_loss/train': 0.7848091199994087} 12/27/2021 17:43:20 - INFO - codeparrot_training - Step 36242: {'lr': 8.831173881520892e-05, 'samples': 18556416, 'steps': 36242, 'batch_loss/train': 0.8279363317415118} 12/27/2021 17:43:34 - INFO - codeparrot_training - Step 36243: 
{'lr': 8.829971660231215e-05, 'samples': 18556928, 'steps': 36243, 'batch_loss/train': 0.6300004439544864} 12/27/2021 17:43:44 - INFO - codeparrot_training - Step 36244: {'lr': 8.828769503227482e-05, 'samples': 18557440, 'steps': 36244, 'batch_loss/train': 0.8071668492630124} 12/27/2021 17:43:55 - INFO - codeparrot_training - Step 36245: {'lr': 8.827567410514444e-05, 'samples': 18557952, 'steps': 36245, 'batch_loss/train': 0.6981073166243732} 12/27/2021 17:44:07 - INFO - codeparrot_training - Step 36246: {'lr': 8.826365382096896e-05, 'samples': 18558464, 'steps': 36246, 'batch_loss/train': 0.8149707202101126} 12/27/2021 17:44:18 - INFO - codeparrot_training - Step 36247: {'lr': 8.825163417979609e-05, 'samples': 18558976, 'steps': 36247, 'batch_loss/train': 1.0217537232674658} 12/27/2021 17:44:28 - INFO - codeparrot_training - Step 36248: {'lr': 8.823961518167368e-05, 'samples': 18559488, 'steps': 36248, 'batch_loss/train': 0.7160894768312573} 12/27/2021 17:44:40 - INFO - codeparrot_training - Step 36249: {'lr': 8.822759682664947e-05, 'samples': 18560000, 'steps': 36249, 'batch_loss/train': 0.7673192359507084} 12/27/2021 17:44:51 - INFO - codeparrot_training - Step 36250: {'lr': 8.821557911477124e-05, 'samples': 18560512, 'steps': 36250, 'batch_loss/train': 0.7175234481692314} 12/27/2021 17:45:02 - INFO - codeparrot_training - Step 36251: {'lr': 8.820356204608679e-05, 'samples': 18561024, 'steps': 36251, 'batch_loss/train': 0.7162219313904643} 12/27/2021 17:45:12 - INFO - codeparrot_training - Step 36252: {'lr': 8.819154562064396e-05, 'samples': 18561536, 'steps': 36252, 'batch_loss/train': 0.6511330960784107} 12/27/2021 17:45:24 - INFO - codeparrot_training - Step 36253: {'lr': 8.817952983849037e-05, 'samples': 18562048, 'steps': 36253, 'batch_loss/train': 0.7754887922201306} 12/27/2021 17:45:35 - INFO - codeparrot_training - Step 36254: {'lr': 8.816751469967379e-05, 'samples': 18562560, 'steps': 36254, 'batch_loss/train': 0.7247647629119456} 12/27/2021 17:45:46 - 
INFO - codeparrot_training - Step 36255: {'lr': 8.815550020424223e-05, 'samples': 18563072, 'steps': 36255, 'batch_loss/train': 0.751273191999644} 12/27/2021 17:45:59 - INFO - codeparrot_training - Step 36256: {'lr': 8.81434863522432e-05, 'samples': 18563584, 'steps': 36256, 'batch_loss/train': 0.736920066177845} 12/27/2021 17:46:10 - INFO - codeparrot_training - Step 36257: {'lr': 8.813147314372457e-05, 'samples': 18564096, 'steps': 36257, 'batch_loss/train': 0.7628879090771079} 12/27/2021 17:46:21 - INFO - codeparrot_training - Step 36258: {'lr': 8.811946057873407e-05, 'samples': 18564608, 'steps': 36258, 'batch_loss/train': 0.8839595755562186} 12/27/2021 17:46:33 - INFO - codeparrot_training - Step 36259: {'lr': 8.810744865731948e-05, 'samples': 18565120, 'steps': 36259, 'batch_loss/train': 0.7481148880906403} 12/27/2021 17:46:43 - INFO - codeparrot_training - Step 36260: {'lr': 8.809543737952853e-05, 'samples': 18565632, 'steps': 36260, 'batch_loss/train': 0.7596350652165711} 12/27/2021 17:46:54 - INFO - codeparrot_training - Step 36261: {'lr': 8.8083426745409e-05, 'samples': 18566144, 'steps': 36261, 'batch_loss/train': 0.7876074891537428} 12/27/2021 17:47:06 - INFO - codeparrot_training - Step 36262: {'lr': 8.807141675500863e-05, 'samples': 18566656, 'steps': 36262, 'batch_loss/train': 0.8113548378460109} 12/27/2021 17:47:17 - INFO - codeparrot_training - Step 36263: {'lr': 8.805940740837515e-05, 'samples': 18567168, 'steps': 36263, 'batch_loss/train': 0.842553979717195} 12/27/2021 17:47:27 - INFO - codeparrot_training - Step 36264: {'lr': 8.804739870555642e-05, 'samples': 18567680, 'steps': 36264, 'batch_loss/train': 0.7147026634775102} 12/27/2021 17:47:38 - INFO - codeparrot_training - Step 36265: {'lr': 8.80353906465999e-05, 'samples': 18568192, 'steps': 36265, 'batch_loss/train': 0.762251565232873} 12/27/2021 17:47:51 - INFO - codeparrot_training - Step 36266: {'lr': 8.802338323155362e-05, 'samples': 18568704, 'steps': 36266, 'batch_loss/train': 
0.6575110871344805} 12/27/2021 17:48:02 - INFO - codeparrot_training - Step 36267: {'lr': 8.801137646046528e-05, 'samples': 18569216, 'steps': 36267, 'batch_loss/train': 0.7126166480593383} 12/27/2021 17:48:13 - INFO - codeparrot_training - Step 36268: {'lr': 8.799937033338246e-05, 'samples': 18569728, 'steps': 36268, 'batch_loss/train': 0.8396990243345499} 12/27/2021 17:48:25 - INFO - codeparrot_training - Step 36269: {'lr': 8.798736485035289e-05, 'samples': 18570240, 'steps': 36269, 'batch_loss/train': 0.7701278631575406} 12/27/2021 17:48:36 - INFO - codeparrot_training - Step 36270: {'lr': 8.797536001142453e-05, 'samples': 18570752, 'steps': 36270, 'batch_loss/train': 0.7545412871986628} 12/27/2021 17:48:46 - INFO - codeparrot_training - Step 36271: {'lr': 8.796335581664489e-05, 'samples': 18571264, 'steps': 36271, 'batch_loss/train': 0.8061935151927173} 12/27/2021 17:48:57 - INFO - codeparrot_training - Step 36272: {'lr': 8.795135226606176e-05, 'samples': 18571776, 'steps': 36272, 'batch_loss/train': 0.7995295580476522} 12/27/2021 17:49:10 - INFO - codeparrot_training - Step 36273: {'lr': 8.793934935972289e-05, 'samples': 18572288, 'steps': 36273, 'batch_loss/train': 0.6112108719535172} 12/27/2021 17:49:21 - INFO - codeparrot_training - Step 36274: {'lr': 8.792734709767597e-05, 'samples': 18572800, 'steps': 36274, 'batch_loss/train': 0.7974979830905795} 12/27/2021 17:49:31 - INFO - codeparrot_training - Step 36275: {'lr': 8.791534547996872e-05, 'samples': 18573312, 'steps': 36275, 'batch_loss/train': 0.6972875534556806} 12/27/2021 17:49:43 - INFO - codeparrot_training - Step 36276: {'lr': 8.790334450664884e-05, 'samples': 18573824, 'steps': 36276, 'batch_loss/train': 0.7886052774265409} 12/27/2021 17:49:54 - INFO - codeparrot_training - Step 36277: {'lr': 8.789134417776404e-05, 'samples': 18574336, 'steps': 36277, 'batch_loss/train': 0.7799499533139169} 12/27/2021 17:50:04 - INFO - codeparrot_training - Step 36278: {'lr': 8.787934449336208e-05, 'samples': 
18574848, 'steps': 36278, 'batch_loss/train': 0.6851506466045976} 12/27/2021 17:50:17 - INFO - codeparrot_training - Step 36279: {'lr': 8.78673454534907e-05, 'samples': 18575360, 'steps': 36279, 'batch_loss/train': 0.9720335090532899} 12/27/2021 17:50:27 - INFO - codeparrot_training - Step 36280: {'lr': 8.785534705819737e-05, 'samples': 18575872, 'steps': 36280, 'batch_loss/train': 0.7641565217636526} 12/27/2021 17:50:38 - INFO - codeparrot_training - Step 36281: {'lr': 8.784334930753013e-05, 'samples': 18576384, 'steps': 36281, 'batch_loss/train': 0.8453644039109349} 12/27/2021 17:50:52 - INFO - codeparrot_training - Step 36282: {'lr': 8.783135220153643e-05, 'samples': 18576896, 'steps': 36282, 'batch_loss/train': 0.8636897448450327} 12/27/2021 17:51:02 - INFO - codeparrot_training - Step 36283: {'lr': 8.781935574026404e-05, 'samples': 18577408, 'steps': 36283, 'batch_loss/train': 0.7914618843933567} 12/27/2021 17:51:13 - INFO - codeparrot_training - Step 36284: {'lr': 8.780735992376063e-05, 'samples': 18577920, 'steps': 36284, 'batch_loss/train': 0.7568237234372646} 12/27/2021 17:51:23 - INFO - codeparrot_training - Step 36285: {'lr': 8.779536475207392e-05, 'samples': 18578432, 'steps': 36285, 'batch_loss/train': 0.686777900904417} 12/27/2021 17:51:36 - INFO - codeparrot_training - Step 36286: {'lr': 8.778337022525163e-05, 'samples': 18578944, 'steps': 36286, 'batch_loss/train': 0.769679814344272} 12/27/2021 17:51:46 - INFO - codeparrot_training - Step 36287: {'lr': 8.777137634334138e-05, 'samples': 18579456, 'steps': 36287, 'batch_loss/train': 0.8194736069999635} 12/27/2021 17:51:57 - INFO - codeparrot_training - Step 36288: {'lr': 8.775938310639086e-05, 'samples': 18579968, 'steps': 36288, 'batch_loss/train': 0.7330724662169814} 12/27/2021 17:52:09 - INFO - codeparrot_training - Step 36289: {'lr': 8.77473905144478e-05, 'samples': 18580480, 'steps': 36289, 'batch_loss/train': 0.7879522047005594} 12/27/2021 17:52:20 - INFO - codeparrot_training - Step 36290: 
{'lr': 8.773539856755994e-05, 'samples': 18580992, 'steps': 36290, 'batch_loss/train': 0.7136322013102472} 12/27/2021 17:52:31 - INFO - codeparrot_training - Step 36291: {'lr': 8.772340726577471e-05, 'samples': 18581504, 'steps': 36291, 'batch_loss/train': 0.7607407844625413} 12/27/2021 17:52:43 - INFO - codeparrot_training - Step 36292: {'lr': 8.771141660914001e-05, 'samples': 18582016, 'steps': 36292, 'batch_loss/train': 0.7547779362648726} 12/27/2021 17:52:53 - INFO - codeparrot_training - Step 36293: {'lr': 8.76994265977035e-05, 'samples': 18582528, 'steps': 36293, 'batch_loss/train': 0.7941035199910402} 12/27/2021 17:53:04 - INFO - codeparrot_training - Step 36294: {'lr': 8.768743723151273e-05, 'samples': 18583040, 'steps': 36294, 'batch_loss/train': 0.6976884411997162} 12/27/2021 17:53:14 - INFO - codeparrot_training - Step 36295: {'lr': 8.767544851061532e-05, 'samples': 18583552, 'steps': 36295, 'batch_loss/train': 0.7169682178646326} 12/27/2021 17:53:29 - INFO - codeparrot_training - Step 36296: {'lr': 8.766346043505924e-05, 'samples': 18584064, 'steps': 36296, 'batch_loss/train': 0.6667459304444492} 12/27/2021 17:53:39 - INFO - codeparrot_training - Step 36297: {'lr': 8.765147300489182e-05, 'samples': 18584576, 'steps': 36297, 'batch_loss/train': 0.7823790507391095} 12/27/2021 17:53:50 - INFO - codeparrot_training - Step 36298: {'lr': 8.763948622016085e-05, 'samples': 18585088, 'steps': 36298, 'batch_loss/train': 0.567451585753588} 12/27/2021 17:54:02 - INFO - codeparrot_training - Step 36299: {'lr': 8.7627500080914e-05, 'samples': 18585600, 'steps': 36299, 'batch_loss/train': 0.6537379111396149} 12/27/2021 17:54:13 - INFO - codeparrot_training - Step 36300: {'lr': 8.761551458719888e-05, 'samples': 18586112, 'steps': 36300, 'batch_loss/train': 0.8252566773444414} 12/27/2021 17:54:23 - INFO - codeparrot_training - Step 36301: {'lr': 8.760352973906318e-05, 'samples': 18586624, 'steps': 36301, 'batch_loss/train': 0.7268011644482613} 12/27/2021 17:54:37 - INFO 
- codeparrot_training - Step 36302: {'lr': 8.759154553655452e-05, 'samples': 18587136, 'steps': 36302, 'batch_loss/train': 0.708038083743304} 12/27/2021 17:54:47 - INFO - codeparrot_training - Step 36303: {'lr': 8.757956197972053e-05, 'samples': 18587648, 'steps': 36303, 'batch_loss/train': 0.6288723237812519} 12/27/2021 17:54:58 - INFO - codeparrot_training - Step 36304: {'lr': 8.756757906860891e-05, 'samples': 18588160, 'steps': 36304, 'batch_loss/train': 0.602703768061474} 12/27/2021 17:55:10 - INFO - codeparrot_training - Step 36305: {'lr': 8.755559680326733e-05, 'samples': 18588672, 'steps': 36305, 'batch_loss/train': 0.6866074567660689} 12/27/2021 17:55:21 - INFO - codeparrot_training - Step 36306: {'lr': 8.754361518374318e-05, 'samples': 18589184, 'steps': 36306, 'batch_loss/train': 0.7677875059307553} 12/27/2021 17:55:31 - INFO - codeparrot_training - Step 36307: {'lr': 8.753163421008445e-05, 'samples': 18589696, 'steps': 36307, 'batch_loss/train': 0.6385405566543341} 12/27/2021 17:55:42 - INFO - codeparrot_training - Step 36308: {'lr': 8.75196538823385e-05, 'samples': 18590208, 'steps': 36308, 'batch_loss/train': 0.6209061802364886} 12/27/2021 17:55:54 - INFO - codeparrot_training - Step 36309: {'lr': 8.7507674200553e-05, 'samples': 18590720, 'steps': 36309, 'batch_loss/train': 0.7781145320041105} 12/27/2021 17:56:05 - INFO - codeparrot_training - Step 36310: {'lr': 8.749569516477579e-05, 'samples': 18591232, 'steps': 36310, 'batch_loss/train': 0.7383133312687278} 12/27/2021 17:56:15 - INFO - codeparrot_training - Step 36311: {'lr': 8.748371677505426e-05, 'samples': 18591744, 'steps': 36311, 'batch_loss/train': 0.789164446759969} 12/27/2021 17:56:29 - INFO - codeparrot_training - Step 36312: {'lr': 8.74717390314361e-05, 'samples': 18592256, 'steps': 36312, 'batch_loss/train': 0.7418544963002205} 12/27/2021 17:56:39 - INFO - codeparrot_training - Step 36313: {'lr': 8.745976193396892e-05, 'samples': 18592768, 'steps': 36313, 'batch_loss/train': 
0.7908967873081565} 12/27/2021 17:56:50 - INFO - codeparrot_training - Step 36314: {'lr': 8.744778548270039e-05, 'samples': 18593280, 'steps': 36314, 'batch_loss/train': 0.7561267599230632} 12/27/2021 17:57:02 - INFO - codeparrot_training - Step 36315: {'lr': 8.743580967767806e-05, 'samples': 18593792, 'steps': 36315, 'batch_loss/train': 0.74572801112663} 12/27/2021 17:57:13 - INFO - codeparrot_training - Step 36316: {'lr': 8.742383451894966e-05, 'samples': 18594304, 'steps': 36316, 'batch_loss/train': 0.6952911550179124} 12/27/2021 17:57:23 - INFO - codeparrot_training - Step 36317: {'lr': 8.741186000656254e-05, 'samples': 18594816, 'steps': 36317, 'batch_loss/train': 0.6189573990995996} 12/27/2021 17:57:34 - INFO - codeparrot_training - Step 36318: {'lr': 8.739988614056457e-05, 'samples': 18595328, 'steps': 36318, 'batch_loss/train': 0.7712802696041763} 12/27/2021 17:57:46 - INFO - codeparrot_training - Step 36319: {'lr': 8.738791292100331e-05, 'samples': 18595840, 'steps': 36319, 'batch_loss/train': 0.6113951722509228} 12/27/2021 17:57:57 - INFO - codeparrot_training - Step 36320: {'lr': 8.737594034792626e-05, 'samples': 18596352, 'steps': 36320, 'batch_loss/train': 0.798991653136909} 12/27/2021 17:58:07 - INFO - codeparrot_training - Step 36321: {'lr': 8.736396842138098e-05, 'samples': 18596864, 'steps': 36321, 'batch_loss/train': 0.7371747680008411} 12/27/2021 17:58:21 - INFO - codeparrot_training - Step 36322: {'lr': 8.735199714141528e-05, 'samples': 18597376, 'steps': 36322, 'batch_loss/train': 0.8484732704237103} 12/27/2021 17:58:32 - INFO - codeparrot_training - Step 36323: {'lr': 8.734002650807656e-05, 'samples': 18597888, 'steps': 36323, 'batch_loss/train': 0.6194661818444729} 12/27/2021 17:58:42 - INFO - codeparrot_training - Step 36324: {'lr': 8.732805652141243e-05, 'samples': 18598400, 'steps': 36324, 'batch_loss/train': 0.8038279679603875} 12/27/2021 17:58:54 - INFO - codeparrot_training - Step 36325: {'lr': 8.731608718147066e-05, 'samples': 
18598912, 'steps': 36325, 'batch_loss/train': 0.7796403234824538} 12/27/2021 17:59:05 - INFO - codeparrot_training - Step 36326: {'lr': 8.730411848829862e-05, 'samples': 18599424, 'steps': 36326, 'batch_loss/train': 0.7515359960962087} 12/27/2021 17:59:16 - INFO - codeparrot_training - Step 36327: {'lr': 8.729215044194397e-05, 'samples': 18599936, 'steps': 36327, 'batch_loss/train': 0.7611063243821263} 12/27/2021 17:59:26 - INFO - codeparrot_training - Step 36328: {'lr': 8.72801830424543e-05, 'samples': 18600448, 'steps': 36328, 'batch_loss/train': 0.7476152768358588} 12/27/2021 17:59:38 - INFO - codeparrot_training - Step 36329: {'lr': 8.72682162898772e-05, 'samples': 18600960, 'steps': 36329, 'batch_loss/train': 0.7506369724869728} 12/27/2021 17:59:49 - INFO - codeparrot_training - Step 36330: {'lr': 8.72562501842602e-05, 'samples': 18601472, 'steps': 36330, 'batch_loss/train': 0.815529559738934} 12/27/2021 18:00:00 - INFO - codeparrot_training - Step 36331: {'lr': 8.7244284725651e-05, 'samples': 18601984, 'steps': 36331, 'batch_loss/train': 0.6676201447844505} 12/27/2021 18:00:13 - INFO - codeparrot_training - Step 36332: {'lr': 8.723231991409689e-05, 'samples': 18602496, 'steps': 36332, 'batch_loss/train': 0.6988983969204128} 12/27/2021 18:00:24 - INFO - codeparrot_training - Step 36333: {'lr': 8.722035574964568e-05, 'samples': 18603008, 'steps': 36333, 'batch_loss/train': 0.7551870243623853} 12/27/2021 18:00:35 - INFO - codeparrot_training - Step 36334: {'lr': 8.720839223234497e-05, 'samples': 18603520, 'steps': 36334, 'batch_loss/train': 0.7685743300244212} 12/27/2021 18:00:47 - INFO - codeparrot_training - Step 36335: {'lr': 8.719642936224209e-05, 'samples': 18604032, 'steps': 36335, 'batch_loss/train': 0.6241638152860105} 12/27/2021 18:00:58 - INFO - codeparrot_training - Step 36336: {'lr': 8.718446713938488e-05, 'samples': 18604544, 'steps': 36336, 'batch_loss/train': 0.5499983059708029} 12/27/2021 18:01:08 - INFO - codeparrot_training - Step 36337: {'lr': 
8.717250556382066e-05, 'samples': 18605056, 'steps': 36337, 'batch_loss/train': 0.7526795249432325} 12/27/2021 18:01:19 - INFO - codeparrot_training - Step 36338: {'lr': 8.716054463559708e-05, 'samples': 18605568, 'steps': 36338, 'batch_loss/train': 0.7224189681001008} 12/27/2021 18:01:31 - INFO - codeparrot_training - Step 36339: {'lr': 8.71485843547617e-05, 'samples': 18606080, 'steps': 36339, 'batch_loss/train': 0.6635256982408464} 12/27/2021 18:01:42 - INFO - codeparrot_training - Step 36340: {'lr': 8.713662472136206e-05, 'samples': 18606592, 'steps': 36340, 'batch_loss/train': 0.7889688257128} 12/27/2021 18:01:52 - INFO - codeparrot_training - Step 36341: {'lr': 8.712466573544567e-05, 'samples': 18607104, 'steps': 36341, 'batch_loss/train': 0.7822269340977073} 12/27/2021 18:02:06 - INFO - codeparrot_training - Step 36342: {'lr': 8.711270739706023e-05, 'samples': 18607616, 'steps': 36342, 'batch_loss/train': 0.788217831403017} 12/27/2021 18:02:17 - INFO - codeparrot_training - Step 36343: {'lr': 8.710074970625297e-05, 'samples': 18608128, 'steps': 36343, 'batch_loss/train': 0.7598125776275992} 12/27/2021 18:02:27 - INFO - codeparrot_training - Step 36344: {'lr': 8.708879266307174e-05, 'samples': 18608640, 'steps': 36344, 'batch_loss/train': 0.7524355761706829} 12/27/2021 18:02:40 - INFO - codeparrot_training - Step 36345: {'lr': 8.707683626756402e-05, 'samples': 18609152, 'steps': 36345, 'batch_loss/train': 0.7500130119733512} 12/27/2021 18:02:50 - INFO - codeparrot_training - Step 36346: {'lr': 8.706488051977709e-05, 'samples': 18609664, 'steps': 36346, 'batch_loss/train': 0.7183595011010766} 12/27/2021 18:03:01 - INFO - codeparrot_training - Step 36347: {'lr': 8.705292541975879e-05, 'samples': 18610176, 'steps': 36347, 'batch_loss/train': 0.7333030826412141} 12/27/2021 18:03:11 - INFO - codeparrot_training - Step 36348: {'lr': 8.704097096755662e-05, 'samples': 18610688, 'steps': 36348, 'batch_loss/train': 0.793797698803246} 12/27/2021 18:03:23 - INFO - 
codeparrot_training - Step 36349: {'lr': 8.70290171632179e-05, 'samples': 18611200, 'steps': 36349, 'batch_loss/train': 0.7488683681003749} 12/27/2021 18:03:34 - INFO - codeparrot_training - Step 36350: {'lr': 8.701706400679021e-05, 'samples': 18611712, 'steps': 36350, 'batch_loss/train': 0.6875086473301053} 12/27/2021 18:03:45 - INFO - codeparrot_training - Step 36351: {'lr': 8.700511149832133e-05, 'samples': 18612224, 'steps': 36351, 'batch_loss/train': 0.7009973954409361} 12/27/2021 18:03:59 - INFO - codeparrot_training - Step 36352: {'lr': 8.699315963785845e-05, 'samples': 18612736, 'steps': 36352, 'batch_loss/train': 0.830223286524415} 12/27/2021 18:04:09 - INFO - codeparrot_training - Step 36353: {'lr': 8.698120842544924e-05, 'samples': 18613248, 'steps': 36353, 'batch_loss/train': 0.8668190538883209} 12/27/2021 18:04:20 - INFO - codeparrot_training - Step 36354: {'lr': 8.696925786114119e-05, 'samples': 18613760, 'steps': 36354, 'batch_loss/train': 0.6667782583972439} 12/27/2021 18:04:32 - INFO - codeparrot_training - Step 36355: {'lr': 8.69573079449818e-05, 'samples': 18614272, 'steps': 36355, 'batch_loss/train': 0.6350918132811785} 12/27/2021 18:04:43 - INFO - codeparrot_training - Step 36356: {'lr': 8.694535867701862e-05, 'samples': 18614784, 'steps': 36356, 'batch_loss/train': 0.6821637619286776} 12/27/2021 18:04:53 - INFO - codeparrot_training - Step 36357: {'lr': 8.693341005729918e-05, 'samples': 18615296, 'steps': 36357, 'batch_loss/train': 0.679288269020617} 12/27/2021 18:05:04 - INFO - codeparrot_training - Step 36358: {'lr': 8.692146208587079e-05, 'samples': 18615808, 'steps': 36358, 'batch_loss/train': 0.7258469039807096} 12/27/2021 18:05:17 - INFO - codeparrot_training - Step 36359: {'lr': 8.690951476278114e-05, 'samples': 18616320, 'steps': 36359, 'batch_loss/train': 0.7897851960733533} 12/27/2021 18:05:28 - INFO - codeparrot_training - Step 36360: {'lr': 8.68975680880778e-05, 'samples': 18616832, 'steps': 36360, 'batch_loss/train': 
0.8061929391697049} 12/27/2021 18:05:39 - INFO - codeparrot_training - Step 36361: {'lr': 8.688562206180794e-05, 'samples': 18617344, 'steps': 36361, 'batch_loss/train': 0.7177550608757883} 12/27/2021 18:05:51 - INFO - codeparrot_training - Step 36362: {'lr': 8.687367668401947e-05, 'samples': 18617856, 'steps': 36362, 'batch_loss/train': 0.6454241198953241} 12/27/2021 18:06:02 - INFO - codeparrot_training - Step 36363: {'lr': 8.686173195475955e-05, 'samples': 18618368, 'steps': 36363, 'batch_loss/train': 1.6421931628137827} 12/27/2021 18:06:12 - INFO - codeparrot_training - Step 36364: {'lr': 8.68497878740758e-05, 'samples': 18618880, 'steps': 36364, 'batch_loss/train': 0.6816184599883854} 12/27/2021 18:06:24 - INFO - codeparrot_training - Step 36365: {'lr': 8.683784444201567e-05, 'samples': 18619392, 'steps': 36365, 'batch_loss/train': 0.664449610747397} 12/27/2021 18:06:35 - INFO - codeparrot_training - Step 36366: {'lr': 8.682590165862666e-05, 'samples': 18619904, 'steps': 36366, 'batch_loss/train': 0.5899224642198533} 12/27/2021 18:06:46 - INFO - codeparrot_training - Step 36367: {'lr': 8.68139595239563e-05, 'samples': 18620416, 'steps': 36367, 'batch_loss/train': 0.7158440230414271} 12/27/2021 18:06:56 - INFO - codeparrot_training - Step 36368: {'lr': 8.680201803805198e-05, 'samples': 18620928, 'steps': 36368, 'batch_loss/train': 1.210860381834209} 12/27/2021 18:07:09 - INFO - codeparrot_training - Step 36369: {'lr': 8.679007720096121e-05, 'samples': 18621440, 'steps': 36369, 'batch_loss/train': 1.1529812812805176} 12/27/2021 18:07:20 - INFO - codeparrot_training - Step 36370: {'lr': 8.677813701273148e-05, 'samples': 18621952, 'steps': 36370, 'batch_loss/train': 0.8222639709711075} 12/27/2021 18:07:31 - INFO - codeparrot_training - Step 36371: {'lr': 8.676619747341036e-05, 'samples': 18622464, 'steps': 36371, 'batch_loss/train': 0.7863074215129018} 12/27/2021 18:07:43 - INFO - codeparrot_training - Step 36372: {'lr': 8.675425858304501e-05, 'samples': 18622976, 
'steps': 36372, 'batch_loss/train': 0.8033138606697321} 12/27/2021 18:07:54 - INFO - codeparrot_training - Step 36373: {'lr': 8.674232034168317e-05, 'samples': 18623488, 'steps': 36373, 'batch_loss/train': 0.7384617989882827} 12/27/2021 18:08:04 - INFO - codeparrot_training - Step 36374: {'lr': 8.673038274937228e-05, 'samples': 18624000, 'steps': 36374, 'batch_loss/train': 0.6132559161633253} 12/27/2021 18:08:16 - INFO - codeparrot_training - Step 36375: {'lr': 8.671844580615971e-05, 'samples': 18624512, 'steps': 36375, 'batch_loss/train': 0.7951045883819461} 12/27/2021 18:08:27 - INFO - codeparrot_training - Step 36376: {'lr': 8.670650951209283e-05, 'samples': 18625024, 'steps': 36376, 'batch_loss/train': 0.750059018842876} 12/27/2021 18:08:38 - INFO - codeparrot_training - Step 36377: {'lr': 8.669457386721938e-05, 'samples': 18625536, 'steps': 36377, 'batch_loss/train': 0.7567000374547206} 12/27/2021 18:08:48 - INFO - codeparrot_training - Step 36378: {'lr': 8.668263887158656e-05, 'samples': 18626048, 'steps': 36378, 'batch_loss/train': 0.7590079344809055} 12/27/2021 18:09:00 - INFO - codeparrot_training - Step 36379: {'lr': 8.667070452524189e-05, 'samples': 18626560, 'steps': 36379, 'batch_loss/train': 0.8041885672137141} 12/27/2021 18:09:11 - INFO - codeparrot_training - Step 36380: {'lr': 8.665877082823284e-05, 'samples': 18627072, 'steps': 36380, 'batch_loss/train': 0.8715439904481173} 12/27/2021 18:09:22 - INFO - codeparrot_training - Step 36381: {'lr': 8.664683778060686e-05, 'samples': 18627584, 'steps': 36381, 'batch_loss/train': 0.7751912144012749} 12/27/2021 18:09:36 - INFO - codeparrot_training - Step 36382: {'lr': 8.663490538241136e-05, 'samples': 18628096, 'steps': 36382, 'batch_loss/train': 0.6650896314531565} 12/27/2021 18:09:46 - INFO - codeparrot_training - Step 36383: {'lr': 8.662297363369378e-05, 'samples': 18628608, 'steps': 36383, 'batch_loss/train': 0.7519657956436276} 12/27/2021 18:09:57 - INFO - codeparrot_training - Step 36384: {'lr': 
8.661104253450158e-05, 'samples': 18629120, 'steps': 36384, 'batch_loss/train': 0.860996762290597} 12/27/2021 18:10:09 - INFO - codeparrot_training - Step 36385: {'lr': 8.659911208488216e-05, 'samples': 18629632, 'steps': 36385, 'batch_loss/train': 0.6665767794474959} 12/27/2021 18:10:20 - INFO - codeparrot_training - Step 36386: {'lr': 8.65871822848831e-05, 'samples': 18630144, 'steps': 36386, 'batch_loss/train': 0.7482742671854794} 12/27/2021 18:10:30 - INFO - codeparrot_training - Step 36387: {'lr': 8.65752531345515e-05, 'samples': 18630656, 'steps': 36387, 'batch_loss/train': 0.6328864124370739} 12/27/2021 18:10:41 - INFO - codeparrot_training - Step 36388: {'lr': 8.656332463393515e-05, 'samples': 18631168, 'steps': 36388, 'batch_loss/train': 0.6804177984595299} 12/27/2021 18:10:54 - INFO - codeparrot_training - Step 36389: {'lr': 8.655139678308124e-05, 'samples': 18631680, 'steps': 36389, 'batch_loss/train': 0.690496145747602} 12/27/2021 18:11:05 - INFO - codeparrot_training - Step 36390: {'lr': 8.653946958203726e-05, 'samples': 18632192, 'steps': 36390, 'batch_loss/train': 0.7302074700128287} 12/27/2021 18:11:15 - INFO - codeparrot_training - Step 36391: {'lr': 8.652754303085064e-05, 'samples': 18632704, 'steps': 36391, 'batch_loss/train': 0.7921981813851744} 12/27/2021 18:11:27 - INFO - codeparrot_training - Step 36392: {'lr': 8.651561712956877e-05, 'samples': 18633216, 'steps': 36392, 'batch_loss/train': 0.7527432646602392} 12/27/2021 18:11:38 - INFO - codeparrot_training - Step 36393: {'lr': 8.650369187823909e-05, 'samples': 18633728, 'steps': 36393, 'batch_loss/train': 0.6525541068986058} 12/27/2021 18:11:49 - INFO - codeparrot_training - Step 36394: {'lr': 8.649176727690897e-05, 'samples': 18634240, 'steps': 36394, 'batch_loss/train': 0.751945317722857} 12/27/2021 18:12:01 - INFO - codeparrot_training - Step 36395: {'lr': 8.647984332562587e-05, 'samples': 18634752, 'steps': 36395, 'batch_loss/train': 0.7599515756592155} 12/27/2021 18:12:11 - INFO - 
codeparrot_training - Step 36396: {'lr': 8.646792002443715e-05, 'samples': 18635264, 'steps': 36396, 'batch_loss/train': 0.8437130195088685} 12/27/2021 18:12:22 - INFO - codeparrot_training - Step 36397: {'lr': 8.645599737339035e-05, 'samples': 18635776, 'steps': 36397, 'batch_loss/train': 0.8108250317163765} 12/27/2021 18:12:33 - INFO - codeparrot_training - Step 36398: {'lr': 8.644407537253254e-05, 'samples': 18636288, 'steps': 36398, 'batch_loss/train': 0.7802631775848567} 12/27/2021 18:12:46 - INFO - codeparrot_training - Step 36399: {'lr': 8.643215402191145e-05, 'samples': 18636800, 'steps': 36399, 'batch_loss/train': 0.7230771949980408} 12/27/2021 18:12:56 - INFO - codeparrot_training - Step 36400: {'lr': 8.642023332157443e-05, 'samples': 18637312, 'steps': 36400, 'batch_loss/train': 0.740625592879951} 12/27/2021 18:13:07 - INFO - codeparrot_training - Step 36401: {'lr': 8.64083132715687e-05, 'samples': 18637824, 'steps': 36401, 'batch_loss/train': 0.8271403961116448} 12/27/2021 18:13:19 - INFO - codeparrot_training - Step 36402: {'lr': 8.639639387194167e-05, 'samples': 18638336, 'steps': 36402, 'batch_loss/train': 0.6985666663385928} 12/27/2021 18:13:30 - INFO - codeparrot_training - Step 36403: {'lr': 8.638447512274097e-05, 'samples': 18638848, 'steps': 36403, 'batch_loss/train': 0.6942899222485721} 12/27/2021 18:13:40 - INFO - codeparrot_training - Step 36404: {'lr': 8.637255702401375e-05, 'samples': 18639360, 'steps': 36404, 'batch_loss/train': 0.7380894031375647} 12/27/2021 18:13:52 - INFO - codeparrot_training - Step 36405: {'lr': 8.636063957580744e-05, 'samples': 18639872, 'steps': 36405, 'batch_loss/train': 0.7247441109502688} 12/27/2021 18:14:03 - INFO - codeparrot_training - Step 36406: {'lr': 8.634872277816946e-05, 'samples': 18640384, 'steps': 36406, 'batch_loss/train': 0.698512832634151} 12/27/2021 18:14:14 - INFO - codeparrot_training - Step 36407: {'lr': 8.633680663114715e-05, 'samples': 18640896, 'steps': 36407, 'batch_loss/train': 
0.6441687904298306} 12/27/2021 18:14:24 - INFO - codeparrot_training - Step 36408: {'lr': 8.632489113478789e-05, 'samples': 18641408, 'steps': 36408, 'batch_loss/train': 0.6874147709459066} 12/27/2021 18:14:37 - INFO - codeparrot_training - Step 36409: {'lr': 8.631297628913909e-05, 'samples': 18641920, 'steps': 36409, 'batch_loss/train': 0.8103691693395376} 12/27/2021 18:14:48 - INFO - codeparrot_training - Step 36410: {'lr': 8.630106209424806e-05, 'samples': 18642432, 'steps': 36410, 'batch_loss/train': 0.6687327325344086} 12/27/2021 18:14:59 - INFO - codeparrot_training - Step 36411: {'lr': 8.62891485501622e-05, 'samples': 18642944, 'steps': 36411, 'batch_loss/train': 0.7361603886820376} 12/27/2021 18:15:11 - INFO - codeparrot_training - Step 36412: {'lr': 8.627723565692899e-05, 'samples': 18643456, 'steps': 36412, 'batch_loss/train': 0.7646939884871244} 12/27/2021 18:15:21 - INFO - codeparrot_training - Step 36413: {'lr': 8.626532341459548e-05, 'samples': 18643968, 'steps': 36413, 'batch_loss/train': 0.6808275962248445} 12/27/2021 18:15:32 - INFO - codeparrot_training - Step 36414: {'lr': 8.625341182320929e-05, 'samples': 18644480, 'steps': 36414, 'batch_loss/train': 0.7707731206901371} 12/27/2021 18:15:44 - INFO - codeparrot_training - Step 36415: {'lr': 8.624150088281782e-05, 'samples': 18644992, 'steps': 36415, 'batch_loss/train': 0.8093417058698833} 12/27/2021 18:15:55 - INFO - codeparrot_training - Step 36416: {'lr': 8.622959059346819e-05, 'samples': 18645504, 'steps': 36416, 'batch_loss/train': 0.8310194229707122} 12/27/2021 18:16:05 - INFO - codeparrot_training - Step 36417: {'lr': 8.62176809552079e-05, 'samples': 18646016, 'steps': 36417, 'batch_loss/train': 0.6469205068424344} 12/27/2021 18:16:19 - INFO - codeparrot_training - Step 36418: {'lr': 8.620577196808427e-05, 'samples': 18646528, 'steps': 36418, 'batch_loss/train': 0.7612549159675837} 12/27/2021 18:16:29 - INFO - codeparrot_training - Step 36419: {'lr': 8.619386363214462e-05, 'samples': 
18647040, 'steps': 36419, 'batch_loss/train': 0.7582234162837267} 12/27/2021 18:16:40 - INFO - codeparrot_training - Step 36420: {'lr': 8.618195594743633e-05, 'samples': 18647552, 'steps': 36420, 'batch_loss/train': 0.7407724549993873} 12/27/2021 18:16:51 - INFO - codeparrot_training - Step 36421: {'lr': 8.617004891400673e-05, 'samples': 18648064, 'steps': 36421, 'batch_loss/train': 0.803670144174248} 12/27/2021 18:17:03 - INFO - codeparrot_training - Step 36422: {'lr': 8.615814253190316e-05, 'samples': 18648576, 'steps': 36422, 'batch_loss/train': 0.7434567180462182} 12/27/2021 18:17:13 - INFO - codeparrot_training - Step 36423: {'lr': 8.614623680117295e-05, 'samples': 18649088, 'steps': 36423, 'batch_loss/train': 1.65714799053967} 12/27/2021 18:17:24 - INFO - codeparrot_training - Step 36424: {'lr': 8.613433172186341e-05, 'samples': 18649600, 'steps': 36424, 'batch_loss/train': 0.7004383311141282} 12/27/2021 18:17:36 - INFO - codeparrot_training - Step 36425: {'lr': 8.612242729402192e-05, 'samples': 18650112, 'steps': 36425, 'batch_loss/train': 0.7022304669953883} 12/27/2021 18:17:47 - INFO - codeparrot_training - Step 36426: {'lr': 8.611052351769586e-05, 'samples': 18650624, 'steps': 36426, 'batch_loss/train': 0.5806684494018555} 12/27/2021 18:17:57 - INFO - codeparrot_training - Step 36427: {'lr': 8.60986203929324e-05, 'samples': 18651136, 'steps': 36427, 'batch_loss/train': 0.6815107362344861} 12/27/2021 18:18:11 - INFO - codeparrot_training - Step 36428: {'lr': 8.608671791977884e-05, 'samples': 18651648, 'steps': 36428, 'batch_loss/train': 0.7339915648335591} 12/27/2021 18:18:21 - INFO - codeparrot_training - Step 36429: {'lr': 8.607481609828277e-05, 'samples': 18652160, 'steps': 36429, 'batch_loss/train': 0.6549890977330506} 12/27/2021 18:18:32 - INFO - codeparrot_training - Step 36430: {'lr': 8.606291492849125e-05, 'samples': 18652672, 'steps': 36430, 'batch_loss/train': 0.5410484811291099} 12/27/2021 18:18:43 - INFO - codeparrot_training - Step 36431: 
{'lr': 8.605101441045168e-05, 'samples': 18653184, 'steps': 36431, 'batch_loss/train': 0.7298778686672449} 12/27/2021 18:18:55 - INFO - codeparrot_training - Step 36432: {'lr': 8.603911454421138e-05, 'samples': 18653696, 'steps': 36432, 'batch_loss/train': 0.6700654062442482} 12/27/2021 18:19:06 - INFO - codeparrot_training - Step 36433: {'lr': 8.602721532981764e-05, 'samples': 18654208, 'steps': 36433, 'batch_loss/train': 0.743996313191019} 12/27/2021 18:19:16 - INFO - codeparrot_training - Step 36434: {'lr': 8.601531676731778e-05, 'samples': 18654720, 'steps': 36434, 'batch_loss/train': 0.7104856870137155} 12/27/2021 18:19:28 - INFO - codeparrot_training - Step 36435: {'lr': 8.600341885675911e-05, 'samples': 18655232, 'steps': 36435, 'batch_loss/train': 0.736615139991045} 12/27/2021 18:19:39 - INFO - codeparrot_training - Step 36436: {'lr': 8.599152159818893e-05, 'samples': 18655744, 'steps': 36436, 'batch_loss/train': 0.6888014581054449} 12/27/2021 18:19:50 - INFO - codeparrot_training - Step 36437: {'lr': 8.597962499165451e-05, 'samples': 18656256, 'steps': 36437, 'batch_loss/train': 0.7594074741355143} 12/27/2021 18:20:03 - INFO - codeparrot_training - Step 36438: {'lr': 8.596772903720326e-05, 'samples': 18656768, 'steps': 36438, 'batch_loss/train': 0.7909910790622234} 12/27/2021 18:20:13 - INFO - codeparrot_training - Step 36439: {'lr': 8.595583373488223e-05, 'samples': 18657280, 'steps': 36439, 'batch_loss/train': 0.7027044078567997} 12/27/2021 18:20:24 - INFO - codeparrot_training - Step 36440: {'lr': 8.594393908473894e-05, 'samples': 18657792, 'steps': 36440, 'batch_loss/train': 0.7269790086429566} 12/27/2021 18:20:35 - INFO - codeparrot_training - Step 36441: {'lr': 8.593204508682067e-05, 'samples': 18658304, 'steps': 36441, 'batch_loss/train': 0.8281779997050762} 12/27/2021 18:20:47 - INFO - codeparrot_training - Step 36442: {'lr': 8.59201517411746e-05, 'samples': 18658816, 'steps': 36442, 'batch_loss/train': 0.7897717840969563} 12/27/2021 18:20:58 - 
INFO - codeparrot_training - Step 36443: {'lr': 8.590825904784802e-05, 'samples': 18659328, 'steps': 36443, 'batch_loss/train': 0.7425360698252916} 12/27/2021 18:21:08 - INFO - codeparrot_training - Step 36444: {'lr': 8.589636700688824e-05, 'samples': 18659840, 'steps': 36444, 'batch_loss/train': 0.5968917579157278} 12/27/2021 18:21:20 - INFO - codeparrot_training - Step 36445: {'lr': 8.588447561834255e-05, 'samples': 18660352, 'steps': 36445, 'batch_loss/train': 0.6456735162064433} 12/27/2021 18:21:31 - INFO - codeparrot_training - Step 36446: {'lr': 8.587258488225822e-05, 'samples': 18660864, 'steps': 36446, 'batch_loss/train': 0.6161669994471595} 12/27/2021 18:21:42 - INFO - codeparrot_training - Step 36447: {'lr': 8.586069479868253e-05, 'samples': 18661376, 'steps': 36447, 'batch_loss/train': 0.6726089813746512} 12/27/2021 18:21:56 - INFO - codeparrot_training - Step 36448: {'lr': 8.58488053676627e-05, 'samples': 18661888, 'steps': 36448, 'batch_loss/train': 0.764199766330421} 12/27/2021 18:22:07 - INFO - codeparrot_training - Step 36449: {'lr': 8.583691658924608e-05, 'samples': 18662400, 'steps': 36449, 'batch_loss/train': 0.7321828790009022} 12/27/2021 18:22:17 - INFO - codeparrot_training - Step 36450: {'lr': 8.582502846347989e-05, 'samples': 18662912, 'steps': 36450, 'batch_loss/train': 0.6813271306455135} 12/27/2021 18:22:28 - INFO - codeparrot_training - Step 36451: {'lr': 8.581314099041138e-05, 'samples': 18663424, 'steps': 36451, 'batch_loss/train': 0.7546161091886461} 12/27/2021 18:22:40 - INFO - codeparrot_training - Step 36452: {'lr': 8.580125417008792e-05, 'samples': 18663936, 'steps': 36452, 'batch_loss/train': 0.8058214578777552} 12/27/2021 18:22:51 - INFO - codeparrot_training - Step 36453: {'lr': 8.57893680025566e-05, 'samples': 18664448, 'steps': 36453, 'batch_loss/train': 0.653468007221818} 12/27/2021 18:23:01 - INFO - codeparrot_training - Step 36454: {'lr': 8.577748248786466e-05, 'samples': 18664960, 'steps': 36454, 'batch_loss/train': 
0.7612888338044286} 12/27/2021 18:23:15 - INFO - codeparrot_training - Step 36455: {'lr': 8.576559762605959e-05, 'samples': 18665472, 'steps': 36455, 'batch_loss/train': 0.7458384251222014} 12/27/2021 18:23:26 - INFO - codeparrot_training - Step 36456: {'lr': 8.57537134171884e-05, 'samples': 18665984, 'steps': 36456, 'batch_loss/train': 0.9265475894790143} 12/27/2021 18:23:37 - INFO - codeparrot_training - Step 36457: {'lr': 8.574182986129839e-05, 'samples': 18666496, 'steps': 36457, 'batch_loss/train': 0.74567109066993} 12/27/2021 18:23:49 - INFO - codeparrot_training - Step 36458: {'lr': 8.572994695843698e-05, 'samples': 18667008, 'steps': 36458, 'batch_loss/train': 0.4519845680333674} 12/27/2021 18:23:59 - INFO - codeparrot_training - Step 36459: {'lr': 8.57180647086512e-05, 'samples': 18667520, 'steps': 36459, 'batch_loss/train': 0.7737554479390383} 12/27/2021 18:24:10 - INFO - codeparrot_training - Step 36460: {'lr': 8.570618311198835e-05, 'samples': 18668032, 'steps': 36460, 'batch_loss/train': 0.7095240151975304} 12/27/2021 18:24:20 - INFO - codeparrot_training - Step 36461: {'lr': 8.569430216849569e-05, 'samples': 18668544, 'steps': 36461, 'batch_loss/train': 0.7690739505924284} 12/27/2021 18:24:33 - INFO - codeparrot_training - Step 36462: {'lr': 8.568242187822045e-05, 'samples': 18669056, 'steps': 36462, 'batch_loss/train': 0.7052374118939042} 12/27/2021 18:24:43 - INFO - codeparrot_training - Step 36463: {'lr': 8.567054224120987e-05, 'samples': 18669568, 'steps': 36463, 'batch_loss/train': 0.7715007518418133} 12/27/2021 18:24:54 - INFO - codeparrot_training - Step 36464: {'lr': 8.565866325751124e-05, 'samples': 18670080, 'steps': 36464, 'batch_loss/train': 0.8409660095348954} 12/27/2021 18:25:08 - INFO - codeparrot_training - Step 36465: {'lr': 8.564678492717155e-05, 'samples': 18670592, 'steps': 36465, 'batch_loss/train': 0.8322513932362199} 12/27/2021 18:25:19 - INFO - codeparrot_training - Step 36466: {'lr': 8.563490725023823e-05, 'samples': 18671104, 
'steps': 36466, 'batch_loss/train': 0.7176112234592438} 12/27/2021 18:25:29 - INFO - codeparrot_training - Step 36467: {'lr': 8.562303022675857e-05, 'samples': 18671616, 'steps': 36467, 'batch_loss/train': 0.7717189472168684} 12/27/2021 18:25:41 - INFO - codeparrot_training - Step 36468: {'lr': 8.561115385677959e-05, 'samples': 18672128, 'steps': 36468, 'batch_loss/train': 0.7465449548326433} 12/27/2021 18:25:52 - INFO - codeparrot_training - Step 36469: {'lr': 8.559927814034851e-05, 'samples': 18672640, 'steps': 36469, 'batch_loss/train': 0.6847016157116741} 12/27/2021 18:26:03 - INFO - codeparrot_training - Step 36470: {'lr': 8.55874030775128e-05, 'samples': 18673152, 'steps': 36470, 'batch_loss/train': 0.7753669009543955} 12/27/2021 18:26:13 - INFO - codeparrot_training - Step 36471: {'lr': 8.557552866831941e-05, 'samples': 18673664, 'steps': 36471, 'batch_loss/train': 0.8010908383876085} 12/27/2021 18:26:27 - INFO - codeparrot_training - Step 36472: {'lr': 8.556365491281562e-05, 'samples': 18674176, 'steps': 36472, 'batch_loss/train': 0.5631751712644473} 12/27/2021 18:26:38 - INFO - codeparrot_training - Step 36473: {'lr': 8.555178181104863e-05, 'samples': 18674688, 'steps': 36473, 'batch_loss/train': 0.8597992183640599} 12/27/2021 18:26:48 - INFO - codeparrot_training - Step 36474: {'lr': 8.553990936306572e-05, 'samples': 18675200, 'steps': 36474, 'batch_loss/train': 0.7964079459197819} 12/27/2021 18:27:00 - INFO - codeparrot_training - Step 36475: {'lr': 8.5528037568914e-05, 'samples': 18675712, 'steps': 36475, 'batch_loss/train': 0.688139627687633} 12/27/2021 18:27:11 - INFO - codeparrot_training - Step 36476: {'lr': 8.551616642864068e-05, 'samples': 18676224, 'steps': 36476, 'batch_loss/train': 0.6485803241375834} 12/27/2021 18:27:22 - INFO - codeparrot_training - Step 36477: {'lr': 8.5504295942293e-05, 'samples': 18676736, 'steps': 36477, 'batch_loss/train': 0.7173856412118766} 12/27/2021 18:27:34 - INFO - codeparrot_training - Step 36478: {'lr': 
8.549242610991822e-05, 'samples': 18677248, 'steps': 36478, 'batch_loss/train': 0.7376294601708651} 12/27/2021 18:27:44 - INFO - codeparrot_training - Step 36479: {'lr': 8.548055693156334e-05, 'samples': 18677760, 'steps': 36479, 'batch_loss/train': 0.7669123767409474} 12/27/2021 18:27:55 - INFO - codeparrot_training - Step 36480: {'lr': 8.546868840727556e-05, 'samples': 18678272, 'steps': 36480, 'batch_loss/train': 0.728676677797921} 12/27/2021 18:28:07 - INFO - codeparrot_training - Step 36481: {'lr': 8.545682053710232e-05, 'samples': 18678784, 'steps': 36481, 'batch_loss/train': 0.7463004169985652} 12/27/2021 18:28:18 - INFO - codeparrot_training - Step 36482: {'lr': 8.544495332109056e-05, 'samples': 18679296, 'steps': 36482, 'batch_loss/train': 0.7179386425996199} 12/27/2021 18:28:28 - INFO - codeparrot_training - Step 36483: {'lr': 8.543308675928744e-05, 'samples': 18679808, 'steps': 36483, 'batch_loss/train': 0.7091430723667145} 12/27/2021 18:28:39 - INFO - codeparrot_training - Step 36484: {'lr': 8.54212208517404e-05, 'samples': 18680320, 'steps': 36484, 'batch_loss/train': 0.7397499342914671} 12/27/2021 18:28:52 - INFO - codeparrot_training - Step 36485: {'lr': 8.540935559849636e-05, 'samples': 18680832, 'steps': 36485, 'batch_loss/train': 0.8008424185682088} 12/27/2021 18:29:03 - INFO - codeparrot_training - Step 36486: {'lr': 8.539749099960256e-05, 'samples': 18681344, 'steps': 36486, 'batch_loss/train': 0.8088085660710931} 12/27/2021 18:29:13 - INFO - codeparrot_training - Step 36487: {'lr': 8.538562705510619e-05, 'samples': 18681856, 'steps': 36487, 'batch_loss/train': 0.7127925809472799} 12/27/2021 18:29:26 - INFO - codeparrot_training - Step 36488: {'lr': 8.537376376505443e-05, 'samples': 18682368, 'steps': 36488, 'batch_loss/train': 0.6386067755520344} 12/27/2021 18:29:36 - INFO - codeparrot_training - Step 36489: {'lr': 8.536190112949441e-05, 'samples': 18682880, 'steps': 36489, 'batch_loss/train': 0.7292270362377167} 12/27/2021 18:29:47 - INFO - 
codeparrot_training - Step 36490: {'lr': 8.535003914847339e-05, 'samples': 18683392, 'steps': 36490, 'batch_loss/train': 0.7100473428145051} 12/27/2021 18:29:59 - INFO - codeparrot_training - Step 36491: {'lr': 8.533817782203826e-05, 'samples': 18683904, 'steps': 36491, 'batch_loss/train': 0.6435496732592583} 12/27/2021 18:30:10 - INFO - codeparrot_training - Step 36492: {'lr': 8.532631715023648e-05, 'samples': 18684416, 'steps': 36492, 'batch_loss/train': 0.731092294678092} 12/27/2021 18:30:20 - INFO - codeparrot_training - Step 36493: {'lr': 8.531445713311514e-05, 'samples': 18684928, 'steps': 36493, 'batch_loss/train': 0.8008548896759748} 12/27/2021 18:30:31 - INFO - codeparrot_training - Step 36494: {'lr': 8.530259777072119e-05, 'samples': 18685440, 'steps': 36494, 'batch_loss/train': 1.502566413488239} 12/27/2021 18:30:44 - INFO - codeparrot_training - Step 36495: {'lr': 8.5290739063102e-05, 'samples': 18685952, 'steps': 36495, 'batch_loss/train': 0.6368799800984561} 12/27/2021 18:30:55 - INFO - codeparrot_training - Step 36496: {'lr': 8.52788810103047e-05, 'samples': 18686464, 'steps': 36496, 'batch_loss/train': 0.760872233659029} 12/27/2021 18:31:06 - INFO - codeparrot_training - Step 36497: {'lr': 8.526702361237632e-05, 'samples': 18686976, 'steps': 36497, 'batch_loss/train': 0.7935266196727753} 12/27/2021 18:31:18 - INFO - codeparrot_training - Step 36498: {'lr': 8.525516686936405e-05, 'samples': 18687488, 'steps': 36498, 'batch_loss/train': 0.6842340370640159} 12/27/2021 18:31:28 - INFO - codeparrot_training - Step 36499: {'lr': 8.524331078131503e-05, 'samples': 18688000, 'steps': 36499, 'batch_loss/train': 0.755492577329278} 12/27/2021 18:31:39 - INFO - codeparrot_training - Step 36500: {'lr': 8.523145534827636e-05, 'samples': 18688512, 'steps': 36500, 'batch_loss/train': 0.8439911459572613} 12/27/2021 18:31:53 - INFO - codeparrot_training - Step 36501: {'lr': 8.521960057029526e-05, 'samples': 18689024, 'steps': 36501, 'batch_loss/train': 
0.7989589585922658}
12/27/2021 18:32:04 - INFO - codeparrot_training - Step 36502: {'lr': 8.52077464474188e-05, 'samples': 18689536, 'steps': 36502, 'batch_loss/train': 0.8351723859086633}
12/27/2021 18:32:14 - INFO - codeparrot_training - Step 36503: {'lr': 8.51958929796941e-05, 'samples': 18690048, 'steps': 36503, 'batch_loss/train': 0.7354564424604177}
12/27/2021 18:32:25 - INFO - codeparrot_training - Step 36504: {'lr': 8.51840401671683e-05, 'samples': 18690560, 'steps': 36504, 'batch_loss/train': 1.6915554525330663}
12/27/2021 18:32:37 - INFO - codeparrot_training - Step 36505: {'lr': 8.517218800988863e-05, 'samples': 18691072, 'steps': 36505, 'batch_loss/train': 0.716287687420845}
12/27/2021 18:32:48 - INFO - codeparrot_training - Step 36506: {'lr': 8.516033650790192e-05, 'samples': 18691584, 'steps': 36506, 'batch_loss/train': 0.7880276518408209}
12/27/2021 18:32:58 - INFO - codeparrot_training - Step 36507: {'lr': 8.514848566125563e-05, 'samples': 18692096, 'steps': 36507, 'batch_loss/train': 0.7209792621433735}
12/27/2021 18:33:10 - INFO - codeparrot_training - Step 36508: {'lr': 8.513663546999664e-05, 'samples': 18692608, 'steps': 36508, 'batch_loss/train': 0.6410374960396439}
12/27/2021 18:33:21 - INFO - codeparrot_training - Step 36509: {'lr': 8.512478593417205e-05, 'samples': 18693120, 'steps': 36509, 'batch_loss/train': 0.6134020294994116}
12/27/2021 18:33:32 - INFO - codeparrot_training - Step 36510: {'lr': 8.511293705382925e-05, 'samples': 18693632, 'steps': 36510, 'batch_loss/train': 0.6671156985685229}
12/27/2021 18:33:45 - INFO - codeparrot_training - Step 36511: {'lr': 8.510108882901505e-05, 'samples': 18694144, 'steps': 36511, 'batch_loss/train': 0.6924156923778355}
12/27/2021 18:33:55 - INFO - codeparrot_training - Step 36512: {'lr': 8.508924125977665e-05, 'samples': 18694656, 'steps': 36512, 'batch_loss/train': 0.7074255300103687}
12/27/2021 18:34:06 - INFO - codeparrot_training - Step 36513: {'lr': 8.507739434616115e-05, 'samples': 18695168, 'steps': 36513, 'batch_loss/train': 0.7699545659124851}
12/27/2021 18:34:18 - INFO - codeparrot_training - Step 36514: {'lr': 8.506554808821571e-05, 'samples': 18695680, 'steps': 36514, 'batch_loss/train': 0.8261632993817329}
12/27/2021 18:34:29 - INFO - codeparrot_training - Step 36515: {'lr': 8.505370248598732e-05, 'samples': 18696192, 'steps': 36515, 'batch_loss/train': 0.7689424566924572}
12/27/2021 18:34:40 - INFO - codeparrot_training - Step 36516: {'lr': 8.504185753952324e-05, 'samples': 18696704, 'steps': 36516, 'batch_loss/train': 0.7349162986502051}
12/27/2021 18:34:50 - INFO - codeparrot_training - Step 36517: {'lr': 8.503001324887028e-05, 'samples': 18697216, 'steps': 36517, 'batch_loss/train': 0.6908485479652882}
12/27/2021 18:35:02 - INFO - codeparrot_training - Step 36518: {'lr': 8.501816961407575e-05, 'samples': 18697728, 'steps': 36518, 'batch_loss/train': 0.6688806939637288}
12/27/2021 18:35:13 - INFO - codeparrot_training - Step 36519: {'lr': 8.500632663518679e-05, 'samples': 18698240, 'steps': 36519, 'batch_loss/train': 0.7574603809043765}
12/27/2021 18:35:24 - INFO - codeparrot_training - Step 36520: {'lr': 8.499448431225021e-05, 'samples': 18698752, 'steps': 36520, 'batch_loss/train': 0.7725235810503364}
12/27/2021 18:35:36 - INFO - codeparrot_training - Step 36521: {'lr': 8.498264264531333e-05, 'samples': 18699264, 'steps': 36521, 'batch_loss/train': 0.6944531912449747}
12/27/2021 18:35:46 - INFO - codeparrot_training - Step 36522: {'lr': 8.497080163442325e-05, 'samples': 18699776, 'steps': 36522, 'batch_loss/train': 0.9243945889174938}
12/27/2021 18:35:57 - INFO - codeparrot_training - Step 36523: {'lr': 8.495896127962688e-05, 'samples': 18700288, 'steps': 36523, 'batch_loss/train': 0.8550685602240264}
12/27/2021 18:36:08 - INFO - codeparrot_training - Step 36524: {'lr': 8.494712158097134e-05, 'samples': 18700800, 'steps': 36524, 'batch_loss/train': 0.7047562743537128}
12/27/2021 18:36:21 - INFO - codeparrot_training - Step 36525: {'lr': 8.493528253850371e-05, 'samples': 18701312, 'steps': 36525, 'batch_loss/train': 0.6692768619395792}
12/27/2021 18:36:32 - INFO - codeparrot_training - Step 36526: {'lr': 8.49234441522711e-05, 'samples': 18701824, 'steps': 36526, 'batch_loss/train': 0.7751989830285311}
12/27/2021 18:36:42 - INFO - codeparrot_training - Step 36527: {'lr': 8.491160642232054e-05, 'samples': 18702336, 'steps': 36527, 'batch_loss/train': 0.6894410750828683}
12/27/2021 18:36:54 - INFO - codeparrot_training - Step 36528: {'lr': 8.489976934869909e-05, 'samples': 18702848, 'steps': 36528, 'batch_loss/train': 0.7344514737487771}
12/27/2021 18:37:05 - INFO - codeparrot_training - Step 36529: {'lr': 8.48879329314538e-05, 'samples': 18703360, 'steps': 36529, 'batch_loss/train': 0.718439650721848}
12/27/2021 18:37:16 - INFO - codeparrot_training - Step 36530: {'lr': 8.487609717063178e-05, 'samples': 18703872, 'steps': 36530, 'batch_loss/train': 0.7026046137325466}
12/27/2021 18:37:29 - INFO - codeparrot_training - Step 36531: {'lr': 8.486426206628003e-05, 'samples': 18704384, 'steps': 36531, 'batch_loss/train': 0.7324864966794848}
12/27/2021 18:37:40 - INFO - codeparrot_training - Step 36532: {'lr': 8.485242761844564e-05, 'samples': 18704896, 'steps': 36532, 'batch_loss/train': 0.7084948094561696}
12/27/2021 18:37:51 - INFO - codeparrot_training - Step 36533: {'lr': 8.484059382717571e-05, 'samples': 18705408, 'steps': 36533, 'batch_loss/train': 0.8015283490531147}
12/27/2021 18:38:01 - INFO - codeparrot_training - Step 36534: {'lr': 8.482876069251714e-05, 'samples': 18705920, 'steps': 36534, 'batch_loss/train': 0.7000363897532225}
12/27/2021 18:38:13 - INFO - codeparrot_training - Step 36535: {'lr': 8.481692821451695e-05, 'samples': 18706432, 'steps': 36535, 'batch_loss/train': 0.6575661515817046}
12/27/2021 18:38:24 - INFO - codeparrot_training - Step 36536: {'lr': 8.480509639322245e-05, 'samples': 18706944, 'steps': 36536, 'batch_loss/train': 0.6796227842569351}
12/27/2021 18:38:35 - INFO - codeparrot_training - Step 36537: {'lr': 8.479326522868045e-05, 'samples': 18707456, 'steps': 36537, 'batch_loss/train': 0.7365211676806211}
12/27/2021 18:38:47 - INFO - codeparrot_training - Step 36538: {'lr': 8.478143472093803e-05, 'samples': 18707968, 'steps': 36538, 'batch_loss/train': 0.5301402975455858}
12/27/2021 18:38:57 - INFO - codeparrot_training - Step 36539: {'lr': 8.476960487004224e-05, 'samples': 18708480, 'steps': 36539, 'batch_loss/train': 0.6614083372987807}
12/27/2021 18:39:08 - INFO - codeparrot_training - Step 36540: {'lr': 8.475777567604012e-05, 'samples': 18708992, 'steps': 36540, 'batch_loss/train': 0.7997709577903152}
12/27/2021 18:39:22 - INFO - codeparrot_training - Step 36541: {'lr': 8.474594713897871e-05, 'samples': 18709504, 'steps': 36541, 'batch_loss/train': 0.7072258946718648}
12/27/2021 18:39:33 - INFO - codeparrot_training - Step 36542: {'lr': 8.473411925890506e-05, 'samples': 18710016, 'steps': 36542, 'batch_loss/train': 0.6970100840553641}
12/27/2021 18:39:44 - INFO - codeparrot_training - Step 36543: {'lr': 8.472229203586601e-05, 'samples': 18710528, 'steps': 36543, 'batch_loss/train': 0.5739510257262737}
12/27/2021 18:39:54 - INFO - codeparrot_training - Step 36544: {'lr': 8.47104654699088e-05, 'samples': 18711040, 'steps': 36544, 'batch_loss/train': 0.7144895233213902}
12/27/2021 18:40:06 - INFO - codeparrot_training - Step 36545: {'lr': 8.469863956108045e-05, 'samples': 18711552, 'steps': 36545, 'batch_loss/train': 0.4661596061196178}
12/27/2021 18:40:17 - INFO - codeparrot_training - Step 36546: {'lr': 8.468681430942773e-05, 'samples': 18712064, 'steps': 36546, 'batch_loss/train': 0.8095893627032638}
12/27/2021 18:40:27 - INFO - codeparrot_training - Step 36547: {'lr': 8.46749897149979e-05, 'samples': 18712576, 'steps': 36547, 'batch_loss/train': 0.8046547323465347}
12/27/2021 18:40:39 - INFO - codeparrot_training - Step 36548: {'lr': 8.466316577783795e-05, 'samples': 18713088, 'steps': 36548, 'batch_loss/train': 0.6744014543946832}
12/27/2021 18:40:50 - INFO - codeparrot_training - Step 36549: {'lr': 8.465134249799478e-05, 'samples': 18713600, 'steps': 36549, 'batch_loss/train': 0.694523582002148}
12/27/2021 18:41:01 - INFO - codeparrot_training - Step 36550: {'lr': 8.463951987551533e-05, 'samples': 18714112, 'steps': 36550, 'batch_loss/train': 0.7309525427408516}
12/27/2021 18:41:11 - INFO - codeparrot_training - Step 36551: {'lr': 8.462769791044686e-05, 'samples': 18714624, 'steps': 36551, 'batch_loss/train': 0.7711106827482581}
12/27/2021 18:41:25 - INFO - codeparrot_training - Step 36552: {'lr': 8.461587660283618e-05, 'samples': 18715136, 'steps': 36552, 'batch_loss/train': 0.792215689085424}
12/27/2021 18:41:36 - INFO - codeparrot_training - Step 36553: {'lr': 8.46040559527303e-05, 'samples': 18715648, 'steps': 36553, 'batch_loss/train': 0.9423270868137479}
12/27/2021 18:41:46 - INFO - codeparrot_training - Step 36554: {'lr': 8.459223596017624e-05, 'samples': 18716160, 'steps': 36554, 'batch_loss/train': 0.7791031999513507}
12/27/2021 18:41:58 - INFO - codeparrot_training - Step 36555: {'lr': 8.458041662522101e-05, 'samples': 18716672, 'steps': 36555, 'batch_loss/train': 0.8223996087908745}
12/27/2021 18:42:09 - INFO - codeparrot_training - Step 36556: {'lr': 8.456859794791158e-05, 'samples': 18717184, 'steps': 36556, 'batch_loss/train': 0.7176265344023705}
12/27/2021 18:42:19 - INFO - codeparrot_training - Step 36557: {'lr': 8.455677992829492e-05, 'samples': 18717696, 'steps': 36557, 'batch_loss/train': 0.6852910006418824}
12/27/2021 18:42:30 - INFO - codeparrot_training - Step 36558: {'lr': 8.454496256641806e-05, 'samples': 18718208, 'steps': 36558, 'batch_loss/train': 0.7420500214211643}
12/27/2021 18:42:42 - INFO - codeparrot_training - Step 36559: {'lr': 8.453314586232794e-05, 'samples': 18718720, 'steps': 36559, 'batch_loss/train': 0.6723591601476073}
12/27/2021 18:42:53 - INFO - codeparrot_training - Step 36560: {'lr': 8.452132981607163e-05, 'samples': 18719232, 'steps': 36560, 'batch_loss/train': 0.5790753066539764}
12/27/2021 18:43:04 - INFO - codeparrot_training - Step 36561: {'lr': 8.450951442769589e-05, 'samples': 18719744, 'steps': 36561, 'batch_loss/train': 0.7473357514827512}
12/27/2021 18:43:17 - INFO - codeparrot_training - Step 36562: {'lr': 8.449769969724799e-05, 'samples': 18720256, 'steps': 36562, 'batch_loss/train': 0.7916868580505252}
12/27/2021 18:43:27 - INFO - codeparrot_training - Step 36563: {'lr': 8.448588562477466e-05, 'samples': 18720768, 'steps': 36563, 'batch_loss/train': 0.6988553970586509}
12/27/2021 18:43:38 - INFO - codeparrot_training - Step 36564: {'lr': 8.447407221032297e-05, 'samples': 18721280, 'steps': 36564, 'batch_loss/train': 0.7401791010051966}
12/27/2021 18:43:50 - INFO - codeparrot_training - Step 36565: {'lr': 8.446225945393984e-05, 'samples': 18721792, 'steps': 36565, 'batch_loss/train': 0.7804326969198883}
12/27/2021 18:44:01 - INFO - codeparrot_training - Step 36566: {'lr': 8.445044735567226e-05, 'samples': 18722304, 'steps': 36566, 'batch_loss/train': 0.6728574417065829}
12/27/2021 18:44:11 - INFO - codeparrot_training - Step 36567: {'lr': 8.443863591556721e-05, 'samples': 18722816, 'steps': 36567, 'batch_loss/train': 0.6319925967836753}
12/27/2021 18:44:22 - INFO - codeparrot_training - Step 36568: {'lr': 8.442682513367164e-05, 'samples': 18723328, 'steps': 36568, 'batch_loss/train': 0.6213969474192709}
12/27/2021 18:44:34 - INFO - codeparrot_training - Step 36569: {'lr': 8.441501501003245e-05, 'samples': 18723840, 'steps': 36569, 'batch_loss/train': 0.5050642136484385}
12/27/2021 18:44:45 - INFO - codeparrot_training - Step 36570: {'lr': 8.440320554469669e-05, 'samples': 18724352, 'steps': 36570, 'batch_loss/train': 0.6121114764828235}
12/27/2021 18:44:55 - INFO - codeparrot_training - Step 36571: {'lr': 8.439139673771131e-05, 'samples': 18724864, 'steps': 36571, 'batch_loss/train': 0.7857324378564954}
12/27/2021 18:45:09 - INFO - codeparrot_training - Step 36572: {'lr': 8.437958858912306e-05, 'samples': 18725376, 'steps': 36572, 'batch_loss/train': 0.8424926158040762}
12/27/2021 18:45:20 - INFO - codeparrot_training - Step 36573: {'lr': 8.43677810989791e-05, 'samples': 18725888, 'steps': 36573, 'batch_loss/train': 0.8253383589908481}
12/27/2021 18:45:31 - INFO - codeparrot_training - Step 36574: {'lr': 8.435597426732639e-05, 'samples': 18726400, 'steps': 36574, 'batch_loss/train': 0.6475716666318476}
12/27/2021 18:45:43 - INFO - codeparrot_training - Step 36575: {'lr': 8.434416809421169e-05, 'samples': 18726912, 'steps': 36575, 'batch_loss/train': 0.8805831531062722}
12/27/2021 18:45:53 - INFO - codeparrot_training - Step 36576: {'lr': 8.433236257968194e-05, 'samples': 18727424, 'steps': 36576, 'batch_loss/train': 0.7486494085751474}
12/27/2021 18:46:04 - INFO - codeparrot_training - Step 36577: {'lr': 8.432055772378436e-05, 'samples': 18727936, 'steps': 36577, 'batch_loss/train': 0.7806073813699186}
12/27/2021 18:46:15 - INFO - codeparrot_training - Step 36578: {'lr': 8.430875352656556e-05, 'samples': 18728448, 'steps': 36578, 'batch_loss/train': 0.7699385933228768}
12/27/2021 18:46:28 - INFO - codeparrot_training - Step 36579: {'lr': 8.42969499880726e-05, 'samples': 18728960, 'steps': 36579, 'batch_loss/train': 0.6950169617775828}
12/27/2021 18:46:39 - INFO - codeparrot_training - Step 36580: {'lr': 8.428514710835245e-05, 'samples': 18729472, 'steps': 36580, 'batch_loss/train': 0.7101005427539349}
12/27/2021 18:46:49 - INFO - codeparrot_training - Step 36581: {'lr': 8.427334488745191e-05, 'samples': 18729984, 'steps': 36581, 'batch_loss/train': 0.7754859663546085}
12/27/2021 18:47:02 - INFO - codeparrot_training - Step 36582: {'lr': 8.426154332541802e-05, 'samples': 18730496, 'steps': 36582, 'batch_loss/train': 0.6788388739805669}
12/27/2021 18:47:12 - INFO - codeparrot_training - Step 36583: {'lr': 8.424974242229766e-05, 'samples': 18731008, 'steps': 36583, 'batch_loss/train': 0.6997841903939843}
12/27/2021 18:47:23 - INFO - codeparrot_training - Step 36584: {'lr': 8.423794217813771e-05, 'samples': 18731520, 'steps': 36584, 'batch_loss/train': 0.6937423078343272}
12/27/2021 18:47:35 - INFO - codeparrot_training - Step 36585: {'lr': 8.422614259298514e-05, 'samples': 18732032, 'steps': 36585, 'batch_loss/train': 0.7842827625572681}
12/27/2021 18:47:46 - INFO - codeparrot_training - Step 36586: {'lr': 8.42143436668869e-05, 'samples': 18732544, 'steps': 36586, 'batch_loss/train': 0.7535020255018026}
12/27/2021 18:47:56 - INFO - codeparrot_training - Step 36587: {'lr': 8.420254539988967e-05, 'samples': 18733056, 'steps': 36587, 'batch_loss/train': 0.6799330201465636}
12/27/2021 18:48:07 - INFO - codeparrot_training - Step 36588: {'lr': 8.41907477920407e-05, 'samples': 18733568, 'steps': 36588, 'batch_loss/train': 0.6836391093675047}
12/27/2021 18:48:19 - INFO - codeparrot_training - Step 36589: {'lr': 8.417895084338659e-05, 'samples': 18734080, 'steps': 36589, 'batch_loss/train': 0.7083148760721087}
12/27/2021 18:48:30 - INFO - codeparrot_training - Step 36590: {'lr': 8.416715455397439e-05, 'samples': 18734592, 'steps': 36590, 'batch_loss/train': 0.6231975100236014}
12/27/2021 18:48:40 - INFO - codeparrot_training - Step 36591: {'lr': 8.415535892385095e-05, 'samples': 18735104, 'steps': 36591, 'batch_loss/train': 0.7233282644301653}
12/27/2021 18:48:54 - INFO - codeparrot_training - Step 36592: {'lr': 8.414356395306319e-05, 'samples': 18735616, 'steps': 36592, 'batch_loss/train': 0.7451116570155136}
12/27/2021 18:49:04 - INFO - codeparrot_training - Step 36593: {'lr': 8.413176964165798e-05, 'samples': 18736128, 'steps': 36593, 'batch_loss/train': 0.6719898085575551}
12/27/2021 18:49:15 - INFO - codeparrot_training - Step 36594: {'lr': 8.411997598968225e-05, 'samples': 18736640, 'steps': 36594, 'batch_loss/train': 0.7840981294866651}
12/27/2021 18:49:27 - INFO - codeparrot_training - Step 36595: {'lr': 8.410818299718284e-05, 'samples': 18737152, 'steps': 36595, 'batch_loss/train': 0.7808865467086434}
12/27/2021 18:49:38 - INFO - codeparrot_training - Step 36596: {'lr': 8.409639066420666e-05, 'samples': 18737664, 'steps': 36596, 'batch_loss/train': 1.6032427656464279}
12/27/2021 18:49:49 - INFO - codeparrot_training - Step 36597: {'lr': 8.408459899080068e-05, 'samples': 18738176, 'steps': 36597, 'batch_loss/train': 0.8178281951695681}
12/27/2021 18:49:59 - INFO - codeparrot_training - Step 36598: {'lr': 8.407280797701153e-05, 'samples': 18738688, 'steps': 36598, 'batch_loss/train': 0.720072730910033}
12/27/2021 18:50:12 - INFO - codeparrot_training - Step 36599: {'lr': 8.406101762288631e-05, 'samples': 18739200, 'steps': 36599, 'batch_loss/train': 0.7616255454486236}
12/27/2021 18:50:23 - INFO - codeparrot_training - Step 36600: {'lr': 8.404922792847192e-05, 'samples': 18739712, 'steps': 36600, 'batch_loss/train': 1.2137140482664108}
12/27/2021 18:50:34 - INFO - codeparrot_training - Step 36601: {'lr': 8.403743889381504e-05, 'samples': 18740224, 'steps': 36601, 'batch_loss/train': 0.7361725475639105}
12/27/2021 18:50:47 - INFO - codeparrot_training - Step 36602: {'lr': 8.402565051896257e-05, 'samples': 18740736, 'steps': 36602, 'batch_loss/train': 0.7357711214572191}
12/27/2021 18:50:58 - INFO - codeparrot_training - Step 36603: {'lr': 8.401386280396159e-05, 'samples': 18741248, 'steps': 36603, 'batch_loss/train': 0.9456889173015952}
12/27/2021 18:51:09 - INFO - codeparrot_training - Step 36604: {'lr': 8.400207574885876e-05, 'samples': 18741760, 'steps': 36604, 'batch_loss/train': 0.725997980684042}
12/27/2021 18:51:19 - INFO - codeparrot_training - Step 36605: {'lr': 8.399028935370098e-05, 'samples': 18742272, 'steps': 36605, 'batch_loss/train': 0.9421112863346934}
12/27/2021 18:51:31 - INFO - codeparrot_training - Step 36606: {'lr': 8.397850361853515e-05, 'samples': 18742784, 'steps': 36606, 'batch_loss/train': 0.7113961330614984}
12/27/2021 18:51:42 - INFO - codeparrot_training - Step 36607: {'lr': 8.39667185434081e-05, 'samples': 18743296, 'steps': 36607, 'batch_loss/train': 0.7569425962865353}
12/27/2021 18:51:53 - INFO - codeparrot_training - Step 36608: {'lr': 8.395493412836666e-05, 'samples': 18743808, 'steps': 36608, 'batch_loss/train': 0.6689740599831566}
12/27/2021 18:52:05 - INFO - codeparrot_training - Step 36609: {'lr': 8.394315037345771e-05, 'samples': 18744320, 'steps': 36609, 'batch_loss/train': 0.67688700533472}
12/27/2021 18:52:15 - INFO - codeparrot_training - Step 36610: {'lr': 8.393136727872811e-05, 'samples': 18744832, 'steps': 36610, 'batch_loss/train': 0.7967196740210056}
12/27/2021 18:52:26 - INFO - codeparrot_training - Step 36611: {'lr': 8.391958484422468e-05, 'samples': 18745344, 'steps': 36611, 'batch_loss/train': 0.7819189727306366}
12/27/2021 18:52:40 - INFO - codeparrot_training - Step 36612: {'lr': 8.390780306999434e-05, 'samples': 18745856, 'steps': 36612, 'batch_loss/train': 0.7509459992870688}
12/27/2021 18:52:50 - INFO - codeparrot_training - Step 36613: {'lr': 8.389602195608373e-05, 'samples': 18746368, 'steps': 36613, 'batch_loss/train': 0.719980644993484}
12/27/2021 18:53:01 - INFO - codeparrot_training - Step 36614: {'lr': 8.388424150253996e-05, 'samples': 18746880, 'steps': 36614, 'batch_loss/train': 0.6058870770502836}
12/27/2021 18:53:13 - INFO - codeparrot_training - Step 36615: {'lr': 8.387246170940965e-05, 'samples': 18747392, 'steps': 36615, 'batch_loss/train': 0.7343669133260846}
12/27/2021 18:53:24 - INFO - codeparrot_training - Step 36616: {'lr': 8.386068257673968e-05, 'samples': 18747904, 'steps': 36616, 'batch_loss/train': 0.7391715315170586}
12/27/2021 18:53:34 - INFO - codeparrot_training - Step 36617: {'lr': 8.384890410457693e-05, 'samples': 18748416, 'steps': 36617, 'batch_loss/train': 0.8043001731857657}
12/27/2021 18:53:45 - INFO - codeparrot_training - Step 36618: {'lr': 8.38371262929682e-05, 'samples': 18748928, 'steps': 36618, 'batch_loss/train': 0.6759725220035762}
12/27/2021 18:53:58 - INFO - codeparrot_training - Step 36619: {'lr': 8.38253491419603e-05, 'samples': 18749440, 'steps': 36619, 'batch_loss/train': 0.7810416417196393}
12/27/2021 18:54:09 - INFO - codeparrot_training - Step 36620: {'lr': 8.381357265160008e-05, 'samples': 18749952, 'steps': 36620, 'batch_loss/train': 0.6773642282932997}
12/27/2021 18:54:19 - INFO - codeparrot_training - Step 36621: {'lr': 8.380179682193434e-05, 'samples': 18750464, 'steps': 36621, 'batch_loss/train': 0.6925412559648976}
12/27/2021 18:54:32 - INFO - codeparrot_training - Step 36622: {'lr': 8.379002165300992e-05, 'samples': 18750976, 'steps': 36622, 'batch_loss/train': 0.6919590360485017}
12/27/2021 18:54:43 - INFO - codeparrot_training - Step 36623: {'lr': 8.377824714487367e-05, 'samples': 18751488, 'steps': 36623, 'batch_loss/train': 0.5990051324479282}
12/27/2021 18:54:53 - INFO - codeparrot_training - Step 36624: {'lr': 8.376647329757219e-05, 'samples': 18752000, 'steps': 36624, 'batch_loss/train': 0.7643597513088025}
12/27/2021 18:55:05 - INFO - codeparrot_training - Step 36625: {'lr': 8.375470011115255e-05, 'samples': 18752512, 'steps': 36625, 'batch_loss/train': 0.7785129295662045}
12/27/2021 18:55:16 - INFO - codeparrot_training - Step 36626: {'lr': 8.374292758566149e-05, 'samples': 18753024, 'steps': 36626, 'batch_loss/train': 0.7250736085698009}
12/27/2021 18:55:27 - INFO - codeparrot_training - Step 36627: {'lr': 8.373115572114572e-05, 'samples': 18753536, 'steps': 36627, 'batch_loss/train': 0.667882489040494}
12/27/2021 18:55:37 - INFO - codeparrot_training - Step 36628: {'lr': 8.371938451765202e-05, 'samples': 18754048, 'steps': 36628, 'batch_loss/train': 0.731650504283607}
12/27/2021 18:55:49 - INFO - codeparrot_training - Step 36629: {'lr': 8.370761397522742e-05, 'samples': 18754560, 'steps': 36629, 'batch_loss/train': 0.6825260850600898}
12/27/2021 18:56:00 - INFO - codeparrot_training - Step 36630: {'lr': 8.369584409391845e-05, 'samples': 18755072, 'steps': 36630, 'batch_loss/train': 0.7149854593444616}
12/27/2021 18:56:11 - INFO - codeparrot_training - Step 36631: {'lr': 8.368407487377194e-05, 'samples': 18755584, 'steps': 36631, 'batch_loss/train': 0.6357319364324212}
12/27/2021 18:56:24 - INFO - codeparrot_training - Step 36632: {'lr': 8.367230631483494e-05, 'samples': 18756096, 'steps': 36632, 'batch_loss/train': 0.6827588395681232}
12/27/2021 18:56:34 - INFO - codeparrot_training - Step 36633: {'lr': 8.366053841715393e-05, 'samples': 18756608, 'steps': 36633, 'batch_loss/train': 0.7170917757612187}
12/27/2021 18:56:45 - INFO - codeparrot_training - Step 36634: {'lr': 8.364877118077582e-05, 'samples': 18757120, 'steps': 36634, 'batch_loss/train': 0.7627847781404853}
12/27/2021 18:56:57 - INFO - codeparrot_training - Step 36635: {'lr': 8.363700460574741e-05, 'samples': 18757632, 'steps': 36635, 'batch_loss/train': 0.7068313229829073}
12/27/2021 18:57:08 - INFO - codeparrot_training - Step 36636: {'lr': 8.362523869211544e-05, 'samples': 18758144, 'steps': 36636, 'batch_loss/train': 0.7230826914310455}
12/27/2021 18:57:18 - INFO - codeparrot_training - Step 36637: {'lr': 8.361347343992667e-05, 'samples': 18758656, 'steps': 36637, 'batch_loss/train': 0.7199418293312192}
12/27/2021 18:57:29 - INFO - codeparrot_training - Step 36638: {'lr': 8.360170884922801e-05, 'samples': 18759168, 'steps': 36638, 'batch_loss/train': 0.7652939092367887}
12/27/2021 18:57:41 - INFO - codeparrot_training - Step 36639: {'lr': 8.358994492006597e-05, 'samples': 18759680, 'steps': 36639, 'batch_loss/train': 0.7731821592897177}
12/27/2021 18:57:52 - INFO - codeparrot_training - Step 36640: {'lr': 8.357818165248757e-05, 'samples': 18760192, 'steps': 36640, 'batch_loss/train': 0.6901942985132337}
12/27/2021 18:58:02 - INFO - codeparrot_training - Step 36641: {'lr': 8.356641904653953e-05, 'samples': 18760704, 'steps': 36641, 'batch_loss/train': 0.7083317125216126}
12/27/2021 18:58:16 - INFO - codeparrot_training - Step 36642: {'lr': 8.355465710226842e-05, 'samples': 18761216, 'steps': 36642, 'batch_loss/train': 0.7891107276082039}
12/27/2021 18:58:27 - INFO - codeparrot_training - Step 36643: {'lr': 8.35428958197213e-05, 'samples': 18761728, 'steps': 36643, 'batch_loss/train': 0.736666769720614}
12/27/2021 18:58:37 - INFO - codeparrot_training - Step 36644: {'lr': 8.35311351989447e-05, 'samples': 18762240, 'steps': 36644, 'batch_loss/train': 0.733893746510148}
12/27/2021 18:58:50 - INFO - codeparrot_training - Step 36645: {'lr': 8.351937523998546e-05, 'samples': 18762752, 'steps': 36645, 'batch_loss/train': 0.742519952938892}
12/27/2021 18:59:01 - INFO - codeparrot_training - Step 36646: {'lr': 8.35076159428903e-05, 'samples': 18763264, 'steps': 36646, 'batch_loss/train': 0.6757574961520731}
12/27/2021 18:59:12 - INFO - codeparrot_training - Step 36647: {'lr': 8.349585730770601e-05, 'samples': 18763776, 'steps': 36647, 'batch_loss/train': 0.801572478376329}
12/27/2021 18:59:22 - INFO - codeparrot_training - Step 36648: {'lr': 8.348409933447931e-05, 'samples': 18764288, 'steps': 36648, 'batch_loss/train': 0.6991234642919153}
12/27/2021 18:59:36 - INFO - codeparrot_training - Step 36649: {'lr': 8.347234202325704e-05, 'samples': 18764800, 'steps': 36649, 'batch_loss/train': 0.7977390652522445}
12/27/2021 18:59:47 - INFO - codeparrot_training - Step 36650: {'lr': 8.346058537408572e-05, 'samples': 18765312, 'steps': 36650, 'batch_loss/train': 0.7385057779029012}
12/27/2021 18:59:57 - INFO - codeparrot_training - Step 36651: {'lr': 8.34488293870123e-05, 'samples': 18765824, 'steps': 36651, 'batch_loss/train': 0.8143444173038006}
12/27/2021 19:00:09 - INFO - codeparrot_training - Step 36652: {'lr': 8.34370740620835e-05, 'samples': 18766336, 'steps': 36652, 'batch_loss/train': 0.8025637520477176}
12/27/2021 19:00:20 - INFO - codeparrot_training - Step 36653: {'lr': 8.342531939934594e-05, 'samples': 18766848, 'steps': 36653, 'batch_loss/train': 0.7663442399352789}
12/27/2021 19:00:31 - INFO - codeparrot_training - Step 36654: {'lr': 8.341356539884634e-05, 'samples': 18767360, 'steps': 36654, 'batch_loss/train': 0.6247159566846676}
12/27/2021 19:00:43 - INFO - codeparrot_training - Step 36655: {'lr': 8.340181206063167e-05, 'samples': 18767872, 'steps': 36655, 'batch_loss/train': 0.7368217329494655}
12/27/2021 19:00:53 - INFO - codeparrot_training - Step 36656: {'lr': 8.339005938474839e-05, 'samples': 18768384, 'steps': 36656, 'batch_loss/train': 0.7629817025735974}
12/27/2021 19:01:04 - INFO - codeparrot_training - Step 36657: {'lr': 8.337830737124324e-05, 'samples': 18768896, 'steps': 36657, 'batch_loss/train': 0.7889930545352399}
12/27/2021 19:01:15 - INFO - codeparrot_training - Step 36658: {'lr': 8.336655602016321e-05, 'samples': 18769408, 'steps': 36658, 'batch_loss/train': 0.7361818780191243}
12/27/2021 19:01:29 - INFO - codeparrot_training - Step 36659: {'lr': 8.335480533155477e-05, 'samples': 18769920, 'steps': 36659, 'batch_loss/train': 0.7395272427238524}
12/27/2021 19:01:39 - INFO - codeparrot_training - Step 36660: {'lr': 8.334305530546465e-05, 'samples': 18770432, 'steps': 36660, 'batch_loss/train': 0.7955477610230446}
12/27/2021 19:01:50 - INFO - codeparrot_training - Step 36661: {'lr': 8.333130594193963e-05, 'samples': 18770944, 'steps': 36661, 'batch_loss/train': 0.7692244435893372}
12/27/2021 19:02:02 - INFO - codeparrot_training - Step 36662: {'lr': 8.331955724102644e-05, 'samples': 18771456, 'steps': 36662, 'batch_loss/train': 0.7964009270071983}
12/27/2021 19:02:13 - INFO - codeparrot_training - Step 36663: {'lr': 8.330780920277173e-05, 'samples': 18771968, 'steps': 36663, 'batch_loss/train': 0.7178944554179907}
12/27/2021 19:02:23 - INFO - codeparrot_training - Step 36664: {'lr': 8.32960618272223e-05, 'samples': 18772480, 'steps': 36664, 'batch_loss/train': 0.7055078698322177}
12/27/2021 19:02:36 - INFO - codeparrot_training - Step 36665: {'lr': 8.328431511442464e-05, 'samples': 18772992, 'steps': 36665, 'batch_loss/train': 0.7914983369410038}
12/27/2021 19:02:47 - INFO - codeparrot_training - Step 36666: {'lr': 8.327256906442568e-05, 'samples': 18773504, 'steps': 36666, 'batch_loss/train': 0.8274886095896363}
12/27/2021 19:02:57 - INFO - codeparrot_training - Step 36667: {'lr': 8.326082367727208e-05, 'samples': 18774016, 'steps': 36667, 'batch_loss/train': 0.715257456060499}
12/27/2021 19:03:08 - INFO - codeparrot_training - Step 36668: {'lr': 8.324907895301034e-05, 'samples': 18774528, 'steps': 36668, 'batch_loss/train': 0.6543253245763481}
12/27/2021 19:03:20 - INFO - codeparrot_training - Step 36669: {'lr': 8.323733489168748e-05, 'samples': 18775040, 'steps': 36669, 'batch_loss/train': 0.737963953986764}
12/27/2021 19:03:31 - INFO - codeparrot_training - Step 36670: {'lr': 8.322559149334991e-05, 'samples': 18775552, 'steps': 36670, 'batch_loss/train': 0.7751112682744861}
12/27/2021 19:03:41 - INFO - codeparrot_training - Step 36671: {'lr': 8.321384875804444e-05, 'samples': 18776064, 'steps': 36671, 'batch_loss/train': 0.661990950582549}
12/27/2021 19:03:54 - INFO - codeparrot_training - Step 36672: {'lr': 8.320210668581771e-05, 'samples': 18776576, 'steps': 36672, 'batch_loss/train': 0.6909137070178986}
12/27/2021 19:04:05 - INFO - codeparrot_training - Step 36673: {'lr': 8.319036527671642e-05, 'samples': 18777088, 'steps': 36673, 'batch_loss/train': 0.796280107460916}
12/27/2021 19:04:16 - INFO - codeparrot_training - Step 36674: {'lr': 8.317862453078726e-05, 'samples': 18777600, 'steps': 36674, 'batch_loss/train': 0.6653452316531911}
12/27/2021 19:04:28 - INFO - codeparrot_training - Step 36675: {'lr': 8.31668844480769e-05, 'samples': 18778112, 'steps': 36675, 'batch_loss/train': 0.7643136276165023}
12/27/2021 19:04:38 - INFO - codeparrot_training - Step 36676: {'lr': 8.3155145028632e-05, 'samples': 18778624, 'steps': 36676, 'batch_loss/train': 0.7190324678085744}
12/27/2021 19:04:49 - INFO - codeparrot_training - Step 36677: {'lr': 8.314340627249927e-05, 'samples': 18779136, 'steps': 36677, 'batch_loss/train': 0.7481923252344131}
12/27/2021 19:05:00 - INFO - codeparrot_training - Step 36678: {'lr': 8.313166817972542e-05, 'samples': 18779648, 'steps': 36678, 'batch_loss/train': 0.7342448101844639}
12/27/2021 19:05:12 - INFO - codeparrot_training - Step 36679: {'lr': 8.31199307503569e-05, 'samples': 18780160, 'steps': 36679, 'batch_loss/train': 0.7674197875894606}
12/27/2021 19:05:22 - INFO - codeparrot_training - Step 36680: {'lr': 8.310819398444059e-05, 'samples': 18780672, 'steps': 36680, 'batch_loss/train': 0.7173209087923169}
12/27/2021 19:05:33 - INFO - codeparrot_training - Step 36681: {'lr': 8.309645788202317e-05, 'samples': 18781184, 'steps': 36681, 'batch_loss/train': 0.8820277762133628}
12/27/2021 19:05:47 - INFO - codeparrot_training - Step 36682: {'lr': 8.308472244315113e-05, 'samples': 18781696, 'steps': 36682, 'batch_loss/train': 0.7659268770366907}
12/27/2021 19:05:57 - INFO - codeparrot_training - Step 36683: {'lr': 8.307298766787113e-05, 'samples': 18782208, 'steps': 36683, 'batch_loss/train': 0.7627882622182369}
12/27/2021 19:06:08 - INFO - codeparrot_training - Step 36684: {'lr': 8.306125355623007e-05, 'samples': 18782720, 'steps': 36684, 'batch_loss/train': 0.7582920058630407}
12/27/2021 19:06:20 - INFO - codeparrot_training - Step 36685: {'lr': 8.304952010827436e-05, 'samples': 18783232, 'steps': 36685, 'batch_loss/train': 0.7316897104028612}
12/27/2021 19:06:31 - INFO - codeparrot_training - Step 36686: {'lr': 8.30377873240507e-05, 'samples': 18783744, 'steps': 36686, 'batch_loss/train': 0.8650812925770879}
12/27/2021 19:06:41 - INFO - codeparrot_training - Step 36687: {'lr': 8.302605520360577e-05, 'samples': 18784256, 'steps': 36687, 'batch_loss/train': 0.6192779263947159}
12/27/2021 19:06:55 - INFO - codeparrot_training - Step 36688: {'lr': 8.301432374698617e-05, 'samples': 18784768, 'steps': 36688, 'batch_loss/train': 0.7715060086920857}
12/27/2021 19:07:06 - INFO - codeparrot_training - Step 36689: {'lr': 8.30025929542386e-05, 'samples': 18785280, 'steps': 36689, 'batch_loss/train': 0.7221859097480774}
12/27/2021 19:07:17 - INFO - codeparrot_training - Step 36690: {'lr': 8.299086282540974e-05, 'samples': 18785792, 'steps': 36690, 'batch_loss/train': 0.7902299319393933}
12/27/2021 19:07:27 - INFO - codeparrot_training - Step 36691: {'lr': 8.297913336054596e-05, 'samples': 18786304, 'steps': 36691, 'batch_loss/train': 0.7132461462169886}
12/27/2021 19:07:40 - INFO - codeparrot_training - Step 36692: {'lr': 8.296740455969418e-05, 'samples': 18786816, 'steps': 36692, 'batch_loss/train': 0.7527497126720846}
12/27/2021 19:07:50 - INFO - codeparrot_training - Step 36693: {'lr': 8.295567642290103e-05, 'samples': 18787328, 'steps': 36693, 'batch_loss/train': 0.7708500390872359}
12/27/2021 19:08:01 - INFO - codeparrot_training - Step 36694: {'lr': 8.294394895021284e-05, 'samples': 18787840, 'steps': 36694, 'batch_loss/train': 0.5707001652917825}
12/27/2021 19:08:13 - INFO - codeparrot_training - Step 36695: {'lr': 8.293222214167662e-05, 'samples': 18788352, 'steps': 36695, 'batch_loss/train': 0.7867885027080774}
12/27/2021 19:08:23 - INFO - codeparrot_training - Step 36696: {'lr': 8.292049599733873e-05, 'samples': 18788864, 'steps': 36696, 'batch_loss/train': 0.6426711366511881}
12/27/2021 19:08:34 - INFO - codeparrot_training - Step 36697: {'lr': 8.290877051724582e-05, 'samples': 18789376, 'steps': 36697, 'batch_loss/train': 0.7498798822052777}
12/27/2021 19:08:47 - INFO - codeparrot_training - Step 36698: {'lr': 8.289704570144457e-05, 'samples': 18789888, 'steps': 36698, 'batch_loss/train': 0.8704461269080639}
12/27/2021 19:08:58 - INFO - codeparrot_training - Step 36699: {'lr': 8.288532154998158e-05, 'samples': 18790400, 'steps': 36699, 'batch_loss/train': 0.6691304436826613}
12/27/2021 19:09:09 - INFO - codeparrot_training - Step 36700: {'lr': 8.287359806290346e-05, 'samples': 18790912, 'steps': 36700, 'batch_loss/train': 0.6943811415694654}
12/27/2021 19:09:19 - INFO - codeparrot_training - Step 36701: {'lr': 8.28618752402568e-05, 'samples': 18791424, 'steps': 36701, 'batch_loss/train': 0.6811286918818951}
12/27/2021 19:09:32 - INFO - codeparrot_training - Step 36702: {'lr': 8.285015308208818e-05, 'samples': 18791936, 'steps': 36702, 'batch_loss/train': 0.6237934813834727}
12/27/2021 19:09:42 - INFO - codeparrot_training - Step 36703: {'lr': 8.283843158844429e-05, 'samples': 18792448, 'steps': 36703, 'batch_loss/train': 0.6001900602132082}
12/27/2021 19:09:53 - INFO - codeparrot_training - Step 36704: {'lr': 8.282671075937174e-05, 'samples': 18792960, 'steps': 36704, 'batch_loss/train': 0.7051602825522423}
12/27/2021 19:10:06 - INFO - codeparrot_training - Step 36705: {'lr': 8.281499059491687e-05, 'samples': 18793472, 'steps': 36705, 'batch_loss/train': 0.7593607977032661}
12/27/2021 19:10:17 - INFO - codeparrot_training - Step 36706: {'lr': 8.280327109512659e-05, 'samples': 18793984, 'steps': 36706, 'batch_loss/train': 0.7736416654661298}
12/27/2021 19:10:27 - INFO - codeparrot_training - Step 36707: {'lr': 8.279155226004745e-05, 'samples': 18794496, 'steps': 36707, 'batch_loss/train': 0.7529042079113424}
12/27/2021 19:10:38 - INFO - codeparrot_training - Step 36708: {'lr': 8.277983408972586e-05, 'samples': 18795008, 'steps': 36708, 'batch_loss/train': 1.3212894275784492}
12/27/2021 19:10:52 - INFO - codeparrot_training - Step 36709: {'lr': 8.276811658420846e-05, 'samples': 18795520, 'steps': 36709, 'batch_loss/train': 0.6654043085873127}
12/27/2021 19:11:02 - INFO - codeparrot_training - Step 36710: {'lr': 8.275639974354204e-05, 'samples': 18796032, 'steps': 36710, 'batch_loss/train': 0.7404041136614978}
12/27/2021 19:11:13 - INFO - codeparrot_training - Step 36711: {'lr': 8.274468356777292e-05, 'samples': 18796544, 'steps': 36711, 'batch_loss/train': 0.7424893220886588}
12/27/2021 19:11:25 - INFO - codeparrot_training - Step 36712: {'lr': 8.27329680569478e-05, 'samples': 18797056, 'steps': 36712, 'batch_loss/train': 0.7046129335649312}
12/27/2021 19:11:36 - INFO - codeparrot_training - Step 36713: {'lr': 8.272125321111326e-05, 'samples': 18797568, 'steps': 36713, 'batch_loss/train': 0.6934572672471404}
12/27/2021 19:11:46 - INFO - codeparrot_training - Step 36714: {'lr': 8.270953903031584e-05, 'samples': 18798080, 'steps': 36714, 'batch_loss/train': 0.7525898462627083}
12/27/2021 19:11:58 - INFO - codeparrot_training - Step 36715: {'lr': 8.269782551460211e-05, 'samples': 18798592, 'steps': 36715, 'batch_loss/train': 0.7445092551643029}
12/27/2021 19:12:09 - INFO - codeparrot_training - Step 36716: {'lr': 8.268611266401869e-05, 'samples': 18799104, 'steps': 36716, 'batch_loss/train': 0.7602471676655114}
12/27/2021 19:12:20 - INFO - codeparrot_training - Step 36717: {'lr': 8.267440047861208e-05, 'samples': 18799616, 'steps': 36717, 'batch_loss/train': 0.7065141280181706}
12/27/2021 19:12:33 - INFO - codeparrot_training - Step 36718: {'lr': 8.266268895842888e-05, 'samples': 18800128, 'steps': 36718, 'batch_loss/train': 0.8020841823890805}
12/27/2021 19:12:44 - INFO - codeparrot_training - Step 36719: {'lr': 8.265097810351575e-05, 'samples': 18800640, 'steps': 36719, 'batch_loss/train': 0.7886828747577965}
12/27/2021 19:12:54 - INFO - codeparrot_training - Step 36720: {'lr': 8.263926791391893e-05, 'samples': 18801152, 'steps': 36720, 'batch_loss/train': 0.7473430873360485}
12/27/2021 19:13:05 - INFO - codeparrot_training - Step 36721: {'lr': 8.26275583896853e-05, 'samples': 18801664, 'steps': 36721, 'batch_loss/train': 0.7293509100563824}
12/27/2021 19:13:17 - INFO - codeparrot_training - Step 36722: {'lr': 8.261584953086141e-05, 'samples': 18802176, 'steps': 36722, 'batch_loss/train': 0.8271241937763989}
12/27/2021 19:13:28 - INFO - codeparrot_training - Step 36723: {'lr': 8.260414133749359e-05, 'samples': 18802688, 'steps': 36723, 'batch_loss/train': 0.702637636102736}
12/27/2021 19:13:38 - INFO - codeparrot_training - Step 36724: {'lr': 8.259243380962848e-05, 'samples': 18803200, 'steps': 36724, 'batch_loss/train': 0.7661121701821685}
12/27/2021 19:13:50 - INFO - codeparrot_training - Step 36725: {'lr': 8.258072694731266e-05, 'samples': 18803712, 'steps': 36725, 'batch_loss/train': 0.7618011822924018}
12/27/2021 19:14:01 - INFO - codeparrot_training - Step 36726: {'lr': 8.256902075059267e-05, 'samples': 18804224, 'steps': 36726, 'batch_loss/train': 0.6544061969034374}
12/27/2021 19:14:11 - INFO - codeparrot_training - Step 36727: {'lr': 8.255731521951502e-05, 'samples': 18804736, 'steps': 36727, 'batch_loss/train': 0.9283193219453096}
12/27/2021 19:14:25 - INFO - codeparrot_training - Step 36728: {'lr': 8.254561035412625e-05, 'samples': 18805248, 'steps': 36728, 'batch_loss/train': 0.7025655549950898}
12/27/2021 19:14:36 - INFO - codeparrot_training - Step 36729: {'lr': 8.25339061544729e-05, 'samples': 18805760, 'steps': 36729, 'batch_loss/train': 0.8221601755358279}
12/27/2021 19:14:47 - INFO - codeparrot_training - Step 36730: {'lr': 8.252220262060154e-05, 'samples': 18806272, 'steps': 36730, 'batch_loss/train': 0.727706670993939}
12/27/2021 19:14:57 - INFO - codeparrot_training - Step 36731: {'lr': 8.251049975255864e-05, 'samples': 18806784, 'steps': 36731, 'batch_loss/train': 0.6668438776396215}
12/27/2021 19:15:09 - INFO - codeparrot_training - Step 36732: {'lr': 8.249879755039074e-05, 'samples': 18807296, 'steps': 36732, 'batch_loss/train': 0.7837021946907043}
12/27/2021 19:15:20 - INFO - codeparrot_training - Step 36733: {'lr': 8.248709601414451e-05, 'samples': 18807808, 'steps': 36733, 'batch_loss/train': 0.7762595815584064}
12/27/2021 19:15:31 - INFO - codeparrot_training - Step 36734: {'lr': 8.247539514386621e-05, 'samples': 18808320, 'steps': 36734, 'batch_loss/train': 0.7218529139645398}
12/27/2021 19:15:44 - INFO - codeparrot_training - Step 36735: {'lr': 8.24636949396024e-05, 'samples': 18808832, 'steps': 36735, 'batch_loss/train': 0.7300774870673195}
12/27/2021 19:15:54 - INFO - codeparrot_training - Step 36736: {'lr': 8.245199540139986e-05, 'samples': 18809344, 'steps': 36736, 'batch_loss/train': 0.6632990199141204}
12/27/2021 19:16:05 - INFO - codeparrot_training - Step 36737: {'lr': 8.244029652930482e-05, 'samples': 18809856, 'steps': 36737, 'batch_loss/train': 0.6472851859871298}
12/27/2021 19:16:17 - INFO - codeparrot_training - Step 36738: {'lr': 8.242859832336391e-05, 'samples': 18810368, 'steps': 36738, 'batch_loss/train': 0.7412304237950593}
12/27/2021 19:16:28 - INFO - codeparrot_training - Step 36739: {'lr': 8.241690078362363e-05, 'samples': 18810880, 'steps': 36739, 'batch_loss/train': 0.6556938900612295}
12/27/2021 19:16:38 - INFO - codeparrot_training - Step 36740: {'lr': 8.240520391013046e-05, 'samples': 18811392, 'steps': 36740, 'batch_loss/train': 0.7294130199588835}
12/27/2021 19:16:49 - INFO - codeparrot_training - Step 36741: {'lr': 8.239350770293092e-05, 'samples': 18811904, 'steps': 36741, 'batch_loss/train': 0.8234867444261909}
12/27/2021 19:17:01 - INFO - codeparrot_training - Step 36742: {'lr': 8.238181216207152e-05, 'samples': 18812416, 'steps': 36742, 'batch_loss/train': 0.8222115358803421}
12/27/2021 19:17:12 - INFO - codeparrot_training - Step 36743: {'lr': 8.237011728759871e-05, 'samples': 18812928, 'steps': 36743, 'batch_loss/train': 0.6943434564745985}
12/27/2021 19:17:22 - INFO - codeparrot_training - Step 36744: {'lr': 8.235842307955907e-05, 'samples': 18813440, 'steps': 36744, 'batch_loss/train': 0.733772274106741}
12/27/2021 19:17:35 - INFO - codeparrot_training - Step 36745: {'lr': 8.23467295379991e-05, 'samples': 18813952, 'steps': 36745, 'batch_loss/train': 0.5617423504590988}
12/27/2021 19:17:45 - INFO - codeparrot_training - Step 36746: {'lr': 8.233503666296504e-05, 'samples': 18814464, 'steps': 36746, 'batch_loss/train': 0.6908145986963063}
12/27/2021 19:17:56 - INFO - codeparrot_training - Step 36747: {'lr': 8.232334445450368e-05, 'samples': 18814976, 'steps': 36747, 'batch_loss/train': 0.7872436121106148}
12/27/2021 19:18:09 - INFO - codeparrot_training - Step 36748: {'lr': 8.231165291266146e-05, 'samples':
18815488, 'steps': 36748, 'batch_loss/train': 0.835467514116317} 12/27/2021 19:18:20 - INFO - codeparrot_training - Step 36749: {'lr': 8.229996203748472e-05, 'samples': 18816000, 'steps': 36749, 'batch_loss/train': 0.649962336756289} 12/27/2021 19:18:31 - INFO - codeparrot_training - Step 36750: {'lr': 8.228827182902e-05, 'samples': 18816512, 'steps': 36750, 'batch_loss/train': 0.9221134753897786} 12/27/2021 19:18:43 - INFO - codeparrot_training - Step 36751: {'lr': 8.227658228731377e-05, 'samples': 18817024, 'steps': 36751, 'batch_loss/train': 0.8663525313604623} 12/27/2021 19:18:53 - INFO - codeparrot_training - Step 36752: {'lr': 8.226489341241256e-05, 'samples': 18817536, 'steps': 36752, 'batch_loss/train': 0.7606687881052494} 12/27/2021 19:19:04 - INFO - codeparrot_training - Step 36753: {'lr': 8.225320520436278e-05, 'samples': 18818048, 'steps': 36753, 'batch_loss/train': 0.6477615816984326} 12/27/2021 19:19:15 - INFO - codeparrot_training - Step 36754: {'lr': 8.22415176632109e-05, 'samples': 18818560, 'steps': 36754, 'batch_loss/train': 0.7545359756331891} 12/27/2021 19:19:27 - INFO - codeparrot_training - Step 36755: {'lr': 8.222983078900343e-05, 'samples': 18819072, 'steps': 36755, 'batch_loss/train': 0.580536836758256} 12/27/2021 19:19:37 - INFO - codeparrot_training - Step 36756: {'lr': 8.221814458178677e-05, 'samples': 18819584, 'steps': 36756, 'batch_loss/train': 0.7237522923387587} 12/27/2021 19:19:48 - INFO - codeparrot_training - Step 36757: {'lr': 8.220645904160745e-05, 'samples': 18820096, 'steps': 36757, 'batch_loss/train': 0.8393603479489684} 12/27/2021 19:20:01 - INFO - codeparrot_training - Step 36758: {'lr': 8.219477416851187e-05, 'samples': 18820608, 'steps': 36758, 'batch_loss/train': 0.652690727263689} 12/27/2021 19:20:12 - INFO - codeparrot_training - Step 36759: {'lr': 8.218308996254662e-05, 'samples': 18821120, 'steps': 36759, 'batch_loss/train': 0.6842179307714105} 12/27/2021 19:20:23 - INFO - codeparrot_training - Step 36760: {'lr': 
8.217140642375793e-05, 'samples': 18821632, 'steps': 36760, 'batch_loss/train': 0.819336699321866} 12/27/2021 19:20:33 - INFO - codeparrot_training - Step 36761: {'lr': 8.215972355219228e-05, 'samples': 18822144, 'steps': 36761, 'batch_loss/train': 0.7013769603800029} 12/27/2021 19:20:46 - INFO - codeparrot_training - Step 36762: {'lr': 8.214804134789635e-05, 'samples': 18822656, 'steps': 36762, 'batch_loss/train': 0.7992229899391532} 12/27/2021 19:20:56 - INFO - codeparrot_training - Step 36763: {'lr': 8.213635981091636e-05, 'samples': 18823168, 'steps': 36763, 'batch_loss/train': 0.7219918495975435} 12/27/2021 19:21:07 - INFO - codeparrot_training - Step 36764: {'lr': 8.212467894129883e-05, 'samples': 18823680, 'steps': 36764, 'batch_loss/train': 0.45981553476303816} 12/27/2021 19:21:20 - INFO - codeparrot_training - Step 36765: {'lr': 8.211299873909017e-05, 'samples': 18824192, 'steps': 36765, 'batch_loss/train': 0.7990710423327982} 12/27/2021 19:21:31 - INFO - codeparrot_training - Step 36766: {'lr': 8.210131920433684e-05, 'samples': 18824704, 'steps': 36766, 'batch_loss/train': 0.7462184083415195} 12/27/2021 19:21:41 - INFO - codeparrot_training - Step 36767: {'lr': 8.208964033708527e-05, 'samples': 18825216, 'steps': 36767, 'batch_loss/train': 0.6759381568990648} 12/27/2021 19:21:54 - INFO - codeparrot_training - Step 36768: {'lr': 8.207796213738189e-05, 'samples': 18825728, 'steps': 36768, 'batch_loss/train': 0.7129532899707556} 12/27/2021 19:22:04 - INFO - codeparrot_training - Step 36769: {'lr': 8.206628460527313e-05, 'samples': 18826240, 'steps': 36769, 'batch_loss/train': 0.754168601706624} 12/27/2021 19:22:15 - INFO - codeparrot_training - Step 36770: {'lr': 8.205460774080542e-05, 'samples': 18826752, 'steps': 36770, 'batch_loss/train': 0.761558098718524} 12/27/2021 19:22:27 - INFO - codeparrot_training - Step 36771: {'lr': 8.204293154402525e-05, 'samples': 18827264, 'steps': 36771, 'batch_loss/train': 0.7844746564514935} 12/27/2021 19:22:38 - INFO - 
codeparrot_training - Step 36772: {'lr': 8.203125601497879e-05, 'samples': 18827776, 'steps': 36772, 'batch_loss/train': 0.7896722995210439} 12/27/2021 19:22:48 - INFO - codeparrot_training - Step 36773: {'lr': 8.201958115371272e-05, 'samples': 18828288, 'steps': 36773, 'batch_loss/train': 0.7840449684299529} 12/27/2021 19:22:59 - INFO - codeparrot_training - Step 36774: {'lr': 8.200790696027346e-05, 'samples': 18828800, 'steps': 36774, 'batch_loss/train': 0.6911109746433794} 12/27/2021 19:23:11 - INFO - codeparrot_training - Step 36775: {'lr': 8.199623343470725e-05, 'samples': 18829312, 'steps': 36775, 'batch_loss/train': 0.6909348362823948} 12/27/2021 19:23:22 - INFO - codeparrot_training - Step 36776: {'lr': 8.19845605770605e-05, 'samples': 18829824, 'steps': 36776, 'batch_loss/train': 0.7904824861325324} 12/27/2021 19:23:32 - INFO - codeparrot_training - Step 36777: {'lr': 8.197288838737985e-05, 'samples': 18830336, 'steps': 36777, 'batch_loss/train': 0.7337430389598012} 12/27/2021 19:23:46 - INFO - codeparrot_training - Step 36778: {'lr': 8.196121686571147e-05, 'samples': 18830848, 'steps': 36778, 'batch_loss/train': 0.7125182142481208} 12/27/2021 19:23:57 - INFO - codeparrot_training - Step 36779: {'lr': 8.194954601210186e-05, 'samples': 18831360, 'steps': 36779, 'batch_loss/train': 0.7385668908245862} 12/27/2021 19:24:07 - INFO - codeparrot_training - Step 36780: {'lr': 8.19378758265974e-05, 'samples': 18831872, 'steps': 36780, 'batch_loss/train': 0.6613584808073938} 12/27/2021 19:24:20 - INFO - codeparrot_training - Step 36781: {'lr': 8.192620630924447e-05, 'samples': 18832384, 'steps': 36781, 'batch_loss/train': 0.765509813092649} 12/27/2021 19:24:30 - INFO - codeparrot_training - Step 36782: {'lr': 8.191453746008948e-05, 'samples': 18832896, 'steps': 36782, 'batch_loss/train': 0.5943941497243941} 12/27/2021 19:24:41 - INFO - codeparrot_training - Step 36783: {'lr': 8.190286927917883e-05, 'samples': 18833408, 'steps': 36783, 'batch_loss/train': 
0.8018295224756002} 12/27/2021 19:24:52 - INFO - codeparrot_training - Step 36784: {'lr': 8.18912017665589e-05, 'samples': 18833920, 'steps': 36784, 'batch_loss/train': 0.8336114436388016} 12/27/2021 19:25:04 - INFO - codeparrot_training - Step 36785: {'lr': 8.187953492227617e-05, 'samples': 18834432, 'steps': 36785, 'batch_loss/train': 0.7331938208080828} 12/27/2021 19:25:14 - INFO - codeparrot_training - Step 36786: {'lr': 8.186786874637683e-05, 'samples': 18834944, 'steps': 36786, 'batch_loss/train': 0.7029874823056161} 12/27/2021 19:25:25 - INFO - codeparrot_training - Step 36787: {'lr': 8.18562032389073e-05, 'samples': 18835456, 'steps': 36787, 'batch_loss/train': 0.695819562766701} 12/27/2021 19:25:38 - INFO - codeparrot_training - Step 36788: {'lr': 8.184453839991415e-05, 'samples': 18835968, 'steps': 36788, 'batch_loss/train': 0.7293300924357027} 12/27/2021 19:25:49 - INFO - codeparrot_training - Step 36789: {'lr': 8.183287422944358e-05, 'samples': 18836480, 'steps': 36789, 'batch_loss/train': 0.7100752159021795} 12/27/2021 19:25:59 - INFO - codeparrot_training - Step 36790: {'lr': 8.182121072754189e-05, 'samples': 18836992, 'steps': 36790, 'batch_loss/train': 0.6504524919437245} 12/27/2021 19:26:12 - INFO - codeparrot_training - Step 36791: {'lr': 8.180954789425574e-05, 'samples': 18837504, 'steps': 36791, 'batch_loss/train': 0.7373912390321493} 12/27/2021 19:26:22 - INFO - codeparrot_training - Step 36792: {'lr': 8.179788572963123e-05, 'samples': 18838016, 'steps': 36792, 'batch_loss/train': 0.7287259558215737} 12/27/2021 19:26:33 - INFO - codeparrot_training - Step 36793: {'lr': 8.178622423371481e-05, 'samples': 18838528, 'steps': 36793, 'batch_loss/train': 0.7091111578047276} 12/27/2021 19:26:46 - INFO - codeparrot_training - Step 36794: {'lr': 8.177456340655288e-05, 'samples': 18839040, 'steps': 36794, 'batch_loss/train': 0.7610681247897446} 12/27/2021 19:26:57 - INFO - codeparrot_training - Step 36795: {'lr': 8.176290324819174e-05, 'samples': 
18839552, 'steps': 36795, 'batch_loss/train': 0.761819502338767} 12/27/2021 19:27:07 - INFO - codeparrot_training - Step 36796: {'lr': 8.175124375867779e-05, 'samples': 18840064, 'steps': 36796, 'batch_loss/train': 0.9266799129545689} 12/27/2021 19:27:18 - INFO - codeparrot_training - Step 36797: {'lr': 8.173958493805742e-05, 'samples': 18840576, 'steps': 36797, 'batch_loss/train': 0.6932497979141772} 12/27/2021 19:27:30 - INFO - codeparrot_training - Step 36798: {'lr': 8.172792678637678e-05, 'samples': 18841088, 'steps': 36798, 'batch_loss/train': 0.8455711221322417} 12/27/2021 19:27:41 - INFO - codeparrot_training - Step 36799: {'lr': 8.171626930368245e-05, 'samples': 18841600, 'steps': 36799, 'batch_loss/train': 0.763932618778199} 12/27/2021 19:27:51 - INFO - codeparrot_training - Step 36800: {'lr': 8.170461249002078e-05, 'samples': 18842112, 'steps': 36800, 'batch_loss/train': 0.7166707688011229} 12/27/2021 19:28:03 - INFO - codeparrot_training - Step 36801: {'lr': 8.169295634543794e-05, 'samples': 18842624, 'steps': 36801, 'batch_loss/train': 0.791022811550647} 12/27/2021 19:28:14 - INFO - codeparrot_training - Step 36802: {'lr': 8.168130086998026e-05, 'samples': 18843136, 'steps': 36802, 'batch_loss/train': 0.7194848187500611} 12/27/2021 19:28:25 - INFO - codeparrot_training - Step 36803: {'lr': 8.166964606369434e-05, 'samples': 18843648, 'steps': 36803, 'batch_loss/train': 0.7383135426789522} 12/27/2021 19:28:38 - INFO - codeparrot_training - Step 36804: {'lr': 8.16579919266263e-05, 'samples': 18844160, 'steps': 36804, 'batch_loss/train': 0.6862031840719283} 12/27/2021 19:28:48 - INFO - codeparrot_training - Step 36805: {'lr': 8.164633845882246e-05, 'samples': 18844672, 'steps': 36805, 'batch_loss/train': 0.695546951959841} 12/27/2021 19:28:59 - INFO - codeparrot_training - Step 36806: {'lr': 8.163468566032925e-05, 'samples': 18845184, 'steps': 36806, 'batch_loss/train': 0.7079977281391621} 12/27/2021 19:29:10 - INFO - codeparrot_training - Step 36807: 
{'lr': 8.162303353119296e-05, 'samples': 18845696, 'steps': 36807, 'batch_loss/train': 0.7805540887638927} 12/27/2021 19:29:22 - INFO - codeparrot_training - Step 36808: {'lr': 8.161138207145991e-05, 'samples': 18846208, 'steps': 36808, 'batch_loss/train': 0.6752570478711277} 12/27/2021 19:29:32 - INFO - codeparrot_training - Step 36809: {'lr': 8.159973128117643e-05, 'samples': 18846720, 'steps': 36809, 'batch_loss/train': 0.7090511965798214} 12/27/2021 19:29:43 - INFO - codeparrot_training - Step 36810: {'lr': 8.15880811603888e-05, 'samples': 18847232, 'steps': 36810, 'batch_loss/train': 0.7253330431412905} 12/27/2021 19:29:55 - INFO - codeparrot_training - Step 36811: {'lr': 8.157643170914339e-05, 'samples': 18847744, 'steps': 36811, 'batch_loss/train': 0.8539856234565377} 12/27/2021 19:30:06 - INFO - codeparrot_training - Step 36812: {'lr': 8.156478292748659e-05, 'samples': 18848256, 'steps': 36812, 'batch_loss/train': 1.0844068620353937} 12/27/2021 19:30:17 - INFO - codeparrot_training - Step 36813: {'lr': 8.155313481546443e-05, 'samples': 18848768, 'steps': 36813, 'batch_loss/train': 0.7180340886116028} 12/27/2021 19:30:29 - INFO - codeparrot_training - Step 36814: {'lr': 8.154148737312356e-05, 'samples': 18849280, 'steps': 36814, 'batch_loss/train': 0.7895759975071996} 12/27/2021 19:30:39 - INFO - codeparrot_training - Step 36815: {'lr': 8.152984060051006e-05, 'samples': 18849792, 'steps': 36815, 'batch_loss/train': 0.6023366348817945} 12/27/2021 19:30:50 - INFO - codeparrot_training - Step 36816: {'lr': 8.151819449767022e-05, 'samples': 18850304, 'steps': 36816, 'batch_loss/train': 0.674866552464664} 12/27/2021 19:31:00 - INFO - codeparrot_training - Step 36817: {'lr': 8.150654906465057e-05, 'samples': 18850816, 'steps': 36817, 'batch_loss/train': 0.7286111665889621} 12/27/2021 19:31:14 - INFO - codeparrot_training - Step 36818: {'lr': 8.149490430149723e-05, 'samples': 18851328, 'steps': 36818, 'batch_loss/train': 0.6519960192963481} 12/27/2021 19:31:25 - 
INFO - codeparrot_training - Step 36819: {'lr': 8.148326020825648e-05, 'samples': 18851840, 'steps': 36819, 'batch_loss/train': 0.6578388861380517} 12/27/2021 19:31:36 - INFO - codeparrot_training - Step 36820: {'lr': 8.147161678497466e-05, 'samples': 18852352, 'steps': 36820, 'batch_loss/train': 0.6646269088378176} 12/27/2021 19:31:48 - INFO - codeparrot_training - Step 36821: {'lr': 8.145997403169805e-05, 'samples': 18852864, 'steps': 36821, 'batch_loss/train': 0.7262786654755473} 12/27/2021 19:31:58 - INFO - codeparrot_training - Step 36822: {'lr': 8.144833194847298e-05, 'samples': 18853376, 'steps': 36822, 'batch_loss/train': 0.739557926543057} 12/27/2021 19:32:09 - INFO - codeparrot_training - Step 36823: {'lr': 8.143669053534575e-05, 'samples': 18853888, 'steps': 36823, 'batch_loss/train': 0.7747603859752417} 12/27/2021 19:32:22 - INFO - codeparrot_training - Step 36824: {'lr': 8.142504979236245e-05, 'samples': 18854400, 'steps': 36824, 'batch_loss/train': 0.7168976957909763} 12/27/2021 19:32:33 - INFO - codeparrot_training - Step 36825: {'lr': 8.141340971956957e-05, 'samples': 18854912, 'steps': 36825, 'batch_loss/train': 0.5939926888095215} 12/27/2021 19:32:44 - INFO - codeparrot_training - Step 36826: {'lr': 8.14017703170134e-05, 'samples': 18855424, 'steps': 36826, 'batch_loss/train': 0.8172151725739241} 12/27/2021 19:32:54 - INFO - codeparrot_training - Step 36827: {'lr': 8.139013158473996e-05, 'samples': 18855936, 'steps': 36827, 'batch_loss/train': 0.7251099869608879} 12/27/2021 19:33:07 - INFO - codeparrot_training - Step 36828: {'lr': 8.137849352279578e-05, 'samples': 18856448, 'steps': 36828, 'batch_loss/train': 0.7172811883501709} 12/27/2021 19:33:17 - INFO - codeparrot_training - Step 36829: {'lr': 8.136685613122713e-05, 'samples': 18856960, 'steps': 36829, 'batch_loss/train': 0.751725057605654} 12/27/2021 19:33:28 - INFO - codeparrot_training - Step 36830: {'lr': 8.13552194100801e-05, 'samples': 18857472, 'steps': 36830, 'batch_loss/train': 
0.7007618276402354} 12/27/2021 19:33:40 - INFO - codeparrot_training - Step 36831: {'lr': 8.134358335940104e-05, 'samples': 18857984, 'steps': 36831, 'batch_loss/train': 0.7229756400920451} 12/27/2021 19:33:51 - INFO - codeparrot_training - Step 36832: {'lr': 8.13319479792362e-05, 'samples': 18858496, 'steps': 36832, 'batch_loss/train': 0.683511127717793} 12/27/2021 19:34:01 - INFO - codeparrot_training - Step 36833: {'lr': 8.132031326963185e-05, 'samples': 18859008, 'steps': 36833, 'batch_loss/train': 0.7074511433020234} 12/27/2021 19:34:15 - INFO - codeparrot_training - Step 36834: {'lr': 8.130867923063426e-05, 'samples': 18859520, 'steps': 36834, 'batch_loss/train': 0.7093831025995314} 12/27/2021 19:34:25 - INFO - codeparrot_training - Step 36835: {'lr': 8.129704586228966e-05, 'samples': 18860032, 'steps': 36835, 'batch_loss/train': 0.7368546980433166} 12/27/2021 19:34:36 - INFO - codeparrot_training - Step 36836: {'lr': 8.12854131646443e-05, 'samples': 18860544, 'steps': 36836, 'batch_loss/train': 0.6976542524062097} 12/27/2021 19:34:46 - INFO - codeparrot_training - Step 36837: {'lr': 8.127378113774445e-05, 'samples': 18861056, 'steps': 36837, 'batch_loss/train': 0.7866384629160166} 12/27/2021 19:34:59 - INFO - codeparrot_training - Step 36838: {'lr': 8.12621497816364e-05, 'samples': 18861568, 'steps': 36838, 'batch_loss/train': 0.6374352006241679} 12/27/2021 19:35:09 - INFO - codeparrot_training - Step 36839: {'lr': 8.125051909636616e-05, 'samples': 18862080, 'steps': 36839, 'batch_loss/train': 0.7000827104784548} 12/27/2021 19:35:20 - INFO - codeparrot_training - Step 36840: {'lr': 8.123888908198032e-05, 'samples': 18862592, 'steps': 36840, 'batch_loss/train': 0.7896927102701738} 12/27/2021 19:35:32 - INFO - codeparrot_training - Step 36841: {'lr': 8.122725973852485e-05, 'samples': 18863104, 'steps': 36841, 'batch_loss/train': 0.7318034525960684} 12/27/2021 19:35:43 - INFO - codeparrot_training - Step 36842: {'lr': 8.121563106604599e-05, 'samples': 18863616, 
'steps': 36842, 'batch_loss/train': 0.7470270856283605} 12/27/2021 19:35:53 - INFO - codeparrot_training - Step 36843: {'lr': 8.12040030645902e-05, 'samples': 18864128, 'steps': 36843, 'batch_loss/train': 0.7793890470638871} 12/27/2021 19:36:07 - INFO - codeparrot_training - Step 36844: {'lr': 8.11923757342035e-05, 'samples': 18864640, 'steps': 36844, 'batch_loss/train': 0.6614782433025539} 12/27/2021 19:36:18 - INFO - codeparrot_training - Step 36845: {'lr': 8.118074907493215e-05, 'samples': 18865152, 'steps': 36845, 'batch_loss/train': 0.7631401978433132} 12/27/2021 19:36:28 - INFO - codeparrot_training - Step 36846: {'lr': 8.11691230868224e-05, 'samples': 18865664, 'steps': 36846, 'batch_loss/train': 0.7514959062682465} 12/27/2021 19:36:41 - INFO - codeparrot_training - Step 36847: {'lr': 8.115749776992048e-05, 'samples': 18866176, 'steps': 36847, 'batch_loss/train': 0.7293986170552671} 12/27/2021 19:36:51 - INFO - codeparrot_training - Step 36848: {'lr': 8.11458731242726e-05, 'samples': 18866688, 'steps': 36848, 'batch_loss/train': 0.827786989393644} 12/27/2021 19:37:02 - INFO - codeparrot_training - Step 36849: {'lr': 8.113424914992504e-05, 'samples': 18867200, 'steps': 36849, 'batch_loss/train': 0.7526131917256862} 12/27/2021 19:37:13 - INFO - codeparrot_training - Step 36850: {'lr': 8.112262584692379e-05, 'samples': 18867712, 'steps': 36850, 'batch_loss/train': 0.6518911882303655} 12/27/2021 19:37:25 - INFO - codeparrot_training - Step 36851: {'lr': 8.111100321531528e-05, 'samples': 18868224, 'steps': 36851, 'batch_loss/train': 0.6751706865034066} 12/27/2021 19:37:35 - INFO - codeparrot_training - Step 36852: {'lr': 8.109938125514577e-05, 'samples': 18868736, 'steps': 36852, 'batch_loss/train': 0.7690290217287838} 12/27/2021 19:37:46 - INFO - codeparrot_training - Step 36853: {'lr': 8.108775996646114e-05, 'samples': 18869248, 'steps': 36853, 'batch_loss/train': 0.6755931687075645} 12/27/2021 19:37:59 - INFO - codeparrot_training - Step 36854: {'lr': 
8.107613934930791e-05, 'samples': 18869760, 'steps': 36854, 'batch_loss/train': 0.8148000105284154} 12/27/2021 19:38:10 - INFO - codeparrot_training - Step 36855: {'lr': 8.106451940373222e-05, 'samples': 18870272, 'steps': 36855, 'batch_loss/train': 0.756522164796479} 12/27/2021 19:38:21 - INFO - codeparrot_training - Step 36856: {'lr': 8.105290012978014e-05, 'samples': 18870784, 'steps': 36856, 'batch_loss/train': 0.8010340081527829} 12/27/2021 19:38:32 - INFO - codeparrot_training - Step 36857: {'lr': 8.104128152749787e-05, 'samples': 18871296, 'steps': 36857, 'batch_loss/train': 0.6721746078692377} 12/27/2021 19:38:43 - INFO - codeparrot_training - Step 36858: {'lr': 8.102966359693184e-05, 'samples': 18871808, 'steps': 36858, 'batch_loss/train': 0.7383988946676254} 12/27/2021 19:38:54 - INFO - codeparrot_training - Step 36859: {'lr': 8.101804633812796e-05, 'samples': 18872320, 'steps': 36859, 'batch_loss/train': 0.6660431808559224} 12/27/2021 19:39:04 - INFO - codeparrot_training - Step 36860: {'lr': 8.100642975113254e-05, 'samples': 18872832, 'steps': 36860, 'batch_loss/train': 0.6999983650748618} 12/27/2021 19:39:17 - INFO - codeparrot_training - Step 36861: {'lr': 8.099481383599173e-05, 'samples': 18873344, 'steps': 36861, 'batch_loss/train': 0.804377909284085} 12/27/2021 19:39:27 - INFO - codeparrot_training - Step 36862: {'lr': 8.098319859275175e-05, 'samples': 18873856, 'steps': 36862, 'batch_loss/train': 0.6538141523487866} 12/27/2021 19:39:38 - INFO - codeparrot_training - Step 36863: {'lr': 8.097158402145874e-05, 'samples': 18874368, 'steps': 36863, 'batch_loss/train': 0.6323621980845928} 12/27/2021 19:39:51 - INFO - codeparrot_training - Step 36864: {'lr': 8.095997012215886e-05, 'samples': 18874880, 'steps': 36864, 'batch_loss/train': 0.7291958652203903} 12/27/2021 19:40:02 - INFO - codeparrot_training - Step 36865: {'lr': 8.094835689489834e-05, 'samples': 18875392, 'steps': 36865, 'batch_loss/train': 0.7633236250840127} 12/27/2021 19:40:12 - INFO - 
codeparrot_training - Step 36866: {'lr': 8.093674433972333e-05, 'samples': 18875904, 'steps': 36866, 'batch_loss/train': 0.6835373051580973} 12/27/2021 19:40:24 - INFO - codeparrot_training - Step 36867: {'lr': 8.092513245668007e-05, 'samples': 18876416, 'steps': 36867, 'batch_loss/train': 0.7351164375431836} 12/27/2021 19:40:35 - INFO - codeparrot_training - Step 36868: {'lr': 8.091352124581447e-05, 'samples': 18876928, 'steps': 36868, 'batch_loss/train': 0.607794649200514} 12/27/2021 19:40:46 - INFO - codeparrot_training - Step 36869: {'lr': 8.090191070717303e-05, 'samples': 18877440, 'steps': 36869, 'batch_loss/train': 0.7105131624266505} 12/27/2021 19:40:56 - INFO - codeparrot_training - Step 36870: {'lr': 8.089030084080164e-05, 'samples': 18877952, 'steps': 36870, 'batch_loss/train': 0.6630623054224998} 12/27/2021 19:41:09 - INFO - codeparrot_training - Step 36871: {'lr': 8.087869164674661e-05, 'samples': 18878464, 'steps': 36871, 'batch_loss/train': 0.5591299654915929} 12/27/2021 19:41:19 - INFO - codeparrot_training - Step 36872: {'lr': 8.0867083125054e-05, 'samples': 18878976, 'steps': 36872, 'batch_loss/train': 1.3454275239491835} 12/27/2021 19:41:30 - INFO - codeparrot_training - Step 36873: {'lr': 8.085547527577003e-05, 'samples': 18879488, 'steps': 36873, 'batch_loss/train': 0.7422282649204135} 12/27/2021 19:41:44 - INFO - codeparrot_training - Step 36874: {'lr': 8.084386809894082e-05, 'samples': 18880000, 'steps': 36874, 'batch_loss/train': 0.7347936369478703} 12/27/2021 19:41:55 - INFO - codeparrot_training - Step 36875: {'lr': 8.08322615946126e-05, 'samples': 18880512, 'steps': 36875, 'batch_loss/train': 0.7151158032938838} 12/27/2021 19:42:05 - INFO - codeparrot_training - Step 36876: {'lr': 8.082065576283126e-05, 'samples': 18881024, 'steps': 36876, 'batch_loss/train': 0.7684811751823872} 12/27/2021 19:42:18 - INFO - codeparrot_training - Step 36877: {'lr': 8.080905060364322e-05, 'samples': 18881536, 'steps': 36877, 'batch_loss/train': 
0.7239079778082669} 12/27/2021 19:42:28 - INFO - codeparrot_training - Step 36878: {'lr': 8.079744611709455e-05, 'samples': 18882048, 'steps': 36878, 'batch_loss/train': 0.7966356407850981} 12/27/2021 19:42:39 - INFO - codeparrot_training - Step 36879: {'lr': 8.078584230323121e-05, 'samples': 18882560, 'steps': 36879, 'batch_loss/train': 0.7500490071251988} 12/27/2021 19:42:55 - INFO - codeparrot_training - Step 36880: {'lr': 8.077423916209956e-05, 'samples': 18883072, 'steps': 36880, 'batch_loss/train': 0.7034739209339023} 12/27/2021 19:43:06 - INFO - codeparrot_training - Step 36881: {'lr': 8.07626366937457e-05, 'samples': 18883584, 'steps': 36881, 'batch_loss/train': 0.8173080515116453} 12/27/2021 19:43:16 - INFO - codeparrot_training - Step 36882: {'lr': 8.075103489821561e-05, 'samples': 18884096, 'steps': 36882, 'batch_loss/train': 1.6090057156980038} 12/27/2021 19:43:27 - INFO - codeparrot_training - Step 36883: {'lr': 8.073943377555542e-05, 'samples': 18884608, 'steps': 36883, 'batch_loss/train': 0.7608630303293467} 12/27/2021 19:43:39 - INFO - codeparrot_training - Step 36884: {'lr': 8.072783332581151e-05, 'samples': 18885120, 'steps': 36884, 'batch_loss/train': 0.9904748285189271} 12/27/2021 19:43:50 - INFO - codeparrot_training - Step 36885: {'lr': 8.071623354902974e-05, 'samples': 18885632, 'steps': 36885, 'batch_loss/train': 0.7896097507327795} 12/27/2021 19:44:00 - INFO - codeparrot_training - Step 36886: {'lr': 8.07046344452563e-05, 'samples': 18886144, 'steps': 36886, 'batch_loss/train': 0.7176877232268453} 12/27/2021 19:44:11 - INFO - codeparrot_training - Step 36887: {'lr': 8.069303601453728e-05, 'samples': 18886656, 'steps': 36887, 'batch_loss/train': 0.7254536878317595} 12/27/2021 19:44:23 - INFO - codeparrot_training - Step 36888: {'lr': 8.068143825691887e-05, 'samples': 18887168, 'steps': 36888, 'batch_loss/train': 0.7389677013270557} 12/27/2021 19:44:33 - INFO - codeparrot_training - Step 36889: {'lr': 8.066984117244711e-05, 'samples': 
18887680, 'steps': 36889, 'batch_loss/train': 0.7597364690154791} 12/27/2021 19:44:44 - INFO - codeparrot_training - Step 36890: {'lr': 8.065824476116812e-05, 'samples': 18888192, 'steps': 36890, 'batch_loss/train': 0.743790264474228} 12/27/2021 19:44:58 - INFO - codeparrot_training - Step 36891: {'lr': 8.064664902312799e-05, 'samples': 18888704, 'steps': 36891, 'batch_loss/train': 0.7111296670045704} 12/27/2021 19:45:09 - INFO - codeparrot_training - Step 36892: {'lr': 8.063505395837286e-05, 'samples': 18889216, 'steps': 36892, 'batch_loss/train': 0.7457968392409384} 12/27/2021 19:45:19 - INFO - codeparrot_training - Step 36893: {'lr': 8.062345956694889e-05, 'samples': 18889728, 'steps': 36893, 'batch_loss/train': 0.6793758892454207} 12/27/2021 19:45:32 - INFO - codeparrot_training - Step 36894: {'lr': 8.061186584890193e-05, 'samples': 18890240, 'steps': 36894, 'batch_loss/train': 0.7477684011682868} 12/27/2021 19:45:42 - INFO - codeparrot_training - Step 36895: {'lr': 8.060027280427837e-05, 'samples': 18890752, 'steps': 36895, 'batch_loss/train': 0.7119567878544331} 12/27/2021 19:45:53 - INFO - codeparrot_training - Step 36896: {'lr': 8.058868043312412e-05, 'samples': 18891264, 'steps': 36896, 'batch_loss/train': 0.7043380625545979} 12/27/2021 19:46:06 - INFO - codeparrot_training - Step 36897: {'lr': 8.057708873548528e-05, 'samples': 18891776, 'steps': 36897, 'batch_loss/train': 0.7337637129239738} 12/27/2021 19:46:17 - INFO - codeparrot_training - Step 36898: {'lr': 8.056549771140797e-05, 'samples': 18892288, 'steps': 36898, 'batch_loss/train': 0.6913101570680737} 12/27/2021 19:46:28 - INFO - codeparrot_training - Step 36899: {'lr': 8.055390736093826e-05, 'samples': 18892800, 'steps': 36899, 'batch_loss/train': 0.5938840243034065} 12/27/2021 19:46:38 - INFO - codeparrot_training - Step 36900: {'lr': 8.054231768412223e-05, 'samples': 18893312, 'steps': 36900, 'batch_loss/train': 0.7291331430897117} 12/27/2021 19:46:50 - INFO - codeparrot_training - Step 36901: 
{'lr': 8.0530728681006e-05, 'samples': 18893824, 'steps': 36901, 'batch_loss/train': 0.6893025138415396} 12/27/2021 19:47:01 - INFO - codeparrot_training - Step 36902: {'lr': 8.051914035163557e-05, 'samples': 18894336, 'steps': 36902, 'batch_loss/train': 0.8290727366693318} 12/27/2021 19:47:12 - INFO - codeparrot_training - Step 36903: {'lr': 8.050755269605703e-05, 'samples': 18894848, 'steps': 36903, 'batch_loss/train': 0.6685403301380575} 12/27/2021 19:47:24 - INFO - codeparrot_training - Step 36904: {'lr': 8.049596571431659e-05, 'samples': 18895360, 'steps': 36904, 'batch_loss/train': 0.5569492348004133} 12/27/2021 19:47:34 - INFO - codeparrot_training - Step 36905: {'lr': 8.048437940645997e-05, 'samples': 18895872, 'steps': 36905, 'batch_loss/train': 0.7711724732071161} 12/27/2021 19:47:45 - INFO - codeparrot_training - Step 36906: {'lr': 8.047279377253358e-05, 'samples': 18896384, 'steps': 36906, 'batch_loss/train': 0.7012423751875758} 12/27/2021 19:47:59 - INFO - codeparrot_training - Step 36907: {'lr': 8.04612088125834e-05, 'samples': 18896896, 'steps': 36907, 'batch_loss/train': 0.6269511538557708} 12/27/2021 19:48:09 - INFO - codeparrot_training - Step 36908: {'lr': 8.044962452665538e-05, 'samples': 18897408, 'steps': 36908, 'batch_loss/train': 0.7157770404592156} 12/27/2021 19:48:20 - INFO - codeparrot_training - Step 36909: {'lr': 8.043804091479554e-05, 'samples': 18897920, 'steps': 36909, 'batch_loss/train': 0.7551205945201218} 12/27/2021 19:48:32 - INFO - codeparrot_training - Step 36910: {'lr': 8.04264579770502e-05, 'samples': 18898432, 'steps': 36910, 'batch_loss/train': 0.691698188893497} 12/27/2021 19:48:43 - INFO - codeparrot_training - Step 36911: {'lr': 8.041487571346511e-05, 'samples': 18898944, 'steps': 36911, 'batch_loss/train': 0.7072904058732092} 12/27/2021 19:48:53 - INFO - codeparrot_training - Step 36912: {'lr': 8.04032941240865e-05, 'samples': 18899456, 'steps': 36912, 'batch_loss/train': 0.7114818999543786} 12/27/2021 19:49:04 - INFO - 
codeparrot_training - Step 36913: {'lr': 8.039171320896032e-05, 'samples': 18899968, 'steps': 36913, 'batch_loss/train': 0.5988868247077335} 12/27/2021 19:49:18 - INFO - codeparrot_training - Step 36914: {'lr': 8.038013296813265e-05, 'samples': 18900480, 'steps': 36914, 'batch_loss/train': 0.5819478366756812} 12/27/2021 19:49:29 - INFO - codeparrot_training - Step 36915: {'lr': 8.036855340164952e-05, 'samples': 18900992, 'steps': 36915, 'batch_loss/train': 0.6880327386315912} 12/27/2021 19:49:39 - INFO - codeparrot_training - Step 36916: {'lr': 8.035697450955701e-05, 'samples': 18901504, 'steps': 36916, 'batch_loss/train': 0.5279772836947814} 12/27/2021 19:49:51 - INFO - codeparrot_training - Step 36917: {'lr': 8.034539629190107e-05, 'samples': 18902016, 'steps': 36917, 'batch_loss/train': 0.8091711895540357} 12/27/2021 19:50:02 - INFO - codeparrot_training - Step 36918: {'lr': 8.033381874872781e-05, 'samples': 18902528, 'steps': 36918, 'batch_loss/train': 0.6125559504143894} 12/27/2021 19:50:12 - INFO - codeparrot_training - Step 36919: {'lr': 8.032224188008327e-05, 'samples': 18903040, 'steps': 36919, 'batch_loss/train': 0.5975349033251405} 12/27/2021 19:50:25 - INFO - codeparrot_training - Step 36920: {'lr': 8.03106656860133e-05, 'samples': 18903552, 'steps': 36920, 'batch_loss/train': 0.6071949643373955} 12/27/2021 19:50:35 - INFO - codeparrot_training - Step 36921: {'lr': 8.02990901665642e-05, 'samples': 18904064, 'steps': 36921, 'batch_loss/train': 0.6152003183960915} 12/27/2021 19:50:46 - INFO - codeparrot_training - Step 36922: {'lr': 8.028751532178175e-05, 'samples': 18904576, 'steps': 36922, 'batch_loss/train': 0.4654609391873237} 12/27/2021 19:50:56 - INFO - codeparrot_training - Step 36923: {'lr': 8.027594115171208e-05, 'samples': 18905088, 'steps': 36923, 'batch_loss/train': 0.6967829987406731} 12/27/2021 19:51:09 - INFO - codeparrot_training - Step 36924: {'lr': 8.026436765640116e-05, 'samples': 18905600, 'steps': 36924, 'batch_loss/train': 
0.6785035752691329} 12/27/2021 19:51:19 - INFO - codeparrot_training - Step 36925: {'lr': 8.025279483589504e-05, 'samples': 18906112, 'steps': 36925, 'batch_loss/train': 0.6825440381653607} 12/27/2021 19:51:30 - INFO - codeparrot_training - Step 36926: {'lr': 8.02412226902397e-05, 'samples': 18906624, 'steps': 36926, 'batch_loss/train': 0.6450645750155672} 12/27/2021 19:51:44 - INFO - codeparrot_training - Step 36927: {'lr': 8.022965121948114e-05, 'samples': 18907136, 'steps': 36927, 'batch_loss/train': 0.6821285784244537} 12/27/2021 19:51:54 - INFO - codeparrot_training - Step 36928: {'lr': 8.021808042366543e-05, 'samples': 18907648, 'steps': 36928, 'batch_loss/train': 0.7258439781144261} 12/27/2021 19:52:05 - INFO - codeparrot_training - Step 36929: {'lr': 8.020651030283849e-05, 'samples': 18908160, 'steps': 36929, 'batch_loss/train': 0.7041181027889252} 12/27/2021 19:52:17 - INFO - codeparrot_training - Step 36930: {'lr': 8.019494085704646e-05, 'samples': 18908672, 'steps': 36930, 'batch_loss/train': 0.7516040233895183} 12/27/2021 19:52:28 - INFO - codeparrot_training - Step 36931: {'lr': 8.018337208633505e-05, 'samples': 18909184, 'steps': 36931, 'batch_loss/train': 0.7117884890176356} 12/27/2021 19:52:38 - INFO - codeparrot_training - Step 36932: {'lr': 8.017180399075052e-05, 'samples': 18909696, 'steps': 36932, 'batch_loss/train': 0.6942347715375945} 12/27/2021 19:52:50 - INFO - codeparrot_training - Step 36933: {'lr': 8.016023657033886e-05, 'samples': 18910208, 'steps': 36933, 'batch_loss/train': 0.7463670251891017} 12/27/2021 19:53:01 - INFO - codeparrot_training - Step 36934: {'lr': 8.014866982514588e-05, 'samples': 18910720, 'steps': 36934, 'batch_loss/train': 0.6854433831758797} 12/27/2021 19:53:11 - INFO - codeparrot_training - Step 36935: {'lr': 8.013710375521757e-05, 'samples': 18911232, 'steps': 36935, 'batch_loss/train': 0.7309957551769912} 12/27/2021 19:53:22 - INFO - codeparrot_training - Step 36936: {'lr': 8.012553836060018e-05, 'samples': 
18911744, 'steps': 36936, 'batch_loss/train': 0.6329486801405437} 12/27/2021 19:53:36 - INFO - codeparrot_training - Step 36937: {'lr': 8.011397364133941e-05, 'samples': 18912256, 'steps': 36937, 'batch_loss/train': 0.699617141392082} 12/27/2021 19:53:47 - INFO - codeparrot_training - Step 36938: {'lr': 8.010240959748124e-05, 'samples': 18912768, 'steps': 36938, 'batch_loss/train': 0.6347867457079701} 12/27/2021 19:53:57 - INFO - codeparrot_training - Step 36939: {'lr': 8.009084622907192e-05, 'samples': 18913280, 'steps': 36939, 'batch_loss/train': 0.7843932881951332} 12/27/2021 19:54:09 - INFO - codeparrot_training - Step 36940: {'lr': 8.007928353615718e-05, 'samples': 18913792, 'steps': 36940, 'batch_loss/train': 0.8830147697590292} 12/27/2021 19:54:20 - INFO - codeparrot_training - Step 36941: {'lr': 8.006772151878302e-05, 'samples': 18914304, 'steps': 36941, 'batch_loss/train': 0.7413740064948797} 12/27/2021 19:54:31 - INFO - codeparrot_training - Step 36942: {'lr': 8.005616017699544e-05, 'samples': 18914816, 'steps': 36942, 'batch_loss/train': 0.8160474095493555} 12/27/2021 19:54:44 - INFO - codeparrot_training - Step 36943: {'lr': 8.00445995108404e-05, 'samples': 18915328, 'steps': 36943, 'batch_loss/train': 0.669685483968351} 12/27/2021 19:54:55 - INFO - codeparrot_training - Step 36944: {'lr': 8.003303952036387e-05, 'samples': 18915840, 'steps': 36944, 'batch_loss/train': 0.6905510048381984} 12/27/2021 19:55:05 - INFO - codeparrot_training - Step 36945: {'lr': 8.002148020561185e-05, 'samples': 18916352, 'steps': 36945, 'batch_loss/train': 0.7499243969796225} 12/27/2021 19:55:17 - INFO - codeparrot_training - Step 36946: {'lr': 8.00099215666301e-05, 'samples': 18916864, 'steps': 36946, 'batch_loss/train': 0.6741363164037466} 12/27/2021 19:55:28 - INFO - codeparrot_training - Step 36947: {'lr': 7.99983636034648e-05, 'samples': 18917376, 'steps': 36947, 'batch_loss/train': 0.6736148775089532} 12/27/2021 19:55:38 - INFO - codeparrot_training - Step 36948: 
{'lr': 7.99868063161619e-05, 'samples': 18917888, 'steps': 36948, 'batch_loss/train': 0.661782116163522} 12/27/2021 19:55:49 - INFO - codeparrot_training - Step 36949: {'lr': 7.997524970476716e-05, 'samples': 18918400, 'steps': 36949, 'batch_loss/train': 0.7129414090886712} 12/27/2021 19:56:01 - INFO - codeparrot_training - Step 36950: {'lr': 7.996369376932666e-05, 'samples': 18918912, 'steps': 36950, 'batch_loss/train': 0.6816863950807601} 12/27/2021 19:56:12 - INFO - codeparrot_training - Step 36951: {'lr': 7.995213850988628e-05, 'samples': 18919424, 'steps': 36951, 'batch_loss/train': 0.6894365836633369} 12/27/2021 19:56:22 - INFO - codeparrot_training - Step 36952: {'lr': 7.994058392649201e-05, 'samples': 18919936, 'steps': 36952, 'batch_loss/train': 0.737070289440453} 12/27/2021 19:56:36 - INFO - codeparrot_training - Step 36953: {'lr': 7.992903001918977e-05, 'samples': 18920448, 'steps': 36953, 'batch_loss/train': 0.6493254103697836} 12/27/2021 19:56:47 - INFO - codeparrot_training - Step 36954: {'lr': 7.991747678802547e-05, 'samples': 18920960, 'steps': 36954, 'batch_loss/train': 0.707736567594111} 12/27/2021 19:56:58 - INFO - codeparrot_training - Step 36955: {'lr': 7.990592423304508e-05, 'samples': 18921472, 'steps': 36955, 'batch_loss/train': 0.6233206814504229} 12/27/2021 19:57:10 - INFO - codeparrot_training - Step 36956: {'lr': 7.98943723542945e-05, 'samples': 18921984, 'steps': 36956, 'batch_loss/train': 0.7377684121020138} 12/27/2021 19:57:21 - INFO - codeparrot_training - Step 36957: {'lr': 7.988282115181966e-05, 'samples': 18922496, 'steps': 36957, 'batch_loss/train': 0.598848206398543} 12/27/2021 19:57:31 - INFO - codeparrot_training - Step 36958: {'lr': 7.987127062566651e-05, 'samples': 18923008, 'steps': 36958, 'batch_loss/train': 0.7215934516862035} 12/27/2021 19:57:42 - INFO - codeparrot_training - Step 36959: {'lr': 7.985972077588102e-05, 'samples': 18923520, 'steps': 36959, 'batch_loss/train': 0.7562639303505421} 12/27/2021 19:57:54 - INFO - 
codeparrot_training - Step 36960: {'lr': 7.984817160250896e-05, 'samples': 18924032, 'steps': 36960, 'batch_loss/train': 0.6938136909157038} 12/27/2021 19:58:05 - INFO - codeparrot_training - Step 36961: {'lr': 7.983662310559623e-05, 'samples': 18924544, 'steps': 36961, 'batch_loss/train': 0.6505296174436808} 12/27/2021 19:58:15 - INFO - codeparrot_training - Step 36962: {'lr': 7.982507528518901e-05, 'samples': 18925056, 'steps': 36962, 'batch_loss/train': 0.7245751381851733} 12/27/2021 19:58:27 - INFO - codeparrot_training - Step 36963: {'lr': 7.981352814133294e-05, 'samples': 18925568, 'steps': 36963, 'batch_loss/train': 0.7519895872101188} 12/27/2021 19:58:38 - INFO - codeparrot_training - Step 36964: {'lr': 7.980198167407396e-05, 'samples': 18926080, 'steps': 36964, 'batch_loss/train': 0.7928161518648267} 12/27/2021 19:58:48 - INFO - codeparrot_training - Step 36965: {'lr': 7.979043588345819e-05, 'samples': 18926592, 'steps': 36965, 'batch_loss/train': 0.6231258185289335} 12/27/2021 19:59:03 - INFO - codeparrot_training - Step 36966: {'lr': 7.977889076953129e-05, 'samples': 18927104, 'steps': 36966, 'batch_loss/train': 0.7196843731217086} 12/27/2021 19:59:13 - INFO - codeparrot_training - Step 36967: {'lr': 7.976734633233926e-05, 'samples': 18927616, 'steps': 36967, 'batch_loss/train': 0.6091135444294196} 12/27/2021 19:59:24 - INFO - codeparrot_training - Step 36968: {'lr': 7.9755802571928e-05, 'samples': 18928128, 'steps': 36968, 'batch_loss/train': 0.5303048739442602} 12/27/2021 19:59:36 - INFO - codeparrot_training - Step 36969: {'lr': 7.974425948834336e-05, 'samples': 18928640, 'steps': 36969, 'batch_loss/train': 0.4962115434464067} 12/27/2021 19:59:47 - INFO - codeparrot_training - Step 36970: {'lr': 7.973271708163124e-05, 'samples': 18929152, 'steps': 36970, 'batch_loss/train': 0.6717812474817038} 12/27/2021 19:59:57 - INFO - codeparrot_training - Step 36971: {'lr': 7.972117535183766e-05, 'samples': 18929664, 'steps': 36971, 'batch_loss/train': 
0.7308978689834476} 12/27/2021 20:00:08 - INFO - codeparrot_training - Step 36972: {'lr': 7.970963429900821e-05, 'samples': 18930176, 'steps': 36972, 'batch_loss/train': 0.7143170107156038} 12/27/2021 20:00:20 - INFO - codeparrot_training - Step 36973: {'lr': 7.969809392318906e-05, 'samples': 18930688, 'steps': 36973, 'batch_loss/train': 0.7234110152348876} 12/27/2021 20:00:31 - INFO - codeparrot_training - Step 36974: {'lr': 7.968655422442606e-05, 'samples': 18931200, 'steps': 36974, 'batch_loss/train': 0.6547766164876521} 12/27/2021 20:00:41 - INFO - codeparrot_training - Step 36975: {'lr': 7.967501520276483e-05, 'samples': 18931712, 'steps': 36975, 'batch_loss/train': 0.5781099917658139} 12/27/2021 20:00:55 - INFO - codeparrot_training - Step 36976: {'lr': 7.966347685825159e-05, 'samples': 18932224, 'steps': 36976, 'batch_loss/train': 0.6565211601264309} 12/27/2021 20:01:06 - INFO - codeparrot_training - Step 36977: {'lr': 7.9651939190932e-05, 'samples': 18932736, 'steps': 36977, 'batch_loss/train': 0.6065825319383293} 12/27/2021 20:01:17 - INFO - codeparrot_training - Step 36978: {'lr': 7.964040220085195e-05, 'samples': 18933248, 'steps': 36978, 'batch_loss/train': 0.8459489718079567} 12/27/2021 20:01:29 - INFO - codeparrot_training - Step 36979: {'lr': 7.962886588805734e-05, 'samples': 18933760, 'steps': 36979, 'batch_loss/train': 0.7724293577484787} 12/27/2021 20:01:40 - INFO - codeparrot_training - Step 36980: {'lr': 7.961733025259405e-05, 'samples': 18934272, 'steps': 36980, 'batch_loss/train': 0.6803732612170279} 12/27/2021 20:01:50 - INFO - codeparrot_training - Step 36981: {'lr': 7.96057952945079e-05, 'samples': 18934784, 'steps': 36981, 'batch_loss/train': 0.6963028875179589} 12/27/2021 20:02:04 - INFO - codeparrot_training - Step 36982: {'lr': 7.959426101384479e-05, 'samples': 18935296, 'steps': 36982, 'batch_loss/train': 0.7021687747910619} 12/27/2021 20:02:15 - INFO - codeparrot_training - Step 36983: {'lr': 7.958272741065054e-05, 'samples': 
18935808, 'steps': 36983, 'batch_loss/train': 0.7135840281844139} 12/27/2021 20:02:26 - INFO - codeparrot_training - Step 36984: {'lr': 7.957119448497099e-05, 'samples': 18936320, 'steps': 36984, 'batch_loss/train': 0.6931980508379638} 12/27/2021 20:02:36 - INFO - codeparrot_training - Step 36985: {'lr': 7.955966223685215e-05, 'samples': 18936832, 'steps': 36985, 'batch_loss/train': 0.6356227728538215} 12/27/2021 20:02:48 - INFO - codeparrot_training - Step 36986: {'lr': 7.954813066633962e-05, 'samples': 18937344, 'steps': 36986, 'batch_loss/train': 0.6468484147917479} 12/27/2021 20:02:59 - INFO - codeparrot_training - Step 36987: {'lr': 7.95365997734793e-05, 'samples': 18937856, 'steps': 36987, 'batch_loss/train': 0.710768093355} 12/27/2021 20:03:10 - INFO - codeparrot_training - Step 36988: {'lr': 7.952506955831726e-05, 'samples': 18938368, 'steps': 36988, 'batch_loss/train': 0.651355191366747} 12/27/2021 20:03:22 - INFO - codeparrot_training - Step 36989: {'lr': 7.951354002089909e-05, 'samples': 18938880, 'steps': 36989, 'batch_loss/train': 0.7166081755422056} 12/27/2021 20:03:32 - INFO - codeparrot_training - Step 36990: {'lr': 7.950201116127062e-05, 'samples': 18939392, 'steps': 36990, 'batch_loss/train': 0.7027322929352522} 12/27/2021 20:03:43 - INFO - codeparrot_training - Step 36991: {'lr': 7.949048297947794e-05, 'samples': 18939904, 'steps': 36991, 'batch_loss/train': 0.6686422643251717} 12/27/2021 20:03:57 - INFO - codeparrot_training - Step 36992: {'lr': 7.947895547556665e-05, 'samples': 18940416, 'steps': 36992, 'batch_loss/train': 0.7406961508095264} 12/27/2021 20:04:08 - INFO - codeparrot_training - Step 36993: {'lr': 7.946742864958264e-05, 'samples': 18940928, 'steps': 36993, 'batch_loss/train': 0.7228527488186955} 12/27/2021 20:04:18 - INFO - codeparrot_training - Step 36994: {'lr': 7.945590250157176e-05, 'samples': 18941440, 'steps': 36994, 'batch_loss/train': 0.6849326840601861} 12/27/2021 20:04:31 - INFO - codeparrot_training - Step 36995: {'lr': 
7.94443770315798e-05, 'samples': 18941952, 'steps': 36995, 'batch_loss/train': 0.7194827105849981} 12/27/2021 20:04:42 - INFO - codeparrot_training - Step 36996: {'lr': 7.943285223965261e-05, 'samples': 18942464, 'steps': 36996, 'batch_loss/train': 0.5826269168173894} 12/27/2021 20:04:52 - INFO - codeparrot_training - Step 36997: {'lr': 7.942132812583606e-05, 'samples': 18942976, 'steps': 36997, 'batch_loss/train': 0.6787893166765571} 12/27/2021 20:05:03 - INFO - codeparrot_training - Step 36998: {'lr': 7.940980469017572e-05, 'samples': 18943488, 'steps': 36998, 'batch_loss/train': 1.0738510312512517} 12/27/2021 20:05:17 - INFO - codeparrot_training - Step 36999: {'lr': 7.939828193271769e-05, 'samples': 18944000, 'steps': 36999, 'batch_loss/train': 0.627480850671418} 12/27/2021 20:05:28 - INFO - codeparrot_training - Step 37000: {'lr': 7.938675985350777e-05, 'samples': 18944512, 'steps': 37000, 'batch_loss/train': 0.6539727216586471} 12/27/2021 20:05:38 - INFO - codeparrot_training - Step 37001: {'lr': 7.937523845259149e-05, 'samples': 18945024, 'steps': 37001, 'batch_loss/train': 1.479330277070403} 12/27/2021 20:05:50 - INFO - codeparrot_training - Step 37002: {'lr': 7.936371773001491e-05, 'samples': 18945536, 'steps': 37002, 'batch_loss/train': 0.704230172559619} 12/27/2021 20:06:01 - INFO - codeparrot_training - Step 37003: {'lr': 7.935219768582386e-05, 'samples': 18946048, 'steps': 37003, 'batch_loss/train': 0.7005458567291498} 12/27/2021 20:06:11 - INFO - codeparrot_training - Step 37004: {'lr': 7.934067832006395e-05, 'samples': 18946560, 'steps': 37004, 'batch_loss/train': 0.7342205620370805} 12/27/2021 20:06:23 - INFO - codeparrot_training - Step 37005: {'lr': 7.932915963278106e-05, 'samples': 18947072, 'steps': 37005, 'batch_loss/train': 0.6071142158471048} 12/27/2021 20:06:34 - INFO - codeparrot_training - Step 37006: {'lr': 7.9317641624021e-05, 'samples': 18947584, 'steps': 37006, 'batch_loss/train': 0.8082610657438636} 12/27/2021 20:06:45 - INFO - 
codeparrot_training - Step 37007: {'lr': 7.930612429382952e-05, 'samples': 18948096, 'steps': 37007, 'batch_loss/train': 0.740515053505078} 12/27/2021 20:06:55 - INFO - codeparrot_training - Step 37008: {'lr': 7.929460764225246e-05, 'samples': 18948608, 'steps': 37008, 'batch_loss/train': 0.6495939987362362} 12/27/2021 20:07:08 - INFO - codeparrot_training - Step 37009: {'lr': 7.92830916693356e-05, 'samples': 18949120, 'steps': 37009, 'batch_loss/train': 0.7320220512337983} 12/27/2021 20:07:18 - INFO - codeparrot_training - Step 37010: {'lr': 7.927157637512469e-05, 'samples': 18949632, 'steps': 37010, 'batch_loss/train': 0.633225588768255} 12/27/2021 20:07:29 - INFO - codeparrot_training - Step 37011: {'lr': 7.92600617596656e-05, 'samples': 18950144, 'steps': 37011, 'batch_loss/train': 0.5157887704554014} 12/27/2021 20:07:43 - INFO - codeparrot_training - Step 37012: {'lr': 7.924854782300389e-05, 'samples': 18950656, 'steps': 37012, 'batch_loss/train': 0.7541056853951886} 12/27/2021 20:07:54 - INFO - codeparrot_training - Step 37013: {'lr': 7.923703456518555e-05, 'samples': 18951168, 'steps': 37013, 'batch_loss/train': 0.609820379409939} 12/27/2021 20:08:04 - INFO - codeparrot_training - Step 37014: {'lr': 7.922552198625637e-05, 'samples': 18951680, 'steps': 37014, 'batch_loss/train': 0.7075500111095607} 12/27/2021 20:08:17 - INFO - codeparrot_training - Step 37015: {'lr': 7.921401008626195e-05, 'samples': 18952192, 'steps': 37015, 'batch_loss/train': 0.709925789386034} 12/27/2021 20:08:27 - INFO - codeparrot_training - Step 37016: {'lr': 7.920249886524805e-05, 'samples': 18952704, 'steps': 37016, 'batch_loss/train': 0.5661620802420657} 12/27/2021 20:08:38 - INFO - codeparrot_training - Step 37017: {'lr': 7.919098832326072e-05, 'samples': 18953216, 'steps': 37017, 'batch_loss/train': 0.6554765431210399} 12/27/2021 20:08:50 - INFO - codeparrot_training - Step 37018: {'lr': 7.91794784603454e-05, 'samples': 18953728, 'steps': 37018, 'batch_loss/train': 
0.7463903483003378} 12/27/2021 20:09:01 - INFO - codeparrot_training - Step 37019: {'lr': 7.916796927654799e-05, 'samples': 18954240, 'steps': 37019, 'batch_loss/train': 0.6959665361791849} 12/27/2021 20:09:11 - INFO - codeparrot_training - Step 37020: {'lr': 7.915646077191423e-05, 'samples': 18954752, 'steps': 37020, 'batch_loss/train': 0.5783844361430965} 12/27/2021 20:09:22 - INFO - codeparrot_training - Step 37021: {'lr': 7.914495294648988e-05, 'samples': 18955264, 'steps': 37021, 'batch_loss/train': 0.5217448582989164} 12/27/2021 20:09:36 - INFO - codeparrot_training - Step 37022: {'lr': 7.913344580032069e-05, 'samples': 18955776, 'steps': 37022, 'batch_loss/train': 0.7163395187817514} 12/27/2021 20:09:47 - INFO - codeparrot_training - Step 37023: {'lr': 7.912193933345247e-05, 'samples': 18956288, 'steps': 37023, 'batch_loss/train': 0.6298843323020265} 12/27/2021 20:09:57 - INFO - codeparrot_training - Step 37024: {'lr': 7.911043354593076e-05, 'samples': 18956800, 'steps': 37024, 'batch_loss/train': 0.6484628207981586} 12/27/2021 20:10:09 - INFO - codeparrot_training - Step 37025: {'lr': 7.90989284378015e-05, 'samples': 18957312, 'steps': 37025, 'batch_loss/train': 0.7744661495089531} 12/27/2021 20:10:20 - INFO - codeparrot_training - Step 37026: {'lr': 7.908742400911045e-05, 'samples': 18957824, 'steps': 37026, 'batch_loss/train': 0.5963794980198145} 12/27/2021 20:10:31 - INFO - codeparrot_training - Step 37027: {'lr': 7.907592025990312e-05, 'samples': 18958336, 'steps': 37027, 'batch_loss/train': 0.7158235386013985} 12/27/2021 20:10:45 - INFO - codeparrot_training - Step 37028: {'lr': 7.906441719022544e-05, 'samples': 18958848, 'steps': 37028, 'batch_loss/train': 0.7314728097990155} 12/27/2021 20:10:56 - INFO - codeparrot_training - Step 37029: {'lr': 7.905291480012323e-05, 'samples': 18959360, 'steps': 37029, 'batch_loss/train': 0.6131107611581683} 12/27/2021 20:11:06 - INFO - codeparrot_training - Step 37030: {'lr': 7.904141308964197e-05, 'samples': 
18959872, 'steps': 37030, 'batch_loss/train': 0.42001955141313374} 12/27/2021 20:11:18 - INFO - codeparrot_training - Step 37031: {'lr': 7.90299120588275e-05, 'samples': 18960384, 'steps': 37031, 'batch_loss/train': 0.6821032399311662} 12/27/2021 20:11:29 - INFO - codeparrot_training - Step 37032: {'lr': 7.901841170772555e-05, 'samples': 18960896, 'steps': 37032, 'batch_loss/train': 0.8451212830841541} 12/27/2021 20:11:40 - INFO - codeparrot_training - Step 37033: {'lr': 7.900691203638182e-05, 'samples': 18961408, 'steps': 37033, 'batch_loss/train': 0.7219308312050998} 12/27/2021 20:11:50 - INFO - codeparrot_training - Step 37034: {'lr': 7.899541304484206e-05, 'samples': 18961920, 'steps': 37034, 'batch_loss/train': 0.7060315497219563} 12/27/2021 20:12:02 - INFO - codeparrot_training - Step 37035: {'lr': 7.898391473315198e-05, 'samples': 18962432, 'steps': 37035, 'batch_loss/train': 0.7061622757464647} 12/27/2021 20:12:13 - INFO - codeparrot_training - Step 37036: {'lr': 7.897241710135725e-05, 'samples': 18962944, 'steps': 37036, 'batch_loss/train': 0.6500020930543542} 12/27/2021 20:12:24 - INFO - codeparrot_training - Step 37037: {'lr': 7.896092014950362e-05, 'samples': 18963456, 'steps': 37037, 'batch_loss/train': 0.651239826955134} 12/27/2021 20:12:38 - INFO - codeparrot_training - Step 37038: {'lr': 7.894942387763681e-05, 'samples': 18963968, 'steps': 37038, 'batch_loss/train': 0.6949061566847377} 12/27/2021 20:12:48 - INFO - codeparrot_training - Step 37039: {'lr': 7.893792828580248e-05, 'samples': 18964480, 'steps': 37039, 'batch_loss/train': 0.6239367919042706} 12/27/2021 20:12:59 - INFO - codeparrot_training - Step 37040: {'lr': 7.892643337404646e-05, 'samples': 18964992, 'steps': 37040, 'batch_loss/train': 0.6906076762825251} 12/27/2021 20:13:11 - INFO - codeparrot_training - Step 37041: {'lr': 7.891493914241426e-05, 'samples': 18965504, 'steps': 37041, 'batch_loss/train': 0.6314979634189513} 12/27/2021 20:13:21 - INFO - codeparrot_training - Step 37042: 
{'lr': 7.890344559095159e-05, 'samples': 18966016, 'steps': 37042, 'batch_loss/train': 0.6236508642323315} 12/27/2021 20:13:32 - INFO - codeparrot_training - Step 37043: {'lr': 7.889195271970434e-05, 'samples': 18966528, 'steps': 37043, 'batch_loss/train': 0.6960946036269888} 12/27/2021 20:13:44 - INFO - codeparrot_training - Step 37044: {'lr': 7.888046052871803e-05, 'samples': 18967040, 'steps': 37044, 'batch_loss/train': 0.7321414644829929} 12/27/2021 20:13:55 - INFO - codeparrot_training - Step 37045: {'lr': 7.88689690180384e-05, 'samples': 18967552, 'steps': 37045, 'batch_loss/train': 0.757238588295877} 12/27/2021 20:14:05 - INFO - codeparrot_training - Step 37046: {'lr': 7.88574781877111e-05, 'samples': 18968064, 'steps': 37046, 'batch_loss/train': 0.7684316355735064} 12/27/2021 20:14:16 - INFO - codeparrot_training - Step 37047: {'lr': 7.88459880377819e-05, 'samples': 18968576, 'steps': 37047, 'batch_loss/train': 0.7432300979271531} 12/27/2021 20:14:28 - INFO - codeparrot_training - Step 37048: {'lr': 7.883449856829639e-05, 'samples': 18969088, 'steps': 37048, 'batch_loss/train': 0.8266488444060087} 12/27/2021 20:14:39 - INFO - codeparrot_training - Step 37049: {'lr': 7.88230097793003e-05, 'samples': 18969600, 'steps': 37049, 'batch_loss/train': 0.72660066280514} 12/27/2021 20:14:49 - INFO - codeparrot_training - Step 37050: {'lr': 7.881152167083927e-05, 'samples': 18970112, 'steps': 37050, 'batch_loss/train': 0.5490976296423469} 12/27/2021 20:15:03 - INFO - codeparrot_training - Step 37051: {'lr': 7.880003424295898e-05, 'samples': 18970624, 'steps': 37051, 'batch_loss/train': 0.6365469181910157} 12/27/2021 20:15:14 - INFO - codeparrot_training - Step 37052: {'lr': 7.878854749570522e-05, 'samples': 18971136, 'steps': 37052, 'batch_loss/train': 0.6976985698565841} 12/27/2021 20:15:25 - INFO - codeparrot_training - Step 37053: {'lr': 7.877706142912339e-05, 'samples': 18971648, 'steps': 37053, 'batch_loss/train': 0.7622975604608655} 12/27/2021 20:15:37 - INFO - 
codeparrot_training - Step 37054: {'lr': 7.876557604325938e-05, 'samples': 18972160, 'steps': 37054, 'batch_loss/train': 0.6561397722107358} 12/27/2021 20:15:48 - INFO - codeparrot_training - Step 37055: {'lr': 7.875409133815886e-05, 'samples': 18972672, 'steps': 37055, 'batch_loss/train': 0.6226627653813921} 12/27/2021 20:15:58 - INFO - codeparrot_training - Step 37056: {'lr': 7.874260731386734e-05, 'samples': 18973184, 'steps': 37056, 'batch_loss/train': 0.6685907741775736} 12/27/2021 20:16:09 - INFO - codeparrot_training - Step 37057: {'lr': 7.873112397043054e-05, 'samples': 18973696, 'steps': 37057, 'batch_loss/train': 1.214928723871708} 12/27/2021 20:16:23 - INFO - codeparrot_training - Step 37058: {'lr': 7.871964130789413e-05, 'samples': 18974208, 'steps': 37058, 'batch_loss/train': 0.6726039266213775} 12/27/2021 20:16:34 - INFO - codeparrot_training - Step 37059: {'lr': 7.870815932630376e-05, 'samples': 18974720, 'steps': 37059, 'batch_loss/train': 0.604571735515492} 12/27/2021 20:16:44 - INFO - codeparrot_training - Step 37060: {'lr': 7.869667802570507e-05, 'samples': 18975232, 'steps': 37060, 'batch_loss/train': 0.6793697718530893} 12/27/2021 20:16:56 - INFO - codeparrot_training - Step 37061: {'lr': 7.868519740614369e-05, 'samples': 18975744, 'steps': 37061, 'batch_loss/train': 0.6931391038233414} 12/27/2021 20:17:07 - INFO - codeparrot_training - Step 37062: {'lr': 7.86737174676653e-05, 'samples': 18976256, 'steps': 37062, 'batch_loss/train': 0.6355009754188359} 12/27/2021 20:17:18 - INFO - codeparrot_training - Step 37063: {'lr': 7.866223821031551e-05, 'samples': 18976768, 'steps': 37063, 'batch_loss/train': 0.586891514598392} 12/27/2021 20:17:30 - INFO - codeparrot_training - Step 37064: {'lr': 7.865075963413995e-05, 'samples': 18977280, 'steps': 37064, 'batch_loss/train': 0.6667891554534435} 12/27/2021 20:17:41 - INFO - codeparrot_training - Step 37065: {'lr': 7.86392817391843e-05, 'samples': 18977792, 'steps': 37065, 'batch_loss/train': 
0.5898059492465109} 12/27/2021 20:17:51 - INFO - codeparrot_training - Step 37066: {'lr': 7.862780452549423e-05, 'samples': 18978304, 'steps': 37066, 'batch_loss/train': 0.760198718693573} 12/27/2021 20:18:02 - INFO - codeparrot_training - Step 37067: {'lr': 7.861632799311522e-05, 'samples': 18978816, 'steps': 37067, 'batch_loss/train': 0.6673087021335959} 12/27/2021 20:18:16 - INFO - codeparrot_training - Step 37068: {'lr': 7.86048521420929e-05, 'samples': 18979328, 'steps': 37068, 'batch_loss/train': 0.5965315259527415} 12/27/2021 20:18:26 - INFO - codeparrot_training - Step 37069: {'lr': 7.859337697247312e-05, 'samples': 18979840, 'steps': 37069, 'batch_loss/train': 0.661657276796177} 12/27/2021 20:18:37 - INFO - codeparrot_training - Step 37070: {'lr': 7.858190248430128e-05, 'samples': 18980352, 'steps': 37070, 'batch_loss/train': 0.7607280109077692} 12/27/2021 20:18:49 - INFO - codeparrot_training - Step 37071: {'lr': 7.857042867762306e-05, 'samples': 18980864, 'steps': 37071, 'batch_loss/train': 0.6108663426421117} 12/27/2021 20:19:00 - INFO - codeparrot_training - Step 37072: {'lr': 7.855895555248413e-05, 'samples': 18981376, 'steps': 37072, 'batch_loss/train': 0.7256831098347902} 12/27/2021 20:19:10 - INFO - codeparrot_training - Step 37073: {'lr': 7.854748310893001e-05, 'samples': 18981888, 'steps': 37073, 'batch_loss/train': 0.739351793192327} 12/27/2021 20:19:22 - INFO - codeparrot_training - Step 37074: {'lr': 7.853601134700639e-05, 'samples': 18982400, 'steps': 37074, 'batch_loss/train': 0.7217006245627999} 12/27/2021 20:19:33 - INFO - codeparrot_training - Step 37075: {'lr': 7.852454026675882e-05, 'samples': 18982912, 'steps': 37075, 'batch_loss/train': 0.7578375078737736} 12/27/2021 20:19:43 - INFO - codeparrot_training - Step 37076: {'lr': 7.851306986823296e-05, 'samples': 18983424, 'steps': 37076, 'batch_loss/train': 0.802395498380065} 12/27/2021 20:19:58 - INFO - codeparrot_training - Step 37077: {'lr': 7.850160015147436e-05, 'samples': 18983936, 
'steps': 37077, 'batch_loss/train': 0.755342670949176} 12/27/2021 20:20:08 - INFO - codeparrot_training - Step 37078: {'lr': 7.849013111652873e-05, 'samples': 18984448, 'steps': 37078, 'batch_loss/train': 0.5470927022397518} 12/27/2021 20:20:19 - INFO - codeparrot_training - Step 37079: {'lr': 7.847866276344142e-05, 'samples': 18984960, 'steps': 37079, 'batch_loss/train': 0.752311646938324} 12/27/2021 20:20:29 - INFO - codeparrot_training - Step 37080: {'lr': 7.846719509225827e-05, 'samples': 18985472, 'steps': 37080, 'batch_loss/train': 1.5349223036319017} 12/27/2021 20:20:41 - INFO - codeparrot_training - Step 37081: {'lr': 7.845572810302485e-05, 'samples': 18985984, 'steps': 37081, 'batch_loss/train': 0.7588063781149685} 12/27/2021 20:20:52 - INFO - codeparrot_training - Step 37082: {'lr': 7.844426179578662e-05, 'samples': 18986496, 'steps': 37082, 'batch_loss/train': 0.524880520417355} 12/27/2021 20:21:03 - INFO - codeparrot_training - Step 37083: {'lr': 7.843279617058913e-05, 'samples': 18987008, 'steps': 37083, 'batch_loss/train': 0.7225110568106174} 12/27/2021 20:21:15 - INFO - codeparrot_training - Step 37084: {'lr': 7.842133122747824e-05, 'samples': 18987520, 'steps': 37084, 'batch_loss/train': 0.6874410086602438} 12/27/2021 20:21:26 - INFO - codeparrot_training - Step 37085: {'lr': 7.840986696649924e-05, 'samples': 18988032, 'steps': 37085, 'batch_loss/train': 0.7083207741379738} 12/27/2021 20:21:36 - INFO - codeparrot_training - Step 37086: {'lr': 7.839840338769786e-05, 'samples': 18988544, 'steps': 37086, 'batch_loss/train': 0.7092016155365855} 12/27/2021 20:21:50 - INFO - codeparrot_training - Step 37087: {'lr': 7.838694049111961e-05, 'samples': 18989056, 'steps': 37087, 'batch_loss/train': 0.6420690991799347} 12/27/2021 20:22:01 - INFO - codeparrot_training - Step 37088: {'lr': 7.837547827681008e-05, 'samples': 18989568, 'steps': 37088, 'batch_loss/train': 0.5078100647951942} 12/27/2021 20:22:12 - INFO - codeparrot_training - Step 37089: {'lr': 
7.836401674481486e-05, 'samples': 18990080, 'steps': 37089, 'batch_loss/train': 0.7268566316924989} 12/27/2021 20:22:24 - INFO - codeparrot_training - Step 37090: {'lr': 7.835255589517951e-05, 'samples': 18990592, 'steps': 37090, 'batch_loss/train': 0.684853951446712} 12/27/2021 20:22:34 - INFO - codeparrot_training - Step 37091: {'lr': 7.834109572794956e-05, 'samples': 18991104, 'steps': 37091, 'batch_loss/train': 0.663114046677947} 12/27/2021 20:22:45 - INFO - codeparrot_training - Step 37092: {'lr': 7.832963624317069e-05, 'samples': 18991616, 'steps': 37092, 'batch_loss/train': 0.6775567279255483} 12/27/2021 20:22:56 - INFO - codeparrot_training - Step 37093: {'lr': 7.83181774408883e-05, 'samples': 18992128, 'steps': 37093, 'batch_loss/train': 0.7252017022110522} 12/27/2021 20:23:08 - INFO - codeparrot_training - Step 37094: {'lr': 7.830671932114791e-05, 'samples': 18992640, 'steps': 37094, 'batch_loss/train': 0.8253285665996373} 12/27/2021 20:23:19 - INFO - codeparrot_training - Step 37095: {'lr': 7.829526188399533e-05, 'samples': 18993152, 'steps': 37095, 'batch_loss/train': 0.6504606013186276} 12/27/2021 20:23:30 - INFO - codeparrot_training - Step 37096: {'lr': 7.828380512947589e-05, 'samples': 18993664, 'steps': 37096, 'batch_loss/train': 0.6125921640195884} 12/27/2021 20:23:43 - INFO - codeparrot_training - Step 37097: {'lr': 7.827234905763523e-05, 'samples': 18994176, 'steps': 37097, 'batch_loss/train': 0.7477669408544898} 12/27/2021 20:23:54 - INFO - codeparrot_training - Step 37098: {'lr': 7.826089366851882e-05, 'samples': 18994688, 'steps': 37098, 'batch_loss/train': 0.777917611412704} 12/27/2021 20:24:05 - INFO - codeparrot_training - Step 37099: {'lr': 7.824943896217226e-05, 'samples': 18995200, 'steps': 37099, 'batch_loss/train': 0.7008698618737981} 12/27/2021 20:24:17 - INFO - codeparrot_training - Step 37100: {'lr': 7.823798493864112e-05, 'samples': 18995712, 'steps': 37100, 'batch_loss/train': 0.748441613279283} 12/27/2021 20:24:27 - INFO - 
codeparrot_training - Step 37101: {'lr': 7.822653159797086e-05, 'samples': 18996224, 'steps': 37101, 'batch_loss/train': 0.6704218192026019} 12/27/2021 20:24:38 - INFO - codeparrot_training - Step 37102: {'lr': 7.821507894020704e-05, 'samples': 18996736, 'steps': 37102, 'batch_loss/train': 0.6524304263584781} 12/27/2021 20:24:49 - INFO - codeparrot_training - Step 37103: {'lr': 7.820362696539523e-05, 'samples': 18997248, 'steps': 37103, 'batch_loss/train': 0.5927366639953107} 12/27/2021 20:25:01 - INFO - codeparrot_training - Step 37104: {'lr': 7.819217567358103e-05, 'samples': 18997760, 'steps': 37104, 'batch_loss/train': 0.640351618611021} 12/27/2021 20:25:12 - INFO - codeparrot_training - Step 37105: {'lr': 7.818072506480967e-05, 'samples': 18998272, 'steps': 37105, 'batch_loss/train': 0.7543588611297309} 12/27/2021 20:25:22 - INFO - codeparrot_training - Step 37106: {'lr': 7.816927513912697e-05, 'samples': 18998784, 'steps': 37106, 'batch_loss/train': 0.6125167226418853} 12/27/2021 20:25:36 - INFO - codeparrot_training - Step 37107: {'lr': 7.815782589657841e-05, 'samples': 18999296, 'steps': 37107, 'batch_loss/train': 0.6299641591031104} 12/27/2021 20:25:47 - INFO - codeparrot_training - Step 37108: {'lr': 7.814637733720939e-05, 'samples': 18999808, 'steps': 37108, 'batch_loss/train': 0.5931424112059176} 12/27/2021 20:25:58 - INFO - codeparrot_training - Step 37109: {'lr': 7.81349294610654e-05, 'samples': 19000320, 'steps': 37109, 'batch_loss/train': 0.7428894764743745} 12/27/2021 20:26:10 - INFO - codeparrot_training - Step 37110: {'lr': 7.812348226819219e-05, 'samples': 19000832, 'steps': 37110, 'batch_loss/train': 0.6462305658496916} 12/27/2021 20:26:21 - INFO - codeparrot_training - Step 37111: {'lr': 7.811203575863504e-05, 'samples': 19001344, 'steps': 37111, 'batch_loss/train': 0.7002668051281944} 12/27/2021 20:26:31 - INFO - codeparrot_training - Step 37112: {'lr': 7.810058993243954e-05, 'samples': 19001856, 'steps': 37112, 'batch_loss/train': 
1.525072769029066} 12/27/2021 20:26:42 - INFO - codeparrot_training - Step 37113: {'lr': 7.808914478965118e-05, 'samples': 19002368, 'steps': 37113, 'batch_loss/train': 0.6133889963384718} 12/27/2021 20:26:56 - INFO - codeparrot_training - Step 37114: {'lr': 7.807770033031547e-05, 'samples': 19002880, 'steps': 37114, 'batch_loss/train': 0.7355489311739802} 12/27/2021 20:27:06 - INFO - codeparrot_training - Step 37115: {'lr': 7.80662565544779e-05, 'samples': 19003392, 'steps': 37115, 'batch_loss/train': 0.5208797052619047} 12/27/2021 20:27:17 - INFO - codeparrot_training - Step 37116: {'lr': 7.805481346218399e-05, 'samples': 19003904, 'steps': 37116, 'batch_loss/train': 0.6889157434925437} 12/27/2021 20:27:29 - INFO - codeparrot_training - Step 37117: {'lr': 7.80433710534792e-05, 'samples': 19004416, 'steps': 37117, 'batch_loss/train': 0.6520233880728483} 12/27/2021 20:27:40 - INFO - codeparrot_training - Step 37118: {'lr': 7.803192932840905e-05, 'samples': 19004928, 'steps': 37118, 'batch_loss/train': 0.6770029026083648} 12/27/2021 20:27:50 - INFO - codeparrot_training - Step 37119: {'lr': 7.802048828701911e-05, 'samples': 19005440, 'steps': 37119, 'batch_loss/train': 0.7304775877855718} 12/27/2021 20:28:02 - INFO - codeparrot_training - Step 37120: {'lr': 7.80090479293546e-05, 'samples': 19005952, 'steps': 37120, 'batch_loss/train': 0.6807651554991025} 12/27/2021 20:28:13 - INFO - codeparrot_training - Step 37121: {'lr': 7.799760825546132e-05, 'samples': 19006464, 'steps': 37121, 'batch_loss/train': 0.6441331380046904} 12/27/2021 20:28:24 - INFO - codeparrot_training - Step 37122: {'lr': 7.798616926538454e-05, 'samples': 19006976, 'steps': 37122, 'batch_loss/train': 0.6825307449325919} 12/27/2021 20:28:36 - INFO - codeparrot_training - Step 37123: {'lr': 7.797473095916971e-05, 'samples': 19007488, 'steps': 37123, 'batch_loss/train': 0.7010798221454024} 12/27/2021 20:28:47 - INFO - codeparrot_training - Step 37124: {'lr': 7.796329333686256e-05, 'samples': 19008000, 
'steps': 37124, 'batch_loss/train': 0.6883766669780016} 12/27/2021 20:28:57 - INFO - codeparrot_training - Step 37125: {'lr': 7.795185639850833e-05, 'samples': 19008512, 'steps': 37125, 'batch_loss/train': 0.6505034575238824} 12/27/2021 20:29:08 - INFO - codeparrot_training - Step 37126: {'lr': 7.794042014415253e-05, 'samples': 19009024, 'steps': 37126, 'batch_loss/train': 0.6465839934535325} 12/27/2021 20:29:22 - INFO - codeparrot_training - Step 37127: {'lr': 7.792898457384067e-05, 'samples': 19009536, 'steps': 37127, 'batch_loss/train': 0.7561678290367126} 12/27/2021 20:29:33 - INFO - codeparrot_training - Step 37128: {'lr': 7.791754968761818e-05, 'samples': 19010048, 'steps': 37128, 'batch_loss/train': 0.7114947268273681} 12/27/2021 20:29:43 - INFO - codeparrot_training - Step 37129: {'lr': 7.790611548553056e-05, 'samples': 19010560, 'steps': 37129, 'batch_loss/train': 0.8015829268842936} 12/27/2021 20:29:55 - INFO - codeparrot_training - Step 37130: {'lr': 7.789468196762333e-05, 'samples': 19011072, 'steps': 37130, 'batch_loss/train': 0.5057271269906778} 12/27/2021 20:30:06 - INFO - codeparrot_training - Step 37131: {'lr': 7.788324913394168e-05, 'samples': 19011584, 'steps': 37131, 'batch_loss/train': 0.7401209737872705} 12/27/2021 20:30:17 - INFO - codeparrot_training - Step 37132: {'lr': 7.787181698453132e-05, 'samples': 19012096, 'steps': 37132, 'batch_loss/train': 0.6843256382271647} 12/27/2021 20:30:29 - INFO - codeparrot_training - Step 37133: {'lr': 7.786038551943772e-05, 'samples': 19012608, 'steps': 37133, 'batch_loss/train': 0.6737291980534792} 12/27/2021 20:30:39 - INFO - codeparrot_training - Step 37134: {'lr': 7.784895473870607e-05, 'samples': 19013120, 'steps': 37134, 'batch_loss/train': 0.7295979540795088} 12/27/2021 20:30:50 - INFO - codeparrot_training - Step 37135: {'lr': 7.783752464238204e-05, 'samples': 19013632, 'steps': 37135, 'batch_loss/train': 0.6954132718965411} 12/27/2021 20:31:01 - INFO - codeparrot_training - Step 37136: {'lr': 
7.78260952305111e-05, 'samples': 19014144, 'steps': 37136, 'batch_loss/train': 0.6492944332130719} 12/27/2021 20:31:15 - INFO - codeparrot_training - Step 37137: {'lr': 7.781466650313848e-05, 'samples': 19014656, 'steps': 37137, 'batch_loss/train': 0.6850067503983155} 12/27/2021 20:31:26 - INFO - codeparrot_training - Step 37138: {'lr': 7.780323846030977e-05, 'samples': 19015168, 'steps': 37138, 'batch_loss/train': 0.7367135342210531} 12/27/2021 20:31:36 - INFO - codeparrot_training - Step 37139: {'lr': 7.779181110207034e-05, 'samples': 19015680, 'steps': 37139, 'batch_loss/train': 0.6412665290990844} 12/27/2021 20:31:49 - INFO - codeparrot_training - Step 37140: {'lr': 7.778038442846567e-05, 'samples': 19016192, 'steps': 37140, 'batch_loss/train': 0.6775917578488588} 12/27/2021 20:31:59 - INFO - codeparrot_training - Step 37141: {'lr': 7.776895843954116e-05, 'samples': 19016704, 'steps': 37141, 'batch_loss/train': 0.6513148839585483} 12/27/2021 20:32:10 - INFO - codeparrot_training - Step 37142: {'lr': 7.77575331353422e-05, 'samples': 19017216, 'steps': 37142, 'batch_loss/train': 0.7022278963704593} 12/27/2021 20:32:22 - INFO - codeparrot_training - Step 37143: {'lr': 7.77461085159143e-05, 'samples': 19017728, 'steps': 37143, 'batch_loss/train': 0.664938292466104} 12/27/2021 20:32:33 - INFO - codeparrot_training - Step 37144: {'lr': 7.773468458130282e-05, 'samples': 19018240, 'steps': 37144, 'batch_loss/train': 0.7193961441516876} 12/27/2021 20:32:43 - INFO - codeparrot_training - Step 37145: {'lr': 7.772326133155324e-05, 'samples': 19018752, 'steps': 37145, 'batch_loss/train': 0.639509656932205} 12/27/2021 20:32:54 - INFO - codeparrot_training - Step 37146: {'lr': 7.771183876671082e-05, 'samples': 19019264, 'steps': 37146, 'batch_loss/train': 0.5028901746554766} 12/27/2021 20:33:08 - INFO - codeparrot_training - Step 37147: {'lr': 7.77004168868212e-05, 'samples': 19019776, 'steps': 37147, 'batch_loss/train': 0.7468141413410194} 12/27/2021 20:33:18 - INFO - 
codeparrot_training - Step 37148: {'lr': 7.768899569192959e-05, 'samples': 19020288, 'steps': 37148, 'batch_loss/train': 0.7925293203443289} 12/27/2021 20:33:29 - INFO - codeparrot_training - Step 37149: {'lr': 7.767757518208138e-05, 'samples': 19020800, 'steps': 37149, 'batch_loss/train': 0.7244021687656641} 12/27/2021 20:33:41 - INFO - codeparrot_training - Step 37150: {'lr': 7.766615535732222e-05, 'samples': 19021312, 'steps': 37150, 'batch_loss/train': 0.678189710713923} 12/27/2021 20:33:52 - INFO - codeparrot_training - Step 37151: {'lr': 7.76547362176973e-05, 'samples': 19021824, 'steps': 37151, 'batch_loss/train': 0.6578286015428603} 12/27/2021 20:34:02 - INFO - codeparrot_training - Step 37152: {'lr': 7.764331776325206e-05, 'samples': 19022336, 'steps': 37152, 'batch_loss/train': 0.7509918510913849} 12/27/2021 20:34:16 - INFO - codeparrot_training - Step 37153: {'lr': 7.763189999403191e-05, 'samples': 19022848, 'steps': 37153, 'batch_loss/train': 0.7004409530200064} 12/27/2021 20:34:27 - INFO - codeparrot_training - Step 37154: {'lr': 7.762048291008225e-05, 'samples': 19023360, 'steps': 37154, 'batch_loss/train': 0.6497192992828786} 12/27/2021 20:34:38 - INFO - codeparrot_training - Step 37155: {'lr': 7.760906651144844e-05, 'samples': 19023872, 'steps': 37155, 'batch_loss/train': 0.7204491193406284} 12/27/2021 20:34:50 - INFO - codeparrot_training - Step 37156: {'lr': 7.759765079817599e-05, 'samples': 19024384, 'steps': 37156, 'batch_loss/train': 0.7216547769494355} 12/27/2021 20:35:01 - INFO - codeparrot_training - Step 37157: {'lr': 7.758623577031001e-05, 'samples': 19024896, 'steps': 37157, 'batch_loss/train': 0.6888936059549451} 12/27/2021 20:35:11 - INFO - codeparrot_training - Step 37158: {'lr': 7.757482142789615e-05, 'samples': 19025408, 'steps': 37158, 'batch_loss/train': 0.5386753172206227} 12/27/2021 20:35:22 - INFO - codeparrot_training - Step 37159: {'lr': 7.756340777097979e-05, 'samples': 19025920, 'steps': 37159, 'batch_loss/train': 
0.675590789411217} 12/27/2021 20:35:34 - INFO - codeparrot_training - Step 37160: {'lr': 7.7551994799606e-05, 'samples': 19026432, 'steps': 37160, 'batch_loss/train': 0.6853565205819905} 12/27/2021 20:35:45 - INFO - codeparrot_training - Step 37161: {'lr': 7.754058251382048e-05, 'samples': 19026944, 'steps': 37161, 'batch_loss/train': 0.7919809052255005} 12/27/2021 20:35:55 - INFO - codeparrot_training - Step 37162: {'lr': 7.752917091366856e-05, 'samples': 19027456, 'steps': 37162, 'batch_loss/train': 0.6019865337293595} 12/27/2021 20:36:09 - INFO - codeparrot_training - Step 37163: {'lr': 7.751775999919546e-05, 'samples': 19027968, 'steps': 37163, 'batch_loss/train': 0.7461248030886054} 12/27/2021 20:36:20 - INFO - codeparrot_training - Step 37164: {'lr': 7.750634977044655e-05, 'samples': 19028480, 'steps': 37164, 'batch_loss/train': 0.837866548448801} 12/27/2021 20:36:31 - INFO - codeparrot_training - Step 37165: {'lr': 7.749494022746739e-05, 'samples': 19028992, 'steps': 37165, 'batch_loss/train': 1.6970522031188011} 12/27/2021 20:36:43 - INFO - codeparrot_training - Step 37166: {'lr': 7.748353137030318e-05, 'samples': 19029504, 'steps': 37166, 'batch_loss/train': 0.8452182542532682} 12/27/2021 20:36:53 - INFO - codeparrot_training - Step 37167: {'lr': 7.747212319899927e-05, 'samples': 19030016, 'steps': 37167, 'batch_loss/train': 0.8872019378468394} 12/27/2021 20:37:04 - INFO - codeparrot_training - Step 37168: {'lr': 7.746071571360105e-05, 'samples': 19030528, 'steps': 37168, 'batch_loss/train': 0.717394043225795} 12/27/2021 20:37:15 - INFO - codeparrot_training - Step 37169: {'lr': 7.744930891415388e-05, 'samples': 19031040, 'steps': 37169, 'batch_loss/train': 0.6265349168679677} 12/27/2021 20:37:27 - INFO - codeparrot_training - Step 37170: {'lr': 7.743790280070312e-05, 'samples': 19031552, 'steps': 37170, 'batch_loss/train': 0.4811645080335438} 12/27/2021 20:37:38 - INFO - codeparrot_training - Step 37171: {'lr': 7.742649737329411e-05, 'samples': 19032064, 
'steps': 37171, 'batch_loss/train': 0.7478670068085194} 12/27/2021 20:37:48 - INFO - codeparrot_training - Step 37172: {'lr': 7.741509263197216e-05, 'samples': 19032576, 'steps': 37172, 'batch_loss/train': 0.6995479045435786} 12/27/2021 20:38:02 - INFO - codeparrot_training - Step 37173: {'lr': 7.740368857678265e-05, 'samples': 19033088, 'steps': 37173, 'batch_loss/train': 0.6126764176297002} 12/27/2021 20:38:13 - INFO - codeparrot_training - Step 37174: {'lr': 7.739228520777097e-05, 'samples': 19033600, 'steps': 37174, 'batch_loss/train': 0.7558821244165301} 12/27/2021 20:38:23 - INFO - codeparrot_training - Step 37175: {'lr': 7.738088252498223e-05, 'samples': 19034112, 'steps': 37175, 'batch_loss/train': 0.759950203821063} 12/27/2021 20:38:35 - INFO - codeparrot_training - Step 37176: {'lr': 7.736948052846212e-05, 'samples': 19034624, 'steps': 37176, 'batch_loss/train': 0.581721066031605} 12/27/2021 20:38:46 - INFO - codeparrot_training - Step 37177: {'lr': 7.735807921825567e-05, 'samples': 19035136, 'steps': 37177, 'batch_loss/train': 0.7138878230471164} 12/27/2021 20:38:57 - INFO - codeparrot_training - Step 37178: {'lr': 7.73466785944083e-05, 'samples': 19035648, 'steps': 37178, 'batch_loss/train': 0.6458631101995707} 12/27/2021 20:39:09 - INFO - codeparrot_training - Step 37179: {'lr': 7.733527865696536e-05, 'samples': 19036160, 'steps': 37179, 'batch_loss/train': 0.6842965371906757} 12/27/2021 20:39:19 - INFO - codeparrot_training - Step 37180: {'lr': 7.732387940597213e-05, 'samples': 19036672, 'steps': 37180, 'batch_loss/train': 0.6971385646611452} 12/27/2021 20:39:30 - INFO - codeparrot_training - Step 37181: {'lr': 7.731248084147399e-05, 'samples': 19037184, 'steps': 37181, 'batch_loss/train': 0.6143935285508633} 12/27/2021 20:39:41 - INFO - codeparrot_training - Step 37182: {'lr': 7.730108296351626e-05, 'samples': 19037696, 'steps': 37182, 'batch_loss/train': 0.6674615101655945} 12/27/2021 20:39:54 - INFO - codeparrot_training - Step 37183: {'lr': 
7.728968577214409e-05, 'samples': 19038208, 'steps': 37183, 'batch_loss/train': 0.6555862459354103} 12/27/2021 20:40:05 - INFO - codeparrot_training - Step 37184: {'lr': 7.727828926740299e-05, 'samples': 19038720, 'steps': 37184, 'batch_loss/train': 0.6686377585865557} 12/27/2021 20:40:16 - INFO - codeparrot_training - Step 37185: {'lr': 7.726689344933826e-05, 'samples': 19039232, 'steps': 37185, 'batch_loss/train': 0.6460449644364417} 12/27/2021 20:40:28 - INFO - codeparrot_training - Step 37186: {'lr': 7.7255498317995e-05, 'samples': 19039744, 'steps': 37186, 'batch_loss/train': 0.7023064256645739} 12/27/2021 20:40:39 - INFO - codeparrot_training - Step 37187: {'lr': 7.724410387341873e-05, 'samples': 19040256, 'steps': 37187, 'batch_loss/train': 0.6551991655142047} 12/27/2021 20:40:49 - INFO - codeparrot_training - Step 37188: {'lr': 7.723271011565475e-05, 'samples': 19040768, 'steps': 37188, 'batch_loss/train': 0.7479165446711704} 12/27/2021 20:41:01 - INFO - codeparrot_training - Step 37189: {'lr': 7.722131704474819e-05, 'samples': 19041280, 'steps': 37189, 'batch_loss/train': 0.6245931689627469} 12/27/2021 20:41:12 - INFO - codeparrot_training - Step 37190: {'lr': 7.720992466074436e-05, 'samples': 19041792, 'steps': 37190, 'batch_loss/train': 0.6800803490914404} 12/27/2021 20:41:23 - INFO - codeparrot_training - Step 37191: {'lr': 7.719853296368878e-05, 'samples': 19042304, 'steps': 37191, 'batch_loss/train': 0.647551755188033} 12/27/2021 20:41:36 - INFO - codeparrot_training - Step 37192: {'lr': 7.718714195362652e-05, 'samples': 19042816, 'steps': 37192, 'batch_loss/train': 0.5800726141314954} 12/27/2021 20:41:47 - INFO - codeparrot_training - Step 37193: {'lr': 7.717575163060295e-05, 'samples': 19043328, 'steps': 37193, 'batch_loss/train': 0.7095196074806154} 12/27/2021 20:41:58 - INFO - codeparrot_training - Step 37194: {'lr': 7.716436199466331e-05, 'samples': 19043840, 'steps': 37194, 'batch_loss/train': 0.7294614565907978} 12/27/2021 20:42:08 - INFO - 
codeparrot_training - Step 37195: {'lr': 7.715297304585289e-05, 'samples': 19044352, 'steps': 37195, 'batch_loss/train': 0.7104923753067851} 12/27/2021 20:42:20 - INFO - codeparrot_training - Step 37196: {'lr': 7.714158478421701e-05, 'samples': 19044864, 'steps': 37196, 'batch_loss/train': 0.694102861918509} 12/27/2021 20:42:31 - INFO - codeparrot_training - Step 37197: {'lr': 7.713019720980091e-05, 'samples': 19045376, 'steps': 37197, 'batch_loss/train': 0.6771412249654531} 12/27/2021 20:42:42 - INFO - codeparrot_training - Step 37198: {'lr': 7.711881032264988e-05, 'samples': 19045888, 'steps': 37198, 'batch_loss/train': 0.6470692558214068} 12/27/2021 20:42:55 - INFO - codeparrot_training - Step 37199: {'lr': 7.710742412280918e-05, 'samples': 19046400, 'steps': 37199, 'batch_loss/train': 0.7574764173477888} 12/27/2021 20:43:06 - INFO - codeparrot_training - Step 37200: {'lr': 7.709603861032416e-05, 'samples': 19046912, 'steps': 37200, 'batch_loss/train': 0.5935252674389631} 12/27/2021 20:43:17 - INFO - codeparrot_training - Step 37201: {'lr': 7.708465378523985e-05, 'samples': 19047424, 'steps': 37201, 'batch_loss/train': 0.7231197152286768} 12/27/2021 20:43:29 - INFO - codeparrot_training - Step 37202: {'lr': 7.707326964760181e-05, 'samples': 19047936, 'steps': 37202, 'batch_loss/train': 0.8800827236846089} 12/27/2021 20:43:39 - INFO - codeparrot_training - Step 37203: {'lr': 7.706188619745507e-05, 'samples': 19048448, 'steps': 37203, 'batch_loss/train': 0.6692124041728675} 12/27/2021 20:43:50 - INFO - codeparrot_training - Step 37204: {'lr': 7.705050343484502e-05, 'samples': 19048960, 'steps': 37204, 'batch_loss/train': 0.7665902402950451} 12/27/2021 20:44:01 - INFO - codeparrot_training - Step 37205: {'lr': 7.70391213598168e-05, 'samples': 19049472, 'steps': 37205, 'batch_loss/train': 0.6348661037336569} 12/27/2021 20:44:13 - INFO - codeparrot_training - Step 37206: {'lr': 7.702773997241577e-05, 'samples': 19049984, 'steps': 37206, 'batch_loss/train': 
0.7476084805093706} 12/27/2021 20:44:23 - INFO - codeparrot_training - Step 37207: {'lr': 7.701635927268713e-05, 'samples': 19050496, 'steps': 37207, 'batch_loss/train': 0.7632838813588023} 12/27/2021 20:44:34 - INFO - codeparrot_training - Step 37208: {'lr': 7.700497926067613e-05, 'samples': 19051008, 'steps': 37208, 'batch_loss/train': 0.6998580822255462} 12/27/2021 20:44:46 - INFO - codeparrot_training - Step 37209: {'lr': 7.699359993642798e-05, 'samples': 19051520, 'steps': 37209, 'batch_loss/train': 0.730762536637485} 12/27/2021 20:44:57 - INFO - codeparrot_training - Step 37210: {'lr': 7.6982221299988e-05, 'samples': 19052032, 'steps': 37210, 'batch_loss/train': 0.7024560659192502} 12/27/2021 20:45:07 - INFO - codeparrot_training - Step 37211: {'lr': 7.697084335140143e-05, 'samples': 19052544, 'steps': 37211, 'batch_loss/train': 0.5734771697898395} 12/27/2021 20:45:21 - INFO - codeparrot_training - Step 37212: {'lr': 7.695946609071327e-05, 'samples': 19053056, 'steps': 37212, 'batch_loss/train': 0.7163792219944298} 12/27/2021 20:45:32 - INFO - codeparrot_training - Step 37213: {'lr': 7.694808951796906e-05, 'samples': 19053568, 'steps': 37213, 'batch_loss/train': 0.6888456537853926} 12/27/2021 20:45:43 - INFO - codeparrot_training - Step 37214: {'lr': 7.693671363321397e-05, 'samples': 19054080, 'steps': 37214, 'batch_loss/train': 0.7523632366210222} 12/27/2021 20:45:55 - INFO - codeparrot_training - Step 37215: {'lr': 7.692533843649308e-05, 'samples': 19054592, 'steps': 37215, 'batch_loss/train': 0.7140096887014806} 12/27/2021 20:46:05 - INFO - codeparrot_training - Step 37216: {'lr': 7.69139639278516e-05, 'samples': 19055104, 'steps': 37216, 'batch_loss/train': 0.7272496754303575} 12/27/2021 20:46:16 - INFO - codeparrot_training - Step 37217: {'lr': 7.690259010733503e-05, 'samples': 19055616, 'steps': 37217, 'batch_loss/train': 0.6651750588789582} 12/27/2021 20:46:26 - INFO - codeparrot_training - Step 37218: {'lr': 7.68912169749883e-05, 'samples': 19056128, 
'steps': 37218, 'batch_loss/train': 0.6344317790062632} 12/27/2021 20:46:39 - INFO - codeparrot_training - Step 37219: {'lr': 7.687984453085675e-05, 'samples': 19056640, 'steps': 37219, 'batch_loss/train': 0.7296708421781659} 12/27/2021 20:46:49 - INFO - codeparrot_training - Step 37220: {'lr': 7.686847277498554e-05, 'samples': 19057152, 'steps': 37220, 'batch_loss/train': 0.7156283739022911} 12/27/2021 20:47:00 - INFO - codeparrot_training - Step 37221: {'lr': 7.685710170741994e-05, 'samples': 19057664, 'steps': 37221, 'batch_loss/train': 0.7552863712189719} 12/27/2021 20:47:14 - INFO - codeparrot_training - Step 37222: {'lr': 7.684573132820511e-05, 'samples': 19058176, 'steps': 37222, 'batch_loss/train': 0.6897243387065828} 12/27/2021 20:47:24 - INFO - codeparrot_training - Step 37223: {'lr': 7.683436163738627e-05, 'samples': 19058688, 'steps': 37223, 'batch_loss/train': 0.6367177948122844} 12/27/2021 20:47:35 - INFO - codeparrot_training - Step 37224: {'lr': 7.682299263500863e-05, 'samples': 19059200, 'steps': 37224, 'batch_loss/train': 0.6085158386267722} 12/27/2021 20:47:47 - INFO - codeparrot_training - Step 37225: {'lr': 7.681162432111741e-05, 'samples': 19059712, 'steps': 37225, 'batch_loss/train': 0.6673699459061027} 12/27/2021 20:47:58 - INFO - codeparrot_training - Step 37226: {'lr': 7.680025669575783e-05, 'samples': 19060224, 'steps': 37226, 'batch_loss/train': 0.7112967390567064} 12/27/2021 20:48:08 - INFO - codeparrot_training - Step 37227: {'lr': 7.678888975897486e-05, 'samples': 19060736, 'steps': 37227, 'batch_loss/train': 0.6769440146163106} 12/27/2021 20:48:22 - INFO - codeparrot_training - Step 37228: {'lr': 7.677752351081404e-05, 'samples': 19061248, 'steps': 37228, 'batch_loss/train': 0.8331424961797893} 12/27/2021 20:48:33 - INFO - codeparrot_training - Step 37229: {'lr': 7.676615795132028e-05, 'samples': 19061760, 'steps': 37229, 'batch_loss/train': 0.7637936845421791} 12/27/2021 20:48:44 - INFO - codeparrot_training - Step 37230: {'lr': 
7.67547930805389e-05, 'samples': 19062272, 'steps': 37230, 'batch_loss/train': 0.6796521516516805} 12/27/2021 20:48:54 - INFO - codeparrot_training - Step 37231: {'lr': 7.674342889851502e-05, 'samples': 19062784, 'steps': 37231, 'batch_loss/train': 0.6177400411106646} 12/27/2021 20:49:06 - INFO - codeparrot_training - Step 37232: {'lr': 7.673206540529385e-05, 'samples': 19063296, 'steps': 37232, 'batch_loss/train': 0.7095497525297105} 12/27/2021 20:49:17 - INFO - codeparrot_training - Step 37233: {'lr': 7.672070260092057e-05, 'samples': 19063808, 'steps': 37233, 'batch_loss/train': 0.5246806815266609} 12/27/2021 20:49:28 - INFO - codeparrot_training - Step 37234: {'lr': 7.670934048544034e-05, 'samples': 19064320, 'steps': 37234, 'batch_loss/train': 0.6837350488640368} 12/27/2021 20:49:40 - INFO - codeparrot_training - Step 37235: {'lr': 7.669797905889835e-05, 'samples': 19064832, 'steps': 37235, 'batch_loss/train': 0.5508872780483216} 12/27/2021 20:49:50 - INFO - codeparrot_training - Step 37236: {'lr': 7.668661832133977e-05, 'samples': 19065344, 'steps': 37236, 'batch_loss/train': 0.6588692646473646} 12/27/2021 20:50:01 - INFO - codeparrot_training - Step 37237: {'lr': 7.667525827280983e-05, 'samples': 19065856, 'steps': 37237, 'batch_loss/train': 0.5764930052682757} 12/27/2021 20:50:15 - INFO - codeparrot_training - Step 37238: {'lr': 7.666389891335346e-05, 'samples': 19066368, 'steps': 37238, 'batch_loss/train': 0.6316516364458948} 12/27/2021 20:50:26 - INFO - codeparrot_training - Step 37239: {'lr': 7.665254024301605e-05, 'samples': 19066880, 'steps': 37239, 'batch_loss/train': 0.7595114912837744} 12/27/2021 20:50:36 - INFO - codeparrot_training - Step 37240: {'lr': 7.664118226184274e-05, 'samples': 19067392, 'steps': 37240, 'batch_loss/train': 0.6967218136414886} 12/27/2021 20:50:48 - INFO - codeparrot_training - Step 37241: {'lr': 7.662982496987858e-05, 'samples': 19067904, 'steps': 37241, 'batch_loss/train': 0.6751756248995662} 12/27/2021 20:50:59 - INFO - 
codeparrot_training - Step 37242: {'lr': 7.661846836716872e-05, 'samples': 19068416, 'steps': 37242, 'batch_loss/train': 0.7500827652402222} 12/27/2021 20:51:10 - INFO - codeparrot_training - Step 37243: {'lr': 7.660711245375848e-05, 'samples': 19068928, 'steps': 37243, 'batch_loss/train': 0.7083384739235044} 12/27/2021 20:51:20 - INFO - codeparrot_training - Step 37244: {'lr': 7.659575722969283e-05, 'samples': 19069440, 'steps': 37244, 'batch_loss/train': 0.6954609411768615} 12/27/2021 20:51:32 - INFO - codeparrot_training - Step 37245: {'lr': 7.658440269501687e-05, 'samples': 19069952, 'steps': 37245, 'batch_loss/train': 0.7061921257991344} 12/27/2021 20:51:43 - INFO - codeparrot_training - Step 37246: {'lr': 7.657304884977603e-05, 'samples': 19070464, 'steps': 37246, 'batch_loss/train': 0.7062642271630466} 12/27/2021 20:51:53 - INFO - codeparrot_training - Step 37247: {'lr': 7.656169569401519e-05, 'samples': 19070976, 'steps': 37247, 'batch_loss/train': 0.7764972895383835} 12/27/2021 20:52:07 - INFO - codeparrot_training - Step 37248: {'lr': 7.655034322777956e-05, 'samples': 19071488, 'steps': 37248, 'batch_loss/train': 0.7188525488600135} 12/27/2021 20:52:18 - INFO - codeparrot_training - Step 37249: {'lr': 7.653899145111426e-05, 'samples': 19072000, 'steps': 37249, 'batch_loss/train': 0.6672337846830487} 12/27/2021 20:52:29 - INFO - codeparrot_training - Step 37250: {'lr': 7.652764036406442e-05, 'samples': 19072512, 'steps': 37250, 'batch_loss/train': 0.76089400309138} 12/27/2021 20:52:41 - INFO - codeparrot_training - Step 37251: {'lr': 7.651628996667523e-05, 'samples': 19073024, 'steps': 37251, 'batch_loss/train': 0.80513805616647} 12/27/2021 20:52:51 - INFO - codeparrot_training - Step 37252: {'lr': 7.65049402589918e-05, 'samples': 19073536, 'steps': 37252, 'batch_loss/train': 0.699212103150785} 12/27/2021 20:53:02 - INFO - codeparrot_training - Step 37253: {'lr': 7.649359124105907e-05, 'samples': 19074048, 'steps': 37253, 'batch_loss/train': 
0.6841794876381755} 12/27/2021 20:53:14 - INFO - codeparrot_training - Step 37254: {'lr': 7.648224291292238e-05, 'samples': 19074560, 'steps': 37254, 'batch_loss/train': 0.772048749960959} 12/27/2021 20:53:25 - INFO - codeparrot_training - Step 37255: {'lr': 7.647089527462689e-05, 'samples': 19075072, 'steps': 37255, 'batch_loss/train': 0.624176093377173} 12/27/2021 20:53:36 - INFO - codeparrot_training - Step 37256: {'lr': 7.645954832621749e-05, 'samples': 19075584, 'steps': 37256, 'batch_loss/train': 0.6807762356474996} 12/27/2021 20:53:48 - INFO - codeparrot_training - Step 37257: {'lr': 7.64482020677394e-05, 'samples': 19076096, 'steps': 37257, 'batch_loss/train': 0.6878777774982154} 12/27/2021 20:53:58 - INFO - codeparrot_training - Step 37258: {'lr': 7.643685649923773e-05, 'samples': 19076608, 'steps': 37258, 'batch_loss/train': 2.622253111563623} 12/27/2021 20:54:09 - INFO - codeparrot_training - Step 37259: {'lr': 7.642551162075759e-05, 'samples': 19077120, 'steps': 37259, 'batch_loss/train': 0.6698884260840714} 12/27/2021 20:54:20 - INFO - codeparrot_training - Step 37260: {'lr': 7.641416743234406e-05, 'samples': 19077632, 'steps': 37260, 'batch_loss/train': 0.6799310119822621} 12/27/2021 20:54:34 - INFO - codeparrot_training - Step 37261: {'lr': 7.640282393404227e-05, 'samples': 19078144, 'steps': 37261, 'batch_loss/train': 0.7111582066863775} 12/27/2021 20:54:45 - INFO - codeparrot_training - Step 37262: {'lr': 7.639148112589731e-05, 'samples': 19078656, 'steps': 37262, 'batch_loss/train': 0.7659781072288752} 12/27/2021 20:54:55 - INFO - codeparrot_training - Step 37263: {'lr': 7.638013900795426e-05, 'samples': 19079168, 'steps': 37263, 'batch_loss/train': 0.6255308189429343} 12/27/2021 20:55:07 - INFO - codeparrot_training - Step 37264: {'lr': 7.636879758025822e-05, 'samples': 19079680, 'steps': 37264, 'batch_loss/train': 0.6433690022677183} 12/27/2021 20:55:18 - INFO - codeparrot_training - Step 37265: {'lr': 7.635745684285428e-05, 'samples': 19080192, 
'steps': 37265, 'batch_loss/train': 0.7300108266063035} 12/27/2021 20:55:29 - INFO - codeparrot_training - Step 37266: {'lr': 7.634611679578762e-05, 'samples': 19080704, 'steps': 37266, 'batch_loss/train': 0.7210528597934172} 12/27/2021 20:55:39 - INFO - codeparrot_training - Step 37267: {'lr': 7.633477743910313e-05, 'samples': 19081216, 'steps': 37267, 'batch_loss/train': 0.6755300322547555} 12/27/2021 20:55:53 - INFO - codeparrot_training - Step 37268: {'lr': 7.632343877284592e-05, 'samples': 19081728, 'steps': 37268, 'batch_loss/train': 0.8043438852764666} 12/27/2021 20:56:04 - INFO - codeparrot_training - Step 37269: {'lr': 7.63121007970613e-05, 'samples': 19082240, 'steps': 37269, 'batch_loss/train': 0.667689764406532} 12/27/2021 20:56:15 - INFO - codeparrot_training - Step 37270: {'lr': 7.63007635117941e-05, 'samples': 19082752, 'steps': 37270, 'batch_loss/train': 0.6861507017165422} 12/27/2021 20:56:27 - INFO - codeparrot_training - Step 37271: {'lr': 7.628942691708942e-05, 'samples': 19083264, 'steps': 37271, 'batch_loss/train': 0.7330965111032128} 12/27/2021 20:56:37 - INFO - codeparrot_training - Step 37272: {'lr': 7.627809101299252e-05, 'samples': 19083776, 'steps': 37272, 'batch_loss/train': 0.6400371168856509} 12/27/2021 20:56:48 - INFO - codeparrot_training - Step 37273: {'lr': 7.626675579954828e-05, 'samples': 19084288, 'steps': 37273, 'batch_loss/train': 0.7206624606624246} 12/27/2021 20:57:00 - INFO - codeparrot_training - Step 37274: {'lr': 7.62554212768018e-05, 'samples': 19084800, 'steps': 37274, 'batch_loss/train': 0.6502030985429883} 12/27/2021 20:57:11 - INFO - codeparrot_training - Step 37275: {'lr': 7.624408744479819e-05, 'samples': 19085312, 'steps': 37275, 'batch_loss/train': 0.6793107080447953} 12/27/2021 20:57:21 - INFO - codeparrot_training - Step 37276: {'lr': 7.623275430358246e-05, 'samples': 19085824, 'steps': 37276, 'batch_loss/train': 0.6431621846859343} 12/27/2021 20:57:35 - INFO - codeparrot_training - Step 37277: {'lr': 
7.622142185319966e-05, 'samples': 19086336, 'steps': 37277, 'batch_loss/train': 0.7170668063918129} 12/27/2021 20:57:46 - INFO - codeparrot_training - Step 37278: {'lr': 7.621009009369497e-05, 'samples': 19086848, 'steps': 37278, 'batch_loss/train': 0.7657397626899183} 12/27/2021 20:57:57 - INFO - codeparrot_training - Step 37279: {'lr': 7.619875902511319e-05, 'samples': 19087360, 'steps': 37279, 'batch_loss/train': 0.6695905304513872} 12/27/2021 20:58:08 - INFO - codeparrot_training - Step 37280: {'lr': 7.618742864749961e-05, 'samples': 19087872, 'steps': 37280, 'batch_loss/train': 0.8357820631936193} 12/27/2021 20:58:20 - INFO - codeparrot_training - Step 37281: {'lr': 7.617609896089925e-05, 'samples': 19088384, 'steps': 37281, 'batch_loss/train': 0.6094255256466568} 12/27/2021 20:58:31 - INFO - codeparrot_training - Step 37282: {'lr': 7.616476996535693e-05, 'samples': 19088896, 'steps': 37282, 'batch_loss/train': 0.6971251973882318} 12/27/2021 20:58:41 - INFO - codeparrot_training - Step 37283: {'lr': 7.6153441660918e-05, 'samples': 19089408, 'steps': 37283, 'batch_loss/train': 0.7346775801852345} 12/27/2021 20:58:55 - INFO - codeparrot_training - Step 37284: {'lr': 7.614211404762728e-05, 'samples': 19089920, 'steps': 37284, 'batch_loss/train': 0.6376164816028904} 12/27/2021 20:59:06 - INFO - codeparrot_training - Step 37285: {'lr': 7.613078712552987e-05, 'samples': 19090432, 'steps': 37285, 'batch_loss/train': 0.9116199733689427} 12/27/2021 20:59:17 - INFO - codeparrot_training - Step 37286: {'lr': 7.611946089467078e-05, 'samples': 19090944, 'steps': 37286, 'batch_loss/train': 0.5323680721921846} 12/27/2021 20:59:29 - INFO - codeparrot_training - Step 37287: {'lr': 7.610813535509506e-05, 'samples': 19091456, 'steps': 37287, 'batch_loss/train': 0.5788163533434272} 12/27/2021 20:59:39 - INFO - codeparrot_training - Step 37288: {'lr': 7.609681050684775e-05, 'samples': 19091968, 'steps': 37288, 'batch_loss/train': 0.7699443055316806} 12/27/2021 20:59:50 - INFO - 
codeparrot_training - Step 37289: {'lr': 7.608548634997386e-05, 'samples': 19092480, 'steps': 37289, 'batch_loss/train': 0.707109990529716} 12/27/2021 21:00:01 - INFO - codeparrot_training - Step 37290: {'lr': 7.60741628845184e-05, 'samples': 19092992, 'steps': 37290, 'batch_loss/train': 0.7451013568788767} 12/27/2021 21:00:13 - INFO - codeparrot_training - Step 37291: {'lr': 7.606284011052638e-05, 'samples': 19093504, 'steps': 37291, 'batch_loss/train': 0.705041904002428} 12/27/2021 21:00:23 - INFO - codeparrot_training - Step 37292: {'lr': 7.605151802804294e-05, 'samples': 19094016, 'steps': 37292, 'batch_loss/train': 0.7641530300606973} 12/27/2021 21:00:34 - INFO - codeparrot_training - Step 37293: {'lr': 7.60401966371129e-05, 'samples': 19094528, 'steps': 37293, 'batch_loss/train': 0.6409878124250099} 12/27/2021 21:00:46 - INFO - codeparrot_training - Step 37294: {'lr': 7.602887593778126e-05, 'samples': 19095040, 'steps': 37294, 'batch_loss/train': 0.5539533082337584} 12/27/2021 21:00:57 - INFO - codeparrot_training - Step 37295: {'lr': 7.601755593009327e-05, 'samples': 19095552, 'steps': 37295, 'batch_loss/train': 0.7170076044276357} 12/27/2021 21:01:07 - INFO - codeparrot_training - Step 37296: {'lr': 7.600623661409373e-05, 'samples': 19096064, 'steps': 37296, 'batch_loss/train': 0.7146009728312492} 12/27/2021 21:01:21 - INFO - codeparrot_training - Step 37297: {'lr': 7.59949179898276e-05, 'samples': 19096576, 'steps': 37297, 'batch_loss/train': 0.6565872272476554} 12/27/2021 21:01:32 - INFO - codeparrot_training - Step 37298: {'lr': 7.598360005734014e-05, 'samples': 19097088, 'steps': 37298, 'batch_loss/train': 0.45214333425974473} 12/27/2021 21:01:42 - INFO - codeparrot_training - Step 37299: {'lr': 7.59722828166761e-05, 'samples': 19097600, 'steps': 37299, 'batch_loss/train': 0.6999318770831451} 12/27/2021 21:01:54 - INFO - codeparrot_training - Step 37300: {'lr': 7.596096626788054e-05, 'samples': 19098112, 'steps': 37300, 'batch_loss/train': 
0.7140223244205117} 12/27/2021 21:02:05 - INFO - codeparrot_training - Step 37301: {'lr': 7.59496504109985e-05, 'samples': 19098624, 'steps': 37301, 'batch_loss/train': 0.7311522271484137} 12/27/2021 21:02:16 - INFO - codeparrot_training - Step 37302: {'lr': 7.593833524607491e-05, 'samples': 19099136, 'steps': 37302, 'batch_loss/train': 0.8109034434892237} 12/27/2021 21:02:26 - INFO - codeparrot_training - Step 37303: {'lr': 7.592702077315478e-05, 'samples': 19099648, 'steps': 37303, 'batch_loss/train': 0.6819862346164882} 12/27/2021 21:02:39 - INFO - codeparrot_training - Step 37304: {'lr': 7.591570699228315e-05, 'samples': 19100160, 'steps': 37304, 'batch_loss/train': 0.8743558004498482} 12/27/2021 21:02:49 - INFO - codeparrot_training - Step 37305: {'lr': 7.590439390350478e-05, 'samples': 19100672, 'steps': 37305, 'batch_loss/train': 0.7029228806495667} 12/27/2021 21:03:00 - INFO - codeparrot_training - Step 37306: {'lr': 7.589308150686488e-05, 'samples': 19101184, 'steps': 37306, 'batch_loss/train': 0.7788122746860608} 12/27/2021 21:03:14 - INFO - codeparrot_training - Step 37307: {'lr': 7.588176980240846e-05, 'samples': 19101696, 'steps': 37307, 'batch_loss/train': 0.6450522621162236} 12/27/2021 21:03:25 - INFO - codeparrot_training - Step 37308: {'lr': 7.587045879018021e-05, 'samples': 19102208, 'steps': 37308, 'batch_loss/train': 0.5050364781636745} 12/27/2021 21:03:35 - INFO - codeparrot_training - Step 37309: {'lr': 7.585914847022532e-05, 'samples': 19102720, 'steps': 37309, 'batch_loss/train': 0.7041539838537574} 12/27/2021 21:03:47 - INFO - codeparrot_training - Step 37310: {'lr': 7.584783884258881e-05, 'samples': 19103232, 'steps': 37310, 'batch_loss/train': 0.7919013136997819} 12/27/2021 21:03:58 - INFO - codeparrot_training - Step 37311: {'lr': 7.583652990731544e-05, 'samples': 19103744, 'steps': 37311, 'batch_loss/train': 0.7069956574123353} 12/27/2021 21:04:09 - INFO - codeparrot_training - Step 37312: {'lr': 7.582522166445028e-05, 'samples': 
19104256, 'steps': 37312, 'batch_loss/train': 0.7787594036199152} 12/27/2021 21:04:22 - INFO - codeparrot_training - Step 37313: {'lr': 7.581391411403826e-05, 'samples': 19104768, 'steps': 37313, 'batch_loss/train': 0.6895229695364833} 12/27/2021 21:04:33 - INFO - codeparrot_training - Step 37314: {'lr': 7.580260725612434e-05, 'samples': 19105280, 'steps': 37314, 'batch_loss/train': 0.7547299859579653} 12/27/2021 21:04:44 - INFO - codeparrot_training - Step 37315: {'lr': 7.579130109075349e-05, 'samples': 19105792, 'steps': 37315, 'batch_loss/train': 0.6049373353307601} 12/27/2021 21:04:54 - INFO - codeparrot_training - Step 37316: {'lr': 7.577999561797064e-05, 'samples': 19106304, 'steps': 37316, 'batch_loss/train': 0.7531453799456358} 12/27/2021 21:05:06 - INFO - codeparrot_training - Step 37317: {'lr': 7.576869083782076e-05, 'samples': 19106816, 'steps': 37317, 'batch_loss/train': 0.7085924739949405} 12/27/2021 21:05:17 - INFO - codeparrot_training - Step 37318: {'lr': 7.575738675034886e-05, 'samples': 19107328, 'steps': 37318, 'batch_loss/train': 0.5870137768797576} 12/27/2021 21:05:28 - INFO - codeparrot_training - Step 37319: {'lr': 7.574608335559962e-05, 'samples': 19107840, 'steps': 37319, 'batch_loss/train': 0.736138345208019} 12/27/2021 21:05:40 - INFO - codeparrot_training - Step 37320: {'lr': 7.573478065361825e-05, 'samples': 19108352, 'steps': 37320, 'batch_loss/train': 0.6551742563024163} 12/27/2021 21:05:51 - INFO - codeparrot_training - Step 37321: {'lr': 7.572347864444967e-05, 'samples': 19108864, 'steps': 37321, 'batch_loss/train': 0.7396856863051653} 12/27/2021 21:06:01 - INFO - codeparrot_training - Step 37322: {'lr': 7.571217732813865e-05, 'samples': 19109376, 'steps': 37322, 'batch_loss/train': 0.6550406668102369} 12/27/2021 21:06:15 - INFO - codeparrot_training - Step 37323: {'lr': 7.57008767047301e-05, 'samples': 19109888, 'steps': 37323, 'batch_loss/train': 0.7229594553937204} 12/27/2021 21:06:26 - INFO - codeparrot_training - Step 37324: 
{'lr': 7.568957677426927e-05, 'samples': 19110400, 'steps': 37324, 'batch_loss/train': 0.6682277834042907} 12/27/2021 21:06:36 - INFO - codeparrot_training - Step 37325: {'lr': 7.567827753680073e-05, 'samples': 19110912, 'steps': 37325, 'batch_loss/train': 0.7449726587801706} 12/27/2021 21:06:48 - INFO - codeparrot_training - Step 37326: {'lr': 7.566697899236957e-05, 'samples': 19111424, 'steps': 37326, 'batch_loss/train': 0.6786658722849097} 12/27/2021 21:06:59 - INFO - codeparrot_training - Step 37327: {'lr': 7.565568114102067e-05, 'samples': 19111936, 'steps': 37327, 'batch_loss/train': 0.6913244491443038} 12/27/2021 21:07:10 - INFO - codeparrot_training - Step 37328: {'lr': 7.564438398279896e-05, 'samples': 19112448, 'steps': 37328, 'batch_loss/train': 0.7258752132765949} 12/27/2021 21:07:20 - INFO - codeparrot_training - Step 37329: {'lr': 7.563308751774933e-05, 'samples': 19112960, 'steps': 37329, 'batch_loss/train': 0.6219339363451581} 12/27/2021 21:07:32 - INFO - codeparrot_training - Step 37330: {'lr': 7.56217917459168e-05, 'samples': 19113472, 'steps': 37330, 'batch_loss/train': 0.7626183107495308} 12/27/2021 21:07:43 - INFO - codeparrot_training - Step 37331: {'lr': 7.5610496667346e-05, 'samples': 19113984, 'steps': 37331, 'batch_loss/train': 0.6819661702029407} 12/27/2021 21:07:54 - INFO - codeparrot_training - Step 37332: {'lr': 7.559920228208211e-05, 'samples': 19114496, 'steps': 37332, 'batch_loss/train': 0.6981303235515952} 12/27/2021 21:08:07 - INFO - codeparrot_training - Step 37333: {'lr': 7.558790859017004e-05, 'samples': 19115008, 'steps': 37333, 'batch_loss/train': 0.6890076363924891} 12/27/2021 21:08:18 - INFO - codeparrot_training - Step 37334: {'lr': 7.55766155916544e-05, 'samples': 19115520, 'steps': 37334, 'batch_loss/train': 0.6658317900728434} 12/27/2021 21:08:29 - INFO - codeparrot_training - Step 37335: {'lr': 7.556532328658037e-05, 'samples': 19116032, 'steps': 37335, 'batch_loss/train': 0.6997547652572393} 12/27/2021 21:08:43 - INFO 
- codeparrot_training - Step 37336: {'lr': 7.555403167499283e-05, 'samples': 19116544, 'steps': 37336, 'batch_loss/train': 0.6568750129081309} 12/27/2021 21:08:54 - INFO - codeparrot_training - Step 37337: {'lr': 7.55427407569365e-05, 'samples': 19117056, 'steps': 37337, 'batch_loss/train': 0.5724144229316153} 12/27/2021 21:09:04 - INFO - codeparrot_training - Step 37338: {'lr': 7.553145053245639e-05, 'samples': 19117568, 'steps': 37338, 'batch_loss/train': 0.6182757969945669} 12/27/2021 21:09:16 - INFO - codeparrot_training - Step 37339: {'lr': 7.55201610015973e-05, 'samples': 19118080, 'steps': 37339, 'batch_loss/train': 0.6705024992115796} 12/27/2021 21:09:27 - INFO - codeparrot_training - Step 37340: {'lr': 7.550887216440422e-05, 'samples': 19118592, 'steps': 37340, 'batch_loss/train': 0.668451156001538} 12/27/2021 21:09:37 - INFO - codeparrot_training - Step 37341: {'lr': 7.549758402092197e-05, 'samples': 19119104, 'steps': 37341, 'batch_loss/train': 0.6778589561581612} 12/27/2021 21:09:48 - INFO - codeparrot_training - Step 37342: {'lr': 7.548629657119543e-05, 'samples': 19119616, 'steps': 37342, 'batch_loss/train': 0.7111142559442669} 12/27/2021 21:10:00 - INFO - codeparrot_training - Step 37343: {'lr': 7.54750098152695e-05, 'samples': 19120128, 'steps': 37343, 'batch_loss/train': 0.6727881319820881} 12/27/2021 21:10:11 - INFO - codeparrot_training - Step 37344: {'lr': 7.546372375318899e-05, 'samples': 19120640, 'steps': 37344, 'batch_loss/train': 0.7381064454093575} 12/27/2021 21:10:22 - INFO - codeparrot_training - Step 37345: {'lr': 7.545243838499886e-05, 'samples': 19121152, 'steps': 37345, 'batch_loss/train': 0.7043384981807321} 12/27/2021 21:10:36 - INFO - codeparrot_training - Step 37346: {'lr': 7.544115371074389e-05, 'samples': 19121664, 'steps': 37346, 'batch_loss/train': 0.7231387542560697} 12/27/2021 21:10:47 - INFO - codeparrot_training - Step 37347: {'lr': 7.542986973046908e-05, 'samples': 19122176, 'steps': 37347, 'batch_loss/train': 
0.6788559465203434} 12/27/2021 21:10:57 - INFO - codeparrot_training - Step 37348: {'lr': 7.541858644421912e-05, 'samples': 19122688, 'steps': 37348, 'batch_loss/train': 0.7030139514245093} 12/27/2021 21:11:10 - INFO - codeparrot_training - Step 37349: {'lr': 7.540730385203884e-05, 'samples': 19123200, 'steps': 37349, 'batch_loss/train': 0.6601092806085944} 12/27/2021 21:11:20 - INFO - codeparrot_training - Step 37350: {'lr': 7.539602195397338e-05, 'samples': 19123712, 'steps': 37350, 'batch_loss/train': 0.7468447964638472} 12/27/2021 21:11:31 - INFO - codeparrot_training - Step 37351: {'lr': 7.538474075006732e-05, 'samples': 19124224, 'steps': 37351, 'batch_loss/train': 0.7582406913861632} 12/27/2021 21:11:41 - INFO - codeparrot_training - Step 37352: {'lr': 7.537346024036559e-05, 'samples': 19124736, 'steps': 37352, 'batch_loss/train': 0.617605731356889} 12/27/2021 21:11:53 - INFO - codeparrot_training - Step 37353: {'lr': 7.536218042491305e-05, 'samples': 19125248, 'steps': 37353, 'batch_loss/train': 0.7791539505124092} 12/27/2021 21:12:04 - INFO - codeparrot_training - Step 37354: {'lr': 7.535090130375455e-05, 'samples': 19125760, 'steps': 37354, 'batch_loss/train': 0.7238202840089798} 12/27/2021 21:12:15 - INFO - codeparrot_training - Step 37355: {'lr': 7.53396228769349e-05, 'samples': 19126272, 'steps': 37355, 'batch_loss/train': 0.8299018479883671} 12/27/2021 21:12:29 - INFO - codeparrot_training - Step 37356: {'lr': 7.5328345144499e-05, 'samples': 19126784, 'steps': 37356, 'batch_loss/train': 0.7595121320337057} 12/27/2021 21:12:39 - INFO - codeparrot_training - Step 37357: {'lr': 7.531706810649158e-05, 'samples': 19127296, 'steps': 37357, 'batch_loss/train': 0.6865561921149492} 12/27/2021 21:12:50 - INFO - codeparrot_training - Step 37358: {'lr': 7.53057917629576e-05, 'samples': 19127808, 'steps': 37358, 'batch_loss/train': 0.6991567704826593} 12/27/2021 21:13:02 - INFO - codeparrot_training - Step 37359: {'lr': 7.52945161139419e-05, 'samples': 19128320, 
'steps': 37359, 'batch_loss/train': 0.7568049523979425} 12/27/2021 21:13:13 - INFO - codeparrot_training - Step 37360: {'lr': 7.528324115948903e-05, 'samples': 19128832, 'steps': 37360, 'batch_loss/train': 0.7659473530948162} 12/27/2021 21:13:23 - INFO - codeparrot_training - Step 37361: {'lr': 7.527196689964416e-05, 'samples': 19129344, 'steps': 37361, 'batch_loss/train': 0.6465988736599684} 12/27/2021 21:13:34 - INFO - codeparrot_training - Step 37362: {'lr': 7.526069333445205e-05, 'samples': 19129856, 'steps': 37362, 'batch_loss/train': 0.604933037830051} 12/27/2021 21:13:48 - INFO - codeparrot_training - Step 37363: {'lr': 7.524942046395734e-05, 'samples': 19130368, 'steps': 37363, 'batch_loss/train': 0.7434356597950682} 12/27/2021 21:13:58 - INFO - codeparrot_training - Step 37364: {'lr': 7.523814828820494e-05, 'samples': 19130880, 'steps': 37364, 'batch_loss/train': 0.5822224677540362} 12/27/2021 21:14:09 - INFO - codeparrot_training - Step 37365: {'lr': 7.522687680723969e-05, 'samples': 19131392, 'steps': 37365, 'batch_loss/train': 0.6450864724465646} 12/27/2021 21:14:21 - INFO - codeparrot_training - Step 37366: {'lr': 7.521560602110639e-05, 'samples': 19131904, 'steps': 37366, 'batch_loss/train': 0.6198211978189647} 12/27/2021 21:14:32 - INFO - codeparrot_training - Step 37367: {'lr': 7.520433592984985e-05, 'samples': 19132416, 'steps': 37367, 'batch_loss/train': 0.6701055187731981} 12/27/2021 21:14:42 - INFO - codeparrot_training - Step 37368: {'lr': 7.519306653351487e-05, 'samples': 19132928, 'steps': 37368, 'batch_loss/train': 0.6949010284151882} 12/27/2021 21:14:55 - INFO - codeparrot_training - Step 37369: {'lr': 7.518179783214624e-05, 'samples': 19133440, 'steps': 37369, 'batch_loss/train': 0.9280997947789729} 12/27/2021 21:15:05 - INFO - codeparrot_training - Step 37370: {'lr': 7.517052982578879e-05, 'samples': 19133952, 'steps': 37370, 'batch_loss/train': 0.7135401689447463} 12/27/2021 21:15:16 - INFO - codeparrot_training - Step 37371: {'lr': 
7.51592625144873e-05, 'samples': 19134464, 'steps': 37371, 'batch_loss/train': 0.6191245626250748} 12/27/2021 21:15:28 - INFO - codeparrot_training - Step 37372: {'lr': 7.514799589828656e-05, 'samples': 19134976, 'steps': 37372, 'batch_loss/train': 0.6550342130358331} 12/27/2021 21:15:39 - INFO - codeparrot_training - Step 37373: {'lr': 7.513672997723142e-05, 'samples': 19135488, 'steps': 37373, 'batch_loss/train': 0.6914710430428386} 12/27/2021 21:15:50 - INFO - codeparrot_training - Step 37374: {'lr': 7.512546475136655e-05, 'samples': 19136000, 'steps': 37374, 'batch_loss/train': 0.7335097352042794} 12/27/2021 21:16:00 - INFO - codeparrot_training - Step 37375: {'lr': 7.511420022073676e-05, 'samples': 19136512, 'steps': 37375, 'batch_loss/train': 0.7434495142661035} 12/27/2021 21:16:14 - INFO - codeparrot_training - Step 37376: {'lr': 7.510293638538698e-05, 'samples': 19137024, 'steps': 37376, 'batch_loss/train': 0.834057233761996} 12/27/2021 21:16:25 - INFO - codeparrot_training - Step 37377: {'lr': 7.509167324536185e-05, 'samples': 19137536, 'steps': 37377, 'batch_loss/train': 0.7662120093591511} 12/27/2021 21:16:36 - INFO - codeparrot_training - Step 37378: {'lr': 7.508041080070616e-05, 'samples': 19138048, 'steps': 37378, 'batch_loss/train': 0.6059486406156793} 12/27/2021 21:16:48 - INFO - codeparrot_training - Step 37379: {'lr': 7.506914905146473e-05, 'samples': 19138560, 'steps': 37379, 'batch_loss/train': 0.777562371455133} 12/27/2021 21:16:58 - INFO - codeparrot_training - Step 37380: {'lr': 7.505788799768229e-05, 'samples': 19139072, 'steps': 37380, 'batch_loss/train': 0.7164712813682854} 12/27/2021 21:17:09 - INFO - codeparrot_training - Step 37381: {'lr': 7.504662763940362e-05, 'samples': 19139584, 'steps': 37381, 'batch_loss/train': 0.7718726471066475} 12/27/2021 21:17:21 - INFO - codeparrot_training - Step 37382: {'lr': 7.503536797667353e-05, 'samples': 19140096, 'steps': 37382, 'batch_loss/train': 0.5904621267109178} 12/27/2021 21:17:32 - INFO - 
codeparrot_training - Step 37383: {'lr': 7.502410900953674e-05, 'samples': 19140608, 'steps': 37383, 'batch_loss/train': 0.7529159965924919} 12/27/2021 21:17:42 - INFO - codeparrot_training - Step 37384: {'lr': 7.501285073803805e-05, 'samples': 19141120, 'steps': 37384, 'batch_loss/train': 0.7132972029503435} 12/27/2021 21:17:53 - INFO - codeparrot_training - Step 37385: {'lr': 7.500159316222222e-05, 'samples': 19141632, 'steps': 37385, 'batch_loss/train': 0.5251705628470518} 12/27/2021 21:18:07 - INFO - codeparrot_training - Step 37386: {'lr': 7.499033628213387e-05, 'samples': 19142144, 'steps': 37386, 'batch_loss/train': 0.6340461129439063} 12/27/2021 21:18:17 - INFO - codeparrot_training - Step 37387: {'lr': 7.497908009781789e-05, 'samples': 19142656, 'steps': 37387, 'batch_loss/train': 0.8204036774113774} 12/27/2021 21:18:28 - INFO - codeparrot_training - Step 37388: {'lr': 7.496782460931912e-05, 'samples': 19143168, 'steps': 37388, 'batch_loss/train': 0.6903000911697745} 12/27/2021 21:18:40 - INFO - codeparrot_training - Step 37389: {'lr': 7.495656981668211e-05, 'samples': 19143680, 'steps': 37389, 'batch_loss/train': 0.7044523255899549} 12/27/2021 21:18:51 - INFO - codeparrot_training - Step 37390: {'lr': 7.494531571995159e-05, 'samples': 19144192, 'steps': 37390, 'batch_loss/train': 0.6488890142645687} 12/27/2021 21:19:01 - INFO - codeparrot_training - Step 37391: {'lr': 7.493406231917258e-05, 'samples': 19144704, 'steps': 37391, 'batch_loss/train': 0.7884946931153536} 12/27/2021 21:19:15 - INFO - codeparrot_training - Step 37392: {'lr': 7.492280961438953e-05, 'samples': 19145216, 'steps': 37392, 'batch_loss/train': 0.7413497781381011} 12/27/2021 21:19:26 - INFO - codeparrot_training - Step 37393: {'lr': 7.491155760564727e-05, 'samples': 19145728, 'steps': 37393, 'batch_loss/train': 0.6674140924587846} 12/27/2021 21:19:37 - INFO - codeparrot_training - Step 37394: {'lr': 7.490030629299058e-05, 'samples': 19146240, 'steps': 37394, 'batch_loss/train': 
0.7535772970877588} 12/27/2021 21:19:49 - INFO - codeparrot_training - Step 37395: {'lr': 7.488905567646415e-05, 'samples': 19146752, 'steps': 37395, 'batch_loss/train': 0.7059110412374139} 12/27/2021 21:19:59 - INFO - codeparrot_training - Step 37396: {'lr': 7.487780575611269e-05, 'samples': 19147264, 'steps': 37396, 'batch_loss/train': 0.7181800175458193} 12/27/2021 21:20:10 - INFO - codeparrot_training - Step 37397: {'lr': 7.486655653198099e-05, 'samples': 19147776, 'steps': 37397, 'batch_loss/train': 0.6416898448951542} 12/27/2021 21:20:20 - INFO - codeparrot_training - Step 37398: {'lr': 7.48553080041137e-05, 'samples': 19148288, 'steps': 37398, 'batch_loss/train': 0.6721502994187176} 12/27/2021 21:20:33 - INFO - codeparrot_training - Step 37399: {'lr': 7.48440601725556e-05, 'samples': 19148800, 'steps': 37399, 'batch_loss/train': 0.7396495323628187} 12/27/2021 21:20:43 - INFO - codeparrot_training - Step 37400: {'lr': 7.483281303735142e-05, 'samples': 19149312, 'steps': 37400, 'batch_loss/train': 0.630719704873627} 12/27/2021 21:20:54 - INFO - codeparrot_training - Step 37401: {'lr': 7.482156659854572e-05, 'samples': 19149824, 'steps': 37401, 'batch_loss/train': 0.7042665304616094} 12/27/2021 21:21:08 - INFO - codeparrot_training - Step 37402: {'lr': 7.481032085618347e-05, 'samples': 19150336, 'steps': 37402, 'batch_loss/train': 0.7149718501605093} 12/27/2021 21:21:18 - INFO - codeparrot_training - Step 37403: {'lr': 7.479907581030915e-05, 'samples': 19150848, 'steps': 37403, 'batch_loss/train': 0.6572942952625453} 12/27/2021 21:21:29 - INFO - codeparrot_training - Step 37404: {'lr': 7.478783146096754e-05, 'samples': 19151360, 'steps': 37404, 'batch_loss/train': 0.7400040016509593} 12/27/2021 21:21:41 - INFO - codeparrot_training - Step 37405: {'lr': 7.477658780820337e-05, 'samples': 19151872, 'steps': 37405, 'batch_loss/train': 0.7719031758606434} 12/27/2021 21:21:52 - INFO - codeparrot_training - Step 37406: {'lr': 7.476534485206136e-05, 'samples': 
19152384, 'steps': 37406, 'batch_loss/train': 0.70717808092013} 12/27/2021 21:22:02 - INFO - codeparrot_training - Step 37407: {'lr': 7.475410259258612e-05, 'samples': 19152896, 'steps': 37407, 'batch_loss/train': 0.7208217596635222} 12/27/2021 21:22:15 - INFO - codeparrot_training - Step 37408: {'lr': 7.474286102982242e-05, 'samples': 19153408, 'steps': 37408, 'batch_loss/train': 0.6730408789590001} 12/27/2021 21:22:25 - INFO - codeparrot_training - Step 37409: {'lr': 7.47316201638149e-05, 'samples': 19153920, 'steps': 37409, 'batch_loss/train': 0.8204831019975245} 12/27/2021 21:22:36 - INFO - codeparrot_training - Step 37410: {'lr': 7.472037999460832e-05, 'samples': 19154432, 'steps': 37410, 'batch_loss/train': 0.6893211882561445} 12/27/2021 21:22:46 - INFO - codeparrot_training - Step 37411: {'lr': 7.47091405222474e-05, 'samples': 19154944, 'steps': 37411, 'batch_loss/train': 0.6530314162373543} 12/27/2021 21:22:59 - INFO - codeparrot_training - Step 37412: {'lr': 7.469790174677657e-05, 'samples': 19155456, 'steps': 37412, 'batch_loss/train': 0.6023655394092202} 12/27/2021 21:23:09 - INFO - codeparrot_training - Step 37413: {'lr': 7.468666366824079e-05, 'samples': 19155968, 'steps': 37413, 'batch_loss/train': 0.6432462411466986} 12/27/2021 21:23:20 - INFO - codeparrot_training - Step 37414: {'lr': 7.467542628668469e-05, 'samples': 19156480, 'steps': 37414, 'batch_loss/train': 0.7179863974452019} 12/27/2021 21:23:34 - INFO - codeparrot_training - Step 37415: {'lr': 7.466418960215285e-05, 'samples': 19156992, 'steps': 37415, 'batch_loss/train': 0.6515214685350657} 12/27/2021 21:23:45 - INFO - codeparrot_training - Step 37416: {'lr': 7.46529536146899e-05, 'samples': 19157504, 'steps': 37416, 'batch_loss/train': 0.6703551330720074} 12/27/2021 21:23:55 - INFO - codeparrot_training - Step 37417: {'lr': 7.464171832434074e-05, 'samples': 19158016, 'steps': 37417, 'batch_loss/train': 0.6947374456794932} 12/27/2021 21:24:07 - INFO - codeparrot_training - Step 37418: 
{'lr': 7.463048373114983e-05, 'samples': 19158528, 'steps': 37418, 'batch_loss/train': 0.6579911252483726} 12/27/2021 21:24:18 - INFO - codeparrot_training - Step 37419: {'lr': 7.461924983516192e-05, 'samples': 19159040, 'steps': 37419, 'batch_loss/train': 0.7533904174342752} 12/27/2021 21:24:29 - INFO - codeparrot_training - Step 37420: {'lr': 7.460801663642161e-05, 'samples': 19159552, 'steps': 37420, 'batch_loss/train': 0.7679555821232498} 12/27/2021 21:24:39 - INFO - codeparrot_training - Step 37421: {'lr': 7.459678413497362e-05, 'samples': 19160064, 'steps': 37421, 'batch_loss/train': 0.6886857298668474} 12/27/2021 21:24:53 - INFO - codeparrot_training - Step 37422: {'lr': 7.458555233086258e-05, 'samples': 19160576, 'steps': 37422, 'batch_loss/train': 0.7757359948009253} 12/27/2021 21:25:04 - INFO - codeparrot_training - Step 37423: {'lr': 7.457432122413316e-05, 'samples': 19161088, 'steps': 37423, 'batch_loss/train': 0.6489511085674167} 12/27/2021 21:25:15 - INFO - codeparrot_training - Step 37424: {'lr': 7.456309081483e-05, 'samples': 19161600, 'steps': 37424, 'batch_loss/train': 0.7244134497013874} 12/27/2021 21:25:27 - INFO - codeparrot_training - Step 37425: {'lr': 7.455186110299774e-05, 'samples': 19162112, 'steps': 37425, 'batch_loss/train': 0.7851887550204992} 12/27/2021 21:25:37 - INFO - codeparrot_training - Step 37426: {'lr': 7.454063208868112e-05, 'samples': 19162624, 'steps': 37426, 'batch_loss/train': 0.6666904920712113} 12/27/2021 21:25:48 - INFO - codeparrot_training - Step 37427: {'lr': 7.452940377192455e-05, 'samples': 19163136, 'steps': 37427, 'batch_loss/train': 0.6828517820686102} 12/27/2021 21:26:00 - INFO - codeparrot_training - Step 37428: {'lr': 7.451817615277296e-05, 'samples': 19163648, 'steps': 37428, 'batch_loss/train': 0.6541447471827269} 12/27/2021 21:26:11 - INFO - codeparrot_training - Step 37429: {'lr': 7.450694923127074e-05, 'samples': 19164160, 'steps': 37429, 'batch_loss/train': 0.6234635459259152} 12/27/2021 21:26:21 - 
INFO - codeparrot_training - Step 37430: {'lr': 7.449572300746257e-05, 'samples': 19164672, 'steps': 37430, 'batch_loss/train': 0.6532305025029927} 12/27/2021 21:26:35 - INFO - codeparrot_training - Step 37431: {'lr': 7.448449748139329e-05, 'samples': 19165184, 'steps': 37431, 'batch_loss/train': 0.7073656059801579} 12/27/2021 21:26:46 - INFO - codeparrot_training - Step 37432: {'lr': 7.447327265310733e-05, 'samples': 19165696, 'steps': 37432, 'batch_loss/train': 0.659463154617697} 12/27/2021 21:26:57 - INFO - codeparrot_training - Step 37433: {'lr': 7.446204852264931e-05, 'samples': 19166208, 'steps': 37433, 'batch_loss/train': 0.6740720672532916} 12/27/2021 21:27:07 - INFO - codeparrot_training - Step 37434: {'lr': 7.445082509006393e-05, 'samples': 19166720, 'steps': 37434, 'batch_loss/train': 0.7695078076794744} 12/27/2021 21:27:19 - INFO - codeparrot_training - Step 37435: {'lr': 7.44396023553958e-05, 'samples': 19167232, 'steps': 37435, 'batch_loss/train': 0.7309955509845167} 12/27/2021 21:27:30 - INFO - codeparrot_training - Step 37436: {'lr': 7.442838031868948e-05, 'samples': 19167744, 'steps': 37436, 'batch_loss/train': 0.6589964078739285} 12/27/2021 21:27:41 - INFO - codeparrot_training - Step 37437: {'lr': 7.441715897998974e-05, 'samples': 19168256, 'steps': 37437, 'batch_loss/train': 0.7705119187012315} 12/27/2021 21:27:52 - INFO - codeparrot_training - Step 37438: {'lr': 7.440593833934092e-05, 'samples': 19168768, 'steps': 37438, 'batch_loss/train': 0.7628116011619568} 12/27/2021 21:28:03 - INFO - codeparrot_training - Step 37439: {'lr': 7.439471839678785e-05, 'samples': 19169280, 'steps': 37439, 'batch_loss/train': 0.6094201100058854} 12/27/2021 21:28:14 - INFO - codeparrot_training - Step 37440: {'lr': 7.438349915237516e-05, 'samples': 19169792, 'steps': 37440, 'batch_loss/train': 0.7288192734122276} 12/27/2021 21:28:28 - INFO - codeparrot_training - Step 37441: {'lr': 7.437228060614729e-05, 'samples': 19170304, 'steps': 37441, 'batch_loss/train': 
0.7049838309176266} 12/27/2021 21:28:38 - INFO - codeparrot_training - Step 37442: {'lr': 7.436106275814883e-05, 'samples': 19170816, 'steps': 37442, 'batch_loss/train': 0.6624781992286444} 12/27/2021 21:28:49 - INFO - codeparrot_training - Step 37443: {'lr': 7.434984560842465e-05, 'samples': 19171328, 'steps': 37443, 'batch_loss/train': 0.6733969734050333} 12/27/2021 21:28:59 - INFO - codeparrot_training - Step 37444: {'lr': 7.433862915701905e-05, 'samples': 19171840, 'steps': 37444, 'batch_loss/train': 0.6003708890639246} 12/27/2021 21:29:12 - INFO - codeparrot_training - Step 37445: {'lr': 7.432741340397675e-05, 'samples': 19172352, 'steps': 37445, 'batch_loss/train': 0.6816694281296805} 12/27/2021 21:29:22 - INFO - codeparrot_training - Step 37446: {'lr': 7.43161983493423e-05, 'samples': 19172864, 'steps': 37446, 'batch_loss/train': 0.5969056938774884} 12/27/2021 21:29:33 - INFO - codeparrot_training - Step 37447: {'lr': 7.430498399316035e-05, 'samples': 19173376, 'steps': 37447, 'batch_loss/train': 0.7274843598715961} 12/27/2021 21:29:45 - INFO - codeparrot_training - Step 37448: {'lr': 7.429377033547541e-05, 'samples': 19173888, 'steps': 37448, 'batch_loss/train': 0.7048712745308876} 12/27/2021 21:29:56 - INFO - codeparrot_training - Step 37449: {'lr': 7.42825573763321e-05, 'samples': 19174400, 'steps': 37449, 'batch_loss/train': 0.6122336648404598} 12/27/2021 21:30:06 - INFO - codeparrot_training - Step 37450: {'lr': 7.427134511577502e-05, 'samples': 19174912, 'steps': 37450, 'batch_loss/train': 0.6842440476175398} 12/27/2021 21:30:20 - INFO - codeparrot_training - Step 37451: {'lr': 7.42601335538487e-05, 'samples': 19175424, 'steps': 37451, 'batch_loss/train': 0.7384326858446002} 12/27/2021 21:30:31 - INFO - codeparrot_training - Step 37452: {'lr': 7.42489226905978e-05, 'samples': 19175936, 'steps': 37452, 'batch_loss/train': 0.7526729335077107} 12/27/2021 21:30:41 - INFO - codeparrot_training - Step 37453: {'lr': 7.423771252606667e-05, 'samples': 19176448, 
'steps': 37453, 'batch_loss/train': 0.7486627148464322} 12/27/2021 21:30:53 - INFO - codeparrot_training - Step 37454: {'lr': 7.422650306030021e-05, 'samples': 19176960, 'steps': 37454, 'batch_loss/train': 0.6201899885199964} 12/27/2021 21:31:04 - INFO - codeparrot_training - Step 37455: {'lr': 7.42152942933427e-05, 'samples': 19177472, 'steps': 37455, 'batch_loss/train': 0.6662867576815188} 12/27/2021 21:31:15 - INFO - codeparrot_training - Step 37456: {'lr': 7.420408622523875e-05, 'samples': 19177984, 'steps': 37456, 'batch_loss/train': 0.7770166150294244} 12/27/2021 21:31:25 - INFO - codeparrot_training - Step 37457: {'lr': 7.41928788560331e-05, 'samples': 19178496, 'steps': 37457, 'batch_loss/train': 0.5496751724858768} 12/27/2021 21:31:38 - INFO - codeparrot_training - Step 37458: {'lr': 7.418167218577013e-05, 'samples': 19179008, 'steps': 37458, 'batch_loss/train': 0.5746014688338619} 12/27/2021 21:31:48 - INFO - codeparrot_training - Step 37459: {'lr': 7.417046621449444e-05, 'samples': 19179520, 'steps': 37459, 'batch_loss/train': 0.8135738680139184} 12/27/2021 21:31:59 - INFO - codeparrot_training - Step 37460: {'lr': 7.415926094225056e-05, 'samples': 19180032, 'steps': 37460, 'batch_loss/train': 0.6807597493752837} 12/27/2021 21:32:13 - INFO - codeparrot_training - Step 37461: {'lr': 7.414805636908309e-05, 'samples': 19180544, 'steps': 37461, 'batch_loss/train': 0.7438042489811778} 12/27/2021 21:32:23 - INFO - codeparrot_training - Step 37462: {'lr': 7.413685249503654e-05, 'samples': 19181056, 'steps': 37462, 'batch_loss/train': 0.7038504919037223} 12/27/2021 21:32:34 - INFO - codeparrot_training - Step 37463: {'lr': 7.412564932015556e-05, 'samples': 19181568, 'steps': 37463, 'batch_loss/train': 0.7415654351934791} 12/27/2021 21:32:46 - INFO - codeparrot_training - Step 37464: {'lr': 7.411444684448443e-05, 'samples': 19182080, 'steps': 37464, 'batch_loss/train': 0.7719740571919829} 12/27/2021 21:32:57 - INFO - codeparrot_training - Step 37465: {'lr': 
7.41032450680679e-05, 'samples': 19182592, 'steps': 37465, 'batch_loss/train': 0.7547908620908856} 12/27/2021 21:33:07 - INFO - codeparrot_training - Step 37466: {'lr': 7.409204399095054e-05, 'samples': 19183104, 'steps': 37466, 'batch_loss/train': 0.7546950355172157} 12/27/2021 21:33:18 - INFO - codeparrot_training - Step 37467: {'lr': 7.408084361317665e-05, 'samples': 19183616, 'steps': 37467, 'batch_loss/train': 0.7192548925522715} 12/27/2021 21:33:30 - INFO - codeparrot_training - Step 37468: {'lr': 7.406964393479096e-05, 'samples': 19184128, 'steps': 37468, 'batch_loss/train': 0.6657983311451972} 12/27/2021 21:33:41 - INFO - codeparrot_training - Step 37469: {'lr': 7.405844495583802e-05, 'samples': 19184640, 'steps': 37469, 'batch_loss/train': 0.7661541579291224} 12/27/2021 21:33:51 - INFO - codeparrot_training - Step 37470: {'lr': 7.40472466763622e-05, 'samples': 19185152, 'steps': 37470, 'batch_loss/train': 0.7560448017902672} 12/27/2021 21:34:05 - INFO - codeparrot_training - Step 37471: {'lr': 7.403604909640802e-05, 'samples': 19185664, 'steps': 37471, 'batch_loss/train': 0.66115465213079} 12/27/2021 21:34:16 - INFO - codeparrot_training - Step 37472: {'lr': 7.40248522160202e-05, 'samples': 19186176, 'steps': 37472, 'batch_loss/train': 0.6897639883682132} 12/27/2021 21:34:27 - INFO - codeparrot_training - Step 37473: {'lr': 7.401365603524307e-05, 'samples': 19186688, 'steps': 37473, 'batch_loss/train': 0.7955716364085674} 12/27/2021 21:34:39 - INFO - codeparrot_training - Step 37474: {'lr': 7.400246055412115e-05, 'samples': 19187200, 'steps': 37474, 'batch_loss/train': 0.7919681305065751} 12/27/2021 21:34:49 - INFO - codeparrot_training - Step 37475: {'lr': 7.399126577269905e-05, 'samples': 19187712, 'steps': 37475, 'batch_loss/train': 0.9506720555946231} 12/27/2021 21:35:00 - INFO - codeparrot_training - Step 37476: {'lr': 7.39800716910212e-05, 'samples': 19188224, 'steps': 37476, 'batch_loss/train': 0.712268385104835} 12/27/2021 21:35:14 - INFO - 
codeparrot_training - Step 37477: {'lr': 7.39688783091321e-05, 'samples': 19188736, 'steps': 37477, 'batch_loss/train': 0.7549817175604403} 12/27/2021 21:35:25 - INFO - codeparrot_training - Step 37478: {'lr': 7.395768562707636e-05, 'samples': 19189248, 'steps': 37478, 'batch_loss/train': 0.6687302407808602} 12/27/2021 21:35:35 - INFO - codeparrot_training - Step 37479: {'lr': 7.394649364489825e-05, 'samples': 19189760, 'steps': 37479, 'batch_loss/train': 0.6355076050385833} 12/27/2021 21:35:46 - INFO - codeparrot_training - Step 37480: {'lr': 7.393530236264248e-05, 'samples': 19190272, 'steps': 37480, 'batch_loss/train': 0.7515812963247299} 12/27/2021 21:35:58 - INFO - codeparrot_training - Step 37481: {'lr': 7.392411178035352e-05, 'samples': 19190784, 'steps': 37481, 'batch_loss/train': 0.6569247324950993} 12/27/2021 21:36:08 - INFO - codeparrot_training - Step 37482: {'lr': 7.39129218980757e-05, 'samples': 19191296, 'steps': 37482, 'batch_loss/train': 0.828252634499222} 12/27/2021 21:36:19 - INFO - codeparrot_training - Step 37483: {'lr': 7.390173271585374e-05, 'samples': 19191808, 'steps': 37483, 'batch_loss/train': 0.7538396110758185} 12/27/2021 21:36:31 - INFO - codeparrot_training - Step 37484: {'lr': 7.38905442337319e-05, 'samples': 19192320, 'steps': 37484, 'batch_loss/train': 0.6844297759234905} 12/27/2021 21:36:42 - INFO - codeparrot_training - Step 37485: {'lr': 7.387935645175478e-05, 'samples': 19192832, 'steps': 37485, 'batch_loss/train': 0.7159486133605242} 12/27/2021 21:36:53 - INFO - codeparrot_training - Step 37486: {'lr': 7.386816936996684e-05, 'samples': 19193344, 'steps': 37486, 'batch_loss/train': 0.7654997101053596} 12/27/2021 21:37:05 - INFO - codeparrot_training - Step 37487: {'lr': 7.385698298841256e-05, 'samples': 19193856, 'steps': 37487, 'batch_loss/train': 0.714582443702966} 12/27/2021 21:37:15 - INFO - codeparrot_training - Step 37488: {'lr': 7.384579730713642e-05, 'samples': 19194368, 'steps': 37488, 'batch_loss/train': 
0.6921452046371996} 12/27/2021 21:37:26 - INFO - codeparrot_training - Step 37489: {'lr': 7.383461232618293e-05, 'samples': 19194880, 'steps': 37489, 'batch_loss/train': 0.6901345533551648} 12/27/2021 21:37:37 - INFO - codeparrot_training - Step 37490: {'lr': 7.382342804559635e-05, 'samples': 19195392, 'steps': 37490, 'batch_loss/train': 0.7048456193879247} 12/27/2021 21:37:51 - INFO - codeparrot_training - Step 37491: {'lr': 7.381224446542137e-05, 'samples': 19195904, 'steps': 37491, 'batch_loss/train': 0.8045436418615282} 12/27/2021 21:38:01 - INFO - codeparrot_training - Step 37492: {'lr': 7.380106158570249e-05, 'samples': 19196416, 'steps': 37492, 'batch_loss/train': 0.7193973967805505} 12/27/2021 21:38:12 - INFO - codeparrot_training - Step 37493: {'lr': 7.378987940648388e-05, 'samples': 19196928, 'steps': 37493, 'batch_loss/train': 0.8321050042286515} 12/27/2021 21:38:24 - INFO - codeparrot_training - Step 37494: {'lr': 7.377869792781026e-05, 'samples': 19197440, 'steps': 37494, 'batch_loss/train': 1.0353566072881222} 12/27/2021 21:38:35 - INFO - codeparrot_training - Step 37495: {'lr': 7.376751714972604e-05, 'samples': 19197952, 'steps': 37495, 'batch_loss/train': 0.7342135366052389} 12/27/2021 21:38:46 - INFO - codeparrot_training - Step 37496: {'lr': 7.375633707227561e-05, 'samples': 19198464, 'steps': 37496, 'batch_loss/train': 0.7240772163495421} 12/27/2021 21:38:58 - INFO - codeparrot_training - Step 37497: {'lr': 7.374515769550332e-05, 'samples': 19198976, 'steps': 37497, 'batch_loss/train': 0.7609490435570478} 12/27/2021 21:39:08 - INFO - codeparrot_training - Step 37498: {'lr': 7.373397901945389e-05, 'samples': 19199488, 'steps': 37498, 'batch_loss/train': 0.8047206150367856} 12/27/2021 21:39:19 - INFO - codeparrot_training - Step 37499: {'lr': 7.372280104417153e-05, 'samples': 19200000, 'steps': 37499, 'batch_loss/train': 0.808799035847187} 12/27/2021 21:39:19 - INFO - codeparrot_training - Evaluating and saving model checkpoint 12/27/2021 21:42:43 
- INFO - codeparrot_training - Step 37500: {'loss/eval': 0.750774621963501, 'perplexity': 2.118640422821045}
12/27/2021 21:43:01 - WARNING - huggingface_hub.repository - Several commits (13) will be pushed upstream.
12/27/2021 21:43:15 - INFO - codeparrot_training - Step 37500: {'lr': 7.371162376970078e-05, 'samples': 19200512, 'steps': 37500, 'batch_loss/train': 0.7357386639341712}
12/27/2021 21:43:28 - INFO - codeparrot_training - Step 37501: {'lr': 7.370044719608602e-05, 'samples': 19201024, 'steps': 37501, 'batch_loss/train': 0.7435182514600456}
12/27/2021 21:43:39 - INFO - codeparrot_training - Step 37502: {'lr': 7.368927132337172e-05, 'samples': 19201536, 'steps': 37502, 'batch_loss/train': 0.6693274169228971}
12/27/2021 21:43:49 - INFO - codeparrot_training - Step 37503: {'lr': 7.367809615160231e-05, 'samples': 19202048, 'steps': 37503, 'batch_loss/train': 0.6421770723536611}
12/27/2021 21:44:01 - INFO - codeparrot_training - Step 37504: {'lr': 7.366692168082221e-05, 'samples': 19202560, 'steps': 37504, 'batch_loss/train': 0.6276878505013883}
12/27/2021 21:44:12 - INFO - codeparrot_training - Step 37505: {'lr': 7.365574791107584e-05, 'samples': 19203072, 'steps': 37505, 'batch_loss/train': 0.6686520266812295}
12/27/2021 21:44:23 - INFO - codeparrot_training - Step 37506: {'lr': 7.364457484240763e-05, 'samples': 19203584, 'steps': 37506, 'batch_loss/train': 0.6210176380118355}
12/27/2021 21:44:37 - INFO - codeparrot_training - Step 37507: {'lr': 7.36334024748621e-05, 'samples': 19204096, 'steps': 37507, 'batch_loss/train': 0.7733500618487597}
12/27/2021 21:44:47 - INFO - codeparrot_training - Step 37508: {'lr': 7.36222308084834e-05, 'samples': 19204608, 'steps': 37508, 'batch_loss/train': 0.9040505704469979}
12/27/2021 21:44:58 - INFO - codeparrot_training - Step 37509: {'lr': 7.361105984331628e-05, 'samples': 19205120, 'steps': 37509, 'batch_loss/train': 0.7905787447816692}
12/27/2021 21:45:09 - INFO - codeparrot_training - Step 37510: {'lr': 7.359988957940489e-05, 'samples': 19205632, 'steps': 37510, 'batch_loss/train': 0.6167054476682097}
12/27/2021 21:45:21 - INFO - codeparrot_training - Step 37511: {'lr': 7.358872001679373e-05, 'samples': 19206144, 'steps': 37511, 'batch_loss/train': 0.7456460346002132}
12/27/2021 21:45:31 - INFO - codeparrot_training - Step 37512: {'lr': 7.357755115552723e-05, 'samples': 19206656, 'steps': 37512, 'batch_loss/train': 0.6298953645746224}
12/27/2021 21:45:42 - INFO - codeparrot_training - Step 37513: {'lr': 7.356638299564977e-05, 'samples': 19207168, 'steps': 37513, 'batch_loss/train': 0.695612823124975}
12/27/2021 21:45:55 - INFO - codeparrot_training - Step 37514: {'lr': 7.355521553720573e-05, 'samples': 19207680, 'steps': 37514, 'batch_loss/train': 1.1674690479412675}
12/27/2021 21:46:05 - INFO - codeparrot_training - Step 37515: {'lr': 7.354404878023954e-05, 'samples': 19208192, 'steps': 37515, 'batch_loss/train': 0.7166361203417182}
12/27/2021 21:46:16 - INFO - codeparrot_training - Step 37516: {'lr': 7.353288272479558e-05, 'samples': 19208704, 'steps': 37516, 'batch_loss/train': 0.6063757427036762}
12/27/2021 21:46:26 - INFO - codeparrot_training - Step 37517: {'lr': 7.352171737091826e-05, 'samples': 19209216, 'steps': 37517, 'batch_loss/train': 0.6467999559827149}
12/27/2021 21:46:40 - INFO - codeparrot_training - Step 37518: {'lr': 7.351055271865204e-05, 'samples': 19209728, 'steps': 37518, 'batch_loss/train': 0.7181018507108092}
12/27/2021 21:46:51 - INFO - codeparrot_training - Step 37519: {'lr': 7.349938876804104e-05, 'samples': 19210240, 'steps': 37519, 'batch_loss/train': 0.696806118299719}
12/27/2021 21:47:01 - INFO - codeparrot_training - Step 37520: {'lr': 7.348822551912993e-05, 'samples': 19210752, 'steps': 37520, 'batch_loss/train': 0.6597581906244159}
12/27/2021 21:47:14 - INFO - codeparrot_training - Step 37521: {'lr': 7.347706297196305e-05, 'samples': 19211264, 'steps': 37521, 'batch_loss/train': 0.7901114020496607}
12/27/2021 21:47:24 - INFO - codeparrot_training - Step 37522: {'lr': 7.346590112658464e-05, 'samples': 19211776, 'steps': 37522, 'batch_loss/train': 0.7814147006720304}
12/27/2021 21:47:35 - INFO - codeparrot_training - Step 37523: {'lr': 7.345473998303908e-05, 'samples': 19212288, 'steps': 37523, 'batch_loss/train': 0.7556688571348786}
12/27/2021 21:47:47 - INFO - codeparrot_training - Step 37524: {'lr': 7.344357954137096e-05, 'samples': 19212800, 'steps': 37524, 'batch_loss/train': 0.7085003582760692}
12/27/2021 21:47:58 - INFO - codeparrot_training - Step 37525: {'lr': 7.343241980162444e-05, 'samples': 19213312, 'steps': 37525, 'batch_loss/train': 0.7335137655027211}
12/27/2021 21:48:08 - INFO - codeparrot_training - Step 37526: {'lr': 7.342126076384392e-05, 'samples': 19213824, 'steps': 37526, 'batch_loss/train': 0.7512008370831609}
12/27/2021 21:48:20 - INFO - codeparrot_training - Step 37527: {'lr': 7.341010242807381e-05, 'samples': 19214336, 'steps': 37527, 'batch_loss/train': 0.7200516937300563}
12/27/2021 21:48:31 - INFO - codeparrot_training - Step 37528: {'lr': 7.339894479435846e-05, 'samples': 19214848, 'steps': 37528, 'batch_loss/train': 0.7727027875371277}
12/27/2021 21:48:42 - INFO - codeparrot_training - Step 37529: {'lr': 7.338778786274222e-05, 'samples': 19215360, 'steps': 37529, 'batch_loss/train': 0.6985601978376508}
12/27/2021 21:48:52 - INFO - codeparrot_training - Step 37530: {'lr': 7.337663163326946e-05, 'samples': 19215872, 'steps': 37530, 'batch_loss/train': 0.6496484489180148}
12/27/2021 21:49:07 - INFO - codeparrot_training - Step 37531: {'lr': 7.336547610598451e-05, 'samples': 19216384, 'steps': 37531, 'batch_loss/train': 0.6876111882738769}
12/27/2021 21:49:17 - INFO - codeparrot_training - Step 37532: {'lr': 7.335432128093173e-05, 'samples': 19216896, 'steps': 37532, 'batch_loss/train': 0.6436452187190298}
12/27/2021 21:49:28 - INFO - codeparrot_training - Step 37533: {'lr': 7.334316715815555e-05, 'samples': 19217408, 'steps': 37533, 'batch_loss/train': 0.4624420494074002}
12/27/2021 21:49:40 - INFO - codeparrot_training - Step 37534: {'lr': 7.333201373770007e-05, 'samples': 19217920, 'steps': 37534, 'batch_loss/train': 0.7080879284767434}
12/27/2021 21:49:50 - INFO - codeparrot_training - Step 37535: {'lr': 7.332086101960996e-05, 'samples': 19218432, 'steps': 37535, 'batch_loss/train': 0.6884201869834214}
12/27/2021 21:50:01 - INFO - codeparrot_training - Step 37536: {'lr': 7.330970900392934e-05, 'samples': 19218944, 'steps': 37536, 'batch_loss/train': 0.7126073078252375}
12/27/2021 21:50:15 - INFO - codeparrot_training - Step 37537: {'lr': 7.329855769070257e-05, 'samples': 19219456, 'steps': 37537, 'batch_loss/train': 0.8198648225516081}
12/27/2021 21:50:26 - INFO - codeparrot_training - Step 37538: {'lr': 7.328740707997402e-05, 'samples': 19219968, 'steps': 37538, 'batch_loss/train': 0.6566177012282424}
12/27/2021 21:50:36 - INFO - codeparrot_training - Step 37539: {'lr': 7.327625717178801e-05, 'samples': 19220480, 'steps': 37539, 'batch_loss/train': 0.7726355723571032}
12/27/2021 21:50:47 - INFO - codeparrot_training - Step 37540: {'lr': 7.326510796618891e-05, 'samples': 19220992, 'steps': 37540, 'batch_loss/train': 0.658759337849915}
12/27/2021 21:50:59 - INFO - codeparrot_training - Step 37541: {'lr': 7.325395946322098e-05, 'samples': 19221504, 'steps': 37541, 'batch_loss/train': 0.6405524201691151}
12/27/2021 21:51:09 - INFO - codeparrot_training - Step 37542: {'lr': 7.324281166292856e-05, 'samples': 19222016, 'steps': 37542, 'batch_loss/train': 0.8007869776338339}
12/27/2021 21:51:20 - INFO - codeparrot_training - Step 37543: {'lr': 7.323166456535598e-05, 'samples': 19222528, 'steps': 37543, 'batch_loss/train': 0.7416593916714191}
12/27/2021 21:51:32 - INFO - codeparrot_training - Step 37544: {'lr': 7.322051817054764e-05, 'samples': 19223040, 'steps': 37544, 'batch_loss/train': 0.6863210353767499}
12/27/2021 21:51:43 - INFO - codeparrot_training - Step 37545: {'lr': 7.320937247854764e-05, 'samples': 19223552, 'steps': 37545, 'batch_loss/train': 0.7160296908114105}
12/27/2021 21:51:54 - INFO - codeparrot_training - Step 37546: {'lr': 7.319822748940047e-05, 'samples': 19224064, 'steps': 37546, 'batch_loss/train': 0.7288490128703415}
12/27/2021 21:52:07 - INFO - codeparrot_training - Step 37547: {'lr': 7.318708320315048e-05, 'samples': 19224576, 'steps': 37547, 'batch_loss/train': 0.779841941781342}
12/27/2021 21:52:18 - INFO - codeparrot_training - Step 37548: {'lr': 7.317593961984182e-05, 'samples': 19225088, 'steps': 37548, 'batch_loss/train': 0.6685237800702453}
12/27/2021 21:52:29 - INFO - codeparrot_training - Step 37549: {'lr': 7.316479673951875e-05, 'samples': 19225600, 'steps': 37549, 'batch_loss/train': 0.6598315038718283}
12/27/2021 21:52:39 - INFO - codeparrot_training - Step 37550: {'lr': 7.315365456222587e-05, 'samples': 19226112, 'steps': 37550, 'batch_loss/train': 0.6775889276759699}
12/27/2021 21:52:52 - INFO - codeparrot_training - Step 37551: {'lr': 7.314251308800718e-05, 'samples': 19226624, 'steps': 37551, 'batch_loss/train': 0.6785634085536003}
12/27/2021 21:53:02 - INFO - codeparrot_training - Step 37552: {'lr': 7.313137231690708e-05, 'samples': 19227136, 'steps': 37552, 'batch_loss/train': 0.7245842609554529}
12/27/2021 21:53:13 - INFO - codeparrot_training - Step 37553: {'lr': 7.312023224896986e-05, 'samples': 19227648, 'steps': 37553, 'batch_loss/train': 0.6148419806850143}
12/27/2021 21:53:25 - INFO - codeparrot_training - Step 37554: {'lr': 7.31090928842398e-05, 'samples': 19228160, 'steps': 37554, 'batch_loss/train': 0.6883994047529995}
12/27/2021 21:53:36 - INFO - codeparrot_training - Step 37555: {'lr': 7.309795422276123e-05, 'samples': 19228672, 'steps': 37555, 'batch_loss/train': 0.9209551746025681}
12/27/2021 21:53:47 - INFO - codeparrot_training - Step 37556: {'lr': 7.308681626457836e-05, 'samples': 19229184, 'steps': 37556, 'batch_loss/train': 0.5962865251058247}
12/27/2021 21:54:01 - INFO - codeparrot_training - Step 37557: {'lr': 7.307567900973555e-05, 'samples': 19229696, 'steps': 37557, 'batch_loss/train': 0.7018822133541107}
12/27/2021 21:54:11 - INFO - codeparrot_training - Step 37558: {'lr': 7.3064542458277e-05, 'samples': 19230208, 'steps': 37558, 'batch_loss/train': 0.7511423039250076}
12/27/2021 21:54:22 - INFO - codeparrot_training - Step 37559: {'lr': 7.305340661024712e-05, 'samples': 19230720, 'steps': 37559, 'batch_loss/train': 0.6464865081943572}
12/27/2021 21:54:32 - INFO - codeparrot_training - Step 37560: {'lr': 7.304227146568993e-05, 'samples': 19231232, 'steps': 37560, 'batch_loss/train': 0.7058068260084838}
12/27/2021 21:54:45 - INFO - codeparrot_training - Step 37561: {'lr': 7.303113702464991e-05, 'samples': 19231744, 'steps': 37561, 'batch_loss/train': 0.4866297041880898}
12/27/2021 21:54:55 - INFO - codeparrot_training - Step 37562: {'lr': 7.302000328717135e-05, 'samples': 19232256, 'steps': 37562, 'batch_loss/train': 0.7047883830964565}
12/27/2021 21:55:06 - INFO - codeparrot_training - Step 37563: {'lr': 7.300887025329835e-05, 'samples': 19232768, 'steps': 37563, 'batch_loss/train': 0.8859242666512728}
12/27/2021 21:55:18 - INFO - codeparrot_training - Step 37564: {'lr': 7.299773792307526e-05, 'samples': 19233280, 'steps': 37564, 'batch_loss/train': 0.5855559476767667}
12/27/2021 21:55:29 - INFO - codeparrot_training - Step 37565: {'lr': 7.298660629654636e-05, 'samples': 19233792, 'steps': 37565, 'batch_loss/train': 0.6790951592847705}
12/27/2021 21:55:39 - INFO - codeparrot_training - Step 37566: {'lr': 7.297547537375585e-05, 'samples': 19234304, 'steps': 37566, 'batch_loss/train': 0.6899379999376833}
12/27/2021 21:55:53 - INFO - codeparrot_training - Step 37567: {'lr': 7.296434515474798e-05, 'samples': 19234816, 'steps': 37567, 'batch_loss/train': 0.7386481831781566}
12/27/2021 21:56:04 - INFO - codeparrot_training - Step 37568: {'lr': 7.295321563956705e-05, 'samples': 19235328, 'steps': 37568, 'batch_loss/train': 0.7554440045496449}
12/27/2021 21:56:14 - INFO - codeparrot_training - Step 37569: {'lr': 7.294208682825732e-05, 'samples': 19235840, 'steps': 37569, 'batch_loss/train': 0.744614286813885}
12/27/2021 21:56:26 - INFO - codeparrot_training - Step 37570: {'lr': 7.293095872086295e-05, 'samples': 19236352, 'steps': 37570, 'batch_loss/train': 0.6846874626353383}
12/27/2021 21:56:37 - INFO - codeparrot_training - Step 37571: {'lr': 7.291983131742824e-05, 'samples': 19236864, 'steps': 37571, 'batch_loss/train': 0.6575920606846921}
12/27/2021 21:56:48 - INFO - codeparrot_training - Step 37572: {'lr': 7.290870461799743e-05, 'samples': 19237376, 'steps': 37572, 'batch_loss/train': 0.6942357271909714}
12/27/2021 21:56:58 - INFO - codeparrot_training - Step 37573: {'lr': 7.289757862261482e-05, 'samples': 19237888, 'steps': 37573, 'batch_loss/train': 0.6445488701574504}
12/27/2021 21:57:11 - INFO - codeparrot_training - Step 37574: {'lr': 7.288645333132449e-05, 'samples': 19238400, 'steps': 37574, 'batch_loss/train': 0.6611454666126519}
12/27/2021 21:57:21 - INFO - codeparrot_training - Step 37575: {'lr': 7.287532874417066e-05, 'samples': 19238912, 'steps': 37575, 'batch_loss/train': 0.8309986991807818}
12/27/2021 21:57:32 - INFO - codeparrot_training - Step 37576: {'lr': 7.28642048611978e-05, 'samples': 19239424, 'steps': 37576, 'batch_loss/train': 0.7361975880339742}
12/27/2021 21:57:46 - INFO - codeparrot_training - Step 37577: {'lr': 7.28530816824499e-05, 'samples': 19239936, 'steps': 37577, 'batch_loss/train': 0.7672217441722751}
12/27/2021 21:57:57 - INFO - codeparrot_training - Step 37578: {'lr': 7.284195920797118e-05, 'samples': 19240448, 'steps': 37578, 'batch_loss/train': 0.6942449612542987}
12/27/2021 21:58:07 - INFO - codeparrot_training - Step 37579: {'lr': 7.283083743780608e-05, 'samples': 19240960, 'steps': 37579, 'batch_loss/train': 0.7059304309077561}
12/27/2021 21:58:19 - INFO - codeparrot_training - Step 37580: {'lr': 7.281971637199861e-05, 'samples': 19241472, 'steps': 37580, 'batch_loss/train': 0.7555155249428935}
12/27/2021 21:58:30 - INFO - codeparrot_training - Step 37581: {'lr': 7.280859601059306e-05, 'samples': 19241984, 'steps': 37581, 'batch_loss/train': 0.6063072067336179}
12/27/2021 21:58:41 - INFO - codeparrot_training - Step 37582: {'lr': 7.279747635363362e-05, 'samples': 19242496, 'steps': 37582, 'batch_loss/train': 0.611586266575614}
12/27/2021 21:58:53 - INFO - codeparrot_training - Step 37583: {'lr': 7.27863574011645e-05, 'samples': 19243008, 'steps': 37583, 'batch_loss/train': 0.7405858935671858}
12/27/2021 21:59:04 - INFO - codeparrot_training - Step 37584: {'lr': 7.27752391532299e-05, 'samples': 19243520, 'steps': 37584, 'batch_loss/train': 0.6803420479409397}
12/27/2021 21:59:14 - INFO - codeparrot_training - Step 37585: {'lr': 7.276412160987412e-05, 'samples': 19244032, 'steps': 37585, 'batch_loss/train': 0.6188627304509282}
12/27/2021 21:59:25 - INFO - codeparrot_training - Step 37586: {'lr': 7.275300477114111e-05, 'samples': 19244544, 'steps': 37586, 'batch_loss/train': 0.9360009585507214}
12/27/2021 21:59:39 - INFO - codeparrot_training - Step 37587: {'lr': 7.274188863707532e-05, 'samples': 19245056, 'steps': 37587, 'batch_loss/train': 1.3166502742096782}
12/27/2021 21:59:50 - INFO - codeparrot_training - Step 37588: {'lr': 7.273077320772093e-05, 'samples': 19245568, 'steps': 37588, 'batch_loss/train': 0.7582199187017977}
12/27/2021 22:00:00 - INFO - codeparrot_training - Step 37589: {'lr': 7.271965848312196e-05, 'samples': 19246080, 'steps': 37589, 'batch_loss/train': 0.7833939585834742}
12/27/2021 22:00:12 - INFO - codeparrot_training - Step 37590: {'lr': 7.270854446332268e-05, 'samples': 19246592, 'steps': 37590, 'batch_loss/train': 0.6724503021687269}
12/27/2021 22:00:23 - INFO - codeparrot_training - Step 37591: {'lr': 7.26974311483673e-05, 'samples': 19247104, 'steps': 37591, 'batch_loss/train': 0.7752807168290019}
12/27/2021 22:00:33 - INFO - codeparrot_training - Step 37592: {'lr': 7.268631853829999e-05, 'samples': 19247616, 'steps': 37592, 'batch_loss/train': 0.6965697288978845}
12/27/2021 22:00:47 - INFO - codeparrot_training - Step 37593: {'lr': 7.26752066331649e-05, 'samples': 19248128, 'steps': 37593, 'batch_loss/train': 0.8069468680769205}
12/27/2021 22:00:58 - INFO - codeparrot_training - Step 37594: {'lr': 7.266409543300626e-05, 'samples': 19248640, 'steps': 37594, 'batch_loss/train': 0.7973256204277277}
12/27/2021 22:01:09 - INFO - codeparrot_training - Step 37595: {'lr': 7.265298493786821e-05, 'samples': 19249152, 'steps': 37595, 'batch_loss/train': 0.7728339759632945}
12/27/2021 22:01:19 - INFO - codeparrot_training - Step 37596: {'lr': 7.264187514779494e-05, 'samples': 19249664, 'steps': 37596, 'batch_loss/train': 0.7464733277447522}
12/27/2021 22:01:31 - INFO - codeparrot_training - Step 37597: {'lr': 7.263076606283059e-05, 'samples': 19250176, 'steps': 37597, 'batch_loss/train': 0.7524815900251269}
12/27/2021 22:01:42 - INFO - codeparrot_training - Step 37598: {'lr': 7.261965768301934e-05, 'samples': 19250688, 'steps': 37598, 'batch_loss/train': 0.7356204567477107}
12/27/2021 22:01:53 - INFO - codeparrot_training - Step 37599: {'lr': 7.260855000840544e-05, 'samples': 19251200, 'steps': 37599, 'batch_loss/train': 0.8037867397069931}
12/27/2021 22:02:05 - INFO - codeparrot_training - Step 37600: {'lr': 7.25974430390329e-05, 'samples': 19251712, 'steps': 37600, 'batch_loss/train': 0.6588559368974529}
12/27/2021 22:02:16 - INFO - codeparrot_training - Step 37601: {'lr': 7.258633677494583e-05, 'samples': 19252224, 'steps': 37601, 'batch_loss/train': 0.8099284525960684}
12/27/2021 22:02:26 - INFO - codeparrot_training - Step 37602: {'lr': 7.257523121618867e-05, 'samples': 19252736, 'steps': 37602, 'batch_loss/train': 0.8148653141688555}
12/27/2021 22:02:39 - INFO - codeparrot_training - Step 37603: {'lr': 7.256412636280532e-05, 'samples': 19253248, 'steps': 37603, 'batch_loss/train': 0.7645574510097504}
12/27/2021 22:02:50 - INFO - codeparrot_training - Step 37604: {'lr': 7.255302221483995e-05, 'samples': 19253760, 'steps': 37604, 'batch_loss/train': 1.0074940277263522}
12/27/2021 22:03:00 - INFO - codeparrot_training - Step 37605: {'lr': 7.25419187723369e-05, 'samples': 19254272, 'steps': 37605, 'batch_loss/train': 0.950430910103023}
12/27/2021 22:03:11 - INFO - codeparrot_training - Step 37606: {'lr': 7.253081603534009e-05, 'samples': 19254784, 'steps': 37606, 'batch_loss/train': 0.7351180650293827}
12/27/2021 22:03:25 - INFO - codeparrot_training - Step 37607: {'lr': 7.251971400389376e-05, 'samples': 19255296, 'steps': 37607, 'batch_loss/train': 0.7041830664966255}
12/27/2021 22:03:36 - INFO - codeparrot_training - Step 37608: {'lr': 7.250861267804201e-05, 'samples': 19255808, 'steps': 37608, 'batch_loss/train': 0.7318011987954378}
12/27/2021 22:03:46 - INFO - codeparrot_training - Step 37609: {'lr': 7.249751205782901e-05, 'samples': 19256320, 'steps': 37609, 'batch_loss/train': 0.6468901163898408}
12/27/2021 22:03:58 - INFO - codeparrot_training - Step 37610: {'lr': 7.24864121432989e-05, 'samples': 19256832, 'steps': 37610, 'batch_loss/train': 0.8126379838213325}
12/27/2021 22:04:09 - INFO - codeparrot_training - Step 37611: {'lr': 7.247531293449583e-05, 'samples': 19257344, 'steps': 37611, 'batch_loss/train': 0.7302650075871497}
12/27/2021 22:04:20 - INFO - codeparrot_training - Step 37612: {'lr': 7.246421443146376e-05, 'samples': 19257856, 'steps': 37612, 'batch_loss/train': 0.7382661742158234}
12/27/2021 22:04:32 - INFO - codeparrot_training - Step 37613: {'lr': 7.2453116634247e-05, 'samples': 19258368, 'steps': 37613, 'batch_loss/train': 0.6492467033676803}
12/27/2021 22:04:43 - INFO - codeparrot_training - Step 37614: {'lr': 7.244201954288971e-05, 'samples': 19258880, 'steps': 37614, 'batch_loss/train': 0.6949067420791835}
12/27/2021 22:04:53 - INFO - codeparrot_training - Step 37615: {'lr': 7.243092315743572e-05, 'samples': 19259392, 'steps': 37615, 'batch_loss/train': 0.6511425985372625}
12/27/2021 22:05:07 - INFO - codeparrot_training - Step 37616: {'lr': 7.241982747792941e-05, 'samples': 19259904, 'steps': 37616, 'batch_loss/train': 0.8011859068647027}
12/27/2021 22:05:18 - INFO - codeparrot_training - Step 37617: {'lr': 7.24087325044149e-05, 'samples': 19260416, 'steps': 37617, 'batch_loss/train': 0.6427650232799351}
12/27/2021 22:05:28 - INFO - codeparrot_training - Step 37618: {'lr': 7.239763823693615e-05, 'samples': 19260928, 'steps': 37618, 'batch_loss/train': 0.7219798604492098}
12/27/2021 22:05:39 - INFO - codeparrot_training - Step 37619: {'lr': 7.238654467553732e-05, 'samples': 19261440, 'steps': 37619, 'batch_loss/train': 0.7229154612869024}
12/27/2021 22:05:51 - INFO - codeparrot_training - Step 37620: {'lr': 7.237545182026253e-05, 'samples': 19261952, 'steps': 37620, 'batch_loss/train': 0.6866389036295004}
12/27/2021 22:06:02 - INFO - codeparrot_training - Step 37621: {'lr': 7.236435967115587e-05, 'samples': 19262464, 'steps': 37621, 'batch_loss/train': 0.704981941729784}
12/27/2021 22:06:12 - INFO - codeparrot_training - Step 37622: {'lr': 7.235326822826144e-05, 'samples': 19262976, 'steps': 37622, 'batch_loss/train': 0.7946238797158003}
12/27/2021 22:06:27 - INFO - codeparrot_training - Step 37623: {'lr': 7.234217749162336e-05, 'samples': 19263488, 'steps': 37623, 'batch_loss/train': 0.7669832520186901}
12/27/2021 22:06:37 - INFO - codeparrot_training - Step 37624: {'lr': 7.233108746128566e-05, 'samples': 19264000, 'steps': 37624, 'batch_loss/train': 0.6352514360332862}
12/27/2021 22:06:48 - INFO - codeparrot_training - Step 37625: {'lr': 7.23199981372926e-05, 'samples': 19264512, 'steps': 37625, 'batch_loss/train': 0.6720801843330264}
12/27/2021 22:07:00 - INFO - codeparrot_training - Step 37626: {'lr': 7.230890951968802e-05, 'samples': 19265024, 'steps': 37626, 'batch_loss/train': 0.4913104111328721}
12/27/2021 22:07:11 - INFO - codeparrot_training - Step 37627: {'lr': 7.229782160851606e-05, 'samples': 19265536, 'steps': 37627, 'batch_loss/train': 0.5031326857279055}
12/27/2021 22:07:21 - INFO - codeparrot_training - Step 37628: {'lr': 7.228673440382102e-05, 'samples': 19266048, 'steps': 37628, 'batch_loss/train': 0.5517230805126019}
12/27/2021 22:07:32 - INFO - codeparrot_training - Step 37629: {'lr': 7.227564790564676e-05, 'samples': 19266560, 'steps': 37629, 'batch_loss/train': 0.5019021203916054}
12/27/2021 22:07:44 - INFO - codeparrot_training - Step 37630: {'lr': 7.226456211403728e-05, 'samples': 19267072, 'steps': 37630, 'batch_loss/train': 0.6399957410467323}
12/27/2021 22:07:55 - INFO - codeparrot_training - Step 37631: {'lr': 7.225347702903701e-05, 'samples': 19267584, 'steps': 37631, 'batch_loss/train': 0.5975048750115093}
12/27/2021 22:08:05 - INFO - codeparrot_training - Step 37632: {'lr': 7.224239265068969e-05, 'samples': 19268096, 'steps': 37632, 'batch_loss/train': 0.7233954169787467}
12/27/2021 22:08:17 - INFO - codeparrot_training - Step 37633: {'lr': 7.223130897903952e-05, 'samples': 19268608, 'steps': 37633, 'batch_loss/train': 0.8030933570116758}
12/27/2021 22:08:28 - INFO - codeparrot_training - Step 37634: {'lr': 7.222022601413053e-05, 'samples': 19269120, 'steps': 37634, 'batch_loss/train': 0.5719752920267638}
12/27/2021 22:08:39 - INFO - codeparrot_training - Step 37635: {'lr': 7.220914375600679e-05, 'samples': 19269632, 'steps': 37635, 'batch_loss/train': 0.7456176830455661}
12/27/2021 22:08:53 - INFO - codeparrot_training - Step 37636: {'lr': 7.219806220471237e-05, 'samples': 19270144, 'steps': 37636, 'batch_loss/train': 0.7538625374436378}
12/27/2021 22:09:03 - INFO - codeparrot_training - Step 37637: {'lr': 7.218698136029139e-05, 'samples': 19270656, 'steps': 37637, 'batch_loss/train': 0.7327590929344296}
12/27/2021 22:09:14 - INFO - codeparrot_training - Step 37638: {'lr': 7.217590122278769e-05, 'samples': 19271168, 'steps': 37638, 'batch_loss/train': 0.7790881767868996}
12/27/2021 22:09:26 - INFO - codeparrot_training - Step 37639: {'lr': 7.216482179224554e-05, 'samples': 19271680, 'steps': 37639, 'batch_loss/train': 0.7020890226121992}
12/27/2021 22:09:36 - INFO - codeparrot_training - Step 37640: {'lr': 7.215374306870898e-05, 'samples': 19272192, 'steps': 37640, 'batch_loss/train': 0.7570943580940366}
12/27/2021 22:09:47 - INFO - codeparrot_training - Step 37641: {'lr': 7.214266505222183e-05, 'samples': 19272704, 'steps': 37641, 'batch_loss/train': 0.7487537865526974}
12/27/2021 22:09:58 - INFO - codeparrot_training - Step 37642: {'lr': 7.213158774282836e-05, 'samples': 19273216, 'steps': 37642, 'batch_loss/train': 0.745124832727015}
12/27/2021 22:10:10 - INFO - codeparrot_training - Step 37643: {'lr': 7.212051114057264e-05, 'samples': 19273728, 'steps': 37643, 'batch_loss/train': 0.7072840123437345}
12/27/2021 22:10:20 - INFO - codeparrot_training - Step 37644: {'lr': 7.210943524549848e-05, 'samples': 19274240, 'steps': 37644, 'batch_loss/train': 0.6029975588899106}
12/27/2021 22:10:31 - INFO - codeparrot_training - Step 37645: {'lr': 7.209836005765008e-05, 'samples': 19274752, 'steps': 37645, 'batch_loss/train': 0.826749118976295}
12/27/2021 22:10:45 - INFO - codeparrot_training - Step 37646: {'lr': 7.208728557707139e-05, 'samples': 19275264, 'steps': 37646, 'batch_loss/train': 0.6495822072029114}
12/27/2021 22:10:56 - INFO - codeparrot_training - Step 37647: {'lr': 7.207621180380653e-05, 'samples': 19275776, 'steps': 37647, 'batch_loss/train': 0.6877372634480707}
12/27/2021 22:11:06 - INFO - codeparrot_training - Step 37648: {'lr': 7.20651387378994e-05, 'samples': 19276288, 'steps': 37648, 'batch_loss/train': 0.8380974140018225}
12/27/2021 22:11:18 - INFO - codeparrot_training - Step 37649: {'lr': 7.205406637939415e-05, 'samples': 19276800, 'steps': 37649, 'batch_loss/train': 0.6473179887980223}
12/27/2021 22:11:29 - INFO - codeparrot_training - Step 37650: {'lr': 7.204299472833472e-05, 'samples': 19277312, 'steps': 37650, 'batch_loss/train': 0.6728856889531016}
12/27/2021 22:11:40 - INFO - codeparrot_training - Step 37651: {'lr': 7.203192378476514e-05, 'samples': 19277824, 'steps': 37651, 'batch_loss/train': 0.5991701949387789}
12/27/2021 22:11:52 - INFO - codeparrot_training - Step 37652: {'lr': 7.202085354872945e-05, 'samples': 19278336, 'steps': 37652, 'batch_loss/train': 0.7316712076426484}
12/27/2021 22:12:03 - INFO - codeparrot_training - Step 37653: {'lr': 7.200978402027165e-05, 'samples': 19278848, 'steps': 37653, 'batch_loss/train': 0.6700221430510283}
12/27/2021 22:12:13 - INFO - codeparrot_training - Step 37654: {'lr': 7.199871519943579e-05, 'samples': 19279360, 'steps': 37654, 'batch_loss/train': 0.7820111168548465}
12/27/2021 22:12:24 - INFO - codeparrot_training - Step 37655: {'lr': 7.198764708626576e-05, 'samples': 19279872, 'steps': 37655, 'batch_loss/train': 0.7692466708831489}
12/27/2021 22:12:38 - INFO - codeparrot_training - Step 37656: {'lr': 7.197657968080557e-05, 'samples': 19280384, 'steps': 37656, 'batch_loss/train': 0.6207343883579597}
12/27/2021 22:12:48 - INFO - codeparrot_training - Step 37657: {'lr': 7.196551298309942e-05, 'samples': 19280896, 'steps': 37657, 'batch_loss/train': 0.6026344685815275}
12/27/2021 22:12:59 - INFO - codeparrot_training - Step 37658: {'lr': 7.195444699319109e-05, 'samples': 19281408, 'steps': 37658, 'batch_loss/train': 0.6329897353425622}
12/27/2021 22:13:11 - INFO - codeparrot_training - Step 37659: {'lr': 7.194338171112466e-05, 'samples': 19281920, 'steps': 37659, 'batch_loss/train': 0.6560767454793677}
12/27/2021 22:13:22 - INFO - codeparrot_training - Step 37660: {'lr': 7.193231713694412e-05, 'samples': 19282432, 'steps': 37660, 'batch_loss/train': 0.8935585087165236}
12/27/2021 22:13:32 - INFO - codeparrot_training - Step 37661: {'lr': 7.192125327069343e-05, 'samples': 19282944, 'steps': 37661, 'batch_loss/train': 0.6987595846876502}
12/27/2021 22:13:43 - INFO - codeparrot_training - Step 37662: {'lr': 7.191019011241662e-05, 'samples': 19283456, 'steps': 37662, 'batch_loss/train': 0.9598379447124898}
12/27/2021 22:13:57 - INFO - codeparrot_training - Step 37663: {'lr': 7.18991276621577e-05, 'samples': 19283968, 'steps': 37663, 'batch_loss/train': 0.7476274846121669}
12/27/2021 22:14:08 - INFO - codeparrot_training - Step 37664: {'lr': 7.188806591996048e-05, 'samples': 19284480, 'steps': 37664, 'batch_loss/train': 0.7337194085121155}
12/27/2021 22:14:18 - INFO - codeparrot_training - Step 37665: {'lr': 7.18770048858691e-05, 'samples': 19284992, 'steps': 37665, 'batch_loss/train': 0.7539177192375064}
12/27/2021 22:14:30 - INFO - codeparrot_training - Step 37666: {'lr': 7.186594455992759e-05, 'samples': 19285504, 'steps': 37666, 'batch_loss/train': 0.6856266758404672}
12/27/2021 22:14:41 - INFO - codeparrot_training - Step 37667: {'lr': 7.185488494217968e-05, 'samples': 19286016, 'steps': 37667, 'batch_loss/train': 0.7234894565772265}
12/27/2021 22:14:52 - INFO - codeparrot_training - Step 37668: {'lr': 7.184382603266953e-05, 'samples': 19286528, 'steps': 37668, 'batch_loss/train': 0.704360100440681}
12/27/2021 22:15:04 - INFO - codeparrot_training - Step 37669: {'lr': 7.183276783144116e-05, 'samples': 19287040, 'steps': 37669, 'batch_loss/train': 0.7820108300074935}
12/27/2021 22:15:14 - INFO - codeparrot_training - Step 37670: {'lr': 7.182171033853832e-05, 'samples': 19287552, 'steps': 37670, 'batch_loss/train': 0.678535025101155}
12/27/2021 22:15:25 - INFO - codeparrot_training - Step 37671: {'lr': 7.181065355400512e-05, 'samples': 19288064, 'steps': 37671, 'batch_loss/train': 0.735181087278761}
12/27/2021 22:15:37 - INFO - codeparrot_training - Step 37672: {'lr': 7.179959747788545e-05, 'samples': 19288576, 'steps': 37672, 'batch_loss/train': 0.7054260922595859}
12/27/2021 22:15:48 - INFO - codeparrot_training - Step 37673: {'lr': 7.178854211022331e-05, 'samples': 19289088, 'steps': 37673, 'batch_loss/train': 0.6441681855358183}
12/27/2021 22:15:58 - INFO - codeparrot_training - Step 37674: {'lr': 7.177748745106263e-05, 'samples': 19289600, 'steps': 37674, 'batch_loss/train': 0.8051113532856107}
12/27/2021 22:16:09 - INFO - codeparrot_training - Step 37675: {'lr': 7.17664335004474e-05, 'samples': 19290112, 'steps': 37675, 'batch_loss/train': 0.7336748787201941}
12/27/2021 22:16:23 - INFO - codeparrot_training - Step 37676: {'lr': 7.175538025842149e-05, 'samples': 19290624, 'steps': 37676, 'batch_loss/train': 0.7427348792552948}
12/27/2021 22:16:34 - INFO - codeparrot_training - Step 37677: {'lr': 7.17443277250289e-05, 'samples': 19291136, 'steps': 37677, 'batch_loss/train': 0.677530046319589}
12/27/2021 22:16:44 - INFO - codeparrot_training - Step 37678: {'lr': 7.173327590031355e-05, 'samples': 19291648, 'steps': 37678, 'batch_loss/train': 0.7553304601460695}
12/27/2021 22:16:56 - INFO - codeparrot_training - Step 37679: {'lr': 7.172222478431939e-05, 'samples': 19292160, 'steps': 37679, 'batch_loss/train': 0.6300845937803388}
12/27/2021 22:17:07 - INFO - codeparrot_training - Step 37680: {'lr': 7.171117437709044e-05, 'samples': 19292672, 'steps': 37680, 'batch_loss/train': 0.6250209819991142}
12/27/2021 22:17:18 - INFO - codeparrot_training - Step 37681: {'lr': 7.170012467867046e-05, 'samples': 19293184, 'steps': 37681, 'batch_loss/train': 0.6950753229903057}
12/27/2021 22:17:30 - INFO - codeparrot_training - Step 37682: {'lr': 7.168907568910335e-05, 'samples': 19293696, 'steps': 37682, 'batch_loss/train': 0.7403335068374872}
12/27/2021 22:17:40 - INFO - codeparrot_training - Step 37683: {'lr': 7.167802740843335e-05, 'samples': 19294208, 'steps': 37683, 'batch_loss/train': 0.659871599636972}
12/27/2021 22:17:51 - INFO - codeparrot_training - Step 37684: {'lr': 7.166697983670406e-05, 'samples': 19294720, 'steps': 37684, 'batch_loss/train': 0.6643612897023559}
12/27/2021 22:18:04 - INFO - codeparrot_training - Step 37685: {'lr': 7.165593297395955e-05, 'samples': 19295232, 'steps': 37685, 'batch_loss/train': 0.7913331035524607}
12/27/2021 22:18:15 - INFO - codeparrot_training - Step 37686: {'lr': 7.16448868202437e-05, 'samples': 19295744, 'steps': 37686, 'batch_loss/train': 0.6877202056348324}
12/27/2021 22:18:25 - INFO - codeparrot_training - Step 37687: {'lr': 7.163384137560044e-05, 'samples': 19296256, 'steps': 37687, 'batch_loss/train': 0.7970444988459349}
12/27/2021 22:18:36 - INFO - codeparrot_training - Step 37688: {'lr': 7.162279664007368e-05, 'samples': 19296768, 'steps': 37688, 'batch_loss/train': 0.7114815809763968}
12/27/2021 22:18:48 - INFO - codeparrot_training - Step 37689: {'lr': 7.16117526137073e-05, 'samples': 19297280, 'steps': 37689, 'batch_loss/train': 0.7132863039150834}
12/27/2021 22:18:59 - INFO - codeparrot_training - Step 37690: {'lr': 7.160070929654528e-05, 'samples': 19297792, 'steps': 37690, 'batch_loss/train': 0.7599714146926999}
12/27/2021 22:19:09 - INFO - codeparrot_training - Step 37691: {'lr': 7.158966668863146e-05, 'samples': 19298304, 'steps': 37691, 'batch_loss/train': 0.7192160994745791}
12/27/2021 22:19:23 - INFO - codeparrot_training - Step 37692: {'lr': 7.157862479000985e-05, 'samples': 19298816, 'steps': 37692, 'batch_loss/train': 0.5993905893701594}
12/27/2021 22:19:34 - INFO - codeparrot_training - Step 37693: {'lr': 7.156758360072408e-05, 'samples': 19299328, 'steps': 37693, 'batch_loss/train': 0.7135406763118226}
12/27/2021 22:19:45 - INFO - codeparrot_training - Step 37694: {'lr': 7.15565431208183e-05, 'samples': 19299840, 'steps': 37694, 'batch_loss/train': 0.6708690254017711}
12/27/2021 22:19:57 - INFO - codeparrot_training - Step 37695: {'lr': 7.154550335033641e-05, 'samples': 19300352, 'steps': 37695, 'batch_loss/train': 0.6873482265509665}
12/27/2021 22:20:07 - INFO - codeparrot_training - Step 37696: {'lr': 7.153446428932214e-05, 'samples': 19300864, 'steps': 37696, 'batch_loss/train': 0.8050324595533311}
12/27/2021 22:20:18 - INFO - codeparrot_training - Step 37697: {'lr': 7.152342593781941e-05, 'samples': 19301376, 'steps': 37697, 'batch_loss/train': 0.6872134949080646}
12/27/2021 22:20:29 - INFO - codeparrot_training - Step 37698: {'lr': 7.151238829587225e-05, 'samples': 19301888, 'steps': 37698, 'batch_loss/train': 0.7039257753640413}
12/27/2021 22:20:41 - INFO - codeparrot_training - Step 37699: {'lr': 7.15013513635244e-05, 'samples': 19302400, 'steps': 37699, 'batch_loss/train': 0.7412661481648684}
12/27/2021 22:20:51 - INFO - codeparrot_training - Step 37700: {'lr': 7.149031514081977e-05, 'samples': 19302912, 'steps': 37700, 'batch_loss/train': 1.2497080489993095}
12/27/2021 22:21:02 - INFO - codeparrot_training - Step 37701: {'lr': 7.147927962780223e-05, 'samples': 19303424, 'steps': 37701, 'batch_loss/train': 1.0254977969452739}
12/27/2021 22:21:16 - INFO - codeparrot_training - Step 37702: {'lr': 7.14682448245157e-05, 'samples': 19303936, 'steps': 37702, 'batch_loss/train': 0.683778828009963}
12/27/2021 22:21:27 - INFO - codeparrot_training - Step 37703: {'lr': 7.145721073100398e-05, 'samples': 19304448, 'steps': 37703, 'batch_loss/train': 0.776597942225635}
12/27/2021 22:21:37 - INFO - codeparrot_training - Step 37704: {'lr': 7.144617734731098e-05, 'samples': 19304960, 'steps': 37704, 'batch_loss/train': 0.7421112223528326}
12/27/2021 22:21:49 - INFO - codeparrot_training - Step 37705: {'lr': 7.143514467348059e-05, 'samples': 19305472, 'steps': 37705, 'batch_loss/train': 0.7349345623515546}
12/27/2021 22:22:00 - INFO - codeparrot_training - Step 37706: {'lr': 7.142411270955663e-05, 'samples': 19305984, 'steps': 37706, 'batch_loss/train': 0.6893097334832419}
12/27/2021 22:22:11 - INFO - codeparrot_training - Step 37707: {'lr': 7.1413081455583e-05, 'samples': 19306496, 'steps': 37707, 'batch_loss/train': 0.762250856962055}
12/27/2021 22:22:21 - INFO - codeparrot_training - Step 37708: {'lr': 7.140205091160343e-05, 'samples': 19307008, 'steps': 37708, 'batch_loss/train': 0.7801965046674013}
12/27/2021 22:22:34 - INFO - codeparrot_training - Step 37709: {'lr': 7.139102107766199e-05, 'samples': 19307520, 'steps': 37709, 'batch_loss/train': 0.7736815912649035}
12/27/2021 22:22:44 - INFO - codeparrot_training - Step 37710: {'lr': 7.137999195380235e-05, 'samples': 19308032, 'steps': 37710, 'batch_loss/train': 0.757076499518007}
12/27/2021 22:22:55 - INFO - codeparrot_training - Step 37711: {'lr': 7.136896354006841e-05, 'samples': 19308544, 'steps': 37711, 'batch_loss/train': 0.6630944402422756}
12/27/2021 22:23:07 - INFO - codeparrot_training - Step 37712: {'lr': 7.135793583650405e-05, 'samples': 19309056, 'steps': 37712, 'batch_loss/train': 0.6876103011891246}
12/27/2021 22:23:18 - INFO - codeparrot_training - Step 37713: {'lr': 7.134690884315306e-05, 'samples': 19309568, 'steps': 37713, 'batch_loss/train': 0.651910643093288}
12/27/2021 22:23:29 - INFO - codeparrot_training - Step 37714: {'lr': 7.133588256005929e-05, 'samples': 19310080, 'steps': 37714, 'batch_loss/train': 0.6657103085890412}
12/27/2021 22:23:43 - INFO - codeparrot_training - Step 37715: {'lr': 7.132485698726663e-05, 'samples': 19310592, 'steps': 37715, 'batch_loss/train': 0.7529480773955584}
12/27/2021 22:23:54 - INFO - codeparrot_training - Step 37716: {'lr': 7.131383212481884e-05, 'samples': 19311104, 'steps': 37716, 'batch_loss/train': 0.7461924776434898}
12/27/2021 22:24:04 - INFO - codeparrot_training - Step 37717: {'lr': 7.130280797275979e-05, 'samples': 19311616, 'steps': 37717, 'batch_loss/train': 0.691386794205755}
12/27/2021 22:24:16 - INFO - codeparrot_training - Step 37718: {'lr': 7.129178453113338e-05, 'samples': 19312128, 'steps': 37718, 'batch_loss/train': 0.6811860461020842}
12/27/2021 22:24:27 - INFO - codeparrot_training - Step 37719: {'lr': 7.128076179998318e-05, 'samples': 19312640, 'steps': 37719, 'batch_loss/train': 0.7147763656685129}
12/27/2021 22:24:38 - INFO - codeparrot_training - Step 37720: {'lr': 7.126973977935331e-05, 'samples': 19313152, 'steps': 37720, 'batch_loss/train': 0.8041354706510901}
12/27/2021 22:24:48 - INFO - codeparrot_training - Step 37721: {'lr': 7.125871846928752e-05, 'samples': 19313664, 'steps': 37721, 'batch_loss/train': 0.806397957727313}
12/27/2021 22:25:00 - INFO - codeparrot_training - Step 37722: {'lr': 7.124769786982951e-05, 'samples': 19314176, 'steps': 37722, 'batch_loss/train': 0.69436871772632}
12/27/2021 22:25:11 - INFO - codeparrot_training - Step 37723: {'lr': 7.12366779810231e-05, 'samples': 19314688, 'steps': 37723, 'batch_loss/train': 0.7383195860311389}
12/27/2021 22:25:22 - INFO - codeparrot_training - Step 37724: {'lr': 7.122565880291229e-05, 'samples': 19315200, 'steps': 37724, 'batch_loss/train': 0.7881569415330887}
12/27/2021 22:25:35 - INFO - codeparrot_training - Step 37725: {'lr': 7.121464033554068e-05, 'samples': 19315712, 'steps': 37725, 'batch_loss/train': 0.7418952230364084}
12/27/2021 22:25:46 - INFO - codeparrot_training - Step 37726: {'lr': 7.120362257895215e-05, 'samples': 19316224, 'steps': 37726, 'batch_loss/train': 0.7400654200464487}
12/27/2021 22:25:57 - INFO - codeparrot_training - Step 37727: {'lr': 7.119260553319052e-05, 'samples': 19316736, 'steps': 37727, 'batch_loss/train': 0.644485330209136}
12/27/2021 22:26:09 - INFO - codeparrot_training - Step 37728: {'lr': 7.118158919829956e-05, 'samples': 19317248, 'steps': 37728, 'batch_loss/train': 1.5245349258184433}
12/27/2021 22:26:20 - INFO - codeparrot_training - Step 37729: {'lr': 7.117057357432307e-05, 'samples': 19317760, 'steps': 37729, 'batch_loss/train': 0.7088679615408182}
12/27/2021 22:26:30 - INFO - codeparrot_training - Step 37730: {'lr': 7.115955866130488e-05, 'samples': 19318272, 'steps': 37730, 'batch_loss/train': 0.7643063925206661}
12/27/2021 22:26:44 - INFO - codeparrot_training - Step 37731: {'lr': 7.114854445928876e-05, 'samples': 19318784, 'steps': 37731, 'batch_loss/train': 0.7451301638502628}
12/27/2021 22:26:55 - INFO - codeparrot_training - Step 37732: {'lr': 7.113753096831849e-05, 'samples': 19319296, 'steps': 37732, 'batch_loss/train': 0.7505029446911067}
12/27/2021 22:27:05 - INFO - codeparrot_training - Step 37733: {'lr': 7.112651818843793e-05, 'samples': 19319808,
'steps': 37733, 'batch_loss/train': 0.6421818550443277} 12/27/2021 22:27:16 - INFO - codeparrot_training - Step 37734: {'lr': 7.111550611969064e-05, 'samples': 19320320, 'steps': 37734, 'batch_loss/train': 0.7656396338716149} 12/27/2021 22:27:28 - INFO - codeparrot_training - Step 37735: {'lr': 7.110449476212072e-05, 'samples': 19320832, 'steps': 37735, 'batch_loss/train': 0.7247192589566112} 12/27/2021 22:27:39 - INFO - codeparrot_training - Step 37736: {'lr': 7.109348411577171e-05, 'samples': 19321344, 'steps': 37736, 'batch_loss/train': 0.7669390579685569} 12/27/2021 22:27:49 - INFO - codeparrot_training - Step 37737: {'lr': 7.108247418068744e-05, 'samples': 19321856, 'steps': 37737, 'batch_loss/train': 0.6649328228086233} 12/27/2021 22:28:01 - INFO - codeparrot_training - Step 37738: {'lr': 7.107146495691169e-05, 'samples': 19322368, 'steps': 37738, 'batch_loss/train': 0.6387833282351494} 12/27/2021 22:28:12 - INFO - codeparrot_training - Step 37739: {'lr': 7.106045644448825e-05, 'samples': 19322880, 'steps': 37739, 'batch_loss/train': 0.736434318125248} 12/27/2021 22:28:23 - INFO - codeparrot_training - Step 37740: {'lr': 7.104944864346086e-05, 'samples': 19323392, 'steps': 37740, 'batch_loss/train': 0.7671254118904471} 12/27/2021 22:28:37 - INFO - codeparrot_training - Step 37741: {'lr': 7.103844155387329e-05, 'samples': 19323904, 'steps': 37741, 'batch_loss/train': 0.6909619951620698} 12/27/2021 22:28:47 - INFO - codeparrot_training - Step 37742: {'lr': 7.102743517576932e-05, 'samples': 19324416, 'steps': 37742, 'batch_loss/train': 0.7159868897870183} 12/27/2021 22:28:58 - INFO - codeparrot_training - Step 37743: {'lr': 7.101642950919266e-05, 'samples': 19324928, 'steps': 37743, 'batch_loss/train': 0.7169725252315402} 12/27/2021 22:29:09 - INFO - codeparrot_training - Step 37744: {'lr': 7.10054245541872e-05, 'samples': 19325440, 'steps': 37744, 'batch_loss/train': 0.7780646746978164} 12/27/2021 22:29:21 - INFO - codeparrot_training - Step 37745: {'lr': 
7.099442031079643e-05, 'samples': 19325952, 'steps': 37745, 'batch_loss/train': 0.7310085408389568} 12/27/2021 22:29:31 - INFO - codeparrot_training - Step 37746: {'lr': 7.098341677906431e-05, 'samples': 19326464, 'steps': 37746, 'batch_loss/train': 0.7828149297856726} 12/27/2021 22:29:42 - INFO - codeparrot_training - Step 37747: {'lr': 7.097241395903461e-05, 'samples': 19326976, 'steps': 37747, 'batch_loss/train': 0.6459057962056249} 12/27/2021 22:29:55 - INFO - codeparrot_training - Step 37748: {'lr': 7.096141185075095e-05, 'samples': 19327488, 'steps': 37748, 'batch_loss/train': 0.7539773648604751} 12/27/2021 22:30:06 - INFO - codeparrot_training - Step 37749: {'lr': 7.095041045425701e-05, 'samples': 19328000, 'steps': 37749, 'batch_loss/train': 0.6842559366486967} 12/27/2021 22:30:16 - INFO - codeparrot_training - Step 37750: {'lr': 7.093940976959679e-05, 'samples': 19328512, 'steps': 37750, 'batch_loss/train': 0.916247732937336} 12/27/2021 22:30:29 - INFO - codeparrot_training - Step 37751: {'lr': 7.092840979681382e-05, 'samples': 19329024, 'steps': 37751, 'batch_loss/train': 0.7756955958902836} 12/27/2021 22:30:39 - INFO - codeparrot_training - Step 37752: {'lr': 7.091741053595175e-05, 'samples': 19329536, 'steps': 37752, 'batch_loss/train': 0.6793171493336558} 12/27/2021 22:30:50 - INFO - codeparrot_training - Step 37753: {'lr': 7.090641198705461e-05, 'samples': 19330048, 'steps': 37753, 'batch_loss/train': 0.9156065515708178} 12/27/2021 22:31:00 - INFO - codeparrot_training - Step 37754: {'lr': 7.089541415016587e-05, 'samples': 19330560, 'steps': 37754, 'batch_loss/train': 0.5841875427868217} 12/27/2021 22:31:13 - INFO - codeparrot_training - Step 37755: {'lr': 7.088441702532936e-05, 'samples': 19331072, 'steps': 37755, 'batch_loss/train': 0.6692193299531937} 12/27/2021 22:31:23 - INFO - codeparrot_training - Step 37756: {'lr': 7.087342061258876e-05, 'samples': 19331584, 'steps': 37756, 'batch_loss/train': 0.7509391305502504} 12/27/2021 22:31:34 - INFO - 
codeparrot_training - Step 37757: {'lr': 7.08624249119878e-05, 'samples': 19332096, 'steps': 37757, 'batch_loss/train': 0.6732246353058144} 12/27/2021 22:31:46 - INFO - codeparrot_training - Step 37758: {'lr': 7.085142992357022e-05, 'samples': 19332608, 'steps': 37758, 'batch_loss/train': 0.7409005360677838} 12/27/2021 22:31:57 - INFO - codeparrot_training - Step 37759: {'lr': 7.084043564737978e-05, 'samples': 19333120, 'steps': 37759, 'batch_loss/train': 0.7652485594153404} 12/27/2021 22:32:07 - INFO - codeparrot_training - Step 37760: {'lr': 7.082944208345992e-05, 'samples': 19333632, 'steps': 37760, 'batch_loss/train': 0.7245217561721802} 12/27/2021 22:32:21 - INFO - codeparrot_training - Step 37761: {'lr': 7.081844923185477e-05, 'samples': 19334144, 'steps': 37761, 'batch_loss/train': 0.7215969683602452} 12/27/2021 22:32:32 - INFO - codeparrot_training - Step 37762: {'lr': 7.080745709260769e-05, 'samples': 19334656, 'steps': 37762, 'batch_loss/train': 0.7984918975271285} 12/27/2021 22:32:43 - INFO - codeparrot_training - Step 37763: {'lr': 7.079646566576242e-05, 'samples': 19335168, 'steps': 37763, 'batch_loss/train': 0.6860335287638009} 12/27/2021 22:32:55 - INFO - codeparrot_training - Step 37764: {'lr': 7.078547495136293e-05, 'samples': 19335680, 'steps': 37764, 'batch_loss/train': 0.6663749809376895} 12/27/2021 22:33:05 - INFO - codeparrot_training - Step 37765: {'lr': 7.07744849494526e-05, 'samples': 19336192, 'steps': 37765, 'batch_loss/train': 0.7525169705040753} 12/27/2021 22:33:16 - INFO - codeparrot_training - Step 37766: {'lr': 7.076349566007525e-05, 'samples': 19336704, 'steps': 37766, 'batch_loss/train': 0.6933405492454767} 12/27/2021 22:33:26 - INFO - codeparrot_training - Step 37767: {'lr': 7.07525070832746e-05, 'samples': 19337216, 'steps': 37767, 'batch_loss/train': 0.7870901552960277} 12/27/2021 22:33:39 - INFO - codeparrot_training - Step 37768: {'lr': 7.074151921909428e-05, 'samples': 19337728, 'steps': 37768, 'batch_loss/train': 
0.7281772457063198} 12/27/2021 22:33:49 - INFO - codeparrot_training - Step 37769: {'lr': 7.073053206757798e-05, 'samples': 19338240, 'steps': 37769, 'batch_loss/train': 0.7815419272519648} 12/27/2021 22:34:00 - INFO - codeparrot_training - Step 37770: {'lr': 7.07195456287695e-05, 'samples': 19338752, 'steps': 37770, 'batch_loss/train': 0.7204481270164251} 12/27/2021 22:34:14 - INFO - codeparrot_training - Step 37771: {'lr': 7.070855990271225e-05, 'samples': 19339264, 'steps': 37771, 'batch_loss/train': 0.7573629673570395} 12/27/2021 22:34:24 - INFO - codeparrot_training - Step 37772: {'lr': 7.069757488945014e-05, 'samples': 19339776, 'steps': 37772, 'batch_loss/train': 0.6708491845056415} 12/27/2021 22:34:35 - INFO - codeparrot_training - Step 37773: {'lr': 7.068659058902685e-05, 'samples': 19340288, 'steps': 37773, 'batch_loss/train': 0.7069462821818888} 12/27/2021 22:34:47 - INFO - codeparrot_training - Step 37774: {'lr': 7.06756070014859e-05, 'samples': 19340800, 'steps': 37774, 'batch_loss/train': 0.782282548956573} 12/27/2021 22:34:57 - INFO - codeparrot_training - Step 37775: {'lr': 7.066462412687091e-05, 'samples': 19341312, 'steps': 37775, 'batch_loss/train': 0.7521203476935625} 12/27/2021 22:35:08 - INFO - codeparrot_training - Step 37776: {'lr': 7.065364196522587e-05, 'samples': 19341824, 'steps': 37776, 'batch_loss/train': 0.7386509501375258} 12/27/2021 22:35:19 - INFO - codeparrot_training - Step 37777: {'lr': 7.064266051659413e-05, 'samples': 19342336, 'steps': 37777, 'batch_loss/train': 0.6675137211568654} 12/27/2021 22:35:32 - INFO - codeparrot_training - Step 37778: {'lr': 7.063167978101937e-05, 'samples': 19342848, 'steps': 37778, 'batch_loss/train': 0.7471154672093689} 12/27/2021 22:35:43 - INFO - codeparrot_training - Step 37779: {'lr': 7.062069975854546e-05, 'samples': 19343360, 'steps': 37779, 'batch_loss/train': 0.7851870507001877} 12/27/2021 22:35:53 - INFO - codeparrot_training - Step 37780: {'lr': 7.060972044921587e-05, 'samples': 
19343872, 'steps': 37780, 'batch_loss/train': 0.6631940993247554} 12/27/2021 22:36:06 - INFO - codeparrot_training - Step 37781: {'lr': 7.059874185307427e-05, 'samples': 19344384, 'steps': 37781, 'batch_loss/train': 0.9415081916376948} 12/27/2021 22:36:16 - INFO - codeparrot_training - Step 37782: {'lr': 7.058776397016436e-05, 'samples': 19344896, 'steps': 37782, 'batch_loss/train': 0.616702280472964} 12/27/2021 22:36:27 - INFO - codeparrot_training - Step 37783: {'lr': 7.057678680052975e-05, 'samples': 19345408, 'steps': 37783, 'batch_loss/train': 0.8048212407156825} 12/27/2021 22:36:41 - INFO - codeparrot_training - Step 37784: {'lr': 7.056581034421408e-05, 'samples': 19345920, 'steps': 37784, 'batch_loss/train': 0.6245708339847624} 12/27/2021 22:36:52 - INFO - codeparrot_training - Step 37785: {'lr': 7.055483460126108e-05, 'samples': 19346432, 'steps': 37785, 'batch_loss/train': 0.7392778806388378} 12/27/2021 22:37:02 - INFO - codeparrot_training - Step 37786: {'lr': 7.054385957171413e-05, 'samples': 19346944, 'steps': 37786, 'batch_loss/train': 0.6942450497299433} 12/27/2021 22:37:13 - INFO - codeparrot_training - Step 37787: {'lr': 7.053288525561712e-05, 'samples': 19347456, 'steps': 37787, 'batch_loss/train': 0.7757116421125829} 12/27/2021 22:37:25 - INFO - codeparrot_training - Step 37788: {'lr': 7.05219116530137e-05, 'samples': 19347968, 'steps': 37788, 'batch_loss/train': 0.7353953155688941} 12/27/2021 22:37:36 - INFO - codeparrot_training - Step 37789: {'lr': 7.051093876394721e-05, 'samples': 19348480, 'steps': 37789, 'batch_loss/train': 0.6867673736996949} 12/27/2021 22:37:46 - INFO - codeparrot_training - Step 37790: {'lr': 7.049996658846162e-05, 'samples': 19348992, 'steps': 37790, 'batch_loss/train': 0.7031579811591655} 12/27/2021 22:37:59 - INFO - codeparrot_training - Step 37791: {'lr': 7.048899512660028e-05, 'samples': 19349504, 'steps': 37791, 'batch_loss/train': 0.7448503328487277} 12/27/2021 22:38:09 - INFO - codeparrot_training - Step 37792: 
{'lr': 7.047802437840695e-05, 'samples': 19350016, 'steps': 37792, 'batch_loss/train': 0.7577698146924376} 12/27/2021 22:38:20 - INFO - codeparrot_training - Step 37793: {'lr': 7.04670543439252e-05, 'samples': 19350528, 'steps': 37793, 'batch_loss/train': 0.7790978591656312} 12/27/2021 22:38:34 - INFO - codeparrot_training - Step 37794: {'lr': 7.045608502319864e-05, 'samples': 19351040, 'steps': 37794, 'batch_loss/train': 0.7666647477308288} 12/27/2021 22:38:45 - INFO - codeparrot_training - Step 37795: {'lr': 7.04451164162709e-05, 'samples': 19351552, 'steps': 37795, 'batch_loss/train': 0.8199271792545915} 12/27/2021 22:38:55 - INFO - codeparrot_training - Step 37796: {'lr': 7.043414852318558e-05, 'samples': 19352064, 'steps': 37796, 'batch_loss/train': 0.7065826305188239} 12/27/2021 22:39:08 - INFO - codeparrot_training - Step 37797: {'lr': 7.042318134398628e-05, 'samples': 19352576, 'steps': 37797, 'batch_loss/train': 0.7468431573361158} 12/27/2021 22:39:18 - INFO - codeparrot_training - Step 37798: {'lr': 7.041221487871663e-05, 'samples': 19353088, 'steps': 37798, 'batch_loss/train': 0.7876054020598531} 12/27/2021 22:39:29 - INFO - codeparrot_training - Step 37799: {'lr': 7.040124912742024e-05, 'samples': 19353600, 'steps': 37799, 'batch_loss/train': 0.6793751202058047} 12/27/2021 22:39:40 - INFO - codeparrot_training - Step 37800: {'lr': 7.039028409014055e-05, 'samples': 19354112, 'steps': 37800, 'batch_loss/train': 0.9693438392132521} 12/27/2021 22:39:54 - INFO - codeparrot_training - Step 37801: {'lr': 7.037931976692133e-05, 'samples': 19354624, 'steps': 37801, 'batch_loss/train': 0.7531082837376744} 12/27/2021 22:40:04 - INFO - codeparrot_training - Step 37802: {'lr': 7.036835615780621e-05, 'samples': 19355136, 'steps': 37802, 'batch_loss/train': 0.7558936877176166} 12/27/2021 22:40:15 - INFO - codeparrot_training - Step 37803: {'lr': 7.035739326283857e-05, 'samples': 19355648, 'steps': 37803, 'batch_loss/train': 0.8634629156440496} 12/27/2021 22:40:27 - 
INFO - codeparrot_training - Step 37804: {'lr': 7.034643108206206e-05, 'samples': 19356160, 'steps': 37804, 'batch_loss/train': 0.8729320880956948} 12/27/2021 22:40:38 - INFO - codeparrot_training - Step 37805: {'lr': 7.033546961552046e-05, 'samples': 19356672, 'steps': 37805, 'batch_loss/train': 0.7730695409700274} 12/27/2021 22:40:48 - INFO - codeparrot_training - Step 37806: {'lr': 7.032450886325709e-05, 'samples': 19357184, 'steps': 37806, 'batch_loss/train': 0.9062175329308957} 12/27/2021 22:41:01 - INFO - codeparrot_training - Step 37807: {'lr': 7.031354882531565e-05, 'samples': 19357696, 'steps': 37807, 'batch_loss/train': 0.796833235071972} 12/27/2021 22:41:11 - INFO - codeparrot_training - Step 37808: {'lr': 7.030258950173968e-05, 'samples': 19358208, 'steps': 37808, 'batch_loss/train': 0.6456293524242938} 12/27/2021 22:41:22 - INFO - codeparrot_training - Step 37809: {'lr': 7.029163089257276e-05, 'samples': 19358720, 'steps': 37809, 'batch_loss/train': 0.6938849687576294} 12/27/2021 22:41:33 - INFO - codeparrot_training - Step 37810: {'lr': 7.028067299785848e-05, 'samples': 19359232, 'steps': 37810, 'batch_loss/train': 0.7306367112323642} 12/27/2021 22:41:47 - INFO - codeparrot_training - Step 37811: {'lr': 7.026971581764043e-05, 'samples': 19359744, 'steps': 37811, 'batch_loss/train': 0.4827926770085469} 12/27/2021 22:41:58 - INFO - codeparrot_training - Step 37812: {'lr': 7.025875935196202e-05, 'samples': 19360256, 'steps': 37812, 'batch_loss/train': 0.8973042070865631} 12/27/2021 22:42:08 - INFO - codeparrot_training - Step 37813: {'lr': 7.024780360086697e-05, 'samples': 19360768, 'steps': 37813, 'batch_loss/train': 1.1133705666288733} 12/27/2021 22:42:20 - INFO - codeparrot_training - Step 37814: {'lr': 7.023684856439885e-05, 'samples': 19361280, 'steps': 37814, 'batch_loss/train': 0.8125955946743488} 12/27/2021 22:42:31 - INFO - codeparrot_training - Step 37815: {'lr': 7.022589424260097e-05, 'samples': 19361792, 'steps': 37815, 'batch_loss/train': 
0.7861744603142142} 12/27/2021 22:42:42 - INFO - codeparrot_training - Step 37816: {'lr': 7.021494063551726e-05, 'samples': 19362304, 'steps': 37816, 'batch_loss/train': 0.7536648670211434} 12/27/2021 22:42:52 - INFO - codeparrot_training - Step 37817: {'lr': 7.020398774319096e-05, 'samples': 19362816, 'steps': 37817, 'batch_loss/train': 0.5955741568468511} 12/27/2021 22:43:06 - INFO - codeparrot_training - Step 37818: {'lr': 7.01930355656657e-05, 'samples': 19363328, 'steps': 37818, 'batch_loss/train': 0.7867474602535367} 12/27/2021 22:43:17 - INFO - codeparrot_training - Step 37819: {'lr': 7.018208410298505e-05, 'samples': 19363840, 'steps': 37819, 'batch_loss/train': 0.7463323408737779} 12/27/2021 22:43:28 - INFO - codeparrot_training - Step 37820: {'lr': 7.017113335519255e-05, 'samples': 19364352, 'steps': 37820, 'batch_loss/train': 0.6658764358144253} 12/27/2021 22:43:40 - INFO - codeparrot_training - Step 37821: {'lr': 7.01601833223317e-05, 'samples': 19364864, 'steps': 37821, 'batch_loss/train': 0.7677059599664062} 12/27/2021 22:43:50 - INFO - codeparrot_training - Step 37822: {'lr': 7.014923400444606e-05, 'samples': 19365376, 'steps': 37822, 'batch_loss/train': 0.6303273297380656} 12/27/2021 22:44:01 - INFO - codeparrot_training - Step 37823: {'lr': 7.013828540157919e-05, 'samples': 19365888, 'steps': 37823, 'batch_loss/train': 0.8393730036914349} 12/27/2021 22:44:13 - INFO - codeparrot_training - Step 37824: {'lr': 7.012733751377454e-05, 'samples': 19366400, 'steps': 37824, 'batch_loss/train': 0.7789520910009742} 12/27/2021 22:44:24 - INFO - codeparrot_training - Step 37825: {'lr': 7.011639034107578e-05, 'samples': 19366912, 'steps': 37825, 'batch_loss/train': 0.7488711578771472} 12/27/2021 22:44:34 - INFO - codeparrot_training - Step 37826: {'lr': 7.010544388352616e-05, 'samples': 19367424, 'steps': 37826, 'batch_loss/train': 0.8507143743336201} 12/27/2021 22:44:45 - INFO - codeparrot_training - Step 37827: {'lr': 7.009449814116948e-05, 'samples': 
19367936, 'steps': 37827, 'batch_loss/train': 0.7528636949136853} 12/27/2021 22:44:58 - INFO - codeparrot_training - Step 37828: {'lr': 7.008355311404918e-05, 'samples': 19368448, 'steps': 37828, 'batch_loss/train': 0.8268692968413234} 12/27/2021 22:45:08 - INFO - codeparrot_training - Step 37829: {'lr': 7.00726088022087e-05, 'samples': 19368960, 'steps': 37829, 'batch_loss/train': 0.6269708890467882} 12/27/2021 22:45:19 - INFO - codeparrot_training - Step 37830: {'lr': 7.00616652056915e-05, 'samples': 19369472, 'steps': 37830, 'batch_loss/train': 0.659323084866628} 12/27/2021 22:45:33 - INFO - codeparrot_training - Step 37831: {'lr': 7.005072232454132e-05, 'samples': 19369984, 'steps': 37831, 'batch_loss/train': 0.788163305958733} 12/27/2021 22:45:43 - INFO - codeparrot_training - Step 37832: {'lr': 7.003978015880144e-05, 'samples': 19370496, 'steps': 37832, 'batch_loss/train': 0.7407432820182294} 12/27/2021 22:45:54 - INFO - codeparrot_training - Step 37833: {'lr': 7.002883870851545e-05, 'samples': 19371008, 'steps': 37833, 'batch_loss/train': 0.8519901231629774} 12/27/2021 22:46:06 - INFO - codeparrot_training - Step 37834: {'lr': 7.001789797372685e-05, 'samples': 19371520, 'steps': 37834, 'batch_loss/train': 0.7144193844869733} 12/27/2021 22:46:17 - INFO - codeparrot_training - Step 37835: {'lr': 7.000695795447915e-05, 'samples': 19372032, 'steps': 37835, 'batch_loss/train': 0.7603738233447075} 12/27/2021 22:46:28 - INFO - codeparrot_training - Step 37836: {'lr': 6.999601865081578e-05, 'samples': 19372544, 'steps': 37836, 'batch_loss/train': 0.8201711941510439} 12/27/2021 22:46:38 - INFO - codeparrot_training - Step 37837: {'lr': 6.998508006278029e-05, 'samples': 19373056, 'steps': 37837, 'batch_loss/train': 0.7943526263698004} 12/27/2021 22:46:50 - INFO - codeparrot_training - Step 37838: {'lr': 6.997414219041614e-05, 'samples': 19373568, 'steps': 37838, 'batch_loss/train': 0.700132871279493} 12/27/2021 22:47:01 - INFO - codeparrot_training - Step 37839: 
{'lr': 6.996320503376685e-05, 'samples': 19374080, 'steps': 37839, 'batch_loss/train': 0.7567326310090721} 12/27/2021 22:47:12 - INFO - codeparrot_training - Step 37840: {'lr': 6.995226859287595e-05, 'samples': 19374592, 'steps': 37840, 'batch_loss/train': 0.7081367088248953} 12/27/2021 22:47:25 - INFO - codeparrot_training - Step 37841: {'lr': 6.994133286778667e-05, 'samples': 19375104, 'steps': 37841, 'batch_loss/train': 0.7405070713721216} 12/27/2021 22:47:36 - INFO - codeparrot_training - Step 37842: {'lr': 6.993039785854277e-05, 'samples': 19375616, 'steps': 37842, 'batch_loss/train': 0.7190625158837065} 12/27/2021 22:47:47 - INFO - codeparrot_training - Step 37843: {'lr': 6.991946356518767e-05, 'samples': 19376128, 'steps': 37843, 'batch_loss/train': 0.755460349842906} 12/27/2021 22:47:59 - INFO - codeparrot_training - Step 37844: {'lr': 6.990852998776473e-05, 'samples': 19376640, 'steps': 37844, 'batch_loss/train': 0.7305247560143471} 12/27/2021 22:48:10 - INFO - codeparrot_training - Step 37845: {'lr': 6.989759712631747e-05, 'samples': 19377152, 'steps': 37845, 'batch_loss/train': 0.6193144407006912} 12/27/2021 22:48:20 - INFO - codeparrot_training - Step 37846: {'lr': 6.988666498088932e-05, 'samples': 19377664, 'steps': 37846, 'batch_loss/train': 0.7173534715548158} 12/27/2021 22:48:31 - INFO - codeparrot_training - Step 37847: {'lr': 6.987573355152382e-05, 'samples': 19378176, 'steps': 37847, 'batch_loss/train': 0.736738262232393} 12/27/2021 22:48:45 - INFO - codeparrot_training - Step 37848: {'lr': 6.986480283826441e-05, 'samples': 19378688, 'steps': 37848, 'batch_loss/train': 0.8080068775452673} 12/27/2021 22:48:55 - INFO - codeparrot_training - Step 37849: {'lr': 6.98538728411545e-05, 'samples': 19379200, 'steps': 37849, 'batch_loss/train': 0.6672481913119555} 12/27/2021 22:49:06 - INFO - codeparrot_training - Step 37850: {'lr': 6.984294356023757e-05, 'samples': 19379712, 'steps': 37850, 'batch_loss/train': 0.666537307202816} 12/27/2021 22:49:18 - INFO 
- codeparrot_training - Step 37851: {'lr': 6.983201499555717e-05, 'samples': 19380224, 'steps': 37851, 'batch_loss/train': 0.6753679383546114} 12/27/2021 22:49:29 - INFO - codeparrot_training - Step 37852: {'lr': 6.982108714715646e-05, 'samples': 19380736, 'steps': 37852, 'batch_loss/train': 0.6100402818992734} 12/27/2021 22:49:39 - INFO - codeparrot_training - Step 37853: {'lr': 6.981016001507917e-05, 'samples': 19381248, 'steps': 37853, 'batch_loss/train': 0.8586727164220065} 12/27/2021 22:49:51 - INFO - codeparrot_training - Step 37854: {'lr': 6.979923359936874e-05, 'samples': 19381760, 'steps': 37854, 'batch_loss/train': 0.8093118106480688} 12/27/2021 22:50:02 - INFO - codeparrot_training - Step 37855: {'lr': 6.97883079000684e-05, 'samples': 19382272, 'steps': 37855, 'batch_loss/train': 0.7556246720487252} 12/27/2021 22:50:13 - INFO - codeparrot_training - Step 37856: {'lr': 6.977738291722166e-05, 'samples': 19382784, 'steps': 37856, 'batch_loss/train': 0.7868364588357508} 12/27/2021 22:50:26 - INFO - codeparrot_training - Step 37857: {'lr': 6.976645865087213e-05, 'samples': 19383296, 'steps': 37857, 'batch_loss/train': 0.6694024513708428} 12/27/2021 22:50:37 - INFO - codeparrot_training - Step 37858: {'lr': 6.975553510106305e-05, 'samples': 19383808, 'steps': 37858, 'batch_loss/train': 0.7136656045913696} 12/27/2021 22:50:48 - INFO - codeparrot_training - Step 37859: {'lr': 6.97446122678379e-05, 'samples': 19384320, 'steps': 37859, 'batch_loss/train': 0.7828045906499028} 12/27/2021 22:50:58 - INFO - codeparrot_training - Step 37860: {'lr': 6.973369015124009e-05, 'samples': 19384832, 'steps': 37860, 'batch_loss/train': 0.8119308948516846} 12/27/2021 22:51:11 - INFO - codeparrot_training - Step 37861: {'lr': 6.972276875131309e-05, 'samples': 19385344, 'steps': 37861, 'batch_loss/train': 0.6767142943572253} 12/27/2021 22:51:21 - INFO - codeparrot_training - Step 37862: {'lr': 6.971184806810027e-05, 'samples': 19385856, 'steps': 37862, 'batch_loss/train': 
0.7275316771119833} 12/27/2021 22:51:32 - INFO - codeparrot_training - Step 37863: {'lr': 6.970092810164508e-05, 'samples': 19386368, 'steps': 37863, 'batch_loss/train': 0.8396244717296213} 12/27/2021 22:51:44 - INFO - codeparrot_training - Step 37864: {'lr': 6.969000885199092e-05, 'samples': 19386880, 'steps': 37864, 'batch_loss/train': 0.814521555788815} 12/27/2021 22:51:55 - INFO - codeparrot_training - Step 37865: {'lr': 6.967909031918118e-05, 'samples': 19387392, 'steps': 37865, 'batch_loss/train': 0.7548109409399331} 12/27/2021 22:52:05 - INFO - codeparrot_training - Step 37866: {'lr': 6.966817250325941e-05, 'samples': 19387904, 'steps': 37866, 'batch_loss/train': 0.7378581836819649} 12/27/2021 22:52:17 - INFO - codeparrot_training - Step 37867: {'lr': 6.96572554042687e-05, 'samples': 19388416, 'steps': 37867, 'batch_loss/train': 0.6920118387788534} 12/27/2021 22:52:28 - INFO - codeparrot_training - Step 37868: {'lr': 6.964633902225275e-05, 'samples': 19388928, 'steps': 37868, 'batch_loss/train': 0.7137281728209928} 12/27/2021 22:52:39 - INFO - codeparrot_training - Step 37869: {'lr': 6.963542335725493e-05, 'samples': 19389440, 'steps': 37869, 'batch_loss/train': 0.7526291598333046} 12/27/2021 22:52:49 - INFO - codeparrot_training - Step 37870: {'lr': 6.962450840931847e-05, 'samples': 19389952, 'steps': 37870, 'batch_loss/train': 0.6539668259210885} 12/27/2021 22:53:03 - INFO - codeparrot_training - Step 37871: {'lr': 6.961359417848689e-05, 'samples': 19390464, 'steps': 37871, 'batch_loss/train': 0.7330927485600114} 12/27/2021 22:53:14 - INFO - codeparrot_training - Step 37872: {'lr': 6.960268066480352e-05, 'samples': 19390976, 'steps': 37872, 'batch_loss/train': 0.7410589121282101} 12/27/2021 22:53:24 - INFO - codeparrot_training - Step 37873: {'lr': 6.959176786831179e-05, 'samples': 19391488, 'steps': 37873, 'batch_loss/train': 0.7643704107031226} 12/27/2021 22:53:37 - INFO - codeparrot_training - Step 37874: {'lr': 6.958085578905507e-05, 'samples': 
19392000, 'steps': 37874, 'batch_loss/train': 0.6982631431892514} 12/27/2021 22:53:47 - INFO - codeparrot_training - Step 37875: {'lr': 6.956994442707673e-05, 'samples': 19392512, 'steps': 37875, 'batch_loss/train': 0.7102837548591197} 12/27/2021 22:53:58 - INFO - codeparrot_training - Step 37876: {'lr': 6.955903378242018e-05, 'samples': 19393024, 'steps': 37876, 'batch_loss/train': 0.7496464485302567} 12/27/2021 22:54:12 - INFO - codeparrot_training - Step 37877: {'lr': 6.954812385512877e-05, 'samples': 19393536, 'steps': 37877, 'batch_loss/train': 0.7714387681335211} 12/27/2021 22:54:22 - INFO - codeparrot_training - Step 37878: {'lr': 6.95372146452459e-05, 'samples': 19394048, 'steps': 37878, 'batch_loss/train': 0.7430885415524244} 12/27/2021 22:54:33 - INFO - codeparrot_training - Step 37879: {'lr': 6.952630615281492e-05, 'samples': 19394560, 'steps': 37879, 'batch_loss/train': 0.8036569515243173} 12/27/2021 22:54:44 - INFO - codeparrot_training - Step 37880: {'lr': 6.951539837787926e-05, 'samples': 19395072, 'steps': 37880, 'batch_loss/train': 0.7423954326659441} 12/27/2021 22:54:56 - INFO - codeparrot_training - Step 37881: {'lr': 6.950449132048215e-05, 'samples': 19395584, 'steps': 37881, 'batch_loss/train': 0.7288541770540178} 12/27/2021 22:55:07 - INFO - codeparrot_training - Step 37882: {'lr': 6.949358498066697e-05, 'samples': 19396096, 'steps': 37882, 'batch_loss/train': 0.739379667211324} 12/27/2021 22:55:17 - INFO - codeparrot_training - Step 37883: {'lr': 6.94826793584773e-05, 'samples': 19396608, 'steps': 37883, 'batch_loss/train': 0.7472455091774464} 12/27/2021 22:55:30 - INFO - codeparrot_training - Step 37884: {'lr': 6.947177445395622e-05, 'samples': 19397120, 'steps': 37884, 'batch_loss/train': 0.767727124504745} 12/27/2021 22:55:40 - INFO - codeparrot_training - Step 37885: {'lr': 6.946087026714723e-05, 'samples': 19397632, 'steps': 37885, 'batch_loss/train': 0.6993262423202395} 12/27/2021 22:55:51 - INFO - codeparrot_training - Step 37886: 
{'lr': 6.944996679809365e-05, 'samples': 19398144, 'steps': 37886, 'batch_loss/train': 0.7934499285183847} 12/27/2021 22:56:05 - INFO - codeparrot_training - Step 37887: {'lr': 6.943906404683883e-05, 'samples': 19398656, 'steps': 37887, 'batch_loss/train': 0.6229564732639119} 12/27/2021 22:56:15 - INFO - codeparrot_training - Step 37888: {'lr': 6.942816201342611e-05, 'samples': 19399168, 'steps': 37888, 'batch_loss/train': 0.6890851485077292} 12/27/2021 22:56:26 - INFO - codeparrot_training - Step 37889: {'lr': 6.941726069789886e-05, 'samples': 19399680, 'steps': 37889, 'batch_loss/train': 0.8567852098494768} 12/27/2021 22:56:37 - INFO - codeparrot_training - Step 37890: {'lr': 6.940636010030035e-05, 'samples': 19400192, 'steps': 37890, 'batch_loss/train': 0.7277032905258238} 12/27/2021 22:56:49 - INFO - codeparrot_training - Step 37891: {'lr': 6.9395460220674e-05, 'samples': 19400704, 'steps': 37891, 'batch_loss/train': 0.6719166298862547} 12/27/2021 22:57:00 - INFO - codeparrot_training - Step 37892: {'lr': 6.938456105906318e-05, 'samples': 19401216, 'steps': 37892, 'batch_loss/train': 0.7273709308356047} 12/27/2021 22:57:11 - INFO - codeparrot_training - Step 37893: {'lr': 6.937366261551098e-05, 'samples': 19401728, 'steps': 37893, 'batch_loss/train': 0.6415940169245005} 12/27/2021 22:57:23 - INFO - codeparrot_training - Step 37894: {'lr': 6.936276489006096e-05, 'samples': 19402240, 'steps': 37894, 'batch_loss/train': 0.7840650202706456} 12/27/2021 22:57:33 - INFO - codeparrot_training - Step 37895: {'lr': 6.935186788275649e-05, 'samples': 19402752, 'steps': 37895, 'batch_loss/train': 0.7246563397347927} 12/27/2021 22:57:44 - INFO - codeparrot_training - Step 37896: {'lr': 6.934097159364069e-05, 'samples': 19403264, 'steps': 37896, 'batch_loss/train': 0.8400200419127941} 12/27/2021 22:57:58 - INFO - codeparrot_training - Step 37897: {'lr': 6.933007602275698e-05, 'samples': 19403776, 'steps': 37897, 'batch_loss/train': 0.7412977712228894} 12/27/2021 22:58:09 - 
INFO - codeparrot_training - Step 37898: {'lr': 6.931918117014866e-05, 'samples': 19404288, 'steps': 37898, 'batch_loss/train': 0.7130647194571793} 12/27/2021 22:58:19 - INFO - codeparrot_training - Step 37899: {'lr': 6.930828703585906e-05, 'samples': 19404800, 'steps': 37899, 'batch_loss/train': 0.7220736879389733} 12/27/2021 22:58:30 - INFO - codeparrot_training - Step 37900: {'lr': 6.929739361993149e-05, 'samples': 19405312, 'steps': 37900, 'batch_loss/train': 0.7978745312429965} 12/27/2021 22:58:42 - INFO - codeparrot_training - Step 37901: {'lr': 6.928650092240924e-05, 'samples': 19405824, 'steps': 37901, 'batch_loss/train': 0.7740875091403723} 12/27/2021 22:58:52 - INFO - codeparrot_training - Step 37902: {'lr': 6.927560894333562e-05, 'samples': 19406336, 'steps': 37902, 'batch_loss/train': 0.7300789913279004} 12/27/2021 22:59:03 - INFO - codeparrot_training - Step 37903: {'lr': 6.926471768275394e-05, 'samples': 19406848, 'steps': 37903, 'batch_loss/train': 0.8345288028940558} 12/27/2021 22:59:15 - INFO - codeparrot_training - Step 37904: {'lr': 6.925382714070753e-05, 'samples': 19407360, 'steps': 37904, 'batch_loss/train': 0.7614617291837931} 12/27/2021 22:59:26 - INFO - codeparrot_training - Step 37905: {'lr': 6.924293731723962e-05, 'samples': 19407872, 'steps': 37905, 'batch_loss/train': 0.6943542119115591} 12/27/2021 22:59:36 - INFO - codeparrot_training - Step 37906: {'lr': 6.923204821239362e-05, 'samples': 19408384, 'steps': 37906, 'batch_loss/train': 0.799405668862164} 12/27/2021 22:59:50 - INFO - codeparrot_training - Step 37907: {'lr': 6.922115982621268e-05, 'samples': 19408896, 'steps': 37907, 'batch_loss/train': 0.8032886511646211} 12/27/2021 23:00:01 - INFO - codeparrot_training - Step 37908: {'lr': 6.921027215874007e-05, 'samples': 19409408, 'steps': 37908, 'batch_loss/train': 0.7212391961365938} 12/27/2021 23:00:12 - INFO - codeparrot_training - Step 37909: {'lr': 6.91993852100193e-05, 'samples': 19409920, 'steps': 37909, 'batch_loss/train': 
0.7907876167446375} 12/27/2021 23:00:22 - INFO - codeparrot_training - Step 37910: {'lr': 6.918849898009344e-05, 'samples': 19410432, 'steps': 37910, 'batch_loss/train': 0.7592957811430097} 12/27/2021 23:00:35 - INFO - codeparrot_training - Step 37911: {'lr': 6.917761346900578e-05, 'samples': 19410944, 'steps': 37911, 'batch_loss/train': 0.7216945631662384} 12/27/2021 23:00:45 - INFO - codeparrot_training - Step 37912: {'lr': 6.916672867679976e-05, 'samples': 19411456, 'steps': 37912, 'batch_loss/train': 0.6995371337980032} 12/27/2021 23:00:56 - INFO - codeparrot_training - Step 37913: {'lr': 6.915584460351851e-05, 'samples': 19411968, 'steps': 37913, 'batch_loss/train': 0.6203677137964405} 12/27/2021 23:01:08 - INFO - codeparrot_training - Step 37914: {'lr': 6.914496124920533e-05, 'samples': 19412480, 'steps': 37914, 'batch_loss/train': 0.7222043075598776} 12/27/2021 23:01:19 - INFO - codeparrot_training - Step 37915: {'lr': 6.913407861390348e-05, 'samples': 19412992, 'steps': 37915, 'batch_loss/train': 1.5123331970535219} 12/27/2021 23:01:29 - INFO - codeparrot_training - Step 37916: {'lr': 6.912319669765623e-05, 'samples': 19413504, 'steps': 37916, 'batch_loss/train': 1.5889678439125419} 12/27/2021 23:01:40 - INFO - codeparrot_training - Step 37917: {'lr': 6.911231550050689e-05, 'samples': 19414016, 'steps': 37917, 'batch_loss/train': 0.7513351095840335} 12/27/2021 23:01:54 - INFO - codeparrot_training - Step 37918: {'lr': 6.910143502249874e-05, 'samples': 19414528, 'steps': 37918, 'batch_loss/train': 0.706303742248565} 12/27/2021 23:02:05 - INFO - codeparrot_training - Step 37919: {'lr': 6.909055526367483e-05, 'samples': 19415040, 'steps': 37919, 'batch_loss/train': 0.9012899480294436} 12/27/2021 23:02:15 - INFO - codeparrot_training - Step 37920: {'lr': 6.907967622407863e-05, 'samples': 19415552, 'steps': 37920, 'batch_loss/train': 0.8605451788753271} 12/27/2021 23:02:27 - INFO - codeparrot_training - Step 37921: {'lr': 6.90687979037534e-05, 'samples': 
19416064, 'steps': 37921, 'batch_loss/train': 0.7502624969929457} 12/27/2021 23:02:38 - INFO - codeparrot_training - Step 37922: {'lr': 6.905792030274224e-05, 'samples': 19416576, 'steps': 37922, 'batch_loss/train': 0.7273565115174279} 12/27/2021 23:02:49 - INFO - codeparrot_training - Step 37923: {'lr': 6.904704342108841e-05, 'samples': 19417088, 'steps': 37923, 'batch_loss/train': 0.697665236890316} 12/27/2021 23:03:01 - INFO - codeparrot_training - Step 37924: {'lr': 6.903616725883536e-05, 'samples': 19417600, 'steps': 37924, 'batch_loss/train': 0.7229712216649204} 12/27/2021 23:03:11 - INFO - codeparrot_training - Step 37925: {'lr': 6.902529181602607e-05, 'samples': 19418112, 'steps': 37925, 'batch_loss/train': 0.7771627473994158} 12/27/2021 23:03:22 - INFO - codeparrot_training - Step 37926: {'lr': 6.901441709270392e-05, 'samples': 19418624, 'steps': 37926, 'batch_loss/train': 0.7503784724976867} 12/27/2021 23:03:36 - INFO - codeparrot_training - Step 37927: {'lr': 6.90035430889121e-05, 'samples': 19419136, 'steps': 37927, 'batch_loss/train': 0.720661393366754} 12/27/2021 23:03:47 - INFO - codeparrot_training - Step 37928: {'lr': 6.899266980469385e-05, 'samples': 19419648, 'steps': 37928, 'batch_loss/train': 0.6912085868534632} 12/27/2021 23:03:57 - INFO - codeparrot_training - Step 37929: {'lr': 6.89817972400924e-05, 'samples': 19420160, 'steps': 37929, 'batch_loss/train': 0.7557434597983956} 12/27/2021 23:04:08 - INFO - codeparrot_training - Step 37930: {'lr': 6.897092539515096e-05, 'samples': 19420672, 'steps': 37930, 'batch_loss/train': 0.7064168169745244} 12/27/2021 23:04:20 - INFO - codeparrot_training - Step 37931: {'lr': 6.896005426991279e-05, 'samples': 19421184, 'steps': 37931, 'batch_loss/train': 0.7784588830545545} 12/27/2021 23:04:31 - INFO - codeparrot_training - Step 37932: {'lr': 6.894918386442115e-05, 'samples': 19421696, 'steps': 37932, 'batch_loss/train': 0.6773410467430949} 12/27/2021 23:04:41 - INFO - codeparrot_training - Step 37933: 
{'lr': 6.893831417871913e-05, 'samples': 19422208, 'steps': 37933, 'batch_loss/train': 0.7467883881181479} 12/27/2021 23:04:55 - INFO - codeparrot_training - Step 37934: {'lr': 6.89274452128499e-05, 'samples': 19422720, 'steps': 37934, 'batch_loss/train': 0.7328744297847152} 12/27/2021 23:05:06 - INFO - codeparrot_training - Step 37935: {'lr': 6.891657696685697e-05, 'samples': 19423232, 'steps': 37935, 'batch_loss/train': 0.7712392134126276} 12/27/2021 23:05:17 - INFO - codeparrot_training - Step 37936: {'lr': 6.890570944078325e-05, 'samples': 19423744, 'steps': 37936, 'batch_loss/train': 0.7726200451143086} 12/27/2021 23:05:29 - INFO - codeparrot_training - Step 37937: {'lr': 6.889484263467197e-05, 'samples': 19424256, 'steps': 37937, 'batch_loss/train': 0.7576803583651781} 12/27/2021 23:05:39 - INFO - codeparrot_training - Step 37938: {'lr': 6.888397654856659e-05, 'samples': 19424768, 'steps': 37938, 'batch_loss/train': 0.7100480878725648} 12/27/2021 23:05:50 - INFO - codeparrot_training - Step 37939: {'lr': 6.887311118251005e-05, 'samples': 19425280, 'steps': 37939, 'batch_loss/train': 0.7062269276939332} 12/27/2021 23:06:01 - INFO - codeparrot_training - Step 37940: {'lr': 6.886224653654563e-05, 'samples': 19425792, 'steps': 37940, 'batch_loss/train': 0.7483000960201025} 12/27/2021 23:06:13 - INFO - codeparrot_training - Step 37941: {'lr': 6.88513826107165e-05, 'samples': 19426304, 'steps': 37941, 'batch_loss/train': 0.9585089161992073} 12/27/2021 23:06:24 - INFO - codeparrot_training - Step 37942: {'lr': 6.884051940506588e-05, 'samples': 19426816, 'steps': 37942, 'batch_loss/train': 0.7229402158409357} 12/27/2021 23:06:34 - INFO - codeparrot_training - Step 37943: {'lr': 6.882965691963696e-05, 'samples': 19427328, 'steps': 37943, 'batch_loss/train': 0.7252019718289375} 12/27/2021 23:06:48 - INFO - codeparrot_training - Step 37944: {'lr': 6.8818795154473e-05, 'samples': 19427840, 'steps': 37944, 'batch_loss/train': 0.9071295689791441} 12/27/2021 23:06:59 - INFO 
- codeparrot_training - Step 37945: {'lr': 6.880793410961695e-05, 'samples': 19428352, 'steps': 37945, 'batch_loss/train': 0.7793735433369875} 12/27/2021 23:07:09 - INFO - codeparrot_training - Step 37946: {'lr': 6.879707378511221e-05, 'samples': 19428864, 'steps': 37946, 'batch_loss/train': 0.8073192620649934} 12/27/2021 23:07:22 - INFO - codeparrot_training - Step 37947: {'lr': 6.878621418100196e-05, 'samples': 19429376, 'steps': 37947, 'batch_loss/train': 0.8301053766626865} 12/27/2021 23:07:32 - INFO - codeparrot_training - Step 37948: {'lr': 6.877535529732915e-05, 'samples': 19429888, 'steps': 37948, 'batch_loss/train': 0.5909987520426512} 12/27/2021 23:07:43 - INFO - codeparrot_training - Step 37949: {'lr': 6.876449713413715e-05, 'samples': 19430400, 'steps': 37949, 'batch_loss/train': 0.7925762934610248} 12/27/2021 23:07:54 - INFO - codeparrot_training - Step 37950: {'lr': 6.87536396914692e-05, 'samples': 19430912, 'steps': 37950, 'batch_loss/train': 0.7412500656209886} 12/27/2021 23:08:06 - INFO - codeparrot_training - Step 37951: {'lr': 6.874278296936823e-05, 'samples': 19431424, 'steps': 37951, 'batch_loss/train': 0.7215174483135343} 12/27/2021 23:08:17 - INFO - codeparrot_training - Step 37952: {'lr': 6.873192696787753e-05, 'samples': 19431936, 'steps': 37952, 'batch_loss/train': 0.7308021355420351} 12/27/2021 23:08:27 - INFO - codeparrot_training - Step 37953: {'lr': 6.872107168704022e-05, 'samples': 19432448, 'steps': 37953, 'batch_loss/train': 0.7274769954383373} 12/27/2021 23:08:40 - INFO - codeparrot_training - Step 37954: {'lr': 6.87102171268995e-05, 'samples': 19432960, 'steps': 37954, 'batch_loss/train': 0.7876533889211714} 12/27/2021 23:08:50 - INFO - codeparrot_training - Step 37955: {'lr': 6.869936328749851e-05, 'samples': 19433472, 'steps': 37955, 'batch_loss/train': 0.7825709274038672} 12/27/2021 23:09:01 - INFO - codeparrot_training - Step 37956: {'lr': 6.86885101688804e-05, 'samples': 19433984, 'steps': 37956, 'batch_loss/train': 
0.6598970466293395} 12/27/2021 23:09:12 - INFO - codeparrot_training - Step 37957: {'lr': 6.867765777108831e-05, 'samples': 19434496, 'steps': 37957, 'batch_loss/train': 1.138306584674865} 12/27/2021 23:09:26 - INFO - codeparrot_training - Step 37958: {'lr': 6.866680609416537e-05, 'samples': 19435008, 'steps': 37958, 'batch_loss/train': 0.554873844870599} 12/27/2021 23:09:36 - INFO - codeparrot_training - Step 37959: {'lr': 6.865595513815482e-05, 'samples': 19435520, 'steps': 37959, 'batch_loss/train': 0.7135313488543034} 12/27/2021 23:09:47 - INFO - codeparrot_training - Step 37960: {'lr': 6.864510490309958e-05, 'samples': 19436032, 'steps': 37960, 'batch_loss/train': 0.7334209443069994} 12/27/2021 23:09:59 - INFO - codeparrot_training - Step 37961: {'lr': 6.863425538904308e-05, 'samples': 19436544, 'steps': 37961, 'batch_loss/train': 0.7787251207046211} 12/27/2021 23:10:10 - INFO - codeparrot_training - Step 37962: {'lr': 6.862340659602819e-05, 'samples': 19437056, 'steps': 37962, 'batch_loss/train': 0.6920170423109084} 12/27/2021 23:10:20 - INFO - codeparrot_training - Step 37963: {'lr': 6.861255852409812e-05, 'samples': 19437568, 'steps': 37963, 'batch_loss/train': 0.7900913180783391} 12/27/2021 23:10:34 - INFO - codeparrot_training - Step 37964: {'lr': 6.860171117329617e-05, 'samples': 19438080, 'steps': 37964, 'batch_loss/train': 0.7516879797913134} 12/27/2021 23:10:45 - INFO - codeparrot_training - Step 37965: {'lr': 6.859086454366523e-05, 'samples': 19438592, 'steps': 37965, 'batch_loss/train': 0.8771762317046523} 12/27/2021 23:10:55 - INFO - codeparrot_training - Step 37966: {'lr': 6.858001863524852e-05, 'samples': 19439104, 'steps': 37966, 'batch_loss/train': 0.7912834119051695} 12/27/2021 23:11:06 - INFO - codeparrot_training - Step 37967: {'lr': 6.856917344808916e-05, 'samples': 19439616, 'steps': 37967, 'batch_loss/train': 0.7396066873334348} 12/27/2021 23:11:18 - INFO - codeparrot_training - Step 37968: {'lr': 6.855832898223024e-05, 'samples': 
19440128, 'steps': 37968, 'batch_loss/train': 0.9192312713712454} 12/27/2021 23:11:29 - INFO - codeparrot_training - Step 37969: {'lr': 6.854748523771492e-05, 'samples': 19440640, 'steps': 37969, 'batch_loss/train': 0.7035023542121053} 12/27/2021 23:11:40 - INFO - codeparrot_training - Step 37970: {'lr': 6.853664221458636e-05, 'samples': 19441152, 'steps': 37970, 'batch_loss/train': 1.13865324575454} 12/27/2021 23:11:52 - INFO - codeparrot_training - Step 37971: {'lr': 6.852579991288743e-05, 'samples': 19441664, 'steps': 37971, 'batch_loss/train': 0.7897201557643712} 12/27/2021 23:12:02 - INFO - codeparrot_training - Step 37972: {'lr': 6.851495833266147e-05, 'samples': 19442176, 'steps': 37972, 'batch_loss/train': 0.7990775257349014} 12/27/2021 23:12:13 - INFO - codeparrot_training - Step 37973: {'lr': 6.850411747395161e-05, 'samples': 19442688, 'steps': 37973, 'batch_loss/train': 0.8237691652029753} 12/27/2021 23:12:28 - INFO - codeparrot_training - Step 37974: {'lr': 6.849327733680067e-05, 'samples': 19443200, 'steps': 37974, 'batch_loss/train': 0.8079334637150168} 12/27/2021 23:12:38 - INFO - codeparrot_training - Step 37975: {'lr': 6.8482437921252e-05, 'samples': 19443712, 'steps': 37975, 'batch_loss/train': 0.6542597856605425} 12/27/2021 23:12:49 - INFO - codeparrot_training - Step 37976: {'lr': 6.847159922734872e-05, 'samples': 19444224, 'steps': 37976, 'batch_loss/train': 0.7305298196151853} 12/27/2021 23:13:00 - INFO - codeparrot_training - Step 37977: {'lr': 6.84607612551337e-05, 'samples': 19444736, 'steps': 37977, 'batch_loss/train': 2.124360516667366} 12/27/2021 23:13:12 - INFO - codeparrot_training - Step 37978: {'lr': 6.84499240046502e-05, 'samples': 19445248, 'steps': 37978, 'batch_loss/train': 0.80744943395257} 12/27/2021 23:13:22 - INFO - codeparrot_training - Step 37979: {'lr': 6.843908747594122e-05, 'samples': 19445760, 'steps': 37979, 'batch_loss/train': 0.7679471634328365} 12/27/2021 23:13:33 - INFO - codeparrot_training - Step 37980: {'lr': 
6.842825166904987e-05, 'samples': 19446272, 'steps': 37980, 'batch_loss/train': 0.7673457898199558} 12/27/2021 23:13:47 - INFO - codeparrot_training - Step 37981: {'lr': 6.841741658401923e-05, 'samples': 19446784, 'steps': 37981, 'batch_loss/train': 0.7349571632221341} 12/27/2021 23:13:58 - INFO - codeparrot_training - Step 37982: {'lr': 6.840658222089241e-05, 'samples': 19447296, 'steps': 37982, 'batch_loss/train': 0.7216533473692834} 12/27/2021 23:14:08 - INFO - codeparrot_training - Step 37983: {'lr': 6.839574857971243e-05, 'samples': 19447808, 'steps': 37983, 'batch_loss/train': 0.9584372027311474} 12/27/2021 23:14:19 - INFO - codeparrot_training - Step 37984: {'lr': 6.838491566052238e-05, 'samples': 19448320, 'steps': 37984, 'batch_loss/train': 0.7660047390963882} 12/27/2021 23:14:31 - INFO - codeparrot_training - Step 37985: {'lr': 6.837408346336532e-05, 'samples': 19448832, 'steps': 37985, 'batch_loss/train': 0.8329620976001024} 12/27/2021 23:14:42 - INFO - codeparrot_training - Step 37986: {'lr': 6.836325198828436e-05, 'samples': 19449344, 'steps': 37986, 'batch_loss/train': 0.8148735472932458} 12/27/2021 23:14:52 - INFO - codeparrot_training - Step 37987: {'lr': 6.835242123532257e-05, 'samples': 19449856, 'steps': 37987, 'batch_loss/train': 0.7475591842085123} 12/27/2021 23:15:04 - INFO - codeparrot_training - Step 37988: {'lr': 6.834159120452291e-05, 'samples': 19450368, 'steps': 37988, 'batch_loss/train': 0.7608683332800865} 12/27/2021 23:15:15 - INFO - codeparrot_training - Step 37989: {'lr': 6.83307618959284e-05, 'samples': 19450880, 'steps': 37989, 'batch_loss/train': 0.7362184710800648} 12/27/2021 23:15:26 - INFO - codeparrot_training - Step 37990: {'lr': 6.831993330958236e-05, 'samples': 19451392, 'steps': 37990, 'batch_loss/train': 0.7729928700719029} 12/27/2021 23:15:38 - INFO - codeparrot_training - Step 37991: {'lr': 6.830910544552757e-05, 'samples': 19451904, 'steps': 37991, 'batch_loss/train': 0.7850967422127724} 12/27/2021 23:15:48 - INFO - 
codeparrot_training - Step 37992: {'lr': 6.829827830380717e-05, 'samples': 19452416, 'steps': 37992, 'batch_loss/train': 0.7885849406011403} 12/27/2021 23:15:59 - INFO - codeparrot_training - Step 37993: {'lr': 6.828745188446422e-05, 'samples': 19452928, 'steps': 37993, 'batch_loss/train': 0.6222189285326749} 12/27/2021 23:16:10 - INFO - codeparrot_training - Step 37994: {'lr': 6.827662618754174e-05, 'samples': 19453440, 'steps': 37994, 'batch_loss/train': 0.667322495020926} 12/27/2021 23:16:24 - INFO - codeparrot_training - Step 37995: {'lr': 6.826580121308279e-05, 'samples': 19453952, 'steps': 37995, 'batch_loss/train': 0.8027400281280279} 12/27/2021 23:16:34 - INFO - codeparrot_training - Step 37996: {'lr': 6.825497696113044e-05, 'samples': 19454464, 'steps': 37996, 'batch_loss/train': 0.716345677152276} 12/27/2021 23:16:45 - INFO - codeparrot_training - Step 37997: {'lr': 6.824415343172757e-05, 'samples': 19454976, 'steps': 37997, 'batch_loss/train': 0.5687897165771574} 12/27/2021 23:16:57 - INFO - codeparrot_training - Step 37998: {'lr': 6.823333062491737e-05, 'samples': 19455488, 'steps': 37998, 'batch_loss/train': 0.7134642873425037} 12/27/2021 23:17:08 - INFO - codeparrot_training - Step 37999: {'lr': 6.822250854074288e-05, 'samples': 19456000, 'steps': 37999, 'batch_loss/train': 0.8369523352012038} 12/27/2021 23:17:18 - INFO - codeparrot_training - Step 38000: {'lr': 6.821168717924695e-05, 'samples': 19456512, 'steps': 38000, 'batch_loss/train': 0.7143137902021408} 12/27/2021 23:17:30 - INFO - codeparrot_training - Step 38001: {'lr': 6.820086654047275e-05, 'samples': 19457024, 'steps': 38001, 'batch_loss/train': 0.7138719321228564} 12/27/2021 23:17:41 - INFO - codeparrot_training - Step 38002: {'lr': 6.819004662446335e-05, 'samples': 19457536, 'steps': 38002, 'batch_loss/train': 0.7121083745732903} 12/27/2021 23:17:52 - INFO - codeparrot_training - Step 38003: {'lr': 6.817922743126157e-05, 'samples': 19458048, 'steps': 38003, 'batch_loss/train': 
0.7552255378104746} 12/27/2021 23:18:02 - INFO - codeparrot_training - Step 38004: {'lr': 6.816840896091048e-05, 'samples': 19458560, 'steps': 38004, 'batch_loss/train': 0.8184477486647666} 12/27/2021 23:18:16 - INFO - codeparrot_training - Step 38005: {'lr': 6.815759121345331e-05, 'samples': 19459072, 'steps': 38005, 'batch_loss/train': 0.5731506105512381} 12/27/2021 23:18:27 - INFO - codeparrot_training - Step 38006: {'lr': 6.814677418893278e-05, 'samples': 19459584, 'steps': 38006, 'batch_loss/train': 0.6828083067666739} 12/27/2021 23:18:37 - INFO - codeparrot_training - Step 38007: {'lr': 6.813595788739202e-05, 'samples': 19460096, 'steps': 38007, 'batch_loss/train': 0.7412035157904029} 12/27/2021 23:18:50 - INFO - codeparrot_training - Step 38008: {'lr': 6.812514230887404e-05, 'samples': 19460608, 'steps': 38008, 'batch_loss/train': 0.6493329824879766} 12/27/2021 23:19:01 - INFO - codeparrot_training - Step 38009: {'lr': 6.81143274534218e-05, 'samples': 19461120, 'steps': 38009, 'batch_loss/train': 0.7183527983725071} 12/27/2021 23:19:11 - INFO - codeparrot_training - Step 38010: {'lr': 6.810351332107833e-05, 'samples': 19461632, 'steps': 38010, 'batch_loss/train': 0.8651158325374126} 12/27/2021 23:19:25 - INFO - codeparrot_training - Step 38011: {'lr': 6.80926999118866e-05, 'samples': 19462144, 'steps': 38011, 'batch_loss/train': 0.7002102332189679} 12/27/2021 23:19:36 - INFO - codeparrot_training - Step 38012: {'lr': 6.808188722588963e-05, 'samples': 19462656, 'steps': 38012, 'batch_loss/train': 0.7955935318022966} 12/27/2021 23:19:46 - INFO - codeparrot_training - Step 38013: {'lr': 6.807107526313038e-05, 'samples': 19463168, 'steps': 38013, 'batch_loss/train': 0.6919260867871344} 12/27/2021 23:19:57 - INFO - codeparrot_training - Step 38014: {'lr': 6.80602640236519e-05, 'samples': 19463680, 'steps': 38014, 'batch_loss/train': 0.8581539066508412} 12/27/2021 23:20:09 - INFO - codeparrot_training - Step 38015: {'lr': 6.804945350749698e-05, 'samples': 
19464192, 'steps': 38015, 'batch_loss/train': 0.682075566612184} 12/27/2021 23:20:20 - INFO - codeparrot_training - Step 38016: {'lr': 6.803864371470888e-05, 'samples': 19464704, 'steps': 38016, 'batch_loss/train': 0.747820658609271} 12/27/2021 23:20:30 - INFO - codeparrot_training - Step 38017: {'lr': 6.802783464533033e-05, 'samples': 19465216, 'steps': 38017, 'batch_loss/train': 0.6807028746698052} 12/27/2021 23:20:42 - INFO - codeparrot_training - Step 38018: {'lr': 6.801702629940442e-05, 'samples': 19465728, 'steps': 38018, 'batch_loss/train': 0.6882599787786603} 12/27/2021 23:20:53 - INFO - codeparrot_training - Step 38019: {'lr': 6.80062186769741e-05, 'samples': 19466240, 'steps': 38019, 'batch_loss/train': 0.6906061568297446} 12/27/2021 23:21:04 - INFO - codeparrot_training - Step 38020: {'lr': 6.799541177808232e-05, 'samples': 19466752, 'steps': 38020, 'batch_loss/train': 0.6645843591541052} 12/27/2021 23:21:16 - INFO - codeparrot_training - Step 38021: {'lr': 6.798460560277209e-05, 'samples': 19467264, 'steps': 38021, 'batch_loss/train': 0.8676742487587035} 12/27/2021 23:21:26 - INFO - codeparrot_training - Step 38022: {'lr': 6.79738001510863e-05, 'samples': 19467776, 'steps': 38022, 'batch_loss/train': 0.8486853186041117} 12/27/2021 23:21:37 - INFO - codeparrot_training - Step 38023: {'lr': 6.7962995423068e-05, 'samples': 19468288, 'steps': 38023, 'batch_loss/train': 0.6921036714920774} 12/27/2021 23:21:48 - INFO - codeparrot_training - Step 38024: {'lr': 6.795219141876005e-05, 'samples': 19468800, 'steps': 38024, 'batch_loss/train': 0.6688577681779861} 12/27/2021 23:22:01 - INFO - codeparrot_training - Step 38025: {'lr': 6.794138813820556e-05, 'samples': 19469312, 'steps': 38025, 'batch_loss/train': 0.7574055880540982} 12/27/2021 23:22:12 - INFO - codeparrot_training - Step 38026: {'lr': 6.793058558144719e-05, 'samples': 19469824, 'steps': 38026, 'batch_loss/train': 0.7432022895663977} 12/27/2021 23:22:23 - INFO - codeparrot_training - Step 38027: {'lr': 
6.791978374852814e-05, 'samples': 19470336, 'steps': 38027, 'batch_loss/train': 0.7078547389246523} 12/27/2021 23:22:35 - INFO - codeparrot_training - Step 38028: {'lr': 6.790898263949136e-05, 'samples': 19470848, 'steps': 38028, 'batch_loss/train': 0.7135637564351782} 12/27/2021 23:22:46 - INFO - codeparrot_training - Step 38029: {'lr': 6.789818225437966e-05, 'samples': 19471360, 'steps': 38029, 'batch_loss/train': 0.8664136354345828} 12/27/2021 23:22:56 - INFO - codeparrot_training - Step 38030: {'lr': 6.788738259323592e-05, 'samples': 19471872, 'steps': 38030, 'batch_loss/train': 0.8323615617118776} 12/27/2021 23:23:08 - INFO - codeparrot_training - Step 38031: {'lr': 6.787658365610336e-05, 'samples': 19472384, 'steps': 38031, 'batch_loss/train': 0.8075231658294797} 12/27/2021 23:23:19 - INFO - codeparrot_training - Step 38032: {'lr': 6.786578544302463e-05, 'samples': 19472896, 'steps': 38032, 'batch_loss/train': 0.8379937834106386} 12/27/2021 23:23:30 - INFO - codeparrot_training - Step 38033: {'lr': 6.78549879540428e-05, 'samples': 19473408, 'steps': 38033, 'batch_loss/train': 0.6763877061894163} 12/27/2021 23:23:40 - INFO - codeparrot_training - Step 38034: {'lr': 6.784419118920074e-05, 'samples': 19473920, 'steps': 38034, 'batch_loss/train': 0.7866082938853651} 12/27/2021 23:23:54 - INFO - codeparrot_training - Step 38035: {'lr': 6.78333951485414e-05, 'samples': 19474432, 'steps': 38035, 'batch_loss/train': 0.8483129423111677} 12/27/2021 23:24:05 - INFO - codeparrot_training - Step 38036: {'lr': 6.78225998321077e-05, 'samples': 19474944, 'steps': 38036, 'batch_loss/train': 0.702086822129786} 12/27/2021 23:24:15 - INFO - codeparrot_training - Step 38037: {'lr': 6.781180523994255e-05, 'samples': 19475456, 'steps': 38037, 'batch_loss/train': 0.691686560632661} 12/27/2021 23:24:28 - INFO - codeparrot_training - Step 38038: {'lr': 6.780101137208888e-05, 'samples': 19475968, 'steps': 38038, 'batch_loss/train': 0.6383889203425497} 12/27/2021 23:24:38 - INFO - 
codeparrot_training - Step 38039: {'lr': 6.779021822858958e-05, 'samples': 19476480, 'steps': 38039, 'batch_loss/train': 0.7677472177892923} 12/27/2021 23:24:49 - INFO - codeparrot_training - Step 38040: {'lr': 6.777942580948765e-05, 'samples': 19476992, 'steps': 38040, 'batch_loss/train': 0.7147419803077355} 12/27/2021 23:25:03 - INFO - codeparrot_training - Step 38041: {'lr': 6.776863411482579e-05, 'samples': 19477504, 'steps': 38041, 'batch_loss/train': 0.7787148517090827} 12/27/2021 23:25:13 - INFO - codeparrot_training - Step 38042: {'lr': 6.775784314464717e-05, 'samples': 19478016, 'steps': 38042, 'batch_loss/train': 0.7475278284400702} 12/27/2021 23:25:24 - INFO - codeparrot_training - Step 38043: {'lr': 6.774705289899447e-05, 'samples': 19478528, 'steps': 38043, 'batch_loss/train': 0.6694553000852466} 12/27/2021 23:25:36 - INFO - codeparrot_training - Step 38044: {'lr': 6.77362633779107e-05, 'samples': 19479040, 'steps': 38044, 'batch_loss/train': 0.838444103486836} 12/27/2021 23:25:47 - INFO - codeparrot_training - Step 38045: {'lr': 6.772547458143869e-05, 'samples': 19479552, 'steps': 38045, 'batch_loss/train': 0.6285860827192664} 12/27/2021 23:25:57 - INFO - codeparrot_training - Step 38046: {'lr': 6.771468650962138e-05, 'samples': 19480064, 'steps': 38046, 'batch_loss/train': 0.745768828317523} 12/27/2021 23:26:08 - INFO - codeparrot_training - Step 38047: {'lr': 6.770389916250167e-05, 'samples': 19480576, 'steps': 38047, 'batch_loss/train': 0.7956162407062948} 12/27/2021 23:26:20 - INFO - codeparrot_training - Step 38048: {'lr': 6.76931125401224e-05, 'samples': 19481088, 'steps': 38048, 'batch_loss/train': 0.6165697281248868} 12/27/2021 23:26:31 - INFO - codeparrot_training - Step 38049: {'lr': 6.76823266425265e-05, 'samples': 19481600, 'steps': 38049, 'batch_loss/train': 0.7515297078061849} 12/27/2021 23:26:41 - INFO - codeparrot_training - Step 38050: {'lr': 6.767154146975679e-05, 'samples': 19482112, 'steps': 38050, 'batch_loss/train': 
0.7276681839721277} 12/27/2021 23:26:55 - INFO - codeparrot_training - Step 38051: {'lr': 6.766075702185628e-05, 'samples': 19482624, 'steps': 38051, 'batch_loss/train': 0.7916579321026802} 12/27/2021 23:27:06 - INFO - codeparrot_training - Step 38052: {'lr': 6.764997329886762e-05, 'samples': 19483136, 'steps': 38052, 'batch_loss/train': 0.7021405091509223} 12/27/2021 23:27:16 - INFO - codeparrot_training - Step 38053: {'lr': 6.763919030083387e-05, 'samples': 19483648, 'steps': 38053, 'batch_loss/train': 0.7989082671701908} 12/27/2021 23:27:29 - INFO - codeparrot_training - Step 38054: {'lr': 6.762840802779793e-05, 'samples': 19484160, 'steps': 38054, 'batch_loss/train': 0.7563087644521147} 12/27/2021 23:27:39 - INFO - codeparrot_training - Step 38055: {'lr': 6.76176264798025e-05, 'samples': 19484672, 'steps': 38055, 'batch_loss/train': 0.692258627153933} 12/27/2021 23:27:50 - INFO - codeparrot_training - Step 38056: {'lr': 6.760684565689044e-05, 'samples': 19485184, 'steps': 38056, 'batch_loss/train': 0.7154818926937878} 12/27/2021 23:28:01 - INFO - codeparrot_training - Step 38057: {'lr': 6.759606555910485e-05, 'samples': 19485696, 'steps': 38057, 'batch_loss/train': 0.7899064132943749} 12/27/2021 23:28:13 - INFO - codeparrot_training - Step 38058: {'lr': 6.758528618648835e-05, 'samples': 19486208, 'steps': 38058, 'batch_loss/train': 0.7662624209187925} 12/27/2021 23:28:23 - INFO - codeparrot_training - Step 38059: {'lr': 6.757450753908381e-05, 'samples': 19486720, 'steps': 38059, 'batch_loss/train': 0.7453341474756598} 12/27/2021 23:28:34 - INFO - codeparrot_training - Step 38060: {'lr': 6.756372961693427e-05, 'samples': 19487232, 'steps': 38060, 'batch_loss/train': 0.8154143746942282} 12/27/2021 23:28:46 - INFO - codeparrot_training - Step 38061: {'lr': 6.755295242008241e-05, 'samples': 19487744, 'steps': 38061, 'batch_loss/train': 0.8447525375522673} 12/27/2021 23:28:57 - INFO - codeparrot_training - Step 38062: {'lr': 6.754217594857113e-05, 'samples': 
19488256, 'steps': 38062, 'batch_loss/train': 0.7745918449945748} 12/27/2021 23:29:08 - INFO - codeparrot_training - Step 38063: {'lr': 6.753140020244322e-05, 'samples': 19488768, 'steps': 38063, 'batch_loss/train': 0.6901517147198319} 12/27/2021 23:29:22 - INFO - codeparrot_training - Step 38064: {'lr': 6.75206251817416e-05, 'samples': 19489280, 'steps': 38064, 'batch_loss/train': 0.7283762791194022} 12/27/2021 23:29:33 - INFO - codeparrot_training - Step 38065: {'lr': 6.750985088650907e-05, 'samples': 19489792, 'steps': 38065, 'batch_loss/train': 0.7423744010739028} 12/27/2021 23:29:43 - INFO - codeparrot_training - Step 38066: {'lr': 6.749907731678854e-05, 'samples': 19490304, 'steps': 38066, 'batch_loss/train': 0.758969129063189} 12/27/2021 23:29:54 - INFO - codeparrot_training - Step 38067: {'lr': 6.748830447262261e-05, 'samples': 19490816, 'steps': 38067, 'batch_loss/train': 0.705785698723048} 12/27/2021 23:30:06 - INFO - codeparrot_training - Step 38068: {'lr': 6.747753235405444e-05, 'samples': 19491328, 'steps': 38068, 'batch_loss/train': 0.6774579361081123} 12/27/2021 23:30:17 - INFO - codeparrot_training - Step 38069: {'lr': 6.746676096112658e-05, 'samples': 19491840, 'steps': 38069, 'batch_loss/train': 0.6586057655513287} 12/27/2021 23:30:27 - INFO - codeparrot_training - Step 38070: {'lr': 6.745599029388197e-05, 'samples': 19492352, 'steps': 38070, 'batch_loss/train': 0.7276484143803827} 12/27/2021 23:30:40 - INFO - codeparrot_training - Step 38071: {'lr': 6.744522035236342e-05, 'samples': 19492864, 'steps': 38071, 'batch_loss/train': 0.7556026941165328} 12/27/2021 23:30:50 - INFO - codeparrot_training - Step 38072: {'lr': 6.743445113661373e-05, 'samples': 19493376, 'steps': 38072, 'batch_loss/train': 0.7139811348170042} 12/27/2021 23:31:01 - INFO - codeparrot_training - Step 38073: {'lr': 6.742368264667573e-05, 'samples': 19493888, 'steps': 38073, 'batch_loss/train': 0.8139375812606886} 12/27/2021 23:31:15 - INFO - codeparrot_training - Step 38074: 
{'lr': 6.741291488259224e-05, 'samples': 19494400, 'steps': 38074, 'batch_loss/train': 0.8141045346856117} 12/27/2021 23:31:26 - INFO - codeparrot_training - Step 38075: {'lr': 6.740214784440607e-05, 'samples': 19494912, 'steps': 38075, 'batch_loss/train': 0.8047191211953759} 12/27/2021 23:31:36 - INFO - codeparrot_training - Step 38076: {'lr': 6.739138153215998e-05, 'samples': 19495424, 'steps': 38076, 'batch_loss/train': 0.7730138404294848} 12/27/2021 23:31:47 - INFO - codeparrot_training - Step 38077: {'lr': 6.738061594589692e-05, 'samples': 19495936, 'steps': 38077, 'batch_loss/train': 0.7608152609318495} 12/27/2021 23:31:59 - INFO - codeparrot_training - Step 38078: {'lr': 6.736985108565941e-05, 'samples': 19496448, 'steps': 38078, 'batch_loss/train': 0.7581360279582441} 12/27/2021 23:32:10 - INFO - codeparrot_training - Step 38079: {'lr': 6.735908695149049e-05, 'samples': 19496960, 'steps': 38079, 'batch_loss/train': 0.6992946867831051} 12/27/2021 23:32:20 - INFO - codeparrot_training - Step 38080: {'lr': 6.734832354343292e-05, 'samples': 19497472, 'steps': 38080, 'batch_loss/train': 0.757581832818687} 12/27/2021 23:32:34 - INFO - codeparrot_training - Step 38081: {'lr': 6.733756086152943e-05, 'samples': 19497984, 'steps': 38081, 'batch_loss/train': 0.7205653378041461} 12/27/2021 23:32:45 - INFO - codeparrot_training - Step 38082: {'lr': 6.732679890582272e-05, 'samples': 19498496, 'steps': 38082, 'batch_loss/train': 0.8055658536031842} 12/27/2021 23:32:55 - INFO - codeparrot_training - Step 38083: {'lr': 6.731603767635583e-05, 'samples': 19499008, 'steps': 38083, 'batch_loss/train': 1.6342855783877894} 12/27/2021 23:33:08 - INFO - codeparrot_training - Step 38084: {'lr': 6.730527717317133e-05, 'samples': 19499520, 'steps': 38084, 'batch_loss/train': 0.8505795346572995} 12/27/2021 23:33:18 - INFO - codeparrot_training - Step 38085: {'lr': 6.729451739631199e-05, 'samples': 19500032, 'steps': 38085, 'batch_loss/train': 0.7832476822659373} 12/27/2021 23:33:29 - 
INFO - codeparrot_training - Step 38086: {'lr': 6.728375834582082e-05, 'samples': 19500544, 'steps': 38086, 'batch_loss/train': 0.7139241797849536} 12/27/2021 23:33:40 - INFO - codeparrot_training - Step 38087: {'lr': 6.727300002174033e-05, 'samples': 19501056, 'steps': 38087, 'batch_loss/train': 0.7116087400354445} 12/27/2021 23:33:52 - INFO - codeparrot_training - Step 38088: {'lr': 6.726224242411344e-05, 'samples': 19501568, 'steps': 38088, 'batch_loss/train': 0.6330985588720068} 12/27/2021 23:34:03 - INFO - codeparrot_training - Step 38089: {'lr': 6.725148555298285e-05, 'samples': 19502080, 'steps': 38089, 'batch_loss/train': 0.8502793395891786} 12/27/2021 23:34:13 - INFO - codeparrot_training - Step 38090: {'lr': 6.724072940839135e-05, 'samples': 19502592, 'steps': 38090, 'batch_loss/train': 0.9925122796557844} 12/27/2021 23:34:27 - INFO - codeparrot_training - Step 38091: {'lr': 6.722997399038172e-05, 'samples': 19503104, 'steps': 38091, 'batch_loss/train': 0.7793173061218113} 12/27/2021 23:34:38 - INFO - codeparrot_training - Step 38092: {'lr': 6.721921929899674e-05, 'samples': 19503616, 'steps': 38092, 'batch_loss/train': 0.6986192509066314} 12/27/2021 23:34:49 - INFO - codeparrot_training - Step 38093: {'lr': 6.720846533427901e-05, 'samples': 19504128, 'steps': 38093, 'batch_loss/train': 0.7985102282837033} 12/27/2021 23:34:59 - INFO - codeparrot_training - Step 38094: {'lr': 6.719771209627149e-05, 'samples': 19504640, 'steps': 38094, 'batch_loss/train': 0.6654206307139248} 12/27/2021 23:35:11 - INFO - codeparrot_training - Step 38095: {'lr': 6.718695958501689e-05, 'samples': 19505152, 'steps': 38095, 'batch_loss/train': 0.8024751516059041} 12/27/2021 23:35:22 - INFO - codeparrot_training - Step 38096: {'lr': 6.717620780055775e-05, 'samples': 19505664, 'steps': 38096, 'batch_loss/train': 0.8177887117490172} 12/27/2021 23:35:33 - INFO - codeparrot_training - Step 38097: {'lr': 6.716545674293717e-05, 'samples': 19506176, 'steps': 38097, 'batch_loss/train': 
1.4748151362873614}
12/27/2021 23:35:47 - INFO - codeparrot_training - Step 38098: {'lr': 6.715470641219759e-05, 'samples': 19506688, 'steps': 38098, 'batch_loss/train': 0.7057048599235713}
12/27/2021 23:35:57 - INFO - codeparrot_training - Step 38099: {'lr': 6.714395680838186e-05, 'samples': 19507200, 'steps': 38099, 'batch_loss/train': 0.7789980713278055}
12/27/2021 23:36:08 - INFO - codeparrot_training - Step 38100: {'lr': 6.713320793153272e-05, 'samples': 19507712, 'steps': 38100, 'batch_loss/train': 0.7295831055380404}
12/27/2021 23:36:20 - INFO - codeparrot_training - Step 38101: {'lr': 6.71224597816929e-05, 'samples': 19508224, 'steps': 38101, 'batch_loss/train': 0.7241937201470137}
12/27/2021 23:36:31 - INFO - codeparrot_training - Step 38102: {'lr': 6.711171235890511e-05, 'samples': 19508736, 'steps': 38102, 'batch_loss/train': 0.6947713457047939}
12/27/2021 23:36:41 - INFO - codeparrot_training - Step 38103: {'lr': 6.710096566321213e-05, 'samples': 19509248, 'steps': 38103, 'batch_loss/train': 0.7571607967838645}
12/27/2021 23:36:53 - INFO - codeparrot_training - Step 38104: {'lr': 6.709021969465661e-05, 'samples': 19509760, 'steps': 38104, 'batch_loss/train': 0.8941481318324804}
12/27/2021 23:37:04 - INFO - codeparrot_training - Step 38105: {'lr': 6.707947445328136e-05, 'samples': 19510272, 'steps': 38105, 'batch_loss/train': 0.71685706730932}
12/27/2021 23:37:15 - INFO - codeparrot_training - Step 38106: {'lr': 6.706872993912911e-05, 'samples': 19510784, 'steps': 38106, 'batch_loss/train': 0.7346579041332006}
12/27/2021 23:37:25 - INFO - codeparrot_training - Step 38107: {'lr': 6.705798615224243e-05, 'samples': 19511296, 'steps': 38107, 'batch_loss/train': 0.6891707540489733}
12/27/2021 23:37:37 - INFO - codeparrot_training - Step 38108: {'lr': 6.704724309266402e-05, 'samples': 19511808, 'steps': 38108, 'batch_loss/train': 0.7751633897423744}
12/27/2021 23:37:48 - INFO - codeparrot_training - Step 38109: {'lr': 6.703650076043687e-05, 'samples': 19512320, 'steps': 38109, 'batch_loss/train': 0.7125693676061928}
12/27/2021 23:37:59 - INFO - codeparrot_training - Step 38110: {'lr': 6.702575915560344e-05, 'samples': 19512832, 'steps': 38110, 'batch_loss/train': 0.5810492064338177}
12/27/2021 23:38:12 - INFO - codeparrot_training - Step 38111: {'lr': 6.701501827820644e-05, 'samples': 19513344, 'steps': 38111, 'batch_loss/train': 0.6521981096593663}
12/27/2021 23:38:23 - INFO - codeparrot_training - Step 38112: {'lr': 6.700427812828878e-05, 'samples': 19513856, 'steps': 38112, 'batch_loss/train': 0.7222411064431071}
12/27/2021 23:38:34 - INFO - codeparrot_training - Step 38113: {'lr': 6.699353870589292e-05, 'samples': 19514368, 'steps': 38113, 'batch_loss/train': 0.6939551767427474}
12/27/2021 23:38:46 - INFO - codeparrot_training - Step 38114: {'lr': 6.698280001106163e-05, 'samples': 19514880, 'steps': 38114, 'batch_loss/train': 0.707933540455997}
12/27/2021 23:38:56 - INFO - codeparrot_training - Step 38115: {'lr': 6.697206204383766e-05, 'samples': 19515392, 'steps': 38115, 'batch_loss/train': 0.7744143878808245}
12/27/2021 23:39:07 - INFO - codeparrot_training - Step 38116: {'lr': 6.696132480426364e-05, 'samples': 19515904, 'steps': 38116, 'batch_loss/train': 0.7877387991175056}
12/27/2021 23:39:18 - INFO - codeparrot_training - Step 38117: {'lr': 6.69505882923823e-05, 'samples': 19516416, 'steps': 38117, 'batch_loss/train': 0.7509747589938343}
12/27/2021 23:39:30 - INFO - codeparrot_training - Step 38118: {'lr': 6.693985250823637e-05, 'samples': 19516928, 'steps': 38118, 'batch_loss/train': 0.877484773285687}
12/27/2021 23:39:41 - INFO - codeparrot_training - Step 38119: {'lr': 6.692911745186831e-05, 'samples': 19517440, 'steps': 38119, 'batch_loss/train': 0.6741580655798316}
12/27/2021 23:39:51 - INFO - codeparrot_training - Step 38120: {'lr': 6.691838312332104e-05, 'samples': 19517952, 'steps': 38120, 'batch_loss/train': 0.7566275019198656}
12/27/2021 23:40:05 - INFO - codeparrot_training - Step 38121: {'lr': 6.690764952263723e-05, 'samples': 19518464, 'steps': 38121, 'batch_loss/train': 0.7213246617466211}
12/27/2021 23:40:16 - INFO - codeparrot_training - Step 38122: {'lr': 6.689691664985931e-05, 'samples': 19518976, 'steps': 38122, 'batch_loss/train': 0.6487273168459069}
12/27/2021 23:40:26 - INFO - codeparrot_training - Step 38123: {'lr': 6.688618450503026e-05, 'samples': 19519488, 'steps': 38123, 'batch_loss/train': 0.7360379965975881}
12/27/2021 23:40:38 - INFO - codeparrot_training - Step 38124: {'lr': 6.68754530881925e-05, 'samples': 19520000, 'steps': 38124, 'batch_loss/train': 0.7575803302461281}
12/27/2021 23:40:49 - INFO - codeparrot_training - Step 38125: {'lr': 6.686472239938885e-05, 'samples': 19520512, 'steps': 38125, 'batch_loss/train': 0.6691921236924827}
12/27/2021 23:41:00 - INFO - codeparrot_training - Step 38126: {'lr': 6.685399243866188e-05, 'samples': 19521024, 'steps': 38126, 'batch_loss/train': 0.7452910207211971}
12/27/2021 23:41:10 - INFO - codeparrot_training - Step 38127: {'lr': 6.68432632060543e-05, 'samples': 19521536, 'steps': 38127, 'batch_loss/train': 0.8335386021062732}
12/27/2021 23:41:24 - INFO - codeparrot_training - Step 38128: {'lr': 6.683253470160871e-05, 'samples': 19522048, 'steps': 38128, 'batch_loss/train': 0.5849158206256106}
12/27/2021 23:41:35 - INFO - codeparrot_training - Step 38129: {'lr': 6.682180692536785e-05, 'samples': 19522560, 'steps': 38129, 'batch_loss/train': 0.791334240231663}
12/27/2021 23:41:45 - INFO - codeparrot_training - Step 38130: {'lr': 6.681107987737431e-05, 'samples': 19523072, 'steps': 38130, 'batch_loss/train': 0.7581189121119678}
12/27/2021 23:41:57 - INFO - codeparrot_training - Step 38131: {'lr': 6.680035355767073e-05, 'samples': 19523584, 'steps': 38131, 'batch_loss/train': 0.6989191183820367}
12/27/2021 23:42:08 - INFO - codeparrot_training - Step 38132: {'lr': 6.678962796629984e-05, 'samples': 19524096, 'steps': 38132, 'batch_loss/train': 0.7551042698323727}
12/27/2021 23:42:19 - INFO - codeparrot_training - Step 38133: {'lr': 6.677890310330406e-05, 'samples': 19524608, 'steps': 38133, 'batch_loss/train': 0.8298066155984998}
12/27/2021 23:42:31 - INFO - codeparrot_training - Step 38134: {'lr': 6.676817896872628e-05, 'samples': 19525120, 'steps': 38134, 'batch_loss/train': 0.7614981709048152}
12/27/2021 23:42:41 - INFO - codeparrot_training - Step 38135: {'lr': 6.675745556260909e-05, 'samples': 19525632, 'steps': 38135, 'batch_loss/train': 0.7639945643022656}
12/27/2021 23:42:52 - INFO - codeparrot_training - Step 38136: {'lr': 6.6746732884995e-05, 'samples': 19526144, 'steps': 38136, 'batch_loss/train': 0.7379753682762384}
12/27/2021 23:43:06 - INFO - codeparrot_training - Step 38137: {'lr': 6.673601093592662e-05, 'samples': 19526656, 'steps': 38137, 'batch_loss/train': 0.738164346665144}
12/27/2021 23:43:16 - INFO - codeparrot_training - Step 38138: {'lr': 6.672528971544683e-05, 'samples': 19527168, 'steps': 38138, 'batch_loss/train': 0.639187048189342}
12/27/2021 23:43:27 - INFO - codeparrot_training - Step 38139: {'lr': 6.6714569223598e-05, 'samples': 19527680, 'steps': 38139, 'batch_loss/train': 0.7271018528845161}
12/27/2021 23:43:38 - INFO - codeparrot_training - Step 38140: {'lr': 6.670384946042285e-05, 'samples': 19528192, 'steps': 38140, 'batch_loss/train': 0.7149888621643186}
12/27/2021 23:43:50 - INFO - codeparrot_training - Step 38141: {'lr': 6.669313042596398e-05, 'samples': 19528704, 'steps': 38141, 'batch_loss/train': 0.7427125805988908}
12/27/2021 23:44:01 - INFO - codeparrot_training - Step 38142: {'lr': 6.668241212026402e-05, 'samples': 19529216, 'steps': 38142, 'batch_loss/train': 0.7008967010769993}
12/27/2021 23:44:11 - INFO - codeparrot_training - Step 38143: {'lr': 6.667169454336555e-05, 'samples': 19529728, 'steps': 38143, 'batch_loss/train': 0.7254847539588809}
12/27/2021 23:44:23 - INFO - codeparrot_training - Step 38144: {'lr': 6.666097769531127e-05, 'samples': 19530240, 'steps': 38144, 'batch_loss/train': 0.7852943018078804}
12/27/2021 23:44:34 - INFO - codeparrot_training - Step 38145: {'lr': 6.665026157614359e-05, 'samples': 19530752, 'steps': 38145, 'batch_loss/train': 0.7113714113365859}
12/27/2021 23:44:44 - INFO - codeparrot_training - Step 38146: {'lr': 6.663954618590528e-05, 'samples': 19531264, 'steps': 38146, 'batch_loss/train': 0.8156648334115744}
12/27/2021 23:44:57 - INFO - codeparrot_training - Step 38147: {'lr': 6.662883152463903e-05, 'samples': 19531776, 'steps': 38147, 'batch_loss/train': 0.7965196426957846}
12/27/2021 23:45:07 - INFO - codeparrot_training - Step 38148: {'lr': 6.661811759238711e-05, 'samples': 19532288, 'steps': 38148, 'batch_loss/train': 0.7578490697778761}
12/27/2021 23:45:18 - INFO - codeparrot_training - Step 38149: {'lr': 6.660740438919241e-05, 'samples': 19532800, 'steps': 38149, 'batch_loss/train': 0.6775715027470142}
12/27/2021 23:45:28 - INFO - codeparrot_training - Step 38150: {'lr': 6.659669191509748e-05, 'samples': 19533312, 'steps': 38150, 'batch_loss/train': 0.6566980835050344}
12/27/2021 23:45:42 - INFO - codeparrot_training - Step 38151: {'lr': 6.658598017014481e-05, 'samples': 19533824, 'steps': 38151, 'batch_loss/train': 0.7487490166677162}
12/27/2021 23:45:53 - INFO - codeparrot_training - Step 38152: {'lr': 6.6575269154377e-05, 'samples': 19534336, 'steps': 38152, 'batch_loss/train': 0.7505469135940075}
12/27/2021 23:46:04 - INFO - codeparrot_training - Step 38153: {'lr': 6.656455886783666e-05, 'samples': 19534848, 'steps': 38153, 'batch_loss/train': 0.6856183391064405}
12/27/2021 23:46:16 - INFO - codeparrot_training - Step 38154: {'lr': 6.655384931056637e-05, 'samples': 19535360, 'steps': 38154, 'batch_loss/train': 0.6972450804896653}
12/27/2021 23:46:26 - INFO - codeparrot_training - Step 38155: {'lr': 6.654314048260873e-05, 'samples': 19535872, 'steps': 38155, 'batch_loss/train': 0.7726530702784657}
12/27/2021 23:46:37 - INFO - codeparrot_training - Step 38156: {'lr': 6.653243238400628e-05, 'samples': 19536384, 'steps': 38156, 'batch_loss/train': 0.6028069651219994}
12/27/2021 23:46:51 - INFO - codeparrot_training - Step 38157: {'lr': 6.652172501480161e-05, 'samples': 19536896, 'steps': 38157, 'batch_loss/train': 0.6044069905765355}
12/27/2021 23:47:02 - INFO - codeparrot_training - Step 38158: {'lr': 6.651101837503737e-05, 'samples': 19537408, 'steps': 38158, 'batch_loss/train': 0.7623013517586514}
12/27/2021 23:47:12 - INFO - codeparrot_training - Step 38159: {'lr': 6.650031246475586e-05, 'samples': 19537920, 'steps': 38159, 'batch_loss/train': 0.7555267903953791}
12/27/2021 23:47:23 - INFO - codeparrot_training - Step 38160: {'lr': 6.648960728399991e-05, 'samples': 19538432, 'steps': 38160, 'batch_loss/train': 0.7160872402600944}
12/27/2021 23:47:35 - INFO - codeparrot_training - Step 38161: {'lr': 6.647890283281205e-05, 'samples': 19538944, 'steps': 38161, 'batch_loss/train': 0.6752011426724494}
12/27/2021 23:47:46 - INFO - codeparrot_training - Step 38162: {'lr': 6.646819911123472e-05, 'samples': 19539456, 'steps': 38162, 'batch_loss/train': 0.850629972293973}
12/27/2021 23:47:56 - INFO - codeparrot_training - Step 38163: {'lr': 6.645749611931046e-05, 'samples': 19539968, 'steps': 38163, 'batch_loss/train': 0.7164336594287306}
12/27/2021 23:48:08 - INFO - codeparrot_training - Step 38164: {'lr': 6.644679385708202e-05, 'samples': 19540480, 'steps': 38164, 'batch_loss/train': 0.7175742397084832}
12/27/2021 23:48:19 - INFO - codeparrot_training - Step 38165: {'lr': 6.643609232459178e-05, 'samples': 19540992, 'steps': 38165, 'batch_loss/train': 0.7517568037146702}
12/27/2021 23:48:30 - INFO - codeparrot_training - Step 38166: {'lr': 6.642539152188229e-05, 'samples': 19541504, 'steps': 38166, 'batch_loss/train': 0.7695483909919858}
12/27/2021 23:48:44 - INFO - codeparrot_training - Step 38167: {'lr': 6.641469144899614e-05, 'samples': 19542016, 'steps': 38167, 'batch_loss/train': 0.7216514605097473}
12/27/2021 23:48:55 - INFO - codeparrot_training - Step 38168: {'lr': 6.640399210597588e-05, 'samples': 19542528, 'steps': 38168, 'batch_loss/train': 0.6697763132397085}
12/27/2021 23:49:05 - INFO - codeparrot_training - Step 38169: {'lr': 6.639329349286399e-05, 'samples': 19543040, 'steps': 38169, 'batch_loss/train': 0.7360765342600644}
12/27/2021 23:49:16 - INFO - codeparrot_training - Step 38170: {'lr': 6.638259560970305e-05, 'samples': 19543552, 'steps': 38170, 'batch_loss/train': 0.6837659031152725}
12/27/2021 23:49:29 - INFO - codeparrot_training - Step 38171: {'lr': 6.63718984565356e-05, 'samples': 19544064, 'steps': 38171, 'batch_loss/train': 1.154404896311462}
12/27/2021 23:49:39 - INFO - codeparrot_training - Step 38172: {'lr': 6.636120203340412e-05, 'samples': 19544576, 'steps': 38172, 'batch_loss/train': 0.7409996800124645}
12/27/2021 23:49:50 - INFO - codeparrot_training - Step 38173: {'lr': 6.635050634035125e-05, 'samples': 19545088, 'steps': 38173, 'batch_loss/train': 0.7768778763711452}
12/27/2021 23:50:02 - INFO - codeparrot_training - Step 38174: {'lr': 6.633981137741929e-05, 'samples': 19545600, 'steps': 38174, 'batch_loss/train': 0.7541071875020862}
12/27/2021 23:50:12 - INFO - codeparrot_training - Step 38175: {'lr': 6.632911714465092e-05, 'samples': 19546112, 'steps': 38175, 'batch_loss/train': 0.8278282694518566}
12/27/2021 23:50:23 - INFO - codeparrot_training - Step 38176: {'lr': 6.631842364208873e-05, 'samples': 19546624, 'steps': 38176, 'batch_loss/train': 0.697361696511507}
12/27/2021 23:50:37 - INFO - codeparrot_training - Step 38177: {'lr': 6.630773086977507e-05, 'samples': 19547136, 'steps': 38177, 'batch_loss/train': 0.7426096927374601}
12/27/2021 23:50:47 - INFO - codeparrot_training - Step 38178: {'lr': 6.629703882775252e-05, 'samples': 19547648, 'steps': 38178, 'batch_loss/train': 0.7503580623306334}
12/27/2021 23:50:58 - INFO - codeparrot_training - Step 38179: {'lr': 6.628634751606356e-05, 'samples': 19548160, 'steps': 38179, 'batch_loss/train': 0.8031446812674403}
12/27/2021 23:51:09 - INFO - codeparrot_training - Step 38180: {'lr': 6.627565693475071e-05, 'samples': 19548672, 'steps': 38180, 'batch_loss/train': 0.7554416321218014}
12/27/2021 23:51:21 - INFO - codeparrot_training - Step 38181: {'lr': 6.626496708385646e-05, 'samples': 19549184, 'steps': 38181, 'batch_loss/train': 0.717901112511754}
12/27/2021 23:51:31 - INFO - codeparrot_training - Step 38182: {'lr': 6.625427796342335e-05, 'samples': 19549696, 'steps': 38182, 'batch_loss/train': 0.756148673593998}
12/27/2021 23:51:42 - INFO - codeparrot_training - Step 38183: {'lr': 6.624358957349386e-05, 'samples': 19550208, 'steps': 38183, 'batch_loss/train': 0.7061813119798899}
12/27/2021 23:51:54 - INFO - codeparrot_training - Step 38184: {'lr': 6.623290191411047e-05, 'samples': 19550720, 'steps': 38184, 'batch_loss/train': 0.6967371630016714}
12/27/2021 23:52:05 - INFO - codeparrot_training - Step 38185: {'lr': 6.622221498531566e-05, 'samples': 19551232, 'steps': 38185, 'batch_loss/train': 0.7380496421828866}
12/27/2021 23:52:15 - INFO - codeparrot_training - Step 38186: {'lr': 6.621152878715192e-05, 'samples': 19551744, 'steps': 38186, 'batch_loss/train': 0.6856209053657949}
12/27/2021 23:52:28 - INFO - codeparrot_training - Step 38187: {'lr': 6.620084331966183e-05, 'samples': 19552256, 'steps': 38187, 'batch_loss/train': 0.7341563836671412}
12/27/2021 23:52:38 - INFO - codeparrot_training - Step 38188: {'lr': 6.619015858288774e-05, 'samples': 19552768, 'steps': 38188, 'batch_loss/train': 0.7682216884568334}
12/27/2021 23:52:49 - INFO - codeparrot_training - Step 38189: {'lr': 6.617947457687207e-05, 'samples': 19553280, 'steps': 38189, 'batch_loss/train': 0.7565338155254722}
12/27/2021 23:53:02 - INFO - codeparrot_training - Step 38190: {'lr': 6.616879130165756e-05, 'samples': 19553792, 'steps': 38190, 'batch_loss/train': 0.7000937135890126}
12/27/2021 23:53:13 - INFO - codeparrot_training - Step 38191: {'lr': 6.615810875728645e-05, 'samples': 19554304, 'steps': 38191, 'batch_loss/train': 0.7628334967885166}
12/27/2021 23:53:24 - INFO - codeparrot_training - Step 38192: {'lr': 6.614742694380129e-05, 'samples': 19554816, 'steps': 38192, 'batch_loss/train': 0.6169973557116464}
12/27/2021 23:53:34 - INFO - codeparrot_training - Step 38193: {'lr': 6.613674586124454e-05, 'samples': 19555328, 'steps': 38193, 'batch_loss/train': 0.6988013100926764}
12/27/2021 23:53:46 - INFO - codeparrot_training - Step 38194: {'lr': 6.612606550965866e-05, 'samples': 19555840, 'steps': 38194, 'batch_loss/train': 0.6250166965182871}
12/27/2021 23:53:57 - INFO - codeparrot_training - Step 38195: {'lr': 6.61153858890861e-05, 'samples': 19556352, 'steps': 38195, 'batch_loss/train': 0.7993985074572265}
12/27/2021 23:54:08 - INFO - codeparrot_training - Step 38196: {'lr': 6.610470699956937e-05, 'samples': 19556864, 'steps': 38196, 'batch_loss/train': 0.7396250439342111}
12/27/2021 23:54:22 - INFO - codeparrot_training - Step 38197: {'lr': 6.609402884115087e-05, 'samples': 19557376, 'steps': 38197, 'batch_loss/train': 0.6713492607232183}
12/27/2021 23:54:32 - INFO - codeparrot_training - Step 38198: {'lr': 6.608335141387306e-05, 'samples': 19557888, 'steps': 38198, 'batch_loss/train': 0.7010870794765651}
12/27/2021 23:54:43 - INFO - codeparrot_training - Step 38199: {'lr': 6.607267471777847e-05, 'samples': 19558400, 'steps': 38199, 'batch_loss/train': 0.8469199473038316}
12/27/2021 23:54:55 - INFO - codeparrot_training - Step 38200: {'lr': 6.606199875290936e-05, 'samples': 19558912, 'steps': 38200, 'batch_loss/train': 0.7403491870500147}
12/27/2021 23:55:06 - INFO - codeparrot_training - Step 38201: {'lr': 6.605132351930834e-05, 'samples': 19559424, 'steps': 38201, 'batch_loss/train': 0.8122623143717647}
12/27/2021 23:55:16 - INFO - codeparrot_training - Step 38202: {'lr': 6.60406490170179e-05, 'samples': 19559936, 'steps': 38202, 'batch_loss/train': 0.7378783435560763}
12/27/2021 23:55:27 - INFO - codeparrot_training - Step 38203: {'lr': 6.602997524608026e-05, 'samples': 19560448, 'steps': 38203, 'batch_loss/train': 0.718228334793821}
12/27/2021 23:55:40 - INFO - codeparrot_training - Step 38204: {'lr': 6.601930220653801e-05, 'samples': 19560960, 'steps': 38204, 'batch_loss/train': 0.7608047695830464}
12/27/2021 23:55:50 - INFO - codeparrot_training - Step 38205: {'lr': 6.600862989843354e-05, 'samples': 19561472, 'steps': 38205, 'batch_loss/train': 0.8052284466102719}
12/27/2021 23:56:01 - INFO - codeparrot_training - Step 38206: {'lr': 6.599795832180928e-05, 'samples': 19561984, 'steps': 38206, 'batch_loss/train': 0.5075615612440743}
12/27/2021 23:56:13 - INFO - codeparrot_training - Step 38207: {'lr': 6.598728747670766e-05, 'samples': 19562496, 'steps': 38207, 'batch_loss/train': 0.7177411688026041}
12/27/2021 23:56:24 - INFO - codeparrot_training - Step 38208: {'lr': 6.597661736317112e-05, 'samples': 19563008, 'steps': 38208, 'batch_loss/train': 0.7854504412971437}
12/27/2021 23:56:35 - INFO - codeparrot_training - Step 38209: {'lr': 6.596594798124206e-05, 'samples': 19563520, 'steps': 38209, 'batch_loss/train': 0.6064290511421859}
12/27/2021 23:56:49 - INFO - codeparrot_training - Step 38210: {'lr': 6.595527933096288e-05, 'samples': 19564032, 'steps': 38210, 'batch_loss/train': 0.6457908172160387}
12/27/2021 23:57:00 - INFO - codeparrot_training - Step 38211: {'lr': 6.594461141237604e-05, 'samples': 19564544, 'steps': 38211, 'batch_loss/train': 0.615154932718724}
12/27/2021 23:57:11 - INFO - codeparrot_training - Step 38212: {'lr': 6.593394422552391e-05, 'samples': 19565056, 'steps': 38212, 'batch_loss/train': 0.7028532791882753}
12/27/2021 23:57:21 - INFO - codeparrot_training - Step 38213: {'lr': 6.592327777044902e-05, 'samples': 19565568, 'steps': 38213, 'batch_loss/train': 0.7196987282950431}
12/27/2021 23:57:33 - INFO - codeparrot_training - Step 38214: {'lr': 6.591261204719357e-05, 'samples': 19566080, 'steps': 38214, 'batch_loss/train': 0.7358425405691378}
12/27/2021 23:57:44 - INFO - codeparrot_training - Step 38215: {'lr': 6.59019470558e-05, 'samples': 19566592, 'steps': 38215, 'batch_loss/train': 0.7633813864085823}
12/27/2021 23:57:54 - INFO - codeparrot_training - Step 38216: {'lr': 6.589128279631093e-05, 'samples': 19567104, 'steps': 38216, 'batch_loss/train': 0.7792551862075925}
12/27/2021 23:58:08 - INFO - codeparrot_training - Step 38217: {'lr': 6.588061926876851e-05, 'samples': 19567616, 'steps': 38217, 'batch_loss/train': 0.9625776652246714}
12/27/2021 23:58:19 - INFO - codeparrot_training - Step 38218: {'lr': 6.586995647321525e-05, 'samples': 19568128, 'steps': 38218, 'batch_loss/train': 0.7419016245985404}
12/27/2021 23:58:30 - INFO - codeparrot_training - Step 38219: {'lr': 6.585929440969352e-05, 'samples': 19568640, 'steps': 38219, 'batch_loss/train': 0.6653996298555285}
12/27/2021 23:58:42 - INFO - codeparrot_training - Step 38220: {'lr': 6.584863307824568e-05, 'samples': 19569152, 'steps': 38220, 'batch_loss/train': 0.8016148703172803}
12/27/2021 23:58:53 - INFO - codeparrot_training - Step 38221: {'lr': 6.583797247891416e-05, 'samples': 19569664, 'steps': 38221, 'batch_loss/train': 0.6407985994592309}
12/27/2021 23:59:04 - INFO - codeparrot_training - Step 38222: {'lr': 6.582731261174132e-05, 'samples': 19570176, 'steps': 38222, 'batch_loss/train': 0.7170037301257253}
12/27/2021 23:59:14 - INFO - codeparrot_training - Step 38223: {'lr': 6.581665347676954e-05, 'samples': 19570688, 'steps': 38223, 'batch_loss/train': 0.9801486814394593}
12/27/2021 23:59:26 - INFO - codeparrot_training - Step 38224: {'lr': 6.580599507404123e-05, 'samples': 19571200, 'steps': 38224, 'batch_loss/train': 0.692960518412292}
12/27/2021 23:59:37 - INFO - codeparrot_training - Step 38225: {'lr': 6.579533740359877e-05, 'samples': 19571712, 'steps': 38225, 'batch_loss/train': 0.6958692679181695}
12/27/2021 23:59:48 - INFO - codeparrot_training - Step 38226: {'lr': 6.578468046548438e-05, 'samples': 19572224, 'steps': 38226, 'batch_loss/train': 0.7599851507693529}
12/28/2021 00:00:02 - INFO - codeparrot_training - Step 38227: {'lr': 6.57740242597406e-05, 'samples': 19572736, 'steps': 38227, 'batch_loss/train': 0.7430704317521304}
12/28/2021 00:00:12 - INFO - codeparrot_training - Step 38228: {'lr': 6.576336878640982e-05, 'samples': 19573248, 'steps': 38228, 'batch_loss/train': 0.7251427182927728}
12/28/2021 00:00:23 - INFO - codeparrot_training - Step 38229: {'lr': 6.575271404553424e-05, 'samples': 19573760, 'steps': 38229, 'batch_loss/train': 0.7160039134323597}
12/28/2021 00:00:34 - INFO - codeparrot_training - Step 38230: {'lr': 6.574206003715622e-05, 'samples': 19574272, 'steps': 38230, 'batch_loss/train': 0.8019186542369425}
12/28/2021 00:00:46 - INFO - codeparrot_training - Step 38231: {'lr': 6.573140676131836e-05, 'samples': 19574784, 'steps': 38231, 'batch_loss/train': 0.676960319513455}
12/28/2021 00:00:56 - INFO - codeparrot_training - Step 38232: {'lr': 6.572075421806276e-05, 'samples': 19575296, 'steps': 38232, 'batch_loss/train': 0.8609317666850984}
12/28/2021 00:01:07 - INFO - codeparrot_training - Step 38233: {'lr': 6.571010240743187e-05, 'samples': 19575808, 'steps': 38233, 'batch_loss/train': 0.6560929785482585}
12/28/2021 00:01:19 - INFO - codeparrot_training - Step 38234: {'lr': 6.569945132946803e-05, 'samples': 19576320, 'steps': 38234, 'batch_loss/train': 0.7261673368047923}
12/28/2021 00:01:30 - INFO - codeparrot_training - Step 38235: {'lr': 6.568880098421357e-05, 'samples': 19576832, 'steps': 38235, 'batch_loss/train': 0.7078453821595758}
12/28/2021 00:01:41 - INFO - codeparrot_training - Step 38236: {'lr': 6.567815137171085e-05, 'samples': 19577344, 'steps': 38236, 'batch_loss/train': 0.8043895922601223}
12/28/2021 00:01:55 - INFO - codeparrot_training - Step 38237: {'lr': 6.566750249200221e-05, 'samples': 19577856, 'steps': 38237, 'batch_loss/train': 0.7276739794760942}
12/28/2021 00:02:05 - INFO - codeparrot_training - Step 38238: {'lr': 6.565685434512997e-05, 'samples': 19578368, 'steps': 38238, 'batch_loss/train': 0.700192992342636}
12/28/2021 00:02:16 - INFO - codeparrot_training - Step 38239: {'lr': 6.564620693113647e-05, 'samples': 19578880, 'steps': 38239, 'batch_loss/train': 0.6810625027865171}
12/28/2021 00:02:27 - INFO - codeparrot_training - Step 38240: {'lr': 6.563556025006412e-05, 'samples': 19579392, 'steps': 38240, 'batch_loss/train': 0.7867755638435483}
12/28/2021 00:02:39 - INFO - codeparrot_training - Step 38241: {'lr': 6.562491430195502e-05, 'samples': 19579904, 'steps': 38241, 'batch_loss/train': 0.6703734518960118}
12/28/2021 00:02:49 - INFO - codeparrot_training - Step 38242: {'lr': 6.561426908685179e-05, 'samples': 19580416, 'steps': 38242, 'batch_loss/train': 0.7245019087567925}
12/28/2021 00:03:00 - INFO - codeparrot_training - Step 38243: {'lr': 6.560362460479654e-05, 'samples': 19580928, 'steps': 38243, 'batch_loss/train': 0.7434597158571705}
12/28/2021 00:03:12 - INFO - codeparrot_training - Step 38244: {'lr': 6.559298085583157e-05, 'samples': 19581440, 'steps': 38244, 'batch_loss/train': 0.7909061699174345}
12/28/2021 00:03:23 - INFO - codeparrot_training - Step 38245: {'lr': 6.558233783999942e-05, 'samples': 19581952, 'steps': 38245, 'batch_loss/train': 0.7355686957016587}
12/28/2021 00:03:33 - INFO - codeparrot_training - Step 38246: {'lr': 6.557169555734222e-05, 'samples': 19582464, 'steps': 38246, 'batch_loss/train': 0.8203853573650122}
12/28/2021 00:03:47 - INFO - codeparrot_training - Step 38247: {'lr': 6.556105400790227e-05, 'samples': 19582976, 'steps': 38247, 'batch_loss/train': 0.7661723620258272}
12/28/2021 00:03:58 - INFO - codeparrot_training - Step 38248: {'lr': 6.555041319172197e-05, 'samples': 19583488, 'steps': 38248, 'batch_loss/train': 0.7475724825635552}
12/28/2021 00:04:08 - INFO - codeparrot_training - Step 38249: {'lr': 6.553977310884357e-05, 'samples': 19584000, 'steps': 38249, 'batch_loss/train': 0.822636166587472}
12/28/2021 00:04:20 - INFO - codeparrot_training - Step 38250: {'lr': 6.552913375930939e-05, 'samples': 19584512, 'steps': 38250, 'batch_loss/train': 0.6913166884332895}
12/28/2021 00:04:31 - INFO - codeparrot_training - Step 38251: {'lr': 6.551849514316177e-05, 'samples': 19585024, 'steps': 38251, 'batch_loss/train': 0.7323337830603123}
12/28/2021 00:04:42 - INFO - codeparrot_training - Step 38252: {'lr': 6.550785726044282e-05, 'samples': 19585536, 'steps': 38252, 'batch_loss/train': 0.8059500241652131}
12/28/2021 00:04:52 - INFO - codeparrot_training - Step 38253: {'lr': 6.549722011119504e-05, 'samples': 19586048, 'steps': 38253, 'batch_loss/train': 0.8417808641679585}
12/28/2021 00:05:05 - INFO - codeparrot_training - Step 38254: {'lr': 6.548658369546073e-05, 'samples': 19586560, 'steps': 38254, 'batch_loss/train': 0.7861618297174573}
12/28/2021 00:05:15 - INFO - codeparrot_training - Step 38255: {'lr': 6.547594801328202e-05, 'samples': 19587072, 'steps': 38255, 'batch_loss/train': 0.8468660106882453}
12/28/2021 00:05:26 - INFO - codeparrot_training - Step 38256: {'lr': 6.546531306470118e-05, 'samples': 19587584, 'steps': 38256, 'batch_loss/train': 0.7944851126521826}
12/28/2021 00:05:40 - INFO - codeparrot_training - Step 38257: {'lr': 6.545467884976075e-05, 'samples': 19588096, 'steps': 38257, 'batch_loss/train': 0.7551068705506623}
12/28/2021 00:05:51 - INFO - codeparrot_training - Step 38258: {'lr': 6.544404536850273e-05, 'samples': 19588608, 'steps': 38258, 'batch_loss/train': 0.5950392212253064}
12/28/2021 00:06:01 - INFO - codeparrot_training - Step 38259: {'lr': 6.543341262096952e-05, 'samples': 19589120, 'steps': 38259, 'batch_loss/train': 0.6199569408781826}
12/28/2021 00:06:12 - INFO - codeparrot_training - Step 38260: {'lr': 6.542278060720336e-05, 'samples': 19589632, 'steps': 38260, 'batch_loss/train': 0.7204940365627408}
12/28/2021 00:06:24 - INFO - codeparrot_training - Step 38261: {'lr': 6.541214932724653e-05, 'samples': 19590144, 'steps': 38261, 'batch_loss/train': 0.7649514614604414}
12/28/2021 00:06:35 - INFO - codeparrot_training - Step 38262: {'lr': 6.540151878114132e-05, 'samples': 19590656, 'steps': 38262, 'batch_loss/train': 0.7119155586697161}
12/28/2021 00:06:45 - INFO - codeparrot_training - Step 38263: {'lr': 6.539088896892994e-05, 'samples': 19591168, 'steps': 38263, 'batch_loss/train': 0.39729731815168634}
12/28/2021 00:06:57 - INFO - codeparrot_training - Step 38264: {'lr': 6.538025989065472e-05, 'samples': 19591680, 'steps': 38264, 'batch_loss/train': 0.7150715626776218}
12/28/2021 00:07:08 - INFO - codeparrot_training - Step 38265: {'lr': 6.536963154635783e-05, 'samples': 19592192, 'steps': 38265, 'batch_loss/train': 0.8113149646669626}
12/28/2021 00:07:19 - INFO - codeparrot_training - Step 38266: {'lr': 6.535900393608168e-05, 'samples': 19592704, 'steps': 38266, 'batch_loss/train': 0.7865096526220441}
12/28/2021 00:07:33 - INFO - codeparrot_training - Step 38267: {'lr': 6.534837705986826e-05, 'samples': 19593216, 'steps': 38267, 'batch_loss/train': 0.7727668462321162}
12/28/2021 00:07:43 - INFO - codeparrot_training - Step 38268: {'lr': 6.533775091776009e-05, 'samples': 19593728, 'steps': 38268, 'batch_loss/train': 0.8263084962964058}
12/28/2021 00:07:54 - INFO - codeparrot_training - Step 38269: {'lr': 6.532712550979924e-05, 'samples': 19594240, 'steps': 38269, 'batch_loss/train': 0.7259215348167345}
12/28/2021 00:08:06 - INFO - codeparrot_training - Step 38270: {'lr': 6.531650083602795e-05, 'samples': 19594752, 'steps': 38270, 'batch_loss/train': 0.9146595895290375}
12/28/2021 00:08:17 - INFO - codeparrot_training - Step 38271: {'lr': 6.530587689648865e-05, 'samples': 19595264, 'steps': 38271, 'batch_loss/train': 0.67665906669572}
12/28/2021 00:08:27 - INFO - codeparrot_training - Step 38272: {'lr': 6.529525369122339e-05, 'samples': 19595776, 'steps': 38272, 'batch_loss/train': 0.7706516465987079}
12/28/2021 00:08:38 - INFO - codeparrot_training - Step 38273: {'lr': 6.528463122027445e-05, 'samples': 19596288, 'steps': 38273, 'batch_loss/train': 0.7459305040538311}
12/28/2021 00:08:52 - INFO - codeparrot_training - Step 38274: {'lr': 6.527400948368406e-05, 'samples': 19596800, 'steps': 38274, 'batch_loss/train': 0.7296036505140364}
12/28/2021 00:09:03 - INFO - codeparrot_training - Step 38275: {'lr': 6.526338848149447e-05, 'samples': 19597312, 'steps': 38275, 'batch_loss/train': 0.7390027069486678}
12/28/2021 00:09:13 - INFO - codeparrot_training - Step 38276: {'lr': 6.525276821374787e-05, 'samples': 19597824, 'steps': 38276, 'batch_loss/train': 0.679428106173873}
12/28/2021 00:09:26 - INFO - codeparrot_training - Step 38277: {'lr': 6.52421486804866e-05, 'samples': 19598336, 'steps': 38277, 'batch_loss/train': 0.7297203075140715}
12/28/2021 00:09:36 - INFO - codeparrot_training - Step 38278: {'lr': 6.523152988175262e-05, 'samples': 19598848, 'steps': 38278, 'batch_loss/train': 0.7505086148157716}
12/28/2021 00:09:47 - INFO - codeparrot_training - Step 38279: {'lr': 6.522091181758841e-05, 'samples': 19599360, 'steps': 38279, 'batch_loss/train': 1.1043040258809924}
12/28/2021 00:09:59 - INFO - codeparrot_training - Step 38280: {'lr': 6.521029448803616e-05, 'samples': 19599872, 'steps': 38280, 'batch_loss/train': 0.7818784039700404}
12/28/2021 00:10:10 - INFO - codeparrot_training - Step 38281: {'lr': 6.519967789313786e-05, 'samples': 19600384, 'steps': 38281, 'batch_loss/train': 0.8457827370148152}
12/28/2021 00:10:20 - INFO - codeparrot_training - Step 38282: {'lr': 6.518906203293589e-05, 'samples': 19600896, 'steps': 38282, 'batch_loss/train': 0.7525841114111245}
12/28/2021 00:10:31 - INFO - codeparrot_training - Step 38283: {'lr': 6.517844690747254e-05, 'samples': 19601408, 'steps': 38283, 'batch_loss/train': 0.7965667298994958}
12/28/2021 00:10:43 - INFO - codeparrot_training - Step 38284: {'lr': 6.516783251678985e-05, 'samples': 19601920, 'steps': 38284, 'batch_loss/train': 0.8583795625017956}
12/28/2021 00:10:53 - INFO - codeparrot_training - Step 38285: {'lr': 6.515721886093002e-05, 'samples': 19602432, 'steps': 38285, 'batch_loss/train': 0.6461964971385896}
12/28/2021 00:11:04 - INFO - codeparrot_training - Step 38286: {'lr': 6.514660593993532e-05, 'samples': 19602944, 'steps': 38286, 'batch_loss/train': 0.7654438498429954}
12/28/2021 00:11:18 - INFO - codeparrot_training - Step 38287: {'lr': 6.513599375384791e-05, 'samples': 19603456, 'steps': 38287, 'batch_loss/train': 0.7677399488165975}
12/28/2021 00:11:29 - INFO - codeparrot_training - Step 38288: {'lr': 6.512538230270998e-05, 'samples': 19603968, 'steps': 38288, 'batch_loss/train': 0.7322713094763458}
12/28/2021 00:11:39 - INFO - codeparrot_training - Step 38289: {'lr': 6.511477158656373e-05, 'samples': 19604480, 'steps': 38289, 'batch_loss/train': 0.7068144190125167}
12/28/2021 00:11:51 - INFO - codeparrot_training - Step 38290: {'lr': 6.510416160545132e-05, 'samples': 19604992, 'steps': 38290, 'batch_loss/train': 0.8023194121196866}
12/28/2021 00:12:02 - INFO - codeparrot_training - Step 38291: {'lr': 6.509355235941497e-05, 'samples': 19605504, 'steps': 38291, 'batch_loss/train': 0.7940723514184356}
12/28/2021 00:12:12 - INFO - codeparrot_training - Step 38292: {'lr': 6.508294384849692e-05, 'samples': 19606016, 'steps': 38292, 'batch_loss/train': 0.7273894269019365}
12/28/2021 00:12:23 - INFO - codeparrot_training - Step 38293: {'lr': 6.50723360727391e-05, 'samples': 19606528, 'steps': 38293, 'batch_loss/train': 0.7589011201635003}
12/28/2021 00:12:35 - INFO - codeparrot_training - Step 38294: {'lr': 6.506172903218399e-05, 'samples': 19607040, 'steps': 38294, 'batch_loss/train': 0.681872146204114}
12/28/2021 00:12:46 - INFO - codeparrot_training - Step 38295: {'lr': 6.505112272687356e-05, 'samples': 19607552, 'steps': 38295, 'batch_loss/train': 0.7638281132094562}
12/28/2021 00:12:57 - INFO - codeparrot_training - Step 38296: {'lr': 6.504051715684994e-05, 'samples': 19608064, 'steps': 38296, 'batch_loss/train': 0.7066401953343302}
12/28/2021 00:13:11 - INFO - codeparrot_training - Step 38297: {'lr': 6.502991232215554e-05, 'samples': 19608576, 'steps': 38297, 'batch_loss/train': 0.7373089231550694}
12/28/2021 00:13:21 - INFO - codeparrot_training - Step 38298: {'lr': 6.50193082228323e-05, 'samples': 19609088, 'steps': 38298, 'batch_loss/train': 0.6641963459551334}
12/28/2021 00:13:32 - INFO - codeparrot_training - Step 38299: {'lr': 6.500870485892243e-05, 'samples': 19609600, 'steps': 38299, 'batch_loss/train': 0.7542337737977505}
12/28/2021 00:13:44 - INFO - codeparrot_training - Step 38300: {'lr': 6.499810223046813e-05, 'samples': 19610112, 'steps': 38300, 'batch_loss/train': 0.659111617016606}
12/28/2021 00:13:55 - INFO - codeparrot_training - Step 38301: {'lr': 6.49875003375115e-05, 'samples': 19610624, 'steps': 38301, 'batch_loss/train': 0.8163775624707341}
12/28/2021 00:14:05 - INFO - codeparrot_training - Step 38302: {'lr': 6.497689918009469e-05, 'samples': 19611136, 'steps': 38302, 'batch_loss/train': 0.7661019610241055}
12/28/2021 00:14:16 - INFO - codeparrot_training - Step 38303: {'lr': 6.496629875825997e-05, 'samples': 19611648, 'steps': 38303, 'batch_loss/train': 0.8495402345433831}
12/28/2021 00:14:30 - INFO - codeparrot_training - Step 38304: {'lr': 6.495569907204924e-05, 'samples': 19612160, 'steps': 38304, 'batch_loss/train': 0.7801128569990396}
12/28/2021 00:14:40 - INFO - codeparrot_training - Step 38305: {'lr': 6.494510012150482e-05, 'samples': 19612672, 'steps': 38305, 'batch_loss/train': 0.841670008841902}
12/28/2021 00:14:51 - INFO - codeparrot_training - Step 38306: {'lr': 6.493450190666891e-05, 'samples': 19613184, 'steps': 38306, 'batch_loss/train': 0.7830355037003756}
12/28/2021 00:15:03 - INFO - codeparrot_training - Step 38307: {'lr': 6.492390442758336e-05, 'samples': 19613696, 'steps': 38307, 'batch_loss/train': 0.6977011598646641}
12/28/2021 00:15:14 - INFO - codeparrot_training - Step 38308: {'lr': 6.49133076842906e-05, 'samples': 19614208, 'steps': 38308, 'batch_loss/train': 0.806927002966404}
12/28/2021 00:15:25 - INFO - codeparrot_training - Step 38309: {'lr': 6.490271167683271e-05, 'samples': 19614720, 'steps': 38309, 'batch_loss/train': 0.7385310009121895}
12/28/2021 00:15:37 - INFO - codeparrot_training - Step 38310: {'lr': 6.489211640525164e-05, 'samples': 19615232, 'steps': 38310, 'batch_loss/train': 0.7425264166668057}
12/28/2021 00:15:48 - INFO - codeparrot_training - Step 38311: {'lr': 6.488152186958957e-05, 'samples': 19615744, 'steps': 38311, 'batch_loss/train': 0.7783099012449384}
12/28/2021 00:15:58 - INFO - codeparrot_training - Step 38312: {'lr': 6.487092806988884e-05, 'samples': 19616256, 'steps': 38312, 'batch_loss/train': 0.7977734599262476}
12/28/2021 00:16:09 - INFO - codeparrot_training - Step 38313: {'lr': 6.486033500619128e-05, 'samples': 19616768, 'steps': 38313, 'batch_loss/train': 0.7662871675565839}
12/28/2021 00:16:22 - INFO - codeparrot_training - Step 38314: {'lr': 6.484974267853915e-05, 'samples': 19617280, 'steps': 38314, 'batch_loss/train': 0.8042777333175763}
12/28/2021 00:16:33 - INFO - codeparrot_training - Step 38315: {'lr': 6.483915108697452e-05, 'samples': 19617792, 'steps': 38315, 'batch_loss/train': 0.7805380839854479}
12/28/2021 00:16:43 - INFO - codeparrot_training - Step 38316: {'lr': 6.482856023153954e-05, 'samples': 19618304, 'steps': 38316, 'batch_loss/train': 0.6925967128481716}
12/28/2021 00:16:55 - INFO - codeparrot_training - Step 38317: {'lr': 6.481797011227625e-05, 'samples': 19618816, 'steps': 38317, 'batch_loss/train': 0.6103768647881225}
12/28/2021 00:17:06 - INFO - codeparrot_training - Step 38318: {'lr': 6.480738072922682e-05, 'samples': 19619328, 'steps': 38318, 'batch_loss/train': 0.7465503020212054}
12/28/2021 00:17:17 - INFO - codeparrot_training - Step 38319: {'lr': 6.47967920824333e-05, 'samples': 19619840, 'steps': 38319, 'batch_loss/train': 0.8305830415338278}
12/28/2021 00:17:29 - INFO - codeparrot_training - Step 38320: {'lr': 6.47862041719378e-05, 'samples': 19620352, 'steps': 38320, 'batch_loss/train': 0.7853557167109102}
12/28/2021 00:17:39 - INFO - codeparrot_training - Step 38321: {'lr': 6.47756169977825e-05, 'samples': 19620864, 'steps': 38321, 'batch_loss/train': 0.5991459025535733}
12/28/2021 00:17:50 - INFO - codeparrot_training - Step 38322: {'lr': 6.476503056000927e-05, 'samples': 19621376, 'steps': 38322, 'batch_loss/train': 1.0907602738589048}
12/28/2021 00:18:01 - INFO - codeparrot_training - Step 38323: {'lr': 6.475444485866047e-05, 'samples': 19621888, 'steps': 38323, 'batch_loss/train': 0.7803448652848601}
12/28/2021 00:18:13 - INFO - codeparrot_training - Step 38324: {'lr': 6.474385989377798e-05, 'samples': 19622400, 'steps': 38324, 'batch_loss/train': 0.7460453528910875}
12/28/2021 00:18:23 - INFO - codeparrot_training - Step 38325: {'lr': 6.473327566540396e-05, 'samples': 19622912, 'steps': 38325, 'batch_loss/train': 0.7651059399358928}
12/28/2021 00:18:34 - INFO - codeparrot_training - Step 38326: {'lr': 6.472269217358048e-05, 'samples': 19623424, 'steps': 38326, 'batch_loss/train': 0.7672331392532215}
12/28/2021 00:18:48 - INFO - codeparrot_training - Step 38327: {'lr': 6.47121094183496e-05, 'samples': 19623936, 'steps': 38327, 'batch_loss/train': 0.8058913256973028}
12/28/2021 00:18:58 - INFO - codeparrot_training - Step 38328: {'lr': 6.470152739975343e-05, 'samples': 19624448, 'steps': 38328, 'batch_loss/train': 0.7495194706134498}
12/28/2021 00:19:09 - INFO - codeparrot_training - Step 38329: {'lr': 6.469094611783411e-05, 'samples': 19624960, 'steps': 38329, 'batch_loss/train': 0.6307265651412308}
12/28/2021 00:19:21 - INFO - codeparrot_training - Step 38330: {'lr': 6.468036557263343e-05, 'samples': 19625472, 'steps': 38330, 'batch_loss/train': 0.7361193988472223}
12/28/2021 00:19:32 - INFO - codeparrot_training - Step 38331: {'lr': 6.466978576419374e-05, 'samples': 19625984, 'steps': 38331, 'batch_loss/train': 0.7281195761752315}
12/28/2021 00:19:42 - INFO - codeparrot_training - Step 38332: {'lr': 6.465920669255709e-05, 'samples': 19626496, 'steps': 38332, 'batch_loss/train': 0.7042256176937371}
12/28/2021 00:19:53 - INFO - codeparrot_training - Step 38333: {'lr': 6.464862835776528e-05, 'samples': 19627008, 'steps': 38333, 'batch_loss/train': 0.6992180674569681}
12/28/2021 00:20:05 - INFO - codeparrot_training - Step 38334: {'lr': 6.463805075986062e-05, 'samples': 19627520, 'steps': 38334, 'batch_loss/train': 0.7122929248143919}
12/28/2021 00:20:15 - INFO - codeparrot_training - Step 38335: {'lr': 6.462747389888516e-05, 'samples': 19628032, 'steps': 38335, 'batch_loss/train': 0.7623554370948114}
12/28/2021 00:20:26 - INFO - codeparrot_training - Step 38336: {'lr': 6.461689777488078e-05, 'samples': 19628544, 'steps': 38336, 'batch_loss/train': 0.6748537914827466}
12/28/2021 00:20:40 - INFO - codeparrot_training - Step 38337: {'lr': 6.460632238788958e-05, 'samples': 19629056, 'steps': 38337, 'batch_loss/train': 0.7327154751401395}
12/28/2021 00:20:51 - INFO - codeparrot_training - Step 38338: {'lr': 6.459574773795374e-05, 'samples': 19629568, 'steps': 38338, 'batch_loss/train': 0.740249081980437}
12/28/2021 00:21:01 - INFO - codeparrot_training - Step 38339: {'lr': 6.458517382511517e-05, 'samples': 19630080, 'steps': 38339, 'batch_loss/train': 0.7761305519379675}
12/28/2021 00:21:13 - INFO - codeparrot_training - Step 38340: {'lr': 6.457460064941589e-05, 'samples': 19630592, 'steps': 38340, 'batch_loss/train': 0.7885264130309224}
12/28/2021 00:21:24 - INFO - codeparrot_training - Step 38341: {'lr': 6.456402821089801e-05, 'samples': 19631104, 'steps': 38341, 'batch_loss/train': 0.6643819874152541}
12/28/2021 00:21:35 - INFO - codeparrot_training - Step 38342: {'lr': 6.455345650960356e-05, 'samples': 19631616, 'steps': 38342, 'batch_loss/train': 0.7686267375247553}
12/28/2021 00:21:45 - INFO - codeparrot_training - Step 38343: {'lr': 6.454288554557453e-05, 'samples': 19632128, 'steps': 38343, 'batch_loss/train': 0.7542864098213613}
12/28/2021 00:21:59 - INFO - codeparrot_training - Step 38344: {'lr': 6.453231531885295e-05, 'samples':
19632640, 'steps': 38344, 'batch_loss/train': 0.8060074700042605} 12/28/2021 00:22:10 - INFO - codeparrot_training - Step 38345: {'lr': 6.452174582948083e-05, 'samples': 19633152, 'steps': 38345, 'batch_loss/train': 0.7734018703922629} 12/28/2021 00:22:20 - INFO - codeparrot_training - Step 38346: {'lr': 6.451117707750027e-05, 'samples': 19633664, 'steps': 38346, 'batch_loss/train': 0.806265520513989} 12/28/2021 00:22:32 - INFO - codeparrot_training - Step 38347: {'lr': 6.450060906295327e-05, 'samples': 19634176, 'steps': 38347, 'batch_loss/train': 0.8250538082793355} 12/28/2021 00:22:43 - INFO - codeparrot_training - Step 38348: {'lr': 6.449004178588167e-05, 'samples': 19634688, 'steps': 38348, 'batch_loss/train': 0.7327384087257087} 12/28/2021 00:22:54 - INFO - codeparrot_training - Step 38349: {'lr': 6.447947524632777e-05, 'samples': 19635200, 'steps': 38349, 'batch_loss/train': 0.7325956334825605} 12/28/2021 00:23:06 - INFO - codeparrot_training - Step 38350: {'lr': 6.446890944433334e-05, 'samples': 19635712, 'steps': 38350, 'batch_loss/train': 0.740712609142065} 12/28/2021 00:23:16 - INFO - codeparrot_training - Step 38351: {'lr': 6.445834437994047e-05, 'samples': 19636224, 'steps': 38351, 'batch_loss/train': 0.7457031607627869} 12/28/2021 00:23:27 - INFO - codeparrot_training - Step 38352: {'lr': 6.444778005319119e-05, 'samples': 19636736, 'steps': 38352, 'batch_loss/train': 0.8725676694884896} 12/28/2021 00:23:41 - INFO - codeparrot_training - Step 38353: {'lr': 6.443721646412745e-05, 'samples': 19637248, 'steps': 38353, 'batch_loss/train': 0.6855516275390983} 12/28/2021 00:23:52 - INFO - codeparrot_training - Step 38354: {'lr': 6.442665361279129e-05, 'samples': 19637760, 'steps': 38354, 'batch_loss/train': 0.4789983533555642} 12/28/2021 00:24:02 - INFO - codeparrot_training - Step 38355: {'lr': 6.441609149922468e-05, 'samples': 19638272, 'steps': 38355, 'batch_loss/train': 0.7945847045630217} 12/28/2021 00:24:13 - INFO - codeparrot_training - Step 38356: 
{'lr': 6.440553012346961e-05, 'samples': 19638784, 'steps': 38356, 'batch_loss/train': 0.7204820830374956} 12/28/2021 00:24:25 - INFO - codeparrot_training - Step 38357: {'lr': 6.439496948556809e-05, 'samples': 19639296, 'steps': 38357, 'batch_loss/train': 0.7125729853287339} 12/28/2021 00:24:36 - INFO - codeparrot_training - Step 38358: {'lr': 6.438440958556213e-05, 'samples': 19639808, 'steps': 38358, 'batch_loss/train': 0.7362029834184796} 12/28/2021 00:24:46 - INFO - codeparrot_training - Step 38359: {'lr': 6.437385042349356e-05, 'samples': 19640320, 'steps': 38359, 'batch_loss/train': 0.671988049056381} 12/28/2021 00:24:58 - INFO - codeparrot_training - Step 38360: {'lr': 6.436329199940452e-05, 'samples': 19640832, 'steps': 38360, 'batch_loss/train': 0.7985797375440598} 12/28/2021 00:25:09 - INFO - codeparrot_training - Step 38361: {'lr': 6.435273431333702e-05, 'samples': 19641344, 'steps': 38361, 'batch_loss/train': 0.7597738000331447} 12/28/2021 00:25:20 - INFO - codeparrot_training - Step 38362: {'lr': 6.434217736533287e-05, 'samples': 19641856, 'steps': 38362, 'batch_loss/train': 0.6770162108587101} 12/28/2021 00:25:32 - INFO - codeparrot_training - Step 38363: {'lr': 6.433162115543406e-05, 'samples': 19642368, 'steps': 38363, 'batch_loss/train': 0.7052792422473431} 12/28/2021 00:25:43 - INFO - codeparrot_training - Step 38364: {'lr': 6.432106568368276e-05, 'samples': 19642880, 'steps': 38364, 'batch_loss/train': 0.8148827631957829} 12/28/2021 00:25:53 - INFO - codeparrot_training - Step 38365: {'lr': 6.431051095012073e-05, 'samples': 19643392, 'steps': 38365, 'batch_loss/train': 0.7686430052854121} 12/28/2021 00:26:04 - INFO - codeparrot_training - Step 38366: {'lr': 6.429995695478996e-05, 'samples': 19643904, 'steps': 38366, 'batch_loss/train': 0.7403858588077128} 12/28/2021 00:26:18 - INFO - codeparrot_training - Step 38367: {'lr': 6.428940369773248e-05, 'samples': 19644416, 'steps': 38367, 'batch_loss/train': 0.7410845505073667} 12/28/2021 00:26:28 - 
INFO - codeparrot_training - Step 38368: {'lr': 6.427885117899019e-05, 'samples': 19644928, 'steps': 38368, 'batch_loss/train': 0.7493040142580867} 12/28/2021 00:26:39 - INFO - codeparrot_training - Step 38369: {'lr': 6.426829939860507e-05, 'samples': 19645440, 'steps': 38369, 'batch_loss/train': 0.6543682743795216} 12/28/2021 00:26:51 - INFO - codeparrot_training - Step 38370: {'lr': 6.425774835661907e-05, 'samples': 19645952, 'steps': 38370, 'batch_loss/train': 0.693428578786552} 12/28/2021 00:27:02 - INFO - codeparrot_training - Step 38371: {'lr': 6.424719805307411e-05, 'samples': 19646464, 'steps': 38371, 'batch_loss/train': 0.9033864689990878} 12/28/2021 00:27:12 - INFO - codeparrot_training - Step 38372: {'lr': 6.423664848801216e-05, 'samples': 19646976, 'steps': 38372, 'batch_loss/train': 0.6848810724914074} 12/28/2021 00:27:26 - INFO - codeparrot_training - Step 38373: {'lr': 6.422609966147524e-05, 'samples': 19647488, 'steps': 38373, 'batch_loss/train': 0.7827931270003319} 12/28/2021 00:27:37 - INFO - codeparrot_training - Step 38374: {'lr': 6.421555157350506e-05, 'samples': 19648000, 'steps': 38374, 'batch_loss/train': 0.6135439765639603} 12/28/2021 00:27:48 - INFO - codeparrot_training - Step 38375: {'lr': 6.420500422414382e-05, 'samples': 19648512, 'steps': 38375, 'batch_loss/train': 0.65390006871894} 12/28/2021 00:27:58 - INFO - codeparrot_training - Step 38376: {'lr': 6.419445761343326e-05, 'samples': 19649024, 'steps': 38376, 'batch_loss/train': 0.7094280873425305} 12/28/2021 00:28:10 - INFO - codeparrot_training - Step 38377: {'lr': 6.418391174141539e-05, 'samples': 19649536, 'steps': 38377, 'batch_loss/train': 0.814843999221921} 12/28/2021 00:28:21 - INFO - codeparrot_training - Step 38378: {'lr': 6.417336660813214e-05, 'samples': 19650048, 'steps': 38378, 'batch_loss/train': 0.7379936659708619} 12/28/2021 00:28:31 - INFO - codeparrot_training - Step 38379: {'lr': 6.416282221362538e-05, 'samples': 19650560, 'steps': 38379, 'batch_loss/train': 
0.6340129630407318} 12/28/2021 00:28:44 - INFO - codeparrot_training - Step 38380: {'lr': 6.41522785579371e-05, 'samples': 19651072, 'steps': 38380, 'batch_loss/train': 0.683839307166636} 12/28/2021 00:28:54 - INFO - codeparrot_training - Step 38381: {'lr': 6.414173564110917e-05, 'samples': 19651584, 'steps': 38381, 'batch_loss/train': 0.8094533861149102} 12/28/2021 00:29:05 - INFO - codeparrot_training - Step 38382: {'lr': 6.413119346318353e-05, 'samples': 19652096, 'steps': 38382, 'batch_loss/train': 0.7290824940428138} 12/28/2021 00:29:19 - INFO - codeparrot_training - Step 38383: {'lr': 6.412065202420209e-05, 'samples': 19652608, 'steps': 38383, 'batch_loss/train': 0.6961955849546939} 12/28/2021 00:29:29 - INFO - codeparrot_training - Step 38384: {'lr': 6.411011132420683e-05, 'samples': 19653120, 'steps': 38384, 'batch_loss/train': 0.6858328119851649} 12/28/2021 00:29:40 - INFO - codeparrot_training - Step 38385: {'lr': 6.40995713632394e-05, 'samples': 19653632, 'steps': 38385, 'batch_loss/train': 0.8461006209254265} 12/28/2021 00:29:52 - INFO - codeparrot_training - Step 38386: {'lr': 6.408903214134199e-05, 'samples': 19654144, 'steps': 38386, 'batch_loss/train': 0.7455729246139526} 12/28/2021 00:30:03 - INFO - codeparrot_training - Step 38387: {'lr': 6.407849365855645e-05, 'samples': 19654656, 'steps': 38387, 'batch_loss/train': 0.7552090603858232} 12/28/2021 00:30:14 - INFO - codeparrot_training - Step 38388: {'lr': 6.406795591492454e-05, 'samples': 19655168, 'steps': 38388, 'batch_loss/train': 0.6562322383979335} 12/28/2021 00:30:24 - INFO - codeparrot_training - Step 38389: {'lr': 6.405741891048817e-05, 'samples': 19655680, 'steps': 38389, 'batch_loss/train': 0.8091207966208458} 12/28/2021 00:30:38 - INFO - codeparrot_training - Step 38390: {'lr': 6.404688264528941e-05, 'samples': 19656192, 'steps': 38390, 'batch_loss/train': 0.748350549954921} 12/28/2021 00:30:49 - INFO - codeparrot_training - Step 38391: {'lr': 6.403634711936999e-05, 'samples': 19656704, 
'steps': 38391, 'batch_loss/train': 0.8258429793640971} 12/28/2021 00:30:59 - INFO - codeparrot_training - Step 38392: {'lr': 6.402581233277174e-05, 'samples': 19657216, 'steps': 38392, 'batch_loss/train': 0.63260652354802} 12/28/2021 00:31:12 - INFO - codeparrot_training - Step 38393: {'lr': 6.40152782855368e-05, 'samples': 19657728, 'steps': 38393, 'batch_loss/train': 0.7495514452457428} 12/28/2021 00:31:22 - INFO - codeparrot_training - Step 38394: {'lr': 6.400474497770681e-05, 'samples': 19658240, 'steps': 38394, 'batch_loss/train': 0.6700763031840324} 12/28/2021 00:31:33 - INFO - codeparrot_training - Step 38395: {'lr': 6.399421240932374e-05, 'samples': 19658752, 'steps': 38395, 'batch_loss/train': 0.8013367643579841} 12/28/2021 00:31:45 - INFO - codeparrot_training - Step 38396: {'lr': 6.398368058042944e-05, 'samples': 19659264, 'steps': 38396, 'batch_loss/train': 0.7473797844722867} 12/28/2021 00:31:56 - INFO - codeparrot_training - Step 38397: {'lr': 6.397314949106579e-05, 'samples': 19659776, 'steps': 38397, 'batch_loss/train': 0.7748925001360476} 12/28/2021 00:32:06 - INFO - codeparrot_training - Step 38398: {'lr': 6.396261914127463e-05, 'samples': 19660288, 'steps': 38398, 'batch_loss/train': 0.591587239439832} 12/28/2021 00:32:17 - INFO - codeparrot_training - Step 38399: {'lr': 6.395208953109796e-05, 'samples': 19660800, 'steps': 38399, 'batch_loss/train': 0.700949290767312} 12/28/2021 00:32:29 - INFO - codeparrot_training - Step 38400: {'lr': 6.394156066057738e-05, 'samples': 19661312, 'steps': 38400, 'batch_loss/train': 0.7353537338785827} 12/28/2021 00:32:40 - INFO - codeparrot_training - Step 38401: {'lr': 6.393103252975497e-05, 'samples': 19661824, 'steps': 38401, 'batch_loss/train': 0.6307841779198498} 12/28/2021 00:32:50 - INFO - codeparrot_training - Step 38402: {'lr': 6.392050513867259e-05, 'samples': 19662336, 'steps': 38402, 'batch_loss/train': 0.7597777795162983} 12/28/2021 00:33:04 - INFO - codeparrot_training - Step 38403: {'lr': 
6.390997848737193e-05, 'samples': 19662848, 'steps': 38403, 'batch_loss/train': 0.8178857195889577} 12/28/2021 00:33:15 - INFO - codeparrot_training - Step 38404: {'lr': 6.389945257589497e-05, 'samples': 19663360, 'steps': 38404, 'batch_loss/train': 0.7620668532326818} 12/28/2021 00:33:25 - INFO - codeparrot_training - Step 38405: {'lr': 6.388892740428348e-05, 'samples': 19663872, 'steps': 38405, 'batch_loss/train': 0.8803630908951163} 12/28/2021 00:33:38 - INFO - codeparrot_training - Step 38406: {'lr': 6.387840297257935e-05, 'samples': 19664384, 'steps': 38406, 'batch_loss/train': 0.6754396988544613} 12/28/2021 00:33:49 - INFO - codeparrot_training - Step 38407: {'lr': 6.386787928082444e-05, 'samples': 19664896, 'steps': 38407, 'batch_loss/train': 0.6862713247537613} 12/28/2021 00:33:59 - INFO - codeparrot_training - Step 38408: {'lr': 6.385735632906053e-05, 'samples': 19665408, 'steps': 38408, 'batch_loss/train': 0.7175460597500205} 12/28/2021 00:34:10 - INFO - codeparrot_training - Step 38409: {'lr': 6.38468341173295e-05, 'samples': 19665920, 'steps': 38409, 'batch_loss/train': 0.6468212339095771} 12/28/2021 00:34:22 - INFO - codeparrot_training - Step 38410: {'lr': 6.383631264567316e-05, 'samples': 19666432, 'steps': 38410, 'batch_loss/train': 0.7507250646594912} 12/28/2021 00:34:32 - INFO - codeparrot_training - Step 38411: {'lr': 6.382579191413334e-05, 'samples': 19666944, 'steps': 38411, 'batch_loss/train': 0.7829470243304968} 12/28/2021 00:34:43 - INFO - codeparrot_training - Step 38412: {'lr': 6.38152719227519e-05, 'samples': 19667456, 'steps': 38412, 'batch_loss/train': 0.5927355559542775} 12/28/2021 00:34:56 - INFO - codeparrot_training - Step 38413: {'lr': 6.380475267157071e-05, 'samples': 19667968, 'steps': 38413, 'batch_loss/train': 0.8065078901126981} 12/28/2021 00:35:07 - INFO - codeparrot_training - Step 38414: {'lr': 6.379423416063143e-05, 'samples': 19668480, 'steps': 38414, 'batch_loss/train': 0.762964136665687} 12/28/2021 00:35:18 - INFO - 
codeparrot_training - Step 38415: {'lr': 6.37837163899759e-05, 'samples': 19668992, 'steps': 38415, 'batch_loss/train': 0.6692428891547024} 12/28/2021 00:35:30 - INFO - codeparrot_training - Step 38416: {'lr': 6.377319935964616e-05, 'samples': 19669504, 'steps': 38416, 'batch_loss/train': 0.7497809994965792} 12/28/2021 00:35:40 - INFO - codeparrot_training - Step 38417: {'lr': 6.376268306968377e-05, 'samples': 19670016, 'steps': 38417, 'batch_loss/train': 0.7780832159332931} 12/28/2021 00:35:51 - INFO - codeparrot_training - Step 38418: {'lr': 6.375216752013058e-05, 'samples': 19670528, 'steps': 38418, 'batch_loss/train': 0.7567772939801216} 12/28/2021 00:36:01 - INFO - codeparrot_training - Step 38419: {'lr': 6.374165271102858e-05, 'samples': 19671040, 'steps': 38419, 'batch_loss/train': 0.7838439894840121} 12/28/2021 00:36:13 - INFO - codeparrot_training - Step 38420: {'lr': 6.373113864241937e-05, 'samples': 19671552, 'steps': 38420, 'batch_loss/train': 0.6370644306298345} 12/28/2021 00:36:24 - INFO - codeparrot_training - Step 38421: {'lr': 6.37206253143448e-05, 'samples': 19672064, 'steps': 38421, 'batch_loss/train': 0.7482514674775302} 12/28/2021 00:36:35 - INFO - codeparrot_training - Step 38422: {'lr': 6.371011272684671e-05, 'samples': 19672576, 'steps': 38422, 'batch_loss/train': 0.7714666360989213} 12/28/2021 00:36:48 - INFO - codeparrot_training - Step 38423: {'lr': 6.369960087996688e-05, 'samples': 19673088, 'steps': 38423, 'batch_loss/train': 0.7327038571238518} 12/28/2021 00:36:59 - INFO - codeparrot_training - Step 38424: {'lr': 6.368908977374708e-05, 'samples': 19673600, 'steps': 38424, 'batch_loss/train': 0.7319432012736797} 12/28/2021 00:37:10 - INFO - codeparrot_training - Step 38425: {'lr': 6.367857940822915e-05, 'samples': 19674112, 'steps': 38425, 'batch_loss/train': 0.7264312393963337} 12/28/2021 00:37:22 - INFO - codeparrot_training - Step 38426: {'lr': 6.366806978345475e-05, 'samples': 19674624, 'steps': 38426, 'batch_loss/train': 
0.5986033962108195} 12/28/2021 00:37:33 - INFO - codeparrot_training - Step 38427: {'lr': 6.365756089946578e-05, 'samples': 19675136, 'steps': 38427, 'batch_loss/train': 0.7111239922232926} 12/28/2021 00:37:43 - INFO - codeparrot_training - Step 38428: {'lr': 6.364705275630408e-05, 'samples': 19675648, 'steps': 38428, 'batch_loss/train': 0.656296726083383} 12/28/2021 00:37:54 - INFO - codeparrot_training - Step 38429: {'lr': 6.363654535401118e-05, 'samples': 19676160, 'steps': 38429, 'batch_loss/train': 0.7038709837943316} 12/28/2021 00:38:07 - INFO - codeparrot_training - Step 38430: {'lr': 6.362603869262914e-05, 'samples': 19676672, 'steps': 38430, 'batch_loss/train': 0.719716046587564} 12/28/2021 00:38:18 - INFO - codeparrot_training - Step 38431: {'lr': 6.361553277219956e-05, 'samples': 19677184, 'steps': 38431, 'batch_loss/train': 0.7527082925662398} 12/28/2021 00:38:28 - INFO - codeparrot_training - Step 38432: {'lr': 6.360502759276421e-05, 'samples': 19677696, 'steps': 38432, 'batch_loss/train': 0.7618648714851588} 12/28/2021 00:38:40 - INFO - codeparrot_training - Step 38433: {'lr': 6.359452315436492e-05, 'samples': 19678208, 'steps': 38433, 'batch_loss/train': 0.7802742589265108} 12/28/2021 00:38:51 - INFO - codeparrot_training - Step 38434: {'lr': 6.358401945704339e-05, 'samples': 19678720, 'steps': 38434, 'batch_loss/train': 0.7132532484829426} 12/28/2021 00:39:02 - INFO - codeparrot_training - Step 38435: {'lr': 6.357351650084145e-05, 'samples': 19679232, 'steps': 38435, 'batch_loss/train': 0.8491972051560879} 12/28/2021 00:39:14 - INFO - codeparrot_training - Step 38436: {'lr': 6.35630142858008e-05, 'samples': 19679744, 'steps': 38436, 'batch_loss/train': 0.6819310084683821} 12/28/2021 00:39:24 - INFO - codeparrot_training - Step 38437: {'lr': 6.35525128119632e-05, 'samples': 19680256, 'steps': 38437, 'batch_loss/train': 0.7964048692956567} 12/28/2021 00:39:35 - INFO - codeparrot_training - Step 38438: {'lr': 6.35420120793704e-05, 'samples': 19680768, 
'steps': 38438, 'batch_loss/train': 0.7027354270685464} 12/28/2021 00:39:45 - INFO - codeparrot_training - Step 38439: {'lr': 6.353151208806427e-05, 'samples': 19681280, 'steps': 38439, 'batch_loss/train': 0.7646195278503001} 12/28/2021 00:39:58 - INFO - codeparrot_training - Step 38440: {'lr': 6.352101283808634e-05, 'samples': 19681792, 'steps': 38440, 'batch_loss/train': 0.7578284488990903} 12/28/2021 00:40:08 - INFO - codeparrot_training - Step 38441: {'lr': 6.351051432947837e-05, 'samples': 19682304, 'steps': 38441, 'batch_loss/train': 0.7898561218753457} 12/28/2021 00:40:19 - INFO - codeparrot_training - Step 38442: {'lr': 6.350001656228232e-05, 'samples': 19682816, 'steps': 38442, 'batch_loss/train': 0.8288913741707802} 12/28/2021 00:40:32 - INFO - codeparrot_training - Step 38443: {'lr': 6.348951953653975e-05, 'samples': 19683328, 'steps': 38443, 'batch_loss/train': 0.8095743153244257} 12/28/2021 00:40:43 - INFO - codeparrot_training - Step 38444: {'lr': 6.347902325229232e-05, 'samples': 19683840, 'steps': 38444, 'batch_loss/train': 0.7590812365524471} 12/28/2021 00:40:53 - INFO - codeparrot_training - Step 38445: {'lr': 6.346852770958203e-05, 'samples': 19684352, 'steps': 38445, 'batch_loss/train': 0.5931028816848993} 12/28/2021 00:41:05 - INFO - codeparrot_training - Step 38446: {'lr': 6.345803290845034e-05, 'samples': 19684864, 'steps': 38446, 'batch_loss/train': 0.8167167259380221} 12/28/2021 00:41:16 - INFO - codeparrot_training - Step 38447: {'lr': 6.34475388489391e-05, 'samples': 19685376, 'steps': 38447, 'batch_loss/train': 0.6866022716276348} 12/28/2021 00:41:27 - INFO - codeparrot_training - Step 38448: {'lr': 6.343704553109e-05, 'samples': 19685888, 'steps': 38448, 'batch_loss/train': 0.8194201989099383} 12/28/2021 00:41:41 - INFO - codeparrot_training - Step 38449: {'lr': 6.342655295494478e-05, 'samples': 19686400, 'steps': 38449, 'batch_loss/train': 0.6920407949946821} 12/28/2021 00:41:51 - INFO - codeparrot_training - Step 38450: {'lr': 
6.341606112054513e-05, 'samples': 19686912, 'steps': 38450, 'batch_loss/train': 0.7803047086345032} 12/28/2021 00:42:02 - INFO - codeparrot_training - Step 38451: {'lr': 6.340557002793282e-05, 'samples': 19687424, 'steps': 38451, 'batch_loss/train': 0.8365876264870167} 12/28/2021 00:42:13 - INFO - codeparrot_training - Step 38452: {'lr': 6.339507967714938e-05, 'samples': 19687936, 'steps': 38452, 'batch_loss/train': 0.6642131232656538} 12/28/2021 00:42:25 - INFO - codeparrot_training - Step 38453: {'lr': 6.338459006823672e-05, 'samples': 19688448, 'steps': 38453, 'batch_loss/train': 0.7386783640831709} 12/28/2021 00:42:35 - INFO - codeparrot_training - Step 38454: {'lr': 6.337410120123654e-05, 'samples': 19688960, 'steps': 38454, 'batch_loss/train': 0.6320990797830746} 12/28/2021 00:42:46 - INFO - codeparrot_training - Step 38455: {'lr': 6.336361307619032e-05, 'samples': 19689472, 'steps': 38455, 'batch_loss/train': 0.7699910085648298} 12/28/2021 00:42:58 - INFO - codeparrot_training - Step 38456: {'lr': 6.335312569313998e-05, 'samples': 19689984, 'steps': 38456, 'batch_loss/train': 0.793927138671279} 12/28/2021 00:43:09 - INFO - codeparrot_training - Step 38457: {'lr': 6.334263905212719e-05, 'samples': 19690496, 'steps': 38457, 'batch_loss/train': 0.6866336888633668} 12/28/2021 00:43:19 - INFO - codeparrot_training - Step 38458: {'lr': 6.333215315319355e-05, 'samples': 19691008, 'steps': 38458, 'batch_loss/train': 0.6983337382553145} 12/28/2021 00:43:33 - INFO - codeparrot_training - Step 38459: {'lr': 6.332166799638076e-05, 'samples': 19691520, 'steps': 38459, 'batch_loss/train': 0.7664677398279309} 12/28/2021 00:43:44 - INFO - codeparrot_training - Step 38460: {'lr': 6.331118358173052e-05, 'samples': 19692032, 'steps': 38460, 'batch_loss/train': 0.6944252374814823} 12/28/2021 00:43:55 - INFO - codeparrot_training - Step 38461: {'lr': 6.330069990928452e-05, 'samples': 19692544, 'steps': 38461, 'batch_loss/train': 0.598271930415649} 12/28/2021 00:44:05 - INFO - 
codeparrot_training - Step 38462: {'lr': 6.329021697908447e-05, 'samples': 19693056, 'steps': 38462, 'batch_loss/train': 0.7775097172707319} 12/28/2021 00:44:17 - INFO - codeparrot_training - Step 38463: {'lr': 6.327973479117203e-05, 'samples': 19693568, 'steps': 38463, 'batch_loss/train': 0.7509335740469396} 12/28/2021 00:44:28 - INFO - codeparrot_training - Step 38464: {'lr': 6.326925334558883e-05, 'samples': 19694080, 'steps': 38464, 'batch_loss/train': 0.6825699554756284} 12/28/2021 00:44:39 - INFO - codeparrot_training - Step 38465: {'lr': 6.325877264237664e-05, 'samples': 19694592, 'steps': 38465, 'batch_loss/train': 0.7386633413843811} 12/28/2021 00:44:51 - INFO - codeparrot_training - Step 38466: {'lr': 6.324829268157695e-05, 'samples': 19695104, 'steps': 38466, 'batch_loss/train': 0.8703750874847174} 12/28/2021 00:45:01 - INFO - codeparrot_training - Step 38467: {'lr': 6.323781346323157e-05, 'samples': 19695616, 'steps': 38467, 'batch_loss/train': 0.7684224210679531} 12/28/2021 00:45:12 - INFO - codeparrot_training - Step 38468: {'lr': 6.322733498738223e-05, 'samples': 19696128, 'steps': 38468, 'batch_loss/train': 0.7938327835872769} 12/28/2021 00:45:25 - INFO - codeparrot_training - Step 38469: {'lr': 6.32168572540704e-05, 'samples': 19696640, 'steps': 38469, 'batch_loss/train': 0.710220139939338} 12/28/2021 00:45:36 - INFO - codeparrot_training - Step 38470: {'lr': 6.320638026333772e-05, 'samples': 19697152, 'steps': 38470, 'batch_loss/train': 0.7113888976164162} 12/28/2021 00:45:47 - INFO - codeparrot_training - Step 38471: {'lr': 6.31959040152261e-05, 'samples': 19697664, 'steps': 38471, 'batch_loss/train': 0.7626016344875097} 12/28/2021 00:45:57 - INFO - codeparrot_training - Step 38472: {'lr': 6.318542850977696e-05, 'samples': 19698176, 'steps': 38472, 'batch_loss/train': 0.8056299681775272} 12/28/2021 00:46:10 - INFO - codeparrot_training - Step 38473: {'lr': 6.317495374703202e-05, 'samples': 19698688, 'steps': 38473, 'batch_loss/train': 
0.8087638337165117} 12/28/2021 00:46:20 - INFO - codeparrot_training - Step 38474: {'lr': 6.316447972703293e-05, 'samples': 19699200, 'steps': 38474, 'batch_loss/train': 0.6999473250471056} 12/28/2021 00:46:31 - INFO - codeparrot_training - Step 38475: {'lr': 6.315400644982133e-05, 'samples': 19699712, 'steps': 38475, 'batch_loss/train': 0.5576146920211613} 12/28/2021 00:46:43 - INFO - codeparrot_training - Step 38476: {'lr': 6.314353391543884e-05, 'samples': 19700224, 'steps': 38476, 'batch_loss/train': 0.6610268514486961} 12/28/2021 00:46:54 - INFO - codeparrot_training - Step 38477: {'lr': 6.313306212392717e-05, 'samples': 19700736, 'steps': 38477, 'batch_loss/train': 0.8445931347087026} 12/28/2021 00:47:04 - INFO - codeparrot_training - Step 38478: {'lr': 6.312259107532774e-05, 'samples': 19701248, 'steps': 38478, 'batch_loss/train': 0.79826316004619} 12/28/2021 00:47:16 - INFO - codeparrot_training - Step 38479: {'lr': 6.31121207696824e-05, 'samples': 19701760, 'steps': 38479, 'batch_loss/train': 0.9290223428979516} 12/28/2021 00:47:27 - INFO - codeparrot_training - Step 38480: {'lr': 6.310165120703279e-05, 'samples': 19702272, 'steps': 38480, 'batch_loss/train': 0.6400492573156953} 12/28/2021 00:47:38 - INFO - codeparrot_training - Step 38481: {'lr': 6.309118238742028e-05, 'samples': 19702784, 'steps': 38481, 'batch_loss/train': 0.7267181766219437} 12/28/2021 00:47:48 - INFO - codeparrot_training - Step 38482: {'lr': 6.308071431088674e-05, 'samples': 19703296, 'steps': 38482, 'batch_loss/train': 0.7869244773173705} 12/28/2021 00:48:02 - INFO - codeparrot_training - Step 38483: {'lr': 6.307024697747377e-05, 'samples': 19703808, 'steps': 38483, 'batch_loss/train': 0.7119268216192722} 12/28/2021 00:48:13 - INFO - codeparrot_training - Step 38484: {'lr': 6.305978038722285e-05, 'samples': 19704320, 'steps': 38484, 'batch_loss/train': 0.8163332706317306} 12/28/2021 00:48:23 - INFO - codeparrot_training - Step 38485: {'lr': 6.304931454017566e-05, 'samples': 
19704832, 'steps': 38485, 'batch_loss/train': 0.6311103096231818} 12/28/2021 00:48:35 - INFO - codeparrot_training - Step 38486: {'lr': 6.303884943637378e-05, 'samples': 19705344, 'steps': 38486, 'batch_loss/train': 0.7645781370811164} 12/28/2021 00:48:46 - INFO - codeparrot_training - Step 38487: {'lr': 6.302838507585883e-05, 'samples': 19705856, 'steps': 38487, 'batch_loss/train': 0.7368362348061055} 12/28/2021 00:48:57 - INFO - codeparrot_training - Step 38488: {'lr': 6.301792145867246e-05, 'samples': 19706368, 'steps': 38488, 'batch_loss/train': 0.7685706210322678} 12/28/2021 00:49:11 - INFO - codeparrot_training - Step 38489: {'lr': 6.300745858485623e-05, 'samples': 19706880, 'steps': 38489, 'batch_loss/train': 0.6928879718761891} 12/28/2021 00:49:21 - INFO - codeparrot_training - Step 38490: {'lr': 6.299699645445173e-05, 'samples': 19707392, 'steps': 38490, 'batch_loss/train': 0.6925926632247865} 12/28/2021 00:49:32 - INFO - codeparrot_training - Step 38491: {'lr': 6.298653506750057e-05, 'samples': 19707904, 'steps': 38491, 'batch_loss/train': 0.7901623845100403} 12/28/2021 00:49:44 - INFO - codeparrot_training - Step 38492: {'lr': 6.297607442404432e-05, 'samples': 19708416, 'steps': 38492, 'batch_loss/train': 0.7654010131955147} 12/28/2021 00:49:55 - INFO - codeparrot_training - Step 38493: {'lr': 6.296561452412461e-05, 'samples': 19708928, 'steps': 38493, 'batch_loss/train': 0.6866954765282571} 12/28/2021 00:50:05 - INFO - codeparrot_training - Step 38494: {'lr': 6.295515536778303e-05, 'samples': 19709440, 'steps': 38494, 'batch_loss/train': 0.6141325477510691} 12/28/2021 00:50:16 - INFO - codeparrot_training - Step 38495: {'lr': 6.294469695506109e-05, 'samples': 19709952, 'steps': 38495, 'batch_loss/train': 0.7583265919238329} 12/28/2021 00:50:29 - INFO - codeparrot_training - Step 38496: {'lr': 6.293423928600028e-05, 'samples': 19710464, 'steps': 38496, 'batch_loss/train': 0.7957413461990654} 12/28/2021 00:50:40 - INFO - codeparrot_training - Step 38497: 
{'lr': 6.29237823606425e-05, 'samples': 19710976, 'steps': 38497, 'batch_loss/train': 0.7559268689365126}
12/28/2021 00:50:50 - INFO - codeparrot_training - Step 38498: {'lr': 6.291332617902903e-05, 'samples': 19711488, 'steps': 38498, 'batch_loss/train': 0.5950503406929784}
12/28/2021 00:51:03 - INFO - codeparrot_training - Step 38499: {'lr': 6.290287074120154e-05, 'samples': 19712000, 'steps': 38499, 'batch_loss/train': 0.8953142175450921}
12/28/2021 00:51:14 - INFO - codeparrot_training - Step 38500: {'lr': 6.289241604720158e-05, 'samples': 19712512, 'steps': 38500, 'batch_loss/train': 0.8528660968877375}
12/28/2021 00:51:24 - INFO - codeparrot_training - Step 38501: {'lr': 6.288196209707072e-05, 'samples': 19713024, 'steps': 38501, 'batch_loss/train': 0.4403483749483712}
12/28/2021 00:51:35 - INFO - codeparrot_training - Step 38502: {'lr': 6.287150889085055e-05, 'samples': 19713536, 'steps': 38502, 'batch_loss/train': 0.6731671155430377}
12/28/2021 00:51:47 - INFO - codeparrot_training - Step 38503: {'lr': 6.286105642858258e-05, 'samples': 19714048, 'steps': 38503, 'batch_loss/train': 0.6253696205094457}
12/28/2021 00:51:58 - INFO - codeparrot_training - Step 38504: {'lr': 6.285060471030841e-05, 'samples': 19714560, 'steps': 38504, 'batch_loss/train': 0.5360687880311161}
12/28/2021 00:52:09 - INFO - codeparrot_training - Step 38505: {'lr': 6.284015373606954e-05, 'samples': 19715072, 'steps': 38505, 'batch_loss/train': 0.9558635056018829}
12/28/2021 00:52:23 - INFO - codeparrot_training - Step 38506: {'lr': 6.282970350590767e-05, 'samples': 19715584, 'steps': 38506, 'batch_loss/train': 0.8980566733516753}
12/28/2021 00:52:33 - INFO - codeparrot_training - Step 38507: {'lr': 6.281925401986405e-05, 'samples': 19716096, 'steps': 38507, 'batch_loss/train': 0.857407289557159}
12/28/2021 00:52:44 - INFO - codeparrot_training - Step 38508: {'lr': 6.280880527798047e-05, 'samples': 19716608, 'steps': 38508, 'batch_loss/train': 0.7430108329281211}
12/28/2021 00:52:56 - INFO - codeparrot_training - Step 38509: {'lr': 6.279835728029848e-05, 'samples': 19717120, 'steps': 38509, 'batch_loss/train': 0.7090084319934249}
12/28/2021 00:53:07 - INFO - codeparrot_training - Step 38510: {'lr': 6.278791002685943e-05, 'samples': 19717632, 'steps': 38510, 'batch_loss/train': 0.8043275391682982}
12/28/2021 00:53:17 - INFO - codeparrot_training - Step 38511: {'lr': 6.2777463517705e-05, 'samples': 19718144, 'steps': 38511, 'batch_loss/train': 0.7395359249785542}
12/28/2021 00:53:28 - INFO - codeparrot_training - Step 38512: {'lr': 6.276701775287669e-05, 'samples': 19718656, 'steps': 38512, 'batch_loss/train': 0.7490957519039512}
12/28/2021 00:53:40 - INFO - codeparrot_training - Step 38513: {'lr': 6.2756572732416e-05, 'samples': 19719168, 'steps': 38513, 'batch_loss/train': 0.9049792131409049}
12/28/2021 00:53:51 - INFO - codeparrot_training - Step 38514: {'lr': 6.274612845636448e-05, 'samples': 19719680, 'steps': 38514, 'batch_loss/train': 1.6238095904700458}
12/28/2021 00:54:01 - INFO - codeparrot_training - Step 38515: {'lr': 6.273568492476364e-05, 'samples': 19720192, 'steps': 38515, 'batch_loss/train': 0.6786352186463773}
12/28/2021 00:54:14 - INFO - codeparrot_training - Step 38516: {'lr': 6.272524213765504e-05, 'samples': 19720704, 'steps': 38516, 'batch_loss/train': 0.6207741020480171}
12/28/2021 00:54:24 - INFO - codeparrot_training - Step 38517: {'lr': 6.271480009508012e-05, 'samples': 19721216, 'steps': 38517, 'batch_loss/train': 0.6435707774944603}
12/28/2021 00:54:35 - INFO - codeparrot_training - Step 38518: {'lr': 6.270435879708047e-05, 'samples': 19721728, 'steps': 38518, 'batch_loss/train': 0.7758291559293866}
12/28/2021 00:54:48 - INFO - codeparrot_training - Step 38519: {'lr': 6.269391824369758e-05, 'samples': 19722240, 'steps': 38519, 'batch_loss/train': 0.6319524459540844}
12/28/2021 00:54:59 - INFO - codeparrot_training - Step 38520: {'lr': 6.268347843497299e-05, 'samples': 19722752, 'steps': 38520, 'batch_loss/train': 0.7310259705409408}
12/28/2021 00:55:09 - INFO - codeparrot_training - Step 38521: {'lr': 6.267303937094813e-05, 'samples': 19723264, 'steps': 38521, 'batch_loss/train': 0.7670648219063878}
12/28/2021 00:55:20 - INFO - codeparrot_training - Step 38522: {'lr': 6.26626010516644e-05, 'samples': 19723776, 'steps': 38522, 'batch_loss/train': 0.7185894125141203}
12/28/2021 00:55:32 - INFO - codeparrot_training - Step 38523: {'lr': 6.265216347716363e-05, 'samples': 19724288, 'steps': 38523, 'batch_loss/train': 0.7206030728993937}
12/28/2021 00:55:43 - INFO - codeparrot_training - Step 38524: {'lr': 6.264172664748701e-05, 'samples': 19724800, 'steps': 38524, 'batch_loss/train': 0.6217481712810695}
12/28/2021 00:55:54 - INFO - codeparrot_training - Step 38525: {'lr': 6.263129056267616e-05, 'samples': 19725312, 'steps': 38525, 'batch_loss/train': 0.757339131552726}
12/28/2021 00:56:06 - INFO - codeparrot_training - Step 38526: {'lr': 6.262085522277256e-05, 'samples': 19725824, 'steps': 38526, 'batch_loss/train': 0.7803738475777209}
12/28/2021 00:56:17 - INFO - codeparrot_training - Step 38527: {'lr': 6.261042062781766e-05, 'samples': 19726336, 'steps': 38527, 'batch_loss/train': 0.5679833321482874}
12/28/2021 00:56:27 - INFO - codeparrot_training - Step 38528: {'lr': 6.259998677785301e-05, 'samples': 19726848, 'steps': 38528, 'batch_loss/train': 0.8099242490716279}
12/28/2021 00:56:41 - INFO - codeparrot_training - Step 38529: {'lr': 6.258955367292002e-05, 'samples': 19727360, 'steps': 38529, 'batch_loss/train': 0.6602439088746905}
12/28/2021 00:56:52 - INFO - codeparrot_training - Step 38530: {'lr': 6.257912131306021e-05, 'samples': 19727872, 'steps': 38530, 'batch_loss/train': 0.8143248131964356}
12/28/2021 00:57:02 - INFO - codeparrot_training - Step 38531: {'lr': 6.256868969831506e-05, 'samples': 19728384, 'steps': 38531, 'batch_loss/train': 0.7829656950198114}
12/28/2021 00:57:13 - INFO - codeparrot_training - Step 38532: {'lr': 6.255825882872606e-05, 'samples': 19728896, 'steps': 38532, 'batch_loss/train': 0.7390374080277979}
12/28/2021 00:57:25 - INFO - codeparrot_training - Step 38533: {'lr': 6.254782870433453e-05, 'samples': 19729408, 'steps': 38533, 'batch_loss/train': 0.7125545849557966}
12/28/2021 00:57:36 - INFO - codeparrot_training - Step 38534: {'lr': 6.253739932518215e-05, 'samples': 19729920, 'steps': 38534, 'batch_loss/train': 0.707880390342325}
12/28/2021 00:57:46 - INFO - codeparrot_training - Step 38535: {'lr': 6.252697069131034e-05, 'samples': 19730432, 'steps': 38535, 'batch_loss/train': 0.7527965493500233}
12/28/2021 00:58:01 - INFO - codeparrot_training - Step 38536: {'lr': 6.251654280276042e-05, 'samples': 19730944, 'steps': 38536, 'batch_loss/train': 0.6142848604358733}
12/28/2021 00:58:11 - INFO - codeparrot_training - Step 38537: {'lr': 6.250611565957387e-05, 'samples': 19731456, 'steps': 38537, 'batch_loss/train': 0.80760339461267}
12/28/2021 00:58:22 - INFO - codeparrot_training - Step 38538: {'lr': 6.249568926179237e-05, 'samples': 19731968, 'steps': 38538, 'batch_loss/train': 0.843337781727314}
12/28/2021 00:58:34 - INFO - codeparrot_training - Step 38539: {'lr': 6.248526360945713e-05, 'samples': 19732480, 'steps': 38539, 'batch_loss/train': 0.7361961849965155}
12/28/2021 00:58:44 - INFO - codeparrot_training - Step 38540: {'lr': 6.247483870260967e-05, 'samples': 19732992, 'steps': 38540, 'batch_loss/train': 0.8084354139864445}
12/28/2021 00:58:55 - INFO - codeparrot_training - Step 38541: {'lr': 6.246441454129145e-05, 'samples': 19733504, 'steps': 38541, 'batch_loss/train': 0.7410398635547608}
12/28/2021 00:59:06 - INFO - codeparrot_training - Step 38542: {'lr': 6.245399112554392e-05, 'samples': 19734016, 'steps': 38542, 'batch_loss/train': 0.773354725446552}
12/28/2021 00:59:18 - INFO - codeparrot_training - Step 38543: {'lr': 6.244356845540849e-05, 'samples': 19734528, 'steps': 38543, 'batch_loss/train': 0.6860448494553566}
12/28/2021 00:59:29 - INFO - codeparrot_training - Step 38544: {'lr': 6.243314653092661e-05, 'samples': 19735040, 'steps': 38544, 'batch_loss/train': 0.5387298739515245}
12/28/2021 00:59:39 - INFO - codeparrot_training - Step 38545: {'lr': 6.242272535213971e-05, 'samples': 19735552, 'steps': 38545, 'batch_loss/train': 0.6919528039870784}
12/28/2021 00:59:52 - INFO - codeparrot_training - Step 38546: {'lr': 6.241230491908926e-05, 'samples': 19736064, 'steps': 38546, 'batch_loss/train': 0.5432820612913929}
12/28/2021 01:00:02 - INFO - codeparrot_training - Step 38547: {'lr': 6.240188523181672e-05, 'samples': 19736576, 'steps': 38547, 'batch_loss/train': 0.7221894497051835}
12/28/2021 01:00:13 - INFO - codeparrot_training - Step 38548: {'lr': 6.23914662903633e-05, 'samples': 19737088, 'steps': 38548, 'batch_loss/train': 0.7475956198759377}
12/28/2021 01:00:27 - INFO - codeparrot_training - Step 38549: {'lr': 6.238104809477072e-05, 'samples': 19737600, 'steps': 38549, 'batch_loss/train': 0.7771667591296136}
12/28/2021 01:00:38 - INFO - codeparrot_training - Step 38550: {'lr': 6.237063064508014e-05, 'samples': 19738112, 'steps': 38550, 'batch_loss/train': 0.7804238549433649}
12/28/2021 01:00:48 - INFO - codeparrot_training - Step 38551: {'lr': 6.236021394133312e-05, 'samples': 19738624, 'steps': 38551, 'batch_loss/train': 0.8056922516552731}
12/28/2021 01:00:59 - INFO - codeparrot_training - Step 38552: {'lr': 6.234979798357105e-05, 'samples': 19739136, 'steps': 38552, 'batch_loss/train': 0.7167910602875054}
12/28/2021 01:01:11 - INFO - codeparrot_training - Step 38553: {'lr': 6.233938277183532e-05, 'samples': 19739648, 'steps': 38553, 'batch_loss/train': 0.717138968873769}
12/28/2021 01:01:21 - INFO - codeparrot_training - Step 38554: {'lr': 6.232896830616732e-05, 'samples': 19740160, 'steps': 38554, 'batch_loss/train': 0.7256405760999769}
12/28/2021 01:01:32 - INFO - codeparrot_training - Step 38555: {'lr': 6.231855458660851e-05, 'samples': 19740672, 'steps': 38555, 'batch_loss/train': 0.7291985177434981}
12/28/2021 01:01:45 - INFO - codeparrot_training - Step 38556: {'lr': 6.230814161320025e-05, 'samples': 19741184, 'steps': 38556, 'batch_loss/train': 0.9318191315978765}
12/28/2021 01:01:56 - INFO - codeparrot_training - Step 38557: {'lr': 6.229772938598396e-05, 'samples': 19741696, 'steps': 38557, 'batch_loss/train': 0.9974065995775163}
12/28/2021 01:02:06 - INFO - codeparrot_training - Step 38558: {'lr': 6.228731790500108e-05, 'samples': 19742208, 'steps': 38558, 'batch_loss/train': 1.1106393891386688}
12/28/2021 01:02:17 - INFO - codeparrot_training - Step 38559: {'lr': 6.227690717029284e-05, 'samples': 19742720, 'steps': 38559, 'batch_loss/train': 0.75277094473131}
12/28/2021 01:02:31 - INFO - codeparrot_training - Step 38560: {'lr': 6.226649718190076e-05, 'samples': 19743232, 'steps': 38560, 'batch_loss/train': 0.7417522789910436}
12/28/2021 01:02:42 - INFO - codeparrot_training - Step 38561: {'lr': 6.225608793986628e-05, 'samples': 19743744, 'steps': 38561, 'batch_loss/train': 0.8065208815969527}
12/28/2021 01:02:52 - INFO - codeparrot_training - Step 38562: {'lr': 6.224567944423065e-05, 'samples': 19744256, 'steps': 38562, 'batch_loss/train': 0.5413910199422389}
12/28/2021 01:03:04 - INFO - codeparrot_training - Step 38563: {'lr': 6.223527169503523e-05, 'samples': 19744768, 'steps': 38563, 'batch_loss/train': 0.778783957939595}
12/28/2021 01:03:15 - INFO - codeparrot_training - Step 38564: {'lr': 6.22248646923216e-05, 'samples': 19745280, 'steps': 38564, 'batch_loss/train': 0.6644877181388438}
12/28/2021 01:03:26 - INFO - codeparrot_training - Step 38565: {'lr': 6.221445843613096e-05, 'samples': 19745792, 'steps': 38565, 'batch_loss/train': 0.8251500213518739}
12/28/2021 01:03:40 - INFO - codeparrot_training - Step 38566: {'lr': 6.220405292650472e-05, 'samples': 19746304, 'steps': 38566, 'batch_loss/train': 0.6647642273455858}
12/28/2021 01:03:50 - INFO - codeparrot_training - Step 38567: {'lr': 6.219364816348428e-05, 'samples': 19746816, 'steps': 38567, 'batch_loss/train': 0.6265557520091534}
12/28/2021 01:04:01 - INFO - codeparrot_training - Step 38568: {'lr': 6.218324414711093e-05, 'samples': 19747328, 'steps': 38568, 'batch_loss/train': 0.7970314705744386}
12/28/2021 01:04:12 - INFO - codeparrot_training - Step 38569: {'lr': 6.217284087742611e-05, 'samples': 19747840, 'steps': 38569, 'batch_loss/train': 1.7908611111342907}
12/28/2021 01:04:24 - INFO - codeparrot_training - Step 38570: {'lr': 6.216243835447116e-05, 'samples': 19748352, 'steps': 38570, 'batch_loss/train': 0.9718086153734475}
12/28/2021 01:04:35 - INFO - codeparrot_training - Step 38571: {'lr': 6.215203657828745e-05, 'samples': 19748864, 'steps': 38571, 'batch_loss/train': 0.7160967001691461}
12/28/2021 01:04:45 - INFO - codeparrot_training - Step 38572: {'lr': 6.214163554891627e-05, 'samples': 19749376, 'steps': 38572, 'batch_loss/train': 0.8031385019421577}
12/28/2021 01:04:57 - INFO - codeparrot_training - Step 38573: {'lr': 6.21312352663991e-05, 'samples': 19749888, 'steps': 38573, 'batch_loss/train': 0.8722985750064254}
12/28/2021 01:05:08 - INFO - codeparrot_training - Step 38574: {'lr': 6.212083573077707e-05, 'samples': 19750400, 'steps': 38574, 'batch_loss/train': 0.8614976750686765}
12/28/2021 01:05:19 - INFO - codeparrot_training - Step 38575: {'lr': 6.211043694209179e-05, 'samples': 19750912, 'steps': 38575, 'batch_loss/train': 0.7631275597959757}
12/28/2021 01:05:32 - INFO - codeparrot_training - Step 38576: {'lr': 6.21000389003844e-05, 'samples': 19751424, 'steps': 38576, 'batch_loss/train': 0.7844944503158331}
12/28/2021 01:05:43 - INFO - codeparrot_training - Step 38577: {'lr': 6.208964160569625e-05, 'samples': 19751936, 'steps': 38577, 'batch_loss/train': 0.6183680053800344}
12/28/2021 01:05:54 - INFO - codeparrot_training - Step 38578: {'lr': 6.207924505806886e-05, 'samples': 19752448, 'steps': 38578, 'batch_loss/train': 0.7254118821583688}
12/28/2021 01:06:04 - INFO - codeparrot_training - Step 38579: {'lr': 6.206884925754336e-05, 'samples': 19752960, 'steps': 38579, 'batch_loss/train': 0.8172244727611542}
12/28/2021 01:06:16 - INFO - codeparrot_training - Step 38580: {'lr': 6.205845420416117e-05, 'samples': 19753472, 'steps': 38580, 'batch_loss/train': 0.8003630638122559}
12/28/2021 01:06:27 - INFO - codeparrot_training - Step 38581: {'lr': 6.204805989796358e-05, 'samples': 19753984, 'steps': 38581, 'batch_loss/train': 0.7306803925894201}
12/28/2021 01:06:38 - INFO - codeparrot_training - Step 38582: {'lr': 6.203766633899196e-05, 'samples': 19754496, 'steps': 38582, 'batch_loss/train': 0.7494755536317825}
12/28/2021 01:06:50 - INFO - codeparrot_training - Step 38583: {'lr': 6.202727352728759e-05, 'samples': 19755008, 'steps': 38583, 'batch_loss/train': 0.6699392859009095}
12/28/2021 01:07:01 - INFO - codeparrot_training - Step 38584: {'lr': 6.201688146289189e-05, 'samples': 19755520, 'steps': 38584, 'batch_loss/train': 0.6762027539662085}
12/28/2021 01:07:11 - INFO - codeparrot_training - Step 38585: {'lr': 6.200649014584595e-05, 'samples': 19756032, 'steps': 38585, 'batch_loss/train': 0.8261265624314547}
12/28/2021 01:07:22 - INFO - codeparrot_training - Step 38586: {'lr': 6.199609957619127e-05, 'samples': 19756544, 'steps': 38586, 'batch_loss/train': 1.1167248180136085}
12/28/2021 01:07:36 - INFO - codeparrot_training - Step 38587: {'lr': 6.198570975396919e-05, 'samples': 19757056, 'steps': 38587, 'batch_loss/train': 0.7816394474357367}
12/28/2021 01:07:47 - INFO - codeparrot_training - Step 38588: {'lr': 6.197532067922088e-05, 'samples': 19757568, 'steps': 38588, 'batch_loss/train': 0.7675547460094094}
12/28/2021 01:07:57 - INFO - codeparrot_training - Step 38589: {'lr': 6.196493235198761e-05, 'samples': 19758080, 'steps': 38589, 'batch_loss/train': 0.6204850105568767}
12/28/2021 01:08:09 - INFO - codeparrot_training - Step 38590: {'lr': 6.195454477231092e-05, 'samples': 19758592, 'steps': 38590, 'batch_loss/train': 0.6553027506452054}
12/28/2021 01:08:20 - INFO - codeparrot_training - Step 38591: {'lr': 6.19441579402319e-05, 'samples': 19759104, 'steps': 38591, 'batch_loss/train': 0.779658603714779}
12/28/2021 01:08:31 - INFO - codeparrot_training - Step 38592: {'lr': 6.19337718557918e-05, 'samples': 19759616, 'steps': 38592, 'batch_loss/train': 0.6965190693736076}
12/28/2021 01:08:43 - INFO - codeparrot_training - Step 38593: {'lr': 6.192338651903218e-05, 'samples': 19760128, 'steps': 38593, 'batch_loss/train': 0.8091543577611446}
12/28/2021 01:08:53 - INFO - codeparrot_training - Step 38594: {'lr': 6.191300192999408e-05, 'samples': 19760640, 'steps': 38594, 'batch_loss/train': 0.7313229739665985}
12/28/2021 01:09:04 - INFO - codeparrot_training - Step 38595: {'lr': 6.190261808871886e-05, 'samples': 19761152, 'steps': 38595, 'batch_loss/train': 0.794970735674724}
12/28/2021 01:09:18 - INFO - codeparrot_training - Step 38596: {'lr': 6.189223499524783e-05, 'samples': 19761664, 'steps': 38596, 'batch_loss/train': 0.648811215069145}
12/28/2021 01:09:29 - INFO - codeparrot_training - Step 38597: {'lr': 6.188185264962221e-05, 'samples': 19762176, 'steps': 38597, 'batch_loss/train': 0.6191862199921161}
12/28/2021 01:09:39 - INFO - codeparrot_training - Step 38598: {'lr': 6.187147105188335e-05, 'samples': 19762688, 'steps': 38598, 'batch_loss/train': 0.6543341684155166}
12/28/2021 01:09:50 - INFO - codeparrot_training - Step 38599: {'lr': 6.186109020207253e-05, 'samples': 19763200, 'steps': 38599, 'batch_loss/train': 0.7035743452725001}
12/28/2021 01:10:02 - INFO - codeparrot_training - Step 38600: {'lr': 6.185071010023083e-05, 'samples': 19763712, 'steps': 38600, 'batch_loss/train': 0.5914978196378797}
12/28/2021 01:10:13 - INFO - codeparrot_training - Step 38601: {'lr': 6.184033074639984e-05, 'samples': 19764224, 'steps': 38601, 'batch_loss/train': 0.9044416649267077}
12/28/2021 01:10:23 - INFO - codeparrot_training - Step 38602: {'lr': 6.182995214062053e-05, 'samples': 19764736, 'steps': 38602, 'batch_loss/train': 0.6889653401449323}
12/28/2021 01:10:35 - INFO - codeparrot_training - Step 38603: {'lr': 6.181957428293425e-05, 'samples': 19765248, 'steps': 38603, 'batch_loss/train': 0.723068630322814}
12/28/2021 01:10:46 - INFO - codeparrot_training - Step 38604: {'lr': 6.180919717338243e-05, 'samples': 19765760, 'steps': 38604, 'batch_loss/train': 0.9011699249967933}
12/28/2021 01:10:57 - INFO - codeparrot_training - Step 38605: {'lr': 6.179882081200609e-05, 'samples': 19766272, 'steps': 38605, 'batch_loss/train': 0.7294855704531074}
12/28/2021 01:11:07 - INFO - codeparrot_training - Step 38606: {'lr': 6.178844519884658e-05, 'samples': 19766784, 'steps': 38606, 'batch_loss/train': 0.7963156811892986}
12/28/2021 01:11:21 - INFO - codeparrot_training - Step 38607: {'lr': 6.177807033394517e-05, 'samples': 19767296, 'steps': 38607, 'batch_loss/train': 0.9892269414849579}
12/28/2021 01:11:31 - INFO - codeparrot_training - Step 38608: {'lr': 6.176769621734305e-05, 'samples': 19767808, 'steps': 38608, 'batch_loss/train': 0.7199160065501928}
12/28/2021 01:11:42 - INFO - codeparrot_training - Step 38609: {'lr': 6.175732284908153e-05, 'samples': 19768320, 'steps': 38609, 'batch_loss/train': 0.7478421991690993}
12/28/2021 01:11:54 - INFO - codeparrot_training - Step 38610: {'lr': 6.174695022920184e-05, 'samples': 19768832, 'steps': 38610, 'batch_loss/train': 0.8302606204524636}
12/28/2021 01:12:05 - INFO - codeparrot_training - Step 38611: {'lr': 6.17365783577451e-05, 'samples': 19769344, 'steps': 38611, 'batch_loss/train': 0.7372376315761358}
12/28/2021 01:12:15 - INFO - codeparrot_training - Step 38612: {'lr': 6.17262072347527e-05, 'samples': 19769856, 'steps': 38612, 'batch_loss/train': 0.7054348101373762}
12/28/2021 01:12:27 - INFO - codeparrot_training - Step 38613: {'lr': 6.171583686026586e-05, 'samples': 19770368, 'steps': 38613, 'batch_loss/train': 0.8489370420575142}
12/28/2021 01:12:38 - INFO - codeparrot_training - Step 38614: {'lr': 6.170546723432563e-05, 'samples': 19770880, 'steps': 38614, 'batch_loss/train': 0.8445638599805534}
12/28/2021 01:12:49 - INFO - codeparrot_training - Step 38615: {'lr': 6.169509835697346e-05, 'samples': 19771392, 'steps': 38615, 'batch_loss/train': 0.7939928006380796}
12/28/2021 01:12:59 - INFO - codeparrot_training - Step 38616: {'lr': 6.168473022825052e-05, 'samples': 19771904, 'steps': 38616, 'batch_loss/train': 0.814802622422576}
12/28/2021 01:13:13 - INFO - codeparrot_training - Step 38617: {'lr': 6.167436284819792e-05, 'samples': 19772416, 'steps': 38617, 'batch_loss/train': 0.7011732854880393}
12/28/2021 01:13:24 - INFO - codeparrot_training - Step 38618: {'lr': 6.166399621685686e-05, 'samples': 19772928, 'steps': 38618, 'batch_loss/train': 0.7313916649436578}
12/28/2021 01:13:35 - INFO - codeparrot_training - Step 38619: {'lr': 6.165363033426879e-05, 'samples': 19773440, 'steps': 38619, 'batch_loss/train': 0.8015617472119629}
12/28/2021 01:13:47 - INFO - codeparrot_training - Step 38620: {'lr': 6.164326520047472e-05, 'samples': 19773952, 'steps': 38620, 'batch_loss/train': 0.6499131377786398}
12/28/2021 01:13:57 - INFO - codeparrot_training - Step 38621: {'lr': 6.16329008155159e-05, 'samples': 19774464, 'steps': 38621, 'batch_loss/train': 0.737680222839117}
12/28/2021 01:14:08 - INFO - codeparrot_training - Step 38622: {'lr': 6.162253717943353e-05, 'samples': 19774976, 'steps': 38622, 'batch_loss/train': 0.740686247125268}
12/28/2021 01:14:22 - INFO - codeparrot_training - Step 38623: {'lr': 6.161217429226881e-05, 'samples': 19775488, 'steps': 38623, 'batch_loss/train': 0.7208714819862507}
12/28/2021 01:14:33 - INFO - codeparrot_training - Step 38624: {'lr': 6.160181215406296e-05, 'samples': 19776000, 'steps': 38624, 'batch_loss/train': 0.6787209392059594}
12/28/2021 01:14:43 - INFO - codeparrot_training - Step 38625: {'lr': 6.159145076485726e-05, 'samples': 19776512, 'steps': 38625, 'batch_loss/train': 0.7371227843686938}
12/28/2021 01:14:55 - INFO - codeparrot_training - Step 38626: {'lr': 6.158109012469265e-05, 'samples': 19777024, 'steps': 38626, 'batch_loss/train': 0.7128598405979574}
12/28/2021 01:15:06 - INFO - codeparrot_training - Step 38627: {'lr': 6.157073023361057e-05, 'samples': 19777536, 'steps': 38627, 'batch_loss/train': 0.7585228346288204}
12/28/2021 01:15:17 - INFO - codeparrot_training - Step 38628: {'lr': 6.156037109165219e-05, 'samples': 19778048, 'steps': 38628, 'batch_loss/train': 0.825270933099091}
12/28/2021 01:15:27 - INFO - codeparrot_training - Step 38629: {'lr': 6.155001269885846e-05, 'samples': 19778560, 'steps': 38629, 'batch_loss/train': 0.7335618590004742}
12/28/2021 01:15:40 - INFO - codeparrot_training - Step 38630: {'lr': 6.153965505527087e-05, 'samples': 19779072, 'steps': 38630, 'batch_loss/train': 0.7966350428760052}
12/28/2021 01:15:50 - INFO - codeparrot_training - Step 38631: {'lr': 6.15292981609304e-05, 'samples': 19779584, 'steps': 38631, 'batch_loss/train': 0.8570838836021721}
12/28/2021 01:16:01 - INFO - codeparrot_training - Step 38632: {'lr': 6.151894201587824e-05, 'samples': 19780096, 'steps': 38632, 'batch_loss/train': 0.7907201701309532}
12/28/2021 01:16:13 - INFO - codeparrot_training - Step 38633: {'lr': 6.150858662015563e-05, 'samples': 19780608, 'steps': 38633, 'batch_loss/train': 0.7726617471198551}
12/28/2021 01:16:24 - INFO - codeparrot_training - Step 38634: {'lr': 6.149823197380369e-05, 'samples': 19781120, 'steps': 38634, 'batch_loss/train': 0.6272670495673083}
12/28/2021 01:16:34 - INFO - codeparrot_training - Step 38635: {'lr': 6.148787807686362e-05, 'samples': 19781632, 'steps': 38635, 'batch_loss/train': 0.7532835580641404}
12/28/2021 01:16:48 - INFO - codeparrot_training - Step 38636: {'lr': 6.147752492937656e-05, 'samples': 19782144, 'steps': 38636, 'batch_loss/train': 0.661607435438782}
12/28/2021 01:16:59 - INFO - codeparrot_training - Step 38637: {'lr': 6.146717253138367e-05, 'samples': 19782656, 'steps': 38637, 'batch_loss/train': 0.6417313073761761}
12/28/2021 01:17:10 - INFO - codeparrot_training - Step 38638: {'lr': 6.145682088292614e-05, 'samples': 19783168, 'steps': 38638, 'batch_loss/train': 0.7465582666918635}
12/28/2021 01:17:20 - INFO - codeparrot_training - Step 38639: {'lr': 6.144646998404515e-05, 'samples': 19783680, 'steps': 38639, 'batch_loss/train': 0.7103332779370248}
12/28/2021 01:17:32 - INFO - codeparrot_training - Step 38640: {'lr': 6.143611983478165e-05, 'samples': 19784192, 'steps': 38640, 'batch_loss/train': 0.7732701641507447}
12/28/2021 01:17:43 - INFO - codeparrot_training - Step 38641: {'lr': 6.142577043517703e-05, 'samples': 19784704, 'steps': 38641, 'batch_loss/train': 0.7469205399975181}
12/28/2021 01:17:53 - INFO - codeparrot_training - Step 38642: {'lr': 6.14154217852724e-05, 'samples': 19785216, 'steps': 38642, 'batch_loss/train': 0.7541529033333063}
12/28/2021 01:18:06 - INFO - codeparrot_training - Step 38643: {'lr': 6.140507388510879e-05, 'samples': 19785728, 'steps': 38643, 'batch_loss/train': 0.787351957987994}
12/28/2021 01:18:16 - INFO - codeparrot_training - Step 38644: {'lr': 6.139472673472732e-05, 'samples': 19786240, 'steps': 38644, 'batch_loss/train': 0.9939497150480747}
12/28/2021 01:18:27 - INFO - codeparrot_training - Step 38645: {'lr': 6.138438033416932e-05, 'samples': 19786752, 'steps': 38645, 'batch_loss/train': 0.7314211265183985}
12/28/2021 01:18:39 - INFO - codeparrot_training - Step 38646: {'lr': 6.137403468347578e-05, 'samples': 19787264, 'steps': 38646, 'batch_loss/train': 0.727678531780839}
12/28/2021 01:18:50 - INFO - codeparrot_training - Step 38647: {'lr': 6.13636897826878e-05, 'samples': 19787776, 'steps': 38647, 'batch_loss/train': 0.6394073537085205}
12/28/2021 01:19:01 - INFO - codeparrot_training - Step 38648: {'lr': 6.135334563184661e-05, 'samples': 19788288, 'steps': 38648, 'batch_loss/train': 0.7118160864338279}
12/28/2021 01:19:11 - INFO - codeparrot_training - Step 38649: {'lr': 6.134300223099326e-05, 'samples': 19788800, 'steps': 38649, 'batch_loss/train': 0.6039572429144755}
12/28/2021 01:19:26 - INFO - codeparrot_training - Step 38650: {'lr': 6.133265958016893e-05, 'samples': 19789312, 'steps': 38650, 'batch_loss/train': 0.8268266348168254}
12/28/2021 01:19:36 - INFO - codeparrot_training - Step 38651: {'lr': 6.132231767941468e-05, 'samples': 19789824, 'steps': 38651, 'batch_loss/train': 0.690887127420865}
12/28/2021 01:19:47 - INFO - codeparrot_training - Step 38652: {'lr': 6.131197652877166e-05, 'samples': 19790336, 'steps': 38652, 'batch_loss/train': 0.6855646129697561}
12/28/2021 01:19:59 - INFO - codeparrot_training - Step 38653: {'lr': 6.130163612828097e-05, 'samples': 19790848, 'steps': 38653, 'batch_loss/train': 0.8310847263783216}
12/28/2021 01:20:09 - INFO - codeparrot_training - Step 38654: {'lr': 6.129129647798381e-05, 'samples': 19791360, 'steps': 38654, 'batch_loss/train': 0.7113352000014856}
12/28/2021 01:20:20 - INFO - codeparrot_training - Step 38655: {'lr': 6.128095757792105e-05, 'samples': 19791872, 'steps': 38655, 'batch_loss/train': 0.7063161362893879}
12/28/2021 01:20:34 - INFO - codeparrot_training - Step 38656: {'lr': 6.127061942813409e-05, 'samples': 19792384, 'steps': 38656, 'batch_loss/train': 0.6901949718594551}
12/28/2021 01:20:45 - INFO - codeparrot_training - Step 38657: {'lr': 6.126028202866379e-05, 'samples': 19792896, 'steps': 38657, 'batch_loss/train': 0.7430054023861885}
12/28/2021 01:20:55 - INFO - codeparrot_training - Step 38658: {'lr': 6.124994537955139e-05, 'samples': 19793408, 'steps': 38658, 'batch_loss/train': 0.7306178510189056}
12/28/2021 01:21:07 - INFO - codeparrot_training - Step 38659: {'lr': 6.12396094808379e-05, 'samples': 19793920, 'steps': 38659, 'batch_loss/train': 0.7673916202038527}
12/28/2021 01:21:18 - INFO - codeparrot_training - Step 38660: {'lr': 6.122927433256448e-05, 'samples': 19794432, 'steps': 38660, 'batch_loss/train': 0.8460995032219216}
12/28/2021 01:21:29 - INFO - codeparrot_training - Step 38661: {'lr': 6.121893993477215e-05, 'samples': 19794944, 'steps': 38661, 'batch_loss/train': 0.6821749382652342}
12/28/2021 01:21:39 - INFO - codeparrot_training - Step 38662: {'lr': 6.120860628750206e-05, 'samples': 19795456, 'steps': 38662, 'batch_loss/train': 0.7751642018556595}
12/28/2021 01:21:51 - INFO - codeparrot_training - Step 38663: {'lr': 6.119827339079525e-05, 'samples': 19795968, 'steps': 38663, 'batch_loss/train': 0.6022106766758952}
12/28/2021 01:22:02 - INFO - codeparrot_training - Step 38664: {'lr': 6.118794124469285e-05, 'samples': 19796480, 'steps': 38664, 'batch_loss/train': 0.7627492374740541}
12/28/2021 01:22:13 - INFO - codeparrot_training - Step 38665: {'lr': 6.117760984923596e-05, 'samples': 19796992, 'steps': 38665, 'batch_loss/train': 0.7323533548042178}
12/28/2021 01:22:27 - INFO - codeparrot_training - Step 38666: {'lr': 6.116727920446543e-05, 'samples': 19797504, 'steps': 38666, 'batch_loss/train': 0.6996463621035218}
12/28/2021 01:22:37 - INFO - codeparrot_training - Step 38667: {'lr': 6.11569493104226e-05, 'samples': 19798016, 'steps': 38667, 'batch_loss/train': 0.6231292639859021}
12/28/2021 01:22:48 - INFO - codeparrot_training - Step 38668: {'lr': 6.114662016714847e-05, 'samples': 19798528, 'steps': 38668, 'batch_loss/train': 0.8087078053504229}
12/28/2021 01:23:00 - INFO - codeparrot_training - Step 38669: {'lr': 6.113629177468402e-05, 'samples': 19799040, 'steps': 38669, 'batch_loss/train': 0.6839957822812721}
12/28/2021 01:23:11 - INFO - codeparrot_training - Step 38670: {'lr': 6.11259641330703e-05, 'samples': 19799552, 'steps': 38670, 'batch_loss/train': 0.8175992406904697}
12/28/2021 01:23:21 - INFO - codeparrot_training - Step 38671: {'lr': 6.111563724234856e-05, 'samples': 19800064, 'steps': 38671, 'batch_loss/train': 0.6570698505965993}
12/28/2021 01:23:32 - INFO - codeparrot_training - Step 38672: {'lr': 6.110531110255965e-05, 'samples': 19800576, 'steps': 38672, 'batch_loss/train': 0.665431709960103}
12/28/2021 01:23:46 - INFO - codeparrot_training - Step 38673: {'lr': 6.109498571374473e-05, 'samples': 19801088, 'steps': 38673, 'batch_loss/train': 0.782311346847564}
12/28/2021 01:23:57 - INFO - codeparrot_training - Step 38674: {'lr': 6.108466107594479e-05, 'samples': 19801600, 'steps': 38674, 'batch_loss/train': 0.6504009030759335}
12/28/2021 01:24:07 - INFO - codeparrot_training - Step 38675: {'lr': 6.107433718920091e-05, 'samples': 19802112, 'steps': 38675, 'batch_loss/train': 0.7596104890108109}
12/28/2021 01:24:20 - INFO - codeparrot_training - Step 38676: {'lr': 6.106401405355413e-05, 'samples': 19802624, 'steps': 38676, 'batch_loss/train': 0.7339260862208903}
12/28/2021 01:24:30 - INFO - codeparrot_training - Step 38677: {'lr': 6.105369166904551e-05, 'samples': 19803136, 'steps': 38677, 'batch_loss/train': 0.6118829536717385}
12/28/2021 01:24:41 - INFO - codeparrot_training - Step 38678: {'lr': 6.104337003571602e-05, 'samples': 19803648, 'steps': 38678, 'batch_loss/train': 0.7366061434149742}
12/28/2021 01:24:53 - INFO - codeparrot_training - Step 38679: {'lr': 6.10330491536068e-05, 'samples': 19804160, 'steps': 38679, 'batch_loss/train': 0.7410623910836875}
12/28/2021 01:25:04 - INFO - codeparrot_training - Step 38680: {'lr': 6.1022729022758874e-05, 'samples': 19804672, 'steps': 38680, 'batch_loss/train': 0.7859787247143686}
12/28/2021 01:25:14 - INFO - codeparrot_training - Step 38681: {'lr': 6.1012409643213086e-05, 'samples': 19805184, 'steps': 38681, 'batch_loss/train': 0.7075341688469052}
12/28/2021 01:25:25 - INFO - codeparrot_training - Step 38682: {'lr': 6.100209101501064e-05, 'samples': 19805696, 'steps': 38682, 'batch_loss/train': 0.6429586578160524}
12/28/2021 01:25:37 - INFO - codeparrot_training - Step 38683: {'lr': 6.0991773138192626e-05, 'samples': 19806208, 'steps': 38683, 'batch_loss/train': 0.6732246205210686}
12/28/2021 01:25:48 - INFO - codeparrot_training - Step 38684: {'lr': 6.0981456012799866e-05, 'samples': 19806720, 'steps': 38684, 'batch_loss/train': 0.7092171756085008}
12/28/2021 01:25:58 - INFO - codeparrot_training - Step 38685: {'lr': 6.0971139638873465e-05, 'samples': 19807232, 'steps': 38685, 'batch_loss/train': 0.7943690679967403}
12/28/2021 01:26:12 - INFO - codeparrot_training - Step 38686: {'lr': 6.096082401645445e-05, 'samples': 19807744, 'steps': 38686, 'batch_loss/train': 0.6991420686244965}
12/28/2021 01:26:23 - INFO - codeparrot_training - Step 38687: {'lr': 6.095050914558381e-05, 'samples': 19808256, 'steps': 38687, 'batch_loss/train': 0.727948535233736}
12/28/2021 01:26:33 - INFO - codeparrot_training - Step 38688: {'lr': 6.0940195026302544e-05, 'samples': 19808768, 'steps': 38688, 'batch_loss/train': 0.7923706704750657}
12/28/2021 01:26:46 - INFO - codeparrot_training - Step 38689: {'lr': 6.092988165865171e-05, 'samples': 19809280, 'steps': 38689, 'batch_loss/train': 0.7392986039631069}
12/28/2021 01:26:57 - INFO - codeparrot_training - Step 38690: {'lr': 6.091956904267226e-05, 'samples': 19809792, 'steps': 38690, 'batch_loss/train': 0.6783370021730661}
12/28/2021 01:27:07 - INFO - codeparrot_training - Step 38691: {'lr': 6.090925717840528e-05, 'samples': 19810304, 'steps': 38691, 'batch_loss/train': 0.7014932986348867}
12/28/2021 01:27:18 - INFO - codeparrot_training - Step 38692: {'lr': 6.089894606589158e-05, 'samples': 19810816, 'steps': 38692, 'batch_loss/train': 0.791485152207315}
12/28/2021 01:27:30 - INFO - codeparrot_training - Step 38693: {'lr': 6.088863570517231e-05, 'samples': 19811328, 'steps': 38693, 'batch_loss/train': 0.7008618488907814}
12/28/2021 01:27:41 - INFO - codeparrot_training - Step 38694: {'lr': 6.087832609628849e-05, 'samples': 19811840, 'steps': 38694, 'batch_loss/train': 0.7659913748502731}
12/28/2021 01:27:51 - INFO - codeparrot_training - Step 38695: {'lr': 6.086801723928098e-05, 'samples': 19812352, 'steps': 38695, 'batch_loss/train': 0.6519919675774872}
12/28/2021 01:28:05 - INFO - codeparrot_training - Step 38696: {'lr': 6.0857709134190723e-05, 'samples': 19812864, 'steps': 38696, 'batch_loss/train': 0.7178121310425922}
12/28/2021 01:28:16 - INFO - codeparrot_training - Step 38697: {'lr': 6.0847401781058966e-05, 'samples': 19813376, 'steps': 38697, 'batch_loss/train': 0.7392503833398223}
12/28/2021 01:28:27 - INFO - codeparrot_training - Step 38698: {'lr': 6.083709517992644e-05, 'samples': 19813888, 'steps': 38698, 'batch_loss/train': 0.7068426087498665}
12/28/2021 01:28:39 - INFO - codeparrot_training - Step 38699: {'lr': 6.082678933083421e-05, 'samples': 19814400, 'steps': 38699, 'batch_loss/train': 0.744864453561604}
12/28/2021 01:28:49 - INFO - codeparrot_training - Step 38700: {'lr': 6.0816484233823214e-05, 'samples': 19814912, 'steps': 38700, 'batch_loss/train': 0.7873950025532395}
12/28/2021 01:29:00 - INFO - codeparrot_training - Step 38701: {'lr': 6.0806179888934444e-05, 'samples': 19815424, 'steps': 38701, 'batch_loss/train': 0.6544127650558949}
12/28/2021 01:29:11 - INFO - codeparrot_training - Step 38702: {'lr': 6.079587629620889e-05, 'samples': 19815936, 'steps': 38702, 'batch_loss/train': 0.6764645967632532}
12/28/2021 01:29:23 - INFO - codeparrot_training - Step 38703: {'lr': 6.078557345568747e-05, 'samples': 19816448, 'steps': 38703, 'batch_loss/train': 0.7368596973828971}
12/28/2021 01:29:33 - INFO - codeparrot_training - Step 38704: {'lr': 6.077527136741118e-05, 'samples': 19816960, 'steps': 38704, 'batch_loss/train': 0.636621936224401}
12/28/2021 01:29:44 - INFO - codeparrot_training - Step 38705: {'lr': 6.0764970031420955e-05, 'samples': 19817472, 'steps': 38705, 'batch_loss/train': 0.72135567269288}
12/28/2021 01:29:58 - INFO - codeparrot_training - Step 38706: {'lr': 6.0754669447757816e-05, 'samples': 19817984, 'steps': 38706, 'batch_loss/train': 0.5955868819728494}
12/28/2021 01:30:09 - INFO - codeparrot_training - Step 38707: {'lr': 6.074436961646254e-05, 'samples': 19818496, 'steps': 38707, 'batch_loss/train': 0.7253710445947945}
12/28/2021 01:30:19 - INFO - codeparrot_training - Step 38708: {'lr': 6.073407053757623e-05, 'samples': 19819008, 'steps': 38708, 'batch_loss/train': 0.8302673622965813}
12/28/2021 01:30:31 - INFO - codeparrot_training - Step 38709: {'lr': 6.0723772211139896e-05, 'samples': 19819520, 'steps': 38709, 'batch_loss/train': 0.7382979956455529}
12/28/2021 01:30:42 - INFO - codeparrot_training - Step 38710: {'lr': 6.0713474637194287e-05, 'samples': 19820032, 'steps': 38710, 'batch_loss/train': 0.9048844929784536}
12/28/2021 01:30:53 - INFO - codeparrot_training - Step 38711: {'lr': 6.0703177815780415e-05, 'samples': 19820544, 'steps': 38711, 'batch_loss/train': 0.7959080226719379}
12/28/2021 01:31:03 - INFO - codeparrot_training - Step 38712: {'lr': 6.0692881746939265e-05, 'samples': 19821056, 'steps': 38712, 'batch_loss/train': 0.7648805100470781}
12/28/2021 01:31:17 - INFO - codeparrot_training - Step 38713: {'lr': 6.06825864307117e-05, 'samples': 19821568, 'steps': 38713, 'batch_loss/train': 0.7431241874583066}
12/28/2021 01:31:28 - INFO - codeparrot_training - Step 38714: {'lr': 6.067229186713871e-05, 'samples': 19822080, 'steps': 38714, 'batch_loss/train': 0.7053918289020658}
12/28/2021 01:31:39 - INFO - codeparrot_training - Step 38715: {'lr': 6.066199805626121e-05, 'samples': 19822592, 'steps': 38715, 'batch_loss/train': 0.7808208608767018}
12/28/2021 01:31:51 - INFO - codeparrot_training - Step 38716: {'lr': 6.065170499812009e-05, 'samples': 19823104, 'steps': 38716, 'batch_loss/train': 0.8255712119862437}
12/28/2021 01:32:01 - INFO - codeparrot_training - Step 38717: {'lr': 6.064141269275633e-05, 'samples': 19823616, 'steps': 38717, 'batch_loss/train': 0.688257472589612}
12/28/2021 01:32:12 - INFO - codeparrot_training - Step 38718: {'lr': 6.0631121140210774e-05, 'samples': 19824128, 'steps': 38718, 'batch_loss/train': 0.7936273562954739}
12/28/2021 01:32:24 - INFO - codeparrot_training - Step 38719: {'lr': 6.0620830340524405e-05, 'samples': 19824640, 'steps': 38719, 'batch_loss/train': 0.7717887260951102}
12/28/2021 01:32:35 - INFO - codeparrot_training - Step 38720: {'lr': 6.0610540293738166e-05, 'samples': 19825152, 'steps': 38720, 'batch_loss/train': 1.276128716301173}
12/28/2021 01:32:46 - INFO - codeparrot_training - Step 38721: {'lr': 6.0600250999892856e-05, 'samples': 19825664, 'steps': 38721, 'batch_loss/train': 1.0045360807562247}
12/28/2021 01:33:00 - INFO - codeparrot_training - Step 38722: {'lr': 6.058996245902934e-05, 'samples': 19826176, 'steps': 38722, 'batch_loss/train': 0.7333298311568797}
12/28/2021 01:33:11 - INFO - codeparrot_training - Step 38723: {'lr': 6.057967467118877e-05, 'samples': 19826688, 'steps': 38723, 'batch_loss/train': 1.7297862078994513}
12/28/2021 01:33:21 - INFO - codeparrot_training - Step 38724: {'lr': 6.0569387636411816e-05, 'samples': 19827200, 'steps': 38724, 'batch_loss/train': 0.7518463972955942}
12/28/2021 01:33:32 - INFO - codeparrot_training - Step 38725: {'lr': 6.055910135473938e-05, 'samples': 19827712, 'steps': 38725, 'batch_loss/train': 0.7472869753837585}
12/28/2021 01:33:44 - INFO - codeparrot_training - Step 38726: {'lr': 6.054881582621255e-05, 'samples': 19828224, 'steps': 38726, 'batch_loss/train': 0.7217438519001007}
12/28/2021 01:33:55 - INFO - codeparrot_training - Step 38727: {'lr': 6.053853105087206e-05, 'samples': 19828736, 'steps': 38727, 'batch_loss/train': 0.7655802257359028}
12/28/2021 01:34:05 - INFO - codeparrot_training - Step 38728: {'lr': 6.052824702875881e-05, 'samples': 19829248, 'steps': 38728, 'batch_loss/train': 0.7964593444485217}
12/28/2021 01:34:17 - INFO - codeparrot_training - Step 38729: {'lr': 6.0517963759913695e-05, 'samples': 19829760, 'steps': 38729, 'batch_loss/train': 0.7451753828208894}
12/28/2021 01:34:28 - INFO - codeparrot_training - Step 38730: {'lr': 6.0507681244377636e-05, 'samples': 19830272, 'steps': 38730, 'batch_loss/train': 0.7729161984752864}
12/28/2021 01:34:39 - INFO - codeparrot_training - Step 38731: {'lr': 6.049739948219149e-05, 'samples': 19830784, 'steps': 38731, 'batch_loss/train': 0.6521641805302352}
12/28/2021 01:34:49 - INFO - codeparrot_training - Step 38732: {'lr': 6.048711847339619e-05, 'samples': 19831296, 'steps': 38732, 'batch_loss/train': 0.8994287014938891}
12/28/2021 01:35:01 - INFO - codeparrot_training - Step 38733: {'lr': 6.047683821803241e-05, 'samples': 19831808, 'steps': 38733, 'batch_loss/train': 0.7558269817382097}
12/28/2021 01:35:12 - INFO - codeparrot_training - Step 38734: {'lr': 6.046655871614121e-05, 'samples': 19832320, 'steps': 38734, 'batch_loss/train': 0.9024644764140248}
12/28/2021 01:35:23 - INFO - codeparrot_training - Step 38735: {'lr': 6.045627996776351e-05, 'samples': 19832832, 'steps': 38735, 'batch_loss/train': 0.808067130856216}
12/28/2021 01:35:37 - INFO - codeparrot_training - Step 38736: {'lr': 6.0446001972939976e-05, 'samples': 19833344, 'steps': 38736, 'batch_loss/train': 0.7319668701966293}
12/28/2021 01:35:48 - INFO - codeparrot_training - Step 38737: {'lr': 6.043572473171158e-05, 'samples': 19833856, 'steps': 38737, 'batch_loss/train': 0.8679178757593036}
12/28/2021 01:35:59 - INFO - codeparrot_training - Step 38738: {'lr': 6.042544824411916e-05, 'samples': 19834368, 'steps': 38738, 'batch_loss/train': 0.8553459160029888}
12/28/2021 01:36:09 - INFO - codeparrot_training - Step 38739: {'lr': 6.0415172510203595e-05, 'samples': 19834880, 'steps': 38739, 'batch_loss/train': 1.2584313983097672}
12/28/2021 01:36:21 - INFO - codeparrot_training - Step 38740: {'lr': 6.040489753000569e-05, 'samples': 19835392, 'steps': 38740, 'batch_loss/train': 0.8857108657248318}
12/28/2021 01:36:32 - INFO - codeparrot_training - Step 38741: {'lr': 6.039462330356635e-05, 'samples': 19835904, 'steps': 38741, 'batch_loss/train': 0.648717840667814}
12/28/2021 01:36:43 - INFO - codeparrot_training - Step 38742: {'lr': 6.038434983092636e-05, 'samples': 19836416, 'steps': 38742, 'batch_loss/train': 0.8296328405849636}
12/28/2021 01:36:57 - INFO - codeparrot_training - Step 38743: {'lr': 6.037407711212664e-05, 'samples': 19836928, 'steps': 38743, 'batch_loss/train':
0.7138881469145417} 12/28/2021 01:37:08 - INFO - codeparrot_training - Step 38744: {'lr': 6.036380514720796e-05, 'samples': 19837440, 'steps': 38744, 'batch_loss/train': 0.7651556637138128} 12/28/2021 01:37:19 - INFO - codeparrot_training - Step 38745: {'lr': 6.0353533936211206e-05, 'samples': 19837952, 'steps': 38745, 'batch_loss/train': 1.3095074417069554} 12/28/2021 01:37:29 - INFO - codeparrot_training - Step 38746: {'lr': 6.0343263479177266e-05, 'samples': 19838464, 'steps': 38746, 'batch_loss/train': 0.7070842101238668} 12/28/2021 01:37:41 - INFO - codeparrot_training - Step 38747: {'lr': 6.0332993776146814e-05, 'samples': 19838976, 'steps': 38747, 'batch_loss/train': 0.7129671429283917} 12/28/2021 01:37:52 - INFO - codeparrot_training - Step 38748: {'lr': 6.03227248271607e-05, 'samples': 19839488, 'steps': 38748, 'batch_loss/train': 0.7917311107739806} 12/28/2021 01:38:03 - INFO - codeparrot_training - Step 38749: {'lr': 6.031245663225995e-05, 'samples': 19840000, 'steps': 38749, 'batch_loss/train': 0.8059442164376378} 12/28/2021 01:38:15 - INFO - codeparrot_training - Step 38750: {'lr': 6.030218919148517e-05, 'samples': 19840512, 'steps': 38750, 'batch_loss/train': 0.6573063067626208} 12/28/2021 01:38:25 - INFO - codeparrot_training - Step 38751: {'lr': 6.029192250487719e-05, 'samples': 19841024, 'steps': 38751, 'batch_loss/train': 0.6295698472531512} 12/28/2021 01:38:36 - INFO - codeparrot_training - Step 38752: {'lr': 6.028165657247706e-05, 'samples': 19841536, 'steps': 38752, 'batch_loss/train': 0.8329634480178356} 12/28/2021 01:38:49 - INFO - codeparrot_training - Step 38753: {'lr': 6.027139139432533e-05, 'samples': 19842048, 'steps': 38753, 'batch_loss/train': 0.7431095178471878} 12/28/2021 01:39:00 - INFO - codeparrot_training - Step 38754: {'lr': 6.026112697046293e-05, 'samples': 19842560, 'steps': 38754, 'batch_loss/train': 0.711303245741874} 12/28/2021 01:39:11 - INFO - codeparrot_training - Step 38755: {'lr': 6.025086330093063e-05, 'samples': 
19843072, 'steps': 38755, 'batch_loss/train': 0.7635028371587396} 12/28/2021 01:39:23 - INFO - codeparrot_training - Step 38756: {'lr': 6.0240600385769284e-05, 'samples': 19843584, 'steps': 38756, 'batch_loss/train': 0.7833646675571799} 12/28/2021 01:39:34 - INFO - codeparrot_training - Step 38757: {'lr': 6.023033822501961e-05, 'samples': 19844096, 'steps': 38757, 'batch_loss/train': 0.7038839519955218} 12/28/2021 01:39:44 - INFO - codeparrot_training - Step 38758: {'lr': 6.0220076818722576e-05, 'samples': 19844608, 'steps': 38758, 'batch_loss/train': 0.7483367726672441} 12/28/2021 01:39:55 - INFO - codeparrot_training - Step 38759: {'lr': 6.0209816166918675e-05, 'samples': 19845120, 'steps': 38759, 'batch_loss/train': 0.6035371313337237} 12/28/2021 01:40:08 - INFO - codeparrot_training - Step 38760: {'lr': 6.0199556269648987e-05, 'samples': 19845632, 'steps': 38760, 'batch_loss/train': 0.578038819367066} 12/28/2021 01:40:18 - INFO - codeparrot_training - Step 38761: {'lr': 6.0189297126954255e-05, 'samples': 19846144, 'steps': 38761, 'batch_loss/train': 0.6332105241017416} 12/28/2021 01:40:29 - INFO - codeparrot_training - Step 38762: {'lr': 6.0179038738875065e-05, 'samples': 19846656, 'steps': 38762, 'batch_loss/train': 0.9965695794671774} 12/28/2021 01:40:43 - INFO - codeparrot_training - Step 38763: {'lr': 6.0168781105452404e-05, 'samples': 19847168, 'steps': 38763, 'batch_loss/train': 0.7301886667264625} 12/28/2021 01:40:54 - INFO - codeparrot_training - Step 38764: {'lr': 6.015852422672707e-05, 'samples': 19847680, 'steps': 38764, 'batch_loss/train': 0.6989765977486968} 12/28/2021 01:41:04 - INFO - codeparrot_training - Step 38765: {'lr': 6.014826810273971e-05, 'samples': 19848192, 'steps': 38765, 'batch_loss/train': 0.7778428922174498} 12/28/2021 01:41:15 - INFO - codeparrot_training - Step 38766: {'lr': 6.013801273353115e-05, 'samples': 19848704, 'steps': 38766, 'batch_loss/train': 0.7406296054832637} 12/28/2021 01:41:27 - INFO - codeparrot_training - Step 
38767: {'lr': 6.0127758119142146e-05, 'samples': 19849216, 'steps': 38767, 'batch_loss/train': 0.8596357628703117} 12/28/2021 01:41:38 - INFO - codeparrot_training - Step 38768: {'lr': 6.0117504259613456e-05, 'samples': 19849728, 'steps': 38768, 'batch_loss/train': 0.7078367155045271} 12/28/2021 01:41:48 - INFO - codeparrot_training - Step 38769: {'lr': 6.0107251154985913e-05, 'samples': 19850240, 'steps': 38769, 'batch_loss/train': 0.7544995984062552} 12/28/2021 01:42:00 - INFO - codeparrot_training - Step 38770: {'lr': 6.009699880530023e-05, 'samples': 19850752, 'steps': 38770, 'batch_loss/train': 0.7930959332734346} 12/28/2021 01:42:11 - INFO - codeparrot_training - Step 38771: {'lr': 6.0086747210597175e-05, 'samples': 19851264, 'steps': 38771, 'batch_loss/train': 0.7301190867146943} 12/28/2021 01:42:21 - INFO - codeparrot_training - Step 38772: {'lr': 6.007649637091758e-05, 'samples': 19851776, 'steps': 38772, 'batch_loss/train': 0.7858914257958531} 12/28/2021 01:42:35 - INFO - codeparrot_training - Step 38773: {'lr': 6.0066246286302025e-05, 'samples': 19852288, 'steps': 38773, 'batch_loss/train': 0.813948676455766} 12/28/2021 01:42:46 - INFO - codeparrot_training - Step 38774: {'lr': 6.005599695679131e-05, 'samples': 19852800, 'steps': 38774, 'batch_loss/train': 0.7616707002744079} 12/28/2021 01:42:57 - INFO - codeparrot_training - Step 38775: {'lr': 6.004574838242638e-05, 'samples': 19853312, 'steps': 38775, 'batch_loss/train': 0.7692202758044004} 12/28/2021 01:43:09 - INFO - codeparrot_training - Step 38776: {'lr': 6.003550056324775e-05, 'samples': 19853824, 'steps': 38776, 'batch_loss/train': 0.7593022435903549} 12/28/2021 01:43:19 - INFO - codeparrot_training - Step 38777: {'lr': 6.002525349929616e-05, 'samples': 19854336, 'steps': 38777, 'batch_loss/train': 0.7605140879750252} 12/28/2021 01:43:30 - INFO - codeparrot_training - Step 38778: {'lr': 6.0015007190612594e-05, 'samples': 19854848, 'steps': 38778, 'batch_loss/train': 0.7254862524569035} 12/28/2021 
01:43:41 - INFO - codeparrot_training - Step 38779: {'lr': 6.000476163723753e-05, 'samples': 19855360, 'steps': 38779, 'batch_loss/train': 0.6624912023544312} 12/28/2021 01:43:53 - INFO - codeparrot_training - Step 38780: {'lr': 5.999451683921181e-05, 'samples': 19855872, 'steps': 38780, 'batch_loss/train': 0.6577369277365506} 12/28/2021 01:44:03 - INFO - codeparrot_training - Step 38781: {'lr': 5.998427279657615e-05, 'samples': 19856384, 'steps': 38781, 'batch_loss/train': 0.6859686067327857} 12/28/2021 01:44:14 - INFO - codeparrot_training - Step 38782: {'lr': 5.997402950937125e-05, 'samples': 19856896, 'steps': 38782, 'batch_loss/train': 0.7310810433700681} 12/28/2021 01:44:28 - INFO - codeparrot_training - Step 38783: {'lr': 5.996378697763788e-05, 'samples': 19857408, 'steps': 38783, 'batch_loss/train': 0.7760753370821476} 12/28/2021 01:44:38 - INFO - codeparrot_training - Step 38784: {'lr': 5.995354520141682e-05, 'samples': 19857920, 'steps': 38784, 'batch_loss/train': 0.7425950631732121} 12/28/2021 01:44:49 - INFO - codeparrot_training - Step 38785: {'lr': 5.994330418074856e-05, 'samples': 19858432, 'steps': 38785, 'batch_loss/train': 0.7254392034374177} 12/28/2021 01:45:01 - INFO - codeparrot_training - Step 38786: {'lr': 5.993306391567402e-05, 'samples': 19858944, 'steps': 38786, 'batch_loss/train': 0.7204031131113879} 12/28/2021 01:45:12 - INFO - codeparrot_training - Step 38787: {'lr': 5.992282440623389e-05, 'samples': 19859456, 'steps': 38787, 'batch_loss/train': 0.585219016764313} 12/28/2021 01:45:23 - INFO - codeparrot_training - Step 38788: {'lr': 5.991258565246874e-05, 'samples': 19859968, 'steps': 38788, 'batch_loss/train': 0.7677024444565177} 12/28/2021 01:45:33 - INFO - codeparrot_training - Step 38789: {'lr': 5.990234765441943e-05, 'samples': 19860480, 'steps': 38789, 'batch_loss/train': 0.7853798884898424} 12/28/2021 01:45:47 - INFO - codeparrot_training - Step 38790: {'lr': 5.9892110412126695e-05, 'samples': 19860992, 'steps': 38790, 
'batch_loss/train': 0.7907055742107332} 12/28/2021 01:45:58 - INFO - codeparrot_training - Step 38791: {'lr': 5.988187392563107e-05, 'samples': 19861504, 'steps': 38791, 'batch_loss/train': 1.6244096769951284} 12/28/2021 01:46:09 - INFO - codeparrot_training - Step 38792: {'lr': 5.987163819497332e-05, 'samples': 19862016, 'steps': 38792, 'batch_loss/train': 0.8136013529729098} 12/28/2021 01:46:21 - INFO - codeparrot_training - Step 38793: {'lr': 5.986140322019415e-05, 'samples': 19862528, 'steps': 38793, 'batch_loss/train': 0.7031417516991496} 12/28/2021 01:46:31 - INFO - codeparrot_training - Step 38794: {'lr': 5.985116900133425e-05, 'samples': 19863040, 'steps': 38794, 'batch_loss/train': 0.7314954306930304} 12/28/2021 01:46:42 - INFO - codeparrot_training - Step 38795: {'lr': 5.984093553843431e-05, 'samples': 19863552, 'steps': 38795, 'batch_loss/train': 0.8289083261042833} 12/28/2021 01:46:54 - INFO - codeparrot_training - Step 38796: {'lr': 5.9830702831535e-05, 'samples': 19864064, 'steps': 38796, 'batch_loss/train': 0.7228858014568686} 12/28/2021 01:47:05 - INFO - codeparrot_training - Step 38797: {'lr': 5.982047088067702e-05, 'samples': 19864576, 'steps': 38797, 'batch_loss/train': 0.7555120307952166} 12/28/2021 01:47:15 - INFO - codeparrot_training - Step 38798: {'lr': 5.981023968590102e-05, 'samples': 19865088, 'steps': 38798, 'batch_loss/train': 0.7590641397982836} 12/28/2021 01:47:26 - INFO - codeparrot_training - Step 38799: {'lr': 5.980000924724771e-05, 'samples': 19865600, 'steps': 38799, 'batch_loss/train': 1.2569686165079474} 12/28/2021 01:47:39 - INFO - codeparrot_training - Step 38800: {'lr': 5.978977956475773e-05, 'samples': 19866112, 'steps': 38800, 'batch_loss/train': 0.933321843855083} 12/28/2021 01:47:49 - INFO - codeparrot_training - Step 38801: {'lr': 5.977955063847185e-05, 'samples': 19866624, 'steps': 38801, 'batch_loss/train': 0.7414567391388118} 12/28/2021 01:48:00 - INFO - codeparrot_training - Step 38802: {'lr': 5.976932246843059e-05, 
'samples': 19867136, 'steps': 38802, 'batch_loss/train': 0.6423656221013516} 12/28/2021 01:48:12 - INFO - codeparrot_training - Step 38803: {'lr': 5.975909505467458e-05, 'samples': 19867648, 'steps': 38803, 'batch_loss/train': 0.7694938462227583} 12/28/2021 01:48:23 - INFO - codeparrot_training - Step 38804: {'lr': 5.974886839724475e-05, 'samples': 19868160, 'steps': 38804, 'batch_loss/train': 0.7471473291516304} 12/28/2021 01:48:33 - INFO - codeparrot_training - Step 38805: {'lr': 5.973864249618149e-05, 'samples': 19868672, 'steps': 38805, 'batch_loss/train': 0.811741316691041} 12/28/2021 01:48:45 - INFO - codeparrot_training - Step 38806: {'lr': 5.972841735152554e-05, 'samples': 19869184, 'steps': 38806, 'batch_loss/train': 0.7464410702232271} 12/28/2021 01:48:56 - INFO - codeparrot_training - Step 38807: {'lr': 5.971819296331757e-05, 'samples': 19869696, 'steps': 38807, 'batch_loss/train': 0.7077734469203278} 12/28/2021 01:49:06 - INFO - codeparrot_training - Step 38808: {'lr': 5.970796933159822e-05, 'samples': 19870208, 'steps': 38808, 'batch_loss/train': 0.7795674952794798} 12/28/2021 01:49:17 - INFO - codeparrot_training - Step 38809: {'lr': 5.9697746456408134e-05, 'samples': 19870720, 'steps': 38809, 'batch_loss/train': 0.7233541025780141} 12/28/2021 01:49:30 - INFO - codeparrot_training - Step 38810: {'lr': 5.968752433778801e-05, 'samples': 19871232, 'steps': 38810, 'batch_loss/train': 0.6496308841742575} 12/28/2021 01:49:40 - INFO - codeparrot_training - Step 38811: {'lr': 5.967730297577831e-05, 'samples': 19871744, 'steps': 38811, 'batch_loss/train': 0.772645550314337} 12/28/2021 01:49:51 - INFO - codeparrot_training - Step 38812: {'lr': 5.966708237041985e-05, 'samples': 19872256, 'steps': 38812, 'batch_loss/train': 0.7367213377729058} 12/28/2021 01:50:03 - INFO - codeparrot_training - Step 38813: {'lr': 5.9656862521753285e-05, 'samples': 19872768, 'steps': 38813, 'batch_loss/train': 0.7238905010744929} 12/28/2021 01:50:14 - INFO - codeparrot_training - 
Step 38814: {'lr': 5.9646643429818985e-05, 'samples': 19873280, 'steps': 38814, 'batch_loss/train': 0.7329961503855884} 12/28/2021 01:50:24 - INFO - codeparrot_training - Step 38815: {'lr': 5.963642509465786e-05, 'samples': 19873792, 'steps': 38815, 'batch_loss/train': 0.8072380991652608} 12/28/2021 01:50:35 - INFO - codeparrot_training - Step 38816: {'lr': 5.962620751631051e-05, 'samples': 19874304, 'steps': 38816, 'batch_loss/train': 0.7906304346397519} 12/28/2021 01:50:47 - INFO - codeparrot_training - Step 38817: {'lr': 5.961599069481738e-05, 'samples': 19874816, 'steps': 38817, 'batch_loss/train': 0.8015043156920001} 12/28/2021 01:50:58 - INFO - codeparrot_training - Step 38818: {'lr': 5.9605774630219184e-05, 'samples': 19875328, 'steps': 38818, 'batch_loss/train': 0.666825981810689} 12/28/2021 01:51:09 - INFO - codeparrot_training - Step 38819: {'lr': 5.959555932255656e-05, 'samples': 19875840, 'steps': 38819, 'batch_loss/train': 0.7791667096316814} 12/28/2021 01:51:21 - INFO - codeparrot_training - Step 38820: {'lr': 5.958534477187008e-05, 'samples': 19876352, 'steps': 38820, 'batch_loss/train': 0.7748953141272068} 12/28/2021 01:51:32 - INFO - codeparrot_training - Step 38821: {'lr': 5.9575130978200384e-05, 'samples': 19876864, 'steps': 38821, 'batch_loss/train': 0.6771548292599618} 12/28/2021 01:51:43 - INFO - codeparrot_training - Step 38822: {'lr': 5.956491794158805e-05, 'samples': 19877376, 'steps': 38822, 'batch_loss/train': 0.7230031164363027} 12/28/2021 01:51:55 - INFO - codeparrot_training - Step 38823: {'lr': 5.9554705662073716e-05, 'samples': 19877888, 'steps': 38823, 'batch_loss/train': 0.78884730162099} 12/28/2021 01:52:05 - INFO - codeparrot_training - Step 38824: {'lr': 5.9544494139697955e-05, 'samples': 19878400, 'steps': 38824, 'batch_loss/train': 1.381239715963602} 12/28/2021 01:52:16 - INFO - codeparrot_training - Step 38825: {'lr': 5.953428337450137e-05, 'samples': 19878912, 'steps': 38825, 'batch_loss/train': 0.6654192993883044} 
12/28/2021 01:52:27 - INFO - codeparrot_training - Step 38826: {'lr': 5.9524073366524574e-05, 'samples': 19879424, 'steps': 38826, 'batch_loss/train': 0.8366665989160538} 12/28/2021 01:52:39 - INFO - codeparrot_training - Step 38827: {'lr': 5.951386411580817e-05, 'samples': 19879936, 'steps': 38827, 'batch_loss/train': 0.8525454485788941} 12/28/2021 01:52:49 - INFO - codeparrot_training - Step 38828: {'lr': 5.950365562239268e-05, 'samples': 19880448, 'steps': 38828, 'batch_loss/train': 0.7345959600061178} 12/28/2021 01:53:00 - INFO - codeparrot_training - Step 38829: {'lr': 5.9493447886318665e-05, 'samples': 19880960, 'steps': 38829, 'batch_loss/train': 0.7401075195521116} 12/28/2021 01:53:13 - INFO - codeparrot_training - Step 38830: {'lr': 5.948324090762688e-05, 'samples': 19881472, 'steps': 38830, 'batch_loss/train': 0.6799955039750785} 12/28/2021 01:53:24 - INFO - codeparrot_training - Step 38831: {'lr': 5.947303468635773e-05, 'samples': 19881984, 'steps': 38831, 'batch_loss/train': 0.7294881027191877} 12/28/2021 01:53:34 - INFO - codeparrot_training - Step 38832: {'lr': 5.9462829222551864e-05, 'samples': 19882496, 'steps': 38832, 'batch_loss/train': 0.7140337214805186} 12/28/2021 01:53:46 - INFO - codeparrot_training - Step 38833: {'lr': 5.945262451624983e-05, 'samples': 19883008, 'steps': 38833, 'batch_loss/train': 0.8686894020065665} 12/28/2021 01:53:57 - INFO - codeparrot_training - Step 38834: {'lr': 5.944242056749225e-05, 'samples': 19883520, 'steps': 38834, 'batch_loss/train': 0.7286101607605815} 12/28/2021 01:54:08 - INFO - codeparrot_training - Step 38835: {'lr': 5.9432217376319625e-05, 'samples': 19884032, 'steps': 38835, 'batch_loss/train': 0.7661595577374101} 12/28/2021 01:54:19 - INFO - codeparrot_training - Step 38836: {'lr': 5.942201494277255e-05, 'samples': 19884544, 'steps': 38836, 'batch_loss/train': 0.5893681399757043} 12/28/2021 01:54:30 - INFO - codeparrot_training - Step 38837: {'lr': 5.941181326689157e-05, 'samples': 19885056, 'steps': 
38837, 'batch_loss/train': 0.8442475516349077} 12/28/2021 01:54:41 - INFO - codeparrot_training - Step 38838: {'lr': 5.9401612348717274e-05, 'samples': 19885568, 'steps': 38838, 'batch_loss/train': 0.6974439970217645} 12/28/2021 01:54:51 - INFO - codeparrot_training - Step 38839: {'lr': 5.9391412188290276e-05, 'samples': 19886080, 'steps': 38839, 'batch_loss/train': 0.6679541133344173} 12/28/2021 01:55:04 - INFO - codeparrot_training - Step 38840: {'lr': 5.9381212785650895e-05, 'samples': 19886592, 'steps': 38840, 'batch_loss/train': 0.611889187246561} 12/28/2021 01:55:15 - INFO - codeparrot_training - Step 38841: {'lr': 5.9371014140839925e-05, 'samples': 19887104, 'steps': 38841, 'batch_loss/train': 0.5553254275582731} 12/28/2021 01:55:26 - INFO - codeparrot_training - Step 38842: {'lr': 5.936081625389789e-05, 'samples': 19887616, 'steps': 38842, 'batch_loss/train': 0.7669845055788755} 12/28/2021 01:55:38 - INFO - codeparrot_training - Step 38843: {'lr': 5.93506191248652e-05, 'samples': 19888128, 'steps': 38843, 'batch_loss/train': 0.763154296670109} 12/28/2021 01:55:48 - INFO - codeparrot_training - Step 38844: {'lr': 5.934042275378238e-05, 'samples': 19888640, 'steps': 38844, 'batch_loss/train': 0.6971698282286525} 12/28/2021 01:55:59 - INFO - codeparrot_training - Step 38845: {'lr': 5.93302271406902e-05, 'samples': 19889152, 'steps': 38845, 'batch_loss/train': 0.653616669587791} 12/28/2021 01:56:11 - INFO - codeparrot_training - Step 38846: {'lr': 5.932003228562896e-05, 'samples': 19889664, 'steps': 38846, 'batch_loss/train': 0.7813394935801625} 12/28/2021 01:56:21 - INFO - codeparrot_training - Step 38847: {'lr': 5.9309838188639275e-05, 'samples': 19890176, 'steps': 38847, 'batch_loss/train': 0.7199290576390922} 12/28/2021 01:56:32 - INFO - codeparrot_training - Step 38848: {'lr': 5.9299644849761665e-05, 'samples': 19890688, 'steps': 38848, 'batch_loss/train': 0.7992496425285935} 12/28/2021 01:56:43 - INFO - codeparrot_training - Step 38849: {'lr': 
5.9289452269036676e-05, 'samples': 19891200, 'steps': 38849, 'batch_loss/train': 0.726200407370925} 12/28/2021 01:56:55 - INFO - codeparrot_training - Step 38850: {'lr': 5.927926044650481e-05, 'samples': 19891712, 'steps': 38850, 'batch_loss/train': 0.7334647345705889} 12/28/2021 01:57:05 - INFO - codeparrot_training - Step 38851: {'lr': 5.926906938220661e-05, 'samples': 19892224, 'steps': 38851, 'batch_loss/train': 0.6966672316193581} 12/28/2021 01:57:16 - INFO - codeparrot_training - Step 38852: {'lr': 5.925887907618255e-05, 'samples': 19892736, 'steps': 38852, 'batch_loss/train': 0.7602341647725552} 12/28/2021 01:57:29 - INFO - codeparrot_training - Step 38853: {'lr': 5.924868952847318e-05, 'samples': 19893248, 'steps': 38853, 'batch_loss/train': 0.7756495005451143} 12/28/2021 01:57:40 - INFO - codeparrot_training - Step 38854: {'lr': 5.9238500739119044e-05, 'samples': 19893760, 'steps': 38854, 'batch_loss/train': 0.7549344958970323} 12/28/2021 01:57:50 - INFO - codeparrot_training - Step 38855: {'lr': 5.922831270816048e-05, 'samples': 19894272, 'steps': 38855, 'batch_loss/train': 0.7839827528223395} 12/28/2021 01:58:02 - INFO - codeparrot_training - Step 38856: {'lr': 5.921812543563823e-05, 'samples': 19894784, 'steps': 38856, 'batch_loss/train': 0.8407323993742466} 12/28/2021 01:58:13 - INFO - codeparrot_training - Step 38857: {'lr': 5.9207938921592626e-05, 'samples': 19895296, 'steps': 38857, 'batch_loss/train': 0.6385821080766618} 12/28/2021 01:58:24 - INFO - codeparrot_training - Step 38858: {'lr': 5.919775316606424e-05, 'samples': 19895808, 'steps': 38858, 'batch_loss/train': 0.8099682908505201} 12/28/2021 01:58:36 - INFO - codeparrot_training - Step 38859: {'lr': 5.9187568169093514e-05, 'samples': 19896320, 'steps': 38859, 'batch_loss/train': 0.7838155466597527} 12/28/2021 01:58:47 - INFO - codeparrot_training - Step 38860: {'lr': 5.917738393072095e-05, 'samples': 19896832, 'steps': 38860, 'batch_loss/train': 1.3276017657481134} 12/28/2021 01:58:58 - INFO 
- codeparrot_training - Step 38861: {'lr': 5.9167200450987085e-05, 'samples': 19897344, 'steps': 38861, 'batch_loss/train': 0.7084209523163736} 12/28/2021 01:59:08 - INFO - codeparrot_training - Step 38862: {'lr': 5.915701772993237e-05, 'samples': 19897856, 'steps': 38862, 'batch_loss/train': 0.7289353720843792} 12/28/2021 01:59:20 - INFO - codeparrot_training - Step 38863: {'lr': 5.91468357675973e-05, 'samples': 19898368, 'steps': 38863, 'batch_loss/train': 0.7665232196450233} 12/28/2021 01:59:31 - INFO - codeparrot_training - Step 38864: {'lr': 5.913665456402234e-05, 'samples': 19898880, 'steps': 38864, 'batch_loss/train': 0.599003522656858} 12/28/2021 01:59:42 - INFO - codeparrot_training - Step 38865: {'lr': 5.912647411924804e-05, 'samples': 19899392, 'steps': 38865, 'batch_loss/train': 0.8108168105245568} 12/28/2021 01:59:54 - INFO - codeparrot_training - Step 38866: {'lr': 5.911629443331465e-05, 'samples': 19899904, 'steps': 38866, 'batch_loss/train': 0.7376165273599327} 12/28/2021 02:00:04 - INFO - codeparrot_training - Step 38867: {'lr': 5.9106115506262906e-05, 'samples': 19900416, 'steps': 38867, 'batch_loss/train': 0.7129457327537239} 12/28/2021 02:00:15 - INFO - codeparrot_training - Step 38868: {'lr': 5.909593733813321e-05, 'samples': 19900928, 'steps': 38868, 'batch_loss/train': 0.7725452352315187} 12/28/2021 02:00:28 - INFO - codeparrot_training - Step 38869: {'lr': 5.908575992896592e-05, 'samples': 19901440, 'steps': 38869, 'batch_loss/train': 0.8224568562582135} 12/28/2021 02:00:38 - INFO - codeparrot_training - Step 38870: {'lr': 5.90755832788015e-05, 'samples': 19901952, 'steps': 38870, 'batch_loss/train': 0.8729995894245803} 12/28/2021 02:00:49 - INFO - codeparrot_training - Step 38871: {'lr': 5.906540738768063e-05, 'samples': 19902464, 'steps': 38871, 'batch_loss/train': 0.7762418678030372} 12/28/2021 02:01:00 - INFO - codeparrot_training - Step 38872: {'lr': 5.905523225564349e-05, 'samples': 19902976, 'steps': 38872, 'batch_loss/train': 
0.7796810436993837} 12/28/2021 02:01:12 - INFO - codeparrot_training - Step 38873: {'lr': 5.904505788273068e-05, 'samples': 19903488, 'steps': 38873, 'batch_loss/train': 0.6841107485815883} 12/28/2021 02:01:23 - INFO - codeparrot_training - Step 38874: {'lr': 5.9034884268982626e-05, 'samples': 19904000, 'steps': 38874, 'batch_loss/train': 0.5672952672466636} 12/28/2021 02:01:33 - INFO - codeparrot_training - Step 38875: {'lr': 5.9024711414439744e-05, 'samples': 19904512, 'steps': 38875, 'batch_loss/train': 0.7164795164717361} 12/28/2021 02:01:46 - INFO - codeparrot_training - Step 38876: {'lr': 5.9014539319142494e-05, 'samples': 19905024, 'steps': 38876, 'batch_loss/train': 1.0846300888806581} 12/28/2021 02:01:57 - INFO - codeparrot_training - Step 38877: {'lr': 5.900436798313136e-05, 'samples': 19905536, 'steps': 38877, 'batch_loss/train': 0.7545069130137563} 12/28/2021 02:02:08 - INFO - codeparrot_training - Step 38878: {'lr': 5.899419740644671e-05, 'samples': 19906048, 'steps': 38878, 'batch_loss/train': 0.7713282713666558} 12/28/2021 02:02:20 - INFO - codeparrot_training - Step 38879: {'lr': 5.8984027589129016e-05, 'samples': 19906560, 'steps': 38879, 'batch_loss/train': 0.745905906194821} 12/28/2021 02:02:30 - INFO - codeparrot_training - Step 38880: {'lr': 5.897385853121878e-05, 'samples': 19907072, 'steps': 38880, 'batch_loss/train': 0.810807743575424} 12/28/2021 02:02:41 - INFO - codeparrot_training - Step 38881: {'lr': 5.896369023275619e-05, 'samples': 19907584, 'steps': 38881, 'batch_loss/train': 0.7618870753794909} 12/28/2021 02:02:52 - INFO - codeparrot_training - Step 38882: {'lr': 5.895352269378201e-05, 'samples': 19908096, 'steps': 38882, 'batch_loss/train': 0.6563027920201421} 12/28/2021 02:03:04 - INFO - codeparrot_training - Step 38883: {'lr': 5.894335591433639e-05, 'samples': 19908608, 'steps': 38883, 'batch_loss/train': 0.7977256111335009} 12/28/2021 02:03:15 - INFO - codeparrot_training - Step 38884: {'lr': 5.893318989445978e-05, 'samples': 
19909120, 'steps': 38884, 'batch_loss/train': 0.7159161258023232} 12/28/2021 02:03:25 - INFO - codeparrot_training - Step 38885: {'lr': 5.892302463419283e-05, 'samples': 19909632, 'steps': 38885, 'batch_loss/train': 0.9951320020772982} 12/28/2021 02:03:38 - INFO - codeparrot_training - Step 38886: {'lr': 5.891286013357569e-05, 'samples': 19910144, 'steps': 38886, 'batch_loss/train': 0.9006099388934672} 12/28/2021 02:03:48 - INFO - codeparrot_training - Step 38887: {'lr': 5.890269639264886e-05, 'samples': 19910656, 'steps': 38887, 'batch_loss/train': 0.7320769582875073} 12/28/2021 02:03:59 - INFO - codeparrot_training - Step 38888: {'lr': 5.889253341145279e-05, 'samples': 19911168, 'steps': 38888, 'batch_loss/train': 0.9140479024499655} 12/28/2021 02:04:11 - INFO - codeparrot_training - Step 38889: {'lr': 5.8882371190027814e-05, 'samples': 19911680, 'steps': 38889, 'batch_loss/train': 0.835481708869338} 12/28/2021 02:04:22 - INFO - codeparrot_training - Step 38890: {'lr': 5.8872209728414365e-05, 'samples': 19912192, 'steps': 38890, 'batch_loss/train': 0.8013046605046839} 12/28/2021 02:04:33 - INFO - codeparrot_training - Step 38891: {'lr': 5.886204902665293e-05, 'samples': 19912704, 'steps': 38891, 'batch_loss/train': 0.7372593968175352} 12/28/2021 02:04:45 - INFO - codeparrot_training - Step 38892: {'lr': 5.885188908478367e-05, 'samples': 19913216, 'steps': 38892, 'batch_loss/train': 0.7445608861744404} 12/28/2021 02:04:55 - INFO - codeparrot_training - Step 38893: {'lr': 5.884172990284722e-05, 'samples': 19913728, 'steps': 38893, 'batch_loss/train': 0.8835845813155174} 12/28/2021 02:05:06 - INFO - codeparrot_training - Step 38894: {'lr': 5.8831571480883916e-05, 'samples': 19914240, 'steps': 38894, 'batch_loss/train': 0.8169951569288969} 12/28/2021 02:05:17 - INFO - codeparrot_training - Step 38895: {'lr': 5.882141381893402e-05, 'samples': 19914752, 'steps': 38895, 'batch_loss/train': 0.7605839520692825} 12/28/2021 02:05:29 - INFO - codeparrot_training - Step 
38896: {'lr': 5.8811256917037954e-05, 'samples': 19915264, 'steps': 38896, 'batch_loss/train': 0.6633371226489544} 12/28/2021 02:05:39 - INFO - codeparrot_training - Step 38897: {'lr': 5.880110077523626e-05, 'samples': 19915776, 'steps': 38897, 'batch_loss/train': 0.5991840711794794} 12/28/2021 02:05:50 - INFO - codeparrot_training - Step 38898: {'lr': 5.879094539356913e-05, 'samples': 19916288, 'steps': 38898, 'batch_loss/train': 0.8221656689420342} 12/28/2021 02:06:03 - INFO - codeparrot_training - Step 38899: {'lr': 5.878079077207696e-05, 'samples': 19916800, 'steps': 38899, 'batch_loss/train': 0.6623728199629113} 12/28/2021 02:06:13 - INFO - codeparrot_training - Step 38900: {'lr': 5.8770636910800286e-05, 'samples': 19917312, 'steps': 38900, 'batch_loss/train': 0.6857699257088825} 12/28/2021 02:06:24 - INFO - codeparrot_training - Step 38901: {'lr': 5.8760483809779274e-05, 'samples': 19917824, 'steps': 38901, 'batch_loss/train': 0.6419986761175096} 12/28/2021 02:06:36 - INFO - codeparrot_training - Step 38902: {'lr': 5.875033146905437e-05, 'samples': 19918336, 'steps': 38902, 'batch_loss/train': 0.7975829299539328} 12/28/2021 02:06:47 - INFO - codeparrot_training - Step 38903: {'lr': 5.874017988866595e-05, 'samples': 19918848, 'steps': 38903, 'batch_loss/train': 0.6889579384587705} 12/28/2021 02:06:57 - INFO - codeparrot_training - Step 38904: {'lr': 5.873002906865435e-05, 'samples': 19919360, 'steps': 38904, 'batch_loss/train': 0.8137488844804466} 12/28/2021 02:07:08 - INFO - codeparrot_training - Step 38905: {'lr': 5.871987900905992e-05, 'samples': 19919872, 'steps': 38905, 'batch_loss/train': 0.6281972462311387} 12/28/2021 02:07:21 - INFO - codeparrot_training - Step 38906: {'lr': 5.870972970992311e-05, 'samples': 19920384, 'steps': 38906, 'batch_loss/train': 0.7626373274251819} 12/28/2021 02:07:32 - INFO - codeparrot_training - Step 38907: {'lr': 5.8699581171284046e-05, 'samples': 19920896, 'steps': 38907, 'batch_loss/train': 1.361049523577094} 12/28/2021 
02:07:42 - INFO - codeparrot_training - Step 38908: {'lr': 5.868943339318336e-05, 'samples': 19921408, 'steps': 38908, 'batch_loss/train': 0.9095083484426141} 12/28/2021 02:07:55 - INFO - codeparrot_training - Step 38909: {'lr': 5.8679286375661166e-05, 'samples': 19921920, 'steps': 38909, 'batch_loss/train': 0.7304031775565818} 12/28/2021 02:08:05 - INFO - codeparrot_training - Step 38910: {'lr': 5.866914011875782e-05, 'samples': 19922432, 'steps': 38910, 'batch_loss/train': 0.7883678581565619} 12/28/2021 02:08:16 - INFO - codeparrot_training - Step 38911: {'lr': 5.8658994622513894e-05, 'samples': 19922944, 'steps': 38911, 'batch_loss/train': 0.6284033651463687} 12/28/2021 02:08:27 - INFO - codeparrot_training - Step 38912: {'lr': 5.8648849886969474e-05, 'samples': 19923456, 'steps': 38912, 'batch_loss/train': 0.5996469082310796} 12/28/2021 02:08:39 - INFO - codeparrot_training - Step 38913: {'lr': 5.863870591216497e-05, 'samples': 19923968, 'steps': 38913, 'batch_loss/train': 0.6934993928298354} 12/28/2021 02:08:49 - INFO - codeparrot_training - Step 38914: {'lr': 5.862856269814071e-05, 'samples': 19924480, 'steps': 38914, 'batch_loss/train': 0.8294217772781849} 12/28/2021 02:09:00 - INFO - codeparrot_training - Step 38915: {'lr': 5.861842024493705e-05, 'samples': 19924992, 'steps': 38915, 'batch_loss/train': 0.6173024796880782} 12/28/2021 02:09:13 - INFO - codeparrot_training - Step 38916: {'lr': 5.8608278552594265e-05, 'samples': 19925504, 'steps': 38916, 'batch_loss/train': 0.6793924970552325} 12/28/2021 02:09:23 - INFO - codeparrot_training - Step 38917: {'lr': 5.8598137621152766e-05, 'samples': 19926016, 'steps': 38917, 'batch_loss/train': 0.8962904778309166} 12/28/2021 02:09:34 - INFO - codeparrot_training - Step 38918: {'lr': 5.8587997450652686e-05, 'samples': 19926528, 'steps': 38918, 'batch_loss/train': 0.920258960686624} 12/28/2021 02:09:46 - INFO - codeparrot_training - Step 38919: {'lr': 5.8577858041134494e-05, 'samples': 19927040, 'steps': 38919, 
'batch_loss/train': 0.7260876779910177} 12/28/2021 02:09:57 - INFO - codeparrot_training - Step 38920: {'lr': 5.8567719392638555e-05, 'samples': 19927552, 'steps': 38920, 'batch_loss/train': 0.7125096749514341} 12/28/2021 02:10:07 - INFO - codeparrot_training - Step 38921: {'lr': 5.855758150520493e-05, 'samples': 19928064, 'steps': 38921, 'batch_loss/train': 0.7646785406395793} 12/28/2021 02:10:18 - INFO - codeparrot_training - Step 38922: {'lr': 5.854744437887413e-05, 'samples': 19928576, 'steps': 38922, 'batch_loss/train': 0.7818533470854163} 12/28/2021 02:10:30 - INFO - codeparrot_training - Step 38923: {'lr': 5.853730801368651e-05, 'samples': 19929088, 'steps': 38923, 'batch_loss/train': 0.7272936915978789} 12/28/2021 02:10:41 - INFO - codeparrot_training - Step 38924: {'lr': 5.852717240968216e-05, 'samples': 19929600, 'steps': 38924, 'batch_loss/train': 0.7588916667737067} 12/28/2021 02:10:51 - INFO - codeparrot_training - Step 38925: {'lr': 5.851703756690141e-05, 'samples': 19930112, 'steps': 38925, 'batch_loss/train': 0.8576550208963454} 12/28/2021 02:11:04 - INFO - codeparrot_training - Step 38926: {'lr': 5.850690348538476e-05, 'samples': 19930624, 'steps': 38926, 'batch_loss/train': 0.7033779537305236} 12/28/2021 02:11:15 - INFO - codeparrot_training - Step 38927: {'lr': 5.8496770165172295e-05, 'samples': 19931136, 'steps': 38927, 'batch_loss/train': 0.8145476386416703} 12/28/2021 02:11:25 - INFO - codeparrot_training - Step 38928: {'lr': 5.848663760630435e-05, 'samples': 19931648, 'steps': 38928, 'batch_loss/train': 0.6935524800792336} 12/28/2021 02:11:37 - INFO - codeparrot_training - Step 38929: {'lr': 5.847650580882122e-05, 'samples': 19932160, 'steps': 38929, 'batch_loss/train': 0.6838181735947728} 12/28/2021 02:11:48 - INFO - codeparrot_training - Step 38930: {'lr': 5.84663747727632e-05, 'samples': 19932672, 'steps': 38930, 'batch_loss/train': 0.6445499290130101} 12/28/2021 02:11:59 - INFO - codeparrot_training - Step 38931: {'lr': 
5.8456244498170554e-05, 'samples': 19933184, 'steps': 38931, 'batch_loss/train': 0.7109343875199556} 12/28/2021 02:12:09 - INFO - codeparrot_training - Step 38932: {'lr': 5.844611498508362e-05, 'samples': 19933696, 'steps': 38932, 'batch_loss/train': 0.7128542610444129} 12/28/2021 02:12:22 - INFO - codeparrot_training - Step 38933: {'lr': 5.8435986233542456e-05, 'samples': 19934208, 'steps': 38933, 'batch_loss/train': 0.80335792619735} 12/28/2021 02:12:32 - INFO - codeparrot_training - Step 38934: {'lr': 5.842585824358754e-05, 'samples': 19934720, 'steps': 38934, 'batch_loss/train': 0.8799339402467012} 12/28/2021 02:12:43 - INFO - codeparrot_training - Step 38935: {'lr': 5.8415731015259185e-05, 'samples': 19935232, 'steps': 38935, 'batch_loss/train': 0.754775661509484} 12/28/2021 02:12:56 - INFO - codeparrot_training - Step 38936: {'lr': 5.840560454859736e-05, 'samples': 19935744, 'steps': 38936, 'batch_loss/train': 0.7517993391957134} 12/28/2021 02:13:06 - INFO - codeparrot_training - Step 38937: {'lr': 5.839547884364269e-05, 'samples': 19936256, 'steps': 38937, 'batch_loss/train': 0.7749404646456242} 12/28/2021 02:13:17 - INFO - codeparrot_training - Step 38938: {'lr': 5.8385353900435146e-05, 'samples': 19936768, 'steps': 38938, 'batch_loss/train': 0.6484752336982638} 12/28/2021 02:13:29 - INFO - codeparrot_training - Step 38939: {'lr': 5.83752297190151e-05, 'samples': 19937280, 'steps': 38939, 'batch_loss/train': 0.7580258338712156} 12/28/2021 02:13:40 - INFO - codeparrot_training - Step 38940: {'lr': 5.836510629942276e-05, 'samples': 19937792, 'steps': 38940, 'batch_loss/train': 0.5833626193343662} 12/28/2021 02:13:51 - INFO - codeparrot_training - Step 38941: {'lr': 5.8354983641698445e-05, 'samples': 19938304, 'steps': 38941, 'batch_loss/train': 0.6750135456677526} 12/28/2021 02:14:01 - INFO - codeparrot_training - Step 38942: {'lr': 5.834486174588233e-05, 'samples': 19938816, 'steps': 38942, 'batch_loss/train': 0.724767139647156} 12/28/2021 02:14:13 - INFO - 
codeparrot_training - Step 38943: {'lr': 5.8334740612014656e-05, 'samples': 19939328, 'steps': 38943, 'batch_loss/train': 0.7900272887200117} 12/28/2021 02:14:24 - INFO - codeparrot_training - Step 38944: {'lr': 5.832462024013571e-05, 'samples': 19939840, 'steps': 38944, 'batch_loss/train': 0.6468957148026675} 12/28/2021 02:14:35 - INFO - codeparrot_training - Step 38945: {'lr': 5.8314500630285705e-05, 'samples': 19940352, 'steps': 38945, 'batch_loss/train': 0.6623841619584709} 12/28/2021 02:14:48 - INFO - codeparrot_training - Step 38946: {'lr': 5.830438178250491e-05, 'samples': 19940864, 'steps': 38946, 'batch_loss/train': 0.7803747630678117} 12/28/2021 02:14:59 - INFO - codeparrot_training - Step 38947: {'lr': 5.829426369683341e-05, 'samples': 19941376, 'steps': 38947, 'batch_loss/train': 0.5436883964575827} 12/28/2021 02:15:09 - INFO - codeparrot_training - Step 38948: {'lr': 5.828414637331158e-05, 'samples': 19941888, 'steps': 38948, 'batch_loss/train': 0.7300732512958348} 12/28/2021 02:15:21 - INFO - codeparrot_training - Step 38949: {'lr': 5.827402981197966e-05, 'samples': 19942400, 'steps': 38949, 'batch_loss/train': 0.6945686156395823} 12/28/2021 02:15:32 - INFO - codeparrot_training - Step 38950: {'lr': 5.8263914012877755e-05, 'samples': 19942912, 'steps': 38950, 'batch_loss/train': 0.7090977104380727} 12/28/2021 02:15:43 - INFO - codeparrot_training - Step 38951: {'lr': 5.825379897604605e-05, 'samples': 19943424, 'steps': 38951, 'batch_loss/train': 0.730247707106173} 12/28/2021 02:15:55 - INFO - codeparrot_training - Step 38952: {'lr': 5.824368470152497e-05, 'samples': 19943936, 'steps': 38952, 'batch_loss/train': 0.6962484261021018} 12/28/2021 02:16:05 - INFO - codeparrot_training - Step 38953: {'lr': 5.823357118935452e-05, 'samples': 19944448, 'steps': 38953, 'batch_loss/train': 0.8092547622509301} 12/28/2021 02:16:16 - INFO - codeparrot_training - Step 38954: {'lr': 5.822345843957499e-05, 'samples': 19944960, 'steps': 38954, 'batch_loss/train': 
0.6588497923221439} 12/28/2021 02:16:26 - INFO - codeparrot_training - Step 38955: {'lr': 5.821334645222659e-05, 'samples': 19945472, 'steps': 38955, 'batch_loss/train': 0.6019646821077913} 12/28/2021 02:16:39 - INFO - codeparrot_training - Step 38956: {'lr': 5.820323522734949e-05, 'samples': 19945984, 'steps': 38956, 'batch_loss/train': 0.7519201007671654} 12/28/2021 02:16:50 - INFO - codeparrot_training - Step 38957: {'lr': 5.8193124764983924e-05, 'samples': 19946496, 'steps': 38957, 'batch_loss/train': 0.7987021167064086} 12/28/2021 02:17:00 - INFO - codeparrot_training - Step 38958: {'lr': 5.818301506517007e-05, 'samples': 19947008, 'steps': 38958, 'batch_loss/train': 0.8151186485774815} 12/28/2021 02:17:13 - INFO - codeparrot_training - Step 38959: {'lr': 5.8172906127948125e-05, 'samples': 19947520, 'steps': 38959, 'batch_loss/train': 0.755776546895504} 12/28/2021 02:17:23 - INFO - codeparrot_training - Step 38960: {'lr': 5.8162797953358245e-05, 'samples': 19948032, 'steps': 38960, 'batch_loss/train': 0.7254821372916922} 12/28/2021 02:17:34 - INFO - codeparrot_training - Step 38961: {'lr': 5.8152690541440735e-05, 'samples': 19948544, 'steps': 38961, 'batch_loss/train': 0.7464268170297146} 12/28/2021 02:17:46 - INFO - codeparrot_training - Step 38962: {'lr': 5.814258389223556e-05, 'samples': 19949056, 'steps': 38962, 'batch_loss/train': 0.6244916282594204} 12/28/2021 02:17:57 - INFO - codeparrot_training - Step 38963: {'lr': 5.813247800578314e-05, 'samples': 19949568, 'steps': 38963, 'batch_loss/train': 0.8174523927737027} 12/28/2021 02:18:07 - INFO - codeparrot_training - Step 38964: {'lr': 5.812237288212349e-05, 'samples': 19950080, 'steps': 38964, 'batch_loss/train': 0.6573232733644545} 12/28/2021 02:18:18 - INFO - codeparrot_training - Step 38965: {'lr': 5.8112268521296834e-05, 'samples': 19950592, 'steps': 38965, 'batch_loss/train': 0.7207604278810322} 12/28/2021 02:18:31 - INFO - codeparrot_training - Step 38966: {'lr': 5.810216492334336e-05, 'samples': 
19951104, 'steps': 38966, 'batch_loss/train': 0.6564660184085369} 12/28/2021 02:18:42 - INFO - codeparrot_training - Step 38967: {'lr': 5.80920620883032e-05, 'samples': 19951616, 'steps': 38967, 'batch_loss/train': 0.7251774608157575} 12/28/2021 02:18:52 - INFO - codeparrot_training - Step 38968: {'lr': 5.808196001621654e-05, 'samples': 19952128, 'steps': 38968, 'batch_loss/train': 0.7116457913070917} 12/28/2021 02:19:04 - INFO - codeparrot_training - Step 38969: {'lr': 5.807185870712356e-05, 'samples': 19952640, 'steps': 38969, 'batch_loss/train': 0.5018366044387221} 12/28/2021 02:19:15 - INFO - codeparrot_training - Step 38970: {'lr': 5.806175816106438e-05, 'samples': 19953152, 'steps': 38970, 'batch_loss/train': 0.7952136322855949} 12/28/2021 02:19:26 - INFO - codeparrot_training - Step 38971: {'lr': 5.8051658378079184e-05, 'samples': 19953664, 'steps': 38971, 'batch_loss/train': 0.8492340594530106} 12/28/2021 02:19:38 - INFO - codeparrot_training - Step 38972: {'lr': 5.804155935820818e-05, 'samples': 19954176, 'steps': 38972, 'batch_loss/train': 0.6582739867735654} 12/28/2021 02:19:49 - INFO - codeparrot_training - Step 38973: {'lr': 5.803146110149132e-05, 'samples': 19954688, 'steps': 38973, 'batch_loss/train': 0.7093100640922785} 12/28/2021 02:19:59 - INFO - codeparrot_training - Step 38974: {'lr': 5.802136360796895e-05, 'samples': 19955200, 'steps': 38974, 'batch_loss/train': 0.7261002147570252} 12/28/2021 02:20:10 - INFO - codeparrot_training - Step 38975: {'lr': 5.801126687768124e-05, 'samples': 19955712, 'steps': 38975, 'batch_loss/train': 0.7064434089697897} 12/28/2021 02:20:23 - INFO - codeparrot_training - Step 38976: {'lr': 5.800117091066814e-05, 'samples': 19956224, 'steps': 38976, 'batch_loss/train': 0.7817306099459529} 12/28/2021 02:20:33 - INFO - codeparrot_training - Step 38977: {'lr': 5.799107570696985e-05, 'samples': 19956736, 'steps': 38977, 'batch_loss/train': 0.6807959484867752} 12/28/2021 02:20:44 - INFO - codeparrot_training - Step 38978: 
{'lr': 5.798098126662665e-05, 'samples': 19957248, 'steps': 38978, 'batch_loss/train': 0.6300896471366286} 12/28/2021 02:20:56 - INFO - codeparrot_training - Step 38979: {'lr': 5.797088758967853e-05, 'samples': 19957760, 'steps': 38979, 'batch_loss/train': 0.7736022057943046} 12/28/2021 02:21:07 - INFO - codeparrot_training - Step 38980: {'lr': 5.796079467616563e-05, 'samples': 19958272, 'steps': 38980, 'batch_loss/train': 0.7935903551988304} 12/28/2021 02:21:17 - INFO - codeparrot_training - Step 38981: {'lr': 5.7950702526128104e-05, 'samples': 19958784, 'steps': 38981, 'batch_loss/train': 1.0308968294411898} 12/28/2021 02:21:29 - INFO - codeparrot_training - Step 38982: {'lr': 5.794061113960608e-05, 'samples': 19959296, 'steps': 38982, 'batch_loss/train': 0.7609772710129619} 12/28/2021 02:21:40 - INFO - codeparrot_training - Step 38983: {'lr': 5.793052051663966e-05, 'samples': 19959808, 'steps': 38983, 'batch_loss/train': 0.8116559288464487} 12/28/2021 02:21:51 - INFO - codeparrot_training - Step 38984: {'lr': 5.792043065726898e-05, 'samples': 19960320, 'steps': 38984, 'batch_loss/train': 0.7857253029942513} 12/28/2021 02:22:01 - INFO - codeparrot_training - Step 38985: {'lr': 5.791034156153413e-05, 'samples': 19960832, 'steps': 38985, 'batch_loss/train': 0.7978682722896338} 12/28/2021 02:22:14 - INFO - codeparrot_training - Step 38986: {'lr': 5.790025322947523e-05, 'samples': 19961344, 'steps': 38986, 'batch_loss/train': 0.7780062509700656} 12/28/2021 02:22:25 - INFO - codeparrot_training - Step 38987: {'lr': 5.789016566113245e-05, 'samples': 19961856, 'steps': 38987, 'batch_loss/train': 0.7410834105685353} 12/28/2021 02:22:35 - INFO - codeparrot_training - Step 38988: {'lr': 5.78800788565457e-05, 'samples': 19962368, 'steps': 38988, 'batch_loss/train': 0.7889512377441861} 12/28/2021 02:22:47 - INFO - codeparrot_training - Step 38989: {'lr': 5.7869992815755305e-05, 'samples': 19962880, 'steps': 38989, 'batch_loss/train': 0.7537519251927733} 12/28/2021 02:22:58 - 
INFO - codeparrot_training - Step 38990: {'lr': 5.785990753880133e-05, 'samples': 19963392, 'steps': 38990, 'batch_loss/train': 0.6963763313833624} 12/28/2021 02:23:09 - INFO - codeparrot_training - Step 38991: {'lr': 5.7849823025723726e-05, 'samples': 19963904, 'steps': 38991, 'batch_loss/train': 0.7478000009432435} 12/28/2021 02:23:22 - INFO - codeparrot_training - Step 38992: {'lr': 5.783973927656269e-05, 'samples': 19964416, 'steps': 38992, 'batch_loss/train': 0.6848917035385966} 12/28/2021 02:23:32 - INFO - codeparrot_training - Step 38993: {'lr': 5.782965629135828e-05, 'samples': 19964928, 'steps': 38993, 'batch_loss/train': 0.7143084367271513} 12/28/2021 02:23:43 - INFO - codeparrot_training - Step 38994: {'lr': 5.781957407015062e-05, 'samples': 19965440, 'steps': 38994, 'batch_loss/train': 0.7902106279507279} 12/28/2021 02:23:54 - INFO - codeparrot_training - Step 38995: {'lr': 5.780949261297974e-05, 'samples': 19965952, 'steps': 38995, 'batch_loss/train': 0.5597991977119818} 12/28/2021 02:24:06 - INFO - codeparrot_training - Step 38996: {'lr': 5.7799411919885774e-05, 'samples': 19966464, 'steps': 38996, 'batch_loss/train': 0.7303906921297312} 12/28/2021 02:24:16 - INFO - codeparrot_training - Step 38997: {'lr': 5.7789331990908775e-05, 'samples': 19966976, 'steps': 38997, 'batch_loss/train': 0.7292778645642102} 12/28/2021 02:24:27 - INFO - codeparrot_training - Step 38998: {'lr': 5.777925282608887e-05, 'samples': 19967488, 'steps': 38998, 'batch_loss/train': 0.9318590294569731} 12/28/2021 02:24:39 - INFO - codeparrot_training - Step 38999: {'lr': 5.776917442546595e-05, 'samples': 19968000, 'steps': 38999, 'batch_loss/train': 0.7915007802657783} 12/28/2021 02:24:50 - INFO - codeparrot_training - Step 39000: {'lr': 5.775909678908026e-05, 'samples': 19968512, 'steps': 39000, 'batch_loss/train': 0.7988896872848272} 12/28/2021 02:25:00 - INFO - codeparrot_training - Step 39001: {'lr': 5.77490199169719e-05, 'samples': 19969024, 'steps': 39001, 'batch_loss/train': 
0.7061845031566918} 12/28/2021 02:25:12 - INFO - codeparrot_training - Step 39002: {'lr': 5.773894380918077e-05, 'samples': 19969536, 'steps': 39002, 'batch_loss/train': 0.7446090672165155} 12/28/2021 02:25:23 - INFO - codeparrot_training - Step 39003: {'lr': 5.772886846574693e-05, 'samples': 19970048, 'steps': 39003, 'batch_loss/train': 0.6968137905932963} 12/28/2021 02:25:34 - INFO - codeparrot_training - Step 39004: {'lr': 5.771879388671064e-05, 'samples': 19970560, 'steps': 39004, 'batch_loss/train': 0.5897177751176059} 12/28/2021 02:25:44 - INFO - codeparrot_training - Step 39005: {'lr': 5.7708720072111784e-05, 'samples': 19971072, 'steps': 39005, 'batch_loss/train': 0.5682974413502961} 12/28/2021 02:25:57 - INFO - codeparrot_training - Step 39006: {'lr': 5.769864702199043e-05, 'samples': 19971584, 'steps': 39006, 'batch_loss/train': 0.7232973081991076} 12/28/2021 02:26:08 - INFO - codeparrot_training - Step 39007: {'lr': 5.768857473638664e-05, 'samples': 19972096, 'steps': 39007, 'batch_loss/train': 0.7652281054761261} 12/28/2021 02:26:18 - INFO - codeparrot_training - Step 39008: {'lr': 5.767850321534049e-05, 'samples': 19972608, 'steps': 39008, 'batch_loss/train': 0.7344508320093155} 12/28/2021 02:26:31 - INFO - codeparrot_training - Step 39009: {'lr': 5.766843245889197e-05, 'samples': 19973120, 'steps': 39009, 'batch_loss/train': 0.686232260428369} 12/28/2021 02:26:41 - INFO - codeparrot_training - Step 39010: {'lr': 5.7658362467081146e-05, 'samples': 19973632, 'steps': 39010, 'batch_loss/train': 0.7058380823582411} 12/28/2021 02:26:52 - INFO - codeparrot_training - Step 39011: {'lr': 5.764829323994805e-05, 'samples': 19974144, 'steps': 39011, 'batch_loss/train': 0.7202324038371444} 12/28/2021 02:27:04 - INFO - codeparrot_training - Step 39012: {'lr': 5.763822477753269e-05, 'samples': 19974656, 'steps': 39012, 'batch_loss/train': 0.8386450242251158} 12/28/2021 02:27:15 - INFO - codeparrot_training - Step 39013: {'lr': 5.7628157079875235e-05, 'samples': 
19975168, 'steps': 39013, 'batch_loss/train': 0.8295177679974586} 12/28/2021 02:27:25 - INFO - codeparrot_training - Step 39014: {'lr': 5.761809014701541e-05, 'samples': 19975680, 'steps': 39014, 'batch_loss/train': 0.7948974017053843} 12/28/2021 02:27:36 - INFO - codeparrot_training - Step 39015: {'lr': 5.760802397899348e-05, 'samples': 19976192, 'steps': 39015, 'batch_loss/train': 0.7246670797467232} 12/28/2021 02:27:49 - INFO - codeparrot_training - Step 39016: {'lr': 5.759795857584951e-05, 'samples': 19976704, 'steps': 39016, 'batch_loss/train': 0.7858442701399326} 12/28/2021 02:28:00 - INFO - codeparrot_training - Step 39017: {'lr': 5.7587893937623316e-05, 'samples': 19977216, 'steps': 39017, 'batch_loss/train': 0.7908126343972981} 12/28/2021 02:28:10 - INFO - codeparrot_training - Step 39018: {'lr': 5.757783006435499e-05, 'samples': 19977728, 'steps': 39018, 'batch_loss/train': 0.8211064422503114} 12/28/2021 02:28:22 - INFO - codeparrot_training - Step 39019: {'lr': 5.756776695608459e-05, 'samples': 19978240, 'steps': 39019, 'batch_loss/train': 0.7214486096054316} 12/28/2021 02:28:33 - INFO - codeparrot_training - Step 39020: {'lr': 5.7557704612852065e-05, 'samples': 19978752, 'steps': 39020, 'batch_loss/train': 0.7154511697590351} 12/28/2021 02:28:43 - INFO - codeparrot_training - Step 39021: {'lr': 5.754764303469745e-05, 'samples': 19979264, 'steps': 39021, 'batch_loss/train': 0.7008446189574897} 12/28/2021 02:28:56 - INFO - codeparrot_training - Step 39022: {'lr': 5.753758222166072e-05, 'samples': 19979776, 'steps': 39022, 'batch_loss/train': 0.7345029781572521} 12/28/2021 02:29:07 - INFO - codeparrot_training - Step 39023: {'lr': 5.752752217378193e-05, 'samples': 19980288, 'steps': 39023, 'batch_loss/train': 0.715320291928947} 12/28/2021 02:29:18 - INFO - codeparrot_training - Step 39024: {'lr': 5.751746289110102e-05, 'samples': 19980800, 'steps': 39024, 'batch_loss/train': 0.7005163319408894} 12/28/2021 02:29:30 - INFO - codeparrot_training - Step 39025: 
{'lr': 5.750740437365798e-05, 'samples': 19981312, 'steps': 39025, 'batch_loss/train': 0.719032846391201} 12/28/2021 02:29:40 - INFO - codeparrot_training - Step 39026: {'lr': 5.7497346621492836e-05, 'samples': 19981824, 'steps': 39026, 'batch_loss/train': 0.7662445800378919} 12/28/2021 02:29:51 - INFO - codeparrot_training - Step 39027: {'lr': 5.748728963464564e-05, 'samples': 19982336, 'steps': 39027, 'batch_loss/train': 0.6881159152835608} 12/28/2021 02:30:02 - INFO - codeparrot_training - Step 39028: {'lr': 5.747723341315619e-05, 'samples': 19982848, 'steps': 39028, 'batch_loss/train': 0.6125962869264185} 12/28/2021 02:30:14 - INFO - codeparrot_training - Step 39029: {'lr': 5.746717795706452e-05, 'samples': 19983360, 'steps': 39029, 'batch_loss/train': 0.7205719528719783} 12/28/2021 02:30:24 - INFO - codeparrot_training - Step 39030: {'lr': 5.745712326641078e-05, 'samples': 19983872, 'steps': 39030, 'batch_loss/train': 0.6836889665573835} 12/28/2021 02:30:35 - INFO - codeparrot_training - Step 39031: {'lr': 5.7447069341234754e-05, 'samples': 19984384, 'steps': 39031, 'batch_loss/train': 0.7169577945023775} 12/28/2021 02:30:48 - INFO - codeparrot_training - Step 39032: {'lr': 5.74370161815764e-05, 'samples': 19984896, 'steps': 39032, 'batch_loss/train': 0.7066601142287254} 12/28/2021 02:30:58 - INFO - codeparrot_training - Step 39033: {'lr': 5.742696378747589e-05, 'samples': 19985408, 'steps': 39033, 'batch_loss/train': 0.7465839218348265} 12/28/2021 02:31:09 - INFO - codeparrot_training - Step 39034: {'lr': 5.7416912158973016e-05, 'samples': 19985920, 'steps': 39034, 'batch_loss/train': 0.7310161869972944} 12/28/2021 02:31:21 - INFO - codeparrot_training - Step 39035: {'lr': 5.740686129610776e-05, 'samples': 19986432, 'steps': 39035, 'batch_loss/train': 0.6674642886500806} 12/28/2021 02:31:32 - INFO - codeparrot_training - Step 39036: {'lr': 5.739681119892013e-05, 'samples': 19986944, 'steps': 39036, 'batch_loss/train': 1.1293489960953593} 12/28/2021 02:31:42 - 
INFO - codeparrot_training - Step 39037: {'lr': 5.738676186745001e-05, 'samples': 19987456, 'steps': 39037, 'batch_loss/train': 0.809371842071414} 12/28/2021 02:31:53 - INFO - codeparrot_training - Step 39038: {'lr': 5.737671330173744e-05, 'samples': 19987968, 'steps': 39038, 'batch_loss/train': 0.7530388967134058} 12/28/2021 02:32:05 - INFO - codeparrot_training - Step 39039: {'lr': 5.736666550182237e-05, 'samples': 19988480, 'steps': 39039, 'batch_loss/train': 0.6820602947846055} 12/28/2021 02:32:16 - INFO - codeparrot_training - Step 39040: {'lr': 5.7356618467744565e-05, 'samples': 19988992, 'steps': 39040, 'batch_loss/train': 0.7259715888649225} 12/28/2021 02:32:26 - INFO - codeparrot_training - Step 39041: {'lr': 5.73465721995442e-05, 'samples': 19989504, 'steps': 39041, 'batch_loss/train': 0.6936422679573298} 12/28/2021 02:32:39 - INFO - codeparrot_training - Step 39042: {'lr': 5.7336526697261164e-05, 'samples': 19990016, 'steps': 39042, 'batch_loss/train': 0.7342496966011822} 12/28/2021 02:32:49 - INFO - codeparrot_training - Step 39043: {'lr': 5.732648196093529e-05, 'samples': 19990528, 'steps': 39043, 'batch_loss/train': 0.689577161683701} 12/28/2021 02:33:00 - INFO - codeparrot_training - Step 39044: {'lr': 5.731643799060657e-05, 'samples': 19991040, 'steps': 39044, 'batch_loss/train': 0.6413693577633239} 12/28/2021 02:33:13 - INFO - codeparrot_training - Step 39045: {'lr': 5.730639478631494e-05, 'samples': 19991552, 'steps': 39045, 'batch_loss/train': 0.6111188475042582} 12/28/2021 02:33:23 - INFO - codeparrot_training - Step 39046: {'lr': 5.7296352348100324e-05, 'samples': 19992064, 'steps': 39046, 'batch_loss/train': 0.71412233565934} 12/28/2021 02:33:34 - INFO - codeparrot_training - Step 39047: {'lr': 5.728631067600265e-05, 'samples': 19992576, 'steps': 39047, 'batch_loss/train': 0.676537390681915} 12/28/2021 02:33:46 - INFO - codeparrot_training - Step 39048: {'lr': 5.727626977006184e-05, 'samples': 19993088, 'steps': 39048, 'batch_loss/train': 
0.8802973758429289} 12/28/2021 02:33:57 - INFO - codeparrot_training - Step 39049: {'lr': 5.726622963031783e-05, 'samples': 19993600, 'steps': 39049, 'batch_loss/train': 0.687555665615946} 12/28/2021 02:34:08 - INFO - codeparrot_training - Step 39050: {'lr': 5.725619025681048e-05, 'samples': 19994112, 'steps': 39050, 'batch_loss/train': 0.6066041670273989} 12/28/2021 02:34:18 - INFO - codeparrot_training - Step 39051: {'lr': 5.7246151649579776e-05, 'samples': 19994624, 'steps': 39051, 'batch_loss/train': 0.7209881348535419} 12/28/2021 02:34:31 - INFO - codeparrot_training - Step 39052: {'lr': 5.7236113808665554e-05, 'samples': 19995136, 'steps': 39052, 'batch_loss/train': 0.6771249966695905} 12/28/2021 02:34:42 - INFO - codeparrot_training - Step 39053: {'lr': 5.7226076734107875e-05, 'samples': 19995648, 'steps': 39053, 'batch_loss/train': 0.7578171142376959} 12/28/2021 02:34:52 - INFO - codeparrot_training - Step 39054: {'lr': 5.721604042594641e-05, 'samples': 19996160, 'steps': 39054, 'batch_loss/train': 0.7031766835134476} 12/28/2021 02:35:04 - INFO - codeparrot_training - Step 39055: {'lr': 5.720600488422115e-05, 'samples': 19996672, 'steps': 39055, 'batch_loss/train': 0.7847476839087903} 12/28/2021 02:35:15 - INFO - codeparrot_training - Step 39056: {'lr': 5.719597010897212e-05, 'samples': 19997184, 'steps': 39056, 'batch_loss/train': 0.8297790249343961} 12/28/2021 02:35:26 - INFO - codeparrot_training - Step 39057: {'lr': 5.718593610023906e-05, 'samples': 19997696, 'steps': 39057, 'batch_loss/train': 0.7425177227705717} 12/28/2021 02:35:38 - INFO - codeparrot_training - Step 39058: {'lr': 5.7175902858061846e-05, 'samples': 19998208, 'steps': 39058, 'batch_loss/train': 0.7645022207871079} 12/28/2021 02:35:49 - INFO - codeparrot_training - Step 39059: {'lr': 5.7165870382480586e-05, 'samples': 19998720, 'steps': 39059, 'batch_loss/train': 0.7496865149587393} 12/28/2021 02:35:59 - INFO - codeparrot_training - Step 39060: {'lr': 5.715583867353491e-05, 'samples': 
19999232, 'steps': 39060, 'batch_loss/train': 0.7210771018872038} 12/28/2021 02:36:10 - INFO - codeparrot_training - Step 39061: {'lr': 5.714580773126482e-05, 'samples': 19999744, 'steps': 39061, 'batch_loss/train': 0.7269144912716001} 12/28/2021 02:36:23 - INFO - codeparrot_training - Step 39062: {'lr': 5.71357775557102e-05, 'samples': 20000256, 'steps': 39062, 'batch_loss/train': 0.7169050709344447} 12/28/2021 02:36:34 - INFO - codeparrot_training - Step 39063: {'lr': 5.712574814691088e-05, 'samples': 20000768, 'steps': 39063, 'batch_loss/train': 0.7519657616503537} 12/28/2021 02:36:44 - INFO - codeparrot_training - Step 39064: {'lr': 5.711571950490676e-05, 'samples': 20001280, 'steps': 39064, 'batch_loss/train': 0.786591480486095} 12/28/2021 02:36:56 - INFO - codeparrot_training - Step 39065: {'lr': 5.7105691629737805e-05, 'samples': 20001792, 'steps': 39065, 'batch_loss/train': 2.4883801080286503} 12/28/2021 02:37:07 - INFO - codeparrot_training - Step 39066: {'lr': 5.709566452144363e-05, 'samples': 20002304, 'steps': 39066, 'batch_loss/train': 0.7733284472487867} 12/28/2021 02:37:18 - INFO - codeparrot_training - Step 39067: {'lr': 5.708563818006432e-05, 'samples': 20002816, 'steps': 39067, 'batch_loss/train': 0.7624649195931852} 12/28/2021 02:37:31 - INFO - codeparrot_training - Step 39068: {'lr': 5.707561260563973e-05, 'samples': 20003328, 'steps': 39068, 'batch_loss/train': 0.7149886712431908} 12/28/2021 02:37:42 - INFO - codeparrot_training - Step 39069: {'lr': 5.706558779820956e-05, 'samples': 20003840, 'steps': 39069, 'batch_loss/train': 0.7628435073420405} 12/28/2021 02:37:52 - INFO - codeparrot_training - Step 39070: {'lr': 5.705556375781379e-05, 'samples': 20004352, 'steps': 39070, 'batch_loss/train': 1.2716845432296395} 12/28/2021 02:38:03 - INFO - codeparrot_training - Step 39071: {'lr': 5.704554048449234e-05, 'samples': 20004864, 'steps': 39071, 'batch_loss/train': 0.6892978553951252} 12/28/2021 02:38:15 - INFO - codeparrot_training - Step 39072: 
{'lr': 5.7035517978284876e-05, 'samples': 20005376, 'steps': 39072, 'batch_loss/train': 0.7094128026801627} 12/28/2021 02:38:26 - INFO - codeparrot_training - Step 39073: {'lr': 5.702549623923134e-05, 'samples': 20005888, 'steps': 39073, 'batch_loss/train': 0.7301355414092541} 12/28/2021 02:38:36 - INFO - codeparrot_training - Step 39074: {'lr': 5.701547526737158e-05, 'samples': 20006400, 'steps': 39074, 'batch_loss/train': 0.7844263240695} 12/28/2021 02:38:49 - INFO - codeparrot_training - Step 39075: {'lr': 5.70054550627454e-05, 'samples': 20006912, 'steps': 39075, 'batch_loss/train': 0.6926543549634516} 12/28/2021 02:38:59 - INFO - codeparrot_training - Step 39076: {'lr': 5.6995435625392664e-05, 'samples': 20007424, 'steps': 39076, 'batch_loss/train': 0.7904582098126411} 12/28/2021 02:39:10 - INFO - codeparrot_training - Step 39077: {'lr': 5.6985416955353223e-05, 'samples': 20007936, 'steps': 39077, 'batch_loss/train': 0.7208088338375092} 12/28/2021 02:39:21 - INFO - codeparrot_training - Step 39078: {'lr': 5.6975399052666856e-05, 'samples': 20008448, 'steps': 39078, 'batch_loss/train': 0.7370007480494678} 12/28/2021 02:39:33 - INFO - codeparrot_training - Step 39079: {'lr': 5.696538191737344e-05, 'samples': 20008960, 'steps': 39079, 'batch_loss/train': 0.5895890505053103} 12/28/2021 02:39:43 - INFO - codeparrot_training - Step 39080: {'lr': 5.695536554951283e-05, 'samples': 20009472, 'steps': 39080, 'batch_loss/train': 0.7190752467140555} 12/28/2021 02:39:54 - INFO - codeparrot_training - Step 39081: {'lr': 5.6945349949124676e-05, 'samples': 20009984, 'steps': 39081, 'batch_loss/train': 0.8872180634643883} 12/28/2021 02:40:09 - INFO - codeparrot_training - Step 39082: {'lr': 5.6935335116249044e-05, 'samples': 20010496, 'steps': 39082, 'batch_loss/train': 1.465199787169695} 12/28/2021 02:40:19 - INFO - codeparrot_training - Step 39083: {'lr': 5.6925321050925545e-05, 'samples': 20011008, 'steps': 39083, 'batch_loss/train': 0.7064515573438257} 12/28/2021 02:40:30 
- INFO - codeparrot_training - Step 39084: {'lr': 5.691530775319401e-05, 'samples': 20011520, 'steps': 39084, 'batch_loss/train': 0.7416319982148707} 12/28/2021 02:40:40 - INFO - codeparrot_training - Step 39085: {'lr': 5.690529522309446e-05, 'samples': 20012032, 'steps': 39085, 'batch_loss/train': 0.7521039955317974} 12/28/2021 02:40:53 - INFO - codeparrot_training - Step 39086: {'lr': 5.689528346066644e-05, 'samples': 20012544, 'steps': 39086, 'batch_loss/train': 0.8023230582475662} 12/28/2021 02:41:04 - INFO - codeparrot_training - Step 39087: {'lr': 5.6885272465949876e-05, 'samples': 20013056, 'steps': 39087, 'batch_loss/train': 0.6588184274733067} 12/28/2021 02:41:14 - INFO - codeparrot_training - Step 39088: {'lr': 5.687526223898456e-05, 'samples': 20013568, 'steps': 39088, 'batch_loss/train': 0.9240426411852241} 12/28/2021 02:41:26 - INFO - codeparrot_training - Step 39089: {'lr': 5.686525277981028e-05, 'samples': 20014080, 'steps': 39089, 'batch_loss/train': 1.0042285593226552} 12/28/2021 02:41:37 - INFO - codeparrot_training - Step 39090: {'lr': 5.6855244088466816e-05, 'samples': 20014592, 'steps': 39090, 'batch_loss/train': 0.7582023201975971} 12/28/2021 02:41:48 - INFO - codeparrot_training - Step 39091: {'lr': 5.6845236164994043e-05, 'samples': 20015104, 'steps': 39091, 'batch_loss/train': 0.7122832788154483} 12/28/2021 02:42:02 - INFO - codeparrot_training - Step 39092: {'lr': 5.6835229009431556e-05, 'samples': 20015616, 'steps': 39092, 'batch_loss/train': 0.663698447868228} 12/28/2021 02:42:12 - INFO - codeparrot_training - Step 39093: {'lr': 5.68252226218193e-05, 'samples': 20016128, 'steps': 39093, 'batch_loss/train': 0.6419235775247216} 12/28/2021 02:42:23 - INFO - codeparrot_training - Step 39094: {'lr': 5.681521700219713e-05, 'samples': 20016640, 'steps': 39094, 'batch_loss/train': 0.7134287776425481} 12/28/2021 02:42:34 - INFO - codeparrot_training - Step 39095: {'lr': 5.6805212150604554e-05, 'samples': 20017152, 'steps': 39095, 
'batch_loss/train': 0.6403492949903011} 12/28/2021 02:42:46 - INFO - codeparrot_training - Step 39096: {'lr': 5.679520806708158e-05, 'samples': 20017664, 'steps': 39096, 'batch_loss/train': 0.7076187252532691} 12/28/2021 02:42:57 - INFO - codeparrot_training - Step 39097: {'lr': 5.678520475166796e-05, 'samples': 20018176, 'steps': 39097, 'batch_loss/train': 0.7785665476694703} 12/28/2021 02:43:07 - INFO - codeparrot_training - Step 39098: {'lr': 5.677520220440335e-05, 'samples': 20018688, 'steps': 39098, 'batch_loss/train': 0.7536584921181202} 12/28/2021 02:43:20 - INFO - codeparrot_training - Step 39099: {'lr': 5.6765200425327596e-05, 'samples': 20019200, 'steps': 39099, 'batch_loss/train': 0.8052341609727591} 12/28/2021 02:43:30 - INFO - codeparrot_training - Step 39100: {'lr': 5.675519941448043e-05, 'samples': 20019712, 'steps': 39100, 'batch_loss/train': 0.7993134071002714} 12/28/2021 02:43:41 - INFO - codeparrot_training - Step 39101: {'lr': 5.67451991719016e-05, 'samples': 20020224, 'steps': 39101, 'batch_loss/train': 0.722289364784956} 12/28/2021 02:43:55 - INFO - codeparrot_training - Step 39102: {'lr': 5.6735199697630914e-05, 'samples': 20020736, 'steps': 39102, 'batch_loss/train': 0.6629281316418201} 12/28/2021 02:44:06 - INFO - codeparrot_training - Step 39103: {'lr': 5.672520099170811e-05, 'samples': 20021248, 'steps': 39103, 'batch_loss/train': 0.726734041236341} 12/28/2021 02:44:16 - INFO - codeparrot_training - Step 39104: {'lr': 5.671520305417291e-05, 'samples': 20021760, 'steps': 39104, 'batch_loss/train': 0.7363632686901838} 12/28/2021 02:44:27 - INFO - codeparrot_training - Step 39105: {'lr': 5.6705205885065087e-05, 'samples': 20022272, 'steps': 39105, 'batch_loss/train': 0.6647102814167738} 12/28/2021 02:44:39 - INFO - codeparrot_training - Step 39106: {'lr': 5.669520948442439e-05, 'samples': 20022784, 'steps': 39106, 'batch_loss/train': 0.6965447305701673} 12/28/2021 02:44:50 - INFO - codeparrot_training - Step 39107: {'lr': 
5.668521385229053e-05, 'samples': 20023296, 'steps': 39107, 'batch_loss/train': 0.6690406058914959}
12/28/2021 02:45:01 - INFO - codeparrot_training - Step 39108: {'lr': 5.667521898870334e-05, 'samples': 20023808, 'steps': 39108, 'batch_loss/train': 0.6031132033094764}
12/28/2021 02:45:15 - INFO - codeparrot_training - Step 39109: {'lr': 5.666522489370243e-05, 'samples': 20024320, 'steps': 39109, 'batch_loss/train': 0.8675164701417089}
12/28/2021 02:45:25 - INFO - codeparrot_training - Step 39110: {'lr': 5.665523156732752e-05, 'samples': 20024832, 'steps': 39110, 'batch_loss/train': 0.6868296600878239}
12/28/2021 02:45:36 - INFO - codeparrot_training - Step 39111: {'lr': 5.664523900961852e-05, 'samples': 20025344, 'steps': 39111, 'batch_loss/train': 0.7672839937731624}
12/28/2021 02:45:47 - INFO - codeparrot_training - Step 39112: {'lr': 5.6635247220614974e-05, 'samples': 20025856, 'steps': 39112, 'batch_loss/train': 0.6296299035893753}
12/28/2021 02:45:59 - INFO - codeparrot_training - Step 39113: {'lr': 5.6625256200356687e-05, 'samples': 20026368, 'steps': 39113, 'batch_loss/train': 0.7362305582500994}
12/28/2021 02:46:09 - INFO - codeparrot_training - Step 39114: {'lr': 5.661526594888336e-05, 'samples': 20026880, 'steps': 39114, 'batch_loss/train': 0.746927672997117}
12/28/2021 02:46:20 - INFO - codeparrot_training - Step 39115: {'lr': 5.660527646623473e-05, 'samples': 20027392, 'steps': 39115, 'batch_loss/train': 0.7959609311074018}
12/28/2021 02:46:32 - INFO - codeparrot_training - Step 39116: {'lr': 5.6595287752450485e-05, 'samples': 20027904, 'steps': 39116, 'batch_loss/train': 0.7087512463913299}
12/28/2021 02:46:43 - INFO - codeparrot_training - Step 39117: {'lr': 5.6585299807570406e-05, 'samples': 20028416, 'steps': 39117, 'batch_loss/train': 0.7851991504430771}
12/28/2021 02:46:53 - INFO - codeparrot_training - Step 39118: {'lr': 5.657531263163404e-05, 'samples': 20028928, 'steps': 39118, 'batch_loss/train': 0.9075112221762538}
12/28/2021 02:47:07 - INFO - codeparrot_training - Step 39119: {'lr': 5.6565326224681245e-05, 'samples': 20029440, 'steps': 39119, 'batch_loss/train': 0.9487456511706114}
12/28/2021 02:47:18 - INFO - codeparrot_training - Step 39120: {'lr': 5.6555340586751743e-05, 'samples': 20029952, 'steps': 39120, 'batch_loss/train': 1.3595380205661058}
12/28/2021 02:47:28 - INFO - codeparrot_training - Step 39121: {'lr': 5.654535571788502e-05, 'samples': 20030464, 'steps': 39121, 'batch_loss/train': 0.7637912314385176}
12/28/2021 02:47:39 - INFO - codeparrot_training - Step 39122: {'lr': 5.6535371618121e-05, 'samples': 20030976, 'steps': 39122, 'batch_loss/train': 0.7998102982528508}
12/28/2021 02:47:51 - INFO - codeparrot_training - Step 39123: {'lr': 5.652538828749934e-05, 'samples': 20031488, 'steps': 39123, 'batch_loss/train': 0.7372973309829831}
12/28/2021 02:48:02 - INFO - codeparrot_training - Step 39124: {'lr': 5.6515405726059646e-05, 'samples': 20032000, 'steps': 39124, 'batch_loss/train': 2.118893824517727}
12/28/2021 02:48:13 - INFO - codeparrot_training - Step 39125: {'lr': 5.650542393384161e-05, 'samples': 20032512, 'steps': 39125, 'batch_loss/train': 0.8197911288589239}
12/28/2021 02:48:25 - INFO - codeparrot_training - Step 39126: {'lr': 5.649544291088496e-05, 'samples': 20033024, 'steps': 39126, 'batch_loss/train': 0.7527726045809686}
12/28/2021 02:48:35 - INFO - codeparrot_training - Step 39127: {'lr': 5.6485462657229367e-05, 'samples': 20033536, 'steps': 39127, 'batch_loss/train': 0.7638858039863408}
12/28/2021 02:48:46 - INFO - codeparrot_training - Step 39128: {'lr': 5.647548317291451e-05, 'samples': 20034048, 'steps': 39128, 'batch_loss/train': 0.6879172393819317}
12/28/2021 02:48:58 - INFO - codeparrot_training - Step 39129: {'lr': 5.646550445798007e-05, 'samples': 20034560, 'steps': 39129, 'batch_loss/train': 0.646985711529851}
12/28/2021 02:49:09 - INFO - codeparrot_training - Step 39130: {'lr': 5.6455526512465687e-05, 'samples': 20035072, 'steps': 39130, 'batch_loss/train': 0.7899009403772652}
12/28/2021 02:49:19 - INFO - codeparrot_training - Step 39131: {'lr': 5.644554933641105e-05, 'samples': 20035584, 'steps': 39131, 'batch_loss/train': 0.7494337495299987}
12/28/2021 02:49:30 - INFO - codeparrot_training - Step 39132: {'lr': 5.643557292985585e-05, 'samples': 20036096, 'steps': 39132, 'batch_loss/train': 0.6983427956001833}
12/28/2021 02:49:44 - INFO - codeparrot_training - Step 39133: {'lr': 5.642559729283972e-05, 'samples': 20036608, 'steps': 39133, 'batch_loss/train': 1.4299851216783281}
12/28/2021 02:49:55 - INFO - codeparrot_training - Step 39134: {'lr': 5.641562242540238e-05, 'samples': 20037120, 'steps': 39134, 'batch_loss/train': 0.7367000617086887}
12/28/2021 02:50:05 - INFO - codeparrot_training - Step 39135: {'lr': 5.640564832758335e-05, 'samples': 20037632, 'steps': 39135, 'batch_loss/train': 0.7711104564368725}
12/28/2021 02:50:17 - INFO - codeparrot_training - Step 39136: {'lr': 5.63956749994223e-05, 'samples': 20038144, 'steps': 39136, 'batch_loss/train': 0.7112706899642944}
12/28/2021 02:50:28 - INFO - codeparrot_training - Step 39137: {'lr': 5.6385702440959115e-05, 'samples': 20038656, 'steps': 39137, 'batch_loss/train': 0.7594265290535986}
12/28/2021 02:50:39 - INFO - codeparrot_training - Step 39138: {'lr': 5.637573065223317e-05, 'samples': 20039168, 'steps': 39138, 'batch_loss/train': 0.7740198606625199}
12/28/2021 02:50:53 - INFO - codeparrot_training - Step 39139: {'lr': 5.636575963328422e-05, 'samples': 20039680, 'steps': 39139, 'batch_loss/train': 0.7230961916502565}
12/28/2021 02:51:03 - INFO - codeparrot_training - Step 39140: {'lr': 5.6355789384151895e-05, 'samples': 20040192, 'steps': 39140, 'batch_loss/train': 0.7220613472163677}
12/28/2021 02:51:14 - INFO - codeparrot_training - Step 39141: {'lr': 5.6345819904875835e-05, 'samples': 20040704, 'steps': 39141, 'batch_loss/train': 0.7275355746969581}
12/28/2021 02:51:24 - INFO - codeparrot_training - Step 39142: {'lr': 5.633585119549567e-05, 'samples': 20041216, 'steps': 39142, 'batch_loss/train': 0.7147778326179832}
12/28/2021 02:51:37 - INFO - codeparrot_training - Step 39143: {'lr': 5.6325883256051035e-05, 'samples': 20041728, 'steps': 39143, 'batch_loss/train': 0.6062137135304511}
12/28/2021 02:51:48 - INFO - codeparrot_training - Step 39144: {'lr': 5.631591608658157e-05, 'samples': 20042240, 'steps': 39144, 'batch_loss/train': 0.6736447624862194}
12/28/2021 02:51:58 - INFO - codeparrot_training - Step 39145: {'lr': 5.6305949687126876e-05, 'samples': 20042752, 'steps': 39145, 'batch_loss/train': 0.5904196421615779}
12/28/2021 02:52:10 - INFO - codeparrot_training - Step 39146: {'lr': 5.629598405772668e-05, 'samples': 20043264, 'steps': 39146, 'batch_loss/train': 0.6555983740836382}
12/28/2021 02:52:21 - INFO - codeparrot_training - Step 39147: {'lr': 5.6286019198420356e-05, 'samples': 20043776, 'steps': 39147, 'batch_loss/train': 0.7216521929949522}
12/28/2021 02:52:32 - INFO - codeparrot_training - Step 39148: {'lr': 5.627605510924774e-05, 'samples': 20044288, 'steps': 39148, 'batch_loss/train': 0.7461178579251282}
12/28/2021 02:52:45 - INFO - codeparrot_training - Step 39149: {'lr': 5.626609179024847e-05, 'samples': 20044800, 'steps': 39149, 'batch_loss/train': 0.6269946614047512}
12/28/2021 02:52:56 - INFO - codeparrot_training - Step 39150: {'lr': 5.6256129241462006e-05, 'samples': 20045312, 'steps': 39150, 'batch_loss/train': 0.7345927763963118}
12/28/2021 02:53:07 - INFO - codeparrot_training - Step 39151: {'lr': 5.6246167462927934e-05, 'samples': 20045824, 'steps': 39151, 'batch_loss/train': 0.7017435752786696}
12/28/2021 02:53:17 - INFO - codeparrot_training - Step 39152: {'lr': 5.623620645468608e-05, 'samples': 20046336, 'steps': 39152, 'batch_loss/train': 0.7930043372325599}
12/28/2021 02:53:30 - INFO - codeparrot_training - Step 39153: {'lr': 5.622624621677583e-05, 'samples': 20046848, 'steps': 39153, 'batch_loss/train': 0.7817897265776992}
12/28/2021 02:53:40 - INFO - codeparrot_training - Step 39154: {'lr': 5.621628674923684e-05, 'samples': 20047360, 'steps': 39154, 'batch_loss/train': 0.6823260644450784}
12/28/2021 02:53:51 - INFO - codeparrot_training - Step 39155: {'lr': 5.620632805210876e-05, 'samples': 20047872, 'steps': 39155, 'batch_loss/train': 0.8303641565144062}
12/28/2021 02:54:03 - INFO - codeparrot_training - Step 39156: {'lr': 5.619637012543113e-05, 'samples': 20048384, 'steps': 39156, 'batch_loss/train': 0.7641572426073253}
12/28/2021 02:54:14 - INFO - codeparrot_training - Step 39157: {'lr': 5.6186412969243566e-05, 'samples': 20048896, 'steps': 39157, 'batch_loss/train': 0.6384455407969654}
12/28/2021 02:54:24 - INFO - codeparrot_training - Step 39158: {'lr': 5.6176456583585636e-05, 'samples': 20049408, 'steps': 39158, 'batch_loss/train': 0.7418220650870353}
12/28/2021 02:54:38 - INFO - codeparrot_training - Step 39159: {'lr': 5.616650096849693e-05, 'samples': 20049920, 'steps': 39159, 'batch_loss/train': 0.7925353394821286}
12/28/2021 02:54:49 - INFO - codeparrot_training - Step 39160: {'lr': 5.6156546124017024e-05, 'samples': 20050432, 'steps': 39160, 'batch_loss/train': 0.6954695032909513}
12/28/2021 02:54:59 - INFO - codeparrot_training - Step 39161: {'lr': 5.614659205018558e-05, 'samples': 20050944, 'steps': 39161, 'batch_loss/train': 0.787446585483849}
12/28/2021 02:55:10 - INFO - codeparrot_training - Step 39162: {'lr': 5.6136638747041946e-05, 'samples': 20051456, 'steps': 39162, 'batch_loss/train': 0.6652501919306815}
12/28/2021 02:55:22 - INFO - codeparrot_training - Step 39163: {'lr': 5.6126686214625966e-05, 'samples': 20051968, 'steps': 39163, 'batch_loss/train': 0.763237671693787}
12/28/2021 02:55:33 - INFO - codeparrot_training - Step 39164: {'lr': 5.611673445297702e-05, 'samples': 20052480, 'steps': 39164, 'batch_loss/train': 0.7590137803927064}
12/28/2021 02:55:43 - INFO - codeparrot_training - Step 39165: {'lr': 5.6106783462134715e-05, 'samples': 20052992, 'steps': 39165, 'batch_loss/train': 0.8453309866599739}
12/28/2021 02:55:56 - INFO - codeparrot_training - Step 39166: {'lr': 5.609683324213863e-05, 'samples': 20053504, 'steps': 39166, 'batch_loss/train': 0.7180120686534792}
12/28/2021 02:56:06 - INFO - codeparrot_training - Step 39167: {'lr': 5.608688379302834e-05, 'samples': 20054016, 'steps': 39167, 'batch_loss/train': 0.6773507674224675}
12/28/2021 02:56:17 - INFO - codeparrot_training - Step 39168: {'lr': 5.607693511484335e-05, 'samples': 20054528, 'steps': 39168, 'batch_loss/train': 0.7580087664537132}
12/28/2021 02:56:31 - INFO - codeparrot_training - Step 39169: {'lr': 5.606698720762327e-05, 'samples': 20055040, 'steps': 39169, 'batch_loss/train': 0.8091659133788198}
12/28/2021 02:56:41 - INFO - codeparrot_training - Step 39170: {'lr': 5.6057040071407606e-05, 'samples': 20055552, 'steps': 39170, 'batch_loss/train': 0.7885321588255465}
12/28/2021 02:56:52 - INFO - codeparrot_training - Step 39171: {'lr': 5.6047093706235954e-05, 'samples': 20056064, 'steps': 39171, 'batch_loss/train': 0.7913047322072089}
12/28/2021 02:57:03 - INFO - codeparrot_training - Step 39172: {'lr': 5.603714811214786e-05, 'samples': 20056576, 'steps': 39172, 'batch_loss/train': 0.7515617134049535}
12/28/2021 02:57:15 - INFO - codeparrot_training - Step 39173: {'lr': 5.6027203289182706e-05, 'samples': 20057088, 'steps': 39173, 'batch_loss/train': 0.9370538215152919}
12/28/2021 02:57:26 - INFO - codeparrot_training - Step 39174: {'lr': 5.6017259237380215e-05, 'samples': 20057600, 'steps': 39174, 'batch_loss/train': 0.7360365078784525}
12/28/2021 02:57:36 - INFO - codeparrot_training - Step 39175: {'lr': 5.6007315956779905e-05, 'samples': 20058112, 'steps': 39175, 'batch_loss/train': 0.8339464361779392}
12/28/2021 02:57:48 - INFO - codeparrot_training - Step 39176: {'lr': 5.59973734474212e-05, 'samples': 20058624, 'steps': 39176, 'batch_loss/train': 0.7675507105886936}
12/28/2021 02:57:59 - INFO - codeparrot_training - Step 39177: {'lr': 5.598743170934362e-05, 'samples': 20059136, 'steps': 39177, 'batch_loss/train': 0.6837025727145374}
12/28/2021 02:58:10 - INFO - codeparrot_training - Step 39178: {'lr': 5.5977490742586904e-05, 'samples': 20059648, 'steps': 39178, 'batch_loss/train': 0.7424450716935098}
12/28/2021 02:58:23 - INFO - codeparrot_training - Step 39179: {'lr': 5.596755054719036e-05, 'samples': 20060160, 'steps': 39179, 'batch_loss/train': 0.7340706191025674}
12/28/2021 02:58:34 - INFO - codeparrot_training - Step 39180: {'lr': 5.595761112319356e-05, 'samples': 20060672, 'steps': 39180, 'batch_loss/train': 0.7507338719442487}
12/28/2021 02:58:45 - INFO - codeparrot_training - Step 39181: {'lr': 5.5947672470636005e-05, 'samples': 20061184, 'steps': 39181, 'batch_loss/train': 0.7839977703988552}
12/28/2021 02:58:55 - INFO - codeparrot_training - Step 39182: {'lr': 5.593773458955726e-05, 'samples': 20061696, 'steps': 39182, 'batch_loss/train': 0.7723514903336763}
12/28/2021 02:59:07 - INFO - codeparrot_training - Step 39183: {'lr': 5.592779747999682e-05, 'samples': 20062208, 'steps': 39183, 'batch_loss/train': 0.6724197115399875}
12/28/2021 02:59:18 - INFO - codeparrot_training - Step 39184: {'lr': 5.591786114199415e-05, 'samples': 20062720, 'steps': 39184, 'batch_loss/train': 0.7344922749325633}
12/28/2021 02:59:29 - INFO - codeparrot_training - Step 39185: {'lr': 5.59079255755888e-05, 'samples': 20063232, 'steps': 39185, 'batch_loss/train': 0.7533183414489031}
12/28/2021 02:59:42 - INFO - codeparrot_training - Step 39186: {'lr': 5.589799078082025e-05, 'samples': 20063744, 'steps': 39186, 'batch_loss/train': 0.7892286698333919}
12/28/2021 02:59:53 - INFO - codeparrot_training - Step 39187: {'lr': 5.5888056757728065e-05, 'samples': 20064256, 'steps': 39187, 'batch_loss/train': 0.7746480424539186}
12/28/2021 03:00:04 - INFO - codeparrot_training - Step 39188: {'lr': 5.587812350635152e-05, 'samples': 20064768, 'steps': 39188, 'batch_loss/train': 0.7565336632542312}
12/28/2021 03:00:16 - INFO - codeparrot_training - Step 39189: {'lr': 5.5868191026730405e-05, 'samples': 20065280, 'steps': 39189, 'batch_loss/train': 0.6934180217795074}
12/28/2021 03:00:27 - INFO - codeparrot_training - Step 39190: {'lr': 5.585825931890401e-05, 'samples': 20065792, 'steps': 39190, 'batch_loss/train': 0.7567882658913732}
12/28/2021 03:00:37 - INFO - codeparrot_training - Step 39191: {'lr': 5.5848328382911844e-05, 'samples': 20066304, 'steps': 39191, 'batch_loss/train': 0.7474022302776575}
12/28/2021 03:00:48 - INFO - codeparrot_training - Step 39192: {'lr': 5.583839821879344e-05, 'samples': 20066816, 'steps': 39192, 'batch_loss/train': 0.8671368565410376}
12/28/2021 03:01:00 - INFO - codeparrot_training - Step 39193: {'lr': 5.582846882658824e-05, 'samples': 20067328, 'steps': 39193, 'batch_loss/train': 0.799350316170603}
12/28/2021 03:01:11 - INFO - codeparrot_training - Step 39194: {'lr': 5.5818540206335725e-05, 'samples': 20067840, 'steps': 39194, 'batch_loss/train': 0.7486234353855252}
12/28/2021 03:01:21 - INFO - codeparrot_training - Step 39195: {'lr': 5.5808612358075385e-05, 'samples': 20068352, 'steps': 39195, 'batch_loss/train': 0.6865922480355948}
12/28/2021 03:01:33 - INFO - codeparrot_training - Step 39196: {'lr': 5.579868528184667e-05, 'samples': 20068864, 'steps': 39196, 'batch_loss/train': 0.7432649163529277}
12/28/2021 03:01:44 - INFO - codeparrot_training - Step 39197: {'lr': 5.578875897768906e-05, 'samples': 20069376, 'steps': 39197, 'batch_loss/train': 0.7629923298954964}
12/28/2021 03:01:54 - INFO - codeparrot_training - Step 39198: {'lr': 5.577883344564208e-05, 'samples': 20069888, 'steps': 39198, 'batch_loss/train': 0.6995450358372182}
12/28/2021 03:02:08 - INFO - codeparrot_training - Step 39199: {'lr': 5.5768908685744995e-05, 'samples': 20070400, 'steps': 39199, 'batch_loss/train': 0.7073991019278765}
12/28/2021 03:02:19 - INFO - codeparrot_training - Step 39200: {'lr': 5.575898469803745e-05, 'samples': 20070912, 'steps': 39200, 'batch_loss/train': 0.829474757425487}
12/28/2021 03:02:30 - INFO - codeparrot_training - Step 39201: {'lr': 5.5749061482558896e-05, 'samples': 20071424, 'steps': 39201, 'batch_loss/train': 0.6903760866262019}
12/28/2021 03:02:42 - INFO - codeparrot_training - Step 39202: {'lr': 5.573913903934869e-05, 'samples': 20071936, 'steps': 39202, 'batch_loss/train': 0.7384287554305047}
12/28/2021 03:02:52 - INFO - codeparrot_training - Step 39203: {'lr': 5.5729217368446215e-05, 'samples': 20072448, 'steps': 39203, 'batch_loss/train': 0.6782749886624515}
12/28/2021 03:03:03 - INFO - codeparrot_training - Step 39204: {'lr': 5.5719296469891185e-05, 'samples': 20072960, 'steps': 39204, 'batch_loss/train': 0.6864196147071198}
12/28/2021 03:03:14 - INFO - codeparrot_training - Step 39205: {'lr': 5.570937634372278e-05, 'samples': 20073472, 'steps': 39205, 'batch_loss/train': 0.7555100531317294}
12/28/2021 03:03:26 - INFO - codeparrot_training - Step 39206: {'lr': 5.5699456989980495e-05, 'samples': 20073984, 'steps': 39206, 'batch_loss/train': 0.7965959068387747}
12/28/2021 03:03:36 - INFO - codeparrot_training - Step 39207: {'lr': 5.5689538408703914e-05, 'samples': 20074496, 'steps': 39207, 'batch_loss/train': 0.7768170861527324}
12/28/2021 03:03:47 - INFO - codeparrot_training - Step 39208: {'lr': 5.5679620599932284e-05, 'samples': 20075008, 'steps': 39208, 'batch_loss/train': 0.6433663727948442}
12/28/2021 03:04:01 - INFO - codeparrot_training - Step 39209: {'lr': 5.5669703563705124e-05, 'samples': 20075520, 'steps': 39209, 'batch_loss/train': 0.7623039954341948}
12/28/2021 03:04:12 - INFO - codeparrot_training - Step 39210: {'lr': 5.565978730006185e-05, 'samples': 20076032, 'steps': 39210, 'batch_loss/train': 0.7175253860186785}
12/28/2021 03:04:22 - INFO - codeparrot_training - Step 39211: {'lr': 5.564987180904188e-05, 'samples': 20076544, 'steps': 39211, 'batch_loss/train': 0.8367907995707355}
12/28/2021 03:04:35 - INFO - codeparrot_training - Step 39212: {'lr': 5.563995709068465e-05, 'samples': 20077056, 'steps': 39212, 'batch_loss/train': 0.7077854431699961}
12/28/2021 03:04:45 - INFO - codeparrot_training - Step 39213: {'lr': 5.563004314502962e-05, 'samples': 20077568, 'steps': 39213, 'batch_loss/train': 0.8541156072169542}
12/28/2021 03:04:56 - INFO - codeparrot_training - Step 39214: {'lr': 5.562012997211599e-05, 'samples': 20078080, 'steps': 39214, 'batch_loss/train': 0.6983298109844327}
12/28/2021 03:05:06 - INFO - codeparrot_training - Step 39215: {'lr': 5.561021757198348e-05, 'samples': 20078592, 'steps': 39215, 'batch_loss/train': 0.8039443893358111}
12/28/2021 03:05:20 - INFO - codeparrot_training - Step 39216: {'lr': 5.5600305944671306e-05, 'samples': 20079104, 'steps': 39216, 'batch_loss/train': 0.6996858888305724}
12/28/2021 03:05:31 - INFO - codeparrot_training - Step 39217: {'lr': 5.5590395090218796e-05, 'samples': 20079616, 'steps': 39217, 'batch_loss/train': 0.7758933529257774}
12/28/2021 03:05:42 - INFO - codeparrot_training - Step 39218: {'lr': 5.5580485008665635e-05, 'samples': 20080128, 'steps': 39218, 'batch_loss/train': 0.7471621014992706}
12/28/2021 03:05:54 - INFO - codeparrot_training - Step 39219: {'lr': 5.557057570005098e-05, 'samples': 20080640, 'steps': 39219, 'batch_loss/train': 0.7119061393896118}
12/28/2021 03:06:05 - INFO - codeparrot_training - Step 39220: {'lr': 5.5560667164414294e-05, 'samples': 20081152, 'steps': 39220, 'batch_loss/train': 0.7556946561671793}
12/28/2021 03:06:15 - INFO - codeparrot_training - Step 39221: {'lr': 5.555075940179499e-05, 'samples': 20081664, 'steps': 39221, 'batch_loss/train': 0.7306016732472926}
12/28/2021 03:06:27 - INFO - codeparrot_training - Step 39222: {'lr': 5.554085241223245e-05, 'samples': 20082176, 'steps': 39222, 'batch_loss/train': 0.7649238770827651}
12/28/2021 03:06:38 - INFO - codeparrot_training - Step 39223: {'lr': 5.5530946195766044e-05, 'samples': 20082688, 'steps': 39223, 'batch_loss/train': 0.5017632628441788}
12/28/2021 03:06:49 - INFO - codeparrot_training - Step 39224: {'lr': 5.5521040752435246e-05, 'samples': 20083200, 'steps': 39224, 'batch_loss/train': 0.7575158448889852}
12/28/2021 03:06:59 - INFO - codeparrot_training - Step 39225: {'lr': 5.55111360822792e-05, 'samples': 20083712, 'steps': 39225, 'batch_loss/train': 0.739603815600276}
12/28/2021 03:07:12 - INFO - codeparrot_training - Step 39226: {'lr': 5.5501232185337535e-05, 'samples': 20084224, 'steps': 39226, 'batch_loss/train': 0.7185442727059126}
12/28/2021 03:07:23 - INFO - codeparrot_training - Step 39227: {'lr': 5.549132906164958e-05, 'samples': 20084736, 'steps': 39227, 'batch_loss/train': 0.702891044318676}
12/28/2021 03:07:33 - INFO - codeparrot_training - Step 39228: {'lr': 5.5481426711254614e-05, 'samples': 20085248, 'steps': 39228, 'batch_loss/train': 0.6256917355931364}
12/28/2021 03:07:46 - INFO - codeparrot_training - Step 39229: {'lr': 5.547152513419196e-05, 'samples': 20085760, 'steps': 39229, 'batch_loss/train': 0.7841019220650196}
12/28/2021 03:07:56 - INFO - codeparrot_training - Step 39230: {'lr': 5.54616243305012e-05, 'samples': 20086272, 'steps': 39230, 'batch_loss/train': 0.762813794775866}
12/28/2021 03:08:07 - INFO - codeparrot_training - Step 39231: {'lr': 5.5451724300221526e-05, 'samples': 20086784, 'steps': 39231, 'batch_loss/train': 0.6740454523824155}
12/28/2021 03:08:19 - INFO - codeparrot_training - Step 39232: {'lr': 5.5441825043392234e-05, 'samples': 20087296, 'steps': 39232, 'batch_loss/train': 0.7401770411524922}
12/28/2021 03:08:29 - INFO - codeparrot_training - Step 39233: {'lr': 5.5431926560052945e-05, 'samples': 20087808, 'steps': 39233, 'batch_loss/train': 0.7620372241362929}
12/28/2021 03:08:40 - INFO - codeparrot_training - Step 39234: {'lr': 5.542202885024278e-05, 'samples': 20088320, 'steps': 39234, 'batch_loss/train': 0.7038419148884714}
12/28/2021 03:08:51 - INFO - codeparrot_training - Step 39235: {'lr': 5.541213191400113e-05, 'samples': 20088832, 'steps': 39235, 'batch_loss/train': 0.6857101153582335}
12/28/2021 03:09:03 - INFO - codeparrot_training - Step 39236: {'lr': 5.5402235751367405e-05, 'samples': 20089344, 'steps': 39236, 'batch_loss/train': 0.8378484947606921}
12/28/2021 03:09:13 - INFO - codeparrot_training - Step 39237: {'lr': 5.5392340362380886e-05, 'samples': 20089856, 'steps': 39237, 'batch_loss/train': 0.752046367386356}
12/28/2021 03:09:24 - INFO - codeparrot_training - Step 39238: {'lr': 5.538244574708098e-05, 'samples': 20090368, 'steps': 39238, 'batch_loss/train': 0.8612436940893531}
12/28/2021 03:09:37 - INFO - codeparrot_training - Step 39239: {'lr': 5.5372551905507025e-05, 'samples': 20090880, 'steps': 39239, 'batch_loss/train': 0.7008555894717574}
12/28/2021 03:09:48 - INFO - codeparrot_training - Step 39240: {'lr': 5.536265883769817e-05, 'samples': 20091392, 'steps': 39240, 'batch_loss/train': 0.7511526802554727}
12/28/2021 03:09:58 - INFO - codeparrot_training - Step 39241: {'lr': 5.535276654369398e-05, 'samples': 20091904, 'steps': 39241, 'batch_loss/train': 0.9825930539518595}
12/28/2021 03:10:11 - INFO - codeparrot_training - Step 39242: {'lr': 5.5342875023533746e-05, 'samples': 20092416, 'steps': 39242, 'batch_loss/train': 0.6689064018428326}
12/28/2021 03:10:21 - INFO - codeparrot_training - Step 39243: {'lr': 5.533298427725661e-05, 'samples': 20092928, 'steps': 39243, 'batch_loss/train': 0.7115756273269653}
12/28/2021 03:10:32 - INFO - codeparrot_training - Step 39244: {'lr': 5.532309430490218e-05, 'samples': 20093440, 'steps': 39244, 'batch_loss/train': 1.2759842614468653}
12/28/2021 03:10:42 - INFO - codeparrot_training - Step 39245: {'lr': 5.5313205106509526e-05, 'samples': 20093952, 'steps': 39245, 'batch_loss/train': 0.6271579238818958}
12/28/2021 03:10:55 - INFO - codeparrot_training - Step 39246: {'lr': 5.530331668211808e-05, 'samples': 20094464, 'steps': 39246, 'batch_loss/train': 0.7027442641556263}
12/28/2021 03:11:06 - INFO - codeparrot_training - Step 39247: {'lr': 5.5293429031767125e-05, 'samples': 20094976, 'steps': 39247, 'batch_loss/train': 0.7730758134275675}
12/28/2021 03:11:16 - INFO - codeparrot_training - Step 39248: {'lr': 5.528354215549597e-05, 'samples': 20095488, 'steps': 39248, 'batch_loss/train': 0.8474413473159075}
12/28/2021 03:11:29 - INFO - codeparrot_training - Step 39249: {'lr': 5.527365605334392e-05, 'samples': 20096000, 'steps': 39249, 'batch_loss/train': 0.660292228567414}
12/28/2021 03:11:39 - INFO - codeparrot_training - Step 39250: {'lr': 5.52637707253503e-05, 'samples': 20096512, 'steps': 39250, 'batch_loss/train': 0.7222648758906871}
12/28/2021 03:11:50 - INFO - codeparrot_training - Step 39251: {'lr': 5.525388617155441e-05, 'samples': 20097024, 'steps': 39251, 'batch_loss/train': 0.746161880902946}
12/28/2021 03:12:02 - INFO - codeparrot_training - Step 39252: {'lr': 5.524400239199553e-05, 'samples': 20097536, 'steps': 39252, 'batch_loss/train': 0.755346660502255}
12/28/2021 03:12:13 - INFO - codeparrot_training - Step 39253: {'lr': 5.523411938671302e-05, 'samples': 20098048, 'steps': 39253, 'batch_loss/train': 0.8916336316615343}
12/28/2021 03:12:24 - INFO - codeparrot_training - Step 39254: {'lr': 5.522423715574598e-05, 'samples': 20098560, 'steps': 39254, 'batch_loss/train': 0.6954328084830195}
12/28/2021 03:12:36 - INFO - codeparrot_training - Step 39255: {'lr': 5.521435569913388e-05, 'samples': 20099072, 'steps': 39255, 'batch_loss/train': 0.7629971075803041}
12/28/2021 03:12:47 - INFO - codeparrot_training - Step 39256: {'lr': 5.520447501691603e-05, 'samples': 20099584, 'steps': 39256, 'batch_loss/train': 0.7692915709340014}
12/28/2021 03:12:58 - INFO - codeparrot_training - Step 39257: {'lr': 5.519459510913158e-05, 'samples': 20100096, 'steps': 39257, 'batch_loss/train': 0.8326587174087763}
12/28/2021 03:13:08 - INFO - codeparrot_training - Step 39258: {'lr': 5.518471597581978e-05, 'samples': 20100608, 'steps': 39258, 'batch_loss/train': 0.7293431838043034}
12/28/2021 03:13:21 - INFO - codeparrot_training - Step 39259: {'lr': 5.5174837617020134e-05, 'samples': 20101120, 'steps': 39259, 'batch_loss/train': 0.7504305806942284}
12/28/2021 03:13:31 - INFO - codeparrot_training - Step 39260: {'lr': 5.516496003277171e-05, 'samples': 20101632, 'steps': 39260, 'batch_loss/train': 0.7060175822116435}
12/28/2021 03:13:42 - INFO - codeparrot_training - Step 39261: {'lr': 5.515508322311383e-05, 'samples': 20102144, 'steps': 39261, 'batch_loss/train': 0.9126290725544095}
12/28/2021 03:13:54 - INFO - codeparrot_training - Step 39262: {'lr': 5.5145207188085774e-05, 'samples': 20102656, 'steps': 39262, 'batch_loss/train': 0.847923006862402}
12/28/2021 03:14:05 - INFO - codeparrot_training - Step 39263: {'lr': 5.513533192772679e-05, 'samples': 20103168, 'steps': 39263, 'batch_loss/train': 0.5230202204547822}
12/28/2021 03:14:16 - INFO - codeparrot_training - Step 39264: {'lr': 5.5125457442076146e-05, 'samples': 20103680, 'steps': 39264, 'batch_loss/train': 0.7849086017813534}
12/28/2021 03:14:28 - INFO - codeparrot_training - Step 39265: {'lr': 5.511558373117318e-05, 'samples': 20104192, 'steps': 39265, 'batch_loss/train': 0.7251571016386151}
12/28/2021 03:14:39 - INFO - codeparrot_training - Step 39266: {'lr': 5.510571079505694e-05, 'samples': 20104704, 'steps': 39266, 'batch_loss/train': 0.7654148442670703}
12/28/2021 03:14:50 - INFO - codeparrot_training - Step 39267: {'lr': 5.5095838633766855e-05, 'samples': 20105216, 'steps': 39267, 'batch_loss/train': 0.7496810406446457}
12/28/2021 03:15:00 - INFO - codeparrot_training - Step 39268: {'lr': 5.508596724734219e-05, 'samples': 20105728, 'steps': 39268, 'batch_loss/train': 0.6779345800168812}
12/28/2021 03:15:12 - INFO - codeparrot_training - Step 39269: {'lr': 5.5076096635822e-05, 'samples': 20106240, 'steps': 39269, 'batch_loss/train': 0.7833258796017617}
12/28/2021 03:15:23 - INFO - codeparrot_training - Step 39270: {'lr': 5.506622679924578e-05, 'samples': 20106752, 'steps': 39270, 'batch_loss/train': 0.7677215384319425}
12/28/2021 03:15:33 - INFO - codeparrot_training - Step 39271: {'lr': 5.505635773765255e-05, 'samples': 20107264, 'steps': 39271, 'batch_loss/train': 0.7346278210170567}
12/28/2021 03:15:46 - INFO - codeparrot_training - Step 39272: {'lr': 5.504648945108165e-05, 'samples': 20107776, 'steps': 39272, 'batch_loss/train': 0.7360728038474917}
12/28/2021 03:15:57 - INFO - codeparrot_training - Step 39273: {'lr': 5.503662193957232e-05, 'samples': 20108288, 'steps': 39273, 'batch_loss/train': 0.6914215761935338}
12/28/2021 03:16:07 - INFO - codeparrot_training - Step 39274: {'lr': 5.502675520316372e-05, 'samples': 20108800, 'steps': 39274, 'batch_loss/train': 0.6230258294381201}
12/28/2021 03:16:20 - INFO - codeparrot_training - Step 39275: {'lr': 5.5016889241895156e-05, 'samples': 20109312, 'steps': 39275, 'batch_loss/train': 0.6315879989415407}
12/28/2021 03:16:31 - INFO - codeparrot_training - Step 39276: {'lr': 5.500702405580582e-05, 'samples': 20109824, 'steps': 39276, 'batch_loss/train': 0.5189997056731954}
12/28/2021 03:16:41 - INFO - codeparrot_training - Step 39277: {'lr': 5.49971596449349e-05, 'samples': 20110336, 'steps': 39277, 'batch_loss/train': 0.7872863374650478}
12/28/2021 03:16:52 - INFO - codeparrot_training - Step 39278: {'lr': 5.4987296009321676e-05, 'samples': 20110848, 'steps': 39278, 'batch_loss/train': 0.651611088309437}
12/28/2021 03:17:05 - INFO - codeparrot_training - Step 39279: {'lr': 5.4977433149005384e-05, 'samples': 20111360, 'steps': 39279, 'batch_loss/train': 0.7116173021495342}
12/28/2021 03:17:15 - INFO - codeparrot_training - Step 39280: {'lr': 5.496757106402503e-05, 'samples': 20111872, 'steps': 39280, 'batch_loss/train': 1.095536969602108}
12/28/2021 03:17:26 - INFO - codeparrot_training - Step 39281: {'lr': 5.495770975442005e-05, 'samples': 20112384, 'steps': 39281, 'batch_loss/train': 0.8595628961920738}
12/28/2021 03:17:38 - INFO - codeparrot_training - Step 39282: {'lr': 5.4947849220229634e-05, 'samples': 20112896, 'steps': 39282, 'batch_loss/train': 0.784038903657347}
12/28/2021 03:17:49 - INFO - codeparrot_training - Step 39283: {'lr': 5.493798946149287e-05, 'samples': 20113408, 'steps': 39283, 'batch_loss/train': 0.9069003965705633}
12/28/2021 03:17:59 - INFO - codeparrot_training - Step 39284: {'lr': 5.492813047824893e-05, 'samples': 20113920, 'steps': 39284, 'batch_loss/train': 0.7228540587238967}
12/28/2021 03:18:10 - INFO - codeparrot_training - Step 39285: {'lr': 5.491827227053722e-05, 'samples': 20114432, 'steps': 39285, 'batch_loss/train': 0.7365232175216079}
12/28/2021 03:18:23 - INFO - codeparrot_training - Step 39286: {'lr': 5.490841483839673e-05, 'samples': 20114944, 'steps': 39286, 'batch_loss/train': 0.6454261259641498}
12/28/2021 03:18:33 - INFO - codeparrot_training - Step 39287: {'lr': 5.489855818186673e-05, 'samples': 20115456, 'steps': 39287, 'batch_loss/train': 0.7790150241926312}
12/28/2021 03:18:44 - INFO - codeparrot_training - Step 39288: {'lr': 5.488870230098639e-05, 'samples': 20115968, 'steps': 39288, 'batch_loss/train': 0.6397516429424286}
12/28/2021 03:18:56 - INFO - codeparrot_training - Step 39289: {'lr': 5.487884719579489e-05, 'samples': 20116480, 'steps': 39289, 'batch_loss/train': 0.7827336424961686}
12/28/2021 03:19:07 - INFO - codeparrot_training - Step 39290: {'lr': 5.486899286633143e-05, 'samples': 20116992, 'steps': 39290, 'batch_loss/train': 0.6821238123811781}
12/28/2021 03:19:17 - INFO - codeparrot_training - Step 39291: {'lr': 5.485913931263517e-05, 'samples': 20117504, 'steps': 39291, 'batch_loss/train': 0.6521650210488588}
12/28/2021 03:19:30 - INFO - codeparrot_training - Step 39292: {'lr': 5.48492865347453e-05, 'samples': 20118016, 'steps': 39292, 'batch_loss/train': 0.7070273926947266}
12/28/2021 03:19:41 - INFO - codeparrot_training - Step 39293: {'lr': 5.483943453270096e-05, 'samples': 20118528, 'steps': 39293, 'batch_loss/train': 0.7033117841929197}
12/28/2021 03:19:51 - INFO - codeparrot_training - Step 39294: {'lr': 5.482958330654142e-05, 'samples': 20119040, 'steps': 39294, 'batch_loss/train': 0.7219500308856368}
12/28/2021 03:20:02 - INFO - codeparrot_training - Step 39295: {'lr': 5.481973285630562e-05, 'samples': 20119552, 'steps': 39295, 'batch_loss/train': 0.6957695400342345}
12/28/2021 03:20:14 - INFO - codeparrot_training - Step 39296: {'lr': 5.480988318203295e-05, 'samples': 20120064, 'steps': 39296, 'batch_loss/train': 0.6784625570289791}
12/28/2021 03:20:24 - INFO - codeparrot_training - Step 39297: {'lr': 5.480003428376254e-05, 'samples': 20120576, 'steps': 39297, 'batch_loss/train': 0.723709539975971}
12/28/2021 03:20:35 - INFO - codeparrot_training - Step 39298: {'lr': 5.4790186161533416e-05, 'samples': 20121088, 'steps': 39298, 'batch_loss/train': 0.8584524672478437}
12/28/2021 03:20:47 - INFO - codeparrot_training - Step 39299: {'lr': 5.47803388153848e-05, 'samples': 20121600, 'steps': 39299, 'batch_loss/train': 0.8395007462240756}
12/28/2021 03:20:58 - INFO - codeparrot_training - Step 39300: {'lr': 5.477049224535585e-05, 'samples': 20122112, 'steps': 39300, 'batch_loss/train': 0.7754421019926667}
12/28/2021 03:21:08 - INFO - codeparrot_training - Step 39301: {'lr': 5.476064645148571e-05, 'samples': 20122624, 'steps': 39301, 'batch_loss/train': 0.6521977399243042}
12/28/2021 03:21:21 - INFO - codeparrot_training - Step 39302: {'lr': 5.475080143381353e-05, 'samples': 20123136, 'steps': 39302, 'batch_loss/train': 0.6962668932974339}
12/28/2021 03:21:32 - INFO - codeparrot_training - Step 39303: {'lr': 5.474095719237843e-05, 'samples': 20123648, 'steps': 39303, 'batch_loss/train': 0.6798448404297233}
12/28/2021 03:21:42 - INFO - codeparrot_training - Step 39304: {'lr': 5.473111372721956e-05, 'samples': 20124160, 'steps': 39304, 'batch_loss/train': 0.7251331657171249}
12/28/2021 03:21:53 - INFO - codeparrot_training - Step 39305: {'lr': 5.4721271038376096e-05, 'samples': 20124672, 'steps': 39305, 'batch_loss/train': 0.6981123983860016}
12/28/2021 03:22:05 - INFO - codeparrot_training - Step 39306: {'lr': 5.471142912588703e-05, 'samples': 20125184, 'steps': 39306, 'batch_loss/train': 0.6940828757360578}
12/28/2021 03:22:16 - INFO - codeparrot_training - Step 39307: {'lr': 5.4701587989791605e-05, 'samples': 20125696, 'steps': 39307, 'batch_loss/train': 0.7008584425784647}
12/28/2021 03:22:26 - INFO - codeparrot_training - Step 39308: {'lr': 5.4691747630129045e-05, 'samples': 20126208, 'steps': 39308, 'batch_loss/train': 0.8377436078153551}
12/28/2021 03:22:39 - INFO - codeparrot_training - Step 39309: {'lr': 5.4681908046938233e-05, 'samples': 20126720, 'steps': 39309, 'batch_loss/train': 0.5978302725125104}
12/28/2021 03:22:49 - INFO - codeparrot_training - Step 39310: {'lr': 5.467206924025836e-05, 'samples': 20127232, 'steps': 39310, 'batch_loss/train': 0.8154808850958943}
12/28/2021 03:23:00 - INFO - codeparrot_training - Step 39311: {'lr': 5.4662231210128726e-05, 'samples': 20127744, 'steps': 39311, 'batch_loss/train': 0.7616565343923867}
12/28/2021 03:23:12 - INFO - codeparrot_training - Step 39312: {'lr': 5.465239395658825e-05, 'samples': 20128256, 'steps': 39312, 'batch_loss/train': 0.7957565168617293}
12/28/2021 03:23:23 - INFO - codeparrot_training - Step 39313: {'lr': 5.464255747967606e-05, 'samples': 20128768, 'steps': 39313, 'batch_loss/train': 0.7691554138436913}
12/28/2021 03:23:34 - INFO - codeparrot_training - Step 39314: {'lr': 5.463272177943132e-05, 'samples': 20129280, 'steps': 39314, 'batch_loss/train': 0.7531825415790081}
12/28/2021 03:23:44 - INFO - codeparrot_training - Step 39315: {'lr': 5.462288685589312e-05, 'samples': 20129792, 'steps': 39315, 'batch_loss/train': 0.5904778777621686}
12/28/2021 03:23:57 - INFO - codeparrot_training - Step 39316: {'lr': 5.4613052709100526e-05, 'samples': 20130304, 'steps': 39316, 'batch_loss/train': 0.740202916553244}
12/28/2021 03:24:08 - INFO - codeparrot_training - Step 39317: {'lr': 5.460321933909265e-05, 'samples': 20130816, 'steps': 39317, 'batch_loss/train': 0.7597636259160936}
12/28/2021 03:24:18 - INFO - codeparrot_training - Step 39318: {'lr': 5.459338674590864e-05, 'samples': 20131328, 'steps': 39318, 'batch_loss/train': 0.8290556268766522}
12/28/2021 03:24:30 - INFO - codeparrot_training - Step 39319: {'lr': 5.45835549295875e-05, 'samples': 20131840, 'steps': 39319, 'batch_loss/train': 0.739656075835228}
12/28/2021 03:24:41 - INFO - codeparrot_training - Step 39320: {'lr': 5.457372389016846e-05, 'samples': 20132352, 'steps': 39320, 'batch_loss/train': 0.6986919195624068}
12/28/2021 03:24:52 - INFO - codeparrot_training - Step 39321: {'lr': 5.4563893627690346e-05, 'samples': 20132864, 'steps': 39321, 'batch_loss/train': 0.795779419131577}
12/28/2021 03:25:04 - INFO - codeparrot_training - Step 39322: {'lr': 5.4554064142192466e-05, 'samples': 20133376, 'steps': 39322, 'batch_loss/train': 0.7519307159818709}
12/28/2021 03:25:15 - INFO - codeparrot_training - Step 39323: {'lr': 5.45442354337139e-05, 'samples': 20133888, 'steps': 39323, 'batch_loss/train': 0.7373403459787369}
12/28/2021 03:25:26 - INFO - codeparrot_training - Step 39324: {'lr': 5.453440750229358e-05, 'samples': 20134400, 'steps': 39324, 'batch_loss/train': 0.709995049983263}
12/28/2021 03:25:36 - INFO - codeparrot_training - Step 39325: {'lr': 5.452458034797067e-05, 'samples': 20134912, 'steps': 39325, 'batch_loss/train': 0.6698522182414308}
12/28/2021 03:25:48 - INFO - codeparrot_training - Step 39326: {'lr': 5.45147539707842e-05, 'samples': 20135424, 'steps': 39326, 'batch_loss/train': 0.6979041143786162}
12/28/2021 03:25:59 - INFO - codeparrot_training - Step 39327: {'lr': 5.4504928370773254e-05, 'samples': 20135936, 'steps': 39327, 'batch_loss/train': 0.7409681244753301}
12/28/2021 03:26:10 - INFO - codeparrot_training - Step 39328: {'lr': 5.449510354797691e-05, 'samples': 20136448, 'steps': 39328, 'batch_loss/train': 0.862577274441719}
12/28/2021 03:26:22 - INFO - codeparrot_training - Step 39329: {'lr': 5.44852795024342e-05, 'samples': 20136960, 'steps': 39329, 'batch_loss/train': 0.7847136841155589}
12/28/2021 03:26:32
- INFO - codeparrot_training - Step 39330: {'lr': 5.44754562341842e-05, 'samples': 20137472, 'steps': 39330, 'batch_loss/train': 0.8183312087785453} 12/28/2021 03:26:43 - INFO - codeparrot_training - Step 39331: {'lr': 5.446563374326596e-05, 'samples': 20137984, 'steps': 39331, 'batch_loss/train': 0.7529418069170788} 12/28/2021 03:26:56 - INFO - codeparrot_training - Step 39332: {'lr': 5.445581202971855e-05, 'samples': 20138496, 'steps': 39332, 'batch_loss/train': 0.790756743401289} 12/28/2021 03:27:06 - INFO - codeparrot_training - Step 39333: {'lr': 5.4445991093580975e-05, 'samples': 20139008, 'steps': 39333, 'batch_loss/train': 0.8084138052072376} 12/28/2021 03:27:17 - INFO - codeparrot_training - Step 39334: {'lr': 5.4436170934892365e-05, 'samples': 20139520, 'steps': 39334, 'batch_loss/train': 0.6368199265561998} 12/28/2021 03:27:29 - INFO - codeparrot_training - Step 39335: {'lr': 5.442635155369166e-05, 'samples': 20140032, 'steps': 39335, 'batch_loss/train': 0.6565889311023057} 12/28/2021 03:27:40 - INFO - codeparrot_training - Step 39336: {'lr': 5.441653295001783e-05, 'samples': 20140544, 'steps': 39336, 'batch_loss/train': 0.8225039590615779} 12/28/2021 03:27:50 - INFO - codeparrot_training - Step 39337: {'lr': 5.440671512391018e-05, 'samples': 20141056, 'steps': 39337, 'batch_loss/train': 0.7199473921209574} 12/28/2021 03:28:01 - INFO - codeparrot_training - Step 39338: {'lr': 5.439689807540748e-05, 'samples': 20141568, 'steps': 39338, 'batch_loss/train': 0.7278546569868922} 12/28/2021 03:28:13 - INFO - codeparrot_training - Step 39339: {'lr': 5.438708180454888e-05, 'samples': 20142080, 'steps': 39339, 'batch_loss/train': 0.9168185824528337} 12/28/2021 03:28:24 - INFO - codeparrot_training - Step 39340: {'lr': 5.437726631137338e-05, 'samples': 20142592, 'steps': 39340, 'batch_loss/train': 0.7477435655891895} 12/28/2021 03:28:34 - INFO - codeparrot_training - Step 39341: {'lr': 5.436745159591999e-05, 'samples': 20143104, 'steps': 39341, 'batch_loss/train': 
0.580434930510819} 12/28/2021 03:28:47 - INFO - codeparrot_training - Step 39342: {'lr': 5.435763765822776e-05, 'samples': 20143616, 'steps': 39342, 'batch_loss/train': 0.6251293580280617} 12/28/2021 03:28:58 - INFO - codeparrot_training - Step 39343: {'lr': 5.4347824498335694e-05, 'samples': 20144128, 'steps': 39343, 'batch_loss/train': 0.7158258492127061} 12/28/2021 03:29:09 - INFO - codeparrot_training - Step 39344: {'lr': 5.4338012116282784e-05, 'samples': 20144640, 'steps': 39344, 'batch_loss/train': 0.7156038420507684} 12/28/2021 03:29:21 - INFO - codeparrot_training - Step 39345: {'lr': 5.4328200512108076e-05, 'samples': 20145152, 'steps': 39345, 'batch_loss/train': 0.7983291166601703} 12/28/2021 03:29:31 - INFO - codeparrot_training - Step 39346: {'lr': 5.431838968585062e-05, 'samples': 20145664, 'steps': 39346, 'batch_loss/train': 0.6625806582160294} 12/28/2021 03:29:42 - INFO - codeparrot_training - Step 39347: {'lr': 5.430857963754923e-05, 'samples': 20146176, 'steps': 39347, 'batch_loss/train': 0.7165312321740203} 12/28/2021 03:29:53 - INFO - codeparrot_training - Step 39348: {'lr': 5.4298770367243095e-05, 'samples': 20146688, 'steps': 39348, 'batch_loss/train': 1.660647178068757} 12/28/2021 03:30:05 - INFO - codeparrot_training - Step 39349: {'lr': 5.428896187497123e-05, 'samples': 20147200, 'steps': 39349, 'batch_loss/train': 0.7617213084595278} 12/28/2021 03:30:15 - INFO - codeparrot_training - Step 39350: {'lr': 5.427915416077248e-05, 'samples': 20147712, 'steps': 39350, 'batch_loss/train': 0.728268408216536} 12/28/2021 03:30:26 - INFO - codeparrot_training - Step 39351: {'lr': 5.4269347224685905e-05, 'samples': 20148224, 'steps': 39351, 'batch_loss/train': 0.7281312271370552} 12/28/2021 03:30:39 - INFO - codeparrot_training - Step 39352: {'lr': 5.4259541066750516e-05, 'samples': 20148736, 'steps': 39352, 'batch_loss/train': 0.9394921418279409} 12/28/2021 03:30:50 - INFO - codeparrot_training - Step 39353: {'lr': 5.424973568700528e-05, 'samples': 
20149248, 'steps': 39353, 'batch_loss/train': 0.758407891029492} 12/28/2021 03:31:00 - INFO - codeparrot_training - Step 39354: {'lr': 5.423993108548919e-05, 'samples': 20149760, 'steps': 39354, 'batch_loss/train': 0.7141364044509828} 12/28/2021 03:31:12 - INFO - codeparrot_training - Step 39355: {'lr': 5.423012726224119e-05, 'samples': 20150272, 'steps': 39355, 'batch_loss/train': 0.8177954736165702} 12/28/2021 03:31:23 - INFO - codeparrot_training - Step 39356: {'lr': 5.422032421730033e-05, 'samples': 20150784, 'steps': 39356, 'batch_loss/train': 0.7170545170083642} 12/28/2021 03:31:34 - INFO - codeparrot_training - Step 39357: {'lr': 5.42105219507055e-05, 'samples': 20151296, 'steps': 39357, 'batch_loss/train': 0.7615496544167399} 12/28/2021 03:31:44 - INFO - codeparrot_training - Step 39358: {'lr': 5.4200720462495716e-05, 'samples': 20151808, 'steps': 39358, 'batch_loss/train': 0.7820789879187942} 12/28/2021 03:31:56 - INFO - codeparrot_training - Step 39359: {'lr': 5.419091975270993e-05, 'samples': 20152320, 'steps': 39359, 'batch_loss/train': 0.591106356587261} 12/28/2021 03:32:07 - INFO - codeparrot_training - Step 39360: {'lr': 5.418111982138721e-05, 'samples': 20152832, 'steps': 39360, 'batch_loss/train': 0.6852802275680006} 12/28/2021 03:32:18 - INFO - codeparrot_training - Step 39361: {'lr': 5.417132066856631e-05, 'samples': 20153344, 'steps': 39361, 'batch_loss/train': 0.7910247007384896} 12/28/2021 03:32:31 - INFO - codeparrot_training - Step 39362: {'lr': 5.4161522294286256e-05, 'samples': 20153856, 'steps': 39362, 'batch_loss/train': 0.7804459021426737} 12/28/2021 03:32:42 - INFO - codeparrot_training - Step 39363: {'lr': 5.415172469858617e-05, 'samples': 20154368, 'steps': 39363, 'batch_loss/train': 0.6767643947387114} 12/28/2021 03:32:52 - INFO - codeparrot_training - Step 39364: {'lr': 5.414192788150482e-05, 'samples': 20154880, 'steps': 39364, 'batch_loss/train': 0.8365609222091734} 12/28/2021 03:33:03 - INFO - codeparrot_training - Step 39365: 
{'lr': 5.413213184308113e-05, 'samples': 20155392, 'steps': 39365, 'batch_loss/train': 0.6511794272810221} 12/28/2021 03:33:15 - INFO - codeparrot_training - Step 39366: {'lr': 5.4122336583354275e-05, 'samples': 20155904, 'steps': 39366, 'batch_loss/train': 0.6701450924156234} 12/28/2021 03:33:25 - INFO - codeparrot_training - Step 39367: {'lr': 5.4112542102362965e-05, 'samples': 20156416, 'steps': 39367, 'batch_loss/train': 0.7877148590050638} 12/28/2021 03:33:36 - INFO - codeparrot_training - Step 39368: {'lr': 5.410274840014623e-05, 'samples': 20156928, 'steps': 39368, 'batch_loss/train': 0.8056942410767078} 12/28/2021 03:33:48 - INFO - codeparrot_training - Step 39369: {'lr': 5.4092955476743e-05, 'samples': 20157440, 'steps': 39369, 'batch_loss/train': 0.6912134727463126} 12/28/2021 03:33:59 - INFO - codeparrot_training - Step 39370: {'lr': 5.408316333219224e-05, 'samples': 20157952, 'steps': 39370, 'batch_loss/train': 0.7021306874230504} 12/28/2021 03:34:09 - INFO - codeparrot_training - Step 39371: {'lr': 5.4073371966532806e-05, 'samples': 20158464, 'steps': 39371, 'batch_loss/train': 0.765987410210073} 12/28/2021 03:34:22 - INFO - codeparrot_training - Step 39372: {'lr': 5.406358137980374e-05, 'samples': 20158976, 'steps': 39372, 'batch_loss/train': 0.8496917695738375} 12/28/2021 03:34:33 - INFO - codeparrot_training - Step 39373: {'lr': 5.4053791572043765e-05, 'samples': 20159488, 'steps': 39373, 'batch_loss/train': 0.723912678193301} 12/28/2021 03:34:43 - INFO - codeparrot_training - Step 39374: {'lr': 5.4044002543292006e-05, 'samples': 20160000, 'steps': 39374, 'batch_loss/train': 0.7449505040422082} 12/28/2021 03:34:55 - INFO - codeparrot_training - Step 39375: {'lr': 5.403421429358735e-05, 'samples': 20160512, 'steps': 39375, 'batch_loss/train': 0.7325804140418768} 12/28/2021 03:35:06 - INFO - codeparrot_training - Step 39376: {'lr': 5.40244268229686e-05, 'samples': 20161024, 'steps': 39376, 'batch_loss/train': 0.6922009016270749} 12/28/2021 03:35:17 - 
INFO - codeparrot_training - Step 39377: {'lr': 5.401464013147467e-05, 'samples': 20161536, 'steps': 39377, 'batch_loss/train': 0.7928176121786237} 12/28/2021 03:35:27 - INFO - codeparrot_training - Step 39378: {'lr': 5.4004854219144665e-05, 'samples': 20162048, 'steps': 39378, 'batch_loss/train': 0.6444073719903827} 12/28/2021 03:35:40 - INFO - codeparrot_training - Step 39379: {'lr': 5.399506908601728e-05, 'samples': 20162560, 'steps': 39379, 'batch_loss/train': 0.7647204040549695} 12/28/2021 03:35:51 - INFO - codeparrot_training - Step 39380: {'lr': 5.398528473213149e-05, 'samples': 20163072, 'steps': 39380, 'batch_loss/train': 0.865391593426466} 12/28/2021 03:36:02 - INFO - codeparrot_training - Step 39381: {'lr': 5.3975501157526195e-05, 'samples': 20163584, 'steps': 39381, 'batch_loss/train': 0.7122541816788726} 12/28/2021 03:36:14 - INFO - codeparrot_training - Step 39382: {'lr': 5.396571836224032e-05, 'samples': 20164096, 'steps': 39382, 'batch_loss/train': 0.8927709013223648} 12/28/2021 03:36:24 - INFO - codeparrot_training - Step 39383: {'lr': 5.395593634631268e-05, 'samples': 20164608, 'steps': 39383, 'batch_loss/train': 0.7774208765476942} 12/28/2021 03:36:35 - INFO - codeparrot_training - Step 39384: {'lr': 5.3946155109782265e-05, 'samples': 20165120, 'steps': 39384, 'batch_loss/train': 0.6154045120347291} 12/28/2021 03:36:47 - INFO - codeparrot_training - Step 39385: {'lr': 5.393637465268789e-05, 'samples': 20165632, 'steps': 39385, 'batch_loss/train': 0.6547950087115169} 12/28/2021 03:36:58 - INFO - codeparrot_training - Step 39386: {'lr': 5.392659497506844e-05, 'samples': 20166144, 'steps': 39386, 'batch_loss/train': 0.8555777240544558} 12/28/2021 03:37:08 - INFO - codeparrot_training - Step 39387: {'lr': 5.3916816076962886e-05, 'samples': 20166656, 'steps': 39387, 'batch_loss/train': 0.7775980444857851} 12/28/2021 03:37:19 - INFO - codeparrot_training - Step 39388: {'lr': 5.3907037958409925e-05, 'samples': 20167168, 'steps': 39388, 
'batch_loss/train': 0.7323806337080896} 12/28/2021 03:37:31 - INFO - codeparrot_training - Step 39389: {'lr': 5.3897260619448666e-05, 'samples': 20167680, 'steps': 39389, 'batch_loss/train': 0.7204460557550192} 12/28/2021 03:37:42 - INFO - codeparrot_training - Step 39390: {'lr': 5.388748406011776e-05, 'samples': 20168192, 'steps': 39390, 'batch_loss/train': 0.7372444663196802} 12/28/2021 03:37:52 - INFO - codeparrot_training - Step 39391: {'lr': 5.387770828045613e-05, 'samples': 20168704, 'steps': 39391, 'batch_loss/train': 0.6702423873357475} 12/28/2021 03:38:05 - INFO - codeparrot_training - Step 39392: {'lr': 5.38679332805028e-05, 'samples': 20169216, 'steps': 39392, 'batch_loss/train': 0.8153108800761402} 12/28/2021 03:38:16 - INFO - codeparrot_training - Step 39393: {'lr': 5.3858159060296427e-05, 'samples': 20169728, 'steps': 39393, 'batch_loss/train': 0.7746766554191709} 12/28/2021 03:38:26 - INFO - codeparrot_training - Step 39394: {'lr': 5.384838561987599e-05, 'samples': 20170240, 'steps': 39394, 'batch_loss/train': 0.7054853980662301} 12/28/2021 03:38:39 - INFO - codeparrot_training - Step 39395: {'lr': 5.383861295928027e-05, 'samples': 20170752, 'steps': 39395, 'batch_loss/train': 0.7113927068421617} 12/28/2021 03:38:49 - INFO - codeparrot_training - Step 39396: {'lr': 5.3828841078548185e-05, 'samples': 20171264, 'steps': 39396, 'batch_loss/train': 0.7225595540367067} 12/28/2021 03:39:00 - INFO - codeparrot_training - Step 39397: {'lr': 5.381906997771855e-05, 'samples': 20171776, 'steps': 39397, 'batch_loss/train': 0.6008086476940662} 12/28/2021 03:39:12 - INFO - codeparrot_training - Step 39398: {'lr': 5.3809299656830293e-05, 'samples': 20172288, 'steps': 39398, 'batch_loss/train': 0.6633574180305004} 12/28/2021 03:39:23 - INFO - codeparrot_training - Step 39399: {'lr': 5.3799530115922037e-05, 'samples': 20172800, 'steps': 39399, 'batch_loss/train': 0.7911602072417736} 12/28/2021 03:39:33 - INFO - codeparrot_training - Step 39400: {'lr': 
5.378976135503283e-05, 'samples': 20173312, 'steps': 39400, 'batch_loss/train': 0.7631060783751309} 12/28/2021 03:39:44 - INFO - codeparrot_training - Step 39401: {'lr': 5.37799933742015e-05, 'samples': 20173824, 'steps': 39401, 'batch_loss/train': 0.6705792983993888} 12/28/2021 03:39:57 - INFO - codeparrot_training - Step 39402: {'lr': 5.3770226173466717e-05, 'samples': 20174336, 'steps': 39402, 'batch_loss/train': 0.8356925444677472} 12/28/2021 03:40:07 - INFO - codeparrot_training - Step 39403: {'lr': 5.376045975286748e-05, 'samples': 20174848, 'steps': 39403, 'batch_loss/train': 0.813334334641695} 12/28/2021 03:40:18 - INFO - codeparrot_training - Step 39404: {'lr': 5.3750694112442625e-05, 'samples': 20175360, 'steps': 39404, 'batch_loss/train': 0.7842542789876461} 12/28/2021 03:40:30 - INFO - codeparrot_training - Step 39405: {'lr': 5.374092925223084e-05, 'samples': 20175872, 'steps': 39405, 'batch_loss/train': 0.8909340412355959} 12/28/2021 03:40:41 - INFO - codeparrot_training - Step 39406: {'lr': 5.373116517227103e-05, 'samples': 20176384, 'steps': 39406, 'batch_loss/train': 0.6795357305090874} 12/28/2021 03:40:51 - INFO - codeparrot_training - Step 39407: {'lr': 5.372140187260199e-05, 'samples': 20176896, 'steps': 39407, 'batch_loss/train': 0.9007132411934435} 12/28/2021 03:41:04 - INFO - codeparrot_training - Step 39408: {'lr': 5.3711639353262534e-05, 'samples': 20177408, 'steps': 39408, 'batch_loss/train': 0.7600457733497024} 12/28/2021 03:41:15 - INFO - codeparrot_training - Step 39409: {'lr': 5.370187761429149e-05, 'samples': 20177920, 'steps': 39409, 'batch_loss/train': 0.7187704727984965} 12/28/2021 03:41:25 - INFO - codeparrot_training - Step 39410: {'lr': 5.369211665572768e-05, 'samples': 20178432, 'steps': 39410, 'batch_loss/train': 0.7440506359562278} 12/28/2021 03:41:36 - INFO - codeparrot_training - Step 39411: {'lr': 5.3682356477609853e-05, 'samples': 20178944, 'steps': 39411, 'batch_loss/train': 0.7333939950913191} 12/28/2021 03:41:48 - INFO 
- codeparrot_training - Step 39412: {'lr': 5.3672597079976875e-05, 'samples': 20179456, 'steps': 39412, 'batch_loss/train': 0.6842220886610448} 12/28/2021 03:41:59 - INFO - codeparrot_training - Step 39413: {'lr': 5.36628384628676e-05, 'samples': 20179968, 'steps': 39413, 'batch_loss/train': 0.555921696126461} 12/28/2021 03:42:10 - INFO - codeparrot_training - Step 39414: {'lr': 5.36530806263206e-05, 'samples': 20180480, 'steps': 39414, 'batch_loss/train': 0.7547425997909158} 12/28/2021 03:42:22 - INFO - codeparrot_training - Step 39415: {'lr': 5.3643323570374936e-05, 'samples': 20180992, 'steps': 39415, 'batch_loss/train': 0.6621133610606194} 12/28/2021 03:42:33 - INFO - codeparrot_training - Step 39416: {'lr': 5.363356729506921e-05, 'samples': 20181504, 'steps': 39416, 'batch_loss/train': 0.7693030573427677} 12/28/2021 03:42:43 - INFO - codeparrot_training - Step 39417: {'lr': 5.362381180044221e-05, 'samples': 20182016, 'steps': 39417, 'batch_loss/train': 0.8069823402911425} 12/28/2021 03:42:56 - INFO - codeparrot_training - Step 39418: {'lr': 5.361405708653294e-05, 'samples': 20182528, 'steps': 39418, 'batch_loss/train': 0.7614073029253632} 12/28/2021 03:43:07 - INFO - codeparrot_training - Step 39419: {'lr': 5.360430315337994e-05, 'samples': 20183040, 'steps': 39419, 'batch_loss/train': 0.7453854456543922} 12/28/2021 03:43:17 - INFO - codeparrot_training - Step 39420: {'lr': 5.359455000102209e-05, 'samples': 20183552, 'steps': 39420, 'batch_loss/train': 0.7062373943626881} 12/28/2021 03:43:28 - INFO - codeparrot_training - Step 39421: {'lr': 5.3584797629498136e-05, 'samples': 20184064, 'steps': 39421, 'batch_loss/train': 0.7360385390929878} 12/28/2021 03:43:40 - INFO - codeparrot_training - Step 39422: {'lr': 5.3575046038846856e-05, 'samples': 20184576, 'steps': 39422, 'batch_loss/train': 0.7581028379499912} 12/28/2021 03:43:51 - INFO - codeparrot_training - Step 39423: {'lr': 5.356529522910702e-05, 'samples': 20185088, 'steps': 39423, 'batch_loss/train': 
0.7054392085410655} 12/28/2021 03:44:01 - INFO - codeparrot_training - Step 39424: {'lr': 5.3555545200317494e-05, 'samples': 20185600, 'steps': 39424, 'batch_loss/train': 0.801244561560452} 12/28/2021 03:44:13 - INFO - codeparrot_training - Step 39425: {'lr': 5.35457959525168e-05, 'samples': 20186112, 'steps': 39425, 'batch_loss/train': 0.7290278421714902} 12/28/2021 03:44:24 - INFO - codeparrot_training - Step 39426: {'lr': 5.3536047485743924e-05, 'samples': 20186624, 'steps': 39426, 'batch_loss/train': 0.7292734319344163} 12/28/2021 03:44:34 - INFO - codeparrot_training - Step 39427: {'lr': 5.3526299800037594e-05, 'samples': 20187136, 'steps': 39427, 'batch_loss/train': 0.7371703959070146} 12/28/2021 03:44:46 - INFO - codeparrot_training - Step 39428: {'lr': 5.351655289543636e-05, 'samples': 20187648, 'steps': 39428, 'batch_loss/train': 0.80770742171444} 12/28/2021 03:44:57 - INFO - codeparrot_training - Step 39429: {'lr': 5.350680677197919e-05, 'samples': 20188160, 'steps': 39429, 'batch_loss/train': 1.2996415947563946} 12/28/2021 03:45:08 - INFO - codeparrot_training - Step 39430: {'lr': 5.349706142970487e-05, 'samples': 20188672, 'steps': 39430, 'batch_loss/train': 0.697851316537708} 12/28/2021 03:45:21 - INFO - codeparrot_training - Step 39431: {'lr': 5.3487316868651934e-05, 'samples': 20189184, 'steps': 39431, 'batch_loss/train': 0.7207304290495813} 12/28/2021 03:45:31 - INFO - codeparrot_training - Step 39432: {'lr': 5.3477573088859174e-05, 'samples': 20189696, 'steps': 39432, 'batch_loss/train': 0.5863667842932045} 12/28/2021 03:45:42 - INFO - codeparrot_training - Step 39433: {'lr': 5.346783009036552e-05, 'samples': 20190208, 'steps': 39433, 'batch_loss/train': 0.723654449917376} 12/28/2021 03:45:53 - INFO - codeparrot_training - Step 39434: {'lr': 5.3458087873209464e-05, 'samples': 20190720, 'steps': 39434, 'batch_loss/train': 0.7606234066188335} 12/28/2021 03:46:05 - INFO - codeparrot_training - Step 39435: {'lr': 5.344834643742988e-05, 'samples': 
20191232, 'steps': 39435, 'batch_loss/train': 0.671953622251749} 12/28/2021 03:46:15 - INFO - codeparrot_training - Step 39436: {'lr': 5.3438605783065447e-05, 'samples': 20191744, 'steps': 39436, 'batch_loss/train': 0.7555265463888645} 12/28/2021 03:46:26 - INFO - codeparrot_training - Step 39437: {'lr': 5.342886591015489e-05, 'samples': 20192256, 'steps': 39437, 'batch_loss/train': 0.7214535223320127} 12/28/2021 03:46:39 - INFO - codeparrot_training - Step 39438: {'lr': 5.341912681873693e-05, 'samples': 20192768, 'steps': 39438, 'batch_loss/train': 0.7911840984597802} 12/28/2021 03:46:49 - INFO - codeparrot_training - Step 39439: {'lr': 5.34093885088503e-05, 'samples': 20193280, 'steps': 39439, 'batch_loss/train': 0.7856487259268761} 12/28/2021 03:47:00 - INFO - codeparrot_training - Step 39440: {'lr': 5.339965098053373e-05, 'samples': 20193792, 'steps': 39440, 'batch_loss/train': 0.6575935987057164} 12/28/2021 03:47:12 - INFO - codeparrot_training - Step 39441: {'lr': 5.338991423382597e-05, 'samples': 20194304, 'steps': 39441, 'batch_loss/train': 0.866133195348084} 12/28/2021 03:47:23 - INFO - codeparrot_training - Step 39442: {'lr': 5.3380178268765595e-05, 'samples': 20194816, 'steps': 39442, 'batch_loss/train': 0.674976222217083} 12/28/2021 03:47:33 - INFO - codeparrot_training - Step 39443: {'lr': 5.3370443085391354e-05, 'samples': 20195328, 'steps': 39443, 'batch_loss/train': 0.7927540550008416} 12/28/2021 03:47:44 - INFO - codeparrot_training - Step 39444: {'lr': 5.33607086837421e-05, 'samples': 20195840, 'steps': 39444, 'batch_loss/train': 0.7096011554822326} 12/28/2021 03:47:57 - INFO - codeparrot_training - Step 39445: {'lr': 5.335097506385633e-05, 'samples': 20196352, 'steps': 39445, 'batch_loss/train': 0.8216638413723558} 12/28/2021 03:48:07 - INFO - codeparrot_training - Step 39446: {'lr': 5.334124222577288e-05, 'samples': 20196864, 'steps': 39446, 'batch_loss/train': 0.8768046479672194} 12/28/2021 03:48:18 - INFO - codeparrot_training - Step 39447: 
{'lr': 5.333151016953036e-05, 'samples': 20197376, 'steps': 39447, 'batch_loss/train': 0.7923226119019091} 12/28/2021 03:48:30 - INFO - codeparrot_training - Step 39448: {'lr': 5.3321778895167514e-05, 'samples': 20197888, 'steps': 39448, 'batch_loss/train': 0.7977863007690758} 12/28/2021 03:48:41 - INFO - codeparrot_training - Step 39449: {'lr': 5.3312048402723e-05, 'samples': 20198400, 'steps': 39449, 'batch_loss/train': 0.7947170315310359} 12/28/2021 03:48:52 - INFO - codeparrot_training - Step 39450: {'lr': 5.330231869223559e-05, 'samples': 20198912, 'steps': 39450, 'batch_loss/train': 0.817412176169455} 12/28/2021 03:49:04 - INFO - codeparrot_training - Step 39451: {'lr': 5.3292589763743766e-05, 'samples': 20199424, 'steps': 39451, 'batch_loss/train': 0.7836900002439506} 12/28/2021 03:49:15 - INFO - codeparrot_training - Step 39452: {'lr': 5.328286161728638e-05, 'samples': 20199936, 'steps': 39452, 'batch_loss/train': 0.832715867087245} 12/28/2021 03:49:25 - INFO - codeparrot_training - Step 39453: {'lr': 5.327313425290215e-05, 'samples': 20200448, 'steps': 39453, 'batch_loss/train': 0.7242568549700081} 12/28/2021 03:49:36 - INFO - codeparrot_training - Step 39454: {'lr': 5.326340767062948e-05, 'samples': 20200960, 'steps': 39454, 'batch_loss/train': 0.9082569624297321} 12/28/2021 03:49:48 - INFO - codeparrot_training - Step 39455: {'lr': 5.32536818705073e-05, 'samples': 20201472, 'steps': 39455, 'batch_loss/train': 0.6768544069491327} 12/28/2021 03:49:59 - INFO - codeparrot_training - Step 39456: {'lr': 5.32439568525743e-05, 'samples': 20201984, 'steps': 39456, 'batch_loss/train': 0.6968962103128433} 12/28/2021 03:50:09 - INFO - codeparrot_training - Step 39457: {'lr': 5.323423261686891e-05, 'samples': 20202496, 'steps': 39457, 'batch_loss/train': 0.6600030197296292} 12/28/2021 03:50:22 - INFO - codeparrot_training - Step 39458: {'lr': 5.322450916342989e-05, 'samples': 20203008, 'steps': 39458, 'batch_loss/train': 0.7658739178441465} 12/28/2021 03:50:33 - INFO 
- codeparrot_training - Step 39459: {'lr': 5.321478649229605e-05, 'samples': 20203520, 'steps': 39459, 'batch_loss/train': 0.7794425114989281} 12/28/2021 03:50:43 - INFO - codeparrot_training - Step 39460: {'lr': 5.320506460350582e-05, 'samples': 20204032, 'steps': 39460, 'batch_loss/train': 0.7390720979310572} 12/28/2021 03:50:55 - INFO - codeparrot_training - Step 39461: {'lr': 5.319534349709798e-05, 'samples': 20204544, 'steps': 39461, 'batch_loss/train': 0.7812855243682861} 12/28/2021 03:51:06 - INFO - codeparrot_training - Step 39462: {'lr': 5.318562317311112e-05, 'samples': 20205056, 'steps': 39462, 'batch_loss/train': 0.8252431834116578} 12/28/2021 03:51:17 - INFO - codeparrot_training - Step 39463: {'lr': 5.317590363158392e-05, 'samples': 20205568, 'steps': 39463, 'batch_loss/train': 0.9619421968236566} 12/28/2021 03:51:27 - INFO - codeparrot_training - Step 39464: {'lr': 5.316618487255501e-05, 'samples': 20206080, 'steps': 39464, 'batch_loss/train': 0.7328809965401888} 12/28/2021 03:51:39 - INFO - codeparrot_training - Step 39465: {'lr': 5.3156466896063014e-05, 'samples': 20206592, 'steps': 39465, 'batch_loss/train': 0.7762683797627687} 12/28/2021 03:51:50 - INFO - codeparrot_training - Step 39466: {'lr': 5.314674970214661e-05, 'samples': 20207104, 'steps': 39466, 'batch_loss/train': 0.789859258569777} 12/28/2021 03:52:01 - INFO - codeparrot_training - Step 39467: {'lr': 5.3137033290844375e-05, 'samples': 20207616, 'steps': 39467, 'batch_loss/train': 0.7961180275306106} 12/28/2021 03:52:14 - INFO - codeparrot_training - Step 39468: {'lr': 5.312731766219503e-05, 'samples': 20208128, 'steps': 39468, 'batch_loss/train': 0.7739022010937333} 12/28/2021 03:52:24 - INFO - codeparrot_training - Step 39469: {'lr': 5.311760281623701e-05, 'samples': 20208640, 'steps': 39469, 'batch_loss/train': 0.7133411024697125} 12/28/2021 03:52:35 - INFO - codeparrot_training - Step 39470: {'lr': 5.31078887530092e-05, 'samples': 20209152, 'steps': 39470, 'batch_loss/train': 
0.6681692483834922} 12/28/2021 03:52:47 - INFO - codeparrot_training - Step 39471: {'lr': 5.3098175472549996e-05, 'samples': 20209664, 'steps': 39471, 'batch_loss/train': 0.753307712264359} 12/28/2021 03:52:57 - INFO - codeparrot_training - Step 39472: {'lr': 5.308846297489811e-05, 'samples': 20210176, 'steps': 39472, 'batch_loss/train': 0.7991049312986434} 12/28/2021 03:53:08 - INFO - codeparrot_training - Step 39473: {'lr': 5.307875126009215e-05, 'samples': 20210688, 'steps': 39473, 'batch_loss/train': 0.8328766617923975} 12/28/2021 03:53:19 - INFO - codeparrot_training - Step 39474: {'lr': 5.30690403281707e-05, 'samples': 20211200, 'steps': 39474, 'batch_loss/train': 0.7671182518824935} 12/28/2021 03:53:31 - INFO - codeparrot_training - Step 39475: {'lr': 5.3059330179172416e-05, 'samples': 20211712, 'steps': 39475, 'batch_loss/train': 0.7624864378012717} 12/28/2021 03:53:42 - INFO - codeparrot_training - Step 39476: {'lr': 5.304962081313586e-05, 'samples': 20212224, 'steps': 39476, 'batch_loss/train': 0.7009523280430585} 12/28/2021 03:53:52 - INFO - codeparrot_training - Step 39477: {'lr': 5.303991223009963e-05, 'samples': 20212736, 'steps': 39477, 'batch_loss/train': 0.7281534804496914} 12/28/2021 03:54:05 - INFO - codeparrot_training - Step 39478: {'lr': 5.303020443010237e-05, 'samples': 20213248, 'steps': 39478, 'batch_loss/train': 0.7090143738314509} 12/28/2021 03:54:16 - INFO - codeparrot_training - Step 39479: {'lr': 5.302049741318268e-05, 'samples': 20213760, 'steps': 39479, 'batch_loss/train': 0.7509523617336527} 12/28/2021 03:54:26 - INFO - codeparrot_training - Step 39480: {'lr': 5.3010791179378986e-05, 'samples': 20214272, 'steps': 39480, 'batch_loss/train': 0.7893891409039497} 12/28/2021 03:54:39 - INFO - codeparrot_training - Step 39481: {'lr': 5.300108572873011e-05, 'samples': 20214784, 'steps': 39481, 'batch_loss/train': 0.795395475346595} 12/28/2021 03:54:50 - INFO - codeparrot_training - Step 39482: {'lr': 5.299138106127458e-05, 'samples': 
20215296, 'steps': 39482, 'batch_loss/train': 0.8057125471532345} 12/28/2021 03:55:00 - INFO - codeparrot_training - Step 39483: {'lr': 5.298167717705085e-05, 'samples': 20215808, 'steps': 39483, 'batch_loss/train': 0.7741200644522905} 12/28/2021 03:55:11 - INFO - codeparrot_training - Step 39484: {'lr': 5.2971974076097524e-05, 'samples': 20216320, 'steps': 39484, 'batch_loss/train': 0.8375154016539454} 12/28/2021 03:55:24 - INFO - codeparrot_training - Step 39485: {'lr': 5.296227175845336e-05, 'samples': 20216832, 'steps': 39485, 'batch_loss/train': 1.1451807075645775} 12/28/2021 03:55:35 - INFO - codeparrot_training - Step 39486: {'lr': 5.295257022415673e-05, 'samples': 20217344, 'steps': 39486, 'batch_loss/train': 0.5663501154631376} 12/28/2021 03:55:45 - INFO - codeparrot_training - Step 39487: {'lr': 5.2942869473246306e-05, 'samples': 20217856, 'steps': 39487, 'batch_loss/train': 0.5451652510091662} 12/28/2021 03:55:57 - INFO - codeparrot_training - Step 39488: {'lr': 5.2933169505760607e-05, 'samples': 20218368, 'steps': 39488, 'batch_loss/train': 0.6872301953844726} 12/28/2021 03:56:08 - INFO - codeparrot_training - Step 39489: {'lr': 5.292347032173822e-05, 'samples': 20218880, 'steps': 39489, 'batch_loss/train': 0.6846832782030106} 12/28/2021 03:56:19 - INFO - codeparrot_training - Step 39490: {'lr': 5.2913771921217714e-05, 'samples': 20219392, 'steps': 39490, 'batch_loss/train': 0.6886759297922254} 12/28/2021 03:56:29 - INFO - codeparrot_training - Step 39491: {'lr': 5.29040743042376e-05, 'samples': 20219904, 'steps': 39491, 'batch_loss/train': 0.7797865075990558} 12/28/2021 03:56:42 - INFO - codeparrot_training - Step 39492: {'lr': 5.28943774708365e-05, 'samples': 20220416, 'steps': 39492, 'batch_loss/train': 0.733067914377898} 12/28/2021 03:56:52 - INFO - codeparrot_training - Step 39493: {'lr': 5.288468142105293e-05, 'samples': 20220928, 'steps': 39493, 'batch_loss/train': 0.976993944728747} 12/28/2021 03:57:03 - INFO - codeparrot_training - Step 39494: 
{'lr': 5.2874986154925504e-05, 'samples': 20221440, 'steps': 39494, 'batch_loss/train': 0.6689965697005391}
12/28/2021 03:57:16 - INFO - codeparrot_training - Step 39495: {'lr': 5.2865291672492544e-05, 'samples': 20221952, 'steps': 39495, 'batch_loss/train': 0.6948798093944788}
12/28/2021 03:57:26 - INFO - codeparrot_training - Step 39496: {'lr': 5.285559797379291e-05, 'samples': 20222464, 'steps': 39496, 'batch_loss/train': 0.7858500145375729}
12/28/2021 03:57:37 - INFO - codeparrot_training - Step 39497: {'lr': 5.2845905058864916e-05, 'samples': 20222976, 'steps': 39497, 'batch_loss/train': 0.7580390367656946}
12/28/2021 03:57:47 - INFO - codeparrot_training - Step 39498: {'lr': 5.2836212927747124e-05, 'samples': 20223488, 'steps': 39498, 'batch_loss/train': 0.7396728296298534}
12/28/2021 03:58:00 - INFO - codeparrot_training - Step 39499: {'lr': 5.282652158047813e-05, 'samples': 20224000, 'steps': 39499, 'batch_loss/train': 0.6015521096996963}
12/28/2021 03:58:10 - INFO - codeparrot_training - Step 39500: {'lr': 5.281683101709644e-05, 'samples': 20224512, 'steps': 39500, 'batch_loss/train': 0.6730036404915154}
12/28/2021 03:58:21 - INFO - codeparrot_training - Step 39501: {'lr': 5.280714123764058e-05, 'samples': 20225024, 'steps': 39501, 'batch_loss/train': 0.7437726520001888}
12/28/2021 03:58:33 - INFO - codeparrot_training - Step 39502: {'lr': 5.2797452242149046e-05, 'samples': 20225536, 'steps': 39502, 'batch_loss/train': 0.5893693719990551}
12/28/2021 03:58:44 - INFO - codeparrot_training - Step 39503: {'lr': 5.278776403066041e-05, 'samples': 20226048, 'steps': 39503, 'batch_loss/train': 0.5687245114240795}
12/28/2021 03:58:55 - INFO - codeparrot_training - Step 39504: {'lr': 5.277807660321313e-05, 'samples': 20226560, 'steps': 39504, 'batch_loss/train': 0.8163926750421524}
12/28/2021 03:59:05 - INFO - codeparrot_training - Step 39505: {'lr': 5.276838995984584e-05, 'samples': 20227072, 'steps': 39505, 'batch_loss/train': 0.7564628190593794}
12/28/2021 03:59:17 - INFO - codeparrot_training - Step 39506: {'lr': 5.27587041005968e-05, 'samples': 20227584, 'steps': 39506, 'batch_loss/train': 0.8105967100709677}
12/28/2021 03:59:28 - INFO - codeparrot_training - Step 39507: {'lr': 5.2749019025504777e-05, 'samples': 20228096, 'steps': 39507, 'batch_loss/train': 0.7663383865728974}
12/28/2021 03:59:39 - INFO - codeparrot_training - Step 39508: {'lr': 5.273933473460821e-05, 'samples': 20228608, 'steps': 39508, 'batch_loss/train': 0.8066987665370107}
12/28/2021 03:59:51 - INFO - codeparrot_training - Step 39509: {'lr': 5.2729651227945514e-05, 'samples': 20229120, 'steps': 39509, 'batch_loss/train': 0.7933130403980613}
12/28/2021 04:00:02 - INFO - codeparrot_training - Step 39510: {'lr': 5.271996850555516e-05, 'samples': 20229632, 'steps': 39510, 'batch_loss/train': 0.7127966834232211}
12/28/2021 04:00:12 - INFO - codeparrot_training - Step 39511: {'lr': 5.2710286567475867e-05, 'samples': 20230144, 'steps': 39511, 'batch_loss/train': 0.7647283058613539}
12/28/2021 04:00:24 - INFO - codeparrot_training - Step 39512: {'lr': 5.270060541374594e-05, 'samples': 20230656, 'steps': 39512, 'batch_loss/train': 0.8309175898320973}
12/28/2021 04:00:35 - INFO - codeparrot_training - Step 39513: {'lr': 5.269092504440379e-05, 'samples': 20231168, 'steps': 39513, 'batch_loss/train': 0.7005123142153025}
12/28/2021 04:00:46 - INFO - codeparrot_training - Step 39514: {'lr': 5.2681245459488184e-05, 'samples': 20231680, 'steps': 39514, 'batch_loss/train': 0.786739761591889}
12/28/2021 04:01:00 - INFO - codeparrot_training - Step 39515: {'lr': 5.2671566659037386e-05, 'samples': 20232192, 'steps': 39515, 'batch_loss/train': 0.7465428072027862}
12/28/2021 04:01:11 - INFO - codeparrot_training - Step 39516: {'lr': 5.26618886430899e-05, 'samples': 20232704, 'steps': 39516, 'batch_loss/train': 0.7741858903318644}
12/28/2021 04:01:21 - INFO - codeparrot_training - Step 39517: {'lr': 5.265221141168425e-05, 'samples': 20233216, 'steps': 39517, 'batch_loss/train': 0.8053018320351839}
12/28/2021 04:01:32 - INFO - codeparrot_training - Step 39518: {'lr': 5.264253496485888e-05, 'samples': 20233728, 'steps': 39518, 'batch_loss/train': 0.7997505096718669}
12/28/2021 04:01:44 - INFO - codeparrot_training - Step 39519: {'lr': 5.2632859302652295e-05, 'samples': 20234240, 'steps': 39519, 'batch_loss/train': 0.7403130661696196}
12/28/2021 04:01:55 - INFO - codeparrot_training - Step 39520: {'lr': 5.2623184425103e-05, 'samples': 20234752, 'steps': 39520, 'batch_loss/train': 0.703240207512863}
12/28/2021 04:02:06 - INFO - codeparrot_training - Step 39521: {'lr': 5.261351033224923e-05, 'samples': 20235264, 'steps': 39521, 'batch_loss/train': 0.8045481871813536}
12/28/2021 04:02:19 - INFO - codeparrot_training - Step 39522: {'lr': 5.260383702412971e-05, 'samples': 20235776, 'steps': 39522, 'batch_loss/train': 0.8855097591876984}
12/28/2021 04:02:30 - INFO - codeparrot_training - Step 39523: {'lr': 5.259416450078286e-05, 'samples': 20236288, 'steps': 39523, 'batch_loss/train': 0.700821787584573}
12/28/2021 04:02:41 - INFO - codeparrot_training - Step 39524: {'lr': 5.2584492762247025e-05, 'samples': 20236800, 'steps': 39524, 'batch_loss/train': 0.819702785462141}
12/28/2021 04:02:53 - INFO - codeparrot_training - Step 39525: {'lr': 5.2574821808560685e-05, 'samples': 20237312, 'steps': 39525, 'batch_loss/train': 0.677205074345693}
12/28/2021 04:03:04 - INFO - codeparrot_training - Step 39526: {'lr': 5.256515163976231e-05, 'samples': 20237824, 'steps': 39526, 'batch_loss/train': 0.7097938568331301}
12/28/2021 04:03:14 - INFO - codeparrot_training - Step 39527: {'lr': 5.2555482255890353e-05, 'samples': 20238336, 'steps': 39527, 'batch_loss/train': 0.8950821693288162}
12/28/2021 04:03:25 - INFO - codeparrot_training - Step 39528: {'lr': 5.2545813656983247e-05, 'samples': 20238848, 'steps': 39528, 'batch_loss/train': 0.8221970946760848}
12/28/2021 04:03:37 - INFO - codeparrot_training - Step 39529: {'lr': 5.253614584307942e-05, 'samples': 20239360, 'steps': 39529, 'batch_loss/train': 0.717199434991926}
12/28/2021 04:03:48 - INFO - codeparrot_training - Step 39530: {'lr': 5.252647881421732e-05, 'samples': 20239872, 'steps': 39530, 'batch_loss/train': 0.6191679565235972}
12/28/2021 04:03:59 - INFO - codeparrot_training - Step 39531: {'lr': 5.251681257043545e-05, 'samples': 20240384, 'steps': 39531, 'batch_loss/train': 0.9427189854905009}
12/28/2021 04:04:13 - INFO - codeparrot_training - Step 39532: {'lr': 5.2507147111772014e-05, 'samples': 20240896, 'steps': 39532, 'batch_loss/train': 0.7077612257562578}
12/28/2021 04:04:24 - INFO - codeparrot_training - Step 39533: {'lr': 5.2497482438265656e-05, 'samples': 20241408, 'steps': 39533, 'batch_loss/train': 0.7485678996890783}
12/28/2021 04:04:34 - INFO - codeparrot_training - Step 39534: {'lr': 5.248781854995482e-05, 'samples': 20241920, 'steps': 39534, 'batch_loss/train': 0.6064744120230898}
12/28/2021 04:04:47 - INFO - codeparrot_training - Step 39535: {'lr': 5.247815544687778e-05, 'samples': 20242432, 'steps': 39535, 'batch_loss/train': 0.7415142427198589}
12/28/2021 04:04:57 - INFO - codeparrot_training - Step 39536: {'lr': 5.246849312907292e-05, 'samples': 20242944, 'steps': 39536, 'batch_loss/train': 0.7681648316793144}
12/28/2021 04:05:08 - INFO - codeparrot_training - Step 39537: {'lr': 5.245883159657891e-05, 'samples': 20243456, 'steps': 39537, 'batch_loss/train': 0.8556383970426396}
12/28/2021 04:05:19 - INFO - codeparrot_training - Step 39538: {'lr': 5.24491708494339e-05, 'samples': 20243968, 'steps': 39538, 'batch_loss/train': 0.6816891863709316}
12/28/2021 04:05:32 - INFO - codeparrot_training - Step 39539: {'lr': 5.2439510887676345e-05, 'samples': 20244480, 'steps': 39539, 'batch_loss/train': 0.7694361675530672}
12/28/2021 04:05:43 - INFO - codeparrot_training - Step 39540: {'lr': 5.2429851711344846e-05, 'samples': 20244992, 'steps': 39540, 'batch_loss/train': 0.7293865485116839}
12/28/2021 04:05:54 - INFO - codeparrot_training - Step 39541: {'lr': 5.242019332047757e-05, 'samples': 20245504, 'steps': 39541, 'batch_loss/train': 0.6256009312346578}
12/28/2021 04:06:06 - INFO - codeparrot_training - Step 39542: {'lr': 5.2410535715113005e-05, 'samples': 20246016, 'steps': 39542, 'batch_loss/train': 0.7398149874061346}
12/28/2021 04:06:16 - INFO - codeparrot_training - Step 39543: {'lr': 5.240087889528952e-05, 'samples': 20246528, 'steps': 39543, 'batch_loss/train': 0.7173384143970907}
12/28/2021 04:06:27 - INFO - codeparrot_training - Step 39544: {'lr': 5.2391222861045557e-05, 'samples': 20247040, 'steps': 39544, 'batch_loss/train': 0.7675345093011856}
12/28/2021 04:06:39 - INFO - codeparrot_training - Step 39545: {'lr': 5.2381567612419495e-05, 'samples': 20247552, 'steps': 39545, 'batch_loss/train': 0.6943133411696181}
12/28/2021 04:06:50 - INFO - codeparrot_training - Step 39546: {'lr': 5.237191314944975e-05, 'samples': 20248064, 'steps': 39546, 'batch_loss/train': 0.7671464122831821}
12/28/2021 04:07:00 - INFO - codeparrot_training - Step 39547: {'lr': 5.236225947217452e-05, 'samples': 20248576, 'steps': 39547, 'batch_loss/train': 0.7169099416350946}
12/28/2021 04:07:11 - INFO - codeparrot_training - Step 39548: {'lr': 5.235260658063237e-05, 'samples': 20249088, 'steps': 39548, 'batch_loss/train': 0.6591742495074868}
12/28/2021 04:07:25 - INFO - codeparrot_training - Step 39549: {'lr': 5.234295447486173e-05, 'samples': 20249600, 'steps': 39549, 'batch_loss/train': 0.8073120112530887}
12/28/2021 04:07:36 - INFO - codeparrot_training - Step 39550: {'lr': 5.23333031549007e-05, 'samples': 20250112, 'steps': 39550, 'batch_loss/train': 0.8019711021333933}
12/28/2021 04:07:46 - INFO - codeparrot_training - Step 39551: {'lr': 5.232365262078798e-05, 'samples': 20250624, 'steps': 39551, 'batch_loss/train': 0.7972091233823448}
12/28/2021 04:07:58 - INFO - codeparrot_training - Step 39552: {'lr': 5.2314002872561714e-05, 'samples': 20251136, 'steps': 39552, 'batch_loss/train': 0.7372803362086415}
12/28/2021 04:08:09 - INFO - codeparrot_training - Step 39553: {'lr': 5.230435391026031e-05, 'samples': 20251648, 'steps': 39553, 'batch_loss/train': 0.7310261745005846}
12/28/2021 04:08:20 - INFO - codeparrot_training - Step 39554: {'lr': 5.229470573392217e-05, 'samples': 20252160, 'steps': 39554, 'batch_loss/train': 0.6623462662100792}
12/28/2021 04:08:32 - INFO - codeparrot_training - Step 39555: {'lr': 5.228505834358563e-05, 'samples': 20252672, 'steps': 39555, 'batch_loss/train': 0.7733089118264616}
12/28/2021 04:08:42 - INFO - codeparrot_training - Step 39556: {'lr': 5.2275411739289016e-05, 'samples': 20253184, 'steps': 39556, 'batch_loss/train': 0.7210176941007376}
12/28/2021 04:08:53 - INFO - codeparrot_training - Step 39557: {'lr': 5.226576592107074e-05, 'samples': 20253696, 'steps': 39557, 'batch_loss/train': 0.7121625049039721}
12/28/2021 04:09:06 - INFO - codeparrot_training - Step 39558: {'lr': 5.2256120888969085e-05, 'samples': 20254208, 'steps': 39558, 'batch_loss/train': 0.768906332552433}
12/28/2021 04:09:17 - INFO - codeparrot_training - Step 39559: {'lr': 5.2246476643022456e-05, 'samples': 20254720, 'steps': 39559, 'batch_loss/train': 1.3456422664457932}
12/28/2021 04:09:27 - INFO - codeparrot_training - Step 39560: {'lr': 5.223683318326922e-05, 'samples': 20255232, 'steps': 39560, 'batch_loss/train': 1.2541991639882326}
12/28/2021 04:09:38 - INFO - codeparrot_training - Step 39561: {'lr': 5.2227190509747585e-05, 'samples': 20255744, 'steps': 39561, 'batch_loss/train': 0.7395387664437294}
12/28/2021 04:09:52 - INFO - codeparrot_training - Step 39562: {'lr': 5.221754862249592e-05, 'samples': 20256256, 'steps': 39562, 'batch_loss/train': 0.7635279502719641}
12/28/2021 04:10:03 - INFO - codeparrot_training - Step 39563: {'lr': 5.2207907521552715e-05, 'samples': 20256768, 'steps': 39563, 'batch_loss/train': 0.5678895735181868}
12/28/2021 04:10:13 - INFO - codeparrot_training - Step 39564: {'lr': 5.219826720695614e-05, 'samples': 20257280, 'steps': 39564, 'batch_loss/train': 0.7383971717208624}
12/28/2021 04:10:26 - INFO - codeparrot_training - Step 39565: {'lr': 5.218862767874449e-05, 'samples': 20257792, 'steps': 39565, 'batch_loss/train': 0.6920982759911567}
12/28/2021 04:10:36 - INFO - codeparrot_training - Step 39566: {'lr': 5.21789889369563e-05, 'samples': 20258304, 'steps': 39566, 'batch_loss/train': 0.6707998330239207}
12/28/2021 04:10:47 - INFO - codeparrot_training - Step 39567: {'lr': 5.216935098162967e-05, 'samples': 20258816, 'steps': 39567, 'batch_loss/train': 0.770027888007462}
12/28/2021 04:10:57 - INFO - codeparrot_training - Step 39568: {'lr': 5.215971381280304e-05, 'samples': 20259328, 'steps': 39568, 'batch_loss/train': 0.764484285376966}
12/28/2021 04:11:11 - INFO - codeparrot_training - Step 39569: {'lr': 5.215007743051467e-05, 'samples': 20259840, 'steps': 39569, 'batch_loss/train': 0.7969617573544383}
12/28/2021 04:11:22 - INFO - codeparrot_training - Step 39570: {'lr': 5.2140441834802874e-05, 'samples': 20260352, 'steps': 39570, 'batch_loss/train': 0.6881572604179382}
12/28/2021 04:11:33 - INFO - codeparrot_training - Step 39571: {'lr': 5.2130807025706e-05, 'samples': 20260864, 'steps': 39571, 'batch_loss/train': 0.694445324363187}
12/28/2021 04:11:45 - INFO - codeparrot_training - Step 39572: {'lr': 5.2121173003262355e-05, 'samples': 20261376, 'steps': 39572, 'batch_loss/train': 0.7845599967986345}
12/28/2021 04:11:55 - INFO - codeparrot_training - Step 39573: {'lr': 5.21115397675101e-05, 'samples': 20261888, 'steps': 39573, 'batch_loss/train': 0.7332054583821446}
12/28/2021 04:12:06 - INFO - codeparrot_training - Step 39574: {'lr': 5.210190731848771e-05, 'samples': 20262400, 'steps': 39574, 'batch_loss/train': 0.751812526024878}
12/28/2021 04:12:18 - INFO - codeparrot_training - Step 39575: {'lr': 5.209227565623348e-05, 'samples': 20262912, 'steps': 39575, 'batch_loss/train': 0.8098681080155075}
12/28/2021 04:12:29 - INFO - codeparrot_training - Step 39576: {'lr': 5.208264478078548e-05, 'samples': 20263424, 'steps': 39576, 'batch_loss/train': 0.6588932536542416}
12/28/2021 04:12:39 - INFO - codeparrot_training - Step 39577: {'lr': 5.207301469218231e-05, 'samples': 20263936, 'steps': 39577, 'batch_loss/train': 0.6943796155974269}
12/28/2021 04:12:50 - INFO - codeparrot_training - Step 39578: {'lr': 5.206338539046204e-05, 'samples': 20264448, 'steps': 39578, 'batch_loss/train': 0.6054997416213155}
12/28/2021 04:13:04 - INFO - codeparrot_training - Step 39579: {'lr': 5.205375687566299e-05, 'samples': 20264960, 'steps': 39579, 'batch_loss/train': 0.6770995217375457}
12/28/2021 04:13:15 - INFO - codeparrot_training - Step 39580: {'lr': 5.204412914782347e-05, 'samples': 20265472, 'steps': 39580, 'batch_loss/train': 0.6087929778150283}
12/28/2021 04:13:25 - INFO - codeparrot_training - Step 39581: {'lr': 5.203450220698175e-05, 'samples': 20265984, 'steps': 39581, 'batch_loss/train': 0.7493968333583325}
12/28/2021 04:13:37 - INFO - codeparrot_training - Step 39582: {'lr': 5.20248760531761e-05, 'samples': 20266496, 'steps': 39582, 'batch_loss/train': 0.8489007772877812}
12/28/2021 04:13:48 - INFO - codeparrot_training - Step 39583: {'lr': 5.201525068644478e-05, 'samples': 20267008, 'steps': 39583, 'batch_loss/train': 0.7359315338544548}
12/28/2021 04:13:58 - INFO - codeparrot_training - Step 39584: {'lr': 5.2005626106826074e-05, 'samples': 20267520, 'steps': 39584, 'batch_loss/train': 0.6875851983204484}
12/28/2021 04:14:10 - INFO - codeparrot_training - Step 39585: {'lr': 5.199600231435825e-05, 'samples': 20268032, 'steps': 39585, 'batch_loss/train': 0.7023877198225819}
12/28/2021 04:14:21 - INFO - codeparrot_training - Step 39586: {'lr': 5.1986379309079595e-05, 'samples': 20268544, 'steps': 39586, 'batch_loss/train': 0.7152821407653391}
12/28/2021 04:14:32 - INFO - codeparrot_training - Step 39587: {'lr': 5.197675709102823e-05, 'samples': 20269056, 'steps': 39587, 'batch_loss/train': 0.7999163055792451}
12/28/2021 04:14:46 - INFO - codeparrot_training - Step 39588: {'lr': 5.196713566024255e-05, 'samples': 20269568, 'steps': 39588, 'batch_loss/train': 0.769162273965776}
12/28/2021 04:14:56 - INFO - codeparrot_training - Step 39589: {'lr': 5.195751501676085e-05, 'samples': 20270080, 'steps': 39589, 'batch_loss/train': 0.7438411233015358}
12/28/2021 04:15:07 - INFO - codeparrot_training - Step 39590: {'lr': 5.194789516062121e-05, 'samples': 20270592, 'steps': 39590, 'batch_loss/train': 0.7379464539699256}
12/28/2021 04:15:18 - INFO - codeparrot_training - Step 39591: {'lr': 5.193827609186189e-05, 'samples': 20271104, 'steps': 39591, 'batch_loss/train': 0.7672349782660604}
12/28/2021 04:15:30 - INFO - codeparrot_training - Step 39592: {'lr': 5.192865781052133e-05, 'samples': 20271616, 'steps': 39592, 'batch_loss/train': 0.75081807654351}
12/28/2021 04:15:40 - INFO - codeparrot_training - Step 39593: {'lr': 5.191904031663758e-05, 'samples': 20272128, 'steps': 39593, 'batch_loss/train': 0.6515283366898075}
12/28/2021 04:15:51 - INFO - codeparrot_training - Step 39594: {'lr': 5.1909423610248943e-05, 'samples': 20272640, 'steps': 39594, 'batch_loss/train': 0.7152600893750787}
12/28/2021 04:16:03 - INFO - codeparrot_training - Step 39595: {'lr': 5.189980769139363e-05, 'samples': 20273152, 'steps': 39595, 'batch_loss/train': 0.8139065532013774}
12/28/2021 04:16:14 - INFO - codeparrot_training - Step 39596: {'lr': 5.189019256010988e-05, 'samples': 20273664, 'steps': 39596, 'batch_loss/train': 0.6639184164814651}
12/28/2021 04:16:24 - INFO - codeparrot_training - Step 39597: {'lr': 5.188057821643594e-05, 'samples': 20274176, 'steps': 39597, 'batch_loss/train': 0.7342574633657932}
12/28/2021 04:16:38 - INFO - codeparrot_training - Step 39598: {'lr': 5.1870964660410054e-05, 'samples': 20274688, 'steps': 39598, 'batch_loss/train': 0.7287344094365835}
12/28/2021 04:16:49 - INFO - codeparrot_training - Step 39599: {'lr': 5.186135189207028e-05, 'samples': 20275200, 'steps': 39599, 'batch_loss/train': 0.7585508488118649}
12/28/2021 04:16:59 - INFO - codeparrot_training - Step 39600: {'lr': 5.185173991145503e-05, 'samples': 20275712, 'steps': 39600, 'batch_loss/train': 0.7730108662508428}
12/28/2021 04:17:10 - INFO - codeparrot_training - Step 39601: {'lr': 5.18421287186025e-05, 'samples': 20276224, 'steps': 39601, 'batch_loss/train': 0.6897345795296133}
12/28/2021 04:17:22 - INFO - codeparrot_training - Step 39602: {'lr': 5.183251831355071e-05, 'samples': 20276736, 'steps': 39602, 'batch_loss/train': 0.6653158687986434}
12/28/2021 04:17:33 - INFO - codeparrot_training - Step 39603: {'lr': 5.182290869633807e-05, 'samples': 20277248, 'steps': 39603, 'batch_loss/train': 0.9142854269593954}
12/28/2021 04:17:43 - INFO - codeparrot_training - Step 39604: {'lr': 5.1813299867002784e-05, 'samples': 20277760, 'steps': 39604, 'batch_loss/train': 0.795382262673229}
12/28/2021 04:17:56 - INFO - codeparrot_training - Step 39605: {'lr': 5.1803691825582924e-05, 'samples': 20278272, 'steps': 39605, 'batch_loss/train': 0.8666467778384686}
12/28/2021 04:18:06 - INFO - codeparrot_training - Step 39606: {'lr': 5.1794084572116765e-05, 'samples': 20278784, 'steps': 39606, 'batch_loss/train': 0.6971340943127871}
12/28/2021 04:18:17 - INFO - codeparrot_training - Step 39607: {'lr': 5.178447810664244e-05, 'samples': 20279296, 'steps': 39607, 'batch_loss/train': 0.662275439593941}
12/28/2021 04:18:31 - INFO - codeparrot_training - Step 39608: {'lr': 5.1774872429198236e-05, 'samples': 20279808, 'steps': 39608, 'batch_loss/train': 0.7480289028026164}
12/28/2021 04:18:42 - INFO - codeparrot_training - Step 39609: {'lr': 5.1765267539822277e-05, 'samples': 20280320, 'steps': 39609, 'batch_loss/train': 0.691581305116415}
12/28/2021 04:18:52 - INFO - codeparrot_training - Step 39610: {'lr': 5.1755663438552733e-05, 'samples': 20280832, 'steps': 39610, 'batch_loss/train': 1.4826431386172771}
12/28/2021 04:19:03 - INFO - codeparrot_training - Step 39611: {'lr': 5.1746060125427846e-05, 'samples': 20281344, 'steps': 39611, 'batch_loss/train': 0.7383647351525724}
12/28/2021 04:19:15 - INFO - codeparrot_training - Step 39612: {'lr': 5.1736457600485845e-05, 'samples': 20281856, 'steps': 39612, 'batch_loss/train': 0.7275293837301433}
12/28/2021 04:19:26 - INFO - codeparrot_training - Step 39613: {'lr': 5.172685586376466e-05, 'samples': 20282368, 'steps': 39613, 'batch_loss/train': 0.7694696311373264}
12/28/2021 04:19:37 - INFO - codeparrot_training - Step 39614: {'lr': 5.171725491530271e-05, 'samples': 20282880, 'steps': 39614, 'batch_loss/train': 0.8144179163500667}
12/28/2021 04:19:49 - INFO - codeparrot_training - Step 39615: {'lr': 5.170765475513817e-05, 'samples': 20283392, 'steps': 39615, 'batch_loss/train': 0.7570091024972498}
12/28/2021 04:19:59 - INFO - codeparrot_training - Step 39616: {'lr': 5.169805538330904e-05, 'samples': 20283904, 'steps': 39616, 'batch_loss/train': 0.7317489363485947}
12/28/2021 04:20:10 - INFO - codeparrot_training - Step 39617: {'lr': 5.1688456799853515e-05, 'samples': 20284416, 'steps': 39617, 'batch_loss/train': 0.6716732583008707}
12/28/2021 04:20:24 - INFO - codeparrot_training - Step 39618: {'lr': 5.1678859004809926e-05, 'samples': 20284928, 'steps': 39618, 'batch_loss/train': 0.6830583205446601}
12/28/2021 04:20:35 - INFO - codeparrot_training - Step 39619: {'lr': 5.166926199821623e-05, 'samples': 20285440, 'steps': 39619, 'batch_loss/train': 0.6768754990771413}
12/28/2021 04:20:45 - INFO - codeparrot_training - Step 39620: {'lr': 5.165966578011069e-05, 'samples': 20285952, 'steps': 39620, 'batch_loss/train': 0.8053790563717484}
12/28/2021 04:20:56 - INFO - codeparrot_training - Step 39621: {'lr': 5.1650070350531405e-05, 'samples': 20286464, 'steps': 39621, 'batch_loss/train': 1.5687153005274013}
12/28/2021 04:21:08 - INFO - codeparrot_training - Step 39622: {'lr': 5.164047570951655e-05, 'samples': 20286976, 'steps': 39622, 'batch_loss/train': 0.7284638904966414}
12/28/2021 04:21:19 - INFO - codeparrot_training - Step 39623: {'lr': 5.1630881857104266e-05, 'samples': 20287488, 'steps': 39623, 'batch_loss/train': 0.7600716883316636}
12/28/2021 04:21:29 - INFO - codeparrot_training - Step 39624: {'lr': 5.162128879333272e-05, 'samples': 20288000, 'steps': 39624, 'batch_loss/train': 0.781382367014885}
12/28/2021 04:21:43 - INFO - codeparrot_training - Step 39625: {'lr': 5.1611696518239994e-05, 'samples': 20288512, 'steps': 39625, 'batch_loss/train': 0.8082688609138131}
12/28/2021 04:21:54 - INFO - codeparrot_training - Step 39626: {'lr': 5.1602105031864274e-05, 'samples': 20289024, 'steps': 39626, 'batch_loss/train': 0.8433557678945363}
12/28/2021 04:22:04 - INFO - codeparrot_training - Step 39627: {'lr': 5.1592514334243724e-05, 'samples': 20289536, 'steps': 39627, 'batch_loss/train': 0.7461097706109285}
12/28/2021 04:22:16 - INFO - codeparrot_training - Step 39628: {'lr': 5.158292442541629e-05, 'samples': 20290048, 'steps': 39628, 'batch_loss/train': 0.6623764494433999}
12/28/2021 04:22:27 - INFO - codeparrot_training - Step 39629: {'lr': 5.15733353054203e-05, 'samples': 20290560, 'steps': 39629, 'batch_loss/train': 0.8051587454974651}
12/28/2021 04:22:38 - INFO - codeparrot_training - Step 39630: {'lr': 5.156374697429389e-05, 'samples': 20291072, 'steps': 39630, 'batch_loss/train': 0.7394818935426883}
12/28/2021 04:22:48 - INFO - codeparrot_training - Step 39631: {'lr': 5.1554159432075006e-05, 'samples': 20291584, 'steps': 39631, 'batch_loss/train': 0.6912846127524972}
12/28/2021 04:23:00 - INFO - codeparrot_training - Step 39632: {'lr': 5.154457267880186e-05, 'samples': 20292096, 'steps': 39632, 'batch_loss/train': 0.747750504873693}
12/28/2021 04:23:11 - INFO - codeparrot_training - Step 39633: {'lr': 5.153498671451254e-05, 'samples': 20292608, 'steps': 39633, 'batch_loss/train': 0.7268418396124616}
12/28/2021 04:23:22 - INFO - codeparrot_training - Step 39634: {'lr': 5.1525401539245205e-05, 'samples': 20293120, 'steps': 39634, 'batch_loss/train': 0.6669947896152735}
12/28/2021 04:23:35 - INFO - codeparrot_training - Step 39635: {'lr': 5.151581715303791e-05, 'samples': 20293632, 'steps': 39635, 'batch_loss/train': 0.7268478516489267}
12/28/2021 04:23:46 - INFO - codeparrot_training - Step 39636: {'lr': 5.1506233555928786e-05, 'samples': 20294144, 'steps': 39636, 'batch_loss/train': 0.7868985859677196}
12/28/2021 04:23:57 - INFO - codeparrot_training - Step 39637: {'lr': 5.149665074795595e-05, 'samples': 20294656, 'steps': 39637, 'batch_loss/train': 0.7289516888558865}
12/28/2021 04:24:09 - INFO - codeparrot_training - Step 39638: {'lr': 5.148706872915745e-05, 'samples': 20295168, 'steps': 39638, 'batch_loss/train': 0.7530101051088423}
12/28/2021 04:24:19 - INFO - codeparrot_training - Step 39639: {'lr': 5.147748749957143e-05, 'samples': 20295680, 'steps': 39639, 'batch_loss/train': 0.7121061543002725}
12/28/2021 04:24:30 - INFO - codeparrot_training - Step 39640: {'lr': 5.1467907059235956e-05, 'samples': 20296192, 'steps': 39640, 'batch_loss/train': 0.7346760043874383}
12/28/2021 04:24:41 - INFO - codeparrot_training - Step 39641: {'lr': 5.1458327408189205e-05, 'samples': 20296704, 'steps': 39641, 'batch_loss/train': 0.8713387306779623}
12/28/2021 04:24:53 - INFO - codeparrot_training - Step 39642: {'lr': 5.1448748546469085e-05, 'samples': 20297216, 'steps': 39642, 'batch_loss/train': 0.6795435920357704}
12/28/2021 04:25:04 - INFO - codeparrot_training - Step 39643: {'lr': 5.14391704741137e-05, 'samples': 20297728, 'steps': 39643, 'batch_loss/train': 0.7367433030158281}
12/28/2021 04:25:14 - INFO - codeparrot_training - Step 39644: {'lr': 5.142959319116133e-05, 'samples': 20298240, 'steps': 39644, 'batch_loss/train': 0.7975420886650681}
12/28/2021 04:25:27 - INFO - codeparrot_training - Step 39645: {'lr': 5.142001669764987e-05, 'samples': 20298752, 'steps': 39645, 'batch_loss/train': 0.6187227219343185}
12/28/2021 04:25:37 - INFO - codeparrot_training - Step 39646: {'lr': 5.141044099361744e-05, 'samples': 20299264, 'steps': 39646, 'batch_loss/train': 0.82148152962327}
12/28/2021 04:25:48 - INFO - codeparrot_training - Step 39647: {'lr': 5.1400866079102135e-05, 'samples': 20299776, 'steps': 39647, 'batch_loss/train': 0.6870973228942603}
12/28/2021 04:26:02 - INFO - codeparrot_training - Step 39648: {'lr': 5.139129195414196e-05, 'samples': 20300288, 'steps': 39648, 'batch_loss/train': 0.6994103563483804}
12/28/2021 04:26:13 - INFO - codeparrot_training - Step 39649: {'lr': 5.138171861877502e-05, 'samples': 20300800, 'steps': 39649, 'batch_loss/train': 0.8051293268799782}
12/28/2021 04:26:23 - INFO - codeparrot_training - Step 39650: {'lr': 5.137214607303939e-05, 'samples': 20301312, 'steps': 39650, 'batch_loss/train': 0.7993084769695997}
12/28/2021 04:26:34 - INFO - codeparrot_training - Step 39651: {'lr': 5.136257431697311e-05, 'samples': 20301824, 'steps': 39651, 'batch_loss/train': 0.5789559172117151}
12/28/2021 04:26:46 - INFO - codeparrot_training - Step 39652: {'lr': 5.1353003350614214e-05, 'samples': 20302336, 'steps': 39652, 'batch_loss/train': 0.5658015960943885}
12/28/2021 04:26:57 - INFO - codeparrot_training - Step 39653: {'lr': 5.1343433174000866e-05, 'samples': 20302848, 'steps': 39653, 'batch_loss/train': 0.7830357691273093}
12/28/2021 04:27:07 - INFO - codeparrot_training - Step 39654: {'lr': 5.133386378717084e-05, 'samples': 20303360, 'steps': 39654, 'batch_loss/train': 0.8100753370672464}
12/28/2021 04:27:20 - INFO - codeparrot_training - Step 39655: {'lr': 5.132429519016246e-05, 'samples': 20303872, 'steps': 39655, 'batch_loss/train': 0.7296365089714527}
12/28/2021 04:27:31 - INFO - codeparrot_training - Step 39656: {'lr': 5.131472738301371e-05, 'samples': 20304384, 'steps': 39656, 'batch_loss/train': 0.8258457621559501}
12/28/2021 04:27:41 - INFO - codeparrot_training - Step 39657: {'lr': 5.130516036576255e-05, 'samples': 20304896, 'steps': 39657, 'batch_loss/train': 0.8143079364672303}
12/28/2021 04:27:53 - INFO - codeparrot_training - Step 39658: {'lr': 5.129559413844701e-05, 'samples': 20305408, 'steps': 39658, 'batch_loss/train': 0.6054388731718063}
12/28/2021 04:28:04 - INFO - codeparrot_training - Step 39659: {'lr': 5.128602870110519e-05, 'samples': 20305920, 'steps': 39659, 'batch_loss/train': 0.6691978343296796}
12/28/2021 04:28:15 - INFO - codeparrot_training - Step 39660: {'lr': 5.127646405377506e-05, 'samples': 20306432, 'steps': 39660, 'batch_loss/train': 0.6746454895474017}
12/28/2021 04:28:25 - INFO - codeparrot_training - Step 39661: {'lr': 5.1266900196494685e-05, 'samples': 20306944, 'steps': 39661, 'batch_loss/train': 0.7539047719910741}
12/28/2021 04:28:38 - INFO - codeparrot_training - Step 39662: {'lr': 5.125733712930208e-05, 'samples': 20307456, 'steps': 39662, 'batch_loss/train': 0.7677418082021177}
12/28/2021 04:28:49 - INFO - codeparrot_training - Step 39663: {'lr': 5.124777485223525e-05, 'samples': 20307968, 'steps': 39663, 'batch_loss/train': 1.6005069811362773}
12/28/2021 04:28:59 - INFO - codeparrot_training - Step 39664: {'lr': 5.1238213365332245e-05, 'samples': 20308480, 'steps': 39664, 'batch_loss/train': 0.7367983208969235}
12/28/2021 04:29:13 - INFO - codeparrot_training - Step 39665: {'lr': 5.122865266863105e-05, 'samples': 20308992, 'steps': 39665, 'batch_loss/train': 0.6889164033345878}
12/28/2021 04:29:23 - INFO - codeparrot_training - Step 39666: {'lr': 5.1219092762169655e-05, 'samples': 20309504, 'steps': 39666, 'batch_loss/train': 0.7066614113282412}
12/28/2021 04:29:34 - INFO - codeparrot_training - Step 39667: {'lr': 5.120953364598618e-05, 'samples': 20310016, 'steps': 39667, 'batch_loss/train': 0.7094760332256556}
12/28/2021 04:29:46 - INFO - codeparrot_training - Step 39668: {'lr': 5.1199975320118467e-05, 'samples': 20310528, 'steps': 39668, 'batch_loss/train': 0.5545350525062531}
12/28/2021 04:29:57 - INFO - codeparrot_training - Step 39669: {'lr': 5.119041778460451e-05, 'samples': 20311040, 'steps': 39669, 'batch_loss/train': 0.6602079742588103}
12/28/2021 04:30:07 - INFO - codeparrot_training - Step 39670: {'lr': 5.1180861039482526e-05, 'samples': 20311552, 'steps': 39670, 'batch_loss/train': 0.765222018584609}
12/28/2021 04:30:19 - INFO - codeparrot_training - Step 39671: {'lr': 5.1171305084790284e-05, 'samples': 20312064, 'steps': 39671, 'batch_loss/train': 0.6978749255649745}
12/28/2021 04:30:30 - INFO - codeparrot_training - Step 39672: {'lr': 5.116174992056588e-05, 'samples': 20312576, 'steps': 39672, 'batch_loss/train': 0.72997931484133}
12/28/2021 04:30:40 - INFO - codeparrot_training - Step 39673: {'lr': 5.115219554684728e-05, 'samples': 20313088, 'steps': 39673, 'batch_loss/train': 0.7458478999324143}
12/28/2021 04:30:51 - INFO - codeparrot_training - Step 39674: {'lr': 5.114264196367247e-05, 'samples': 20313600, 'steps': 39674, 'batch_loss/train': 0.7742873746901751}
12/28/2021 04:31:04 - INFO - codeparrot_training - Step 39675: {'lr': 5.113308917107942e-05, 'samples': 20314112, 'steps': 39675, 'batch_loss/train': 0.675979760941118}
12/28/2021 04:31:15 - INFO - codeparrot_training - Step 39676: {'lr': 5.112353716910612e-05, 'samples': 20314624, 'steps': 39676, 'batch_loss/train': 0.6857357671833597}
12/28/2021 04:31:25 - INFO - codeparrot_training - Step 39677: {'lr': 5.1113985957790546e-05, 'samples': 20315136, 'steps': 39677, 'batch_loss/train': 0.7814164651790634}
12/28/2021 04:31:38 - INFO - codeparrot_training - Step 39678: {'lr': 5.1104435537170644e-05, 'samples': 20315648, 'steps': 39678, 'batch_loss/train': 0.6770315776811913}
12/28/2021 04:31:49 - INFO - codeparrot_training - Step 39679: {'lr': 5.1094885907284505e-05, 'samples': 20316160, 'steps': 39679, 'batch_loss/train': 0.7506450917571783}
12/28/2021 04:31:59 - INFO - codeparrot_training - Step 39680: {'lr': 5.108533706816987e-05, 'samples': 20316672, 'steps': 39680, 'batch_loss/train': 0.7587979503441602}
12/28/2021 04:32:11 - INFO - codeparrot_training - Step 39681: {'lr': 5.1075789019864874e-05, 'samples': 20317184, 'steps': 39681, 'batch_loss/train': 0.6533469976857305}
12/28/2021 04:32:22 - INFO - codeparrot_training - Step 39682: {'lr': 5.1066241762407486e-05, 'samples': 20317696, 'steps': 39682, 'batch_loss/train': 0.6978532541543245}
12/28/2021 04:32:33 - INFO - codeparrot_training - Step 39683: {'lr': 5.105669529583556e-05, 'samples': 20318208, 'steps': 39683, 'batch_loss/train': 0.7809240389615297}
12/28/2021 04:32:43 - INFO - codeparrot_training - Step 39684: {'lr': 5.1047149620187e-05, 'samples': 20318720, 'steps': 39684, 'batch_loss/train': 0.745190987829119}
12/28/2021 04:32:56 - INFO - codeparrot_training - Step 39685: {'lr': 5.103760473550001e-05, 'samples': 20319232, 'steps': 39685, 'batch_loss/train': 0.7341688266023993}
12/28/2021 04:33:07 - INFO - codeparrot_training - Step 39686: {'lr': 5.102806064181231e-05, 'samples': 20319744, 'steps': 39686, 'batch_loss/train': 0.7314262269064784}
12/28/2021 04:33:18 - INFO - codeparrot_training - Step 39687: {'lr': 5.1018517339161905e-05, 'samples': 20320256, 'steps': 39687, 'batch_loss/train': 0.8017358900979161}
12/28/2021 04:33:30 - INFO - codeparrot_training - Step 39688: {'lr': 5.100897482758676e-05, 'samples': 20320768, 'steps': 39688, 'batch_loss/train': 0.6232454550918192}
12/28/2021 04:33:40 - INFO - codeparrot_training - Step 39689: {'lr': 5.0999433107124794e-05, 'samples': 20321280, 'steps': 39689, 'batch_loss/train': 0.7667644200846553}
12/28/2021 04:33:51 - INFO - codeparrot_training - Step 39690: {'lr': 5.098989217781391e-05, 'samples': 20321792, 'steps': 39690, 'batch_loss/train': 0.8533526044338942}
12/28/2021 04:34:03 - INFO - codeparrot_training - Step 39691: {'lr': 5.098035203969209e-05, 'samples': 20322304, 'steps': 39691, 'batch_loss/train': 0.774764153175056}
12/28/2021 04:34:14 - INFO - codeparrot_training - Step 39692: {'lr': 5.097081269279724e-05, 'samples': 20322816, 'steps': 39692, 'batch_loss/train': 0.733292636461556}
12/28/2021 04:34:24 - INFO - codeparrot_training - Step 39693: {'lr': 5.096127413716731e-05, 'samples': 20323328, 'steps': 39693, 'batch_loss/train': 0.6494915883522481}
12/28/2021 04:34:35 - INFO - codeparrot_training - Step 39694: {'lr': 5.0951736372840255e-05, 'samples': 20323840, 'steps': 39694, 'batch_loss/train': 0.7339739948511124}
12/28/2021 04:34:49 - INFO - codeparrot_training - Step 39695: {'lr': 5.094219939985378e-05, 'samples': 20324352, 'steps': 39695, 'batch_loss/train': 0.7131502823904157}
12/28/2021 04:34:59 - INFO - codeparrot_training - Step 39696: {'lr': 5.0932663218246136e-05, 'samples': 20324864, 'steps': 39696, 'batch_loss/train': 0.7928963676095009}
12/28/2021 04:35:10 - INFO - codeparrot_training - Step 39697: {'lr': 5.0923127828054945e-05, 'samples': 20325376, 'steps': 39697, 'batch_loss/train': 0.9259353033266962}
12/28/2021 04:35:22 - INFO - codeparrot_training - Step 39698: {'lr': 5.0913593229318186e-05, 'samples': 20325888, 'steps': 39698, 'batch_loss/train': 0.6910555954091251}
12/28/2021 04:35:33 - INFO - codeparrot_training - Step 39699: {'lr': 5.0904059422073964e-05, 'samples': 20326400, 'steps': 39699, 'batch_loss/train': 0.7807842651382089}
12/28/2021 04:35:43 - INFO - codeparrot_training - Step 39700: {'lr': 5.089452640635991e-05, 'samples': 20326912, 'steps': 39700, 'batch_loss/train': 0.7532022586092353}
12/28/2021 04:35:54 - INFO - codeparrot_training - Step 39701: {'lr': 5.088499418221407e-05, 'samples': 20327424, 'steps': 39701, 'batch_loss/train': 0.7520043560070917}
12/28/2021 04:36:06 - INFO - codeparrot_training - Step 39702: {'lr': 5.08754627496743e-05, 'samples': 20327936, 'steps': 39702, 'batch_loss/train': 0.7325256809126586}
12/28/2021 04:36:16 - INFO - codeparrot_training - Step 39703: {'lr': 5.086593210877852e-05, 'samples': 20328448, 'steps': 39703, 'batch_loss/train': 0.8377211084589362}
12/28/2021 04:36:27 - INFO - codeparrot_training - Step 39704: {'lr': 5.085640225956459e-05, 'samples': 20328960, 'steps': 39704, 'batch_loss/train': 0.7649906938895583}
12/28/2021 04:36:40 - INFO - codeparrot_training - Step 39705: {'lr': 5.0846873202070486e-05, 'samples': 20329472, 'steps': 39705, 'batch_loss/train': 0.7677501016296446}
12/28/2021 04:36:50 - INFO - codeparrot_training - Step 39706: {'lr': 5.083734493633388e-05, 'samples': 20329984, 'steps': 39706, 'batch_loss/train': 0.8074010710697621}
12/28/2021 04:37:01 - INFO - codeparrot_training - Step 39707: {'lr': 5.082781746239285e-05, 'samples': 20330496, 'steps': 39707, 'batch_loss/train': 0.6933404211886227}
12/28/2021 04:37:13 - INFO - codeparrot_training - Step 39708: {'lr': 5.08182907802853e-05, 'samples': 20331008, 'steps': 39708, 'batch_loss/train': 0.653140138136223}
12/28/2021 04:37:24 - INFO - codeparrot_training - Step 39709: {'lr': 5.080876489004893e-05, 'samples': 20331520, 'steps': 39709, 'batch_loss/train': 0.7836046000011265}
12/28/2021 04:37:34 - INFO - codeparrot_training - Step 39710: {'lr': 5.079923979172163e-05, 'samples': 20332032, 'steps': 39710, 'batch_loss/train': 0.7895966839860193}
12/28/2021 04:37:47 - INFO - codeparrot_training - Step 39711: {'lr': 5.0789715485341477e-05, 'samples': 20332544, 'steps': 39711, 'batch_loss/train': 0.7994383564218879}
12/28/2021 04:37:58 - INFO - codeparrot_training - Step 39712: {'lr': 5.078019197094613e-05, 'samples': 20333056, 'steps': 39712, 'batch_loss/train': 0.7797977887094021}
12/28/2021 04:38:08 - INFO - codeparrot_training - Step 39713: {'lr': 5.0770669248573514e-05, 'samples': 20333568, 'steps': 39713, 'batch_loss/train': 0.7854027841240168}
12/28/2021 04:38:19 - INFO - codeparrot_training - Step 39714: {'lr': 5.076114731826151e-05, 'samples': 20334080, 'steps': 39714, 'batch_loss/train': 0.6859970227815211}
12/28/2021 04:38:31 - INFO - codeparrot_training - Step 39715: {'lr': 5.075162618004795e-05, 'samples': 20334592, 'steps': 39715, 'batch_loss/train': 0.7183624561876059}
12/28/2021 04:38:42 - INFO - codeparrot_training - Step 39716: {'lr': 5.074210583397068e-05, 'samples': 20335104, 'steps': 39716, 'batch_loss/train': 0.6956932513276115}
12/28/2021 04:38:53 - INFO - codeparrot_training - Step 39717: {'lr': 5.073258628006755e-05, 'samples': 20335616, 'steps': 39717, 'batch_loss/train': 0.7742658290080726}
12/28/2021 04:39:05 - INFO - codeparrot_training - Step 39718: {'lr': 5.072306751837644e-05, 'samples': 20336128, 'steps': 39718, 'batch_loss/train': 0.7182571869343519}
12/28/2021 04:39:16 - INFO - codeparrot_training - Step 39719: {'lr': 5.071354954893515e-05, 'samples': 20336640, 'steps': 39719, 'batch_loss/train': 0.7647685795091093}
12/28/2021 04:39:26 - INFO - codeparrot_training - Step 39720: {'lr': 5.0704032371781614e-05, 'samples': 20337152, 'steps': 39720, 'batch_loss/train': 0.8209935538470745}
12/28/2021 04:39:38 - INFO - codeparrot_training - Step 39721: {'lr': 5.069451598695346e-05, 'samples': 20337664, 'steps': 39721, 'batch_loss/train': 0.7504037320613861}
12/28/2021 04:39:49 - INFO - codeparrot_training - Step 39722: {'lr': 5.068500039448878e-05, 'samples': 20338176, 'steps': 39722, 'batch_loss/train': 0.7910297320922837}
12/28/2021 04:40:00 - INFO - codeparrot_training - Step 39723: {'lr': 5.0675485594425224e-05, 'samples': 20338688, 'steps': 39723, 'batch_loss/train': 0.8079738868400455}
12/28/2021 04:40:10 - INFO - codeparrot_training - Step 39724: {'lr': 5.0665971586800614e-05, 'samples': 20339200, 'steps': 39724, 'batch_loss/train': 0.6965051302686334}
12/28/2021 04:40:23 - INFO - codeparrot_training - Step 39725: {'lr': 5.0656458371652975e-05, 'samples': 20339712, 'steps': 39725, 'batch_loss/train': 0.7764860829338431}
12/28/2021 04:40:34 - INFO - codeparrot_training - Step 39726: {'lr': 5.064694594901989e-05, 'samples': 20340224, 'steps': 39726, 'batch_loss/train': 0.7272054459899664}
12/28/2021 04:40:44 - INFO - codeparrot_training - Step 39727: {'lr': 5.06374343189393e-05, 'samples': 20340736, 'steps': 39727, 'batch_loss/train': 0.6896307705901563}
12/28/2021 04:40:57 - INFO - codeparrot_training - Step 39728: {'lr': 5.062792348144899e-05, 'samples': 20341248, 'steps': 39728, 'batch_loss/train': 0.6899226435925812}
12/28/2021 04:41:07 - INFO - codeparrot_training - Step 39729: {'lr': 5.0618413436586767e-05, 'samples': 20341760, 'steps': 39729, 'batch_loss/train': 0.8545089745894074}
12/28/2021 04:41:18 - INFO - codeparrot_training - Step 39730: {'lr': 5.060890418439046e-05, 'samples': 20342272, 'steps': 39730, 'batch_loss/train': 0.7304475172422826}
12/28/2021 04:41:30 - INFO - codeparrot_training - Step 39731: {'lr': 5.059939572489794e-05, 'samples': 20342784, 'steps': 39731, 'batch_loss/train': 0.7346817152574658}
12/28/2021 04:41:41 - INFO - codeparrot_training - Step 39732: {'lr': 5.058988805814677e-05, 'samples': 20343296, 'steps': 39732, 'batch_loss/train': 0.631346293259412}
12/28/2021 04:41:51 - INFO - codeparrot_training - Step 39733: {'lr': 5.058038118417499e-05, 'samples': 20343808, 'steps': 39733, 'batch_loss/train': 0.7176552200689912}
12/28/2021 04:42:04 - INFO - codeparrot_training - Step 39734: {'lr': 5.057087510302039e-05, 'samples': 20344320, 'steps': 39734, 'batch_loss/train': 0.7818622312042862}
12/28/2021 04:42:15 - INFO - codeparrot_training - Step 39735: {'lr': 5.056136981472057e-05, 'samples': 20344832, 'steps': 39735, 'batch_loss/train': 0.7108720196411014}
12/28/2021 04:42:25 - INFO - codeparrot_training - Step 39736: {'lr': 5.055186531931347e-05, 'samples': 20345344, 'steps': 39736, 'batch_loss/train': 0.9735381733626127}
12/28/2021 04:42:36 - INFO - codeparrot_training - Step 39737: {'lr': 5.054236161683692e-05, 'samples': 20345856, 'steps': 39737, 'batch_loss/train': 0.82792554423213}
12/28/2021 04:42:48 - INFO - codeparrot_training - Step 39738: {'lr': 5.0532858707328575e-05, 'samples': 20346368, 'steps': 39738, 'batch_loss/train': 0.8212988143786788}
12/28/2021 04:42:59 - INFO - codeparrot_training - Step 39739: {'lr': 5.052335659082619e-05, 'samples': 20346880, 'steps': 39739, 'batch_loss/train': 0.7255764249712229}
12/28/2021 04:43:10 - INFO - codeparrot_training - Step 39740: {'lr':
5.051385526736777e-05, 'samples': 20347392, 'steps': 39740, 'batch_loss/train': 0.7419421775266528} 12/28/2021 04:43:22 - INFO - codeparrot_training - Step 39741: {'lr': 5.0504354736990856e-05, 'samples': 20347904, 'steps': 39741, 'batch_loss/train': 1.1137117072939873} 12/28/2021 04:43:33 - INFO - codeparrot_training - Step 39742: {'lr': 5.04948549997333e-05, 'samples': 20348416, 'steps': 39742, 'batch_loss/train': 0.6608170173713006} 12/28/2021 04:43:44 - INFO - codeparrot_training - Step 39743: {'lr': 5.048535605563287e-05, 'samples': 20348928, 'steps': 39743, 'batch_loss/train': 0.6216358547098935} 12/28/2021 04:43:56 - INFO - codeparrot_training - Step 39744: {'lr': 5.0475857904727334e-05, 'samples': 20349440, 'steps': 39744, 'batch_loss/train': 0.746854430064559} 12/28/2021 04:44:06 - INFO - codeparrot_training - Step 39745: {'lr': 5.046636054705447e-05, 'samples': 20349952, 'steps': 39745, 'batch_loss/train': 0.7288057929836214} 12/28/2021 04:44:17 - INFO - codeparrot_training - Step 39746: {'lr': 5.0456863982652063e-05, 'samples': 20350464, 'steps': 39746, 'batch_loss/train': 0.7604599501937628} 12/28/2021 04:44:27 - INFO - codeparrot_training - Step 39747: {'lr': 5.044736821155768e-05, 'samples': 20350976, 'steps': 39747, 'batch_loss/train': 0.6768269825261086} 12/28/2021 04:44:40 - INFO - codeparrot_training - Step 39748: {'lr': 5.043787323380936e-05, 'samples': 20351488, 'steps': 39748, 'batch_loss/train': 0.9991639945656061} 12/28/2021 04:44:50 - INFO - codeparrot_training - Step 39749: {'lr': 5.042837904944461e-05, 'samples': 20352000, 'steps': 39749, 'batch_loss/train': 0.6718710745917633} 12/28/2021 04:45:01 - INFO - codeparrot_training - Step 39750: {'lr': 5.0418885658501226e-05, 'samples': 20352512, 'steps': 39750, 'batch_loss/train': 0.6270379375200719} 12/28/2021 04:45:14 - INFO - codeparrot_training - Step 39751: {'lr': 5.0409393061017136e-05, 'samples': 20353024, 'steps': 39751, 'batch_loss/train': 0.7381706379819661} 12/28/2021 04:45:24 - INFO 
- codeparrot_training - Step 39752: {'lr': 5.0399901257029865e-05, 'samples': 20353536, 'steps': 39752, 'batch_loss/train': 0.7562128135468811} 12/28/2021 04:45:35 - INFO - codeparrot_training - Step 39753: {'lr': 5.03904102465772e-05, 'samples': 20354048, 'steps': 39753, 'batch_loss/train': 0.759062435477972} 12/28/2021 04:45:47 - INFO - codeparrot_training - Step 39754: {'lr': 5.0380920029696906e-05, 'samples': 20354560, 'steps': 39754, 'batch_loss/train': 0.6560955638997257} 12/28/2021 04:45:58 - INFO - codeparrot_training - Step 39755: {'lr': 5.037143060642671e-05, 'samples': 20355072, 'steps': 39755, 'batch_loss/train': 0.9190838998183608} 12/28/2021 04:46:09 - INFO - codeparrot_training - Step 39756: {'lr': 5.0361941976804326e-05, 'samples': 20355584, 'steps': 39756, 'batch_loss/train': 0.6910492079332471} 12/28/2021 04:46:19 - INFO - codeparrot_training - Step 39757: {'lr': 5.035245414086753e-05, 'samples': 20356096, 'steps': 39757, 'batch_loss/train': 0.7313598245382309} 12/28/2021 04:46:32 - INFO - codeparrot_training - Step 39758: {'lr': 5.034296709865388e-05, 'samples': 20356608, 'steps': 39758, 'batch_loss/train': 0.6483521088957787} 12/28/2021 04:46:42 - INFO - codeparrot_training - Step 39759: {'lr': 5.0333480850201285e-05, 'samples': 20357120, 'steps': 39759, 'batch_loss/train': 0.6653044098056853} 12/28/2021 04:46:53 - INFO - codeparrot_training - Step 39760: {'lr': 5.032399539554741e-05, 'samples': 20357632, 'steps': 39760, 'batch_loss/train': 0.8537539206445217} 12/28/2021 04:47:05 - INFO - codeparrot_training - Step 39761: {'lr': 5.031451073472981e-05, 'samples': 20358144, 'steps': 39761, 'batch_loss/train': 0.7028817769605666} 12/28/2021 04:47:15 - INFO - codeparrot_training - Step 39762: {'lr': 5.0305026867786405e-05, 'samples': 20358656, 'steps': 39762, 'batch_loss/train': 0.6612722160643898} 12/28/2021 04:47:26 - INFO - codeparrot_training - Step 39763: {'lr': 5.0295543794754844e-05, 'samples': 20359168, 'steps': 39763, 'batch_loss/train': 
0.7122567687183619} 12/28/2021 04:47:39 - INFO - codeparrot_training - Step 39764: {'lr': 5.028606151567275e-05, 'samples': 20359680, 'steps': 39764, 'batch_loss/train': 0.7123064855113626} 12/28/2021 04:47:50 - INFO - codeparrot_training - Step 39765: {'lr': 5.0276580030577806e-05, 'samples': 20360192, 'steps': 39765, 'batch_loss/train': 0.6432053820462897} 12/28/2021 04:48:00 - INFO - codeparrot_training - Step 39766: {'lr': 5.0267099339507904e-05, 'samples': 20360704, 'steps': 39766, 'batch_loss/train': 0.7383444863371551} 12/28/2021 04:48:11 - INFO - codeparrot_training - Step 39767: {'lr': 5.025761944250051e-05, 'samples': 20361216, 'steps': 39767, 'batch_loss/train': 0.5691069369204342} 12/28/2021 04:48:23 - INFO - codeparrot_training - Step 39768: {'lr': 5.0248140339593405e-05, 'samples': 20361728, 'steps': 39768, 'batch_loss/train': 0.6893642018549144} 12/28/2021 04:48:34 - INFO - codeparrot_training - Step 39769: {'lr': 5.023866203082428e-05, 'samples': 20362240, 'steps': 39769, 'batch_loss/train': 0.8177347593009472} 12/28/2021 04:48:44 - INFO - codeparrot_training - Step 39770: {'lr': 5.0229184516230816e-05, 'samples': 20362752, 'steps': 39770, 'batch_loss/train': 0.6641706889495254} 12/28/2021 04:48:57 - INFO - codeparrot_training - Step 39771: {'lr': 5.0219707795850665e-05, 'samples': 20363264, 'steps': 39771, 'batch_loss/train': 0.7218467378988862} 12/28/2021 04:49:08 - INFO - codeparrot_training - Step 39772: {'lr': 5.021023186972154e-05, 'samples': 20363776, 'steps': 39772, 'batch_loss/train': 0.7177609419450164} 12/28/2021 04:49:18 - INFO - codeparrot_training - Step 39773: {'lr': 5.020075673788108e-05, 'samples': 20364288, 'steps': 39773, 'batch_loss/train': 0.688247271347791} 12/28/2021 04:49:30 - INFO - codeparrot_training - Step 39774: {'lr': 5.019128240036697e-05, 'samples': 20364800, 'steps': 39774, 'batch_loss/train': 0.7240539803169668} 12/28/2021 04:49:41 - INFO - codeparrot_training - Step 39775: {'lr': 5.0181808857216956e-05, 'samples': 
20365312, 'steps': 39775, 'batch_loss/train': 0.7940141563303769} 12/28/2021 04:49:52 - INFO - codeparrot_training - Step 39776: {'lr': 5.017233610846847e-05, 'samples': 20365824, 'steps': 39776, 'batch_loss/train': 0.711829187348485} 12/28/2021 04:50:02 - INFO - codeparrot_training - Step 39777: {'lr': 5.016286415415946e-05, 'samples': 20366336, 'steps': 39777, 'batch_loss/train': 0.6623277934268117} 12/28/2021 04:50:14 - INFO - codeparrot_training - Step 39778: {'lr': 5.015339299432739e-05, 'samples': 20366848, 'steps': 39778, 'batch_loss/train': 0.7839710447005928} 12/28/2021 04:50:25 - INFO - codeparrot_training - Step 39779: {'lr': 5.0143922629009966e-05, 'samples': 20367360, 'steps': 39779, 'batch_loss/train': 0.795794123550877} 12/28/2021 04:50:35 - INFO - codeparrot_training - Step 39780: {'lr': 5.013445305824485e-05, 'samples': 20367872, 'steps': 39780, 'batch_loss/train': 0.7642777375876904} 12/28/2021 04:50:48 - INFO - codeparrot_training - Step 39781: {'lr': 5.0124984282069684e-05, 'samples': 20368384, 'steps': 39781, 'batch_loss/train': 0.4431344246841036} 12/28/2021 04:50:59 - INFO - codeparrot_training - Step 39782: {'lr': 5.011551630052211e-05, 'samples': 20368896, 'steps': 39782, 'batch_loss/train': 0.7656749924644828} 12/28/2021 04:51:10 - INFO - codeparrot_training - Step 39783: {'lr': 5.010604911363975e-05, 'samples': 20369408, 'steps': 39783, 'batch_loss/train': 0.7348893592134118} 12/28/2021 04:51:22 - INFO - codeparrot_training - Step 39784: {'lr': 5.009658272146028e-05, 'samples': 20369920, 'steps': 39784, 'batch_loss/train': 0.6924156388267875} 12/28/2021 04:51:32 - INFO - codeparrot_training - Step 39785: {'lr': 5.008711712402131e-05, 'samples': 20370432, 'steps': 39785, 'batch_loss/train': 0.7947660475037992} 12/28/2021 04:51:43 - INFO - codeparrot_training - Step 39786: {'lr': 5.007765232136055e-05, 'samples': 20370944, 'steps': 39786, 'batch_loss/train': 0.7219466683454812} 12/28/2021 04:51:55 - INFO - codeparrot_training - Step 39787: 
{'lr': 5.00681883135154e-05, 'samples': 20371456, 'steps': 39787, 'batch_loss/train': 0.7146967765875161} 12/28/2021 04:52:06 - INFO - codeparrot_training - Step 39788: {'lr': 5.0058725100523726e-05, 'samples': 20371968, 'steps': 39788, 'batch_loss/train': 0.7513910047709942} 12/28/2021 04:52:16 - INFO - codeparrot_training - Step 39789: {'lr': 5.004926268242313e-05, 'samples': 20372480, 'steps': 39789, 'batch_loss/train': 0.8639144456246868} 12/28/2021 04:52:27 - INFO - codeparrot_training - Step 39790: {'lr': 5.0039801059251125e-05, 'samples': 20372992, 'steps': 39790, 'batch_loss/train': 0.6865161065943539} 12/28/2021 04:52:40 - INFO - codeparrot_training - Step 39791: {'lr': 5.0030340231045276e-05, 'samples': 20373504, 'steps': 39791, 'batch_loss/train': 0.7515451656654477} 12/28/2021 04:52:50 - INFO - codeparrot_training - Step 39792: {'lr': 5.002088019784343e-05, 'samples': 20374016, 'steps': 39792, 'batch_loss/train': 0.6583125074976124} 12/28/2021 04:53:01 - INFO - codeparrot_training - Step 39793: {'lr': 5.001142095968297e-05, 'samples': 20374528, 'steps': 39793, 'batch_loss/train': 0.715933442581445} 12/28/2021 04:53:13 - INFO - codeparrot_training - Step 39794: {'lr': 5.000196251660161e-05, 'samples': 20375040, 'steps': 39794, 'batch_loss/train': 0.9800722101936117} 12/28/2021 04:53:24 - INFO - codeparrot_training - Step 39795: {'lr': 4.999250486863693e-05, 'samples': 20375552, 'steps': 39795, 'batch_loss/train': 1.1365963323041797} 12/28/2021 04:53:35 - INFO - codeparrot_training - Step 39796: {'lr': 4.9983048015826537e-05, 'samples': 20376064, 'steps': 39796, 'batch_loss/train': 0.753644231474027} 12/28/2021 04:53:45 - INFO - codeparrot_training - Step 39797: {'lr': 4.9973591958208e-05, 'samples': 20376576, 'steps': 39797, 'batch_loss/train': 0.6403349335305393} 12/28/2021 04:53:57 - INFO - codeparrot_training - Step 39798: {'lr': 4.996413669581895e-05, 'samples': 20377088, 'steps': 39798, 'batch_loss/train': 0.6380948266014457} 12/28/2021 04:54:08 - 
INFO - codeparrot_training - Step 39799: {'lr': 4.995468222869695e-05, 'samples': 20377600, 'steps': 39799, 'batch_loss/train': 0.6719279219396412} 12/28/2021 04:54:19 - INFO - codeparrot_training - Step 39800: {'lr': 4.994522855687961e-05, 'samples': 20378112, 'steps': 39800, 'batch_loss/train': 0.8265840623062104} 12/28/2021 04:54:31 - INFO - codeparrot_training - Step 39801: {'lr': 4.9935775680404554e-05, 'samples': 20378624, 'steps': 39801, 'batch_loss/train': 0.728833005297929} 12/28/2021 04:54:42 - INFO - codeparrot_training - Step 39802: {'lr': 4.992632359930921e-05, 'samples': 20379136, 'steps': 39802, 'batch_loss/train': 0.7408273234032094} 12/28/2021 04:54:53 - INFO - codeparrot_training - Step 39803: {'lr': 4.991687231363137e-05, 'samples': 20379648, 'steps': 39803, 'batch_loss/train': 0.6991786288563162} 12/28/2021 04:55:05 - INFO - codeparrot_training - Step 39804: {'lr': 4.990742182340843e-05, 'samples': 20380160, 'steps': 39804, 'batch_loss/train': 0.8360333554446697} 12/28/2021 04:55:15 - INFO - codeparrot_training - Step 39805: {'lr': 4.9897972128678026e-05, 'samples': 20380672, 'steps': 39805, 'batch_loss/train': 0.7433055061846972} 12/28/2021 04:55:26 - INFO - codeparrot_training - Step 39806: {'lr': 4.988852322947773e-05, 'samples': 20381184, 'steps': 39806, 'batch_loss/train': 0.7587762866169214} 12/28/2021 04:55:38 - INFO - codeparrot_training - Step 39807: {'lr': 4.987907512584511e-05, 'samples': 20381696, 'steps': 39807, 'batch_loss/train': 0.7211737306788564} 12/28/2021 04:55:49 - INFO - codeparrot_training - Step 39808: {'lr': 4.986962781781773e-05, 'samples': 20382208, 'steps': 39808, 'batch_loss/train': 0.736599646625109} 12/28/2021 04:55:59 - INFO - codeparrot_training - Step 39809: {'lr': 4.986018130543316e-05, 'samples': 20382720, 'steps': 39809, 'batch_loss/train': 0.8070330210030079} 12/28/2021 04:56:10 - INFO - codeparrot_training - Step 39810: {'lr': 4.9850735588728904e-05, 'samples': 20383232, 'steps': 39810, 'batch_loss/train': 
0.7934607099741697} 12/28/2021 04:56:23 - INFO - codeparrot_training - Step 39811: {'lr': 4.984129066774257e-05, 'samples': 20383744, 'steps': 39811, 'batch_loss/train': 0.8413771723862737} 12/28/2021 04:56:33 - INFO - codeparrot_training - Step 39812: {'lr': 4.983184654251175e-05, 'samples': 20384256, 'steps': 39812, 'batch_loss/train': 0.7958390507847071} 12/28/2021 04:56:44 - INFO - codeparrot_training - Step 39813: {'lr': 4.9822403213073795e-05, 'samples': 20384768, 'steps': 39813, 'batch_loss/train': 0.7250747121870518} 12/28/2021 04:56:56 - INFO - codeparrot_training - Step 39814: {'lr': 4.981296067946645e-05, 'samples': 20385280, 'steps': 39814, 'batch_loss/train': 0.7380621936172247} 12/28/2021 04:57:07 - INFO - codeparrot_training - Step 39815: {'lr': 4.980351894172724e-05, 'samples': 20385792, 'steps': 39815, 'batch_loss/train': 0.6155534646240994} 12/28/2021 04:57:18 - INFO - codeparrot_training - Step 39816: {'lr': 4.979407799989358e-05, 'samples': 20386304, 'steps': 39816, 'batch_loss/train': 0.8064520820043981} 12/28/2021 04:57:30 - INFO - codeparrot_training - Step 39817: {'lr': 4.978463785400303e-05, 'samples': 20386816, 'steps': 39817, 'batch_loss/train': 0.7061855751089752} 12/28/2021 04:57:40 - INFO - codeparrot_training - Step 39818: {'lr': 4.977519850409326e-05, 'samples': 20387328, 'steps': 39818, 'batch_loss/train': 0.7361379065550864} 12/28/2021 04:57:51 - INFO - codeparrot_training - Step 39819: {'lr': 4.976575995020166e-05, 'samples': 20387840, 'steps': 39819, 'batch_loss/train': 0.7112049299757928} 12/28/2021 04:58:01 - INFO - codeparrot_training - Step 39820: {'lr': 4.975632219236578e-05, 'samples': 20388352, 'steps': 39820, 'batch_loss/train': 0.6798792658373713} 12/28/2021 04:58:14 - INFO - codeparrot_training - Step 39821: {'lr': 4.974688523062315e-05, 'samples': 20388864, 'steps': 39821, 'batch_loss/train': 0.7578471945598722} 12/28/2021 04:58:25 - INFO - codeparrot_training - Step 39822: {'lr': 4.973744906501129e-05, 'samples': 
20389376, 'steps': 39822, 'batch_loss/train': 0.7402013633400202} 12/28/2021 04:58:35 - INFO - codeparrot_training - Step 39823: {'lr': 4.972801369556773e-05, 'samples': 20389888, 'steps': 39823, 'batch_loss/train': 0.750357611104846} 12/28/2021 04:58:48 - INFO - codeparrot_training - Step 39824: {'lr': 4.9718579122329967e-05, 'samples': 20390400, 'steps': 39824, 'batch_loss/train': 0.5647259424440563} 12/28/2021 04:58:58 - INFO - codeparrot_training - Step 39825: {'lr': 4.970914534533552e-05, 'samples': 20390912, 'steps': 39825, 'batch_loss/train': 0.6370991412550211} 12/28/2021 04:59:09 - INFO - codeparrot_training - Step 39826: {'lr': 4.969971236462187e-05, 'samples': 20391424, 'steps': 39826, 'batch_loss/train': 0.7458902378566563} 12/28/2021 04:59:21 - INFO - codeparrot_training - Step 39827: {'lr': 4.969028018022659e-05, 'samples': 20391936, 'steps': 39827, 'batch_loss/train': 0.8265282995998859} 12/28/2021 04:59:32 - INFO - codeparrot_training - Step 39828: {'lr': 4.9680848792187013e-05, 'samples': 20392448, 'steps': 39828, 'batch_loss/train': 0.5846214359626174} 12/28/2021 04:59:43 - INFO - codeparrot_training - Step 39829: {'lr': 4.967141820054077e-05, 'samples': 20392960, 'steps': 39829, 'batch_loss/train': 0.7691222685389221} 12/28/2021 04:59:53 - INFO - codeparrot_training - Step 39830: {'lr': 4.966198840532543e-05, 'samples': 20393472, 'steps': 39830, 'batch_loss/train': 0.773607611656189} 12/28/2021 05:00:06 - INFO - codeparrot_training - Step 39831: {'lr': 4.9652559406578316e-05, 'samples': 20393984, 'steps': 39831, 'batch_loss/train': 0.6831269297399558} 12/28/2021 05:00:17 - INFO - codeparrot_training - Step 39832: {'lr': 4.9643131204336945e-05, 'samples': 20394496, 'steps': 39832, 'batch_loss/train': 0.48173614748520777} 12/28/2021 05:00:28 - INFO - codeparrot_training - Step 39833: {'lr': 4.963370379863885e-05, 'samples': 20395008, 'steps': 39833, 'batch_loss/train': 0.9394326624460518} 12/28/2021 05:00:40 - INFO - codeparrot_training - Step 
39834: {'lr': 4.96242771895215e-05, 'samples': 20395520, 'steps': 39834, 'batch_loss/train': 0.660415331250988} 12/28/2021 05:00:50 - INFO - codeparrot_training - Step 39835: {'lr': 4.961485137702237e-05, 'samples': 20396032, 'steps': 39835, 'batch_loss/train': 0.8097616462036967} 12/28/2021 05:01:01 - INFO - codeparrot_training - Step 39836: {'lr': 4.96054263611789e-05, 'samples': 20396544, 'steps': 39836, 'batch_loss/train': 0.8355239061638713} 12/28/2021 05:01:13 - INFO - codeparrot_training - Step 39837: {'lr': 4.959600214202861e-05, 'samples': 20397056, 'steps': 39837, 'batch_loss/train': 0.7024563648737967} 12/28/2021 05:01:24 - INFO - codeparrot_training - Step 39838: {'lr': 4.958657871960901e-05, 'samples': 20397568, 'steps': 39838, 'batch_loss/train': 0.7373758647590876} 12/28/2021 05:01:34 - INFO - codeparrot_training - Step 39839: {'lr': 4.9577156093957384e-05, 'samples': 20398080, 'steps': 39839, 'batch_loss/train': 0.6431780674029142} 12/28/2021 05:01:45 - INFO - codeparrot_training - Step 39840: {'lr': 4.9567734265111354e-05, 'samples': 20398592, 'steps': 39840, 'batch_loss/train': 0.6619913543108851} 12/28/2021 05:01:58 - INFO - codeparrot_training - Step 39841: {'lr': 4.9558313233108385e-05, 'samples': 20399104, 'steps': 39841, 'batch_loss/train': 0.7218330255709589} 12/28/2021 05:02:08 - INFO - codeparrot_training - Step 39842: {'lr': 4.954889299798585e-05, 'samples': 20399616, 'steps': 39842, 'batch_loss/train': 0.8011735803447664} 12/28/2021 05:02:19 - INFO - codeparrot_training - Step 39843: {'lr': 4.953947355978114e-05, 'samples': 20400128, 'steps': 39843, 'batch_loss/train': 0.7518304942641407} 12/28/2021 05:02:31 - INFO - codeparrot_training - Step 39844: {'lr': 4.953005491853191e-05, 'samples': 20400640, 'steps': 39844, 'batch_loss/train': 0.771454282104969} 12/28/2021 05:02:42 - INFO - codeparrot_training - Step 39845: {'lr': 4.9520637074275455e-05, 'samples': 20401152, 'steps': 39845, 'batch_loss/train': 0.7881674477830529} 12/28/2021 
05:02:52 - INFO - codeparrot_training - Step 39846: {'lr': 4.9511220027049166e-05, 'samples': 20401664, 'steps': 39846, 'batch_loss/train': 0.7169817630201578} 12/28/2021 05:03:03 - INFO - codeparrot_training - Step 39847: {'lr': 4.950180377689067e-05, 'samples': 20402176, 'steps': 39847, 'batch_loss/train': 0.8663263898342848} 12/28/2021 05:03:15 - INFO - codeparrot_training - Step 39848: {'lr': 4.9492388323837265e-05, 'samples': 20402688, 'steps': 39848, 'batch_loss/train': 0.690076083294116} 12/28/2021 05:03:26 - INFO - codeparrot_training - Step 39849: {'lr': 4.948297366792642e-05, 'samples': 20403200, 'steps': 39849, 'batch_loss/train': 0.730540378484875} 12/28/2021 05:03:36 - INFO - codeparrot_training - Step 39850: {'lr': 4.9473559809195546e-05, 'samples': 20403712, 'steps': 39850, 'batch_loss/train': 0.6886882432736456} 12/28/2021 05:03:49 - INFO - codeparrot_training - Step 39851: {'lr': 4.946414674768207e-05, 'samples': 20404224, 'steps': 39851, 'batch_loss/train': 0.6837566918693483} 12/28/2021 05:04:00 - INFO - codeparrot_training - Step 39852: {'lr': 4.945473448342344e-05, 'samples': 20404736, 'steps': 39852, 'batch_loss/train': 0.6485180892050266} 12/28/2021 05:04:11 - INFO - codeparrot_training - Step 39853: {'lr': 4.944532301645713e-05, 'samples': 20405248, 'steps': 39853, 'batch_loss/train': 0.6500386514235288} 12/28/2021 05:04:23 - INFO - codeparrot_training - Step 39854: {'lr': 4.943591234682035e-05, 'samples': 20405760, 'steps': 39854, 'batch_loss/train': 0.828183725476265} 12/28/2021 05:04:33 - INFO - codeparrot_training - Step 39855: {'lr': 4.942650247455074e-05, 'samples': 20406272, 'steps': 39855, 'batch_loss/train': 0.7558549987152219} 12/28/2021 05:04:44 - INFO - codeparrot_training - Step 39856: {'lr': 4.941709339968567e-05, 'samples': 20406784, 'steps': 39856, 'batch_loss/train': 0.7619363917037845} 12/28/2021 05:04:57 - INFO - codeparrot_training - Step 39857: {'lr': 4.940768512226243e-05, 'samples': 20407296, 'steps': 39857, 
'batch_loss/train': 0.7657769601792097} 12/28/2021 05:05:08 - INFO - codeparrot_training - Step 39858: {'lr': 4.9398277642318496e-05, 'samples': 20407808, 'steps': 39858, 'batch_loss/train': 0.6369429999031126} 12/28/2021 05:05:18 - INFO - codeparrot_training - Step 39859: {'lr': 4.938887095989128e-05, 'samples': 20408320, 'steps': 39859, 'batch_loss/train': 0.746681297197938} 12/28/2021 05:05:29 - INFO - codeparrot_training - Step 39860: {'lr': 4.9379465075018145e-05, 'samples': 20408832, 'steps': 39860, 'batch_loss/train': 0.7824710458517075} 12/28/2021 05:05:41 - INFO - codeparrot_training - Step 39861: {'lr': 4.937005998773653e-05, 'samples': 20409344, 'steps': 39861, 'batch_loss/train': 0.5884156879037619} 12/28/2021 05:05:52 - INFO - codeparrot_training - Step 39862: {'lr': 4.936065569808379e-05, 'samples': 20409856, 'steps': 39862, 'batch_loss/train': 0.7188476109877229} 12/28/2021 05:06:02 - INFO - codeparrot_training - Step 39863: {'lr': 4.935125220609732e-05, 'samples': 20410368, 'steps': 39863, 'batch_loss/train': 0.7795562222599983} 12/28/2021 05:06:15 - INFO - codeparrot_training - Step 39864: {'lr': 4.9341849511814505e-05, 'samples': 20410880, 'steps': 39864, 'batch_loss/train': 0.8405674798414111} 12/28/2021 05:06:25 - INFO - codeparrot_training - Step 39865: {'lr': 4.933244761527272e-05, 'samples': 20411392, 'steps': 39865, 'batch_loss/train': 0.716137545183301} 12/28/2021 05:06:36 - INFO - codeparrot_training - Step 39866: {'lr': 4.9323046516509376e-05, 'samples': 20411904, 'steps': 39866, 'batch_loss/train': 0.7245471007190645} 12/28/2021 05:06:46 - INFO - codeparrot_training - Step 39867: {'lr': 4.931364621556189e-05, 'samples': 20412416, 'steps': 39867, 'batch_loss/train': 0.7262518396601081} 12/28/2021 05:07:00 - INFO - codeparrot_training - Step 39868: {'lr': 4.93042467124675e-05, 'samples': 20412928, 'steps': 39868, 'batch_loss/train': 0.4915123031241819} 12/28/2021 05:07:10 - INFO - codeparrot_training - Step 39869: {'lr': 
4.9294848007263584e-05, 'samples': 20413440, 'steps': 39869, 'batch_loss/train': 0.8148463163524866} 12/28/2021 05:07:21 - INFO - codeparrot_training - Step 39870: {'lr': 4.92854500999877e-05, 'samples': 20413952, 'steps': 39870, 'batch_loss/train': 0.8619000297039747} 12/28/2021 05:07:33 - INFO - codeparrot_training - Step 39871: {'lr': 4.9276052990677e-05, 'samples': 20414464, 'steps': 39871, 'batch_loss/train': 0.7833787677809596} 12/28/2021 05:07:43 - INFO - codeparrot_training - Step 39872: {'lr': 4.926665667936889e-05, 'samples': 20414976, 'steps': 39872, 'batch_loss/train': 0.7840820904821157} 12/28/2021 05:07:54 - INFO - codeparrot_training - Step 39873: {'lr': 4.925726116610085e-05, 'samples': 20415488, 'steps': 39873, 'batch_loss/train': 0.833738328423351} 12/28/2021 05:08:06 - INFO - codeparrot_training - Step 39874: {'lr': 4.9247866450910116e-05, 'samples': 20416000, 'steps': 39874, 'batch_loss/train': 0.727863809093833} 12/28/2021 05:08:17 - INFO - codeparrot_training - Step 39875: {'lr': 4.923847253383404e-05, 'samples': 20416512, 'steps': 39875, 'batch_loss/train': 0.8411591777112335} 12/28/2021 05:08:28 - INFO - codeparrot_training - Step 39876: {'lr': 4.922907941490998e-05, 'samples': 20417024, 'steps': 39876, 'batch_loss/train': 0.77149255014956} 12/28/2021 05:08:38 - INFO - codeparrot_training - Step 39877: {'lr': 4.921968709417532e-05, 'samples': 20417536, 'steps': 39877, 'batch_loss/train': 0.7150974697433412} 12/28/2021 05:08:51 - INFO - codeparrot_training - Step 39878: {'lr': 4.9210295571667366e-05, 'samples': 20418048, 'steps': 39878, 'batch_loss/train': 0.746669284068048} 12/28/2021 05:09:02 - INFO - codeparrot_training - Step 39879: {'lr': 4.9200904847423514e-05, 'samples': 20418560, 'steps': 39879, 'batch_loss/train': 0.6559899556450546} 12/28/2021 05:09:12 - INFO - codeparrot_training - Step 39880: {'lr': 4.919151492148091e-05, 'samples': 20419072, 'steps': 39880, 'batch_loss/train': 0.755777764134109} 12/28/2021 05:09:25 - INFO - 
codeparrot_training - Step 39881: {'lr': 4.9182125793877096e-05, 'samples': 20419584, 'steps': 39881, 'batch_loss/train': 0.7089682864025235} 12/28/2021 05:09:35 - INFO - codeparrot_training - Step 39882: {'lr': 4.9172737464649395e-05, 'samples': 20420096, 'steps': 39882, 'batch_loss/train': 0.7699264555703849} 12/28/2021 05:09:46 - INFO - codeparrot_training - Step 39883: {'lr': 4.9163349933834923e-05, 'samples': 20420608, 'steps': 39883, 'batch_loss/train': 0.6184877984924242} 12/28/2021 05:09:58 - INFO - codeparrot_training - Step 39884: {'lr': 4.915396320147128e-05, 'samples': 20421120, 'steps': 39884, 'batch_loss/train': 0.5573945879004896} 12/28/2021 05:10:09 - INFO - codeparrot_training - Step 39885: {'lr': 4.914457726759558e-05, 'samples': 20421632, 'steps': 39885, 'batch_loss/train': 0.6796557670459151} 12/28/2021 05:10:19 - INFO - codeparrot_training - Step 39886: {'lr': 4.913519213224518e-05, 'samples': 20422144, 'steps': 39886, 'batch_loss/train': 0.7077162365894765} 12/28/2021 05:10:32 - INFO - codeparrot_training - Step 39887: {'lr': 4.912580779545742e-05, 'samples': 20422656, 'steps': 39887, 'batch_loss/train': 0.711505691986531} 12/28/2021 05:10:43 - INFO - codeparrot_training - Step 39888: {'lr': 4.911642425726962e-05, 'samples': 20423168, 'steps': 39888, 'batch_loss/train': 0.8020911471685395} 12/28/2021 05:10:53 - INFO - codeparrot_training - Step 39889: {'lr': 4.9107041517719055e-05, 'samples': 20423680, 'steps': 39889, 'batch_loss/train': 0.7065968783572316} 12/28/2021 05:11:04 - INFO - codeparrot_training - Step 39890: {'lr': 4.909765957684301e-05, 'samples': 20424192, 'steps': 39890, 'batch_loss/train': 0.7949862158857286} 12/28/2021 05:11:16 - INFO - codeparrot_training - Step 39891: {'lr': 4.908827843467883e-05, 'samples': 20424704, 'steps': 39891, 'batch_loss/train': 0.7171989595517516} 12/28/2021 05:11:27 - INFO - codeparrot_training - Step 39892: {'lr': 4.907889809126381e-05, 'samples': 20425216, 'steps': 39892, 'batch_loss/train': 
0.8670126684010029}
12/28/2021 05:11:37 - INFO - codeparrot_training - Step 39893: {'lr': 4.90695185466353e-05, 'samples': 20425728, 'steps': 39893, 'batch_loss/train': 0.8427414791658521}
12/28/2021 05:11:50 - INFO - codeparrot_training - Step 39894: {'lr': 4.906013980083043e-05, 'samples': 20426240, 'steps': 39894, 'batch_loss/train': 0.7511925660073757}
12/28/2021 05:12:00 - INFO - codeparrot_training - Step 39895: {'lr': 4.905076185388649e-05, 'samples': 20426752, 'steps': 39895, 'batch_loss/train': 0.9598741484805942}
12/28/2021 05:12:11 - INFO - codeparrot_training - Step 39896: {'lr': 4.9041384705841016e-05, 'samples': 20427264, 'steps': 39896, 'batch_loss/train': 0.7220978578552604}
12/28/2021 05:12:22 - INFO - codeparrot_training - Step 39897: {'lr': 4.9032008356731025e-05, 'samples': 20427776, 'steps': 39897, 'batch_loss/train': 0.6954730511642992}
12/28/2021 05:12:35 - INFO - codeparrot_training - Step 39898: {'lr': 4.9022632806593834e-05, 'samples': 20428288, 'steps': 39898, 'batch_loss/train': 0.7235852954909205}
12/28/2021 05:12:45 - INFO - codeparrot_training - Step 39899: {'lr': 4.901325805546689e-05, 'samples': 20428800, 'steps': 39899, 'batch_loss/train': 0.8040645150467753}
12/28/2021 05:12:56 - INFO - codeparrot_training - Step 39900: {'lr': 4.900388410338727e-05, 'samples': 20429312, 'steps': 39900, 'batch_loss/train': 0.6110126823186874}
12/28/2021 05:13:08 - INFO - codeparrot_training - Step 39901: {'lr': 4.899451095039231e-05, 'samples': 20429824, 'steps': 39901, 'batch_loss/train': 0.7023674710653722}
12/28/2021 05:13:18 - INFO - codeparrot_training - Step 39902: {'lr': 4.898513859651932e-05, 'samples': 20430336, 'steps': 39902, 'batch_loss/train': 0.7920525397639722}
12/28/2021 05:13:29 - INFO - codeparrot_training - Step 39903: {'lr': 4.897576704180548e-05, 'samples': 20430848, 'steps': 39903, 'batch_loss/train': 0.8222892489284277}
12/28/2021 05:13:42 - INFO - codeparrot_training - Step 39904: {'lr': 4.89663962862881e-05, 'samples': 20431360, 'steps': 39904, 'batch_loss/train': 0.6772895148023963}
12/28/2021 05:13:52 - INFO - codeparrot_training - Step 39905: {'lr': 4.895702633000449e-05, 'samples': 20431872, 'steps': 39905, 'batch_loss/train': 0.680454833433032}
12/28/2021 05:14:03 - INFO - codeparrot_training - Step 39906: {'lr': 4.894765717299168e-05, 'samples': 20432384, 'steps': 39906, 'batch_loss/train': 0.7552986267255619}
12/28/2021 05:14:14 - INFO - codeparrot_training - Step 39907: {'lr': 4.893828881528714e-05, 'samples': 20432896, 'steps': 39907, 'batch_loss/train': 0.7628909889608622}
12/28/2021 05:14:26 - INFO - codeparrot_training - Step 39908: {'lr': 4.8928921256928135e-05, 'samples': 20433408, 'steps': 39908, 'batch_loss/train': 0.7112678055418655}
12/28/2021 05:14:36 - INFO - codeparrot_training - Step 39909: {'lr': 4.8919554497951655e-05, 'samples': 20433920, 'steps': 39909, 'batch_loss/train': 0.7648602807894349}
12/28/2021 05:14:47 - INFO - codeparrot_training - Step 39910: {'lr': 4.8910188538395214e-05, 'samples': 20434432, 'steps': 39910, 'batch_loss/train': 0.6217182218679227}
12/28/2021 05:14:59 - INFO - codeparrot_training - Step 39911: {'lr': 4.890082337829596e-05, 'samples': 20434944, 'steps': 39911, 'batch_loss/train': 0.7597804742399603}
12/28/2021 05:15:10 - INFO - codeparrot_training - Step 39912: {'lr': 4.889145901769104e-05, 'samples': 20435456, 'steps': 39912, 'batch_loss/train': 0.7225272833602503}
12/28/2021 05:15:20 - INFO - codeparrot_training - Step 39913: {'lr': 4.888209545661776e-05, 'samples': 20435968, 'steps': 39913, 'batch_loss/train': 0.7936659283004701}
12/28/2021 05:15:32 - INFO - codeparrot_training - Step 39914: {'lr': 4.8872732695113304e-05, 'samples': 20436480, 'steps': 39914, 'batch_loss/train': 1.0987494746223092}
12/28/2021 05:15:43 - INFO - codeparrot_training - Step 39915: {'lr': 4.886337073321495e-05, 'samples': 20436992, 'steps': 39915, 'batch_loss/train': 0.824276955332607}
12/28/2021 05:15:54 - INFO - codeparrot_training - Step 39916: {'lr': 4.8854009570959866e-05, 'samples': 20437504, 'steps': 39916, 'batch_loss/train': 0.6910157076781616}
12/28/2021 05:16:07 - INFO - codeparrot_training - Step 39917: {'lr': 4.88446492083853e-05, 'samples': 20438016, 'steps': 39917, 'batch_loss/train': 0.7049756972119212}
12/28/2021 05:16:17 - INFO - codeparrot_training - Step 39918: {'lr': 4.8835289645528444e-05, 'samples': 20438528, 'steps': 39918, 'batch_loss/train': 0.5298192466143519}
12/28/2021 05:16:28 - INFO - codeparrot_training - Step 39919: {'lr': 4.882593088242654e-05, 'samples': 20439040, 'steps': 39919, 'batch_loss/train': 0.7223300149198622}
12/28/2021 05:16:39 - INFO - codeparrot_training - Step 39920: {'lr': 4.881657291911676e-05, 'samples': 20439552, 'steps': 39920, 'batch_loss/train': 0.6698145315749571}
12/28/2021 05:16:51 - INFO - codeparrot_training - Step 39921: {'lr': 4.880721575563632e-05, 'samples': 20440064, 'steps': 39921, 'batch_loss/train': 0.6787695731036365}
12/28/2021 05:17:01 - INFO - codeparrot_training - Step 39922: {'lr': 4.87978593920225e-05, 'samples': 20440576, 'steps': 39922, 'batch_loss/train': 0.7478927788324654}
12/28/2021 05:17:12 - INFO - codeparrot_training - Step 39923: {'lr': 4.878850382831235e-05, 'samples': 20441088, 'steps': 39923, 'batch_loss/train': 0.6917288331314921}
12/28/2021 05:17:24 - INFO - codeparrot_training - Step 39924: {'lr': 4.877914906454306e-05, 'samples': 20441600, 'steps': 39924, 'batch_loss/train': 0.8207361027598381}
12/28/2021 05:17:35 - INFO - codeparrot_training - Step 39925: {'lr': 4.876979510075202e-05, 'samples': 20442112, 'steps': 39925, 'batch_loss/train': 0.8016135627403855}
12/28/2021 05:17:45 - INFO - codeparrot_training - Step 39926: {'lr': 4.876044193697623e-05, 'samples': 20442624, 'steps': 39926, 'batch_loss/train': 0.67627636436373}
12/28/2021 05:17:58 - INFO - codeparrot_training - Step 39927: {'lr': 4.875108957325297e-05, 'samples': 20443136, 'steps': 39927, 'batch_loss/train': 0.72848389018327}
12/28/2021 05:18:09 - INFO - codeparrot_training - Step 39928: {'lr': 4.8741738009619365e-05, 'samples': 20443648, 'steps': 39928, 'batch_loss/train': 0.7004497654270381}
12/28/2021 05:18:19 - INFO - codeparrot_training - Step 39929: {'lr': 4.873238724611262e-05, 'samples': 20444160, 'steps': 39929, 'batch_loss/train': 0.7511444021947682}
12/28/2021 05:18:30 - INFO - codeparrot_training - Step 39930: {'lr': 4.872303728276989e-05, 'samples': 20444672, 'steps': 39930, 'batch_loss/train': 0.7761231092736125}
12/28/2021 05:18:42 - INFO - codeparrot_training - Step 39931: {'lr': 4.871368811962845e-05, 'samples': 20445184, 'steps': 39931, 'batch_loss/train': 0.760146792512387}
12/28/2021 05:18:53 - INFO - codeparrot_training - Step 39932: {'lr': 4.870433975672525e-05, 'samples': 20445696, 'steps': 39932, 'batch_loss/train': 0.6883700140751898}
12/28/2021 05:19:03 - INFO - codeparrot_training - Step 39933: {'lr': 4.8694992194097646e-05, 'samples': 20446208, 'steps': 39933, 'batch_loss/train': 0.7810357119888067}
12/28/2021 05:19:17 - INFO - codeparrot_training - Step 39934: {'lr': 4.8685645431782834e-05, 'samples': 20446720, 'steps': 39934, 'batch_loss/train': 0.781963643617928}
12/28/2021 05:19:27 - INFO - codeparrot_training - Step 39935: {'lr': 4.867629946981772e-05, 'samples': 20447232, 'steps': 39935, 'batch_loss/train': 0.7267339611425996}
12/28/2021 05:19:38 - INFO - codeparrot_training - Step 39936: {'lr': 4.8666954308239695e-05, 'samples': 20447744, 'steps': 39936, 'batch_loss/train': 0.6614659721963108}
12/28/2021 05:19:50 - INFO - codeparrot_training - Step 39937: {'lr': 4.865760994708593e-05, 'samples': 20448256, 'steps': 39937, 'batch_loss/train': 0.4342554225586355}
12/28/2021 05:20:01 - INFO - codeparrot_training - Step 39938: {'lr': 4.8648266386393385e-05, 'samples': 20448768, 'steps': 39938, 'batch_loss/train': 0.6929652998223901}
12/28/2021 05:20:11 - INFO - codeparrot_training - Step 39939: {'lr': 4.863892362619932e-05, 'samples': 20449280, 'steps': 39939, 'batch_loss/train': 0.5728589989012107}
12/28/2021 05:20:22 - INFO - codeparrot_training - Step 39940: {'lr': 4.862958166654086e-05, 'samples': 20449792, 'steps': 39940, 'batch_loss/train': 0.7457765117287636}
12/28/2021 05:20:34 - INFO - codeparrot_training - Step 39941: {'lr': 4.862024050745517e-05, 'samples': 20450304, 'steps': 39941, 'batch_loss/train': 0.7637727474793792}
12/28/2021 05:20:45 - INFO - codeparrot_training - Step 39942: {'lr': 4.8610900148979345e-05, 'samples': 20450816, 'steps': 39942, 'batch_loss/train': 0.7957620844244957}
12/28/2021 05:20:55 - INFO - codeparrot_training - Step 39943: {'lr': 4.860156059115053e-05, 'samples': 20451328, 'steps': 39943, 'batch_loss/train': 0.79855886567384}
12/28/2021 05:21:07 - INFO - codeparrot_training - Step 39944: {'lr': 4.859222183400588e-05, 'samples': 20451840, 'steps': 39944, 'batch_loss/train': 0.7167785735800862}
12/28/2021 05:21:18 - INFO - codeparrot_training - Step 39945: {'lr': 4.8582883877582504e-05, 'samples': 20452352, 'steps': 39945, 'batch_loss/train': 0.6854241940309294}
12/28/2021 05:21:28 - INFO - codeparrot_training - Step 39946: {'lr': 4.8573546721917525e-05, 'samples': 20452864, 'steps': 39946, 'batch_loss/train': 0.7750622876919806}
12/28/2021 05:21:41 - INFO - codeparrot_training - Step 39947: {'lr': 4.8564210367048086e-05, 'samples': 20453376, 'steps': 39947, 'batch_loss/train': 0.6676519424654543}
12/28/2021 05:21:52 - INFO - codeparrot_training - Step 39948: {'lr': 4.8554874813011354e-05, 'samples': 20453888, 'steps': 39948, 'batch_loss/train': 0.6413571778684855}
12/28/2021 05:22:02 - INFO - codeparrot_training - Step 39949: {'lr': 4.8545540059844295e-05, 'samples': 20454400, 'steps': 39949, 'batch_loss/train': 0.8233997211791575}
12/28/2021 05:22:17 - INFO - codeparrot_training - Step 39950: {'lr': 4.853620610758403e-05, 'samples': 20454912, 'steps': 39950, 'batch_loss/train': 0.7307212052401155}
12/28/2021 05:22:27 - INFO - codeparrot_training - Step 39951: {'lr': 4.852687295626787e-05, 'samples': 20455424, 'steps': 39951, 'batch_loss/train': 0.648082387400791}
12/28/2021 05:22:38 - INFO - codeparrot_training - Step 39952: {'lr': 4.8517540605932744e-05, 'samples': 20455936, 'steps': 39952, 'batch_loss/train': 0.7062582885846496}
12/28/2021 05:22:49 - INFO - codeparrot_training - Step 39953: {'lr': 4.850820905661579e-05, 'samples': 20456448, 'steps': 39953, 'batch_loss/train': 0.697128121741116}
12/28/2021 05:23:01 - INFO - codeparrot_training - Step 39954: {'lr': 4.8498878308354115e-05, 'samples': 20456960, 'steps': 39954, 'batch_loss/train': 0.6733671426773071}
12/28/2021 05:23:11 - INFO - codeparrot_training - Step 39955: {'lr': 4.8489548361184825e-05, 'samples': 20457472, 'steps': 39955, 'batch_loss/train': 0.7609276305884123}
12/28/2021 05:23:22 - INFO - codeparrot_training - Step 39956: {'lr': 4.8480219215145e-05, 'samples': 20457984, 'steps': 39956, 'batch_loss/train': 0.9310565600171685}
12/28/2021 05:23:34 - INFO - codeparrot_training - Step 39957: {'lr': 4.847089087027173e-05, 'samples': 20458496, 'steps': 39957, 'batch_loss/train': 0.7762394770979881}
12/28/2021 05:23:45 - INFO - codeparrot_training - Step 39958: {'lr': 4.846156332660209e-05, 'samples': 20459008, 'steps': 39958, 'batch_loss/train': 0.648093085270375}
12/28/2021 05:23:56 - INFO - codeparrot_training - Step 39959: {'lr': 4.8452236584173194e-05, 'samples': 20459520, 'steps': 39959, 'batch_loss/train': 0.6069926344789565}
12/28/2021 05:24:09 - INFO - codeparrot_training - Step 39960: {'lr': 4.844291064302217e-05, 'samples': 20460032, 'steps': 39960, 'batch_loss/train': 0.8443022691644728}
12/28/2021 05:24:20 - INFO - codeparrot_training - Step 39961: {'lr': 4.843358550318588e-05, 'samples': 20460544, 'steps': 39961, 'batch_loss/train': 0.7688412666320801}
12/28/2021 05:24:31 - INFO - codeparrot_training - Step 39962: {'lr': 4.8424261164701636e-05, 'samples': 20461056, 'steps': 39962, 'batch_loss/train': 0.7510229385225102}
12/28/2021 05:24:41 - INFO - codeparrot_training - Step 39963: {'lr': 4.841493762760646e-05, 'samples': 20461568, 'steps': 39963, 'batch_loss/train': 0.7107068495824933}
12/28/2021 05:24:53 - INFO - codeparrot_training - Step 39964: {'lr': 4.8405614891937316e-05, 'samples': 20462080, 'steps': 39964, 'batch_loss/train': 0.7134862234815955}
12/28/2021 05:25:04 - INFO - codeparrot_training - Step 39965: {'lr': 4.8396292957731345e-05, 'samples': 20462592, 'steps': 39965, 'batch_loss/train': 0.7644257177598774}
12/28/2021 05:25:15 - INFO - codeparrot_training - Step 39966: {'lr': 4.838697182502558e-05, 'samples': 20463104, 'steps': 39966, 'batch_loss/train': 0.7466722251847386}
12/28/2021 05:25:29 - INFO - codeparrot_training - Step 39967: {'lr': 4.837765149385709e-05, 'samples': 20463616, 'steps': 39967, 'batch_loss/train': 0.7462595161050558}
12/28/2021 05:25:39 - INFO - codeparrot_training - Step 39968: {'lr': 4.8368331964262904e-05, 'samples': 20464128, 'steps': 39968, 'batch_loss/train': 0.7238707341020927}
12/28/2021 05:25:50 - INFO - codeparrot_training - Step 39969: {'lr': 4.8359013236280143e-05, 'samples': 20464640, 'steps': 39969, 'batch_loss/train': 0.7272126708412543}
12/28/2021 05:26:02 - INFO - codeparrot_training - Step 39970: {'lr': 4.834969530994576e-05, 'samples': 20465152, 'steps': 39970, 'batch_loss/train': 0.7447898250538856}
12/28/2021 05:26:13 - INFO - codeparrot_training - Step 39971: {'lr': 4.834037818529688e-05, 'samples': 20465664, 'steps': 39971, 'batch_loss/train': 0.6997296211775392}
12/28/2021 05:26:23 - INFO - codeparrot_training - Step 39972: {'lr': 4.833106186237052e-05, 'samples': 20466176, 'steps': 39972, 'batch_loss/train': 0.7551518427208066}
12/28/2021 05:26:34 - INFO - codeparrot_training - Step 39973: {'lr': 4.8321746341203676e-05, 'samples': 20466688, 'steps': 39973, 'batch_loss/train': 0.7469924897886813}
12/28/2021 05:26:46 - INFO - codeparrot_training - Step 39974: {'lr': 4.831243162183352e-05, 'samples': 20467200, 'steps': 39974, 'batch_loss/train': 1.461985974572599}
12/28/2021 05:26:57 - INFO - codeparrot_training - Step 39975: {'lr': 4.830311770429688e-05, 'samples': 20467712, 'steps': 39975, 'batch_loss/train': 0.7708878261037171}
12/28/2021 05:27:07 - INFO - codeparrot_training - Step 39976: {'lr': 4.829380458863086e-05, 'samples': 20468224, 'steps': 39976, 'batch_loss/train': 0.7244509160518646}
12/28/2021 05:27:21 - INFO - codeparrot_training - Step 39977: {'lr': 4.828449227487261e-05, 'samples': 20468736, 'steps': 39977, 'batch_loss/train': 0.7400825284421444}
12/28/2021 05:27:32 - INFO - codeparrot_training - Step 39978: {'lr': 4.827518076305901e-05, 'samples': 20469248, 'steps': 39978, 'batch_loss/train': 0.7957007819786668}
12/28/2021 05:27:42 - INFO - codeparrot_training - Step 39979: {'lr': 4.826587005322711e-05, 'samples': 20469760, 'steps': 39979, 'batch_loss/train': 0.7145046535879374}
12/28/2021 05:27:55 - INFO - codeparrot_training - Step 39980: {'lr': 4.825656014541393e-05, 'samples': 20470272, 'steps': 39980, 'batch_loss/train': 0.6844358865637332}
12/28/2021 05:28:06 - INFO - codeparrot_training - Step 39981: {'lr': 4.824725103965652e-05, 'samples': 20470784, 'steps': 39981, 'batch_loss/train': 0.630889365915209}
12/28/2021 05:28:17 - INFO - codeparrot_training - Step 39982: {'lr': 4.823794273599183e-05, 'samples': 20471296, 'steps': 39982, 'batch_loss/train': 0.9450323438504711}
12/28/2021 05:28:27 - INFO - codeparrot_training - Step 39983: {'lr': 4.822863523445692e-05, 'samples': 20471808, 'steps': 39983, 'batch_loss/train': 0.8667396260425448}
12/28/2021 05:28:41 - INFO - codeparrot_training - Step 39984: {'lr': 4.8219328535088755e-05, 'samples': 20472320, 'steps': 39984, 'batch_loss/train': 0.7474055122584105}
12/28/2021 05:28:52 - INFO - codeparrot_training - Step 39985: {'lr': 4.821002263792437e-05, 'samples': 20472832, 'steps': 39985, 'batch_loss/train': 0.9616368832066655}
12/28/2021 05:29:03 - INFO - codeparrot_training - Step 39986: {'lr': 4.8200717543000785e-05, 'samples': 20473344, 'steps': 39986, 'batch_loss/train': 0.8350311927497387}
12/28/2021 05:29:15 - INFO - codeparrot_training - Step 39987: {'lr': 4.8191413250354834e-05, 'samples': 20473856, 'steps': 39987, 'batch_loss/train': 0.7804884095676243}
12/28/2021 05:29:25 - INFO - codeparrot_training - Step 39988: {'lr': 4.818210976002368e-05, 'samples': 20474368, 'steps': 39988, 'batch_loss/train': 0.7445569094270468}
12/28/2021 05:29:36 - INFO - codeparrot_training - Step 39989: {'lr': 4.8172807072044326e-05, 'samples': 20474880, 'steps': 39989, 'batch_loss/train': 0.665947736473754}
12/28/2021 05:29:47 - INFO - codeparrot_training - Step 39990: {'lr': 4.816350518645363e-05, 'samples': 20475392, 'steps': 39990, 'batch_loss/train': 0.5766502392943949}
12/28/2021 05:29:59 - INFO - codeparrot_training - Step 39991: {'lr': 4.815420410328855e-05, 'samples': 20475904, 'steps': 39991, 'batch_loss/train': 0.6910803131759167}
12/28/2021 05:30:10 - INFO - codeparrot_training - Step 39992: {'lr': 4.8144903822586264e-05, 'samples': 20476416, 'steps': 39992, 'batch_loss/train': 0.8380065956152976}
12/28/2021 05:30:20 - INFO - codeparrot_training - Step 39993: {'lr': 4.813560434438358e-05, 'samples': 20476928, 'steps': 39993, 'batch_loss/train': 0.7213678297703154}
12/28/2021 05:30:32 - INFO - codeparrot_training - Step 39994: {'lr': 4.8126305668717476e-05, 'samples': 20477440, 'steps': 39994, 'batch_loss/train': 0.7722652526572347}
12/28/2021 05:30:43 - INFO - codeparrot_training - Step 39995: {'lr': 4.811700779562497e-05, 'samples': 20477952, 'steps': 39995, 'batch_loss/train': 0.7667667725472711}
12/28/2021 05:30:54 - INFO - codeparrot_training - Step 39996: {'lr': 4.810771072514303e-05, 'samples': 20478464, 'steps': 39996, 'batch_loss/train': 0.6698966561816633}
12/28/2021 05:31:08 - INFO - codeparrot_training - Step 39997: {'lr': 4.809841445730859e-05, 'samples': 20478976, 'steps': 39997, 'batch_loss/train': 0.6796284410520457}
12/28/2021 05:31:18 - INFO - codeparrot_training - Step 39998: {'lr': 4.808911899215862e-05, 'samples': 20479488, 'steps': 39998, 'batch_loss/train': 0.6249241037294269}
12/28/2021 05:31:29 - INFO - codeparrot_training - Step 39999: {'lr': 4.807982432973007e-05, 'samples': 20480000, 'steps': 39999, 'batch_loss/train': 0.7455571456812322}
12/28/2021 05:31:29 - INFO - codeparrot_training - Evaluating and saving model checkpoint
12/28/2021 05:34:50 - INFO - codeparrot_training - Step 40000: {'loss/eval': 0.7470561265945435, 'perplexity': 2.110776901245117}
12/28/2021 05:35:08 - WARNING - huggingface_hub.repository - Several commits (14) will be pushed upstream.
12/28/2021 05:35:22 - INFO - codeparrot_training - Step 40000: {'lr': 4.8070530470059906e-05, 'samples': 20480512, 'steps': 40000, 'batch_loss/train': 0.7726424112915993}
12/28/2021 05:35:34 - INFO - codeparrot_training - Step 40001: {'lr': 4.806123741318513e-05, 'samples': 20481024, 'steps': 40001, 'batch_loss/train': 0.800928701646626}
12/28/2021 05:35:45 - INFO - codeparrot_training - Step 40002: {'lr': 4.805194515914249e-05, 'samples': 20481536, 'steps': 40002, 'batch_loss/train': 0.8114345027133822}
12/28/2021 05:35:55 - INFO - codeparrot_training - Step 40003: {'lr': 4.804265370796918e-05, 'samples': 20482048, 'steps': 40003, 'batch_loss/train': 0.7743837628513575}
12/28/2021 05:36:07 - INFO - codeparrot_training - Step 40004: {'lr': 4.803336305970199e-05, 'samples': 20482560, 'steps': 40004, 'batch_loss/train': 0.7756875240593217}
12/28/2021 05:36:18 - INFO - codeparrot_training - Step 40005: {'lr': 4.802407321437785e-05, 'samples': 20483072, 'steps': 40005, 'batch_loss/train': 0.7255599570926279}
12/28/2021 05:36:29 - INFO - codeparrot_training - Step 40006: {'lr': 4.8014784172033756e-05, 'samples': 20483584, 'steps': 40006, 'batch_loss/train': 0.7071920777671039}
12/28/2021 05:36:42 - INFO - codeparrot_training - Step 40007: {'lr': 4.8005495932706616e-05, 'samples': 20484096, 'steps': 40007, 'batch_loss/train': 0.7917331256903708}
12/28/2021 05:36:53 - INFO - codeparrot_training - Step 40008: {'lr': 4.7996208496433354e-05, 'samples': 20484608, 'steps': 40008, 'batch_loss/train': 0.815594139508903}
12/28/2021 05:37:04 - INFO - codeparrot_training - Step 40009: {'lr': 4.7986921863250874e-05, 'samples': 20485120, 'steps': 40009, 'batch_loss/train': 0.7863869220018387}
12/28/2021 05:37:14 - INFO - codeparrot_training - Step 40010: {'lr': 4.7977636033196134e-05, 'samples': 20485632, 'steps': 40010, 'batch_loss/train': 0.9276824467815459}
12/28/2021 05:37:27 - INFO - codeparrot_training - Step 40011: {'lr': 4.796835100630603e-05, 'samples': 20486144, 'steps': 40011, 'batch_loss/train': 0.6679785071173683}
12/28/2021 05:37:37 - INFO - codeparrot_training - Step 40012: {'lr': 4.795906678261755e-05, 'samples': 20486656, 'steps': 40012, 'batch_loss/train': 0.9688804498873651}
12/28/2021 05:37:48 - INFO - codeparrot_training - Step 40013: {'lr': 4.7949783362167375e-05, 'samples': 20487168, 'steps': 40013, 'batch_loss/train': 0.7663782855379395}
12/28/2021 05:38:01 - INFO - codeparrot_training - Step 40014: {'lr': 4.794050074499265e-05, 'samples': 20487680, 'steps': 40014, 'batch_loss/train': 0.6950382841750979}
12/28/2021 05:38:12 - INFO - codeparrot_training - Step 40015: {'lr': 4.7931218931130283e-05, 'samples': 20488192, 'steps': 40015, 'batch_loss/train': 0.6488298668409698}
12/28/2021 05:38:22 - INFO - codeparrot_training - Step 40016: {'lr': 4.7921937920617e-05, 'samples': 20488704, 'steps': 40016, 'batch_loss/train': 0.8357140067964792}
12/28/2021 05:38:34 - INFO - codeparrot_training - Step 40017: {'lr': 4.7912657713489735e-05, 'samples': 20489216, 'steps': 40017, 'batch_loss/train': 0.7229162203148007}
12/28/2021 05:38:45 - INFO - codeparrot_training - Step 40018: {'lr': 4.790337830978558e-05, 'samples': 20489728, 'steps': 40018, 'batch_loss/train': 1.463360167806968}
12/28/2021 05:38:56 - INFO - codeparrot_training - Step 40019: {'lr': 4.78940997095412e-05, 'samples': 20490240, 'steps': 40019, 'batch_loss/train': 0.7067258744500577}
12/28/2021 05:39:06 - INFO - codeparrot_training - Step 40020: {'lr': 4.788482191279359e-05, 'samples': 20490752, 'steps': 40020, 'batch_loss/train': 0.6605885562021285}
12/28/2021 05:39:18 - INFO - codeparrot_training - Step 40021: {'lr': 4.78755449195796e-05, 'samples': 20491264, 'steps': 40021, 'batch_loss/train': 0.7995709789101966}
12/28/2021 05:39:29 - INFO - codeparrot_training - Step 40022: {'lr': 4.78662687299361e-05, 'samples': 20491776, 'steps': 40022, 'batch_loss/train': 1.5829131617210805}
12/28/2021 05:39:40 - INFO - codeparrot_training - Step 40023: {'lr': 4.785699334390003e-05, 'samples': 20492288, 'steps': 40023, 'batch_loss/train': 0.8458932721987367}
12/28/2021 05:39:54 - INFO - codeparrot_training - Step 40024: {'lr': 4.78477187615082e-05, 'samples': 20492800, 'steps': 40024, 'batch_loss/train': 0.7342158211395144}
12/28/2021 05:40:04 - INFO - codeparrot_training - Step 40025: {'lr': 4.7838444982797515e-05, 'samples': 20493312, 'steps': 40025, 'batch_loss/train': 0.6796498766634613}
12/28/2021 05:40:15 - INFO - codeparrot_training - Step 40026: {'lr': 4.782917200780487e-05, 'samples': 20493824, 'steps': 40026, 'batch_loss/train': 0.745524940546602}
12/28/2021 05:40:27 - INFO - codeparrot_training - Step 40027: {'lr': 4.781989983656712e-05, 'samples': 20494336, 'steps': 40027, 'batch_loss/train': 0.6622004909440875}
12/28/2021 05:40:38 - INFO - codeparrot_training - Step 40028: {'lr': 4.7810628469121015e-05, 'samples': 20494848, 'steps': 40028, 'batch_loss/train': 0.6874584686011076}
12/28/2021 05:40:49 - INFO - codeparrot_training - Step 40029: {'lr': 4.7801357905503633e-05, 'samples': 20495360, 'steps': 40029, 'batch_loss/train': 0.817433814983815}
12/28/2021 05:40:59 - INFO - codeparrot_training - Step 40030: {'lr': 4.7792088145751615e-05, 'samples': 20495872, 'steps': 40030, 'batch_loss/train': 0.8729924038052559}
12/28/2021 05:41:11 - INFO - codeparrot_training - Step 40031: {'lr': 4.778281918990185e-05, 'samples': 20496384, 'steps': 40031, 'batch_loss/train': 0.811583916656673}
12/28/2021 05:41:22 - INFO - codeparrot_training - Step 40032: {'lr': 4.7773551037991385e-05, 'samples': 20496896, 'steps': 40032, 'batch_loss/train': 0.6568457271787338}
12/28/2021 05:41:33 - INFO - codeparrot_training - Step 40033: {'lr': 4.7764283690056864e-05, 'samples': 20497408, 'steps': 40033, 'batch_loss/train': 0.7364921532571316}
12/28/2021 05:41:47 - INFO - codeparrot_training - Step 40034: {'lr': 4.775501714613517e-05, 'samples': 20497920, 'steps': 40034, 'batch_loss/train': 0.7001315681263804}
12/28/2021 05:41:57 - INFO - codeparrot_training - Step 40035: {'lr': 4.7745751406263163e-05, 'samples': 20498432, 'steps': 40035, 'batch_loss/train': 0.7493197584990412}
12/28/2021 05:42:08 - INFO - codeparrot_training - Step 40036: {'lr': 4.773648647047768e-05, 'samples': 20498944, 'steps': 40036, 'batch_loss/train': 0.8795939201954752}
12/28/2021 05:42:20 - INFO - codeparrot_training - Step 40037: {'lr': 4.772722233881557e-05, 'samples': 20499456, 'steps': 40037, 'batch_loss/train': 0.8135405406355858}
12/28/2021 05:42:31 - INFO - codeparrot_training - Step 40038: {'lr': 4.7717959011313705e-05, 'samples': 20499968, 'steps': 40038, 'batch_loss/train': 0.6592048453167081}
12/28/2021 05:42:41 - INFO - codeparrot_training - Step 40039: {'lr': 4.770869648800871e-05, 'samples': 20500480, 'steps': 40039, 'batch_loss/train': 0.7735336166806519}
12/28/2021 05:42:52 - INFO - codeparrot_training - Step 40040: {'lr': 4.769943476893765e-05, 'samples': 20500992, 'steps': 40040, 'batch_loss/train': 0.8419398432597518}
12/28/2021 05:43:04 - INFO - codeparrot_training - Step 40041: {'lr': 4.769017385413729e-05, 'samples': 20501504, 'steps': 40041, 'batch_loss/train': 0.6829155758023262}
12/28/2021 05:43:15 - INFO - codeparrot_training - Step 40042: {'lr': 4.768091374364436e-05, 'samples': 20502016, 'steps': 40042, 'batch_loss/train': 0.6927287201397121}
12/28/2021 05:43:25 - INFO - codeparrot_training - Step 40043: {'lr': 4.767165443749566e-05, 'samples': 20502528, 'steps': 40043, 'batch_loss/train': 0.7168359580682591}
12/28/2021 05:43:39 - INFO - codeparrot_training - Step 40044: {'lr': 4.7662395935728186e-05, 'samples': 20503040, 'steps': 40044, 'batch_loss/train': 0.7643388365395367}
12/28/2021 05:43:50 - INFO - codeparrot_training - Step 40045: {'lr': 4.765313823837858e-05, 'samples': 20503552, 'steps': 40045, 'batch_loss/train': 0.7368368022143841}
12/28/2021 05:44:01 - INFO - codeparrot_training - Step 40046: {'lr': 4.764388134548361e-05, 'samples': 20504064, 'steps': 40046, 'batch_loss/train': 0.8131799167022109}
12/28/2021 05:44:13 - INFO - codeparrot_training - Step 40047: {'lr': 4.763462525708029e-05, 'samples': 20504576, 'steps': 40047, 'batch_loss/train': 0.6907230252400041}
12/28/2021 05:44:23 - INFO - codeparrot_training - Step 40048: {'lr': 4.762536997320524e-05, 'samples': 20505088, 'steps': 40048, 'batch_loss/train': 0.6759302476420999}
12/28/2021 05:44:34 - INFO - codeparrot_training - Step 40049: {'lr': 4.7616115493895304e-05, 'samples': 20505600, 'steps': 40049, 'batch_loss/train': 0.7495749788358808}
12/28/2021 05:44:45 - INFO - codeparrot_training - Step 40050: {'lr': 4.760686181918725e-05, 'samples': 20506112, 'steps': 40050, 'batch_loss/train': 0.6996154943481088}
12/28/2021 05:44:57 - INFO - codeparrot_training - Step 40051: {'lr': 4.759760894911791e-05, 'samples': 20506624, 'steps': 40051, 'batch_loss/train': 0.677469982067123}
12/28/2021 05:45:08 - INFO - codeparrot_training - Step 40052: {'lr': 4.758835688372406e-05, 'samples': 20507136, 'steps': 40052, 'batch_loss/train': 0.6714246098417789}
12/28/2021 05:45:18 - INFO - codeparrot_training - Step 40053: {'lr': 4.757910562304252e-05, 'samples': 20507648, 'steps': 40053, 'batch_loss/train': 0.6989887170493603}
12/28/2021 05:45:32 - INFO - codeparrot_training - Step 40054: {'lr': 4.756985516710991e-05, 'samples': 20508160, 'steps': 40054, 'batch_loss/train': 0.6315414141863585}
12/28/2021 05:45:43 - INFO - codeparrot_training - Step 40055: {'lr': 4.756060551596325e-05, 'samples': 20508672, 'steps': 40055, 'batch_loss/train': 0.6698095900937915}
12/28/2021 05:45:54 - INFO - codeparrot_training - Step 40056: {'lr': 4.7551356669639115e-05, 'samples': 20509184, 'steps': 40056, 'batch_loss/train': 0.7954205675050616}
12/28/2021 05:46:06 - INFO - codeparrot_training - Step 40057: {'lr': 4.754210862817429e-05, 'samples': 20509696, 'steps': 40057, 'batch_loss/train': 0.8081500781700015}
12/28/2021 05:46:17 - INFO - codeparrot_training - Step 40058: {'lr': 4.753286139160573e-05, 'samples': 20510208, 'steps': 40058, 'batch_loss/train': 0.6933741185348481}
12/28/2021 05:46:27 - INFO - codeparrot_training - Step 40059: {'lr': 4.752361495996998e-05, 'samples': 20510720, 'steps': 40059, 'batch_loss/train': 0.7685907443519682}
12/28/2021 05:46:38 - INFO - codeparrot_training - Step 40060: {'lr': 4.751436933330389e-05, 'samples': 20511232, 'steps': 40060, 'batch_loss/train': 0.7637361148372293}
12/28/2021 05:46:50 - INFO - codeparrot_training - Step 40061: {'lr': 4.7505124511644207e-05, 'samples': 20511744, 'steps': 40061, 'batch_loss/train': 0.7806929033249617}
12/28/2021 05:47:01 - INFO - codeparrot_training - Step 40062: {'lr': 4.7495880495027704e-05, 'samples': 20512256, 'steps': 40062, 'batch_loss/train': 0.680672881193459}
12/28/2021 05:47:11 - INFO - codeparrot_training - Step 40063: {'lr': 4.748663728349112e-05, 'samples': 20512768, 'steps': 40063, 'batch_loss/train': 0.6434937492012978}
12/28/2021 05:47:25 - INFO - codeparrot_training - Step 40064: {'lr': 4.7477394877071254e-05, 'samples': 20513280, 'steps': 40064, 'batch_loss/train': 0.6964603730011731}
12/28/2021 05:47:36 - INFO - codeparrot_training - Step 40065: {'lr': 4.7468153275804673e-05, 'samples': 20513792, 'steps': 40065, 'batch_loss/train': 0.7641759514808655}
12/28/2021 05:47:47 - INFO - codeparrot_training - Step 40066: {'lr': 4.7458912479728304e-05, 'samples': 20514304, 'steps': 40066, 'batch_loss/train': 0.6509502637200058}
12/28/2021 05:47:59 - INFO - codeparrot_training - Step 40067: {'lr': 4.744967248887888e-05, 'samples': 20514816, 'steps': 40067, 'batch_loss/train': 0.7363841221667826}
12/28/2021 05:48:09 - INFO - codeparrot_training - Step 40068: {'lr': 4.744043330329295e-05, 'samples': 20515328, 'steps': 40068, 'batch_loss/train': 0.9020786052569747}
12/28/2021 05:48:20 - INFO - codeparrot_training - Step 40069: {'lr': 4.743119492300746e-05, 'samples': 20515840, 'steps': 40069, 'batch_loss/train': 0.6640748856589198}
12/28/2021 05:48:31 - INFO - codeparrot_training - Step 40070: {'lr': 4.7421957348059096e-05, 'samples': 20516352, 'steps': 40070, 'batch_loss/train': 0.6751219858415425}
12/28/2021 05:48:43 - INFO - codeparrot_training - Step 40071: {'lr': 4.7412720578484474e-05, 'samples': 20516864, 'steps': 40071, 'batch_loss/train': 0.7810753248631954}
12/28/2021 05:48:54 - INFO - codeparrot_training - Step 40072: {'lr': 4.740348461432031e-05, 'samples': 20517376, 'steps': 40072, 'batch_loss/train': 0.6616936158388853}
12/28/2021 05:49:04 - INFO - codeparrot_training - Step 40073: {'lr': 4.739424945560353e-05, 'samples': 20517888, 'steps': 40073, 'batch_loss/train': 0.5942631929647177}
12/28/2021 05:49:16 - INFO - codeparrot_training - Step 40074: {'lr': 4.738501510237067e-05, 'samples': 20518400, 'steps': 40074, 'batch_loss/train': 0.701023300876841}
12/28/2021 05:49:27 - INFO - codeparrot_training - Step 40075: {'lr': 4.737578155465847e-05, 'samples': 20518912, 'steps': 40075, 'batch_loss/train': 0.6186836792621762}
12/28/2021 05:49:38 - INFO - codeparrot_training - Step 40076: {'lr': 4.736654881250363e-05, 'samples': 20519424, 'steps': 40076, 'batch_loss/train': 0.7556325769983232}
12/28/2021 05:49:52 - INFO - codeparrot_training - Step 40077: {'lr': 4.735731687594291e-05, 'samples': 20519936, 'steps': 40077, 'batch_loss/train': 0.8123668618500233}
12/28/2021 05:50:02 - INFO - codeparrot_training - Step 40078: {'lr': 4.734808574501298e-05, 'samples': 20520448, 'steps': 40078, 'batch_loss/train': 0.7661389694549143}
12/28/2021 05:50:13 - INFO - codeparrot_training - Step 40079: {'lr': 4.733885541975061e-05, 'samples': 20520960, 'steps': 40079, 'batch_loss/train': 0.6948056421242654}
12/28/2021 05:50:24 - INFO - codeparrot_training - Step 40080: {'lr': 4.73296259001923e-05, 'samples': 20521472, 'steps': 40080, 'batch_loss/train': 0.6982705732807517}
12/28/2021 05:50:36 - INFO - codeparrot_training - Step 40081: {'lr': 4.73203971863749e-05, 'samples': 20521984, 'steps': 40081, 'batch_loss/train': 0.8135706642642617}
12/28/2021 05:50:47 - INFO - codeparrot_training - Step 40082: {'lr': 4.731116927833517e-05, 'samples': 20522496, 'steps': 40082, 'batch_loss/train': 0.6406755186617374}
12/28/2021 05:50:57 - INFO - codeparrot_training - Step 40083: {'lr': 4.7301942176109585e-05, 'samples': 20523008, 'steps': 40083, 'batch_loss/train': 1.0748769277706742}
12/28/2021 05:51:11 - INFO - codeparrot_training - Step 40084: {'lr': 4.729271587973505e-05, 'samples': 20523520, 'steps': 40084, 'batch_loss/train': 0.6860901620239019}
12/28/2021 05:51:22 - INFO - codeparrot_training - Step 40085: {'lr': 4.728349038924806e-05, 'samples': 20524032, 'steps': 40085, 'batch_loss/train': 0.7516537010669708}
12/28/2021 05:51:32 - INFO - codeparrot_training - Step 40086: {'lr': 4.72742657046854e-05, 'samples': 20524544, 'steps': 40086, 'batch_loss/train': 0.8911277304869145}
12/28/2021 05:51:44 - INFO - codeparrot_training - Step 40087: {'lr': 4.726504182608368e-05, 'samples': 20525056, 'steps': 40087, 'batch_loss/train': 0.781242098659277}
12/28/2021 05:51:55 - INFO - codeparrot_training - Step 40088: {'lr': 4.725581875347962e-05, 'samples': 20525568, 'steps': 40088, 'batch_loss/train': 0.7306082248687744}
12/28/2021 05:52:06 - INFO - codeparrot_training - Step 40089: {'lr': 4.7246596486909875e-05, 'samples': 20526080, 'steps': 40089, 'batch_loss/train': 0.7065264428965747}
12/28/2021 05:52:16 - INFO - codeparrot_training - Step 40090: {'lr': 4.7237375026411125e-05, 'samples': 20526592, 'steps': 40090, 'batch_loss/train': 0.8349707731977105}
12/28/2021 05:52:28 - INFO - codeparrot_training - Step 40091: {'lr': 4.7228154372019964e-05, 'samples': 20527104, 'steps': 40091, 'batch_loss/train': 0.7017207755707204}
12/28/2021 05:52:39 - INFO - codeparrot_training - Step 40092: {'lr': 4.7218934523773125e-05, 'samples': 20527616, 'steps': 40092, 'batch_loss/train': 0.704219285864383}
12/28/2021 05:52:50 - INFO - codeparrot_training - Step 40093: {'lr': 4.72097154817073e-05, 'samples': 20528128, 'steps': 40093, 'batch_loss/train': 0.7250997233204544}
12/28/2021 05:53:02 - INFO - codeparrot_training - Step 40094: {'lr': 4.720049724585895e-05, 'samples': 20528640, 'steps': 40094, 'batch_loss/train': 0.7830549101345241}
12/28/2021 05:53:13 - INFO - codeparrot_training - Step 40095: {'lr': 4.719127981626492e-05, 'samples': 20529152, 'steps': 40095, 'batch_loss/train': 0.814109405502677}
12/28/2021 05:53:24 - INFO - codeparrot_training - Step 40096: {'lr': 4.718206319296184e-05, 'samples': 20529664, 'steps': 40096, 'batch_loss/train': 0.7858811332844198}
12/28/2021 05:53:36 - INFO - codeparrot_training - Step 40097: {'lr': 4.717284737598621e-05, 'samples': 20530176, 'steps': 40097, 'batch_loss/train': 0.7556718182750046}
12/28/2021 05:53:46 - INFO - codeparrot_training - Step 40098: {'lr': 4.7163632365374714e-05, 'samples': 20530688, 'steps': 40098, 'batch_loss/train': 0.6868764595128596}
12/28/2021 05:53:57 - INFO - codeparrot_training - Step 40099: {'lr': 4.7154418161164145e-05, 'samples': 20531200, 'steps': 40099, 'batch_loss/train': 0.7651700675487518}
12/28/2021 05:54:08 - INFO - codeparrot_training - Step 40100: {'lr': 4.714520476339096e-05, 'samples': 20531712, 'steps': 40100, 'batch_loss/train': 0.7147667082026601}
12/28/2021 05:54:21 - INFO - codeparrot_training - Step 40101: {'lr': 4.7135992172091855e-05, 'samples': 20532224, 'steps': 40101, 'batch_loss/train': 0.6884416316170245}
12/28/2021 05:54:31 - INFO - codeparrot_training - Step 40102: {'lr': 4.712678038730342e-05, 'samples': 20532736, 'steps': 40102, 'batch_loss/train': 0.805068395100534}
12/28/2021 05:54:42 - INFO - codeparrot_training - Step 40103: {'lr': 4.711756940906234e-05, 'samples': 20533248, 'steps': 40103, 'batch_loss/train': 0.8814319018274546}
12/28/2021 05:54:54 - INFO - codeparrot_training - Step 40104: {'lr': 4.710835923740517e-05, 'samples': 20533760, 'steps': 40104, 'batch_loss/train': 0.7640890548937023}
12/28/2021 05:55:05 - INFO - codeparrot_training - Step 40105: {'lr': 4.709914987236855e-05, 'samples': 20534272, 'steps': 40105, 'batch_loss/train': 0.7451956603908911}
12/28/2021 05:55:15 - INFO - codeparrot_training - Step 40106: {'lr': 4.7089941313989106e-05, 'samples': 20534784, 'steps': 40106, 'batch_loss/train': 0.7695491751655936}
12/28/2021 05:55:27 - INFO - codeparrot_training - Step 40107: {'lr': 4.708073356230347e-05, 'samples': 20535296, 'steps': 40107, 'batch_loss/train': 0.6959663246525452}
12/28/2021 05:55:38 - INFO - codeparrot_training - Step 40108: {'lr': 4.707152661734826e-05, 'samples': 20535808, 'steps': 40108, 'batch_loss/train': 0.6380113493651152}
12/28/2021 05:55:49 - INFO - codeparrot_training - Step 40109: {'lr': 4.70623204791599e-05, 'samples': 20536320, 'steps': 40109, 'batch_loss/train': 0.8995381304994226}
12/28/2021 05:55:59 - INFO - codeparrot_training - Step 40110: {'lr': 4.7053115147775245e-05, 'samples': 20536832, 'steps': 40110, 'batch_loss/train': 0.7362576564773917}
12/28/2021 05:56:11 - INFO - codeparrot_training - Step 40111: {'lr': 4.704391062323074e-05, 'samples': 20537344, 'steps': 40111, 'batch_loss/train': 0.7188084074296057}
12/28/2021 05:56:22 - INFO - codeparrot_training - Step 40112: {'lr': 4.7034706905562994e-05, 'samples': 20537856, 'steps': 40112, 'batch_loss/train': 0.691858347505331}
12/28/2021 05:56:33 - INFO - codeparrot_training - Step 40113: {'lr': 4.702550399480862e-05, 'samples': 20538368, 'steps': 40113, 'batch_loss/train': 0.9134210618212819}
12/28/2021 05:56:46 - INFO - codeparrot_training - Step 40114: {'lr': 4.70163018910042e-05, 'samples': 20538880, 'steps': 40114, 'batch_loss/train': 0.7380348790320568}
12/28/2021 05:56:56 - INFO - codeparrot_training - Step 40115: {'lr': 4.7007100594186334e-05, 'samples': 20539392, 'steps': 40115, 'batch_loss/train': 0.7254260383779183}
12/28/2021 05:57:07 - INFO - codeparrot_training - Step 40116: {'lr': 4.6997900104391586e-05, 'samples': 20539904, 'steps': 40116, 'batch_loss/train': 0.6438292134553194}
12/28/2021 05:57:19 - INFO - codeparrot_training - Step 40117: {'lr': 4.698870042165651e-05, 'samples': 20540416, 'steps': 40117, 'batch_loss/train': 0.7268433021381497}
12/28/2021 05:57:30 - INFO - codeparrot_training - Step 40118: {'lr': 4.697950154601774e-05, 'samples': 20540928, 'steps': 40118, 'batch_loss/train': 0.792010001372546}
12/28/2021 05:57:40 - INFO - codeparrot_training - Step 40119: {'lr': 4.697030347751186e-05, 'samples': 20541440, 'steps': 40119, 'batch_loss/train': 0.7561102127656341}
12/28/2021 05:57:51 - INFO - codeparrot_training - Step 40120: {'lr': 4.696110621617527e-05, 'samples': 20541952, 'steps': 40120, 'batch_loss/train': 0.7290724692866206}
12/28/2021 05:58:03 - INFO - codeparrot_training - Step 40121: {'lr': 4.695190976204472e-05, 'samples': 20542464, 'steps': 40121, 'batch_loss/train': 0.8543835822492838}
12/28/2021 05:58:14 - INFO - codeparrot_training - Step 40122: {'lr': 4.694271411515677e-05, 'samples': 20542976, 'steps': 40122, 'batch_loss/train': 0.6849077336955816}
12/28/2021 05:58:25 - INFO - codeparrot_training - Step 40123: {'lr': 4.693351927554787e-05, 'samples': 20543488, 'steps': 40123, 'batch_loss/train': 0.593883607420139}
12/28/2021 05:58:37 - INFO - codeparrot_training - Step 40124: {'lr': 4.692432524325454e-05, 'samples': 20544000, 'steps': 40124, 'batch_loss/train': 0.7160068085649982}
12/28/2021 05:58:48 - INFO - codeparrot_training - Step 40125: {'lr': 4.6915132018313545e-05, 'samples': 20544512, 'steps': 40125, 'batch_loss/train': 0.7517594691598788}
12/28/2021 05:58:59 - INFO - codeparrot_training - Step 40126: {'lr': 4.6905939600761256e-05, 'samples': 20545024, 'steps': 40126, 'batch_loss/train': 0.7075189519673586}
12/28/2021 05:59:11 - INFO - codeparrot_training - Step 40127: {'lr': 4.6896747990634234e-05, 'samples': 20545536, 'steps': 40127, 'batch_loss/train': 0.8129744082689285}
12/28/2021 05:59:21 - INFO - codeparrot_training - Step 40128: {'lr': 4.688755718796908e-05, 'samples': 20546048, 'steps': 40128, 'batch_loss/train': 0.7152688959613442}
12/28/2021 05:59:32 - INFO - codeparrot_training - Step 40129: {'lr': 4.687836719280231e-05, 'samples': 20546560, 'steps': 40129, 'batch_loss/train': 0.7336066528223455}
12/28/2021 05:59:43 - INFO - codeparrot_training - Step 40130: {'lr': 4.686917800517043e-05, 'samples': 20547072, 'steps': 40130, 'batch_loss/train': 0.7710396586917341}
12/28/2021 05:59:56 - INFO - codeparrot_training - Step 40131: {'lr': 4.685998962511001e-05, 'samples': 20547584, 'steps': 40131, 'batch_loss/train': 1.2028237571939826}
12/28/2021 06:00:07 - INFO - codeparrot_training - Step 40132: {'lr': 4.685080205265757e-05, 'samples': 20548096, 'steps': 40132, 'batch_loss/train': 0.6674700700677931}
12/28/2021 06:00:17 - INFO - codeparrot_training - Step 40133: {'lr': 4.6841615287849636e-05, 'samples': 20548608, 'steps': 40133, 'batch_loss/train': 0.7251418391242623}
12/28/2021 06:00:29 - INFO - codeparrot_training - Step 40134: {'lr': 4.683242933072279e-05, 'samples': 20549120, 'steps': 40134, 'batch_loss/train': 0.7454051394015551}
12/28/2021 06:00:40 - INFO - codeparrot_training - Step 40135: {'lr': 4.682324418131337e-05, 'samples': 20549632, 'steps': 40135, 'batch_loss/train': 0.7820845819078386}
12/28/2021 06:00:51 - INFO - codeparrot_training - Step 40136: {'lr': 4.6814059839658086e-05, 'samples': 20550144, 'steps': 40136, 'batch_loss/train': 0.7339739399030805}
12/28/2021
06:01:01 - INFO - codeparrot_training - Step 40137: {'lr': 4.680487630579344e-05, 'samples': 20550656, 'steps': 40137, 'batch_loss/train': 0.8311495203524828} 12/28/2021 06:01:13 - INFO - codeparrot_training - Step 40138: {'lr': 4.67956935797558e-05, 'samples': 20551168, 'steps': 40138, 'batch_loss/train': 0.7540168245323002} 12/28/2021 06:01:24 - INFO - codeparrot_training - Step 40139: {'lr': 4.6786511661581765e-05, 'samples': 20551680, 'steps': 40139, 'batch_loss/train': 0.797098396345973} 12/28/2021 06:01:35 - INFO - codeparrot_training - Step 40140: {'lr': 4.677733055130784e-05, 'samples': 20552192, 'steps': 40140, 'batch_loss/train': 0.7272909586317837} 12/28/2021 06:01:48 - INFO - codeparrot_training - Step 40141: {'lr': 4.676815024897052e-05, 'samples': 20552704, 'steps': 40141, 'batch_loss/train': 0.7645660217385739} 12/28/2021 06:01:58 - INFO - codeparrot_training - Step 40142: {'lr': 4.675897075460628e-05, 'samples': 20553216, 'steps': 40142, 'batch_loss/train': 1.5937109971418977} 12/28/2021 06:02:09 - INFO - codeparrot_training - Step 40143: {'lr': 4.674979206825164e-05, 'samples': 20553728, 'steps': 40143, 'batch_loss/train': 0.8191686095669866} 12/28/2021 06:02:21 - INFO - codeparrot_training - Step 40144: {'lr': 4.6740614189943075e-05, 'samples': 20554240, 'steps': 40144, 'batch_loss/train': 0.7516524894163013} 12/28/2021 06:02:32 - INFO - codeparrot_training - Step 40145: {'lr': 4.673143711971714e-05, 'samples': 20554752, 'steps': 40145, 'batch_loss/train': 0.9289237782359123} 12/28/2021 06:02:42 - INFO - codeparrot_training - Step 40146: {'lr': 4.672226085761014e-05, 'samples': 20555264, 'steps': 40146, 'batch_loss/train': 0.7269920455291867} 12/28/2021 06:02:53 - INFO - codeparrot_training - Step 40147: {'lr': 4.671308540365873e-05, 'samples': 20555776, 'steps': 40147, 'batch_loss/train': 0.7842437736690044} 12/28/2021 06:03:06 - INFO - codeparrot_training - Step 40148: {'lr': 4.6703910757899425e-05, 'samples': 20556288, 'steps': 40148, 
'batch_loss/train': 0.7669282825663686} 12/28/2021 06:03:16 - INFO - codeparrot_training - Step 40149: {'lr': 4.669473692036852e-05, 'samples': 20556800, 'steps': 40149, 'batch_loss/train': 0.7780309577938169} 12/28/2021 06:03:27 - INFO - codeparrot_training - Step 40150: {'lr': 4.668556389110251e-05, 'samples': 20557312, 'steps': 40150, 'batch_loss/train': 0.623958622221835} 12/28/2021 06:03:39 - INFO - codeparrot_training - Step 40151: {'lr': 4.6676391670138066e-05, 'samples': 20557824, 'steps': 40151, 'batch_loss/train': 0.7381182019598782} 12/28/2021 06:03:50 - INFO - codeparrot_training - Step 40152: {'lr': 4.666722025751147e-05, 'samples': 20558336, 'steps': 40152, 'batch_loss/train': 0.6137489060638472} 12/28/2021 06:04:00 - INFO - codeparrot_training - Step 40153: {'lr': 4.6658049653259206e-05, 'samples': 20558848, 'steps': 40153, 'batch_loss/train': 0.8699547164142132} 12/28/2021 06:04:13 - INFO - codeparrot_training - Step 40154: {'lr': 4.6648879857417784e-05, 'samples': 20559360, 'steps': 40154, 'batch_loss/train': 0.6783642712980509} 12/28/2021 06:04:23 - INFO - codeparrot_training - Step 40155: {'lr': 4.663971087002361e-05, 'samples': 20559872, 'steps': 40155, 'batch_loss/train': 0.7100474620237947} 12/28/2021 06:04:34 - INFO - codeparrot_training - Step 40156: {'lr': 4.6630542691113164e-05, 'samples': 20560384, 'steps': 40156, 'batch_loss/train': 0.6898130043409765} 12/28/2021 06:04:45 - INFO - codeparrot_training - Step 40157: {'lr': 4.6621375320722885e-05, 'samples': 20560896, 'steps': 40157, 'batch_loss/train': 0.9339557671919465} 12/28/2021 06:04:57 - INFO - codeparrot_training - Step 40158: {'lr': 4.6612208758889244e-05, 'samples': 20561408, 'steps': 40158, 'batch_loss/train': 0.5800666536670178} 12/28/2021 06:05:07 - INFO - codeparrot_training - Step 40159: {'lr': 4.660304300564863e-05, 'samples': 20561920, 'steps': 40159, 'batch_loss/train': 0.7479292238131166} 12/28/2021 06:05:18 - INFO - codeparrot_training - Step 40160: {'lr': 
4.659387806103763e-05, 'samples': 20562432, 'steps': 40160, 'batch_loss/train': 0.7104186154901981} 12/28/2021 06:05:31 - INFO - codeparrot_training - Step 40161: {'lr': 4.658471392509239e-05, 'samples': 20562944, 'steps': 40161, 'batch_loss/train': 0.674330823123455} 12/28/2021 06:05:41 - INFO - codeparrot_training - Step 40162: {'lr': 4.657555059784963e-05, 'samples': 20563456, 'steps': 40162, 'batch_loss/train': 0.6603686662856489} 12/28/2021 06:05:52 - INFO - codeparrot_training - Step 40163: {'lr': 4.656638807934571e-05, 'samples': 20563968, 'steps': 40163, 'batch_loss/train': 0.7181301829405129} 12/28/2021 06:06:04 - INFO - codeparrot_training - Step 40164: {'lr': 4.6557226369616955e-05, 'samples': 20564480, 'steps': 40164, 'batch_loss/train': 0.7908923365175724} 12/28/2021 06:06:15 - INFO - codeparrot_training - Step 40165: {'lr': 4.654806546869986e-05, 'samples': 20564992, 'steps': 40165, 'batch_loss/train': 0.6979473950341344} 12/28/2021 06:06:26 - INFO - codeparrot_training - Step 40166: {'lr': 4.653890537663083e-05, 'samples': 20565504, 'steps': 40166, 'batch_loss/train': 0.6934539144858718} 12/28/2021 06:06:38 - INFO - codeparrot_training - Step 40167: {'lr': 4.652974609344629e-05, 'samples': 20566016, 'steps': 40167, 'batch_loss/train': 0.5567430946975946} 12/28/2021 06:06:49 - INFO - codeparrot_training - Step 40168: {'lr': 4.6520587619182624e-05, 'samples': 20566528, 'steps': 40168, 'batch_loss/train': 0.7653859951533377} 12/28/2021 06:06:59 - INFO - codeparrot_training - Step 40169: {'lr': 4.6511429953876303e-05, 'samples': 20567040, 'steps': 40169, 'batch_loss/train': 0.7245139258448035} 12/28/2021 06:07:10 - INFO - codeparrot_training - Step 40170: {'lr': 4.650227309756369e-05, 'samples': 20567552, 'steps': 40170, 'batch_loss/train': 0.7280717361718416} 12/28/2021 06:07:22 - INFO - codeparrot_training - Step 40171: {'lr': 4.649311705028123e-05, 'samples': 20568064, 'steps': 40171, 'batch_loss/train': 0.8020648611709476} 12/28/2021 06:07:33 - INFO 
- codeparrot_training - Step 40172: {'lr': 4.648396181206527e-05, 'samples': 20568576, 'steps': 40172, 'batch_loss/train': 0.8018746161833405} 12/28/2021 06:07:44 - INFO - codeparrot_training - Step 40173: {'lr': 4.647480738295223e-05, 'samples': 20569088, 'steps': 40173, 'batch_loss/train': 0.8006339520215988} 12/28/2021 06:07:56 - INFO - codeparrot_training - Step 40174: {'lr': 4.6465653762978564e-05, 'samples': 20569600, 'steps': 40174, 'batch_loss/train': 0.6263004555366933} 12/28/2021 06:08:07 - INFO - codeparrot_training - Step 40175: {'lr': 4.6456500952180566e-05, 'samples': 20570112, 'steps': 40175, 'batch_loss/train': 0.6879673732910305} 12/28/2021 06:08:17 - INFO - codeparrot_training - Step 40176: {'lr': 4.64473489505946e-05, 'samples': 20570624, 'steps': 40176, 'batch_loss/train': 0.7384785963222384} 12/28/2021 06:08:29 - INFO - codeparrot_training - Step 40177: {'lr': 4.6438197758257224e-05, 'samples': 20571136, 'steps': 40177, 'batch_loss/train': 0.5585901099257171} 12/28/2021 06:08:40 - INFO - codeparrot_training - Step 40178: {'lr': 4.6429047375204676e-05, 'samples': 20571648, 'steps': 40178, 'batch_loss/train': 0.7729799980297685} 12/28/2021 06:08:51 - INFO - codeparrot_training - Step 40179: {'lr': 4.6419897801473265e-05, 'samples': 20572160, 'steps': 40179, 'batch_loss/train': 0.8727727336809039} 12/28/2021 06:09:01 - INFO - codeparrot_training - Step 40180: {'lr': 4.641074903709963e-05, 'samples': 20572672, 'steps': 40180, 'batch_loss/train': 0.7291929626371711} 12/28/2021 06:09:14 - INFO - codeparrot_training - Step 40181: {'lr': 4.640160108211991e-05, 'samples': 20573184, 'steps': 40181, 'batch_loss/train': 0.7603638991713524} 12/28/2021 06:09:25 - INFO - codeparrot_training - Step 40182: {'lr': 4.639245393657057e-05, 'samples': 20573696, 'steps': 40182, 'batch_loss/train': 0.7337395520880818} 12/28/2021 06:09:35 - INFO - codeparrot_training - Step 40183: {'lr': 4.638330760048792e-05, 'samples': 20574208, 'steps': 40183, 'batch_loss/train': 
0.7593145728460513} 12/28/2021 06:09:47 - INFO - codeparrot_training - Step 40184: {'lr': 4.637416207390838e-05, 'samples': 20574720, 'steps': 40184, 'batch_loss/train': 0.7211215626448393} 12/28/2021 06:09:58 - INFO - codeparrot_training - Step 40185: {'lr': 4.63650173568683e-05, 'samples': 20575232, 'steps': 40185, 'batch_loss/train': 0.9008082249201834} 12/28/2021 06:10:08 - INFO - codeparrot_training - Step 40186: {'lr': 4.635587344940404e-05, 'samples': 20575744, 'steps': 40186, 'batch_loss/train': 0.854013180360198} 12/28/2021 06:10:21 - INFO - codeparrot_training - Step 40187: {'lr': 4.6346730351551846e-05, 'samples': 20576256, 'steps': 40187, 'batch_loss/train': 0.7564545031636953} 12/28/2021 06:10:31 - INFO - codeparrot_training - Step 40188: {'lr': 4.63375880633482e-05, 'samples': 20576768, 'steps': 40188, 'batch_loss/train': 0.7453168296488002} 12/28/2021 06:10:42 - INFO - codeparrot_training - Step 40189: {'lr': 4.632844658482946e-05, 'samples': 20577280, 'steps': 40189, 'batch_loss/train': 0.7270262502133846} 12/28/2021 06:10:52 - INFO - codeparrot_training - Step 40190: {'lr': 4.631930591603187e-05, 'samples': 20577792, 'steps': 40190, 'batch_loss/train': 0.7835988942533731} 12/28/2021 06:11:05 - INFO - codeparrot_training - Step 40191: {'lr': 4.631016605699179e-05, 'samples': 20578304, 'steps': 40191, 'batch_loss/train': 0.7808220256119967} 12/28/2021 06:11:16 - INFO - codeparrot_training - Step 40192: {'lr': 4.6301027007745605e-05, 'samples': 20578816, 'steps': 40192, 'batch_loss/train': 0.7259357664734125} 12/28/2021 06:11:26 - INFO - codeparrot_training - Step 40193: {'lr': 4.62918887683296e-05, 'samples': 20579328, 'steps': 40193, 'batch_loss/train': 0.8264507446438074} 12/28/2021 06:11:39 - INFO - codeparrot_training - Step 40194: {'lr': 4.6282751338780154e-05, 'samples': 20579840, 'steps': 40194, 'batch_loss/train': 0.7525029049720615} 12/28/2021 06:11:49 - INFO - codeparrot_training - Step 40195: {'lr': 4.627361471913355e-05, 'samples': 
20580352, 'steps': 40195, 'batch_loss/train': 0.7075708326883614} 12/28/2021 06:12:00 - INFO - codeparrot_training - Step 40196: {'lr': 4.6264478909426116e-05, 'samples': 20580864, 'steps': 40196, 'batch_loss/train': 0.7560378313064575} 12/28/2021 06:12:12 - INFO - codeparrot_training - Step 40197: {'lr': 4.625534390969422e-05, 'samples': 20581376, 'steps': 40197, 'batch_loss/train': 0.7362106170621701} 12/28/2021 06:12:23 - INFO - codeparrot_training - Step 40198: {'lr': 4.624620971997412e-05, 'samples': 20581888, 'steps': 40198, 'batch_loss/train': 0.8208184936083853} 12/28/2021 06:12:33 - INFO - codeparrot_training - Step 40199: {'lr': 4.623707634030217e-05, 'samples': 20582400, 'steps': 40199, 'batch_loss/train': 0.6917662369087338} 12/28/2021 06:12:46 - INFO - codeparrot_training - Step 40200: {'lr': 4.622794377071474e-05, 'samples': 20582912, 'steps': 40200, 'batch_loss/train': 0.7291523981839418} 12/28/2021 06:12:57 - INFO - codeparrot_training - Step 40201: {'lr': 4.621881201124797e-05, 'samples': 20583424, 'steps': 40201, 'batch_loss/train': 0.7691547808935866} 12/28/2021 06:13:07 - INFO - codeparrot_training - Step 40202: {'lr': 4.620968106193821e-05, 'samples': 20583936, 'steps': 40202, 'batch_loss/train': 0.8545965468510985} 12/28/2021 06:13:18 - INFO - codeparrot_training - Step 40203: {'lr': 4.620055092282194e-05, 'samples': 20584448, 'steps': 40203, 'batch_loss/train': 0.8856263877823949} 12/28/2021 06:13:30 - INFO - codeparrot_training - Step 40204: {'lr': 4.6191421593935244e-05, 'samples': 20584960, 'steps': 40204, 'batch_loss/train': 0.6945271869190037} 12/28/2021 06:13:41 - INFO - codeparrot_training - Step 40205: {'lr': 4.6182293075314464e-05, 'samples': 20585472, 'steps': 40205, 'batch_loss/train': 0.794737797928974} 12/28/2021 06:13:52 - INFO - codeparrot_training - Step 40206: {'lr': 4.6173165366996064e-05, 'samples': 20585984, 'steps': 40206, 'batch_loss/train': 0.7196602271869779} 12/28/2021 06:14:05 - INFO - codeparrot_training - Step 
40207: {'lr': 4.616403846901609e-05, 'samples': 20586496, 'steps': 40207, 'batch_loss/train': 0.5729159782058559} 12/28/2021 06:14:15 - INFO - codeparrot_training - Step 40208: {'lr': 4.615491238141098e-05, 'samples': 20587008, 'steps': 40208, 'batch_loss/train': 0.7088658227585256} 12/28/2021 06:14:26 - INFO - codeparrot_training - Step 40209: {'lr': 4.6145787104216934e-05, 'samples': 20587520, 'steps': 40209, 'batch_loss/train': 0.6995396646670997} 12/28/2021 06:14:38 - INFO - codeparrot_training - Step 40210: {'lr': 4.613666263747029e-05, 'samples': 20588032, 'steps': 40210, 'batch_loss/train': 1.024068814702332} 12/28/2021 06:14:49 - INFO - codeparrot_training - Step 40211: {'lr': 4.612753898120728e-05, 'samples': 20588544, 'steps': 40211, 'batch_loss/train': 0.6899694236926734} 12/28/2021 06:15:00 - INFO - codeparrot_training - Step 40212: {'lr': 4.611841613546428e-05, 'samples': 20589056, 'steps': 40212, 'batch_loss/train': 0.7376313647255301} 12/28/2021 06:15:10 - INFO - codeparrot_training - Step 40213: {'lr': 4.6109294100277337e-05, 'samples': 20589568, 'steps': 40213, 'batch_loss/train': 0.7735328169655986} 12/28/2021 06:15:22 - INFO - codeparrot_training - Step 40214: {'lr': 4.610017287568291e-05, 'samples': 20590080, 'steps': 40214, 'batch_loss/train': 0.7333654477261007} 12/28/2021 06:15:33 - INFO - codeparrot_training - Step 40215: {'lr': 4.609105246171727e-05, 'samples': 20590592, 'steps': 40215, 'batch_loss/train': 0.6854652261827141} 12/28/2021 06:15:44 - INFO - codeparrot_training - Step 40216: {'lr': 4.60819328584165e-05, 'samples': 20591104, 'steps': 40216, 'batch_loss/train': 0.7847662142012268} 12/28/2021 06:15:57 - INFO - codeparrot_training - Step 40217: {'lr': 4.607281406581701e-05, 'samples': 20591616, 'steps': 40217, 'batch_loss/train': 0.727213722653687} 12/28/2021 06:16:07 - INFO - codeparrot_training - Step 40218: {'lr': 4.606369608395511e-05, 'samples': 20592128, 'steps': 40218, 'batch_loss/train': 0.748876495170407} 12/28/2021 
06:16:18 - INFO - codeparrot_training - Step 40219: {'lr': 4.605457891286685e-05, 'samples': 20592640, 'steps': 40219, 'batch_loss/train': 0.7545107076875865} 12/28/2021 06:16:30 - INFO - codeparrot_training - Step 40220: {'lr': 4.6045462552588595e-05, 'samples': 20593152, 'steps': 40220, 'batch_loss/train': 0.6624656743369997} 12/28/2021 06:16:41 - INFO - codeparrot_training - Step 40221: {'lr': 4.603634700315657e-05, 'samples': 20593664, 'steps': 40221, 'batch_loss/train': 0.7996667558327317} 12/28/2021 06:16:51 - INFO - codeparrot_training - Step 40222: {'lr': 4.6027232264607034e-05, 'samples': 20594176, 'steps': 40222, 'batch_loss/train': 0.7159662208869122} 12/28/2021 06:17:02 - INFO - codeparrot_training - Step 40223: {'lr': 4.601811833697617e-05, 'samples': 20594688, 'steps': 40223, 'batch_loss/train': 0.8244710927829146} 12/28/2021 06:17:14 - INFO - codeparrot_training - Step 40224: {'lr': 4.600900522030027e-05, 'samples': 20595200, 'steps': 40224, 'batch_loss/train': 0.6315179578959942} 12/28/2021 06:17:25 - INFO - codeparrot_training - Step 40225: {'lr': 4.5999892914615536e-05, 'samples': 20595712, 'steps': 40225, 'batch_loss/train': 0.708378707524389} 12/28/2021 06:17:35 - INFO - codeparrot_training - Step 40226: {'lr': 4.599078141995819e-05, 'samples': 20596224, 'steps': 40226, 'batch_loss/train': 0.676982197444886} 12/28/2021 06:17:47 - INFO - codeparrot_training - Step 40227: {'lr': 4.5981670736364554e-05, 'samples': 20596736, 'steps': 40227, 'batch_loss/train': 0.7785528632812202} 12/28/2021 06:17:58 - INFO - codeparrot_training - Step 40228: {'lr': 4.597256086387061e-05, 'samples': 20597248, 'steps': 40228, 'batch_loss/train': 0.8378266557119787} 12/28/2021 06:18:09 - INFO - codeparrot_training - Step 40229: {'lr': 4.596345180251285e-05, 'samples': 20597760, 'steps': 40229, 'batch_loss/train': 0.715711553581059} 12/28/2021 06:18:22 - INFO - codeparrot_training - Step 40230: {'lr': 4.5954343552327334e-05, 'samples': 20598272, 'steps': 40230, 
'batch_loss/train': 0.7938433862291276} 12/28/2021 06:18:32 - INFO - codeparrot_training - Step 40231: {'lr': 4.594523611335022e-05, 'samples': 20598784, 'steps': 40231, 'batch_loss/train': 0.6824783040210605} 12/28/2021 06:18:43 - INFO - codeparrot_training - Step 40232: {'lr': 4.593612948561793e-05, 'samples': 20599296, 'steps': 40232, 'batch_loss/train': 0.6440543931676075} 12/28/2021 06:18:54 - INFO - codeparrot_training - Step 40233: {'lr': 4.5927023669166456e-05, 'samples': 20599808, 'steps': 40233, 'batch_loss/train': 0.7693226784467697} 12/28/2021 06:19:06 - INFO - codeparrot_training - Step 40234: {'lr': 4.591791866403208e-05, 'samples': 20600320, 'steps': 40234, 'batch_loss/train': 0.9088359652087092} 12/28/2021 06:19:17 - INFO - codeparrot_training - Step 40235: {'lr': 4.5908814470251023e-05, 'samples': 20600832, 'steps': 40235, 'batch_loss/train': 0.7398194647394121} 12/28/2021 06:19:27 - INFO - codeparrot_training - Step 40236: {'lr': 4.5899711087859455e-05, 'samples': 20601344, 'steps': 40236, 'batch_loss/train': 0.6731044850312173} 12/28/2021 06:19:39 - INFO - codeparrot_training - Step 40237: {'lr': 4.5890608516893554e-05, 'samples': 20601856, 'steps': 40237, 'batch_loss/train': 0.7905208840966225} 12/28/2021 06:19:50 - INFO - codeparrot_training - Step 40238: {'lr': 4.5881506757389615e-05, 'samples': 20602368, 'steps': 40238, 'batch_loss/train': 0.7808189447969198} 12/28/2021 06:20:01 - INFO - codeparrot_training - Step 40239: {'lr': 4.587240580938359e-05, 'samples': 20602880, 'steps': 40239, 'batch_loss/train': 0.5924371783621609} 12/28/2021 06:20:13 - INFO - codeparrot_training - Step 40240: {'lr': 4.586330567291186e-05, 'samples': 20603392, 'steps': 40240, 'batch_loss/train': 0.6454531333874911} 12/28/2021 06:20:24 - INFO - codeparrot_training - Step 40241: {'lr': 4.585420634801063e-05, 'samples': 20603904, 'steps': 40241, 'batch_loss/train': 0.9963292954489589} 12/28/2021 06:20:35 - INFO - codeparrot_training - Step 40242: {'lr': 
4.5845107834715846e-05, 'samples': 20604416, 'steps': 40242, 'batch_loss/train': 0.7926731873303652} 12/28/2021 06:20:45 - INFO - codeparrot_training - Step 40243: {'lr': 4.58360101330639e-05, 'samples': 20604928, 'steps': 40243, 'batch_loss/train': 0.7621364649385214} 12/28/2021 06:20:57 - INFO - codeparrot_training - Step 40244: {'lr': 4.5826913243090964e-05, 'samples': 20605440, 'steps': 40244, 'batch_loss/train': 0.7518322910182178} 12/28/2021 06:21:08 - INFO - codeparrot_training - Step 40245: {'lr': 4.5817817164833045e-05, 'samples': 20605952, 'steps': 40245, 'batch_loss/train': 0.7710496479412541} 12/28/2021 06:21:19 - INFO - codeparrot_training - Step 40246: {'lr': 4.580872189832638e-05, 'samples': 20606464, 'steps': 40246, 'batch_loss/train': 0.743881743401289} 12/28/2021 06:21:32 - INFO - codeparrot_training - Step 40247: {'lr': 4.579962744360716e-05, 'samples': 20606976, 'steps': 40247, 'batch_loss/train': 0.8423977182246745} 12/28/2021 06:21:43 - INFO - codeparrot_training - Step 40248: {'lr': 4.579053380071152e-05, 'samples': 20607488, 'steps': 40248, 'batch_loss/train': 0.7508388366550207} 12/28/2021 06:21:53 - INFO - codeparrot_training - Step 40249: {'lr': 4.57814409696756e-05, 'samples': 20608000, 'steps': 40249, 'batch_loss/train': 0.4346784856170416} 12/28/2021 06:22:04 - INFO - codeparrot_training - Step 40250: {'lr': 4.5772348950535556e-05, 'samples': 20608512, 'steps': 40250, 'batch_loss/train': 1.7507757423445582} 12/28/2021 06:22:16 - INFO - codeparrot_training - Step 40251: {'lr': 4.576325774332757e-05, 'samples': 20609024, 'steps': 40251, 'batch_loss/train': 0.7039974995423108} 12/28/2021 06:22:27 - INFO - codeparrot_training - Step 40252: {'lr': 4.575416734808771e-05, 'samples': 20609536, 'steps': 40252, 'batch_loss/train': 0.7581706494092941} 12/28/2021 06:22:37 - INFO - codeparrot_training - Step 40253: {'lr': 4.574507776485218e-05, 'samples': 20610048, 'steps': 40253, 'batch_loss/train': 0.6803680490702391} 12/28/2021 06:22:49 - INFO - 
codeparrot_training - Step 40254: {'lr': 4.5735988993657106e-05, 'samples': 20610560, 'steps': 40254, 'batch_loss/train': 0.7942255488596857} 12/28/2021 06:23:00 - INFO - codeparrot_training - Step 40255: {'lr': 4.572690103453866e-05, 'samples': 20611072, 'steps': 40255, 'batch_loss/train': 0.6663732214365155} 12/28/2021 06:23:11 - INFO - codeparrot_training - Step 40256: {'lr': 4.571781388753288e-05, 'samples': 20611584, 'steps': 40256, 'batch_loss/train': 0.5522491771262139} 12/28/2021 06:23:24 - INFO - codeparrot_training - Step 40257: {'lr': 4.570872755267586e-05, 'samples': 20612096, 'steps': 40257, 'batch_loss/train': 0.863415963947773} 12/28/2021 06:23:34 - INFO - codeparrot_training - Step 40258: {'lr': 4.5699642030003927e-05, 'samples': 20612608, 'steps': 40258, 'batch_loss/train': 0.6366765582934022} 12/28/2021 06:23:45 - INFO - codeparrot_training - Step 40259: {'lr': 4.569055731955305e-05, 'samples': 20613120, 'steps': 40259, 'batch_loss/train': 0.7557388048153371} 12/28/2021 06:23:55 - INFO - codeparrot_training - Step 40260: {'lr': 4.568147342135934e-05, 'samples': 20613632, 'steps': 40260, 'batch_loss/train': 0.6848362879827619} 12/28/2021 06:24:08 - INFO - codeparrot_training - Step 40261: {'lr': 4.567239033545895e-05, 'samples': 20614144, 'steps': 40261, 'batch_loss/train': 0.6847449662163854} 12/28/2021 06:24:18 - INFO - codeparrot_training - Step 40262: {'lr': 4.5663308061887956e-05, 'samples': 20614656, 'steps': 40262, 'batch_loss/train': 0.6524062184616923} 12/28/2021 06:24:29 - INFO - codeparrot_training - Step 40263: {'lr': 4.565422660068253e-05, 'samples': 20615168, 'steps': 40263, 'batch_loss/train': 0.6466153231449425} 12/28/2021 06:24:41 - INFO - codeparrot_training - Step 40264: {'lr': 4.56451459518788e-05, 'samples': 20615680, 'steps': 40264, 'batch_loss/train': 0.7250316496938467} 12/28/2021 06:24:52 - INFO - codeparrot_training - Step 40265: {'lr': 4.5636066115512657e-05, 'samples': 20616192, 'steps': 40265, 'batch_loss/train': 
0.7624264745973051} 12/28/2021 06:25:02 - INFO - codeparrot_training - Step 40266: {'lr': 4.5626987091620443e-05, 'samples': 20616704, 'steps': 40266, 'batch_loss/train': 0.9628012962639332} 12/28/2021 06:25:15 - INFO - codeparrot_training - Step 40267: {'lr': 4.561790888023823e-05, 'samples': 20617216, 'steps': 40267, 'batch_loss/train': 0.7458269856870174} 12/28/2021 06:25:25 - INFO - codeparrot_training - Step 40268: {'lr': 4.560883148140188e-05, 'samples': 20617728, 'steps': 40268, 'batch_loss/train': 0.6666088649071753} 12/28/2021 06:25:36 - INFO - codeparrot_training - Step 40269: {'lr': 4.559975489514773e-05, 'samples': 20618240, 'steps': 40269, 'batch_loss/train': 0.8917230033548549} 12/28/2021 06:25:49 - INFO - codeparrot_training - Step 40270: {'lr': 4.559067912151182e-05, 'samples': 20618752, 'steps': 40270, 'batch_loss/train': 0.7990964390337467} 12/28/2021 06:25:59 - INFO - codeparrot_training - Step 40271: {'lr': 4.558160416053015e-05, 'samples': 20619264, 'steps': 40271, 'batch_loss/train': 0.6644421459641308} 12/28/2021 06:26:10 - INFO - codeparrot_training - Step 40272: {'lr': 4.557253001223874e-05, 'samples': 20619776, 'steps': 40272, 'batch_loss/train': 0.8393463417887688} 12/28/2021 06:26:21 - INFO - codeparrot_training - Step 40273: {'lr': 4.556345667667391e-05, 'samples': 20620288, 'steps': 40273, 'batch_loss/train': 0.7972837402485311} 12/28/2021 06:26:33 - INFO - codeparrot_training - Step 40274: {'lr': 4.5554384153871505e-05, 'samples': 20620800, 'steps': 40274, 'batch_loss/train': 0.7739929230883718} 12/28/2021 06:26:43 - INFO - codeparrot_training - Step 40275: {'lr': 4.55453124438677e-05, 'samples': 20621312, 'steps': 40275, 'batch_loss/train': 0.7953910250216722} 12/28/2021 06:26:54 - INFO - codeparrot_training - Step 40276: {'lr': 4.553624154669853e-05, 'samples': 20621824, 'steps': 40276, 'batch_loss/train': 0.7936930847354233} 12/28/2021 06:27:07 - INFO - codeparrot_training - Step 40277: {'lr': 4.5527171462400044e-05, 'samples': 
20622336, 'steps': 40277, 'batch_loss/train': 0.7505458142259158} 12/28/2021 06:27:18 - INFO - codeparrot_training - Step 40278: {'lr': 4.551810219100833e-05, 'samples': 20622848, 'steps': 40278, 'batch_loss/train': 0.696972418227233} 12/28/2021 06:27:28 - INFO - codeparrot_training - Step 40279: {'lr': 4.550903373255941e-05, 'samples': 20623360, 'steps': 40279, 'batch_loss/train': 0.772690094076097} 12/28/2021 06:27:40 - INFO - codeparrot_training - Step 40280: {'lr': 4.549996608708937e-05, 'samples': 20623872, 'steps': 40280, 'batch_loss/train': 0.6245235684327781} 12/28/2021 06:27:51 - INFO - codeparrot_training - Step 40281: {'lr': 4.549089925463432e-05, 'samples': 20624384, 'steps': 40281, 'batch_loss/train': 0.6893124175257981} 12/28/2021 06:28:02 - INFO - codeparrot_training - Step 40282: {'lr': 4.548183323523017e-05, 'samples': 20624896, 'steps': 40282, 'batch_loss/train': 0.6867762189358473} 12/28/2021 06:28:12 - INFO - codeparrot_training - Step 40283: {'lr': 4.547276802891295e-05, 'samples': 20625408, 'steps': 40283, 'batch_loss/train': 0.7544021687936038} 12/28/2021 06:28:25 - INFO - codeparrot_training - Step 40284: {'lr': 4.546370363571892e-05, 'samples': 20625920, 'steps': 40284, 'batch_loss/train': 0.7631574403494596} 12/28/2021 06:28:35 - INFO - codeparrot_training - Step 40285: {'lr': 4.545464005568392e-05, 'samples': 20626432, 'steps': 40285, 'batch_loss/train': 0.6919530352461152} 12/28/2021 06:28:46 - INFO - codeparrot_training - Step 40286: {'lr': 4.544557728884402e-05, 'samples': 20626944, 'steps': 40286, 'batch_loss/train': 0.6514772013761103} 12/28/2021 06:28:59 - INFO - codeparrot_training - Step 40287: {'lr': 4.543651533523527e-05, 'samples': 20627456, 'steps': 40287, 'batch_loss/train': 0.785541869699955} 12/28/2021 06:29:09 - INFO - codeparrot_training - Step 40288: {'lr': 4.542745419489372e-05, 'samples': 20627968, 'steps': 40288, 'batch_loss/train': 0.7480772412382066} 12/28/2021 06:29:20 - INFO - codeparrot_training - Step 40289: 
{'lr': 4.541839386785535e-05, 'samples': 20628480, 'steps': 40289, 'batch_loss/train': 0.792255156673491} 12/28/2021 06:29:32 - INFO - codeparrot_training - Step 40290: {'lr': 4.540933435415617e-05, 'samples': 20628992, 'steps': 40290, 'batch_loss/train': 0.6860524406656623} 12/28/2021 06:29:43 - INFO - codeparrot_training - Step 40291: {'lr': 4.540027565383226e-05, 'samples': 20629504, 'steps': 40291, 'batch_loss/train': 0.680097867269069} 12/28/2021 06:29:54 - INFO - codeparrot_training - Step 40292: {'lr': 4.5391217766919595e-05, 'samples': 20630016, 'steps': 40292, 'batch_loss/train': 0.681931477971375} 12/28/2021 06:30:04 - INFO - codeparrot_training - Step 40293: {'lr': 4.538216069345427e-05, 'samples': 20630528, 'steps': 40293, 'batch_loss/train': 0.7966600758954883} 12/28/2021 06:30:17 - INFO - codeparrot_training - Step 40294: {'lr': 4.5373104433472064e-05, 'samples': 20631040, 'steps': 40294, 'batch_loss/train': 0.7557394681498408} 12/28/2021 06:30:28 - INFO - codeparrot_training - Step 40295: {'lr': 4.5364048987009187e-05, 'samples': 20631552, 'steps': 40295, 'batch_loss/train': 0.564502936322242} 12/28/2021 06:30:38 - INFO - codeparrot_training - Step 40296: {'lr': 4.5354994354101675e-05, 'samples': 20632064, 'steps': 40296, 'batch_loss/train': 0.7572728506056592} 12/28/2021 06:30:50 - INFO - codeparrot_training - Step 40297: {'lr': 4.5345940534785356e-05, 'samples': 20632576, 'steps': 40297, 'batch_loss/train': 0.7275639334693551} 12/28/2021 06:31:01 - INFO - codeparrot_training - Step 40298: {'lr': 4.533688752909623e-05, 'samples': 20633088, 'steps': 40298, 'batch_loss/train': 0.6040925492416136} 12/28/2021 06:31:12 - INFO - codeparrot_training - Step 40299: {'lr': 4.532783533707049e-05, 'samples': 20633600, 'steps': 40299, 'batch_loss/train': 0.6393568261992186} 12/28/2021 06:31:24 - INFO - codeparrot_training - Step 40300: {'lr': 4.5318783958743946e-05, 'samples': 20634112, 'steps': 40300, 'batch_loss/train': 0.6654426899040118} 12/28/2021 06:31:34 
- INFO - codeparrot_training - Step 40301: {'lr': 4.530973339415262e-05, 'samples': 20634624, 'steps': 40301, 'batch_loss/train': 0.7517715645954013} 12/28/2021 06:31:45 - INFO - codeparrot_training - Step 40302: {'lr': 4.5300683643332506e-05, 'samples': 20635136, 'steps': 40302, 'batch_loss/train': 0.7570774261839688} 12/28/2021 06:31:56 - INFO - codeparrot_training - Step 40303: {'lr': 4.5291634706319583e-05, 'samples': 20635648, 'steps': 40303, 'batch_loss/train': 0.6969921824056655} 12/28/2021 06:32:08 - INFO - codeparrot_training - Step 40304: {'lr': 4.528258658314985e-05, 'samples': 20636160, 'steps': 40304, 'batch_loss/train': 0.6894847080111504} 12/28/2021 06:32:18 - INFO - codeparrot_training - Step 40305: {'lr': 4.527353927385924e-05, 'samples': 20636672, 'steps': 40305, 'batch_loss/train': 0.644744984805584} 12/28/2021 06:32:29 - INFO - codeparrot_training - Step 40306: {'lr': 4.5264492778483714e-05, 'samples': 20637184, 'steps': 40306, 'batch_loss/train': 0.7197618130594492} 12/28/2021 06:32:42 - INFO - codeparrot_training - Step 40307: {'lr': 4.5255447097059285e-05, 'samples': 20637696, 'steps': 40307, 'batch_loss/train': 0.7548554753884673} 12/28/2021 06:32:52 - INFO - codeparrot_training - Step 40308: {'lr': 4.524640222962195e-05, 'samples': 20638208, 'steps': 40308, 'batch_loss/train': 0.7267511712852865} 12/28/2021 06:33:03 - INFO - codeparrot_training - Step 40309: {'lr': 4.523735817620747e-05, 'samples': 20638720, 'steps': 40309, 'batch_loss/train': 0.7412265175953507} 12/28/2021 06:33:15 - INFO - codeparrot_training - Step 40310: {'lr': 4.5228314936852064e-05, 'samples': 20639232, 'steps': 40310, 'batch_loss/train': 0.6586296865716577} 12/28/2021 06:33:26 - INFO - codeparrot_training - Step 40311: {'lr': 4.5219272511591494e-05, 'samples': 20639744, 'steps': 40311, 'batch_loss/train': 0.668377501424402} 12/28/2021 06:33:37 - INFO - codeparrot_training - Step 40312: {'lr': 4.5210230900461805e-05, 'samples': 20640256, 'steps': 40312, 
'batch_loss/train': 0.6921990740811452} 12/28/2021 06:33:49 - INFO - codeparrot_training - Step 40313: {'lr': 4.520119010349888e-05, 'samples': 20640768, 'steps': 40313, 'batch_loss/train': 0.6982037564739585} 12/28/2021 06:33:59 - INFO - codeparrot_training - Step 40314: {'lr': 4.519215012073871e-05, 'samples': 20641280, 'steps': 40314, 'batch_loss/train': 0.6607424018438905} 12/28/2021 06:34:10 - INFO - codeparrot_training - Step 40315: {'lr': 4.518311095221722e-05, 'samples': 20641792, 'steps': 40315, 'batch_loss/train': 0.7692979229614139} 12/28/2021 06:34:20 - INFO - codeparrot_training - Step 40316: {'lr': 4.5174072597970324e-05, 'samples': 20642304, 'steps': 40316, 'batch_loss/train': 0.7494422132149339} 12/28/2021 06:34:33 - INFO - codeparrot_training - Step 40317: {'lr': 4.516503505803399e-05, 'samples': 20642816, 'steps': 40317, 'batch_loss/train': 0.7915421957150102} 12/28/2021 06:34:44 - INFO - codeparrot_training - Step 40318: {'lr': 4.515599833244413e-05, 'samples': 20643328, 'steps': 40318, 'batch_loss/train': 0.7614111355505884} 12/28/2021 06:34:55 - INFO - codeparrot_training - Step 40319: {'lr': 4.514696242123675e-05, 'samples': 20643840, 'steps': 40319, 'batch_loss/train': 0.8241767575964332} 12/28/2021 06:35:07 - INFO - codeparrot_training - Step 40320: {'lr': 4.5137927324447575e-05, 'samples': 20644352, 'steps': 40320, 'batch_loss/train': 0.7231407444924116} 12/28/2021 06:35:18 - INFO - codeparrot_training - Step 40321: {'lr': 4.512889304211268e-05, 'samples': 20644864, 'steps': 40321, 'batch_loss/train': 0.8183887973427773} 12/28/2021 06:35:28 - INFO - codeparrot_training - Step 40322: {'lr': 4.511985957426803e-05, 'samples': 20645376, 'steps': 40322, 'batch_loss/train': 0.8192988319788128} 12/28/2021 06:35:41 - INFO - codeparrot_training - Step 40323: {'lr': 4.511082692094942e-05, 'samples': 20645888, 'steps': 40323, 'batch_loss/train': 0.8066581599414349} 12/28/2021 06:35:52 - INFO - codeparrot_training - Step 40324: {'lr': 
4.51017950821927e-05, 'samples': 20646400, 'steps': 40324, 'batch_loss/train': 0.6354812462814152} 12/28/2021 06:36:02 - INFO - codeparrot_training - Step 40325: {'lr': 4.509276405803403e-05, 'samples': 20646912, 'steps': 40325, 'batch_loss/train': 0.726825232617557} 12/28/2021 06:36:13 - INFO - codeparrot_training - Step 40326: {'lr': 4.508373384850908e-05, 'samples': 20647424, 'steps': 40326, 'batch_loss/train': 0.7103064735420048} 12/28/2021 06:36:25 - INFO - codeparrot_training - Step 40327: {'lr': 4.507470445365383e-05, 'samples': 20647936, 'steps': 40327, 'batch_loss/train': 0.7486369023099542} 12/28/2021 06:36:36 - INFO - codeparrot_training - Step 40328: {'lr': 4.5065675873504187e-05, 'samples': 20648448, 'steps': 40328, 'batch_loss/train': 0.6423563386779279} 12/28/2021 06:36:46 - INFO - codeparrot_training - Step 40329: {'lr': 4.505664810809604e-05, 'samples': 20648960, 'steps': 40329, 'batch_loss/train': 0.6429315954446793} 12/28/2021 06:36:59 - INFO - codeparrot_training - Step 40330: {'lr': 4.5047621157465244e-05, 'samples': 20649472, 'steps': 40330, 'batch_loss/train': 0.713623245479539} 12/28/2021 06:37:09 - INFO - codeparrot_training - Step 40331: {'lr': 4.5038595021647756e-05, 'samples': 20649984, 'steps': 40331, 'batch_loss/train': 0.601132112351479} 12/28/2021 06:37:20 - INFO - codeparrot_training - Step 40332: {'lr': 4.5029569700679425e-05, 'samples': 20650496, 'steps': 40332, 'batch_loss/train': 0.7046808041632175} 12/28/2021 06:37:31 - INFO - codeparrot_training - Step 40333: {'lr': 4.502054519459611e-05, 'samples': 20651008, 'steps': 40333, 'batch_loss/train': 0.6328116709191818} 12/28/2021 06:37:43 - INFO - codeparrot_training - Step 40334: {'lr': 4.5011521503433766e-05, 'samples': 20651520, 'steps': 40334, 'batch_loss/train': 0.6843619390856475} 12/28/2021 06:37:53 - INFO - codeparrot_training - Step 40335: {'lr': 4.5002498627228103e-05, 'samples': 20652032, 'steps': 40335, 'batch_loss/train': 0.6055632308125496} 12/28/2021 06:38:04 - INFO 
- codeparrot_training - Step 40336: {'lr': 4.499347656601524e-05, 'samples': 20652544, 'steps': 40336, 'batch_loss/train': 0.7485259051900357} 12/28/2021 06:38:17 - INFO - codeparrot_training - Step 40337: {'lr': 4.498445531983081e-05, 'samples': 20653056, 'steps': 40337, 'batch_loss/train': 0.7878321474418044} 12/28/2021 06:38:27 - INFO - codeparrot_training - Step 40338: {'lr': 4.4975434888710806e-05, 'samples': 20653568, 'steps': 40338, 'batch_loss/train': 0.7925006484147161} 12/28/2021 06:38:38 - INFO - codeparrot_training - Step 40339: {'lr': 4.496641527269105e-05, 'samples': 20654080, 'steps': 40339, 'batch_loss/train': 0.6955664535053074} 12/28/2021 06:38:50 - INFO - codeparrot_training - Step 40340: {'lr': 4.495739647180741e-05, 'samples': 20654592, 'steps': 40340, 'batch_loss/train': 0.7116742162033916} 12/28/2021 06:39:01 - INFO - codeparrot_training - Step 40341: {'lr': 4.494837848609573e-05, 'samples': 20655104, 'steps': 40341, 'batch_loss/train': 0.7096735220984556} 12/28/2021 06:39:12 - INFO - codeparrot_training - Step 40342: {'lr': 4.493936131559187e-05, 'samples': 20655616, 'steps': 40342, 'batch_loss/train': 0.7167394198477268} 12/28/2021 06:39:24 - INFO - codeparrot_training - Step 40343: {'lr': 4.493034496033169e-05, 'samples': 20656128, 'steps': 40343, 'batch_loss/train': 0.7696335611399263} 12/28/2021 06:39:35 - INFO - codeparrot_training - Step 40344: {'lr': 4.492132942035104e-05, 'samples': 20656640, 'steps': 40344, 'batch_loss/train': 0.8068884732201695} 12/28/2021 06:39:45 - INFO - codeparrot_training - Step 40345: {'lr': 4.49123146956858e-05, 'samples': 20657152, 'steps': 40345, 'batch_loss/train': 0.6970208091661334} 12/28/2021 06:39:56 - INFO - codeparrot_training - Step 40346: {'lr': 4.490330078637161e-05, 'samples': 20657664, 'steps': 40346, 'batch_loss/train': 0.8049480179324746} 12/28/2021 06:40:09 - INFO - codeparrot_training - Step 40347: {'lr': 4.4894287692444555e-05, 'samples': 20658176, 'steps': 40347, 'batch_loss/train': 
0.686192176071927} 12/28/2021 06:40:19 - INFO - codeparrot_training - Step 40348: {'lr': 4.4885275413940414e-05, 'samples': 20658688, 'steps': 40348, 'batch_loss/train': 0.6632946697063744} 12/28/2021 06:40:30 - INFO - codeparrot_training - Step 40349: {'lr': 4.4876263950894916e-05, 'samples': 20659200, 'steps': 40349, 'batch_loss/train': 0.76527691911906} 12/28/2021 06:40:42 - INFO - codeparrot_training - Step 40350: {'lr': 4.486725330334387e-05, 'samples': 20659712, 'steps': 40350, 'batch_loss/train': 0.6463152086362243} 12/28/2021 06:40:53 - INFO - codeparrot_training - Step 40351: {'lr': 4.485824347132328e-05, 'samples': 20660224, 'steps': 40351, 'batch_loss/train': 0.7435352597385645} 12/28/2021 06:41:03 - INFO - codeparrot_training - Step 40352: {'lr': 4.4849234454868826e-05, 'samples': 20660736, 'steps': 40352, 'batch_loss/train': 0.6508435532450676} 12/28/2021 06:41:15 - INFO - codeparrot_training - Step 40353: {'lr': 4.4840226254016266e-05, 'samples': 20661248, 'steps': 40353, 'batch_loss/train': 0.8452808046713471} 12/28/2021 06:41:26 - INFO - codeparrot_training - Step 40354: {'lr': 4.483121886880162e-05, 'samples': 20661760, 'steps': 40354, 'batch_loss/train': 0.7802135436795652} 12/28/2021 06:41:37 - INFO - codeparrot_training - Step 40355: {'lr': 4.4822212299260534e-05, 'samples': 20662272, 'steps': 40355, 'batch_loss/train': 0.7926875883713365} 12/28/2021 06:41:47 - INFO - codeparrot_training - Step 40356: {'lr': 4.4813206545428865e-05, 'samples': 20662784, 'steps': 40356, 'batch_loss/train': 0.702887165651191} 12/28/2021 06:42:00 - INFO - codeparrot_training - Step 40357: {'lr': 4.4804201607342386e-05, 'samples': 20663296, 'steps': 40357, 'batch_loss/train': 0.7504991827299818} 12/28/2021 06:42:11 - INFO - codeparrot_training - Step 40358: {'lr': 4.4795197485036964e-05, 'samples': 20663808, 'steps': 40358, 'batch_loss/train': 0.7805438591167331} 12/28/2021 06:42:21 - INFO - codeparrot_training - Step 40359: {'lr': 4.4786194178548316e-05, 'samples': 
20664320, 'steps': 40359, 'batch_loss/train': 0.7715212716720998} 12/28/2021 06:42:34 - INFO - codeparrot_training - Step 40360: {'lr': 4.477719168791236e-05, 'samples': 20664832, 'steps': 40360, 'batch_loss/train': 0.7841995302587748} 12/28/2021 06:42:44 - INFO - codeparrot_training - Step 40361: {'lr': 4.476819001316465e-05, 'samples': 20665344, 'steps': 40361, 'batch_loss/train': 0.6795927661005408} 12/28/2021 06:42:55 - INFO - codeparrot_training - Step 40362: {'lr': 4.4759189154341214e-05, 'samples': 20665856, 'steps': 40362, 'batch_loss/train': 0.7366246108431369} 12/28/2021 06:43:07 - INFO - codeparrot_training - Step 40363: {'lr': 4.47501891114778e-05, 'samples': 20666368, 'steps': 40363, 'batch_loss/train': 0.8261316968128085} 12/28/2021 06:43:18 - INFO - codeparrot_training - Step 40364: {'lr': 4.4741189884609986e-05, 'samples': 20666880, 'steps': 40364, 'batch_loss/train': 0.7273365771397948} 12/28/2021 06:43:29 - INFO - codeparrot_training - Step 40365: {'lr': 4.473219147377386e-05, 'samples': 20667392, 'steps': 40365, 'batch_loss/train': 0.721334948670119} 12/28/2021 06:43:41 - INFO - codeparrot_training - Step 40366: {'lr': 4.4723193879004944e-05, 'samples': 20667904, 'steps': 40366, 'batch_loss/train': 0.745970893651247} 12/28/2021 06:43:51 - INFO - codeparrot_training - Step 40367: {'lr': 4.471419710033911e-05, 'samples': 20668416, 'steps': 40367, 'batch_loss/train': 0.8160071412567049} 12/28/2021 06:44:02 - INFO - codeparrot_training - Step 40368: {'lr': 4.470520113781212e-05, 'samples': 20668928, 'steps': 40368, 'batch_loss/train': 0.7659455267712474} 12/28/2021 06:44:13 - INFO - codeparrot_training - Step 40369: {'lr': 4.4696205991459737e-05, 'samples': 20669440, 'steps': 40369, 'batch_loss/train': 0.6840039702365175} 12/28/2021 06:44:25 - INFO - codeparrot_training - Step 40370: {'lr': 4.46872116613177e-05, 'samples': 20669952, 'steps': 40370, 'batch_loss/train': 0.742690893355757} 12/28/2021 06:44:36 - INFO - codeparrot_training - Step 40371: 
{'lr': 4.4678218147421874e-05, 'samples': 20670464, 'steps': 40371, 'batch_loss/train': 0.7508225957863033} 12/28/2021 06:44:47 - INFO - codeparrot_training - Step 40372: {'lr': 4.4669225449807784e-05, 'samples': 20670976, 'steps': 40372, 'batch_loss/train': 0.5897313749883324} 12/28/2021 06:44:59 - INFO - codeparrot_training - Step 40373: {'lr': 4.46602335685114e-05, 'samples': 20671488, 'steps': 40373, 'batch_loss/train': 0.7224359293468297} 12/28/2021 06:45:09 - INFO - codeparrot_training - Step 40374: {'lr': 4.4651242503568445e-05, 'samples': 20672000, 'steps': 40374, 'batch_loss/train': 0.7400741763412952} 12/28/2021 06:45:20 - INFO - codeparrot_training - Step 40375: {'lr': 4.464225225501456e-05, 'samples': 20672512, 'steps': 40375, 'batch_loss/train': 0.6779354461468756} 12/28/2021 06:45:33 - INFO - codeparrot_training - Step 40376: {'lr': 4.463326282288546e-05, 'samples': 20673024, 'steps': 40376, 'batch_loss/train': 0.8140461007133126} 12/28/2021 06:45:43 - INFO - codeparrot_training - Step 40377: {'lr': 4.46242742072171e-05, 'samples': 20673536, 'steps': 40377, 'batch_loss/train': 0.7223074636422098} 12/28/2021 06:45:54 - INFO - codeparrot_training - Step 40378: {'lr': 4.461528640804502e-05, 'samples': 20674048, 'steps': 40378, 'batch_loss/train': 0.7753790849819779} 12/28/2021 06:46:05 - INFO - codeparrot_training - Step 40379: {'lr': 4.460629942540495e-05, 'samples': 20674560, 'steps': 40379, 'batch_loss/train': 0.6548361635068431} 12/28/2021 06:46:17 - INFO - codeparrot_training - Step 40380: {'lr': 4.459731325933278e-05, 'samples': 20675072, 'steps': 40380, 'batch_loss/train': 0.7644127840176225} 12/28/2021 06:46:27 - INFO - codeparrot_training - Step 40381: {'lr': 4.458832790986411e-05, 'samples': 20675584, 'steps': 40381, 'batch_loss/train': 0.7622699518688023} 12/28/2021 06:46:38 - INFO - codeparrot_training - Step 40382: {'lr': 4.4579343377034646e-05, 'samples': 20676096, 'steps': 40382, 'batch_loss/train': 0.7808735091239214} 12/28/2021 06:46:52 
- INFO - codeparrot_training - Step 40383: {'lr': 4.457035966088019e-05, 'samples': 20676608, 'steps': 40383, 'batch_loss/train': 0.7398498728871346} 12/28/2021 06:47:02 - INFO - codeparrot_training - Step 40384: {'lr': 4.45613767614364e-05, 'samples': 20677120, 'steps': 40384, 'batch_loss/train': 0.7479553013108671} 12/28/2021 06:47:13 - INFO - codeparrot_training - Step 40385: {'lr': 4.455239467873901e-05, 'samples': 20677632, 'steps': 40385, 'batch_loss/train': 0.7058447608724236} 12/28/2021 06:47:25 - INFO - codeparrot_training - Step 40386: {'lr': 4.454341341282378e-05, 'samples': 20678144, 'steps': 40386, 'batch_loss/train': 0.5964519984554499} 12/28/2021 06:47:36 - INFO - codeparrot_training - Step 40387: {'lr': 4.453443296372622e-05, 'samples': 20678656, 'steps': 40387, 'batch_loss/train': 0.5950711362529546} 12/28/2021 06:47:46 - INFO - codeparrot_training - Step 40388: {'lr': 4.452545333148228e-05, 'samples': 20679168, 'steps': 40388, 'batch_loss/train': 0.671034696046263} 12/28/2021 06:47:57 - INFO - codeparrot_training - Step 40389: {'lr': 4.451647451612759e-05, 'samples': 20679680, 'steps': 40389, 'batch_loss/train': 0.6680516800843179} 12/28/2021 06:48:10 - INFO - codeparrot_training - Step 40390: {'lr': 4.450749651769767e-05, 'samples': 20680192, 'steps': 40390, 'batch_loss/train': 0.7849270151928067} 12/28/2021 06:48:21 - INFO - codeparrot_training - Step 40391: {'lr': 4.44985193362285e-05, 'samples': 20680704, 'steps': 40391, 'batch_loss/train': 0.799812157638371} 12/28/2021 06:48:32 - INFO - codeparrot_training - Step 40392: {'lr': 4.448954297175556e-05, 'samples': 20681216, 'steps': 40392, 'batch_loss/train': 0.7023670710623264} 12/28/2021 06:48:44 - INFO - codeparrot_training - Step 40393: {'lr': 4.448056742431458e-05, 'samples': 20681728, 'steps': 40393, 'batch_loss/train': 0.7365585044026375} 12/28/2021 06:48:54 - INFO - codeparrot_training - Step 40394: {'lr': 4.447159269394127e-05, 'samples': 20682240, 'steps': 40394, 'batch_loss/train': 
0.8058910807594657} 12/28/2021 06:49:05 - INFO - codeparrot_training - Step 40395: {'lr': 4.446261878067132e-05, 'samples': 20682752, 'steps': 40395, 'batch_loss/train': 0.7472696108743548} 12/28/2021 06:49:17 - INFO - codeparrot_training - Step 40396: {'lr': 4.4453645684540356e-05, 'samples': 20683264, 'steps': 40396, 'batch_loss/train': 0.7356691532768309} 12/28/2021 06:49:28 - INFO - codeparrot_training - Step 40397: {'lr': 4.444467340558411e-05, 'samples': 20683776, 'steps': 40397, 'batch_loss/train': 0.6935667032375932} 12/28/2021 06:49:38 - INFO - codeparrot_training - Step 40398: {'lr': 4.443570194383822e-05, 'samples': 20684288, 'steps': 40398, 'batch_loss/train': 0.6918642777018249} 12/28/2021 06:49:49 - INFO - codeparrot_training - Step 40399: {'lr': 4.4426731299338385e-05, 'samples': 20684800, 'steps': 40399, 'batch_loss/train': 0.664076185785234} 12/28/2021 06:50:02 - INFO - codeparrot_training - Step 40400: {'lr': 4.441776147212029e-05, 'samples': 20685312, 'steps': 40400, 'batch_loss/train': 0.7744817081838846} 12/28/2021 06:50:13 - INFO - codeparrot_training - Step 40401: {'lr': 4.4408792462219434e-05, 'samples': 20685824, 'steps': 40401, 'batch_loss/train': 0.8001825036481023} 12/28/2021 06:50:23 - INFO - codeparrot_training - Step 40402: {'lr': 4.439982426967162e-05, 'samples': 20686336, 'steps': 40402, 'batch_loss/train': 0.7999228937551379} 12/28/2021 06:50:36 - INFO - codeparrot_training - Step 40403: {'lr': 4.439085689451258e-05, 'samples': 20686848, 'steps': 40403, 'batch_loss/train': 0.6971823237836361} 12/28/2021 06:50:46 - INFO - codeparrot_training - Step 40404: {'lr': 4.4381890336777754e-05, 'samples': 20687360, 'steps': 40404, 'batch_loss/train': 0.6616303997579962} 12/28/2021 06:50:57 - INFO - codeparrot_training - Step 40405: {'lr': 4.437292459650286e-05, 'samples': 20687872, 'steps': 40405, 'batch_loss/train': 0.5457518079783767} 12/28/2021 06:51:09 - INFO - codeparrot_training - Step 40406: {'lr': 4.4363959673723696e-05, 'samples': 
20688384, 'steps': 40406, 'batch_loss/train': 0.7405124353244901} 12/28/2021 06:51:20 - INFO - codeparrot_training - Step 40407: {'lr': 4.435499556847572e-05, 'samples': 20688896, 'steps': 40407, 'batch_loss/train': 0.7502556978724897} 12/28/2021 06:51:30 - INFO - codeparrot_training - Step 40408: {'lr': 4.434603228079462e-05, 'samples': 20689408, 'steps': 40408, 'batch_loss/train': 0.7432135734707117} 12/28/2021 06:51:41 - INFO - codeparrot_training - Step 40409: {'lr': 4.433706981071606e-05, 'samples': 20689920, 'steps': 40409, 'batch_loss/train': 0.8127118023112416} 12/28/2021 06:51:54 - INFO - codeparrot_training - Step 40410: {'lr': 4.432810815827565e-05, 'samples': 20690432, 'steps': 40410, 'batch_loss/train': 0.7959068352356553} 12/28/2021 06:52:04 - INFO - codeparrot_training - Step 40411: {'lr': 4.431914732350903e-05, 'samples': 20690944, 'steps': 40411, 'batch_loss/train': 0.7859764881432056} 12/28/2021 06:52:15 - INFO - codeparrot_training - Step 40412: {'lr': 4.4310187306451864e-05, 'samples': 20691456, 'steps': 40412, 'batch_loss/train': 0.7002678123535588} 12/28/2021 06:52:27 - INFO - codeparrot_training - Step 40413: {'lr': 4.4301228107139604e-05, 'samples': 20691968, 'steps': 40413, 'batch_loss/train': 0.6074316906742752} 12/28/2021 06:52:38 - INFO - codeparrot_training - Step 40414: {'lr': 4.429226972560807e-05, 'samples': 20692480, 'steps': 40414, 'batch_loss/train': 0.9278447104152292} 12/28/2021 06:52:48 - INFO - codeparrot_training - Step 40415: {'lr': 4.4283312161892854e-05, 'samples': 20692992, 'steps': 40415, 'batch_loss/train': 0.8009730060584843} 12/28/2021 06:53:01 - INFO - codeparrot_training - Step 40416: {'lr': 4.4274355416029395e-05, 'samples': 20693504, 'steps': 40416, 'batch_loss/train': 0.7178189416881651} 12/28/2021 06:53:11 - INFO - codeparrot_training - Step 40417: {'lr': 4.426539948805353e-05, 'samples': 20694016, 'steps': 40417, 'batch_loss/train': 0.6820188593119383} 12/28/2021 06:53:22 - INFO - codeparrot_training - Step 
40418: {'lr': 4.4256444378000724e-05, 'samples': 20694528, 'steps': 40418, 'batch_loss/train': 0.7452272624941543} 12/28/2021 06:53:32 - INFO - codeparrot_training - Step 40419: {'lr': 4.424749008590656e-05, 'samples': 20695040, 'steps': 40419, 'batch_loss/train': 0.7192664025351405} 12/28/2021 06:53:46 - INFO - codeparrot_training - Step 40420: {'lr': 4.423853661180671e-05, 'samples': 20695552, 'steps': 40420, 'batch_loss/train': 0.6509690617676824} 12/28/2021 06:53:56 - INFO - codeparrot_training - Step 40421: {'lr': 4.422958395573676e-05, 'samples': 20696064, 'steps': 40421, 'batch_loss/train': 0.6503997256513685} 12/28/2021 06:54:07 - INFO - codeparrot_training - Step 40422: {'lr': 4.4220632117732276e-05, 'samples': 20696576, 'steps': 40422, 'batch_loss/train': 0.696329802274704} 12/28/2021 06:54:20 - INFO - codeparrot_training - Step 40423: {'lr': 4.421168109782886e-05, 'samples': 20697088, 'steps': 40423, 'batch_loss/train': 0.7269235127605498} 12/28/2021 06:54:30 - INFO - codeparrot_training - Step 40424: {'lr': 4.420273089606211e-05, 'samples': 20697600, 'steps': 40424, 'batch_loss/train': 0.6996766175143421} 12/28/2021 06:54:41 - INFO - codeparrot_training - Step 40425: {'lr': 4.419378151246758e-05, 'samples': 20698112, 'steps': 40425, 'batch_loss/train': 0.44220740560558625} 12/28/2021 06:54:53 - INFO - codeparrot_training - Step 40426: {'lr': 4.418483294708092e-05, 'samples': 20698624, 'steps': 40426, 'batch_loss/train': 0.6988136784057133} 12/28/2021 06:55:04 - INFO - codeparrot_training - Step 40427: {'lr': 4.417588519993754e-05, 'samples': 20699136, 'steps': 40427, 'batch_loss/train': 0.7363351709209383} 12/28/2021 06:55:14 - INFO - codeparrot_training - Step 40428: {'lr': 4.416693827107318e-05, 'samples': 20699648, 'steps': 40428, 'batch_loss/train': 0.6852467516437173} 12/28/2021 06:55:25 - INFO - codeparrot_training - Step 40429: {'lr': 4.4157992160523434e-05, 'samples': 20700160, 'steps': 40429, 'batch_loss/train': 0.7354586706496775} 12/28/2021 
06:55:38 - INFO - codeparrot_training - Step 40430: {'lr': 4.4149046868323716e-05, 'samples': 20700672, 'steps': 40430, 'batch_loss/train': 0.7491131192073226} 12/28/2021 06:55:48 - INFO - codeparrot_training - Step 40431: {'lr': 4.414010239450958e-05, 'samples': 20701184, 'steps': 40431, 'batch_loss/train': 0.6438946039415896} 12/28/2021 06:55:59 - INFO - codeparrot_training - Step 40432: {'lr': 4.41311587391168e-05, 'samples': 20701696, 'steps': 40432, 'batch_loss/train': 0.6673587433760986} 12/28/2021 06:56:11 - INFO - codeparrot_training - Step 40433: {'lr': 4.4122215902180766e-05, 'samples': 20702208, 'steps': 40433, 'batch_loss/train': 0.7638094639405608} 12/28/2021 06:56:22 - INFO - codeparrot_training - Step 40434: {'lr': 4.4113273883737036e-05, 'samples': 20702720, 'steps': 40434, 'batch_loss/train': 0.8434046627953649} 12/28/2021 06:56:32 - INFO - codeparrot_training - Step 40435: {'lr': 4.410433268382119e-05, 'samples': 20703232, 'steps': 40435, 'batch_loss/train': 0.7132640369236469} 12/28/2021 06:56:45 - INFO - codeparrot_training - Step 40436: {'lr': 4.409539230246879e-05, 'samples': 20703744, 'steps': 40436, 'batch_loss/train': 0.7270356491208076} 12/28/2021 06:56:55 - INFO - codeparrot_training - Step 40437: {'lr': 4.4086452739715344e-05, 'samples': 20704256, 'steps': 40437, 'batch_loss/train': 0.6987936076475307} 12/28/2021 06:57:06 - INFO - codeparrot_training - Step 40438: {'lr': 4.407751399559642e-05, 'samples': 20704768, 'steps': 40438, 'batch_loss/train': 0.7039797168690711} 12/28/2021 06:57:19 - INFO - codeparrot_training - Step 40439: {'lr': 4.4068576070147557e-05, 'samples': 20705280, 'steps': 40439, 'batch_loss/train': 0.8111632717773318} 12/28/2021 06:57:29 - INFO - codeparrot_training - Step 40440: {'lr': 4.405963896340426e-05, 'samples': 20705792, 'steps': 40440, 'batch_loss/train': 0.8001940255053341} 12/28/2021 06:57:40 - INFO - codeparrot_training - Step 40441: {'lr': 4.405070267540218e-05, 'samples': 20706304, 'steps': 40441, 
'batch_loss/train': 0.7207147390581667} 12/28/2021 06:57:51 - INFO - codeparrot_training - Step 40442: {'lr': 4.4041767206176587e-05, 'samples': 20706816, 'steps': 40442, 'batch_loss/train': 0.8295932505279779} 12/28/2021 06:58:03 - INFO - codeparrot_training - Step 40443: {'lr': 4.403283255576321e-05, 'samples': 20707328, 'steps': 40443, 'batch_loss/train': 0.7354676341637969} 12/28/2021 06:58:13 - INFO - codeparrot_training - Step 40444: {'lr': 4.4023898724197596e-05, 'samples': 20707840, 'steps': 40444, 'batch_loss/train': 1.3388256477192044} 12/28/2021 06:58:24 - INFO - codeparrot_training - Step 40445: {'lr': 4.4014965711515145e-05, 'samples': 20708352, 'steps': 40445, 'batch_loss/train': 0.8282366618514061} 12/28/2021 06:58:37 - INFO - codeparrot_training - Step 40446: {'lr': 4.40060335177514e-05, 'samples': 20708864, 'steps': 40446, 'batch_loss/train': 0.7577597787603736} 12/28/2021 06:58:47 - INFO - codeparrot_training - Step 40447: {'lr': 4.399710214294186e-05, 'samples': 20709376, 'steps': 40447, 'batch_loss/train': 0.7728798631578684} 12/28/2021 06:58:58 - INFO - codeparrot_training - Step 40448: {'lr': 4.3988171587122094e-05, 'samples': 20709888, 'steps': 40448, 'batch_loss/train': 0.8089292179793119} 12/28/2021 06:59:10 - INFO - codeparrot_training - Step 40449: {'lr': 4.397924185032756e-05, 'samples': 20710400, 'steps': 40449, 'batch_loss/train': 0.7089675484457985} 12/28/2021 06:59:21 - INFO - codeparrot_training - Step 40450: {'lr': 4.3970312932593766e-05, 'samples': 20710912, 'steps': 40450, 'batch_loss/train': 0.6706904349848628} 12/28/2021 06:59:31 - INFO - codeparrot_training - Step 40451: {'lr': 4.39613848339562e-05, 'samples': 20711424, 'steps': 40451, 'batch_loss/train': 0.896645063534379} 12/28/2021 06:59:42 - INFO - codeparrot_training - Step 40452: {'lr': 4.395245755445046e-05, 'samples': 20711936, 'steps': 40452, 'batch_loss/train': 0.6920712045393884} 12/28/2021 06:59:54 - INFO - codeparrot_training - Step 40453: {'lr': 
4.394353109411181e-05, 'samples': 20712448, 'steps': 40453, 'batch_loss/train': 0.7074859770946205} 12/28/2021 07:00:05 - INFO - codeparrot_training - Step 40454: {'lr': 4.393460545297595e-05, 'samples': 20712960, 'steps': 40454, 'batch_loss/train': 0.7303616921417415} 12/28/2021 07:00:15 - INFO - codeparrot_training - Step 40455: {'lr': 4.3925680631078323e-05, 'samples': 20713472, 'steps': 40455, 'batch_loss/train': 0.7463876297697425} 12/28/2021 07:00:27 - INFO - codeparrot_training - Step 40456: {'lr': 4.3916756628454346e-05, 'samples': 20713984, 'steps': 40456, 'batch_loss/train': 0.7606421448290348} 12/28/2021 07:00:38 - INFO - codeparrot_training - Step 40457: {'lr': 4.390783344513946e-05, 'samples': 20714496, 'steps': 40457, 'batch_loss/train': 0.8395556687028147} 12/28/2021 07:00:49 - INFO - codeparrot_training - Step 40458: {'lr': 4.3898911081169335e-05, 'samples': 20715008, 'steps': 40458, 'batch_loss/train': 0.8595121167600155} 12/28/2021 07:01:02 - INFO - codeparrot_training - Step 40459: {'lr': 4.388998953657927e-05, 'samples': 20715520, 'steps': 40459, 'batch_loss/train': 0.7806896581314504} 12/28/2021 07:01:12 - INFO - codeparrot_training - Step 40460: {'lr': 4.388106881140477e-05, 'samples': 20716032, 'steps': 40460, 'batch_loss/train': 0.7423249213024974} 12/28/2021 07:01:23 - INFO - codeparrot_training - Step 40461: {'lr': 4.3872148905681306e-05, 'samples': 20716544, 'steps': 40461, 'batch_loss/train': 0.8029386037960649} 12/28/2021 07:01:35 - INFO - codeparrot_training - Step 40462: {'lr': 4.386322981944435e-05, 'samples': 20717056, 'steps': 40462, 'batch_loss/train': 0.7612905083224177} 12/28/2021 07:01:45 - INFO - codeparrot_training - Step 40463: {'lr': 4.385431155272937e-05, 'samples': 20717568, 'steps': 40463, 'batch_loss/train': 0.7350390052888542} 12/28/2021 07:01:56 - INFO - codeparrot_training - Step 40464: {'lr': 4.384539410557178e-05, 'samples': 20718080, 'steps': 40464, 'batch_loss/train': 0.7063028584234416} 12/28/2021 07:02:07 - 
INFO - codeparrot_training - Step 40465: {'lr': 4.383647747800709e-05, 'samples': 20718592, 'steps': 40465, 'batch_loss/train': 0.6657015336677432} 12/28/2021 07:02:19 - INFO - codeparrot_training - Step 40466: {'lr': 4.382756167007071e-05, 'samples': 20719104, 'steps': 40466, 'batch_loss/train': 0.7736846497282386} 12/28/2021 07:02:30 - INFO - codeparrot_training - Step 40467: {'lr': 4.3818646681798165e-05, 'samples': 20719616, 'steps': 40467, 'batch_loss/train': 0.7118331221863627} 12/28/2021 07:02:40 - INFO - codeparrot_training - Step 40468: {'lr': 4.3809732513224706e-05, 'samples': 20720128, 'steps': 40468, 'batch_loss/train': 0.915772573556751} 12/28/2021 07:02:53 - INFO - codeparrot_training - Step 40469: {'lr': 4.380081916438597e-05, 'samples': 20720640, 'steps': 40469, 'batch_loss/train': 0.7623628964647651} 12/28/2021 07:03:04 - INFO - codeparrot_training - Step 40470: {'lr': 4.3791906635317373e-05, 'samples': 20721152, 'steps': 40470, 'batch_loss/train': 0.7596438234904781} 12/28/2021 07:03:14 - INFO - codeparrot_training - Step 40471: {'lr': 4.3782994926054224e-05, 'samples': 20721664, 'steps': 40471, 'batch_loss/train': 0.8092542166123167} 12/28/2021 07:03:26 - INFO - codeparrot_training - Step 40472: {'lr': 4.377408403663202e-05, 'samples': 20722176, 'steps': 40472, 'batch_loss/train': 0.7622175207361579} 12/28/2021 07:03:37 - INFO - codeparrot_training - Step 40473: {'lr': 4.3765173967086205e-05, 'samples': 20722688, 'steps': 40473, 'batch_loss/train': 0.7788681564852595} 12/28/2021 07:03:48 - INFO - codeparrot_training - Step 40474: {'lr': 4.375626471745217e-05, 'samples': 20723200, 'steps': 40474, 'batch_loss/train': 0.6657405537553132} 12/28/2021 07:03:58 - INFO - codeparrot_training - Step 40475: {'lr': 4.3747356287765356e-05, 'samples': 20723712, 'steps': 40475, 'batch_loss/train': 0.7730306498706341} 12/28/2021 07:04:11 - INFO - codeparrot_training - Step 40476: {'lr': 4.3738448678061176e-05, 'samples': 20724224, 'steps': 40476, 
'batch_loss/train': 0.8383334465324879} 12/28/2021 07:04:22 - INFO - codeparrot_training - Step 40477: {'lr': 4.3729541888375054e-05, 'samples': 20724736, 'steps': 40477, 'batch_loss/train': 0.7063796282745898} 12/28/2021 07:04:32 - INFO - codeparrot_training - Step 40478: {'lr': 4.372063591874237e-05, 'samples': 20725248, 'steps': 40478, 'batch_loss/train': 0.790394798386842} 12/28/2021 07:04:44 - INFO - codeparrot_training - Step 40479: {'lr': 4.3711730769198574e-05, 'samples': 20725760, 'steps': 40479, 'batch_loss/train': 0.6721185715869069} 12/28/2021 07:04:55 - INFO - codeparrot_training - Step 40480: {'lr': 4.370282643977902e-05, 'samples': 20726272, 'steps': 40480, 'batch_loss/train': 0.714354227297008} 12/28/2021 07:05:05 - INFO - codeparrot_training - Step 40481: {'lr': 4.369392293051922e-05, 'samples': 20726784, 'steps': 40481, 'batch_loss/train': 0.7196159912273288} 12/28/2021 07:05:18 - INFO - codeparrot_training - Step 40482: {'lr': 4.3685020241454406e-05, 'samples': 20727296, 'steps': 40482, 'batch_loss/train': 0.5290621293534059} 12/28/2021 07:05:29 - INFO - codeparrot_training - Step 40483: {'lr': 4.3676118372619974e-05, 'samples': 20727808, 'steps': 40483, 'batch_loss/train': 0.8130447221919894} 12/28/2021 07:05:39 - INFO - codeparrot_training - Step 40484: {'lr': 4.3667217324051543e-05, 'samples': 20728320, 'steps': 40484, 'batch_loss/train': 0.6883433223702013} 12/28/2021 07:05:52 - INFO - codeparrot_training - Step 40485: {'lr': 4.365831709578427e-05, 'samples': 20728832, 'steps': 40485, 'batch_loss/train': 0.759350418113172} 12/28/2021 07:06:03 - INFO - codeparrot_training - Step 40486: {'lr': 4.364941768785363e-05, 'samples': 20729344, 'steps': 40486, 'batch_loss/train': 0.6457134396769106} 12/28/2021 07:06:13 - INFO - codeparrot_training - Step 40487: {'lr': 4.3640519100294985e-05, 'samples': 20729856, 'steps': 40487, 'batch_loss/train': 0.7223708657547832} 12/28/2021 07:06:24 - INFO - codeparrot_training - Step 40488: {'lr': 
4.363162133314369e-05, 'samples': 20730368, 'steps': 40488, 'batch_loss/train': 0.7724894904531538} 12/28/2021 07:06:36 - INFO - codeparrot_training - Step 40489: {'lr': 4.362272438643519e-05, 'samples': 20730880, 'steps': 40489, 'batch_loss/train': 0.7460242919623852} 12/28/2021 07:06:47 - INFO - codeparrot_training - Step 40490: {'lr': 4.36138282602048e-05, 'samples': 20731392, 'steps': 40490, 'batch_loss/train': 0.6393421785905957} 12/28/2021 07:06:57 - INFO - codeparrot_training - Step 40491: {'lr': 4.360493295448789e-05, 'samples': 20731904, 'steps': 40491, 'batch_loss/train': 0.7945715775713325} 12/28/2021 07:07:09 - INFO - codeparrot_training - Step 40492: {'lr': 4.359603846931984e-05, 'samples': 20732416, 'steps': 40492, 'batch_loss/train': 0.7069477783516049} 12/28/2021 07:07:20 - INFO - codeparrot_training - Step 40493: {'lr': 4.3587144804736054e-05, 'samples': 20732928, 'steps': 40493, 'batch_loss/train': 0.7622345332056284} 12/28/2021 07:07:31 - INFO - codeparrot_training - Step 40494: {'lr': 4.357825196077173e-05, 'samples': 20733440, 'steps': 40494, 'batch_loss/train': 0.6526105371303856} 12/28/2021 07:07:43 - INFO - codeparrot_training - Step 40495: {'lr': 4.3569359937462386e-05, 'samples': 20733952, 'steps': 40495, 'batch_loss/train': 0.7509161569178104} 12/28/2021 07:07:54 - INFO - codeparrot_training - Step 40496: {'lr': 4.35604687348434e-05, 'samples': 20734464, 'steps': 40496, 'batch_loss/train': 0.6733111045323312} 12/28/2021 07:08:04 - INFO - codeparrot_training - Step 40497: {'lr': 4.355157835294998e-05, 'samples': 20734976, 'steps': 40497, 'batch_loss/train': 0.750083816004917} 12/28/2021 07:08:15 - INFO - codeparrot_training - Step 40498: {'lr': 4.3542688791817525e-05, 'samples': 20735488, 'steps': 40498, 'batch_loss/train': 0.6876324848271906} 12/28/2021 07:08:28 - INFO - codeparrot_training - Step 40499: {'lr': 4.353380005148139e-05, 'samples': 20736000, 'steps': 40499, 'batch_loss/train': 0.5949320029467344} 12/28/2021 07:08:39 - INFO - 
codeparrot_training - Step 40500: {'lr': 4.35249121319769e-05, 'samples': 20736512, 'steps': 40500, 'batch_loss/train': 0.7940413439646363} 12/28/2021 07:08:49 - INFO - codeparrot_training - Step 40501: {'lr': 4.35160250333394e-05, 'samples': 20737024, 'steps': 40501, 'batch_loss/train': 0.7636195430532098} 12/28/2021 07:09:01 - INFO - codeparrot_training - Step 40502: {'lr': 4.350713875560422e-05, 'samples': 20737536, 'steps': 40502, 'batch_loss/train': 0.7588082896545529} 12/28/2021 07:09:12 - INFO - codeparrot_training - Step 40503: {'lr': 4.349825329880669e-05, 'samples': 20738048, 'steps': 40503, 'batch_loss/train': 0.7338974387967028} 12/28/2021 07:09:22 - INFO - codeparrot_training - Step 40504: {'lr': 4.348936866298212e-05, 'samples': 20738560, 'steps': 40504, 'batch_loss/train': 0.7638389370404184} 12/28/2021 07:09:35 - INFO - codeparrot_training - Step 40505: {'lr': 4.348048484816586e-05, 'samples': 20739072, 'steps': 40505, 'batch_loss/train': 0.7735584378242493} 12/28/2021 07:09:45 - INFO - codeparrot_training - Step 40506: {'lr': 4.3471601854393225e-05, 'samples': 20739584, 'steps': 40506, 'batch_loss/train': 0.5759111801162362} 12/28/2021 07:09:56 - INFO - codeparrot_training - Step 40507: {'lr': 4.346271968169954e-05, 'samples': 20740096, 'steps': 40507, 'batch_loss/train': 0.7154236133210361} 12/28/2021 07:10:06 - INFO - codeparrot_training - Step 40508: {'lr': 4.3453838330120064e-05, 'samples': 20740608, 'steps': 40508, 'batch_loss/train': 0.6520477507729083} 12/28/2021 07:10:19 - INFO - codeparrot_training - Step 40509: {'lr': 4.344495779969007e-05, 'samples': 20741120, 'steps': 40509, 'batch_loss/train': 0.8365994347259402} 12/28/2021 07:10:30 - INFO - codeparrot_training - Step 40510: {'lr': 4.3436078090445056e-05, 'samples': 20741632, 'steps': 40510, 'batch_loss/train': 0.726845424156636} 12/28/2021 07:10:40 - INFO - codeparrot_training - Step 40511: {'lr': 4.342719920242014e-05, 'samples': 20742144, 'steps': 40511, 'batch_loss/train': 
0.7402559109032154} 12/28/2021 07:10:53 - INFO - codeparrot_training - Step 40512: {'lr': 4.3418321135650614e-05, 'samples': 20742656, 'steps': 40512, 'batch_loss/train': 0.7335935081355274} 12/28/2021 07:11:03 - INFO - codeparrot_training - Step 40513: {'lr': 4.3409443890171985e-05, 'samples': 20743168, 'steps': 40513, 'batch_loss/train': 0.8512417776510119} 12/28/2021 07:11:14 - INFO - codeparrot_training - Step 40514: {'lr': 4.340056746601931e-05, 'samples': 20743680, 'steps': 40514, 'batch_loss/train': 0.7390624778345227} 12/28/2021 07:11:27 - INFO - codeparrot_training - Step 40515: {'lr': 4.3391691863227975e-05, 'samples': 20744192, 'steps': 40515, 'batch_loss/train': 0.6833853693678975} 12/28/2021 07:11:37 - INFO - codeparrot_training - Step 40516: {'lr': 4.3382817081833283e-05, 'samples': 20744704, 'steps': 40516, 'batch_loss/train': 0.76112678530626} 12/28/2021 07:11:48 - INFO - codeparrot_training - Step 40517: {'lr': 4.337394312187048e-05, 'samples': 20745216, 'steps': 40517, 'batch_loss/train': 0.7513343128375709} 12/28/2021 07:12:01 - INFO - codeparrot_training - Step 40518: {'lr': 4.336506998337486e-05, 'samples': 20745728, 'steps': 40518, 'batch_loss/train': 0.7123119598254561} 12/28/2021 07:12:12 - INFO - codeparrot_training - Step 40519: {'lr': 4.335619766638174e-05, 'samples': 20746240, 'steps': 40519, 'batch_loss/train': 0.7237182650715113} 12/28/2021 07:12:22 - INFO - codeparrot_training - Step 40520: {'lr': 4.3347326170926263e-05, 'samples': 20746752, 'steps': 40520, 'batch_loss/train': 0.7479345677420497} 12/28/2021 07:12:33 - INFO - codeparrot_training - Step 40521: {'lr': 4.333845549704385e-05, 'samples': 20747264, 'steps': 40521, 'batch_loss/train': 0.6260703318403102} 12/28/2021 07:12:45 - INFO - codeparrot_training - Step 40522: {'lr': 4.3329585644769745e-05, 'samples': 20747776, 'steps': 40522, 'batch_loss/train': 0.7309683738276362} 12/28/2021 07:12:55 - INFO - codeparrot_training - Step 40523: {'lr': 4.332071661413911e-05, 'samples': 
20748288, 'steps': 40523, 'batch_loss/train': 0.7408329956233501} 12/28/2021 07:13:06 - INFO - codeparrot_training - Step 40524: {'lr': 4.3311848405187235e-05, 'samples': 20748800, 'steps': 40524, 'batch_loss/train': 0.7186753489077091} 12/28/2021 07:13:19 - INFO - codeparrot_training - Step 40525: {'lr': 4.330298101794949e-05, 'samples': 20749312, 'steps': 40525, 'batch_loss/train': 0.7416467727161944} 12/28/2021 07:13:30 - INFO - codeparrot_training - Step 40526: {'lr': 4.329411445246101e-05, 'samples': 20749824, 'steps': 40526, 'batch_loss/train': 0.7296819197945297} 12/28/2021 07:13:40 - INFO - codeparrot_training - Step 40527: {'lr': 4.3285248708757083e-05, 'samples': 20750336, 'steps': 40527, 'batch_loss/train': 0.7110558720305562} 12/28/2021 07:13:51 - INFO - codeparrot_training - Step 40528: {'lr': 4.327638378687296e-05, 'samples': 20750848, 'steps': 40528, 'batch_loss/train': 0.8256549974903464} 12/28/2021 07:14:03 - INFO - codeparrot_training - Step 40529: {'lr': 4.326751968684386e-05, 'samples': 20751360, 'steps': 40529, 'batch_loss/train': 0.7253541597165167} 12/28/2021 07:14:13 - INFO - codeparrot_training - Step 40530: {'lr': 4.325865640870505e-05, 'samples': 20751872, 'steps': 40530, 'batch_loss/train': 0.7117501763859764} 12/28/2021 07:14:24 - INFO - codeparrot_training - Step 40531: {'lr': 4.324979395249176e-05, 'samples': 20752384, 'steps': 40531, 'batch_loss/train': 0.6467249714769423} 12/28/2021 07:14:36 - INFO - codeparrot_training - Step 40532: {'lr': 4.3240932318239225e-05, 'samples': 20752896, 'steps': 40532, 'batch_loss/train': 0.7109428740222938} 12/28/2021 07:14:47 - INFO - codeparrot_training - Step 40533: {'lr': 4.323207150598268e-05, 'samples': 20753408, 'steps': 40533, 'batch_loss/train': 0.7867603176273406} 12/28/2021 07:14:57 - INFO - codeparrot_training - Step 40534: {'lr': 4.322321151575739e-05, 'samples': 20753920, 'steps': 40534, 'batch_loss/train': 0.745884703937918} 12/28/2021 07:15:10 - INFO - codeparrot_training - Step 
40535: {'lr': 4.3214352347598435e-05, 'samples': 20754432, 'steps': 40535, 'batch_loss/train': 0.7932768100872636} 12/28/2021 07:15:21 - INFO - codeparrot_training - Step 40536: {'lr': 4.320549400154125e-05, 'samples': 20754944, 'steps': 40536, 'batch_loss/train': 0.8357607750222087} 12/28/2021 07:15:32 - INFO - codeparrot_training - Step 40537: {'lr': 4.319663647762087e-05, 'samples': 20755456, 'steps': 40537, 'batch_loss/train': 0.760215878020972} 12/28/2021 07:15:42 - INFO - codeparrot_training - Step 40538: {'lr': 4.318777977587251e-05, 'samples': 20755968, 'steps': 40538, 'batch_loss/train': 0.5637442129664123} 12/28/2021 07:15:54 - INFO - codeparrot_training - Step 40539: {'lr': 4.3178923896331576e-05, 'samples': 20756480, 'steps': 40539, 'batch_loss/train': 0.7087879017926753} 12/28/2021 07:16:05 - INFO - codeparrot_training - Step 40540: {'lr': 4.31700688390331e-05, 'samples': 20756992, 'steps': 40540, 'batch_loss/train': 0.7462062027771026} 12/28/2021 07:16:16 - INFO - codeparrot_training - Step 40541: {'lr': 4.316121460401232e-05, 'samples': 20757504, 'steps': 40541, 'batch_loss/train': 0.7519100923091173} 12/28/2021 07:16:28 - INFO - codeparrot_training - Step 40542: {'lr': 4.3152361191304475e-05, 'samples': 20758016, 'steps': 40542, 'batch_loss/train': 0.8120131697505713} 12/28/2021 07:16:39 - INFO - codeparrot_training - Step 40543: {'lr': 4.314350860094471e-05, 'samples': 20758528, 'steps': 40543, 'batch_loss/train': 0.8884657807648182} 12/28/2021 07:16:49 - INFO - codeparrot_training - Step 40544: {'lr': 4.3134656832968275e-05, 'samples': 20759040, 'steps': 40544, 'batch_loss/train': 0.9888082183897495} 12/28/2021 07:17:02 - INFO - codeparrot_training - Step 40545: {'lr': 4.312580588741036e-05, 'samples': 20759552, 'steps': 40545, 'batch_loss/train': 0.6998034184798598} 12/28/2021 07:17:13 - INFO - codeparrot_training - Step 40546: {'lr': 4.311695576430605e-05, 'samples': 20760064, 'steps': 40546, 'batch_loss/train': 0.738491348689422} 12/28/2021 
07:17:23 - INFO - codeparrot_training - Step 40547: {'lr': 4.310810646369062e-05, 'samples': 20760576, 'steps': 40547, 'batch_loss/train': 0.6962211760692298} 12/28/2021 07:17:35 - INFO - codeparrot_training - Step 40548: {'lr': 4.309925798559933e-05, 'samples': 20761088, 'steps': 40548, 'batch_loss/train': 0.747937835752964} 12/28/2021 07:17:46 - INFO - codeparrot_training - Step 40549: {'lr': 4.309041033006717e-05, 'samples': 20761600, 'steps': 40549, 'batch_loss/train': 0.7610799479298294} 12/28/2021 07:17:57 - INFO - codeparrot_training - Step 40550: {'lr': 4.308156349712944e-05, 'samples': 20762112, 'steps': 40550, 'batch_loss/train': 0.7626270987093449} 12/28/2021 07:18:07 - INFO - codeparrot_training - Step 40551: {'lr': 4.307271748682134e-05, 'samples': 20762624, 'steps': 40551, 'batch_loss/train': 0.701226320117712} 12/28/2021 07:18:20 - INFO - codeparrot_training - Step 40552: {'lr': 4.3063872299177935e-05, 'samples': 20763136, 'steps': 40552, 'batch_loss/train': 0.7340475304517895} 12/28/2021 07:18:30 - INFO - codeparrot_training - Step 40553: {'lr': 4.3055027934234434e-05, 'samples': 20763648, 'steps': 40553, 'batch_loss/train': 0.8568011526949704} 12/28/2021 07:18:41 - INFO - codeparrot_training - Step 40554: {'lr': 4.304618439202601e-05, 'samples': 20764160, 'steps': 40554, 'batch_loss/train': 0.6693047331064008} 12/28/2021 07:18:54 - INFO - codeparrot_training - Step 40555: {'lr': 4.30373416725878e-05, 'samples': 20764672, 'steps': 40555, 'batch_loss/train': 0.708349627442658} 12/28/2021 07:19:04 - INFO - codeparrot_training - Step 40556: {'lr': 4.302849977595499e-05, 'samples': 20765184, 'steps': 40556, 'batch_loss/train': 0.7876220964826643} 12/28/2021 07:19:15 - INFO - codeparrot_training - Step 40557: {'lr': 4.301965870216271e-05, 'samples': 20765696, 'steps': 40557, 'batch_loss/train': 0.7988054449670017} 12/28/2021 07:19:27 - INFO - codeparrot_training - Step 40558: {'lr': 4.3010818451246104e-05, 'samples': 20766208, 'steps': 40558, 
'batch_loss/train': 0.7426678696647286} 12/28/2021 07:19:38 - INFO - codeparrot_training - Step 40559: {'lr': 4.300197902324032e-05, 'samples': 20766720, 'steps': 40559, 'batch_loss/train': 0.7354076281189919} 12/28/2021 07:19:48 - INFO - codeparrot_training - Step 40560: {'lr': 4.2993140418180535e-05, 'samples': 20767232, 'steps': 40560, 'batch_loss/train': 0.7616027854382992} 12/28/2021 07:19:59 - INFO - codeparrot_training - Step 40561: {'lr': 4.298430263610184e-05, 'samples': 20767744, 'steps': 40561, 'batch_loss/train': 0.6864342237822711} 12/28/2021 07:20:12 - INFO - codeparrot_training - Step 40562: {'lr': 4.297546567703947e-05, 'samples': 20768256, 'steps': 40562, 'batch_loss/train': 0.8359753869008273} 12/28/2021 07:20:22 - INFO - codeparrot_training - Step 40563: {'lr': 4.2966629541028426e-05, 'samples': 20768768, 'steps': 40563, 'batch_loss/train': 0.7039832898881286} 12/28/2021 07:20:33 - INFO - codeparrot_training - Step 40564: {'lr': 4.2957794228103805e-05, 'samples': 20769280, 'steps': 40564, 'batch_loss/train': 0.754576587351039} 12/28/2021 07:20:46 - INFO - codeparrot_training - Step 40565: {'lr': 4.294895973830093e-05, 'samples': 20769792, 'steps': 40565, 'batch_loss/train': 0.8623255821876228} 12/28/2021 07:20:56 - INFO - codeparrot_training - Step 40566: {'lr': 4.294012607165479e-05, 'samples': 20770304, 'steps': 40566, 'batch_loss/train': 0.7082144510932267} 12/28/2021 07:21:07 - INFO - codeparrot_training - Step 40567: {'lr': 4.2931293228200496e-05, 'samples': 20770816, 'steps': 40567, 'batch_loss/train': 0.8770223122555763} 12/28/2021 07:21:19 - INFO - codeparrot_training - Step 40568: {'lr': 4.292246120797319e-05, 'samples': 20771328, 'steps': 40568, 'batch_loss/train': 0.71395722636953} 12/28/2021 07:21:30 - INFO - codeparrot_training - Step 40569: {'lr': 4.291363001100801e-05, 'samples': 20771840, 'steps': 40569, 'batch_loss/train': 0.7798535823822021} 12/28/2021 07:21:40 - INFO - codeparrot_training - Step 40570: {'lr': 
4.2904799637340034e-05, 'samples': 20772352, 'steps': 40570, 'batch_loss/train': 0.6846280230674893} 12/28/2021 07:21:52 - INFO - codeparrot_training - Step 40571: {'lr': 4.2895970087004434e-05, 'samples': 20772864, 'steps': 40571, 'batch_loss/train': 0.9248480987735093} 12/28/2021 07:22:03 - INFO - codeparrot_training - Step 40572: {'lr': 4.2887141360036155e-05, 'samples': 20773376, 'steps': 40572, 'batch_loss/train': 0.6503368981648237} 12/28/2021 07:22:14 - INFO - codeparrot_training - Step 40573: {'lr': 4.287831345647045e-05, 'samples': 20773888, 'steps': 40573, 'batch_loss/train': 0.8215255672112107} 12/28/2021 07:22:24 - INFO - codeparrot_training - Step 40574: {'lr': 4.286948637634241e-05, 'samples': 20774400, 'steps': 40574, 'batch_loss/train': 0.7089400246040896} 12/28/2021 07:22:37 - INFO - codeparrot_training - Step 40575: {'lr': 4.286066011968698e-05, 'samples': 20774912, 'steps': 40575, 'batch_loss/train': 0.772858239710331} 12/28/2021 07:22:48 - INFO - codeparrot_training - Step 40576: {'lr': 4.2851834686539416e-05, 'samples': 20775424, 'steps': 40576, 'batch_loss/train': 0.7619973488617688} 12/28/2021 07:22:58 - INFO - codeparrot_training - Step 40577: {'lr': 4.284301007693478e-05, 'samples': 20775936, 'steps': 40577, 'batch_loss/train': 0.7353330638725311} 12/28/2021 07:23:11 - INFO - codeparrot_training - Step 40578: {'lr': 4.283418629090807e-05, 'samples': 20776448, 'steps': 40578, 'batch_loss/train': 0.6452664269600064} 12/28/2021 07:23:21 - INFO - codeparrot_training - Step 40579: {'lr': 4.282536332849435e-05, 'samples': 20776960, 'steps': 40579, 'batch_loss/train': 0.7027977611869574} 12/28/2021 07:23:32 - INFO - codeparrot_training - Step 40580: {'lr': 4.2816541189728886e-05, 'samples': 20777472, 'steps': 40580, 'batch_loss/train': 0.9266190659254789} 12/28/2021 07:23:44 - INFO - codeparrot_training - Step 40581: {'lr': 4.280771987464657e-05, 'samples': 20777984, 'steps': 40581, 'batch_loss/train': 0.8158270679414272} 12/28/2021 07:23:55 - 
INFO - codeparrot_training - Step 40582: {'lr': 4.279889938328252e-05, 'samples': 20778496, 'steps': 40582, 'batch_loss/train': 0.746169954072684} 12/28/2021 07:24:05 - INFO - codeparrot_training - Step 40583: {'lr': 4.279007971567181e-05, 'samples': 20779008, 'steps': 40583, 'batch_loss/train': 0.8031284213066101} 12/28/2021 07:24:16 - INFO - codeparrot_training - Step 40584: {'lr': 4.2781260871849495e-05, 'samples': 20779520, 'steps': 40584, 'batch_loss/train': 0.7808414264582098} 12/28/2021 07:24:29 - INFO - codeparrot_training - Step 40585: {'lr': 4.277244285185064e-05, 'samples': 20780032, 'steps': 40585, 'batch_loss/train': 0.6923339702188969} 12/28/2021 07:24:39 - INFO - codeparrot_training - Step 40586: {'lr': 4.276362565571032e-05, 'samples': 20780544, 'steps': 40586, 'batch_loss/train': 0.655676388181746} 12/28/2021 07:24:50 - INFO - codeparrot_training - Step 40587: {'lr': 4.2754809283463564e-05, 'samples': 20781056, 'steps': 40587, 'batch_loss/train': 0.7263139076530933} 12/28/2021 07:25:02 - INFO - codeparrot_training - Step 40588: {'lr': 4.2745993735145514e-05, 'samples': 20781568, 'steps': 40588, 'batch_loss/train': 0.7278920598328114} 12/28/2021 07:25:13 - INFO - codeparrot_training - Step 40589: {'lr': 4.273717901079105e-05, 'samples': 20782080, 'steps': 40589, 'batch_loss/train': 0.822988745290786} 12/28/2021 07:25:23 - INFO - codeparrot_training - Step 40590: {'lr': 4.272836511043526e-05, 'samples': 20782592, 'steps': 40590, 'batch_loss/train': 0.7021273723803461} 12/28/2021 07:25:36 - INFO - codeparrot_training - Step 40591: {'lr': 4.271955203411334e-05, 'samples': 20783104, 'steps': 40591, 'batch_loss/train': 0.7372853665146977} 12/28/2021 07:25:47 - INFO - codeparrot_training - Step 40592: {'lr': 4.2710739781860156e-05, 'samples': 20783616, 'steps': 40592, 'batch_loss/train': 0.7757987300865352} 12/28/2021 07:25:57 - INFO - codeparrot_training - Step 40593: {'lr': 4.270192835371081e-05, 'samples': 20784128, 'steps': 40593, 'batch_loss/train': 
0.662891672225669} 12/28/2021 07:26:08 - INFO - codeparrot_training - Step 40594: {'lr': 4.2693117749700336e-05, 'samples': 20784640, 'steps': 40594, 'batch_loss/train': 0.7100764852948487} 12/28/2021 07:26:20 - INFO - codeparrot_training - Step 40595: {'lr': 4.268430796986372e-05, 'samples': 20785152, 'steps': 40595, 'batch_loss/train': 0.7817408749833703} 12/28/2021 07:26:31 - INFO - codeparrot_training - Step 40596: {'lr': 4.2675499014236034e-05, 'samples': 20785664, 'steps': 40596, 'batch_loss/train': 0.766152742318809} 12/28/2021 07:26:41 - INFO - codeparrot_training - Step 40597: {'lr': 4.266669088285227e-05, 'samples': 20786176, 'steps': 40597, 'batch_loss/train': 0.7139109233394265} 12/28/2021 07:26:53 - INFO - codeparrot_training - Step 40598: {'lr': 4.265788357574746e-05, 'samples': 20786688, 'steps': 40598, 'batch_loss/train': 0.7292976537719369} 12/28/2021 07:27:04 - INFO - codeparrot_training - Step 40599: {'lr': 4.264907709295662e-05, 'samples': 20787200, 'steps': 40599, 'batch_loss/train': 0.7484232778660953} 12/28/2021 07:27:15 - INFO - codeparrot_training - Step 40600: {'lr': 4.2640271434514823e-05, 'samples': 20787712, 'steps': 40600, 'batch_loss/train': 0.8082332825288177} 12/28/2021 07:27:27 - INFO - codeparrot_training - Step 40601: {'lr': 4.263146660045686e-05, 'samples': 20788224, 'steps': 40601, 'batch_loss/train': 0.6835345765575767} 12/28/2021 07:27:38 - INFO - codeparrot_training - Step 40602: {'lr': 4.262266259081799e-05, 'samples': 20788736, 'steps': 40602, 'batch_loss/train': 0.8053344255313277} 12/28/2021 07:27:49 - INFO - codeparrot_training - Step 40603: {'lr': 4.2613859405633145e-05, 'samples': 20789248, 'steps': 40603, 'batch_loss/train': 0.7825241964310408} 12/28/2021 07:28:01 - INFO - codeparrot_training - Step 40604: {'lr': 4.2605057044937224e-05, 'samples': 20789760, 'steps': 40604, 'batch_loss/train': 0.6594318682327867} 12/28/2021 07:28:12 - INFO - codeparrot_training - Step 40605: {'lr': 4.259625550876525e-05, 'samples': 
20790272, 'steps': 40605, 'batch_loss/train': 0.8575113471597433} 12/28/2021 07:28:22 - INFO - codeparrot_training - Step 40606: {'lr': 4.258745479715234e-05, 'samples': 20790784, 'steps': 40606, 'batch_loss/train': 0.7349455670919269} 12/28/2021 07:28:33 - INFO - codeparrot_training - Step 40607: {'lr': 4.257865491013335e-05, 'samples': 20791296, 'steps': 40607, 'batch_loss/train': 0.7179251709021628} 12/28/2021 07:28:45 - INFO - codeparrot_training - Step 40608: {'lr': 4.2569855847743314e-05, 'samples': 20791808, 'steps': 40608, 'batch_loss/train': 0.7144948802888393} 12/28/2021 07:28:56 - INFO - codeparrot_training - Step 40609: {'lr': 4.2561057610017215e-05, 'samples': 20792320, 'steps': 40609, 'batch_loss/train': 0.7244045790284872} 12/28/2021 07:29:06 - INFO - codeparrot_training - Step 40610: {'lr': 4.2552260196990014e-05, 'samples': 20792832, 'steps': 40610, 'batch_loss/train': 0.6355009878752753} 12/28/2021 07:29:19 - INFO - codeparrot_training - Step 40611: {'lr': 4.254346360869671e-05, 'samples': 20793344, 'steps': 40611, 'batch_loss/train': 0.7868872210383415} 12/28/2021 07:29:29 - INFO - codeparrot_training - Step 40612: {'lr': 4.2534667845172234e-05, 'samples': 20793856, 'steps': 40612, 'batch_loss/train': 0.7301114676520228} 12/28/2021 07:29:40 - INFO - codeparrot_training - Step 40613: {'lr': 4.25258729064516e-05, 'samples': 20794368, 'steps': 40613, 'batch_loss/train': 0.747345557436347} 12/28/2021 07:29:53 - INFO - codeparrot_training - Step 40614: {'lr': 4.2517078792569745e-05, 'samples': 20794880, 'steps': 40614, 'batch_loss/train': 0.7513360283337533} 12/28/2021 07:30:03 - INFO - codeparrot_training - Step 40615: {'lr': 4.250828550356173e-05, 'samples': 20795392, 'steps': 40615, 'batch_loss/train': 0.7004601592198014} 12/28/2021 07:30:14 - INFO - codeparrot_training - Step 40616: {'lr': 4.249949303946229e-05, 'samples': 20795904, 'steps': 40616, 'batch_loss/train': 0.7247890050057322} 12/28/2021 07:30:25 - INFO - codeparrot_training - Step 
40617: {'lr': 4.2490701400306645e-05, 'samples': 20796416, 'steps': 40617, 'batch_loss/train': 0.774853796698153} 12/28/2021 07:30:37 - INFO - codeparrot_training - Step 40618: {'lr': 4.2481910586129545e-05, 'samples': 20796928, 'steps': 40618, 'batch_loss/train': 0.7173725226894021} 12/28/2021 07:30:47 - INFO - codeparrot_training - Step 40619: {'lr': 4.2473120596966005e-05, 'samples': 20797440, 'steps': 40619, 'batch_loss/train': 0.7012114259414375} 12/28/2021 07:30:58 - INFO - codeparrot_training - Step 40620: {'lr': 4.2464331432851e-05, 'samples': 20797952, 'steps': 40620, 'batch_loss/train': 0.705242620781064} 12/28/2021 07:31:11 - INFO - codeparrot_training - Step 40621: {'lr': 4.245554309381944e-05, 'samples': 20798464, 'steps': 40621, 'batch_loss/train': 0.7369975112378597} 12/28/2021 07:31:21 - INFO - codeparrot_training - Step 40622: {'lr': 4.2446755579906266e-05, 'samples': 20798976, 'steps': 40622, 'batch_loss/train': 0.751986900344491} 12/28/2021 07:31:32 - INFO - codeparrot_training - Step 40623: {'lr': 4.2437968891146425e-05, 'samples': 20799488, 'steps': 40623, 'batch_loss/train': 0.721095765940845} 12/28/2021 07:31:44 - INFO - codeparrot_training - Step 40624: {'lr': 4.242918302757487e-05, 'samples': 20800000, 'steps': 40624, 'batch_loss/train': 0.6530221067368984} 12/28/2021 07:31:55 - INFO - codeparrot_training - Step 40625: {'lr': 4.242039798922648e-05, 'samples': 20800512, 'steps': 40625, 'batch_loss/train': 0.7435789601877332} 12/28/2021 07:32:05 - INFO - codeparrot_training - Step 40626: {'lr': 4.2411613776136284e-05, 'samples': 20801024, 'steps': 40626, 'batch_loss/train': 0.6181810375419445} 12/28/2021 07:32:16 - INFO - codeparrot_training - Step 40627: {'lr': 4.2402830388339e-05, 'samples': 20801536, 'steps': 40627, 'batch_loss/train': 0.7402001670561731} 12/28/2021 07:32:28 - INFO - codeparrot_training - Step 40628: {'lr': 4.239404782586975e-05, 'samples': 20802048, 'steps': 40628, 'batch_loss/train': 0.6814148649573326} 12/28/2021 
07:32:39 - INFO - codeparrot_training - Step 40629: {'lr': 4.2385266088763445e-05, 'samples': 20802560, 'steps': 40629, 'batch_loss/train': 0.8208508482202888} 12/28/2021 07:32:50 - INFO - codeparrot_training - Step 40630: {'lr': 4.237648517705484e-05, 'samples': 20803072, 'steps': 40630, 'batch_loss/train': 0.7534816788975149} 12/28/2021 07:33:03 - INFO - codeparrot_training - Step 40631: {'lr': 4.236770509077889e-05, 'samples': 20803584, 'steps': 40631, 'batch_loss/train': 0.6500470811733976} 12/28/2021 07:33:13 - INFO - codeparrot_training - Step 40632: {'lr': 4.235892582997067e-05, 'samples': 20804096, 'steps': 40632, 'batch_loss/train': 0.6783441814477555} 12/28/2021 07:33:24 - INFO - codeparrot_training - Step 40633: {'lr': 4.2350147394664886e-05, 'samples': 20804608, 'steps': 40633, 'batch_loss/train': 1.0305607672780752} 12/28/2021 07:33:36 - INFO - codeparrot_training - Step 40634: {'lr': 4.234136978489653e-05, 'samples': 20805120, 'steps': 40634, 'batch_loss/train': 0.7072052843868732} 12/28/2021 07:33:47 - INFO - codeparrot_training - Step 40635: {'lr': 4.233259300070047e-05, 'samples': 20805632, 'steps': 40635, 'batch_loss/train': 0.7453579287976027} 12/28/2021 07:33:57 - INFO - codeparrot_training - Step 40636: {'lr': 4.232381704211161e-05, 'samples': 20806144, 'steps': 40636, 'batch_loss/train': 0.7214720237534493} 12/28/2021 07:34:10 - INFO - codeparrot_training - Step 40637: {'lr': 4.231504190916482e-05, 'samples': 20806656, 'steps': 40637, 'batch_loss/train': 0.7345947017893195} 12/28/2021 07:34:20 - INFO - codeparrot_training - Step 40638: {'lr': 4.2306267601895014e-05, 'samples': 20807168, 'steps': 40638, 'batch_loss/train': 0.7500717549119145} 12/28/2021 07:34:31 - INFO - codeparrot_training - Step 40639: {'lr': 4.2297494120337066e-05, 'samples': 20807680, 'steps': 40639, 'batch_loss/train': 0.8484051432460546} 12/28/2021 07:34:42 - INFO - codeparrot_training - Step 40640: {'lr': 4.228872146452586e-05, 'samples': 20808192, 'steps': 40640, 
'batch_loss/train': 0.7856563050299883} 12/28/2021 07:34:54 - INFO - codeparrot_training - Step 40641: {'lr': 4.227994963449633e-05, 'samples': 20808704, 'steps': 40641, 'batch_loss/train': 0.854753433726728} 12/28/2021 07:35:05 - INFO - codeparrot_training - Step 40642: {'lr': 4.227117863028318e-05, 'samples': 20809216, 'steps': 40642, 'batch_loss/train': 0.774443844333291} 12/28/2021 07:35:16 - INFO - codeparrot_training - Step 40643: {'lr': 4.226240845192147e-05, 'samples': 20809728, 'steps': 40643, 'batch_loss/train': 0.7853194219060242} 12/28/2021 07:35:28 - INFO - codeparrot_training - Step 40644: {'lr': 4.225363909944596e-05, 'samples': 20810240, 'steps': 40644, 'batch_loss/train': 0.640794061939232} 12/28/2021 07:35:39 - INFO - codeparrot_training - Step 40645: {'lr': 4.224487057289153e-05, 'samples': 20810752, 'steps': 40645, 'batch_loss/train': 0.6855551626067609} 12/28/2021 07:35:49 - INFO - codeparrot_training - Step 40646: {'lr': 4.223610287229304e-05, 'samples': 20811264, 'steps': 40646, 'batch_loss/train': 0.7159038202371448} 12/28/2021 07:36:02 - INFO - codeparrot_training - Step 40647: {'lr': 4.2227335997685384e-05, 'samples': 20811776, 'steps': 40647, 'batch_loss/train': 0.7006733543239534} 12/28/2021 07:36:13 - INFO - codeparrot_training - Step 40648: {'lr': 4.221856994910336e-05, 'samples': 20812288, 'steps': 40648, 'batch_loss/train': 0.701503056101501} 12/28/2021 07:36:23 - INFO - codeparrot_training - Step 40649: {'lr': 4.220980472658187e-05, 'samples': 20812800, 'steps': 40649, 'batch_loss/train': 0.7750974022783339} 12/28/2021 07:36:34 - INFO - codeparrot_training - Step 40650: {'lr': 4.22010403301557e-05, 'samples': 20813312, 'steps': 40650, 'batch_loss/train': 0.6398797721485607} 12/28/2021 07:36:46 - INFO - codeparrot_training - Step 40651: {'lr': 4.219227675985976e-05, 'samples': 20813824, 'steps': 40651, 'batch_loss/train': 0.7065785503946245} 12/28/2021 07:36:57 - INFO - codeparrot_training - Step 40652: {'lr': 4.2183514015728926e-05, 
'samples': 20814336, 'steps': 40652, 'batch_loss/train': 0.7394550545141101} 12/28/2021 07:37:07 - INFO - codeparrot_training - Step 40653: {'lr': 4.217475209779784e-05, 'samples': 20814848, 'steps': 40653, 'batch_loss/train': 0.7357168922899291} 12/28/2021 07:37:19 - INFO - codeparrot_training - Step 40654: {'lr': 4.216599100610152e-05, 'samples': 20815360, 'steps': 40654, 'batch_loss/train': 0.7670901552774012} 12/28/2021 07:37:30 - INFO - codeparrot_training - Step 40655: {'lr': 4.215723074067482e-05, 'samples': 20815872, 'steps': 40655, 'batch_loss/train': 0.8073156904429197} 12/28/2021 07:37:41 - INFO - codeparrot_training - Step 40656: {'lr': 4.214847130155244e-05, 'samples': 20816384, 'steps': 40656, 'batch_loss/train': 0.7686747601255774} 12/28/2021 07:37:53 - INFO - codeparrot_training - Step 40657: {'lr': 4.2139712688769176e-05, 'samples': 20816896, 'steps': 40657, 'batch_loss/train': 0.700076679000631} 12/28/2021 07:38:04 - INFO - codeparrot_training - Step 40658: {'lr': 4.213095490236008e-05, 'samples': 20817408, 'steps': 40658, 'batch_loss/train': 0.7687384532764554} 12/28/2021 07:38:14 - INFO - codeparrot_training - Step 40659: {'lr': 4.212219794235972e-05, 'samples': 20817920, 'steps': 40659, 'batch_loss/train': 0.7548835035413504} 12/28/2021 07:38:25 - INFO - codeparrot_training - Step 40660: {'lr': 4.211344180880297e-05, 'samples': 20818432, 'steps': 40660, 'batch_loss/train': 0.69797428464517} 12/28/2021 07:38:38 - INFO - codeparrot_training - Step 40661: {'lr': 4.2104686501724806e-05, 'samples': 20818944, 'steps': 40661, 'batch_loss/train': 0.7092579645104706} 12/28/2021 07:38:48 - INFO - codeparrot_training - Step 40662: {'lr': 4.209593202115983e-05, 'samples': 20819456, 'steps': 40662, 'batch_loss/train': 0.7524372008629143} 12/28/2021 07:38:59 - INFO - codeparrot_training - Step 40663: {'lr': 4.2087178367142933e-05, 'samples': 20819968, 'steps': 40663, 'batch_loss/train': 0.9000622471794486} 12/28/2021 07:39:11 - INFO - codeparrot_training - 
Step 40664: {'lr': 4.207842553970892e-05, 'samples': 20820480, 'steps': 40664, 'batch_loss/train': 0.7292219758965075} 12/28/2021 07:39:22 - INFO - codeparrot_training - Step 40665: {'lr': 4.2069673538892574e-05, 'samples': 20820992, 'steps': 40665, 'batch_loss/train': 0.9145163646899164} 12/28/2021 07:39:33 - INFO - codeparrot_training - Step 40666: {'lr': 4.206092236472869e-05, 'samples': 20821504, 'steps': 40666, 'batch_loss/train': 0.757245680782944} 12/28/2021 07:39:45 - INFO - codeparrot_training - Step 40667: {'lr': 4.2052172017252144e-05, 'samples': 20822016, 'steps': 40667, 'batch_loss/train': 0.783816454000771} 12/28/2021 07:39:55 - INFO - codeparrot_training - Step 40668: {'lr': 4.204342249649748e-05, 'samples': 20822528, 'steps': 40668, 'batch_loss/train': 0.7671822467818856} 12/28/2021 07:40:06 - INFO - codeparrot_training - Step 40669: {'lr': 4.2034673802499755e-05, 'samples': 20823040, 'steps': 40669, 'batch_loss/train': 0.7763790171593428} 12/28/2021 07:40:18 - INFO - codeparrot_training - Step 40670: {'lr': 4.202592593529367e-05, 'samples': 20823552, 'steps': 40670, 'batch_loss/train': 0.7258293356280774} 12/28/2021 07:40:29 - INFO - codeparrot_training - Step 40671: {'lr': 4.2017178894913885e-05, 'samples': 20824064, 'steps': 40671, 'batch_loss/train': 0.8097266294062138} 12/28/2021 07:40:40 - INFO - codeparrot_training - Step 40672: {'lr': 4.200843268139537e-05, 'samples': 20824576, 'steps': 40672, 'batch_loss/train': 0.6864114096388221} 12/28/2021 07:40:50 - INFO - codeparrot_training - Step 40673: {'lr': 4.199968729477271e-05, 'samples': 20825088, 'steps': 40673, 'batch_loss/train': 0.7614730102941394} 12/28/2021 07:41:02 - INFO - codeparrot_training - Step 40674: {'lr': 4.199094273508078e-05, 'samples': 20825600, 'steps': 40674, 'batch_loss/train': 0.7720507844351232} 12/28/2021 07:41:13 - INFO - codeparrot_training - Step 40675: {'lr': 4.198219900235431e-05, 'samples': 20826112, 'steps': 40675, 'batch_loss/train': 0.7784947361797094} 
12/28/2021 07:41:24 - INFO - codeparrot_training - Step 40676: {'lr': 4.197345609662809e-05, 'samples': 20826624, 'steps': 40676, 'batch_loss/train': 0.79681012686342} 12/28/2021 07:41:36 - INFO - codeparrot_training - Step 40677: {'lr': 4.196471401793683e-05, 'samples': 20827136, 'steps': 40677, 'batch_loss/train': 0.6189190764562227} 12/28/2021 07:41:47 - INFO - codeparrot_training - Step 40678: {'lr': 4.195597276631541e-05, 'samples': 20827648, 'steps': 40678, 'batch_loss/train': 0.833181472029537} 12/28/2021 07:41:58 - INFO - codeparrot_training - Step 40679: {'lr': 4.194723234179837e-05, 'samples': 20828160, 'steps': 40679, 'batch_loss/train': 0.8128392724320292} 12/28/2021 07:42:10 - INFO - codeparrot_training - Step 40680: {'lr': 4.19384927444206e-05, 'samples': 20828672, 'steps': 40680, 'batch_loss/train': 0.6114092651987448} 12/28/2021 07:42:20 - INFO - codeparrot_training - Step 40681: {'lr': 4.1929753974216914e-05, 'samples': 20829184, 'steps': 40681, 'batch_loss/train': 0.8030234957113862} 12/28/2021 07:42:31 - INFO - codeparrot_training - Step 40682: {'lr': 4.192101603122189e-05, 'samples': 20829696, 'steps': 40682, 'batch_loss/train': 0.8277191873639822} 12/28/2021 07:42:43 - INFO - codeparrot_training - Step 40683: {'lr': 4.1912278915470284e-05, 'samples': 20830208, 'steps': 40683, 'batch_loss/train': 0.655609019100666} 12/28/2021 07:42:54 - INFO - codeparrot_training - Step 40684: {'lr': 4.1903542626997e-05, 'samples': 20830720, 'steps': 40684, 'batch_loss/train': 0.7165168887004256} 12/28/2021 07:43:04 - INFO - codeparrot_training - Step 40685: {'lr': 4.189480716583657e-05, 'samples': 20831232, 'steps': 40685, 'batch_loss/train': 0.7616796847432852} 12/28/2021 07:43:15 - INFO - codeparrot_training - Step 40686: {'lr': 4.188607253202378e-05, 'samples': 20831744, 'steps': 40686, 'batch_loss/train': 0.6666535176336765} 12/28/2021 07:43:28 - INFO - codeparrot_training - Step 40687: {'lr': 4.187733872559352e-05, 'samples': 20832256, 'steps': 40687, 
'batch_loss/train': 0.7600087840110064} 12/28/2021 07:43:38 - INFO - codeparrot_training - Step 40688: {'lr': 4.18686057465803e-05, 'samples': 20832768, 'steps': 40688, 'batch_loss/train': 0.6358608029549941} 12/28/2021 07:43:49 - INFO - codeparrot_training - Step 40689: {'lr': 4.1859873595018916e-05, 'samples': 20833280, 'steps': 40689, 'batch_loss/train': 0.8583040460944176} 12/28/2021 07:44:01 - INFO - codeparrot_training - Step 40690: {'lr': 4.185114227094408e-05, 'samples': 20833792, 'steps': 40690, 'batch_loss/train': 0.7759388415142894} 12/28/2021 07:44:12 - INFO - codeparrot_training - Step 40691: {'lr': 4.1842411774390495e-05, 'samples': 20834304, 'steps': 40691, 'batch_loss/train': 0.8128209235146642} 12/28/2021 07:44:22 - INFO - codeparrot_training - Step 40692: {'lr': 4.18336821053929e-05, 'samples': 20834816, 'steps': 40692, 'batch_loss/train': 0.7875097296200693} 12/28/2021 07:44:34 - INFO - codeparrot_training - Step 40693: {'lr': 4.1824953263986056e-05, 'samples': 20835328, 'steps': 40693, 'batch_loss/train': 0.7186686978675425} 12/28/2021 07:44:45 - INFO - codeparrot_training - Step 40694: {'lr': 4.1816225250204455e-05, 'samples': 20835840, 'steps': 40694, 'batch_loss/train': 0.7415815033018589} 12/28/2021 07:44:56 - INFO - codeparrot_training - Step 40695: {'lr': 4.1807498064082986e-05, 'samples': 20836352, 'steps': 40695, 'batch_loss/train': 0.6528902454301715} 12/28/2021 07:45:08 - INFO - codeparrot_training - Step 40696: {'lr': 4.1798771705656377e-05, 'samples': 20836864, 'steps': 40696, 'batch_loss/train': 0.7602603398263454} 12/28/2021 07:45:19 - INFO - codeparrot_training - Step 40697: {'lr': 4.179004617495913e-05, 'samples': 20837376, 'steps': 40697, 'batch_loss/train': 0.7167619881220162} 12/28/2021 07:45:30 - INFO - codeparrot_training - Step 40698: {'lr': 4.178132147202612e-05, 'samples': 20837888, 'steps': 40698, 'batch_loss/train': 0.7573718414641917} 12/28/2021 07:45:40 - INFO - codeparrot_training - Step 40699: {'lr': 
4.177259759689192e-05, 'samples': 20838400, 'steps': 40699, 'batch_loss/train': 0.664089894387871} 12/28/2021 07:45:52 - INFO - codeparrot_training - Step 40700: {'lr': 4.176387454959127e-05, 'samples': 20838912, 'steps': 40700, 'batch_loss/train': 0.8720166247803718} 12/28/2021 07:46:03 - INFO - codeparrot_training - Step 40701: {'lr': 4.175515233015878e-05, 'samples': 20839424, 'steps': 40701, 'batch_loss/train': 0.8457121839746833} 12/28/2021 07:46:14 - INFO - codeparrot_training - Step 40702: {'lr': 4.1746430938629206e-05, 'samples': 20839936, 'steps': 40702, 'batch_loss/train': 0.877267955802381} 12/28/2021 07:46:26 - INFO - codeparrot_training - Step 40703: {'lr': 4.173771037503718e-05, 'samples': 20840448, 'steps': 40703, 'batch_loss/train': 1.4148209234699607} 12/28/2021 07:46:37 - INFO - codeparrot_training - Step 40704: {'lr': 4.1728990639417405e-05, 'samples': 20840960, 'steps': 40704, 'batch_loss/train': 0.798246799968183} 12/28/2021 07:46:47 - INFO - codeparrot_training - Step 40705: {'lr': 4.1720271731804484e-05, 'samples': 20841472, 'steps': 40705, 'batch_loss/train': 0.8080144841223955} 12/28/2021 07:47:00 - INFO - codeparrot_training - Step 40706: {'lr': 4.1711553652233144e-05, 'samples': 20841984, 'steps': 40706, 'batch_loss/train': 0.642435138579458} 12/28/2021 07:47:11 - INFO - codeparrot_training - Step 40707: {'lr': 4.170283640073808e-05, 'samples': 20842496, 'steps': 40707, 'batch_loss/train': 0.7317743799649179} 12/28/2021 07:47:22 - INFO - codeparrot_training - Step 40708: {'lr': 4.1694119977353753e-05, 'samples': 20843008, 'steps': 40708, 'batch_loss/train': 0.7794406842440367} 12/28/2021 07:47:32 - INFO - codeparrot_training - Step 40709: {'lr': 4.168540438211502e-05, 'samples': 20843520, 'steps': 40709, 'batch_loss/train': 0.7453263686038554} 12/28/2021 07:47:44 - INFO - codeparrot_training - Step 40710: {'lr': 4.1676689615056525e-05, 'samples': 20844032, 'steps': 40710, 'batch_loss/train': 0.5665198883507401} 12/28/2021 07:47:55 - INFO 
- codeparrot_training - Step 40711: {'lr': 4.1667975676212775e-05, 'samples': 20844544, 'steps': 40711, 'batch_loss/train': 0.7288318555802107} 12/28/2021 07:48:06 - INFO - codeparrot_training - Step 40712: {'lr': 4.1659262565618447e-05, 'samples': 20845056, 'steps': 40712, 'batch_loss/train': 0.65050786617212} 12/28/2021 07:48:18 - INFO - codeparrot_training - Step 40713: {'lr': 4.165055028330836e-05, 'samples': 20845568, 'steps': 40713, 'batch_loss/train': 0.65991983236745} 12/28/2021 07:48:28 - INFO - codeparrot_training - Step 40714: {'lr': 4.164183882931694e-05, 'samples': 20846080, 'steps': 40714, 'batch_loss/train': 0.7103259917348623} 12/28/2021 07:48:39 - INFO - codeparrot_training - Step 40715: {'lr': 4.1633128203678874e-05, 'samples': 20846592, 'steps': 40715, 'batch_loss/train': 0.6915511819534004} 12/28/2021 07:48:52 - INFO - codeparrot_training - Step 40716: {'lr': 4.1624418406428834e-05, 'samples': 20847104, 'steps': 40716, 'batch_loss/train': 0.733995626680553} 12/28/2021 07:49:02 - INFO - codeparrot_training - Step 40717: {'lr': 4.161570943760143e-05, 'samples': 20847616, 'steps': 40717, 'batch_loss/train': 0.7224213387817144} 12/28/2021 07:49:13 - INFO - codeparrot_training - Step 40718: {'lr': 4.1607001297231274e-05, 'samples': 20848128, 'steps': 40718, 'batch_loss/train': 0.7033664523623884} 12/28/2021 07:49:24 - INFO - codeparrot_training - Step 40719: {'lr': 4.159829398535306e-05, 'samples': 20848640, 'steps': 40719, 'batch_loss/train': 0.6937895249575377} 12/28/2021 07:49:36 - INFO - codeparrot_training - Step 40720: {'lr': 4.158958750200123e-05, 'samples': 20849152, 'steps': 40720, 'batch_loss/train': 0.7203337200917304} 12/28/2021 07:49:47 - INFO - codeparrot_training - Step 40721: {'lr': 4.158088184721054e-05, 'samples': 20849664, 'steps': 40721, 'batch_loss/train': 0.7367396047338843} 12/28/2021 07:49:57 - INFO - codeparrot_training - Step 40722: {'lr': 4.157217702101565e-05, 'samples': 20850176, 'steps': 40722, 'batch_loss/train': 
0.7417818706016988} 12/28/2021 07:50:09 - INFO - codeparrot_training - Step 40723: {'lr': 4.1563473023450956e-05, 'samples': 20850688, 'steps': 40723, 'batch_loss/train': 0.6719220378436148} 12/28/2021 07:50:20 - INFO - codeparrot_training - Step 40724: {'lr': 4.155476985455131e-05, 'samples': 20851200, 'steps': 40724, 'batch_loss/train': 0.7026813069824129} 12/28/2021 07:50:31 - INFO - codeparrot_training - Step 40725: {'lr': 4.1546067514351164e-05, 'samples': 20851712, 'steps': 40725, 'batch_loss/train': 0.8174453256651759} 12/28/2021 07:50:43 - INFO - codeparrot_training - Step 40726: {'lr': 4.15373660028851e-05, 'samples': 20852224, 'steps': 40726, 'batch_loss/train': 0.6363631887361407} 12/28/2021 07:50:54 - INFO - codeparrot_training - Step 40727: {'lr': 4.1528665320187784e-05, 'samples': 20852736, 'steps': 40727, 'batch_loss/train': 0.6799150351434946} 12/28/2021 07:51:04 - INFO - codeparrot_training - Step 40728: {'lr': 4.151996546629377e-05, 'samples': 20853248, 'steps': 40728, 'batch_loss/train': 0.7785804076120257} 12/28/2021 07:51:17 - INFO - codeparrot_training - Step 40729: {'lr': 4.151126644123765e-05, 'samples': 20853760, 'steps': 40729, 'batch_loss/train': 0.601430131791858} 12/28/2021 07:51:27 - INFO - codeparrot_training - Step 40730: {'lr': 4.150256824505402e-05, 'samples': 20854272, 'steps': 40730, 'batch_loss/train': 0.7245340086519718} 12/28/2021 07:51:38 - INFO - codeparrot_training - Step 40731: {'lr': 4.149387087777745e-05, 'samples': 20854784, 'steps': 40731, 'batch_loss/train': 0.7662334060296416} 12/28/2021 07:51:49 - INFO - codeparrot_training - Step 40732: {'lr': 4.1485174339442514e-05, 'samples': 20855296, 'steps': 40732, 'batch_loss/train': 0.7629100979538634} 12/28/2021 07:52:01 - INFO - codeparrot_training - Step 40733: {'lr': 4.147647863008386e-05, 'samples': 20855808, 'steps': 40733, 'batch_loss/train': 0.6488250037655234} 12/28/2021 07:52:12 - INFO - codeparrot_training - Step 40734: {'lr': 4.146778374973587e-05, 'samples': 
20856320, 'steps': 40734, 'batch_loss/train': 0.7172049188520759} 12/28/2021 07:52:23 - INFO - codeparrot_training - Step 40735: {'lr': 4.145908969843329e-05, 'samples': 20856832, 'steps': 40735, 'batch_loss/train': 0.7835246324539185} 12/28/2021 07:52:35 - INFO - codeparrot_training - Step 40736: {'lr': 4.1450396476210686e-05, 'samples': 20857344, 'steps': 40736, 'batch_loss/train': 0.7479403633624315} 12/28/2021 07:52:45 - INFO - codeparrot_training - Step 40737: {'lr': 4.1441704083102495e-05, 'samples': 20857856, 'steps': 40737, 'batch_loss/train': 0.5921849808655679} 12/28/2021 07:52:56 - INFO - codeparrot_training - Step 40738: {'lr': 4.14330125191433e-05, 'samples': 20858368, 'steps': 40738, 'batch_loss/train': 0.6953566945157945} 12/28/2021 07:53:08 - INFO - codeparrot_training - Step 40739: {'lr': 4.142432178436781e-05, 'samples': 20858880, 'steps': 40739, 'batch_loss/train': 0.7310928259976208} 12/28/2021 07:53:19 - INFO - codeparrot_training - Step 40740: {'lr': 4.141563187881039e-05, 'samples': 20859392, 'steps': 40740, 'batch_loss/train': 0.7639391941484064} 12/28/2021 07:53:29 - INFO - codeparrot_training - Step 40741: {'lr': 4.140694280250568e-05, 'samples': 20859904, 'steps': 40741, 'batch_loss/train': 0.6032217959873378} 12/28/2021 07:53:40 - INFO - codeparrot_training - Step 40742: {'lr': 4.139825455548818e-05, 'samples': 20860416, 'steps': 40742, 'batch_loss/train': 0.7010692185722291} 12/28/2021 07:53:52 - INFO - codeparrot_training - Step 40743: {'lr': 4.1389567137792464e-05, 'samples': 20860928, 'steps': 40743, 'batch_loss/train': 0.7107723108492792} 12/28/2021 07:54:03 - INFO - codeparrot_training - Step 40744: {'lr': 4.138088054945305e-05, 'samples': 20861440, 'steps': 40744, 'batch_loss/train': 0.717473323456943} 12/28/2021 07:54:13 - INFO - codeparrot_training - Step 40745: {'lr': 4.1372194790504496e-05, 'samples': 20861952, 'steps': 40745, 'batch_loss/train': 0.8085134685970843} 12/28/2021 07:54:26 - INFO - codeparrot_training - Step 
40746: {'lr': 4.136350986098133e-05, 'samples': 20862464, 'steps': 40746, 'batch_loss/train': 0.6964029748924077} 12/28/2021 07:54:37 - INFO - codeparrot_training - Step 40747: {'lr': 4.135482576091806e-05, 'samples': 20862976, 'steps': 40747, 'batch_loss/train': 0.7161815314320847} 12/28/2021 07:54:47 - INFO - codeparrot_training - Step 40748: {'lr': 4.1346142490349266e-05, 'samples': 20863488, 'steps': 40748, 'batch_loss/train': 0.7587327612563968} 12/28/2021 07:55:00 - INFO - codeparrot_training - Step 40749: {'lr': 4.133746004930933e-05, 'samples': 20864000, 'steps': 40749, 'batch_loss/train': 0.7076088883914053} 12/28/2021 07:55:10 - INFO - codeparrot_training - Step 40750: {'lr': 4.132877843783292e-05, 'samples': 20864512, 'steps': 40750, 'batch_loss/train': 0.6965926848351955} 12/28/2021 07:55:21 - INFO - codeparrot_training - Step 40751: {'lr': 4.132009765595454e-05, 'samples': 20865024, 'steps': 40751, 'batch_loss/train': 0.7552621948998421} 12/28/2021 07:55:31 - INFO - codeparrot_training - Step 40752: {'lr': 4.131141770370861e-05, 'samples': 20865536, 'steps': 40752, 'batch_loss/train': 0.7737587029114366} 12/28/2021 07:55:44 - INFO - codeparrot_training - Step 40753: {'lr': 4.130273858112968e-05, 'samples': 20866048, 'steps': 40753, 'batch_loss/train': 0.7317665731534362} 12/28/2021 07:55:54 - INFO - codeparrot_training - Step 40754: {'lr': 4.129406028825228e-05, 'samples': 20866560, 'steps': 40754, 'batch_loss/train': 0.7342617083340883} 12/28/2021 07:56:05 - INFO - codeparrot_training - Step 40755: {'lr': 4.128538282511088e-05, 'samples': 20867072, 'steps': 40755, 'batch_loss/train': 0.8336591525003314} 12/28/2021 07:56:18 - INFO - codeparrot_training - Step 40756: {'lr': 4.127670619173998e-05, 'samples': 20867584, 'steps': 40756, 'batch_loss/train': 0.8310557792428881} 12/28/2021 07:56:28 - INFO - codeparrot_training - Step 40757: {'lr': 4.1268030388174086e-05, 'samples': 20868096, 'steps': 40757, 'batch_loss/train': 0.6094690192257985} 12/28/2021 
07:56:39 - INFO - codeparrot_training - Step 40758: {'lr': 4.1259355414447695e-05, 'samples': 20868608, 'steps': 40758, 'batch_loss/train': 0.8312271465547383} 12/28/2021 07:56:52 - INFO - codeparrot_training - Step 40759: {'lr': 4.125068127059528e-05, 'samples': 20869120, 'steps': 40759, 'batch_loss/train': 0.6718096631811932} 12/28/2021 07:57:02 - INFO - codeparrot_training - Step 40760: {'lr': 4.124200795665134e-05, 'samples': 20869632, 'steps': 40760, 'batch_loss/train': 0.7605562536045909} 12/28/2021 07:57:13 - INFO - codeparrot_training - Step 40761: {'lr': 4.1233335472650374e-05, 'samples': 20870144, 'steps': 40761, 'batch_loss/train': 0.8100473936647177} 12/28/2021 07:57:23 - INFO - codeparrot_training - Step 40762: {'lr': 4.122466381862688e-05, 'samples': 20870656, 'steps': 40762, 'batch_loss/train': 0.7434199675917625} 12/28/2021 07:57:36 - INFO - codeparrot_training - Step 40763: {'lr': 4.121599299461523e-05, 'samples': 20871168, 'steps': 40763, 'batch_loss/train': 0.7912529641762376} 12/28/2021 07:57:47 - INFO - codeparrot_training - Step 40764: {'lr': 4.1207323000649915e-05, 'samples': 20871680, 'steps': 40764, 'batch_loss/train': 0.7944597192108631} 12/28/2021 07:57:58 - INFO - codeparrot_training - Step 40765: {'lr': 4.1198653836765546e-05, 'samples': 20872192, 'steps': 40765, 'batch_loss/train': 0.6240514607925434} 12/28/2021 07:58:10 - INFO - codeparrot_training - Step 40766: {'lr': 4.118998550299644e-05, 'samples': 20872704, 'steps': 40766, 'batch_loss/train': 0.7475660988129675} 12/28/2021 07:58:21 - INFO - codeparrot_training - Step 40767: {'lr': 4.1181317999377106e-05, 'samples': 20873216, 'steps': 40767, 'batch_loss/train': 0.8647178076207638} 12/28/2021 07:58:31 - INFO - codeparrot_training - Step 40768: {'lr': 4.1172651325942004e-05, 'samples': 20873728, 'steps': 40768, 'batch_loss/train': 0.589450455037877} 12/28/2021 07:58:42 - INFO - codeparrot_training - Step 40769: {'lr': 4.116398548272557e-05, 'samples': 20874240, 'steps': 40769, 
'batch_loss/train': 0.7925018453970551} 12/28/2021 07:58:54 - INFO - codeparrot_training - Step 40770: {'lr': 4.115532046976231e-05, 'samples': 20874752, 'steps': 40770, 'batch_loss/train': 0.8813148494809866} 12/28/2021 07:59:05 - INFO - codeparrot_training - Step 40771: {'lr': 4.114665628708661e-05, 'samples': 20875264, 'steps': 40771, 'batch_loss/train': 0.7590142348781228} 12/28/2021 07:59:15 - INFO - codeparrot_training - Step 40772: {'lr': 4.113799293473297e-05, 'samples': 20875776, 'steps': 40772, 'batch_loss/train': 0.7334465449675918} 12/28/2021 07:59:28 - INFO - codeparrot_training - Step 40773: {'lr': 4.112933041273581e-05, 'samples': 20876288, 'steps': 40773, 'batch_loss/train': 0.7317959188949317} 12/28/2021 07:59:39 - INFO - codeparrot_training - Step 40774: {'lr': 4.11206687211296e-05, 'samples': 20876800, 'steps': 40774, 'batch_loss/train': 0.7937497952952981} 12/28/2021 07:59:50 - INFO - codeparrot_training - Step 40775: {'lr': 4.111200785994865e-05, 'samples': 20877312, 'steps': 40775, 'batch_loss/train': 0.6856520976871252} 12/28/2021 08:00:02 - INFO - codeparrot_training - Step 40776: {'lr': 4.110334782922751e-05, 'samples': 20877824, 'steps': 40776, 'batch_loss/train': 0.6960872863419354} 12/28/2021 08:00:12 - INFO - codeparrot_training - Step 40777: {'lr': 4.1094688629000675e-05, 'samples': 20878336, 'steps': 40777, 'batch_loss/train': 0.8210043804720044} 12/28/2021 08:00:23 - INFO - codeparrot_training - Step 40778: {'lr': 4.108603025930238e-05, 'samples': 20878848, 'steps': 40778, 'batch_loss/train': 0.7882100150454789} 12/28/2021 08:00:35 - INFO - codeparrot_training - Step 40779: {'lr': 4.1077372720167176e-05, 'samples': 20879360, 'steps': 40779, 'batch_loss/train': 0.7245460618287325} 12/28/2021 08:00:46 - INFO - codeparrot_training - Step 40780: {'lr': 4.1068716011629457e-05, 'samples': 20879872, 'steps': 40780, 'batch_loss/train': 0.638135762361344} 12/28/2021 08:00:56 - INFO - codeparrot_training - Step 40781: {'lr': 
4.1060060133723626e-05, 'samples': 20880384, 'steps': 40781, 'batch_loss/train': 0.5251462738378905} 12/28/2021 08:01:07 - INFO - codeparrot_training - Step 40782: {'lr': 4.105140508648408e-05, 'samples': 20880896, 'steps': 40782, 'batch_loss/train': 0.6975575862452388} 12/28/2021 08:01:19 - INFO - codeparrot_training - Step 40783: {'lr': 4.104275086994527e-05, 'samples': 20881408, 'steps': 40783, 'batch_loss/train': 0.6967811306240037} 12/28/2021 08:01:30 - INFO - codeparrot_training - Step 40784: {'lr': 4.103409748414161e-05, 'samples': 20881920, 'steps': 40784, 'batch_loss/train': 0.8066884269937873} 12/28/2021 08:01:40 - INFO - codeparrot_training - Step 40785: {'lr': 4.102544492910743e-05, 'samples': 20882432, 'steps': 40785, 'batch_loss/train': 0.7094352636486292} 12/28/2021 08:01:53 - INFO - codeparrot_training - Step 40786: {'lr': 4.1016793204877196e-05, 'samples': 20882944, 'steps': 40786, 'batch_loss/train': 0.7244310521055013} 12/28/2021 08:02:04 - INFO - codeparrot_training - Step 40787: {'lr': 4.100814231148528e-05, 'samples': 20883456, 'steps': 40787, 'batch_loss/train': 0.6767775388434529} 12/28/2021 08:02:14 - INFO - codeparrot_training - Step 40788: {'lr': 4.0999492248966155e-05, 'samples': 20883968, 'steps': 40788, 'batch_loss/train': 0.7895207339897752} 12/28/2021 08:02:26 - INFO - codeparrot_training - Step 40789: {'lr': 4.0990843017354063e-05, 'samples': 20884480, 'steps': 40789, 'batch_loss/train': 0.7867800313979387} 12/28/2021 08:02:37 - INFO - codeparrot_training - Step 40790: {'lr': 4.098219461668337e-05, 'samples': 20884992, 'steps': 40790, 'batch_loss/train': 0.7835348092485219} 12/28/2021 08:02:48 - INFO - codeparrot_training - Step 40791: {'lr': 4.097354704698872e-05, 'samples': 20885504, 'steps': 40791, 'batch_loss/train': 0.8889917978085577} 12/28/2021 08:03:01 - INFO - codeparrot_training - Step 40792: {'lr': 4.0964900308304216e-05, 'samples': 20886016, 'steps': 40792, 'batch_loss/train': 0.6146574900485575} 12/28/2021 08:03:11 - 
INFO - codeparrot_training - Step 40793: {'lr': 4.095625440066436e-05, 'samples': 20886528, 'steps': 40793, 'batch_loss/train': 0.7280031447298825} 12/28/2021 08:03:22 - INFO - codeparrot_training - Step 40794: {'lr': 4.0947609324103515e-05, 'samples': 20887040, 'steps': 40794, 'batch_loss/train': 0.7498950269073248} 12/28/2021 08:03:33 - INFO - codeparrot_training - Step 40795: {'lr': 4.093896507865602e-05, 'samples': 20887552, 'steps': 40795, 'batch_loss/train': 0.6845525964163244} 12/28/2021 08:03:45 - INFO - codeparrot_training - Step 40796: {'lr': 4.0930321664356287e-05, 'samples': 20888064, 'steps': 40796, 'batch_loss/train': 0.8108875080943108} 12/28/2021 08:03:56 - INFO - codeparrot_training - Step 40797: {'lr': 4.0921679081238646e-05, 'samples': 20888576, 'steps': 40797, 'batch_loss/train': 0.6928751524537802} 12/28/2021 08:04:06 - INFO - codeparrot_training - Step 40798: {'lr': 4.091303732933746e-05, 'samples': 20889088, 'steps': 40798, 'batch_loss/train': 0.7400817507877946} 12/28/2021 08:04:19 - INFO - codeparrot_training - Step 40799: {'lr': 4.0904396408687065e-05, 'samples': 20889600, 'steps': 40799, 'batch_loss/train': 1.0234768162481487} 12/28/2021 08:04:29 - INFO - codeparrot_training - Step 40800: {'lr': 4.0895756319321936e-05, 'samples': 20890112, 'steps': 40800, 'batch_loss/train': 0.9282355844043195} 12/28/2021 08:04:40 - INFO - codeparrot_training - Step 40801: {'lr': 4.088711706127618e-05, 'samples': 20890624, 'steps': 40801, 'batch_loss/train': 0.7572405622922815} 12/28/2021 08:04:51 - INFO - codeparrot_training - Step 40802: {'lr': 4.0878478634584356e-05, 'samples': 20891136, 'steps': 40802, 'batch_loss/train': 0.7469926429912448} 12/28/2021 08:05:03 - INFO - codeparrot_training - Step 40803: {'lr': 4.086984103928079e-05, 'samples': 20891648, 'steps': 40803, 'batch_loss/train': 0.5036926487227902} 12/28/2021 08:05:14 - INFO - codeparrot_training - Step 40804: {'lr': 4.0861204275399744e-05, 'samples': 20892160, 'steps': 40804, 
'batch_loss/train': 0.6855258527211845} 12/28/2021 08:05:25 - INFO - codeparrot_training - Step 40805: {'lr': 4.085256834297554e-05, 'samples': 20892672, 'steps': 40805, 'batch_loss/train': 0.7988513829186559} 12/28/2021 08:05:37 - INFO - codeparrot_training - Step 40806: {'lr': 4.084393324204258e-05, 'samples': 20893184, 'steps': 40806, 'batch_loss/train': 0.7118301806040108} 12/28/2021 08:05:47 - INFO - codeparrot_training - Step 40807: {'lr': 4.083529897263516e-05, 'samples': 20893696, 'steps': 40807, 'batch_loss/train': 0.7569132214412093} 12/28/2021 08:05:58 - INFO - codeparrot_training - Step 40808: {'lr': 4.082666553478762e-05, 'samples': 20894208, 'steps': 40808, 'batch_loss/train': 0.7219583834521472} 12/28/2021 08:06:10 - INFO - codeparrot_training - Step 40809: {'lr': 4.081803292853428e-05, 'samples': 20894720, 'steps': 40809, 'batch_loss/train': 0.6599214193411171} 12/28/2021 08:06:21 - INFO - codeparrot_training - Step 40810: {'lr': 4.0809401153909436e-05, 'samples': 20895232, 'steps': 40810, 'batch_loss/train': 0.6745909973978996} 12/28/2021 08:06:31 - INFO - codeparrot_training - Step 40811: {'lr': 4.080077021094744e-05, 'samples': 20895744, 'steps': 40811, 'batch_loss/train': 0.706435113446787} 12/28/2021 08:06:46 - INFO - codeparrot_training - Step 40812: {'lr': 4.07921400996826e-05, 'samples': 20896256, 'steps': 40812, 'batch_loss/train': 0.7253389987163246} 12/28/2021 08:06:56 - INFO - codeparrot_training - Step 40813: {'lr': 4.078351082014919e-05, 'samples': 20896768, 'steps': 40813, 'batch_loss/train': 0.7810253719799221} 12/28/2021 08:07:07 - INFO - codeparrot_training - Step 40814: {'lr': 4.0774882372381636e-05, 'samples': 20897280, 'steps': 40814, 'batch_loss/train': 0.7343464647419751} 12/28/2021 08:07:18 - INFO - codeparrot_training - Step 40815: {'lr': 4.076625475641407e-05, 'samples': 20897792, 'steps': 40815, 'batch_loss/train': 0.7067703192587942} 12/28/2021 08:07:30 - INFO - codeparrot_training - Step 40816: {'lr': 
4.07576279722808e-05, 'samples': 20898304, 'steps': 40816, 'batch_loss/train': 0.7177039822563529} 12/28/2021 08:07:40 - INFO - codeparrot_training - Step 40817: {'lr': 4.0749002020016355e-05, 'samples': 20898816, 'steps': 40817, 'batch_loss/train': 0.6973445336334407} 12/28/2021 08:07:51 - INFO - codeparrot_training - Step 40818: {'lr': 4.074037689965476e-05, 'samples': 20899328, 'steps': 40818, 'batch_loss/train': 0.7521849544718862} 12/28/2021 08:08:05 - INFO - codeparrot_training - Step 40819: {'lr': 4.073175261123038e-05, 'samples': 20899840, 'steps': 40819, 'batch_loss/train': 0.6470858384855092} 12/28/2021 08:08:16 - INFO - codeparrot_training - Step 40820: {'lr': 4.072312915477766e-05, 'samples': 20900352, 'steps': 40820, 'batch_loss/train': 0.6911444240249693} 12/28/2021 08:08:26 - INFO - codeparrot_training - Step 40821: {'lr': 4.0714506530330684e-05, 'samples': 20900864, 'steps': 40821, 'batch_loss/train': 0.6977413343265653} 12/28/2021 08:08:39 - INFO - codeparrot_training - Step 40822: {'lr': 4.07058847379238e-05, 'samples': 20901376, 'steps': 40822, 'batch_loss/train': 0.692427780944854} 12/28/2021 08:08:49 - INFO - codeparrot_training - Step 40823: {'lr': 4.069726377759131e-05, 'samples': 20901888, 'steps': 40823, 'batch_loss/train': 0.7235128059983253} 12/28/2021 08:09:00 - INFO - codeparrot_training - Step 40824: {'lr': 4.0688643649367475e-05, 'samples': 20902400, 'steps': 40824, 'batch_loss/train': 0.6572346081957221} 12/28/2021 08:09:11 - INFO - codeparrot_training - Step 40825: {'lr': 4.068002435328655e-05, 'samples': 20902912, 'steps': 40825, 'batch_loss/train': 0.544927520386409} 12/28/2021 08:09:23 - INFO - codeparrot_training - Step 40826: {'lr': 4.0671405889382865e-05, 'samples': 20903424, 'steps': 40826, 'batch_loss/train': 0.5916246909182519} 12/28/2021 08:09:33 - INFO - codeparrot_training - Step 40827: {'lr': 4.0662788257690516e-05, 'samples': 20903936, 'steps': 40827, 'batch_loss/train': 0.7087174896150827} 12/28/2021 08:09:44 - INFO - 
codeparrot_training - Step 40828: {'lr': 4.065417145824393e-05, 'samples': 20904448, 'steps': 40828, 'batch_loss/train': 0.658261013100855} 12/28/2021 08:09:58 - INFO - codeparrot_training - Step 40829: {'lr': 4.06455554910774e-05, 'samples': 20904960, 'steps': 40829, 'batch_loss/train': 0.6932759964838624} 12/28/2021 08:10:09 - INFO - codeparrot_training - Step 40830: {'lr': 4.0636940356225005e-05, 'samples': 20905472, 'steps': 40830, 'batch_loss/train': 0.7265910124406219} 12/28/2021 08:10:19 - INFO - codeparrot_training - Step 40831: {'lr': 4.062832605372102e-05, 'samples': 20905984, 'steps': 40831, 'batch_loss/train': 0.6284644283587113} 12/28/2021 08:10:32 - INFO - codeparrot_training - Step 40832: {'lr': 4.061971258359989e-05, 'samples': 20906496, 'steps': 40832, 'batch_loss/train': 0.8130172928795218} 12/28/2021 08:10:42 - INFO - codeparrot_training - Step 40833: {'lr': 4.061109994589568e-05, 'samples': 20907008, 'steps': 40833, 'batch_loss/train': 0.9580638236948289} 12/28/2021 08:10:53 - INFO - codeparrot_training - Step 40834: {'lr': 4.0602488140642655e-05, 'samples': 20907520, 'steps': 40834, 'batch_loss/train': 0.7190508742351085} 12/28/2021 08:11:03 - INFO - codeparrot_training - Step 40835: {'lr': 4.059387716787505e-05, 'samples': 20908032, 'steps': 40835, 'batch_loss/train': 0.6590245272964239} 12/28/2021 08:11:16 - INFO - codeparrot_training - Step 40836: {'lr': 4.058526702762716e-05, 'samples': 20908544, 'steps': 40836, 'batch_loss/train': 0.7385020204819739} 12/28/2021 08:11:26 - INFO - codeparrot_training - Step 40837: {'lr': 4.0576657719933156e-05, 'samples': 20909056, 'steps': 40837, 'batch_loss/train': 0.6754777510650456} 12/28/2021 08:11:37 - INFO - codeparrot_training - Step 40838: {'lr': 4.056804924482729e-05, 'samples': 20909568, 'steps': 40838, 'batch_loss/train': 0.6502916989848018} 12/28/2021 08:11:51 - INFO - codeparrot_training - Step 40839: {'lr': 4.055944160234376e-05, 'samples': 20910080, 'steps': 40839, 'batch_loss/train': 
0.6659611454233527} 12/28/2021 08:12:01 - INFO - codeparrot_training - Step 40840: {'lr': 4.055083479251684e-05, 'samples': 20910592, 'steps': 40840, 'batch_loss/train': 0.6685709408484399} 12/28/2021 08:12:12 - INFO - codeparrot_training - Step 40841: {'lr': 4.054222881538075e-05, 'samples': 20911104, 'steps': 40841, 'batch_loss/train': 0.6577684483199846} 12/28/2021 08:12:24 - INFO - codeparrot_training - Step 40842: {'lr': 4.053362367096958e-05, 'samples': 20911616, 'steps': 40842, 'batch_loss/train': 0.6542776407732163} 12/28/2021 08:12:35 - INFO - codeparrot_training - Step 40843: {'lr': 4.052501935931774e-05, 'samples': 20912128, 'steps': 40843, 'batch_loss/train': 0.6373980494390707} 12/28/2021 08:12:46 - INFO - codeparrot_training - Step 40844: {'lr': 4.0516415880459264e-05, 'samples': 20912640, 'steps': 40844, 'batch_loss/train': 0.7946234801784158} 12/28/2021 08:12:58 - INFO - codeparrot_training - Step 40845: {'lr': 4.050781323442834e-05, 'samples': 20913152, 'steps': 40845, 'batch_loss/train': 0.5628054953704122} 12/28/2021 08:13:08 - INFO - codeparrot_training - Step 40846: {'lr': 4.04992114212594e-05, 'samples': 20913664, 'steps': 40846, 'batch_loss/train': 0.692930793389678} 12/28/2021 08:13:19 - INFO - codeparrot_training - Step 40847: {'lr': 4.049061044098642e-05, 'samples': 20914176, 'steps': 40847, 'batch_loss/train': 0.6830710208741948} 12/28/2021 08:13:30 - INFO - codeparrot_training - Step 40848: {'lr': 4.048201029364365e-05, 'samples': 20914688, 'steps': 40848, 'batch_loss/train': 0.7237851349636912} 12/28/2021 08:13:42 - INFO - codeparrot_training - Step 40849: {'lr': 4.0473410979265326e-05, 'samples': 20915200, 'steps': 40849, 'batch_loss/train': 0.7243833700194955} 12/28/2021 08:13:52 - INFO - codeparrot_training - Step 40850: {'lr': 4.0464812497885586e-05, 'samples': 20915712, 'steps': 40850, 'batch_loss/train': 0.7025236571207643} 12/28/2021 08:14:03 - INFO - codeparrot_training - Step 40851: {'lr': 4.045621484953865e-05, 'samples': 
20916224, 'steps': 40851, 'batch_loss/train': 0.6987596522085369} 12/28/2021 08:14:17 - INFO - codeparrot_training - Step 40852: {'lr': 4.044761803425873e-05, 'samples': 20916736, 'steps': 40852, 'batch_loss/train': 0.6870682072476484} 12/28/2021 08:14:28 - INFO - codeparrot_training - Step 40853: {'lr': 4.043902205207983e-05, 'samples': 20917248, 'steps': 40853, 'batch_loss/train': 0.7131095523945987} 12/28/2021 08:14:38 - INFO - codeparrot_training - Step 40854: {'lr': 4.0430426903036336e-05, 'samples': 20917760, 'steps': 40854, 'batch_loss/train': 0.6667875002603978} 12/28/2021 08:14:51 - INFO - codeparrot_training - Step 40855: {'lr': 4.0421832587162366e-05, 'samples': 20918272, 'steps': 40855, 'batch_loss/train': 0.5813282279996201} 12/28/2021 08:15:01 - INFO - codeparrot_training - Step 40856: {'lr': 4.0413239104491964e-05, 'samples': 20918784, 'steps': 40856, 'batch_loss/train': 0.7102663051337004} 12/28/2021 08:15:12 - INFO - codeparrot_training - Step 40857: {'lr': 4.040464645505943e-05, 'samples': 20919296, 'steps': 40857, 'batch_loss/train': 0.7749960874207318} 12/28/2021 08:15:26 - INFO - codeparrot_training - Step 40858: {'lr': 4.039605463889892e-05, 'samples': 20919808, 'steps': 40858, 'batch_loss/train': 0.7783101070672274} 12/28/2021 08:15:37 - INFO - codeparrot_training - Step 40859: {'lr': 4.038746365604454e-05, 'samples': 20920320, 'steps': 40859, 'batch_loss/train': 0.723256993573159} 12/28/2021 08:15:47 - INFO - codeparrot_training - Step 40860: {'lr': 4.037887350653041e-05, 'samples': 20920832, 'steps': 40860, 'batch_loss/train': 0.6927997210877948} 12/28/2021 08:15:58 - INFO - codeparrot_training - Step 40861: {'lr': 4.0370284190390765e-05, 'samples': 20921344, 'steps': 40861, 'batch_loss/train': 0.6586523680016398} 12/28/2021 08:16:10 - INFO - codeparrot_training - Step 40862: {'lr': 4.036169570765971e-05, 'samples': 20921856, 'steps': 40862, 'batch_loss/train': 0.7635441245511174} 12/28/2021 08:16:21 - INFO - codeparrot_training - Step 
40863: {'lr': 4.035310805837139e-05, 'samples': 20922368, 'steps': 40863, 'batch_loss/train': 0.784608312882483} 12/28/2021 08:16:31 - INFO - codeparrot_training - Step 40864: {'lr': 4.0344521242559937e-05, 'samples': 20922880, 'steps': 40864, 'batch_loss/train': 0.5385210756212473} 12/28/2021 08:16:43 - INFO - codeparrot_training - Step 40865: {'lr': 4.033593526025952e-05, 'samples': 20923392, 'steps': 40865, 'batch_loss/train': 0.6137958148028702} 12/28/2021 08:16:54 - INFO - codeparrot_training - Step 40866: {'lr': 4.0327350111504264e-05, 'samples': 20923904, 'steps': 40866, 'batch_loss/train': 0.7334831384941936} 12/28/2021 08:17:05 - INFO - codeparrot_training - Step 40867: {'lr': 4.031876579632837e-05, 'samples': 20924416, 'steps': 40867, 'batch_loss/train': 0.6553557189181447} 12/28/2021 08:17:19 - INFO - codeparrot_training - Step 40868: {'lr': 4.031018231476574e-05, 'samples': 20924928, 'steps': 40868, 'batch_loss/train': 0.7519469824619591} 12/28/2021 08:17:29 - INFO - codeparrot_training - Step 40869: {'lr': 4.030159966685079e-05, 'samples': 20925440, 'steps': 40869, 'batch_loss/train': 0.4944920669659041} 12/28/2021 08:17:40 - INFO - codeparrot_training - Step 40870: {'lr': 4.029301785261744e-05, 'samples': 20925952, 'steps': 40870, 'batch_loss/train': 0.7247950863093138} 12/28/2021 08:17:52 - INFO - codeparrot_training - Step 40871: {'lr': 4.028443687209979e-05, 'samples': 20926464, 'steps': 40871, 'batch_loss/train': 0.5891144923516549} 12/28/2021 08:18:03 - INFO - codeparrot_training - Step 40872: {'lr': 4.027585672533215e-05, 'samples': 20926976, 'steps': 40872, 'batch_loss/train': 0.6969266110099852} 12/28/2021 08:18:14 - INFO - codeparrot_training - Step 40873: {'lr': 4.026727741234848e-05, 'samples': 20927488, 'steps': 40873, 'batch_loss/train': 0.6443629893474281} 12/28/2021 08:18:24 - INFO - codeparrot_training - Step 40874: {'lr': 4.02586989331829e-05, 'samples': 20928000, 'steps': 40874, 'batch_loss/train': 0.7954091932624578} 12/28/2021 
08:18:37 - INFO - codeparrot_training - Step 40875: {'lr': 4.025012128786956e-05, 'samples': 20928512, 'steps': 40875, 'batch_loss/train': 0.5030603063642047} 12/28/2021 08:18:47 - INFO - codeparrot_training - Step 40876: {'lr': 4.024154447644251e-05, 'samples': 20929024, 'steps': 40876, 'batch_loss/train': 0.7290251604281366} 12/28/2021 08:18:58 - INFO - codeparrot_training - Step 40877: {'lr': 4.02329684989359e-05, 'samples': 20929536, 'steps': 40877, 'batch_loss/train': 0.5803073047718499} 12/28/2021 08:19:10 - INFO - codeparrot_training - Step 40878: {'lr': 4.022439335538386e-05, 'samples': 20930048, 'steps': 40878, 'batch_loss/train': 0.6194904213771224} 12/28/2021 08:19:21 - INFO - codeparrot_training - Step 40879: {'lr': 4.021581904582031e-05, 'samples': 20930560, 'steps': 40879, 'batch_loss/train': 0.8420431513804942} 12/28/2021 08:19:31 - INFO - codeparrot_training - Step 40880: {'lr': 4.0207245570279494e-05, 'samples': 20931072, 'steps': 40880, 'batch_loss/train': 0.6297401059418917} 12/28/2021 08:19:42 - INFO - codeparrot_training - Step 40881: {'lr': 4.0198672928795535e-05, 'samples': 20931584, 'steps': 40881, 'batch_loss/train': 0.49351426685461774} 12/28/2021 08:19:56 - INFO - codeparrot_training - Step 40882: {'lr': 4.0190101121402315e-05, 'samples': 20932096, 'steps': 40882, 'batch_loss/train': 0.6560376089764759} 12/28/2021 08:20:07 - INFO - codeparrot_training - Step 40883: {'lr': 4.0181530148134086e-05, 'samples': 20932608, 'steps': 40883, 'batch_loss/train': 0.45016351505182683} 12/28/2021 08:20:17 - INFO - codeparrot_training - Step 40884: {'lr': 4.01729600090249e-05, 'samples': 20933120, 'steps': 40884, 'batch_loss/train': 0.6178651647642255} 12/28/2021 08:20:29 - INFO - codeparrot_training - Step 40885: {'lr': 4.0164390704108765e-05, 'samples': 20933632, 'steps': 40885, 'batch_loss/train': 0.6337673990055919} 12/28/2021 08:20:40 - INFO - codeparrot_training - Step 40886: {'lr': 4.0155822233419735e-05, 'samples': 20934144, 'steps': 40886, 
'batch_loss/train': 0.6947794547304511} 12/28/2021 08:20:51 - INFO - codeparrot_training - Step 40887: {'lr': 4.014725459699201e-05, 'samples': 20934656, 'steps': 40887, 'batch_loss/train': 0.6439341919030994} 12/28/2021 08:21:05 - INFO - codeparrot_training - Step 40888: {'lr': 4.013868779485952e-05, 'samples': 20935168, 'steps': 40888, 'batch_loss/train': 0.5420798073173501} 12/28/2021 08:21:15 - INFO - codeparrot_training - Step 40889: {'lr': 4.013012182705636e-05, 'samples': 20935680, 'steps': 40889, 'batch_loss/train': 0.6226628129370511} 12/28/2021 08:21:26 - INFO - codeparrot_training - Step 40890: {'lr': 4.012155669361661e-05, 'samples': 20936192, 'steps': 40890, 'batch_loss/train': 0.5665991939604282} 12/28/2021 08:21:37 - INFO - codeparrot_training - Step 40891: {'lr': 4.011299239457428e-05, 'samples': 20936704, 'steps': 40891, 'batch_loss/train': 0.9215042535215616} 12/28/2021 08:21:49 - INFO - codeparrot_training - Step 40892: {'lr': 4.010442892996347e-05, 'samples': 20937216, 'steps': 40892, 'batch_loss/train': 0.7530100792646408} 12/28/2021 08:21:59 - INFO - codeparrot_training - Step 40893: {'lr': 4.009586629981818e-05, 'samples': 20937728, 'steps': 40893, 'batch_loss/train': 0.6466807861579582} 12/28/2021 08:22:10 - INFO - codeparrot_training - Step 40894: {'lr': 4.0087304504172444e-05, 'samples': 20938240, 'steps': 40894, 'batch_loss/train': 0.40634606551611796} 12/28/2021 08:22:22 - INFO - codeparrot_training - Step 40895: {'lr': 4.007874354306043e-05, 'samples': 20938752, 'steps': 40895, 'batch_loss/train': 0.736417468637228} 12/28/2021 08:22:33 - INFO - codeparrot_training - Step 40896: {'lr': 4.007018341651597e-05, 'samples': 20939264, 'steps': 40896, 'batch_loss/train': 0.6071084657451138} 12/28/2021 08:22:44 - INFO - codeparrot_training - Step 40897: {'lr': 4.006162412457318e-05, 'samples': 20939776, 'steps': 40897, 'batch_loss/train': 0.6828027958981693} 12/28/2021 08:22:57 - INFO - codeparrot_training - Step 40898: {'lr': 
4.0053065667266185e-05, 'samples': 20940288, 'steps': 40898, 'batch_loss/train': 0.6857202150858939} 12/28/2021 08:23:08 - INFO - codeparrot_training - Step 40899: {'lr': 4.0044508044628866e-05, 'samples': 20940800, 'steps': 40899, 'batch_loss/train': 0.6799521250650287} 12/28/2021 08:23:19 - INFO - codeparrot_training - Step 40900: {'lr': 4.003595125669532e-05, 'samples': 20941312, 'steps': 40900, 'batch_loss/train': 0.6959968125447631} 12/28/2021 08:23:31 - INFO - codeparrot_training - Step 40901: {'lr': 4.002739530349955e-05, 'samples': 20941824, 'steps': 40901, 'batch_loss/train': 0.6625967902364209} 12/28/2021 08:23:42 - INFO - codeparrot_training - Step 40902: {'lr': 4.0018840185075594e-05, 'samples': 20942336, 'steps': 40902, 'batch_loss/train': 0.7493959255516529} 12/28/2021 08:23:52 - INFO - codeparrot_training - Step 40903: {'lr': 4.0010285901457414e-05, 'samples': 20942848, 'steps': 40903, 'batch_loss/train': 0.7326390558155254} 12/28/2021 08:24:03 - INFO - codeparrot_training - Step 40904: {'lr': 4.000173245267913e-05, 'samples': 20943360, 'steps': 40904, 'batch_loss/train': 0.6672530574724078} 12/28/2021 08:24:17 - INFO - codeparrot_training - Step 40905: {'lr': 3.9993179838774516e-05, 'samples': 20943872, 'steps': 40905, 'batch_loss/train': 0.6121920440928079} 12/28/2021 08:24:27 - INFO - codeparrot_training - Step 40906: {'lr': 3.998462805977782e-05, 'samples': 20944384, 'steps': 40906, 'batch_loss/train': 0.6867705835029483} 12/28/2021 08:24:38 - INFO - codeparrot_training - Step 40907: {'lr': 3.9976077115722985e-05, 'samples': 20944896, 'steps': 40907, 'batch_loss/train': 0.6684385766275227} 12/28/2021 08:24:50 - INFO - codeparrot_training - Step 40908: {'lr': 3.996752700664383e-05, 'samples': 20945408, 'steps': 40908, 'batch_loss/train': 0.676907014567405} 12/28/2021 08:25:01 - INFO - codeparrot_training - Step 40909: {'lr': 3.995897773257456e-05, 'samples': 20945920, 'steps': 40909, 'batch_loss/train': 0.7138728415593505} 12/28/2021 08:25:11 - 
INFO - codeparrot_training - Step 40910: {'lr': 3.995042929354917e-05, 'samples': 20946432, 'steps': 40910, 'batch_loss/train': 0.6427814780618064} 12/28/2021 08:25:23 - INFO - codeparrot_training - Step 40911: {'lr': 3.994188168960147e-05, 'samples': 20946944, 'steps': 40911, 'batch_loss/train': 0.7020695749670267} 12/28/2021 08:25:34 - INFO - codeparrot_training - Step 40912: {'lr': 3.993333492076548e-05, 'samples': 20947456, 'steps': 40912, 'batch_loss/train': 0.6114307290990837} 12/28/2021 08:25:45 - INFO - codeparrot_training - Step 40913: {'lr': 3.992478898707535e-05, 'samples': 20947968, 'steps': 40913, 'batch_loss/train': 0.8100155834108591} 12/28/2021 08:25:57 - INFO - codeparrot_training - Step 40914: {'lr': 3.991624388856488e-05, 'samples': 20948480, 'steps': 40914, 'batch_loss/train': 0.7353471578098834} 12/28/2021 08:26:07 - INFO - codeparrot_training - Step 40915: {'lr': 3.990769962526811e-05, 'samples': 20948992, 'steps': 40915, 'batch_loss/train': 0.6167314944323152} 12/28/2021 08:26:18 - INFO - codeparrot_training - Step 40916: {'lr': 3.989915619721898e-05, 'samples': 20949504, 'steps': 40916, 'batch_loss/train': 0.5814788838033564} 12/28/2021 08:26:29 - INFO - codeparrot_training - Step 40917: {'lr': 3.989061360445148e-05, 'samples': 20950016, 'steps': 40917, 'batch_loss/train': 0.8362226318567991} 12/28/2021 08:26:41 - INFO - codeparrot_training - Step 40918: {'lr': 3.988207184699957e-05, 'samples': 20950528, 'steps': 40918, 'batch_loss/train': 0.6033623195253313} 12/28/2021 08:26:51 - INFO - codeparrot_training - Step 40919: {'lr': 3.987353092489721e-05, 'samples': 20951040, 'steps': 40919, 'batch_loss/train': 0.763713008724153} 12/28/2021 08:27:02 - INFO - codeparrot_training - Step 40920: {'lr': 3.986499083817832e-05, 'samples': 20951552, 'steps': 40920, 'batch_loss/train': 0.7149619038682431} 12/28/2021 08:27:16 - INFO - codeparrot_training - Step 40921: {'lr': 3.985645158687692e-05, 'samples': 20952064, 'steps': 40921, 'batch_loss/train': 
0.746123637072742} 12/28/2021 08:27:27 - INFO - codeparrot_training - Step 40922: {'lr': 3.9847913171026954e-05, 'samples': 20952576, 'steps': 40922, 'batch_loss/train': 0.7337620388716459} 12/28/2021 08:27:37 - INFO - codeparrot_training - Step 40923: {'lr': 3.983937559066225e-05, 'samples': 20953088, 'steps': 40923, 'batch_loss/train': 0.5832779845513869} 12/28/2021 08:27:49 - INFO - codeparrot_training - Step 40924: {'lr': 3.983083884581692e-05, 'samples': 20953600, 'steps': 40924, 'batch_loss/train': 0.6045855877455324} 12/28/2021 08:28:00 - INFO - codeparrot_training - Step 40925: {'lr': 3.982230293652478e-05, 'samples': 20954112, 'steps': 40925, 'batch_loss/train': 0.7231644135899842} 12/28/2021 08:28:10 - INFO - codeparrot_training - Step 40926: {'lr': 3.981376786281979e-05, 'samples': 20954624, 'steps': 40926, 'batch_loss/train': 0.6827997509390116} 12/28/2021 08:28:21 - INFO - codeparrot_training - Step 40927: {'lr': 3.98052336247359e-05, 'samples': 20955136, 'steps': 40927, 'batch_loss/train': 0.6375197973102331} 12/28/2021 08:28:35 - INFO - codeparrot_training - Step 40928: {'lr': 3.9796700222307054e-05, 'samples': 20955648, 'steps': 40928, 'batch_loss/train': 0.6622905712283682} 12/28/2021 08:28:46 - INFO - codeparrot_training - Step 40929: {'lr': 3.9788167655567115e-05, 'samples': 20956160, 'steps': 40929, 'batch_loss/train': 0.7080025486648083} 12/28/2021 08:28:56 - INFO - codeparrot_training - Step 40930: {'lr': 3.9779635924550086e-05, 'samples': 20956672, 'steps': 40930, 'batch_loss/train': 0.6548257363028824} 12/28/2021 08:29:09 - INFO - codeparrot_training - Step 40931: {'lr': 3.977110502928982e-05, 'samples': 20957184, 'steps': 40931, 'batch_loss/train': 0.7895181756466627} 12/28/2021 08:29:19 - INFO - codeparrot_training - Step 40932: {'lr': 3.97625749698203e-05, 'samples': 20957696, 'steps': 40932, 'batch_loss/train': 0.7153754732571542} 12/28/2021 08:29:30 - INFO - codeparrot_training - Step 40933: {'lr': 3.9754045746175415e-05, 'samples': 
20958208, 'steps': 40933, 'batch_loss/train': 0.6724345730617642} 12/28/2021 08:29:42 - INFO - codeparrot_training - Step 40934: {'lr': 3.9745517358388965e-05, 'samples': 20958720, 'steps': 40934, 'batch_loss/train': 0.6520634123007767} 12/28/2021 08:29:53 - INFO - codeparrot_training - Step 40935: {'lr': 3.973698980649501e-05, 'samples': 20959232, 'steps': 40935, 'batch_loss/train': 0.758246289100498} 12/28/2021 08:30:03 - INFO - codeparrot_training - Step 40936: {'lr': 3.972846309052744e-05, 'samples': 20959744, 'steps': 40936, 'batch_loss/train': 0.6070809378288686} 12/28/2021 08:30:14 - INFO - codeparrot_training - Step 40937: {'lr': 3.971993721052006e-05, 'samples': 20960256, 'steps': 40937, 'batch_loss/train': 0.7967420276254416} 12/28/2021 08:30:28 - INFO - codeparrot_training - Step 40938: {'lr': 3.971141216650673e-05, 'samples': 20960768, 'steps': 40938, 'batch_loss/train': 0.780502162873745} 12/28/2021 08:30:38 - INFO - codeparrot_training - Step 40939: {'lr': 3.970288795852156e-05, 'samples': 20961280, 'steps': 40939, 'batch_loss/train': 0.6946358415298164} 12/28/2021 08:30:49 - INFO - codeparrot_training - Step 40940: {'lr': 3.9694364586598226e-05, 'samples': 20961792, 'steps': 40940, 'batch_loss/train': 0.7588870311155915} 12/28/2021 08:31:01 - INFO - codeparrot_training - Step 40941: {'lr': 3.968584205077072e-05, 'samples': 20962304, 'steps': 40941, 'batch_loss/train': 0.731440001167357} 12/28/2021 08:31:12 - INFO - codeparrot_training - Step 40942: {'lr': 3.967732035107285e-05, 'samples': 20962816, 'steps': 40942, 'batch_loss/train': 0.7171827387064695} 12/28/2021 08:31:22 - INFO - codeparrot_training - Step 40943: {'lr': 3.966879948753857e-05, 'samples': 20963328, 'steps': 40943, 'batch_loss/train': 0.620220291428268} 12/28/2021 08:31:36 - INFO - codeparrot_training - Step 40944: {'lr': 3.9660279460201734e-05, 'samples': 20963840, 'steps': 40944, 'batch_loss/train': 0.6758493844245095} 12/28/2021 08:31:47 - INFO - codeparrot_training - Step 40945: 
{'lr': 3.96517602690962e-05, 'samples': 20964352, 'steps': 40945, 'batch_loss/train': 0.5959604777744971} 12/28/2021 08:31:57 - INFO - codeparrot_training - Step 40946: {'lr': 3.964324191425583e-05, 'samples': 20964864, 'steps': 40946, 'batch_loss/train': 0.7218564269132912} 12/28/2021 08:32:09 - INFO - codeparrot_training - Step 40947: {'lr': 3.96347243957145e-05, 'samples': 20965376, 'steps': 40947, 'batch_loss/train': 0.6980614820495248} 12/28/2021 08:32:20 - INFO - codeparrot_training - Step 40948: {'lr': 3.962620771350614e-05, 'samples': 20965888, 'steps': 40948, 'batch_loss/train': 0.715427081566304} 12/28/2021 08:32:31 - INFO - codeparrot_training - Step 40949: {'lr': 3.961769186766442e-05, 'samples': 20966400, 'steps': 40949, 'batch_loss/train': 0.7260456448420882} 12/28/2021 08:32:41 - INFO - codeparrot_training - Step 40950: {'lr': 3.960917685822346e-05, 'samples': 20966912, 'steps': 40950, 'batch_loss/train': 0.8835179635789245} 12/28/2021 08:32:54 - INFO - codeparrot_training - Step 40951: {'lr': 3.9600662685216896e-05, 'samples': 20967424, 'steps': 40951, 'batch_loss/train': 0.7027821369701996} 12/28/2021 08:33:04 - INFO - codeparrot_training - Step 40952: {'lr': 3.9592149348678664e-05, 'samples': 20967936, 'steps': 40952, 'batch_loss/train': 0.6872052401304245} 12/28/2021 08:33:15 - INFO - codeparrot_training - Step 40953: {'lr': 3.958363684864258e-05, 'samples': 20968448, 'steps': 40953, 'batch_loss/train': 0.6647005905397236} 12/28/2021 08:33:29 - INFO - codeparrot_training - Step 40954: {'lr': 3.957512518514253e-05, 'samples': 20968960, 'steps': 40954, 'batch_loss/train': 0.9686220148578286} 12/28/2021 08:33:39 - INFO - codeparrot_training - Step 40955: {'lr': 3.956661435821229e-05, 'samples': 20969472, 'steps': 40955, 'batch_loss/train': 0.6960217608720995} 12/28/2021 08:33:50 - INFO - codeparrot_training - Step 40956: {'lr': 3.955810436788576e-05, 'samples': 20969984, 'steps': 40956, 'batch_loss/train': 0.6703286992851645} 12/28/2021 08:34:02 - 
INFO - codeparrot_training - Step 40957: {'lr': 3.954959521419677e-05, 'samples': 20970496, 'steps': 40957, 'batch_loss/train': 0.7374784145504236} 12/28/2021 08:34:13 - INFO - codeparrot_training - Step 40958: {'lr': 3.954108689717908e-05, 'samples': 20971008, 'steps': 40958, 'batch_loss/train': 0.6826874143444002} 12/28/2021 08:34:24 - INFO - codeparrot_training - Step 40959: {'lr': 3.9532579416866666e-05, 'samples': 20971520, 'steps': 40959, 'batch_loss/train': 0.6913118017837405} 12/28/2021 08:34:36 - INFO - codeparrot_training - Step 40960: {'lr': 3.952407277329309e-05, 'samples': 20972032, 'steps': 40960, 'batch_loss/train': 0.7193396403454244} 12/28/2021 08:34:46 - INFO - codeparrot_training - Step 40961: {'lr': 3.9515566966492415e-05, 'samples': 20972544, 'steps': 40961, 'batch_loss/train': 0.6763024493120611} 12/28/2021 08:34:57 - INFO - codeparrot_training - Step 40962: {'lr': 3.950706199649842e-05, 'samples': 20973056, 'steps': 40962, 'batch_loss/train': 0.7865863875485957} 12/28/2021 08:35:08 - INFO - codeparrot_training - Step 40963: {'lr': 3.94985578633448e-05, 'samples': 20973568, 'steps': 40963, 'batch_loss/train': 0.7268219129182398} 12/28/2021 08:35:20 - INFO - codeparrot_training - Step 40964: {'lr': 3.9490054567065394e-05, 'samples': 20974080, 'steps': 40964, 'batch_loss/train': 0.7221620129421353} 12/28/2021 08:35:31 - INFO - codeparrot_training - Step 40965: {'lr': 3.9481552107694134e-05, 'samples': 20974592, 'steps': 40965, 'batch_loss/train': 0.7883222829550505} 12/28/2021 08:35:41 - INFO - codeparrot_training - Step 40966: {'lr': 3.94730504852647e-05, 'samples': 20975104, 'steps': 40966, 'batch_loss/train': 0.7662522667087615} 12/28/2021 08:35:55 - INFO - codeparrot_training - Step 40967: {'lr': 3.9464549699810864e-05, 'samples': 20975616, 'steps': 40967, 'batch_loss/train': 0.7708728867582977} 12/28/2021 08:36:06 - INFO - codeparrot_training - Step 40968: {'lr': 3.9456049751366604e-05, 'samples': 20976128, 'steps': 40968, 
'batch_loss/train': 0.6755052814260125} 12/28/2021 08:36:16 - INFO - codeparrot_training - Step 40969: {'lr': 3.9447550639965526e-05, 'samples': 20976640, 'steps': 40969, 'batch_loss/train': 0.713741097599268} 12/28/2021 08:36:28 - INFO - codeparrot_training - Step 40970: {'lr': 3.9439052365641465e-05, 'samples': 20977152, 'steps': 40970, 'batch_loss/train': 0.7319603650830686} 12/28/2021 08:36:39 - INFO - codeparrot_training - Step 40971: {'lr': 3.9430554928428255e-05, 'samples': 20977664, 'steps': 40971, 'batch_loss/train': 0.5827410406200215} 12/28/2021 08:36:50 - INFO - codeparrot_training - Step 40972: {'lr': 3.942205832835963e-05, 'samples': 20978176, 'steps': 40972, 'batch_loss/train': 0.7315609101206064} 12/28/2021 08:37:00 - INFO - codeparrot_training - Step 40973: {'lr': 3.9413562565469405e-05, 'samples': 20978688, 'steps': 40973, 'batch_loss/train': 0.6737823858857155} 12/28/2021 08:37:14 - INFO - codeparrot_training - Step 40974: {'lr': 3.9405067639791405e-05, 'samples': 20979200, 'steps': 40974, 'batch_loss/train': 0.8535825340077281} 12/28/2021 08:37:25 - INFO - codeparrot_training - Step 40975: {'lr': 3.93965735513592e-05, 'samples': 20979712, 'steps': 40975, 'batch_loss/train': 0.6766426134854555} 12/28/2021 08:37:36 - INFO - codeparrot_training - Step 40976: {'lr': 3.938808030020677e-05, 'samples': 20980224, 'steps': 40976, 'batch_loss/train': 0.672106456477195} 12/28/2021 08:37:48 - INFO - codeparrot_training - Step 40977: {'lr': 3.9379587886367866e-05, 'samples': 20980736, 'steps': 40977, 'batch_loss/train': 0.5782705805613659} 12/28/2021 08:37:58 - INFO - codeparrot_training - Step 40978: {'lr': 3.9371096309876117e-05, 'samples': 20981248, 'steps': 40978, 'batch_loss/train': 0.707532165106386} 12/28/2021 08:38:09 - INFO - codeparrot_training - Step 40979: {'lr': 3.9362605570765374e-05, 'samples': 20981760, 'steps': 40979, 'batch_loss/train': 0.5782677548704669} 12/28/2021 08:38:21 - INFO - codeparrot_training - Step 40980: {'lr': 
3.9354115669069405e-05, 'samples': 20982272, 'steps': 40980, 'batch_loss/train': 0.6688160467892885} 12/28/2021 08:38:32 - INFO - codeparrot_training - Step 40981: {'lr': 3.9345626604821914e-05, 'samples': 20982784, 'steps': 40981, 'batch_loss/train': 0.6850603406783193} 12/28/2021 08:38:43 - INFO - codeparrot_training - Step 40982: {'lr': 3.9337138378056666e-05, 'samples': 20983296, 'steps': 40982, 'batch_loss/train': 0.623952922411263} 12/28/2021 08:38:57 - INFO - codeparrot_training - Step 40983: {'lr': 3.932865098880739e-05, 'samples': 20983808, 'steps': 40983, 'batch_loss/train': 0.7035118527710438} 12/28/2021 08:39:07 - INFO - codeparrot_training - Step 40984: {'lr': 3.9320164437107895e-05, 'samples': 20984320, 'steps': 40984, 'batch_loss/train': 0.6277270601131022} 12/28/2021 08:39:18 - INFO - codeparrot_training - Step 40985: {'lr': 3.931167872299191e-05, 'samples': 20984832, 'steps': 40985, 'batch_loss/train': 0.5782120067160577} 12/28/2021 08:39:29 - INFO - codeparrot_training - Step 40986: {'lr': 3.930319384649303e-05, 'samples': 20985344, 'steps': 40986, 'batch_loss/train': 0.6824689696077257} 12/28/2021 08:39:41 - INFO - codeparrot_training - Step 40987: {'lr': 3.929470980764513e-05, 'samples': 20985856, 'steps': 40987, 'batch_loss/train': 0.59990727272816} 12/28/2021 08:39:51 - INFO - codeparrot_training - Step 40988: {'lr': 3.928622660648198e-05, 'samples': 20986368, 'steps': 40988, 'batch_loss/train': 0.5975989430153277} 12/28/2021 08:40:02 - INFO - codeparrot_training - Step 40989: {'lr': 3.927774424303718e-05, 'samples': 20986880, 'steps': 40989, 'batch_loss/train': 0.670177798718214} 12/28/2021 08:40:16 - INFO - codeparrot_training - Step 40990: {'lr': 3.9269262717344434e-05, 'samples': 20987392, 'steps': 40990, 'batch_loss/train': 0.7229069033637643} 12/28/2021 08:40:27 - INFO - codeparrot_training - Step 40991: {'lr': 3.926078202943767e-05, 'samples': 20987904, 'steps': 40991, 'batch_loss/train': 0.6791283199563622} 12/28/2021 08:40:37 - INFO - 
codeparrot_training - Step 40992: {'lr': 3.9252302179350386e-05, 'samples': 20988416, 'steps': 40992, 'batch_loss/train': 0.6826344570145011} 12/28/2021 08:40:49 - INFO - codeparrot_training - Step 40993: {'lr': 3.92438231671163e-05, 'samples': 20988928, 'steps': 40993, 'batch_loss/train': 0.5391172296658624} 12/28/2021 08:41:00 - INFO - codeparrot_training - Step 40994: {'lr': 3.9235344992769344e-05, 'samples': 20989440, 'steps': 40994, 'batch_loss/train': 0.6604084530845284} 12/28/2021 08:41:11 - INFO - codeparrot_training - Step 40995: {'lr': 3.922686765634301e-05, 'samples': 20989952, 'steps': 40995, 'batch_loss/train': 0.6641875705681741} 12/28/2021 08:41:21 - INFO - codeparrot_training - Step 40996: {'lr': 3.9218391157871043e-05, 'samples': 20990464, 'steps': 40996, 'batch_loss/train': 0.8124537747353315} 12/28/2021 08:41:33 - INFO - codeparrot_training - Step 40997: {'lr': 3.920991549738717e-05, 'samples': 20990976, 'steps': 40997, 'batch_loss/train': 0.7049988496000879} 12/28/2021 08:41:44 - INFO - codeparrot_training - Step 40998: {'lr': 3.920144067492509e-05, 'samples': 20991488, 'steps': 40998, 'batch_loss/train': 0.47644823175505735} 12/28/2021 08:41:55 - INFO - codeparrot_training - Step 40999: {'lr': 3.91929666905185e-05, 'samples': 20992000, 'steps': 40999, 'batch_loss/train': 0.7375806807540357} 12/28/2021 08:42:07 - INFO - codeparrot_training - Step 41000: {'lr': 3.9184493544201116e-05, 'samples': 20992512, 'steps': 41000, 'batch_loss/train': 0.6156824808567762} 12/28/2021 08:42:17 - INFO - codeparrot_training - Step 41001: {'lr': 3.917602123600647e-05, 'samples': 20993024, 'steps': 41001, 'batch_loss/train': 0.740080120973289} 12/28/2021 08:42:28 - INFO - codeparrot_training - Step 41002: {'lr': 3.9167549765968426e-05, 'samples': 20993536, 'steps': 41002, 'batch_loss/train': 0.5476583614072297} 12/28/2021 08:42:42 - INFO - codeparrot_training - Step 41003: {'lr': 3.915907913412067e-05, 'samples': 20994048, 'steps': 41003, 'batch_loss/train': 
0.7624564757570624} 12/28/2021 08:42:53 - INFO - codeparrot_training - Step 41004: {'lr': 3.915060934049666e-05, 'samples': 20994560, 'steps': 41004, 'batch_loss/train': 0.602488388074562} 12/28/2021 08:43:03 - INFO - codeparrot_training - Step 41005: {'lr': 3.914214038513034e-05, 'samples': 20995072, 'steps': 41005, 'batch_loss/train': 0.7623747661709785} 12/28/2021 08:43:15 - INFO - codeparrot_training - Step 41006: {'lr': 3.9133672268055195e-05, 'samples': 20995584, 'steps': 41006, 'batch_loss/train': 0.6086212736554444} 12/28/2021 08:43:26 - INFO - codeparrot_training - Step 41007: {'lr': 3.912520498930494e-05, 'samples': 20996096, 'steps': 41007, 'batch_loss/train': 0.6172106396406889} 12/28/2021 08:43:37 - INFO - codeparrot_training - Step 41008: {'lr': 3.911673854891326e-05, 'samples': 20996608, 'steps': 41008, 'batch_loss/train': 0.7264805305749178} 12/28/2021 08:43:47 - INFO - codeparrot_training - Step 41009: {'lr': 3.9108272946913795e-05, 'samples': 20997120, 'steps': 41009, 'batch_loss/train': 0.6538771805353463} 12/28/2021 08:44:00 - INFO - codeparrot_training - Step 41010: {'lr': 3.909980818334022e-05, 'samples': 20997632, 'steps': 41010, 'batch_loss/train': 0.6472063814289868} 12/28/2021 08:44:10 - INFO - codeparrot_training - Step 41011: {'lr': 3.909134425822616e-05, 'samples': 20998144, 'steps': 41011, 'batch_loss/train': 0.6916025201790035} 12/28/2021 08:44:21 - INFO - codeparrot_training - Step 41012: {'lr': 3.90828811716053e-05, 'samples': 20998656, 'steps': 41012, 'batch_loss/train': 0.5915128356427886} 12/28/2021 08:44:35 - INFO - codeparrot_training - Step 41013: {'lr': 3.907441892351124e-05, 'samples': 20999168, 'steps': 41013, 'batch_loss/train': 0.6229546125978231} 12/28/2021 08:44:45 - INFO - codeparrot_training - Step 41014: {'lr': 3.9065957513977706e-05, 'samples': 20999680, 'steps': 41014, 'batch_loss/train': 0.705613651778549} 12/28/2021 08:44:56 - INFO - codeparrot_training - Step 41015: {'lr': 3.905749694303823e-05, 'samples': 
21000192, 'steps': 41015, 'batch_loss/train': 0.6722292292397469} 12/28/2021 08:45:08 - INFO - codeparrot_training - Step 41016: {'lr': 3.904903721072645e-05, 'samples': 21000704, 'steps': 41016, 'batch_loss/train': 0.7025203285738826} 12/28/2021 08:45:19 - INFO - codeparrot_training - Step 41017: {'lr': 3.904057831707614e-05, 'samples': 21001216, 'steps': 41017, 'batch_loss/train': 1.449188731610775} 12/28/2021 08:45:29 - INFO - codeparrot_training - Step 41018: {'lr': 3.90321202621208e-05, 'samples': 21001728, 'steps': 41018, 'batch_loss/train': 0.6801795838400722} 12/28/2021 08:45:40 - INFO - codeparrot_training - Step 41019: {'lr': 3.902366304589402e-05, 'samples': 21002240, 'steps': 41019, 'batch_loss/train': 0.6897533335722983} 12/28/2021 08:45:52 - INFO - codeparrot_training - Step 41020: {'lr': 3.9015206668429615e-05, 'samples': 21002752, 'steps': 41020, 'batch_loss/train': 0.6340732043026946} 12/28/2021 08:46:03 - INFO - codeparrot_training - Step 41021: {'lr': 3.9006751129761016e-05, 'samples': 21003264, 'steps': 41021, 'batch_loss/train': 0.7050162996165454} 12/28/2021 08:46:13 - INFO - codeparrot_training - Step 41022: {'lr': 3.899829642992192e-05, 'samples': 21003776, 'steps': 41022, 'batch_loss/train': 0.680200690869242} 12/28/2021 08:46:27 - INFO - codeparrot_training - Step 41023: {'lr': 3.898984256894592e-05, 'samples': 21004288, 'steps': 41023, 'batch_loss/train': 0.6289814094780013} 12/28/2021 08:46:38 - INFO - codeparrot_training - Step 41024: {'lr': 3.898138954686661e-05, 'samples': 21004800, 'steps': 41024, 'batch_loss/train': 0.853571762010688} 12/28/2021 08:46:49 - INFO - codeparrot_training - Step 41025: {'lr': 3.897293736371765e-05, 'samples': 21005312, 'steps': 41025, 'batch_loss/train': 0.6320315356715582} 12/28/2021 08:47:01 - INFO - codeparrot_training - Step 41026: {'lr': 3.896448601953267e-05, 'samples': 21005824, 'steps': 41026, 'batch_loss/train': 0.7679512137547135} 12/28/2021 08:47:12 - INFO - codeparrot_training - Step 41027: 
{'lr': 3.895603551434507e-05, 'samples': 21006336, 'steps': 41027, 'batch_loss/train': 0.4743196351919323} 12/28/2021 08:47:23 - INFO - codeparrot_training - Step 41028: {'lr': 3.8947585848188665e-05, 'samples': 21006848, 'steps': 41028, 'batch_loss/train': 0.7050107689574361} 12/28/2021 08:47:37 - INFO - codeparrot_training - Step 41029: {'lr': 3.8939137021097e-05, 'samples': 21007360, 'steps': 41029, 'batch_loss/train': 0.7028492148965597} 12/28/2021 08:47:47 - INFO - codeparrot_training - Step 41030: {'lr': 3.893068903310354e-05, 'samples': 21007872, 'steps': 41030, 'batch_loss/train': 0.7067852541804314} 12/28/2021 08:47:58 - INFO - codeparrot_training - Step 41031: {'lr': 3.8922241884242065e-05, 'samples': 21008384, 'steps': 41031, 'batch_loss/train': 0.6540909046307206} 12/28/2021 08:48:09 - INFO - codeparrot_training - Step 41032: {'lr': 3.8913795574546026e-05, 'samples': 21008896, 'steps': 41032, 'batch_loss/train': 0.7217647875659168} 12/28/2021 08:48:21 - INFO - codeparrot_training - Step 41033: {'lr': 3.8905350104049e-05, 'samples': 21009408, 'steps': 41033, 'batch_loss/train': 0.649707242147997} 12/28/2021 08:48:32 - INFO - codeparrot_training - Step 41034: {'lr': 3.8896905472784596e-05, 'samples': 21009920, 'steps': 41034, 'batch_loss/train': 0.678785455354955} 12/28/2021 08:48:42 - INFO - codeparrot_training - Step 41035: {'lr': 3.888846168078641e-05, 'samples': 21010432, 'steps': 41035, 'batch_loss/train': 1.0008740330813453} 12/28/2021 08:48:54 - INFO - codeparrot_training - Step 41036: {'lr': 3.888001872808797e-05, 'samples': 21010944, 'steps': 41036, 'batch_loss/train': 0.7714537847787142} 12/28/2021 08:49:05 - INFO - codeparrot_training - Step 41037: {'lr': 3.8871576614722876e-05, 'samples': 21011456, 'steps': 41037, 'batch_loss/train': 0.6367290369234979} 12/28/2021 08:49:16 - INFO - codeparrot_training - Step 41038: {'lr': 3.886313534072467e-05, 'samples': 21011968, 'steps': 41038, 'batch_loss/train': 0.5813463260419667} 12/28/2021 08:49:28 - 
INFO - codeparrot_training - Step 41039: {'lr': 3.8854694906126895e-05, 'samples': 21012480, 'steps': 41039, 'batch_loss/train': 0.7425906481221318} 12/28/2021 08:49:39 - INFO - codeparrot_training - Step 41040: {'lr': 3.8846255310963194e-05, 'samples': 21012992, 'steps': 41040, 'batch_loss/train': 0.6866338268155232} 12/28/2021 08:49:49 - INFO - codeparrot_training - Step 41041: {'lr': 3.8837816555266966e-05, 'samples': 21013504, 'steps': 41041, 'batch_loss/train': 0.680233810126083} 12/28/2021 08:50:00 - INFO - codeparrot_training - Step 41042: {'lr': 3.8829378639071876e-05, 'samples': 21014016, 'steps': 41042, 'batch_loss/train': 0.7226305669173598} 12/28/2021 08:50:14 - INFO - codeparrot_training - Step 41043: {'lr': 3.8820941562411505e-05, 'samples': 21014528, 'steps': 41043, 'batch_loss/train': 0.6473705652169883} 12/28/2021 08:50:25 - INFO - codeparrot_training - Step 41044: {'lr': 3.881250532531927e-05, 'samples': 21015040, 'steps': 41044, 'batch_loss/train': 0.7301937467418611} 12/28/2021 08:50:35 - INFO - codeparrot_training - Step 41045: {'lr': 3.880406992782873e-05, 'samples': 21015552, 'steps': 41045, 'batch_loss/train': 0.7040322795510292} 12/28/2021 08:50:47 - INFO - codeparrot_training - Step 41046: {'lr': 3.879563536997358e-05, 'samples': 21016064, 'steps': 41046, 'batch_loss/train': 0.5120556718320586} 12/28/2021 08:50:58 - INFO - codeparrot_training - Step 41047: {'lr': 3.8787201651787174e-05, 'samples': 21016576, 'steps': 41047, 'batch_loss/train': 0.6989949585986324} 12/28/2021 08:51:09 - INFO - codeparrot_training - Step 41048: {'lr': 3.877876877330308e-05, 'samples': 21017088, 'steps': 41048, 'batch_loss/train': 0.6161583552602679} 12/28/2021 08:51:21 - INFO - codeparrot_training - Step 41049: {'lr': 3.8770336734554875e-05, 'samples': 21017600, 'steps': 41049, 'batch_loss/train': 0.6277750316075981} 12/28/2021 08:51:31 - INFO - codeparrot_training - Step 41050: {'lr': 3.8761905535576045e-05, 'samples': 21018112, 'steps': 41050, 
'batch_loss/train': 0.7274577813223004} 12/28/2021 08:51:42 - INFO - codeparrot_training - Step 41051: {'lr': 3.87534751764001e-05, 'samples': 21018624, 'steps': 41051, 'batch_loss/train': 0.8209987364243716} 12/28/2021 08:51:53 - INFO - codeparrot_training - Step 41052: {'lr': 3.874504565706066e-05, 'samples': 21019136, 'steps': 41052, 'batch_loss/train': 0.6843736106529832} 12/28/2021 08:52:06 - INFO - codeparrot_training - Step 41053: {'lr': 3.8736616977591036e-05, 'samples': 21019648, 'steps': 41053, 'batch_loss/train': 0.706131134647876} 12/28/2021 08:52:16 - INFO - codeparrot_training - Step 41054: {'lr': 3.8728189138024896e-05, 'samples': 21020160, 'steps': 41054, 'batch_loss/train': 0.7685583438724279} 12/28/2021 08:52:27 - INFO - codeparrot_training - Step 41055: {'lr': 3.871976213839576e-05, 'samples': 21020672, 'steps': 41055, 'batch_loss/train': 0.7273634066805243} 12/28/2021 08:52:39 - INFO - codeparrot_training - Step 41056: {'lr': 3.871133597873697e-05, 'samples': 21021184, 'steps': 41056, 'batch_loss/train': 0.703189882915467} 12/28/2021 08:52:50 - INFO - codeparrot_training - Step 41057: {'lr': 3.870291065908218e-05, 'samples': 21021696, 'steps': 41057, 'batch_loss/train': 0.7672564806416631} 12/28/2021 08:53:01 - INFO - codeparrot_training - Step 41058: {'lr': 3.869448617946489e-05, 'samples': 21022208, 'steps': 41058, 'batch_loss/train': 0.696374392719008} 12/28/2021 08:53:15 - INFO - codeparrot_training - Step 41059: {'lr': 3.8686062539918463e-05, 'samples': 21022720, 'steps': 41059, 'batch_loss/train': 0.4668176129925996} 12/28/2021 08:53:25 - INFO - codeparrot_training - Step 41060: {'lr': 3.86776397404765e-05, 'samples': 21023232, 'steps': 41060, 'batch_loss/train': 0.7293521272949874} 12/28/2021 08:53:36 - INFO - codeparrot_training - Step 41061: {'lr': 3.8669217781172414e-05, 'samples': 21023744, 'steps': 41061, 'batch_loss/train': 0.750057527795434} 12/28/2021 08:53:46 - INFO - codeparrot_training - Step 41062: {'lr': 
3.8660796662039756e-05, 'samples': 21024256, 'steps': 41062, 'batch_loss/train': 0.6050268891267478} 12/28/2021 08:53:58 - INFO - codeparrot_training - Step 41063: {'lr': 3.865237638311195e-05, 'samples': 21024768, 'steps': 41063, 'batch_loss/train': 0.7253961493261158} 12/28/2021 08:54:09 - INFO - codeparrot_training - Step 41064: {'lr': 3.864395694442249e-05, 'samples': 21025280, 'steps': 41064, 'batch_loss/train': 0.7020030862186104} 12/28/2021 08:54:20 - INFO - codeparrot_training - Step 41065: {'lr': 3.863553834600486e-05, 'samples': 21025792, 'steps': 41065, 'batch_loss/train': 0.6573014911264181} 12/28/2021 08:54:32 - INFO - codeparrot_training - Step 41066: {'lr': 3.862712058789253e-05, 'samples': 21026304, 'steps': 41066, 'batch_loss/train': 0.744113015010953} 12/28/2021 08:54:42 - INFO - codeparrot_training - Step 41067: {'lr': 3.861870367011894e-05, 'samples': 21026816, 'steps': 41067, 'batch_loss/train': 0.7544735479168594} 12/28/2021 08:54:53 - INFO - codeparrot_training - Step 41068: {'lr': 3.861028759271759e-05, 'samples': 21027328, 'steps': 41068, 'batch_loss/train': 0.7412649355828762} 12/28/2021 08:55:07 - INFO - codeparrot_training - Step 41069: {'lr': 3.860187235572196e-05, 'samples': 21027840, 'steps': 41069, 'batch_loss/train': 0.6725435764528811} 12/28/2021 08:55:18 - INFO - codeparrot_training - Step 41070: {'lr': 3.859345795916541e-05, 'samples': 21028352, 'steps': 41070, 'batch_loss/train': 1.2438027255702764} 12/28/2021 08:55:28 - INFO - codeparrot_training - Step 41071: {'lr': 3.858504440308139e-05, 'samples': 21028864, 'steps': 41071, 'batch_loss/train': 0.6785328704863787} 12/28/2021 08:55:39 - INFO - codeparrot_training - Step 41072: {'lr': 3.8576631687503525e-05, 'samples': 21029376, 'steps': 41072, 'batch_loss/train': 0.6027086321264505} 12/28/2021 08:55:51 - INFO - codeparrot_training - Step 41073: {'lr': 3.8568219812465085e-05, 'samples': 21029888, 'steps': 41073, 'batch_loss/train': 0.7301613597082905} 12/28/2021 08:56:02 - INFO 
- codeparrot_training - Step 41074: {'lr': 3.855980877799956e-05, 'samples': 21030400, 'steps': 41074, 'batch_loss/train': 0.716975349932909} 12/28/2021 08:56:12 - INFO - codeparrot_training - Step 41075: {'lr': 3.8551398584140415e-05, 'samples': 21030912, 'steps': 41075, 'batch_loss/train': 0.5715714764955919} 12/28/2021 08:56:25 - INFO - codeparrot_training - Step 41076: {'lr': 3.854298923092106e-05, 'samples': 21031424, 'steps': 41076, 'batch_loss/train': 0.8706163046881557} 12/28/2021 08:56:35 - INFO - codeparrot_training - Step 41077: {'lr': 3.853458071837493e-05, 'samples': 21031936, 'steps': 41077, 'batch_loss/train': 0.6095366193330847} 12/28/2021 08:56:46 - INFO - codeparrot_training - Step 41078: {'lr': 3.8526173046535465e-05, 'samples': 21032448, 'steps': 41078, 'batch_loss/train': 0.6574749066494405} 12/28/2021 08:56:58 - INFO - codeparrot_training - Step 41079: {'lr': 3.8517766215436094e-05, 'samples': 21032960, 'steps': 41079, 'batch_loss/train': 0.7476839395239949} 12/28/2021 08:57:09 - INFO - codeparrot_training - Step 41080: {'lr': 3.850936022511023e-05, 'samples': 21033472, 'steps': 41080, 'batch_loss/train': 0.7712726895697415} 12/28/2021 08:57:20 - INFO - codeparrot_training - Step 41081: {'lr': 3.850095507559134e-05, 'samples': 21033984, 'steps': 41081, 'batch_loss/train': 0.680847468553111} 12/28/2021 08:57:30 - INFO - codeparrot_training - Step 41082: {'lr': 3.8492550766912685e-05, 'samples': 21034496, 'steps': 41082, 'batch_loss/train': 0.674831434211228} 12/28/2021 08:57:44 - INFO - codeparrot_training - Step 41083: {'lr': 3.848414729910785e-05, 'samples': 21035008, 'steps': 41083, 'batch_loss/train': 0.5210017965873703} 12/28/2021 08:57:55 - INFO - codeparrot_training - Step 41084: {'lr': 3.847574467221024e-05, 'samples': 21035520, 'steps': 41084, 'batch_loss/train': 0.7455147528089583} 12/28/2021 08:58:05 - INFO - codeparrot_training - Step 41085: {'lr': 3.846734288625311e-05, 'samples': 21036032, 'steps': 41085, 'batch_loss/train': 
0.720810635946691}
12/28/2021 08:58:18 - INFO - codeparrot_training - Step 41086: {'lr': 3.845894194127e-05, 'samples': 21036544, 'steps': 41086, 'batch_loss/train': 0.7308098529465497}
12/28/2021 08:58:28 - INFO - codeparrot_training - Step 41087: {'lr': 3.8450541837294236e-05, 'samples': 21037056, 'steps': 41087, 'batch_loss/train': 0.7425839272909798}
12/28/2021 08:58:39 - INFO - codeparrot_training - Step 41088: {'lr': 3.844214257435924e-05, 'samples': 21037568, 'steps': 41088, 'batch_loss/train': 0.763797145918943}
12/28/2021 08:58:53 - INFO - codeparrot_training - Step 41089: {'lr': 3.843374415249842e-05, 'samples': 21038080, 'steps': 41089, 'batch_loss/train': 0.6687535420060158}
12/28/2021 08:59:03 - INFO - codeparrot_training - Step 41090: {'lr': 3.8425346571745124e-05, 'samples': 21038592, 'steps': 41090, 'batch_loss/train': 0.6982157869497314}
12/28/2021 08:59:14 - INFO - codeparrot_training - Step 41091: {'lr': 3.8416949832132786e-05, 'samples': 21039104, 'steps': 41091, 'batch_loss/train': 0.7261992925778031}
12/28/2021 08:59:26 - INFO - codeparrot_training - Step 41092: {'lr': 3.8408553933694746e-05, 'samples': 21039616, 'steps': 41092, 'batch_loss/train': 0.6239045895636082}
12/28/2021 08:59:37 - INFO - codeparrot_training - Step 41093: {'lr': 3.840015887646442e-05, 'samples': 21040128, 'steps': 41093, 'batch_loss/train': 0.5993877062574029}
12/28/2021 08:59:47 - INFO - codeparrot_training - Step 41094: {'lr': 3.839176466047514e-05, 'samples': 21040640, 'steps': 41094, 'batch_loss/train': 0.7213416248559952}
12/28/2021 08:59:58 - INFO - codeparrot_training - Step 41095: {'lr': 3.8383371285760385e-05, 'samples': 21041152, 'steps': 41095, 'batch_loss/train': 0.7049128506332636}
12/28/2021 09:00:10 - INFO - codeparrot_training - Step 41096: {'lr': 3.837497875235338e-05, 'samples': 21041664, 'steps': 41096, 'batch_loss/train': 0.6818074071779847}
12/28/2021 09:00:21 - INFO - codeparrot_training - Step 41097: {'lr': 3.8366587060287506e-05, 'samples': 21042176, 'steps': 41097, 'batch_loss/train': 0.6981854625046253}
12/28/2021 09:00:31 - INFO - codeparrot_training - Step 41098: {'lr': 3.835819620959627e-05, 'samples': 21042688, 'steps': 41098, 'batch_loss/train': 0.7090179969673045}
12/28/2021 09:00:45 - INFO - codeparrot_training - Step 41099: {'lr': 3.834980620031286e-05, 'samples': 21043200, 'steps': 41099, 'batch_loss/train': 0.6884708777070045}
12/28/2021 09:00:56 - INFO - codeparrot_training - Step 41100: {'lr': 3.8341417032470735e-05, 'samples': 21043712, 'steps': 41100, 'batch_loss/train': 0.658650394063443}
12/28/2021 09:01:06 - INFO - codeparrot_training - Step 41101: {'lr': 3.8333028706103196e-05, 'samples': 21044224, 'steps': 41101, 'batch_loss/train': 0.7298250913154334}
12/28/2021 09:01:19 - INFO - codeparrot_training - Step 41102: {'lr': 3.8324641221243625e-05, 'samples': 21044736, 'steps': 41102, 'batch_loss/train': 0.6017200942442287}
12/28/2021 09:01:29 - INFO - codeparrot_training - Step 41103: {'lr': 3.831625457792537e-05, 'samples': 21045248, 'steps': 41103, 'batch_loss/train': 0.6957866689190269}
12/28/2021 09:01:40 - INFO - codeparrot_training - Step 41104: {'lr': 3.830786877618172e-05, 'samples': 21045760, 'steps': 41104, 'batch_loss/train': 0.6836799941956997}
12/28/2021 09:01:51 - INFO - codeparrot_training - Step 41105: {'lr': 3.8299483816046085e-05, 'samples': 21046272, 'steps': 41105, 'batch_loss/train': 0.8622136292979121}
12/28/2021 09:02:05 - INFO - codeparrot_training - Step 41106: {'lr': 3.8291099697551765e-05, 'samples': 21046784, 'steps': 41106, 'batch_loss/train': 0.6192258559167385}
12/28/2021 09:02:16 - INFO - codeparrot_training - Step 41107: {'lr': 3.8282716420732136e-05, 'samples': 21047296, 'steps': 41107, 'batch_loss/train': 0.7011196021921933}
12/28/2021 09:02:26 - INFO - codeparrot_training - Step 41108: {'lr': 3.8274333985620355e-05, 'samples': 21047808, 'steps': 41108, 'batch_loss/train': 0.7598989857360721}
12/28/2021 09:02:38 - INFO - codeparrot_training - Step 41109: {'lr': 3.826595239224998e-05, 'samples': 21048320, 'steps': 41109, 'batch_loss/train': 0.7253643311560154}
12/28/2021 09:02:49 - INFO - codeparrot_training - Step 41110: {'lr': 3.825757164065424e-05, 'samples': 21048832, 'steps': 41110, 'batch_loss/train': 0.691170132078696}
12/28/2021 09:02:59 - INFO - codeparrot_training - Step 41111: {'lr': 3.824919173086644e-05, 'samples': 21049344, 'steps': 41111, 'batch_loss/train': 0.7418858013115823}
12/28/2021 09:03:11 - INFO - codeparrot_training - Step 41112: {'lr': 3.8240812662919787e-05, 'samples': 21049856, 'steps': 41112, 'batch_loss/train': 0.7987308786250651}
12/28/2021 09:03:22 - INFO - codeparrot_training - Step 41113: {'lr': 3.823243443684785e-05, 'samples': 21050368, 'steps': 41113, 'batch_loss/train': 0.6861478579230607}
12/28/2021 09:03:33 - INFO - codeparrot_training - Step 41114: {'lr': 3.822405705268375e-05, 'samples': 21050880, 'steps': 41114, 'batch_loss/train': 0.6589106584433466}
12/28/2021 09:03:43 - INFO - codeparrot_training - Step 41115: {'lr': 3.8215680510460814e-05, 'samples': 21051392, 'steps': 41115, 'batch_loss/train': 0.7665241397917271}
12/28/2021 09:03:56 - INFO - codeparrot_training - Step 41116: {'lr': 3.820730481021237e-05, 'samples': 21051904, 'steps': 41116, 'batch_loss/train': 0.5355068790086079}
12/28/2021 09:04:06 - INFO - codeparrot_training - Step 41117: {'lr': 3.8198929951971714e-05, 'samples': 21052416, 'steps': 41117, 'batch_loss/train': 0.6686906209215522}
12/28/2021 09:04:17 - INFO - codeparrot_training - Step 41118: {'lr': 3.819055593577214e-05, 'samples': 21052928, 'steps': 41118, 'batch_loss/train': 0.8029351746663451}
12/28/2021 09:04:32 - INFO - codeparrot_training - Step 41119: {'lr': 3.818218276164692e-05, 'samples': 21053440, 'steps': 41119, 'batch_loss/train': 0.7177003078977577}
12/28/2021 09:04:43 - INFO - codeparrot_training - Step 41120: {'lr': 3.817381042962939e-05, 'samples': 21053952, 'steps': 41120, 'batch_loss/train': 0.45153408218175173}
12/28/2021 09:04:53 - INFO - codeparrot_training - Step 41121: {'lr': 3.8165438939752856e-05, 'samples': 21054464, 'steps': 41121, 'batch_loss/train': 0.6522258128970861}
12/28/2021 09:05:05 - INFO - codeparrot_training - Step 41122: {'lr': 3.815706829205048e-05, 'samples': 21054976, 'steps': 41122, 'batch_loss/train': 0.7492904337123036}
12/28/2021 09:05:16 - INFO - codeparrot_training - Step 41123: {'lr': 3.8148698486555556e-05, 'samples': 21055488, 'steps': 41123, 'batch_loss/train': 0.681745583191514}
12/28/2021 09:05:27 - INFO - codeparrot_training - Step 41124: {'lr': 3.814032952330154e-05, 'samples': 21056000, 'steps': 41124, 'batch_loss/train': 0.7450733166188002}
12/28/2021 09:05:39 - INFO - codeparrot_training - Step 41125: {'lr': 3.813196140232153e-05, 'samples': 21056512, 'steps': 41125, 'batch_loss/train': 0.6380019226344302}
12/28/2021 09:05:50 - INFO - codeparrot_training - Step 41126: {'lr': 3.8123594123648825e-05, 'samples': 21057024, 'steps': 41126, 'batch_loss/train': 0.6843234356492758}
12/28/2021 09:06:00 - INFO - codeparrot_training - Step 41127: {'lr': 3.811522768731673e-05, 'samples': 21057536, 'steps': 41127, 'batch_loss/train': 0.6556411189958453}
12/28/2021 09:06:11 - INFO - codeparrot_training - Step 41128: {'lr': 3.810686209335848e-05, 'samples': 21058048, 'steps': 41128, 'batch_loss/train': 0.6869215667247772}
12/28/2021 09:06:25 - INFO - codeparrot_training - Step 41129: {'lr': 3.809849734180731e-05, 'samples': 21058560, 'steps': 41129, 'batch_loss/train': 0.6826174389570951}
12/28/2021 09:06:35 - INFO - codeparrot_training - Step 41130: {'lr': 3.809013343269654e-05, 'samples': 21059072, 'steps': 41130, 'batch_loss/train': 0.6508313848462421}
12/28/2021 09:06:46 - INFO - codeparrot_training - Step 41131: {'lr': 3.8081770366059366e-05, 'samples': 21059584, 'steps': 41131, 'batch_loss/train': 0.7441626903600991}
12/28/2021 09:06:58 - INFO - codeparrot_training - Step 41132: {'lr': 3.807340814192905e-05, 'samples': 21060096, 'steps': 41132, 'batch_loss/train': 0.6546672964468598}
12/28/2021 09:07:09 - INFO - codeparrot_training - Step 41133: {'lr': 3.806504676033892e-05, 'samples': 21060608, 'steps': 41133, 'batch_loss/train': 0.5862844835792203}
12/28/2021 09:07:19 - INFO - codeparrot_training - Step 41134: {'lr': 3.8056686221322024e-05, 'samples': 21061120, 'steps': 41134, 'batch_loss/train': 0.7195357494056225}
12/28/2021 09:07:31 - INFO - codeparrot_training - Step 41135: {'lr': 3.804832652491175e-05, 'samples': 21061632, 'steps': 41135, 'batch_loss/train': 0.7045822162181139}
12/28/2021 09:07:42 - INFO - codeparrot_training - Step 41136: {'lr': 3.803996767114138e-05, 'samples': 21062144, 'steps': 41136, 'batch_loss/train': 0.7940586563199759}
12/28/2021 09:07:53 - INFO - codeparrot_training - Step 41137: {'lr': 3.8031609660044015e-05, 'samples': 21062656, 'steps': 41137, 'batch_loss/train': 0.6994975302368402}
12/28/2021 09:08:06 - INFO - codeparrot_training - Step 41138: {'lr': 3.8023252491652835e-05, 'samples': 21063168, 'steps': 41138, 'batch_loss/train': 0.621351567999227}
12/28/2021 09:08:17 - INFO - codeparrot_training - Step 41139: {'lr': 3.801489616600132e-05, 'samples': 21063680, 'steps': 41139, 'batch_loss/train': 0.6439565353794023}
12/28/2021 09:08:28 - INFO - codeparrot_training - Step 41140: {'lr': 3.800654068312248e-05, 'samples': 21064192, 'steps': 41140, 'batch_loss/train': 0.6247211089357734}
12/28/2021 09:08:38 - INFO - codeparrot_training - Step 41141: {'lr': 3.7998186043049555e-05, 'samples': 21064704, 'steps': 41141, 'batch_loss/train': 0.8009377708658576}
12/28/2021 09:08:50 - INFO - codeparrot_training - Step 41142: {'lr': 3.798983224581581e-05, 'samples': 21065216, 'steps': 41142, 'batch_loss/train': 0.7523943439591676}
12/28/2021 09:09:01 - INFO - codeparrot_training - Step 41143: {'lr': 3.798147929145446e-05, 'samples': 21065728, 'steps': 41143, 'batch_loss/train': 0.679139286512509}
12/28/2021 09:09:12 - INFO - codeparrot_training - Step 41144: {'lr': 3.797312717999865e-05, 'samples': 21066240, 'steps': 41144, 'batch_loss/train': 0.7310160775668919}
12/28/2021 09:09:26 - INFO - codeparrot_training - Step 41145: {'lr': 3.7964775911481676e-05, 'samples': 21066752, 'steps': 41145, 'batch_loss/train': 0.703602428548038}
12/28/2021 09:09:36 - INFO - codeparrot_training - Step 41146: {'lr': 3.7956425485936666e-05, 'samples': 21067264, 'steps': 41146, 'batch_loss/train': 0.6927729360759258}
12/28/2021 09:09:47 - INFO - codeparrot_training - Step 41147: {'lr': 3.7948075903396825e-05, 'samples': 21067776, 'steps': 41147, 'batch_loss/train': 0.6413777600391768}
12/28/2021 09:09:59 - INFO - codeparrot_training - Step 41148: {'lr': 3.793972716389546e-05, 'samples': 21068288, 'steps': 41148, 'batch_loss/train': 0.8138997852802277}
12/28/2021 09:10:10 - INFO - codeparrot_training - Step 41149: {'lr': 3.793137926746554e-05, 'samples': 21068800, 'steps': 41149, 'batch_loss/train': 0.7251161150634289}
12/28/2021 09:10:20 - INFO - codeparrot_training - Step 41150: {'lr': 3.792303221414051e-05, 'samples': 21069312, 'steps': 41150, 'batch_loss/train': 0.7367305760271847}
12/28/2021 09:10:32 - INFO - codeparrot_training - Step 41151: {'lr': 3.791468600395334e-05, 'samples': 21069824, 'steps': 41151, 'batch_loss/train': 0.8673788886517286}
12/28/2021 09:10:43 - INFO - codeparrot_training - Step 41152: {'lr': 3.790634063693726e-05, 'samples': 21070336, 'steps': 41152, 'batch_loss/train': 0.7560377530753613}
12/28/2021 09:10:53 - INFO - codeparrot_training - Step 41153: {'lr': 3.7897996113125585e-05, 'samples': 21070848, 'steps': 41153, 'batch_loss/train': 0.7278735740110278}
12/28/2021 09:11:07 - INFO - codeparrot_training - Step 41154: {'lr': 3.7889652432551353e-05, 'samples': 21071360, 'steps': 41154, 'batch_loss/train': 0.6035345494747162}
12/28/2021 09:11:18 - INFO - codeparrot_training - Step 41155: {'lr': 3.788130959524777e-05, 'samples': 21071872, 'steps': 41155, 'batch_loss/train': 0.7272494791541249}
12/28/2021 09:11:29 - INFO - codeparrot_training - Step 41156: {'lr': 3.7872967601248006e-05, 'samples': 21072384, 'steps': 41156, 'batch_loss/train': 0.7449235031381249}
12/28/2021 09:11:39 - INFO - codeparrot_training - Step 41157: {'lr': 3.786462645058522e-05, 'samples': 21072896, 'steps': 41157, 'batch_loss/train': 0.7142978506162763}
12/28/2021 09:11:51 - INFO - codeparrot_training - Step 41158: {'lr': 3.785628614329259e-05, 'samples': 21073408, 'steps': 41158, 'batch_loss/train': 0.6754511739127338}
12/28/2021 09:12:02 - INFO - codeparrot_training - Step 41159: {'lr': 3.7847946679403304e-05, 'samples': 21073920, 'steps': 41159, 'batch_loss/train': 0.630590573331574}
12/28/2021 09:12:13 - INFO - codeparrot_training - Step 41160: {'lr': 3.783960805895037e-05, 'samples': 21074432, 'steps': 41160, 'batch_loss/train': 0.5654899703804404}
12/28/2021 09:12:25 - INFO - codeparrot_training - Step 41161: {'lr': 3.783127028196712e-05, 'samples': 21074944, 'steps': 41161, 'batch_loss/train': 0.7304074340499938}
12/28/2021 09:12:35 - INFO - codeparrot_training - Step 41162: {'lr': 3.782293334848666e-05, 'samples': 21075456, 'steps': 41162, 'batch_loss/train': 0.7387046385556459}
12/28/2021 09:12:46 - INFO - codeparrot_training - Step 41163: {'lr': 3.7814597258542056e-05, 'samples': 21075968, 'steps': 41163, 'batch_loss/train': 0.7060185967711732}
12/28/2021 09:12:58 - INFO - codeparrot_training - Step 41164: {'lr': 3.780626201216644e-05, 'samples': 21076480, 'steps': 41164, 'batch_loss/train': 0.5864907624199986}
12/28/2021 09:13:09 - INFO - codeparrot_training - Step 41165: {'lr': 3.779792760939313e-05, 'samples': 21076992, 'steps': 41165, 'batch_loss/train': 0.6900190818123519}
12/28/2021 09:13:19 - INFO - codeparrot_training - Step 41166: {'lr': 3.778959405025506e-05, 'samples': 21077504, 'steps': 41166, 'batch_loss/train': 0.7194865567144006}
12/28/2021 09:13:30 - INFO - codeparrot_training - Step 41167: {'lr': 3.778126133478543e-05, 'samples': 21078016, 'steps': 41167, 'batch_loss/train': 0.6098726103082299}
12/28/2021 09:13:44 - INFO - codeparrot_training - Step 41168: {'lr': 3.7772929463017386e-05, 'samples': 21078528, 'steps': 41168, 'batch_loss/train': 0.7041658367961645}
12/28/2021 09:13:54 - INFO - codeparrot_training - Step 41169: {'lr': 3.7764598434984035e-05, 'samples': 21079040, 'steps': 41169, 'batch_loss/train': 0.7266049841418862}
12/28/2021 09:14:05 - INFO - codeparrot_training - Step 41170: {'lr': 3.775626825071851e-05, 'samples': 21079552, 'steps': 41170, 'batch_loss/train': 0.710936508141458}
12/28/2021 09:14:17 - INFO - codeparrot_training - Step 41171: {'lr': 3.774793891025394e-05, 'samples': 21080064, 'steps': 41171, 'batch_loss/train': 0.7397889774292707}
12/28/2021 09:14:28 - INFO - codeparrot_training - Step 41172: {'lr': 3.7739610413623394e-05, 'samples': 21080576, 'steps': 41172, 'batch_loss/train': 0.7973775891587138}
12/28/2021 09:14:39 - INFO - codeparrot_training - Step 41173: {'lr': 3.7731282760860016e-05, 'samples': 21081088, 'steps': 41173, 'batch_loss/train': 0.6204420882277191}
12/28/2021 09:14:52 - INFO - codeparrot_training - Step 41174: {'lr': 3.772295595199699e-05, 'samples': 21081600, 'steps': 41174, 'batch_loss/train': 0.6953377779573202}
12/28/2021 09:15:03 - INFO - codeparrot_training - Step 41175: {'lr': 3.771462998706718e-05, 'samples': 21082112, 'steps': 41175, 'batch_loss/train': 0.6483640470542014}
12/28/2021 09:15:14 - INFO - codeparrot_training - Step 41176: {'lr': 3.770630486610399e-05, 'samples': 21082624, 'steps': 41176, 'batch_loss/train': 0.7037874015804846}
12/28/2021 09:15:26 - INFO - codeparrot_training - Step 41177: {'lr': 3.769798058914031e-05, 'samples': 21083136, 'steps': 41177, 'batch_loss/train': 0.6502860287437215}
12/28/2021 09:15:36 - INFO - codeparrot_training - Step 41178: {'lr': 3.768965715620923e-05, 'samples': 21083648, 'steps': 41178, 'batch_loss/train': 0.7227981556206942}
12/28/2021 09:15:47 - INFO - codeparrot_training - Step 41179: {'lr': 3.7681334567344045e-05, 'samples': 21084160, 'steps': 41179, 'batch_loss/train': 0.7190803599078208}
12/28/2021 09:15:58 - INFO - codeparrot_training - Step 41180: {'lr': 3.7673012822577625e-05, 'samples': 21084672, 'steps': 41180, 'batch_loss/train': 0.7318063837010413}
12/28/2021 09:16:10 - INFO - codeparrot_training - Step 41181: {'lr': 3.766469192194313e-05, 'samples': 21085184, 'steps': 41181, 'batch_loss/train': 0.7545141819864511}
12/28/2021 09:16:20 - INFO - codeparrot_training - Step 41182: {'lr': 3.765637186547366e-05, 'samples': 21085696, 'steps': 41182, 'batch_loss/train': 0.6912437789142132}
12/28/2021 09:16:31 - INFO - codeparrot_training - Step 41183: {'lr': 3.764805265320228e-05, 'samples': 21086208, 'steps': 41183, 'batch_loss/train': 0.8110213037580252}
12/28/2021 09:16:45 - INFO - codeparrot_training - Step 41184: {'lr': 3.763973428516207e-05, 'samples': 21086720, 'steps': 41184, 'batch_loss/train': 0.717070622369647}
12/28/2021 09:16:56 - INFO - codeparrot_training - Step 41185: {'lr': 3.763141676138612e-05, 'samples': 21087232, 'steps': 41185, 'batch_loss/train': 0.6487305578775704}
12/28/2021 09:17:06 - INFO - codeparrot_training - Step 41186: {'lr': 3.762310008190739e-05, 'samples': 21087744, 'steps': 41186, 'batch_loss/train': 0.652310804463923}
12/28/2021 09:17:18 - INFO - codeparrot_training - Step 41187: {'lr': 3.761478424675907e-05, 'samples': 21088256, 'steps': 41187, 'batch_loss/train': 0.7759278607554734}
12/28/2021 09:17:29 - INFO - codeparrot_training - Step 41188: {'lr': 3.760646925597422e-05, 'samples': 21088768, 'steps': 41188, 'batch_loss/train': 0.7528243875131011}
12/28/2021 09:17:40 - INFO - codeparrot_training - Step 41189: {'lr': 3.759815510958573e-05, 'samples': 21089280, 'steps': 41189, 'batch_loss/train': 0.7572869649156928}
12/28/2021 09:17:54 - INFO - codeparrot_training - Step 41190: {'lr': 3.758984180762684e-05, 'samples': 21089792, 'steps': 41190, 'batch_loss/train': 0.662991936551407}
12/28/2021 09:18:04 - INFO - codeparrot_training - Step 41191: {'lr': 3.758152935013062e-05, 'samples': 21090304, 'steps': 41191, 'batch_loss/train': 0.5625101177429315}
12/28/2021 09:18:15 - INFO - codeparrot_training - Step 41192: {'lr': 3.7573217737129936e-05, 'samples': 21090816, 'steps': 41192, 'batch_loss/train': 0.6842352310195565}
12/28/2021 09:18:26 - INFO - codeparrot_training - Step 41193: {'lr': 3.7564906968657886e-05, 'samples': 21091328, 'steps': 41193, 'batch_loss/train': 0.6371550264302641}
12/28/2021 09:18:38 - INFO - codeparrot_training - Step 41194: {'lr': 3.755659704474768e-05, 'samples': 21091840, 'steps': 41194, 'batch_loss/train': 0.6440538596361876}
12/28/2021 09:18:49 - INFO - codeparrot_training - Step 41195: {'lr': 3.754828796543217e-05, 'samples': 21092352, 'steps': 41195, 'batch_loss/train': 0.6383040716173127}
12/28/2021 09:18:59 - INFO - codeparrot_training - Step 41196: {'lr': 3.7539979730744437e-05, 'samples': 21092864, 'steps': 41196, 'batch_loss/train': 0.716913802549243}
12/28/2021 09:19:12 - INFO - codeparrot_training - Step 41197: {'lr': 3.753167234071753e-05, 'samples': 21093376, 'steps': 41197, 'batch_loss/train': 1.013464032439515}
12/28/2021 09:19:22 - INFO - codeparrot_training - Step 41198: {'lr': 3.7523365795384454e-05, 'samples': 21093888, 'steps': 41198, 'batch_loss/train': 0.7556806474458426}
12/28/2021 09:19:33 - INFO - codeparrot_training - Step 41199: {'lr': 3.751506009477826e-05, 'samples': 21094400, 'steps': 41199, 'batch_loss/train': 0.8064182931557298}
12/28/2021 09:19:44 - INFO - codeparrot_training - Step 41200: {'lr': 3.750675523893202e-05, 'samples': 21094912, 'steps': 41200, 'batch_loss/train': 1.4612974864430726}
12/28/2021 09:19:56 - INFO - codeparrot_training - Step 41201: {'lr': 3.749845122787857e-05, 'samples': 21095424, 'steps': 41201, 'batch_loss/train': 0.7267713551409543}
12/28/2021 09:20:06 - INFO - codeparrot_training - Step 41202: {'lr': 3.74901480616511e-05, 'samples': 21095936, 'steps': 41202, 'batch_loss/train': 0.9038039385341108}
12/28/2021 09:20:17 - INFO - codeparrot_training - Step 41203: {'lr': 3.74818457402826e-05, 'samples': 21096448, 'steps': 41203, 'batch_loss/train': 0.6347234840504825}
12/28/2021 09:20:31 - INFO - codeparrot_training - Step 41204: {'lr': 3.7473544263805945e-05, 'samples': 21096960, 'steps': 41204, 'batch_loss/train': 0.650759092066437}
12/28/2021 09:20:42 - INFO - codeparrot_training - Step 41205: {'lr': 3.746524363225431e-05, 'samples': 21097472, 'steps': 41205, 'batch_loss/train': 0.651950042694807}
12/28/2021 09:20:52 - INFO - codeparrot_training - Step 41206: {'lr': 3.745694384566059e-05, 'samples': 21097984, 'steps': 41206, 'batch_loss/train': 0.4698366750671994}
12/28/2021 09:21:04 - INFO - codeparrot_training - Step 41207: {'lr': 3.744864490405781e-05, 'samples': 21098496, 'steps': 41207, 'batch_loss/train': 0.6655584929976612}
12/28/2021 09:21:15 - INFO - codeparrot_training - Step 41208: {'lr': 3.744034680747896e-05, 'samples': 21099008, 'steps': 41208, 'batch_loss/train': 0.5669842089992017}
12/28/2021 09:21:26 - INFO - codeparrot_training - Step 41209: {'lr': 3.7432049555957025e-05, 'samples': 21099520, 'steps': 41209, 'batch_loss/train': 0.739200841402635}
12/28/2021 09:21:38 - INFO - codeparrot_training - Step 41210: {'lr': 3.742375314952498e-05, 'samples': 21100032, 'steps': 41210, 'batch_loss/train': 0.5669695362448692}
12/28/2021 09:21:48 - INFO - codeparrot_training - Step 41211: {'lr': 3.741545758821591e-05, 'samples': 21100544, 'steps': 41211, 'batch_loss/train': 0.6718160333111882}
12/28/2021 09:21:59 - INFO - codeparrot_training - Step 41212: {'lr': 3.740716287206258e-05, 'samples': 21101056, 'steps': 41212, 'batch_loss/train': 0.7352861466351897}
12/28/2021 09:22:10 - INFO - codeparrot_training - Step 41213: {'lr': 3.739886900109818e-05, 'samples': 21101568, 'steps': 41213, 'batch_loss/train': 0.6703471783548594}
12/28/2021 09:22:24 - INFO - codeparrot_training - Step 41214: {'lr': 3.739057597535564e-05, 'samples': 21102080, 'steps': 41214, 'batch_loss/train': 0.521841102745384}
12/28/2021 09:22:35 - INFO - codeparrot_training - Step 41215: {'lr': 3.73822837948678e-05, 'samples': 21102592, 'steps': 41215, 'batch_loss/train': 0.6209904020070098}
12/28/2021 09:22:45 - INFO - codeparrot_training - Step 41216: {'lr': 3.737399245966774e-05, 'samples': 21103104, 'steps': 41216, 'batch_loss/train': 0.6937187118455768}
12/28/2021 09:22:58 - INFO - codeparrot_training - Step 41217: {'lr': 3.7365701969788475e-05, 'samples': 21103616, 'steps': 41217, 'batch_loss/train': 0.6511717806570232}
12/28/2021 09:23:08 - INFO - codeparrot_training - Step 41218: {'lr': 3.735741232526285e-05, 'samples': 21104128, 'steps': 41218, 'batch_loss/train': 0.6354565774672665}
12/28/2021 09:23:19 - INFO - codeparrot_training - Step 41219: {'lr': 3.7349123526123796e-05, 'samples': 21104640, 'steps': 41219, 'batch_loss/train': 0.7733172457665205}
12/28/2021 09:23:33 - INFO - codeparrot_training - Step 41220: {'lr': 3.734083557240442e-05, 'samples': 21105152, 'steps': 41220, 'batch_loss/train': 0.6237650560215116}
12/28/2021 09:23:44 - INFO - codeparrot_training - Step 41221: {'lr': 3.733254846413756e-05, 'samples': 21105664, 'steps': 41221, 'batch_loss/train': 1.0311354203149676}
12/28/2021 09:23:54 - INFO - codeparrot_training - Step 41222: {'lr': 3.732426220135618e-05, 'samples': 21106176, 'steps': 41222, 'batch_loss/train': 0.73996068444103}
12/28/2021 09:24:05 - INFO - codeparrot_training - Step 41223: {'lr': 3.731597678409324e-05, 'samples': 21106688, 'steps': 41223, 'batch_loss/train': 0.7228375893319026}
12/28/2021 09:24:17 - INFO - codeparrot_training - Step 41224: {'lr': 3.7307692212381646e-05, 'samples': 21107200, 'steps': 41224, 'batch_loss/train': 0.7059612213633955}
12/28/2021 09:24:28 - INFO - codeparrot_training - Step 41225: {'lr': 3.7299408486254374e-05, 'samples': 21107712, 'steps': 41225, 'batch_loss/train': 0.6042306388553698}
12/28/2021 09:24:38 - INFO - codeparrot_training - Step 41226: {'lr': 3.7291125605744324e-05, 'samples': 21108224, 'steps': 41226, 'batch_loss/train': 0.676368243759498}
12/28/2021 09:24:50 - INFO - codeparrot_training - Step 41227: {'lr': 3.7282843570884464e-05, 'samples': 21108736, 'steps': 41227, 'batch_loss/train': 0.7916870564222336}
12/28/2021 09:25:01 - INFO - codeparrot_training - Step 41228: {'lr': 3.727456238170768e-05, 'samples': 21109248, 'steps': 41228, 'batch_loss/train': 0.6438826145604253}
12/28/2021 09:25:12 - INFO - codeparrot_training - Step 41229: {'lr': 3.7266282038247004e-05, 'samples': 21109760, 'steps': 41229, 'batch_loss/train': 0.6591504639945924}
12/28/2021 09:25:26 - INFO - codeparrot_training - Step 41230: {'lr': 3.72580025405351e-05, 'samples': 21110272, 'steps': 41230, 'batch_loss/train': 0.6771864043548703}
12/28/2021 09:25:37 - INFO - codeparrot_training - Step 41231: {'lr': 3.72497238886052e-05, 'samples': 21110784, 'steps': 41231, 'batch_loss/train': 0.7245011096820235}
12/28/2021 09:25:47 - INFO - codeparrot_training - Step 41232: {'lr': 3.724144608249e-05, 'samples': 21111296, 'steps': 41232, 'batch_loss/train': 0.6692306912736967}
12/28/2021 09:25:59 - INFO - codeparrot_training - Step 41233: {'lr': 3.7233169122222494e-05, 'samples': 21111808, 'steps': 41233, 'batch_loss/train': 0.5219236745033413}
12/28/2021 09:26:10 - INFO - codeparrot_training - Step 41234: {'lr': 3.722489300783555e-05, 'samples': 21112320, 'steps': 41234, 'batch_loss/train': 0.6832046722993255}
12/28/2021 09:26:20 - INFO - codeparrot_training - Step 41235: {'lr': 3.721661773936208e-05, 'samples': 21112832, 'steps': 41235, 'batch_loss/train': 0.6712789782322943}
12/28/2021 09:26:31 - INFO - codeparrot_training - Step 41236: {'lr': 3.720834331683501e-05, 'samples': 21113344, 'steps': 41236, 'batch_loss/train': 0.6738295461982489}
12/28/2021 09:26:43 - INFO - codeparrot_training - Step 41237: {'lr': 3.7200069740287224e-05, 'samples': 21113856, 'steps': 41237, 'batch_loss/train': 0.7208307578694075}
12/28/2021 09:26:54 - INFO - codeparrot_training - Step 41238: {'lr': 3.7191797009751616e-05, 'samples': 21114368, 'steps': 41238, 'batch_loss/train': 0.6835049837827682}
12/28/2021 09:27:04 - INFO - codeparrot_training - Step 41239: {'lr': 3.718352512526105e-05, 'samples': 21114880, 'steps': 41239, 'batch_loss/train': 0.624258894007653}
12/28/2021 09:27:17 - INFO - codeparrot_training - Step 41240: {'lr': 3.7175254086848494e-05, 'samples': 21115392, 'steps': 41240, 'batch_loss/train': 0.6817541461496148}
12/28/2021 09:27:28 - INFO - codeparrot_training - Step 41241: {'lr': 3.7166983894546676e-05, 'samples': 21115904, 'steps': 41241, 'batch_loss/train': 0.7144450573250651}
12/28/2021 09:27:38 - INFO - codeparrot_training - Step 41242: {'lr': 3.71587145483886e-05, 'samples': 21116416, 'steps': 41242, 'batch_loss/train': 0.568187752738595}
12/28/2021 09:27:52 - INFO - codeparrot_training - Step 41243: {'lr': 3.715044604840717e-05, 'samples': 21116928, 'steps': 41243, 'batch_loss/train': 0.6803495909553021}
12/28/2021 09:28:03 - INFO - codeparrot_training - Step 41244: {'lr': 3.714217839463513e-05, 'samples': 21117440, 'steps': 41244, 'batch_loss/train': 0.7576418472453952}
12/28/2021 09:28:13 - INFO - codeparrot_training - Step 41245: {'lr': 3.713391158710539e-05, 'samples': 21117952, 'steps': 41245, 'batch_loss/train': 0.7295684572309256}
12/28/2021 09:28:26 - INFO - codeparrot_training - Step 41246: {'lr': 3.712564562585094e-05, 'samples': 21118464, 'steps': 41246, 'batch_loss/train': 0.716779128764756}
12/28/2021 09:28:36 - INFO - codeparrot_training - Step 41247: {'lr': 3.7117380510904494e-05, 'samples': 21118976, 'steps': 41247, 'batch_loss/train': 0.7459867694415152}
12/28/2021 09:28:47 - INFO - codeparrot_training - Step 41248: {'lr': 3.710911624229893e-05, 'samples': 21119488, 'steps': 41248, 'batch_loss/train': 0.567680909531191}
12/28/2021 09:28:57 - INFO - codeparrot_training - Step 41249: {'lr': 3.710085282006717e-05, 'samples': 21120000, 'steps': 41249, 'batch_loss/train': 0.7558305433485657}
12/28/2021 09:29:11 - INFO - codeparrot_training - Step 41250: {'lr': 3.709259024424202e-05, 'samples': 21120512, 'steps': 41250, 'batch_loss/train': 0.589160452131182}
12/28/2021 09:29:22 - INFO - codeparrot_training - Step 41251: {'lr': 3.708432851485635e-05, 'samples': 21121024, 'steps': 41251, 'batch_loss/train': 0.6431146282702684}
12/28/2021 09:29:33 - INFO - codeparrot_training - Step 41252: {'lr': 3.707606763194299e-05, 'samples': 21121536, 'steps': 41252, 'batch_loss/train': 0.6791288694366813}
12/28/2021 09:29:45 - INFO - codeparrot_training - Step 41253: {'lr': 3.706780759553477e-05, 'samples': 21122048, 'steps': 41253, 'batch_loss/train': 0.7589171454310417}
12/28/2021 09:29:55 - INFO - codeparrot_training - Step 41254: {'lr': 3.705954840566458e-05, 'samples': 21122560, 'steps': 41254, 'batch_loss/train': 0.5996603104285896}
12/28/2021 09:30:06 - INFO - codeparrot_training - Step 41255: {'lr': 3.705129006236527e-05, 'samples': 21123072, 'steps': 41255, 'batch_loss/train': 0.6779196033312473}
12/28/2021 09:30:18 - INFO - codeparrot_training - Step 41256: {'lr': 3.7043032565669485e-05, 'samples': 21123584, 'steps': 41256, 'batch_loss/train': 0.7678357316181064}
12/28/2021 09:30:29 - INFO - codeparrot_training - Step 41257: {'lr': 3.703477591561033e-05, 'samples': 21124096, 'steps': 41257, 'batch_loss/train': 0.768838569521904}
12/28/2021 09:30:39 - INFO - codeparrot_training - Step 41258: {'lr': 3.702652011222044e-05, 'samples': 21124608, 'steps': 41258, 'batch_loss/train': 0.6487887691473588}
12/28/2021 09:30:50 - INFO - codeparrot_training - Step 41259: {'lr': 3.7018265155532666e-05, 'samples': 21125120, 'steps': 41259, 'batch_loss/train': 0.6902922161389142}
12/28/2021 09:31:03 - INFO - codeparrot_training - Step 41260: {'lr': 3.701001104557988e-05, 'samples': 21125632, 'steps': 41260, 'batch_loss/train': 0.6824781731702387}
12/28/2021 09:31:13 - INFO - codeparrot_training - Step 41261: {'lr': 3.700175778239484e-05, 'samples': 21126144, 'steps': 41261, 'batch_loss/train': 0.7024354375316761}
12/28/2021 09:31:24 - INFO - codeparrot_training - Step 41262: {'lr': 3.69935053660104e-05, 'samples': 21126656, 'steps': 41262, 'batch_loss/train': 0.6880040680989623}
12/28/2021 09:31:38 - INFO - codeparrot_training - Step 41263: {'lr': 3.6985253796459366e-05, 'samples': 21127168, 'steps': 41263, 'batch_loss/train': 0.7558018085546792}
12/28/2021 09:31:49 - INFO - codeparrot_training - Step 41264: {'lr': 3.697700307377452e-05, 'samples': 21127680, 'steps': 41264, 'batch_loss/train': 0.7850516940234229}
12/28/2021 09:32:00 - INFO - codeparrot_training - Step 41265: {'lr': 3.6968753197988675e-05, 'samples': 21128192, 'steps': 41265, 'batch_loss/train': 0.7125332332216203}
12/28/2021 09:32:12 - INFO - codeparrot_training - Step 41266: {'lr': 3.696050416913468e-05, 'samples': 21128704, 'steps': 41266, 'batch_loss/train': 0.6845153691247106}
12/28/2021 09:32:22 - INFO - codeparrot_training - Step 41267: {'lr': 3.695225598724519e-05, 'samples': 21129216, 'steps': 41267, 'batch_loss/train': 0.7271852198755369}
12/28/2021 09:32:33 - INFO - codeparrot_training - Step 41268: {'lr': 3.6944008652353116e-05, 'samples': 21129728, 'steps': 41268, 'batch_loss/train': 0.7318122442811728}
12/28/2021 09:32:44 - INFO - codeparrot_training - Step 41269: {'lr': 3.6935762164491294e-05, 'samples': 21130240, 'steps': 41269, 'batch_loss/train': 0.6550665586255491}
12/28/2021 09:32:58 - INFO - codeparrot_training - Step 41270: {'lr': 3.692751652369236e-05, 'samples': 21130752, 'steps': 41270, 'batch_loss/train': 0.6113391304388642}
12/28/2021 09:33:08 - INFO - codeparrot_training - Step 41271: {'lr': 3.6919271729989126e-05, 'samples': 21131264, 'steps': 41271, 'batch_loss/train': 0.7887900425121188}
12/28/2021 09:33:19 - INFO - codeparrot_training - Step 41272: {'lr': 3.691102778341451e-05, 'samples': 21131776, 'steps': 41272, 'batch_loss/train': 0.7274414584971964}
12/28/2021 09:33:31 - INFO - codeparrot_training - Step 41273: {'lr': 3.6902784684001126e-05, 'samples': 21132288, 'steps': 41273, 'batch_loss/train': 0.6946681709523546}
12/28/2021 09:33:42 - INFO - codeparrot_training - Step 41274: {'lr': 3.6894542431781833e-05, 'samples': 21132800, 'steps': 41274, 'batch_loss/train': 1.3641751483082771}
12/28/2021 09:33:53 - INFO - codeparrot_training - Step 41275: {'lr': 3.688630102678936e-05, 'samples': 21133312, 'steps': 41275, 'batch_loss/train': 0.6354056850541383}
12/28/2021 09:34:03 - INFO - codeparrot_training - Step 41276: {'lr': 3.687806046905648e-05, 'samples': 21133824, 'steps': 41276, 'batch_loss/train': 0.7161515336483717}
12/28/2021 09:34:15 - INFO - codeparrot_training - Step 41277: {'lr': 3.6869820758615974e-05, 'samples': 21134336, 'steps': 41277, 'batch_loss/train': 0.6393119678832591}
12/28/2021 09:34:26 - INFO - codeparrot_training - Step 41278: {'lr': 3.6861581895500566e-05, 'samples': 21134848, 'steps': 41278, 'batch_loss/train': 0.7785124657675624}
12/28/2021 09:34:37 - INFO - codeparrot_training - Step 41279: {'lr': 3.685334387974304e-05, 'samples': 21135360, 'steps': 41279, 'batch_loss/train': 0.8031252510845661}
12/28/2021 09:34:51 - INFO - codeparrot_training - Step 41280: {'lr': 3.684510671137614e-05, 'samples': 21135872, 'steps': 41280, 'batch_loss/train': 0.6982943685725331}
12/28/2021 09:35:01 - INFO - codeparrot_training - Step 41281: {'lr': 3.683687039043268e-05, 'samples': 21136384, 'steps': 41281, 'batch_loss/train': 0.8892350541427732}
12/28/2021 09:35:12 - INFO - codeparrot_training - Step 41282: {'lr': 3.682863491694521e-05, 'samples': 21136896, 'steps': 41282, 'batch_loss/train': 0.6056649533566087}
12/28/2021 09:35:24 - INFO - codeparrot_training - Step 41283: {'lr': 3.682040029094666e-05, 'samples': 21137408, 'steps': 41283, 'batch_loss/train': 0.8371935049071908}
12/28/2021 09:35:35 - INFO - codeparrot_training - Step 41284: {'lr': 3.681216651246974e-05, 'samples': 21137920, 'steps': 41284, 'batch_loss/train': 0.7367120889248326}
12/28/2021 09:35:45 - INFO - codeparrot_training - Step 41285: {'lr': 3.68039335815471e-05, 'samples': 21138432, 'steps': 41285, 'batch_loss/train': 1.0235940162092447}
12/28/2021 09:35:56 - INFO - codeparrot_training - Step 41286: {'lr': 3.6795701498211546e-05, 'samples': 21138944, 'steps': 41286, 'batch_loss/train': 0.7064820118248463}
12/28/2021 09:36:08 - INFO - codeparrot_training - Step 41287: {'lr': 3.6787470262495765e-05, 'samples': 21139456, 'steps': 41287, 'batch_loss/train': 0.7086339453235269}
12/28/2021 09:36:19 - INFO - codeparrot_training - Step 41288: {'lr': 3.677923987443249e-05, 'samples': 21139968, 'steps': 41288, 'batch_loss/train': 0.47254529781639576}
12/28/2021 09:36:29 - INFO - codeparrot_training - Step 41289: {'lr': 3.6771010334054444e-05, 'samples': 21140480, 'steps': 41289, 'batch_loss/train': 0.7110097615513951}
12/28/2021 09:36:43 - INFO - codeparrot_training - Step 41290: {'lr': 3.676278164139435e-05, 'samples': 21140992, 'steps': 41290, 'batch_loss/train': 0.6431890819221735}
12/28/2021 09:36:54 - INFO - codeparrot_training - Step 41291: {'lr': 3.675455379648493e-05, 'samples': 21141504, 'steps': 41291, 'batch_loss/train': 0.7224012603983283}
12/28/2021 09:37:05 - INFO - codeparrot_training - Step 41292: {'lr': 3.674632679935897e-05, 'samples': 21142016, 'steps': 41292, 'batch_loss/train': 0.7445216625928879}
12/28/2021 09:37:17 - INFO - codeparrot_training - Step 41293: {'lr': 3.673810065004893e-05, 'samples': 21142528, 'steps': 41293, 'batch_loss/train': 0.630819242913276}
12/28/2021 09:37:27 - INFO - codeparrot_training - Step 41294: {'lr': 3.672987534858776e-05, 'samples': 21143040, 'steps': 41294, 'batch_loss/train': 0.7119573480449617}
12/28/2021 09:37:38 - INFO - codeparrot_training - Step 41295: {'lr': 3.6721650895008115e-05, 'samples': 21143552, 'steps': 41295, 'batch_loss/train': 0.593722794437781}
12/28/2021 09:37:52 - INFO - codeparrot_training - Step 41296: {'lr': 3.6713427289342627e-05, 'samples': 21144064, 'steps': 41296, 'batch_loss/train': 0.6063010385842063}
12/28/2021 09:38:03 - INFO - codeparrot_training - Step 41297: {'lr': 3.670520453162393e-05, 'samples': 21144576, 'steps': 41297, 'batch_loss/train': 0.6845707604661584}
12/28/2021 09:38:13 - INFO - codeparrot_training - Step 41298: {'lr': 3.669698262188495e-05, 'samples': 21145088, 'steps': 41298, 'batch_loss/train': 0.8195094205439091}
12/28/2021 09:38:24 - INFO - codeparrot_training - Step 41299: {'lr': 3.6688761560158155e-05, 'samples': 21145600, 'steps': 41299, 'batch_loss/train': 0.6552643924951553}
12/28/2021 09:38:36 - INFO - codeparrot_training - Step 41300: {'lr': 3.668054134647622e-05, 'samples': 21146112, 'steps': 41300, 'batch_loss/train': 0.6671909254509956}
12/28/2021 09:38:47 - INFO - codeparrot_training - Step 41301: {'lr': 3.6672321980872056e-05, 'samples': 21146624, 'steps': 41301, 'batch_loss/train': 0.6381581394234672}
12/28/2021 09:38:57 - INFO - codeparrot_training - Step 41302: {'lr': 3.666410346337812e-05, 'samples': 21147136, 'steps': 41302, 'batch_loss/train': 0.732788740308024}
12/28/2021 09:39:10 - INFO - codeparrot_training - Step 41303: {'lr': 3.665588579402715e-05, 'samples': 21147648, 'steps': 41303, 'batch_loss/train': 0.6960077900439501}
12/28/2021 09:39:20 - INFO - codeparrot_training - Step 41304: {'lr': 3.664766897285182e-05, 'samples': 21148160, 'steps': 41304, 'batch_loss/train': 0.46704469178803265}
12/28/2021 09:39:31 - INFO - codeparrot_training - Step 41305: {'lr': 3.663945299988483e-05, 'samples': 21148672, 'steps': 41305, 'batch_loss/train': 0.6666191723197699}
12/28/2021 09:39:43 - INFO - codeparrot_training - Step 41306: {'lr': 3.663123787515879e-05, 'samples': 21149184, 'steps': 41306, 'batch_loss/train': 0.7199815157800913}
12/28/2021 09:39:54 - INFO - codeparrot_training - Step 41307: {'lr': 3.662302359870642e-05, 'samples': 21149696, 'steps': 41307, 'batch_loss/train': 0.6530139878159389}
12/28/2021 09:40:04 - INFO - codeparrot_training - Step 41308: {'lr': 3.6614810170560256e-05, 'samples': 21150208, 'steps': 41308, 'batch_loss/train': 0.6634487572591752}
12/28/2021 09:40:15 - INFO - codeparrot_training - Step 41309: {'lr': 3.6606597590753074e-05, 'samples': 21150720, 'steps': 41309, 'batch_loss/train': 0.7754041599109769}
12/28/2021 09:40:29 - INFO - codeparrot_training - Step 41310: {'lr': 3.659838585931757e-05, 'samples': 21151232, 'steps': 41310, 'batch_loss/train': 0.6908998684957623}
12/28/2021 09:40:40 - INFO - codeparrot_training - Step 41311: {'lr': 3.659017497628622e-05, 'samples': 21151744, 'steps': 41311, 'batch_loss/train': 0.6985064735636115}
12/28/2021 09:40:50 - INFO - codeparrot_training - Step 41312: {'lr': 3.658196494169178e-05, 'samples': 21152256, 'steps': 41312, 'batch_loss/train': 0.7703279057750478}
12/28/2021 09:41:02 - INFO - codeparrot_training - Step 41313: {'lr': 3.657375575556684e-05, 'samples': 21152768, 'steps': 41313, 'batch_loss/train': 0.6361422254703939}
12/28/2021 09:41:13 - INFO - codeparrot_training - Step 41314: {'lr': 3.656554741794407e-05, 'samples': 21153280, 'steps': 41314, 'batch_loss/train': 0.7260195069247857}
12/28/2021 09:41:24 - INFO - codeparrot_training - Step 41315: {'lr': 3.6557339928856096e-05, 'samples': 21153792, 'steps': 41315, 'batch_loss/train': 0.7785343704745173}
12/28/2021 09:41:36 - INFO - codeparrot_training - Step 41316: {'lr': 3.654913328833556e-05, 'samples': 21154304, 'steps': 41316, 'batch_loss/train': 0.8272271379828453}
12/28/2021 09:41:46 - INFO - codeparrot_training - Step 41317: {'lr': 3.654092749641505e-05, 'samples': 21154816, 'steps': 41317, 'batch_loss/train': 0.6693582145962864}
12/28/2021 09:41:57 - INFO - codeparrot_training - Step 41318: {'lr': 3.653272255312723e-05, 'samples': 21155328, 'steps': 41318, 'batch_loss/train': 0.8454563235864043}
12/28/2021 09:42:11 - INFO - codeparrot_training - Step 41319: {'lr': 3.652451845850468e-05, 'samples': 21155840, 'steps': 41319, 'batch_loss/train': 0.7278905333951116}
12/28/2021 09:42:22 - INFO - codeparrot_training - Step 41320: {'lr': 3.651631521258006e-05, 'samples': 21156352, 'steps': 41320, 'batch_loss/train': 0.5625256375642493}
12/28/2021 09:42:32 - INFO - codeparrot_training - Step 41321: {'lr': 3.650811281538602e-05, 'samples': 21156864, 'steps': 41321, 'batch_loss/train': 0.7845224412158132}
12/28/2021 09:42:43 - INFO - codeparrot_training - Step 41322: {'lr': 3.649991126695504e-05, 'samples': 21157376, 'steps': 41322, 'batch_loss/train': 0.7891795947216451}
12/28/2021 09:42:55 - INFO - codeparrot_training - Step 41323: {'lr': 3.649171056731976e-05, 'samples': 21157888, 'steps': 41323, 'batch_loss/train': 0.7375099551863968}
12/28/2021 09:43:06 - INFO - codeparrot_training - Step 41324: {'lr': 3.64835107165129e-05, 'samples': 21158400, 'steps': 41324, 'batch_loss/train': 0.7036183471791446}
12/28/2021 09:43:16 - INFO - codeparrot_training - Step 41325: {'lr': 3.647531171456697e-05, 'samples': 21158912, 'steps': 41325, 'batch_loss/train': 0.42858819843968377}
12/28/2021 09:43:28 - INFO - codeparrot_training - Step 41326: {'lr': 3.6467113561514496e-05, 'samples': 21159424, 'steps': 41326, 'batch_loss/train': 0.5434925808513071}
12/28/2021 09:43:39 - INFO - codeparrot_training - Step 41327: {'lr': 3.6458916257388255e-05, 'samples': 21159936, 'steps': 41327, 'batch_loss/train': 0.8104254570789635}
12/28/2021 09:43:50 - INFO - codeparrot_training - Step 41328: {'lr': 3.64507198022207e-05, 'samples': 21160448, 'steps': 41328, 'batch_loss/train': 0.7742325821891427}
12/28/2021 09:44:04 - INFO - codeparrot_training - Step 41329: {'lr': 3.6442524196044435e-05, 'samples': 21160960, 'steps': 41329, 'batch_loss/train': 0.7351267314516008}
12/28/2021 09:44:14 - INFO - codeparrot_training - Step 41330: {'lr': 3.643432943889205e-05, 'samples': 21161472, 'steps': 41330, 'batch_loss/train': 0.814394747838378}
12/28/2021 09:44:25 - INFO - codeparrot_training - Step 41331: {'lr': 3.6426135530796153e-05, 'samples': 21161984, 'steps': 41331, 'batch_loss/train': 0.6849334319122136}
12/28/2021 09:44:36 - INFO
- codeparrot_training - Step 41332: {'lr': 3.6417942471789285e-05, 'samples': 21162496, 'steps': 41332, 'batch_loss/train': 0.6789911522064358} 12/28/2021 09:44:48 - INFO - codeparrot_training - Step 41333: {'lr': 3.64097502619041e-05, 'samples': 21163008, 'steps': 41333, 'batch_loss/train': 0.6835669698193669} 12/28/2021 09:44:58 - INFO - codeparrot_training - Step 41334: {'lr': 3.640155890117297e-05, 'samples': 21163520, 'steps': 41334, 'batch_loss/train': 0.7481736754998565} 12/28/2021 09:45:09 - INFO - codeparrot_training - Step 41335: {'lr': 3.639336838962867e-05, 'samples': 21164032, 'steps': 41335, 'batch_loss/train': 0.8020946392789483} 12/28/2021 09:45:23 - INFO - codeparrot_training - Step 41336: {'lr': 3.638517872730371e-05, 'samples': 21164544, 'steps': 41336, 'batch_loss/train': 0.7952767820097506} 12/28/2021 09:45:34 - INFO - codeparrot_training - Step 41337: {'lr': 3.637698991423052e-05, 'samples': 21165056, 'steps': 41337, 'batch_loss/train': 0.7325661228969693} 12/28/2021 09:45:44 - INFO - codeparrot_training - Step 41338: {'lr': 3.636880195044187e-05, 'samples': 21165568, 'steps': 41338, 'batch_loss/train': 0.614356700622011} 12/28/2021 09:45:56 - INFO - codeparrot_training - Step 41339: {'lr': 3.6360614835970165e-05, 'samples': 21166080, 'steps': 41339, 'batch_loss/train': 0.659411586355418} 12/28/2021 09:46:07 - INFO - codeparrot_training - Step 41340: {'lr': 3.635242857084797e-05, 'samples': 21166592, 'steps': 41340, 'batch_loss/train': 0.6550866331672296} 12/28/2021 09:46:18 - INFO - codeparrot_training - Step 41341: {'lr': 3.634424315510784e-05, 'samples': 21167104, 'steps': 41341, 'batch_loss/train': 0.7542017516680062} 12/28/2021 09:46:30 - INFO - codeparrot_training - Step 41342: {'lr': 3.633605858878234e-05, 'samples': 21167616, 'steps': 41342, 'batch_loss/train': 0.7200029110535979} 12/28/2021 09:46:41 - INFO - codeparrot_training - Step 41343: {'lr': 3.632787487190398e-05, 'samples': 21168128, 'steps': 41343, 'batch_loss/train': 
0.6976273600012064} 12/28/2021 09:46:51 - INFO - codeparrot_training - Step 41344: {'lr': 3.631969200450533e-05, 'samples': 21168640, 'steps': 41344, 'batch_loss/train': 0.823764665517956} 12/28/2021 09:47:02 - INFO - codeparrot_training - Step 41345: {'lr': 3.631150998661889e-05, 'samples': 21169152, 'steps': 41345, 'batch_loss/train': 0.6541530226822942} 12/28/2021 09:47:14 - INFO - codeparrot_training - Step 41346: {'lr': 3.6303328818277194e-05, 'samples': 21169664, 'steps': 41346, 'batch_loss/train': 0.7191128144040704} 12/28/2021 09:47:25 - INFO - codeparrot_training - Step 41347: {'lr': 3.629514849951285e-05, 'samples': 21170176, 'steps': 41347, 'batch_loss/train': 0.597061128122732} 12/28/2021 09:47:35 - INFO - codeparrot_training - Step 41348: {'lr': 3.628696903035822e-05, 'samples': 21170688, 'steps': 41348, 'batch_loss/train': 0.6845103583764285} 12/28/2021 09:47:49 - INFO - codeparrot_training - Step 41349: {'lr': 3.6278790410845865e-05, 'samples': 21171200, 'steps': 41349, 'batch_loss/train': 1.1107436628080904} 12/28/2021 09:48:00 - INFO - codeparrot_training - Step 41350: {'lr': 3.627061264100848e-05, 'samples': 21171712, 'steps': 41350, 'batch_loss/train': 1.0229457193054259} 12/28/2021 09:48:11 - INFO - codeparrot_training - Step 41351: {'lr': 3.6262435720878336e-05, 'samples': 21172224, 'steps': 41351, 'batch_loss/train': 0.7660776292905211} 12/28/2021 09:48:23 - INFO - codeparrot_training - Step 41352: {'lr': 3.625425965048801e-05, 'samples': 21172736, 'steps': 41352, 'batch_loss/train': 0.7684371163486503} 12/28/2021 09:48:34 - INFO - codeparrot_training - Step 41353: {'lr': 3.624608442987015e-05, 'samples': 21173248, 'steps': 41353, 'batch_loss/train': 0.7952549690380692} 12/28/2021 09:48:44 - INFO - codeparrot_training - Step 41354: {'lr': 3.6237910059057104e-05, 'samples': 21173760, 'steps': 41354, 'batch_loss/train': 0.6049908443819731} 12/28/2021 09:48:55 - INFO - codeparrot_training - Step 41355: {'lr': 3.62297365380814e-05, 'samples': 
21174272, 'steps': 41355, 'batch_loss/train': 0.6976963421329856} 12/28/2021 09:49:09 - INFO - codeparrot_training - Step 41356: {'lr': 3.622156386697553e-05, 'samples': 21174784, 'steps': 41356, 'batch_loss/train': 0.758418507874012} 12/28/2021 09:49:19 - INFO - codeparrot_training - Step 41357: {'lr': 3.6213392045772034e-05, 'samples': 21175296, 'steps': 41357, 'batch_loss/train': 0.802768430672586} 12/28/2021 09:49:30 - INFO - codeparrot_training - Step 41358: {'lr': 3.620522107450336e-05, 'samples': 21175808, 'steps': 41358, 'batch_loss/train': 0.6922384053468704} 12/28/2021 09:49:42 - INFO - codeparrot_training - Step 41359: {'lr': 3.619705095320205e-05, 'samples': 21176320, 'steps': 41359, 'batch_loss/train': 0.7796165393665433} 12/28/2021 09:49:53 - INFO - codeparrot_training - Step 41360: {'lr': 3.6188881681900435e-05, 'samples': 21176832, 'steps': 41360, 'batch_loss/train': 0.776460996363312} 12/28/2021 09:50:03 - INFO - codeparrot_training - Step 41361: {'lr': 3.618071326063113e-05, 'samples': 21177344, 'steps': 41361, 'batch_loss/train': 0.7465294422581792} 12/28/2021 09:50:16 - INFO - codeparrot_training - Step 41362: {'lr': 3.6172545689426654e-05, 'samples': 21177856, 'steps': 41362, 'batch_loss/train': 0.7396129202097654} 12/28/2021 09:50:26 - INFO - codeparrot_training - Step 41363: {'lr': 3.6164378968319285e-05, 'samples': 21178368, 'steps': 41363, 'batch_loss/train': 0.6960986114572734} 12/28/2021 09:50:37 - INFO - codeparrot_training - Step 41364: {'lr': 3.615621309734163e-05, 'samples': 21178880, 'steps': 41364, 'batch_loss/train': 0.7891762480139732} 12/28/2021 09:50:51 - INFO - codeparrot_training - Step 41365: {'lr': 3.614804807652622e-05, 'samples': 21179392, 'steps': 41365, 'batch_loss/train': 0.7167719537392259} 12/28/2021 09:51:01 - INFO - codeparrot_training - Step 41366: {'lr': 3.613988390590534e-05, 'samples': 21179904, 'steps': 41366, 'batch_loss/train': 0.669820950308349} 12/28/2021 09:51:12 - INFO - codeparrot_training - Step 41367: 
{'lr': 3.613172058551154e-05, 'samples': 21180416, 'steps': 41367, 'batch_loss/train': 0.626499067991972} 12/28/2021 09:51:23 - INFO - codeparrot_training - Step 41368: {'lr': 3.6123558115377264e-05, 'samples': 21180928, 'steps': 41368, 'batch_loss/train': 0.661468249745667} 12/28/2021 09:51:35 - INFO - codeparrot_training - Step 41369: {'lr': 3.611539649553497e-05, 'samples': 21181440, 'steps': 41369, 'batch_loss/train': 0.7777959262020886} 12/28/2021 09:51:45 - INFO - codeparrot_training - Step 41370: {'lr': 3.6107235726017094e-05, 'samples': 21181952, 'steps': 41370, 'batch_loss/train': 0.7369706449098885} 12/28/2021 09:51:56 - INFO - codeparrot_training - Step 41371: {'lr': 3.6099075806856094e-05, 'samples': 21182464, 'steps': 41371, 'batch_loss/train': 0.7136802095919847} 12/28/2021 09:52:08 - INFO - codeparrot_training - Step 41372: {'lr': 3.60909167380844e-05, 'samples': 21182976, 'steps': 41372, 'batch_loss/train': 0.7265907065011561} 12/28/2021 09:52:18 - INFO - codeparrot_training - Step 41373: {'lr': 3.608275851973447e-05, 'samples': 21183488, 'steps': 41373, 'batch_loss/train': 0.7033501705154777} 12/28/2021 09:52:29 - INFO - codeparrot_training - Step 41374: {'lr': 3.607460115183869e-05, 'samples': 21184000, 'steps': 41374, 'batch_loss/train': 0.7423858223482966} 12/28/2021 09:52:43 - INFO - codeparrot_training - Step 41375: {'lr': 3.606644463442954e-05, 'samples': 21184512, 'steps': 41375, 'batch_loss/train': 0.726475334726274} 12/28/2021 09:52:54 - INFO - codeparrot_training - Step 41376: {'lr': 3.605828896753946e-05, 'samples': 21185024, 'steps': 41376, 'batch_loss/train': 0.8106808278243989} 12/28/2021 09:53:04 - INFO - codeparrot_training - Step 41377: {'lr': 3.605013415120081e-05, 'samples': 21185536, 'steps': 41377, 'batch_loss/train': 0.6872041271999478} 12/28/2021 09:53:15 - INFO - codeparrot_training - Step 41378: {'lr': 3.604198018544597e-05, 'samples': 21186048, 'steps': 41378, 'batch_loss/train': 0.7492382526397705} 12/28/2021 09:53:27 - 
INFO - codeparrot_training - Step 41379: {'lr': 3.603382707030756e-05, 'samples': 21186560, 'steps': 41379, 'batch_loss/train': 0.7471189559437335} 12/28/2021 09:53:38 - INFO - codeparrot_training - Step 41380: {'lr': 3.6025674805817807e-05, 'samples': 21187072, 'steps': 41380, 'batch_loss/train': 1.5515589825809002} 12/28/2021 09:53:48 - INFO - codeparrot_training - Step 41381: {'lr': 3.601752339200915e-05, 'samples': 21187584, 'steps': 41381, 'batch_loss/train': 0.7226838070782833} 12/28/2021 09:54:01 - INFO - codeparrot_training - Step 41382: {'lr': 3.6009372828914036e-05, 'samples': 21188096, 'steps': 41382, 'batch_loss/train': 0.8541686469689012} 12/28/2021 09:54:12 - INFO - codeparrot_training - Step 41383: {'lr': 3.6001223116564866e-05, 'samples': 21188608, 'steps': 41383, 'batch_loss/train': 0.7891027135774493} 12/28/2021 09:54:22 - INFO - codeparrot_training - Step 41384: {'lr': 3.599307425499404e-05, 'samples': 21189120, 'steps': 41384, 'batch_loss/train': 0.7292692735791206} 12/28/2021 09:54:36 - INFO - codeparrot_training - Step 41385: {'lr': 3.598492624423397e-05, 'samples': 21189632, 'steps': 41385, 'batch_loss/train': 0.7556986237177625} 12/28/2021 09:54:47 - INFO - codeparrot_training - Step 41386: {'lr': 3.597677908431693e-05, 'samples': 21190144, 'steps': 41386, 'batch_loss/train': 0.7222819686867297} 12/28/2021 09:54:57 - INFO - codeparrot_training - Step 41387: {'lr': 3.5968632775275454e-05, 'samples': 21190656, 'steps': 41387, 'batch_loss/train': 0.6699817516491748} 12/28/2021 09:55:08 - INFO - codeparrot_training - Step 41388: {'lr': 3.5960487317141936e-05, 'samples': 21191168, 'steps': 41388, 'batch_loss/train': 0.7519885208457708} 12/28/2021 09:55:20 - INFO - codeparrot_training - Step 41389: {'lr': 3.59523427099486e-05, 'samples': 21191680, 'steps': 41389, 'batch_loss/train': 0.5584596438857261} 12/28/2021 09:55:31 - INFO - codeparrot_training - Step 41390: {'lr': 3.594419895372797e-05, 'samples': 21192192, 'steps': 41390, 
'batch_loss/train': 0.6991808153688908} 12/28/2021 09:55:41 - INFO - codeparrot_training - Step 41391: {'lr': 3.593605604851244e-05, 'samples': 21192704, 'steps': 41391, 'batch_loss/train': 0.6592591116204858} 12/28/2021 09:55:54 - INFO - codeparrot_training - Step 41392: {'lr': 3.592791399433429e-05, 'samples': 21193216, 'steps': 41392, 'batch_loss/train': 0.7549303029663861} 12/28/2021 09:56:04 - INFO - codeparrot_training - Step 41393: {'lr': 3.591977279122591e-05, 'samples': 21193728, 'steps': 41393, 'batch_loss/train': 0.6978828432038426} 12/28/2021 09:56:15 - INFO - codeparrot_training - Step 41394: {'lr': 3.591163243921966e-05, 'samples': 21194240, 'steps': 41394, 'batch_loss/train': 0.699662746861577} 12/28/2021 09:56:29 - INFO - codeparrot_training - Step 41395: {'lr': 3.5903492938347935e-05, 'samples': 21194752, 'steps': 41395, 'batch_loss/train': 0.6805013036355376} 12/28/2021 09:56:39 - INFO - codeparrot_training - Step 41396: {'lr': 3.58953542886431e-05, 'samples': 21195264, 'steps': 41396, 'batch_loss/train': 0.6624349676130805} 12/28/2021 09:56:50 - INFO - codeparrot_training - Step 41397: {'lr': 3.588721649013746e-05, 'samples': 21195776, 'steps': 41397, 'batch_loss/train': 0.6774794205557555} 12/28/2021 09:57:01 - INFO - codeparrot_training - Step 41398: {'lr': 3.587907954286343e-05, 'samples': 21196288, 'steps': 41398, 'batch_loss/train': 0.7410375126637518} 12/28/2021 09:57:13 - INFO - codeparrot_training - Step 41399: {'lr': 3.587094344685332e-05, 'samples': 21196800, 'steps': 41399, 'batch_loss/train': 0.7619422231800854} 12/28/2021 09:57:24 - INFO - codeparrot_training - Step 41400: {'lr': 3.5862808202139493e-05, 'samples': 21197312, 'steps': 41400, 'batch_loss/train': 0.5348813589371275} 12/28/2021 09:57:34 - INFO - codeparrot_training - Step 41401: {'lr': 3.585467380875426e-05, 'samples': 21197824, 'steps': 41401, 'batch_loss/train': 0.7095349375158548} 12/28/2021 09:57:46 - INFO - codeparrot_training - Step 41402: {'lr': 
3.5846540266730064e-05, 'samples': 21198336, 'steps': 41402, 'batch_loss/train': 0.702797326259315} 12/28/2021 09:57:57 - INFO - codeparrot_training - Step 41403: {'lr': 3.583840757609913e-05, 'samples': 21198848, 'steps': 41403, 'batch_loss/train': 0.7073953752405941} 12/28/2021 09:58:08 - INFO - codeparrot_training - Step 41404: {'lr': 3.5830275736893734e-05, 'samples': 21199360, 'steps': 41404, 'batch_loss/train': 0.6398974065668881} 12/28/2021 09:58:21 - INFO - codeparrot_training - Step 41405: {'lr': 3.5822144749146416e-05, 'samples': 21199872, 'steps': 41405, 'batch_loss/train': 0.6663378071971238} 12/28/2021 09:58:32 - INFO - codeparrot_training - Step 41406: {'lr': 3.581401461288933e-05, 'samples': 21200384, 'steps': 41406, 'batch_loss/train': 0.721143067930825} 12/28/2021 09:58:43 - INFO - codeparrot_training - Step 41407: {'lr': 3.580588532815482e-05, 'samples': 21200896, 'steps': 41407, 'batch_loss/train': 0.6976943309418857} 12/28/2021 09:58:53 - INFO - codeparrot_training - Step 41408: {'lr': 3.5797756894975274e-05, 'samples': 21201408, 'steps': 41408, 'batch_loss/train': 0.7480922872200608} 12/28/2021 09:59:05 - INFO - codeparrot_training - Step 41409: {'lr': 3.578962931338292e-05, 'samples': 21201920, 'steps': 41409, 'batch_loss/train': 0.7598857771372423} 12/28/2021 09:59:16 - INFO - codeparrot_training - Step 41410: {'lr': 3.5781502583410146e-05, 'samples': 21202432, 'steps': 41410, 'batch_loss/train': 0.6712366412393749} 12/28/2021 09:59:27 - INFO - codeparrot_training - Step 41411: {'lr': 3.577337670508923e-05, 'samples': 21202944, 'steps': 41411, 'batch_loss/train': 0.7067398023791611} 12/28/2021 09:59:41 - INFO - codeparrot_training - Step 41412: {'lr': 3.5765251678452487e-05, 'samples': 21203456, 'steps': 41412, 'batch_loss/train': 0.7014013021253049} 12/28/2021 09:59:51 - INFO - codeparrot_training - Step 41413: {'lr': 3.575712750353222e-05, 'samples': 21203968, 'steps': 41413, 'batch_loss/train': 0.7436866108328104} 12/28/2021 10:00:02 - 
INFO - codeparrot_training - Step 41414: {'lr': 3.5749004180360755e-05, 'samples': 21204480, 'steps': 41414, 'batch_loss/train': 0.6309584584087133} 12/28/2021 10:00:14 - INFO - codeparrot_training - Step 41415: {'lr': 3.5740881708970246e-05, 'samples': 21204992, 'steps': 41415, 'batch_loss/train': 0.779998348094523} 12/28/2021 10:00:25 - INFO - codeparrot_training - Step 41416: {'lr': 3.5732760089393135e-05, 'samples': 21205504, 'steps': 41416, 'batch_loss/train': 0.7734233979135752} 12/28/2021 10:00:35 - INFO - codeparrot_training - Step 41417: {'lr': 3.572463932166173e-05, 'samples': 21206016, 'steps': 41417, 'batch_loss/train': 0.7196399802342057} 12/28/2021 10:00:47 - INFO - codeparrot_training - Step 41418: {'lr': 3.5716519405808194e-05, 'samples': 21206528, 'steps': 41418, 'batch_loss/train': 0.658553677611053} 12/28/2021 10:00:58 - INFO - codeparrot_training - Step 41419: {'lr': 3.5708400341864813e-05, 'samples': 21207040, 'steps': 41419, 'batch_loss/train': 1.5773012647405267} 12/28/2021 10:01:09 - INFO - codeparrot_training - Step 41420: {'lr': 3.5700282129864034e-05, 'samples': 21207552, 'steps': 41420, 'batch_loss/train': 0.7552374671213329} 12/28/2021 10:01:19 - INFO - codeparrot_training - Step 41421: {'lr': 3.569216476983794e-05, 'samples': 21208064, 'steps': 41421, 'batch_loss/train': 0.733895187266171} 12/28/2021 10:01:31 - INFO - codeparrot_training - Step 41422: {'lr': 3.5684048261818925e-05, 'samples': 21208576, 'steps': 41422, 'batch_loss/train': 0.6139865994337015} 12/28/2021 10:01:42 - INFO - codeparrot_training - Step 41423: {'lr': 3.567593260583918e-05, 'samples': 21209088, 'steps': 41423, 'batch_loss/train': 0.745440058875829} 12/28/2021 10:01:53 - INFO - codeparrot_training - Step 41424: {'lr': 3.566781780193099e-05, 'samples': 21209600, 'steps': 41424, 'batch_loss/train': 0.7986506698653102} 12/28/2021 10:02:07 - INFO - codeparrot_training - Step 41425: {'lr': 3.565970385012665e-05, 'samples': 21210112, 'steps': 41425, 
'batch_loss/train': 0.675746824243106} 12/28/2021 10:02:17 - INFO - codeparrot_training - Step 41426: {'lr': 3.565159075045837e-05, 'samples': 21210624, 'steps': 41426, 'batch_loss/train': 0.803439955227077} 12/28/2021 10:02:28 - INFO - codeparrot_training - Step 41427: {'lr': 3.564347850295846e-05, 'samples': 21211136, 'steps': 41427, 'batch_loss/train': 0.6059773583256174} 12/28/2021 10:02:40 - INFO - codeparrot_training - Step 41428: {'lr': 3.5635367107659186e-05, 'samples': 21211648, 'steps': 41428, 'batch_loss/train': 0.6882062670774758} 12/28/2021 10:02:51 - INFO - codeparrot_training - Step 41429: {'lr': 3.56272565645927e-05, 'samples': 21212160, 'steps': 41429, 'batch_loss/train': 0.7103268513455987} 12/28/2021 10:03:01 - INFO - codeparrot_training - Step 41430: {'lr': 3.5619146873791225e-05, 'samples': 21212672, 'steps': 41430, 'batch_loss/train': 0.7175955334678292} 12/28/2021 10:03:12 - INFO - codeparrot_training - Step 41431: {'lr': 3.56110380352872e-05, 'samples': 21213184, 'steps': 41431, 'batch_loss/train': 0.7485152399167418} 12/28/2021 10:03:24 - INFO - codeparrot_training - Step 41432: {'lr': 3.560293004911269e-05, 'samples': 21213696, 'steps': 41432, 'batch_loss/train': 0.744437396991998} 12/28/2021 10:03:35 - INFO - codeparrot_training - Step 41433: {'lr': 3.5594822915299975e-05, 'samples': 21214208, 'steps': 41433, 'batch_loss/train': 0.7377573754638433} 12/28/2021 10:03:46 - INFO - codeparrot_training - Step 41434: {'lr': 3.558671663388127e-05, 'samples': 21214720, 'steps': 41434, 'batch_loss/train': 0.8680910468101501} 12/28/2021 10:03:59 - INFO - codeparrot_training - Step 41435: {'lr': 3.557861120488884e-05, 'samples': 21215232, 'steps': 41435, 'batch_loss/train': 0.5910369991324842} 12/28/2021 10:04:10 - INFO - codeparrot_training - Step 41436: {'lr': 3.557050662835487e-05, 'samples': 21215744, 'steps': 41436, 'batch_loss/train': 0.7173869274556637} 12/28/2021 10:04:21 - INFO - codeparrot_training - Step 41437: {'lr': 
3.556240290431162e-05, 'samples': 21216256, 'steps': 41437, 'batch_loss/train': 0.7027274118736386} 12/28/2021 10:04:33 - INFO - codeparrot_training - Step 41438: {'lr': 3.555430003279128e-05, 'samples': 21216768, 'steps': 41438, 'batch_loss/train': 0.7303887428715825} 12/28/2021 10:04:44 - INFO - codeparrot_training - Step 41439: {'lr': 3.5546198013826054e-05, 'samples': 21217280, 'steps': 41439, 'batch_loss/train': 0.7853540927171707} 12/28/2021 10:04:54 - INFO - codeparrot_training - Step 41440: {'lr': 3.5538096847448244e-05, 'samples': 21217792, 'steps': 41440, 'batch_loss/train': 0.9499731245450675} 12/28/2021 10:05:05 - INFO - codeparrot_training - Step 41441: {'lr': 3.5529996533689876e-05, 'samples': 21218304, 'steps': 41441, 'batch_loss/train': 0.8124514520168304} 12/28/2021 10:05:19 - INFO - codeparrot_training - Step 41442: {'lr': 3.5521897072583325e-05, 'samples': 21218816, 'steps': 41442, 'batch_loss/train': 0.7255048165097833} 12/28/2021 10:05:29 - INFO - codeparrot_training - Step 41443: {'lr': 3.551379846416075e-05, 'samples': 21219328, 'steps': 41443, 'batch_loss/train': 0.7481932644732296} 12/28/2021 10:05:40 - INFO - codeparrot_training - Step 41444: {'lr': 3.55057007084543e-05, 'samples': 21219840, 'steps': 41444, 'batch_loss/train': 0.6710994336754084} 12/28/2021 10:05:52 - INFO - codeparrot_training - Step 41445: {'lr': 3.549760380549613e-05, 'samples': 21220352, 'steps': 41445, 'batch_loss/train': 0.6870848406106234} 12/28/2021 10:06:03 - INFO - codeparrot_training - Step 41446: {'lr': 3.548950775531859e-05, 'samples': 21220864, 'steps': 41446, 'batch_loss/train': 0.7304796847165562} 12/28/2021 10:06:13 - INFO - codeparrot_training - Step 41447: {'lr': 3.548141255795373e-05, 'samples': 21221376, 'steps': 41447, 'batch_loss/train': 0.6976257711648941} 12/28/2021 10:06:25 - INFO - codeparrot_training - Step 41448: {'lr': 3.547331821343378e-05, 'samples': 21221888, 'steps': 41448, 'batch_loss/train': 0.6722100544720888} 12/28/2021 10:06:36 - INFO 
- codeparrot_training - Step 41449: {'lr': 3.546522472179089e-05, 'samples': 21222400, 'steps': 41449, 'batch_loss/train': 0.7539834575727582} 12/28/2021 10:06:47 - INFO - codeparrot_training - Step 41450: {'lr': 3.545713208305728e-05, 'samples': 21222912, 'steps': 41450, 'batch_loss/train': 0.7034709630534053} 12/28/2021 10:06:57 - INFO - codeparrot_training - Step 41451: {'lr': 3.544904029726512e-05, 'samples': 21223424, 'steps': 41451, 'batch_loss/train': 0.7056423723697662} 12/28/2021 10:07:11 - INFO - codeparrot_training - Step 41452: {'lr': 3.5440949364446526e-05, 'samples': 21223936, 'steps': 41452, 'batch_loss/train': 0.6302275392226875} 12/28/2021 10:07:22 - INFO - codeparrot_training - Step 41453: {'lr': 3.5432859284633745e-05, 'samples': 21224448, 'steps': 41453, 'batch_loss/train': 0.8590461824787781} 12/28/2021 10:07:33 - INFO - codeparrot_training - Step 41454: {'lr': 3.542477005785888e-05, 'samples': 21224960, 'steps': 41454, 'batch_loss/train': 0.7278438843786716} 12/28/2021 10:07:45 - INFO - codeparrot_training - Step 41455: {'lr': 3.541668168415415e-05, 'samples': 21225472, 'steps': 41455, 'batch_loss/train': 0.7649635598063469} 12/28/2021 10:07:56 - INFO - codeparrot_training - Step 41456: {'lr': 3.5408594163551575e-05, 'samples': 21225984, 'steps': 41456, 'batch_loss/train': 0.6600856208242476} 12/28/2021 10:08:06 - INFO - codeparrot_training - Step 41457: {'lr': 3.540050749608351e-05, 'samples': 21226496, 'steps': 41457, 'batch_loss/train': 0.6556982174515724} 12/28/2021 10:08:18 - INFO - codeparrot_training - Step 41458: {'lr': 3.539242168178194e-05, 'samples': 21227008, 'steps': 41458, 'batch_loss/train': 0.6940722000435926} 12/28/2021 10:08:29 - INFO - codeparrot_training - Step 41459: {'lr': 3.538433672067906e-05, 'samples': 21227520, 'steps': 41459, 'batch_loss/train': 0.7404183517210186} 12/28/2021 10:08:39 - INFO - codeparrot_training - Step 41460: {'lr': 3.537625261280705e-05, 'samples': 21228032, 'steps': 41460, 'batch_loss/train': 
0.5952767922135536} 12/28/2021 10:08:50 - INFO - codeparrot_training - Step 41461: {'lr': 3.5368169358198e-05, 'samples': 21228544, 'steps': 41461, 'batch_loss/train': 0.6943807132774964} 12/28/2021 10:09:02 - INFO - codeparrot_training - Step 41462: {'lr': 3.5360086956884056e-05, 'samples': 21229056, 'steps': 41462, 'batch_loss/train': 0.6756628663279116} 12/28/2021 10:09:13 - INFO - codeparrot_training - Step 41463: {'lr': 3.535200540889738e-05, 'samples': 21229568, 'steps': 41463, 'batch_loss/train': 0.7914465002249926} 12/28/2021 10:09:23 - INFO - codeparrot_training - Step 41464: {'lr': 3.5343924714270085e-05, 'samples': 21230080, 'steps': 41464, 'batch_loss/train': 0.696938862092793} 12/28/2021 10:09:39 - INFO - codeparrot_training - Step 41465: {'lr': 3.533584487303429e-05, 'samples': 21230592, 'steps': 41465, 'batch_loss/train': 0.6638363052916247} 12/28/2021 10:09:50 - INFO - codeparrot_training - Step 41466: {'lr': 3.5327765885222154e-05, 'samples': 21231104, 'steps': 41466, 'batch_loss/train': 0.6644636560231447} 12/28/2021 10:10:00 - INFO - codeparrot_training - Step 41467: {'lr': 3.531968775086569e-05, 'samples': 21231616, 'steps': 41467, 'batch_loss/train': 0.7095117443241179} 12/28/2021 10:10:13 - INFO - codeparrot_training - Step 41468: {'lr': 3.531161046999712e-05, 'samples': 21232128, 'steps': 41468, 'batch_loss/train': 0.6860526840901002} 12/28/2021 10:10:23 - INFO - codeparrot_training - Step 41469: {'lr': 3.5303534042648574e-05, 'samples': 21232640, 'steps': 41469, 'batch_loss/train': 0.93458566442132} 12/28/2021 10:10:34 - INFO - codeparrot_training - Step 41470: {'lr': 3.529545846885207e-05, 'samples': 21233152, 'steps': 41470, 'batch_loss/train': 0.6569227868458256} 12/28/2021 10:10:44 - INFO - codeparrot_training - Step 41471: {'lr': 3.5287383748639713e-05, 'samples': 21233664, 'steps': 41471, 'batch_loss/train': 0.6676495219580829} 12/28/2021 10:10:58 - INFO - codeparrot_training - Step 41472: {'lr': 3.5279309882043723e-05, 'samples': 
21234176, 'steps': 41472, 'batch_loss/train': 0.7009779454674572} 12/28/2021 10:11:09 - INFO - codeparrot_training - Step 41473: {'lr': 3.527123686909608e-05, 'samples': 21234688, 'steps': 41473, 'batch_loss/train': 0.7674765530973673} 12/28/2021 10:11:20 - INFO - codeparrot_training - Step 41474: {'lr': 3.526316470982893e-05, 'samples': 21235200, 'steps': 41474, 'batch_loss/train': 0.6895979577675462} 12/28/2021 10:11:32 - INFO - codeparrot_training - Step 41475: {'lr': 3.525509340427435e-05, 'samples': 21235712, 'steps': 41475, 'batch_loss/train': 0.7154574510641396} 12/28/2021 10:11:42 - INFO - codeparrot_training - Step 41476: {'lr': 3.524702295246443e-05, 'samples': 21236224, 'steps': 41476, 'batch_loss/train': 0.5788409428205341} 12/28/2021 10:11:53 - INFO - codeparrot_training - Step 41477: {'lr': 3.523895335443125e-05, 'samples': 21236736, 'steps': 41477, 'batch_loss/train': 0.572933447547257} 12/28/2021 10:12:05 - INFO - codeparrot_training - Step 41478: {'lr': 3.523088461020693e-05, 'samples': 21237248, 'steps': 41478, 'batch_loss/train': 0.7700228816829622} 12/28/2021 10:12:16 - INFO - codeparrot_training - Step 41479: {'lr': 3.5222816719823494e-05, 'samples': 21237760, 'steps': 41479, 'batch_loss/train': 0.6618377935374156} 12/28/2021 10:12:26 - INFO - codeparrot_training - Step 41480: {'lr': 3.521474968331303e-05, 'samples': 21238272, 'steps': 41480, 'batch_loss/train': 0.651579761877656} 12/28/2021 10:12:40 - INFO - codeparrot_training - Step 41481: {'lr': 3.5206683500707714e-05, 'samples': 21238784, 'steps': 41481, 'batch_loss/train': 0.70101447426714} 12/28/2021 10:12:51 - INFO - codeparrot_training - Step 41482: {'lr': 3.51986181720394e-05, 'samples': 21239296, 'steps': 41482, 'batch_loss/train': 0.6989846425130963} 12/28/2021 10:13:01 - INFO - codeparrot_training - Step 41483: {'lr': 3.5190553697340365e-05, 'samples': 21239808, 'steps': 41483, 'batch_loss/train': 0.6929174528922886} 12/28/2021 10:13:12 - INFO - codeparrot_training - Step 41484: 
{'lr': 3.518249007664254e-05, 'samples': 21240320, 'steps': 41484, 'batch_loss/train': 0.7727010906673968} 12/28/2021 10:13:24 - INFO - codeparrot_training - Step 41485: {'lr': 3.517442730997797e-05, 'samples': 21240832, 'steps': 41485, 'batch_loss/train': 0.7140384912490845} 12/28/2021 10:13:35 - INFO - codeparrot_training - Step 41486: {'lr': 3.516636539737886e-05, 'samples': 21241344, 'steps': 41486, 'batch_loss/train': 0.7417801870033145} 12/28/2021 10:13:46 - INFO - codeparrot_training - Step 41487: {'lr': 3.515830433887712e-05, 'samples': 21241856, 'steps': 41487, 'batch_loss/train': 0.7747007198631763} 12/28/2021 10:13:58 - INFO - codeparrot_training - Step 41488: {'lr': 3.515024413450482e-05, 'samples': 21242368, 'steps': 41488, 'batch_loss/train': 0.6816759537905455} 12/28/2021 10:14:08 - INFO - codeparrot_training - Step 41489: {'lr': 3.514218478429404e-05, 'samples': 21242880, 'steps': 41489, 'batch_loss/train': 0.6840355889871716} 12/28/2021 10:14:19 - INFO - codeparrot_training - Step 41490: {'lr': 3.5134126288276805e-05, 'samples': 21243392, 'steps': 41490, 'batch_loss/train': 0.7382751135155559} 12/28/2021 10:14:31 - INFO - codeparrot_training - Step 41491: {'lr': 3.512606864648515e-05, 'samples': 21243904, 'steps': 41491, 'batch_loss/train': 0.6929496573284268} 12/28/2021 10:14:42 - INFO - codeparrot_training - Step 41492: {'lr': 3.511801185895119e-05, 'samples': 21244416, 'steps': 41492, 'batch_loss/train': 0.7450800538063049} 12/28/2021 10:14:52 - INFO - codeparrot_training - Step 41493: {'lr': 3.510995592570676e-05, 'samples': 21244928, 'steps': 41493, 'batch_loss/train': 0.5736684853618499} 12/28/2021 10:15:03 - INFO - codeparrot_training - Step 41494: {'lr': 3.510190084678405e-05, 'samples': 21245440, 'steps': 41494, 'batch_loss/train': 0.8723988849669695} 12/28/2021 10:15:17 - INFO - codeparrot_training - Step 41495: {'lr': 3.50938466222151e-05, 'samples': 21245952, 'steps': 41495, 'batch_loss/train': 0.6615924679208547} 12/28/2021 10:15:27 - 
INFO - codeparrot_training - Step 41496: {'lr': 3.508579325203182e-05, 'samples': 21246464, 'steps': 41496, 'batch_loss/train': 0.7264257939532399} 12/28/2021 10:15:38 - INFO - codeparrot_training - Step 41497: {'lr': 3.507774073626621e-05, 'samples': 21246976, 'steps': 41497, 'batch_loss/train': 0.6520982615766115} 12/28/2021 10:15:50 - INFO - codeparrot_training - Step 41498: {'lr': 3.506968907495045e-05, 'samples': 21247488, 'steps': 41498, 'batch_loss/train': 0.7089025129098445} 12/28/2021 10:16:01 - INFO - codeparrot_training - Step 41499: {'lr': 3.506163826811642e-05, 'samples': 21248000, 'steps': 41499, 'batch_loss/train': 0.8701697699725628} 12/28/2021 10:16:11 - INFO - codeparrot_training - Step 41500: {'lr': 3.505358831579611e-05, 'samples': 21248512, 'steps': 41500, 'batch_loss/train': 0.7204147554002702} 12/28/2021 10:16:26 - INFO - codeparrot_training - Step 41501: {'lr': 3.5045539218021675e-05, 'samples': 21249024, 'steps': 41501, 'batch_loss/train': 0.74898443184793} 12/28/2021 10:16:36 - INFO - codeparrot_training - Step 41502: {'lr': 3.503749097482495e-05, 'samples': 21249536, 'steps': 41502, 'batch_loss/train': 0.5226102356100455} 12/28/2021 10:16:47 - INFO - codeparrot_training - Step 41503: {'lr': 3.502944358623797e-05, 'samples': 21250048, 'steps': 41503, 'batch_loss/train': 0.5909537689294666} 12/28/2021 10:16:59 - INFO - codeparrot_training - Step 41504: {'lr': 3.5021397052292794e-05, 'samples': 21250560, 'steps': 41504, 'batch_loss/train': 0.7644563224166632} 12/28/2021 10:17:10 - INFO - codeparrot_training - Step 41505: {'lr': 3.501335137302136e-05, 'samples': 21251072, 'steps': 41505, 'batch_loss/train': 0.773739037103951} 12/28/2021 10:17:20 - INFO - codeparrot_training - Step 41506: {'lr': 3.5005306548455654e-05, 'samples': 21251584, 'steps': 41506, 'batch_loss/train': 0.7556928126141429} 12/28/2021 10:17:31 - INFO - codeparrot_training - Step 41507: {'lr': 3.499726257862773e-05, 'samples': 21252096, 'steps': 41507, 'batch_loss/train': 
0.7263468806631863} 12/28/2021 10:17:43 - INFO - codeparrot_training - Step 41508: {'lr': 3.4989219463569395e-05, 'samples': 21252608, 'steps': 41508, 'batch_loss/train': 0.6638321182690561} 12/28/2021 10:17:54 - INFO - codeparrot_training - Step 41509: {'lr': 3.498117720331282e-05, 'samples': 21253120, 'steps': 41509, 'batch_loss/train': 0.7685074498876929} 12/28/2021 10:18:04 - INFO - codeparrot_training - Step 41510: {'lr': 3.497313579788991e-05, 'samples': 21253632, 'steps': 41510, 'batch_loss/train': 0.7978712692856789} 12/28/2021 10:18:18 - INFO - codeparrot_training - Step 41511: {'lr': 3.496509524733252e-05, 'samples': 21254144, 'steps': 41511, 'batch_loss/train': 0.7532082623802125} 12/28/2021 10:18:29 - INFO - codeparrot_training - Step 41512: {'lr': 3.4957055551672865e-05, 'samples': 21254656, 'steps': 41512, 'batch_loss/train': 0.6150559587404132} 12/28/2021 10:18:40 - INFO - codeparrot_training - Step 41513: {'lr': 3.494901671094267e-05, 'samples': 21255168, 'steps': 41513, 'batch_loss/train': 0.7512076604180038} 12/28/2021 10:18:52 - INFO - codeparrot_training - Step 41514: {'lr': 3.4940978725173986e-05, 'samples': 21255680, 'steps': 41514, 'batch_loss/train': 0.7143118041567504} 12/28/2021 10:19:03 - INFO - codeparrot_training - Step 41515: {'lr': 3.4932941594398775e-05, 'samples': 21256192, 'steps': 41515, 'batch_loss/train': 0.7866953844204545} 12/28/2021 10:19:13 - INFO - codeparrot_training - Step 41516: {'lr': 3.492490531864897e-05, 'samples': 21256704, 'steps': 41516, 'batch_loss/train': 0.7360972058959305} 12/28/2021 10:19:24 - INFO - codeparrot_training - Step 41517: {'lr': 3.491686989795656e-05, 'samples': 21257216, 'steps': 41517, 'batch_loss/train': 0.6515723783522844} 12/28/2021 10:19:38 - INFO - codeparrot_training - Step 41518: {'lr': 3.4908835332353505e-05, 'samples': 21257728, 'steps': 41518, 'batch_loss/train': 0.6575706470757723} 12/28/2021 10:19:48 - INFO - codeparrot_training - Step 41519: {'lr': 3.490080162187159e-05, 'samples': 
21258240, 'steps': 41519, 'batch_loss/train': 0.7216002019122243} 12/28/2021 10:19:59 - INFO - codeparrot_training - Step 41520: {'lr': 3.48927687665429e-05, 'samples': 21258752, 'steps': 41520, 'batch_loss/train': 0.6853739202488214} 12/28/2021 10:20:11 - INFO - codeparrot_training - Step 41521: {'lr': 3.488473676639944e-05, 'samples': 21259264, 'steps': 41521, 'batch_loss/train': 0.7325316313654184} 12/28/2021 10:20:22 - INFO - codeparrot_training - Step 41522: {'lr': 3.487670562147291e-05, 'samples': 21259776, 'steps': 41522, 'batch_loss/train': 0.8086121436208487} 12/28/2021 10:20:32 - INFO - codeparrot_training - Step 41523: {'lr': 3.486867533179544e-05, 'samples': 21260288, 'steps': 41523, 'batch_loss/train': 0.7236459372797981} 12/28/2021 10:20:44 - INFO - codeparrot_training - Step 41524: {'lr': 3.486064589739893e-05, 'samples': 21260800, 'steps': 41524, 'batch_loss/train': 0.7053675716742873} 12/28/2021 10:20:55 - INFO - codeparrot_training - Step 41525: {'lr': 3.485261731831521e-05, 'samples': 21261312, 'steps': 41525, 'batch_loss/train': 0.8576425425708294} 12/28/2021 10:21:06 - INFO - codeparrot_training - Step 41526: {'lr': 3.484458959457618e-05, 'samples': 21261824, 'steps': 41526, 'batch_loss/train': 0.7228547120466828} 12/28/2021 10:21:16 - INFO - codeparrot_training - Step 41527: {'lr': 3.483656272621394e-05, 'samples': 21262336, 'steps': 41527, 'batch_loss/train': 0.6522757486090995} 12/28/2021 10:21:29 - INFO - codeparrot_training - Step 41528: {'lr': 3.482853671326022e-05, 'samples': 21262848, 'steps': 41528, 'batch_loss/train': 0.7025906909257174} 12/28/2021 10:21:39 - INFO - codeparrot_training - Step 41529: {'lr': 3.4820511555747004e-05, 'samples': 21263360, 'steps': 41529, 'batch_loss/train': 0.7774187680915929} 12/28/2021 10:21:50 - INFO - codeparrot_training - Step 41530: {'lr': 3.481248725370617e-05, 'samples': 21263872, 'steps': 41530, 'batch_loss/train': 0.7454806575551629} 12/28/2021 10:22:04 - INFO - codeparrot_training - Step 41531: 
{'lr': 3.4804463807169646e-05, 'samples': 21264384, 'steps': 41531, 'batch_loss/train': 0.7557276580482721} 12/28/2021 10:22:15 - INFO - codeparrot_training - Step 41532: {'lr': 3.479644121616932e-05, 'samples': 21264896, 'steps': 41532, 'batch_loss/train': 0.5602715175191406} 12/28/2021 10:22:26 - INFO - codeparrot_training - Step 41533: {'lr': 3.478841948073716e-05, 'samples': 21265408, 'steps': 41533, 'batch_loss/train': 0.653933810943272} 12/28/2021 10:22:38 - INFO - codeparrot_training - Step 41534: {'lr': 3.478039860090484e-05, 'samples': 21265920, 'steps': 41534, 'batch_loss/train': 0.7098502945154905} 12/28/2021 10:22:49 - INFO - codeparrot_training - Step 41535: {'lr': 3.4772378576704455e-05, 'samples': 21266432, 'steps': 41535, 'batch_loss/train': 0.7172465873882174} 12/28/2021 10:23:00 - INFO - codeparrot_training - Step 41536: {'lr': 3.476435940816788e-05, 'samples': 21266944, 'steps': 41536, 'batch_loss/train': 0.68351168371737} 12/28/2021 10:23:10 - INFO - codeparrot_training - Step 41537: {'lr': 3.475634109532685e-05, 'samples': 21267456, 'steps': 41537, 'batch_loss/train': 0.8156015304848552} 12/28/2021 10:23:22 - INFO - codeparrot_training - Step 41538: {'lr': 3.474832363821345e-05, 'samples': 21267968, 'steps': 41538, 'batch_loss/train': 0.6881545819342136} 12/28/2021 10:23:33 - INFO - codeparrot_training - Step 41539: {'lr': 3.4740307036859346e-05, 'samples': 21268480, 'steps': 41539, 'batch_loss/train': 0.6484912112355232} 12/28/2021 10:23:44 - INFO - codeparrot_training - Step 41540: {'lr': 3.473229129129654e-05, 'samples': 21268992, 'steps': 41540, 'batch_loss/train': 0.7659668801352382} 12/28/2021 10:23:58 - INFO - codeparrot_training - Step 41541: {'lr': 3.472427640155684e-05, 'samples': 21269504, 'steps': 41541, 'batch_loss/train': 0.7014773855917156} 12/28/2021 10:24:08 - INFO - codeparrot_training - Step 41542: {'lr': 3.471626236767214e-05, 'samples': 21270016, 'steps': 41542, 'batch_loss/train': 0.6956529356539249} 12/28/2021 10:24:19 - 
INFO - codeparrot_training - Step 41543: {'lr': 3.47082491896743e-05, 'samples': 21270528, 'steps': 41543, 'batch_loss/train': 0.7236835472285748} 12/28/2021 10:24:31 - INFO - codeparrot_training - Step 41544: {'lr': 3.4700236867595155e-05, 'samples': 21271040, 'steps': 41544, 'batch_loss/train': 0.7500525210052729} 12/28/2021 10:24:42 - INFO - codeparrot_training - Step 41545: {'lr': 3.4692225401466574e-05, 'samples': 21271552, 'steps': 41545, 'batch_loss/train': 0.7652287995442748} 12/28/2021 10:24:52 - INFO - codeparrot_training - Step 41546: {'lr': 3.468421479132042e-05, 'samples': 21272064, 'steps': 41546, 'batch_loss/train': 0.5832478639204055} 12/28/2021 10:25:03 - INFO - codeparrot_training - Step 41547: {'lr': 3.4676205037188574e-05, 'samples': 21272576, 'steps': 41547, 'batch_loss/train': 0.7136660672258586} 12/28/2021 10:25:15 - INFO - codeparrot_training - Step 41548: {'lr': 3.466819613910274e-05, 'samples': 21273088, 'steps': 41548, 'batch_loss/train': 0.7653752830810845} 12/28/2021 10:25:26 - INFO - codeparrot_training - Step 41549: {'lr': 3.4660188097094897e-05, 'samples': 21273600, 'steps': 41549, 'batch_loss/train': 0.6360893670062069} 12/28/2021 10:25:36 - INFO - codeparrot_training - Step 41550: {'lr': 3.465218091119687e-05, 'samples': 21274112, 'steps': 41550, 'batch_loss/train': 0.7892587996320799} 12/28/2021 10:25:51 - INFO - codeparrot_training - Step 41551: {'lr': 3.4644174581440425e-05, 'samples': 21274624, 'steps': 41551, 'batch_loss/train': 0.7434839960187674} 12/28/2021 10:26:01 - INFO - codeparrot_training - Step 41552: {'lr': 3.463616910785738e-05, 'samples': 21275136, 'steps': 41552, 'batch_loss/train': 0.7535849995911121} 12/28/2021 10:26:12 - INFO - codeparrot_training - Step 41553: {'lr': 3.46281644904797e-05, 'samples': 21275648, 'steps': 41553, 'batch_loss/train': 0.748363628052175} 12/28/2021 10:26:24 - INFO - codeparrot_training - Step 41554: {'lr': 3.4620160729339076e-05, 'samples': 21276160, 'steps': 41554, 
'batch_loss/train': 0.6835882076993585} 12/28/2021 10:26:35 - INFO - codeparrot_training - Step 41555: {'lr': 3.461215782446736e-05, 'samples': 21276672, 'steps': 41555, 'batch_loss/train': 0.7157702259719372} 12/28/2021 10:26:45 - INFO - codeparrot_training - Step 41556: {'lr': 3.4604155775896376e-05, 'samples': 21277184, 'steps': 41556, 'batch_loss/train': 0.7075220651458949} 12/28/2021 10:26:59 - INFO - codeparrot_training - Step 41557: {'lr': 3.4596154583657946e-05, 'samples': 21277696, 'steps': 41557, 'batch_loss/train': 0.7707954701036215} 12/28/2021 10:27:10 - INFO - codeparrot_training - Step 41558: {'lr': 3.458815424778386e-05, 'samples': 21278208, 'steps': 41558, 'batch_loss/train': 0.7908078380860388} 12/28/2021 10:27:21 - INFO - codeparrot_training - Step 41559: {'lr': 3.458015476830595e-05, 'samples': 21278720, 'steps': 41559, 'batch_loss/train': 0.7036960162222385} 12/28/2021 10:27:31 - INFO - codeparrot_training - Step 41560: {'lr': 3.457215614525599e-05, 'samples': 21279232, 'steps': 41560, 'batch_loss/train': 0.6793039643671364} 12/28/2021 10:27:43 - INFO - codeparrot_training - Step 41561: {'lr': 3.45641583786658e-05, 'samples': 21279744, 'steps': 41561, 'batch_loss/train': 0.7100477069616318} 12/28/2021 10:27:54 - INFO - codeparrot_training - Step 41562: {'lr': 3.455616146856724e-05, 'samples': 21280256, 'steps': 41562, 'batch_loss/train': 0.6856347140856087} 12/28/2021 10:28:04 - INFO - codeparrot_training - Step 41563: {'lr': 3.454816541499192e-05, 'samples': 21280768, 'steps': 41563, 'batch_loss/train': 0.7223341539502144} 12/28/2021 10:28:17 - INFO - codeparrot_training - Step 41564: {'lr': 3.454017021797184e-05, 'samples': 21281280, 'steps': 41564, 'batch_loss/train': 0.7427229294553399} 12/28/2021 10:28:27 - INFO - codeparrot_training - Step 41565: {'lr': 3.453217587753865e-05, 'samples': 21281792, 'steps': 41565, 'batch_loss/train': 0.7203288269229233} 12/28/2021 10:28:38 - INFO - codeparrot_training - Step 41566: {'lr': 
3.452418239372415e-05, 'samples': 21282304, 'steps': 41566, 'batch_loss/train': 0.7192516019567847} 12/28/2021 10:28:50 - INFO - codeparrot_training - Step 41567: {'lr': 3.451618976656018e-05, 'samples': 21282816, 'steps': 41567, 'batch_loss/train': 0.620019753521774} 12/28/2021 10:29:01 - INFO - codeparrot_training - Step 41568: {'lr': 3.450819799607843e-05, 'samples': 21283328, 'steps': 41568, 'batch_loss/train': 0.7212742543779314} 12/28/2021 10:29:11 - INFO - codeparrot_training - Step 41569: {'lr': 3.450020708231075e-05, 'samples': 21283840, 'steps': 41569, 'batch_loss/train': 0.6399565101601183} 12/28/2021 10:29:22 - INFO - codeparrot_training - Step 41570: {'lr': 3.449221702528885e-05, 'samples': 21284352, 'steps': 41570, 'batch_loss/train': 0.581018467550166} 12/28/2021 10:29:36 - INFO - codeparrot_training - Step 41571: {'lr': 3.448422782504454e-05, 'samples': 21284864, 'steps': 41571, 'batch_loss/train': 0.6070524178503547} 12/28/2021 10:29:47 - INFO - codeparrot_training - Step 41572: {'lr': 3.447623948160955e-05, 'samples': 21285376, 'steps': 41572, 'batch_loss/train': 0.7066384591162205} 12/28/2021 10:29:57 - INFO - codeparrot_training - Step 41573: {'lr': 3.4468251995015745e-05, 'samples': 21285888, 'steps': 41573, 'batch_loss/train': 0.7659238697960973} 12/28/2021 10:30:10 - INFO - codeparrot_training - Step 41574: {'lr': 3.4460265365294644e-05, 'samples': 21286400, 'steps': 41574, 'batch_loss/train': 0.7132376489462331} 12/28/2021 10:30:20 - INFO - codeparrot_training - Step 41575: {'lr': 3.4452279592478206e-05, 'samples': 21286912, 'steps': 41575, 'batch_loss/train': 0.7365926904603839} 12/28/2021 10:30:31 - INFO - codeparrot_training - Step 41576: {'lr': 3.4444294676598184e-05, 'samples': 21287424, 'steps': 41576, 'batch_loss/train': 0.7713529723696411} 12/28/2021 10:30:43 - INFO - codeparrot_training - Step 41577: {'lr': 3.443631061768618e-05, 'samples': 21287936, 'steps': 41577, 'batch_loss/train': 0.6968986308202147} 12/28/2021 10:30:54 - INFO 
- codeparrot_training - Step 41578: {'lr': 3.442832741577395e-05, 'samples': 21288448, 'steps': 41578, 'batch_loss/train': 0.643276626477018} 12/28/2021 10:31:04 - INFO - codeparrot_training - Step 41579: {'lr': 3.442034507089342e-05, 'samples': 21288960, 'steps': 41579, 'batch_loss/train': 0.6064038836630061} 12/28/2021 10:31:15 - INFO - codeparrot_training - Step 41580: {'lr': 3.4412363583076144e-05, 'samples': 21289472, 'steps': 41580, 'batch_loss/train': 0.7782243653200567} 12/28/2021 10:31:29 - INFO - codeparrot_training - Step 41581: {'lr': 3.44043829523539e-05, 'samples': 21289984, 'steps': 41581, 'batch_loss/train': 0.741127533139661} 12/28/2021 10:31:39 - INFO - codeparrot_training - Step 41582: {'lr': 3.4396403178758446e-05, 'samples': 21290496, 'steps': 41582, 'batch_loss/train': 0.751310970634222} 12/28/2021 10:31:50 - INFO - codeparrot_training - Step 41583: {'lr': 3.438842426232147e-05, 'samples': 21291008, 'steps': 41583, 'batch_loss/train': 0.7762007284909487} 12/28/2021 10:32:02 - INFO - codeparrot_training - Step 41584: {'lr': 3.4380446203074705e-05, 'samples': 21291520, 'steps': 41584, 'batch_loss/train': 0.7568511599674821} 12/28/2021 10:32:13 - INFO - codeparrot_training - Step 41585: {'lr': 3.437246900104987e-05, 'samples': 21292032, 'steps': 41585, 'batch_loss/train': 0.6449960693717003} 12/28/2021 10:32:23 - INFO - codeparrot_training - Step 41586: {'lr': 3.4364492656278665e-05, 'samples': 21292544, 'steps': 41586, 'batch_loss/train': 0.6619370910339057} 12/28/2021 10:32:37 - INFO - codeparrot_training - Step 41587: {'lr': 3.4356517168792845e-05, 'samples': 21293056, 'steps': 41587, 'batch_loss/train': 0.6511564991087653} 12/28/2021 10:32:48 - INFO - codeparrot_training - Step 41588: {'lr': 3.434854253862413e-05, 'samples': 21293568, 'steps': 41588, 'batch_loss/train': 0.6896169431274757} 12/28/2021 10:32:59 - INFO - codeparrot_training - Step 41589: {'lr': 3.4340568765804074e-05, 'samples': 21294080, 'steps': 41589, 'batch_loss/train': 
0.5788967196131125} 12/28/2021 10:33:11 - INFO - codeparrot_training - Step 41590: {'lr': 3.433259585036455e-05, 'samples': 21294592, 'steps': 41590, 'batch_loss/train': 0.7532752561382949} 12/28/2021 10:33:21 - INFO - codeparrot_training - Step 41591: {'lr': 3.432462379233722e-05, 'samples': 21295104, 'steps': 41591, 'batch_loss/train': 0.5728822260862216} 12/28/2021 10:33:32 - INFO - codeparrot_training - Step 41592: {'lr': 3.4316652591753736e-05, 'samples': 21295616, 'steps': 41592, 'batch_loss/train': 0.7287245257757604} 12/28/2021 10:33:43 - INFO - codeparrot_training - Step 41593: {'lr': 3.430868224864578e-05, 'samples': 21296128, 'steps': 41593, 'batch_loss/train': 0.6515525048598647} 12/28/2021 10:33:55 - INFO - codeparrot_training - Step 41594: {'lr': 3.430071276304506e-05, 'samples': 21296640, 'steps': 41594, 'batch_loss/train': 0.674888267647475} 12/28/2021 10:34:05 - INFO - codeparrot_training - Step 41595: {'lr': 3.4292744134983264e-05, 'samples': 21297152, 'steps': 41595, 'batch_loss/train': 0.7235937230288982} 12/28/2021 10:34:16 - INFO - codeparrot_training - Step 41596: {'lr': 3.428477636449206e-05, 'samples': 21297664, 'steps': 41596, 'batch_loss/train': 0.7189266262575984} 12/28/2021 10:34:30 - INFO - codeparrot_training - Step 41597: {'lr': 3.427680945160314e-05, 'samples': 21298176, 'steps': 41597, 'batch_loss/train': 0.7129168603569269} 12/28/2021 10:34:41 - INFO - codeparrot_training - Step 41598: {'lr': 3.4268843396348174e-05, 'samples': 21298688, 'steps': 41598, 'batch_loss/train': 0.6198351392522454} 12/28/2021 10:34:51 - INFO - codeparrot_training - Step 41599: {'lr': 3.426087819875884e-05, 'samples': 21299200, 'steps': 41599, 'batch_loss/train': 0.8147013857960701} 12/28/2021 10:35:03 - INFO - codeparrot_training - Step 41600: {'lr': 3.425291385886678e-05, 'samples': 21299712, 'steps': 41600, 'batch_loss/train': 0.7514394298195839} 12/28/2021 10:35:14 - INFO - codeparrot_training - Step 41601: {'lr': 3.424495037670369e-05, 'samples': 
21300224, 'steps': 41601, 'batch_loss/train': 0.8837468679994345} 12/28/2021 10:35:25 - INFO - codeparrot_training - Step 41602: {'lr': 3.4236987752301245e-05, 'samples': 21300736, 'steps': 41602, 'batch_loss/train': 0.6994913902599365} 12/28/2021 10:35:35 - INFO - codeparrot_training - Step 41603: {'lr': 3.422902598569102e-05, 'samples': 21301248, 'steps': 41603, 'batch_loss/train': 0.722947375557851} 12/28/2021 10:35:48 - INFO - codeparrot_training - Step 41604: {'lr': 3.422106507690467e-05, 'samples': 21301760, 'steps': 41604, 'batch_loss/train': 0.5024168840027414} 12/28/2021 10:35:59 - INFO - codeparrot_training - Step 41605: {'lr': 3.4213105025973994e-05, 'samples': 21302272, 'steps': 41605, 'batch_loss/train': 0.509930731728673} 12/28/2021 10:36:09 - INFO - codeparrot_training - Step 41606: {'lr': 3.420514583293047e-05, 'samples': 21302784, 'steps': 41606, 'batch_loss/train': 0.6758315288461745} 12/28/2021 10:36:23 - INFO - codeparrot_training - Step 41607: {'lr': 3.419718749780581e-05, 'samples': 21303296, 'steps': 41607, 'batch_loss/train': 0.4229970572050661} 12/28/2021 10:36:34 - INFO - codeparrot_training - Step 41608: {'lr': 3.418923002063165e-05, 'samples': 21303808, 'steps': 41608, 'batch_loss/train': 0.6634138058288954} 12/28/2021 10:36:45 - INFO - codeparrot_training - Step 41609: {'lr': 3.4181273401439614e-05, 'samples': 21304320, 'steps': 41609, 'batch_loss/train': 0.7650661491788924} 12/28/2021 10:36:55 - INFO - codeparrot_training - Step 41610: {'lr': 3.417331764026138e-05, 'samples': 21304832, 'steps': 41610, 'batch_loss/train': 0.7743208413012326} 12/28/2021 10:37:08 - INFO - codeparrot_training - Step 41611: {'lr': 3.4165362737128504e-05, 'samples': 21305344, 'steps': 41611, 'batch_loss/train': 0.6867004218511283} 12/28/2021 10:37:18 - INFO - codeparrot_training - Step 41612: {'lr': 3.4157408692072675e-05, 'samples': 21305856, 'steps': 41612, 'batch_loss/train': 0.73965975176543} 12/28/2021 10:37:29 - INFO - codeparrot_training - Step 41613: 
{'lr': 3.414945550512547e-05, 'samples': 21306368, 'steps': 41613, 'batch_loss/train': 0.7649435694329441} 12/28/2021 10:37:41 - INFO - codeparrot_training - Step 41614: {'lr': 3.414150317631859e-05, 'samples': 21306880, 'steps': 41614, 'batch_loss/train': 0.45833559962920845} 12/28/2021 10:37:52 - INFO - codeparrot_training - Step 41615: {'lr': 3.41335517056835e-05, 'samples': 21307392, 'steps': 41615, 'batch_loss/train': 0.7668561283499002} 12/28/2021 10:38:02 - INFO - codeparrot_training - Step 41616: {'lr': 3.412560109325191e-05, 'samples': 21307904, 'steps': 41616, 'batch_loss/train': 0.7088095592334867} 12/28/2021 10:38:13 - INFO - codeparrot_training - Step 41617: {'lr': 3.4117651339055504e-05, 'samples': 21308416, 'steps': 41617, 'batch_loss/train': 0.6881768628954887} 12/28/2021 10:38:27 - INFO - codeparrot_training - Step 41618: {'lr': 3.410970244312575e-05, 'samples': 21308928, 'steps': 41618, 'batch_loss/train': 0.7674330184236169} 12/28/2021 10:38:37 - INFO - codeparrot_training - Step 41619: {'lr': 3.410175440549429e-05, 'samples': 21309440, 'steps': 41619, 'batch_loss/train': 0.6819914559600875} 12/28/2021 10:38:48 - INFO - codeparrot_training - Step 41620: {'lr': 3.4093807226192737e-05, 'samples': 21309952, 'steps': 41620, 'batch_loss/train': 0.6294901357032359} 12/28/2021 10:39:00 - INFO - codeparrot_training - Step 41621: {'lr': 3.40858609052527e-05, 'samples': 21310464, 'steps': 41621, 'batch_loss/train': 0.6438775411807001} 12/28/2021 10:39:11 - INFO - codeparrot_training - Step 41622: {'lr': 3.407791544270575e-05, 'samples': 21310976, 'steps': 41622, 'batch_loss/train': 0.7001776807010174} 12/28/2021 10:39:21 - INFO - codeparrot_training - Step 41623: {'lr': 3.406997083858346e-05, 'samples': 21311488, 'steps': 41623, 'batch_loss/train': 0.6350125847384334} 12/28/2021 10:39:34 - INFO - codeparrot_training - Step 41624: {'lr': 3.406202709291745e-05, 'samples': 21312000, 'steps': 41624, 'batch_loss/train': 0.6917784647084773} 12/28/2021 10:39:45 - 
INFO - codeparrot_training - Step 41625: {'lr': 3.4054084205739274e-05, 'samples': 21312512, 'steps': 41625, 'batch_loss/train': 0.7685033748857677} 12/28/2021 10:39:55 - INFO - codeparrot_training - Step 41626: {'lr': 3.404614217708052e-05, 'samples': 21313024, 'steps': 41626, 'batch_loss/train': 0.6922330986708403} 12/28/2021 10:40:06 - INFO - codeparrot_training - Step 41627: {'lr': 3.4038201006972773e-05, 'samples': 21313536, 'steps': 41627, 'batch_loss/train': 0.45875386882107705} 12/28/2021 10:40:20 - INFO - codeparrot_training - Step 41628: {'lr': 3.403026069544765e-05, 'samples': 21314048, 'steps': 41628, 'batch_loss/train': 0.9093511867686175} 12/28/2021 10:40:31 - INFO - codeparrot_training - Step 41629: {'lr': 3.4022321242536594e-05, 'samples': 21314560, 'steps': 41629, 'batch_loss/train': 0.6939028832130134} 12/28/2021 10:40:42 - INFO - codeparrot_training - Step 41630: {'lr': 3.40143826482712e-05, 'samples': 21315072, 'steps': 41630, 'batch_loss/train': 0.7750792887527496} 12/28/2021 10:40:55 - INFO - codeparrot_training - Step 41631: {'lr': 3.400644491268318e-05, 'samples': 21315584, 'steps': 41631, 'batch_loss/train': 0.7607819233089685} 12/28/2021 10:41:05 - INFO - codeparrot_training - Step 41632: {'lr': 3.39985080358039e-05, 'samples': 21316096, 'steps': 41632, 'batch_loss/train': 0.6085472325794399} 12/28/2021 10:41:16 - INFO - codeparrot_training - Step 41633: {'lr': 3.399057201766495e-05, 'samples': 21316608, 'steps': 41633, 'batch_loss/train': 0.7401102874428034} 12/28/2021 10:41:27 - INFO - codeparrot_training - Step 41634: {'lr': 3.398263685829805e-05, 'samples': 21317120, 'steps': 41634, 'batch_loss/train': 0.7141675469465554} 12/28/2021 10:41:39 - INFO - codeparrot_training - Step 41635: {'lr': 3.397470255773455e-05, 'samples': 21317632, 'steps': 41635, 'batch_loss/train': 0.7164322668686509} 12/28/2021 10:41:49 - INFO - codeparrot_training - Step 41636: {'lr': 3.396676911600607e-05, 'samples': 21318144, 'steps': 41636, 'batch_loss/train': 
0.6967063075862825} 12/28/2021 10:42:00 - INFO - codeparrot_training - Step 41637: {'lr': 3.3958836533144136e-05, 'samples': 21318656, 'steps': 41637, 'batch_loss/train': 0.6747384397312999} 12/28/2021 10:42:14 - INFO - codeparrot_training - Step 41638: {'lr': 3.395090480918031e-05, 'samples': 21319168, 'steps': 41638, 'batch_loss/train': 0.7131575820967555} 12/28/2021 10:42:25 - INFO - codeparrot_training - Step 41639: {'lr': 3.394297394414611e-05, 'samples': 21319680, 'steps': 41639, 'batch_loss/train': 0.6681948634795845} 12/28/2021 10:42:35 - INFO - codeparrot_training - Step 41640: {'lr': 3.3935043938073116e-05, 'samples': 21320192, 'steps': 41640, 'batch_loss/train': 0.7001492194831371} 12/28/2021 10:42:46 - INFO - codeparrot_training - Step 41641: {'lr': 3.392711479099267e-05, 'samples': 21320704, 'steps': 41641, 'batch_loss/train': 0.796325109899044} 12/28/2021 10:42:58 - INFO - codeparrot_training - Step 41642: {'lr': 3.391918650293652e-05, 'samples': 21321216, 'steps': 41642, 'batch_loss/train': 0.73693577805534} 12/28/2021 10:43:09 - INFO - codeparrot_training - Step 41643: {'lr': 3.391125907393613e-05, 'samples': 21321728, 'steps': 41643, 'batch_loss/train': 0.6866414558608085} 12/28/2021 10:43:19 - INFO - codeparrot_training - Step 41644: {'lr': 3.390333250402294e-05, 'samples': 21322240, 'steps': 41644, 'batch_loss/train': 0.7059020238230005} 12/28/2021 10:43:32 - INFO - codeparrot_training - Step 41645: {'lr': 3.3895406793228504e-05, 'samples': 21322752, 'steps': 41645, 'batch_loss/train': 0.7368628005497158} 12/28/2021 10:43:42 - INFO - codeparrot_training - Step 41646: {'lr': 3.3887481941584306e-05, 'samples': 21323264, 'steps': 41646, 'batch_loss/train': 0.720416127354838} 12/28/2021 10:43:53 - INFO - codeparrot_training - Step 41647: {'lr': 3.38795579491219e-05, 'samples': 21323776, 'steps': 41647, 'batch_loss/train': 0.6946071879938245} 12/28/2021 10:44:06 - INFO - codeparrot_training - Step 41648: {'lr': 3.387163481587277e-05, 'samples': 
21324288, 'steps': 41648, 'batch_loss/train': 0.7125500324182212} 12/28/2021 10:44:17 - INFO - codeparrot_training - Step 41649: {'lr': 3.38637125418684e-05, 'samples': 21324800, 'steps': 41649, 'batch_loss/train': 0.6579812015406787} 12/28/2021 10:44:27 - INFO - codeparrot_training - Step 41650: {'lr': 3.385579112714032e-05, 'samples': 21325312, 'steps': 41650, 'batch_loss/train': 0.7548663383349776} 12/28/2021 10:44:38 - INFO - codeparrot_training - Step 41651: {'lr': 3.384787057171998e-05, 'samples': 21325824, 'steps': 41651, 'batch_loss/train': 0.6430167959770188} 12/28/2021 10:44:50 - INFO - codeparrot_training - Step 41652: {'lr': 3.38399508756389e-05, 'samples': 21326336, 'steps': 41652, 'batch_loss/train': 0.7938863169401884} 12/28/2021 10:45:01 - INFO - codeparrot_training - Step 41653: {'lr': 3.383203203892857e-05, 'samples': 21326848, 'steps': 41653, 'batch_loss/train': 0.7112267164047807} 12/28/2021 10:45:11 - INFO - codeparrot_training - Step 41654: {'lr': 3.38241140616205e-05, 'samples': 21327360, 'steps': 41654, 'batch_loss/train': 0.6519762079697102} 12/28/2021 10:45:23 - INFO - codeparrot_training - Step 41655: {'lr': 3.381619694374608e-05, 'samples': 21327872, 'steps': 41655, 'batch_loss/train': 0.7146370515692979} 12/28/2021 10:45:34 - INFO - codeparrot_training - Step 41656: {'lr': 3.3808280685336755e-05, 'samples': 21328384, 'steps': 41656, 'batch_loss/train': 0.671924838796258} 12/28/2021 10:45:45 - INFO - codeparrot_training - Step 41657: {'lr': 3.380036528642422e-05, 'samples': 21328896, 'steps': 41657, 'batch_loss/train': 0.7005132320336998} 12/28/2021 10:45:59 - INFO - codeparrot_training - Step 41658: {'lr': 3.379245074703971e-05, 'samples': 21329408, 'steps': 41658, 'batch_loss/train': 0.6520265061408281} 12/28/2021 10:46:09 - INFO - codeparrot_training - Step 41659: {'lr': 3.378453706721474e-05, 'samples': 21329920, 'steps': 41659, 'batch_loss/train': 0.7740650940686464} 12/28/2021 10:46:20 - INFO - codeparrot_training - Step 41660: 
{'lr': 3.377662424698091e-05, 'samples': 21330432, 'steps': 41660, 'batch_loss/train': 0.6402227770304307} 12/28/2021 10:46:30 - INFO - codeparrot_training - Step 41661: {'lr': 3.376871228636955e-05, 'samples': 21330944, 'steps': 41661, 'batch_loss/train': 0.7183920503593981} 12/28/2021 10:46:43 - INFO - codeparrot_training - Step 41662: {'lr': 3.376080118541214e-05, 'samples': 21331456, 'steps': 41662, 'batch_loss/train': 0.794546527788043} 12/28/2021 10:46:53 - INFO - codeparrot_training - Step 41663: {'lr': 3.375289094414011e-05, 'samples': 21331968, 'steps': 41663, 'batch_loss/train': 0.6699543287977576} 12/28/2021 10:47:04 - INFO - codeparrot_training - Step 41664: {'lr': 3.3744981562584956e-05, 'samples': 21332480, 'steps': 41664, 'batch_loss/train': 0.6818393818102777} 12/28/2021 10:47:18 - INFO - codeparrot_training - Step 41665: {'lr': 3.373707304077808e-05, 'samples': 21332992, 'steps': 41665, 'batch_loss/train': 0.792970722541213} 12/28/2021 10:47:28 - INFO - codeparrot_training - Step 41666: {'lr': 3.3729165378751006e-05, 'samples': 21333504, 'steps': 41666, 'batch_loss/train': 0.763794269412756} 12/28/2021 10:47:39 - INFO - codeparrot_training - Step 41667: {'lr': 3.372125857653502e-05, 'samples': 21334016, 'steps': 41667, 'batch_loss/train': 0.750882713124156} 12/28/2021 10:47:51 - INFO - codeparrot_training - Step 41668: {'lr': 3.37133526341617e-05, 'samples': 21334528, 'steps': 41668, 'batch_loss/train': 0.6594730257056653} 12/28/2021 10:48:02 - INFO - codeparrot_training - Step 41669: {'lr': 3.370544755166247e-05, 'samples': 21335040, 'steps': 41669, 'batch_loss/train': 0.8183435313403606} 12/28/2021 10:48:13 - INFO - codeparrot_training - Step 41670: {'lr': 3.369754332906858e-05, 'samples': 21335552, 'steps': 41670, 'batch_loss/train': 0.8662413582205772} 12/28/2021 10:48:23 - INFO - codeparrot_training - Step 41671: {'lr': 3.368963996641167e-05, 'samples': 21336064, 'steps': 41671, 'batch_loss/train': 0.7357504880055785} 12/28/2021 10:48:35 - 
INFO - codeparrot_training - Step 41672: {'lr': 3.36817374637231e-05, 'samples': 21336576, 'steps': 41672, 'batch_loss/train': 0.6448475755751133} 12/28/2021 10:48:46 - INFO - codeparrot_training - Step 41673: {'lr': 3.3673835821034225e-05, 'samples': 21337088, 'steps': 41673, 'batch_loss/train': 0.769943174906075} 12/28/2021 10:48:57 - INFO - codeparrot_training - Step 41674: {'lr': 3.36659350383765e-05, 'samples': 21337600, 'steps': 41674, 'batch_loss/train': 0.5877407299121842} 12/28/2021 10:49:09 - INFO - codeparrot_training - Step 41675: {'lr': 3.365803511578133e-05, 'samples': 21338112, 'steps': 41675, 'batch_loss/train': 0.8583541161497124} 12/28/2021 10:49:19 - INFO - codeparrot_training - Step 41676: {'lr': 3.365013605328013e-05, 'samples': 21338624, 'steps': 41676, 'batch_loss/train': 0.7722990745678544} 12/28/2021 10:49:30 - INFO - codeparrot_training - Step 41677: {'lr': 3.364223785090428e-05, 'samples': 21339136, 'steps': 41677, 'batch_loss/train': 0.7364085272420198} 12/28/2021 10:49:44 - INFO - codeparrot_training - Step 41678: {'lr': 3.3634340508685217e-05, 'samples': 21339648, 'steps': 41678, 'batch_loss/train': 0.7439810652285814} 12/28/2021 10:49:55 - INFO - codeparrot_training - Step 41679: {'lr': 3.3626444026654325e-05, 'samples': 21340160, 'steps': 41679, 'batch_loss/train': 0.6129724632482976} 12/28/2021 10:50:05 - INFO - codeparrot_training - Step 41680: {'lr': 3.361854840484296e-05, 'samples': 21340672, 'steps': 41680, 'batch_loss/train': 0.5335897258482873} 12/28/2021 10:50:16 - INFO - codeparrot_training - Step 41681: {'lr': 3.361065364328264e-05, 'samples': 21341184, 'steps': 41681, 'batch_loss/train': 0.6454820563085377} 12/28/2021 10:50:28 - INFO - codeparrot_training - Step 41682: {'lr': 3.3602759742004525e-05, 'samples': 21341696, 'steps': 41682, 'batch_loss/train': 0.6419020507310051} 12/28/2021 10:50:39 - INFO - codeparrot_training - Step 41683: {'lr': 3.3594866701040225e-05, 'samples': 21342208, 'steps': 41683, 'batch_loss/train': 
0.7363594932248816}
12/28/2021 10:50:49 - INFO - codeparrot_training - Step 41684: {'lr': 3.3586974520421005e-05, 'samples': 21342720, 'steps': 41684, 'batch_loss/train': 0.8245597584173083}
12/28/2021 10:51:01 - INFO - codeparrot_training - Step 41685: {'lr': 3.357908320017819e-05, 'samples': 21343232, 'steps': 41685, 'batch_loss/train': 0.7865561195649207}
12/28/2021 10:51:12 - INFO - codeparrot_training - Step 41686: {'lr': 3.357119274034331e-05, 'samples': 21343744, 'steps': 41686, 'batch_loss/train': 0.7336862650699914}
12/28/2021 10:51:22 - INFO - codeparrot_training - Step 41687: {'lr': 3.35633031409476e-05, 'samples': 21344256, 'steps': 41687, 'batch_loss/train': 0.7392761157825589}
12/28/2021 10:51:36 - INFO - codeparrot_training - Step 41688: {'lr': 3.35554144020225e-05, 'samples': 21344768, 'steps': 41688, 'batch_loss/train': 1.6893998505547643}
12/28/2021 10:51:47 - INFO - codeparrot_training - Step 41689: {'lr': 3.354752652359933e-05, 'samples': 21345280, 'steps': 41689, 'batch_loss/train': 0.8034251306671649}
12/28/2021 10:51:58 - INFO - codeparrot_training - Step 41690: {'lr': 3.3539639505709454e-05, 'samples': 21345792, 'steps': 41690, 'batch_loss/train': 0.6900888648815453}
12/28/2021 10:52:10 - INFO - codeparrot_training - Step 41691: {'lr': 3.3531753348384266e-05, 'samples': 21346304, 'steps': 41691, 'batch_loss/train': 0.7426247769035399}
12/28/2021 10:52:21 - INFO - codeparrot_training - Step 41692: {'lr': 3.352386805165511e-05, 'samples': 21346816, 'steps': 41692, 'batch_loss/train': 0.6217413037084043}
12/28/2021 10:52:31 - INFO - codeparrot_training - Step 41693: {'lr': 3.3515983615553245e-05, 'samples': 21347328, 'steps': 41693, 'batch_loss/train': 0.7075135419145226}
12/28/2021 10:52:42 - INFO - codeparrot_training - Step 41694: {'lr': 3.3508100040110126e-05, 'samples': 21347840, 'steps': 41694, 'batch_loss/train': 0.6957081460859627}
12/28/2021 10:52:56 - INFO - codeparrot_training - Step 41695: {'lr': 3.35002173253571e-05, 'samples': 21348352, 'steps': 41695, 'batch_loss/train': 0.7463557254523039}
12/28/2021 10:53:06 - INFO - codeparrot_training - Step 41696: {'lr': 3.349233547132535e-05, 'samples': 21348864, 'steps': 41696, 'batch_loss/train': 0.6860312717035413}
12/28/2021 10:53:17 - INFO - codeparrot_training - Step 41697: {'lr': 3.3484454478046403e-05, 'samples': 21349376, 'steps': 41697, 'batch_loss/train': 0.7428756440058351}
12/28/2021 10:53:29 - INFO - codeparrot_training - Step 41698: {'lr': 3.347657434555154e-05, 'samples': 21349888, 'steps': 41698, 'batch_loss/train': 0.7760354625061154}
12/28/2021 10:53:40 - INFO - codeparrot_training - Step 41699: {'lr': 3.346869507387204e-05, 'samples': 21350400, 'steps': 41699, 'batch_loss/train': 0.7693417686969042}
12/28/2021 10:53:50 - INFO - codeparrot_training - Step 41700: {'lr': 3.346081666303924e-05, 'samples': 21350912, 'steps': 41700, 'batch_loss/train': 0.7970671951770782}
12/28/2021 10:54:03 - INFO - codeparrot_training - Step 41701: {'lr': 3.345293911308445e-05, 'samples': 21351424, 'steps': 41701, 'batch_loss/train': 0.7514374516904354}
12/28/2021 10:54:13 - INFO - codeparrot_training - Step 41702: {'lr': 3.3445062424039024e-05, 'samples': 21351936, 'steps': 41702, 'batch_loss/train': 0.6127544129267335}
12/28/2021 10:54:24 - INFO - codeparrot_training - Step 41703: {'lr': 3.343718659593425e-05, 'samples': 21352448, 'steps': 41703, 'batch_loss/train': 0.7899465197697282}
12/28/2021 10:54:34 - INFO - codeparrot_training - Step 41704: {'lr': 3.342931162880147e-05, 'samples': 21352960, 'steps': 41704, 'batch_loss/train': 0.7654546359553933}
12/28/2021 10:54:48 - INFO - codeparrot_training - Step 41705: {'lr': 3.342143752267196e-05, 'samples': 21353472, 'steps': 41705, 'batch_loss/train': 0.7978151910938323}
12/28/2021 10:54:59 - INFO - codeparrot_training - Step 41706: {'lr': 3.341356427757705e-05, 'samples': 21353984, 'steps': 41706, 'batch_loss/train': 0.6954420374240726}
12/28/2021 10:55:10 - INFO - codeparrot_training - Step 41707: {'lr': 3.3405691893548e-05, 'samples': 21354496, 'steps': 41707, 'batch_loss/train': 1.0825394177809358}
12/28/2021 10:55:22 - INFO - codeparrot_training - Step 41708: {'lr': 3.339782037061617e-05, 'samples': 21355008, 'steps': 41708, 'batch_loss/train': 0.7299121045507491}
12/28/2021 10:55:33 - INFO - codeparrot_training - Step 41709: {'lr': 3.338994970881284e-05, 'samples': 21355520, 'steps': 41709, 'batch_loss/train': 0.6514819809235632}
12/28/2021 10:55:43 - INFO - codeparrot_training - Step 41710: {'lr': 3.3382079908169274e-05, 'samples': 21356032, 'steps': 41710, 'batch_loss/train': 0.9919869601726532}
12/28/2021 10:55:55 - INFO - codeparrot_training - Step 41711: {'lr': 3.337421096871668e-05, 'samples': 21356544, 'steps': 41711, 'batch_loss/train': 0.7106104120612144}
12/28/2021 10:56:06 - INFO - codeparrot_training - Step 41712: {'lr': 3.336634289048654e-05, 'samples': 21357056, 'steps': 41712, 'batch_loss/train': 0.6218902618857101}
12/28/2021 10:56:17 - INFO - codeparrot_training - Step 41713: {'lr': 3.335847567350997e-05, 'samples': 21357568, 'steps': 41713, 'batch_loss/train': 0.6905035865493119}
12/28/2021 10:56:31 - INFO - codeparrot_training - Step 41714: {'lr': 3.3350609317818305e-05, 'samples': 21358080, 'steps': 41714, 'batch_loss/train': 0.6852475134655833}
12/28/2021 10:56:42 - INFO - codeparrot_training - Step 41715: {'lr': 3.3342743823442794e-05, 'samples': 21358592, 'steps': 41715, 'batch_loss/train': 0.7687827246263623}
12/28/2021 10:56:53 - INFO - codeparrot_training - Step 41716: {'lr': 3.333487919041475e-05, 'samples': 21359104, 'steps': 41716, 'batch_loss/train': 0.8107953332364559}
12/28/2021 10:57:03 - INFO - codeparrot_training - Step 41717: {'lr': 3.332701541876543e-05, 'samples': 21359616, 'steps': 41717, 'batch_loss/train': 0.7546178954653442}
12/28/2021 10:57:16 - INFO - codeparrot_training - Step 41718: {'lr': 3.33191525085261e-05, 'samples': 21360128, 'steps': 41718, 'batch_loss/train': 0.7719386991811916}
12/28/2021 10:57:26 - INFO - codeparrot_training - Step 41719: {'lr': 3.331129045972792e-05, 'samples': 21360640, 'steps': 41719, 'batch_loss/train': 0.7577532255090773}
12/28/2021 10:57:37 - INFO - codeparrot_training - Step 41720: {'lr': 3.330342927240226e-05, 'samples': 21361152, 'steps': 41720, 'batch_loss/train': 0.726936852093786}
12/28/2021 10:57:50 - INFO - codeparrot_training - Step 41721: {'lr': 3.3295568946580426e-05, 'samples': 21361664, 'steps': 41721, 'batch_loss/train': 0.6539331017993391}
12/28/2021 10:58:01 - INFO - codeparrot_training - Step 41722: {'lr': 3.328770948229348e-05, 'samples': 21362176, 'steps': 41722, 'batch_loss/train': 0.6498598931357265}
12/28/2021 10:58:12 - INFO - codeparrot_training - Step 41723: {'lr': 3.3279850879572805e-05, 'samples': 21362688, 'steps': 41723, 'batch_loss/train': 0.6758797662332654}
12/28/2021 10:58:22 - INFO - codeparrot_training - Step 41724: {'lr': 3.3271993138449666e-05, 'samples': 21363200, 'steps': 41724, 'batch_loss/train': 0.5392313187476248}
12/28/2021 10:58:34 - INFO - codeparrot_training - Step 41725: {'lr': 3.326413625895522e-05, 'samples': 21363712, 'steps': 41725, 'batch_loss/train': 0.7907941835001111}
12/28/2021 10:58:45 - INFO - codeparrot_training - Step 41726: {'lr': 3.325628024112065e-05, 'samples': 21364224, 'steps': 41726, 'batch_loss/train': 0.5178714976354968}
12/28/2021 10:58:56 - INFO - codeparrot_training - Step 41727: {'lr': 3.3248425084977415e-05, 'samples': 21364736, 'steps': 41727, 'batch_loss/train': 0.623984138481319}
12/28/2021 10:59:08 - INFO - codeparrot_training - Step 41728: {'lr': 3.324057079055651e-05, 'samples': 21365248, 'steps': 41728, 'batch_loss/train': 0.5820913141360506}
12/28/2021 10:59:18 - INFO - codeparrot_training - Step 41729: {'lr': 3.3232717357889267e-05, 'samples': 21365760, 'steps': 41729, 'batch_loss/train': 0.7913073264062405}
12/28/2021 10:59:29 - INFO - codeparrot_training - Step 41730: {'lr': 3.3224864787006896e-05, 'samples': 21366272, 'steps': 41730, 'batch_loss/train': 0.7016260190866888}
12/28/2021 10:59:43 - INFO - codeparrot_training - Step 41731: {'lr': 3.321701307794059e-05, 'samples': 21366784, 'steps': 41731, 'batch_loss/train': 0.8657043948769569}
12/28/2021 10:59:54 - INFO - codeparrot_training - Step 41732: {'lr': 3.320916223072159e-05, 'samples': 21367296, 'steps': 41732, 'batch_loss/train': 0.6497496757656336}
12/28/2021 11:00:04 - INFO - codeparrot_training - Step 41733: {'lr': 3.3201312245381145e-05, 'samples': 21367808, 'steps': 41733, 'batch_loss/train': 0.720042422413826}
12/28/2021 11:00:15 - INFO - codeparrot_training - Step 41734: {'lr': 3.3193463121950375e-05, 'samples': 21368320, 'steps': 41734, 'batch_loss/train': 0.7176904110237956}
12/28/2021 11:00:27 - INFO - codeparrot_training - Step 41735: {'lr': 3.3185614860460596e-05, 'samples': 21368832, 'steps': 41735, 'batch_loss/train': 0.8056777352467179}
12/28/2021 11:00:37 - INFO - codeparrot_training - Step 41736: {'lr': 3.317776746094292e-05, 'samples': 21369344, 'steps': 41736, 'batch_loss/train': 0.6893313275068067}
12/28/2021 11:00:48 - INFO - codeparrot_training - Step 41737: {'lr': 3.316992092342849e-05, 'samples': 21369856, 'steps': 41737, 'batch_loss/train': 0.79159040376544}
12/28/2021 11:01:00 - INFO - codeparrot_training - Step 41738: {'lr': 3.3162075247948706e-05, 'samples': 21370368, 'steps': 41738, 'batch_loss/train': 0.6718965144827962}
12/28/2021 11:01:11 - INFO - codeparrot_training - Step 41739: {'lr': 3.315423043453455e-05, 'samples': 21370880, 'steps': 41739, 'batch_loss/train': 0.6776474744547158}
12/28/2021 11:01:21 - INFO - codeparrot_training - Step 41740: {'lr': 3.314638648321733e-05, 'samples': 21371392, 'steps': 41740, 'batch_loss/train': 0.7013206053525209}
12/28/2021 11:01:35 - INFO - codeparrot_training - Step 41741: {'lr': 3.313854339402819e-05, 'samples': 21371904, 'steps': 41741, 'batch_loss/train': 0.6175560620613396}
12/28/2021 11:01:46 - INFO - codeparrot_training - Step 41742: {'lr': 3.313070116699832e-05, 'samples': 21372416, 'steps': 41742, 'batch_loss/train': 0.7521521728485823}
12/28/2021 11:01:57 - INFO - codeparrot_training - Step 41743: {'lr': 3.312285980215887e-05, 'samples': 21372928, 'steps': 41743, 'batch_loss/train': 0.6875565480440855}
12/28/2021 11:02:09 - INFO - codeparrot_training - Step 41744: {'lr': 3.3115019299541055e-05, 'samples': 21373440, 'steps': 41744, 'batch_loss/train': 0.710025099106133}
12/28/2021 11:02:20 - INFO - codeparrot_training - Step 41745: {'lr': 3.310717965917603e-05, 'samples': 21373952, 'steps': 41745, 'batch_loss/train': 0.7733574183657765}
12/28/2021 11:02:30 - INFO - codeparrot_training - Step 41746: {'lr': 3.309934088109498e-05, 'samples': 21374464, 'steps': 41746, 'batch_loss/train': 0.7444104989990592}
12/28/2021 11:02:41 - INFO - codeparrot_training - Step 41747: {'lr': 3.309150296532909e-05, 'samples': 21374976, 'steps': 41747, 'batch_loss/train': 0.5237641632556915}
12/28/2021 11:02:53 - INFO - codeparrot_training - Step 41748: {'lr': 3.30836659119094e-05, 'samples': 21375488, 'steps': 41748, 'batch_loss/train': 0.7068630787543952}
12/28/2021 11:03:03 - INFO - codeparrot_training - Step 41749: {'lr': 3.307582972086717e-05, 'samples': 21376000, 'steps': 41749, 'batch_loss/train': 0.6989161800593138}
12/28/2021 11:03:14 - INFO - codeparrot_training - Step 41750: {'lr': 3.3067994392233595e-05, 'samples': 21376512, 'steps': 41750, 'batch_loss/train': 0.7546912566758692}
12/28/2021 11:03:28 - INFO - codeparrot_training - Step 41751: {'lr': 3.306015992603972e-05, 'samples': 21377024, 'steps': 41751, 'batch_loss/train': 0.6401650165207684}
12/28/2021 11:03:39 - INFO - codeparrot_training - Step 41752: {'lr': 3.305232632231667e-05, 'samples': 21377536, 'steps': 41752, 'batch_loss/train': 0.6591723151504993}
12/28/2021 11:03:49 - INFO - codeparrot_training - Step 41753: {'lr': 3.3044493581095765e-05, 'samples': 21378048, 'steps': 41753, 'batch_loss/train': 0.6327168283751234}
12/28/2021 11:04:01 - INFO - codeparrot_training - Step 41754: {'lr': 3.303666170240799e-05, 'samples': 21378560, 'steps': 41754, 'batch_loss/train': 0.655041717516724}
12/28/2021 11:04:12 - INFO - codeparrot_training - Step 41755: {'lr': 3.302883068628454e-05, 'samples': 21379072, 'steps': 41755, 'batch_loss/train': 0.8106603743508458}
12/28/2021 11:04:23 - INFO - codeparrot_training - Step 41756: {'lr': 3.30210005327565e-05, 'samples': 21379584, 'steps': 41756, 'batch_loss/train': 0.7723469990305603}
12/28/2021 11:04:33 - INFO - codeparrot_training - Step 41757: {'lr': 3.3013171241855044e-05, 'samples': 21380096, 'steps': 41757, 'batch_loss/train': 0.6199663656298071}
12/28/2021 11:04:45 - INFO - codeparrot_training - Step 41758: {'lr': 3.3005342813611286e-05, 'samples': 21380608, 'steps': 41758, 'batch_loss/train': 0.7488572428701445}
12/28/2021 11:04:56 - INFO - codeparrot_training - Step 41759: {'lr': 3.299751524805636e-05, 'samples': 21381120, 'steps': 41759, 'batch_loss/train': 0.6439883357379586}
12/28/2021 11:05:06 - INFO - codeparrot_training - Step 41760: {'lr': 3.298968854522139e-05, 'samples': 21381632, 'steps': 41760, 'batch_loss/train': 0.7059955094009638}
12/28/2021 11:05:21 - INFO - codeparrot_training - Step 41761: {'lr': 3.298186270513745e-05, 'samples': 21382144, 'steps': 41761, 'batch_loss/train': 0.7361024022102356}
12/28/2021 11:05:31 - INFO - codeparrot_training - Step 41762: {'lr': 3.297403772783575e-05, 'samples': 21382656, 'steps': 41762, 'batch_loss/train': 0.6956549333408475}
12/28/2021 11:05:42 - INFO - codeparrot_training - Step 41763: {'lr': 3.296621361334723e-05, 'samples': 21383168, 'steps': 41763, 'batch_loss/train': 0.5921090564224869}
12/28/2021 11:05:54 - INFO - codeparrot_training - Step 41764: {'lr': 3.295839036170317e-05, 'samples': 21383680, 'steps': 41764, 'batch_loss/train': 0.5587882734835148}
12/28/2021 11:06:05 - INFO - codeparrot_training - Step 41765: {'lr': 3.295056797293458e-05, 'samples': 21384192, 'steps': 41765, 'batch_loss/train': 0.6765157540794462}
12/28/2021 11:06:15 - INFO - codeparrot_training - Step 41766: {'lr': 3.2942746447072566e-05, 'samples': 21384704, 'steps': 41766, 'batch_loss/train': 0.7851464599370956}
12/28/2021 11:06:26 - INFO - codeparrot_training - Step 41767: {'lr': 3.293492578414822e-05, 'samples': 21385216, 'steps': 41767, 'batch_loss/train': 0.720063341781497}
12/28/2021 11:06:38 - INFO - codeparrot_training - Step 41768: {'lr': 3.2927105984192686e-05, 'samples': 21385728, 'steps': 41768, 'batch_loss/train': 0.7260022085392848}
12/28/2021 11:06:49 - INFO - codeparrot_training - Step 41769: {'lr': 3.291928704723699e-05, 'samples': 21386240, 'steps': 41769, 'batch_loss/train': 0.6827942421659827}
12/28/2021 11:06:59 - INFO - codeparrot_training - Step 41770: {'lr': 3.2911468973312235e-05, 'samples': 21386752, 'steps': 41770, 'batch_loss/train': 0.6572760839480907}
12/28/2021 11:07:13 - INFO - codeparrot_training - Step 41771: {'lr': 3.2903651762449525e-05, 'samples': 21387264, 'steps': 41771, 'batch_loss/train': 0.6535160690546036}
12/28/2021 11:07:24 - INFO - codeparrot_training - Step 41772: {'lr': 3.2895835414679914e-05, 'samples': 21387776, 'steps': 41772, 'batch_loss/train': 0.8635183153674006}
12/28/2021 11:07:34 - INFO - codeparrot_training - Step 41773: {'lr': 3.288801993003454e-05, 'samples': 21388288, 'steps': 41773, 'batch_loss/train': 0.7804685281589627}
12/28/2021 11:07:47 - INFO - codeparrot_training - Step 41774: {'lr': 3.28802053085443e-05, 'samples': 21388800, 'steps': 41774, 'batch_loss/train': 0.7670624041929841}
12/28/2021 11:07:57 - INFO - codeparrot_training - Step 41775: {'lr': 3.2872391550240466e-05, 'samples': 21389312, 'steps': 41775, 'batch_loss/train': 0.7559237948153168}
12/28/2021 11:08:08 - INFO - codeparrot_training - Step 41776: {'lr': 3.286457865515405e-05, 'samples': 21389824, 'steps': 41776, 'batch_loss/train': 0.6848525386303663}
12/28/2021 11:08:19 - INFO - codeparrot_training - Step 41777: {'lr': 3.285676662331605e-05, 'samples': 21390336, 'steps': 41777, 'batch_loss/train': 0.6635421039536595}
12/28/2021 11:08:33 - INFO - codeparrot_training - Step 41778: {'lr': 3.284895545475747e-05, 'samples': 21390848, 'steps': 41778, 'batch_loss/train': 0.7309759529307485}
12/28/2021 11:08:43 - INFO - codeparrot_training - Step 41779: {'lr': 3.2841145149509585e-05, 'samples': 21391360, 'steps': 41779, 'batch_loss/train': 0.7447129711508751}
12/28/2021 11:08:54 - INFO - codeparrot_training - Step 41780: {'lr': 3.2833335707603245e-05, 'samples': 21391872, 'steps': 41780, 'batch_loss/train': 0.5116026933828834}
12/28/2021 11:09:06 - INFO - codeparrot_training - Step 41781: {'lr': 3.282552712906955e-05, 'samples': 21392384, 'steps': 41781, 'batch_loss/train': 0.6821883958764374}
12/28/2021 11:09:17 - INFO - codeparrot_training - Step 41782: {'lr': 3.28177194139396e-05, 'samples': 21392896, 'steps': 41782, 'batch_loss/train': 0.688403346342966}
12/28/2021 11:09:27 - INFO - codeparrot_training - Step 41783: {'lr': 3.280991256224436e-05, 'samples': 21393408, 'steps': 41783, 'batch_loss/train': 0.6858189541380852}
12/28/2021 11:09:39 - INFO - codeparrot_training - Step 41784: {'lr': 3.280210657401489e-05, 'samples': 21393920, 'steps': 41784, 'batch_loss/train': 0.6862544011091813}
12/28/2021 11:09:50 - INFO - codeparrot_training - Step 41785: {'lr': 3.279430144928225e-05, 'samples': 21394432, 'steps': 41785, 'batch_loss/train': 0.7350833611562848}
12/28/2021 11:10:01 - INFO - codeparrot_training - Step 41786: {'lr': 3.2786497188077466e-05, 'samples': 21394944, 'steps': 41786, 'batch_loss/train': 0.709492432884872}
12/28/2021 11:10:13 - INFO - codeparrot_training - Step 41787: {'lr': 3.277869379043152e-05, 'samples': 21395456, 'steps': 41787, 'batch_loss/train': 0.8430783078074455}
12/28/2021 11:10:23 - INFO - codeparrot_training - Step 41788: {'lr': 3.277089125637556e-05, 'samples': 21395968, 'steps': 41788, 'batch_loss/train': 0.6663887738250196}
12/28/2021 11:10:34 - INFO - codeparrot_training - Step 41789: {'lr': 3.2763089585940376e-05, 'samples': 21396480, 'steps': 41789, 'batch_loss/train': 0.8156125424429774}
12/28/2021 11:10:45 - INFO - codeparrot_training - Step 41790: {'lr': 3.275528877915726e-05, 'samples': 21396992, 'steps': 41790, 'batch_loss/train': 0.6789877903647721}
12/28/2021 11:10:59 - INFO - codeparrot_training - Step 41791: {'lr': 3.274748883605699e-05, 'samples': 21397504, 'steps': 41791, 'batch_loss/train': 0.6887303427793086}
12/28/2021 11:11:09 - INFO - codeparrot_training - Step 41792: {'lr': 3.273968975667072e-05, 'samples': 21398016, 'steps': 41792, 'batch_loss/train': 0.846010793466121}
12/28/2021 11:11:20 - INFO - codeparrot_training - Step 41793: {'lr': 3.273189154102937e-05, 'samples': 21398528, 'steps': 41793, 'batch_loss/train': 0.6988995354622602}
12/28/2021 11:11:32 - INFO - codeparrot_training - Step 41794: {'lr': 3.2724094189164e-05, 'samples': 21399040, 'steps': 41794, 'batch_loss/train': 0.7448417304549366}
12/28/2021 11:11:43 - INFO - codeparrot_training - Step 41795: {'lr': 3.27162977011056e-05, 'samples': 21399552, 'steps': 41795, 'batch_loss/train': 0.7791194226592779}
12/28/2021 11:11:54 - INFO - codeparrot_training - Step 41796: {'lr': 3.270850207688514e-05, 'samples': 21400064, 'steps': 41796, 'batch_loss/train': 0.6591093372553587}
12/28/2021 11:12:06 - INFO - codeparrot_training - Step 41797: {'lr': 3.2700707316533654e-05, 'samples': 21400576, 'steps': 41797, 'batch_loss/train': 0.8866951148957014}
12/28/2021 11:12:16 - INFO - codeparrot_training - Step 41798: {'lr': 3.269291342008207e-05, 'samples': 21401088, 'steps': 41798, 'batch_loss/train': 0.6653291108086705}
12/28/2021 11:12:27 - INFO - codeparrot_training - Step 41799: {'lr': 3.2685120387561495e-05, 'samples': 21401600, 'steps': 41799, 'batch_loss/train': 0.6315810601226985}
12/28/2021 11:12:38 - INFO - codeparrot_training - Step 41800: {'lr': 3.267732821900271e-05, 'samples': 21402112, 'steps': 41800, 'batch_loss/train': 1.065648791845888}
12/28/2021 11:12:52 - INFO - codeparrot_training - Step 41801: {'lr': 3.266953691443686e-05, 'samples': 21402624, 'steps': 41801, 'batch_loss/train': 0.917663287371397}
12/28/2021 11:13:02 - INFO - codeparrot_training - Step 41802: {'lr': 3.2661746473894915e-05, 'samples': 21403136, 'steps': 41802, 'batch_loss/train': 0.8683682456612587}
12/28/2021 11:13:13 - INFO - codeparrot_training - Step 41803: {'lr': 3.2653956897407774e-05, 'samples': 21403648, 'steps': 41803, 'batch_loss/train': 0.6885051636490971}
12/28/2021 11:13:25 - INFO - codeparrot_training - Step 41804: {'lr': 3.264616818500637e-05, 'samples': 21404160, 'steps': 41804, 'batch_loss/train': 0.7439057808369398}
12/28/2021 11:13:36 - INFO - codeparrot_training - Step 41805: {'lr': 3.263838033672184e-05, 'samples': 21404672, 'steps': 41805, 'batch_loss/train': 0.8016996365040541}
12/28/2021 11:13:46 - INFO - codeparrot_training - Step 41806: {'lr': 3.263059335258497e-05, 'samples': 21405184, 'steps': 41806, 'batch_loss/train': 0.7001268188469112}
12/28/2021 11:14:00 - INFO - codeparrot_training - Step 41807: {'lr': 3.262280723262673e-05, 'samples': 21405696, 'steps': 41807, 'batch_loss/train': 0.6539054298773408}
12/28/2021 11:14:11 - INFO - codeparrot_training - Step 41808: {'lr': 3.261502197687827e-05, 'samples': 21406208, 'steps': 41808, 'batch_loss/train': 0.7449217475950718}
12/28/2021 11:14:22 - INFO - codeparrot_training - Step 41809: {'lr': 3.260723758537032e-05, 'samples': 21406720, 'steps': 41809, 'batch_loss/train': 0.8194195628166199}
12/28/2021 11:14:34 - INFO - codeparrot_training - Step 41810: {'lr': 3.259945405813392e-05, 'samples': 21407232, 'steps': 41810, 'batch_loss/train': 0.7091462928801775}
12/28/2021 11:14:44 - INFO - codeparrot_training - Step 41811: {'lr': 3.2591671395199975e-05, 'samples': 21407744, 'steps': 41811, 'batch_loss/train': 0.7157964073121548}
12/28/2021 11:14:55 - INFO - codeparrot_training - Step 41812: {'lr': 3.258388959659947e-05, 'samples': 21408256, 'steps': 41812, 'batch_loss/train': 0.7150787242571823}
12/28/2021 11:15:05 - INFO - codeparrot_training - Step 41813: {'lr': 3.257610866236335e-05, 'samples': 21408768, 'steps': 41813, 'batch_loss/train': 0.7850616322830319}
12/28/2021 11:15:18 - INFO - codeparrot_training - Step 41814: {'lr': 3.256832859252257e-05, 'samples': 21409280, 'steps': 41814, 'batch_loss/train': 0.6858817194588482}
12/28/2021 11:15:28 - INFO - codeparrot_training - Step 41815: {'lr': 3.2560549387107896e-05, 'samples': 21409792, 'steps': 41815, 'batch_loss/train': 0.7166521803010255}
12/28/2021 11:15:39 - INFO - codeparrot_training - Step 41816: {'lr': 3.2552771046150426e-05, 'samples': 21410304, 'steps': 41816, 'batch_loss/train': 0.7486347649246454}
12/28/2021 11:15:53 - INFO - codeparrot_training - Step 41817: {'lr': 3.2544993569681104e-05, 'samples': 21410816, 'steps': 41817, 'batch_loss/train': 0.6676380494609475}
12/28/2021 11:16:03 - INFO - codeparrot_training - Step 41818: {'lr': 3.2537216957730654e-05, 'samples': 21411328, 'steps': 41818, 'batch_loss/train': 0.7744477177038789}
12/28/2021 11:16:14 - INFO - codeparrot_training - Step 41819: {'lr': 3.252944121033025e-05, 'samples': 21411840, 'steps': 41819, 'batch_loss/train': 0.676997919101268}
12/28/2021 11:16:26 - INFO - codeparrot_training - Step 41820: {'lr': 3.2521666327510606e-05, 'samples': 21412352, 'steps': 41820, 'batch_loss/train': 0.6637225586455315}
12/28/2021 11:16:37 - INFO - codeparrot_training - Step 41821: {'lr': 3.251389230930271e-05, 'samples': 21412864, 'steps': 41821, 'batch_loss/train': 0.7255173800513148}
12/28/2021 11:16:48 - INFO - codeparrot_training - Step 41822: {'lr': 3.250611915573745e-05, 'samples': 21413376, 'steps': 41822, 'batch_loss/train': 0.6782847158610821}
12/28/2021 11:16:58 - INFO - codeparrot_training - Step 41823: {'lr': 3.249834686684572e-05, 'samples': 21413888, 'steps': 41823, 'batch_loss/train': 0.9891242564772256}
12/28/2021 11:17:11 - INFO - codeparrot_training - Step 41824: {'lr': 3.249057544265846e-05, 'samples': 21414400, 'steps': 41824, 'batch_loss/train': 0.5498278585146181}
12/28/2021 11:17:21 - INFO - codeparrot_training - Step 41825: {'lr': 3.24828048832066e-05, 'samples': 21414912, 'steps': 41825, 'batch_loss/train': 0.6544965747743845}
12/28/2021 11:17:32 - INFO - codeparrot_training - Step 41826: {'lr': 3.247503518852085e-05, 'samples': 21415424, 'steps': 41826, 'batch_loss/train': 0.5842925147153437}
12/28/2021 11:17:44 - INFO - codeparrot_training - Step 41827: {'lr': 3.2467266358632287e-05, 'samples': 21415936, 'steps': 41827, 'batch_loss/train': 0.7163987052626908}
12/28/2021 11:17:55 - INFO - codeparrot_training - Step 41828: {'lr': 3.245949839357182e-05, 'samples': 21416448, 'steps': 41828, 'batch_loss/train': 0.6820010272786021}
12/28/2021 11:18:05 - INFO - codeparrot_training - Step 41829: {'lr': 3.2451731293370144e-05, 'samples': 21416960, 'steps': 41829, 'batch_loss/train': 0.6493507409468293}
12/28/2021 11:18:19 - INFO - codeparrot_training - Step 41830: {'lr': 3.2443965058058214e-05, 'samples': 21417472, 'steps': 41830, 'batch_loss/train': 0.6576981833204627}
12/28/2021 11:18:30 - INFO - codeparrot_training - Step 41831: {'lr': 3.2436199687667036e-05, 'samples': 21417984, 'steps': 41831, 'batch_loss/train': 0.907519212923944}
12/28/2021 11:18:41 - INFO - codeparrot_training - Step 41832: {'lr': 3.2428435182227326e-05, 'samples': 21418496, 'steps': 41832, 'batch_loss/train': 0.8451500781811774}
12/28/2021 11:18:51 - INFO - codeparrot_training - Step 41833: {'lr': 3.2420671541769944e-05, 'samples': 21419008, 'steps': 41833, 'batch_loss/train': 0.610056625213474}
12/28/2021 11:19:03 - INFO - codeparrot_training - Step 41834: {'lr': 3.241290876632594e-05, 'samples': 21419520, 'steps': 41834, 'batch_loss/train': 0.4907033514173236}
12/28/2021 11:19:14 - INFO - codeparrot_training - Step 41835: {'lr': 3.240514685592599e-05, 'samples': 21420032, 'steps': 41835, 'batch_loss/train': 0.6298485505394638}
12/28/2021 11:19:25 - INFO - codeparrot_training - Step 41836: {'lr': 3.239738581060103e-05, 'samples': 21420544, 'steps': 41836, 'batch_loss/train': 0.6606026692315936}
12/28/2021 11:19:37 - INFO - codeparrot_training - Step 41837: {'lr': 3.238962563038189e-05, 'samples': 21421056, 'steps': 41837, 'batch_loss/train': 0.6803353005088866}
12/28/2021 11:19:47 - INFO - codeparrot_training - Step 41838: {'lr': 3.238186631529943e-05, 'samples': 21421568, 'steps': 41838, 'batch_loss/train': 1.4896519221365452}
12/28/2021 11:19:58 - INFO - codeparrot_training - Step 41839: {'lr': 3.23741078653845e-05, 'samples': 21422080, 'steps': 41839, 'batch_loss/train': 0.5709012541919947}
12/28/2021 11:20:12 - INFO - codeparrot_training - Step 41840: {'lr': 3.2366350280668005e-05, 'samples': 21422592, 'steps': 41840, 'batch_loss/train': 0.723574265895877}
12/28/2021 11:20:23 - INFO - codeparrot_training - Step 41841: {'lr': 3.235859356118062e-05, 'samples': 21423104, 'steps': 41841, 'batch_loss/train': 0.5943883988074958}
12/28/2021 11:20:33 - INFO - codeparrot_training - Step 41842: {'lr': 3.2350837706953325e-05, 'samples': 21423616, 'steps': 41842, 'batch_loss/train': 0.8199374433606863}
12/28/2021 11:20:44 - INFO - codeparrot_training - Step 41843: {'lr': 3.234308271801698e-05, 'samples': 21424128, 'steps': 41843, 'batch_loss/train': 0.6109979720786214}
12/28/2021 11:20:56 - INFO - codeparrot_training - Step 41844: {'lr': 3.2335328594402245e-05, 'samples': 21424640, 'steps': 41844, 'batch_loss/train': 0.7386167556978762}
12/28/2021 11:21:07 - INFO - codeparrot_training - Step 41845: {'lr': 3.232757533614017e-05, 'samples': 21425152, 'steps': 41845, 'batch_loss/train': 0.6577914687804878}
12/28/2021 11:21:17 - INFO - codeparrot_training - Step 41846: {'lr': 3.2319822943261415e-05, 'samples': 21425664, 'steps': 41846, 'batch_loss/train': 0.7020851233974099}
12/28/2021 11:21:31 - INFO - codeparrot_training - Step 41847: {'lr': 3.231207141579684e-05, 'samples': 21426176, 'steps': 41847, 'batch_loss/train': 0.6649303804151714}
12/28/2021 11:21:42 - INFO - codeparrot_training - Step 41848: {'lr': 3.2304320753777286e-05, 'samples': 21426688, 'steps': 41848, 'batch_loss/train': 0.7378305410966277}
12/28/2021 11:21:52 - INFO - codeparrot_training - Step 41849: {'lr': 3.229657095723354e-05, 'samples': 21427200, 'steps': 41849, 'batch_loss/train': 0.644118650816381}
12/28/2021 11:22:05 - INFO - codeparrot_training - Step 41850: {'lr': 3.228882202619643e-05, 'samples': 21427712, 'steps': 41850, 'batch_loss/train': 0.7200832781381905}
12/28/2021 11:22:15 - INFO - codeparrot_training - Step 41851: {'lr': 3.2281073960696764e-05, 'samples': 21428224, 'steps': 41851, 'batch_loss/train': 0.8013863041996956}
12/28/2021 11:22:26 - INFO - codeparrot_training - Step 41852: {'lr': 3.227332676076536e-05, 'samples': 21428736, 'steps': 41852, 'batch_loss/train': 0.7368810009211302}
12/28/2021 11:22:38 - INFO - codeparrot_training - Step 41853: {'lr': 3.226558042643296e-05, 'samples': 21429248, 'steps': 41853, 'batch_loss/train': 0.7976217579562217}
12/28/2021 11:22:49 - INFO - codeparrot_training - Step 41854: {'lr': 3.2257834957730464e-05, 'samples': 21429760, 'steps': 41854, 'batch_loss/train': 0.7557666962966323}
12/28/2021 11:22:59 - INFO - codeparrot_training - Step 41855: {'lr': 3.225009035468851e-05, 'samples': 21430272, 'steps': 41855, 'batch_loss/train': 0.7692720210179687}
12/28/2021 11:23:10 - INFO - codeparrot_training - Step 41856: {'lr': 3.224234661733802e-05, 'samples': 21430784, 'steps': 41856, 'batch_loss/train': 0.5566669620748144}
12/28/2021 11:23:24 - INFO - codeparrot_training - Step 41857: {'lr': 3.22346037457098e-05, 'samples': 21431296, 'steps': 41857, 'batch_loss/train': 0.9297602036967874}
12/28/2021 11:23:34 - INFO - codeparrot_training - Step 41858: {'lr': 3.22268617398345e-05, 'samples': 21431808, 'steps': 41858, 'batch_loss/train': 0.7125035915523767}
12/28/2021 11:23:45 - INFO - codeparrot_training - Step 41859: {'lr': 3.221912059974291e-05, 'samples': 21432320, 'steps': 41859, 'batch_loss/train': 0.7037900503491983}
12/28/2021 11:23:57 - INFO - codeparrot_training - Step 41860: {'lr': 3.2211380325466e-05, 'samples': 21432832, 'steps': 41860, 'batch_loss/train': 0.8468417413532734}
12/28/2021 11:24:08 - INFO - codeparrot_training - Step 41861: {'lr': 3.2203640917034346e-05, 'samples': 21433344, 'steps': 41861, 'batch_loss/train': 0.6989048479590565}
12/28/2021 11:24:18 - INFO - codeparrot_training - Step 41862: {'lr': 3.219590237447878e-05, 'samples': 21433856, 'steps': 41862, 'batch_loss/train': 0.7399593316949904}
12/28/2021 11:24:30 - INFO - codeparrot_training - Step 41863: {'lr': 3.218816469783004e-05, 'samples': 21434368, 'steps': 41863, 'batch_loss/train': 0.7189590008929372}
12/28/2021 11:24:41 - INFO - codeparrot_training - Step 41864: {'lr': 3.2180427887118946e-05, 'samples': 21434880, 'steps': 41864, 'batch_loss/train': 0.723692902829498}
12/28/2021 11:24:51 - INFO - codeparrot_training - Step 41865: {'lr': 3.217269194237621e-05, 'samples': 21435392, 'steps': 41865, 'batch_loss/train': 0.7216054555028677}
12/28/2021 11:25:06 - INFO - codeparrot_training - Step 41866: {'lr': 3.216495686363266e-05, 'samples': 21435904, 'steps': 41866, 'batch_loss/train': 0.6948236444150098}
12/28/2021 11:25:16 - INFO - codeparrot_training - Step 41867: {'lr': 3.2157222650918885e-05, 'samples': 21436416, 'steps': 41867, 'batch_loss/train': 0.816343781829346}
12/28/2021 11:25:27 - INFO - codeparrot_training - Step 41868: {'lr': 3.2149489304265786e-05, 'samples': 21436928, 'steps': 41868, 'batch_loss/train': 0.786481544142589}
12/28/2021 11:25:37 - INFO - codeparrot_training - Step 41869: {'lr': 3.2141756823704126e-05, 'samples': 21437440, 'steps': 41869, 'batch_loss/train': 0.6548048508702777}
12/28/2021 11:25:50 - INFO - codeparrot_training - Step 41870: {'lr': 3.213402520926448e-05, 'samples': 21437952, 'steps': 41870, 'batch_loss/train': 0.5440617083513644}
12/28/2021 11:26:01 - INFO - codeparrot_training - Step 41871: {'lr': 3.2126294460977754e-05, 'samples': 21438464, 'steps': 41871, 'batch_loss/train': 0.7231053402647376}
12/28/2021 11:26:11 - INFO - codeparrot_training - Step 41872: {'lr': 3.2118564578874604e-05, 'samples': 21438976, 'steps': 41872, 'batch_loss/train': 0.7190522474702448}
12/28/2021 11:26:23 - INFO - codeparrot_training - Step 41873: {'lr': 3.211083556298575e-05, 'samples': 21439488, 'steps': 41873, 'batch_loss/train': 0.4663205978577025}
12/28/2021 11:26:34 - INFO - codeparrot_training - Step 41874: {'lr': 3.210310741334196e-05, 'samples': 21440000, 'steps': 41874, 'batch_loss/train': 0.7644684142433107}
12/28/2021 11:26:44 - INFO - codeparrot_training - Step 41875: {'lr': 3.209538012997393e-05, 'samples': 21440512, 'steps': 41875, 'batch_loss/train': 0.73669399227947}
12/28/2021 11:26:58 - INFO - codeparrot_training - Step 41876: {'lr': 3.208765371291239e-05, 'samples': 21441024, 'steps': 41876, 'batch_loss/train': 0.6896932595409453}
12/28/2021 11:27:08 - INFO - codeparrot_training - Step 41877: {'lr': 3.207992816218805e-05, 'samples': 21441536, 'steps': 41877, 'batch_loss/train': 0.7024845089763403}
12/28/2021 11:27:19 - INFO - codeparrot_training - Step 41878: {'lr': 3.207220347783163e-05, 'samples': 21442048, 'steps': 41878, 'batch_loss/train': 0.6696505798026919}
12/28/2021 11:27:30 - INFO - codeparrot_training - Step 41879: {'lr': 3.2064479659873855e-05, 'samples': 21442560, 'steps': 41879, 'batch_loss/train': 0.7037370875477791}
12/28/2021 11:27:42 - INFO - codeparrot_training - Step 41880: {'lr': 3.2056756708345486e-05, 'samples': 21443072, 'steps': 41880, 'batch_loss/train': 0.6928228511242196}
12/28/2021 11:27:52 - INFO - codeparrot_training - Step 41881: {'lr': 3.204903462327702e-05, 'samples': 21443584, 'steps': 41881, 'batch_loss/train': 0.7113428660668433}
12/28/2021 11:28:03 - INFO - codeparrot_training - Step 41882: {'lr': 3.2041313404699376e-05, 'samples': 21444096, 'steps': 41882, 'batch_loss/train': 0.6884794444777071}
12/28/2021 11:28:17 - INFO - codeparrot_training - Step 41883: {'lr': 3.2033593052643225e-05, 'samples': 21444608, 'steps': 41883, 'batch_loss/train': 0.7550574047490954}
12/28/2021 11:28:28 - INFO - codeparrot_training - Step 41884: {'lr': 3.202587356713915e-05, 'samples': 21445120, 'steps': 41884, 'batch_loss/train': 0.6788158467970788}
12/28/2021 11:28:38 - INFO - codeparrot_training - Step 41885: {'lr': 3.201815494821783e-05, 'samples': 21445632, 'steps': 41885, 'batch_loss/train': 0.6465384536422789}
12/28/2021 11:28:50 - INFO - codeparrot_training - Step 41886: {'lr': 3.201043719591015e-05, 'samples': 21446144, 'steps': 41886, 'batch_loss/train': 0.6882705690222792}
12/28/2021 11:29:01 - INFO - codeparrot_training - Step 41887: {'lr': 3.200272031024659e-05, 'samples': 21446656, 'steps': 41887, 'batch_loss/train': 0.6339459273440298}
12/28/2021 11:29:12 - INFO - codeparrot_training - Step 41888: {'lr': 3.19950042912579e-05, 'samples': 21447168, 'steps': 41888, 'batch_loss/train': 0.7151960977353156}
12/28/2021 11:29:24 - INFO - codeparrot_training - Step 41889: {'lr': 3.198728913897478e-05, 'samples': 21447680, 'steps': 41889, 'batch_loss/train': 0.6735725114122033}
12/28/2021 11:29:35 - INFO - codeparrot_training - Step 41890: {'lr': 3.1979574853427867e-05, 'samples': 21448192, 'steps': 41890, 'batch_loss/train': 0.6691383798606694}
12/28/2021 11:29:45 - INFO - codeparrot_training - Step 41891: {'lr': 3.197186143464784e-05, 'samples': 21448704, 'steps': 41891, 'batch_loss/train': 0.6249335269676521}
12/28/2021 11:29:56 - INFO - codeparrot_training - Step 41892: {'lr': 3.196414888266536e-05, 'samples': 21449216, 'steps': 41892, 'batch_loss/train': 0.7190344082191586}
12/28/2021 11:30:10 - INFO - codeparrot_training - Step 41893: {'lr': 3.1956437197511094e-05, 'samples': 21449728, 'steps': 41893, 'batch_loss/train': 0.6600276732351631}
12/28/2021 11:30:20 - INFO - codeparrot_training - Step 41894: {'lr': 3.194872637921573e-05, 'samples': 21450240, 'steps': 41894, 'batch_loss/train': 0.6057414966635406}
12/28/2021 11:30:31 - INFO - codeparrot_training - Step 41895: {'lr': 3.194101642780994e-05, 'samples': 21450752, 'steps': 41895, 'batch_loss/train': 0.6281923049828038}
12/28/2021 11:30:43 - INFO - codeparrot_training - Step 41896: {'lr': 3.1933307343324226e-05, 'samples': 21451264, 'steps': 41896, 'batch_loss/train': 0.7608408750966191}
12/28/2021 11:30:54 - INFO - codeparrot_training - Step 41897: {'lr': 3.1925599125789375e-05, 'samples': 21451776, 'steps': 41897, 'batch_loss/train': 0.5782144937547855}
12/28/2021 11:31:04 - INFO - codeparrot_training - Step 41898: {'lr': 3.191789177523607e-05, 'samples': 21452288, 'steps': 41898, 'batch_loss/train': 0.8418185827322304}
12/28/2021 11:31:17 - INFO - codeparrot_training - Step 41899: {'lr': 3.191018529169484e-05, 'samples': 21452800, 'steps': 41899, 'batch_loss/train': 0.7384760184213519}
12/28/2021 11:31:27 - INFO - codeparrot_training - Step 41900: {'lr': 3.190247967519635e-05, 'samples': 21453312, 'steps': 41900, 'batch_loss/train': 0.6830251840874553}
12/28/2021 11:31:38 - INFO - codeparrot_training - Step 41901: {'lr': 3.1894774925771254e-05, 'samples': 21453824, 'steps': 41901, 'batch_loss/train': 0.6634230203926563}
12/28/2021 11:31:48 - INFO - codeparrot_training - Step 41902: {'lr': 3.188707104345018e-05, 'samples': 21454336, 'steps': 41902, 'batch_loss/train': 0.8104381589218974}
12/28/2021 11:32:01 - INFO - codeparrot_training - Step 41903: {'lr': 3.187936802826377e-05, 'samples': 21454848, 'steps': 41903, 'batch_loss/train': 0.7628686968237162}
12/28/2021 11:32:11 - INFO - codeparrot_training - Step 41904: {'lr': 3.187166588024262e-05, 'samples': 21455360, 'steps': 41904, 'batch_loss/train': 0.631461763754487}
12/28/2021 11:32:22 - INFO - codeparrot_training - Step 41905: {'lr': 3.186396459941737e-05, 'samples': 21455872, 'steps': 41905, 'batch_loss/train': 0.6603207486041356}
12/28/2021 11:32:36 - INFO - codeparrot_training - Step 41906: {'lr': 3.185626418581861e-05, 'samples': 21456384, 'steps': 41906, 'batch_loss/train': 0.7356271436437964}
12/28/2021 11:32:46 - INFO - codeparrot_training - Step 41907: {'lr': 3.1848564639477e-05, 'samples': 21456896, 'steps': 41907, 'batch_loss/train': 0.6869251280440949}
12/28/2021 11:32:57 - INFO - codeparrot_training - Step 41908: {'lr': 3.1840865960423127e-05, 'samples': 21457408, 'steps': 41908, 'batch_loss/train': 0.7061462076380849}
12/28/2021 11:33:09 - INFO - codeparrot_training - Step 41909: {'lr': 3.183316814868764e-05, 'samples': 21457920, 'steps': 41909, 'batch_loss/train': 0.6570779655594379}
12/28/2021 11:33:20 - INFO - codeparrot_training - Step 41910: {'lr': 3.182547120430104e-05, 'samples': 21458432, 'steps': 41910, 'batch_loss/train': 0.6296967691741884}
12/28/2021 11:33:31 - INFO - codeparrot_training - Step 41911: {'lr': 3.1817775127293945e-05, 'samples': 21458944, 'steps': 41911, 'batch_loss/train': 0.6701431244728155}
12/28/2021 11:33:43 - INFO - codeparrot_training - Step 41912: {'lr': 3.18100799176971e-05, 'samples': 21459456, 'steps': 41912, 'batch_loss/train': 0.7575281513854861}
12/28/2021 11:33:53 - INFO - codeparrot_training - Step 41913: {'lr': 3.1802385575540935e-05, 'samples': 21459968, 'steps': 41913, 'batch_loss/train': 0.6937234337674454}
12/28/2021 11:34:04 - INFO - codeparrot_training - Step 41914: {'lr': 3.179469210085611e-05, 'samples': 21460480, 'steps': 41914, 'batch_loss/train': 0.6747793322429061}
12/28/2021 11:34:14 - INFO - codeparrot_training - Step 41915: {'lr': 3.1786999493673166e-05, 'samples': 21460992, 'steps': 41915, 'batch_loss/train': 0.7667432120069861}
12/28/2021 11:34:28 - INFO - codeparrot_training - Step 41916: {'lr': 3.177930775402274e-05, 'samples': 21461504, 'steps': 41916, 'batch_loss/train': 0.6959433824522421}
12/28/2021 11:34:39 - INFO - codeparrot_training - Step 41917: {'lr': 3.177161688193539e-05, 'samples': 21462016, 'steps': 41917, 'batch_loss/train': 0.572084509767592}
12/28/2021 11:34:50 - INFO - codeparrot_training - Step 41918:
{'lr': 3.17639268774417e-05, 'samples': 21462528, 'steps': 41918, 'batch_loss/train': 0.708789570897352} 12/28/2021 11:35:02 - INFO - codeparrot_training - Step 41919: {'lr': 3.175623774057221e-05, 'samples': 21463040, 'steps': 41919, 'batch_loss/train': 0.5679454689379781} 12/28/2021 11:35:13 - INFO - codeparrot_training - Step 41920: {'lr': 3.174854947135752e-05, 'samples': 21463552, 'steps': 41920, 'batch_loss/train': 0.7347481594479177} 12/28/2021 11:35:23 - INFO - codeparrot_training - Step 41921: {'lr': 3.174086206982826e-05, 'samples': 21464064, 'steps': 41921, 'batch_loss/train': 0.673481794539839} 12/28/2021 11:35:37 - INFO - codeparrot_training - Step 41922: {'lr': 3.173317553601482e-05, 'samples': 21464576, 'steps': 41922, 'batch_loss/train': 0.7084976712940261} 12/28/2021 11:35:48 - INFO - codeparrot_training - Step 41923: {'lr': 3.172548986994789e-05, 'samples': 21465088, 'steps': 41923, 'batch_loss/train': 0.748061683960259} 12/28/2021 11:35:58 - INFO - codeparrot_training - Step 41924: {'lr': 3.1717805071658056e-05, 'samples': 21465600, 'steps': 41924, 'batch_loss/train': 0.6865296382457018} 12/28/2021 11:36:09 - INFO - codeparrot_training - Step 41925: {'lr': 3.1710121141175744e-05, 'samples': 21466112, 'steps': 41925, 'batch_loss/train': 0.6633579162880778} 12/28/2021 11:36:21 - INFO - codeparrot_training - Step 41926: {'lr': 3.170243807853157e-05, 'samples': 21466624, 'steps': 41926, 'batch_loss/train': 0.7109962801914662} 12/28/2021 11:36:32 - INFO - codeparrot_training - Step 41927: {'lr': 3.1694755883756085e-05, 'samples': 21467136, 'steps': 41927, 'batch_loss/train': 0.7638377868570387} 12/28/2021 11:36:42 - INFO - codeparrot_training - Step 41928: {'lr': 3.168707455687983e-05, 'samples': 21467648, 'steps': 41928, 'batch_loss/train': 0.6808460103347898} 12/28/2021 11:36:54 - INFO - codeparrot_training - Step 41929: {'lr': 3.167939409793333e-05, 'samples': 21468160, 'steps': 41929, 'batch_loss/train': 0.6150993839837611} 12/28/2021 11:37:05 - 
INFO - codeparrot_training - Step 41930: {'lr': 3.1671714506947114e-05, 'samples': 21468672, 'steps': 41930, 'batch_loss/train': 0.7134444513358176} 12/28/2021 11:37:16 - INFO - codeparrot_training - Step 41931: {'lr': 3.1664035783951716e-05, 'samples': 21469184, 'steps': 41931, 'batch_loss/train': 0.65095388982445} 12/28/2021 11:37:28 - INFO - codeparrot_training - Step 41932: {'lr': 3.16563579289777e-05, 'samples': 21469696, 'steps': 41932, 'batch_loss/train': 0.5824168764520437} 12/28/2021 11:37:39 - INFO - codeparrot_training - Step 41933: {'lr': 3.164868094205553e-05, 'samples': 21470208, 'steps': 41933, 'batch_loss/train': 0.5716842953115702} 12/28/2021 11:37:49 - INFO - codeparrot_training - Step 41934: {'lr': 3.164100482321577e-05, 'samples': 21470720, 'steps': 41934, 'batch_loss/train': 0.7452558809891343} 12/28/2021 11:38:00 - INFO - codeparrot_training - Step 41935: {'lr': 3.163332957248899e-05, 'samples': 21471232, 'steps': 41935, 'batch_loss/train': 0.7260061141569167} 12/28/2021 11:38:14 - INFO - codeparrot_training - Step 41936: {'lr': 3.1625655189905595e-05, 'samples': 21471744, 'steps': 41936, 'batch_loss/train': 0.7202402930706739} 12/28/2021 11:38:25 - INFO - codeparrot_training - Step 41937: {'lr': 3.1617981675496066e-05, 'samples': 21472256, 'steps': 41937, 'batch_loss/train': 0.5068489476107061} 12/28/2021 11:38:35 - INFO - codeparrot_training - Step 41938: {'lr': 3.16103090292911e-05, 'samples': 21472768, 'steps': 41938, 'batch_loss/train': 0.6265196242020465} 12/28/2021 11:38:47 - INFO - codeparrot_training - Step 41939: {'lr': 3.160263725132101e-05, 'samples': 21473280, 'steps': 41939, 'batch_loss/train': 0.6123976334929466} 12/28/2021 11:38:58 - INFO - codeparrot_training - Step 41940: {'lr': 3.15949663416164e-05, 'samples': 21473792, 'steps': 41940, 'batch_loss/train': 0.6973670720472} 12/28/2021 11:39:09 - INFO - codeparrot_training - Step 41941: {'lr': 3.1587296300207705e-05, 'samples': 21474304, 'steps': 41941, 'batch_loss/train': 
0.7710319720208645} 12/28/2021 11:39:21 - INFO - codeparrot_training - Step 41942: {'lr': 3.1579627127125475e-05, 'samples': 21474816, 'steps': 41942, 'batch_loss/train': 0.7491324730217457} 12/28/2021 11:39:31 - INFO - codeparrot_training - Step 41943: {'lr': 3.1571958822400186e-05, 'samples': 21475328, 'steps': 41943, 'batch_loss/train': 0.7999070044606924} 12/28/2021 11:39:42 - INFO - codeparrot_training - Step 41944: {'lr': 3.156429138606229e-05, 'samples': 21475840, 'steps': 41944, 'batch_loss/train': 0.7638589683920145} 12/28/2021 11:39:56 - INFO - codeparrot_training - Step 41945: {'lr': 3.155662481814231e-05, 'samples': 21476352, 'steps': 41945, 'batch_loss/train': 0.6933551477268338} 12/28/2021 11:40:07 - INFO - codeparrot_training - Step 41946: {'lr': 3.15489591186707e-05, 'samples': 21476864, 'steps': 41946, 'batch_loss/train': 0.6060746019938961} 12/28/2021 11:40:17 - INFO - codeparrot_training - Step 41947: {'lr': 3.154129428767799e-05, 'samples': 21477376, 'steps': 41947, 'batch_loss/train': 1.4385629380121827} 12/28/2021 11:40:28 - INFO - codeparrot_training - Step 41948: {'lr': 3.1533630325194517e-05, 'samples': 21477888, 'steps': 41948, 'batch_loss/train': 0.7108439006842673} 12/28/2021 11:40:40 - INFO - codeparrot_training - Step 41949: {'lr': 3.152596723125087e-05, 'samples': 21478400, 'steps': 41949, 'batch_loss/train': 0.6059795008040965} 12/28/2021 11:40:51 - INFO - codeparrot_training - Step 41950: {'lr': 3.151830500587755e-05, 'samples': 21478912, 'steps': 41950, 'batch_loss/train': 0.736034143017605} 12/28/2021 11:41:01 - INFO - codeparrot_training - Step 41951: {'lr': 3.1510643649104916e-05, 'samples': 21479424, 'steps': 41951, 'batch_loss/train': 0.6479436529043596} 12/28/2021 11:41:13 - INFO - codeparrot_training - Step 41952: {'lr': 3.150298316096339e-05, 'samples': 21479936, 'steps': 41952, 'batch_loss/train': 0.7009750502184033} 12/28/2021 11:41:24 - INFO - codeparrot_training - Step 41953: {'lr': 3.1495323541483626e-05, 'samples': 
21480448, 'steps': 41953, 'batch_loss/train': 0.6771215936169028} 12/28/2021 11:41:35 - INFO - codeparrot_training - Step 41954: {'lr': 3.1487664790695866e-05, 'samples': 21480960, 'steps': 41954, 'batch_loss/train': 0.7063091518357396} 12/28/2021 11:41:49 - INFO - codeparrot_training - Step 41955: {'lr': 3.1480006908630674e-05, 'samples': 21481472, 'steps': 41955, 'batch_loss/train': 0.6372980964370072} 12/28/2021 11:41:59 - INFO - codeparrot_training - Step 41956: {'lr': 3.147234989531847e-05, 'samples': 21481984, 'steps': 41956, 'batch_loss/train': 0.6808179087820463} 12/28/2021 11:42:10 - INFO - codeparrot_training - Step 41957: {'lr': 3.146469375078967e-05, 'samples': 21482496, 'steps': 41957, 'batch_loss/train': 0.7443834603764117} 12/28/2021 11:42:20 - INFO - codeparrot_training - Step 41958: {'lr': 3.145703847507475e-05, 'samples': 21483008, 'steps': 41958, 'batch_loss/train': 0.7069458061596379} 12/28/2021 11:42:33 - INFO - codeparrot_training - Step 41959: {'lr': 3.144938406820411e-05, 'samples': 21483520, 'steps': 41959, 'batch_loss/train': 0.8005030578933656} 12/28/2021 11:42:43 - INFO - codeparrot_training - Step 41960: {'lr': 3.144173053020821e-05, 'samples': 21484032, 'steps': 41960, 'batch_loss/train': 0.7060241284198128} 12/28/2021 11:42:54 - INFO - codeparrot_training - Step 41961: {'lr': 3.143407786111749e-05, 'samples': 21484544, 'steps': 41961, 'batch_loss/train': 0.5888929240172729} 12/28/2021 11:43:08 - INFO - codeparrot_training - Step 41962: {'lr': 3.142642606096233e-05, 'samples': 21485056, 'steps': 41962, 'batch_loss/train': 0.6117163323797286} 12/28/2021 11:43:18 - INFO - codeparrot_training - Step 41963: {'lr': 3.1418775129773094e-05, 'samples': 21485568, 'steps': 41963, 'batch_loss/train': 0.7302329316735268} 12/28/2021 11:43:29 - INFO - codeparrot_training - Step 41964: {'lr': 3.14111250675804e-05, 'samples': 21486080, 'steps': 41964, 'batch_loss/train': 0.7085748764220625} 12/28/2021 11:43:41 - INFO - codeparrot_training - Step 
41965: {'lr': 3.140347587441447e-05, 'samples': 21486592, 'steps': 41965, 'batch_loss/train': 0.6931532183662057} 12/28/2021 11:43:52 - INFO - codeparrot_training - Step 41966: {'lr': 3.1395827550305734e-05, 'samples': 21487104, 'steps': 41966, 'batch_loss/train': 0.7059168880805373} 12/28/2021 11:44:02 - INFO - codeparrot_training - Step 41967: {'lr': 3.138818009528474e-05, 'samples': 21487616, 'steps': 41967, 'batch_loss/train': 0.647093212697655} 12/28/2021 11:44:13 - INFO - codeparrot_training - Step 41968: {'lr': 3.1380533509381775e-05, 'samples': 21488128, 'steps': 41968, 'batch_loss/train': 0.7356294032651931} 12/28/2021 11:44:25 - INFO - codeparrot_training - Step 41969: {'lr': 3.137288779262723e-05, 'samples': 21488640, 'steps': 41969, 'batch_loss/train': 0.9048241511918604} 12/28/2021 11:44:36 - INFO - codeparrot_training - Step 41970: {'lr': 3.136524294505155e-05, 'samples': 21489152, 'steps': 41970, 'batch_loss/train': 0.7928490396589041} 12/28/2021 11:44:46 - INFO - codeparrot_training - Step 41971: {'lr': 3.135759896668511e-05, 'samples': 21489664, 'steps': 41971, 'batch_loss/train': 0.7542711752466857} 12/28/2021 11:44:59 - INFO - codeparrot_training - Step 41972: {'lr': 3.134995585755829e-05, 'samples': 21490176, 'steps': 41972, 'batch_loss/train': 0.6628543925471604} 12/28/2021 11:45:09 - INFO - codeparrot_training - Step 41973: {'lr': 3.134231361770154e-05, 'samples': 21490688, 'steps': 41973, 'batch_loss/train': 0.517011390067637} 12/28/2021 11:45:20 - INFO - codeparrot_training - Step 41974: {'lr': 3.133467224714509e-05, 'samples': 21491200, 'steps': 41974, 'batch_loss/train': 0.7040979412849993} 12/28/2021 11:45:34 - INFO - codeparrot_training - Step 41975: {'lr': 3.132703174591947e-05, 'samples': 21491712, 'steps': 41975, 'batch_loss/train': 0.7132402188144624} 12/28/2021 11:45:44 - INFO - codeparrot_training - Step 41976: {'lr': 3.131939211405505e-05, 'samples': 21492224, 'steps': 41976, 'batch_loss/train': 0.7220703447237611} 12/28/2021 
11:45:55 - INFO - codeparrot_training - Step 41977: {'lr': 3.1311753351582125e-05, 'samples': 21492736, 'steps': 41977, 'batch_loss/train': 0.7358604930341244} 12/28/2021 11:46:06 - INFO - codeparrot_training - Step 41978: {'lr': 3.130411545853101e-05, 'samples': 21493248, 'steps': 41978, 'batch_loss/train': 0.8005333133041859} 12/28/2021 11:46:18 - INFO - codeparrot_training - Step 41979: {'lr': 3.1296478434932276e-05, 'samples': 21493760, 'steps': 41979, 'batch_loss/train': 0.7180085110594518} 12/28/2021 11:46:29 - INFO - codeparrot_training - Step 41980: {'lr': 3.1288842280816134e-05, 'samples': 21494272, 'steps': 41980, 'batch_loss/train': 0.7774800481274724} 12/28/2021 11:46:39 - INFO - codeparrot_training - Step 41981: {'lr': 3.128120699621295e-05, 'samples': 21494784, 'steps': 41981, 'batch_loss/train': 0.7100552460178733} 12/28/2021 11:46:51 - INFO - codeparrot_training - Step 41982: {'lr': 3.1273572581153095e-05, 'samples': 21495296, 'steps': 41982, 'batch_loss/train': 0.7351865163072944} 12/28/2021 11:47:02 - INFO - codeparrot_training - Step 41983: {'lr': 3.126593903566693e-05, 'samples': 21495808, 'steps': 41983, 'batch_loss/train': 0.8579235086217523} 12/28/2021 11:47:13 - INFO - codeparrot_training - Step 41984: {'lr': 3.12583063597848e-05, 'samples': 21496320, 'steps': 41984, 'batch_loss/train': 0.722505803219974} 12/28/2021 11:47:26 - INFO - codeparrot_training - Step 41985: {'lr': 3.125067455353706e-05, 'samples': 21496832, 'steps': 41985, 'batch_loss/train': 0.6968956450000405} 12/28/2021 11:47:37 - INFO - codeparrot_training - Step 41986: {'lr': 3.124304361695404e-05, 'samples': 21497344, 'steps': 41986, 'batch_loss/train': 0.6322755496948957} 12/28/2021 11:47:48 - INFO - codeparrot_training - Step 41987: {'lr': 3.1235413550066094e-05, 'samples': 21497856, 'steps': 41987, 'batch_loss/train': 0.761603572871536} 12/28/2021 11:48:00 - INFO - codeparrot_training - Step 41988: {'lr': 3.1227784352903574e-05, 'samples': 21498368, 'steps': 41988, 
'batch_loss/train': 0.8213880104012787} 12/28/2021 11:48:10 - INFO - codeparrot_training - Step 41989: {'lr': 3.122015602549669e-05, 'samples': 21498880, 'steps': 41989, 'batch_loss/train': 0.8131972691044211} 12/28/2021 11:48:21 - INFO - codeparrot_training - Step 41990: {'lr': 3.121252856787593e-05, 'samples': 21499392, 'steps': 41990, 'batch_loss/train': 0.6774098873138428} 12/28/2021 11:48:32 - INFO - codeparrot_training - Step 41991: {'lr': 3.120490198007153e-05, 'samples': 21499904, 'steps': 41991, 'batch_loss/train': 0.601451316382736} 12/28/2021 11:48:46 - INFO - codeparrot_training - Step 41992: {'lr': 3.1197276262113744e-05, 'samples': 21500416, 'steps': 41992, 'batch_loss/train': 0.7170384284108877} 12/28/2021 11:48:56 - INFO - codeparrot_training - Step 41993: {'lr': 3.1189651414033096e-05, 'samples': 21500928, 'steps': 41993, 'batch_loss/train': 0.6324339853599668} 12/28/2021 11:49:07 - INFO - codeparrot_training - Step 41994: {'lr': 3.118202743585971e-05, 'samples': 21501440, 'steps': 41994, 'batch_loss/train': 0.6593000832945108} 12/28/2021 11:49:19 - INFO - codeparrot_training - Step 41995: {'lr': 3.117440432762397e-05, 'samples': 21501952, 'steps': 41995, 'batch_loss/train': 0.776714924024418} 12/28/2021 11:49:30 - INFO - codeparrot_training - Step 41996: {'lr': 3.116678208935617e-05, 'samples': 21502464, 'steps': 41996, 'batch_loss/train': 0.6424117792048492} 12/28/2021 11:49:40 - INFO - codeparrot_training - Step 41997: {'lr': 3.11591607210866e-05, 'samples': 21502976, 'steps': 41997, 'batch_loss/train': 0.6986369757214561} 12/28/2021 11:49:52 - INFO - codeparrot_training - Step 41998: {'lr': 3.115154022284561e-05, 'samples': 21503488, 'steps': 41998, 'batch_loss/train': 0.6843653712421656} 12/28/2021 11:50:03 - INFO - codeparrot_training - Step 41999: {'lr': 3.114392059466348e-05, 'samples': 21504000, 'steps': 41999, 'batch_loss/train': 0.8061250858008862} 12/28/2021 11:50:14 - INFO - codeparrot_training - Step 42000: {'lr': 
3.11363018365704e-05, 'samples': 21504512, 'steps': 42000, 'batch_loss/train': 0.7708917809650302} 12/28/2021 11:50:28 - INFO - codeparrot_training - Step 42001: {'lr': 3.112868394859678e-05, 'samples': 21505024, 'steps': 42001, 'batch_loss/train': 0.7321282718330622} 12/28/2021 11:50:38 - INFO - codeparrot_training - Step 42002: {'lr': 3.112106693077293e-05, 'samples': 21505536, 'steps': 42002, 'batch_loss/train': 0.5780193648533896} 12/28/2021 11:50:49 - INFO - codeparrot_training - Step 42003: {'lr': 3.1113450783128997e-05, 'samples': 21506048, 'steps': 42003, 'batch_loss/train': 0.741110757458955} 12/28/2021 11:51:00 - INFO - codeparrot_training - Step 42004: {'lr': 3.1105835505695345e-05, 'samples': 21506560, 'steps': 42004, 'batch_loss/train': 0.698977845342597} 12/28/2021 11:51:12 - INFO - codeparrot_training - Step 42005: {'lr': 3.10982210985023e-05, 'samples': 21507072, 'steps': 42005, 'batch_loss/train': 0.7190391522890422} 12/28/2021 11:51:23 - INFO - codeparrot_training - Step 42006: {'lr': 3.1090607561580055e-05, 'samples': 21507584, 'steps': 42006, 'batch_loss/train': 0.7425698721781373} 12/28/2021 11:51:33 - INFO - codeparrot_training - Step 42007: {'lr': 3.108299489495886e-05, 'samples': 21508096, 'steps': 42007, 'batch_loss/train': 0.6459686039015651} 12/28/2021 11:51:46 - INFO - codeparrot_training - Step 42008: {'lr': 3.107538309866903e-05, 'samples': 21508608, 'steps': 42008, 'batch_loss/train': 0.784155959263444} 12/28/2021 11:51:56 - INFO - codeparrot_training - Step 42009: {'lr': 3.1067772172740814e-05, 'samples': 21509120, 'steps': 42009, 'batch_loss/train': 0.7182777556590736} 12/28/2021 11:52:07 - INFO - codeparrot_training - Step 42010: {'lr': 3.106016211720447e-05, 'samples': 21509632, 'steps': 42010, 'batch_loss/train': 0.512531709682662} 12/28/2021 11:52:19 - INFO - codeparrot_training - Step 42011: {'lr': 3.105255293209025e-05, 'samples': 21510144, 'steps': 42011, 'batch_loss/train': 0.5873741476680152} 12/28/2021 11:52:30 - INFO - 
codeparrot_training - Step 42012: {'lr': 3.10449446174284e-05, 'samples': 21510656, 'steps': 42012, 'batch_loss/train': 0.7110297083854675} 12/28/2021 11:52:41 - INFO - codeparrot_training - Step 42013: {'lr': 3.1037337173249195e-05, 'samples': 21511168, 'steps': 42013, 'batch_loss/train': 0.7532150116749108} 12/28/2021 11:52:51 - INFO - codeparrot_training - Step 42014: {'lr': 3.102973059958292e-05, 'samples': 21511680, 'steps': 42014, 'batch_loss/train': 0.7167070779833011} 12/28/2021 11:53:05 - INFO - codeparrot_training - Step 42015: {'lr': 3.102212489645964e-05, 'samples': 21512192, 'steps': 42015, 'batch_loss/train': 0.6527030444703996} 12/28/2021 11:53:16 - INFO - codeparrot_training - Step 42016: {'lr': 3.1014520063909816e-05, 'samples': 21512704, 'steps': 42016, 'batch_loss/train': 0.6344720079214312} 12/28/2021 11:53:26 - INFO - codeparrot_training - Step 42017: {'lr': 3.100691610196349e-05, 'samples': 21513216, 'steps': 42017, 'batch_loss/train': 0.72479756642133} 12/28/2021 11:53:39 - INFO - codeparrot_training - Step 42018: {'lr': 3.0999313010650954e-05, 'samples': 21513728, 'steps': 42018, 'batch_loss/train': 0.6387896756641567} 12/28/2021 11:53:49 - INFO - codeparrot_training - Step 42019: {'lr': 3.099171079000257e-05, 'samples': 21514240, 'steps': 42019, 'batch_loss/train': 0.6105557167902589} 12/28/2021 11:54:00 - INFO - codeparrot_training - Step 42020: {'lr': 3.09841094400484e-05, 'samples': 21514752, 'steps': 42020, 'batch_loss/train': 0.8423809031955898} 12/28/2021 11:54:14 - INFO - codeparrot_training - Step 42021: {'lr': 3.097650896081869e-05, 'samples': 21515264, 'steps': 42021, 'batch_loss/train': 0.6107899884227663} 12/28/2021 11:54:24 - INFO - codeparrot_training - Step 42022: {'lr': 3.09689093523437e-05, 'samples': 21515776, 'steps': 42022, 'batch_loss/train': 0.6927351891063154} 12/28/2021 11:54:35 - INFO - codeparrot_training - Step 42023: {'lr': 3.09613106146536e-05, 'samples': 21516288, 'steps': 42023, 'batch_loss/train': 
0.6829368694452569} 12/28/2021 11:54:46 - INFO - codeparrot_training - Step 42024: {'lr': 3.095371274777864e-05, 'samples': 21516800, 'steps': 42024, 'batch_loss/train': 0.790919145103544} 12/28/2021 11:54:58 - INFO - codeparrot_training - Step 42025: {'lr': 3.094611575174905e-05, 'samples': 21517312, 'steps': 42025, 'batch_loss/train': 0.7018769095302559} 12/28/2021 11:55:08 - INFO - codeparrot_training - Step 42026: {'lr': 3.0938519626594914e-05, 'samples': 21517824, 'steps': 42026, 'batch_loss/train': 0.7547580128884874} 12/28/2021 11:55:19 - INFO - codeparrot_training - Step 42027: {'lr': 3.093092437234654e-05, 'samples': 21518336, 'steps': 42027, 'batch_loss/train': 0.739997498691082} 12/28/2021 11:55:31 - INFO - codeparrot_training - Step 42028: {'lr': 3.092332998903416e-05, 'samples': 21518848, 'steps': 42028, 'batch_loss/train': 0.6105195025447756} 12/28/2021 11:55:41 - INFO - codeparrot_training - Step 42029: {'lr': 3.091573647668777e-05, 'samples': 21519360, 'steps': 42029, 'batch_loss/train': 0.7585651030531153} 12/28/2021 11:55:52 - INFO - codeparrot_training - Step 42030: {'lr': 3.090814383533777e-05, 'samples': 21519872, 'steps': 42030, 'batch_loss/train': 0.6644924620632082} 12/28/2021 11:56:06 - INFO - codeparrot_training - Step 42031: {'lr': 3.09005520650143e-05, 'samples': 21520384, 'steps': 42031, 'batch_loss/train': 0.6540573881939054} 12/28/2021 11:56:17 - INFO - codeparrot_training - Step 42032: {'lr': 3.089296116574744e-05, 'samples': 21520896, 'steps': 42032, 'batch_loss/train': 0.693518653512001} 12/28/2021 11:56:27 - INFO - codeparrot_training - Step 42033: {'lr': 3.0885371137567395e-05, 'samples': 21521408, 'steps': 42033, 'batch_loss/train': 0.6540943121653982} 12/28/2021 11:56:39 - INFO - codeparrot_training - Step 42034: {'lr': 3.08777819805045e-05, 'samples': 21521920, 'steps': 42034, 'batch_loss/train': 0.688216601498425} 12/28/2021 11:56:50 - INFO - codeparrot_training - Step 42035: {'lr': 3.087019369458874e-05, 'samples': 21522432, 
'steps': 42035, 'batch_loss/train': 1.379055651370436} 12/28/2021 11:57:01 - INFO - codeparrot_training - Step 42036: {'lr': 3.086260627985035e-05, 'samples': 21522944, 'steps': 42036, 'batch_loss/train': 0.706163567956537} 12/28/2021 11:57:11 - INFO - codeparrot_training - Step 42037: {'lr': 3.0855019736319514e-05, 'samples': 21523456, 'steps': 42037, 'batch_loss/train': 0.7754089708905667} 12/28/2021 11:57:26 - INFO - codeparrot_training - Step 42038: {'lr': 3.0847434064026345e-05, 'samples': 21523968, 'steps': 42038, 'batch_loss/train': 0.7534445887431502} 12/28/2021 11:57:36 - INFO - codeparrot_training - Step 42039: {'lr': 3.083984926300104e-05, 'samples': 21524480, 'steps': 42039, 'batch_loss/train': 0.9123668447136879} 12/28/2021 11:57:47 - INFO - codeparrot_training - Step 42040: {'lr': 3.083226533327374e-05, 'samples': 21524992, 'steps': 42040, 'batch_loss/train': 0.793608233332634} 12/28/2021 11:57:59 - INFO - codeparrot_training - Step 42041: {'lr': 3.082468227487461e-05, 'samples': 21525504, 'steps': 42041, 'batch_loss/train': 0.7895969464443624} 12/28/2021 11:58:10 - INFO - codeparrot_training - Step 42042: {'lr': 3.081710008783378e-05, 'samples': 21526016, 'steps': 42042, 'batch_loss/train': 0.7555259857326746} 12/28/2021 11:58:20 - INFO - codeparrot_training - Step 42043: {'lr': 3.0809518772181465e-05, 'samples': 21526528, 'steps': 42043, 'batch_loss/train': 0.6924257939681411} 12/28/2021 11:58:32 - INFO - codeparrot_training - Step 42044: {'lr': 3.0801938327947614e-05, 'samples': 21527040, 'steps': 42044, 'batch_loss/train': 0.7171974214725196} 12/28/2021 11:58:43 - INFO - codeparrot_training - Step 42045: {'lr': 3.079435875516259e-05, 'samples': 21527552, 'steps': 42045, 'batch_loss/train': 0.6837652521207929} 12/28/2021 11:58:54 - INFO - codeparrot_training - Step 42046: {'lr': 3.078678005385641e-05, 'samples': 21528064, 'steps': 42046, 'batch_loss/train': 0.8141253711655736} 12/28/2021 11:59:04 - INFO - codeparrot_training - Step 42047: {'lr': 
3.077920222405919e-05, 'samples': 21528576, 'steps': 42047, 'batch_loss/train': 0.6154182725585997} 12/28/2021 11:59:17 - INFO - codeparrot_training - Step 42048: {'lr': 3.0771625265801104e-05, 'samples': 21529088, 'steps': 42048, 'batch_loss/train': 0.6398020589258522} 12/28/2021 11:59:28 - INFO - codeparrot_training - Step 42049: {'lr': 3.076404917911224e-05, 'samples': 21529600, 'steps': 42049, 'batch_loss/train': 0.5196841177530587} 12/28/2021 11:59:38 - INFO - codeparrot_training - Step 42050: {'lr': 3.075647396402273e-05, 'samples': 21530112, 'steps': 42050, 'batch_loss/train': 0.7255936443107203} 12/28/2021 11:59:52 - INFO - codeparrot_training - Step 42051: {'lr': 3.074889962056276e-05, 'samples': 21530624, 'steps': 42051, 'batch_loss/train': 1.1108382549136877} 12/28/2021 12:00:03 - INFO - codeparrot_training - Step 42052: {'lr': 3.07413261487623e-05, 'samples': 21531136, 'steps': 42052, 'batch_loss/train': 0.5875003343971912} 12/28/2021 12:00:14 - INFO - codeparrot_training - Step 42053: {'lr': 3.073375354865157e-05, 'samples': 21531648, 'steps': 42053, 'batch_loss/train': 0.6944120228290558} 12/28/2021 12:00:24 - INFO - codeparrot_training - Step 42054: {'lr': 3.072618182026069e-05, 'samples': 21532160, 'steps': 42054, 'batch_loss/train': 0.7634709157282487} 12/28/2021 12:00:36 - INFO - codeparrot_training - Step 42055: {'lr': 3.071861096361961e-05, 'samples': 21532672, 'steps': 42055, 'batch_loss/train': 0.7248158110305667} 12/28/2021 12:00:47 - INFO - codeparrot_training - Step 42056: {'lr': 3.07110409787586e-05, 'samples': 21533184, 'steps': 42056, 'batch_loss/train': 0.8226844053715467} 12/28/2021 12:00:57 - INFO - codeparrot_training - Step 42057: {'lr': 3.0703471865707736e-05, 'samples': 21533696, 'steps': 42057, 'batch_loss/train': 0.7680627973750234} 12/28/2021 12:01:10 - INFO - codeparrot_training - Step 42058: {'lr': 3.069590362449701e-05, 'samples': 21534208, 'steps': 42058, 'batch_loss/train': 0.6653251838870347} 12/28/2021 12:01:21 - INFO - 
codeparrot_training - Step 42059: {'lr': 3.068833625515652e-05, 'samples': 21534720, 'steps': 42059, 'batch_loss/train': 0.6926947627216578} 12/28/2021 12:01:31 - INFO - codeparrot_training - Step 42060: {'lr': 3.0680769757716496e-05, 'samples': 21535232, 'steps': 42060, 'batch_loss/train': 0.7038185372948647} 12/28/2021 12:01:45 - INFO - codeparrot_training - Step 42061: {'lr': 3.067320413220687e-05, 'samples': 21535744, 'steps': 42061, 'batch_loss/train': 2.1345515940338373} 12/28/2021 12:01:56 - INFO - codeparrot_training - Step 42062: {'lr': 3.0665639378657765e-05, 'samples': 21536256, 'steps': 42062, 'batch_loss/train': 0.6705858651548624} 12/28/2021 12:02:06 - INFO - codeparrot_training - Step 42063: {'lr': 3.0658075497099285e-05, 'samples': 21536768, 'steps': 42063, 'batch_loss/train': 0.6215886248683091} 12/28/2021 12:02:17 - INFO - codeparrot_training - Step 42064: {'lr': 3.0650512487561455e-05, 'samples': 21537280, 'steps': 42064, 'batch_loss/train': 0.7310331175103784} 12/28/2021 12:02:29 - INFO - codeparrot_training - Step 42065: {'lr': 3.064295035007436e-05, 'samples': 21537792, 'steps': 42065, 'batch_loss/train': 0.6947784638032317} 12/28/2021 12:02:40 - INFO - codeparrot_training - Step 42066: {'lr': 3.063538908466806e-05, 'samples': 21538304, 'steps': 42066, 'batch_loss/train': 0.7118810098618269} 12/28/2021 12:02:50 - INFO - codeparrot_training - Step 42067: {'lr': 3.062782869137265e-05, 'samples': 21538816, 'steps': 42067, 'batch_loss/train': 0.8627311802702025} 12/28/2021 12:03:04 - INFO - codeparrot_training - Step 42068: {'lr': 3.062026917021815e-05, 'samples': 21539328, 'steps': 42068, 'batch_loss/train': 0.6164000611752272} 12/28/2021 12:03:15 - INFO - codeparrot_training - Step 42069: {'lr': 3.061271052123468e-05, 'samples': 21539840, 'steps': 42069, 'batch_loss/train': 0.6761223068460822} 12/28/2021 12:03:26 - INFO - codeparrot_training - Step 42070: {'lr': 3.060515274445213e-05, 'samples': 21540352, 'steps': 42070, 'batch_loss/train': 
0.7063394272699952} 12/28/2021 12:03:38 - INFO - codeparrot_training - Step 42071: {'lr': 3.059759583990077e-05, 'samples': 21540864, 'steps': 42071, 'batch_loss/train': 0.7571243201382458} 12/28/2021 12:03:48 - INFO - codeparrot_training - Step 42072: {'lr': 3.059003980761044e-05, 'samples': 21541376, 'steps': 42072, 'batch_loss/train': 0.7511971171479672} 12/28/2021 12:03:59 - INFO - codeparrot_training - Step 42073: {'lr': 3.0582484647611295e-05, 'samples': 21541888, 'steps': 42073, 'batch_loss/train': 0.6595828840509057} 12/28/2021 12:04:09 - INFO - codeparrot_training - Step 42074: {'lr': 3.0574930359933335e-05, 'samples': 21542400, 'steps': 42074, 'batch_loss/train': 0.724582630675286} 12/28/2021 12:04:22 - INFO - codeparrot_training - Step 42075: {'lr': 3.05673769446066e-05, 'samples': 21542912, 'steps': 42075, 'batch_loss/train': 0.7946827560663223} 12/28/2021 12:04:32 - INFO - codeparrot_training - Step 42076: {'lr': 3.055982440166111e-05, 'samples': 21543424, 'steps': 42076, 'batch_loss/train': 0.7488500422332436} 12/28/2021 12:04:43 - INFO - codeparrot_training - Step 42077: {'lr': 3.05522727311269e-05, 'samples': 21543936, 'steps': 42077, 'batch_loss/train': 0.7448560423217714} 12/28/2021 12:04:57 - INFO - codeparrot_training - Step 42078: {'lr': 3.054472193303401e-05, 'samples': 21544448, 'steps': 42078, 'batch_loss/train': 0.6459129170980304} 12/28/2021 12:05:08 - INFO - codeparrot_training - Step 42079: {'lr': 3.053717200741243e-05, 'samples': 21544960, 'steps': 42079, 'batch_loss/train': 0.7531528302351944} 12/28/2021 12:05:18 - INFO - codeparrot_training - Step 42080: {'lr': 3.0529622954292235e-05, 'samples': 21545472, 'steps': 42080, 'batch_loss/train': 0.6834086782473605} 12/28/2021 12:05:30 - INFO - codeparrot_training - Step 42081: {'lr': 3.0522074773703305e-05, 'samples': 21545984, 'steps': 42081, 'batch_loss/train': 0.7857113976497203} 12/28/2021 12:05:41 - INFO - codeparrot_training - Step 42082: {'lr': 3.051452746567576e-05, 'samples': 
21546496, 'steps': 42082, 'batch_loss/train': 0.7366734575480223} 12/28/2021 12:05:52 - INFO - codeparrot_training - Step 42083: {'lr': 3.050698103023966e-05, 'samples': 21547008, 'steps': 42083, 'batch_loss/train': 0.8117014011368155} 12/28/2021 12:06:02 - INFO - codeparrot_training - Step 42084: {'lr': 3.049943546742487e-05, 'samples': 21547520, 'steps': 42084, 'batch_loss/train': 0.5207059012027457} 12/28/2021 12:06:14 - INFO - codeparrot_training - Step 42085: {'lr': 3.049189077726136e-05, 'samples': 21548032, 'steps': 42085, 'batch_loss/train': 0.7008366654627025} 12/28/2021 12:06:25 - INFO - codeparrot_training - Step 42086: {'lr': 3.0484346959779363e-05, 'samples': 21548544, 'steps': 42086, 'batch_loss/train': 0.5326908781134989} 12/28/2021 12:06:36 - INFO - codeparrot_training - Step 42087: {'lr': 3.047680401500863e-05, 'samples': 21549056, 'steps': 42087, 'batch_loss/train': 0.8244755691848695} 12/28/2021 12:06:50 - INFO - codeparrot_training - Step 42088: {'lr': 3.046926194297925e-05, 'samples': 21549568, 'steps': 42088, 'batch_loss/train': 0.69879838405177} 12/28/2021 12:07:00 - INFO - codeparrot_training - Step 42089: {'lr': 3.0461720743721167e-05, 'samples': 21550080, 'steps': 42089, 'batch_loss/train': 0.6579243410378695} 12/28/2021 12:07:11 - INFO - codeparrot_training - Step 42090: {'lr': 3.0454180417264414e-05, 'samples': 21550592, 'steps': 42090, 'batch_loss/train': 0.6512710587121546} 12/28/2021 12:07:23 - INFO - codeparrot_training - Step 42091: {'lr': 3.0446640963638945e-05, 'samples': 21551104, 'steps': 42091, 'batch_loss/train': 0.6819271154236048} 12/28/2021 12:07:34 - INFO - codeparrot_training - Step 42092: {'lr': 3.043910238287473e-05, 'samples': 21551616, 'steps': 42092, 'batch_loss/train': 0.7350648883730173} 12/28/2021 12:07:44 - INFO - codeparrot_training - Step 42093: {'lr': 3.0431564675001714e-05, 'samples': 21552128, 'steps': 42093, 'batch_loss/train': 0.8062497922219336} 12/28/2021 12:07:56 - INFO - codeparrot_training - Step 
42094: {'lr': 3.0424027840049912e-05, 'samples': 21552640, 'steps': 42094, 'batch_loss/train': 0.6002153540030122} 12/28/2021 12:08:07 - INFO - codeparrot_training - Step 42095: {'lr': 3.041649187804932e-05, 'samples': 21553152, 'steps': 42095, 'batch_loss/train': 0.6915466976352036} 12/28/2021 12:08:18 - INFO - codeparrot_training - Step 42096: {'lr': 3.0408956789029747e-05, 'samples': 21553664, 'steps': 42096, 'batch_loss/train': 0.7344412822276354} 12/28/2021 12:08:28 - INFO - codeparrot_training - Step 42097: {'lr': 3.0401422573021336e-05, 'samples': 21554176, 'steps': 42097, 'batch_loss/train': 0.7833897513337433} 12/28/2021 12:08:40 - INFO - codeparrot_training - Step 42098: {'lr': 3.03938892300539e-05, 'samples': 21554688, 'steps': 42098, 'batch_loss/train': 0.6258653451805003} 12/28/2021 12:08:51 - INFO - codeparrot_training - Step 42099: {'lr': 3.038635676015744e-05, 'samples': 21555200, 'steps': 42099, 'batch_loss/train': 0.6674538506194949} 12/28/2021 12:09:01 - INFO - codeparrot_training - Step 42100: {'lr': 3.0378825163361935e-05, 'samples': 21555712, 'steps': 42100, 'batch_loss/train': 0.749171975068748} 12/28/2021 12:09:15 - INFO - codeparrot_training - Step 42101: {'lr': 3.0371294439697244e-05, 'samples': 21556224, 'steps': 42101, 'batch_loss/train': 0.6646880283951759} 12/28/2021 12:09:26 - INFO - codeparrot_training - Step 42102: {'lr': 3.0363764589193405e-05, 'samples': 21556736, 'steps': 42102, 'batch_loss/train': 0.7270884420722723} 12/28/2021 12:09:37 - INFO - codeparrot_training - Step 42103: {'lr': 3.035623561188028e-05, 'samples': 21557248, 'steps': 42103, 'batch_loss/train': 0.7219607373699546} 12/28/2021 12:09:49 - INFO - codeparrot_training - Step 42104: {'lr': 3.0348707507787814e-05, 'samples': 21557760, 'steps': 42104, 'batch_loss/train': 0.6080461301025935} 12/28/2021 12:10:00 - INFO - codeparrot_training - Step 42105: {'lr': 3.034118027694599e-05, 'samples': 21558272, 'steps': 42105, 'batch_loss/train': 0.7412554593756795} 12/28/2021 
12:10:10 - INFO - codeparrot_training - Step 42106: {'lr': 3.0333653919384725e-05, 'samples': 21558784, 'steps': 42106, 'batch_loss/train': 1.4796033897437155} 12/28/2021 12:10:24 - INFO - codeparrot_training - Step 42107: {'lr': 3.03261284351338e-05, 'samples': 21559296, 'steps': 42107, 'batch_loss/train': 0.7152477006311528} 12/28/2021 12:10:35 - INFO - codeparrot_training - Step 42108: {'lr': 3.03186038242233e-05, 'samples': 21559808, 'steps': 42108, 'batch_loss/train': 0.691626354586333} 12/28/2021 12:10:46 - INFO - codeparrot_training - Step 42109: {'lr': 3.0311080086683123e-05, 'samples': 21560320, 'steps': 42109, 'batch_loss/train': 0.691294183023274} 12/28/2021 12:10:56 - INFO - codeparrot_training - Step 42110: {'lr': 3.0303557222543106e-05, 'samples': 21560832, 'steps': 42110, 'batch_loss/train': 0.8683350309729576} 12/28/2021 12:11:08 - INFO - codeparrot_training - Step 42111: {'lr': 3.0296035231833136e-05, 'samples': 21561344, 'steps': 42111, 'batch_loss/train': 0.6282326197251678} 12/28/2021 12:11:19 - INFO - codeparrot_training - Step 42112: {'lr': 3.0288514114583278e-05, 'samples': 21561856, 'steps': 42112, 'batch_loss/train': 0.6335135181434453} 12/28/2021 12:11:30 - INFO - codeparrot_training - Step 42113: {'lr': 3.0280993870823254e-05, 'samples': 21562368, 'steps': 42113, 'batch_loss/train': 0.7095033898949623} 12/28/2021 12:11:42 - INFO - codeparrot_training - Step 42114: {'lr': 3.0273474500583014e-05, 'samples': 21562880, 'steps': 42114, 'batch_loss/train': 0.6124396752566099} 12/28/2021 12:11:52 - INFO - codeparrot_training - Step 42115: {'lr': 3.0265956003892565e-05, 'samples': 21563392, 'steps': 42115, 'batch_loss/train': 0.6973539692698978} 12/28/2021 12:12:03 - INFO - codeparrot_training - Step 42116: {'lr': 3.0258438380781654e-05, 'samples': 21563904, 'steps': 42116, 'batch_loss/train': 0.654293421539478} 12/28/2021 12:12:17 - INFO - codeparrot_training - Step 42117: {'lr': 3.025092163128021e-05, 'samples': 21564416, 'steps': 42117, 
'batch_loss/train': 1.4786142490338534} 12/28/2021 12:12:28 - INFO - codeparrot_training - Step 42118: {'lr': 3.0243405755418148e-05, 'samples': 21564928, 'steps': 42118, 'batch_loss/train': 0.6683694655075669} 12/28/2021 12:12:38 - INFO - codeparrot_training - Step 42119: {'lr': 3.0235890753225304e-05, 'samples': 21565440, 'steps': 42119, 'batch_loss/train': 0.77516122860834} 12/28/2021 12:12:50 - INFO - codeparrot_training - Step 42120: {'lr': 3.02283766247316e-05, 'samples': 21565952, 'steps': 42120, 'batch_loss/train': 0.7363458829931915} 12/28/2021 12:13:01 - INFO - codeparrot_training - Step 42121: {'lr': 3.0220863369966935e-05, 'samples': 21566464, 'steps': 42121, 'batch_loss/train': 0.7823777176672593} 12/28/2021 12:13:12 - INFO - codeparrot_training - Step 42122: {'lr': 3.0213350988961023e-05, 'samples': 21566976, 'steps': 42122, 'batch_loss/train': 0.6684254971332848} 12/28/2021 12:13:22 - INFO - codeparrot_training - Step 42123: {'lr': 3.020583948174391e-05, 'samples': 21567488, 'steps': 42123, 'batch_loss/train': 0.6411684962222353} 12/28/2021 12:13:36 - INFO - codeparrot_training - Step 42124: {'lr': 3.019832884834542e-05, 'samples': 21568000, 'steps': 42124, 'batch_loss/train': 0.7312909867614508} 12/28/2021 12:13:47 - INFO - codeparrot_training - Step 42125: {'lr': 3.0190819088795312e-05, 'samples': 21568512, 'steps': 42125, 'batch_loss/train': 0.5900555057451129} 12/28/2021 12:13:57 - INFO - codeparrot_training - Step 42126: {'lr': 3.0183310203123536e-05, 'samples': 21569024, 'steps': 42126, 'batch_loss/train': 0.6085826618364081} 12/28/2021 12:14:09 - INFO - codeparrot_training - Step 42127: {'lr': 3.017580219135993e-05, 'samples': 21569536, 'steps': 42127, 'batch_loss/train': 0.6998118385672569} 12/28/2021 12:14:20 - INFO - codeparrot_training - Step 42128: {'lr': 3.0168295053534302e-05, 'samples': 21570048, 'steps': 42128, 'batch_loss/train': 0.6909197620116174} 12/28/2021 12:14:31 - INFO - codeparrot_training - Step 42129: {'lr': 
3.016078878967654e-05, 'samples': 21570560, 'steps': 42129, 'batch_loss/train': 0.7695561479777098} 12/28/2021 12:14:43 - INFO - codeparrot_training - Step 42130: {'lr': 3.015328339981649e-05, 'samples': 21571072, 'steps': 42130, 'batch_loss/train': 0.7045266055501997} 12/28/2021 12:14:53 - INFO - codeparrot_training - Step 42131: {'lr': 3.0145778883983955e-05, 'samples': 21571584, 'steps': 42131, 'batch_loss/train': 0.7733764164149761} 12/28/2021 12:15:04 - INFO - codeparrot_training - Step 42132: {'lr': 3.013827524220886e-05, 'samples': 21572096, 'steps': 42132, 'batch_loss/train': 0.5953639664803632} 12/28/2021 12:15:14 - INFO - codeparrot_training - Step 42133: {'lr': 3.013077247452084e-05, 'samples': 21572608, 'steps': 42133, 'batch_loss/train': 0.6881025582551956} 12/28/2021 12:15:27 - INFO - codeparrot_training - Step 42134: {'lr': 3.0123270580949908e-05, 'samples': 21573120, 'steps': 42134, 'batch_loss/train': 0.7073607838246971} 12/28/2021 12:15:37 - INFO - codeparrot_training - Step 42135: {'lr': 3.0115769561525868e-05, 'samples': 21573632, 'steps': 42135, 'batch_loss/train': 0.6874534138478339} 12/28/2021 12:15:48 - INFO - codeparrot_training - Step 42136: {'lr': 3.010826941627848e-05, 'samples': 21574144, 'steps': 42136, 'batch_loss/train': 0.7636595047079027} 12/28/2021 12:16:02 - INFO - codeparrot_training - Step 42137: {'lr': 3.0100770145237517e-05, 'samples': 21574656, 'steps': 42137, 'batch_loss/train': 0.6888761762529612} 12/28/2021 12:16:12 - INFO - codeparrot_training - Step 42138: {'lr': 3.0093271748432958e-05, 'samples': 21575168, 'steps': 42138, 'batch_loss/train': 0.7991378819569945} 12/28/2021 12:16:23 - INFO - codeparrot_training - Step 42139: {'lr': 3.008577422589448e-05, 'samples': 21575680, 'steps': 42139, 'batch_loss/train': 0.7568331426009536} 12/28/2021 12:16:35 - INFO - codeparrot_training - Step 42140: {'lr': 3.0078277577651886e-05, 'samples': 21576192, 'steps': 42140, 'batch_loss/train': 0.6934818116715178} 12/28/2021 12:16:46 - 
INFO - codeparrot_training - Step 42141: {'lr': 3.0070781803735098e-05, 'samples': 21576704, 'steps': 42141, 'batch_loss/train': 0.7036594203673303} 12/28/2021 12:16:56 - INFO - codeparrot_training - Step 42142: {'lr': 3.0063286904173816e-05, 'samples': 21577216, 'steps': 42142, 'batch_loss/train': 0.7661845618858933} 12/28/2021 12:17:08 - INFO - codeparrot_training - Step 42143: {'lr': 3.005579287899782e-05, 'samples': 21577728, 'steps': 42143, 'batch_loss/train': 0.7554760351777077} 12/28/2021 12:17:19 - INFO - codeparrot_training - Step 42144: {'lr': 3.004829972823697e-05, 'samples': 21578240, 'steps': 42144, 'batch_loss/train': 0.7247506519779563} 12/28/2021 12:17:30 - INFO - codeparrot_training - Step 42145: {'lr': 3.004080745192103e-05, 'samples': 21578752, 'steps': 42145, 'batch_loss/train': 0.7267705649137497} 12/28/2021 12:17:40 - INFO - codeparrot_training - Step 42146: {'lr': 3.003331605007978e-05, 'samples': 21579264, 'steps': 42146, 'batch_loss/train': 0.7597776683978736} 12/28/2021 12:17:54 - INFO - codeparrot_training - Step 42147: {'lr': 3.0025825522743078e-05, 'samples': 21579776, 'steps': 42147, 'batch_loss/train': 0.6121386461891234} 12/28/2021 12:18:05 - INFO - codeparrot_training - Step 42148: {'lr': 3.0018335869940516e-05, 'samples': 21580288, 'steps': 42148, 'batch_loss/train': 0.6505176217760891} 12/28/2021 12:18:16 - INFO - codeparrot_training - Step 42149: {'lr': 3.0010847091702043e-05, 'samples': 21580800, 'steps': 42149, 'batch_loss/train': 0.7831177567131817} 12/28/2021 12:18:28 - INFO - codeparrot_training - Step 42150: {'lr': 3.000335918805741e-05, 'samples': 21581312, 'steps': 42150, 'batch_loss/train': 0.6882234737277031} 12/28/2021 12:18:39 - INFO - codeparrot_training - Step 42151: {'lr': 2.9995872159036293e-05, 'samples': 21581824, 'steps': 42151, 'batch_loss/train': 0.7843004977330565} 12/28/2021 12:18:49 - INFO - codeparrot_training - Step 42152: {'lr': 2.998838600466858e-05, 'samples': 21582336, 'steps': 42152, 
'batch_loss/train': 0.7135842200368643} 12/28/2021 12:19:03 - INFO - codeparrot_training - Step 42153: {'lr': 2.998090072498394e-05, 'samples': 21582848, 'steps': 42153, 'batch_loss/train': 0.5048888550081756} 12/28/2021 12:19:14 - INFO - codeparrot_training - Step 42154: {'lr': 2.9973416320012158e-05, 'samples': 21583360, 'steps': 42154, 'batch_loss/train': 0.634512519929558} 12/28/2021 12:19:24 - INFO - codeparrot_training - Step 42155: {'lr': 2.996593278978299e-05, 'samples': 21583872, 'steps': 42155, 'batch_loss/train': 0.703049160016235} 12/28/2021 12:19:35 - INFO - codeparrot_training - Step 42156: {'lr': 2.9958450134326186e-05, 'samples': 21584384, 'steps': 42156, 'batch_loss/train': 0.9369698744267225} 12/28/2021 12:19:47 - INFO - codeparrot_training - Step 42157: {'lr': 2.995096835367153e-05, 'samples': 21584896, 'steps': 42157, 'batch_loss/train': 0.7018617594148964} 12/28/2021 12:19:58 - INFO - codeparrot_training - Step 42158: {'lr': 2.994348744784872e-05, 'samples': 21585408, 'steps': 42158, 'batch_loss/train': 0.6975940396077931} 12/28/2021 12:20:08 - INFO - codeparrot_training - Step 42159: {'lr': 2.993600741688751e-05, 'samples': 21585920, 'steps': 42159, 'batch_loss/train': 0.6666292985901237} 12/28/2021 12:20:21 - INFO - codeparrot_training - Step 42160: {'lr': 2.992852826081763e-05, 'samples': 21586432, 'steps': 42160, 'batch_loss/train': 0.684116063406691} 12/28/2021 12:20:31 - INFO - codeparrot_training - Step 42161: {'lr': 2.992104997966888e-05, 'samples': 21586944, 'steps': 42161, 'batch_loss/train': 0.8132944041863084} 12/28/2021 12:20:42 - INFO - codeparrot_training - Step 42162: {'lr': 2.991357257347091e-05, 'samples': 21587456, 'steps': 42162, 'batch_loss/train': 0.5066813897865359} 12/28/2021 12:20:56 - INFO - codeparrot_training - Step 42163: {'lr': 2.990609604225339e-05, 'samples': 21587968, 'steps': 42163, 'batch_loss/train': 0.718078050063923} 12/28/2021 12:21:07 - INFO - codeparrot_training - Step 42164: {'lr': 
2.9898620386046267e-05, 'samples': 21588480, 'steps': 42164, 'batch_loss/train': 0.659207166172564} 12/28/2021 12:21:17 - INFO - codeparrot_training - Step 42165: {'lr': 2.9891145604879044e-05, 'samples': 21588992, 'steps': 42165, 'batch_loss/train': 0.8299900447018445} 12/28/2021 12:21:30 - INFO - codeparrot_training - Step 42166: {'lr': 2.9883671698781477e-05, 'samples': 21589504, 'steps': 42166, 'batch_loss/train': 0.6351138101890683} 12/28/2021 12:21:40 - INFO - codeparrot_training - Step 42167: {'lr': 2.9876198667783406e-05, 'samples': 21590016, 'steps': 42167, 'batch_loss/train': 0.8021942172199488} 12/28/2021 12:21:51 - INFO - codeparrot_training - Step 42168: {'lr': 2.9868726511914413e-05, 'samples': 21590528, 'steps': 42168, 'batch_loss/train': 0.6049230121425353} 12/28/2021 12:22:02 - INFO - codeparrot_training - Step 42169: {'lr': 2.9861255231204255e-05, 'samples': 21591040, 'steps': 42169, 'batch_loss/train': 0.8390861367806792} 12/28/2021 12:22:16 - INFO - codeparrot_training - Step 42170: {'lr': 2.98537848256826e-05, 'samples': 21591552, 'steps': 42170, 'batch_loss/train': 0.8751378050073981} 12/28/2021 12:22:26 - INFO - codeparrot_training - Step 42171: {'lr': 2.9846315295379207e-05, 'samples': 21592064, 'steps': 42171, 'batch_loss/train': 0.7955530695617199} 12/28/2021 12:22:37 - INFO - codeparrot_training - Step 42172: {'lr': 2.9838846640323713e-05, 'samples': 21592576, 'steps': 42172, 'batch_loss/train': 0.6332639558240771} 12/28/2021 12:22:49 - INFO - codeparrot_training - Step 42173: {'lr': 2.9831378860545878e-05, 'samples': 21593088, 'steps': 42173, 'batch_loss/train': 0.6965546421706676} 12/28/2021 12:23:00 - INFO - codeparrot_training - Step 42174: {'lr': 2.9823911956075257e-05, 'samples': 21593600, 'steps': 42174, 'batch_loss/train': 0.8042451580986381} 12/28/2021 12:23:10 - INFO - codeparrot_training - Step 42175: {'lr': 2.9816445926941688e-05, 'samples': 21594112, 'steps': 42175, 'batch_loss/train': 0.7495956313796341} 12/28/2021 12:23:22 
- INFO - codeparrot_training - Step 42176: {'lr': 2.9808980773174843e-05, 'samples': 21594624, 'steps': 42176, 'batch_loss/train': 0.6118736212374642} 12/28/2021 12:23:33 - INFO - codeparrot_training - Step 42177: {'lr': 2.9801516494804226e-05, 'samples': 21595136, 'steps': 42177, 'batch_loss/train': 0.6428714960638899} 12/28/2021 12:23:44 - INFO - codeparrot_training - Step 42178: {'lr': 2.979405309185976e-05, 'samples': 21595648, 'steps': 42178, 'batch_loss/train': 0.6636135373264551} 12/28/2021 12:23:54 - INFO - codeparrot_training - Step 42179: {'lr': 2.978659056437094e-05, 'samples': 21596160, 'steps': 42179, 'batch_loss/train': 0.6194137404672801} 12/28/2021 12:24:08 - INFO - codeparrot_training - Step 42180: {'lr': 2.9779128912367476e-05, 'samples': 21596672, 'steps': 42180, 'batch_loss/train': 0.7078955871984363} 12/28/2021 12:24:19 - INFO - codeparrot_training - Step 42181: {'lr': 2.977166813587903e-05, 'samples': 21597184, 'steps': 42181, 'batch_loss/train': 0.7864296128973365} 12/28/2021 12:24:30 - INFO - codeparrot_training - Step 42182: {'lr': 2.976420823493531e-05, 'samples': 21597696, 'steps': 42182, 'batch_loss/train': 0.7133378681028262} 12/28/2021 12:24:42 - INFO - codeparrot_training - Step 42183: {'lr': 2.975674920956592e-05, 'samples': 21598208, 'steps': 42183, 'batch_loss/train': 0.6796425171196461} 12/28/2021 12:24:52 - INFO - codeparrot_training - Step 42184: {'lr': 2.9749291059800538e-05, 'samples': 21598720, 'steps': 42184, 'batch_loss/train': 0.7062205385882407} 12/28/2021 12:25:03 - INFO - codeparrot_training - Step 42185: {'lr': 2.9741833785668802e-05, 'samples': 21599232, 'steps': 42185, 'batch_loss/train': 0.7928042045095935} 12/28/2021 12:25:17 - INFO - codeparrot_training - Step 42186: {'lr': 2.9734377387200388e-05, 'samples': 21599744, 'steps': 42186, 'batch_loss/train': 0.6273771775886416} 12/28/2021 12:25:28 - INFO - codeparrot_training - Step 42187: {'lr': 2.9726921864424965e-05, 'samples': 21600256, 'steps': 42187, 
'batch_loss/train': 0.654251849045977} 12/28/2021 12:25:38 - INFO - codeparrot_training - Step 42188: {'lr': 2.971946721737201e-05, 'samples': 21600768, 'steps': 42188, 'batch_loss/train': 0.6549879722297192} 12/28/2021 12:25:49 - INFO - codeparrot_training - Step 42189: {'lr': 2.9712013446071357e-05, 'samples': 21601280, 'steps': 42189, 'batch_loss/train': 0.6694192594150081} 12/28/2021 12:26:02 - INFO - codeparrot_training - Step 42190: {'lr': 2.9704560550552595e-05, 'samples': 21601792, 'steps': 42190, 'batch_loss/train': 0.6725510710384697} 12/28/2021 12:26:12 - INFO - codeparrot_training - Step 42191: {'lr': 2.9697108530845258e-05, 'samples': 21602304, 'steps': 42191, 'batch_loss/train': 0.5585309430025518} 12/28/2021 12:26:23 - INFO - codeparrot_training - Step 42192: {'lr': 2.9689657386979013e-05, 'samples': 21602816, 'steps': 42192, 'batch_loss/train': 0.7536585554480553} 12/28/2021 12:26:35 - INFO - codeparrot_training - Step 42193: {'lr': 2.968220711898359e-05, 'samples': 21603328, 'steps': 42193, 'batch_loss/train': 0.745409619063139} 12/28/2021 12:26:46 - INFO - codeparrot_training - Step 42194: {'lr': 2.9674757726888464e-05, 'samples': 21603840, 'steps': 42194, 'batch_loss/train': 0.6629207949154079} 12/28/2021 12:26:56 - INFO - codeparrot_training - Step 42195: {'lr': 2.9667309210723302e-05, 'samples': 21604352, 'steps': 42195, 'batch_loss/train': 0.6391050911042839} 12/28/2021 12:27:09 - INFO - codeparrot_training - Step 42196: {'lr': 2.9659861570517755e-05, 'samples': 21604864, 'steps': 42196, 'batch_loss/train': 0.7687671722378582} 12/28/2021 12:27:19 - INFO - codeparrot_training - Step 42197: {'lr': 2.9652414806301375e-05, 'samples': 21605376, 'steps': 42197, 'batch_loss/train': 0.7744259265018627} 12/28/2021 12:27:30 - INFO - codeparrot_training - Step 42198: {'lr': 2.964496891810381e-05, 'samples': 21605888, 'steps': 42198, 'batch_loss/train': 0.46852971526095644} 12/28/2021 12:27:41 - INFO - codeparrot_training - Step 42199: {'lr': 
2.96375239059547e-05, 'samples': 21606400, 'steps': 42199, 'batch_loss/train': 0.7532796813175082} 12/28/2021 12:27:55 - INFO - codeparrot_training - Step 42200: {'lr': 2.9630079769883467e-05, 'samples': 21606912, 'steps': 42200, 'batch_loss/train': 0.8125883750617504} 12/28/2021 12:28:05 - INFO - codeparrot_training - Step 42201: {'lr': 2.9622636509919894e-05, 'samples': 21607424, 'steps': 42201, 'batch_loss/train': 0.7840385643066838} 12/28/2021 12:28:16 - INFO - codeparrot_training - Step 42202: {'lr': 2.9615194126093564e-05, 'samples': 21607936, 'steps': 42202, 'batch_loss/train': 0.7789378203451633} 12/28/2021 12:28:28 - INFO - codeparrot_training - Step 42203: {'lr': 2.9607752618433903e-05, 'samples': 21608448, 'steps': 42203, 'batch_loss/train': 0.7518935110419989} 12/28/2021 12:28:38 - INFO - codeparrot_training - Step 42204: {'lr': 2.9600311986970636e-05, 'samples': 21608960, 'steps': 42204, 'batch_loss/train': 0.7298577628098428} 12/28/2021 12:28:49 - INFO - codeparrot_training - Step 42205: {'lr': 2.9592872231733377e-05, 'samples': 21609472, 'steps': 42205, 'batch_loss/train': 0.7854502843692899} 12/28/2021 12:29:01 - INFO - codeparrot_training - Step 42206: {'lr': 2.9585433352751574e-05, 'samples': 21609984, 'steps': 42206, 'batch_loss/train': 0.7101578409783542} 12/28/2021 12:29:12 - INFO - codeparrot_training - Step 42207: {'lr': 2.957799535005487e-05, 'samples': 21610496, 'steps': 42207, 'batch_loss/train': 0.6686785374768078} 12/28/2021 12:29:23 - INFO - codeparrot_training - Step 42208: {'lr': 2.957055822367283e-05, 'samples': 21611008, 'steps': 42208, 'batch_loss/train': 0.8044898597872816} 12/28/2021 12:29:33 - INFO - codeparrot_training - Step 42209: {'lr': 2.9563121973635004e-05, 'samples': 21611520, 'steps': 42209, 'batch_loss/train': 0.7330343786161393} 12/28/2021 12:29:47 - INFO - codeparrot_training - Step 42210: {'lr': 2.9555686599970988e-05, 'samples': 21612032, 'steps': 42210, 'batch_loss/train': 0.7057996625080705} 12/28/2021 12:29:58 - 
INFO - codeparrot_training - Step 42211: {'lr': 2.954825210271031e-05, 'samples': 21612544, 'steps': 42211, 'batch_loss/train': 0.7957690567709506} 12/28/2021 12:30:09 - INFO - codeparrot_training - Step 42212: {'lr': 2.954081848188256e-05, 'samples': 21613056, 'steps': 42212, 'batch_loss/train': 0.7737600607797503} 12/28/2021 12:30:21 - INFO - codeparrot_training - Step 42213: {'lr': 2.953338573751724e-05, 'samples': 21613568, 'steps': 42213, 'batch_loss/train': 0.9931144171860069} 12/28/2021 12:30:31 - INFO - codeparrot_training - Step 42214: {'lr': 2.9525953869643963e-05, 'samples': 21614080, 'steps': 42214, 'batch_loss/train': 0.7288631456904113} 12/28/2021 12:30:42 - INFO - codeparrot_training - Step 42215: {'lr': 2.9518522878292236e-05, 'samples': 21614592, 'steps': 42215, 'batch_loss/train': 0.4439655306050554} 12/28/2021 12:30:53 - INFO - codeparrot_training - Step 42216: {'lr': 2.9511092763491678e-05, 'samples': 21615104, 'steps': 42216, 'batch_loss/train': 0.6016144117456861} 12/28/2021 12:31:05 - INFO - codeparrot_training - Step 42217: {'lr': 2.9503663525271703e-05, 'samples': 21615616, 'steps': 42217, 'batch_loss/train': 0.7176394648849964} 12/28/2021 12:31:15 - INFO - codeparrot_training - Step 42218: {'lr': 2.9496235163661846e-05, 'samples': 21616128, 'steps': 42218, 'batch_loss/train': 0.7244764701463282} 12/28/2021 12:31:26 - INFO - codeparrot_training - Step 42219: {'lr': 2.9488807678691804e-05, 'samples': 21616640, 'steps': 42219, 'batch_loss/train': 0.7428138582035899} 12/28/2021 12:31:40 - INFO - codeparrot_training - Step 42220: {'lr': 2.9481381070390945e-05, 'samples': 21617152, 'steps': 42220, 'batch_loss/train': 0.6860672472976148} 12/28/2021 12:31:51 - INFO - codeparrot_training - Step 42221: {'lr': 2.9473955338788855e-05, 'samples': 21617664, 'steps': 42221, 'batch_loss/train': 0.7027787016704679} 12/28/2021 12:32:01 - INFO - codeparrot_training - Step 42222: {'lr': 2.9466530483915065e-05, 'samples': 21618176, 'steps': 42222, 
'batch_loss/train': 0.783570204366697} 12/28/2021 12:32:14 - INFO - codeparrot_training - Step 42223: {'lr': 2.9459106505799084e-05, 'samples': 21618688, 'steps': 42223, 'batch_loss/train': 0.7671627346426249} 12/28/2021 12:32:24 - INFO - codeparrot_training - Step 42224: {'lr': 2.945168340447038e-05, 'samples': 21619200, 'steps': 42224, 'batch_loss/train': 0.8382214065641165} 12/28/2021 12:32:35 - INFO - codeparrot_training - Step 42225: {'lr': 2.944426117995855e-05, 'samples': 21619712, 'steps': 42225, 'batch_loss/train': 0.7399371941573918} 12/28/2021 12:32:45 - INFO - codeparrot_training - Step 42226: {'lr': 2.943683983229303e-05, 'samples': 21620224, 'steps': 42226, 'batch_loss/train': 0.7469301717355847} 12/28/2021 12:32:59 - INFO - codeparrot_training - Step 42227: {'lr': 2.9429419361503367e-05, 'samples': 21620736, 'steps': 42227, 'batch_loss/train': 0.6211214582435787} 12/28/2021 12:33:09 - INFO - codeparrot_training - Step 42228: {'lr': 2.9421999767619114e-05, 'samples': 21621248, 'steps': 42228, 'batch_loss/train': 0.6563584677642211} 12/28/2021 12:33:20 - INFO - codeparrot_training - Step 42229: {'lr': 2.9414581050669575e-05, 'samples': 21621760, 'steps': 42229, 'batch_loss/train': 0.7685855990275741} 12/28/2021 12:33:32 - INFO - codeparrot_training - Step 42230: {'lr': 2.940716321068443e-05, 'samples': 21622272, 'steps': 42230, 'batch_loss/train': 0.841603527776897} 12/28/2021 12:33:43 - INFO - codeparrot_training - Step 42231: {'lr': 2.9399746247693178e-05, 'samples': 21622784, 'steps': 42231, 'batch_loss/train': 0.7836440070532262} 12/28/2021 12:33:53 - INFO - codeparrot_training - Step 42232: {'lr': 2.9392330161725185e-05, 'samples': 21623296, 'steps': 42232, 'batch_loss/train': 0.8019457617774606} 12/28/2021 12:34:05 - INFO - codeparrot_training - Step 42233: {'lr': 2.9384914952809956e-05, 'samples': 21623808, 'steps': 42233, 'batch_loss/train': 0.7873909631744027} 12/28/2021 12:34:16 - INFO - codeparrot_training - Step 42234: {'lr': 
2.937750062097705e-05, 'samples': 21624320, 'steps': 42234, 'batch_loss/train': 0.7320135226473212} 12/28/2021 12:34:27 - INFO - codeparrot_training - Step 42235: {'lr': 2.937008716625586e-05, 'samples': 21624832, 'steps': 42235, 'batch_loss/train': 0.8020673533901572} 12/28/2021 12:34:37 - INFO - codeparrot_training - Step 42236: {'lr': 2.936267458867592e-05, 'samples': 21625344, 'steps': 42236, 'batch_loss/train': 0.7151320148259401} 12/28/2021 12:34:49 - INFO - codeparrot_training - Step 42237: {'lr': 2.9355262888266676e-05, 'samples': 21625856, 'steps': 42237, 'batch_loss/train': 0.7748393341898918} 12/28/2021 12:35:00 - INFO - codeparrot_training - Step 42238: {'lr': 2.9347852065057578e-05, 'samples': 21626368, 'steps': 42238, 'batch_loss/train': 0.6456509889103472} 12/28/2021 12:35:10 - INFO - codeparrot_training - Step 42239: {'lr': 2.9340442119078104e-05, 'samples': 21626880, 'steps': 42239, 'batch_loss/train': 0.7118884891970083} 12/28/2021 12:35:24 - INFO - codeparrot_training - Step 42240: {'lr': 2.9333033050357727e-05, 'samples': 21627392, 'steps': 42240, 'batch_loss/train': 0.829746900126338} 12/28/2021 12:35:35 - INFO - codeparrot_training - Step 42241: {'lr': 2.9325624858925898e-05, 'samples': 21627904, 'steps': 42241, 'batch_loss/train': 0.7923225089907646} 12/28/2021 12:35:46 - INFO - codeparrot_training - Step 42242: {'lr': 2.9318217544812093e-05, 'samples': 21628416, 'steps': 42242, 'batch_loss/train': 0.7777623670990579} 12/28/2021 12:35:58 - INFO - codeparrot_training - Step 42243: {'lr': 2.9310811108045704e-05, 'samples': 21628928, 'steps': 42243, 'batch_loss/train': 0.7767519121989608} 12/28/2021 12:36:08 - INFO - codeparrot_training - Step 42244: {'lr': 2.9303405548656127e-05, 'samples': 21629440, 'steps': 42244, 'batch_loss/train': 0.7167349075898528} 12/28/2021 12:36:19 - INFO - codeparrot_training - Step 42245: {'lr': 2.9296000866672974e-05, 'samples': 21629952, 'steps': 42245, 'batch_loss/train': 0.7408862551674247} 12/28/2021 12:36:30 - 
INFO - codeparrot_training - Step 42246: {'lr': 2.9288597062125555e-05, 'samples': 21630464, 'steps': 42246, 'batch_loss/train': 0.6097309354227036} 12/28/2021 12:36:42 - INFO - codeparrot_training - Step 42247: {'lr': 2.928119413504332e-05, 'samples': 21630976, 'steps': 42247, 'batch_loss/train': 0.6969942566938698} 12/28/2021 12:36:52 - INFO - codeparrot_training - Step 42248: {'lr': 2.9273792085455747e-05, 'samples': 21631488, 'steps': 42248, 'batch_loss/train': 0.7886928683146834} 12/28/2021 12:37:03 - INFO - codeparrot_training - Step 42249: {'lr': 2.92663909133922e-05, 'samples': 21632000, 'steps': 42249, 'batch_loss/train': 0.5006786467856728} 12/28/2021 12:37:17 - INFO - codeparrot_training - Step 42250: {'lr': 2.9258990618882155e-05, 'samples': 21632512, 'steps': 42250, 'batch_loss/train': 0.7652470285538584} 12/28/2021 12:37:28 - INFO - codeparrot_training - Step 42251: {'lr': 2.9251591201955e-05, 'samples': 21633024, 'steps': 42251, 'batch_loss/train': 0.720968805369921} 12/28/2021 12:37:38 - INFO - codeparrot_training - Step 42252: {'lr': 2.9244192662640195e-05, 'samples': 21633536, 'steps': 42252, 'batch_loss/train': 0.7684696661308408} 12/28/2021 12:37:50 - INFO - codeparrot_training - Step 42253: {'lr': 2.92367950009671e-05, 'samples': 21634048, 'steps': 42253, 'batch_loss/train': 0.8513139290735126} 12/28/2021 12:38:01 - INFO - codeparrot_training - Step 42254: {'lr': 2.9229398216965216e-05, 'samples': 21634560, 'steps': 42254, 'batch_loss/train': 0.7550620804540813} 12/28/2021 12:38:12 - INFO - codeparrot_training - Step 42255: {'lr': 2.922200231066377e-05, 'samples': 21635072, 'steps': 42255, 'batch_loss/train': 0.6950674299150705} 12/28/2021 12:38:26 - INFO - codeparrot_training - Step 42256: {'lr': 2.9214607282092355e-05, 'samples': 21635584, 'steps': 42256, 'batch_loss/train': 0.735985572449863} 12/28/2021 12:38:37 - INFO - codeparrot_training - Step 42257: {'lr': 2.920721313128033e-05, 'samples': 21636096, 'steps': 42257, 'batch_loss/train': 
0.8636943413875997} 12/28/2021 12:38:48 - INFO - codeparrot_training - Step 42258: {'lr': 2.919981985825698e-05, 'samples': 21636608, 'steps': 42258, 'batch_loss/train': 0.823806734289974} 12/28/2021 12:38:58 - INFO - codeparrot_training - Step 42259: {'lr': 2.9192427463051756e-05, 'samples': 21637120, 'steps': 42259, 'batch_loss/train': 0.6528918037656695} 12/28/2021 12:39:10 - INFO - codeparrot_training - Step 42260: {'lr': 2.918503594569416e-05, 'samples': 21637632, 'steps': 42260, 'batch_loss/train': 0.8201401247642934} 12/28/2021 12:39:21 - INFO - codeparrot_training - Step 42261: {'lr': 2.9177645306213445e-05, 'samples': 21638144, 'steps': 42261, 'batch_loss/train': 0.5743400926003233} 12/28/2021 12:39:32 - INFO - codeparrot_training - Step 42262: {'lr': 2.9170255544639035e-05, 'samples': 21638656, 'steps': 42262, 'batch_loss/train': 0.7183611209038645} 12/28/2021 12:39:44 - INFO - codeparrot_training - Step 42263: {'lr': 2.9162866661000292e-05, 'samples': 21639168, 'steps': 42263, 'batch_loss/train': 0.6926548999035731} 12/28/2021 12:39:54 - INFO - codeparrot_training - Step 42264: {'lr': 2.915547865532661e-05, 'samples': 21639680, 'steps': 42264, 'batch_loss/train': 0.859656632412225} 12/28/2021 12:40:05 - INFO - codeparrot_training - Step 42265: {'lr': 2.914809152764736e-05, 'samples': 21640192, 'steps': 42265, 'batch_loss/train': 0.7498458800837398} 12/28/2021 12:40:16 - INFO - codeparrot_training - Step 42266: {'lr': 2.9140705277991898e-05, 'samples': 21640704, 'steps': 42266, 'batch_loss/train': 0.678523620357737} 12/28/2021 12:40:30 - INFO - codeparrot_training - Step 42267: {'lr': 2.913331990638962e-05, 'samples': 21641216, 'steps': 42267, 'batch_loss/train': 0.758819404989481} 12/28/2021 12:40:41 - INFO - codeparrot_training - Step 42268: {'lr': 2.9125935412869897e-05, 'samples': 21641728, 'steps': 42268, 'batch_loss/train': 0.7045714718988165} 12/28/2021 12:40:51 - INFO - codeparrot_training - Step 42269: {'lr': 2.9118551797462006e-05, 'samples': 
21642240, 'steps': 42269, 'batch_loss/train': 0.8049524137750268} 12/28/2021 12:41:03 - INFO - codeparrot_training - Step 42270: {'lr': 2.9111169060195313e-05, 'samples': 21642752, 'steps': 42270, 'batch_loss/train': 0.8181156329810619} 12/28/2021 12:41:14 - INFO - codeparrot_training - Step 42271: {'lr': 2.9103787201099298e-05, 'samples': 21643264, 'steps': 42271, 'batch_loss/train': 0.7370026559219696} 12/28/2021 12:41:25 - INFO - codeparrot_training - Step 42272: {'lr': 2.909640622020318e-05, 'samples': 21643776, 'steps': 42272, 'batch_loss/train': 0.7093264944851398} 12/28/2021 12:41:37 - INFO - codeparrot_training - Step 42273: {'lr': 2.9089026117536277e-05, 'samples': 21644288, 'steps': 42273, 'batch_loss/train': 0.663976184790954} 12/28/2021 12:41:47 - INFO - codeparrot_training - Step 42274: {'lr': 2.9081646893128116e-05, 'samples': 21644800, 'steps': 42274, 'batch_loss/train': 0.5662739214021713} 12/28/2021 12:41:58 - INFO - codeparrot_training - Step 42275: {'lr': 2.9074268547007843e-05, 'samples': 21645312, 'steps': 42275, 'batch_loss/train': 0.6916382594499737} 12/28/2021 12:42:09 - INFO - codeparrot_training - Step 42276: {'lr': 2.9066891079204848e-05, 'samples': 21645824, 'steps': 42276, 'batch_loss/train': 0.653624651953578} 12/28/2021 12:42:21 - INFO - codeparrot_training - Step 42277: {'lr': 2.90595144897485e-05, 'samples': 21646336, 'steps': 42277, 'batch_loss/train': 0.7398798577487469} 12/28/2021 12:42:31 - INFO - codeparrot_training - Step 42278: {'lr': 2.905213877866808e-05, 'samples': 21646848, 'steps': 42278, 'batch_loss/train': 0.7680253321304917} 12/28/2021 12:42:42 - INFO - codeparrot_training - Step 42279: {'lr': 2.9044763945992923e-05, 'samples': 21647360, 'steps': 42279, 'batch_loss/train': 0.7332058344036341} 12/28/2021 12:42:56 - INFO - codeparrot_training - Step 42280: {'lr': 2.9037389991752427e-05, 'samples': 21647872, 'steps': 42280, 'batch_loss/train': 0.7736911168321967} 12/28/2021 12:43:07 - INFO - codeparrot_training - Step 
42281: {'lr': 2.9030016915975732e-05, 'samples': 21648384, 'steps': 42281, 'batch_loss/train': 0.6190242040902376} 12/28/2021 12:43:17 - INFO - codeparrot_training - Step 42282: {'lr': 2.9022644718692286e-05, 'samples': 21648896, 'steps': 42282, 'batch_loss/train': 0.6967418608255684} 12/28/2021 12:43:29 - INFO - codeparrot_training - Step 42283: {'lr': 2.901527339993143e-05, 'samples': 21649408, 'steps': 42283, 'batch_loss/train': 0.7617475586012006} 12/28/2021 12:43:40 - INFO - codeparrot_training - Step 42284: {'lr': 2.900790295972236e-05, 'samples': 21649920, 'steps': 42284, 'batch_loss/train': 0.6112208368722349} 12/28/2021 12:43:51 - INFO - codeparrot_training - Step 42285: {'lr': 2.900053339809436e-05, 'samples': 21650432, 'steps': 42285, 'batch_loss/train': 0.7681059199385345} 12/28/2021 12:44:01 - INFO - codeparrot_training - Step 42286: {'lr': 2.8993164715076903e-05, 'samples': 21650944, 'steps': 42286, 'batch_loss/train': 0.7152078277431428} 12/28/2021 12:44:13 - INFO - codeparrot_training - Step 42287: {'lr': 2.8985796910699137e-05, 'samples': 21651456, 'steps': 42287, 'batch_loss/train': 0.767314987257123} 12/28/2021 12:44:24 - INFO - codeparrot_training - Step 42288: {'lr': 2.897842998499037e-05, 'samples': 21651968, 'steps': 42288, 'batch_loss/train': 0.6886538886465132} 12/28/2021 12:44:35 - INFO - codeparrot_training - Step 42289: {'lr': 2.8971063937979935e-05, 'samples': 21652480, 'steps': 42289, 'batch_loss/train': 0.6784062855876982} 12/28/2021 12:44:49 - INFO - codeparrot_training - Step 42290: {'lr': 2.8963698769697067e-05, 'samples': 21652992, 'steps': 42290, 'batch_loss/train': 0.7756171708460897} 12/28/2021 12:44:59 - INFO - codeparrot_training - Step 42291: {'lr': 2.895633448017107e-05, 'samples': 21653504, 'steps': 42291, 'batch_loss/train': 0.8256103820167482} 12/28/2021 12:45:10 - INFO - codeparrot_training - Step 42292: {'lr': 2.8948971069431258e-05, 'samples': 21654016, 'steps': 42292, 'batch_loss/train': 0.7458431990817189} 
12/28/2021 12:45:22 - INFO - codeparrot_training - Step 42293: {'lr': 2.8941608537506826e-05, 'samples': 21654528, 'steps': 42293, 'batch_loss/train': 0.9589458610862494}
12/28/2021 12:45:33 - INFO - codeparrot_training - Step 42294: {'lr': 2.893424688442711e-05, 'samples': 21655040, 'steps': 42294, 'batch_loss/train': 0.7186049008741975}
12/28/2021 12:45:44 - INFO - codeparrot_training - Step 42295: {'lr': 2.892688611022143e-05, 'samples': 21655552, 'steps': 42295, 'batch_loss/train': 0.7460182150825858}
12/28/2021 12:45:54 - INFO - codeparrot_training - Step 42296: {'lr': 2.8919526214918863e-05, 'samples': 21656064, 'steps': 42296, 'batch_loss/train': 0.6751743792556226}
12/28/2021 12:46:08 - INFO - codeparrot_training - Step 42297: {'lr': 2.891216719854886e-05, 'samples': 21656576, 'steps': 42297, 'batch_loss/train': 0.6893375692889094}
12/28/2021 12:46:19 - INFO - codeparrot_training - Step 42298: {'lr': 2.890480906114057e-05, 'samples': 21657088, 'steps': 42298, 'batch_loss/train': 0.4270992277888581}
12/28/2021 12:46:30 - INFO - codeparrot_training - Step 42299: {'lr': 2.889745180272324e-05, 'samples': 21657600, 'steps': 42299, 'batch_loss/train': 0.6663712980225682}
12/28/2021 12:46:42 - INFO - codeparrot_training - Step 42300: {'lr': 2.889009542332624e-05, 'samples': 21658112, 'steps': 42300, 'batch_loss/train': 0.7545642433688045}
12/28/2021 12:46:53 - INFO - codeparrot_training - Step 42301: {'lr': 2.8882739922978684e-05, 'samples': 21658624, 'steps': 42301, 'batch_loss/train': 0.5387545877601951}
12/28/2021 12:47:03 - INFO - codeparrot_training - Step 42302: {'lr': 2.8875385301709856e-05, 'samples': 21659136, 'steps': 42302, 'batch_loss/train': 0.8410858797142282}
12/28/2021 12:47:15 - INFO - codeparrot_training - Step 42303: {'lr': 2.8868031559549006e-05, 'samples': 21659648, 'steps': 42303, 'batch_loss/train': 0.6931823733029887}
12/28/2021 12:47:26 - INFO - codeparrot_training - Step 42304: {'lr': 2.8860678696525394e-05, 'samples': 21660160, 'steps': 42304, 'batch_loss/train': 0.766474857868161}
12/28/2021 12:47:37 - INFO - codeparrot_training - Step 42305: {'lr': 2.8853326712668187e-05, 'samples': 21660672, 'steps': 42305, 'batch_loss/train': 0.5912792640738189}
12/28/2021 12:47:47 - INFO - codeparrot_training - Step 42306: {'lr': 2.8845975608006724e-05, 'samples': 21661184, 'steps': 42306, 'batch_loss/train': 0.710568618029356}
12/28/2021 12:48:01 - INFO - codeparrot_training - Step 42307: {'lr': 2.8838625382570037e-05, 'samples': 21661696, 'steps': 42307, 'batch_loss/train': 0.6665071714669466}
12/28/2021 12:48:12 - INFO - codeparrot_training - Step 42308: {'lr': 2.883127603638755e-05, 'samples': 21662208, 'steps': 42308, 'batch_loss/train': 0.6954288566485047}
12/28/2021 12:48:22 - INFO - codeparrot_training - Step 42309: {'lr': 2.8823927569488427e-05, 'samples': 21662720, 'steps': 42309, 'batch_loss/train': 0.8270247676409781}
12/28/2021 12:48:35 - INFO - codeparrot_training - Step 42310: {'lr': 2.8816579981901735e-05, 'samples': 21663232, 'steps': 42310, 'batch_loss/train': 0.6892850985750556}
12/28/2021 12:48:45 - INFO - codeparrot_training - Step 42311: {'lr': 2.8809233273656893e-05, 'samples': 21663744, 'steps': 42311, 'batch_loss/train': 0.7414511712267995}
12/28/2021 12:48:56 - INFO - codeparrot_training - Step 42312: {'lr': 2.8801887444783042e-05, 'samples': 21664256, 'steps': 42312, 'batch_loss/train': 0.7278420180082321}
12/28/2021 12:49:08 - INFO - codeparrot_training - Step 42313: {'lr': 2.8794542495309355e-05, 'samples': 21664768, 'steps': 42313, 'batch_loss/train': 0.7646220824681222}
12/28/2021 12:49:19 - INFO - codeparrot_training - Step 42314: {'lr': 2.8787198425265003e-05, 'samples': 21665280, 'steps': 42314, 'batch_loss/train': 0.8552349610254169}
12/28/2021 12:49:29 - INFO - codeparrot_training - Step 42315: {'lr': 2.8779855234679242e-05, 'samples': 21665792, 'steps': 42315, 'batch_loss/train': 0.809620937332511}
12/28/2021 12:49:40 - INFO - codeparrot_training - Step 42316: {'lr': 2.877251292358124e-05, 'samples': 21666304, 'steps': 42316, 'batch_loss/train': 0.6803987188031897}
12/28/2021 12:49:52 - INFO - codeparrot_training - Step 42317: {'lr': 2.8765171492000196e-05, 'samples': 21666816, 'steps': 42317, 'batch_loss/train': 0.780994608416222}
12/28/2021 12:50:03 - INFO - codeparrot_training - Step 42318: {'lr': 2.8757830939965312e-05, 'samples': 21667328, 'steps': 42318, 'batch_loss/train': 0.7810532352887094}
12/28/2021 12:50:13 - INFO - codeparrot_training - Step 42319: {'lr': 2.8750491267505728e-05, 'samples': 21667840, 'steps': 42319, 'batch_loss/train': 0.7484686383977532}
12/28/2021 12:50:27 - INFO - codeparrot_training - Step 42320: {'lr': 2.8743152474650675e-05, 'samples': 21668352, 'steps': 42320, 'batch_loss/train': 0.7705353759229183}
12/28/2021 12:50:38 - INFO - codeparrot_training - Step 42321: {'lr': 2.873581456142932e-05, 'samples': 21668864, 'steps': 42321, 'batch_loss/train': 0.6768548139370978}
12/28/2021 12:50:49 - INFO - codeparrot_training - Step 42322: {'lr': 2.872847752787075e-05, 'samples': 21669376, 'steps': 42322, 'batch_loss/train': 0.6639784760773182}
12/28/2021 12:51:01 - INFO - codeparrot_training - Step 42323: {'lr': 2.8721141374004305e-05, 'samples': 21669888, 'steps': 42323, 'batch_loss/train': 0.8925459096208215}
12/28/2021 12:51:11 - INFO - codeparrot_training - Step 42324: {'lr': 2.871380609985899e-05, 'samples': 21670400, 'steps': 42324, 'batch_loss/train': 0.8417088421992958}
12/28/2021 12:51:22 - INFO - codeparrot_training - Step 42325: {'lr': 2.8706471705463978e-05, 'samples': 21670912, 'steps': 42325, 'batch_loss/train': 0.7276354295900092}
12/28/2021 12:51:33 - INFO - codeparrot_training - Step 42326: {'lr': 2.8699138190848573e-05, 'samples': 21671424, 'steps': 42326, 'batch_loss/train': 0.7304872907698154}
12/28/2021 12:51:46 - INFO - codeparrot_training - Step 42327: {'lr': 2.8691805556041784e-05, 'samples': 21671936, 'steps': 42327, 'batch_loss/train': 0.8168535716831684}
12/28/2021 12:51:57 - INFO - codeparrot_training - Step 42328: {'lr': 2.8684473801072806e-05, 'samples': 21672448, 'steps': 42328, 'batch_loss/train': 0.835046922788024}
12/28/2021 12:52:08 - INFO - codeparrot_training - Step 42329: {'lr': 2.8677142925970816e-05, 'samples': 21672960, 'steps': 42329, 'batch_loss/train': 0.7438598000444472}
12/28/2021 12:52:20 - INFO - codeparrot_training - Step 42330: {'lr': 2.8669812930764926e-05, 'samples': 21673472, 'steps': 42330, 'batch_loss/train': 0.7167154978960752}
12/28/2021 12:52:31 - INFO - codeparrot_training - Step 42331: {'lr': 2.866248381548431e-05, 'samples': 21673984, 'steps': 42331, 'batch_loss/train': 0.6886926963925362}
12/28/2021 12:52:41 - INFO - codeparrot_training - Step 42332: {'lr': 2.8655155580158104e-05, 'samples': 21674496, 'steps': 42332, 'batch_loss/train': 0.7649396606720984}
12/28/2021 12:52:53 - INFO - codeparrot_training - Step 42333: {'lr': 2.8647828224815348e-05, 'samples': 21675008, 'steps': 42333, 'batch_loss/train': 0.7832752401009202}
12/28/2021 12:53:04 - INFO - codeparrot_training - Step 42334: {'lr': 2.8640501749485266e-05, 'samples': 21675520, 'steps': 42334, 'batch_loss/train': 0.8102231919765472}
12/28/2021 12:53:15 - INFO - codeparrot_training - Step 42335: {'lr': 2.8633176154197056e-05, 'samples': 21676032, 'steps': 42335, 'batch_loss/train': 0.7004030626267195}
12/28/2021 12:53:29 - INFO - codeparrot_training - Step 42336: {'lr': 2.862585143897961e-05, 'samples': 21676544, 'steps': 42336, 'batch_loss/train': 0.72235266584903}
12/28/2021 12:53:40 - INFO - codeparrot_training - Step 42337: {'lr': 2.861852760386227e-05, 'samples': 21677056, 'steps': 42337, 'batch_loss/train': 0.5092599630297627}
12/28/2021 12:53:50 - INFO - codeparrot_training - Step 42338: {'lr': 2.861120464887412e-05, 'samples': 21677568, 'steps': 42338, 'batch_loss/train': 0.7925181584432721}
12/28/2021 12:54:01 - INFO - codeparrot_training - Step 42339: {'lr': 2.8603882574044172e-05, 'samples': 21678080, 'steps': 42339, 'batch_loss/train': 0.6492670958396047}
12/28/2021 12:54:13 - INFO - codeparrot_training - Step 42340: {'lr': 2.859656137940153e-05, 'samples': 21678592, 'steps': 42340, 'batch_loss/train': 0.6375485979951918}
12/28/2021 12:54:24 - INFO - codeparrot_training - Step 42341: {'lr': 2.8589241064975487e-05, 'samples': 21679104, 'steps': 42341, 'batch_loss/train': 0.7179537634365261}
12/28/2021 12:54:34 - INFO - codeparrot_training - Step 42342: {'lr': 2.8581921630794955e-05, 'samples': 21679616, 'steps': 42342, 'batch_loss/train': 0.7689396254718304}
12/28/2021 12:54:47 - INFO - codeparrot_training - Step 42343: {'lr': 2.857460307688911e-05, 'samples': 21680128, 'steps': 42343, 'batch_loss/train': 0.7285531295929104}
12/28/2021 12:54:57 - INFO - codeparrot_training - Step 42344: {'lr': 2.8567285403287012e-05, 'samples': 21680640, 'steps': 42344, 'batch_loss/train': 0.6978135717799887}
12/28/2021 12:55:08 - INFO - codeparrot_training - Step 42345: {'lr': 2.8559968610017804e-05, 'samples': 21681152, 'steps': 42345, 'batch_loss/train': 0.7289592530578375}
12/28/2021 12:55:22 - INFO - codeparrot_training - Step 42346: {'lr': 2.855265269711055e-05, 'samples': 21681664, 'steps': 42346, 'batch_loss/train': 0.7074131374247372}
12/28/2021 12:55:32 - INFO - codeparrot_training - Step 42347: {'lr': 2.8545337664594302e-05, 'samples': 21682176, 'steps': 42347, 'batch_loss/train': 0.7371039697900414}
12/28/2021 12:55:43 - INFO - codeparrot_training - Step 42348: {'lr': 2.8538023512498206e-05, 'samples': 21682688, 'steps': 42348, 'batch_loss/train': 0.6706536035053432}
12/28/2021 12:55:54 - INFO - codeparrot_training - Step 42349: {'lr': 2.8530710240851272e-05, 'samples': 21683200, 'steps': 42349, 'batch_loss/train': 0.6941067562438548}
12/28/2021 12:56:06 - INFO - codeparrot_training - Step 42350: {'lr': 2.852339784968269e-05, 'samples': 21683712, 'steps': 42350, 'batch_loss/train': 0.7266929121688008}
12/28/2021 12:56:16 - INFO - codeparrot_training - Step 42351: {'lr': 2.8516086339021337e-05, 'samples': 21684224, 'steps': 42351, 'batch_loss/train': 0.675871997140348}
12/28/2021 12:56:27 - INFO - codeparrot_training - Step 42352: {'lr': 2.8508775708896512e-05, 'samples': 21684736, 'steps': 42352, 'batch_loss/train': 0.5378399635665119}
12/28/2021 12:56:39 - INFO - codeparrot_training - Step 42353: {'lr': 2.8501465959337115e-05, 'samples': 21685248, 'steps': 42353, 'batch_loss/train': 0.7528833746910095}
12/28/2021 12:56:50 - INFO - codeparrot_training - Step 42354: {'lr': 2.849415709037223e-05, 'samples': 21685760, 'steps': 42354, 'batch_loss/train': 0.7102760479319841}
12/28/2021 12:57:00 - INFO - codeparrot_training - Step 42355: {'lr': 2.848684910203095e-05, 'samples': 21686272, 'steps': 42355, 'batch_loss/train': 0.7122997860424221}
12/28/2021 12:57:14 - INFO - codeparrot_training - Step 42356: {'lr': 2.8479541994342327e-05, 'samples': 21686784, 'steps': 42356, 'batch_loss/train': 0.7619574889540672}
12/28/2021 12:57:25 - INFO - codeparrot_training - Step 42357: {'lr': 2.84722357673354e-05, 'samples': 21687296, 'steps': 42357, 'batch_loss/train': 0.7911930745467544}
12/28/2021 12:57:36 - INFO - codeparrot_training - Step 42358: {'lr': 2.846493042103926e-05, 'samples': 21687808, 'steps': 42358, 'batch_loss/train': 0.7637014172505587}
12/28/2021 12:57:46 - INFO - codeparrot_training - Step 42359: {'lr': 2.845762595548282e-05, 'samples': 21688320, 'steps': 42359, 'batch_loss/train': 0.6899459373671561}
12/28/2021 12:57:58 - INFO - codeparrot_training - Step 42360: {'lr': 2.8450322370695253e-05, 'samples': 21688832, 'steps': 42360, 'batch_loss/train': 0.8208324424922466}
12/28/2021 12:58:09 - INFO - codeparrot_training - Step 42361: {'lr': 2.8443019666705593e-05, 'samples': 21689344, 'steps': 42361, 'batch_loss/train': 0.8501810766756535}
12/28/2021 12:58:19 - INFO - codeparrot_training - Step 42362: {'lr': 2.8435717843542737e-05, 'samples': 21689856, 'steps': 42362, 'batch_loss/train': 0.7396309762261808}
12/28/2021 12:58:32 - INFO - codeparrot_training - Step 42363: {'lr': 2.8428416901235847e-05, 'samples': 21690368, 'steps': 42363, 'batch_loss/train': 0.7589835114777088}
12/28/2021 12:58:42 - INFO - codeparrot_training - Step 42364: {'lr': 2.8421116839813965e-05, 'samples': 21690880, 'steps': 42364, 'batch_loss/train': 0.7703363941982388}
12/28/2021 12:58:53 - INFO - codeparrot_training - Step 42365: {'lr': 2.8413817659306006e-05, 'samples': 21691392, 'steps': 42365, 'batch_loss/train': 0.5972655159421265}
12/28/2021 12:59:07 - INFO - codeparrot_training - Step 42366: {'lr': 2.8406519359741007e-05, 'samples': 21691904, 'steps': 42366, 'batch_loss/train': 0.6891534868627787}
12/28/2021 12:59:18 - INFO - codeparrot_training - Step 42367: {'lr': 2.839922194114808e-05, 'samples': 21692416, 'steps': 42367, 'batch_loss/train': 0.7646116614341736}
12/28/2021 12:59:28 - INFO - codeparrot_training - Step 42368: {'lr': 2.8391925403556146e-05, 'samples': 21692928, 'steps': 42368, 'batch_loss/train': 0.7076531518250704}
12/28/2021 12:59:39 - INFO - codeparrot_training - Step 42369: {'lr': 2.8384629746994238e-05, 'samples': 21693440, 'steps': 42369, 'batch_loss/train': 0.7225329885259271}
12/28/2021 12:59:51 - INFO - codeparrot_training - Step 42370: {'lr': 2.837733497149139e-05, 'samples': 21693952, 'steps': 42370, 'batch_loss/train': 0.7104433537460864}
12/28/2021 13:00:02 - INFO - codeparrot_training - Step 42371: {'lr': 2.8370041077076552e-05, 'samples': 21694464, 'steps': 42371, 'batch_loss/train': 0.8534915177151561}
12/28/2021 13:00:12 - INFO - codeparrot_training - Step 42372: {'lr': 2.8362748063778725e-05, 'samples': 21694976, 'steps': 42372, 'batch_loss/train': 0.7405082164332271}
12/28/2021 13:00:26 - INFO - codeparrot_training - Step 42373: {'lr': 2.835545593162697e-05, 'samples': 21695488, 'steps': 42373, 'batch_loss/train': 0.6276261461898685}
12/28/2021 13:00:37 - INFO - codeparrot_training - Step 42374: {'lr': 2.8348164680650214e-05, 'samples': 21696000, 'steps': 42374, 'batch_loss/train': 0.7731631007045507}
12/28/2021 13:00:48 - INFO - codeparrot_training - Step 42375: {'lr': 2.834087431087745e-05, 'samples': 21696512, 'steps': 42375, 'batch_loss/train': 0.7195246191695333}
12/28/2021 13:01:00 - INFO - codeparrot_training - Step 42376: {'lr': 2.833358482233772e-05, 'samples': 21697024, 'steps': 42376, 'batch_loss/train': 0.585670827422291}
12/28/2021 13:01:11 - INFO - codeparrot_training - Step 42377: {'lr': 2.832629621505989e-05, 'samples': 21697536, 'steps': 42377, 'batch_loss/train': 0.7552249734289944}
12/28/2021 13:01:21 - INFO - codeparrot_training - Step 42378: {'lr': 2.8319008489073066e-05, 'samples': 21698048, 'steps': 42378, 'batch_loss/train': 0.7299008566769771}
12/28/2021 13:01:32 - INFO - codeparrot_training - Step 42379: {'lr': 2.8311721644406152e-05, 'samples': 21698560, 'steps': 42379, 'batch_loss/train': 0.8192129712551832}
12/28/2021 13:01:44 - INFO - codeparrot_training - Step 42380: {'lr': 2.8304435681088092e-05, 'samples': 21699072, 'steps': 42380, 'batch_loss/train': 0.7628131718374789}
12/28/2021 13:01:55 - INFO - codeparrot_training - Step 42381: {'lr': 2.829715059914792e-05, 'samples': 21699584, 'steps': 42381, 'batch_loss/train': 0.6816727081313729}
12/28/2021 13:02:05 - INFO - codeparrot_training - Step 42382: {'lr': 2.8289866398614527e-05, 'samples': 21700096, 'steps': 42382, 'batch_loss/train': 0.7450349563732743}
12/28/2021 13:02:19 - INFO - codeparrot_training - Step 42383: {'lr': 2.828258307951695e-05, 'samples': 21700608, 'steps': 42383, 'batch_loss/train': 0.8646040551830083}
12/28/2021 13:02:30 - INFO - codeparrot_training - Step 42384: {'lr': 2.8275300641884078e-05, 'samples': 21701120, 'steps': 42384, 'batch_loss/train': 0.6610297767911106}
12/28/2021 13:02:41 - INFO - codeparrot_training - Step 42385: {'lr': 2.826801908574489e-05, 'samples': 21701632, 'steps': 42385, 'batch_loss/train': 0.709633210208267}
12/28/2021 13:02:53 - INFO - codeparrot_training - Step 42386: {'lr': 2.8260738411128362e-05, 'samples': 21702144, 'steps': 42386, 'batch_loss/train': 0.6274541588500142}
12/28/2021 13:03:03 - INFO - codeparrot_training - Step 42387: {'lr': 2.8253458618063443e-05, 'samples': 21702656, 'steps': 42387, 'batch_loss/train': 0.7447728966362774}
12/28/2021 13:03:14 - INFO - codeparrot_training - Step 42388: {'lr': 2.8246179706578943e-05, 'samples': 21703168, 'steps': 42388, 'batch_loss/train': 0.7508887471631169}
12/28/2021 13:03:25 - INFO - codeparrot_training - Step 42389: {'lr': 2.8238901676703954e-05, 'samples': 21703680, 'steps': 42389, 'batch_loss/train': 0.8400836391374469}
12/28/2021 13:03:37 - INFO - codeparrot_training - Step 42390: {'lr': 2.823162452846739e-05, 'samples': 21704192, 'steps': 42390, 'batch_loss/train': 0.7547259824350476}
12/28/2021 13:03:47 - INFO - codeparrot_training - Step 42391: {'lr': 2.8224348261898127e-05, 'samples': 21704704, 'steps': 42391, 'batch_loss/train': 0.742630158085376}
12/28/2021 13:03:58 - INFO - codeparrot_training - Step 42392: {'lr': 2.8217072877025074e-05, 'samples': 21705216, 'steps': 42392, 'batch_loss/train': 0.6913641928695142}
12/28/2021 13:04:10 - INFO - codeparrot_training - Step 42393: {'lr': 2.820979837387727e-05, 'samples': 21705728, 'steps': 42393, 'batch_loss/train': 0.5766795915551484}
12/28/2021 13:04:21 - INFO - codeparrot_training - Step 42394: {'lr': 2.8202524752483527e-05, 'samples': 21706240, 'steps': 42394, 'batch_loss/train': 0.654614003142342}
12/28/2021 13:04:31 - INFO - codeparrot_training - Step 42395: {'lr': 2.819525201287279e-05, 'samples': 21706752, 'steps': 42395, 'batch_loss/train': 0.7305817473679781}
12/28/2021 13:04:45 - INFO - codeparrot_training - Step 42396: {'lr': 2.8187980155073983e-05, 'samples': 21707264, 'steps': 42396, 'batch_loss/train': 0.7539581563323736}
12/28/2021 13:04:56 - INFO - codeparrot_training - Step 42397: {'lr': 2.8180709179116027e-05, 'samples': 21707776, 'steps': 42397, 'batch_loss/train': 0.9351032112026587}
12/28/2021 13:05:07 - INFO - codeparrot_training - Step 42398: {'lr': 2.8173439085027814e-05, 'samples': 21708288, 'steps': 42398, 'batch_loss/train': 0.7034155614674091}
12/28/2021 13:05:17 - INFO - codeparrot_training - Step 42399: {'lr': 2.8166169872838236e-05, 'samples': 21708800, 'steps': 42399, 'batch_loss/train': 0.7246999095659703}
12/28/2021 13:05:30 - INFO - codeparrot_training - Step 42400: {'lr': 2.8158901542576192e-05, 'samples': 21709312, 'steps': 42400, 'batch_loss/train': 0.7337188143283129}
12/28/2021 13:05:40 - INFO - codeparrot_training - Step 42401: {'lr': 2.815163409427063e-05, 'samples': 21709824, 'steps': 42401, 'batch_loss/train': 0.7198491645976901}
12/28/2021 13:05:51 - INFO - codeparrot_training - Step 42402: {'lr': 2.814436752795041e-05, 'samples': 21710336, 'steps': 42402, 'batch_loss/train': 0.7903961362317204}
12/28/2021 13:06:05 - INFO - codeparrot_training - Step 42403: {'lr': 2.8137101843644346e-05, 'samples': 21710848, 'steps': 42403, 'batch_loss/train': 0.7523646233603358}
12/28/2021 13:06:15 - INFO - codeparrot_training - Step 42404: {'lr': 2.81298370413815e-05, 'samples': 21711360, 'steps': 42404, 'batch_loss/train': 0.7632668842561543}
12/28/2021 13:06:26 - INFO - codeparrot_training - Step 42405: {'lr': 2.812257312119057e-05, 'samples': 21711872, 'steps': 42405, 'batch_loss/train': 0.8658626191318035}
12/28/2021 13:06:38 - INFO - codeparrot_training - Step 42406: {'lr': 2.811531008310053e-05, 'samples': 21712384, 'steps': 42406, 'batch_loss/train': 0.6941713951528072}
12/28/2021 13:06:49 - INFO - codeparrot_training - Step 42407: {'lr': 2.8108047927140222e-05, 'samples': 21712896, 'steps': 42407, 'batch_loss/train': 0.6396227600052953}
12/28/2021 13:06:59 - INFO - codeparrot_training - Step 42408: {'lr': 2.810078665333854e-05, 'samples': 21713408, 'steps': 42408, 'batch_loss/train': 0.7414492616662756}
12/28/2021 13:07:10 - INFO - codeparrot_training - Step 42409: {'lr': 2.8093526261724344e-05, 'samples': 21713920, 'steps': 42409, 'batch_loss/train': 0.7695392481982708}
12/28/2021 13:07:22 - INFO - codeparrot_training - Step 42410: {'lr': 2.8086266752326505e-05, 'samples': 21714432, 'steps': 42410, 'batch_loss/train': 0.7083775659557432}
12/28/2021 13:07:33 - INFO - codeparrot_training - Step 42411: {'lr': 2.8079008125173887e-05, 'samples': 21714944, 'steps': 42411, 'batch_loss/train': 0.7399166310206056}
12/28/2021 13:07:44 - INFO - codeparrot_training - Step 42412: {'lr': 2.8071750380295304e-05, 'samples': 21715456, 'steps': 42412, 'batch_loss/train': 0.5194373440463096}
12/28/2021 13:07:58 - INFO - codeparrot_training - Step 42413: {'lr': 2.8064493517719724e-05, 'samples': 21715968, 'steps': 42413, 'batch_loss/train': 0.7017886964604259}
12/28/2021 13:08:08 - INFO - codeparrot_training - Step 42414: {'lr': 2.80572375374758e-05, 'samples': 21716480, 'steps': 42414, 'batch_loss/train': 0.7110953312367201}
12/28/2021 13:08:19 - INFO - codeparrot_training - Step 42415: {'lr': 2.8049982439592557e-05, 'samples': 21716992, 'steps': 42415, 'batch_loss/train': 0.6748038266086951}
12/28/2021 13:08:31 - INFO - codeparrot_training - Step 42416: {'lr': 2.804272822409884e-05, 'samples': 21717504, 'steps': 42416, 'batch_loss/train': 0.7488306472077966}
12/28/2021 13:08:42 - INFO - codeparrot_training - Step 42417: {'lr': 2.803547489102337e-05, 'samples': 21718016, 'steps': 42417, 'batch_loss/train': 0.701291523873806}
12/28/2021 13:08:52 - INFO - codeparrot_training - Step 42418: {'lr': 2.802822244039499e-05, 'samples': 21718528, 'steps': 42418, 'batch_loss/train': 0.8002557219006121}
12/28/2021 13:09:04 - INFO - codeparrot_training - Step 42419: {'lr': 2.8020970872242673e-05, 'samples': 21719040, 'steps': 42419, 'batch_loss/train': 0.6582194688962772}
12/28/2021 13:09:15 - INFO - codeparrot_training - Step 42420: {'lr': 2.801372018659512e-05, 'samples': 21719552, 'steps': 42420, 'batch_loss/train': 0.6918988130055368}
12/28/2021 13:09:25 - INFO - codeparrot_training - Step 42421: {'lr': 2.800647038348114e-05, 'samples': 21720064, 'steps': 42421, 'batch_loss/train': 0.7432077070698142}
12/28/2021 13:09:36 - INFO - codeparrot_training - Step 42422: {'lr': 2.7999221462929742e-05, 'samples': 21720576, 'steps': 42422, 'batch_loss/train': 0.7557685617357492}
12/28/2021 13:09:48 - INFO - codeparrot_training - Step 42423: {'lr': 2.7991973424969537e-05, 'samples': 21721088, 'steps': 42423, 'batch_loss/train': 0.5966391160618514}
12/28/2021 13:09:59 - INFO - codeparrot_training - Step 42424: {'lr': 2.798472626962942e-05, 'samples': 21721600, 'steps': 42424, 'batch_loss/train': 0.7843004586175084}
12/28/2021 13:10:10 - INFO - codeparrot_training - Step 42425: {'lr': 2.79774799969382e-05, 'samples': 21722112, 'steps': 42425, 'batch_loss/train': 0.7444934826344252}
12/28/2021 13:10:23 - INFO - codeparrot_training - Step 42426: {'lr': 2.797023460692469e-05, 'samples': 21722624, 'steps': 42426, 'batch_loss/train': 0.7060801875777543}
12/28/2021 13:10:34 - INFO - codeparrot_training - Step 42427: {'lr': 2.7962990099617697e-05, 'samples': 21723136, 'steps': 42427, 'batch_loss/train': 0.7796091232448816}
12/28/2021 13:10:44 - INFO - codeparrot_training - Step 42428: {'lr': 2.795574647504606e-05, 'samples': 21723648, 'steps': 42428, 'batch_loss/train': 0.6826468661893159}
12/28/2021 13:10:56 - INFO - codeparrot_training - Step 42429: {'lr': 2.7948503733238428e-05, 'samples': 21724160, 'steps': 42429, 'batch_loss/train': 0.6979129938408732}
12/28/2021 13:11:07 - INFO - codeparrot_training - Step 42430: {'lr': 2.7941261874223772e-05, 'samples': 21724672, 'steps': 42430, 'batch_loss/train': 0.6976593071594834}
12/28/2021 13:11:18 - INFO - codeparrot_training - Step 42431: {'lr': 2.7934020898030848e-05, 'samples': 21725184, 'steps': 42431, 'batch_loss/train': 0.6076366789638996}
12/28/2021 13:11:28 - INFO - codeparrot_training - Step 42432: {'lr': 2.7926780804688358e-05, 'samples': 21725696, 'steps': 42432, 'batch_loss/train': 0.7778000961989164}
12/28/2021 13:11:42 - INFO - codeparrot_training - Step 42433: {'lr': 2.791954159422516e-05, 'samples': 21726208, 'steps': 42433, 'batch_loss/train': 0.5401940706651658}
12/28/2021 13:11:53 - INFO - codeparrot_training - Step 42434: {'lr': 2.7912303266669992e-05, 'samples': 21726720, 'steps': 42434, 'batch_loss/train': 0.7710363729856908}
12/28/2021 13:12:04 - INFO - codeparrot_training - Step 42435: {'lr': 2.7905065822051627e-05, 'samples': 21727232, 'steps': 42435, 'batch_loss/train': 0.7484196410514414}
12/28/2021 13:12:16 - INFO - codeparrot_training - Step 42436: {'lr': 2.7897829260398882e-05, 'samples': 21727744, 'steps': 42436, 'batch_loss/train': 0.7881839587935247}
12/28/2021 13:12:26 - INFO - codeparrot_training - Step 42437: {'lr': 2.7890593581740508e-05, 'samples': 21728256, 'steps': 42437, 'batch_loss/train': 0.8240091092884541}
12/28/2021 13:12:37 - INFO - codeparrot_training - Step 42438: {'lr': 2.788335878610526e-05, 'samples': 21728768, 'steps': 42438, 'batch_loss/train': 0.6379252881743014}
12/28/2021 13:12:49 - INFO - codeparrot_training - Step 42439: {'lr': 2.7876124873521925e-05, 'samples': 21729280, 'steps': 42439, 'batch_loss/train': 1.3883983560372144}
12/28/2021 13:13:00 - INFO - codeparrot_training - Step 42440: {'lr': 2.786889184401925e-05, 'samples': 21729792, 'steps': 42440, 'batch_loss/train': 0.7381153553724289}
12/28/2021 13:13:11 - INFO - codeparrot_training - Step 42441: {'lr': 2.7861659697625967e-05, 'samples': 21730304, 'steps': 42441, 'batch_loss/train': 0.6488305772654712}
12/28/2021 13:13:21 - INFO - codeparrot_training - Step 42442: {'lr': 2.7854428434370915e-05, 'samples': 21730816, 'steps': 42442, 'batch_loss/train': 0.7901869025081396}
12/28/2021 13:13:35 - INFO - codeparrot_training - Step 42443: {'lr': 2.7847198054282707e-05, 'samples': 21731328, 'steps': 42443, 'batch_loss/train': 0.8341597723774612}
12/28/2021 13:13:46 - INFO - codeparrot_training - Step 42444: {'lr': 2.7839968557390126e-05, 'samples': 21731840, 'steps': 42444, 'batch_loss/train': 0.769114226102829}
12/28/2021 13:13:56 - INFO - codeparrot_training - Step 42445: {'lr': 2.783273994372204e-05, 'samples': 21732352, 'steps': 42445, 'batch_loss/train': 1.0118652950040996}
12/28/2021 13:14:08 - INFO - codeparrot_training - Step 42446: {'lr': 2.7825512213307036e-05, 'samples': 21732864, 'steps': 42446, 'batch_loss/train': 0.795604023616761}
12/28/2021 13:14:19 - INFO - codeparrot_training - Step 42447: {'lr': 2.7818285366173868e-05, 'samples': 21733376, 'steps': 42447, 'batch_loss/train': 0.722171665285714}
12/28/2021 13:14:30 - INFO - codeparrot_training - Step 42448: {'lr': 2.78110594023514e-05, 'samples': 21733888, 'steps': 42448, 'batch_loss/train': 0.7989903502166271}
12/28/2021 13:14:42 - INFO - codeparrot_training - Step 42449: {'lr': 2.78038343218682e-05, 'samples': 21734400, 'steps': 42449, 'batch_loss/train': 0.811983204446733}
12/28/2021 13:14:52 - INFO - codeparrot_training - Step 42450: {'lr': 2.779661012475307e-05, 'samples': 21734912, 'steps': 42450, 'batch_loss/train': 0.7251607356593013}
12/28/2021 13:15:03 - INFO - codeparrot_training - Step 42451: {'lr': 2.778938681103471e-05, 'samples': 21735424, 'steps': 42451, 'batch_loss/train': 0.6520466923248023}
12/28/2021 13:15:13 - INFO - codeparrot_training - Step 42452: {'lr': 2.7782164380741853e-05, 'samples': 21735936, 'steps': 42452, 'batch_loss/train': 0.625341618899256}
12/28/2021 13:15:27 - INFO - codeparrot_training - Step 42453: {'lr': 2.777494283390322e-05, 'samples': 21736448, 'steps': 42453, 'batch_loss/train': 0.6816876232624054}
12/28/2021 13:15:38 - INFO - codeparrot_training - Step 42454: {'lr': 2.7767722170547514e-05, 'samples': 21736960, 'steps': 42454, 'batch_loss/train': 0.7284287353977561}
12/28/2021 13:15:49 - INFO - codeparrot_training - Step 42455: {'lr': 2.7760502390703378e-05, 'samples': 21737472, 'steps': 42455, 'batch_loss/train': 0.6828666548244655}
12/28/2021 13:16:01 - INFO - codeparrot_training - Step 42456: {'lr': 2.7753283494399593e-05, 'samples': 21737984, 'steps': 42456, 'batch_loss/train': 0.7208312912844121}
12/28/2021 13:16:11 - INFO - codeparrot_training - Step 42457: {'lr': 2.774606548166489e-05, 'samples': 21738496, 'steps': 42457, 'batch_loss/train': 0.7134509230963886}
12/28/2021 13:16:22 - INFO - codeparrot_training - Step 42458: {'lr': 2.7738848352527822e-05, 'samples': 21739008, 'steps': 42458, 'batch_loss/train': 0.7382389204576612}
12/28/2021 13:16:34 - INFO - codeparrot_training - Step 42459: {'lr': 2.773163210701726e-05, 'samples': 21739520, 'steps': 42459, 'batch_loss/train': 0.7395894005894661}
12/28/2021 13:16:45 - INFO - codeparrot_training - Step 42460: {'lr': 2.772441674516174e-05, 'samples': 21740032, 'steps': 42460, 'batch_loss/train': 0.7557442770339549}
12/28/2021 13:16:56 - INFO - codeparrot_training - Step 42461: {'lr': 2.7717202266990038e-05, 'samples': 21740544, 'steps': 42461, 'batch_loss/train': 0.7884063720703125}
12/28/2021 13:17:06 - INFO - codeparrot_training - Step 42462: {'lr': 2.7709988672530802e-05, 'samples': 21741056, 'steps': 42462, 'batch_loss/train': 0.761058063886594}
12/28/2021 13:17:19 - INFO - codeparrot_training - Step 42463: {'lr': 2.7702775961812698e-05, 'samples': 21741568, 'steps': 42463, 'batch_loss/train': 0.9054101714864373}
12/28/2021 13:17:29 - INFO - codeparrot_training - Step 42464: {'lr': 2.769556413486443e-05, 'samples': 21742080, 'steps': 42464, 'batch_loss/train': 0.772924785502255}
12/28/2021 13:17:40 - INFO - codeparrot_training - Step 42465: {'lr': 2.768835319171467e-05, 'samples': 21742592, 'steps': 42465, 'batch_loss/train': 0.44834332005120814}
12/28/2021 13:17:54 - INFO - codeparrot_training - Step 42466: {'lr': 2.7681143132392062e-05, 'samples': 21743104, 'steps': 42466, 'batch_loss/train': 0.9216730322223157}
12/28/2021 13:18:05 - INFO - codeparrot_training - Step 42467: {'lr': 2.76739339569253e-05, 'samples': 21743616, 'steps': 42467, 'batch_loss/train': 0.6955186887644231}
12/28/2021 13:18:15 - INFO - codeparrot_training - Step 42468: {'lr': 2.766672566534309e-05, 'samples': 21744128, 'steps': 42468, 'batch_loss/train': 0.7795752054080367}
12/28/2021 13:18:27 - INFO - codeparrot_training - Step 42469: {'lr': 2.765951825767396e-05, 'samples': 21744640, 'steps': 42469, 'batch_loss/train': 0.6523179637733847}
12/28/2021 13:18:38 - INFO - codeparrot_training - Step 42470: {'lr': 2.7652311733946584e-05, 'samples': 21745152, 'steps': 42470, 'batch_loss/train': 0.821408633608371}
12/28/2021 13:18:49 - INFO - codeparrot_training - Step 42471: {'lr': 2.7645106094189742e-05, 'samples': 21745664, 'steps': 42471, 'batch_loss/train': 0.6106278292136267}
12/28/2021 13:18:59 - INFO - codeparrot_training - Step 42472: {'lr': 2.763790133843197e-05, 'samples': 21746176, 'steps': 42472, 'batch_loss/train': 0.7613845709711313}
12/28/2021 13:19:14 - INFO - codeparrot_training - Step 42473: {'lr': 2.763069746670191e-05, 'samples': 21746688, 'steps': 42473, 'batch_loss/train': 0.6807342073880136}
12/28/2021 13:19:25 - INFO - codeparrot_training - Step 42474: {'lr': 2.7623494479028317e-05, 'samples': 21747200, 'steps': 42474, 'batch_loss/train': 0.6475966391153634}
12/28/2021 13:19:35 - INFO - codeparrot_training - Step 42475: {'lr': 2.7616292375439693e-05, 'samples': 21747712, 'steps': 42475, 'batch_loss/train': 0.8158623361960053}
12/28/2021 13:19:48 - INFO - codeparrot_training - Step 42476: {'lr': 2.7609091155964717e-05, 'samples': 21748224, 'steps': 42476, 'batch_loss/train': 1.9549901261925697}
12/28/2021 13:19:58 - INFO - codeparrot_training - Step 42477: {'lr': 2.7601890820632025e-05, 'samples': 21748736, 'steps': 42477, 'batch_loss/train': 0.7157039120793343}
12/28/2021 13:20:09 - INFO - codeparrot_training - Step 42478: {'lr': 2.7594691369470236e-05, 'samples': 21749248, 'steps': 42478, 'batch_loss/train': 0.7185192593606189}
12/28/2021 13:20:19 - INFO - codeparrot_training - Step 42479: {'lr': 2.7587492802507997e-05, 'samples': 21749760, 'steps': 42479, 'batch_loss/train': 0.7785386312752962}
12/28/2021 13:20:31 - INFO - codeparrot_training - Step 42480: {'lr': 2.7580295119773946e-05, 'samples': 21750272, 'steps': 42480, 'batch_loss/train': 0.8081485666334629}
12/28/2021 13:20:42 - INFO - codeparrot_training - Step 42481: {'lr': 2.7573098321296537e-05, 'samples': 21750784, 'steps': 42481, 'batch_loss/train': 0.7080979589372873}
12/28/2021 13:20:53 - INFO - codeparrot_training - Step 42482: {'lr': 2.7565902407104575e-05, 'samples': 21751296, 'steps': 42482, 'batch_loss/train': 0.7943774070590734}
12/28/2021 13:21:07 - INFO - codeparrot_training - Step 42483: {'lr': 2.7558707377226655e-05, 'samples': 21751808, 'steps': 42483, 'batch_loss/train': 0.6869534244760871}
12/28/2021 13:21:18 - INFO - codeparrot_training - Step 42484: {'lr': 2.7551513231691224e-05, 'samples': 21752320, 'steps': 42484, 'batch_loss/train': 0.4808771114330739}
12/28/2021 13:21:28 - INFO - codeparrot_training - Step 42485: {'lr': 2.754431997052706e-05, 'samples': 21752832, 'steps': 42485, 'batch_loss/train': 0.7974307769909501}
12/28/2021 13:21:40 - INFO - codeparrot_training - Step 42486: {'lr': 2.7537127593762644e-05, 'samples': 21753344, 'steps': 42486, 'batch_loss/train': 0.758601248729974}
12/28/2021 13:21:51 - INFO - codeparrot_training - Step 42487: {'lr': 2.7529936101426623e-05, 'samples': 21753856, 'steps': 42487, 'batch_loss/train': 0.7834160490892828}
12/28/2021 13:22:02 - INFO - codeparrot_training - Step 42488: {'lr': 2.752274549354755e-05, 'samples': 21754368, 'steps': 42488, 'batch_loss/train': 0.7649109102785587}
12/28/2021 13:22:12 - INFO - codeparrot_training - Step 42489: {'lr': 2.7515555770154077e-05, 'samples': 21754880, 'steps': 42489, 'batch_loss/train': 0.6474554045125842}
12/28/2021 13:22:26 - INFO - codeparrot_training - Step 42490: {'lr': 2.7508366931274703e-05, 'samples': 21755392, 'steps': 42490, 'batch_loss/train': 0.7728499453514814}
12/28/2021 13:22:37 - INFO - codeparrot_training - Step 42491: {'lr': 2.7501178976938074e-05, 'samples': 21755904, 'steps': 42491, 'batch_loss/train': 0.7670481922104955}
12/28/2021 13:22:47 - INFO - codeparrot_training - Step 42492: {'lr': 2.749399190717275e-05, 'samples': 21756416, 'steps': 42492, 'batch_loss/train': 0.8402541382238269}
12/28/2021 13:23:00 - INFO - codeparrot_training - Step 42493: {'lr': 2.7486805722007292e-05, 'samples': 21756928, 'steps': 42493, 'batch_loss/train': 0.7798980223014951}
12/28/2021 13:23:10 - INFO - codeparrot_training - Step 42494: {'lr': 2.7479620421470315e-05, 'samples': 21757440, 'steps': 42494, 'batch_loss/train': 0.6654002303257585}
12/28/2021 13:23:21 - INFO - codeparrot_training - Step 42495: {'lr': 2.7472436005590268e-05, 'samples': 21757952, 'steps': 42495, 'batch_loss/train': 0.7354262848384678}
12/28/2021 13:23:33 - INFO - codeparrot_training - Step 42496: {'lr': 2.7465252474395823e-05, 'samples': 21758464, 'steps': 42496, 'batch_loss/train': 0.7384862648323178}
12/28/2021 13:23:44 - INFO - codeparrot_training - Step 42497: {'lr': 2.7458069827915572e-05, 'samples': 21758976, 'steps': 42497, 'batch_loss/train': 0.8084387555718422}
12/28/2021 13:23:54 - INFO - codeparrot_training - Step 42498: {'lr': 2.7450888066177958e-05, 'samples': 21759488, 'steps': 42498, 'batch_loss/train': 0.6762193748727441}
12/28/2021 13:24:05 - INFO - codeparrot_training - Step 42499: {'lr': 2.744370718921152e-05, 'samples': 21760000, 'steps': 42499, 'batch_loss/train': 0.7796046268194914}
12/28/2021 13:24:05 - INFO - codeparrot_training - Evaluating and saving model checkpoint
12/28/2021 13:27:27 - INFO - codeparrot_training - Step 42500: {'loss/eval': 0.7435383200645447, 'perplexity': 2.1033647060394287}
12/28/2021 13:27:46 - WARNING - huggingface_hub.repository - Several commits (15) will be pushed upstream.
12/28/2021 13:28:03 - INFO - codeparrot_training - Step 42500: {'lr': 2.743652719704498e-05, 'samples': 21760512, 'steps': 42500, 'batch_loss/train': 0.7004945147782564}
12/28/2021 13:28:13 - INFO - codeparrot_training - Step 42501: {'lr': 2.7429348089706735e-05, 'samples': 21761024, 'steps': 42501, 'batch_loss/train': 0.6976831629872322}
12/28/2021 13:28:24 - INFO - codeparrot_training - Step 42502: {'lr': 2.7422169867225344e-05, 'samples': 21761536, 'steps': 42502, 'batch_loss/train': 0.8051681257784367}
12/28/2021 13:28:36 - INFO - codeparrot_training - Step 42503: {'lr': 2.741499252962934e-05, 'samples': 21762048, 'steps': 42503, 'batch_loss/train': 0.754157162969932}
12/28/2021 13:28:47 - INFO - codeparrot_training - Step 42504: {'lr': 2.740781607694731e-05, 'samples': 21762560, 'steps': 42504, 'batch_loss/train': 0.6706830877810717}
12/28/2021 13:28:58 - INFO - codeparrot_training - Step 42505: {'lr': 2.7400640509207764e-05, 'samples': 21763072, 'steps': 42505, 'batch_loss/train': 0.7312038233503699}
12/28/2021 13:29:10 - INFO - codeparrot_training - Step 42506: {'lr': 2.7393465826439258e-05, 'samples': 21763584, 'steps': 42506, 'batch_loss/train': 0.7785771051421762}
12/28/2021 13:29:20 - INFO - codeparrot_training - Step 42507: {'lr': 2.7386292028670158e-05, 'samples': 21764096, 'steps': 42507, 'batch_loss/train': 0.6454480632673949}
12/28/2021 13:29:31 - INFO - codeparrot_training - Step 42508: {'lr': 2.737911911592919e-05, 'samples': 21764608, 'steps': 42508, 'batch_loss/train': 0.8011961434967816}
12/28/2021 13:29:43 - INFO - codeparrot_training - Step 42509: {'lr': 2.7371947088244807e-05, 'samples': 21765120, 'steps': 42509, 'batch_loss/train': 0.718452975153923}
12/28/2021 13:29:54 - INFO - codeparrot_training - Step 42510: {'lr': 2.7364775945645432e-05, 'samples': 21765632, 'steps': 42510, 'batch_loss/train': 0.744437062414363}
12/28/2021 13:30:04 - INFO - codeparrot_training - Step 42511: {'lr': 2.7357605688159674e-05, 'samples': 21766144, 'steps': 42511, 'batch_loss/train': 0.7457643551751971}
12/28/2021 13:30:15 - INFO - codeparrot_training - Step 42512: {'lr': 2.7350436315816075e-05, 'samples': 21766656, 'steps': 42512, 'batch_loss/train': 0.781331030651927}
12/28/2021 13:30:29 - INFO - codeparrot_training - Step 42513: {'lr': 2.7343267828642994e-05, 'samples': 21767168, 'steps': 42513, 'batch_loss/train': 0.6672748511191458}
12/28/2021 13:30:40 - INFO - codeparrot_training - Step 42514: {'lr': 2.733610022666902e-05, 'samples': 21767680, 'steps': 42514, 'batch_loss/train': 0.7080229916609824}
12/28/2021 13:30:50 - INFO - codeparrot_training - Step 42515: {'lr': 2.732893350992266e-05, 'samples': 21768192, 'steps': 42515, 'batch_loss/train': 0.782982237637043}
12/28/2021 13:31:02 - INFO - codeparrot_training - Step 42516: {'lr': 2.7321767678432363e-05, 'samples': 21768704, 'steps': 42516, 'batch_loss/train': 0.8302211249247193}
12/28/2021 13:31:13 - INFO - codeparrot_training - Step 42517: {'lr': 2.7314602732226633e-05, 'samples': 21769216, 'steps': 42517, 'batch_loss/train': 0.5815172280999832}
12/28/2021 13:31:24 - INFO - codeparrot_training - Step 42518: {'lr': 2.730743867133395e-05, 'samples': 21769728, 'steps': 42518, 'batch_loss/train': 0.7217157271225005}
12/28/2021 13:31:38 - INFO - codeparrot_training - Step 42519: {'lr': 2.7300275495782816e-05, 'samples': 21770240, 'steps': 42519, 'batch_loss/train': 0.9673175634816289}
12/28/2021 13:31:49 - INFO - codeparrot_training - Step 42520: {'lr': 2.729311320560171e-05, 'samples': 21770752, 'steps': 42520, 'batch_loss/train': 0.7669903440400958}
12/28/2021 13:31:59 - INFO - codeparrot_training - Step 42521: {'lr': 2.728595180081911e-05, 'samples': 21771264, 'steps': 42521, 'batch_loss/train': 0.6837764789815992}
12/28/2021 13:32:10 - INFO - codeparrot_training - Step 42522: {'lr': 2.7278791281463433e-05, 'samples': 21771776, 'steps': 42522, 'batch_loss/train': 0.951064151711762}
12/28/2021 13:32:22 - INFO - codeparrot_training - Step 42523: {'lr': 
2.7271631647563244e-05, 'samples': 21772288, 'steps': 42523, 'batch_loss/train': 0.9202890751184896} 12/28/2021 13:32:33 - INFO - codeparrot_training - Step 42524: {'lr': 2.726447289914691e-05, 'samples': 21772800, 'steps': 42524, 'batch_loss/train': 0.8279555565677583} 12/28/2021 13:32:43 - INFO - codeparrot_training - Step 42525: {'lr': 2.7257315036242875e-05, 'samples': 21773312, 'steps': 42525, 'batch_loss/train': 0.8006839888403192} 12/28/2021 13:32:55 - INFO - codeparrot_training - Step 42526: {'lr': 2.7250158058879758e-05, 'samples': 21773824, 'steps': 42526, 'batch_loss/train': 0.8003761358559132} 12/28/2021 13:33:06 - INFO - codeparrot_training - Step 42527: {'lr': 2.7243001967085847e-05, 'samples': 21774336, 'steps': 42527, 'batch_loss/train': 0.7650872804224491} 12/28/2021 13:33:17 - INFO - codeparrot_training - Step 42528: {'lr': 2.7235846760889666e-05, 'samples': 21774848, 'steps': 42528, 'batch_loss/train': 0.7833704184740782} 12/28/2021 13:33:31 - INFO - codeparrot_training - Step 42529: {'lr': 2.7228692440319643e-05, 'samples': 21775360, 'steps': 42529, 'batch_loss/train': 0.7461020218906924} 12/28/2021 13:33:41 - INFO - codeparrot_training - Step 42530: {'lr': 2.7221539005404196e-05, 'samples': 21775872, 'steps': 42530, 'batch_loss/train': 0.7323348419740796} 12/28/2021 13:33:52 - INFO - codeparrot_training - Step 42531: {'lr': 2.721438645617183e-05, 'samples': 21776384, 'steps': 42531, 'batch_loss/train': 0.7417780694086105} 12/28/2021 13:34:04 - INFO - codeparrot_training - Step 42532: {'lr': 2.7207234792650914e-05, 'samples': 21776896, 'steps': 42532, 'batch_loss/train': 0.7368893548846245} 12/28/2021 13:34:15 - INFO - codeparrot_training - Step 42533: {'lr': 2.7200084014869925e-05, 'samples': 21777408, 'steps': 42533, 'batch_loss/train': 0.6156201013363898} 12/28/2021 13:34:26 - INFO - codeparrot_training - Step 42534: {'lr': 2.7192934122857255e-05, 'samples': 21777920, 'steps': 42534, 'batch_loss/train': 0.7076956313103437} 12/28/2021 13:34:37 
- INFO - codeparrot_training - Step 42535: {'lr': 2.7185785116641436e-05, 'samples': 21778432, 'steps': 42535, 'batch_loss/train': 0.6264114785008132} 12/28/2021 13:34:49 - INFO - codeparrot_training - Step 42536: {'lr': 2.7178636996250672e-05, 'samples': 21778944, 'steps': 42536, 'batch_loss/train': 0.4964801352471113} 12/28/2021 13:34:59 - INFO - codeparrot_training - Step 42537: {'lr': 2.7171489761713574e-05, 'samples': 21779456, 'steps': 42537, 'batch_loss/train': 0.7473274613730609} 12/28/2021 13:35:10 - INFO - codeparrot_training - Step 42538: {'lr': 2.716434341305854e-05, 'samples': 21779968, 'steps': 42538, 'batch_loss/train': 0.7581436401233077} 12/28/2021 13:35:22 - INFO - codeparrot_training - Step 42539: {'lr': 2.7157197950313904e-05, 'samples': 21780480, 'steps': 42539, 'batch_loss/train': 0.7688611038029194} 12/28/2021 13:35:33 - INFO - codeparrot_training - Step 42540: {'lr': 2.7150053373508093e-05, 'samples': 21780992, 'steps': 42540, 'batch_loss/train': 0.7601089146919549} 12/28/2021 13:35:43 - INFO - codeparrot_training - Step 42541: {'lr': 2.7142909682669526e-05, 'samples': 21781504, 'steps': 42541, 'batch_loss/train': 0.7075431370176375} 12/28/2021 13:35:54 - INFO - codeparrot_training - Step 42542: {'lr': 2.7135766877826622e-05, 'samples': 21782016, 'steps': 42542, 'batch_loss/train': 0.6439506970345974} 12/28/2021 13:36:08 - INFO - codeparrot_training - Step 42543: {'lr': 2.7128624959007723e-05, 'samples': 21782528, 'steps': 42543, 'batch_loss/train': 0.7970273387618363} 12/28/2021 13:36:19 - INFO - codeparrot_training - Step 42544: {'lr': 2.712148392624128e-05, 'samples': 21783040, 'steps': 42544, 'batch_loss/train': 0.7758398083969951} 12/28/2021 13:36:29 - INFO - codeparrot_training - Step 42545: {'lr': 2.7114343779555656e-05, 'samples': 21783552, 'steps': 42545, 'batch_loss/train': 0.6897365497425199} 12/28/2021 13:36:42 - INFO - codeparrot_training - Step 42546: {'lr': 2.7107204518979246e-05, 'samples': 21784064, 'steps': 42546, 
'batch_loss/train': 0.8020186331123114} 12/28/2021 13:36:52 - INFO - codeparrot_training - Step 42547: {'lr': 2.7100066144540442e-05, 'samples': 21784576, 'steps': 42547, 'batch_loss/train': 0.7580157183110714} 12/28/2021 13:37:03 - INFO - codeparrot_training - Step 42548: {'lr': 2.7092928656267585e-05, 'samples': 21785088, 'steps': 42548, 'batch_loss/train': 0.7899861484766006} 12/28/2021 13:37:17 - INFO - codeparrot_training - Step 42549: {'lr': 2.7085792054189152e-05, 'samples': 21785600, 'steps': 42549, 'batch_loss/train': 0.8068087231367826} 12/28/2021 13:37:27 - INFO - codeparrot_training - Step 42550: {'lr': 2.7078656338333367e-05, 'samples': 21786112, 'steps': 42550, 'batch_loss/train': 0.7451301126857288} 12/28/2021 13:37:38 - INFO - codeparrot_training - Step 42551: {'lr': 2.707152150872863e-05, 'samples': 21786624, 'steps': 42551, 'batch_loss/train': 0.8407262861728668} 12/28/2021 13:37:50 - INFO - codeparrot_training - Step 42552: {'lr': 2.706438756540347e-05, 'samples': 21787136, 'steps': 42552, 'batch_loss/train': 0.7610770891187713} 12/28/2021 13:38:01 - INFO - codeparrot_training - Step 42553: {'lr': 2.7057254508386058e-05, 'samples': 21787648, 'steps': 42553, 'batch_loss/train': 0.5865780641324818} 12/28/2021 13:38:11 - INFO - codeparrot_training - Step 42554: {'lr': 2.705012233770482e-05, 'samples': 21788160, 'steps': 42554, 'batch_loss/train': 0.5243561934621539} 12/28/2021 13:38:22 - INFO - codeparrot_training - Step 42555: {'lr': 2.7042991053388118e-05, 'samples': 21788672, 'steps': 42555, 'batch_loss/train': 0.7506950273527764} 12/28/2021 13:38:34 - INFO - codeparrot_training - Step 42556: {'lr': 2.703586065546429e-05, 'samples': 21789184, 'steps': 42556, 'batch_loss/train': 0.7168807063717395} 12/28/2021 13:38:45 - INFO - codeparrot_training - Step 42557: {'lr': 2.7028731143961705e-05, 'samples': 21789696, 'steps': 42557, 'batch_loss/train': 0.6997908125631511} 12/28/2021 13:38:55 - INFO - codeparrot_training - Step 42558: {'lr': 
2.70216025189087e-05, 'samples': 21790208, 'steps': 42558, 'batch_loss/train': 0.8655804060399532} 12/28/2021 13:39:09 - INFO - codeparrot_training - Step 42559: {'lr': 2.7014474780333588e-05, 'samples': 21790720, 'steps': 42559, 'batch_loss/train': 0.6738541885279119} 12/28/2021 13:39:20 - INFO - codeparrot_training - Step 42560: {'lr': 2.700734792826476e-05, 'samples': 21791232, 'steps': 42560, 'batch_loss/train': 0.7529141092672944} 12/28/2021 13:39:31 - INFO - codeparrot_training - Step 42561: {'lr': 2.7000221962730526e-05, 'samples': 21791744, 'steps': 42561, 'batch_loss/train': 0.7760366229340434} 12/28/2021 13:39:43 - INFO - codeparrot_training - Step 42562: {'lr': 2.6993096883759144e-05, 'samples': 21792256, 'steps': 42562, 'batch_loss/train': 0.7424377778079361} 12/28/2021 13:39:54 - INFO - codeparrot_training - Step 42563: {'lr': 2.6985972691379034e-05, 'samples': 21792768, 'steps': 42563, 'batch_loss/train': 0.7543232021853328} 12/28/2021 13:40:04 - INFO - codeparrot_training - Step 42564: {'lr': 2.6978849385618536e-05, 'samples': 21793280, 'steps': 42564, 'batch_loss/train': 0.6621879264712334} 12/28/2021 13:40:15 - INFO - codeparrot_training - Step 42565: {'lr': 2.6971726966505907e-05, 'samples': 21793792, 'steps': 42565, 'batch_loss/train': 0.8013642802834511} 12/28/2021 13:40:29 - INFO - codeparrot_training - Step 42566: {'lr': 2.6964605434069396e-05, 'samples': 21794304, 'steps': 42566, 'batch_loss/train': 0.6796853286214173} 12/28/2021 13:40:39 - INFO - codeparrot_training - Step 42567: {'lr': 2.695748478833751e-05, 'samples': 21794816, 'steps': 42567, 'batch_loss/train': 0.7260332014411688} 12/28/2021 13:40:50 - INFO - codeparrot_training - Step 42568: {'lr': 2.69503650293384e-05, 'samples': 21795328, 'steps': 42568, 'batch_loss/train': 0.8546094726771116} 12/28/2021 13:41:02 - INFO - codeparrot_training - Step 42569: {'lr': 2.6943246157100392e-05, 'samples': 21795840, 'steps': 42569, 'batch_loss/train': 0.7167334347032011} 12/28/2021 13:41:13 - 
INFO - codeparrot_training - Step 42570: {'lr': 2.6936128171651865e-05, 'samples': 21796352, 'steps': 42570, 'batch_loss/train': 0.6830033729784191} 12/28/2021 13:41:23 - INFO - codeparrot_training - Step 42571: {'lr': 2.6929011073021036e-05, 'samples': 21796864, 'steps': 42571, 'batch_loss/train': 0.747876406647265} 12/28/2021 13:41:35 - INFO - codeparrot_training - Step 42572: {'lr': 2.6921894861236224e-05, 'samples': 21797376, 'steps': 42572, 'batch_loss/train': 0.6972904298454523} 12/28/2021 13:41:46 - INFO - codeparrot_training - Step 42573: {'lr': 2.6914779536325736e-05, 'samples': 21797888, 'steps': 42573, 'batch_loss/train': 0.8056711326353252} 12/28/2021 13:41:57 - INFO - codeparrot_training - Step 42574: {'lr': 2.6907665098317853e-05, 'samples': 21798400, 'steps': 42574, 'batch_loss/train': 0.7692394540645182} 12/28/2021 13:42:07 - INFO - codeparrot_training - Step 42575: {'lr': 2.6900551547240886e-05, 'samples': 21798912, 'steps': 42575, 'batch_loss/train': 0.7310243956744671} 12/28/2021 13:42:21 - INFO - codeparrot_training - Step 42576: {'lr': 2.6893438883123066e-05, 'samples': 21799424, 'steps': 42576, 'batch_loss/train': 0.7319705891422927} 12/28/2021 13:42:32 - INFO - codeparrot_training - Step 42577: {'lr': 2.688632710599265e-05, 'samples': 21799936, 'steps': 42577, 'batch_loss/train': 1.1027760760043748} 12/28/2021 13:42:43 - INFO - codeparrot_training - Step 42578: {'lr': 2.687921621587802e-05, 'samples': 21800448, 'steps': 42578, 'batch_loss/train': 1.7960756188258529} 12/28/2021 13:42:55 - INFO - codeparrot_training - Step 42579: {'lr': 2.6872106212807334e-05, 'samples': 21800960, 'steps': 42579, 'batch_loss/train': 0.7497339071705937} 12/28/2021 13:43:06 - INFO - codeparrot_training - Step 42580: {'lr': 2.6864997096808923e-05, 'samples': 21801472, 'steps': 42580, 'batch_loss/train': 0.8136100899428129} 12/28/2021 13:43:16 - INFO - codeparrot_training - Step 42581: {'lr': 2.6857888867911018e-05, 'samples': 21801984, 'steps': 42581, 
'batch_loss/train': 0.7353853737004101} 12/28/2021 13:43:27 - INFO - codeparrot_training - Step 42582: {'lr': 2.68507815261419e-05, 'samples': 21802496, 'steps': 42582, 'batch_loss/train': 0.7235416723415256} 12/28/2021 13:43:40 - INFO - codeparrot_training - Step 42583: {'lr': 2.6843675071529795e-05, 'samples': 21803008, 'steps': 42583, 'batch_loss/train': 0.8082741731777787} 12/28/2021 13:43:50 - INFO - codeparrot_training - Step 42584: {'lr': 2.6836569504103014e-05, 'samples': 21803520, 'steps': 42584, 'batch_loss/train': 0.8072171160019934} 12/28/2021 13:44:01 - INFO - codeparrot_training - Step 42585: {'lr': 2.6829464823889733e-05, 'samples': 21804032, 'steps': 42585, 'batch_loss/train': 0.6649309366475791} 12/28/2021 13:44:13 - INFO - codeparrot_training - Step 42586: {'lr': 2.6822361030918257e-05, 'samples': 21804544, 'steps': 42586, 'batch_loss/train': 0.7539305472746491} 12/28/2021 13:44:24 - INFO - codeparrot_training - Step 42587: {'lr': 2.6815258125216846e-05, 'samples': 21805056, 'steps': 42587, 'batch_loss/train': 0.815368739888072} 12/28/2021 13:44:34 - INFO - codeparrot_training - Step 42588: {'lr': 2.6808156106813614e-05, 'samples': 21805568, 'steps': 42588, 'batch_loss/train': 0.7671287544071674} 12/28/2021 13:44:45 - INFO - codeparrot_training - Step 42589: {'lr': 2.68010549757369e-05, 'samples': 21806080, 'steps': 42589, 'batch_loss/train': 0.8705878118053079} 12/28/2021 13:44:59 - INFO - codeparrot_training - Step 42590: {'lr': 2.6793954732014986e-05, 'samples': 21806592, 'steps': 42590, 'batch_loss/train': 0.6935642041498795} 12/28/2021 13:45:09 - INFO - codeparrot_training - Step 42591: {'lr': 2.678685537567596e-05, 'samples': 21807104, 'steps': 42591, 'batch_loss/train': 0.7978236945345998} 12/28/2021 13:45:20 - INFO - codeparrot_training - Step 42592: {'lr': 2.6779756906748083e-05, 'samples': 21807616, 'steps': 42592, 'batch_loss/train': 0.7062590192072093} 12/28/2021 13:45:32 - INFO - codeparrot_training - Step 42593: {'lr': 
2.6772659325259713e-05, 'samples': 21808128, 'steps': 42593, 'batch_loss/train': 0.7921862113289535} 12/28/2021 13:45:43 - INFO - codeparrot_training - Step 42594: {'lr': 2.676556263123889e-05, 'samples': 21808640, 'steps': 42594, 'batch_loss/train': 0.731928922701627} 12/28/2021 13:45:53 - INFO - codeparrot_training - Step 42595: {'lr': 2.675846682471392e-05, 'samples': 21809152, 'steps': 42595, 'batch_loss/train': 0.6972706005908549} 12/28/2021 13:46:08 - INFO - codeparrot_training - Step 42596: {'lr': 2.6751371905712974e-05, 'samples': 21809664, 'steps': 42596, 'batch_loss/train': 0.7904366091825068} 12/28/2021 13:46:19 - INFO - codeparrot_training - Step 42597: {'lr': 2.6744277874264282e-05, 'samples': 21810176, 'steps': 42597, 'batch_loss/train': 0.7564612647984177} 12/28/2021 13:46:29 - INFO - codeparrot_training - Step 42598: {'lr': 2.6737184730396075e-05, 'samples': 21810688, 'steps': 42598, 'batch_loss/train': 0.8814279062207788} 12/28/2021 13:46:41 - INFO - codeparrot_training - Step 42599: {'lr': 2.6730092474136492e-05, 'samples': 21811200, 'steps': 42599, 'batch_loss/train': 0.45733456080779433} 12/28/2021 13:46:52 - INFO - codeparrot_training - Step 42600: {'lr': 2.672300110551376e-05, 'samples': 21811712, 'steps': 42600, 'batch_loss/train': 0.6902219248004258} 12/28/2021 13:47:03 - INFO - codeparrot_training - Step 42601: {'lr': 2.6715910624556054e-05, 'samples': 21812224, 'steps': 42601, 'batch_loss/train': 0.702801987528801} 12/28/2021 13:47:13 - INFO - codeparrot_training - Step 42602: {'lr': 2.670882103129166e-05, 'samples': 21812736, 'steps': 42602, 'batch_loss/train': 0.8059949530288577} 12/28/2021 13:47:28 - INFO - codeparrot_training - Step 42603: {'lr': 2.6701732325748573e-05, 'samples': 21813248, 'steps': 42603, 'batch_loss/train': 0.8339343033730984} 12/28/2021 13:47:39 - INFO - codeparrot_training - Step 42604: {'lr': 2.669464450795517e-05, 'samples': 21813760, 'steps': 42604, 'batch_loss/train': 0.6041291406145319} 12/28/2021 13:47:49 - 
INFO - codeparrot_training - Step 42605: {'lr': 2.66875575779395e-05, 'samples': 21814272, 'steps': 42605, 'batch_loss/train': 0.5308420069050044} 12/28/2021 13:48:02 - INFO - codeparrot_training - Step 42606: {'lr': 2.6680471535729722e-05, 'samples': 21814784, 'steps': 42606, 'batch_loss/train': 0.867158685810864} 12/28/2021 13:48:12 - INFO - codeparrot_training - Step 42607: {'lr': 2.6673386381354192e-05, 'samples': 21815296, 'steps': 42607, 'batch_loss/train': 0.706735952058807} 12/28/2021 13:48:23 - INFO - codeparrot_training - Step 42608: {'lr': 2.666630211484089e-05, 'samples': 21815808, 'steps': 42608, 'batch_loss/train': 0.6501704258553218} 12/28/2021 13:48:35 - INFO - codeparrot_training - Step 42609: {'lr': 2.6659218736218042e-05, 'samples': 21816320, 'steps': 42609, 'batch_loss/train': 0.6536305669287685} 12/28/2021 13:48:46 - INFO - codeparrot_training - Step 42610: {'lr': 2.6652136245513798e-05, 'samples': 21816832, 'steps': 42610, 'batch_loss/train': 0.6879867550451308} 12/28/2021 13:48:56 - INFO - codeparrot_training - Step 42611: {'lr': 2.664505464275635e-05, 'samples': 21817344, 'steps': 42611, 'batch_loss/train': 0.6499570969026536} 12/28/2021 13:49:07 - INFO - codeparrot_training - Step 42612: {'lr': 2.6637973927973817e-05, 'samples': 21817856, 'steps': 42612, 'batch_loss/train': 0.6431973652215675} 12/28/2021 13:49:21 - INFO - codeparrot_training - Step 42613: {'lr': 2.663089410119443e-05, 'samples': 21818368, 'steps': 42613, 'batch_loss/train': 0.5568706778576598} 12/28/2021 13:49:32 - INFO - codeparrot_training - Step 42614: {'lr': 2.6623815162446135e-05, 'samples': 21818880, 'steps': 42614, 'batch_loss/train': 0.6892455318011343} 12/28/2021 13:49:42 - INFO - codeparrot_training - Step 42615: {'lr': 2.6616737111757273e-05, 'samples': 21819392, 'steps': 42615, 'batch_loss/train': 0.6375196129083633} 12/28/2021 13:49:55 - INFO - codeparrot_training - Step 42616: {'lr': 2.6609659949155984e-05, 'samples': 21819904, 'steps': 42616, 
'batch_loss/train': 0.7482366198673844} 12/28/2021 13:50:05 - INFO - codeparrot_training - Step 42617: {'lr': 2.660258367467025e-05, 'samples': 21820416, 'steps': 42617, 'batch_loss/train': 0.702512756921351} 12/28/2021 13:50:16 - INFO - codeparrot_training - Step 42618: {'lr': 2.659550828832827e-05, 'samples': 21820928, 'steps': 42618, 'batch_loss/train': 0.5536389632616192} 12/28/2021 13:50:30 - INFO - codeparrot_training - Step 42619: {'lr': 2.6588433790158295e-05, 'samples': 21821440, 'steps': 42619, 'batch_loss/train': 0.7442729263566434} 12/28/2021 13:50:40 - INFO - codeparrot_training - Step 42620: {'lr': 2.658136018018831e-05, 'samples': 21821952, 'steps': 42620, 'batch_loss/train': 0.6679021596210077} 12/28/2021 13:50:51 - INFO - codeparrot_training - Step 42621: {'lr': 2.657428745844645e-05, 'samples': 21822464, 'steps': 42621, 'batch_loss/train': 0.62143256329} 12/28/2021 13:51:03 - INFO - codeparrot_training - Step 42622: {'lr': 2.6567215624960868e-05, 'samples': 21822976, 'steps': 42622, 'batch_loss/train': 0.6605080173467286} 12/28/2021 13:51:14 - INFO - codeparrot_training - Step 42623: {'lr': 2.6560144679759675e-05, 'samples': 21823488, 'steps': 42623, 'batch_loss/train': 0.6883854467305355} 12/28/2021 13:51:24 - INFO - codeparrot_training - Step 42624: {'lr': 2.6553074622870987e-05, 'samples': 21824000, 'steps': 42624, 'batch_loss/train': 0.7646001456305385} 12/28/2021 13:51:35 - INFO - codeparrot_training - Step 42625: {'lr': 2.6546005454322896e-05, 'samples': 21824512, 'steps': 42625, 'batch_loss/train': 0.6742428004508838} 12/28/2021 13:51:47 - INFO - codeparrot_training - Step 42626: {'lr': 2.653893717414352e-05, 'samples': 21825024, 'steps': 42626, 'batch_loss/train': 0.5266303234966472} 12/28/2021 13:51:58 - INFO - codeparrot_training - Step 42627: {'lr': 2.653186978236094e-05, 'samples': 21825536, 'steps': 42627, 'batch_loss/train': 0.6726481071673334} 12/28/2021 13:52:08 - INFO - codeparrot_training - Step 42628: {'lr': 
2.6524803279003334e-05, 'samples': 21826048, 'steps': 42628, 'batch_loss/train': 0.7025937992148101} 12/28/2021 13:52:20 - INFO - codeparrot_training - Step 42629: {'lr': 2.6517737664098625e-05, 'samples': 21826560, 'steps': 42629, 'batch_loss/train': 0.7352643045596778} 12/28/2021 13:52:31 - INFO - codeparrot_training - Step 42630: {'lr': 2.651067293767509e-05, 'samples': 21827072, 'steps': 42630, 'batch_loss/train': 0.609323800279526} 12/28/2021 13:52:42 - INFO - codeparrot_training - Step 42631: {'lr': 2.650360909976071e-05, 'samples': 21827584, 'steps': 42631, 'batch_loss/train': 0.7804056378081441} 12/28/2021 13:52:56 - INFO - codeparrot_training - Step 42632: {'lr': 2.6496546150383522e-05, 'samples': 21828096, 'steps': 42632, 'batch_loss/train': 0.5807799606118351} 12/28/2021 13:53:06 - INFO - codeparrot_training - Step 42633: {'lr': 2.6489484089571776e-05, 'samples': 21828608, 'steps': 42633, 'batch_loss/train': 0.5306811866466887} 12/28/2021 13:53:17 - INFO - codeparrot_training - Step 42634: {'lr': 2.6482422917353422e-05, 'samples': 21829120, 'steps': 42634, 'batch_loss/train': 0.7377413091016933} 12/28/2021 13:53:28 - INFO - codeparrot_training - Step 42635: {'lr': 2.6475362633756523e-05, 'samples': 21829632, 'steps': 42635, 'batch_loss/train': 0.7946821767836809} 12/28/2021 13:53:40 - INFO - codeparrot_training - Step 42636: {'lr': 2.6468303238809193e-05, 'samples': 21830144, 'steps': 42636, 'batch_loss/train': 0.7524408791214228} 12/28/2021 13:53:51 - INFO - codeparrot_training - Step 42637: {'lr': 2.6461244732539497e-05, 'samples': 21830656, 'steps': 42637, 'batch_loss/train': 0.7606656635180116} 12/28/2021 13:54:01 - INFO - codeparrot_training - Step 42638: {'lr': 2.6454187114975493e-05, 'samples': 21831168, 'steps': 42638, 'batch_loss/train': 0.6665282784961164} 12/28/2021 13:54:14 - INFO - codeparrot_training - Step 42639: {'lr': 2.644713038614527e-05, 'samples': 21831680, 'steps': 42639, 'batch_loss/train': 0.764296853041742} 12/28/2021 13:54:24 - 
INFO - codeparrot_training - Step 42640: {'lr': 2.6440074546076747e-05, 'samples': 21832192, 'steps': 42640, 'batch_loss/train': 0.4732684787886683} 12/28/2021 13:54:35 - INFO - codeparrot_training - Step 42641: {'lr': 2.643301959479813e-05, 'samples': 21832704, 'steps': 42641, 'batch_loss/train': 0.6583535165991634} 12/28/2021 13:54:49 - INFO - codeparrot_training - Step 42642: {'lr': 2.6425965532337448e-05, 'samples': 21833216, 'steps': 42642, 'batch_loss/train': 0.6509473692858592} 12/28/2021 13:55:00 - INFO - codeparrot_training - Step 42643: {'lr': 2.6418912358722626e-05, 'samples': 21833728, 'steps': 42643, 'batch_loss/train': 0.7536401017569005} 12/28/2021 13:55:10 - INFO - codeparrot_training - Step 42644: {'lr': 2.641186007398183e-05, 'samples': 21834240, 'steps': 42644, 'batch_loss/train': 0.6629094027448446} 12/28/2021 13:55:22 - INFO - codeparrot_training - Step 42645: {'lr': 2.64048086781431e-05, 'samples': 21834752, 'steps': 42645, 'batch_loss/train': 0.5506430640816689} 12/28/2021 13:55:33 - INFO - codeparrot_training - Step 42646: {'lr': 2.6397758171234383e-05, 'samples': 21835264, 'steps': 42646, 'batch_loss/train': 0.6761122043244541} 12/28/2021 13:55:44 - INFO - codeparrot_training - Step 42647: {'lr': 2.639070855328371e-05, 'samples': 21835776, 'steps': 42647, 'batch_loss/train': 0.7262574099004269} 12/28/2021 13:55:54 - INFO - codeparrot_training - Step 42648: {'lr': 2.6383659824319228e-05, 'samples': 21836288, 'steps': 42648, 'batch_loss/train': 0.6504286681301892} 12/28/2021 13:56:08 - INFO - codeparrot_training - Step 42649: {'lr': 2.637661198436883e-05, 'samples': 21836800, 'steps': 42649, 'batch_loss/train': 0.6435676994733512} 12/28/2021 13:56:19 - INFO - codeparrot_training - Step 42650: {'lr': 2.6369565033460607e-05, 'samples': 21837312, 'steps': 42650, 'batch_loss/train': 0.7217610152438283} 12/28/2021 13:56:29 - INFO - codeparrot_training - Step 42651: {'lr': 2.6362518971622535e-05, 'samples': 21837824, 'steps': 42651, 
'batch_loss/train': 0.5809395376127213} 12/28/2021 13:56:42 - INFO - codeparrot_training - Step 42652: {'lr': 2.6355473798882646e-05, 'samples': 21838336, 'steps': 42652, 'batch_loss/train': 0.6126077813096344} 12/28/2021 13:56:52 - INFO - codeparrot_training - Step 42653: {'lr': 2.6348429515268945e-05, 'samples': 21838848, 'steps': 42653, 'batch_loss/train': 0.5868668639159296} 12/28/2021 13:57:03 - INFO - codeparrot_training - Step 42654: {'lr': 2.6341386120809497e-05, 'samples': 21839360, 'steps': 42654, 'batch_loss/train': 0.7217547670006752} 12/28/2021 13:57:15 - INFO - codeparrot_training - Step 42655: {'lr': 2.6334343615532162e-05, 'samples': 21839872, 'steps': 42655, 'batch_loss/train': 0.6668973327614367} 12/28/2021 13:57:26 - INFO - codeparrot_training - Step 42656: {'lr': 2.6327301999465065e-05, 'samples': 21840384, 'steps': 42656, 'batch_loss/train': 0.7632475201971829} 12/28/2021 13:57:36 - INFO - codeparrot_training - Step 42657: {'lr': 2.6320261272636202e-05, 'samples': 21840896, 'steps': 42657, 'batch_loss/train': 0.6405709958635271} 12/28/2021 13:57:47 - INFO - codeparrot_training - Step 42658: {'lr': 2.631322143507342e-05, 'samples': 21841408, 'steps': 42658, 'batch_loss/train': 0.646905355155468} 12/28/2021 13:57:59 - INFO - codeparrot_training - Step 42659: {'lr': 2.6306182486804915e-05, 'samples': 21841920, 'steps': 42659, 'batch_loss/train': 0.7268227366730571} 12/28/2021 13:58:10 - INFO - codeparrot_training - Step 42660: {'lr': 2.6299144427858524e-05, 'samples': 21842432, 'steps': 42660, 'batch_loss/train': 0.632962703704834} 12/28/2021 13:58:20 - INFO - codeparrot_training - Step 42661: {'lr': 2.6292107258262254e-05, 'samples': 21842944, 'steps': 42661, 'batch_loss/train': 0.6920384906115942} 12/28/2021 13:58:33 - INFO - codeparrot_training - Step 42662: {'lr': 2.6285070978044083e-05, 'samples': 21843456, 'steps': 42662, 'batch_loss/train': 0.7482318715192378} 12/28/2021 13:58:43 - INFO - codeparrot_training - Step 42663: {'lr': 
2.627803558723202e-05, 'samples': 21843968, 'steps': 42663, 'batch_loss/train': 0.6776120068971068} 12/28/2021 13:58:54 - INFO - codeparrot_training - Step 42664: {'lr': 2.6271001085854036e-05, 'samples': 21844480, 'steps': 42664, 'batch_loss/train': 0.6937527982518077} 12/28/2021 13:59:08 - INFO - codeparrot_training - Step 42665: {'lr': 2.6263967473938083e-05, 'samples': 21844992, 'steps': 42665, 'batch_loss/train': 0.5958826458081603} 12/28/2021 13:59:18 - INFO - codeparrot_training - Step 42666: {'lr': 2.6256934751512028e-05, 'samples': 21845504, 'steps': 42666, 'batch_loss/train': 0.7347174407914281} 12/28/2021 13:59:29 - INFO - codeparrot_training - Step 42667: {'lr': 2.6249902918603992e-05, 'samples': 21846016, 'steps': 42667, 'batch_loss/train': 0.5774855406489223} 12/28/2021 13:59:41 - INFO - codeparrot_training - Step 42668: {'lr': 2.624287197524186e-05, 'samples': 21846528, 'steps': 42668, 'batch_loss/train': 0.734719910658896} 12/28/2021 13:59:52 - INFO - codeparrot_training - Step 42669: {'lr': 2.6235841921453536e-05, 'samples': 21847040, 'steps': 42669, 'batch_loss/train': 0.6459910857956856} 12/28/2021 14:00:03 - INFO - codeparrot_training - Step 42670: {'lr': 2.6228812757267016e-05, 'samples': 21847552, 'steps': 42670, 'batch_loss/train': 0.6158253633766435} 12/28/2021 14:00:13 - INFO - codeparrot_training - Step 42671: {'lr': 2.6221784482710314e-05, 'samples': 21848064, 'steps': 42671, 'batch_loss/train': 0.6897213058546185} 12/28/2021 14:00:27 - INFO - codeparrot_training - Step 42672: {'lr': 2.6214757097811266e-05, 'samples': 21848576, 'steps': 42672, 'batch_loss/train': 0.7105961139313877} 12/28/2021 14:00:38 - INFO - codeparrot_training - Step 42673: {'lr': 2.620773060259779e-05, 'samples': 21849088, 'steps': 42673, 'batch_loss/train': 0.8798572331434116} 12/28/2021 14:00:49 - INFO - codeparrot_training - Step 42674: {'lr': 2.620070499709795e-05, 'samples': 21849600, 'steps': 42674, 'batch_loss/train': 0.5753807719156612} 12/28/2021 14:01:01 - 
INFO - codeparrot_training - Step 42675: {'lr': 2.619368028133959e-05, 'samples': 21850112, 'steps': 42675, 'batch_loss/train': 0.7281739199534059} 12/28/2021 14:01:11 - INFO - codeparrot_training - Step 42676: {'lr': 2.6186656455350622e-05, 'samples': 21850624, 'steps': 42676, 'batch_loss/train': 0.7511681746691465} 12/28/2021 14:01:22 - INFO - codeparrot_training - Step 42677: {'lr': 2.617963351915903e-05, 'samples': 21851136, 'steps': 42677, 'batch_loss/train': 0.7124555300688371} 12/28/2021 14:01:34 - INFO - codeparrot_training - Step 42678: {'lr': 2.617261147279268e-05, 'samples': 21851648, 'steps': 42678, 'batch_loss/train': 0.7763481643050909} 12/28/2021 14:01:45 - INFO - codeparrot_training - Step 42679: {'lr': 2.616559031627952e-05, 'samples': 21852160, 'steps': 42679, 'batch_loss/train': 0.6218530816840939} 12/28/2021 14:01:55 - INFO - codeparrot_training - Step 42680: {'lr': 2.6158570049647446e-05, 'samples': 21852672, 'steps': 42680, 'batch_loss/train': 0.6712822439149022} 12/28/2021 14:02:09 - INFO - codeparrot_training - Step 42681: {'lr': 2.615155067292438e-05, 'samples': 21853184, 'steps': 42681, 'batch_loss/train': 0.655717562418431} 12/28/2021 14:02:20 - INFO - codeparrot_training - Step 42682: {'lr': 2.6144532186138242e-05, 'samples': 21853696, 'steps': 42682, 'batch_loss/train': 0.5959334959916305} 12/28/2021 14:02:31 - INFO - codeparrot_training - Step 42683: {'lr': 2.6137514589316956e-05, 'samples': 21854208, 'steps': 42683, 'batch_loss/train': 0.8341362401843071} 12/28/2021 14:02:41 - INFO - codeparrot_training - Step 42684: {'lr': 2.613049788248828e-05, 'samples': 21854720, 'steps': 42684, 'batch_loss/train': 0.7855358086526394} 12/28/2021 14:02:53 - INFO - codeparrot_training - Step 42685: {'lr': 2.6123482065680326e-05, 'samples': 21855232, 'steps': 42685, 'batch_loss/train': 0.6690132618532516} 12/28/2021 14:03:04 - INFO - codeparrot_training - Step 42686: {'lr': 2.6116467138920823e-05, 'samples': 21855744, 'steps': 42686, 
'batch_loss/train': 0.7577284411527216} 12/28/2021 14:03:15 - INFO - codeparrot_training - Step 42687: {'lr': 2.6109453102237723e-05, 'samples': 21856256, 'steps': 42687, 'batch_loss/train': 0.6558839688077569} 12/28/2021 14:03:29 - INFO - codeparrot_training - Step 42688: {'lr': 2.610243995565889e-05, 'samples': 21856768, 'steps': 42688, 'batch_loss/train': 0.836928189266473} 12/28/2021 14:03:39 - INFO - codeparrot_training - Step 42689: {'lr': 2.609542769921222e-05, 'samples': 21857280, 'steps': 42689, 'batch_loss/train': 0.6446058531291783} 12/28/2021 14:03:50 - INFO - codeparrot_training - Step 42690: {'lr': 2.608841633292558e-05, 'samples': 21857792, 'steps': 42690, 'batch_loss/train': 0.6489140770863742} 12/28/2021 14:04:02 - INFO - codeparrot_training - Step 42691: {'lr': 2.6081405856826835e-05, 'samples': 21858304, 'steps': 42691, 'batch_loss/train': 0.6735691466019489} 12/28/2021 14:04:13 - INFO - codeparrot_training - Step 42692: {'lr': 2.607439627094388e-05, 'samples': 21858816, 'steps': 42692, 'batch_loss/train': 0.7180614592507482} 12/28/2021 14:04:24 - INFO - codeparrot_training - Step 42693: {'lr': 2.606738757530458e-05, 'samples': 21859328, 'steps': 42693, 'batch_loss/train': 0.6305480939336121} 12/28/2021 14:04:35 - INFO - codeparrot_training - Step 42694: {'lr': 2.606037976993686e-05, 'samples': 21859840, 'steps': 42694, 'batch_loss/train': 0.6423400822095573} 12/28/2021 14:04:46 - INFO - codeparrot_training - Step 42695: {'lr': 2.605337285486839e-05, 'samples': 21860352, 'steps': 42695, 'batch_loss/train': 0.7661818042397499} 12/28/2021 14:04:57 - INFO - codeparrot_training - Step 42696: {'lr': 2.6046366830127205e-05, 'samples': 21860864, 'steps': 42696, 'batch_loss/train': 0.6420432568993419} 12/28/2021 14:05:07 - INFO - codeparrot_training - Step 42697: {'lr': 2.603936169574117e-05, 'samples': 21861376, 'steps': 42697, 'batch_loss/train': 0.7010422325693071} 12/28/2021 14:05:20 - INFO - codeparrot_training - Step 42698: {'lr': 
2.6032357451738013e-05, 'samples': 21861888, 'steps': 42698, 'batch_loss/train': 0.7938134018331766} 12/28/2021 14:05:30 - INFO - codeparrot_training - Step 42699: {'lr': 2.6025354098145576e-05, 'samples': 21862400, 'steps': 42699, 'batch_loss/train': 0.44332133073476143} 12/28/2021 14:05:41 - INFO - codeparrot_training - Step 42700: {'lr': 2.601835163499186e-05, 'samples': 21862912, 'steps': 42700, 'batch_loss/train': 0.6936666793189943} 12/28/2021 14:05:55 - INFO - codeparrot_training - Step 42701: {'lr': 2.601135006230457e-05, 'samples': 21863424, 'steps': 42701, 'batch_loss/train': 0.7132807960733771} 12/28/2021 14:06:06 - INFO - codeparrot_training - Step 42702: {'lr': 2.6004349380111563e-05, 'samples': 21863936, 'steps': 42702, 'batch_loss/train': 0.7249266132712364} 12/28/2021 14:06:16 - INFO - codeparrot_training - Step 42703: {'lr': 2.599734958844069e-05, 'samples': 21864448, 'steps': 42703, 'batch_loss/train': 0.6998108634725213} 12/28/2021 14:06:29 - INFO - codeparrot_training - Step 42704: {'lr': 2.599035068731978e-05, 'samples': 21864960, 'steps': 42704, 'batch_loss/train': 0.6226721687417012} 12/28/2021 14:06:39 - INFO - codeparrot_training - Step 42705: {'lr': 2.598335267677665e-05, 'samples': 21865472, 'steps': 42705, 'batch_loss/train': 0.6622943640686572} 12/28/2021 14:06:50 - INFO - codeparrot_training - Step 42706: {'lr': 2.5976355556839108e-05, 'samples': 21865984, 'steps': 42706, 'batch_loss/train': 0.7962149479426444} 12/28/2021 14:07:01 - INFO - codeparrot_training - Step 42707: {'lr': 2.5969359327535024e-05, 'samples': 21866496, 'steps': 42707, 'batch_loss/train': 0.6450060431379825} 12/28/2021 14:07:13 - INFO - codeparrot_training - Step 42708: {'lr': 2.596236398889215e-05, 'samples': 21867008, 'steps': 42708, 'batch_loss/train': 0.6783351209014654} 12/28/2021 14:07:24 - INFO - codeparrot_training - Step 42709: {'lr': 2.5955369540938385e-05, 'samples': 21867520, 'steps': 42709, 'batch_loss/train': 0.5752347942034248} 12/28/2021 14:07:34 - 
INFO - codeparrot_training - Step 42710: {'lr': 2.5948375983701366e-05, 'samples': 21868032, 'steps': 42710, 'batch_loss/train': 0.7420150395482779} 12/28/2021 14:07:48 - INFO - codeparrot_training - Step 42711: {'lr': 2.5941383317209104e-05, 'samples': 21868544, 'steps': 42711, 'batch_loss/train': 0.6711917980574071} 12/28/2021 14:07:59 - INFO - codeparrot_training - Step 42712: {'lr': 2.5934391541489243e-05, 'samples': 21869056, 'steps': 42712, 'batch_loss/train': 0.7332440417958423} 12/28/2021 14:08:09 - INFO - codeparrot_training - Step 42713: {'lr': 2.592740065656965e-05, 'samples': 21869568, 'steps': 42713, 'batch_loss/train': 0.7006997730350122} 12/28/2021 14:08:22 - INFO - codeparrot_training - Step 42714: {'lr': 2.5920410662478106e-05, 'samples': 21870080, 'steps': 42714, 'batch_loss/train': 0.6947369808331132} 12/28/2021 14:08:32 - INFO - codeparrot_training - Step 42715: {'lr': 2.5913421559242423e-05, 'samples': 21870592, 'steps': 42715, 'batch_loss/train': 0.6747318613342941} 12/28/2021 14:08:43 - INFO - codeparrot_training - Step 42716: {'lr': 2.590643334689033e-05, 'samples': 21871104, 'steps': 42716, 'batch_loss/train': 0.6451702192425728} 12/28/2021 14:08:53 - INFO - codeparrot_training - Step 42717: {'lr': 2.5899446025449665e-05, 'samples': 21871616, 'steps': 42717, 'batch_loss/train': 0.6613732678815722} 12/28/2021 14:09:08 - INFO - codeparrot_training - Step 42718: {'lr': 2.5892459594948154e-05, 'samples': 21872128, 'steps': 42718, 'batch_loss/train': 0.6830317909189034} 12/28/2021 14:09:18 - INFO - codeparrot_training - Step 42719: {'lr': 2.5885474055413638e-05, 'samples': 21872640, 'steps': 42719, 'batch_loss/train': 0.5512840462324675} 12/28/2021 14:09:29 - INFO - codeparrot_training - Step 42720: {'lr': 2.587848940687387e-05, 'samples': 21873152, 'steps': 42720, 'batch_loss/train': 0.6923006419092417} 12/28/2021 14:09:41 - INFO - codeparrot_training - Step 42721: {'lr': 2.587150564935653e-05, 'samples': 21873664, 'steps': 42721, 
'batch_loss/train': 0.6877042753621936} 12/28/2021 14:09:51 - INFO - codeparrot_training - Step 42722: {'lr': 2.5864522782889503e-05, 'samples': 21874176, 'steps': 42722, 'batch_loss/train': 0.6821292047388852} 12/28/2021 14:10:02 - INFO - codeparrot_training - Step 42723: {'lr': 2.585754080750055e-05, 'samples': 21874688, 'steps': 42723, 'batch_loss/train': 0.7414089194498956} 12/28/2021 14:10:14 - INFO - codeparrot_training - Step 42724: {'lr': 2.5850559723217344e-05, 'samples': 21875200, 'steps': 42724, 'batch_loss/train': 0.7096676058135927} 12/28/2021 14:10:25 - INFO - codeparrot_training - Step 42725: {'lr': 2.5843579530067608e-05, 'samples': 21875712, 'steps': 42725, 'batch_loss/train': 0.7096436787396669} 12/28/2021 14:10:36 - INFO - codeparrot_training - Step 42726: {'lr': 2.5836600228079267e-05, 'samples': 21876224, 'steps': 42726, 'batch_loss/train': 0.7268537264317274} 12/28/2021 14:10:49 - INFO - codeparrot_training - Step 42727: {'lr': 2.5829621817279907e-05, 'samples': 21876736, 'steps': 42727, 'batch_loss/train': 0.6648406460299157} 12/28/2021 14:11:00 - INFO - codeparrot_training - Step 42728: {'lr': 2.5822644297697345e-05, 'samples': 21877248, 'steps': 42728, 'batch_loss/train': 0.8945381678640842} 12/28/2021 14:11:11 - INFO - codeparrot_training - Step 42729: {'lr': 2.5815667669359307e-05, 'samples': 21877760, 'steps': 42729, 'batch_loss/train': 0.671026018448174} 12/28/2021 14:11:21 - INFO - codeparrot_training - Step 42730: {'lr': 2.580869193229349e-05, 'samples': 21878272, 'steps': 42730, 'batch_loss/train': 0.684489656239748} 12/28/2021 14:11:34 - INFO - codeparrot_training - Step 42731: {'lr': 2.5801717086527703e-05, 'samples': 21878784, 'steps': 42731, 'batch_loss/train': 0.7406664853915572} 12/28/2021 14:11:44 - INFO - codeparrot_training - Step 42732: {'lr': 2.5794743132089597e-05, 'samples': 21879296, 'steps': 42732, 'batch_loss/train': 0.8505321433767676} 12/28/2021 14:11:55 - INFO - codeparrot_training - Step 42733: {'lr': 
2.5787770069006956e-05, 'samples': 21879808, 'steps': 42733, 'batch_loss/train': 0.6728843036107719} 12/28/2021 14:12:07 - INFO - codeparrot_training - Step 42734: {'lr': 2.5780797897307474e-05, 'samples': 21880320, 'steps': 42734, 'batch_loss/train': 0.6487867531832308} 12/28/2021 14:12:18 - INFO - codeparrot_training - Step 42735: {'lr': 2.5773826617018936e-05, 'samples': 21880832, 'steps': 42735, 'batch_loss/train': 0.7023163144476712} 12/28/2021 14:12:29 - INFO - codeparrot_training - Step 42736: {'lr': 2.5766856228168906e-05, 'samples': 21881344, 'steps': 42736, 'batch_loss/train': 0.7395326755940914} 12/28/2021 14:12:41 - INFO - codeparrot_training - Step 42737: {'lr': 2.575988673078522e-05, 'samples': 21881856, 'steps': 42737, 'batch_loss/train': 0.8164555989205837} 12/28/2021 14:12:51 - INFO - codeparrot_training - Step 42738: {'lr': 2.5752918124895636e-05, 'samples': 21882368, 'steps': 42738, 'batch_loss/train': 0.7279195077717304} 12/28/2021 14:13:02 - INFO - codeparrot_training - Step 42739: {'lr': 2.5745950410527685e-05, 'samples': 21882880, 'steps': 42739, 'batch_loss/train': 0.8268435932695866} 12/28/2021 14:13:12 - INFO - codeparrot_training - Step 42740: {'lr': 2.573898358770918e-05, 'samples': 21883392, 'steps': 42740, 'batch_loss/train': 0.6816207598894835} 12/28/2021 14:13:26 - INFO - codeparrot_training - Step 42741: {'lr': 2.573201765646782e-05, 'samples': 21883904, 'steps': 42741, 'batch_loss/train': 0.6372739658690989} 12/28/2021 14:13:37 - INFO - codeparrot_training - Step 42742: {'lr': 2.5725052616831253e-05, 'samples': 21884416, 'steps': 42742, 'batch_loss/train': 0.7084128549322486} 12/28/2021 14:13:48 - INFO - codeparrot_training - Step 42743: {'lr': 2.57180884688272e-05, 'samples': 21884928, 'steps': 42743, 'batch_loss/train': 0.7360221417620778} 12/28/2021 14:14:00 - INFO - codeparrot_training - Step 42744: {'lr': 2.5711125212483338e-05, 'samples': 21885440, 'steps': 42744, 'batch_loss/train': 0.594686200201977} 12/28/2021 14:14:11 - 
INFO - codeparrot_training - Step 42745: {'lr': 2.570416284782734e-05, 'samples': 21885952, 'steps': 42745, 'batch_loss/train': 0.6182454983354546} 12/28/2021 14:14:21 - INFO - codeparrot_training - Step 42746: {'lr': 2.569720137488693e-05, 'samples': 21886464, 'steps': 42746, 'batch_loss/train': 0.7049994447734207} 12/28/2021 14:14:35 - INFO - codeparrot_training - Step 42747: {'lr': 2.569024079368973e-05, 'samples': 21886976, 'steps': 42747, 'batch_loss/train': 0.7046919353306293} 12/28/2021 14:14:46 - INFO - codeparrot_training - Step 42748: {'lr': 2.5683281104263433e-05, 'samples': 21887488, 'steps': 42748, 'batch_loss/train': 0.7549136239103973} 12/28/2021 14:14:56 - INFO - codeparrot_training - Step 42749: {'lr': 2.5676322306635773e-05, 'samples': 21888000, 'steps': 42749, 'batch_loss/train': 0.6928617982193828} 12/28/2021 14:15:08 - INFO - codeparrot_training - Step 42750: {'lr': 2.566936440083431e-05, 'samples': 21888512, 'steps': 42750, 'batch_loss/train': 0.7821282586082816} 12/28/2021 14:15:19 - INFO - codeparrot_training - Step 42751: {'lr': 2.566240738688669e-05, 'samples': 21889024, 'steps': 42751, 'batch_loss/train': 0.7011549919843674} 12/28/2021 14:15:30 - INFO - codeparrot_training - Step 42752: {'lr': 2.5655451264820746e-05, 'samples': 21889536, 'steps': 42752, 'batch_loss/train': 0.6487720826116856} 12/28/2021 14:15:40 - INFO - codeparrot_training - Step 42753: {'lr': 2.564849603466396e-05, 'samples': 21890048, 'steps': 42753, 'batch_loss/train': 0.7481093555688858} 12/28/2021 14:15:53 - INFO - codeparrot_training - Step 42754: {'lr': 2.5641541696444005e-05, 'samples': 21890560, 'steps': 42754, 'batch_loss/train': 0.7410822524689138} 12/28/2021 14:16:03 - INFO - codeparrot_training - Step 42755: {'lr': 2.5634588250188634e-05, 'samples': 21891072, 'steps': 42755, 'batch_loss/train': 0.7211113171651959} 12/28/2021 14:16:14 - INFO - codeparrot_training - Step 42756: {'lr': 2.5627635695925384e-05, 'samples': 21891584, 'steps': 42756, 
'batch_loss/train': 0.6444102439563721} 12/28/2021 14:16:28 - INFO - codeparrot_training - Step 42757: {'lr': 2.562068403368195e-05, 'samples': 21892096, 'steps': 42757, 'batch_loss/train': 0.659648559987545} 12/28/2021 14:16:38 - INFO - codeparrot_training - Step 42758: {'lr': 2.5613733263485955e-05, 'samples': 21892608, 'steps': 42758, 'batch_loss/train': 0.6973035028204322} 12/28/2021 14:16:49 - INFO - codeparrot_training - Step 42759: {'lr': 2.5606783385365012e-05, 'samples': 21893120, 'steps': 42759, 'batch_loss/train': 0.6015217194799334} 12/28/2021 14:17:01 - INFO - codeparrot_training - Step 42760: {'lr': 2.5599834399346793e-05, 'samples': 21893632, 'steps': 42760, 'batch_loss/train': 0.9216554001905024} 12/28/2021 14:17:12 - INFO - codeparrot_training - Step 42761: {'lr': 2.5592886305458917e-05, 'samples': 21894144, 'steps': 42761, 'batch_loss/train': 1.5440112696960568} 12/28/2021 14:17:22 - INFO - codeparrot_training - Step 42762: {'lr': 2.5585939103728917e-05, 'samples': 21894656, 'steps': 42762, 'batch_loss/train': 0.76991150341928} 12/28/2021 14:17:33 - INFO - codeparrot_training - Step 42763: {'lr': 2.5578992794184547e-05, 'samples': 21895168, 'steps': 42763, 'batch_loss/train': 0.7136429056990892} 12/28/2021 14:17:47 - INFO - codeparrot_training - Step 42764: {'lr': 2.55720473768534e-05, 'samples': 21895680, 'steps': 42764, 'batch_loss/train': 0.6414009278523736} 12/28/2021 14:17:58 - INFO - codeparrot_training - Step 42765: {'lr': 2.5565102851763002e-05, 'samples': 21896192, 'steps': 42765, 'batch_loss/train': 0.640815129969269} 12/28/2021 14:18:08 - INFO - codeparrot_training - Step 42766: {'lr': 2.555815921894103e-05, 'samples': 21896704, 'steps': 42766, 'batch_loss/train': 0.7031587292440236} 12/28/2021 14:18:20 - INFO - codeparrot_training - Step 42767: {'lr': 2.5551216478415045e-05, 'samples': 21897216, 'steps': 42767, 'batch_loss/train': 0.6807758179493248} 12/28/2021 14:18:31 - INFO - codeparrot_training - Step 42768: {'lr': 
2.5544274630212693e-05, 'samples': 21897728, 'steps': 42768, 'batch_loss/train': 0.7680954309180379} 12/28/2021 14:18:42 - INFO - codeparrot_training - Step 42769: {'lr': 2.5537333674361534e-05, 'samples': 21898240, 'steps': 42769, 'batch_loss/train': 0.6853818618692458} 12/28/2021 14:18:54 - INFO - codeparrot_training - Step 42770: {'lr': 2.5530393610889214e-05, 'samples': 21898752, 'steps': 42770, 'batch_loss/train': 0.692243620287627} 12/28/2021 14:19:04 - INFO - codeparrot_training - Step 42771: {'lr': 2.5523454439823262e-05, 'samples': 21899264, 'steps': 42771, 'batch_loss/train': 0.79577611759305} 12/28/2021 14:19:15 - INFO - codeparrot_training - Step 42772: {'lr': 2.55165161611913e-05, 'samples': 21899776, 'steps': 42772, 'batch_loss/train': 0.6576126329600811} 12/28/2021 14:19:26 - INFO - codeparrot_training - Step 42773: {'lr': 2.5509578775020914e-05, 'samples': 21900288, 'steps': 42773, 'batch_loss/train': 0.7357192793861032} 12/28/2021 14:19:38 - INFO - codeparrot_training - Step 42774: {'lr': 2.5502642281339693e-05, 'samples': 21900800, 'steps': 42774, 'batch_loss/train': 0.5522103044204414} 12/28/2021 14:19:49 - INFO - codeparrot_training - Step 42775: {'lr': 2.549570668017523e-05, 'samples': 21901312, 'steps': 42775, 'batch_loss/train': 0.7295858213328756} 12/28/2021 14:19:59 - INFO - codeparrot_training - Step 42776: {'lr': 2.5488771971555023e-05, 'samples': 21901824, 'steps': 42776, 'batch_loss/train': 0.7363025257363915} 12/28/2021 14:20:13 - INFO - codeparrot_training - Step 42777: {'lr': 2.548183815550664e-05, 'samples': 21902336, 'steps': 42777, 'batch_loss/train': 0.694617934525013} 12/28/2021 14:20:24 - INFO - codeparrot_training - Step 42778: {'lr': 2.5474905232057784e-05, 'samples': 21902848, 'steps': 42778, 'batch_loss/train': 0.6330925300717354} 12/28/2021 14:20:34 - INFO - codeparrot_training - Step 42779: {'lr': 2.5467973201235896e-05, 'samples': 21903360, 'steps': 42779, 'batch_loss/train': 0.7470101742073894} 12/28/2021 14:20:47 - 
INFO - codeparrot_training - Step 42780: {'lr': 2.5461042063068485e-05, 'samples': 21903872, 'steps': 42780, 'batch_loss/train': 0.7204806893132627} 12/28/2021 14:20:57 - INFO - codeparrot_training - Step 42781: {'lr': 2.545411181758331e-05, 'samples': 21904384, 'steps': 42781, 'batch_loss/train': 0.7517221737653017} 12/28/2021 14:21:08 - INFO - codeparrot_training - Step 42782: {'lr': 2.544718246480776e-05, 'samples': 21904896, 'steps': 42782, 'batch_loss/train': 0.7077350728213787} 12/28/2021 14:21:18 - INFO - codeparrot_training - Step 42783: {'lr': 2.5440254004769404e-05, 'samples': 21905408, 'steps': 42783, 'batch_loss/train': 0.7426349364686757} 12/28/2021 14:21:31 - INFO - codeparrot_training - Step 42784: {'lr': 2.5433326437495824e-05, 'samples': 21905920, 'steps': 42784, 'batch_loss/train': 0.6405955068767071} 12/28/2021 14:21:41 - INFO - codeparrot_training - Step 42785: {'lr': 2.5426399763014528e-05, 'samples': 21906432, 'steps': 42785, 'batch_loss/train': 0.5949881731648929} 12/28/2021 14:21:52 - INFO - codeparrot_training - Step 42786: {'lr': 2.541947398135308e-05, 'samples': 21906944, 'steps': 42786, 'batch_loss/train': 0.5637305612035561} 12/28/2021 14:22:06 - INFO - codeparrot_training - Step 42787: {'lr': 2.5412549092539066e-05, 'samples': 21907456, 'steps': 42787, 'batch_loss/train': 0.8043093690648675} 12/28/2021 14:22:17 - INFO - codeparrot_training - Step 42788: {'lr': 2.5405625096599853e-05, 'samples': 21907968, 'steps': 42788, 'batch_loss/train': 0.6525299986824393} 12/28/2021 14:22:27 - INFO - codeparrot_training - Step 42789: {'lr': 2.5398701993563115e-05, 'samples': 21908480, 'steps': 42789, 'batch_loss/train': 0.7140269577503204} 12/28/2021 14:22:39 - INFO - codeparrot_training - Step 42790: {'lr': 2.5391779783456388e-05, 'samples': 21908992, 'steps': 42790, 'batch_loss/train': 0.7434944584965706} 12/28/2021 14:22:50 - INFO - codeparrot_training - Step 42791: {'lr': 2.5384858466307003e-05, 'samples': 21909504, 'steps': 42791, 
'batch_loss/train': 0.7174299934995361} 12/28/2021 14:23:01 - INFO - codeparrot_training - Step 42792: {'lr': 2.5377938042142695e-05, 'samples': 21910016, 'steps': 42792, 'batch_loss/train': 0.7593349497765303} 12/28/2021 14:23:13 - INFO - codeparrot_training - Step 42793: {'lr': 2.537101851099094e-05, 'samples': 21910528, 'steps': 42793, 'batch_loss/train': 0.6999371766578406} 12/28/2021 14:23:23 - INFO - codeparrot_training - Step 42794: {'lr': 2.536409987287913e-05, 'samples': 21911040, 'steps': 42794, 'batch_loss/train': 0.6934655061922967} 12/28/2021 14:23:34 - INFO - codeparrot_training - Step 42795: {'lr': 2.535718212783486e-05, 'samples': 21911552, 'steps': 42795, 'batch_loss/train': 0.7147474754601717} 12/28/2021 14:23:45 - INFO - codeparrot_training - Step 42796: {'lr': 2.5350265275885605e-05, 'samples': 21912064, 'steps': 42796, 'batch_loss/train': 0.5786329702241346} 12/28/2021 14:23:59 - INFO - codeparrot_training - Step 42797: {'lr': 2.5343349317058868e-05, 'samples': 21912576, 'steps': 42797, 'batch_loss/train': 0.7859419812448323} 12/28/2021 14:24:09 - INFO - codeparrot_training - Step 42798: {'lr': 2.533643425138213e-05, 'samples': 21913088, 'steps': 42798, 'batch_loss/train': 0.6968350079841912} 12/28/2021 14:24:20 - INFO - codeparrot_training - Step 42799: {'lr': 2.532952007888292e-05, 'samples': 21913600, 'steps': 42799, 'batch_loss/train': 0.5781924861948937} 12/28/2021 14:24:32 - INFO - codeparrot_training - Step 42800: {'lr': 2.5322606799588697e-05, 'samples': 21914112, 'steps': 42800, 'batch_loss/train': 0.6763314958661795} 12/28/2021 14:24:43 - INFO - codeparrot_training - Step 42801: {'lr': 2.5315694413527017e-05, 'samples': 21914624, 'steps': 42801, 'batch_loss/train': 0.6646315176039934} 12/28/2021 14:24:53 - INFO - codeparrot_training - Step 42802: {'lr': 2.530878292072525e-05, 'samples': 21915136, 'steps': 42802, 'batch_loss/train': 0.7928918190300465} 12/28/2021 14:25:07 - INFO - codeparrot_training - Step 42803: {'lr': 
2.5301872321210868e-05, 'samples': 21915648, 'steps': 42803, 'batch_loss/train': 0.7807258386164904} 12/28/2021 14:25:18 - INFO - codeparrot_training - Step 42804: {'lr': 2.529496261501149e-05, 'samples': 21916160, 'steps': 42804, 'batch_loss/train': 0.7239598288433626} 12/28/2021 14:25:29 - INFO - codeparrot_training - Step 42805: {'lr': 2.5288053802154432e-05, 'samples': 21916672, 'steps': 42805, 'batch_loss/train': 0.4254409911809489} 12/28/2021 14:25:41 - INFO - codeparrot_training - Step 42806: {'lr': 2.5281145882667194e-05, 'samples': 21917184, 'steps': 42806, 'batch_loss/train': 0.6412277482450008} 12/28/2021 14:25:52 - INFO - codeparrot_training - Step 42807: {'lr': 2.527423885657737e-05, 'samples': 21917696, 'steps': 42807, 'batch_loss/train': 0.6783397980616428} 12/28/2021 14:26:03 - INFO - codeparrot_training - Step 42808: {'lr': 2.526733272391227e-05, 'samples': 21918208, 'steps': 42808, 'batch_loss/train': 0.6620555591362063} 12/28/2021 14:26:13 - INFO - codeparrot_training - Step 42809: {'lr': 2.5260427484699395e-05, 'samples': 21918720, 'steps': 42809, 'batch_loss/train': 0.6546889995224774} 12/28/2021 14:26:26 - INFO - codeparrot_training - Step 42810: {'lr': 2.52535231389662e-05, 'samples': 21919232, 'steps': 42810, 'batch_loss/train': 0.6171129299327731} 12/28/2021 14:26:36 - INFO - codeparrot_training - Step 42811: {'lr': 2.5246619686740136e-05, 'samples': 21919744, 'steps': 42811, 'batch_loss/train': 0.6654891101643443} 12/28/2021 14:26:47 - INFO - codeparrot_training - Step 42812: {'lr': 2.523971712804865e-05, 'samples': 21920256, 'steps': 42812, 'batch_loss/train': 0.5938084699737374} 12/28/2021 14:26:59 - INFO - codeparrot_training - Step 42813: {'lr': 2.523281546291925e-05, 'samples': 21920768, 'steps': 42813, 'batch_loss/train': 0.7413388143759221} 12/28/2021 14:27:10 - INFO - codeparrot_training - Step 42814: {'lr': 2.522591469137919e-05, 'samples': 21921280, 'steps': 42814, 'batch_loss/train': 1.0359428878873587} 12/28/2021 14:27:20 - 
INFO - codeparrot_training - Step 42815: {'lr': 2.5219014813456065e-05, 'samples': 21921792, 'steps': 42815, 'batch_loss/train': 0.6838598009198904} 12/28/2021 14:27:34 - INFO - codeparrot_training - Step 42816: {'lr': 2.5212115829177316e-05, 'samples': 21922304, 'steps': 42816, 'batch_loss/train': 0.7425763597711921} 12/28/2021 14:27:45 - INFO - codeparrot_training - Step 42817: {'lr': 2.5205217738570236e-05, 'samples': 21922816, 'steps': 42817, 'batch_loss/train': 0.6282291430979967} 12/28/2021 14:27:56 - INFO - codeparrot_training - Step 42818: {'lr': 2.519832054166235e-05, 'samples': 21923328, 'steps': 42818, 'batch_loss/train': 0.759013238362968} 12/28/2021 14:28:06 - INFO - codeparrot_training - Step 42819: {'lr': 2.5191424238481143e-05, 'samples': 21923840, 'steps': 42819, 'batch_loss/train': 0.7557497806847095} 12/28/2021 14:28:19 - INFO - codeparrot_training - Step 42820: {'lr': 2.5184528829053866e-05, 'samples': 21924352, 'steps': 42820, 'batch_loss/train': 0.6783348633907735} 12/28/2021 14:28:29 - INFO - codeparrot_training - Step 42821: {'lr': 2.5177634313408056e-05, 'samples': 21924864, 'steps': 42821, 'batch_loss/train': 0.5342044270655606} 12/28/2021 14:28:40 - INFO - codeparrot_training - Step 42822: {'lr': 2.5170740691571046e-05, 'samples': 21925376, 'steps': 42822, 'batch_loss/train': 0.803897196543403} 12/28/2021 14:28:52 - INFO - codeparrot_training - Step 42823: {'lr': 2.5163847963570298e-05, 'samples': 21925888, 'steps': 42823, 'batch_loss/train': 0.6911729069543071} 12/28/2021 14:29:03 - INFO - codeparrot_training - Step 42824: {'lr': 2.5156956129433197e-05, 'samples': 21926400, 'steps': 42824, 'batch_loss/train': 0.6198061590257566} 12/28/2021 14:29:13 - INFO - codeparrot_training - Step 42825: {'lr': 2.515006518918711e-05, 'samples': 21926912, 'steps': 42825, 'batch_loss/train': 0.6921493839472532} 12/28/2021 14:29:28 - INFO - codeparrot_training - Step 42826: {'lr': 2.514317514285949e-05, 'samples': 21927424, 'steps': 42826, 
'batch_loss/train': 0.718626297544688} 12/28/2021 14:29:38 - INFO - codeparrot_training - Step 42827: {'lr': 2.513628599047771e-05, 'samples': 21927936, 'steps': 42827, 'batch_loss/train': 0.47944675784674473} 12/28/2021 14:29:49 - INFO - codeparrot_training - Step 42828: {'lr': 2.5129397732069152e-05, 'samples': 21928448, 'steps': 42828, 'batch_loss/train': 0.7290273830294609} 12/28/2021 14:29:59 - INFO - codeparrot_training - Step 42829: {'lr': 2.5122510367661166e-05, 'samples': 21928960, 'steps': 42829, 'batch_loss/train': 0.6491705002263188} 12/28/2021 14:30:12 - INFO - codeparrot_training - Step 42830: {'lr': 2.5115623897281253e-05, 'samples': 21929472, 'steps': 42830, 'batch_loss/train': 0.7589916801080108} 12/28/2021 14:30:22 - INFO - codeparrot_training - Step 42831: {'lr': 2.5108738320956643e-05, 'samples': 21929984, 'steps': 42831, 'batch_loss/train': 0.6460111024789512} 12/28/2021 14:30:33 - INFO - codeparrot_training - Step 42832: {'lr': 2.5101853638714727e-05, 'samples': 21930496, 'steps': 42832, 'batch_loss/train': 0.7435944303870201} 12/28/2021 14:30:47 - INFO - codeparrot_training - Step 42833: {'lr': 2.5094969850583016e-05, 'samples': 21931008, 'steps': 42833, 'batch_loss/train': 0.6946047199890018} 12/28/2021 14:30:57 - INFO - codeparrot_training - Step 42834: {'lr': 2.5088086956588764e-05, 'samples': 21931520, 'steps': 42834, 'batch_loss/train': 0.5463161512743682} 12/28/2021 14:31:08 - INFO - codeparrot_training - Step 42835: {'lr': 2.5081204956759307e-05, 'samples': 21932032, 'steps': 42835, 'batch_loss/train': 0.5818191077560186} 12/28/2021 14:31:20 - INFO - codeparrot_training - Step 42836: {'lr': 2.5074323851122098e-05, 'samples': 21932544, 'steps': 42836, 'batch_loss/train': 0.7145186886191368} 12/28/2021 14:31:31 - INFO - codeparrot_training - Step 42837: {'lr': 2.5067443639704423e-05, 'samples': 21933056, 'steps': 42837, 'batch_loss/train': 0.5675098836654797} 12/28/2021 14:31:41 - INFO - codeparrot_training - Step 42838: {'lr': 
2.5060564322533673e-05, 'samples': 21933568, 'steps': 42838, 'batch_loss/train': 0.5507583320140839} 12/28/2021 14:31:53 - INFO - codeparrot_training - Step 42839: {'lr': 2.505368589963722e-05, 'samples': 21934080, 'steps': 42839, 'batch_loss/train': 0.6230059087392874} 12/28/2021 14:32:04 - INFO - codeparrot_training - Step 42840: {'lr': 2.5046808371042285e-05, 'samples': 21934592, 'steps': 42840, 'batch_loss/train': 0.6539830882102251} 12/28/2021 14:32:15 - INFO - codeparrot_training - Step 42841: {'lr': 2.503993173677635e-05, 'samples': 21935104, 'steps': 42841, 'batch_loss/train': 0.7238431451842189} 12/28/2021 14:32:25 - INFO - codeparrot_training - Step 42842: {'lr': 2.5033055996866756e-05, 'samples': 21935616, 'steps': 42842, 'batch_loss/train': 0.6003724434704054} 12/28/2021 14:32:38 - INFO - codeparrot_training - Step 42843: {'lr': 2.502618115134067e-05, 'samples': 21936128, 'steps': 42843, 'batch_loss/train': 0.6600670958869159} 12/28/2021 14:32:48 - INFO - codeparrot_training - Step 42844: {'lr': 2.5019307200225604e-05, 'samples': 21936640, 'steps': 42844, 'batch_loss/train': 0.7215003808960319} 12/28/2021 14:32:59 - INFO - codeparrot_training - Step 42845: {'lr': 2.5012434143548897e-05, 'samples': 21937152, 'steps': 42845, 'batch_loss/train': 0.5353306455072016} 12/28/2021 14:33:13 - INFO - codeparrot_training - Step 42846: {'lr': 2.5005561981337715e-05, 'samples': 21937664, 'steps': 42846, 'batch_loss/train': 0.7256592493504286} 12/28/2021 14:33:23 - INFO - codeparrot_training - Step 42847: {'lr': 2.499869071361949e-05, 'samples': 21938176, 'steps': 42847, 'batch_loss/train': 0.6334453218150884} 12/28/2021 14:33:34 - INFO - codeparrot_training - Step 42848: {'lr': 2.4991820340421496e-05, 'samples': 21938688, 'steps': 42848, 'batch_loss/train': 0.7250768393278122} 12/28/2021 14:33:46 - INFO - codeparrot_training - Step 42849: {'lr': 2.498495086177108e-05, 'samples': 21939200, 'steps': 42849, 'batch_loss/train': 0.6102676363661885} 12/28/2021 14:33:57 - 
INFO - codeparrot_training - Step 42850: {'lr': 2.4978082277695518e-05, 'samples': 21939712, 'steps': 42850, 'batch_loss/train': 0.6937053129076958} 12/28/2021 14:34:07 - INFO - codeparrot_training - Step 42851: {'lr': 2.497121458822213e-05, 'samples': 21940224, 'steps': 42851, 'batch_loss/train': 0.6893827985040843} 12/28/2021 14:34:18 - INFO - codeparrot_training - Step 42852: {'lr': 2.496434779337825e-05, 'samples': 21940736, 'steps': 42852, 'batch_loss/train': 0.8282265090383589} 12/28/2021 14:34:30 - INFO - codeparrot_training - Step 42853: {'lr': 2.4957481893191138e-05, 'samples': 21941248, 'steps': 42853, 'batch_loss/train': 0.6762414088007063} 12/28/2021 14:34:41 - INFO - codeparrot_training - Step 42854: {'lr': 2.4950616887688128e-05, 'samples': 21941760, 'steps': 42854, 'batch_loss/train': 0.6827224921435118} 12/28/2021 14:34:51 - INFO - codeparrot_training - Step 42855: {'lr': 2.494375277689645e-05, 'samples': 21942272, 'steps': 42855, 'batch_loss/train': 0.6704500191845} 12/28/2021 14:35:05 - INFO - codeparrot_training - Step 42856: {'lr': 2.49368895608435e-05, 'samples': 21942784, 'steps': 42856, 'batch_loss/train': 0.6936539728194475} 12/28/2021 14:35:16 - INFO - codeparrot_training - Step 42857: {'lr': 2.4930027239556453e-05, 'samples': 21943296, 'steps': 42857, 'batch_loss/train': 0.6632503694854677} 12/28/2021 14:35:27 - INFO - codeparrot_training - Step 42858: {'lr': 2.4923165813062586e-05, 'samples': 21943808, 'steps': 42858, 'batch_loss/train': 0.7816369347274303} 12/28/2021 14:35:39 - INFO - codeparrot_training - Step 42859: {'lr': 2.491630528138933e-05, 'samples': 21944320, 'steps': 42859, 'batch_loss/train': 0.6740704560652375} 12/28/2021 14:35:50 - INFO - codeparrot_training - Step 42860: {'lr': 2.4909445644563794e-05, 'samples': 21944832, 'steps': 42860, 'batch_loss/train': 0.7299590781331062} 12/28/2021 14:36:00 - INFO - codeparrot_training - Step 42861: {'lr': 2.490258690261332e-05, 'samples': 21945344, 'steps': 42861, 'batch_loss/train': 
0.6960535527323373} 12/28/2021 14:36:14 - INFO - codeparrot_training - Step 42862: {'lr': 2.489572905556517e-05, 'samples': 21945856, 'steps': 42862, 'batch_loss/train': 0.6239629140764009} 12/28/2021 14:36:25 - INFO - codeparrot_training - Step 42863: {'lr': 2.488887210344659e-05, 'samples': 21946368, 'steps': 42863, 'batch_loss/train': 0.5276808957569301} 12/28/2021 14:36:35 - INFO - codeparrot_training - Step 42864: {'lr': 2.4882016046284872e-05, 'samples': 21946880, 'steps': 42864, 'batch_loss/train': 0.6689766675699502} 12/28/2021 14:36:46 - INFO - codeparrot_training - Step 42865: {'lr': 2.487516088410724e-05, 'samples': 21947392, 'steps': 42865, 'batch_loss/train': 0.696465949062258} 12/28/2021 14:36:58 - INFO - codeparrot_training - Step 42866: {'lr': 2.4868306616940977e-05, 'samples': 21947904, 'steps': 42866, 'batch_loss/train': 0.7425555000081658} 12/28/2021 14:37:09 - INFO - codeparrot_training - Step 42867: {'lr': 2.4861453244813315e-05, 'samples': 21948416, 'steps': 42867, 'batch_loss/train': 0.6640648851171136} 12/28/2021 14:37:19 - INFO - codeparrot_training - Step 42868: {'lr': 2.4854600767751562e-05, 'samples': 21948928, 'steps': 42868, 'batch_loss/train': 0.666034581779968} 12/28/2021 14:37:32 - INFO - codeparrot_training - Step 42869: {'lr': 2.484774918578281e-05, 'samples': 21949440, 'steps': 42869, 'batch_loss/train': 0.9163078740239143} 12/28/2021 14:37:42 - INFO - codeparrot_training - Step 42870: {'lr': 2.4840898498934423e-05, 'samples': 21949952, 'steps': 42870, 'batch_loss/train': 0.7225278849364258} 12/28/2021 14:37:53 - INFO - codeparrot_training - Step 42871: {'lr': 2.4834048707233657e-05, 'samples': 21950464, 'steps': 42871, 'batch_loss/train': 0.7016482655890286} 12/28/2021 14:38:07 - INFO - codeparrot_training - Step 42872: {'lr': 2.4827199810707664e-05, 'samples': 21950976, 'steps': 42872, 'batch_loss/train': 0.5205401700804941} 12/28/2021 14:38:18 - INFO - codeparrot_training - Step 42873: {'lr': 2.4820351809383633e-05, 'samples': 
21951488, 'steps': 42873, 'batch_loss/train': 0.7500793114304543} 12/28/2021 14:38:28 - INFO - codeparrot_training - Step 42874: {'lr': 2.481350470328897e-05, 'samples': 21952000, 'steps': 42874, 'batch_loss/train': 0.7738585125189275} 12/28/2021 14:38:39 - INFO - codeparrot_training - Step 42875: {'lr': 2.4806658492450755e-05, 'samples': 21952512, 'steps': 42875, 'batch_loss/train': 0.6189043829217553} 12/28/2021 14:38:51 - INFO - codeparrot_training - Step 42876: {'lr': 2.4799813176896195e-05, 'samples': 21953024, 'steps': 42876, 'batch_loss/train': 0.7489926163107157} 12/28/2021 14:39:02 - INFO - codeparrot_training - Step 42877: {'lr': 2.4792968756652574e-05, 'samples': 21953536, 'steps': 42877, 'batch_loss/train': 0.6778360074386001} 12/28/2021 14:39:12 - INFO - codeparrot_training - Step 42878: {'lr': 2.4786125231747057e-05, 'samples': 21954048, 'steps': 42878, 'batch_loss/train': 0.6778342113830149} 12/28/2021 14:39:25 - INFO - codeparrot_training - Step 42879: {'lr': 2.4779282602206882e-05, 'samples': 21954560, 'steps': 42879, 'batch_loss/train': 0.6453069183044136} 12/28/2021 14:39:35 - INFO - codeparrot_training - Step 42880: {'lr': 2.4772440868059242e-05, 'samples': 21955072, 'steps': 42880, 'batch_loss/train': 0.7372348327189684} 12/28/2021 14:39:46 - INFO - codeparrot_training - Step 42881: {'lr': 2.4765600029331313e-05, 'samples': 21955584, 'steps': 42881, 'batch_loss/train': 0.7964602924475912} 12/28/2021 14:39:58 - INFO - codeparrot_training - Step 42882: {'lr': 2.4758760086050353e-05, 'samples': 21956096, 'steps': 42882, 'batch_loss/train': 0.6845827656798065} 12/28/2021 14:40:09 - INFO - codeparrot_training - Step 42883: {'lr': 2.475192103824353e-05, 'samples': 21956608, 'steps': 42883, 'batch_loss/train': 0.6654734909534454} 12/28/2021 14:40:19 - INFO - codeparrot_training - Step 42884: {'lr': 2.4745082885937935e-05, 'samples': 21957120, 'steps': 42884, 'batch_loss/train': 0.7208289038389921} 12/28/2021 14:40:33 - INFO - codeparrot_training - 
Step 42885: {'lr': 2.473824562916091e-05, 'samples': 21957632, 'steps': 42885, 'batch_loss/train': 0.6737242843955755} 12/28/2021 14:40:44 - INFO - codeparrot_training - Step 42886: {'lr': 2.4731409267939543e-05, 'samples': 21958144, 'steps': 42886, 'batch_loss/train': 0.8079843977466226} 12/28/2021 14:40:54 - INFO - codeparrot_training - Step 42887: {'lr': 2.4724573802301032e-05, 'samples': 21958656, 'steps': 42887, 'batch_loss/train': 0.6118491585366428} 12/28/2021 14:41:05 - INFO - codeparrot_training - Step 42888: {'lr': 2.471773923227255e-05, 'samples': 21959168, 'steps': 42888, 'batch_loss/train': 0.7072059424826875} 12/28/2021 14:41:17 - INFO - codeparrot_training - Step 42889: {'lr': 2.4710905557881274e-05, 'samples': 21959680, 'steps': 42889, 'batch_loss/train': 0.7179166376590729} 12/28/2021 14:41:28 - INFO - codeparrot_training - Step 42890: {'lr': 2.4704072779154373e-05, 'samples': 21960192, 'steps': 42890, 'batch_loss/train': 0.8112422528211027} 12/28/2021 14:41:38 - INFO - codeparrot_training - Step 42891: {'lr': 2.4697240896118995e-05, 'samples': 21960704, 'steps': 42891, 'batch_loss/train': 0.6932994364760816} 12/28/2021 14:41:51 - INFO - codeparrot_training - Step 42892: {'lr': 2.4690409908802335e-05, 'samples': 21961216, 'steps': 42892, 'batch_loss/train': 0.6944080349057913} 12/28/2021 14:42:01 - INFO - codeparrot_training - Step 42893: {'lr': 2.4683579817231544e-05, 'samples': 21961728, 'steps': 42893, 'batch_loss/train': 0.6393844303674996} 12/28/2021 14:42:12 - INFO - codeparrot_training - Step 42894: {'lr': 2.4676750621433793e-05, 'samples': 21962240, 'steps': 42894, 'batch_loss/train': 0.7361234896816313} 12/28/2021 14:42:26 - INFO - codeparrot_training - Step 42895: {'lr': 2.4669922321436084e-05, 'samples': 21962752, 'steps': 42895, 'batch_loss/train': 0.6437515001744032} 12/28/2021 14:42:37 - INFO - codeparrot_training - Step 42896: {'lr': 2.466309491726576e-05, 'samples': 21963264, 'steps': 42896, 'batch_loss/train': 0.7167964857071638} 
12/28/2021 14:42:47 - INFO - codeparrot_training - Step 42897: {'lr': 2.465626840894994e-05, 'samples': 21963776, 'steps': 42897, 'batch_loss/train': 0.6593714961200021} 12/28/2021 14:42:58 - INFO - codeparrot_training - Step 42898: {'lr': 2.4649442796515655e-05, 'samples': 21964288, 'steps': 42898, 'batch_loss/train': 0.6590352016501129} 12/28/2021 14:43:10 - INFO - codeparrot_training - Step 42899: {'lr': 2.4642618079990047e-05, 'samples': 21964800, 'steps': 42899, 'batch_loss/train': 0.6818867707625031} 12/28/2021 14:43:21 - INFO - codeparrot_training - Step 42900: {'lr': 2.463579425940038e-05, 'samples': 21965312, 'steps': 42900, 'batch_loss/train': 0.6405623502796516} 12/28/2021 14:43:31 - INFO - codeparrot_training - Step 42901: {'lr': 2.4628971334773653e-05, 'samples': 21965824, 'steps': 42901, 'batch_loss/train': 0.6640281956642866} 12/28/2021 14:43:45 - INFO - codeparrot_training - Step 42902: {'lr': 2.462214930613707e-05, 'samples': 21966336, 'steps': 42902, 'batch_loss/train': 0.7069559432566166} 12/28/2021 14:43:56 - INFO - codeparrot_training - Step 42903: {'lr': 2.4615328173517692e-05, 'samples': 21966848, 'steps': 42903, 'batch_loss/train': 0.6705914009362459} 12/28/2021 14:44:07 - INFO - codeparrot_training - Step 42904: {'lr': 2.4608507936942665e-05, 'samples': 21967360, 'steps': 42904, 'batch_loss/train': 0.7076231817482039} 12/28/2021 14:44:19 - INFO - codeparrot_training - Step 42905: {'lr': 2.460168859643913e-05, 'samples': 21967872, 'steps': 42905, 'batch_loss/train': 0.7553669568151236} 12/28/2021 14:44:30 - INFO - codeparrot_training - Step 42906: {'lr': 2.4594870152034156e-05, 'samples': 21968384, 'steps': 42906, 'batch_loss/train': 0.5939631969667971} 12/28/2021 14:44:40 - INFO - codeparrot_training - Step 42907: {'lr': 2.4588052603754878e-05, 'samples': 21968896, 'steps': 42907, 'batch_loss/train': 0.6999860443174839} 12/28/2021 14:44:52 - INFO - codeparrot_training - Step 42908: {'lr': 2.4581235951628394e-05, 'samples': 21969408, 
'steps': 42908, 'batch_loss/train': 0.7559133642353117} 12/28/2021 14:45:04 - INFO - codeparrot_training - Step 42909: {'lr': 2.4574420195681847e-05, 'samples': 21969920, 'steps': 42909, 'batch_loss/train': 0.698761741630733} 12/28/2021 14:45:14 - INFO - codeparrot_training - Step 42910: {'lr': 2.4567605335942183e-05, 'samples': 21970432, 'steps': 42910, 'batch_loss/train': 0.6120567484758794} 12/28/2021 14:45:25 - INFO - codeparrot_training - Step 42911: {'lr': 2.4560791372436693e-05, 'samples': 21970944, 'steps': 42911, 'batch_loss/train': 0.6621054589631967} 12/28/2021 14:45:39 - INFO - codeparrot_training - Step 42912: {'lr': 2.455397830519235e-05, 'samples': 21971456, 'steps': 42912, 'batch_loss/train': 0.8168098246096633} 12/28/2021 14:45:50 - INFO - codeparrot_training - Step 42913: {'lr': 2.454716613423624e-05, 'samples': 21971968, 'steps': 42913, 'batch_loss/train': 0.716584493406117} 12/28/2021 14:46:00 - INFO - codeparrot_training - Step 42914: {'lr': 2.4540354859595466e-05, 'samples': 21972480, 'steps': 42914, 'batch_loss/train': 0.6817581849172711} 12/28/2021 14:46:12 - INFO - codeparrot_training - Step 42915: {'lr': 2.4533544481297133e-05, 'samples': 21972992, 'steps': 42915, 'batch_loss/train': 0.8888501385226846} 12/28/2021 14:46:23 - INFO - codeparrot_training - Step 42916: {'lr': 2.452673499936828e-05, 'samples': 21973504, 'steps': 42916, 'batch_loss/train': 0.6284152949228883} 12/28/2021 14:46:34 - INFO - codeparrot_training - Step 42917: {'lr': 2.451992641383599e-05, 'samples': 21974016, 'steps': 42917, 'batch_loss/train': 0.7843422621954232} 12/28/2021 14:46:44 - INFO - codeparrot_training - Step 42918: {'lr': 2.4513118724727363e-05, 'samples': 21974528, 'steps': 42918, 'batch_loss/train': 0.6309530581347644} 12/28/2021 14:46:57 - INFO - codeparrot_training - Step 42919: {'lr': 2.4506311932069397e-05, 'samples': 21975040, 'steps': 42919, 'batch_loss/train': 0.7111790790222585} 12/28/2021 14:47:07 - INFO - codeparrot_training - Step 42920: 
{'lr': 2.449950603588927e-05, 'samples': 21975552, 'steps': 42920, 'batch_loss/train': 0.67299650516361} 12/28/2021 14:47:18 - INFO - codeparrot_training - Step 42921: {'lr': 2.4492701036213843e-05, 'samples': 21976064, 'steps': 42921, 'batch_loss/train': 0.5042471377528273} 12/28/2021 14:47:30 - INFO - codeparrot_training - Step 42922: {'lr': 2.448589693307035e-05, 'samples': 21976576, 'steps': 42922, 'batch_loss/train': 0.6798484059982002} 12/28/2021 14:47:40 - INFO - codeparrot_training - Step 42923: {'lr': 2.4479093726485825e-05, 'samples': 21977088, 'steps': 42923, 'batch_loss/train': 0.6396222955081612} 12/28/2021 14:47:51 - INFO - codeparrot_training - Step 42924: {'lr': 2.4472291416487213e-05, 'samples': 21977600, 'steps': 42924, 'batch_loss/train': 0.7002434376627207} 12/28/2021 14:48:05 - INFO - codeparrot_training - Step 42925: {'lr': 2.446549000310158e-05, 'samples': 21978112, 'steps': 42925, 'batch_loss/train': 0.7542052504140884} 12/28/2021 14:48:16 - INFO - codeparrot_training - Step 42926: {'lr': 2.44586894863561e-05, 'samples': 21978624, 'steps': 42926, 'batch_loss/train': 0.7706100754439831} 12/28/2021 14:48:26 - INFO - codeparrot_training - Step 42927: {'lr': 2.445188986627764e-05, 'samples': 21979136, 'steps': 42927, 'batch_loss/train': 0.6813106233894359} 12/28/2021 14:48:38 - INFO - codeparrot_training - Step 42928: {'lr': 2.4445091142893313e-05, 'samples': 21979648, 'steps': 42928, 'batch_loss/train': 0.7603783532977104} 12/28/2021 14:48:49 - INFO - codeparrot_training - Step 42929: {'lr': 2.4438293316230125e-05, 'samples': 21980160, 'steps': 42929, 'batch_loss/train': 0.7112086699344218} 12/28/2021 14:49:00 - INFO - codeparrot_training - Step 42930: {'lr': 2.4431496386315143e-05, 'samples': 21980672, 'steps': 42930, 'batch_loss/train': 0.6996079073287547} 12/28/2021 14:49:10 - INFO - codeparrot_training - Step 42931: {'lr': 2.4424700353175344e-05, 'samples': 21981184, 'steps': 42931, 'batch_loss/train': 0.5855051049147733} 12/28/2021 
14:49:24 - INFO - codeparrot_training - Step 42932: {'lr': 2.441790521683776e-05, 'samples': 21981696, 'steps': 42932, 'batch_loss/train': 0.5795045546838082} 12/28/2021 14:49:35 - INFO - codeparrot_training - Step 42933: {'lr': 2.4411110977329398e-05, 'samples': 21982208, 'steps': 42933, 'batch_loss/train': 0.7990483939647675} 12/28/2021 14:49:46 - INFO - codeparrot_training - Step 42934: {'lr': 2.440431763467729e-05, 'samples': 21982720, 'steps': 42934, 'batch_loss/train': 0.7265011421404779} 12/28/2021 14:49:58 - INFO - codeparrot_training - Step 42935: {'lr': 2.4397525188908503e-05, 'samples': 21983232, 'steps': 42935, 'batch_loss/train': 0.7257592556998134} 12/28/2021 14:50:08 - INFO - codeparrot_training - Step 42936: {'lr': 2.4390733640049846e-05, 'samples': 21983744, 'steps': 42936, 'batch_loss/train': 0.6007625595084392} 12/28/2021 14:50:19 - INFO - codeparrot_training - Step 42937: {'lr': 2.4383942988128572e-05, 'samples': 21984256, 'steps': 42937, 'batch_loss/train': 0.7851564618758857} 12/28/2021 14:50:31 - INFO - codeparrot_training - Step 42938: {'lr': 2.4377153233171466e-05, 'samples': 21984768, 'steps': 42938, 'batch_loss/train': 0.7219605622813106} 12/28/2021 14:50:42 - INFO - codeparrot_training - Step 42939: {'lr': 2.4370364375205568e-05, 'samples': 21985280, 'steps': 42939, 'batch_loss/train': 0.7515822381246835} 12/28/2021 14:50:53 - INFO - codeparrot_training - Step 42940: {'lr': 2.4363576414258017e-05, 'samples': 21985792, 'steps': 42940, 'batch_loss/train': 2.1174383223988116} 12/28/2021 14:51:03 - INFO - codeparrot_training - Step 42941: {'lr': 2.4356789350355653e-05, 'samples': 21986304, 'steps': 42941, 'batch_loss/train': 0.7385994717478752} 12/28/2021 14:51:17 - INFO - codeparrot_training - Step 42942: {'lr': 2.4350003183525454e-05, 'samples': 21986816, 'steps': 42942, 'batch_loss/train': 0.6648600553162396} 12/28/2021 14:51:28 - INFO - codeparrot_training - Step 42943: {'lr': 2.4343217913794486e-05, 'samples': 21987328, 'steps': 42943, 
'batch_loss/train': 0.6760694617405534} 12/28/2021 14:51:39 - INFO - codeparrot_training - Step 42944: {'lr': 2.433643354118964e-05, 'samples': 21987840, 'steps': 42944, 'batch_loss/train': 0.6627284134738147} 12/28/2021 14:51:51 - INFO - codeparrot_training - Step 42945: {'lr': 2.432965006573795e-05, 'samples': 21988352, 'steps': 42945, 'batch_loss/train': 0.7267396077513695} 12/28/2021 14:52:01 - INFO - codeparrot_training - Step 42946: {'lr': 2.4322867487466398e-05, 'samples': 21988864, 'steps': 42946, 'batch_loss/train': 0.6660306504927576} 12/28/2021 14:52:12 - INFO - codeparrot_training - Step 42947: {'lr': 2.4316085806401822e-05, 'samples': 21989376, 'steps': 42947, 'batch_loss/train': 0.6039609722211026} 12/28/2021 14:52:26 - INFO - codeparrot_training - Step 42948: {'lr': 2.430930502257134e-05, 'samples': 21989888, 'steps': 42948, 'batch_loss/train': 0.6560165181290358} 12/28/2021 14:52:37 - INFO - codeparrot_training - Step 42949: {'lr': 2.4302525136001868e-05, 'samples': 21990400, 'steps': 42949, 'batch_loss/train': 0.7097772872075438} 12/28/2021 14:52:47 - INFO - codeparrot_training - Step 42950: {'lr': 2.4295746146720283e-05, 'samples': 21990912, 'steps': 42950, 'batch_loss/train': 1.5288089541718364} 12/28/2021 14:52:58 - INFO - codeparrot_training - Step 42951: {'lr': 2.4288968054753558e-05, 'samples': 21991424, 'steps': 42951, 'batch_loss/train': 0.6773628629744053} 12/28/2021 14:53:10 - INFO - codeparrot_training - Step 42952: {'lr': 2.4282190860128756e-05, 'samples': 21991936, 'steps': 42952, 'batch_loss/train': 0.9283945113420486} 12/28/2021 14:53:21 - INFO - codeparrot_training - Step 42953: {'lr': 2.427541456287269e-05, 'samples': 21992448, 'steps': 42953, 'batch_loss/train': 0.6807875689119101} 12/28/2021 14:53:32 - INFO - codeparrot_training - Step 42954: {'lr': 2.4268639163012307e-05, 'samples': 21992960, 'steps': 42954, 'batch_loss/train': 0.5759236401645467} 12/28/2021 14:53:44 - INFO - codeparrot_training - Step 42955: {'lr': 
2.4261864660574647e-05, 'samples': 21993472, 'steps': 42955, 'batch_loss/train': 0.753335440531373} 12/28/2021 14:53:54 - INFO - codeparrot_training - Step 42956: {'lr': 2.425509105558657e-05, 'samples': 21993984, 'steps': 42956, 'batch_loss/train': 0.6944567620521411} 12/28/2021 14:54:05 - INFO - codeparrot_training - Step 42957: {'lr': 2.4248318348074978e-05, 'samples': 21994496, 'steps': 42957, 'batch_loss/train': 0.6890019420534372} 12/28/2021 14:54:19 - INFO - codeparrot_training - Step 42958: {'lr': 2.4241546538066845e-05, 'samples': 21995008, 'steps': 42958, 'batch_loss/train': 0.74009492341429} 12/28/2021 14:54:30 - INFO - codeparrot_training - Step 42959: {'lr': 2.4234775625589094e-05, 'samples': 21995520, 'steps': 42959, 'batch_loss/train': 0.8160582715063356} 12/28/2021 14:54:40 - INFO - codeparrot_training - Step 42960: {'lr': 2.4228005610668623e-05, 'samples': 21996032, 'steps': 42960, 'batch_loss/train': 0.7377775115892291} 12/28/2021 14:54:51 - INFO - codeparrot_training - Step 42961: {'lr': 2.422123649333238e-05, 'samples': 21996544, 'steps': 42961, 'batch_loss/train': 0.535505961481249} 12/28/2021 14:55:03 - INFO - codeparrot_training - Step 42962: {'lr': 2.4214468273607182e-05, 'samples': 21997056, 'steps': 42962, 'batch_loss/train': 0.6226595503976569} 12/28/2021 14:55:14 - INFO - codeparrot_training - Step 42963: {'lr': 2.4207700951520057e-05, 'samples': 21997568, 'steps': 42963, 'batch_loss/train': 0.6811142894439399} 12/28/2021 14:55:24 - INFO - codeparrot_training - Step 42964: {'lr': 2.4200934527097873e-05, 'samples': 21998080, 'steps': 42964, 'batch_loss/train': 0.7454754309728742} 12/28/2021 14:55:36 - INFO - codeparrot_training - Step 42965: {'lr': 2.4194169000367444e-05, 'samples': 21998592, 'steps': 42965, 'batch_loss/train': 0.6850551227107644} 12/28/2021 14:55:47 - INFO - codeparrot_training - Step 42966: {'lr': 2.418740437135583e-05, 'samples': 21999104, 'steps': 42966, 'batch_loss/train': 0.760031028650701} 12/28/2021 14:55:58 - 
INFO - codeparrot_training - Step 42967: {'lr': 2.4180640640089764e-05, 'samples': 21999616, 'steps': 42967, 'batch_loss/train': 0.7684129485860467} 12/28/2021 14:56:10 - INFO - codeparrot_training - Step 42968: {'lr': 2.4173877806596244e-05, 'samples': 22000128, 'steps': 42968, 'batch_loss/train': 0.7243165830150247} 12/28/2021 14:56:20 - INFO - codeparrot_training - Step 42969: {'lr': 2.416711587090209e-05, 'samples': 22000640, 'steps': 42969, 'batch_loss/train': 0.7321455222554505} 12/28/2021 14:56:31 - INFO - codeparrot_training - Step 42970: {'lr': 2.416035483303422e-05, 'samples': 22001152, 'steps': 42970, 'batch_loss/train': 0.6379069746471941} 12/28/2021 14:56:45 - INFO - codeparrot_training - Step 42971: {'lr': 2.4153594693019503e-05, 'samples': 22001664, 'steps': 42971, 'batch_loss/train': 0.7666100949281827} 12/28/2021 14:56:56 - INFO - codeparrot_training - Step 42972: {'lr': 2.4146835450884863e-05, 'samples': 22002176, 'steps': 42972, 'batch_loss/train': 0.7153888386674225} 12/28/2021 14:57:06 - INFO - codeparrot_training - Step 42973: {'lr': 2.4140077106657027e-05, 'samples': 22002688, 'steps': 42973, 'batch_loss/train': 0.7216653837822378} 12/28/2021 14:57:17 - INFO - codeparrot_training - Step 42974: {'lr': 2.4133319660363e-05, 'samples': 22003200, 'steps': 42974, 'batch_loss/train': 0.6298899282701313} 12/28/2021 14:57:29 - INFO - codeparrot_training - Step 42975: {'lr': 2.412656311202968e-05, 'samples': 22003712, 'steps': 42975, 'batch_loss/train': 0.7708557429723442} 12/28/2021 14:57:40 - INFO - codeparrot_training - Step 42976: {'lr': 2.411980746168377e-05, 'samples': 22004224, 'steps': 42976, 'batch_loss/train': 0.6659022428211756} 12/28/2021 14:57:51 - INFO - codeparrot_training - Step 42977: {'lr': 2.411305270935224e-05, 'samples': 22004736, 'steps': 42977, 'batch_loss/train': 0.6861565592698753} 12/28/2021 14:58:04 - INFO - codeparrot_training - Step 42978: {'lr': 2.410629885506199e-05, 'samples': 22005248, 'steps': 42978, 
'batch_loss/train': 0.7543231868185103} 12/28/2021 14:58:15 - INFO - codeparrot_training - Step 42979: {'lr': 2.4099545898839754e-05, 'samples': 22005760, 'steps': 42979, 'batch_loss/train': 0.66950568347238} 12/28/2021 14:58:26 - INFO - codeparrot_training - Step 42980: {'lr': 2.4092793840712362e-05, 'samples': 22006272, 'steps': 42980, 'batch_loss/train': 0.6458720522059593} 12/28/2021 14:58:38 - INFO - codeparrot_training - Step 42981: {'lr': 2.4086042680706828e-05, 'samples': 22006784, 'steps': 42981, 'batch_loss/train': 0.7241249540820718} 12/28/2021 14:58:48 - INFO - codeparrot_training - Step 42982: {'lr': 2.407929241884982e-05, 'samples': 22007296, 'steps': 42982, 'batch_loss/train': 0.6549115055240691} 12/28/2021 14:58:59 - INFO - codeparrot_training - Step 42983: {'lr': 2.407254305516826e-05, 'samples': 22007808, 'steps': 42983, 'batch_loss/train': 0.4795214480836876} 12/28/2021 14:59:10 - INFO - codeparrot_training - Step 42984: {'lr': 2.4065794589688965e-05, 'samples': 22008320, 'steps': 42984, 'batch_loss/train': 0.7902352577075362} 12/28/2021 14:59:22 - INFO - codeparrot_training - Step 42985: {'lr': 2.4059047022438746e-05, 'samples': 22008832, 'steps': 42985, 'batch_loss/train': 0.6688861957518384} 12/28/2021 14:59:33 - INFO - codeparrot_training - Step 42986: {'lr': 2.4052300353444467e-05, 'samples': 22009344, 'steps': 42986, 'batch_loss/train': 0.6711880983784795} 12/28/2021 14:59:43 - INFO - codeparrot_training - Step 42987: {'lr': 2.404555458273294e-05, 'samples': 22009856, 'steps': 42987, 'batch_loss/train': 0.677357692271471} 12/28/2021 14:59:58 - INFO - codeparrot_training - Step 42988: {'lr': 2.40388097103309e-05, 'samples': 22010368, 'steps': 42988, 'batch_loss/train': 0.6918208310380578} 12/28/2021 15:00:08 - INFO - codeparrot_training - Step 42989: {'lr': 2.403206573626529e-05, 'samples': 22010880, 'steps': 42989, 'batch_loss/train': 0.6074634953401983} 12/28/2021 15:00:19 - INFO - codeparrot_training - Step 42990: {'lr': 
2.4025322660562898e-05, 'samples': 22011392, 'steps': 42990, 'batch_loss/train': 0.6663519633002579} 12/28/2021 15:00:31 - INFO - codeparrot_training - Step 42991: {'lr': 2.4018580483250395e-05, 'samples': 22011904, 'steps': 42991, 'batch_loss/train': 0.5479464341478888} 12/28/2021 15:00:42 - INFO - codeparrot_training - Step 42992: {'lr': 2.4011839204354792e-05, 'samples': 22012416, 'steps': 42992, 'batch_loss/train': 0.7663297429680824} 12/28/2021 15:00:52 - INFO - codeparrot_training - Step 42993: {'lr': 2.4005098823902754e-05, 'samples': 22012928, 'steps': 42993, 'batch_loss/train': 0.6546267946250737} 12/28/2021 15:01:03 - INFO - codeparrot_training - Step 42994: {'lr': 2.39983593419211e-05, 'samples': 22013440, 'steps': 42994, 'batch_loss/train': 0.7508767800172791} 12/28/2021 15:01:15 - INFO - codeparrot_training - Step 42995: {'lr': 2.3991620758436612e-05, 'samples': 22013952, 'steps': 42995, 'batch_loss/train': 0.817843590863049} 12/28/2021 15:01:26 - INFO - codeparrot_training - Step 42996: {'lr': 2.3984883073476125e-05, 'samples': 22014464, 'steps': 42996, 'batch_loss/train': 0.7042088070884347} 12/28/2021 15:01:36 - INFO - codeparrot_training - Step 42997: {'lr': 2.3978146287066404e-05, 'samples': 22014976, 'steps': 42997, 'batch_loss/train': 0.7129263540264219} 12/28/2021 15:01:48 - INFO - codeparrot_training - Step 42998: {'lr': 2.3971410399234227e-05, 'samples': 22015488, 'steps': 42998, 'batch_loss/train': 0.6563752321526408} 12/28/2021 15:01:59 - INFO - codeparrot_training - Step 42999: {'lr': 2.3964675410006382e-05, 'samples': 22016000, 'steps': 42999, 'batch_loss/train': 0.894891794770956} 12/28/2021 15:02:10 - INFO - codeparrot_training - Step 43000: {'lr': 2.395794131940962e-05, 'samples': 22016512, 'steps': 43000, 'batch_loss/train': 0.6422113627195358} 12/28/2021 15:02:24 - INFO - codeparrot_training - Step 43001: {'lr': 2.3951208127470785e-05, 'samples': 22017024, 'steps': 43001, 'batch_loss/train': 0.6196160560066346} 12/28/2021 15:02:34 - 
INFO - codeparrot_training - Step 43002: {'lr': 2.394447583421652e-05, 'samples': 22017536, 'steps': 43002, 'batch_loss/train': 0.599912095349282} 12/28/2021 15:02:45 - INFO - codeparrot_training - Step 43003: {'lr': 2.3937744439673698e-05, 'samples': 22018048, 'steps': 43003, 'batch_loss/train': 0.6794987358152866} 12/28/2021 15:02:56 - INFO - codeparrot_training - Step 43004: {'lr': 2.3931013943869095e-05, 'samples': 22018560, 'steps': 43004, 'batch_loss/train': 0.6495391696225852} 12/28/2021 15:03:08 - INFO - codeparrot_training - Step 43005: {'lr': 2.392428434682939e-05, 'samples': 22019072, 'steps': 43005, 'batch_loss/train': 0.6771103050559759} 12/28/2021 15:03:19 - INFO - codeparrot_training - Step 43006: {'lr': 2.391755564858128e-05, 'samples': 22019584, 'steps': 43006, 'batch_loss/train': 0.6347163263708353} 12/28/2021 15:03:29 - INFO - codeparrot_training - Step 43007: {'lr': 2.3910827849151745e-05, 'samples': 22020096, 'steps': 43007, 'batch_loss/train': 0.6433123294264078} 12/28/2021 15:03:43 - INFO - codeparrot_training - Step 43008: {'lr': 2.390410094856732e-05, 'samples': 22020608, 'steps': 43008, 'batch_loss/train': 0.6594062935328111} 12/28/2021 15:03:54 - INFO - codeparrot_training - Step 43009: {'lr': 2.3897374946854816e-05, 'samples': 22021120, 'steps': 43009, 'batch_loss/train': 0.6990340799093246} 12/28/2021 15:04:04 - INFO - codeparrot_training - Step 43010: {'lr': 2.389064984404096e-05, 'samples': 22021632, 'steps': 43010, 'batch_loss/train': 0.789112652419135} 12/28/2021 15:04:16 - INFO - codeparrot_training - Step 43011: {'lr': 2.388392564015254e-05, 'samples': 22022144, 'steps': 43011, 'batch_loss/train': 0.6803093729540706} 12/28/2021 15:04:27 - INFO - codeparrot_training - Step 43012: {'lr': 2.387720233521623e-05, 'samples': 22022656, 'steps': 43012, 'batch_loss/train': 0.696424062596634} 12/28/2021 15:04:38 - INFO - codeparrot_training - Step 43013: {'lr': 2.3870479929258783e-05, 'samples': 22023168, 'steps': 43013, 'batch_loss/train': 
0.7283290214836597} 12/28/2021 15:04:48 - INFO - codeparrot_training - Step 43014: {'lr': 2.3863758422306953e-05, 'samples': 22023680, 'steps': 43014, 'batch_loss/train': 0.7005381910130382} 12/28/2021 15:05:01 - INFO - codeparrot_training - Step 43015: {'lr': 2.385703781438739e-05, 'samples': 22024192, 'steps': 43015, 'batch_loss/train': 0.7882659332826734} 12/28/2021 15:05:11 - INFO - codeparrot_training - Step 43016: {'lr': 2.385031810552693e-05, 'samples': 22024704, 'steps': 43016, 'batch_loss/train': 0.7187886275351048} 12/28/2021 15:05:22 - INFO - codeparrot_training - Step 43017: {'lr': 2.384359929575211e-05, 'samples': 22025216, 'steps': 43017, 'batch_loss/train': 0.74976238142699} 12/28/2021 15:05:36 - INFO - codeparrot_training - Step 43018: {'lr': 2.3836881385089855e-05, 'samples': 22025728, 'steps': 43018, 'batch_loss/train': 0.7666850257664919} 12/28/2021 15:05:46 - INFO - codeparrot_training - Step 43019: {'lr': 2.383016437356672e-05, 'samples': 22026240, 'steps': 43019, 'batch_loss/train': 0.6646334289689548} 12/28/2021 15:05:57 - INFO - codeparrot_training - Step 43020: {'lr': 2.382344826120944e-05, 'samples': 22026752, 'steps': 43020, 'batch_loss/train': 0.6901934309862554} 12/28/2021 15:06:09 - INFO - codeparrot_training - Step 43021: {'lr': 2.3816733048044714e-05, 'samples': 22027264, 'steps': 43021, 'batch_loss/train': 0.7548415772616863} 12/28/2021 15:06:20 - INFO - codeparrot_training - Step 43022: {'lr': 2.381001873409927e-05, 'samples': 22027776, 'steps': 43022, 'batch_loss/train': 0.657152867410332} 12/28/2021 15:06:30 - INFO - codeparrot_training - Step 43023: {'lr': 2.3803305319399783e-05, 'samples': 22028288, 'steps': 43023, 'batch_loss/train': 0.7246419964358211} 12/28/2021 15:06:44 - INFO - codeparrot_training - Step 43024: {'lr': 2.379659280397295e-05, 'samples': 22028800, 'steps': 43024, 'batch_loss/train': 0.6783698829822242} 12/28/2021 15:06:55 - INFO - codeparrot_training - Step 43025: {'lr': 2.378988118784545e-05, 'samples': 
22029312, 'steps': 43025, 'batch_loss/train': 0.652981020975858} 12/28/2021 15:07:06 - INFO - codeparrot_training - Step 43026: {'lr': 2.3783170471043974e-05, 'samples': 22029824, 'steps': 43026, 'batch_loss/train': 0.7327755531296134} 12/28/2021 15:07:16 - INFO - codeparrot_training - Step 43027: {'lr': 2.3776460653595233e-05, 'samples': 22030336, 'steps': 43027, 'batch_loss/train': 0.6881623519584537} 12/28/2021 15:07:28 - INFO - codeparrot_training - Step 43028: {'lr': 2.3769751735525786e-05, 'samples': 22030848, 'steps': 43028, 'batch_loss/train': 0.6840994087979198} 12/28/2021 15:07:39 - INFO - codeparrot_training - Step 43029: {'lr': 2.3763043716862415e-05, 'samples': 22031360, 'steps': 43029, 'batch_loss/train': 0.6407831767573953} 12/28/2021 15:07:49 - INFO - codeparrot_training - Step 43030: {'lr': 2.375633659763182e-05, 'samples': 22031872, 'steps': 43030, 'batch_loss/train': 0.6730134662939236} 12/28/2021 15:08:02 - INFO - codeparrot_training - Step 43031: {'lr': 2.3749630377860543e-05, 'samples': 22032384, 'steps': 43031, 'batch_loss/train': 0.7972085243090987} 12/28/2021 15:08:12 - INFO - codeparrot_training - Step 43032: {'lr': 2.3742925057575277e-05, 'samples': 22032896, 'steps': 43032, 'batch_loss/train': 0.7583199202781543} 12/28/2021 15:08:23 - INFO - codeparrot_training - Step 43033: {'lr': 2.3736220636802808e-05, 'samples': 22033408, 'steps': 43033, 'batch_loss/train': 0.732898055575788} 12/28/2021 15:08:37 - INFO - codeparrot_training - Step 43034: {'lr': 2.3729517115569616e-05, 'samples': 22033920, 'steps': 43034, 'batch_loss/train': 0.682277322455775} 12/28/2021 15:08:47 - INFO - codeparrot_training - Step 43035: {'lr': 2.3722814493902456e-05, 'samples': 22034432, 'steps': 43035, 'batch_loss/train': 0.7773096463643014} 12/28/2021 15:08:58 - INFO - codeparrot_training - Step 43036: {'lr': 2.371611277182792e-05, 'samples': 22034944, 'steps': 43036, 'batch_loss/train': 0.678251585457474} 12/28/2021 15:09:10 - INFO - codeparrot_training - Step 
43037: {'lr': 2.3709411949372705e-05, 'samples': 22035456, 'steps': 43037, 'batch_loss/train': 0.7052710661664605} 12/28/2021 15:09:21 - INFO - codeparrot_training - Step 43038: {'lr': 2.3702712026563406e-05, 'samples': 22035968, 'steps': 43038, 'batch_loss/train': 0.6119980430230498} 12/28/2021 15:09:31 - INFO - codeparrot_training - Step 43039: {'lr': 2.3696013003426665e-05, 'samples': 22036480, 'steps': 43039, 'batch_loss/train': 0.9815329303964972} 12/28/2021 15:09:42 - INFO - codeparrot_training - Step 43040: {'lr': 2.3689314879989155e-05, 'samples': 22036992, 'steps': 43040, 'batch_loss/train': 0.6948422640562057} 12/28/2021 15:09:54 - INFO - codeparrot_training - Step 43041: {'lr': 2.3682617656277467e-05, 'samples': 22037504, 'steps': 43041, 'batch_loss/train': 0.7802142389118671} 12/28/2021 15:10:05 - INFO - codeparrot_training - Step 43042: {'lr': 2.3675921332318275e-05, 'samples': 22038016, 'steps': 43042, 'batch_loss/train': 0.6042674724012613} 12/28/2021 15:10:15 - INFO - codeparrot_training - Step 43043: {'lr': 2.3669225908138082e-05, 'samples': 22038528, 'steps': 43043, 'batch_loss/train': 0.7024561929283664} 12/28/2021 15:10:28 - INFO - codeparrot_training - Step 43044: {'lr': 2.3662531383763646e-05, 'samples': 22039040, 'steps': 43044, 'batch_loss/train': 0.7754815146327019} 12/28/2021 15:10:38 - INFO - codeparrot_training - Step 43045: {'lr': 2.3655837759221532e-05, 'samples': 22039552, 'steps': 43045, 'batch_loss/train': 0.6948979897424579} 12/28/2021 15:10:49 - INFO - codeparrot_training - Step 43046: {'lr': 2.3649145034538328e-05, 'samples': 22040064, 'steps': 43046, 'batch_loss/train': 0.6709362021065317} 12/28/2021 15:11:03 - INFO - codeparrot_training - Step 43047: {'lr': 2.364245320974065e-05, 'samples': 22040576, 'steps': 43047, 'batch_loss/train': 0.6963208923116326} 12/28/2021 15:11:14 - INFO - codeparrot_training - Step 43048: {'lr': 2.3635762284855118e-05, 'samples': 22041088, 'steps': 43048, 'batch_loss/train': 0.6978125190362334} 
12/28/2021 15:11:25 - INFO - codeparrot_training - Step 43049: {'lr': 2.362907225990829e-05, 'samples': 22041600, 'steps': 43049, 'batch_loss/train': 0.697171373758465} 12/28/2021 15:11:35 - INFO - codeparrot_training - Step 43050: {'lr': 2.362238313492682e-05, 'samples': 22042112, 'steps': 43050, 'batch_loss/train': 0.5926320839789696} 12/28/2021 15:11:47 - INFO - codeparrot_training - Step 43051: {'lr': 2.3615694909937285e-05, 'samples': 22042624, 'steps': 43051, 'batch_loss/train': 0.6423852929729037} 12/28/2021 15:11:58 - INFO - codeparrot_training - Step 43052: {'lr': 2.360900758496626e-05, 'samples': 22043136, 'steps': 43052, 'batch_loss/train': 0.6518335163127631} 12/28/2021 15:12:09 - INFO - codeparrot_training - Step 43053: {'lr': 2.3602321160040323e-05, 'samples': 22043648, 'steps': 43053, 'batch_loss/train': 0.8107540337368846} 12/28/2021 15:12:23 - INFO - codeparrot_training - Step 43054: {'lr': 2.35956356351861e-05, 'samples': 22044160, 'steps': 43054, 'batch_loss/train': 0.6704482232162263} 12/28/2021 15:12:34 - INFO - codeparrot_training - Step 43055: {'lr': 2.3588951010430126e-05, 'samples': 22044672, 'steps': 43055, 'batch_loss/train': 0.6249303827062249} 12/28/2021 15:12:45 - INFO - codeparrot_training - Step 43056: {'lr': 2.3582267285799043e-05, 'samples': 22045184, 'steps': 43056, 'batch_loss/train': 0.728389760479331} 12/28/2021 15:12:57 - INFO - codeparrot_training - Step 43057: {'lr': 2.357558446131933e-05, 'samples': 22045696, 'steps': 43057, 'batch_loss/train': 0.7365194436279126} 12/28/2021 15:13:07 - INFO - codeparrot_training - Step 43058: {'lr': 2.3568902537017546e-05, 'samples': 22046208, 'steps': 43058, 'batch_loss/train': 0.6967570437118411} 12/28/2021 15:13:18 - INFO - codeparrot_training - Step 43059: {'lr': 2.3562221512920395e-05, 'samples': 22046720, 'steps': 43059, 'batch_loss/train': 0.8126242216676474} 12/28/2021 15:13:30 - INFO - codeparrot_training - Step 43060: {'lr': 2.355554138905433e-05, 'samples': 22047232, 'steps': 
43060, 'batch_loss/train': 0.7607578607276082} 12/28/2021 15:13:41 - INFO - codeparrot_training - Step 43061: {'lr': 2.354886216544591e-05, 'samples': 22047744, 'steps': 43061, 'batch_loss/train': 0.6311725690029562} 12/28/2021 15:13:52 - INFO - codeparrot_training - Step 43062: {'lr': 2.3542183842121728e-05, 'samples': 22048256, 'steps': 43062, 'batch_loss/train': 0.6791292398702353} 12/28/2021 15:14:02 - INFO - codeparrot_training - Step 43063: {'lr': 2.353550641910829e-05, 'samples': 22048768, 'steps': 43063, 'batch_loss/train': 0.5972941300133243} 12/28/2021 15:14:16 - INFO - codeparrot_training - Step 43064: {'lr': 2.352882989643218e-05, 'samples': 22049280, 'steps': 43064, 'batch_loss/train': 0.5573675425257534} 12/28/2021 15:14:27 - INFO - codeparrot_training - Step 43065: {'lr': 2.352215427411994e-05, 'samples': 22049792, 'steps': 43065, 'batch_loss/train': 0.6764320685470011} 12/28/2021 15:14:37 - INFO - codeparrot_training - Step 43066: {'lr': 2.35154795521981e-05, 'samples': 22050304, 'steps': 43066, 'batch_loss/train': 0.7294329938013107} 12/28/2021 15:14:49 - INFO - codeparrot_training - Step 43067: {'lr': 2.3508805730693166e-05, 'samples': 22050816, 'steps': 43067, 'batch_loss/train': 0.6956647569313645} 12/28/2021 15:15:00 - INFO - codeparrot_training - Step 43068: {'lr': 2.3502132809631788e-05, 'samples': 22051328, 'steps': 43068, 'batch_loss/train': 0.6914100074209273} 12/28/2021 15:15:11 - INFO - codeparrot_training - Step 43069: {'lr': 2.3495460789040274e-05, 'samples': 22051840, 'steps': 43069, 'batch_loss/train': 0.7424819958396256} 12/28/2021 15:15:23 - INFO - codeparrot_training - Step 43070: {'lr': 2.3488789668945355e-05, 'samples': 22052352, 'steps': 43070, 'batch_loss/train': 0.6771417623385787} 12/28/2021 15:15:34 - INFO - codeparrot_training - Step 43071: {'lr': 2.3482119449373535e-05, 'samples': 22052864, 'steps': 43071, 'batch_loss/train': 0.5859214519150555} 12/28/2021 15:15:44 - INFO - codeparrot_training - Step 43072: {'lr': 
2.347545013035121e-05, 'samples': 22053376, 'steps': 43072, 'batch_loss/train': 0.6370870299870148}
12/28/2021 15:15:55 - INFO - codeparrot_training - Step 43073: {'lr': 2.3468781711904974e-05, 'samples': 22053888, 'steps': 43073, 'batch_loss/train': 0.7574109099805355}
12/28/2021 15:16:09 - INFO - codeparrot_training - Step 43074: {'lr': 2.346211419406133e-05, 'samples': 22054400, 'steps': 43074, 'batch_loss/train': 0.7108112481655553}
12/28/2021 15:16:19 - INFO - codeparrot_training - Step 43075: {'lr': 2.3455447576846782e-05, 'samples': 22054912, 'steps': 43075, 'batch_loss/train': 0.6863126400858164}
12/28/2021 15:16:30 - INFO - codeparrot_training - Step 43076: {'lr': 2.3448781860287816e-05, 'samples': 22055424, 'steps': 43076, 'batch_loss/train': 0.7908679554238915}
12/28/2021 15:16:42 - INFO - codeparrot_training - Step 43077: {'lr': 2.3442117044410988e-05, 'samples': 22055936, 'steps': 43077, 'batch_loss/train': 0.8164316127076745}
12/28/2021 15:16:53 - INFO - codeparrot_training - Step 43078: {'lr': 2.3435453129242724e-05, 'samples': 22056448, 'steps': 43078, 'batch_loss/train': 0.6740169818513095}
12/28/2021 15:17:03 - INFO - codeparrot_training - Step 43079: {'lr': 2.3428790114809557e-05, 'samples': 22056960, 'steps': 43079, 'batch_loss/train': 0.7383521939627826}
12/28/2021 15:17:16 - INFO - codeparrot_training - Step 43080: {'lr': 2.3422128001137993e-05, 'samples': 22057472, 'steps': 43080, 'batch_loss/train': 0.6943894512951374}
12/28/2021 15:17:27 - INFO - codeparrot_training - Step 43081: {'lr': 2.3415466788254458e-05, 'samples': 22057984, 'steps': 43081, 'batch_loss/train': 0.6195501768961549}
12/28/2021 15:17:38 - INFO - codeparrot_training - Step 43082: {'lr': 2.340880647618554e-05, 'samples': 22058496, 'steps': 43082, 'batch_loss/train': 0.7381236683577299}
12/28/2021 15:17:50 - INFO - codeparrot_training - Step 43083: {'lr': 2.3402147064957608e-05, 'samples': 22059008, 'steps': 43083, 'batch_loss/train': 0.7234829762019217}
12/28/2021 15:18:01 - INFO - codeparrot_training - Step 43084: {'lr': 2.339548855459714e-05, 'samples': 22059520, 'steps': 43084, 'batch_loss/train': 0.8365113250911236}
12/28/2021 15:18:11 - INFO - codeparrot_training - Step 43085: {'lr': 2.3388830945130724e-05, 'samples': 22060032, 'steps': 43085, 'batch_loss/train': 0.8004371467977762}
12/28/2021 15:18:22 - INFO - codeparrot_training - Step 43086: {'lr': 2.33821742365847e-05, 'samples': 22060544, 'steps': 43086, 'batch_loss/train': 0.7684024563059211}
12/28/2021 15:18:34 - INFO - codeparrot_training - Step 43087: {'lr': 2.3375518428985555e-05, 'samples': 22061056, 'steps': 43087, 'batch_loss/train': 0.7835501129738986}
12/28/2021 15:18:45 - INFO - codeparrot_training - Step 43088: {'lr': 2.336886352235984e-05, 'samples': 22061568, 'steps': 43088, 'batch_loss/train': 0.9234914910048246}
12/28/2021 15:18:55 - INFO - codeparrot_training - Step 43089: {'lr': 2.336220951673393e-05, 'samples': 22062080, 'steps': 43089, 'batch_loss/train': 0.7747994055971503}
12/28/2021 15:19:08 - INFO - codeparrot_training - Step 43090: {'lr': 2.33555564121343e-05, 'samples': 22062592, 'steps': 43090, 'batch_loss/train': 0.6916887951083481}
12/28/2021 15:19:18 - INFO - codeparrot_training - Step 43091: {'lr': 2.3348904208587403e-05, 'samples': 22063104, 'steps': 43091, 'batch_loss/train': 0.6475855011958629}
12/28/2021 15:19:29 - INFO - codeparrot_training - Step 43092: {'lr': 2.3342252906119686e-05, 'samples': 22063616, 'steps': 43092, 'batch_loss/train': 0.6655355719849467}
12/28/2021 15:19:43 - INFO - codeparrot_training - Step 43093: {'lr': 2.333560250475758e-05, 'samples': 22064128, 'steps': 43093, 'batch_loss/train': 0.7106264936737716}
12/28/2021 15:19:54 - INFO - codeparrot_training - Step 43094: {'lr': 2.332895300452756e-05, 'samples': 22064640, 'steps': 43094, 'batch_loss/train': 0.647575793787837}
12/28/2021 15:20:05 - INFO - codeparrot_training - Step 43095: {'lr': 2.3322304405455962e-05, 'samples': 22065152, 'steps': 43095, 'batch_loss/train': 0.6521237911656499}
12/28/2021 15:20:15 - INFO - codeparrot_training - Step 43096: {'lr': 2.3315656707569356e-05, 'samples': 22065664, 'steps': 43096, 'batch_loss/train': 0.6754862212110311}
12/28/2021 15:20:27 - INFO - codeparrot_training - Step 43097: {'lr': 2.330900991089413e-05, 'samples': 22066176, 'steps': 43097, 'batch_loss/train': 0.6893166624940932}
12/28/2021 15:20:38 - INFO - codeparrot_training - Step 43098: {'lr': 2.3302364015456627e-05, 'samples': 22066688, 'steps': 43098, 'batch_loss/train': 0.7079307828098536}
12/28/2021 15:20:49 - INFO - codeparrot_training - Step 43099: {'lr': 2.3295719021283302e-05, 'samples': 22067200, 'steps': 43099, 'batch_loss/train': 0.6915857857093215}
12/28/2021 15:21:01 - INFO - codeparrot_training - Step 43100: {'lr': 2.3289074928400683e-05, 'samples': 22067712, 'steps': 43100, 'batch_loss/train': 0.6359905041754246}
12/28/2021 15:21:11 - INFO - codeparrot_training - Step 43101: {'lr': 2.328243173683506e-05, 'samples': 22068224, 'steps': 43101, 'batch_loss/train': 0.7423398615792394}
12/28/2021 15:21:22 - INFO - codeparrot_training - Step 43102: {'lr': 2.327578944661288e-05, 'samples': 22068736, 'steps': 43102, 'batch_loss/train': 0.7067517009563744}
12/28/2021 15:21:36 - INFO - codeparrot_training - Step 43103: {'lr': 2.3269148057760537e-05, 'samples': 22069248, 'steps': 43103, 'batch_loss/train': 0.8157958053052425}
12/28/2021 15:21:47 - INFO - codeparrot_training - Step 43104: {'lr': 2.3262507570304487e-05, 'samples': 22069760, 'steps': 43104, 'batch_loss/train': 0.7248985578771681}
12/28/2021 15:21:57 - INFO - codeparrot_training - Step 43105: {'lr': 2.3255867984271067e-05, 'samples': 22070272, 'steps': 43105, 'batch_loss/train': 0.7054540645913221}
12/28/2021 15:22:09 - INFO - codeparrot_training - Step 43106: {'lr': 2.32492292996867e-05, 'samples': 22070784, 'steps': 43106, 'batch_loss/train': 0.7282456024549901}
12/28/2021 15:22:20 - INFO - codeparrot_training - Step 43107: {'lr': 2.3242591516577787e-05, 'samples': 22071296, 'steps': 43107, 'batch_loss/train': 0.7359017436392605}
12/28/2021 15:22:31 - INFO - codeparrot_training - Step 43108: {'lr': 2.3235954634970748e-05, 'samples': 22071808, 'steps': 43108, 'batch_loss/train': 0.8080026591196656}
12/28/2021 15:22:41 - INFO - codeparrot_training - Step 43109: {'lr': 2.322931865489189e-05, 'samples': 22072320, 'steps': 43109, 'batch_loss/train': 0.7214582883752882}
12/28/2021 15:22:55 - INFO - codeparrot_training - Step 43110: {'lr': 2.3222683576367588e-05, 'samples': 22072832, 'steps': 43110, 'batch_loss/train': 0.6372262947261333}
12/28/2021 15:23:06 - INFO - codeparrot_training - Step 43111: {'lr': 2.3216049399424373e-05, 'samples': 22073344, 'steps': 43111, 'batch_loss/train': 0.6157099662814289}
12/28/2021 15:23:16 - INFO - codeparrot_training - Step 43112: {'lr': 2.3209416124088472e-05, 'samples': 22073856, 'steps': 43112, 'batch_loss/train': 0.6735547501593828}
12/28/2021 15:23:29 - INFO - codeparrot_training - Step 43113: {'lr': 2.320278375038623e-05, 'samples': 22074368, 'steps': 43113, 'batch_loss/train': 0.6376667278818786}
12/28/2021 15:23:39 - INFO - codeparrot_training - Step 43114: {'lr': 2.319615227834418e-05, 'samples': 22074880, 'steps': 43114, 'batch_loss/train': 0.6451920890249312}
12/28/2021 15:23:50 - INFO - codeparrot_training - Step 43115: {'lr': 2.3189521707988548e-05, 'samples': 22075392, 'steps': 43115, 'batch_loss/train': 0.7125572999939322}
12/28/2021 15:24:02 - INFO - codeparrot_training - Step 43116: {'lr': 2.318289203934576e-05, 'samples': 22075904, 'steps': 43116, 'batch_loss/train': 0.7878445014357567}
12/28/2021 15:24:13 - INFO - codeparrot_training - Step 43117: {'lr': 2.3176263272442126e-05, 'samples': 22076416, 'steps': 43117, 'batch_loss/train': 0.7296596057713032}
12/28/2021 15:24:23 - INFO - codeparrot_training - Step 43118: {'lr': 2.3169635407304017e-05, 'samples': 22076928, 'steps': 43118, 'batch_loss/train': 0.6704115383327007}
12/28/2021 15:24:37 - INFO - codeparrot_training - Step 43119: {'lr': 2.316300844395783e-05, 'samples': 22077440, 'steps': 43119, 'batch_loss/train': 0.7471813107840717}
12/28/2021 15:24:48 - INFO - codeparrot_training - Step 43120: {'lr': 2.3156382382429868e-05, 'samples': 22077952, 'steps': 43120, 'batch_loss/train': 0.6262448341585696}
12/28/2021 15:24:58 - INFO - codeparrot_training - Step 43121: {'lr': 2.3149757222746425e-05, 'samples': 22078464, 'steps': 43121, 'batch_loss/train': 0.7701826076954603}
12/28/2021 15:25:09 - INFO - codeparrot_training - Step 43122: {'lr': 2.3143132964933923e-05, 'samples': 22078976, 'steps': 43122, 'batch_loss/train': 0.6645978097803891}
12/28/2021 15:25:21 - INFO - codeparrot_training - Step 43123: {'lr': 2.3136509609018697e-05, 'samples': 22079488, 'steps': 43123, 'batch_loss/train': 0.7395355748012662}
12/28/2021 15:25:32 - INFO - codeparrot_training - Step 43124: {'lr': 2.312988715502698e-05, 'samples': 22080000, 'steps': 43124, 'batch_loss/train': 0.6840144917368889}
12/28/2021 15:25:42 - INFO - codeparrot_training - Step 43125: {'lr': 2.3123265602985195e-05, 'samples': 22080512, 'steps': 43125, 'batch_loss/train': 0.755999656394124}
12/28/2021 15:25:55 - INFO - codeparrot_training - Step 43126: {'lr': 2.311664495291968e-05, 'samples': 22081024, 'steps': 43126, 'batch_loss/train': 0.7181067764759064}
12/28/2021 15:26:06 - INFO - codeparrot_training - Step 43127: {'lr': 2.3110025204856693e-05, 'samples': 22081536, 'steps': 43127, 'batch_loss/train': 0.7220814863685519}
12/28/2021 15:26:16 - INFO - codeparrot_training - Step 43128: {'lr': 2.3103406358822577e-05, 'samples': 22082048, 'steps': 43128, 'batch_loss/train': 0.40659166482510045}
12/28/2021 15:26:28 - INFO - codeparrot_training - Step 43129: {'lr': 2.3096788414843638e-05, 'samples': 22082560, 'steps': 43129, 'batch_loss/train': 0.6750170066952705}
12/28/2021 15:26:39 - INFO - codeparrot_training - Step 43130: {'lr': 2.3090171372946196e-05, 'samples': 22083072, 'steps': 43130, 'batch_loss/train': 0.7284466624259949}
12/28/2021 15:26:50 - INFO - codeparrot_training - Step 43131: {'lr': 2.308355523315653e-05, 'samples': 22083584, 'steps': 43131, 'batch_loss/train': 0.7102112622815184}
12/28/2021 15:27:00 - INFO - codeparrot_training - Step 43132: {'lr': 2.3076939995501007e-05, 'samples': 22084096, 'steps': 43132, 'batch_loss/train': 0.7094631213694811}
12/28/2021 15:27:14 - INFO - codeparrot_training - Step 43133: {'lr': 2.307032566000586e-05, 'samples': 22084608, 'steps': 43133, 'batch_loss/train': 0.7027429363224655}
12/28/2021 15:27:25 - INFO - codeparrot_training - Step 43134: {'lr': 2.3063712226697425e-05, 'samples': 22085120, 'steps': 43134, 'batch_loss/train': 0.807369664311409}
12/28/2021 15:27:35 - INFO - codeparrot_training - Step 43135: {'lr': 2.3057099695602017e-05, 'samples': 22085632, 'steps': 43135, 'batch_loss/train': 0.7572835590690374}
12/28/2021 15:27:48 - INFO - codeparrot_training - Step 43136: {'lr': 2.3050488066745813e-05, 'samples': 22086144, 'steps': 43136, 'batch_loss/train': 0.662425896152854}
12/28/2021 15:27:58 - INFO - codeparrot_training - Step 43137: {'lr': 2.3043877340155257e-05, 'samples': 22086656, 'steps': 43137, 'batch_loss/train': 0.5419895498198457}
12/28/2021 15:28:09 - INFO - codeparrot_training - Step 43138: {'lr': 2.3037267515856496e-05, 'samples': 22087168, 'steps': 43138, 'batch_loss/train': 0.7064165755291469}
12/28/2021 15:28:23 - INFO - codeparrot_training - Step 43139: {'lr': 2.3030658593875848e-05, 'samples': 22087680, 'steps': 43139, 'batch_loss/train': 0.7772517679259181}
12/28/2021 15:28:33 - INFO - codeparrot_training - Step 43140: {'lr': 2.3024050574239673e-05, 'samples': 22088192, 'steps': 43140, 'batch_loss/train': 0.6794301588088274}
12/28/2021 15:28:44 - INFO - codeparrot_training - Step 43141: {'lr': 2.3017443456974124e-05, 'samples': 22088704, 'steps': 43141, 'batch_loss/train': 0.6663486361503601}
12/28/2021 15:28:56 - INFO - codeparrot_training - Step 43142: {'lr': 2.3010837242105507e-05, 'samples': 22089216, 'steps': 43142, 'batch_loss/train': 0.6819377385545522}
12/28/2021 15:29:07 - INFO - codeparrot_training - Step 43143: {'lr': 2.3004231929660087e-05, 'samples': 22089728, 'steps': 43143, 'batch_loss/train': 0.7176671382039785}
12/28/2021 15:29:18 - INFO - codeparrot_training - Step 43144: {'lr': 2.2997627519664165e-05, 'samples': 22090240, 'steps': 43144, 'batch_loss/train': 0.7028111983090639}
12/28/2021 15:29:28 - INFO - codeparrot_training - Step 43145: {'lr': 2.2991024012143924e-05, 'samples': 22090752, 'steps': 43145, 'batch_loss/train': 0.7319592470303178}
12/28/2021 15:29:40 - INFO - codeparrot_training - Step 43146: {'lr': 2.2984421407125726e-05, 'samples': 22091264, 'steps': 43146, 'batch_loss/train': 0.6759917112067342}
12/28/2021 15:29:51 - INFO - codeparrot_training - Step 43147: {'lr': 2.2977819704635637e-05, 'samples': 22091776, 'steps': 43147, 'batch_loss/train': 0.7318739145994186}
12/28/2021 15:30:02 - INFO - codeparrot_training - Step 43148: {'lr': 2.2971218904700082e-05, 'samples': 22092288, 'steps': 43148, 'batch_loss/train': 0.84724150178954}
12/28/2021 15:30:16 - INFO - codeparrot_training - Step 43149: {'lr': 2.2964619007345282e-05, 'samples': 22092800, 'steps': 43149, 'batch_loss/train': 0.6486416340630967}
12/28/2021 15:30:26 - INFO - codeparrot_training - Step 43150: {'lr': 2.2958020012597336e-05, 'samples': 22093312, 'steps': 43150, 'batch_loss/train': 0.7218127655796707}
12/28/2021 15:30:37 - INFO - codeparrot_training - Step 43151: {'lr': 2.2951421920482636e-05, 'samples': 22093824, 'steps': 43151, 'batch_loss/train': 0.6296669850125909}
12/28/2021 15:30:48 - INFO - codeparrot_training - Step 43152: {'lr': 2.294482473102738e-05, 'samples': 22094336, 'steps': 43152, 'batch_loss/train': 1.4224592512473464}
12/28/2021 15:31:00 - INFO - codeparrot_training - Step 43153: {'lr': 2.293822844425772e-05, 'samples': 22094848, 'steps': 43153, 'batch_loss/train': 0.8083369699306786}
12/28/2021 15:31:11 - INFO - codeparrot_training - Step 43154: {'lr': 2.2931633060199935e-05, 'samples': 22095360, 'steps': 43154, 'batch_loss/train': 0.7627990879118443}
12/28/2021 15:31:21 - INFO - codeparrot_training - Step 43155: {'lr': 2.292503857888026e-05, 'samples': 22095872, 'steps': 43155, 'batch_loss/train': 0.6867809062823653}
12/28/2021 15:31:36 - INFO - codeparrot_training - Step 43156: {'lr': 2.291844500032486e-05, 'samples': 22096384, 'steps': 43156, 'batch_loss/train': 0.7198542454279959}
12/28/2021 15:31:46 - INFO - codeparrot_training - Step 43157: {'lr': 2.291185232456e-05, 'samples': 22096896, 'steps': 43157, 'batch_loss/train': 0.7461458649486303}
12/28/2021 15:31:57 - INFO - codeparrot_training - Step 43158: {'lr': 2.290526055161185e-05, 'samples': 22097408, 'steps': 43158, 'batch_loss/train': 0.9314841451123357}
12/28/2021 15:32:07 - INFO - codeparrot_training - Step 43159: {'lr': 2.289866968150664e-05, 'samples': 22097920, 'steps': 43159, 'batch_loss/train': 0.6858532793121412}
12/28/2021 15:32:19 - INFO - codeparrot_training - Step 43160: {'lr': 2.28920797142706e-05, 'samples': 22098432, 'steps': 43160, 'batch_loss/train': 0.8726948411203921}
12/28/2021 15:32:30 - INFO - codeparrot_training - Step 43161: {'lr': 2.288549064992987e-05, 'samples': 22098944, 'steps': 43161, 'batch_loss/train': 0.9170409636572003}
12/28/2021 15:32:41 - INFO - codeparrot_training - Step 43162: {'lr': 2.2878902488510685e-05, 'samples': 22099456, 'steps': 43162, 'batch_loss/train': 0.7611195277422667}
12/28/2021 15:32:53 - INFO - codeparrot_training - Step 43163: {'lr': 2.2872315230039243e-05, 'samples': 22099968, 'steps': 43163, 'batch_loss/train': 0.712560894433409}
12/28/2021 15:33:04 - INFO - codeparrot_training - Step 43164: {'lr': 2.286572887454169e-05, 'samples': 22100480, 'steps': 43164, 'batch_loss/train': 0.7107262202189304}
12/28/2021 15:33:14 - INFO - codeparrot_training - Step 43165: {'lr': 2.2859143422044205e-05, 'samples': 22100992, 'steps': 43165, 'batch_loss/train': 0.8078114492818713}
12/28/2021 15:33:26 - INFO - codeparrot_training - Step 43166: {'lr': 2.2852558872573092e-05, 'samples': 22101504, 'steps': 43166, 'batch_loss/train': 0.6376179392682388}
12/28/2021 15:33:37 - INFO - codeparrot_training - Step 43167: {'lr': 2.2845975226154364e-05, 'samples': 22102016, 'steps': 43167, 'batch_loss/train': 0.6246074677910656}
12/28/2021 15:33:48 - INFO - codeparrot_training - Step 43168: {'lr': 2.2839392482814304e-05, 'samples': 22102528, 'steps': 43168, 'batch_loss/train': 0.7547674803063273}
12/28/2021 15:34:01 - INFO - codeparrot_training - Step 43169: {'lr': 2.2832810642579028e-05, 'samples': 22103040, 'steps': 43169, 'batch_loss/train': 0.7092382605187595}
12/28/2021 15:34:12 - INFO - codeparrot_training - Step 43170: {'lr': 2.2826229705474712e-05, 'samples': 22103552, 'steps': 43170, 'batch_loss/train': 0.6799740819260478}
12/28/2021 15:34:22 - INFO - codeparrot_training - Step 43171: {'lr': 2.2819649671527526e-05, 'samples': 22104064, 'steps': 43171, 'batch_loss/train': 0.7666466697119176}
12/28/2021 15:34:33 - INFO - codeparrot_training - Step 43172: {'lr': 2.2813070540763704e-05, 'samples': 22104576, 'steps': 43172, 'batch_loss/train': 0.7980128787457943}
12/28/2021 15:34:45 - INFO - codeparrot_training - Step 43173: {'lr': 2.280649231320922e-05, 'samples': 22105088, 'steps': 43173, 'batch_loss/train': 0.6942748259752989}
12/28/2021 15:34:56 - INFO - codeparrot_training - Step 43174: {'lr': 2.279991498889039e-05, 'samples': 22105600, 'steps': 43174, 'batch_loss/train': 0.6961007695645094}
12/28/2021 15:35:06 - INFO - codeparrot_training - Step 43175: {'lr': 2.279333856783336e-05, 'samples': 22106112, 'steps': 43175, 'batch_loss/train': 0.7497346303425729}
12/28/2021 15:35:18 - INFO - codeparrot_training - Step 43176: {'lr': 2.2786763050064135e-05, 'samples': 22106624, 'steps': 43176, 'batch_loss/train': 0.7517734542489052}
12/28/2021 15:35:29 - INFO - codeparrot_training - Step 43177: {'lr': 2.2780188435609e-05, 'samples': 22107136, 'steps': 43177, 'batch_loss/train': 0.7852203836664557}
12/28/2021 15:35:40 - INFO - codeparrot_training - Step 43178: {'lr': 2.2773614724494073e-05, 'samples': 22107648, 'steps': 43178, 'batch_loss/train': 0.7454232652089559}
12/28/2021 15:35:54 - INFO - codeparrot_training - Step 43179: {'lr': 2.276704191674542e-05, 'samples': 22108160, 'steps': 43179, 'batch_loss/train': 0.7341366014443338}
12/28/2021 15:36:04 - INFO - codeparrot_training - Step 43180: {'lr': 2.276047001238915e-05, 'samples': 22108672, 'steps': 43180, 'batch_loss/train': 0.6698859492316842}
12/28/2021 15:36:15 - INFO - codeparrot_training - Step 43181: {'lr': 2.275389901145153e-05, 'samples': 22109184, 'steps': 43181, 'batch_loss/train': 0.6986425807699561}
12/28/2021 15:36:26 - INFO - codeparrot_training - Step 43182: {'lr': 2.2747328913958587e-05, 'samples': 22109696, 'steps': 43182, 'batch_loss/train': 0.6698526625987142}
12/28/2021 15:36:38 - INFO - codeparrot_training - Step 43183: {'lr': 2.2740759719936445e-05, 'samples': 22110208, 'steps': 43183, 'batch_loss/train': 1.0559787834063172}
12/28/2021 15:36:49 - INFO - codeparrot_training - Step 43184: {'lr': 2.2734191429411246e-05, 'samples': 22110720, 'steps': 43184, 'batch_loss/train': 0.9131473479792476}
12/28/2021 15:36:59 - INFO - codeparrot_training - Step 43185: {'lr': 2.272762404240905e-05, 'samples': 22111232, 'steps': 43185, 'batch_loss/train': 0.7253661500290036}
12/28/2021 15:37:13 - INFO - codeparrot_training - Step 43186: {'lr': 2.2721057558956036e-05, 'samples': 22111744, 'steps': 43186, 'batch_loss/train': 0.6796532515436411}
12/28/2021 15:37:24 - INFO - codeparrot_training - Step 43187: {'lr': 2.2714491979078295e-05, 'samples': 22112256, 'steps': 43187, 'batch_loss/train': 0.6887432671501301}
12/28/2021 15:37:35 - INFO - codeparrot_training - Step 43188: {'lr': 2.2707927302801885e-05, 'samples': 22112768, 'steps': 43188, 'batch_loss/train': 0.6478459275094792}
12/28/2021 15:37:47 - INFO - codeparrot_training - Step 43189: {'lr': 2.2701363530152953e-05, 'samples': 22113280, 'steps': 43189, 'batch_loss/train': 0.7117877127602696}
12/28/2021 15:37:57 - INFO - codeparrot_training - Step 43190: {'lr': 2.269480066115759e-05, 'samples': 22113792, 'steps': 43190, 'batch_loss/train': 0.6683075542096049}
12/28/2021 15:38:08 - INFO - codeparrot_training - Step 43191: {'lr': 2.26882386958418e-05, 'samples': 22114304, 'steps': 43191, 'batch_loss/train': 0.6771488017402589}
12/28/2021 15:38:19 - INFO - codeparrot_training - Step 43192: {'lr': 2.2681677634231844e-05, 'samples': 22114816, 'steps': 43192, 'batch_loss/train': 0.6998860640451312}
12/28/2021 15:38:31 - INFO - codeparrot_training - Step 43193: {'lr': 2.2675117476353646e-05, 'samples': 22115328, 'steps': 43193, 'batch_loss/train': 0.6989975366741419}
12/28/2021 15:38:41 - INFO - codeparrot_training - Step 43194: {'lr': 2.2668558222233375e-05, 'samples': 22115840, 'steps': 43194, 'batch_loss/train': 0.7175268763676286}
12/28/2021 15:38:52 - INFO - codeparrot_training - Step 43195: {'lr': 2.2661999871897042e-05, 'samples': 22116352, 'steps': 43195, 'batch_loss/train': 0.5902242232114077}
12/28/2021 15:39:06 - INFO - codeparrot_training - Step 43196: {'lr': 2.2655442425370793e-05, 'samples': 22116864, 'steps': 43196, 'batch_loss/train': 1.285088169388473}
12/28/2021 15:39:17 - INFO - codeparrot_training - Step 43197: {'lr': 2.2648885882680632e-05, 'samples': 22117376, 'steps': 43197, 'batch_loss/train': 0.6445663319900632}
12/28/2021 15:39:27 - INFO - codeparrot_training - Step 43198: {'lr': 2.2642330243852678e-05, 'samples': 22117888, 'steps': 43198, 'batch_loss/train': 0.6749034058302641}
12/28/2021 15:39:39 - INFO - codeparrot_training - Step 43199: {'lr': 2.2635775508912966e-05, 'samples': 22118400, 'steps': 43199, 'batch_loss/train': 0.6811514087021351}
12/28/2021 15:39:50 - INFO - codeparrot_training - Step 43200: {'lr': 2.262922167788753e-05, 'samples': 22118912, 'steps': 43200, 'batch_loss/train': 0.5614826739765704}
12/28/2021 15:40:01 - INFO - codeparrot_training - Step 43201: {'lr': 2.2622668750802546e-05, 'samples': 22119424, 'steps': 43201, 'batch_loss/train': 0.7196296658366919}
12/28/2021 15:40:13 - INFO - codeparrot_training - Step 43202: {'lr': 2.2616116727683876e-05, 'samples': 22119936, 'steps': 43202, 'batch_loss/train': 0.6342722952831537}
12/28/2021 15:40:23 - INFO - codeparrot_training - Step 43203: {'lr': 2.2609565608557704e-05, 'samples': 22120448, 'steps': 43203, 'batch_loss/train': 0.7318437066860497}
12/28/2021 15:40:34 - INFO - codeparrot_training - Step 43204: {'lr': 2.260301539345008e-05, 'samples': 22120960, 'steps': 43204, 'batch_loss/train': 0.5769135761074722}
12/28/2021 15:40:44 - INFO - codeparrot_training - Step 43205: {'lr': 2.2596466082386968e-05, 'samples': 22121472, 'steps': 43205, 'batch_loss/train': 0.6534170005470514}
12/28/2021 15:40:58 - INFO - codeparrot_training - Step 43206: {'lr': 2.2589917675394394e-05, 'samples': 22121984, 'steps': 43206, 'batch_loss/train': 0.6903828596696258}
12/28/2021 15:41:09 - INFO - codeparrot_training - Step 43207: {'lr': 2.2583370172498537e-05, 'samples': 22122496, 'steps': 43207, 'batch_loss/train': 0.6697103129699826}
12/28/2021 15:41:20 - INFO - codeparrot_training - Step 43208: {'lr': 2.2576823573725285e-05, 'samples': 22123008, 'steps': 43208, 'batch_loss/train': 0.7387136324541643}
12/28/2021 15:41:32 - INFO - codeparrot_training - Step 43209: {'lr': 2.257027787910071e-05, 'samples': 22123520, 'steps': 43209, 'batch_loss/train': 0.7042991383932531}
12/28/2021 15:41:42 - INFO - codeparrot_training - Step 43210: {'lr': 2.256373308865084e-05, 'samples': 22124032, 'steps': 43210, 'batch_loss/train': 0.6265113013796508}
12/28/2021 15:41:53 - INFO - codeparrot_training - Step 43211: {'lr': 2.2557189202401685e-05, 'samples': 22124544, 'steps': 43211, 'batch_loss/train': 0.7859168858267367}
12/28/2021 15:42:05 - INFO - codeparrot_training - Step 43212: {'lr': 2.2550646220379277e-05, 'samples': 22125056, 'steps': 43212, 'batch_loss/train': 0.5394850022275932}
12/28/2021 15:42:16 - INFO - codeparrot_training - Step 43213: {'lr': 2.2544104142609602e-05, 'samples': 22125568, 'steps': 43213, 'batch_loss/train': 0.6655761594884098}
12/28/2021 15:42:27 - INFO - codeparrot_training - Step 43214: {'lr': 2.2537562969118684e-05, 'samples': 22126080, 'steps': 43214, 'batch_loss/train': 0.7326232839841396}
12/28/2021 15:42:41 - INFO - codeparrot_training - Step 43215: {'lr': 2.2531022699932542e-05, 'samples': 22126592, 'steps': 43215, 'batch_loss/train': 0.7360204723663628}
12/28/2021 15:42:52 - INFO - codeparrot_training - Step 43216: {'lr': 2.25244833350772e-05, 'samples': 22127104, 'steps': 43216, 'batch_loss/train': 0.9027981921099126}
12/28/2021 15:43:02 - INFO - codeparrot_training - Step 43217: {'lr': 2.251794487457856e-05, 'samples': 22127616, 'steps': 43217, 'batch_loss/train': 0.7145407389616594}
12/28/2021 15:43:13 - INFO - codeparrot_training - Step 43218: {'lr': 2.251140731846274e-05, 'samples': 22128128, 'steps': 43218, 'batch_loss/train': 0.7406329791992903}
12/28/2021 15:43:25 - INFO - codeparrot_training - Step 43219: {'lr': 2.2504870666755605e-05, 'samples': 22128640, 'steps': 43219, 'batch_loss/train': 0.7365324823185802}
12/28/2021 15:43:36 - INFO - codeparrot_training - Step 43220: {'lr': 2.2498334919483247e-05, 'samples': 22129152, 'steps': 43220, 'batch_loss/train': 0.7001616570632905}
12/28/2021 15:43:46 - INFO - codeparrot_training - Step 43221: {'lr': 2.2491800076671593e-05, 'samples': 22129664, 'steps': 43221, 'batch_loss/train': 0.6447814349085093}
12/28/2021 15:43:58 - INFO - codeparrot_training - Step 43222: {'lr': 2.2485266138346616e-05, 'samples': 22130176, 'steps': 43222, 'batch_loss/train': 0.5987016583094373}
12/28/2021 15:44:09 - INFO - codeparrot_training - Step 43223: {'lr': 2.2478733104534327e-05, 'samples': 22130688, 'steps': 43223, 'batch_loss/train': 0.6675793016329408}
12/28/2021 15:44:20 - INFO - codeparrot_training - Step 43224: {'lr': 2.2472200975260703e-05, 'samples': 22131200, 'steps': 43224, 'batch_loss/train': 0.7708332027541474}
12/28/2021 15:44:34 - INFO - codeparrot_training - Step 43225: {'lr': 2.246566975055167e-05, 'samples': 22131712, 'steps': 43225, 'batch_loss/train': 0.6740316440118477}
12/28/2021 15:44:45 - INFO - codeparrot_training - Step 43226: {'lr': 2.2459139430433233e-05, 'samples': 22132224, 'steps': 43226, 'batch_loss/train': 0.8713417379185557}
12/28/2021 15:44:55 - INFO - codeparrot_training - Step 43227: {'lr': 2.2452610014931402e-05, 'samples': 22132736, 'steps': 43227, 'batch_loss/train': 0.6982195246964693}
12/28/2021 15:45:06 - INFO - codeparrot_training - Step 43228: {'lr': 2.2446081504071987e-05, 'samples': 22133248, 'steps': 43228, 'batch_loss/train': 0.7139587252167985}
12/28/2021 15:45:18 - INFO - codeparrot_training - Step 43229: {'lr': 2.243955389788105e-05, 'samples': 22133760, 'steps': 43229, 'batch_loss/train': 0.6724719735793769}
12/28/2021 15:45:29 - INFO - codeparrot_training - Step 43230: {'lr': 2.243302719638457e-05, 'samples': 22134272, 'steps': 43230, 'batch_loss/train': 0.7042778767645359}
12/28/2021 15:45:39 - INFO - codeparrot_training - Step 43231: {'lr': 2.242650139960842e-05, 'samples': 22134784, 'steps': 43231, 'batch_loss/train': 0.6572889778763056}
12/28/2021 15:45:51 - INFO - codeparrot_training - Step 43232: {'lr': 2.2419976507578517e-05, 'samples': 22135296, 'steps': 43232, 'batch_loss/train': 0.7352759130299091}
12/28/2021 15:46:02 - INFO - codeparrot_training - Step 43233: {'lr': 2.2413452520320952e-05, 'samples': 22135808, 'steps': 43233, 'batch_loss/train': 0.6568824173882604}
12/28/2021 15:46:12 - INFO - codeparrot_training - Step 43234: {'lr': 2.240692943786152e-05, 'samples': 22136320, 'steps': 43234, 'batch_loss/train': 0.7585138129070401}
12/28/2021 15:46:26 - INFO - codeparrot_training - Step 43235: {'lr': 2.240040726022616e-05, 'samples': 22136832, 'steps': 43235, 'batch_loss/train': 0.7424291931092739}
12/28/2021 15:46:37 - INFO - codeparrot_training - Step 43236: {'lr': 2.2393885987440908e-05, 'samples': 22137344, 'steps': 43236, 'batch_loss/train': 0.7863342259079218}
12/28/2021 15:46:48 - INFO - codeparrot_training - Step 43237: {'lr': 2.2387365619531613e-05, 'samples': 22137856, 'steps': 43237, 'batch_loss/train': 0.7418826669454575}
12/28/2021 15:47:00 - INFO - codeparrot_training - Step 43238: {'lr': 2.2380846156524194e-05, 'samples': 22138368, 'steps': 43238, 'batch_loss/train': 0.7188719834666699}
12/28/2021 15:47:10 - INFO - codeparrot_training - Step 43239: {'lr': 2.237432759844457e-05, 'samples': 22138880, 'steps': 43239, 'batch_loss/train': 0.6417617936385795}
12/28/2021 15:47:21 - INFO - codeparrot_training - Step 43240: {'lr': 2.23678099453187e-05, 'samples': 22139392, 'steps': 43240, 'batch_loss/train': 0.6735881064087152}
12/28/2021 15:47:32 - INFO - codeparrot_training - Step 43241: {'lr': 2.236129319717245e-05, 'samples': 22139904, 'steps': 43241, 'batch_loss/train': 0.8048897637054324}
12/28/2021 15:47:46 - INFO - codeparrot_training - Step 43242: {'lr': 2.2354777354031798e-05, 'samples': 22140416, 'steps': 43242, 'batch_loss/train': 0.6386401169002056}
12/28/2021 15:47:56 - INFO - codeparrot_training - Step 43243: {'lr': 2.2348262415922498e-05, 'samples': 22140928, 'steps': 43243, 'batch_loss/train': 0.6465968115953729}
12/28/2021 15:48:07 - INFO - codeparrot_training - Step 43244: {'lr': 2.2341748382870647e-05, 'samples': 22141440, 'steps': 43244, 'batch_loss/train': 0.6769424863159657}
12/28/2021 15:48:19 - INFO - codeparrot_training - Step 43245: {'lr': 2.2335235254901997e-05, 'samples': 22141952, 'steps': 43245, 'batch_loss/train': 0.6246438961825334}
12/28/2021 15:48:30 - INFO - codeparrot_training - Step 43246: {'lr': 2.2328723032042475e-05, 'samples': 22142464, 'steps': 43246, 'batch_loss/train': 0.7362667531706393}
12/28/2021 15:48:41 - INFO - codeparrot_training - Step 43247: {'lr': 2.2322211714318004e-05, 'samples': 22142976, 'steps': 43247, 'batch_loss/train': 0.7267638598568738}
12/28/2021 15:48:51 - INFO - codeparrot_training - Step 43248: {'lr': 2.231570130175442e-05, 'samples': 22143488, 'steps': 43248, 'batch_loss/train': 0.5008768349944148}
12/28/2021 15:49:04 - INFO - codeparrot_training - Step 43249: {'lr': 2.230919179437768e-05, 'samples': 22144000, 'steps': 43249, 'batch_loss/train': 0.6468604592955671}
12/28/2021 15:49:14 - INFO - codeparrot_training - Step 43250: {'lr': 2.2302683192213598e-05, 'samples': 22144512, 'steps': 43250, 'batch_loss/train': 0.8469512906158343}
12/28/2021 15:49:25 - INFO - codeparrot_training - Step 43251: {'lr': 2.2296175495288062e-05, 'samples': 22145024, 'steps': 43251, 'batch_loss/train': 0.9310693680308759}
12/28/2021 15:49:37 - INFO - codeparrot_training - Step 43252: {'lr': 2.2289668703626976e-05, 'samples': 22145536, 'steps': 43252, 'batch_loss/train': 0.7027691397815943}
12/28/2021 15:49:48 - INFO - codeparrot_training - Step 43253: {'lr': 2.2283162817256232e-05, 'samples': 22146048, 'steps': 43253, 'batch_loss/train': 0.7342911288142204}
12/28/2021 15:49:58 - INFO - codeparrot_training - Step 43254: {'lr': 2.227665783620156e-05, 'samples': 22146560, 'steps': 43254, 'batch_loss/train': 0.6739062296692282}
12/28/2021 15:50:12 - INFO - codeparrot_training - Step 43255: {'lr': 2.227015376048894e-05, 'samples': 22147072, 'steps': 43255, 'batch_loss/train': 0.6422620805678889}
12/28/2021 15:50:23 - INFO - codeparrot_training - Step 43256: {'lr': 2.2263650590144265e-05, 'samples': 22147584, 'steps': 43256, 'batch_loss/train': 0.6986910989508033}
12/28/2021 15:50:33 - INFO - codeparrot_training - Step 43257: {'lr': 2.2257148325193267e-05, 'samples': 22148096, 'steps': 43257, 'batch_loss/train': 0.7043307768180966}
12/28/2021 15:50:45 - INFO - codeparrot_training - Step 43258: {'lr': 2.2250646965661815e-05, 'samples': 22148608, 'steps': 43258, 'batch_loss/train': 0.6326108139473945}
12/28/2021 15:50:56 - INFO - codeparrot_training - Step 43259: {'lr': 2.2244146511575857e-05, 'samples': 22149120, 'steps': 43259, 'batch_loss/train': 0.5912734722078312}
12/28/2021 15:51:07 - INFO - codeparrot_training - Step 43260: {'lr': 2.2237646962961184e-05, 'samples': 22149632, 'steps': 43260, 'batch_loss/train': 0.6832010226789862}
12/28/2021 15:51:17 - INFO - codeparrot_training - Step 43261: {'lr': 2.2231148319843547e-05, 'samples': 22150144, 'steps': 43261, 'batch_loss/train': 0.807847079820931}
12/28/2021 15:51:30 - INFO - codeparrot_training - Step 43262: {'lr': 2.2224650582248958e-05, 'samples': 22150656, 'steps': 43262, 'batch_loss/train': 0.5854616619180888}
12/28/2021 15:51:40 - INFO - codeparrot_training - Step 43263: {'lr': 2.2218153750203114e-05, 'samples': 22151168, 'steps': 43263, 'batch_loss/train': 0.7271020419429988}
12/28/2021 15:51:51 - INFO - codeparrot_training - Step 43264: {'lr': 2.2211657823731886e-05, 'samples': 22151680, 'steps': 43264, 'batch_loss/train': 0.6674121785908937}
12/28/2021 15:52:05 - INFO - codeparrot_training - Step 43265: {'lr': 2.2205162802861085e-05, 'samples': 22152192, 'steps': 43265, 'batch_loss/train': 0.691800672095269}
12/28/2021 15:52:16 - INFO - codeparrot_training - Step 43266: {'lr': 2.2198668687616553e-05, 'samples': 22152704, 'steps': 43266, 'batch_loss/train': 0.5494788854848593}
12/28/2021 15:52:26 - INFO - codeparrot_training - Step 43267: {'lr': 2.21921754780241e-05, 'samples': 22153216, 'steps': 43267, 'batch_loss/train': 0.7636395515874028}
12/28/2021 15:52:38 - INFO - codeparrot_training - Step 43268: {'lr': 2.218568317410957e-05, 'samples': 22153728, 'steps': 43268, 'batch_loss/train': 0.666564267128706}
12/28/2021 15:52:49 - INFO - codeparrot_training - Step 43269: {'lr': 2.2179191775898668e-05, 'samples': 22154240, 'steps': 43269, 'batch_loss/train': 0.6917397221550345}
12/28/2021 15:53:00 - INFO - codeparrot_training - Step 43270: {'lr': 2.2172701283417334e-05, 'samples': 22154752, 'steps': 43270, 'batch_loss/train': 0.6332859632093459}
12/28/2021 15:53:10 - INFO - codeparrot_training - Step 43271: {'lr': 2.2166211696691336e-05, 'samples': 22155264, 'steps': 43271, 'batch_loss/train': 0.6375606229994446}
12/28/2021 15:53:25 - INFO - codeparrot_training - Step 43272: {'lr': 2.2159723015746397e-05, 'samples': 22155776, 'steps': 43272, 'batch_loss/train': 0.7208762292284518}
12/28/2021 15:53:35 - INFO - codeparrot_training - Step 43273: {'lr': 2.2153235240608415e-05, 'samples': 22156288, 'steps': 43273, 'batch_loss/train': 0.9431757256388664}
12/28/2021 15:53:46 - INFO - codeparrot_training - Step 43274: {'lr': 2.214674837130312e-05, 'samples': 22156800, 'steps': 43274, 'batch_loss/train': 0.6308350601466373}
12/28/2021 15:53:58 - INFO - codeparrot_training - Step 43275: {'lr': 2.2140262407856327e-05, 'samples': 22157312, 'steps': 43275, 'batch_loss/train': 0.7637553154490888}
12/28/2021 15:54:09 - INFO - codeparrot_training - Step 43276: {'lr': 2.2133777350293815e-05, 'samples': 22157824, 'steps': 43276, 'batch_loss/train': 0.7143760037142783}
12/28/2021 15:54:19 - INFO - codeparrot_training - Step 43277: {'lr': 2.2127293198641347e-05, 'samples': 22158336, 'steps': 43277, 'batch_loss/train': 0.863854699768126}
12/28/2021 15:54:31 - INFO - codeparrot_training - Step 43278: {'lr': 2.2120809952924732e-05, 'samples': 22158848, 'steps': 43278, 'batch_loss/train': 0.6489950162358582}
12/28/2021 15:54:42 - INFO - codeparrot_training - Step 43279: {'lr': 2.2114327613169725e-05, 'samples': 22159360, 'steps': 43279, 'batch_loss/train': 0.631156057206681}
12/28/2021 15:54:53 - INFO - codeparrot_training - Step 43280: {'lr': 2.2107846179402118e-05, 'samples': 22159872, 'steps': 43280, 'batch_loss/train': 0.6928548724390566}
12/28/2021 15:55:03 - INFO - codeparrot_training - Step 43281: {'lr': 2.2101365651647636e-05, 'samples': 22160384, 'steps': 43281, 'batch_loss/train': 0.7976600434631109}
12/28/2021 15:55:16 - INFO - codeparrot_training - Step 43282: {'lr': 2.2094886029932148e-05, 'samples': 22160896, 'steps': 43282, 'batch_loss/train': 0.7305807955563068}
12/28/2021 15:55:26 - INFO - codeparrot_training - Step 43283: {'lr': 2.208840731428127e-05, 'samples': 22161408, 'steps': 43283, 'batch_loss/train': 0.7166097613517195}
12/28/2021 15:55:37 - INFO - codeparrot_training - Step 43284: {'lr': 2.208192950472082e-05, 'samples': 22161920, 'steps': 43284, 'batch_loss/train': 0.7632069159299135}
12/28/2021 15:55:51 - INFO - codeparrot_training - Step 43285: {'lr': 2.2075452601276608e-05, 'samples': 22162432, 'steps': 43285, 'batch_loss/train': 0.810283275321126}
12/28/2021 15:56:01 - INFO - codeparrot_training - Step 43286: {'lr': 2.2068976603974332e-05, 'samples': 22162944, 'steps': 43286, 'batch_loss/train': 0.6409243638627231}
12/28/2021 15:56:12 - INFO - codeparrot_training - Step 43287: {'lr': 2.2062501512839673e-05, 'samples': 22163456, 'steps': 43287, 'batch_loss/train': 0.6491158357821405}
12/28/2021 15:56:24 - INFO - codeparrot_training - Step 43288: {'lr': 2.205602732789852e-05, 'samples': 22163968, 'steps': 43288, 'batch_loss/train': 0.7643880667164922}
12/28/2021 15:56:35 - INFO - codeparrot_training - Step 43289: {'lr': 2.2049554049176524e-05, 'samples': 22164480, 'steps': 43289, 'batch_loss/train': 2.2011435497552156}
12/28/2021 15:56:46 - INFO - codeparrot_training - Step 43290: {'lr': 2.2043081676699382e-05, 'samples': 22164992, 'steps': 43290, 'batch_loss/train': 0.7043631562264636}
12/28/2021 15:56:58 - INFO - codeparrot_training - Step 43291: {'lr': 2.2036610210492914e-05, 'samples': 22165504, 'steps': 43291, 'batch_loss/train': 0.732106045819819}
12/28/2021 15:57:08 - INFO - codeparrot_training - Step 43292: {'lr': 2.2030139650582786e-05, 'samples': 22166016, 'steps': 43292, 'batch_loss/train': 0.7070007973816246}
12/28/2021 15:57:19 - INFO - codeparrot_training - Step 43293: {'lr': 2.202366999699476e-05, 'samples': 22166528, 'steps': 43293, 'batch_loss/train': 0.5293081232812256}
12/28/2021 15:57:30 - INFO - codeparrot_training - Step 43294: {'lr': 2.2017201249754592e-05, 'samples': 22167040, 'steps': 43294, 'batch_loss/train': 0.7463192762807012}
12/28/2021 15:57:44 - INFO - codeparrot_training - Step 43295: {'lr': 2.2010733408887845e-05, 'samples': 22167552, 'steps': 43295, 'batch_loss/train': 0.7719437456689775}
12/28/2021 15:57:54 - INFO - codeparrot_training - Step 43296: {'lr': 2.2004266474420388e-05, 'samples': 22168064, 'steps': 43296, 'batch_loss/train': 0.7097575557418168}
12/28/2021 15:58:05 - INFO - codeparrot_training - Step 43297: {'lr': 2.1997800446377918e-05, 'samples': 22168576, 'steps': 43297, 'batch_loss/train': 0.7072693705558777}
12/28/2021 15:58:17 - INFO - codeparrot_training - Step 43298: {'lr': 2.1991335324786005e-05, 'samples': 22169088, 'steps': 43298, 'batch_loss/train': 0.7802169601200148}
12/28/2021 15:58:28 - INFO - codeparrot_training - Step 43299: {'lr': 2.1984871109670568e-05, 'samples': 22169600, 'steps': 43299, 'batch_loss/train': 0.7832276336848736}
12/28/2021 15:58:38 - INFO - codeparrot_training - Step 43300: {'lr': 2.1978407801057116e-05, 'samples': 22170112, 'steps': 43300, 'batch_loss/train': 0.7124221157282591}
12/28/2021 15:58:52 - INFO - codeparrot_training - Step 43301: {'lr': 2.1971945398971433e-05, 'samples': 22170624, 'steps': 43301, 'batch_loss/train': 0.7662296965718269}
12/28/2021 15:59:03 - INFO - codeparrot_training - Step 43302: {'lr': 2.1965483903439194e-05, 'samples': 22171136, 'steps': 43302, 'batch_loss/train': 0.8087318586185575}
12/28/2021 15:59:14 - INFO - codeparrot_training - Step 43303: {'lr': 2.195902331448607e-05, 'samples': 22171648, 'steps': 43303, 'batch_loss/train': 0.7734012929722667}
12/28/2021 15:59:26 - INFO - codeparrot_training - Step 43304: {'lr': 2.1952563632137794e-05, 'samples': 22172160, 'steps': 43304, 'batch_loss/train': 0.6612389840884134}
12/28/2021 15:59:36 - INFO - codeparrot_training - Step 43305: {'lr': 2.1946104856419984e-05, 'samples': 22172672, 'steps': 43305, 'batch_loss/train': 0.7069524971302599}
12/28/2021 15:59:47 - INFO - codeparrot_training - Step 43306: {'lr': 2.193964698735837e-05, 'samples': 22173184, 'steps': 43306, 'batch_loss/train': 0.5636952693457715}
12/28/2021 15:59:58 - INFO - codeparrot_training - Step 43307: {'lr': 2.1933190024978623e-05, 'samples': 22173696, 'steps': 43307, 'batch_loss/train': 0.6378474442171864}
12/28/2021 16:00:10 - INFO - codeparrot_training - Step 43308: {'lr': 2.1926733969306417e-05, 'samples': 22174208, 'steps': 43308, 'batch_loss/train': 0.6290567773976363}
12/28/2021 16:00:21 - INFO - codeparrot_training - Step 43309: {'lr': 2.1920278820367317e-05, 'samples': 22174720, 'steps': 43309, 'batch_loss/train': 0.6828675165306777}
12/28/2021 16:00:31 - INFO - codeparrot_training - Step 43310: {'lr': 2.1913824578187135e-05, 'samples': 22175232, 'steps': 43310, 'batch_loss/train': 0.8220963790081441}
12/28/2021 16:00:45 - INFO - codeparrot_training - Step 43311: {'lr': 2.190737124279149e-05, 'samples': 22175744, 'steps': 43311, 'batch_loss/train': 0.7883556028828025}
12/28/2021 16:00:56 - INFO - codeparrot_training - Step 43312: {'lr': 2.1900918814205943e-05, 'samples': 22176256, 'steps': 43312, 'batch_loss/train': 0.7443929431028664}
12/28/2021 16:01:06 - INFO - codeparrot_training - Step 43313: {'lr': 2.18944672924562e-05, 'samples': 22176768, 'steps': 43313, 'batch_loss/train': 0.7248285822570324}
12/28/2021 16:01:19 - INFO - codeparrot_training - Step 43314: {'lr': 2.1888016677568012e-05, 'samples': 22177280, 'steps': 43314, 'batch_loss/train': 0.5844577640818898}
12/28/2021 16:01:29 - INFO - codeparrot_training - Step 43315: {'lr': 2.188156696956689e-05, 'samples': 22177792, 'steps': 43315, 'batch_loss/train': 0.7086435329401866}
12/28/2021 16:01:40 - INFO - codeparrot_training - Step 43316: {'lr': 2.1875118168478535e-05, 'samples': 22178304, 'steps': 43316, 'batch_loss/train': 0.6067683612927794}
12/28/2021 16:01:51 - INFO - codeparrot_training - Step 43317: {'lr': 2.1868670274328568e-05, 'samples': 22178816, 'steps': 43317,
'batch_loss/train': 0.583428226178512} 12/28/2021 16:02:03 - INFO - codeparrot_training - Step 43318: {'lr': 2.186222328714263e-05, 'samples': 22179328, 'steps': 43318, 'batch_loss/train': 0.6396189811639488} 12/28/2021 16:02:14 - INFO - codeparrot_training - Step 43319: {'lr': 2.1855777206946343e-05, 'samples': 22179840, 'steps': 43319, 'batch_loss/train': 0.594738052750472} 12/28/2021 16:02:24 - INFO - codeparrot_training - Step 43320: {'lr': 2.1849332033765378e-05, 'samples': 22180352, 'steps': 43320, 'batch_loss/train': 0.672515444457531} 12/28/2021 16:02:38 - INFO - codeparrot_training - Step 43321: {'lr': 2.1842887767625247e-05, 'samples': 22180864, 'steps': 43321, 'batch_loss/train': 0.8332160345744342} 12/28/2021 16:02:49 - INFO - codeparrot_training - Step 43322: {'lr': 2.1836444408551702e-05, 'samples': 22181376, 'steps': 43322, 'batch_loss/train': 0.8264926010742784} 12/28/2021 16:02:59 - INFO - codeparrot_training - Step 43323: {'lr': 2.183000195657034e-05, 'samples': 22181888, 'steps': 43323, 'batch_loss/train': 0.5611179923871532} 12/28/2021 16:03:12 - INFO - codeparrot_training - Step 43324: {'lr': 2.1823560411706635e-05, 'samples': 22182400, 'steps': 43324, 'batch_loss/train': 0.6796143655665219} 12/28/2021 16:03:22 - INFO - codeparrot_training - Step 43325: {'lr': 2.1817119773986398e-05, 'samples': 22182912, 'steps': 43325, 'batch_loss/train': 0.6947389226406813} 12/28/2021 16:03:33 - INFO - codeparrot_training - Step 43326: {'lr': 2.1810680043435087e-05, 'samples': 22183424, 'steps': 43326, 'batch_loss/train': 0.7310707119759172} 12/28/2021 16:03:46 - INFO - codeparrot_training - Step 43327: {'lr': 2.1804241220078346e-05, 'samples': 22183936, 'steps': 43327, 'batch_loss/train': 0.5977224005910102} 12/28/2021 16:03:56 - INFO - codeparrot_training - Step 43328: {'lr': 2.1797803303941766e-05, 'samples': 22184448, 'steps': 43328, 'batch_loss/train': 1.1736546885222197} 12/28/2021 16:04:07 - INFO - codeparrot_training - Step 43329: {'lr': 
2.1791366295050964e-05, 'samples': 22184960, 'steps': 43329, 'batch_loss/train': 0.5920111741870642} 12/28/2021 16:04:17 - INFO - codeparrot_training - Step 43330: {'lr': 2.1784930193431528e-05, 'samples': 22185472, 'steps': 43330, 'batch_loss/train': 0.703611153177917} 12/28/2021 16:04:30 - INFO - codeparrot_training - Step 43331: {'lr': 2.1778494999109055e-05, 'samples': 22185984, 'steps': 43331, 'batch_loss/train': 0.7042631795629859} 12/28/2021 16:04:40 - INFO - codeparrot_training - Step 43332: {'lr': 2.1772060712109078e-05, 'samples': 22186496, 'steps': 43332, 'batch_loss/train': 0.7136447392404079} 12/28/2021 16:04:51 - INFO - codeparrot_training - Step 43333: {'lr': 2.1765627332457238e-05, 'samples': 22187008, 'steps': 43333, 'batch_loss/train': 0.6971557354554534} 12/28/2021 16:05:05 - INFO - codeparrot_training - Step 43334: {'lr': 2.1759194860179133e-05, 'samples': 22187520, 'steps': 43334, 'batch_loss/train': 0.513481305912137} 12/28/2021 16:05:15 - INFO - codeparrot_training - Step 43335: {'lr': 2.1752763295300183e-05, 'samples': 22188032, 'steps': 43335, 'batch_loss/train': 0.7349740946665406} 12/28/2021 16:05:26 - INFO - codeparrot_training - Step 43336: {'lr': 2.1746332637846116e-05, 'samples': 22188544, 'steps': 43336, 'batch_loss/train': 0.6128545513493009} 12/28/2021 16:05:38 - INFO - codeparrot_training - Step 43337: {'lr': 2.1739902887842473e-05, 'samples': 22189056, 'steps': 43337, 'batch_loss/train': 0.7628027908504009} 12/28/2021 16:05:49 - INFO - codeparrot_training - Step 43338: {'lr': 2.1733474045314782e-05, 'samples': 22189568, 'steps': 43338, 'batch_loss/train': 0.6292988726636395} 12/28/2021 16:06:00 - INFO - codeparrot_training - Step 43339: {'lr': 2.1727046110288528e-05, 'samples': 22190080, 'steps': 43339, 'batch_loss/train': 0.6111253134440631} 12/28/2021 16:06:10 - INFO - codeparrot_training - Step 43340: {'lr': 2.172061908278944e-05, 'samples': 22190592, 'steps': 43340, 'batch_loss/train': 0.6780309299938381} 12/28/2021 16:06:24 
- INFO - codeparrot_training - Step 43341: {'lr': 2.171419296284294e-05, 'samples': 22191104, 'steps': 43341, 'batch_loss/train': 0.5961466294247657} 12/28/2021 16:06:35 - INFO - codeparrot_training - Step 43342: {'lr': 2.1707767750474615e-05, 'samples': 22191616, 'steps': 43342, 'batch_loss/train': 0.7909635016694665} 12/28/2021 16:06:45 - INFO - codeparrot_training - Step 43343: {'lr': 2.170134344571001e-05, 'samples': 22192128, 'steps': 43343, 'batch_loss/train': 0.7378318924456835} 12/28/2021 16:06:58 - INFO - codeparrot_training - Step 43344: {'lr': 2.1694920048574652e-05, 'samples': 22192640, 'steps': 43344, 'batch_loss/train': 0.7941763941198587} 12/28/2021 16:07:08 - INFO - codeparrot_training - Step 43345: {'lr': 2.1688497559094078e-05, 'samples': 22193152, 'steps': 43345, 'batch_loss/train': 0.7319325960706919} 12/28/2021 16:07:19 - INFO - codeparrot_training - Step 43346: {'lr': 2.1682075977293826e-05, 'samples': 22193664, 'steps': 43346, 'batch_loss/train': 0.6081641822238453} 12/28/2021 16:07:31 - INFO - codeparrot_training - Step 43347: {'lr': 2.1675655303199455e-05, 'samples': 22194176, 'steps': 43347, 'batch_loss/train': 0.654228859115392} 12/28/2021 16:07:42 - INFO - codeparrot_training - Step 43348: {'lr': 2.1669235536836446e-05, 'samples': 22194688, 'steps': 43348, 'batch_loss/train': 0.7644551806151867} 12/28/2021 16:07:52 - INFO - codeparrot_training - Step 43349: {'lr': 2.1662816678230417e-05, 'samples': 22195200, 'steps': 43349, 'batch_loss/train': 0.716303200693801} 12/28/2021 16:08:03 - INFO - codeparrot_training - Step 43350: {'lr': 2.1656398727406683e-05, 'samples': 22195712, 'steps': 43350, 'batch_loss/train': 0.5955838458612561} 12/28/2021 16:08:17 - INFO - codeparrot_training - Step 43351: {'lr': 2.1649981684390972e-05, 'samples': 22196224, 'steps': 43351, 'batch_loss/train': 0.7300532730296254} 12/28/2021 16:08:28 - INFO - codeparrot_training - Step 43352: {'lr': 2.1643565549208738e-05, 'samples': 22196736, 'steps': 43352, 
'batch_loss/train': 0.686910844524391} 12/28/2021 16:08:39 - INFO - codeparrot_training - Step 43353: {'lr': 2.16371503218854e-05, 'samples': 22197248, 'steps': 43353, 'batch_loss/train': 0.6483570062555373} 12/28/2021 16:08:51 - INFO - codeparrot_training - Step 43354: {'lr': 2.1630736002446554e-05, 'samples': 22197760, 'steps': 43354, 'batch_loss/train': 0.5543011367553845} 12/28/2021 16:09:01 - INFO - codeparrot_training - Step 43355: {'lr': 2.162432259091765e-05, 'samples': 22198272, 'steps': 43355, 'batch_loss/train': 0.6953355425503105} 12/28/2021 16:09:12 - INFO - codeparrot_training - Step 43356: {'lr': 2.161791008732422e-05, 'samples': 22198784, 'steps': 43356, 'batch_loss/train': 0.6808953499421477} 12/28/2021 16:09:26 - INFO - codeparrot_training - Step 43357: {'lr': 2.161149849169175e-05, 'samples': 22199296, 'steps': 43357, 'batch_loss/train': 0.7150770118460059} 12/28/2021 16:09:37 - INFO - codeparrot_training - Step 43358: {'lr': 2.1605087804045715e-05, 'samples': 22199808, 'steps': 43358, 'batch_loss/train': 1.2556912135332823} 12/28/2021 16:09:47 - INFO - codeparrot_training - Step 43359: {'lr': 2.1598678024411622e-05, 'samples': 22200320, 'steps': 43359, 'batch_loss/train': 0.7253453461453319} 12/28/2021 16:09:59 - INFO - codeparrot_training - Step 43360: {'lr': 2.1592269152814954e-05, 'samples': 22200832, 'steps': 43360, 'batch_loss/train': 0.7976950593292713} 12/28/2021 16:10:10 - INFO - codeparrot_training - Step 43361: {'lr': 2.158586118928116e-05, 'samples': 22201344, 'steps': 43361, 'batch_loss/train': 0.7528185211122036} 12/28/2021 16:10:21 - INFO - codeparrot_training - Step 43362: {'lr': 2.1579454133835746e-05, 'samples': 22201856, 'steps': 43362, 'batch_loss/train': 0.6747239786200225} 12/28/2021 16:10:31 - INFO - codeparrot_training - Step 43363: {'lr': 2.1573047986504224e-05, 'samples': 22202368, 'steps': 43363, 'batch_loss/train': 0.7684647273272276} 12/28/2021 16:10:43 - INFO - codeparrot_training - Step 43364: {'lr': 
2.156664274731196e-05, 'samples': 22202880, 'steps': 43364, 'batch_loss/train': 0.7817162503488362} 12/28/2021 16:10:54 - INFO - codeparrot_training - Step 43365: {'lr': 2.1560238416284433e-05, 'samples': 22203392, 'steps': 43365, 'batch_loss/train': 0.6831558621488512} 12/28/2021 16:11:05 - INFO - codeparrot_training - Step 43366: {'lr': 2.1553834993447207e-05, 'samples': 22203904, 'steps': 43366, 'batch_loss/train': 0.6603614282794297} 12/28/2021 16:11:17 - INFO - codeparrot_training - Step 43367: {'lr': 2.1547432478825652e-05, 'samples': 22204416, 'steps': 43367, 'batch_loss/train': 0.7529225572943687} 12/28/2021 16:11:28 - INFO - codeparrot_training - Step 43368: {'lr': 2.1541030872445245e-05, 'samples': 22204928, 'steps': 43368, 'batch_loss/train': 0.7688530217856169} 12/28/2021 16:11:38 - INFO - codeparrot_training - Step 43369: {'lr': 2.1534630174331436e-05, 'samples': 22205440, 'steps': 43369, 'batch_loss/train': 0.6634829529793933} 12/28/2021 16:11:52 - INFO - codeparrot_training - Step 43370: {'lr': 2.1528230384509685e-05, 'samples': 22205952, 'steps': 43370, 'batch_loss/train': 0.7654236999806017} 12/28/2021 16:12:03 - INFO - codeparrot_training - Step 43371: {'lr': 2.1521831503005406e-05, 'samples': 22206464, 'steps': 43371, 'batch_loss/train': 0.709577449131757} 12/28/2021 16:12:14 - INFO - codeparrot_training - Step 43372: {'lr': 2.151543352984406e-05, 'samples': 22206976, 'steps': 43372, 'batch_loss/train': 0.4964699135744013} 12/28/2021 16:12:25 - INFO - codeparrot_training - Step 43373: {'lr': 2.150903646505106e-05, 'samples': 22207488, 'steps': 43373, 'batch_loss/train': 0.7944808537140489} 12/28/2021 16:12:36 - INFO - codeparrot_training - Step 43374: {'lr': 2.1502640308651873e-05, 'samples': 22208000, 'steps': 43374, 'batch_loss/train': 0.7038555629551411} 12/28/2021 16:12:47 - INFO - codeparrot_training - Step 43375: {'lr': 2.1496245060671966e-05, 'samples': 22208512, 'steps': 43375, 'batch_loss/train': 0.5701923804590479} 12/28/2021 16:12:57 - 
INFO - codeparrot_training - Step 43376: {'lr': 2.14898507211366e-05, 'samples': 22209024, 'steps': 43376, 'batch_loss/train': 0.6784877830650657} 12/28/2021 16:13:10 - INFO - codeparrot_training - Step 43377: {'lr': 2.148345729007134e-05, 'samples': 22209536, 'steps': 43377, 'batch_loss/train': 0.6710027474910021} 12/28/2021 16:13:20 - INFO - codeparrot_training - Step 43378: {'lr': 2.1477064767501635e-05, 'samples': 22210048, 'steps': 43378, 'batch_loss/train': 0.7444843128323555} 12/28/2021 16:13:31 - INFO - codeparrot_training - Step 43379: {'lr': 2.1470673153452802e-05, 'samples': 22210560, 'steps': 43379, 'batch_loss/train': 1.0762330265715718} 12/28/2021 16:13:45 - INFO - codeparrot_training - Step 43380: {'lr': 2.1464282447950262e-05, 'samples': 22211072, 'steps': 43380, 'batch_loss/train': 0.6053799046203494} 12/28/2021 16:13:55 - INFO - codeparrot_training - Step 43381: {'lr': 2.1457892651019438e-05, 'samples': 22211584, 'steps': 43381, 'batch_loss/train': 0.7022740305401385} 12/28/2021 16:14:06 - INFO - codeparrot_training - Step 43382: {'lr': 2.145150376268576e-05, 'samples': 22212096, 'steps': 43382, 'batch_loss/train': 0.7525805216282606} 12/28/2021 16:14:18 - INFO - codeparrot_training - Step 43383: {'lr': 2.144511578297459e-05, 'samples': 22212608, 'steps': 43383, 'batch_loss/train': 0.6723636509850621} 12/28/2021 16:14:29 - INFO - codeparrot_training - Step 43384: {'lr': 2.1438728711911353e-05, 'samples': 22213120, 'steps': 43384, 'batch_loss/train': 0.6753179635852575} 12/28/2021 16:14:39 - INFO - codeparrot_training - Step 43385: {'lr': 2.1432342549521423e-05, 'samples': 22213632, 'steps': 43385, 'batch_loss/train': 0.7286082343198359} 12/28/2021 16:14:50 - INFO - codeparrot_training - Step 43386: {'lr': 2.142595729583019e-05, 'samples': 22214144, 'steps': 43386, 'batch_loss/train': 0.6325778475729749} 12/28/2021 16:15:05 - INFO - codeparrot_training - Step 43387: {'lr': 2.1419572950863053e-05, 'samples': 22214656, 'steps': 43387, 
'batch_loss/train': 0.6217829412780702} 12/28/2021 16:15:15 - INFO - codeparrot_training - Step 43388: {'lr': 2.141318951464538e-05, 'samples': 22215168, 'steps': 43388, 'batch_loss/train': 0.6978725459775887} 12/28/2021 16:15:26 - INFO - codeparrot_training - Step 43389: {'lr': 2.1406806987202598e-05, 'samples': 22215680, 'steps': 43389, 'batch_loss/train': 0.7457254314795136} 12/28/2021 16:15:38 - INFO - codeparrot_training - Step 43390: {'lr': 2.1400425368559988e-05, 'samples': 22216192, 'steps': 43390, 'batch_loss/train': 0.6560113601153716} 12/28/2021 16:15:49 - INFO - codeparrot_training - Step 43391: {'lr': 2.1394044658742923e-05, 'samples': 22216704, 'steps': 43391, 'batch_loss/train': 0.6952397571876645} 12/28/2021 16:15:59 - INFO - codeparrot_training - Step 43392: {'lr': 2.1387664857776933e-05, 'samples': 22217216, 'steps': 43392, 'batch_loss/train': 0.6785569492494687} 12/28/2021 16:16:11 - INFO - codeparrot_training - Step 43393: {'lr': 2.1381285965687197e-05, 'samples': 22217728, 'steps': 43393, 'batch_loss/train': 0.7214775255415589} 12/28/2021 16:16:22 - INFO - codeparrot_training - Step 43394: {'lr': 2.1374907982499164e-05, 'samples': 22218240, 'steps': 43394, 'batch_loss/train': 0.6911683293874376} 12/28/2021 16:16:32 - INFO - codeparrot_training - Step 43395: {'lr': 2.1368530908238148e-05, 'samples': 22218752, 'steps': 43395, 'batch_loss/train': 0.6800731134135276} 12/28/2021 16:16:43 - INFO - codeparrot_training - Step 43396: {'lr': 2.1362154742929546e-05, 'samples': 22219264, 'steps': 43396, 'batch_loss/train': 0.7610093159601092} 12/28/2021 16:16:57 - INFO - codeparrot_training - Step 43397: {'lr': 2.135577948659867e-05, 'samples': 22219776, 'steps': 43397, 'batch_loss/train': 0.7300082375877537} 12/28/2021 16:17:08 - INFO - codeparrot_training - Step 43398: {'lr': 2.1349405139270915e-05, 'samples': 22220288, 'steps': 43398, 'batch_loss/train': 0.588575245346874} 12/28/2021 16:17:18 - INFO - codeparrot_training - Step 43399: {'lr': 
2.1343031700971545e-05, 'samples': 22220800, 'steps': 43399, 'batch_loss/train': 0.7188713122159243} 12/28/2021 16:17:30 - INFO - codeparrot_training - Step 43400: {'lr': 2.1336659171725975e-05, 'samples': 22221312, 'steps': 43400, 'batch_loss/train': 0.7263796823099256} 12/28/2021 16:17:41 - INFO - codeparrot_training - Step 43401: {'lr': 2.1330287551559553e-05, 'samples': 22221824, 'steps': 43401, 'batch_loss/train': 0.8328963350504637} 12/28/2021 16:17:52 - INFO - codeparrot_training - Step 43402: {'lr': 2.132391684049745e-05, 'samples': 22222336, 'steps': 43402, 'batch_loss/train': 0.6649667671881616} 12/28/2021 16:18:04 - INFO - codeparrot_training - Step 43403: {'lr': 2.1317547038565172e-05, 'samples': 22222848, 'steps': 43403, 'batch_loss/train': 1.4663712866604328} 12/28/2021 16:18:14 - INFO - codeparrot_training - Step 43404: {'lr': 2.1311178145788035e-05, 'samples': 22223360, 'steps': 43404, 'batch_loss/train': 0.6948032539803535} 12/28/2021 16:18:25 - INFO - codeparrot_training - Step 43405: {'lr': 2.130481016219124e-05, 'samples': 22223872, 'steps': 43405, 'batch_loss/train': 0.7099218117073178} 12/28/2021 16:18:36 - INFO - codeparrot_training - Step 43406: {'lr': 2.129844308780013e-05, 'samples': 22224384, 'steps': 43406, 'batch_loss/train': 0.5963840489275753} 12/28/2021 16:18:48 - INFO - codeparrot_training - Step 43407: {'lr': 2.129207692264015e-05, 'samples': 22224896, 'steps': 43407, 'batch_loss/train': 0.7885635071434081} 12/28/2021 16:18:59 - INFO - codeparrot_training - Step 43408: {'lr': 2.1285711666736457e-05, 'samples': 22225408, 'steps': 43408, 'batch_loss/train': 0.508132541552186} 12/28/2021 16:19:09 - INFO - codeparrot_training - Step 43409: {'lr': 2.127934732011444e-05, 'samples': 22225920, 'steps': 43409, 'batch_loss/train': 0.7218504447955638} 12/28/2021 16:19:23 - INFO - codeparrot_training - Step 43410: {'lr': 2.127298388279936e-05, 'samples': 22226432, 'steps': 43410, 'batch_loss/train': 0.8126259529963136} 12/28/2021 16:19:33 - 
INFO - codeparrot_training - Step 43411: {'lr': 2.1266621354816523e-05, 'samples': 22226944, 'steps': 43411, 'batch_loss/train': 0.7171829538419843} 12/28/2021 16:19:44 - INFO - codeparrot_training - Step 43412: {'lr': 2.126025973619125e-05, 'samples': 22227456, 'steps': 43412, 'batch_loss/train': 0.6560357543639839} 12/28/2021 16:19:56 - INFO - codeparrot_training - Step 43413: {'lr': 2.1253899026948798e-05, 'samples': 22227968, 'steps': 43413, 'batch_loss/train': 0.9372843839228153} 12/28/2021 16:20:07 - INFO - codeparrot_training - Step 43414: {'lr': 2.1247539227114476e-05, 'samples': 22228480, 'steps': 43414, 'batch_loss/train': 2.1874730209819973} 12/28/2021 16:20:18 - INFO - codeparrot_training - Step 43415: {'lr': 2.124118033671363e-05, 'samples': 22228992, 'steps': 43415, 'batch_loss/train': 0.7250225730240345} 12/28/2021 16:20:28 - INFO - codeparrot_training - Step 43416: {'lr': 2.1234822355771428e-05, 'samples': 22229504, 'steps': 43416, 'batch_loss/train': 0.6952542496728711} 12/28/2021 16:20:40 - INFO - codeparrot_training - Step 43417: {'lr': 2.122846528431313e-05, 'samples': 22230016, 'steps': 43417, 'batch_loss/train': 0.588229913613759} 12/28/2021 16:20:51 - INFO - codeparrot_training - Step 43418: {'lr': 2.1222109122364164e-05, 'samples': 22230528, 'steps': 43418, 'batch_loss/train': 0.721971657127142} 12/28/2021 16:21:01 - INFO - codeparrot_training - Step 43419: {'lr': 2.1215753869949674e-05, 'samples': 22231040, 'steps': 43419, 'batch_loss/train': 0.6867283806204796} 12/28/2021 16:21:15 - INFO - codeparrot_training - Step 43420: {'lr': 2.1209399527094913e-05, 'samples': 22231552, 'steps': 43420, 'batch_loss/train': 0.6319937778171152} 12/28/2021 16:21:26 - INFO - codeparrot_training - Step 43421: {'lr': 2.1203046093825283e-05, 'samples': 22232064, 'steps': 43421, 'batch_loss/train': 1.133230926003307} 12/28/2021 16:21:37 - INFO - codeparrot_training - Step 43422: {'lr': 2.1196693570165904e-05, 'samples': 22232576, 'steps': 43422, 
'batch_loss/train': 0.6592248587403446} 12/28/2021 16:21:49 - INFO - codeparrot_training - Step 43423: {'lr': 2.1190341956142085e-05, 'samples': 22233088, 'steps': 43423, 'batch_loss/train': 0.737396152690053} 12/28/2021 16:22:00 - INFO - codeparrot_training - Step 43424: {'lr': 2.1183991251779054e-05, 'samples': 22233600, 'steps': 43424, 'batch_loss/train': 1.5167793751461431} 12/28/2021 16:22:10 - INFO - codeparrot_training - Step 43425: {'lr': 2.1177641457102103e-05, 'samples': 22234112, 'steps': 43425, 'batch_loss/train': 0.6802961529465392} 12/28/2021 16:22:21 - INFO - codeparrot_training - Step 43426: {'lr': 2.117129257213643e-05, 'samples': 22234624, 'steps': 43426, 'batch_loss/train': 0.6054891520179808} 12/28/2021 16:22:34 - INFO - codeparrot_training - Step 43427: {'lr': 2.116494459690735e-05, 'samples': 22235136, 'steps': 43427, 'batch_loss/train': 0.6790391062386334} 12/28/2021 16:22:45 - INFO - codeparrot_training - Step 43428: {'lr': 2.115859753143995e-05, 'samples': 22235648, 'steps': 43428, 'batch_loss/train': 0.5557859149994329} 12/28/2021 16:22:56 - INFO - codeparrot_training - Step 43429: {'lr': 2.11522513757596e-05, 'samples': 22236160, 'steps': 43429, 'batch_loss/train': 0.6425271374173462} 12/28/2021 16:23:08 - INFO - codeparrot_training - Step 43430: {'lr': 2.1145906129891533e-05, 'samples': 22236672, 'steps': 43430, 'batch_loss/train': 0.7882128416094929} 12/28/2021 16:23:18 - INFO - codeparrot_training - Step 43431: {'lr': 2.1139561793860894e-05, 'samples': 22237184, 'steps': 43431, 'batch_loss/train': 0.7511121844872832} 12/28/2021 16:23:29 - INFO - codeparrot_training - Step 43432: {'lr': 2.1133218367692885e-05, 'samples': 22237696, 'steps': 43432, 'batch_loss/train': 0.7036580805433914} 12/28/2021 16:23:41 - INFO - codeparrot_training - Step 43433: {'lr': 2.1126875851412874e-05, 'samples': 22238208, 'steps': 43433, 'batch_loss/train': 0.7257085784804076} 12/28/2021 16:23:52 - INFO - codeparrot_training - Step 43434: {'lr': 
2.1120534245045952e-05, 'samples': 22238720, 'steps': 43434, 'batch_loss/train': 0.7821944109164178} 12/28/2021 16:24:02 - INFO - codeparrot_training - Step 43435: {'lr': 2.1114193548617378e-05, 'samples': 22239232, 'steps': 43435, 'batch_loss/train': 0.6721549862995744} 12/28/2021 16:24:13 - INFO - codeparrot_training - Step 43436: {'lr': 2.1107853762152324e-05, 'samples': 22239744, 'steps': 43436, 'batch_loss/train': 0.7597637018188834} 12/28/2021 16:24:27 - INFO - codeparrot_training - Step 43437: {'lr': 2.110151488567602e-05, 'samples': 22240256, 'steps': 43437, 'batch_loss/train': 0.6532205655239522} 12/28/2021 16:24:38 - INFO - codeparrot_training - Step 43438: {'lr': 2.109517691921367e-05, 'samples': 22240768, 'steps': 43438, 'batch_loss/train': 0.726195146329701} 12/28/2021 16:24:48 - INFO - codeparrot_training - Step 43439: {'lr': 2.1088839862790478e-05, 'samples': 22241280, 'steps': 43439, 'batch_loss/train': 0.7164421593770385} 12/28/2021 16:25:01 - INFO - codeparrot_training - Step 43440: {'lr': 2.1082503716431612e-05, 'samples': 22241792, 'steps': 43440, 'batch_loss/train': 0.6635148623026907} 12/28/2021 16:25:11 - INFO - codeparrot_training - Step 43441: {'lr': 2.107616848016225e-05, 'samples': 22242304, 'steps': 43441, 'batch_loss/train': 0.7293125661090016} 12/28/2021 16:25:22 - INFO - codeparrot_training - Step 43442: {'lr': 2.106983415400768e-05, 'samples': 22242816, 'steps': 43442, 'batch_loss/train': 0.514497296884656} 12/28/2021 16:25:34 - INFO - codeparrot_training - Step 43443: {'lr': 2.10635007379929e-05, 'samples': 22243328, 'steps': 43443, 'batch_loss/train': 0.5486244403291494} 12/28/2021 16:25:45 - INFO - codeparrot_training - Step 43444: {'lr': 2.105716823214332e-05, 'samples': 22243840, 'steps': 43444, 'batch_loss/train': 0.8111923169344664} 12/28/2021 16:25:56 - INFO - codeparrot_training - Step 43445: {'lr': 2.105083663648391e-05, 'samples': 22244352, 'steps': 43445, 'batch_loss/train': 0.5786006919224747} 12/28/2021 16:26:06 - INFO 
- codeparrot_training - Step 43446: {'lr': 2.1044505951039905e-05, 'samples': 22244864, 'steps': 43446, 'batch_loss/train': 0.6420165575109422} 12/28/2021 16:26:18 - INFO - codeparrot_training - Step 43447: {'lr': 2.1038176175836588e-05, 'samples': 22245376, 'steps': 43447, 'batch_loss/train': 0.6816074838861823} 12/28/2021 16:26:29 - INFO - codeparrot_training - Step 43448: {'lr': 2.1031847310898967e-05, 'samples': 22245888, 'steps': 43448, 'batch_loss/train': 0.6573711177334189} 12/28/2021 16:26:40 - INFO - codeparrot_training - Step 43449: {'lr': 2.1025519356252272e-05, 'samples': 22246400, 'steps': 43449, 'batch_loss/train': 0.6489821881987154} 12/28/2021 16:26:54 - INFO - codeparrot_training - Step 43450: {'lr': 2.1019192311921646e-05, 'samples': 22246912, 'steps': 43450, 'batch_loss/train': 0.6678747157566249} 12/28/2021 16:27:04 - INFO - codeparrot_training - Step 43451: {'lr': 2.101286617793227e-05, 'samples': 22247424, 'steps': 43451, 'batch_loss/train': 0.5657179770059884} 12/28/2021 16:27:15 - INFO - codeparrot_training - Step 43452: {'lr': 2.1006540954309262e-05, 'samples': 22247936, 'steps': 43452, 'batch_loss/train': 0.7658347934484482} 12/28/2021 16:27:27 - INFO - codeparrot_training - Step 43453: {'lr': 2.1000216641077845e-05, 'samples': 22248448, 'steps': 43453, 'batch_loss/train': 0.6401138819055632} 12/28/2021 16:27:38 - INFO - codeparrot_training - Step 43454: {'lr': 2.0993893238263006e-05, 'samples': 22248960, 'steps': 43454, 'batch_loss/train': 0.7531860615126789} 12/28/2021 16:27:48 - INFO - codeparrot_training - Step 43455: {'lr': 2.098757074589e-05, 'samples': 22249472, 'steps': 43455, 'batch_loss/train': 0.7460372122004628} 12/28/2021 16:27:59 - INFO - codeparrot_training - Step 43456: {'lr': 2.0981249163984002e-05, 'samples': 22249984, 'steps': 43456, 'batch_loss/train': 0.7125974837690592} 12/28/2021 16:28:13 - INFO - codeparrot_training - Step 43457: {'lr': 2.097492849256999e-05, 'samples': 22250496, 'steps': 43457, 'batch_loss/train': 
0.8054621149785817} 12/28/2021 16:28:23 - INFO - codeparrot_training - Step 43458: {'lr': 2.096860873167325e-05, 'samples': 22251008, 'steps': 43458, 'batch_loss/train': 0.6671698214486241} 12/28/2021 16:28:34 - INFO - codeparrot_training - Step 43459: {'lr': 2.0962289881318873e-05, 'samples': 22251520, 'steps': 43459, 'batch_loss/train': 0.7074168142862618} 12/28/2021 16:28:46 - INFO - codeparrot_training - Step 43460: {'lr': 2.0955971941531898e-05, 'samples': 22252032, 'steps': 43460, 'batch_loss/train': 0.7010107680689543} 12/28/2021 16:28:57 - INFO - codeparrot_training - Step 43461: {'lr': 2.0949654912337495e-05, 'samples': 22252544, 'steps': 43461, 'batch_loss/train': 0.7615444427356124} 12/28/2021 16:29:07 - INFO - codeparrot_training - Step 43462: {'lr': 2.0943338793760786e-05, 'samples': 22253056, 'steps': 43462, 'batch_loss/train': 0.539988616714254} 12/28/2021 16:29:20 - INFO - codeparrot_training - Step 43463: {'lr': 2.0937023585826857e-05, 'samples': 22253568, 'steps': 43463, 'batch_loss/train': 0.763832657597959} 12/28/2021 16:29:30 - INFO - codeparrot_training - Step 43464: {'lr': 2.0930709288560863e-05, 'samples': 22254080, 'steps': 43464, 'batch_loss/train': 0.7909394484013319} 12/28/2021 16:29:41 - INFO - codeparrot_training - Step 43465: {'lr': 2.0924395901987857e-05, 'samples': 22254592, 'steps': 43465, 'batch_loss/train': 0.7186170481145382} 12/28/2021 16:29:51 - INFO - codeparrot_training - Step 43466: {'lr': 2.0918083426132968e-05, 'samples': 22255104, 'steps': 43466, 'batch_loss/train': 0.7552293865010142} 12/28/2021 16:30:05 - INFO - codeparrot_training - Step 43467: {'lr': 2.0911771861021252e-05, 'samples': 22255616, 'steps': 43467, 'batch_loss/train': 0.5061210144194774} 12/28/2021 16:30:16 - INFO - codeparrot_training - Step 43468: {'lr': 2.090546120667791e-05, 'samples': 22256128, 'steps': 43468, 'batch_loss/train': 0.6864535566419363} 12/28/2021 16:30:27 - INFO - codeparrot_training - Step 43469: {'lr': 2.0899151463127848e-05, 
'samples': 22256640, 'steps': 43469, 'batch_loss/train': 0.5870960247702897}
12/28/2021 16:30:39 - INFO - codeparrot_training - Step 43470: {'lr': 2.0892842630396342e-05, 'samples': 22257152, 'steps': 43470, 'batch_loss/train': 0.5378758502483834}
12/28/2021 16:30:49 - INFO - codeparrot_training - Step 43471: {'lr': 2.088653470850832e-05, 'samples': 22257664, 'steps': 43471, 'batch_loss/train': 0.6701990666333586}
12/28/2021 16:31:00 - INFO - codeparrot_training - Step 43472: {'lr': 2.0880227697488902e-05, 'samples': 22258176, 'steps': 43472, 'batch_loss/train': 0.6252141566365026}
12/28/2021 16:31:12 - INFO - codeparrot_training - Step 43473: {'lr': 2.087392159736329e-05, 'samples': 22258688, 'steps': 43473, 'batch_loss/train': 0.8176855519413948}
12/28/2021 16:31:23 - INFO - codeparrot_training - Step 43474: {'lr': 2.0867616408156402e-05, 'samples': 22259200, 'steps': 43474, 'batch_loss/train': 0.690254651941359}
12/28/2021 16:31:33 - INFO - codeparrot_training - Step 43475: {'lr': 2.086131212989334e-05, 'samples': 22259712, 'steps': 43475, 'batch_loss/train': 0.7532916510244831}
12/28/2021 16:31:47 - INFO - codeparrot_training - Step 43476: {'lr': 2.0855008762599186e-05, 'samples': 22260224, 'steps': 43476, 'batch_loss/train': 0.6940743746235967}
12/28/2021 16:31:58 - INFO - codeparrot_training - Step 43477: {'lr': 2.084870630629901e-05, 'samples': 22260736, 'steps': 43477, 'batch_loss/train': 0.7103328735101968}
12/28/2021 16:32:09 - INFO - codeparrot_training - Step 43478: {'lr': 2.0842404761017842e-05, 'samples': 22261248, 'steps': 43478, 'batch_loss/train': 0.7681093802675605}
12/28/2021 16:32:19 - INFO - codeparrot_training - Step 43479: {'lr': 2.0836104126780774e-05, 'samples': 22261760, 'steps': 43479, 'batch_loss/train': 0.7672809828072786}
12/28/2021 16:32:31 - INFO - codeparrot_training - Step 43480: {'lr': 2.082980440361276e-05, 'samples': 22262272, 'steps': 43480, 'batch_loss/train': 0.6652844045311213}
12/28/2021 16:32:42 - INFO - codeparrot_training - Step 43481: {'lr': 2.082350559153895e-05, 'samples': 22262784, 'steps': 43481, 'batch_loss/train': 0.7315475312061608}
12/28/2021 16:32:53 - INFO - codeparrot_training - Step 43482: {'lr': 2.08172076905844e-05, 'samples': 22263296, 'steps': 43482, 'batch_loss/train': 0.6115643093362451}
12/28/2021 16:33:05 - INFO - codeparrot_training - Step 43483: {'lr': 2.0810910700773984e-05, 'samples': 22263808, 'steps': 43483, 'batch_loss/train': 0.7365649726707488}
12/28/2021 16:33:15 - INFO - codeparrot_training - Step 43484: {'lr': 2.0804614622132902e-05, 'samples': 22264320, 'steps': 43484, 'batch_loss/train': 0.732046214689035}
12/28/2021 16:33:26 - INFO - codeparrot_training - Step 43485: {'lr': 2.0798319454686187e-05, 'samples': 22264832, 'steps': 43485, 'batch_loss/train': 0.6600302630104125}
12/28/2021 16:33:40 - INFO - codeparrot_training - Step 43486: {'lr': 2.0792025198458768e-05, 'samples': 22265344, 'steps': 43486, 'batch_loss/train': 0.7018168987706304}
12/28/2021 16:33:50 - INFO - codeparrot_training - Step 43487: {'lr': 2.0785731853475654e-05, 'samples': 22265856, 'steps': 43487, 'batch_loss/train': 0.6687226218637079}
12/28/2021 16:34:01 - INFO - codeparrot_training - Step 43488: {'lr': 2.0779439419762013e-05, 'samples': 22266368, 'steps': 43488, 'batch_loss/train': 0.7574154356261715}
12/28/2021 16:34:12 - INFO - codeparrot_training - Step 43489: {'lr': 2.0773147897342748e-05, 'samples': 22266880, 'steps': 43489, 'batch_loss/train': 0.6736445464193821}
12/28/2021 16:34:24 - INFO - codeparrot_training - Step 43490: {'lr': 2.0766857286242863e-05, 'samples': 22267392, 'steps': 43490, 'batch_loss/train': 0.7484052325598896}
12/28/2021 16:34:34 - INFO - codeparrot_training - Step 43491: {'lr': 2.076056758648742e-05, 'samples': 22267904, 'steps': 43491, 'batch_loss/train': 0.6988121203612536}
12/28/2021 16:34:45 - INFO - codeparrot_training - Step 43492: {'lr': 2.0754278798101404e-05, 'samples': 22268416, 'steps': 43492, 'batch_loss/train': 0.6214893334545195}
12/28/2021 16:34:57 - INFO - codeparrot_training - Step 43493: {'lr': 2.0747990921109795e-05, 'samples': 22268928, 'steps': 43493, 'batch_loss/train': 0.6829169760458171}
12/28/2021 16:35:08 - INFO - codeparrot_training - Step 43494: {'lr': 2.074170395553762e-05, 'samples': 22269440, 'steps': 43494, 'batch_loss/train': 0.6481304063927382}
12/28/2021 16:35:18 - INFO - codeparrot_training - Step 43495: {'lr': 2.073541790140987e-05, 'samples': 22269952, 'steps': 43495, 'batch_loss/train': 0.7129271635785699}
12/28/2021 16:35:33 - INFO - codeparrot_training - Step 43496: {'lr': 2.0729132758751547e-05, 'samples': 22270464, 'steps': 43496, 'batch_loss/train': 0.6788507187739015}
12/28/2021 16:35:44 - INFO - codeparrot_training - Step 43497: {'lr': 2.0722848527587657e-05, 'samples': 22270976, 'steps': 43497, 'batch_loss/train': 0.7145811431109905}
12/28/2021 16:35:54 - INFO - codeparrot_training - Step 43498: {'lr': 2.0716565207943043e-05, 'samples': 22271488, 'steps': 43498, 'batch_loss/train': 0.7570495740510523}
12/28/2021 16:36:06 - INFO - codeparrot_training - Step 43499: {'lr': 2.0710282799842883e-05, 'samples': 22272000, 'steps': 43499, 'batch_loss/train': 0.7523221110459417}
12/28/2021 16:36:17 - INFO - codeparrot_training - Step 43500: {'lr': 2.0704001303312042e-05, 'samples': 22272512, 'steps': 43500, 'batch_loss/train': 0.740969218313694}
12/28/2021 16:36:28 - INFO - codeparrot_training - Step 43501: {'lr': 2.06977207183755e-05, 'samples': 22273024, 'steps': 43501, 'batch_loss/train': 0.6990628009662032}
12/28/2021 16:36:38 - INFO - codeparrot_training - Step 43502: {'lr': 2.069144104505824e-05, 'samples': 22273536, 'steps': 43502, 'batch_loss/train': 0.7932597873732448}
12/28/2021 16:36:52 - INFO - codeparrot_training - Step 43503: {'lr': 2.0685162283385212e-05, 'samples': 22274048, 'steps': 43503, 'batch_loss/train': 0.7358053200878203}
12/28/2021 16:37:03 - INFO - codeparrot_training - Step 43504: {'lr': 2.067888443338142e-05, 'samples': 22274560, 'steps': 43504, 'batch_loss/train': 0.751731077209115}
12/28/2021 16:37:13 - INFO - codeparrot_training - Step 43505: {'lr': 2.0672607495071827e-05, 'samples': 22275072, 'steps': 43505, 'batch_loss/train': 0.7234721016138792}
12/28/2021 16:37:26 - INFO - codeparrot_training - Step 43506: {'lr': 2.0666331468481266e-05, 'samples': 22275584, 'steps': 43506, 'batch_loss/train': 0.756729599321261}
12/28/2021 16:37:36 - INFO - codeparrot_training - Step 43507: {'lr': 2.0660056353634797e-05, 'samples': 22276096, 'steps': 43507, 'batch_loss/train': 0.732085342053324}
12/28/2021 16:37:47 - INFO - codeparrot_training - Step 43508: {'lr': 2.0653782150557436e-05, 'samples': 22276608, 'steps': 43508, 'batch_loss/train': 0.533580829272978}
12/28/2021 16:37:59 - INFO - codeparrot_training - Step 43509: {'lr': 2.064750885927394e-05, 'samples': 22277120, 'steps': 43509, 'batch_loss/train': 0.6073612325126305}
12/28/2021 16:38:10 - INFO - codeparrot_training - Step 43510: {'lr': 2.0641236479809365e-05, 'samples': 22277632, 'steps': 43510, 'batch_loss/train': 0.7703288120683283}
12/28/2021 16:38:20 - INFO - codeparrot_training - Step 43511: {'lr': 2.06349650121887e-05, 'samples': 22278144, 'steps': 43511, 'batch_loss/train': 0.7124374387785792}
12/28/2021 16:38:31 - INFO - codeparrot_training - Step 43512: {'lr': 2.0628694456436753e-05, 'samples': 22278656, 'steps': 43512, 'batch_loss/train': 0.6481544259004295}
12/28/2021 16:38:45 - INFO - codeparrot_training - Step 43513: {'lr': 2.0622424812578476e-05, 'samples': 22279168, 'steps': 43513, 'batch_loss/train': 0.6729236876126379}
12/28/2021 16:38:56 - INFO - codeparrot_training - Step 43514: {'lr': 2.061615608063891e-05, 'samples': 22279680, 'steps': 43514, 'batch_loss/train': 0.783031296916306}
12/28/2021 16:39:06 - INFO - codeparrot_training - Step 43515: {'lr': 2.060988826064286e-05, 'samples': 22280192, 'steps': 43515, 'batch_loss/train': 0.6973662149393931}
12/28/2021 16:39:18 - INFO - codeparrot_training - Step 43516: {'lr': 2.0603621352615287e-05, 'samples': 22280704, 'steps': 43516, 'batch_loss/train': 0.695143059361726}
12/28/2021 16:39:29 - INFO - codeparrot_training - Step 43517: {'lr': 2.059735535658108e-05, 'samples': 22281216, 'steps': 43517, 'batch_loss/train': 0.6450837687007152}
12/28/2021 16:39:40 - INFO - codeparrot_training - Step 43518: {'lr': 2.0591090272565173e-05, 'samples': 22281728, 'steps': 43518, 'batch_loss/train': 0.488869501510635}
12/28/2021 16:39:54 - INFO - codeparrot_training - Step 43519: {'lr': 2.058482610059248e-05, 'samples': 22282240, 'steps': 43519, 'batch_loss/train': 0.7458866871893406}
12/28/2021 16:40:04 - INFO - codeparrot_training - Step 43520: {'lr': 2.0578562840687877e-05, 'samples': 22282752, 'steps': 43520, 'batch_loss/train': 0.5247964083682746}
12/28/2021 16:40:15 - INFO - codeparrot_training - Step 43521: {'lr': 2.0572300492876284e-05, 'samples': 22283264, 'steps': 43521, 'batch_loss/train': 0.6638183556497097}
12/28/2021 16:40:25 - INFO - codeparrot_training - Step 43522: {'lr': 2.0566039057182628e-05, 'samples': 22283776, 'steps': 43522, 'batch_loss/train': 0.7227807859890163}
12/28/2021 16:40:38 - INFO - codeparrot_training - Step 43523: {'lr': 2.055977853363178e-05, 'samples': 22284288, 'steps': 43523, 'batch_loss/train': 0.68414660776034}
12/28/2021 16:40:48 - INFO - codeparrot_training - Step 43524: {'lr': 2.055351892224855e-05, 'samples': 22284800, 'steps': 43524, 'batch_loss/train': 0.7904562894254923}
12/28/2021 16:40:59 - INFO - codeparrot_training - Step 43525: {'lr': 2.0547260223057974e-05, 'samples': 22285312, 'steps': 43525, 'batch_loss/train': 0.7486152248457074}
12/28/2021 16:41:11 - INFO - codeparrot_training - Step 43526: {'lr': 2.0541002436084783e-05, 'samples': 22285824, 'steps': 43526, 'batch_loss/train': 0.7563935816287994}
12/28/2021 16:41:21 - INFO - codeparrot_training - Step 43527: {'lr': 2.0534745561353957e-05, 'samples': 22286336, 'steps': 43527, 'batch_loss/train': 0.7613241728395224}
12/28/2021 16:41:32 - INFO - codeparrot_training - Step 43528: {'lr': 2.0528489598890337e-05, 'samples': 22286848, 'steps': 43528, 'batch_loss/train': 0.7935368400067091}
12/28/2021 16:41:44 - INFO - codeparrot_training - Step 43529: {'lr': 2.0522234548718793e-05, 'samples': 22287360, 'steps': 43529, 'batch_loss/train': 0.7135955947451293}
12/28/2021 16:41:55 - INFO - codeparrot_training - Step 43530: {'lr': 2.0515980410864193e-05, 'samples': 22287872, 'steps': 43530, 'batch_loss/train': 0.7602428835816681}
12/28/2021 16:42:06 - INFO - codeparrot_training - Step 43531: {'lr': 2.0509727185351407e-05, 'samples': 22288384, 'steps': 43531, 'batch_loss/train': 0.6703630267875269}
12/28/2021 16:42:16 - INFO - codeparrot_training - Step 43532: {'lr': 2.0503474872205303e-05, 'samples': 22288896, 'steps': 43532, 'batch_loss/train': 0.645227910252288}
12/28/2021 16:42:30 - INFO - codeparrot_training - Step 43533: {'lr': 2.0497223471450722e-05, 'samples': 22289408, 'steps': 43533, 'batch_loss/train': 0.65820402931422}
12/28/2021 16:42:41 - INFO - codeparrot_training - Step 43534: {'lr': 2.049097298311256e-05, 'samples': 22289920, 'steps': 43534, 'batch_loss/train': 0.7273731585592031}
12/28/2021 16:42:51 - INFO - codeparrot_training - Step 43535: {'lr': 2.048472340721555e-05, 'samples': 22290432, 'steps': 43535, 'batch_loss/train': 0.7682917267084122}
12/28/2021 16:43:03 - INFO - codeparrot_training - Step 43536: {'lr': 2.047847474378467e-05, 'samples': 22290944, 'steps': 43536, 'batch_loss/train': 0.7456387523561716}
12/28/2021 16:43:14 - INFO - codeparrot_training - Step 43537: {'lr': 2.0472226992844734e-05, 'samples': 22291456, 'steps': 43537, 'batch_loss/train': 0.7190204812213778}
12/28/2021 16:43:25 - INFO - codeparrot_training - Step 43538: {'lr': 2.0465980154420526e-05, 'samples': 22291968, 'steps': 43538, 'batch_loss/train': 0.7503766758018173}
12/28/2021 16:43:37 - INFO - codeparrot_training - Step 43539: {'lr': 2.0459734228536865e-05, 'samples': 22292480, 'steps': 43539, 'batch_loss/train': 0.7649217755533755}
12/28/2021 16:43:48 - INFO - codeparrot_training - Step 43540: {'lr': 2.0453489215218722e-05, 'samples': 22292992, 'steps': 43540, 'batch_loss/train': 0.8870082765351981}
12/28/2021 16:43:58 - INFO - codeparrot_training - Step 43541: {'lr': 2.044724511449078e-05, 'samples': 22293504, 'steps': 43541, 'batch_loss/train': 0.7559799337759614}
12/28/2021 16:44:09 - INFO - codeparrot_training - Step 43542: {'lr': 2.0441001926377934e-05, 'samples': 22294016, 'steps': 43542, 'batch_loss/train': 1.2415314023382962}
12/28/2021 16:44:27 - INFO - codeparrot_training - Step 43543: {'lr': 2.0434759650904966e-05, 'samples': 22294528, 'steps': 43543, 'batch_loss/train': 1.1227372521534562}
12/28/2021 16:44:37 - INFO - codeparrot_training - Step 43544: {'lr': 2.042851828809672e-05, 'samples': 22295040, 'steps': 43544, 'batch_loss/train': 0.9046505205333233}
12/28/2021 16:44:48 - INFO - codeparrot_training - Step 43545: {'lr': 2.042227783797801e-05, 'samples': 22295552, 'steps': 43545, 'batch_loss/train': 0.6345432108792011}
12/28/2021 16:45:00 - INFO - codeparrot_training - Step 43546: {'lr': 2.0416038300573646e-05, 'samples': 22296064, 'steps': 43546, 'batch_loss/train': 0.738487146794796}
12/28/2021 16:45:11 - INFO - codeparrot_training - Step 43547: {'lr': 2.040979967590839e-05, 'samples': 22296576, 'steps': 43547, 'batch_loss/train': 0.6529700090177357}
12/28/2021 16:45:21 - INFO - codeparrot_training - Step 43548: {'lr': 2.0403561964007112e-05, 'samples': 22297088, 'steps': 43548, 'batch_loss/train': 0.6271840613335371}
12/28/2021 16:45:32 - INFO - codeparrot_training - Step 43549: {'lr': 2.039732516489462e-05, 'samples': 22297600, 'steps': 43549, 'batch_loss/train': 0.7584124002605677}
12/28/2021 16:45:44 - INFO - codeparrot_training - Step 43550: {'lr': 2.0391089278595566e-05, 'samples': 22298112, 'steps': 43550, 'batch_loss/train': 0.788964958046563}
12/28/2021 16:45:55 - INFO - codeparrot_training - Step 43551: {'lr': 2.0384854305134952e-05, 'samples': 22298624, 'steps': 43551, 'batch_loss/train': 0.760646250564605}
12/28/2021 16:46:05 - INFO - codeparrot_training - Step 43552: {'lr': 2.0378620244537405e-05, 'samples': 22299136, 'steps': 43552, 'batch_loss/train': 0.6893884773598984}
12/28/2021 16:46:19 - INFO - codeparrot_training - Step 43553: {'lr': 2.037238709682776e-05, 'samples': 22299648, 'steps': 43553, 'batch_loss/train': 0.7543999501504004}
12/28/2021 16:46:30 - INFO - codeparrot_training - Step 43554: {'lr': 2.0366154862030806e-05, 'samples': 22300160, 'steps': 43554, 'batch_loss/train': 0.7390691684558988}
12/28/2021 16:46:41 - INFO - codeparrot_training - Step 43555: {'lr': 2.03599235401713e-05, 'samples': 22300672, 'steps': 43555, 'batch_loss/train': 0.7770155845209956}
12/28/2021 16:46:53 - INFO - codeparrot_training - Step 43556: {'lr': 2.0353693131274026e-05, 'samples': 22301184, 'steps': 43556, 'batch_loss/train': 0.7135297181084752}
12/28/2021 16:47:03 - INFO - codeparrot_training - Step 43557: {'lr': 2.034746363536377e-05, 'samples': 22301696, 'steps': 43557, 'batch_loss/train': 0.6658573849126697}
12/28/2021 16:47:14 - INFO - codeparrot_training - Step 43558: {'lr': 2.0341235052465295e-05, 'samples': 22302208, 'steps': 43558, 'batch_loss/train': 0.7749383689370006}
12/28/2021 16:47:25 - INFO - codeparrot_training - Step 43559: {'lr': 2.0335007382603325e-05, 'samples': 22302720, 'steps': 43559, 'batch_loss/train': 0.7370326754171401}
12/28/2021 16:47:39 - INFO - codeparrot_training - Step 43560: {'lr': 2.0328780625802706e-05, 'samples': 22303232, 'steps': 43560, 'batch_loss/train': 0.7538184382719919}
12/28/2021 16:47:49 - INFO - codeparrot_training - Step 43561: {'lr': 2.0322554782088053e-05, 'samples': 22303744, 'steps': 43561, 'batch_loss/train': 0.8022567690350115}
12/28/2021 16:48:00 - INFO - codeparrot_training - Step 43562: {'lr': 2.0316329851484237e-05, 'samples': 22304256, 'steps': 43562, 'batch_loss/train': 0.6893492102390155}
12/28/2021 16:48:12 - INFO - codeparrot_training - Step 43563: {'lr': 2.0310105834016017e-05, 'samples': 22304768, 'steps': 43563, 'batch_loss/train': 0.7835565214045346}
12/28/2021 16:48:23 - INFO - codeparrot_training - Step 43564: {'lr': 2.030388272970804e-05, 'samples': 22305280, 'steps': 43564, 'batch_loss/train': 0.6576421755598858}
12/28/2021 16:48:33 - INFO - codeparrot_training - Step 43565: {'lr': 2.029766053858506e-05, 'samples': 22305792, 'steps': 43565, 'batch_loss/train': 0.7102504717186093}
12/28/2021 16:48:46 - INFO - codeparrot_training - Step 43566: {'lr': 2.0291439260671923e-05, 'samples': 22306304, 'steps': 43566, 'batch_loss/train': 0.656452901661396}
12/28/2021 16:48:56 - INFO - codeparrot_training - Step 43567: {'lr': 2.0285218895993274e-05, 'samples': 22306816, 'steps': 43567, 'batch_loss/train': 0.7980327729601413}
12/28/2021 16:49:07 - INFO - codeparrot_training - Step 43568: {'lr': 2.0278999444573788e-05, 'samples': 22307328, 'steps': 43568, 'batch_loss/train': 0.789233885705471}
12/28/2021 16:49:21 - INFO - codeparrot_training - Step 43569: {'lr': 2.0272780906438358e-05, 'samples': 22307840, 'steps': 43569, 'batch_loss/train': 0.7055135904811323}
12/28/2021 16:49:31 - INFO - codeparrot_training - Step 43570: {'lr': 2.0266563281611555e-05, 'samples': 22308352, 'steps': 43570, 'batch_loss/train': 0.7093531284481287}
12/28/2021 16:49:42 - INFO - codeparrot_training - Step 43571: {'lr': 2.026034657011816e-05, 'samples': 22308864, 'steps': 43571, 'batch_loss/train': 0.765248910523951}
12/28/2021 16:49:53 - INFO - codeparrot_training - Step 43572: {'lr': 2.0254130771982876e-05, 'samples': 22309376, 'steps': 43572, 'batch_loss/train': 0.8140660929493606}
12/28/2021 16:50:05 - INFO - codeparrot_training - Step 43573: {'lr': 2.0247915887230435e-05, 'samples': 22309888, 'steps': 43573, 'batch_loss/train': 0.7707800720818341}
12/28/2021 16:50:15 - INFO - codeparrot_training - Step 43574: {'lr': 2.0241701915885507e-05, 'samples': 22310400, 'steps': 43574, 'batch_loss/train': 0.6548121469095349}
12/28/2021 16:50:26 - INFO - codeparrot_training - Step 43575: {'lr': 2.0235488857972883e-05, 'samples': 22310912, 'steps': 43575, 'batch_loss/train': 0.6807925631292164}
12/28/2021 16:50:38 - INFO - codeparrot_training - Step 43576: {'lr': 2.0229276713517096e-05, 'samples': 22311424, 'steps': 43576, 'batch_loss/train': 0.798008575453423}
12/28/2021 16:50:49 - INFO - codeparrot_training - Step 43577: {'lr': 2.022306548254299e-05, 'samples': 22311936, 'steps': 43577, 'batch_loss/train': 0.706247553229332}
12/28/2021 16:51:00 - INFO - codeparrot_training - Step 43578: {'lr': 2.0216855165075266e-05, 'samples': 22312448, 'steps': 43578, 'batch_loss/train': 0.6920734015293419}
12/28/2021 16:51:10 - INFO - codeparrot_training - Step 43579: {'lr': 2.0210645761138513e-05, 'samples': 22312960, 'steps': 43579, 'batch_loss/train': 0.684062133077532}
12/28/2021 16:51:22 - INFO - codeparrot_training - Step 43580: {'lr': 2.0204437270757463e-05, 'samples': 22313472, 'steps': 43580, 'batch_loss/train': 0.7400285424664617}
12/28/2021 16:51:33 - INFO - codeparrot_training - Step 43581: {'lr': 2.019822969395682e-05, 'samples': 22313984, 'steps': 43581, 'batch_loss/train': 0.7179299406707287}
12/28/2021 16:51:43 - INFO - codeparrot_training - Step 43582: {'lr': 2.0192023030761257e-05, 'samples': 22314496, 'steps': 43582, 'batch_loss/train': 0.7263008593581617}
12/28/2021 16:51:58 - INFO - codeparrot_training - Step 43583: {'lr': 2.0185817281195422e-05, 'samples': 22315008, 'steps': 43583, 'batch_loss/train': 0.7495627552270889}
12/28/2021 16:52:08 - INFO - codeparrot_training - Step 43584: {'lr': 2.017961244528399e-05, 'samples': 22315520, 'steps': 43584, 'batch_loss/train': 0.7562326649203897}
12/28/2021 16:52:19 - INFO - codeparrot_training - Step 43585: {'lr': 2.0173408523051662e-05, 'samples': 22316032, 'steps': 43585, 'batch_loss/train': 0.7416676804423332}
12/28/2021 16:52:31 - INFO - codeparrot_training - Step 43586: {'lr': 2.0167205514523058e-05, 'samples': 22316544, 'steps': 43586, 'batch_loss/train': 0.784850955940783}
12/28/2021 16:52:42 - INFO - codeparrot_training - Step 43587: {'lr': 2.0161003419722878e-05, 'samples': 22317056, 'steps': 43587, 'batch_loss/train': 0.8247334046754986}
12/28/2021 16:52:52 - INFO - codeparrot_training - Step 43588: {'lr': 2.0154802238675773e-05, 'samples': 22317568, 'steps': 43588, 'batch_loss/train': 0.8747802204452455}
12/28/2021 16:53:07 - INFO - codeparrot_training - Step 43589: {'lr': 2.0148601971406415e-05, 'samples': 22318080, 'steps': 43589, 'batch_loss/train': 0.8079252743627876}
12/28/2021 16:53:17 - INFO - codeparrot_training - Step 43590: {'lr': 2.01424026179394e-05, 'samples': 22318592, 'steps': 43590, 'batch_loss/train': 0.9455791525542736}
12/28/2021 16:53:28 - INFO - codeparrot_training - Step 43591: {'lr': 2.0136204178299368e-05, 'samples': 22319104, 'steps': 43591, 'batch_loss/train': 0.7553369477391243}
12/28/2021 16:53:39 - INFO - codeparrot_training - Step 43592: {'lr': 2.013000665251105e-05, 'samples': 22319616, 'steps': 43592, 'batch_loss/train': 0.8360991696827114}
12/28/2021 16:53:51 - INFO - codeparrot_training - Step 43593: {'lr': 2.0123810040599017e-05, 'samples': 22320128, 'steps': 43593, 'batch_loss/train': 0.5895905555225909}
12/28/2021 16:54:02 - INFO - codeparrot_training - Step 43594: {'lr': 2.0117614342587853e-05, 'samples': 22320640, 'steps': 43594, 'batch_loss/train': 0.7808756409212947}
12/28/2021 16:54:12 - INFO - codeparrot_training - Step 43595: {'lr': 2.011141955850235e-05, 'samples': 22321152, 'steps': 43595, 'batch_loss/train': 0.7723095640540123}
12/28/2021 16:54:24 - INFO - codeparrot_training - Step 43596: {'lr': 2.010522568836698e-05, 'samples': 22321664, 'steps': 43596, 'batch_loss/train': 0.5061975589487702}
12/28/2021 16:54:35 - INFO - codeparrot_training - Step 43597: {'lr': 2.009903273220645e-05, 'samples': 22322176, 'steps': 43597, 'batch_loss/train': 0.6590358694083989}
12/28/2021 16:54:46 - INFO - codeparrot_training - Step 43598: {'lr': 2.0092840690045356e-05, 'samples': 22322688, 'steps': 43598, 'batch_loss/train': 0.8110168852144852}
12/28/2021 16:54:56 - INFO - codeparrot_training - Step 43599: {'lr': 2.008664956190831e-05, 'samples': 22323200, 'steps': 43599, 'batch_loss/train': 0.8496292214840651}
12/28/2021 16:55:10 - INFO - codeparrot_training - Step 43600: {'lr': 2.0080459347819934e-05, 'samples': 22323712, 'steps': 43600, 'batch_loss/train': 0.7417234062450007}
12/28/2021 16:55:21 - INFO - codeparrot_training - Step 43601: {'lr': 2.0074270047804876e-05, 'samples': 22324224, 'steps': 43601, 'batch_loss/train': 0.7175130671821535}
12/28/2021 16:55:31 - INFO - codeparrot_training - Step 43602: {'lr': 2.0068081661887615e-05, 'samples': 22324736, 'steps': 43602, 'batch_loss/train': 0.710300964768976}
12/28/2021 16:55:44 - INFO - codeparrot_training - Step 43603: {'lr': 2.0061894190092884e-05, 'samples': 22325248, 'steps': 43603, 'batch_loss/train': 0.7160043842159212}
12/28/2021 16:55:54 - INFO - codeparrot_training - Step 43604: {'lr': 2.0055707632445296e-05, 'samples': 22325760, 'steps': 43604, 'batch_loss/train': 0.5637571250554174}
12/28/2021 16:56:05 - INFO - codeparrot_training - Step 43605: {'lr': 2.004952198896931e-05, 'samples': 22326272, 'steps': 43605, 'batch_loss/train': 0.7935980167239904}
12/28/2021 16:56:16 - INFO - codeparrot_training - Step 43606: {'lr': 2.0043337259689683e-05, 'samples': 22326784, 'steps': 43606, 'batch_loss/train': 0.8888823490124196}
12/28/2021 16:56:30 - INFO - codeparrot_training - Step 43607: {'lr': 2.0037153444630864e-05, 'samples': 22327296, 'steps': 43607, 'batch_loss/train': 0.7317290501669049}
12/28/2021 16:56:40 - INFO - codeparrot_training - Step 43608: {'lr': 2.00309705438175e-05, 'samples': 22327808, 'steps': 43608, 'batch_loss/train': 1.3178314422257245}
12/28/2021 16:56:51 - INFO - codeparrot_training - Step 43609: {'lr': 2.002478855727416e-05, 'samples': 22328320, 'steps': 43609, 'batch_loss/train': 0.6720056044869125}
12/28/2021 16:57:03 - INFO - codeparrot_training - Step 43610: {'lr': 2.0018607485025432e-05, 'samples': 22328832, 'steps': 43610, 'batch_loss/train': 0.7341536111198366}
12/28/2021 16:57:14 - INFO - codeparrot_training - Step 43611: {'lr': 2.0012427327095876e-05, 'samples': 22329344, 'steps': 43611, 'batch_loss/train': 0.7478662063367665}
12/28/2021 16:57:24 - INFO - codeparrot_training - Step 43612: {'lr': 2.0006248083510065e-05, 'samples': 22329856, 'steps': 43612, 'batch_loss/train': 0.770715743303299}
12/28/2021 16:57:36 - INFO - codeparrot_training - Step 43613: {'lr': 2.000006975429258e-05, 'samples': 22330368, 'steps': 43613, 'batch_loss/train': 0.6940345084294677}
12/28/2021 16:57:47 - INFO - codeparrot_training - Step 43614: {'lr': 1.9993892339467966e-05, 'samples': 22330880, 'steps': 43614, 'batch_loss/train': 0.6538366654422134}
12/28/2021 16:57:58 - INFO - codeparrot_training - Step 43615: {'lr': 1.998771583906084e-05, 'samples': 22331392, 'steps': 43615, 'batch_loss/train': 0.7729939445853233}
12/28/2021 16:58:10 - INFO - codeparrot_training - Step 43616: {'lr': 1.9981540253095648e-05, 'samples': 22331904, 'steps': 43616, 'batch_loss/train': 0.7527305171824992}
12/28/2021 16:58:21 - INFO - codeparrot_training - Step 43617: {'lr': 1.997536558159696e-05, 'samples': 22332416, 'steps': 43617, 'batch_loss/train': 0.6878002276644111}
12/28/2021 16:58:31 - INFO - codeparrot_training - Step 43618: {'lr': 1.996919182458948e-05, 'samples': 22332928, 'steps': 43618, 'batch_loss/train': 0.7369631342589855}
12/28/2021 16:58:42 - INFO - codeparrot_training - Step 43619: {'lr': 1.996301898209757e-05, 'samples': 22333440, 'steps': 43619, 'batch_loss/train': 0.7336265242192894}
12/28/2021 16:58:56 - INFO - codeparrot_training - Step 43620: {'lr': 1.99568470541458e-05, 'samples': 22333952, 'steps': 43620, 'batch_loss/train': 0.7259883414953947}
12/28/2021 16:59:06 - INFO - codeparrot_training - Step 43621: {'lr': 1.9950676040758815e-05, 'samples': 22334464, 'steps': 43621, 'batch_loss/train': 0.6001000786200166}
12/28/2021 16:59:17 - INFO - codeparrot_training - Step 43622: {'lr': 1.994450594196104e-05, 'samples': 22334976, 'steps': 43622, 'batch_loss/train': 0.5470073008327745}
12/28/2021 16:59:29 - INFO - codeparrot_training - Step 43623: {'lr': 1.9938336757777064e-05, 'samples': 22335488, 'steps': 43623, 'batch_loss/train': 0.710282820102293}
12/28/2021 16:59:40 - INFO - codeparrot_training - Step 43624: {'lr': 1.993216848823137e-05, 'samples': 22336000, 'steps': 43624, 'batch_loss/train': 0.7176260149572045}
12/28/2021 16:59:50 - INFO - codeparrot_training - Step 43625: {'lr': 1.9926001133348527e-05, 'samples': 22336512, 'steps': 43625, 'batch_loss/train': 0.8337345793843269}
12/28/2021 17:00:03 - INFO - codeparrot_training - Step 43626: {'lr': 1.9919834693153032e-05, 'samples': 22337024, 'steps': 43626, 'batch_loss/train': 0.702709439327009}
12/28/2021 17:00:13 - INFO - codeparrot_training - Step 43627: {'lr': 1.991366916766943e-05, 'samples': 22337536, 'steps': 43627, 'batch_loss/train': 0.7072122159879655}
12/28/2021 17:00:24 - INFO - codeparrot_training - Step 43628: {'lr': 1.9907504556922113e-05, 'samples': 22338048, 'steps': 43628, 'batch_loss/train': 0.8213274488225579}
12/28/2021 17:00:34 - INFO - codeparrot_training - Step 43629: {'lr': 1.990134086093573e-05, 'samples': 22338560, 'steps': 43629, 'batch_loss/train': 0.7882369682192802}
12/28/2021 17:00:48 - INFO - codeparrot_training - Step 43630: {'lr': 1.989517807973476e-05, 'samples': 22339072, 'steps': 43630, 'batch_loss/train': 0.8632665267214179}
12/28/2021 17:00:59 - INFO - codeparrot_training - Step 43631: {'lr': 1.9889016213343604e-05, 'samples': 22339584, 'steps': 43631, 'batch_loss/train': 0.7211274281144142}
12/28/2021 17:01:10 - INFO - codeparrot_training - Step 43632: {'lr': 1.9882855261786875e-05, 'samples': 22340096, 'steps': 43632, 'batch_loss/train': 0.7146926070563495}
12/28/2021 17:01:22 - INFO - codeparrot_training - Step 43633: {'lr': 1.987669522508906e-05, 'samples': 22340608, 'steps': 43633, 'batch_loss/train': 0.8247180161997676}
12/28/2021 17:01:32 - INFO - codeparrot_training - Step 43634: {'lr': 1.9870536103274578e-05, 'samples': 22341120, 'steps': 43634, 'batch_loss/train': 0.7601466355845332}
12/28/2021 17:01:43 - INFO - codeparrot_training - Step 43635: {'lr': 1.9864377896367967e-05, 'samples': 22341632, 'steps': 43635, 'batch_loss/train': 0.8675495623610914}
12/28/2021 17:01:57 - INFO - codeparrot_training - Step 43636: {'lr': 1.9858220604393655e-05, 'samples': 22342144, 'steps': 43636, 'batch_loss/train': 0.7114720779936761}
12/28/2021 17:02:08 - INFO - codeparrot_training - Step 43637: {'lr': 1.9852064227376204e-05, 'samples': 22342656, 'steps': 43637, 'batch_loss/train': 0.6485785152763128}
12/28/2021 17:02:18 - INFO - codeparrot_training - Step 43638: {'lr': 1.984590876534001e-05, 'samples': 22343168, 'steps': 43638, 'batch_loss/train': 0.6752792655024678}
12/28/2021 17:02:31 - INFO - codeparrot_training - Step 43639: {'lr': 1.9839754218309607e-05, 'samples': 22343680, 'steps': 43639, 'batch_loss/train': 0.696419829968363}
12/28/2021 17:02:41 - INFO - codeparrot_training - Step 43640: {'lr': 1.9833600586309426e-05, 'samples': 22344192, 'steps': 43640, 'batch_loss/train': 0.78944908734411}
12/28/2021 17:02:52 - INFO - codeparrot_training - Step 43641: {'lr': 1.982744786936397e-05, 'samples': 22344704, 'steps': 43641, 'batch_loss/train': 0.7731192149221897}
12/28/2021 17:03:03 - INFO - codeparrot_training - Step 43642: {'lr': 1.9821296067497608e-05, 'samples': 22345216, 'steps': 43642, 'batch_loss/train': 0.6783303112024441}
12/28/2021 17:03:15 - INFO - codeparrot_training - Step 43643: {'lr': 1.981514518073488e-05, 'samples': 22345728, 'steps': 43643, 'batch_loss/train': 0.8206101241521537}
12/28/2021 17:03:25 - INFO - codeparrot_training - Step 43644: {'lr': 1.9808995209100287e-05, 'samples': 22346240, 'steps': 43644, 'batch_loss/train': 0.7534524686634541}
12/28/2021 17:03:36 - INFO - codeparrot_training - Step 43645: {'lr': 1.9802846152618155e-05, 'samples': 22346752, 'steps': 43645, 'batch_loss/train': 0.8140233960002661}
12/28/2021 17:03:50 - INFO - codeparrot_training - Step 43646: {'lr': 1.9796698011312953e-05, 'samples': 22347264, 'steps': 43646, 'batch_loss/train': 0.8093131561763585}
12/28/2021 17:04:01 - INFO - codeparrot_training - Step 43647: {'lr': 1.9790550785209222e-05, 'samples': 22347776, 'steps': 43647, 'batch_loss/train': 0.7317215540679172}
12/28/2021 17:04:11 - INFO - codeparrot_training - Step 43648: {'lr': 1.978440447433133e-05, 'samples': 22348288, 'steps': 43648, 'batch_loss/train': 0.6961149820126593}
12/28/2021 17:04:24 - INFO - codeparrot_training - Step 43649: {'lr': 1.9778259078703704e-05, 'samples': 22348800, 'steps': 43649, 'batch_loss/train': 0.7262553367763758}
12/28/2021 17:04:34 - INFO - codeparrot_training - Step 43650: {'lr': 1.9772114598350767e-05, 'samples': 22349312, 'steps': 43650, 'batch_loss/train': 0.5819332371465862}
12/28/2021 17:04:45 - INFO - codeparrot_training - Step 43651: {'lr': 1.9765971033296998e-05, 'samples': 22349824, 'steps': 43651, 'batch_loss/train': 0.7435176339931786}
12/28/2021 17:04:56 - INFO - codeparrot_training - Step 43652: {'lr': 1.9759828383566798e-05, 'samples': 22350336, 'steps': 43652, 'batch_loss/train': 0.8449091208167374}
12/28/2021 17:05:08 - INFO - codeparrot_training - Step 43653: {'lr': 1.9753686649184616e-05, 'samples': 22350848, 'steps': 43653, 'batch_loss/train': 0.8139241300523281}
12/28/2021 17:05:18 - INFO - codeparrot_training - Step 43654: {'lr': 1.974754583017477e-05, 'samples': 22351360, 'steps': 43654, 'batch_loss/train': 0.7382655355613679}
12/28/2021 17:05:29 - INFO - codeparrot_training - Step 43655: {'lr': 1.9741405926561763e-05, 'samples': 22351872, 'steps': 43655, 'batch_loss/train': 0.7147817779332399}
12/28/2021 17:05:41 - INFO - codeparrot_training - Step 43656: {'lr': 1.9735266938370022e-05, 'samples': 22352384, 'steps': 43656, 'batch_loss/train': 0.7185149099677801}
12/28/2021 17:05:52 - INFO - codeparrot_training - Step 43657: {'lr': 1.9729128865623864e-05, 'samples': 22352896, 'steps': 43657, 'batch_loss/train': 0.7863656487315893}
12/28/2021 17:06:02 - INFO - codeparrot_training - Step 43658: {'lr': 1.9722991708347764e-05, 'samples': 22353408, 'steps': 43658, 'batch_loss/train': 0.6922382889315486}
12/28/2021 17:06:16 - INFO - codeparrot_training - Step 43659: {'lr': 1.9716855466566125e-05, 'samples': 22353920, 'steps': 43659, 'batch_loss/train': 0.7609089598990977}
12/28/2021 17:06:27 - INFO - codeparrot_training - Step 43660: {'lr': 1.9710720140303312e-05, 'samples': 22354432, 'steps': 43660, 'batch_loss/train': 0.8110669618472457}
12/28/2021 17:06:38 - INFO - codeparrot_training - Step 43661: {'lr': 1.9704585729583698e-05, 'samples': 22354944, 'steps': 43661, 'batch_loss/train': 0.7658844627439976}
12/28/2021 17:06:50 - INFO - codeparrot_training - Step 43662: {'lr': 1.9698452234431703e-05, 'samples': 22355456, 'steps': 43662, 'batch_loss/train': 0.7364076776430011}
12/28/2021 17:07:00 - INFO - codeparrot_training - Step 43663: {'lr': 1.96923196548717e-05, 'samples': 22355968, 'steps': 43663, 'batch_loss/train': 0.7825276986695826}
12/28/2021 17:07:11 - INFO - codeparrot_training - Step 43664: {'lr': 1.9686187990928055e-05, 'samples': 22356480, 'steps': 43664, 'batch_loss/train': 0.6666361839743331}
12/28/2021 17:07:22 - INFO - codeparrot_training - Step 43665: {'lr': 1.96800572426252e-05, 'samples': 22356992, 'steps': 43665, 'batch_loss/train': 0.6381022813729942}
12/28/2021 17:07:36 - INFO - codeparrot_training - Step 43666: {'lr': 1.9673927409987442e-05, 'samples': 22357504, 'steps': 43666, 'batch_loss/train': 0.6729059062781744}
12/28/2021 17:07:46 - INFO - codeparrot_training - Step 43667: {'lr': 1.9667798493039206e-05, 'samples': 22358016, 'steps': 43667, 'batch_loss/train': 0.7483166952151805}
12/28/2021 17:07:57 - INFO - codeparrot_training - Step 43668: {'lr': 1.966167049180481e-05, 'samples': 22358528, 'steps': 43668, 'batch_loss/train': 0.4954616241448093}
12/28/2021 17:08:09 - INFO - codeparrot_training - Step 43669: {'lr': 1.965554340630865e-05, 'samples': 22359040, 'steps': 43669, 'batch_loss/train': 0.9834025725722313}
12/28/2021 17:08:20 - INFO - codeparrot_training - Step 43670: {'lr': 1.9649417236575125e-05, 'samples': 22359552, 'steps': 43670, 'batch_loss/train': 0.6462812600657344}
12/28/2021 17:08:30 - INFO - codeparrot_training - Step 43671: {'lr': 1.9643291982628486e-05, 'samples': 22360064, 'steps': 43671, 'batch_loss/train': 0.6405384065583348}
12/28/2021 17:08:42 - INFO - codeparrot_training - Step 43672: {'lr': 1.963716764449308e-05, 'samples': 22360576, 'steps': 43672, 'batch_loss/train': 0.7018599197035655}
12/28/2021 17:08:53 - INFO - codeparrot_training - Step 43673: {'lr': 1.9631044222193418e-05, 'samples': 22361088, 'steps': 43673, 'batch_loss/train': 0.7699119858443737}
12/28/2021 17:09:04 - INFO - codeparrot_training - Step 43674: {'lr': 1.9624921715753696e-05, 'samples': 22361600, 'steps': 43674, 'batch_loss/train': 0.8118460569530725}
12/28/2021 17:09:14 - INFO - codeparrot_training - Step 43675: {'lr': 1.9618800125198316e-05, 'samples': 22362112, 'steps': 43675, 'batch_loss/train': 0.8304718500003219}
12/28/2021 17:09:29 - INFO - codeparrot_training - Step 43676: {'lr': 1.961267945055159e-05, 'samples': 22362624, 'steps': 43676, 'batch_loss/train': 0.7843554893042892}
12/28/2021 17:09:39 - INFO - codeparrot_training - Step 43677: {'lr': 1.9606559691837833e-05, 'samples': 22363136, 'steps': 43677, 'batch_loss/train': 0.8353963531553745}
12/28/2021 17:09:50 - INFO - codeparrot_training - Step 43678: {'lr': 1.960044084908144e-05, 'samples': 22363648, 'steps': 43678, 'batch_loss/train': 0.7425217051059008}
12/28/2021 17:10:02 - INFO - codeparrot_training - Step 43679: {'lr': 1.9594322922306675e-05, 'samples': 22364160, 'steps': 43679, 'batch_loss/train': 0.8863969594240189}
12/28/2021 17:10:13 - INFO - codeparrot_training - Step 43680: {'lr': 1.958820591153787e-05, 'samples': 22364672, 'steps': 43680, 'batch_loss/train': 0.7511012847535312}
12/28/2021 17:10:23 - INFO - codeparrot_training - Step 43681: {'lr': 1.9582089816799404e-05, 'samples': 22365184, 'steps': 43681, 'batch_loss/train': 0.7434234152315184}
12/28/2021 17:10:35 - INFO - codeparrot_training - Step 43682: {'lr': 1.9575974638115556e-05, 'samples': 22365696, 'steps': 43682, 'batch_loss/train': 0.7351283053867519}
12/28/2021 17:10:46 - INFO - codeparrot_training - Step 43683: {'lr': 1.9569860375510533e-05, 'samples': 22366208, 'steps': 43683, 'batch_loss/train': 0.7588706811657175}
12/28/2021 17:10:57 - INFO - codeparrot_training - Step 43684: {'lr': 1.9563747029008817e-05, 'samples': 22366720, 'steps': 43684, 'batch_loss/train': 0.7111501349136233}
12/28/2021 17:11:07 - INFO - codeparrot_training - Step 43685: {'lr': 1.9557634598634634e-05, 'samples': 22367232, 'steps': 43685, 'batch_loss/train': 0.7548960465937853}
12/28/2021 17:11:19 - INFO - codeparrot_training - Step 43686: {'lr': 1.955152308441227e-05, 'samples': 22367744, 'steps': 43686, 'batch_loss/train': 0.8046970358118415}
12/28/2021 17:11:30 - INFO - codeparrot_training - Step 43687: {'lr': 1.9545412486366044e-05, 'samples': 22368256, 'steps': 43687, 'batch_loss/train': 0.7291516761761159}
12/28/2021 17:11:41 - INFO - codeparrot_training - Step 43688: {'lr': 1.9539302804520208e-05, 'samples': 22368768, 'steps': 43688, 'batch_loss/train': 0.6177168251015246}
12/28/2021 17:11:55 - INFO - codeparrot_training - Step 43689: {'lr': 1.953319403889911e-05, 'samples': 22369280, 'steps': 43689, 'batch_loss/train': 0.7763404008001089}
12/28/2021 17:12:05 - INFO - codeparrot_training - Step 43690: {'lr': 1.9527086189527003e-05, 'samples': 22369792, 'steps': 43690, 'batch_loss/train': 0.6915562856593169}
12/28/2021 17:12:16 - INFO - codeparrot_training - Step 43691: {'lr': 1.9520979256428177e-05, 'samples': 22370304, 'steps': 43691, 'batch_loss/train': 0.6395124425762333}
12/28/2021 17:12:28 - INFO - codeparrot_training - Step 43692: {'lr': 1.9514873239626916e-05, 'samples': 22370816, 'steps': 43692, 'batch_loss/train': 0.7262612117920071}
12/28/2021 17:12:39 - INFO - codeparrot_training - Step 43693: {'lr': 1.950876813914751e-05, 'samples': 22371328, 'steps': 43693, 'batch_loss/train': 0.7972382558509707}
12/28/2021 17:12:50 - INFO - codeparrot_training - Step 43694: {'lr': 1.9502663955014182e-05, 'samples': 22371840, 'steps': 43694, 'batch_loss/train': 0.7913088672794402}
12/28/2021 17:13:02 - INFO - codeparrot_training - Step 43695: {'lr': 1.9496560687251254e-05, 'samples': 22372352, 'steps': 43695, 'batch_loss/train': 0.6593247214332223}
12/28/2021 17:13:13 - INFO - codeparrot_training - Step 43696: {'lr': 1.9490458335882978e-05, 'samples': 22372864, 'steps': 43696, 'batch_loss/train': 0.7090826532803476}
12/28/2021 17:13:23 - INFO - codeparrot_training - Step 43697: {'lr': 1.9484356900933586e-05, 'samples': 22373376, 'steps': 43697, 'batch_loss/train': 0.7393272412009537}
12/28/2021 17:13:34 - INFO - codeparrot_training - Step 43698: {'lr': 1.947825638242731e-05, 'samples': 22373888, 'steps': 43698, 'batch_loss/train': 1.0942740151658654}
12/28/2021 17:13:48 - INFO - codeparrot_training - Step 43699: {'lr': 1.9472156780388523e-05, 'samples': 22374400, 'steps': 43699, 'batch_loss/train': 0.7903225107584149}
12/28/2021 17:13:59 - INFO - codeparrot_training - Step 43700: {'lr': 1.946605809484134e-05, 'samples': 22374912, 'steps': 43700, 'batch_loss/train': 0.7639431040734053}
12/28/2021 17:14:10 - INFO - codeparrot_training - Step 43701: {'lr': 1.9459960325810072e-05, 'samples': 22375424, 'steps': 43701, 'batch_loss/train': 0.8261668113991618}
12/28/2021 17:14:22 - INFO - codeparrot_training - Step 43702: {'lr': 1.9453863473318957e-05, 'samples': 22375936, 'steps': 43702, 'batch_loss/train': 0.7270322232507169}
12/28/2021 17:14:32
- INFO - codeparrot_training - Step 43703: {'lr': 1.9447767537392247e-05, 'samples': 22376448, 'steps': 43703, 'batch_loss/train': 0.7204211237840354} 12/28/2021 17:14:43 - INFO - codeparrot_training - Step 43704: {'lr': 1.944167251805412e-05, 'samples': 22376960, 'steps': 43704, 'batch_loss/train': 0.7066128174774349} 12/28/2021 17:14:57 - INFO - codeparrot_training - Step 43705: {'lr': 1.943557841532889e-05, 'samples': 22377472, 'steps': 43705, 'batch_loss/train': 0.858263332862407} 12/28/2021 17:15:08 - INFO - codeparrot_training - Step 43706: {'lr': 1.9429485229240705e-05, 'samples': 22377984, 'steps': 43706, 'batch_loss/train': 0.6830367259681225} 12/28/2021 17:15:18 - INFO - codeparrot_training - Step 43707: {'lr': 1.9423392959813852e-05, 'samples': 22378496, 'steps': 43707, 'batch_loss/train': 0.6748069394379854} 12/28/2021 17:15:29 - INFO - codeparrot_training - Step 43708: {'lr': 1.9417301607072556e-05, 'samples': 22379008, 'steps': 43708, 'batch_loss/train': 0.6331794610014185} 12/28/2021 17:15:41 - INFO - codeparrot_training - Step 43709: {'lr': 1.9411211171040944e-05, 'samples': 22379520, 'steps': 43709, 'batch_loss/train': 0.7373796869069338} 12/28/2021 17:15:51 - INFO - codeparrot_training - Step 43710: {'lr': 1.9405121651743297e-05, 'samples': 22380032, 'steps': 43710, 'batch_loss/train': 0.7496225470677018} 12/28/2021 17:16:02 - INFO - codeparrot_training - Step 43711: {'lr': 1.9399033049203875e-05, 'samples': 22380544, 'steps': 43711, 'batch_loss/train': 0.6944072081241757} 12/28/2021 17:16:14 - INFO - codeparrot_training - Step 43712: {'lr': 1.9392945363446796e-05, 'samples': 22381056, 'steps': 43712, 'batch_loss/train': 0.732629070058465} 12/28/2021 17:16:25 - INFO - codeparrot_training - Step 43713: {'lr': 1.9386858594496238e-05, 'samples': 22381568, 'steps': 43713, 'batch_loss/train': 0.7338931974954903} 12/28/2021 17:16:36 - INFO - codeparrot_training - Step 43714: {'lr': 1.938077274237654e-05, 'samples': 22382080, 'steps': 43714, 
'batch_loss/train': 0.7767059868201613} 12/28/2021 17:16:49 - INFO - codeparrot_training - Step 43715: {'lr': 1.937468780711177e-05, 'samples': 22382592, 'steps': 43715, 'batch_loss/train': 0.6676922934129834} 12/28/2021 17:17:00 - INFO - codeparrot_training - Step 43716: {'lr': 1.936860378872615e-05, 'samples': 22383104, 'steps': 43716, 'batch_loss/train': 0.8467287635430694} 12/28/2021 17:17:11 - INFO - codeparrot_training - Step 43717: {'lr': 1.9362520687243896e-05, 'samples': 22383616, 'steps': 43717, 'batch_loss/train': 0.7135044410824776} 12/28/2021 17:17:21 - INFO - codeparrot_training - Step 43718: {'lr': 1.9356438502689145e-05, 'samples': 22384128, 'steps': 43718, 'batch_loss/train': 0.7804344650357962} 12/28/2021 17:17:34 - INFO - codeparrot_training - Step 43719: {'lr': 1.935035723508613e-05, 'samples': 22384640, 'steps': 43719, 'batch_loss/train': 0.7072119298391044} 12/28/2021 17:17:44 - INFO - codeparrot_training - Step 43720: {'lr': 1.9344276884459e-05, 'samples': 22385152, 'steps': 43720, 'batch_loss/train': 0.8850626070052385} 12/28/2021 17:17:55 - INFO - codeparrot_training - Step 43721: {'lr': 1.933819745083193e-05, 'samples': 22385664, 'steps': 43721, 'batch_loss/train': 0.81058051250875} 12/28/2021 17:18:08 - INFO - codeparrot_training - Step 43722: {'lr': 1.933211893422912e-05, 'samples': 22386176, 'steps': 43722, 'batch_loss/train': 0.6635229110252112} 12/28/2021 17:18:19 - INFO - codeparrot_training - Step 43723: {'lr': 1.932604133467472e-05, 'samples': 22386688, 'steps': 43723, 'batch_loss/train': 0.7080485709011555} 12/28/2021 17:18:30 - INFO - codeparrot_training - Step 43724: {'lr': 1.9319964652192794e-05, 'samples': 22387200, 'steps': 43724, 'batch_loss/train': 0.7367723835632205} 12/28/2021 17:18:42 - INFO - codeparrot_training - Step 43725: {'lr': 1.9313888886807685e-05, 'samples': 22387712, 'steps': 43725, 'batch_loss/train': 0.7866583582945168} 12/28/2021 17:18:52 - INFO - codeparrot_training - Step 43726: {'lr': 
1.9307814038543397e-05, 'samples': 22388224, 'steps': 43726, 'batch_loss/train': 0.6145215854048729} 12/28/2021 17:19:03 - INFO - codeparrot_training - Step 43727: {'lr': 1.930174010742414e-05, 'samples': 22388736, 'steps': 43727, 'batch_loss/train': 0.7112744320183992} 12/28/2021 17:19:15 - INFO - codeparrot_training - Step 43728: {'lr': 1.929566709347408e-05, 'samples': 22389248, 'steps': 43728, 'batch_loss/train': 0.7263216702267528} 12/28/2021 17:19:26 - INFO - codeparrot_training - Step 43729: {'lr': 1.9289594996717285e-05, 'samples': 22389760, 'steps': 43729, 'batch_loss/train': 0.8754741149023175} 12/28/2021 17:19:37 - INFO - codeparrot_training - Step 43730: {'lr': 1.928352381717799e-05, 'samples': 22390272, 'steps': 43730, 'batch_loss/train': 0.8217881256714463} 12/28/2021 17:19:47 - INFO - codeparrot_training - Step 43731: {'lr': 1.927745355488028e-05, 'samples': 22390784, 'steps': 43731, 'batch_loss/train': 0.5421728155342862} 12/28/2021 17:19:59 - INFO - codeparrot_training - Step 43732: {'lr': 1.9271384209848282e-05, 'samples': 22391296, 'steps': 43732, 'batch_loss/train': 0.7712922594510019} 12/28/2021 17:20:10 - INFO - codeparrot_training - Step 43733: {'lr': 1.9265315782106162e-05, 'samples': 22391808, 'steps': 43733, 'batch_loss/train': 0.8220433220267296} 12/28/2021 17:20:21 - INFO - codeparrot_training - Step 43734: {'lr': 1.9259248271678047e-05, 'samples': 22392320, 'steps': 43734, 'batch_loss/train': 0.7114706572610885} 12/28/2021 17:20:35 - INFO - codeparrot_training - Step 43735: {'lr': 1.9253181678587973e-05, 'samples': 22392832, 'steps': 43735, 'batch_loss/train': 0.7192156473174691} 12/28/2021 17:20:45 - INFO - codeparrot_training - Step 43736: {'lr': 1.9247116002860137e-05, 'samples': 22393344, 'steps': 43736, 'batch_loss/train': 0.704274034127593} 12/28/2021 17:20:56 - INFO - codeparrot_training - Step 43737: {'lr': 1.924105124451872e-05, 'samples': 22393856, 'steps': 43737, 'batch_loss/train': 0.7268098705681041} 12/28/2021 17:21:08 - 
INFO - codeparrot_training - Step 43738: {'lr': 1.923498740358767e-05, 'samples': 22394368, 'steps': 43738, 'batch_loss/train': 0.735871035605669} 12/28/2021 17:21:19 - INFO - codeparrot_training - Step 43739: {'lr': 1.9228924480091143e-05, 'samples': 22394880, 'steps': 43739, 'batch_loss/train': 0.7467967392876744} 12/28/2021 17:21:29 - INFO - codeparrot_training - Step 43740: {'lr': 1.9222862474053388e-05, 'samples': 22395392, 'steps': 43740, 'batch_loss/train': 0.774944256991148} 12/28/2021 17:21:40 - INFO - codeparrot_training - Step 43741: {'lr': 1.9216801385498333e-05, 'samples': 22395904, 'steps': 43741, 'batch_loss/train': 0.7540203025564551} 12/28/2021 17:21:52 - INFO - codeparrot_training - Step 43742: {'lr': 1.921074121445013e-05, 'samples': 22396416, 'steps': 43742, 'batch_loss/train': 0.6998963016085327} 12/28/2021 17:22:03 - INFO - codeparrot_training - Step 43743: {'lr': 1.920468196093289e-05, 'samples': 22396928, 'steps': 43743, 'batch_loss/train': 0.6222020536661148} 12/28/2021 17:22:13 - INFO - codeparrot_training - Step 43744: {'lr': 1.9198623624970686e-05, 'samples': 22397440, 'steps': 43744, 'batch_loss/train': 0.8177598880138248} 12/28/2021 17:22:27 - INFO - codeparrot_training - Step 43745: {'lr': 1.9192566206587575e-05, 'samples': 22397952, 'steps': 43745, 'batch_loss/train': 0.8061314187943935} 12/28/2021 17:22:38 - INFO - codeparrot_training - Step 43746: {'lr': 1.9186509705807708e-05, 'samples': 22398464, 'steps': 43746, 'batch_loss/train': 0.6343656685203314} 12/28/2021 17:22:49 - INFO - codeparrot_training - Step 43747: {'lr': 1.9180454122655122e-05, 'samples': 22398976, 'steps': 43747, 'batch_loss/train': 0.6382032819092274} 12/28/2021 17:23:01 - INFO - codeparrot_training - Step 43748: {'lr': 1.9174399457153878e-05, 'samples': 22399488, 'steps': 43748, 'batch_loss/train': 0.6601815624162555} 12/28/2021 17:23:12 - INFO - codeparrot_training - Step 43749: {'lr': 1.916834570932813e-05, 'samples': 22400000, 'steps': 43749, 
'batch_loss/train': 0.742668210528791} 12/28/2021 17:23:22 - INFO - codeparrot_training - Step 43750: {'lr': 1.9162292879201794e-05, 'samples': 22400512, 'steps': 43750, 'batch_loss/train': 0.7478388822637498} 12/28/2021 17:23:33 - INFO - codeparrot_training - Step 43751: {'lr': 1.9156240966799104e-05, 'samples': 22401024, 'steps': 43751, 'batch_loss/train': 0.6435650233179331} 12/28/2021 17:23:47 - INFO - codeparrot_training - Step 43752: {'lr': 1.9150189972143988e-05, 'samples': 22401536, 'steps': 43752, 'batch_loss/train': 0.7755263643339276} 12/28/2021 17:23:57 - INFO - codeparrot_training - Step 43753: {'lr': 1.9144139895260508e-05, 'samples': 22402048, 'steps': 43753, 'batch_loss/train': 0.7119042177218944} 12/28/2021 17:24:08 - INFO - codeparrot_training - Step 43754: {'lr': 1.9138090736172843e-05, 'samples': 22402560, 'steps': 43754, 'batch_loss/train': 0.8741376837715507} 12/28/2021 17:24:20 - INFO - codeparrot_training - Step 43755: {'lr': 1.9132042494904913e-05, 'samples': 22403072, 'steps': 43755, 'batch_loss/train': 0.6968424320220947} 12/28/2021 17:24:31 - INFO - codeparrot_training - Step 43756: {'lr': 1.9125995171480837e-05, 'samples': 22403584, 'steps': 43756, 'batch_loss/train': 0.8173075262457132} 12/28/2021 17:24:41 - INFO - codeparrot_training - Step 43757: {'lr': 1.91199487659246e-05, 'samples': 22404096, 'steps': 43757, 'batch_loss/train': 0.7844464527443051} 12/28/2021 17:24:54 - INFO - codeparrot_training - Step 43758: {'lr': 1.9113903278260293e-05, 'samples': 22404608, 'steps': 43758, 'batch_loss/train': 0.7705523017793894} 12/28/2021 17:25:05 - INFO - codeparrot_training - Step 43759: {'lr': 1.9107858708511895e-05, 'samples': 22405120, 'steps': 43759, 'batch_loss/train': 1.2024803087115288} 12/28/2021 17:25:16 - INFO - codeparrot_training - Step 43760: {'lr': 1.9101815056703552e-05, 'samples': 22405632, 'steps': 43760, 'batch_loss/train': 0.628347294870764} 12/28/2021 17:25:26 - INFO - codeparrot_training - Step 43761: {'lr': 
1.9095772322859084e-05, 'samples': 22406144, 'steps': 43761, 'batch_loss/train': 0.7874798304401338} 12/28/2021 17:25:40 - INFO - codeparrot_training - Step 43762: {'lr': 1.9089730507002717e-05, 'samples': 22406656, 'steps': 43762, 'batch_loss/train': 0.6673025946365669} 12/28/2021 17:25:51 - INFO - codeparrot_training - Step 43763: {'lr': 1.9083689609158406e-05, 'samples': 22407168, 'steps': 43763, 'batch_loss/train': 0.7707730988040566} 12/28/2021 17:26:01 - INFO - codeparrot_training - Step 43764: {'lr': 1.907764962935013e-05, 'samples': 22407680, 'steps': 43764, 'batch_loss/train': 0.7007548045367002} 12/28/2021 17:26:14 - INFO - codeparrot_training - Step 43765: {'lr': 1.907161056760187e-05, 'samples': 22408192, 'steps': 43765, 'batch_loss/train': 0.5716311521828175} 12/28/2021 17:26:24 - INFO - codeparrot_training - Step 43766: {'lr': 1.9065572423937777e-05, 'samples': 22408704, 'steps': 43766, 'batch_loss/train': 0.8242629058659077} 12/28/2021 17:26:35 - INFO - codeparrot_training - Step 43767: {'lr': 1.9059535198381745e-05, 'samples': 22409216, 'steps': 43767, 'batch_loss/train': 0.6960998745635152} 12/28/2021 17:26:46 - INFO - codeparrot_training - Step 43768: {'lr': 1.905349889095781e-05, 'samples': 22409728, 'steps': 43768, 'batch_loss/train': 0.5934498896822333} 12/28/2021 17:26:58 - INFO - codeparrot_training - Step 43769: {'lr': 1.9047463501689926e-05, 'samples': 22410240, 'steps': 43769, 'batch_loss/train': 0.7123095728456974} 12/28/2021 17:27:08 - INFO - codeparrot_training - Step 43770: {'lr': 1.9041429030602158e-05, 'samples': 22410752, 'steps': 43770, 'batch_loss/train': 0.7618934931233525} 12/28/2021 17:27:19 - INFO - codeparrot_training - Step 43771: {'lr': 1.903539547771843e-05, 'samples': 22411264, 'steps': 43771, 'batch_loss/train': 0.7183187864720821} 12/28/2021 17:27:31 - INFO - codeparrot_training - Step 43772: {'lr': 1.9029362843062782e-05, 'samples': 22411776, 'steps': 43772, 'batch_loss/train': 0.766173031181097} 12/28/2021 17:27:42 - 
INFO - codeparrot_training - Step 43773: {'lr': 1.902333112665916e-05, 'samples': 22412288, 'steps': 43773, 'batch_loss/train': 0.8095325054600835} 12/28/2021 17:27:52 - INFO - codeparrot_training - Step 43774: {'lr': 1.901730032853158e-05, 'samples': 22412800, 'steps': 43774, 'batch_loss/train': 0.6731307714653667} 12/28/2021 17:28:07 - INFO - codeparrot_training - Step 43775: {'lr': 1.9011270448704015e-05, 'samples': 22413312, 'steps': 43775, 'batch_loss/train': 0.6972935581579804} 12/28/2021 17:28:17 - INFO - codeparrot_training - Step 43776: {'lr': 1.900524148720037e-05, 'samples': 22413824, 'steps': 43776, 'batch_loss/train': 0.7214920681435615} 12/28/2021 17:28:28 - INFO - codeparrot_training - Step 43777: {'lr': 1.8999213444044705e-05, 'samples': 22414336, 'steps': 43777, 'batch_loss/train': 0.746470239944756} 12/28/2021 17:28:40 - INFO - codeparrot_training - Step 43778: {'lr': 1.8993186319260946e-05, 'samples': 22414848, 'steps': 43778, 'batch_loss/train': 0.7929914202541113} 12/28/2021 17:28:51 - INFO - codeparrot_training - Step 43779: {'lr': 1.8987160112872992e-05, 'samples': 22415360, 'steps': 43779, 'batch_loss/train': 0.7315123528242111} 12/28/2021 17:29:01 - INFO - codeparrot_training - Step 43780: {'lr': 1.8981134824904933e-05, 'samples': 22415872, 'steps': 43780, 'batch_loss/train': 0.7386629628017545} 12/28/2021 17:29:12 - INFO - codeparrot_training - Step 43781: {'lr': 1.8975110455380638e-05, 'samples': 22416384, 'steps': 43781, 'batch_loss/train': 0.7810899242758751} 12/28/2021 17:29:26 - INFO - codeparrot_training - Step 43782: {'lr': 1.8969087004324065e-05, 'samples': 22416896, 'steps': 43782, 'batch_loss/train': 0.8669226607307792} 12/28/2021 17:29:37 - INFO - codeparrot_training - Step 43783: {'lr': 1.896306447175916e-05, 'samples': 22417408, 'steps': 43783, 'batch_loss/train': 0.7408436150290072} 12/28/2021 17:29:47 - INFO - codeparrot_training - Step 43784: {'lr': 1.8957042857709855e-05, 'samples': 22417920, 'steps': 43784, 
'batch_loss/train': 0.6888241074047983} 12/28/2021 17:29:59 - INFO - codeparrot_training - Step 43785: {'lr': 1.8951022162200154e-05, 'samples': 22418432, 'steps': 43785, 'batch_loss/train': 0.7154637947678566} 12/28/2021 17:30:10 - INFO - codeparrot_training - Step 43786: {'lr': 1.8945002385253957e-05, 'samples': 22418944, 'steps': 43786, 'batch_loss/train': 0.5718978389631957} 12/28/2021 17:30:21 - INFO - codeparrot_training - Step 43787: {'lr': 1.8938983526895103e-05, 'samples': 22419456, 'steps': 43787, 'batch_loss/train': 0.7757075749104843} 12/28/2021 17:30:33 - INFO - codeparrot_training - Step 43788: {'lr': 1.893296558714766e-05, 'samples': 22419968, 'steps': 43788, 'batch_loss/train': 0.5659619495272636} 12/28/2021 17:30:43 - INFO - codeparrot_training - Step 43789: {'lr': 1.8926948566035552e-05, 'samples': 22420480, 'steps': 43789, 'batch_loss/train': 0.6714893179014325} 12/28/2021 17:30:54 - INFO - codeparrot_training - Step 43790: {'lr': 1.8920932463582536e-05, 'samples': 22420992, 'steps': 43790, 'batch_loss/train': 0.706766554620117} 12/28/2021 17:31:05 - INFO - codeparrot_training - Step 43791: {'lr': 1.8914917279812703e-05, 'samples': 22421504, 'steps': 43791, 'batch_loss/train': 0.8232260039076209} 12/28/2021 17:31:19 - INFO - codeparrot_training - Step 43792: {'lr': 1.890890301474993e-05, 'samples': 22422016, 'steps': 43792, 'batch_loss/train': 0.7311509647406638} 12/28/2021 17:31:29 - INFO - codeparrot_training - Step 43793: {'lr': 1.8902889668418078e-05, 'samples': 22422528, 'steps': 43793, 'batch_loss/train': 0.7849465254694223} 12/28/2021 17:31:40 - INFO - codeparrot_training - Step 43794: {'lr': 1.889687724084102e-05, 'samples': 22423040, 'steps': 43794, 'batch_loss/train': 0.6781511658336967} 12/28/2021 17:31:52 - INFO - codeparrot_training - Step 43795: {'lr': 1.8890865732042824e-05, 'samples': 22423552, 'steps': 43795, 'batch_loss/train': 0.6685969200916588} 12/28/2021 17:32:03 - INFO - codeparrot_training - Step 43796: {'lr': 
1.8884855142047215e-05, 'samples': 22424064, 'steps': 43796, 'batch_loss/train': 0.7120685032568872} 12/28/2021 17:32:13 - INFO - codeparrot_training - Step 43797: {'lr': 1.8878845470878175e-05, 'samples': 22424576, 'steps': 43797, 'batch_loss/train': 0.8332055490463972} 12/28/2021 17:32:25 - INFO - codeparrot_training - Step 43798: {'lr': 1.887283671855955e-05, 'samples': 22425088, 'steps': 43798, 'batch_loss/train': 0.6764198071323335} 12/28/2021 17:32:36 - INFO - codeparrot_training - Step 43799: {'lr': 1.8866828885115288e-05, 'samples': 22425600, 'steps': 43799, 'batch_loss/train': 0.7136989813297987} 12/28/2021 17:32:47 - INFO - codeparrot_training - Step 43800: {'lr': 1.8860821970569208e-05, 'samples': 22426112, 'steps': 43800, 'batch_loss/train': 0.6868313438026235} 12/28/2021 17:33:01 - INFO - codeparrot_training - Step 43801: {'lr': 1.885481597494529e-05, 'samples': 22426624, 'steps': 43801, 'batch_loss/train': 0.7362777683883905} 12/28/2021 17:33:11 - INFO - codeparrot_training - Step 43802: {'lr': 1.884881089826726e-05, 'samples': 22427136, 'steps': 43802, 'batch_loss/train': 0.721529901959002} 12/28/2021 17:33:22 - INFO - codeparrot_training - Step 43803: {'lr': 1.8842806740559132e-05, 'samples': 22427648, 'steps': 43803, 'batch_loss/train': 0.7540623038075864} 12/28/2021 17:33:33 - INFO - codeparrot_training - Step 43804: {'lr': 1.8836803501844745e-05, 'samples': 22428160, 'steps': 43804, 'batch_loss/train': 0.8614094201475382} 12/28/2021 17:33:45 - INFO - codeparrot_training - Step 43805: {'lr': 1.883080118214786e-05, 'samples': 22428672, 'steps': 43805, 'batch_loss/train': 0.8555084522813559} 12/28/2021 17:33:55 - INFO - codeparrot_training - Step 43806: {'lr': 1.8824799781492513e-05, 'samples': 22429184, 'steps': 43806, 'batch_loss/train': 0.5995352268218994} 12/28/2021 17:34:06 - INFO - codeparrot_training - Step 43807: {'lr': 1.8818799299902434e-05, 'samples': 22429696, 'steps': 43807, 'batch_loss/train': 0.8265802939422429} 12/28/2021 17:34:18 - 
INFO - codeparrot_training - Step 43808: {'lr': 1.881279973740152e-05, 'samples': 22430208, 'steps': 43808, 'batch_loss/train': 0.742986022029072} 12/28/2021 17:34:29 - INFO - codeparrot_training - Step 43809: {'lr': 1.8806801094013644e-05, 'samples': 22430720, 'steps': 43809, 'batch_loss/train': 0.48068901570513844} 12/28/2021 17:34:39 - INFO - codeparrot_training - Step 43810: {'lr': 1.8800803369762587e-05, 'samples': 22431232, 'steps': 43810, 'batch_loss/train': 0.676874364289688} 12/28/2021 17:34:52 - INFO - codeparrot_training - Step 43811: {'lr': 1.8794806564672252e-05, 'samples': 22431744, 'steps': 43811, 'batch_loss/train': 0.7241416499018669} 12/28/2021 17:35:02 - INFO - codeparrot_training - Step 43812: {'lr': 1.878881067876653e-05, 'samples': 22432256, 'steps': 43812, 'batch_loss/train': 0.7440331694670022} 12/28/2021 17:35:13 - INFO - codeparrot_training - Step 43813: {'lr': 1.878281571206908e-05, 'samples': 22432768, 'steps': 43813, 'batch_loss/train': 0.7609851198503748} 12/28/2021 17:35:27 - INFO - codeparrot_training - Step 43814: {'lr': 1.8776821664603928e-05, 'samples': 22433280, 'steps': 43814, 'batch_loss/train': 0.682246645912528} 12/28/2021 17:35:38 - INFO - codeparrot_training - Step 43815: {'lr': 1.8770828536394835e-05, 'samples': 22433792, 'steps': 43815, 'batch_loss/train': 0.5948410737328231} 12/28/2021 17:35:48 - INFO - codeparrot_training - Step 43816: {'lr': 1.8764836327465562e-05, 'samples': 22434304, 'steps': 43816, 'batch_loss/train': 0.7419239450246096} 12/28/2021 17:35:59 - INFO - codeparrot_training - Step 43817: {'lr': 1.8758845037840005e-05, 'samples': 22434816, 'steps': 43817, 'batch_loss/train': 0.7369029903784394} 12/28/2021 17:36:11 - INFO - codeparrot_training - Step 43818: {'lr': 1.8752854667542036e-05, 'samples': 22435328, 'steps': 43818, 'batch_loss/train': 0.8417036049067974} 12/28/2021 17:36:22 - INFO - codeparrot_training - Step 43819: {'lr': 1.8746865216595326e-05, 'samples': 22435840, 'steps': 43819, 
'batch_loss/train': 0.9964025979861617} 12/28/2021 17:36:32 - INFO - codeparrot_training - Step 43820: {'lr': 1.874087668502375e-05, 'samples': 22436352, 'steps': 43820, 'batch_loss/train': 0.7567547773942351} 12/28/2021 17:36:46 - INFO - codeparrot_training - Step 43821: {'lr': 1.8734889072851202e-05, 'samples': 22436864, 'steps': 43821, 'batch_loss/train': 0.7012965672183782} 12/28/2021 17:36:57 - INFO - codeparrot_training - Step 43822: {'lr': 1.8728902380101358e-05, 'samples': 22437376, 'steps': 43822, 'batch_loss/train': 0.7053936589509249} 12/28/2021 17:37:08 - INFO - codeparrot_training - Step 43823: {'lr': 1.872291660679809e-05, 'samples': 22437888, 'steps': 43823, 'batch_loss/train': 0.9040170991793275} 12/28/2021 17:37:20 - INFO - codeparrot_training - Step 43824: {'lr': 1.8716931752965183e-05, 'samples': 22438400, 'steps': 43824, 'batch_loss/train': 0.746601689606905} 12/28/2021 17:37:31 - INFO - codeparrot_training - Step 43825: {'lr': 1.8710947818626396e-05, 'samples': 22438912, 'steps': 43825, 'batch_loss/train': 0.8146760929375887} 12/28/2021 17:37:41 - INFO - codeparrot_training - Step 43826: {'lr': 1.870496480380557e-05, 'samples': 22439424, 'steps': 43826, 'batch_loss/train': 0.7012322078226134} 12/28/2021 17:37:52 - INFO - codeparrot_training - Step 43827: {'lr': 1.869898270852649e-05, 'samples': 22439936, 'steps': 43827, 'batch_loss/train': 0.614471772685647} 12/28/2021 17:38:04 - INFO - codeparrot_training - Step 43828: {'lr': 1.8693001532812894e-05, 'samples': 22440448, 'steps': 43828, 'batch_loss/train': 0.6500683519989252} 12/28/2021 17:38:15 - INFO - codeparrot_training - Step 43829: {'lr': 1.8687021276688587e-05, 'samples': 22440960, 'steps': 43829, 'batch_loss/train': 0.755784903652966} 12/28/2021 17:38:25 - INFO - codeparrot_training - Step 43830: {'lr': 1.868104194017739e-05, 'samples': 22441472, 'steps': 43830, 'batch_loss/train': 0.8110726804006845} 12/28/2021 17:38:39 - INFO - codeparrot_training - Step 43831: {'lr': 
1.8675063523302947e-05, 'samples': 22441984, 'steps': 43831, 'batch_loss/train': 0.7302573919296265} 12/28/2021 17:38:50 - INFO - codeparrot_training - Step 43832: {'lr': 1.866908602608919e-05, 'samples': 22442496, 'steps': 43832, 'batch_loss/train': 0.8060567118227482} 12/28/2021 17:39:00 - INFO - codeparrot_training - Step 43833: {'lr': 1.8663109448559757e-05, 'samples': 22443008, 'steps': 43833, 'batch_loss/train': 0.7778100473806262} 12/28/2021 17:39:12 - INFO - codeparrot_training - Step 43834: {'lr': 1.865713379073844e-05, 'samples': 22443520, 'steps': 43834, 'batch_loss/train': 0.7322870097123086} 12/28/2021 17:39:23 - INFO - codeparrot_training - Step 43835: {'lr': 1.8651159052648996e-05, 'samples': 22444032, 'steps': 43835, 'batch_loss/train': 0.7876257095485926} 12/28/2021 17:39:34 - INFO - codeparrot_training - Step 43836: {'lr': 1.8645185234315213e-05, 'samples': 22444544, 'steps': 43836, 'batch_loss/train': 0.6685505895875394} 12/28/2021 17:39:44 - INFO - codeparrot_training - Step 43837: {'lr': 1.8639212335760823e-05, 'samples': 22445056, 'steps': 43837, 'batch_loss/train': 0.6978654148988426} 12/28/2021 17:39:58 - INFO - codeparrot_training - Step 43838: {'lr': 1.8633240357009557e-05, 'samples': 22445568, 'steps': 43838, 'batch_loss/train': 0.6038864581205416} 12/28/2021 17:40:09 - INFO - codeparrot_training - Step 43839: {'lr': 1.862726929808517e-05, 'samples': 22446080, 'steps': 43839, 'batch_loss/train': 0.7656259571667761} 12/28/2021 17:40:19 - INFO - codeparrot_training - Step 43840: {'lr': 1.8621299159011396e-05, 'samples': 22446592, 'steps': 43840, 'batch_loss/train': 0.6894454666180536} 12/28/2021 17:40:32 - INFO - codeparrot_training - Step 43841: {'lr': 1.861532993981202e-05, 'samples': 22447104, 'steps': 43841, 'batch_loss/train': 0.6142770652659237} 12/28/2021 17:40:42 - INFO - codeparrot_training - Step 43842: {'lr': 1.860936164051061e-05, 'samples': 22447616, 'steps': 43842, 'batch_loss/train': 0.7297738939523697} 12/28/2021 17:40:53 - 
INFO - codeparrot_training - Step 43843: {'lr': 1.8603394261131085e-05, 'samples': 22448128, 'steps': 43843, 'batch_loss/train': 0.6767819102387875} 12/28/2021 17:41:05 - INFO - codeparrot_training - Step 43844: {'lr': 1.8597427801697125e-05, 'samples': 22448640, 'steps': 43844, 'batch_loss/train': 0.6502404562197626} 12/28/2021 17:41:16 - INFO - codeparrot_training - Step 43845: {'lr': 1.8591462262232377e-05, 'samples': 22449152, 'steps': 43845, 'batch_loss/train': 0.8553340826183558} 12/28/2021 17:41:27 - INFO - codeparrot_training - Step 43846: {'lr': 1.8585497642760546e-05, 'samples': 22449664, 'steps': 43846, 'batch_loss/train': 1.0293504928704351} 12/28/2021 17:41:39 - INFO - codeparrot_training - Step 43847: {'lr': 1.857953394330547e-05, 'samples': 22450176, 'steps': 43847, 'batch_loss/train': 0.8040087958797812} 12/28/2021 17:41:50 - INFO - codeparrot_training - Step 43848: {'lr': 1.8573571163890774e-05, 'samples': 22450688, 'steps': 43848, 'batch_loss/train': 0.7392540879081935} 12/28/2021 17:42:00 - INFO - codeparrot_training - Step 43849: {'lr': 1.8567609304540155e-05, 'samples': 22451200, 'steps': 43849, 'batch_loss/train': 0.725763919763267} 12/28/2021 17:42:11 - INFO - codeparrot_training - Step 43850: {'lr': 1.856164836527735e-05, 'samples': 22451712, 'steps': 43850, 'batch_loss/train': 0.6883806046098471} 12/28/2021 17:42:25 - INFO - codeparrot_training - Step 43851: {'lr': 1.855568834612603e-05, 'samples': 22452224, 'steps': 43851, 'batch_loss/train': 0.8196455026045442} 12/28/2021 17:42:36 - INFO - codeparrot_training - Step 43852: {'lr': 1.854972924710993e-05, 'samples': 22452736, 'steps': 43852, 'batch_loss/train': 0.6889845561236143} 12/28/2021 17:42:46 - INFO - codeparrot_training - Step 43853: {'lr': 1.8543771068252696e-05, 'samples': 22453248, 'steps': 43853, 'batch_loss/train': 0.7093792604282498} 12/28/2021 17:42:58 - INFO - codeparrot_training - Step 43854: {'lr': 1.853781380957803e-05, 'samples': 22453760, 'steps': 43854, 
'batch_loss/train': 0.7676394548034295} 12/28/2021 17:43:09 - INFO - codeparrot_training - Step 43855: {'lr': 1.853185747110961e-05, 'samples': 22454272, 'steps': 43855, 'batch_loss/train': 0.7667479705996811} 12/28/2021 17:43:19 - INFO - codeparrot_training - Step 43856: {'lr': 1.852590205287119e-05, 'samples': 22454784, 'steps': 43856, 'batch_loss/train': 0.7474100929684937} 12/28/2021 17:43:32 - INFO - codeparrot_training - Step 43857: {'lr': 1.8519947554886283e-05, 'samples': 22455296, 'steps': 43857, 'batch_loss/train': 0.7024413379840553} 12/28/2021 17:43:42 - INFO - codeparrot_training - Step 43858: {'lr': 1.851399397717876e-05, 'samples': 22455808, 'steps': 43858, 'batch_loss/train': 0.5229155708802864} 12/28/2021 17:43:53 - INFO - codeparrot_training - Step 43859: {'lr': 1.8508041319772127e-05, 'samples': 22456320, 'steps': 43859, 'batch_loss/train': 0.7213924992829561} 12/28/2021 17:44:03 - INFO - codeparrot_training - Step 43860: {'lr': 1.8502089582690146e-05, 'samples': 22456832, 'steps': 43860, 'batch_loss/train': 0.8022694699466228} 12/28/2021 17:44:17 - INFO - codeparrot_training - Step 43861: {'lr': 1.8496138765956405e-05, 'samples': 22457344, 'steps': 43861, 'batch_loss/train': 0.6946917860768735} 12/28/2021 17:44:28 - INFO - codeparrot_training - Step 43862: {'lr': 1.8490188869594637e-05, 'samples': 22457856, 'steps': 43862, 'batch_loss/train': 0.7772121336311102} 12/28/2021 17:44:39 - INFO - codeparrot_training - Step 43863: {'lr': 1.8484239893628462e-05, 'samples': 22458368, 'steps': 43863, 'batch_loss/train': 0.8024026500061154} 12/28/2021 17:44:51 - INFO - codeparrot_training - Step 43864: {'lr': 1.84782918380815e-05, 'samples': 22458880, 'steps': 43864, 'batch_loss/train': 0.7936564185656607} 12/28/2021 17:45:01 - INFO - codeparrot_training - Step 43865: {'lr': 1.8472344702977457e-05, 'samples': 22459392, 'steps': 43865, 'batch_loss/train': 0.7411293145269156} 12/28/2021 17:45:12 - INFO - codeparrot_training - Step 43866: {'lr': 
1.8466398488339947e-05, 'samples': 22459904, 'steps': 43866, 'batch_loss/train': 0.7054022289812565} 12/28/2021 17:45:26 - INFO - codeparrot_training - Step 43867: {'lr': 1.846045319419265e-05, 'samples': 22460416, 'steps': 43867, 'batch_loss/train': 0.7662192992866039} 12/28/2021 17:45:37 - INFO - codeparrot_training - Step 43868: {'lr': 1.84545088205591e-05, 'samples': 22460928, 'steps': 43868, 'batch_loss/train': 0.7118016095482744} 12/28/2021 17:45:47 - INFO - codeparrot_training - Step 43869: {'lr': 1.8448565367463034e-05, 'samples': 22461440, 'steps': 43869, 'batch_loss/train': 0.7118379813618958} 12/28/2021 17:45:59 - INFO - codeparrot_training - Step 43870: {'lr': 1.8442622834928063e-05, 'samples': 22461952, 'steps': 43870, 'batch_loss/train': 0.806603180244565} 12/28/2021 17:46:10 - INFO - codeparrot_training - Step 43871: {'lr': 1.8436681222977758e-05, 'samples': 22462464, 'steps': 43871, 'batch_loss/train': 0.7959309481084347} 12/28/2021 17:46:21 - INFO - codeparrot_training - Step 43872: {'lr': 1.8430740531635736e-05, 'samples': 22462976, 'steps': 43872, 'batch_loss/train': 0.7141621774062514} 12/28/2021 17:46:31 - INFO - codeparrot_training - Step 43873: {'lr': 1.842480076092573e-05, 'samples': 22463488, 'steps': 43873, 'batch_loss/train': 0.6299356808885932} 12/28/2021 17:46:43 - INFO - codeparrot_training - Step 43874: {'lr': 1.841886191087122e-05, 'samples': 22464000, 'steps': 43874, 'batch_loss/train': 0.7257975321263075} 12/28/2021 17:46:54 - INFO - codeparrot_training - Step 43875: {'lr': 1.841292398149591e-05, 'samples': 22464512, 'steps': 43875, 'batch_loss/train': 0.7256235741078854} 12/28/2021 17:47:04 - INFO - codeparrot_training - Step 43876: {'lr': 1.8406986972823337e-05, 'samples': 22465024, 'steps': 43876, 'batch_loss/train': 0.7026932234875858} 12/28/2021 17:47:18 - INFO - codeparrot_training - Step 43877: {'lr': 1.8401050884877145e-05, 'samples': 22465536, 'steps': 43877, 'batch_loss/train': 0.770290276966989} 12/28/2021 17:47:29 - 
INFO - codeparrot_training - Step 43878: {'lr': 1.8395115717680934e-05, 'samples': 22466048, 'steps': 43878, 'batch_loss/train': 0.8062632102519274} 12/28/2021 17:47:40 - INFO - codeparrot_training - Step 43879: {'lr': 1.8389181471258288e-05, 'samples': 22466560, 'steps': 43879, 'batch_loss/train': 0.6697713097091764} 12/28/2021 17:47:52 - INFO - codeparrot_training - Step 43880: {'lr': 1.8383248145632802e-05, 'samples': 22467072, 'steps': 43880, 'batch_loss/train': 0.6853900225833058} 12/28/2021 17:48:03 - INFO - codeparrot_training - Step 43881: {'lr': 1.8377315740828072e-05, 'samples': 22467584, 'steps': 43881, 'batch_loss/train': 0.9033179981634021} 12/28/2021 17:48:13 - INFO - codeparrot_training - Step 43882: {'lr': 1.837138425686774e-05, 'samples': 22468096, 'steps': 43882, 'batch_loss/train': 0.6073418925516307} 12/28/2021 17:48:24 - INFO - codeparrot_training - Step 43883: {'lr': 1.8365453693775207e-05, 'samples': 22468608, 'steps': 43883, 'batch_loss/train': 0.6618920126929879} 12/28/2021 17:48:36 - INFO - codeparrot_training - Step 43884: {'lr': 1.8359524051574232e-05, 'samples': 22469120, 'steps': 43884, 'batch_loss/train': 0.8369752466678619} 12/28/2021 17:48:47 - INFO - codeparrot_training - Step 43885: {'lr': 1.8353595330288354e-05, 'samples': 22469632, 'steps': 43885, 'batch_loss/train': 0.6932432467583567} 12/28/2021 17:48:57 - INFO - codeparrot_training - Step 43886: {'lr': 1.8347667529941075e-05, 'samples': 22470144, 'steps': 43886, 'batch_loss/train': 0.7861136039718986} 12/28/2021 17:49:11 - INFO - codeparrot_training - Step 43887: {'lr': 1.8341740650555995e-05, 'samples': 22470656, 'steps': 43887, 'batch_loss/train': 0.7523678508587182} 12/28/2021 17:49:22 - INFO - codeparrot_training - Step 43888: {'lr': 1.83358146921567e-05, 'samples': 22471168, 'steps': 43888, 'batch_loss/train': 0.6213970193639398} 12/28/2021 17:49:32 - INFO - codeparrot_training - Step 43889: {'lr': 1.8329889654766702e-05, 'samples': 22471680, 'steps': 43889, 
'batch_loss/train': 0.8045492940582335} 12/28/2021 17:49:44 - INFO - codeparrot_training - Step 43890: {'lr': 1.8323965538409622e-05, 'samples': 22472192, 'steps': 43890, 'batch_loss/train': 0.7141005108132958} 12/28/2021 17:49:55 - INFO - codeparrot_training - Step 43891: {'lr': 1.8318042343108965e-05, 'samples': 22472704, 'steps': 43891, 'batch_loss/train': 0.7022842082660645} 12/28/2021 17:50:06 - INFO - codeparrot_training - Step 43892: {'lr': 1.83121200688883e-05, 'samples': 22473216, 'steps': 43892, 'batch_loss/train': 0.7185953832231462} 12/28/2021 17:50:16 - INFO - codeparrot_training - Step 43893: {'lr': 1.8306198715771134e-05, 'samples': 22473728, 'steps': 43893, 'batch_loss/train': 0.8697312297299504} 12/28/2021 17:50:28 - INFO - codeparrot_training - Step 43894: {'lr': 1.8300278283781057e-05, 'samples': 22474240, 'steps': 43894, 'batch_loss/train': 0.6356533863581717} 12/28/2021 17:50:39 - INFO - codeparrot_training - Step 43895: {'lr': 1.829435877294161e-05, 'samples': 22474752, 'steps': 43895, 'batch_loss/train': 0.7617720370180905} 12/28/2021 17:50:50 - INFO - codeparrot_training - Step 43896: {'lr': 1.8288440183276302e-05, 'samples': 22475264, 'steps': 43896, 'batch_loss/train': 0.7324523311108351} 12/28/2021 17:51:04 - INFO - codeparrot_training - Step 43897: {'lr': 1.8282522514808665e-05, 'samples': 22475776, 'steps': 43897, 'batch_loss/train': 0.7739135585725307} 12/28/2021 17:51:15 - INFO - codeparrot_training - Step 43898: {'lr': 1.8276605767562155e-05, 'samples': 22476288, 'steps': 43898, 'batch_loss/train': 1.033289611339569} 12/28/2021 17:51:25 - INFO - codeparrot_training - Step 43899: {'lr': 1.8270689941560446e-05, 'samples': 22476800, 'steps': 43899, 'batch_loss/train': 0.8285173873882741} 12/28/2021 17:51:38 - INFO - codeparrot_training - Step 43900: {'lr': 1.826477503682697e-05, 'samples': 22477312, 'steps': 43900, 'batch_loss/train': 0.6635183799080551} 12/28/2021 17:51:48 - INFO - codeparrot_training - Step 43901: {'lr': 
1.8258861053385172e-05, 'samples': 22477824, 'steps': 43901, 'batch_loss/train': 0.7250218358822167} 12/28/2021 17:51:59 - INFO - codeparrot_training - Step 43902: {'lr': 1.8252947991258732e-05, 'samples': 22478336, 'steps': 43902, 'batch_loss/train': 0.7025320022366941} 12/28/2021 17:52:09 - INFO - codeparrot_training - Step 43903: {'lr': 1.8247035850471023e-05, 'samples': 22478848, 'steps': 43903, 'batch_loss/train': 0.6613040022784844} 12/28/2021 17:52:22 - INFO - codeparrot_training - Step 43904: {'lr': 1.8241124631045603e-05, 'samples': 22479360, 'steps': 43904, 'batch_loss/train': 0.7640813761390746} 12/28/2021 17:52:32 - INFO - codeparrot_training - Step 43905: {'lr': 1.8235214333005958e-05, 'samples': 22479872, 'steps': 43905, 'batch_loss/train': 0.7343862212146632} 12/28/2021 17:52:43 - INFO - codeparrot_training - Step 43906: {'lr': 1.8229304956375593e-05, 'samples': 22480384, 'steps': 43906, 'batch_loss/train': 0.8082115133292973} 12/28/2021 17:52:57 - INFO - codeparrot_training - Step 43907: {'lr': 1.8223396501177992e-05, 'samples': 22480896, 'steps': 43907, 'batch_loss/train': 0.7579803792759776} 12/28/2021 17:53:08 - INFO - codeparrot_training - Step 43908: {'lr': 1.8217488967436692e-05, 'samples': 22481408, 'steps': 43908, 'batch_loss/train': 0.8043681737035513} 12/28/2021 17:53:18 - INFO - codeparrot_training - Step 43909: {'lr': 1.821158235517506e-05, 'samples': 22481920, 'steps': 43909, 'batch_loss/train': 0.7190670198760927} 12/28/2021 17:53:30 - INFO - codeparrot_training - Step 43910: {'lr': 1.820567666441669e-05, 'samples': 22482432, 'steps': 43910, 'batch_loss/train': 0.6854786032345146} 12/28/2021 17:53:41 - INFO - codeparrot_training - Step 43911: {'lr': 1.8199771895185068e-05, 'samples': 22482944, 'steps': 43911, 'batch_loss/train': 0.7300583044998348} 12/28/2021 17:53:52 - INFO - codeparrot_training - Step 43912: {'lr': 1.8193868047503615e-05, 'samples': 22483456, 'steps': 43912, 'batch_loss/train': 0.742433762177825} 12/28/2021 17:54:02 
- INFO - codeparrot_training - Step 43913: {'lr': 1.8187965121395783e-05, 'samples': 22483968, 'steps': 43913, 'batch_loss/train': 0.7614028798416257} 12/28/2021 17:54:16 - INFO - codeparrot_training - Step 43914: {'lr': 1.8182063116885084e-05, 'samples': 22484480, 'steps': 43914, 'batch_loss/train': 0.7084021957125515} 12/28/2021 17:54:27 - INFO - codeparrot_training - Step 43915: {'lr': 1.817616203399497e-05, 'samples': 22484992, 'steps': 43915, 'batch_loss/train': 0.772827104665339} 12/28/2021 17:54:37 - INFO - codeparrot_training - Step 43916: {'lr': 1.8170261872748923e-05, 'samples': 22485504, 'steps': 43916, 'batch_loss/train': 0.6865394607884809} 12/28/2021 17:54:50 - INFO - codeparrot_training - Step 43917: {'lr': 1.816436263317034e-05, 'samples': 22486016, 'steps': 43917, 'batch_loss/train': 0.6628643297590315} 12/28/2021 17:55:00 - INFO - codeparrot_training - Step 43918: {'lr': 1.8158464315282757e-05, 'samples': 22486528, 'steps': 43918, 'batch_loss/train': 0.6627261482644826} 12/28/2021 17:55:11 - INFO - codeparrot_training - Step 43919: {'lr': 1.815256691910955e-05, 'samples': 22487040, 'steps': 43919, 'batch_loss/train': 0.79127954877913} 12/28/2021 17:55:23 - INFO - codeparrot_training - Step 43920: {'lr': 1.814667044467419e-05, 'samples': 22487552, 'steps': 43920, 'batch_loss/train': 1.3472102604282554} 12/28/2021 17:55:34 - INFO - codeparrot_training - Step 43921: {'lr': 1.814077489200014e-05, 'samples': 22488064, 'steps': 43921, 'batch_loss/train': 0.6902215210720897} 12/28/2021 17:55:44 - INFO - codeparrot_training - Step 43922: {'lr': 1.8134880261110876e-05, 'samples': 22488576, 'steps': 43922, 'batch_loss/train': 0.5829236574936658} 12/28/2021 17:55:55 - INFO - codeparrot_training - Step 43923: {'lr': 1.8128986552029712e-05, 'samples': 22489088, 'steps': 43923, 'batch_loss/train': 0.5538990045897663} 12/28/2021 17:56:07 - INFO - codeparrot_training - Step 43924: {'lr': 1.812309376478011e-05, 'samples': 22489600, 'steps': 43924, 
'batch_loss/train': 0.7264601979404688} 12/28/2021 17:56:18 - INFO - codeparrot_training - Step 43925: {'lr': 1.811720189938562e-05, 'samples': 22490112, 'steps': 43925, 'batch_loss/train': 0.7681647376157343} 12/28/2021 17:56:28 - INFO - codeparrot_training - Step 43926: {'lr': 1.8111310955869543e-05, 'samples': 22490624, 'steps': 43926, 'batch_loss/train': 0.6164371364284307} 12/28/2021 17:56:43 - INFO - codeparrot_training - Step 43927: {'lr': 1.8105420934255297e-05, 'samples': 22491136, 'steps': 43927, 'batch_loss/train': 0.8505219703074545} 12/28/2021 17:56:53 - INFO - codeparrot_training - Step 43928: {'lr': 1.8099531834566418e-05, 'samples': 22491648, 'steps': 43928, 'batch_loss/train': 0.8517875140532851} 12/28/2021 17:57:04 - INFO - codeparrot_training - Step 43929: {'lr': 1.809364365682617e-05, 'samples': 22492160, 'steps': 43929, 'batch_loss/train': 0.620694070123136} 12/28/2021 17:57:16 - INFO - codeparrot_training - Step 43930: {'lr': 1.808775640105806e-05, 'samples': 22492672, 'steps': 43930, 'batch_loss/train': 0.7820970311295241} 12/28/2021 17:57:27 - INFO - codeparrot_training - Step 43931: {'lr': 1.8081870067285456e-05, 'samples': 22493184, 'steps': 43931, 'batch_loss/train': 0.8917777645401657} 12/28/2021 17:57:37 - INFO - codeparrot_training - Step 43932: {'lr': 1.8075984655531785e-05, 'samples': 22493696, 'steps': 43932, 'batch_loss/train': 0.8388322577811778} 12/28/2021 17:57:48 - INFO - codeparrot_training - Step 43933: {'lr': 1.8070100165820393e-05, 'samples': 22494208, 'steps': 43933, 'batch_loss/train': 0.6828112020157278} 12/28/2021 17:58:01 - INFO - codeparrot_training - Step 43934: {'lr': 1.8064216598174782e-05, 'samples': 22494720, 'steps': 43934, 'batch_loss/train': 0.6607731976546347} 12/28/2021 17:58:11 - INFO - codeparrot_training - Step 43935: {'lr': 1.8058333952618162e-05, 'samples': 22495232, 'steps': 43935, 'batch_loss/train': 0.5710829086601734} 12/28/2021 17:58:22 - INFO - codeparrot_training - Step 43936: {'lr': 
1.8052452229174094e-05, 'samples': 22495744, 'steps': 43936, 'batch_loss/train': 0.5896034324541688} 12/28/2021 17:58:36 - INFO - codeparrot_training - Step 43937: {'lr': 1.8046571427865922e-05, 'samples': 22496256, 'steps': 43937, 'batch_loss/train': 0.7693973025307059} 12/28/2021 17:58:47 - INFO - codeparrot_training - Step 43938: {'lr': 1.8040691548716932e-05, 'samples': 22496768, 'steps': 43938, 'batch_loss/train': 0.7406941815279424} 12/28/2021 17:58:58 - INFO - codeparrot_training - Step 43939: {'lr': 1.8034812591750578e-05, 'samples': 22497280, 'steps': 43939, 'batch_loss/train': 1.0045012449845672} 12/28/2021 17:59:08 - INFO - codeparrot_training - Step 43940: {'lr': 1.8028934556990285e-05, 'samples': 22497792, 'steps': 43940, 'batch_loss/train': 0.6363733462058008} 12/28/2021 17:59:21 - INFO - codeparrot_training - Step 43941: {'lr': 1.802305744445934e-05, 'samples': 22498304, 'steps': 43941, 'batch_loss/train': 0.6942926794290543} 12/28/2021 17:59:31 - INFO - codeparrot_training - Step 43942: {'lr': 1.8017181254181088e-05, 'samples': 22498816, 'steps': 43942, 'batch_loss/train': 0.753554267110303} 12/28/2021 17:59:42 - INFO - codeparrot_training - Step 43943: {'lr': 1.801130598617895e-05, 'samples': 22499328, 'steps': 43943, 'batch_loss/train': 0.6791631744708866} 12/28/2021 17:59:56 - INFO - codeparrot_training - Step 43944: {'lr': 1.8005431640476272e-05, 'samples': 22499840, 'steps': 43944, 'batch_loss/train': 0.7030551908537745} 12/28/2021 18:00:06 - INFO - codeparrot_training - Step 43945: {'lr': 1.799955821709637e-05, 'samples': 22500352, 'steps': 43945, 'batch_loss/train': 0.8800844671204686} 12/28/2021 18:00:17 - INFO - codeparrot_training - Step 43946: {'lr': 1.7993685716062662e-05, 'samples': 22500864, 'steps': 43946, 'batch_loss/train': 0.6873562136897817} 12/28/2021 18:00:29 - INFO - codeparrot_training - Step 43947: {'lr': 1.7987814137398416e-05, 'samples': 22501376, 'steps': 43947, 'batch_loss/train': 0.744465502910316} 12/28/2021 18:00:39 - 
INFO - codeparrot_training - Step 43948: {'lr': 1.7981943481127083e-05, 'samples': 22501888, 'steps': 43948, 'batch_loss/train': 0.7348651071079075} 12/28/2021 18:00:50 - INFO - codeparrot_training - Step 43949: {'lr': 1.7976073747271892e-05, 'samples': 22502400, 'steps': 43949, 'batch_loss/train': 0.7132592761190608} 12/28/2021 18:01:01 - INFO - codeparrot_training - Step 43950: {'lr': 1.7970204935856187e-05, 'samples': 22502912, 'steps': 43950, 'batch_loss/train': 0.7409438602626324} 12/28/2021 18:01:13 - INFO - codeparrot_training - Step 43951: {'lr': 1.7964337046903424e-05, 'samples': 22503424, 'steps': 43951, 'batch_loss/train': 0.7070438580121845} 12/28/2021 18:01:24 - INFO - codeparrot_training - Step 43952: {'lr': 1.7958470080436802e-05, 'samples': 22503936, 'steps': 43952, 'batch_loss/train': 0.7404608263168484} 12/28/2021 18:01:34 - INFO - codeparrot_training - Step 43953: {'lr': 1.7952604036479635e-05, 'samples': 22504448, 'steps': 43953, 'batch_loss/train': 0.6918024807237089} 12/28/2021 18:01:48 - INFO - codeparrot_training - Step 43954: {'lr': 1.7946738915055384e-05, 'samples': 22504960, 'steps': 43954, 'batch_loss/train': 0.62080501858145} 12/28/2021 18:01:59 - INFO - codeparrot_training - Step 43955: {'lr': 1.7940874716187216e-05, 'samples': 22505472, 'steps': 43955, 'batch_loss/train': 0.7964174575172365} 12/28/2021 18:02:10 - INFO - codeparrot_training - Step 43956: {'lr': 1.7935011439898535e-05, 'samples': 22505984, 'steps': 43956, 'batch_loss/train': 0.7633007746189833} 12/28/2021 18:02:22 - INFO - codeparrot_training - Step 43957: {'lr': 1.7929149086212597e-05, 'samples': 22506496, 'steps': 43957, 'batch_loss/train': 0.6913404786027968} 12/28/2021 18:02:32 - INFO - codeparrot_training - Step 43958: {'lr': 1.7923287655152774e-05, 'samples': 22507008, 'steps': 43958, 'batch_loss/train': 0.7297321367077529} 12/28/2021 18:02:43 - INFO - codeparrot_training - Step 43959: {'lr': 1.7917427146742298e-05, 'samples': 22507520, 'steps': 43959, 
'batch_loss/train': 0.6581164395902306} 12/28/2021 18:02:54 - INFO - codeparrot_training - Step 43960: {'lr': 1.7911567561004567e-05, 'samples': 22508032, 'steps': 43960, 'batch_loss/train': 0.6635325835086405} 12/28/2021 18:03:06 - INFO - codeparrot_training - Step 43961: {'lr': 1.7905708897962696e-05, 'samples': 22508544, 'steps': 43961, 'batch_loss/train': 0.7476076101884246} 12/28/2021 18:03:16 - INFO - codeparrot_training - Step 43962: {'lr': 1.7899851157640174e-05, 'samples': 22509056, 'steps': 43962, 'batch_loss/train': 0.6780840209685266} 12/28/2021 18:03:27 - INFO - codeparrot_training - Step 43963: {'lr': 1.7893994340060227e-05, 'samples': 22509568, 'steps': 43963, 'batch_loss/train': 0.6889210809022188} 12/28/2021 18:03:39 - INFO - codeparrot_training - Step 43964: {'lr': 1.7888138445246033e-05, 'samples': 22510080, 'steps': 43964, 'batch_loss/train': 1.0689470228971913} 12/28/2021 18:03:50 - INFO - codeparrot_training - Step 43965: {'lr': 1.788228347322099e-05, 'samples': 22510592, 'steps': 43965, 'batch_loss/train': 0.8148815762251616} 12/28/2021 18:04:01 - INFO - codeparrot_training - Step 43966: {'lr': 1.787642942400841e-05, 'samples': 22511104, 'steps': 43966, 'batch_loss/train': 0.7992057017982006} 12/28/2021 18:04:15 - INFO - codeparrot_training - Step 43967: {'lr': 1.7870576297631446e-05, 'samples': 22511616, 'steps': 43967, 'batch_loss/train': 1.0049192178994417} 12/28/2021 18:04:26 - INFO - codeparrot_training - Step 43968: {'lr': 1.786472409411341e-05, 'samples': 22512128, 'steps': 43968, 'batch_loss/train': 0.7560720397159457} 12/28/2021 18:04:36 - INFO - codeparrot_training - Step 43969: {'lr': 1.7858872813477587e-05, 'samples': 22512640, 'steps': 43969, 'batch_loss/train': 0.7870151326060295} 12/28/2021 18:04:47 - INFO - codeparrot_training - Step 43970: {'lr': 1.7853022455747243e-05, 'samples': 22513152, 'steps': 43970, 'batch_loss/train': 0.747612402192317} 12/28/2021 18:04:59 - INFO - codeparrot_training - Step 43971: {'lr': 
1.7847173020945628e-05, 'samples': 22513664, 'steps': 43971, 'batch_loss/train': 0.6945935555268079} 12/28/2021 18:05:10 - INFO - codeparrot_training - Step 43972: {'lr': 1.784132450909598e-05, 'samples': 22514176, 'steps': 43972, 'batch_loss/train': 0.7322058700956404} 12/28/2021 18:05:20 - INFO - codeparrot_training - Step 43973: {'lr': 1.783547692022158e-05, 'samples': 22514688, 'steps': 43973, 'batch_loss/train': 0.7563880458474159} 12/28/2021 18:05:34 - INFO - codeparrot_training - Step 43974: {'lr': 1.7829630254345642e-05, 'samples': 22515200, 'steps': 43974, 'batch_loss/train': 0.7938450326910242} 12/28/2021 18:05:45 - INFO - codeparrot_training - Step 43975: {'lr': 1.7823784511491443e-05, 'samples': 22515712, 'steps': 43975, 'batch_loss/train': 0.640654010232538} 12/28/2021 18:05:55 - INFO - codeparrot_training - Step 43976: {'lr': 1.7817939691682217e-05, 'samples': 22516224, 'steps': 43976, 'batch_loss/train': 0.7150104407919571} 12/28/2021 18:06:07 - INFO - codeparrot_training - Step 43977: {'lr': 1.7812095794941225e-05, 'samples': 22516736, 'steps': 43977, 'batch_loss/train': 0.7746856939047575} 12/28/2021 18:06:18 - INFO - codeparrot_training - Step 43978: {'lr': 1.780625282129164e-05, 'samples': 22517248, 'steps': 43978, 'batch_loss/train': 0.7954997536726296} 12/28/2021 18:06:29 - INFO - codeparrot_training - Step 43979: {'lr': 1.780041077075667e-05, 'samples': 22517760, 'steps': 43979, 'batch_loss/train': 0.6502470104023814} 12/28/2021 18:06:39 - INFO - codeparrot_training - Step 43980: {'lr': 1.7794569643359683e-05, 'samples': 22518272, 'steps': 43980, 'batch_loss/train': 0.7044253151398152} 12/28/2021 18:06:51 - INFO - codeparrot_training - Step 43981: {'lr': 1.778872943912374e-05, 'samples': 22518784, 'steps': 43981, 'batch_loss/train': 0.7308340505696833} 12/28/2021 18:07:02 - INFO - codeparrot_training - Step 43982: {'lr': 1.7782890158072162e-05, 'samples': 22519296, 'steps': 43982, 'batch_loss/train': 0.7168416767381132} 12/28/2021 18:07:13 - 
INFO - codeparrot_training - Step 43983: {'lr': 1.777705180022812e-05, 'samples': 22519808, 'steps': 43983, 'batch_loss/train': 0.7470676791854203} 12/28/2021 18:07:27 - INFO - codeparrot_training - Step 43984: {'lr': 1.7771214365614824e-05, 'samples': 22520320, 'steps': 43984, 'batch_loss/train': 0.7386877983808517} 12/28/2021 18:07:37 - INFO - codeparrot_training - Step 43985: {'lr': 1.7765377854255528e-05, 'samples': 22520832, 'steps': 43985, 'batch_loss/train': 0.8085194481536746} 12/28/2021 18:07:48 - INFO - codeparrot_training - Step 43986: {'lr': 1.7759542266173405e-05, 'samples': 22521344, 'steps': 43986, 'batch_loss/train': 0.7788183372467756} 12/28/2021 18:08:00 - INFO - codeparrot_training - Step 43987: {'lr': 1.7753707601391585e-05, 'samples': 22521856, 'steps': 43987, 'batch_loss/train': 0.695502744987607} 12/28/2021 18:08:11 - INFO - codeparrot_training - Step 43988: {'lr': 1.7747873859933374e-05, 'samples': 22522368, 'steps': 43988, 'batch_loss/train': 0.6772650857456028} 12/28/2021 18:08:21 - INFO - codeparrot_training - Step 43989: {'lr': 1.774204104182195e-05, 'samples': 22522880, 'steps': 43989, 'batch_loss/train': 0.7459799749776721} 12/28/2021 18:08:34 - INFO - codeparrot_training - Step 43990: {'lr': 1.7736209147080406e-05, 'samples': 22523392, 'steps': 43990, 'batch_loss/train': 0.7634867029264569} 12/28/2021 18:08:44 - INFO - codeparrot_training - Step 43991: {'lr': 1.773037817573203e-05, 'samples': 22523904, 'steps': 43991, 'batch_loss/train': 0.6845857407897711} 12/28/2021 18:08:55 - INFO - codeparrot_training - Step 43992: {'lr': 1.7724548127800027e-05, 'samples': 22524416, 'steps': 43992, 'batch_loss/train': 0.739385865163058} 12/28/2021 18:09:06 - INFO - codeparrot_training - Step 43993: {'lr': 1.771871900330746e-05, 'samples': 22524928, 'steps': 43993, 'batch_loss/train': 0.7809376730583608} 12/28/2021 18:09:19 - INFO - codeparrot_training - Step 43994: {'lr': 1.771289080227756e-05, 'samples': 22525440, 'steps': 43994, 
'batch_loss/train': 0.6919710584916174} 12/28/2021 18:09:29 - INFO - codeparrot_training - Step 43995: {'lr': 1.7707063524733476e-05, 'samples': 22525952, 'steps': 43995, 'batch_loss/train': 0.7381732212379575} 12/28/2021 18:09:40 - INFO - codeparrot_training - Step 43996: {'lr': 1.7701237170698437e-05, 'samples': 22526464, 'steps': 43996, 'batch_loss/train': 0.7384666632860899} 12/28/2021 18:09:53 - INFO - codeparrot_training - Step 43997: {'lr': 1.769541174019554e-05, 'samples': 22526976, 'steps': 43997, 'batch_loss/train': 0.705515009118244} 12/28/2021 18:10:03 - INFO - codeparrot_training - Step 43998: {'lr': 1.7689587233247983e-05, 'samples': 22527488, 'steps': 43998, 'batch_loss/train': 0.6736034385394305} 12/28/2021 18:10:14 - INFO - codeparrot_training - Step 43999: {'lr': 1.7683763649878894e-05, 'samples': 22528000, 'steps': 43999, 'batch_loss/train': 0.8033771547488868} 12/28/2021 18:10:28 - INFO - codeparrot_training - Step 44000: {'lr': 1.7677940990111467e-05, 'samples': 22528512, 'steps': 44000, 'batch_loss/train': 1.238082847557962} 12/28/2021 18:10:39 - INFO - codeparrot_training - Step 44001: {'lr': 1.7672119253968804e-05, 'samples': 22529024, 'steps': 44001, 'batch_loss/train': 0.733747489284724} 12/28/2021 18:10:49 - INFO - codeparrot_training - Step 44002: {'lr': 1.7666298441474075e-05, 'samples': 22529536, 'steps': 44002, 'batch_loss/train': 0.6579249378410168} 12/28/2021 18:11:00 - INFO - codeparrot_training - Step 44003: {'lr': 1.766047855265046e-05, 'samples': 22530048, 'steps': 44003, 'batch_loss/train': 0.9541194830089808} 12/28/2021 18:11:12 - INFO - codeparrot_training - Step 44004: {'lr': 1.7654659587521022e-05, 'samples': 22530560, 'steps': 44004, 'batch_loss/train': 0.7108762238640338} 12/28/2021 18:11:23 - INFO - codeparrot_training - Step 44005: {'lr': 1.764884154610888e-05, 'samples': 22531072, 'steps': 44005, 'batch_loss/train': 0.6766165299341083} 12/28/2021 18:11:33 - INFO - codeparrot_training - Step 44006: {'lr': 
1.7643024428437294e-05, 'samples': 22531584, 'steps': 44006, 'batch_loss/train': 0.6338441804982722} 12/28/2021 18:11:45 - INFO - codeparrot_training - Step 44007: {'lr': 1.7637208234529274e-05, 'samples': 22532096, 'steps': 44007, 'batch_loss/train': 0.7704649777151644} 12/28/2021 18:11:56 - INFO - codeparrot_training - Step 44008: {'lr': 1.7631392964407966e-05, 'samples': 22532608, 'steps': 44008, 'batch_loss/train': 0.8598638791590929} 12/28/2021 18:12:07 - INFO - codeparrot_training - Step 44009: {'lr': 1.7625578618096522e-05, 'samples': 22533120, 'steps': 44009, 'batch_loss/train': 0.7764569986611605} 12/28/2021 18:12:19 - INFO - codeparrot_training - Step 44010: {'lr': 1.7619765195618005e-05, 'samples': 22533632, 'steps': 44010, 'batch_loss/train': 0.8381935935467482} 12/28/2021 18:12:29 - INFO - codeparrot_training - Step 44011: {'lr': 1.761395269699559e-05, 'samples': 22534144, 'steps': 44011, 'batch_loss/train': 0.671253951266408} 12/28/2021 18:12:40 - INFO - codeparrot_training - Step 44012: {'lr': 1.7608141122252313e-05, 'samples': 22534656, 'steps': 44012, 'batch_loss/train': 0.6828646650537848} 12/28/2021 18:12:51 - INFO - codeparrot_training - Step 44013: {'lr': 1.7602330471411353e-05, 'samples': 22535168, 'steps': 44013, 'batch_loss/train': 0.6670516488375142} 12/28/2021 18:13:05 - INFO - codeparrot_training - Step 44014: {'lr': 1.7596520744495775e-05, 'samples': 22535680, 'steps': 44014, 'batch_loss/train': 0.7457851571962237} 12/28/2021 18:13:15 - INFO - codeparrot_training - Step 44015: {'lr': 1.7590711941528696e-05, 'samples': 22536192, 'steps': 44015, 'batch_loss/train': 0.7206008671782911} 12/28/2021 18:13:26 - INFO - codeparrot_training - Step 44016: {'lr': 1.758490406253313e-05, 'samples': 22536704, 'steps': 44016, 'batch_loss/train': 0.6794338249601424} 12/28/2021 18:13:38 - INFO - codeparrot_training - Step 44017: {'lr': 1.7579097107532277e-05, 'samples': 22537216, 'steps': 44017, 'batch_loss/train': 0.6721497550606728} 12/28/2021 18:13:49 
- INFO - codeparrot_training - Step 44018: {'lr': 1.7573291076549202e-05, 'samples': 22537728, 'steps': 44018, 'batch_loss/train': 0.7364372066222131} 12/28/2021 18:13:59 - INFO - codeparrot_training - Step 44019: {'lr': 1.7567485969606915e-05, 'samples': 22538240, 'steps': 44019, 'batch_loss/train': 0.8335973555222154} 12/28/2021 18:14:11 - INFO - codeparrot_training - Step 44020: {'lr': 1.756168178672851e-05, 'samples': 22538752, 'steps': 44020, 'batch_loss/train': 0.852652202360332} 12/28/2021 18:14:22 - INFO - codeparrot_training - Step 44021: {'lr': 1.7555878527937163e-05, 'samples': 22539264, 'steps': 44021, 'batch_loss/train': 0.7865010257810354} 12/28/2021 18:14:33 - INFO - codeparrot_training - Step 44022: {'lr': 1.7550076193255853e-05, 'samples': 22539776, 'steps': 44022, 'batch_loss/train': 0.8223040793091059} 12/28/2021 18:14:43 - INFO - codeparrot_training - Step 44023: {'lr': 1.754427478270765e-05, 'samples': 22540288, 'steps': 44023, 'batch_loss/train': 0.6843159096315503} 12/28/2021 18:14:57 - INFO - codeparrot_training - Step 44024: {'lr': 1.7538474296315666e-05, 'samples': 22540800, 'steps': 44024, 'batch_loss/train': 0.729889515787363} 12/28/2021 18:15:07 - INFO - codeparrot_training - Step 44025: {'lr': 1.7532674734102914e-05, 'samples': 22541312, 'steps': 44025, 'batch_loss/train': 0.9316099029965699} 12/28/2021 18:15:18 - INFO - codeparrot_training - Step 44026: {'lr': 1.7526876096092463e-05, 'samples': 22541824, 'steps': 44026, 'batch_loss/train': 0.8028270965442061} 12/28/2021 18:15:30 - INFO - codeparrot_training - Step 44027: {'lr': 1.7521078382307376e-05, 'samples': 22542336, 'steps': 44027, 'batch_loss/train': 0.7602474894374609} 12/28/2021 18:15:41 - INFO - codeparrot_training - Step 44028: {'lr': 1.7515281592770715e-05, 'samples': 22542848, 'steps': 44028, 'batch_loss/train': 0.8163314331322908} 12/28/2021 18:15:52 - INFO - codeparrot_training - Step 44029: {'lr': 1.750948572750552e-05, 'samples': 22543360, 'steps': 44029, 
'batch_loss/train': 0.6379226446151733} 12/28/2021 18:16:06 - INFO - codeparrot_training - Step 44030: {'lr': 1.7503690786534855e-05, 'samples': 22543872, 'steps': 44030, 'batch_loss/train': 0.587795193423517} 12/28/2021 18:16:17 - INFO - codeparrot_training - Step 44031: {'lr': 1.7497896769881642e-05, 'samples': 22544384, 'steps': 44031, 'batch_loss/train': 0.7115385076031089} 12/28/2021 18:16:27 - INFO - codeparrot_training - Step 44032: {'lr': 1.7492103677569064e-05, 'samples': 22544896, 'steps': 44032, 'batch_loss/train': 0.7360229906626046} 12/28/2021 18:16:38 - INFO - codeparrot_training - Step 44033: {'lr': 1.7486311509620072e-05, 'samples': 22545408, 'steps': 44033, 'batch_loss/train': 0.6995098637416959} 12/28/2021 18:16:50 - INFO - codeparrot_training - Step 44034: {'lr': 1.7480520266057732e-05, 'samples': 22545920, 'steps': 44034, 'batch_loss/train': 0.7674002684652805} 12/28/2021 18:17:01 - INFO - codeparrot_training - Step 44035: {'lr': 1.747472994690502e-05, 'samples': 22546432, 'steps': 44035, 'batch_loss/train': 0.8740431889891624} 12/28/2021 18:17:11 - INFO - codeparrot_training - Step 44036: {'lr': 1.746894055218498e-05, 'samples': 22546944, 'steps': 44036, 'batch_loss/train': 0.7608780383598059} 12/28/2021 18:17:23 - INFO - codeparrot_training - Step 44037: {'lr': 1.7463152081920647e-05, 'samples': 22547456, 'steps': 44037, 'batch_loss/train': 0.8011162085458636} 12/28/2021 18:17:34 - INFO - codeparrot_training - Step 44038: {'lr': 1.7457364536135e-05, 'samples': 22547968, 'steps': 44038, 'batch_loss/train': 0.7102606415865012} 12/28/2021 18:17:45 - INFO - codeparrot_training - Step 44039: {'lr': 1.745157791485108e-05, 'samples': 22548480, 'steps': 44039, 'batch_loss/train': 0.708322410704568} 12/28/2021 18:17:59 - INFO - codeparrot_training - Step 44040: {'lr': 1.7445792218091895e-05, 'samples': 22548992, 'steps': 44040, 'batch_loss/train': 0.7143854218302295} 12/28/2021 18:18:09 - INFO - codeparrot_training - Step 44041: {'lr': 
1.7440007445880452e-05, 'samples': 22549504, 'steps': 44041, 'batch_loss/train': 0.6776510099880397} 12/28/2021 18:18:20 - INFO - codeparrot_training - Step 44042: {'lr': 1.743422359823965e-05, 'samples': 22550016, 'steps': 44042, 'batch_loss/train': 0.7879387928405777} 12/28/2021 18:18:30 - INFO - codeparrot_training - Step 44043: {'lr': 1.7428440675192615e-05, 'samples': 22550528, 'steps': 44043, 'batch_loss/train': 0.6787501082289964} 12/28/2021 18:18:43 - INFO - codeparrot_training - Step 44044: {'lr': 1.7422658676762293e-05, 'samples': 22551040, 'steps': 44044, 'batch_loss/train': 0.6806264193728566} 12/28/2021 18:18:53 - INFO - codeparrot_training - Step 44045: {'lr': 1.741687760297164e-05, 'samples': 22551552, 'steps': 44045, 'batch_loss/train': 0.593050975818187} 12/28/2021 18:19:04 - INFO - codeparrot_training - Step 44046: {'lr': 1.7411097453843643e-05, 'samples': 22552064, 'steps': 44046, 'batch_loss/train': 0.8488705372437835} 12/28/2021 18:19:16 - INFO - codeparrot_training - Step 44047: {'lr': 1.7405318229401364e-05, 'samples': 22552576, 'steps': 44047, 'batch_loss/train': 0.6892186980694532} 12/28/2021 18:19:27 - INFO - codeparrot_training - Step 44048: {'lr': 1.7399539929667667e-05, 'samples': 22553088, 'steps': 44048, 'batch_loss/train': 0.6375712077133358} 12/28/2021 18:19:37 - INFO - codeparrot_training - Step 44049: {'lr': 1.7393762554665572e-05, 'samples': 22553600, 'steps': 44049, 'batch_loss/train': 0.7683975999243557} 12/28/2021 18:19:49 - INFO - codeparrot_training - Step 44050: {'lr': 1.7387986104418053e-05, 'samples': 22554112, 'steps': 44050, 'batch_loss/train': 0.6339855622500181} 12/28/2021 18:20:00 - INFO - codeparrot_training - Step 44051: {'lr': 1.7382210578948065e-05, 'samples': 22554624, 'steps': 44051, 'batch_loss/train': 0.708315756637603} 12/28/2021 18:20:11 - INFO - codeparrot_training - Step 44052: {'lr': 1.7376435978278593e-05, 'samples': 22555136, 'steps': 44052, 'batch_loss/train': 0.7678703228011727} 12/28/2021 18:20:25 - 
INFO - codeparrot_training - Step 44053: {'lr': 1.737066230243256e-05, 'samples': 22555648, 'steps': 44053, 'batch_loss/train': 0.7264456106349826} 12/28/2021 18:20:36 - INFO - codeparrot_training - Step 44054: {'lr': 1.736488955143295e-05, 'samples': 22556160, 'steps': 44054, 'batch_loss/train': 0.7153540703002363} 12/28/2021 18:20:46 - INFO - codeparrot_training - Step 44055: {'lr': 1.7359117725302686e-05, 'samples': 22556672, 'steps': 44055, 'batch_loss/train': 0.9606973314657807} 12/28/2021 18:20:57 - INFO - codeparrot_training - Step 44056: {'lr': 1.735334682406478e-05, 'samples': 22557184, 'steps': 44056, 'batch_loss/train': 0.7002493524923921} 12/28/2021 18:21:09 - INFO - codeparrot_training - Step 44057: {'lr': 1.734757684774202e-05, 'samples': 22557696, 'steps': 44057, 'batch_loss/train': 0.5959945891518146} 12/28/2021 18:21:20 - INFO - codeparrot_training - Step 44058: {'lr': 1.734180779635755e-05, 'samples': 22558208, 'steps': 44058, 'batch_loss/train': 0.6520224874839187} 12/28/2021 18:21:30 - INFO - codeparrot_training - Step 44059: {'lr': 1.733603966993416e-05, 'samples': 22558720, 'steps': 44059, 'batch_loss/train': 0.7165270398836583} 12/28/2021 18:21:44 - INFO - codeparrot_training - Step 44060: {'lr': 1.733027246849478e-05, 'samples': 22559232, 'steps': 44060, 'batch_loss/train': 0.7047858536243439} 12/28/2021 18:21:55 - INFO - codeparrot_training - Step 44061: {'lr': 1.7324506192062467e-05, 'samples': 22559744, 'steps': 44061, 'batch_loss/train': 0.6692529394058511} 12/28/2021 18:22:06 - INFO - codeparrot_training - Step 44062: {'lr': 1.731874084066004e-05, 'samples': 22560256, 'steps': 44062, 'batch_loss/train': 0.703566464362666} 12/28/2021 18:22:18 - INFO - codeparrot_training - Step 44063: {'lr': 1.73129764143104e-05, 'samples': 22560768, 'steps': 44063, 'batch_loss/train': 0.7489577940432355} 12/28/2021 18:22:28 - INFO - codeparrot_training - Step 44064: {'lr': 1.7307212913036552e-05, 'samples': 22561280, 'steps': 44064, 'batch_loss/train': 
0.6472369455732405} 12/28/2021 18:22:39 - INFO - codeparrot_training - Step 44065: {'lr': 1.730145033686134e-05, 'samples': 22561792, 'steps': 44065, 'batch_loss/train': 0.7190349641023204} 12/28/2021 18:22:50 - INFO - codeparrot_training - Step 44066: {'lr': 1.729568868580769e-05, 'samples': 22562304, 'steps': 44066, 'batch_loss/train': 0.6857911892002448} 12/28/2021 18:23:02 - INFO - codeparrot_training - Step 44067: {'lr': 1.728992795989859e-05, 'samples': 22562816, 'steps': 44067, 'batch_loss/train': 0.7119883252307773} 12/28/2021 18:23:12 - INFO - codeparrot_training - Step 44068: {'lr': 1.728416815915676e-05, 'samples': 22563328, 'steps': 44068, 'batch_loss/train': 0.581036277115345} 12/28/2021 18:23:23 - INFO - codeparrot_training - Step 44069: {'lr': 1.7278409283605247e-05, 'samples': 22563840, 'steps': 44069, 'batch_loss/train': 0.7640924262814224} 12/28/2021 18:23:36 - INFO - codeparrot_training - Step 44070: {'lr': 1.7272651333266943e-05, 'samples': 22564352, 'steps': 44070, 'batch_loss/train': 0.6983195437351242} 12/28/2021 18:23:47 - INFO - codeparrot_training - Step 44071: {'lr': 1.7266894308164694e-05, 'samples': 22564864, 'steps': 44071, 'batch_loss/train': 0.7186236004345119} 12/28/2021 18:23:57 - INFO - codeparrot_training - Step 44072: {'lr': 1.7261138208321313e-05, 'samples': 22565376, 'steps': 44072, 'batch_loss/train': 0.8481705635786057} 12/28/2021 18:24:10 - INFO - codeparrot_training - Step 44073: {'lr': 1.72553830337599e-05, 'samples': 22565888, 'steps': 44073, 'batch_loss/train': 0.783269127830863} 12/28/2021 18:24:20 - INFO - codeparrot_training - Step 44074: {'lr': 1.724962878450312e-05, 'samples': 22566400, 'steps': 44074, 'batch_loss/train': 0.7262070445576683} 12/28/2021 18:24:31 - INFO - codeparrot_training - Step 44075: {'lr': 1.7243875460573905e-05, 'samples': 22566912, 'steps': 44075, 'batch_loss/train': 0.6914906564634293} 12/28/2021 18:24:41 - INFO - codeparrot_training - Step 44076: {'lr': 1.7238123061995236e-05, 'samples': 
22567424, 'steps': 44076, 'batch_loss/train': 0.7466375082731247}
12/28/2021 18:24:55 - INFO - codeparrot_training - Step 44077: {'lr': 1.7232371588789876e-05, 'samples': 22567936, 'steps': 44077, 'batch_loss/train': 0.7247644015587866}
12/28/2021 18:25:05 - INFO - codeparrot_training - Step 44078: {'lr': 1.722662104098069e-05, 'samples': 22568448, 'steps': 44078, 'batch_loss/train': 0.7021257793530822}
12/28/2021 18:25:16 - INFO - codeparrot_training - Step 44079: {'lr': 1.7220871418590578e-05, 'samples': 22568960, 'steps': 44079, 'batch_loss/train': 0.7624734807759523}
12/28/2021 18:25:28 - INFO - codeparrot_training - Step 44080: {'lr': 1.7215122721642412e-05, 'samples': 22569472, 'steps': 44080, 'batch_loss/train': 0.7349740886129439}
12/28/2021 18:25:39 - INFO - codeparrot_training - Step 44081: {'lr': 1.720937495015898e-05, 'samples': 22569984, 'steps': 44081, 'batch_loss/train': 0.6811268655583262}
12/28/2021 18:25:49 - INFO - codeparrot_training - Step 44082: {'lr': 1.7203628104163232e-05, 'samples': 22570496, 'steps': 44082, 'batch_loss/train': 0.7409201841801405}
12/28/2021 18:26:02 - INFO - codeparrot_training - Step 44083: {'lr': 1.7197882183677903e-05, 'samples': 22571008, 'steps': 44083, 'batch_loss/train': 0.7335655898787081}
12/28/2021 18:26:12 - INFO - codeparrot_training - Step 44084: {'lr': 1.719213718872595e-05, 'samples': 22571520, 'steps': 44084, 'batch_loss/train': 0.46459195436909795}
12/28/2021 18:26:23 - INFO - codeparrot_training - Step 44085: {'lr': 1.7186393119330096e-05, 'samples': 22572032, 'steps': 44085, 'batch_loss/train': 0.7412258991971612}
12/28/2021 18:26:35 - INFO - codeparrot_training - Step 44086: {'lr': 1.718064997551322e-05, 'samples': 22572544, 'steps': 44086, 'batch_loss/train': 0.7907983362674713}
12/28/2021 18:26:46 - INFO - codeparrot_training - Step 44087: {'lr': 1.7174907757298243e-05, 'samples': 22573056, 'steps': 44087, 'batch_loss/train': 0.6889716917648911}
12/28/2021 18:26:57 - INFO - codeparrot_training - Step 44088: {'lr': 1.7169166464707898e-05, 'samples': 22573568, 'steps': 44088, 'batch_loss/train': 0.8311479976400733}
12/28/2021 18:27:07 - INFO - codeparrot_training - Step 44089: {'lr': 1.7163426097765e-05, 'samples': 22574080, 'steps': 44089, 'batch_loss/train': 0.5821107784286141}
12/28/2021 18:27:21 - INFO - codeparrot_training - Step 44090: {'lr': 1.715768665649245e-05, 'samples': 22574592, 'steps': 44090, 'batch_loss/train': 0.7338676920626312}
12/28/2021 18:27:32 - INFO - codeparrot_training - Step 44091: {'lr': 1.7151948140912976e-05, 'samples': 22575104, 'steps': 44091, 'batch_loss/train': 0.9164522150531411}
12/28/2021 18:27:43 - INFO - codeparrot_training - Step 44092: {'lr': 1.714621055104945e-05, 'samples': 22575616, 'steps': 44092, 'batch_loss/train': 0.7654493290465325}
12/28/2021 18:27:55 - INFO - codeparrot_training - Step 44093: {'lr': 1.714047388692472e-05, 'samples': 22576128, 'steps': 44093, 'batch_loss/train': 0.7634390108287334}
12/28/2021 18:28:06 - INFO - codeparrot_training - Step 44094: {'lr': 1.7134738148561452e-05, 'samples': 22576640, 'steps': 44094, 'batch_loss/train': 0.7487895430531353}
12/28/2021 18:28:16 - INFO - codeparrot_training - Step 44095: {'lr': 1.712900333598258e-05, 'samples': 22577152, 'steps': 44095, 'batch_loss/train': 0.656121902866289}
12/28/2021 18:28:28 - INFO - codeparrot_training - Step 44096: {'lr': 1.7123269449210914e-05, 'samples': 22577664, 'steps': 44096, 'batch_loss/train': 0.663492335472256}
12/28/2021 18:28:39 - INFO - codeparrot_training - Step 44097: {'lr': 1.711753648826911e-05, 'samples': 22578176, 'steps': 44097, 'batch_loss/train': 0.7170693245716393}
12/28/2021 18:28:50 - INFO - codeparrot_training - Step 44098: {'lr': 1.7111804453180084e-05, 'samples': 22578688, 'steps': 44098, 'batch_loss/train': 0.773008905351162}
12/28/2021 18:29:00 - INFO - codeparrot_training - Step 44099: {'lr': 1.7106073343966634e-05, 'samples': 22579200, 'steps': 44099, 'batch_loss/train': 1.483337384648621}
12/28/2021 18:29:14 - INFO - codeparrot_training - Step 44100: {'lr': 1.7100343160651427e-05, 'samples': 22579712, 'steps': 44100, 'batch_loss/train': 0.6785651815589517}
12/28/2021 18:29:24 - INFO - codeparrot_training - Step 44101: {'lr': 1.709461390325731e-05, 'samples': 22580224, 'steps': 44101, 'batch_loss/train': 0.762852190528065}
12/28/2021 18:29:35 - INFO - codeparrot_training - Step 44102: {'lr': 1.7088885571807156e-05, 'samples': 22580736, 'steps': 44102, 'batch_loss/train': 0.6663623293861747}
12/28/2021 18:29:47 - INFO - codeparrot_training - Step 44103: {'lr': 1.708315816632358e-05, 'samples': 22581248, 'steps': 44103, 'batch_loss/train': 0.6476708436384797}
12/28/2021 18:29:58 - INFO - codeparrot_training - Step 44104: {'lr': 1.707743168682943e-05, 'samples': 22581760, 'steps': 44104, 'batch_loss/train': 0.7409072457812726}
12/28/2021 18:30:09 - INFO - codeparrot_training - Step 44105: {'lr': 1.7071706133347463e-05, 'samples': 22582272, 'steps': 44105, 'batch_loss/train': 0.7521425671875477}
12/28/2021 18:30:19 - INFO - codeparrot_training - Step 44106: {'lr': 1.7065981505900467e-05, 'samples': 22582784, 'steps': 44106, 'batch_loss/train': 0.920120975933969}
12/28/2021 18:30:31 - INFO - codeparrot_training - Step 44107: {'lr': 1.7060257804511147e-05, 'samples': 22583296, 'steps': 44107, 'batch_loss/train': 0.7751257959753275}
12/28/2021 18:30:42 - INFO - codeparrot_training - Step 44108: {'lr': 1.7054535029202345e-05, 'samples': 22583808, 'steps': 44108, 'batch_loss/train': 0.7282276917248964}
12/28/2021 18:30:52 - INFO - codeparrot_training - Step 44109: {'lr': 1.7048813179996653e-05, 'samples': 22584320, 'steps': 44109, 'batch_loss/train': 0.6613099700771272}
12/28/2021 18:31:06 - INFO - codeparrot_training - Step 44110: {'lr': 1.704309225691697e-05, 'samples': 22584832, 'steps': 44110, 'batch_loss/train': 0.7716531925834715}
12/28/2021 18:31:17 - INFO - codeparrot_training - Step 44111: {'lr': 1.7037372259986027e-05, 'samples': 22585344, 'steps': 44111, 'batch_loss/train': 0.7017708318307996}
12/28/2021 18:31:28 - INFO - codeparrot_training - Step 44112: {'lr': 1.7031653189226476e-05, 'samples': 22585856, 'steps': 44112, 'batch_loss/train': 0.7414953161496669}
12/28/2021 18:31:40 - INFO - codeparrot_training - Step 44113: {'lr': 1.7025935044661155e-05, 'samples': 22586368, 'steps': 44113, 'batch_loss/train': 0.7351261819712818}
12/28/2021 18:31:51 - INFO - codeparrot_training - Step 44114: {'lr': 1.702021782631272e-05, 'samples': 22586880, 'steps': 44114, 'batch_loss/train': 0.8339325794950128}
12/28/2021 18:32:01 - INFO - codeparrot_training - Step 44115: {'lr': 1.7014501534203948e-05, 'samples': 22587392, 'steps': 44115, 'batch_loss/train': 0.7726771291345358}
12/28/2021 18:32:15 - INFO - codeparrot_training - Step 44116: {'lr': 1.700878616835752e-05, 'samples': 22587904, 'steps': 44116, 'batch_loss/train': 0.6791682038456202}
12/28/2021 18:32:26 - INFO - codeparrot_training - Step 44117: {'lr': 1.7003071728796198e-05, 'samples': 22588416, 'steps': 44117, 'batch_loss/train': 1.0013982113450766}
12/28/2021 18:32:37 - INFO - codeparrot_training - Step 44118: {'lr': 1.6997358215542657e-05, 'samples': 22588928, 'steps': 44118, 'batch_loss/train': 1.492614606861025}
12/28/2021 18:32:47 - INFO - codeparrot_training - Step 44119: {'lr': 1.6991645628619683e-05, 'samples': 22589440, 'steps': 44119, 'batch_loss/train': 0.6837786892428994}
12/28/2021 18:33:00 - INFO - codeparrot_training - Step 44120: {'lr': 1.6985933968049927e-05, 'samples': 22589952, 'steps': 44120, 'batch_loss/train': 0.8065931405872107}
12/28/2021 18:33:10 - INFO - codeparrot_training - Step 44121: {'lr': 1.6980223233856117e-05, 'samples': 22590464, 'steps': 44121, 'batch_loss/train': 0.6656017825007439}
12/28/2021 18:33:21 - INFO - codeparrot_training - Step 44122: {'lr': 1.6974513426060988e-05, 'samples': 22590976, 'steps': 44122, 'batch_loss/train': 0.7298403051681817}
12/28/2021 18:33:33 - INFO - codeparrot_training - Step 44123: {'lr': 1.6968804544687132e-05, 'samples': 22591488, 'steps': 44123, 'batch_loss/train': 0.869217867963016}
12/28/2021 18:33:44 - INFO - codeparrot_training - Step 44124: {'lr': 1.6963096589757365e-05, 'samples': 22592000, 'steps': 44124, 'batch_loss/train': 0.7893591625615954}
12/28/2021 18:33:54 - INFO - codeparrot_training - Step 44125: {'lr': 1.6957389561294363e-05, 'samples': 22592512, 'steps': 44125, 'batch_loss/train': 0.7654696456156671}
12/28/2021 18:34:05 - INFO - codeparrot_training - Step 44126: {'lr': 1.6951683459320745e-05, 'samples': 22593024, 'steps': 44126, 'batch_loss/train': 0.7386048790067434}
12/28/2021 18:34:18 - INFO - codeparrot_training - Step 44127: {'lr': 1.6945978283859216e-05, 'samples': 22593536, 'steps': 44127, 'batch_loss/train': 0.8245248636230826}
12/28/2021 18:34:29 - INFO - codeparrot_training - Step 44128: {'lr': 1.6940274034932533e-05, 'samples': 22594048, 'steps': 44128, 'batch_loss/train': 0.676937083946541}
12/28/2021 18:34:39 - INFO - codeparrot_training - Step 44129: {'lr': 1.6934570712563296e-05, 'samples': 22594560, 'steps': 44129, 'batch_loss/train': 0.771975455339998}
12/28/2021 18:34:52 - INFO - codeparrot_training - Step 44130: {'lr': 1.6928868316774204e-05, 'samples': 22595072, 'steps': 44130, 'batch_loss/train': 0.7058472190983593}
12/28/2021 18:35:02 - INFO - codeparrot_training - Step 44131: {'lr': 1.6923166847587935e-05, 'samples': 22595584, 'steps': 44131, 'batch_loss/train': 0.7220767429098487}
12/28/2021 18:35:13 - INFO - codeparrot_training - Step 44132: {'lr': 1.6917466305027134e-05, 'samples': 22596096, 'steps': 44132, 'batch_loss/train': 0.8482781387865543}
12/28/2021 18:35:25 - INFO - codeparrot_training - Step 44133: {'lr': 1.6911766689114483e-05, 'samples': 22596608, 'steps': 44133, 'batch_loss/train': 0.807482285425067}
12/28/2021 18:35:36 - INFO - codeparrot_training - Step 44134: {'lr': 1.6906067999872653e-05, 'samples': 22597120, 'steps': 44134, 'batch_loss/train': 0.7371011418290436}
12/28/2021 18:35:46 - INFO - codeparrot_training - Step 44135: {'lr': 1.690037023732427e-05, 'samples': 22597632, 'steps': 44135, 'batch_loss/train': 0.7230551075190306}
12/28/2021 18:35:57 - INFO - codeparrot_training - Step 44136: {'lr': 1.6894673401492006e-05, 'samples': 22598144, 'steps': 44136, 'batch_loss/train': 0.8860829407349229}
12/28/2021 18:36:09 - INFO - codeparrot_training - Step 44137: {'lr': 1.688897749239854e-05, 'samples': 22598656, 'steps': 44137, 'batch_loss/train': 0.707526424434036}
12/28/2021 18:36:20 - INFO - codeparrot_training - Step 44138: {'lr': 1.6883282510066377e-05, 'samples': 22599168, 'steps': 44138, 'batch_loss/train': 0.7716069621965289}
12/28/2021 18:36:31 - INFO - codeparrot_training - Step 44139: {'lr': 1.6877588454518367e-05, 'samples': 22599680, 'steps': 44139, 'batch_loss/train': 0.6981854285113513}
12/28/2021 18:36:44 - INFO - codeparrot_training - Step 44140: {'lr': 1.6871895325776986e-05, 'samples': 22600192, 'steps': 44140, 'batch_loss/train': 0.6781728072091937}
12/28/2021 18:36:55 - INFO - codeparrot_training - Step 44141: {'lr': 1.6866203123864944e-05, 'samples': 22600704, 'steps': 44141, 'batch_loss/train': 0.869995525223203}
12/28/2021 18:37:05 - INFO - codeparrot_training - Step 44142: {'lr': 1.6860511848804827e-05, 'samples': 22601216, 'steps': 44142, 'batch_loss/train': 0.6934254867956042}
12/28/2021 18:37:17 - INFO - codeparrot_training - Step 44143: {'lr': 1.6854821500619315e-05, 'samples': 22601728, 'steps': 44143, 'batch_loss/train': 0.6875063590705395}
12/28/2021 18:37:28 - INFO - codeparrot_training - Step 44144: {'lr': 1.6849132079330977e-05, 'samples': 22602240, 'steps': 44144, 'batch_loss/train': 0.7396951557602733}
12/28/2021 18:37:39 - INFO - codeparrot_training - Step 44145: {'lr': 1.6843443584962454e-05, 'samples': 22602752, 'steps': 44145, 'batch_loss/train': 0.6725552438292652}
12/28/2021 18:37:49 - INFO - codeparrot_training - Step 44146: {'lr': 1.683775601753637e-05, 'samples': 22603264, 'steps': 44146, 'batch_loss/train': 0.8437172984704375}
12/28/2021 18:38:03 - INFO - codeparrot_training - Step 44147: {'lr': 1.6832069377075347e-05, 'samples': 22603776, 'steps': 44147, 'batch_loss/train': 0.751840085722506}
12/28/2021 18:38:14 - INFO - codeparrot_training - Step 44148: {'lr': 1.682638366360201e-05, 'samples': 22604288, 'steps': 44148, 'batch_loss/train': 1.1558970818296075}
12/28/2021 18:38:24 - INFO - codeparrot_training - Step 44149: {'lr': 1.682069887713883e-05, 'samples': 22604800, 'steps': 44149, 'batch_loss/train': 0.6219357782974839}
12/28/2021 18:38:37 - INFO - codeparrot_training - Step 44150: {'lr': 1.6815015017708547e-05, 'samples': 22605312, 'steps': 44150, 'batch_loss/train': 0.7325491202063859}
12/28/2021 18:38:47 - INFO - codeparrot_training - Step 44151: {'lr': 1.680933208533375e-05, 'samples': 22605824, 'steps': 44151, 'batch_loss/train': 0.5782710519852117}
12/28/2021 18:38:58 - INFO - codeparrot_training - Step 44152: {'lr': 1.680365008003698e-05, 'samples': 22606336, 'steps': 44152, 'batch_loss/train': 0.7013597940094769}
12/28/2021 18:39:10 - INFO - codeparrot_training - Step 44153: {'lr': 1.6797969001840802e-05, 'samples': 22606848, 'steps': 44153, 'batch_loss/train': 0.750567264854908}
12/28/2021 18:39:21 - INFO - codeparrot_training - Step 44154: {'lr': 1.6792288850767918e-05, 'samples': 22607360, 'steps': 44154, 'batch_loss/train': 0.7983393878675997}
12/28/2021 18:39:31 - INFO - codeparrot_training - Step 44155: {'lr': 1.678660962684081e-05, 'samples': 22607872, 'steps': 44155, 'batch_loss/train': 0.7409832836128771}
12/28/2021 18:39:42 - INFO - codeparrot_training - Step 44156: {'lr': 1.6780931330082074e-05, 'samples': 22608384, 'steps': 44156, 'batch_loss/train': 0.7040219753980637}
12/28/2021 18:39:56 - INFO - codeparrot_training - Step 44157: {'lr': 1.6775253960514298e-05, 'samples': 22608896, 'steps': 44157, 'batch_loss/train': 0.731582386419177}
12/28/2021 18:40:06 - INFO - codeparrot_training - Step 44158: {'lr': 1.6769577518160052e-05, 'samples': 22609408, 'steps': 44158, 'batch_loss/train': 0.7738432832993567}
12/28/2021 18:40:17 - INFO - codeparrot_training - Step 44159: {'lr': 1.6763902003041898e-05, 'samples': 22609920, 'steps': 44159, 'batch_loss/train': 0.7220258065499365}
12/28/2021 18:40:29 - INFO - codeparrot_training - Step 44160: {'lr': 1.6758227415182403e-05, 'samples': 22610432, 'steps': 44160, 'batch_loss/train': 0.7538609956391156}
12/28/2021 18:40:40 - INFO - codeparrot_training - Step 44161: {'lr': 1.6752553754604134e-05, 'samples': 22610944, 'steps': 44161, 'batch_loss/train': 0.7573166117072105}
12/28/2021 18:40:50 - INFO - codeparrot_training - Step 44162: {'lr': 1.6746881021329623e-05, 'samples': 22611456, 'steps': 44162, 'batch_loss/train': 0.7611296931281686}
12/28/2021 18:41:03 - INFO - codeparrot_training - Step 44163: {'lr': 1.6741209215381498e-05, 'samples': 22611968, 'steps': 44163, 'batch_loss/train': 0.8064184365794063}
12/28/2021 18:41:13 - INFO - codeparrot_training - Step 44164: {'lr': 1.6735538336782153e-05, 'samples': 22612480, 'steps': 44164, 'batch_loss/train': 0.8204891383647919}
12/28/2021 18:41:24 - INFO - codeparrot_training - Step 44165: {'lr': 1.672986838555432e-05, 'samples': 22612992, 'steps': 44165, 'batch_loss/train': 0.74798284471035}
12/28/2021 18:41:35 - INFO - codeparrot_training - Step 44166: {'lr': 1.6724199361720428e-05, 'samples': 22613504, 'steps': 44166, 'batch_loss/train': 0.7825987171381712}
12/28/2021 18:41:48 - INFO - codeparrot_training - Step 44167: {'lr': 1.6718531265303012e-05, 'samples': 22614016, 'steps': 44167, 'batch_loss/train': 0.7506873533129692}
12/28/2021 18:41:59 - INFO - codeparrot_training - Step 44168: {'lr': 1.6712864096324636e-05, 'samples': 22614528, 'steps': 44168, 'batch_loss/train': 0.6335251143900678}
12/28/2021 18:42:09 - INFO - codeparrot_training - Step 44169: {'lr': 1.6707197854807842e-05, 'samples': 22615040, 'steps': 44169, 'batch_loss/train': 0.8843675618991256}
12/28/2021 18:42:21 - INFO - codeparrot_training - Step 44170: {'lr': 1.670153254077511e-05, 'samples': 22615552, 'steps': 44170, 'batch_loss/train': 0.7292089900001884}
12/28/2021 18:42:32 - INFO - codeparrot_training - Step 44171: {'lr': 1.6695868154249033e-05, 'samples': 22616064, 'steps': 44171, 'batch_loss/train': 0.7338743191212416}
12/28/2021 18:42:42 - INFO - codeparrot_training - Step 44172: {'lr': 1.6690204695252066e-05, 'samples': 22616576, 'steps': 44172, 'batch_loss/train': 0.7473654155619442}
12/28/2021 18:42:55 - INFO - codeparrot_training - Step 44173: {'lr': 1.668454216380677e-05, 'samples': 22617088, 'steps': 44173, 'batch_loss/train': 0.6500058092642576}
12/28/2021 18:43:05 - INFO - codeparrot_training - Step 44174: {'lr': 1.6678880559935662e-05, 'samples': 22617600, 'steps': 44174, 'batch_loss/train': 0.8112466298043728}
12/28/2021 18:43:16 - INFO - codeparrot_training - Step 44175: {'lr': 1.6673219883661163e-05, 'samples': 22618112, 'steps': 44175, 'batch_loss/train': 0.7083060350269079}
12/28/2021 18:43:29 - INFO - codeparrot_training - Step 44176: {'lr': 1.6667560135005864e-05, 'samples': 22618624, 'steps': 44176, 'batch_loss/train': 0.8216718853800558}
12/28/2021 18:43:40 - INFO - codeparrot_training - Step 44177: {'lr': 1.6661901313992284e-05, 'samples': 22619136, 'steps': 44177, 'batch_loss/train': 0.6965804849751294}
12/28/2021 18:43:50 - INFO - codeparrot_training - Step 44178: {'lr': 1.6656243420642842e-05, 'samples': 22619648, 'steps': 44178, 'batch_loss/train': 0.682777744717896}
12/28/2021 18:44:01 - INFO - codeparrot_training - Step 44179: {'lr': 1.6650586454980022e-05, 'samples': 22620160, 'steps': 44179, 'batch_loss/train': 0.8188076494261622}
12/28/2021 18:44:13 - INFO - codeparrot_training - Step 44180: {'lr': 1.6644930417026445e-05, 'samples': 22620672, 'steps': 44180, 'batch_loss/train': 0.6593194250017405}
12/28/2021 18:44:24 - INFO - codeparrot_training - Step 44181: {'lr': 1.663927530680448e-05, 'samples': 22621184, 'steps': 44181, 'batch_loss/train': 0.7977374205365777}
12/28/2021 18:44:34 - INFO - codeparrot_training - Step 44182: {'lr': 1.663362112433664e-05, 'samples': 22621696, 'steps': 44182, 'batch_loss/train': 1.5476706423796713}
12/28/2021 18:44:47 - INFO - codeparrot_training - Step 44183: {'lr': 1.6627967869645404e-05, 'samples': 22622208, 'steps': 44183, 'batch_loss/train': 0.6945043983869255}
12/28/2021 18:44:57 - INFO - codeparrot_training - Step 44184: {'lr': 1.6622315542753257e-05, 'samples': 22622720, 'steps': 44184, 'batch_loss/train': 0.6965669246856123}
12/28/2021 18:45:08 - INFO - codeparrot_training - Step 44185: {'lr': 1.661666414368268e-05, 'samples': 22623232, 'steps': 44185, 'batch_loss/train': 0.482527126907371}
12/28/2021 18:45:21 - INFO - codeparrot_training - Step 44186: {'lr': 1.66110136724561e-05, 'samples': 22623744, 'steps': 44186, 'batch_loss/train': 0.8399126762524247}
12/28/2021 18:45:32 - INFO - codeparrot_training - Step 44187: {'lr': 1.6605364129095997e-05, 'samples': 22624256, 'steps': 44187, 'batch_loss/train': 0.6822843609843403}
12/28/2021 18:45:42 - INFO - codeparrot_training - Step 44188: {'lr': 1.6599715513624852e-05, 'samples': 22624768, 'steps': 44188, 'batch_loss/train': 0.665766092017293}
12/28/2021 18:45:53 - INFO - codeparrot_training - Step 44189: {'lr': 1.6594067826065152e-05, 'samples': 22625280, 'steps': 44189, 'batch_loss/train': 0.7454780070111156}
12/28/2021 18:46:05 - INFO - codeparrot_training - Step 44190: {'lr': 1.6588421066439234e-05, 'samples': 22625792, 'steps': 44190, 'batch_loss/train': 0.8016421515494585}
12/28/2021 18:46:16 - INFO - codeparrot_training - Step 44191: {'lr': 1.6582775234769643e-05, 'samples': 22626304, 'steps': 44191, 'batch_loss/train': 1.5258907712996006}
12/28/2021 18:46:26 - INFO - codeparrot_training - Step 44192: {'lr': 1.6577130331078853e-05, 'samples': 22626816, 'steps': 44192, 'batch_loss/train': 0.6018353563849814}
12/28/2021 18:46:40 - INFO - codeparrot_training - Step 44193: {'lr': 1.6571486355389214e-05, 'samples': 22627328, 'steps': 44193, 'batch_loss/train': 0.671841761097312}
12/28/2021 18:46:51 - INFO - codeparrot_training - Step 44194: {'lr': 1.6565843307723205e-05, 'samples': 22627840, 'steps': 44194, 'batch_loss/train': 0.7197821801528335}
12/28/2021 18:47:02 - INFO - codeparrot_training - Step 44195: {'lr': 1.656020118810328e-05, 'samples': 22628352, 'steps': 44195, 'batch_loss/train': 0.6767762554809451}
12/28/2021 18:47:12 - INFO - codeparrot_training - Step 44196: {'lr': 1.6554559996551837e-05, 'samples': 22628864, 'steps': 44196, 'batch_loss/train': 1.1079340167343616}
12/28/2021 18:47:25 - INFO - codeparrot_training - Step 44197: {'lr': 1.6548919733091307e-05, 'samples': 22629376, 'steps': 44197, 'batch_loss/train': 0.808526593260467}
12/28/2021 18:47:36 - INFO - codeparrot_training - Step 44198: {'lr': 1.6543280397744138e-05, 'samples': 22629888, 'steps': 44198, 'batch_loss/train': 0.7571931313723326}
12/28/2021 18:47:46 - INFO - codeparrot_training - Step 44199: {'lr': 1.653764199053273e-05, 'samples': 22630400, 'steps': 44199, 'batch_loss/train': 0.7245207456871867}
12/28/2021 18:47:59 - INFO - codeparrot_training - Step 44200: {'lr': 1.6532004511479515e-05, 'samples': 22630912, 'steps': 44200, 'batch_loss/train': 0.9807036723941565}
12/28/2021 18:48:09 - INFO - codeparrot_training - Step 44201: {'lr': 1.6526367960606887e-05, 'samples': 22631424, 'steps': 44201, 'batch_loss/train': 0.9911241284571588}
12/28/2021 18:48:20 - INFO - codeparrot_training - Step 44202: {'lr': 1.6520732337937245e-05, 'samples': 22631936, 'steps': 44202, 'batch_loss/train': 0.7094690902158618}
12/28/2021 18:48:31 - INFO - codeparrot_training - Step 44203: {'lr': 1.651509764349307e-05, 'samples': 22632448, 'steps': 44203, 'batch_loss/train': 0.7952769068069756}
12/28/2021 18:48:44 - INFO - codeparrot_training - Step 44204: {'lr': 1.6509463877296683e-05, 'samples': 22632960, 'steps': 44204, 'batch_loss/train': 0.7388125943252817}
12/28/2021 18:48:55 - INFO - codeparrot_training - Step 44205: {'lr': 1.6503831039370448e-05, 'samples': 22633472, 'steps': 44205, 'batch_loss/train': 0.8048993591219187}
12/28/2021 18:49:06 - INFO - codeparrot_training - Step 44206: {'lr': 1.6498199129736907e-05, 'samples': 22633984, 'steps': 44206, 'batch_loss/train': 0.9274623701348901}
12/28/2021 18:49:18 - INFO - codeparrot_training - Step 44207: {'lr': 1.649256814841832e-05, 'samples': 22634496, 'steps': 44207, 'batch_loss/train': 0.8319214815273881}
12/28/2021 18:49:28 - INFO - codeparrot_training - Step 44208: {'lr': 1.6486938095437052e-05, 'samples': 22635008, 'steps': 44208, 'batch_loss/train': 0.6774082635529339}
12/28/2021 18:49:39 - INFO - codeparrot_training - Step 44209: {'lr': 1.648130897081565e-05, 'samples': 22635520, 'steps': 44209, 'batch_loss/train': 0.796432739123702}
12/28/2021 18:49:51 - INFO - codeparrot_training - Step 44210: {'lr': 1.647568077457634e-05, 'samples': 22636032, 'steps': 44210, 'batch_loss/train': 0.7962425332516432}
12/28/2021 18:50:02 - INFO - codeparrot_training - Step 44211: {'lr': 1.6470053506741552e-05, 'samples': 22636544, 'steps': 44211, 'batch_loss/train': 0.8635541787371039}
12/28/2021 18:50:13 - INFO - codeparrot_training - Step 44212: {'lr': 1.646442716733365e-05, 'samples': 22637056, 'steps': 44212, 'batch_loss/train': 0.8097462388686836}
12/28/2021 18:50:23 - INFO - codeparrot_training - Step 44213: {'lr': 1.6458801756374986e-05, 'samples': 22637568, 'steps': 44213, 'batch_loss/train': 0.9351950893178582}
12/28/2021 18:50:35 - INFO - codeparrot_training - Step 44214: {'lr': 1.6453177273887982e-05, 'samples': 22638080, 'steps': 44214, 'batch_loss/train': 0.7580447718501091}
12/28/2021 18:50:46 - INFO - codeparrot_training - Step 44215: {'lr': 1.6447553719894982e-05, 'samples': 22638592, 'steps': 44215, 'batch_loss/train': 0.677996211219579}
12/28/2021 18:50:57 - INFO - codeparrot_training - Step 44216: {'lr': 1.6441931094418245e-05, 'samples': 22639104, 'steps': 44216, 'batch_loss/train': 0.720915550366044}
12/28/2021 18:51:11 - INFO - codeparrot_training - Step 44217: {'lr': 1.6436309397480224e-05, 'samples': 22639616, 'steps': 44217, 'batch_loss/train': 0.780173453153111}
12/28/2021 18:51:22 - INFO - codeparrot_training - Step 44218: {'lr': 1.6430688629103326e-05, 'samples': 22640128, 'steps': 44218, 'batch_loss/train': 0.7215541338082403}
12/28/2021 18:51:32 - INFO - codeparrot_training - Step 44219: {'lr': 1.6425068789309745e-05, 'samples': 22640640, 'steps': 44219, 'batch_loss/train': 0.6975013772025704}
12/28/2021 18:51:44 - INFO - codeparrot_training - Step 44220: {'lr': 1.641944987812191e-05, 'samples': 22641152, 'steps': 44220, 'batch_loss/train': 0.722357866470702}
12/28/2021 18:51:55 - INFO - codeparrot_training - Step 44221: {'lr': 1.641383189556214e-05, 'samples': 22641664, 'steps': 44221, 'batch_loss/train': 0.7004167406121269}
12/28/2021 18:52:06 - INFO - codeparrot_training - Step 44222: {'lr': 1.640821484165278e-05, 'samples': 22642176, 'steps': 44222, 'batch_loss/train': 0.7505181645974517}
12/28/2021 18:52:18 - INFO - codeparrot_training - Step 44223: {'lr': 1.640259871641614e-05, 'samples': 22642688, 'steps': 44223, 'batch_loss/train': 0.7276510633528233}
12/28/2021 18:52:28 - INFO - codeparrot_training - Step 44224: {'lr': 1.6396983519874592e-05, 'samples': 22643200, 'steps': 44224, 'batch_loss/train': 0.8492876756936312}
12/28/2021 18:52:39 - INFO - codeparrot_training - Step 44225: {'lr': 1.6391369252050424e-05, 'samples': 22643712, 'steps': 44225, 'batch_loss/train': 0.7167568672448397}
12/28/2021 18:52:50 - INFO - codeparrot_training - Step 44226: {'lr': 1.6385755912965956e-05, 'samples': 22644224, 'steps': 44226, 'batch_loss/train': 0.7329843770712614}
12/28/2021 18:53:03 - INFO - codeparrot_training - Step 44227: {'lr': 1.6380143502643524e-05, 'samples': 22644736, 'steps': 44227, 'batch_loss/train': 0.6779209531378001}
12/28/2021 18:53:13 - INFO - codeparrot_training - Step 44228: {'lr': 1.637453202110542e-05, 'samples': 22645248, 'steps': 44228, 'batch_loss/train': 0.7443595621734858}
12/28/2021 18:53:24 - INFO - codeparrot_training - Step 44229: {'lr': 1.6368921468373986e-05, 'samples': 22645760, 'steps': 44229, 'batch_loss/train': 0.7618763698264956}
12/28/2021 18:53:36 - INFO - codeparrot_training - Step 44230: {'lr': 1.6363311844471486e-05, 'samples': 22646272, 'steps': 44230, 'batch_loss/train': 0.6426055570482276}
12/28/2021 18:53:47 - INFO - codeparrot_training - Step 44231: {'lr': 1.63577031494202e-05, 'samples': 22646784, 'steps': 44231, 'batch_loss/train': 0.7763330538582522}
12/28/2021 18:53:57 - INFO - codeparrot_training - Step 44232: {'lr': 1.635209538324256e-05, 'samples': 22647296, 'steps': 44232, 'batch_loss/train': 0.8508986003580503}
12/28/2021 18:54:11 - INFO - codeparrot_training - Step 44233: {'lr': 1.6346488545960714e-05, 'samples': 22647808, 'steps': 44233, 'batch_loss/train': 0.6880112076178193}
12/28/2021 18:54:22 - INFO - codeparrot_training - Step 44234: {'lr': 1.6340882637596948e-05, 'samples': 22648320, 'steps': 44234, 'batch_loss/train': 0.6697855787351727}
12/28/2021 18:54:32 - INFO - codeparrot_training - Step 44235: {'lr': 1.633527765817372e-05, 'samples': 22648832, 'steps': 44235, 'batch_loss/train': 0.669682770036161}
12/28/2021 18:54:43 - INFO - codeparrot_training - Step 44236: {'lr': 1.6329673607713146e-05, 'samples': 22649344, 'steps': 44236, 'batch_loss/train': 0.7074535586871207}
12/28/2021 18:54:55 - INFO - codeparrot_training - Step 44237: {'lr': 1.6324070486237545e-05, 'samples': 22649856, 'steps': 44237, 'batch_loss/train': 0.7848156662657857}
12/28/2021 18:55:06 - INFO - codeparrot_training - Step 44238: {'lr': 1.6318468293769206e-05, 'samples': 22650368, 'steps': 44238, 'batch_loss/train': 1.4354554498568177}
12/28/2021 18:55:16 - INFO - codeparrot_training - Step 44239: {'lr': 1.6312867030330414e-05, 'samples': 22650880, 'steps': 44239, 'batch_loss/train': 0.7239901318680495}
12/28/2021 18:55:28 - INFO - codeparrot_training - Step 44240: {'lr': 1.6307266695943428e-05, 'samples': 22651392, 'steps': 44240, 'batch_loss/train': 0.7585072410292923}
12/28/2021 18:55:39 - INFO - codeparrot_training - Step 44241: {'lr': 1.6301667290630567e-05, 'samples': 22651904, 'steps': 44241, 'batch_loss/train': 0.7288704458624125}
12/28/2021 18:55:50 - INFO - codeparrot_training - Step 44242: {'lr': 1.6296068814413922e-05, 'samples': 22652416, 'steps': 44242, 'batch_loss/train': 0.8111358024179935}
12/28/2021 18:56:00 - INFO - codeparrot_training - Step 44243: {'lr': 1.629047126731592e-05, 'samples': 22652928, 'steps': 44243, 'batch_loss/train': 0.7009551292285323}
12/28/2021 18:56:14 - INFO - codeparrot_training - Step 44244: {'lr': 1.6284874649358794e-05, 'samples': 22653440, 'steps': 44244, 'batch_loss/train': 0.7547985324636102}
12/28/2021 18:56:25 - INFO - codeparrot_training - Step 44245: {'lr': 1.627927896056469e-05, 'samples': 22653952, 'steps': 44245, 'batch_loss/train': 0.6808878576848656}
12/28/2021 18:56:36 - INFO - codeparrot_training - Step 44246: {'lr': 1.6273684200955957e-05, 'samples': 22654464, 'steps': 44246, 'batch_loss/train': 0.7485798336565495}
12/28/2021 18:56:48 - INFO - codeparrot_training - Step 44247: {'lr': 1.626809037055485e-05, 'samples': 22654976, 'steps': 44247, 'batch_loss/train': 0.7147771326126531}
12/28/2021 18:56:58 - INFO - codeparrot_training - Step 44248: {'lr': 1.626249746938352e-05, 'samples': 22655488, 'steps': 44248, 'batch_loss/train': 0.7961997790262103}
12/28/2021 18:57:09 - INFO - codeparrot_training - Step 44249: {'lr': 1.6256905497464226e-05, 'samples': 22656000, 'steps': 44249, 'batch_loss/train': 0.629335985518992}
12/28/2021 18:57:21 - INFO - codeparrot_training - Step 44250: {'lr': 1.6251314454819226e-05, 'samples': 22656512, 'steps': 44250, 'batch_loss/train': 0.74415825586766}
12/28/2021 18:57:32 - INFO - codeparrot_training - Step 44251: {'lr': 1.6245724341470757e-05, 'samples': 22657024, 'steps': 44251, 'batch_loss/train': 0.5755693424725905}
12/28/2021 18:57:43 - INFO - codeparrot_training - Step 44252: {'lr': 1.6240135157441022e-05, 'samples': 22657536, 'steps': 44252, 'batch_loss/train': 0.8621018063277006}
12/28/2021 18:57:55 - INFO - codeparrot_training - Step 44253: {'lr': 1.6234546902752222e-05, 'samples': 22658048, 'steps': 44253, 'batch_loss/train': 0.634575979784131}
12/28/2021 18:58:05 - INFO - codeparrot_training - Step 44254: {'lr': 1.622895957742662e-05, 'samples': 22658560, 'steps': 44254, 'batch_loss/train': 0.8025962272658944}
12/28/2021 18:58:16 - INFO - codeparrot_training - Step 44255: {'lr': 1.6223373181486422e-05, 'samples': 22659072, 'steps': 44255, 'batch_loss/train': 0.7679232228547335}
12/28/2021 18:58:27 - INFO - codeparrot_training - Step 44256: {'lr': 1.6217787714953802e-05, 'samples': 22659584, 'steps': 44256, 'batch_loss/train': 0.7086834600195289}
12/28/2021 18:58:41 - INFO - codeparrot_training - Step 44257: {'lr': 1.6212203177850937e-05, 'samples': 22660096, 'steps': 44257, 'batch_loss/train': 0.6927815824747086}
12/28/2021 18:58:51 - INFO - codeparrot_training - Step 44258: {'lr': 1.620661957020017e-05, 'samples': 22660608, 'steps': 44258, 'batch_loss/train': 0.7495500519871712}
12/28/2021 18:59:02 - INFO - codeparrot_training - Step 44259: {'lr': 1.6201036892023568e-05, 'samples': 22661120, 'steps': 44259, 'batch_loss/train': 0.7513462249189615}
12/28/2021 18:59:14 - INFO - codeparrot_training - Step 44260: {'lr': 1.6195455143343306e-05, 'samples': 22661632, 'steps': 44260, 'batch_loss/train': 0.790870126336813}
12/28/2021 18:59:25 - INFO - codeparrot_training - Step 44261: {'lr': 1.618987432418173e-05, 'samples': 22662144, 'steps': 44261, 'batch_loss/train': 0.7518909508362412}
12/28/2021 18:59:35 - INFO - codeparrot_training - Step 44262: {'lr': 1.6184294434560876e-05, 'samples': 22662656, 'steps': 44262, 'batch_loss/train': 0.648339859675616}
12/28/2021 18:59:46 - INFO - codeparrot_training - Step 44263: {'lr': 1.6178715474502974e-05, 'samples': 22663168, 'steps': 44263, 'batch_loss/train': 0.8187334598042071}
12/28/2021 19:00:00 - INFO - codeparrot_training - Step 44264: {'lr': 1.6173137444030233e-05, 'samples': 22663680, 'steps': 44264, 'batch_loss/train': 0.696036055451259}
12/28/2021 19:00:11 - INFO - codeparrot_training - Step 44265: {'lr': 1.61675603431648e-05, 'samples': 22664192, 'steps': 44265, 'batch_loss/train': 0.9853634536266327}
12/28/2021 19:00:22 - INFO - codeparrot_training - Step 44266: {'lr': 1.6161984171928852e-05, 'samples': 22664704, 'steps': 44266, 'batch_loss/train': 0.7899388279765844}
12/28/2021 19:00:34 - INFO - codeparrot_training - Step 44267: {'lr': 1.615640893034462e-05, 'samples': 22665216, 'steps': 44267, 'batch_loss/train': 0.7181662237271667}
12/28/2021 19:00:45 - INFO - codeparrot_training - Step 44268: {'lr': 1.6150834618434114e-05, 'samples': 22665728, 'steps': 44268, 'batch_loss/train': 0.8937777215614915}
12/28/2021 19:00:55 - INFO - codeparrot_training - Step 44269: {'lr': 1.6145261236219623e-05, 'samples': 22666240, 'steps': 44269, 'batch_loss/train': 0.5925140811596066}
12/28/2021 19:01:07 - INFO - codeparrot_training - Step 44270: {'lr': 1.613968878372332e-05, 'samples': 22666752, 'steps': 44270, 'batch_loss/train': 0.7380428188480437}
12/28/2021 19:01:18 - INFO - codeparrot_training - Step 44271: {'lr': 1.613411726096725e-05, 'samples': 22667264, 'steps': 44271, 'batch_loss/train': 0.7192294460255653}
12/28/2021 19:01:29 - INFO - codeparrot_training - Step 44272: {'lr': 1.6128546667973644e-05, 'samples': 22667776, 'steps': 44272, 'batch_loss/train': 0.7687409416539595}
12/28/2021 19:01:39 - INFO - codeparrot_training - Step 44273: {'lr': 1.612297700476467e-05, 'samples': 22668288, 'steps': 44273, 'batch_loss/train': 0.602747370256111}
12/28/2021 19:01:53 - INFO - codeparrot_training - Step 44274: {'lr': 1.6117408271362405e-05, 'samples': 22668800, 'steps': 44274, 'batch_loss/train': 0.678136202855967}
12/28/2021 19:02:04 - INFO - codeparrot_training - Step 44275: {'lr': 1.611184046778902e-05, 'samples': 22669312, 'steps': 44275, 'batch_loss/train': 0.7682886403053999}
12/28/2021 19:02:15 - INFO - codeparrot_training - Step 44276: {'lr': 1.6106273594066634e-05, 'samples': 22669824, 'steps': 44276, 'batch_loss/train': 0.5292214277142193}
12/28/2021 19:02:27 - INFO - codeparrot_training - Step 44277: {'lr': 1.6100707650217374e-05, 'samples': 22670336, 'steps': 44277, 'batch_loss/train': 0.7731345811625943}
12/28/2021 19:02:37 - INFO - codeparrot_training - Step 44278: {'lr': 1.609514263626341e-05, 'samples': 22670848, 'steps': 44278, 'batch_loss/train': 0.6676218844950199}
12/28/2021 19:02:48 - INFO - codeparrot_training - Step 44279: {'lr': 1.6089578552226812e-05, 'samples': 22671360, 'steps': 44279, 'batch_loss/train': 0.7413360420614481}
12/28/2021 19:03:00 - INFO - codeparrot_training - Step 44280: {'lr': 1.6084015398129732e-05, 'samples': 22671872, 'steps': 44280, 'batch_loss/train': 0.7483589090406895}
12/28/2021 19:03:11 - INFO - codeparrot_training - Step 44281: {'lr': 1.6078453173994313e-05, 'samples': 22672384, 'steps': 44281, 'batch_loss/train': 0.8144655216019601}
12/28/2021 19:03:21 - INFO - codeparrot_training - Step 44282: {'lr': 1.607289187984262e-05, 'samples': 22672896, 'steps': 44282, 'batch_loss/train': 0.6347014987841249}
12/28/2021 19:03:34 - INFO - codeparrot_training - Step 44283: {'lr': 1.606733151569678e-05, 'samples': 22673408, 'steps': 44283, 'batch_loss/train': 0.7089304896071553}
12/28/2021 19:03:45 - INFO - codeparrot_training - Step 44284: {'lr': 1.6061772081578942e-05, 'samples': 22673920, 'steps': 44284, 'batch_loss/train': 0.713048294885084}
12/28/2021 19:03:56 - INFO - codeparrot_training - Step 44285: {'lr': 1.6056213577511108e-05, 'samples': 22674432, 'steps': 44285, 'batch_loss/train': 0.6596158635802567}
12/28/2021 19:04:06 - INFO - codeparrot_training - Step 44286: {'lr': 1.6050656003515436e-05, 'samples': 22674944, 'steps': 44286, 'batch_loss/train': 0.7849548188969493}
12/28/2021 19:04:18 - INFO - codeparrot_training - Step 44287: {'lr': 1.604509935961407e-05, 'samples': 22675456, 'steps': 44287, 'batch_loss/train': 0.7553263790905476}
12/28/2021 19:04:29 - INFO - codeparrot_training - Step 44288: {'lr': 1.6039543645829018e-05, 'samples': 22675968, 'steps': 44288, 'batch_loss/train': 0.6569926119409502}
12/28/2021 19:04:40 - INFO - codeparrot_training - Step 44289: {'lr': 1.603398886218241e-05, 'samples': 22676480, 'steps': 44289, 'batch_loss/train': 0.7214555856771767}
12/28/2021 19:04:52 - INFO - codeparrot_training - Step 44290: {'lr': 1.60284350086963e-05, 'samples': 22676992, 'steps': 44290, 'batch_loss/train': 0.700921356678009}
12/28/2021 19:05:02 - INFO - codeparrot_training - Step 44291: {'lr': 1.6022882085392793e-05, 'samples': 22677504, 'steps': 44291, 'batch_loss/train': 0.7168099214322865}
12/28/2021 19:05:13 - INFO - codeparrot_training - Step 44292: {'lr': 1.6017330092293976e-05, 'samples': 22678016, 'steps': 44292, 'batch_loss/train': 0.6175303015625104}
12/28/2021 19:05:25 - INFO - codeparrot_training - Step 44293: {'lr': 1.601177902942194e-05, 'samples': 22678528, 'steps': 44293, 'batch_loss/train': 0.7225642609992065}
12/28/2021 19:05:36 - INFO - codeparrot_training - Step 44294: {'lr': 1.600622889679862e-05, 'samples': 22679040, 'steps': 44294, 'batch_loss/train': 0.7461053607985377}
12/28/2021 19:05:47 - INFO - codeparrot_training - Step 44295: {'lr': 1.600067969444624e-05, 'samples': 22679552, 'steps': 44295, 'batch_loss/train': 0.6950665847398341}
12/28/2021 19:05:57 - INFO - codeparrot_training - Step 44296: {'lr': 1.5995131422386844e-05, 'samples': 22680064, 'steps': 44296, 'batch_loss/train': 0.6914308127015829}
12/28/2021 19:06:11 - INFO - codeparrot_training - Step 44297: {'lr': 1.5989584080642354e-05, 'samples': 22680576, 'steps': 44297, 'batch_loss/train': 0.7982771010138094}
12/28/2021 19:06:22 - INFO - codeparrot_training - Step 44298: {'lr': 1.598403766923498e-05, 'samples': 22681088, 'steps': 44298, 'batch_loss/train': 0.6663259617052972}
12/28/2021 19:06:32 - INFO - codeparrot_training - Step 44299: {'lr': 1.5978492188186754e-05, 'samples': 22681600, 'steps': 44299, 'batch_loss/train': 0.7500038142316043}
12/28/2021 19:06:45 - INFO - codeparrot_training - Step 44300: {'lr': 1.5972947637519608e-05, 'samples': 22682112, 'steps': 44300, 'batch_loss/train': 0.7182165938429534}
12/28/2021 19:06:55 - INFO - codeparrot_training - Step 44301: {'lr': 1.5967404017255687e-05, 'samples': 22682624, 'steps': 44301, 'batch_loss/train': 0.8144579101353884}
12/28/2021 19:07:06 - INFO - codeparrot_training - Step 44302: {'lr': 1.5961861327416978e-05, 'samples': 22683136, 'steps': 44302, 'batch_loss/train': 0.645298620685935}
12/28/2021 19:07:20 - INFO - codeparrot_training - Step 44303: {'lr': 1.5956319568025567e-05, 'samples': 22683648, 'steps': 44303, 'batch_loss/train': 0.6718086791224778}
12/28/2021 19:07:30 - INFO - codeparrot_training - Step 44304: {'lr': 1.5950778739103443e-05, 'samples': 22684160, 'steps': 44304, 'batch_loss/train': 0.7022174925077707}
12/28/2021 19:07:41 - INFO - codeparrot_training - Step 44305: {'lr': 1.5945238840672643e-05, 'samples': 22684672, 'steps': 44305, 'batch_loss/train': 0.79038424231112}
12/28/2021 19:07:53 - INFO - codeparrot_training - Step 44306: {'lr': 1.5939699872755203e-05, 'samples': 22685184, 'steps': 44306, 'batch_loss/train': 0.8416506857611239}
12/28/2021 19:08:04 - INFO - codeparrot_training - Step 44307: {'lr': 1.5934161835373133e-05, 'samples': 22685696, 'steps': 44307, 'batch_loss/train': 1.0319015635177493}
12/28/2021 19:08:15 - INFO - codeparrot_training - Step 44308: {'lr': 1.5928624728548475e-05, 'samples': 22686208, 'steps': 44308, 'batch_loss/train': 0.7530293832533062}
12/28/2021 19:08:25 - INFO - codeparrot_training - Step 44309: {'lr': 1.5923088552303206e-05, 'samples': 22686720, 'steps': 44309, 'batch_loss/train': 0.8000767258927226}
12/28/2021 19:08:39 - INFO - codeparrot_training - Step
44310: {'lr': 1.5917553306659394e-05, 'samples': 22687232, 'steps': 44310, 'batch_loss/train': 0.6958627491258085} 12/28/2021 19:08:49 - INFO - codeparrot_training - Step 44311: {'lr': 1.5912018991638966e-05, 'samples': 22687744, 'steps': 44311, 'batch_loss/train': 0.9133764123544097} 12/28/2021 19:09:00 - INFO - codeparrot_training - Step 44312: {'lr': 1.590648560726393e-05, 'samples': 22688256, 'steps': 44312, 'batch_loss/train': 0.763369528343901} 12/28/2021 19:09:12 - INFO - codeparrot_training - Step 44313: {'lr': 1.5900953153556384e-05, 'samples': 22688768, 'steps': 44313, 'batch_loss/train': 0.62334657413885} 12/28/2021 19:09:23 - INFO - codeparrot_training - Step 44314: {'lr': 1.5895421630538225e-05, 'samples': 22689280, 'steps': 44314, 'batch_loss/train': 0.8575609987601638} 12/28/2021 19:09:33 - INFO - codeparrot_training - Step 44315: {'lr': 1.5889891038231463e-05, 'samples': 22689792, 'steps': 44315, 'batch_loss/train': 0.8757194555364549} 12/28/2021 19:09:44 - INFO - codeparrot_training - Step 44316: {'lr': 1.5884361376658106e-05, 'samples': 22690304, 'steps': 44316, 'batch_loss/train': 0.6927813803777099} 12/28/2021 19:09:56 - INFO - codeparrot_training - Step 44317: {'lr': 1.5878832645840114e-05, 'samples': 22690816, 'steps': 44317, 'batch_loss/train': 0.754369433503598} 12/28/2021 19:10:07 - INFO - codeparrot_training - Step 44318: {'lr': 1.587330484579952e-05, 'samples': 22691328, 'steps': 44318, 'batch_loss/train': 0.6629805851262063} 12/28/2021 19:10:17 - INFO - codeparrot_training - Step 44319: {'lr': 1.586777797655822e-05, 'samples': 22691840, 'steps': 44319, 'batch_loss/train': 0.7445853261742741} 12/28/2021 19:10:31 - INFO - codeparrot_training - Step 44320: {'lr': 1.5862252038138262e-05, 'samples': 22692352, 'steps': 44320, 'batch_loss/train': 0.6658275369554758} 12/28/2021 19:10:42 - INFO - codeparrot_training - Step 44321: {'lr': 1.585672703056157e-05, 'samples': 22692864, 'steps': 44321, 'batch_loss/train': 0.718418839154765} 12/28/2021 
19:10:52 - INFO - codeparrot_training - Step 44322: {'lr': 1.5851202953850175e-05, 'samples': 22693376, 'steps': 44322, 'batch_loss/train': 0.7630188791081309} 12/28/2021 19:11:05 - INFO - codeparrot_training - Step 44323: {'lr': 1.5845679808025897e-05, 'samples': 22693888, 'steps': 44323, 'batch_loss/train': 0.7406565030105412} 12/28/2021 19:11:15 - INFO - codeparrot_training - Step 44324: {'lr': 1.5840157593110803e-05, 'samples': 22694400, 'steps': 44324, 'batch_loss/train': 0.7992643248289824} 12/28/2021 19:11:26 - INFO - codeparrot_training - Step 44325: {'lr': 1.5834636309126904e-05, 'samples': 22694912, 'steps': 44325, 'batch_loss/train': 0.755512148141861} 12/28/2021 19:11:38 - INFO - codeparrot_training - Step 44326: {'lr': 1.5829115956096013e-05, 'samples': 22695424, 'steps': 44326, 'batch_loss/train': 0.8142629992216825} 12/28/2021 19:11:49 - INFO - codeparrot_training - Step 44327: {'lr': 1.5823596534040085e-05, 'samples': 22695936, 'steps': 44327, 'batch_loss/train': 1.3711061533540487} 12/28/2021 19:11:59 - INFO - codeparrot_training - Step 44328: {'lr': 1.5818078042981216e-05, 'samples': 22696448, 'steps': 44328, 'batch_loss/train': 0.5645300354517531} 12/28/2021 19:12:10 - INFO - codeparrot_training - Step 44329: {'lr': 1.581256048294119e-05, 'samples': 22696960, 'steps': 44329, 'batch_loss/train': 0.7131512900814414} 12/28/2021 19:12:24 - INFO - codeparrot_training - Step 44330: {'lr': 1.5807043853942023e-05, 'samples': 22697472, 'steps': 44330, 'batch_loss/train': 0.6809406246757135} 12/28/2021 19:12:34 - INFO - codeparrot_training - Step 44331: {'lr': 1.5801528156005608e-05, 'samples': 22697984, 'steps': 44331, 'batch_loss/train': 0.7205755361355841} 12/28/2021 19:12:45 - INFO - codeparrot_training - Step 44332: {'lr': 1.5796013389153873e-05, 'samples': 22698496, 'steps': 44332, 'batch_loss/train': 0.6485144211910665} 12/28/2021 19:12:57 - INFO - codeparrot_training - Step 44333: {'lr': 1.5790499553408777e-05, 'samples': 22699008, 'steps': 44333, 
'batch_loss/train': 0.7919032771023922} 12/28/2021 19:13:08 - INFO - codeparrot_training - Step 44334: {'lr': 1.5784986648792215e-05, 'samples': 22699520, 'steps': 44334, 'batch_loss/train': 0.8902664706110954} 12/28/2021 19:13:18 - INFO - codeparrot_training - Step 44335: {'lr': 1.577947467532609e-05, 'samples': 22700032, 'steps': 44335, 'batch_loss/train': 0.833325014449656} 12/28/2021 19:13:31 - INFO - codeparrot_training - Step 44336: {'lr': 1.577396363303235e-05, 'samples': 22700544, 'steps': 44336, 'batch_loss/train': 0.6600474519655108} 12/28/2021 19:13:41 - INFO - codeparrot_training - Step 44337: {'lr': 1.5768453521932928e-05, 'samples': 22701056, 'steps': 44337, 'batch_loss/train': 0.7488461988978088} 12/28/2021 19:13:52 - INFO - codeparrot_training - Step 44338: {'lr': 1.576294434204964e-05, 'samples': 22701568, 'steps': 44338, 'batch_loss/train': 0.6089180042035878} 12/28/2021 19:14:03 - INFO - codeparrot_training - Step 44339: {'lr': 1.5757436093404493e-05, 'samples': 22702080, 'steps': 44339, 'batch_loss/train': 0.7693813340738416} 12/28/2021 19:14:15 - INFO - codeparrot_training - Step 44340: {'lr': 1.57519287760193e-05, 'samples': 22702592, 'steps': 44340, 'batch_loss/train': 0.9452579049393535} 12/28/2021 19:14:26 - INFO - codeparrot_training - Step 44341: {'lr': 1.574642238991597e-05, 'samples': 22703104, 'steps': 44341, 'batch_loss/train': 0.6972414404153824} 12/28/2021 19:14:36 - INFO - codeparrot_training - Step 44342: {'lr': 1.574091693511645e-05, 'samples': 22703616, 'steps': 44342, 'batch_loss/train': 0.6541494554840028} 12/28/2021 19:14:49 - INFO - codeparrot_training - Step 44343: {'lr': 1.5735412411642557e-05, 'samples': 22704128, 'steps': 44343, 'batch_loss/train': 0.7288720803335309} 12/28/2021 19:15:00 - INFO - codeparrot_training - Step 44344: {'lr': 1.572990881951622e-05, 'samples': 22704640, 'steps': 44344, 'batch_loss/train': 0.7036150712519884} 12/28/2021 19:15:11 - INFO - codeparrot_training - Step 44345: {'lr': 
1.5724406158759336e-05, 'samples': 22705152, 'steps': 44345, 'batch_loss/train': 0.6771576311439276} 12/28/2021 19:15:23 - INFO - codeparrot_training - Step 44346: {'lr': 1.571890442939372e-05, 'samples': 22705664, 'steps': 44346, 'batch_loss/train': 0.7047913964488544} 12/28/2021 19:15:34 - INFO - codeparrot_training - Step 44347: {'lr': 1.5713403631441303e-05, 'samples': 22706176, 'steps': 44347, 'batch_loss/train': 0.6982322502881289} 12/28/2021 19:15:44 - INFO - codeparrot_training - Step 44348: {'lr': 1.5707903764923953e-05, 'samples': 22706688, 'steps': 44348, 'batch_loss/train': 0.8105487311258912} 12/28/2021 19:15:55 - INFO - codeparrot_training - Step 44349: {'lr': 1.5702404829863432e-05, 'samples': 22707200, 'steps': 44349, 'batch_loss/train': 0.7825024258345366} 12/28/2021 19:16:09 - INFO - codeparrot_training - Step 44350: {'lr': 1.569690682628172e-05, 'samples': 22707712, 'steps': 44350, 'batch_loss/train': 0.7822496574372053} 12/28/2021 19:16:19 - INFO - codeparrot_training - Step 44351: {'lr': 1.5691409754200692e-05, 'samples': 22708224, 'steps': 44351, 'batch_loss/train': 0.6888970371801406} 12/28/2021 19:16:30 - INFO - codeparrot_training - Step 44352: {'lr': 1.5685913613642106e-05, 'samples': 22708736, 'steps': 44352, 'batch_loss/train': 0.8998481980524957} 12/28/2021 19:16:42 - INFO - codeparrot_training - Step 44353: {'lr': 1.5680418404627804e-05, 'samples': 22709248, 'steps': 44353, 'batch_loss/train': 0.6524684201576747} 12/28/2021 19:16:53 - INFO - codeparrot_training - Step 44354: {'lr': 1.5674924127179775e-05, 'samples': 22709760, 'steps': 44354, 'batch_loss/train': 0.7533221244812012} 12/28/2021 19:17:04 - INFO - codeparrot_training - Step 44355: {'lr': 1.5669430781319716e-05, 'samples': 22710272, 'steps': 44355, 'batch_loss/train': 0.8596371398307383} 12/28/2021 19:17:15 - INFO - codeparrot_training - Step 44356: {'lr': 1.566393836706953e-05, 'samples': 22710784, 'steps': 44356, 'batch_loss/train': 0.6593566802330315} 12/28/2021 19:17:26 
- INFO - codeparrot_training - Step 44357: {'lr': 1.5658446884451032e-05, 'samples': 22711296, 'steps': 44357, 'batch_loss/train': 0.7672759188571945} 12/28/2021 19:17:37 - INFO - codeparrot_training - Step 44358: {'lr': 1.5652956333486067e-05, 'samples': 22711808, 'steps': 44358, 'batch_loss/train': 0.7031656764447689} 12/28/2021 19:17:47 - INFO - codeparrot_training - Step 44359: {'lr': 1.564746671419648e-05, 'samples': 22712320, 'steps': 44359, 'batch_loss/train': 0.6823712881887332} 12/28/2021 19:18:00 - INFO - codeparrot_training - Step 44360: {'lr': 1.5641978026604055e-05, 'samples': 22712832, 'steps': 44360, 'batch_loss/train': 0.5696758010890335} 12/28/2021 19:18:10 - INFO - codeparrot_training - Step 44361: {'lr': 1.5636490270730637e-05, 'samples': 22713344, 'steps': 44361, 'batch_loss/train': 0.6987545997835696} 12/28/2021 19:18:21 - INFO - codeparrot_training - Step 44362: {'lr': 1.563100344659807e-05, 'samples': 22713856, 'steps': 44362, 'batch_loss/train': 0.6703267865814269} 12/28/2021 19:18:35 - INFO - codeparrot_training - Step 44363: {'lr': 1.5625517554228143e-05, 'samples': 22714368, 'steps': 44363, 'batch_loss/train': 0.7257239283062518} 12/28/2021 19:18:46 - INFO - codeparrot_training - Step 44364: {'lr': 1.5620032593642615e-05, 'samples': 22714880, 'steps': 44364, 'batch_loss/train': 0.6501891675870866} 12/28/2021 19:18:57 - INFO - codeparrot_training - Step 44365: {'lr': 1.5614548564863385e-05, 'samples': 22715392, 'steps': 44365, 'batch_loss/train': 0.4006544796284288} 12/28/2021 19:19:09 - INFO - codeparrot_training - Step 44366: {'lr': 1.5609065467912188e-05, 'samples': 22715904, 'steps': 44366, 'batch_loss/train': 0.6687113428488374} 12/28/2021 19:19:20 - INFO - codeparrot_training - Step 44367: {'lr': 1.5603583302810837e-05, 'samples': 22716416, 'steps': 44367, 'batch_loss/train': 0.7066816333681345} 12/28/2021 19:19:30 - INFO - codeparrot_training - Step 44368: {'lr': 1.559810206958115e-05, 'samples': 22716928, 'steps': 44368, 
'batch_loss/train': 0.7387996207689866} 12/28/2021 19:19:41 - INFO - codeparrot_training - Step 44369: {'lr': 1.5592621768244885e-05, 'samples': 22717440, 'steps': 44369, 'batch_loss/train': 0.727974895038642} 12/28/2021 19:19:55 - INFO - codeparrot_training - Step 44370: {'lr': 1.558714239882386e-05, 'samples': 22717952, 'steps': 44370, 'batch_loss/train': 0.7069232980720699} 12/28/2021 19:20:05 - INFO - codeparrot_training - Step 44371: {'lr': 1.5581663961339836e-05, 'samples': 22718464, 'steps': 44371, 'batch_loss/train': 0.7508357726037502} 12/28/2021 19:20:16 - INFO - codeparrot_training - Step 44372: {'lr': 1.55761864558146e-05, 'samples': 22718976, 'steps': 44372, 'batch_loss/train': 0.7869826294481754} 12/28/2021 19:20:28 - INFO - codeparrot_training - Step 44373: {'lr': 1.557070988226994e-05, 'samples': 22719488, 'steps': 44373, 'batch_loss/train': 0.7176119829528034} 12/28/2021 19:20:39 - INFO - codeparrot_training - Step 44374: {'lr': 1.5565234240727642e-05, 'samples': 22720000, 'steps': 44374, 'batch_loss/train': 0.6930803614668548} 12/28/2021 19:20:50 - INFO - codeparrot_training - Step 44375: {'lr': 1.5559759531209416e-05, 'samples': 22720512, 'steps': 44375, 'batch_loss/train': 0.7050572037696838} 12/28/2021 19:21:02 - INFO - codeparrot_training - Step 44376: {'lr': 1.5554285753737073e-05, 'samples': 22721024, 'steps': 44376, 'batch_loss/train': 0.7231528768315911} 12/28/2021 19:21:12 - INFO - codeparrot_training - Step 44377: {'lr': 1.5548812908332405e-05, 'samples': 22721536, 'steps': 44377, 'batch_loss/train': 0.8665075394092128} 12/28/2021 19:21:23 - INFO - codeparrot_training - Step 44378: {'lr': 1.5543340995017084e-05, 'samples': 22722048, 'steps': 44378, 'batch_loss/train': 0.7045423053205013} 12/28/2021 19:21:34 - INFO - codeparrot_training - Step 44379: {'lr': 1.5537870013812905e-05, 'samples': 22722560, 'steps': 44379, 'batch_loss/train': 0.6860078300815076} 12/28/2021 19:21:47 - INFO - codeparrot_training - Step 44380: {'lr': 
1.5532399964741678e-05, 'samples': 22723072, 'steps': 44380, 'batch_loss/train': 0.6947306650690734} 12/28/2021 19:21:58 - INFO - codeparrot_training - Step 44381: {'lr': 1.5526930847825083e-05, 'samples': 22723584, 'steps': 44381, 'batch_loss/train': 0.6993760764598846} 12/28/2021 19:22:08 - INFO - codeparrot_training - Step 44382: {'lr': 1.5521462663084825e-05, 'samples': 22724096, 'steps': 44382, 'batch_loss/train': 0.7245543396566063} 12/28/2021 19:22:21 - INFO - codeparrot_training - Step 44383: {'lr': 1.5515995410542743e-05, 'samples': 22724608, 'steps': 44383, 'batch_loss/train': 0.7849345649592578} 12/28/2021 19:22:31 - INFO - codeparrot_training - Step 44384: {'lr': 1.551052909022052e-05, 'samples': 22725120, 'steps': 44384, 'batch_loss/train': 0.6288302764296532} 12/28/2021 19:22:42 - INFO - codeparrot_training - Step 44385: {'lr': 1.5505063702139888e-05, 'samples': 22725632, 'steps': 44385, 'batch_loss/train': 0.5858407744672149} 12/28/2021 19:22:54 - INFO - codeparrot_training - Step 44386: {'lr': 1.5499599246322575e-05, 'samples': 22726144, 'steps': 44386, 'batch_loss/train': 0.4988360834540799} 12/28/2021 19:23:05 - INFO - codeparrot_training - Step 44387: {'lr': 1.549413572279032e-05, 'samples': 22726656, 'steps': 44387, 'batch_loss/train': 0.6194188894005492} 12/28/2021 19:23:15 - INFO - codeparrot_training - Step 44388: {'lr': 1.548867313156485e-05, 'samples': 22727168, 'steps': 44388, 'batch_loss/train': 0.7806696137413383} 12/28/2021 19:23:29 - INFO - codeparrot_training - Step 44389: {'lr': 1.5483211472667876e-05, 'samples': 22727680, 'steps': 44389, 'batch_loss/train': 0.7744014789350331} 12/28/2021 19:23:39 - INFO - codeparrot_training - Step 44390: {'lr': 1.547775074612104e-05, 'samples': 22728192, 'steps': 44390, 'batch_loss/train': 0.7441305950051174} 12/28/2021 19:23:50 - INFO - codeparrot_training - Step 44391: {'lr': 1.5472290951946187e-05, 'samples': 22728704, 'steps': 44391, 'batch_loss/train': 0.7652040245011449} 12/28/2021 19:24:00 - 
INFO - codeparrot_training - Step 44392: {'lr': 1.5466832090164916e-05, 'samples': 22729216, 'steps': 44392, 'batch_loss/train': 0.8338326434604824} 12/28/2021 19:24:13 - INFO - codeparrot_training - Step 44393: {'lr': 1.546137416079893e-05, 'samples': 22729728, 'steps': 44393, 'batch_loss/train': 0.833690982311964} 12/28/2021 19:24:23 - INFO - codeparrot_training - Step 44394: {'lr': 1.5455917163870042e-05, 'samples': 22730240, 'steps': 44394, 'batch_loss/train': 0.6248386651277542} 12/28/2021 19:24:34 - INFO - codeparrot_training - Step 44395: {'lr': 1.5450461099399822e-05, 'samples': 22730752, 'steps': 44395, 'batch_loss/train': 0.7511014682240784} 12/28/2021 19:24:47 - INFO - codeparrot_training - Step 44396: {'lr': 1.5445005967409996e-05, 'samples': 22731264, 'steps': 44396, 'batch_loss/train': 0.8502559652552009} 12/28/2021 19:24:58 - INFO - codeparrot_training - Step 44397: {'lr': 1.543955176792228e-05, 'samples': 22731776, 'steps': 44397, 'batch_loss/train': 0.7549099926836789} 12/28/2021 19:25:08 - INFO - codeparrot_training - Step 44398: {'lr': 1.543409850095831e-05, 'samples': 22732288, 'steps': 44398, 'batch_loss/train': 0.8163443924859166} 12/28/2021 19:25:21 - INFO - codeparrot_training - Step 44399: {'lr': 1.542864616653983e-05, 'samples': 22732800, 'steps': 44399, 'batch_loss/train': 0.81215357221663} 12/28/2021 19:25:31 - INFO - codeparrot_training - Step 44400: {'lr': 1.542319476468851e-05, 'samples': 22733312, 'steps': 44400, 'batch_loss/train': 0.6838477025739849} 12/28/2021 19:25:42 - INFO - codeparrot_training - Step 44401: {'lr': 1.541774429542592e-05, 'samples': 22733824, 'steps': 44401, 'batch_loss/train': 0.786426228005439} 12/28/2021 19:25:53 - INFO - codeparrot_training - Step 44402: {'lr': 1.5412294758773813e-05, 'samples': 22734336, 'steps': 44402, 'batch_loss/train': 0.6966199576854706} 12/28/2021 19:26:05 - INFO - codeparrot_training - Step 44403: {'lr': 1.5406846154753935e-05, 'samples': 22734848, 'steps': 44403, 'batch_loss/train': 
0.7097383555956185} 12/28/2021 19:26:15 - INFO - codeparrot_training - Step 44404: {'lr': 1.5401398483387784e-05, 'samples': 22735360, 'steps': 44404, 'batch_loss/train': 0.7392145632766187} 12/28/2021 19:26:26 - INFO - codeparrot_training - Step 44405: {'lr': 1.5395951744697045e-05, 'samples': 22735872, 'steps': 44405, 'batch_loss/train': 0.6430657958844677} 12/28/2021 19:26:38 - INFO - codeparrot_training - Step 44406: {'lr': 1.539050593870353e-05, 'samples': 22736384, 'steps': 44406, 'batch_loss/train': 0.7216494493186474} 12/28/2021 19:26:49 - INFO - codeparrot_training - Step 44407: {'lr': 1.5385061065428724e-05, 'samples': 22736896, 'steps': 44407, 'batch_loss/train': 0.7770240870304406} 12/28/2021 19:26:59 - INFO - codeparrot_training - Step 44408: {'lr': 1.5379617124894274e-05, 'samples': 22737408, 'steps': 44408, 'batch_loss/train': 0.718339663464576} 12/28/2021 19:27:13 - INFO - codeparrot_training - Step 44409: {'lr': 1.537417411712197e-05, 'samples': 22737920, 'steps': 44409, 'batch_loss/train': 0.6367608858272433} 12/28/2021 19:27:24 - INFO - codeparrot_training - Step 44410: {'lr': 1.536873204213332e-05, 'samples': 22738432, 'steps': 44410, 'batch_loss/train': 0.692446896340698} 12/28/2021 19:27:34 - INFO - codeparrot_training - Step 44411: {'lr': 1.5363290899950004e-05, 'samples': 22738944, 'steps': 44411, 'batch_loss/train': 0.8053944366984069} 12/28/2021 19:27:45 - INFO - codeparrot_training - Step 44412: {'lr': 1.5357850690593616e-05, 'samples': 22739456, 'steps': 44412, 'batch_loss/train': 0.8304861737415195} 12/28/2021 19:27:57 - INFO - codeparrot_training - Step 44413: {'lr': 1.535241141408586e-05, 'samples': 22739968, 'steps': 44413, 'batch_loss/train': 0.7322831536293961} 12/28/2021 19:28:08 - INFO - codeparrot_training - Step 44414: {'lr': 1.53469730704483e-05, 'samples': 22740480, 'steps': 44414, 'batch_loss/train': 0.6695693801157176} 12/28/2021 19:28:18 - INFO - codeparrot_training - Step 44415: {'lr': 1.534153565970259e-05, 'samples': 
22740992, 'steps': 44415, 'batch_loss/train': 0.6102096484974027} 12/28/2021 19:28:30 - INFO - codeparrot_training - Step 44416: {'lr': 1.5336099181870287e-05, 'samples': 22741504, 'steps': 44416, 'batch_loss/train': 0.6565303541719913} 12/28/2021 19:28:41 - INFO - codeparrot_training - Step 44417: {'lr': 1.5330663636973053e-05, 'samples': 22742016, 'steps': 44417, 'batch_loss/train': 0.6927918130531907} 12/28/2021 19:28:52 - INFO - codeparrot_training - Step 44418: {'lr': 1.5325229025032554e-05, 'samples': 22742528, 'steps': 44418, 'batch_loss/train': 0.7187347188591957} 12/28/2021 19:29:05 - INFO - codeparrot_training - Step 44419: {'lr': 1.5319795346070254e-05, 'samples': 22743040, 'steps': 44419, 'batch_loss/train': 0.8325517391785979} 12/28/2021 19:29:16 - INFO - codeparrot_training - Step 44420: {'lr': 1.5314362600107905e-05, 'samples': 22743552, 'steps': 44420, 'batch_loss/train': 0.7171320924535394} 12/28/2021 19:29:26 - INFO - codeparrot_training - Step 44421: {'lr': 1.5308930787167024e-05, 'samples': 22744064, 'steps': 44421, 'batch_loss/train': 0.7338522607460618} 12/28/2021 19:29:37 - INFO - codeparrot_training - Step 44422: {'lr': 1.53034999072692e-05, 'samples': 22744576, 'steps': 44422, 'batch_loss/train': 0.7425373150035739} 12/28/2021 19:29:49 - INFO - codeparrot_training - Step 44423: {'lr': 1.5298069960436033e-05, 'samples': 22745088, 'steps': 44423, 'batch_loss/train': 0.7438248284161091} 12/28/2021 19:30:00 - INFO - codeparrot_training - Step 44424: {'lr': 1.529264094668914e-05, 'samples': 22745600, 'steps': 44424, 'batch_loss/train': 0.8473088688333519} 12/28/2021 19:30:10 - INFO - codeparrot_training - Step 44425: {'lr': 1.528721286605006e-05, 'samples': 22746112, 'steps': 44425, 'batch_loss/train': 0.8484486498637125} 12/28/2021 19:30:24 - INFO - codeparrot_training - Step 44426: {'lr': 1.5281785718540413e-05, 'samples': 22746624, 'steps': 44426, 'batch_loss/train': 0.760726684005931} 12/28/2021 19:30:35 - INFO - codeparrot_training - Step 
44427: {'lr': 1.527635950418177e-05, 'samples': 22747136, 'steps': 44427, 'batch_loss/train': 0.7277459658216685} 12/28/2021 19:30:45 - INFO - codeparrot_training - Step 44428: {'lr': 1.5270934222995663e-05, 'samples': 22747648, 'steps': 44428, 'batch_loss/train': 0.7800780707038939} 12/28/2021 19:30:57 - INFO - codeparrot_training - Step 44429: {'lr': 1.5265509875003747e-05, 'samples': 22748160, 'steps': 44429, 'batch_loss/train': 0.7386282747611403} 12/28/2021 19:31:08 - INFO - codeparrot_training - Step 44430: {'lr': 1.5260086460227473e-05, 'samples': 22748672, 'steps': 44430, 'batch_loss/train': 0.7502810908481479} 12/28/2021 19:31:19 - INFO - codeparrot_training - Step 44431: {'lr': 1.5254663978688466e-05, 'samples': 22749184, 'steps': 44431, 'batch_loss/train': 0.7580657862126827} 12/28/2021 19:31:31 - INFO - codeparrot_training - Step 44432: {'lr': 1.5249242430408345e-05, 'samples': 22749696, 'steps': 44432, 'batch_loss/train': 0.7259154352359474} 12/28/2021 19:31:41 - INFO - codeparrot_training - Step 44433: {'lr': 1.5243821815408537e-05, 'samples': 22750208, 'steps': 44433, 'batch_loss/train': 0.7935123834758997} 12/28/2021 19:31:52 - INFO - codeparrot_training - Step 44434: {'lr': 1.5238402133710637e-05, 'samples': 22750720, 'steps': 44434, 'batch_loss/train': 0.7697290824726224} 12/28/2021 19:32:03 - INFO - codeparrot_training - Step 44435: {'lr': 1.5232983385336269e-05, 'samples': 22751232, 'steps': 44435, 'batch_loss/train': 0.7863826008979231} 12/28/2021 19:32:17 - INFO - codeparrot_training - Step 44436: {'lr': 1.5227565570306884e-05, 'samples': 22751744, 'steps': 44436, 'batch_loss/train': 0.724615097977221} 12/28/2021 19:32:27 - INFO - codeparrot_training - Step 44437: {'lr': 1.5222148688644021e-05, 'samples': 22752256, 'steps': 44437, 'batch_loss/train': 0.7592490212991834} 12/28/2021 19:32:38 - INFO - codeparrot_training - Step 44438: {'lr': 1.5216732740369277e-05, 'samples': 22752768, 'steps': 44438, 'batch_loss/train': 0.810789032606408} 
12/28/2021 19:32:50 - INFO - codeparrot_training - Step 44439: {'lr': 1.521131772550416e-05, 'samples': 22753280, 'steps': 44439, 'batch_loss/train': 0.8091069757938385} 12/28/2021 19:33:01 - INFO - codeparrot_training - Step 44440: {'lr': 1.520590364407018e-05, 'samples': 22753792, 'steps': 44440, 'batch_loss/train': 0.7631713710725307} 12/28/2021 19:33:11 - INFO - codeparrot_training - Step 44441: {'lr': 1.5200490496088904e-05, 'samples': 22754304, 'steps': 44441, 'batch_loss/train': 0.6126543362624943} 12/28/2021 19:33:23 - INFO - codeparrot_training - Step 44442: {'lr': 1.5195078281581759e-05, 'samples': 22754816, 'steps': 44442, 'batch_loss/train': 0.8185417240601964} 12/28/2021 19:33:34 - INFO - codeparrot_training - Step 44443: {'lr': 1.518966700057034e-05, 'samples': 22755328, 'steps': 44443, 'batch_loss/train': 0.6932616622652858} 12/28/2021 19:33:45 - INFO - codeparrot_training - Step 44444: {'lr': 1.5184256653076185e-05, 'samples': 22755840, 'steps': 44444, 'batch_loss/train': 0.8082292536273599} 12/28/2021 19:33:55 - INFO - codeparrot_training - Step 44445: {'lr': 1.517884723912072e-05, 'samples': 22756352, 'steps': 44445, 'batch_loss/train': 0.75874581374228} 12/28/2021 19:34:09 - INFO - codeparrot_training - Step 44446: {'lr': 1.517343875872554e-05, 'samples': 22756864, 'steps': 44446, 'batch_loss/train': 0.800376464612782} 12/28/2021 19:34:20 - INFO - codeparrot_training - Step 44447: {'lr': 1.516803121191207e-05, 'samples': 22757376, 'steps': 44447, 'batch_loss/train': 0.8329168916679919} 12/28/2021 19:34:31 - INFO - codeparrot_training - Step 44448: {'lr': 1.5162624598701824e-05, 'samples': 22757888, 'steps': 44448, 'batch_loss/train': 0.8304764851927757} 12/28/2021 19:34:43 - INFO - codeparrot_training - Step 44449: {'lr': 1.515721891911634e-05, 'samples': 22758400, 'steps': 44449, 'batch_loss/train': 0.705072307959199} 12/28/2021 19:34:53 - INFO - codeparrot_training - Step 44450: {'lr': 1.5151814173177043e-05, 'samples': 22758912, 'steps': 
44450, 'batch_loss/train': 0.8370769424363971} 12/28/2021 19:35:04 - INFO - codeparrot_training - Step 44451: {'lr': 1.51464103609055e-05, 'samples': 22759424, 'steps': 44451, 'batch_loss/train': 0.7483592573553324} 12/28/2021 19:35:17 - INFO - codeparrot_training - Step 44452: {'lr': 1.514100748232311e-05, 'samples': 22759936, 'steps': 44452, 'batch_loss/train': 0.7974230693653226} 12/28/2021 19:35:27 - INFO - codeparrot_training - Step 44453: {'lr': 1.5135605537451414e-05, 'samples': 22760448, 'steps': 44453, 'batch_loss/train': 0.8082894212566316} 12/28/2021 19:35:38 - INFO - codeparrot_training - Step 44454: {'lr': 1.5130204526311891e-05, 'samples': 22760960, 'steps': 44454, 'batch_loss/train': 0.7536374279297888} 12/28/2021 19:35:49 - INFO - codeparrot_training - Step 44455: {'lr': 1.5124804448925999e-05, 'samples': 22761472, 'steps': 44455, 'batch_loss/train': 0.7717868397012353} 12/28/2021 19:36:02 - INFO - codeparrot_training - Step 44456: {'lr': 1.5119405305315137e-05, 'samples': 22761984, 'steps': 44456, 'batch_loss/train': 0.649835747666657} 12/28/2021 19:36:13 - INFO - codeparrot_training - Step 44457: {'lr': 1.511400709550087e-05, 'samples': 22762496, 'steps': 44457, 'batch_loss/train': 0.726723110768944} 12/28/2021 19:36:23 - INFO - codeparrot_training - Step 44458: {'lr': 1.5108609819504654e-05, 'samples': 22763008, 'steps': 44458, 'batch_loss/train': 0.6277072047814727} 12/28/2021 19:36:35 - INFO - codeparrot_training - Step 44459: {'lr': 1.5103213477347888e-05, 'samples': 22763520, 'steps': 44459, 'batch_loss/train': 0.6924799755215645} 12/28/2021 19:36:46 - INFO - codeparrot_training - Step 44460: {'lr': 1.5097818069052e-05, 'samples': 22764032, 'steps': 44460, 'batch_loss/train': 0.7830400308594108} 12/28/2021 19:36:57 - INFO - codeparrot_training - Step 44461: {'lr': 1.5092423594638555e-05, 'samples': 22764544, 'steps': 44461, 'batch_loss/train': 0.7607603981159627} 12/28/2021 19:37:09 - INFO - codeparrot_training - Step 44462: {'lr': 
1.5087030054128926e-05, 'samples': 22765056, 'steps': 44462, 'batch_loss/train': 0.7389201335608959} 12/28/2021 19:37:19 - INFO - codeparrot_training - Step 44463: {'lr': 1.5081637447544538e-05, 'samples': 22765568, 'steps': 44463, 'batch_loss/train': 0.7345748590305448} 12/28/2021 19:37:30 - INFO - codeparrot_training - Step 44464: {'lr': 1.5076245774906877e-05, 'samples': 22766080, 'steps': 44464, 'batch_loss/train': 0.7701140618883073} 12/28/2021 19:37:40 - INFO - codeparrot_training - Step 44465: {'lr': 1.5070855036237369e-05, 'samples': 22766592, 'steps': 44465, 'batch_loss/train': 0.7133084442466497} 12/28/2021 19:37:54 - INFO - codeparrot_training - Step 44466: {'lr': 1.5065465231557413e-05, 'samples': 22767104, 'steps': 44466, 'batch_loss/train': 0.7720306219998747} 12/28/2021 19:38:04 - INFO - codeparrot_training - Step 44467: {'lr': 1.5060076360888465e-05, 'samples': 22767616, 'steps': 44467, 'batch_loss/train': 0.8617779370397329} 12/28/2021 19:38:15 - INFO - codeparrot_training - Step 44468: {'lr': 1.505468842425195e-05, 'samples': 22768128, 'steps': 44468, 'batch_loss/train': 0.7054209299385548} 12/28/2021 19:38:27 - INFO - codeparrot_training - Step 44469: {'lr': 1.5049301421669298e-05, 'samples': 22768640, 'steps': 44469, 'batch_loss/train': 0.795904211467132} 12/28/2021 19:38:38 - INFO - codeparrot_training - Step 44470: {'lr': 1.5043915353161908e-05, 'samples': 22769152, 'steps': 44470, 'batch_loss/train': 0.7972309280885383} 12/28/2021 19:38:48 - INFO - codeparrot_training - Step 44471: {'lr': 1.5038530218751152e-05, 'samples': 22769664, 'steps': 44471, 'batch_loss/train': 0.6798349805176258} 12/28/2021 19:39:02 - INFO - codeparrot_training - Step 44472: {'lr': 1.503314601845851e-05, 'samples': 22770176, 'steps': 44472, 'batch_loss/train': 0.7055594231933355} 12/28/2021 19:39:12 - INFO - codeparrot_training - Step 44473: {'lr': 1.5027762752305385e-05, 'samples': 22770688, 'steps': 44473, 'batch_loss/train': 0.7955365704838187} 12/28/2021 19:39:23 
- INFO - codeparrot_training - Step 44474: {'lr': 1.502238042031312e-05, 'samples': 22771200, 'steps': 44474, 'batch_loss/train': 0.6452991520054638} 12/28/2021 19:39:34 - INFO - codeparrot_training - Step 44475: {'lr': 1.501699902250317e-05, 'samples': 22771712, 'steps': 44475, 'batch_loss/train': 0.774208853719756} 12/28/2021 19:39:46 - INFO - codeparrot_training - Step 44476: {'lr': 1.5011618558896878e-05, 'samples': 22772224, 'steps': 44476, 'batch_loss/train': 0.6959541453979909} 12/28/2021 19:39:56 - INFO - codeparrot_training - Step 44477: {'lr': 1.5006239029515673e-05, 'samples': 22772736, 'steps': 44477, 'batch_loss/train': 0.7447398100048304} 12/28/2021 19:40:07 - INFO - codeparrot_training - Step 44478: {'lr': 1.5000860434380924e-05, 'samples': 22773248, 'steps': 44478, 'batch_loss/train': 0.6963043115101755} 12/28/2021 19:40:19 - INFO - codeparrot_training - Step 44479: {'lr': 1.4995482773514035e-05, 'samples': 22773760, 'steps': 44479, 'batch_loss/train': 0.7628959352150559} 12/28/2021 19:40:30 - INFO - codeparrot_training - Step 44480: {'lr': 1.4990106046936375e-05, 'samples': 22774272, 'steps': 44480, 'batch_loss/train': 0.7242788085713983} 12/28/2021 19:40:40 - INFO - codeparrot_training - Step 44481: {'lr': 1.4984730254669344e-05, 'samples': 22774784, 'steps': 44481, 'batch_loss/train': 0.7687091552652419} 12/28/2021 19:40:53 - INFO - codeparrot_training - Step 44482: {'lr': 1.4979355396734201e-05, 'samples': 22775296, 'steps': 44482, 'batch_loss/train': 0.6564990489277989} 12/28/2021 19:41:03 - INFO - codeparrot_training - Step 44483: {'lr': 1.497398147315246e-05, 'samples': 22775808, 'steps': 44483, 'batch_loss/train': 0.6720117116346955} 12/28/2021 19:41:14 - INFO - codeparrot_training - Step 44484: {'lr': 1.4968608483945433e-05, 'samples': 22776320, 'steps': 44484, 'batch_loss/train': 0.7607802310958505} 12/28/2021 19:41:27 - INFO - codeparrot_training - Step 44485: {'lr': 1.496323642913447e-05, 'samples': 22776832, 'steps': 44485, 
'batch_loss/train': 0.7071882151067257} 12/28/2021 19:41:38 - INFO - codeparrot_training - Step 44486: {'lr': 1.495786530874088e-05, 'samples': 22777344, 'steps': 44486, 'batch_loss/train': 0.6981701664626598} 12/28/2021 19:41:48 - INFO - codeparrot_training - Step 44487: {'lr': 1.4952495122786153e-05, 'samples': 22777856, 'steps': 44487, 'batch_loss/train': 0.7288296609767713} 12/28/2021 19:41:59 - INFO - codeparrot_training - Step 44488: {'lr': 1.4947125871291516e-05, 'samples': 22778368, 'steps': 44488, 'batch_loss/train': 0.7071274691261351} 12/28/2021 19:42:11 - INFO - codeparrot_training - Step 44489: {'lr': 1.4941757554278346e-05, 'samples': 22778880, 'steps': 44489, 'batch_loss/train': 0.913590332493186} 12/28/2021 19:42:22 - INFO - codeparrot_training - Step 44490: {'lr': 1.4936390171767983e-05, 'samples': 22779392, 'steps': 44490, 'batch_loss/train': 0.7269970765337348} 12/28/2021 19:42:32 - INFO - codeparrot_training - Step 44491: {'lr': 1.4931023723781801e-05, 'samples': 22779904, 'steps': 44491, 'batch_loss/train': 0.7147567961364985} 12/28/2021 19:42:45 - INFO - codeparrot_training - Step 44492: {'lr': 1.4925658210341087e-05, 'samples': 22780416, 'steps': 44492, 'batch_loss/train': 0.6287444308400154} 12/28/2021 19:42:55 - INFO - codeparrot_training - Step 44493: {'lr': 1.4920293631467214e-05, 'samples': 22780928, 'steps': 44493, 'batch_loss/train': 0.8074891953729093} 12/28/2021 19:43:06 - INFO - codeparrot_training - Step 44494: {'lr': 1.491492998718147e-05, 'samples': 22781440, 'steps': 44494, 'batch_loss/train': 0.7872083494439721} 12/28/2021 19:43:17 - INFO - codeparrot_training - Step 44495: {'lr': 1.4909567277505226e-05, 'samples': 22781952, 'steps': 44495, 'batch_loss/train': 0.7252486329525709} 12/28/2021 19:43:30 - INFO - codeparrot_training - Step 44496: {'lr': 1.49042055024598e-05, 'samples': 22782464, 'steps': 44496, 'batch_loss/train': 0.7174284995417111} 12/28/2021 19:43:41 - INFO - codeparrot_training - Step 44497: {'lr': 
1.4898844662066424e-05, 'samples': 22782976, 'steps': 44497, 'batch_loss/train': 0.6688984003849328} 12/28/2021 19:43:51 - INFO - codeparrot_training - Step 44498: {'lr': 1.4893484756346471e-05, 'samples': 22783488, 'steps': 44498, 'batch_loss/train': 0.7584341485053301} 12/28/2021 19:44:03 - INFO - codeparrot_training - Step 44499: {'lr': 1.488812578532131e-05, 'samples': 22784000, 'steps': 44499, 'batch_loss/train': 0.7493745271349326} 12/28/2021 19:44:14 - INFO - codeparrot_training - Step 44500: {'lr': 1.4882767749012149e-05, 'samples': 22784512, 'steps': 44500, 'batch_loss/train': 0.7807084331288934} 12/28/2021 19:44:24 - INFO - codeparrot_training - Step 44501: {'lr': 1.4877410647440331e-05, 'samples': 22785024, 'steps': 44501, 'batch_loss/train': 0.7058289067354053} 12/28/2021 19:44:38 - INFO - codeparrot_training - Step 44502: {'lr': 1.4872054480627145e-05, 'samples': 22785536, 'steps': 44502, 'batch_loss/train': 0.6780203452799469} 12/28/2021 19:44:48 - INFO - codeparrot_training - Step 44503: {'lr': 1.4866699248593907e-05, 'samples': 22786048, 'steps': 44503, 'batch_loss/train': 0.7580363173037767} 12/28/2021 19:44:59 - INFO - codeparrot_training - Step 44504: {'lr': 1.486134495136185e-05, 'samples': 22786560, 'steps': 44504, 'batch_loss/train': 0.7588818897493184} 12/28/2021 19:45:10 - INFO - codeparrot_training - Step 44505: {'lr': 1.4855991588952344e-05, 'samples': 22787072, 'steps': 44505, 'batch_loss/train': 0.7222316483967006} 12/28/2021 19:45:22 - INFO - codeparrot_training - Step 44506: {'lr': 1.4850639161386598e-05, 'samples': 22787584, 'steps': 44506, 'batch_loss/train': 0.7358269635587931} 12/28/2021 19:45:32 - INFO - codeparrot_training - Step 44507: {'lr': 1.4845287668685924e-05, 'samples': 22788096, 'steps': 44507, 'batch_loss/train': 0.7553306819172576} 12/28/2021 19:45:43 - INFO - codeparrot_training - Step 44508: {'lr': 1.4839937110871615e-05, 'samples': 22788608, 'steps': 44508, 'batch_loss/train': 0.6801475349348038} 12/28/2021 19:45:55 
- INFO - codeparrot_training - Step 44509: {'lr': 1.4834587487964901e-05, 'samples': 22789120, 'steps': 44509, 'batch_loss/train': 0.6036784470779821} 12/28/2021 19:46:06 - INFO - codeparrot_training - Step 44510: {'lr': 1.4829238799987127e-05, 'samples': 22789632, 'steps': 44510, 'batch_loss/train': 0.7069858773611486} 12/28/2021 19:46:16 - INFO - codeparrot_training - Step 44511: {'lr': 1.4823891046959443e-05, 'samples': 22790144, 'steps': 44511, 'batch_loss/train': 0.7299148524180055} 12/28/2021 19:46:30 - INFO - codeparrot_training - Step 44512: {'lr': 1.4818544228903164e-05, 'samples': 22790656, 'steps': 44512, 'batch_loss/train': 0.6460038721561432} 12/28/2021 19:46:40 - INFO - codeparrot_training - Step 44513: {'lr': 1.4813198345839607e-05, 'samples': 22791168, 'steps': 44513, 'batch_loss/train': 0.5481710135936737} 12/28/2021 19:46:51 - INFO - codeparrot_training - Step 44514: {'lr': 1.4807853397789922e-05, 'samples': 22791680, 'steps': 44514, 'batch_loss/train': 0.7513115899637341} 12/28/2021 19:47:02 - INFO - codeparrot_training - Step 44515: {'lr': 1.4802509384775425e-05, 'samples': 22792192, 'steps': 44515, 'batch_loss/train': 0.6331357881426811} 12/28/2021 19:47:14 - INFO - codeparrot_training - Step 44516: {'lr': 1.479716630681735e-05, 'samples': 22792704, 'steps': 44516, 'batch_loss/train': 0.7425076486542821} 12/28/2021 19:47:25 - INFO - codeparrot_training - Step 44517: {'lr': 1.4791824163936902e-05, 'samples': 22793216, 'steps': 44517, 'batch_loss/train': 0.8086816882714629} 12/28/2021 19:47:35 - INFO - codeparrot_training - Step 44518: {'lr': 1.4786482956155367e-05, 'samples': 22793728, 'steps': 44518, 'batch_loss/train': 0.8453871898818761} 12/28/2021 19:47:47 - INFO - codeparrot_training - Step 44519: {'lr': 1.4781142683493982e-05, 'samples': 22794240, 'steps': 44519, 'batch_loss/train': 1.494638395961374} 12/28/2021 19:47:58 - INFO - codeparrot_training - Step 44520: {'lr': 1.4775803345973948e-05, 'samples': 22794752, 'steps': 44520, 
'batch_loss/train': 0.7811250928789377} 12/28/2021 19:48:09 - INFO - codeparrot_training - Step 44521: {'lr': 1.4770464943616474e-05, 'samples': 22795264, 'steps': 44521, 'batch_loss/train': 0.741895878687501} 12/28/2021 19:48:21 - INFO - codeparrot_training - Step 44522: {'lr': 1.4765127476442874e-05, 'samples': 22795776, 'steps': 44522, 'batch_loss/train': 0.6657720973016694} 12/28/2021 19:48:32 - INFO - codeparrot_training - Step 44523: {'lr': 1.4759790944474243e-05, 'samples': 22796288, 'steps': 44523, 'batch_loss/train': 0.665813154540956} 12/28/2021 19:48:43 - INFO - codeparrot_training - Step 44524: {'lr': 1.4754455347731871e-05, 'samples': 22796800, 'steps': 44524, 'batch_loss/train': 0.7404964622110128} 12/28/2021 19:48:56 - INFO - codeparrot_training - Step 44525: {'lr': 1.4749120686237016e-05, 'samples': 22797312, 'steps': 44525, 'batch_loss/train': 0.4371933205402456} 12/28/2021 19:49:07 - INFO - codeparrot_training - Step 44526: {'lr': 1.4743786960010774e-05, 'samples': 22797824, 'steps': 44526, 'batch_loss/train': 0.7007467793300748} 12/28/2021 19:49:17 - INFO - codeparrot_training - Step 44527: {'lr': 1.4738454169074433e-05, 'samples': 22798336, 'steps': 44527, 'batch_loss/train': 0.5542737258947454} 12/28/2021 19:49:28 - INFO - codeparrot_training - Step 44528: {'lr': 1.4733122313449143e-05, 'samples': 22798848, 'steps': 44528, 'batch_loss/train': 0.7886454910039902} 12/28/2021 19:49:40 - INFO - codeparrot_training - Step 44529: {'lr': 1.4727791393156136e-05, 'samples': 22799360, 'steps': 44529, 'batch_loss/train': 0.7976474976167083} 12/28/2021 19:49:51 - INFO - codeparrot_training - Step 44530: {'lr': 1.4722461408216592e-05, 'samples': 22799872, 'steps': 44530, 'batch_loss/train': 0.7676403895020485} 12/28/2021 19:50:01 - INFO - codeparrot_training - Step 44531: {'lr': 1.4717132358651686e-05, 'samples': 22800384, 'steps': 44531, 'batch_loss/train': 0.6877498154062778} 12/28/2021 19:50:15 - INFO - codeparrot_training - Step 44532: {'lr': 
1.4711804244482651e-05, 'samples': 22800896, 'steps': 44532, 'batch_loss/train': 0.9467433326644823} 12/28/2021 19:50:26 - INFO - codeparrot_training - Step 44533: {'lr': 1.4706477065730639e-05, 'samples': 22801408, 'steps': 44533, 'batch_loss/train': 0.752078311983496} 12/28/2021 19:50:36 - INFO - codeparrot_training - Step 44534: {'lr': 1.4701150822416825e-05, 'samples': 22801920, 'steps': 44534, 'batch_loss/train': 0.9534788858145475} 12/28/2021 19:50:47 - INFO - codeparrot_training - Step 44535: {'lr': 1.469582551456239e-05, 'samples': 22802432, 'steps': 44535, 'batch_loss/train': 0.7366510955616832} 12/28/2021 19:50:59 - INFO - codeparrot_training - Step 44536: {'lr': 1.4690501142188533e-05, 'samples': 22802944, 'steps': 44536, 'batch_loss/train': 0.6911252613645047} 12/28/2021 19:51:10 - INFO - codeparrot_training - Step 44537: {'lr': 1.4685177705316354e-05, 'samples': 22803456, 'steps': 44537, 'batch_loss/train': 0.720787369646132} 12/28/2021 19:51:20 - INFO - codeparrot_training - Step 44538: {'lr': 1.467985520396703e-05, 'samples': 22803968, 'steps': 44538, 'batch_loss/train': 0.7609802111983299} 12/28/2021 19:51:32 - INFO - codeparrot_training - Step 44539: {'lr': 1.4674533638161819e-05, 'samples': 22804480, 'steps': 44539, 'batch_loss/train': 0.8652660856023431} 12/28/2021 19:51:43 - INFO - codeparrot_training - Step 44540: {'lr': 1.4669213007921788e-05, 'samples': 22804992, 'steps': 44540, 'batch_loss/train': 0.7479615425691009} 12/28/2021 19:51:54 - INFO - codeparrot_training - Step 44541: {'lr': 1.4663893313268034e-05, 'samples': 22805504, 'steps': 44541, 'batch_loss/train': 0.9545808425173163} 12/28/2021 19:52:07 - INFO - codeparrot_training - Step 44542: {'lr': 1.4658574554221898e-05, 'samples': 22806016, 'steps': 44542, 'batch_loss/train': 0.720602935180068} 12/28/2021 19:52:17 - INFO - codeparrot_training - Step 44543: {'lr': 1.4653256730804337e-05, 'samples': 22806528, 'steps': 44543, 'batch_loss/train': 0.702931213658303} 12/28/2021 19:52:28 - 
INFO - codeparrot_training - Step 44544: {'lr': 1.4647939843036584e-05, 'samples': 22807040, 'steps': 44544, 'batch_loss/train': 0.7926341239362955} 12/28/2021 19:52:40 - INFO - codeparrot_training - Step 44545: {'lr': 1.464262389093976e-05, 'samples': 22807552, 'steps': 44545, 'batch_loss/train': 0.727089126361534} 12/28/2021 19:52:51 - INFO - codeparrot_training - Step 44546: {'lr': 1.463730887453496e-05, 'samples': 22808064, 'steps': 44546, 'batch_loss/train': 0.8253916576504707} 12/28/2021 19:53:01 - INFO - codeparrot_training - Step 44547: {'lr': 1.463199479384339e-05, 'samples': 22808576, 'steps': 44547, 'batch_loss/train': 0.7262849387479946} 12/28/2021 19:53:12 - INFO - codeparrot_training - Step 44548: {'lr': 1.462668164888617e-05, 'samples': 22809088, 'steps': 44548, 'batch_loss/train': 0.7712957737967372} 12/28/2021 19:53:24 - INFO - codeparrot_training - Step 44549: {'lr': 1.4621369439684285e-05, 'samples': 22809600, 'steps': 44549, 'batch_loss/train': 0.6102518690750003} 12/28/2021 19:53:35 - INFO - codeparrot_training - Step 44550: {'lr': 1.4616058166259022e-05, 'samples': 22810112, 'steps': 44550, 'batch_loss/train': 0.7403649194166064} 12/28/2021 19:53:45 - INFO - codeparrot_training - Step 44551: {'lr': 1.4610747828631476e-05, 'samples': 22810624, 'steps': 44551, 'batch_loss/train': 0.6211210582405329} 12/28/2021 19:53:57 - INFO - codeparrot_training - Step 44552: {'lr': 1.460543842682266e-05, 'samples': 22811136, 'steps': 44552, 'batch_loss/train': 0.7118547293357551} 12/28/2021 19:54:08 - INFO - codeparrot_training - Step 44553: {'lr': 1.460012996085372e-05, 'samples': 22811648, 'steps': 44553, 'batch_loss/train': 0.7858149260282516} 12/28/2021 19:54:19 - INFO - codeparrot_training - Step 44554: {'lr': 1.4594822430745835e-05, 'samples': 22812160, 'steps': 44554, 'batch_loss/train': 0.704813888296485} 12/28/2021 19:54:32 - INFO - codeparrot_training - Step 44555: {'lr': 1.4589515836520046e-05, 'samples': 22812672, 'steps': 44555, 
'batch_loss/train': 0.7314496566541493} 12/28/2021 19:54:43 - INFO - codeparrot_training - Step 44556: {'lr': 1.4584210178197415e-05, 'samples': 22813184, 'steps': 44556, 'batch_loss/train': 0.7316136742010713} 12/28/2021 19:54:53 - INFO - codeparrot_training - Step 44557: {'lr': 1.4578905455799097e-05, 'samples': 22813696, 'steps': 44557, 'batch_loss/train': 1.0141364689916372} 12/28/2021 19:55:04 - INFO - codeparrot_training - Step 44558: {'lr': 1.4573601669346153e-05, 'samples': 22814208, 'steps': 44558, 'batch_loss/train': 0.7020819089375436} 12/28/2021 19:55:16 - INFO - codeparrot_training - Step 44559: {'lr': 1.4568298818859681e-05, 'samples': 22814720, 'steps': 44559, 'batch_loss/train': 0.7151258722878993} 12/28/2021 19:55:27 - INFO - codeparrot_training - Step 44560: {'lr': 1.4562996904360776e-05, 'samples': 22815232, 'steps': 44560, 'batch_loss/train': 0.6980341502930969} 12/28/2021 19:55:37 - INFO - codeparrot_training - Step 44561: {'lr': 1.4557695925870474e-05, 'samples': 22815744, 'steps': 44561, 'batch_loss/train': 0.7718272535130382} 12/28/2021 19:55:51 - INFO - codeparrot_training - Step 44562: {'lr': 1.455239588340987e-05, 'samples': 22816256, 'steps': 44562, 'batch_loss/train': 0.6602363232523203} 12/28/2021 19:56:01 - INFO - codeparrot_training - Step 44563: {'lr': 1.4547096777000084e-05, 'samples': 22816768, 'steps': 44563, 'batch_loss/train': 0.4502592319622636} 12/28/2021 19:56:12 - INFO - codeparrot_training - Step 44564: {'lr': 1.4541798606662049e-05, 'samples': 22817280, 'steps': 44564, 'batch_loss/train': 0.6566577809280716} 12/28/2021 19:56:24 - INFO - codeparrot_training - Step 44565: {'lr': 1.4536501372417022e-05, 'samples': 22817792, 'steps': 44565, 'batch_loss/train': 0.7207918101921678} 12/28/2021 19:56:35 - INFO - codeparrot_training - Step 44566: {'lr': 1.4531205074285875e-05, 'samples': 22818304, 'steps': 44566, 'batch_loss/train': 0.7177837376948446} 12/28/2021 19:56:45 - INFO - codeparrot_training - Step 44567: {'lr': 
1.4525909712289759e-05, 'samples': 22818816, 'steps': 44567, 'batch_loss/train': 0.7238929360173643} 12/28/2021 19:56:56 - INFO - codeparrot_training - Step 44568: {'lr': 1.4520615286449767e-05, 'samples': 22819328, 'steps': 44568, 'batch_loss/train': 0.7775223220814951} 12/28/2021 19:57:08 - INFO - codeparrot_training - Step 44569: {'lr': 1.4515321796786857e-05, 'samples': 22819840, 'steps': 44569, 'batch_loss/train': 0.6363610322587192} 12/28/2021 19:57:19 - INFO - codeparrot_training - Step 44570: {'lr': 1.4510029243322092e-05, 'samples': 22820352, 'steps': 44570, 'batch_loss/train': 0.7656714431941509} 12/28/2021 19:57:30 - INFO - codeparrot_training - Step 44571: {'lr': 1.450473762607657e-05, 'samples': 22820864, 'steps': 44571, 'batch_loss/train': 0.8298880839720368} 12/28/2021 19:57:43 - INFO - codeparrot_training - Step 44572: {'lr': 1.4499446945071271e-05, 'samples': 22821376, 'steps': 44572, 'batch_loss/train': 0.7587624853476882} 12/28/2021 19:57:54 - INFO - codeparrot_training - Step 44573: {'lr': 1.4494157200327263e-05, 'samples': 22821888, 'steps': 44573, 'batch_loss/train': 0.7634810591116548} 12/28/2021 19:58:04 - INFO - codeparrot_training - Step 44574: {'lr': 1.4488868391865583e-05, 'samples': 22822400, 'steps': 44574, 'batch_loss/train': 0.6599106717621908} 12/28/2021 19:58:16 - INFO - codeparrot_training - Step 44575: {'lr': 1.4483580519707163e-05, 'samples': 22822912, 'steps': 44575, 'batch_loss/train': 0.7220694692805409} 12/28/2021 19:58:27 - INFO - codeparrot_training - Step 44576: {'lr': 1.4478293583873147e-05, 'samples': 22823424, 'steps': 44576, 'batch_loss/train': 0.828002137131989} 12/28/2021 19:58:38 - INFO - codeparrot_training - Step 44577: {'lr': 1.4473007584384552e-05, 'samples': 22823936, 'steps': 44577, 'batch_loss/train': 0.716884090565145} 12/28/2021 19:58:50 - INFO - codeparrot_training - Step 44578: {'lr': 1.4467722521262245e-05, 'samples': 22824448, 'steps': 44578, 'batch_loss/train': 0.7579092527739704} 12/28/2021 19:59:00 
- INFO - codeparrot_training - Step 44579: {'lr': 1.4462438394527406e-05, 'samples': 22824960, 'steps': 44579, 'batch_loss/train': 0.8109311331063509} 12/28/2021 19:59:11 - INFO - codeparrot_training - Step 44580: {'lr': 1.445715520420099e-05, 'samples': 22825472, 'steps': 44580, 'batch_loss/train': 0.6767658167518675} 12/28/2021 19:59:22 - INFO - codeparrot_training - Step 44581: {'lr': 1.4451872950303951e-05, 'samples': 22825984, 'steps': 44581, 'batch_loss/train': 0.8298316905274987} 12/28/2021 19:59:35 - INFO - codeparrot_training - Step 44582: {'lr': 1.4446591632857331e-05, 'samples': 22826496, 'steps': 44582, 'batch_loss/train': 0.7468736725859344} 12/28/2021 19:59:46 - INFO - codeparrot_training - Step 44583: {'lr': 1.4441311251882139e-05, 'samples': 22827008, 'steps': 44583, 'batch_loss/train': 0.6665414017625153} 12/28/2021 19:59:56 - INFO - codeparrot_training - Step 44584: {'lr': 1.443603180739933e-05, 'samples': 22827520, 'steps': 44584, 'batch_loss/train': 0.6606636343058199} 12/28/2021 20:00:08 - INFO - codeparrot_training - Step 44585: {'lr': 1.4430753299429917e-05, 'samples': 22828032, 'steps': 44585, 'batch_loss/train': 0.7418468296527863} 12/28/2021 20:00:19 - INFO - codeparrot_training - Step 44586: {'lr': 1.442547572799488e-05, 'samples': 22828544, 'steps': 44586, 'batch_loss/train': 0.7449423023499548} 12/28/2021 20:00:30 - INFO - codeparrot_training - Step 44587: {'lr': 1.4420199093115205e-05, 'samples': 22829056, 'steps': 44587, 'batch_loss/train': 0.7426582234911621} 12/28/2021 20:00:42 - INFO - codeparrot_training - Step 44588: {'lr': 1.4414923394811875e-05, 'samples': 22829568, 'steps': 44588, 'batch_loss/train': 0.8382472153753042} 12/28/2021 20:00:53 - INFO - codeparrot_training - Step 44589: {'lr': 1.44096486331059e-05, 'samples': 22830080, 'steps': 44589, 'batch_loss/train': 0.7978941276669502} 12/28/2021 20:01:03 - INFO - codeparrot_training - Step 44590: {'lr': 1.4404374808018094e-05, 'samples': 22830592, 'steps': 44590, 
'batch_loss/train': 0.6696567870676517} 12/28/2021 20:01:14 - INFO - codeparrot_training - Step 44591: {'lr': 1.439910191956964e-05, 'samples': 22831104, 'steps': 44591, 'batch_loss/train': 0.8640657365322113} 12/28/2021 20:01:28 - INFO - codeparrot_training - Step 44592: {'lr': 1.4393829967781352e-05, 'samples': 22831616, 'steps': 44592, 'batch_loss/train': 0.725426432210952} 12/28/2021 20:01:39 - INFO - codeparrot_training - Step 44593: {'lr': 1.4388558952674214e-05, 'samples': 22832128, 'steps': 44593, 'batch_loss/train': 0.9135311922291294} 12/28/2021 20:01:50 - INFO - codeparrot_training - Step 44594: {'lr': 1.4383288874269262e-05, 'samples': 22832640, 'steps': 44594, 'batch_loss/train': 0.7957166079431772} 12/28/2021 20:02:02 - INFO - codeparrot_training - Step 44595: {'lr': 1.4378019732587345e-05, 'samples': 22833152, 'steps': 44595, 'batch_loss/train': 0.6880183280445635} 12/28/2021 20:02:12 - INFO - codeparrot_training - Step 44596: {'lr': 1.437275152764947e-05, 'samples': 22833664, 'steps': 44596, 'batch_loss/train': 0.7241202036384493} 12/28/2021 20:02:23 - INFO - codeparrot_training - Step 44597: {'lr': 1.4367484259476538e-05, 'samples': 22834176, 'steps': 44597, 'batch_loss/train': 0.9589898651465774} 12/28/2021 20:02:34 - INFO - codeparrot_training - Step 44598: {'lr': 1.4362217928089561e-05, 'samples': 22834688, 'steps': 44598, 'batch_loss/train': 0.7773782266303897} 12/28/2021 20:02:46 - INFO - codeparrot_training - Step 44599: {'lr': 1.4356952533509382e-05, 'samples': 22835200, 'steps': 44599, 'batch_loss/train': 0.7256087698042393} 12/28/2021 20:02:56 - INFO - codeparrot_training - Step 44600: {'lr': 1.4351688075757069e-05, 'samples': 22835712, 'steps': 44600, 'batch_loss/train': 0.716180331306532} 12/28/2021 20:03:07 - INFO - codeparrot_training - Step 44601: {'lr': 1.4346424554853355e-05, 'samples': 22836224, 'steps': 44601, 'batch_loss/train': 0.7042367931571789} 12/28/2021 20:03:21 - INFO - codeparrot_training - Step 44602: {'lr': 
1.4341161970819332e-05, 'samples': 22836736, 'steps': 44602, 'batch_loss/train': 0.7448403094895184} 12/28/2021 20:03:31 - INFO - codeparrot_training - Step 44603: {'lr': 1.4335900323675905e-05, 'samples': 22837248, 'steps': 44603, 'batch_loss/train': 0.6802072103600949} 12/28/2021 20:03:42 - INFO - codeparrot_training - Step 44604: {'lr': 1.4330639613443886e-05, 'samples': 22837760, 'steps': 44604, 'batch_loss/train': 0.6434643531683832} 12/28/2021 20:03:54 - INFO - codeparrot_training - Step 44605: {'lr': 1.4325379840144287e-05, 'samples': 22838272, 'steps': 44605, 'batch_loss/train': 0.8648756127804518} 12/28/2021 20:04:05 - INFO - codeparrot_training - Step 44606: {'lr': 1.4320121003798009e-05, 'samples': 22838784, 'steps': 44606, 'batch_loss/train': 0.6713795067043975} 12/28/2021 20:04:16 - INFO - codeparrot_training - Step 44607: {'lr': 1.4314863104425923e-05, 'samples': 22839296, 'steps': 44607, 'batch_loss/train': 0.7638258840888739} 12/28/2021 20:04:26 - INFO - codeparrot_training - Step 44608: {'lr': 1.430960614204896e-05, 'samples': 22839808, 'steps': 44608, 'batch_loss/train': 0.7283994844183326} 12/28/2021 20:04:38 - INFO - codeparrot_training - Step 44609: {'lr': 1.4304350116687987e-05, 'samples': 22840320, 'steps': 44609, 'batch_loss/train': 0.7345205172896385} 12/28/2021 20:04:49 - INFO - codeparrot_training - Step 44610: {'lr': 1.4299095028363935e-05, 'samples': 22840832, 'steps': 44610, 'batch_loss/train': 0.6838839170522988} 12/28/2021 20:05:00 - INFO - codeparrot_training - Step 44611: {'lr': 1.4293840877097702e-05, 'samples': 22841344, 'steps': 44611, 'batch_loss/train': 0.7615022836253047} 12/28/2021 20:05:13 - INFO - codeparrot_training - Step 44612: {'lr': 1.4288587662910135e-05, 'samples': 22841856, 'steps': 44612, 'batch_loss/train': 0.7719960613176227} 12/28/2021 20:05:24 - INFO - codeparrot_training - Step 44613: {'lr': 1.428333538582216e-05, 'samples': 22842368, 'steps': 44613, 'batch_loss/train': 0.6541840075515211} 12/28/2021 20:05:34 
- INFO - codeparrot_training - Step 44614: {'lr': 1.4278084045854623e-05, 'samples': 22842880, 'steps': 44614, 'batch_loss/train': 0.7496933112852275} 12/28/2021 20:05:46 - INFO - codeparrot_training - Step 44615: {'lr': 1.4272833643028421e-05, 'samples': 22843392, 'steps': 44615, 'batch_loss/train': 0.7294842427363619} 12/28/2021 20:05:57 - INFO - codeparrot_training - Step 44616: {'lr': 1.4267584177364428e-05, 'samples': 22843904, 'steps': 44616, 'batch_loss/train': 0.7230412242934108} 12/28/2021 20:06:08 - INFO - codeparrot_training - Step 44617: {'lr': 1.4262335648883545e-05, 'samples': 22844416, 'steps': 44617, 'batch_loss/train': 0.6745692519471049} 12/28/2021 20:06:18 - INFO - codeparrot_training - Step 44618: {'lr': 1.425708805760656e-05, 'samples': 22844928, 'steps': 44618, 'batch_loss/train': 0.8079744146671146} 12/28/2021 20:06:31 - INFO - codeparrot_training - Step 44619: {'lr': 1.4251841403554372e-05, 'samples': 22845440, 'steps': 44619, 'batch_loss/train': 0.7795685436576605} 12/28/2021 20:06:42 - INFO - codeparrot_training - Step 44620: {'lr': 1.4246595686747883e-05, 'samples': 22845952, 'steps': 44620, 'batch_loss/train': 0.7855383764253929} 12/28/2021 20:06:53 - INFO - codeparrot_training - Step 44621: {'lr': 1.4241350907207906e-05, 'samples': 22846464, 'steps': 44621, 'batch_loss/train': 0.737085344735533} 12/28/2021 20:07:05 - INFO - codeparrot_training - Step 44622: {'lr': 1.4236107064955289e-05, 'samples': 22846976, 'steps': 44622, 'batch_loss/train': 0.7090542023070157} 12/28/2021 20:07:15 - INFO - codeparrot_training - Step 44623: {'lr': 1.4230864160010876e-05, 'samples': 22847488, 'steps': 44623, 'batch_loss/train': 0.7908072602003813} 12/28/2021 20:07:26 - INFO - codeparrot_training - Step 44624: {'lr': 1.4225622192395538e-05, 'samples': 22848000, 'steps': 44624, 'batch_loss/train': 0.7977840937674046} 12/28/2021 20:07:38 - INFO - codeparrot_training - Step 44625: {'lr': 1.4220381162130119e-05, 'samples': 22848512, 'steps': 44625, 
'batch_loss/train': 0.6947796046733856} 12/28/2021 20:07:49 - INFO - codeparrot_training - Step 44626: {'lr': 1.4215141069235437e-05, 'samples': 22849024, 'steps': 44626, 'batch_loss/train': 0.7783204400911927} 12/28/2021 20:07:59 - INFO - codeparrot_training - Step 44627: {'lr': 1.4209901913732281e-05, 'samples': 22849536, 'steps': 44627, 'batch_loss/train': 0.8233035178855062} 12/28/2021 20:08:10 - INFO - codeparrot_training - Step 44628: {'lr': 1.4204663695641551e-05, 'samples': 22850048, 'steps': 44628, 'batch_loss/train': 0.6981205749325454} 12/28/2021 20:08:22 - INFO - codeparrot_training - Step 44629: {'lr': 1.419942641498409e-05, 'samples': 22850560, 'steps': 44629, 'batch_loss/train': 0.6735198878450319} 12/28/2021 20:08:33 - INFO - codeparrot_training - Step 44630: {'lr': 1.4194190071780577e-05, 'samples': 22851072, 'steps': 44630, 'batch_loss/train': 0.7397234360687435} 12/28/2021 20:08:44 - INFO - codeparrot_training - Step 44631: {'lr': 1.4188954666051995e-05, 'samples': 22851584, 'steps': 44631, 'batch_loss/train': 0.5608922424726188} 12/28/2021 20:08:57 - INFO - codeparrot_training - Step 44632: {'lr': 1.4183720197819106e-05, 'samples': 22852096, 'steps': 44632, 'batch_loss/train': 0.7543417075648904} 12/28/2021 20:09:08 - INFO - codeparrot_training - Step 44633: {'lr': 1.417848666710267e-05, 'samples': 22852608, 'steps': 44633, 'batch_loss/train': 0.8138433094136417} 12/28/2021 20:09:18 - INFO - codeparrot_training - Step 44634: {'lr': 1.4173254073923502e-05, 'samples': 22853120, 'steps': 44634, 'batch_loss/train': 0.7560738874599338} 12/28/2021 20:09:30 - INFO - codeparrot_training - Step 44635: {'lr': 1.4168022418302478e-05, 'samples': 22853632, 'steps': 44635, 'batch_loss/train': 0.7821311922743917} 12/28/2021 20:09:41 - INFO - codeparrot_training - Step 44636: {'lr': 1.4162791700260359e-05, 'samples': 22854144, 'steps': 44636, 'batch_loss/train': 0.6991421412676573} 12/28/2021 20:09:52 - INFO - codeparrot_training - Step 44637: {'lr': 
1.4157561919817902e-05, 'samples': 22854656, 'steps': 44637, 'batch_loss/train': 0.7604072648100555} 12/28/2021 20:10:02 - INFO - codeparrot_training - Step 44638: {'lr': 1.4152333076995927e-05, 'samples': 22855168, 'steps': 44638, 'batch_loss/train': 0.6793890167027712} 12/28/2021 20:10:14 - INFO - codeparrot_training - Step 44639: {'lr': 1.4147105171815222e-05, 'samples': 22855680, 'steps': 44639, 'batch_loss/train': 0.7479772865772247} 12/28/2021 20:10:25 - INFO - codeparrot_training - Step 44640: {'lr': 1.4141878204296577e-05, 'samples': 22856192, 'steps': 44640, 'batch_loss/train': 0.7156023448333144} 12/28/2021 20:10:36 - INFO - codeparrot_training - Step 44641: {'lr': 1.413665217446078e-05, 'samples': 22856704, 'steps': 44641, 'batch_loss/train': 0.727549722418189} 12/28/2021 20:10:49 - INFO - codeparrot_training - Step 44642: {'lr': 1.4131427082328562e-05, 'samples': 22857216, 'steps': 44642, 'batch_loss/train': 0.8006944889202714} 12/28/2021 20:10:59 - INFO - codeparrot_training - Step 44643: {'lr': 1.4126202927920773e-05, 'samples': 22857728, 'steps': 44643, 'batch_loss/train': 0.6811465863138437} 12/28/2021 20:11:10 - INFO - codeparrot_training - Step 44644: {'lr': 1.4120979711258142e-05, 'samples': 22858240, 'steps': 44644, 'batch_loss/train': 0.7160113987047225} 12/28/2021 20:11:23 - INFO - codeparrot_training - Step 44645: {'lr': 1.4115757432361375e-05, 'samples': 22858752, 'steps': 44645, 'batch_loss/train': 0.7337289098650217} 12/28/2021 20:11:33 - INFO - codeparrot_training - Step 44646: {'lr': 1.4110536091251347e-05, 'samples': 22859264, 'steps': 44646, 'batch_loss/train': 0.840433114208281} 12/28/2021 20:11:44 - INFO - codeparrot_training - Step 44647: {'lr': 1.4105315687948733e-05, 'samples': 22859776, 'steps': 44647, 'batch_loss/train': 0.757107425481081} 12/28/2021 20:11:55 - INFO - codeparrot_training - Step 44648: {'lr': 1.4100096222474323e-05, 'samples': 22860288, 'steps': 44648, 'batch_loss/train': 0.9849321125075221} 12/28/2021 20:12:08 - 
INFO - codeparrot_training - Step 44649: {'lr': 1.4094877694848852e-05, 'samples': 22860800, 'steps': 44649, 'batch_loss/train': 0.6806617630645633} 12/28/2021 20:12:19 - INFO - codeparrot_training - Step 44650: {'lr': 1.4089660105093077e-05, 'samples': 22861312, 'steps': 44650, 'batch_loss/train': 0.7606039543170482} 12/28/2021 20:12:29 - INFO - codeparrot_training - Step 44651: {'lr': 1.4084443453227736e-05, 'samples': 22861824, 'steps': 44651, 'batch_loss/train': 0.8455743631348014} 12/28/2021 20:12:41 - INFO - codeparrot_training - Step 44652: {'lr': 1.407922773927356e-05, 'samples': 22862336, 'steps': 44652, 'batch_loss/train': 0.8080253070220351} 12/28/2021 20:12:52 - INFO - codeparrot_training - Step 44653: {'lr': 1.407401296325128e-05, 'samples': 22862848, 'steps': 44653, 'batch_loss/train': 0.7250653272494674} 12/28/2021 20:13:03 - INFO - codeparrot_training - Step 44654: {'lr': 1.4068799125181663e-05, 'samples': 22863360, 'steps': 44654, 'batch_loss/train': 0.7538965339772403} 12/28/2021 20:13:15 - INFO - codeparrot_training - Step 44655: {'lr': 1.4063586225085467e-05, 'samples': 22863872, 'steps': 44655, 'batch_loss/train': 0.6913668471388519} 12/28/2021 20:13:26 - INFO - codeparrot_training - Step 44656: {'lr': 1.4058374262983259e-05, 'samples': 22864384, 'steps': 44656, 'batch_loss/train': 0.7989012124016881} 12/28/2021 20:13:36 - INFO - codeparrot_training - Step 44657: {'lr': 1.405316323889594e-05, 'samples': 22864896, 'steps': 44657, 'batch_loss/train': 1.052779086632654} 12/28/2021 20:13:47 - INFO - codeparrot_training - Step 44658: {'lr': 1.4047953152844156e-05, 'samples': 22865408, 'steps': 44658, 'batch_loss/train': 0.7212256686761975} 12/28/2021 20:14:00 - INFO - codeparrot_training - Step 44659: {'lr': 1.404274400484859e-05, 'samples': 22865920, 'steps': 44659, 'batch_loss/train': 0.7143307207152247} 12/28/2021 20:14:11 - INFO - codeparrot_training - Step 44660: {'lr': 1.4037535794929973e-05, 'samples': 22866432, 'steps': 44660, 
'batch_loss/train': 0.7114345761947334} 12/28/2021 20:14:22 - INFO - codeparrot_training - Step 44661: {'lr': 1.4032328523109067e-05, 'samples': 22866944, 'steps': 44661, 'batch_loss/train': 0.7301865347544663} 12/28/2021 20:14:34 - INFO - codeparrot_training - Step 44662: {'lr': 1.4027122189406466e-05, 'samples': 22867456, 'steps': 44662, 'batch_loss/train': 0.7078115004114807} 12/28/2021 20:14:44 - INFO - codeparrot_training - Step 44663: {'lr': 1.4021916793842958e-05, 'samples': 22867968, 'steps': 44663, 'batch_loss/train': 0.7508108862675726} 12/28/2021 20:14:55 - INFO - codeparrot_training - Step 44664: {'lr': 1.4016712336439197e-05, 'samples': 22868480, 'steps': 44664, 'batch_loss/train': 1.4650592114776373} 12/28/2021 20:15:07 - INFO - codeparrot_training - Step 44665: {'lr': 1.4011508817215885e-05, 'samples': 22868992, 'steps': 44665, 'batch_loss/train': 0.7075353180989623} 12/28/2021 20:15:18 - INFO - codeparrot_training - Step 44666: {'lr': 1.4006306236193728e-05, 'samples': 22869504, 'steps': 44666, 'batch_loss/train': 0.5938265890581533} 12/28/2021 20:15:28 - INFO - codeparrot_training - Step 44667: {'lr': 1.4001104593393376e-05, 'samples': 22870016, 'steps': 44667, 'batch_loss/train': 0.6784295805264264} 12/28/2021 20:15:39 - INFO - codeparrot_training - Step 44668: {'lr': 1.399590388883551e-05, 'samples': 22870528, 'steps': 44668, 'batch_loss/train': 0.7979269614443183} 12/28/2021 20:15:53 - INFO - codeparrot_training - Step 44669: {'lr': 1.3990704122540831e-05, 'samples': 22871040, 'steps': 44669, 'batch_loss/train': 0.6353920153342187} 12/28/2021 20:16:04 - INFO - codeparrot_training - Step 44670: {'lr': 1.398550529453002e-05, 'samples': 22871552, 'steps': 44670, 'batch_loss/train': 0.6903884320054203} 12/28/2021 20:16:14 - INFO - codeparrot_training - Step 44671: {'lr': 1.3980307404823673e-05, 'samples': 22872064, 'steps': 44671, 'batch_loss/train': 0.7025751858018339} 12/28/2021 20:16:26 - INFO - codeparrot_training - Step 44672: {'lr': 
1.3975110453442546e-05, 'samples': 22872576, 'steps': 44672, 'batch_loss/train': 0.7788038358557969} 12/28/2021 20:16:37 - INFO - codeparrot_training - Step 44673: {'lr': 1.3969914440407266e-05, 'samples': 22873088, 'steps': 44673, 'batch_loss/train': 0.775616752798669} 12/28/2021 20:16:48 - INFO - codeparrot_training - Step 44674: {'lr': 1.3964719365738454e-05, 'samples': 22873600, 'steps': 44674, 'batch_loss/train': 0.8007147093303502} 12/28/2021 20:17:00 - INFO - codeparrot_training - Step 44675: {'lr': 1.3959525229456815e-05, 'samples': 22874112, 'steps': 44675, 'batch_loss/train': 0.8962728190235794} 12/28/2021 20:17:10 - INFO - codeparrot_training - Step 44676: {'lr': 1.3954332031582972e-05, 'samples': 22874624, 'steps': 44676, 'batch_loss/train': 0.7324632462114096} 12/28/2021 20:17:21 - INFO - codeparrot_training - Step 44677: {'lr': 1.3949139772137549e-05, 'samples': 22875136, 'steps': 44677, 'batch_loss/train': 0.857424674089998} 12/28/2021 20:17:35 - INFO - codeparrot_training - Step 44678: {'lr': 1.3943948451141247e-05, 'samples': 22875648, 'steps': 44678, 'batch_loss/train': 0.6944731832481921} 12/28/2021 20:17:45 - INFO - codeparrot_training - Step 44679: {'lr': 1.3938758068614666e-05, 'samples': 22876160, 'steps': 44679, 'batch_loss/train': 0.726835222914815} 12/28/2021 20:17:56 - INFO - codeparrot_training - Step 44680: {'lr': 1.3933568624578452e-05, 'samples': 22876672, 'steps': 44680, 'batch_loss/train': 0.6782371043227613} 12/28/2021 20:18:06 - INFO - codeparrot_training - Step 44681: {'lr': 1.3928380119053285e-05, 'samples': 22877184, 'steps': 44681, 'batch_loss/train': 0.74814941175282} 12/28/2021 20:18:19 - INFO - codeparrot_training - Step 44682: {'lr': 1.392319255205965e-05, 'samples': 22877696, 'steps': 44682, 'batch_loss/train': 0.9102813815698028} 12/28/2021 20:18:29 - INFO - codeparrot_training - Step 44683: {'lr': 1.3918005923618304e-05, 'samples': 22878208, 'steps': 44683, 'batch_loss/train': 0.7402809616178274} 12/28/2021 20:18:40 - 
INFO - codeparrot_training - Step 44684: {'lr': 1.3912820233749873e-05, 'samples': 22878720, 'steps': 44684, 'batch_loss/train': 0.6976939258165658} 12/28/2021 20:18:52 - INFO - codeparrot_training - Step 44685: {'lr': 1.3907635482474867e-05, 'samples': 22879232, 'steps': 44685, 'batch_loss/train': 0.7242323467507958} 12/28/2021 20:19:03 - INFO - codeparrot_training - Step 44686: {'lr': 1.3902451669813936e-05, 'samples': 22879744, 'steps': 44686, 'batch_loss/train': 0.7984049608930945} 12/28/2021 20:19:13 - INFO - codeparrot_training - Step 44687: {'lr': 1.389726879578776e-05, 'samples': 22880256, 'steps': 44687, 'batch_loss/train': 0.6507836659438908} 12/28/2021 20:19:26 - INFO - codeparrot_training - Step 44688: {'lr': 1.3892086860416903e-05, 'samples': 22880768, 'steps': 44688, 'batch_loss/train': 0.7369827209040523} 12/28/2021 20:19:37 - INFO - codeparrot_training - Step 44689: {'lr': 1.3886905863721878e-05, 'samples': 22881280, 'steps': 44689, 'batch_loss/train': 0.6770531716756523} 12/28/2021 20:19:48 - INFO - codeparrot_training - Step 44690: {'lr': 1.3881725805723445e-05, 'samples': 22881792, 'steps': 44690, 'batch_loss/train': 0.7856999887153506} 12/28/2021 20:19:58 - INFO - codeparrot_training - Step 44691: {'lr': 1.3876546686442088e-05, 'samples': 22882304, 'steps': 44691, 'batch_loss/train': 0.7114997548051178} 12/28/2021 20:20:11 - INFO - codeparrot_training - Step 44692: {'lr': 1.387136850589843e-05, 'samples': 22882816, 'steps': 44692, 'batch_loss/train': 0.6934454112779349} 12/28/2021 20:20:21 - INFO - codeparrot_training - Step 44693: {'lr': 1.3866191264113037e-05, 'samples': 22883328, 'steps': 44693, 'batch_loss/train': 0.7254495467059314} 12/28/2021 20:20:32 - INFO - codeparrot_training - Step 44694: {'lr': 1.3861014961106533e-05, 'samples': 22883840, 'steps': 44694, 'batch_loss/train': 0.7102953158318996} 12/28/2021 20:20:45 - INFO - codeparrot_training - Step 44695: {'lr': 1.3855839596899427e-05, 'samples': 22884352, 'steps': 44695, 
'batch_loss/train': 0.7861123755574226} 12/28/2021 20:20:56 - INFO - codeparrot_training - Step 44696: {'lr': 1.3850665171512427e-05, 'samples': 22884864, 'steps': 44696, 'batch_loss/train': 0.698181320913136} 12/28/2021 20:21:07 - INFO - codeparrot_training - Step 44697: {'lr': 1.3845491684965905e-05, 'samples': 22885376, 'steps': 44697, 'batch_loss/train': 0.6917777610942721} 12/28/2021 20:21:19 - INFO - codeparrot_training - Step 44698: {'lr': 1.3840319137280621e-05, 'samples': 22885888, 'steps': 44698, 'batch_loss/train': 0.6771051813848317} 12/28/2021 20:21:29 - INFO - codeparrot_training - Step 44699: {'lr': 1.3835147528477032e-05, 'samples': 22886400, 'steps': 44699, 'batch_loss/train': 0.8289607437327504} 12/28/2021 20:21:40 - INFO - codeparrot_training - Step 44700: {'lr': 1.3829976858575705e-05, 'samples': 22886912, 'steps': 44700, 'batch_loss/train': 0.6526250476017594} 12/28/2021 20:21:51 - INFO - codeparrot_training - Step 44701: {'lr': 1.3824807127597234e-05, 'samples': 22887424, 'steps': 44701, 'batch_loss/train': 0.7834526062943041} 12/28/2021 20:22:03 - INFO - codeparrot_training - Step 44702: {'lr': 1.3819638335562129e-05, 'samples': 22887936, 'steps': 44702, 'batch_loss/train': 0.714407961233519} 12/28/2021 20:22:14 - INFO - codeparrot_training - Step 44703: {'lr': 1.3814470482490988e-05, 'samples': 22888448, 'steps': 44703, 'batch_loss/train': 0.7890814635902643} 12/28/2021 20:22:24 - INFO - codeparrot_training - Step 44704: {'lr': 1.3809303568404319e-05, 'samples': 22888960, 'steps': 44704, 'batch_loss/train': 0.7130245509324595} 12/28/2021 20:22:38 - INFO - codeparrot_training - Step 44705: {'lr': 1.3804137593322692e-05, 'samples': 22889472, 'steps': 44705, 'batch_loss/train': 0.6593261775560677} 12/28/2021 20:22:48 - INFO - codeparrot_training - Step 44706: {'lr': 1.3798972557266614e-05, 'samples': 22889984, 'steps': 44706, 'batch_loss/train': 0.6507758991792798} 12/28/2021 20:22:59 - INFO - codeparrot_training - Step 44707: {'lr': 
1.3793808460256685e-05, 'samples': 22890496, 'steps': 44707, 'batch_loss/train': 0.714032422285527} 12/28/2021 20:23:11 - INFO - codeparrot_training - Step 44708: {'lr': 1.3788645302313301e-05, 'samples': 22891008, 'steps': 44708, 'batch_loss/train': 0.8485724404454231} 12/28/2021 20:23:22 - INFO - codeparrot_training - Step 44709: {'lr': 1.3783483083457116e-05, 'samples': 22891520, 'steps': 44709, 'batch_loss/train': 0.659697150811553} 12/28/2021 20:23:32 - INFO - codeparrot_training - Step 44710: {'lr': 1.3778321803708637e-05, 'samples': 22892032, 'steps': 44710, 'batch_loss/train': 0.7900752332061529} 12/28/2021 20:23:43 - INFO - codeparrot_training - Step 44711: {'lr': 1.3773161463088353e-05, 'samples': 22892544, 'steps': 44711, 'batch_loss/train': 0.7140636593103409} 12/28/2021 20:23:55 - INFO - codeparrot_training - Step 44712: {'lr': 1.3768002061616714e-05, 'samples': 22893056, 'steps': 44712, 'batch_loss/train': 0.642985881306231} 12/28/2021 20:24:05 - INFO - codeparrot_training - Step 44713: {'lr': 1.3762843599314373e-05, 'samples': 22893568, 'steps': 44713, 'batch_loss/train': 0.7208838434889913} 12/28/2021 20:24:16 - INFO - codeparrot_training - Step 44714: {'lr': 1.375768607620173e-05, 'samples': 22894080, 'steps': 44714, 'batch_loss/train': 0.6389999362872913} 12/28/2021 20:24:28 - INFO - codeparrot_training - Step 44715: {'lr': 1.3752529492299298e-05, 'samples': 22894592, 'steps': 44715, 'batch_loss/train': 0.7196646835654974} 12/28/2021 20:24:39 - INFO - codeparrot_training - Step 44716: {'lr': 1.3747373847627698e-05, 'samples': 22895104, 'steps': 44716, 'batch_loss/train': 0.8056919118389487} 12/28/2021 20:24:49 - INFO - codeparrot_training - Step 44717: {'lr': 1.3742219142207275e-05, 'samples': 22895616, 'steps': 44717, 'batch_loss/train': 0.6899007408646867} 12/28/2021 20:25:03 - INFO - codeparrot_training - Step 44718: {'lr': 1.3737065376058566e-05, 'samples': 22896128, 'steps': 44718, 'batch_loss/train': 0.8230301961302757} 12/28/2021 20:25:13 - 
INFO - codeparrot_training - Step 44719: {'lr': 1.3731912549202113e-05, 'samples': 22896640, 'steps': 44719, 'batch_loss/train': 0.7275080503895879} 12/28/2021 20:25:24 - INFO - codeparrot_training - Step 44720: {'lr': 1.3726760661658344e-05, 'samples': 22897152, 'steps': 44720, 'batch_loss/train': 0.6979223147500306} 12/28/2021 20:25:35 - INFO - codeparrot_training - Step 44721: {'lr': 1.372160971344777e-05, 'samples': 22897664, 'steps': 44721, 'batch_loss/train': 0.7003626981750131} 12/28/2021 20:25:47 - INFO - codeparrot_training - Step 44722: {'lr': 1.3716459704590872e-05, 'samples': 22898176, 'steps': 44722, 'batch_loss/train': 0.7668045181781054} 12/28/2021 20:25:58 - INFO - codeparrot_training - Step 44723: {'lr': 1.3711310635108054e-05, 'samples': 22898688, 'steps': 44723, 'batch_loss/train': 0.7363303005695343} 12/28/2021 20:26:08 - INFO - codeparrot_training - Step 44724: {'lr': 1.3706162505019882e-05, 'samples': 22899200, 'steps': 44724, 'batch_loss/train': 0.5039713816659059} 12/28/2021 20:26:20 - INFO - codeparrot_training - Step 44725: {'lr': 1.3701015314346838e-05, 'samples': 22899712, 'steps': 44725, 'batch_loss/train': 0.7126512518152595} 12/28/2021 20:26:31 - INFO - codeparrot_training - Step 44726: {'lr': 1.3695869063109239e-05, 'samples': 22900224, 'steps': 44726, 'batch_loss/train': 0.848467942327261} 12/28/2021 20:26:41 - INFO - codeparrot_training - Step 44727: {'lr': 1.3690723751327739e-05, 'samples': 22900736, 'steps': 44727, 'batch_loss/train': 0.6615209685405716} 12/28/2021 20:26:55 - INFO - codeparrot_training - Step 44728: {'lr': 1.3685579379022623e-05, 'samples': 22901248, 'steps': 44728, 'batch_loss/train': 0.7108053625561297} 12/28/2021 20:27:05 - INFO - codeparrot_training - Step 44729: {'lr': 1.3680435946214459e-05, 'samples': 22901760, 'steps': 44729, 'batch_loss/train': 0.712003412656486} 12/28/2021 20:27:16 - INFO - codeparrot_training - Step 44730: {'lr': 1.3675293452923621e-05, 'samples': 22902272, 'steps': 44730, 
'batch_loss/train': 0.6937810115050524} 12/28/2021 20:27:28 - INFO - codeparrot_training - Step 44731: {'lr': 1.3670151899170591e-05, 'samples': 22902784, 'steps': 44731, 'batch_loss/train': 0.8125794157385826} 12/28/2021 20:27:39 - INFO - codeparrot_training - Step 44732: {'lr': 1.3665011284975798e-05, 'samples': 22903296, 'steps': 44732, 'batch_loss/train': 0.7268615448847413} 12/28/2021 20:27:49 - INFO - codeparrot_training - Step 44733: {'lr': 1.365987161035967e-05, 'samples': 22903808, 'steps': 44733, 'batch_loss/train': 0.7462841691449285} 12/28/2021 20:28:00 - INFO - codeparrot_training - Step 44734: {'lr': 1.3654732875342663e-05, 'samples': 22904320, 'steps': 44734, 'batch_loss/train': 0.7649562498554587} 12/28/2021 20:28:13 - INFO - codeparrot_training - Step 44735: {'lr': 1.3649595079945204e-05, 'samples': 22904832, 'steps': 44735, 'batch_loss/train': 0.7245863699354231} 12/28/2021 20:28:24 - INFO - codeparrot_training - Step 44736: {'lr': 1.364445822418775e-05, 'samples': 22905344, 'steps': 44736, 'batch_loss/train': 0.7461247639730573} 12/28/2021 20:28:35 - INFO - codeparrot_training - Step 44737: {'lr': 1.3639322308090645e-05, 'samples': 22905856, 'steps': 44737, 'batch_loss/train': 0.7986645692144521} 12/28/2021 20:28:47 - INFO - codeparrot_training - Step 44738: {'lr': 1.363418733167432e-05, 'samples': 22906368, 'steps': 44738, 'batch_loss/train': 0.7942181953694671} 12/28/2021 20:28:58 - INFO - codeparrot_training - Step 44739: {'lr': 1.362905329495928e-05, 'samples': 22906880, 'steps': 44739, 'batch_loss/train': 0.7477365587837994} 12/28/2021 20:29:08 - INFO - codeparrot_training - Step 44740: {'lr': 1.362392019796585e-05, 'samples': 22907392, 'steps': 44740, 'batch_loss/train': 0.7107679951004684} 12/28/2021 20:29:20 - INFO - codeparrot_training - Step 44741: {'lr': 1.3618788040714425e-05, 'samples': 22907904, 'steps': 44741, 'batch_loss/train': 0.7919319105567411} 12/28/2021 20:29:31 - INFO - codeparrot_training - Step 44742: {'lr': 
1.3613656823225517e-05, 'samples': 22908416, 'steps': 44742, 'batch_loss/train': 0.7951205215649679} 12/28/2021 20:29:42 - INFO - codeparrot_training - Step 44743: {'lr': 1.3608526545519389e-05, 'samples': 22908928, 'steps': 44743, 'batch_loss/train': 0.7130680353147909} 12/28/2021 20:29:53 - INFO - codeparrot_training - Step 44744: {'lr': 1.3603397207616525e-05, 'samples': 22909440, 'steps': 44744, 'batch_loss/train': 0.7109258144628257} 12/28/2021 20:30:06 - INFO - codeparrot_training - Step 44745: {'lr': 1.3598268809537295e-05, 'samples': 22909952, 'steps': 44745, 'batch_loss/train': 0.6123832773882896} 12/28/2021 20:30:17 - INFO - codeparrot_training - Step 44746: {'lr': 1.3593141351302101e-05, 'samples': 22910464, 'steps': 44746, 'batch_loss/train': 0.642116891220212} 12/28/2021 20:30:28 - INFO - codeparrot_training - Step 44747: {'lr': 1.358801483293129e-05, 'samples': 22910976, 'steps': 44747, 'batch_loss/train': 0.7111279107630253} 12/28/2021 20:30:40 - INFO - codeparrot_training - Step 44748: {'lr': 1.3582889254445313e-05, 'samples': 22911488, 'steps': 44748, 'batch_loss/train': 0.6953107081353664} 12/28/2021 20:30:50 - INFO - codeparrot_training - Step 44749: {'lr': 1.3577764615864407e-05, 'samples': 22912000, 'steps': 44749, 'batch_loss/train': 0.6417810344137251} 12/28/2021 20:31:01 - INFO - codeparrot_training - Step 44750: {'lr': 1.3572640917209084e-05, 'samples': 22912512, 'steps': 44750, 'batch_loss/train': 0.577302251389483} 12/28/2021 20:31:13 - INFO - codeparrot_training - Step 44751: {'lr': 1.3567518158499715e-05, 'samples': 22913024, 'steps': 44751, 'batch_loss/train': 0.6366529213264585} 12/28/2021 20:31:24 - INFO - codeparrot_training - Step 44752: {'lr': 1.3562396339756562e-05, 'samples': 22913536, 'steps': 44752, 'batch_loss/train': 0.7782060778699815} 12/28/2021 20:31:34 - INFO - codeparrot_training - Step 44753: {'lr': 1.3557275461000107e-05, 'samples': 22914048, 'steps': 44753, 'batch_loss/train': 1.5917660649865866} 12/28/2021 20:31:45 
- INFO - codeparrot_training - Step 44754: {'lr': 1.3552155522250587e-05, 'samples': 22914560, 'steps': 44754, 'batch_loss/train': 0.6868020281544887} 12/28/2021 20:31:57 - INFO - codeparrot_training - Step 44755: {'lr': 1.3547036523528455e-05, 'samples': 22915072, 'steps': 44755, 'batch_loss/train': 0.7003633375279605} 12/28/2021 20:32:08 - INFO - codeparrot_training - Step 44756: {'lr': 1.3541918464854003e-05, 'samples': 22915584, 'steps': 44756, 'batch_loss/train': 0.7609743759967387} 12/28/2021 20:32:18 - INFO - codeparrot_training - Step 44757: {'lr': 1.35368013462476e-05, 'samples': 22916096, 'steps': 44757, 'batch_loss/train': 0.5476430773269385} 12/28/2021 20:32:32 - INFO - codeparrot_training - Step 44758: {'lr': 1.3531685167729597e-05, 'samples': 22916608, 'steps': 44758, 'batch_loss/train': 0.6999204019084573} 12/28/2021 20:32:42 - INFO - codeparrot_training - Step 44759: {'lr': 1.3526569929320305e-05, 'samples': 22917120, 'steps': 44759, 'batch_loss/train': 0.5382411060854793} 12/28/2021 20:32:53 - INFO - codeparrot_training - Step 44760: {'lr': 1.3521455631040097e-05, 'samples': 22917632, 'steps': 44760, 'batch_loss/train': 0.63977513439022} 12/28/2021 20:33:05 - INFO - codeparrot_training - Step 44761: {'lr': 1.3516342272909294e-05, 'samples': 22918144, 'steps': 44761, 'batch_loss/train': 0.7363593702903017} 12/28/2021 20:33:16 - INFO - codeparrot_training - Step 44762: {'lr': 1.3511229854948265e-05, 'samples': 22918656, 'steps': 44762, 'batch_loss/train': 0.7169270445592701} 12/28/2021 20:33:26 - INFO - codeparrot_training - Step 44763: {'lr': 1.3506118377177217e-05, 'samples': 22919168, 'steps': 44763, 'batch_loss/train': 0.7301630922593176} 12/28/2021 20:33:37 - INFO - codeparrot_training - Step 44764: {'lr': 1.350100783961658e-05, 'samples': 22919680, 'steps': 44764, 'batch_loss/train': 0.600815675832564} 12/28/2021 20:33:50 - INFO - codeparrot_training - Step 44765: {'lr': 1.349589824228667e-05, 'samples': 22920192, 'steps': 44765, 
'batch_loss/train': 0.7627658005803823} 12/28/2021 20:34:01 - INFO - codeparrot_training - Step 44766: {'lr': 1.3490789585207719e-05, 'samples': 22920704, 'steps': 44766, 'batch_loss/train': 0.7685468029230833} 12/28/2021 20:34:12 - INFO - codeparrot_training - Step 44767: {'lr': 1.3485681868400073e-05, 'samples': 22921216, 'steps': 44767, 'batch_loss/train': 0.8196745608001947} 12/28/2021 20:34:24 - INFO - codeparrot_training - Step 44768: {'lr': 1.3480575091884106e-05, 'samples': 22921728, 'steps': 44768, 'batch_loss/train': 0.696442671935074} 12/28/2021 20:34:34 - INFO - codeparrot_training - Step 44769: {'lr': 1.3475469255680051e-05, 'samples': 22922240, 'steps': 44769, 'batch_loss/train': 0.7793675884604454} 12/28/2021 20:34:45 - INFO - codeparrot_training - Step 44770: {'lr': 1.3470364359808195e-05, 'samples': 22922752, 'steps': 44770, 'batch_loss/train': 0.601698309648782} 12/28/2021 20:34:57 - INFO - codeparrot_training - Step 44771: {'lr': 1.346526040428886e-05, 'samples': 22923264, 'steps': 44771, 'batch_loss/train': 0.7390191806480289} 12/28/2021 20:35:08 - INFO - codeparrot_training - Step 44772: {'lr': 1.3460157389142358e-05, 'samples': 22923776, 'steps': 44772, 'batch_loss/train': 0.7214473044732586} 12/28/2021 20:35:18 - INFO - codeparrot_training - Step 44773: {'lr': 1.3455055314388953e-05, 'samples': 22924288, 'steps': 44773, 'batch_loss/train': 0.6823305301368237} 12/28/2021 20:35:29 - INFO - codeparrot_training - Step 44774: {'lr': 1.3449954180048963e-05, 'samples': 22924800, 'steps': 44774, 'batch_loss/train': 0.7375252293422818} 12/28/2021 20:35:42 - INFO - codeparrot_training - Step 44775: {'lr': 1.3444853986142563e-05, 'samples': 22925312, 'steps': 44775, 'batch_loss/train': 0.6933029795181938} 12/28/2021 20:35:53 - INFO - codeparrot_training - Step 44776: {'lr': 1.3439754732690158e-05, 'samples': 22925824, 'steps': 44776, 'batch_loss/train': 0.6902625828515738} 12/28/2021 20:36:03 - INFO - codeparrot_training - Step 44777: {'lr': 
1.3434656419711978e-05, 'samples': 22926336, 'steps': 44777, 'batch_loss/train': 0.7208649329841137} 12/28/2021 20:36:15 - INFO - codeparrot_training - Step 44778: {'lr': 1.342955904722823e-05, 'samples': 22926848, 'steps': 44778, 'batch_loss/train': 0.6756991683505476} 12/28/2021 20:36:26 - INFO - codeparrot_training - Step 44779: {'lr': 1.3424462615259259e-05, 'samples': 22927360, 'steps': 44779, 'batch_loss/train': 0.7616416974924505} 12/28/2021 20:36:37 - INFO - codeparrot_training - Step 44780: {'lr': 1.3419367123825326e-05, 'samples': 22927872, 'steps': 44780, 'batch_loss/train': 0.7605844680219889} 12/28/2021 20:36:50 - INFO - codeparrot_training - Step 44781: {'lr': 1.3414272572946612e-05, 'samples': 22928384, 'steps': 44781, 'batch_loss/train': 0.7876634863205254} 12/28/2021 20:37:00 - INFO - codeparrot_training - Step 44782: {'lr': 1.340917896264343e-05, 'samples': 22928896, 'steps': 44782, 'batch_loss/train': 0.78644352639094} 12/28/2021 20:37:11 - INFO - codeparrot_training - Step 44783: {'lr': 1.3404086292936019e-05, 'samples': 22929408, 'steps': 44783, 'batch_loss/train': 0.791319886688143} 12/28/2021 20:37:21 - INFO - codeparrot_training - Step 44784: {'lr': 1.3398994563844636e-05, 'samples': 22929920, 'steps': 44784, 'batch_loss/train': 0.6164586753584445} 12/28/2021 20:37:33 - INFO - codeparrot_training - Step 44785: {'lr': 1.3393903775389515e-05, 'samples': 22930432, 'steps': 44785, 'batch_loss/train': 0.7812034231610596} 12/28/2021 20:37:44 - INFO - codeparrot_training - Step 44786: {'lr': 1.3388813927590893e-05, 'samples': 22930944, 'steps': 44786, 'batch_loss/train': 0.7696303157135844} 12/28/2021 20:37:55 - INFO - codeparrot_training - Step 44787: {'lr': 1.3383725020468973e-05, 'samples': 22931456, 'steps': 44787, 'batch_loss/train': 0.6937362258322537} 12/28/2021 20:38:10 - INFO - codeparrot_training - Step 44788: {'lr': 1.33786370540441e-05, 'samples': 22931968, 'steps': 44788, 'batch_loss/train': 0.7322179526090622} 12/28/2021 20:38:20 - 
INFO - codeparrot_training - Step 44789: {'lr': 1.3373550028336318e-05, 'samples': 22932480, 'steps': 44789, 'batch_loss/train': 0.7887695236131549} 12/28/2021 20:38:31 - INFO - codeparrot_training - Step 44790: {'lr': 1.3368463943366021e-05, 'samples': 22932992, 'steps': 44790, 'batch_loss/train': 0.6518225884065032} 12/28/2021 20:38:42 - INFO - codeparrot_training - Step 44791: {'lr': 1.3363378799153364e-05, 'samples': 22933504, 'steps': 44791, 'batch_loss/train': 0.9850894389674067} 12/28/2021 20:38:54 - INFO - codeparrot_training - Step 44792: {'lr': 1.335829459571855e-05, 'samples': 22934016, 'steps': 44792, 'batch_loss/train': 0.8308779364451766} 12/28/2021 20:39:04 - INFO - codeparrot_training - Step 44793: {'lr': 1.335321133308176e-05, 'samples': 22934528, 'steps': 44793, 'batch_loss/train': 0.7558496799319983} 12/28/2021 20:39:15 - INFO - codeparrot_training - Step 44794: {'lr': 1.334812901126331e-05, 'samples': 22935040, 'steps': 44794, 'batch_loss/train': 0.6838780154939741} 12/28/2021 20:39:27 - INFO - codeparrot_training - Step 44795: {'lr': 1.3343047630283323e-05, 'samples': 22935552, 'steps': 44795, 'batch_loss/train': 0.6694279429502785} 12/28/2021 20:39:38 - INFO - codeparrot_training - Step 44796: {'lr': 1.3337967190162003e-05, 'samples': 22936064, 'steps': 44796, 'batch_loss/train': 0.7158516314812005} 12/28/2021 20:39:48 - INFO - codeparrot_training - Step 44797: {'lr': 1.3332887690919588e-05, 'samples': 22936576, 'steps': 44797, 'batch_loss/train': 0.7288436726666987} 12/28/2021 20:40:02 - INFO - codeparrot_training - Step 44798: {'lr': 1.3327809132576251e-05, 'samples': 22937088, 'steps': 44798, 'batch_loss/train': 0.6174648247542791} 12/28/2021 20:40:13 - INFO - codeparrot_training - Step 44799: {'lr': 1.3322731515152147e-05, 'samples': 22937600, 'steps': 44799, 'batch_loss/train': 0.7494484996423125} 12/28/2021 20:40:24 - INFO - codeparrot_training - Step 44800: {'lr': 1.3317654838667537e-05, 'samples': 22938112, 'steps': 44800, 
'batch_loss/train': 0.6425475101568736} 12/28/2021 20:40:36 - INFO - codeparrot_training - Step 44801: {'lr': 1.331257910314254e-05, 'samples': 22938624, 'steps': 44801, 'batch_loss/train': 0.7799229053780437} 12/28/2021 20:40:46 - INFO - codeparrot_training - Step 44802: {'lr': 1.3307504308597367e-05, 'samples': 22939136, 'steps': 44802, 'batch_loss/train': 0.7117850752547383} 12/28/2021 20:40:57 - INFO - codeparrot_training - Step 44803: {'lr': 1.3302430455052222e-05, 'samples': 22939648, 'steps': 44803, 'batch_loss/train': 0.6890066582709551} 12/28/2021 20:41:08 - INFO - codeparrot_training - Step 44804: {'lr': 1.3297357542527144e-05, 'samples': 22940160, 'steps': 44804, 'batch_loss/train': 0.8232028237544} 12/28/2021 20:41:20 - INFO - codeparrot_training - Step 44805: {'lr': 1.329228557104245e-05, 'samples': 22940672, 'steps': 44805, 'batch_loss/train': 0.7664988096803427} 12/28/2021 20:41:30 - INFO - codeparrot_training - Step 44806: {'lr': 1.3287214540618293e-05, 'samples': 22941184, 'steps': 44806, 'batch_loss/train': 0.8162275915965438} 12/28/2021 20:41:41 - INFO - codeparrot_training - Step 44807: {'lr': 1.3282144451274763e-05, 'samples': 22941696, 'steps': 44807, 'batch_loss/train': 0.6644782638177276} 12/28/2021 20:41:55 - INFO - codeparrot_training - Step 44808: {'lr': 1.3277075303032044e-05, 'samples': 22942208, 'steps': 44808, 'batch_loss/train': 0.7213124204427004} 12/28/2021 20:42:06 - INFO - codeparrot_training - Step 44809: {'lr': 1.3272007095910283e-05, 'samples': 22942720, 'steps': 44809, 'batch_loss/train': 0.7689856109209359} 12/28/2021 20:42:16 - INFO - codeparrot_training - Step 44810: {'lr': 1.3266939829929631e-05, 'samples': 22943232, 'steps': 44810, 'batch_loss/train': 0.650189263134962} 12/28/2021 20:42:29 - INFO - codeparrot_training - Step 44811: {'lr': 1.326187350511024e-05, 'samples': 22943744, 'steps': 44811, 'batch_loss/train': 0.7018010143656284} 12/28/2021 20:42:39 - INFO - codeparrot_training - Step 44812: {'lr': 
1.3256808121472258e-05, 'samples': 22944256, 'steps': 44812, 'batch_loss/train': 0.6664979103952646} 12/28/2021 20:42:50 - INFO - codeparrot_training - Step 44813: {'lr': 1.325174367903581e-05, 'samples': 22944768, 'steps': 44813, 'batch_loss/train': 0.7442472488619387} 12/28/2021 20:43:00 - INFO - codeparrot_training - Step 44814: {'lr': 1.3246680177821018e-05, 'samples': 22945280, 'steps': 44814, 'batch_loss/train': 0.6145844606216997} 12/28/2021 20:43:14 - INFO - codeparrot_training - Step 44815: {'lr': 1.3241617617848062e-05, 'samples': 22945792, 'steps': 44815, 'batch_loss/train': 0.7307553659193218} 12/28/2021 20:43:25 - INFO - codeparrot_training - Step 44816: {'lr': 1.3236555999137034e-05, 'samples': 22946304, 'steps': 44816, 'batch_loss/train': 0.7264314605854452} 12/28/2021 20:43:36 - INFO - codeparrot_training - Step 44817: {'lr': 1.3231495321708087e-05, 'samples': 22946816, 'steps': 44817, 'batch_loss/train': 0.769196031615138} 12/28/2021 20:43:48 - INFO - codeparrot_training - Step 44818: {'lr': 1.3226435585581286e-05, 'samples': 22947328, 'steps': 44818, 'batch_loss/train': 0.7476661494001746} 12/28/2021 20:43:58 - INFO - codeparrot_training - Step 44819: {'lr': 1.3221376790776728e-05, 'samples': 22947840, 'steps': 44819, 'batch_loss/train': 0.67419037676882} 12/28/2021 20:44:09 - INFO - codeparrot_training - Step 44820: {'lr': 1.3216318937314647e-05, 'samples': 22948352, 'steps': 44820, 'batch_loss/train': 0.7956449673511088} 12/28/2021 20:44:21 - INFO - codeparrot_training - Step 44821: {'lr': 1.3211262025215054e-05, 'samples': 22948864, 'steps': 44821, 'batch_loss/train': 0.787569179199636} 12/28/2021 20:44:32 - INFO - codeparrot_training - Step 44822: {'lr': 1.3206206054498071e-05, 'samples': 22949376, 'steps': 44822, 'batch_loss/train': 0.661207193043083} 12/28/2021 20:44:42 - INFO - codeparrot_training - Step 44823: {'lr': 1.3201151025183795e-05, 'samples': 22949888, 'steps': 44823, 'batch_loss/train': 0.7921542404219508} 12/28/2021 20:44:55 - 
INFO - codeparrot_training - Step 44824: {'lr': 1.3196096937292319e-05, 'samples': 22950400, 'steps': 44824, 'batch_loss/train': 0.6937518827617168} 12/28/2021 20:45:05 - INFO - codeparrot_training - Step 44825: {'lr': 1.3191043790843765e-05, 'samples': 22950912, 'steps': 44825, 'batch_loss/train': 0.765240992885083} 12/28/2021 20:45:16 - INFO - codeparrot_training - Step 44826: {'lr': 1.3185991585858204e-05, 'samples': 22951424, 'steps': 44826, 'batch_loss/train': 0.6723478017374873} 12/28/2021 20:45:26 - INFO - codeparrot_training - Step 44827: {'lr': 1.3180940322355727e-05, 'samples': 22951936, 'steps': 44827, 'batch_loss/train': 0.6946057642344385} 12/28/2021 20:45:40 - INFO - codeparrot_training - Step 44828: {'lr': 1.3175890000356406e-05, 'samples': 22952448, 'steps': 44828, 'batch_loss/train': 0.6731212306767702} 12/28/2021 20:45:51 - INFO - codeparrot_training - Step 44829: {'lr': 1.3170840619880358e-05, 'samples': 22952960, 'steps': 44829, 'batch_loss/train': 0.6971254805102944} 12/28/2021 20:46:02 - INFO - codeparrot_training - Step 44830: {'lr': 1.3165792180947544e-05, 'samples': 22953472, 'steps': 44830, 'batch_loss/train': 0.6481913728639483} 12/28/2021 20:46:15 - INFO - codeparrot_training - Step 44831: {'lr': 1.3160744683578168e-05, 'samples': 22953984, 'steps': 44831, 'batch_loss/train': 0.6720201279968023} 12/28/2021 20:46:25 - INFO - codeparrot_training - Step 44832: {'lr': 1.315569812779227e-05, 'samples': 22954496, 'steps': 44832, 'batch_loss/train': 0.6066879583522677} 12/28/2021 20:46:36 - INFO - codeparrot_training - Step 44833: {'lr': 1.3150652513609862e-05, 'samples': 22955008, 'steps': 44833, 'batch_loss/train': 0.6633197828195989} 12/28/2021 20:46:50 - INFO - codeparrot_training - Step 44834: {'lr': 1.314560784105101e-05, 'samples': 22955520, 'steps': 44834, 'batch_loss/train': 0.78272401727736} 12/28/2021 20:47:01 - INFO - codeparrot_training - Step 44835: {'lr': 1.314056411013581e-05, 'samples': 22956032, 'steps': 44835, 
'batch_loss/train': 0.9687947437632829} 12/28/2021 20:47:11 - INFO - codeparrot_training - Step 44836: {'lr': 1.3135521320884275e-05, 'samples': 22956544, 'steps': 44836, 'batch_loss/train': 0.6340396924642846} 12/28/2021 20:47:22 - INFO - codeparrot_training - Step 44837: {'lr': 1.3130479473316498e-05, 'samples': 22957056, 'steps': 44837, 'batch_loss/train': 0.8115964103490114} 12/28/2021 20:47:34 - INFO - codeparrot_training - Step 44838: {'lr': 1.3125438567452464e-05, 'samples': 22957568, 'steps': 44838, 'batch_loss/train': 0.6110284986789338} 12/28/2021 20:47:45 - INFO - codeparrot_training - Step 44839: {'lr': 1.3120398603312266e-05, 'samples': 22958080, 'steps': 44839, 'batch_loss/train': 0.7537954733707011} 12/28/2021 20:47:55 - INFO - codeparrot_training - Step 44840: {'lr': 1.311535958091592e-05, 'samples': 22958592, 'steps': 44840, 'batch_loss/train': 0.7066299477592111} 12/28/2021 20:48:07 - INFO - codeparrot_training - Step 44841: {'lr': 1.3110321500283462e-05, 'samples': 22959104, 'steps': 44841, 'batch_loss/train': 0.6839916244498454} 12/28/2021 20:48:18 - INFO - codeparrot_training - Step 44842: {'lr': 1.3105284361434906e-05, 'samples': 22959616, 'steps': 44842, 'batch_loss/train': 0.6935628484934568} 12/28/2021 20:48:29 - INFO - codeparrot_training - Step 44843: {'lr': 1.3100248164390316e-05, 'samples': 22960128, 'steps': 44843, 'batch_loss/train': 0.8556574168615043} 12/28/2021 20:48:43 - INFO - codeparrot_training - Step 44844: {'lr': 1.3095212909169679e-05, 'samples': 22960640, 'steps': 44844, 'batch_loss/train': 0.6587315334472805} 12/28/2021 20:48:53 - INFO - codeparrot_training - Step 44845: {'lr': 1.3090178595792979e-05, 'samples': 22961152, 'steps': 44845, 'batch_loss/train': 0.6890188534744084} 12/28/2021 20:49:04 - INFO - codeparrot_training - Step 44846: {'lr': 1.3085145224280337e-05, 'samples': 22961664, 'steps': 44846, 'batch_loss/train': 0.5783398255007342} 12/28/2021 20:49:16 - INFO - codeparrot_training - Step 44847: {'lr': 
1.3080112794651683e-05, 'samples': 22962176, 'steps': 44847, 'batch_loss/train': 0.7596115083433688} 12/28/2021 20:49:27 - INFO - codeparrot_training - Step 44848: {'lr': 1.3075081306927028e-05, 'samples': 22962688, 'steps': 44848, 'batch_loss/train': 0.6782629441004246} 12/28/2021 20:49:37 - INFO - codeparrot_training - Step 44849: {'lr': 1.3070050761126412e-05, 'samples': 22963200, 'steps': 44849, 'batch_loss/train': 0.7814462371170521} 12/28/2021 20:49:48 - INFO - codeparrot_training - Step 44850: {'lr': 1.306502115726979e-05, 'samples': 22963712, 'steps': 44850, 'batch_loss/train': 0.6912765605957247} 12/28/2021 20:50:00 - INFO - codeparrot_training - Step 44851: {'lr': 1.3059992495377204e-05, 'samples': 22964224, 'steps': 44851, 'batch_loss/train': 0.7121398622402921} 12/28/2021 20:50:11 - INFO - codeparrot_training - Step 44852: {'lr': 1.3054964775468608e-05, 'samples': 22964736, 'steps': 44852, 'batch_loss/train': 0.7341879811137915} 12/28/2021 20:50:21 - INFO - codeparrot_training - Step 44853: {'lr': 1.3049937997564015e-05, 'samples': 22965248, 'steps': 44853, 'batch_loss/train': 0.7581088859587908} 12/28/2021 20:50:36 - INFO - codeparrot_training - Step 44854: {'lr': 1.3044912161683381e-05, 'samples': 22965760, 'steps': 44854, 'batch_loss/train': 0.762495948409196} 12/28/2021 20:50:46 - INFO - codeparrot_training - Step 44855: {'lr': 1.3039887267846746e-05, 'samples': 22966272, 'steps': 44855, 'batch_loss/train': 0.5056444550864398} 12/28/2021 20:50:57 - INFO - codeparrot_training - Step 44856: {'lr': 1.303486331607398e-05, 'samples': 22966784, 'steps': 44856, 'batch_loss/train': 0.7965814247727394} 12/28/2021 20:51:09 - INFO - codeparrot_training - Step 44857: {'lr': 1.3029840306385154e-05, 'samples': 22967296, 'steps': 44857, 'batch_loss/train': 0.7083062187302858} 12/28/2021 20:51:20 - INFO - codeparrot_training - Step 44858: {'lr': 1.3024818238800252e-05, 'samples': 22967808, 'steps': 44858, 'batch_loss/train': 0.7681698175147176} 12/28/2021 20:51:30 
- INFO - codeparrot_training - Step 44859: {'lr': 1.3019797113339172e-05, 'samples': 22968320, 'steps': 44859, 'batch_loss/train': 0.6360917913261801} 12/28/2021 20:51:41 - INFO - codeparrot_training - Step 44860: {'lr': 1.3014776930021843e-05, 'samples': 22968832, 'steps': 44860, 'batch_loss/train': 0.7137867505662143} 12/28/2021 20:51:53 - INFO - codeparrot_training - Step 44861: {'lr': 1.3009757688868334e-05, 'samples': 22969344, 'steps': 44861, 'batch_loss/train': 0.652775910217315} 12/28/2021 20:52:04 - INFO - codeparrot_training - Step 44862: {'lr': 1.3004739389898545e-05, 'samples': 22969856, 'steps': 44862, 'batch_loss/train': 0.6685368060134351} 12/28/2021 20:52:14 - INFO - codeparrot_training - Step 44863: {'lr': 1.2999722033132406e-05, 'samples': 22970368, 'steps': 44863, 'batch_loss/train': 0.6662077973596752} 12/28/2021 20:52:28 - INFO - codeparrot_training - Step 44864: {'lr': 1.2994705618589869e-05, 'samples': 22970880, 'steps': 44864, 'batch_loss/train': 0.7754911668598652} 12/28/2021 20:52:38 - INFO - codeparrot_training - Step 44865: {'lr': 1.2989690146290923e-05, 'samples': 22971392, 'steps': 44865, 'batch_loss/train': 0.741590662393719} 12/28/2021 20:52:49 - INFO - codeparrot_training - Step 44866: {'lr': 1.2984675616255464e-05, 'samples': 22971904, 'steps': 44866, 'batch_loss/train': 0.6951101124286652} 12/28/2021 20:53:01 - INFO - codeparrot_training - Step 44867: {'lr': 1.2979662028503424e-05, 'samples': 22972416, 'steps': 44867, 'batch_loss/train': 0.806021642871201} 12/28/2021 20:53:12 - INFO - codeparrot_training - Step 44868: {'lr': 1.2974649383054787e-05, 'samples': 22972928, 'steps': 44868, 'batch_loss/train': 0.7413163513119798} 12/28/2021 20:53:22 - INFO - codeparrot_training - Step 44869: {'lr': 1.2969637679929425e-05, 'samples': 22973440, 'steps': 44869, 'batch_loss/train': 0.8099472995963879} 12/28/2021 20:53:33 - INFO - codeparrot_training - Step 44870: {'lr': 1.296462691914732e-05, 'samples': 22973952, 'steps': 44870, 
'batch_loss/train': 0.7236925374600105} 12/28/2021 20:53:45 - INFO - codeparrot_training - Step 44871: {'lr': 1.2959617100728294e-05, 'samples': 22974464, 'steps': 44871, 'batch_loss/train': 0.8043847200460732} 12/28/2021 20:53:56 - INFO - codeparrot_training - Step 44872: {'lr': 1.2954608224692382e-05, 'samples': 22974976, 'steps': 44872, 'batch_loss/train': 0.749309771694243} 12/28/2021 20:54:06 - INFO - codeparrot_training - Step 44873: {'lr': 1.2949600291059432e-05, 'samples': 22975488, 'steps': 44873, 'batch_loss/train': 0.6652922574430704} 12/28/2021 20:54:21 - INFO - codeparrot_training - Step 44874: {'lr': 1.2944593299849317e-05, 'samples': 22976000, 'steps': 44874, 'batch_loss/train': 0.8708055070601404} 12/28/2021 20:54:31 - INFO - codeparrot_training - Step 44875: {'lr': 1.2939587251082075e-05, 'samples': 22976512, 'steps': 44875, 'batch_loss/train': 0.6858572693308815} 12/28/2021 20:54:42 - INFO - codeparrot_training - Step 44876: {'lr': 1.2934582144777469e-05, 'samples': 22977024, 'steps': 44876, 'batch_loss/train': 0.7694036685861647} 12/28/2021 20:54:54 - INFO - codeparrot_training - Step 44877: {'lr': 1.2929577980955454e-05, 'samples': 22977536, 'steps': 44877, 'batch_loss/train': 0.7316343556158245} 12/28/2021 20:55:05 - INFO - codeparrot_training - Step 44878: {'lr': 1.2924574759635932e-05, 'samples': 22978048, 'steps': 44878, 'batch_loss/train': 0.8148097284138203} 12/28/2021 20:55:15 - INFO - codeparrot_training - Step 44879: {'lr': 1.2919572480838775e-05, 'samples': 22978560, 'steps': 44879, 'batch_loss/train': 0.8108349265530705} 12/28/2021 20:55:26 - INFO - codeparrot_training - Step 44880: {'lr': 1.2914571144583886e-05, 'samples': 22979072, 'steps': 44880, 'batch_loss/train': 0.7276285784319043} 12/28/2021 20:55:38 - INFO - codeparrot_training - Step 44881: {'lr': 1.2909570750891164e-05, 'samples': 22979584, 'steps': 44881, 'batch_loss/train': 0.692138833925128} 12/28/2021 20:55:48 - INFO - codeparrot_training - Step 44882: {'lr': 
1.2904571299780399e-05, 'samples': 22980096, 'steps': 44882, 'batch_loss/train': 0.7641836674883962} 12/28/2021 20:55:59 - INFO - codeparrot_training - Step 44883: {'lr': 1.2899572791271602e-05, 'samples': 22980608, 'steps': 44883, 'batch_loss/train': 0.6875320073449984} 12/28/2021 20:56:14 - INFO - codeparrot_training - Step 44884: {'lr': 1.2894575225384564e-05, 'samples': 22981120, 'steps': 44884, 'batch_loss/train': 0.8085589222609997} 12/28/2021 20:56:24 - INFO - codeparrot_training - Step 44885: {'lr': 1.2889578602139184e-05, 'samples': 22981632, 'steps': 44885, 'batch_loss/train': 0.6598718203604221} 12/28/2021 20:56:35 - INFO - codeparrot_training - Step 44886: {'lr': 1.2884582921555227e-05, 'samples': 22982144, 'steps': 44886, 'batch_loss/train': 0.6342046704958193} 12/28/2021 20:56:47 - INFO - codeparrot_training - Step 44887: {'lr': 1.287958818365273e-05, 'samples': 22982656, 'steps': 44887, 'batch_loss/train': 0.6083291303366423} 12/28/2021 20:56:58 - INFO - codeparrot_training - Step 44888: {'lr': 1.2874594388451428e-05, 'samples': 22983168, 'steps': 44888, 'batch_loss/train': 0.706337527371943} 12/28/2021 20:57:08 - INFO - codeparrot_training - Step 44889: {'lr': 1.2869601535971192e-05, 'samples': 22983680, 'steps': 44889, 'batch_loss/train': 0.7758072204887867} 12/28/2021 20:57:21 - INFO - codeparrot_training - Step 44890: {'lr': 1.2864609626231899e-05, 'samples': 22984192, 'steps': 44890, 'batch_loss/train': 0.7141193780116737} 12/28/2021 20:57:31 - INFO - codeparrot_training - Step 44891: {'lr': 1.2859618659253363e-05, 'samples': 22984704, 'steps': 44891, 'batch_loss/train': 0.5883046865928918} 12/28/2021 20:57:42 - INFO - codeparrot_training - Step 44892: {'lr': 1.285462863505546e-05, 'samples': 22985216, 'steps': 44892, 'batch_loss/train': 0.696160668740049} 12/28/2021 20:57:53 - INFO - codeparrot_training - Step 44893: {'lr': 1.2849639553657976e-05, 'samples': 22985728, 'steps': 44893, 'batch_loss/train': 0.8263740825932473} 12/28/2021 20:58:05 - 
INFO - codeparrot_training - Step 44894: {'lr': 1.2844651415080815e-05, 'samples': 22986240, 'steps': 44894, 'batch_loss/train': 0.789151850156486} 12/28/2021 20:58:16 - INFO - codeparrot_training - Step 44895: {'lr': 1.2839664219343766e-05, 'samples': 22986752, 'steps': 44895, 'batch_loss/train': 0.5750239412300289} 12/28/2021 20:58:26 - INFO - codeparrot_training - Step 44896: {'lr': 1.2834677966466702e-05, 'samples': 22987264, 'steps': 44896, 'batch_loss/train': 0.658104341593571} 12/28/2021 20:58:41 - INFO - codeparrot_training - Step 44897: {'lr': 1.2829692656469328e-05, 'samples': 22987776, 'steps': 44897, 'batch_loss/train': 0.7436693720519543} 12/28/2021 20:58:51 - INFO - codeparrot_training - Step 44898: {'lr': 1.2824708289371601e-05, 'samples': 22988288, 'steps': 44898, 'batch_loss/train': 0.6790652071940713} 12/28/2021 20:59:02 - INFO - codeparrot_training - Step 44899: {'lr': 1.2819724865193255e-05, 'samples': 22988800, 'steps': 44899, 'batch_loss/train': 1.065919267013669} 12/28/2021 20:59:13 - INFO - codeparrot_training - Step 44900: {'lr': 1.2814742383954109e-05, 'samples': 22989312, 'steps': 44900, 'batch_loss/train': 0.6977917365729809} 12/28/2021 20:59:25 - INFO - codeparrot_training - Step 44901: {'lr': 1.2809760845674062e-05, 'samples': 22989824, 'steps': 44901, 'batch_loss/train': 0.6635445905849338} 12/28/2021 20:59:35 - INFO - codeparrot_training - Step 44902: {'lr': 1.2804780250372794e-05, 'samples': 22990336, 'steps': 44902, 'batch_loss/train': 0.7256871149875224} 12/28/2021 20:59:46 - INFO - codeparrot_training - Step 44903: {'lr': 1.279980059807015e-05, 'samples': 22990848, 'steps': 44903, 'batch_loss/train': 0.6493957510683686} 12/28/2021 21:00:00 - INFO - codeparrot_training - Step 44904: {'lr': 1.2794821888785946e-05, 'samples': 22991360, 'steps': 44904, 'batch_loss/train': 0.6988379182294011} 12/28/2021 21:00:11 - INFO - codeparrot_training - Step 44905: {'lr': 1.2789844122539973e-05, 'samples': 22991872, 'steps': 44905, 
'batch_loss/train': 0.7532736035063863} 12/28/2021 21:00:21 - INFO - codeparrot_training - Step 44906: {'lr': 1.2784867299351992e-05, 'samples': 22992384, 'steps': 44906, 'batch_loss/train': 0.6141119817621075} 12/28/2021 21:00:32 - INFO - codeparrot_training - Step 44907: {'lr': 1.277989141924188e-05, 'samples': 22992896, 'steps': 44907, 'batch_loss/train': 0.6479383455007337} 12/28/2021 21:00:44 - INFO - codeparrot_training - Step 44908: {'lr': 1.2774916482229254e-05, 'samples': 22993408, 'steps': 44908, 'batch_loss/train': 0.6989085134118795} 12/28/2021 21:00:55 - INFO - codeparrot_training - Step 44909: {'lr': 1.276994248833402e-05, 'samples': 22993920, 'steps': 44909, 'batch_loss/train': 0.7965810089372098} 12/28/2021 21:01:05 - INFO - codeparrot_training - Step 44910: {'lr': 1.2764969437575963e-05, 'samples': 22994432, 'steps': 44910, 'batch_loss/train': 0.7955244993790984} 12/28/2021 21:01:17 - INFO - codeparrot_training - Step 44911: {'lr': 1.2759997329974737e-05, 'samples': 22994944, 'steps': 44911, 'batch_loss/train': 0.6538720477838069} 12/28/2021 21:01:28 - INFO - codeparrot_training - Step 44912: {'lr': 1.2755026165550215e-05, 'samples': 22995456, 'steps': 44912, 'batch_loss/train': 0.7305033435113728} 12/28/2021 21:01:39 - INFO - codeparrot_training - Step 44913: {'lr': 1.2750055944322186e-05, 'samples': 22995968, 'steps': 44913, 'batch_loss/train': 0.7695453623309731} 12/28/2021 21:01:53 - INFO - codeparrot_training - Step 44914: {'lr': 1.2745086666310302e-05, 'samples': 22996480, 'steps': 44914, 'batch_loss/train': 0.6770617021247745} 12/28/2021 21:02:03 - INFO - codeparrot_training - Step 44915: {'lr': 1.2740118331534323e-05, 'samples': 22996992, 'steps': 44915, 'batch_loss/train': 0.6949358754791319} 12/28/2021 21:02:14 - INFO - codeparrot_training - Step 44916: {'lr': 1.2735150940014123e-05, 'samples': 22997504, 'steps': 44916, 'batch_loss/train': 0.6554673963692039} 12/28/2021 21:02:26 - INFO - codeparrot_training - Step 44917: {'lr': 
1.2730184491769354e-05, 'samples': 22998016, 'steps': 44917, 'batch_loss/train': 0.7231839979067445} 12/28/2021 21:02:37 - INFO - codeparrot_training - Step 44918: {'lr': 1.2725218986819776e-05, 'samples': 22998528, 'steps': 44918, 'batch_loss/train': 0.7439837856218219} 12/28/2021 21:02:47 - INFO - codeparrot_training - Step 44919: {'lr': 1.2720254425185123e-05, 'samples': 22999040, 'steps': 44919, 'batch_loss/train': 0.7809070786461234} 12/28/2021 21:02:58 - INFO - codeparrot_training - Step 44920: {'lr': 1.271529080688516e-05, 'samples': 22999552, 'steps': 44920, 'batch_loss/train': 0.7923209127038717} 12/28/2021 21:03:10 - INFO - codeparrot_training - Step 44921: {'lr': 1.271032813193959e-05, 'samples': 23000064, 'steps': 44921, 'batch_loss/train': 0.6042949017137289} 12/28/2021 21:03:21 - INFO - codeparrot_training - Step 44922: {'lr': 1.2705366400368207e-05, 'samples': 23000576, 'steps': 44922, 'batch_loss/train': 0.6774265333078802} 12/28/2021 21:03:31 - INFO - codeparrot_training - Step 44923: {'lr': 1.270040561219063e-05, 'samples': 23001088, 'steps': 44923, 'batch_loss/train': 0.6646741512231529} 12/28/2021 21:03:45 - INFO - codeparrot_training - Step 44924: {'lr': 1.2695445767426677e-05, 'samples': 23001600, 'steps': 44924, 'batch_loss/train': 0.6590665173716843} 12/28/2021 21:03:56 - INFO - codeparrot_training - Step 44925: {'lr': 1.269048686609603e-05, 'samples': 23002112, 'steps': 44925, 'batch_loss/train': 0.8457164308056235} 12/28/2021 21:04:07 - INFO - codeparrot_training - Step 44926: {'lr': 1.2685528908218336e-05, 'samples': 23002624, 'steps': 44926, 'batch_loss/train': 0.7031124143395573} 12/28/2021 21:04:19 - INFO - codeparrot_training - Step 44927: {'lr': 1.2680571893813441e-05, 'samples': 23003136, 'steps': 44927, 'batch_loss/train': 0.6417653481476009} 12/28/2021 21:04:30 - INFO - codeparrot_training - Step 44928: {'lr': 1.267561582290097e-05, 'samples': 23003648, 'steps': 44928, 'batch_loss/train': 0.6430700803175569} 12/28/2021 21:04:40 - 
INFO - codeparrot_training - Step 44929: {'lr': 1.267066069550063e-05, 'samples': 23004160, 'steps': 44929, 'batch_loss/train': 0.6699356907047331} 12/28/2021 21:04:51 - INFO - codeparrot_training - Step 44930: {'lr': 1.2665706511632152e-05, 'samples': 23004672, 'steps': 44930, 'batch_loss/train': 0.7915217438712716} 12/28/2021 21:05:03 - INFO - codeparrot_training - Step 44931: {'lr': 1.2660753271315189e-05, 'samples': 23005184, 'steps': 44931, 'batch_loss/train': 0.7051984826102853} 12/28/2021 21:05:14 - INFO - codeparrot_training - Step 44932: {'lr': 1.2655800974569448e-05, 'samples': 23005696, 'steps': 44932, 'batch_loss/train': 0.7695357352495193} 12/28/2021 21:05:24 - INFO - codeparrot_training - Step 44933: {'lr': 1.265084962141469e-05, 'samples': 23006208, 'steps': 44933, 'batch_loss/train': 0.7019115500152111} 12/28/2021 21:05:36 - INFO - codeparrot_training - Step 44934: {'lr': 1.2645899211870426e-05, 'samples': 23006720, 'steps': 44934, 'batch_loss/train': 0.7295017472933978} 12/28/2021 21:05:47 - INFO - codeparrot_training - Step 44935: {'lr': 1.2640949745956504e-05, 'samples': 23007232, 'steps': 44935, 'batch_loss/train': 0.7066862895153463} 12/28/2021 21:05:58 - INFO - codeparrot_training - Step 44936: {'lr': 1.2636001223692572e-05, 'samples': 23007744, 'steps': 44936, 'batch_loss/train': 0.7356623690575361} 12/28/2021 21:06:12 - INFO - codeparrot_training - Step 44937: {'lr': 1.26310536450982e-05, 'samples': 23008256, 'steps': 44937, 'batch_loss/train': 0.5667628694791347} 12/28/2021 21:06:22 - INFO - codeparrot_training - Step 44938: {'lr': 1.2626107010193178e-05, 'samples': 23008768, 'steps': 44938, 'batch_loss/train': 0.8364323754794896} 12/28/2021 21:06:33 - INFO - codeparrot_training - Step 44939: {'lr': 1.2621161318997153e-05, 'samples': 23009280, 'steps': 44939, 'batch_loss/train': 0.6446721986867487} 12/28/2021 21:06:45 - INFO - codeparrot_training - Step 44940: {'lr': 1.2616216571529726e-05, 'samples': 23009792, 'steps': 44940, 
'batch_loss/train': 0.7026942418888211} 12/28/2021 21:06:56 - INFO - codeparrot_training - Step 44941: {'lr': 1.2611272767810571e-05, 'samples': 23010304, 'steps': 44941, 'batch_loss/train': 0.8058744189329445} 12/28/2021 21:07:06 - INFO - codeparrot_training - Step 44942: {'lr': 1.2606329907859399e-05, 'samples': 23010816, 'steps': 44942, 'batch_loss/train': 0.8213061224669218} 12/28/2021 21:07:17 - INFO - codeparrot_training - Step 44943: {'lr': 1.26013879916958e-05, 'samples': 23011328, 'steps': 44943, 'batch_loss/train': 0.7181948041543365} 12/28/2021 21:07:31 - INFO - codeparrot_training - Step 44944: {'lr': 1.2596447019339458e-05, 'samples': 23011840, 'steps': 44944, 'batch_loss/train': 0.688827168662101} 12/28/2021 21:07:42 - INFO - codeparrot_training - Step 44945: {'lr': 1.2591506990809993e-05, 'samples': 23012352, 'steps': 44945, 'batch_loss/train': 0.7297022498678416} 12/28/2021 21:07:52 - INFO - codeparrot_training - Step 44946: {'lr': 1.2586567906127055e-05, 'samples': 23012864, 'steps': 44946, 'batch_loss/train': 0.6401559796649963} 12/28/2021 21:08:04 - INFO - codeparrot_training - Step 44947: {'lr': 1.2581629765310271e-05, 'samples': 23013376, 'steps': 44947, 'batch_loss/train': 0.7318467060104012} 12/28/2021 21:08:15 - INFO - codeparrot_training - Step 44948: {'lr': 1.2576692568379288e-05, 'samples': 23013888, 'steps': 44948, 'batch_loss/train': 0.547312268987298} 12/28/2021 21:08:26 - INFO - codeparrot_training - Step 44949: {'lr': 1.2571756315353732e-05, 'samples': 23014400, 'steps': 44949, 'batch_loss/train': 0.7221773490309715} 12/28/2021 21:08:38 - INFO - codeparrot_training - Step 44950: {'lr': 1.2566821006253226e-05, 'samples': 23014912, 'steps': 44950, 'batch_loss/train': 0.8006832245155238} 12/28/2021 21:08:48 - INFO - codeparrot_training - Step 44951: {'lr': 1.2561886641097419e-05, 'samples': 23015424, 'steps': 44951, 'batch_loss/train': 0.7270614076405764} 12/28/2021 21:08:59 - INFO - codeparrot_training - Step 44952: {'lr': 
1.2556953219905826e-05, 'samples': 23015936, 'steps': 44952, 'batch_loss/train': 0.7337712626904249} 12/28/2021 21:09:10 - INFO - codeparrot_training - Step 44953: {'lr': 1.2552020742698206e-05, 'samples': 23016448, 'steps': 44953, 'batch_loss/train': 0.7366656982339919} 12/28/2021 21:09:24 - INFO - codeparrot_training - Step 44954: {'lr': 1.2547089209494045e-05, 'samples': 23016960, 'steps': 44954, 'batch_loss/train': 0.5332537284120917} 12/28/2021 21:09:35 - INFO - codeparrot_training - Step 44955: {'lr': 1.2542158620312993e-05, 'samples': 23017472, 'steps': 44955, 'batch_loss/train': 0.6958582904189825} 12/28/2021 21:09:45 - INFO - codeparrot_training - Step 44956: {'lr': 1.2537228975174674e-05, 'samples': 23017984, 'steps': 44956, 'batch_loss/train': 0.7023964574327692} 12/28/2021 21:09:57 - INFO - codeparrot_training - Step 44957: {'lr': 1.2532300274098685e-05, 'samples': 23018496, 'steps': 44957, 'batch_loss/train': 0.7043376080691814} 12/28/2021 21:10:08 - INFO - codeparrot_training - Step 44958: {'lr': 1.252737251710459e-05, 'samples': 23019008, 'steps': 44958, 'batch_loss/train': 0.7163219470530748} 12/28/2021 21:10:19 - INFO - codeparrot_training - Step 44959: {'lr': 1.2522445704211988e-05, 'samples': 23019520, 'steps': 44959, 'batch_loss/train': 0.6991438223049045} 12/28/2021 21:10:32 - INFO - codeparrot_training - Step 44960: {'lr': 1.2517519835440472e-05, 'samples': 23020032, 'steps': 44960, 'batch_loss/train': 0.7211578856222332} 12/28/2021 21:10:43 - INFO - codeparrot_training - Step 44961: {'lr': 1.2512594910809639e-05, 'samples': 23020544, 'steps': 44961, 'batch_loss/train': 0.7522688512690365} 12/28/2021 21:10:54 - INFO - codeparrot_training - Step 44962: {'lr': 1.250767093033911e-05, 'samples': 23021056, 'steps': 44962, 'batch_loss/train': 0.8949803886935115} 12/28/2021 21:11:04 - INFO - codeparrot_training - Step 44963: {'lr': 1.2502747894048316e-05, 'samples': 23021568, 'steps': 44963, 'batch_loss/train': 0.7492129025049508} 12/28/2021 21:11:16 
- INFO - codeparrot_training - Step 44964: {'lr': 1.2497825801956936e-05, 'samples': 23022080, 'steps': 44964, 'batch_loss/train': 0.7259510979056358} 12/28/2021 21:11:27 - INFO - codeparrot_training - Step 44965: {'lr': 1.2492904654084591e-05, 'samples': 23022592, 'steps': 44965, 'batch_loss/train': 0.7717103864997625} 12/28/2021 21:11:38 - INFO - codeparrot_training - Step 44966: {'lr': 1.248798445045074e-05, 'samples': 23023104, 'steps': 44966, 'batch_loss/train': 0.7360580591484904} 12/28/2021 21:11:50 - INFO - codeparrot_training - Step 44967: {'lr': 1.2483065191074922e-05, 'samples': 23023616, 'steps': 44967, 'batch_loss/train': 0.7274965075775981} 12/28/2021 21:12:01 - INFO - codeparrot_training - Step 44968: {'lr': 1.2478146875976843e-05, 'samples': 23024128, 'steps': 44968, 'batch_loss/train': 0.7262429152615368} 12/28/2021 21:12:11 - INFO - codeparrot_training - Step 44969: {'lr': 1.2473229505175932e-05, 'samples': 23024640, 'steps': 44969, 'batch_loss/train': 0.7632082761265337} 12/28/2021 21:12:23 - INFO - codeparrot_training - Step 44970: {'lr': 1.2468313078691784e-05, 'samples': 23025152, 'steps': 44970, 'batch_loss/train': 0.8083221060223877} 12/28/2021 21:12:34 - INFO - codeparrot_training - Step 44971: {'lr': 1.2463397596543912e-05, 'samples': 23025664, 'steps': 44971, 'batch_loss/train': 0.7442458234727383} 12/28/2021 21:12:45 - INFO - codeparrot_training - Step 44972: {'lr': 1.2458483058751885e-05, 'samples': 23026176, 'steps': 44972, 'batch_loss/train': 0.6714891605079174} 12/28/2021 21:12:59 - INFO - codeparrot_training - Step 44973: {'lr': 1.245356946533524e-05, 'samples': 23026688, 'steps': 44973, 'batch_loss/train': 0.7099165577674285} 12/28/2021 21:13:09 - INFO - codeparrot_training - Step 44974: {'lr': 1.244865681631352e-05, 'samples': 23027200, 'steps': 44974, 'batch_loss/train': 0.6573407677933574} 12/28/2021 21:13:20 - INFO - codeparrot_training - Step 44975: {'lr': 1.2443745111706235e-05, 'samples': 23027712, 'steps': 44975, 
'batch_loss/train': 0.7965230377740227} 12/28/2021 21:13:31 - INFO - codeparrot_training - Step 44976: {'lr': 1.2438834351532923e-05, 'samples': 23028224, 'steps': 44976, 'batch_loss/train': 0.5387624334543943} 12/28/2021 21:13:43 - INFO - codeparrot_training - Step 44977: {'lr': 1.2433924535813158e-05, 'samples': 23028736, 'steps': 44977, 'batch_loss/train': 0.8397184379864484} 12/28/2021 21:13:53 - INFO - codeparrot_training - Step 44978: {'lr': 1.2429015664566306e-05, 'samples': 23029248, 'steps': 44978, 'batch_loss/train': 0.7451759693212807} 12/28/2021 21:14:04 - INFO - codeparrot_training - Step 44979: {'lr': 1.242410773781208e-05, 'samples': 23029760, 'steps': 44979, 'batch_loss/train': 0.6927516153082252} 12/28/2021 21:14:16 - INFO - codeparrot_training - Step 44980: {'lr': 1.2419200755569848e-05, 'samples': 23030272, 'steps': 44980, 'batch_loss/train': 0.694405036047101} 12/28/2021 21:14:27 - INFO - codeparrot_training - Step 44981: {'lr': 1.2414294717859153e-05, 'samples': 23030784, 'steps': 44981, 'batch_loss/train': 0.677150844479911} 12/28/2021 21:14:37 - INFO - codeparrot_training - Step 44982: {'lr': 1.2409389624699535e-05, 'samples': 23031296, 'steps': 44982, 'batch_loss/train': 0.6811900339089334} 12/28/2021 21:14:50 - INFO - codeparrot_training - Step 44983: {'lr': 1.2404485476110477e-05, 'samples': 23031808, 'steps': 44983, 'batch_loss/train': 0.5618568354984745} 12/28/2021 21:15:01 - INFO - codeparrot_training - Step 44984: {'lr': 1.2399582272111465e-05, 'samples': 23032320, 'steps': 44984, 'batch_loss/train': 0.8034100208897144} 12/28/2021 21:15:12 - INFO - codeparrot_training - Step 44985: {'lr': 1.2394680012721981e-05, 'samples': 23032832, 'steps': 44985, 'batch_loss/train': 0.8445778116583824} 12/28/2021 21:15:24 - INFO - codeparrot_training - Step 44986: {'lr': 1.2389778697961568e-05, 'samples': 23033344, 'steps': 44986, 'batch_loss/train': 0.7470579775981605} 12/28/2021 21:15:34 - INFO - codeparrot_training - Step 44987: {'lr': 
1.2384878327849652e-05, 'samples': 23033856, 'steps': 44987, 'batch_loss/train': 0.6692639588145539} 12/28/2021 21:15:45 - INFO - codeparrot_training - Step 44988: {'lr': 1.2379978902405774e-05, 'samples': 23034368, 'steps': 44988, 'batch_loss/train': 0.7432776667992584} 12/28/2021 21:15:56 - INFO - codeparrot_training - Step 44989: {'lr': 1.237508042164931e-05, 'samples': 23034880, 'steps': 44989, 'batch_loss/train': 0.8061871370300651} 12/28/2021 21:16:10 - INFO - codeparrot_training - Step 44990: {'lr': 1.2370182885599851e-05, 'samples': 23035392, 'steps': 44990, 'batch_loss/train': 0.7083332790061831} 12/28/2021 21:16:21 - INFO - codeparrot_training - Step 44991: {'lr': 1.2365286294276829e-05, 'samples': 23035904, 'steps': 44991, 'batch_loss/train': 2.207082530017942} 12/28/2021 21:16:31 - INFO - codeparrot_training - Step 44992: {'lr': 1.2360390647699698e-05, 'samples': 23036416, 'steps': 44992, 'batch_loss/train': 1.2661650758236647} 12/28/2021 21:16:43 - INFO - codeparrot_training - Step 44993: {'lr': 1.2355495945887862e-05, 'samples': 23036928, 'steps': 44993, 'batch_loss/train': 0.7111700726673007} 12/28/2021 21:16:54 - INFO - codeparrot_training - Step 44994: {'lr': 1.2350602188860943e-05, 'samples': 23037440, 'steps': 44994, 'batch_loss/train': 0.6998613076284528} 12/28/2021 21:17:05 - INFO - codeparrot_training - Step 44995: {'lr': 1.234570937663823e-05, 'samples': 23037952, 'steps': 44995, 'batch_loss/train': 0.6791232684627175} 12/28/2021 21:17:15 - INFO - codeparrot_training - Step 44996: {'lr': 1.2340817509239266e-05, 'samples': 23038464, 'steps': 44996, 'batch_loss/train': 0.7215040284208953} 12/28/2021 21:17:27 - INFO - codeparrot_training - Step 44997: {'lr': 1.2335926586683448e-05, 'samples': 23038976, 'steps': 44997, 'batch_loss/train': 0.6975839795777574} 12/28/2021 21:17:38 - INFO - codeparrot_training - Step 44998: {'lr': 1.2331036608990293e-05, 'samples': 23039488, 'steps': 44998, 'batch_loss/train': 0.695563433226198} 12/28/2021 21:17:48 - 
INFO - codeparrot_training - Step 44999: {'lr': 1.2326147576179142e-05, 'samples': 23040000, 'steps': 44999, 'batch_loss/train': 0.7561318734660745} 12/28/2021 21:17:48 - INFO - codeparrot_training - Evaluating and saving model checkpoint 12/28/2021 21:21:11 - INFO - codeparrot_training - Step 45000: {'loss/eval': 0.7414663434028625, 'perplexity': 2.099011182785034} 12/28/2021 21:21:30 - WARNING - huggingface_hub.repository - Several commits (16) will be pushed upstream. 12/28/2021 21:21:46 - INFO - codeparrot_training - Step 45000: {'lr': 1.2321259488269537e-05, 'samples': 23040512, 'steps': 45000, 'batch_loss/train': 0.7388205407187343} 12/28/2021 21:21:57 - INFO - codeparrot_training - Step 45001: {'lr': 1.2316372345280824e-05, 'samples': 23041024, 'steps': 45001, 'batch_loss/train': 0.7499255004804581} 12/28/2021 21:22:08 - INFO - codeparrot_training - Step 45002: {'lr': 1.2311486147232487e-05, 'samples': 23041536, 'steps': 45002, 'batch_loss/train': 0.7241759057505988} 12/28/2021 21:22:20 - INFO - codeparrot_training - Step 45003: {'lr': 1.2306600894143954e-05, 'samples': 23042048, 'steps': 45003, 'batch_loss/train': 0.6472440086654387} 12/28/2021 21:22:30 - INFO - codeparrot_training - Step 45004: {'lr': 1.2301716586034573e-05, 'samples': 23042560, 'steps': 45004, 'batch_loss/train': 0.7359280455857515} 12/28/2021 21:22:41 - INFO - codeparrot_training - Step 45005: {'lr': 1.2296833222923853e-05, 'samples': 23043072, 'steps': 45005, 'batch_loss/train': 0.6243610216770321} 12/28/2021 21:22:53 - INFO - codeparrot_training - Step 45006: {'lr': 1.229195080483117e-05, 'samples': 23043584, 'steps': 45006, 'batch_loss/train': 0.6903959596529603} 12/28/2021 21:23:04 - INFO - codeparrot_training - Step 45007: {'lr': 1.2287069331775924e-05, 'samples': 23044096, 'steps': 45007, 'batch_loss/train': 0.7088620727881789} 12/28/2021 21:23:14 - INFO - codeparrot_training - Step 45008: {'lr': 1.2282188803777544e-05, 'samples': 23044608, 'steps': 45008, 'batch_loss/train': 
0.6584864500910044} 12/28/2021 21:23:25 - INFO - codeparrot_training - Step 45009: {'lr': 1.2277309220855404e-05, 'samples': 23045120, 'steps': 45009, 'batch_loss/train': 0.7067359153879806} 12/28/2021 21:23:37 - INFO - codeparrot_training - Step 45010: {'lr': 1.2272430583028932e-05, 'samples': 23045632, 'steps': 45010, 'batch_loss/train': 0.7558464175090194} 12/28/2021 21:23:48 - INFO - codeparrot_training - Step 45011: {'lr': 1.2267552890317501e-05, 'samples': 23046144, 'steps': 45011, 'batch_loss/train': 0.6812909061554819} 12/28/2021 21:23:58 - INFO - codeparrot_training - Step 45012: {'lr': 1.2262676142740542e-05, 'samples': 23046656, 'steps': 45012, 'batch_loss/train': 0.7566838106140494} 12/28/2021 21:24:12 - INFO - codeparrot_training - Step 45013: {'lr': 1.2257800340317399e-05, 'samples': 23047168, 'steps': 45013, 'batch_loss/train': 0.6624577877810225} 12/28/2021 21:24:22 - INFO - codeparrot_training - Step 45014: {'lr': 1.22529254830675e-05, 'samples': 23047680, 'steps': 45014, 'batch_loss/train': 0.7585712759755552} 12/28/2021 21:24:33 - INFO - codeparrot_training - Step 45015: {'lr': 1.2248051571010138e-05, 'samples': 23048192, 'steps': 45015, 'batch_loss/train': 0.6877373078023084} 12/28/2021 21:24:45 - INFO - codeparrot_training - Step 45016: {'lr': 1.2243178604164796e-05, 'samples': 23048704, 'steps': 45016, 'batch_loss/train': 0.8030269555747509} 12/28/2021 21:24:56 - INFO - codeparrot_training - Step 45017: {'lr': 1.2238306582550818e-05, 'samples': 23049216, 'steps': 45017, 'batch_loss/train': 0.6700289399595931} 12/28/2021 21:25:06 - INFO - codeparrot_training - Step 45018: {'lr': 1.2233435506187551e-05, 'samples': 23049728, 'steps': 45018, 'batch_loss/train': 0.7592328507453203} 12/28/2021 21:25:17 - INFO - codeparrot_training - Step 45019: {'lr': 1.2228565375094313e-05, 'samples': 23050240, 'steps': 45019, 'batch_loss/train': 0.7161308089271188} 12/28/2021 21:25:31 - INFO - codeparrot_training - Step 45020: {'lr': 1.222369618929059e-05, 
'samples': 23050752, 'steps': 45020, 'batch_loss/train': 0.7606823602691293} 12/28/2021 21:25:42 - INFO - codeparrot_training - Step 45021: {'lr': 1.2218827948795641e-05, 'samples': 23051264, 'steps': 45021, 'batch_loss/train': 0.47170426324009895} 12/28/2021 21:25:53 - INFO - codeparrot_training - Step 45022: {'lr': 1.2213960653628814e-05, 'samples': 23051776, 'steps': 45022, 'batch_loss/train': 0.6969932010397315} 12/28/2021 21:26:05 - INFO - codeparrot_training - Step 45023: {'lr': 1.2209094303809564e-05, 'samples': 23052288, 'steps': 45023, 'batch_loss/train': 0.6803211309015751} 12/28/2021 21:26:15 - INFO - codeparrot_training - Step 45024: {'lr': 1.2204228899357128e-05, 'samples': 23052800, 'steps': 45024, 'batch_loss/train': 0.6747496607713401} 12/28/2021 21:26:26 - INFO - codeparrot_training - Step 45025: {'lr': 1.2199364440290877e-05, 'samples': 23053312, 'steps': 45025, 'batch_loss/train': 0.7391664991155267} 12/28/2021 21:26:38 - INFO - codeparrot_training - Step 45026: {'lr': 1.2194500926630187e-05, 'samples': 23053824, 'steps': 45026, 'batch_loss/train': 0.6956963350530714} 12/28/2021 21:26:49 - INFO - codeparrot_training - Step 45027: {'lr': 1.2189638358394373e-05, 'samples': 23054336, 'steps': 45027, 'batch_loss/train': 0.8028994733467698} 12/28/2021 21:26:59 - INFO - codeparrot_training - Step 45028: {'lr': 1.2184776735602754e-05, 'samples': 23054848, 'steps': 45028, 'batch_loss/train': 0.6697774371132255} 12/28/2021 21:27:10 - INFO - codeparrot_training - Step 45029: {'lr': 1.2179916058274676e-05, 'samples': 23055360, 'steps': 45029, 'batch_loss/train': 0.7647643140517175} 12/28/2021 21:27:24 - INFO - codeparrot_training - Step 45030: {'lr': 1.217505632642943e-05, 'samples': 23055872, 'steps': 45030, 'batch_loss/train': 0.6615564168896526} 12/28/2021 21:27:35 - INFO - codeparrot_training - Step 45031: {'lr': 1.2170197540086358e-05, 'samples': 23056384, 'steps': 45031, 'batch_loss/train': 0.7597562554292381} 12/28/2021 21:27:45 - INFO - 
codeparrot_training - Step 45032: {'lr': 1.2165339699264838e-05, 'samples': 23056896, 'steps': 45032, 'batch_loss/train': 0.6721908585168421} 12/28/2021 21:27:57 - INFO - codeparrot_training - Step 45033: {'lr': 1.2160482803984074e-05, 'samples': 23057408, 'steps': 45033, 'batch_loss/train': 0.7449830938130617} 12/28/2021 21:28:08 - INFO - codeparrot_training - Step 45034: {'lr': 1.2155626854263441e-05, 'samples': 23057920, 'steps': 45034, 'batch_loss/train': 0.7669051168486476} 12/28/2021 21:28:19 - INFO - codeparrot_training - Step 45035: {'lr': 1.2150771850122227e-05, 'samples': 23058432, 'steps': 45035, 'batch_loss/train': 0.6152602593356278} 12/28/2021 21:28:33 - INFO - codeparrot_training - Step 45036: {'lr': 1.2145917791579724e-05, 'samples': 23058944, 'steps': 45036, 'batch_loss/train': 0.748514523729682} 12/28/2021 21:28:43 - INFO - codeparrot_training - Step 45037: {'lr': 1.214106467865525e-05, 'samples': 23059456, 'steps': 45037, 'batch_loss/train': 0.6869725696742535} 12/28/2021 21:28:54 - INFO - codeparrot_training - Step 45038: {'lr': 1.2136212511368094e-05, 'samples': 23059968, 'steps': 45038, 'batch_loss/train': 0.668008507695049} 12/28/2021 21:29:06 - INFO - codeparrot_training - Step 45039: {'lr': 1.2131361289737547e-05, 'samples': 23060480, 'steps': 45039, 'batch_loss/train': 0.7238097100635059} 12/28/2021 21:29:17 - INFO - codeparrot_training - Step 45040: {'lr': 1.2126511013782871e-05, 'samples': 23060992, 'steps': 45040, 'batch_loss/train': 0.6564555326476693} 12/28/2021 21:29:27 - INFO - codeparrot_training - Step 45041: {'lr': 1.2121661683523382e-05, 'samples': 23061504, 'steps': 45041, 'batch_loss/train': 0.7985632829368114} 12/28/2021 21:29:38 - INFO - codeparrot_training - Step 45042: {'lr': 1.2116813298978347e-05, 'samples': 23062016, 'steps': 45042, 'batch_loss/train': 0.7289464806672186} 12/28/2021 21:29:50 - INFO - codeparrot_training - Step 45043: {'lr': 1.2111965860167079e-05, 'samples': 23062528, 'steps': 45043, 'batch_loss/train': 
0.6300630713813007} 12/28/2021 21:30:01 - INFO - codeparrot_training - Step 45044: {'lr': 1.210711936710876e-05, 'samples': 23063040, 'steps': 45044, 'batch_loss/train': 0.7316146837547421} 12/28/2021 21:30:11 - INFO - codeparrot_training - Step 45045: {'lr': 1.2102273819822707e-05, 'samples': 23063552, 'steps': 45045, 'batch_loss/train': 0.7828474510461092} 12/28/2021 21:30:24 - INFO - codeparrot_training - Step 45046: {'lr': 1.2097429218328237e-05, 'samples': 23064064, 'steps': 45046, 'batch_loss/train': 0.6406933176331222} 12/28/2021 21:30:34 - INFO - codeparrot_training - Step 45047: {'lr': 1.2092585562644504e-05, 'samples': 23064576, 'steps': 45047, 'batch_loss/train': 0.8760713180527091} 12/28/2021 21:30:45 - INFO - codeparrot_training - Step 45048: {'lr': 1.2087742852790822e-05, 'samples': 23065088, 'steps': 45048, 'batch_loss/train': 0.6311043673194945} 12/28/2021 21:30:57 - INFO - codeparrot_training - Step 45049: {'lr': 1.2082901088786485e-05, 'samples': 23065600, 'steps': 45049, 'batch_loss/train': 0.675655156839639} 12/28/2021 21:31:08 - INFO - codeparrot_training - Step 45050: {'lr': 1.2078060270650699e-05, 'samples': 23066112, 'steps': 45050, 'batch_loss/train': 0.7452592477202415} 12/28/2021 21:31:18 - INFO - codeparrot_training - Step 45051: {'lr': 1.2073220398402668e-05, 'samples': 23066624, 'steps': 45051, 'batch_loss/train': 0.5943583841435611} 12/28/2021 21:31:29 - INFO - codeparrot_training - Step 45052: {'lr': 1.2068381472061713e-05, 'samples': 23067136, 'steps': 45052, 'batch_loss/train': 0.6296164027298801} 12/28/2021 21:31:43 - INFO - codeparrot_training - Step 45053: {'lr': 1.2063543491647011e-05, 'samples': 23067648, 'steps': 45053, 'batch_loss/train': 0.7326388363726437} 12/28/2021 21:31:54 - INFO - codeparrot_training - Step 45054: {'lr': 1.2058706457177826e-05, 'samples': 23068160, 'steps': 45054, 'batch_loss/train': 0.7313567204400897} 12/28/2021 21:32:04 - INFO - codeparrot_training - Step 45055: {'lr': 1.2053870368673392e-05, 
'samples': 23068672, 'steps': 45055, 'batch_loss/train': 0.6825571060180664} 12/28/2021 21:32:16 - INFO - codeparrot_training - Step 45056: {'lr': 1.2049035226152888e-05, 'samples': 23069184, 'steps': 45056, 'batch_loss/train': 0.8042012131772935} 12/28/2021 21:32:27 - INFO - codeparrot_training - Step 45057: {'lr': 1.2044201029635604e-05, 'samples': 23069696, 'steps': 45057, 'batch_loss/train': 0.737498982809484} 12/28/2021 21:32:38 - INFO - codeparrot_training - Step 45058: {'lr': 1.2039367779140748e-05, 'samples': 23070208, 'steps': 45058, 'batch_loss/train': 0.7586365723982453} 12/28/2021 21:32:51 - INFO - codeparrot_training - Step 45059: {'lr': 1.2034535474687442e-05, 'samples': 23070720, 'steps': 45059, 'batch_loss/train': 0.8371688146144152} 12/28/2021 21:33:01 - INFO - codeparrot_training - Step 45060: {'lr': 1.202970411629506e-05, 'samples': 23071232, 'steps': 45060, 'batch_loss/train': 1.5028177560307086} 12/28/2021 21:33:12 - INFO - codeparrot_training - Step 45061: {'lr': 1.2024873703982669e-05, 'samples': 23071744, 'steps': 45061, 'batch_loss/train': 0.7540037084836513} 12/28/2021 21:33:22 - INFO - codeparrot_training - Step 45062: {'lr': 1.2020044237769533e-05, 'samples': 23072256, 'steps': 45062, 'batch_loss/train': 0.774959710193798} 12/28/2021 21:33:35 - INFO - codeparrot_training - Step 45063: {'lr': 1.2015215717674832e-05, 'samples': 23072768, 'steps': 45063, 'batch_loss/train': 0.7192422905936837} 12/28/2021 21:33:45 - INFO - codeparrot_training - Step 45064: {'lr': 1.2010388143717798e-05, 'samples': 23073280, 'steps': 45064, 'batch_loss/train': 0.647829485940747} 12/28/2021 21:33:56 - INFO - codeparrot_training - Step 45065: {'lr': 1.2005561515917584e-05, 'samples': 23073792, 'steps': 45065, 'batch_loss/train': 0.6883272877894342} 12/28/2021 21:34:08 - INFO - codeparrot_training - Step 45066: {'lr': 1.2000735834293396e-05, 'samples': 23074304, 'steps': 45066, 'batch_loss/train': 0.7192554154898971} 12/28/2021 21:34:19 - INFO - 
codeparrot_training - Step 45067: {'lr': 1.1995911098864415e-05, 'samples': 23074816, 'steps': 45067, 'batch_loss/train': 0.6047175964340568}
12/28/2021 21:34:29 - INFO - codeparrot_training - Step 45068: {'lr': 1.1991087309649818e-05, 'samples': 23075328, 'steps': 45068, 'batch_loss/train': 0.7046756665222347}
12/28/2021 21:34:44 - INFO - codeparrot_training - Step 45069: {'lr': 1.1986264466668839e-05, 'samples': 23075840, 'steps': 45069, 'batch_loss/train': 0.740153732476756}
12/28/2021 21:34:55 - INFO - codeparrot_training - Step 45070: {'lr': 1.1981442569940549e-05, 'samples': 23076352, 'steps': 45070, 'batch_loss/train': 0.7268863599747419}
12/28/2021 21:35:06 - INFO - codeparrot_training - Step 45071: {'lr': 1.1976621619484152e-05, 'samples': 23076864, 'steps': 45071, 'batch_loss/train': 0.6751030939631164}
12/28/2021 21:35:18 - INFO - codeparrot_training - Step 45072: {'lr': 1.1971801615318884e-05, 'samples': 23077376, 'steps': 45072, 'batch_loss/train': 0.7915262130554765}
12/28/2021 21:35:29 - INFO - codeparrot_training - Step 45073: {'lr': 1.196698255746384e-05, 'samples': 23077888, 'steps': 45073, 'batch_loss/train': 0.7022106060176156}
12/28/2021 21:35:39 - INFO - codeparrot_training - Step 45074: {'lr': 1.1962164445938173e-05, 'samples': 23078400, 'steps': 45074, 'batch_loss/train': 0.7481046519242227}
12/28/2021 21:35:50 - INFO - codeparrot_training - Step 45075: {'lr': 1.1957347280761088e-05, 'samples': 23078912, 'steps': 45075, 'batch_loss/train': 0.6991078602150083}
12/28/2021 21:36:04 - INFO - codeparrot_training - Step 45076: {'lr': 1.195253106195171e-05, 'samples': 23079424, 'steps': 45076, 'batch_loss/train': 0.7975812293589115}
12/28/2021 21:36:14 - INFO - codeparrot_training - Step 45077: {'lr': 1.194771578952919e-05, 'samples': 23079936, 'steps': 45077, 'batch_loss/train': 0.7183679258450866}
12/28/2021 21:36:25 - INFO - codeparrot_training - Step 45078: {'lr': 1.1942901463512651e-05, 'samples': 23080448, 'steps': 45078, 'batch_loss/train': 0.6687115444801748}
12/28/2021 21:36:37 - INFO - codeparrot_training - Step 45079: {'lr': 1.1938088083921245e-05, 'samples': 23080960, 'steps': 45079, 'batch_loss/train': 0.7518635876476765}
12/28/2021 21:36:48 - INFO - codeparrot_training - Step 45080: {'lr': 1.1933275650774122e-05, 'samples': 23081472, 'steps': 45080, 'batch_loss/train': 0.7558619352057576}
12/28/2021 21:36:58 - INFO - codeparrot_training - Step 45081: {'lr': 1.1928464164090408e-05, 'samples': 23081984, 'steps': 45081, 'batch_loss/train': 0.750448209233582}
12/28/2021 21:37:10 - INFO - codeparrot_training - Step 45082: {'lr': 1.1923653623889197e-05, 'samples': 23082496, 'steps': 45082, 'batch_loss/train': 0.7628919137641788}
12/28/2021 21:37:21 - INFO - codeparrot_training - Step 45083: {'lr': 1.1918844030189668e-05, 'samples': 23083008, 'steps': 45083, 'batch_loss/train': 0.7362244976684451}
12/28/2021 21:37:32 - INFO - codeparrot_training - Step 45084: {'lr': 1.1914035383010945e-05, 'samples': 23083520, 'steps': 45084, 'batch_loss/train': 0.6899639358744025}
12/28/2021 21:37:44 - INFO - codeparrot_training - Step 45085: {'lr': 1.190922768237207e-05, 'samples': 23084032, 'steps': 45085, 'batch_loss/train': 0.7373720174655318}
12/28/2021 21:37:55 - INFO - codeparrot_training - Step 45086: {'lr': 1.190442092829222e-05, 'samples': 23084544, 'steps': 45086, 'batch_loss/train': 0.8320061499252915}
12/28/2021 21:38:05 - INFO - codeparrot_training - Step 45087: {'lr': 1.1899615120790547e-05, 'samples': 23085056, 'steps': 45087, 'batch_loss/train': 0.781310370832216}
12/28/2021 21:38:16 - INFO - codeparrot_training - Step 45088: {'lr': 1.1894810259886063e-05, 'samples': 23085568, 'steps': 45088, 'batch_loss/train': 0.6484068525023758}
12/28/2021 21:38:30 - INFO - codeparrot_training - Step 45089: {'lr': 1.1890006345597892e-05, 'samples': 23086080, 'steps': 45089, 'batch_loss/train': 0.694540711119771}
12/28/2021 21:38:41 - INFO - codeparrot_training - Step 45090: {'lr': 1.1885203377945158e-05, 'samples': 23086592, 'steps': 45090, 'batch_loss/train': 0.4748491674545221}
12/28/2021 21:38:52 - INFO - codeparrot_training - Step 45091: {'lr': 1.188040135694693e-05, 'samples': 23087104, 'steps': 45091, 'batch_loss/train': 0.6762011111713946}
12/28/2021 21:39:04 - INFO - codeparrot_training - Step 45092: {'lr': 1.1875600282622328e-05, 'samples': 23087616, 'steps': 45092, 'batch_loss/train': 0.8015128504484892}
12/28/2021 21:39:14 - INFO - codeparrot_training - Step 45093: {'lr': 1.1870800154990453e-05, 'samples': 23088128, 'steps': 45093, 'batch_loss/train': 0.7680754195898771}
12/28/2021 21:39:25 - INFO - codeparrot_training - Step 45094: {'lr': 1.186600097407034e-05, 'samples': 23088640, 'steps': 45094, 'batch_loss/train': 0.6349109052098356}
12/28/2021 21:39:37 - INFO - codeparrot_training - Step 45095: {'lr': 1.1861202739881116e-05, 'samples': 23089152, 'steps': 45095, 'batch_loss/train': 0.720810909755528}
12/28/2021 21:39:48 - INFO - codeparrot_training - Step 45096: {'lr': 1.1856405452441765e-05, 'samples': 23089664, 'steps': 45096, 'batch_loss/train': 0.8004782989737578}
12/28/2021 21:39:58 - INFO - codeparrot_training - Step 45097: {'lr': 1.1851609111771466e-05, 'samples': 23090176, 'steps': 45097, 'batch_loss/train': 0.6310273174894974}
12/28/2021 21:40:09 - INFO - codeparrot_training - Step 45098: {'lr': 1.1846813717889288e-05, 'samples': 23090688, 'steps': 45098, 'batch_loss/train': 0.7178476499393582}
12/28/2021 21:40:23 - INFO - codeparrot_training - Step 45099: {'lr': 1.1842019270814214e-05, 'samples': 23091200, 'steps': 45099, 'batch_loss/train': 0.6204731618054211}
12/28/2021 21:40:33 - INFO - codeparrot_training - Step 45100: {'lr': 1.1837225770565312e-05, 'samples': 23091712, 'steps': 45100, 'batch_loss/train': 0.6634516839403659}
12/28/2021 21:40:44 - INFO - codeparrot_training - Step 45101: {'lr': 1.1832433217161764e-05, 'samples': 23092224, 'steps': 45101, 'batch_loss/train': 0.6729153123451397}
12/28/2021 21:40:56 - INFO - codeparrot_training - Step 45102: {'lr': 1.182764161062247e-05, 'samples': 23092736, 'steps': 45102, 'batch_loss/train': 0.6145767402485944}
12/28/2021 21:41:07 - INFO - codeparrot_training - Step 45103: {'lr': 1.1822850950966551e-05, 'samples': 23093248, 'steps': 45103, 'batch_loss/train': 0.7177111972123384}
12/28/2021 21:41:17 - INFO - codeparrot_training - Step 45104: {'lr': 1.1818061238213052e-05, 'samples': 23093760, 'steps': 45104, 'batch_loss/train': 0.7314780526794493}
12/28/2021 21:41:31 - INFO - codeparrot_training - Step 45105: {'lr': 1.1813272472381009e-05, 'samples': 23094272, 'steps': 45105, 'batch_loss/train': 0.7638329109176993}
12/28/2021 21:41:42 - INFO - codeparrot_training - Step 45106: {'lr': 1.1808484653489437e-05, 'samples': 23094784, 'steps': 45106, 'batch_loss/train': 0.6526569481939077}
12/28/2021 21:41:53 - INFO - codeparrot_training - Step 45107: {'lr': 1.180369778155746e-05, 'samples': 23095296, 'steps': 45107, 'batch_loss/train': 0.7360106627456844}
12/28/2021 21:42:05 - INFO - codeparrot_training - Step 45108: {'lr': 1.179891185660395e-05, 'samples': 23095808, 'steps': 45108, 'batch_loss/train': 0.6459036413580179}
12/28/2021 21:42:15 - INFO - codeparrot_training - Step 45109: {'lr': 1.179412687864806e-05, 'samples': 23096320, 'steps': 45109, 'batch_loss/train': 0.7211948551703244}
12/28/2021 21:42:26 - INFO - codeparrot_training - Step 45110: {'lr': 1.1789342847708828e-05, 'samples': 23096832, 'steps': 45110, 'batch_loss/train': 0.6632023984566331}
12/28/2021 21:42:37 - INFO - codeparrot_training - Step 45111: {'lr': 1.178455976380513e-05, 'samples': 23097344, 'steps': 45111, 'batch_loss/train': 0.6374338078312576}
12/28/2021 21:42:49 - INFO - codeparrot_training - Step 45112: {'lr': 1.1779777626956117e-05, 'samples': 23097856, 'steps': 45112, 'batch_loss/train': 0.7772199315950274}
12/28/2021 21:43:00 - INFO - codeparrot_training - Step 45113: {'lr': 1.1774996437180801e-05, 'samples': 23098368, 'steps': 45113, 'batch_loss/train': 0.8973648566752672}
12/28/2021 21:43:10 - INFO - codeparrot_training - Step 45114: {'lr': 1.1770216194498113e-05, 'samples': 23098880, 'steps': 45114, 'batch_loss/train': 0.7833612665999681}
12/28/2021 21:43:24 - INFO - codeparrot_training - Step 45115: {'lr': 1.1765436898927118e-05, 'samples': 23099392, 'steps': 45115, 'batch_loss/train': 0.6017526917858049}
12/28/2021 21:43:35 - INFO - codeparrot_training - Step 45116: {'lr': 1.1760658550486747e-05, 'samples': 23099904, 'steps': 45116, 'batch_loss/train': 0.7072297921404243}
12/28/2021 21:43:46 - INFO - codeparrot_training - Step 45117: {'lr': 1.1755881149196068e-05, 'samples': 23100416, 'steps': 45117, 'batch_loss/train': 0.6999191669747233}
12/28/2021 21:43:58 - INFO - codeparrot_training - Step 45118: {'lr': 1.1751104695074038e-05, 'samples': 23100928, 'steps': 45118, 'batch_loss/train': 0.7282848000177182}
12/28/2021 21:44:09 - INFO - codeparrot_training - Step 45119: {'lr': 1.174632918813967e-05, 'samples': 23101440, 'steps': 45119, 'batch_loss/train': 0.5183316124603152}
12/28/2021 21:44:19 - INFO - codeparrot_training - Step 45120: {'lr': 1.1741554628411921e-05, 'samples': 23101952, 'steps': 45120, 'batch_loss/train': 0.7663863329216838}
12/28/2021 21:44:30 - INFO - codeparrot_training - Step 45121: {'lr': 1.1736781015909803e-05, 'samples': 23102464, 'steps': 45121, 'batch_loss/train': 0.6768820327706635}
12/28/2021 21:44:42 - INFO - codeparrot_training - Step 45122: {'lr': 1.1732008350652273e-05, 'samples': 23102976, 'steps': 45122, 'batch_loss/train': 0.6983950262656435}
12/28/2021 21:44:53 - INFO - codeparrot_training - Step 45123: {'lr': 1.1727236632658317e-05, 'samples': 23103488, 'steps': 45123, 'batch_loss/train': 0.8431602651253343}
12/28/2021 21:45:03 - INFO - codeparrot_training - Step 45124: {'lr': 1.1722465861946918e-05, 'samples': 23104000, 'steps': 45124, 'batch_loss/train': 1.1258037968655117}
12/28/2021 21:45:17 - INFO - codeparrot_training - Step 45125: {'lr': 1.1717696038537006e-05, 'samples': 23104512, 'steps': 45125, 'batch_loss/train': 0.7656624419614673}
12/28/2021 21:45:28 - INFO - codeparrot_training - Step 45126: {'lr': 1.1712927162447512e-05, 'samples': 23105024, 'steps': 45126, 'batch_loss/train': 0.8196420883759856}
12/28/2021 21:45:38 - INFO - codeparrot_training - Step 45127: {'lr': 1.1708159233697529e-05, 'samples': 23105536, 'steps': 45127, 'batch_loss/train': 0.6996498496737331}
12/28/2021 21:45:51 - INFO - codeparrot_training - Step 45128: {'lr': 1.1703392252305906e-05, 'samples': 23106048, 'steps': 45128, 'batch_loss/train': 0.6960059977136552}
12/28/2021 21:46:01 - INFO - codeparrot_training - Step 45129: {'lr': 1.1698626218291596e-05, 'samples': 23106560, 'steps': 45129, 'batch_loss/train': 0.7600269811227918}
12/28/2021 21:46:12 - INFO - codeparrot_training - Step 45130: {'lr': 1.169386113167356e-05, 'samples': 23107072, 'steps': 45130, 'batch_loss/train': 0.7185526052489877}
12/28/2021 21:46:22 - INFO - codeparrot_training - Step 45131: {'lr': 1.1689096992470783e-05, 'samples': 23107584, 'steps': 45131, 'batch_loss/train': 0.6830867105163634}
12/28/2021 21:46:35 - INFO - codeparrot_training - Step 45132: {'lr': 1.1684333800702135e-05, 'samples': 23108096, 'steps': 45132, 'batch_loss/train': 0.6919168028980494}
12/28/2021 21:46:45 - INFO - codeparrot_training - Step 45133: {'lr': 1.167957155638663e-05, 'samples': 23108608, 'steps': 45133, 'batch_loss/train': 0.6780005036853254}
12/28/2021 21:46:56 - INFO - codeparrot_training - Step 45134: {'lr': 1.1674810259543144e-05, 'samples': 23109120, 'steps': 45134, 'batch_loss/train': 0.5672831028932706}
12/28/2021 21:47:10 - INFO - codeparrot_training - Step 45135: {'lr': 1.1670049910190633e-05, 'samples': 23109632, 'steps': 45135, 'batch_loss/train': 0.5928733334876597}
12/28/2021 21:47:21 - INFO - codeparrot_training - Step 45136: {'lr': 1.1665290508348025e-05, 'samples': 23110144, 'steps': 45136, 'batch_loss/train': 0.7247302439063787}
12/28/2021 21:47:31 - INFO - codeparrot_training - Step 45137: {'lr': 1.1660532054034196e-05, 'samples': 23110656, 'steps': 45137, 'batch_loss/train': 0.6199493641033769}
12/28/2021 21:47:43 - INFO - codeparrot_training - Step 45138: {'lr': 1.1655774547268099e-05, 'samples': 23111168, 'steps': 45138, 'batch_loss/train': 0.6074463874101639}
12/28/2021 21:47:54 - INFO - codeparrot_training - Step 45139: {'lr': 1.1651017988068696e-05, 'samples': 23111680, 'steps': 45139, 'batch_loss/train': 0.7497723158448935}
12/28/2021 21:48:05 - INFO - codeparrot_training - Step 45140: {'lr': 1.164626237645483e-05, 'samples': 23112192, 'steps': 45140, 'batch_loss/train': 0.7725201440043747}
12/28/2021 21:48:15 - INFO - codeparrot_training - Step 45141: {'lr': 1.1641507712445403e-05, 'samples': 23112704, 'steps': 45141, 'batch_loss/train': 0.5392035723198205}
12/28/2021 21:48:27 - INFO - codeparrot_training - Step 45142: {'lr': 1.1636753996059346e-05, 'samples': 23113216, 'steps': 45142, 'batch_loss/train': 0.6276215150719509}
12/28/2021 21:48:38 - INFO - codeparrot_training - Step 45143: {'lr': 1.1632001227315558e-05, 'samples': 23113728, 'steps': 45143, 'batch_loss/train': 0.677175990305841}
12/28/2021 21:48:49 - INFO - codeparrot_training - Step 45144: {'lr': 1.1627249406232916e-05, 'samples': 23114240, 'steps': 45144, 'batch_loss/train': 0.779763444326818}
12/28/2021 21:49:03 - INFO - codeparrot_training - Step 45145: {'lr': 1.1622498532830345e-05, 'samples': 23114752, 'steps': 45145, 'batch_loss/train': 0.6233625188469887}
12/28/2021 21:49:13 - INFO - codeparrot_training - Step 45146: {'lr': 1.1617748607126693e-05, 'samples': 23115264, 'steps': 45146, 'batch_loss/train': 0.6595634799450636}
12/28/2021 21:49:24 - INFO - codeparrot_training - Step 45147: {'lr': 1.1612999629140863e-05, 'samples': 23115776, 'steps': 45147, 'batch_loss/train': 0.7605407368391752}
12/28/2021 21:49:36 - INFO - codeparrot_training - Step 45148: {'lr': 1.1608251598891757e-05, 'samples': 23116288, 'steps': 45148, 'batch_loss/train': 0.723977105692029}
12/28/2021 21:49:47 - INFO - codeparrot_training - Step 45149: {'lr': 1.1603504516398216e-05, 'samples': 23116800, 'steps': 45149, 'batch_loss/train': 0.9458010466769338}
12/28/2021 21:49:58 - INFO - codeparrot_training - Step 45150: {'lr': 1.1598758381679148e-05, 'samples': 23117312, 'steps': 45150, 'batch_loss/train': 0.6989142610691488}
12/28/2021 21:50:08 - INFO - codeparrot_training - Step 45151: {'lr': 1.1594013194753366e-05, 'samples': 23117824, 'steps': 45151, 'batch_loss/train': 0.776330485008657}
12/28/2021 21:50:20 - INFO - codeparrot_training - Step 45152: {'lr': 1.1589268955639747e-05, 'samples': 23118336, 'steps': 45152, 'batch_loss/train': 0.741174004971981}
12/28/2021 21:50:31 - INFO - codeparrot_training - Step 45153: {'lr': 1.1584525664357249e-05, 'samples': 23118848, 'steps': 45153, 'batch_loss/train': 0.8118411167524755}
12/28/2021 21:50:42 - INFO - codeparrot_training - Step 45154: {'lr': 1.157978332092463e-05, 'samples': 23119360, 'steps': 45154, 'batch_loss/train': 0.710104389116168}
12/28/2021 21:50:54 - INFO - codeparrot_training - Step 45155: {'lr': 1.157504192536074e-05, 'samples': 23119872, 'steps': 45155, 'batch_loss/train': 0.7094334170687944}
12/28/2021 21:51:05 - INFO - codeparrot_training - Step 45156: {'lr': 1.157030147768448e-05, 'samples': 23120384, 'steps': 45156, 'batch_loss/train': 0.7496247170493007}
12/28/2021 21:51:15 - INFO - codeparrot_training - Step 45157: {'lr': 1.156556197791464e-05, 'samples': 23120896, 'steps': 45157, 'batch_loss/train': 0.7458471427671611}
12/28/2021 21:51:29 - INFO - codeparrot_training - Step 45158: {'lr': 1.1560823426070121e-05, 'samples': 23121408, 'steps': 45158, 'batch_loss/train': 0.6419788913335651}
12/28/2021 21:51:40 - INFO - codeparrot_training - Step 45159: {'lr': 1.1556085822169744e-05, 'samples': 23121920, 'steps': 45159, 'batch_loss/train': 0.7882645162753761}
12/28/2021 21:51:51 - INFO - codeparrot_training - Step 45160: {'lr': 1.1551349166232323e-05, 'samples': 23122432, 'steps': 45160, 'batch_loss/train': 0.7543531587580219}
12/28/2021 21:52:01 - INFO - codeparrot_training - Step 45161: {'lr': 1.1546613458276706e-05, 'samples': 23122944, 'steps': 45161, 'batch_loss/train': 0.6978685674257576}
12/28/2021 21:52:14 - INFO - codeparrot_training - Step 45162: {'lr': 1.1541878698321767e-05, 'samples': 23123456, 'steps': 45162, 'batch_loss/train': 0.6429912850726396}
12/28/2021 21:52:24 - INFO - codeparrot_training - Step 45163: {'lr': 1.1537144886386186e-05, 'samples': 23123968, 'steps': 45163, 'batch_loss/train': 0.8312504030764103}
12/28/2021 21:52:35 - INFO - codeparrot_training - Step 45164: {'lr': 1.1532412022488919e-05, 'samples': 23124480, 'steps': 45164, 'batch_loss/train': 0.7599345003254712}
12/28/2021 21:52:49 - INFO - codeparrot_training - Step 45165: {'lr': 1.1527680106648785e-05, 'samples': 23124992, 'steps': 45165, 'batch_loss/train': 0.5350348016363569}
12/28/2021 21:52:59 - INFO - codeparrot_training - Step 45166: {'lr': 1.152294913888452e-05, 'samples': 23125504, 'steps': 45166, 'batch_loss/train': 0.6943556689657271}
12/28/2021 21:53:10 - INFO - codeparrot_training - Step 45167: {'lr': 1.1518219119214911e-05, 'samples': 23126016, 'steps': 45167, 'batch_loss/train': 0.7792547666467726}
12/28/2021 21:53:22 - INFO - codeparrot_training - Step 45168: {'lr': 1.1513490047658892e-05, 'samples': 23126528, 'steps': 45168, 'batch_loss/train': 0.6440566085802857}
12/28/2021 21:53:33 - INFO - codeparrot_training - Step 45169: {'lr': 1.1508761924235168e-05, 'samples': 23127040, 'steps': 45169, 'batch_loss/train': 0.6070592103642412}
12/28/2021 21:53:44 - INFO - codeparrot_training - Step 45170: {'lr': 1.1504034748962527e-05, 'samples': 23127552, 'steps': 45170, 'batch_loss/train': 0.7517749206162989}
12/28/2021 21:53:54 - INFO - codeparrot_training - Step 45171: {'lr': 1.149930852185982e-05, 'samples': 23128064, 'steps': 45171, 'batch_loss/train': 0.9618778033182025}
12/28/2021 21:54:06 - INFO - codeparrot_training - Step 45172: {'lr': 1.1494583242945777e-05, 'samples': 23128576, 'steps': 45172, 'batch_loss/train': 0.7103884099051356}
12/28/2021 21:54:17 - INFO - codeparrot_training - Step 45173: {'lr': 1.1489858912239248e-05, 'samples': 23129088, 'steps': 45173, 'batch_loss/train': 0.6807679194025695}
12/28/2021 21:54:28 - INFO - codeparrot_training - Step 45174: {'lr': 1.1485135529758966e-05, 'samples': 23129600, 'steps': 45174, 'batch_loss/train': 0.6988634196459316}
12/28/2021 21:54:42 - INFO - codeparrot_training - Step 45175: {'lr': 1.1480413095523722e-05, 'samples': 23130112, 'steps': 45175, 'batch_loss/train': 0.7225189578020945}
12/28/2021 21:54:53 - INFO - codeparrot_training - Step 45176: {'lr': 1.147569160955228e-05, 'samples': 23130624, 'steps': 45176, 'batch_loss/train': 0.602459106448805}
12/28/2021 21:55:03 - INFO - codeparrot_training - Step 45177: {'lr': 1.1470971071863485e-05, 'samples': 23131136, 'steps': 45177, 'batch_loss/train': 0.8836542330682278}
12/28/2021 21:55:15 - INFO - codeparrot_training - Step 45178: {'lr': 1.1466251482475959e-05, 'samples': 23131648, 'steps': 45178, 'batch_loss/train': 0.7286112722940743}
12/28/2021 21:55:26 - INFO - codeparrot_training - Step 45179: {'lr': 1.1461532841408634e-05, 'samples': 23132160, 'steps': 45179, 'batch_loss/train': 0.6215958904940635}
12/28/2021 21:55:36 - INFO - codeparrot_training - Step 45180: {'lr': 1.1456815148680133e-05, 'samples': 23132672, 'steps': 45180, 'batch_loss/train': 0.6652057596365921}
12/28/2021 21:55:47 - INFO - codeparrot_training - Step 45181: {'lr': 1.1452098404309303e-05, 'samples': 23133184, 'steps': 45181, 'batch_loss/train': 0.6901094354689121}
12/28/2021 21:55:59 - INFO - codeparrot_training - Step 45182: {'lr': 1.1447382608314821e-05, 'samples': 23133696, 'steps': 45182, 'batch_loss/train': 0.7181781842373312}
12/28/2021 21:56:10 - INFO - codeparrot_training - Step 45183: {'lr': 1.144266776071548e-05, 'samples': 23134208, 'steps': 45183, 'batch_loss/train': 0.6281738271936774}
12/28/2021 21:56:21 - INFO - codeparrot_training - Step 45184: {'lr': 1.1437953861530015e-05, 'samples': 23134720, 'steps': 45184, 'batch_loss/train': 0.7714589834213257}
12/28/2021 21:56:33 - INFO - codeparrot_training - Step 45185: {'lr': 1.1433240910777159e-05, 'samples': 23135232, 'steps': 45185, 'batch_loss/train': 0.7533157514408231}
12/28/2021 21:56:44 - INFO - codeparrot_training - Step 45186: {'lr': 1.1428528908475678e-05, 'samples': 23135744, 'steps': 45186, 'batch_loss/train': 0.6851532143191434}
12/28/2021 21:56:55 - INFO - codeparrot_training - Step 45187: {'lr': 1.1423817854644276e-05, 'samples': 23136256, 'steps': 45187, 'batch_loss/train': 0.7251094508683309}
12/28/2021 21:57:07 - INFO - codeparrot_training - Step 45188: {'lr': 1.1419107749301716e-05, 'samples': 23136768, 'steps': 45188, 'batch_loss/train': 0.7406473859446123}
12/28/2021 21:57:18 - INFO - codeparrot_training - Step 45189: {'lr': 1.1414398592466651e-05, 'samples': 23137280, 'steps': 45189, 'batch_loss/train': 0.6781553609762341}
12/28/2021 21:57:28 - INFO - codeparrot_training - Step 45190: {'lr': 1.1409690384157873e-05, 'samples': 23137792, 'steps': 45190, 'batch_loss/train': 0.7917155711911619}
12/28/2021 21:57:39 - INFO - codeparrot_training - Step 45191: {'lr': 1.1404983124394114e-05, 'samples': 23138304, 'steps': 45191, 'batch_loss/train': 0.695106367697008}
12/28/2021 21:57:51 - INFO - codeparrot_training - Step 45192: {'lr': 1.1400276813194028e-05, 'samples': 23138816, 'steps': 45192, 'batch_loss/train': 0.7419019603985362}
12/28/2021 21:58:02 - INFO - codeparrot_training - Step 45193: {'lr': 1.1395571450576292e-05, 'samples': 23139328, 'steps': 45193, 'batch_loss/train': 0.482095543819014}
12/28/2021 21:58:13 - INFO - codeparrot_training - Step 45194: {'lr': 1.1390867036559754e-05, 'samples': 23139840, 'steps': 45194, 'batch_loss/train': 0.6520025851204991}
12/28/2021 21:58:25 - INFO - codeparrot_training - Step 45195: {'lr': 1.1386163571162983e-05, 'samples': 23140352, 'steps': 45195, 'batch_loss/train': 0.6111841504462063}
12/28/2021 21:58:35 - INFO - codeparrot_training - Step 45196: {'lr': 1.1381461054404768e-05, 'samples': 23140864, 'steps': 45196, 'batch_loss/train': 0.5653514165896922}
12/28/2021 21:58:46 - INFO - codeparrot_training - Step 45197: {'lr': 1.1376759486303733e-05, 'samples': 23141376, 'steps': 45197, 'batch_loss/train': 0.6556123597547412}
12/28/2021 21:59:00 - INFO - codeparrot_training - Step 45198: {'lr': 1.1372058866878588e-05, 'samples': 23141888, 'steps': 45198, 'batch_loss/train': 0.7248354586772621}
12/28/2021 21:59:11 - INFO - codeparrot_training - Step 45199: {'lr': 1.1367359196148063e-05, 'samples': 23142400, 'steps': 45199, 'batch_loss/train': 0.7011120608076453}
12/28/2021 21:59:21 - INFO - codeparrot_training - Step 45200: {'lr': 1.1362660474130815e-05, 'samples': 23142912, 'steps': 45200, 'batch_loss/train': 0.7146652820520103}
12/28/2021 21:59:33 - INFO - codeparrot_training - Step 45201: {'lr': 1.135796270084552e-05, 'samples': 23143424, 'steps': 45201, 'batch_loss/train': 0.6802255888469517}
12/28/2021 21:59:44 - INFO - codeparrot_training - Step 45202: {'lr': 1.135326587631083e-05, 'samples': 23143936, 'steps': 45202, 'batch_loss/train': 0.8390423152595758}
12/28/2021 21:59:55 - INFO - codeparrot_training - Step 45203: {'lr': 1.1348570000545511e-05, 'samples': 23144448, 'steps': 45203, 'batch_loss/train': 0.6800969741307199}
12/28/2021 22:00:05 - INFO - codeparrot_training - Step 45204: {'lr': 1.1343875073568072e-05, 'samples': 23144960, 'steps': 45204, 'batch_loss/train': 0.6721721105277538}
12/28/2021 22:00:23 - INFO - codeparrot_training - Step 45205: {'lr': 1.1339181095397361e-05, 'samples': 23145472, 'steps': 45205, 'batch_loss/train': 0.718182299984619}
12/28/2021 22:00:34 - INFO - codeparrot_training - Step 45206: {'lr': 1.1334488066051917e-05, 'samples': 23145984, 'steps': 45206, 'batch_loss/train': 0.7300790552981198}
12/28/2021 22:00:44 - INFO - codeparrot_training - Step 45207: {'lr': 1.1329795985550396e-05, 'samples': 23146496, 'steps': 45207, 'batch_loss/train': 0.6861814074218273}
12/28/2021 22:00:57 - INFO - codeparrot_training - Step 45208: {'lr': 1.1325104853911555e-05, 'samples': 23147008, 'steps': 45208, 'batch_loss/train': 0.665014612255618}
12/28/2021 22:01:07 - INFO - codeparrot_training - Step 45209: {'lr': 1.132041467115394e-05, 'samples': 23147520, 'steps': 45209, 'batch_loss/train': 0.7433960535563529}
12/28/2021 22:01:18 - INFO - codeparrot_training - Step 45210: {'lr': 1.1315725437296227e-05, 'samples': 23148032, 'steps': 45210, 'batch_loss/train': 0.6775140275713056}
12/28/2021 22:01:30 - INFO - codeparrot_training - Step 45211: {'lr': 1.1311037152357068e-05, 'samples': 23148544, 'steps': 45211, 'batch_loss/train': 0.7709558987990022}
12/28/2021 22:01:41 - INFO - codeparrot_training - Step 45212: {'lr': 1.1306349816355116e-05, 'samples': 23149056, 'steps': 45212, 'batch_loss/train': 0.6586386940907687}
12/28/2021 22:01:51 - INFO - codeparrot_training - Step 45213: {'lr': 1.1301663429308967e-05, 'samples': 23149568, 'steps': 45213, 'batch_loss/train': 0.6813489380292594}
12/28/2021 22:02:02 - INFO - codeparrot_training - Step 45214: {'lr': 1.1296977991237328e-05, 'samples': 23150080, 'steps': 45214, 'batch_loss/train': 0.6840239576995373}
12/28/2021 22:02:16 - INFO - codeparrot_training - Step 45215: {'lr': 1.1292293502158712e-05, 'samples': 23150592, 'steps': 45215, 'batch_loss/train': 0.8789694784209132}
12/28/2021 22:02:26 - INFO - codeparrot_training - Step 45216: {'lr': 1.1287609962091827e-05, 'samples': 23151104, 'steps': 45216, 'batch_loss/train': 0.7519833836704493}
12/28/2021 22:02:37 - INFO - codeparrot_training - Step 45217: {'lr': 1.1282927371055295e-05, 'samples': 23151616, 'steps': 45217, 'batch_loss/train': 0.7265521013177931}
12/28/2021 22:02:49 - INFO - codeparrot_training - Step 45218: {'lr': 1.1278245729067688e-05, 'samples': 23152128, 'steps': 45218, 'batch_loss/train': 0.7645460017956793}
12/28/2021 22:03:00 - INFO - codeparrot_training - Step 45219: {'lr': 1.1273565036147598e-05, 'samples': 23152640, 'steps': 45219, 'batch_loss/train': 0.7092871246859431}
12/28/2021 22:03:11 - INFO - codeparrot_training - Step 45220: {'lr': 1.1268885292313735e-05, 'samples': 23153152, 'steps': 45220, 'batch_loss/train': 0.7705654481542297}
12/28/2021 22:03:24 - INFO - codeparrot_training - Step 45221: {'lr': 1.1264206497584612e-05, 'samples': 23153664, 'steps': 45221, 'batch_loss/train': 0.7733873687684536}
12/28/2021 22:03:35 - INFO - codeparrot_training - Step 45222: {'lr': 1.125952865197885e-05, 'samples': 23154176, 'steps': 45222, 'batch_loss/train': 0.71188008133322}
12/28/2021 22:03:46 - INFO - codeparrot_training - Step 45223: {'lr': 1.1254851755515105e-05, 'samples': 23154688, 'steps': 45223, 'batch_loss/train': 0.7335936911404133}
12/28/2021 22:03:56 - INFO - codeparrot_training - Step 45224: {'lr': 1.1250175808211888e-05, 'samples': 23155200, 'steps': 45224, 'batch_loss/train': 0.7349751787260175}
12/28/2021 22:04:08 - INFO - codeparrot_training - Step 45225: {'lr': 1.1245500810087822e-05, 'samples': 23155712, 'steps': 45225, 'batch_loss/train': 0.7380467001348734}
12/28/2021 22:04:19 - INFO - codeparrot_training - Step 45226: {'lr': 1.1240826761161505e-05, 'samples': 23156224, 'steps': 45226, 'batch_loss/train': 0.7355642057955265}
12/28/2021 22:04:29 - INFO - codeparrot_training - Step 45227: {'lr': 1.1236153661451477e-05, 'samples': 23156736, 'steps': 45227, 'batch_loss/train': 0.6533847944810987}
12/28/2021 22:04:42 - INFO - codeparrot_training - Step 45228: {'lr': 1.1231481510976388e-05, 'samples': 23157248, 'steps': 45228, 'batch_loss/train': 0.8815470701083541}
12/28/2021 22:04:52 - INFO - codeparrot_training - Step 45229: {'lr': 1.1226810309754782e-05, 'samples': 23157760, 'steps': 45229, 'batch_loss/train': 0.728433322859928}
12/28/2021 22:05:03 - INFO - codeparrot_training - Step 45230: {'lr': 1.1222140057805169e-05, 'samples': 23158272, 'steps': 45230, 'batch_loss/train': 0.6192574314773083}
12/28/2021 22:05:15 - INFO - codeparrot_training - Step 45231: {'lr': 1.121747075514623e-05, 'samples': 23158784, 'steps': 45231, 'batch_loss/train': 0.5502081043669023}
12/28/2021 22:05:26 - INFO - codeparrot_training - Step 45232: {'lr': 1.1212802401796424e-05, 'samples': 23159296, 'steps': 45232, 'batch_loss/train': 0.7050936496816576}
12/28/2021 22:05:37 - INFO - codeparrot_training - Step 45233: {'lr': 1.1208134997774344e-05, 'samples': 23159808, 'steps': 45233, 'batch_loss/train': 0.7237762650474906}
12/28/2021 22:05:51 - INFO - codeparrot_training - Step 45234: {'lr': 1.1203468543098589e-05, 'samples': 23160320, 'steps': 45234, 'batch_loss/train': 0.6724340398795903}
12/28/2021 22:06:02 - INFO - codeparrot_training - Step 45235: {'lr': 1.119880303778767e-05, 'samples': 23160832, 'steps': 45235, 'batch_loss/train': 0.8753606639802456}
12/28/2021 22:06:13 - INFO - codeparrot_training - Step 45236: {'lr': 1.1194138481860127e-05, 'samples': 23161344, 'steps': 45236, 'batch_loss/train': 0.6808420117013156}
12/28/2021 22:06:23 - INFO - codeparrot_training - Step 45237: {'lr': 1.1189474875334533e-05, 'samples': 23161856, 'steps': 45237, 'batch_loss/train': 0.6573677156120539}
12/28/2021 22:06:36 - INFO - codeparrot_training - Step 45238: {'lr': 1.1184812218229395e-05, 'samples': 23162368, 'steps': 45238, 'batch_loss/train': 0.9554148181341588}
12/28/2021 22:06:47 - INFO - codeparrot_training - Step 45239: {'lr': 1.1180150510563287e-05, 'samples': 23162880, 'steps': 45239, 'batch_loss/train': 1.0799186201766133}
12/28/2021 22:06:57 - INFO - codeparrot_training - Step 45240: {'lr': 1.1175489752354772e-05, 'samples': 23163392, 'steps': 45240, 'batch_loss/train': 0.8932139687240124}
12/28/2021 22:07:09 - INFO - codeparrot_training - Step 45241: {'lr': 1.117082994362223e-05, 'samples': 23163904, 'steps': 45241, 'batch_loss/train': 0.6179465335444547}
12/28/2021 22:07:20 - INFO - codeparrot_training - Step 45242: {'lr': 1.1166171084384364e-05, 'samples': 23164416, 'steps': 45242, 'batch_loss/train': 0.6919245892204344}
12/28/2021 22:07:31 - INFO - codeparrot_training - Step 45243: {'lr': 1.1161513174659632e-05, 'samples': 23164928, 'steps': 45243, 'batch_loss/train': 0.6826542783528566}
12/28/2021 22:07:41 - INFO - codeparrot_training - Step 45244: {'lr': 1.1156856214466465e-05, 'samples': 23165440, 'steps': 45244, 'batch_loss/train': 0.7390599376522005}
12/28/2021 22:07:55 - INFO - codeparrot_training - Step 45245: {'lr': 1.1152200203823515e-05, 'samples': 23165952, 'steps': 45245, 'batch_loss/train': 0.7221329691819847}
12/28/2021 22:08:06 - INFO - codeparrot_training - Step 45246: {'lr': 1.1147545142749239e-05, 'samples': 23166464, 'steps': 45246, 'batch_loss/train': 0.7685714890249074}
12/28/2021 22:08:17 - INFO - codeparrot_training - Step 45247: {'lr': 1.1142891031262121e-05, 'samples': 23166976, 'steps': 45247, 'batch_loss/train': 0.8230871669948101}
12/28/2021 22:08:29 - INFO - codeparrot_training - Step 45248: {'lr': 1.1138237869380647e-05, 'samples': 23167488, 'steps': 45248, 'batch_loss/train': 0.7140512084588408}
12/28/2021 22:08:40 - INFO - codeparrot_training - Step 45249: {'lr': 1.1133585657123385e-05, 'samples': 23168000, 'steps': 45249, 'batch_loss/train': 0.6738131381571293}
12/28/2021 22:08:50 - INFO - codeparrot_training - Step 45250: {'lr': 1.1128934394508794e-05, 'samples': 23168512, 'steps': 45250, 'batch_loss/train': 0.7170391110703349}
12/28/2021 22:09:04 - INFO - codeparrot_training - Step 45251: {'lr': 1.1124284081555359e-05, 'samples': 23169024, 'steps': 45251, 'batch_loss/train': 0.6984525015577674}
12/28/2021 22:09:15 - INFO - codeparrot_training - Step 45252: {'lr': 1.1119634718281563e-05, 'samples': 23169536, 'steps': 45252, 'batch_loss/train': 0.7021812271559611}
12/28/2021 22:09:25 - INFO - codeparrot_training - Step 45253: {'lr': 1.1114986304705892e-05, 'samples': 23170048, 'steps': 45253, 'batch_loss/train': 0.6724969043862075}
12/28/2021 22:09:36 - INFO - codeparrot_training - Step 45254: {'lr': 1.1110338840846862e-05, 'samples': 23170560, 'steps': 45254, 'batch_loss/train': 0.7499102395959198}
12/28/2021 22:09:48 - INFO - codeparrot_training - Step 45255: {'lr': 1.1105692326722955e-05, 'samples': 23171072, 'steps': 45255, 'batch_loss/train': 0.7043064082972705}
12/28/2021 22:09:59 - INFO - codeparrot_training - Step 45256: {'lr': 1.1101046762352546e-05, 'samples': 23171584, 'steps': 45256, 'batch_loss/train': 0.7961749751702882}
12/28/2021 22:10:09 - INFO - codeparrot_training - Step 45257: {'lr': 1.1096402147754176e-05, 'samples': 23172096, 'steps': 45257, 'batch_loss/train': 0.6600732328370214}
12/28/2021 22:10:21 - INFO - codeparrot_training - Step 45258: {'lr': 1.1091758482946356e-05, 'samples': 23172608, 'steps': 45258, 'batch_loss/train': 0.6515343161299825}
12/28/2021 22:10:32 - INFO - codeparrot_training - Step 45259: {'lr': 1.1087115767947436e-05, 'samples': 23173120, 'steps': 45259, 'batch_loss/train': 0.6001659393077716}
12/28/2021 22:10:43 - INFO - codeparrot_training - Step 45260: {'lr': 1.1082474002776011e-05, 'samples': 23173632, 'steps': 45260, 'batch_loss/train': 0.7552105141803622}
12/28/2021 22:10:57 - INFO - codeparrot_training - Step 45261: {'lr': 1.1077833187450398e-05, 'samples': 23174144, 'steps': 45261, 'batch_loss/train': 0.5725286195520312}
12/28/2021 22:11:07 - INFO - codeparrot_training - Step 45262: {'lr': 1.107319332198911e-05, 'samples': 23174656, 'steps': 45262, 'batch_loss/train': 0.7894206577911973}
12/28/2021 22:11:18 - INFO - codeparrot_training - Step 45263: {'lr': 1.1068554406410608e-05, 'samples': 23175168, 'steps': 45263, 'batch_loss/train': 0.8380233468487859}
12/28/2021 22:11:29 - INFO - codeparrot_training - Step 45264: {'lr': 1.106391644073329e-05, 'samples': 23175680, 'steps': 45264, 'batch_loss/train': 0.7534953807480633}
12/28/2021 22:11:41 - INFO - codeparrot_training - Step 45265: {'lr': 1.1059279424975644e-05, 'samples': 23176192, 'steps': 45265, 'batch_loss/train': 0.9111900036223233}
12/28/2021 22:11:52 - INFO - codeparrot_training - Step 45266: {'lr': 1.105464335915607e-05, 'samples': 23176704, 'steps': 45266, 'batch_loss/train': 0.7783919349312782}
12/28/2021 22:12:02 - INFO - codeparrot_training - Step 45267: {'lr': 1.1050008243293026e-05, 'samples': 23177216, 'steps': 45267, 'batch_loss/train': 0.7054474800825119}
12/28/2021 22:12:16 - INFO - codeparrot_training - Step 45268: {'lr': 1.1045374077404913e-05, 'samples': 23177728, 'steps': 45268, 'batch_loss/train': 0.7188941463828087}
12/28/2021 22:12:27 - INFO - codeparrot_training - Step 45269: {'lr': 1.1040740861510218e-05, 'samples': 23178240, 'steps': 45269, 'batch_loss/train': 0.8300011469982564}
12/28/2021 22:12:38 - INFO - codeparrot_training - Step 45270: {'lr': 1.103610859562723e-05, 'samples': 23178752, 'steps': 45270, 'batch_loss/train': 0.7060732033569366}
12/28/2021 22:12:48 - INFO - codeparrot_training - Step 45271: {'lr': 1.1031477279774466e-05, 'samples': 23179264, 'steps': 45271, 'batch_loss/train': 0.7719943816773593}
12/28/2021 22:13:00 - INFO - codeparrot_training - Step 45272: {'lr': 1.1026846913970351e-05, 'samples': 23179776, 'steps': 45272, 'batch_loss/train': 0.6992597542703152}
12/28/2021 22:13:11 - INFO - codeparrot_training - Step 45273: {'lr': 1.1022217498233234e-05, 'samples': 23180288, 'steps': 45273, 'batch_loss/train': 1.4651657892391086}
12/28/2021 22:13:22 - INFO - codeparrot_training - Step 45274: {'lr': 1.1017589032581543e-05, 'samples': 23180800, 'steps': 45274, 'batch_loss/train': 0.7581694712862372}
12/28/2021 22:13:34 - INFO - codeparrot_training - Step 45275: {'lr': 1.101296151703371e-05, 'samples': 23181312, 'steps': 45275, 'batch_loss/train': 0.6235096626915038}
12/28/2021 22:13:44 - INFO - codeparrot_training - Step 45276: {'lr': 1.100833495160808e-05, 'samples': 23181824, 'steps': 45276, 'batch_loss/train': 0.7271245773881674}
12/28/2021 22:13:55 - INFO - codeparrot_training - Step 45277: {'lr': 1.100370933632308e-05, 'samples': 23182336, 'steps': 45277, 'batch_loss/train': 0.7586483911727555}
12/28/2021 22:14:07 - INFO - codeparrot_training - Step 45278: {'lr': 1.0999084671197063e-05, 'samples': 23182848, 'steps': 45278, 'batch_loss/train': 0.7134393873857334}
12/28/2021 22:14:18 - INFO - codeparrot_training - Step 45279: {'lr': 1.0994460956248481e-05, 'samples': 23183360, 'steps': 45279, 'batch_loss/train': 0.632844238076359}
12/28/2021 22:14:28 - INFO - codeparrot_training - Step 45280: {'lr': 1.0989838191495655e-05, 'samples': 23183872, 'steps': 45280, 'batch_loss/train': 0.7187032364308834}
12/28/2021 22:14:39 - INFO - codeparrot_training - Step 45281: {'lr': 1.0985216376956985e-05, 'samples': 23184384, 'steps': 45281, 'batch_loss/train': 0.7205729158595204}
12/28/2021 22:14:53 - INFO - codeparrot_training - Step 45282: {'lr': 1.0980595512650848e-05, 'samples': 23184896, 'steps': 45282, 'batch_loss/train': 0.7506268839351833}
12/28/2021 22:15:04 - INFO - codeparrot_training - Step 45283: {'lr': 1.0975975598595617e-05, 'samples': 23185408, 'steps': 45283, 'batch_loss/train': 0.7629040649626404}
12/28/2021 22:15:14 - INFO - codeparrot_training - Step 45284: {'lr': 1.0971356634809692e-05, 'samples': 23185920, 'steps': 45284, 'batch_loss/train': 0.6816213498823345}
12/28/2021 22:15:27 - INFO - codeparrot_training - Step 45285: {'lr': 1.0966738621311311e-05, 'samples': 23186432, 'steps': 45285, 'batch_loss/train': 0.6695514561142772}
12/28/2021 22:15:38 - INFO - codeparrot_training - Step 45286: {'lr': 1.0962121558119015e-05, 'samples': 23186944, 'steps': 45286, 'batch_loss/train': 0.7937301495112479}
12/28/2021 22:15:48 - INFO - codeparrot_training - Step 45287: {'lr': 1.0957505445251009e-05, 'samples': 23187456, 'steps': 45287, 'batch_loss/train': 0.6989519577473402}
12/28/2021 22:16:00 - INFO - codeparrot_training - Step 45288: {'lr': 1.0952890282725724e-05, 'samples': 23187968, 'steps': 45288, 'batch_loss/train': 1.0939832422882318}
12/28/2021 22:16:11
- INFO - codeparrot_training - Step 45289: {'lr': 1.0948276070561481e-05, 'samples': 23188480, 'steps': 45289, 'batch_loss/train': 0.8505614250898361} 12/28/2021 22:16:22 - INFO - codeparrot_training - Step 45290: {'lr': 1.0943662808776622e-05, 'samples': 23188992, 'steps': 45290, 'batch_loss/train': 0.7349847466684878} 12/28/2021 22:16:32 - INFO - codeparrot_training - Step 45291: {'lr': 1.0939050497389497e-05, 'samples': 23189504, 'steps': 45291, 'batch_loss/train': 0.7196546790655702} 12/28/2021 22:16:46 - INFO - codeparrot_training - Step 45292: {'lr': 1.093443913641845e-05, 'samples': 23190016, 'steps': 45292, 'batch_loss/train': 0.6784994169138372} 12/28/2021 22:16:57 - INFO - codeparrot_training - Step 45293: {'lr': 1.0929828725881803e-05, 'samples': 23190528, 'steps': 45293, 'batch_loss/train': 0.5754673110204749} 12/28/2021 22:17:08 - INFO - codeparrot_training - Step 45294: {'lr': 1.0925219265797897e-05, 'samples': 23191040, 'steps': 45294, 'batch_loss/train': 0.4961165436543524} 12/28/2021 22:17:20 - INFO - codeparrot_training - Step 45295: {'lr': 1.0920610756185056e-05, 'samples': 23191552, 'steps': 45295, 'batch_loss/train': 0.6434111797716469} 12/28/2021 22:17:30 - INFO - codeparrot_training - Step 45296: {'lr': 1.0916003197061541e-05, 'samples': 23192064, 'steps': 45296, 'batch_loss/train': 0.8592402385547757} 12/28/2021 22:17:41 - INFO - codeparrot_training - Step 45297: {'lr': 1.0911396588445755e-05, 'samples': 23192576, 'steps': 45297, 'batch_loss/train': 0.5038390303379856} 12/28/2021 22:17:53 - INFO - codeparrot_training - Step 45298: {'lr': 1.0906790930356015e-05, 'samples': 23193088, 'steps': 45298, 'batch_loss/train': 0.6511128675192595} 12/28/2021 22:18:04 - INFO - codeparrot_training - Step 45299: {'lr': 1.0902186222810557e-05, 'samples': 23193600, 'steps': 45299, 'batch_loss/train': 0.9157065879553556} 12/28/2021 22:18:15 - INFO - codeparrot_training - Step 45300: {'lr': 1.0897582465827698e-05, 'samples': 23194112, 'steps': 45300, 
'batch_loss/train': 0.7176357582211494} 12/28/2021 22:18:25 - INFO - codeparrot_training - Step 45301: {'lr': 1.0892979659425844e-05, 'samples': 23194624, 'steps': 45301, 'batch_loss/train': 0.7360463039949536} 12/28/2021 22:18:39 - INFO - codeparrot_training - Step 45302: {'lr': 1.0888377803623172e-05, 'samples': 23195136, 'steps': 45302, 'batch_loss/train': 0.6681429732125252} 12/28/2021 22:18:50 - INFO - codeparrot_training - Step 45303: {'lr': 1.088377689843803e-05, 'samples': 23195648, 'steps': 45303, 'batch_loss/train': 0.6851050448603928} 12/28/2021 22:19:01 - INFO - codeparrot_training - Step 45304: {'lr': 1.0879176943888709e-05, 'samples': 23196160, 'steps': 45304, 'batch_loss/train': 0.677665734430775} 12/28/2021 22:19:13 - INFO - codeparrot_training - Step 45305: {'lr': 1.0874577939993497e-05, 'samples': 23196672, 'steps': 45305, 'batch_loss/train': 0.7096738638356328} 12/28/2021 22:19:23 - INFO - codeparrot_training - Step 45306: {'lr': 1.086997988677066e-05, 'samples': 23197184, 'steps': 45306, 'batch_loss/train': 0.6497659147717059} 12/28/2021 22:19:34 - INFO - codeparrot_training - Step 45307: {'lr': 1.0865382784238487e-05, 'samples': 23197696, 'steps': 45307, 'batch_loss/train': 0.6789854351663962} 12/28/2021 22:19:46 - INFO - codeparrot_training - Step 45308: {'lr': 1.086078663241527e-05, 'samples': 23198208, 'steps': 45308, 'batch_loss/train': 0.7247230112552643} 12/28/2021 22:19:57 - INFO - codeparrot_training - Step 45309: {'lr': 1.0856191431319273e-05, 'samples': 23198720, 'steps': 45309, 'batch_loss/train': 0.6887982254847884} 12/28/2021 22:20:08 - INFO - codeparrot_training - Step 45310: {'lr': 1.0851597180968787e-05, 'samples': 23199232, 'steps': 45310, 'batch_loss/train': 0.5507680092414375} 12/28/2021 22:20:21 - INFO - codeparrot_training - Step 45311: {'lr': 1.084700388138199e-05, 'samples': 23199744, 'steps': 45311, 'batch_loss/train': 0.7541107328142971} 12/28/2021 22:20:32 - INFO - codeparrot_training - Step 45312: {'lr': 
1.0842411532577229e-05, 'samples': 23200256, 'steps': 45312, 'batch_loss/train': 0.7128020357340574} 12/28/2021 22:20:43 - INFO - codeparrot_training - Step 45313: {'lr': 1.0837820134572767e-05, 'samples': 23200768, 'steps': 45313, 'batch_loss/train': 0.7255416731350124} 12/28/2021 22:20:53 - INFO - codeparrot_training - Step 45314: {'lr': 1.0833229687386814e-05, 'samples': 23201280, 'steps': 45314, 'batch_loss/train': 0.6962267503840849} 12/28/2021 22:21:06 - INFO - codeparrot_training - Step 45315: {'lr': 1.082864019103763e-05, 'samples': 23201792, 'steps': 45315, 'batch_loss/train': 0.7338845045305789} 12/28/2021 22:21:17 - INFO - codeparrot_training - Step 45316: {'lr': 1.0824051645543453e-05, 'samples': 23202304, 'steps': 45316, 'batch_loss/train': 0.6128290402702987} 12/28/2021 22:21:27 - INFO - codeparrot_training - Step 45317: {'lr': 1.0819464050922545e-05, 'samples': 23202816, 'steps': 45317, 'batch_loss/train': 0.6620603036135435} 12/28/2021 22:21:40 - INFO - codeparrot_training - Step 45318: {'lr': 1.0814877407193142e-05, 'samples': 23203328, 'steps': 45318, 'batch_loss/train': 0.5543960845097899} 12/28/2021 22:21:50 - INFO - codeparrot_training - Step 45319: {'lr': 1.081029171437345e-05, 'samples': 23203840, 'steps': 45319, 'batch_loss/train': 0.6914405198767781} 12/28/2021 22:22:01 - INFO - codeparrot_training - Step 45320: {'lr': 1.0805706972481761e-05, 'samples': 23204352, 'steps': 45320, 'batch_loss/train': 0.5540552381426096} 12/28/2021 22:22:14 - INFO - codeparrot_training - Step 45321: {'lr': 1.0801123181536283e-05, 'samples': 23204864, 'steps': 45321, 'batch_loss/train': 1.1100015873089433} 12/28/2021 22:22:25 - INFO - codeparrot_training - Step 45322: {'lr': 1.0796540341555138e-05, 'samples': 23205376, 'steps': 45322, 'batch_loss/train': 0.6207844359159935} 12/28/2021 22:22:35 - INFO - codeparrot_training - Step 45323: {'lr': 1.0791958452556677e-05, 'samples': 23205888, 'steps': 45323, 'batch_loss/train': 0.7627575974911451} 12/28/2021 22:22:46 
- INFO - codeparrot_training - Step 45324: {'lr': 1.0787377514559076e-05, 'samples': 23206400, 'steps': 45324, 'batch_loss/train': 0.7517069960013032} 12/28/2021 22:22:59 - INFO - codeparrot_training - Step 45325: {'lr': 1.0782797527580546e-05, 'samples': 23206912, 'steps': 45325, 'batch_loss/train': 0.732842265162617} 12/28/2021 22:23:09 - INFO - codeparrot_training - Step 45326: {'lr': 1.0778218491639236e-05, 'samples': 23207424, 'steps': 45326, 'batch_loss/train': 0.7318932809866965} 12/28/2021 22:23:20 - INFO - codeparrot_training - Step 45327: {'lr': 1.0773640406753465e-05, 'samples': 23207936, 'steps': 45327, 'batch_loss/train': 0.7285827691666782} 12/28/2021 22:23:34 - INFO - codeparrot_training - Step 45328: {'lr': 1.0769063272941331e-05, 'samples': 23208448, 'steps': 45328, 'batch_loss/train': 0.484386034484487} 12/28/2021 22:23:44 - INFO - codeparrot_training - Step 45329: {'lr': 1.0764487090221097e-05, 'samples': 23208960, 'steps': 45329, 'batch_loss/train': 0.7220790479332209} 12/28/2021 22:23:55 - INFO - codeparrot_training - Step 45330: {'lr': 1.0759911858610915e-05, 'samples': 23209472, 'steps': 45330, 'batch_loss/train': 0.7424742095172405} 12/28/2021 22:24:07 - INFO - codeparrot_training - Step 45331: {'lr': 1.0755337578128994e-05, 'samples': 23209984, 'steps': 45331, 'batch_loss/train': 0.6534157157875597} 12/28/2021 22:24:18 - INFO - codeparrot_training - Step 45332: {'lr': 1.075076424879351e-05, 'samples': 23210496, 'steps': 45332, 'batch_loss/train': 0.648250441532582} 12/28/2021 22:24:29 - INFO - codeparrot_training - Step 45333: {'lr': 1.0746191870622646e-05, 'samples': 23211008, 'steps': 45333, 'batch_loss/train': 0.7088682139292359} 12/28/2021 22:24:39 - INFO - codeparrot_training - Step 45334: {'lr': 1.074162044363461e-05, 'samples': 23211520, 'steps': 45334, 'batch_loss/train': 0.6621065251529217} 12/28/2021 22:24:51 - INFO - codeparrot_training - Step 45335: {'lr': 1.0737049967847523e-05, 'samples': 23212032, 'steps': 45335, 
'batch_loss/train': 0.8325257394462824} 12/28/2021 22:25:02 - INFO - codeparrot_training - Step 45336: {'lr': 1.0732480443279624e-05, 'samples': 23212544, 'steps': 45336, 'batch_loss/train': 0.7722415053285658} 12/28/2021 22:25:13 - INFO - codeparrot_training - Step 45337: {'lr': 1.0727911869948981e-05, 'samples': 23213056, 'steps': 45337, 'batch_loss/train': 0.6735509932041168} 12/28/2021 22:25:25 - INFO - codeparrot_training - Step 45338: {'lr': 1.072334424787383e-05, 'samples': 23213568, 'steps': 45338, 'batch_loss/train': 0.8288606273708865} 12/28/2021 22:25:36 - INFO - codeparrot_training - Step 45339: {'lr': 1.0718777577072348e-05, 'samples': 23214080, 'steps': 45339, 'batch_loss/train': 0.7626053616404533} 12/28/2021 22:25:46 - INFO - codeparrot_training - Step 45340: {'lr': 1.0714211857562634e-05, 'samples': 23214592, 'steps': 45340, 'batch_loss/train': 0.7257033863570541} 12/28/2021 22:26:01 - INFO - codeparrot_training - Step 45341: {'lr': 1.0709647089362867e-05, 'samples': 23215104, 'steps': 45341, 'batch_loss/train': 0.3537993535865098} 12/28/2021 22:26:12 - INFO - codeparrot_training - Step 45342: {'lr': 1.0705083272491201e-05, 'samples': 23215616, 'steps': 45342, 'batch_loss/train': 0.6063477366697043} 12/28/2021 22:26:23 - INFO - codeparrot_training - Step 45343: {'lr': 1.0700520406965758e-05, 'samples': 23216128, 'steps': 45343, 'batch_loss/train': 0.9455185933038592} 12/28/2021 22:26:33 - INFO - codeparrot_training - Step 45344: {'lr': 1.0695958492804664e-05, 'samples': 23216640, 'steps': 45344, 'batch_loss/train': 0.6993869263678789} 12/28/2021 22:26:45 - INFO - codeparrot_training - Step 45345: {'lr': 1.0691397530026126e-05, 'samples': 23217152, 'steps': 45345, 'batch_loss/train': 0.6959548979066312} 12/28/2021 22:26:56 - INFO - codeparrot_training - Step 45346: {'lr': 1.0686837518648213e-05, 'samples': 23217664, 'steps': 45346, 'batch_loss/train': 0.6812650999054313} 12/28/2021 22:27:07 - INFO - codeparrot_training - Step 45347: {'lr': 
1.0682278458689048e-05, 'samples': 23218176, 'steps': 45347, 'batch_loss/train': 0.6890046981861815} 12/28/2021 22:27:19 - INFO - codeparrot_training - Step 45348: {'lr': 1.0677720350166814e-05, 'samples': 23218688, 'steps': 45348, 'batch_loss/train': 0.723866836167872} 12/28/2021 22:27:30 - INFO - codeparrot_training - Step 45349: {'lr': 1.0673163193099606e-05, 'samples': 23219200, 'steps': 45349, 'batch_loss/train': 0.796257097274065} 12/28/2021 22:27:40 - INFO - codeparrot_training - Step 45350: {'lr': 1.0668606987505546e-05, 'samples': 23219712, 'steps': 45350, 'batch_loss/train': 0.5265887342393398} 12/28/2021 22:27:51 - INFO - codeparrot_training - Step 45351: {'lr': 1.0664051733402708e-05, 'samples': 23220224, 'steps': 45351, 'batch_loss/train': 0.8924901681020856} 12/28/2021 22:28:05 - INFO - codeparrot_training - Step 45352: {'lr': 1.0659497430809184e-05, 'samples': 23220736, 'steps': 45352, 'batch_loss/train': 0.7565289498306811} 12/28/2021 22:28:16 - INFO - codeparrot_training - Step 45353: {'lr': 1.0654944079743211e-05, 'samples': 23221248, 'steps': 45353, 'batch_loss/train': 0.6236851648427546} 12/28/2021 22:28:26 - INFO - codeparrot_training - Step 45354: {'lr': 1.0650391680222776e-05, 'samples': 23221760, 'steps': 45354, 'batch_loss/train': 0.6939378841780126} 12/28/2021 22:28:39 - INFO - codeparrot_training - Step 45355: {'lr': 1.0645840232265973e-05, 'samples': 23222272, 'steps': 45355, 'batch_loss/train': 0.7403922802768648} 12/28/2021 22:28:49 - INFO - codeparrot_training - Step 45356: {'lr': 1.0641289735890985e-05, 'samples': 23222784, 'steps': 45356, 'batch_loss/train': 0.7942587197758257} 12/28/2021 22:29:00 - INFO - codeparrot_training - Step 45357: {'lr': 1.0636740191115824e-05, 'samples': 23223296, 'steps': 45357, 'batch_loss/train': 0.7010393578093499} 12/28/2021 22:29:12 - INFO - codeparrot_training - Step 45358: {'lr': 1.0632191597958613e-05, 'samples': 23223808, 'steps': 45358, 'batch_loss/train': 0.6156602972187102} 12/28/2021 22:29:23 
- INFO - codeparrot_training - Step 45359: {'lr': 1.062764395643742e-05, 'samples': 23224320, 'steps': 45359, 'batch_loss/train': 0.6692573865875602} 12/28/2021 22:29:33 - INFO - codeparrot_training - Step 45360: {'lr': 1.0623097266570347e-05, 'samples': 23224832, 'steps': 45360, 'batch_loss/train': 0.8730495180934668} 12/28/2021 22:29:44 - INFO - codeparrot_training - Step 45361: {'lr': 1.061855152837543e-05, 'samples': 23225344, 'steps': 45361, 'batch_loss/train': 0.7029155241325498} 12/28/2021 22:29:58 - INFO - codeparrot_training - Step 45362: {'lr': 1.0614006741870797e-05, 'samples': 23225856, 'steps': 45362, 'batch_loss/train': 0.7678727144375443} 12/28/2021 22:30:09 - INFO - codeparrot_training - Step 45363: {'lr': 1.060946290707443e-05, 'samples': 23226368, 'steps': 45363, 'batch_loss/train': 0.6651145978830755} 12/28/2021 22:30:19 - INFO - codeparrot_training - Step 45364: {'lr': 1.0604920024004482e-05, 'samples': 23226880, 'steps': 45364, 'batch_loss/train': 0.8230637712404132} 12/28/2021 22:30:31 - INFO - codeparrot_training - Step 45365: {'lr': 1.0600378092679026e-05, 'samples': 23227392, 'steps': 45365, 'batch_loss/train': 0.7300275946035981} 12/28/2021 22:30:42 - INFO - codeparrot_training - Step 45366: {'lr': 1.0595837113116014e-05, 'samples': 23227904, 'steps': 45366, 'batch_loss/train': 0.6930468659847975} 12/28/2021 22:30:53 - INFO - codeparrot_training - Step 45367: {'lr': 1.0591297085333573e-05, 'samples': 23228416, 'steps': 45367, 'batch_loss/train': 0.8013980253599584} 12/28/2021 22:31:07 - INFO - codeparrot_training - Step 45368: {'lr': 1.0586758009349745e-05, 'samples': 23228928, 'steps': 45368, 'batch_loss/train': 0.7181337913498282} 12/28/2021 22:31:17 - INFO - codeparrot_training - Step 45369: {'lr': 1.0582219885182542e-05, 'samples': 23229440, 'steps': 45369, 'batch_loss/train': 0.7100527049042284} 12/28/2021 22:31:28 - INFO - codeparrot_training - Step 45370: {'lr': 1.0577682712850061e-05, 'samples': 23229952, 'steps': 45370, 
'batch_loss/train': 0.665957611054182} 12/28/2021 22:31:38 - INFO - codeparrot_training - Step 45371: {'lr': 1.0573146492370289e-05, 'samples': 23230464, 'steps': 45371, 'batch_loss/train': 0.7725987378507853} 12/28/2021 22:31:51 - INFO - codeparrot_training - Step 45372: {'lr': 1.0568611223761293e-05, 'samples': 23230976, 'steps': 45372, 'batch_loss/train': 0.6724550444632769} 12/28/2021 22:32:02 - INFO - codeparrot_training - Step 45373: {'lr': 1.0564076907041087e-05, 'samples': 23231488, 'steps': 45373, 'batch_loss/train': 0.7463839296251535} 12/28/2021 22:32:12 - INFO - codeparrot_training - Step 45374: {'lr': 1.0559543542227711e-05, 'samples': 23232000, 'steps': 45374, 'batch_loss/train': 0.6817212644964457} 12/28/2021 22:32:24 - INFO - codeparrot_training - Step 45375: {'lr': 1.0555011129339181e-05, 'samples': 23232512, 'steps': 45375, 'batch_loss/train': 0.34417605539783835} 12/28/2021 22:32:35 - INFO - codeparrot_training - Step 45376: {'lr': 1.0550479668393537e-05, 'samples': 23233024, 'steps': 45376, 'batch_loss/train': 0.7010289751924574} 12/28/2021 22:32:46 - INFO - codeparrot_training - Step 45377: {'lr': 1.0545949159408762e-05, 'samples': 23233536, 'steps': 45377, 'batch_loss/train': 0.8255411572754383} 12/28/2021 22:32:58 - INFO - codeparrot_training - Step 45378: {'lr': 1.054141960240282e-05, 'samples': 23234048, 'steps': 45378, 'batch_loss/train': 0.7538873115554452} 12/28/2021 22:33:09 - INFO - codeparrot_training - Step 45379: {'lr': 1.0536890997393855e-05, 'samples': 23234560, 'steps': 45379, 'batch_loss/train': 0.7663059572223574} 12/28/2021 22:33:19 - INFO - codeparrot_training - Step 45380: {'lr': 1.0532363344399776e-05, 'samples': 23235072, 'steps': 45380, 'batch_loss/train': 0.7639625910669565} 12/28/2021 22:33:30 - INFO - codeparrot_training - Step 45381: {'lr': 1.0527836643438537e-05, 'samples': 23235584, 'steps': 45381, 'batch_loss/train': 0.7372170723974705} 12/28/2021 22:33:44 - INFO - codeparrot_training - Step 45382: {'lr': 
1.0523310894528292e-05, 'samples': 23236096, 'steps': 45382, 'batch_loss/train': 0.7838636082597077} 12/28/2021 22:33:54 - INFO - codeparrot_training - Step 45383: {'lr': 1.0518786097686889e-05, 'samples': 23236608, 'steps': 45383, 'batch_loss/train': 0.6557683162391186} 12/28/2021 22:34:05 - INFO - codeparrot_training - Step 45384: {'lr': 1.0514262252932395e-05, 'samples': 23237120, 'steps': 45384, 'batch_loss/train': 0.628497215511743} 12/28/2021 22:34:17 - INFO - codeparrot_training - Step 45385: {'lr': 1.0509739360282739e-05, 'samples': 23237632, 'steps': 45385, 'batch_loss/train': 0.6949268446769565} 12/28/2021 22:34:28 - INFO - codeparrot_training - Step 45386: {'lr': 1.0505217419755936e-05, 'samples': 23238144, 'steps': 45386, 'batch_loss/train': 0.7229671152308583} 12/28/2021 22:34:38 - INFO - codeparrot_training - Step 45387: {'lr': 1.0500696431369972e-05, 'samples': 23238656, 'steps': 45387, 'batch_loss/train': 0.6648648795671761} 12/28/2021 22:34:51 - INFO - codeparrot_training - Step 45388: {'lr': 1.049617639514283e-05, 'samples': 23239168, 'steps': 45388, 'batch_loss/train': 0.5769379602279514} 12/28/2021 22:35:01 - INFO - codeparrot_training - Step 45389: {'lr': 1.049165731109239e-05, 'samples': 23239680, 'steps': 45389, 'batch_loss/train': 0.7126280022785068} 12/28/2021 22:35:12 - INFO - codeparrot_training - Step 45390: {'lr': 1.0487139179236715e-05, 'samples': 23240192, 'steps': 45390, 'batch_loss/train': 0.6406337291700765} 12/28/2021 22:35:26 - INFO - codeparrot_training - Step 45391: {'lr': 1.0482621999593767e-05, 'samples': 23240704, 'steps': 45391, 'batch_loss/train': 0.7010802970035002} 12/28/2021 22:35:37 - INFO - codeparrot_training - Step 45392: {'lr': 1.0478105772181417e-05, 'samples': 23241216, 'steps': 45392, 'batch_loss/train': 0.6785030226456001} 12/28/2021 22:35:47 - INFO - codeparrot_training - Step 45393: {'lr': 1.0473590497017737e-05, 'samples': 23241728, 'steps': 45393, 'batch_loss/train': 0.6706264442764223} 12/28/2021 22:35:58 
- INFO - codeparrot_training - Step 45394: {'lr': 1.0469076174120628e-05, 'samples': 23242240, 'steps': 45394, 'batch_loss/train': 0.7101914626546204} 12/28/2021 22:36:10 - INFO - codeparrot_training - Step 45395: {'lr': 1.0464562803507992e-05, 'samples': 23242752, 'steps': 45395, 'batch_loss/train': 0.7095421929843724} 12/28/2021 22:36:21 - INFO - codeparrot_training - Step 45396: {'lr': 1.0460050385197816e-05, 'samples': 23243264, 'steps': 45396, 'batch_loss/train': 0.7088321119081229} 12/28/2021 22:36:31 - INFO - codeparrot_training - Step 45397: {'lr': 1.0455538919208029e-05, 'samples': 23243776, 'steps': 45397, 'batch_loss/train': 0.6953708860091865} 12/28/2021 22:36:45 - INFO - codeparrot_training - Step 45398: {'lr': 1.045102840555659e-05, 'samples': 23244288, 'steps': 45398, 'batch_loss/train': 0.6995573155581951} 12/28/2021 22:36:56 - INFO - codeparrot_training - Step 45399: {'lr': 1.044651884426137e-05, 'samples': 23244800, 'steps': 45399, 'batch_loss/train': 0.7807993814349174} 12/28/2021 22:37:07 - INFO - codeparrot_training - Step 45400: {'lr': 1.0442010235340388e-05, 'samples': 23245312, 'steps': 45400, 'batch_loss/train': 0.6536472500301898} 12/28/2021 22:37:19 - INFO - codeparrot_training - Step 45401: {'lr': 1.0437502578811486e-05, 'samples': 23245824, 'steps': 45401, 'batch_loss/train': 0.7597985593602061} 12/28/2021 22:37:29 - INFO - codeparrot_training - Step 45402: {'lr': 1.0432995874692625e-05, 'samples': 23246336, 'steps': 45402, 'batch_loss/train': 0.7172085456550121} 12/28/2021 22:37:40 - INFO - codeparrot_training - Step 45403: {'lr': 1.042849012300176e-05, 'samples': 23246848, 'steps': 45403, 'batch_loss/train': 0.7548709074035287} 12/28/2021 22:37:52 - INFO - codeparrot_training - Step 45404: {'lr': 1.0423985323756685e-05, 'samples': 23247360, 'steps': 45404, 'batch_loss/train': 0.6475929458392784} 12/28/2021 22:38:03 - INFO - codeparrot_training - Step 45405: {'lr': 1.0419481476975439e-05, 'samples': 23247872, 'steps': 45405, 
'batch_loss/train': 0.6416481880005449} 12/28/2021 22:38:13 - INFO - codeparrot_training - Step 45406: {'lr': 1.0414978582675871e-05, 'samples': 23248384, 'steps': 45406, 'batch_loss/train': 0.8667937167920172} 12/28/2021 22:38:24 - INFO - codeparrot_training - Step 45407: {'lr': 1.0410476640875827e-05, 'samples': 23248896, 'steps': 45407, 'batch_loss/train': 0.7248216252774} 12/28/2021 22:38:38 - INFO - codeparrot_training - Step 45408: {'lr': 1.0405975651593318e-05, 'samples': 23249408, 'steps': 45408, 'batch_loss/train': 0.7124078895431012} 12/28/2021 22:38:49 - INFO - codeparrot_training - Step 45409: {'lr': 1.0401475614846168e-05, 'samples': 23249920, 'steps': 45409, 'batch_loss/train': 0.7964088493026793} 12/28/2021 22:38:59 - INFO - codeparrot_training - Step 45410: {'lr': 1.0396976530652274e-05, 'samples': 23250432, 'steps': 45410, 'batch_loss/train': 0.6339516146108508} 12/28/2021 22:39:11 - INFO - codeparrot_training - Step 45411: {'lr': 1.0392478399029543e-05, 'samples': 23250944, 'steps': 45411, 'batch_loss/train': 0.7757703238166869} 12/28/2021 22:39:22 - INFO - codeparrot_training - Step 45412: {'lr': 1.0387981219995845e-05, 'samples': 23251456, 'steps': 45412, 'batch_loss/train': 0.6825442593544722} 12/28/2021 22:39:33 - INFO - codeparrot_training - Step 45413: {'lr': 1.0383484993569059e-05, 'samples': 23251968, 'steps': 45413, 'batch_loss/train': 0.7136816214770079} 12/28/2021 22:39:45 - INFO - codeparrot_training - Step 45414: {'lr': 1.0378989719767084e-05, 'samples': 23252480, 'steps': 45414, 'batch_loss/train': 0.6821950803278014} 12/28/2021 22:39:55 - INFO - codeparrot_training - Step 45415: {'lr': 1.0374495398607715e-05, 'samples': 23252992, 'steps': 45415, 'batch_loss/train': 0.7597128306515515} 12/28/2021 22:40:06 - INFO - codeparrot_training - Step 45416: {'lr': 1.0370002030108877e-05, 'samples': 23253504, 'steps': 45416, 'batch_loss/train': 0.7166423702146858} 12/28/2021 22:40:17 - INFO - codeparrot_training - Step 45417: {'lr': 
1.0365509614288504e-05, 'samples': 23254016, 'steps': 45417, 'batch_loss/train': 0.6693125630263239} 12/28/2021 22:40:30 - INFO - codeparrot_training - Step 45418: {'lr': 1.0361018151164275e-05, 'samples': 23254528, 'steps': 45418, 'batch_loss/train': 0.7281099520623684} 12/28/2021 22:40:41 - INFO - codeparrot_training - Step 45419: {'lr': 1.0356527640754231e-05, 'samples': 23255040, 'steps': 45419, 'batch_loss/train': 0.8884070133790374} 12/28/2021 22:40:51 - INFO - codeparrot_training - Step 45420: {'lr': 1.0352038083076137e-05, 'samples': 23255552, 'steps': 45420, 'batch_loss/train': 1.2798552359454334} 12/28/2021 22:41:05 - INFO - codeparrot_training - Step 45421: {'lr': 1.0347549478147838e-05, 'samples': 23256064, 'steps': 45421, 'batch_loss/train': 0.6670440074522048} 12/28/2021 22:41:16 - INFO - codeparrot_training - Step 45422: {'lr': 1.034306182598721e-05, 'samples': 23256576, 'steps': 45422, 'batch_loss/train': 0.7143973768688738} 12/28/2021 22:41:26 - INFO - codeparrot_training - Step 45423: {'lr': 1.0338575126612043e-05, 'samples': 23257088, 'steps': 45423, 'batch_loss/train': 0.6840759888291359} 12/28/2021 22:41:37 - INFO - codeparrot_training - Step 45424: {'lr': 1.033408938004024e-05, 'samples': 23257600, 'steps': 45424, 'batch_loss/train': 0.7132362159900367} 12/28/2021 22:41:49 - INFO - codeparrot_training - Step 45425: {'lr': 1.0329604586289593e-05, 'samples': 23258112, 'steps': 45425, 'batch_loss/train': 0.7773328577168286} 12/28/2021 22:42:00 - INFO - codeparrot_training - Step 45426: {'lr': 1.0325120745377947e-05, 'samples': 23258624, 'steps': 45426, 'batch_loss/train': 0.7127693556249142} 12/28/2021 22:42:10 - INFO - codeparrot_training - Step 45427: {'lr': 1.0320637857323095e-05, 'samples': 23259136, 'steps': 45427, 'batch_loss/train': 0.6564622209407389} 12/28/2021 22:42:24 - INFO - codeparrot_training - Step 45428: {'lr': 1.0316155922142911e-05, 'samples': 23259648, 'steps': 45428, 'batch_loss/train': 0.756042345194146} 12/28/2021 22:42:35 
- INFO - codeparrot_training - Step 45429: {'lr': 1.0311674939855187e-05, 'samples': 23260160, 'steps': 45429, 'batch_loss/train': 0.573965908784885} 12/28/2021 22:42:45 - INFO - codeparrot_training - Step 45430: {'lr': 1.0307194910477742e-05, 'samples': 23260672, 'steps': 45430, 'batch_loss/train': 0.6665816507302225} 12/28/2021 22:42:58 - INFO - codeparrot_training - Step 45431: {'lr': 1.030271583402842e-05, 'samples': 23261184, 'steps': 45431, 'batch_loss/train': 0.7720927456393838} 12/28/2021 22:43:08 - INFO - codeparrot_training - Step 45432: {'lr': 1.029823771052496e-05, 'samples': 23261696, 'steps': 45432, 'batch_loss/train': 0.7306536943651736} 12/28/2021 22:43:19 - INFO - codeparrot_training - Step 45433: {'lr': 1.0293760539985181e-05, 'samples': 23262208, 'steps': 45433, 'batch_loss/train': 0.7406465304084122} 12/28/2021 22:43:31 - INFO - codeparrot_training - Step 45434: {'lr': 1.0289284322426928e-05, 'samples': 23262720, 'steps': 45434, 'batch_loss/train': 0.7493166914209723} 12/28/2021 22:43:42 - INFO - codeparrot_training - Step 45435: {'lr': 1.0284809057867966e-05, 'samples': 23263232, 'steps': 45435, 'batch_loss/train': 0.6817869660444558} 12/28/2021 22:43:52 - INFO - codeparrot_training - Step 45436: {'lr': 1.0280334746326087e-05, 'samples': 23263744, 'steps': 45436, 'batch_loss/train': 0.6671408917754889} 12/28/2021 22:44:03 - INFO - codeparrot_training - Step 45437: {'lr': 1.0275861387819079e-05, 'samples': 23264256, 'steps': 45437, 'batch_loss/train': 0.7080861581489444} 12/28/2021 22:44:17 - INFO - codeparrot_training - Step 45438: {'lr': 1.027138898236471e-05, 'samples': 23264768, 'steps': 45438, 'batch_loss/train': 0.7339762854389846} 12/28/2021 22:44:27 - INFO - codeparrot_training - Step 45439: {'lr': 1.0266917529980824e-05, 'samples': 23265280, 'steps': 45439, 'batch_loss/train': 0.6934661772102118} 12/28/2021 22:44:38 - INFO - codeparrot_training - Step 45440: {'lr': 1.0262447030685156e-05, 'samples': 23265792, 'steps': 45440, 
'batch_loss/train': 0.7149080624803901} 12/28/2021 22:44:50 - INFO - codeparrot_training - Step 45441: {'lr': 1.0257977484495418e-05, 'samples': 23266304, 'steps': 45441, 'batch_loss/train': 0.6664924152137246} 12/28/2021 22:45:01 - INFO - codeparrot_training - Step 45442: {'lr': 1.0253508891429452e-05, 'samples': 23266816, 'steps': 45442, 'batch_loss/train': 0.7770499708130956} 12/28/2021 22:45:11 - INFO - codeparrot_training - Step 45443: {'lr': 1.0249041251505082e-05, 'samples': 23267328, 'steps': 45443, 'batch_loss/train': 0.6752933119423687} 12/28/2021 22:45:23 - INFO - codeparrot_training - Step 45444: {'lr': 1.02445745647399e-05, 'samples': 23267840, 'steps': 45444, 'batch_loss/train': 0.8031979193910956} 12/28/2021 22:45:34 - INFO - codeparrot_training - Step 45445: {'lr': 1.0240108831151812e-05, 'samples': 23268352, 'steps': 45445, 'batch_loss/train': 0.6415101177990437} 12/28/2021 22:45:45 - INFO - codeparrot_training - Step 45446: {'lr': 1.0235644050758552e-05, 'samples': 23268864, 'steps': 45446, 'batch_loss/train': 0.6904356991872191} 12/28/2021 22:45:55 - INFO - codeparrot_training - Step 45447: {'lr': 1.0231180223577802e-05, 'samples': 23269376, 'steps': 45447, 'batch_loss/train': 0.6995802614837885} 12/28/2021 22:46:09 - INFO - codeparrot_training - Step 45448: {'lr': 1.0226717349627323e-05, 'samples': 23269888, 'steps': 45448, 'batch_loss/train': 0.8386504200752825} 12/28/2021 22:46:20 - INFO - codeparrot_training - Step 45449: {'lr': 1.022225542892491e-05, 'samples': 23270400, 'steps': 45449, 'batch_loss/train': 0.6644402893725783} 12/28/2021 22:46:31 - INFO - codeparrot_training - Step 45450: {'lr': 1.0217794461488267e-05, 'samples': 23270912, 'steps': 45450, 'batch_loss/train': 0.7123370412737131} 12/28/2021 22:46:43 - INFO - codeparrot_training - Step 45451: {'lr': 1.0213334447335132e-05, 'samples': 23271424, 'steps': 45451, 'batch_loss/train': 0.6605602707713842} 12/28/2021 22:46:54 - INFO - codeparrot_training - Step 45452: {'lr': 
1.0208875386483268e-05, 'samples': 23271936, 'steps': 45452, 'batch_loss/train': 0.8594098784960806} 12/28/2021 22:47:04 - INFO - codeparrot_training - Step 45453: {'lr': 1.0204417278950328e-05, 'samples': 23272448, 'steps': 45453, 'batch_loss/train': 0.8014468904584646} 12/28/2021 22:47:16 - INFO - codeparrot_training - Step 45454: {'lr': 1.0199960124754104e-05, 'samples': 23272960, 'steps': 45454, 'batch_loss/train': 0.5376245186198503} 12/28/2021 22:47:27 - INFO - codeparrot_training - Step 45455: {'lr': 1.0195503923912303e-05, 'samples': 23273472, 'steps': 45455, 'batch_loss/train': 0.6523260503308848} 12/28/2021 22:47:38 - INFO - codeparrot_training - Step 45456: {'lr': 1.0191048676442605e-05, 'samples': 23273984, 'steps': 45456, 'batch_loss/train': 0.7949811797589064} 12/28/2021 22:47:51 - INFO - codeparrot_training - Step 45457: {'lr': 1.0186594382362802e-05, 'samples': 23274496, 'steps': 45457, 'batch_loss/train': 0.7885586069896817} 12/28/2021 22:48:02 - INFO - codeparrot_training - Step 45458: {'lr': 1.0182141041690519e-05, 'samples': 23275008, 'steps': 45458, 'batch_loss/train': 0.712237901519984} 12/28/2021 22:48:13 - INFO - codeparrot_training - Step 45459: {'lr': 1.0177688654443462e-05, 'samples': 23275520, 'steps': 45459, 'batch_loss/train': 0.685056199785322} 12/28/2021 22:48:23 - INFO - codeparrot_training - Step 45460: {'lr': 1.0173237220639398e-05, 'samples': 23276032, 'steps': 45460, 'batch_loss/train': 0.7505501061677933} 12/28/2021 22:48:36 - INFO - codeparrot_training - Step 45461: {'lr': 1.0168786740295976e-05, 'samples': 23276544, 'steps': 45461, 'batch_loss/train': 0.6024831961840391} 12/28/2021 22:48:46 - INFO - codeparrot_training - Step 45462: {'lr': 1.0164337213430908e-05, 'samples': 23277056, 'steps': 45462, 'batch_loss/train': 0.7912550941109657} 12/28/2021 22:48:57 - INFO - codeparrot_training - Step 45463: {'lr': 1.0159888640061871e-05, 'samples': 23277568, 'steps': 45463, 'batch_loss/train': 0.6827211594209075} 12/28/2021 22:49:09 
- INFO - codeparrot_training - Step 45464: {'lr': 1.0155441020206547e-05, 'samples': 23278080, 'steps': 45464, 'batch_loss/train': 0.7907172059640288} 12/28/2021 22:49:19 - INFO - codeparrot_training - Step 45465: {'lr': 1.0150994353882642e-05, 'samples': 23278592, 'steps': 45465, 'batch_loss/train': 0.7264383211731911} 12/28/2021 22:49:30 - INFO - codeparrot_training - Step 45466: {'lr': 1.0146548641107811e-05, 'samples': 23279104, 'steps': 45466, 'batch_loss/train': 0.6673216689378023} 12/28/2021 22:49:44 - INFO - codeparrot_training - Step 45467: {'lr': 1.0142103881899734e-05, 'samples': 23279616, 'steps': 45467, 'batch_loss/train': 0.6635509072802961} 12/28/2021 22:49:55 - INFO - codeparrot_training - Step 45468: {'lr': 1.013766007627609e-05, 'samples': 23280128, 'steps': 45468, 'batch_loss/train': 0.7011138265952468} 12/28/2021 22:50:06 - INFO - codeparrot_training - Step 45469: {'lr': 1.0133217224254588e-05, 'samples': 23280640, 'steps': 45469, 'batch_loss/train': 0.724808621685952} 12/28/2021 22:50:16 - INFO - codeparrot_training - Step 45470: {'lr': 1.0128775325852769e-05, 'samples': 23281152, 'steps': 45470, 'batch_loss/train': 0.7891314532607794} 12/28/2021 22:50:28 - INFO - codeparrot_training - Step 45471: {'lr': 1.0124334381088395e-05, 'samples': 23281664, 'steps': 45471, 'batch_loss/train': 0.8664060514420271} 12/28/2021 22:50:39 - INFO - codeparrot_training - Step 45472: {'lr': 1.0119894389979124e-05, 'samples': 23282176, 'steps': 45472, 'batch_loss/train': 0.684996189083904} 12/28/2021 22:50:49 - INFO - codeparrot_training - Step 45473: {'lr': 1.0115455352542546e-05, 'samples': 23282688, 'steps': 45473, 'batch_loss/train': 0.7540377981495112} 12/28/2021 22:51:02 - INFO - codeparrot_training - Step 45474: {'lr': 1.011101726879629e-05, 'samples': 23283200, 'steps': 45474, 'batch_loss/train': 0.7005619881674647} 12/28/2021 22:51:12 - INFO - codeparrot_training - Step 45475: {'lr': 1.0106580138758148e-05, 'samples': 23283712, 'steps': 45475, 
'batch_loss/train': 0.6136932764202356} 12/28/2021 22:51:23 - INFO - codeparrot_training - Step 45476: {'lr': 1.0102143962445632e-05, 'samples': 23284224, 'steps': 45476, 'batch_loss/train': 0.7534158984199166} 12/28/2021 22:51:37 - INFO - codeparrot_training - Step 45477: {'lr': 1.0097708739876393e-05, 'samples': 23284736, 'steps': 45477, 'batch_loss/train': 0.6736611728556454} 12/28/2021 22:51:47 - INFO - codeparrot_training - Step 45478: {'lr': 1.0093274471068086e-05, 'samples': 23285248, 'steps': 45478, 'batch_loss/train': 0.6697295429185033} 12/28/2021 22:51:58 - INFO - codeparrot_training - Step 45479: {'lr': 1.0088841156038337e-05, 'samples': 23285760, 'steps': 45479, 'batch_loss/train': 0.5048873771447688} 12/28/2021 22:52:10 - INFO - codeparrot_training - Step 45480: {'lr': 1.008440879480474e-05, 'samples': 23286272, 'steps': 45480, 'batch_loss/train': 0.7290785266086459} 12/28/2021 22:52:21 - INFO - codeparrot_training - Step 45481: {'lr': 1.0079977387384976e-05, 'samples': 23286784, 'steps': 45481, 'batch_loss/train': 0.668680329923518} 12/28/2021 22:52:31 - INFO - codeparrot_training - Step 45482: {'lr': 1.0075546933796642e-05, 'samples': 23287296, 'steps': 45482, 'batch_loss/train': 0.718552746810019} 12/28/2021 22:52:42 - INFO - codeparrot_training - Step 45483: {'lr': 1.0071117434057308e-05, 'samples': 23287808, 'steps': 45483, 'batch_loss/train': 0.6637555789202452} 12/28/2021 22:52:56 - INFO - codeparrot_training - Step 45484: {'lr': 1.0066688888184656e-05, 'samples': 23288320, 'steps': 45484, 'batch_loss/train': 0.7618849165737629} 12/28/2021 22:53:07 - INFO - codeparrot_training - Step 45485: {'lr': 1.0062261296196224e-05, 'samples': 23288832, 'steps': 45485, 'batch_loss/train': 0.642767951823771} 12/28/2021 22:53:17 - INFO - codeparrot_training - Step 45486: {'lr': 1.0057834658109665e-05, 'samples': 23289344, 'steps': 45486, 'batch_loss/train': 0.678557604085654} 12/28/2021 22:53:29 - INFO - codeparrot_training - Step 45487: {'lr': 
1.0053408973942552e-05, 'samples': 23289856, 'steps': 45487, 'batch_loss/train': 0.7388386139646173} 12/28/2021 22:53:40 - INFO - codeparrot_training - Step 45488: {'lr': 1.0048984243712478e-05, 'samples': 23290368, 'steps': 45488, 'batch_loss/train': 0.6861914866603911} 12/28/2021 22:53:51 - INFO - codeparrot_training - Step 45489: {'lr': 1.0044560467437042e-05, 'samples': 23290880, 'steps': 45489, 'batch_loss/train': 0.7232690728269517} 12/28/2021 22:54:03 - INFO - codeparrot_training - Step 45490: {'lr': 1.0040137645133812e-05, 'samples': 23291392, 'steps': 45490, 'batch_loss/train': 0.6866155387833714} 12/28/2021 22:54:14 - INFO - codeparrot_training - Step 45491: {'lr': 1.0035715776820415e-05, 'samples': 23291904, 'steps': 45491, 'batch_loss/train': 0.6898050936870277} 12/28/2021 22:54:24 - INFO - codeparrot_training - Step 45492: {'lr': 1.0031294862514389e-05, 'samples': 23292416, 'steps': 45492, 'batch_loss/train': 0.6065362724475563} 12/28/2021 22:54:35 - INFO - codeparrot_training - Step 45493: {'lr': 1.0026874902233335e-05, 'samples': 23292928, 'steps': 45493, 'batch_loss/train': 0.6946635388303548} 12/28/2021 22:54:47 - INFO - codeparrot_training - Step 45494: {'lr': 1.0022455895994819e-05, 'samples': 23293440, 'steps': 45494, 'batch_loss/train': 0.5808107976336032} 12/28/2021 22:54:58 - INFO - codeparrot_training - Step 45495: {'lr': 1.0018037843816413e-05, 'samples': 23293952, 'steps': 45495, 'batch_loss/train': 0.6844837171956897} 12/28/2021 22:55:08 - INFO - codeparrot_training - Step 45496: {'lr': 1.001362074571563e-05, 'samples': 23294464, 'steps': 45496, 'batch_loss/train': 0.5810388496611267} 12/28/2021 22:55:22 - INFO - codeparrot_training - Step 45497: {'lr': 1.000920460171012e-05, 'samples': 23294976, 'steps': 45497, 'batch_loss/train': 0.6670199800282717} 12/28/2021 22:55:33 - INFO - codeparrot_training - Step 45498: {'lr': 1.00047894118174e-05, 'samples': 23295488, 'steps': 45498, 'batch_loss/train': 0.683261682454031} 12/28/2021 22:55:43 - 
INFO - codeparrot_training - Step 45499: {'lr': 1.000037517605501e-05, 'samples': 23296000, 'steps': 45499, 'batch_loss/train': 0.6006805319339037} 12/28/2021 22:55:56 - INFO - codeparrot_training - Step 45500: {'lr': 9.995961894440464e-06, 'samples': 23296512, 'steps': 45500, 'batch_loss/train': 0.7023093798197806} 12/28/2021 22:56:06 - INFO - codeparrot_training - Step 45501: {'lr': 9.991549566991415e-06, 'samples': 23297024, 'steps': 45501, 'batch_loss/train': 0.5977705442346632} 12/28/2021 22:56:17 - INFO - codeparrot_training - Step 45502: {'lr': 9.98713819372532e-06, 'samples': 23297536, 'steps': 45502, 'batch_loss/train': 0.7653553104028106} 12/28/2021 22:56:29 - INFO - codeparrot_training - Step 45503: {'lr': 9.982727774659722e-06, 'samples': 23298048, 'steps': 45503, 'batch_loss/train': 0.7220573916565627} 12/28/2021 22:56:40 - INFO - codeparrot_training - Step 45504: {'lr': 9.97831830981219e-06, 'samples': 23298560, 'steps': 45504, 'batch_loss/train': 0.7300571193918586} 12/28/2021 22:56:50 - INFO - codeparrot_training - Step 45505: {'lr': 9.973909799200237e-06, 'samples': 23299072, 'steps': 45505, 'batch_loss/train': 0.7879718937911093} 12/28/2021 22:57:01 - INFO - codeparrot_training - Step 45506: {'lr': 9.969502242841377e-06, 'samples': 23299584, 'steps': 45506, 'batch_loss/train': 0.7479194137267768} 12/28/2021 22:57:15 - INFO - codeparrot_training - Step 45507: {'lr': 9.965095640753125e-06, 'samples': 23300096, 'steps': 45507, 'batch_loss/train': 0.7883351603522897} 12/28/2021 22:57:25 - INFO - codeparrot_training - Step 45508: {'lr': 9.960689992953049e-06, 'samples': 23300608, 'steps': 45508, 'batch_loss/train': 0.7467877026647329} 12/28/2021 22:57:36 - INFO - codeparrot_training - Step 45509: {'lr': 9.956285299458606e-06, 'samples': 23301120, 'steps': 45509, 'batch_loss/train': 0.7029042076319456} 12/28/2021 22:57:48 - INFO - codeparrot_training - Step 45510: {'lr': 9.951881560287395e-06, 'samples': 23301632, 'steps': 45510, 'batch_loss/train': 
0.6696406435221434} 12/28/2021 22:57:59 - INFO - codeparrot_training - Step 45511: {'lr': 9.947478775456792e-06, 'samples': 23302144, 'steps': 45511, 'batch_loss/train': 0.6876383516937494} 12/28/2021 22:58:09 - INFO - codeparrot_training - Step 45512: {'lr': 9.943076944984448e-06, 'samples': 23302656, 'steps': 45512, 'batch_loss/train': 0.8769966941326857} 12/28/2021 22:58:23 - INFO - codeparrot_training - Step 45513: {'lr': 9.938676068887736e-06, 'samples': 23303168, 'steps': 45513, 'batch_loss/train': 0.6930886174086481} 12/28/2021 22:58:34 - INFO - codeparrot_training - Step 45514: {'lr': 9.9342761471842e-06, 'samples': 23303680, 'steps': 45514, 'batch_loss/train': 0.4847627100534737} 12/28/2021 22:58:45 - INFO - codeparrot_training - Step 45515: {'lr': 9.929877179891356e-06, 'samples': 23304192, 'steps': 45515, 'batch_loss/train': 0.7519457284361124} 12/28/2021 22:58:55 - INFO - codeparrot_training - Step 45516: {'lr': 9.925479167026658e-06, 'samples': 23304704, 'steps': 45516, 'batch_loss/train': 0.6979496772401035} 12/28/2021 22:59:07 - INFO - codeparrot_training - Step 45517: {'lr': 9.921082108607621e-06, 'samples': 23305216, 'steps': 45517, 'batch_loss/train': 0.6892920737154782} 12/28/2021 22:59:18 - INFO - codeparrot_training - Step 45518: {'lr': 9.916686004651705e-06, 'samples': 23305728, 'steps': 45518, 'batch_loss/train': 0.7359628556296229} 12/28/2021 22:59:29 - INFO - codeparrot_training - Step 45519: {'lr': 9.912290855176392e-06, 'samples': 23306240, 'steps': 45519, 'batch_loss/train': 0.7794537162408233} 12/28/2021 22:59:41 - INFO - codeparrot_training - Step 45520: {'lr': 9.907896660199173e-06, 'samples': 23306752, 'steps': 45520, 'batch_loss/train': 0.7599857272580266} 12/28/2021 22:59:52 - INFO - codeparrot_training - Step 45521: {'lr': 9.903503419737503e-06, 'samples': 23307264, 'steps': 45521, 'batch_loss/train': 0.789836535230279} 12/28/2021 23:00:02 - INFO - codeparrot_training - Step 45522: {'lr': 9.899111133808813e-06, 'samples': 
23307776, 'steps': 45522, 'batch_loss/train': 0.6907774759456515} 12/28/2021 23:00:16 - INFO - codeparrot_training - Step 45523: {'lr': 9.894719802430619e-06, 'samples': 23308288, 'steps': 45523, 'batch_loss/train': 0.6190168666653335} 12/28/2021 23:00:27 - INFO - codeparrot_training - Step 45524: {'lr': 9.890329425620377e-06, 'samples': 23308800, 'steps': 45524, 'batch_loss/train': 0.6864514751941897} 12/28/2021 23:00:37 - INFO - codeparrot_training - Step 45525: {'lr': 9.885940003395488e-06, 'samples': 23309312, 'steps': 45525, 'batch_loss/train': 0.7505368441343307} 12/28/2021 23:00:49 - INFO - codeparrot_training - Step 45526: {'lr': 9.881551535773441e-06, 'samples': 23309824, 'steps': 45526, 'batch_loss/train': 0.7795694340020418} 12/28/2021 23:01:00 - INFO - codeparrot_training - Step 45527: {'lr': 9.877164022771723e-06, 'samples': 23310336, 'steps': 45527, 'batch_loss/train': 0.6768879345618188} 12/28/2021 23:01:11 - INFO - codeparrot_training - Step 45528: {'lr': 9.872777464407706e-06, 'samples': 23310848, 'steps': 45528, 'batch_loss/train': 0.7058514383388683} 12/28/2021 23:01:21 - INFO - codeparrot_training - Step 45529: {'lr': 9.868391860698822e-06, 'samples': 23311360, 'steps': 45529, 'batch_loss/train': 0.727812509983778} 12/28/2021 23:01:34 - INFO - codeparrot_training - Step 45530: {'lr': 9.864007211662585e-06, 'samples': 23311872, 'steps': 45530, 'batch_loss/train': 0.7431049030274153} 12/28/2021 23:01:44 - INFO - codeparrot_training - Step 45531: {'lr': 9.859623517316368e-06, 'samples': 23312384, 'steps': 45531, 'batch_loss/train': 0.7349691314157099} 12/28/2021 23:01:55 - INFO - codeparrot_training - Step 45532: {'lr': 9.855240777677632e-06, 'samples': 23312896, 'steps': 45532, 'batch_loss/train': 0.6816644514910877} 12/28/2021 23:02:07 - INFO - codeparrot_training - Step 45533: {'lr': 9.85085899276375e-06, 'samples': 23313408, 'steps': 45533, 'batch_loss/train': 0.8163027800619602} 12/28/2021 23:02:18 - INFO - codeparrot_training - Step 45534: 
{'lr': 9.846478162592209e-06, 'samples': 23313920, 'steps': 45534, 'batch_loss/train': 0.6984525746665895} 12/28/2021 23:02:28 - INFO - codeparrot_training - Step 45535: {'lr': 9.842098287180385e-06, 'samples': 23314432, 'steps': 45535, 'batch_loss/train': 0.7291670325212181} 12/28/2021 23:02:42 - INFO - codeparrot_training - Step 45536: {'lr': 9.837719366545706e-06, 'samples': 23314944, 'steps': 45536, 'batch_loss/train': 0.7341048792004585} 12/28/2021 23:02:53 - INFO - codeparrot_training - Step 45537: {'lr': 9.833341400705548e-06, 'samples': 23315456, 'steps': 45537, 'batch_loss/train': 0.6475710591766983} 12/28/2021 23:03:04 - INFO - codeparrot_training - Step 45538: {'lr': 9.82896438967737e-06, 'samples': 23315968, 'steps': 45538, 'batch_loss/train': 0.752103746519424} 12/28/2021 23:03:14 - INFO - codeparrot_training - Step 45539: {'lr': 9.824588333478518e-06, 'samples': 23316480, 'steps': 45539, 'batch_loss/train': 0.7246462611947209} 12/28/2021 23:03:26 - INFO - codeparrot_training - Step 45540: {'lr': 9.820213232126396e-06, 'samples': 23316992, 'steps': 45540, 'batch_loss/train': 0.7982499664649367} 12/28/2021 23:03:37 - INFO - codeparrot_training - Step 45541: {'lr': 9.815839085638489e-06, 'samples': 23317504, 'steps': 45541, 'batch_loss/train': 0.6693907547742128} 12/28/2021 23:03:48 - INFO - codeparrot_training - Step 45542: {'lr': 9.811465894032063e-06, 'samples': 23318016, 'steps': 45542, 'batch_loss/train': 0.7471036617644131} 12/28/2021 23:04:01 - INFO - codeparrot_training - Step 45543: {'lr': 9.807093657324573e-06, 'samples': 23318528, 'steps': 45543, 'batch_loss/train': 0.7052305545657873} 12/28/2021 23:04:12 - INFO - codeparrot_training - Step 45544: {'lr': 9.802722375533369e-06, 'samples': 23319040, 'steps': 45544, 'batch_loss/train': 0.6898606307804585} 12/28/2021 23:04:23 - INFO - codeparrot_training - Step 45545: {'lr': 9.79835204867588e-06, 'samples': 23319552, 'steps': 45545, 'batch_loss/train': 0.6797647168859839} 12/28/2021 23:04:35 - 
INFO - codeparrot_training - Step 45546: {'lr': 9.793982676769398e-06, 'samples': 23320064, 'steps': 45546, 'batch_loss/train': 0.5720435864059255} 12/28/2021 23:04:45 - INFO - codeparrot_training - Step 45547: {'lr': 9.789614259831409e-06, 'samples': 23320576, 'steps': 45547, 'batch_loss/train': 0.6984084469731897} 12/28/2021 23:04:56 - INFO - codeparrot_training - Step 45548: {'lr': 9.78524679787915e-06, 'samples': 23321088, 'steps': 45548, 'batch_loss/train': 0.837005041539669} 12/28/2021 23:05:07 - INFO - codeparrot_training - Step 45549: {'lr': 9.780880290930078e-06, 'samples': 23321600, 'steps': 45549, 'batch_loss/train': 0.6553143444471061} 12/28/2021 23:05:19 - INFO - codeparrot_training - Step 45550: {'lr': 9.77651473900154e-06, 'samples': 23322112, 'steps': 45550, 'batch_loss/train': 0.6965626794844866} 12/28/2021 23:05:29 - INFO - codeparrot_training - Step 45551: {'lr': 9.77215014211083e-06, 'samples': 23322624, 'steps': 45551, 'batch_loss/train': 0.6569881543400697} 12/28/2021 23:05:40 - INFO - codeparrot_training - Step 45552: {'lr': 9.76778650027535e-06, 'samples': 23323136, 'steps': 45552, 'batch_loss/train': 0.6927640987560153} 12/28/2021 23:05:54 - INFO - codeparrot_training - Step 45553: {'lr': 9.763423813512474e-06, 'samples': 23323648, 'steps': 45553, 'batch_loss/train': 0.7228050292469561} 12/28/2021 23:06:04 - INFO - codeparrot_training - Step 45554: {'lr': 9.759062081839521e-06, 'samples': 23324160, 'steps': 45554, 'batch_loss/train': 0.6875400384888053} 12/28/2021 23:06:15 - INFO - codeparrot_training - Step 45555: {'lr': 9.754701305273755e-06, 'samples': 23324672, 'steps': 45555, 'batch_loss/train': 0.6838415465317667} 12/28/2021 23:06:27 - INFO - codeparrot_training - Step 45556: {'lr': 9.750341483832664e-06, 'samples': 23325184, 'steps': 45556, 'batch_loss/train': 0.6506529079051688} 12/28/2021 23:06:38 - INFO - codeparrot_training - Step 45557: {'lr': 9.745982617533455e-06, 'samples': 23325696, 'steps': 45557, 'batch_loss/train': 
0.7581533886259422} 12/28/2021 23:06:49 - INFO - codeparrot_training - Step 45558: {'lr': 9.741624706393532e-06, 'samples': 23326208, 'steps': 45558, 'batch_loss/train': 0.7722725071944296} 12/28/2021 23:07:00 - INFO - codeparrot_training - Step 45559: {'lr': 9.737267750430184e-06, 'samples': 23326720, 'steps': 45559, 'batch_loss/train': 0.6869260161183774} 12/28/2021 23:07:11 - INFO - codeparrot_training - Step 45560: {'lr': 9.732911749660706e-06, 'samples': 23327232, 'steps': 45560, 'batch_loss/train': 0.6747195445932448} 12/28/2021 23:07:22 - INFO - codeparrot_training - Step 45561: {'lr': 9.728556704102499e-06, 'samples': 23327744, 'steps': 45561, 'batch_loss/train': 0.7880747229792178} 12/28/2021 23:07:32 - INFO - codeparrot_training - Step 45562: {'lr': 9.724202613772826e-06, 'samples': 23328256, 'steps': 45562, 'batch_loss/train': 0.6612762766890228} 12/28/2021 23:07:47 - INFO - codeparrot_training - Step 45563: {'lr': 9.719849478688924e-06, 'samples': 23328768, 'steps': 45563, 'batch_loss/train': 0.6562710027210414} 12/28/2021 23:07:57 - INFO - codeparrot_training - Step 45564: {'lr': 9.715497298868253e-06, 'samples': 23329280, 'steps': 45564, 'batch_loss/train': 0.6951845221919939} 12/28/2021 23:08:08 - INFO - codeparrot_training - Step 45565: {'lr': 9.71114607432802e-06, 'samples': 23329792, 'steps': 45565, 'batch_loss/train': 0.6834589773206972} 12/28/2021 23:08:20 - INFO - codeparrot_training - Step 45566: {'lr': 9.706795805085517e-06, 'samples': 23330304, 'steps': 45566, 'batch_loss/train': 0.7661949424655177} 12/28/2021 23:08:31 - INFO - codeparrot_training - Step 45567: {'lr': 9.70244649115809e-06, 'samples': 23330816, 'steps': 45567, 'batch_loss/train': 0.6713146381080151} 12/28/2021 23:08:41 - INFO - codeparrot_training - Step 45568: {'lr': 9.698098132563005e-06, 'samples': 23331328, 'steps': 45568, 'batch_loss/train': 0.7176468030083925} 12/28/2021 23:08:54 - INFO - codeparrot_training - Step 45569: {'lr': 9.693750729317524e-06, 'samples': 
23331840, 'steps': 45569, 'batch_loss/train': 0.6570214661769569} 12/28/2021 23:09:05 - INFO - codeparrot_training - Step 45570: {'lr': 9.689404281438969e-06, 'samples': 23332352, 'steps': 45570, 'batch_loss/train': 0.6609097048640251} 12/28/2021 23:09:15 - INFO - codeparrot_training - Step 45571: {'lr': 9.685058788944573e-06, 'samples': 23332864, 'steps': 45571, 'batch_loss/train': 0.7261445648036897} 12/28/2021 23:09:26 - INFO - codeparrot_training - Step 45572: {'lr': 9.680714251851686e-06, 'samples': 23333376, 'steps': 45572, 'batch_loss/train': 0.7181580872274935} 12/28/2021 23:09:39 - INFO - codeparrot_training - Step 45573: {'lr': 9.676370670177515e-06, 'samples': 23333888, 'steps': 45573, 'batch_loss/train': 0.9232547031715512} 12/28/2021 23:09:50 - INFO - codeparrot_training - Step 45574: {'lr': 9.672028043939351e-06, 'samples': 23334400, 'steps': 45574, 'batch_loss/train': 0.6682717697694898} 12/28/2021 23:10:00 - INFO - codeparrot_training - Step 45575: {'lr': 9.66768637315446e-06, 'samples': 23334912, 'steps': 45575, 'batch_loss/train': 0.6588187073357403} 12/28/2021 23:10:12 - INFO - codeparrot_training - Step 45576: {'lr': 9.663345657840133e-06, 'samples': 23335424, 'steps': 45576, 'batch_loss/train': 0.7104833265766501} 12/28/2021 23:10:23 - INFO - codeparrot_training - Step 45577: {'lr': 9.65900589801355e-06, 'samples': 23335936, 'steps': 45577, 'batch_loss/train': 0.7809540778398514} 12/28/2021 23:10:34 - INFO - codeparrot_training - Step 45578: {'lr': 9.654667093692033e-06, 'samples': 23336448, 'steps': 45578, 'batch_loss/train': 0.9215349843725562} 12/28/2021 23:10:47 - INFO - codeparrot_training - Step 45579: {'lr': 9.650329244892842e-06, 'samples': 23336960, 'steps': 45579, 'batch_loss/train': 0.7398596131242812} 12/28/2021 23:10:57 - INFO - codeparrot_training - Step 45580: {'lr': 9.645992351633131e-06, 'samples': 23337472, 'steps': 45580, 'batch_loss/train': 0.7487538361456245} 12/28/2021 23:11:08 - INFO - codeparrot_training - Step 45581: 
{'lr': 9.641656413930194e-06, 'samples': 23337984, 'steps': 45581, 'batch_loss/train': 1.0956158265471458} 12/28/2021 23:11:19 - INFO - codeparrot_training - Step 45582: {'lr': 9.63732143180135e-06, 'samples': 23338496, 'steps': 45582, 'batch_loss/train': 0.826207589590922} 12/28/2021 23:11:32 - INFO - codeparrot_training - Step 45583: {'lr': 9.632987405263693e-06, 'samples': 23339008, 'steps': 45583, 'batch_loss/train': 0.703369417693466} 12/28/2021 23:11:43 - INFO - codeparrot_training - Step 45584: {'lr': 9.628654334334547e-06, 'samples': 23339520, 'steps': 45584, 'batch_loss/train': 0.6735156625509262} 12/28/2021 23:11:54 - INFO - codeparrot_training - Step 45585: {'lr': 9.624322219031118e-06, 'samples': 23340032, 'steps': 45585, 'batch_loss/train': 0.7027934701181948} 12/28/2021 23:12:06 - INFO - codeparrot_training - Step 45586: {'lr': 9.619991059370587e-06, 'samples': 23340544, 'steps': 45586, 'batch_loss/train': 0.7368344059213996} 12/28/2021 23:12:17 - INFO - codeparrot_training - Step 45587: {'lr': 9.615660855370246e-06, 'samples': 23341056, 'steps': 45587, 'batch_loss/train': 0.7069672015495598} 12/28/2021 23:12:27 - INFO - codeparrot_training - Step 45588: {'lr': 9.611331607047275e-06, 'samples': 23341568, 'steps': 45588, 'batch_loss/train': 0.7092901836149395} 12/28/2021 23:12:38 - INFO - codeparrot_training - Step 45589: {'lr': 9.607003314418828e-06, 'samples': 23342080, 'steps': 45589, 'batch_loss/train': 0.709690093062818} 12/28/2021 23:12:50 - INFO - codeparrot_training - Step 45590: {'lr': 9.602675977502195e-06, 'samples': 23342592, 'steps': 45590, 'batch_loss/train': 0.6534809279255569} 12/28/2021 23:13:00 - INFO - codeparrot_training - Step 45591: {'lr': 9.598349596314588e-06, 'samples': 23343104, 'steps': 45591, 'batch_loss/train': 0.6775838428875431} 12/28/2021 23:13:11 - INFO - codeparrot_training - Step 45592: {'lr': 9.5940241708731e-06, 'samples': 23343616, 'steps': 45592, 'batch_loss/train': 0.6050346570555121} 12/28/2021 23:13:25 - INFO - 
codeparrot_training - Step 45593: {'lr': 9.589699701195054e-06, 'samples': 23344128, 'steps': 45593, 'batch_loss/train': 0.6453629713505507} 12/28/2021 23:13:36 - INFO - codeparrot_training - Step 45594: {'lr': 9.585376187297573e-06, 'samples': 23344640, 'steps': 45594, 'batch_loss/train': 0.6578679555095732} 12/28/2021 23:13:46 - INFO - codeparrot_training - Step 45595: {'lr': 9.581053629197866e-06, 'samples': 23345152, 'steps': 45595, 'batch_loss/train': 0.6465652892366052} 12/28/2021 23:13:58 - INFO - codeparrot_training - Step 45596: {'lr': 9.576732026913088e-06, 'samples': 23345664, 'steps': 45596, 'batch_loss/train': 0.6881764810532331} 12/28/2021 23:14:09 - INFO - codeparrot_training - Step 45597: {'lr': 9.572411380460471e-06, 'samples': 23346176, 'steps': 45597, 'batch_loss/train': 0.841520281508565} 12/28/2021 23:14:20 - INFO - codeparrot_training - Step 45598: {'lr': 9.568091689857145e-06, 'samples': 23346688, 'steps': 45598, 'batch_loss/train': 0.8005507998168468} 12/28/2021 23:14:34 - INFO - codeparrot_training - Step 45599: {'lr': 9.563772955120314e-06, 'samples': 23347200, 'steps': 45599, 'batch_loss/train': 0.6975928822648712} 12/28/2021 23:14:44 - INFO - codeparrot_training - Step 45600: {'lr': 9.559455176267106e-06, 'samples': 23347712, 'steps': 45600, 'batch_loss/train': 0.7531780106946826} 12/28/2021 23:14:55 - INFO - codeparrot_training - Step 45601: {'lr': 9.555138353314757e-06, 'samples': 23348224, 'steps': 45601, 'batch_loss/train': 0.661419422365725} 12/28/2021 23:15:06 - INFO - codeparrot_training - Step 45602: {'lr': 9.550822486280392e-06, 'samples': 23348736, 'steps': 45602, 'batch_loss/train': 0.6822636825963855} 12/28/2021 23:15:18 - INFO - codeparrot_training - Step 45603: {'lr': 9.546507575181108e-06, 'samples': 23349248, 'steps': 45603, 'batch_loss/train': 0.9452994496095926} 12/28/2021 23:15:29 - INFO - codeparrot_training - Step 45604: {'lr': 9.542193620034168e-06, 'samples': 23349760, 'steps': 45604, 'batch_loss/train': 
0.6860243730479851} 12/28/2021 23:15:39 - INFO - codeparrot_training - Step 45605: {'lr': 9.537880620856698e-06, 'samples': 23350272, 'steps': 45605, 'batch_loss/train': 0.7597313541918993} 12/28/2021 23:15:51 - INFO - codeparrot_training - Step 45606: {'lr': 9.533568577665769e-06, 'samples': 23350784, 'steps': 45606, 'batch_loss/train': 0.7347802193835378} 12/28/2021 23:16:02 - INFO - codeparrot_training - Step 45607: {'lr': 9.52925749047856e-06, 'samples': 23351296, 'steps': 45607, 'batch_loss/train': 0.6418961966410279} 12/28/2021 23:16:13 - INFO - codeparrot_training - Step 45608: {'lr': 9.524947359312307e-06, 'samples': 23351808, 'steps': 45608, 'batch_loss/train': 0.8139610085636377} 12/28/2021 23:16:25 - INFO - codeparrot_training - Step 45609: {'lr': 9.520638184184e-06, 'samples': 23352320, 'steps': 45609, 'batch_loss/train': 0.6918171914294362} 12/28/2021 23:16:35 - INFO - codeparrot_training - Step 45610: {'lr': 9.516329965110843e-06, 'samples': 23352832, 'steps': 45610, 'batch_loss/train': 0.5782277407415677} 12/28/2021 23:16:46 - INFO - codeparrot_training - Step 45611: {'lr': 9.512022702109962e-06, 'samples': 23353344, 'steps': 45611, 'batch_loss/train': 0.6072481739684008} 12/28/2021 23:17:01 - INFO - codeparrot_training - Step 45612: {'lr': 9.507716395198457e-06, 'samples': 23353856, 'steps': 45612, 'batch_loss/train': 0.6238124272786081} 12/28/2021 23:17:12 - INFO - codeparrot_training - Step 45613: {'lr': 9.50341104439348e-06, 'samples': 23354368, 'steps': 45613, 'batch_loss/train': 0.6209913083002903} 12/28/2021 23:17:22 - INFO - codeparrot_training - Step 45614: {'lr': 9.499106649712124e-06, 'samples': 23354880, 'steps': 45614, 'batch_loss/train': 0.725394893437624} 12/28/2021 23:17:33 - INFO - codeparrot_training - Step 45615: {'lr': 9.494803211171522e-06, 'samples': 23355392, 'steps': 45615, 'batch_loss/train': 0.7576756123453379} 12/28/2021 23:17:45 - INFO - codeparrot_training - Step 45616: {'lr': 9.490500728788764e-06, 'samples': 23355904, 
'steps': 45616, 'batch_loss/train': 0.6423772820271552} 12/28/2021 23:17:56 - INFO - codeparrot_training - Step 45617: {'lr': 9.486199202581009e-06, 'samples': 23356416, 'steps': 45617, 'batch_loss/train': 0.7412867750972509} 12/28/2021 23:18:06 - INFO - codeparrot_training - Step 45618: {'lr': 9.48189863256524e-06, 'samples': 23356928, 'steps': 45618, 'batch_loss/train': 0.7458846243098378} 12/28/2021 23:18:20 - INFO - codeparrot_training - Step 45619: {'lr': 9.477599018758637e-06, 'samples': 23357440, 'steps': 45619, 'batch_loss/train': 0.6256932428805158} 12/28/2021 23:18:31 - INFO - codeparrot_training - Step 45620: {'lr': 9.473300361178328e-06, 'samples': 23357952, 'steps': 45620, 'batch_loss/train': 0.7405675202608109} 12/28/2021 23:18:41 - INFO - codeparrot_training - Step 45621: {'lr': 9.469002659841325e-06, 'samples': 23358464, 'steps': 45621, 'batch_loss/train': 0.6912796157412231} 12/28/2021 23:18:54 - INFO - codeparrot_training - Step 45622: {'lr': 9.464705914764754e-06, 'samples': 23358976, 'steps': 45622, 'batch_loss/train': 0.6201828697812743} 12/28/2021 23:19:05 - INFO - codeparrot_training - Step 45623: {'lr': 9.460410125965686e-06, 'samples': 23359488, 'steps': 45623, 'batch_loss/train': 0.7391825839877129} 12/28/2021 23:19:15 - INFO - codeparrot_training - Step 45624: {'lr': 9.456115293461187e-06, 'samples': 23360000, 'steps': 45624, 'batch_loss/train': 0.7608282845467329} 12/28/2021 23:19:26 - INFO - codeparrot_training - Step 45625: {'lr': 9.451821417268387e-06, 'samples': 23360512, 'steps': 45625, 'batch_loss/train': 0.7373949401080608} 12/28/2021 23:19:38 - INFO - codeparrot_training - Step 45626: {'lr': 9.447528497404295e-06, 'samples': 23361024, 'steps': 45626, 'batch_loss/train': 0.999160663690418} 12/28/2021 23:19:49 - INFO - codeparrot_training - Step 45627: {'lr': 9.443236533885985e-06, 'samples': 23361536, 'steps': 45627, 'batch_loss/train': 0.7219428713433444} 12/28/2021 23:19:59 - INFO - codeparrot_training - Step 45628: {'lr': 
9.438945526730581e-06, 'samples': 23362048, 'steps': 45628, 'batch_loss/train': 0.7007676232606173} 12/28/2021 23:20:13 - INFO - codeparrot_training - Step 45629: {'lr': 9.43465547595504e-06, 'samples': 23362560, 'steps': 45629, 'batch_loss/train': 0.6758406511507928} 12/28/2021 23:20:24 - INFO - codeparrot_training - Step 45630: {'lr': 9.430366381576488e-06, 'samples': 23363072, 'steps': 45630, 'batch_loss/train': 0.6485789329744875} 12/28/2021 23:20:35 - INFO - codeparrot_training - Step 45631: {'lr': 9.426078243612025e-06, 'samples': 23363584, 'steps': 45631, 'batch_loss/train': 0.6913343900814652} 12/28/2021 23:20:47 - INFO - codeparrot_training - Step 45632: {'lr': 9.421791062078578e-06, 'samples': 23364096, 'steps': 45632, 'batch_loss/train': 0.660549582913518} 12/28/2021 23:20:57 - INFO - codeparrot_training - Step 45633: {'lr': 9.417504836993218e-06, 'samples': 23364608, 'steps': 45633, 'batch_loss/train': 0.7324497466906905} 12/28/2021 23:21:08 - INFO - codeparrot_training - Step 45634: {'lr': 9.413219568373099e-06, 'samples': 23365120, 'steps': 45634, 'batch_loss/train': 0.6581174959428608} 12/28/2021 23:21:20 - INFO - codeparrot_training - Step 45635: {'lr': 9.408935256235152e-06, 'samples': 23365632, 'steps': 45635, 'batch_loss/train': 0.6791147187759634} 12/28/2021 23:21:31 - INFO - codeparrot_training - Step 45636: {'lr': 9.404651900596417e-06, 'samples': 23366144, 'steps': 45636, 'batch_loss/train': 0.6916749062947929} 12/28/2021 23:21:41 - INFO - codeparrot_training - Step 45637: {'lr': 9.400369501473938e-06, 'samples': 23366656, 'steps': 45637, 'batch_loss/train': 0.7643992078956217} 12/28/2021 23:21:52 - INFO - codeparrot_training - Step 45638: {'lr': 9.396088058884755e-06, 'samples': 23367168, 'steps': 45638, 'batch_loss/train': 0.5350519955682103} 12/28/2021 23:22:06 - INFO - codeparrot_training - Step 45639: {'lr': 9.391807572845885e-06, 'samples': 23367680, 'steps': 45639, 'batch_loss/train': 0.6372511379886419} 12/28/2021 23:22:17 - INFO - 
codeparrot_training - Step 45640: {'lr': 9.38752804337431e-06, 'samples': 23368192, 'steps': 45640, 'batch_loss/train': 0.6865973090752959} 12/28/2021 23:22:27 - INFO - codeparrot_training - Step 45641: {'lr': 9.383249470487104e-06, 'samples': 23368704, 'steps': 45641, 'batch_loss/train': 0.580607486423105} 12/28/2021 23:22:40 - INFO - codeparrot_training - Step 45642: {'lr': 9.378971854201223e-06, 'samples': 23369216, 'steps': 45642, 'batch_loss/train': 0.6176190366968513} 12/28/2021 23:22:51 - INFO - codeparrot_training - Step 45643: {'lr': 9.374695194533739e-06, 'samples': 23369728, 'steps': 45643, 'batch_loss/train': 0.8130861995741725} 12/28/2021 23:23:01 - INFO - codeparrot_training - Step 45644: {'lr': 9.370419491501525e-06, 'samples': 23370240, 'steps': 45644, 'batch_loss/train': 0.8792521012946963} 12/28/2021 23:23:12 - INFO - codeparrot_training - Step 45645: {'lr': 9.366144745121735e-06, 'samples': 23370752, 'steps': 45645, 'batch_loss/train': 1.2583612110465765} 12/28/2021 23:23:26 - INFO - codeparrot_training - Step 45646: {'lr': 9.3618709554113e-06, 'samples': 23371264, 'steps': 45646, 'batch_loss/train': 0.7192447246052325} 12/28/2021 23:23:36 - INFO - codeparrot_training - Step 45647: {'lr': 9.357598122387151e-06, 'samples': 23371776, 'steps': 45647, 'batch_loss/train': 0.7739065974019468} 12/28/2021 23:23:47 - INFO - codeparrot_training - Step 45648: {'lr': 9.353326246066358e-06, 'samples': 23372288, 'steps': 45648, 'batch_loss/train': 0.7645624531432986} 12/28/2021 23:23:59 - INFO - codeparrot_training - Step 45649: {'lr': 9.349055326465877e-06, 'samples': 23372800, 'steps': 45649, 'batch_loss/train': 0.6879671206697822} 12/28/2021 23:24:10 - INFO - codeparrot_training - Step 45650: {'lr': 9.34478536360267e-06, 'samples': 23373312, 'steps': 45650, 'batch_loss/train': 0.7199685946106911} 12/28/2021 23:24:21 - INFO - codeparrot_training - Step 45651: {'lr': 9.34051635749375e-06, 'samples': 23373824, 'steps': 45651, 'batch_loss/train': 
0.5655174722196534} 12/28/2021 23:24:33 - INFO - codeparrot_training - Step 45652: {'lr': 9.336248308156048e-06, 'samples': 23374336, 'steps': 45652, 'batch_loss/train': 0.6953707197681069} 12/28/2021 23:24:43 - INFO - codeparrot_training - Step 45653: {'lr': 9.331981215606578e-06, 'samples': 23374848, 'steps': 45653, 'batch_loss/train': 0.7377681015059352} 12/28/2021 23:24:54 - INFO - codeparrot_training - Step 45654: {'lr': 9.327715079862242e-06, 'samples': 23375360, 'steps': 45654, 'batch_loss/train': 0.7305877958424389} 12/28/2021 23:25:04 - INFO - codeparrot_training - Step 45655: {'lr': 9.323449900940057e-06, 'samples': 23375872, 'steps': 45655, 'batch_loss/train': 0.7463720161467791} 12/28/2021 23:25:17 - INFO - codeparrot_training - Step 45656: {'lr': 9.31918567885695e-06, 'samples': 23376384, 'steps': 45656, 'batch_loss/train': 0.7640552613884211} 12/28/2021 23:25:28 - INFO - codeparrot_training - Step 45657: {'lr': 9.314922413629939e-06, 'samples': 23376896, 'steps': 45657, 'batch_loss/train': 0.8841690360568464} 12/28/2021 23:25:38 - INFO - codeparrot_training - Step 45658: {'lr': 9.31066010527587e-06, 'samples': 23377408, 'steps': 45658, 'batch_loss/train': 1.3830066239461303} 12/28/2021 23:25:52 - INFO - codeparrot_training - Step 45659: {'lr': 9.306398753811702e-06, 'samples': 23377920, 'steps': 45659, 'batch_loss/train': 2.054842656478286} 12/28/2021 23:26:03 - INFO - codeparrot_training - Step 45660: {'lr': 9.302138359254448e-06, 'samples': 23378432, 'steps': 45660, 'batch_loss/train': 0.5794867167714983} 12/28/2021 23:26:14 - INFO - codeparrot_training - Step 45661: {'lr': 9.297878921621011e-06, 'samples': 23378944, 'steps': 45661, 'batch_loss/train': 0.6261985418386757} 12/28/2021 23:26:24 - INFO - codeparrot_training - Step 45662: {'lr': 9.293620440928323e-06, 'samples': 23379456, 'steps': 45662, 'batch_loss/train': 0.6808488513343036} 12/28/2021 23:26:37 - INFO - codeparrot_training - Step 45663: {'lr': 9.28936291719329e-06, 'samples': 23379968, 
'steps': 45663, 'batch_loss/train': 1.0043423939496279} 12/28/2021 23:26:47 - INFO - codeparrot_training - Step 45664: {'lr': 9.285106350432865e-06, 'samples': 23380480, 'steps': 45664, 'batch_loss/train': 0.7544946996495128} 12/28/2021 23:26:58 - INFO - codeparrot_training - Step 45665: {'lr': 9.280850740663982e-06, 'samples': 23380992, 'steps': 45665, 'batch_loss/train': 0.7164391689002514} 12/28/2021 23:27:10 - INFO - codeparrot_training - Step 45666: {'lr': 9.276596087903544e-06, 'samples': 23381504, 'steps': 45666, 'batch_loss/train': 0.6841828944161534} 12/28/2021 23:27:21 - INFO - codeparrot_training - Step 45667: {'lr': 9.272342392168454e-06, 'samples': 23382016, 'steps': 45667, 'batch_loss/train': 0.6801051227375865} 12/28/2021 23:27:31 - INFO - codeparrot_training - Step 45668: {'lr': 9.268089653475643e-06, 'samples': 23382528, 'steps': 45668, 'batch_loss/train': 0.6488636735011823} 12/28/2021 23:27:45 - INFO - codeparrot_training - Step 45669: {'lr': 9.26383787184204e-06, 'samples': 23383040, 'steps': 45669, 'batch_loss/train': 1.606963119469583} 12/28/2021 23:27:56 - INFO - codeparrot_training - Step 45670: {'lr': 9.259587047284467e-06, 'samples': 23383552, 'steps': 45670, 'batch_loss/train': 0.7359208134002984} 12/28/2021 23:28:07 - INFO - codeparrot_training - Step 45671: {'lr': 9.255337179819884e-06, 'samples': 23384064, 'steps': 45671, 'batch_loss/train': 0.7418794147670269} 12/28/2021 23:28:17 - INFO - codeparrot_training - Step 45672: {'lr': 9.251088269465218e-06, 'samples': 23384576, 'steps': 45672, 'batch_loss/train': 0.7959502246230841} 12/28/2021 23:28:29 - INFO - codeparrot_training - Step 45673: {'lr': 9.246840316237293e-06, 'samples': 23385088, 'steps': 45673, 'batch_loss/train': 0.7443661349825561} 12/28/2021 23:28:40 - INFO - codeparrot_training - Step 45674: {'lr': 9.242593320153036e-06, 'samples': 23385600, 'steps': 45674, 'batch_loss/train': 0.6388361863791943} 12/28/2021 23:28:51 - INFO - codeparrot_training - Step 45675: {'lr': 
9.238347281229326e-06, 'samples': 23386112, 'steps': 45675, 'batch_loss/train': 0.8357615671120584} 12/28/2021 23:29:03 - INFO - codeparrot_training - Step 45676: {'lr': 9.234102199483007e-06, 'samples': 23386624, 'steps': 45676, 'batch_loss/train': 0.642466540215537} 12/28/2021 23:29:14 - INFO - codeparrot_training - Step 45677: {'lr': 9.229858074931014e-06, 'samples': 23387136, 'steps': 45677, 'batch_loss/train': 0.6826358947437257} 12/28/2021 23:29:25 - INFO - codeparrot_training - Step 45678: {'lr': 9.22561490759019e-06, 'samples': 23387648, 'steps': 45678, 'batch_loss/train': 0.736713427118957} 12/28/2021 23:29:39 - INFO - codeparrot_training - Step 45679: {'lr': 9.221372697477415e-06, 'samples': 23388160, 'steps': 45679, 'batch_loss/train': 0.6540428372099996} 12/28/2021 23:29:49 - INFO - codeparrot_training - Step 45680: {'lr': 9.217131444609534e-06, 'samples': 23388672, 'steps': 45680, 'batch_loss/train': 0.7337281135842204} 12/28/2021 23:30:00 - INFO - codeparrot_training - Step 45681: {'lr': 9.212891149003422e-06, 'samples': 23389184, 'steps': 45681, 'batch_loss/train': 0.807762417010963} 12/28/2021 23:30:12 - INFO - codeparrot_training - Step 45682: {'lr': 9.208651810675955e-06, 'samples': 23389696, 'steps': 45682, 'batch_loss/train': 0.6793023595237173} 12/28/2021 23:30:23 - INFO - codeparrot_training - Step 45683: {'lr': 9.204413429643981e-06, 'samples': 23390208, 'steps': 45683, 'batch_loss/train': 0.6748284557834268} 12/28/2021 23:30:33 - INFO - codeparrot_training - Step 45684: {'lr': 9.200176005924321e-06, 'samples': 23390720, 'steps': 45684, 'batch_loss/train': 0.7631050711497664} 12/28/2021 23:30:44 - INFO - codeparrot_training - Step 45685: {'lr': 9.195939539533793e-06, 'samples': 23391232, 'steps': 45685, 'batch_loss/train': 0.6646473980508745} 12/28/2021 23:30:56 - INFO - codeparrot_training - Step 45686: {'lr': 9.191704030489356e-06, 'samples': 23391744, 'steps': 45686, 'batch_loss/train': 0.6441572655458003} 12/28/2021 23:31:07 - INFO - 
codeparrot_training - Step 45687: {'lr': 9.187469478807747e-06, 'samples': 23392256, 'steps': 45687, 'batch_loss/train': 0.6839543762616813} 12/28/2021 23:31:17 - INFO - codeparrot_training - Step 45688: {'lr': 9.183235884505786e-06, 'samples': 23392768, 'steps': 45688, 'batch_loss/train': 0.7095047202892601} 12/28/2021 23:31:31 - INFO - codeparrot_training - Step 45689: {'lr': 9.179003247600432e-06, 'samples': 23393280, 'steps': 45689, 'batch_loss/train': 0.7823447426781058} 12/28/2021 23:31:42 - INFO - codeparrot_training - Step 45690: {'lr': 9.174771568108392e-06, 'samples': 23393792, 'steps': 45690, 'batch_loss/train': 0.6454932179767638} 12/28/2021 23:31:53 - INFO - codeparrot_training - Step 45691: {'lr': 9.170540846046543e-06, 'samples': 23394304, 'steps': 45691, 'batch_loss/train': 0.6548075219616294} 12/28/2021 23:32:05 - INFO - codeparrot_training - Step 45692: {'lr': 9.166311081431678e-06, 'samples': 23394816, 'steps': 45692, 'batch_loss/train': 0.8259477620013058} 12/28/2021 23:32:16 - INFO - codeparrot_training - Step 45693: {'lr': 9.162082274280614e-06, 'samples': 23395328, 'steps': 45693, 'batch_loss/train': 0.7378309040796012} 12/28/2021 23:32:26 - INFO - codeparrot_training - Step 45694: {'lr': 9.157854424610173e-06, 'samples': 23395840, 'steps': 45694, 'batch_loss/train': 0.7966404408216476} 12/28/2021 23:32:37 - INFO - codeparrot_training - Step 45695: {'lr': 9.153627532437203e-06, 'samples': 23396352, 'steps': 45695, 'batch_loss/train': 0.6771405348554254} 12/28/2021 23:32:49 - INFO - codeparrot_training - Step 45696: {'lr': 9.149401597778412e-06, 'samples': 23396864, 'steps': 45696, 'batch_loss/train': 0.7472879849374294} 12/28/2021 23:33:00 - INFO - codeparrot_training - Step 45697: {'lr': 9.145176620650703e-06, 'samples': 23397376, 'steps': 45697, 'batch_loss/train': 0.7197505026124418} 12/28/2021 23:33:10 - INFO - codeparrot_training - Step 45698: {'lr': 9.14095260107084e-06, 'samples': 23397888, 'steps': 45698, 'batch_loss/train': 
0.7265005265944637} 12/28/2021 23:33:24 - INFO - codeparrot_training - Step 45699: {'lr': 9.136729539055589e-06, 'samples': 23398400, 'steps': 45699, 'batch_loss/train': 0.7111508082598448} 12/28/2021 23:33:35 - INFO - codeparrot_training - Step 45700: {'lr': 9.132507434621711e-06, 'samples': 23398912, 'steps': 45700, 'batch_loss/train': 0.7239737212657928} 12/28/2021 23:33:45 - INFO - codeparrot_training - Step 45701: {'lr': 9.128286287786087e-06, 'samples': 23399424, 'steps': 45701, 'batch_loss/train': 0.6755588632076979} 12/28/2021 23:33:57 - INFO - codeparrot_training - Step 45702: {'lr': 9.12406609856542e-06, 'samples': 23399936, 'steps': 45702, 'batch_loss/train': 0.655054347589612} 12/28/2021 23:34:08 - INFO - codeparrot_training - Step 45703: {'lr': 9.119846866976534e-06, 'samples': 23400448, 'steps': 45703, 'batch_loss/train': 0.6968518955400214} 12/28/2021 23:34:19 - INFO - codeparrot_training - Step 45704: {'lr': 9.11562859303619e-06, 'samples': 23400960, 'steps': 45704, 'batch_loss/train': 0.6701799726579338} 12/28/2021 23:34:29 - INFO - codeparrot_training - Step 45705: {'lr': 9.111411276761128e-06, 'samples': 23401472, 'steps': 45705, 'batch_loss/train': 0.7167601310648024} 12/28/2021 23:34:43 - INFO - codeparrot_training - Step 45706: {'lr': 9.107194918168138e-06, 'samples': 23401984, 'steps': 45706, 'batch_loss/train': 0.7171092797070742} 12/28/2021 23:34:54 - INFO - codeparrot_training - Step 45707: {'lr': 9.102979517273985e-06, 'samples': 23402496, 'steps': 45707, 'batch_loss/train': 0.6785991858923808} 12/28/2021 23:35:05 - INFO - codeparrot_training - Step 45708: {'lr': 9.098765074095433e-06, 'samples': 23403008, 'steps': 45708, 'batch_loss/train': 0.7671624585054815} 12/28/2021 23:35:17 - INFO - codeparrot_training - Step 45709: {'lr': 9.094551588649247e-06, 'samples': 23403520, 'steps': 45709, 'batch_loss/train': 0.983222461072728} 12/28/2021 23:35:28 - INFO - codeparrot_training - Step 45710: {'lr': 9.090339060952162e-06, 'samples': 23404032, 
'steps': 45710, 'batch_loss/train': 0.6824026592075825} 12/28/2021 23:35:39 - INFO - codeparrot_training - Step 45711: {'lr': 9.08612749102089e-06, 'samples': 23404544, 'steps': 45711, 'batch_loss/train': 0.5227016085991636} 12/28/2021 23:35:51 - INFO - codeparrot_training - Step 45712: {'lr': 9.081916878872249e-06, 'samples': 23405056, 'steps': 45712, 'batch_loss/train': 0.7339955167844892} 12/28/2021 23:36:01 - INFO - codeparrot_training - Step 45713: {'lr': 9.077707224522919e-06, 'samples': 23405568, 'steps': 45713, 'batch_loss/train': 0.8000295693054795} 12/28/2021 23:36:12 - INFO - codeparrot_training - Step 45714: {'lr': 9.073498527989638e-06, 'samples': 23406080, 'steps': 45714, 'batch_loss/train': 0.7501046070829034} 12/28/2021 23:36:23 - INFO - codeparrot_training - Step 45715: {'lr': 9.069290789289197e-06, 'samples': 23406592, 'steps': 45715, 'batch_loss/train': 0.700144310016185} 12/28/2021 23:36:35 - INFO - codeparrot_training - Step 45716: {'lr': 9.065084008438251e-06, 'samples': 23407104, 'steps': 45716, 'batch_loss/train': 0.661013575270772} 12/28/2021 23:36:45 - INFO - codeparrot_training - Step 45717: {'lr': 9.060878185453591e-06, 'samples': 23407616, 'steps': 45717, 'batch_loss/train': 0.7731685768812895} 12/28/2021 23:36:56 - INFO - codeparrot_training - Step 45718: {'lr': 9.056673320351871e-06, 'samples': 23408128, 'steps': 45718, 'batch_loss/train': 0.8452318198978901} 12/28/2021 23:37:10 - INFO - codeparrot_training - Step 45719: {'lr': 9.052469413149855e-06, 'samples': 23408640, 'steps': 45719, 'batch_loss/train': 0.5823256080038846} 12/28/2021 23:37:21 - INFO - codeparrot_training - Step 45720: {'lr': 9.048266463864224e-06, 'samples': 23409152, 'steps': 45720, 'batch_loss/train': 0.83658342435956} 12/28/2021 23:37:31 - INFO - codeparrot_training - Step 45721: {'lr': 9.04406447251177e-06, 'samples': 23409664, 'steps': 45721, 'batch_loss/train': 0.6175441695377231} 12/28/2021 23:37:43 - INFO - codeparrot_training - Step 45722: {'lr': 
9.039863439109036e-06, 'samples': 23410176, 'steps': 45722, 'batch_loss/train': 0.8399863122031093} 12/28/2021 23:37:54 - INFO - codeparrot_training - Step 45723: {'lr': 9.035663363672868e-06, 'samples': 23410688, 'steps': 45723, 'batch_loss/train': 0.7765606194734573} 12/28/2021 23:38:05 - INFO - codeparrot_training - Step 45724: {'lr': 9.03146424621995e-06, 'samples': 23411200, 'steps': 45724, 'batch_loss/train': 0.780063194106333} 12/28/2021 23:38:15 - INFO - codeparrot_training - Step 45725: {'lr': 9.027266086766905e-06, 'samples': 23411712, 'steps': 45725, 'batch_loss/train': 0.7415765612386167} 12/28/2021 23:38:27 - INFO - codeparrot_training - Step 45726: {'lr': 9.02306888533047e-06, 'samples': 23412224, 'steps': 45726, 'batch_loss/train': 0.7161385822109878} 12/28/2021 23:38:38 - INFO - codeparrot_training - Step 45727: {'lr': 9.018872641927356e-06, 'samples': 23412736, 'steps': 45727, 'batch_loss/train': 0.7153073288500309} 12/28/2021 23:38:49 - INFO - codeparrot_training - Step 45728: {'lr': 9.014677356574187e-06, 'samples': 23413248, 'steps': 45728, 'batch_loss/train': 0.675354699138552} 12/28/2021 23:39:03 - INFO - codeparrot_training - Step 45729: {'lr': 9.010483029287641e-06, 'samples': 23413760, 'steps': 45729, 'batch_loss/train': 0.7658509109169245} 12/28/2021 23:39:13 - INFO - codeparrot_training - Step 45730: {'lr': 9.00628966008446e-06, 'samples': 23414272, 'steps': 45730, 'batch_loss/train': 0.856325819157064} 12/28/2021 23:39:24 - INFO - codeparrot_training - Step 45731: {'lr': 9.002097248981238e-06, 'samples': 23414784, 'steps': 45731, 'batch_loss/train': 0.7716564573347569} 12/28/2021 23:39:36 - INFO - codeparrot_training - Step 45732: {'lr': 8.997905795994715e-06, 'samples': 23415296, 'steps': 45732, 'batch_loss/train': 0.6731334747746587} 12/28/2021 23:39:47 - INFO - codeparrot_training - Step 45733: {'lr': 8.993715301141513e-06, 'samples': 23415808, 'steps': 45733, 'batch_loss/train': 0.7484898846596479} 12/28/2021 23:39:57 - INFO - 
codeparrot_training - Step 45734: {'lr': 8.98952576443829e-06, 'samples': 23416320, 'steps': 45734, 'batch_loss/train': 0.8048099633306265} 12/28/2021 23:40:10 - INFO - codeparrot_training - Step 45735: {'lr': 8.985337185901694e-06, 'samples': 23416832, 'steps': 45735, 'batch_loss/train': 0.6611016753013246} 12/28/2021 23:40:21 - INFO - codeparrot_training - Step 45736: {'lr': 8.981149565548436e-06, 'samples': 23417344, 'steps': 45736, 'batch_loss/train': 0.675698540173471} 12/28/2021 23:40:32 - INFO - codeparrot_training - Step 45737: {'lr': 8.976962903395085e-06, 'samples': 23417856, 'steps': 45737, 'batch_loss/train': 0.7226616684347391} 12/28/2021 23:40:42 - INFO - codeparrot_training - Step 45738: {'lr': 8.972777199458355e-06, 'samples': 23418368, 'steps': 45738, 'batch_loss/train': 0.789858955424279} 12/28/2021 23:40:54 - INFO - codeparrot_training - Step 45739: {'lr': 8.968592453754837e-06, 'samples': 23418880, 'steps': 45739, 'batch_loss/train': 0.6596303957048804} 12/28/2021 23:41:05 - INFO - codeparrot_training - Step 45740: {'lr': 8.96440866630116e-06, 'samples': 23419392, 'steps': 45740, 'batch_loss/train': 0.8123812042176723} 12/28/2021 23:41:16 - INFO - codeparrot_training - Step 45741: {'lr': 8.960225837114033e-06, 'samples': 23419904, 'steps': 45741, 'batch_loss/train': 0.8464869596064091} 12/28/2021 23:41:28 - INFO - codeparrot_training - Step 45742: {'lr': 8.956043966210025e-06, 'samples': 23420416, 'steps': 45742, 'batch_loss/train': 0.7494609635323286} 12/28/2021 23:41:38 - INFO - codeparrot_training - Step 45743: {'lr': 8.951863053605736e-06, 'samples': 23420928, 'steps': 45743, 'batch_loss/train': 0.8619276685640216} 12/28/2021 23:41:49 - INFO - codeparrot_training - Step 45744: {'lr': 8.947683099317845e-06, 'samples': 23421440, 'steps': 45744, 'batch_loss/train': 0.7867833925411105} 12/28/2021 23:42:03 - INFO - codeparrot_training - Step 45745: {'lr': 8.943504103362949e-06, 'samples': 23421952, 'steps': 45745, 'batch_loss/train': 
0.7399503854103386} 12/28/2021 23:42:14 - INFO - codeparrot_training - Step 45746: {'lr': 8.939326065757647e-06, 'samples': 23422464, 'steps': 45746, 'batch_loss/train': 0.723742539063096} 12/28/2021 23:42:24 - INFO - codeparrot_training - Step 45747: {'lr': 8.935148986518621e-06, 'samples': 23422976, 'steps': 45747, 'batch_loss/train': 0.737531915307045} 12/28/2021 23:42:35 - INFO - codeparrot_training - Step 45748: {'lr': 8.930972865662329e-06, 'samples': 23423488, 'steps': 45748, 'batch_loss/train': 0.7275257240980864} 12/28/2021 23:42:47 - INFO - codeparrot_training - Step 45749: {'lr': 8.926797703205508e-06, 'samples': 23424000, 'steps': 45749, 'batch_loss/train': 0.6204302771948278} 12/28/2021 23:42:58 - INFO - codeparrot_training - Step 45750: {'lr': 8.922623499164755e-06, 'samples': 23424512, 'steps': 45750, 'batch_loss/train': 0.7087686480954289} 12/28/2021 23:43:08 - INFO - codeparrot_training - Step 45751: {'lr': 8.91845025355656e-06, 'samples': 23425024, 'steps': 45751, 'batch_loss/train': 0.6841423632577062} 12/28/2021 23:43:21 - INFO - codeparrot_training - Step 45752: {'lr': 8.9142779663976e-06, 'samples': 23425536, 'steps': 45752, 'batch_loss/train': 0.7587518191430718} 12/28/2021 23:43:31 - INFO - codeparrot_training - Step 45753: {'lr': 8.910106637704473e-06, 'samples': 23426048, 'steps': 45753, 'batch_loss/train': 0.7785979760810733} 12/28/2021 23:43:42 - INFO - codeparrot_training - Step 45754: {'lr': 8.905936267493697e-06, 'samples': 23426560, 'steps': 45754, 'batch_loss/train': 0.8346336963586509} 12/28/2021 23:43:56 - INFO - codeparrot_training - Step 45755: {'lr': 8.901766855781839e-06, 'samples': 23427072, 'steps': 45755, 'batch_loss/train': 0.7318960186094046} 12/28/2021 23:44:07 - INFO - codeparrot_training - Step 45756: {'lr': 8.89759840258561e-06, 'samples': 23427584, 'steps': 45756, 'batch_loss/train': 0.8554856274276972} 12/28/2021 23:44:18 - INFO - codeparrot_training - Step 45757: {'lr': 8.89343090792144e-06, 'samples': 23428096, 
'steps': 45757, 'batch_loss/train': 0.7875574491918087} 12/28/2021 23:44:28 - INFO - codeparrot_training - Step 45758: {'lr': 8.889264371805955e-06, 'samples': 23428608, 'steps': 45758, 'batch_loss/train': 1.1745776245370507} 12/28/2021 23:44:40 - INFO - codeparrot_training - Step 45759: {'lr': 8.885098794255726e-06, 'samples': 23429120, 'steps': 45759, 'batch_loss/train': 0.7839590637013316} 12/28/2021 23:44:51 - INFO - codeparrot_training - Step 45760: {'lr': 8.880934175287292e-06, 'samples': 23429632, 'steps': 45760, 'batch_loss/train': 0.8036775500513613} 12/28/2021 23:45:02 - INFO - codeparrot_training - Step 45761: {'lr': 8.876770514917226e-06, 'samples': 23430144, 'steps': 45761, 'batch_loss/train': 0.7321151616051793} 12/28/2021 23:45:14 - INFO - codeparrot_training - Step 45762: {'lr': 8.872607813162098e-06, 'samples': 23430656, 'steps': 45762, 'batch_loss/train': 0.7551571857184172} 12/28/2021 23:45:24 - INFO - codeparrot_training - Step 45763: {'lr': 8.868446070038394e-06, 'samples': 23431168, 'steps': 45763, 'batch_loss/train': 0.7605020962655544} 12/28/2021 23:45:35 - INFO - codeparrot_training - Step 45764: {'lr': 8.864285285562767e-06, 'samples': 23431680, 'steps': 45764, 'batch_loss/train': 0.7230873056687415} 12/28/2021 23:45:49 - INFO - codeparrot_training - Step 45765: {'lr': 8.86012545975165e-06, 'samples': 23432192, 'steps': 45765, 'batch_loss/train': 0.7346288450062275} 12/28/2021 23:46:00 - INFO - codeparrot_training - Step 45766: {'lr': 8.855966592621584e-06, 'samples': 23432704, 'steps': 45766, 'batch_loss/train': 0.7204971846658736} 12/28/2021 23:46:10 - INFO - codeparrot_training - Step 45767: {'lr': 8.851808684189194e-06, 'samples': 23433216, 'steps': 45767, 'batch_loss/train': 0.6267958502285182} 12/28/2021 23:46:21 - INFO - codeparrot_training - Step 45768: {'lr': 8.847651734470941e-06, 'samples': 23433728, 'steps': 45768, 'batch_loss/train': 0.7281816594768316} 12/28/2021 23:46:34 - INFO - codeparrot_training - Step 45769: {'lr': 
8.843495743483394e-06, 'samples': 23434240, 'steps': 45769, 'batch_loss/train': 0.6995853413827717} 12/28/2021 23:46:44 - INFO - codeparrot_training - Step 45770: {'lr': 8.83934071124301e-06, 'samples': 23434752, 'steps': 45770, 'batch_loss/train': 0.5573684854898602} 12/28/2021 23:46:55 - INFO - codeparrot_training - Step 45771: {'lr': 8.835186637766363e-06, 'samples': 23435264, 'steps': 45771, 'batch_loss/train': 0.6001102142035961} 12/28/2021 23:47:07 - INFO - codeparrot_training - Step 45772: {'lr': 8.831033523069964e-06, 'samples': 23435776, 'steps': 45772, 'batch_loss/train': 0.7433645238634199} 12/28/2021 23:47:18 - INFO - codeparrot_training - Step 45773: {'lr': 8.826881367170304e-06, 'samples': 23436288, 'steps': 45773, 'batch_loss/train': 0.7347879535518587} 12/28/2021 23:47:28 - INFO - codeparrot_training - Step 45774: {'lr': 8.822730170083892e-06, 'samples': 23436800, 'steps': 45774, 'batch_loss/train': 0.8300438597798347} 12/28/2021 23:47:39 - INFO - codeparrot_training - Step 45775: {'lr': 8.818579931827276e-06, 'samples': 23437312, 'steps': 45775, 'batch_loss/train': 0.6826746929436922} 12/28/2021 23:47:53 - INFO - codeparrot_training - Step 45776: {'lr': 8.814430652416911e-06, 'samples': 23437824, 'steps': 45776, 'batch_loss/train': 0.7265763003379107} 12/28/2021 23:48:04 - INFO - codeparrot_training - Step 45777: {'lr': 8.810282331869257e-06, 'samples': 23438336, 'steps': 45777, 'batch_loss/train': 0.8521785461343825} 12/28/2021 23:48:14 - INFO - codeparrot_training - Step 45778: {'lr': 8.806134970200886e-06, 'samples': 23438848, 'steps': 45778, 'batch_loss/train': 0.7690949807874858} 12/28/2021 23:48:26 - INFO - codeparrot_training - Step 45779: {'lr': 8.801988567428254e-06, 'samples': 23439360, 'steps': 45779, 'batch_loss/train': 0.6830847151577473} 12/28/2021 23:48:37 - INFO - codeparrot_training - Step 45780: {'lr': 8.79784312356785e-06, 'samples': 23439872, 'steps': 45780, 'batch_loss/train': 0.8399868464330211} 12/28/2021 23:48:48 - INFO - 
codeparrot_training - Step 45781: {'lr': 8.793698638636077e-06, 'samples': 23440384, 'steps': 45781, 'batch_loss/train': 0.7132937116548419} 12/28/2021 23:49:02 - INFO - codeparrot_training - Step 45782: {'lr': 8.789555112649588e-06, 'samples': 23440896, 'steps': 45782, 'batch_loss/train': 0.7050235556671396} 12/28/2021 23:49:12 - INFO - codeparrot_training - Step 45783: {'lr': 8.785412545624677e-06, 'samples': 23441408, 'steps': 45783, 'batch_loss/train': 0.6305895007681102} 12/28/2021 23:49:23 - INFO - codeparrot_training - Step 45784: {'lr': 8.781270937577913e-06, 'samples': 23441920, 'steps': 45784, 'batch_loss/train': 0.6661429964005947} 12/28/2021 23:49:34 - INFO - codeparrot_training - Step 45785: {'lr': 8.777130288525725e-06, 'samples': 23442432, 'steps': 45785, 'batch_loss/train': 0.7267395202070475} 12/28/2021 23:49:46 - INFO - codeparrot_training - Step 45786: {'lr': 8.772990598484603e-06, 'samples': 23442944, 'steps': 45786, 'batch_loss/train': 0.7460448949132115} 12/28/2021 23:49:57 - INFO - codeparrot_training - Step 45787: {'lr': 8.76885186747095e-06, 'samples': 23443456, 'steps': 45787, 'batch_loss/train': 0.578784310258925} 12/28/2021 23:50:07 - INFO - codeparrot_training - Step 45788: {'lr': 8.76471409550128e-06, 'samples': 23443968, 'steps': 45788, 'batch_loss/train': 0.7454724656417966} 12/28/2021 23:50:20 - INFO - codeparrot_training - Step 45789: {'lr': 8.760577282592025e-06, 'samples': 23444480, 'steps': 45789, 'batch_loss/train': 0.7363296948606148} 12/28/2021 23:50:30 - INFO - codeparrot_training - Step 45790: {'lr': 8.756441428759615e-06, 'samples': 23444992, 'steps': 45790, 'batch_loss/train': 0.7400999299134128} 12/28/2021 23:50:41 - INFO - codeparrot_training - Step 45791: {'lr': 8.752306534020538e-06, 'samples': 23445504, 'steps': 45791, 'batch_loss/train': 0.6896647778339684} 12/28/2021 23:50:53 - INFO - codeparrot_training - Step 45792: {'lr': 8.74817259839114e-06, 'samples': 23446016, 'steps': 45792, 'batch_loss/train': 
0.7378191389143467} 12/28/2021 23:51:04 - INFO - codeparrot_training - Step 45793: {'lr': 8.744039621887968e-06, 'samples': 23446528, 'steps': 45793, 'batch_loss/train': 0.6955171707086265} 12/28/2021 23:51:14 - INFO - codeparrot_training - Step 45794: {'lr': 8.739907604527392e-06, 'samples': 23447040, 'steps': 45794, 'batch_loss/train': 0.7375292298384011} 12/28/2021 23:51:25 - INFO - codeparrot_training - Step 45795: {'lr': 8.735776546325819e-06, 'samples': 23447552, 'steps': 45795, 'batch_loss/train': 0.7387779235141352} 12/28/2021 23:51:39 - INFO - codeparrot_training - Step 45796: {'lr': 8.731646447299707e-06, 'samples': 23448064, 'steps': 45796, 'batch_loss/train': 0.7852585259824991} 12/28/2021 23:51:50 - INFO - codeparrot_training - Step 45797: {'lr': 8.727517307465488e-06, 'samples': 23448576, 'steps': 45797, 'batch_loss/train': 0.7367368647828698} 12/28/2021 23:52:00 - INFO - codeparrot_training - Step 45798: {'lr': 8.723389126839537e-06, 'samples': 23449088, 'steps': 45798, 'batch_loss/train': 0.571079739369452} 12/28/2021 23:52:12 - INFO - codeparrot_training - Step 45799: {'lr': 8.719261905438313e-06, 'samples': 23449600, 'steps': 45799, 'batch_loss/train': 0.6757987008895725} 12/28/2021 23:52:23 - INFO - codeparrot_training - Step 45800: {'lr': 8.715135643278165e-06, 'samples': 23450112, 'steps': 45800, 'batch_loss/train': 0.7899298798292875} 12/28/2021 23:52:34 - INFO - codeparrot_training - Step 45801: {'lr': 8.711010340375552e-06, 'samples': 23450624, 'steps': 45801, 'batch_loss/train': 0.723763111163862} 12/28/2021 23:52:46 - INFO - codeparrot_training - Step 45802: {'lr': 8.706885996746905e-06, 'samples': 23451136, 'steps': 45802, 'batch_loss/train': 0.6058484862442128} 12/28/2021 23:52:57 - INFO - codeparrot_training - Step 45803: {'lr': 8.702762612408489e-06, 'samples': 23451648, 'steps': 45803, 'batch_loss/train': 0.7058139662258327} 12/28/2021 23:53:07 - INFO - codeparrot_training - Step 45804: {'lr': 8.698640187376788e-06, 'samples': 
23452160, 'steps': 45804, 'batch_loss/train': 0.7231985640246421} 12/28/2021 23:53:18 - INFO - codeparrot_training - Step 45805: {'lr': 8.694518721668238e-06, 'samples': 23452672, 'steps': 45805, 'batch_loss/train': 0.5638486123061739} 12/28/2021 23:53:32 - INFO - codeparrot_training - Step 45806: {'lr': 8.690398215299128e-06, 'samples': 23453184, 'steps': 45806, 'batch_loss/train': 0.7663700794801116} 12/28/2021 23:53:43 - INFO - codeparrot_training - Step 45807: {'lr': 8.686278668285863e-06, 'samples': 23453696, 'steps': 45807, 'batch_loss/train': 0.771585582755506} 12/28/2021 23:53:53 - INFO - codeparrot_training - Step 45808: {'lr': 8.682160080644874e-06, 'samples': 23454208, 'steps': 45808, 'batch_loss/train': 0.6931608598679304} 12/28/2021 23:54:06 - INFO - codeparrot_training - Step 45809: {'lr': 8.678042452392481e-06, 'samples': 23454720, 'steps': 45809, 'batch_loss/train': 0.6683413884602487} 12/28/2021 23:54:16 - INFO - codeparrot_training - Step 45810: {'lr': 8.673925783545089e-06, 'samples': 23455232, 'steps': 45810, 'batch_loss/train': 0.7298377160914242} 12/28/2021 23:54:27 - INFO - codeparrot_training - Step 45811: {'lr': 8.669810074119017e-06, 'samples': 23455744, 'steps': 45811, 'batch_loss/train': 0.7545626806095243} 12/28/2021 23:54:39 - INFO - codeparrot_training - Step 45812: {'lr': 8.665695324130695e-06, 'samples': 23456256, 'steps': 45812, 'batch_loss/train': 0.6831257847952656} 12/28/2021 23:54:50 - INFO - codeparrot_training - Step 45813: {'lr': 8.661581533596418e-06, 'samples': 23456768, 'steps': 45813, 'batch_loss/train': 0.790289968252182} 12/28/2021 23:55:00 - INFO - codeparrot_training - Step 45814: {'lr': 8.65746870253256e-06, 'samples': 23457280, 'steps': 45814, 'batch_loss/train': 0.753431857097894} 12/28/2021 23:55:11 - INFO - codeparrot_training - Step 45815: {'lr': 8.6533568309555e-06, 'samples': 23457792, 'steps': 45815, 'batch_loss/train': 0.7970824539661407} 12/28/2021 23:55:25 - INFO - codeparrot_training - Step 45816: {'lr': 
8.649245918881555e-06, 'samples': 23458304, 'steps': 45816, 'batch_loss/train': 0.6198735534271691} 12/28/2021 23:55:36 - INFO - codeparrot_training - Step 45817: {'lr': 8.6451359663271e-06, 'samples': 23458816, 'steps': 45817, 'batch_loss/train': 0.7763317599892616} 12/28/2021 23:55:47 - INFO - codeparrot_training - Step 45818: {'lr': 8.641026973308402e-06, 'samples': 23459328, 'steps': 45818, 'batch_loss/train': 0.7200931245461106} 12/28/2021 23:55:59 - INFO - codeparrot_training - Step 45819: {'lr': 8.636918939841893e-06, 'samples': 23459840, 'steps': 45819, 'batch_loss/train': 0.692627145908773} 12/28/2021 23:56:09 - INFO - codeparrot_training - Step 45820: {'lr': 8.632811865943835e-06, 'samples': 23460352, 'steps': 45820, 'batch_loss/train': 0.7332671396434307} 12/28/2021 23:56:20 - INFO - codeparrot_training - Step 45821: {'lr': 8.628705751630606e-06, 'samples': 23460864, 'steps': 45821, 'batch_loss/train': 0.6572561543434858} 12/28/2021 23:56:34 - INFO - codeparrot_training - Step 45822: {'lr': 8.624600596918469e-06, 'samples': 23461376, 'steps': 45822, 'batch_loss/train': 0.6823655287735164} 12/28/2021 23:56:44 - INFO - codeparrot_training - Step 45823: {'lr': 8.620496401823801e-06, 'samples': 23461888, 'steps': 45823, 'batch_loss/train': 0.6879399307072163} 12/28/2021 23:56:55 - INFO - codeparrot_training - Step 45824: {'lr': 8.616393166362895e-06, 'samples': 23462400, 'steps': 45824, 'batch_loss/train': 0.7237093388102949} 12/28/2021 23:57:07 - INFO - codeparrot_training - Step 45825: {'lr': 8.612290890552043e-06, 'samples': 23462912, 'steps': 45825, 'batch_loss/train': 0.6864173971116543} 12/28/2021 23:57:18 - INFO - codeparrot_training - Step 45826: {'lr': 8.60818957440762e-06, 'samples': 23463424, 'steps': 45826, 'batch_loss/train': 0.7682352447882295} 12/28/2021 23:57:28 - INFO - codeparrot_training - Step 45827: {'lr': 8.604089217945864e-06, 'samples': 23463936, 'steps': 45827, 'batch_loss/train': 0.6671780729666352} 12/28/2021 23:57:39 - INFO - 
codeparrot_training - Step 45828: {'lr': 8.599989821183124e-06, 'samples': 23464448, 'steps': 45828, 'batch_loss/train': 0.5811509842751548} 12/28/2021 23:57:51 - INFO - codeparrot_training - Step 45829: {'lr': 8.595891384135634e-06, 'samples': 23464960, 'steps': 45829, 'batch_loss/train': 0.7675413126125932} 12/28/2021 23:58:02 - INFO - codeparrot_training - Step 45830: {'lr': 8.591793906819745e-06, 'samples': 23465472, 'steps': 45830, 'batch_loss/train': 0.7231537448242307} 12/28/2021 23:58:12 - INFO - codeparrot_training - Step 45831: {'lr': 8.587697389251774e-06, 'samples': 23465984, 'steps': 45831, 'batch_loss/train': 0.7087960690259933} 12/28/2021 23:58:25 - INFO - codeparrot_training - Step 45832: {'lr': 8.583601831447935e-06, 'samples': 23466496, 'steps': 45832, 'batch_loss/train': 0.7714222054928541} 12/28/2021 23:58:35 - INFO - codeparrot_training - Step 45833: {'lr': 8.579507233424488e-06, 'samples': 23467008, 'steps': 45833, 'batch_loss/train': 0.7243831856176257} 12/28/2021 23:58:46 - INFO - codeparrot_training - Step 45834: {'lr': 8.57541359519784e-06, 'samples': 23467520, 'steps': 45834, 'batch_loss/train': 0.9487563520669937} 12/28/2021 23:58:59 - INFO - codeparrot_training - Step 45835: {'lr': 8.57132091678417e-06, 'samples': 23468032, 'steps': 45835, 'batch_loss/train': 0.7227851670468226} 12/28/2021 23:59:10 - INFO - codeparrot_training - Step 45836: {'lr': 8.567229198199717e-06, 'samples': 23468544, 'steps': 45836, 'batch_loss/train': 0.7859726883471012} 12/28/2021 23:59:21 - INFO - codeparrot_training - Step 45837: {'lr': 8.563138439460883e-06, 'samples': 23469056, 'steps': 45837, 'batch_loss/train': 0.5970732003916055} 12/28/2021 23:59:31 - INFO - codeparrot_training - Step 45838: {'lr': 8.559048640583795e-06, 'samples': 23469568, 'steps': 45838, 'batch_loss/train': 0.705728309461847} 12/28/2021 23:59:44 - INFO - codeparrot_training - Step 45839: {'lr': 8.554959801584773e-06, 'samples': 23470080, 'steps': 45839, 'batch_loss/train': 
0.6794826179975644} 12/28/2021 23:59:54 - INFO - codeparrot_training - Step 45840: {'lr': 8.550871922480053e-06, 'samples': 23470592, 'steps': 45840, 'batch_loss/train': 0.6350155947729945} 12/29/2021 00:00:05 - INFO - codeparrot_training - Step 45841: {'lr': 8.546785003285929e-06, 'samples': 23471104, 'steps': 45841, 'batch_loss/train': 0.5752781943883747} 12/29/2021 00:00:17 - INFO - codeparrot_training - Step 45842: {'lr': 8.542699044018582e-06, 'samples': 23471616, 'steps': 45842, 'batch_loss/train': 0.583563888707431} 12/29/2021 00:00:28 - INFO - codeparrot_training - Step 45843: {'lr': 8.538614044694359e-06, 'samples': 23472128, 'steps': 45843, 'batch_loss/train': 0.7364309309050441} 12/29/2021 00:00:38 - INFO - codeparrot_training - Step 45844: {'lr': 8.53453000532936e-06, 'samples': 23472640, 'steps': 45844, 'batch_loss/train': 0.7384189977310598} 12/29/2021 00:00:52 - INFO - codeparrot_training - Step 45845: {'lr': 8.530446925939933e-06, 'samples': 23473152, 'steps': 45845, 'batch_loss/train': 0.6102749803103507} 12/29/2021 00:01:03 - INFO - codeparrot_training - Step 45846: {'lr': 8.526364806542258e-06, 'samples': 23473664, 'steps': 45846, 'batch_loss/train': 0.829132161103189} 12/29/2021 00:01:13 - INFO - codeparrot_training - Step 45847: {'lr': 8.522283647152574e-06, 'samples': 23474176, 'steps': 45847, 'batch_loss/train': 0.7649382203817368} 12/29/2021 00:01:24 - INFO - codeparrot_training - Step 45848: {'lr': 8.518203447787143e-06, 'samples': 23474688, 'steps': 45848, 'batch_loss/train': 0.6275448771193624} 12/29/2021 00:01:36 - INFO - codeparrot_training - Step 45849: {'lr': 8.514124208462121e-06, 'samples': 23475200, 'steps': 45849, 'batch_loss/train': 0.6990237273275852} 12/29/2021 00:01:47 - INFO - codeparrot_training - Step 45850: {'lr': 8.5100459291938e-06, 'samples': 23475712, 'steps': 45850, 'batch_loss/train': 0.708981170784682} 12/29/2021 00:01:58 - INFO - codeparrot_training - Step 45851: {'lr': 8.505968609998304e-06, 'samples': 23476224, 
'steps': 45851, 'batch_loss/train': 0.8538805413991213}
12/29/2021 00:02:12 - INFO - codeparrot_training - Step 45852: {'lr': 8.501892250891929e-06, 'samples': 23476736, 'steps': 45852, 'batch_loss/train': 0.6949786636978388}
12/29/2021 00:02:22 - INFO - codeparrot_training - Step 45853: {'lr': 8.497816851890828e-06, 'samples': 23477248, 'steps': 45853, 'batch_loss/train': 0.807506229262799}
12/29/2021 00:02:33 - INFO - codeparrot_training - Step 45854: {'lr': 8.493742413011235e-06, 'samples': 23477760, 'steps': 45854, 'batch_loss/train': 0.7197435714770108}
12/29/2021 00:02:45 - INFO - codeparrot_training - Step 45855: {'lr': 8.489668934269306e-06, 'samples': 23478272, 'steps': 45855, 'batch_loss/train': 0.7100622588768601}
12/29/2021 00:02:56 - INFO - codeparrot_training - Step 45856: {'lr': 8.48559641568128e-06, 'samples': 23478784, 'steps': 45856, 'batch_loss/train': 0.7745723119005561}
12/29/2021 00:03:06 - INFO - codeparrot_training - Step 45857: {'lr': 8.481524857263334e-06, 'samples': 23479296, 'steps': 45857, 'batch_loss/train': 0.699072350282222}
12/29/2021 00:03:17 - INFO - codeparrot_training - Step 45858: {'lr': 8.477454259031653e-06, 'samples': 23479808, 'steps': 45858, 'batch_loss/train': 0.7167767849750817}
12/29/2021 00:03:29 - INFO - codeparrot_training - Step 45859: {'lr': 8.473384621002389e-06, 'samples': 23480320, 'steps': 45859, 'batch_loss/train': 0.7186492079636082}
12/29/2021 00:03:40 - INFO - codeparrot_training - Step 45860: {'lr': 8.46931594319178e-06, 'samples': 23480832, 'steps': 45860, 'batch_loss/train': 0.6941277054138482}
12/29/2021 00:03:50 - INFO - codeparrot_training - Step 45861: {'lr': 8.465248225615978e-06, 'samples': 23481344, 'steps': 45861, 'batch_loss/train': 0.7818750659935176}
12/29/2021 00:04:04 - INFO - codeparrot_training - Step 45862: {'lr': 8.461181468291112e-06, 'samples': 23481856, 'steps': 45862, 'batch_loss/train': 0.7428356460295618}
12/29/2021 00:04:15 - INFO - codeparrot_training - Step 45863: {'lr': 8.457115671233417e-06, 'samples': 23482368, 'steps': 45863, 'batch_loss/train': 0.7335401698946953}
12/29/2021 00:04:26 - INFO - codeparrot_training - Step 45864: {'lr': 8.45305083445902e-06, 'samples': 23482880, 'steps': 45864, 'batch_loss/train': 1.0612156114075333}
12/29/2021 00:04:38 - INFO - codeparrot_training - Step 45865: {'lr': 8.4489869579841e-06, 'samples': 23483392, 'steps': 45865, 'batch_loss/train': 0.7500274442136288}
12/29/2021 00:04:48 - INFO - codeparrot_training - Step 45866: {'lr': 8.444924041824786e-06, 'samples': 23483904, 'steps': 45866, 'batch_loss/train': 0.7473608274012804}
12/29/2021 00:04:59 - INFO - codeparrot_training - Step 45867: {'lr': 8.440862085997258e-06, 'samples': 23484416, 'steps': 45867, 'batch_loss/train': 0.7177558178082108}
12/29/2021 00:05:10 - INFO - codeparrot_training - Step 45868: {'lr': 8.436801090517644e-06, 'samples': 23484928, 'steps': 45868, 'batch_loss/train': 0.7756232377141714}
12/29/2021 00:05:22 - INFO - codeparrot_training - Step 45869: {'lr': 8.432741055402121e-06, 'samples': 23485440, 'steps': 45869, 'batch_loss/train': 0.7530767982825637}
12/29/2021 00:05:33 - INFO - codeparrot_training - Step 45870: {'lr': 8.428681980666763e-06, 'samples': 23485952, 'steps': 45870, 'batch_loss/train': 0.9254194712266326}
12/29/2021 00:05:43 - INFO - codeparrot_training - Step 45871: {'lr': 8.42462386632778e-06, 'samples': 23486464, 'steps': 45871, 'batch_loss/train': 0.658649027289357}
12/29/2021 00:05:55 - INFO - codeparrot_training - Step 45872: {'lr': 8.420566712401294e-06, 'samples': 23486976, 'steps': 45872, 'batch_loss/train': 0.5088713721488602}
12/29/2021 00:06:06 - INFO - codeparrot_training - Step 45873: {'lr': 8.416510518903353e-06, 'samples': 23487488, 'steps': 45873, 'batch_loss/train': 0.6799912229180336}
12/29/2021 00:06:17 - INFO - codeparrot_training - Step 45874: {'lr': 8.412455285850218e-06, 'samples': 23488000, 'steps': 45874, 'batch_loss/train': 0.740252458723262}
12/29/2021 00:06:30 - INFO - codeparrot_training - Step 45875: {'lr': 8.408401013257905e-06, 'samples': 23488512, 'steps': 45875, 'batch_loss/train': 0.6587042239261791}
12/29/2021 00:06:41 - INFO - codeparrot_training - Step 45876: {'lr': 8.404347701142595e-06, 'samples': 23489024, 'steps': 45876, 'batch_loss/train': 0.7023724457249045}
12/29/2021 00:06:52 - INFO - codeparrot_training - Step 45877: {'lr': 8.40029534952036e-06, 'samples': 23489536, 'steps': 45877, 'batch_loss/train': 0.7282546758651733}
12/29/2021 00:07:04 - INFO - codeparrot_training - Step 45878: {'lr': 8.396243958407324e-06, 'samples': 23490048, 'steps': 45878, 'batch_loss/train': 0.7865441273897886}
12/29/2021 00:07:15 - INFO - codeparrot_training - Step 45879: {'lr': 8.392193527819585e-06, 'samples': 23490560, 'steps': 45879, 'batch_loss/train': 0.6800500359386206}
12/29/2021 00:07:25 - INFO - codeparrot_training - Step 45880: {'lr': 8.388144057773272e-06, 'samples': 23491072, 'steps': 45880, 'batch_loss/train': 0.7132895449176431}
12/29/2021 00:07:36 - INFO - codeparrot_training - Step 45881: {'lr': 8.384095548284453e-06, 'samples': 23491584, 'steps': 45881, 'batch_loss/train': 0.769105423707515}
12/29/2021 00:07:50 - INFO - codeparrot_training - Step 45882: {'lr': 8.380047999369256e-06, 'samples': 23492096, 'steps': 45882, 'batch_loss/train': 0.8487921603955328}
12/29/2021 00:08:01 - INFO - codeparrot_training - Step 45883: {'lr': 8.376001411043776e-06, 'samples': 23492608, 'steps': 45883, 'batch_loss/train': 0.7835700260475278}
12/29/2021 00:08:11 - INFO - codeparrot_training - Step 45884: {'lr': 8.37195578332403e-06, 'samples': 23493120, 'steps': 45884, 'batch_loss/train': 0.7395263556391001}
12/29/2021 00:08:23 - INFO - codeparrot_training - Step 45885: {'lr': 8.367911116226173e-06, 'samples': 23493632, 'steps': 45885, 'batch_loss/train': 0.7505915723158978}
12/29/2021 00:08:34 - INFO - codeparrot_training - Step 45886: {'lr': 8.363867409766301e-06, 'samples': 23494144, 'steps': 45886, 'batch_loss/train': 0.6959035250474699}
12/29/2021 00:08:45 - INFO - codeparrot_training - Step 45887: {'lr': 8.359824663960403e-06, 'samples': 23494656, 'steps': 45887, 'batch_loss/train': 0.5453165043145418}
12/29/2021 00:08:57 - INFO - codeparrot_training - Step 45888: {'lr': 8.355782878824602e-06, 'samples': 23495168, 'steps': 45888, 'batch_loss/train': 0.6720680617727339}
12/29/2021 00:09:07 - INFO - codeparrot_training - Step 45889: {'lr': 8.351742054375028e-06, 'samples': 23495680, 'steps': 45889, 'batch_loss/train': 0.7677802318939939}
12/29/2021 00:09:18 - INFO - codeparrot_training - Step 45890: {'lr': 8.347702190627638e-06, 'samples': 23496192, 'steps': 45890, 'batch_loss/train': 0.7828388791531324}
12/29/2021 00:09:28 - INFO - codeparrot_training - Step 45891: {'lr': 8.34366328759853e-06, 'samples': 23496704, 'steps': 45891, 'batch_loss/train': 0.7348129358142614}
12/29/2021 00:09:42 - INFO - codeparrot_training - Step 45892: {'lr': 8.339625345303803e-06, 'samples': 23497216, 'steps': 45892, 'batch_loss/train': 0.7042009723372757}
12/29/2021 00:09:53 - INFO - codeparrot_training - Step 45893: {'lr': 8.335588363759443e-06, 'samples': 23497728, 'steps': 45893, 'batch_loss/train': 0.7585695607122034}
12/29/2021 00:10:04 - INFO - codeparrot_training - Step 45894: {'lr': 8.331552342981552e-06, 'samples': 23498240, 'steps': 45894, 'batch_loss/train': 0.7390518842730671}
12/29/2021 00:10:16 - INFO - codeparrot_training - Step 45895: {'lr': 8.327517282986197e-06, 'samples': 23498752, 'steps': 45895, 'batch_loss/train': 0.8697770384605974}
12/29/2021 00:10:27 - INFO - codeparrot_training - Step 45896: {'lr': 8.32348318378931e-06, 'samples': 23499264, 'steps': 45896, 'batch_loss/train': 0.7075217714300379}
12/29/2021 00:10:37 - INFO - codeparrot_training - Step 45897: {'lr': 8.319450045407045e-06, 'samples': 23499776, 'steps': 45897, 'batch_loss/train': 0.7940001413226128}
12/29/2021 00:10:51 - INFO - codeparrot_training - Step 45898: {'lr': 8.31541786785539e-06, 'samples': 23500288, 'steps': 45898, 'batch_loss/train': 0.7849208964034915}
12/29/2021 00:11:02 - INFO - codeparrot_training - Step 45899: {'lr': 8.31138665115036e-06, 'samples': 23500800, 'steps': 45899, 'batch_loss/train': 0.7370990357594565}
12/29/2021 00:11:12 - INFO - codeparrot_training - Step 45900: {'lr': 8.307356395308024e-06, 'samples': 23501312, 'steps': 45900, 'batch_loss/train': 0.6697063744068146}
12/29/2021 00:11:23 - INFO - codeparrot_training - Step 45901: {'lr': 8.30332710034437e-06, 'samples': 23501824, 'steps': 45901, 'batch_loss/train': 0.692655669467058}
12/29/2021 00:11:35 - INFO - codeparrot_training - Step 45902: {'lr': 8.299298766275415e-06, 'samples': 23502336, 'steps': 45902, 'batch_loss/train': 0.742781805456616}
12/29/2021 00:11:45 - INFO - codeparrot_training - Step 45903: {'lr': 8.295271393117172e-06, 'samples': 23502848, 'steps': 45903, 'batch_loss/train': 0.6456279149278998}
12/29/2021 00:11:56 - INFO - codeparrot_training - Step 45904: {'lr': 8.291244980885681e-06, 'samples': 23503360, 'steps': 45904, 'batch_loss/train': 0.7076173331588507}
12/29/2021 00:12:08 - INFO - codeparrot_training - Step 45905: {'lr': 8.287219529596935e-06, 'samples': 23503872, 'steps': 45905, 'batch_loss/train': 0.6200220019090921}
12/29/2021 00:12:19 - INFO - codeparrot_training - Step 45906: {'lr': 8.283195039266943e-06, 'samples': 23504384, 'steps': 45906, 'batch_loss/train': 0.7321348460391164}
12/29/2021 00:12:29 - INFO - codeparrot_training - Step 45907: {'lr': 8.279171509911699e-06, 'samples': 23504896, 'steps': 45907, 'batch_loss/train': 0.7090786879416555}
12/29/2021 00:12:42 - INFO - codeparrot_training - Step 45908: {'lr': 8.275148941547183e-06, 'samples': 23505408, 'steps': 45908, 'batch_loss/train': 0.6757059150841087}
12/29/2021 00:12:52 - INFO - codeparrot_training - Step 45909: {'lr': 8.271127334189443e-06, 'samples': 23505920, 'steps': 45909, 'batch_loss/train': 0.7220122233848087}
12/29/2021 00:13:03 - INFO - codeparrot_training - Step 45910: {'lr': 8.267106687854381e-06, 'samples': 23506432, 'steps': 45910, 'batch_loss/train': 0.8641180284321308}
12/29/2021 00:13:17 - INFO - codeparrot_training - Step 45911: {'lr': 8.26308700255804e-06, 'samples': 23506944, 'steps': 45911, 'batch_loss/train': 0.72084323852323}
12/29/2021 00:13:27 - INFO - codeparrot_training - Step 45912: {'lr': 8.259068278316406e-06, 'samples': 23507456, 'steps': 45912, 'batch_loss/train': 0.8033717847429216}
12/29/2021 00:13:38 - INFO - codeparrot_training - Step 45913: {'lr': 8.25505051514544e-06, 'samples': 23507968, 'steps': 45913, 'batch_loss/train': 0.7471996219828725}
12/29/2021 00:13:49 - INFO - codeparrot_training - Step 45914: {'lr': 8.251033713061073e-06, 'samples': 23508480, 'steps': 45914, 'batch_loss/train': 0.7887028479017317}
12/29/2021 00:14:01 - INFO - codeparrot_training - Step 45915: {'lr': 8.247017872079377e-06, 'samples': 23508992, 'steps': 45915, 'batch_loss/train': 0.7744613456306979}
12/29/2021 00:14:12 - INFO - codeparrot_training - Step 45916: {'lr': 8.243002992216198e-06, 'samples': 23509504, 'steps': 45916, 'batch_loss/train': 0.7088074064813554}
12/29/2021 00:14:22 - INFO - codeparrot_training - Step 45917: {'lr': 8.238989073487579e-06, 'samples': 23510016, 'steps': 45917, 'batch_loss/train': 0.757908150088042}
12/29/2021 00:14:35 - INFO - codeparrot_training - Step 45918: {'lr': 8.234976115909453e-06, 'samples': 23510528, 'steps': 45918, 'batch_loss/train': 0.6543422963004559}
12/29/2021 00:14:45 - INFO - codeparrot_training - Step 45919: {'lr': 8.23096411949778e-06, 'samples': 23511040, 'steps': 45919, 'batch_loss/train': 0.4605534464935772}
12/29/2021 00:14:56 - INFO - codeparrot_training - Step 45920: {'lr': 8.22695308426849e-06, 'samples': 23511552, 'steps': 45920, 'batch_loss/train': 0.685928335878998}
12/29/2021 00:15:07 - INFO - codeparrot_training - Step 45921: {'lr': 8.222943010237572e-06, 'samples': 23512064, 'steps': 45921, 'batch_loss/train': 0.6867907484993339}
12/29/2021 00:15:20 - INFO - codeparrot_training - Step 45922: {'lr': 8.218933897420927e-06, 'samples': 23512576, 'steps': 45922, 'batch_loss/train': 0.8677508528344333}
12/29/2021 00:15:31 - INFO - codeparrot_training - Step 45923: {'lr': 8.214925745834489e-06, 'samples': 23513088, 'steps': 45923, 'batch_loss/train': 0.6681092967046425}
12/29/2021 00:15:42 - INFO - codeparrot_training - Step 45924: {'lr': 8.210918555494246e-06, 'samples': 23513600, 'steps': 45924, 'batch_loss/train': 0.8505732230842113}
12/29/2021 00:15:54 - INFO - codeparrot_training - Step 45925: {'lr': 8.206912326416071e-06, 'samples': 23514112, 'steps': 45925, 'batch_loss/train': 0.5908297869609669}
12/29/2021 00:16:05 - INFO - codeparrot_training - Step 45926: {'lr': 8.2029070586159e-06, 'samples': 23514624, 'steps': 45926, 'batch_loss/train': 0.7207009163685143}
12/29/2021 00:16:15 - INFO - codeparrot_training - Step 45927: {'lr': 8.198902752109716e-06, 'samples': 23515136, 'steps': 45927, 'batch_loss/train': 0.6977292550727725}
12/29/2021 00:16:29 - INFO - codeparrot_training - Step 45928: {'lr': 8.19489940691337e-06, 'samples': 23515648, 'steps': 45928, 'batch_loss/train': 0.7869254238903522}
12/29/2021 00:16:40 - INFO - codeparrot_training - Step 45929: {'lr': 8.19089702304282e-06, 'samples': 23516160, 'steps': 45929, 'batch_loss/train': 0.6917821569368243}
12/29/2021 00:16:50 - INFO - codeparrot_training - Step 45930: {'lr': 8.186895600513944e-06, 'samples': 23516672, 'steps': 45930, 'batch_loss/train': 0.7608742658048868}
12/29/2021 00:17:01 - INFO - codeparrot_training - Step 45931: {'lr': 8.182895139342645e-06, 'samples': 23517184, 'steps': 45931, 'batch_loss/train': 0.6987920677638613}
12/29/2021 00:17:13 - INFO - codeparrot_training - Step 45932: {'lr': 8.178895639544882e-06, 'samples': 23517696, 'steps': 45932, 'batch_loss/train': 0.6686399793252349}
12/29/2021 00:17:24 - INFO - codeparrot_training - Step 45933: {'lr': 8.174897101136502e-06, 'samples': 23518208, 'steps': 45933, 'batch_loss/train': 0.7463299809023738}
12/29/2021 00:17:34 - INFO - codeparrot_training - Step 45934: {'lr': 8.170899524133412e-06, 'samples': 23518720, 'steps': 45934, 'batch_loss/train': 0.7023141696117818}
12/29/2021 00:17:47 - INFO - codeparrot_training - Step 45935: {'lr': 8.166902908551571e-06, 'samples': 23519232, 'steps': 45935, 'batch_loss/train': 0.7883778037503362}
12/29/2021 00:17:57 - INFO - codeparrot_training - Step 45936: {'lr': 8.162907254406742e-06, 'samples': 23519744, 'steps': 45936, 'batch_loss/train': 0.5468595686834306}
12/29/2021 00:18:08 - INFO - codeparrot_training - Step 45937: {'lr': 8.158912561714887e-06, 'samples': 23520256, 'steps': 45937, 'batch_loss/train': 2.2797786127775908}
12/29/2021 00:18:22 - INFO - codeparrot_training - Step 45938: {'lr': 8.154918830491937e-06, 'samples': 23520768, 'steps': 45938, 'batch_loss/train': 0.7773715853691101}
12/29/2021 00:18:33 - INFO - codeparrot_training - Step 45939: {'lr': 8.150926060753683e-06, 'samples': 23521280, 'steps': 45939, 'batch_loss/train': 0.7608843627385795}
12/29/2021 00:18:43 - INFO - codeparrot_training - Step 45940: {'lr': 8.146934252515974e-06, 'samples': 23521792, 'steps': 45940, 'batch_loss/train': 0.6967465593479574}
12/29/2021 00:18:54 - INFO - codeparrot_training - Step 45941: {'lr': 8.142943405794829e-06, 'samples': 23522304, 'steps': 45941, 'batch_loss/train': 0.5909591112285852}
12/29/2021 00:19:06 - INFO - codeparrot_training - Step 45942: {'lr': 8.138953520605952e-06, 'samples': 23522816, 'steps': 45942, 'batch_loss/train': 0.7973785204812884}
12/29/2021 00:19:17 - INFO - codeparrot_training - Step 45943: {'lr': 8.134964596965305e-06, 'samples': 23523328, 'steps': 45943, 'batch_loss/train': 0.8302529789507389}
12/29/2021 00:19:27 - INFO - codeparrot_training - Step 45944: {'lr': 8.130976634888682e-06, 'samples': 23523840, 'steps': 45944, 'batch_loss/train': 0.708065964281559}
12/29/2021 00:19:39 - INFO - codeparrot_training - Step 45945: {'lr': 8.126989634391985e-06, 'samples': 23524352, 'steps': 45945, 'batch_loss/train': 0.7203690647147596}
12/29/2021 00:19:50 - INFO - codeparrot_training - Step 45946: {'lr': 8.123003595491064e-06, 'samples': 23524864, 'steps': 45946, 'batch_loss/train': 0.7070555910468102}
12/29/2021 00:20:00 - INFO - codeparrot_training - Step 45947: {'lr': 8.119018518201737e-06, 'samples': 23525376, 'steps': 45947, 'batch_loss/train': 0.7463821256533265}
12/29/2021 00:20:13 - INFO - codeparrot_training - Step 45948: {'lr': 8.115034402539856e-06, 'samples': 23525888, 'steps': 45948, 'batch_loss/train': 0.727501580491662}
12/29/2021 00:20:23 - INFO - codeparrot_training - Step 45949: {'lr': 8.111051248521295e-06, 'samples': 23526400, 'steps': 45949, 'batch_loss/train': 0.7676265952177346}
12/29/2021 00:20:34 - INFO - codeparrot_training - Step 45950: {'lr': 8.107069056161848e-06, 'samples': 23526912, 'steps': 45950, 'batch_loss/train': 0.7829691204242408}
12/29/2021 00:20:44 - INFO - codeparrot_training - Step 45951: {'lr': 8.103087825477334e-06, 'samples': 23527424, 'steps': 45951, 'batch_loss/train': 0.775394706055522}
12/29/2021 00:20:58 - INFO - codeparrot_training - Step 45952: {'lr': 8.099107556483632e-06, 'samples': 23527936, 'steps': 45952, 'batch_loss/train': 0.7744486443698406}
12/29/2021 00:21:09 - INFO - codeparrot_training - Step 45953: {'lr': 8.09512824919656e-06, 'samples': 23528448, 'steps': 45953, 'batch_loss/train': 0.6965901623480022}
12/29/2021 00:21:20 - INFO - codeparrot_training - Step 45954: {'lr': 8.091149903631884e-06, 'samples': 23528960, 'steps': 45954, 'batch_loss/train': 0.7916807951405644}
12/29/2021 00:21:32 - INFO - codeparrot_training - Step 45955: {'lr': 8.087172519805452e-06, 'samples': 23529472, 'steps': 45955, 'batch_loss/train': 0.7050600721267983}
12/29/2021 00:21:42 - INFO - codeparrot_training - Step 45956: {'lr': 8.083196097733087e-06, 'samples': 23529984, 'steps': 45956, 'batch_loss/train': 0.8822928350418806}
12/29/2021 00:21:53 - INFO - codeparrot_training - Step 45957: {'lr': 8.079220637430606e-06, 'samples': 23530496, 'steps': 45957, 'batch_loss/train': 0.7469110791571438}
12/29/2021 00:22:06 - INFO - codeparrot_training - Step 45958: {'lr': 8.075246138913777e-06, 'samples': 23531008, 'steps': 45958, 'batch_loss/train': 0.761259094811976}
12/29/2021 00:22:16 - INFO - codeparrot_training - Step 45959: {'lr': 8.071272602198421e-06, 'samples': 23531520, 'steps': 45959, 'batch_loss/train': 1.1236356510780752}
12/29/2021 00:22:27 - INFO - codeparrot_training - Step 45960: {'lr': 8.067300027300356e-06, 'samples': 23532032, 'steps': 45960, 'batch_loss/train': 0.8083319319412112}
12/29/2021 00:22:37 - INFO - codeparrot_training - Step 45961: {'lr': 8.063328414235348e-06, 'samples': 23532544, 'steps': 45961, 'batch_loss/train': 0.7077586890663952}
12/29/2021 00:22:51 - INFO - codeparrot_training - Step 45962: {'lr': 8.059357763019165e-06, 'samples': 23533056, 'steps': 45962, 'batch_loss/train': 0.7960174083709717}
12/29/2021 00:23:02 - INFO - codeparrot_training - Step 45963: {'lr': 8.055388073667653e-06, 'samples': 23533568, 'steps': 45963, 'batch_loss/train': 0.8330558594316244}
12/29/2021 00:23:13 - INFO - codeparrot_training - Step 45964: {'lr': 8.051419346196603e-06, 'samples': 23534080, 'steps': 45964, 'batch_loss/train': 0.6916559273377061}
12/29/2021 00:23:25 - INFO - codeparrot_training - Step 45965: {'lr': 8.047451580621702e-06, 'samples': 23534592, 'steps': 45965, 'batch_loss/train': 0.7546906294301152}
12/29/2021 00:23:36 - INFO - codeparrot_training - Step 45966: {'lr': 8.043484776958738e-06, 'samples': 23535104, 'steps': 45966, 'batch_loss/train': 0.7510942325461656}
12/29/2021 00:23:46 - INFO - codeparrot_training - Step 45967: {'lr': 8.03951893522359e-06, 'samples': 23535616, 'steps': 45967, 'batch_loss/train': 1.0909689301624894}
12/29/2021 00:24:00 - INFO - codeparrot_training - Step 45968: {'lr': 8.035554055431937e-06, 'samples': 23536128, 'steps': 45968, 'batch_loss/train': 0.775720402598381}
12/29/2021 00:24:11 - INFO - codeparrot_training - Step 45969: {'lr': 8.031590137599549e-06, 'samples': 23536640, 'steps': 45969, 'batch_loss/train': 0.771659036166966}
12/29/2021 00:24:22 - INFO - codeparrot_training - Step 45970: {'lr': 8.027627181742186e-06, 'samples': 23537152, 'steps': 45970, 'batch_loss/train': 0.6951901218853891}
12/29/2021 00:24:32 - INFO - codeparrot_training - Step 45971: {'lr': 8.023665187875618e-06, 'samples': 23537664, 'steps': 45971, 'batch_loss/train': 0.6865398911759257}
12/29/2021 00:24:44 - INFO - codeparrot_training - Step 45972: {'lr': 8.019704156015606e-06, 'samples': 23538176, 'steps': 45972, 'batch_loss/train': 0.7026589440647513}
12/29/2021 00:24:55 - INFO - codeparrot_training - Step 45973: {'lr': 8.015744086177861e-06, 'samples': 23538688, 'steps': 45973, 'batch_loss/train': 0.631228398764506}
12/29/2021 00:25:06 - INFO - codeparrot_training - Step 45974: {'lr': 8.01178497837815e-06, 'samples': 23539200, 'steps': 45974, 'batch_loss/train': 0.7088735485449433}
12/29/2021 00:25:18 - INFO - codeparrot_training - Step 45975: {'lr': 8.007826832632236e-06, 'samples': 23539712, 'steps': 45975, 'batch_loss/train': 0.48862741189077497}
12/29/2021 00:25:28 - INFO - codeparrot_training - Step 45976: {'lr': 8.00386964895583e-06, 'samples': 23540224, 'steps': 45976, 'batch_loss/train': 0.7815499906428158}
12/29/2021 00:25:39 - INFO - codeparrot_training - Step 45977: {'lr': 7.999913427364614e-06, 'samples': 23540736, 'steps': 45977, 'batch_loss/train': 0.7022904464974999}
12/29/2021 00:25:53 - INFO - codeparrot_training - Step 45978: {'lr': 7.995958167874406e-06, 'samples': 23541248, 'steps': 45978, 'batch_loss/train': 0.7625417213421315}
12/29/2021 00:26:04 - INFO - codeparrot_training - Step 45979: {'lr': 7.992003870500919e-06, 'samples': 23541760, 'steps': 45979, 'batch_loss/train': 0.8293354790657759}
12/29/2021 00:26:14 - INFO - codeparrot_training - Step 45980: {'lr': 7.988050535259805e-06, 'samples': 23542272, 'steps': 45980, 'batch_loss/train': 0.7723077200353146}
12/29/2021 00:26:25 - INFO - codeparrot_training - Step 45981: {'lr': 7.984098162166802e-06, 'samples': 23542784, 'steps': 45981, 'batch_loss/train': 0.7690677363425493}
12/29/2021 00:26:37 - INFO - codeparrot_training - Step 45982: {'lr': 7.980146751237677e-06, 'samples': 23543296, 'steps': 45982, 'batch_loss/train': 0.7653434197418392}
12/29/2021 00:26:48 - INFO - codeparrot_training - Step 45983: {'lr': 7.976196302488109e-06, 'samples': 23543808, 'steps': 45983, 'batch_loss/train': 0.7371156215667725}
12/29/2021 00:26:58 - INFO - codeparrot_training - Step 45984: {'lr': 7.97224681593378e-06, 'samples': 23544320, 'steps': 45984, 'batch_loss/train': 0.6547699109651148}
12/29/2021 00:27:10 - INFO - codeparrot_training - Step 45985: {'lr': 7.968298291590403e-06, 'samples': 23544832, 'steps': 45985, 'batch_loss/train': 1.229565274901688}
12/29/2021 00:27:21 - INFO - codeparrot_training - Step 45986: {'lr': 7.964350729473713e-06, 'samples': 23545344, 'steps': 45986, 'batch_loss/train': 0.6364777218550444}
12/29/2021 00:27:32 - INFO - codeparrot_training - Step 45987: {'lr': 7.960404129599363e-06, 'samples': 23545856, 'steps': 45987, 'batch_loss/train': 0.6772932172752917}
12/29/2021 00:27:44 - INFO - codeparrot_training - Step 45988: {'lr': 7.956458491983066e-06, 'samples': 23546368, 'steps': 45988, 'batch_loss/train': 0.8234202042222023}
12/29/2021 00:27:55 - INFO - codeparrot_training - Step 45989: {'lr': 7.952513816640473e-06, 'samples': 23546880, 'steps': 45989, 'batch_loss/train': 0.6999631971120834}
12/29/2021 00:28:05 - INFO - codeparrot_training - Step 45990: {'lr': 7.948570103587322e-06, 'samples': 23547392, 'steps': 45990, 'batch_loss/train': 0.5882127593213227}
12/29/2021 00:28:16 - INFO - codeparrot_training - Step 45991: {'lr': 7.94462735283924e-06, 'samples': 23547904, 'steps': 45991, 'batch_loss/train': 0.6951046936446801}
12/29/2021 00:28:30 - INFO - codeparrot_training - Step 45992: {'lr': 7.94068556441191e-06, 'samples': 23548416, 'steps': 45992, 'batch_loss/train': 0.7766945324838161}
12/29/2021 00:28:40 - INFO - codeparrot_training - Step 45993: {'lr': 7.936744738321066e-06, 'samples': 23548928, 'steps': 45993, 'batch_loss/train': 0.7070937417447567}
12/29/2021 00:28:51 - INFO - codeparrot_training - Step 45994: {'lr': 7.932804874582283e-06, 'samples': 23549440, 'steps': 45994, 'batch_loss/train': 0.658990434370935}
12/29/2021 00:29:03 - INFO - codeparrot_training - Step 45995: {'lr': 7.928865973211268e-06, 'samples': 23549952, 'steps': 45995, 'batch_loss/train': 0.7550226468592882}
12/29/2021 00:29:14 - INFO - codeparrot_training - Step 45996: {'lr': 7.924928034223705e-06, 'samples': 23550464, 'steps': 45996, 'batch_loss/train': 0.6961405258625746}
12/29/2021 00:29:24 - INFO - codeparrot_training - Step 45997: {'lr': 7.920991057635218e-06, 'samples': 23550976, 'steps': 45997, 'batch_loss/train': 1.3225031602196395}
12/29/2021 00:29:38 - INFO - codeparrot_training - Step 45998: {'lr': 7.917055043461436e-06, 'samples': 23551488, 'steps': 45998, 'batch_loss/train': 0.7731918497011065}
12/29/2021 00:29:49 - INFO - codeparrot_training - Step 45999: {'lr': 7.913119991718064e-06, 'samples': 23552000, 'steps': 45999, 'batch_loss/train': 0.7835453227162361}
12/29/2021 00:30:00 - INFO - codeparrot_training - Step 46000: {'lr': 7.909185902420707e-06, 'samples': 23552512, 'steps': 46000, 'batch_loss/train': 0.619580915896222}
12/29/2021 00:30:12 - INFO - codeparrot_training - Step 46001: {'lr': 7.905252775585014e-06, 'samples': 23553024, 'steps': 46001, 'batch_loss/train': 0.7760551385581493}
12/29/2021 00:30:22 - INFO - codeparrot_training - Step 46002: {'lr': 7.90132061122667e-06, 'samples': 23553536, 'steps': 46002, 'batch_loss/train': 0.6561717865988612}
12/29/2021 00:30:33 - INFO - codeparrot_training - Step 46003: {'lr': 7.89738940936119e-06, 'samples': 23554048, 'steps': 46003, 'batch_loss/train': 0.7552500236779451}
12/29/2021 00:30:44 - INFO - codeparrot_training - Step 46004: {'lr': 7.893459170004308e-06, 'samples': 23554560, 'steps': 46004, 'batch_loss/train': 0.8095080283237621}
12/29/2021 00:30:56 - INFO - codeparrot_training - Step 46005: {'lr': 7.889529893171627e-06, 'samples': 23555072, 'steps': 46005, 'batch_loss/train': 0.6682095299474895}
12/29/2021 00:31:06 - INFO - codeparrot_training - Step 46006: {'lr': 7.885601578878742e-06, 'samples': 23555584, 'steps': 46006, 'batch_loss/train': 0.8173375884070992}
12/29/2021 00:31:17 - INFO - codeparrot_training - Step 46007: {'lr': 7.881674227141283e-06, 'samples': 23556096, 'steps': 46007, 'batch_loss/train': 0.8249724647030234}
12/29/2021 00:31:31 - INFO - codeparrot_training - Step 46008: {'lr': 7.877747837974902e-06, 'samples': 23556608, 'steps': 46008, 'batch_loss/train': 0.7390121463686228}
12/29/2021 00:31:42 - INFO - codeparrot_training - Step 46009: {'lr': 7.873822411395143e-06, 'samples': 23557120, 'steps': 46009, 'batch_loss/train': 0.9979760418646038}
12/29/2021 00:31:52 - INFO - codeparrot_training - Step 46010: {'lr': 7.869897947417631e-06, 'samples': 23557632, 'steps': 46010, 'batch_loss/train': 0.7872964357957244}
12/29/2021 00:32:05 - INFO - codeparrot_training - Step 46011: {'lr': 7.865974446057993e-06, 'samples': 23558144, 'steps': 46011, 'batch_loss/train': 0.658901026006788}
12/29/2021 00:32:15 - INFO - codeparrot_training - Step 46012: {'lr': 7.86205190733183e-06, 'samples': 23558656, 'steps': 46012, 'batch_loss/train': 0.7142264756839722}
12/29/2021 00:32:26 - INFO - codeparrot_training - Step 46013: {'lr': 7.85813033125471e-06, 'samples': 23559168, 'steps': 46013, 'batch_loss/train': 0.7885884018614888}
12/29/2021 00:32:40 - INFO - codeparrot_training - Step 46014: {'lr': 7.854209717842232e-06, 'samples': 23559680, 'steps': 46014, 'batch_loss/train': 0.6219241379294544}
12/29/2021 00:32:51 - INFO - codeparrot_training - Step 46015: {'lr': 7.850290067109966e-06, 'samples': 23560192, 'steps': 46015, 'batch_loss/train': 0.7559030069969594}
12/29/2021 00:33:01 - INFO - codeparrot_training - Step 46016: {'lr': 7.846371379073542e-06, 'samples': 23560704, 'steps': 46016, 'batch_loss/train': 0.6916403230279684}
12/29/2021 00:33:12 - INFO - codeparrot_training - Step 46017: {'lr': 7.842453653748527e-06, 'samples': 23561216, 'steps': 46017, 'batch_loss/train': 0.806477876380086}
12/29/2021 00:33:24 - INFO - codeparrot_training - Step 46018: {'lr': 7.838536891150438e-06, 'samples': 23561728, 'steps': 46018, 'batch_loss/train': 0.727889786940068}
12/29/2021 00:33:35 - INFO - codeparrot_training - Step 46019: {'lr': 7.83462109129493e-06, 'samples': 23562240, 'steps': 46019, 'batch_loss/train': 0.5840343958698213}
12/29/2021 00:33:46 - INFO - codeparrot_training - Step 46020: {'lr': 7.830706254197517e-06, 'samples': 23562752, 'steps': 46020, 'batch_loss/train': 0.7080973666161299}
12/29/2021 00:33:58 - INFO - codeparrot_training - Step 46021: {'lr': 7.826792379873742e-06, 'samples': 23563264, 'steps': 46021, 'batch_loss/train': 0.6264812012086622}
12/29/2021 00:34:08 - INFO - codeparrot_training - Step 46022: {'lr': 7.822879468339262e-06, 'samples': 23563776, 'steps': 46022, 'batch_loss/train': 0.7227882472798228}
12/29/2021 00:34:19 - INFO - codeparrot_training - Step 46023: {'lr': 7.818967519609533e-06, 'samples': 23564288, 'steps': 46023, 'batch_loss/train': 0.7424850901588798}
12/29/2021 00:34:31 - INFO - codeparrot_training - Step 46024: {'lr': 7.815056533700127e-06, 'samples': 23564800, 'steps': 46024, 'batch_loss/train': 0.728226619772613}
12/29/2021 00:34:42 - INFO - codeparrot_training - Step 46025: {'lr': 7.811146510626643e-06, 'samples': 23565312, 'steps': 46025, 'batch_loss/train': 0.6641716263256967}
12/29/2021 00:34:52 - INFO - codeparrot_training - Step 46026: {'lr': 7.807237450404569e-06, 'samples': 23565824, 'steps': 46026, 'batch_loss/train': 0.7860753117129207}
12/29/2021 00:35:03 - INFO - codeparrot_training - Step 46027: {'lr': 7.803329353049476e-06, 'samples': 23566336, 'steps': 46027, 'batch_loss/train': 0.7194254044443369}
12/29/2021 00:35:17 - INFO - codeparrot_training - Step 46028: {'lr': 7.799422218576935e-06, 'samples': 23566848, 'steps': 46028, 'batch_loss/train': 0.6905865308362991}
12/29/2021 00:35:28 - INFO - codeparrot_training - Step 46029: {'lr': 7.795516047002377e-06, 'samples': 23567360, 'steps': 46029, 'batch_loss/train': 0.8332540011033416}
12/29/2021 00:35:38 - INFO - codeparrot_training - Step 46030: {'lr': 7.791610838341428e-06, 'samples': 23567872, 'steps': 46030, 'batch_loss/train': 0.6893685760442168}
12/29/2021 00:35:50 - INFO - codeparrot_training - Step 46031: {'lr': 7.787706592609606e-06, 'samples': 23568384, 'steps': 46031, 'batch_loss/train': 0.8097557974979281}
12/29/2021 00:36:01 - INFO - codeparrot_training - Step 46032: {'lr': 7.783803309822368e-06, 'samples': 23568896, 'steps': 46032, 'batch_loss/train': 0.7750787164550275}
12/29/2021 00:36:12 - INFO - codeparrot_training - Step 46033: {'lr': 7.77990098999526e-06, 'samples': 23569408, 'steps': 46033, 'batch_loss/train': 0.7891280758194625}
12/29/2021 00:36:24 - INFO - codeparrot_training - Step 46034: {'lr': 7.775999633143876e-06, 'samples': 23569920, 'steps': 46034, 'batch_loss/train': 0.669242728035897}
12/29/2021 00:36:34 - INFO - codeparrot_training - Step 46035: {'lr': 7.772099239283598e-06, 'samples': 23570432, 'steps': 46035, 'batch_loss/train': 0.7903364300727844}
12/29/2021 00:36:45 - INFO - codeparrot_training - Step 46036: {'lr': 7.768199808429993e-06, 'samples': 23570944, 'steps': 46036, 'batch_loss/train': 0.6997565279598348}
12/29/2021 00:36:56 - INFO - codeparrot_training - Step 46037: {'lr': 7.76430134059858e-06, 'samples': 23571456, 'steps': 46037, 'batch_loss/train': 0.6152092239353806}
12/29/2021 00:37:10 - INFO - codeparrot_training - Step 46038: {'lr': 7.76040383580484e-06, 'samples': 23571968, 'steps': 46038, 'batch_loss/train': 0.7341688182204962}
12/29/2021 00:37:20 - INFO - codeparrot_training - Step 46039: {'lr': 7.75650729406424e-06, 'samples': 23572480, 'steps': 46039, 'batch_loss/train': 0.7829398917965591}
12/29/2021 00:37:31 - INFO - codeparrot_training - Step 46040: {'lr': 7.752611715392322e-06, 'samples': 23572992, 'steps': 46040, 'batch_loss/train': 0.7738297744654119}
12/29/2021 00:37:43 - INFO - codeparrot_training - Step 46041: {'lr': 7.748717099804543e-06, 'samples': 23573504, 'steps': 46041, 'batch_loss/train': 0.6800192436203361}
12/29/2021 00:37:54 - INFO - codeparrot_training - Step 46042: {'lr': 7.744823447316418e-06, 'samples': 23574016, 'steps': 46042, 'batch_loss/train': 0.7075393190607429}
12/29/2021 00:38:04 - INFO - codeparrot_training - Step 46043: {'lr': 7.74093075794341e-06, 'samples': 23574528, 'steps': 46043, 'batch_loss/train': 0.7804581262171268}
12/29/2021 00:38:18 - INFO - codeparrot_training - Step 46044: {'lr': 7.73703903170092e-06, 'samples': 23575040, 'steps': 46044, 'batch_loss/train': 0.8012520764023066}
12/29/2021 00:38:29 - INFO - codeparrot_training - Step 46045: {'lr': 7.733148268604578e-06, 'samples': 23575552, 'steps': 46045, 'batch_loss/train': 0.7792073683813214}
12/29/2021 00:38:39 - INFO - codeparrot_training - Step 46046: {'lr': 7.729258468669704e-06, 'samples': 23576064, 'steps': 46046, 'batch_loss/train': 0.6830652924254537}
12/29/2021 00:38:50 - INFO - codeparrot_training - Step 46047: {'lr': 7.725369631911811e-06, 'samples': 23576576, 'steps': 46047, 'batch_loss/train': 0.8283684030175209}
12/29/2021 00:39:02 - INFO - codeparrot_training - Step 46048: {'lr': 7.721481758346416e-06, 'samples': 23577088, 'steps': 46048, 'batch_loss/train': 0.7434368960093707}
12/29/2021 00:39:13 - INFO - codeparrot_training - Step 46049: {'lr': 7.717594847988896e-06, 'samples': 23577600, 'steps': 46049, 'batch_loss/train': 0.6952446182258427}
12/29/2021 00:39:23 - INFO - codeparrot_training - Step 46050: {'lr': 7.713708900854738e-06, 'samples': 23578112, 'steps': 46050, 'batch_loss/train': 0.6898423626553267}
12/29/2021 00:39:35 - INFO - codeparrot_training - Step 46051: {'lr': 7.709823916959401e-06, 'samples': 23578624, 'steps': 46051, 'batch_loss/train': 0.6357289962470531}
12/29/2021 00:39:46 - INFO - codeparrot_training - Step 46052: {'lr': 7.705939896318293e-06, 'samples': 23579136, 'steps': 46052, 'batch_loss/train': 0.7413702975027263}
12/29/2021 00:39:57 - INFO - codeparrot_training - Step 46053: {'lr': 7.702056838946896e-06, 'samples': 23579648, 'steps': 46053, 'batch_loss/train': 0.7348511829040945}
12/29/2021 00:40:11 - INFO - codeparrot_training - Step 46054: {'lr': 7.698174744860675e-06, 'samples': 23580160, 'steps': 46054, 'batch_loss/train': 0.603451712639071}
12/29/2021 00:40:21 - INFO - codeparrot_training - Step 46055: {'lr': 7.694293614074949e-06, 'samples': 23580672, 'steps': 46055, 'batch_loss/train': 0.7958884132094681}
12/29/2021 00:40:32 - INFO - codeparrot_training - Step 46056: {'lr': 7.69041344660526e-06, 'samples': 23581184, 'steps': 46056, 'batch_loss/train': 0.6693559210980311}
12/29/2021 00:40:44 - INFO - codeparrot_training - Step 46057: {'lr': 7.686534242467015e-06, 'samples': 23581696, 'steps': 46057, 'batch_loss/train': 0.7658480606041849}
12/29/2021 00:40:55 - INFO - codeparrot_training - Step 46058: {'lr': 7.682656001675558e-06, 'samples': 23582208, 'steps': 46058, 'batch_loss/train': 0.770905239507556}
12/29/2021 00:41:05 - INFO - codeparrot_training - Step 46059: {'lr': 7.67877872424641e-06, 'samples': 23582720, 'steps': 46059, 'batch_loss/train': 0.7600222756154835}
12/29/2021 00:41:16 - INFO - codeparrot_training - Step 46060: {'lr': 7.674902410194945e-06, 'samples': 23583232, 'steps': 46060, 'batch_loss/train': 0.877672332979273}
12/29/2021 00:41:28 - INFO - codeparrot_training - Step 46061: {'lr': 7.671027059536567e-06, 'samples': 23583744, 'steps': 46061, 'batch_loss/train': 0.7040692130103707}
12/29/2021 00:41:39 - INFO - codeparrot_training - Step 46062: {'lr': 7.667152672286625e-06, 'samples': 23584256, 'steps': 46062, 'batch_loss/train': 0.7084497702307999}
12/29/2021 00:41:49 - INFO - codeparrot_training - Step 46063: {'lr': 7.663279248460663e-06, 'samples': 23584768, 'steps': 46063, 'batch_loss/train': 0.8476134929805994}
12/29/2021 00:42:04 - INFO - codeparrot_training - Step 46064: {'lr': 7.659406788073975e-06, 'samples': 23585280, 'steps': 46064, 'batch_loss/train': 0.8810689486563206}
12/29/2021 00:42:14 - INFO - codeparrot_training - Step 46065: {'lr': 7.655535291141964e-06, 'samples': 23585792, 'steps': 46065, 'batch_loss/train': 0.6957886125892401}
12/29/2021 00:42:25 - INFO - codeparrot_training - Step 46066: {'lr': 7.65166475768006e-06, 'samples': 23586304, 'steps': 46066, 'batch_loss/train': 0.7815111763775349}
12/29/2021 00:42:36 - INFO - codeparrot_training - Step 46067: {'lr': 7.647795187703615e-06, 'samples': 23586816, 'steps': 46067, 'batch_loss/train': 0.8178627341985703}
12/29/2021 00:42:48 - INFO - codeparrot_training - Step 46068: {'lr': 7.643926581228006e-06, 'samples': 23587328, 'steps': 46068, 'batch_loss/train': 0.6731561560882255}
12/29/2021 00:42:58 - INFO - codeparrot_training - Step 46069: {'lr': 7.640058938268662e-06, 'samples': 23587840, 'steps': 46069, 'batch_loss/train': 0.7260149461217225}
12/29/2021 00:43:09 - INFO - codeparrot_training - Step 46070: {'lr': 7.636192258840935e-06, 'samples': 23588352, 'steps': 46070, 'batch_loss/train': 0.6408910490572453}
12/29/2021 00:43:21 - INFO - codeparrot_training - Step 46071: {'lr': 7.632326542960228e-06, 'samples': 23588864, 'steps': 46071, 'batch_loss/train': 0.74715765286237}
12/29/2021 00:43:32 - INFO - codeparrot_training - Step 46072: {'lr': 7.628461790641833e-06, 'samples': 23589376, 'steps': 46072, 'batch_loss/train': 0.7774887047708035}
12/29/2021 00:43:42 - INFO - codeparrot_training - Step 46073: {'lr': 7.624598001901128e-06, 'samples': 23589888, 'steps': 46073, 'batch_loss/train': 0.8510266400408}
12/29/2021 00:43:56 - INFO - codeparrot_training - Step 46074: {'lr': 7.620735176753574e-06, 'samples': 23590400, 'steps': 46074, 'batch_loss/train': 0.7470411881804466}
12/29/2021 00:44:07 - INFO - codeparrot_training - Step 46075: {'lr': 7.616873315214407e-06, 'samples': 23590912, 'steps': 46075, 'batch_loss/train': 0.652649667696096}
12/29/2021 00:44:17 - INFO - codeparrot_training - Step 46076: {'lr': 7.613012417299059e-06, 'samples': 23591424, 'steps': 46076, 'batch_loss/train': 0.6487501277588308}
12/29/2021 00:44:30 - INFO - codeparrot_training - Step 46077: {'lr': 7.609152483022825e-06, 'samples': 23591936, 'steps': 46077, 'batch_loss/train': 0.776784653775394}
12/29/2021 00:44:40 - INFO - codeparrot_training - Step 46078: {'lr': 7.6052935124011065e-06, 'samples': 23592448, 'steps': 46078, 'batch_loss/train': 1.2857203278690577}
12/29/2021 00:44:51 - INFO - codeparrot_training - Step 46079: {'lr': 7.6014355054492e-06, 'samples': 23592960, 'steps': 46079, 'batch_loss/train': 0.9595059405546635}
12/29/2021 00:45:02 - INFO - codeparrot_training - Step 46080: {'lr': 7.5975784621824805e-06, 'samples': 23593472, 'steps': 46080, 'batch_loss/train': 0.8367793486686423}
12/29/2021 00:45:14 - INFO - codeparrot_training - Step 46081: {'lr': 7.593722382616214e-06, 'samples': 23593984, 'steps': 46081, 'batch_loss/train': 0.4223478354688268}
12/29/2021 00:45:25 - INFO - codeparrot_training - Step 46082: {'lr': 7.589867266765804e-06, 'samples': 23594496, 'steps': 46082, 'batch_loss/train': 0.7643743762746453}
12/29/2021 00:45:35 - INFO - codeparrot_training - Step 46083: {'lr': 7.586013114646573e-06, 'samples': 23595008, 'steps': 46083, 'batch_loss/train': 0.7643522620201111}
12/29/2021 00:45:49 - INFO - codeparrot_training - Step 46084: {'lr': 7.582159926273785e-06, 'samples': 23595520, 'steps': 46084, 'batch_loss/train': 0.7487380364909768}
12/29/2021 00:46:00 - INFO - codeparrot_training - Step 46085: {'lr': 7.5783077016628175e-06, 'samples': 23596032, 'steps': 46085, 'batch_loss/train': 0.7736040120944381}
12/29/2021 00:46:11 - INFO - codeparrot_training - Step 46086: {'lr': 7.574456440828964e-06, 'samples': 23596544, 'steps': 46086, 'batch_loss/train': 0.752053695556242}
12/29/2021 00:46:23 - INFO - codeparrot_training - Step 46087: {'lr': 7.570606143787545e-06, 'samples': 23597056, 'steps': 46087, 'batch_loss/train': 0.7574810807127506}
12/29/2021 00:46:34 - INFO - codeparrot_training - Step 46088: {'lr': 7.5667568105537986e-06, 'samples': 23597568, 'steps': 46088, 'batch_loss/train': 0.7156584984622896}
12/29/2021 00:46:44 - INFO - codeparrot_training - Step 46089: {'lr': 7.562908441143185e-06, 'samples': 23598080, 'steps': 46089, 'batch_loss/train': 0.7088486095890403}
12/29/2021 00:46:55 - INFO - codeparrot_training - Step 46090: {'lr': 7.55906103557083e-06, 'samples': 23598592, 'steps': 46090, 'batch_loss/train': 0.7175275944173336}
12/29/2021 00:47:07 - INFO - codeparrot_training - Step 46091: {'lr': 7.555214593852139e-06, 'samples': 23599104, 'steps': 46091, 'batch_loss/train': 0.7646175944246352}
12/29/2021 00:47:18 - INFO - codeparrot_training - Step 46092: {'lr': 7.551369116002376e-06, 'samples': 23599616, 'steps': 46092, 'batch_loss/train': 0.7613656925968826}
12/29/2021 00:47:28 - INFO - codeparrot_training - Step 46093: {'lr': 7.54752460203681e-06, 'samples': 23600128, 'steps': 46093, 'batch_loss/train': 0.9365900354459882}
12/29/2021 00:47:42 - INFO - codeparrot_training - Step 46094: {'lr': 7.54368105197073e-06, 'samples': 23600640, 'steps': 46094, 'batch_loss/train': 0.6302250448497944}
12/29/2021 00:47:53 - INFO - codeparrot_training - Step 46095: {'lr': 7.539838465819432e-06, 'samples': 23601152, 'steps': 46095, 'batch_loss/train': 0.6902771731838584}
12/29/2021 00:48:04 - INFO - codeparrot_training - Step 46096: {'lr': 7.5359968435982086e-06, 'samples': 23601664, 'steps': 46096, 'batch_loss/train': 0.7090713521465659}
12/29/2021 00:48:16 - INFO - codeparrot_training - Step 46097: {'lr': 7.53215618532227e-06, 'samples': 23602176, 'steps': 46097, 'batch_loss/train': 0.691749129910022}
12/29/2021 00:48:27 - INFO - codeparrot_training - Step 46098:
{'lr': 7.5283164910069925e-06, 'samples': 23602688, 'steps': 46098, 'batch_loss/train': 0.6781820463947952} 12/29/2021 00:48:37 - INFO - codeparrot_training - Step 46099: {'lr': 7.524477760667503e-06, 'samples': 23603200, 'steps': 46099, 'batch_loss/train': 0.8011603783816099} 12/29/2021 00:48:48 - INFO - codeparrot_training - Step 46100: {'lr': 7.520639994319179e-06, 'samples': 23603712, 'steps': 46100, 'batch_loss/train': 0.710551664698869} 12/29/2021 00:49:02 - INFO - codeparrot_training - Step 46101: {'lr': 7.516803191977229e-06, 'samples': 23604224, 'steps': 46101, 'batch_loss/train': 0.642996154492721} 12/29/2021 00:49:12 - INFO - codeparrot_training - Step 46102: {'lr': 7.512967353656891e-06, 'samples': 23604736, 'steps': 46102, 'batch_loss/train': 0.7298121824860573} 12/29/2021 00:49:23 - INFO - codeparrot_training - Step 46103: {'lr': 7.50913247937346e-06, 'samples': 23605248, 'steps': 46103, 'batch_loss/train': 1.1661484638461843} 12/29/2021 00:49:35 - INFO - codeparrot_training - Step 46104: {'lr': 7.505298569142144e-06, 'samples': 23605760, 'steps': 46104, 'batch_loss/train': 0.7100855065509677} 12/29/2021 00:49:46 - INFO - codeparrot_training - Step 46105: {'lr': 7.501465622978182e-06, 'samples': 23606272, 'steps': 46105, 'batch_loss/train': 0.9259011810645461} 12/29/2021 00:49:56 - INFO - codeparrot_training - Step 46106: {'lr': 7.497633640896867e-06, 'samples': 23606784, 'steps': 46106, 'batch_loss/train': 0.7463904283358715} 12/29/2021 00:50:09 - INFO - codeparrot_training - Step 46107: {'lr': 7.493802622913354e-06, 'samples': 23607296, 'steps': 46107, 'batch_loss/train': 0.7572878126520663} 12/29/2021 00:50:19 - INFO - codeparrot_training - Step 46108: {'lr': 7.4899725690429355e-06, 'samples': 23607808, 'steps': 46108, 'batch_loss/train': 0.6488332080189139} 12/29/2021 00:50:30 - INFO - codeparrot_training - Step 46109: {'lr': 7.48614347930085e-06, 'samples': 23608320, 'steps': 46109, 'batch_loss/train': 0.5184781433781609} 12/29/2021 00:50:41 - 
INFO - codeparrot_training - Step 46110: {'lr': 7.482315353702224e-06, 'samples': 23608832, 'steps': 46110, 'batch_loss/train': 0.5006965766660869} 12/29/2021 00:50:55 - INFO - codeparrot_training - Step 46111: {'lr': 7.478488192262379e-06, 'samples': 23609344, 'steps': 46111, 'batch_loss/train': 0.646544421557337} 12/29/2021 00:51:05 - INFO - codeparrot_training - Step 46112: {'lr': 7.474661994996496e-06, 'samples': 23609856, 'steps': 46112, 'batch_loss/train': 0.6832068683579564} 12/29/2021 00:51:16 - INFO - codeparrot_training - Step 46113: {'lr': 7.470836761919786e-06, 'samples': 23610368, 'steps': 46113, 'batch_loss/train': 0.712757283821702} 12/29/2021 00:51:28 - INFO - codeparrot_training - Step 46114: {'lr': 7.4670124930474035e-06, 'samples': 23610880, 'steps': 46114, 'batch_loss/train': 0.7885854300111532} 12/29/2021 00:51:39 - INFO - codeparrot_training - Step 46115: {'lr': 7.46318918839467e-06, 'samples': 23611392, 'steps': 46115, 'batch_loss/train': 0.7425829991698265} 12/29/2021 00:51:49 - INFO - codeparrot_training - Step 46116: {'lr': 7.459366847976684e-06, 'samples': 23611904, 'steps': 46116, 'batch_loss/train': 0.7382542649284005} 12/29/2021 00:52:04 - INFO - codeparrot_training - Step 46117: {'lr': 7.455545471808684e-06, 'samples': 23612416, 'steps': 46117, 'batch_loss/train': 0.7302944269031286} 12/29/2021 00:52:14 - INFO - codeparrot_training - Step 46118: {'lr': 7.451725059905851e-06, 'samples': 23612928, 'steps': 46118, 'batch_loss/train': 0.5684101199731231} 12/29/2021 00:52:25 - INFO - codeparrot_training - Step 46119: {'lr': 7.447905612283368e-06, 'samples': 23613440, 'steps': 46119, 'batch_loss/train': 0.757981551811099} 12/29/2021 00:52:35 - INFO - codeparrot_training - Step 46120: {'lr': 7.444087128956417e-06, 'samples': 23613952, 'steps': 46120, 'batch_loss/train': 0.7379559352993965} 12/29/2021 00:52:48 - INFO - codeparrot_training - Step 46121: {'lr': 7.440269609940209e-06, 'samples': 23614464, 'steps': 46121, 'batch_loss/train': 
0.6861645546741784} 12/29/2021 00:52:58 - INFO - codeparrot_training - Step 46122: {'lr': 7.436453055249898e-06, 'samples': 23614976, 'steps': 46122, 'batch_loss/train': 0.7051184894517064} 12/29/2021 00:53:09 - INFO - codeparrot_training - Step 46123: {'lr': 7.4326374649006655e-06, 'samples': 23615488, 'steps': 46123, 'batch_loss/train': 0.7543215677142143} 12/29/2021 00:53:21 - INFO - codeparrot_training - Step 46124: {'lr': 7.428822838907695e-06, 'samples': 23616000, 'steps': 46124, 'batch_loss/train': 0.699434113688767} 12/29/2021 00:53:32 - INFO - codeparrot_training - Step 46125: {'lr': 7.425009177286085e-06, 'samples': 23616512, 'steps': 46125, 'batch_loss/train': 0.7182745207101107} 12/29/2021 00:53:42 - INFO - codeparrot_training - Step 46126: {'lr': 7.421196480051101e-06, 'samples': 23617024, 'steps': 46126, 'batch_loss/train': 0.7635036413557827} 12/29/2021 00:53:56 - INFO - codeparrot_training - Step 46127: {'lr': 7.417384747217815e-06, 'samples': 23617536, 'steps': 46127, 'batch_loss/train': 0.7291857525706291} 12/29/2021 00:54:07 - INFO - codeparrot_training - Step 46128: {'lr': 7.413573978801435e-06, 'samples': 23618048, 'steps': 46128, 'batch_loss/train': 0.7528557204641402} 12/29/2021 00:54:17 - INFO - codeparrot_training - Step 46129: {'lr': 7.409764174817063e-06, 'samples': 23618560, 'steps': 46129, 'batch_loss/train': 0.7379535529762506} 12/29/2021 00:54:30 - INFO - codeparrot_training - Step 46130: {'lr': 7.405955335279879e-06, 'samples': 23619072, 'steps': 46130, 'batch_loss/train': 0.7018925151787698} 12/29/2021 00:54:40 - INFO - codeparrot_training - Step 46131: {'lr': 7.402147460205011e-06, 'samples': 23619584, 'steps': 46131, 'batch_loss/train': 0.7635609852150083} 12/29/2021 00:54:51 - INFO - codeparrot_training - Step 46132: {'lr': 7.398340549607585e-06, 'samples': 23620096, 'steps': 46132, 'batch_loss/train': 0.747687438968569} 12/29/2021 00:55:02 - INFO - codeparrot_training - Step 46133: {'lr': 7.394534603502784e-06, 'samples': 
23620608, 'steps': 46133, 'batch_loss/train': 0.7271363468607888} 12/29/2021 00:55:14 - INFO - codeparrot_training - Step 46134: {'lr': 7.390729621905706e-06, 'samples': 23621120, 'steps': 46134, 'batch_loss/train': 0.6930834539234638} 12/29/2021 00:55:24 - INFO - codeparrot_training - Step 46135: {'lr': 7.386925604831479e-06, 'samples': 23621632, 'steps': 46135, 'batch_loss/train': 0.7331688974518329} 12/29/2021 00:55:35 - INFO - codeparrot_training - Step 46136: {'lr': 7.383122552295201e-06, 'samples': 23622144, 'steps': 46136, 'batch_loss/train': 0.8414242451544851} 12/29/2021 00:55:49 - INFO - codeparrot_training - Step 46137: {'lr': 7.379320464312056e-06, 'samples': 23622656, 'steps': 46137, 'batch_loss/train': 0.6853368598967791} 12/29/2021 00:55:59 - INFO - codeparrot_training - Step 46138: {'lr': 7.375519340897113e-06, 'samples': 23623168, 'steps': 46138, 'batch_loss/train': 0.75620127748698} 12/29/2021 00:56:10 - INFO - codeparrot_training - Step 46139: {'lr': 7.371719182065501e-06, 'samples': 23623680, 'steps': 46139, 'batch_loss/train': 0.7006192323751748} 12/29/2021 00:56:22 - INFO - codeparrot_training - Step 46140: {'lr': 7.367919987832289e-06, 'samples': 23624192, 'steps': 46140, 'batch_loss/train': 0.7306457150261849} 12/29/2021 00:56:33 - INFO - codeparrot_training - Step 46141: {'lr': 7.364121758212661e-06, 'samples': 23624704, 'steps': 46141, 'batch_loss/train': 0.7505391398444772} 12/29/2021 00:56:43 - INFO - codeparrot_training - Step 46142: {'lr': 7.3603244932216316e-06, 'samples': 23625216, 'steps': 46142, 'batch_loss/train': 0.7397911106236279} 12/29/2021 00:56:54 - INFO - codeparrot_training - Step 46143: {'lr': 7.356528192874329e-06, 'samples': 23625728, 'steps': 46143, 'batch_loss/train': 0.848889296175912} 12/29/2021 00:57:08 - INFO - codeparrot_training - Step 46144: {'lr': 7.352732857185879e-06, 'samples': 23626240, 'steps': 46144, 'batch_loss/train': 0.660590874729678} 12/29/2021 00:57:19 - INFO - codeparrot_training - Step 46145: 
{'lr': 7.348938486171325e-06, 'samples': 23626752, 'steps': 46145, 'batch_loss/train': 0.7536396835930645} 12/29/2021 00:57:29 - INFO - codeparrot_training - Step 46146: {'lr': 7.3451450798457945e-06, 'samples': 23627264, 'steps': 46146, 'batch_loss/train': 0.6587454058462754} 12/29/2021 00:57:41 - INFO - codeparrot_training - Step 46147: {'lr': 7.341352638224302e-06, 'samples': 23627776, 'steps': 46147, 'batch_loss/train': 0.748751358827576} 12/29/2021 00:57:52 - INFO - codeparrot_training - Step 46148: {'lr': 7.337561161322004e-06, 'samples': 23628288, 'steps': 46148, 'batch_loss/train': 0.78042056504637} 12/29/2021 00:58:03 - INFO - codeparrot_training - Step 46149: {'lr': 7.333770649153915e-06, 'samples': 23628800, 'steps': 46149, 'batch_loss/train': 0.7203565633390099} 12/29/2021 00:58:15 - INFO - codeparrot_training - Step 46150: {'lr': 7.329981101735161e-06, 'samples': 23629312, 'steps': 46150, 'batch_loss/train': 0.82735530519858} 12/29/2021 00:58:26 - INFO - codeparrot_training - Step 46151: {'lr': 7.326192519080732e-06, 'samples': 23629824, 'steps': 46151, 'batch_loss/train': 0.7470391783863306} 12/29/2021 00:58:36 - INFO - codeparrot_training - Step 46152: {'lr': 7.322404901205726e-06, 'samples': 23630336, 'steps': 46152, 'batch_loss/train': 0.9156142910942435} 12/29/2021 00:58:47 - INFO - codeparrot_training - Step 46153: {'lr': 7.318618248125269e-06, 'samples': 23630848, 'steps': 46153, 'batch_loss/train': 0.9072473386768252} 12/29/2021 00:58:59 - INFO - codeparrot_training - Step 46154: {'lr': 7.314832559854323e-06, 'samples': 23631360, 'steps': 46154, 'batch_loss/train': 0.7357209185138345} 12/29/2021 00:59:10 - INFO - codeparrot_training - Step 46155: {'lr': 7.3110478364079305e-06, 'samples': 23631872, 'steps': 46155, 'batch_loss/train': 0.6507665608078241} 12/29/2021 00:59:20 - INFO - codeparrot_training - Step 46156: {'lr': 7.307264077801218e-06, 'samples': 23632384, 'steps': 46156, 'batch_loss/train': 0.7370003708638251} 12/29/2021 00:59:34 - 
INFO - codeparrot_training - Step 46157: {'lr': 7.3034812840491736e-06, 'samples': 23632896, 'steps': 46157, 'batch_loss/train': 0.5970797234913334} 12/29/2021 00:59:45 - INFO - codeparrot_training - Step 46158: {'lr': 7.299699455166842e-06, 'samples': 23633408, 'steps': 46158, 'batch_loss/train': 0.762351471115835} 12/29/2021 00:59:56 - INFO - codeparrot_training - Step 46159: {'lr': 7.295918591169265e-06, 'samples': 23633920, 'steps': 46159, 'batch_loss/train': 0.795914996881038} 12/29/2021 01:00:08 - INFO - codeparrot_training - Step 46160: {'lr': 7.292138692071487e-06, 'samples': 23634432, 'steps': 46160, 'batch_loss/train': 0.8374386923387647} 12/29/2021 01:00:19 - INFO - codeparrot_training - Step 46161: {'lr': 7.288359757888524e-06, 'samples': 23634944, 'steps': 46161, 'batch_loss/train': 0.7433890444226563} 12/29/2021 01:00:29 - INFO - codeparrot_training - Step 46162: {'lr': 7.284581788635364e-06, 'samples': 23635456, 'steps': 46162, 'batch_loss/train': 0.7236845414154232} 12/29/2021 01:00:40 - INFO - codeparrot_training - Step 46163: {'lr': 7.280804784327105e-06, 'samples': 23635968, 'steps': 46163, 'batch_loss/train': 0.6676226185518317} 12/29/2021 01:00:52 - INFO - codeparrot_training - Step 46164: {'lr': 7.2770287449787085e-06, 'samples': 23636480, 'steps': 46164, 'batch_loss/train': 0.7470121532678604} 12/29/2021 01:01:03 - INFO - codeparrot_training - Step 46165: {'lr': 7.27325367060519e-06, 'samples': 23636992, 'steps': 46165, 'batch_loss/train': 0.724197366158478} 12/29/2021 01:01:13 - INFO - codeparrot_training - Step 46166: {'lr': 7.269479561221537e-06, 'samples': 23637504, 'steps': 46166, 'batch_loss/train': 0.7229508580639958} 12/29/2021 01:01:27 - INFO - codeparrot_training - Step 46167: {'lr': 7.265706416842849e-06, 'samples': 23638016, 'steps': 46167, 'batch_loss/train': 0.7909890897572041} 12/29/2021 01:01:38 - INFO - codeparrot_training - Step 46168: {'lr': 7.2619342374840025e-06, 'samples': 23638528, 'steps': 46168, 'batch_loss/train': 
0.705960646038875} 12/29/2021 01:01:48 - INFO - codeparrot_training - Step 46169: {'lr': 7.258163023160041e-06, 'samples': 23639040, 'steps': 46169, 'batch_loss/train': 0.7532181488350034} 12/29/2021 01:02:00 - INFO - codeparrot_training - Step 46170: {'lr': 7.2543927738860095e-06, 'samples': 23639552, 'steps': 46170, 'batch_loss/train': 0.7164127486757934} 12/29/2021 01:02:11 - INFO - codeparrot_training - Step 46171: {'lr': 7.250623489676838e-06, 'samples': 23640064, 'steps': 46171, 'batch_loss/train': 0.734382092487067} 12/29/2021 01:02:22 - INFO - codeparrot_training - Step 46172: {'lr': 7.246855170547545e-06, 'samples': 23640576, 'steps': 46172, 'batch_loss/train': 0.7484850212931633} 12/29/2021 01:02:35 - INFO - codeparrot_training - Step 46173: {'lr': 7.2430878165130885e-06, 'samples': 23641088, 'steps': 46173, 'batch_loss/train': 0.7366828750818968} 12/29/2021 01:02:46 - INFO - codeparrot_training - Step 46174: {'lr': 7.23932142758843e-06, 'samples': 23641600, 'steps': 46174, 'batch_loss/train': 0.7531184526160359} 12/29/2021 01:02:57 - INFO - codeparrot_training - Step 46175: {'lr': 7.235556003788585e-06, 'samples': 23642112, 'steps': 46175, 'batch_loss/train': 0.7868189113214612} 12/29/2021 01:03:07 - INFO - codeparrot_training - Step 46176: {'lr': 7.231791545128541e-06, 'samples': 23642624, 'steps': 46176, 'batch_loss/train': 0.770965036470443} 12/29/2021 01:03:19 - INFO - codeparrot_training - Step 46177: {'lr': 7.228028051623148e-06, 'samples': 23643136, 'steps': 46177, 'batch_loss/train': 0.782727780751884} 12/29/2021 01:03:30 - INFO - codeparrot_training - Step 46178: {'lr': 7.224265523287504e-06, 'samples': 23643648, 'steps': 46178, 'batch_loss/train': 0.701204986544326} 12/29/2021 01:03:40 - INFO - codeparrot_training - Step 46179: {'lr': 7.220503960136515e-06, 'samples': 23644160, 'steps': 46179, 'batch_loss/train': 0.7088749501854181} 12/29/2021 01:03:53 - INFO - codeparrot_training - Step 46180: {'lr': 7.216743362185085e-06, 'samples': 23644672, 
'steps': 46180, 'batch_loss/train': 0.7237128049600869} 12/29/2021 01:04:03 - INFO - codeparrot_training - Step 46181: {'lr': 7.212983729448258e-06, 'samples': 23645184, 'steps': 46181, 'batch_loss/train': 0.8180809828918427} 12/29/2021 01:04:14 - INFO - codeparrot_training - Step 46182: {'lr': 7.209225061940938e-06, 'samples': 23645696, 'steps': 46182, 'batch_loss/train': 0.754977002274245} 12/29/2021 01:04:26 - INFO - codeparrot_training - Step 46183: {'lr': 7.205467359678031e-06, 'samples': 23646208, 'steps': 46183, 'batch_loss/train': 0.6070831613615155} 12/29/2021 01:04:37 - INFO - codeparrot_training - Step 46184: {'lr': 7.2017106226745235e-06, 'samples': 23646720, 'steps': 46184, 'batch_loss/train': 0.7821896076202393} 12/29/2021 01:04:48 - INFO - codeparrot_training - Step 46185: {'lr': 7.197954850945349e-06, 'samples': 23647232, 'steps': 46185, 'batch_loss/train': 1.0508014438673854} 12/29/2021 01:04:58 - INFO - codeparrot_training - Step 46186: {'lr': 7.194200044505411e-06, 'samples': 23647744, 'steps': 46186, 'batch_loss/train': 0.7455510310828686} 12/29/2021 01:05:12 - INFO - codeparrot_training - Step 46187: {'lr': 7.190446203369671e-06, 'samples': 23648256, 'steps': 46187, 'batch_loss/train': 0.7275038491934538} 12/29/2021 01:05:23 - INFO - codeparrot_training - Step 46188: {'lr': 7.186693327553007e-06, 'samples': 23648768, 'steps': 46188, 'batch_loss/train': 0.7293491405434906} 12/29/2021 01:05:34 - INFO - codeparrot_training - Step 46189: {'lr': 7.182941417070405e-06, 'samples': 23649280, 'steps': 46189, 'batch_loss/train': 0.7273933060932904} 12/29/2021 01:05:46 - INFO - codeparrot_training - Step 46190: {'lr': 7.179190471936742e-06, 'samples': 23649792, 'steps': 46190, 'batch_loss/train': 0.6605210350826383} 12/29/2021 01:05:56 - INFO - codeparrot_training - Step 46191: {'lr': 7.175440492166896e-06, 'samples': 23650304, 'steps': 46191, 'batch_loss/train': 0.6910445131361485} 12/29/2021 01:06:07 - INFO - codeparrot_training - Step 46192: {'lr': 
7.171691477775827e-06, 'samples': 23650816, 'steps': 46192, 'batch_loss/train': 0.8196679381653666} 12/29/2021 01:06:20 - INFO - codeparrot_training - Step 46193: {'lr': 7.167943428778439e-06, 'samples': 23651328, 'steps': 46193, 'batch_loss/train': 0.7217513830401003} 12/29/2021 01:06:30 - INFO - codeparrot_training - Step 46194: {'lr': 7.164196345189611e-06, 'samples': 23651840, 'steps': 46194, 'batch_loss/train': 1.0611712979152799} 12/29/2021 01:06:41 - INFO - codeparrot_training - Step 46195: {'lr': 7.160450227024218e-06, 'samples': 23652352, 'steps': 46195, 'batch_loss/train': 0.7096968041732907} 12/29/2021 01:06:51 - INFO - codeparrot_training - Step 46196: {'lr': 7.1567050742972205e-06, 'samples': 23652864, 'steps': 46196, 'batch_loss/train': 0.6942191594280303} 12/29/2021 01:07:05 - INFO - codeparrot_training - Step 46197: {'lr': 7.152960887023469e-06, 'samples': 23653376, 'steps': 46197, 'batch_loss/train': 0.7020041110226884} 12/29/2021 01:07:16 - INFO - codeparrot_training - Step 46198: {'lr': 7.149217665217838e-06, 'samples': 23653888, 'steps': 46198, 'batch_loss/train': 0.7217406018171459} 12/29/2021 01:07:27 - INFO - codeparrot_training - Step 46199: {'lr': 7.145475408895208e-06, 'samples': 23654400, 'steps': 46199, 'batch_loss/train': 0.6596042616292834} 12/29/2021 01:07:39 - INFO - codeparrot_training - Step 46200: {'lr': 7.141734118070481e-06, 'samples': 23654912, 'steps': 46200, 'batch_loss/train': 0.7254732502624393} 12/29/2021 01:07:49 - INFO - codeparrot_training - Step 46201: {'lr': 7.1379937927585074e-06, 'samples': 23655424, 'steps': 46201, 'batch_loss/train': 0.7648061108775437} 12/29/2021 01:08:00 - INFO - codeparrot_training - Step 46202: {'lr': 7.134254432974219e-06, 'samples': 23655936, 'steps': 46202, 'batch_loss/train': 0.7222114419564605} 12/29/2021 01:08:14 - INFO - codeparrot_training - Step 46203: {'lr': 7.1305160387323545e-06, 'samples': 23656448, 'steps': 46203, 'batch_loss/train': 0.7459703786298633} 12/29/2021 01:08:25 - INFO 
- codeparrot_training - Step 46204: {'lr': 7.126778610047902e-06, 'samples': 23656960, 'steps': 46204, 'batch_loss/train': 0.7434641337022185} 12/29/2021 01:08:35 - INFO - codeparrot_training - Step 46205: {'lr': 7.123042146935682e-06, 'samples': 23657472, 'steps': 46205, 'batch_loss/train': 0.652054303791374} 12/29/2021 01:08:46 - INFO - codeparrot_training - Step 46206: {'lr': 7.11930664941049e-06, 'samples': 23657984, 'steps': 46206, 'batch_loss/train': 0.6566412427928299} 12/29/2021 01:08:58 - INFO - codeparrot_training - Step 46207: {'lr': 7.115572117487284e-06, 'samples': 23658496, 'steps': 46207, 'batch_loss/train': 0.6843708814121783} 12/29/2021 01:09:09 - INFO - codeparrot_training - Step 46208: {'lr': 7.111838551180832e-06, 'samples': 23659008, 'steps': 46208, 'batch_loss/train': 0.6885039834305644} 12/29/2021 01:09:20 - INFO - codeparrot_training - Step 46209: {'lr': 7.108105950506011e-06, 'samples': 23659520, 'steps': 46209, 'batch_loss/train': 0.7184747685678303} 12/29/2021 01:09:32 - INFO - codeparrot_training - Step 46210: {'lr': 7.104374315477613e-06, 'samples': 23660032, 'steps': 46210, 'batch_loss/train': 0.6505536064505577} 12/29/2021 01:09:43 - INFO - codeparrot_training - Step 46211: {'lr': 7.100643646110544e-06, 'samples': 23660544, 'steps': 46211, 'batch_loss/train': 0.6821804558858275} 12/29/2021 01:09:53 - INFO - codeparrot_training - Step 46212: {'lr': 7.096913942419569e-06, 'samples': 23661056, 'steps': 46212, 'batch_loss/train': 0.8628662582486868} 12/29/2021 01:10:04 - INFO - codeparrot_training - Step 46213: {'lr': 7.093185204419567e-06, 'samples': 23661568, 'steps': 46213, 'batch_loss/train': 0.9145271927118301} 12/29/2021 01:10:18 - INFO - codeparrot_training - Step 46214: {'lr': 7.089457432125329e-06, 'samples': 23662080, 'steps': 46214, 'batch_loss/train': 0.8192629935219884} 12/29/2021 01:10:29 - INFO - codeparrot_training - Step 46215: {'lr': 7.085730625551706e-06, 'samples': 23662592, 'steps': 46215, 'batch_loss/train': 
0.7325714505277574} 12/29/2021 01:10:39 - INFO - codeparrot_training - Step 46216: {'lr': 7.082004784713519e-06, 'samples': 23663104, 'steps': 46216, 'batch_loss/train': 0.6907761273905635} 12/29/2021 01:10:52 - INFO - codeparrot_training - Step 46217: {'lr': 7.078279909625507e-06, 'samples': 23663616, 'steps': 46217, 'batch_loss/train': 0.6892951841582544} 12/29/2021 01:11:02 - INFO - codeparrot_training - Step 46218: {'lr': 7.074556000302545e-06, 'samples': 23664128, 'steps': 46218, 'batch_loss/train': 0.6516545359045267} 12/29/2021 01:11:13 - INFO - codeparrot_training - Step 46219: {'lr': 7.070833056759457e-06, 'samples': 23664640, 'steps': 46219, 'batch_loss/train': 0.8861193731427193} 12/29/2021 01:11:25 - INFO - codeparrot_training - Step 46220: {'lr': 7.06711107901098e-06, 'samples': 23665152, 'steps': 46220, 'batch_loss/train': 0.6639505963539705} 12/29/2021 01:11:36 - INFO - codeparrot_training - Step 46221: {'lr': 7.063390067071934e-06, 'samples': 23665664, 'steps': 46221, 'batch_loss/train': 0.6913990564644337} 12/29/2021 01:11:46 - INFO - codeparrot_training - Step 46222: {'lr': 7.059670020957143e-06, 'samples': 23666176, 'steps': 46222, 'batch_loss/train': 0.6751316576264799} 12/29/2021 01:11:57 - INFO - codeparrot_training - Step 46223: {'lr': 7.055950940681371e-06, 'samples': 23666688, 'steps': 46223, 'batch_loss/train': 0.6868224389618263} 12/29/2021 01:12:09 - INFO - codeparrot_training - Step 46224: {'lr': 7.052232826259386e-06, 'samples': 23667200, 'steps': 46224, 'batch_loss/train': 0.7436306085437536} 12/29/2021 01:12:20 - INFO - codeparrot_training - Step 46225: {'lr': 7.048515677706007e-06, 'samples': 23667712, 'steps': 46225, 'batch_loss/train': 0.7315280737821013} 12/29/2021 01:12:30 - INFO - codeparrot_training - Step 46226: {'lr': 7.0447994950359754e-06, 'samples': 23668224, 'steps': 46226, 'batch_loss/train': 0.8529402269050479} 12/29/2021 01:12:44 - INFO - codeparrot_training - Step 46227: {'lr': 7.041084278264109e-06, 'samples': 
23668736, 'steps': 46227, 'batch_loss/train': 0.8174975384026766} 12/29/2021 01:12:55 - INFO - codeparrot_training - Step 46228: {'lr': 7.037370027405177e-06, 'samples': 23669248, 'steps': 46228, 'batch_loss/train': 0.6979124409845099} 12/29/2021 01:13:05 - INFO - codeparrot_training - Step 46229: {'lr': 7.03365674247386e-06, 'samples': 23669760, 'steps': 46229, 'batch_loss/train': 0.6533850373816676} 12/29/2021 01:13:17 - INFO - codeparrot_training - Step 46230: {'lr': 7.029944423485008e-06, 'samples': 23670272, 'steps': 46230, 'batch_loss/train': 0.7280495380982757} 12/29/2021 01:13:28 - INFO - codeparrot_training - Step 46231: {'lr': 7.026233070453386e-06, 'samples': 23670784, 'steps': 46231, 'batch_loss/train': 0.7611243552528322} 12/29/2021 01:13:39 - INFO - codeparrot_training - Step 46232: {'lr': 7.02252268339365e-06, 'samples': 23671296, 'steps': 46232, 'batch_loss/train': 0.7896661479026079} 12/29/2021 01:13:49 - INFO - codeparrot_training - Step 46233: {'lr': 7.018813262320678e-06, 'samples': 23671808, 'steps': 46233, 'batch_loss/train': 0.8249867912381887} 12/29/2021 01:14:03 - INFO - codeparrot_training - Step 46234: {'lr': 7.015104807249151e-06, 'samples': 23672320, 'steps': 46234, 'batch_loss/train': 0.7757427417673171} 12/29/2021 01:14:13 - INFO - codeparrot_training - Step 46235: {'lr': 7.011397318193807e-06, 'samples': 23672832, 'steps': 46235, 'batch_loss/train': 0.7189789566909894} 12/29/2021 01:14:24 - INFO - codeparrot_training - Step 46236: {'lr': 7.007690795169414e-06, 'samples': 23673344, 'steps': 46236, 'batch_loss/train': 0.7633631490170956} 12/29/2021 01:14:36 - INFO - codeparrot_training - Step 46237: {'lr': 7.003985238190652e-06, 'samples': 23673856, 'steps': 46237, 'batch_loss/train': 0.8518268330954015} 12/29/2021 01:14:47 - INFO - codeparrot_training - Step 46238: {'lr': 7.000280647272344e-06, 'samples': 23674368, 'steps': 46238, 'batch_loss/train': 0.7896272549405694} 12/29/2021 01:14:57 - INFO - codeparrot_training - Step 46239: 
{'lr': 6.996577022429118e-06, 'samples': 23674880, 'steps': 46239, 'batch_loss/train': 0.6629347607376985} 12/29/2021 01:15:09 - INFO - codeparrot_training - Step 46240: {'lr': 6.992874363675794e-06, 'samples': 23675392, 'steps': 46240, 'batch_loss/train': 0.8324075324926525} 12/29/2021 01:15:20 - INFO - codeparrot_training - Step 46241: {'lr': 6.989172671027e-06, 'samples': 23675904, 'steps': 46241, 'batch_loss/train': 0.8199469139799476} 12/29/2021 01:15:31 - INFO - codeparrot_training - Step 46242: {'lr': 6.985471944497529e-06, 'samples': 23676416, 'steps': 46242, 'batch_loss/train': 0.8471215311437845} 12/29/2021 01:15:44 - INFO - codeparrot_training - Step 46243: {'lr': 6.981772184102065e-06, 'samples': 23676928, 'steps': 46243, 'batch_loss/train': 0.7469015109236352} 12/29/2021 01:15:55 - INFO - codeparrot_training - Step 46244: {'lr': 6.978073389855288e-06, 'samples': 23677440, 'steps': 46244, 'batch_loss/train': 0.7544717926066369} 12/29/2021 01:16:06 - INFO - codeparrot_training - Step 46245: {'lr': 6.9743755617719675e-06, 'samples': 23677952, 'steps': 46245, 'batch_loss/train': 0.6821549953892827} 12/29/2021 01:16:16 - INFO - codeparrot_training - Step 46246: {'lr': 6.970678699866756e-06, 'samples': 23678464, 'steps': 46246, 'batch_loss/train': 0.6624634435866028} 12/29/2021 01:16:29 - INFO - codeparrot_training - Step 46247: {'lr': 6.966982804154337e-06, 'samples': 23678976, 'steps': 46247, 'batch_loss/train': 0.8087197067216039} 12/29/2021 01:16:39 - INFO - codeparrot_training - Step 46248: {'lr': 6.963287874649449e-06, 'samples': 23679488, 'steps': 46248, 'batch_loss/train': 0.7994593791663647} 12/29/2021 01:16:50 - INFO - codeparrot_training - Step 46249: {'lr': 6.959593911366774e-06, 'samples': 23680000, 'steps': 46249, 'batch_loss/train': 0.7627374036237597} 12/29/2021 01:17:04 - INFO - codeparrot_training - Step 46250: {'lr': 6.955900914320939e-06, 'samples': 23680512, 'steps': 46250, 'batch_loss/train': 0.6668106794822961} 12/29/2021 01:17:14 - 
INFO - codeparrot_training - Step 46251: {'lr': 6.952208883526712e-06, 'samples': 23681024, 'steps': 46251, 'batch_loss/train': 0.7649040995165706} 12/29/2021 01:17:25 - INFO - codeparrot_training - Step 46252: {'lr': 6.948517818998718e-06, 'samples': 23681536, 'steps': 46252, 'batch_loss/train': 0.7078189766034484} 12/29/2021 01:17:37 - INFO - codeparrot_training - Step 46253: {'lr': 6.94482772075164e-06, 'samples': 23682048, 'steps': 46253, 'batch_loss/train': 0.7372397584840655} 12/29/2021 01:17:48 - INFO - codeparrot_training - Step 46254: {'lr': 6.941138588800161e-06, 'samples': 23682560, 'steps': 46254, 'batch_loss/train': 0.7934325593523681} 12/29/2021 01:17:58 - INFO - codeparrot_training - Step 46255: {'lr': 6.937450423158937e-06, 'samples': 23683072, 'steps': 46255, 'batch_loss/train': 0.7569562918506563} 12/29/2021 01:18:09 - INFO - codeparrot_training - Step 46256: {'lr': 6.9337632238426204e-06, 'samples': 23683584, 'steps': 46256, 'batch_loss/train': 0.6708130785264075} 12/29/2021 01:18:21 - INFO - codeparrot_training - Step 46257: {'lr': 6.930076990865925e-06, 'samples': 23684096, 'steps': 46257, 'batch_loss/train': 0.6708346595987678} 12/29/2021 01:18:32 - INFO - codeparrot_training - Step 46258: {'lr': 6.926391724243419e-06, 'samples': 23684608, 'steps': 46258, 'batch_loss/train': 0.7829539831727743} 12/29/2021 01:18:42 - INFO - codeparrot_training - Step 46259: {'lr': 6.922707423989816e-06, 'samples': 23685120, 'steps': 46259, 'batch_loss/train': 0.8095382917672396} 12/29/2021 01:18:56 - INFO - codeparrot_training - Step 46260: {'lr': 6.919024090119769e-06, 'samples': 23685632, 'steps': 46260, 'batch_loss/train': 0.7618274139240384} 12/29/2021 01:19:07 - INFO - codeparrot_training - Step 46261: {'lr': 6.91534172264785e-06, 'samples': 23686144, 'steps': 46261, 'batch_loss/train': 0.5225009059067816} 12/29/2021 01:19:17 - INFO - codeparrot_training - Step 46262: {'lr': 6.91166032158877e-06, 'samples': 23686656, 'steps': 46262, 'batch_loss/train': 
0.6918770079500973} 12/29/2021 01:19:30 - INFO - codeparrot_training - Step 46263: {'lr': 6.9079798869571556e-06, 'samples': 23687168, 'steps': 46263, 'batch_loss/train': 0.6697679008357227} 12/29/2021 01:19:40 - INFO - codeparrot_training - Step 46264: {'lr': 6.904300418767606e-06, 'samples': 23687680, 'steps': 46264, 'batch_loss/train': 1.4872042655479163} 12/29/2021 01:19:51 - INFO - codeparrot_training - Step 46265: {'lr': 6.900621917034777e-06, 'samples': 23688192, 'steps': 46265, 'batch_loss/train': 0.6504272841848433} 12/29/2021 01:20:02 - INFO - codeparrot_training - Step 46266: {'lr': 6.896944381773268e-06, 'samples': 23688704, 'steps': 46266, 'batch_loss/train': 0.7300825815182179} 12/29/2021 01:20:14 - INFO - codeparrot_training - Step 46267: {'lr': 6.893267812997733e-06, 'samples': 23689216, 'steps': 46267, 'batch_loss/train': 0.712629213463515} 12/29/2021 01:20:24 - INFO - codeparrot_training - Step 46268: {'lr': 6.889592210722773e-06, 'samples': 23689728, 'steps': 46268, 'batch_loss/train': 0.5522850994602777} 12/29/2021 01:20:35 - INFO - codeparrot_training - Step 46269: {'lr': 6.885917574962986e-06, 'samples': 23690240, 'steps': 46269, 'batch_loss/train': 0.600861796643585} 12/29/2021 01:20:47 - INFO - codeparrot_training - Step 46270: {'lr': 6.882243905733027e-06, 'samples': 23690752, 'steps': 46270, 'batch_loss/train': 0.7981493826955557} 12/29/2021 01:20:58 - INFO - codeparrot_training - Step 46271: {'lr': 6.878571203047468e-06, 'samples': 23691264, 'steps': 46271, 'batch_loss/train': 0.7695967219769955} 12/29/2021 01:21:09 - INFO - codeparrot_training - Step 46272: {'lr': 6.874899466920881e-06, 'samples': 23691776, 'steps': 46272, 'batch_loss/train': 0.900913305580616} 12/29/2021 01:21:24 - INFO - codeparrot_training - Step 46273: {'lr': 6.871228697367865e-06, 'samples': 23692288, 'steps': 46273, 'batch_loss/train': 0.6639525394421071} 12/29/2021 01:21:34 - INFO - codeparrot_training - Step 46274: {'lr': 6.867558894403103e-06, 'samples': 
23692800, 'steps': 46274, 'batch_loss/train': 0.7001116685569286} 12/29/2021 01:21:45 - INFO - codeparrot_training - Step 46275: {'lr': 6.863890058041083e-06, 'samples': 23693312, 'steps': 46275, 'batch_loss/train': 0.4255490618816111} 12/29/2021 01:21:56 - INFO - codeparrot_training - Step 46276: {'lr': 6.860222188296461e-06, 'samples': 23693824, 'steps': 46276, 'batch_loss/train': 0.8049748670309782} 12/29/2021 01:22:08 - INFO - codeparrot_training - Step 46277: {'lr': 6.85655528518378e-06, 'samples': 23694336, 'steps': 46277, 'batch_loss/train': 1.3927613566629589} 12/29/2021 01:22:18 - INFO - codeparrot_training - Step 46278: {'lr': 6.852889348717611e-06, 'samples': 23694848, 'steps': 46278, 'batch_loss/train': 0.8262589322403073} 12/29/2021 01:22:29 - INFO - codeparrot_training - Step 46279: {'lr': 6.849224378912555e-06, 'samples': 23695360, 'steps': 46279, 'batch_loss/train': 0.8514679349027574} 12/29/2021 01:22:40 - INFO - codeparrot_training - Step 46280: {'lr': 6.845560375783183e-06, 'samples': 23695872, 'steps': 46280, 'batch_loss/train': 0.7071405901806429} 12/29/2021 01:22:52 - INFO - codeparrot_training - Step 46281: {'lr': 6.841897339344039e-06, 'samples': 23696384, 'steps': 46281, 'batch_loss/train': 0.7115461900830269} 12/29/2021 01:23:02 - INFO - codeparrot_training - Step 46282: {'lr': 6.838235269609694e-06, 'samples': 23696896, 'steps': 46282, 'batch_loss/train': 0.8111610822379589} 12/29/2021 01:23:13 - INFO - codeparrot_training - Step 46283: {'lr': 6.834574166594748e-06, 'samples': 23697408, 'steps': 46283, 'batch_loss/train': 0.7474937913939357} 12/29/2021 01:23:27 - INFO - codeparrot_training - Step 46284: {'lr': 6.830914030313662e-06, 'samples': 23697920, 'steps': 46284, 'batch_loss/train': 0.7335960129275918} 12/29/2021 01:23:37 - INFO - codeparrot_training - Step 46285: {'lr': 6.827254860781063e-06, 'samples': 23698432, 'steps': 46285, 'batch_loss/train': 0.6572745512239635} 12/29/2021 01:23:48 - INFO - codeparrot_training - Step 46286: 
{'lr': 6.823596658011522e-06, 'samples': 23698944, 'steps': 46286, 'batch_loss/train': 0.6699600927531719} 12/29/2021 01:24:00 - INFO - codeparrot_training - Step 46287: {'lr': 6.8199394220195e-06, 'samples': 23699456, 'steps': 46287, 'batch_loss/train': 0.6372411490301602} 12/29/2021 01:24:11 - INFO - codeparrot_training - Step 46288: {'lr': 6.816283152819597e-06, 'samples': 23699968, 'steps': 46288, 'batch_loss/train': 0.7629147057887167} 12/29/2021 01:24:22 - INFO - codeparrot_training - Step 46289: {'lr': 6.8126278504263284e-06, 'samples': 23700480, 'steps': 46289, 'batch_loss/train': 0.7757801217958331} 12/29/2021 01:24:36 - INFO - codeparrot_training - Step 46290: {'lr': 6.8089735148542105e-06, 'samples': 23700992, 'steps': 46290, 'batch_loss/train': 0.7836120137944818} 12/29/2021 01:24:46 - INFO - codeparrot_training - Step 46291: {'lr': 6.805320146117788e-06, 'samples': 23701504, 'steps': 46291, 'batch_loss/train': 0.7707259664312005} 12/29/2021 01:24:57 - INFO - codeparrot_training - Step 46292: {'lr': 6.8016677442316046e-06, 'samples': 23702016, 'steps': 46292, 'batch_loss/train': 0.7547893323935568} 12/29/2021 01:25:08 - INFO - codeparrot_training - Step 46293: {'lr': 6.798016309210148e-06, 'samples': 23702528, 'steps': 46293, 'batch_loss/train': 0.5706824902445078} 12/29/2021 01:25:20 - INFO - codeparrot_training - Step 46294: {'lr': 6.794365841067934e-06, 'samples': 23703040, 'steps': 46294, 'batch_loss/train': 0.634113761363551} 12/29/2021 01:25:30 - INFO - codeparrot_training - Step 46295: {'lr': 6.790716339819508e-06, 'samples': 23703552, 'steps': 46295, 'batch_loss/train': 0.5440907886950299} 12/29/2021 01:25:41 - INFO - codeparrot_training - Step 46296: {'lr': 6.787067805479386e-06, 'samples': 23704064, 'steps': 46296, 'batch_loss/train': 0.8391356584616005} 12/29/2021 01:25:55 - INFO - codeparrot_training - Step 46297: {'lr': 6.783420238062027e-06, 'samples': 23704576, 'steps': 46297, 'batch_loss/train': 0.7647813918301836} 12/29/2021 01:26:06 - 
INFO - codeparrot_training - Step 46298: {'lr': 6.779773637581977e-06, 'samples': 23705088, 'steps': 46298, 'batch_loss/train': 0.7438942054286599} 12/29/2021 01:26:16 - INFO - codeparrot_training - Step 46299: {'lr': 6.776128004053639e-06, 'samples': 23705600, 'steps': 46299, 'batch_loss/train': 1.5012119645252824} 12/29/2021 01:26:29 - INFO - codeparrot_training - Step 46300: {'lr': 6.7724833374916696e-06, 'samples': 23706112, 'steps': 46300, 'batch_loss/train': 0.7234385469928384} 12/29/2021 01:26:39 - INFO - codeparrot_training - Step 46301: {'lr': 6.768839637910418e-06, 'samples': 23706624, 'steps': 46301, 'batch_loss/train': 0.7957325587049127} 12/29/2021 01:26:50 - INFO - codeparrot_training - Step 46302: {'lr': 6.765196905324428e-06, 'samples': 23707136, 'steps': 46302, 'batch_loss/train': 0.8175824042409658} 12/29/2021 01:27:01 - INFO - codeparrot_training - Step 46303: {'lr': 6.76155513974816e-06, 'samples': 23707648, 'steps': 46303, 'batch_loss/train': 0.4967504390515387} 12/29/2021 01:27:13 - INFO - codeparrot_training - Step 46304: {'lr': 6.757914341196103e-06, 'samples': 23708160, 'steps': 46304, 'batch_loss/train': 0.7757249316200614} 12/29/2021 01:27:23 - INFO - codeparrot_training - Step 46305: {'lr': 6.7542745096827454e-06, 'samples': 23708672, 'steps': 46305, 'batch_loss/train': 0.6509712147526443} 12/29/2021 01:27:34 - INFO - codeparrot_training - Step 46306: {'lr': 6.750635645222547e-06, 'samples': 23709184, 'steps': 46306, 'batch_loss/train': 0.7405125377699733} 12/29/2021 01:27:46 - INFO - codeparrot_training - Step 46307: {'lr': 6.746997747829997e-06, 'samples': 23709696, 'steps': 46307, 'batch_loss/train': 0.8073592251166701} 12/29/2021 01:27:57 - INFO - codeparrot_training - Step 46308: {'lr': 6.743360817519501e-06, 'samples': 23710208, 'steps': 46308, 'batch_loss/train': 0.7924033966846764} 12/29/2021 01:28:07 - INFO - codeparrot_training - Step 46309: {'lr': 6.739724854305574e-06, 'samples': 23710720, 'steps': 46309, 'batch_loss/train': 
0.7312837494537234} 12/29/2021 01:28:20 - INFO - codeparrot_training - Step 46310: {'lr': 6.7360898582026205e-06, 'samples': 23711232, 'steps': 46310, 'batch_loss/train': 0.649235287681222} 12/29/2021 01:28:30 - INFO - codeparrot_training - Step 46311: {'lr': 6.732455829225131e-06, 'samples': 23711744, 'steps': 46311, 'batch_loss/train': 0.7372621762333438} 12/29/2021 01:28:41 - INFO - codeparrot_training - Step 46312: {'lr': 6.728822767387594e-06, 'samples': 23712256, 'steps': 46312, 'batch_loss/train': 0.7986095556989312} 12/29/2021 01:28:54 - INFO - codeparrot_training - Step 46313: {'lr': 6.725190672704357e-06, 'samples': 23712768, 'steps': 46313, 'batch_loss/train': 0.7626429637894034} 12/29/2021 01:29:05 - INFO - codeparrot_training - Step 46314: {'lr': 6.721559545189881e-06, 'samples': 23713280, 'steps': 46314, 'batch_loss/train': 0.7584756473079324} 12/29/2021 01:29:16 - INFO - codeparrot_training - Step 46315: {'lr': 6.717929384858684e-06, 'samples': 23713792, 'steps': 46315, 'batch_loss/train': 0.7809131587855518} 12/29/2021 01:29:26 - INFO - codeparrot_training - Step 46316: {'lr': 6.714300191725087e-06, 'samples': 23714304, 'steps': 46316, 'batch_loss/train': 0.7041197493090294} 12/29/2021 01:29:38 - INFO - codeparrot_training - Step 46317: {'lr': 6.710671965803605e-06, 'samples': 23714816, 'steps': 46317, 'batch_loss/train': 0.7453270321711898} 12/29/2021 01:29:49 - INFO - codeparrot_training - Step 46318: {'lr': 6.7070447071086176e-06, 'samples': 23715328, 'steps': 46318, 'batch_loss/train': 0.7665092488750815} 12/29/2021 01:30:00 - INFO - codeparrot_training - Step 46319: {'lr': 6.703418415654555e-06, 'samples': 23715840, 'steps': 46319, 'batch_loss/train': 0.7956178714521229} 12/29/2021 01:30:14 - INFO - codeparrot_training - Step 46320: {'lr': 6.699793091455825e-06, 'samples': 23716352, 'steps': 46320, 'batch_loss/train': 0.6499349007208366} 12/29/2021 01:30:24 - INFO - codeparrot_training - Step 46321: {'lr': 6.6961687345268585e-06, 'samples': 
23716864, 'steps': 46321, 'batch_loss/train': 0.8171388609334826} 12/29/2021 01:30:35 - INFO - codeparrot_training - Step 46322: {'lr': 6.692545344882062e-06, 'samples': 23717376, 'steps': 46322, 'batch_loss/train': 0.7550914604216814} 12/29/2021 01:30:47 - INFO - codeparrot_training - Step 46323: {'lr': 6.688922922535839e-06, 'samples': 23717888, 'steps': 46323, 'batch_loss/train': 0.6758306189440191} 12/29/2021 01:30:58 - INFO - codeparrot_training - Step 46324: {'lr': 6.685301467502597e-06, 'samples': 23718400, 'steps': 46324, 'batch_loss/train': 0.6977033521980047} 12/29/2021 01:31:09 - INFO - codeparrot_training - Step 46325: {'lr': 6.681680979796684e-06, 'samples': 23718912, 'steps': 46325, 'batch_loss/train': 0.7336985701695085} 12/29/2021 01:31:21 - INFO - codeparrot_training - Step 46326: {'lr': 6.678061459432589e-06, 'samples': 23719424, 'steps': 46326, 'batch_loss/train': 0.6095302058383822} 12/29/2021 01:31:31 - INFO - codeparrot_training - Step 46327: {'lr': 6.6744429064246324e-06, 'samples': 23719936, 'steps': 46327, 'batch_loss/train': 0.7876489413902164} 12/29/2021 01:31:42 - INFO - codeparrot_training - Step 46328: {'lr': 6.6708253207871664e-06, 'samples': 23720448, 'steps': 46328, 'batch_loss/train': 0.7230468858033419} 12/29/2021 01:31:52 - INFO - codeparrot_training - Step 46329: {'lr': 6.667208702534678e-06, 'samples': 23720960, 'steps': 46329, 'batch_loss/train': 0.7176405619829893} 12/29/2021 01:32:06 - INFO - codeparrot_training - Step 46330: {'lr': 6.66359305168146e-06, 'samples': 23721472, 'steps': 46330, 'batch_loss/train': 0.8081347523257136} 12/29/2021 01:32:16 - INFO - codeparrot_training - Step 46331: {'lr': 6.65997836824192e-06, 'samples': 23721984, 'steps': 46331, 'batch_loss/train': 1.100878200493753} 12/29/2021 01:32:27 - INFO - codeparrot_training - Step 46332: {'lr': 6.656364652230434e-06, 'samples': 23722496, 'steps': 46332, 'batch_loss/train': 0.7858423050493002} 12/29/2021 01:32:39 - INFO - codeparrot_training - Step 46333: 
{'lr': 6.652751903661353e-06, 'samples': 23723008, 'steps': 46333, 'batch_loss/train': 0.7421044547809288} 12/29/2021 01:32:50 - INFO - codeparrot_training - Step 46334: {'lr': 6.649140122549052e-06, 'samples': 23723520, 'steps': 46334, 'batch_loss/train': 0.7245556600391865} 12/29/2021 01:33:01 - INFO - codeparrot_training - Step 46335: {'lr': 6.645529308907883e-06, 'samples': 23724032, 'steps': 46335, 'batch_loss/train': 0.6867295685224235} 12/29/2021 01:33:11 - INFO - codeparrot_training - Step 46336: {'lr': 6.6419194627521665e-06, 'samples': 23724544, 'steps': 46336, 'batch_loss/train': 0.7139207175932825} 12/29/2021 01:33:25 - INFO - codeparrot_training - Step 46337: {'lr': 6.6383105840963355e-06, 'samples': 23725056, 'steps': 46337, 'batch_loss/train': 1.008720960933715} 12/29/2021 01:33:36 - INFO - codeparrot_training - Step 46338: {'lr': 6.634702672954685e-06, 'samples': 23725568, 'steps': 46338, 'batch_loss/train': 0.9933733409270644} 12/29/2021 01:33:46 - INFO - codeparrot_training - Step 46339: {'lr': 6.631095729341563e-06, 'samples': 23726080, 'steps': 46339, 'batch_loss/train': 0.6923622705508024} 12/29/2021 01:33:58 - INFO - codeparrot_training - Step 46340: {'lr': 6.627489753271293e-06, 'samples': 23726592, 'steps': 46340, 'batch_loss/train': 0.6768550763372332} 12/29/2021 01:34:09 - INFO - codeparrot_training - Step 46341: {'lr': 6.623884744758252e-06, 'samples': 23727104, 'steps': 46341, 'batch_loss/train': 0.7672543693333864} 12/29/2021 01:34:20 - INFO - codeparrot_training - Step 46342: {'lr': 6.620280703816734e-06, 'samples': 23727616, 'steps': 46342, 'batch_loss/train': 0.8109114640392363} 12/29/2021 01:34:30 - INFO - codeparrot_training - Step 46343: {'lr': 6.6166776304610875e-06, 'samples': 23728128, 'steps': 46343, 'batch_loss/train': 0.7410417150240391} 12/29/2021 01:34:42 - INFO - codeparrot_training - Step 46344: {'lr': 6.613075524705636e-06, 'samples': 23728640, 'steps': 46344, 'batch_loss/train': 0.5824672980234027} 12/29/2021 01:34:53 
- INFO - codeparrot_training - Step 46345: {'lr': 6.609474386564701e-06, 'samples': 23729152, 'steps': 46345, 'batch_loss/train': 0.6499365079216659} 12/29/2021 01:35:04 - INFO - codeparrot_training - Step 46346: {'lr': 6.605874216052604e-06, 'samples': 23729664, 'steps': 46346, 'batch_loss/train': 0.6780941494507715} 12/29/2021 01:35:16 - INFO - codeparrot_training - Step 46347: {'lr': 6.6022750131836115e-06, 'samples': 23730176, 'steps': 46347, 'batch_loss/train': 0.6803511157631874} 12/29/2021 01:35:26 - INFO - codeparrot_training - Step 46348: {'lr': 6.5986767779721016e-06, 'samples': 23730688, 'steps': 46348, 'batch_loss/train': 0.7352011166512966} 12/29/2021 01:35:37 - INFO - codeparrot_training - Step 46349: {'lr': 6.595079510432339e-06, 'samples': 23731200, 'steps': 46349, 'batch_loss/train': 0.7249002708122134} 12/29/2021 01:35:50 - INFO - codeparrot_training - Step 46350: {'lr': 6.5914832105786745e-06, 'samples': 23731712, 'steps': 46350, 'batch_loss/train': 0.7494561118073761} 12/29/2021 01:36:00 - INFO - codeparrot_training - Step 46351: {'lr': 6.587887878425291e-06, 'samples': 23732224, 'steps': 46351, 'batch_loss/train': 0.6769770204555243} 12/29/2021 01:36:11 - INFO - codeparrot_training - Step 46352: {'lr': 6.584293513986623e-06, 'samples': 23732736, 'steps': 46352, 'batch_loss/train': 0.6718658544123173} 12/29/2021 01:36:23 - INFO - codeparrot_training - Step 46353: {'lr': 6.58070011727685e-06, 'samples': 23733248, 'steps': 46353, 'batch_loss/train': 0.6656153229996562} 12/29/2021 01:36:34 - INFO - codeparrot_training - Step 46354: {'lr': 6.577107688310296e-06, 'samples': 23733760, 'steps': 46354, 'batch_loss/train': 0.6898913233308122} 12/29/2021 01:36:44 - INFO - codeparrot_training - Step 46355: {'lr': 6.5735162271012835e-06, 'samples': 23734272, 'steps': 46355, 'batch_loss/train': 0.8442181749269366} 12/29/2021 01:36:55 - INFO - codeparrot_training - Step 46356: {'lr': 6.569925733664051e-06, 'samples': 23734784, 'steps': 46356, 
'batch_loss/train': 0.719674464315176} 12/29/2021 01:37:07 - INFO - codeparrot_training - Step 46357: {'lr': 6.566336208012863e-06, 'samples': 23735296, 'steps': 46357, 'batch_loss/train': 0.6542035307502374} 12/29/2021 01:37:18 - INFO - codeparrot_training - Step 46358: {'lr': 6.562747650161988e-06, 'samples': 23735808, 'steps': 46358, 'batch_loss/train': 0.7905286625027657} 12/29/2021 01:37:28 - INFO - codeparrot_training - Step 46359: {'lr': 6.5591600601257465e-06, 'samples': 23736320, 'steps': 46359, 'batch_loss/train': 0.5966965243860614} 12/29/2021 01:37:41 - INFO - codeparrot_training - Step 46360: {'lr': 6.555573437918349e-06, 'samples': 23736832, 'steps': 46360, 'batch_loss/train': 0.7781512270448729} 12/29/2021 01:37:52 - INFO - codeparrot_training - Step 46361: {'lr': 6.551987783554092e-06, 'samples': 23737344, 'steps': 46361, 'batch_loss/train': 0.6806710120290518} 12/29/2021 01:38:03 - INFO - codeparrot_training - Step 46362: {'lr': 6.548403097047156e-06, 'samples': 23737856, 'steps': 46362, 'batch_loss/train': 0.6923778709024191} 12/29/2021 01:38:15 - INFO - codeparrot_training - Step 46363: {'lr': 6.544819378411893e-06, 'samples': 23738368, 'steps': 46363, 'batch_loss/train': 0.7078305138275027} 12/29/2021 01:38:25 - INFO - codeparrot_training - Step 46364: {'lr': 6.541236627662511e-06, 'samples': 23738880, 'steps': 46364, 'batch_loss/train': 0.7231001020409167} 12/29/2021 01:38:36 - INFO - codeparrot_training - Step 46365: {'lr': 6.537654844813196e-06, 'samples': 23739392, 'steps': 46365, 'batch_loss/train': 0.7875720467418432} 12/29/2021 01:38:47 - INFO - codeparrot_training - Step 46366: {'lr': 6.53407402987824e-06, 'samples': 23739904, 'steps': 46366, 'batch_loss/train': 0.7644362915307283} 12/29/2021 01:38:59 - INFO - codeparrot_training - Step 46367: {'lr': 6.530494182871938e-06, 'samples': 23740416, 'steps': 46367, 'batch_loss/train': 0.7559782052412629} 12/29/2021 01:39:10 - INFO - codeparrot_training - Step 46368: {'lr': 
6.526915303808417e-06, 'samples': 23740928, 'steps': 46368, 'batch_loss/train': 0.711293550208211} 12/29/2021 01:39:21 - INFO - codeparrot_training - Step 46369: {'lr': 6.523337392701945e-06, 'samples': 23741440, 'steps': 46369, 'batch_loss/train': 0.7268172092735767} 12/29/2021 01:39:33 - INFO - codeparrot_training - Step 46370: {'lr': 6.519760449566758e-06, 'samples': 23741952, 'steps': 46370, 'batch_loss/train': 0.7367680789902806} 12/29/2021 01:39:43 - INFO - codeparrot_training - Step 46371: {'lr': 6.5161844744170695e-06, 'samples': 23742464, 'steps': 46371, 'batch_loss/train': 0.7374270306900144} 12/29/2021 01:39:54 - INFO - codeparrot_training - Step 46372: {'lr': 6.512609467267117e-06, 'samples': 23742976, 'steps': 46372, 'batch_loss/train': 0.8619362744502723} 12/29/2021 01:40:06 - INFO - codeparrot_training - Step 46373: {'lr': 6.509035428131055e-06, 'samples': 23743488, 'steps': 46373, 'batch_loss/train': 0.6868656971491873} 12/29/2021 01:40:17 - INFO - codeparrot_training - Step 46374: {'lr': 6.505462357023151e-06, 'samples': 23744000, 'steps': 46374, 'batch_loss/train': 0.7627067128196359} 12/29/2021 01:40:28 - INFO - codeparrot_training - Step 46375: {'lr': 6.501890253957587e-06, 'samples': 23744512, 'steps': 46375, 'batch_loss/train': 0.698751755291596} 12/29/2021 01:40:40 - INFO - codeparrot_training - Step 46376: {'lr': 6.4983191189485755e-06, 'samples': 23745024, 'steps': 46376, 'batch_loss/train': 0.7070905710570514} 12/29/2021 01:40:51 - INFO - codeparrot_training - Step 46377: {'lr': 6.494748952010271e-06, 'samples': 23745536, 'steps': 46377, 'batch_loss/train': 0.6797575056552887} 12/29/2021 01:41:02 - INFO - codeparrot_training - Step 46378: {'lr': 6.491179753156967e-06, 'samples': 23746048, 'steps': 46378, 'batch_loss/train': 0.746761177200824} 12/29/2021 01:41:12 - INFO - codeparrot_training - Step 46379: {'lr': 6.487611522402737e-06, 'samples': 23746560, 'steps': 46379, 'batch_loss/train': 0.692876560613513} 12/29/2021 01:41:24 - INFO - 
codeparrot_training - Step 46380: {'lr': 6.48404425976179e-06, 'samples': 23747072, 'steps': 46380, 'batch_loss/train': 0.8897634600289166} 12/29/2021 01:41:35 - INFO - codeparrot_training - Step 46381: {'lr': 6.480477965248394e-06, 'samples': 23747584, 'steps': 46381, 'batch_loss/train': 0.7385101513937116} 12/29/2021 01:41:46 - INFO - codeparrot_training - Step 46382: {'lr': 6.476912638876648e-06, 'samples': 23748096, 'steps': 46382, 'batch_loss/train': 0.7138758156797849} 12/29/2021 01:41:58 - INFO - codeparrot_training - Step 46383: {'lr': 6.473348280660735e-06, 'samples': 23748608, 'steps': 46383, 'batch_loss/train': 0.7116339770145714} 12/29/2021 01:42:09 - INFO - codeparrot_training - Step 46384: {'lr': 6.469784890614838e-06, 'samples': 23749120, 'steps': 46384, 'batch_loss/train': 0.745890696067363} 12/29/2021 01:42:19 - INFO - codeparrot_training - Step 46385: {'lr': 6.466222468753114e-06, 'samples': 23749632, 'steps': 46385, 'batch_loss/train': 0.6964520723558962} 12/29/2021 01:42:31 - INFO - codeparrot_training - Step 46386: {'lr': 6.462661015089744e-06, 'samples': 23750144, 'steps': 46386, 'batch_loss/train': 0.6807820507092401} 12/29/2021 01:42:42 - INFO - codeparrot_training - Step 46387: {'lr': 6.459100529638911e-06, 'samples': 23750656, 'steps': 46387, 'batch_loss/train': 0.841633529169485} 12/29/2021 01:42:53 - INFO - codeparrot_training - Step 46388: {'lr': 6.45554101241469e-06, 'samples': 23751168, 'steps': 46388, 'batch_loss/train': 0.7078904000809416} 12/29/2021 01:43:03 - INFO - codeparrot_training - Step 46389: {'lr': 6.451982463431289e-06, 'samples': 23751680, 'steps': 46389, 'batch_loss/train': 0.7774503817781806} 12/29/2021 01:43:16 - INFO - codeparrot_training - Step 46390: {'lr': 6.448424882702864e-06, 'samples': 23752192, 'steps': 46390, 'batch_loss/train': 0.7463746268767864} 12/29/2021 01:43:26 - INFO - codeparrot_training - Step 46391: {'lr': 6.4448682702435146e-06, 'samples': 23752704, 'steps': 46391, 'batch_loss/train': 
0.7617997350171208} 12/29/2021 01:43:37 - INFO - codeparrot_training - Step 46392: {'lr': 6.441312626067425e-06, 'samples': 23753216, 'steps': 46392, 'batch_loss/train': 0.7201383241917938} 12/29/2021 01:43:49 - INFO - codeparrot_training - Step 46393: {'lr': 6.437757950188722e-06, 'samples': 23753728, 'steps': 46393, 'batch_loss/train': 0.6825038278475404} 12/29/2021 01:44:00 - INFO - codeparrot_training - Step 46394: {'lr': 6.4342042426215054e-06, 'samples': 23754240, 'steps': 46394, 'batch_loss/train': 0.7910317284986377} 12/29/2021 01:44:11 - INFO - codeparrot_training - Step 46395: {'lr': 6.430651503379931e-06, 'samples': 23754752, 'steps': 46395, 'batch_loss/train': 0.689269165857695} 12/29/2021 01:44:23 - INFO - codeparrot_training - Step 46396: {'lr': 6.427099732478126e-06, 'samples': 23755264, 'steps': 46396, 'batch_loss/train': 0.7834515022113919} 12/29/2021 01:44:33 - INFO - codeparrot_training - Step 46397: {'lr': 6.42354892993019e-06, 'samples': 23755776, 'steps': 46397, 'batch_loss/train': 0.6930960933677852} 12/29/2021 01:44:44 - INFO - codeparrot_training - Step 46398: {'lr': 6.419999095750278e-06, 'samples': 23756288, 'steps': 46398, 'batch_loss/train': 0.7360873953439295} 12/29/2021 01:44:54 - INFO - codeparrot_training - Step 46399: {'lr': 6.416450229952464e-06, 'samples': 23756800, 'steps': 46399, 'batch_loss/train': 0.7392800664529204} 12/29/2021 01:45:07 - INFO - codeparrot_training - Step 46400: {'lr': 6.412902332550846e-06, 'samples': 23757312, 'steps': 46400, 'batch_loss/train': 0.8809290342032909} 12/29/2021 01:45:18 - INFO - codeparrot_training - Step 46401: {'lr': 6.4093554035595806e-06, 'samples': 23757824, 'steps': 46401, 'batch_loss/train': 0.7390345067251474} 12/29/2021 01:45:29 - INFO - codeparrot_training - Step 46402: {'lr': 6.405809442992739e-06, 'samples': 23758336, 'steps': 46402, 'batch_loss/train': 0.8515718767885119} 12/29/2021 01:45:41 - INFO - codeparrot_training - Step 46403: {'lr': 6.40226445086442e-06, 'samples': 
23758848, 'steps': 46403, 'batch_loss/train': 0.7197595646139234} 12/29/2021 01:45:51 - INFO - codeparrot_training - Step 46404: {'lr': 6.398720427188698e-06, 'samples': 23759360, 'steps': 46404, 'batch_loss/train': 0.789801275357604} 12/29/2021 01:46:02 - INFO - codeparrot_training - Step 46405: {'lr': 6.395177371979727e-06, 'samples': 23759872, 'steps': 46405, 'batch_loss/train': 0.8187500997446477} 12/29/2021 01:46:15 - INFO - codeparrot_training - Step 46406: {'lr': 6.391635285251496e-06, 'samples': 23760384, 'steps': 46406, 'batch_loss/train': 0.7736975830048323} 12/29/2021 01:46:25 - INFO - codeparrot_training - Step 46407: {'lr': 6.388094167018188e-06, 'samples': 23760896, 'steps': 46407, 'batch_loss/train': 0.7934151571244001} 12/29/2021 01:46:36 - INFO - codeparrot_training - Step 46408: {'lr': 6.38455401729382e-06, 'samples': 23761408, 'steps': 46408, 'batch_loss/train': 0.7181200970662758} 12/29/2021 01:46:48 - INFO - codeparrot_training - Step 46409: {'lr': 6.381014836092463e-06, 'samples': 23761920, 'steps': 46409, 'batch_loss/train': 0.7478167889639735} 12/29/2021 01:46:59 - INFO - codeparrot_training - Step 46410: {'lr': 6.377476623428219e-06, 'samples': 23762432, 'steps': 46410, 'batch_loss/train': 0.7950517609715462} 12/29/2021 01:47:09 - INFO - codeparrot_training - Step 46411: {'lr': 6.37393937931513e-06, 'samples': 23762944, 'steps': 46411, 'batch_loss/train': 0.6500310113187879} 12/29/2021 01:47:20 - INFO - codeparrot_training - Step 46412: {'lr': 6.3704031037672695e-06, 'samples': 23763456, 'steps': 46412, 'batch_loss/train': 0.7586730099283159} 12/29/2021 01:47:32 - INFO - codeparrot_training - Step 46413: {'lr': 6.3668677967987085e-06, 'samples': 23763968, 'steps': 46413, 'batch_loss/train': 0.751184223452583} 12/29/2021 01:47:43 - INFO - codeparrot_training - Step 46414: {'lr': 6.3633334584234645e-06, 'samples': 23764480, 'steps': 46414, 'batch_loss/train': 0.7573749315924942} 12/29/2021 01:47:54 - INFO - codeparrot_training - Step 46415: 
{'lr': 6.359800088655637e-06, 'samples': 23764992, 'steps': 46415, 'batch_loss/train': 0.6161970216780901} 12/29/2021 01:48:06 - INFO - codeparrot_training - Step 46416: {'lr': 6.356267687509271e-06, 'samples': 23765504, 'steps': 46416, 'batch_loss/train': 0.8631813812535256} 12/29/2021 01:48:17 - INFO - codeparrot_training - Step 46417: {'lr': 6.3527362549983255e-06, 'samples': 23766016, 'steps': 46417, 'batch_loss/train': 0.7963525473605841} 12/29/2021 01:48:28 - INFO - codeparrot_training - Step 46418: {'lr': 6.34920579113693e-06, 'samples': 23766528, 'steps': 46418, 'batch_loss/train': 0.6296023488976061} 12/29/2021 01:48:40 - INFO - codeparrot_training - Step 46419: {'lr': 6.345676295939129e-06, 'samples': 23767040, 'steps': 46419, 'batch_loss/train': 0.751110983081162} 12/29/2021 01:48:51 - INFO - codeparrot_training - Step 46420: {'lr': 6.342147769418882e-06, 'samples': 23767552, 'steps': 46420, 'batch_loss/train': 0.7291943242307752} 12/29/2021 01:49:01 - INFO - codeparrot_training - Step 46421: {'lr': 6.338620211590235e-06, 'samples': 23768064, 'steps': 46421, 'batch_loss/train': 0.8230069817509502} 12/29/2021 01:49:12 - INFO - codeparrot_training - Step 46422: {'lr': 6.335093622467259e-06, 'samples': 23768576, 'steps': 46422, 'batch_loss/train': 0.7420104295597412} 12/29/2021 01:49:24 - INFO - codeparrot_training - Step 46423: {'lr': 6.331568002063942e-06, 'samples': 23769088, 'steps': 46423, 'batch_loss/train': 0.7246180921792984} 12/29/2021 01:49:35 - INFO - codeparrot_training - Step 46424: {'lr': 6.3280433503943025e-06, 'samples': 23769600, 'steps': 46424, 'batch_loss/train': 0.600519105559215} 12/29/2021 01:49:46 - INFO - codeparrot_training - Step 46425: {'lr': 6.324519667472384e-06, 'samples': 23770112, 'steps': 46425, 'batch_loss/train': 0.7526837214827538} 12/29/2021 01:49:58 - INFO - codeparrot_training - Step 46426: {'lr': 6.320996953312119e-06, 'samples': 23770624, 'steps': 46426, 'batch_loss/train': 0.7434886915143579} 12/29/2021 01:50:09 - 
INFO - codeparrot_training - Step 46427: {'lr': 6.317475207927581e-06, 'samples': 23771136, 'steps': 46427, 'batch_loss/train': 0.781601800583303} 12/29/2021 01:50:20 - INFO - codeparrot_training - Step 46428: {'lr': 6.313954431332758e-06, 'samples': 23771648, 'steps': 46428, 'batch_loss/train': 0.7699305498972535} 12/29/2021 01:50:32 - INFO - codeparrot_training - Step 46429: {'lr': 6.310434623541639e-06, 'samples': 23772160, 'steps': 46429, 'batch_loss/train': 0.9751603100448847} 12/29/2021 01:50:43 - INFO - codeparrot_training - Step 46430: {'lr': 6.306915784568212e-06, 'samples': 23772672, 'steps': 46430, 'batch_loss/train': 0.687387265264988} 12/29/2021 01:50:54 - INFO - codeparrot_training - Step 46431: {'lr': 6.303397914426495e-06, 'samples': 23773184, 'steps': 46431, 'batch_loss/train': 0.7126328311860561} 12/29/2021 01:51:04 - INFO - codeparrot_training - Step 46432: {'lr': 6.299881013130421e-06, 'samples': 23773696, 'steps': 46432, 'batch_loss/train': 0.7399997990578413} 12/29/2021 01:51:16 - INFO - codeparrot_training - Step 46433: {'lr': 6.296365080694033e-06, 'samples': 23774208, 'steps': 46433, 'batch_loss/train': 0.7924150517210364} 12/29/2021 01:51:27 - INFO - codeparrot_training - Step 46434: {'lr': 6.292850117131238e-06, 'samples': 23774720, 'steps': 46434, 'batch_loss/train': 0.8769879890605807} 12/29/2021 01:51:38 - INFO - codeparrot_training - Step 46435: {'lr': 6.289336122456079e-06, 'samples': 23775232, 'steps': 46435, 'batch_loss/train': 0.771605129702948} 12/29/2021 01:51:50 - INFO - codeparrot_training - Step 46436: {'lr': 6.285823096682492e-06, 'samples': 23775744, 'steps': 46436, 'batch_loss/train': 0.8264692923985422} 12/29/2021 01:52:01 - INFO - codeparrot_training - Step 46437: {'lr': 6.282311039824462e-06, 'samples': 23776256, 'steps': 46437, 'batch_loss/train': 0.8125214921310544} 12/29/2021 01:52:12 - INFO - codeparrot_training - Step 46438: {'lr': 6.278799951895897e-06, 'samples': 23776768, 'steps': 46438, 'batch_loss/train': 
0.7279928792268038} 12/29/2021 01:52:24 - INFO - codeparrot_training - Step 46439: {'lr': 6.275289832910841e-06, 'samples': 23777280, 'steps': 46439, 'batch_loss/train': 0.7321909796446562} 12/29/2021 01:52:35 - INFO - codeparrot_training - Step 46440: {'lr': 6.271780682883199e-06, 'samples': 23777792, 'steps': 46440, 'batch_loss/train': 0.6781269208877347} 12/29/2021 01:52:45 - INFO - codeparrot_training - Step 46441: {'lr': 6.268272501826905e-06, 'samples': 23778304, 'steps': 46441, 'batch_loss/train': 0.7829091819003224} 12/29/2021 01:52:56 - INFO - codeparrot_training - Step 46442: {'lr': 6.264765289755975e-06, 'samples': 23778816, 'steps': 46442, 'batch_loss/train': 0.7697457297472283} 12/29/2021 01:53:08 - INFO - codeparrot_training - Step 46443: {'lr': 6.261259046684259e-06, 'samples': 23779328, 'steps': 46443, 'batch_loss/train': 0.7417254492174834} 12/29/2021 01:53:18 - INFO - codeparrot_training - Step 46444: {'lr': 6.2577537726257465e-06, 'samples': 23779840, 'steps': 46444, 'batch_loss/train': 0.6307087338063866} 12/29/2021 01:53:29 - INFO - codeparrot_training - Step 46445: {'lr': 6.254249467594425e-06, 'samples': 23780352, 'steps': 46445, 'batch_loss/train': 0.8291733507066965} 12/29/2021 01:53:42 - INFO - codeparrot_training - Step 46446: {'lr': 6.250746131604118e-06, 'samples': 23780864, 'steps': 46446, 'batch_loss/train': 0.7342126802541316} 12/29/2021 01:53:53 - INFO - codeparrot_training - Step 46447: {'lr': 6.247243764668814e-06, 'samples': 23781376, 'steps': 46447, 'batch_loss/train': 0.4937426178948954} 12/29/2021 01:54:04 - INFO - codeparrot_training - Step 46448: {'lr': 6.243742366802474e-06, 'samples': 23781888, 'steps': 46448, 'batch_loss/train': 0.6934433486312628} 12/29/2021 01:54:16 - INFO - codeparrot_training - Step 46449: {'lr': 6.2402419380189476e-06, 'samples': 23782400, 'steps': 46449, 'batch_loss/train': 1.3155236952006817} 12/29/2021 01:54:26 - INFO - codeparrot_training - Step 46450: {'lr': 6.236742478332169e-06, 'samples': 
23782912, 'steps': 46450, 'batch_loss/train': 0.7212679823860526} 12/29/2021 01:54:37 - INFO - codeparrot_training - Step 46451: {'lr': 6.233243987756071e-06, 'samples': 23783424, 'steps': 46451, 'batch_loss/train': 0.7001624200493097} 12/29/2021 01:54:48 - INFO - codeparrot_training - Step 46452: {'lr': 6.229746466304531e-06, 'samples': 23783936, 'steps': 46452, 'batch_loss/train': 0.6948142568580806} 12/29/2021 01:55:00 - INFO - codeparrot_training - Step 46453: {'lr': 6.22624991399151e-06, 'samples': 23784448, 'steps': 46453, 'batch_loss/train': 0.6296020117588341} 12/29/2021 01:55:10 - INFO - codeparrot_training - Step 46454: {'lr': 6.22275433083086e-06, 'samples': 23784960, 'steps': 46454, 'batch_loss/train': 0.7369946800172329} 12/29/2021 01:55:21 - INFO - codeparrot_training - Step 46455: {'lr': 6.219259716836484e-06, 'samples': 23785472, 'steps': 46455, 'batch_loss/train': 0.6473789773881435} 12/29/2021 01:55:34 - INFO - codeparrot_training - Step 46456: {'lr': 6.215766072022288e-06, 'samples': 23785984, 'steps': 46456, 'batch_loss/train': 0.8186690732836723} 12/29/2021 01:55:44 - INFO - codeparrot_training - Step 46457: {'lr': 6.212273396402179e-06, 'samples': 23786496, 'steps': 46457, 'batch_loss/train': 0.7431454351171851} 12/29/2021 01:55:55 - INFO - codeparrot_training - Step 46458: {'lr': 6.208781689989979e-06, 'samples': 23787008, 'steps': 46458, 'batch_loss/train': 0.6490856271702796} 12/29/2021 01:56:07 - INFO - codeparrot_training - Step 46459: {'lr': 6.205290952799619e-06, 'samples': 23787520, 'steps': 46459, 'batch_loss/train': 0.759344800375402} 12/29/2021 01:56:18 - INFO - codeparrot_training - Step 46460: {'lr': 6.201801184845007e-06, 'samples': 23788032, 'steps': 46460, 'batch_loss/train': 0.7218066564528272} 12/29/2021 01:56:28 - INFO - codeparrot_training - Step 46461: {'lr': 6.198312386139965e-06, 'samples': 23788544, 'steps': 46461, 'batch_loss/train': 0.7368890689685941} 12/29/2021 01:56:41 - INFO - codeparrot_training - Step 46462: 
{'lr': 6.1948245566983415e-06, 'samples': 23789056, 'steps': 46462, 'batch_loss/train': 0.7741915481165051}
12/29/2021 01:56:52 - INFO - codeparrot_training - Step 46463: {'lr': 6.191337696534072e-06, 'samples': 23789568, 'steps': 46463, 'batch_loss/train': 0.8471130318939686}
12/29/2021 01:57:03 - INFO - codeparrot_training - Step 46464: {'lr': 6.187851805660977e-06, 'samples': 23790080, 'steps': 46464, 'batch_loss/train': 0.745207377942279}
12/29/2021 01:57:13 - INFO - codeparrot_training - Step 46465: {'lr': 6.184366884092907e-06, 'samples': 23790592, 'steps': 46465, 'batch_loss/train': 0.7226653029210865}
12/29/2021 01:57:26 - INFO - codeparrot_training - Step 46466: {'lr': 6.180882931843767e-06, 'samples': 23791104, 'steps': 46466, 'batch_loss/train': 0.7609192449599504}
12/29/2021 01:57:36 - INFO - codeparrot_training - Step 46467: {'lr': 6.1773999489273534e-06, 'samples': 23791616, 'steps': 46467, 'batch_loss/train': 0.5952908378094435}
12/29/2021 01:57:47 - INFO - codeparrot_training - Step 46468: {'lr': 6.173917935357543e-06, 'samples': 23792128, 'steps': 46468, 'batch_loss/train': 0.6957054417580366}
12/29/2021 01:57:59 - INFO - codeparrot_training - Step 46469: {'lr': 6.170436891148129e-06, 'samples': 23792640, 'steps': 46469, 'batch_loss/train': 0.7814653888344765}
12/29/2021 01:58:10 - INFO - codeparrot_training - Step 46470: {'lr': 6.166956816313046e-06, 'samples': 23793152, 'steps': 46470, 'batch_loss/train': 0.6921428088098764}
12/29/2021 01:58:21 - INFO - codeparrot_training - Step 46471: {'lr': 6.163477710866061e-06, 'samples': 23793664, 'steps': 46471, 'batch_loss/train': 0.6978792957961559}
12/29/2021 01:58:33 - INFO - codeparrot_training - Step 46472: {'lr': 6.159999574821024e-06, 'samples': 23794176, 'steps': 46472, 'batch_loss/train': 0.5723495418787934}
12/29/2021 01:58:43 - INFO - codeparrot_training - Step 46473: {'lr': 6.156522408191701e-06, 'samples': 23794688, 'steps': 46473, 'batch_loss/train': 0.732360870577395}
12/29/2021 01:58:54 - INFO - codeparrot_training - Step 46474: {'lr': 6.1530462109920525e-06, 'samples': 23795200, 'steps': 46474, 'batch_loss/train': 1.4842447601258755}
12/29/2021 01:59:05 - INFO - codeparrot_training - Step 46475: {'lr': 6.149570983235792e-06, 'samples': 23795712, 'steps': 46475, 'batch_loss/train': 0.7000930937938392}
12/29/2021 01:59:19 - INFO - codeparrot_training - Step 46476: {'lr': 6.146096724936712e-06, 'samples': 23796224, 'steps': 46476, 'batch_loss/train': 0.6968480418436229}
12/29/2021 01:59:30 - INFO - codeparrot_training - Step 46477: {'lr': 6.1426234361087465e-06, 'samples': 23796736, 'steps': 46477, 'batch_loss/train': 0.8055871017277241}
12/29/2021 01:59:40 - INFO - codeparrot_training - Step 46478: {'lr': 6.1391511167656065e-06, 'samples': 23797248, 'steps': 46478, 'batch_loss/train': 0.8214521887712181}
12/29/2021 01:59:51 - INFO - codeparrot_training - Step 46479: {'lr': 6.135679766921115e-06, 'samples': 23797760, 'steps': 46479, 'batch_loss/train': 1.3545818668790162}
12/29/2021 02:00:03 - INFO - codeparrot_training - Step 46480: {'lr': 6.132209386589094e-06, 'samples': 23798272, 'steps': 46480, 'batch_loss/train': 0.8609001273289323}
12/29/2021 02:00:14 - INFO - codeparrot_training - Step 46481: {'lr': 6.128739975783337e-06, 'samples': 23798784, 'steps': 46481, 'batch_loss/train': 0.7681252444162965}
12/29/2021 02:00:24 - INFO - codeparrot_training - Step 46482: {'lr': 6.125271534517612e-06, 'samples': 23799296, 'steps': 46482, 'batch_loss/train': 0.7975415172986686}
12/29/2021 02:00:36 - INFO - codeparrot_training - Step 46483: {'lr': 6.121804062805741e-06, 'samples': 23799808, 'steps': 46483, 'batch_loss/train': 0.7180179292336106}
12/29/2021 02:00:47 - INFO - codeparrot_training - Step 46484: {'lr': 6.1183375606614635e-06, 'samples': 23800320, 'steps': 46484, 'batch_loss/train': 0.7703862437047064}
12/29/2021 02:00:58 - INFO - codeparrot_training - Step 46485: {'lr': 6.1148720280986e-06, 'samples': 23800832, 'steps': 46485, 'batch_loss/train': 0.7604290153831244}
12/29/2021 02:01:11 - INFO - codeparrot_training - Step 46486: {'lr': 6.111407465130947e-06, 'samples': 23801344, 'steps': 46486, 'batch_loss/train': 0.6974516985937953}
12/29/2021 02:01:22 - INFO - codeparrot_training - Step 46487: {'lr': 6.1079438717722145e-06, 'samples': 23801856, 'steps': 46487, 'batch_loss/train': 0.788780327886343}
12/29/2021 02:01:32 - INFO - codeparrot_training - Step 46488: {'lr': 6.1044812480361974e-06, 'samples': 23802368, 'steps': 46488, 'batch_loss/train': 0.5716908075846732}
12/29/2021 02:01:43 - INFO - codeparrot_training - Step 46489: {'lr': 6.1010195939366905e-06, 'samples': 23802880, 'steps': 46489, 'batch_loss/train': 0.7438450357876718}
12/29/2021 02:01:55 - INFO - codeparrot_training - Step 46490: {'lr': 6.0975589094874325e-06, 'samples': 23803392, 'steps': 46490, 'batch_loss/train': 0.705945584923029}
12/29/2021 02:02:06 - INFO - codeparrot_training - Step 46491: {'lr': 6.09409919470219e-06, 'samples': 23803904, 'steps': 46491, 'batch_loss/train': 0.8353746677748859}
12/29/2021 02:02:16 - INFO - codeparrot_training - Step 46492: {'lr': 6.090640449594731e-06, 'samples': 23804416, 'steps': 46492, 'batch_loss/train': 0.8099678419530392}
12/29/2021 02:02:29 - INFO - codeparrot_training - Step 46493: {'lr': 6.087182674178765e-06, 'samples': 23804928, 'steps': 46493, 'batch_loss/train': 0.7870851410552859}
12/29/2021 02:02:40 - INFO - codeparrot_training - Step 46494: {'lr': 6.0837258684680595e-06, 'samples': 23805440, 'steps': 46494, 'batch_loss/train': 0.7932814084924757}
12/29/2021 02:02:50 - INFO - codeparrot_training - Step 46495: {'lr': 6.080270032476381e-06, 'samples': 23805952, 'steps': 46495, 'batch_loss/train': 0.7178882961161435}
12/29/2021 02:03:03 - INFO - codeparrot_training - Step 46496: {'lr': 6.076815166217442e-06, 'samples': 23806464, 'steps': 46496, 'batch_loss/train': 0.8195964267943054}
12/29/2021 02:03:13 - INFO - codeparrot_training - Step 46497: {'lr': 6.07336126970498e-06, 'samples': 23806976, 'steps': 46497, 'batch_loss/train': 0.7397699411958456}
12/29/2021 02:03:24 - INFO - codeparrot_training - Step 46498: {'lr': 6.069908342952734e-06, 'samples': 23807488, 'steps': 46498, 'batch_loss/train': 0.8704884201288223}
12/29/2021 02:03:36 - INFO - codeparrot_training - Step 46499: {'lr': 6.066456385974417e-06, 'samples': 23808000, 'steps': 46499, 'batch_loss/train': 0.7309365822002292}
12/29/2021 02:03:47 - INFO - codeparrot_training - Step 46500: {'lr': 6.063005398783794e-06, 'samples': 23808512, 'steps': 46500, 'batch_loss/train': 0.7682087626308203}
12/29/2021 02:03:57 - INFO - codeparrot_training - Step 46501: {'lr': 6.059555381394521e-06, 'samples': 23809024, 'steps': 46501, 'batch_loss/train': 0.6988576692529023}
12/29/2021 02:04:08 - INFO - codeparrot_training - Step 46502: {'lr': 6.056106333820338e-06, 'samples': 23809536, 'steps': 46502, 'batch_loss/train': 0.6340116629144177}
12/29/2021 02:04:21 - INFO - codeparrot_training - Step 46503: {'lr': 6.0526582560750106e-06, 'samples': 23810048, 'steps': 46503, 'batch_loss/train': 0.8244126616045833}
12/29/2021 02:04:31 - INFO - codeparrot_training - Step 46504: {'lr': 6.049211148172168e-06, 'samples': 23810560, 'steps': 46504, 'batch_loss/train': 0.5802675224840641}
12/29/2021 02:04:42 - INFO - codeparrot_training - Step 46505: {'lr': 6.045765010125576e-06, 'samples': 23811072, 'steps': 46505, 'batch_loss/train': 0.7935726951109245}
12/29/2021 02:04:54 - INFO - codeparrot_training - Step 46506: {'lr': 6.04231984194889e-06, 'samples': 23811584, 'steps': 46506, 'batch_loss/train': 0.7608450115658343}
12/29/2021 02:05:05 - INFO - codeparrot_training - Step 46507: {'lr': 6.038875643655823e-06, 'samples': 23812096, 'steps': 46507, 'batch_loss/train': 0.8260759143158793}
12/29/2021 02:05:15 - INFO - codeparrot_training - Step 46508: {'lr': 6.035432415260056e-06, 'samples': 23812608, 'steps': 46508, 'batch_loss/train': 0.7235207429621369}
12/29/2021 02:05:28 - INFO - codeparrot_training - Step 46509: {'lr': 6.031990156775357e-06, 'samples': 23813120, 'steps': 46509, 'batch_loss/train': 0.7118728188797832}
12/29/2021 02:05:39 - INFO - codeparrot_training - Step 46510: {'lr': 6.028548868215272e-06, 'samples': 23813632, 'steps': 46510, 'batch_loss/train': 0.6526904201600701}
12/29/2021 02:05:49 - INFO - codeparrot_training - Step 46511: {'lr': 6.0251085495935945e-06, 'samples': 23814144, 'steps': 46511, 'batch_loss/train': 0.7916270764544606}
12/29/2021 02:06:00 - INFO - codeparrot_training - Step 46512: {'lr': 6.0216692009239795e-06, 'samples': 23814656, 'steps': 46512, 'batch_loss/train': 0.7087158495560288}
12/29/2021 02:06:13 - INFO - codeparrot_training - Step 46513: {'lr': 6.0182308222200274e-06, 'samples': 23815168, 'steps': 46513, 'batch_loss/train': 0.7310427273623645}
12/29/2021 02:06:23 - INFO - codeparrot_training - Step 46514: {'lr': 6.014793413495534e-06, 'samples': 23815680, 'steps': 46514, 'batch_loss/train': 0.7394919523503631}
12/29/2021 02:06:34 - INFO - codeparrot_training - Step 46515: {'lr': 6.011356974764043e-06, 'samples': 23816192, 'steps': 46515, 'batch_loss/train': 0.6547909241635352}
12/29/2021 02:06:46 - INFO - codeparrot_training - Step 46516: {'lr': 6.007921506039293e-06, 'samples': 23816704, 'steps': 46516, 'batch_loss/train': 0.6222384599968791}
12/29/2021 02:06:57 - INFO - codeparrot_training - Step 46517: {'lr': 6.004487007334913e-06, 'samples': 23817216, 'steps': 46517, 'batch_loss/train': 0.7353796781972051}
12/29/2021 02:07:08 - INFO - codeparrot_training - Step 46518: {'lr': 6.001053478664558e-06, 'samples': 23817728, 'steps': 46518, 'batch_loss/train': 0.72452257364057}
12/29/2021 02:07:20 - INFO - codeparrot_training - Step 46519: {'lr': 5.997620920041913e-06, 'samples': 23818240, 'steps': 46519, 'batch_loss/train': 0.7200478194281459}
12/29/2021 02:07:30 - INFO - codeparrot_training - Step 46520: {'lr': 5.9941893314805765e-06, 'samples': 23818752, 'steps': 46520, 'batch_loss/train': 0.6804830739274621}
12/29/2021 02:07:41 - INFO - codeparrot_training - Step 46521: {'lr': 5.9907587129942045e-06, 'samples': 23819264, 'steps': 46521, 'batch_loss/train': 0.7532029319554567}
12/29/2021 02:07:52 - INFO - codeparrot_training - Step 46522: {'lr': 5.987329064596425e-06, 'samples': 23819776, 'steps': 46522, 'batch_loss/train': 0.7948908022372052}
12/29/2021 02:08:05 - INFO - codeparrot_training - Step 46523: {'lr': 5.98390038630095e-06, 'samples': 23820288, 'steps': 46523, 'batch_loss/train': 0.6846754889702424}
12/29/2021 02:08:15 - INFO - codeparrot_training - Step 46524: {'lr': 5.980472678121296e-06, 'samples': 23820800, 'steps': 46524, 'batch_loss/train': 0.6885110780131072}
12/29/2021 02:08:26 - INFO - codeparrot_training - Step 46525: {'lr': 5.9770459400711464e-06, 'samples': 23821312, 'steps': 46525, 'batch_loss/train': 0.5922926403582096}
12/29/2021 02:08:38 - INFO - codeparrot_training - Step 46526: {'lr': 5.973620172164157e-06, 'samples': 23821824, 'steps': 46526, 'batch_loss/train': 0.6758197881281376}
12/29/2021 02:08:49 - INFO - codeparrot_training - Step 46527: {'lr': 5.9701953744138726e-06, 'samples': 23822336, 'steps': 46527, 'batch_loss/train': 0.7485631455201656}
12/29/2021 02:08:59 - INFO - codeparrot_training - Step 46528: {'lr': 5.966771546833949e-06, 'samples': 23822848, 'steps': 46528, 'batch_loss/train': 0.7625657003372908}
12/29/2021 02:09:11 - INFO - codeparrot_training - Step 46529: {'lr': 5.9633486894380414e-06, 'samples': 23823360, 'steps': 46529, 'batch_loss/train': 0.7642679126001894}
12/29/2021 02:09:22 - INFO - codeparrot_training - Step 46530: {'lr': 5.959926802239696e-06, 'samples': 23823872, 'steps': 46530, 'batch_loss/train': 0.7297056037932634}
12/29/2021 02:09:32 - INFO - codeparrot_training - Step 46531: {'lr': 5.956505885252539e-06, 'samples': 23824384, 'steps': 46531, 'batch_loss/train': 0.7247404521331191}
12/29/2021 02:09:43 - INFO - codeparrot_training - Step 46532: {'lr': 5.9530859384901435e-06, 'samples': 23824896, 'steps': 46532, 'batch_loss/train': 0.7579891355708241}
12/29/2021 02:09:56 - INFO - codeparrot_training - Step 46533: {'lr': 5.949666961966138e-06, 'samples': 23825408, 'steps': 46533, 'batch_loss/train': 0.6510486899642274}
12/29/2021 02:10:06 - INFO - codeparrot_training - Step 46534: {'lr': 5.946248955694122e-06, 'samples': 23825920, 'steps': 46534, 'batch_loss/train': 0.673755771946162}
12/29/2021 02:10:17 - INFO - codeparrot_training - Step 46535: {'lr': 5.942831919687697e-06, 'samples': 23826432, 'steps': 46535, 'batch_loss/train': 0.6699563353322446}
12/29/2021 02:10:30 - INFO - codeparrot_training - Step 46536: {'lr': 5.939415853960351e-06, 'samples': 23826944, 'steps': 46536, 'batch_loss/train': 0.6711962191620842}
12/29/2021 02:10:40 - INFO - codeparrot_training - Step 46537: {'lr': 5.936000758525767e-06, 'samples': 23827456, 'steps': 46537, 'batch_loss/train': 0.6519939270219766}
12/29/2021 02:10:51 - INFO - codeparrot_training - Step 46538: {'lr': 5.932586633397491e-06, 'samples': 23827968, 'steps': 46538, 'batch_loss/train': 0.7408841736614704}
12/29/2021 02:11:02 - INFO - codeparrot_training - Step 46539: {'lr': 5.929173478589067e-06, 'samples': 23828480, 'steps': 46539, 'batch_loss/train': 0.5259209295036271}
12/29/2021 02:11:14 - INFO - codeparrot_training - Step 46540: {'lr': 5.925761294114124e-06, 'samples': 23828992, 'steps': 46540, 'batch_loss/train': 0.7611669942270964}
12/29/2021 02:11:25 - INFO - codeparrot_training - Step 46541: {'lr': 5.922350079986205e-06, 'samples': 23829504, 'steps': 46541, 'batch_loss/train': 0.5583180077373981}
12/29/2021 02:11:35 - INFO - codeparrot_training - Step 46542: {'lr': 5.918939836218856e-06, 'samples': 23830016, 'steps': 46542, 'batch_loss/train': 0.6444703817833215}
12/29/2021 02:11:47 - INFO - codeparrot_training - Step 46543: {'lr': 5.915530562825622e-06, 'samples': 23830528, 'steps': 46543, 'batch_loss/train': 0.7430653502233326}
12/29/2021 02:11:58 - INFO - codeparrot_training - Step 46544: {'lr': 5.9121222598200745e-06, 'samples': 23831040, 'steps': 46544, 'batch_loss/train': 0.7583739114925265}
12/29/2021 02:12:09 - INFO - codeparrot_training - Step 46545: {'lr': 5.908714927215786e-06, 'samples': 23831552, 'steps': 46545, 'batch_loss/train': 0.8097171150147915}
12/29/2021 02:12:17 - INFO - codeparrot_training - Dataset epoch: 2
12/29/2021 02:12:23 - INFO - codeparrot_training - Step 46546: {'lr': 5.905308565026274e-06, 'samples': 23832064, 'steps': 46546, 'batch_loss/train': 0.6446466716006398}
12/29/2021 02:12:34 - INFO - codeparrot_training - Step 46547: {'lr': 5.901903173265083e-06, 'samples': 23832576, 'steps': 46547, 'batch_loss/train': 0.7015944602899253}
12/29/2021 02:12:44 - INFO - codeparrot_training - Step 46548: {'lr': 5.898498751945758e-06, 'samples': 23833088, 'steps': 46548, 'batch_loss/train': 0.6435019322670996}
12/29/2021 02:12:55 - INFO - codeparrot_training - Step 46549: {'lr': 5.895095301081815e-06, 'samples': 23833600, 'steps': 46549, 'batch_loss/train': 0.7310878769494593}
12/29/2021 02:13:09 - INFO - codeparrot_training - Step 46550: {'lr': 5.891692820686828e-06, 'samples': 23834112, 'steps': 46550, 'batch_loss/train': 0.7807645443826914}
12/29/2021 02:13:19 - INFO - codeparrot_training - Step 46551: {'lr': 5.8882913107742856e-06, 'samples': 23834624, 'steps': 46551, 'batch_loss/train': 0.6658499822951853}
12/29/2021 02:13:30 - INFO - codeparrot_training - Step 46552: {'lr': 5.8848907713577595e-06, 'samples': 23835136, 'steps': 46552, 'batch_loss/train': 0.6555053456395399}
12/29/2021 02:13:42 - INFO - codeparrot_training - Step 46553: {'lr': 5.881491202450712e-06, 'samples': 23835648, 'steps': 46553, 'batch_loss/train': 0.643343451898545}
12/29/2021 02:13:53 - INFO - codeparrot_training - Step 46554: {'lr': 5.878092604066631e-06, 'samples': 23836160, 'steps': 46554, 'batch_loss/train': 0.6851258943788707}
12/29/2021 02:14:03 - INFO - codeparrot_training - Step 46555: {'lr': 5.874694976219147e-06, 'samples': 23836672, 'steps': 46555, 'batch_loss/train': 0.5125106011983007}
12/29/2021 02:14:16 - INFO - codeparrot_training - Step 46556: {'lr': 5.871298318921664e-06, 'samples': 23837184, 'steps': 46556, 'batch_loss/train': 0.7327632217202336}
12/29/2021 02:14:26 - INFO - codeparrot_training - Step 46557: {'lr': 5.8679026321877275e-06, 'samples': 23837696, 'steps': 46557, 'batch_loss/train': 0.6759601363446563}
12/29/2021 02:14:37 - INFO - codeparrot_training - Step 46558: {'lr': 5.864507916030826e-06, 'samples': 23838208, 'steps': 46558, 'batch_loss/train': 0.6564153367653489}
12/29/2021 02:14:51 - INFO - codeparrot_training - Step 46559: {'lr': 5.8611141704644485e-06, 'samples': 23838720, 'steps': 46559, 'batch_loss/train': 0.7766203552018851}
12/29/2021 02:15:01 - INFO - codeparrot_training - Step 46560: {'lr': 5.857721395502114e-06, 'samples': 23839232, 'steps': 46560, 'batch_loss/train': 0.6951513696694747}
12/29/2021 02:15:12 - INFO - codeparrot_training - Step 46561: {'lr': 5.854329591157309e-06, 'samples': 23839744, 'steps': 46561, 'batch_loss/train': 0.5892315013334155}
12/29/2021 02:15:23 - INFO - codeparrot_training - Step 46562: {'lr': 5.850938757443441e-06, 'samples': 23840256, 'steps': 46562, 'batch_loss/train': 0.7197529512923211}
12/29/2021 02:15:35 - INFO - codeparrot_training - Step 46563: {'lr': 5.847548894374111e-06, 'samples': 23840768, 'steps': 46563, 'batch_loss/train': 0.7384588662534952}
12/29/2021 02:15:46 - INFO - codeparrot_training - Step 46564: {'lr': 5.84416000196275e-06, 'samples': 23841280, 'steps': 46564, 'batch_loss/train': 0.7377502433955669}
12/29/2021 02:15:56 - INFO - codeparrot_training - Step 46565: {'lr': 5.840772080222767e-06, 'samples': 23841792, 'steps': 46565, 'batch_loss/train': 0.589922450017184}
12/29/2021 02:16:10 - INFO - codeparrot_training - Step 46566: {'lr': 5.837385129167705e-06, 'samples': 23842304, 'steps': 46566, 'batch_loss/train': 0.7954148184508085}
12/29/2021 02:16:21 - INFO - codeparrot_training - Step 46567: {'lr': 5.833999148811026e-06, 'samples': 23842816, 'steps': 46567, 'batch_loss/train': 0.6840092416387051}
12/29/2021 02:16:31 - INFO - codeparrot_training - Step 46568: {'lr': 5.830614139166163e-06, 'samples': 23843328, 'steps': 46568, 'batch_loss/train': 0.5792427103151567}
12/29/2021 02:16:44 - INFO - codeparrot_training - Step 46569: {'lr': 5.8272301002465774e-06, 'samples': 23843840, 'steps': 46569, 'batch_loss/train': 1.136985257267952}
12/29/2021 02:16:54 - INFO - codeparrot_training - Step 46570: {'lr': 5.823847032065732e-06, 'samples': 23844352, 'steps': 46570, 'batch_loss/train': 0.8020467665046453}
12/29/2021 02:17:05 - INFO - codeparrot_training - Step 46571: {'lr': 5.82046493463706e-06, 'samples': 23844864, 'steps': 46571, 'batch_loss/train': 0.8135793614201248}
12/29/2021 02:17:15 - INFO - codeparrot_training - Step 46572: {'lr': 5.817083807974022e-06, 'samples': 23845376, 'steps': 46572, 'batch_loss/train': 0.7617074807640165}
12/29/2021 02:17:28 - INFO - codeparrot_training - Step 46573: {'lr': 5.81370365209008e-06, 'samples': 23845888, 'steps': 46573, 'batch_loss/train': 0.6920874507632107}
12/29/2021 02:17:38 - INFO - codeparrot_training - Step 46574: {'lr': 5.81032446699864e-06, 'samples': 23846400, 'steps': 46574, 'batch_loss/train': 0.7071769190952182}
12/29/2021 02:17:49 - INFO - codeparrot_training - Step 46575: {'lr': 5.8069462527131356e-06, 'samples': 23846912, 'steps': 46575, 'batch_loss/train': 0.7547582592815161}
12/29/2021 02:18:03 - INFO - codeparrot_training - Step 46576: {'lr': 5.803569009247028e-06, 'samples': 23847424, 'steps': 46576, 'batch_loss/train': 0.691774868639186}
12/29/2021 02:18:14 - INFO - codeparrot_training - Step 46577: {'lr': 5.800192736613724e-06, 'samples': 23847936, 'steps': 46577, 'batch_loss/train': 0.7323031427804381}
12/29/2021 02:18:24 - INFO - codeparrot_training - Step 46578: {'lr': 5.796817434826656e-06, 'samples': 23848448, 'steps': 46578, 'batch_loss/train': 0.7620372767560184}
12/29/2021 02:18:36 - INFO - codeparrot_training - Step 46579: {'lr': 5.793443103899204e-06, 'samples': 23848960, 'steps': 46579, 'batch_loss/train': 0.6242999371606857}
12/29/2021 02:18:47 - INFO - codeparrot_training - Step 46580: {'lr': 5.7900697438448e-06, 'samples': 23849472, 'steps': 46580, 'batch_loss/train': 0.6736194572877139}
12/29/2021 02:18:57 - INFO - codeparrot_training - Step 46581: {'lr': 5.786697354676934e-06, 'samples': 23849984, 'steps': 46581, 'batch_loss/train': 0.6833337359130383}
12/29/2021 02:19:08 - INFO - codeparrot_training - Step 46582: {'lr': 5.783325936408901e-06, 'samples': 23850496, 'steps': 46582, 'batch_loss/train': 0.7497129030525684}
12/29/2021 02:19:20 - INFO - codeparrot_training - Step 46583: {'lr': 5.779955489054161e-06, 'samples': 23851008, 'steps': 46583, 'batch_loss/train': 0.6600721096619964}
12/29/2021 02:19:31 - INFO - codeparrot_training - Step 46584: {'lr': 5.776586012626123e-06, 'samples': 23851520, 'steps': 46584, 'batch_loss/train': 0.6950307926163077}
12/29/2021 02:19:41 - INFO - codeparrot_training - Step 46585: {'lr': 5.773217507138134e-06, 'samples': 23852032, 'steps': 46585, 'batch_loss/train': 0.6698815263807774}
12/29/2021 02:19:53 - INFO - codeparrot_training - Step 46586: {'lr': 5.76984997260363e-06, 'samples': 23852544, 'steps': 46586, 'batch_loss/train': 0.6991239651106298}
12/29/2021 02:20:04 - INFO - codeparrot_training - Step 46587: {'lr': 5.766483409035989e-06, 'samples': 23853056, 'steps': 46587, 'batch_loss/train': 0.7171855855267495}
12/29/2021 02:20:15 - INFO - codeparrot_training - Step 46588: {'lr': 5.76311781644856e-06, 'samples': 23853568, 'steps': 46588, 'batch_loss/train': 0.7140407087281346}
12/29/2021 02:20:29 - INFO - codeparrot_training - Step 46589: {'lr': 5.759753194854778e-06, 'samples': 23854080, 'steps': 46589, 'batch_loss/train': 0.7166671622544527}
12/29/2021 02:20:39 - INFO - codeparrot_training - Step 46590: {'lr': 5.7563895442680205e-06, 'samples': 23854592, 'steps': 46590, 'batch_loss/train': 0.757999227847904}
12/29/2021 02:20:50 - INFO - codeparrot_training - Step 46591: {'lr': 5.753026864701583e-06, 'samples': 23855104, 'steps': 46591, 'batch_loss/train': 0.7060967488214374}
12/29/2021 02:21:02 - INFO - codeparrot_training - Step 46592: {'lr': 5.749665156168926e-06, 'samples': 23855616, 'steps': 46592, 'batch_loss/train': 0.7092402022099122}
12/29/2021 02:21:13 - INFO - codeparrot_training - Step 46593: {'lr': 5.7463044186833736e-06, 'samples': 23856128, 'steps': 46593, 'batch_loss/train': 0.7523427354171872}
12/29/2021 02:21:24 - INFO - codeparrot_training - Step 46594: {'lr': 5.742944652258303e-06, 'samples': 23856640, 'steps': 46594, 'batch_loss/train': 0.6985855139791965}
12/29/2021 02:21:34 - INFO - codeparrot_training - Step 46595: {'lr': 5.739585856906981e-06, 'samples': 23857152, 'steps': 46595, 'batch_loss/train': 0.700181461405009}
12/29/2021 02:21:48 - INFO - codeparrot_training - Step 46596: {'lr': 5.736228032642926e-06, 'samples': 23857664, 'steps': 46596, 'batch_loss/train': 0.6996875312179327}
12/29/2021 02:21:59 - INFO - codeparrot_training - Step 46597: {'lr': 5.732871179479349e-06, 'samples': 23858176, 'steps': 46597, 'batch_loss/train': 0.6629366399720311}
12/29/2021 02:22:09 - INFO - codeparrot_training - Step 46598: {'lr': 5.729515297429655e-06, 'samples': 23858688, 'steps': 46598, 'batch_loss/train': 0.6807778589427471}
12/29/2021 02:22:22 - INFO - codeparrot_training - Step 46599: {'lr': 5.726160386507167e-06, 'samples': 23859200, 'steps': 46599, 'batch_loss/train': 0.6438652561046183}
12/29/2021 02:22:32 - INFO - codeparrot_training - Step 46600: {'lr': 5.722806446725238e-06, 'samples': 23859712, 'steps': 46600, 'batch_loss/train': 1.506801032461226}
12/29/2021 02:22:43 - INFO - codeparrot_training - Step 46601: {'lr': 5.719453478097214e-06, 'samples': 23860224, 'steps': 46601, 'batch_loss/train': 0.7056369869969785}
12/29/2021 02:22:55 - INFO - codeparrot_training - Step 46602: {'lr': 5.716101480636393e-06, 'samples': 23860736, 'steps': 46602, 'batch_loss/train': 0.7414629748091102}
12/29/2021 02:23:06 - INFO - codeparrot_training - Step 46603: {'lr': 5.712750454356097e-06, 'samples': 23861248, 'steps': 46603, 'batch_loss/train': 0.7919963006861508}
12/29/2021 02:23:17 - INFO - codeparrot_training - Step 46604: {'lr': 5.709400399269704e-06, 'samples': 23861760, 'steps': 46604, 'batch_loss/train': 0.7266445837449282}
12/29/2021 02:23:27 - INFO - codeparrot_training - Step 46605: {'lr': 5.706051315390453e-06, 'samples': 23862272, 'steps': 46605, 'batch_loss/train': 0.6976522243348882}
12/29/2021 02:23:41 - INFO - codeparrot_training - Step 46606: {'lr': 5.702703202731696e-06, 'samples': 23862784, 'steps': 46606, 'batch_loss/train': 0.7655040547251701}
12/29/2021 02:23:52 - INFO - codeparrot_training - Step 46607: {'lr': 5.6993560613068095e-06, 'samples': 23863296, 'steps': 46607, 'batch_loss/train': 0.680612213909626}
12/29/2021 02:24:03 - INFO - codeparrot_training - Step 46608: {'lr': 5.696009891128978e-06, 'samples': 23863808, 'steps': 46608, 'batch_loss/train': 0.6611613219138235}
12/29/2021 02:24:15 - INFO - codeparrot_training - Step 46609: {'lr': 5.6926646922115795e-06, 'samples': 23864320, 'steps': 46609, 'batch_loss/train': 0.6983957998454571}
12/29/2021 02:24:25 - INFO - codeparrot_training - Step 46610: {'lr': 5.68932046456791e-06, 'samples': 23864832, 'steps': 46610, 'batch_loss/train': 0.6788814230822027}
12/29/2021 02:24:36 - INFO - codeparrot_training - Step 46611: {'lr': 5.6859772082112625e-06, 'samples': 23865344, 'steps': 46611, 'batch_loss/train': 0.691991753410548}
12/29/2021 02:24:48 - INFO - codeparrot_training - Step 46612: {'lr': 5.6826349231549056e-06, 'samples': 23865856, 'steps': 46612, 'batch_loss/train': 0.632502106949687}
12/29/2021 02:24:59 - INFO - codeparrot_training - Step 46613: {'lr': 5.679293609412134e-06, 'samples': 23866368, 'steps': 46613, 'batch_loss/train': 0.6700055487453938}
12/29/2021 02:25:09 - INFO - codeparrot_training - Step 46614: {'lr': 5.675953266996242e-06, 'samples': 23866880, 'steps': 46614, 'batch_loss/train': 0.6784133762121201}
12/29/2021 02:25:20 - INFO - codeparrot_training - Step 46615: {'lr': 5.6726138959205256e-06, 'samples': 23867392, 'steps': 46615, 'batch_loss/train': 0.6558375880122185}
12/29/2021 02:25:34 - INFO - codeparrot_training - Step 46616: {'lr': 5.669275496198251e-06, 'samples': 23867904, 'steps': 46616, 'batch_loss/train': 0.7658596867695451}
12/29/2021 02:25:44 - INFO - codeparrot_training - Step 46617: {'lr': 5.66593806784263e-06, 'samples': 23868416, 'steps': 46617, 'batch_loss/train': 0.8230203147977591}
12/29/2021 02:25:55 - INFO - codeparrot_training - Step 46618: {'lr': 5.662601610867013e-06, 'samples': 23868928, 'steps': 46618, 'batch_loss/train': 0.8304295567795634}
12/29/2021 02:26:07 - INFO - codeparrot_training - Step 46619: {'lr': 5.65926612528464e-06, 'samples': 23869440, 'steps': 46619, 'batch_loss/train': 0.7730727707967162}
12/29/2021 02:26:18 - INFO - codeparrot_training - Step 46620: {'lr': 5.655931611108778e-06, 'samples': 23869952, 'steps': 46620, 'batch_loss/train': 0.7439729073084891}
12/29/2021 02:26:29 - INFO - codeparrot_training - Step 46621: {'lr': 5.65259806835261e-06, 'samples': 23870464, 'steps': 46621, 'batch_loss/train': 0.783963629975915}
12/29/2021 02:26:43 - INFO - codeparrot_training - Step 46622: {'lr': 5.649265497029516e-06, 'samples': 23870976, 'steps': 46622, 'batch_loss/train': 0.6539810246322304}
12/29/2021 02:26:54 - INFO - codeparrot_training - Step 46623: {'lr': 5.64593389715265e-06, 'samples': 23871488, 'steps': 46623, 'batch_loss/train': 0.7822325127199292}
12/29/2021 02:27:04 - INFO - codeparrot_training - Step 46624: {'lr': 5.642603268735308e-06, 'samples': 23872000, 'steps': 46624, 'batch_loss/train': 0.7607897100970149}
12/29/2021 02:27:15 - INFO - codeparrot_training - Step 46625: {'lr': 5.639273611790674e-06, 'samples': 23872512, 'steps': 46625, 'batch_loss/train': 0.7984127728268504}
12/29/2021 02:27:27 - INFO - codeparrot_training - Step 46626: {'lr': 5.635944926332043e-06, 'samples': 23873024, 'steps': 46626, 'batch_loss/train': 1.428286393173039}
12/29/2021 02:27:38 - INFO - codeparrot_training - Step 46627: {'lr': 5.632617212372626e-06, 'samples': 23873536, 'steps': 46627, 'batch_loss/train': 0.7054730127565563}
12/29/2021 02:27:49 - INFO - codeparrot_training - Step 46628: {'lr': 5.629290469925636e-06, 'samples': 23874048, 'steps': 46628, 'batch_loss/train': 0.7231184476986527}
12/29/2021 02:28:01 - INFO - codeparrot_training - Step 46629: {'lr': 5.625964699004338e-06, 'samples': 23874560, 'steps': 46629, 'batch_loss/train': 0.7050271527841687}
12/29/2021 02:28:11 - INFO - codeparrot_training - Step 46630: {'lr': 5.622639899621917e-06, 'samples': 23875072, 'steps': 46630, 'batch_loss/train': 0.8143537291907705}
12/29/2021 02:28:22 - INFO - codeparrot_training - Step 46631: {'lr': 5.619316071791641e-06, 'samples': 23875584, 'steps': 46631, 'batch_loss/train': 0.7377761102980003}
12/29/2021 02:28:33 - INFO - codeparrot_training - Step 46632: {'lr': 5.615993215526638e-06, 'samples': 23876096, 'steps': 46632, 'batch_loss/train': 0.7679095058701932}
12/29/2021 02:28:45 - INFO - codeparrot_training - Step 46633: {'lr': 5.61267133084023e-06, 'samples': 23876608, 'steps': 46633, 'batch_loss/train': 0.814855198841542}
12/29/2021 02:28:55 - INFO - codeparrot_training - Step 46634: {'lr': 5.609350417745518e-06, 'samples': 23877120, 'steps': 46634, 'batch_loss/train': 0.7745488351210952}
12/29/2021 02:29:06 - INFO - codeparrot_training - Step 46635: {'lr': 5.606030476255769e-06, 'samples': 23877632, 'steps': 46635, 'batch_loss/train': 0.7058696895837784}
12/29/2021 02:29:20 - INFO - codeparrot_training - Step 46636: {'lr': 5.602711506384139e-06, 'samples': 23878144, 'steps': 46636, 'batch_loss/train': 0.6542865601368248}
12/29/2021 02:29:31 - INFO - codeparrot_training - Step 46637: {'lr': 5.599393508143868e-06, 'samples': 23878656, 'steps': 46637, 'batch_loss/train': 0.6308319107629359}
12/29/2021 02:29:41 - INFO - codeparrot_training - Step 46638: {'lr': 5.59607648154814e-06, 'samples': 23879168, 'steps': 46638, 'batch_loss/train': 0.6730236241128296}
12/29/2021 02:29:54 - INFO - codeparrot_training - Step 46639: {'lr': 5.5927604266100825e-06, 'samples': 23879680, 'steps': 46639, 'batch_loss/train': 0.6961830812506378}
12/29/2021 02:30:05 - INFO - codeparrot_training - Step 46640: {'lr': 5.589445343342964e-06, 'samples': 23880192, 'steps': 46640, 'batch_loss/train': 0.6699628615169786}
12/29/2021 02:30:15 - INFO - codeparrot_training - Step 46641: {'lr': 5.5861312317599115e-06, 'samples': 23880704, 'steps': 46641, 'batch_loss/train': 0.7911574076861143}
12/29/2021 02:30:29 - INFO - codeparrot_training - Step 46642: {'lr': 5.582818091874137e-06, 'samples': 23881216, 'steps': 46642, 'batch_loss/train': 0.6752426992170513}
12/29/2021 02:30:40 - INFO - codeparrot_training - Step 46643: {'lr': 5.579505923698741e-06, 'samples': 23881728, 'steps': 46643, 'batch_loss/train': 0.8273814241401851}
12/29/2021 02:30:50 - INFO - codeparrot_training - Step 46644: {'lr': 5.576194727246964e-06, 'samples': 23882240, 'steps': 46644, 'batch_loss/train': 0.6360481394222006}
12/29/2021 02:31:01 - INFO - codeparrot_training - Step 46645: {'lr': 5.572884502531989e-06, 'samples': 23882752, 'steps': 46645, 'batch_loss/train': 0.8045755792409182}
12/29/2021 02:31:13 - INFO - codeparrot_training - Step 46646: {'lr': 5.569575249566888e-06, 'samples': 23883264, 'steps': 46646, 'batch_loss/train': 0.7351511525921524}
12/29/2021 02:31:24 - INFO - codeparrot_training - Step 46647: {'lr': 5.566266968364847e-06, 'samples': 23883776, 'steps': 46647, 'batch_loss/train': 0.6843841903610155}
12/29/2021 02:31:34 - INFO - codeparrot_training - Step 46648: {'lr': 5.5629596589390765e-06, 'samples': 23884288, 'steps': 46648, 'batch_loss/train': 0.6895507059525698}
12/29/2021 02:31:46 - INFO - codeparrot_training - Step 46649: {'lr': 5.559653321302649e-06, 'samples': 23884800, 'steps': 46649, 'batch_loss/train': 0.694293858949095}
12/29/2021 02:31:57 - INFO - codeparrot_training - Step 46650: {'lr': 5.556347955468777e-06, 'samples': 23885312, 'steps': 46650, 'batch_loss/train': 0.5905024642124772}
12/29/2021 02:32:08 - INFO - codeparrot_training - Step 46651: {'lr': 5.5530435614505326e-06, 'samples': 23885824, 'steps': 46651, 'batch_loss/train': 0.7160331532359123}
12/29/2021 02:32:21 - INFO - codeparrot_training - Step 46652: {'lr': 5.5497401392611e-06, 'samples': 23886336, 'steps': 46652, 'batch_loss/train': 0.6911960560828447}
12/29/2021 02:32:32 - INFO - codeparrot_training - Step 46653: {'lr': 5.546437688913608e-06, 'samples': 23886848, 'steps': 46653, 'batch_loss/train': 0.8207480348646641}
12/29/2021 02:32:43 - INFO - codeparrot_training - Step 46654: {'lr': 5.543136210421157e-06, 'samples': 23887360, 'steps': 46654, 'batch_loss/train': 0.673935842292849}
12/29/2021 02:32:53 - INFO - codeparrot_training - Step 46655: {'lr': 5.5398357037969025e-06, 'samples': 23887872, 'steps': 46655, 'batch_loss/train': 0.7666379939764738}
12/29/2021 02:33:05 - INFO - codeparrot_training - Step 46656: {'lr': 5.536536169053974e-06, 'samples': 23888384, 'steps': 46656, 'batch_loss/train': 0.7092340476810932}
12/29/2021 02:33:16 - INFO - codeparrot_training - Step 46657: {'lr': 5.533237606205472e-06, 'samples': 23888896, 'steps': 46657, 'batch_loss/train': 0.63266067719087}
12/29/2021 02:33:27 - INFO - codeparrot_training - Step 46658: {'lr': 5.529940015264468e-06, 'samples': 23889408, 'steps': 46658, 'batch_loss/train': 0.7322727255523205}
12/29/2021 02:33:39 - INFO - codeparrot_training - Step 46659: {'lr': 5.526643396244174e-06, 'samples': 23889920, 'steps': 46659, 'batch_loss/train': 0.5422572773823049}
12/29/2021 02:33:50 - INFO - codeparrot_training - Step 46660: {'lr': 5.523347749157609e-06, 'samples': 23890432, 'steps': 46660, 'batch_loss/train': 0.7511174771934748}
12/29/2021 02:34:00 - INFO - codeparrot_training - Step 46661: {'lr': 5.520053074017872e-06, 'samples': 23890944, 'steps': 46661, 'batch_loss/train': 0.7360445074737072}
12/29/2021 02:34:12 - INFO - codeparrot_training - Step 46662: {'lr': 5.5167593708381205e-06, 'samples': 23891456, 'steps': 46662, 'batch_loss/train': 0.7468252656981349}
12/29/2021 02:34:23 - INFO - codeparrot_training - Step 46663: {'lr': 5.513466639631426e-06, 'samples': 23891968, 'steps': 46663, 'batch_loss/train': 0.7265853271819651}
12/29/2021 02:34:34 - INFO - codeparrot_training - Step 46664: {'lr': 5.510174880410862e-06, 'samples': 23892480, 'steps': 46664, 'batch_loss/train': 0.8060842426493764}
12/29/2021 02:34:47 - INFO - codeparrot_training - Step 46665: {'lr': 5.506884093189529e-06, 'samples': 23892992, 'steps': 46665, 'batch_loss/train': 0.8055428927764297}
12/29/2021 02:34:58 - INFO - codeparrot_training - Step 46666: {'lr': 5.503594277980528e-06, 'samples': 23893504, 'steps': 46666, 'batch_loss/train': 0.6562614301219583}
12/29/2021 02:35:09 - INFO - codeparrot_training - Step 46667: {'lr': 5.5003054347968764e-06, 'samples': 23894016, 'steps': 46667, 'batch_loss/train': 0.7146757389418781}
12/29/2021 02:35:19 - INFO - codeparrot_training - Step 46668: {'lr': 5.4970175636517575e-06, 'samples': 23894528, 'steps': 46668, 'batch_loss/train': 0.6947829034179449}
12/29/2021 02:35:31 - INFO - codeparrot_training - Step 46669: {'lr': 5.493730664558106e-06, 'samples': 23895040, 'steps': 46669, 'batch_loss/train': 0.7092430735938251}
12/29/2021 02:35:42 - INFO - codeparrot_training - Step 46670: {'lr': 5.490444737529077e-06, 'samples': 23895552, 'steps': 46670, 'batch_loss/train': 0.7342807874083519}
12/29/2021 02:35:53 - INFO - codeparrot_training - Step 46671: {'lr': 5.487159782577744e-06, 'samples': 23896064, 'steps': 46671, 'batch_loss/train': 0.7269367570988834}
12/29/2021 02:36:05 - INFO - codeparrot_training - Step 46672: {'lr': 5.483875799717125e-06, 'samples': 23896576, 'steps': 46672, 'batch_loss/train': 0.6952040363103151}
12/29/2021 02:36:16 - INFO - codeparrot_training - Step 46673: {'lr': 5.4805927889602645e-06, 'samples': 23897088, 'steps': 46673, 'batch_loss/train': 0.7205853527411819}
12/29/2021 02:36:26 - INFO - codeparrot_training - Step 46674: {'lr': 5.4773107503202635e-06, 'samples': 23897600, 'steps': 46674, 'batch_loss/train': 0.8082593586295843}
12/29/2021 02:36:41 - INFO - codeparrot_training - Step 46675: {'lr': 5.4740296838101665e-06, 'samples': 23898112, 'steps': 46675, 'batch_loss/train': 0.8222400061786175}
12/29/2021 02:36:52 - INFO - codeparrot_training - Step 46676: {'lr': 5.470749589442936e-06, 'samples': 23898624, 'steps': 46676, 'batch_loss/train': 0.7954597459174693}
12/29/2021 02:37:02 - INFO - codeparrot_training - Step 46677: {'lr': 5.467470467231755e-06, 'samples': 23899136, 'steps': 46677, 'batch_loss/train': 1.1168636130169034}
12/29/2021 02:37:13 - INFO - codeparrot_training - Step 46678: {'lr': 5.464192317189531e-06, 'samples': 23899648, 'steps': 46678, 'batch_loss/train': 0.8610239154077135}
12/29/2021 02:37:25 - INFO - codeparrot_training - Step 46679: {'lr': 5.460915139329336e-06, 'samples': 23900160, 'steps': 46679, 'batch_loss/train': 0.7669082549400628}
12/29/2021 02:37:36 - INFO - codeparrot_training - Step 46680: {'lr': 5.457638933664244e-06, 'samples': 23900672, 'steps': 46680, 'batch_loss/train': 0.7073692139238119}
12/29/2021 02:37:46 - INFO - codeparrot_training - Step 46681: {'lr': 5.4543637002072155e-06, 'samples': 23901184, 'steps': 46681, 'batch_loss/train': 0.6746920151636004}
12/29/2021 02:37:59 - INFO - codeparrot_training - Step 46682: {'lr': 5.451089438971296e-06, 'samples': 23901696, 'steps': 46682, 'batch_loss/train': 0.7142969832057133}
12/29/2021 02:38:10 - INFO - codeparrot_training - Step 46683: {'lr': 5.447816149969559e-06, 'samples': 23902208, 'steps': 46683, 'batch_loss/train': 0.6504907542839646}
12/29/2021 02:38:20 - INFO - codeparrot_training - Step 46684: {'lr': 5.44454383321491e-06, 'samples': 23902720, 'steps': 46684, 'batch_loss/train': 0.5516189441550523}
12/29/2021 02:38:31 - INFO - codeparrot_training - Step 46685: {'lr': 5.44127248872045e-06, 'samples': 23903232, 'steps': 46685, 'batch_loss/train': 0.7051782049238682}
12/29/2021 02:38:43 - INFO - codeparrot_training - Step 46686: {'lr': 5.438002116499113e-06, 'samples': 23903744, 'steps': 46686, 'batch_loss/train': 0.723902135156095}
12/29/2021 02:38:54 - INFO - codeparrot_training - Step 46687: {'lr': 5.434732716563945e-06, 'samples': 23904256, 'steps': 46687, 'batch_loss/train': 0.7906933580525219}
12/29/2021 02:39:05 - INFO - codeparrot_training - Step 46688: {'lr': 5.431464288927962e-06, 'samples': 23904768, 'steps': 46688, 'batch_loss/train': 0.7642978671938181}
12/29/2021 02:39:17 - INFO - codeparrot_training - Step 46689: {'lr': 5.428196833604127e-06, 'samples': 23905280, 'steps': 46689, 'batch_loss/train': 0.6537321922369301}
12/29/2021 02:39:27 - INFO - codeparrot_training - Step 46690: {'lr': 5.4249303506054e-06, 'samples': 23905792, 'steps': 46690, 'batch_loss/train': 0.6820286328438669}
12/29/2021 02:39:38 - INFO - codeparrot_training - Step 46691: {'lr': 5.421664839944829e-06, 'samples': 23906304, 'steps': 46691, 'batch_loss/train': 0.6826387075707316}
12/29/2021 02:39:51 - INFO - codeparrot_training - Step 46692: {'lr': 5.418400301635374e-06, 'samples': 23906816, 'steps': 46692, 'batch_loss/train': 0.9192230235785246}
12/29/2021 02:40:01 - INFO - codeparrot_training - Step 46693: {'lr': 5.4151367356899964e-06, 'samples': 23907328, 'steps': 46693, 'batch_loss/train': 0.6249417765066028}
12/29/2021 02:40:12 - INFO - codeparrot_training - Step 46694: {'lr': 5.411874142121687e-06, 'samples': 23907840, 'steps': 46694, 'batch_loss/train': 0.7578366072848439}
12/29/2021 02:40:24 - INFO - codeparrot_training - Step 46695: {'lr': 5.4086125209434065e-06, 'samples': 23908352, 'steps': 46695, 'batch_loss/train': 0.7459809682331979}
12/29/2021 02:40:35 - INFO - codeparrot_training - Step 46696: {'lr':
5.405351872168118e-06, 'samples': 23908864, 'steps': 46696, 'batch_loss/train': 0.6647077505476773} 12/29/2021 02:40:45 - INFO - codeparrot_training - Step 46697: {'lr': 5.402092195808839e-06, 'samples': 23909376, 'steps': 46697, 'batch_loss/train': 0.5510806984966621} 12/29/2021 02:40:56 - INFO - codeparrot_training - Step 46698: {'lr': 5.398833491878446e-06, 'samples': 23909888, 'steps': 46698, 'batch_loss/train': 0.6862701531499624} 12/29/2021 02:41:09 - INFO - codeparrot_training - Step 46699: {'lr': 5.395575760389931e-06, 'samples': 23910400, 'steps': 46699, 'batch_loss/train': 0.813508449587971} 12/29/2021 02:41:19 - INFO - codeparrot_training - Step 46700: {'lr': 5.392319001356283e-06, 'samples': 23910912, 'steps': 46700, 'batch_loss/train': 0.70306331734173} 12/29/2021 02:41:30 - INFO - codeparrot_training - Step 46701: {'lr': 5.3890632147903795e-06, 'samples': 23911424, 'steps': 46701, 'batch_loss/train': 0.8062673814129084} 12/29/2021 02:41:42 - INFO - codeparrot_training - Step 46702: {'lr': 5.385808400705211e-06, 'samples': 23911936, 'steps': 46702, 'batch_loss/train': 0.8289758050232194} 12/29/2021 02:41:53 - INFO - codeparrot_training - Step 46703: {'lr': 5.382554559113712e-06, 'samples': 23912448, 'steps': 46703, 'batch_loss/train': 0.6265262723900378} 12/29/2021 02:42:03 - INFO - codeparrot_training - Step 46704: {'lr': 5.379301690028815e-06, 'samples': 23912960, 'steps': 46704, 'batch_loss/train': 0.6946486621163785} 12/29/2021 02:42:15 - INFO - codeparrot_training - Step 46705: {'lr': 5.376049793463428e-06, 'samples': 23913472, 'steps': 46705, 'batch_loss/train': 0.7288937270641327} 12/29/2021 02:42:26 - INFO - codeparrot_training - Step 46706: {'lr': 5.372798869430512e-06, 'samples': 23913984, 'steps': 46706, 'batch_loss/train': 0.7129444014281034} 12/29/2021 02:42:37 - INFO - codeparrot_training - Step 46707: {'lr': 5.369548917943001e-06, 'samples': 23914496, 'steps': 46707, 'batch_loss/train': 0.5858921430190094} 12/29/2021 02:42:47 - INFO - 
codeparrot_training - Step 46708: {'lr': 5.366299939013775e-06, 'samples': 23915008, 'steps': 46708, 'batch_loss/train': 0.7407867155270651} 12/29/2021 02:43:00 - INFO - codeparrot_training - Step 46709: {'lr': 5.363051932655793e-06, 'samples': 23915520, 'steps': 46709, 'batch_loss/train': 0.7330227429047227} 12/29/2021 02:43:10 - INFO - codeparrot_training - Step 46710: {'lr': 5.359804898881881e-06, 'samples': 23916032, 'steps': 46710, 'batch_loss/train': 0.7228787420317531} 12/29/2021 02:43:21 - INFO - codeparrot_training - Step 46711: {'lr': 5.356558837705056e-06, 'samples': 23916544, 'steps': 46711, 'batch_loss/train': 0.8102666204795241} 12/29/2021 02:43:34 - INFO - codeparrot_training - Step 46712: {'lr': 5.353313749138194e-06, 'samples': 23917056, 'steps': 46712, 'batch_loss/train': 0.7332541118375957} 12/29/2021 02:43:45 - INFO - codeparrot_training - Step 46713: {'lr': 5.350069633194149e-06, 'samples': 23917568, 'steps': 46713, 'batch_loss/train': 0.5952826165594161} 12/29/2021 02:43:55 - INFO - codeparrot_training - Step 46714: {'lr': 5.346826489885881e-06, 'samples': 23918080, 'steps': 46714, 'batch_loss/train': 0.6699930997565389} 12/29/2021 02:44:08 - INFO - codeparrot_training - Step 46715: {'lr': 5.343584319226241e-06, 'samples': 23918592, 'steps': 46715, 'batch_loss/train': 0.760097160586156} 12/29/2021 02:44:19 - INFO - codeparrot_training - Step 46716: {'lr': 5.340343121228136e-06, 'samples': 23919104, 'steps': 46716, 'batch_loss/train': 0.5845822533592582} 12/29/2021 02:44:29 - INFO - codeparrot_training - Step 46717: {'lr': 5.337102895904444e-06, 'samples': 23919616, 'steps': 46717, 'batch_loss/train': 0.7298959731124341} 12/29/2021 02:44:40 - INFO - codeparrot_training - Step 46718: {'lr': 5.333863643268045e-06, 'samples': 23920128, 'steps': 46718, 'batch_loss/train': 0.7215645834803581} 12/29/2021 02:44:52 - INFO - codeparrot_training - Step 46719: {'lr': 5.330625363331815e-06, 'samples': 23920640, 'steps': 46719, 'batch_loss/train': 
0.7213813476264477} 12/29/2021 02:45:03 - INFO - codeparrot_training - Step 46720: {'lr': 5.327388056108634e-06, 'samples': 23921152, 'steps': 46720, 'batch_loss/train': 0.5735275589686353} 12/29/2021 02:45:13 - INFO - codeparrot_training - Step 46721: {'lr': 5.324151721611381e-06, 'samples': 23921664, 'steps': 46721, 'batch_loss/train': 0.7467954587191343} 12/29/2021 02:45:26 - INFO - codeparrot_training - Step 46722: {'lr': 5.320916359852934e-06, 'samples': 23922176, 'steps': 46722, 'batch_loss/train': 0.6528745554387569} 12/29/2021 02:45:37 - INFO - codeparrot_training - Step 46723: {'lr': 5.317681970846117e-06, 'samples': 23922688, 'steps': 46723, 'batch_loss/train': 0.6649202110711485} 12/29/2021 02:45:47 - INFO - codeparrot_training - Step 46724: {'lr': 5.314448554603807e-06, 'samples': 23923200, 'steps': 46724, 'batch_loss/train': 0.6904139583930373} 12/29/2021 02:46:00 - INFO - codeparrot_training - Step 46725: {'lr': 5.3112161111388556e-06, 'samples': 23923712, 'steps': 46725, 'batch_loss/train': 0.6867979457601905} 12/29/2021 02:46:10 - INFO - codeparrot_training - Step 46726: {'lr': 5.307984640464142e-06, 'samples': 23924224, 'steps': 46726, 'batch_loss/train': 0.6495155391748995} 12/29/2021 02:46:21 - INFO - codeparrot_training - Step 46727: {'lr': 5.304754142592461e-06, 'samples': 23924736, 'steps': 46727, 'batch_loss/train': 0.6599216380855069} 12/29/2021 02:46:31 - INFO - codeparrot_training - Step 46728: {'lr': 5.30152461753669e-06, 'samples': 23925248, 'steps': 46728, 'batch_loss/train': 0.6538081028265879} 12/29/2021 02:46:44 - INFO - codeparrot_training - Step 46729: {'lr': 5.298296065309682e-06, 'samples': 23925760, 'steps': 46729, 'batch_loss/train': 0.8189736793283373} 12/29/2021 02:46:54 - INFO - codeparrot_training - Step 46730: {'lr': 5.295068485924232e-06, 'samples': 23926272, 'steps': 46730, 'batch_loss/train': 0.6970726111903787} 12/29/2021 02:47:05 - INFO - codeparrot_training - Step 46731: {'lr': 5.291841879393189e-06, 'samples': 
23926784, 'steps': 46731, 'batch_loss/train': 0.6613683978503104} 12/29/2021 02:47:18 - INFO - codeparrot_training - Step 46732: {'lr': 5.2886162457294055e-06, 'samples': 23927296, 'steps': 46732, 'batch_loss/train': 0.617215350328479} 12/29/2021 02:47:29 - INFO - codeparrot_training - Step 46733: {'lr': 5.285391584945676e-06, 'samples': 23927808, 'steps': 46733, 'batch_loss/train': 0.618141520768404} 12/29/2021 02:47:39 - INFO - codeparrot_training - Step 46734: {'lr': 5.282167897054823e-06, 'samples': 23928320, 'steps': 46734, 'batch_loss/train': 0.6969947358593345} 12/29/2021 02:47:51 - INFO - codeparrot_training - Step 46735: {'lr': 5.278945182069672e-06, 'samples': 23928832, 'steps': 46735, 'batch_loss/train': 0.930425884667784} 12/29/2021 02:48:02 - INFO - codeparrot_training - Step 46736: {'lr': 5.275723440003016e-06, 'samples': 23929344, 'steps': 46736, 'batch_loss/train': 0.6390919201076031} 12/29/2021 02:48:13 - INFO - codeparrot_training - Step 46737: {'lr': 5.272502670867707e-06, 'samples': 23929856, 'steps': 46737, 'batch_loss/train': 0.6580514639499597} 12/29/2021 02:48:23 - INFO - codeparrot_training - Step 46738: {'lr': 5.26928287467654e-06, 'samples': 23930368, 'steps': 46738, 'batch_loss/train': 0.7377248386619613} 12/29/2021 02:48:36 - INFO - codeparrot_training - Step 46739: {'lr': 5.266064051442226e-06, 'samples': 23930880, 'steps': 46739, 'batch_loss/train': 0.7471164495800622} 12/29/2021 02:48:46 - INFO - codeparrot_training - Step 46740: {'lr': 5.262846201177702e-06, 'samples': 23931392, 'steps': 46740, 'batch_loss/train': 0.4950397973298095} 12/29/2021 02:48:57 - INFO - codeparrot_training - Step 46741: {'lr': 5.259629323895676e-06, 'samples': 23931904, 'steps': 46741, 'batch_loss/train': 0.7767962636426091} 12/29/2021 02:49:10 - INFO - codeparrot_training - Step 46742: {'lr': 5.256413419608918e-06, 'samples': 23932416, 'steps': 46742, 'batch_loss/train': 0.6464419178664684} 12/29/2021 02:49:21 - INFO - codeparrot_training - Step 46743: 
{'lr': 5.25319848833028e-06, 'samples': 23932928, 'steps': 46743, 'batch_loss/train': 0.7040240857750177} 12/29/2021 02:49:31 - INFO - codeparrot_training - Step 46744: {'lr': 5.249984530072499e-06, 'samples': 23933440, 'steps': 46744, 'batch_loss/train': 0.6178882874082774} 12/29/2021 02:49:43 - INFO - codeparrot_training - Step 46745: {'lr': 5.2467715448483725e-06, 'samples': 23933952, 'steps': 46745, 'batch_loss/train': 0.7240029606036842} 12/29/2021 02:49:54 - INFO - codeparrot_training - Step 46746: {'lr': 5.2435595326706665e-06, 'samples': 23934464, 'steps': 46746, 'batch_loss/train': 0.6746165929362178} 12/29/2021 02:50:05 - INFO - codeparrot_training - Step 46747: {'lr': 5.2403484935521495e-06, 'samples': 23934976, 'steps': 46747, 'batch_loss/train': 0.7342611937783659} 12/29/2021 02:50:15 - INFO - codeparrot_training - Step 46748: {'lr': 5.237138427505589e-06, 'samples': 23935488, 'steps': 46748, 'batch_loss/train': 0.7432134905830026} 12/29/2021 02:50:27 - INFO - codeparrot_training - Step 46749: {'lr': 5.23392933454378e-06, 'samples': 23936000, 'steps': 46749, 'batch_loss/train': 0.8015781287103891} 12/29/2021 02:50:38 - INFO - codeparrot_training - Step 46750: {'lr': 5.230721214679407e-06, 'samples': 23936512, 'steps': 46750, 'batch_loss/train': 0.6610657079145312} 12/29/2021 02:50:49 - INFO - codeparrot_training - Step 46751: {'lr': 5.2275140679252656e-06, 'samples': 23937024, 'steps': 46751, 'batch_loss/train': 0.6570887635461986} 12/29/2021 02:51:02 - INFO - codeparrot_training - Step 46752: {'lr': 5.224307894294122e-06, 'samples': 23937536, 'steps': 46752, 'batch_loss/train': 0.7370776310563087} 12/29/2021 02:51:12 - INFO - codeparrot_training - Step 46753: {'lr': 5.221102693798718e-06, 'samples': 23938048, 'steps': 46753, 'batch_loss/train': 0.7315803468227386} 12/29/2021 02:51:23 - INFO - codeparrot_training - Step 46754: {'lr': 5.217898466451737e-06, 'samples': 23938560, 'steps': 46754, 'batch_loss/train': 0.7691934499889612} 12/29/2021 02:51:35 
- INFO - codeparrot_training - Step 46755: {'lr': 5.2146952122660296e-06, 'samples': 23939072, 'steps': 46755, 'batch_loss/train': 0.7559731146320701} 12/29/2021 02:51:46 - INFO - codeparrot_training - Step 46756: {'lr': 5.211492931254225e-06, 'samples': 23939584, 'steps': 46756, 'batch_loss/train': 0.6534105360042304} 12/29/2021 02:51:56 - INFO - codeparrot_training - Step 46757: {'lr': 5.20829162342909e-06, 'samples': 23940096, 'steps': 46757, 'batch_loss/train': 0.8031455152668059} 12/29/2021 02:52:09 - INFO - codeparrot_training - Step 46758: {'lr': 5.205091288803365e-06, 'samples': 23940608, 'steps': 46758, 'batch_loss/train': 0.5697978881944437} 12/29/2021 02:52:20 - INFO - codeparrot_training - Step 46759: {'lr': 5.20189192738979e-06, 'samples': 23941120, 'steps': 46759, 'batch_loss/train': 0.8061553332954645} 12/29/2021 02:52:30 - INFO - codeparrot_training - Step 46760: {'lr': 5.1986935392010215e-06, 'samples': 23941632, 'steps': 46760, 'batch_loss/train': 0.6113572251051664} 12/29/2021 02:52:41 - INFO - codeparrot_training - Step 46761: {'lr': 5.195496124249855e-06, 'samples': 23942144, 'steps': 46761, 'batch_loss/train': 0.6755033910740167} 12/29/2021 02:52:53 - INFO - codeparrot_training - Step 46762: {'lr': 5.192299682548946e-06, 'samples': 23942656, 'steps': 46762, 'batch_loss/train': 0.7698778579942882} 12/29/2021 02:53:04 - INFO - codeparrot_training - Step 46763: {'lr': 5.1891042141110065e-06, 'samples': 23943168, 'steps': 46763, 'batch_loss/train': 0.6823506942018867} 12/29/2021 02:53:14 - INFO - codeparrot_training - Step 46764: {'lr': 5.185909718948778e-06, 'samples': 23943680, 'steps': 46764, 'batch_loss/train': 0.6613538681995124} 12/29/2021 02:53:27 - INFO - codeparrot_training - Step 46765: {'lr': 5.1827161970748885e-06, 'samples': 23944192, 'steps': 46765, 'batch_loss/train': 0.5516468775167596} 12/29/2021 02:53:37 - INFO - codeparrot_training - Step 46766: {'lr': 5.1795236485021046e-06, 'samples': 23944704, 'steps': 46766, 
'batch_loss/train': 0.7092262096703053} 12/29/2021 02:53:48 - INFO - codeparrot_training - Step 46767: {'lr': 5.176332073243112e-06, 'samples': 23945216, 'steps': 46767, 'batch_loss/train': 0.7183510067407042} 12/29/2021 02:54:00 - INFO - codeparrot_training - Step 46768: {'lr': 5.173141471310566e-06, 'samples': 23945728, 'steps': 46768, 'batch_loss/train': 0.788479776121676} 12/29/2021 02:54:11 - INFO - codeparrot_training - Step 46769: {'lr': 5.169951842717152e-06, 'samples': 23946240, 'steps': 46769, 'batch_loss/train': 0.7563297059386969} 12/29/2021 02:54:21 - INFO - codeparrot_training - Step 46770: {'lr': 5.166763187475582e-06, 'samples': 23946752, 'steps': 46770, 'batch_loss/train': 0.7074837256222963} 12/29/2021 02:54:32 - INFO - codeparrot_training - Step 46771: {'lr': 5.163575505598511e-06, 'samples': 23947264, 'steps': 46771, 'batch_loss/train': 0.7592099388712086} 12/29/2021 02:54:45 - INFO - codeparrot_training - Step 46772: {'lr': 5.160388797098625e-06, 'samples': 23947776, 'steps': 46772, 'batch_loss/train': 0.6359245562925935} 12/29/2021 02:54:55 - INFO - codeparrot_training - Step 46773: {'lr': 5.15720306198858e-06, 'samples': 23948288, 'steps': 46773, 'batch_loss/train': 0.7168127377517521} 12/29/2021 02:55:06 - INFO - codeparrot_training - Step 46774: {'lr': 5.154018300281033e-06, 'samples': 23948800, 'steps': 46774, 'batch_loss/train': 0.7942498084157705} 12/29/2021 02:55:18 - INFO - codeparrot_training - Step 46775: {'lr': 5.150834511988694e-06, 'samples': 23949312, 'steps': 46775, 'batch_loss/train': 0.7235818281769753} 12/29/2021 02:55:29 - INFO - codeparrot_training - Step 46776: {'lr': 5.14765169712414e-06, 'samples': 23949824, 'steps': 46776, 'batch_loss/train': 0.6776129305362701} 12/29/2021 02:55:39 - INFO - codeparrot_training - Step 46777: {'lr': 5.144469855700107e-06, 'samples': 23950336, 'steps': 46777, 'batch_loss/train': 0.6749056214466691} 12/29/2021 02:55:52 - INFO - codeparrot_training - Step 46778: {'lr': 5.141288987729226e-06, 
'samples': 23950848, 'steps': 46778, 'batch_loss/train': 0.7689493116922677} 12/29/2021 02:56:02 - INFO - codeparrot_training - Step 46779: {'lr': 5.13810909322407e-06, 'samples': 23951360, 'steps': 46779, 'batch_loss/train': 0.6798284472897649} 12/29/2021 02:56:13 - INFO - codeparrot_training - Step 46780: {'lr': 5.13493017219735e-06, 'samples': 23951872, 'steps': 46780, 'batch_loss/train': 0.7343373596668243} 12/29/2021 02:56:24 - INFO - codeparrot_training - Step 46781: {'lr': 5.131752224661723e-06, 'samples': 23952384, 'steps': 46781, 'batch_loss/train': 0.6718319905921817} 12/29/2021 02:56:37 - INFO - codeparrot_training - Step 46782: {'lr': 5.128575250629763e-06, 'samples': 23952896, 'steps': 46782, 'batch_loss/train': 0.7050471445545554} 12/29/2021 02:56:47 - INFO - codeparrot_training - Step 46783: {'lr': 5.125399250114126e-06, 'samples': 23953408, 'steps': 46783, 'batch_loss/train': 0.9222842380404472} 12/29/2021 02:56:58 - INFO - codeparrot_training - Step 46784: {'lr': 5.122224223127442e-06, 'samples': 23953920, 'steps': 46784, 'batch_loss/train': 1.2502910975017585} 12/29/2021 02:57:10 - INFO - codeparrot_training - Step 46785: {'lr': 5.119050169682337e-06, 'samples': 23954432, 'steps': 46785, 'batch_loss/train': 0.7414008444175124} 12/29/2021 02:57:21 - INFO - codeparrot_training - Step 46786: {'lr': 5.115877089791416e-06, 'samples': 23954944, 'steps': 46786, 'batch_loss/train': 0.7327363472431898} 12/29/2021 02:57:31 - INFO - codeparrot_training - Step 46787: {'lr': 5.112704983467331e-06, 'samples': 23955456, 'steps': 46787, 'batch_loss/train': 0.7066285137552768} 12/29/2021 02:57:43 - INFO - codeparrot_training - Step 46788: {'lr': 5.109533850722631e-06, 'samples': 23955968, 'steps': 46788, 'batch_loss/train': 0.7598563591018319} 12/29/2021 02:57:54 - INFO - codeparrot_training - Step 46789: {'lr': 5.106363691569971e-06, 'samples': 23956480, 'steps': 46789, 'batch_loss/train': 0.7099934946745634} 12/29/2021 02:58:04 - INFO - codeparrot_training - 
Step 46790: {'lr': 5.1031945060219795e-06, 'samples': 23956992, 'steps': 46790, 'batch_loss/train': 0.6229037962621078} 12/29/2021 02:58:15 - INFO - codeparrot_training - Step 46791: {'lr': 5.100026294091148e-06, 'samples': 23957504, 'steps': 46791, 'batch_loss/train': 0.7369984087417834} 12/29/2021 02:58:28 - INFO - codeparrot_training - Step 46792: {'lr': 5.0968590557901594e-06, 'samples': 23958016, 'steps': 46792, 'batch_loss/train': 0.7083963612094522} 12/29/2021 02:58:39 - INFO - codeparrot_training - Step 46793: {'lr': 5.093692791131643e-06, 'samples': 23958528, 'steps': 46793, 'batch_loss/train': 0.647058577160351} 12/29/2021 02:58:49 - INFO - codeparrot_training - Step 46794: {'lr': 5.090527500128089e-06, 'samples': 23959040, 'steps': 46794, 'batch_loss/train': 0.7371122662443668} 12/29/2021 02:59:01 - INFO - codeparrot_training - Step 46795: {'lr': 5.087363182792099e-06, 'samples': 23959552, 'steps': 46795, 'batch_loss/train': 1.4148138673044741} 12/29/2021 02:59:12 - INFO - codeparrot_training - Step 46796: {'lr': 5.0841998391363e-06, 'samples': 23960064, 'steps': 46796, 'batch_loss/train': 0.6756950113922358} 12/29/2021 02:59:23 - INFO - codeparrot_training - Step 46797: {'lr': 5.081037469173266e-06, 'samples': 23960576, 'steps': 46797, 'batch_loss/train': 0.6495742124971002} 12/29/2021 02:59:35 - INFO - codeparrot_training - Step 46798: {'lr': 5.077876072915516e-06, 'samples': 23961088, 'steps': 46798, 'batch_loss/train': 0.7844638000242412} 12/29/2021 02:59:46 - INFO - codeparrot_training - Step 46799: {'lr': 5.0747156503756774e-06, 'samples': 23961600, 'steps': 46799, 'batch_loss/train': 0.6823126235976815} 12/29/2021 02:59:57 - INFO - codeparrot_training - Step 46800: {'lr': 5.071556201566268e-06, 'samples': 23962112, 'steps': 46800, 'batch_loss/train': 0.6956045059487224} 12/29/2021 03:00:07 - INFO - codeparrot_training - Step 46801: {'lr': 5.06839772649989e-06, 'samples': 23962624, 'steps': 46801, 'batch_loss/train': 0.7156153172254562} 12/29/2021 
03:00:19 - INFO - codeparrot_training - Step 46802: {'lr': 5.06524022518906e-06, 'samples': 23963136, 'steps': 46802, 'batch_loss/train': 0.7006361917592585} 12/29/2021 03:00:30 - INFO - codeparrot_training - Step 46803: {'lr': 5.062083697646353e-06, 'samples': 23963648, 'steps': 46803, 'batch_loss/train': 0.7029376965947449} 12/29/2021 03:00:41 - INFO - codeparrot_training - Step 46804: {'lr': 5.058928143884367e-06, 'samples': 23964160, 'steps': 46804, 'batch_loss/train': 0.7197096173185855} 12/29/2021 03:00:53 - INFO - codeparrot_training - Step 46805: {'lr': 5.055773563915539e-06, 'samples': 23964672, 'steps': 46805, 'batch_loss/train': 0.7363464459776878} 12/29/2021 03:01:03 - INFO - codeparrot_training - Step 46806: {'lr': 5.052619957752469e-06, 'samples': 23965184, 'steps': 46806, 'batch_loss/train': 0.6895359251648188} 12/29/2021 03:01:14 - INFO - codeparrot_training - Step 46807: {'lr': 5.049467325407731e-06, 'samples': 23965696, 'steps': 46807, 'batch_loss/train': 0.6904151090420783} 12/29/2021 03:01:27 - INFO - codeparrot_training - Step 46808: {'lr': 5.046315666893786e-06, 'samples': 23966208, 'steps': 46808, 'batch_loss/train': 0.7616470232605934} 12/29/2021 03:01:37 - INFO - codeparrot_training - Step 46809: {'lr': 5.043164982223181e-06, 'samples': 23966720, 'steps': 46809, 'batch_loss/train': 0.6939738811925054} 12/29/2021 03:01:48 - INFO - codeparrot_training - Step 46810: {'lr': 5.0400152714085156e-06, 'samples': 23967232, 'steps': 46810, 'batch_loss/train': 0.7255350803025067} 12/29/2021 03:01:59 - INFO - codeparrot_training - Step 46811: {'lr': 5.036866534462226e-06, 'samples': 23967744, 'steps': 46811, 'batch_loss/train': 0.7869552690535784} 12/29/2021 03:02:11 - INFO - codeparrot_training - Step 46812: {'lr': 5.033718771396883e-06, 'samples': 23968256, 'steps': 46812, 'batch_loss/train': 0.6593768103048205} 12/29/2021 03:02:21 - INFO - codeparrot_training - Step 46813: {'lr': 5.030571982224952e-06, 'samples': 23968768, 'steps': 46813, 
'batch_loss/train': 0.6544525395147502} 12/29/2021 03:02:32 - INFO - codeparrot_training - Step 46814: {'lr': 5.027426166958976e-06, 'samples': 23969280, 'steps': 46814, 'batch_loss/train': 0.7475277716293931} 12/29/2021 03:02:44 - INFO - codeparrot_training - Step 46815: {'lr': 5.024281325611446e-06, 'samples': 23969792, 'steps': 46815, 'batch_loss/train': 0.6489277444779873} 12/29/2021 03:02:55 - INFO - codeparrot_training - Step 46816: {'lr': 5.021137458194908e-06, 'samples': 23970304, 'steps': 46816, 'batch_loss/train': 0.7599233742803335} 12/29/2021 03:03:05 - INFO - codeparrot_training - Step 46817: {'lr': 5.017994564721795e-06, 'samples': 23970816, 'steps': 46817, 'batch_loss/train': 0.7646086560562253} 12/29/2021 03:03:17 - INFO - codeparrot_training - Step 46818: {'lr': 5.014852645204654e-06, 'samples': 23971328, 'steps': 46818, 'batch_loss/train': 0.7372558675706387} 12/29/2021 03:03:28 - INFO - codeparrot_training - Step 46819: {'lr': 5.0117116996559465e-06, 'samples': 23971840, 'steps': 46819, 'batch_loss/train': 0.7640931280329823} 12/29/2021 03:03:39 - INFO - codeparrot_training - Step 46820: {'lr': 5.008571728088162e-06, 'samples': 23972352, 'steps': 46820, 'batch_loss/train': 0.6829312858171761} 12/29/2021 03:03:52 - INFO - codeparrot_training - Step 46821: {'lr': 5.005432730513821e-06, 'samples': 23972864, 'steps': 46821, 'batch_loss/train': 0.6837489046156406} 12/29/2021 03:04:02 - INFO - codeparrot_training - Step 46822: {'lr': 5.0022947069453265e-06, 'samples': 23973376, 'steps': 46822, 'batch_loss/train': 0.8005366390570998} 12/29/2021 03:04:13 - INFO - codeparrot_training - Step 46823: {'lr': 4.999157657395226e-06, 'samples': 23973888, 'steps': 46823, 'batch_loss/train': 0.6859450773335993} 12/29/2021 03:04:23 - INFO - codeparrot_training - Step 46824: {'lr': 4.9960215818759815e-06, 'samples': 23974400, 'steps': 46824, 'batch_loss/train': 1.5333657497540116} 12/29/2021 03:04:36 - INFO - codeparrot_training - Step 46825: {'lr': 
4.9928864804000276e-06, 'samples': 23974912, 'steps': 46825, 'batch_loss/train': 0.6356637319549918} 12/29/2021 03:04:47 - INFO - codeparrot_training - Step 46826: {'lr': 4.989752352979826e-06, 'samples': 23975424, 'steps': 46826, 'batch_loss/train': 0.6147769116796553} 12/29/2021 03:04:57 - INFO - codeparrot_training - Step 46827: {'lr': 4.986619199627895e-06, 'samples': 23975936, 'steps': 46827, 'batch_loss/train': 0.7821107031777501} 12/29/2021 03:05:10 - INFO - codeparrot_training - Step 46828: {'lr': 4.983487020356614e-06, 'samples': 23976448, 'steps': 46828, 'batch_loss/train': 0.7180587965995073} 12/29/2021 03:05:21 - INFO - codeparrot_training - Step 46829: {'lr': 4.9803558151785e-06, 'samples': 23976960, 'steps': 46829, 'batch_loss/train': 0.7319239871576428} 12/29/2021 03:05:31 - INFO - codeparrot_training - Step 46830: {'lr': 4.9772255841059876e-06, 'samples': 23977472, 'steps': 46830, 'batch_loss/train': 0.7115958742797375} 12/29/2021 03:05:43 - INFO - codeparrot_training - Step 46831: {'lr': 4.974096327151512e-06, 'samples': 23977984, 'steps': 46831, 'batch_loss/train': 0.7372012166306376} 12/29/2021 03:05:54 - INFO - codeparrot_training - Step 46832: {'lr': 4.970968044327451e-06, 'samples': 23978496, 'steps': 46832, 'batch_loss/train': 0.6661312756477855} 12/29/2021 03:06:05 - INFO - codeparrot_training - Step 46833: {'lr': 4.967840735646379e-06, 'samples': 23979008, 'steps': 46833, 'batch_loss/train': 0.645197672303766} 12/29/2021 03:06:15 - INFO - codeparrot_training - Step 46834: {'lr': 4.964714401120618e-06, 'samples': 23979520, 'steps': 46834, 'batch_loss/train': 0.778684510383755} 12/29/2021 03:06:27 - INFO - codeparrot_training - Step 46835: {'lr': 4.961589040762604e-06, 'samples': 23980032, 'steps': 46835, 'batch_loss/train': 0.7101948782801628} 12/29/2021 03:06:38 - INFO - codeparrot_training - Step 46836: {'lr': 4.958464654584827e-06, 'samples': 23980544, 'steps': 46836, 'batch_loss/train': 0.6266470269765705} 12/29/2021 03:06:49 - INFO - 
codeparrot_training - Step 46837: {'lr': 4.955341242599637e-06, 'samples': 23981056, 'steps': 46837, 'batch_loss/train': 0.7005469088908285} 12/29/2021 03:07:01 - INFO - codeparrot_training - Step 46838: {'lr': 4.9522188048195245e-06, 'samples': 23981568, 'steps': 46838, 'batch_loss/train': 0.6837882399559021} 12/29/2021 03:07:12 - INFO - codeparrot_training - Step 46839: {'lr': 4.949097341256842e-06, 'samples': 23982080, 'steps': 46839, 'batch_loss/train': 0.6886977106332779} 12/29/2021 03:07:23 - INFO - codeparrot_training - Step 46840: {'lr': 4.945976851924022e-06, 'samples': 23982592, 'steps': 46840, 'batch_loss/train': 0.6864454234018922} 12/29/2021 03:07:35 - INFO - codeparrot_training - Step 46841: {'lr': 4.942857336833445e-06, 'samples': 23983104, 'steps': 46841, 'batch_loss/train': 0.7747632344253361} 12/29/2021 03:07:45 - INFO - codeparrot_training - Step 46842: {'lr': 4.9397387959976e-06, 'samples': 23983616, 'steps': 46842, 'batch_loss/train': 0.7999403569847345} 12/29/2021 03:07:56 - INFO - codeparrot_training - Step 46843: {'lr': 4.936621229428756e-06, 'samples': 23984128, 'steps': 46843, 'batch_loss/train': 0.7025643554516137} 12/29/2021 03:08:07 - INFO - codeparrot_training - Step 46844: {'lr': 4.933504637139402e-06, 'samples': 23984640, 'steps': 46844, 'batch_loss/train': 0.7016371302306652} 12/29/2021 03:08:19 - INFO - codeparrot_training - Step 46845: {'lr': 4.930389019141917e-06, 'samples': 23985152, 'steps': 46845, 'batch_loss/train': 0.6995283517753705} 12/29/2021 03:08:30 - INFO - codeparrot_training - Step 46846: {'lr': 4.927274375448626e-06, 'samples': 23985664, 'steps': 46846, 'batch_loss/train': 0.7624665712937713} 12/29/2021 03:08:41 - INFO - codeparrot_training - Step 46847: {'lr': 4.924160706071989e-06, 'samples': 23986176, 'steps': 46847, 'batch_loss/train': 0.6763610441703349} 12/29/2021 03:08:53 - INFO - codeparrot_training - Step 46848: {'lr': 4.92104801102436e-06, 'samples': 23986688, 'steps': 46848, 'batch_loss/train': 
0.7041364493779838}
12/29/2021 03:09:04 - INFO - codeparrot_training - Step 46849: {'lr': 4.917936290318087e-06, 'samples': 23987200, 'steps': 46849, 'batch_loss/train': 0.7097812360152602}
12/29/2021 03:09:14 - INFO - codeparrot_training - Step 46850: {'lr': 4.9148255439655785e-06, 'samples': 23987712, 'steps': 46850, 'batch_loss/train': 0.7184916024561971}
12/29/2021 03:09:27 - INFO - codeparrot_training - Step 46851: {'lr': 4.911715771979159e-06, 'samples': 23988224, 'steps': 46851, 'batch_loss/train': 0.7457053600810468}
12/29/2021 03:09:37 - INFO - codeparrot_training - Step 46852: {'lr': 4.9086069743712325e-06, 'samples': 23988736, 'steps': 46852, 'batch_loss/train': 0.7196397380903363}
12/29/2021 03:09:48 - INFO - codeparrot_training - Step 46853: {'lr': 4.905499151154124e-06, 'samples': 23989248, 'steps': 46853, 'batch_loss/train': 0.7509152577258646}
12/29/2021 03:09:58 - INFO - codeparrot_training - Step 46854: {'lr': 4.902392302340214e-06, 'samples': 23989760, 'steps': 46854, 'batch_loss/train': 0.6631721092853695}
12/29/2021 03:10:11 - INFO - codeparrot_training - Step 46855: {'lr': 4.899286427941851e-06, 'samples': 23990272, 'steps': 46855, 'batch_loss/train': 0.8507082983851433}
12/29/2021 03:10:21 - INFO - codeparrot_training - Step 46856: {'lr': 4.896181527971389e-06, 'samples': 23990784, 'steps': 46856, 'batch_loss/train': 0.6706173825077713}
12/29/2021 03:10:32 - INFO - codeparrot_training - Step 46857: {'lr': 4.893077602441176e-06, 'samples': 23991296, 'steps': 46857, 'batch_loss/train': 0.8112341295927763}
12/29/2021 03:10:44 - INFO - codeparrot_training - Step 46858: {'lr': 4.889974651363482e-06, 'samples': 23991808, 'steps': 46858, 'batch_loss/train': 0.7646121839061379}
12/29/2021 03:10:55 - INFO - codeparrot_training - Step 46859: {'lr': 4.886872674750742e-06, 'samples': 23992320, 'steps': 46859, 'batch_loss/train': 0.7190771301393397}
12/29/2021 03:11:05 - INFO - codeparrot_training - Step 46860: {'lr': 4.88377167261525e-06, 'samples': 23992832, 'steps': 46860, 'batch_loss/train': 0.7765679024159908}
12/29/2021 03:11:18 - INFO - codeparrot_training - Step 46861: {'lr': 4.880671644969276e-06, 'samples': 23993344, 'steps': 46861, 'batch_loss/train': 0.7794700618833303}
12/29/2021 03:11:29 - INFO - codeparrot_training - Step 46862: {'lr': 4.877572591825252e-06, 'samples': 23993856, 'steps': 46862, 'batch_loss/train': 0.7084351303055882}
12/29/2021 03:11:39 - INFO - codeparrot_training - Step 46863: {'lr': 4.8744745131954485e-06, 'samples': 23994368, 'steps': 46863, 'batch_loss/train': 0.6744514214806259}
12/29/2021 03:11:51 - INFO - codeparrot_training - Step 46864: {'lr': 4.871377409092159e-06, 'samples': 23994880, 'steps': 46864, 'batch_loss/train': 0.6691917679854669}
12/29/2021 03:12:02 - INFO - codeparrot_training - Step 46865: {'lr': 4.868281279527708e-06, 'samples': 23995392, 'steps': 46865, 'batch_loss/train': 0.797234246507287}
12/29/2021 03:12:13 - INFO - codeparrot_training - Step 46866: {'lr': 4.8651861245144206e-06, 'samples': 23995904, 'steps': 46866, 'batch_loss/train': 0.7558006262406707}
12/29/2021 03:12:23 - INFO - codeparrot_training - Step 46867: {'lr': 4.86209194406459e-06, 'samples': 23996416, 'steps': 46867, 'batch_loss/train': 0.7716923881089315}
12/29/2021 03:12:36 - INFO - codeparrot_training - Step 46868: {'lr': 4.85899873819054e-06, 'samples': 23996928, 'steps': 46868, 'batch_loss/train': 0.736548843793571}
12/29/2021 03:12:47 - INFO - codeparrot_training - Step 46869: {'lr': 4.855906506904512e-06, 'samples': 23997440, 'steps': 46869, 'batch_loss/train': 0.6777945334324613}
12/29/2021 03:12:57 - INFO - codeparrot_training - Step 46870: {'lr': 4.852815250218856e-06, 'samples': 23997952, 'steps': 46870, 'batch_loss/train': 0.7607456743717194}
12/29/2021 03:13:10 - INFO - codeparrot_training - Step 46871: {'lr': 4.849724968145869e-06, 'samples': 23998464, 'steps': 46871, 'batch_loss/train': 0.7772658041212708}
12/29/2021 03:13:20 - INFO - codeparrot_training - Step 46872: {'lr': 4.8466356606977625e-06, 'samples': 23998976, 'steps': 46872, 'batch_loss/train': 0.7782066576182842}
12/29/2021 03:13:31 - INFO - codeparrot_training - Step 46873: {'lr': 4.843547327886888e-06, 'samples': 23999488, 'steps': 46873, 'batch_loss/train': 0.7037001738208346}
12/29/2021 03:13:43 - INFO - codeparrot_training - Step 46874: {'lr': 4.840459969725514e-06, 'samples': 24000000, 'steps': 46874, 'batch_loss/train': 0.5516919377841987}
12/29/2021 03:13:54 - INFO - codeparrot_training - Step 46875: {'lr': 4.837373586225907e-06, 'samples': 24000512, 'steps': 46875, 'batch_loss/train': 0.7685356182046235}
12/29/2021 03:14:04 - INFO - codeparrot_training - Step 46876: {'lr': 4.834288177400309e-06, 'samples': 24001024, 'steps': 46876, 'batch_loss/train': 0.748938157223165}
12/29/2021 03:14:15 - INFO - codeparrot_training - Step 46877: {'lr': 4.831203743261014e-06, 'samples': 24001536, 'steps': 46877, 'batch_loss/train': 0.7403732948005199}
12/29/2021 03:14:28 - INFO - codeparrot_training - Step 46878: {'lr': 4.828120283820292e-06, 'samples': 24002048, 'steps': 46878, 'batch_loss/train': 0.7344829358626157}
12/29/2021 03:14:38 - INFO - codeparrot_training - Step 46879: {'lr': 4.825037799090382e-06, 'samples': 24002560, 'steps': 46879, 'batch_loss/train': 0.756141773192212}
12/29/2021 03:14:49 - INFO - codeparrot_training - Step 46880: {'lr': 4.821956289083551e-06, 'samples': 24003072, 'steps': 46880, 'batch_loss/train': 0.7671863930299878}
12/29/2021 03:15:01 - INFO - codeparrot_training - Step 46881: {'lr': 4.8188757538120405e-06, 'samples': 24003584, 'steps': 46881, 'batch_loss/train': 0.7881047707051039}
12/29/2021 03:15:12 - INFO - codeparrot_training - Step 46882: {'lr': 4.815796193288091e-06, 'samples': 24004096, 'steps': 46882, 'batch_loss/train': 0.7162752067670226}
12/29/2021 03:15:23 - INFO - codeparrot_training - Step 46883: {'lr': 4.81271760752397e-06, 'samples': 24004608, 'steps': 46883, 'batch_loss/train': 0.852236564271152}
12/29/2021 03:15:36 - INFO - codeparrot_training - Step 46884: {'lr': 4.809639996531917e-06, 'samples': 24005120, 'steps': 46884, 'batch_loss/train': 0.4655806990340352}
12/29/2021 03:15:46 - INFO - codeparrot_training - Step 46885: {'lr': 4.806563360324173e-06, 'samples': 24005632, 'steps': 46885, 'batch_loss/train': 0.7897071307525039}
12/29/2021 03:15:57 - INFO - codeparrot_training - Step 46886: {'lr': 4.803487698912923e-06, 'samples': 24006144, 'steps': 46886, 'batch_loss/train': 0.8770369396079332}
12/29/2021 03:16:08 - INFO - codeparrot_training - Step 46887: {'lr': 4.800413012310406e-06, 'samples': 24006656, 'steps': 46887, 'batch_loss/train': 0.696841552387923}
12/29/2021 03:16:20 - INFO - codeparrot_training - Step 46888: {'lr': 4.797339300528891e-06, 'samples': 24007168, 'steps': 46888, 'batch_loss/train': 1.0314320367760956}
12/29/2021 03:16:30 - INFO - codeparrot_training - Step 46889: {'lr': 4.794266563580563e-06, 'samples': 24007680, 'steps': 46889, 'batch_loss/train': 1.0236200061626732}
12/29/2021 03:16:41 - INFO - codeparrot_training - Step 46890: {'lr': 4.7911948014776606e-06, 'samples': 24008192, 'steps': 46890, 'batch_loss/train': 0.6665377243189141}
12/29/2021 03:16:53 - INFO - codeparrot_training - Step 46891: {'lr': 4.788124014232342e-06, 'samples': 24008704, 'steps': 46891, 'batch_loss/train': 0.7297729328274727}
12/29/2021 03:17:04 - INFO - codeparrot_training - Step 46892: {'lr': 4.785054201856876e-06, 'samples': 24009216, 'steps': 46892, 'batch_loss/train': 0.7204521866515279}
12/29/2021 03:17:15 - INFO - codeparrot_training - Step 46893: {'lr': 4.781985364363445e-06, 'samples': 24009728, 'steps': 46893, 'batch_loss/train': 0.814561880659312}
12/29/2021 03:17:27 - INFO - codeparrot_training - Step 46894: {'lr': 4.7789175017642905e-06, 'samples': 24010240, 'steps': 46894, 'batch_loss/train': 0.6717323162592947}
12/29/2021 03:17:38 - INFO - codeparrot_training - Step 46895: {'lr': 4.775850614071486e-06, 'samples': 24010752, 'steps': 46895, 'batch_loss/train': 0.7851888127624989}
12/29/2021 03:17:49 - INFO - codeparrot_training - Step 46896: {'lr': 4.772784701297356e-06, 'samples': 24011264, 'steps': 46896, 'batch_loss/train': 0.8140816632658243}
12/29/2021 03:17:59 - INFO - codeparrot_training - Step 46897: {'lr': 4.7697197634540555e-06, 'samples': 24011776, 'steps': 46897, 'batch_loss/train': 0.7318855365738273}
12/29/2021 03:18:12 - INFO - codeparrot_training - Step 46898: {'lr': 4.766655800553688e-06, 'samples': 24012288, 'steps': 46898, 'batch_loss/train': 0.7327809575945139}
12/29/2021 03:18:22 - INFO - codeparrot_training - Step 46899: {'lr': 4.763592812608547e-06, 'samples': 24012800, 'steps': 46899, 'batch_loss/train': 0.6586694438010454}
12/29/2021 03:18:33 - INFO - codeparrot_training - Step 46900: {'lr': 4.760530799630791e-06, 'samples': 24013312, 'steps': 46900, 'batch_loss/train': 0.7507996172644198}
12/29/2021 03:18:45 - INFO - codeparrot_training - Step 46901: {'lr': 4.7574697616325215e-06, 'samples': 24013824, 'steps': 46901, 'batch_loss/train': 0.7138861711136997}
12/29/2021 03:18:56 - INFO - codeparrot_training - Step 46902: {'lr': 4.75440969862595e-06, 'samples': 24014336, 'steps': 46902, 'batch_loss/train': 0.7058553980314173}
12/29/2021 03:19:06 - INFO - codeparrot_training - Step 46903: {'lr': 4.751350610623261e-06, 'samples': 24014848, 'steps': 46903, 'batch_loss/train': 0.719870962202549}
12/29/2021 03:19:18 - INFO - codeparrot_training - Step 46904: {'lr': 4.748292497636614e-06, 'samples': 24015360, 'steps': 46904, 'batch_loss/train': 0.7473260283004493}
12/29/2021 03:19:29 - INFO - codeparrot_training - Step 46905: {'lr': 4.745235359678135e-06, 'samples': 24015872, 'steps': 46905, 'batch_loss/train': 0.7329400068847463}
12/29/2021 03:19:40 - INFO - codeparrot_training - Step 46906: {'lr': 4.742179196759983e-06, 'samples': 24016384, 'steps': 46906, 'batch_loss/train': 0.704409662168473}
12/29/2021 03:19:52 - INFO - codeparrot_training - Step 46907: {'lr': 4.739124008894341e-06, 'samples': 24016896, 'steps': 46907, 'batch_loss/train': 0.7246380220167339}
12/29/2021 03:20:03 - INFO - codeparrot_training - Step 46908: {'lr': 4.73606979609334e-06, 'samples': 24017408, 'steps': 46908, 'batch_loss/train': 0.7419342123903334}
12/29/2021 03:20:14 - INFO - codeparrot_training - Step 46909: {'lr': 4.73301655836908e-06, 'samples': 24017920, 'steps': 46909, 'batch_loss/train': 0.8083935417234898}
12/29/2021 03:20:24 - INFO - codeparrot_training - Step 46910: {'lr': 4.729964295733774e-06, 'samples': 24018432, 'steps': 46910, 'batch_loss/train': 0.7619129605591297}
12/29/2021 03:20:37 - INFO - codeparrot_training - Step 46911: {'lr': 4.726913008199524e-06, 'samples': 24018944, 'steps': 46911, 'batch_loss/train': 0.8177265804260969}
12/29/2021 03:20:47 - INFO - codeparrot_training - Step 46912: {'lr': 4.7238626957784306e-06, 'samples': 24019456, 'steps': 46912, 'batch_loss/train': 0.7145504420623183}
12/29/2021 03:20:58 - INFO - codeparrot_training - Step 46913: {'lr': 4.7208133584826505e-06, 'samples': 24019968, 'steps': 46913, 'batch_loss/train': 0.8058570944704115}
12/29/2021 03:21:11 - INFO - codeparrot_training - Step 46914: {'lr': 4.717764996324314e-06, 'samples': 24020480, 'steps': 46914, 'batch_loss/train': 0.6729938853532076}
12/29/2021 03:21:21 - INFO - codeparrot_training - Step 46915: {'lr': 4.714717609315522e-06, 'samples': 24020992, 'steps': 46915, 'batch_loss/train': 0.8158202990889549}
12/29/2021 03:21:32 - INFO - codeparrot_training - Step 46916: {'lr': 4.711671197468403e-06, 'samples': 24021504, 'steps': 46916, 'batch_loss/train': 0.7749120723456144}
12/29/2021 03:21:44 - INFO - codeparrot_training - Step 46917: {'lr': 4.70862576079506e-06, 'samples': 24022016, 'steps': 46917, 'batch_loss/train': 0.6997431670315564}
12/29/2021 03:21:55 - INFO - codeparrot_training - Step 46918: {'lr': 4.705581299307593e-06, 'samples': 24022528, 'steps': 46918, 'batch_loss/train': 0.8007563222199678}
12/29/2021 03:22:05 - INFO - codeparrot_training - Step 46919: {'lr': 4.702537813018132e-06, 'samples': 24023040, 'steps': 46919, 'batch_loss/train': 0.7451327508315444}
12/29/2021 03:22:16 - INFO - codeparrot_training - Step 46920: {'lr': 4.69949530193875e-06, 'samples': 24023552, 'steps': 46920, 'batch_loss/train': 0.8323976192623377}
12/29/2021 03:22:28 - INFO - codeparrot_training - Step 46921: {'lr': 4.696453766081549e-06, 'samples': 24024064, 'steps': 46921, 'batch_loss/train': 0.6502983670216054}
12/29/2021 03:22:39 - INFO - codeparrot_training - Step 46922: {'lr': 4.6934132054586296e-06, 'samples': 24024576, 'steps': 46922, 'batch_loss/train': 0.7392147481441498}
12/29/2021 03:22:49 - INFO - codeparrot_training - Step 46923: {'lr': 4.690373620082095e-06, 'samples': 24025088, 'steps': 46923, 'batch_loss/train': 0.6281072865240276}
12/29/2021 03:23:02 - INFO - codeparrot_training - Step 46924: {'lr': 4.687335009963961e-06, 'samples': 24025600, 'steps': 46924, 'batch_loss/train': 0.7101561052259058}
12/29/2021 03:23:12 - INFO - codeparrot_training - Step 46925: {'lr': 4.684297375116386e-06, 'samples': 24026112, 'steps': 46925, 'batch_loss/train': 0.6606370112858713}
12/29/2021 03:23:23 - INFO - codeparrot_training - Step 46926: {'lr': 4.681260715551416e-06, 'samples': 24026624, 'steps': 46926, 'batch_loss/train': 0.7104036207310855}
12/29/2021 03:23:35 - INFO - codeparrot_training - Step 46927: {'lr': 4.6782250312811245e-06, 'samples': 24027136, 'steps': 46927, 'batch_loss/train': 0.7269873498007655}
12/29/2021 03:23:46 - INFO - codeparrot_training - Step 46928: {'lr': 4.675190322317558e-06, 'samples': 24027648, 'steps': 46928, 'batch_loss/train': 0.7822473747655749}
12/29/2021 03:23:57 - INFO - codeparrot_training - Step 46929: {'lr': 4.672156588672844e-06, 'samples': 24028160, 'steps': 46929, 'batch_loss/train': 0.7944977944716811}
12/29/2021 03:24:07 - INFO - codeparrot_training - Step 46930: {'lr': 4.6691238303590015e-06, 'samples': 24028672, 'steps': 46930, 'batch_loss/train': 0.5030607416993007}
12/29/2021 03:24:20 - INFO - codeparrot_training - Step 46931: {'lr': 4.666092047388048e-06, 'samples': 24029184, 'steps': 46931, 'batch_loss/train': 0.7276636427268386}
12/29/2021 03:24:31 - INFO - codeparrot_training - Step 46932: {'lr': 4.663061239772115e-06, 'samples': 24029696, 'steps': 46932, 'batch_loss/train': 0.7253453517332673}
12/29/2021 03:24:41 - INFO - codeparrot_training - Step 46933: {'lr': 4.660031407523191e-06, 'samples': 24030208, 'steps': 46933, 'batch_loss/train': 0.8054408659227192}
12/29/2021 03:24:53 - INFO - codeparrot_training - Step 46934: {'lr': 4.65700255065335e-06, 'samples': 24030720, 'steps': 46934, 'batch_loss/train': 0.7130867121741176}
12/29/2021 03:25:04 - INFO - codeparrot_training - Step 46935: {'lr': 4.653974669174638e-06, 'samples': 24031232, 'steps': 46935, 'batch_loss/train': 0.7605660003609955}
12/29/2021 03:25:15 - INFO - codeparrot_training - Step 46936: {'lr': 4.650947763099101e-06, 'samples': 24031744, 'steps': 46936, 'batch_loss/train': 0.7006240841001272}
12/29/2021 03:25:27 - INFO - codeparrot_training - Step 46937: {'lr': 4.647921832438729e-06, 'samples': 24032256, 'steps': 46937, 'batch_loss/train': 0.6825501807034016}
12/29/2021 03:25:37 - INFO - codeparrot_training - Step 46938: {'lr': 4.644896877205596e-06, 'samples': 24032768, 'steps': 46938, 'batch_loss/train': 0.8169473155867308}
12/29/2021 03:25:48 - INFO - codeparrot_training - Step 46939: {'lr': 4.641872897411665e-06, 'samples': 24033280, 'steps': 46939, 'batch_loss/train': 0.794648420996964}
12/29/2021 03:25:59 - INFO - codeparrot_training - Step 46940: {'lr': 4.638849893069064e-06, 'samples': 24033792, 'steps': 46940, 'batch_loss/train': 0.6690927958115935}
12/29/2021 03:26:11 - INFO - codeparrot_training - Step 46941: {'lr': 4.635827864189729e-06, 'samples': 24034304, 'steps': 46941, 'batch_loss/train': 0.7581865731626749}
12/29/2021 03:26:21 - INFO - codeparrot_training - Step 46942: {'lr': 4.632806810785706e-06, 'samples': 24034816, 'steps': 46942, 'batch_loss/train': 0.7472189799882472}
12/29/2021 03:26:32 - INFO - codeparrot_training - Step 46943: {'lr': 4.6297867328689566e-06, 'samples': 24035328, 'steps': 46943, 'batch_loss/train': 0.7344433418475091}
12/29/2021 03:26:45 - INFO - codeparrot_training - Step 46944: {'lr': 4.626767630451556e-06, 'samples': 24035840, 'steps': 46944, 'batch_loss/train': 0.7515514697879553}
12/29/2021 03:26:56 - INFO - codeparrot_training - Step 46945: {'lr': 4.6237495035454934e-06, 'samples': 24036352, 'steps': 46945, 'batch_loss/train': 0.48915063275489956}
12/29/2021 03:27:06 - INFO - codeparrot_training - Step 46946: {'lr': 4.620732352162732e-06, 'samples': 24036864, 'steps': 46946, 'batch_loss/train': 0.675732898293063}
12/29/2021 03:27:18 - INFO - codeparrot_training - Step 46947: {'lr': 4.6177161763152895e-06, 'samples': 24037376, 'steps': 46947, 'batch_loss/train': 0.7427357165142894}
12/29/2021 03:27:29 - INFO - codeparrot_training - Step 46948: {'lr': 4.614700976015129e-06, 'samples': 24037888, 'steps': 46948, 'batch_loss/train': 0.6149739188840613}
12/29/2021 03:27:40 - INFO - codeparrot_training - Step 46949: {'lr': 4.6116867512743245e-06, 'samples': 24038400, 'steps': 46949, 'batch_loss/train': 0.7907123158220202}
12/29/2021 03:27:52 - INFO - codeparrot_training - Step 46950: {'lr': 4.6086735021047266e-06, 'samples': 24038912, 'steps': 46950, 'batch_loss/train': 0.5517462189309299}
12/29/2021 03:28:03 - INFO - codeparrot_training - Step 46951: {'lr': 4.605661228518438e-06, 'samples': 24039424, 'steps': 46951, 'batch_loss/train': 0.7556305108591914}
12/29/2021 03:28:13 - INFO - codeparrot_training - Step 46952: {'lr': 4.602649930527392e-06, 'samples': 24039936, 'steps': 46952, 'batch_loss/train': 0.6833119327202439}
12/29/2021 03:28:24 - INFO - codeparrot_training - Step 46953: {'lr': 4.599639608143524e-06, 'samples': 24040448, 'steps': 46953, 'batch_loss/train': 0.852120999712497}
12/29/2021 03:28:37 - INFO - codeparrot_training - Step 46954: {'lr': 4.596630261378798e-06, 'samples': 24040960, 'steps': 46954, 'batch_loss/train': 0.685736624058336}
12/29/2021 03:28:47 - INFO - codeparrot_training - Step 46955: {'lr': 4.593621890245258e-06, 'samples': 24041472, 'steps': 46955, 'batch_loss/train': 0.7921067057177424}
12/29/2021 03:28:58 - INFO - codeparrot_training - Step 46956: {'lr': 4.590614494754813e-06, 'samples': 24041984, 'steps': 46956, 'batch_loss/train': 0.7545939953997731}
12/29/2021 03:29:10 - INFO - codeparrot_training - Step 46957: {'lr': 4.587608074919397e-06, 'samples': 24042496, 'steps': 46957, 'batch_loss/train': 0.7763643469661474}
12/29/2021 03:29:21 - INFO - codeparrot_training - Step 46958: {'lr': 4.584602630751e-06, 'samples': 24043008, 'steps': 46958, 'batch_loss/train': 0.7108146185055375}
12/29/2021 03:29:32 - INFO - codeparrot_training - Step 46959: {'lr': 4.581598162261558e-06, 'samples': 24043520, 'steps': 46959, 'batch_loss/train': 0.5895618931390345}
12/29/2021 03:29:44 - INFO - codeparrot_training - Step 46960: {'lr': 4.578594669463032e-06, 'samples': 24044032, 'steps': 46960, 'batch_loss/train': 0.5492544794105925}
12/29/2021 03:29:54 - INFO - codeparrot_training - Step 46961: {'lr': 4.5755921523673304e-06, 'samples': 24044544, 'steps': 46961, 'batch_loss/train': 0.8672537235543132}
12/29/2021 03:30:05 - INFO - codeparrot_training - Step 46962: {'lr': 4.572590610986388e-06, 'samples': 24045056, 'steps': 46962, 'batch_loss/train': 0.7667274628765881}
12/29/2021 03:30:16 - INFO - codeparrot_training - Step 46963: {'lr': 4.569590045332167e-06, 'samples': 24045568, 'steps': 46963, 'batch_loss/train': 0.6915867399657145}
12/29/2021 03:30:28 - INFO - codeparrot_training - Step 46964: {'lr': 4.56659045541663e-06, 'samples': 24046080, 'steps': 46964, 'batch_loss/train': 0.7714840031694621}
12/29/2021 03:30:39 - INFO - codeparrot_training - Step 46965: {'lr': 4.563591841251574e-06, 'samples': 24046592, 'steps': 46965, 'batch_loss/train': 0.6261070067994297}
12/29/2021 03:30:50 - INFO - codeparrot_training - Step 46966: {'lr': 4.560594202849072e-06, 'samples': 24047104, 'steps': 46966, 'batch_loss/train': 0.7405787231400609}
12/29/2021 03:31:02 - INFO - codeparrot_training - Step 46967: {'lr': 4.557597540220948e-06, 'samples': 24047616, 'steps': 46967, 'batch_loss/train': 0.7602063240483403}
12/29/2021 03:31:12 - INFO - codeparrot_training - Step 46968: {'lr': 4.5546018533791364e-06, 'samples': 24048128, 'steps': 46968, 'batch_loss/train': 0.5827896525152028}
12/29/2021 03:31:23 - INFO - codeparrot_training - Step 46969: {'lr': 4.551607142335573e-06, 'samples': 24048640, 'steps': 46969, 'batch_loss/train': 0.7310543488711119}
12/29/2021 03:31:36 - INFO - codeparrot_training - Step 46970: {'lr': 4.548613407102109e-06, 'samples': 24049152, 'steps': 46970, 'batch_loss/train': 0.7373761786147952}
12/29/2021 03:31:46 - INFO - codeparrot_training - Step 46971: {'lr': 4.545620647690707e-06, 'samples': 24049664, 'steps': 46971, 'batch_loss/train': 0.7014385531656444}
12/29/2021 03:31:57 - INFO - codeparrot_training - Step 46972: {'lr': 4.542628864113219e-06, 'samples': 24050176, 'steps': 46972, 'batch_loss/train': 1.0236745541915298}
12/29/2021 03:32:08 - INFO - codeparrot_training - Step 46973: {'lr': 4.539638056381551e-06, 'samples': 24050688, 'steps': 46973, 'batch_loss/train': 0.6611023084260523}
12/29/2021 03:32:20 - INFO - codeparrot_training - Step 46974: {'lr': 4.536648224507611e-06, 'samples': 24051200, 'steps': 46974, 'batch_loss/train': 0.5722799907671288}
12/29/2021 03:32:30 - INFO - codeparrot_training - Step 46975: {'lr': 4.5336593685033064e-06, 'samples': 24051712, 'steps': 46975, 'batch_loss/train': 0.7007420370355248}
12/29/2021 03:32:41 - INFO - codeparrot_training - Step 46976: {'lr': 4.530671488380433e-06, 'samples': 24052224, 'steps': 46976, 'batch_loss/train': 0.9203477697446942}
12/29/2021 03:32:53 - INFO - codeparrot_training - Step 46977: {'lr': 4.527684584150954e-06, 'samples': 24052736, 'steps': 46977, 'batch_loss/train': 0.732332234736532}
12/29/2021 03:33:04 - INFO - codeparrot_training - Step 46978: {'lr': 4.52469865582672e-06, 'samples': 24053248, 'steps': 46978, 'batch_loss/train': 0.7057340820319951}
12/29/2021 03:33:14 - INFO - codeparrot_training - Step 46979: {'lr': 4.521713703419583e-06, 'samples': 24053760, 'steps': 46979, 'batch_loss/train': 0.6616790974512696}
12/29/2021 03:33:27 - INFO - codeparrot_training - Step 46980: {'lr': 4.518729726941423e-06, 'samples': 24054272, 'steps': 46980, 'batch_loss/train': 0.6755416780943051}
12/29/2021 03:33:37 - INFO - codeparrot_training - Step 46981: {'lr': 4.515746726404119e-06, 'samples': 24054784, 'steps': 46981, 'batch_loss/train': 0.7809590417891741}
12/29/2021 03:33:48 - INFO - codeparrot_training - Step 46982: {'lr': 4.512764701819494e-06, 'samples': 24055296, 'steps': 46982, 'batch_loss/train': 0.7084564450196922}
12/29/2021 03:33:59 - INFO - codeparrot_training - Step 46983: {'lr': 4.5097836531994e-06, 'samples': 24055808, 'steps': 46983, 'batch_loss/train': 0.6811120097991079}
12/29/2021 03:34:13 - INFO - codeparrot_training - Step 46984: {'lr': 4.5068035805557735e-06, 'samples': 24056320, 'steps': 46984, 'batch_loss/train': 0.6729829241521657}
12/29/2021 03:34:23 - INFO - codeparrot_training - Step 46985: {'lr': 4.503824483900382e-06, 'samples': 24056832, 'steps': 46985, 'batch_loss/train': 0.6274387962184846}
12/29/2021 03:34:34 - INFO - codeparrot_training - Step 46986: {'lr': 4.500846363245076e-06, 'samples': 24057344, 'steps': 46986, 'batch_loss/train': 0.6270579337142408}
12/29/2021 03:34:46 - INFO - codeparrot_training - Step 46987: {'lr': 4.497869218601708e-06, 'samples': 24057856, 'steps': 46987, 'batch_loss/train': 0.7714504506438971}
12/29/2021 03:34:57 - INFO - codeparrot_training - Step 46988: {'lr': 4.494893049982101e-06, 'samples': 24058368, 'steps': 46988, 'batch_loss/train': 0.812314978800714}
12/29/2021 03:35:07 - INFO - codeparrot_training - Step 46989: {'lr': 4.491917857398137e-06, 'samples': 24058880, 'steps': 46989, 'batch_loss/train': 0.6951979291625321}
12/29/2021 03:35:21 - INFO - codeparrot_training - Step 46990: {'lr': 4.488943640861581e-06, 'samples': 24059392, 'steps': 46990, 'batch_loss/train': 0.6367197320796549}
12/29/2021 03:35:32 - INFO - codeparrot_training - Step 46991: {'lr': 4.485970400384287e-06, 'samples': 24059904, 'steps': 46991, 'batch_loss/train': 0.6926439446397126}
12/29/2021 03:35:43 - INFO - codeparrot_training - Step 46992: {'lr': 4.482998135978078e-06, 'samples': 24060416, 'steps': 46992, 'batch_loss/train': 0.5322978736367077}
12/29/2021 03:35:53 - INFO - codeparrot_training - Step 46993: {'lr': 4.480026847654778e-06, 'samples': 24060928, 'steps': 46993, 'batch_loss/train': 0.7072392527479678}
12/29/2021 03:36:06 - INFO - codeparrot_training - Step 46994: {'lr': 4.477056535426127e-06, 'samples': 24061440, 'steps': 46994, 'batch_loss/train': 0.7110213842242956}
12/29/2021 03:36:17 - INFO - codeparrot_training - Step 46995: {'lr': 4.474087199304061e-06, 'samples': 24061952, 'steps': 46995, 'batch_loss/train': 0.740003221668303}
12/29/2021 03:36:27 - INFO - codeparrot_training - Step 46996: {'lr': 4.471118839300292e-06, 'samples': 24062464, 'steps': 46996, 'batch_loss/train': 0.4933004272170365}
12/29/2021 03:36:39 - INFO - codeparrot_training - Step 46997: {'lr': 4.468151455426644e-06, 'samples': 24062976, 'steps': 46997, 'batch_loss/train': 0.47957419918384403}
12/29/2021 03:36:50 - INFO - codeparrot_training - Step 46998: {'lr': 4.465185047694914e-06, 'samples': 24063488, 'steps': 46998, 'batch_loss/train': 0.5805241279304028}
12/29/2021 03:37:01 - INFO - codeparrot_training - Step 46999: {'lr': 4.462219616116925e-06, 'samples': 24064000, 'steps': 46999, 'batch_loss/train': 0.7415673154173419}
12/29/2021 03:37:14 - INFO - codeparrot_training - Step 47000: {'lr': 4.459255160704418e-06, 'samples': 24064512, 'steps': 47000, 'batch_loss/train': 0.6985687459819019}
12/29/2021 03:37:25 - INFO - codeparrot_training - Step 47001: {'lr': 4.456291681469215e-06, 'samples': 24065024, 'steps': 47001, 'batch_loss/train': 0.6911696721799672}
12/29/2021 03:37:36 - INFO - codeparrot_training - Step 47002: {'lr': 4.453329178423032e-06, 'samples': 24065536, 'steps': 47002, 'batch_loss/train': 0.6364036403829232}
12/29/2021 03:37:48 - INFO - codeparrot_training - Step 47003: {'lr': 4.450367651577747e-06, 'samples': 24066048, 'steps': 47003, 'batch_loss/train': 0.6683314276160672}
12/29/2021 03:37:58 - INFO - codeparrot_training - Step 47004: {'lr': 4.447407100945073e-06, 'samples': 24066560, 'steps': 47004, 'batch_loss/train': 0.6818875130265951}
12/29/2021 03:38:09 - INFO - codeparrot_training - Step 47005: {'lr': 4.444447526536805e-06, 'samples': 24067072, 'steps': 47005, 'batch_loss/train': 0.6997524921316653}
12/29/2021 03:38:20 - INFO - codeparrot_training - Step 47006: {'lr': 4.441488928364656e-06, 'samples': 24067584, 'steps': 47006, 'batch_loss/train': 0.814353596419096}
12/29/2021 03:38:34 - INFO - codeparrot_training - Step 47007: {'lr': 4.438531306440452e-06, 'samples': 24068096, 'steps': 47007, 'batch_loss/train': 0.6887784665450454}
12/29/2021 03:38:44 - INFO - codeparrot_training - Step 47008: {'lr': 4.435574660775932e-06, 'samples': 24068608, 'steps': 47008, 'batch_loss/train': 0.8069716272875667}
12/29/2021 03:38:55 - INFO - codeparrot_training - Step 47009: {'lr': 4.432618991382809e-06, 'samples': 24069120, 'steps': 47009, 'batch_loss/train': 0.676273787394166}
12/29/2021 03:39:07 - INFO - codeparrot_training - Step 47010: {'lr': 4.429664298272934e-06, 'samples': 24069632, 'steps': 47010, 'batch_loss/train': 0.6415237910114229}
12/29/2021 03:39:18 - INFO - codeparrot_training - Step 47011: {'lr': 4.426710581457938e-06, 'samples': 24070144, 'steps': 47011, 'batch_loss/train': 0.7328588063828647}
12/29/2021 03:39:28 - INFO - codeparrot_training - Step 47012: {'lr': 4.423757840949644e-06, 'samples': 24070656, 'steps': 47012, 'batch_loss/train': 0.6869487725198269}
12/29/2021 03:39:40 - INFO - codeparrot_training - Step 47013: {'lr': 4.4208060767597645e-06, 'samples': 24071168, 'steps': 47013, 'batch_loss/train': 0.7452283641323447}
12/29/2021 03:39:51 - INFO - codeparrot_training - Step 47014: {'lr': 4.417855288900013e-06, 'samples': 24071680, 'steps': 47014, 'batch_loss/train': 0.6670300173573196}
12/29/2021 03:40:02 - INFO - codeparrot_training - Step 47015: {'lr': 4.414905477382131e-06, 'samples': 24072192, 'steps': 47015, 'batch_loss/train': 0.7910233444999903}
12/29/2021 03:40:12 - INFO - codeparrot_training - Step 47016: {'lr': 4.411956642217912e-06, 'samples': 24072704, 'steps': 47016, 'batch_loss/train': 0.7029074807651341}
12/29/2021 03:40:26 - INFO - codeparrot_training - Step 47017: {'lr': 4.409008783418961e-06, 'samples': 24073216, 'steps': 47017, 'batch_loss/train': 0.7185123804956675}
12/29/2021 03:40:37 - INFO - codeparrot_training - Step 47018: {'lr': 4.406061900997072e-06, 'samples': 24073728, 'steps': 47018, 'batch_loss/train': 0.6671990351751447}
12/29/2021 03:40:48 - INFO - codeparrot_training - Step 47019: {'lr': 4.403115994963986e-06, 'samples': 24074240, 'steps': 47019, 'batch_loss/train': 0.6532104192301631}
12/29/2021 03:41:00 - INFO - codeparrot_training - Step 47020: {'lr': 4.400171065331332e-06, 'samples': 24074752, 'steps': 47020, 'batch_loss/train': 0.652703178639058}
12/29/2021 03:41:10 - INFO - codeparrot_training - Step 47021: {'lr': 4.39722711211088e-06, 'samples': 24075264, 'steps': 47021, 'batch_loss/train': 0.6773668886162341}
12/29/2021 03:41:21 - INFO - codeparrot_training - Step 47022: {'lr': 4.394284135314342e-06, 'samples': 24075776, 'steps': 47022, 'batch_loss/train': 0.7002330070827156}
12/29/2021 03:41:33 - INFO - codeparrot_training - Step 47023: {'lr': 4.391342134953347e-06, 'samples': 24076288, 'steps': 47023, 'batch_loss/train': 0.705099037848413}
12/29/2021 03:41:44 - INFO - codeparrot_training - Step 47024: {'lr': 4.388401111039664e-06, 'samples': 24076800, 'steps': 47024, 'batch_loss/train': 0.6664010393433273}
12/29/2021 03:41:54 - INFO - codeparrot_training - Step 47025: {'lr': 4.3854610635849505e-06, 'samples': 24077312, 'steps': 47025, 'batch_loss/train': 0.7327969213947654}
12/29/2021 03:42:06 - INFO - codeparrot_training - Step 47026: {'lr': 4.382521992600918e-06, 'samples': 24077824, 'steps': 47026, 'batch_loss/train': 0.654117812635377}
12/29/2021 03:42:17 - INFO - codeparrot_training - Step 47027: {'lr': 4.379583898099227e-06, 'samples': 24078336, 'steps': 47027, 'batch_loss/train': 0.5623639028635807}
12/29/2021 03:42:28 - INFO - codeparrot_training - Step 47028: {'lr': 4.376646780091559e-06, 'samples': 24078848, 'steps': 47028, 'batch_loss/train': 0.5694429511204362}
12/29/2021 03:42:38 - INFO - codeparrot_training - Step 47029: {'lr': 4.373710638589601e-06, 'samples': 24079360, 'steps': 47029, 'batch_loss/train': 0.6439855601638556}
12/29/2021 03:42:52 - INFO - codeparrot_training - Step 47030: {'lr': 4.370775473605065e-06, 'samples': 24079872, 'steps': 47030, 'batch_loss/train': 0.7771761454641819}
12/29/2021 03:43:03 - INFO - codeparrot_training - Step 47031: {'lr': 4.367841285149526e-06, 'samples': 24080384, 'steps': 47031, 'batch_loss/train': 0.7720392236951739}
12/29/2021 03:43:13 - INFO - codeparrot_training - Step 47032: {'lr': 4.364908073234697e-06, 'samples': 24080896, 'steps': 47032, 'batch_loss/train': 0.6581936464644969}
12/29/2021 03:43:26 - INFO - codeparrot_training - Step 47033: {'lr': 4.361975837872289e-06, 'samples': 24081408, 'steps': 47033, 'batch_loss/train': 0.9287243648432195}
12/29/2021 03:43:36 - INFO - codeparrot_training - Step 47034: {'lr': 4.359044579073906e-06, 'samples': 24081920, 'steps': 47034, 'batch_loss/train': 0.7144563104957342}
12/29/2021 03:43:47 - INFO - codeparrot_training - Step 47035: {'lr': 4.3561142968511765e-06, 'samples': 24082432, 'steps': 47035, 'batch_loss/train': 0.6954942094162107}
12/29/2021 03:44:01 - INFO - codeparrot_training - Step 47036: {'lr': 4.353184991215841e-06, 'samples': 24082944, 'steps': 47036, 'batch_loss/train': 0.6870319857262075}
12/29/2021 03:44:11 - INFO - codeparrot_training - Step 47037: {'lr': 4.350256662179447e-06, 'samples': 24083456, 'steps': 47037, 'batch_loss/train': 0.7186131924390793}
12/29/2021 03:44:22 - INFO - codeparrot_training - Step 47038: {'lr': 4.347329309753706e-06, 'samples': 24083968, 'steps': 47038, 'batch_loss/train': 0.7190938699059188}
12/29/2021 03:44:33 - INFO - codeparrot_training - Step 47039: {'lr': 4.34440293395022e-06, 'samples': 24084480, 'steps': 47039, 'batch_loss/train': 0.789157644379884}
12/29/2021 03:44:45 - INFO - codeparrot_training - Step 47040: {'lr': 4.341477534780619e-06, 'samples': 24084992, 'steps': 47040, 'batch_loss/train': 0.7297925725579262}
12/29/2021 03:44:56 - INFO - codeparrot_training - Step 47041: {'lr': 4.338553112256533e-06, 'samples': 24085504, 'steps': 47041, 'batch_loss/train': 0.33331103666569106}
12/29/2021 03:45:06 - INFO - codeparrot_training - Step 47042: {'lr': 4.335629666389646e-06, 'samples': 24086016, 'steps': 47042, 'batch_loss/train': 0.6388934537535533}
12/29/2021 03:45:19 - INFO - codeparrot_training - Step 47043: {'lr': 4.332707197191505e-06, 'samples': 24086528, 'steps': 47043, 'batch_loss/train': 0.6825982625596225}
12/29/2021 03:45:29 - INFO - codeparrot_training - Step 47044: {'lr': 4.329785704673767e-06, 'samples': 24087040, 'steps': 47044, 'batch_loss/train': 0.6970622779335827}
12/29/2021 03:45:40 - INFO - codeparrot_training - Step 47045: {'lr': 4.326865188848062e-06, 'samples': 24087552, 'steps': 47045, 'batch_loss/train': 0.5135391642106697}
12/29/2021 03:45:54 - INFO - codeparrot_training - Step 47046: {'lr': 4.323945649725935e-06, 'samples': 24088064, 'steps': 47046, 'batch_loss/train': 0.7133845770731568}
12/29/2021 03:46:04 - INFO - codeparrot_training - Step 47047: {'lr': 4.321027087319073e-06, 'samples': 24088576, 'steps': 47047, 'batch_loss/train': 0.7290943232364953}
12/29/2021 03:46:15 - INFO - codeparrot_training - Step 47048: {'lr': 4.318109501639023e-06, 'samples': 24089088, 'steps': 47048, 'batch_loss/train': 0.7091836594045162}
12/29/2021 03:46:27 - INFO - codeparrot_training - Step 47049: {'lr': 4.315192892697411e-06, 'samples': 24089600, 'steps': 47049, 'batch_loss/train': 0.6863123932853341}
12/29/2021 03:46:38 - INFO - codeparrot_training - Step 47050: {'lr': 4.3122772605058145e-06, 'samples': 24090112, 'steps': 47050, 'batch_loss/train': 0.7516122348606586}
12/29/2021 03:46:49 - INFO - codeparrot_training - Step 47051: {'lr': 4.3093626050758336e-06, 'samples': 24090624, 'steps': 47051, 'batch_loss/train': 0.6257529663853347}
12/29/2021 03:46:59 - INFO - codeparrot_training - Step 47052: {'lr': 4.306448926419071e-06, 'samples': 24091136, 'steps': 47052, 'batch_loss/train': 0.7238174725789577}
12/29/2021 03:47:12 - INFO - codeparrot_training - Step 47053: {'lr': 4.3035362245471e-06, 'samples': 24091648, 'steps': 47053, 'batch_loss/train': 0.7787390621379018}
12/29/2021 03:47:22 - INFO - codeparrot_training - Step 47054: {'lr': 4.300624499471495e-06, 'samples': 24092160, 'steps': 47054, 'batch_loss/train': 0.6690286248922348}
12/29/2021 03:47:33 - INFO - codeparrot_training - Step 47055: {'lr': 4.29771375120383e-06, 'samples': 24092672, 'steps': 47055, 'batch_loss/train': 0.7141740305814892}
12/29/2021 03:47:45 - INFO - codeparrot_training - Step 47056: {'lr': 4.29480397975568e-06, 'samples': 24093184, 'steps': 47056, 'batch_loss/train': 0.6474840235896409}
12/29/2021 03:47:56 - INFO - codeparrot_training - Step 47057: {'lr': 4.291895185138617e-06, 'samples': 24093696, 'steps': 47057, 'batch_loss/train': 0.6953460448421538}
12/29/2021 03:48:06 - INFO - codeparrot_training - Step 47058: {'lr': 4.288987367364189e-06, 'samples': 24094208, 'steps': 47058, 'batch_loss/train': 0.6817285167053342}
12/29/2021 03:48:20 - INFO - codeparrot_training - Step 47059: {'lr': 4.286080526443997e-06, 'samples': 24094720, 'steps': 47059, 'batch_loss/train': 0.6904769740067422}
12/29/2021 03:48:31 - INFO - codeparrot_training - Step 47060: {'lr': 4.28317466238956e-06, 'samples': 24095232, 'steps': 47060, 'batch_loss/train': 0.7156988633796573}
12/29/2021 03:48:42 - INFO - codeparrot_training - Step 47061: {'lr': 4.280269775212425e-06, 'samples': 24095744, 'steps': 47061, 'batch_loss/train': 0.6196220135316253}
12/29/2021 03:48:52 - INFO - codeparrot_training - Step 47062: {'lr': 4.277365864924193e-06, 'samples': 24096256, 'steps': 47062, 'batch_loss/train': 0.6561126867309213}
12/29/2021 03:49:04 - INFO - codeparrot_training - Step 47063: {'lr': 4.274462931536355e-06, 'samples': 24096768, 'steps': 47063, 'batch_loss/train': 0.6525152190588415}
12/29/2021 03:49:15 - INFO - codeparrot_training - Step 47064: {'lr': 4.2715609750604564e-06, 'samples': 24097280, 'steps': 47064, 'batch_loss/train': 0.6199501128867269}
12/29/2021 03:49:26 - INFO - codeparrot_training - Step 47065: {'lr': 4.268659995508045e-06, 'samples': 24097792, 'steps': 47065, 'batch_loss/train': 0.853235880844295}
12/29/2021 03:49:38 - INFO - codeparrot_training - Step 47066: {'lr': 4.2657599928906946e-06, 'samples': 24098304, 'steps': 47066, 'batch_loss/train': 0.63114316604333}
12/29/2021 03:49:48 - INFO - codeparrot_training - Step 47067: {'lr': 4.262860967219867e-06, 'samples': 24098816, 'steps': 47067, 'batch_loss/train': 0.7357070073485374}
12/29/2021 03:49:59 - INFO - codeparrot_training - Step 47068: {'lr': 4.2599629185071096e-06, 'samples': 24099328, 'steps': 47068, 'batch_loss/train': 0.7003312958404422}
12/29/2021 03:50:13 - INFO - codeparrot_training - Step 47069: {'lr': 4.257065846763969e-06, 'samples': 24099840, 'steps': 47069, 'batch_loss/train': 0.4217334470595233}
12/29/2021 03:50:24 - INFO - codeparrot_training - Step 47070: {'lr': 4.254169752001935e-06, 'samples': 24100352, 'steps': 47070, 'batch_loss/train': 0.7180052832700312}
12/29/2021 03:50:34 - INFO - codeparrot_training - Step 47071: {'lr': 4.251274634232555e-06, 'samples': 24100864, 'steps': 47071, 'batch_loss/train': 0.7111645918339491}
12/29/2021 03:50:45 - INFO - codeparrot_training - Step 47072: {'lr': 4.248380493467291e-06, 'samples': 24101376, 'steps': 47072, 'batch_loss/train': 0.6550576272420585}
12/29/2021 03:50:57 - INFO - codeparrot_training - Step 47073: {'lr': 4.2454873297176626e-06, 'samples': 24101888, 'steps': 47073, 'batch_loss/train': 0.684830428857822}
12/29/2021 03:51:08 - INFO - codeparrot_training - Step 47074: {'lr': 4.242595142995243e-06, 'samples': 24102400, 'steps': 47074, 'batch_loss/train': 0.7204367009690031}
12/29/2021 03:51:18 - INFO - codeparrot_training - Step 47075: {'lr': 4.239703933311412e-06, 'samples': 24102912, 'steps': 47075, 'batch_loss/train': 0.6278671049512923}
12/29/2021 03:51:32 - INFO - codeparrot_training - Step 47076: {'lr': 4.236813700677744e-06, 'samples': 24103424, 'steps': 47076, 'batch_loss/train': 0.6813830161700025}
12/29/2021 03:51:43 - INFO - codeparrot_training - Step 47077: {'lr': 4.233924445105703e-06, 'samples': 24103936, 'steps': 47077, 'batch_loss/train': 0.5562366964295506}
12/29/2021 03:51:54 - INFO - codeparrot_training - Step 47078: {'lr': 4.231036166606778e-06, 'samples': 24104448, 'steps': 47078, 'batch_loss/train': 0.6050170632079244}
12/29/2021 03:52:06 - INFO - codeparrot_training - Step 47079: {'lr': 4.22814886519246e-06, 'samples': 24104960, 'steps': 47079, 'batch_loss/train': 0.8252790961414576}
12/29/2021 03:52:17 - INFO - codeparrot_training - Step 47080: {'lr': 4.225262540874241e-06, 'samples': 24105472, 'steps': 47080, 'batch_loss/train': 0.7077500422019511}
12/29/2021 03:52:27 - INFO - codeparrot_training - Step 47081: {'lr': 4.2223771936635565e-06, 'samples': 24105984, 'steps': 47081, 'batch_loss/train': 0.5619826302863657}
12/29/2021 03:52:39 - INFO - codeparrot_training - Step 47082: {'lr': 4.219492823571896e-06, 'samples': 24106496, 'steps': 47082, 'batch_loss/train': 0.7627954710042104}
12/29/2021 03:52:50 - INFO - codeparrot_training - Step 47083: {'lr': 4.2166094306107236e-06, 'samples': 24107008, 'steps': 47083, 'batch_loss/train': 
0.6094183581881225} 12/29/2021 03:53:00 - INFO - codeparrot_training - Step 47084: {'lr': 4.213727014791529e-06, 'samples': 24107520, 'steps': 47084, 'batch_loss/train': 0.7423912296071649} 12/29/2021 03:53:11 - INFO - codeparrot_training - Step 47085: {'lr': 4.210845576125749e-06, 'samples': 24108032, 'steps': 47085, 'batch_loss/train': 0.5866995882533956} 12/29/2021 03:53:25 - INFO - codeparrot_training - Step 47086: {'lr': 4.207965114624818e-06, 'samples': 24108544, 'steps': 47086, 'batch_loss/train': 0.6772344553319272} 12/29/2021 03:53:36 - INFO - codeparrot_training - Step 47087: {'lr': 4.205085630300226e-06, 'samples': 24109056, 'steps': 47087, 'batch_loss/train': 0.6258214358240366} 12/29/2021 03:53:46 - INFO - codeparrot_training - Step 47088: {'lr': 4.2022071231634095e-06, 'samples': 24109568, 'steps': 47088, 'batch_loss/train': 0.6811493635177612} 12/29/2021 03:53:58 - INFO - codeparrot_training - Step 47089: {'lr': 4.199329593225804e-06, 'samples': 24110080, 'steps': 47089, 'batch_loss/train': 0.6455626896349713} 12/29/2021 03:54:09 - INFO - codeparrot_training - Step 47090: {'lr': 4.196453040498843e-06, 'samples': 24110592, 'steps': 47090, 'batch_loss/train': 0.6272309699561447} 12/29/2021 03:54:20 - INFO - codeparrot_training - Step 47091: {'lr': 4.193577464993964e-06, 'samples': 24111104, 'steps': 47091, 'batch_loss/train': 0.7425986309535801} 12/29/2021 03:54:32 - INFO - codeparrot_training - Step 47092: {'lr': 4.190702866722629e-06, 'samples': 24111616, 'steps': 47092, 'batch_loss/train': 0.7392220366746187} 12/29/2021 03:54:42 - INFO - codeparrot_training - Step 47093: {'lr': 4.187829245696245e-06, 'samples': 24112128, 'steps': 47093, 'batch_loss/train': 0.675106986425817} 12/29/2021 03:54:53 - INFO - codeparrot_training - Step 47094: {'lr': 4.1849566019262205e-06, 'samples': 24112640, 'steps': 47094, 'batch_loss/train': 0.6958549956325442} 12/29/2021 03:55:05 - INFO - codeparrot_training - Step 47095: {'lr': 4.182084935424019e-06, 'samples': 
24113152, 'steps': 47095, 'batch_loss/train': 0.7099389238283038} 12/29/2021 03:55:16 - INFO - codeparrot_training - Step 47096: {'lr': 4.179214246201019e-06, 'samples': 24113664, 'steps': 47096, 'batch_loss/train': 0.6139524129685014} 12/29/2021 03:55:26 - INFO - codeparrot_training - Step 47097: {'lr': 4.176344534268656e-06, 'samples': 24114176, 'steps': 47097, 'batch_loss/train': 0.6807351768948138} 12/29/2021 03:55:37 - INFO - codeparrot_training - Step 47098: {'lr': 4.1734757996382824e-06, 'samples': 24114688, 'steps': 47098, 'batch_loss/train': 0.6247204847168177} 12/29/2021 03:55:51 - INFO - codeparrot_training - Step 47099: {'lr': 4.1706080423213885e-06, 'samples': 24115200, 'steps': 47099, 'batch_loss/train': 0.7192319049499929} 12/29/2021 03:56:02 - INFO - codeparrot_training - Step 47100: {'lr': 4.167741262329328e-06, 'samples': 24115712, 'steps': 47100, 'batch_loss/train': 0.8610155978240073} 12/29/2021 03:56:12 - INFO - codeparrot_training - Step 47101: {'lr': 4.164875459673506e-06, 'samples': 24116224, 'steps': 47101, 'batch_loss/train': 0.5579450849909335} 12/29/2021 03:56:25 - INFO - codeparrot_training - Step 47102: {'lr': 4.1620106343653045e-06, 'samples': 24116736, 'steps': 47102, 'batch_loss/train': 0.7072551655583084} 12/29/2021 03:56:35 - INFO - codeparrot_training - Step 47103: {'lr': 4.159146786416129e-06, 'samples': 24117248, 'steps': 47103, 'batch_loss/train': 0.6131587554409634} 12/29/2021 03:56:46 - INFO - codeparrot_training - Step 47104: {'lr': 4.156283915837361e-06, 'samples': 24117760, 'steps': 47104, 'batch_loss/train': 0.4604004848806653} 12/29/2021 03:57:00 - INFO - codeparrot_training - Step 47105: {'lr': 4.153422022640407e-06, 'samples': 24118272, 'steps': 47105, 'batch_loss/train': 0.7485960652120411} 12/29/2021 03:57:11 - INFO - codeparrot_training - Step 47106: {'lr': 4.150561106836592e-06, 'samples': 24118784, 'steps': 47106, 'batch_loss/train': 0.7256812689010985} 12/29/2021 03:57:22 - INFO - codeparrot_training - Step 
47107: {'lr': 4.147701168437295e-06, 'samples': 24119296, 'steps': 47107, 'batch_loss/train': 0.6176111600943841} 12/29/2021 03:57:32 - INFO - codeparrot_training - Step 47108: {'lr': 4.144842207453953e-06, 'samples': 24119808, 'steps': 47108, 'batch_loss/train': 0.7314280662685633} 12/29/2021 03:57:44 - INFO - codeparrot_training - Step 47109: {'lr': 4.14198422389786e-06, 'samples': 24120320, 'steps': 47109, 'batch_loss/train': 1.0877408916130662} 12/29/2021 03:57:55 - INFO - codeparrot_training - Step 47110: {'lr': 4.1391272177803975e-06, 'samples': 24120832, 'steps': 47110, 'batch_loss/train': 0.5843777291011065} 12/29/2021 03:58:06 - INFO - codeparrot_training - Step 47111: {'lr': 4.136271189112972e-06, 'samples': 24121344, 'steps': 47111, 'batch_loss/train': 0.6234440100379288} 12/29/2021 03:58:18 - INFO - codeparrot_training - Step 47112: {'lr': 4.133416137906881e-06, 'samples': 24121856, 'steps': 47112, 'batch_loss/train': 0.7620376674458385} 12/29/2021 03:58:28 - INFO - codeparrot_training - Step 47113: {'lr': 4.130562064173449e-06, 'samples': 24122368, 'steps': 47113, 'batch_loss/train': 0.7799432084430009} 12/29/2021 03:58:39 - INFO - codeparrot_training - Step 47114: {'lr': 4.127708967924138e-06, 'samples': 24122880, 'steps': 47114, 'batch_loss/train': 0.6687976662069559} 12/29/2021 03:58:50 - INFO - codeparrot_training - Step 47115: {'lr': 4.124856849170161e-06, 'samples': 24123392, 'steps': 47115, 'batch_loss/train': 0.6991650925483555} 12/29/2021 03:59:04 - INFO - codeparrot_training - Step 47116: {'lr': 4.122005707922926e-06, 'samples': 24123904, 'steps': 47116, 'batch_loss/train': 0.5886185695417225} 12/29/2021 03:59:14 - INFO - codeparrot_training - Step 47117: {'lr': 4.119155544193787e-06, 'samples': 24124416, 'steps': 47117, 'batch_loss/train': 0.5736717651598155} 12/29/2021 03:59:25 - INFO - codeparrot_training - Step 47118: {'lr': 4.11630635799401e-06, 'samples': 24124928, 'steps': 47118, 'batch_loss/train': 0.6790058552287519} 12/29/2021 
03:59:37 - INFO - codeparrot_training - Step 47119: {'lr': 4.1134581493349755e-06, 'samples': 24125440, 'steps': 47119, 'batch_loss/train': 0.7113430555909872} 12/29/2021 03:59:48 - INFO - codeparrot_training - Step 47120: {'lr': 4.110610918227981e-06, 'samples': 24125952, 'steps': 47120, 'batch_loss/train': 0.7409545611590147} 12/29/2021 03:59:58 - INFO - codeparrot_training - Step 47121: {'lr': 4.107764664684349e-06, 'samples': 24126464, 'steps': 47121, 'batch_loss/train': 0.7517308602109551} 12/29/2021 04:00:10 - INFO - codeparrot_training - Step 47122: {'lr': 4.104919388715406e-06, 'samples': 24126976, 'steps': 47122, 'batch_loss/train': 0.7531667547300458} 12/29/2021 04:00:21 - INFO - codeparrot_training - Step 47123: {'lr': 4.102075090332474e-06, 'samples': 24127488, 'steps': 47123, 'batch_loss/train': 0.7258398644626141} 12/29/2021 04:00:32 - INFO - codeparrot_training - Step 47124: {'lr': 4.099231769546796e-06, 'samples': 24128000, 'steps': 47124, 'batch_loss/train': 0.6859722873196006} 12/29/2021 04:00:42 - INFO - codeparrot_training - Step 47125: {'lr': 4.096389426369751e-06, 'samples': 24128512, 'steps': 47125, 'batch_loss/train': 0.779086618218571} 12/29/2021 04:00:54 - INFO - codeparrot_training - Step 47126: {'lr': 4.093548060812635e-06, 'samples': 24129024, 'steps': 47126, 'batch_loss/train': 0.678695797920227} 12/29/2021 04:01:05 - INFO - codeparrot_training - Step 47127: {'lr': 4.09070767288669e-06, 'samples': 24129536, 'steps': 47127, 'batch_loss/train': 0.6571711023279931} 12/29/2021 04:01:16 - INFO - codeparrot_training - Step 47128: {'lr': 4.08786826260324e-06, 'samples': 24130048, 'steps': 47128, 'batch_loss/train': 0.7469643047079444} 12/29/2021 04:01:29 - INFO - codeparrot_training - Step 47129: {'lr': 4.085029829973608e-06, 'samples': 24130560, 'steps': 47129, 'batch_loss/train': 0.7287276377901435} 12/29/2021 04:01:39 - INFO - codeparrot_training - Step 47130: {'lr': 4.082192375009009e-06, 'samples': 24131072, 'steps': 47130, 
'batch_loss/train': 0.6814271840266883} 12/29/2021 04:01:50 - INFO - codeparrot_training - Step 47131: {'lr': 4.079355897720766e-06, 'samples': 24131584, 'steps': 47131, 'batch_loss/train': 0.6423871541628614} 12/29/2021 04:02:02 - INFO - codeparrot_training - Step 47132: {'lr': 4.076520398120149e-06, 'samples': 24132096, 'steps': 47132, 'batch_loss/train': 0.7361748018302023} 12/29/2021 04:02:13 - INFO - codeparrot_training - Step 47133: {'lr': 4.073685876218425e-06, 'samples': 24132608, 'steps': 47133, 'batch_loss/train': 0.5749157338868827} 12/29/2021 04:02:23 - INFO - codeparrot_training - Step 47134: {'lr': 4.070852332026864e-06, 'samples': 24133120, 'steps': 47134, 'batch_loss/train': 0.6460206578485668} 12/29/2021 04:02:36 - INFO - codeparrot_training - Step 47135: {'lr': 4.0680197655567356e-06, 'samples': 24133632, 'steps': 47135, 'batch_loss/train': 0.7068421457079239} 12/29/2021 04:02:47 - INFO - codeparrot_training - Step 47136: {'lr': 4.065188176819307e-06, 'samples': 24134144, 'steps': 47136, 'batch_loss/train': 0.6271029440686107} 12/29/2021 04:02:57 - INFO - codeparrot_training - Step 47137: {'lr': 4.062357565825847e-06, 'samples': 24134656, 'steps': 47137, 'batch_loss/train': 0.6559476270340383} 12/29/2021 04:03:08 - INFO - codeparrot_training - Step 47138: {'lr': 4.059527932587598e-06, 'samples': 24135168, 'steps': 47138, 'batch_loss/train': 0.6397389628691599} 12/29/2021 04:03:20 - INFO - codeparrot_training - Step 47139: {'lr': 4.056699277115744e-06, 'samples': 24135680, 'steps': 47139, 'batch_loss/train': 0.69045653619105} 12/29/2021 04:03:31 - INFO - codeparrot_training - Step 47140: {'lr': 4.053871599421638e-06, 'samples': 24136192, 'steps': 47140, 'batch_loss/train': 0.6034603648004122} 12/29/2021 04:03:41 - INFO - codeparrot_training - Step 47141: {'lr': 4.051044899516465e-06, 'samples': 24136704, 'steps': 47141, 'batch_loss/train': 0.6296619644854218} 12/29/2021 04:03:54 - INFO - codeparrot_training - Step 47142: {'lr': 
4.048219177411439e-06, 'samples': 24137216, 'steps': 47142, 'batch_loss/train': 0.6958554275333881} 12/29/2021 04:04:04 - INFO - codeparrot_training - Step 47143: {'lr': 4.045394433117882e-06, 'samples': 24137728, 'steps': 47143, 'batch_loss/train': 0.7627163222059608} 12/29/2021 04:04:15 - INFO - codeparrot_training - Step 47144: {'lr': 4.0425706666469275e-06, 'samples': 24138240, 'steps': 47144, 'batch_loss/train': 0.5821665039984509} 12/29/2021 04:04:28 - INFO - codeparrot_training - Step 47145: {'lr': 4.0397478780098415e-06, 'samples': 24138752, 'steps': 47145, 'batch_loss/train': 0.7288061415310949} 12/29/2021 04:04:38 - INFO - codeparrot_training - Step 47146: {'lr': 4.036926067217866e-06, 'samples': 24139264, 'steps': 47146, 'batch_loss/train': 0.749957786872983} 12/29/2021 04:04:49 - INFO - codeparrot_training - Step 47147: {'lr': 4.0341052342821875e-06, 'samples': 24139776, 'steps': 47147, 'batch_loss/train': 0.7670976901426911} 12/29/2021 04:05:01 - INFO - codeparrot_training - Step 47148: {'lr': 4.031285379214045e-06, 'samples': 24140288, 'steps': 47148, 'batch_loss/train': 0.8000395749695599} 12/29/2021 04:05:12 - INFO - codeparrot_training - Step 47149: {'lr': 4.028466502024624e-06, 'samples': 24140800, 'steps': 47149, 'batch_loss/train': 0.8061518063768744} 12/29/2021 04:05:22 - INFO - codeparrot_training - Step 47150: {'lr': 4.025648602725141e-06, 'samples': 24141312, 'steps': 47150, 'batch_loss/train': 0.6361829284578562} 12/29/2021 04:05:33 - INFO - codeparrot_training - Step 47151: {'lr': 4.0228316813268055e-06, 'samples': 24141824, 'steps': 47151, 'batch_loss/train': 0.7089717881754041} 12/29/2021 04:05:46 - INFO - codeparrot_training - Step 47152: {'lr': 4.02001573784086e-06, 'samples': 24142336, 'steps': 47152, 'batch_loss/train': 0.7655357262119651} 12/29/2021 04:05:56 - INFO - codeparrot_training - Step 47153: {'lr': 4.017200772278406e-06, 'samples': 24142848, 'steps': 47153, 'batch_loss/train': 0.5586309708014596} 12/29/2021 04:06:07 - INFO 
- codeparrot_training - Step 47154: {'lr': 4.014386784650659e-06, 'samples': 24143360, 'steps': 47154, 'batch_loss/train': 0.6151571707159746} 12/29/2021 04:06:19 - INFO - codeparrot_training - Step 47155: {'lr': 4.011573774968885e-06, 'samples': 24143872, 'steps': 47155, 'batch_loss/train': 0.5777645230700728} 12/29/2021 04:06:30 - INFO - codeparrot_training - Step 47156: {'lr': 4.008761743244188e-06, 'samples': 24144384, 'steps': 47156, 'batch_loss/train': 0.6581726067815907} 12/29/2021 04:06:40 - INFO - codeparrot_training - Step 47157: {'lr': 4.0059506894877805e-06, 'samples': 24144896, 'steps': 47157, 'batch_loss/train': 0.7501810351386666} 12/29/2021 04:06:52 - INFO - codeparrot_training - Step 47158: {'lr': 4.00314061371082e-06, 'samples': 24145408, 'steps': 47158, 'batch_loss/train': 0.6569695714861155} 12/29/2021 04:07:03 - INFO - codeparrot_training - Step 47159: {'lr': 4.000331515924466e-06, 'samples': 24145920, 'steps': 47159, 'batch_loss/train': 0.6253840219287667} 12/29/2021 04:07:14 - INFO - codeparrot_training - Step 47160: {'lr': 3.997523396139929e-06, 'samples': 24146432, 'steps': 47160, 'batch_loss/train': 0.7146572545170784} 12/29/2021 04:07:24 - INFO - codeparrot_training - Step 47161: {'lr': 3.994716254368369e-06, 'samples': 24146944, 'steps': 47161, 'batch_loss/train': 0.7149890176951885} 12/29/2021 04:07:37 - INFO - codeparrot_training - Step 47162: {'lr': 3.991910090620915e-06, 'samples': 24147456, 'steps': 47162, 'batch_loss/train': 0.7829069432336837} 12/29/2021 04:07:47 - INFO - codeparrot_training - Step 47163: {'lr': 3.9891049049087245e-06, 'samples': 24147968, 'steps': 47163, 'batch_loss/train': 0.44720564933959395} 12/29/2021 04:07:58 - INFO - codeparrot_training - Step 47164: {'lr': 3.986300697242984e-06, 'samples': 24148480, 'steps': 47164, 'batch_loss/train': 0.7252958118915558} 12/29/2021 04:08:11 - INFO - codeparrot_training - Step 47165: {'lr': 3.983497467634795e-06, 'samples': 24148992, 'steps': 47165, 'batch_loss/train': 
0.7007385930046439} 12/29/2021 04:08:22 - INFO - codeparrot_training - Step 47166: {'lr': 3.98069521609537e-06, 'samples': 24149504, 'steps': 47166, 'batch_loss/train': 0.5367925565806217} 12/29/2021 04:08:32 - INFO - codeparrot_training - Step 47167: {'lr': 3.977893942635785e-06, 'samples': 24150016, 'steps': 47167, 'batch_loss/train': 0.7038229294121265} 12/29/2021 04:08:44 - INFO - codeparrot_training - Step 47168: {'lr': 3.975093647267169e-06, 'samples': 24150528, 'steps': 47168, 'batch_loss/train': 0.6889733271673322} 12/29/2021 04:08:55 - INFO - codeparrot_training - Step 47169: {'lr': 3.9722943300007365e-06, 'samples': 24151040, 'steps': 47169, 'batch_loss/train': 0.7763061774894595} 12/29/2021 04:09:06 - INFO - codeparrot_training - Step 47170: {'lr': 3.969495990847533e-06, 'samples': 24151552, 'steps': 47170, 'batch_loss/train': 0.5884782264765818} 12/29/2021 04:09:16 - INFO - codeparrot_training - Step 47171: {'lr': 3.9666986298187154e-06, 'samples': 24152064, 'steps': 47171, 'batch_loss/train': 0.7044702200219035} 12/29/2021 04:09:29 - INFO - codeparrot_training - Step 47172: {'lr': 3.963902246925416e-06, 'samples': 24152576, 'steps': 47172, 'batch_loss/train': 0.7634251089766622} 12/29/2021 04:09:39 - INFO - codeparrot_training - Step 47173: {'lr': 3.961106842178736e-06, 'samples': 24153088, 'steps': 47173, 'batch_loss/train': 0.670781871303916} 12/29/2021 04:09:50 - INFO - codeparrot_training - Step 47174: {'lr': 3.958312415589804e-06, 'samples': 24153600, 'steps': 47174, 'batch_loss/train': 0.7621516045182943} 12/29/2021 04:10:03 - INFO - codeparrot_training - Step 47175: {'lr': 3.955518967169725e-06, 'samples': 24154112, 'steps': 47175, 'batch_loss/train': 0.6216469770297408} 12/29/2021 04:10:13 - INFO - codeparrot_training - Step 47176: {'lr': 3.952726496929571e-06, 'samples': 24154624, 'steps': 47176, 'batch_loss/train': 0.7255449220538139} 12/29/2021 04:10:24 - INFO - codeparrot_training - Step 47177: {'lr': 3.9499350048805005e-06, 'samples': 
24155136, 'steps': 47177, 'batch_loss/train': 0.63723874615971} 12/29/2021 04:10:36 - INFO - codeparrot_training - Step 47178: {'lr': 3.947144491033589e-06, 'samples': 24155648, 'steps': 47178, 'batch_loss/train': 0.4140433834400028} 12/29/2021 04:10:47 - INFO - codeparrot_training - Step 47179: {'lr': 3.944354955399882e-06, 'samples': 24156160, 'steps': 47179, 'batch_loss/train': 0.7197074126452208} 12/29/2021 04:10:58 - INFO - codeparrot_training - Step 47180: {'lr': 3.941566397990537e-06, 'samples': 24156672, 'steps': 47180, 'batch_loss/train': 0.7077867463231087} 12/29/2021 04:11:08 - INFO - codeparrot_training - Step 47181: {'lr': 3.938778818816657e-06, 'samples': 24157184, 'steps': 47181, 'batch_loss/train': 0.5895962384529412} 12/29/2021 04:11:21 - INFO - codeparrot_training - Step 47182: {'lr': 3.935992217889234e-06, 'samples': 24157696, 'steps': 47182, 'batch_loss/train': 0.7891700994223356} 12/29/2021 04:11:32 - INFO - codeparrot_training - Step 47183: {'lr': 3.933206595219396e-06, 'samples': 24158208, 'steps': 47183, 'batch_loss/train': 0.7467448376119137} 12/29/2021 04:11:42 - INFO - codeparrot_training - Step 47184: {'lr': 3.930421950818247e-06, 'samples': 24158720, 'steps': 47184, 'batch_loss/train': 0.7053601071238518} 12/29/2021 04:11:55 - INFO - codeparrot_training - Step 47185: {'lr': 3.927638284696805e-06, 'samples': 24159232, 'steps': 47185, 'batch_loss/train': 0.6215266624931246} 12/29/2021 04:12:05 - INFO - codeparrot_training - Step 47186: {'lr': 3.924855596866145e-06, 'samples': 24159744, 'steps': 47186, 'batch_loss/train': 0.5711812344379723} 12/29/2021 04:12:16 - INFO - codeparrot_training - Step 47187: {'lr': 3.922073887337368e-06, 'samples': 24160256, 'steps': 47187, 'batch_loss/train': 0.7379737896844745} 12/29/2021 04:12:28 - INFO - codeparrot_training - Step 47188: {'lr': 3.919293156121523e-06, 'samples': 24160768, 'steps': 47188, 'batch_loss/train': 0.7831715629436076} 12/29/2021 04:12:39 - INFO - codeparrot_training - Step 47189: 
{'lr': 3.9165134032296265e-06, 'samples': 24161280, 'steps': 47189, 'batch_loss/train': 0.673020385322161} 12/29/2021 04:12:49 - INFO - codeparrot_training - Step 47190: {'lr': 3.913734628672782e-06, 'samples': 24161792, 'steps': 47190, 'batch_loss/train': 0.8260582545772195} 12/29/2021 04:13:02 - INFO - codeparrot_training - Step 47191: {'lr': 3.9109568324619536e-06, 'samples': 24162304, 'steps': 47191, 'batch_loss/train': 0.7316725673153996} 12/29/2021 04:13:12 - INFO - codeparrot_training - Step 47192: {'lr': 3.908180014608298e-06, 'samples': 24162816, 'steps': 47192, 'batch_loss/train': 0.5094256704906002} 12/29/2021 04:13:23 - INFO - codeparrot_training - Step 47193: {'lr': 3.905404175122779e-06, 'samples': 24163328, 'steps': 47193, 'batch_loss/train': 0.43209989252500236} 12/29/2021 04:13:34 - INFO - codeparrot_training - Step 47194: {'lr': 3.9026293140164426e-06, 'samples': 24163840, 'steps': 47194, 'batch_loss/train': 0.7553963102400303} 12/29/2021 04:13:46 - INFO - codeparrot_training - Step 47195: {'lr': 3.899855431300364e-06, 'samples': 24164352, 'steps': 47195, 'batch_loss/train': 0.6443294589407742} 12/29/2021 04:13:57 - INFO - codeparrot_training - Step 47196: {'lr': 3.8970825269855074e-06, 'samples': 24164864, 'steps': 47196, 'batch_loss/train': 0.5767926769331098} 12/29/2021 04:14:08 - INFO - codeparrot_training - Step 47197: {'lr': 3.894310601082918e-06, 'samples': 24165376, 'steps': 47197, 'batch_loss/train': 0.7491289428435266} 12/29/2021 04:14:20 - INFO - codeparrot_training - Step 47198: {'lr': 3.891539653603643e-06, 'samples': 24165888, 'steps': 47198, 'batch_loss/train': 0.7412739004939795} 12/29/2021 04:14:30 - INFO - codeparrot_training - Step 47199: {'lr': 3.888769684558674e-06, 'samples': 24166400, 'steps': 47199, 'batch_loss/train': 0.7844371376559138} 12/29/2021 04:14:41 - INFO - codeparrot_training - Step 47200: {'lr': 3.88600069395903e-06, 'samples': 24166912, 'steps': 47200, 'batch_loss/train': 0.5812663870747201} 12/29/2021 04:14:53 
- INFO - codeparrot_training - Step 47201: {'lr': 3.883232681815729e-06, 'samples': 24167424, 'steps': 47201, 'batch_loss/train': 0.6843080339021981} 12/29/2021 04:15:04 - INFO - codeparrot_training - Step 47202: {'lr': 3.880465648139736e-06, 'samples': 24167936, 'steps': 47202, 'batch_loss/train': 0.570095396891702} 12/29/2021 04:15:14 - INFO - codeparrot_training - Step 47203: {'lr': 3.877699592942069e-06, 'samples': 24168448, 'steps': 47203, 'batch_loss/train': 0.6987306722439826} 12/29/2021 04:15:25 - INFO - codeparrot_training - Step 47204: {'lr': 3.8749345162338025e-06, 'samples': 24168960, 'steps': 47204, 'batch_loss/train': 0.6873667016625404} 12/29/2021 04:15:38 - INFO - codeparrot_training - Step 47205: {'lr': 3.872170418025789e-06, 'samples': 24169472, 'steps': 47205, 'batch_loss/train': 0.6869712881743908} 12/29/2021 04:15:48 - INFO - codeparrot_training - Step 47206: {'lr': 3.8694072983291305e-06, 'samples': 24169984, 'steps': 47206, 'batch_loss/train': 0.6928585779387504} 12/29/2021 04:15:59 - INFO - codeparrot_training - Step 47207: {'lr': 3.866645157154791e-06, 'samples': 24170496, 'steps': 47207, 'batch_loss/train': 0.7179375817067921} 12/29/2021 04:16:11 - INFO - codeparrot_training - Step 47208: {'lr': 3.863883994513706e-06, 'samples': 24171008, 'steps': 47208, 'batch_loss/train': 0.7155772187979892} 12/29/2021 04:16:22 - INFO - codeparrot_training - Step 47209: {'lr': 3.861123810416866e-06, 'samples': 24171520, 'steps': 47209, 'batch_loss/train': 0.6743462951853871} 12/29/2021 04:16:32 - INFO - codeparrot_training - Step 47210: {'lr': 3.858364604875292e-06, 'samples': 24172032, 'steps': 47210, 'batch_loss/train': 0.5889667025767267} 12/29/2021 04:16:45 - INFO - codeparrot_training - Step 47211: {'lr': 3.855606377899917e-06, 'samples': 24172544, 'steps': 47211, 'batch_loss/train': 0.5640150926192291} 12/29/2021 04:16:55 - INFO - codeparrot_training - Step 47212: {'lr': 3.8528491295016786e-06, 'samples': 24173056, 'steps': 47212, 
'batch_loss/train': 0.6636685254052281} 12/29/2021 04:17:06 - INFO - codeparrot_training - Step 47213: {'lr': 3.850092859691595e-06, 'samples': 24173568, 'steps': 47213, 'batch_loss/train': 0.6463785246014595} 12/29/2021 04:17:18 - INFO - codeparrot_training - Step 47214: {'lr': 3.847337568480603e-06, 'samples': 24174080, 'steps': 47214, 'batch_loss/train': 0.7530632107518613} 12/29/2021 04:17:29 - INFO - codeparrot_training - Step 47215: {'lr': 3.844583255879636e-06, 'samples': 24174592, 'steps': 47215, 'batch_loss/train': 0.7119183982722461} 12/29/2021 04:17:40 - INFO - codeparrot_training - Step 47216: {'lr': 3.841829921899687e-06, 'samples': 24175104, 'steps': 47216, 'batch_loss/train': 0.7352344272658229} 12/29/2021 04:17:50 - INFO - codeparrot_training - Step 47217: {'lr': 3.839077566551663e-06, 'samples': 24175616, 'steps': 47217, 'batch_loss/train': 0.6709822658449411} 12/29/2021 04:18:03 - INFO - codeparrot_training - Step 47218: {'lr': 3.836326189846528e-06, 'samples': 24176128, 'steps': 47218, 'batch_loss/train': 0.6288231869693846} 12/29/2021 04:18:13 - INFO - codeparrot_training - Step 47219: {'lr': 3.833575791795218e-06, 'samples': 24176640, 'steps': 47219, 'batch_loss/train': 0.7060734434053302} 12/29/2021 04:18:24 - INFO - codeparrot_training - Step 47220: {'lr': 3.830826372408613e-06, 'samples': 24177152, 'steps': 47220, 'batch_loss/train': 0.6516873701475561} 12/29/2021 04:18:37 - INFO - codeparrot_training - Step 47221: {'lr': 3.82807793169776e-06, 'samples': 24177664, 'steps': 47221, 'batch_loss/train': 0.6230572260683402} 12/29/2021 04:18:47 - INFO - codeparrot_training - Step 47222: {'lr': 3.825330469673483e-06, 'samples': 24178176, 'steps': 47222, 'batch_loss/train': 0.560643781645922} 12/29/2021 04:18:58 - INFO - codeparrot_training - Step 47223: {'lr': 3.8225839863467446e-06, 'samples': 24178688, 'steps': 47223, 'batch_loss/train': 0.7661683949409053} 12/29/2021 04:19:10 - INFO - codeparrot_training - Step 47224: {'lr': 
3.819838481728455e-06, 'samples': 24179200, 'steps': 47224, 'batch_loss/train': 0.7043728902935982} 12/29/2021 04:19:21 - INFO - codeparrot_training - Step 47225: {'lr': 3.81709395582952e-06, 'samples': 24179712, 'steps': 47225, 'batch_loss/train': 0.5233654241310433} 12/29/2021 04:19:31 - INFO - codeparrot_training - Step 47226: {'lr': 3.814350408660877e-06, 'samples': 24180224, 'steps': 47226, 'batch_loss/train': 0.7633447744883597} 12/29/2021 04:19:42 - INFO - codeparrot_training - Step 47227: {'lr': 3.811607840233461e-06, 'samples': 24180736, 'steps': 47227, 'batch_loss/train': 0.6265803137794137} 12/29/2021 04:19:54 - INFO - codeparrot_training - Step 47228: {'lr': 3.8088662505580683e-06, 'samples': 24181248, 'steps': 47228, 'batch_loss/train': 0.6010170562658459} 12/29/2021 04:20:05 - INFO - codeparrot_training - Step 47229: {'lr': 3.8061256396456913e-06, 'samples': 24181760, 'steps': 47229, 'batch_loss/train': 0.719105651602149} 12/29/2021 04:20:15 - INFO - codeparrot_training - Step 47230: {'lr': 3.803386007507237e-06, 'samples': 24182272, 'steps': 47230, 'batch_loss/train': 0.6331476159393787} 12/29/2021 04:20:28 - INFO - codeparrot_training - Step 47231: {'lr': 3.8006473541535024e-06, 'samples': 24182784, 'steps': 47231, 'batch_loss/train': 0.6616743094054982} 12/29/2021 04:20:38 - INFO - codeparrot_training - Step 47232: {'lr': 3.7979096795954516e-06, 'samples': 24183296, 'steps': 47232, 'batch_loss/train': 0.44032945472281426} 12/29/2021 04:20:49 - INFO - codeparrot_training - Step 47233: {'lr': 3.795172983843964e-06, 'samples': 24183808, 'steps': 47233, 'batch_loss/train': 0.7539029037579894} 12/29/2021 04:21:02 - INFO - codeparrot_training - Step 47234: {'lr': 3.7924372669098927e-06, 'samples': 24184320, 'steps': 47234, 'batch_loss/train': 0.5896135347138625} 12/29/2021 04:21:12 - INFO - codeparrot_training - Step 47235: {'lr': 3.7897025288041178e-06, 'samples': 24184832, 'steps': 47235, 'batch_loss/train': 0.6324143279343843} 12/29/2021 04:21:23 - 
INFO - codeparrot_training - Step 47236: {'lr': 3.7869687695375466e-06, 'samples': 24185344, 'steps': 47236, 'batch_loss/train': 0.69212706387043} 12/29/2021 04:21:35 - INFO - codeparrot_training - Step 47237: {'lr': 3.7842359891210043e-06, 'samples': 24185856, 'steps': 47237, 'batch_loss/train': 0.7784952083602548} 12/29/2021 04:21:46 - INFO - codeparrot_training - Step 47238: {'lr': 3.781504187565371e-06, 'samples': 24186368, 'steps': 47238, 'batch_loss/train': 0.6840322739444673} 12/29/2021 04:21:56 - INFO - codeparrot_training - Step 47239: {'lr': 3.7787733648814992e-06, 'samples': 24186880, 'steps': 47239, 'batch_loss/train': 0.6866912795230746} 12/29/2021 04:22:07 - INFO - codeparrot_training - Step 47240: {'lr': 3.7760435210802692e-06, 'samples': 24187392, 'steps': 47240, 'batch_loss/train': 0.6998932752758265} 12/29/2021 04:22:19 - INFO - codeparrot_training - Step 47241: {'lr': 3.7733146561725327e-06, 'samples': 24187904, 'steps': 47241, 'batch_loss/train': 0.7266497807577252} 12/29/2021 04:22:30 - INFO - codeparrot_training - Step 47242: {'lr': 3.7705867701691152e-06, 'samples': 24188416, 'steps': 47242, 'batch_loss/train': 0.5685857653152198} 12/29/2021 04:22:40 - INFO - codeparrot_training - Step 47243: {'lr': 3.7678598630808414e-06, 'samples': 24188928, 'steps': 47243, 'batch_loss/train': 0.6655755324754864} 12/29/2021 04:22:53 - INFO - codeparrot_training - Step 47244: {'lr': 3.7651339349186185e-06, 'samples': 24189440, 'steps': 47244, 'batch_loss/train': 0.7801166735589504} 12/29/2021 04:23:04 - INFO - codeparrot_training - Step 47245: {'lr': 3.762408985693272e-06, 'samples': 24189952, 'steps': 47245, 'batch_loss/train': 0.7430111337453127} 12/29/2021 04:23:15 - INFO - codeparrot_training - Step 47246: {'lr': 3.7596850154155427e-06, 'samples': 24190464, 'steps': 47246, 'batch_loss/train': 0.6942439517588355} 12/29/2021 04:23:27 - INFO - codeparrot_training - Step 47247: {'lr': 3.756962024096394e-06, 'samples': 24190976, 'steps': 47247, 
'batch_loss/train': 0.7012124609900638} 12/29/2021 04:23:37 - INFO - codeparrot_training - Step 47248: {'lr': 3.75424001174654e-06, 'samples': 24191488, 'steps': 47248, 'batch_loss/train': 0.6186296441592276} 12/29/2021 04:23:48 - INFO - codeparrot_training - Step 47249: {'lr': 3.751518978376889e-06, 'samples': 24192000, 'steps': 47249, 'batch_loss/train': 0.6885399012826383} 12/29/2021 04:23:59 - INFO - codeparrot_training - Step 47250: {'lr': 3.7487989239981813e-06, 'samples': 24192512, 'steps': 47250, 'batch_loss/train': 0.6139397881925106} 12/29/2021 04:24:11 - INFO - codeparrot_training - Step 47251: {'lr': 3.74607984862127e-06, 'samples': 24193024, 'steps': 47251, 'batch_loss/train': 0.7142404166515917} 12/29/2021 04:24:22 - INFO - codeparrot_training - Step 47252: {'lr': 3.74336175225698e-06, 'samples': 24193536, 'steps': 47252, 'batch_loss/train': 0.7445079816970974} 12/29/2021 04:24:33 - INFO - codeparrot_training - Step 47253: {'lr': 3.7406446349161074e-06, 'samples': 24194048, 'steps': 47253, 'batch_loss/train': 0.6977691655047238} 12/29/2021 04:24:45 - INFO - codeparrot_training - Step 47254: {'lr': 3.737928496609422e-06, 'samples': 24194560, 'steps': 47254, 'batch_loss/train': 0.6591089325957} 12/29/2021 04:24:56 - INFO - codeparrot_training - Step 47255: {'lr': 3.735213337347776e-06, 'samples': 24195072, 'steps': 47255, 'batch_loss/train': 0.6774484000634402} 12/29/2021 04:25:06 - INFO - codeparrot_training - Step 47256: {'lr': 3.7324991571419386e-06, 'samples': 24195584, 'steps': 47256, 'batch_loss/train': 0.6494415230117738} 12/29/2021 04:25:18 - INFO - codeparrot_training - Step 47257: {'lr': 3.7297859560026793e-06, 'samples': 24196096, 'steps': 47257, 'batch_loss/train': 0.7633722499012947} 12/29/2021 04:25:29 - INFO - codeparrot_training - Step 47258: {'lr': 3.7270737339407946e-06, 'samples': 24196608, 'steps': 47258, 'batch_loss/train': 0.7206219085492194} 12/29/2021 04:25:40 - INFO - codeparrot_training - Step 47259: {'lr': 
3.724362490967109e-06, 'samples': 24197120, 'steps': 47259, 'batch_loss/train': 0.576142790960148} 12/29/2021 04:25:52 - INFO - codeparrot_training - Step 47260: {'lr': 3.721652227092337e-06, 'samples': 24197632, 'steps': 47260, 'batch_loss/train': 0.6674540415406227} 12/29/2021 04:26:03 - INFO - codeparrot_training - Step 47261: {'lr': 3.7189429423272747e-06, 'samples': 24198144, 'steps': 47261, 'batch_loss/train': 0.7688306644558907} 12/29/2021 04:26:14 - INFO - codeparrot_training - Step 47262: {'lr': 3.716234636682747e-06, 'samples': 24198656, 'steps': 47262, 'batch_loss/train': 0.6928766835480928} 12/29/2021 04:26:24 - INFO - codeparrot_training - Step 47263: {'lr': 3.71352731016944e-06, 'samples': 24199168, 'steps': 47263, 'batch_loss/train': 0.7067977776750922} 12/29/2021 04:26:37 - INFO - codeparrot_training - Step 47264: {'lr': 3.7108209627981783e-06, 'samples': 24199680, 'steps': 47264, 'batch_loss/train': 0.7325578103773296} 12/29/2021 04:26:47 - INFO - codeparrot_training - Step 47265: {'lr': 3.708115594579675e-06, 'samples': 24200192, 'steps': 47265, 'batch_loss/train': 0.6213197708129883} 12/29/2021 04:26:58 - INFO - codeparrot_training - Step 47266: {'lr': 3.705411205524728e-06, 'samples': 24200704, 'steps': 47266, 'batch_loss/train': 0.7081809595692903} 12/29/2021 04:27:10 - INFO - codeparrot_training - Step 47267: {'lr': 3.70270779564405e-06, 'samples': 24201216, 'steps': 47267, 'batch_loss/train': 0.6970865987241268} 12/29/2021 04:27:21 - INFO - codeparrot_training - Step 47268: {'lr': 3.700005364948411e-06, 'samples': 24201728, 'steps': 47268, 'batch_loss/train': 0.7194316051900387} 12/29/2021 04:27:31 - INFO - codeparrot_training - Step 47269: {'lr': 3.697303913448552e-06, 'samples': 24202240, 'steps': 47269, 'batch_loss/train': 0.7526983125135303} 12/29/2021 04:27:43 - INFO - codeparrot_training - Step 47270: {'lr': 3.694603441155242e-06, 'samples': 24202752, 'steps': 47270, 'batch_loss/train': 0.6368338298052549} 12/29/2021 04:27:54 - INFO - 
codeparrot_training - Step 47271: {'lr': 3.691903948079167e-06, 'samples': 24203264, 'steps': 47271, 'batch_loss/train': 0.7296052994206548} 12/29/2021 04:28:05 - INFO - codeparrot_training - Step 47272: {'lr': 3.6892054342310687e-06, 'samples': 24203776, 'steps': 47272, 'batch_loss/train': 0.7155123390257359} 12/29/2021 04:28:17 - INFO - codeparrot_training - Step 47273: {'lr': 3.6865078996216885e-06, 'samples': 24204288, 'steps': 47273, 'batch_loss/train': 0.7327886531129479} 12/29/2021 04:28:28 - INFO - codeparrot_training - Step 47274: {'lr': 3.683811344261767e-06, 'samples': 24204800, 'steps': 47274, 'batch_loss/train': 0.7793607776984572} 12/29/2021 04:28:39 - INFO - codeparrot_training - Step 47275: {'lr': 3.681115768161991e-06, 'samples': 24205312, 'steps': 47275, 'batch_loss/train': 0.5983374619390815} 12/29/2021 04:28:49 - INFO - codeparrot_training - Step 47276: {'lr': 3.6784211713331017e-06, 'samples': 24205824, 'steps': 47276, 'batch_loss/train': 0.6710913763381541} 12/29/2021 04:29:01 - INFO - codeparrot_training - Step 47277: {'lr': 3.675727553785785e-06, 'samples': 24206336, 'steps': 47277, 'batch_loss/train': 0.735424904152751} 12/29/2021 04:29:12 - INFO - codeparrot_training - Step 47278: {'lr': 3.6730349155307817e-06, 'samples': 24206848, 'steps': 47278, 'batch_loss/train': 0.7096242355182767} 12/29/2021 04:29:23 - INFO - codeparrot_training - Step 47279: {'lr': 3.6703432565787785e-06, 'samples': 24207360, 'steps': 47279, 'batch_loss/train': 0.6883650552481413} 12/29/2021 04:29:35 - INFO - codeparrot_training - Step 47280: {'lr': 3.667652576940461e-06, 'samples': 24207872, 'steps': 47280, 'batch_loss/train': 0.7323253443464637} 12/29/2021 04:29:46 - INFO - codeparrot_training - Step 47281: {'lr': 3.6649628766265707e-06, 'samples': 24208384, 'steps': 47281, 'batch_loss/train': 0.7045056140050292} 12/29/2021 04:29:57 - INFO - codeparrot_training - Step 47282: {'lr': 3.6622741556477935e-06, 'samples': 24208896, 'steps': 47282, 'batch_loss/train': 
0.6070936750620604} 12/29/2021 04:30:09 - INFO - codeparrot_training - Step 47283: {'lr': 3.6595864140147593e-06, 'samples': 24209408, 'steps': 47283, 'batch_loss/train': 0.742816879414022} 12/29/2021 04:30:19 - INFO - codeparrot_training - Step 47284: {'lr': 3.65689965173821e-06, 'samples': 24209920, 'steps': 47284, 'batch_loss/train': 0.6870803850470111} 12/29/2021 04:30:30 - INFO - codeparrot_training - Step 47285: {'lr': 3.654213868828832e-06, 'samples': 24210432, 'steps': 47285, 'batch_loss/train': 0.7164185456931591} 12/29/2021 04:30:42 - INFO - codeparrot_training - Step 47286: {'lr': 3.6515290652972543e-06, 'samples': 24210944, 'steps': 47286, 'batch_loss/train': 0.6654675554018468} 12/29/2021 04:30:53 - INFO - codeparrot_training - Step 47287: {'lr': 3.648845241154164e-06, 'samples': 24211456, 'steps': 47287, 'batch_loss/train': 0.6982426450704224} 12/29/2021 04:31:04 - INFO - codeparrot_training - Step 47288: {'lr': 3.646162396410302e-06, 'samples': 24211968, 'steps': 47288, 'batch_loss/train': 0.6567234382964671} 12/29/2021 04:31:14 - INFO - codeparrot_training - Step 47289: {'lr': 3.6434805310762433e-06, 'samples': 24212480, 'steps': 47289, 'batch_loss/train': 0.7421408388763666} 12/29/2021 04:31:27 - INFO - codeparrot_training - Step 47290: {'lr': 3.6407996451626733e-06, 'samples': 24212992, 'steps': 47290, 'batch_loss/train': 0.6724711034912616} 12/29/2021 04:31:38 - INFO - codeparrot_training - Step 47291: {'lr': 3.638119738680279e-06, 'samples': 24213504, 'steps': 47291, 'batch_loss/train': 0.6820691018365324} 12/29/2021 04:31:48 - INFO - codeparrot_training - Step 47292: {'lr': 3.6354408116397177e-06, 'samples': 24214016, 'steps': 47292, 'batch_loss/train': 0.6753245261497796} 12/29/2021 04:32:01 - INFO - codeparrot_training - Step 47293: {'lr': 3.632762864051592e-06, 'samples': 24214528, 'steps': 47293, 'batch_loss/train': 0.711485349573195} 12/29/2021 04:32:11 - INFO - codeparrot_training - Step 47294: {'lr': 3.630085895926588e-06, 'samples': 
24215040, 'steps': 47294, 'batch_loss/train': 0.783533972222358} 12/29/2021 04:32:22 - INFO - codeparrot_training - Step 47295: {'lr': 3.6274099072753364e-06, 'samples': 24215552, 'steps': 47295, 'batch_loss/train': 0.6810733596794307} 12/29/2021 04:32:34 - INFO - codeparrot_training - Step 47296: {'lr': 3.624734898108467e-06, 'samples': 24216064, 'steps': 47296, 'batch_loss/train': 0.6544530743267387} 12/29/2021 04:32:45 - INFO - codeparrot_training - Step 47297: {'lr': 3.6220608684366387e-06, 'samples': 24216576, 'steps': 47297, 'batch_loss/train': 0.6516347592696548} 12/29/2021 04:32:55 - INFO - codeparrot_training - Step 47298: {'lr': 3.619387818270453e-06, 'samples': 24217088, 'steps': 47298, 'batch_loss/train': 0.7010706514120102} 12/29/2021 04:33:06 - INFO - codeparrot_training - Step 47299: {'lr': 3.6167157476205415e-06, 'samples': 24217600, 'steps': 47299, 'batch_loss/train': 0.7353264163248241} 12/29/2021 04:33:19 - INFO - codeparrot_training - Step 47300: {'lr': 3.6140446564975893e-06, 'samples': 24218112, 'steps': 47300, 'batch_loss/train': 0.6422019025776535} 12/29/2021 04:33:29 - INFO - codeparrot_training - Step 47301: {'lr': 3.611374544912116e-06, 'samples': 24218624, 'steps': 47301, 'batch_loss/train': 0.7129536317661405} 12/29/2021 04:33:40 - INFO - codeparrot_training - Step 47302: {'lr': 3.6087054128748076e-06, 'samples': 24219136, 'steps': 47302, 'batch_loss/train': 0.7072454784065485} 12/29/2021 04:33:52 - INFO - codeparrot_training - Step 47303: {'lr': 3.606037260396239e-06, 'samples': 24219648, 'steps': 47303, 'batch_loss/train': 0.5973624875769019} 12/29/2021 04:34:03 - INFO - codeparrot_training - Step 47304: {'lr': 3.6033700874870402e-06, 'samples': 24220160, 'steps': 47304, 'batch_loss/train': 0.7245312063023448} 12/29/2021 04:34:13 - INFO - codeparrot_training - Step 47305: {'lr': 3.600703894157786e-06, 'samples': 24220672, 'steps': 47305, 'batch_loss/train': 0.668110373429954} 12/29/2021 04:34:25 - INFO - codeparrot_training - Step 
47306: {'lr': 3.598038680419108e-06, 'samples': 24221184, 'steps': 47306, 'batch_loss/train': 0.7178520569577813} 12/29/2021 04:34:36 - INFO - codeparrot_training - Step 47307: {'lr': 3.5953744462816073e-06, 'samples': 24221696, 'steps': 47307, 'batch_loss/train': 0.6568187063094229} 12/29/2021 04:34:47 - INFO - codeparrot_training - Step 47308: {'lr': 3.5927111917558596e-06, 'samples': 24222208, 'steps': 47308, 'batch_loss/train': 0.6132563529827166} 12/29/2021 04:34:59 - INFO - codeparrot_training - Step 47309: {'lr': 3.590048916852412e-06, 'samples': 24222720, 'steps': 47309, 'batch_loss/train': 0.7904037144035101} 12/29/2021 04:35:10 - INFO - codeparrot_training - Step 47310: {'lr': 3.5873876215819223e-06, 'samples': 24223232, 'steps': 47310, 'batch_loss/train': 0.624205077678198} 12/29/2021 04:35:21 - INFO - codeparrot_training - Step 47311: {'lr': 3.5847273059549657e-06, 'samples': 24223744, 'steps': 47311, 'batch_loss/train': 0.6517922348575667} 12/29/2021 04:35:31 - INFO - codeparrot_training - Step 47312: {'lr': 3.5820679699820336e-06, 'samples': 24224256, 'steps': 47312, 'batch_loss/train': 0.7139806197956204} 12/29/2021 04:35:44 - INFO - codeparrot_training - Step 47313: {'lr': 3.5794096136737563e-06, 'samples': 24224768, 'steps': 47313, 'batch_loss/train': 1.1459348504431546} 12/29/2021 04:35:55 - INFO - codeparrot_training - Step 47314: {'lr': 3.5767522370407644e-06, 'samples': 24225280, 'steps': 47314, 'batch_loss/train': 0.6364382595638745} 12/29/2021 04:36:05 - INFO - codeparrot_training - Step 47315: {'lr': 3.574095840093494e-06, 'samples': 24225792, 'steps': 47315, 'batch_loss/train': 0.7429183293133974} 12/29/2021 04:36:17 - INFO - codeparrot_training - Step 47316: {'lr': 3.5714404228425755e-06, 'samples': 24226304, 'steps': 47316, 'batch_loss/train': 0.6749840793199837} 12/29/2021 04:36:28 - INFO - codeparrot_training - Step 47317: {'lr': 3.5687859852986116e-06, 'samples': 24226816, 'steps': 47317, 'batch_loss/train': 0.6625751070678234} 
12/29/2021 04:36:39 - INFO - codeparrot_training - Step 47318: {'lr': 3.566132527472066e-06, 'samples': 24227328, 'steps': 47318, 'batch_loss/train': 0.7392337219789624} 12/29/2021 04:36:49 - INFO - codeparrot_training - Step 47319: {'lr': 3.563480049373541e-06, 'samples': 24227840, 'steps': 47319, 'batch_loss/train': 0.7106340536847711} 12/29/2021 04:37:02 - INFO - codeparrot_training - Step 47320: {'lr': 3.560828551013584e-06, 'samples': 24228352, 'steps': 47320, 'batch_loss/train': 0.6102331556903664} 12/29/2021 04:37:13 - INFO - codeparrot_training - Step 47321: {'lr': 3.558178032402687e-06, 'samples': 24228864, 'steps': 47321, 'batch_loss/train': 0.8271698020398617} 12/29/2021 04:37:24 - INFO - codeparrot_training - Step 47322: {'lr': 3.555528493551452e-06, 'samples': 24229376, 'steps': 47322, 'batch_loss/train': 0.7283894188003615} 12/29/2021 04:37:36 - INFO - codeparrot_training - Step 47323: {'lr': 3.552879934470371e-06, 'samples': 24229888, 'steps': 47323, 'batch_loss/train': 0.6177813242538832} 12/29/2021 04:37:47 - INFO - codeparrot_training - Step 47324: {'lr': 3.550232355169963e-06, 'samples': 24230400, 'steps': 47324, 'batch_loss/train': 0.6368045061826706} 12/29/2021 04:37:57 - INFO - codeparrot_training - Step 47325: {'lr': 3.5475857556608037e-06, 'samples': 24230912, 'steps': 47325, 'batch_loss/train': 0.5988444038375746} 12/29/2021 04:38:10 - INFO - codeparrot_training - Step 47326: {'lr': 3.544940135953384e-06, 'samples': 24231424, 'steps': 47326, 'batch_loss/train': 0.6678384467959404} 12/29/2021 04:38:20 - INFO - codeparrot_training - Step 47327: {'lr': 3.5422954960581955e-06, 'samples': 24231936, 'steps': 47327, 'batch_loss/train': 0.6958401543088257} 12/29/2021 04:38:31 - INFO - codeparrot_training - Step 47328: {'lr': 3.539651835985813e-06, 'samples': 24232448, 'steps': 47328, 'batch_loss/train': 0.6591957253403962} 12/29/2021 04:38:44 - INFO - codeparrot_training - Step 47329: {'lr': 3.5370091557467286e-06, 'samples': 24232960, 'steps': 
47329, 'batch_loss/train': 0.6833918554475531} 12/29/2021 04:38:54 - INFO - codeparrot_training - Step 47330: {'lr': 3.5343674553514062e-06, 'samples': 24233472, 'steps': 47330, 'batch_loss/train': 0.781298047862947} 12/29/2021 04:39:05 - INFO - codeparrot_training - Step 47331: {'lr': 3.5317267348103642e-06, 'samples': 24233984, 'steps': 47331, 'batch_loss/train': 0.6646168678998947} 12/29/2021 04:39:16 - INFO - codeparrot_training - Step 47332: {'lr': 3.529086994134151e-06, 'samples': 24234496, 'steps': 47332, 'batch_loss/train': 0.5109351000282913} 12/29/2021 04:39:28 - INFO - codeparrot_training - Step 47333: {'lr': 3.5264482333332013e-06, 'samples': 24235008, 'steps': 47333, 'batch_loss/train': 0.7399593945592642} 12/29/2021 04:39:38 - INFO - codeparrot_training - Step 47334: {'lr': 3.523810452418008e-06, 'samples': 24235520, 'steps': 47334, 'batch_loss/train': 0.7409965498372912} 12/29/2021 04:39:49 - INFO - codeparrot_training - Step 47335: {'lr': 3.5211736513991178e-06, 'samples': 24236032, 'steps': 47335, 'batch_loss/train': 0.7997591393068433} 12/29/2021 04:40:01 - INFO - codeparrot_training - Step 47336: {'lr': 3.5185378302869664e-06, 'samples': 24236544, 'steps': 47336, 'batch_loss/train': 0.6098683121963404} 12/29/2021 04:40:12 - INFO - codeparrot_training - Step 47337: {'lr': 3.5159029890920456e-06, 'samples': 24237056, 'steps': 47337, 'batch_loss/train': 0.7621386321261525} 12/29/2021 04:40:22 - INFO - codeparrot_training - Step 47338: {'lr': 3.513269127824792e-06, 'samples': 24237568, 'steps': 47338, 'batch_loss/train': 0.7695396542549133} 12/29/2021 04:40:35 - INFO - codeparrot_training - Step 47339: {'lr': 3.5106362464957242e-06, 'samples': 24238080, 'steps': 47339, 'batch_loss/train': 0.7289596593473107} 12/29/2021 04:40:46 - INFO - codeparrot_training - Step 47340: {'lr': 3.5080043451153342e-06, 'samples': 24238592, 'steps': 47340, 'batch_loss/train': 0.7792439768090844} 12/29/2021 04:40:56 - INFO - codeparrot_training - Step 47341: {'lr': 
3.5053734236940027e-06, 'samples': 24239104, 'steps': 47341, 'batch_loss/train': 0.7959854500368237} 12/29/2021 04:41:08 - INFO - codeparrot_training - Step 47342: {'lr': 3.5027434822422213e-06, 'samples': 24239616, 'steps': 47342, 'batch_loss/train': 0.7296266419580206} 12/29/2021 04:41:19 - INFO - codeparrot_training - Step 47343: {'lr': 3.500114520770509e-06, 'samples': 24240128, 'steps': 47343, 'batch_loss/train': 0.7089239866472781} 12/29/2021 04:41:30 - INFO - codeparrot_training - Step 47344: {'lr': 3.4974865392892465e-06, 'samples': 24240640, 'steps': 47344, 'batch_loss/train': 0.7395188137888908} 12/29/2021 04:41:40 - INFO - codeparrot_training - Step 47345: {'lr': 3.4948595378088977e-06, 'samples': 24241152, 'steps': 47345, 'batch_loss/train': 0.7474234509281814} 12/29/2021 04:41:52 - INFO - codeparrot_training - Step 47346: {'lr': 3.492233516339899e-06, 'samples': 24241664, 'steps': 47346, 'batch_loss/train': 0.6502064494416118} 12/29/2021 04:42:03 - INFO - codeparrot_training - Step 47347: {'lr': 3.489608474892714e-06, 'samples': 24242176, 'steps': 47347, 'batch_loss/train': 0.6542672757059336} 12/29/2021 04:42:13 - INFO - codeparrot_training - Step 47348: {'lr': 3.4869844134777786e-06, 'samples': 24242688, 'steps': 47348, 'batch_loss/train': 0.6835815005470067} 12/29/2021 04:42:26 - INFO - codeparrot_training - Step 47349: {'lr': 3.4843613321055013e-06, 'samples': 24243200, 'steps': 47349, 'batch_loss/train': 0.7686041994020343} 12/29/2021 04:42:37 - INFO - codeparrot_training - Step 47350: {'lr': 3.4817392307863183e-06, 'samples': 24243712, 'steps': 47350, 'batch_loss/train': 0.7717043180018663} 12/29/2021 04:42:48 - INFO - codeparrot_training - Step 47351: {'lr': 3.4791181095306655e-06, 'samples': 24244224, 'steps': 47351, 'batch_loss/train': 0.6592451175674796} 12/29/2021 04:43:00 - INFO - codeparrot_training - Step 47352: {'lr': 3.4764979683489517e-06, 'samples': 24244736, 'steps': 47352, 'batch_loss/train': 0.7336127692833543} 12/29/2021 04:43:10 
- INFO - codeparrot_training - Step 47353: {'lr': 3.4738788072515847e-06, 'samples': 24245248, 'steps': 47353, 'batch_loss/train': 0.660784785519354} 12/29/2021 04:43:21 - INFO - codeparrot_training - Step 47354: {'lr': 3.4712606262490288e-06, 'samples': 24245760, 'steps': 47354, 'batch_loss/train': 0.931402076035738} 12/29/2021 04:43:32 - INFO - codeparrot_training - Step 47355: {'lr': 3.4686434253516364e-06, 'samples': 24246272, 'steps': 47355, 'batch_loss/train': 0.5672658355906606} 12/29/2021 04:43:44 - INFO - codeparrot_training - Step 47356: {'lr': 3.466027204569816e-06, 'samples': 24246784, 'steps': 47356, 'batch_loss/train': 0.6292000162648037} 12/29/2021 04:43:55 - INFO - codeparrot_training - Step 47357: {'lr': 3.4634119639140036e-06, 'samples': 24247296, 'steps': 47357, 'batch_loss/train': 0.8206584574654698} 12/29/2021 04:44:05 - INFO - codeparrot_training - Step 47358: {'lr': 3.4607977033945803e-06, 'samples': 24247808, 'steps': 47358, 'batch_loss/train': 0.6385585833340883} 12/29/2021 04:44:18 - INFO - codeparrot_training - Step 47359: {'lr': 3.4581844230218985e-06, 'samples': 24248320, 'steps': 47359, 'batch_loss/train': 0.6754618673585355} 12/29/2021 04:44:29 - INFO - codeparrot_training - Step 47360: {'lr': 3.455572122806422e-06, 'samples': 24248832, 'steps': 47360, 'batch_loss/train': 0.7645040974020958} 12/29/2021 04:44:39 - INFO - codeparrot_training - Step 47361: {'lr': 3.4529608027584758e-06, 'samples': 24249344, 'steps': 47361, 'batch_loss/train': 0.748470320366323} 12/29/2021 04:44:50 - INFO - codeparrot_training - Step 47362: {'lr': 3.450350462888496e-06, 'samples': 24249856, 'steps': 47362, 'batch_loss/train': 1.0374248418956995} 12/29/2021 04:45:02 - INFO - codeparrot_training - Step 47363: {'lr': 3.4477411032068083e-06, 'samples': 24250368, 'steps': 47363, 'batch_loss/train': 0.7519551294390112} 12/29/2021 04:45:13 - INFO - codeparrot_training - Step 47364: {'lr': 3.4451327237237927e-06, 'samples': 24250880, 'steps': 47364, 
'batch_loss/train': 0.8004484013654292} 12/29/2021 04:45:23 - INFO - codeparrot_training - Step 47365: {'lr': 3.442525324449858e-06, 'samples': 24251392, 'steps': 47365, 'batch_loss/train': 0.7202921099960804} 12/29/2021 04:45:36 - INFO - codeparrot_training - Step 47366: {'lr': 3.439918905395356e-06, 'samples': 24251904, 'steps': 47366, 'batch_loss/train': 0.6796971205621958} 12/29/2021 04:45:47 - INFO - codeparrot_training - Step 47367: {'lr': 3.437313466570613e-06, 'samples': 24252416, 'steps': 47367, 'batch_loss/train': 0.5687627855222672} 12/29/2021 04:45:57 - INFO - codeparrot_training - Step 47368: {'lr': 3.4347090079860087e-06, 'samples': 24252928, 'steps': 47368, 'batch_loss/train': 0.7816730858758092} 12/29/2021 04:46:10 - INFO - codeparrot_training - Step 47369: {'lr': 3.432105529651924e-06, 'samples': 24253440, 'steps': 47369, 'batch_loss/train': 0.7070361441001296} 12/29/2021 04:46:20 - INFO - codeparrot_training - Step 47370: {'lr': 3.4295030315786845e-06, 'samples': 24253952, 'steps': 47370, 'batch_loss/train': 0.8323926543816924} 12/29/2021 04:46:31 - INFO - codeparrot_training - Step 47371: {'lr': 3.4269015137766424e-06, 'samples': 24254464, 'steps': 47371, 'batch_loss/train': 0.7274833684787154} 12/29/2021 04:46:43 - INFO - codeparrot_training - Step 47372: {'lr': 3.424300976256123e-06, 'samples': 24254976, 'steps': 47372, 'batch_loss/train': 0.7177276965230703} 12/29/2021 04:46:54 - INFO - codeparrot_training - Step 47373: {'lr': 3.4217014190274785e-06, 'samples': 24255488, 'steps': 47373, 'batch_loss/train': 0.4911770903854631} 12/29/2021 04:47:04 - INFO - codeparrot_training - Step 47374: {'lr': 3.419102842101035e-06, 'samples': 24256000, 'steps': 47374, 'batch_loss/train': 0.6299622470978647} 12/29/2021 04:47:15 - INFO - codeparrot_training - Step 47375: {'lr': 3.4165052454871724e-06, 'samples': 24256512, 'steps': 47375, 'batch_loss/train': 0.5830975968856364} 12/29/2021 04:47:28 - INFO - codeparrot_training - Step 47376: {'lr': 
3.413908629196133e-06, 'samples': 24257024, 'steps': 47376, 'batch_loss/train': 0.7447493094950914} 12/29/2021 04:47:38 - INFO - codeparrot_training - Step 47377: {'lr': 3.411312993238325e-06, 'samples': 24257536, 'steps': 47377, 'batch_loss/train': 0.7465155941899866} 12/29/2021 04:47:49 - INFO - codeparrot_training - Step 47378: {'lr': 3.4087183376240172e-06, 'samples': 24258048, 'steps': 47378, 'batch_loss/train': 0.6542116799391806} 12/29/2021 04:48:01 - INFO - codeparrot_training - Step 47379: {'lr': 3.4061246623635078e-06, 'samples': 24258560, 'steps': 47379, 'batch_loss/train': 0.6700461776927114} 12/29/2021 04:48:12 - INFO - codeparrot_training - Step 47380: {'lr': 3.4035319674671494e-06, 'samples': 24259072, 'steps': 47380, 'batch_loss/train': 0.7872187143657357} 12/29/2021 04:48:22 - INFO - codeparrot_training - Step 47381: {'lr': 3.400940252945267e-06, 'samples': 24259584, 'steps': 47381, 'batch_loss/train': 0.6968765817582607} 12/29/2021 04:48:34 - INFO - codeparrot_training - Step 47382: {'lr': 3.3983495188081304e-06, 'samples': 24260096, 'steps': 47382, 'batch_loss/train': 0.7407632877584547} 12/29/2021 04:48:45 - INFO - codeparrot_training - Step 47383: {'lr': 3.3957597650660087e-06, 'samples': 24260608, 'steps': 47383, 'batch_loss/train': 0.5421847022371367} 12/29/2021 04:48:56 - INFO - codeparrot_training - Step 47384: {'lr': 3.3931709917292554e-06, 'samples': 24261120, 'steps': 47384, 'batch_loss/train': 0.6902845441363752} 12/29/2021 04:49:08 - INFO - codeparrot_training - Step 47385: {'lr': 3.3905831988081393e-06, 'samples': 24261632, 'steps': 47385, 'batch_loss/train': 0.5947052667033859} 12/29/2021 04:49:19 - INFO - codeparrot_training - Step 47386: {'lr': 3.38799638631293e-06, 'samples': 24262144, 'steps': 47386, 'batch_loss/train': 0.6015667754109018} 12/29/2021 04:49:29 - INFO - codeparrot_training - Step 47387: {'lr': 3.385410554253954e-06, 'samples': 24262656, 'steps': 47387, 'batch_loss/train': 0.6622202079743147} 12/29/2021 04:49:40 - 
INFO - codeparrot_training - Step 47388: {'lr': 3.382825702641451e-06, 'samples': 24263168, 'steps': 47388, 'batch_loss/train': 0.7278862232342362} 12/29/2021 04:49:52 - INFO - codeparrot_training - Step 47389: {'lr': 3.3802418314857197e-06, 'samples': 24263680, 'steps': 47389, 'batch_loss/train': 0.7895924118347466} 12/29/2021 04:50:03 - INFO - codeparrot_training - Step 47390: {'lr': 3.3776589407970017e-06, 'samples': 24264192, 'steps': 47390, 'batch_loss/train': 0.680938511621207} 12/29/2021 04:50:14 - INFO - codeparrot_training - Step 47391: {'lr': 3.3750770305856216e-06, 'samples': 24264704, 'steps': 47391, 'batch_loss/train': 0.8035631151869893} 12/29/2021 04:50:26 - INFO - codeparrot_training - Step 47392: {'lr': 3.3724961008618214e-06, 'samples': 24265216, 'steps': 47392, 'batch_loss/train': 0.6689043997321278} 12/29/2021 04:50:36 - INFO - codeparrot_training - Step 47393: {'lr': 3.3699161516358434e-06, 'samples': 24265728, 'steps': 47393, 'batch_loss/train': 0.6438220572890714} 12/29/2021 04:50:47 - INFO - codeparrot_training - Step 47394: {'lr': 3.367337182917929e-06, 'samples': 24266240, 'steps': 47394, 'batch_loss/train': 0.8062619240954518} 12/29/2021 04:50:59 - INFO - codeparrot_training - Step 47395: {'lr': 3.364759194718403e-06, 'samples': 24266752, 'steps': 47395, 'batch_loss/train': 0.6999079901725054} 12/29/2021 04:51:10 - INFO - codeparrot_training - Step 47396: {'lr': 3.3621821870474246e-06, 'samples': 24267264, 'steps': 47396, 'batch_loss/train': 0.8516150889918208} 12/29/2021 04:51:21 - INFO - codeparrot_training - Step 47397: {'lr': 3.359606159915318e-06, 'samples': 24267776, 'steps': 47397, 'batch_loss/train': 0.7552501063328236} 12/29/2021 04:51:31 - INFO - codeparrot_training - Step 47398: {'lr': 3.3570311133322706e-06, 'samples': 24268288, 'steps': 47398, 'batch_loss/train': 0.49993882002308965} 12/29/2021 04:51:44 - INFO - codeparrot_training - Step 47399: {'lr': 3.354457047308551e-06, 'samples': 24268800, 'steps': 47399, 
'batch_loss/train': 0.6501034717075527} 12/29/2021 04:51:55 - INFO - codeparrot_training - Step 47400: {'lr': 3.351883961854346e-06, 'samples': 24269312, 'steps': 47400, 'batch_loss/train': 0.569538950570859} 12/29/2021 04:52:05 - INFO - codeparrot_training - Step 47401: {'lr': 3.349311856979953e-06, 'samples': 24269824, 'steps': 47401, 'batch_loss/train': 0.814878772944212} 12/29/2021 04:52:17 - INFO - codeparrot_training - Step 47402: {'lr': 3.346740732695558e-06, 'samples': 24270336, 'steps': 47402, 'batch_loss/train': 0.6167961825849488} 12/29/2021 04:52:28 - INFO - codeparrot_training - Step 47403: {'lr': 3.344170589011375e-06, 'samples': 24270848, 'steps': 47403, 'batch_loss/train': 0.6701685378793627} 12/29/2021 04:52:38 - INFO - codeparrot_training - Step 47404: {'lr': 3.341601425937674e-06, 'samples': 24271360, 'steps': 47404, 'batch_loss/train': 0.6976188458502293} 12/29/2021 04:52:51 - INFO - codeparrot_training - Step 47405: {'lr': 3.339033243484585e-06, 'samples': 24271872, 'steps': 47405, 'batch_loss/train': 0.6591645264998078} 12/29/2021 04:53:02 - INFO - codeparrot_training - Step 47406: {'lr': 3.336466041662378e-06, 'samples': 24272384, 'steps': 47406, 'batch_loss/train': 0.7508860463276505} 12/29/2021 04:53:12 - INFO - codeparrot_training - Step 47407: {'lr': 3.3338998204812678e-06, 'samples': 24272896, 'steps': 47407, 'batch_loss/train': 0.708690736675635} 12/29/2021 04:53:26 - INFO - codeparrot_training - Step 47408: {'lr': 3.3313345799514394e-06, 'samples': 24273408, 'steps': 47408, 'batch_loss/train': 0.7108374675735831} 12/29/2021 04:53:36 - INFO - codeparrot_training - Step 47409: {'lr': 3.328770320083052e-06, 'samples': 24273920, 'steps': 47409, 'batch_loss/train': 1.6463190375361592} 12/29/2021 04:53:47 - INFO - codeparrot_training - Step 47410: {'lr': 3.3262070408863477e-06, 'samples': 24274432, 'steps': 47410, 'batch_loss/train': 1.0959950685501099} 12/29/2021 04:53:58 - INFO - codeparrot_training - Step 47411: {'lr': 
3.323644742371512e-06, 'samples': 24274944, 'steps': 47411, 'batch_loss/train': 1.0180057636462152} 12/29/2021 04:54:10 - INFO - codeparrot_training - Step 47412: {'lr': 3.3210834245487044e-06, 'samples': 24275456, 'steps': 47412, 'batch_loss/train': 1.316737812012434} 12/29/2021 04:54:20 - INFO - codeparrot_training - Step 47413: {'lr': 3.318523087428138e-06, 'samples': 24275968, 'steps': 47413, 'batch_loss/train': 0.8171672141179442} 12/29/2021 04:54:31 - INFO - codeparrot_training - Step 47414: {'lr': 3.315963731019972e-06, 'samples': 24276480, 'steps': 47414, 'batch_loss/train': 0.723424123832956} 12/29/2021 04:54:43 - INFO - codeparrot_training - Step 47415: {'lr': 3.31340535533442e-06, 'samples': 24276992, 'steps': 47415, 'batch_loss/train': 0.6710909595713019} 12/29/2021 04:54:54 - INFO - codeparrot_training - Step 47416: {'lr': 3.3108479603815854e-06, 'samples': 24277504, 'steps': 47416, 'batch_loss/train': 0.7305356496945024} 12/29/2021 04:55:05 - INFO - codeparrot_training - Step 47417: {'lr': 3.3082915461716823e-06, 'samples': 24278016, 'steps': 47417, 'batch_loss/train': 0.7814771635457873} 12/29/2021 04:55:17 - INFO - codeparrot_training - Step 47418: {'lr': 3.3057361127148967e-06, 'samples': 24278528, 'steps': 47418, 'batch_loss/train': 0.6795236342586577} 12/29/2021 04:55:28 - INFO - codeparrot_training - Step 47419: {'lr': 3.3031816600213316e-06, 'samples': 24279040, 'steps': 47419, 'batch_loss/train': 0.6844354132190347} 12/29/2021 04:55:38 - INFO - codeparrot_training - Step 47420: {'lr': 3.300628188101146e-06, 'samples': 24279552, 'steps': 47420, 'batch_loss/train': 0.5346465120092034} 12/29/2021 04:55:49 - INFO - codeparrot_training - Step 47421: {'lr': 3.298075696964553e-06, 'samples': 24280064, 'steps': 47421, 'batch_loss/train': 0.6446169186383486} 12/29/2021 04:56:01 - INFO - codeparrot_training - Step 47422: {'lr': 3.295524186621629e-06, 'samples': 24280576, 'steps': 47422, 'batch_loss/train': 0.7506633764132857} 12/29/2021 04:56:12 - INFO 
- codeparrot_training - Step 47423: {'lr': 3.2929736570825596e-06, 'samples': 24281088, 'steps': 47423, 'batch_loss/train': 0.6609505680389702} 12/29/2021 04:56:22 - INFO - codeparrot_training - Step 47424: {'lr': 3.290424108357476e-06, 'samples': 24281600, 'steps': 47424, 'batch_loss/train': 0.7310629525454715} 12/29/2021 04:56:35 - INFO - codeparrot_training - Step 47425: {'lr': 3.2878755404565085e-06, 'samples': 24282112, 'steps': 47425, 'batch_loss/train': 0.7540103560313582} 12/29/2021 04:56:45 - INFO - codeparrot_training - Step 47426: {'lr': 3.285327953389816e-06, 'samples': 24282624, 'steps': 47426, 'batch_loss/train': 0.7536340653896332} 12/29/2021 04:56:56 - INFO - codeparrot_training - Step 47427: {'lr': 3.282781347167474e-06, 'samples': 24283136, 'steps': 47427, 'batch_loss/train': 0.692338238004595} 12/29/2021 04:57:06 - INFO - codeparrot_training - Step 47428: {'lr': 3.2802357217996404e-06, 'samples': 24283648, 'steps': 47428, 'batch_loss/train': 0.6649408161174506} 12/29/2021 04:57:19 - INFO - codeparrot_training - Step 47429: {'lr': 3.2776910772964464e-06, 'samples': 24284160, 'steps': 47429, 'batch_loss/train': 0.6007395935012028} 12/29/2021 04:57:30 - INFO - codeparrot_training - Step 47430: {'lr': 3.275147413667995e-06, 'samples': 24284672, 'steps': 47430, 'batch_loss/train': 0.6412005796155427} 12/29/2021 04:57:40 - INFO - codeparrot_training - Step 47431: {'lr': 3.2726047309243614e-06, 'samples': 24285184, 'steps': 47431, 'batch_loss/train': 0.7334294328466058} 12/29/2021 04:57:53 - INFO - codeparrot_training - Step 47432: {'lr': 3.270063029075732e-06, 'samples': 24285696, 'steps': 47432, 'batch_loss/train': 0.549523315159604} 12/29/2021 04:58:03 - INFO - codeparrot_training - Step 47433: {'lr': 3.2675223081321537e-06, 'samples': 24286208, 'steps': 47433, 'batch_loss/train': 0.6747697149403393} 12/29/2021 04:58:14 - INFO - codeparrot_training - Step 47434: {'lr': 3.264982568103758e-06, 'samples': 24286720, 'steps': 47434, 'batch_loss/train': 
0.7185896849259734} 12/29/2021 04:58:28 - INFO - codeparrot_training - Step 47435: {'lr': 3.2624438090005924e-06, 'samples': 24287232, 'steps': 47435, 'batch_loss/train': 0.6826984889339656} 12/29/2021 04:58:39 - INFO - codeparrot_training - Step 47436: {'lr': 3.2599060308328155e-06, 'samples': 24287744, 'steps': 47436, 'batch_loss/train': 0.6499341670423746} 12/29/2021 04:58:49 - INFO - codeparrot_training - Step 47437: {'lr': 3.2573692336104745e-06, 'samples': 24288256, 'steps': 47437, 'batch_loss/train': 0.8947087964043021} 12/29/2021 04:59:02 - INFO - codeparrot_training - Step 47438: {'lr': 3.2548334173436723e-06, 'samples': 24288768, 'steps': 47438, 'batch_loss/train': 0.6300102039240301} 12/29/2021 04:59:12 - INFO - codeparrot_training - Step 47439: {'lr': 3.2522985820424845e-06, 'samples': 24289280, 'steps': 47439, 'batch_loss/train': 0.8181042142678052} 12/29/2021 04:59:23 - INFO - codeparrot_training - Step 47440: {'lr': 3.249764727716986e-06, 'samples': 24289792, 'steps': 47440, 'batch_loss/train': 0.797463639639318} 12/29/2021 04:59:33 - INFO - codeparrot_training - Step 47441: {'lr': 3.2472318543772805e-06, 'samples': 24290304, 'steps': 47441, 'batch_loss/train': 0.8428957457654178} 12/29/2021 04:59:46 - INFO - codeparrot_training - Step 47442: {'lr': 3.244699962033387e-06, 'samples': 24290816, 'steps': 47442, 'batch_loss/train': 0.6830023573711514} 12/29/2021 04:59:56 - INFO - codeparrot_training - Step 47443: {'lr': 3.2421690506953817e-06, 'samples': 24291328, 'steps': 47443, 'batch_loss/train': 0.7102387580089271} 12/29/2021 05:00:07 - INFO - codeparrot_training - Step 47444: {'lr': 3.2396391203733665e-06, 'samples': 24291840, 'steps': 47444, 'batch_loss/train': 0.775171248242259} 12/29/2021 05:00:21 - INFO - codeparrot_training - Step 47445: {'lr': 3.237110171077362e-06, 'samples': 24292352, 'steps': 47445, 'batch_loss/train': 0.7781139467842877} 12/29/2021 05:00:31 - INFO - codeparrot_training - Step 47446: {'lr': 3.2345822028174155e-06, 
'samples': 24292864, 'steps': 47446, 'batch_loss/train': 0.7223774739541113} 12/29/2021 05:00:42 - INFO - codeparrot_training - Step 47447: {'lr': 3.2320552156036298e-06, 'samples': 24293376, 'steps': 47447, 'batch_loss/train': 0.7030103206634521} 12/29/2021 05:00:54 - INFO - codeparrot_training - Step 47448: {'lr': 3.2295292094459973e-06, 'samples': 24293888, 'steps': 47448, 'batch_loss/train': 0.7095646522939205} 12/29/2021 05:01:05 - INFO - codeparrot_training - Step 47449: {'lr': 3.227004184354565e-06, 'samples': 24294400, 'steps': 47449, 'batch_loss/train': 0.6754389866255224} 12/29/2021 05:01:15 - INFO - codeparrot_training - Step 47450: {'lr': 3.2244801403394363e-06, 'samples': 24294912, 'steps': 47450, 'batch_loss/train': 0.6968292728997767} 12/29/2021 05:01:29 - INFO - codeparrot_training - Step 47451: {'lr': 3.221957077410548e-06, 'samples': 24295424, 'steps': 47451, 'batch_loss/train': 0.6676771151833236} 12/29/2021 05:01:40 - INFO - codeparrot_training - Step 47452: {'lr': 3.219434995578002e-06, 'samples': 24295936, 'steps': 47452, 'batch_loss/train': 0.6464381862897426} 12/29/2021 05:01:51 - INFO - codeparrot_training - Step 47453: {'lr': 3.216913894851792e-06, 'samples': 24296448, 'steps': 47453, 'batch_loss/train': 0.7092203111387789} 12/29/2021 05:02:01 - INFO - codeparrot_training - Step 47454: {'lr': 3.214393775241936e-06, 'samples': 24296960, 'steps': 47454, 'batch_loss/train': 0.7742323046550155} 12/29/2021 05:02:13 - INFO - codeparrot_training - Step 47455: {'lr': 3.211874636758483e-06, 'samples': 24297472, 'steps': 47455, 'batch_loss/train': 0.7615192933008075} 12/29/2021 05:02:24 - INFO - codeparrot_training - Step 47456: {'lr': 3.2093564794114517e-06, 'samples': 24297984, 'steps': 47456, 'batch_loss/train': 0.906210379791446} 12/29/2021 05:02:35 - INFO - codeparrot_training - Step 47457: {'lr': 3.206839303210807e-06, 'samples': 24298496, 'steps': 47457, 'batch_loss/train': 0.704715080326423} 12/29/2021 05:02:47 - INFO - codeparrot_training - 
Step 47458: {'lr': 3.2043231081665956e-06, 'samples': 24299008, 'steps': 47458, 'batch_loss/train': 0.716948383487761} 12/29/2021 05:02:57 - INFO - codeparrot_training - Step 47459: {'lr': 3.2018078942888107e-06, 'samples': 24299520, 'steps': 47459, 'batch_loss/train': 0.7437311722896993} 12/29/2021 05:03:08 - INFO - codeparrot_training - Step 47460: {'lr': 3.1992936615874434e-06, 'samples': 24300032, 'steps': 47460, 'batch_loss/train': 0.7491009393706918} 12/29/2021 05:03:22 - INFO - codeparrot_training - Step 47461: {'lr': 3.196780410072486e-06, 'samples': 24300544, 'steps': 47461, 'batch_loss/train': 0.7415916176396422} 12/29/2021 05:03:32 - INFO - codeparrot_training - Step 47462: {'lr': 3.1942681397539586e-06, 'samples': 24301056, 'steps': 47462, 'batch_loss/train': 0.7177171325311065} 12/29/2021 05:03:43 - INFO - codeparrot_training - Step 47463: {'lr': 3.191756850641825e-06, 'samples': 24301568, 'steps': 47463, 'batch_loss/train': 0.8698350293561816} 12/29/2021 05:03:54 - INFO - codeparrot_training - Step 47464: {'lr': 3.189246542746077e-06, 'samples': 24302080, 'steps': 47464, 'batch_loss/train': 0.8126755002886057} 12/29/2021 05:04:06 - INFO - codeparrot_training - Step 47465: {'lr': 3.186737216076707e-06, 'samples': 24302592, 'steps': 47465, 'batch_loss/train': 0.7420995645225048} 12/29/2021 05:04:16 - INFO - codeparrot_training - Step 47466: {'lr': 3.1842288706436517e-06, 'samples': 24303104, 'steps': 47466, 'batch_loss/train': 0.7085259594023228} 12/29/2021 05:04:27 - INFO - codeparrot_training - Step 47467: {'lr': 3.1817215064569025e-06, 'samples': 24303616, 'steps': 47467, 'batch_loss/train': 0.6840747549431399} 12/29/2021 05:04:41 - INFO - codeparrot_training - Step 47468: {'lr': 3.179215123526452e-06, 'samples': 24304128, 'steps': 47468, 'batch_loss/train': 0.7605212684720755} 12/29/2021 05:04:52 - INFO - codeparrot_training - Step 47469: {'lr': 3.1767097218622364e-06, 'samples': 24304640, 'steps': 47469, 'batch_loss/train': 0.7590699037536979} 
12/29/2021 05:05:02 - INFO - codeparrot_training - Step 47470: {'lr': 3.174205301474248e-06, 'samples': 24305152, 'steps': 47470, 'batch_loss/train': 0.5958257112361025}
12/29/2021 05:05:14 - INFO - codeparrot_training - Step 47471: {'lr': 3.1717018623724224e-06, 'samples': 24305664, 'steps': 47471, 'batch_loss/train': 0.5508757452480495}
12/29/2021 05:05:25 - INFO - codeparrot_training - Step 47472: {'lr': 3.1691994045666693e-06, 'samples': 24306176, 'steps': 47472, 'batch_loss/train': 0.5973161019501276}
12/29/2021 05:05:36 - INFO - codeparrot_training - Step 47473: {'lr': 3.166697928067036e-06, 'samples': 24306688, 'steps': 47473, 'batch_loss/train': 0.7601380879059434}
12/29/2021 05:05:46 - INFO - codeparrot_training - Step 47474: {'lr': 3.1641974328833755e-06, 'samples': 24307200, 'steps': 47474, 'batch_loss/train': 0.6094368319027126}
12/29/2021 05:05:58 - INFO - codeparrot_training - Step 47475: {'lr': 3.1616979190256523e-06, 'samples': 24307712, 'steps': 47475, 'batch_loss/train': 0.7096839984878898}
12/29/2021 05:06:09 - INFO - codeparrot_training - Step 47476: {'lr': 3.1591993865038303e-06, 'samples': 24308224, 'steps': 47476, 'batch_loss/train': 0.771985637024045}
12/29/2021 05:06:20 - INFO - codeparrot_training - Step 47477: {'lr': 3.1567018353278464e-06, 'samples': 24308736, 'steps': 47477, 'batch_loss/train': 0.718559896107763}
12/29/2021 05:06:32 - INFO - codeparrot_training - Step 47478: {'lr': 3.1542052655075813e-06, 'samples': 24309248, 'steps': 47478, 'batch_loss/train': 0.655148749705404}
12/29/2021 05:06:42 - INFO - codeparrot_training - Step 47479: {'lr': 3.151709677053e-06, 'samples': 24309760, 'steps': 47479, 'batch_loss/train': 0.7883134530857205}
12/29/2021 05:06:53 - INFO - codeparrot_training - Step 47480: {'lr': 3.1492150699740373e-06, 'samples': 24310272, 'steps': 47480, 'batch_loss/train': 0.7488402351737022}
12/29/2021 05:07:07 - INFO - codeparrot_training - Step 47481: {'lr': 3.146721444280548e-06, 'samples': 24310784, 'steps': 47481, 'batch_loss/train': 0.8242170724552125}
12/29/2021 05:07:18 - INFO - codeparrot_training - Step 47482: {'lr': 3.1442287999825238e-06, 'samples': 24311296, 'steps': 47482, 'batch_loss/train': 0.685222020605579}
12/29/2021 05:07:28 - INFO - codeparrot_training - Step 47483: {'lr': 3.1417371370898175e-06, 'samples': 24311808, 'steps': 47483, 'batch_loss/train': 0.7438780744560063}
12/29/2021 05:07:39 - INFO - codeparrot_training - Step 47484: {'lr': 3.139246455612366e-06, 'samples': 24312320, 'steps': 47484, 'batch_loss/train': 0.7551612888928503}
12/29/2021 05:07:51 - INFO - codeparrot_training - Step 47485: {'lr': 3.1367567555600494e-06, 'samples': 24312832, 'steps': 47485, 'batch_loss/train': 0.7747432810720056}
12/29/2021 05:08:01 - INFO - codeparrot_training - Step 47486: {'lr': 3.134268036942778e-06, 'samples': 24313344, 'steps': 47486, 'batch_loss/train': 0.7017998951487243}
12/29/2021 05:08:12 - INFO - codeparrot_training - Step 47487: {'lr': 3.1317802997704316e-06, 'samples': 24313856, 'steps': 47487, 'batch_loss/train': 0.6414475687779486}
12/29/2021 05:08:24 - INFO - codeparrot_training - Step 47488: {'lr': 3.129293544052947e-06, 'samples': 24314368, 'steps': 47488, 'batch_loss/train': 0.6384420916438103}
12/29/2021 05:08:35 - INFO - codeparrot_training - Step 47489: {'lr': 3.1268077698001497e-06, 'samples': 24314880, 'steps': 47489, 'batch_loss/train': 0.7039547557942569}
12/29/2021 05:08:46 - INFO - codeparrot_training - Step 47490: {'lr': 3.1243229770219484e-06, 'samples': 24315392, 'steps': 47490, 'batch_loss/train': 0.6415463690645993}
12/29/2021 05:08:59 - INFO - codeparrot_training - Step 47491: {'lr': 3.1218391657282243e-06, 'samples': 24315904, 'steps': 47491, 'batch_loss/train': 0.7908311244100332}
12/29/2021 05:09:10 - INFO - codeparrot_training - Step 47492: {'lr': 3.119356335928858e-06, 'samples': 24316416, 'steps': 47492, 'batch_loss/train': 0.7718366426415741}
12/29/2021 05:09:21 - INFO - codeparrot_training - Step 47493: {'lr': 3.116874487633703e-06, 'samples': 24316928, 'steps': 47493, 'batch_loss/train': 0.6573979163076729}
12/29/2021 05:09:31 - INFO - codeparrot_training - Step 47494: {'lr': 3.1143936208526402e-06, 'samples': 24317440, 'steps': 47494, 'batch_loss/train': 0.7542474246583879}
12/29/2021 05:09:43 - INFO - codeparrot_training - Step 47495: {'lr': 3.1119137355955507e-06, 'samples': 24317952, 'steps': 47495, 'batch_loss/train': 0.7248191959224641}
12/29/2021 05:09:54 - INFO - codeparrot_training - Step 47496: {'lr': 3.1094348318722322e-06, 'samples': 24318464, 'steps': 47496, 'batch_loss/train': 0.6599957786384039}
12/29/2021 05:10:05 - INFO - codeparrot_training - Step 47497: {'lr': 3.106956909692621e-06, 'samples': 24318976, 'steps': 47497, 'batch_loss/train': 1.0728646079078317}
12/29/2021 05:10:17 - INFO - codeparrot_training - Step 47498: {'lr': 3.104479969066487e-06, 'samples': 24319488, 'steps': 47498, 'batch_loss/train': 0.6726429772097617}
12/29/2021 05:10:27 - INFO - codeparrot_training - Step 47499: {'lr': 3.1020040100037393e-06, 'samples': 24320000, 'steps': 47499, 'batch_loss/train': 0.7271024992223829}
12/29/2021 05:10:27 - INFO - codeparrot_training - Evaluating and saving model checkpoint
12/29/2021 05:13:49 - INFO - codeparrot_training - Step 47500: {'loss/eval': 0.7404438257217407, 'perplexity': 2.0968658924102783}
12/29/2021 05:14:08 - WARNING - huggingface_hub.repository - Several commits (17) will be pushed upstream.
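The evaluation entry at step 47500 logs both an eval loss and a perplexity. For a causal language model, perplexity is the exponential of the mean cross-entropy loss, which the two logged values satisfy; a minimal sketch verifying that relationship (the variable names here are illustrative, not from the training script):

```python
import math

# Values copied from the step-47500 evaluation entry in the log.
eval_loss = 0.7404438257217407
logged_perplexity = 2.0968658924102783

# Perplexity of a causal LM = exp(mean cross-entropy loss).
perplexity = math.exp(eval_loss)

# Matches the logged value up to float32 rounding.
assert abs(perplexity - logged_perplexity) < 1e-6
```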
12/29/2021 05:14:21 - INFO - codeparrot_training - Step 47500: {'lr': 3.0995290325141757e-06, 'samples': 24320512, 'steps': 47500, 'batch_loss/train': 0.7142994897440076}
12/29/2021 05:14:35 - INFO - codeparrot_training - Step 47501: {'lr': 3.0970550366076766e-06, 'samples': 24321024, 'steps': 47501, 'batch_loss/train': 0.7185069830156863}
12/29/2021 05:14:46 - INFO - codeparrot_training - Step 47502: {'lr': 3.094582022294068e-06, 'samples': 24321536, 'steps': 47502, 'batch_loss/train': 0.7865669149905443}
12/29/2021 05:14:56 - INFO - codeparrot_training - Step 47503: {'lr': 3.092109989583147e-06, 'samples': 24322048, 'steps': 47503, 'batch_loss/train': 0.5161740591283888}
12/29/2021 05:15:07 - INFO - codeparrot_training - Step 47504: {'lr': 3.0896389384847955e-06, 'samples': 24322560, 'steps': 47504, 'batch_loss/train': 0.7239033812074922}
12/29/2021 05:15:19 - INFO - codeparrot_training - Step 47505: {'lr': 3.0871688690087828e-06, 'samples': 24323072, 'steps': 47505, 'batch_loss/train': 0.8683047229424119}
12/29/2021 05:15:30 - INFO - codeparrot_training - Step 47506: {'lr': 3.0846997811649625e-06, 'samples': 24323584, 'steps': 47506, 'batch_loss/train': 0.795848541893065}
12/29/2021 05:15:41 - INFO - codeparrot_training - Step 47507: {'lr': 3.0822316749631597e-06, 'samples': 24324096, 'steps': 47507, 'batch_loss/train': 0.6930774739012122}
12/29/2021 05:15:55 - INFO - codeparrot_training - Step 47508: {'lr': 3.079764550413172e-06, 'samples': 24324608, 'steps': 47508, 'batch_loss/train': 0.9589014956727624}
12/29/2021 05:16:05 - INFO - codeparrot_training - Step 47509: {'lr': 3.077298407524798e-06, 'samples': 24325120, 'steps': 47509, 'batch_loss/train': 0.6889746403321624}
12/29/2021 05:16:16 - INFO - codeparrot_training - Step 47510: {'lr': 3.0748332463078345e-06, 'samples': 24325632, 'steps': 47510, 'batch_loss/train': 0.7055857053492218}
12/29/2021 05:16:28 - INFO - codeparrot_training - Step 47511: {'lr': 3.0723690667721347e-06, 'samples': 24326144, 'steps': 47511, 'batch_loss/train': 0.7254427324514836}
12/29/2021 05:16:39 - INFO - codeparrot_training - Step 47512: {'lr': 3.069905868927414e-06, 'samples': 24326656, 'steps': 47512, 'batch_loss/train': 0.7203088980168104}
12/29/2021 05:16:49 - INFO - codeparrot_training - Step 47513: {'lr': 3.0674436527835525e-06, 'samples': 24327168, 'steps': 47513, 'batch_loss/train': 0.6592502947896719}
12/29/2021 05:17:00 - INFO - codeparrot_training - Step 47514: {'lr': 3.064982418350293e-06, 'samples': 24327680, 'steps': 47514, 'batch_loss/train': 1.3359698895365}
12/29/2021 05:17:12 - INFO - codeparrot_training - Step 47515: {'lr': 3.0625221656374047e-06, 'samples': 24328192, 'steps': 47515, 'batch_loss/train': 0.7179236770607531}
12/29/2021 05:17:23 - INFO - codeparrot_training - Step 47516: {'lr': 3.060062894654686e-06, 'samples': 24328704, 'steps': 47516, 'batch_loss/train': 0.7161964094266295}
12/29/2021 05:17:33 - INFO - codeparrot_training - Step 47517: {'lr': 3.0576046054119344e-06, 'samples': 24329216, 'steps': 47517, 'batch_loss/train': 0.7144722617231309}
12/29/2021 05:17:45 - INFO - codeparrot_training - Step 47518: {'lr': 3.055147297918892e-06, 'samples': 24329728, 'steps': 47518, 'batch_loss/train': 0.712045188061893}
12/29/2021 05:17:56 - INFO - codeparrot_training - Step 47519: {'lr': 3.052690972185357e-06, 'samples': 24330240, 'steps': 47519, 'batch_loss/train': 0.799641733057797}
12/29/2021 05:18:07 - INFO - codeparrot_training - Step 47520: {'lr': 3.0502356282210706e-06, 'samples': 24330752, 'steps': 47520, 'batch_loss/train': 0.5674862313026097}
12/29/2021 05:18:21 - INFO - codeparrot_training - Step 47521: {'lr': 3.047781266035776e-06, 'samples': 24331264, 'steps': 47521, 'batch_loss/train': 0.8363626040518284}
12/29/2021 05:18:32 - INFO - codeparrot_training - Step 47522: {'lr': 3.0453278856392987e-06, 'samples': 24331776, 'steps': 47522, 'batch_loss/train': 0.7164649190381169}
12/29/2021 05:18:42 - INFO - codeparrot_training - Step 47523: {'lr': 3.0428754870413244e-06, 'samples': 24332288, 'steps': 47523, 'batch_loss/train': 0.7283314187079668}
12/29/2021 05:18:53 - INFO - codeparrot_training - Step 47524: {'lr': 3.040424070251652e-06, 'samples': 24332800, 'steps': 47524, 'batch_loss/train': 0.5454908024985343}
12/29/2021 05:19:05 - INFO - codeparrot_training - Step 47525: {'lr': 3.0379736352799945e-06, 'samples': 24333312, 'steps': 47525, 'batch_loss/train': 0.7206964856013656}
12/29/2021 05:19:16 - INFO - codeparrot_training - Step 47526: {'lr': 3.0355241821360956e-06, 'samples': 24333824, 'steps': 47526, 'batch_loss/train': 0.6414483650587499}
12/29/2021 05:19:27 - INFO - codeparrot_training - Step 47527: {'lr': 3.0330757108296967e-06, 'samples': 24334336, 'steps': 47527, 'batch_loss/train': 0.7219134867191315}
12/29/2021 05:19:40 - INFO - codeparrot_training - Step 47528: {'lr': 3.030628221370568e-06, 'samples': 24334848, 'steps': 47528, 'batch_loss/train': 0.7134045243728906}
12/29/2021 05:19:51 - INFO - codeparrot_training - Step 47529: {'lr': 3.0281817137683967e-06, 'samples': 24335360, 'steps': 47529, 'batch_loss/train': 0.7298453412950039}
12/29/2021 05:20:02 - INFO - codeparrot_training - Step 47530: {'lr': 3.025736188032896e-06, 'samples': 24335872, 'steps': 47530, 'batch_loss/train': 0.7523774793371558}
12/29/2021 05:20:14 - INFO - codeparrot_training - Step 47531: {'lr': 3.0232916441738367e-06, 'samples': 24336384, 'steps': 47531, 'batch_loss/train': 0.7076944727450609}
12/29/2021 05:20:25 - INFO - codeparrot_training - Step 47532: {'lr': 3.0208480822009332e-06, 'samples': 24336896, 'steps': 47532, 'batch_loss/train': 0.7527461764402688}
12/29/2021 05:20:35 - INFO - codeparrot_training - Step 47533: {'lr': 3.0184055021238443e-06, 'samples': 24337408, 'steps': 47533, 'batch_loss/train': 0.7105735270306468}
12/29/2021 05:20:47 - INFO - codeparrot_training - Step 47534: {'lr': 3.015963903952368e-06, 'samples': 24337920, 'steps': 47534, 'batch_loss/train': 0.8052504323422909}
12/29/2021 05:20:58 - INFO - codeparrot_training - Step 47535: {'lr': 3.0135232876961073e-06, 'samples': 24338432, 'steps': 47535, 'batch_loss/train': 0.7053061301121488}
12/29/2021 05:21:09 - INFO - codeparrot_training - Step 47536: {'lr': 3.011083653364832e-06, 'samples': 24338944, 'steps': 47536, 'batch_loss/train': 0.6767298295162618}
12/29/2021 05:21:19 - INFO - codeparrot_training - Step 47537: {'lr': 3.0086450009682574e-06, 'samples': 24339456, 'steps': 47537, 'batch_loss/train': 0.7511524488218129}
12/29/2021 05:21:33 - INFO - codeparrot_training - Step 47538: {'lr': 3.0062073305160144e-06, 'samples': 24339968, 'steps': 47538, 'batch_loss/train': 0.7240196322090924}
12/29/2021 05:21:44 - INFO - codeparrot_training - Step 47539: {'lr': 3.0037706420178444e-06, 'samples': 24340480, 'steps': 47539, 'batch_loss/train': 0.7757140975445509}
12/29/2021 05:21:54 - INFO - codeparrot_training - Step 47540: {'lr': 3.0013349354834353e-06, 'samples': 24340992, 'steps': 47540, 'batch_loss/train': 0.6689399461029097}
12/29/2021 05:22:06 - INFO - codeparrot_training - Step 47541: {'lr': 2.998900210922445e-06, 'samples': 24341504, 'steps': 47541, 'batch_loss/train': 0.6913521122187376}
12/29/2021 05:22:17 - INFO - codeparrot_training - Step 47542: {'lr': 2.9964664683445607e-06, 'samples': 24342016, 'steps': 47542, 'batch_loss/train': 0.7930256607942283}
12/29/2021 05:22:28 - INFO - codeparrot_training - Step 47543: {'lr': 2.9940337077594693e-06, 'samples': 24342528, 'steps': 47543, 'batch_loss/train': 0.6803514705970883}
12/29/2021 05:22:40 - INFO - codeparrot_training - Step 47544: {'lr': 2.991601929176829e-06, 'samples': 24343040, 'steps': 47544, 'batch_loss/train': 0.712479799054563}
12/29/2021 05:22:50 - INFO - codeparrot_training - Step 47545: {'lr': 2.9891711326063e-06, 'samples': 24343552, 'steps': 47545, 'batch_loss/train': 0.5496618575416505}
12/29/2021 05:23:01 - INFO - codeparrot_training - Step 47546: {'lr': 2.9867413180575676e-06, 'samples': 24344064, 'steps': 47546, 'batch_loss/train': 0.7113267029635608}
12/29/2021 05:23:12 - INFO - codeparrot_training - Step 47547: {'lr': 2.9843124855402916e-06, 'samples': 24344576, 'steps': 47547, 'batch_loss/train': 0.7492923235986382}
12/29/2021 05:23:26 - INFO - codeparrot_training - Step 47548: {'lr': 2.9818846350641027e-06, 'samples': 24345088, 'steps': 47548, 'batch_loss/train': 0.7225815574056469}
12/29/2021 05:23:36 - INFO - codeparrot_training - Step 47549: {'lr': 2.9794577666386877e-06, 'samples': 24345600, 'steps': 47549, 'batch_loss/train': 0.7759031439200044}
12/29/2021 05:23:47 - INFO - codeparrot_training - Step 47550: {'lr': 2.9770318802736785e-06, 'samples': 24346112, 'steps': 47550, 'batch_loss/train': 0.7414350416511297}
12/29/2021 05:23:59 - INFO - codeparrot_training - Step 47551: {'lr': 2.9746069759787053e-06, 'samples': 24346624, 'steps': 47551, 'batch_loss/train': 0.7995961112901568}
12/29/2021 05:24:10 - INFO - codeparrot_training - Step 47552: {'lr': 2.9721830537634278e-06, 'samples': 24347136, 'steps': 47552, 'batch_loss/train': 0.768515539704822}
12/29/2021 05:24:20 - INFO - codeparrot_training - Step 47553: {'lr': 2.9697601136374486e-06, 'samples': 24347648, 'steps': 47553, 'batch_loss/train': 0.7128591237124056}
12/29/2021 05:24:32 - INFO - codeparrot_training - Step 47554: {'lr': 2.967338155610483e-06, 'samples': 24348160, 'steps': 47554, 'batch_loss/train': 1.51214156486094}
12/29/2021 05:24:43 - INFO - codeparrot_training - Step 47555: {'lr': 2.9649171796920784e-06, 'samples': 24348672, 'steps': 47555, 'batch_loss/train': 0.6563785686157644}
12/29/2021 05:24:54 - INFO - codeparrot_training - Step 47556: {'lr': 2.962497185891866e-06, 'samples': 24349184, 'steps': 47556, 'batch_loss/train': 0.6941132512874901}
12/29/2021 05:25:04 - INFO - codeparrot_training - Step 47557: {'lr': 2.960078174219505e-06, 'samples': 24349696, 'steps': 47557, 'batch_loss/train': 0.6421837820671499}
12/29/2021 05:25:16 - INFO - codeparrot_training - Step 47558: {'lr': 2.9576601446845986e-06, 'samples': 24350208, 'steps': 47558, 'batch_loss/train': 0.6534938751719892}
12/29/2021 05:25:27 - INFO - codeparrot_training - Step 47559: {'lr': 2.9552430972967504e-06, 'samples': 24350720, 'steps': 47559, 'batch_loss/train': 0.6902362452819943}
12/29/2021 05:25:38 - INFO - codeparrot_training - Step 47560: {'lr': 2.9528270320655914e-06, 'samples': 24351232, 'steps': 47560, 'batch_loss/train': 0.7508368259295821}
12/29/2021 05:25:51 - INFO - codeparrot_training - Step 47561: {'lr': 2.9504119490006698e-06, 'samples': 24351744, 'steps': 47561, 'batch_loss/train': 0.6900778498966247}
12/29/2021 05:26:02 - INFO - codeparrot_training - Step 47562: {'lr': 2.947997848111672e-06, 'samples': 24352256, 'steps': 47562, 'batch_loss/train': 0.7777202734723687}
12/29/2021 05:26:13 - INFO - codeparrot_training - Step 47563: {'lr': 2.945584729408146e-06, 'samples': 24352768, 'steps': 47563, 'batch_loss/train': 0.7027611620724201}
12/29/2021 05:26:25 - INFO - codeparrot_training - Step 47564: {'lr': 2.9431725928996677e-06, 'samples': 24353280, 'steps': 47564, 'batch_loss/train': 0.6615166908595711}
12/29/2021 05:26:35 - INFO - codeparrot_training - Step 47565: {'lr': 2.94076143859584e-06, 'samples': 24353792, 'steps': 47565, 'batch_loss/train': 0.7248236564919353}
12/29/2021 05:26:46 - INFO - codeparrot_training - Step 47566: {'lr': 2.938351266506295e-06, 'samples': 24354304, 'steps': 47566, 'batch_loss/train': 0.6819373071193695}
12/29/2021 05:26:57 - INFO - codeparrot_training - Step 47567: {'lr': 2.93594207664058e-06, 'samples': 24354816, 'steps': 47567, 'batch_loss/train': 0.7767293106298894}
12/29/2021 05:27:10 - INFO - codeparrot_training - Step 47568: {'lr': 2.9335338690082424e-06, 'samples': 24355328, 'steps': 47568, 'batch_loss/train': 0.7146223345771432}
12/29/2021 05:27:21 - INFO - codeparrot_training - Step 47569: {'lr': 2.9311266436189142e-06, 'samples': 24355840, 'steps': 47569, 'batch_loss/train': 0.7852404867298901}
12/29/2021 05:27:32 - INFO - codeparrot_training - Step 47570: {'lr': 2.9287204004821154e-06, 'samples': 24356352, 'steps': 47570, 'batch_loss/train': 0.7412413572892547}
12/29/2021 05:27:44 - INFO - codeparrot_training - Step 47571: {'lr': 2.9263151396074215e-06, 'samples': 24356864, 'steps': 47571, 'batch_loss/train': 0.6066284934058785}
12/29/2021 05:27:55 - INFO - codeparrot_training - Step 47572: {'lr': 2.9239108610044363e-06, 'samples': 24357376, 'steps': 47572, 'batch_loss/train': 0.7324116248637438}
12/29/2021 05:28:05 - INFO - codeparrot_training - Step 47573: {'lr': 2.9215075646826796e-06, 'samples': 24357888, 'steps': 47573, 'batch_loss/train': 0.805313247255981}
12/29/2021 05:28:17 - INFO - codeparrot_training - Step 47574: {'lr': 2.919105250651699e-06, 'samples': 24358400, 'steps': 47574, 'batch_loss/train': 0.7013810598291457}
12/29/2021 05:28:28 - INFO - codeparrot_training - Step 47575: {'lr': 2.916703918921071e-06, 'samples': 24358912, 'steps': 47575, 'batch_loss/train': 0.6886053271591663}
12/29/2021 05:28:38 - INFO - codeparrot_training - Step 47576: {'lr': 2.9143035695003427e-06, 'samples': 24359424, 'steps': 47576, 'batch_loss/train': 0.7911646105349064}
12/29/2021 05:28:51 - INFO - codeparrot_training - Step 47577: {'lr': 2.911904202399035e-06, 'samples': 24359936, 'steps': 47577, 'batch_loss/train': 0.6676319246180356}
12/29/2021 05:29:02 - INFO - codeparrot_training - Step 47578: {'lr': 2.909505817626723e-06, 'samples': 24360448, 'steps': 47578, 'batch_loss/train': 0.8029229594394565}
12/29/2021 05:29:13 - INFO - codeparrot_training - Step 47579: {'lr': 2.9071084151928993e-06, 'samples': 24360960, 'steps': 47579, 'batch_loss/train': 0.7217835438204929}
12/29/2021 05:29:23 - INFO - codeparrot_training - Step 47580: {'lr': 2.90471199510714e-06, 'samples': 24361472, 'steps': 47580, 'batch_loss/train': 0.6408159178681672}
12/29/2021 05:29:36 - INFO - codeparrot_training - Step 47581: {'lr': 2.902316557378909e-06, 'samples': 24361984, 'steps': 47581, 'batch_loss/train': 0.6327755721285939}
12/29/2021 05:29:46 - INFO - codeparrot_training - Step 47582: {'lr': 2.8999221020177825e-06, 'samples': 24362496, 'steps': 47582, 'batch_loss/train': 0.7237575743347406}
12/29/2021 05:29:57 - INFO - codeparrot_training - Step 47583: {'lr': 2.8975286290332526e-06, 'samples': 24363008, 'steps': 47583, 'batch_loss/train': 0.6425970960408449}
12/29/2021 05:30:10 - INFO - codeparrot_training - Step 47584: {'lr': 2.895136138434867e-06, 'samples': 24363520, 'steps': 47584, 'batch_loss/train': 0.655895286356099}
12/29/2021 05:30:20 - INFO - codeparrot_training - Step 47585: {'lr': 2.8927446302321194e-06, 'samples': 24364032, 'steps': 47585, 'batch_loss/train': 0.6426265079062432}
12/29/2021 05:30:31 - INFO - codeparrot_training - Step 47586: {'lr': 2.8903541044345006e-06, 'samples': 24364544, 'steps': 47586, 'batch_loss/train': 0.7029325887560844}
12/29/2021 05:30:43 - INFO - codeparrot_training - Step 47587: {'lr': 2.8879645610515316e-06, 'samples': 24365056, 'steps': 47587, 'batch_loss/train': 0.7005817154422402}
12/29/2021 05:30:54 - INFO - codeparrot_training - Step 47588: {'lr': 2.8855760000927045e-06, 'samples': 24365568, 'steps': 47588, 'batch_loss/train': 0.6343494607135653}
12/29/2021 05:31:04 - INFO - codeparrot_training - Step 47589: {'lr': 2.8831884215675396e-06, 'samples': 24366080, 'steps': 47589, 'batch_loss/train': 0.6758362301625311}
12/29/2021 05:31:15 - INFO - codeparrot_training - Step 47590: {'lr': 2.880801825485474e-06, 'samples': 24366592, 'steps': 47590, 'batch_loss/train': 0.6774015170522034}
12/29/2021 05:31:27 - INFO - codeparrot_training - Step 47591: {'lr': 2.8784162118560554e-06, 'samples': 24367104, 'steps': 47591, 'batch_loss/train': 0.6734797954559326}
12/29/2021 05:31:38 - INFO - codeparrot_training - Step 47592: {'lr': 2.876031580688748e-06, 'samples': 24367616, 'steps': 47592, 'batch_loss/train': 0.6646332318487111}
12/29/2021 05:31:48 - INFO - codeparrot_training - Step 47593: {'lr': 2.8736479319930176e-06, 'samples': 24368128, 'steps': 47593, 'batch_loss/train': 0.7130439332686365}
12/29/2021 05:32:00 - INFO - codeparrot_training - Step 47594: {'lr': 2.8712652657783556e-06, 'samples': 24368640, 'steps': 47594, 'batch_loss/train': 0.6969249453395605}
12/29/2021 05:32:11 - INFO - codeparrot_training - Step 47595: {'lr': 2.8688835820542268e-06, 'samples': 24369152, 'steps': 47595, 'batch_loss/train': 0.6873803690541536}
12/29/2021 05:32:22 - INFO - codeparrot_training - Step 47596: {'lr': 2.866502880830124e-06, 'samples': 24369664, 'steps': 47596, 'batch_loss/train': 0.7662013052031398}
12/29/2021 05:32:35 - INFO - codeparrot_training - Step 47597: {'lr': 2.8641231621154563e-06, 'samples': 24370176, 'steps': 47597, 'batch_loss/train': 0.7173230298794806}
12/29/2021 05:32:46 - INFO - codeparrot_training - Step 47598: {'lr': 2.8617444259197433e-06, 'samples': 24370688, 'steps': 47598, 'batch_loss/train': 0.8044251287356019}
12/29/2021 05:32:56 - INFO - codeparrot_training - Step 47599: {'lr': 2.8593666722524225e-06, 'samples': 24371200, 'steps': 47599, 'batch_loss/train': 0.4655052630114369}
12/29/2021 05:33:07 - INFO - codeparrot_training - Step 47600: {'lr': 2.8569899011229308e-06, 'samples': 24371712, 'steps': 47600, 'batch_loss/train': 0.7130703059956431}
12/29/2021 05:33:19 - INFO - codeparrot_training - Step 47601: {'lr': 2.854614112540732e-06, 'samples': 24372224, 'steps': 47601, 'batch_loss/train': 0.8088556858710945}
12/29/2021 05:33:30 - INFO - codeparrot_training - Step 47602: {'lr': 2.8522393065152917e-06, 'samples': 24372736, 'steps': 47602, 'batch_loss/train': 0.7219401078764349}
12/29/2021 05:33:40 - INFO - codeparrot_training - Step 47603: {'lr': 2.8498654830559913e-06, 'samples': 24373248, 'steps': 47603, 'batch_loss/train': 0.7552243331447244}
12/29/2021 05:33:52 - INFO - codeparrot_training - Step 47604: {'lr': 2.84749264217235e-06, 'samples': 24373760, 'steps': 47604, 'batch_loss/train': 0.6813800157979131}
12/29/2021 05:34:03 - INFO - codeparrot_training - Step 47605: {'lr': 2.8451207838737227e-06, 'samples': 24374272, 'steps': 47605, 'batch_loss/train': 0.5835721300682053}
12/29/2021 05:34:14 - INFO - codeparrot_training - Step 47606: {'lr': 2.8427499081695727e-06, 'samples': 24374784, 'steps': 47606, 'batch_loss/train': 0.7135355175705627}
12/29/2021 05:34:26 - INFO - codeparrot_training - Step 47607: {'lr': 2.840380015069338e-06, 'samples': 24375296, 'steps': 47607, 'batch_loss/train': 0.545073562010657}
12/29/2021 05:34:37 - INFO - codeparrot_training - Step 47608: {'lr': 2.8380111045824277e-06, 'samples': 24375808, 'steps': 47608, 'batch_loss/train': 0.6871775141917169}
12/29/2021 05:34:48 - INFO - codeparrot_training - Step 47609: {'lr': 2.83564317671825e-06, 'samples': 24376320, 'steps': 47609, 'batch_loss/train': 0.7157067721709609}
12/29/2021 05:35:00 - INFO - codeparrot_training - Step 47610: {'lr': 2.8332762314862426e-06, 'samples': 24376832, 'steps': 47610, 'batch_loss/train': 0.6660990985110402}
12/29/2021 05:35:10 - INFO - codeparrot_training - Step 47611: {'lr': 2.8309102688957867e-06, 'samples': 24377344, 'steps': 47611, 'batch_loss/train': 0.6684398036450148}
12/29/2021 05:35:21 - INFO - codeparrot_training - Step 47612: {'lr': 2.828545288956319e-06, 'samples': 24377856, 'steps': 47612, 'batch_loss/train': 0.730004059150815}
12/29/2021 05:35:32 - INFO - codeparrot_training - Step 47613: {'lr': 2.826181291677221e-06, 'samples': 24378368, 'steps': 47613, 'batch_loss/train': 0.6507564564235508}
12/29/2021 05:35:44 - INFO - codeparrot_training - Step 47614: {'lr': 2.823818277067902e-06, 'samples': 24378880, 'steps': 47614, 'batch_loss/train': 0.704701678827405}
12/29/2021 05:35:55 - INFO - codeparrot_training - Step 47615: {'lr': 2.821456245137771e-06, 'samples': 24379392, 'steps': 47615, 'batch_loss/train': 0.6446958649903536}
12/29/2021 05:36:06 - INFO - codeparrot_training - Step 47616: {'lr': 2.819095195896154e-06, 'samples': 24379904, 'steps': 47616, 'batch_loss/train': 0.7165529052726924}
12/29/2021 05:36:18 - INFO - codeparrot_training - Step 47617: {'lr': 2.816735129352488e-06, 'samples': 24380416, 'steps': 47617, 'batch_loss/train': 0.740035904571414}
12/29/2021 05:36:28 - INFO - codeparrot_training - Step 47618: {'lr': 2.8143760455161814e-06, 'samples': 24380928, 'steps': 47618, 'batch_loss/train': 0.7639142920961604}
12/29/2021 05:36:39 - INFO - codeparrot_training - Step 47619: {'lr': 2.8120179443965887e-06, 'samples': 24381440, 'steps': 47619, 'batch_loss/train': 0.7542732837609947}
12/29/2021 05:36:51 - INFO - codeparrot_training - Step 47620: {'lr': 2.8096608260030353e-06, 'samples': 24381952, 'steps': 47620, 'batch_loss/train': 0.7188503104262054}
12/29/2021 05:37:02 - INFO - codeparrot_training - Step 47621: {'lr': 2.8073046903449585e-06, 'samples': 24382464, 'steps': 47621, 'batch_loss/train': 0.6594489415874705}
12/29/2021 05:37:13 - INFO - codeparrot_training - Step 47622: {'lr': 2.804949537431711e-06, 'samples': 24382976, 'steps': 47622, 'batch_loss/train': 0.8239318141713738}
12/29/2021 05:37:23 - INFO - codeparrot_training - Step 47623: {'lr': 2.80259536727262e-06, 'samples': 24383488, 'steps': 47623, 'batch_loss/train': 0.7400148864835501}
12/29/2021 05:37:36 - INFO - codeparrot_training - Step 47624: {'lr': 2.800242179877094e-06, 'samples': 24384000, 'steps': 47624, 'batch_loss/train': 0.6772532882168889}
12/29/2021 05:37:47 - INFO - codeparrot_training - Step 47625: {'lr': 2.7978899752544585e-06, 'samples': 24384512, 'steps': 47625, 'batch_loss/train': 0.667577319778502}
12/29/2021 05:37:57 - INFO - codeparrot_training - Step 47626: {'lr': 2.7955387534140674e-06, 'samples': 24385024, 'steps': 47626, 'batch_loss/train': 0.675511727691628}
12/29/2021 05:38:09 - INFO - codeparrot_training - Step 47627: {'lr': 2.7931885143652747e-06, 'samples': 24385536, 'steps': 47627, 'batch_loss/train': 0.7411810359917581}
12/29/2021 05:38:20 - INFO - codeparrot_training - Step 47628: {'lr': 2.7908392581174334e-06, 'samples': 24386048, 'steps': 47628, 'batch_loss/train': 0.6429246850311756}
12/29/2021 05:38:31 - INFO - codeparrot_training - Step 47629: {'lr': 2.78849098467987e-06, 'samples': 24386560, 'steps': 47629, 'batch_loss/train': 0.6558556822128594}
12/29/2021 05:38:43 - INFO - codeparrot_training - Step 47630: {'lr': 2.786143694061938e-06, 'samples': 24387072, 'steps': 47630, 'batch_loss/train': 0.5568773967388552}
12/29/2021 05:38:54 - INFO - codeparrot_training - Step 47631: {'lr': 2.7837973862729072e-06, 'samples': 24387584, 'steps': 47631, 'batch_loss/train': 0.6804157891310751}
12/29/2021 05:39:04 - INFO - codeparrot_training - Step 47632: {'lr': 2.7814520613221872e-06, 'samples': 24388096, 'steps': 47632, 'batch_loss/train': 0.7965161171741784}
12/29/2021 05:39:16 - INFO - codeparrot_training - Step 47633: {'lr': 2.779107719219076e-06, 'samples': 24388608, 'steps': 47633, 'batch_loss/train': 0.7072410219116136}
12/29/2021 05:39:27 - INFO - codeparrot_training - Step 47634: {'lr': 2.7767643599728443e-06, 'samples': 24389120, 'steps': 47634, 'batch_loss/train': 0.7364150499925017}
12/29/2021 05:39:37 - INFO - codeparrot_training - Step 47635: {'lr': 2.774421983592901e-06, 'samples': 24389632, 'steps': 47635, 'batch_loss/train': 0.8120554867200553}
12/29/2021 05:39:48 - INFO - codeparrot_training - Step 47636: {'lr': 2.772080590088488e-06, 'samples': 24390144, 'steps': 47636, 'batch_loss/train': 0.7128264459315687}
12/29/2021 05:40:02 - INFO - codeparrot_training - Step 47637: {'lr': 2.7697401794689324e-06, 'samples': 24390656, 'steps': 47637, 'batch_loss/train': 0.49858250422403216}
12/29/2021 05:40:12 - INFO - codeparrot_training - Step 47638: {'lr': 2.7674007517435317e-06, 'samples': 24391168, 'steps': 47638, 'batch_loss/train': 0.6818771832622588}
12/29/2021 05:40:23 - INFO - codeparrot_training - Step 47639: {'lr': 2.765062306921584e-06, 'samples': 24391680, 'steps': 47639, 'batch_loss/train': 0.6963473975192755}
12/29/2021 05:40:35 - INFO - codeparrot_training - Step 47640: {'lr': 2.7627248450124154e-06, 'samples': 24392192, 'steps': 47640, 'batch_loss/train': 0.8828220115974545}
12/29/2021 05:40:46 - INFO - codeparrot_training - Step 47641: {'lr': 2.760388366025296e-06, 'samples': 24392704, 'steps': 47641, 'batch_loss/train': 0.7388540196698159}
12/29/2021 05:40:57 - INFO - codeparrot_training - Step 47642: {'lr': 2.758052869969496e-06, 'samples': 24393216, 'steps': 47642, 'batch_loss/train': 0.7158956532366574}
12/29/2021 05:41:09 - INFO - codeparrot_training - Step 47643: {'lr': 2.7557183568543422e-06, 'samples': 24393728, 'steps': 47643, 'batch_loss/train': 0.5124149362090975}
12/29/2021 05:41:20 - INFO - codeparrot_training - Step 47644: {'lr': 2.7533848266891316e-06, 'samples': 24394240, 'steps': 47644, 'batch_loss/train': 0.7286157519556582}
12/29/2021 05:41:31 - INFO - codeparrot_training - Step 47645: {'lr': 2.751052279483052e-06, 'samples': 24394752, 'steps': 47645, 'batch_loss/train': 0.7098006107844412}
12/29/2021 05:41:41 - INFO - codeparrot_training - Step 47646: {'lr': 2.7487207152454565e-06, 'samples': 24395264, 'steps': 47646, 'batch_loss/train': 0.7338326014578342}
12/29/2021 05:41:54 - INFO - codeparrot_training - Step 47647: {'lr': 2.746390133985588e-06, 'samples': 24395776, 'steps': 47647, 'batch_loss/train': 0.6624634996987879}
12/29/2021 05:42:05 - INFO - codeparrot_training - Step 47648: {'lr': 2.744060535712717e-06, 'samples': 24396288, 'steps': 47648, 'batch_loss/train': 0.643579167779535}
12/29/2021 05:42:16 - INFO - codeparrot_training - Step 47649: {'lr': 2.741731920436058e-06, 'samples': 24396800, 'steps': 47649, 'batch_loss/train': 0.6745621864683926}
12/29/2021 05:42:28 - INFO - codeparrot_training - Step 47650: {'lr': 2.739404288164965e-06, 'samples': 24397312, 'steps': 47650, 'batch_loss/train': 1.4278827114030719}
12/29/2021 05:42:39 - INFO - codeparrot_training - Step 47651: {'lr': 2.737077638908625e-06, 'samples': 24397824, 'steps': 47651, 'batch_loss/train': 1.4574260907247663}
12/29/2021 05:42:49 - INFO - codeparrot_training - Step 47652: {'lr': 2.7347519726763083e-06, 'samples': 24398336, 'steps': 47652, 'batch_loss/train': 0.7458389014936984}
12/29/2021 05:43:00 - INFO - codeparrot_training - Step 47653: {'lr': 2.7324272894772295e-06, 'samples': 24398848, 'steps': 47653, 'batch_loss/train': 0.6280627138912678}
12/29/2021 05:43:13 - INFO - codeparrot_training - Step 47654: {'lr': 2.7301035893206593e-06, 'samples': 24399360, 'steps': 47654, 'batch_loss/train': 0.8101213546469808}
12/29/2021 05:43:23 - INFO - codeparrot_training - Step 47655: {'lr': 2.7277808722158405e-06, 'samples': 24399872, 'steps': 47655, 'batch_loss/train': 0.8412392577156425}
12/29/2021 05:43:34 - INFO - codeparrot_training - Step 47656: {'lr': 2.7254591381720152e-06, 'samples': 24400384, 'steps': 47656, 'batch_loss/train': 0.6545535447075963}
12/29/2021 05:43:46 - INFO - codeparrot_training - Step 47657: {'lr': 2.7231383871983707e-06, 'samples': 24400896, 'steps': 47657, 'batch_loss/train': 0.6051602603401989}
12/29/2021 05:43:57 - INFO - codeparrot_training - Step 47658: {'lr': 2.7208186193041494e-06, 'samples': 24401408, 'steps': 47658, 'batch_loss/train': 0.7897325940430164}
12/29/2021 05:44:07 - INFO - codeparrot_training - Step 47659: {'lr': 2.7184998344986224e-06, 'samples': 24401920, 'steps': 47659, 'batch_loss/train': 0.6984283630445134}
12/29/2021 05:44:20 - INFO - codeparrot_training - Step 47660: {'lr': 2.716182032790948e-06, 'samples': 24402432, 'steps': 47660, 'batch_loss/train': 0.6879128087311983}
12/29/2021 05:44:31 - INFO - codeparrot_training - Step 47661: {'lr': 2.7138652141903696e-06, 'samples': 24402944, 'steps': 47661, 'batch_loss/train': 0.7455329755321145}
12/29/2021 05:44:41 - INFO - codeparrot_training - Step 47662: {'lr': 2.711549378706102e-06, 'samples': 24403456, 'steps': 47662, 'batch_loss/train': 0.6691089926753193}
12/29/2021 05:44:52 - INFO - codeparrot_training - Step 47663: {'lr': 2.7092345263473317e-06, 'samples': 24403968, 'steps': 47663, 'batch_loss/train': 0.7134580994024873}
12/29/2021 05:45:05 - INFO - codeparrot_training - Step 47664: {'lr': 2.7069206571232464e-06, 'samples': 24404480, 'steps': 47664, 'batch_loss/train': 0.5992817102523986}
12/29/2021 05:45:15 - INFO - codeparrot_training - Step 47665: {'lr': 2.704607771043116e-06, 'samples': 24404992, 'steps': 47665, 'batch_loss/train': 1.2940046517178416}
12/29/2021 05:45:26 - INFO - codeparrot_training - Step 47666: {'lr': 2.702295868116045e-06, 'samples': 24405504, 'steps': 47666, 'batch_loss/train': 0.6928439093753695}
12/29/2021 05:45:38 - INFO - codeparrot_training - Step 47667: {'lr': 2.6999849483513027e-06, 'samples': 24406016, 'steps': 47667, 'batch_loss/train': 0.7364609576761723}
12/29/2021 05:45:49 - INFO - codeparrot_training - Step 47668: {'lr': 2.6976750117580218e-06, 'samples': 24406528, 'steps': 47668, 'batch_loss/train': 0.7101045162416995}
12/29/2021 05:45:59 - INFO - codeparrot_training - Step 47669: {'lr': 2.6953660583454166e-06, 'samples': 24407040, 'steps': 47669, 'batch_loss/train': 0.691777684725821}
12/29/2021 05:46:11 - INFO - codeparrot_training - Step 47670: {'lr': 2.693058088122674e-06, 'samples': 24407552, 'steps': 47670, 'batch_loss/train': 0.5931711997836828}
12/29/2021 05:46:22 - INFO - codeparrot_training - Step 47671: {'lr': 2.690751101098926e-06, 'samples': 24408064, 'steps': 47671, 'batch_loss/train': 0.6356811947189271}
12/29/2021 05:46:33 - INFO - codeparrot_training - Step 47672: {'lr': 2.6884450972833875e-06, 'samples': 24408576, 'steps': 47672, 'batch_loss/train': 0.7824193052947521}
12/29/2021 05:46:43 - INFO - codeparrot_training - Step 47673: {'lr': 2.686140076685217e-06, 'samples': 24409088, 'steps': 47673, 'batch_loss/train': 0.7002097070217133}
12/29/2021 05:46:56 - INFO - codeparrot_training - Step 47674: {'lr': 2.6838360393135475e-06, 'samples': 24409600, 'steps': 47674, 'batch_loss/train': 0.7601842237636447}
12/29/2021 05:47:07 - INFO - codeparrot_training - Step 47675: {'lr': 2.681532985177565e-06, 'samples': 24410112, 'steps': 47675, 'batch_loss/train': 0.7213630815385841}
12/29/2021 05:47:17 - INFO - codeparrot_training - Step 47676: {'lr': 2.6792309142864566e-06, 'samples': 24410624, 'steps': 47676, 'batch_loss/train': 0.716326053137891}
12/29/2021 05:47:29 - INFO - codeparrot_training - Step 47677: {'lr': 2.676929826649299e-06, 'samples': 24411136, 'steps': 47677, 'batch_loss/train': 0.7314335932023823}
12/29/2021 05:47:40 - INFO - codeparrot_training - Step 47678: {'lr': 2.6746297222753067e-06, 'samples': 24411648, 'steps': 47678, 'batch_loss/train': 0.6779692266136408}
12/29/2021 05:47:51 - INFO - codeparrot_training - Step 47679: {'lr': 2.672330601173584e-06, 'samples': 24412160, 'steps': 47679, 'batch_loss/train': 0.6458872593939304}
12/29/2021 05:48:03 - INFO - codeparrot_training - Step 47680: {'lr': 2.670032463353317e-06, 'samples': 24412672, 'steps': 47680, 'batch_loss/train': 0.7357852566055954}
12/29/2021 05:48:13 - INFO - codeparrot_training - Step 47681: {'lr': 2.6677353088235825e-06, 'samples': 24413184, 'steps': 47681, 'batch_loss/train': 0.6947172600775957}
12/29/2021 05:48:24 - INFO - codeparrot_training - Step 47682: {'lr': 2.665439137593567e-06, 'samples': 24413696, 'steps': 47682, 'batch_loss/train': 0.6709090601652861}
12/29/2021 05:48:37 - INFO - codeparrot_training - Step 47683: {'lr': 2.6631439496723477e-06, 'samples': 24414208, 'steps': 47683, 'batch_loss/train': 0.6313450834713876}
12/29/2021 05:48:47 - INFO - codeparrot_training - Step 47684: {'lr': 2.6608497450691106e-06, 'samples': 24414720, 'steps': 47684, 'batch_loss/train': 0.6791051959153265}
12/29/2021 05:48:58 - INFO - codeparrot_training - Step 47685: {'lr': 2.6585565237929323e-06, 'samples': 24415232, 'steps': 47685, 'batch_loss/train': 0.7347337007522583}
12/29/2021 05:49:09 - INFO - codeparrot_training - Step 47686: {'lr': 2.6562642858529162e-06, 'samples': 24415744, 'steps': 47686, 'batch_loss/train': 0.6793211402837187}
12/29/2021 05:49:21 - 
INFO - codeparrot_training - Step 47687: {'lr': 2.6539730312582223e-06, 'samples': 24416256, 'steps': 47687, 'batch_loss/train': 0.544526330602821} 12/29/2021 05:49:31 - INFO - codeparrot_training - Step 47688: {'lr': 2.651682760017954e-06, 'samples': 24416768, 'steps': 47688, 'batch_loss/train': 0.7955884588882327} 12/29/2021 05:49:42 - INFO - codeparrot_training - Step 47689: {'lr': 2.6493934721411593e-06, 'samples': 24417280, 'steps': 47689, 'batch_loss/train': 0.6970440991572104} 12/29/2021 05:49:54 - INFO - codeparrot_training - Step 47690: {'lr': 2.6471051676369982e-06, 'samples': 24417792, 'steps': 47690, 'batch_loss/train': 0.6857084180228412} 12/29/2021 05:50:05 - INFO - codeparrot_training - Step 47691: {'lr': 2.644817846514547e-06, 'samples': 24418304, 'steps': 47691, 'batch_loss/train': 0.7099656816571951} 12/29/2021 05:50:15 - INFO - codeparrot_training - Step 47692: {'lr': 2.6425315087829084e-06, 'samples': 24418816, 'steps': 47692, 'batch_loss/train': 0.7062044506892562} 12/29/2021 05:50:28 - INFO - codeparrot_training - Step 47693: {'lr': 2.640246154451187e-06, 'samples': 24419328, 'steps': 47693, 'batch_loss/train': 0.7461221646517515} 12/29/2021 05:50:39 - INFO - codeparrot_training - Step 47694: {'lr': 2.6379617835284032e-06, 'samples': 24419840, 'steps': 47694, 'batch_loss/train': 0.6903555886819959} 12/29/2021 05:50:49 - INFO - codeparrot_training - Step 47695: {'lr': 2.635678396023716e-06, 'samples': 24420352, 'steps': 47695, 'batch_loss/train': 0.8151137242093682} 12/29/2021 05:51:01 - INFO - codeparrot_training - Step 47696: {'lr': 2.633395991946147e-06, 'samples': 24420864, 'steps': 47696, 'batch_loss/train': 0.6811312790960073} 12/29/2021 05:51:12 - INFO - codeparrot_training - Step 47697: {'lr': 2.631114571304827e-06, 'samples': 24421376, 'steps': 47697, 'batch_loss/train': 0.7902692509815097} 12/29/2021 05:51:23 - INFO - codeparrot_training - Step 47698: {'lr': 2.6288341341087485e-06, 'samples': 24421888, 'steps': 47698, 
'batch_loss/train': 0.7035880438052118} 12/29/2021 05:51:33 - INFO - codeparrot_training - Step 47699: {'lr': 2.6265546803670714e-06, 'samples': 24422400, 'steps': 47699, 'batch_loss/train': 0.6933302120305598} 12/29/2021 05:51:46 - INFO - codeparrot_training - Step 47700: {'lr': 2.6242762100887884e-06, 'samples': 24422912, 'steps': 47700, 'batch_loss/train': 0.7562898825854063} 12/29/2021 05:51:57 - INFO - codeparrot_training - Step 47701: {'lr': 2.6219987232829477e-06, 'samples': 24423424, 'steps': 47701, 'batch_loss/train': 0.7579228293616325} 12/29/2021 05:52:07 - INFO - codeparrot_training - Step 47702: {'lr': 2.619722219958681e-06, 'samples': 24423936, 'steps': 47702, 'batch_loss/train': 0.671227537561208} 12/29/2021 05:52:20 - INFO - codeparrot_training - Step 47703: {'lr': 2.6174467001249536e-06, 'samples': 24424448, 'steps': 47703, 'batch_loss/train': 0.7554085785523057} 12/29/2021 05:52:31 - INFO - codeparrot_training - Step 47704: {'lr': 2.6151721637908686e-06, 'samples': 24424960, 'steps': 47704, 'batch_loss/train': 0.7275888514705002} 12/29/2021 05:52:42 - INFO - codeparrot_training - Step 47705: {'lr': 2.612898610965447e-06, 'samples': 24425472, 'steps': 47705, 'batch_loss/train': 0.8493280680850148} 12/29/2021 05:52:52 - INFO - codeparrot_training - Step 47706: {'lr': 2.6106260416577376e-06, 'samples': 24425984, 'steps': 47706, 'batch_loss/train': 0.9366579549387097} 12/29/2021 05:53:04 - INFO - codeparrot_training - Step 47707: {'lr': 2.6083544558767604e-06, 'samples': 24426496, 'steps': 47707, 'batch_loss/train': 0.8141442453488708} 12/29/2021 05:53:15 - INFO - codeparrot_training - Step 47708: {'lr': 2.6060838536315356e-06, 'samples': 24427008, 'steps': 47708, 'batch_loss/train': 0.6405846420675516} 12/29/2021 05:53:26 - INFO - codeparrot_training - Step 47709: {'lr': 2.60381423493114e-06, 'samples': 24427520, 'steps': 47709, 'batch_loss/train': 0.659193079918623} 12/29/2021 05:53:38 - INFO - codeparrot_training - Step 47710: {'lr': 
2.6015455997845383e-06, 'samples': 24428032, 'steps': 47710, 'batch_loss/train': 0.8405475742183626} 12/29/2021 05:53:48 - INFO - codeparrot_training - Step 47711: {'lr': 2.599277948200807e-06, 'samples': 24428544, 'steps': 47711, 'batch_loss/train': 0.6792366425506771} 12/29/2021 05:53:59 - INFO - codeparrot_training - Step 47712: {'lr': 2.59701128018891e-06, 'samples': 24429056, 'steps': 47712, 'batch_loss/train': 0.648960332153365} 12/29/2021 05:54:12 - INFO - codeparrot_training - Step 47713: {'lr': 2.594745595757897e-06, 'samples': 24429568, 'steps': 47713, 'batch_loss/train': 0.7796358547639102} 12/29/2021 05:54:22 - INFO - codeparrot_training - Step 47714: {'lr': 2.59248089491676e-06, 'samples': 24430080, 'steps': 47714, 'batch_loss/train': 0.7033442137762904} 12/29/2021 05:54:33 - INFO - codeparrot_training - Step 47715: {'lr': 2.590217177674492e-06, 'samples': 24430592, 'steps': 47715, 'batch_loss/train': 0.6876225797459483} 12/29/2021 05:54:43 - INFO - codeparrot_training - Step 47716: {'lr': 2.587954444040086e-06, 'samples': 24431104, 'steps': 47716, 'batch_loss/train': 0.6709520540898666} 12/29/2021 05:54:56 - INFO - codeparrot_training - Step 47717: {'lr': 2.58569269402259e-06, 'samples': 24431616, 'steps': 47717, 'batch_loss/train': 0.7230179845355451} 12/29/2021 05:55:06 - INFO - codeparrot_training - Step 47718: {'lr': 2.5834319276309413e-06, 'samples': 24432128, 'steps': 47718, 'batch_loss/train': 0.843428622931242} 12/29/2021 05:55:17 - INFO - codeparrot_training - Step 47719: {'lr': 2.581172144874133e-06, 'samples': 24432640, 'steps': 47719, 'batch_loss/train': 0.6772095162887126} 12/29/2021 05:55:29 - INFO - codeparrot_training - Step 47720: {'lr': 2.578913345761158e-06, 'samples': 24433152, 'steps': 47720, 'batch_loss/train': 0.6929710061522201} 12/29/2021 05:55:40 - INFO - codeparrot_training - Step 47721: {'lr': 2.5766555303010364e-06, 'samples': 24433664, 'steps': 47721, 'batch_loss/train': 0.7335815359838307} 12/29/2021 05:55:51 - INFO - 
codeparrot_training - Step 47722: {'lr': 2.574398698502678e-06, 'samples': 24434176, 'steps': 47722, 'batch_loss/train': 0.48765923114842735} 12/29/2021 05:56:04 - INFO - codeparrot_training - Step 47723: {'lr': 2.572142850375103e-06, 'samples': 24434688, 'steps': 47723, 'batch_loss/train': 0.6697178075555712} 12/29/2021 05:56:14 - INFO - codeparrot_training - Step 47724: {'lr': 2.5698879859272496e-06, 'samples': 24435200, 'steps': 47724, 'batch_loss/train': 0.6996753485873342} 12/29/2021 05:56:25 - INFO - codeparrot_training - Step 47725: {'lr': 2.5676341051681375e-06, 'samples': 24435712, 'steps': 47725, 'batch_loss/train': 0.7300521805882454} 12/29/2021 05:56:37 - INFO - codeparrot_training - Step 47726: {'lr': 2.5653812081066487e-06, 'samples': 24436224, 'steps': 47726, 'batch_loss/train': 0.9597101458348334} 12/29/2021 05:56:48 - INFO - codeparrot_training - Step 47727: {'lr': 2.5631292947517483e-06, 'samples': 24436736, 'steps': 47727, 'batch_loss/train': 0.5616514887660742} 12/29/2021 05:56:59 - INFO - codeparrot_training - Step 47728: {'lr': 2.560878365112457e-06, 'samples': 24437248, 'steps': 47728, 'batch_loss/train': 0.7955249072983861} 12/29/2021 05:57:09 - INFO - codeparrot_training - Step 47729: {'lr': 2.5586284191976837e-06, 'samples': 24437760, 'steps': 47729, 'batch_loss/train': 0.7172778057865798} 12/29/2021 05:57:22 - INFO - codeparrot_training - Step 47730: {'lr': 2.556379457016367e-06, 'samples': 24438272, 'steps': 47730, 'batch_loss/train': 0.6778391457628459} 12/29/2021 05:57:33 - INFO - codeparrot_training - Step 47731: {'lr': 2.5541314785774426e-06, 'samples': 24438784, 'steps': 47731, 'batch_loss/train': 0.7416002778336406} 12/29/2021 05:57:43 - INFO - codeparrot_training - Step 47732: {'lr': 2.551884483889877e-06, 'samples': 24439296, 'steps': 47732, 'batch_loss/train': 0.802486521191895} 12/29/2021 05:57:55 - INFO - codeparrot_training - Step 47733: {'lr': 2.5496384729625787e-06, 'samples': 24439808, 'steps': 47733, 'batch_loss/train': 
0.7423054231330752} 12/29/2021 05:58:06 - INFO - codeparrot_training - Step 47734: {'lr': 2.5473934458044854e-06, 'samples': 24440320, 'steps': 47734, 'batch_loss/train': 0.7351760109886527} 12/29/2021 05:58:17 - INFO - codeparrot_training - Step 47735: {'lr': 2.545149402424507e-06, 'samples': 24440832, 'steps': 47735, 'batch_loss/train': 0.7284182007424533} 12/29/2021 05:58:29 - INFO - codeparrot_training - Step 47736: {'lr': 2.542906342831608e-06, 'samples': 24441344, 'steps': 47736, 'batch_loss/train': 0.6890797272790223} 12/29/2021 05:58:39 - INFO - codeparrot_training - Step 47737: {'lr': 2.54066426703467e-06, 'samples': 24441856, 'steps': 47737, 'batch_loss/train': 0.70770706795156} 12/29/2021 05:58:50 - INFO - codeparrot_training - Step 47738: {'lr': 2.5384231750425757e-06, 'samples': 24442368, 'steps': 47738, 'batch_loss/train': 0.7050310482736677} 12/29/2021 05:59:01 - INFO - codeparrot_training - Step 47739: {'lr': 2.5361830668643172e-06, 'samples': 24442880, 'steps': 47739, 'batch_loss/train': 0.6243020398542285} 12/29/2021 05:59:13 - INFO - codeparrot_training - Step 47740: {'lr': 2.5339439425087487e-06, 'samples': 24443392, 'steps': 47740, 'batch_loss/train': 0.7244626749306917} 12/29/2021 05:59:24 - INFO - codeparrot_training - Step 47741: {'lr': 2.5317058019847796e-06, 'samples': 24443904, 'steps': 47741, 'batch_loss/train': 0.7579465033486485} 12/29/2021 05:59:35 - INFO - codeparrot_training - Step 47742: {'lr': 2.5294686453012915e-06, 'samples': 24444416, 'steps': 47742, 'batch_loss/train': 0.7090219967067242} 12/29/2021 05:59:47 - INFO - codeparrot_training - Step 47743: {'lr': 2.5272324724671948e-06, 'samples': 24444928, 'steps': 47743, 'batch_loss/train': 0.6628467694390565} 12/29/2021 05:59:58 - INFO - codeparrot_training - Step 47744: {'lr': 2.524997283491398e-06, 'samples': 24445440, 'steps': 47744, 'batch_loss/train': 0.6778377252630889} 12/29/2021 06:00:08 - INFO - codeparrot_training - Step 47745: {'lr': 2.522763078382756e-06, 'samples': 
24445952, 'steps': 47745, 'batch_loss/train': 0.7339647747576237} 12/29/2021 06:00:21 - INFO - codeparrot_training - Step 47746: {'lr': 2.5205298571501777e-06, 'samples': 24446464, 'steps': 47746, 'batch_loss/train': 0.7638788828044198} 12/29/2021 06:00:31 - INFO - codeparrot_training - Step 47747: {'lr': 2.5182976198025176e-06, 'samples': 24446976, 'steps': 47747, 'batch_loss/train': 0.6069710231386125} 12/29/2021 06:00:42 - INFO - codeparrot_training - Step 47748: {'lr': 2.516066366348657e-06, 'samples': 24447488, 'steps': 47748, 'batch_loss/train': 0.7012198213487864} 12/29/2021 06:00:53 - INFO - codeparrot_training - Step 47749: {'lr': 2.513836096797478e-06, 'samples': 24448000, 'steps': 47749, 'batch_loss/train': 0.6629310528514907} 12/29/2021 06:01:05 - INFO - codeparrot_training - Step 47750: {'lr': 2.511606811157835e-06, 'samples': 24448512, 'steps': 47750, 'batch_loss/train': 0.6230931570753455} 12/29/2021 06:01:15 - INFO - codeparrot_training - Step 47751: {'lr': 2.509378509438609e-06, 'samples': 24449024, 'steps': 47751, 'batch_loss/train': 0.6466726534999907} 12/29/2021 06:01:26 - INFO - codeparrot_training - Step 47752: {'lr': 2.507151191648627e-06, 'samples': 24449536, 'steps': 47752, 'batch_loss/train': 0.7356327336165123} 12/29/2021 06:01:39 - INFO - codeparrot_training - Step 47753: {'lr': 2.504924857796742e-06, 'samples': 24450048, 'steps': 47753, 'batch_loss/train': 0.6760245691984892} 12/29/2021 06:01:49 - INFO - codeparrot_training - Step 47754: {'lr': 2.5026995078918645e-06, 'samples': 24450560, 'steps': 47754, 'batch_loss/train': 0.62885895796353} 12/29/2021 06:02:00 - INFO - codeparrot_training - Step 47755: {'lr': 2.500475141942793e-06, 'samples': 24451072, 'steps': 47755, 'batch_loss/train': 0.6601232378743589} 12/29/2021 06:02:12 - INFO - codeparrot_training - Step 47756: {'lr': 2.498251759958381e-06, 'samples': 24451584, 'steps': 47756, 'batch_loss/train': 0.8212485425174236} 12/29/2021 06:02:23 - INFO - codeparrot_training - Step 47757: 
{'lr': 2.496029361947455e-06, 'samples': 24452096, 'steps': 47757, 'batch_loss/train': 0.6505339927971363} 12/29/2021 06:02:33 - INFO - codeparrot_training - Step 47758: {'lr': 2.4938079479188693e-06, 'samples': 24452608, 'steps': 47758, 'batch_loss/train': 0.7442509746178985} 12/29/2021 06:02:47 - INFO - codeparrot_training - Step 47759: {'lr': 2.4915875178814495e-06, 'samples': 24453120, 'steps': 47759, 'batch_loss/train': 0.7725750021636486} 12/29/2021 06:02:58 - INFO - codeparrot_training - Step 47760: {'lr': 2.4893680718440228e-06, 'samples': 24453632, 'steps': 47760, 'batch_loss/train': 1.1017093257978559} 12/29/2021 06:03:09 - INFO - codeparrot_training - Step 47761: {'lr': 2.4871496098154146e-06, 'samples': 24454144, 'steps': 47761, 'batch_loss/train': 0.7323428322561085} 12/29/2021 06:03:19 - INFO - codeparrot_training - Step 47762: {'lr': 2.4849321318044237e-06, 'samples': 24454656, 'steps': 47762, 'batch_loss/train': 0.690772014670074} 12/29/2021 06:03:32 - INFO - codeparrot_training - Step 47763: {'lr': 2.4827156378199324e-06, 'samples': 24455168, 'steps': 47763, 'batch_loss/train': 0.5770046599209309} 12/29/2021 06:03:42 - INFO - codeparrot_training - Step 47764: {'lr': 2.480500127870655e-06, 'samples': 24455680, 'steps': 47764, 'batch_loss/train': 0.6983143100515008} 12/29/2021 06:03:53 - INFO - codeparrot_training - Step 47765: {'lr': 2.4782856019654742e-06, 'samples': 24456192, 'steps': 47765, 'batch_loss/train': 0.608113358495757} 12/29/2021 06:04:05 - INFO - codeparrot_training - Step 47766: {'lr': 2.4760720601131604e-06, 'samples': 24456704, 'steps': 47766, 'batch_loss/train': 0.7204752609832212} 12/29/2021 06:04:16 - INFO - codeparrot_training - Step 47767: {'lr': 2.473859502322512e-06, 'samples': 24457216, 'steps': 47767, 'batch_loss/train': 0.7346412697806954} 12/29/2021 06:04:26 - INFO - codeparrot_training - Step 47768: {'lr': 2.471647928602355e-06, 'samples': 24457728, 'steps': 47768, 'batch_loss/train': 0.8320850385352969} 12/29/2021 
06:04:37 - INFO - codeparrot_training - Step 47769: {'lr': 2.469437338961461e-06, 'samples': 24458240, 'steps': 47769, 'batch_loss/train': 0.7043535097036511} 12/29/2021 06:04:50 - INFO - codeparrot_training - Step 47770: {'lr': 2.4672277334086e-06, 'samples': 24458752, 'steps': 47770, 'batch_loss/train': 0.7450750067364424} 12/29/2021 06:05:00 - INFO - codeparrot_training - Step 47771: {'lr': 2.4650191119525988e-06, 'samples': 24459264, 'steps': 47771, 'batch_loss/train': 0.68956726975739} 12/29/2021 06:05:11 - INFO - codeparrot_training - Step 47772: {'lr': 2.4628114746022e-06, 'samples': 24459776, 'steps': 47772, 'batch_loss/train': 0.7496702512726188} 12/29/2021 06:05:23 - INFO - codeparrot_training - Step 47773: {'lr': 2.4606048213662026e-06, 'samples': 24460288, 'steps': 47773, 'batch_loss/train': 0.8086654795333743} 12/29/2021 06:05:34 - INFO - codeparrot_training - Step 47774: {'lr': 2.4583991522533767e-06, 'samples': 24460800, 'steps': 47774, 'batch_loss/train': 0.8106328977737576} 12/29/2021 06:05:44 - INFO - codeparrot_training - Step 47775: {'lr': 2.4561944672724933e-06, 'samples': 24461312, 'steps': 47775, 'batch_loss/train': 0.6869902145117521} 12/29/2021 06:05:56 - INFO - codeparrot_training - Step 47776: {'lr': 2.453990766432296e-06, 'samples': 24461824, 'steps': 47776, 'batch_loss/train': 0.6611487192567438} 12/29/2021 06:06:07 - INFO - codeparrot_training - Step 47777: {'lr': 2.451788049741582e-06, 'samples': 24462336, 'steps': 47777, 'batch_loss/train': 0.6591453799046576} 12/29/2021 06:06:18 - INFO - codeparrot_training - Step 47778: {'lr': 2.449586317209096e-06, 'samples': 24462848, 'steps': 47778, 'batch_loss/train': 0.6820595473982394} 12/29/2021 06:06:28 - INFO - codeparrot_training - Step 47779: {'lr': 2.447385568843552e-06, 'samples': 24463360, 'steps': 47779, 'batch_loss/train': 0.7436712952330709} 12/29/2021 06:06:40 - INFO - codeparrot_training - Step 47780: {'lr': 2.445185804653749e-06, 'samples': 24463872, 'steps': 47780, 
'batch_loss/train': 0.7649499531835318} 12/29/2021 06:06:51 - INFO - codeparrot_training - Step 47781: {'lr': 2.442987024648402e-06, 'samples': 24464384, 'steps': 47781, 'batch_loss/train': 0.6062158061831724} 12/29/2021 06:07:02 - INFO - codeparrot_training - Step 47782: {'lr': 2.4407892288362543e-06, 'samples': 24464896, 'steps': 47782, 'batch_loss/train': 0.6878988966345787} 12/29/2021 06:07:15 - INFO - codeparrot_training - Step 47783: {'lr': 2.438592417226104e-06, 'samples': 24465408, 'steps': 47783, 'batch_loss/train': 0.7011427953839302} 12/29/2021 06:07:25 - INFO - codeparrot_training - Step 47784: {'lr': 2.4363965898265838e-06, 'samples': 24465920, 'steps': 47784, 'batch_loss/train': 0.6100686518475413} 12/29/2021 06:07:36 - INFO - codeparrot_training - Step 47785: {'lr': 2.4342017466464917e-06, 'samples': 24466432, 'steps': 47785, 'batch_loss/train': 0.7271692119538784} 12/29/2021 06:07:48 - INFO - codeparrot_training - Step 47786: {'lr': 2.432007887694543e-06, 'samples': 24466944, 'steps': 47786, 'batch_loss/train': 0.47528575686737895} 12/29/2021 06:07:59 - INFO - codeparrot_training - Step 47787: {'lr': 2.429815012979453e-06, 'samples': 24467456, 'steps': 47787, 'batch_loss/train': 0.44579508039169014} 12/29/2021 06:08:10 - INFO - codeparrot_training - Step 47788: {'lr': 2.427623122509909e-06, 'samples': 24467968, 'steps': 47788, 'batch_loss/train': 0.7765347734093666} 12/29/2021 06:08:22 - INFO - codeparrot_training - Step 47789: {'lr': 2.4254322162947095e-06, 'samples': 24468480, 'steps': 47789, 'batch_loss/train': 0.783193071372807} 12/29/2021 06:08:33 - INFO - codeparrot_training - Step 47790: {'lr': 2.423242294342459e-06, 'samples': 24468992, 'steps': 47790, 'batch_loss/train': 0.6993343327194452} 12/29/2021 06:08:43 - INFO - codeparrot_training - Step 47791: {'lr': 2.4210533566619285e-06, 'samples': 24469504, 'steps': 47791, 'batch_loss/train': 0.7143426039256155} 12/29/2021 06:08:54 - INFO - codeparrot_training - Step 47792: {'lr': 
2.418865403261833e-06, 'samples': 24470016, 'steps': 47792, 'batch_loss/train': 0.7130467051174492} 12/29/2021 06:09:06 - INFO - codeparrot_training - Step 47793: {'lr': 2.416678434150832e-06, 'samples': 24470528, 'steps': 47793, 'batch_loss/train': 0.6840829190332443} 12/29/2021 06:09:17 - INFO - codeparrot_training - Step 47794: {'lr': 2.414492449337613e-06, 'samples': 24471040, 'steps': 47794, 'batch_loss/train': 0.7941497825086117} 12/29/2021 06:09:27 - INFO - codeparrot_training - Step 47795: {'lr': 2.4123074488309195e-06, 'samples': 24471552, 'steps': 47795, 'batch_loss/train': 0.7259944165125489} 12/29/2021 06:09:39 - INFO - codeparrot_training - Step 47796: {'lr': 2.4101234326394107e-06, 'samples': 24472064, 'steps': 47796, 'batch_loss/train': 0.6532089905813336} 12/29/2021 06:09:50 - INFO - codeparrot_training - Step 47797: {'lr': 2.4079404007717464e-06, 'samples': 24472576, 'steps': 47797, 'batch_loss/train': 0.7172740655951202} 12/29/2021 06:10:01 - INFO - codeparrot_training - Step 47798: {'lr': 2.4057583532366143e-06, 'samples': 24473088, 'steps': 47798, 'batch_loss/train': 0.698839204153046} 12/29/2021 06:10:13 - INFO - codeparrot_training - Step 47799: {'lr': 2.403577290042702e-06, 'samples': 24473600, 'steps': 47799, 'batch_loss/train': 0.6689703653100878} 12/29/2021 06:10:24 - INFO - codeparrot_training - Step 47800: {'lr': 2.401397211198697e-06, 'samples': 24474112, 'steps': 47800, 'batch_loss/train': 0.6834412661846727} 12/29/2021 06:10:35 - INFO - codeparrot_training - Step 47801: {'lr': 2.399218116713231e-06, 'samples': 24474624, 'steps': 47801, 'batch_loss/train': 0.8007857422344387} 12/29/2021 06:10:45 - INFO - codeparrot_training - Step 47802: {'lr': 2.3970400065949917e-06, 'samples': 24475136, 'steps': 47802, 'batch_loss/train': 0.8229571064002812} 12/29/2021 06:10:57 - INFO - codeparrot_training - Step 47803: {'lr': 2.394862880852611e-06, 'samples': 24475648, 'steps': 47803, 'batch_loss/train': 0.6322764959186316} 12/29/2021 06:11:08 - 
INFO - codeparrot_training - Step 47804: {'lr': 2.392686739494804e-06, 'samples': 24476160, 'steps': 47804, 'batch_loss/train': 0.627389463596046} 12/29/2021 06:11:19 - INFO - codeparrot_training - Step 47805: {'lr': 2.3905115825301195e-06, 'samples': 24476672, 'steps': 47805, 'batch_loss/train': 0.6302654803730547} 12/29/2021 06:11:31 - INFO - codeparrot_training - Step 47806: {'lr': 2.3883374099673283e-06, 'samples': 24477184, 'steps': 47806, 'batch_loss/train': 0.678322727442719} 12/29/2021 06:11:41 - INFO - codeparrot_training - Step 47807: {'lr': 2.3861642218149794e-06, 'samples': 24477696, 'steps': 47807, 'batch_loss/train': 0.7042055055499077} 12/29/2021 06:11:52 - INFO - codeparrot_training - Step 47808: {'lr': 2.383992018081732e-06, 'samples': 24478208, 'steps': 47808, 'batch_loss/train': 0.6342944544740021} 12/29/2021 06:12:05 - INFO - codeparrot_training - Step 47809: {'lr': 2.381820798776274e-06, 'samples': 24478720, 'steps': 47809, 'batch_loss/train': 0.7180003626272082} 12/29/2021 06:12:16 - INFO - codeparrot_training - Step 47810: {'lr': 2.379650563907182e-06, 'samples': 24479232, 'steps': 47810, 'batch_loss/train': 0.6949202520772815} 12/29/2021 06:12:26 - INFO - codeparrot_training - Step 47811: {'lr': 2.3774813134831154e-06, 'samples': 24479744, 'steps': 47811, 'batch_loss/train': 0.7674744483083487} 12/29/2021 06:12:38 - INFO - codeparrot_training - Step 47812: {'lr': 2.3753130475126785e-06, 'samples': 24480256, 'steps': 47812, 'batch_loss/train': 0.7141416458180174} 12/29/2021 06:12:49 - INFO - codeparrot_training - Step 47813: {'lr': 2.3731457660044753e-06, 'samples': 24480768, 'steps': 47813, 'batch_loss/train': 0.7450713801663369} 12/29/2021 06:12:59 - INFO - codeparrot_training - Step 47814: {'lr': 2.370979468967166e-06, 'samples': 24481280, 'steps': 47814, 'batch_loss/train': 0.6574523556046188} 12/29/2021 06:13:10 - INFO - codeparrot_training - Step 47815: {'lr': 2.3688141564093547e-06, 'samples': 24481792, 'steps': 47815, 
'batch_loss/train': 0.9067096458747983} 12/29/2021 06:13:23 - INFO - codeparrot_training - Step 47816: {'lr': 2.3666498283395897e-06, 'samples': 24482304, 'steps': 47816, 'batch_loss/train': 0.7486199312843382} 12/29/2021 06:13:34 - INFO - codeparrot_training - Step 47817: {'lr': 2.3644864847665594e-06, 'samples': 24482816, 'steps': 47817, 'batch_loss/train': 1.2960069170221686} 12/29/2021 06:13:44 - INFO - codeparrot_training - Step 47818: {'lr': 2.3623241256988394e-06, 'samples': 24483328, 'steps': 47818, 'batch_loss/train': 0.6385038332082331} 12/29/2021 06:13:57 - INFO - codeparrot_training - Step 47819: {'lr': 2.3601627511449785e-06, 'samples': 24483840, 'steps': 47819, 'batch_loss/train': 0.6355394159909338} 12/29/2021 06:14:08 - INFO - codeparrot_training - Step 47820: {'lr': 2.3580023611136093e-06, 'samples': 24484352, 'steps': 47820, 'batch_loss/train': 0.7159617925062776} 12/29/2021 06:14:18 - INFO - codeparrot_training - Step 47821: {'lr': 2.3558429556133353e-06, 'samples': 24484864, 'steps': 47821, 'batch_loss/train': 0.6682819621637464} 12/29/2021 06:14:29 - INFO - codeparrot_training - Step 47822: {'lr': 2.3536845346527338e-06, 'samples': 24485376, 'steps': 47822, 'batch_loss/train': 0.7039201455190778} 12/29/2021 06:14:41 - INFO - codeparrot_training - Step 47823: {'lr': 2.3515270982403245e-06, 'samples': 24485888, 'steps': 47823, 'batch_loss/train': 0.7213034280575812} 12/29/2021 06:14:52 - INFO - codeparrot_training - Step 47824: {'lr': 2.3493706463847964e-06, 'samples': 24486400, 'steps': 47824, 'batch_loss/train': 0.6896284276153892} 12/29/2021 06:15:03 - INFO - codeparrot_training - Step 47825: {'lr': 2.347215179094614e-06, 'samples': 24486912, 'steps': 47825, 'batch_loss/train': 0.7968774721957743} 12/29/2021 06:15:15 - INFO - codeparrot_training - Step 47826: {'lr': 2.3450606963784094e-06, 'samples': 24487424, 'steps': 47826, 'batch_loss/train': 0.8080652141943574} 12/29/2021 06:15:26 - INFO - codeparrot_training - Step 47827: {'lr': 
2.3429071982447312e-06, 'samples': 24487936, 'steps': 47827, 'batch_loss/train': 0.6893614670261741} 12/29/2021 06:15:36 - INFO - codeparrot_training - Step 47828: {'lr': 2.3407546847021565e-06, 'samples': 24488448, 'steps': 47828, 'batch_loss/train': 0.6407864019274712} 12/29/2021 06:15:49 - INFO - codeparrot_training - Step 47829: {'lr': 2.338603155759206e-06, 'samples': 24488960, 'steps': 47829, 'batch_loss/train': 0.7129602746572345} 12/29/2021 06:16:00 - INFO - codeparrot_training - Step 47830: {'lr': 2.336452611424483e-06, 'samples': 24489472, 'steps': 47830, 'batch_loss/train': 0.7244433359010145} 12/29/2021 06:16:10 - INFO - codeparrot_training - Step 47831: {'lr': 2.3343030517064822e-06, 'samples': 24489984, 'steps': 47831, 'batch_loss/train': 0.6931854300200939} 12/29/2021 06:16:21 - INFO - codeparrot_training - Step 47832: {'lr': 2.332154476613779e-06, 'samples': 24490496, 'steps': 47832, 'batch_loss/train': 0.6419736444950104} 12/29/2021 06:16:33 - INFO - codeparrot_training - Step 47833: {'lr': 2.3300068861549506e-06, 'samples': 24491008, 'steps': 47833, 'batch_loss/train': 0.7541031839791685} 12/29/2021 06:16:43 - INFO - codeparrot_training - Step 47834: {'lr': 2.3278602803384618e-06, 'samples': 24491520, 'steps': 47834, 'batch_loss/train': 0.6964752525091171} 12/29/2021 06:16:54 - INFO - codeparrot_training - Step 47835: {'lr': 2.325714659172917e-06, 'samples': 24492032, 'steps': 47835, 'batch_loss/train': 0.6345683941617608} 12/29/2021 06:17:06 - INFO - codeparrot_training - Step 47836: {'lr': 2.323570022666782e-06, 'samples': 24492544, 'steps': 47836, 'batch_loss/train': 0.7004695096984506} 12/29/2021 06:17:17 - INFO - codeparrot_training - Step 47837: {'lr': 2.3214263708286054e-06, 'samples': 24493056, 'steps': 47837, 'batch_loss/train': 0.72265171026811} 12/29/2021 06:17:28 - INFO - codeparrot_training - Step 47838: {'lr': 2.3192837036669358e-06, 'samples': 24493568, 'steps': 47838, 'batch_loss/train': 0.8289039712399244} 12/29/2021 06:17:40 - 
INFO - codeparrot_training - Step 47839: {'lr': 2.3171420211902937e-06, 'samples': 24494080, 'steps': 47839, 'batch_loss/train': 0.6797897759824991} 12/29/2021 06:17:51 - INFO - codeparrot_training - Step 47840: {'lr': 2.3150013234071455e-06, 'samples': 24494592, 'steps': 47840, 'batch_loss/train': 0.6285850868443958} 12/29/2021 06:18:02 - INFO - codeparrot_training - Step 47841: {'lr': 2.3128616103260393e-06, 'samples': 24495104, 'steps': 47841, 'batch_loss/train': 0.6437177821062505} 12/29/2021 06:18:12 - INFO - codeparrot_training - Step 47842: {'lr': 2.310722881955468e-06, 'samples': 24495616, 'steps': 47842, 'batch_loss/train': 0.6936373198404908} 12/29/2021 06:18:24 - INFO - codeparrot_training - Step 47843: {'lr': 2.308585138303926e-06, 'samples': 24496128, 'steps': 47843, 'batch_loss/train': 0.7533102855086327} 12/29/2021 06:18:35 - INFO - codeparrot_training - Step 47844: {'lr': 2.3064483793799607e-06, 'samples': 24496640, 'steps': 47844, 'batch_loss/train': 0.7033873999025673} 12/29/2021 06:18:46 - INFO - codeparrot_training - Step 47845: {'lr': 2.3043126051919826e-06, 'samples': 24497152, 'steps': 47845, 'batch_loss/train': 0.6875504488125443} 12/29/2021 06:18:59 - INFO - codeparrot_training - Step 47846: {'lr': 2.3021778157485686e-06, 'samples': 24497664, 'steps': 47846, 'batch_loss/train': 0.6664745942689478} 12/29/2021 06:19:10 - INFO - codeparrot_training - Step 47847: {'lr': 2.3000440110581557e-06, 'samples': 24498176, 'steps': 47847, 'batch_loss/train': 0.7204827666282654} 12/29/2021 06:19:20 - INFO - codeparrot_training - Step 47848: {'lr': 2.297911191129237e-06, 'samples': 24498688, 'steps': 47848, 'batch_loss/train': 0.6455901218578219} 12/29/2021 06:19:32 - INFO - codeparrot_training - Step 47849: {'lr': 2.2957793559703066e-06, 'samples': 24499200, 'steps': 47849, 'batch_loss/train': 0.6844102372415364} 12/29/2021 06:19:43 - INFO - codeparrot_training - Step 47850: {'lr': 2.2936485055898293e-06, 'samples': 24499712, 'steps': 47850, 
'batch_loss/train': 0.8301321649923921} 12/29/2021 06:19:54 - INFO - codeparrot_training - Step 47851: {'lr': 2.291518639996271e-06, 'samples': 24500224, 'steps': 47851, 'batch_loss/train': 0.5045414548367262} 12/29/2021 06:20:06 - INFO - codeparrot_training - Step 47852: {'lr': 2.2893897591980963e-06, 'samples': 24500736, 'steps': 47852, 'batch_loss/train': 0.6484007276594639} 12/29/2021 06:20:17 - INFO - codeparrot_training - Step 47853: {'lr': 2.287261863203771e-06, 'samples': 24501248, 'steps': 47853, 'batch_loss/train': 0.6711104116402566} 12/29/2021 06:20:28 - INFO - codeparrot_training - Step 47854: {'lr': 2.2851349520217612e-06, 'samples': 24501760, 'steps': 47854, 'batch_loss/train': 0.5759068724000826} 12/29/2021 06:20:38 - INFO - codeparrot_training - Step 47855: {'lr': 2.283009025660532e-06, 'samples': 24502272, 'steps': 47855, 'batch_loss/train': 0.6104316973360255} 12/29/2021 06:20:50 - INFO - codeparrot_training - Step 47856: {'lr': 2.2808840841285206e-06, 'samples': 24502784, 'steps': 47856, 'batch_loss/train': 0.8049186486750841} 12/29/2021 06:21:01 - INFO - codeparrot_training - Step 47857: {'lr': 2.2787601274341654e-06, 'samples': 24503296, 'steps': 47857, 'batch_loss/train': 0.6145197271835059} 12/29/2021 06:21:12 - INFO - codeparrot_training - Step 47858: {'lr': 2.276637155585959e-06, 'samples': 24503808, 'steps': 47858, 'batch_loss/train': 0.6890281559899449} 12/29/2021 06:21:25 - INFO - codeparrot_training - Step 47859: {'lr': 2.274515168592284e-06, 'samples': 24504320, 'steps': 47859, 'batch_loss/train': 0.7608055535238236} 12/29/2021 06:21:35 - INFO - codeparrot_training - Step 47860: {'lr': 2.2723941664615777e-06, 'samples': 24504832, 'steps': 47860, 'batch_loss/train': 0.791534059215337} 12/29/2021 06:21:46 - INFO - codeparrot_training - Step 47861: {'lr': 2.270274149202334e-06, 'samples': 24505344, 'steps': 47861, 'batch_loss/train': 0.6785058863461018} 12/29/2021 06:21:58 - INFO - codeparrot_training - Step 47862: {'lr': 
2.268155116822934e-06, 'samples': 24505856, 'steps': 47862, 'batch_loss/train': 0.5995357261272147} 12/29/2021 06:22:09 - INFO - codeparrot_training - Step 47863: {'lr': 2.266037069331789e-06, 'samples': 24506368, 'steps': 47863, 'batch_loss/train': 0.5127602485008538} 12/29/2021 06:22:20 - INFO - codeparrot_training - Step 47864: {'lr': 2.2639200067373633e-06, 'samples': 24506880, 'steps': 47864, 'batch_loss/train': 0.7284260438755155} 12/29/2021 06:22:30 - INFO - codeparrot_training - Step 47865: {'lr': 2.2618039290480397e-06, 'samples': 24507392, 'steps': 47865, 'batch_loss/train': 0.7373072346672416} 12/29/2021 06:22:42 - INFO - codeparrot_training - Step 47866: {'lr': 2.259688836272228e-06, 'samples': 24507904, 'steps': 47866, 'batch_loss/train': 0.8581383644486777} 12/29/2021 06:22:53 - INFO - codeparrot_training - Step 47867: {'lr': 2.2575747284183657e-06, 'samples': 24508416, 'steps': 47867, 'batch_loss/train': 0.7941754311323166} 12/29/2021 06:23:04 - INFO - codeparrot_training - Step 47868: {'lr': 2.255461605494835e-06, 'samples': 24508928, 'steps': 47868, 'batch_loss/train': 0.7597108571790159} 12/29/2021 06:23:17 - INFO - codeparrot_training - Step 47869: {'lr': 2.2533494675100465e-06, 'samples': 24509440, 'steps': 47869, 'batch_loss/train': 0.7391164479777217} 12/29/2021 06:23:27 - INFO - codeparrot_training - Step 47870: {'lr': 2.251238314472409e-06, 'samples': 24509952, 'steps': 47870, 'batch_loss/train': 0.6826137411408126} 12/29/2021 06:23:38 - INFO - codeparrot_training - Step 47871: {'lr': 2.249128146390278e-06, 'samples': 24510464, 'steps': 47871, 'batch_loss/train': 0.7039156402461231} 12/29/2021 06:23:50 - INFO - codeparrot_training - Step 47872: {'lr': 2.247018963272063e-06, 'samples': 24510976, 'steps': 47872, 'batch_loss/train': 0.6884315544739366} 12/29/2021 06:24:01 - INFO - codeparrot_training - Step 47873: {'lr': 2.2449107651261735e-06, 'samples': 24511488, 'steps': 47873, 'batch_loss/train': 0.7502632727846503} 12/29/2021 06:24:11 - 
INFO - codeparrot_training - Step 47874: {'lr': 2.2428035519609646e-06, 'samples': 24512000, 'steps': 47874, 'batch_loss/train': 0.6908773314207792} 12/29/2021 06:24:22 - INFO - codeparrot_training - Step 47875: {'lr': 2.240697323784818e-06, 'samples': 24512512, 'steps': 47875, 'batch_loss/train': 0.7422056854702532} 12/29/2021 06:24:34 - INFO - codeparrot_training - Step 47876: {'lr': 2.2385920806061166e-06, 'samples': 24513024, 'steps': 47876, 'batch_loss/train': 0.7409256836399436} 12/29/2021 06:24:45 - INFO - codeparrot_training - Step 47877: {'lr': 2.2364878224332143e-06, 'samples': 24513536, 'steps': 47877, 'batch_loss/train': 0.8285517133772373} 12/29/2021 06:24:55 - INFO - codeparrot_training - Step 47878: {'lr': 2.2343845492744653e-06, 'samples': 24514048, 'steps': 47878, 'batch_loss/train': 0.7571839489974082} 12/29/2021 06:25:08 - INFO - codeparrot_training - Step 47879: {'lr': 2.23228226113828e-06, 'samples': 24514560, 'steps': 47879, 'batch_loss/train': 0.7456127610057592} 12/29/2021 06:25:18 - INFO - codeparrot_training - Step 47880: {'lr': 2.2301809580329847e-06, 'samples': 24515072, 'steps': 47880, 'batch_loss/train': 0.7023007487878203} 12/29/2021 06:25:29 - INFO - codeparrot_training - Step 47881: {'lr': 2.2280806399669064e-06, 'samples': 24515584, 'steps': 47881, 'batch_loss/train': 0.6421002629213035} 12/29/2021 06:25:43 - INFO - codeparrot_training - Step 47882: {'lr': 2.2259813069484547e-06, 'samples': 24516096, 'steps': 47882, 'batch_loss/train': 0.7897145618335344} 12/29/2021 06:25:54 - INFO - codeparrot_training - Step 47883: {'lr': 2.223882958985929e-06, 'samples': 24516608, 'steps': 47883, 'batch_loss/train': 0.7543263314291835} 12/29/2021 06:26:04 - INFO - codeparrot_training - Step 47884: {'lr': 2.221785596087711e-06, 'samples': 24517120, 'steps': 47884, 'batch_loss/train': 0.6905619292519987} 12/29/2021 06:26:16 - INFO - codeparrot_training - Step 47885: {'lr': 2.2196892182620997e-06, 'samples': 24517632, 'steps': 47885, 
'batch_loss/train': 0.6670072916895151} 12/29/2021 06:26:27 - INFO - codeparrot_training - Step 47886: {'lr': 2.217593825517422e-06, 'samples': 24518144, 'steps': 47886, 'batch_loss/train': 0.9089707564562559} 12/29/2021 06:26:38 - INFO - codeparrot_training - Step 47887: {'lr': 2.215499417862088e-06, 'samples': 24518656, 'steps': 47887, 'batch_loss/train': 0.7068099705502391} 12/29/2021 06:26:48 - INFO - codeparrot_training - Step 47888: {'lr': 2.2134059953043407e-06, 'samples': 24519168, 'steps': 47888, 'batch_loss/train': 0.7282369993627071} 12/29/2021 06:27:02 - INFO - codeparrot_training - Step 47889: {'lr': 2.2113135578525067e-06, 'samples': 24519680, 'steps': 47889, 'batch_loss/train': 0.7334133408148773} 12/29/2021 06:27:13 - INFO - codeparrot_training - Step 47890: {'lr': 2.209222105514941e-06, 'samples': 24520192, 'steps': 47890, 'batch_loss/train': 0.7546438836725429} 12/29/2021 06:27:24 - INFO - codeparrot_training - Step 47891: {'lr': 2.2071316382999697e-06, 'samples': 24520704, 'steps': 47891, 'batch_loss/train': 0.7505682432092726} 12/29/2021 06:27:36 - INFO - codeparrot_training - Step 47892: {'lr': 2.205042156215864e-06, 'samples': 24521216, 'steps': 47892, 'batch_loss/train': 0.773704444989562} 12/29/2021 06:27:46 - INFO - codeparrot_training - Step 47893: {'lr': 2.202953659270951e-06, 'samples': 24521728, 'steps': 47893, 'batch_loss/train': 0.7857165029272437} 12/29/2021 06:27:57 - INFO - codeparrot_training - Step 47894: {'lr': 2.2008661474735013e-06, 'samples': 24522240, 'steps': 47894, 'batch_loss/train': 0.7316697533242404} 12/29/2021 06:28:09 - INFO - codeparrot_training - Step 47895: {'lr': 2.19877962083187e-06, 'samples': 24522752, 'steps': 47895, 'batch_loss/train': 0.6408313601277769} 12/29/2021 06:28:20 - INFO - codeparrot_training - Step 47896: {'lr': 2.1966940793543557e-06, 'samples': 24523264, 'steps': 47896, 'batch_loss/train': 0.7382016410119832} 12/29/2021 06:28:30 - INFO - codeparrot_training - Step 47897: {'lr': 
2.1946095230491458e-06, 'samples': 24523776, 'steps': 47897, 'batch_loss/train': 0.7256084163673222} 12/29/2021 06:28:41 - INFO - codeparrot_training - Step 47898: {'lr': 2.192525951924651e-06, 'samples': 24524288, 'steps': 47898, 'batch_loss/train': 0.7448982112109661} 12/29/2021 06:28:55 - INFO - codeparrot_training - Step 47899: {'lr': 2.190443365989114e-06, 'samples': 24524800, 'steps': 47899, 'batch_loss/train': 0.7306762882508337} 12/29/2021 06:29:06 - INFO - codeparrot_training - Step 47900: {'lr': 2.188361765250779e-06, 'samples': 24525312, 'steps': 47900, 'batch_loss/train': 0.7215521247126162} 12/29/2021 06:29:16 - INFO - codeparrot_training - Step 47901: {'lr': 2.1862811497179445e-06, 'samples': 24525824, 'steps': 47901, 'batch_loss/train': 0.7322764871641994} 12/29/2021 06:29:28 - INFO - codeparrot_training - Step 47902: {'lr': 2.1842015193989096e-06, 'samples': 24526336, 'steps': 47902, 'batch_loss/train': 0.7431391477584839} 12/29/2021 06:29:39 - INFO - codeparrot_training - Step 47903: {'lr': 2.1821228743018893e-06, 'samples': 24526848, 'steps': 47903, 'batch_loss/train': 0.7064082727301866} 12/29/2021 06:29:50 - INFO - codeparrot_training - Step 47904: {'lr': 2.1800452144351833e-06, 'samples': 24527360, 'steps': 47904, 'batch_loss/train': 0.7723795315250754} 12/29/2021 06:30:04 - INFO - codeparrot_training - Step 47905: {'lr': 2.1779685398070624e-06, 'samples': 24527872, 'steps': 47905, 'batch_loss/train': 0.7595129008404911} 12/29/2021 06:30:14 - INFO - codeparrot_training - Step 47906: {'lr': 2.1758928504257424e-06, 'samples': 24528384, 'steps': 47906, 'batch_loss/train': 0.6903949514962733} 12/29/2021 06:30:25 - INFO - codeparrot_training - Step 47907: {'lr': 2.1738181462994946e-06, 'samples': 24528896, 'steps': 47907, 'batch_loss/train': 0.7762304584030062} 12/29/2021 06:30:36 - INFO - codeparrot_training - Step 47908: {'lr': 2.1717444274365894e-06, 'samples': 24529408, 'steps': 47908, 'batch_loss/train': 0.7794095831923187} 12/29/2021 06:30:48 
- INFO - codeparrot_training - Step 47909: {'lr': 2.169671693845243e-06, 'samples': 24529920, 'steps': 47909, 'batch_loss/train': 0.7844363618642092} 12/29/2021 06:30:58 - INFO - codeparrot_training - Step 47910: {'lr': 2.1675999455336993e-06, 'samples': 24530432, 'steps': 47910, 'batch_loss/train': 0.7002643287414685} 12/29/2021 06:31:09 - INFO - codeparrot_training - Step 47911: {'lr': 2.1655291825102285e-06, 'samples': 24530944, 'steps': 47911, 'batch_loss/train': 0.8517661551013589} 12/29/2021 06:31:21 - INFO - codeparrot_training - Step 47912: {'lr': 2.1634594047829916e-06, 'samples': 24531456, 'steps': 47912, 'batch_loss/train': 0.8358540418557823} 12/29/2021 06:31:31 - INFO - codeparrot_training - Step 47913: {'lr': 2.1613906123602867e-06, 'samples': 24531968, 'steps': 47913, 'batch_loss/train': 0.856759219430387} 12/29/2021 06:31:42 - INFO - codeparrot_training - Step 47914: {'lr': 2.15932280525033e-06, 'samples': 24532480, 'steps': 47914, 'batch_loss/train': 0.6985803490970284} 12/29/2021 06:31:56 - INFO - codeparrot_training - Step 47915: {'lr': 2.157255983461309e-06, 'samples': 24532992, 'steps': 47915, 'batch_loss/train': 0.7264803480356932} 12/29/2021 06:32:07 - INFO - codeparrot_training - Step 47916: {'lr': 2.15519014700144e-06, 'samples': 24533504, 'steps': 47916, 'batch_loss/train': 0.7377708128187805} 12/29/2021 06:32:17 - INFO - codeparrot_training - Step 47917: {'lr': 2.153125295878966e-06, 'samples': 24534016, 'steps': 47917, 'batch_loss/train': 0.7568432204425335} 12/29/2021 06:32:28 - INFO - codeparrot_training - Step 47918: {'lr': 2.1510614301020748e-06, 'samples': 24534528, 'steps': 47918, 'batch_loss/train': 0.7442411966621876} 12/29/2021 06:32:40 - INFO - codeparrot_training - Step 47919: {'lr': 2.1489985496789823e-06, 'samples': 24535040, 'steps': 47919, 'batch_loss/train': 0.7407852933974937} 12/29/2021 06:32:51 - INFO - codeparrot_training - Step 47920: {'lr': 2.1469366546179037e-06, 'samples': 24535552, 'steps': 47920, 
'batch_loss/train': 0.6903624809347093} 12/29/2021 06:33:01 - INFO - codeparrot_training - Step 47921: {'lr': 2.1448757449269997e-06, 'samples': 24536064, 'steps': 47921, 'batch_loss/train': 0.7028262121602893} 12/29/2021 06:33:13 - INFO - codeparrot_training - Step 47922: {'lr': 2.1428158206144854e-06, 'samples': 24536576, 'steps': 47922, 'batch_loss/train': 0.7090919800102711} 12/29/2021 06:33:24 - INFO - codeparrot_training - Step 47923: {'lr': 2.1407568816885494e-06, 'samples': 24537088, 'steps': 47923, 'batch_loss/train': 0.6566650392487645} 12/29/2021 06:33:35 - INFO - codeparrot_training - Step 47924: {'lr': 2.1386989281573786e-06, 'samples': 24537600, 'steps': 47924, 'batch_loss/train': 0.7146233811508864} 12/29/2021 06:33:47 - INFO - codeparrot_training - Step 47925: {'lr': 2.136641960029162e-06, 'samples': 24538112, 'steps': 47925, 'batch_loss/train': 0.7254000310786068} 12/29/2021 06:33:57 - INFO - codeparrot_training - Step 47926: {'lr': 2.134585977312059e-06, 'samples': 24538624, 'steps': 47926, 'batch_loss/train': 0.7337144007906318} 12/29/2021 06:34:08 - INFO - codeparrot_training - Step 47927: {'lr': 2.1325309800142302e-06, 'samples': 24539136, 'steps': 47927, 'batch_loss/train': 0.6854514796286821} 12/29/2021 06:34:19 - INFO - codeparrot_training - Step 47928: {'lr': 2.1304769681439186e-06, 'samples': 24539648, 'steps': 47928, 'batch_loss/train': 0.6536952988244593} 12/29/2021 06:34:33 - INFO - codeparrot_training - Step 47929: {'lr': 2.128423941709201e-06, 'samples': 24540160, 'steps': 47929, 'batch_loss/train': 0.8195595825091004} 12/29/2021 06:34:43 - INFO - codeparrot_training - Step 47930: {'lr': 2.126371900718266e-06, 'samples': 24540672, 'steps': 47930, 'batch_loss/train': 0.6176537563442253} 12/29/2021 06:34:54 - INFO - codeparrot_training - Step 47931: {'lr': 2.1243208451793006e-06, 'samples': 24541184, 'steps': 47931, 'batch_loss/train': 0.6961613731982652} 12/29/2021 06:35:06 - INFO - codeparrot_training - Step 47932: {'lr': 
2.1222707751004377e-06, 'samples': 24541696, 'steps': 47932, 'batch_loss/train': 0.7802633559331298} 12/29/2021 06:35:17 - INFO - codeparrot_training - Step 47933: {'lr': 2.1202216904898376e-06, 'samples': 24542208, 'steps': 47933, 'batch_loss/train': 0.7711006684694439} 12/29/2021 06:35:27 - INFO - codeparrot_training - Step 47934: {'lr': 2.118173591355632e-06, 'samples': 24542720, 'steps': 47934, 'batch_loss/train': 0.6856756739725824} 12/29/2021 06:35:39 - INFO - codeparrot_training - Step 47935: {'lr': 2.116126477705982e-06, 'samples': 24543232, 'steps': 47935, 'batch_loss/train': 0.63967256777687} 12/29/2021 06:35:50 - INFO - codeparrot_training - Step 47936: {'lr': 2.1140803495490192e-06, 'samples': 24543744, 'steps': 47936, 'batch_loss/train': 1.6197147639468312} 12/29/2021 06:36:01 - INFO - codeparrot_training - Step 47937: {'lr': 2.112035206892876e-06, 'samples': 24544256, 'steps': 47937, 'batch_loss/train': 0.7598372390493751} 12/29/2021 06:36:15 - INFO - codeparrot_training - Step 47938: {'lr': 2.1099910497456577e-06, 'samples': 24544768, 'steps': 47938, 'batch_loss/train': 0.8338267714716494} 12/29/2021 06:36:25 - INFO - codeparrot_training - Step 47939: {'lr': 2.107947878115524e-06, 'samples': 24545280, 'steps': 47939, 'batch_loss/train': 0.7817827546969056} 12/29/2021 06:36:36 - INFO - codeparrot_training - Step 47940: {'lr': 2.105905692010607e-06, 'samples': 24545792, 'steps': 47940, 'batch_loss/train': 0.8721248647198081} 12/29/2021 06:36:47 - INFO - codeparrot_training - Step 47941: {'lr': 2.1038644914390116e-06, 'samples': 24546304, 'steps': 47941, 'batch_loss/train': 0.7227296666242182} 12/29/2021 06:36:59 - INFO - codeparrot_training - Step 47942: {'lr': 2.101824276408815e-06, 'samples': 24546816, 'steps': 47942, 'batch_loss/train': 0.7711533436086029} 12/29/2021 06:37:09 - INFO - codeparrot_training - Step 47943: {'lr': 2.0997850469281765e-06, 'samples': 24547328, 'steps': 47943, 'batch_loss/train': 0.7052369425073266} 12/29/2021 06:37:20 - 
INFO - codeparrot_training - Step 47944: {'lr': 2.0977468030052015e-06, 'samples': 24547840, 'steps': 47944, 'batch_loss/train': 0.780565925873816} 12/29/2021 06:37:34 - INFO - codeparrot_training - Step 47945: {'lr': 2.0957095446479667e-06, 'samples': 24548352, 'steps': 47945, 'batch_loss/train': 0.6318064937368035} 12/29/2021 06:37:45 - INFO - codeparrot_training - Step 47946: {'lr': 2.0936732718646047e-06, 'samples': 24548864, 'steps': 47946, 'batch_loss/train': 1.2046016016975045} 12/29/2021 06:37:56 - INFO - codeparrot_training - Step 47947: {'lr': 2.091637984663164e-06, 'samples': 24549376, 'steps': 47947, 'batch_loss/train': 0.780244879424572} 12/29/2021 06:38:06 - INFO - codeparrot_training - Step 47948: {'lr': 2.089603683051777e-06, 'samples': 24549888, 'steps': 47948, 'batch_loss/train': 0.7842404972761869} 12/29/2021 06:38:18 - INFO - codeparrot_training - Step 47949: {'lr': 2.0875703670385217e-06, 'samples': 24550400, 'steps': 47949, 'batch_loss/train': 0.7903517936356366} 12/29/2021 06:38:29 - INFO - codeparrot_training - Step 47950: {'lr': 2.0855380366314738e-06, 'samples': 24550912, 'steps': 47950, 'batch_loss/train': 1.0494525833055377} 12/29/2021 06:38:40 - INFO - codeparrot_training - Step 47951: {'lr': 2.083506691838738e-06, 'samples': 24551424, 'steps': 47951, 'batch_loss/train': 0.7268383298069239} 12/29/2021 06:38:52 - INFO - codeparrot_training - Step 47952: {'lr': 2.081476332668336e-06, 'samples': 24551936, 'steps': 47952, 'batch_loss/train': 0.840443829074502} 12/29/2021 06:39:02 - INFO - codeparrot_training - Step 47953: {'lr': 2.0794469591284006e-06, 'samples': 24552448, 'steps': 47953, 'batch_loss/train': 0.7439680309034884} 12/29/2021 06:39:13 - INFO - codeparrot_training - Step 47954: {'lr': 2.0774185712269522e-06, 'samples': 24552960, 'steps': 47954, 'batch_loss/train': 0.8458955371752381} 12/29/2021 06:39:27 - INFO - codeparrot_training - Step 47955: {'lr': 2.0753911689720963e-06, 'samples': 24553472, 'steps': 47955, 
'batch_loss/train': 0.6839080397039652} 12/29/2021 06:39:38 - INFO - codeparrot_training - Step 47956: {'lr': 2.0733647523718534e-06, 'samples': 24553984, 'steps': 47956, 'batch_loss/train': 0.7057524916017428} 12/29/2021 06:39:49 - INFO - codeparrot_training - Step 47957: {'lr': 2.071339321434329e-06, 'samples': 24554496, 'steps': 47957, 'batch_loss/train': 0.45943693770095706} 12/29/2021 06:39:59 - INFO - codeparrot_training - Step 47958: {'lr': 2.0693148761675164e-06, 'samples': 24555008, 'steps': 47958, 'batch_loss/train': 0.7266401117667556} 12/29/2021 06:40:11 - INFO - codeparrot_training - Step 47959: {'lr': 2.067291416579492e-06, 'samples': 24555520, 'steps': 47959, 'batch_loss/train': 0.7219248928595334} 12/29/2021 06:40:22 - INFO - codeparrot_training - Step 47960: {'lr': 2.0652689426783054e-06, 'samples': 24556032, 'steps': 47960, 'batch_loss/train': 0.6184686257038265} 12/29/2021 06:40:33 - INFO - codeparrot_training - Step 47961: {'lr': 2.0632474544720057e-06, 'samples': 24556544, 'steps': 47961, 'batch_loss/train': 0.6686196364462376} 12/29/2021 06:40:45 - INFO - codeparrot_training - Step 47962: {'lr': 2.0612269519686142e-06, 'samples': 24557056, 'steps': 47962, 'batch_loss/train': 0.7263323166407645} 12/29/2021 06:40:55 - INFO - codeparrot_training - Step 47963: {'lr': 2.0592074351761525e-06, 'samples': 24557568, 'steps': 47963, 'batch_loss/train': 0.650898611987941} 12/29/2021 06:41:06 - INFO - codeparrot_training - Step 47964: {'lr': 2.0571889041026692e-06, 'samples': 24558080, 'steps': 47964, 'batch_loss/train': 0.7199436374939978} 12/29/2021 06:41:18 - INFO - codeparrot_training - Step 47965: {'lr': 2.055171358756186e-06, 'samples': 24558592, 'steps': 47965, 'batch_loss/train': 0.725175061263144} 12/29/2021 06:41:29 - INFO - codeparrot_training - Step 47966: {'lr': 2.053154799144724e-06, 'samples': 24559104, 'steps': 47966, 'batch_loss/train': 0.7287663898896426} 12/29/2021 06:41:39 - INFO - codeparrot_training - Step 47967: {'lr': 
2.051139225276277e-06, 'samples': 24559616, 'steps': 47967, 'batch_loss/train': 0.7415033169090748} 12/29/2021 06:41:50 - INFO - codeparrot_training - Step 47968: {'lr': 2.0491246371589223e-06, 'samples': 24560128, 'steps': 47968, 'batch_loss/train': 0.6499728966155089} 12/29/2021 06:42:04 - INFO - codeparrot_training - Step 47969: {'lr': 2.047111034800597e-06, 'samples': 24560640, 'steps': 47969, 'batch_loss/train': 0.7792909536510706} 12/29/2021 06:42:15 - INFO - codeparrot_training - Step 47970: {'lr': 2.0450984182093235e-06, 'samples': 24561152, 'steps': 47970, 'batch_loss/train': 0.7300239037722349} 12/29/2021 06:42:25 - INFO - codeparrot_training - Step 47971: {'lr': 2.0430867873931224e-06, 'samples': 24561664, 'steps': 47971, 'batch_loss/train': 0.8059534290805459} 12/29/2021 06:42:38 - INFO - codeparrot_training - Step 47972: {'lr': 2.0410761423599876e-06, 'samples': 24562176, 'steps': 47972, 'batch_loss/train': 0.655590872745961} 12/29/2021 06:42:48 - INFO - codeparrot_training - Step 47973: {'lr': 2.039066483117913e-06, 'samples': 24562688, 'steps': 47973, 'batch_loss/train': 0.6465772328083403} 12/29/2021 06:42:59 - INFO - codeparrot_training - Step 47974: {'lr': 2.0370578096748914e-06, 'samples': 24563200, 'steps': 47974, 'batch_loss/train': 0.7409800002351403} 12/29/2021 06:43:13 - INFO - codeparrot_training - Step 47975: {'lr': 2.0350501220388893e-06, 'samples': 24563712, 'steps': 47975, 'batch_loss/train': 0.7156143020838499} 12/29/2021 06:43:23 - INFO - codeparrot_training - Step 47976: {'lr': 2.0330434202179003e-06, 'samples': 24564224, 'steps': 47976, 'batch_loss/train': 0.6799608441069722} 12/29/2021 06:43:34 - INFO - codeparrot_training - Step 47977: {'lr': 2.031037704219918e-06, 'samples': 24564736, 'steps': 47977, 'batch_loss/train': 0.806843354832381} 12/29/2021 06:43:46 - INFO - codeparrot_training - Step 47978: {'lr': 2.02903297405288e-06, 'samples': 24565248, 'steps': 47978, 'batch_loss/train': 0.7523118769749999} 12/29/2021 06:43:57 - 
INFO - codeparrot_training - Step 47979: {'lr': 2.0270292297247526e-06, 'samples': 24565760, 'steps': 47979, 'batch_loss/train': 0.6329756050836295} 12/29/2021 06:44:07 - INFO - codeparrot_training - Step 47980: {'lr': 2.025026471243557e-06, 'samples': 24566272, 'steps': 47980, 'batch_loss/train': 0.7355766603723168} 12/29/2021 06:44:18 - INFO - codeparrot_training - Step 47981: {'lr': 2.023024698617232e-06, 'samples': 24566784, 'steps': 47981, 'batch_loss/train': 0.8123969566076994} 12/29/2021 06:44:30 - INFO - codeparrot_training - Step 47982: {'lr': 2.021023911853714e-06, 'samples': 24567296, 'steps': 47982, 'batch_loss/train': 0.5849059915635735} 12/29/2021 06:44:41 - INFO - codeparrot_training - Step 47983: {'lr': 2.019024110960971e-06, 'samples': 24567808, 'steps': 47983, 'batch_loss/train': 0.7647467700298876} 12/29/2021 06:44:52 - INFO - codeparrot_training - Step 47984: {'lr': 2.017025295946967e-06, 'samples': 24568320, 'steps': 47984, 'batch_loss/train': 0.7604167857207358} 12/29/2021 06:45:06 - INFO - codeparrot_training - Step 47985: {'lr': 2.0150274668196133e-06, 'samples': 24568832, 'steps': 47985, 'batch_loss/train': 0.7573279021307826} 12/29/2021 06:45:16 - INFO - codeparrot_training - Step 47986: {'lr': 2.0130306235869035e-06, 'samples': 24569344, 'steps': 47986, 'batch_loss/train': 0.6647088606841862} 12/29/2021 06:45:27 - INFO - codeparrot_training - Step 47987: {'lr': 2.0110347662567196e-06, 'samples': 24569856, 'steps': 47987, 'batch_loss/train': 0.695778498891741} 12/29/2021 06:45:39 - INFO - codeparrot_training - Step 47988: {'lr': 2.009039894837056e-06, 'samples': 24570368, 'steps': 47988, 'batch_loss/train': 0.7912974543869495} 12/29/2021 06:45:50 - INFO - codeparrot_training - Step 47989: {'lr': 2.0070460093357947e-06, 'samples': 24570880, 'steps': 47989, 'batch_loss/train': 0.7681646076962352} 12/29/2021 06:46:00 - INFO - codeparrot_training - Step 47990: {'lr': 2.0050531097608737e-06, 'samples': 24571392, 'steps': 47990, 
'batch_loss/train': 0.6802233913913369} 12/29/2021 06:46:11 - INFO - codeparrot_training - Step 47991: {'lr': 2.0030611961202314e-06, 'samples': 24571904, 'steps': 47991, 'batch_loss/train': 0.7815374378114939} 12/29/2021 06:46:25 - INFO - codeparrot_training - Step 47992: {'lr': 2.0010702684217787e-06, 'samples': 24572416, 'steps': 47992, 'batch_loss/train': 0.6038015400990844} 12/29/2021 06:46:35 - INFO - codeparrot_training - Step 47993: {'lr': 1.999080326673397e-06, 'samples': 24572928, 'steps': 47993, 'batch_loss/train': 0.8322336133569479} 12/29/2021 06:46:46 - INFO - codeparrot_training - Step 47994: {'lr': 1.9970913708830807e-06, 'samples': 24573440, 'steps': 47994, 'batch_loss/train': 0.7059669164009392} 12/29/2021 06:46:58 - INFO - codeparrot_training - Step 47995: {'lr': 1.9951034010586567e-06, 'samples': 24573952, 'steps': 47995, 'batch_loss/train': 0.7483965782448649} 12/29/2021 06:47:09 - INFO - codeparrot_training - Step 47996: {'lr': 1.993116417208063e-06, 'samples': 24574464, 'steps': 47996, 'batch_loss/train': 0.862807254306972} 12/29/2021 06:47:19 - INFO - codeparrot_training - Step 47997: {'lr': 1.99113041933921e-06, 'samples': 24574976, 'steps': 47997, 'batch_loss/train': 0.7481928197667003} 12/29/2021 06:47:31 - INFO - codeparrot_training - Step 47998: {'lr': 1.9891454074599522e-06, 'samples': 24575488, 'steps': 47998, 'batch_loss/train': 0.6729669240303338} 12/29/2021 06:47:42 - INFO - codeparrot_training - Step 47999: {'lr': 1.9871613815782285e-06, 'samples': 24576000, 'steps': 47999, 'batch_loss/train': 0.7039976473897696} 12/29/2021 06:47:53 - INFO - codeparrot_training - Step 48000: {'lr': 1.985178341701893e-06, 'samples': 24576512, 'steps': 48000, 'batch_loss/train': 0.7029458270408213} 12/29/2021 06:48:03 - INFO - codeparrot_training - Step 48001: {'lr': 1.9831962878388566e-06, 'samples': 24577024, 'steps': 48001, 'batch_loss/train': 0.7524541299790144} 12/29/2021 06:48:16 - INFO - codeparrot_training - Step 48002: {'lr': 
1.9812152199969735e-06, 'samples': 24577536, 'steps': 48002, 'batch_loss/train': 0.7777227442711592} 12/29/2021 06:48:26 - INFO - codeparrot_training - Step 48003: {'lr': 1.9792351381841266e-06, 'samples': 24578048, 'steps': 48003, 'batch_loss/train': 0.7907410357147455} 12/29/2021 06:48:37 - INFO - codeparrot_training - Step 48004: {'lr': 1.977256042408199e-06, 'samples': 24578560, 'steps': 48004, 'batch_loss/train': 0.673704874701798} 12/29/2021 06:48:51 - INFO - codeparrot_training - Step 48005: {'lr': 1.9752779326770444e-06, 'samples': 24579072, 'steps': 48005, 'batch_loss/train': 0.742690552957356} 12/29/2021 06:49:01 - INFO - codeparrot_training - Step 48006: {'lr': 1.973300808998546e-06, 'samples': 24579584, 'steps': 48006, 'batch_loss/train': 0.6570942191174254} 12/29/2021 06:49:12 - INFO - codeparrot_training - Step 48007: {'lr': 1.9713246713805587e-06, 'samples': 24580096, 'steps': 48007, 'batch_loss/train': 0.7511862255632877} 12/29/2021 06:49:24 - INFO - codeparrot_training - Step 48008: {'lr': 1.9693495198309098e-06, 'samples': 24580608, 'steps': 48008, 'batch_loss/train': 0.6833125650882721} 12/29/2021 06:49:35 - INFO - codeparrot_training - Step 48009: {'lr': 1.967375354357481e-06, 'samples': 24581120, 'steps': 48009, 'batch_loss/train': 0.7613055668771267} 12/29/2021 06:49:45 - INFO - codeparrot_training - Step 48010: {'lr': 1.9654021749681285e-06, 'samples': 24581632, 'steps': 48010, 'batch_loss/train': 0.7975763976573944} 12/29/2021 06:49:56 - INFO - codeparrot_training - Step 48011: {'lr': 1.96342998167065e-06, 'samples': 24582144, 'steps': 48011, 'batch_loss/train': 0.6888624876737595} 12/29/2021 06:50:08 - INFO - codeparrot_training - Step 48012: {'lr': 1.9614587744729573e-06, 'samples': 24582656, 'steps': 48012, 'batch_loss/train': 0.7389739067293704} 12/29/2021 06:50:19 - INFO - codeparrot_training - Step 48013: {'lr': 1.959488553382821e-06, 'samples': 24583168, 'steps': 48013, 'batch_loss/train': 0.7848729073884897} 12/29/2021 06:50:29 - 
INFO - codeparrot_training - Step 48014: {'lr': 1.957519318408124e-06, 'samples': 24583680, 'steps': 48014, 'batch_loss/train': 0.66167862189468} 12/29/2021 06:50:44 - INFO - codeparrot_training - Step 48015: {'lr': 1.9555510695566656e-06, 'samples': 24584192, 'steps': 48015, 'batch_loss/train': 0.7210561446845531} 12/29/2021 06:50:54 - INFO - codeparrot_training - Step 48016: {'lr': 1.9535838068362453e-06, 'samples': 24584704, 'steps': 48016, 'batch_loss/train': 0.7742695715278387} 12/29/2021 06:51:05 - INFO - codeparrot_training - Step 48017: {'lr': 1.9516175302547455e-06, 'samples': 24585216, 'steps': 48017, 'batch_loss/train': 0.7428563832072541} 12/29/2021 06:51:17 - INFO - codeparrot_training - Step 48018: {'lr': 1.949652239819966e-06, 'samples': 24585728, 'steps': 48018, 'batch_loss/train': 0.5295855798758566} 12/29/2021 06:51:28 - INFO - codeparrot_training - Step 48019: {'lr': 1.947687935539705e-06, 'samples': 24586240, 'steps': 48019, 'batch_loss/train': 0.509128155419603} 12/29/2021 06:51:38 - INFO - codeparrot_training - Step 48020: {'lr': 1.9457246174217624e-06, 'samples': 24586752, 'steps': 48020, 'batch_loss/train': 0.7032130118459463} 12/29/2021 06:51:49 - INFO - codeparrot_training - Step 48021: {'lr': 1.9437622854739656e-06, 'samples': 24587264, 'steps': 48021, 'batch_loss/train': 0.7128009248990566} 12/29/2021 06:52:03 - INFO - codeparrot_training - Step 48022: {'lr': 1.941800939704114e-06, 'samples': 24587776, 'steps': 48022, 'batch_loss/train': 0.7007823293097317} 12/29/2021 06:52:14 - INFO - codeparrot_training - Step 48023: {'lr': 1.9398405801199783e-06, 'samples': 24588288, 'steps': 48023, 'batch_loss/train': 0.7069714251556434} 12/29/2021 06:52:24 - INFO - codeparrot_training - Step 48024: {'lr': 1.937881206729386e-06, 'samples': 24588800, 'steps': 48024, 'batch_loss/train': 0.7836095774546266} 12/29/2021 06:52:36 - INFO - codeparrot_training - Step 48025: {'lr': 1.935922819540109e-06, 'samples': 24589312, 'steps': 48025, 
'batch_loss/train': 0.686197841539979} 12/29/2021 06:52:47 - INFO - codeparrot_training - Step 48026: {'lr': 1.9339654185599465e-06, 'samples': 24589824, 'steps': 48026, 'batch_loss/train': 0.7704199128784239} 12/29/2021 06:52:58 - INFO - codeparrot_training - Step 48027: {'lr': 1.9320090037966697e-06, 'samples': 24590336, 'steps': 48027, 'batch_loss/train': 0.6565401146945078} 12/29/2021 06:53:10 - INFO - codeparrot_training - Step 48028: {'lr': 1.9300535752580505e-06, 'samples': 24590848, 'steps': 48028, 'batch_loss/train': 0.7476347894407809} 12/29/2021 06:53:21 - INFO - codeparrot_training - Step 48029: {'lr': 1.9280991329518605e-06, 'samples': 24591360, 'steps': 48029, 'batch_loss/train': 1.186897175386548} 12/29/2021 06:53:32 - INFO - codeparrot_training - Step 48030: {'lr': 1.926145676885899e-06, 'samples': 24591872, 'steps': 48030, 'batch_loss/train': 0.5946983993053436} 12/29/2021 06:53:42 - INFO - codeparrot_training - Step 48031: {'lr': 1.924193207067909e-06, 'samples': 24592384, 'steps': 48031, 'batch_loss/train': 0.7761753033846617} 12/29/2021 06:53:56 - INFO - codeparrot_training - Step 48032: {'lr': 1.9222417235056633e-06, 'samples': 24592896, 'steps': 48032, 'batch_loss/train': 0.7811786448583007} 12/29/2021 06:54:07 - INFO - codeparrot_training - Step 48033: {'lr': 1.920291226206905e-06, 'samples': 24593408, 'steps': 48033, 'batch_loss/train': 0.7282462911680341} 12/29/2021 06:54:17 - INFO - codeparrot_training - Step 48034: {'lr': 1.918341715179406e-06, 'samples': 24593920, 'steps': 48034, 'batch_loss/train': 0.7829730128869414} 12/29/2021 06:54:30 - INFO - codeparrot_training - Step 48035: {'lr': 1.9163931904309094e-06, 'samples': 24594432, 'steps': 48035, 'batch_loss/train': 0.7534253350459039} 12/29/2021 06:54:40 - INFO - codeparrot_training - Step 48036: {'lr': 1.9144456519691323e-06, 'samples': 24594944, 'steps': 48036, 'batch_loss/train': 0.7320773182436824} 12/29/2021 06:54:51 - INFO - codeparrot_training - Step 48037: {'lr': 
1.912499099801873e-06, 'samples': 24595456, 'steps': 48037, 'batch_loss/train': 0.7369944136589766}
12/29/2021 06:55:01 - INFO - codeparrot_training - Step 48038: {'lr': 1.9105535339368484e-06, 'samples': 24595968, 'steps': 48038, 'batch_loss/train': 0.7668224675580859}
12/29/2021 06:55:14 - INFO - codeparrot_training - Step 48039: {'lr': 1.908608954381774e-06, 'samples': 24596480, 'steps': 48039, 'batch_loss/train': 0.7547698458656669}
12/29/2021 06:55:24 - INFO - codeparrot_training - Step 48040: {'lr': 1.906665361144394e-06, 'samples': 24596992, 'steps': 48040, 'batch_loss/train': 0.6114537864923477}
12/29/2021 06:55:35 - INFO - codeparrot_training - Step 48041: {'lr': 1.904722754232424e-06, 'samples': 24597504, 'steps': 48041, 'batch_loss/train': 0.9166889805346727}
12/29/2021 06:55:49 - INFO - codeparrot_training - Step 48042: {'lr': 1.902781133653636e-06, 'samples': 24598016, 'steps': 48042, 'batch_loss/train': 0.7188019130844623}
12/29/2021 06:56:00 - INFO - codeparrot_training - Step 48043: {'lr': 1.9008404994156902e-06, 'samples': 24598528, 'steps': 48043, 'batch_loss/train': 0.71328095626086}
12/29/2021 06:56:10 - INFO - codeparrot_training - Step 48044: {'lr': 1.8989008515263306e-06, 'samples': 24599040, 'steps': 48044, 'batch_loss/train': 0.8136280914768577}
12/29/2021 06:56:22 - INFO - codeparrot_training - Step 48045: {'lr': 1.8969621899932454e-06, 'samples': 24599552, 'steps': 48045, 'batch_loss/train': 0.7626346936449409}
12/29/2021 06:56:33 - INFO - codeparrot_training - Step 48046: {'lr': 1.8950245148241508e-06, 'samples': 24600064, 'steps': 48046, 'batch_loss/train': 0.7438567336648703}
12/29/2021 06:56:43 - INFO - codeparrot_training - Step 48047: {'lr': 1.8930878260267904e-06, 'samples': 24600576, 'steps': 48047, 'batch_loss/train': 0.7712875853758305}
12/29/2021 06:56:56 - INFO - codeparrot_training - Step 48048: {'lr': 1.8911521236088247e-06, 'samples': 24601088, 'steps': 48048, 'batch_loss/train': 0.7253146813600324}
12/29/2021 06:57:07 - INFO - codeparrot_training - Step 48049: {'lr': 1.8892174075779145e-06, 'samples': 24601600, 'steps': 48049, 'batch_loss/train': 0.7432173443958163}
12/29/2021 06:57:17 - INFO - codeparrot_training - Step 48050: {'lr': 1.887283677941859e-06, 'samples': 24602112, 'steps': 48050, 'batch_loss/train': 0.7301389849744737}
12/29/2021 06:57:28 - INFO - codeparrot_training - Step 48051: {'lr': 1.8853509347082355e-06, 'samples': 24602624, 'steps': 48051, 'batch_loss/train': 0.7482617853675038}
12/29/2021 06:57:40 - INFO - codeparrot_training - Step 48052: {'lr': 1.8834191778847876e-06, 'samples': 24603136, 'steps': 48052, 'batch_loss/train': 0.7735665997024626}
12/29/2021 06:57:51 - INFO - codeparrot_training - Step 48053: {'lr': 1.881488407479176e-06, 'samples': 24603648, 'steps': 48053, 'batch_loss/train': 1.098793406621553}
12/29/2021 06:58:02 - INFO - codeparrot_training - Step 48054: {'lr': 1.879558623499089e-06, 'samples': 24604160, 'steps': 48054, 'batch_loss/train': 0.723070815205574}
12/29/2021 06:58:15 - INFO - codeparrot_training - Step 48055: {'lr': 1.8776298259521596e-06, 'samples': 24604672, 'steps': 48055, 'batch_loss/train': 0.7969305836595595}
12/29/2021 06:58:26 - INFO - codeparrot_training - Step 48056: {'lr': 1.8757020148461035e-06, 'samples': 24605184, 'steps': 48056, 'batch_loss/train': 0.66534993465757}
12/29/2021 06:58:37 - INFO - codeparrot_training - Step 48057: {'lr': 1.8737751901885813e-06, 'samples': 24605696, 'steps': 48057, 'batch_loss/train': 0.747165369684808}
12/29/2021 06:58:47 - INFO - codeparrot_training - Step 48058: {'lr': 1.8718493519872537e-06, 'samples': 24606208, 'steps': 48058, 'batch_loss/train': 0.7317967140115798}
12/29/2021 06:58:59 - INFO - codeparrot_training - Step 48059: {'lr': 1.8699245002497257e-06, 'samples': 24606720, 'steps': 48059, 'batch_loss/train': 0.5710605320055038}
12/29/2021 06:59:10 - INFO - codeparrot_training - Step 48060: {'lr': 1.868000634983713e-06, 'samples': 24607232, 'steps': 48060, 'batch_loss/train': 0.6724818598013371}
12/29/2021 06:59:21 - INFO - codeparrot_training - Step 48061: {'lr': 1.8660777561968212e-06, 'samples': 24607744, 'steps': 48061, 'batch_loss/train': 0.7476045023649931}
12/29/2021 06:59:35 - INFO - codeparrot_training - Step 48062: {'lr': 1.8641558638967383e-06, 'samples': 24608256, 'steps': 48062, 'batch_loss/train': 0.674887839704752}
12/29/2021 06:59:45 - INFO - codeparrot_training - Step 48063: {'lr': 1.8622349580910413e-06, 'samples': 24608768, 'steps': 48063, 'batch_loss/train': 0.7020787661895156}
12/29/2021 06:59:56 - INFO - codeparrot_training - Step 48064: {'lr': 1.8603150387874469e-06, 'samples': 24609280, 'steps': 48064, 'batch_loss/train': 0.7314933703746647}
12/29/2021 07:00:08 - INFO - codeparrot_training - Step 48065: {'lr': 1.858396105993504e-06, 'samples': 24609792, 'steps': 48065, 'batch_loss/train': 0.8184639634564519}
12/29/2021 07:00:19 - INFO - codeparrot_training - Step 48066: {'lr': 1.8564781597169012e-06, 'samples': 24610304, 'steps': 48066, 'batch_loss/train': 0.7241315161809325}
12/29/2021 07:00:29 - INFO - codeparrot_training - Step 48067: {'lr': 1.854561199965271e-06, 'samples': 24610816, 'steps': 48067, 'batch_loss/train': 0.6708775060251355}
12/29/2021 07:00:40 - INFO - codeparrot_training - Step 48068: {'lr': 1.8526452267461636e-06, 'samples': 24611328, 'steps': 48068, 'batch_loss/train': 0.8180672032758594}
12/29/2021 07:00:52 - INFO - codeparrot_training - Step 48069: {'lr': 1.8507302400672666e-06, 'samples': 24611840, 'steps': 48069, 'batch_loss/train': 0.6604659301228821}
12/29/2021 07:01:03 - INFO - codeparrot_training - Step 48070: {'lr': 1.8488162399361852e-06, 'samples': 24612352, 'steps': 48070, 'batch_loss/train': 0.8188786851242185}
12/29/2021 07:01:14 - INFO - codeparrot_training - Step 48071: {'lr': 1.8469032263604967e-06, 'samples': 24612864, 'steps': 48071, 'batch_loss/train': 0.7908947048708797}
12/29/2021 07:01:28 - INFO - codeparrot_training - Step 48072: {'lr': 1.8449911993478064e-06, 'samples': 24613376, 'steps': 48072, 'batch_loss/train': 0.7366464098449796}
12/29/2021 07:01:38 - INFO - codeparrot_training - Step 48073: {'lr': 1.8430801589057467e-06, 'samples': 24613888, 'steps': 48073, 'batch_loss/train': 0.4550447320798412}
12/29/2021 07:01:49 - INFO - codeparrot_training - Step 48074: {'lr': 1.841170105041895e-06, 'samples': 24614400, 'steps': 48074, 'batch_loss/train': 0.7960298303514719}
12/29/2021 07:02:01 - INFO - codeparrot_training - Step 48075: {'lr': 1.8392610377638563e-06, 'samples': 24614912, 'steps': 48075, 'batch_loss/train': 0.6386082605458796}
12/29/2021 07:02:12 - INFO - codeparrot_training - Step 48076: {'lr': 1.837352957079208e-06, 'samples': 24615424, 'steps': 48076, 'batch_loss/train': 0.6185552791575901}
12/29/2021 07:02:22 - INFO - codeparrot_training - Step 48077: {'lr': 1.835445862995555e-06, 'samples': 24615936, 'steps': 48077, 'batch_loss/train': 0.7482722809072584}
12/29/2021 07:02:33 - INFO - codeparrot_training - Step 48078: {'lr': 1.833539755520447e-06, 'samples': 24616448, 'steps': 48078, 'batch_loss/train': 0.7160250288434327}
12/29/2021 07:02:45 - INFO - codeparrot_training - Step 48079: {'lr': 1.8316346346614887e-06, 'samples': 24616960, 'steps': 48079, 'batch_loss/train': 0.8042460763826966}
12/29/2021 07:02:56 - INFO - codeparrot_training - Step 48080: {'lr': 1.8297305004262576e-06, 'samples': 24617472, 'steps': 48080, 'batch_loss/train': 0.7407947642495856}
12/29/2021 07:03:07 - INFO - codeparrot_training - Step 48081: {'lr': 1.8278273528223032e-06, 'samples': 24617984, 'steps': 48081, 'batch_loss/train': 0.7185848960652947}
12/29/2021 07:03:19 - INFO - codeparrot_training - Step 48082: {'lr': 1.8259251918572028e-06, 'samples': 24618496, 'steps': 48082, 'batch_loss/train': 0.9031730759888887}
12/29/2021 07:03:29 - INFO - codeparrot_training - Step 48083: {'lr': 1.8240240175385059e-06, 'samples': 24619008, 'steps': 48083, 'batch_loss/train': 0.7384878965094686}
12/29/2021 07:03:40 - INFO - codeparrot_training - Step 48084: {'lr': 1.8221238298737897e-06, 'samples': 24619520, 'steps': 48084, 'batch_loss/train': 0.7060397979803383}
12/29/2021 07:03:54 - INFO - codeparrot_training - Step 48085: {'lr': 1.8202246288706037e-06, 'samples': 24620032, 'steps': 48085, 'batch_loss/train': 0.776437824126333}
12/29/2021 07:04:05 - INFO - codeparrot_training - Step 48086: {'lr': 1.8183264145364697e-06, 'samples': 24620544, 'steps': 48086, 'batch_loss/train': 0.7327298955060542}
12/29/2021 07:04:15 - INFO - codeparrot_training - Step 48087: {'lr': 1.8164291868790206e-06, 'samples': 24621056, 'steps': 48087, 'batch_loss/train': 0.6201968500972725}
12/29/2021 07:04:26 - INFO - codeparrot_training - Step 48088: {'lr': 1.8145329459056948e-06, 'samples': 24621568, 'steps': 48088, 'batch_loss/train': 0.7860184612218291}
12/29/2021 07:04:38 - INFO - codeparrot_training - Step 48089: {'lr': 1.8126376916240694e-06, 'samples': 24622080, 'steps': 48089, 'batch_loss/train': 0.7368827832397074}
12/29/2021 07:04:49 - INFO - codeparrot_training - Step 48090: {'lr': 1.8107434240416942e-06, 'samples': 24622592, 'steps': 48090, 'batch_loss/train': 0.7567741395905614}
12/29/2021 07:04:59 - INFO - codeparrot_training - Step 48091: {'lr': 1.808850143166091e-06, 'samples': 24623104, 'steps': 48091, 'batch_loss/train': 0.7656571017578244}
12/29/2021 07:05:14 - INFO - codeparrot_training - Step 48092: {'lr': 1.8069578490047812e-06, 'samples': 24623616, 'steps': 48092, 'batch_loss/train': 0.6644940879195929}
12/29/2021 07:05:24 - INFO - codeparrot_training - Step 48093: {'lr': 1.8050665415653144e-06, 'samples': 24624128, 'steps': 48093, 'batch_loss/train': 0.6905902717262506}
12/29/2021 07:05:35 - INFO - codeparrot_training - Step 48094: {'lr': 1.8031762208551571e-06, 'samples': 24624640, 'steps': 48094, 'batch_loss/train': 0.6945527605712414}
12/29/2021 07:05:47 - INFO - codeparrot_training - Step 48095: {'lr': 1.8012868868818865e-06, 'samples': 24625152, 'steps': 48095, 'batch_loss/train': 0.6087212348356843}
12/29/2021 07:05:58 - INFO - codeparrot_training - Step 48096: {'lr': 1.7993985396529688e-06, 'samples': 24625664, 'steps': 48096, 'batch_loss/train': 0.7627649027854204}
12/29/2021 07:06:09 - INFO - codeparrot_training - Step 48097: {'lr': 1.7975111791759257e-06, 'samples': 24626176, 'steps': 48097, 'batch_loss/train': 0.8179545197635889}
12/29/2021 07:06:19 - INFO - codeparrot_training - Step 48098: {'lr': 1.7956248054582513e-06, 'samples': 24626688, 'steps': 48098, 'batch_loss/train': 0.7152895163744688}
12/29/2021 07:06:32 - INFO - codeparrot_training - Step 48099: {'lr': 1.7937394185074673e-06, 'samples': 24627200, 'steps': 48099, 'batch_loss/train': 0.7016514586284757}
12/29/2021 07:06:43 - INFO - codeparrot_training - Step 48100: {'lr': 1.7918550183310677e-06, 'samples': 24627712, 'steps': 48100, 'batch_loss/train': 0.7437138739041984}
12/29/2021 07:06:53 - INFO - codeparrot_training - Step 48101: {'lr': 1.7899716049364912e-06, 'samples': 24628224, 'steps': 48101, 'batch_loss/train': 0.7979004764929414}
12/29/2021 07:07:08 - INFO - codeparrot_training - Step 48102: {'lr': 1.7880891783313146e-06, 'samples': 24628736, 'steps': 48102, 'batch_loss/train': 0.7488568918779492}
12/29/2021 07:07:18 - INFO - codeparrot_training - Step 48103: {'lr': 1.786207738522949e-06, 'samples': 24629248, 'steps': 48103, 'batch_loss/train': 0.6508500841446221}
12/29/2021 07:07:29 - INFO - codeparrot_training - Step 48104: {'lr': 1.7843272855189163e-06, 'samples': 24629760, 'steps': 48104, 'batch_loss/train': 0.8456008424982429}
12/29/2021 07:07:41 - INFO - codeparrot_training - Step 48105: {'lr': 1.7824478193266547e-06, 'samples': 24630272, 'steps': 48105, 'batch_loss/train': 0.734898654744029}
12/29/2021 07:07:52 - INFO - codeparrot_training - Step 48106: {'lr': 1.7805693399536583e-06, 'samples': 24630784, 'steps': 48106, 'batch_loss/train': 0.6994468932971358}
12/29/2021 07:08:03 - INFO - codeparrot_training - Step 48107: {'lr': 1.7786918474073932e-06, 'samples': 24631296, 'steps': 48107, 'batch_loss/train': 0.7011915906332433}
12/29/2021 07:08:13 - INFO - codeparrot_training - Step 48108: {'lr': 1.7768153416953536e-06, 'samples': 24631808, 'steps': 48108, 'batch_loss/train': 0.7531995642930269}
12/29/2021 07:08:27 - INFO - codeparrot_training - Step 48109: {'lr': 1.7749398228249503e-06, 'samples': 24632320, 'steps': 48109, 'batch_loss/train': 0.7528270841576159}
12/29/2021 07:08:38 - INFO - codeparrot_training - Step 48110: {'lr': 1.7730652908036493e-06, 'samples': 24632832, 'steps': 48110, 'batch_loss/train': 0.6991562554612756}
12/29/2021 07:08:48 - INFO - codeparrot_training - Step 48111: {'lr': 1.7711917456389448e-06, 'samples': 24633344, 'steps': 48111, 'batch_loss/train': 0.7362428614869714}
12/29/2021 07:09:01 - INFO - codeparrot_training - Step 48112: {'lr': 1.7693191873382198e-06, 'samples': 24633856, 'steps': 48112, 'batch_loss/train': 0.7232697084546089}
12/29/2021 07:09:12 - INFO - codeparrot_training - Step 48113: {'lr': 1.767447615908968e-06, 'samples': 24634368, 'steps': 48113, 'batch_loss/train': 0.586669534444809}
12/29/2021 07:09:22 - INFO - codeparrot_training - Step 48114: {'lr': 1.7655770313586283e-06, 'samples': 24634880, 'steps': 48114, 'batch_loss/train': 0.7345542582916096}
12/29/2021 07:09:34 - INFO - codeparrot_training - Step 48115: {'lr': 1.7637074336945834e-06, 'samples': 24635392, 'steps': 48115, 'batch_loss/train': 0.7531706746667624}
12/29/2021 07:09:45 - INFO - codeparrot_training - Step 48116: {'lr': 1.7618388229243554e-06, 'samples': 24635904, 'steps': 48116, 'batch_loss/train': 0.7217569686472416}
12/29/2021 07:09:55 - INFO - codeparrot_training - Step 48117: {'lr': 1.759971199055299e-06, 'samples': 24636416, 'steps': 48117, 'batch_loss/train': 0.6955771539360285}
12/29/2021 07:10:06 - INFO - codeparrot_training - Step 48118: {'lr': 1.758104562094881e-06, 'samples': 24636928, 'steps': 48118, 'batch_loss/train': 0.7765219230204821}
12/29/2021 07:10:18 - INFO - codeparrot_training - Step 48119: {'lr': 1.756238912050484e-06, 'samples': 24637440, 'steps': 48119, 'batch_loss/train': 0.8070937562733889}
12/29/2021 07:10:29 - INFO - codeparrot_training - Step 48120: {'lr': 1.7543742489295745e-06, 'samples': 24637952, 'steps': 48120, 'batch_loss/train': 0.6995236072689295}
12/29/2021 07:10:40 - INFO - codeparrot_training - Step 48121: {'lr': 1.7525105727395074e-06, 'samples': 24638464, 'steps': 48121, 'batch_loss/train': 0.7725237607955933}
12/29/2021 07:10:53 - INFO - codeparrot_training - Step 48122: {'lr': 1.7506478834877493e-06, 'samples': 24638976, 'steps': 48122, 'batch_loss/train': 0.730504245031625}
12/29/2021 07:11:04 - INFO - codeparrot_training - Step 48123: {'lr': 1.7487861811816552e-06, 'samples': 24639488, 'steps': 48123, 'batch_loss/train': 0.7343668836401775}
12/29/2021 07:11:15 - INFO - codeparrot_training - Step 48124: {'lr': 1.7469254658286914e-06, 'samples': 24640000, 'steps': 48124, 'batch_loss/train': 0.7464088359847665}
12/29/2021 07:11:27 - INFO - codeparrot_training - Step 48125: {'lr': 1.7450657374361855e-06, 'samples': 24640512, 'steps': 48125, 'batch_loss/train': 0.6406079921871424}
12/29/2021 07:11:38 - INFO - codeparrot_training - Step 48126: {'lr': 1.743206996011576e-06, 'samples': 24641024, 'steps': 48126, 'batch_loss/train': 0.7150219837203622}
12/29/2021 07:11:48 - INFO - codeparrot_training - Step 48127: {'lr': 1.7413492415622179e-06, 'samples': 24641536, 'steps': 48127, 'batch_loss/train': 0.6654278634232469}
12/29/2021 07:11:59 - INFO - codeparrot_training - Step 48128: {'lr': 1.7394924740955498e-06, 'samples': 24642048, 'steps': 48128, 'batch_loss/train': 0.6904932525940239}
12/29/2021 07:12:11 - INFO - codeparrot_training - Step 48129: {'lr': 1.7376366936188714e-06, 'samples': 24642560, 'steps': 48129, 'batch_loss/train': 0.7809798894450068}
12/29/2021 07:12:22 - INFO - codeparrot_training - Step 48130: {'lr': 1.7357819001396213e-06, 'samples': 24643072, 'steps': 48130, 'batch_loss/train': 0.6804253533482552}
12/29/2021 07:12:33 - INFO - codeparrot_training - Step 48131: {'lr': 1.73392809366521e-06, 'samples': 24643584, 'steps': 48131, 'batch_loss/train': 0.6500578029081225}
12/29/2021 07:12:47 - INFO - codeparrot_training - Step 48132: {'lr': 1.73207527420291e-06, 'samples': 24644096, 'steps': 48132, 'batch_loss/train': 0.8001793790608644}
12/29/2021 07:12:57 - INFO - codeparrot_training - Step 48133: {'lr': 1.7302234417601593e-06, 'samples': 24644608, 'steps': 48133, 'batch_loss/train': 0.71995262010023}
12/29/2021 07:13:08 - INFO - codeparrot_training - Step 48134: {'lr': 1.7283725963442854e-06, 'samples': 24645120, 'steps': 48134, 'batch_loss/train': 0.7882889942266047}
12/29/2021 07:13:20 - INFO - codeparrot_training - Step 48135: {'lr': 1.7265227379626713e-06, 'samples': 24645632, 'steps': 48135, 'batch_loss/train': 0.7124798013828695}
12/29/2021 07:13:30 - INFO - codeparrot_training - Step 48136: {'lr': 1.7246738666226446e-06, 'samples': 24646144, 'steps': 48136, 'batch_loss/train': 0.7809601151384413}
12/29/2021 07:13:41 - INFO - codeparrot_training - Step 48137: {'lr': 1.7228259823315885e-06, 'samples': 24646656, 'steps': 48137, 'batch_loss/train': 0.7051076762145385}
12/29/2021 07:13:52 - INFO - codeparrot_training - Step 48138: {'lr': 1.720979085096802e-06, 'samples': 24647168, 'steps': 48138, 'batch_loss/train': 0.6286134077236056}
12/29/2021 07:14:06 - INFO - codeparrot_training - Step 48139: {'lr': 1.719133174925669e-06, 'samples': 24647680, 'steps': 48139, 'batch_loss/train': 0.7242271043360233}
12/29/2021 07:14:16 - INFO - codeparrot_training - Step 48140: {'lr': 1.717288251825516e-06, 'samples': 24648192, 'steps': 48140, 'batch_loss/train': 0.7374797107186168}
12/29/2021 07:14:27 - INFO - codeparrot_training - Step 48141: {'lr': 1.7154443158036715e-06, 'samples': 24648704, 'steps': 48141, 'batch_loss/train': 0.6240693160798401}
12/29/2021 07:14:39 - INFO - codeparrot_training - Step 48142: {'lr': 1.71360136686749e-06, 'samples': 24649216, 'steps': 48142, 'batch_loss/train': 0.6566130956634879}
12/29/2021 07:14:50 - INFO - codeparrot_training - Step 48143: {'lr': 1.7117594050242714e-06, 'samples': 24649728, 'steps': 48143, 'batch_loss/train': 0.7275932729244232}
12/29/2021 07:15:00 - INFO - codeparrot_training - Step 48144: {'lr': 1.7099184302813431e-06, 'samples': 24650240, 'steps': 48144, 'batch_loss/train': 0.7329357411945239}
12/29/2021 07:15:12 - INFO - codeparrot_training - Step 48145: {'lr': 1.708078442646005e-06, 'samples': 24650752, 'steps': 48145, 'batch_loss/train': 0.7914088771212846}
12/29/2021 07:15:23 - INFO - codeparrot_training - Step 48146: {'lr': 1.70623944212564e-06, 'samples': 24651264, 'steps': 48146, 'batch_loss/train': 0.7520077931694686}
12/29/2021 07:15:34 - INFO - codeparrot_training - Step 48147: {'lr': 1.7044014287274923e-06, 'samples': 24651776, 'steps': 48147, 'batch_loss/train': 0.7054177051177248}
12/29/2021 07:15:44 - INFO - codeparrot_training - Step 48148: {'lr': 1.7025644024588893e-06, 'samples': 24652288, 'steps': 48148, 'batch_loss/train': 0.5888504860922694}
12/29/2021 07:15:58 - INFO - codeparrot_training - Step 48149: {'lr': 1.7007283633271308e-06, 'samples': 24652800, 'steps': 48149, 'batch_loss/train': 0.7287151651107706}
12/29/2021 07:16:09 - INFO - codeparrot_training - Step 48150: {'lr': 1.6988933113395443e-06, 'samples': 24653312, 'steps': 48150, 'batch_loss/train': 0.820059600751847}
12/29/2021 07:16:19 - INFO - codeparrot_training - Step 48151: {'lr': 1.6970592465034018e-06, 'samples': 24653824, 'steps': 48151, 'batch_loss/train': 0.7766981655731797}
12/29/2021 07:16:32 - INFO - codeparrot_training - Step 48152: {'lr': 1.6952261688259752e-06, 'samples': 24654336, 'steps': 48152, 'batch_loss/train': 0.6850547580979764}
12/29/2021 07:16:42 - INFO - codeparrot_training - Step 48153: {'lr': 1.693394078314564e-06, 'samples': 24654848, 'steps': 48153, 'batch_loss/train': 0.6093862220877782}
12/29/2021 07:16:53 - INFO - codeparrot_training - Step 48154: {'lr': 1.691562974976496e-06, 'samples': 24655360, 'steps': 48154, 'batch_loss/train': 0.8791398564353585}
12/29/2021 07:17:05 - INFO - codeparrot_training - Step 48155: {'lr': 1.6897328588190153e-06, 'samples': 24655872, 'steps': 48155, 'batch_loss/train': 0.7502018045634031}
12/29/2021 07:17:16 - INFO - codeparrot_training - Step 48156: {'lr': 1.687903729849366e-06, 'samples': 24656384, 'steps': 48156, 'batch_loss/train': 0.6492606499232352}
12/29/2021 07:17:26 - INFO - codeparrot_training - Step 48157: {'lr': 1.6860755880748758e-06, 'samples': 24656896, 'steps': 48157, 'batch_loss/train': 0.8497391319833696}
12/29/2021 07:17:37 - INFO - codeparrot_training - Step 48158: {'lr': 1.6842484335027885e-06, 'samples': 24657408, 'steps': 48158, 'batch_loss/train': 0.70508743962273}
12/29/2021 07:17:49 - INFO - codeparrot_training - Step 48159: {'lr': 1.6824222661403765e-06, 'samples': 24657920, 'steps': 48159, 'batch_loss/train': 0.7412471234565601}
12/29/2021 07:18:00 - INFO - codeparrot_training - Step 48160: {'lr': 1.680597085994856e-06, 'samples': 24658432, 'steps': 48160, 'batch_loss/train': 0.7554279399919324}
12/29/2021 07:18:10 - INFO - codeparrot_training - Step 48161: {'lr': 1.6787728930735546e-06, 'samples': 24658944, 'steps': 48161, 'batch_loss/train': 0.6838875566609204}
12/29/2021 07:18:25 - INFO - codeparrot_training - Step 48162: {'lr': 1.6769496873836887e-06, 'samples': 24659456, 'steps': 48162, 'batch_loss/train': 0.7663269150070846}
12/29/2021 07:18:35 - INFO - codeparrot_training - Step 48163: {'lr': 1.6751274689325023e-06, 'samples': 24659968, 'steps': 48163, 'batch_loss/train': 0.8021709127351642}
12/29/2021 07:18:46 - INFO - codeparrot_training - Step 48164: {'lr': 1.67330623772724e-06, 'samples': 24660480, 'steps': 48164, 'batch_loss/train': 0.6666919912677258}
12/29/2021 07:18:58 - INFO - codeparrot_training - Step 48165: {'lr': 1.6714859937751458e-06, 'samples': 24660992, 'steps': 48165, 'batch_loss/train': 0.8366995179094374}
12/29/2021 07:19:09 - INFO - codeparrot_training - Step 48166: {'lr': 1.6696667370834639e-06, 'samples': 24661504, 'steps': 48166, 'batch_loss/train': 0.6596614615991712}
12/29/2021 07:19:19 - INFO - codeparrot_training - Step 48167: {'lr': 1.6678484676594108e-06, 'samples': 24662016, 'steps': 48167, 'batch_loss/train': 0.7566618146374822}
12/29/2021 07:19:30 - INFO - codeparrot_training - Step 48168: {'lr': 1.6660311855102306e-06, 'samples': 24662528, 'steps': 48168, 'batch_loss/train': 0.5424173781648278}
12/29/2021 07:19:44 - INFO - codeparrot_training - Step 48169: {'lr': 1.6642148906431397e-06, 'samples': 24663040, 'steps': 48169, 'batch_loss/train': 0.7895241249352694}
12/29/2021 07:19:54 - INFO - codeparrot_training - Step 48170: {'lr': 1.6623995830653827e-06, 'samples': 24663552, 'steps': 48170, 'batch_loss/train': 0.7455487502738833}
12/29/2021 07:20:05 - INFO - codeparrot_training - Step 48171: {'lr': 1.6605852627841479e-06, 'samples': 24664064, 'steps': 48171, 'batch_loss/train': 0.7901531532406807}
12/29/2021 07:20:18 - INFO - codeparrot_training - Step 48172: {'lr': 1.6587719298066517e-06, 'samples': 24664576, 'steps': 48172, 'batch_loss/train': 0.6604746568482369}
12/29/2021 07:20:29 - INFO - codeparrot_training - Step 48173: {'lr': 1.6569595841401108e-06, 'samples': 24665088, 'steps': 48173, 'batch_loss/train': 0.6175311300903559}
12/29/2021 07:20:39 - INFO - codeparrot_training - Step 48174: {'lr': 1.6551482257917138e-06, 'samples': 24665600, 'steps': 48174, 'batch_loss/train': 0.6144176460802555}
12/29/2021 07:20:52 - INFO - codeparrot_training - Step 48175: {'lr': 1.6533378547686773e-06, 'samples': 24666112, 'steps': 48175, 'batch_loss/train': 0.7256717374548316}
12/29/2021 07:21:02 - INFO - codeparrot_training - Step 48176: {'lr': 1.6515284710782174e-06, 'samples': 24666624, 'steps': 48176, 'batch_loss/train': 0.5508596799336374}
12/29/2021 07:21:13 - INFO - codeparrot_training - Step 48177: {'lr': 1.6497200747274955e-06, 'samples': 24667136, 'steps': 48177, 'batch_loss/train': 0.6801609003450722}
12/29/2021 07:21:24 - INFO - codeparrot_training - Step 48178: {'lr': 1.6479126657236999e-06, 'samples': 24667648, 'steps': 48178, 'batch_loss/train': 0.8307559610693716}
12/29/2021 07:21:37 - INFO - codeparrot_training - Step 48179: {'lr': 1.646106244074047e-06, 'samples': 24668160, 'steps': 48179, 'batch_loss/train': 0.6932832628954202}
12/29/2021 07:21:48 - INFO - codeparrot_training - Step 48180: {'lr': 1.644300809785726e-06, 'samples': 24668672, 'steps': 48180, 'batch_loss/train': 0.7858306504786015}
12/29/2021 07:21:59 - INFO - codeparrot_training - Step 48181: {'lr': 1.6424963628658418e-06, 'samples': 24669184, 'steps': 48181, 'batch_loss/train': 0.8019935907796025}
12/29/2021 07:22:11 - INFO - codeparrot_training - Step 48182: {'lr': 1.640692903321639e-06, 'samples': 24669696, 'steps': 48182, 'batch_loss/train': 0.7642236454412341}
12/29/2021 07:22:22 - INFO - codeparrot_training - Step 48183: {'lr': 1.6388904311602781e-06, 'samples': 24670208, 'steps': 48183, 'batch_loss/train': 0.7852072897367179}
12/29/2021 07:22:32 - INFO - codeparrot_training - Step 48184: {'lr': 1.6370889463888926e-06, 'samples': 24670720, 'steps': 48184, 'batch_loss/train': 0.7285976447165012}
12/29/2021 07:22:44 - INFO - codeparrot_training - Step 48185: {'lr': 1.6352884490146437e-06, 'samples': 24671232, 'steps': 48185, 'batch_loss/train': 0.629280521592591}
12/29/2021 07:22:55 - INFO - codeparrot_training - Step 48186: {'lr': 1.6334889390447472e-06, 'samples': 24671744, 'steps': 48186, 'batch_loss/train': 0.6870396775193512}
12/29/2021 07:23:06 - INFO - codeparrot_training - Step 48187: {'lr': 1.6316904164863089e-06, 'samples': 24672256, 'steps': 48187, 'batch_loss/train': 0.6488667484372854}
12/29/2021 07:23:16 - INFO - codeparrot_training - Step 48188: {'lr': 1.629892881346462e-06, 'samples': 24672768, 'steps': 48188, 'batch_loss/train': 0.7234355872496963}
12/29/2021 07:23:30 - INFO - codeparrot_training - Step 48189: {'lr': 1.628096333632395e-06, 'samples': 24673280, 'steps': 48189, 'batch_loss/train': 0.6827297499403358}
12/29/2021 07:23:41 - INFO - codeparrot_training - Step 48190: {'lr': 1.6263007733512414e-06, 'samples': 24673792, 'steps': 48190, 'batch_loss/train': 0.7489448417909443}
12/29/2021 07:23:51 - INFO - codeparrot_training - Step 48191: {'lr': 1.6245062005101342e-06, 'samples': 24674304, 'steps': 48191, 'batch_loss/train': 0.7726914072409272}
12/29/2021 07:24:03 - INFO - codeparrot_training - Step 48192: {'lr': 1.6227126151162342e-06, 'samples': 24674816, 'steps': 48192, 'batch_loss/train': 0.8047918062657118}
12/29/2021 07:24:14 - INFO - codeparrot_training - Step 48193: {'lr': 1.6209200171765914e-06, 'samples': 24675328, 'steps': 48193, 'batch_loss/train': 0.6873145885765553}
12/29/2021 07:24:25 - INFO - codeparrot_training - Step 48194: {'lr': 1.6191284066984225e-06, 'samples': 24675840, 'steps': 48194, 'batch_loss/train': 0.7272727412637323}
12/29/2021 07:24:38 - INFO - codeparrot_training - Step 48195: {'lr': 1.6173377836888048e-06, 'samples': 24676352, 'steps': 48195, 'batch_loss/train': 0.8587202739436179}
12/29/2021 07:24:48 - INFO - codeparrot_training - Step 48196: {'lr': 1.6155481481548717e-06, 'samples': 24676864, 'steps': 48196, 'batch_loss/train': 0.7729220651090145}
12/29/2021 07:24:59 - INFO - codeparrot_training - Step 48197: {'lr': 1.6137595001037287e-06, 'samples': 24677376, 'steps': 48197, 'batch_loss/train': 0.6919299056753516}
12/29/2021 07:25:10 - INFO - codeparrot_training - Step 48198: {'lr': 1.611971839542481e-06, 'samples': 24677888, 'steps': 48198, 'batch_loss/train': 0.7874929991085082}
12/29/2021 07:25:22 - INFO - codeparrot_training - Step 48199: {'lr': 1.610185166478262e-06, 'samples': 24678400, 'steps': 48199, 'batch_loss/train': 0.9447293234989047}
12/29/2021 07:25:32 - INFO - codeparrot_training - Step 48200: {'lr': 1.608399480918149e-06, 'samples': 24678912, 'steps': 48200, 'batch_loss/train': 0.7097507989965379}
12/29/2021 07:25:43 - INFO - codeparrot_training - Step 48201: {'lr': 1.606614782869248e-06, 'samples': 24679424, 'steps': 48201, 'batch_loss/train': 0.7684809304773808}
12/29/2021 07:25:57 - INFO - codeparrot_training - Step 48202: {'lr': 1.6048310723386638e-06, 'samples': 24679936, 'steps': 48202, 'batch_loss/train': 0.658043249277398}
12/29/2021 07:26:08 - INFO - codeparrot_training - Step 48203: {'lr': 1.6030483493335024e-06, 'samples': 24680448, 'steps': 48203, 'batch_loss/train': 0.9065038897097111}
12/29/2021 07:26:18 - INFO - codeparrot_training - Step 48204: {'lr': 1.6012666138607857e-06, 'samples': 24680960, 'steps': 48204, 'batch_loss/train': 0.6839203235576861}
12/29/2021 07:26:30 - INFO - codeparrot_training - Step 48205: {'lr': 1.5994858659276467e-06, 'samples': 24681472, 'steps': 48205, 'batch_loss/train': 0.8184057120233774}
12/29/2021 07:26:41 - INFO - codeparrot_training - Step 48206: {'lr': 1.5977061055411911e-06, 'samples': 24681984, 'steps': 48206, 'batch_loss/train': 0.7237437216099352}
12/29/2021 07:26:52 - INFO - codeparrot_training - Step 48207: {'lr': 1.595927332708441e-06, 'samples': 24682496, 'steps': 48207, 'batch_loss/train': 0.6192921584006399}
12/29/2021 07:27:02 - INFO - codeparrot_training - Step 48208: {'lr': 1.5941495474364741e-06, 'samples': 24683008, 'steps': 48208, 'batch_loss/train': 0.7777470396831632}
12/29/2021 07:27:16 - INFO - codeparrot_training - Step 48209: {'lr': 1.5923727497324236e-06, 'samples': 24683520, 'steps': 48209, 'batch_loss/train': 0.7746319742873311}
12/29/2021 07:27:27 - INFO - codeparrot_training - Step 48210: {'lr': 1.5905969396032838e-06, 'samples': 24684032, 'steps': 48210, 'batch_loss/train': 0.8070311993360519}
12/29/2021 07:27:37 - INFO - codeparrot_training - Step 48211: {'lr': 1.5888221170561045e-06, 'samples': 24684544, 'steps': 48211, 'batch_loss/train': 0.771657676436007}
12/29/2021 07:27:50 - INFO - codeparrot_training - Step 48212: {'lr': 1.5870482820979914e-06, 'samples': 24685056, 'steps': 48212, 'batch_loss/train': 0.7567898379638791}
12/29/2021 07:28:00 - INFO - codeparrot_training - Step 48213: {'lr': 1.5852754347359943e-06, 'samples': 24685568, 'steps': 48213, 'batch_loss/train': 0.5925871810177341}
12/29/2021 07:28:11 - INFO - codeparrot_training - Step 48214: {'lr': 1.5835035749771353e-06, 'samples': 24686080, 'steps': 48214, 'batch_loss/train': 0.7962235789746046}
12/29/2021 07:28:23 - INFO - codeparrot_training - Step 48215: {'lr': 1.5817327028284366e-06, 'samples': 24686592, 'steps': 48215, 'batch_loss/train': 0.7711011897772551}
12/29/2021 07:28:34 - INFO - codeparrot_training - Step 48216: {'lr': 1.5799628182970038e-06, 'samples': 24687104, 'steps': 48216, 'batch_loss/train': 0.7158457040786743}
12/29/2021 07:28:44 - INFO - codeparrot_training - Step 48217: {'lr': 1.578193921389831e-06, 'samples': 24687616, 'steps': 48217, 'batch_loss/train': 0.7201679656282067}
12/29/2021 07:28:55 - INFO - codeparrot_training - Step 48218: {'lr': 1.5764260121139685e-06, 'samples': 24688128, 'steps': 48218, 'batch_loss/train': 0.7700069062411785}
12/29/2021 07:29:09 - INFO - codeparrot_training - Step 48219: {'lr': 1.5746590904764103e-06, 'samples': 24688640, 'steps': 48219, 'batch_loss/train': 0.7412732918746769}
12/29/2021 07:29:19 - INFO - codeparrot_training - Step 48220: {'lr': 1.572893156484234e-06, 'samples': 24689152, 'steps': 48220, 'batch_loss/train': 0.7925430051982403}
12/29/2021 07:29:30 - INFO - codeparrot_training - Step 48221: {'lr': 1.571128210144407e-06, 'samples': 24689664, 'steps': 48221, 'batch_loss/train': 0.8810338666662574}
12/29/2021 07:29:42 - INFO - codeparrot_training - Step 48222: {'lr': 1.5693642514639784e-06, 'samples': 24690176, 'steps': 48222, 'batch_loss/train': 0.8060699691995978}
12/29/2021 07:29:53 - INFO - codeparrot_training - Step 48223: {'lr': 1.5676012804499706e-06, 'samples': 24690688, 'steps': 48223, 'batch_loss/train': 1.0453801434487104}
12/29/2021 07:30:04 - INFO - codeparrot_training - Step 48224: {'lr': 1.5658392971093505e-06, 'samples': 24691200, 'steps': 48224, 'batch_loss/train': 0.7890575733035803}
12/29/2021 07:30:18 - INFO - codeparrot_training - Step 48225: {'lr': 1.5640783014491677e-06, 'samples': 24691712, 'steps': 48225, 'batch_loss/train': 0.7574287499301136}
12/29/2021 07:30:28 - INFO - codeparrot_training - Step 48226: {'lr': 1.5623182934763891e-06, 'samples': 24692224, 'steps': 48226, 'batch_loss/train': 0.6642492287792265}
12/29/2021 07:30:39 - INFO - codeparrot_training - Step 48227: {'lr': 1.5605592731980366e-06, 'samples': 24692736, 'steps': 48227, 'batch_loss/train': 0.7483199732378125}
12/29/2021 07:30:50 - INFO - codeparrot_training - Step 48228: {'lr': 1.558801240621105e-06, 'samples': 24693248, 'steps': 48228, 'batch_loss/train': 0.7956437724642456}
12/29/2021 07:31:02 - INFO - codeparrot_training - Step 48229: {'lr': 1.5570441957525606e-06, 'samples': 24693760, 'steps': 48229, 'batch_loss/train': 0.7218323717825115}
12/29/2021 07:31:13 - INFO - codeparrot_training - Step 48230: {'lr': 1.55528813859937e-06, 'samples': 24694272, 'steps': 48230, 'batch_loss/train': 0.9711312246508896}
12/29/2021 07:31:23 - INFO - codeparrot_training - Step 48231: {'lr': 1.5535330691685833e-06, 'samples': 24694784, 'steps': 48231, 'batch_loss/train': 0.7441495088860393}
12/29/2021 07:31:35 - INFO - codeparrot_training - Step 48232: {'lr': 1.5517789874671395e-06, 'samples': 24695296, 'steps': 48232, 'batch_loss/train': 0.7370923552662134}
12/29/2021 07:31:46 - INFO - codeparrot_training - Step 48233: {'lr': 1.550025893502005e-06, 'samples': 24695808, 'steps': 48233, 'batch_loss/train': 0.7016873233951628}
12/29/2021 07:31:56 - INFO - codeparrot_training - Step 48234: {'lr': 1.5482737872801466e-06, 'samples': 24696320, 'steps': 48234, 'batch_loss/train': 0.7903619729913771}
12/29/2021 07:32:08 - INFO - codeparrot_training - Step 48235: {'lr': 1.546522668808531e-06, 'samples': 24696832, 'steps': 48235, 'batch_loss/train': 0.7791981835616753}
12/29/2021 07:32:19 - INFO - codeparrot_training - Step 48236: {'lr': 1.5447725380941525e-06, 'samples': 24697344, 'steps': 48236, 'batch_loss/train': 0.730864513432607}
12/29/2021 07:32:30 - INFO - codeparrot_training - Step 48237: {'lr': 1.5430233951439498e-06, 'samples': 24697856, 'steps': 48237, 'batch_loss/train': 0.7760451000649482}
12/29/2021 07:32:40 - INFO - codeparrot_training - Step 48238: {'lr': 1.5412752399648345e-06, 'samples': 24698368, 'steps': 48238, 'batch_loss/train': 0.7368810987100005}
12/29/2021 07:32:54 - INFO - codeparrot_training - Step 48239: {'lr': 1.5395280725638283e-06, 'samples': 24698880, 'steps': 48239, 'batch_loss/train': 0.9168851473368704}
12/29/2021 07:33:05 - INFO - codeparrot_training - Step 48240: {'lr': 1.5377818929478427e-06, 'samples': 24699392, 'steps': 48240, 'batch_loss/train': 0.6702624009922147}
12/29/2021 07:33:16 - INFO - codeparrot_training - Step 48241: {'lr': 1.5360367011237886e-06, 'samples': 24699904, 'steps': 48241, 'batch_loss/train': 0.8245811541564763}
12/29/2021 07:33:28 - INFO - codeparrot_training - Step 48242: {'lr': 1.5342924970986606e-06, 'samples': 24700416, 'steps': 48242, 'batch_loss/train': 0.8021484054625034}
12/29/2021 07:33:39 - INFO - codeparrot_training - Step 48243: {'lr': 1.5325492808793695e-06, 'samples': 24700928, 'steps': 48243, 'batch_loss/train': 0.8325053631269839}
12/29/2021 07:33:49 - INFO - codeparrot_training - Step 48244: {'lr': 1.5308070524728546e-06, 'samples': 24701440, 'steps': 48244, 'batch_loss/train': 0.8899757429026067}
12/29/2021 07:34:01 - INFO - codeparrot_training - Step 48245: {'lr': 1.5290658118860268e-06, 'samples': 24701952, 'steps': 48245, 'batch_loss/train': 0.7626663232222199}
12/29/2021 07:34:12 - INFO - codeparrot_training - Step 48246: {'lr': 1.5273255591257974e-06, 'samples': 24702464, 'steps': 48246, 'batch_loss/train': 0.753651668317616}
12/29/2021 07:34:23 - INFO - codeparrot_training - Step 48247: {'lr': 1.5255862941991327e-06, 'samples': 24702976, 'steps': 48247, 'batch_loss/train': 0.7901764786802232}
12/29/2021 07:34:33 - INFO - codeparrot_training - Step 48248: {'lr': 1.5238480171128889e-06, 'samples': 24703488, 'steps': 48248, 'batch_loss/train': 0.7014881114009768}
12/29/2021 07:34:47 - INFO - codeparrot_training - Step 48249: {'lr': 1.5221107278740042e-06, 'samples': 24704000, 'steps': 48249, 'batch_loss/train': 0.8044718410819769}
12/29/2021 07:34:58 - INFO - codeparrot_training - Step 48250: {'lr': 1.5203744264893903e-06, 'samples': 24704512, 'steps': 48250, 'batch_loss/train': 1.0524268511217088}
12/29/2021 07:35:08 - INFO - codeparrot_training - Step 48251: {'lr': 1.5186391129659582e-06, 'samples': 24705024, 'steps': 48251, 'batch_loss/train': 0.7124572098255157}
12/29/2021 07:35:20 - INFO - codeparrot_training - Step 48252: {'lr': 1.5169047873105635e-06, 'samples': 24705536, 'steps': 48252, 'batch_loss/train': 0.7833931567147374}
12/29/2021 07:35:31 - INFO - codeparrot_training - Step 48253: {'lr': 1.5151714495301727e-06, 'samples': 24706048, 'steps': 48253, 'batch_loss/train': 0.6667253852356225}
12/29/2021 07:35:42 - INFO - codeparrot_training - Step 48254: {'lr': 1.5134390996316139e-06, 'samples': 24706560, 'steps': 48254, 'batch_loss/train': 0.7051604609005153}
12/29/2021 07:35:56 - INFO - codeparrot_training - Step 48255: {'lr': 1.511707737621798e-06, 'samples': 24707072, 'steps': 48255, 'batch_loss/train': 0.7479234719648957}
12/29/2021 07:36:06 - INFO - codeparrot_training - Step 48256: {'lr': 1.5099773635076085e-06, 'samples': 24707584, 'steps': 48256, 'batch_loss/train': 0.6333778141997755}
12/29/2021 07:36:17 - INFO - codeparrot_training - Step 48257: {'lr': 1.5082479772959013e-06, 'samples': 24708096, 'steps': 48257, 'batch_loss/train': 0.7995695890858769}
12/29/2021 07:36:27 - INFO - codeparrot_training - Step 48258: {'lr': 1.5065195789936147e-06, 'samples': 24708608, 'steps': 48258, 'batch_loss/train': 0.7685348689556122}
12/29/2021 07:36:40 - INFO - codeparrot_training - Step 48259: {'lr': 1.5047921686075494e-06, 'samples': 24709120, 'steps': 48259, 'batch_loss/train': 0.7430287110619247}
12/29/2021 07:36:50 - INFO - codeparrot_training - Step 48260: {'lr': 1.5030657461446162e-06, 'samples': 24709632, 'steps': 48260, 'batch_loss/train': 0.6216779361711815}
12/29/2021 07:37:01 - INFO - codeparrot_training - Step 48261: {'lr': 1.501340311611643e-06, 'samples': 24710144, 'steps': 48261, 'batch_loss/train': 0.7372178230434656}
12/29/2021 07:37:14 - INFO - codeparrot_training - Step 48262: {'lr': 1.4996158650155412e-06, 'samples': 24710656, 'steps': 48262, 'batch_loss/train': 0.7693818537518382}
12/29/2021 07:37:24 - INFO - codeparrot_training - Step 48263: {'lr': 1.4978924063631106e-06, 'samples': 24711168, 'steps': 48263, 'batch_loss/train': 0.9723462555557489}
12/29/2021 07:37:35 - INFO - codeparrot_training - Step 48264: {'lr': 1.4961699356612624e-06, 'samples': 24711680, 'steps': 48264, 'batch_loss/train': 0.9777362337335944}
12/29/2021 07:37:49 - INFO - codeparrot_training - Step 48265: {'lr': 1.494448452916769e-06, 'samples': 24712192, 'steps': 48265, 'batch_loss/train': 0.7579189105890691}
12/29/2021 07:37:59 - INFO - codeparrot_training - Step 48266: {'lr': 1.4927279581365416e-06, 'samples': 24712704, 'steps': 48266, 'batch_loss/train': 0.7901882063597441}
12/29/2021 07:38:10 - INFO - codeparrot_training - Step 48267: {'lr': 1.4910084513273802e-06, 'samples': 24713216, 'steps': 48267, 'batch_loss/train': 0.7047493904829025}
12/29/2021 07:38:21 - INFO - codeparrot_training - Step 48268: {'lr': 1.4892899324961684e-06, 'samples': 24713728, 'steps': 48268, 'batch_loss/train': 0.7168518714606762}
12/29/2021 07:38:33 - INFO - codeparrot_training - Step 48269: {'lr': 1.4875724016496784e-06, 'samples': 24714240, 'steps': 48269, 'batch_loss/train': 0.7050167159177363}
12/29/2021 07:38:43 - INFO - codeparrot_training - Step 48270: {'lr': 1.4858558587947657e-06, 'samples': 24714752, 'steps': 48270,
'batch_loss/train': 0.6576751479879022} 12/29/2021 07:38:54 - INFO - codeparrot_training - Step 48271: {'lr': 1.4841403039382585e-06, 'samples': 24715264, 'steps': 48271, 'batch_loss/train': 0.7278319969773293} 12/29/2021 07:39:06 - INFO - codeparrot_training - Step 48272: {'lr': 1.4824257370869842e-06, 'samples': 24715776, 'steps': 48272, 'batch_loss/train': 0.7794435303658247} 12/29/2021 07:39:17 - INFO - codeparrot_training - Step 48273: {'lr': 1.4807121582477435e-06, 'samples': 24716288, 'steps': 48273, 'batch_loss/train': 0.7716294508427382} 12/29/2021 07:39:27 - INFO - codeparrot_training - Step 48274: {'lr': 1.478999567427336e-06, 'samples': 24716800, 'steps': 48274, 'batch_loss/train': 0.7118595764040947} 12/29/2021 07:39:39 - INFO - codeparrot_training - Step 48275: {'lr': 1.4772879646325898e-06, 'samples': 24717312, 'steps': 48275, 'batch_loss/train': 0.6764001687988639} 12/29/2021 07:39:50 - INFO - codeparrot_training - Step 48276: {'lr': 1.475577349870333e-06, 'samples': 24717824, 'steps': 48276, 'batch_loss/train': 0.7823693761602044} 12/29/2021 07:40:01 - INFO - codeparrot_training - Step 48277: {'lr': 1.4738677231473375e-06, 'samples': 24718336, 'steps': 48277, 'batch_loss/train': 0.7777043264359236} 12/29/2021 07:40:11 - INFO - codeparrot_training - Step 48278: {'lr': 1.4721590844704036e-06, 'samples': 24718848, 'steps': 48278, 'batch_loss/train': 0.6659440686926246} 12/29/2021 07:40:25 - INFO - codeparrot_training - Step 48279: {'lr': 1.4704514338463037e-06, 'samples': 24719360, 'steps': 48279, 'batch_loss/train': 0.703183067496866} 12/29/2021 07:40:36 - INFO - codeparrot_training - Step 48280: {'lr': 1.4687447712818658e-06, 'samples': 24719872, 'steps': 48280, 'batch_loss/train': 0.7566351585555822} 12/29/2021 07:40:46 - INFO - codeparrot_training - Step 48281: {'lr': 1.4670390967838621e-06, 'samples': 24720384, 'steps': 48281, 'batch_loss/train': 0.788023242726922} 12/29/2021 07:40:59 - INFO - codeparrot_training - Step 48282: {'lr': 
1.4653344103590372e-06, 'samples': 24720896, 'steps': 48282, 'batch_loss/train': 0.6608758866786957} 12/29/2021 07:41:09 - INFO - codeparrot_training - Step 48283: {'lr': 1.463630712014219e-06, 'samples': 24721408, 'steps': 48283, 'batch_loss/train': 0.7582284267991781} 12/29/2021 07:41:20 - INFO - codeparrot_training - Step 48284: {'lr': 1.4619280017561797e-06, 'samples': 24721920, 'steps': 48284, 'batch_loss/train': 0.8348948629572988} 12/29/2021 07:41:32 - INFO - codeparrot_training - Step 48285: {'lr': 1.4602262795916643e-06, 'samples': 24722432, 'steps': 48285, 'batch_loss/train': 0.6664507173700258} 12/29/2021 07:41:42 - INFO - codeparrot_training - Step 48286: {'lr': 1.458525545527417e-06, 'samples': 24722944, 'steps': 48286, 'batch_loss/train': 0.720650817733258} 12/29/2021 07:41:53 - INFO - codeparrot_training - Step 48287: {'lr': 1.456825799570266e-06, 'samples': 24723456, 'steps': 48287, 'batch_loss/train': 0.7933264216408134} 12/29/2021 07:42:04 - INFO - codeparrot_training - Step 48288: {'lr': 1.4551270417269002e-06, 'samples': 24723968, 'steps': 48288, 'batch_loss/train': 0.6814692134503275} 12/29/2021 07:42:17 - INFO - codeparrot_training - Step 48289: {'lr': 1.453429272004092e-06, 'samples': 24724480, 'steps': 48289, 'batch_loss/train': 0.8703194865956903} 12/29/2021 07:42:28 - INFO - codeparrot_training - Step 48290: {'lr': 1.4517324904086137e-06, 'samples': 24724992, 'steps': 48290, 'batch_loss/train': 0.7061907821334898} 12/29/2021 07:42:39 - INFO - codeparrot_training - Step 48291: {'lr': 1.45003669694721e-06, 'samples': 24725504, 'steps': 48291, 'batch_loss/train': 0.6793144010007381} 12/29/2021 07:42:51 - INFO - codeparrot_training - Step 48292: {'lr': 1.4483418916265977e-06, 'samples': 24726016, 'steps': 48292, 'batch_loss/train': 0.6976538700982928} 12/29/2021 07:43:02 - INFO - codeparrot_training - Step 48293: {'lr': 1.4466480744535492e-06, 'samples': 24726528, 'steps': 48293, 'batch_loss/train': 0.7629133488517255} 12/29/2021 07:43:12 - 
INFO - codeparrot_training - Step 48294: {'lr': 1.4449552454347537e-06, 'samples': 24727040, 'steps': 48294, 'batch_loss/train': 0.7626666137948632} 12/29/2021 07:43:26 - INFO - codeparrot_training - Step 48295: {'lr': 1.4432634045769832e-06, 'samples': 24727552, 'steps': 48295, 'batch_loss/train': 0.8231230525998399} 12/29/2021 07:43:37 - INFO - codeparrot_training - Step 48296: {'lr': 1.4415725518869548e-06, 'samples': 24728064, 'steps': 48296, 'batch_loss/train': 0.7420778272207826} 12/29/2021 07:43:47 - INFO - codeparrot_training - Step 48297: {'lr': 1.4398826873713578e-06, 'samples': 24728576, 'steps': 48297, 'batch_loss/train': 0.7854085825383663} 12/29/2021 07:43:58 - INFO - codeparrot_training - Step 48298: {'lr': 1.4381938110369363e-06, 'samples': 24729088, 'steps': 48298, 'batch_loss/train': 0.6781833155546337} 12/29/2021 07:44:11 - INFO - codeparrot_training - Step 48299: {'lr': 1.4365059228904076e-06, 'samples': 24729600, 'steps': 48299, 'batch_loss/train': 0.811085018562153} 12/29/2021 07:44:21 - INFO - codeparrot_training - Step 48300: {'lr': 1.4348190229384883e-06, 'samples': 24730112, 'steps': 48300, 'batch_loss/train': 0.6457379146013409} 12/29/2021 07:44:32 - INFO - codeparrot_training - Step 48301: {'lr': 1.4331331111878677e-06, 'samples': 24730624, 'steps': 48301, 'batch_loss/train': 1.1233734120614827} 12/29/2021 07:44:44 - INFO - codeparrot_training - Step 48302: {'lr': 1.4314481876452622e-06, 'samples': 24731136, 'steps': 48302, 'batch_loss/train': 0.7315197018906474} 12/29/2021 07:44:54 - INFO - codeparrot_training - Step 48303: {'lr': 1.4297642523173337e-06, 'samples': 24731648, 'steps': 48303, 'batch_loss/train': 0.6481932229362428} 12/29/2021 07:45:05 - INFO - codeparrot_training - Step 48304: {'lr': 1.4280813052108265e-06, 'samples': 24732160, 'steps': 48304, 'batch_loss/train': 0.7640979560092092} 12/29/2021 07:45:19 - INFO - codeparrot_training - Step 48305: {'lr': 1.4263993463324022e-06, 'samples': 24732672, 'steps': 48305, 
'batch_loss/train': 0.6905289152637124} 12/29/2021 07:45:30 - INFO - codeparrot_training - Step 48306: {'lr': 1.424718375688777e-06, 'samples': 24733184, 'steps': 48306, 'batch_loss/train': 0.6883647856302559} 12/29/2021 07:45:40 - INFO - codeparrot_training - Step 48307: {'lr': 1.4230383932865853e-06, 'samples': 24733696, 'steps': 48307, 'batch_loss/train': 0.8516200203448534} 12/29/2021 07:45:52 - INFO - codeparrot_training - Step 48308: {'lr': 1.4213593991325713e-06, 'samples': 24734208, 'steps': 48308, 'batch_loss/train': 0.5814020487014204} 12/29/2021 07:46:03 - INFO - codeparrot_training - Step 48309: {'lr': 1.4196813932333407e-06, 'samples': 24734720, 'steps': 48309, 'batch_loss/train': 0.6679227380082011} 12/29/2021 07:46:14 - INFO - codeparrot_training - Step 48310: {'lr': 1.4180043755956106e-06, 'samples': 24735232, 'steps': 48310, 'batch_loss/train': 0.7526637623086572} 12/29/2021 07:46:24 - INFO - codeparrot_training - Step 48311: {'lr': 1.4163283462260424e-06, 'samples': 24735744, 'steps': 48311, 'batch_loss/train': 0.7322018044069409} 12/29/2021 07:46:39 - INFO - codeparrot_training - Step 48312: {'lr': 1.4146533051312693e-06, 'samples': 24736256, 'steps': 48312, 'batch_loss/train': 0.7565625687129796} 12/29/2021 07:46:49 - INFO - codeparrot_training - Step 48313: {'lr': 1.4129792523179808e-06, 'samples': 24736768, 'steps': 48313, 'batch_loss/train': 0.7929803244769573} 12/29/2021 07:47:00 - INFO - codeparrot_training - Step 48314: {'lr': 1.4113061877928102e-06, 'samples': 24737280, 'steps': 48314, 'batch_loss/train': 0.7263390182051808} 12/29/2021 07:47:12 - INFO - codeparrot_training - Step 48315: {'lr': 1.4096341115624466e-06, 'samples': 24737792, 'steps': 48315, 'batch_loss/train': 0.6773296864703298} 12/29/2021 07:47:23 - INFO - codeparrot_training - Step 48316: {'lr': 1.407963023633496e-06, 'samples': 24738304, 'steps': 48316, 'batch_loss/train': 1.4831803699489683} 12/29/2021 07:47:33 - INFO - codeparrot_training - Step 48317: {'lr': 
1.4062929240126476e-06, 'samples': 24738816, 'steps': 48317, 'batch_loss/train': 0.7898713551112451} 12/29/2021 07:47:45 - INFO - codeparrot_training - Step 48318: {'lr': 1.404623812706479e-06, 'samples': 24739328, 'steps': 48318, 'batch_loss/train': 0.7682273415848613} 12/29/2021 07:47:56 - INFO - codeparrot_training - Step 48319: {'lr': 1.40295568972168e-06, 'samples': 24739840, 'steps': 48319, 'batch_loss/train': 0.7105168015696108} 12/29/2021 07:48:07 - INFO - codeparrot_training - Step 48320: {'lr': 1.4012885550648557e-06, 'samples': 24740352, 'steps': 48320, 'batch_loss/train': 0.686689589638263} 12/29/2021 07:48:17 - INFO - codeparrot_training - Step 48321: {'lr': 1.39962240874264e-06, 'samples': 24740864, 'steps': 48321, 'batch_loss/train': 0.7765085264109075} 12/29/2021 07:48:31 - INFO - codeparrot_training - Step 48322: {'lr': 1.3979572507616668e-06, 'samples': 24741376, 'steps': 48322, 'batch_loss/train': 0.7694974197074771} 12/29/2021 07:48:42 - INFO - codeparrot_training - Step 48323: {'lr': 1.3962930811285414e-06, 'samples': 24741888, 'steps': 48323, 'batch_loss/train': 0.813974445220083} 12/29/2021 07:48:53 - INFO - codeparrot_training - Step 48324: {'lr': 1.3946298998498698e-06, 'samples': 24742400, 'steps': 48324, 'batch_loss/train': 0.6812630547210574} 12/29/2021 07:49:05 - INFO - codeparrot_training - Step 48325: {'lr': 1.3929677069323133e-06, 'samples': 24742912, 'steps': 48325, 'batch_loss/train': 0.7260599206201732} 12/29/2021 07:49:15 - INFO - codeparrot_training - Step 48326: {'lr': 1.3913065023824222e-06, 'samples': 24743424, 'steps': 48326, 'batch_loss/train': 0.7108564113732427} 12/29/2021 07:49:26 - INFO - codeparrot_training - Step 48327: {'lr': 1.3896462862068304e-06, 'samples': 24743936, 'steps': 48327, 'batch_loss/train': 0.6855632485821843} 12/29/2021 07:49:40 - INFO - codeparrot_training - Step 48328: {'lr': 1.3879870584121435e-06, 'samples': 24744448, 'steps': 48328, 'batch_loss/train': 0.7783438442274928} 12/29/2021 07:49:51 - 
INFO - codeparrot_training - Step 48329: {'lr': 1.3863288190049395e-06, 'samples': 24744960, 'steps': 48329, 'batch_loss/train': 0.6785634178668261} 12/29/2021 07:50:01 - INFO - codeparrot_training - Step 48330: {'lr': 1.3846715679918243e-06, 'samples': 24745472, 'steps': 48330, 'batch_loss/train': 0.751599489711225} 12/29/2021 07:50:12 - INFO - codeparrot_training - Step 48331: {'lr': 1.3830153053793482e-06, 'samples': 24745984, 'steps': 48331, 'batch_loss/train': 0.681262903381139} 12/29/2021 07:50:24 - INFO - codeparrot_training - Step 48332: {'lr': 1.3813600311741726e-06, 'samples': 24746496, 'steps': 48332, 'batch_loss/train': 0.6270428323186934} 12/29/2021 07:50:35 - INFO - codeparrot_training - Step 48333: {'lr': 1.3797057453828198e-06, 'samples': 24747008, 'steps': 48333, 'batch_loss/train': 0.7414101869799197} 12/29/2021 07:50:46 - INFO - codeparrot_training - Step 48334: {'lr': 1.3780524480118683e-06, 'samples': 24747520, 'steps': 48334, 'batch_loss/train': 0.6077470146119595} 12/29/2021 07:50:58 - INFO - codeparrot_training - Step 48335: {'lr': 1.3764001390679236e-06, 'samples': 24748032, 'steps': 48335, 'batch_loss/train': 0.8227198915556073} 12/29/2021 07:51:08 - INFO - codeparrot_training - Step 48336: {'lr': 1.3747488185575364e-06, 'samples': 24748544, 'steps': 48336, 'batch_loss/train': 0.830544670112431} 12/29/2021 07:51:19 - INFO - codeparrot_training - Step 48337: {'lr': 1.3730984864872564e-06, 'samples': 24749056, 'steps': 48337, 'batch_loss/train': 0.8459116145968437} 12/29/2021 07:51:33 - INFO - codeparrot_training - Step 48338: {'lr': 1.3714491428636622e-06, 'samples': 24749568, 'steps': 48338, 'batch_loss/train': 0.7757006352767348} 12/29/2021 07:51:44 - INFO - codeparrot_training - Step 48339: {'lr': 1.3698007876933316e-06, 'samples': 24750080, 'steps': 48339, 'batch_loss/train': 0.731410413980484} 12/29/2021 07:51:54 - INFO - codeparrot_training - Step 48340: {'lr': 1.3681534209827595e-06, 'samples': 24750592, 'steps': 48340, 
'batch_loss/train': 0.755577951669693} 12/29/2021 07:52:05 - INFO - codeparrot_training - Step 48341: {'lr': 1.366507042738524e-06, 'samples': 24751104, 'steps': 48341, 'batch_loss/train': 0.682768399361521} 12/29/2021 07:52:17 - INFO - codeparrot_training - Step 48342: {'lr': 1.364861652967203e-06, 'samples': 24751616, 'steps': 48342, 'batch_loss/train': 0.7180885728448629} 12/29/2021 07:52:28 - INFO - codeparrot_training - Step 48343: {'lr': 1.3632172516753194e-06, 'samples': 24752128, 'steps': 48343, 'batch_loss/train': 0.6535550812259316} 12/29/2021 07:52:38 - INFO - codeparrot_training - Step 48344: {'lr': 1.3615738388693677e-06, 'samples': 24752640, 'steps': 48344, 'batch_loss/train': 0.5847929192241281} 12/29/2021 07:52:51 - INFO - codeparrot_training - Step 48345: {'lr': 1.3599314145559538e-06, 'samples': 24753152, 'steps': 48345, 'batch_loss/train': 0.8063998622819781} 12/29/2021 07:53:01 - INFO - codeparrot_training - Step 48346: {'lr': 1.3582899787415448e-06, 'samples': 24753664, 'steps': 48346, 'batch_loss/train': 0.7642436539754272} 12/29/2021 07:53:12 - INFO - codeparrot_training - Step 48347: {'lr': 1.356649531432691e-06, 'samples': 24754176, 'steps': 48347, 'batch_loss/train': 0.7952477266080678} 12/29/2021 07:53:24 - INFO - codeparrot_training - Step 48348: {'lr': 1.3550100726359427e-06, 'samples': 24754688, 'steps': 48348, 'batch_loss/train': 0.7984708901494741} 12/29/2021 07:53:35 - INFO - codeparrot_training - Step 48349: {'lr': 1.3533716023577392e-06, 'samples': 24755200, 'steps': 48349, 'batch_loss/train': 0.7034047534689307} 12/29/2021 07:53:45 - INFO - codeparrot_training - Step 48350: {'lr': 1.3517341206046862e-06, 'samples': 24755712, 'steps': 48350, 'batch_loss/train': 0.757475861813873} 12/29/2021 07:53:56 - INFO - codeparrot_training - Step 48351: {'lr': 1.350097627383251e-06, 'samples': 24756224, 'steps': 48351, 'batch_loss/train': 0.766131647862494} 12/29/2021 07:54:10 - INFO - codeparrot_training - Step 48352: {'lr': 
1.3484621226999005e-06, 'samples': 24756736, 'steps': 48352, 'batch_loss/train': 0.7951430999673903} 12/29/2021 07:54:20 - INFO - codeparrot_training - Step 48353: {'lr': 1.3468276065612128e-06, 'samples': 24757248, 'steps': 48353, 'batch_loss/train': 0.8504976984113455} 12/29/2021 07:54:31 - INFO - codeparrot_training - Step 48354: {'lr': 1.345194078973655e-06, 'samples': 24757760, 'steps': 48354, 'batch_loss/train': 0.7441824208945036} 12/29/2021 07:54:43 - INFO - codeparrot_training - Step 48355: {'lr': 1.3435615399437218e-06, 'samples': 24758272, 'steps': 48355, 'batch_loss/train': 0.7454209367278963} 12/29/2021 07:54:54 - INFO - codeparrot_training - Step 48356: {'lr': 1.3419299894778524e-06, 'samples': 24758784, 'steps': 48356, 'batch_loss/train': 0.7641761638224125} 12/29/2021 07:55:04 - INFO - codeparrot_training - Step 48357: {'lr': 1.3402994275826253e-06, 'samples': 24759296, 'steps': 48357, 'batch_loss/train': 0.6544999569887295} 12/29/2021 07:55:18 - INFO - codeparrot_training - Step 48358: {'lr': 1.3386698542644794e-06, 'samples': 24759808, 'steps': 48358, 'batch_loss/train': 0.789144825655967} 12/29/2021 07:55:29 - INFO - codeparrot_training - Step 48359: {'lr': 1.337041269529854e-06, 'samples': 24760320, 'steps': 48359, 'batch_loss/train': 0.7301239427179098} 12/29/2021 07:55:39 - INFO - codeparrot_training - Step 48360: {'lr': 1.3354136733852995e-06, 'samples': 24760832, 'steps': 48360, 'batch_loss/train': 0.7309691281989217} 12/29/2021 07:55:50 - INFO - codeparrot_training - Step 48361: {'lr': 1.3337870658372275e-06, 'samples': 24761344, 'steps': 48361, 'batch_loss/train': 0.6699147685430944} 12/29/2021 07:56:02 - INFO - codeparrot_training - Step 48362: {'lr': 1.3321614468921328e-06, 'samples': 24761856, 'steps': 48362, 'batch_loss/train': 0.8130527622997761} 12/29/2021 07:56:13 - INFO - codeparrot_training - Step 48363: {'lr': 1.3305368165564546e-06, 'samples': 24762368, 'steps': 48363, 'batch_loss/train': 0.6877323784865439} 12/29/2021 07:56:23 
- INFO - codeparrot_training - Step 48364: {'lr': 1.3289131748366877e-06, 'samples': 24762880, 'steps': 48364, 'batch_loss/train': 0.7447910718619823} 12/29/2021 07:56:35 - INFO - codeparrot_training - Step 48365: {'lr': 1.3272905217392717e-06, 'samples': 24763392, 'steps': 48365, 'batch_loss/train': 0.7695676125586033} 12/29/2021 07:56:46 - INFO - codeparrot_training - Step 48366: {'lr': 1.3256688572706455e-06, 'samples': 24763904, 'steps': 48366, 'batch_loss/train': 0.7750978979747742} 12/29/2021 07:56:57 - INFO - codeparrot_training - Step 48367: {'lr': 1.3240481814372485e-06, 'samples': 24764416, 'steps': 48367, 'batch_loss/train': 0.7117932117544115} 12/29/2021 07:57:11 - INFO - codeparrot_training - Step 48368: {'lr': 1.3224284942455477e-06, 'samples': 24764928, 'steps': 48368, 'batch_loss/train': 0.753887745551765} 12/29/2021 07:57:21 - INFO - codeparrot_training - Step 48369: {'lr': 1.3208097957019828e-06, 'samples': 24765440, 'steps': 48369, 'batch_loss/train': 0.697342571336776} 12/29/2021 07:57:32 - INFO - codeparrot_training - Step 48370: {'lr': 1.3191920858129646e-06, 'samples': 24765952, 'steps': 48370, 'batch_loss/train': 0.7225215816870332} 12/29/2021 07:57:44 - INFO - codeparrot_training - Step 48371: {'lr': 1.3175753645849608e-06, 'samples': 24766464, 'steps': 48371, 'batch_loss/train': 0.7664311295375228} 12/29/2021 07:57:55 - INFO - codeparrot_training - Step 48372: {'lr': 1.315959632024355e-06, 'samples': 24766976, 'steps': 48372, 'batch_loss/train': 0.640465781558305} 12/29/2021 07:58:05 - INFO - codeparrot_training - Step 48373: {'lr': 1.3143448881376141e-06, 'samples': 24767488, 'steps': 48373, 'batch_loss/train': 0.7907677702605724} 12/29/2021 07:58:16 - INFO - codeparrot_training - Step 48374: {'lr': 1.312731132931122e-06, 'samples': 24768000, 'steps': 48374, 'batch_loss/train': 0.7052297564223409} 12/29/2021 07:58:30 - INFO - codeparrot_training - Step 48375: {'lr': 1.3111183664113458e-06, 'samples': 24768512, 'steps': 48375, 
'batch_loss/train': 0.7069549718871713} 12/29/2021 07:58:41 - INFO - codeparrot_training - Step 48376: {'lr': 1.3095065885846136e-06, 'samples': 24769024, 'steps': 48376, 'batch_loss/train': 0.7069404271896929} 12/29/2021 07:58:51 - INFO - codeparrot_training - Step 48377: {'lr': 1.3078957994574204e-06, 'samples': 24769536, 'steps': 48377, 'batch_loss/train': 0.7767243904527277} 12/29/2021 07:59:03 - INFO - codeparrot_training - Step 48378: {'lr': 1.3062859990360942e-06, 'samples': 24770048, 'steps': 48378, 'batch_loss/train': 0.7403825595974922} 12/29/2021 07:59:14 - INFO - codeparrot_training - Step 48379: {'lr': 1.3046771873271024e-06, 'samples': 24770560, 'steps': 48379, 'batch_loss/train': 0.7367295935400762} 12/29/2021 07:59:25 - INFO - codeparrot_training - Step 48380: {'lr': 1.3030693643368007e-06, 'samples': 24771072, 'steps': 48380, 'batch_loss/train': 0.7203783085569739} 12/29/2021 07:59:37 - INFO - codeparrot_training - Step 48381: {'lr': 1.301462530071601e-06, 'samples': 24771584, 'steps': 48381, 'batch_loss/train': 0.6729747159406543} 12/29/2021 07:59:47 - INFO - codeparrot_training - Step 48382: {'lr': 1.299856684537859e-06, 'samples': 24772096, 'steps': 48382, 'batch_loss/train': 0.7170378995360807} 12/29/2021 07:59:58 - INFO - codeparrot_training - Step 48383: {'lr': 1.2982518277420142e-06, 'samples': 24772608, 'steps': 48383, 'batch_loss/train': 0.6754393093287945} 12/29/2021 08:00:09 - INFO - codeparrot_training - Step 48384: {'lr': 1.2966479596903945e-06, 'samples': 24773120, 'steps': 48384, 'batch_loss/train': 0.7748871613293886} 12/29/2021 08:00:23 - INFO - codeparrot_training - Step 48385: {'lr': 1.2950450803893844e-06, 'samples': 24773632, 'steps': 48385, 'batch_loss/train': 0.627560997963883} 12/29/2021 08:00:33 - INFO - codeparrot_training - Step 48386: {'lr': 1.2934431898453669e-06, 'samples': 24774144, 'steps': 48386, 'batch_loss/train': 0.7131344326771796} 12/29/2021 08:00:44 - INFO - codeparrot_training - Step 48387: {'lr': 
1.2918422880647262e-06, 'samples': 24774656, 'steps': 48387, 'batch_loss/train': 0.7049403298879042} 12/29/2021 08:00:56 - INFO - codeparrot_training - Step 48388: {'lr': 1.2902423750538184e-06, 'samples': 24775168, 'steps': 48388, 'batch_loss/train': 0.7661587987095118} 12/29/2021 08:01:07 - INFO - codeparrot_training - Step 48389: {'lr': 1.2886434508189715e-06, 'samples': 24775680, 'steps': 48389, 'batch_loss/train': 0.5923097315244377} 12/29/2021 08:01:18 - INFO - codeparrot_training - Step 48390: {'lr': 1.2870455153665973e-06, 'samples': 24776192, 'steps': 48390, 'batch_loss/train': 0.7105920561589301} 12/29/2021 08:01:30 - INFO - codeparrot_training - Step 48391: {'lr': 1.285448568702996e-06, 'samples': 24776704, 'steps': 48391, 'batch_loss/train': 0.7596692591905594} 12/29/2021 08:01:40 - INFO - codeparrot_training - Step 48392: {'lr': 1.2838526108345516e-06, 'samples': 24777216, 'steps': 48392, 'batch_loss/train': 0.6416232972405851} 12/29/2021 08:01:51 - INFO - codeparrot_training - Step 48393: {'lr': 1.2822576417675924e-06, 'samples': 24777728, 'steps': 48393, 'batch_loss/train': 0.7251927978359163} 12/29/2021 08:02:01 - INFO - codeparrot_training - Step 48394: {'lr': 1.2806636615084744e-06, 'samples': 24778240, 'steps': 48394, 'batch_loss/train': 0.8520309766754508} 12/29/2021 08:02:14 - INFO - codeparrot_training - Step 48395: {'lr': 1.2790706700635256e-06, 'samples': 24778752, 'steps': 48395, 'batch_loss/train': 0.4951871468219906} 12/29/2021 08:02:25 - INFO - codeparrot_training - Step 48396: {'lr': 1.2774786674390748e-06, 'samples': 24779264, 'steps': 48396, 'batch_loss/train': 0.7165587125346065} 12/29/2021 08:02:35 - INFO - codeparrot_training - Step 48397: {'lr': 1.27588765364145e-06, 'samples': 24779776, 'steps': 48397, 'batch_loss/train': 0.7043344527482986} 12/29/2021 08:02:49 - INFO - codeparrot_training - Step 48398: {'lr': 1.2742976286769791e-06, 'samples': 24780288, 'steps': 48398, 'batch_loss/train': 0.695586503483355} 12/29/2021 08:03:00 - 
INFO - codeparrot_training - Step 48399: {'lr': 1.272708592551991e-06, 'samples': 24780800, 'steps': 48399, 'batch_loss/train': 0.7104108795174398} 12/29/2021 08:03:10 - INFO - codeparrot_training - Step 48400: {'lr': 1.2711205452727858e-06, 'samples': 24781312, 'steps': 48400, 'batch_loss/train': 0.7604880351573229} 12/29/2021 08:03:22 - INFO - codeparrot_training - Step 48401: {'lr': 1.2695334868457198e-06, 'samples': 24781824, 'steps': 48401, 'batch_loss/train': 0.8538858015090227} 12/29/2021 08:03:33 - INFO - codeparrot_training - Step 48402: {'lr': 1.2679474172770377e-06, 'samples': 24782336, 'steps': 48402, 'batch_loss/train': 0.7371403761208057} 12/29/2021 08:03:44 - INFO - codeparrot_training - Step 48403: {'lr': 1.2663623365731237e-06, 'samples': 24782848, 'steps': 48403, 'batch_loss/train': 0.737374366261065} 12/29/2021 08:03:54 - INFO - codeparrot_training - Step 48404: {'lr': 1.2647782447401668e-06, 'samples': 24783360, 'steps': 48404, 'batch_loss/train': 0.7109576016664505} 12/29/2021 08:04:16 - INFO - codeparrot_training - Step 48405: {'lr': 1.2631951417845788e-06, 'samples': 24783872, 'steps': 48405, 'batch_loss/train': 0.7567721763625741} 12/29/2021 08:04:27 - INFO - codeparrot_training - Step 48406: {'lr': 1.2616130277126048e-06, 'samples': 24784384, 'steps': 48406, 'batch_loss/train': 0.7032041332568042} 12/29/2021 08:04:38 - INFO - codeparrot_training - Step 48407: {'lr': 1.2600319025305451e-06, 'samples': 24784896, 'steps': 48407, 'batch_loss/train': 0.7646683044731617} 12/29/2021 08:04:50 - INFO - codeparrot_training - Step 48408: {'lr': 1.2584517662446448e-06, 'samples': 24785408, 'steps': 48408, 'batch_loss/train': 0.6285332971892785} 12/29/2021 08:05:01 - INFO - codeparrot_training - Step 48409: {'lr': 1.2568726188612601e-06, 'samples': 24785920, 'steps': 48409, 'batch_loss/train': 0.770296195987612} 12/29/2021 08:05:11 - INFO - codeparrot_training - Step 48410: {'lr': 1.255294460386608e-06, 'samples': 24786432, 'steps': 48410, 
'batch_loss/train': 0.6892436040798202} 12/29/2021 08:05:24 - INFO - codeparrot_training - Step 48411: {'lr': 1.2537172908269612e-06, 'samples': 24786944, 'steps': 48411, 'batch_loss/train': 0.5546290490310639} 12/29/2021 08:05:34 - INFO - codeparrot_training - Step 48412: {'lr': 1.252141110188648e-06, 'samples': 24787456, 'steps': 48412, 'batch_loss/train': 0.6291042685043067} 12/29/2021 08:05:45 - INFO - codeparrot_training - Step 48413: {'lr': 1.2505659184778862e-06, 'samples': 24787968, 'steps': 48413, 'batch_loss/train': 0.7825110624544322} 12/29/2021 08:05:56 - INFO - codeparrot_training - Step 48414: {'lr': 1.2489917157009479e-06, 'samples': 24788480, 'steps': 48414, 'batch_loss/train': 0.7435341477394104} 12/29/2021 08:06:10 - INFO - codeparrot_training - Step 48415: {'lr': 1.2474185018640782e-06, 'samples': 24788992, 'steps': 48415, 'batch_loss/train': 0.793290154542774} 12/29/2021 08:06:20 - INFO - codeparrot_training - Step 48416: {'lr': 1.2458462769735502e-06, 'samples': 24789504, 'steps': 48416, 'batch_loss/train': 0.6262157299788669} 12/29/2021 08:06:31 - INFO - codeparrot_training - Step 48417: {'lr': 1.2442750410356362e-06, 'samples': 24790016, 'steps': 48417, 'batch_loss/train': 0.6821163957938552} 12/29/2021 08:06:43 - INFO - codeparrot_training - Step 48418: {'lr': 1.2427047940565539e-06, 'samples': 24790528, 'steps': 48418, 'batch_loss/train': 0.728728105314076} 12/29/2021 08:06:54 - INFO - codeparrot_training - Step 48419: {'lr': 1.2411355360425202e-06, 'samples': 24791040, 'steps': 48419, 'batch_loss/train': 0.7081649294123054} 12/29/2021 08:07:04 - INFO - codeparrot_training - Step 48420: {'lr': 1.239567266999836e-06, 'samples': 24791552, 'steps': 48420, 'batch_loss/train': 0.7705680699436925} 12/29/2021 08:07:17 - INFO - codeparrot_training - Step 48421: {'lr': 1.2379999869346903e-06, 'samples': 24792064, 'steps': 48421, 'batch_loss/train': 0.7733589340932667} 12/29/2021 08:07:27 - INFO - codeparrot_training - Step 48422: {'lr': 
1.2364336958533284e-06, 'samples': 24792576, 'steps': 48422, 'batch_loss/train': 0.6676266237627715} 12/29/2021 08:07:38 - INFO - codeparrot_training - Step 48423: {'lr': 1.2348683937619953e-06, 'samples': 24793088, 'steps': 48423, 'batch_loss/train': 0.7463565738871694} 12/29/2021 08:07:48 - INFO - codeparrot_training - Step 48424: {'lr': 1.2333040806668806e-06, 'samples': 24793600, 'steps': 48424, 'batch_loss/train': 0.6843833988532424} 12/29/2021 08:08:01 - INFO - codeparrot_training - Step 48425: {'lr': 1.2317407565742011e-06, 'samples': 24794112, 'steps': 48425, 'batch_loss/train': 0.7916124926414341} 12/29/2021 08:08:11 - INFO - codeparrot_training - Step 48426: {'lr': 1.2301784214902024e-06, 'samples': 24794624, 'steps': 48426, 'batch_loss/train': 0.757781570777297} 12/29/2021 08:08:22 - INFO - codeparrot_training - Step 48427: {'lr': 1.2286170754211011e-06, 'samples': 24795136, 'steps': 48427, 'batch_loss/train': 0.7498543113470078} 12/29/2021 08:08:35 - INFO - codeparrot_training - Step 48428: {'lr': 1.2270567183730596e-06, 'samples': 24795648, 'steps': 48428, 'batch_loss/train': 0.7049633641727269} 12/29/2021 08:08:46 - INFO - codeparrot_training - Step 48429: {'lr': 1.2254973503523226e-06, 'samples': 24796160, 'steps': 48429, 'batch_loss/train': 0.609237025026232} 12/29/2021 08:08:57 - INFO - codeparrot_training - Step 48430: {'lr': 1.2239389713650518e-06, 'samples': 24796672, 'steps': 48430, 'batch_loss/train': 0.769994453061372} 12/29/2021 08:09:09 - INFO - codeparrot_training - Step 48431: {'lr': 1.222381581417492e-06, 'samples': 24797184, 'steps': 48431, 'batch_loss/train': 0.7489937014179304} 12/29/2021 08:09:19 - INFO - codeparrot_training - Step 48432: {'lr': 1.2208251805158055e-06, 'samples': 24797696, 'steps': 48432, 'batch_loss/train': 0.7694673342630267} 12/29/2021 08:09:30 - INFO - codeparrot_training - Step 48433: {'lr': 1.2192697686661814e-06, 'samples': 24798208, 'steps': 48433, 'batch_loss/train': 0.8460509795695543} 12/29/2021 08:09:41 - 
INFO - codeparrot_training - Step 48434: {'lr': 1.2177153458747814e-06, 'samples': 24798720, 'steps': 48434, 'batch_loss/train': 0.6541902946773916} 12/29/2021 08:09:53 - INFO - codeparrot_training - Step 48435: {'lr': 1.2161619121478507e-06, 'samples': 24799232, 'steps': 48435, 'batch_loss/train': 0.742153542349115} 12/29/2021 08:10:03 - INFO - codeparrot_training - Step 48436: {'lr': 1.2146094674914954e-06, 'samples': 24799744, 'steps': 48436, 'batch_loss/train': 0.7135434029623866} 12/29/2021 08:10:14 - INFO - codeparrot_training - Step 48437: {'lr': 1.213058011911905e-06, 'samples': 24800256, 'steps': 48437, 'batch_loss/train': 0.8060080586001277} 12/29/2021 08:10:28 - INFO - codeparrot_training - Step 48438: {'lr': 1.211507545415269e-06, 'samples': 24800768, 'steps': 48438, 'batch_loss/train': 0.6649038188625127} 12/29/2021 08:10:38 - INFO - codeparrot_training - Step 48439: {'lr': 1.2099580680077493e-06, 'samples': 24801280, 'steps': 48439, 'batch_loss/train': 0.6762106278911233} 12/29/2021 08:10:49 - INFO - codeparrot_training - Step 48440: {'lr': 1.2084095796954798e-06, 'samples': 24801792, 'steps': 48440, 'batch_loss/train': 0.7631034078076482} 12/29/2021 08:11:02 - INFO - codeparrot_training - Step 48441: {'lr': 1.2068620804846497e-06, 'samples': 24802304, 'steps': 48441, 'batch_loss/train': 0.6662178400438279} 12/29/2021 08:11:12 - INFO - codeparrot_training - Step 48442: {'lr': 1.2053155703813934e-06, 'samples': 24802816, 'steps': 48442, 'batch_loss/train': 0.4122544068377465} 12/29/2021 08:11:23 - INFO - codeparrot_training - Step 48443: {'lr': 1.2037700493918448e-06, 'samples': 24803328, 'steps': 48443, 'batch_loss/train': 0.7314733038656414} 12/29/2021 08:11:33 - INFO - codeparrot_training - Step 48444: {'lr': 1.202225517522193e-06, 'samples': 24803840, 'steps': 48444, 'batch_loss/train': 0.7197836423292756} 12/29/2021 08:11:47 - INFO - codeparrot_training - Step 48445: {'lr': 1.200681974778517e-06, 'samples': 24804352, 'steps': 48445, 
'batch_loss/train': 0.6160486289300025} 12/29/2021 08:11:58 - INFO - codeparrot_training - Step 48446: {'lr': 1.199139421167006e-06, 'samples': 24804864, 'steps': 48446, 'batch_loss/train': 0.8360894564539194} 12/29/2021 08:12:08 - INFO - codeparrot_training - Step 48447: {'lr': 1.197597856693794e-06, 'samples': 24805376, 'steps': 48447, 'batch_loss/train': 0.715793059207499} 12/29/2021 08:12:21 - INFO - codeparrot_training - Step 48448: {'lr': 1.1960572813649594e-06, 'samples': 24805888, 'steps': 48448, 'batch_loss/train': 0.8647571867331862} 12/29/2021 08:12:31 - INFO - codeparrot_training - Step 48449: {'lr': 1.1945176951866643e-06, 'samples': 24806400, 'steps': 48449, 'batch_loss/train': 0.6995444912463427} 12/29/2021 08:12:42 - INFO - codeparrot_training - Step 48450: {'lr': 1.1929790981650146e-06, 'samples': 24806912, 'steps': 48450, 'batch_loss/train': 0.6992223288398236} 12/29/2021 08:12:54 - INFO - codeparrot_training - Step 48451: {'lr': 1.1914414903061443e-06, 'samples': 24807424, 'steps': 48451, 'batch_loss/train': 0.7445502057671547} 12/29/2021 08:13:05 - INFO - codeparrot_training - Step 48452: {'lr': 1.189904871616132e-06, 'samples': 24807936, 'steps': 48452, 'batch_loss/train': 0.7235536546795629} 12/29/2021 08:13:15 - INFO - codeparrot_training - Step 48453: {'lr': 1.1883692421011395e-06, 'samples': 24808448, 'steps': 48453, 'batch_loss/train': 0.7976557709043846} 12/29/2021 08:13:26 - INFO - codeparrot_training - Step 48454: {'lr': 1.1868346017672448e-06, 'samples': 24808960, 'steps': 48454, 'batch_loss/train': 0.8561710985377431} 12/29/2021 08:13:40 - INFO - codeparrot_training - Step 48455: {'lr': 1.185300950620527e-06, 'samples': 24809472, 'steps': 48455, 'batch_loss/train': 0.7468202573945746} 12/29/2021 08:13:50 - INFO - codeparrot_training - Step 48456: {'lr': 1.1837682886671197e-06, 'samples': 24809984, 'steps': 48456, 'batch_loss/train': 0.647159157320857} 12/29/2021 08:14:01 - INFO - codeparrot_training - Step 48457: {'lr': 
1.1822366159131016e-06, 'samples': 24810496, 'steps': 48457, 'batch_loss/train': 0.6220426277723163} 12/29/2021 08:14:13 - INFO - codeparrot_training - Step 48458: {'lr': 1.1807059323645508e-06, 'samples': 24811008, 'steps': 48458, 'batch_loss/train': 0.73942200478632} 12/29/2021 08:14:24 - INFO - codeparrot_training - Step 48459: {'lr': 1.179176238027574e-06, 'samples': 24811520, 'steps': 48459, 'batch_loss/train': 0.7641506958752871} 12/29/2021 08:14:34 - INFO - codeparrot_training - Step 48460: {'lr': 1.1776475329082216e-06, 'samples': 24812032, 'steps': 48460, 'batch_loss/train': 0.799106125254184} 12/29/2021 08:14:45 - INFO - codeparrot_training - Step 48461: {'lr': 1.1761198170125997e-06, 'samples': 24812544, 'steps': 48461, 'batch_loss/train': 0.7352613266557455} 12/29/2021 08:14:57 - INFO - codeparrot_training - Step 48462: {'lr': 1.1745930903467872e-06, 'samples': 24813056, 'steps': 48462, 'batch_loss/train': 0.8249935712665319} 12/29/2021 08:15:08 - INFO - codeparrot_training - Step 48463: {'lr': 1.1730673529168068e-06, 'samples': 24813568, 'steps': 48463, 'batch_loss/train': 0.7624987829476595} 12/29/2021 08:15:18 - INFO - codeparrot_training - Step 48464: {'lr': 1.1715426047287925e-06, 'samples': 24814080, 'steps': 48464, 'batch_loss/train': 0.7175484298495576} 12/29/2021 08:15:32 - INFO - codeparrot_training - Step 48465: {'lr': 1.1700188457887396e-06, 'samples': 24814592, 'steps': 48465, 'batch_loss/train': 0.7626031562685966} 12/29/2021 08:15:43 - INFO - codeparrot_training - Step 48466: {'lr': 1.168496076102754e-06, 'samples': 24815104, 'steps': 48466, 'batch_loss/train': 0.7666340665891767} 12/29/2021 08:15:54 - INFO - codeparrot_training - Step 48467: {'lr': 1.166974295676887e-06, 'samples': 24815616, 'steps': 48467, 'batch_loss/train': 0.6822058754041791} 12/29/2021 08:16:06 - INFO - codeparrot_training - Step 48468: {'lr': 1.165453504517161e-06, 'samples': 24816128, 'steps': 48468, 'batch_loss/train': 0.717208482674323} 12/29/2021 08:16:17 - 
INFO - codeparrot_training - Step 48469: {'lr': 1.163933702629627e-06, 'samples': 24816640, 'steps': 48469, 'batch_loss/train': 0.8012021207250655} 12/29/2021 08:16:27 - INFO - codeparrot_training - Step 48470: {'lr': 1.1624148900203634e-06, 'samples': 24817152, 'steps': 48470, 'batch_loss/train': 0.7049840996041894} 12/29/2021 08:16:39 - INFO - codeparrot_training - Step 48471: {'lr': 1.1608970666953377e-06, 'samples': 24817664, 'steps': 48471, 'batch_loss/train': 0.7208565541077405} 12/29/2021 08:16:50 - INFO - codeparrot_training - Step 48472: {'lr': 1.1593802326606562e-06, 'samples': 24818176, 'steps': 48472, 'batch_loss/train': 1.016653761267662} 12/29/2021 08:17:01 - INFO - codeparrot_training - Step 48473: {'lr': 1.157864387922314e-06, 'samples': 24818688, 'steps': 48473, 'batch_loss/train': 0.7890771538950503} 12/29/2021 08:17:11 - INFO - codeparrot_training - Step 48474: {'lr': 1.1563495324863339e-06, 'samples': 24819200, 'steps': 48474, 'batch_loss/train': 0.7398597132414579} 12/29/2021 08:17:26 - INFO - codeparrot_training - Step 48475: {'lr': 1.1548356663587667e-06, 'samples': 24819712, 'steps': 48475, 'batch_loss/train': 0.705831415951252} 12/29/2021 08:17:36 - INFO - codeparrot_training - Step 48476: {'lr': 1.1533227895456078e-06, 'samples': 24820224, 'steps': 48476, 'batch_loss/train': 0.5455376369063742} 12/29/2021 08:17:47 - INFO - codeparrot_training - Step 48477: {'lr': 1.15181090205288e-06, 'samples': 24820736, 'steps': 48477, 'batch_loss/train': 0.5533592522260733} 12/29/2021 08:18:00 - INFO - codeparrot_training - Step 48478: {'lr': 1.1503000038865785e-06, 'samples': 24821248, 'steps': 48478, 'batch_loss/train': 0.6899294182658195} 12/29/2021 08:18:11 - INFO - codeparrot_training - Step 48479: {'lr': 1.1487900950527264e-06, 'samples': 24821760, 'steps': 48479, 'batch_loss/train': 0.7070214605191723} 12/29/2021 08:18:21 - INFO - codeparrot_training - Step 48480: {'lr': 1.1472811755573464e-06, 'samples': 24822272, 'steps': 48480, 
'batch_loss/train': 0.720940881408751} 12/29/2021 08:18:32 - INFO - codeparrot_training - Step 48481: {'lr': 1.1457732454063786e-06, 'samples': 24822784, 'steps': 48481, 'batch_loss/train': 1.1828015614300966} 12/29/2021 08:18:44 - INFO - codeparrot_training - Step 48482: {'lr': 1.1442663046058733e-06, 'samples': 24823296, 'steps': 48482, 'batch_loss/train': 0.7021662839688361} 12/29/2021 08:18:55 - INFO - codeparrot_training - Step 48483: {'lr': 1.1427603531617703e-06, 'samples': 24823808, 'steps': 48483, 'batch_loss/train': 0.6296142798382789} 12/29/2021 08:19:05 - INFO - codeparrot_training - Step 48484: {'lr': 1.1412553910801205e-06, 'samples': 24824320, 'steps': 48484, 'batch_loss/train': 0.7536686769453809} 12/29/2021 08:19:19 - INFO - codeparrot_training - Step 48485: {'lr': 1.1397514183668633e-06, 'samples': 24824832, 'steps': 48485, 'batch_loss/train': 0.8047069273889065} 12/29/2021 08:19:30 - INFO - codeparrot_training - Step 48486: {'lr': 1.1382484350279942e-06, 'samples': 24825344, 'steps': 48486, 'batch_loss/train': 0.8874855795875192} 12/29/2021 08:19:40 - INFO - codeparrot_training - Step 48487: {'lr': 1.1367464410694805e-06, 'samples': 24825856, 'steps': 48487, 'batch_loss/train': 0.709813219960779} 12/29/2021 08:19:53 - INFO - codeparrot_training - Step 48488: {'lr': 1.1352454364972897e-06, 'samples': 24826368, 'steps': 48488, 'batch_loss/train': 0.5844313162378967} 12/29/2021 08:20:03 - INFO - codeparrot_training - Step 48489: {'lr': 1.1337454213173891e-06, 'samples': 24826880, 'steps': 48489, 'batch_loss/train': 0.7675598608329892} 12/29/2021 08:20:14 - INFO - codeparrot_training - Step 48490: {'lr': 1.1322463955357742e-06, 'samples': 24827392, 'steps': 48490, 'batch_loss/train': 0.7862128564156592} 12/29/2021 08:20:24 - INFO - codeparrot_training - Step 48491: {'lr': 1.1307483591583566e-06, 'samples': 24827904, 'steps': 48491, 'batch_loss/train': 0.6646546619012952} 12/29/2021 08:20:39 - INFO - codeparrot_training - Step 48492: {'lr': 
1.129251312191104e-06, 'samples': 24828416, 'steps': 48492, 'batch_loss/train': 0.7549090199172497} 12/29/2021 08:20:50 - INFO - codeparrot_training - Step 48493: {'lr': 1.1277552546400117e-06, 'samples': 24828928, 'steps': 48493, 'batch_loss/train': 0.594290318666026} 12/29/2021 08:21:00 - INFO - codeparrot_training - Step 48494: {'lr': 1.1262601865109634e-06, 'samples': 24829440, 'steps': 48494, 'batch_loss/train': 0.7371112979017198} 12/29/2021 08:21:12 - INFO - codeparrot_training - Step 48495: {'lr': 1.1247661078099547e-06, 'samples': 24829952, 'steps': 48495, 'batch_loss/train': 0.7855017129331827} 12/29/2021 08:21:23 - INFO - codeparrot_training - Step 48496: {'lr': 1.1232730185428974e-06, 'samples': 24830464, 'steps': 48496, 'batch_loss/train': 0.7808919893577695} 12/29/2021 08:21:33 - INFO - codeparrot_training - Step 48497: {'lr': 1.1217809187157312e-06, 'samples': 24830976, 'steps': 48497, 'batch_loss/train': 0.888947494328022} 12/29/2021 08:21:46 - INFO - codeparrot_training - Step 48498: {'lr': 1.1202898083343959e-06, 'samples': 24831488, 'steps': 48498, 'batch_loss/train': 0.6735588423907757} 12/29/2021 08:21:56 - INFO - codeparrot_training - Step 48499: {'lr': 1.1187996874048034e-06, 'samples': 24832000, 'steps': 48499, 'batch_loss/train': 0.7007964579388499} 12/29/2021 08:22:07 - INFO - codeparrot_training - Step 48500: {'lr': 1.1173105559328932e-06, 'samples': 24832512, 'steps': 48500, 'batch_loss/train': 0.7291326355189085} 12/29/2021 08:22:19 - INFO - codeparrot_training - Step 48501: {'lr': 1.1158224139246053e-06, 'samples': 24833024, 'steps': 48501, 'batch_loss/train': 0.7215812960639596} 12/29/2021 08:22:30 - INFO - codeparrot_training - Step 48502: {'lr': 1.114335261385796e-06, 'samples': 24833536, 'steps': 48502, 'batch_loss/train': 0.6646559634245932} 12/29/2021 08:22:40 - INFO - codeparrot_training - Step 48503: {'lr': 1.1128490983224326e-06, 'samples': 24834048, 'steps': 48503, 'batch_loss/train': 0.7805347959510982} 12/29/2021 08:22:51 - 
INFO - codeparrot_training - Step 48504: {'lr': 1.1113639247403994e-06, 'samples': 24834560, 'steps': 48504, 'batch_loss/train': 0.7698882340919226} 12/29/2021 08:23:05 - INFO - codeparrot_training - Step 48505: {'lr': 1.1098797406456362e-06, 'samples': 24835072, 'steps': 48505, 'batch_loss/train': 0.7785197957418859} 12/29/2021 08:23:16 - INFO - codeparrot_training - Step 48506: {'lr': 1.1083965460439716e-06, 'samples': 24835584, 'steps': 48506, 'batch_loss/train': 0.44907059567049146} 12/29/2021 08:23:26 - INFO - codeparrot_training - Step 48507: {'lr': 1.1069143409413728e-06, 'samples': 24836096, 'steps': 48507, 'batch_loss/train': 0.7110481918789446} 12/29/2021 08:23:39 - INFO - codeparrot_training - Step 48508: {'lr': 1.1054331253436966e-06, 'samples': 24836608, 'steps': 48508, 'batch_loss/train': 0.8214517384767532} 12/29/2021 08:23:50 - INFO - codeparrot_training - Step 48509: {'lr': 1.103952899256827e-06, 'samples': 24837120, 'steps': 48509, 'batch_loss/train': 0.6753914784640074} 12/29/2021 08:24:00 - INFO - codeparrot_training - Step 48510: {'lr': 1.102473662686676e-06, 'samples': 24837632, 'steps': 48510, 'batch_loss/train': 0.7568382853642106} 12/29/2021 08:24:12 - INFO - codeparrot_training - Step 48511: {'lr': 1.1009954156391e-06, 'samples': 24838144, 'steps': 48511, 'batch_loss/train': 0.6917667454108596} 12/29/2021 08:24:23 - INFO - codeparrot_training - Step 48512: {'lr': 1.099518158119983e-06, 'samples': 24838656, 'steps': 48512, 'batch_loss/train': 0.7764375600963831} 12/29/2021 08:24:34 - INFO - codeparrot_training - Step 48513: {'lr': 1.0980418901352096e-06, 'samples': 24839168, 'steps': 48513, 'batch_loss/train': 0.738814102485776} 12/29/2021 08:24:44 - INFO - codeparrot_training - Step 48514: {'lr': 1.0965666116906359e-06, 'samples': 24839680, 'steps': 48514, 'batch_loss/train': 0.7425660295411944} 12/29/2021 08:24:59 - INFO - codeparrot_training - Step 48515: {'lr': 1.0950923227921183e-06, 'samples': 24840192, 'steps': 48515, 
'batch_loss/train': 0.7043882557190955} 12/29/2021 08:25:09 - INFO - codeparrot_training - Step 48516: {'lr': 1.093619023445569e-06, 'samples': 24840704, 'steps': 48516, 'batch_loss/train': 0.7840326228761114} 12/29/2021 08:25:20 - INFO - codeparrot_training - Step 48517: {'lr': 1.0921467136567885e-06, 'samples': 24841216, 'steps': 48517, 'batch_loss/train': 0.6454784472007304} 12/29/2021 08:25:32 - INFO - codeparrot_training - Step 48518: {'lr': 1.0906753934316338e-06, 'samples': 24841728, 'steps': 48518, 'batch_loss/train': 0.7990711415186524} 12/29/2021 08:25:43 - INFO - codeparrot_training - Step 48519: {'lr': 1.0892050627759887e-06, 'samples': 24842240, 'steps': 48519, 'batch_loss/train': 0.7505232151597738} 12/29/2021 08:25:53 - INFO - codeparrot_training - Step 48520: {'lr': 1.087735721695654e-06, 'samples': 24842752, 'steps': 48520, 'batch_loss/train': 0.7840838967822492} 12/29/2021 08:26:07 - INFO - codeparrot_training - Step 48521: {'lr': 1.0862673701965143e-06, 'samples': 24843264, 'steps': 48521, 'batch_loss/train': 0.7368218968622386} 12/29/2021 08:26:18 - INFO - codeparrot_training - Step 48522: {'lr': 1.0848000082843979e-06, 'samples': 24843776, 'steps': 48522, 'batch_loss/train': 0.7472532042302191} 12/29/2021 08:26:29 - INFO - codeparrot_training - Step 48523: {'lr': 1.0833336359651057e-06, 'samples': 24844288, 'steps': 48523, 'batch_loss/train': 0.5902836271561682} 12/29/2021 08:26:39 - INFO - codeparrot_training - Step 48524: {'lr': 1.0818682532444945e-06, 'samples': 24844800, 'steps': 48524, 'batch_loss/train': 0.7609494049102068} 12/29/2021 08:26:52 - INFO - codeparrot_training - Step 48525: {'lr': 1.0804038601284205e-06, 'samples': 24845312, 'steps': 48525, 'batch_loss/train': 0.5195902225095779} 12/29/2021 08:27:02 - INFO - codeparrot_training - Step 48526: {'lr': 1.078940456622629e-06, 'samples': 24845824, 'steps': 48526, 'batch_loss/train': 0.6622261321172118} 12/29/2021 08:27:13 - INFO - codeparrot_training - Step 48527: {'lr': 
1.0774780427330043e-06, 'samples': 24846336, 'steps': 48527, 'batch_loss/train': 0.3589968211017549} 12/29/2021 08:27:25 - INFO - codeparrot_training - Step 48528: {'lr': 1.0760166184653475e-06, 'samples': 24846848, 'steps': 48528, 'batch_loss/train': 0.6986249163746834} 12/29/2021 08:27:36 - INFO - codeparrot_training - Step 48529: {'lr': 1.0745561838254593e-06, 'samples': 24847360, 'steps': 48529, 'batch_loss/train': 0.7857569847255945} 12/29/2021 08:27:46 - INFO - codeparrot_training - Step 48530: {'lr': 1.0730967388191126e-06, 'samples': 24847872, 'steps': 48530, 'batch_loss/train': 0.8061559586785734} 12/29/2021 08:27:58 - INFO - codeparrot_training - Step 48531: {'lr': 1.0716382834521643e-06, 'samples': 24848384, 'steps': 48531, 'batch_loss/train': 0.6204854459501803} 12/29/2021 08:28:09 - INFO - codeparrot_training - Step 48532: {'lr': 1.0701808177303873e-06, 'samples': 24848896, 'steps': 48532, 'batch_loss/train': 0.7048764135688543} 12/29/2021 08:28:20 - INFO - codeparrot_training - Step 48533: {'lr': 1.0687243416595826e-06, 'samples': 24849408, 'steps': 48533, 'batch_loss/train': 0.7576726591214538} 12/29/2021 08:28:30 - INFO - codeparrot_training - Step 48534: {'lr': 1.0672688552455234e-06, 'samples': 24849920, 'steps': 48534, 'batch_loss/train': 0.7152283226605505} 12/29/2021 08:28:44 - INFO - codeparrot_training - Step 48535: {'lr': 1.0658143584940383e-06, 'samples': 24850432, 'steps': 48535, 'batch_loss/train': 0.7231877110898495} 12/29/2021 08:28:55 - INFO - codeparrot_training - Step 48536: {'lr': 1.064360851410845e-06, 'samples': 24850944, 'steps': 48536, 'batch_loss/train': 0.8127067033201456} 12/29/2021 08:29:06 - INFO - codeparrot_training - Step 48537: {'lr': 1.0629083340017997e-06, 'samples': 24851456, 'steps': 48537, 'batch_loss/train': 0.7122430019080639} 12/29/2021 08:29:18 - INFO - codeparrot_training - Step 48538: {'lr': 1.0614568062725927e-06, 'samples': 24851968, 'steps': 48538, 'batch_loss/train': 0.6015415922738612} 12/29/2021 
08:29:28 - INFO - codeparrot_training - Step 48539: {'lr': 1.0600062682290802e-06, 'samples': 24852480, 'steps': 48539, 'batch_loss/train': 0.7557220421731472} 12/29/2021 08:29:39 - INFO - codeparrot_training - Step 48540: {'lr': 1.058556719876952e-06, 'samples': 24852992, 'steps': 48540, 'batch_loss/train': 0.724391839466989} 12/29/2021 08:29:51 - INFO - codeparrot_training - Step 48541: {'lr': 1.057108161222009e-06, 'samples': 24853504, 'steps': 48541, 'batch_loss/train': 0.7973825172521174} 12/29/2021 08:30:02 - INFO - codeparrot_training - Step 48542: {'lr': 1.0556605922700247e-06, 'samples': 24854016, 'steps': 48542, 'batch_loss/train': 0.6580570896621794} 12/29/2021 08:30:12 - INFO - codeparrot_training - Step 48543: {'lr': 1.0542140130267441e-06, 'samples': 24854528, 'steps': 48543, 'batch_loss/train': 0.789022873621434} 12/29/2021 08:30:23 - INFO - codeparrot_training - Step 48544: {'lr': 1.0527684234978851e-06, 'samples': 24855040, 'steps': 48544, 'batch_loss/train': 0.7833910943008959} 12/29/2021 08:30:37 - INFO - codeparrot_training - Step 48545: {'lr': 1.0513238236892209e-06, 'samples': 24855552, 'steps': 48545, 'batch_loss/train': 0.7277247223537415} 12/29/2021 08:30:48 - INFO - codeparrot_training - Step 48546: {'lr': 1.0498802136064966e-06, 'samples': 24856064, 'steps': 48546, 'batch_loss/train': 0.7217359342612326} 12/29/2021 08:30:58 - INFO - codeparrot_training - Step 48547: {'lr': 1.048437593255458e-06, 'samples': 24856576, 'steps': 48547, 'batch_loss/train': 0.8598299324512482} 12/29/2021 08:31:10 - INFO - codeparrot_training - Step 48548: {'lr': 1.0469959626418223e-06, 'samples': 24857088, 'steps': 48548, 'batch_loss/train': 0.8399775652214885} 12/29/2021 08:31:21 - INFO - codeparrot_training - Step 48549: {'lr': 1.0455553217713353e-06, 'samples': 24857600, 'steps': 48549, 'batch_loss/train': 0.7997096912004054} 12/29/2021 08:31:32 - INFO - codeparrot_training - Step 48550: {'lr': 1.0441156706497145e-06, 'samples': 24858112, 'steps': 48550, 
'batch_loss/train': 0.7729160971939564} 12/29/2021 08:31:46 - INFO - codeparrot_training - Step 48551: {'lr': 1.0426770092827054e-06, 'samples': 24858624, 'steps': 48551, 'batch_loss/train': 0.7204874765593559} 12/29/2021 08:31:56 - INFO - codeparrot_training - Step 48552: {'lr': 1.0412393376759977e-06, 'samples': 24859136, 'steps': 48552, 'batch_loss/train': 0.7225810483796522} 12/29/2021 08:32:07 - INFO - codeparrot_training - Step 48553: {'lr': 1.0398026558353091e-06, 'samples': 24859648, 'steps': 48553, 'batch_loss/train': 0.7902199737727642} 12/29/2021 08:32:19 - INFO - codeparrot_training - Step 48554: {'lr': 1.038366963766385e-06, 'samples': 24860160, 'steps': 48554, 'batch_loss/train': 0.7287180623970926} 12/29/2021 08:32:29 - INFO - codeparrot_training - Step 48555: {'lr': 1.0369322614748878e-06, 'samples': 24860672, 'steps': 48555, 'batch_loss/train': 0.735390470654238} 12/29/2021 08:32:40 - INFO - codeparrot_training - Step 48556: {'lr': 1.0354985489665348e-06, 'samples': 24861184, 'steps': 48556, 'batch_loss/train': 0.6952435970306396} 12/29/2021 08:32:51 - INFO - codeparrot_training - Step 48557: {'lr': 1.0340658262470436e-06, 'samples': 24861696, 'steps': 48557, 'batch_loss/train': 0.8061623140238225} 12/29/2021 08:33:03 - INFO - codeparrot_training - Step 48558: {'lr': 1.0326340933221323e-06, 'samples': 24862208, 'steps': 48558, 'batch_loss/train': 0.7469576857984066} 12/29/2021 08:33:14 - INFO - codeparrot_training - Step 48559: {'lr': 1.0312033501974072e-06, 'samples': 24862720, 'steps': 48559, 'batch_loss/train': 0.6433977561537176} 12/29/2021 08:33:24 - INFO - codeparrot_training - Step 48560: {'lr': 1.0297735968786415e-06, 'samples': 24863232, 'steps': 48560, 'batch_loss/train': 1.673827555263415} 12/29/2021 08:33:39 - INFO - codeparrot_training - Step 48561: {'lr': 1.0283448333714973e-06, 'samples': 24863744, 'steps': 48561, 'batch_loss/train': 0.7454966679215431} 12/29/2021 08:33:49 - INFO - codeparrot_training - Step 48562: {'lr': 
1.0269170596816369e-06, 'samples': 24864256, 'steps': 48562, 'batch_loss/train': 0.6009691627696157} 12/29/2021 08:34:00 - INFO - codeparrot_training - Step 48563: {'lr': 1.0254902758147222e-06, 'samples': 24864768, 'steps': 48563, 'batch_loss/train': 0.777558333822526} 12/29/2021 08:34:13 - INFO - codeparrot_training - Step 48564: {'lr': 1.024064481776471e-06, 'samples': 24865280, 'steps': 48564, 'batch_loss/train': 0.7053399726282805} 12/29/2021 08:34:23 - INFO - codeparrot_training - Step 48565: {'lr': 1.0226396775725456e-06, 'samples': 24865792, 'steps': 48565, 'batch_loss/train': 0.7795084761455655} 12/29/2021 08:34:34 - INFO - codeparrot_training - Step 48566: {'lr': 1.0212158632085799e-06, 'samples': 24866304, 'steps': 48566, 'batch_loss/train': 0.7720216276939027} 12/29/2021 08:34:45 - INFO - codeparrot_training - Step 48567: {'lr': 1.0197930386902365e-06, 'samples': 24866816, 'steps': 48567, 'batch_loss/train': 0.848394968546927} 12/29/2021 08:34:57 - INFO - codeparrot_training - Step 48568: {'lr': 1.0183712040232051e-06, 'samples': 24867328, 'steps': 48568, 'batch_loss/train': 0.7482132229488343} 12/29/2021 08:35:07 - INFO - codeparrot_training - Step 48569: {'lr': 1.0169503592130925e-06, 'samples': 24867840, 'steps': 48569, 'batch_loss/train': 0.7794571444392204} 12/29/2021 08:35:18 - INFO - codeparrot_training - Step 48570: {'lr': 1.0155305042655882e-06, 'samples': 24868352, 'steps': 48570, 'batch_loss/train': 0.7041214155033231} 12/29/2021 08:35:30 - INFO - codeparrot_training - Step 48571: {'lr': 1.014111639186327e-06, 'samples': 24868864, 'steps': 48571, 'batch_loss/train': 0.7346699258778244} 12/29/2021 08:35:41 - INFO - codeparrot_training - Step 48572: {'lr': 1.0126937639809431e-06, 'samples': 24869376, 'steps': 48572, 'batch_loss/train': 0.6000151622574776} 12/29/2021 08:35:51 - INFO - codeparrot_training - Step 48573: {'lr': 1.0112768786550709e-06, 'samples': 24869888, 'steps': 48573, 'batch_loss/train': 0.7170677995309234} 12/29/2021 08:36:05 - 
INFO - codeparrot_training - Step 48574: {'lr': 1.0098609832143445e-06, 'samples': 24870400, 'steps': 48574, 'batch_loss/train': 0.6193961646640673} 12/29/2021 08:36:16 - INFO - codeparrot_training - Step 48575: {'lr': 1.0084460776643988e-06, 'samples': 24870912, 'steps': 48575, 'batch_loss/train': 0.7896543345414102} 12/29/2021 08:36:27 - INFO - codeparrot_training - Step 48576: {'lr': 1.007032162010868e-06, 'samples': 24871424, 'steps': 48576, 'batch_loss/train': 0.7222659243270755} 12/29/2021 08:36:37 - INFO - codeparrot_training - Step 48577: {'lr': 1.0056192362593586e-06, 'samples': 24871936, 'steps': 48577, 'batch_loss/train': 0.679504944011569} 12/29/2021 08:36:50 - INFO - codeparrot_training - Step 48578: {'lr': 1.0042073004154494e-06, 'samples': 24872448, 'steps': 48578, 'batch_loss/train': 0.8789702898357064} 12/29/2021 08:37:01 - INFO - codeparrot_training - Step 48579: {'lr': 1.0027963544848307e-06, 'samples': 24872960, 'steps': 48579, 'batch_loss/train': 0.7026134198531508} 12/29/2021 08:37:11 - INFO - codeparrot_training - Step 48580: {'lr': 1.0013863984730809e-06, 'samples': 24873472, 'steps': 48580, 'batch_loss/train': 1.190974340774119} 12/29/2021 08:37:23 - INFO - codeparrot_training - Step 48581: {'lr': 9.999774323858068e-07, 'samples': 24873984, 'steps': 48581, 'batch_loss/train': 0.7712983712553978} 12/29/2021 08:37:34 - INFO - codeparrot_training - Step 48582: {'lr': 9.985694562285596e-07, 'samples': 24874496, 'steps': 48582, 'batch_loss/train': 0.8475746307522058} 12/29/2021 08:37:45 - INFO - codeparrot_training - Step 48583: {'lr': 9.971624700070293e-07, 'samples': 24875008, 'steps': 48583, 'batch_loss/train': 0.7984580653719604} 12/29/2021 08:37:59 - INFO - codeparrot_training - Step 48584: {'lr': 9.957564737267388e-07, 'samples': 24875520, 'steps': 48584, 'batch_loss/train': 0.7931341398507357} 12/29/2021 08:38:09 - INFO - codeparrot_training - Step 48585: {'lr': 9.943514673932952e-07, 'samples': 24876032, 'steps': 48585, 
'batch_loss/train': 0.6610731020336971} 12/29/2021 08:38:20 - INFO - codeparrot_training - Step 48586: {'lr': 9.92947451012305e-07, 'samples': 24876544, 'steps': 48586, 'batch_loss/train': 0.7232476896606386} 12/29/2021 08:38:31 - INFO - codeparrot_training - Step 48587: {'lr': 9.91544424589319e-07, 'samples': 24877056, 'steps': 48587, 'batch_loss/train': 0.8035904942080379} 12/29/2021 08:38:43 - INFO - codeparrot_training - Step 48588: {'lr': 9.901423881299443e-07, 'samples': 24877568, 'steps': 48588, 'batch_loss/train': 0.7841789927333593} 12/29/2021 08:38:53 - INFO - codeparrot_training - Step 48589: {'lr': 9.887413416397317e-07, 'samples': 24878080, 'steps': 48589, 'batch_loss/train': 0.7594958720728755} 12/29/2021 08:39:04 - INFO - codeparrot_training - Step 48590: {'lr': 9.873412851242602e-07, 'samples': 24878592, 'steps': 48590, 'batch_loss/train': 0.7685517151840031} 12/29/2021 08:39:18 - INFO - codeparrot_training - Step 48591: {'lr': 9.859422185891087e-07, 'samples': 24879104, 'steps': 48591, 'batch_loss/train': 0.7306290385313332} 12/29/2021 08:39:28 - INFO - codeparrot_training - Step 48592: {'lr': 9.845441420398282e-07, 'samples': 24879616, 'steps': 48592, 'batch_loss/train': 0.7059139925986528} 12/29/2021 08:39:39 - INFO - codeparrot_training - Step 48593: {'lr': 9.831470554819421e-07, 'samples': 24880128, 'steps': 48593, 'batch_loss/train': 0.6903827330097556} 12/29/2021 08:39:51 - INFO - codeparrot_training - Step 48594: {'lr': 9.817509589210571e-07, 'samples': 24880640, 'steps': 48594, 'batch_loss/train': 0.6801237622275949} 12/29/2021 08:40:02 - INFO - codeparrot_training - Step 48595: {'lr': 9.803558523627242e-07, 'samples': 24881152, 'steps': 48595, 'batch_loss/train': 0.744922504061833} 12/29/2021 08:40:13 - INFO - codeparrot_training - Step 48596: {'lr': 9.789617358124391e-07, 'samples': 24881664, 'steps': 48596, 'batch_loss/train': 0.7182741602882743} 12/29/2021 08:40:23 - INFO - codeparrot_training - Step 48597: {'lr': 9.775686092758085e-07, 
'samples': 24882176, 'steps': 48597, 'batch_loss/train': 0.7012136811390519} 12/29/2021 08:40:35 - INFO - codeparrot_training - Step 48598: {'lr': 9.761764727583e-07, 'samples': 24882688, 'steps': 48598, 'batch_loss/train': 0.8239004844799638} 12/29/2021 08:40:46 - INFO - codeparrot_training - Step 48599: {'lr': 9.747853262655204e-07, 'samples': 24883200, 'steps': 48599, 'batch_loss/train': 0.7102717068046331} 12/29/2021 08:40:57 - INFO - codeparrot_training - Step 48600: {'lr': 9.733951698029375e-07, 'samples': 24883712, 'steps': 48600, 'batch_loss/train': 0.7646710082190111} 12/29/2021 08:41:11 - INFO - codeparrot_training - Step 48601: {'lr': 9.720060033761303e-07, 'samples': 24884224, 'steps': 48601, 'batch_loss/train': 0.39342423586640507} 12/29/2021 08:41:21 - INFO - codeparrot_training - Step 48602: {'lr': 9.706178269905942e-07, 'samples': 24884736, 'steps': 48602, 'batch_loss/train': 0.6874048803001642} 12/29/2021 08:41:32 - INFO - codeparrot_training - Step 48603: {'lr': 9.692306406518526e-07, 'samples': 24885248, 'steps': 48603, 'batch_loss/train': 0.7679233672097325} 12/29/2021 08:41:44 - INFO - codeparrot_training - Step 48604: {'lr': 9.67844444365401e-07, 'samples': 24885760, 'steps': 48604, 'batch_loss/train': 0.7991809104569256} 12/29/2021 08:41:55 - INFO - codeparrot_training - Step 48605: {'lr': 9.664592381367909e-07, 'samples': 24886272, 'steps': 48605, 'batch_loss/train': 0.6421773261390626} 12/29/2021 08:42:05 - INFO - codeparrot_training - Step 48606: {'lr': 9.650750219714898e-07, 'samples': 24886784, 'steps': 48606, 'batch_loss/train': 0.8496884135529399} 12/29/2021 08:42:16 - INFO - codeparrot_training - Step 48607: {'lr': 9.636917958750212e-07, 'samples': 24887296, 'steps': 48607, 'batch_loss/train': 0.5822006111266091} 12/29/2021 08:42:28 - INFO - codeparrot_training - Step 48608: {'lr': 9.623095598528808e-07, 'samples': 24887808, 'steps': 48608, 'batch_loss/train': 0.6475925603881478} 12/29/2021 08:42:39 - INFO - codeparrot_training - Step 
48609: {'lr': 9.609283139105918e-07, 'samples': 24888320, 'steps': 48609, 'batch_loss/train': 0.8459054594859481} 12/29/2021 08:42:50 - INFO - codeparrot_training - Step 48610: {'lr': 9.595480580535942e-07, 'samples': 24888832, 'steps': 48610, 'batch_loss/train': 0.7388561828993261} 12/29/2021 08:43:02 - INFO - codeparrot_training - Step 48611: {'lr': 9.58168792287384e-07, 'samples': 24889344, 'steps': 48611, 'batch_loss/train': 0.6766986076254398} 12/29/2021 08:43:12 - INFO - codeparrot_training - Step 48612: {'lr': 9.56790516617484e-07, 'samples': 24889856, 'steps': 48612, 'batch_loss/train': 0.771481916308403} 12/29/2021 08:43:23 - INFO - codeparrot_training - Step 48613: {'lr': 9.554132310493624e-07, 'samples': 24890368, 'steps': 48613, 'batch_loss/train': 0.699134879745543} 12/29/2021 08:43:37 - INFO - codeparrot_training - Step 48614: {'lr': 9.540369355884593e-07, 'samples': 24890880, 'steps': 48614, 'batch_loss/train': 0.758044458925724} 12/29/2021 08:43:48 - INFO - codeparrot_training - Step 48615: {'lr': 9.526616302402702e-07, 'samples': 24891392, 'steps': 48615, 'batch_loss/train': 0.8641630262136459} 12/29/2021 08:43:58 - INFO - codeparrot_training - Step 48616: {'lr': 9.51287315010263e-07, 'samples': 24891904, 'steps': 48616, 'batch_loss/train': 0.6079866837244481} 12/29/2021 08:44:10 - INFO - codeparrot_training - Step 48617: {'lr': 9.499139899039055e-07, 'samples': 24892416, 'steps': 48617, 'batch_loss/train': 0.6295154807157815} 12/29/2021 08:44:21 - INFO - codeparrot_training - Step 48618: {'lr': 9.485416549266657e-07, 'samples': 24892928, 'steps': 48618, 'batch_loss/train': 0.6151126902550459} 12/29/2021 08:44:32 - INFO - codeparrot_training - Step 48619: {'lr': 9.471703100839557e-07, 'samples': 24893440, 'steps': 48619, 'batch_loss/train': 0.8443085364997387} 12/29/2021 08:44:42 - INFO - codeparrot_training - Step 48620: {'lr': 9.457999553812713e-07, 'samples': 24893952, 'steps': 48620, 'batch_loss/train': 0.7165454463101923} 12/29/2021 08:44:56 - 
INFO - codeparrot_training - Step 48621: {'lr': 9.444305908240524e-07, 'samples': 24894464, 'steps': 48621, 'batch_loss/train': 0.6608729930594563} 12/29/2021 08:45:07 - INFO - codeparrot_training - Step 48622: {'lr': 9.430622164177116e-07, 'samples': 24894976, 'steps': 48622, 'batch_loss/train': 0.7303130775690079} 12/29/2021 08:45:17 - INFO - codeparrot_training - Step 48623: {'lr': 9.416948321677443e-07, 'samples': 24895488, 'steps': 48623, 'batch_loss/train': 0.7437559422105551} 12/29/2021 08:45:29 - INFO - codeparrot_training - Step 48624: {'lr': 9.403284380795629e-07, 'samples': 24896000, 'steps': 48624, 'batch_loss/train': 0.7964381487108767} 12/29/2021 08:45:40 - INFO - codeparrot_training - Step 48625: {'lr': 9.389630341585798e-07, 'samples': 24896512, 'steps': 48625, 'batch_loss/train': 0.7016892237588763} 12/29/2021 08:45:51 - INFO - codeparrot_training - Step 48626: {'lr': 9.375986204102349e-07, 'samples': 24897024, 'steps': 48626, 'batch_loss/train': 0.7615249729715288} 12/29/2021 08:46:03 - INFO - codeparrot_training - Step 48627: {'lr': 9.362351968399685e-07, 'samples': 24897536, 'steps': 48627, 'batch_loss/train': 0.7217223192565143} 12/29/2021 08:46:14 - INFO - codeparrot_training - Step 48628: {'lr': 9.348727634531651e-07, 'samples': 24898048, 'steps': 48628, 'batch_loss/train': 0.7066185348667204} 12/29/2021 08:46:24 - INFO - codeparrot_training - Step 48629: {'lr': 9.335113202552925e-07, 'samples': 24898560, 'steps': 48629, 'batch_loss/train': 0.815540736541152} 12/29/2021 08:46:35 - INFO - codeparrot_training - Step 48630: {'lr': 9.321508672517076e-07, 'samples': 24899072, 'steps': 48630, 'batch_loss/train': 0.7352399383671582} 12/29/2021 08:46:49 - INFO - codeparrot_training - Step 48631: {'lr': 9.307914044478782e-07, 'samples': 24899584, 'steps': 48631, 'batch_loss/train': 0.5849880164896604} 12/29/2021 08:47:00 - INFO - codeparrot_training - Step 48632: {'lr': 9.294329318491612e-07, 'samples': 24900096, 'steps': 48632, 'batch_loss/train': 
0.620469794201199}
12/29/2021 08:47:10 - INFO - codeparrot_training - Step 48633: {'lr': 9.280754494609689e-07, 'samples': 24900608, 'steps': 48633, 'batch_loss/train': 0.6783293196931481}
12/29/2021 08:47:23 - INFO - codeparrot_training - Step 48634: {'lr': 9.267189572886859e-07, 'samples': 24901120, 'steps': 48634, 'batch_loss/train': 0.7780977771617472}
12/29/2021 08:47:33 - INFO - codeparrot_training - Step 48635: {'lr': 9.253634553377521e-07, 'samples': 24901632, 'steps': 48635, 'batch_loss/train': 0.9015526380389929}
12/29/2021 08:47:44 - INFO - codeparrot_training - Step 48636: {'lr': 9.240089436135246e-07, 'samples': 24902144, 'steps': 48636, 'batch_loss/train': 0.7086311001330614}
12/29/2021 08:47:55 - INFO - codeparrot_training - Step 48637: {'lr': 9.226554221213879e-07, 'samples': 24902656, 'steps': 48637, 'batch_loss/train': 0.7651178888045251}
12/29/2021 08:48:08 - INFO - codeparrot_training - Step 48638: {'lr': 9.213028908667264e-07, 'samples': 24903168, 'steps': 48638, 'batch_loss/train': 1.4465452815056778}
12/29/2021 08:48:19 - INFO - codeparrot_training - Step 48639: {'lr': 9.199513498548972e-07, 'samples': 24903680, 'steps': 48639, 'batch_loss/train': 0.9082850287668407}
12/29/2021 08:48:30 - INFO - codeparrot_training - Step 48640: {'lr': 9.186007990913126e-07, 'samples': 24904192, 'steps': 48640, 'batch_loss/train': 0.6887285998091102}
12/29/2021 08:48:42 - INFO - codeparrot_training - Step 48641: {'lr': 9.172512385813015e-07, 'samples': 24904704, 'steps': 48641, 'batch_loss/train': 0.67157664289698}
12/29/2021 08:48:52 - INFO - codeparrot_training - Step 48642: {'lr': 9.159026683302484e-07, 'samples': 24905216, 'steps': 48642, 'batch_loss/train': 0.7430353418458253}
12/29/2021 08:49:03 - INFO - codeparrot_training - Step 48643: {'lr': 9.145550883435383e-07, 'samples': 24905728, 'steps': 48643, 'batch_loss/train': 0.7442113072611392}
12/29/2021 08:49:15 - INFO - codeparrot_training - Step 48644: {'lr': 9.132084986264999e-07, 'samples': 24906240, 'steps': 48644, 'batch_loss/train': 0.7351208890322596}
12/29/2021 08:49:26 - INFO - codeparrot_training - Step 48645: {'lr': 9.118628991844901e-07, 'samples': 24906752, 'steps': 48645, 'batch_loss/train': 0.7596267983317375}
12/29/2021 08:49:36 - INFO - codeparrot_training - Step 48646: {'lr': 9.105182900228659e-07, 'samples': 24907264, 'steps': 48646, 'batch_loss/train': 0.7072691009379923}
12/29/2021 08:49:47 - INFO - codeparrot_training - Step 48647: {'lr': 9.091746711469562e-07, 'samples': 24907776, 'steps': 48647, 'batch_loss/train': 0.5967182889580727}
12/29/2021 08:49:59 - INFO - codeparrot_training - Step 48648: {'lr': 9.078320425621178e-07, 'samples': 24908288, 'steps': 48648, 'batch_loss/train': 0.7411892339587212}
12/29/2021 08:50:10 - INFO - codeparrot_training - Step 48649: {'lr': 9.064904042737076e-07, 'samples': 24908800, 'steps': 48649, 'batch_loss/train': 0.7552098957821727}
12/29/2021 08:50:21 - INFO - codeparrot_training - Step 48650: {'lr': 9.051497562869992e-07, 'samples': 24909312, 'steps': 48650, 'batch_loss/train': 0.6565972551470622}
12/29/2021 08:50:34 - INFO - codeparrot_training - Step 48651: {'lr': 9.038100986074049e-07, 'samples': 24909824, 'steps': 48651, 'batch_loss/train': 0.6927963839843869}
12/29/2021 08:50:45 - INFO - codeparrot_training - Step 48652: {'lr': 9.024714312401705e-07, 'samples': 24910336, 'steps': 48652, 'batch_loss/train': 0.695217679720372}
12/29/2021 08:50:56 - INFO - codeparrot_training - Step 48653: {'lr': 9.011337541906528e-07, 'samples': 24910848, 'steps': 48653, 'batch_loss/train': 0.6073239556280896}
12/29/2021 08:51:08 - INFO - codeparrot_training - Step 48654: {'lr': 8.997970674642087e-07, 'samples': 24911360, 'steps': 48654, 'batch_loss/train': 0.7599999764934182}
12/29/2021 08:51:19 - INFO - codeparrot_training - Step 48655: {'lr': 8.984613710660838e-07, 'samples': 24911872, 'steps': 48655, 'batch_loss/train': 0.7806160293985158}
12/29/2021 08:51:29 - INFO - codeparrot_training - Step 48656: {'lr': 8.971266650016075e-07, 'samples': 24912384, 'steps': 48656, 'batch_loss/train': 0.7418660968542099}
12/29/2021 08:51:41 - INFO - codeparrot_training - Step 48657: {'lr': 8.957929492761363e-07, 'samples': 24912896, 'steps': 48657, 'batch_loss/train': 0.8182906545698643}
12/29/2021 08:51:52 - INFO - codeparrot_training - Step 48658: {'lr': 8.944602238949162e-07, 'samples': 24913408, 'steps': 48658, 'batch_loss/train': 0.9327503703534603}
12/29/2021 08:52:03 - INFO - codeparrot_training - Step 48659: {'lr': 8.931284888632485e-07, 'samples': 24913920, 'steps': 48659, 'batch_loss/train': 0.7551333119627088}
12/29/2021 08:52:13 - INFO - codeparrot_training - Step 48660: {'lr': 8.917977441864622e-07, 'samples': 24914432, 'steps': 48660, 'batch_loss/train': 0.6779233273118734}
12/29/2021 08:52:27 - INFO - codeparrot_training - Step 48661: {'lr': 8.904679898698309e-07, 'samples': 24914944, 'steps': 48661, 'batch_loss/train': 0.708798878127709}
12/29/2021 08:52:38 - INFO - codeparrot_training - Step 48662: {'lr': 8.891392259186282e-07, 'samples': 24915456, 'steps': 48662, 'batch_loss/train': 0.7106085065752268}
12/29/2021 08:52:48 - INFO - codeparrot_training - Step 48663: {'lr': 8.878114523381553e-07, 'samples': 24915968, 'steps': 48663, 'batch_loss/train': 0.7127290097996593}
12/29/2021 08:53:00 - INFO - codeparrot_training - Step 48664: {'lr': 8.864846691336858e-07, 'samples': 24916480, 'steps': 48664, 'batch_loss/train': 0.5904231565073133}
12/29/2021 08:53:11 - INFO - codeparrot_training - Step 48665: {'lr': 8.851588763104657e-07, 'samples': 24916992, 'steps': 48665, 'batch_loss/train': 0.6505648037418723}
12/29/2021 08:53:22 - INFO - codeparrot_training - Step 48666: {'lr': 8.838340738738238e-07, 'samples': 24917504, 'steps': 48666, 'batch_loss/train': 0.7047189115546644}
12/29/2021 08:53:34 - INFO - codeparrot_training - Step 48667: {'lr': 8.825102618289782e-07, 'samples': 24918016, 'steps': 48667, 'batch_loss/train': 0.9177870582789183}
12/29/2021 08:53:45 - INFO - codeparrot_training - Step 48668: {'lr': 8.811874401812026e-07, 'samples': 24918528, 'steps': 48668, 'batch_loss/train': 0.6137466478394344}
12/29/2021 08:53:55 - INFO - codeparrot_training - Step 48669: {'lr': 8.798656089357703e-07, 'samples': 24919040, 'steps': 48669, 'batch_loss/train': 0.9498469214886427}
12/29/2021 08:54:06 - INFO - codeparrot_training - Step 48670: {'lr': 8.785447680979275e-07, 'samples': 24919552, 'steps': 48670, 'batch_loss/train': 0.8602556674741209}
12/29/2021 08:54:20 - INFO - codeparrot_training - Step 48671: {'lr': 8.772249176729197e-07, 'samples': 24920064, 'steps': 48671, 'batch_loss/train': 0.8354627210646868}
12/29/2021 08:54:31 - INFO - codeparrot_training - Step 48672: {'lr': 8.759060576660205e-07, 'samples': 24920576, 'steps': 48672, 'batch_loss/train': 0.7659877380356193}
12/29/2021 08:54:41 - INFO - codeparrot_training - Step 48673: {'lr': 8.74588188082448e-07, 'samples': 24921088, 'steps': 48673, 'batch_loss/train': 0.6812296148855239}
12/29/2021 08:54:54 - INFO - codeparrot_training - Step 48674: {'lr': 8.732713089274203e-07, 'samples': 24921600, 'steps': 48674, 'batch_loss/train': 0.7708143714116886}
12/29/2021 08:55:04 - INFO - codeparrot_training - Step 48675: {'lr': 8.719554202062385e-07, 'samples': 24922112, 'steps': 48675, 'batch_loss/train': 0.9913691172841936}
12/29/2021 08:55:15 - INFO - codeparrot_training - Step 48676: {'lr': 8.706405219240654e-07, 'samples': 24922624, 'steps': 48676, 'batch_loss/train': 0.8100067051127553}
12/29/2021 08:55:26 - INFO - codeparrot_training - Step 48677: {'lr': 8.693266140861744e-07, 'samples': 24923136, 'steps': 48677, 'batch_loss/train': 0.7215272719040513}
12/29/2021 08:55:40 - INFO - codeparrot_training - Step 48678: {'lr': 8.680136966977559e-07, 'samples': 24923648, 'steps': 48678, 'batch_loss/train': 0.5708432587562129}
12/29/2021 08:55:50 - INFO - codeparrot_training - Step 48679: {'lr': 8.667017697640555e-07, 'samples': 24924160, 'steps': 48679, 'batch_loss/train': 0.7539646066725254}
12/29/2021 08:56:01 - INFO - codeparrot_training - Step 48680: {'lr': 8.653908332902915e-07, 'samples': 24924672, 'steps': 48680, 'batch_loss/train': 0.7668366925790906}
12/29/2021 08:56:13 - INFO - codeparrot_training - Step 48681: {'lr': 8.640808872816542e-07, 'samples': 24925184, 'steps': 48681, 'batch_loss/train': 0.7495654402300715}
12/29/2021 08:56:24 - INFO - codeparrot_training - Step 48682: {'lr': 8.627719317433614e-07, 'samples': 24925696, 'steps': 48682, 'batch_loss/train': 0.6784750989172608}
12/29/2021 08:56:35 - INFO - codeparrot_training - Step 48683: {'lr': 8.614639666806034e-07, 'samples': 24926208, 'steps': 48683, 'batch_loss/train': 0.7564849406480789}
12/29/2021 08:56:47 - INFO - codeparrot_training - Step 48684: {'lr': 8.601569920986263e-07, 'samples': 24926720, 'steps': 48684, 'batch_loss/train': 1.1258407905697823}
12/29/2021 08:56:57 - INFO - codeparrot_training - Step 48685: {'lr': 8.588510080025647e-07, 'samples': 24927232, 'steps': 48685, 'batch_loss/train': 0.725710969301872}
12/29/2021 08:57:08 - INFO - codeparrot_training - Step 48686: {'lr': 8.575460143976365e-07, 'samples': 24927744, 'steps': 48686, 'batch_loss/train': 0.6409611033741385}
12/29/2021 08:57:19 - INFO - codeparrot_training - Step 48687: {'lr': 8.562420112890601e-07, 'samples': 24928256, 'steps': 48687, 'batch_loss/train': 0.7651011077687144}
12/29/2021 08:57:31 - INFO - codeparrot_training - Step 48688: {'lr': 8.5493899868197e-07, 'samples': 24928768, 'steps': 48688, 'batch_loss/train': 0.6332490706117824}
12/29/2021 08:57:42 - INFO - codeparrot_training - Step 48689: {'lr': 8.536369765815843e-07, 'samples': 24929280, 'steps': 48689, 'batch_loss/train': 0.7939290339127183}
12/29/2021 08:57:52 - INFO - codeparrot_training - Step 48690: {'lr': 8.52335944993038e-07, 'samples': 24929792, 'steps': 48690, 'batch_loss/train': 0.7993008978664875}
12/29/2021 08:58:06 - INFO - codeparrot_training - Step 48691: {'lr': 8.510359039215488e-07, 'samples': 24930304, 'steps': 48691, 'batch_loss/train': 0.6817129561677575}
12/29/2021 08:58:17 - INFO - codeparrot_training - Step 48692: {'lr': 8.497368533722794e-07, 'samples': 24930816, 'steps': 48692, 'batch_loss/train': 0.7624572515487671}
12/29/2021 08:58:28 - INFO - codeparrot_training - Step 48693: {'lr': 8.484387933503646e-07, 'samples': 24931328, 'steps': 48693, 'batch_loss/train': 0.7163457595743239}
12/29/2021 08:58:40 - INFO - codeparrot_training - Step 48694: {'lr': 8.471417238609669e-07, 'samples': 24931840, 'steps': 48694, 'batch_loss/train': 0.7789714364334941}
12/29/2021 08:58:50 - INFO - codeparrot_training - Step 48695: {'lr': 8.458456449092766e-07, 'samples': 24932352, 'steps': 48695, 'batch_loss/train': 0.7602567672729492}
12/29/2021 08:59:01 - INFO - codeparrot_training - Step 48696: {'lr': 8.445505565004286e-07, 'samples': 24932864, 'steps': 48696, 'batch_loss/train': 0.8034594338387251}
12/29/2021 08:59:14 - INFO - codeparrot_training - Step 48697: {'lr': 8.432564586395575e-07, 'samples': 24933376, 'steps': 48697, 'batch_loss/train': 0.7039846661500633}
12/29/2021 08:59:24 - INFO - codeparrot_training - Step 48698: {'lr': 8.419633513318259e-07, 'samples': 24933888, 'steps': 48698, 'batch_loss/train': 0.9419950628653169}
12/29/2021 08:59:35 - INFO - codeparrot_training - Step 48699: {'lr': 8.406712345823686e-07, 'samples': 24934400, 'steps': 48699, 'batch_loss/train': 0.9449430089443922}
12/29/2021 08:59:45 - INFO - codeparrot_training - Step 48700: {'lr': 8.393801083963204e-07, 'samples': 24934912, 'steps': 48700, 'batch_loss/train': 0.8113251309841871}
12/29/2021 08:59:59 - INFO - codeparrot_training - Step 48701: {'lr': 8.380899727788438e-07, 'samples': 24935424, 'steps': 48701, 'batch_loss/train': 0.7458582513500005}
12/29/2021 09:00:10 - INFO - codeparrot_training - Step 48702: {'lr': 8.36800827735018e-07, 'samples': 24935936, 'steps': 48702, 'batch_loss/train': 0.7325570799293928}
12/29/2021 09:00:21 - INFO - codeparrot_training - Step 48703: {'lr': 8.355126732700058e-07, 'samples': 24936448, 'steps': 48703, 'batch_loss/train': 0.7059121299535036}
12/29/2021 09:00:33 - INFO - codeparrot_training - Step 48704: {'lr': 8.342255093888862e-07, 'samples': 24936960, 'steps': 48704, 'batch_loss/train': 0.6875384841114283}
12/29/2021 09:00:43 - INFO - codeparrot_training - Step 48705: {'lr': 8.329393360968219e-07, 'samples': 24937472, 'steps': 48705, 'batch_loss/train': 0.7907391637563705}
12/29/2021 09:00:54 - INFO - codeparrot_training - Step 48706: {'lr': 8.316541533989197e-07, 'samples': 24937984, 'steps': 48706, 'batch_loss/train': 0.6102371462620795}
12/29/2021 09:01:08 - INFO - codeparrot_training - Step 48707: {'lr': 8.303699613002591e-07, 'samples': 24938496, 'steps': 48707, 'batch_loss/train': 0.7138471463695168}
12/29/2021 09:01:19 - INFO - codeparrot_training - Step 48708: {'lr': 8.290867598059748e-07, 'samples': 24939008, 'steps': 48708, 'batch_loss/train': 0.7853964185342193}
12/29/2021 09:01:30 - INFO - codeparrot_training - Step 48709: {'lr': 8.278045489211461e-07, 'samples': 24939520, 'steps': 48709, 'batch_loss/train': 0.7601233671884984}
12/29/2021 09:01:42 - INFO - codeparrot_training - Step 48710: {'lr': 8.265233286509077e-07, 'samples': 24940032, 'steps': 48710, 'batch_loss/train': 0.8032672051340342}
12/29/2021 09:01:53 - INFO - codeparrot_training - Step 48711: {'lr': 8.252430990003113e-07, 'samples': 24940544, 'steps': 48711, 'batch_loss/train': 0.8036401253193617}
12/29/2021 09:02:03 - INFO - codeparrot_training - Step 48712: {'lr': 8.239638599744637e-07, 'samples': 24941056, 'steps': 48712, 'batch_loss/train': 0.7205640412867069}
12/29/2021 09:02:14 - INFO - codeparrot_training - Step 48713: {'lr': 8.226856115784443e-07, 'samples': 24941568, 'steps': 48713, 'batch_loss/train': 0.7711657835170627}
12/29/2021 09:02:27 - INFO - codeparrot_training - Step 48714: {'lr': 8.2140835381736e-07, 'samples': 24942080, 'steps': 48714, 'batch_loss/train': 0.6729691782966256}
12/29/2021 09:02:37 - INFO - codeparrot_training - Step 48715: {'lr': 8.201320866962347e-07, 'samples': 24942592, 'steps': 48715, 'batch_loss/train': 0.7192195560783148}
12/29/2021 09:02:48 - INFO - codeparrot_training - Step 48716: {'lr': 8.188568102202309e-07, 'samples': 24943104, 'steps': 48716, 'batch_loss/train': 0.5559419498313218}
12/29/2021 09:03:02 - INFO - codeparrot_training - Step 48717: {'lr': 8.175825243943169e-07, 'samples': 24943616, 'steps': 48717, 'batch_loss/train': 1.2972456408897415}
12/29/2021 09:03:12 - INFO - codeparrot_training - Step 48718: {'lr': 8.163092292236273e-07, 'samples': 24944128, 'steps': 48718, 'batch_loss/train': 0.7632480678148568}
12/29/2021 09:03:23 - INFO - codeparrot_training - Step 48719: {'lr': 8.150369247132138e-07, 'samples': 24944640, 'steps': 48719, 'batch_loss/train': 0.7198485806584358}
12/29/2021 09:03:34 - INFO - codeparrot_training - Step 48720: {'lr': 8.137656108681279e-07, 'samples': 24945152, 'steps': 48720, 'batch_loss/train': 0.7110892413184047}
12/29/2021 09:03:46 - INFO - codeparrot_training - Step 48721: {'lr': 8.124952876933934e-07, 'samples': 24945664, 'steps': 48721, 'batch_loss/train': 0.7884154571220279}
12/29/2021 09:03:56 - INFO - codeparrot_training - Step 48722: {'lr': 8.112259551940892e-07, 'samples': 24946176, 'steps': 48722, 'batch_loss/train': 0.7110206210054457}
12/29/2021 09:04:07 - INFO - codeparrot_training - Step 48723: {'lr': 8.09957613375295e-07, 'samples': 24946688, 'steps': 48723, 'batch_loss/train': 0.7789900209754705}
12/29/2021 09:04:21 - INFO - codeparrot_training - Step 48724: {'lr': 8.086902622419789e-07, 'samples': 24947200, 'steps': 48724, 'batch_loss/train': 0.6328423414379358}
12/29/2021 09:04:31 - INFO - codeparrot_training - Step 48725: {'lr': 8.07423901799248e-07, 'samples': 24947712, 'steps': 48725, 'batch_loss/train': 0.7608653670176864}
12/29/2021 09:04:42 - INFO - codeparrot_training - Step 48726: {'lr': 8.061585320520981e-07, 'samples': 24948224, 'steps': 48726, 'batch_loss/train': 0.7599989781156182}
12/29/2021 09:04:55 - INFO - codeparrot_training - Step 48727: {'lr': 8.048941530055809e-07, 'samples': 24948736, 'steps': 48727, 'batch_loss/train': 0.6184286626521498}
12/29/2021 09:05:05 - INFO - codeparrot_training - Step 48728: {'lr': 8.036307646646923e-07, 'samples': 24949248, 'steps': 48728, 'batch_loss/train': 0.5588716887868941}
12/29/2021 09:05:16 - INFO - codeparrot_training - Step 48729: {'lr': 8.023683670344839e-07, 'samples': 24949760, 'steps': 48729, 'batch_loss/train': 0.725596786942333}
12/29/2021 09:05:28 - INFO - codeparrot_training - Step 48730: {'lr': 8.011069601199795e-07, 'samples': 24950272, 'steps': 48730, 'batch_loss/train': 0.7602634934009984}
12/29/2021 09:05:39 - INFO - codeparrot_training - Step 48731: {'lr': 7.998465439261471e-07, 'samples': 24950784, 'steps': 48731, 'batch_loss/train': 0.7681636083871126}
12/29/2021 09:05:50 - INFO - codeparrot_training - Step 48732: {'lr': 7.985871184580662e-07, 'samples': 24951296, 'steps': 48732, 'batch_loss/train': 1.253639408154413}
12/29/2021 09:06:00 - INFO - codeparrot_training - Step 48733: {'lr': 7.973286837206773e-07, 'samples': 24951808, 'steps': 48733, 'batch_loss/train': 0.7974913055077195}
12/29/2021 09:06:13 - INFO - codeparrot_training - Step 48734: {'lr': 7.960712397190317e-07, 'samples': 24952320, 'steps': 48734, 'batch_loss/train': 0.793164144270122}
12/29/2021 09:06:23 - INFO - codeparrot_training - Step 48735: {'lr': 7.948147864580979e-07, 'samples': 24952832, 'steps': 48735, 'batch_loss/train': 0.7302760076709092}
12/29/2021 09:06:34 - INFO - codeparrot_training - Step 48736: {'lr': 7.935593239428996e-07, 'samples': 24953344, 'steps': 48736, 'batch_loss/train': 0.644252966158092}
12/29/2021 09:06:48 - INFO - codeparrot_training - Step 48737: {'lr': 7.923048521784049e-07, 'samples': 24953856, 'steps': 48737, 'batch_loss/train': 0.7047855663113296}
12/29/2021 09:06:59 - INFO - codeparrot_training - Step 48738: {'lr': 7.910513711696099e-07, 'samples': 24954368, 'steps': 48738, 'batch_loss/train': 0.6460906891152263}
12/29/2021 09:07:09 - INFO - codeparrot_training - Step 48739: {'lr': 7.897988809215107e-07, 'samples': 24954880, 'steps': 48739, 'batch_loss/train': 0.6344204163178802}
12/29/2021 09:07:20 - INFO - codeparrot_training - Step 48740: {'lr': 7.885473814390475e-07, 'samples': 24955392, 'steps': 48740, 'batch_loss/train': 0.6919299517758191}
12/29/2021 09:07:32 - INFO - codeparrot_training - Step 48741: {'lr': 7.872968727272444e-07, 'samples': 24955904, 'steps': 48741, 'batch_loss/train': 0.8086422244086862}
12/29/2021 09:07:43 - INFO - codeparrot_training - Step 48742: {'lr': 7.860473547910419e-07, 'samples': 24956416, 'steps': 48742, 'batch_loss/train': 0.7309944604057819}
12/29/2021 09:07:53 - INFO - codeparrot_training - Step 48743: {'lr': 7.847988276354079e-07, 'samples': 24956928, 'steps': 48743, 'batch_loss/train': 0.6576087202411145}
12/29/2021 09:08:08 - INFO - codeparrot_training - Step 48744: {'lr': 7.835512912653109e-07, 'samples': 24957440, 'steps': 48744, 'batch_loss/train': 0.7123788883909583}
12/29/2021 09:08:18 - INFO - codeparrot_training - Step 48745: {'lr': 7.823047456857468e-07, 'samples': 24957952, 'steps': 48745, 'batch_loss/train': 1.5034438529110048}
12/29/2021 09:08:29 - INFO - codeparrot_training - Step 48746: {'lr': 7.810591909016007e-07, 'samples': 24958464, 'steps': 48746, 'batch_loss/train': 0.7209292566403747}
12/29/2021 09:08:41 - INFO - codeparrot_training - Step 48747: {'lr': 7.798146269178686e-07, 'samples': 24958976, 'steps': 48747, 'batch_loss/train': 1.5469704447314143}
12/29/2021 09:08:52 - INFO - codeparrot_training - Step 48748: {'lr': 7.785710537394908e-07, 'samples': 24959488, 'steps': 48748, 'batch_loss/train': 0.7673013550229371}
12/29/2021 09:09:02 - INFO - codeparrot_training - Step 48749: {'lr': 7.773284713714357e-07, 'samples': 24960000, 'steps': 48749, 'batch_loss/train': 0.679816777817905}
12/29/2021 09:09:13 - INFO - codeparrot_training - Step 48750: {'lr': 7.760868798185883e-07, 'samples': 24960512, 'steps': 48750, 'batch_loss/train': 0.6307709007523954}
12/29/2021 09:09:25 - INFO - codeparrot_training - Step 48751: {'lr': 7.748462790859446e-07, 'samples': 24961024, 'steps': 48751, 'batch_loss/train': 0.7183348340913653}
12/29/2021 09:09:36 - INFO - codeparrot_training - Step 48752: {'lr': 7.736066691783894e-07, 'samples': 24961536, 'steps': 48752, 'batch_loss/train': 0.7199084311723709}
12/29/2021 09:09:46 - INFO - codeparrot_training - Step 48753: {'lr': 7.723680501008635e-07, 'samples': 24962048, 'steps': 48753, 'batch_loss/train': 0.7279483694583178}
12/29/2021 09:10:01 - INFO - codeparrot_training - Step 48754: {'lr': 7.711304218583348e-07, 'samples': 24962560, 'steps': 48754, 'batch_loss/train': 0.6746753124753013}
12/29/2021 09:10:11 - INFO - codeparrot_training - Step 48755: {'lr': 7.698937844556609e-07, 'samples': 24963072, 'steps': 48755, 'batch_loss/train': 0.7003693040460348}
12/29/2021 09:10:22 - INFO - codeparrot_training - Step 48756: {'lr': 7.686581378977819e-07, 'samples': 24963584, 'steps': 48756, 'batch_loss/train': 0.6837002790998667}
12/29/2021 09:10:34 - INFO - codeparrot_training - Step 48757: {'lr': 7.674234821896109e-07, 'samples': 24964096, 'steps': 48757, 'batch_loss/train': 0.7398240939946845}
12/29/2021 09:10:45 - INFO - codeparrot_training - Step 48758: {'lr': 7.661898173360604e-07, 'samples': 24964608, 'steps': 48758, 'batch_loss/train': 0.7259067445993423}
12/29/2021 09:10:56 - INFO - codeparrot_training - Step 48759: {'lr': 7.649571433420154e-07, 'samples': 24965120, 'steps': 48759, 'batch_loss/train': 0.6619223922025412}
12/29/2021 09:11:06 - INFO - codeparrot_training - Step 48760: {'lr': 7.637254602124166e-07, 'samples': 24965632, 'steps': 48760, 'batch_loss/train': 0.6922811921685934}
12/29/2021 09:11:18 - INFO - codeparrot_training - Step 48761: {'lr': 7.62494767952121e-07, 'samples': 24966144, 'steps': 48761, 'batch_loss/train': 0.7052688321564347}
12/29/2021 09:11:29 - INFO - codeparrot_training - Step 48762: {'lr': 7.612650665660692e-07, 'samples': 24966656, 'steps': 48762, 'batch_loss/train': 0.6577281490899622}
12/29/2021 09:11:39 - INFO - codeparrot_training - Step 48763: {'lr': 7.600363560590906e-07, 'samples': 24967168, 'steps': 48763, 'batch_loss/train': 0.6793251060880721}
12/29/2021 09:11:53 - INFO - codeparrot_training - Step 48764: {'lr': 7.58808636436098e-07, 'samples': 24967680, 'steps': 48764, 'batch_loss/train': 0.6957404003478587}
12/29/2021 09:12:04 - INFO - codeparrot_training - Step 48765: {'lr': 7.575819077019763e-07, 'samples': 24968192, 'steps': 48765, 'batch_loss/train': 0.6973333992063999}
12/29/2021 09:12:15 - INFO - codeparrot_training - Step 48766: {'lr': 7.563561698616106e-07, 'samples': 24968704, 'steps': 48766, 'batch_loss/train': 0.7463909301441163}
12/29/2021 09:12:27 - INFO - codeparrot_training - Step 48767: {'lr': 7.55131422919858e-07, 'samples': 24969216, 'steps': 48767, 'batch_loss/train': 0.7568834647536278}
12/29/2021 09:12:37 - INFO - codeparrot_training - Step 48768: {'lr': 7.539076668815759e-07, 'samples': 24969728, 'steps': 48768, 'batch_loss/train': 0.7162639200687408}
12/29/2021 09:12:48 - INFO - codeparrot_training - Step 48769: {'lr': 7.526849017516491e-07, 'samples': 24970240, 'steps': 48769, 'batch_loss/train': 0.7451809151098132}
12/29/2021 09:12:59 - INFO - codeparrot_training - Step 48770: {'lr': 7.514631275349349e-07, 'samples': 24970752, 'steps': 48770, 'batch_loss/train': 0.7148760298732668}
12/29/2021 09:13:11 - INFO - codeparrot_training - Step 48771: {'lr': 7.502423442363182e-07, 'samples': 24971264, 'steps': 48771, 'batch_loss/train': 0.4977401622454636}
12/29/2021 09:13:22 - INFO - codeparrot_training - Step 48772: {'lr': 7.490225518606008e-07, 'samples': 24971776, 'steps': 48772, 'batch_loss/train': 0.4575455150625203}
12/29/2021 09:13:33 - INFO - codeparrot_training - Step 48773: {'lr': 7.478037504126678e-07, 'samples': 24972288, 'steps': 48773, 'batch_loss/train': 0.7136164684779942}
12/29/2021 09:13:47 - INFO - codeparrot_training - Step 48774: {'lr': 7.465859398973207e-07, 'samples': 24972800, 'steps': 48774, 'batch_loss/train': 0.6418462365400046}
12/29/2021 09:13:57 - INFO - codeparrot_training - Step 48775: {'lr': 7.453691203194723e-07, 'samples': 24973312, 'steps': 48775, 'batch_loss/train': 0.6238155757309869}
12/29/2021 09:14:08 - INFO - codeparrot_training - Step 48776: {'lr': 7.441532916838967e-07, 'samples': 24973824, 'steps': 48776, 'batch_loss/train': 0.6297505546826869}
12/29/2021 09:14:20 - INFO - codeparrot_training - Step 48777: {'lr': 7.429384539954787e-07, 'samples': 24974336, 'steps': 48777, 'batch_loss/train': 0.6341127105988562}
12/29/2021 09:14:30 - INFO - codeparrot_training - Step 48778: {'lr': 7.417246072589922e-07, 'samples': 24974848, 'steps': 48778, 'batch_loss/train': 0.6437413231469691}
12/29/2021 09:14:41 - INFO - codeparrot_training - Step 48779: {'lr': 7.405117514792948e-07, 'samples': 24975360, 'steps': 48779, 'batch_loss/train': 0.6916636563837528}
12/29/2021 09:14:52 - INFO - codeparrot_training - Step 48780: {'lr': 7.392998866612432e-07, 'samples': 24975872, 'steps': 48780, 'batch_loss/train': 0.7483610985800624}
12/29/2021 09:15:04 - INFO - codeparrot_training - Step 48781: {'lr': 7.380890128095841e-07, 'samples': 24976384, 'steps': 48781, 'batch_loss/train': 0.6280333122704178}
12/29/2021 09:15:15 - INFO - codeparrot_training - Step 48782: {'lr': 7.368791299291744e-07, 'samples': 24976896, 'steps': 48782, 'batch_loss/train': 0.7392048480687663}
12/29/2021 09:15:25 - INFO - codeparrot_training - Step 48783: {'lr': 7.356702380248159e-07, 'samples': 24977408, 'steps': 48783, 'batch_loss/train': 0.6774957748129964}
12/29/2021 09:15:40 - INFO - codeparrot_training - Step 48784: {'lr': 7.344623371013104e-07, 'samples': 24977920, 'steps': 48784, 'batch_loss/train': 0.7396395253017545}
12/29/2021 09:15:50 - INFO - codeparrot_training - Step 48785: {'lr': 7.332554271634595e-07, 'samples': 24978432, 'steps': 48785, 'batch_loss/train': 0.4900262322917115}
12/29/2021 09:16:01 - INFO - codeparrot_training - Step 48786: {'lr': 7.320495082160649e-07, 'samples': 24978944, 'steps': 48786, 'batch_loss/train': 0.7460703272372484}
12/29/2021 09:16:13 - INFO - codeparrot_training - Step 48787: {'lr': 7.308445802639286e-07, 'samples': 24979456, 'steps': 48787, 'batch_loss/train': 0.6081140507012606}
12/29/2021 09:16:24 - INFO - codeparrot_training - Step 48788: {'lr': 7.296406433118241e-07, 'samples': 24979968, 'steps': 48788, 'batch_loss/train': 0.7274126706179231}
12/29/2021 09:16:34 - INFO - codeparrot_training - Step 48789: {'lr': 7.284376973645535e-07, 'samples': 24980480, 'steps': 48789, 'batch_loss/train': 0.6497601978480816}
12/29/2021 09:16:45 - INFO - codeparrot_training - Step 48790: {'lr': 7.272357424269182e-07, 'samples': 24980992, 'steps': 48790, 'batch_loss/train': 0.7368303656112403}
12/29/2021 09:16:59 - INFO - codeparrot_training - Step 48791: {'lr': 7.26034778503637e-07, 'samples': 24981504, 'steps': 48791, 'batch_loss/train': 0.7259293519891798}
12/29/2021 09:17:09 - INFO - codeparrot_training - Step 48792: {'lr': 7.248348055995391e-07, 'samples': 24982016, 'steps': 48792, 'batch_loss/train': 0.6356371740112081}
12/29/2021 09:17:20 - INFO - codeparrot_training - Step 48793: {'lr': 7.236358237193708e-07, 'samples': 24982528, 'steps': 48793, 'batch_loss/train': 0.7820037314668298}
12/29/2021 09:17:32 - INFO - codeparrot_training - Step 48794: {'lr': 7.22437832867906e-07, 'samples': 24983040, 'steps': 48794, 'batch_loss/train': 0.6285573169589043}
12/29/2021 09:17:43 - INFO - codeparrot_training - Step 48795: {'lr': 7.21240833049891e-07, 'samples': 24983552, 'steps': 48795, 'batch_loss/train': 0.7050336794927716}
12/29/2021 09:17:53 - INFO - codeparrot_training - Step 48796: {'lr': 7.200448242700996e-07, 'samples': 24984064, 'steps': 48796, 'batch_loss/train': 0.7093717600218952}
12/29/2021 09:18:05 - INFO - codeparrot_training - Step 48797: {'lr': 7.18849806533306e-07, 'samples': 24984576, 'steps': 48797, 'batch_loss/train': 0.7676428882405162}
12/29/2021 09:18:16 - INFO - codeparrot_training - Step 48798: {'lr': 7.176557798442284e-07, 'samples': 24985088, 'steps': 48798, 'batch_loss/train': 0.8073788830079138}
12/29/2021 09:18:27 - INFO - codeparrot_training - Step 48799: {'lr': 7.164627442076133e-07, 'samples': 24985600, 'steps': 48799, 'batch_loss/train': 0.6808305319282226}
12/29/2021 09:18:37 - INFO - codeparrot_training - Step 48800: {'lr': 7.152706996282344e-07, 'samples': 24986112, 'steps': 48800, 'batch_loss/train': 0.7415898153558373}
12/29/2021 09:18:51 - INFO - codeparrot_training - Step 48801: {'lr': 7.140796461108102e-07, 'samples': 24986624, 'steps': 48801, 'batch_loss/train': 0.6786441693548113}
12/29/2021 09:19:02 - INFO - codeparrot_training - Step 48802: {'lr': 7.128895836600591e-07, 'samples': 24987136, 'steps': 48802, 'batch_loss/train': 0.7061250824481249}
12/29/2021 09:19:12 - INFO - codeparrot_training - Step 48803: {'lr': 7.117005122807552e-07, 'samples': 24987648, 'steps': 48803, 'batch_loss/train': 0.6491048275493085}
12/29/2021 09:19:24 - INFO - codeparrot_training - Step 48804: {'lr': 7.105124319775891e-07, 'samples': 24988160, 'steps': 48804, 'batch_loss/train': 0.612169440370053}
12/29/2021 09:19:35 - INFO - codeparrot_training - Step 48805: {'lr': 7.093253427553071e-07, 'samples': 24988672, 'steps': 48805, 'batch_loss/train': 0.7583474800921977}
12/29/2021 09:19:45 - INFO - codeparrot_training - Step 48806: {'lr': 7.081392446186274e-07, 'samples': 24989184, 'steps': 48806, 'batch_loss/train': 0.609788880450651}
12/29/2021 09:19:57 - INFO - codeparrot_training - Step 48807: {'lr': 7.06954137572241e-07, 'samples': 24989696, 'steps': 48807, 'batch_loss/train': 0.7618729677051306}
12/29/2021 09:20:08 - INFO - codeparrot_training - Step 48808: {'lr': 7.05770021620894e-07, 'samples': 24990208, 'steps': 48808, 'batch_loss/train': 0.7419911482720636}
12/29/2021 09:20:19 - INFO - codeparrot_training - Step 48809: {'lr': 7.045868967692493e-07, 'samples': 24990720, 'steps': 48809, 'batch_loss/train': 0.7459198497235775}
12/29/2021 09:20:29 - INFO - codeparrot_training - Step 48810: {'lr': 7.034047630220531e-07, 'samples': 24991232, 'steps': 48810, 'batch_loss/train': 0.8244185596704483}
12/29/2021 09:20:41 - INFO - codeparrot_training - Step 48811: {'lr': 7.022236203839682e-07, 'samples': 24991744, 'steps': 48811, 'batch_loss/train': 0.8088690002914518}
12/29/2021 09:20:52 - INFO - codeparrot_training - Step 48812: {'lr': 7.010434688597412e-07, 'samples': 24992256, 'steps': 48812, 'batch_loss/train': 0.6826389413326979}
12/29/2021 09:21:03 - INFO - codeparrot_training - Step 48813: {'lr': 6.998643084540069e-07, 'samples': 24992768, 'steps': 48813, 'batch_loss/train': 0.6341687858221121}
12/29/2021 09:21:17 - INFO - codeparrot_training - Step 48814: {'lr': 6.986861391715116e-07, 'samples': 24993280, 'steps': 48814, 'batch_loss/train': 2.0584837840870023}
12/29/2021 09:21:28 - INFO - codeparrot_training - Step 48815: {'lr': 6.975089610168906e-07, 'samples': 24993792, 'steps': 48815, 'batch_loss/train': 0.7124211041373201}
12/29/2021 09:21:39 - INFO - codeparrot_training - Step 48816: {'lr': 6.963327739948344e-07, 'samples': 24994304, 'steps': 48816, 'batch_loss/train': 0.6872486462816596}
12/29/2021 09:21:51 - INFO - codeparrot_training - Step 48817: {'lr': 6.951575781100339e-07, 'samples': 24994816, 'steps': 48817, 'batch_loss/train': 0.6996333601418883}
12/29/2021 09:22:01 - INFO - codeparrot_training - Step 48818: {'lr': 6.939833733671242e-07, 'samples': 24995328, 'steps': 48818, 'batch_loss/train': 0.8318786122836173}
12/29/2021 09:22:12 - INFO - codeparrot_training - Step 48819: {'lr': 6.928101597708514e-07, 'samples': 24995840, 'steps': 48819, 'batch_loss/train': 0.717551918933168}
12/29/2021 09:22:23 - INFO - codeparrot_training - Step 48820: {'lr': 6.916379373257953e-07, 'samples': 24996352, 'steps': 48820, 'batch_loss/train': 0.7650472931563854}
12/29/2021 09:22:35 - INFO - codeparrot_training - Step 48821: {'lr': 6.904667060366742e-07, 'samples': 24996864, 'steps': 48821, 'batch_loss/train': 0.5577662864234298}
12/29/2021 09:22:45 - INFO - codeparrot_training - Step 48822: {'lr': 6.892964659080958e-07, 'samples': 24997376, 'steps': 48822, 'batch_loss/train': 0.7781009525060654}
12/29/2021 09:22:56 - INFO - codeparrot_training - Step 48823: {'lr': 6.881272169447783e-07, 'samples': 24997888, 'steps': 48823, 'batch_loss/train': 0.6508955452591181}
12/29/2021 09:23:10 - INFO - codeparrot_training - Step 48824: {'lr': 6.869589591513015e-07, 'samples': 24998400, 'steps': 48824, 'batch_loss/train': 0.7557126423344016}
12/29/2021 09:23:21 - INFO - codeparrot_training - Step 48825: {'lr': 6.857916925323282e-07, 'samples': 24998912, 'steps': 48825, 'batch_loss/train': 0.7219321299344301}
12/29/2021 09:23:31 - INFO - codeparrot_training - Step 48826: {'lr': 6.846254170925215e-07, 'samples': 24999424, 'steps': 48826, 'batch_loss/train': 0.7416200982406735}
12/29/2021 09:23:43 - INFO - codeparrot_training - Step 48827: {'lr': 6.834601328365164e-07, 'samples': 24999936, 'steps': 48827, 'batch_loss/train': 0.6771259258966893}
12/29/2021 09:23:54 - INFO - codeparrot_training - Step 48828: {'lr': 6.822958397689205e-07, 'samples': 25000448, 'steps': 48828, 'batch_loss/train': 0.7544237598776817}
12/29/2021 09:24:05 - INFO - codeparrot_training - Step 48829: {'lr': 6.811325378943967e-07, 'samples': 25000960, 'steps': 48829, 'batch_loss/train': 0.7495616676751524}
12/29/2021 09:24:18 - INFO - codeparrot_training - Step 48830: {'lr': 6.799702272175523e-07, 'samples': 25001472, 'steps': 48830, 'batch_loss/train': 0.7950403997674584}
12/29/2021 09:24:29 - INFO - codeparrot_training - Step 48831: {'lr': 6.788089077429948e-07, 'samples': 25001984, 'steps': 48831, 'batch_loss/train': 0.7570103872567415}
12/29/2021 09:24:40 - INFO - codeparrot_training - Step 48832: {'lr': 6.776485794753596e-07, 'samples': 25002496, 'steps': 48832, 'batch_loss/train': 1.6159433210268617}
12/29/2021 09:24:50 - INFO - codeparrot_training - Step 48833: {'lr': 6.76489242419226e-07, 'samples': 25003008, 'steps': 48833, 'batch_loss/train': 0.7155724344775081}
12/29/2021 09:25:02 - INFO - codeparrot_training - Step 48834: {'lr': 6.753308965792571e-07, 'samples': 25003520, 'steps': 48834, 'batch_loss/train': 0.732963602989912}
12/29/2021 09:25:13 - INFO - codeparrot_training - Step 48835: {'lr': 6.741735419600325e-07, 'samples': 25004032, 'steps': 48835, 'batch_loss/train': 0.693383161444217}
12/29/2021 09:25:24 - INFO - codeparrot_training - Step 48836: {'lr': 6.730171785661321e-07, 'samples': 25004544, 'steps': 48836, 'batch_loss/train': 0.7305371053516865}
12/29/2021 09:25:36 - INFO - codeparrot_training - Step 48837: {'lr': 6.718618064021909e-07, 'samples': 25005056, 'steps': 48837, 'batch_loss/train': 0.649102296680212}
12/29/2021 09:25:47 - INFO - codeparrot_training - Step 48838: {'lr': 6.707074254727885e-07, 'samples': 25005568, 'steps': 48838, 'batch_loss/train': 0.7540359671693295}
12/29/2021 09:25:57 - INFO - codeparrot_training - Step 48839: {'lr': 6.695540357825047e-07, 'samples': 25006080, 'steps': 48839, 'batch_loss/train': 0.7033105101436377}
12/29/2021 09:26:08 - INFO - codeparrot_training - Step 48840: {'lr': 6.684016373359192e-07, 'samples': 25006592, 'steps': 48840, 'batch_loss/train': 0.7785658873617649}
12/29/2021 09:26:23 - INFO - codeparrot_training - Step 48841: {'lr': 6.672502301376393e-07, 'samples': 25007104, 'steps': 48841, 'batch_loss/train': 0.7394254696555436}
12/29/2021 09:26:33 - INFO - codeparrot_training - Step 48842: {'lr': 6.66099814192217e-07, 'samples': 25007616, 'steps': 48842, 'batch_loss/train': 0.6814299924299121}
12/29/2021 09:26:44 - INFO - codeparrot_training - Step 48843: {'lr': 6.64950389504232e-07, 'samples': 25008128, 'steps': 48843, 'batch_loss/train': 0.7948983409442008}
12/29/2021 09:26:56 - INFO - codeparrot_training - Step 48844: {'lr': 6.638019560782637e-07, 'samples': 25008640, 'steps': 48844, 'batch_loss/train': 0.8356736861169338}
12/29/2021 09:27:07 - INFO - codeparrot_training - Step 48845: {'lr': 6.626545139188922e-07, 'samples': 25009152, 'steps': 48845, 'batch_loss/train': 0.7290749736130238}
12/29/2021 09:27:18 - INFO - codeparrot_training - Step 48846: {'lr': 6.615080630306414e-07, 'samples': 25009664, 'steps': 48846, 'batch_loss/train': 0.6402737017488107}
12/29/2021 09:27:31 - INFO - codeparrot_training - Step 48847: {'lr': 6.60362603418091e-07, 'samples': 25010176, 'steps': 48847, 'batch_loss/train': 0.6332275015302002}
12/29/2021 09:27:42 - INFO - codeparrot_training - Step 48848: {'lr': 6.592181350857929e-07, 'samples': 25010688, 'steps': 48848, 'batch_loss/train': 0.8104922724887729}
12/29/2021 09:27:53 - INFO - codeparrot_training - Step 48849: {'lr': 6.58074658038299e-07, 'samples': 25011200, 'steps': 48849, 'batch_loss/train': 0.7121848100796342}
12/29/2021 09:28:05 - INFO - codeparrot_training - Step 48850: {'lr': 6.569321722801614e-07, 'samples': 25011712, 'steps': 48850, 'batch_loss/train': 0.7479889541864395}
12/29/2021 09:28:16 - INFO - codeparrot_training - Step 48851: {'lr': 6.557906778158762e-07, 'samples': 25012224, 'steps': 48851, 'batch_loss/train': 0.8109472473151982}
12/29/2021 09:28:26 - INFO - codeparrot_training - Step 48852: {'lr': 6.54650174650051e-07, 'samples': 25012736, 'steps': 48852, 'batch_loss/train': 0.6466767704114318}
12/29/2021 09:28:37 - INFO - codeparrot_training - Step 48853: {'lr': 6.535106627871823e-07, 'samples': 25013248, 'steps': 48853, 'batch_loss/train': 0.8521840954199433}
12/29/2021 09:28:49 - INFO - codeparrot_training - Step 48854: {'lr': 6.523721422317941e-07, 'samples': 25013760, 'steps': 48854, 'batch_loss/train': 0.7281014518812299}
12/29/2021 09:29:00 - INFO - codeparrot_training - Step 48855: {'lr': 6.512346129884383e-07, 'samples': 25014272, 'steps': 48855, 'batch_loss/train': 0.6789198340848088}
12/29/2021 09:29:10 - INFO - codeparrot_training - Step 48856: {'lr': 6.500980750616392e-07, 'samples': 25014784, 'steps': 48856, 'batch_loss/train': 0.7173955873586237}
12/29/2021 09:29:22 - INFO - codeparrot_training - Step 48857: {'lr': 6.489625284558654e-07, 'samples': 25015296, 'steps': 48857, 'batch_loss/train': 0.6887541189789772}
12/29/2021 09:29:33 - INFO - codeparrot_training - Step 48858: {'lr': 6.478279731756964e-07, 'samples': 25015808, 'steps': 48858, 'batch_loss/train': 0.9343058164231479}
12/29/2021 09:29:44 - INFO - codeparrot_training - Step 48859: {'lr': 6.466944092255733e-07, 'samples': 25016320, 'steps': 48859, 'batch_loss/train': 0.8377226110897027}
12/29/2021 09:29:58 - INFO - codeparrot_training - Step 48860: {'lr': 6.455618366100758e-07, 'samples': 25016832, 'steps': 48860, 'batch_loss/train': 0.6087450215127319}
12/29/2021 09:30:08 - INFO - codeparrot_training - Step 48861: {'lr': 6.444302553336446e-07, 'samples': 25017344, 'steps': 48861, 'batch_loss/train': 0.692066324991174}
12/29/2021 09:30:19 - INFO - codeparrot_training - Step 48862: {'lr': 6.432996654008317e-07, 'samples': 25017856, 'steps': 48862, 'batch_loss/train': 0.7410721685737371}
12/29/2021 09:30:30 - INFO - codeparrot_training - Step 48863: {'lr': 6.421700668160779e-07, 'samples': 25018368, 'steps': 48863, 'batch_loss/train': 0.7776189735159278}
12/29/2021 09:30:42 - INFO - codeparrot_training - Step 48864: {'lr': 6.410414595839353e-07, 'samples': 25018880, 'steps': 48864, 'batch_loss/train': 0.6663085492327809}
12/29/2021 09:30:53 - INFO - codeparrot_training - Step 48865: {'lr': 6.399138437088447e-07, 'samples': 25019392, 'steps': 48865, 'batch_loss/train': 0.6996879875659943}
12/29/2021 09:31:03 - INFO - codeparrot_training - Step 48866: {'lr': 6.387872191952748e-07, 'samples': 25019904, 'steps': 48866, 'batch_loss/train': 0.8030692329630256}
12/29/2021 09:31:16 - INFO - codeparrot_training - Step 48867: {'lr': 6.376615860477497e-07, 'samples': 25020416, 'steps': 48867, 'batch_loss/train': 0.654015158303082}
12/29/2021 09:31:26 - INFO - codeparrot_training - Step 48868: {'lr': 6.36536944270738e-07, 'samples': 25020928, 'steps': 48868, 'batch_loss/train': 0.7573057264089584}
12/29/2021 09:31:37 - INFO - codeparrot_training - Step 48869: {'lr': 6.354132938687084e-07, 'samples': 25021440, 'steps': 48869, 'batch_loss/train': 0.693883020314388}
12/29/2021 09:31:51 - INFO - codeparrot_training - Step 48870: {'lr': 6.34290634846102e-07, 'samples': 25021952, 'steps': 48870, 'batch_loss/train': 0.5801023283565883}
12/29/2021 09:32:02 - INFO - codeparrot_training - Step 48871: {'lr': 6.33168967207387e-07, 'samples': 25022464, 'steps': 48871, 'batch_loss/train': 0.712272166274488}
12/29/2021 09:32:12 - INFO - codeparrot_training - Step 48872: {'lr': 6.320482909570602e-07, 'samples': 25022976, 'steps': 48872, 'batch_loss/train': 0.6868470660410821}
12/29/2021 09:32:23 - INFO - codeparrot_training - Step 48873: {'lr': 6.309286060995345e-07, 'samples': 25023488, 'steps': 48873, 'batch_loss/train': 0.6217823978513479}
12/29/2021 09:32:35 - INFO - codeparrot_training - Step 48874: {'lr': 6.298099126392787e-07, 'samples': 25024000, 'steps': 48874, 'batch_loss/train': 0.9340907135047019}
12/29/2021 09:32:46 - INFO - codeparrot_training - Step 48875: {'lr': 6.286922105807613e-07, 'samples': 25024512, 'steps': 48875, 'batch_loss/train': 0.7835420505143702}
12/29/2021 09:32:56 - INFO - codeparrot_training - Step 48876: {'lr': 6.275754999283678e-07, 'samples': 25025024, 'steps': 48876, 'batch_loss/train': 0.5405367047060281}
12/29/2021 09:33:10 - INFO - codeparrot_training - Step 48877: {'lr': 6.264597806865946e-07, 'samples': 25025536, 'steps': 48877, 'batch_loss/train': 0.7632604003883898}
12/29/2021 09:33:21 - INFO - codeparrot_training - Step 48878: {'lr': 6.253450528598826e-07, 'samples': 25026048, 'steps': 48878, 'batch_loss/train': 0.7373282788321376}
12/29/2021 09:33:32 - INFO - codeparrot_training - Step 48879: {'lr': 6.242313164525893e-07, 'samples': 25026560,
'steps': 48879, 'batch_loss/train': 0.6165566976414993} 12/29/2021 09:33:44 - INFO - codeparrot_training - Step 48880: {'lr': 6.231185714692389e-07, 'samples': 25027072, 'steps': 48880, 'batch_loss/train': 0.7750975601375103} 12/29/2021 09:33:54 - INFO - codeparrot_training - Step 48881: {'lr': 6.220068179141614e-07, 'samples': 25027584, 'steps': 48881, 'batch_loss/train': 0.7830075239762664} 12/29/2021 09:34:05 - INFO - codeparrot_training - Step 48882: {'lr': 6.20896055791853e-07, 'samples': 25028096, 'steps': 48882, 'batch_loss/train': 0.667327500297688} 12/29/2021 09:34:16 - INFO - codeparrot_training - Step 48883: {'lr': 6.197862851066994e-07, 'samples': 25028608, 'steps': 48883, 'batch_loss/train': 0.7471308152889833} 12/29/2021 09:34:28 - INFO - codeparrot_training - Step 48884: {'lr': 6.186775058631134e-07, 'samples': 25029120, 'steps': 48884, 'batch_loss/train': 0.8313580721151084} 12/29/2021 09:34:38 - INFO - codeparrot_training - Step 48885: {'lr': 6.175697180654805e-07, 'samples': 25029632, 'steps': 48885, 'batch_loss/train': 0.7295168510172516} 12/29/2021 09:34:49 - INFO - codeparrot_training - Step 48886: {'lr': 6.164629217182693e-07, 'samples': 25030144, 'steps': 48886, 'batch_loss/train': 0.7559919254854321} 12/29/2021 09:35:03 - INFO - codeparrot_training - Step 48887: {'lr': 6.153571168258098e-07, 'samples': 25030656, 'steps': 48887, 'batch_loss/train': 0.7717156335711479} 12/29/2021 09:35:14 - INFO - codeparrot_training - Step 48888: {'lr': 6.142523033925151e-07, 'samples': 25031168, 'steps': 48888, 'batch_loss/train': 0.7897591448854655} 12/29/2021 09:35:24 - INFO - codeparrot_training - Step 48889: {'lr': 6.131484814227983e-07, 'samples': 25031680, 'steps': 48889, 'batch_loss/train': 0.691709749866277} 12/29/2021 09:35:36 - INFO - codeparrot_training - Step 48890: {'lr': 6.120456509210725e-07, 'samples': 25032192, 'steps': 48890, 'batch_loss/train': 0.7271308330819011} 12/29/2021 09:35:47 - INFO - codeparrot_training - Step 48891: {'lr': 
6.1094381189164e-07, 'samples': 25032704, 'steps': 48891, 'batch_loss/train': 0.757993720471859} 12/29/2021 09:35:58 - INFO - codeparrot_training - Step 48892: {'lr': 6.098429643389691e-07, 'samples': 25033216, 'steps': 48892, 'batch_loss/train': 0.7668351056054235} 12/29/2021 09:36:10 - INFO - codeparrot_training - Step 48893: {'lr': 6.087431082673622e-07, 'samples': 25033728, 'steps': 48893, 'batch_loss/train': 0.7739393077790737} 12/29/2021 09:36:21 - INFO - codeparrot_training - Step 48894: {'lr': 6.076442436812324e-07, 'samples': 25034240, 'steps': 48894, 'batch_loss/train': 1.045581630896777} 12/29/2021 09:36:31 - INFO - codeparrot_training - Step 48895: {'lr': 6.06546370584965e-07, 'samples': 25034752, 'steps': 48895, 'batch_loss/train': 0.7492457311600447} 12/29/2021 09:36:42 - INFO - codeparrot_training - Step 48896: {'lr': 6.054494889828898e-07, 'samples': 25035264, 'steps': 48896, 'batch_loss/train': 0.6679469447117299} 12/29/2021 09:36:54 - INFO - codeparrot_training - Step 48897: {'lr': 6.043535988793647e-07, 'samples': 25035776, 'steps': 48897, 'batch_loss/train': 0.7685776180587709} 12/29/2021 09:37:05 - INFO - codeparrot_training - Step 48898: {'lr': 6.032587002787749e-07, 'samples': 25036288, 'steps': 48898, 'batch_loss/train': 0.6453145085833967} 12/29/2021 09:37:15 - INFO - codeparrot_training - Step 48899: {'lr': 6.021647931854501e-07, 'samples': 25036800, 'steps': 48899, 'batch_loss/train': 0.7158243649173528} 12/29/2021 09:37:29 - INFO - codeparrot_training - Step 48900: {'lr': 6.010718776037483e-07, 'samples': 25037312, 'steps': 48900, 'batch_loss/train': 0.6880078380927444} 12/29/2021 09:37:40 - INFO - codeparrot_training - Step 48901: {'lr': 5.999799535380546e-07, 'samples': 25037824, 'steps': 48901, 'batch_loss/train': 0.7727544531226158} 12/29/2021 09:37:51 - INFO - codeparrot_training - Step 48902: {'lr': 5.988890209926434e-07, 'samples': 25038336, 'steps': 48902, 'batch_loss/train': 0.7209668091963977} 12/29/2021 09:38:01 - INFO - 
codeparrot_training - Step 48903: {'lr': 5.977990799718724e-07, 'samples': 25038848, 'steps': 48903, 'batch_loss/train': 0.6939069638028741} 12/29/2021 09:38:13 - INFO - codeparrot_training - Step 48904: {'lr': 5.967101304800715e-07, 'samples': 25039360, 'steps': 48904, 'batch_loss/train': 0.7892586393281817} 12/29/2021 09:38:24 - INFO - codeparrot_training - Step 48905: {'lr': 5.956221725215983e-07, 'samples': 25039872, 'steps': 48905, 'batch_loss/train': 0.7240345245227218} 12/29/2021 09:38:35 - INFO - codeparrot_training - Step 48906: {'lr': 5.945352061007548e-07, 'samples': 25040384, 'steps': 48906, 'batch_loss/train': 0.7765799574553967} 12/29/2021 09:38:47 - INFO - codeparrot_training - Step 48907: {'lr': 5.934492312218709e-07, 'samples': 25040896, 'steps': 48907, 'batch_loss/train': 0.6455612946301699} 12/29/2021 09:38:57 - INFO - codeparrot_training - Step 48908: {'lr': 5.923642478892488e-07, 'samples': 25041408, 'steps': 48908, 'batch_loss/train': 0.7105989530682564} 12/29/2021 09:39:08 - INFO - codeparrot_training - Step 48909: {'lr': 5.912802561072184e-07, 'samples': 25041920, 'steps': 48909, 'batch_loss/train': 0.6434816832188517} 12/29/2021 09:39:22 - INFO - codeparrot_training - Step 48910: {'lr': 5.901972558801095e-07, 'samples': 25042432, 'steps': 48910, 'batch_loss/train': 0.7365147406235337} 12/29/2021 09:39:33 - INFO - codeparrot_training - Step 48911: {'lr': 5.891152472121686e-07, 'samples': 25042944, 'steps': 48911, 'batch_loss/train': 0.8781493827700615} 12/29/2021 09:39:43 - INFO - codeparrot_training - Step 48912: {'lr': 5.880342301077535e-07, 'samples': 25043456, 'steps': 48912, 'batch_loss/train': 0.6845367692876607} 12/29/2021 09:39:55 - INFO - codeparrot_training - Step 48913: {'lr': 5.869542045711385e-07, 'samples': 25043968, 'steps': 48913, 'batch_loss/train': 0.6920244578504935} 12/29/2021 09:40:06 - INFO - codeparrot_training - Step 48914: {'lr': 5.858751706065979e-07, 'samples': 25044480, 'steps': 48914, 'batch_loss/train': 
0.7443080209195614} 12/29/2021 09:40:17 - INFO - codeparrot_training - Step 48915: {'lr': 5.847971282184616e-07, 'samples': 25044992, 'steps': 48915, 'batch_loss/train': 0.7384258466772735} 12/29/2021 09:40:27 - INFO - codeparrot_training - Step 48916: {'lr': 5.837200774110041e-07, 'samples': 25045504, 'steps': 48916, 'batch_loss/train': 0.8044142378494143} 12/29/2021 09:40:41 - INFO - codeparrot_training - Step 48917: {'lr': 5.826440181884718e-07, 'samples': 25046016, 'steps': 48917, 'batch_loss/train': 0.7890607309527695} 12/29/2021 09:40:52 - INFO - codeparrot_training - Step 48918: {'lr': 5.815689505551946e-07, 'samples': 25046528, 'steps': 48918, 'batch_loss/train': 0.7668475350365043} 12/29/2021 09:41:02 - INFO - codeparrot_training - Step 48919: {'lr': 5.804948745154193e-07, 'samples': 25047040, 'steps': 48919, 'batch_loss/train': 0.7050375966355205} 12/29/2021 09:41:15 - INFO - codeparrot_training - Step 48920: {'lr': 5.7942179007342e-07, 'samples': 25047552, 'steps': 48920, 'batch_loss/train': 0.6383198075927794} 12/29/2021 09:41:25 - INFO - codeparrot_training - Step 48921: {'lr': 5.783496972334434e-07, 'samples': 25048064, 'steps': 48921, 'batch_loss/train': 0.6666105738840997} 12/29/2021 09:41:36 - INFO - codeparrot_training - Step 48922: {'lr': 5.772785959997917e-07, 'samples': 25048576, 'steps': 48922, 'batch_loss/train': 0.7800071574747562} 12/29/2021 09:41:48 - INFO - codeparrot_training - Step 48923: {'lr': 5.762084863766837e-07, 'samples': 25049088, 'steps': 48923, 'batch_loss/train': 0.6463871211744845} 12/29/2021 09:41:58 - INFO - codeparrot_training - Step 48924: {'lr': 5.751393683683937e-07, 'samples': 25049600, 'steps': 48924, 'batch_loss/train': 0.6266887285746634} 12/29/2021 09:42:09 - INFO - codeparrot_training - Step 48925: {'lr': 5.74071241979196e-07, 'samples': 25050112, 'steps': 48925, 'batch_loss/train': 0.7247328441590071} 12/29/2021 09:42:23 - INFO - codeparrot_training - Step 48926: {'lr': 5.730041072132819e-07, 'samples': 
25050624, 'steps': 48926, 'batch_loss/train': 0.647919581271708} 12/29/2021 09:42:34 - INFO - codeparrot_training - Step 48927: {'lr': 5.719379640749534e-07, 'samples': 25051136, 'steps': 48927, 'batch_loss/train': 0.6683847745880485} 12/29/2021 09:42:44 - INFO - codeparrot_training - Step 48928: {'lr': 5.708728125684015e-07, 'samples': 25051648, 'steps': 48928, 'batch_loss/train': 0.7747236748691648} 12/29/2021 09:42:55 - INFO - codeparrot_training - Step 48929: {'lr': 5.69808652697873e-07, 'samples': 25052160, 'steps': 48929, 'batch_loss/train': 0.5373793067410588} 12/29/2021 09:43:07 - INFO - codeparrot_training - Step 48930: {'lr': 5.687454844676143e-07, 'samples': 25052672, 'steps': 48930, 'batch_loss/train': 0.7900753582362086} 12/29/2021 09:43:18 - INFO - codeparrot_training - Step 48931: {'lr': 5.676833078818167e-07, 'samples': 25053184, 'steps': 48931, 'batch_loss/train': 1.020990014076233} 12/29/2021 09:43:28 - INFO - codeparrot_training - Step 48932: {'lr': 5.666221229447544e-07, 'samples': 25053696, 'steps': 48932, 'batch_loss/train': 0.6931632792111486} 12/29/2021 09:43:41 - INFO - codeparrot_training - Step 48933: {'lr': 5.655619296606185e-07, 'samples': 25054208, 'steps': 48933, 'batch_loss/train': 0.7689267238602042} 12/29/2021 09:43:52 - INFO - codeparrot_training - Step 48934: {'lr': 5.64502728033628e-07, 'samples': 25054720, 'steps': 48934, 'batch_loss/train': 0.7825247151777148} 12/29/2021 09:44:03 - INFO - codeparrot_training - Step 48935: {'lr': 5.634445180679737e-07, 'samples': 25055232, 'steps': 48935, 'batch_loss/train': 0.6439625115017407} 12/29/2021 09:44:15 - INFO - codeparrot_training - Step 48936: {'lr': 5.623872997679025e-07, 'samples': 25055744, 'steps': 48936, 'batch_loss/train': 0.5233240514062345} 12/29/2021 09:44:25 - INFO - codeparrot_training - Step 48937: {'lr': 5.613310731375776e-07, 'samples': 25056256, 'steps': 48937, 'batch_loss/train': 0.7463370240293443} 12/29/2021 09:44:36 - INFO - codeparrot_training - Step 48938: 
{'lr': 5.602758381812178e-07, 'samples': 25056768, 'steps': 48938, 'batch_loss/train': 0.7039226666092873} 12/29/2021 09:44:47 - INFO - codeparrot_training - Step 48939: {'lr': 5.592215949030421e-07, 'samples': 25057280, 'steps': 48939, 'batch_loss/train': 0.7410867484286427} 12/29/2021 09:45:00 - INFO - codeparrot_training - Step 48940: {'lr': 5.58168343307186e-07, 'samples': 25057792, 'steps': 48940, 'batch_loss/train': 0.7739759730175138} 12/29/2021 09:45:11 - INFO - codeparrot_training - Step 48941: {'lr': 5.571160833978683e-07, 'samples': 25058304, 'steps': 48941, 'batch_loss/train': 0.6225163680501282} 12/29/2021 09:45:22 - INFO - codeparrot_training - Step 48942: {'lr': 5.560648151792803e-07, 'samples': 25058816, 'steps': 48942, 'batch_loss/train': 0.7419935841462575} 12/29/2021 09:45:34 - INFO - codeparrot_training - Step 48943: {'lr': 5.550145386556127e-07, 'samples': 25059328, 'steps': 48943, 'batch_loss/train': 0.6854568959679455} 12/29/2021 09:45:44 - INFO - codeparrot_training - Step 48944: {'lr': 5.539652538309736e-07, 'samples': 25059840, 'steps': 48944, 'batch_loss/train': 0.7021562356385402} 12/29/2021 09:45:55 - INFO - codeparrot_training - Step 48945: {'lr': 5.529169607096096e-07, 'samples': 25060352, 'steps': 48945, 'batch_loss/train': 0.7177411783486605} 12/29/2021 09:46:09 - INFO - codeparrot_training - Step 48946: {'lr': 5.518696592956562e-07, 'samples': 25060864, 'steps': 48946, 'batch_loss/train': 0.7424995042383671} 12/29/2021 09:46:20 - INFO - codeparrot_training - Step 48947: {'lr': 5.508233495933046e-07, 'samples': 25061376, 'steps': 48947, 'batch_loss/train': 0.7589584197849035} 12/29/2021 09:46:30 - INFO - codeparrot_training - Step 48948: {'lr': 5.497780316066625e-07, 'samples': 25061888, 'steps': 48948, 'batch_loss/train': 0.5495677087455988} 12/29/2021 09:46:42 - INFO - codeparrot_training - Step 48949: {'lr': 5.487337053399211e-07, 'samples': 25062400, 'steps': 48949, 'batch_loss/train': 0.653334473259747} 12/29/2021 09:46:53 - 
INFO - codeparrot_training - Step 48950: {'lr': 5.476903707972158e-07, 'samples': 25062912, 'steps': 48950, 'batch_loss/train': 1.2045988405589014} 12/29/2021 09:47:04 - INFO - codeparrot_training - Step 48951: {'lr': 5.466480279827102e-07, 'samples': 25063424, 'steps': 48951, 'batch_loss/train': 0.9731959319906309} 12/29/2021 09:47:14 - INFO - codeparrot_training - Step 48952: {'lr': 5.456066769005397e-07, 'samples': 25063936, 'steps': 48952, 'batch_loss/train': 0.6896392540074885} 12/29/2021 09:47:27 - INFO - codeparrot_training - Step 48953: {'lr': 5.445663175548676e-07, 'samples': 25064448, 'steps': 48953, 'batch_loss/train': 0.6768562919460237} 12/29/2021 09:47:37 - INFO - codeparrot_training - Step 48954: {'lr': 5.435269499498019e-07, 'samples': 25064960, 'steps': 48954, 'batch_loss/train': 0.650777698494494} 12/29/2021 09:47:48 - INFO - codeparrot_training - Step 48955: {'lr': 5.424885740894503e-07, 'samples': 25065472, 'steps': 48955, 'batch_loss/train': 0.6262945138732903} 12/29/2021 09:48:02 - INFO - codeparrot_training - Step 48956: {'lr': 5.414511899780039e-07, 'samples': 25065984, 'steps': 48956, 'batch_loss/train': 0.7299549096496776} 12/29/2021 09:48:13 - INFO - codeparrot_training - Step 48957: {'lr': 5.404147976195428e-07, 'samples': 25066496, 'steps': 48957, 'batch_loss/train': 1.7583172984886914} 12/29/2021 09:48:23 - INFO - codeparrot_training - Step 48958: {'lr': 5.393793970182304e-07, 'samples': 25067008, 'steps': 48958, 'batch_loss/train': 0.6729537502396852} 12/29/2021 09:48:34 - INFO - codeparrot_training - Step 48959: {'lr': 5.383449881781189e-07, 'samples': 25067520, 'steps': 48959, 'batch_loss/train': 0.8681339318864048} 12/29/2021 09:48:46 - INFO - codeparrot_training - Step 48960: {'lr': 5.373115711033715e-07, 'samples': 25068032, 'steps': 48960, 'batch_loss/train': 0.7277547167614102} 12/29/2021 09:48:57 - INFO - codeparrot_training - Step 48961: {'lr': 5.362791457980687e-07, 'samples': 25068544, 'steps': 48961, 'batch_loss/train': 
0.7473162687383592} 12/29/2021 09:49:08 - INFO - codeparrot_training - Step 48962: {'lr': 5.35247712266318e-07, 'samples': 25069056, 'steps': 48962, 'batch_loss/train': 0.7590807313099504} 12/29/2021 09:49:22 - INFO - codeparrot_training - Step 48963: {'lr': 5.342172705122273e-07, 'samples': 25069568, 'steps': 48963, 'batch_loss/train': 0.7457298655062914} 12/29/2021 09:49:32 - INFO - codeparrot_training - Step 48964: {'lr': 5.33187820539932e-07, 'samples': 25070080, 'steps': 48964, 'batch_loss/train': 0.7431729180389084} 12/29/2021 09:49:43 - INFO - codeparrot_training - Step 48965: {'lr': 5.32159362353457e-07, 'samples': 25070592, 'steps': 48965, 'batch_loss/train': 0.512013071012916} 12/29/2021 09:49:55 - INFO - codeparrot_training - Step 48966: {'lr': 5.3113189595691e-07, 'samples': 25071104, 'steps': 48966, 'batch_loss/train': 0.778100369963795} 12/29/2021 09:50:06 - INFO - codeparrot_training - Step 48967: {'lr': 5.301054213543988e-07, 'samples': 25071616, 'steps': 48967, 'batch_loss/train': 0.6450455943122506} 12/29/2021 09:50:16 - INFO - codeparrot_training - Step 48968: {'lr': 5.290799385500034e-07, 'samples': 25072128, 'steps': 48968, 'batch_loss/train': 0.6830332351382822} 12/29/2021 09:50:28 - INFO - codeparrot_training - Step 48969: {'lr': 5.280554475477761e-07, 'samples': 25072640, 'steps': 48969, 'batch_loss/train': 0.8864935070741922} 12/29/2021 09:50:39 - INFO - codeparrot_training - Step 48970: {'lr': 5.270319483518249e-07, 'samples': 25073152, 'steps': 48970, 'batch_loss/train': 0.8150467071682215} 12/29/2021 09:50:50 - INFO - codeparrot_training - Step 48971: {'lr': 5.260094409661742e-07, 'samples': 25073664, 'steps': 48971, 'batch_loss/train': 0.7543496275320649} 12/29/2021 09:51:00 - INFO - codeparrot_training - Step 48972: {'lr': 5.249879253949318e-07, 'samples': 25074176, 'steps': 48972, 'batch_loss/train': 0.790680349804461} 12/29/2021 09:51:14 - INFO - codeparrot_training - Step 48973: {'lr': 5.239674016421225e-07, 'samples': 25074688, 
'steps': 48973, 'batch_loss/train': 0.6523028337396681} 12/29/2021 09:51:25 - INFO - codeparrot_training - Step 48974: {'lr': 5.229478697118262e-07, 'samples': 25075200, 'steps': 48974, 'batch_loss/train': 0.7706664691213518} 12/29/2021 09:51:35 - INFO - codeparrot_training - Step 48975: {'lr': 5.219293296080674e-07, 'samples': 25075712, 'steps': 48975, 'batch_loss/train': 0.7192023540847003} 12/29/2021 09:51:47 - INFO - codeparrot_training - Step 48976: {'lr': 5.20911781334954e-07, 'samples': 25076224, 'steps': 48976, 'batch_loss/train': 0.7826396888121963} 12/29/2021 09:51:58 - INFO - codeparrot_training - Step 48977: {'lr': 5.198952248964828e-07, 'samples': 25076736, 'steps': 48977, 'batch_loss/train': 0.7117941873148084} 12/29/2021 09:52:09 - INFO - codeparrot_training - Step 48978: {'lr': 5.18879660296706e-07, 'samples': 25077248, 'steps': 48978, 'batch_loss/train': 0.7586915157735348} 12/29/2021 09:52:21 - INFO - codeparrot_training - Step 48979: {'lr': 5.178650875396484e-07, 'samples': 25077760, 'steps': 48979, 'batch_loss/train': 0.6741468133404851} 12/29/2021 09:52:31 - INFO - codeparrot_training - Step 48980: {'lr': 5.1685150662939e-07, 'samples': 25078272, 'steps': 48980, 'batch_loss/train': 0.8124618334695697} 12/29/2021 09:52:42 - INFO - codeparrot_training - Step 48981: {'lr': 5.158389175698997e-07, 'samples': 25078784, 'steps': 48981, 'batch_loss/train': 0.7667684714542702} 12/29/2021 09:52:54 - INFO - codeparrot_training - Step 48982: {'lr': 5.148273203652576e-07, 'samples': 25079296, 'steps': 48982, 'batch_loss/train': 0.6875861305743456} 12/29/2021 09:53:05 - INFO - codeparrot_training - Step 48983: {'lr': 5.138167150194606e-07, 'samples': 25079808, 'steps': 48983, 'batch_loss/train': 0.7932756398804486} 12/29/2021 09:53:15 - INFO - codeparrot_training - Step 48984: {'lr': 5.128071015365055e-07, 'samples': 25080320, 'steps': 48984, 'batch_loss/train': 0.6933925587218255} 12/29/2021 09:53:26 - INFO - codeparrot_training - Step 48985: {'lr': 
5.117984799204445e-07, 'samples': 25080832, 'steps': 48985, 'batch_loss/train': 0.7352213398553431} 12/29/2021 09:53:40 - INFO - codeparrot_training - Step 48986: {'lr': 5.107908501752468e-07, 'samples': 25081344, 'steps': 48986, 'batch_loss/train': 0.7393137149047107} 12/29/2021 09:53:51 - INFO - codeparrot_training - Step 48987: {'lr': 5.097842123049646e-07, 'samples': 25081856, 'steps': 48987, 'batch_loss/train': 0.757974112406373} 12/29/2021 09:54:01 - INFO - codeparrot_training - Step 48988: {'lr': 5.087785663135669e-07, 'samples': 25082368, 'steps': 48988, 'batch_loss/train': 0.7012648256495595} 12/29/2021 09:54:13 - INFO - codeparrot_training - Step 48989: {'lr': 5.077739122050506e-07, 'samples': 25082880, 'steps': 48989, 'batch_loss/train': 0.7630800916813314} 12/29/2021 09:54:24 - INFO - codeparrot_training - Step 48990: {'lr': 5.067702499834403e-07, 'samples': 25083392, 'steps': 48990, 'batch_loss/train': 0.7607673639431596} 12/29/2021 09:54:35 - INFO - codeparrot_training - Step 48991: {'lr': 5.057675796526772e-07, 'samples': 25083904, 'steps': 48991, 'batch_loss/train': 0.7305626654997468} 12/29/2021 09:54:47 - INFO - codeparrot_training - Step 48992: {'lr': 5.047659012168137e-07, 'samples': 25084416, 'steps': 48992, 'batch_loss/train': 0.7371665546670556} 12/29/2021 09:54:57 - INFO - codeparrot_training - Step 48993: {'lr': 5.037652146797633e-07, 'samples': 25084928, 'steps': 48993, 'batch_loss/train': 0.739754285197705} 12/29/2021 09:55:08 - INFO - codeparrot_training - Step 48994: {'lr': 5.027655200455505e-07, 'samples': 25085440, 'steps': 48994, 'batch_loss/train': 0.6848665052093565} 12/29/2021 09:55:19 - INFO - codeparrot_training - Step 48995: {'lr': 5.017668173181444e-07, 'samples': 25085952, 'steps': 48995, 'batch_loss/train': 0.6830455912277102} 12/29/2021 09:55:33 - INFO - codeparrot_training - Step 48996: {'lr': 5.007691065014864e-07, 'samples': 25086464, 'steps': 48996, 'batch_loss/train': 0.6678712638095021} 12/29/2021 09:55:43 - INFO - 
codeparrot_training - Step 48997: {'lr': 4.99772387599573e-07, 'samples': 25086976, 'steps': 48997, 'batch_loss/train': 0.7488440992310643} 12/29/2021 09:55:54 - INFO - codeparrot_training - Step 48998: {'lr': 4.987766606163735e-07, 'samples': 25087488, 'steps': 48998, 'batch_loss/train': 0.6991433394141495} 12/29/2021 09:56:06 - INFO - codeparrot_training - Step 48999: {'lr': 4.977819255558014e-07, 'samples': 25088000, 'steps': 48999, 'batch_loss/train': 0.7589141167700291} 12/29/2021 09:56:17 - INFO - codeparrot_training - Step 49000: {'lr': 4.967881824218534e-07, 'samples': 25088512, 'steps': 49000, 'batch_loss/train': 0.7194710369221866} 12/29/2021 09:56:27 - INFO - codeparrot_training - Step 49001: {'lr': 4.957954312184709e-07, 'samples': 25089024, 'steps': 49001, 'batch_loss/train': 0.7360256058163941} 12/29/2021 09:56:41 - INFO - codeparrot_training - Step 49002: {'lr': 4.948036719495952e-07, 'samples': 25089536, 'steps': 49002, 'batch_loss/train': 0.6414452578756027} 12/29/2021 09:56:52 - INFO - codeparrot_training - Step 49003: {'lr': 4.938129046191675e-07, 'samples': 25090048, 'steps': 49003, 'batch_loss/train': 0.6980158314108849} 12/29/2021 09:57:02 - INFO - codeparrot_training - Step 49004: {'lr': 4.928231292311291e-07, 'samples': 25090560, 'steps': 49004, 'batch_loss/train': 0.7427234780043364} 12/29/2021 09:57:15 - INFO - codeparrot_training - Step 49005: {'lr': 4.918343457894214e-07, 'samples': 25091072, 'steps': 49005, 'batch_loss/train': 0.8008470768108964} 12/29/2021 09:57:25 - INFO - codeparrot_training - Step 49006: {'lr': 4.908465542979857e-07, 'samples': 25091584, 'steps': 49006, 'batch_loss/train': 0.7039990541525185} 12/29/2021 09:57:36 - INFO - codeparrot_training - Step 49007: {'lr': 4.898597547607076e-07, 'samples': 25092096, 'steps': 49007, 'batch_loss/train': 0.7061966413166374} 12/29/2021 09:57:47 - INFO - codeparrot_training - Step 49008: {'lr': 4.888739471815284e-07, 'samples': 25092608, 'steps': 49008, 'batch_loss/train': 
0.7150225141085684} 12/29/2021 09:57:59 - INFO - codeparrot_training - Step 49009: {'lr': 4.878891315643897e-07, 'samples': 25093120, 'steps': 49009, 'batch_loss/train': 0.6312555089825764} 12/29/2021 09:58:09 - INFO - codeparrot_training - Step 49010: {'lr': 4.869053079132047e-07, 'samples': 25093632, 'steps': 49010, 'batch_loss/train': 0.7535852990113199} 12/29/2021 09:58:20 - INFO - codeparrot_training - Step 49011: {'lr': 4.859224762318315e-07, 'samples': 25094144, 'steps': 49011, 'batch_loss/train': 1.0651730904355645} 12/29/2021 09:58:34 - INFO - codeparrot_training - Step 49012: {'lr': 4.84940636524267e-07, 'samples': 25094656, 'steps': 49012, 'batch_loss/train': 0.7481730910949409} 12/29/2021 09:58:45 - INFO - codeparrot_training - Step 49013: {'lr': 4.839597887943414e-07, 'samples': 25095168, 'steps': 49013, 'batch_loss/train': 0.6990538267418742} 12/29/2021 09:58:55 - INFO - codeparrot_training - Step 49014: {'lr': 4.829799330459683e-07, 'samples': 25095680, 'steps': 49014, 'batch_loss/train': 0.74390979250893} 12/29/2021 09:59:07 - INFO - codeparrot_training - Step 49015: {'lr': 4.820010692830612e-07, 'samples': 25096192, 'steps': 49015, 'batch_loss/train': 0.7757652285508811} 12/29/2021 09:59:18 - INFO - codeparrot_training - Step 49016: {'lr': 4.810231975095059e-07, 'samples': 25096704, 'steps': 49016, 'batch_loss/train': 0.9794818083755672} 12/29/2021 09:59:29 - INFO - codeparrot_training - Step 49017: {'lr': 4.800463177291881e-07, 'samples': 25097216, 'steps': 49017, 'batch_loss/train': 0.6704307021573186} 12/29/2021 09:59:39 - INFO - codeparrot_training - Step 49018: {'lr': 4.790704299459936e-07, 'samples': 25097728, 'steps': 49018, 'batch_loss/train': 0.8081860356032848} 12/29/2021 09:59:52 - INFO - codeparrot_training - Step 49019: {'lr': 4.780955341637805e-07, 'samples': 25098240, 'steps': 49019, 'batch_loss/train': 0.7090080287307501} 12/29/2021 10:00:03 - INFO - codeparrot_training - Step 49020: {'lr': 4.771216303864622e-07, 'samples': 
25098752, 'steps': 49020, 'batch_loss/train': 0.6406249948777258} 12/29/2021 10:00:13 - INFO - codeparrot_training - Step 49021: {'lr': 4.76148718617897e-07, 'samples': 25099264, 'steps': 49021, 'batch_loss/train': 0.6822924633743241} 12/29/2021 10:00:27 - INFO - codeparrot_training - Step 49022: {'lr': 4.7517679886191487e-07, 'samples': 25099776, 'steps': 49022, 'batch_loss/train': 0.7294734297320247} 12/29/2021 10:00:38 - INFO - codeparrot_training - Step 49023: {'lr': 4.7420587112245726e-07, 'samples': 25100288, 'steps': 49023, 'batch_loss/train': 0.6857520069461316} 12/29/2021 10:00:48 - INFO - codeparrot_training - Step 49024: {'lr': 4.7323593540332664e-07, 'samples': 25100800, 'steps': 49024, 'batch_loss/train': 0.7701084865257144} 12/29/2021 10:01:01 - INFO - codeparrot_training - Step 49025: {'lr': 4.7226699170838104e-07, 'samples': 25101312, 'steps': 49025, 'batch_loss/train': 0.671170199289918} 12/29/2021 10:01:12 - INFO - codeparrot_training - Step 49026: {'lr': 4.712990400414785e-07, 'samples': 25101824, 'steps': 49026, 'batch_loss/train': 0.8293714132159948} 12/29/2021 10:01:22 - INFO - codeparrot_training - Step 49027: {'lr': 4.70332080406477e-07, 'samples': 25102336, 'steps': 49027, 'batch_loss/train': 0.826662466628477} 12/29/2021 10:01:33 - INFO - codeparrot_training - Step 49028: {'lr': 4.6936611280723463e-07, 'samples': 25102848, 'steps': 49028, 'batch_loss/train': 0.6335281520150602} 12/29/2021 10:01:45 - INFO - codeparrot_training - Step 49029: {'lr': 4.6840113724755383e-07, 'samples': 25103360, 'steps': 49029, 'batch_loss/train': 0.4495880967588164} 12/29/2021 10:01:56 - INFO - codeparrot_training - Step 49030: {'lr': 4.674371537312927e-07, 'samples': 25103872, 'steps': 49030, 'batch_loss/train': 0.8607345083728433} 12/29/2021 10:02:06 - INFO - codeparrot_training - Step 49031: {'lr': 4.664741622623092e-07, 'samples': 25104384, 'steps': 49031, 'batch_loss/train': 0.6835821634158492} 12/29/2021 10:02:21 - INFO - codeparrot_training - Step 
49032: {'lr': 4.6551216284437813e-07, 'samples': 25104896, 'steps': 49032, 'batch_loss/train': 0.6443297588266432}
12/29/2021 10:02:31 - INFO - codeparrot_training - Step 49033: {'lr': 4.645511554813298e-07, 'samples': 25105408, 'steps': 49033, 'batch_loss/train': 0.8140531880781054}
12/29/2021 10:02:42 - INFO - codeparrot_training - Step 49034: {'lr': 4.6359114017704983e-07, 'samples': 25105920, 'steps': 49034, 'batch_loss/train': 0.7046846153680235}
12/29/2021 10:02:54 - INFO - codeparrot_training - Step 49035: {'lr': 4.626321169352854e-07, 'samples': 25106432, 'steps': 49035, 'batch_loss/train': 0.46209819382056594}
12/29/2021 10:03:05 - INFO - codeparrot_training - Step 49036: {'lr': 4.6167408575986667e-07, 'samples': 25106944, 'steps': 49036, 'batch_loss/train': 0.677086777985096}
12/29/2021 10:03:15 - INFO - codeparrot_training - Step 49037: {'lr': 4.60717046654624e-07, 'samples': 25107456, 'steps': 49037, 'batch_loss/train': 0.6888250643387437}
12/29/2021 10:03:26 - INFO - codeparrot_training - Step 49038: {'lr': 4.597609996233598e-07, 'samples': 25107968, 'steps': 49038, 'batch_loss/train': 0.6762382353190333}
12/29/2021 10:03:38 - INFO - codeparrot_training - Step 49039: {'lr': 4.588059446698489e-07, 'samples': 25108480, 'steps': 49039, 'batch_loss/train': 0.7302872836589813}
12/29/2021 10:03:49 - INFO - codeparrot_training - Step 49040: {'lr': 4.5785188179789385e-07, 'samples': 25108992, 'steps': 49040, 'batch_loss/train': 0.6422685054130852}
12/29/2021 10:03:59 - INFO - codeparrot_training - Step 49041: {'lr': 4.5689881101129703e-07, 'samples': 25109504, 'steps': 49041, 'batch_loss/train': 0.649268586654216}
12/29/2021 10:04:13 - INFO - codeparrot_training - Step 49042: {'lr': 4.559467323138611e-07, 'samples': 25110016, 'steps': 49042, 'batch_loss/train': 0.6746835135854781}
12/29/2021 10:04:24 - INFO - codeparrot_training - Step 49043: {'lr': 4.549956457093607e-07, 'samples': 25110528, 'steps': 49043, 'batch_loss/train': 0.9425616096705198}
12/29/2021 10:04:35 - INFO - codeparrot_training - Step 49044: {'lr': 4.540455512015429e-07, 'samples': 25111040, 'steps': 49044, 'batch_loss/train': 0.9766100002452731}
12/29/2021 10:04:45 - INFO - codeparrot_training - Step 49045: {'lr': 4.5309644879423795e-07, 'samples': 25111552, 'steps': 49045, 'batch_loss/train': 0.7486769678071141}
12/29/2021 10:04:58 - INFO - codeparrot_training - Step 49046: {'lr': 4.5214833849119286e-07, 'samples': 25112064, 'steps': 49046, 'batch_loss/train': 0.6809318526647985}
12/29/2021 10:05:08 - INFO - codeparrot_training - Step 49047: {'lr': 4.512012202961824e-07, 'samples': 25112576, 'steps': 49047, 'batch_loss/train': 0.7012458918616176}
12/29/2021 10:05:19 - INFO - codeparrot_training - Step 49048: {'lr': 4.502550942129813e-07, 'samples': 25113088, 'steps': 49048, 'batch_loss/train': 0.7569069089367986}
12/29/2021 10:05:31 - INFO - codeparrot_training - Step 49049: {'lr': 4.493099602453088e-07, 'samples': 25113600, 'steps': 49049, 'batch_loss/train': 0.7018961692228913}
12/29/2021 10:05:42 - INFO - codeparrot_training - Step 49050: {'lr': 4.4836581839696745e-07, 'samples': 25114112, 'steps': 49050, 'batch_loss/train': 0.6999639007262886}
12/29/2021 10:05:52 - INFO - codeparrot_training - Step 49051: {'lr': 4.4742266867170423e-07, 'samples': 25114624, 'steps': 49051, 'batch_loss/train': 0.8239671894116327}
12/29/2021 10:06:04 - INFO - codeparrot_training - Step 49052: {'lr': 4.4648051107326614e-07, 'samples': 25115136, 'steps': 49052, 'batch_loss/train': 0.6764687849208713}
12/29/2021 10:06:15 - INFO - codeparrot_training - Step 49053: {'lr': 4.4553934560537245e-07, 'samples': 25115648, 'steps': 49053, 'batch_loss/train': 0.7741415426135063}
12/29/2021 10:06:26 - INFO - codeparrot_training - Step 49054: {'lr': 4.445991722717979e-07, 'samples': 25116160, 'steps': 49054, 'batch_loss/train': 0.8036465980112553}
12/29/2021 10:06:36 - INFO - codeparrot_training - Step 49055: {'lr': 4.436599910762895e-07, 'samples': 25116672, 'steps': 49055, 'batch_loss/train': 0.7172203473746777}
12/29/2021 10:06:50 - INFO - codeparrot_training - Step 49056: {'lr': 4.4272180202253876e-07, 'samples': 25117184, 'steps': 49056, 'batch_loss/train': 0.8123975191265345}
12/29/2021 10:07:01 - INFO - codeparrot_training - Step 49057: {'lr': 4.4178460511429265e-07, 'samples': 25117696, 'steps': 49057, 'batch_loss/train': 0.6883505424484611}
12/29/2021 10:07:11 - INFO - codeparrot_training - Step 49058: {'lr': 4.408484003552704e-07, 'samples': 25118208, 'steps': 49058, 'batch_loss/train': 0.7648016642779112}
12/29/2021 10:07:23 - INFO - codeparrot_training - Step 49059: {'lr': 4.3991318774921906e-07, 'samples': 25118720, 'steps': 49059, 'batch_loss/train': 0.659437010763213}
12/29/2021 10:07:34 - INFO - codeparrot_training - Step 49060: {'lr': 4.3897896729985786e-07, 'samples': 25119232, 'steps': 49060, 'batch_loss/train': 0.7252837037667632}
12/29/2021 10:07:45 - INFO - codeparrot_training - Step 49061: {'lr': 4.380457390108783e-07, 'samples': 25119744, 'steps': 49061, 'batch_loss/train': 0.7251307442784309}
12/29/2021 10:07:59 - INFO - codeparrot_training - Step 49062: {'lr': 4.3711350288597187e-07, 'samples': 25120256, 'steps': 49062, 'batch_loss/train': 0.727368007414043}
12/29/2021 10:08:09 - INFO - codeparrot_training - Step 49063: {'lr': 4.3618225892891326e-07, 'samples': 25120768, 'steps': 49063, 'batch_loss/train': 0.7519029565155506}
12/29/2021 10:08:20 - INFO - codeparrot_training - Step 49064: {'lr': 4.352520071433386e-07, 'samples': 25121280, 'steps': 49064, 'batch_loss/train': 0.8121998829301447}
12/29/2021 10:08:32 - INFO - codeparrot_training - Step 49065: {'lr': 4.3432274753296695e-07, 'samples': 25121792, 'steps': 49065, 'batch_loss/train': 0.6366829714970663}
12/29/2021 10:08:43 - INFO - codeparrot_training - Step 49066: {'lr': 4.3339448010148994e-07, 'samples': 25122304, 'steps': 49066, 'batch_loss/train': 0.7484378670342267}
12/29/2021 10:08:53 - INFO - codeparrot_training - Step 49067: {'lr': 4.324672048526268e-07, 'samples': 25122816, 'steps': 49067, 'batch_loss/train': 0.6398016256280243}
12/29/2021 10:09:04 - INFO - codeparrot_training - Step 49068: {'lr': 4.3154092179001346e-07, 'samples': 25123328, 'steps': 49068, 'batch_loss/train': 0.771912201307714}
12/29/2021 10:09:16 - INFO - codeparrot_training - Step 49069: {'lr': 4.306156309173692e-07, 'samples': 25123840, 'steps': 49069, 'batch_loss/train': 0.6490618023090065}
12/29/2021 10:09:27 - INFO - codeparrot_training - Step 49070: {'lr': 4.296913322383855e-07, 'samples': 25124352, 'steps': 49070, 'batch_loss/train': 0.7939071110449731}
12/29/2021 10:09:37 - INFO - codeparrot_training - Step 49071: {'lr': 4.287680257566984e-07, 'samples': 25124864, 'steps': 49071, 'batch_loss/train': 0.7258637396153063}
12/29/2021 10:09:51 - INFO - codeparrot_training - Step 49072: {'lr': 4.278457114759993e-07, 'samples': 25125376, 'steps': 49072, 'batch_loss/train': 0.7642985451966524}
12/29/2021 10:10:02 - INFO - codeparrot_training - Step 49073: {'lr': 4.269243893999519e-07, 'samples': 25125888, 'steps': 49073, 'batch_loss/train': 0.7047563539817929}
12/29/2021 10:10:13 - INFO - codeparrot_training - Step 49074: {'lr': 4.260040595321923e-07, 'samples': 25126400, 'steps': 49074, 'batch_loss/train': 0.7070841840468347}
12/29/2021 10:10:25 - INFO - codeparrot_training - Step 49075: {'lr': 4.250847218764398e-07, 'samples': 25126912, 'steps': 49075, 'batch_loss/train': 0.8746702605858445}
12/29/2021 10:10:35 - INFO - codeparrot_training - Step 49076: {'lr': 4.241663764363024e-07, 'samples': 25127424, 'steps': 49076, 'batch_loss/train': 0.7301646331325173}
12/29/2021 10:10:46 - INFO - codeparrot_training - Step 49077: {'lr': 4.2324902321541624e-07, 'samples': 25127936, 'steps': 49077, 'batch_loss/train': 0.7325300415977836}
12/29/2021 10:10:57 - INFO - codeparrot_training - Step 49078: {'lr': 4.2233266221750056e-07, 'samples': 25128448, 'steps': 49078, 'batch_loss/train': 0.7239865995943546}
12/29/2021 10:11:11 - INFO - codeparrot_training - Step 49079: {'lr': 4.214172934461358e-07, 'samples': 25128960, 'steps': 49079, 'batch_loss/train': 0.6769450223073363}
12/29/2021 10:11:21 - INFO - codeparrot_training - Step 49080: {'lr': 4.205029169049579e-07, 'samples': 25129472, 'steps': 49080, 'batch_loss/train': 0.6471193539910018}
12/29/2021 10:11:32 - INFO - codeparrot_training - Step 49081: {'lr': 4.195895325976584e-07, 'samples': 25129984, 'steps': 49081, 'batch_loss/train': 0.6762829683721066}
12/29/2021 10:11:44 - INFO - codeparrot_training - Step 49082: {'lr': 4.186771405278178e-07, 'samples': 25130496, 'steps': 49082, 'batch_loss/train': 0.8584370696917176}
12/29/2021 10:11:55 - INFO - codeparrot_training - Step 49083: {'lr': 4.1776574069907203e-07, 'samples': 25131008, 'steps': 49083, 'batch_loss/train': 0.7354380954056978}
12/29/2021 10:12:05 - INFO - codeparrot_training - Step 49084: {'lr': 4.168553331150571e-07, 'samples': 25131520, 'steps': 49084, 'batch_loss/train': 0.7239827811717987}
12/29/2021 10:12:17 - INFO - codeparrot_training - Step 49085: {'lr': 4.159459177793812e-07, 'samples': 25132032, 'steps': 49085, 'batch_loss/train': 0.7225486128591001}
12/29/2021 10:12:28 - INFO - codeparrot_training - Step 49086: {'lr': 4.1503749469565255e-07, 'samples': 25132544, 'steps': 49086, 'batch_loss/train': 0.6685991304693744}
12/29/2021 10:12:39 - INFO - codeparrot_training - Step 49087: {'lr': 4.1413006386753494e-07, 'samples': 25133056, 'steps': 49087, 'batch_loss/train': 0.7215123688802123}
12/29/2021 10:12:51 - INFO - codeparrot_training - Step 49088: {'lr': 4.132236252985533e-07, 'samples': 25133568, 'steps': 49088, 'batch_loss/train': 0.6746517606079578}
12/29/2021 10:13:01 - INFO - codeparrot_training - Step 49089: {'lr': 4.123181789923991e-07, 'samples': 25134080, 'steps': 49089, 'batch_loss/train': 0.7912526922300458}
12/29/2021 10:13:12 - INFO - codeparrot_training - Step 49090: {'lr': 4.114137249525973e-07, 'samples': 25134592, 'steps': 49090, 'batch_loss/train': 0.7525664652348496}
12/29/2021 10:13:23 - INFO - codeparrot_training - Step 49091: {'lr': 4.105102631827562e-07, 'samples': 25135104, 'steps': 49091, 'batch_loss/train': 0.6564292479306459}
12/29/2021 10:13:37 - INFO - codeparrot_training - Step 49092: {'lr': 4.096077936865117e-07, 'samples': 25135616, 'steps': 49092, 'batch_loss/train': 0.7033978700637817}
12/29/2021 10:13:48 - INFO - codeparrot_training - Step 49093: {'lr': 4.0870631646741654e-07, 'samples': 25136128, 'steps': 49093, 'batch_loss/train': 0.8062050407752395}
12/29/2021 10:13:58 - INFO - codeparrot_training - Step 49094: {'lr': 4.078058315290789e-07, 'samples': 25136640, 'steps': 49094, 'batch_loss/train': 0.7266807602718472}
12/29/2021 10:14:10 - INFO - codeparrot_training - Step 49095: {'lr': 4.0690633887505157e-07, 'samples': 25137152, 'steps': 49095, 'batch_loss/train': 0.6051569650880992}
12/29/2021 10:14:21 - INFO - codeparrot_training - Step 49096: {'lr': 4.06007838508915e-07, 'samples': 25137664, 'steps': 49096, 'batch_loss/train': 0.7484024916775525}
12/29/2021 10:14:32 - INFO - codeparrot_training - Step 49097: {'lr': 4.051103304342496e-07, 'samples': 25138176, 'steps': 49097, 'batch_loss/train': 0.7707415008917451}
12/29/2021 10:14:44 - INFO - codeparrot_training - Step 49098: {'lr': 4.0421381465463593e-07, 'samples': 25138688, 'steps': 49098, 'batch_loss/train': 0.8641883507370949}
12/29/2021 10:14:54 - INFO - codeparrot_training - Step 49099: {'lr': 4.033182911736266e-07, 'samples': 25139200, 'steps': 49099, 'batch_loss/train': 0.7891717478632927}
12/29/2021 10:15:05 - INFO - codeparrot_training - Step 49100: {'lr': 4.0242375999474666e-07, 'samples': 25139712, 'steps': 49100, 'batch_loss/train': 0.7300404906272888}
12/29/2021 10:15:15 - INFO - codeparrot_training - Step 49101: {'lr': 4.015302211216043e-07, 'samples': 25140224, 'steps': 49101, 'batch_loss/train': 0.695763046387583}
12/29/2021 10:15:30 - INFO - codeparrot_training - Step 49102: {'lr': 4.0063767455775224e-07, 'samples': 25140736, 'steps': 49102, 'batch_loss/train': 0.7725979406386614}
12/29/2021 10:15:40 - INFO - codeparrot_training - Step 49103: {'lr': 3.997461203066877e-07, 'samples': 25141248, 'steps': 49103, 'batch_loss/train': 0.6733740859199315}
12/29/2021 10:15:51 - INFO - codeparrot_training - Step 49104: {'lr': 3.988555583720188e-07, 'samples': 25141760, 'steps': 49104, 'batch_loss/train': 0.6654140615137294}
12/29/2021 10:16:03 - INFO - codeparrot_training - Step 49105: {'lr': 3.9796598875724286e-07, 'samples': 25142272, 'steps': 49105, 'batch_loss/train': 0.5952272458234802}
12/29/2021 10:16:14 - INFO - codeparrot_training - Step 49106: {'lr': 3.9707741146588483e-07, 'samples': 25142784, 'steps': 49106, 'batch_loss/train': 0.7205049665644765}
12/29/2021 10:16:25 - INFO - codeparrot_training - Step 49107: {'lr': 3.9618982650152514e-07, 'samples': 25143296, 'steps': 49107, 'batch_loss/train': 0.5455720265163109}
12/29/2021 10:16:38 - INFO - codeparrot_training - Step 49108: {'lr': 3.95303233867661e-07, 'samples': 25143808, 'steps': 49108, 'batch_loss/train': 0.7369818645529449}
12/29/2021 10:16:48 - INFO - codeparrot_training - Step 49109: {'lr': 3.944176335678173e-07, 'samples': 25144320, 'steps': 49109, 'batch_loss/train': 0.6835706406272948}
12/29/2021 10:16:59 - INFO - codeparrot_training - Step 49110: {'lr': 3.9353302560554694e-07, 'samples': 25144832, 'steps': 49110, 'batch_loss/train': 0.7266463651321828}
12/29/2021 10:17:10 - INFO - codeparrot_training - Step 49111: {'lr': 3.926494099843192e-07, 'samples': 25145344, 'steps': 49111, 'batch_loss/train': 0.6893729818984866}
12/29/2021 10:17:22 - INFO - codeparrot_training - Step 49112: {'lr': 3.917667867076591e-07, 'samples': 25145856, 'steps': 49112, 'batch_loss/train': 0.738396670203656}
12/29/2021 10:17:33 - INFO - codeparrot_training - Step 49113: {'lr': 3.908851557791193e-07, 'samples': 25146368, 'steps': 49113, 'batch_loss/train': 0.7996827368624508}
12/29/2021 10:17:43 - INFO - codeparrot_training - Step 49114: {'lr': 3.900045172021416e-07, 'samples': 25146880, 'steps': 49114, 'batch_loss/train': 0.48811734560877085}
12/29/2021 10:17:55 - INFO - codeparrot_training - Step 49115: {'lr': 3.891248709802786e-07, 'samples': 25147392, 'steps': 49115, 'batch_loss/train': 0.7458665333688259}
12/29/2021 10:18:06 - INFO - codeparrot_training - Step 49116: {'lr': 3.882462171169998e-07, 'samples': 25147904, 'steps': 49116, 'batch_loss/train': 0.721932552754879}
12/29/2021 10:18:17 - INFO - codeparrot_training - Step 49117: {'lr': 3.8736855561580244e-07, 'samples': 25148416, 'steps': 49117, 'batch_loss/train': 0.7647102938499302}
12/29/2021 10:18:31 - INFO - codeparrot_training - Step 49118: {'lr': 3.8649188648018364e-07, 'samples': 25148928, 'steps': 49118, 'batch_loss/train': 0.7583031095564365}
12/29/2021 10:18:41 - INFO - codeparrot_training - Step 49119: {'lr': 3.8561620971361286e-07, 'samples': 25149440, 'steps': 49119, 'batch_loss/train': 0.7347408095374703}
12/29/2021 10:18:52 - INFO - codeparrot_training - Step 49120: {'lr': 3.8474152531955966e-07, 'samples': 25149952, 'steps': 49120, 'batch_loss/train': 0.7547663999721408}
12/29/2021 10:19:02 - INFO - codeparrot_training - Step 49121: {'lr': 3.8386783330154885e-07, 'samples': 25150464, 'steps': 49121, 'batch_loss/train': 0.6314167212694883}
12/29/2021 10:19:15 - INFO - codeparrot_training - Step 49122: {'lr': 3.829951336630222e-07, 'samples': 25150976, 'steps': 49122, 'batch_loss/train': 0.7288365252315998}
12/29/2021 10:19:25 - INFO - codeparrot_training - Step 49123: {'lr': 3.8212342640747687e-07, 'samples': 25151488, 'steps': 49123, 'batch_loss/train': 0.6911937615368515}
12/29/2021 10:19:36 - INFO - codeparrot_training - Step 49124: {'lr': 3.8125271153832686e-07, 'samples': 25152000, 'steps': 49124, 'batch_loss/train': 0.7713557332754135}
12/29/2021 10:19:48 - INFO - codeparrot_training - Step 49125: {'lr': 3.803829890590971e-07, 'samples': 25152512, 'steps': 49125, 'batch_loss/train': 0.7420659867930226}
12/29/2021 10:19:59 - INFO - codeparrot_training - Step 49126: {'lr': 3.795142589731737e-07, 'samples': 25153024, 'steps': 49126, 'batch_loss/train': 0.6912539638578892}
12/29/2021 10:20:09 - INFO - codeparrot_training - Step 49127: {'lr': 3.78646521284054e-07, 'samples': 25153536, 'steps': 49127, 'batch_loss/train': 0.737018272280693}
12/29/2021 10:20:22 - INFO - codeparrot_training - Step 49128: {'lr': 3.777797759952073e-07, 'samples': 25154048, 'steps': 49128, 'batch_loss/train': 0.6861158809624612}
12/29/2021 10:20:32 - INFO - codeparrot_training - Step 49129: {'lr': 3.769140231100199e-07, 'samples': 25154560, 'steps': 49129, 'batch_loss/train': 0.6428167349658906}
12/29/2021 10:20:43 - INFO - codeparrot_training - Step 49130: {'lr': 3.7604926263201666e-07, 'samples': 25155072, 'steps': 49130, 'batch_loss/train': 0.7533396519720554}
12/29/2021 10:20:53 - INFO - codeparrot_training - Step 49131: {'lr': 3.7518549456455606e-07, 'samples': 25155584, 'steps': 49131, 'batch_loss/train': 0.9762708605267107}
12/29/2021 10:21:07 - INFO - codeparrot_training - Step 49132: {'lr': 3.743227189111076e-07, 'samples': 25156096, 'steps': 49132, 'batch_loss/train': 0.7363897347822785}
12/29/2021 10:21:18 - INFO - codeparrot_training - Step 49133: {'lr': 3.734609356750851e-07, 'samples': 25156608, 'steps': 49133, 'batch_loss/train': 0.864881937392056}
12/29/2021 10:21:29 - INFO - codeparrot_training - Step 49134: {'lr': 3.7260014485993033e-07, 'samples': 25157120, 'steps': 49134, 'batch_loss/train': 0.8475954402238131}
12/29/2021 10:21:41 - INFO - codeparrot_training - Step 49135: {'lr': 3.7174034646908494e-07, 'samples': 25157632, 'steps': 49135, 'batch_loss/train': 0.7117032099631615}
12/29/2021 10:21:51 - INFO - codeparrot_training - Step 49136: {'lr': 3.7088154050593513e-07, 'samples': 25158144, 'steps': 49136, 'batch_loss/train': 0.7401823531836271}
12/29/2021 10:22:02 - INFO - codeparrot_training - Step 49137: {'lr': 3.7002372697389486e-07, 'samples': 25158656, 'steps': 49137, 'batch_loss/train': 0.7143446480622515}
12/29/2021 10:22:16 - INFO - codeparrot_training - Step 49138: {'lr': 3.6916690587637805e-07, 'samples': 25159168, 'steps': 49138, 'batch_loss/train': 0.7298367647454143}
12/29/2021 10:22:27 - INFO - codeparrot_training - Step 49139: {'lr': 3.683110772167986e-07, 'samples': 25159680, 'steps': 49139, 'batch_loss/train': 0.6985325030982494}
12/29/2021 10:22:37 - INFO - codeparrot_training - Step 49140: {'lr': 3.674562409985704e-07, 'samples': 25160192, 'steps': 49140, 'batch_loss/train': 0.6938665136694908}
12/29/2021 10:22:48 - INFO - codeparrot_training - Step 49141: {'lr': 3.666023972250521e-07, 'samples': 25160704, 'steps': 49141, 'batch_loss/train': 0.755506390996743}
12/29/2021 10:23:00 - INFO - codeparrot_training - Step 49142: {'lr': 3.6574954589968514e-07, 'samples': 25161216, 'steps': 49142, 'batch_loss/train': 0.6226528373663314}
12/29/2021 10:23:11 - INFO - codeparrot_training - Step 49143: {'lr': 3.648976870258558e-07, 'samples': 25161728, 'steps': 49143, 'batch_loss/train': 0.5791327776387334}
12/29/2021 10:23:21 - INFO - codeparrot_training - Step 49144: {'lr': 3.640468206069225e-07, 'samples': 25162240, 'steps': 49144, 'batch_loss/train': 0.7709026131778955}
12/29/2021 10:23:33 - INFO - codeparrot_training - Step 49145: {'lr': 3.6319694664627143e-07, 'samples': 25162752, 'steps': 49145, 'batch_loss/train': 0.6861301327589899}
12/29/2021 10:23:44 - INFO - codeparrot_training - Step 49146: {'lr': 3.623480651472888e-07, 'samples': 25163264, 'steps': 49146, 'batch_loss/train': 0.8146096910350025}
12/29/2021 10:23:55 - INFO - codeparrot_training - Step 49147: {'lr': 3.615001761133607e-07, 'samples': 25163776, 'steps': 49147, 'batch_loss/train': 0.745478315744549}
12/29/2021 10:24:09 - INFO - codeparrot_training - Step 49148: {'lr': 3.6065327954784564e-07, 'samples': 25164288, 'steps': 49148, 'batch_loss/train': 0.7105808625929058}
12/29/2021 10:24:19 - INFO - codeparrot_training - Step 49149: {'lr': 3.5980737545410204e-07, 'samples': 25164800, 'steps': 49149, 'batch_loss/train': 0.6580239967443049}
12/29/2021 10:24:30 - INFO - codeparrot_training - Step 49150: {'lr': 3.58962463835516e-07, 'samples': 25165312, 'steps': 49150, 'batch_loss/train': 0.7553186267614365}
12/29/2021 10:24:41 - INFO - codeparrot_training - Step 49151: {'lr': 3.581185446954183e-07, 'samples': 25165824, 'steps': 49151, 'batch_loss/train': 0.7449400760233402}
12/29/2021 10:24:53 - INFO - codeparrot_training - Step 49152: {'lr': 3.57275618037195e-07, 'samples': 25166336, 'steps': 49152, 'batch_loss/train': 0.6834912430495024}
12/29/2021 10:25:03 - INFO - codeparrot_training - Step 49153: {'lr': 3.5643368386417683e-07, 'samples': 25166848, 'steps': 49153, 'batch_loss/train': 0.6604671664535999}
12/29/2021 10:25:14 - INFO - codeparrot_training - Step 49154: {'lr': 3.5559274217972225e-07, 'samples': 25167360, 'steps': 49154, 'batch_loss/train': 0.6878858170239255}
12/29/2021 10:25:28 - INFO - codeparrot_training - Step 49155: {'lr': 3.547527929871619e-07, 'samples': 25167872, 'steps': 49155, 'batch_loss/train': 0.5376315329922363}
12/29/2021 10:25:39 - INFO - codeparrot_training - Step 49156: {'lr': 3.5391383628985416e-07, 'samples': 25168384, 'steps': 49156, 'batch_loss/train': 0.6500816164771095}
12/29/2021 10:25:49 - INFO - codeparrot_training - Step 49157: {'lr': 3.5307587209110204e-07, 'samples': 25168896, 'steps': 49157, 'batch_loss/train': 0.6982075218111277}
12/29/2021 10:26:01 - INFO - codeparrot_training - Step 49158: {'lr': 3.5223890039429163e-07, 'samples': 25169408, 'steps': 49158, 'batch_loss/train': 0.7440279111033306}
12/29/2021 10:26:12 - INFO - codeparrot_training - Step 49159: {'lr': 3.5140292120267036e-07, 'samples': 25169920, 'steps': 49159, 'batch_loss/train': 0.6723130873870105}
12/29/2021 10:26:23 - INFO - codeparrot_training - Step 49160: {'lr': 3.505679345196522e-07, 'samples': 25170432, 'steps': 49160, 'batch_loss/train': 0.7381194122135639}
12/29/2021 10:26:33 - INFO - codeparrot_training - Step 49161: {'lr': 3.497339403484845e-07, 'samples': 25170944, 'steps': 49161, 'batch_loss/train': 0.6922691110521555}
12/29/2021 10:26:45 - INFO - codeparrot_training - Step 49162: {'lr': 3.4890093869249795e-07, 'samples': 25171456, 'steps': 49162, 'batch_loss/train': 0.6766390189295635}
12/29/2021 10:26:56 - INFO - codeparrot_training - Step 49163: {'lr': 3.48068929555051e-07, 'samples': 25171968, 'steps': 49163, 'batch_loss/train': 0.85291145183146}
12/29/2021 10:27:07 - INFO - codeparrot_training - Step 49164: {'lr': 3.47237912939391e-07, 'samples': 25172480, 'steps': 49164, 'batch_loss/train': 0.7219033065484837}
12/29/2021 10:27:19 - INFO - codeparrot_training - Step 49165: {'lr': 3.464078888488764e-07, 'samples': 25172992, 'steps': 49165, 'batch_loss/train': 0.7586476730066352}
12/29/2021 10:27:30 - INFO - codeparrot_training - Step 49166: {'lr': 3.455788572867546e-07, 'samples': 25173504, 'steps': 49166, 'batch_loss/train': 0.7602196438238025}
12/29/2021 10:27:40 - INFO - codeparrot_training - Step 49167: {'lr': 3.4475081825635634e-07, 'samples': 25174016, 'steps': 49167, 'batch_loss/train': 0.758226063568145}
12/29/2021 10:27:54 - INFO - codeparrot_training - Step 49168: {'lr': 3.4392377176095666e-07, 'samples': 25174528, 'steps': 49168, 'batch_loss/train': 0.842649033293128}
12/29/2021 10:28:05 - INFO - codeparrot_training - Step 49169: {'lr': 3.430977178038308e-07, 'samples': 25175040, 'steps': 49169, 'batch_loss/train': 0.6811456624418497}
12/29/2021 10:28:15 - INFO - codeparrot_training - Step 49170: {'lr': 3.422726563883094e-07, 'samples': 25175552, 'steps': 49170, 'batch_loss/train': 0.6945701590739191}
12/29/2021 10:28:30 - INFO - codeparrot_training - Step 49171: {'lr': 3.4144858751763987e-07, 'samples': 25176064, 'steps': 49171, 'batch_loss/train': 0.6565318093053065}
12/29/2021 10:28:40 - INFO - codeparrot_training - Step 49172: {'lr': 3.4062551119509733e-07, 'samples': 25176576, 'steps': 49172, 'batch_loss/train': 0.7102809716016054}
12/29/2021 10:28:51 - INFO - codeparrot_training - Step 49173: {'lr': 3.39803427423957e-07, 'samples': 25177088, 'steps': 49173, 'batch_loss/train': 0.7312065421720035}
12/29/2021 10:29:02 - INFO - codeparrot_training - Step 49174: {'lr': 3.38982336207494e-07, 'samples': 25177600, 'steps': 49174, 'batch_loss/train': 0.6666083126328886}
12/29/2021 10:29:14 - INFO - codeparrot_training - Step 49175: {'lr': 3.3816223754898346e-07, 'samples': 25178112, 'steps': 49175, 'batch_loss/train': 0.6932624503970146}
12/29/2021 10:29:25 - INFO - codeparrot_training - Step 49176: {'lr': 3.3734313145164507e-07, 'samples': 25178624, 'steps': 49176, 'batch_loss/train': 0.5442811063257977}
12/29/2021 10:29:35 - INFO - codeparrot_training - Step 49177: {'lr': 3.3652501791878176e-07, 'samples': 25179136, 'steps': 49177, 'batch_loss/train': 0.7387994080781937}
12/29/2021 10:29:47 - INFO - codeparrot_training - Step 49178: {'lr': 3.3570789695361317e-07, 'samples': 25179648, 'steps': 49178, 'batch_loss/train': 0.7117399671114981}
12/29/2021 10:29:58 - INFO - codeparrot_training - Step 49179: {'lr': 3.348917685594144e-07, 'samples': 25180160, 'steps': 49179, 'batch_loss/train': 0.6902230558916926}
12/29/2021 10:30:09 - INFO - codeparrot_training - Step 49180: {'lr': 3.340766327394051e-07, 'samples': 25180672, 'steps': 49180, 'batch_loss/train': 0.6700640092603862}
12/29/2021 10:30:23 - INFO - codeparrot_training - Step 49181: {'lr': 3.332624894968328e-07, 'samples': 25181184, 'steps': 49181, 'batch_loss/train': 0.7236184179782867}
12/29/2021 10:30:33 - INFO - codeparrot_training - Step 49182: {'lr': 3.3244933883494475e-07, 'samples': 25181696, 'steps': 49182, 'batch_loss/train': 0.6598679237067699}
12/29/2021 10:30:44 - INFO - codeparrot_training - Step 49183: {'lr': 3.3163718075698843e-07, 'samples': 25182208, 'steps': 49183, 'batch_loss/train': 0.6911373329930939}
12/29/2021 10:30:55 - INFO - codeparrot_training - Step 49184: {'lr': 3.30826015266128e-07, 'samples': 25182720, 'steps': 49184, 'batch_loss/train': 0.6381656827870756}
12/29/2021 10:31:07 - INFO - codeparrot_training - Step 49185: {'lr': 3.300158423656663e-07, 'samples': 25183232, 'steps': 49185, 'batch_loss/train': 0.7213368928059936}
12/29/2021 10:31:17 - INFO - codeparrot_training - Step 49186: {'lr': 3.292066620587675e-07, 'samples': 25183744, 'steps': 49186, 'batch_loss/train': 0.7353638401255012}
12/29/2021 10:31:28 - INFO - codeparrot_training - Step 49187: {'lr': 3.283984743487067e-07, 'samples': 25184256, 'steps': 49187, 'batch_loss/train': 0.7897276552394032}
12/29/2021 10:31:42 - INFO - codeparrot_training - Step 49188: {'lr': 3.275912792386204e-07, 'samples': 25184768, 'steps': 49188, 'batch_loss/train': 0.7410060038673691}
12/29/2021 10:31:52 - INFO - codeparrot_training - Step 49189: {'lr': 3.267850767317837e-07, 'samples': 25185280, 'steps': 49189, 'batch_loss/train': 0.770335235632956}
12/29/2021 10:32:03 - INFO - codeparrot_training - Step 49190: {'lr': 3.259798668313607e-07, 'samples': 25185792, 'steps': 49190, 'batch_loss/train': 0.729620979167521}
12/29/2021 10:32:15 - INFO - codeparrot_training - Step 49191: {'lr': 3.2517564954057113e-07, 'samples': 25186304, 'steps': 49191, 'batch_loss/train': 0.6631822381168604}
12/29/2021 10:32:26 - INFO - codeparrot_training - Step 49192: {'lr': 3.2437242486260675e-07, 'samples': 25186816, 'steps': 49192, 'batch_loss/train': 0.7783093694597483}
12/29/2021 10:32:36 - INFO - codeparrot_training - Step 49193: {'lr': 3.235701928006873e-07, 'samples': 25187328, 'steps': 49193, 'batch_loss/train': 0.7392956716939807}
12/29/2021 10:32:49 - INFO - codeparrot_training - Step 49194: {'lr': 3.227689533579492e-07, 'samples': 25187840, 'steps': 49194, 'batch_loss/train': 0.7746235271915793}
12/29/2021 10:32:59 - INFO - codeparrot_training - Step 49195: {'lr': 3.2196870653763975e-07, 'samples': 25188352, 'steps': 49195, 'batch_loss/train': 0.6115034180693328}
12/29/2021 10:33:10 - INFO - codeparrot_training - Step 49196: {'lr': 3.2116945234286764e-07, 'samples': 25188864, 'steps': 49196, 'batch_loss/train': 0.7674262369982898}
12/29/2021 10:33:20 - INFO - codeparrot_training - Step 49197: {'lr': 3.203711907768803e-07, 'samples': 25189376, 'steps': 49197, 'batch_loss/train': 0.7380993259139359}
12/29/2021 10:33:35 - INFO - codeparrot_training - Step 49198: {'lr': 3.1957392184281396e-07, 'samples': 25189888, 'steps': 49198, 'batch_loss/train': 0.6936133075505495}
12/29/2021 10:33:46 - INFO - codeparrot_training - Step 49199: {'lr': 3.1877764554383293e-07, 'samples': 25190400, 'steps': 49199, 'batch_loss/train': 0.7934108776971698}
12/29/2021 10:33:56 - INFO - codeparrot_training - Step 49200: {'lr': 3.17982361883129e-07, 'samples': 25190912, 'steps': 49200, 'batch_loss/train': 0.7007455305429175}
12/29/2021 10:34:08 - INFO - codeparrot_training - Step 49201: {'lr': 3.171880708638664e-07, 'samples': 25191424, 'steps': 49201, 'batch_loss/train': 0.7225021133199334}
12/29/2021 10:34:19 - INFO - codeparrot_training - Step 49202: {'lr': 3.1639477248915363e-07, 'samples': 25191936, 'steps': 49202, 'batch_loss/train': 0.8050575135275722}
12/29/2021 10:34:30 - INFO - codeparrot_training - Step 49203: {'lr': 3.1560246676221037e-07, 'samples': 25192448, 'steps': 49203, 'batch_loss/train': 0.5898607521085069}
12/29/2021 10:34:42 - INFO - codeparrot_training - Step 49204: {'lr': 3.1481115368614534e-07, 'samples': 25192960, 'steps': 49204, 'batch_loss/train': 0.6869602827355266}
12/29/2021 10:34:52 - INFO - codeparrot_training - Step 49205: {'lr': 3.140208332640948e-07, 'samples': 25193472, 'steps': 49205, 'batch_loss/train': 0.8202020898461342}
12/29/2021 10:35:03 - INFO - codeparrot_training - Step 49206: {'lr': 3.132315054992507e-07, 'samples': 25193984, 'steps': 49206, 'batch_loss/train': 0.6572330272756517}
12/29/2021 10:35:14 - INFO - codeparrot_training - Step 49207: {'lr': 3.124431703946939e-07, 'samples': 25194496, 'steps': 49207, 'batch_loss/train': 0.5333404240664095}
12/29/2021 10:35:26 - INFO - codeparrot_training - Step 49208: {'lr': 3.116558279535886e-07, 'samples': 25195008, 'steps': 49208, 'batch_loss/train': 0.7814849382266402}
12/29/2021 10:35:36 - INFO - codeparrot_training - Step 49209: {'lr': 3.1086947817907107e-07, 'samples': 25195520, 'steps': 49209, 'batch_loss/train': 0.5142152768094093}
12/29/2021 10:35:47 - INFO - codeparrot_training - Step 49210: {'lr': 3.1008412107425e-07, 'samples': 25196032, 'steps': 49210, 'batch_loss/train': 0.7287000394426286}
12/29/2021 10:36:02 - INFO - codeparrot_training - Step 49211: {'lr': 3.0929975664226174e-07, 'samples': 25196544, 'steps': 49211, 'batch_loss/train': 0.7166980365291238}
12/29/2021 10:36:12 - INFO - codeparrot_training - Step 49212: {'lr': 3.085163848862149e-07, 'samples': 25197056, 'steps': 49212, 'batch_loss/train': 0.9861694443970919}
12/29/2021 10:36:23 - INFO - codeparrot_training - Step 49213: {'lr': 3.0773400580921815e-07, 'samples': 25197568, 'steps': 49213, 'batch_loss/train': 0.6760018682107329}
12/29/2021 10:36:35 - INFO - codeparrot_training - Step 49214: {'lr': 3.069526194144079e-07, 'samples': 25198080, 'steps': 49214, 'batch_loss/train': 0.7224558964371681}
12/29/2021 10:36:46 - INFO - codeparrot_training - Step 49215: {'lr': 3.061722257048649e-07, 'samples': 25198592, 'steps': 49215, 'batch_loss/train': 0.7525770491920412}
12/29/2021 10:36:56 - INFO - codeparrot_training - Step 49216: {'lr': 3.053928246836979e-07, 'samples': 25199104, 'steps': 49216, 'batch_loss/train': 0.8123498307541013}
12/29/2021 10:37:07 - INFO - codeparrot_training - Step 49217: {'lr': 3.046144163540154e-07, 'samples': 25199616, 'steps': 49217, 'batch_loss/train': 0.8243666738271713}
12/29/2021 10:37:21 - INFO - codeparrot_training - Step 49218: {'lr': 3.0383700071887064e-07, 'samples': 25200128, 'steps': 49218, 'batch_loss/train': 0.6984492023475468}
12/29/2021 10:37:32 - INFO - codeparrot_training - Step 49219: {'lr': 3.030605777814277e-07, 'samples': 25200640, 'steps': 49219, 'batch_loss/train': 0.7318003862164915}
12/29/2021 10:37:42 - INFO - codeparrot_training - Step 49220: {'lr': 3.0228514754473966e-07, 'samples': 25201152, 'steps': 49220, 'batch_loss/train': 0.7120937635772862}
12/29/2021 10:37:54 - INFO - codeparrot_training - Step 49221: {'lr': 3.015107100118597e-07, 'samples': 25201664, 'steps': 49221, 'batch_loss/train': 0.7843536706641316}
12/29/2021 10:38:05 - INFO - codeparrot_training - Step 49222: {'lr': 3.0073726518589637e-07, 'samples': 25202176, 'steps': 49222, 'batch_loss/train': 0.7488589105196297}
12/29/2021 10:38:16 - INFO - codeparrot_training - Step 49223: {'lr': 2.9996481306990286e-07, 'samples': 25202688, 'steps': 49223, 'batch_loss/train': 0.7004292127676308}
12/29/2021 10:38:28 - INFO - codeparrot_training - Step 49224: {'lr': 2.9919335366698774e-07, 'samples': 25203200, 'steps': 49224, 'batch_loss/train': 0.6958168335258961}
12/29/2021 10:38:38 - INFO - codeparrot_training - Step 49225: {'lr': 2.9842288698020414e-07, 'samples': 25203712, 'steps': 49225, 'batch_loss/train': 0.6828588657081127}
12/29/2021 10:38:49 - INFO - codeparrot_training - Step 49226: {'lr': 2.976534130126052e-07, 'samples': 25204224, 'steps': 49226, 'batch_loss/train': 1.6004905039444566}
12/29/2021 10:39:00 - INFO - codeparrot_training - Step 49227: {'lr': 2.9688493176724397e-07, 'samples': 25204736, 'steps': 49227, 'batch_loss/train': 0.9075669748708606}
12/29/2021 10:39:14 - INFO - codeparrot_training - Step 49228: {'lr': 2.961174432471736e-07, 'samples': 25205248, 'steps': 49228, 'batch_loss/train': 0.7580412719398737}
12/29/2021 10:39:24 - INFO - codeparrot_training - Step 49229: {'lr': 2.95350947455475e-07, 'samples': 25205760, 'steps': 49229, 'batch_loss/train': 0.6714131240732968}
12/29/2021 10:39:35 - INFO - codeparrot_training - Step 49230: {'lr': 2.9458544439514567e-07, 'samples': 25206272, 'steps': 49230, 'batch_loss/train': 0.6983271958306432}
12/29/2021 10:39:47 - INFO - codeparrot_training - Step 49231: {'lr': 2.9382093406929436e-07, 'samples': 25206784, 'steps': 49231, 'batch_loss/train': 0.7813609093427658}
12/29/2021 10:39:58 - INFO - codeparrot_training - Step 49232: {'lr': 2.930574164809185e-07, 'samples': 25207296, 'steps': 49232, 'batch_loss/train': 0.7247022567898966}
12/29/2021 10:40:08 - INFO - codeparrot_training - Step 49233: {'lr': 2.9229489163304366e-07, 'samples': 25207808, 'steps': 49233, 'batch_loss/train': 0.730399392079562}
12/29/2021 10:40:20 - INFO - codeparrot_training - Step 49234: {'lr': 2.9153335952872286e-07, 'samples': 25208320, 'steps': 49234, 'batch_loss/train': 0.7613345049321651}
12/29/2021 10:40:31 - INFO - codeparrot_training - Step 49235: {'lr': 2.907728201709814e-07, 'samples': 25208832, 'steps': 49235, 'batch_loss/train': 0.643864270998165}
12/29/2021 10:40:42 - INFO - codeparrot_training - Step 49236: {'lr': 2.9001327356284466e-07, 'samples': 25209344, 'steps': 49236, 'batch_loss/train': 0.7816912503913045}
12/29/2021 10:40:52 - INFO - codeparrot_training - Step 49237: {'lr': 2.892547197073381e-07, 'samples': 25209856, 'steps': 49237, 'batch_loss/train': 0.7360447756946087}
12/29/2021 10:41:06 - INFO - codeparrot_training - Step 49238: {'lr': 2.884971586074592e-07, 'samples': 25210368, 'steps': 49238, 'batch_loss/train': 0.724791650660336}
12/29/2021 10:41:17 - INFO - codeparrot_training - Step 49239: {'lr': 2.8774059026623334e-07, 'samples': 25210880, 'steps': 49239, 'batch_loss/train': 0.7638902319595218}
12/29/2021 10:41:27 - INFO - codeparrot_training - Step 49240: {'lr': 2.869850146866304e-07, 'samples': 25211392, 'steps': 49240, 'batch_loss/train': 0.7356999353505671}
12/29/2021 10:41:40 - INFO - codeparrot_training - Step 49241: {'lr': 2.862304318717313e-07, 'samples': 25211904, 'steps': 49241, 'batch_loss/train': 0.661766892299056}
12/29/2021 10:41:50 - INFO - codeparrot_training - Step 49242: {'lr': 2.8547684182447796e-07, 'samples': 25212416, 'steps': 49242, 'batch_loss/train': 0.6944723450578749}
12/29/2021 10:42:01 - INFO - codeparrot_training - Step 49243: {'lr': 2.847242445478682e-07, 'samples': 25212928, 'steps': 49243, 'batch_loss/train': 0.9770870776847005}
12/29/2021 10:42:15 - INFO - codeparrot_training - Step 49244: {'lr': 2.8397264004489943e-07, 'samples': 25213440, 'steps': 49244, 'batch_loss/train': 0.9429112332873046}
12/29/2021 10:42:26 - INFO - codeparrot_training - Step 49245: {'lr': 2.832220283185971e-07, 'samples': 25213952, 'steps': 49245, 'batch_loss/train': 0.7777408133260906}
12/29/2021 10:42:36 - INFO - codeparrot_training - Step 49246: {'lr': 2.8247240937190333e-07, 'samples': 25214464, 'steps': 49246, 'batch_loss/train': 0.6572827599011362}
12/29/2021 10:42:48 - INFO - codeparrot_training - Step 49247: {'lr': 2.817237832077879e-07, 'samples': 25214976, 'steps': 49247, 'batch_loss/train': 0.7040327859576792}
12/29/2021 10:42:59 - INFO - codeparrot_training - Step 49248: {'lr': 2.809761498292762e-07, 'samples': 25215488, 'steps': 49248, 'batch_loss/train': 0.778260032646358}
12/29/2021 10:43:09 - INFO - codeparrot_training - Step 49249: {'lr': 2.8022950923931035e-07, 'samples': 25216000, 'steps': 49249, 'batch_loss/train': 0.7751928111538291}
12/29/2021 10:43:20 - INFO - codeparrot_training - Step 49250: {'lr': 2.7948386144083236e-07, 'samples': 25216512, 'steps': 49250, 'batch_loss/train': 0.6849391767755151}
12/29/2021 10:43:32 - INFO - codeparrot_training - Step 49251: {'lr': 2.7873920643686767e-07, 'samples': 25217024, 'steps': 49251, 'batch_loss/train': 0.7940009832382202}
12/29/2021 10:43:43 - INFO - codeparrot_training - Step 49252: {'lr': 2.7799554423033056e-07, 'samples': 25217536, 'steps': 49252, 'batch_loss/train': 0.70244705688674}
12/29/2021 10:43:53 - INFO - codeparrot_training - Step 49253: {'lr': 2.772528748241909e-07, 'samples': 25218048, 'steps': 49253, 'batch_loss/train': 0.7613879164564423}
12/29/2021 10:44:06 - INFO - codeparrot_training - Step 49254: {'lr': 2.765111982214186e-07, 'samples': 25218560, 'steps': 49254, 'batch_loss/train': 0.7488312888890505}
12/29/2021 10:44:17 - INFO - codeparrot_training - Step 49255: {'lr': 2.7577051442492784e-07, 'samples': 25219072, 'steps': 49255, 'batch_loss/train': 0.5664005246944726}
12/29/2021 10:44:27 - INFO - codeparrot_training - Step 49256: {'lr': 2.7503082343768857e-07, 'samples': 25219584, 'steps': 49256, 'batch_loss/train': 0.5277619027765468}
12/29/2021 10:44:41 - INFO - codeparrot_training - Step 49257: {'lr': 2.7429212526261514e-07, 'samples': 25220096, 'steps': 49257, 'batch_loss/train': 0.718966699205339}
12/29/2021 10:44:52 - INFO - codeparrot_training - Step 49258: {'lr': 2.7355441990270514e-07, 'samples': 25220608, 'steps': 49258, 'batch_loss/train': 0.7867319779470563}
12/29/2021 10:45:02 - INFO - codeparrot_training - Step 49259: {'lr': 2.7281770736081737e-07, 'samples': 25221120, 'steps': 49259, 'batch_loss/train': 0.7075098808854818}
12/29/2021 10:45:14 - INFO - codeparrot_training - Step 49260: {'lr': 2.7208198763992163e-07, 'samples': 25221632, 'steps': 49260, 'batch_loss/train': 0.5640979206655174}
12/29/2021 10:45:25 - INFO - codeparrot_training - Step 49261: {'lr': 2.7134726074293235e-07, 'samples': 25222144, 'steps': 49261, 'batch_loss/train': 0.6254623425193131}
12/29/2021 10:45:36 - INFO - codeparrot_training - Step 49262: {'lr': 2.7061352667276385e-07, 'samples': 25222656, 'steps': 49262, 'batch_loss/train': 0.6810957583365962}
12/29/2021 10:45:46 - INFO - codeparrot_training - Step 49263: {'lr': 2.6988078543235814e-07, 'samples': 25223168, 'steps': 49263, 'batch_loss/train': 0.7135852687060833}
12/29/2021 10:45:59 - INFO - codeparrot_training - Step 49264: {'lr': 2.691490370246019e-07, 'samples': 25223680, 'steps': 49264, 'batch_loss/train': 0.7878930941224098}
12/29/2021 10:46:10 - INFO - codeparrot_training - Step 49265: {'lr': 2.6841828145240943e-07, 'samples': 25224192, 'steps': 49265, 'batch_loss/train': 0.9987336611375213}
12/29/2021 10:46:20 - INFO - codeparrot_training - Step 49266: {'lr': 2.67688518718695e-07, 'samples': 25224704, 'steps': 49266, 'batch_loss/train': 0.72668366599828}
12/29/2021 10:46:34 - INFO - codeparrot_training - Step 49267: {'lr': 2.6695974882637306e-07, 'samples': 25225216, 'steps': 49267, 'batch_loss/train': 0.7053287702146918}
12/29/2021 10:46:45 - INFO - codeparrot_training - Step 49268: {'lr': 2.662319717783024e-07, 'samples': 25225728, 'steps': 49268, 'batch_loss/train': 0.6562412097118795}
12/29/2021 10:46:55 - INFO - codeparrot_training - Step 49269: {'lr': 2.65505187577425e-07, 'samples': 25226240, 'steps': 49269, 'batch_loss/train': 0.7504204765427858}
12/29/2021 10:47:07 - INFO - codeparrot_training - Step 49270: {'lr': 2.647793962265721e-07, 'samples': 25226752, 'steps': 49270, 'batch_loss/train': 0.6232676059007645}
12/29/2021 10:47:18 - INFO - codeparrot_training - Step 49271: {'lr': 2.640545977286579e-07, 'samples': 25227264, 'steps': 49271, 'batch_loss/train': 0.6849357062019408}
12/29/2021 10:47:29 - INFO - codeparrot_training - Step 49272: {'lr': 2.633307920865968e-07, 'samples': 25227776, 'steps': 49272, 'batch_loss/train': 0.6735047260299325}
12/29/2021 10:47:39 - INFO - codeparrot_training - Step 49273: {'lr': 2.6260797930321987e-07, 'samples': 25228288, 'steps': 49273, 'batch_loss/train': 0.7496019662357867}
12/29/2021 10:47:53 - INFO - codeparrot_training - Step 49274: {'lr': 2.618861593814137e-07, 'samples': 25228800, 'steps': 49274, 'batch_loss/train': 0.6601896472275257}
12/29/2021 10:48:04 - INFO - codeparrot_training - Step 49275: {'lr': 2.6116533232403707e-07, 'samples': 25229312, 'steps': 49275, 'batch_loss/train': 1.4945657178759575}
12/29/2021 10:48:15 - INFO - codeparrot_training - Step 49276: {'lr': 2.604454981340043e-07, 'samples': 25229824, 'steps': 49276, 'batch_loss/train': 1.5115963511634618}
12/29/2021 10:48:27 - INFO - codeparrot_training - Step 49277: {'lr': 2.597266568141188e-07, 'samples': 25230336, 'steps': 49277, 'batch_loss/train': 0.7336253207176924}
12/29/2021 10:48:37 - INFO - codeparrot_training -
Step 49278: {'lr': 2.5900880836726703e-07, 'samples': 25230848, 'steps': 49278, 'batch_loss/train': 0.7400713637471199} 12/29/2021 10:48:48 - INFO - codeparrot_training - Step 49279: {'lr': 2.5829195279628017e-07, 'samples': 25231360, 'steps': 49279, 'batch_loss/train': 0.7697266051545739} 12/29/2021 10:49:00 - INFO - codeparrot_training - Step 49280: {'lr': 2.575760901040447e-07, 'samples': 25231872, 'steps': 49280, 'batch_loss/train': 0.6963088354095817} 12/29/2021 10:49:11 - INFO - codeparrot_training - Step 49281: {'lr': 2.5686122029339174e-07, 'samples': 25232384, 'steps': 49281, 'batch_loss/train': 0.7155153614003211} 12/29/2021 10:49:21 - INFO - codeparrot_training - Step 49282: {'lr': 2.561473433671524e-07, 'samples': 25232896, 'steps': 49282, 'batch_loss/train': 0.7418396342545748} 12/29/2021 10:49:32 - INFO - codeparrot_training - Step 49283: {'lr': 2.554344593281577e-07, 'samples': 25233408, 'steps': 49283, 'batch_loss/train': 0.701126444619149} 12/29/2021 10:49:46 - INFO - codeparrot_training - Step 49284: {'lr': 2.5472256817929416e-07, 'samples': 25233920, 'steps': 49284, 'batch_loss/train': 0.7533069495111704} 12/29/2021 10:49:56 - INFO - codeparrot_training - Step 49285: {'lr': 2.540116699233097e-07, 'samples': 25234432, 'steps': 49285, 'batch_loss/train': 0.8156485296785831} 12/29/2021 10:50:07 - INFO - codeparrot_training - Step 49286: {'lr': 2.5330176456309085e-07, 'samples': 25234944, 'steps': 49286, 'batch_loss/train': 0.7140177116380073} 12/29/2021 10:50:19 - INFO - codeparrot_training - Step 49287: {'lr': 2.5259285210146865e-07, 'samples': 25235456, 'steps': 49287, 'batch_loss/train': 0.7932735979557037} 12/29/2021 10:50:30 - INFO - codeparrot_training - Step 49288: {'lr': 2.5188493254119095e-07, 'samples': 25235968, 'steps': 49288, 'batch_loss/train': 0.6679100533947349} 12/29/2021 10:50:40 - INFO - codeparrot_training - Step 49289: {'lr': 2.5117800588514426e-07, 'samples': 25236480, 'steps': 49289, 'batch_loss/train': 0.6745926523581147} 
12/29/2021 10:50:54 - INFO - codeparrot_training - Step 49290: {'lr': 2.504720721361042e-07, 'samples': 25236992, 'steps': 49290, 'batch_loss/train': 0.7179815173149109} 12/29/2021 10:51:05 - INFO - codeparrot_training - Step 49291: {'lr': 2.497671312969019e-07, 'samples': 25237504, 'steps': 49291, 'batch_loss/train': 0.7013215636834502} 12/29/2021 10:51:16 - INFO - codeparrot_training - Step 49292: {'lr': 2.4906318337031273e-07, 'samples': 25238016, 'steps': 49292, 'batch_loss/train': 0.674258538056165} 12/29/2021 10:51:26 - INFO - codeparrot_training - Step 49293: {'lr': 2.483602283591402e-07, 'samples': 25238528, 'steps': 49293, 'batch_loss/train': 0.7030937452800572} 12/29/2021 10:51:38 - INFO - codeparrot_training - Step 49294: {'lr': 2.476582662661875e-07, 'samples': 25239040, 'steps': 49294, 'batch_loss/train': 0.6062643523328006} 12/29/2021 10:51:49 - INFO - codeparrot_training - Step 49295: {'lr': 2.4695729709425796e-07, 'samples': 25239552, 'steps': 49295, 'batch_loss/train': 0.6616833889856935} 12/29/2021 10:52:00 - INFO - codeparrot_training - Step 49296: {'lr': 2.462573208460994e-07, 'samples': 25240064, 'steps': 49296, 'batch_loss/train': 0.7906480251695029} 12/29/2021 10:52:12 - INFO - codeparrot_training - Step 49297: {'lr': 2.455583375245152e-07, 'samples': 25240576, 'steps': 49297, 'batch_loss/train': 0.6707731753122061} 12/29/2021 10:52:22 - INFO - codeparrot_training - Step 49298: {'lr': 2.4486034713230853e-07, 'samples': 25241088, 'steps': 49298, 'batch_loss/train': 0.5845672901778016} 12/29/2021 10:52:33 - INFO - codeparrot_training - Step 49299: {'lr': 2.441633496722273e-07, 'samples': 25241600, 'steps': 49299, 'batch_loss/train': 0.7299083706457168} 12/29/2021 10:52:45 - INFO - codeparrot_training - Step 49300: {'lr': 2.4346734514704704e-07, 'samples': 25242112, 'steps': 49300, 'batch_loss/train': 0.8455735864117742} 12/29/2021 10:52:56 - INFO - codeparrot_training - Step 49301: {'lr': 2.4277233355954334e-07, 'samples': 25242624, 'steps': 
49301, 'batch_loss/train': 0.7513341708108783} 12/29/2021 10:53:06 - INFO - codeparrot_training - Step 49302: {'lr': 2.420783149124639e-07, 'samples': 25243136, 'steps': 49302, 'batch_loss/train': 0.8019305691123009} 12/29/2021 10:53:17 - INFO - codeparrot_training - Step 49303: {'lr': 2.4138528920855664e-07, 'samples': 25243648, 'steps': 49303, 'batch_loss/train': 0.6083602583967149} 12/29/2021 10:53:31 - INFO - codeparrot_training - Step 49304: {'lr': 2.406932564506248e-07, 'samples': 25244160, 'steps': 49304, 'batch_loss/train': 0.645183046348393} 12/29/2021 10:53:42 - INFO - codeparrot_training - Step 49305: {'lr': 2.4000221664138846e-07, 'samples': 25244672, 'steps': 49305, 'batch_loss/train': 0.7135252635926008} 12/29/2021 10:53:52 - INFO - codeparrot_training - Step 49306: {'lr': 2.393121697835954e-07, 'samples': 25245184, 'steps': 49306, 'batch_loss/train': 0.7116830563172698} 12/29/2021 10:54:04 - INFO - codeparrot_training - Step 49307: {'lr': 2.3862311587996567e-07, 'samples': 25245696, 'steps': 49307, 'batch_loss/train': 0.7858765637502074} 12/29/2021 10:54:15 - INFO - codeparrot_training - Step 49308: {'lr': 2.379350549333026e-07, 'samples': 25246208, 'steps': 49308, 'batch_loss/train': 0.7091975067742169} 12/29/2021 10:54:26 - INFO - codeparrot_training - Step 49309: {'lr': 2.372479869462707e-07, 'samples': 25246720, 'steps': 49309, 'batch_loss/train': 0.6397831356152892} 12/29/2021 10:54:38 - INFO - codeparrot_training - Step 49310: {'lr': 2.3656191192167332e-07, 'samples': 25247232, 'steps': 49310, 'batch_loss/train': 0.7557029211893678} 12/29/2021 10:54:49 - INFO - codeparrot_training - Step 49311: {'lr': 2.3587682986217496e-07, 'samples': 25247744, 'steps': 49311, 'batch_loss/train': 0.7988558290526271} 12/29/2021 10:54:59 - INFO - codeparrot_training - Step 49312: {'lr': 2.3519274077052344e-07, 'samples': 25248256, 'steps': 49312, 'batch_loss/train': 0.5557177467271686} 12/29/2021 10:55:10 - INFO - codeparrot_training - Step 49313: {'lr': 
2.345096446494388e-07, 'samples': 25248768, 'steps': 49313, 'batch_loss/train': 0.7364036468788981} 12/29/2021 10:55:24 - INFO - codeparrot_training - Step 49314: {'lr': 2.3382754150161333e-07, 'samples': 25249280, 'steps': 49314, 'batch_loss/train': 0.7464653346687555} 12/29/2021 10:55:34 - INFO - codeparrot_training - Step 49315: {'lr': 2.331464313298226e-07, 'samples': 25249792, 'steps': 49315, 'batch_loss/train': 0.7437340496107936} 12/29/2021 10:55:45 - INFO - codeparrot_training - Step 49316: {'lr': 2.3246631413670337e-07, 'samples': 25250304, 'steps': 49316, 'batch_loss/train': 0.679720469750464} 12/29/2021 10:55:57 - INFO - codeparrot_training - Step 49317: {'lr': 2.3178718992500348e-07, 'samples': 25250816, 'steps': 49317, 'batch_loss/train': 0.6280224575311877} 12/29/2021 10:56:08 - INFO - codeparrot_training - Step 49318: {'lr': 2.3110905869741517e-07, 'samples': 25251328, 'steps': 49318, 'batch_loss/train': 0.7260826269921381} 12/29/2021 10:56:18 - INFO - codeparrot_training - Step 49319: {'lr': 2.30431920456603e-07, 'samples': 25251840, 'steps': 49319, 'batch_loss/train': 0.7232455210760236} 12/29/2021 10:56:30 - INFO - codeparrot_training - Step 49320: {'lr': 2.2975577520531477e-07, 'samples': 25252352, 'steps': 49320, 'batch_loss/train': 0.7817441076040268} 12/29/2021 10:56:41 - INFO - codeparrot_training - Step 49321: {'lr': 2.2908062294618725e-07, 'samples': 25252864, 'steps': 49321, 'batch_loss/train': 0.5414368548954371} 12/29/2021 10:56:52 - INFO - codeparrot_training - Step 49322: {'lr': 2.284064636819405e-07, 'samples': 25253376, 'steps': 49322, 'batch_loss/train': 0.8277892600744963} 12/29/2021 10:57:06 - INFO - codeparrot_training - Step 49323: {'lr': 2.2773329741521133e-07, 'samples': 25253888, 'steps': 49323, 'batch_loss/train': 0.7802560180425644} 12/29/2021 10:57:16 - INFO - codeparrot_training - Step 49324: {'lr': 2.2706112414874748e-07, 'samples': 25254400, 'steps': 49324, 'batch_loss/train': 0.6301242653280497} 12/29/2021 10:57:27 - 
INFO - codeparrot_training - Step 49325: {'lr': 2.2638994388513022e-07, 'samples': 25254912, 'steps': 49325, 'batch_loss/train': 0.7690768046304584} 12/29/2021 10:57:38 - INFO - codeparrot_training - Step 49326: {'lr': 2.257197566271074e-07, 'samples': 25255424, 'steps': 49326, 'batch_loss/train': 0.6371289873786736} 12/29/2021 10:57:50 - INFO - codeparrot_training - Step 49327: {'lr': 2.250505623773158e-07, 'samples': 25255936, 'steps': 49327, 'batch_loss/train': 0.7554929596371949} 12/29/2021 10:58:01 - INFO - codeparrot_training - Step 49328: {'lr': 2.2438236113839218e-07, 'samples': 25256448, 'steps': 49328, 'batch_loss/train': 0.7407253817655146} 12/29/2021 10:58:11 - INFO - codeparrot_training - Step 49329: {'lr': 2.2371515291300105e-07, 'samples': 25256960, 'steps': 49329, 'batch_loss/train': 0.6081867134198546} 12/29/2021 10:58:23 - INFO - codeparrot_training - Step 49330: {'lr': 2.2304893770383473e-07, 'samples': 25257472, 'steps': 49330, 'batch_loss/train': 0.6809135158546269} 12/29/2021 10:58:34 - INFO - codeparrot_training - Step 49331: {'lr': 2.2238371551350222e-07, 'samples': 25257984, 'steps': 49331, 'batch_loss/train': 0.7427145945839584} 12/29/2021 10:58:45 - INFO - codeparrot_training - Step 49332: {'lr': 2.217194863446681e-07, 'samples': 25258496, 'steps': 49332, 'batch_loss/train': 0.6463158689439297} 12/29/2021 10:58:59 - INFO - codeparrot_training - Step 49333: {'lr': 2.2105625019996912e-07, 'samples': 25259008, 'steps': 49333, 'batch_loss/train': 0.6948817062657326} 12/29/2021 10:59:09 - INFO - codeparrot_training - Step 49334: {'lr': 2.203940070820143e-07, 'samples': 25259520, 'steps': 49334, 'batch_loss/train': 0.6627803798764944} 12/29/2021 10:59:20 - INFO - codeparrot_training - Step 49335: {'lr': 2.1973275699346818e-07, 'samples': 25260032, 'steps': 49335, 'batch_loss/train': 0.6891114264726639} 12/29/2021 10:59:30 - INFO - codeparrot_training - Step 49336: {'lr': 2.1907249993696755e-07, 'samples': 25260544, 'steps': 49336, 
'batch_loss/train': 0.7103944350965321} 12/29/2021 10:59:43 - INFO - codeparrot_training - Step 49337: {'lr': 2.1841323591509365e-07, 'samples': 25261056, 'steps': 49337, 'batch_loss/train': 0.6008953605778515} 12/29/2021 10:59:53 - INFO - codeparrot_training - Step 49338: {'lr': 2.1775496493051104e-07, 'samples': 25261568, 'steps': 49338, 'batch_loss/train': 0.7062697573564947} 12/29/2021 11:00:04 - INFO - codeparrot_training - Step 49339: {'lr': 2.1709768698582877e-07, 'samples': 25262080, 'steps': 49339, 'batch_loss/train': 0.6504008499905467} 12/29/2021 11:00:16 - INFO - codeparrot_training - Step 49340: {'lr': 2.164414020836558e-07, 'samples': 25262592, 'steps': 49340, 'batch_loss/train': 0.6525309160351753} 12/29/2021 11:00:27 - INFO - codeparrot_training - Step 49341: {'lr': 2.1578611022657345e-07, 'samples': 25263104, 'steps': 49341, 'batch_loss/train': 0.6788215446285903} 12/29/2021 11:00:37 - INFO - codeparrot_training - Step 49342: {'lr': 2.151318114172185e-07, 'samples': 25263616, 'steps': 49342, 'batch_loss/train': 0.6675360496155918} 12/29/2021 11:00:51 - INFO - codeparrot_training - Step 49343: {'lr': 2.1447850565819992e-07, 'samples': 25264128, 'steps': 49343, 'batch_loss/train': 0.7304994423175231} 12/29/2021 11:01:02 - INFO - codeparrot_training - Step 49344: {'lr': 2.1382619295207129e-07, 'samples': 25264640, 'steps': 49344, 'batch_loss/train': 0.6795962168835104} 12/29/2021 11:01:13 - INFO - codeparrot_training - Step 49345: {'lr': 2.1317487330146934e-07, 'samples': 25265152, 'steps': 49345, 'batch_loss/train': 0.7338958522304893} 12/29/2021 11:01:23 - INFO - codeparrot_training - Step 49346: {'lr': 2.1252454670897536e-07, 'samples': 25265664, 'steps': 49346, 'batch_loss/train': 0.5990864767809398} 12/29/2021 11:01:35 - INFO - codeparrot_training - Step 49347: {'lr': 2.1187521317717061e-07, 'samples': 25266176, 'steps': 49347, 'batch_loss/train': 0.830209163017571} 12/29/2021 11:01:46 - INFO - codeparrot_training - Step 49348: {'lr': 
2.1122687270860863e-07, 'samples': 25266688, 'steps': 49348, 'batch_loss/train': 0.7681824169121683} 12/29/2021 11:01:57 - INFO - codeparrot_training - Step 49349: {'lr': 2.1057952530592617e-07, 'samples': 25267200, 'steps': 49349, 'batch_loss/train': 0.6753078768961132} 12/29/2021 11:02:11 - INFO - codeparrot_training - Step 49350: {'lr': 2.09933170971649e-07, 'samples': 25267712, 'steps': 49350, 'batch_loss/train': 0.7573577058501542} 12/29/2021 11:02:21 - INFO - codeparrot_training - Step 49351: {'lr': 2.0928780970833062e-07, 'samples': 25268224, 'steps': 49351, 'batch_loss/train': 0.7305026557296515} 12/29/2021 11:02:32 - INFO - codeparrot_training - Step 49352: {'lr': 2.0864344151860782e-07, 'samples': 25268736, 'steps': 49352, 'batch_loss/train': 0.7362006194889545} 12/29/2021 11:02:44 - INFO - codeparrot_training - Step 49353: {'lr': 2.0800006640497859e-07, 'samples': 25269248, 'steps': 49353, 'batch_loss/train': 0.7266898583620787} 12/29/2021 11:02:54 - INFO - codeparrot_training - Step 49354: {'lr': 2.0735768437002421e-07, 'samples': 25269760, 'steps': 49354, 'batch_loss/train': 0.705537929199636} 12/29/2021 11:03:05 - INFO - codeparrot_training - Step 49355: {'lr': 2.067162954162982e-07, 'samples': 25270272, 'steps': 49355, 'batch_loss/train': 0.6768641951493919} 12/29/2021 11:03:17 - INFO - codeparrot_training - Step 49356: {'lr': 2.060758995463541e-07, 'samples': 25270784, 'steps': 49356, 'batch_loss/train': 0.6968769326340407} 12/29/2021 11:03:28 - INFO - codeparrot_training - Step 49357: {'lr': 2.0543649676274535e-07, 'samples': 25271296, 'steps': 49357, 'batch_loss/train': 0.7396020339801908} 12/29/2021 11:03:39 - INFO - codeparrot_training - Step 49358: {'lr': 2.0479808706799774e-07, 'samples': 25271808, 'steps': 49358, 'batch_loss/train': 0.6784496619366109} 12/29/2021 11:03:49 - INFO - codeparrot_training - Step 49359: {'lr': 2.0416067046463706e-07, 'samples': 25272320, 'steps': 49359, 'batch_loss/train': 0.6200970550999045} 12/29/2021 11:04:03 - 
INFO - codeparrot_training - Step 49360: {'lr': 2.0352424695521677e-07, 'samples': 25272832, 'steps': 49360, 'batch_loss/train': 0.6897961043287069} 12/29/2021 11:04:14 - INFO - codeparrot_training - Step 49361: {'lr': 2.028888165422904e-07, 'samples': 25273344, 'steps': 49361, 'batch_loss/train': 0.6508042006753385} 12/29/2021 11:04:24 - INFO - codeparrot_training - Step 49362: {'lr': 2.0225437922832824e-07, 'samples': 25273856, 'steps': 49362, 'batch_loss/train': 0.7220627479255199} 12/29/2021 11:04:37 - INFO - codeparrot_training - Step 49363: {'lr': 2.016209350159115e-07, 'samples': 25274368, 'steps': 49363, 'batch_loss/train': 0.7333598607219756} 12/29/2021 11:04:47 - INFO - codeparrot_training - Step 49364: {'lr': 2.0098848390751044e-07, 'samples': 25274880, 'steps': 49364, 'batch_loss/train': 0.6924068918451667} 12/29/2021 11:04:58 - INFO - codeparrot_training - Step 49365: {'lr': 2.0035702590565087e-07, 'samples': 25275392, 'steps': 49365, 'batch_loss/train': 0.6969940746203065} 12/29/2021 11:05:10 - INFO - codeparrot_training - Step 49366: {'lr': 1.9972656101285846e-07, 'samples': 25275904, 'steps': 49366, 'batch_loss/train': 0.6423132159397937} 12/29/2021 11:05:21 - INFO - codeparrot_training - Step 49367: {'lr': 1.9909708923163128e-07, 'samples': 25276416, 'steps': 49367, 'batch_loss/train': 0.6997628309763968} 12/29/2021 11:05:31 - INFO - codeparrot_training - Step 49368: {'lr': 1.9846861056449507e-07, 'samples': 25276928, 'steps': 49368, 'batch_loss/train': 0.7426755174237769} 12/29/2021 11:05:42 - INFO - codeparrot_training - Step 49369: {'lr': 1.9784112501389227e-07, 'samples': 25277440, 'steps': 49369, 'batch_loss/train': 0.6835236917249858} 12/29/2021 11:05:54 - INFO - codeparrot_training - Step 49370: {'lr': 1.972146325823765e-07, 'samples': 25277952, 'steps': 49370, 'batch_loss/train': 0.747746713925153} 12/29/2021 11:06:05 - INFO - codeparrot_training - Step 49371: {'lr': 1.9658913327239015e-07, 'samples': 25278464, 'steps': 49371, 
'batch_loss/train': 0.7267895615659654} 12/29/2021 11:06:15 - INFO - codeparrot_training - Step 49372: {'lr': 1.95964627086459e-07, 'samples': 25278976, 'steps': 49372, 'batch_loss/train': 0.7461863490752876} 12/29/2021 11:06:29 - INFO - codeparrot_training - Step 49373: {'lr': 1.9534111402705335e-07, 'samples': 25279488, 'steps': 49373, 'batch_loss/train': 1.1430549176875502} 12/29/2021 11:06:40 - INFO - codeparrot_training - Step 49374: {'lr': 1.9471859409664338e-07, 'samples': 25280000, 'steps': 49374, 'batch_loss/train': 0.7400333513505757} 12/29/2021 11:06:51 - INFO - codeparrot_training - Step 49375: {'lr': 1.9409706729772713e-07, 'samples': 25280512, 'steps': 49375, 'batch_loss/train': 0.674532572273165} 12/29/2021 11:07:03 - INFO - codeparrot_training - Step 49376: {'lr': 1.9347653363271932e-07, 'samples': 25281024, 'steps': 49376, 'batch_loss/train': 0.7059712819755077} 12/29/2021 11:07:13 - INFO - codeparrot_training - Step 49377: {'lr': 1.9285699310414574e-07, 'samples': 25281536, 'steps': 49377, 'batch_loss/train': 0.7192618338158354} 12/29/2021 11:07:24 - INFO - codeparrot_training - Step 49378: {'lr': 1.9223844571447658e-07, 'samples': 25282048, 'steps': 49378, 'batch_loss/train': 0.7686593541875482} 12/29/2021 11:07:36 - INFO - codeparrot_training - Step 49379: {'lr': 1.9162089146609885e-07, 'samples': 25282560, 'steps': 49379, 'batch_loss/train': 0.6661438813898712} 12/29/2021 11:07:47 - INFO - codeparrot_training - Step 49380: {'lr': 1.910043303615383e-07, 'samples': 25283072, 'steps': 49380, 'batch_loss/train': 0.7433510972186923} 12/29/2021 11:07:57 - INFO - codeparrot_training - Step 49381: {'lr': 1.903887624032097e-07, 'samples': 25283584, 'steps': 49381, 'batch_loss/train': 0.5764368506206665} 12/29/2021 11:08:08 - INFO - codeparrot_training - Step 49382: {'lr': 1.8977418759358323e-07, 'samples': 25284096, 'steps': 49382, 'batch_loss/train': 0.8962613558396697} 12/29/2021 11:08:22 - INFO - codeparrot_training - Step 49383: {'lr': 
1.8916060593507366e-07, 'samples': 25284608, 'steps': 49383, 'batch_loss/train': 0.8569598654285073} 12/29/2021 11:08:33 - INFO - codeparrot_training - Step 49384: {'lr': 1.8854801743015127e-07, 'samples': 25285120, 'steps': 49384, 'batch_loss/train': 0.7559631103649735} 12/29/2021 11:08:43 - INFO - codeparrot_training - Step 49385: {'lr': 1.8793642208123074e-07, 'samples': 25285632, 'steps': 49385, 'batch_loss/train': 0.6060180200729519} 12/29/2021 11:08:56 - INFO - codeparrot_training - Step 49386: {'lr': 1.873258198907546e-07, 'samples': 25286144, 'steps': 49386, 'batch_loss/train': 0.6174922990612686} 12/29/2021 11:09:06 - INFO - codeparrot_training - Step 49387: {'lr': 1.8671621086113754e-07, 'samples': 25286656, 'steps': 49387, 'batch_loss/train': 0.6867718230932951} 12/29/2021 11:09:17 - INFO - codeparrot_training - Step 49388: {'lr': 1.8610759499479435e-07, 'samples': 25287168, 'steps': 49388, 'batch_loss/train': 0.6801299334038049} 12/29/2021 11:09:28 - INFO - codeparrot_training - Step 49389: {'lr': 1.8549997229419523e-07, 'samples': 25287680, 'steps': 49389, 'batch_loss/train': 0.7754363999702036} 12/29/2021 11:09:41 - INFO - codeparrot_training - Step 49390: {'lr': 1.8489334276169944e-07, 'samples': 25288192, 'steps': 49390, 'batch_loss/train': 0.7629470793763176} 12/29/2021 11:09:52 - INFO - codeparrot_training - Step 49391: {'lr': 1.8428770639974945e-07, 'samples': 25288704, 'steps': 49391, 'batch_loss/train': 0.7910648547112942} 12/29/2021 11:10:03 - INFO - codeparrot_training - Step 49392: {'lr': 1.8368306321073225e-07, 'samples': 25289216, 'steps': 49392, 'batch_loss/train': 0.7018304192461073} 12/29/2021 11:10:15 - INFO - codeparrot_training - Step 49393: {'lr': 1.830794131970903e-07, 'samples': 25289728, 'steps': 49393, 'batch_loss/train': 0.8221284504979849} 12/29/2021 11:10:26 - INFO - codeparrot_training - Step 49394: {'lr': 1.8247675636118288e-07, 'samples': 25290240, 'steps': 49394, 'batch_loss/train': 0.7325582085177302} 12/29/2021 11:10:36 
- INFO - codeparrot_training - Step 49395: {'lr': 1.8187509270542468e-07, 'samples': 25290752, 'steps': 49395, 'batch_loss/train': 0.7634750795550644} 12/29/2021 11:10:48 - INFO - codeparrot_training - Step 49396: {'lr': 1.812744222322027e-07, 'samples': 25291264, 'steps': 49396, 'batch_loss/train': 0.7508445386774838} 12/29/2021 11:10:59 - INFO - codeparrot_training - Step 49397: {'lr': 1.806747449439039e-07, 'samples': 25291776, 'steps': 49397, 'batch_loss/train': 0.7354395147413015} 12/29/2021 11:11:10 - INFO - codeparrot_training - Step 49398: {'lr': 1.800760608429153e-07, 'samples': 25292288, 'steps': 49398, 'batch_loss/train': 0.5461051961174235} 12/29/2021 11:11:23 - INFO - codeparrot_training - Step 49399: {'lr': 1.7947836993159606e-07, 'samples': 25292800, 'steps': 49399, 'batch_loss/train': 0.6066608070395887} 12/29/2021 11:11:34 - INFO - codeparrot_training - Step 49400: {'lr': 1.7888167221236095e-07, 'samples': 25293312, 'steps': 49400, 'batch_loss/train': 0.7587936855852604} 12/29/2021 11:11:45 - INFO - codeparrot_training - Step 49401: {'lr': 1.7828596768756923e-07, 'samples': 25293824, 'steps': 49401, 'batch_loss/train': 0.676149055827409} 12/29/2021 11:11:55 - INFO - codeparrot_training - Step 49402: {'lr': 1.7769125635958005e-07, 'samples': 25294336, 'steps': 49402, 'batch_loss/train': 0.7490142425522208} 12/29/2021 11:12:08 - INFO - codeparrot_training - Step 49403: {'lr': 1.770975382307527e-07, 'samples': 25294848, 'steps': 49403, 'batch_loss/train': 0.7872518766671419} 12/29/2021 11:12:19 - INFO - codeparrot_training - Step 49404: {'lr': 1.7650481330344636e-07, 'samples': 25295360, 'steps': 49404, 'batch_loss/train': 1.3014731798321009} 12/29/2021 11:12:30 - INFO - codeparrot_training - Step 49405: {'lr': 1.7591308158004803e-07, 'samples': 25295872, 'steps': 49405, 'batch_loss/train': 0.7473590220324695} 12/29/2021 11:12:42 - INFO - codeparrot_training - Step 49406: {'lr': 1.753223430628892e-07, 'samples': 25296384, 'steps': 49406, 
'batch_loss/train': 0.7609118018299341} 12/29/2021 11:12:52 - INFO - codeparrot_training - Step 49407: {'lr': 1.747325977543013e-07, 'samples': 25296896, 'steps': 49407, 'batch_loss/train': 0.7359719891101122} 12/29/2021 11:13:03 - INFO - codeparrot_training - Step 49408: {'lr': 1.7414384565667129e-07, 'samples': 25297408, 'steps': 49408, 'batch_loss/train': 0.6256393934600055} 12/29/2021 11:13:18 - INFO - codeparrot_training - Step 49409: {'lr': 1.7355608677230295e-07, 'samples': 25297920, 'steps': 49409, 'batch_loss/train': 0.7576334276236594} 12/29/2021 11:13:28 - INFO - codeparrot_training - Step 49410: {'lr': 1.729693211035277e-07, 'samples': 25298432, 'steps': 49410, 'batch_loss/train': 0.7715007183142006} 12/29/2021 11:13:39 - INFO - codeparrot_training - Step 49411: {'lr': 1.7238354865273253e-07, 'samples': 25298944, 'steps': 49411, 'batch_loss/train': 0.770542121026665} 12/29/2021 11:13:50 - INFO - codeparrot_training - Step 49412: {'lr': 1.7179876942216567e-07, 'samples': 25299456, 'steps': 49412, 'batch_loss/train': 0.6909829170908779} 12/29/2021 11:14:02 - INFO - codeparrot_training - Step 49413: {'lr': 1.7121498341421405e-07, 'samples': 25299968, 'steps': 49413, 'batch_loss/train': 0.6839496120810509} 12/29/2021 11:14:13 - INFO - codeparrot_training - Step 49414: {'lr': 1.7063219063118141e-07, 'samples': 25300480, 'steps': 49414, 'batch_loss/train': 0.820237924810499} 12/29/2021 11:14:23 - INFO - codeparrot_training - Step 49415: {'lr': 1.7005039107539923e-07, 'samples': 25300992, 'steps': 49415, 'batch_loss/train': 0.7053587147966027} 12/29/2021 11:14:35 - INFO - codeparrot_training - Step 49416: {'lr': 1.6946958474914344e-07, 'samples': 25301504, 'steps': 49416, 'batch_loss/train': 0.7428724407218397} 12/29/2021 11:14:46 - INFO - codeparrot_training - Step 49417: {'lr': 1.688897716547455e-07, 'samples': 25302016, 'steps': 49417, 'batch_loss/train': 0.7824080670252442} 12/29/2021 11:14:57 - INFO - codeparrot_training - Step 49418: {'lr': 
1.6831095179450916e-07, 'samples': 25302528, 'steps': 49418, 'batch_loss/train': 0.783117407001555} 12/29/2021 11:15:07 - INFO - codeparrot_training - Step 49419: {'lr': 1.6773312517076588e-07, 'samples': 25303040, 'steps': 49419, 'batch_loss/train': 0.814361804514192} 12/29/2021 11:15:21 - INFO - codeparrot_training - Step 49420: {'lr': 1.671562917857361e-07, 'samples': 25303552, 'steps': 49420, 'batch_loss/train': 1.0590558536350727} 12/29/2021 11:15:32 - INFO - codeparrot_training - Step 49421: {'lr': 1.6658045164177904e-07, 'samples': 25304064, 'steps': 49421, 'batch_loss/train': 0.659408101812005} 12/29/2021 11:15:42 - INFO - codeparrot_training - Step 49422: {'lr': 1.6600560474117067e-07, 'samples': 25304576, 'steps': 49422, 'batch_loss/train': 0.7132363594137132} 12/29/2021 11:15:54 - INFO - codeparrot_training - Step 49423: {'lr': 1.6543175108618692e-07, 'samples': 25305088, 'steps': 49423, 'batch_loss/train': 0.714714182773605} 12/29/2021 11:16:05 - INFO - codeparrot_training - Step 49424: {'lr': 1.6485889067913151e-07, 'samples': 25305600, 'steps': 49424, 'batch_loss/train': 0.7315037841908634} 12/29/2021 11:16:16 - INFO - codeparrot_training - Step 49425: {'lr': 1.642870235222249e-07, 'samples': 25306112, 'steps': 49425, 'batch_loss/train': 0.7742658648639917} 12/29/2021 11:16:28 - INFO - codeparrot_training - Step 49426: {'lr': 1.6371614961779856e-07, 'samples': 25306624, 'steps': 49426, 'batch_loss/train': 0.8255768232047558} 12/29/2021 11:16:39 - INFO - codeparrot_training - Step 49427: {'lr': 1.631462689681007e-07, 'samples': 25307136, 'steps': 49427, 'batch_loss/train': 0.706240420229733} 12/29/2021 11:16:49 - INFO - codeparrot_training - Step 49428: {'lr': 1.6257738157540724e-07, 'samples': 25307648, 'steps': 49428, 'batch_loss/train': 0.5740001538069919} 12/29/2021 11:17:03 - INFO - codeparrot_training - Step 49429: {'lr': 1.620094874419664e-07, 'samples': 25308160, 'steps': 49429, 'batch_loss/train': 0.7050670827738941} 12/29/2021 11:17:14 - INFO 
- codeparrot_training - Step 49430: {'lr': 1.6144258657002642e-07, 'samples': 25308672, 'steps': 49430, 'batch_loss/train': 0.7517241379246116}
12/29/2021 11:17:25 - INFO - codeparrot_training - Step 49431: {'lr': 1.6087667896189095e-07, 'samples': 25309184, 'steps': 49431, 'batch_loss/train': 0.8087068067397922}
12/29/2021 11:17:35 - INFO - codeparrot_training - Step 49432: {'lr': 1.6031176461972496e-07, 'samples': 25309696, 'steps': 49432, 'batch_loss/train': 0.5709084101836197}
12/29/2021 11:17:47 - INFO - codeparrot_training - Step 49433: {'lr': 1.5974784354585992e-07, 'samples': 25310208, 'steps': 49433, 'batch_loss/train': 0.6447751699015498}
12/29/2021 11:17:58 - INFO - codeparrot_training - Step 49434: {'lr': 1.591849157424885e-07, 'samples': 25310720, 'steps': 49434, 'batch_loss/train': 0.7710631880909204}
12/29/2021 11:18:09 - INFO - codeparrot_training - Step 49435: {'lr': 1.5862298121185892e-07, 'samples': 25311232, 'steps': 49435, 'batch_loss/train': 0.7722279792651534}
12/29/2021 11:18:21 - INFO - codeparrot_training - Step 49436: {'lr': 1.5806203995621937e-07, 'samples': 25311744, 'steps': 49436, 'batch_loss/train': 0.7546388069167733}
12/29/2021 11:18:31 - INFO - codeparrot_training - Step 49437: {'lr': 1.575020919777903e-07, 'samples': 25312256, 'steps': 49437, 'batch_loss/train': 0.6500430857413448}
12/29/2021 11:18:42 - INFO - codeparrot_training - Step 49438: {'lr': 1.5694313727876442e-07, 'samples': 25312768, 'steps': 49438, 'batch_loss/train': 0.6954149744706228}
12/29/2021 11:18:56 - INFO - codeparrot_training - Step 49439: {'lr': 1.5638517586141764e-07, 'samples': 25313280, 'steps': 49439, 'batch_loss/train': 0.8153699939139187}
12/29/2021 11:19:07 - INFO - codeparrot_training - Step 49440: {'lr': 1.558282077279427e-07, 'samples': 25313792, 'steps': 49440, 'batch_loss/train': 0.6693235406419262}
12/29/2021 11:19:17 - INFO - codeparrot_training - Step 49441: {'lr': 1.5527223288056003e-07, 'samples': 25314304, 'steps': 49441, 'batch_loss/train': 0.7501918622292578}
12/29/2021 11:19:28 - INFO - codeparrot_training - Step 49442: {'lr': 1.5471725132149006e-07, 'samples': 25314816, 'steps': 49442, 'batch_loss/train': 0.7137623592279851}
12/29/2021 11:19:40 - INFO - codeparrot_training - Step 49443: {'lr': 1.541632630529255e-07, 'samples': 25315328, 'steps': 49443, 'batch_loss/train': 0.7006565070478246}
12/29/2021 11:19:51 - INFO - codeparrot_training - Step 49444: {'lr': 1.53610268077059e-07, 'samples': 25315840, 'steps': 49444, 'batch_loss/train': 0.6717297714203596}
12/29/2021 11:20:01 - INFO - codeparrot_training - Step 49445: {'lr': 1.5305826639608333e-07, 'samples': 25316352, 'steps': 49445, 'batch_loss/train': 0.759131315164268}
12/29/2021 11:20:15 - INFO - codeparrot_training - Step 49446: {'lr': 1.5250725801224663e-07, 'samples': 25316864, 'steps': 49446, 'batch_loss/train': 0.6787793486728333}
12/29/2021 11:20:26 - INFO - codeparrot_training - Step 49447: {'lr': 1.5195724292765833e-07, 'samples': 25317376, 'steps': 49447, 'batch_loss/train': 0.6822316944599152}
12/29/2021 11:20:37 - INFO - codeparrot_training - Step 49448: {'lr': 1.5140822114456664e-07, 'samples': 25317888, 'steps': 49448, 'batch_loss/train': 0.6122149908915162}
12/29/2021 11:20:49 - INFO - codeparrot_training - Step 49449: {'lr': 1.5086019266516428e-07, 'samples': 25318400, 'steps': 49449, 'batch_loss/train': 0.6578040085732937}
12/29/2021 11:20:59 - INFO - codeparrot_training - Step 49450: {'lr': 1.503131574915606e-07, 'samples': 25318912, 'steps': 49450, 'batch_loss/train': 0.6534743597730994}
12/29/2021 11:21:10 - INFO - codeparrot_training - Step 49451: {'lr': 1.4976711562600387e-07, 'samples': 25319424, 'steps': 49451, 'batch_loss/train': 0.7604107824154198}
12/29/2021 11:21:21 - INFO - codeparrot_training - Step 49452: {'lr': 1.4922206707063123e-07, 'samples': 25319936, 'steps': 49452, 'batch_loss/train': 0.69407067168504}
12/29/2021 11:21:33 - INFO - codeparrot_training - Step 49453: {'lr': 1.486780118275799e-07, 'samples': 25320448, 'steps': 49453, 'batch_loss/train': 0.6941923167323694}
12/29/2021 11:21:44 - INFO - codeparrot_training - Step 49454: {'lr': 1.4813494989907028e-07, 'samples': 25320960, 'steps': 49454, 'batch_loss/train': 0.7552373534999788}
12/29/2021 11:21:54 - INFO - codeparrot_training - Step 49455: {'lr': 1.4759288128723957e-07, 'samples': 25321472, 'steps': 49455, 'batch_loss/train': 0.6304348530247808}
12/29/2021 11:22:06 - INFO - codeparrot_training - Step 49456: {'lr': 1.4705180599422495e-07, 'samples': 25321984, 'steps': 49456, 'batch_loss/train': 0.7522726939059794}
12/29/2021 11:22:17 - INFO - codeparrot_training - Step 49457: {'lr': 1.4651172402219138e-07, 'samples': 25322496, 'steps': 49457, 'batch_loss/train': 0.7936521027004346}
12/29/2021 11:22:28 - INFO - codeparrot_training - Step 49458: {'lr': 1.4597263537330373e-07, 'samples': 25323008, 'steps': 49458, 'batch_loss/train': 0.8023766214028001}
12/29/2021 11:22:41 - INFO - codeparrot_training - Step 49459: {'lr': 1.454345400496715e-07, 'samples': 25323520, 'steps': 49459, 'batch_loss/train': 0.7183523122221231}
12/29/2021 11:22:52 - INFO - codeparrot_training - Step 49460: {'lr': 1.4489743805345956e-07, 'samples': 25324032, 'steps': 49460, 'batch_loss/train': 0.7130498946644366}
12/29/2021 11:23:03 - INFO - codeparrot_training - Step 49461: {'lr': 1.4436132938680512e-07, 'samples': 25324544, 'steps': 49461, 'batch_loss/train': 0.6563909306423739}
12/29/2021 11:23:15 - INFO - codeparrot_training - Step 49462: {'lr': 1.4382621405178986e-07, 'samples': 25325056, 'steps': 49462, 'batch_loss/train': 0.6723926754202694}
12/29/2021 11:23:25 - INFO - codeparrot_training - Step 49463: {'lr': 1.4329209205060646e-07, 'samples': 25325568, 'steps': 49463, 'batch_loss/train': 0.7357280743890442}
12/29/2021 11:23:36 - INFO - codeparrot_training - Step 49464: {'lr': 1.4275896338536432e-07, 'samples': 25326080, 'steps': 49464, 'batch_loss/train': 0.7774893429595977}
12/29/2021 11:23:47 - INFO - codeparrot_training - Step 49465: {'lr': 1.4222682805814514e-07, 'samples': 25326592, 'steps': 49465, 'batch_loss/train': 0.8593833111226559}
12/29/2021 11:23:59 - INFO - codeparrot_training - Step 49466: {'lr': 1.4169568607108607e-07, 'samples': 25327104, 'steps': 49466, 'batch_loss/train': 0.7449297029525042}
12/29/2021 11:24:09 - INFO - codeparrot_training - Step 49467: {'lr': 1.4116553742632433e-07, 'samples': 25327616, 'steps': 49467, 'batch_loss/train': 0.6647104159928858}
12/29/2021 11:24:20 - INFO - codeparrot_training - Step 49468: {'lr': 1.4063638212594154e-07, 'samples': 25328128, 'steps': 49468, 'batch_loss/train': 0.6319048891309649}
12/29/2021 11:24:34 - INFO - codeparrot_training - Step 49469: {'lr': 1.401082201720194e-07, 'samples': 25328640, 'steps': 49469, 'batch_loss/train': 0.6162313409149647}
12/29/2021 11:24:45 - INFO - codeparrot_training - Step 49470: {'lr': 1.395810515666951e-07, 'samples': 25329152, 'steps': 49470, 'batch_loss/train': 0.6715942923910916}
12/29/2021 11:24:55 - INFO - codeparrot_training - Step 49471: {'lr': 1.3905487631207804e-07, 'samples': 25329664, 'steps': 49471, 'batch_loss/train': 0.6367541195359081}
12/29/2021 11:25:08 - INFO - codeparrot_training - Step 49472: {'lr': 1.3852969441022212e-07, 'samples': 25330176, 'steps': 49472, 'batch_loss/train': 0.7237378912977874}
12/29/2021 11:25:19 - INFO - codeparrot_training - Step 49473: {'lr': 1.3800550586320903e-07, 'samples': 25330688, 'steps': 49473, 'batch_loss/train': 0.5741105217020959}
12/29/2021 11:25:29 - INFO - codeparrot_training - Step 49474: {'lr': 1.374823106731482e-07, 'samples': 25331200, 'steps': 49474, 'batch_loss/train': 0.7197969313710928}
12/29/2021 11:25:43 - INFO - codeparrot_training - Step 49475: {'lr': 1.3696010884214903e-07, 'samples': 25331712, 'steps': 49475, 'batch_loss/train': 0.5326621285639703}
12/29/2021 11:25:54 - INFO - codeparrot_training - Step 49476: {'lr': 1.3643890037220997e-07, 'samples': 25332224, 'steps': 49476, 'batch_loss/train': 0.6955746505409479}
12/29/2021 11:26:04 - INFO - codeparrot_training - Step 49477: {'lr': 1.3591868526546813e-07, 'samples': 25332736, 'steps': 49477, 'batch_loss/train': 0.6451351579162292}
12/29/2021 11:26:15 - INFO - codeparrot_training - Step 49478: {'lr': 1.3539946352394972e-07, 'samples': 25333248, 'steps': 49478, 'batch_loss/train': 0.8061218615621328}
12/29/2021 11:26:27 - INFO - codeparrot_training - Step 49479: {'lr': 1.3488123514973637e-07, 'samples': 25333760, 'steps': 49479, 'batch_loss/train': 0.6636226193513721}
12/29/2021 11:26:38 - INFO - codeparrot_training - Step 49480: {'lr': 1.3436400014490979e-07, 'samples': 25334272, 'steps': 49480, 'batch_loss/train': 0.745873102452606}
12/29/2021 11:26:48 - INFO - codeparrot_training - Step 49481: {'lr': 1.338477585114961e-07, 'samples': 25334784, 'steps': 49481, 'batch_loss/train': 0.7477726712822914}
12/29/2021 11:27:01 - INFO - codeparrot_training - Step 49482: {'lr': 1.3333251025154924e-07, 'samples': 25335296, 'steps': 49482, 'batch_loss/train': 0.6536796437576413}
12/29/2021 11:27:12 - INFO - codeparrot_training - Step 49483: {'lr': 1.3281825536715087e-07, 'samples': 25335808, 'steps': 49483, 'batch_loss/train': 0.6791275087743998}
12/29/2021 11:27:22 - INFO - codeparrot_training - Step 49484: {'lr': 1.3230499386029938e-07, 'samples': 25336320, 'steps': 49484, 'batch_loss/train': 0.7592086037620902}
12/29/2021 11:27:33 - INFO - codeparrot_training - Step 49485: {'lr': 1.3179272573307645e-07, 'samples': 25336832, 'steps': 49485, 'batch_loss/train': 0.6213183160871267}
12/29/2021 11:27:47 - INFO - codeparrot_training - Step 49486: {'lr': 1.3128145098748045e-07, 'samples': 25337344, 'steps': 49486, 'batch_loss/train': 0.7046960042789578}
12/29/2021 11:27:58 - INFO - codeparrot_training - Step 49487: {'lr': 1.3077116962556533e-07, 'samples': 25337856, 'steps': 49487, 'batch_loss/train': 0.7695923852152191}
12/29/2021 11:28:08 - INFO - codeparrot_training - Step 49488: {'lr': 1.3026188164935727e-07, 'samples': 25338368, 'steps': 49488, 'batch_loss/train': 0.6867910162545741}
12/29/2021 11:28:21 - INFO - codeparrot_training - Step 49489: {'lr': 1.2975358706088237e-07, 'samples': 25338880, 'steps': 49489, 'batch_loss/train': 0.8079290213063359}
12/29/2021 11:28:31 - INFO - codeparrot_training - Step 49490: {'lr': 1.292462858621668e-07, 'samples': 25339392, 'steps': 49490, 'batch_loss/train': 0.7379980306141078}
12/29/2021 11:28:42 - INFO - codeparrot_training - Step 49491: {'lr': 1.2873997805520897e-07, 'samples': 25339904, 'steps': 49491, 'batch_loss/train': 0.7196735041216016}
12/29/2021 11:28:54 - INFO - codeparrot_training - Step 49492: {'lr': 1.2823466364203507e-07, 'samples': 25340416, 'steps': 49492, 'batch_loss/train': 0.5771764968521893}
12/29/2021 11:29:05 - INFO - codeparrot_training - Step 49493: {'lr': 1.277303426246712e-07, 'samples': 25340928, 'steps': 49493, 'batch_loss/train': 0.7566359769552946}
12/29/2021 11:29:15 - INFO - codeparrot_training - Step 49494: {'lr': 1.2722701500508805e-07, 'samples': 25341440, 'steps': 49494, 'batch_loss/train': 0.7744531026110053}
12/29/2021 11:29:26 - INFO - codeparrot_training - Step 49495: {'lr': 1.2672468078528398e-07, 'samples': 25341952, 'steps': 49495, 'batch_loss/train': 0.6923135067336261}
12/29/2021 11:29:38 - INFO - codeparrot_training - Step 49496: {'lr': 1.2622333996731294e-07, 'samples': 25342464, 'steps': 49496, 'batch_loss/train': 0.8919157697819173}
12/29/2021 11:29:49 - INFO - codeparrot_training - Step 49497: {'lr': 1.2572299255309006e-07, 'samples': 25342976, 'steps': 49497, 'batch_loss/train': 0.6724886740557849}
12/29/2021 11:29:59 - INFO - codeparrot_training - Step 49498: {'lr': 1.2522363854466923e-07, 'samples': 25343488, 'steps': 49498, 'batch_loss/train': 0.771542236674577}
12/29/2021 11:30:13 - INFO - codeparrot_training - Step 49499: {'lr': 1.247252779440211e-07, 'samples': 25344000, 'steps': 49499, 'batch_loss/train': 0.6994452304206789}
12/29/2021 11:30:24 - INFO - codeparrot_training - Step 49500: {'lr': 1.2422791075308859e-07, 'samples': 25344512, 'steps': 49500, 'batch_loss/train': 0.646261372603476}
12/29/2021 11:30:35 - INFO - codeparrot_training - Step 49501: {'lr': 1.2373153697389783e-07, 'samples': 25345024, 'steps': 49501, 'batch_loss/train': 0.6219487031921744}
12/29/2021 11:30:47 - INFO - codeparrot_training - Step 49502: {'lr': 1.2323615660836396e-07, 'samples': 25345536, 'steps': 49502, 'batch_loss/train': 0.715386288240552}
12/29/2021 11:30:57 - INFO - codeparrot_training - Step 49503: {'lr': 1.227417696585409e-07, 'samples': 25346048, 'steps': 49503, 'batch_loss/train': 0.8089624703861773}
12/29/2021 11:31:08 - INFO - codeparrot_training - Step 49504: {'lr': 1.22248376126316e-07, 'samples': 25346560, 'steps': 49504, 'batch_loss/train': 0.7450744090601802}
12/29/2021 11:31:22 - INFO - codeparrot_training - Step 49505: {'lr': 1.217559760136877e-07, 'samples': 25347072, 'steps': 49505, 'batch_loss/train': 0.7793391146697104}
12/29/2021 11:31:32 - INFO - codeparrot_training - Step 49506: {'lr': 1.212645693225989e-07, 'samples': 25347584, 'steps': 49506, 'batch_loss/train': 0.7417790032923222}
12/29/2021 11:31:43 - INFO - codeparrot_training - Step 49507: {'lr': 1.2077415605502017e-07, 'samples': 25348096, 'steps': 49507, 'batch_loss/train': 0.728390634059906}
12/29/2021 11:31:54 - INFO - codeparrot_training - Step 49508: {'lr': 1.202847362128945e-07, 'samples': 25348608, 'steps': 49508, 'batch_loss/train': 0.6607215607073158}
12/29/2021 11:32:06 - INFO - codeparrot_training - Step 49509: {'lr': 1.1979630979816468e-07, 'samples': 25349120, 'steps': 49509, 'batch_loss/train': 0.7270897105336189}
12/29/2021 11:32:17 - INFO - codeparrot_training - Step 49510: {'lr': 1.1930887681280146e-07, 'samples': 25349632, 'steps': 49510, 'batch_loss/train': 0.7292198287323117}
12/29/2021 11:32:27 - INFO - codeparrot_training - Step 49511: {'lr': 1.1882243725866437e-07, 'samples': 25350144, 'steps': 49511, 'batch_loss/train': 0.7227557888254523}
12/29/2021 11:32:39 - INFO - codeparrot_training - Step 49512: {'lr': 1.1833699113777963e-07, 'samples': 25350656, 'steps': 49512, 'batch_loss/train': 0.7297208178788424}
12/29/2021 11:32:50 - INFO - codeparrot_training - Step 49513: {'lr': 1.1785253845200683e-07, 'samples': 25351168, 'steps': 49513, 'batch_loss/train': 0.8224280881695449}
12/29/2021 11:33:00 - INFO - codeparrot_training - Step 49514: {'lr': 1.1736907920331663e-07, 'samples': 25351680, 'steps': 49514, 'batch_loss/train': 0.7529900409281254}
12/29/2021 11:33:14 - INFO - codeparrot_training - Step 49515: {'lr': 1.168866133935964e-07, 'samples': 25352192, 'steps': 49515, 'batch_loss/train': 0.7552061164751649}
12/29/2021 11:33:25 - INFO - codeparrot_training - Step 49516: {'lr': 1.164051410248168e-07, 'samples': 25352704, 'steps': 49516, 'batch_loss/train': 0.7672306629829109}
12/29/2021 11:33:36 - INFO - codeparrot_training - Step 49517: {'lr': 1.1592466209883745e-07, 'samples': 25353216, 'steps': 49517, 'batch_loss/train': 0.7942891838029027}
12/29/2021 11:33:48 - INFO - codeparrot_training - Step 49518: {'lr': 1.1544517661757347e-07, 'samples': 25353728, 'steps': 49518, 'batch_loss/train': 0.8280387255363166}
12/29/2021 11:33:58 - INFO - codeparrot_training - Step 49519: {'lr': 1.1496668458296777e-07, 'samples': 25354240, 'steps': 49519, 'batch_loss/train': 0.77610575966537}
12/29/2021 11:34:09 - INFO - codeparrot_training - Step 49520: {'lr': 1.1448918599690772e-07, 'samples': 25354752, 'steps': 49520, 'batch_loss/train': 0.7334063942544162}
12/29/2021 11:34:20 - INFO - codeparrot_training - Step 49521: {'lr': 1.1401268086125294e-07, 'samples': 25355264, 'steps': 49521, 'batch_loss/train': 0.6780309833120555}
12/29/2021 11:34:33 - INFO - codeparrot_training - Step 49522: {'lr': 1.1353716917794632e-07, 'samples': 25355776, 'steps': 49522, 'batch_loss/train': 0.6684462444391102}
12/29/2021 11:34:43 - INFO - codeparrot_training - Step 49523: {'lr': 1.1306265094887525e-07, 'samples': 25356288, 'steps': 49523, 'batch_loss/train': 0.8014104254543781}
12/29/2021 11:34:54 - INFO - codeparrot_training - Step 49524: {'lr': 1.1258912617589933e-07, 'samples': 25356800, 'steps': 49524, 'batch_loss/train': 0.73196901823394}
12/29/2021 11:35:08 - INFO - codeparrot_training - Step 49525: {'lr': 1.1211659486093373e-07, 'samples': 25357312, 'steps': 49525, 'batch_loss/train': 0.4339001253247261}
12/29/2021 11:35:19 - INFO - codeparrot_training - Step 49526: {'lr': 1.1164505700581029e-07, 'samples': 25357824, 'steps': 49526, 'batch_loss/train': 0.7407907778397202}
12/29/2021 11:35:29 - INFO - codeparrot_training - Step 49527: {'lr': 1.1117451261247191e-07, 'samples': 25358336, 'steps': 49527, 'batch_loss/train': 0.7401115242391825}
12/29/2021 11:35:40 - INFO - codeparrot_training - Step 49528: {'lr': 1.107049616827227e-07, 'samples': 25358848, 'steps': 49528, 'batch_loss/train': 0.6894469894468784}
12/29/2021 11:35:52 - INFO - codeparrot_training - Step 49529: {'lr': 1.1023640421845005e-07, 'samples': 25359360, 'steps': 49529, 'batch_loss/train': 0.6172813335433602}
12/29/2021 11:36:03 - INFO - codeparrot_training - Step 49530: {'lr': 1.0976884022154132e-07, 'samples': 25359872, 'steps': 49530, 'batch_loss/train': 0.8472817139700055}
12/29/2021 11:36:13 - INFO - codeparrot_training - Step 49531: {'lr': 1.093022696938284e-07, 'samples': 25360384, 'steps': 49531, 'batch_loss/train': 0.766892016865313}
12/29/2021 11:36:25 - INFO - codeparrot_training - Step 49532: {'lr': 1.0883669263717089e-07, 'samples': 25360896, 'steps': 49532, 'batch_loss/train': 0.8292896212078631}
12/29/2021 11:36:36 - INFO - codeparrot_training - Step 49533: {'lr': 1.0837210905342843e-07, 'samples': 25361408, 'steps': 49533, 'batch_loss/train': 0.78414296079427}
12/29/2021 11:36:47 - INFO - codeparrot_training - Step 49534: {'lr': 1.0790851894443287e-07, 'samples': 25361920, 'steps': 49534, 'batch_loss/train': 0.682243843562901}
12/29/2021 11:36:59 - INFO - codeparrot_training - Step 49535: {'lr': 1.0744592231204387e-07, 'samples': 25362432, 'steps': 49535, 'batch_loss/train': 0.742088760714978}
12/29/2021 11:37:10 - INFO - codeparrot_training - Step 49536: {'lr': 1.0698431915809325e-07, 'samples': 25362944, 'steps': 49536, 'batch_loss/train': 0.5523319074127357}
12/29/2021 11:37:20 - INFO - codeparrot_training - Step 49537: {'lr': 1.0652370948441293e-07, 'samples': 25363456, 'steps': 49537, 'batch_loss/train': 0.725419987575151}
12/29/2021 11:37:34 - INFO - codeparrot_training - Step 49538: {'lr': 1.0606409329283473e-07, 'samples': 25363968, 'steps': 49538, 'batch_loss/train': 0.6494230413809419}
12/29/2021 11:37:45 - INFO - codeparrot_training - Step 49539: {'lr': 1.056054705852183e-07, 'samples': 25364480, 'steps': 49539, 'batch_loss/train': 0.6909134443849325}
12/29/2021 11:37:55 - INFO - codeparrot_training - Step 49540: {'lr': 1.0514784136333999e-07, 'samples': 25364992, 'steps': 49540, 'batch_loss/train': 0.6451571751385927}
12/29/2021 11:38:06 - INFO - codeparrot_training - Step 49541: {'lr': 1.0469120562903168e-07, 'samples': 25365504, 'steps': 49541, 'batch_loss/train': 0.6509219277650118}
12/29/2021 11:38:18 - INFO - codeparrot_training - Step 49542: {'lr': 1.042355633841252e-07, 'samples': 25366016, 'steps': 49542, 'batch_loss/train': 0.6547350576147437}
12/29/2021 11:38:29 - INFO - codeparrot_training - Step 49543: {'lr': 1.0378091463039697e-07, 'samples': 25366528, 'steps': 49543, 'batch_loss/train': 0.6795052766101435}
12/29/2021 11:38:39 - INFO - codeparrot_training - Step 49544: {'lr': 1.0332725936970654e-07, 'samples': 25367040, 'steps': 49544, 'batch_loss/train': 0.7256547319702804}
12/29/2021 11:38:53 - INFO - codeparrot_training - Step 49545: {'lr': 1.0287459760380257e-07, 'samples': 25367552, 'steps': 49545, 'batch_loss/train': 0.7089067851193249}
12/29/2021 11:39:04 - INFO - codeparrot_training - Step 49546: {'lr': 1.0242292933454467e-07, 'samples': 25368064, 'steps': 49546, 'batch_loss/train': 0.6988802945706993}
12/29/2021 11:39:15 - INFO - codeparrot_training - Step 49547: {'lr': 1.0197225456368142e-07, 'samples': 25368576, 'steps': 49547, 'batch_loss/train': 0.6660681343637407}
12/29/2021 11:39:27 - INFO - codeparrot_training - Step 49548: {'lr': 1.0152257329301695e-07, 'samples': 25369088, 'steps': 49548, 'batch_loss/train': 0.6504021314904094}
12/29/2021 11:39:38 - INFO - codeparrot_training - Step 49549: {'lr': 1.0107388552432761e-07, 'samples': 25369600, 'steps': 49549, 'batch_loss/train': 0.7751597492024302}
12/29/2021 11:39:48 - INFO - codeparrot_training - Step 49550: {'lr': 1.0062619125941752e-07, 'samples': 25370112, 'steps': 49550, 'batch_loss/train': 0.7980882069095969}
12/29/2021 11:39:59 - INFO - codeparrot_training - Step 49551: {'lr': 1.0017949050006303e-07, 'samples': 25370624, 'steps': 49551, 'batch_loss/train': 0.6961646601557732}
12/29/2021 11:40:11 - INFO - codeparrot_training - Step 49552: {'lr': 9.97337832480405e-08, 'samples': 25371136, 'steps': 49552, 'batch_loss/train': 0.6593493702821434}
12/29/2021 11:40:22 - INFO - codeparrot_training - Step 49553: {'lr': 9.928906950512628e-08, 'samples': 25371648, 'steps': 49553, 'batch_loss/train': 0.739242687355727}
12/29/2021 11:40:32 - INFO - codeparrot_training - Step 49554: {'lr': 9.884534927306899e-08, 'samples': 25372160, 'steps': 49554, 'batch_loss/train': 0.7050925400108099}
12/29/2021 11:40:46 - INFO - codeparrot_training - Step 49555: {'lr': 9.840262255364496e-08, 'samples': 25372672, 'steps': 49555, 'batch_loss/train': 0.89733827393502}
12/29/2021 11:40:57 - INFO - codeparrot_training - Step 49556: {'lr': 9.796088934860281e-08, 'samples': 25373184, 'steps': 49556, 'batch_loss/train': 0.6405396135523915}
12/29/2021 11:41:07 - INFO - codeparrot_training - Step 49557: {'lr': 9.752014965971889e-08, 'samples': 25373696, 'steps': 49557, 'batch_loss/train': 1.6777953170239925}
12/29/2021 11:41:20 - INFO - codeparrot_training - Step 49558: {'lr': 9.708040348874181e-08, 'samples': 25374208, 'steps': 49558, 'batch_loss/train': 0.7302074246108532}
12/29/2021 11:41:30 - INFO - codeparrot_training - Step 49559: {'lr': 9.66416508373924e-08, 'samples': 25374720, 'steps': 49559, 'batch_loss/train': 0.6598636899143457}
12/29/2021 11:41:41 - INFO - codeparrot_training - Step 49560: {'lr': 9.620389170744703e-08, 'samples': 25375232, 'steps': 49560, 'batch_loss/train': 0.6797405779361725}
12/29/2021 11:41:52 - INFO - codeparrot_training - Step 49561: {'lr': 9.57671261006543e-08, 'samples': 25375744, 'steps': 49561, 'batch_loss/train': 0.7071912558749318}
12/29/2021 11:42:06 - INFO - codeparrot_training - Step 49562: {'lr': 9.53313540187073e-08, 'samples': 25376256, 'steps': 49562, 'batch_loss/train': 0.6957278363406658}
12/29/2021 11:42:17 - INFO - codeparrot_training - Step 49563: {'lr': 9.489657546335462e-08, 'samples': 25376768, 'steps': 49563, 'batch_loss/train': 0.7194976601749659}
12/29/2021 11:42:27 - INFO - codeparrot_training - Step 49564: {'lr': 9.446279043634488e-08, 'samples': 25377280, 'steps': 49564, 'batch_loss/train': 0.8517761959228665}
12/29/2021 11:42:39 - INFO - codeparrot_training - Step 49565: {'lr': 9.40299989393989e-08, 'samples': 25377792, 'steps': 49565, 'batch_loss/train': 0.7493004580028355}
12/29/2021 11:42:50 - INFO - codeparrot_training - Step 49566: {'lr': 9.359820097423755e-08, 'samples': 25378304, 'steps': 49566, 'batch_loss/train': 0.6885150247253478}
12/29/2021 11:43:00 - INFO - codeparrot_training - Step 49567: {'lr': 9.31673965425539e-08, 'samples': 25378816, 'steps': 49567, 'batch_loss/train': 0.6080987232271582}
12/29/2021 11:43:13 - INFO - codeparrot_training - Step 49568: {'lr': 9.273758564606882e-08, 'samples': 25379328, 'steps': 49568, 'batch_loss/train': 0.7256855154410005}
12/29/2021 11:43:23 - INFO - codeparrot_training - Step 49569: {'lr': 9.230876828650314e-08, 'samples': 25379840, 'steps': 49569, 'batch_loss/train': 0.6355132830794901}
12/29/2021 11:43:34 - INFO - codeparrot_training - Step 49570: {'lr': 9.188094446554995e-08, 'samples': 25380352, 'steps': 49570, 'batch_loss/train': 0.7238946449942887}
12/29/2021 11:43:45 - INFO - codeparrot_training - Step 49571: {'lr': 9.14541141849301e-08, 'samples': 25380864, 'steps': 49571, 'batch_loss/train': 0.7426099944859743}
12/29/2021 11:43:57 - INFO - codeparrot_training - Step 49572: {'lr': 9.102827744630892e-08, 'samples': 25381376, 'steps': 49572, 'batch_loss/train': 0.7617309633642435}
12/29/2021 11:44:07 - INFO - codeparrot_training - Step 49573: {'lr': 9.060343425140727e-08, 'samples': 25381888, 'steps': 49573, 'batch_loss/train': 0.7322879899293184}
12/29/2021 11:44:18 - INFO - codeparrot_training - Step 49574: {'lr': 9.017958460191821e-08, 'samples': 25382400, 'steps': 49574, 'batch_loss/train': 0.7136316429823637}
12/29/2021 11:44:32 - INFO - codeparrot_training - Step 49575: {'lr': 8.975672849950711e-08, 'samples': 25382912, 'steps': 49575, 'batch_loss/train': 0.679483360494487}
12/29/2021 11:44:42 - INFO - codeparrot_training - Step 49576: {'lr': 8.933486594583929e-08, 'samples': 25383424, 'steps': 49576, 'batch_loss/train': 0.76906053267885}
12/29/2021 11:44:53 - INFO - codeparrot_training - Step 49577: {'lr': 8.891399694263558e-08, 'samples': 25383936, 'steps': 49577, 'batch_loss/train': 0.6558454008772969}
12/29/2021 11:45:05 - INFO - codeparrot_training - Step 49578: {'lr': 8.849412149153358e-08, 'samples': 25384448, 'steps': 49578, 'batch_loss/train': 0.793184407055378}
12/29/2021 11:45:16 - INFO - codeparrot_training - Step 49579: {'lr': 8.807523959422636e-08, 'samples': 25384960, 'steps': 49579, 'batch_loss/train': 0.7962601878680289}
12/29/2021 11:45:27 - INFO - codeparrot_training - Step 49580: {'lr': 8.765735125237928e-08, 'samples': 25385472, 'steps': 49580, 'batch_loss/train': 0.6469507440924644}
12/29/2021 11:45:39 - INFO - codeparrot_training - Step 49581: {'lr': 8.724045646760215e-08, 'samples': 25385984, 'steps': 49581, 'batch_loss/train': 0.7186060715466738}
12/29/2021 11:45:49 - INFO - codeparrot_training - Step 49582: {'lr': 8.682455524164357e-08, 'samples': 25386496, 'steps': 49582, 'batch_loss/train': 0.6766341091133654}
12/29/2021 11:46:00 - INFO - codeparrot_training - Step 49583: {'lr': 8.64096475760856e-08, 'samples': 25387008, 'steps': 49583, 'batch_loss/train': 0.8088371022604406}
12/29/2021 11:46:10 - INFO - codeparrot_training - Step 49584: {'lr': 8.59957334725936e-08, 'samples': 25387520, 'steps': 49584, 'batch_loss/train': 0.7476252615451813}
12/29/2021 11:46:24 - INFO - codeparrot_training - Step 49585: {'lr': 8.558281293280512e-08, 'samples': 25388032, 'steps': 49585, 'batch_loss/train': 0.7336369496770203}
12/29/2021 11:46:35 - INFO - codeparrot_training - Step 49586: {'lr': 8.517088595841328e-08, 'samples': 25388544, 'steps': 49586, 'batch_loss/train': 0.6409126111539081}
12/29/2021 11:46:46 - INFO - codeparrot_training - Step 49587: {'lr': 8.47599525510001e-08, 'samples': 25389056, 'steps': 49587, 'batch_loss/train': 0.7862218976952136}
12/29/2021 11:46:58 - INFO - codeparrot_training - Step 49588: {'lr': 8.435001271220321e-08, 'samples': 25389568, 'steps': 49588, 'batch_loss/train': 0.7286938913166523}
12/29/2021 11:47:09 - INFO - codeparrot_training - Step 49589: {'lr': 8.394106644368793e-08, 'samples': 25390080, 'steps': 49589, 'batch_loss/train': 0.7032261975109577}
12/29/2021 11:47:19 - INFO - codeparrot_training - Step 49590: {'lr': 8.353311374703631e-08, 'samples': 25390592, 'steps': 49590, 'batch_loss/train': 0.8818223725538701}
12/29/2021 11:47:33 - INFO - codeparrot_training - Step 49591: {'lr': 8.312615462391371e-08, 'samples': 25391104, 'steps': 49591, 'batch_loss/train': 0.7237468645907938}
12/29/2021 11:47:44 - INFO - codeparrot_training - Step 49592: {'lr': 8.272018907590217e-08, 'samples': 25391616, 'steps': 49592, 'batch_loss/train': 0.7750374283641577}
12/29/2021 11:47:55 - INFO - codeparrot_training - Step 49593: {'lr': 8.23152171046393e-08, 'samples': 25392128, 'steps': 49593, 'batch_loss/train': 0.43191522103734314}
12/29/2021 11:48:05 - INFO - codeparrot_training - Step 49594: {'lr': 8.191123871170714e-08, 'samples': 25392640, 'steps': 49594, 'batch_loss/train': 0.7795270672068}
12/29/2021 11:48:17 - INFO - codeparrot_training - Step 49595: {'lr': 8.150825389874329e-08, 'samples': 25393152, 'steps': 49595, 'batch_loss/train': 0.7602381691103801}
12/29/2021 11:48:28 - INFO - codeparrot_training - Step 49596: {'lr': 8.11062626673298e-08, 'samples': 25393664, 'steps': 49596, 'batch_loss/train': 0.6088513672584668}
12/29/2021 11:48:39 - INFO - codeparrot_training - Step 49597: {'lr': 8.070526501910425e-08, 'samples': 25394176, 'steps': 49597, 'batch_loss/train': 0.6480611888691783}
12/29/2021 11:48:52 - INFO - codeparrot_training - Step 49598: {'lr': 8.030526095559321e-08, 'samples': 25394688, 'steps': 49598, 'batch_loss/train': 0.7377682197839022}
12/29/2021 11:49:03 - INFO - codeparrot_training - Step 49599: {'lr': 7.990625047843425e-08, 'samples': 25395200, 'steps': 49599, 'batch_loss/train': 0.6021611816831864}
12/29/2021 11:49:13 - INFO - codeparrot_training - Step 49600: {'lr': 7.950823358920944e-08, 'samples': 25395712, 'steps': 49600, 'batch_loss/train': 0.9592564338818192}
12/29/2021 11:49:26 - INFO - codeparrot_training - Step 49601: {'lr': 7.91112102894731e-08, 'samples': 25396224, 'steps': 49601, 'batch_loss/train': 0.9022452780045569}
12/29/2021 11:49:36 - INFO - codeparrot_training - Step 49602: {'lr': 7.871518058083505e-08, 'samples': 25396736, 'steps': 49602, 'batch_loss/train': 0.697508743731305}
12/29/2021 11:49:47 - INFO - codeparrot_training - Step 49603: {'lr': 7.832014446487734e-08, 'samples': 25397248, 'steps': 49603, 'batch_loss/train': 0.7545348154380918}
12/29/2021 11:49:58 - INFO - codeparrot_training - Step 49604: {'lr': 7.792610194312655e-08, 'samples': 25397760, 'steps': 49604, 'batch_loss/train': 0.7953517017886043}
12/29/2021 11:50:12 - INFO - codeparrot_training - Step 49605: {'lr': 7.75330530171925e-08, 'samples': 25398272, 'steps': 49605, 'batch_loss/train': 0.7500478969886899}
12/29/2021 11:50:23 - INFO - codeparrot_training - Step 49606: {'lr': 7.714099768860173e-08, 'samples': 25398784, 'steps': 49606, 'batch_loss/train': 0.7482149042189121}
12/29/2021 11:50:33 - INFO - codeparrot_training - Step 49607: {'lr': 7.674993595893631e-08, 'samples': 25399296, 'steps': 49607, 'batch_loss/train': 0.7361537497490644}
12/29/2021 11:50:46 - INFO - codeparrot_training - Step 49608: {'lr': 7.635986782975057e-08, 'samples': 25399808, 'steps': 49608, 'batch_loss/train': 0.5892592868767679}
12/29/2021 11:50:56 - INFO - codeparrot_training - Step 49609: {'lr': 7.597079330259882e-08, 'samples': 25400320, 'steps': 49609, 'batch_loss/train': 0.6115807008463889}
12/29/2021 11:51:07 - INFO - codeparrot_training - Step 49610: {'lr': 7.55827123790076e-08, 'samples': 25400832, 'steps': 49610, 'batch_loss/train': 0.694832494482398}
12/29/2021 11:51:19 - INFO - codeparrot_training - Step 49611: {'lr': 7.519562506053123e-08, 'samples': 25401344, 'steps': 49611, 'batch_loss/train': 0.8008061992004514}
12/29/2021 11:51:30 - INFO - codeparrot_training - Step 49612: {'lr': 7.480953134872403e-08, 'samples': 25401856, 'steps': 49612, 'batch_loss/train': 0.646824948489666}
12/29/2021 11:51:40 - INFO - codeparrot_training - Step 49613: {'lr': 7.44244312450848e-08, 'samples': 25402368, 'steps': 49613, 'batch_loss/train': 0.7063311755191535}
12/29/2021 11:51:51 - INFO - codeparrot_training - Step 49614: {'lr': 7.404032475116784e-08, 'samples': 25402880, 'steps': 49614, 'batch_loss/train': 0.6999884301330894}
12/29/2021 11:52:05 - INFO - codeparrot_training - Step 49615: {'lr': 7.365721186849972e-08, 'samples': 25403392, 'steps': 49615, 'batch_loss/train': 0.7215190113056451}
12/29/2021 11:52:16 - INFO - codeparrot_training - Step 49616: {'lr': 7.327509259860699e-08, 'samples': 25403904, 'steps': 49616, 'batch_loss/train': 0.5379191164392978}
12/29/2021 11:52:26 - INFO - codeparrot_training - Step 49617: {'lr': 7.289396694301619e-08, 'samples': 25404416, 'steps': 49617, 'batch_loss/train': 0.6745675606653094}
12/29/2021 11:52:39 - INFO - codeparrot_training - Step 49618: {'lr': 7.251383490322616e-08, 'samples': 25404928, 'steps': 49618, 'batch_loss/train': 0.7130672893836163}
12/29/2021 11:52:49 - INFO - codeparrot_training - Step 49619: {'lr': 7.213469648073568e-08, 'samples': 25405440, 'steps': 49619, 'batch_loss/train': 0.8232782595441677}
12/29/2021 11:53:00 - INFO - codeparrot_training - Step 49620: {'lr': 7.17565516770713e-08, 'samples': 25405952, 'steps': 49620, 'batch_loss/train': 0.48499456510762684}
12/29/2021 11:53:14 - INFO - codeparrot_training - Step 49621: {'lr': 7.137940049373181e-08, 'samples': 25406464, 'steps': 49621, 'batch_loss/train': 0.6733599696308374}
12/29/2021 11:53:25 - INFO - codeparrot_training - Step 49622: {'lr': 7.100324293224381e-08, 'samples': 25406976, 'steps': 49622, 'batch_loss/train': 0.6658220151439309}
12/29/2021 11:53:35 - INFO - codeparrot_training - Step 49623: {'lr': 7.062807899405054e-08, 'samples': 25407488, 'steps': 49623, 'batch_loss/train': 0.7648122692480683}
12/29/2021 11:53:46 - INFO - codeparrot_training - Step 49624: {'lr': 7.02539086806786e-08, 'samples': 25408000, 'steps': 49624, 'batch_loss/train': 0.7740770210511982}
12/29/2021 11:53:58 - INFO - codeparrot_training - Step 49625: {'lr': 6.988073199362676e-08, 'samples': 25408512, 'steps': 49625, 'batch_loss/train': 0.7464604061096907}
12/29/2021 11:54:09 - INFO - codeparrot_training - Step 49626: {'lr': 6.950854893433833e-08, 'samples': 25409024, 'steps': 49626, 'batch_loss/train': 0.7284859456121922}
12/29/2021 11:54:19 - INFO - codeparrot_training - Step 49627: {'lr': 6.91373595043121e-08, 'samples': 25409536, 'steps': 49627, 'batch_loss/train': 0.6322208163328469}
12/29/2021 11:54:33 - INFO - codeparrot_training - Step 49628: {'lr': 6.876716370501912e-08, 'samples': 25410048, 'steps': 49628, 'batch_loss/train': 0.7556168041191995}
12/29/2021 11:54:44 - INFO - codeparrot_training - Step 49629: {'lr': 6.839796153795819e-08, 'samples': 25410560, 'steps': 49629, 'batch_loss/train': 0.7303334046155214}
12/29/2021 11:54:54 - INFO - codeparrot_training - Step 49630: {'lr': 6.802975300457259e-08, 'samples': 25411072, 'steps': 49630, 'batch_loss/train': 0.724764708429575}
12/29/2021 11:55:07 - INFO - codeparrot_training - Step 49631: {'lr': 6.766253810630563e-08, 'samples': 25411584, 'steps': 49631, 'batch_loss/train': 0.6999527304433286}
12/29/2021 11:55:17 - INFO - codeparrot_training - Step 49632: {'lr': 6.729631684468385e-08, 'samples': 25412096, 'steps': 49632, 'batch_loss/train': 0.5927771246060729}
12/29/2021 11:55:28 - INFO - codeparrot_training - Step 49633: {'lr': 6.693108922109503e-08, 'samples': 25412608, 'steps': 49633, 'batch_loss/train': 0.7596683679148555}
12/29/2021 11:55:38 - INFO - codeparrot_training - Step 49634: {'lr': 6.656685523701023e-08, 'samples': 25413120, 'steps': 49634, 'batch_loss/train': 0.609788021305576}
12/29/2021 11:55:51 - INFO - codeparrot_training - Step 49635: {'lr': 6.620361489390048e-08, 'samples': 25413632, 'steps': 49635, 'batch_loss/train': 0.8224725777981803}
12/29/2021 11:56:01 - INFO - codeparrot_training - Step 49636: {'lr': 6.584136819318132e-08, 'samples': 25414144, 'steps': 49636, 'batch_loss/train': 0.7134951394982636}
12/29/2021 11:56:12 - INFO - codeparrot_training - Step 49637: {'lr': 6.548011513629604e-08, 'samples': 25414656, 'steps': 49637, 'batch_loss/train': 0.5631029354408383}
12/29/2021 11:56:24 - INFO - codeparrot_training - Step 49638: {'lr': 6.511985572471568e-08, 'samples': 25415168, 'steps': 49638, 'batch_loss/train': 0.8111436204053462}
12/29/2021 11:56:35 - INFO - codeparrot_training - Step 49639: {'lr': 6.476058995982803e-08, 'samples': 25415680, 'steps': 49639, 'batch_loss/train': 0.7610357863595709}
12/29/2021 11:56:46 - INFO - codeparrot_training - Step 49640: {'lr': 6.440231784310413e-08, 'samples': 25416192, 'steps': 49640, 'batch_loss/train': 0.7641291497275233}
12/29/2021 11:56:58 - INFO - codeparrot_training - Step 49641: {'lr': 6.404503937593176e-08, 'samples': 25416704, 'steps': 49641, 'batch_loss/train': 0.7811222318559885}
12/29/2021 11:57:09 - INFO - codeparrot_training - Step 49642: {'lr': 6.368875455972644e-08, 'samples': 25417216, 'steps': 49642, 'batch_loss/train': 0.6142212462145835}
12/29/2021 11:57:19 - INFO - codeparrot_training - Step 49643: {'lr': 6.333346339595925e-08, 'samples': 25417728, 'steps': 49643, 'batch_loss/train': 0.7639573020860553}
12/29/2021 11:57:30 - INFO - codeparrot_training - Step 49644: {'lr': 6.297916588596242e-08, 'samples': 25418240, 'steps': 49644, 'batch_loss/train': 0.7423899965360761}
12/29/2021 11:57:44 - INFO - codeparrot_training - Step 49645: {'lr': 6.262586203120702e-08, 'samples': 25418752, 'steps': 49645, 'batch_loss/train': 0.7411622635554522}
12/29/2021 11:57:55 - INFO - codeparrot_training - Step 49646: {'lr': 6.227355183308081e-08, 'samples': 25419264, 'steps': 49646, 'batch_loss/train': 0.6993074878118932}
12/29/2021 11:58:05 - INFO - codeparrot_training - Step 49647: {'lr': 6.192223529299935e-08, 'samples': 25419776, 'steps': 49647, 'batch_loss/train': 0.7346445652656257}
12/29/2021 11:58:18 - INFO - codeparrot_training - Step 49648: {'lr': 6.157191241232263e-08, 'samples': 25420288, 'steps': 49648, 'batch_loss/train': 0.6586481961421669}
12/29/2021 11:58:28 - INFO - codeparrot_training - Step 49649: {'lr': 6.12225831924662e-08, 'samples': 25420800, 'steps': 49649, 'batch_loss/train': 0.7589275678619742}
12/29/2021 11:58:39 - INFO - codeparrot_training - Step 49650: {'lr': 6.087424763481786e-08, 'samples': 25421312, 'steps': 49650, 'batch_loss/train': 0.667423581937328}
12/29/2021 11:58:53 - INFO - codeparrot_training - Step 49651: {'lr': 6.05269057407376e-08, 'samples': 25421824, 'steps': 49651, 'batch_loss/train': 0.7596051967702806}
12/29/2021 11:59:03 - INFO - codeparrot_training - Step 49652: {'lr': 6.018055751166874e-08, 'samples': 25422336, 'steps': 49652, 'batch_loss/train': 0.6309612318873405}
12/29/2021 11:59:14 - INFO - codeparrot_training - Step 49653: {'lr': 5.983520294891576e-08, 'samples': 25422848, 'steps': 49653, 'batch_loss/train': 0.6898290319368243}
12/29/2021 11:59:26 - INFO - codeparrot_training - Step 49654: {'lr': 5.9490842053894214e-08, 'samples': 25423360, 'steps': 49654, 'batch_loss/train': 0.7469374206848443}
12/29/2021 11:59:37 - INFO - codeparrot_training - Step 49655: {'lr': 5.9147474827964124e-08, 'samples': 25423872, 'steps': 49655, 'batch_loss/train': 0.6061354537378065}
12/29/2021 11:59:48 - INFO - codeparrot_training - Step 49656: {'lr': 5.880510127248551e-08, 'samples': 25424384, 'steps': 49656, 'batch_loss/train': 0.7654200857505202}
12/29/2021 11:59:58 - INFO - codeparrot_training - Step 49657: {'lr': 5.8463721388846145e-08, 'samples': 25424896, 'steps': 49657, 'batch_loss/train': 0.7726735966280103}
12/29/2021 12:00:10 - INFO - codeparrot_training - Step 49658: {'lr': 5.812333517835056e-08, 'samples': 25425408, 'steps': 49658, 'batch_loss/train': 0.6599284969270229}
12/29/2021 12:00:21 - INFO - codeparrot_training - Step 49659: {'lr': 5.778394264241427e-08, 'samples': 25425920, 'steps': 49659, 'batch_loss/train': 0.7511524786241353}
12/29/2021 12:00:32 - INFO - codeparrot_training - Step 49660: {'lr': 5.744554378234179e-08, 'samples': 25426432, 'steps': 49660, 'batch_loss/train': 0.7472877791151404}
12/29/2021 12:00:45 - INFO - codeparrot_training - Step 49661: {'lr': 5.7108138599465395e-08, 'samples': 25426944, 'steps': 49661, 'batch_loss/train': 0.7670057555660605}
12/29/2021 12:00:56 - INFO - codeparrot_training - Step 49662: {'lr': 5.677172709517286e-08, 'samples': 25427456, 'steps': 49662, 'batch_loss/train': 0.7627437822520733}
12/29/2021 12:01:07 - INFO - codeparrot_training - Step 49663: {'lr': 5.643630927079646e-08, 'samples': 25427968, 'steps': 49663, 'batch_loss/train': 0.7260546693578362}
12/29/2021 12:01:19 - INFO - codeparrot_training - Step 49664: {'lr': 5.610188512764069e-08, 'samples': 25428480, 'steps': 49664, 'batch_loss/train': 0.7251288839615881}
12/29/2021 12:01:30 - INFO - codeparrot_training - Step 49665: {'lr': 5.5768454667037836e-08, 'samples': 25428992, 'steps': 49665, 'batch_loss/train': 0.7233811691403389}
12/29/2021 12:01:40 - INFO - codeparrot_training - Step 49666: {'lr': 5.543601789032016e-08, 'samples': 25429504, 'steps': 49666, 'batch_loss/train': 0.8114846516400576}
12/29/2021 12:01:51 - INFO - codeparrot_training - Step 49667: {'lr': 5.510457479881992e-08, 'samples': 25430016, 'steps': 49667, 'batch_loss/train': 0.8869628788670525}
12/29/2021 12:02:05 - INFO - codeparrot_training - Step 49668: {'lr': 5.477412539386939e-08, 'samples': 25430528, 'steps': 49668, 'batch_loss/train': 0.7931468235328794}
12/29/2021 12:02:16 - INFO - codeparrot_training - Step 49669: {'lr': 5.444466967674533e-08, 'samples': 25431040, 'steps': 49669, 'batch_loss/train': 0.7321270368993282}
12/29/2021 12:02:27 - INFO - codeparrot_training - Step 49670: {'lr': 5.4116207648780005e-08, 'samples': 25431552, 'steps': 49670, 'batch_loss/train': 0.6287279720418155}
12/29/2021 12:02:39 - INFO - codeparrot_training - Step 49671: {'lr': 5.378873931125017e-08, 'samples': 25432064, 'steps': 49671, 'batch_loss/train': 0.7724486626684666}
12/29/2021 12:02:49 - INFO - codeparrot_training - Step 49672: {'lr': 5.346226466551585e-08, 'samples': 25432576, 'steps': 49672, 'batch_loss/train': 0.7977384862024337}
12/29/2021 12:03:00 - INFO - codeparrot_training - Step 49673: {'lr': 5.3136783712798284e-08, 'samples': 25433088, 'steps': 49673, 'batch_loss/train': 0.7019165107049048}
12/29/2021 12:03:12 - INFO - codeparrot_training - Step 49674: {'lr': 5.281229645445751e-08, 'samples': 25433600, 'steps': 49674, 'batch_loss/train': 0.7495476147159934}
12/29/2021 12:03:23 - INFO - codeparrot_training - Step 49675: {'lr': 5.2488802891770274e-08, 'samples': 25434112, 'steps': 49675,
'batch_loss/train': 0.6791813438758254} 12/29/2021 12:03:33 - INFO - codeparrot_training - Step 49676: {'lr': 5.2166303025985574e-08, 'samples': 25434624, 'steps': 49676, 'batch_loss/train': 0.7757380966795608} 12/29/2021 12:03:45 - INFO - codeparrot_training - Step 49677: {'lr': 5.1844796858435685e-08, 'samples': 25435136, 'steps': 49677, 'batch_loss/train': 0.849838551832363} 12/29/2021 12:03:56 - INFO - codeparrot_training - Step 49678: {'lr': 5.152428439034185e-08, 'samples': 25435648, 'steps': 49678, 'batch_loss/train': 0.6982199784833938} 12/29/2021 12:04:07 - INFO - codeparrot_training - Step 49679: {'lr': 5.120476562303633e-08, 'samples': 25436160, 'steps': 49679, 'batch_loss/train': 0.7025620574131608} 12/29/2021 12:04:17 - INFO - codeparrot_training - Step 49680: {'lr': 5.088624055776814e-08, 'samples': 25436672, 'steps': 49680, 'batch_loss/train': 0.6807236555032432} 12/29/2021 12:04:32 - INFO - codeparrot_training - Step 49681: {'lr': 5.0568709195786264e-08, 'samples': 25437184, 'steps': 49681, 'batch_loss/train': 0.9202449265867472} 12/29/2021 12:04:42 - INFO - codeparrot_training - Step 49682: {'lr': 5.0252171538367475e-08, 'samples': 25437696, 'steps': 49682, 'batch_loss/train': 0.5928808082244359} 12/29/2021 12:04:53 - INFO - codeparrot_training - Step 49683: {'lr': 4.993662758676076e-08, 'samples': 25438208, 'steps': 49683, 'batch_loss/train': 0.6534469211474061} 12/29/2021 12:05:05 - INFO - codeparrot_training - Step 49684: {'lr': 4.962207734224289e-08, 'samples': 25438720, 'steps': 49684, 'batch_loss/train': 0.6905312980525196} 12/29/2021 12:05:15 - INFO - codeparrot_training - Step 49685: {'lr': 4.93085208060351e-08, 'samples': 25439232, 'steps': 49685, 'batch_loss/train': 0.7639983743429184} 12/29/2021 12:05:26 - INFO - codeparrot_training - Step 49686: {'lr': 4.899595797941414e-08, 'samples': 25439744, 'steps': 49686, 'batch_loss/train': 0.670738557819277} 12/29/2021 12:05:37 - INFO - codeparrot_training - Step 49687: {'lr': 
4.868438886360127e-08, 'samples': 25440256, 'steps': 49687, 'batch_loss/train': 0.6533530564047396} 12/29/2021 12:05:49 - INFO - codeparrot_training - Step 49688: {'lr': 4.837381345984548e-08, 'samples': 25440768, 'steps': 49688, 'batch_loss/train': 0.7694872641004622} 12/29/2021 12:06:00 - INFO - codeparrot_training - Step 49689: {'lr': 4.806423176936803e-08, 'samples': 25441280, 'steps': 49689, 'batch_loss/train': 0.5938548998092301} 12/29/2021 12:06:10 - INFO - codeparrot_training - Step 49690: {'lr': 4.77556437934179e-08, 'samples': 25441792, 'steps': 49690, 'batch_loss/train': 0.7277667156886309} 12/29/2021 12:06:24 - INFO - codeparrot_training - Step 49691: {'lr': 4.7448049533216354e-08, 'samples': 25442304, 'steps': 49691, 'batch_loss/train': 0.689346119761467} 12/29/2021 12:06:35 - INFO - codeparrot_training - Step 49692: {'lr': 4.714144898995687e-08, 'samples': 25442816, 'steps': 49692, 'batch_loss/train': 0.7829747591167688} 12/29/2021 12:06:46 - INFO - codeparrot_training - Step 49693: {'lr': 4.683584216491621e-08, 'samples': 25443328, 'steps': 49693, 'batch_loss/train': 0.7128652790561318} 12/29/2021 12:06:58 - INFO - codeparrot_training - Step 49694: {'lr': 4.6531229059260104e-08, 'samples': 25443840, 'steps': 49694, 'batch_loss/train': 0.57119210367091} 12/29/2021 12:07:08 - INFO - codeparrot_training - Step 49695: {'lr': 4.6227609674209804e-08, 'samples': 25444352, 'steps': 49695, 'batch_loss/train': 0.7153498791158199} 12/29/2021 12:07:19 - INFO - codeparrot_training - Step 49696: {'lr': 4.592498401098655e-08, 'samples': 25444864, 'steps': 49696, 'batch_loss/train': 0.7389564514160156} 12/29/2021 12:07:33 - INFO - codeparrot_training - Step 49697: {'lr': 4.562335207081159e-08, 'samples': 25445376, 'steps': 49697, 'batch_loss/train': 0.7426752871833742} 12/29/2021 12:07:44 - INFO - codeparrot_training - Step 49698: {'lr': 4.53227138548229e-08, 'samples': 25445888, 'steps': 49698, 'batch_loss/train': 1.0261255977675319} 12/29/2021 12:07:55 - INFO - 
codeparrot_training - Step 49699: {'lr': 4.5023069364269475e-08, 'samples': 25446400, 'steps': 49699, 'batch_loss/train': 0.785298490896821} 12/29/2021 12:08:05 - INFO - codeparrot_training - Step 49700: {'lr': 4.4724418600317063e-08, 'samples': 25446912, 'steps': 49700, 'batch_loss/train': 0.879578722640872} 12/29/2021 12:08:17 - INFO - codeparrot_training - Step 49701: {'lr': 4.44267615641869e-08, 'samples': 25447424, 'steps': 49701, 'batch_loss/train': 0.576993859373033} 12/29/2021 12:08:28 - INFO - codeparrot_training - Step 49702: {'lr': 4.413009825701697e-08, 'samples': 25447936, 'steps': 49702, 'batch_loss/train': 0.740995617583394} 12/29/2021 12:08:38 - INFO - codeparrot_training - Step 49703: {'lr': 4.383442868000076e-08, 'samples': 25448448, 'steps': 49703, 'batch_loss/train': 0.5799486782052554} 12/29/2021 12:08:51 - INFO - codeparrot_training - Step 49704: {'lr': 4.353975283433176e-08, 'samples': 25448960, 'steps': 49704, 'batch_loss/train': 0.7129551068646833} 12/29/2021 12:09:02 - INFO - codeparrot_training - Step 49705: {'lr': 4.324607072114795e-08, 'samples': 25449472, 'steps': 49705, 'batch_loss/train': 0.7379651900264435} 12/29/2021 12:09:12 - INFO - codeparrot_training - Step 49706: {'lr': 4.295338234164281e-08, 'samples': 25449984, 'steps': 49706, 'batch_loss/train': 0.7283211858011782} 12/29/2021 12:09:23 - INFO - codeparrot_training - Step 49707: {'lr': 4.266168769698209e-08, 'samples': 25450496, 'steps': 49707, 'batch_loss/train': 0.6966181974858046} 12/29/2021 12:09:37 - INFO - codeparrot_training - Step 49708: {'lr': 4.237098678833151e-08, 'samples': 25451008, 'steps': 49708, 'batch_loss/train': 0.7587734026019461} 12/29/2021 12:09:47 - INFO - codeparrot_training - Step 49709: {'lr': 4.208127961682906e-08, 'samples': 25451520, 'steps': 49709, 'batch_loss/train': 0.7131918541272171} 12/29/2021 12:09:58 - INFO - codeparrot_training - Step 49710: {'lr': 4.179256618361271e-08, 'samples': 25452032, 'steps': 49710, 'batch_loss/train': 
0.6033388966461644} 12/29/2021 12:10:10 - INFO - codeparrot_training - Step 49711: {'lr': 4.1504846489848203e-08, 'samples': 25452544, 'steps': 49711, 'batch_loss/train': 0.7061135922558606} 12/29/2021 12:10:21 - INFO - codeparrot_training - Step 49712: {'lr': 4.121812053670126e-08, 'samples': 25453056, 'steps': 49712, 'batch_loss/train': 0.7490996681153774} 12/29/2021 12:10:31 - INFO - codeparrot_training - Step 49713: {'lr': 4.093238832528212e-08, 'samples': 25453568, 'steps': 49713, 'batch_loss/train': 0.5921073062345386} 12/29/2021 12:10:43 - INFO - codeparrot_training - Step 49714: {'lr': 4.064764985672875e-08, 'samples': 25454080, 'steps': 49714, 'batch_loss/train': 0.7108915616408922} 12/29/2021 12:10:54 - INFO - codeparrot_training - Step 49715: {'lr': 4.036390513217914e-08, 'samples': 25454592, 'steps': 49715, 'batch_loss/train': 0.7186199384741485} 12/29/2021 12:11:05 - INFO - codeparrot_training - Step 49716: {'lr': 4.008115415277125e-08, 'samples': 25455104, 'steps': 49716, 'batch_loss/train': 0.7651899391785264} 12/29/2021 12:11:17 - INFO - codeparrot_training - Step 49717: {'lr': 3.979939691961532e-08, 'samples': 25455616, 'steps': 49717, 'batch_loss/train': 0.727060048317071} 12/29/2021 12:11:27 - INFO - codeparrot_training - Step 49718: {'lr': 3.951863343382156e-08, 'samples': 25456128, 'steps': 49718, 'batch_loss/train': 0.8015848263166845} 12/29/2021 12:11:38 - INFO - codeparrot_training - Step 49719: {'lr': 3.9238863696555715e-08, 'samples': 25456640, 'steps': 49719, 'batch_loss/train': 0.711345680989325} 12/29/2021 12:11:49 - INFO - codeparrot_training - Step 49720: {'lr': 3.896008770884474e-08, 'samples': 25457152, 'steps': 49720, 'batch_loss/train': 0.7087754879612476} 12/29/2021 12:12:03 - INFO - codeparrot_training - Step 49721: {'lr': 3.868230547188212e-08, 'samples': 25457664, 'steps': 49721, 'batch_loss/train': 0.7249702184926718} 12/29/2021 12:12:13 - INFO - codeparrot_training - Step 49722: {'lr': 3.840551698672257e-08, 'samples': 
25458176, 'steps': 49722, 'batch_loss/train': 0.7261448707431555} 12/29/2021 12:12:24 - INFO - codeparrot_training - Step 49723: {'lr': 3.812972225450406e-08, 'samples': 25458688, 'steps': 49723, 'batch_loss/train': 0.6200693866703659} 12/29/2021 12:12:36 - INFO - codeparrot_training - Step 49724: {'lr': 3.785492127628132e-08, 'samples': 25459200, 'steps': 49724, 'batch_loss/train': 0.6569430567324162} 12/29/2021 12:12:47 - INFO - codeparrot_training - Step 49725: {'lr': 3.758111405316455e-08, 'samples': 25459712, 'steps': 49725, 'batch_loss/train': 0.6670534610748291} 12/29/2021 12:12:57 - INFO - codeparrot_training - Step 49726: {'lr': 3.730830058623624e-08, 'samples': 25460224, 'steps': 49726, 'batch_loss/train': 0.7770039159804583} 12/29/2021 12:13:09 - INFO - codeparrot_training - Step 49727: {'lr': 3.70364808766066e-08, 'samples': 25460736, 'steps': 49727, 'batch_loss/train': 0.6798737952485681} 12/29/2021 12:13:20 - INFO - codeparrot_training - Step 49728: {'lr': 3.676565492533035e-08, 'samples': 25461248, 'steps': 49728, 'batch_loss/train': 0.6673885294003412} 12/29/2021 12:13:30 - INFO - codeparrot_training - Step 49729: {'lr': 3.649582273348995e-08, 'samples': 25461760, 'steps': 49729, 'batch_loss/train': 0.7927574459463358} 12/29/2021 12:13:41 - INFO - codeparrot_training - Step 49730: {'lr': 3.622698430214011e-08, 'samples': 25462272, 'steps': 49730, 'batch_loss/train': 0.7916838591918349} 12/29/2021 12:13:55 - INFO - codeparrot_training - Step 49731: {'lr': 3.595913963239106e-08, 'samples': 25462784, 'steps': 49731, 'batch_loss/train': 0.6169229988008738} 12/29/2021 12:14:06 - INFO - codeparrot_training - Step 49732: {'lr': 3.56922887252975e-08, 'samples': 25463296, 'steps': 49732, 'batch_loss/train': 0.7542591774836183} 12/29/2021 12:14:16 - INFO - codeparrot_training - Step 49733: {'lr': 3.5426431581886407e-08, 'samples': 25463808, 'steps': 49733, 'batch_loss/train': 0.8271201085299253} 12/29/2021 12:14:29 - INFO - codeparrot_training - Step 49734: 
{'lr': 3.516156820324024e-08, 'samples': 25464320, 'steps': 49734, 'batch_loss/train': 0.7543727857992053} 12/29/2021 12:14:39 - INFO - codeparrot_training - Step 49735: {'lr': 3.48976985904137e-08, 'samples': 25464832, 'steps': 49735, 'batch_loss/train': 0.6904947212897241} 12/29/2021 12:14:50 - INFO - codeparrot_training - Step 49736: {'lr': 3.463482274443375e-08, 'samples': 25465344, 'steps': 49736, 'batch_loss/train': 0.6271153572015464} 12/29/2021 12:15:04 - INFO - codeparrot_training - Step 49737: {'lr': 3.4372940666382856e-08, 'samples': 25465856, 'steps': 49737, 'batch_loss/train': 0.833325375802815} 12/29/2021 12:15:15 - INFO - codeparrot_training - Step 49738: {'lr': 3.4112052357287985e-08, 'samples': 25466368, 'steps': 49738, 'batch_loss/train': 0.736074005253613} 12/29/2021 12:15:25 - INFO - codeparrot_training - Step 49739: {'lr': 3.3852157818176075e-08, 'samples': 25466880, 'steps': 49739, 'batch_loss/train': 0.7487575430423021} 12/29/2021 12:15:36 - INFO - codeparrot_training - Step 49740: {'lr': 3.35932570500741e-08, 'samples': 25467392, 'steps': 49740, 'batch_loss/train': 0.6517553944140673} 12/29/2021 12:15:48 - INFO - codeparrot_training - Step 49741: {'lr': 3.3335350054009004e-08, 'samples': 25467904, 'steps': 49741, 'batch_loss/train': 0.8283215723931789} 12/29/2021 12:15:59 - INFO - codeparrot_training - Step 49742: {'lr': 3.3078436831035506e-08, 'samples': 25468416, 'steps': 49742, 'batch_loss/train': 0.7795705180615187} 12/29/2021 12:16:09 - INFO - codeparrot_training - Step 49743: {'lr': 3.282251738215281e-08, 'samples': 25468928, 'steps': 49743, 'batch_loss/train': 0.6850054804235697} 12/29/2021 12:16:21 - INFO - codeparrot_training - Step 49744: {'lr': 3.256759170838786e-08, 'samples': 25469440, 'steps': 49744, 'batch_loss/train': 0.6723171072080731} 12/29/2021 12:16:32 - INFO - codeparrot_training - Step 49745: {'lr': 3.2313659810767617e-08, 'samples': 25469952, 'steps': 49745, 'batch_loss/train': 0.7322197295725346} 12/29/2021 12:16:42 
- INFO - codeparrot_training - Step 49746: {'lr': 3.2060721690263526e-08, 'samples': 25470464, 'steps': 49746, 'batch_loss/train': 0.6839122204110026} 12/29/2021 12:16:56 - INFO - codeparrot_training - Step 49747: {'lr': 3.180877734790255e-08, 'samples': 25470976, 'steps': 49747, 'batch_loss/train': 0.718670372501947} 12/29/2021 12:17:07 - INFO - codeparrot_training - Step 49748: {'lr': 3.1557826784683886e-08, 'samples': 25471488, 'steps': 49748, 'batch_loss/train': 0.6707949216943234} 12/29/2021 12:17:18 - INFO - codeparrot_training - Step 49749: {'lr': 3.130787000160673e-08, 'samples': 25472000, 'steps': 49749, 'batch_loss/train': 0.6456519733183086} 12/29/2021 12:17:30 - INFO - codeparrot_training - Step 49750: {'lr': 3.1058906999670286e-08, 'samples': 25472512, 'steps': 49750, 'batch_loss/train': 0.7694197921082377} 12/29/2021 12:17:41 - INFO - codeparrot_training - Step 49751: {'lr': 3.081093777987376e-08, 'samples': 25473024, 'steps': 49751, 'batch_loss/train': 0.5715851485729218} 12/29/2021 12:17:51 - INFO - codeparrot_training - Step 49752: {'lr': 3.0563962343160836e-08, 'samples': 25473536, 'steps': 49752, 'batch_loss/train': 0.7848498777020723} 12/29/2021 12:18:02 - INFO - codeparrot_training - Step 49753: {'lr': 3.031798069055847e-08, 'samples': 25474048, 'steps': 49753, 'batch_loss/train': 0.7951310742646456} 12/29/2021 12:18:14 - INFO - codeparrot_training - Step 49754: {'lr': 3.007299282301035e-08, 'samples': 25474560, 'steps': 49754, 'batch_loss/train': 0.7028756479267031} 12/29/2021 12:18:25 - INFO - codeparrot_training - Step 49755: {'lr': 2.982899874151568e-08, 'samples': 25475072, 'steps': 49755, 'batch_loss/train': 0.7638527452945709} 12/29/2021 12:18:35 - INFO - codeparrot_training - Step 49756: {'lr': 2.9585998447045904e-08, 'samples': 25475584, 'steps': 49756, 'batch_loss/train': 0.7317563388496637} 12/29/2021 12:18:47 - INFO - codeparrot_training - Step 49757: {'lr': 2.934399194054471e-08, 'samples': 25476096, 'steps': 49757, 
'batch_loss/train': 0.6885087681002915} 12/29/2021 12:18:58 - INFO - codeparrot_training - Step 49758: {'lr': 2.9102979223011307e-08, 'samples': 25476608, 'steps': 49758, 'batch_loss/train': 0.6888571493327618} 12/29/2021 12:19:09 - INFO - codeparrot_training - Step 49759: {'lr': 2.886296029536162e-08, 'samples': 25477120, 'steps': 49759, 'batch_loss/train': 0.7585279656341299} 12/29/2021 12:19:23 - INFO - codeparrot_training - Step 49760: {'lr': 2.8623935158539337e-08, 'samples': 25477632, 'steps': 49760, 'batch_loss/train': 0.6793496571481228} 12/29/2021 12:19:33 - INFO - codeparrot_training - Step 49761: {'lr': 2.8385903813571423e-08, 'samples': 25478144, 'steps': 49761, 'batch_loss/train': 0.7567462716251612} 12/29/2021 12:19:44 - INFO - codeparrot_training - Step 49762: {'lr': 2.8148866261318296e-08, 'samples': 25478656, 'steps': 49762, 'batch_loss/train': 0.6802740534767509} 12/29/2021 12:19:55 - INFO - codeparrot_training - Step 49763: {'lr': 2.7912822502779156e-08, 'samples': 25479168, 'steps': 49763, 'batch_loss/train': 0.7262102928943932} 12/29/2021 12:20:07 - INFO - codeparrot_training - Step 49764: {'lr': 2.7677772538869938e-08, 'samples': 25479680, 'steps': 49764, 'batch_loss/train': 0.7425803104415536} 12/29/2021 12:20:17 - INFO - codeparrot_training - Step 49765: {'lr': 2.7443716370534334e-08, 'samples': 25480192, 'steps': 49765, 'batch_loss/train': 0.775152828427963} 12/29/2021 12:20:28 - INFO - codeparrot_training - Step 49766: {'lr': 2.7210653998660517e-08, 'samples': 25480704, 'steps': 49766, 'batch_loss/train': 0.7143416912294924} 12/29/2021 12:20:43 - INFO - codeparrot_training - Step 49767: {'lr': 2.697858542424769e-08, 'samples': 25481216, 'steps': 49767, 'batch_loss/train': 0.7238801941275597} 12/29/2021 12:20:54 - INFO - codeparrot_training - Step 49768: {'lr': 2.6747510648184037e-08, 'samples': 25481728, 'steps': 49768, 'batch_loss/train': 0.8666967460885644} 12/29/2021 12:21:04 - INFO - codeparrot_training - Step 49769: {'lr': 
2.6517429671357728e-08, 'samples': 25482240, 'steps': 49769, 'batch_loss/train': 1.2756245932541788} 12/29/2021 12:21:15 - INFO - codeparrot_training - Step 49770: {'lr': 2.6288342494740215e-08, 'samples': 25482752, 'steps': 49770, 'batch_loss/train': 0.5218506024684757} 12/29/2021 12:21:27 - INFO - codeparrot_training - Step 49771: {'lr': 2.6060249119191915e-08, 'samples': 25483264, 'steps': 49771, 'batch_loss/train': 0.7792106028646231} 12/29/2021 12:21:38 - INFO - codeparrot_training - Step 49772: {'lr': 2.583314954565652e-08, 'samples': 25483776, 'steps': 49772, 'batch_loss/train': 0.6263595604104921} 12/29/2021 12:21:48 - INFO - codeparrot_training - Step 49773: {'lr': 2.560704377502221e-08, 'samples': 25484288, 'steps': 49773, 'batch_loss/train': 0.6629399599332828} 12/29/2021 12:22:00 - INFO - codeparrot_training - Step 49774: {'lr': 2.538193180817716e-08, 'samples': 25484800, 'steps': 49774, 'batch_loss/train': 0.6576391342096031} 12/29/2021 12:22:11 - INFO - codeparrot_training - Step 49775: {'lr': 2.515781364603731e-08, 'samples': 25485312, 'steps': 49775, 'batch_loss/train': 1.001668369397521} 12/29/2021 12:22:22 - INFO - codeparrot_training - Step 49776: {'lr': 2.4934689289463075e-08, 'samples': 25485824, 'steps': 49776, 'batch_loss/train': 0.7322180375922471} 12/29/2021 12:22:36 - INFO - codeparrot_training - Step 49777: {'lr': 2.471255873939815e-08, 'samples': 25486336, 'steps': 49777, 'batch_loss/train': 0.7744163451716304} 12/29/2021 12:22:47 - INFO - codeparrot_training - Step 49778: {'lr': 2.4491421996647446e-08, 'samples': 25486848, 'steps': 49778, 'batch_loss/train': 0.6941306000808254} 12/29/2021 12:22:57 - INFO - codeparrot_training - Step 49779: {'lr': 2.427127906215465e-08, 'samples': 25487360, 'steps': 49779, 'batch_loss/train': 1.2193021569401026} 12/29/2021 12:23:08 - INFO - codeparrot_training - Step 49780: {'lr': 2.4052129936780188e-08, 'samples': 25487872, 'steps': 49780, 'batch_loss/train': 0.8616187418811023} 12/29/2021 12:23:20 - 
INFO - codeparrot_training - Step 49781: {'lr': 2.3833974621356724e-08, 'samples': 25488384, 'steps': 49781, 'batch_loss/train': 0.7220078082755208} 12/29/2021 12:23:31 - INFO - codeparrot_training - Step 49782: {'lr': 2.3616813116800194e-08, 'samples': 25488896, 'steps': 49782, 'batch_loss/train': 0.7451942176558077} 12/29/2021 12:23:41 - INFO - codeparrot_training - Step 49783: {'lr': 2.3400645423971024e-08, 'samples': 25489408, 'steps': 49783, 'batch_loss/train': 0.5317088805604726} 12/29/2021 12:23:55 - INFO - codeparrot_training - Step 49784: {'lr': 2.3185471543701876e-08, 'samples': 25489920, 'steps': 49784, 'batch_loss/train': 0.7454059803858399} 12/29/2021 12:24:06 - INFO - codeparrot_training - Step 49785: {'lr': 2.2971291476853173e-08, 'samples': 25490432, 'steps': 49785, 'batch_loss/train': 0.6687683411873877} 12/29/2021 12:24:17 - INFO - codeparrot_training - Step 49786: {'lr': 2.2758105224285343e-08, 'samples': 25490944, 'steps': 49786, 'batch_loss/train': 0.8331255288794637} 12/29/2021 12:24:29 - INFO - codeparrot_training - Step 49787: {'lr': 2.25459127868588e-08, 'samples': 25491456, 'steps': 49787, 'batch_loss/train': 0.7069140719249845} 12/29/2021 12:24:39 - INFO - codeparrot_training - Step 49788: {'lr': 2.2334714165378467e-08, 'samples': 25491968, 'steps': 49788, 'batch_loss/train': 0.7899830408859998} 12/29/2021 12:24:50 - INFO - codeparrot_training - Step 49789: {'lr': 2.212450936070476e-08, 'samples': 25492480, 'steps': 49789, 'batch_loss/train': 0.7206742335110903} 12/29/2021 12:25:00 - INFO - codeparrot_training - Step 49790: {'lr': 2.19152983736981e-08, 'samples': 25492992, 'steps': 49790, 'batch_loss/train': 0.6773557984270155} 12/29/2021 12:25:13 - INFO - codeparrot_training - Step 49791: {'lr': 2.1707081205163403e-08, 'samples': 25493504, 'steps': 49791, 'batch_loss/train': 0.7208636375144124} 12/29/2021 12:25:23 - INFO - codeparrot_training - Step 49792: {'lr': 2.1499857855933336e-08, 'samples': 25494016, 'steps': 49792, 
'batch_loss/train': 0.7639092265162617} 12/29/2021 12:25:34 - INFO - codeparrot_training - Step 49793: {'lr': 2.129362832681281e-08, 'samples': 25494528, 'steps': 49793, 'batch_loss/train': 0.7507009556284174} 12/29/2021 12:25:46 - INFO - codeparrot_training - Step 49794: {'lr': 2.1088392618662243e-08, 'samples': 25495040, 'steps': 49794, 'batch_loss/train': 0.739140747115016} 12/29/2021 12:25:57 - INFO - codeparrot_training - Step 49795: {'lr': 2.08841507322588e-08, 'samples': 25495552, 'steps': 49795, 'batch_loss/train': 0.6845108885318041} 12/29/2021 12:26:07 - INFO - codeparrot_training - Step 49796: {'lr': 2.068090266843514e-08, 'samples': 25496064, 'steps': 49796, 'batch_loss/train': 0.6831740622874349} 12/29/2021 12:26:21 - INFO - codeparrot_training - Step 49797: {'lr': 2.0478648428023938e-08, 'samples': 25496576, 'steps': 49797, 'batch_loss/train': 0.7651861147023737} 12/29/2021 12:26:32 - INFO - codeparrot_training - Step 49798: {'lr': 2.027738801177459e-08, 'samples': 25497088, 'steps': 49798, 'batch_loss/train': 0.7260711374692619} 12/29/2021 12:26:42 - INFO - codeparrot_training - Step 49799: {'lr': 2.007712142051976e-08, 'samples': 25497600, 'steps': 49799, 'batch_loss/train': 0.7384539092890918} 12/29/2021 12:26:55 - INFO - codeparrot_training - Step 49800: {'lr': 1.9877848655036613e-08, 'samples': 25498112, 'steps': 49800, 'batch_loss/train': 0.6336557921022177} 12/29/2021 12:27:05 - INFO - codeparrot_training - Step 49801: {'lr': 1.9679569716130053e-08, 'samples': 25498624, 'steps': 49801, 'batch_loss/train': 0.6944420458748937} 12/29/2021 12:27:16 - INFO - codeparrot_training - Step 49802: {'lr': 1.9482284604604994e-08, 'samples': 25499136, 'steps': 49802, 'batch_loss/train': 0.7718196995556355} 12/29/2021 12:27:27 - INFO - codeparrot_training - Step 49803: {'lr': 1.9285993321238594e-08, 'samples': 25499648, 'steps': 49803, 'batch_loss/train': 0.664853454567492} 12/29/2021 12:27:39 - INFO - codeparrot_training - Step 49804: {'lr': 
1.909069586678025e-08, 'samples': 25500160, 'steps': 49804, 'batch_loss/train': 0.5688439593650401} 12/29/2021 12:27:50 - INFO - codeparrot_training - Step 49805: {'lr': 1.8896392242034875e-08, 'samples': 25500672, 'steps': 49805, 'batch_loss/train': 0.702287164516747} 12/29/2021 12:28:00 - INFO - codeparrot_training - Step 49806: {'lr': 1.8703082447751875e-08, 'samples': 25501184, 'steps': 49806, 'batch_loss/train': 0.8333139475435019} 12/29/2021 12:28:14 - INFO - codeparrot_training - Step 49807: {'lr': 1.851076648473615e-08, 'samples': 25501696, 'steps': 49807, 'batch_loss/train': 0.7025415340904146} 12/29/2021 12:28:25 - INFO - codeparrot_training - Step 49808: {'lr': 1.8319444353737115e-08, 'samples': 25502208, 'steps': 49808, 'batch_loss/train': 0.6437043760088272} 12/29/2021 12:28:35 - INFO - codeparrot_training - Step 49809: {'lr': 1.8129116055504157e-08, 'samples': 25502720, 'steps': 49809, 'batch_loss/train': 0.720721585676074} 12/29/2021 12:28:46 - INFO - codeparrot_training - Step 49810: {'lr': 1.7939781590786687e-08, 'samples': 25503232, 'steps': 49810, 'batch_loss/train': 0.6914129708893597} 12/29/2021 12:28:58 - INFO - codeparrot_training - Step 49811: {'lr': 1.7751440960361852e-08, 'samples': 25503744, 'steps': 49811, 'batch_loss/train': 0.7588369161821902} 12/29/2021 12:29:09 - INFO - codeparrot_training - Step 49812: {'lr': 1.756409416497906e-08, 'samples': 25504256, 'steps': 49812, 'batch_loss/train': 0.6611596918664873} 12/29/2021 12:29:20 - INFO - codeparrot_training - Step 49813: {'lr': 1.7377741205359954e-08, 'samples': 25504768, 'steps': 49813, 'batch_loss/train': 0.6679522152990103} 12/29/2021 12:29:34 - INFO - codeparrot_training - Step 49814: {'lr': 1.7192382082253932e-08, 'samples': 25505280, 'steps': 49814, 'batch_loss/train': 0.7427371232770383} 12/29/2021 12:29:44 - INFO - codeparrot_training - Step 49815: {'lr': 1.7008016796410398e-08, 'samples': 25505792, 'steps': 49815, 'batch_loss/train': 0.6895394108723849} 12/29/2021 12:29:55 - 
INFO - codeparrot_training - Step 49816: {'lr': 1.6824645348550993e-08, 'samples': 25506304, 'steps': 49816, 'batch_loss/train': 0.7652262065093964} 12/29/2021 12:30:07 - INFO - codeparrot_training - Step 49817: {'lr': 1.6642267739397364e-08, 'samples': 25506816, 'steps': 49817, 'batch_loss/train': 0.804439929779619} 12/29/2021 12:30:18 - INFO - codeparrot_training - Step 49818: {'lr': 1.6460883969698915e-08, 'samples': 25507328, 'steps': 49818, 'batch_loss/train': 0.725324543658644} 12/29/2021 12:30:28 - INFO - codeparrot_training - Step 49819: {'lr': 1.628049404014953e-08, 'samples': 25507840, 'steps': 49819, 'batch_loss/train': 0.6955778123810887} 12/29/2021 12:30:40 - INFO - codeparrot_training - Step 49820: {'lr': 1.610109795149861e-08, 'samples': 25508352, 'steps': 49820, 'batch_loss/train': 0.7272875341586769} 12/29/2021 12:30:51 - INFO - codeparrot_training - Step 49821: {'lr': 1.5922695704412294e-08, 'samples': 25508864, 'steps': 49821, 'batch_loss/train': 0.5618532483931631} 12/29/2021 12:31:02 - INFO - codeparrot_training - Step 49822: {'lr': 1.5745287299639975e-08, 'samples': 25509376, 'steps': 49822, 'batch_loss/train': 0.5989105913322419} 12/29/2021 12:31:12 - INFO - codeparrot_training - Step 49823: {'lr': 1.556887273787555e-08, 'samples': 25509888, 'steps': 49823, 'batch_loss/train': 0.6162973502650857} 12/29/2021 12:31:24 - INFO - codeparrot_training - Step 49824: {'lr': 1.5393452019812904e-08, 'samples': 25510400, 'steps': 49824, 'batch_loss/train': 0.811881048779469} 12/29/2021 12:31:35 - INFO - codeparrot_training - Step 49825: {'lr': 1.5219025146145927e-08, 'samples': 25510912, 'steps': 49825, 'batch_loss/train': 0.6841723259421997} 12/29/2021 12:31:46 - INFO - codeparrot_training - Step 49826: {'lr': 1.5045592117596264e-08, 'samples': 25511424, 'steps': 49826, 'batch_loss/train': 0.6634762440808117} 12/29/2021 12:32:00 - INFO - codeparrot_training - Step 49827: {'lr': 1.4873152934830047e-08, 'samples': 25511936, 'steps': 49827, 
'batch_loss/train': 0.7171554123051465} 12/29/2021 12:32:10 - INFO - codeparrot_training - Step 49828: {'lr': 1.470170759854117e-08, 'samples': 25512448, 'steps': 49828, 'batch_loss/train': 0.7569931368343532} 12/29/2021 12:32:21 - INFO - codeparrot_training - Step 49829: {'lr': 1.4531256109395763e-08, 'samples': 25512960, 'steps': 49829, 'batch_loss/train': 0.8454291627276689} 12/29/2021 12:32:33 - INFO - codeparrot_training - Step 49830: {'lr': 1.4361798468087716e-08, 'samples': 25513472, 'steps': 49830, 'batch_loss/train': 0.6952474664431065} 12/29/2021 12:32:44 - INFO - codeparrot_training - Step 49831: {'lr': 1.4193334675283164e-08, 'samples': 25513984, 'steps': 49831, 'batch_loss/train': 0.7410897798836231} 12/29/2021 12:32:54 - INFO - codeparrot_training - Step 49832: {'lr': 1.4025864731675997e-08, 'samples': 25514496, 'steps': 49832, 'batch_loss/train': 0.7304212525486946} 12/29/2021 12:33:05 - INFO - codeparrot_training - Step 49833: {'lr': 1.3859388637876835e-08, 'samples': 25515008, 'steps': 49833, 'batch_loss/train': 0.729501839261502} 12/29/2021 12:33:17 - INFO - codeparrot_training - Step 49834: {'lr': 1.3693906394607325e-08, 'samples': 25515520, 'steps': 49834, 'batch_loss/train': 0.6994355034548789} 12/29/2021 12:33:28 - INFO - codeparrot_training - Step 49835: {'lr': 1.3529418002505845e-08, 'samples': 25516032, 'steps': 49835, 'batch_loss/train': 0.78305643517524} 12/29/2021 12:33:38 - INFO - codeparrot_training - Step 49836: {'lr': 1.3365923462183015e-08, 'samples': 25516544, 'steps': 49836, 'batch_loss/train': 0.7357468730770051} 12/29/2021 12:33:52 - INFO - codeparrot_training - Step 49837: {'lr': 1.3203422774360485e-08, 'samples': 25517056, 'steps': 49837, 'batch_loss/train': 0.6766395317390561} 12/29/2021 12:34:03 - INFO - codeparrot_training - Step 49838: {'lr': 1.3041915939648874e-08, 'samples': 25517568, 'steps': 49838, 'batch_loss/train': 0.6528021135600284} 12/29/2021 12:34:13 - INFO - codeparrot_training - Step 49839: {'lr': 
1.2881402958686561e-08, 'samples': 25518080, 'steps': 49839, 'batch_loss/train': 0.6304772743023932} 12/29/2021 12:34:26 - INFO - codeparrot_training - Step 49840: {'lr': 1.272188383208417e-08, 'samples': 25518592, 'steps': 49840, 'batch_loss/train': 0.6651781462132931} 12/29/2021 12:34:36 - INFO - codeparrot_training - Step 49841: {'lr': 1.2563358560535587e-08, 'samples': 25519104, 'steps': 49841, 'batch_loss/train': 0.765004638582468} 12/29/2021 12:34:47 - INFO - codeparrot_training - Step 49842: {'lr': 1.2405827144651438e-08, 'samples': 25519616, 'steps': 49842, 'batch_loss/train': 0.7789105498231947} 12/29/2021 12:34:57 - INFO - codeparrot_training - Step 49843: {'lr': 1.2249289585042345e-08, 'samples': 25520128, 'steps': 49843, 'batch_loss/train': 0.7060432913713157} 12/29/2021 12:35:12 - INFO - codeparrot_training - Step 49844: {'lr': 1.2093745882346684e-08, 'samples': 25520640, 'steps': 49844, 'batch_loss/train': 0.6563885288778692} 12/29/2021 12:35:22 - INFO - codeparrot_training - Step 49845: {'lr': 1.1939196037147326e-08, 'samples': 25521152, 'steps': 49845, 'batch_loss/train': 1.6233989698812366} 12/29/2021 12:35:33 - INFO - codeparrot_training - Step 49846: {'lr': 1.1785640050110403e-08, 'samples': 25521664, 'steps': 49846, 'batch_loss/train': 0.7303674668073654} 12/29/2021 12:35:45 - INFO - codeparrot_training - Step 49847: {'lr': 1.163307792181878e-08, 'samples': 25522176, 'steps': 49847, 'batch_loss/train': 0.8477260316722095} 12/29/2021 12:35:56 - INFO - codeparrot_training - Step 49848: {'lr': 1.1481509652883082e-08, 'samples': 25522688, 'steps': 49848, 'batch_loss/train': 0.7645454509183764} 12/29/2021 12:36:06 - INFO - codeparrot_training - Step 49849: {'lr': 1.1330935243886176e-08, 'samples': 25523200, 'steps': 49849, 'batch_loss/train': 0.72147445846349} 12/29/2021 12:36:18 - INFO - codeparrot_training - Step 49850: {'lr': 1.118135469546644e-08, 'samples': 25523712, 'steps': 49850, 'batch_loss/train': 0.7230130634270608} 12/29/2021 12:36:29 - 
INFO - codeparrot_training - Step 49851: {'lr': 1.1032768008178984e-08, 'samples': 25524224, 'steps': 49851, 'batch_loss/train': 0.6383392107672989} 12/29/2021 12:36:40 - INFO - codeparrot_training - Step 49852: {'lr': 1.0885175182662188e-08, 'samples': 25524736, 'steps': 49852, 'batch_loss/train': 0.7492544691776857} 12/29/2021 12:36:50 - INFO - codeparrot_training - Step 49853: {'lr': 1.0738576219443408e-08, 'samples': 25525248, 'steps': 49853, 'batch_loss/train': 0.6937068431871012} 12/29/2021 12:37:04 - INFO - codeparrot_training - Step 49854: {'lr': 1.0592971119161022e-08, 'samples': 25525760, 'steps': 49854, 'batch_loss/train': 0.7298724427819252} 12/29/2021 12:37:15 - INFO - codeparrot_training - Step 49855: {'lr': 1.0448359882342385e-08, 'samples': 25526272, 'steps': 49855, 'batch_loss/train': 0.7417085864581168} 12/29/2021 12:37:25 - INFO - codeparrot_training - Step 49856: {'lr': 1.030474250959812e-08, 'samples': 25526784, 'steps': 49856, 'batch_loss/train': 0.7198866007383913} 12/29/2021 12:37:37 - INFO - codeparrot_training - Step 49857: {'lr': 1.0162119001511094e-08, 'samples': 25527296, 'steps': 49857, 'batch_loss/train': 0.8586070858873427} 12/29/2021 12:37:48 - INFO - codeparrot_training - Step 49858: {'lr': 1.0020489358608665e-08, 'samples': 25527808, 'steps': 49858, 'batch_loss/train': 0.6739072455093265} 12/29/2021 12:37:59 - INFO - codeparrot_training - Step 49859: {'lr': 9.879853581473696e-09, 'samples': 25528320, 'steps': 49859, 'batch_loss/train': 0.6380928571743425} 12/29/2021 12:38:11 - INFO - codeparrot_training - Step 49860: {'lr': 9.740211670661303e-09, 'samples': 25528832, 'steps': 49860, 'batch_loss/train': 0.6396625279448926} 12/29/2021 12:38:21 - INFO - codeparrot_training - Step 49861: {'lr': 9.601563626726595e-09, 'samples': 25529344, 'steps': 49861, 'batch_loss/train': 0.6533124377019703} 12/29/2021 12:38:32 - INFO - codeparrot_training - Step 49862: {'lr': 9.463909450224683e-09, 'samples': 25529856, 'steps': 49862, 
'batch_loss/train': 0.7405005963519216} 12/29/2021 12:38:46 - INFO - codeparrot_training - Step 49863: {'lr': 9.327249141710681e-09, 'samples': 25530368, 'steps': 49863, 'batch_loss/train': 0.6517513596918434} 12/29/2021 12:38:57 - INFO - codeparrot_training - Step 49864: {'lr': 9.191582701711943e-09, 'samples': 25530880, 'steps': 49864, 'batch_loss/train': 0.665884930989705} 12/29/2021 12:39:08 - INFO - codeparrot_training - Step 49865: {'lr': 9.05691013078358e-09, 'samples': 25531392, 'steps': 49865, 'batch_loss/train': 0.6557984943501651} 12/29/2021 12:39:18 - INFO - codeparrot_training - Step 49866: {'lr': 8.92323142945295e-09, 'samples': 25531904, 'steps': 49866, 'batch_loss/train': 0.5791574292816222} 12/29/2021 12:39:31 - INFO - codeparrot_training - Step 49867: {'lr': 8.790546598219651e-09, 'samples': 25532416, 'steps': 49867, 'batch_loss/train': 0.5707398122176528} 12/29/2021 12:39:41 - INFO - codeparrot_training - Step 49868: {'lr': 8.658855637666551e-09, 'samples': 25532928, 'steps': 49868, 'batch_loss/train': 0.7500795789528638} 12/29/2021 12:39:52 - INFO - codeparrot_training - Step 49869: {'lr': 8.528158548293252e-09, 'samples': 25533440, 'steps': 49869, 'batch_loss/train': 0.7108259308151901} 12/29/2021 12:40:04 - INFO - codeparrot_training - Step 49870: {'lr': 8.398455330627108e-09, 'samples': 25533952, 'steps': 49870, 'batch_loss/train': 0.678325857501477} 12/29/2021 12:40:14 - INFO - codeparrot_training - Step 49871: {'lr': 8.269745985167721e-09, 'samples': 25534464, 'steps': 49871, 'batch_loss/train': 0.7411825607996434} 12/29/2021 12:40:25 - INFO - codeparrot_training - Step 49872: {'lr': 8.142030512414688e-09, 'samples': 25534976, 'steps': 49872, 'batch_loss/train': 0.7576148207299411} 12/29/2021 12:40:39 - INFO - codeparrot_training - Step 49873: {'lr': 8.015308912923126e-09, 'samples': 25535488, 'steps': 49873, 'batch_loss/train': 0.6602278682403266} 12/29/2021 12:40:50 - INFO - codeparrot_training - Step 49874: {'lr': 7.889581187164874e-09, 
'samples': 25536000, 'steps': 49874, 'batch_loss/train': 0.7005574550712481} 12/29/2021 12:41:00 - INFO - codeparrot_training - Step 49875: {'lr': 7.764847335639536e-09, 'samples': 25536512, 'steps': 49875, 'batch_loss/train': 0.6270767957903445} 12/29/2021 12:41:11 - INFO - codeparrot_training - Step 49876: {'lr': 7.64110735884671e-09, 'samples': 25537024, 'steps': 49876, 'batch_loss/train': 0.7794397185789421} 12/29/2021 12:41:23 - INFO - codeparrot_training - Step 49877: {'lr': 7.518361257258244e-09, 'samples': 25537536, 'steps': 49877, 'batch_loss/train': 0.7171402382664382} 12/29/2021 12:41:34 - INFO - codeparrot_training - Step 49878: {'lr': 7.396609031401491e-09, 'samples': 25538048, 'steps': 49878, 'batch_loss/train': 0.8075303738005459} 12/29/2021 12:41:44 - INFO - codeparrot_training - Step 49879: {'lr': 7.2758506817482975e-09, 'samples': 25538560, 'steps': 49879, 'batch_loss/train': 0.6314128148369491} 12/29/2021 12:41:56 - INFO - codeparrot_training - Step 49880: {'lr': 7.1560862087705066e-09, 'samples': 25539072, 'steps': 49880, 'batch_loss/train': 0.8618857730180025} 12/29/2021 12:42:07 - INFO - codeparrot_training - Step 49881: {'lr': 7.037315612939965e-09, 'samples': 25539584, 'steps': 49881, 'batch_loss/train': 0.7948663551360369} 12/29/2021 12:42:18 - INFO - codeparrot_training - Step 49882: {'lr': 6.9195388947562725e-09, 'samples': 25540096, 'steps': 49882, 'batch_loss/train': 0.5702019366435707} 12/29/2021 12:42:32 - INFO - codeparrot_training - Step 49883: {'lr': 6.8027560546357615e-09, 'samples': 25540608, 'steps': 49883, 'batch_loss/train': 0.7106835730373859} 12/29/2021 12:42:42 - INFO - codeparrot_training - Step 49884: {'lr': 6.686967093105789e-09, 'samples': 25541120, 'steps': 49884, 'batch_loss/train': 0.7161623956053518} 12/29/2021 12:42:53 - INFO - codeparrot_training - Step 49885: {'lr': 6.5721720105826885e-09, 'samples': 25541632, 'steps': 49885, 'batch_loss/train': 0.8213379960507154} 12/29/2021 12:43:03 - INFO - codeparrot_training 
- Step 49886: {'lr': 6.45837080751055e-09, 'samples': 25542144, 'steps': 49886, 'batch_loss/train': 0.7270708410069346} 12/29/2021 12:43:16 - INFO - codeparrot_training - Step 49887: {'lr': 6.345563484388972e-09, 'samples': 25542656, 'steps': 49887, 'batch_loss/train': 0.5877524325624108} 12/29/2021 12:43:27 - INFO - codeparrot_training - Step 49888: {'lr': 6.233750041634289e-09, 'samples': 25543168, 'steps': 49888, 'batch_loss/train': 0.5724373486591503} 12/29/2021 12:43:37 - INFO - codeparrot_training - Step 49889: {'lr': 6.122930479718347e-09, 'samples': 25543680, 'steps': 49889, 'batch_loss/train': 0.6685664977412671} 12/29/2021 12:43:49 - INFO - codeparrot_training - Step 49890: {'lr': 6.013104799057478e-09, 'samples': 25544192, 'steps': 49890, 'batch_loss/train': 0.6874986565671861} 12/29/2021 12:44:00 - INFO - codeparrot_training - Step 49891: {'lr': 5.904273000095772e-09, 'samples': 25544704, 'steps': 49891, 'batch_loss/train': 0.7923079524189234} 12/29/2021 12:44:11 - INFO - codeparrot_training - Step 49892: {'lr': 5.796435083249563e-09, 'samples': 25545216, 'steps': 49892, 'batch_loss/train': 0.726694033946842} 12/29/2021 12:44:24 - INFO - codeparrot_training - Step 49893: {'lr': 5.689591048962939e-09, 'samples': 25545728, 'steps': 49893, 'batch_loss/train': 0.7698747836984694} 12/29/2021 12:44:35 - INFO - codeparrot_training - Step 49894: {'lr': 5.58374089767999e-09, 'samples': 25546240, 'steps': 49894, 'batch_loss/train': 0.6994294598698616} 12/29/2021 12:44:46 - INFO - codeparrot_training - Step 49895: {'lr': 5.478884629789294e-09, 'samples': 25546752, 'steps': 49895, 'batch_loss/train': 0.6896230096463114} 12/29/2021 12:44:56 - INFO - codeparrot_training - Step 49896: {'lr': 5.375022245707184e-09, 'samples': 25547264, 'steps': 49896, 'batch_loss/train': 0.7120068224612623} 12/29/2021 12:45:09 - INFO - codeparrot_training - Step 49897: {'lr': 5.2721537458777506e-09, 'samples': 25547776, 'steps': 49897, 'batch_loss/train': 0.6731583990622312} 12/29/2021 
12:45:19 - INFO - codeparrot_training - Step 49898: {'lr': 5.17027913068957e-09, 'samples': 25548288, 'steps': 49898, 'batch_loss/train': 0.6942173941060901} 12/29/2021 12:45:30 - INFO - codeparrot_training - Step 49899: {'lr': 5.069398400531222e-09, 'samples': 25548800, 'steps': 49899, 'batch_loss/train': 0.7435921290889382} 12/29/2021 12:45:42 - INFO - codeparrot_training - Step 49900: {'lr': 4.969511555846795e-09, 'samples': 25549312, 'steps': 49900, 'batch_loss/train': 0.696451535623055} 12/29/2021 12:45:53 - INFO - codeparrot_training - Step 49901: {'lr': 4.870618596997112e-09, 'samples': 25549824, 'steps': 49901, 'batch_loss/train': 0.9400501023046672} 12/29/2021 12:46:03 - INFO - codeparrot_training - Step 49902: {'lr': 4.772719524398505e-09, 'samples': 25550336, 'steps': 49902, 'batch_loss/train': 0.6924040998565033} 12/29/2021 12:46:16 - INFO - codeparrot_training - Step 49903: {'lr': 4.675814338411799e-09, 'samples': 25550848, 'steps': 49903, 'batch_loss/train': 0.676469320897013} 12/29/2021 12:46:26 - INFO - codeparrot_training - Step 49904: {'lr': 4.579903039453326e-09, 'samples': 25551360, 'steps': 49904, 'batch_loss/train': 1.4724192488065455} 12/29/2021 12:46:37 - INFO - codeparrot_training - Step 49905: {'lr': 4.484985627883908e-09, 'samples': 25551872, 'steps': 49905, 'batch_loss/train': 0.6538780559785664} 12/29/2021 12:46:48 - INFO - codeparrot_training - Step 49906: {'lr': 4.391062104092125e-09, 'samples': 25552384, 'steps': 49906, 'batch_loss/train': 0.7409894210286438} 12/29/2021 12:47:02 - INFO - codeparrot_training - Step 49907: {'lr': 4.298132468438798e-09, 'samples': 25552896, 'steps': 49907, 'batch_loss/train': 0.7304472564719617} 12/29/2021 12:47:12 - INFO - codeparrot_training - Step 49908: {'lr': 4.206196721312505e-09, 'samples': 25553408, 'steps': 49908, 'batch_loss/train': 0.6630393047817051} 12/29/2021 12:47:23 - INFO - codeparrot_training - Step 49909: {'lr': 4.1152548630740695e-09, 'samples': 25553920, 'steps': 49909, 
'batch_loss/train': 0.7955454606562853} 12/29/2021 12:47:35 - INFO - codeparrot_training - Step 49910: {'lr': 4.0253068940565574e-09, 'samples': 25554432, 'steps': 49910, 'batch_loss/train': 0.7215407658368349} 12/29/2021 12:47:46 - INFO - codeparrot_training - Step 49911: {'lr': 3.936352814648547e-09, 'samples': 25554944, 'steps': 49911, 'batch_loss/train': 0.8101383065804839} 12/29/2021 12:47:56 - INFO - codeparrot_training - Step 49912: {'lr': 3.848392625210861e-09, 'samples': 25555456, 'steps': 49912, 'batch_loss/train': 0.5941392190870829} 12/29/2021 12:48:11 - INFO - codeparrot_training - Step 49913: {'lr': 3.761426326048812e-09, 'samples': 25555968, 'steps': 49913, 'batch_loss/train': 0.7438274944433942} 12/29/2021 12:48:22 - INFO - codeparrot_training - Step 49914: {'lr': 3.675453917550975e-09, 'samples': 25556480, 'steps': 49914, 'batch_loss/train': 0.655243890825659} 12/29/2021 12:48:32 - INFO - codeparrot_training - Step 49915: {'lr': 3.590475400050419e-09, 'samples': 25556992, 'steps': 49915, 'batch_loss/train': 0.584951470606029} 12/29/2021 12:48:43 - INFO - codeparrot_training - Step 49916: {'lr': 3.506490773880211e-09, 'samples': 25557504, 'steps': 49916, 'batch_loss/train': 0.7396182650700212} 12/29/2021 12:48:55 - INFO - codeparrot_training - Step 49917: {'lr': 3.423500039345662e-09, 'samples': 25558016, 'steps': 49917, 'batch_loss/train': 0.608006093185395} 12/29/2021 12:49:06 - INFO - codeparrot_training - Step 49918: {'lr': 3.341503196835349e-09, 'samples': 25558528, 'steps': 49918, 'batch_loss/train': 0.6754583325237036} 12/29/2021 12:49:16 - INFO - codeparrot_training - Step 49919: {'lr': 3.2605002466268296e-09, 'samples': 25559040, 'steps': 49919, 'batch_loss/train': 0.6148792556487024} 12/29/2021 12:49:29 - INFO - codeparrot_training - Step 49920: {'lr': 3.180491189080925e-09, 'samples': 25559552, 'steps': 49920, 'batch_loss/train': 0.6257413193816319} 12/29/2021 12:49:39 - INFO - codeparrot_training - Step 49921: {'lr': 
3.1014760244751914e-09, 'samples': 25560064, 'steps': 49921, 'batch_loss/train': 0.767559562344104} 12/29/2021 12:49:50 - INFO - codeparrot_training - Step 49922: {'lr': 3.023454753142696e-09, 'samples': 25560576, 'steps': 49922, 'batch_loss/train': 0.8239208629820496} 12/29/2021 12:50:04 - INFO - codeparrot_training - Step 49923: {'lr': 2.9464273753887494e-09, 'samples': 25561088, 'steps': 49923, 'batch_loss/train': 0.7353542733471841} 12/29/2021 12:50:15 - INFO - codeparrot_training - Step 49924: {'lr': 2.870393891546419e-09, 'samples': 25561600, 'steps': 49924, 'batch_loss/train': 0.7918819454498589} 12/29/2021 12:50:25 - INFO - codeparrot_training - Step 49925: {'lr': 2.795354301865505e-09, 'samples': 25562112, 'steps': 49925, 'batch_loss/train': 0.7949960669502616} 12/29/2021 12:50:36 - INFO - codeparrot_training - Step 49926: {'lr': 2.721308606706829e-09, 'samples': 25562624, 'steps': 49926, 'batch_loss/train': 0.7303789751604199} 12/29/2021 12:50:48 - INFO - codeparrot_training - Step 49927: {'lr': 2.648256806320193e-09, 'samples': 25563136, 'steps': 49927, 'batch_loss/train': 0.6336272184271365} 12/29/2021 12:50:59 - INFO - codeparrot_training - Step 49928: {'lr': 2.5761989010109064e-09, 'samples': 25563648, 'steps': 49928, 'batch_loss/train': 0.7514077650848776} 12/29/2021 12:51:09 - INFO - codeparrot_training - Step 49929: {'lr': 2.5051348910565264e-09, 'samples': 25564160, 'steps': 49929, 'batch_loss/train': 0.7285669203847647} 12/29/2021 12:51:23 - INFO - codeparrot_training - Step 49930: {'lr': 2.435064776734608e-09, 'samples': 25564672, 'steps': 49930, 'batch_loss/train': 0.7881956762867048} 12/29/2021 12:51:34 - INFO - codeparrot_training - Step 49931: {'lr': 2.3659885583504627e-09, 'samples': 25565184, 'steps': 49931, 'batch_loss/train': 0.6491673714481294} 12/29/2021 12:51:45 - INFO - codeparrot_training - Step 49932: {'lr': 2.297906236181646e-09, 'samples': 25565696, 'steps': 49932, 'batch_loss/train': 0.6345381878782064} 12/29/2021 12:51:57 - 
INFO - codeparrot_training - Step 49933: {'lr': 2.230817810450203e-09, 'samples': 25566208, 'steps': 49933, 'batch_loss/train': 0.7228429140523076} 12/29/2021 12:52:07 - INFO - codeparrot_training - Step 49934: {'lr': 2.164723281461445e-09, 'samples': 25566720, 'steps': 49934, 'batch_loss/train': 0.7681522604543716} 12/29/2021 12:52:18 - INFO - codeparrot_training - Step 49935: {'lr': 2.099622649492927e-09, 'samples': 25567232, 'steps': 49935, 'batch_loss/train': 0.6954580461606383} 12/29/2021 12:52:29 - INFO - codeparrot_training - Step 49936: {'lr': 2.0355159147666947e-09, 'samples': 25567744, 'steps': 49936, 'batch_loss/train': 0.7016292652115226} 12/29/2021 12:52:41 - INFO - codeparrot_training - Step 49937: {'lr': 1.9724030775325475e-09, 'samples': 25568256, 'steps': 49937, 'batch_loss/train': 0.7271888358518481} 12/29/2021 12:52:51 - INFO - codeparrot_training - Step 49938: {'lr': 1.9102841380680415e-09, 'samples': 25568768, 'steps': 49938, 'batch_loss/train': 0.7010356718674302} 12/29/2021 12:53:02 - INFO - codeparrot_training - Step 49939: {'lr': 1.8491590966229766e-09, 'samples': 25569280, 'steps': 49939, 'batch_loss/train': 0.5926259499974549} 12/29/2021 12:53:14 - INFO - codeparrot_training - Step 49940: {'lr': 1.7890279534471531e-09, 'samples': 25569792, 'steps': 49940, 'batch_loss/train': 0.6718419110402465} 12/29/2021 12:53:25 - INFO - codeparrot_training - Step 49941: {'lr': 1.7298907087348603e-09, 'samples': 25570304, 'steps': 49941, 'batch_loss/train': 0.8373827980831265} 12/29/2021 12:53:36 - INFO - codeparrot_training - Step 49942: {'lr': 1.6717473627636536e-09, 'samples': 25570816, 'steps': 49942, 'batch_loss/train': 0.6776735354214907} 12/29/2021 12:53:50 - INFO - codeparrot_training - Step 49943: {'lr': 1.6145979157278223e-09, 'samples': 25571328, 'steps': 49943, 'batch_loss/train': 0.5027714566385839} 12/29/2021 12:54:00 - INFO - codeparrot_training - Step 49944: {'lr': 1.558442367904922e-09, 'samples': 25571840, 'steps': 49944, 
'batch_loss/train': 0.7091701340395957} 12/29/2021 12:54:11 - INFO - codeparrot_training - Step 49945: {'lr': 1.503280719461486e-09, 'samples': 25572352, 'steps': 49945, 'batch_loss/train': 0.7306075245141983} 12/29/2021 12:54:23 - INFO - codeparrot_training - Step 49946: {'lr': 1.4491129706750705e-09, 'samples': 25572864, 'steps': 49946, 'batch_loss/train': 0.7347925333306193} 12/29/2021 12:54:34 - INFO - codeparrot_training - Step 49947: {'lr': 1.3959391217122087e-09, 'samples': 25573376, 'steps': 49947, 'batch_loss/train': 0.6795259951613843} 12/29/2021 12:54:45 - INFO - codeparrot_training - Step 49948: {'lr': 1.343759172794945e-09, 'samples': 25573888, 'steps': 49948, 'batch_loss/train': 0.5930272545083426} 12/29/2021 12:54:55 - INFO - codeparrot_training - Step 49949: {'lr': 1.2925731241730797e-09, 'samples': 25574400, 'steps': 49949, 'batch_loss/train': 0.6930013122037053} 12/29/2021 12:55:08 - INFO - codeparrot_training - Step 49950: {'lr': 1.242380975985391e-09, 'samples': 25574912, 'steps': 49950, 'batch_loss/train': 0.7828089320100844} 12/29/2021 12:55:18 - INFO - codeparrot_training - Step 49951: {'lr': 1.1931827284816787e-09, 'samples': 25575424, 'steps': 49951, 'batch_loss/train': 0.7072200626134872} 12/29/2021 12:55:29 - INFO - codeparrot_training - Step 49952: {'lr': 1.1449783818284765e-09, 'samples': 25575936, 'steps': 49952, 'batch_loss/train': 0.6803266940405592} 12/29/2021 12:55:43 - INFO - codeparrot_training - Step 49953: {'lr': 1.0977679362200732e-09, 'samples': 25576448, 'steps': 49953, 'batch_loss/train': 0.7213703240267932} 12/29/2021 12:55:54 - INFO - codeparrot_training - Step 49954: {'lr': 1.051551391850758e-09, 'samples': 25576960, 'steps': 49954, 'batch_loss/train': 1.5281107331393287} 12/29/2021 12:56:04 - INFO - codeparrot_training - Step 49955: {'lr': 1.0063287489148199e-09, 'samples': 25577472, 'steps': 49955, 'batch_loss/train': 0.5097056819358841} 12/29/2021 12:56:16 - INFO - codeparrot_training - Step 49956: {'lr': 
9.621000075787922e-10, 'samples': 25577984, 'steps': 49956, 'batch_loss/train': 0.5891997774597257} 12/29/2021 12:56:27 - INFO - codeparrot_training - Step 49957: {'lr': 9.188651680092086e-10, 'samples': 25578496, 'steps': 49957, 'batch_loss/train': 0.8008020456181839} 12/29/2021 12:56:38 - INFO - codeparrot_training - Step 49958: {'lr': 8.766242304003579e-10, 'samples': 25579008, 'steps': 49958, 'batch_loss/train': 0.7720384951680899} 12/29/2021 12:56:48 - INFO - codeparrot_training - Step 49959: {'lr': 8.353771949187738e-10, 'samples': 25579520, 'steps': 49959, 'batch_loss/train': 0.6761095192632638} 12/29/2021 12:57:01 - INFO - codeparrot_training - Step 49960: {'lr': 7.951240617032341e-10, 'samples': 25580032, 'steps': 49960, 'batch_loss/train': 0.737551839556545} 12/29/2021 12:57:12 - INFO - codeparrot_training - Step 49961: {'lr': 7.558648309480276e-10, 'samples': 25580544, 'steps': 49961, 'batch_loss/train': 0.7837783619761467} 12/29/2021 12:57:22 - INFO - codeparrot_training - Step 49962: {'lr': 7.175995027919324e-10, 'samples': 25581056, 'steps': 49962, 'batch_loss/train': 0.6429206780157983} 12/29/2021 12:57:36 - INFO - codeparrot_training - Step 49963: {'lr': 6.803280773737264e-10, 'samples': 25581568, 'steps': 49963, 'batch_loss/train': 0.6214303034357727} 12/29/2021 12:57:47 - INFO - codeparrot_training - Step 49964: {'lr': 6.440505548599429e-10, 'samples': 25582080, 'steps': 49964, 'batch_loss/train': 0.7495497721247375} 12/29/2021 12:57:57 - INFO - codeparrot_training - Step 49965: {'lr': 6.087669353893599e-10, 'samples': 25582592, 'steps': 49965, 'batch_loss/train': 0.6696416735649109} 12/29/2021 12:58:10 - INFO - codeparrot_training - Step 49966: {'lr': 5.744772191007552e-10, 'samples': 25583104, 'steps': 49966, 'batch_loss/train': 0.68606261536479} 12/29/2021 12:58:20 - INFO - codeparrot_training - Step 49967: {'lr': 5.411814061329068e-10, 'samples': 25583616, 'steps': 49967, 'batch_loss/train': 0.7459427425637841} 12/29/2021 12:58:31 - INFO - 
codeparrot_training - Step 49968: {'lr': 5.088794965968368e-10, 'samples': 25584128, 'steps': 49968, 'batch_loss/train': 0.743040238507092} 12/29/2021 12:58:45 - INFO - codeparrot_training - Step 49969: {'lr': 4.775714906590789e-10, 'samples': 25584640, 'steps': 49969, 'batch_loss/train': 0.784685343503952} 12/29/2021 12:58:55 - INFO - codeparrot_training - Step 49970: {'lr': 4.4725738843065523e-10, 'samples': 25585152, 'steps': 49970, 'batch_loss/train': 0.8525404529646039} 12/29/2021 12:59:06 - INFO - codeparrot_training - Step 49971: {'lr': 4.1793719002258813e-10, 'samples': 25585664, 'steps': 49971, 'batch_loss/train': 0.7638161415234208} 12/29/2021 12:59:17 - INFO - codeparrot_training - Step 49972: {'lr': 3.8961089557365546e-10, 'samples': 25586176, 'steps': 49972, 'batch_loss/train': 0.7268530630972236} 12/29/2021 12:59:29 - INFO - codeparrot_training - Step 49973: {'lr': 3.62278505167124e-10, 'samples': 25586688, 'steps': 49973, 'batch_loss/train': 0.6703128577210009} 12/29/2021 12:59:40 - INFO - codeparrot_training - Step 49974: {'lr': 3.3594001891401605e-10, 'samples': 25587200, 'steps': 49974, 'batch_loss/train': 0.79679602291435} 12/29/2021 12:59:50 - INFO - codeparrot_training - Step 49975: {'lr': 3.1059543692535384e-10, 'samples': 25587712, 'steps': 49975, 'batch_loss/train': 0.6228625979274511} 12/29/2021 13:00:02 - INFO - codeparrot_training - Step 49976: {'lr': 2.8624475931215976e-10, 'samples': 25588224, 'steps': 49976, 'batch_loss/train': 0.7484005230944604} 12/29/2021 13:00:13 - INFO - codeparrot_training - Step 49977: {'lr': 2.628879861854561e-10, 'samples': 25588736, 'steps': 49977, 'batch_loss/train': 0.7667995793744922} 12/29/2021 13:00:24 - INFO - codeparrot_training - Step 49978: {'lr': 2.405251176007539e-10, 'samples': 25589248, 'steps': 49978, 'batch_loss/train': 0.6338405134156346} 12/29/2021 13:00:36 - INFO - codeparrot_training - Step 49979: {'lr': 2.191561536690756e-10, 'samples': 25589760, 'steps': 49979, 'batch_loss/train': 
0.6390025834552944} 12/29/2021 13:00:46 - INFO - codeparrot_training - Step 49980: {'lr': 1.9878109444593228e-10, 'samples': 25590272, 'steps': 49980, 'batch_loss/train': 0.7874329024925828} 12/29/2021 13:00:57 - INFO - codeparrot_training - Step 49981: {'lr': 1.7939994007010186e-10, 'samples': 25590784, 'steps': 49981, 'batch_loss/train': 0.7303680684417486} 12/29/2021 13:01:08 - INFO - codeparrot_training - Step 49982: {'lr': 1.6101269056933987e-10, 'samples': 25591296, 'steps': 49982, 'batch_loss/train': 0.6950306771323085} 12/29/2021 13:01:22 - INFO - codeparrot_training - Step 49983: {'lr': 1.4361934602691306e-10, 'samples': 25591808, 'steps': 49983, 'batch_loss/train': 0.7469236357137561} 12/29/2021 13:01:32 - INFO - codeparrot_training - Step 49984: {'lr': 1.2721990652608817e-10, 'samples': 25592320, 'steps': 49984, 'batch_loss/train': 0.5969591030152515} 12/29/2021 13:01:43 - INFO - codeparrot_training - Step 49985: {'lr': 1.1181437212237633e-10, 'samples': 25592832, 'steps': 49985, 'batch_loss/train': 0.6338297328911722} 12/29/2021 13:01:55 - INFO - codeparrot_training - Step 49986: {'lr': 9.74027428712887e-11, 'samples': 25593344, 'steps': 49986, 'batch_loss/train': 0.7630333877168596} 12/29/2021 13:02:06 - INFO - codeparrot_training - Step 49987: {'lr': 8.398501882833642e-11, 'samples': 25593856, 'steps': 49987, 'batch_loss/train': 0.688161738216877} 12/29/2021 13:02:16 - INFO - codeparrot_training - Step 49988: {'lr': 7.156120007678624e-11, 'samples': 25594368, 'steps': 49988, 'batch_loss/train': 0.7013966045342386} 12/29/2021 13:02:28 - INFO - codeparrot_training - Step 49989: {'lr': 6.013128664439371e-11, 'samples': 25594880, 'steps': 49989, 'batch_loss/train': 0.7887239241972566} 12/29/2021 13:02:39 - INFO - codeparrot_training - Step 49990: {'lr': 4.9695278558914426e-11, 'samples': 25595392, 'steps': 49990, 'batch_loss/train': 0.7387627973221242} 12/29/2021 13:02:50 - INFO - codeparrot_training - Step 49991: {'lr': 4.0253175875859527e-11, 'samples': 
25595904, 'steps': 49991, 'batch_loss/train': 0.6628477505873889}
12/29/2021 13:03:00 - INFO - codeparrot_training - Step 49992: {'lr': 3.180497865074017e-11, 'samples': 25596416, 'steps': 49992, 'batch_loss/train': 0.5629864588845521}
12/29/2021 13:03:14 - INFO - codeparrot_training - Step 49993: {'lr': 2.4350686911311925e-11, 'samples': 25596928, 'steps': 49993, 'batch_loss/train': 0.6278233337216079}
12/29/2021 13:03:25 - INFO - codeparrot_training - Step 49994: {'lr': 1.78903006575748e-11, 'samples': 25597440, 'steps': 49994, 'batch_loss/train': 0.6894345912151039}
12/29/2021 13:03:35 - INFO - codeparrot_training - Step 49995: {'lr': 1.2423819945039938e-11, 'samples': 25597952, 'steps': 49995, 'batch_loss/train': 0.7295490756514482}
12/29/2021 13:03:48 - INFO - codeparrot_training - Step 49996: {'lr': 7.951244801462921e-12, 'samples': 25598464, 'steps': 49996, 'batch_loss/train': 0.7133490510750562}
12/29/2021 13:03:58 - INFO - codeparrot_training - Step 49997: {'lr': 4.4725751990881695e-12, 'samples': 25598976, 'steps': 49997, 'batch_loss/train': 0.6320071532391012}
12/29/2021 13:04:09 - INFO - codeparrot_training - Step 49998: {'lr': 1.987811193426836e-12, 'samples': 25599488, 'steps': 49998, 'batch_loss/train': 0.7334505151957273}
12/29/2021 13:04:23 - INFO - codeparrot_training - Step 49999: {'lr': 4.969528122344969e-13, 'samples': 25600000, 'steps': 49999, 'batch_loss/train': 0.7188061801716685}
12/29/2021 13:04:23 - INFO - codeparrot_training - Evaluating and saving model checkpoint
12/29/2021 13:07:47 - INFO - codeparrot_training - Step 50000: {'loss/eval': 0.7400861382484436, 'perplexity': 2.096116065979004}
12/29/2021 13:08:05 - WARNING - huggingface_hub.repository - Several commits (18) will be pushed upstream.
12/29/2021 13:08:08 - INFO - codeparrot_training - Evaluating and saving model after training
12/29/2021 13:11:30 - INFO - codeparrot_training - Step 1600000: {'loss/eval': 0.7400861382484436, 'perplexity': 2.096116065979004}
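The evaluation records above pair `loss/eval` with a `perplexity` value; the two are related by perplexity = exp(loss), the standard definition for a language model's mean cross-entropy loss. A minimal sketch checking that relation against the numbers logged at Step 50000 (values copied from the log; the relation itself is the standard one, not taken from the training script):

```python
import math

# Final evaluation values reported in the log at Step 50000
eval_loss = 0.7400861382484436
reported_perplexity = 2.096116065979004

# Perplexity is the exponential of the mean eval cross-entropy loss
perplexity = math.exp(eval_loss)
print(perplexity)  # ≈ 2.0961, matching the logged value
```

This also explains why the post-training evaluation reports identical numbers: the model was not updated between the Step 50000 checkpoint eval and the final eval.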